Methods in Electromagnetic Wave Propagation, 2nd Edition


Author: D. S. Jones


METHODS IN ELECTROMAGNETIC WAVE PROPAGATION SECOND EDITION

IEEE Series on Electromagnetic Wave Theory

The IEEE Series on Electromagnetic Wave Theory consists of new titles as well as reprintings and revisions of recognized classics that maintain long-term archival significance in electromagnetic waves and applications.

Series Editor
Donald G. Dudley, University of Arizona

Advisory Board
Robert E. Collin, Case Western Reserve University
Akira Ishimaru, University of Washington
D. S. Jones, University of Dundee

Associate Editors
Electromagnetic Theory, Scattering, and Diffraction: Ehud Heyman, Tel-Aviv University
Differential Equation Methods: Andreas C. Cangellaris, University of Arizona
Integral Equation Methods: Donald R. Wilton, University of Houston
Antennas, Propagation, and Microwaves: David R. Jackson, University of Houston

Books in the Series
Chew, W. C., Waves and Fields in Inhomogeneous Media
Christopoulos, C., The Transmission-Line Modeling Method: TLM
Collin, R. E., Field Theory of Guided Waves, Second Edition
Dudley, D. G., Mathematical Foundations for Electromagnetic Theory
Elliott, R. S., Electromagnetics: History, Theory, and Applications
Felsen, L. B. and Marcuvitz, N., Radiation and Scattering of Waves
Harrington, R. F., Field Computation by Moment Methods
Jones, D. S., Methods in Electromagnetic Wave Propagation, Second Edition
Lindell, I. V., Methods for Electromagnetic Field Analysis
Tai, C. T., Generalized Vector and Dyadic Analysis: Applied Mathematics in Field Theory
Tai, C. T., Dyadic Green Functions in Electromagnetic Theory, Second Edition
Van Bladel, J., Singular Electromagnetic Fields and Sources
Wait, J., Electromagnetic Waves in Stratified Media

D. S. JONES
UNIVERSITY OF DUNDEE

IEEE
The Institute of Electrical and Electronics Engineers, Inc., New York

WILEY-INTERSCIENCE
A JOHN WILEY & SONS, INC., PUBLICATION

A NOTE TO THE READER This book has been electronically reproduced from digital information stored at John Wiley & Sons, Inc. We are pleased that the use of this new technology will enable us to keep works of enduring scholarly value in print as long as there is a reasonable demand for them. The content of this book is identical to previous printings.

IEEE PRESS
445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331

IEEE Antennas and Propagation Society, Sponsor

© D. S. Jones, 1979, 1994
First edition 1979, Oxford University Press
Second edition 1994
Reissued 1995 jointly with IEEE Press

© 1994 THE INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, INC., 3 Park Avenue, 17th Floor, New York, NY 10016-5997
© 2003 Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected]

ISBN 0-7803-1155-8

PREFACE TO THE SECOND EDITION There has been much activity in the areas covered by this book since the first edition. Numerous papers have appeared on all the topics and the constraints of space have meant that not all of them can be quoted. To the many authors whose research I have had to omit I offer my apologies. It is hoped that a sufficient selection has been included for the interested reader to become acquainted with the progress that has been made and to follow up developments. Once again I am most grateful to my wife and Mrs Ross for their forbearance, unfailing assistance, and encouragement. D.S.J.

PREFACE TO THE FIRST EDITION Modern methods of tackling problems associated with electromagnetic waves involve a judicious mixture of analysis and computation. The analysis occurs in the mathematical formulation and in establishing that it has the requisite properties. Conversion to a form suitable for the computer entails numerical analysis, whose justification may also rest on a considerable body of analysis. Therefore, the aim of these two volumes is to develop a suitable framework of theory and numerical analysis with applications to various aspects of the propagation of electromagnetic waves. An attempt has been made to couch the explanation in as comprehensible a language as possible and to assume a starting point as early as commensurate with the size of the text. To assist with the understanding, numerous exercises have been inserted at convenient points and some of these are open-ended so that any instructor has plenty of freedom in determining the mode of treatment. Complementary material will be found in D. S. Jones, Acoustic and electromagnetic waves, Oxford University Press (1986). The first five chapters are devoted to the provision of a theoretical background and the topic of guided waves. The first chapter sets out the fundamentals of numerical analysis which are essential in handling a problem numerically. Propagation in waveguides can be approached from three different points of view. One possibility is a direct numerical attack based on difference equations; this is the subject matter of Chapter 2. Another angle is to consider the problem as one of finding the eigenvalues of an operator. This avenue is explored in Chapter 3, which also treats the cavity resonator from a theoretical standpoint. The third route employs variational methods and these are considered in Chapter 4. Chapter 5 returns to numerical techniques with particular emphasis on variational methods, integral equations and finite elements.
Chapters 6 to 9 deal with radiating waves whether produced directly from a transmitter or indirectly by scattering from an irradiated obstacle. Antennas are discussed in Chapter 6 with separate sections for wires, solids, and dielectrics. The analysis in Chapter 6 is concerned with the frequency domain. The changes necessary in the time domain are examined in Chapter 7, including the singularity expansion method. The well-known geometric theory of diffraction receives an extensive review in Chapter 8. Finally, Chapter 9 investigates inverse scattering, embracing holography and adaptive arrays as well as other applications. Again my thanks are due to my wife and Mrs Ross for their constant help and encouragement. Dundee December 1986

D.S.J.

This book is dedicated with deep affection to the Streather family and, in particular Bessie, Kittie, Nell, Flo, Alice, Peg, and Frank

CONTENTS

1 ASPECTS OF NUMERICAL ANALYSIS
  Interpolation and approximation
    1.1 Interpolation
    1.2 Inverse interpolation
    1.3 Interpolation in two dimensions
    1.4 Approximation
    1.5 L2-norm approximation
    1.6 Rational approximation
    1.7 Trigonometric interpolation
  Solution of equations
    1.8 Solution of an equation
    1.9 Systems of non-linear equations
  Matrices
    1.10 Matrices
    1.11 Matrix norms
  Linear equations
    1.12 Linear equations: direct methods
    1.13 Iterative methods
    1.14 Matrix eigenvalues
  Generalized inverse
    1.15 The generalized inverse

2 WAVEGUIDES AND DIFFERENCE EQUATIONS
    2.1 Introduction
    2.2 Waveguides
    2.3 Numerical derivatives
    2.4 Properties of difference equations
    2.5 TEM modes
    2.6 The dominant mode
    2.7 Higher modes
    2.8 Direct methods
    2.9 Other equations
    2.10 Conformal mapping
    2.11 Waveguides containing dielectric
    2.12 Microstrip transmission lines
    2.13 Other methods for guides
    2.14 The fast Fourier transform

3 OPERATORS AND EIGENVALUES
  Preliminaries
    3.1 Hilbert space
    3.2 Linear operators
    3.3 Bounded linear operators
  Partial differential equations
    3.4 Integral and partial differential equations
    3.5 The cavity resonator
  Unbounded operators and eigenvalues
    3.6 Unbounded operators
    3.7 Approximation theorems
    3.8 Point matching

4 VARIATIONAL METHODS AND OPTIMIZATION
  The derivative of an operator
    4.1 The derivative
    4.2 Mean-value theorem
    4.3 Higher derivatives
    4.4 Convex functionals
  Newton's method for operators
    4.5 Newton's method
  Optimization
    4.6 Unconstrained optimization
    4.7 The effect of constraints
  Variational principles
    4.8 Variational approach
    4.9 Examples
      4.9.1 Network analysis
      4.9.2 Integral equations
      4.9.3 Ordinary differential equations
      4.9.4 Poisson's equation
  Waveguides
    4.10 The capacitive iris
    4.11 Another form of variational principle
    4.12 The inductive iris
    4.13 Vector optimization
    4.14 Sobolev spaces

5 NUMERICAL ASPECTS OF VARIATIONAL METHODS
  Minimal systems
    5.1 Galerkin's method
    5.2 Minimal systems
    5.3 Positive-definite operators
    5.4 Stability
  Integral equations
    5.5 Compact operators
    5.6 Integral equations
    5.7 Equations of the first kind
  Numerical trial functions
    5.8 Finite elements
    5.9 Finite differences
    5.10 Comparison between finite difference and finite element
    5.11 Eigenvalues
  Numerical integration
    5.12 Quadrature

6 ANTENNAS AND INTEGRAL EQUATIONS
  Wire antennas
    6.1 Introduction
    6.2 The perfectly conducting wire
    6.3 General excitation of the infinite wire
    6.4 The semi-infinite wire
    6.5 The finite wire
    6.6 The receiving antenna
    6.7 Numerical methods
    6.8 Curved antennas
    6.9 Log-periodic antennas
    6.10 Loads and arrays
  Solid antennas
    6.11 Wire grid models
    6.12 The electric-field integral equation
    6.13 Uniqueness
    6.14 The magnetic-field integral equation
    6.15 The Fredholm alternative
    6.16 Compactness and other properties of the MFIE
    6.17 Other integral equations
    6.18 Numerical considerations for surfaces
    6.19 Singular integrals
    6.20 The algebraic system
    6.21 The null-field method
    6.22 The impedance boundary condition
    6.23 Absorbing boundary conditions
    6.24 The surface radiation condition
  Dielectric antennas
    6.25 The infinite dielectric circular rod
    6.26 Modal excitation
    6.27 The finite rod
    6.28 General shapes
    6.29 Homogeneous isotropic dielectric
    6.30 Uniqueness for the homogeneous isotropic dielectric
    Appendix: Geometry of surfaces

7 TRANSIENT PHENOMENA
    7.1 Finite methods
    7.2 Integral equations in the time domain
    7.3 Numerical methods for thin wires in the time domain
    7.4 Perfectly conducting bodies
    7.5 Numerical matters
    7.6 The harmonic approach versus the impulse response
    7.7 The Laplace transform
    7.8 The location of the poles
    7.9 The impulse response
    7.10 Practical determination of the positions of the poles
    7.11 Prony's method and modifications

8 GEOMETRIC THEORY OF DIFFRACTION
    8.1 The high-frequency approximation
    8.2 Geometrical optics
    8.3 The ray and transport equations
    8.4 The stratified medium
    8.5 Fermat's principle
  Numerical solution of ordinary differential equations
    8.6 Multistep methods
    8.7 Runge-Kutta methods
    8.8 Extrapolation
    8.9 Systems of differential equations
  Canonical problems
    8.10 Geometrical optics revisited
    8.11 Focusing
    8.12 Reflection by stratification
    8.13 Edges
    8.14 Edge rays
    8.15 Uniformly valid approximations
    8.16 Double edge diffraction
    8.17 Emission from a waveguide
    8.18 The wedge
    8.19 The effect of curvature
    8.20 Generalization
    8.21 Optimal curvature
    8.22 The diffraction matrix for a curved boundary
    8.23 Diffraction by a discontinuity in curvature
    8.24 Reflector antennas
  Leaky rays
    8.25 Gaussian beams and complex sources
    8.26 Complex rays
    8.27 Optical fibres

9 SOURCE DETECTION
    9.1 General considerations
  Inverse scattering
    9.2 Low frequencies
    9.3 High frequencies
    9.4 Scattering in the time domain
    9.5 Moving targets
  The inverse source problem
    9.6 Harmonic sources
    9.7 Inhomogeneities
    9.8 Statistical considerations
    9.9 Correlation techniques
    9.10 Far-field cross-correlation technique
  Holographic techniques
    9.11 Basic principles of holography
    9.12 Location of an inhomogeneity
    9.13 Field in the aperture of an antenna
    9.14 Zeros of entire functions
  Synthesis of radiation patterns
    9.15 General considerations
    9.16 Synthesis by series expansion
    9.17 Construction errors
    9.18 Constrained aperture norm
    9.19 Directivity
    9.20 Penalty functions
  Array signal processing
    9.21 Adaptive beam forming
    9.22 Simultaneous multiple beams
    9.23 Time-varying arrays

References

Index

1 ASPECTS OF NUMERICAL ANALYSIS

INTERPOLATION AND APPROXIMATION

1.1 Interpolation

There are many situations in which one is given the value of a quantity at certain times and would like to say something about the behaviour at intermediate times. For instance, from readings of an electric meter taken at noon every day one might wish to make deductions about the consumption of electricity at 9 in the morning. This is a problem of interpolation in which one attempts to estimate from data at isolated points the form of a function at intervening points. The same problem arises in the use of mathematical tables when the value of a function is required at some point not listed in the table.

When all that we know about a function are the isolated data we can expect that there will be different opinions on its performance in between. Suppose we are given the values $f_1$, $f_2$, and $f_3$ at $x = x_1$, $x_2$, and $x_3$ respectively (see Fig. 1.1). Then a simple rule would be to join successive values by straight lines and use these lines to tell us the value of $f$ in between. This is an approximation $f_0$ in which

\[ f_0(x) = \frac{x_2 - x}{x_2 - x_1} f_1 + \frac{x - x_1}{x_2 - x_1} f_2 \quad (x_1 \le x \le x_2), \]
\[ f_0(x) = \frac{x_3 - x}{x_3 - x_2} f_2 + \frac{x - x_2}{x_3 - x_2} f_3 \quad (x_2 \le x \le x_3) \]

and, in general, if $f_n$ is the value at $x_n$,

\[ f_0(x) = \frac{x_{n+1} - x}{x_{n+1} - x_n} f_n + \frac{x - x_n}{x_{n+1} - x_n} f_{n+1} \quad (x_n \le x \le x_{n+1}). \tag{1.1} \]

Of course, some people will say that they are not willing to accept this approximation because the derivatives are not continuous across the data points but, for the moment, let us note that by combining the formulae (1.1) we can obtain the approximation

\[ f_0(x) = \sum_{m=1}^{n} B_m(x) f_m \tag{1.2} \]

Fig. 1.1. Linear interpolation.

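where

\[ B_1(x) = \frac{x_2 - x}{x_2 - x_1} \quad (x_1 \le x \le x_2), \qquad B_1(x) = 0 \quad (x_2 \le x \le x_n), \]
\[ B_n(x) = \frac{x - x_{n-1}}{x_n - x_{n-1}} \quad (x_{n-1} \le x \le x_n), \qquad B_n(x) = 0 \quad (x_1 \le x \le x_{n-1}) \]

and, if $m \ne 1$ or $n$,

\[ B_m(x) = \begin{cases} 0 & (x_1 \le x \le x_{m-1}) \\ \dfrac{x - x_{m-1}}{x_m - x_{m-1}} & (x_{m-1} \le x \le x_m) \\ \dfrac{x_{m+1} - x}{x_{m+1} - x_m} & (x_m \le x \le x_{m+1}) \\ 0 & (x_{m+1} \le x \le x_n). \end{cases} \]

Each of $B_1, \ldots, B_n$ vanishes outside a finite interval and has the shape of either a half triangle or a full triangle (Fig. 1.2). For this reason the functions $B_1, \ldots, B_n$ are known as triangle or pyramid functions. So we could call (1.2) an approximation to our function in terms of pyramid functions. It will be necessary to consider more complicated expansions in order to meet some of the conditions encountered.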
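The expansion (1.2) in pyramid functions can be sketched in a few lines of Python. This is only an illustrative sketch: the names `pyramid` and `interpolate` are hypothetical, the nodes are assumed to be sorted and distinct, and the index m is 1-based to match the text.

```python
def pyramid(m, nodes, x):
    """Value at x of the pyramid function B_m (m is 1-based) for sorted nodes x_1..x_n."""
    k = m - 1  # 0-based position of the node x_m
    # Rising side of the triangle, on [x_{m-1}, x_m]; absent for the first node.
    if k > 0 and nodes[k - 1] <= x <= nodes[k]:
        return (x - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
    # Falling side of the triangle, on [x_m, x_{m+1}]; absent for the last node.
    if k < len(nodes) - 1 and nodes[k] <= x <= nodes[k + 1]:
        return (nodes[k + 1] - x) / (nodes[k + 1] - nodes[k])
    return 0.0


def interpolate(nodes, values, x):
    """Evaluate f_0(x) = sum over m of B_m(x) f_m, as in (1.2)."""
    return sum(pyramid(m, nodes, x) * fm for m, fm in enumerate(values, start=1))


nodes = [0.0, 1.0, 3.0]
values = [2.0, 4.0, 0.0]
print(interpolate(nodes, values, 0.5))
print(interpolate(nodes, values, 2.0))
```

At most two pyramid functions are non-zero at any x, so a practical routine would locate the sub-interval first instead of summing over every m; the full sum is kept here only to mirror (1.2).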

Fig. 1.2. Pyramid functions.

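Suppose, now, that we are given the additional data of the values of the derivative of $f$, say $f'_1, f'_2, \ldots$ at $x = x_1, x_2, \ldots$. It is immediately obvious that the derivatives of $f_0$ will not agree with the derivatives of $f$ except in rare circumstances. If we are to remedy this we need an approximation between $x_i$ and $x_{i+1}$ which gives the correct derivatives and must therefore satisfy two extra conditions. So our straight lines must be replaced by cubics, if we stick with powers of $x$ for our approximations. Let us try

\[ y = a(x - x_i)^3 + b(x - x_i)^2 + c(x - x_i) + d. \]

Then, since $y = f_i$ and $y' = f'_i$ when $x = x_i$, we see that $d = f_i$ and $c = f'_i$. The conditions $y = f_{i+1}$, $y' = f'_{i+1}$ at $x = x_{i+1}$ then imply that

\[ a(x_{i+1} - x_i)^3 + b(x_{i+1} - x_i)^2 + (x_{i+1} - x_i) f'_i + f_i = f_{i+1}, \]
\[ 3a(x_{i+1} - x_i)^2 + 2b(x_{i+1} - x_i) + f'_i = f'_{i+1}. \]

From these can be deduced

\[ a(x_{i+1} - x_i)^3 = (f'_{i+1} + f'_i)(x_{i+1} - x_i) - 2(f_{i+1} - f_i), \]
\[ b(x_{i+1} - x_i)^2 = 3(f_{i+1} - f_i) - (f'_{i+1} + 2f'_i)(x_{i+1} - x_i). \]

Therefore our approximation between $x_i$ and $x_{i+1}$ can be expressed as

\[ y = \alpha_i(x) f_i + \beta_i(x) f_{i+1} + \gamma_i(x) f'_i + \delta_i(x) f'_{i+1} \]

where, on substituting for $a$, $b$, $c$, and $d$ and collecting the coefficients of the data,

\[ \alpha_i(x) = \frac{(x_{i+1} - x)^2 \{2(x - x_i) + (x_{i+1} - x_i)\}}{(x_{i+1} - x_i)^3}, \qquad \beta_i(x) = \frac{(x - x_i)^2 \{2(x_{i+1} - x) + (x_{i+1} - x_i)\}}{(x_{i+1} - x_i)^3}, \]
\[ \gamma_i(x) = \frac{(x - x_i)(x_{i+1} - x)^2}{(x_{i+1} - x_i)^2}, \qquad \delta_i(x) = -\frac{(x - x_i)^2 (x_{i+1} - x)}{(x_{i+1} - x_i)^2}. \]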
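As a check on the algebra, the cubic between two data points can be evaluated directly from the expressions for a, b, c, and d deduced above. The function name `hermite_segment` and its argument order are hypothetical choices made for this sketch.

```python
def hermite_segment(xi, xi1, fi, fi1, dfi, dfi1, x):
    """Cubic on [xi, xi1] matching the values fi, fi1 and the derivatives dfi, dfi1."""
    h = xi1 - xi
    # Coefficients from the two conditions imposed at x = x_{i+1}.
    a = ((dfi1 + dfi) * h - 2.0 * (fi1 - fi)) / h**3
    b = (3.0 * (fi1 - fi) - (dfi1 + 2.0 * dfi) * h) / h**2
    # c = f'_i and d = f_i follow from the conditions at x = x_i.
    s = x - xi
    return a * s**3 + b * s**2 + dfi * s + fi


# The data of f(x) = x^3 on [1, 2]: a cubic should be reproduced exactly.
print(hermite_segment(1.0, 2.0, 1.0, 8.0, 3.0, 12.0, 1.5))
```

Because the segment is itself a cubic, data taken from any cubic polynomial are reproduced without error, which gives a convenient test of an implementation.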

Fig. 1.3. Cubic basis functions: solid curve, $B_m^{(0)}$; broken curve, $B_m^{(1)}$.

By means of these formulae we can construct our approximation over the whole interval as

\[ f_0(x) = \sum_{m=1}^{n} \{ B_m^{(0)}(x) f_m + B_m^{(1)}(x) f'_m \} \tag{1.3} \]

where

\[ B_m^{(0)}(x) = \begin{cases} 0 & (x_1 \le x \le x_{m-1}) \\ \beta_{m-1}(x) & (x_{m-1} \le x \le x_m) \\ \alpha_m(x) & (x_m \le x \le x_{m+1}) \\ 0 & (x_{m+1} \le x \le x_n) \end{cases} \]

and $B_m^{(1)}$ is the same with $\gamma_m$ and $\delta_m$ taking the place of $\alpha_m$ and $\beta_m$ respectively. These formulae do not hold for $m = 1$ and $m = n$; the necessary modifications are easy to carry out and are left to the reader. The behaviour of $B_m^{(0)}$ and $B_m^{(1)}$ when $m$ is neither 1 nor $n$ is shown in Fig. 1.3.

In both (1.2) and (1.3) the interpolant $f_0$ consists of a series, each term of which is the product of a given value such as $f_m$ or $f'_m$ and a function of $x$ such as $B_m$ or $B_m^{(0)}$. The given values occur only in the coefficients, the functions of $x$ depending only on the points which are selected for observation and not on the values found there. For this reason the functions $B_m$, $B_m^{(0)}$, and $B_m^{(1)}$ are known as basis functions. In the following, whenever we have an expansion which has the form of (1.2) or (1.3) we shall call the corresponding $B$ basis functions whether or not they are polynomials.

So far we have discussed the two cases in which the $B$ are linear and cubic polynomials respectively in the interval $(x_{m-1}, x_{m+1})$ and zero outside. These are obviously particular instances of the more general situation in which $B$ is a polynomial of degree $2q - 1$ in the interval and zero outside. The general case is known as piecewise Hermite interpolation, the adjective piecewise being incorporated to indicate that once we have partitioned our interval at the points $x_1, x_2, \ldots$ the basis function is required to be zero on all of the sub-intervals except one or two.

Suppose that we are given a function which, together with its first $r$ derivatives, is continuous for $x_1 \le x \le x_n$. Such a function will be signified by writing $f \in C^r[x_1, x_n]$; sometimes $C^0$ will be denoted by $C$. Also the brackets will be dropped if there is no ambiguity about which interval is being referred to. Now, if we have a polynomial of degree $2q - 1$ it will contain $2q$ coefficients which we can adjust. Consequently we can make it satisfy $q$ conditions at $x = x_i$ and $q$ conditions at $x = x_{i+1}$. Thus, if $f \in C^{q-1}[x_i, x_{i+1}]$ we can ask that the polynomials $p_{2q-1}(x)$ satisfy

\[ \frac{d^k f}{dx^k} = \frac{d^k p_{2q-1}(x)}{dx^k} \]

at both $x = x_i$ and $x = x_{i+1}$ for $k = 0, 1, \ldots, q - 1$. In this way we construct a piecewise Hermite interpolant which agrees with a function and its first $q - 1$ derivatives at the points of observation. The corresponding basis functions can be deduced as in the cases $q = 1$ and $q = 2$ which we have already discussed.

Of course, even if $f$ or one of its derivatives is not continuous between the points of observation we can use the same interpolant so long as there is continuity near the points of observation. This is an example of approximating a discontinuity by something continuous. Whether it is valuable or not will depend upon the circumstances.

One plain disadvantage of this type of interpolation when $q > 1$ is its involvement of the derivatives of $f$ and the steadily increasing complexity of the equations to be solved as $q$ grows. One way of avoiding the derivatives of $f$ is to ask that the derivative of the interpolant be continuous at the points $x_2, \ldots, x_{n-1}$ but not to impose the additional restriction that it has the same value as the derivative of $f$. So we can reduce the order of the polynomial to 2 and try

\[ y = a_i(x - x_i)^2 + b_i(x - x_i) + c_i. \]

To satisfy $y = f_i$ at $x = x_i$ and $y = f_{i+1}$ at $x = x_{i+1}$ we need $c_i = f_i$ and

\[ a_i(x_{i+1} - x_i)^2 + b_i(x_{i+1} - x_i) = f_{i+1} - f_i. \]

If we substitute for $b_i$ from this relation we obtain

\[ y = a_i(x - x_i)(x - x_{i+1}) + f_i + (f_{i+1} - f_i)(x - x_i)/(x_{i+1} - x_i). \tag{1.4} \]

The constant $a_i$ is at our disposal but must be such that the derivative of $y$ is the same as $x$ approaches $x_i$ from above or below. Hence

\[ a_i(x_i - x_{i+1}) + \frac{f_{i+1} - f_i}{x_{i+1} - x_i} = a_{i-1}(x_i - x_{i-1}) + \frac{f_i - f_{i-1}}{x_i - x_{i-1}}. \tag{1.5} \]

If $x_{i+1} - x_i = x_i - x_{i-1} = h$ this simplifies to

\[ a_i + a_{i-1} = (f_{i+1} - 2f_i + f_{i-1})/h^2. \tag{1.6} \]

These equations hold at the $n - 2$ points $x_2, \ldots, x_{n-1}$. Since there are $n - 1$ coefficients $a_i$ it follows that one can be chosen arbitrarily and then the remainder are known from (1.5) or (1.6) as appropriate. It will be noticed that the second derivative of $y$ is $2a_i$ so that choosing one of the $a_i$ is equivalent to specifying the second derivative of the interpolant in a sub-interval. An approximation of the form (1.4) subject to (1.5) or (1.6) is known as a quadratic spline and $x_1, \ldots, x_n$ are known as its nodes or nodal points or knots.

The quadratic is the simplest of the splines. If we demand that the first and second derivative be continuous at the internal nodal points we are led to a cubic spline. It is easiest to work with the second derivative of the spline. Since it will be a linear function we can ensure its continuity by adopting the form (1.1), i.e.

\[ \frac{d^2 y}{dx^2} = b_i \frac{x_{i+1} - x}{x_{i+1} - x_i} + b_{i+1} \frac{x - x_i}{x_{i+1} - x_i} \]

for each of the intervals $(x_i, x_{i+1})$. The coefficients $b_i$ will then be values of the second derivative of the spline at the nodal points. For simplicity, it will now be assumed that the nodal points are equally spaced so that $x_{i+1} - x_i = h$ for $i = 1, \ldots, n - 1$. Then an integration gives

\[ \frac{dy}{dx} = -\frac{1}{2} \frac{b_i}{h} (x_{i+1} - x)^2 + \frac{1}{2} \frac{b_{i+1}}{h} (x - x_i)^2 + c_i \]

and

\[ y = \frac{1}{6} \frac{b_i}{h} (x_{i+1} - x)^3 + \frac{1}{6} \frac{b_{i+1}}{h} (x - x_i)^3 + c_i(x - x_i) + d_i. \]

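To make $dy/dx$ continuous at $x = x_i$ we must have

\[ -\tfrac{1}{2} b_i h + c_i = \tfrac{1}{2} b_i h + c_{i-1} \]

while $y = f_i$, $f_{i+1}$ at $x = x_i$, $x_{i+1}$ necessitate

\[ f_i = \tfrac{1}{6} b_i h^2 + d_i, \qquad f_{i+1} = \tfrac{1}{6} b_{i+1} h^2 + c_i h + d_i. \]

From the last two equations we deduce that the cubic spline can be written as

\[ y = \frac{b_i}{6h} (x_{i+1} - x)^3 + \frac{b_{i+1}}{6h} (x - x_i)^3 + \left( \frac{f_{i+1}}{h} - \frac{h b_{i+1}}{6} \right)(x - x_i) + \left( \frac{f_i}{h} - \frac{h b_i}{6} \right)(x_{i+1} - x) \tag{1.7} \]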

provided that

\[ b_{i+1} + 4b_i + b_{i-1} = 6(f_{i+1} - 2f_i + f_{i-1})/h^2 \tag{1.8} \]

for $i = 2, \ldots, n - 1$. There are now $n$ coefficients available so that two can be selected arbitrarily and the rest are then determined by (1.8). Often, the choice $b_1 = b_n = 0$ is made.

These formulae can be combined so as to express the interpolant in terms of basis functions. However, it is more convenient to proceed in a different way. Let $S_i(x)$ denote the spline in $(x_i, x_{i+1})$. Then $S''_i$ and $S''_{i-1}$ must agree at $x = x_i$ so that

\[ S''_i = S''_{i-1} + 6\beta_i (x - x_i)/(x_{i+1} - x_i)^3 \tag{1.9} \]

where $\beta_i$ has to be found. Define the function $x_+$ by

\[ x_+ = x \quad (x > 0), \qquad x_+ = 0 \quad (x \le 0). \]

Thus $\{(3)_+\}^3 = 27$, $\{(-3)_+\}^3 = 0$ whereas $(t - 5)_+ = t - 5$ if $t > 5$ but 0 if $t \le 5$. Then applying (1.9) for $i = 2, \ldots, n$ and using an equally spaced partition with $x_{i+1} = ih$ we see that

\[ S'' = 2\beta_0/h^2 + 6\beta_1 x/h^3 + \sum_{i=2}^{n} 6\beta_i \{x - (i - 1)h\}_+/h^3 \quad (0 \le x \le nh) \]

where the first two terms represent $S''_1$. After two integrations we obtain

\[ S = \alpha_0 + \alpha_1(x/h) + \beta_0(x/h)^2 + \beta_1(x/h)^3 + \sum_{i=2}^{n} \beta_i \{x - (i - 1)h\}_+^3/h^3. \tag{1.10} \]
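The cubic spline in the nodal form (1.7), with the second derivatives fixed by the three-term relation between neighbouring b's and the common choice of zero second derivative at both ends, can be sketched as follows. Several things here are assumptions of the sketch, not statements from the text: the name `natural_cubic_spline`, the 0-based indexing of nodes and data, and the use of the standard tridiagonal (Thomas) elimination to solve for the b's.

```python
def natural_cubic_spline(x0, h, f):
    """Return a callable spline through f[0..n-1] at the nodes x0, x0 + h, ...

    The second derivatives b satisfy b[i+1] + 4 b[i] + b[i-1] = 6 (second difference of f) / h^2
    at interior nodes, with b = 0 at both end nodes.
    """
    n = len(f)
    b = [0.0] * n
    m = n - 2  # number of interior unknowns b[1..n-2]
    if m > 0:
        rhs = [6.0 * (f[i + 1] - 2.0 * f[i] + f[i - 1]) / h**2 for i in range(1, n - 1)]
        c = [0.0] * m  # modified superdiagonal
        d = [0.0] * m  # modified right-hand side
        c[0] = 1.0 / 4.0
        d[0] = rhs[0] / 4.0
        for k in range(1, m):  # forward elimination on the (1, 4, 1) system
            denom = 4.0 - c[k - 1]
            c[k] = 1.0 / denom
            d[k] = (rhs[k] - d[k - 1]) / denom
        b[m] = d[m - 1]  # last interior unknown
        for k in range(m - 2, -1, -1):  # back substitution
            b[k + 1] = d[k] - c[k] * b[k + 2]

    def spline(x):
        # Locate the sub-interval [xi, xi1] containing x (clamped to the table).
        i = min(max(int((x - x0) / h), 0), n - 2)
        xi = x0 + i * h
        xi1 = xi + h
        return (b[i] * (xi1 - x) ** 3 / (6.0 * h)
                + b[i + 1] * (x - xi) ** 3 / (6.0 * h)
                + (f[i + 1] / h - h * b[i + 1] / 6.0) * (x - xi)
                + (f[i] / h - h * b[i] / 6.0) * (xi1 - x))

    return spline


s = natural_cubic_spline(0.0, 1.0, [0.0, 1.0, 0.0, 1.0, 0.0])
print(s(2.0), s(2.5))
```

For data lying on a straight line every second difference vanishes, so all the b's are zero and the spline reduces to the linear interpolant, which is a simple sanity check.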

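By construction $S$ and its first two derivatives are continuous: we make it take the value $f_i$ at $x = ih$ by requiring that

\[ \alpha_0 = f_0, \qquad \alpha_0 + \alpha_1 m + \beta_0 m^2 + \beta_1 m^3 + \sum_{i=2}^{m} \beta_i (m - i + 1)^3 = f_m \quad (n \ge m \ge 1), \tag{1.11} \]

the sum being absent when $m = 1$. Let us use the central difference operator $\delta$, defined so that

\[ \delta f(x) = f(x + \tfrac{1}{2}h) - f(x - \tfrac{1}{2}h). \]

Then $\delta f(\tfrac{1}{2}h) = f(h) - f(0)$ or $\delta f_{1/2} = f_1 - f_0$. Similarly

\[ \delta^2 f_m = f_{m+1} - 2f_m + f_{m-1}. \]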

Our equations can now be written as

\[ \alpha_0 = f_0, \qquad \alpha_1 + \beta_0 + \beta_1 = \delta f_{1/2}, \qquad 2\beta_0 + 6\beta_1 + \beta_2 = \delta^2 f_1, \]
\[ 6\beta_1 + 5\beta_2 + \beta_3 = \delta^3 f_{3/2}, \qquad \beta_{m+2} + 4\beta_{m+1} + \beta_m = \delta^4 f_m \quad (m = 2, \ldots, n - 2) \tag{1.12} \]

which are $(n + 1)$ equations governing the $(n + 3)$ coefficients $\alpha_0, \alpha_1, \beta_0, \ldots, \beta_n$. Two of these coefficients may be chosen arbitrarily and then the others found from (1.11) or (1.12). Once the eqns (1.11) or (1.12) have been solved the coefficients in (1.10) are linear combinations of the values $f_i$ of $f$ at $x = ih$. Accordingly, (1.10) can be rewritten in the form

\[ S = \sum_{i=0}^{n} f_i C_i(x) \tag{1.13} \]

where the polynomials $C_i(x)$ can be determined. Clearly $C_i(jh) = 0$ $(j \ne i)$ and $C_i(ih) = 1$ for $i, j = 0, 1, \ldots, n$. The functions $C_i(x)$ are known as cardinal splines. They can be regarded as basis functions for (1.13) but they are not satisfactory for many practical applications because they are non-zero over most of the interval. To overcome this difficulty cubic splines which vanish identically outside an interval of length $4h$ have been constructed. Consider the function $B_i$ defined by

\[ B_i(x) = \tfrac{1}{4} \left[ (x - i + 2)_+^3 - 4(x - i + 1)_+^3 + 6(x - i)_+^3 - 4(x - i - 1)_+^3 + (x - i - 2)_+^3 \right]. \tag{1.14} \]
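The compact support of the function in (1.14) is easy to confirm numerically. In the sketch below the helper names `plus_cubed` and `bspline` are hypothetical, and the normalising factor in front of the bracket is taken to be 1/4; that value is an assumption of this sketch (it makes the peak value at x = i equal to 1), and any other constant would only rescale the curve.

```python
def plus_cubed(t):
    """Truncated power (t_+)^3: t**3 for t > 0 and 0 otherwise."""
    return t**3 if t > 0 else 0.0


def bspline(i, x):
    """Cubic B-spline centred on the integer knot i, built from truncated powers.

    The factor 0.25 is an assumed normalisation, chosen so that bspline(i, i) == 1.
    """
    return 0.25 * (plus_cubed(x - i + 2) - 4 * plus_cubed(x - i + 1)
                   + 6 * plus_cubed(x - i) - 4 * plus_cubed(x - i - 1)
                   + plus_cubed(x - i - 2))


# Values at the knots around i = 0, and one point beyond the support.
print([bspline(0, x) for x in (-2, -1, 0, 1, 2)], bspline(0, 3.7))
```

To the right of the support every truncated power is an ordinary cubic, and the weights 1, -4, 6, -4, 1 form a fourth difference, which annihilates any cubic; in floating point the computed value there is zero only up to rounding.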

Notice firstly that $B_i$ vanishes identically for $x \le i - 2$ and is also identically zero for $x \ge i + 2$. Also, since the first two derivatives of $x_+^3$ are continuous, the first two derivatives of $B_i$ are continuous and, in addition, vanish identically for $x \le i - 2$ and $x \ge i + 2$. Thus the $B_i$ are splines which are non-zero only for the interval $i - 2 < x < i + 2$; they are known as cubic B-splines and each forms a bell-shaped curve. Special consideration may have to be given to the B-splines to be used at the ends of intervals. Often one will wish them to be lop-sided in order not to stray outside the given interval; sometimes taking half a bell is satisfactory. (There is additional information about B-splines in §6.8.)

One reason why splines may be preferred to the polynomial approximations described earlier in this section is that the latter are subject to the Runge phenomenon. If one is given a function and, in a definite interval, one seeks to improve the approximation by increasing the number $n$ of points where the given function and approximant agree, one finds that, although the separation between the points of agreement decreases, the maximum difference between the given function and approximant increases and, in fact, becomes infinite as $n \to \infty$ if the length of the interval exceeds a certain quantity. By using different polynomials in adjacent intervals, as when splines are employed, this difficulty can be overcome.

It is, of course, possible once the splines have been constructed with specified knots to ask that the given function be matched not at the knots but at some data points chosen in some convenient way. For quadratic splines the error between the given function and approximant tends to have a ripple on it when the data points coincide with the knots. If, however, the data points are midway between the knots the ripples die away, effectively by a factor of 6, as can be seen from the parabolic shape of cardinal splines. (For further information on splines see Ahlberg, Nilson, and Walsh (1967). Extensive tables of coefficients are given by Sard and Weintraub (1971).)

1.2 Inverse interpolation

Frequently, the problem of determining where a function takes a specified value is met. In other words, given $y$ find an approximate value of $x$ such that $f(x) = y$ when $f$ is known only for certain values of $x$, perhaps corresponding to entries in a table. One method is to construct an interpolating polynomial $p(x)$ and then solve

\[ p(x) = y. \tag{1.15} \]

This is known as inverse interpolation. Inverse linear interpolation occurs when $p(x)$ is chosen to be linear. In this case, the table is first inspected and two consecutive entries $x_1$ and $x_2$ are determined between which $x$ must lie. Then define

\[ p(x) = \{(x_2 - x) f(x_1) + (x - x_1) f(x_2)\}/(x_2 - x_1) \]

and the solution of (1.15) is

\[ x = [\{f(x_2) - y\} x_1 + \{y - f(x_1)\} x_2]/\{f(x_2) - f(x_1)\}. \]
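Inverse linear interpolation from a table can be sketched as below. The function names `inverse_linear` and `inverse_from_table` are hypothetical, and the table scan simply takes the first pair of consecutive entries whose values bracket y, which is one possible reading of "inspecting the table".

```python
def inverse_linear(x1, x2, f1, f2, y):
    """Solve p(x) = y for the straight line p through (x1, f1) and (x2, f2)."""
    return ((f2 - y) * x1 + (y - f1) * x2) / (f2 - f1)


def inverse_from_table(xs, fs, y):
    """Find consecutive entries whose values bracket y, then interpolate inversely."""
    for k in range(len(xs) - 1):
        lo, hi = sorted((fs[k], fs[k + 1]))
        if lo <= y <= hi and fs[k] != fs[k + 1]:
            return inverse_linear(xs[k], xs[k + 1], fs[k], fs[k + 1], y)
    raise ValueError("y is not bracketed by consecutive table entries")


# Table of f(x) = x^2 at x = 0, 1, 2, 3; where does f take the value 2.5?
print(inverse_from_table([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0], 2.5))
```

For this table the estimate is 1.5, whereas the true root is about 1.58; the discrepancy is the error of the linear interpolant on the rather wide interval from 1 to 2.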

If $p(x)$ is not chosen to be linear then more complicated methods must be used to solve (1.15). Examples are Muller's method, the secant method, the method of false position and the method of bisection described in §1.8.

An alternative way, if the function inverse to $f$ is known, is to carry out interpolation on the inverse function. In general, this will be less reliable than inverse interpolation on $f$ because, although a polynomial may well be a good approximation to $f$, there is no guarantee that the inverse function can be represented equally well by a polynomial. For example, if $f(x) = x^2$ the inverse function $x = \sqrt{y}$ does not have a good representation as a polynomial near the origin $x = 0$, $y = 0$.

1.3 Interpolation in two dimensions

The problem of interpolation in two or more dimensions is much more complicated than for one variable. In part, this is due to the fact that functions may be specified on domains of highly irregular shape. It is usually assumed that any shape likely to arise in practice can be approximated to as high a degree of accuracy as required by a network of standard shapes, e.g. triangles or rectangles, provided that they are made sufficiently small. Therefore we restrict our attention to such shapes.

Fig. 1.4. Triangular interpolation.

Suppose that we want an approximation $F$ to $f(x, y)$ over the triangle shown in Fig. 1.4 and suppose that $F$ has the form

\[ F(x, y) = \alpha + \beta x + \gamma y \]

i.e. we make a linear approximation. If we impose the condition that $F$ and $f$ are to agree at the three vertices we discover that

\[ F(x, y) = \alpha_1 f(x_1, y_1) + \alpha_2 f(x_2, y_2) + \alpha_3 f(x_3, y_3) \]

where

\[ A\alpha_1 = x_2 y_3 - x_3 y_2 + (y_2 - y_3)x - (x_2 - x_3)y, \]
\[ A\alpha_2 = x_3 y_1 - x_1 y_3 + (y_3 - y_1)x - (x_3 - x_1)y, \]
\[ A\alpha_3 = x_1 y_2 - x_2 y_1 + (y_1 - y_2)x - (x_1 - x_2)y \]

and $A$, twice the area of the triangle, is given by

\[ A = (x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1). \]

Take another triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$ and $(x_4, y_4)$ which does not overlap that of Fig. 1.4 and find a similar linear approximation $F_1$ to $f$ over this triangle. Then, since both $F$ and $F_1$ vary linearly along the side joining $(x_1, y_1)$ and $(x_2, y_2)$, and have the same values at the two vertices, they must be equal at every point of the side. In other words, $F$ and $F_1$ are continuous across the common side. In this way, by selecting non-overlapping triangles to cover the region of interest, we obtain a linear approximant which is continuous throughout the region.
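The linear interpolant on a single triangle can be sketched directly from the three weights and the quantity A (twice the area) given above. The function name `triangle_interpolate` and the argument layout are hypothetical choices for illustration.

```python
def triangle_interpolate(p1, p2, p3, f1, f2, f3, x, y):
    """Linear interpolant at (x, y) from the values f1, f2, f3 at the vertices p1, p2, p3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # A is twice the signed area of the triangle; it must be non-zero.
    A = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    a1 = (x2 * y3 - x3 * y2 + (y2 - y3) * x - (x2 - x3) * y) / A
    a2 = (x3 * y1 - x1 * y3 + (y3 - y1) * x - (x3 - x1) * y) / A
    a3 = (x1 * y2 - x2 * y1 + (y1 - y2) * x - (x1 - x2) * y) / A
    return a1 * f1 + a2 * f2 + a3 * f3


# Vertex values of the linear function 2 + 3x - y, which should be reproduced exactly.
print(triangle_interpolate((0.0, 0.0), (2.0, 0.0), (0.0, 1.0), 2.0, 8.0, 1.0, 0.5, 0.25))
```

Each weight equals 1 at its own vertex and 0 at the other two, so the interpolant matches the data at the vertices and reproduces any linear function without error.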

Fig. 1.5. Interpolation on a rectangle with vertices $(x_1, y_1)$, $(x_1 + h, y_1)$, $(x_1, y_1 + k)$, and $(x_1 + h, y_1 + k)$.

If rectangular elements are employed (Fig. 1.5) we can try the approximation $F(x, y) = \alpha + \beta x + \gamma y + \delta xy$. If we require that $F = f$ at the four vertices, we have

\[ F(x, y) = \alpha_1 + \beta_1 (x - x_1) + \gamma_1 (y - y_1) + \delta_1 (x - x_1)(y - y_1) \]

where

\[ \alpha_1 = f(x_1, y_1), \quad \beta_1 = \{f(x_1 + h, y_1) - f(x_1, y_1)\}/h, \quad \gamma_1 = \{f(x_1, y_1 + k) - f(x_1, y_1)\}/k, \]
\[ \delta_1 = \{f(x_1 + h, y_1 + k) - f(x_1 + h, y_1) - f(x_1, y_1 + k) + f(x_1, y_1)\}/hk. \]

For fixed $y$, $F$ is a linear function of $x$ and, for fixed $x$, a linear function of $y$. Consequently, $F$ is known as a bilinear interpolant. On any side $F$ depends only on the values at the two vertices so that, for two non-overlapping rectangles with a common side, the two bilinear interpolants take the same value on the common side. Thus bilinear interpolants yield a continuous approximant over the region covered by non-overlapping rectangles.
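As an illustration of bilinear interpolation on a rectangle with corner $(x_1, y_1)$ and sides $h$ and $k$, here is a minimal sketch. It assumes the interpolant is written in the shifted form $\alpha_1 + \beta_1(x - x_1) + \gamma_1(y - y_1) + \delta_1(x - x_1)(y - y_1)$, which is the form consistent with the coefficient formulae above; the function name `bilinear` is illustrative.

```python
def bilinear(f, x1, y1, h, k, x, y):
    """Bilinear interpolant of f on the rectangle [x1, x1+h] x [y1, y1+k]."""
    f00 = f(x1, y1)
    f10 = f(x1 + h, y1)
    f01 = f(x1, y1 + k)
    f11 = f(x1 + h, y1 + k)
    alpha = f00
    beta = (f10 - f00) / h
    gamma = (f01 - f00) / k
    delta = (f11 - f10 - f01 + f00) / (h * k)
    dx, dy = x - x1, y - y1
    return alpha + beta * dx + gamma * dy + delta * dx * dy
```

A function of the form a + bx + cy + dxy is reproduced exactly, so comparing the interpolant with such a function at an interior point is a simple consistency test.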

Exercises

1. The function $f(x)$ has the values shown

x      f(x)
0.1    1.10517
0.2    1.22140
0.3    1.34986
0.4    1.49182

Using linear interpolation determine an approximate value for $f(0.26)$.

2. If $f(x) = 3x^2 - 1$ find a piecewise linear interpolant which agrees with it at $x = 0, 0.1, 0.2, 0.3, 0.4, 0.5$. What approximation to $f(0.33)$ does it give?


3. If $f(x_i)$ and $f(x_{i+1})$ are increased by the small quantities $\varepsilon_1$ and $\varepsilon_2$ respectively, what is the change to the value of the linear interpolant for $f\{\tfrac12(x_i + x_{i+1})\}$?

4. If the approximation $F$ is linear on $[a, b]$ and agrees with $f$ at the end-points, show that there is some $c$ satisfying $a < c < b$ such that $f(x) - F(x) = \tfrac12 (x - a)(x - b) f''(c)$ if $f \in C^1[a, b]$ and $f''$ exists. What accuracy does this suggest for linear interpolation in a table of (i) $\sin x$, (ii) $\ln x$ when $x$ is given at intervals of 0.01 between 1 and 2, while $f$ is given to 5 decimal places?

5. Find a polynomial $P(x)$ of degree 2 or less such that $P(1) = 1$, $P(2) = 1$, $P'(1) = 1$.

6. Show that there is no polynomial $P(x)$ of degree 2 or less such that $P(x) = a$, $P(x + h) = b$, $P'(x + \tfrac12 h)$

¢ !
7. For each of the functions (a) $\sin \pi x$, (b) $\tan^{-1} x$, (c) $(1 + x^2)^{-1}$ determine a single polynomial and a cubic spline approximation which agrees over $-1 \le x \le 1$ at points separated by (i) 0.5, (ii) 0.25, (iii) 0.1, (iv) 0.01. Draw graphs of the original functions and their interpolants.

8. For the function of Q.1 find $x_0$ such that $f(x_0) = 1.3$.

9. Show that, for linear interpolation on a triangle, $\alpha_1 + \alpha_2 + \alpha_3 = 1$.

10. Prove that, in bilinear interpolation on a unit square, the basis function at an internal node is given by

$(1 \le j, k \le m - 1)$

where

\[ \alpha_j(x) = x - j + 1 \quad (j - 1 \le x \le j), \]
\[ \alpha_j(x) = j + 1 - x \quad (j \le x \le j + 1) \]

and is zero elsewhere.

11. If

\[ F(x, y) = \sum_{r=0}^{3} \sum_{s=0}^{3} a_{rs} x^r y^s \]

express the coefficients $a_{rs}$ in terms of the values of $F$, $\partial F/\partial x$, $\partial F/\partial y$, $\partial^2 F/\partial x\,\partial y$ at (0, 0), (0, 1), (1, 0), and (1, 1).

12. If $f(1, 1) = 1$, $f(3, 1) = 4$, $f(1, 2) = 5$, $f(3, 2) = 7$ find the approximate value of 1(1,1) by (a) triangular interpolation over (1, 1), (3, 1), (1, 2), (b) bilinear interpolation, (c) interpolation over a triangle formed from a side and two diagonals.

1.4 Approximation

How do we know when an interpolant is a good approximation to a function? In a sense this question has no answer because what is regarded as good by one person will be deemed unsatisfactory by another. Nevertheless, certain measures of error have been introduced and once a particular measure has been adopted we have decided on a criterion which determines whether some errors are better than others. One measure of the difference between two functions $f$ and $F$ over an interval $[a, b]$ is provided by

\[ \sup_{a \le x \le b} |f(x) - F(x)|. \]

This is known as the maximum or uniform norm and measures the maximum deviation that occurs between the two functions.

Fig. 1.6. A possible deviation in approximation.

Fig. 1.7. Comparison of norms.

Another measure which is often used is

\[ \left[ \int_a^b \{f(x) - F(x)\}^2 \, dx \right]^{1/2}. \]

It is known as the $L_2$ or least squares norm. The $L_2$-norm estimates the total deviation of $f$ from $F$ over the whole interval. In Fig. 1.6 the maximum norm has value $d$ whereas, in Fig. 1.7, it has the greater value $d_1$. Therefore, if these figures represent different approximants $F$ to the same $f$, Fig. 1.6 will be considered to be better than Fig. 1.7 as far as the maximum norm is concerned. On the other hand, the $L_2$-norm is larger in Fig. 1.6 than in Fig. 1.7 so that Fig. 1.7 will be preferred on the basis of the $L_2$-norm. The maximum norm is the natural one if one wishes to be within an assigned accuracy at every point of the interval. In general, there is little virtue in arranging high accuracy throughout most of the interval with only moderate accuracy elsewhere. It is better to have the difference $f - F$ small over the whole interval and making small oscillations through positive and negative values. For the maximum norm there are two theorems related to approximation which will be quoted without proof.

THEOREM 1.4 (WEIERSTRASS). If $f \in C[a, b]$ then, given any $\varepsilon > 0$, there is a polynomial $p_n(x)$ such that

\[ |p_n(x) - f(x)| \le \varepsilon \]

for $x \in [a, b]$.

THEOREM 1.4a. If $f \in C[a, b]$ and $n$ is a given integer, there is a unique polynomial $p_n$ of degree $n$ or less such that

\[ \sup_{a \le x \le b} |p_n(x) - f(x)| \le \sup_{a \le x \le b} |Q_n(x) - f(x)| \]

for every polynomial $Q_n$ of degree $n$ or less. The sup on the left is attained at $n + 2$ points at least.

There is no algorithm for calculating $p_n$ in Theorem 1.4a in a finite number of stages. If, however, we only impose the condition at a finite number of points then we can construct an algorithm often known as the first algorithm of Remes. Let us denote the set of points by $S$ and select from them $n + 2$ points $x_0, x_1, \ldots, x_{n+1}$ such that $x_0 < x_1 < \cdots < x_{n+1}$. Define

\[ A_i = {\prod_{j=0}^{n+1}}' (x_i - x_j)^{-1} \tag{1.16} \]

where the prime means omit $j = i$ from the product, and then put

\[ \eta \sum_{i=0}^{n+1} (-)^i A_i = - \sum_{i=0}^{n+1} A_i f(x_i). \tag{1.17} \]

Construct

\[ p_n(x) = \sum_{i=0}^{n} {\prod_{j=0}^{n}}' \frac{x - x_j}{x_i - x_j} \{f(x_i) + (-)^i \eta\}. \]

Then

\[ p_n(x_i) = f(x_i) + (-)^i \eta \tag{1.18} \]

for $i = 0, \ldots, n$. Also

\[ p_n(x_{n+1}) = - \sum_{i=0}^{n} A_i \{f(x_i) + (-)^i \eta\}/A_{n+1} = f(x_{n+1}) + (-)^{n+1} \eta \]

from (1.17). Thus (1.18) holds for $i = n + 1$ as well and we have ensured that, at $n + 2$ points, $p_n$ does not differ from $f$ by more than $|\eta|$. Now check the other points of $S$. If the difference at them does not exceed $|\eta|$, then $p_n$ is the required polynomial. Otherwise find the point $x'$ of $S$ where $|p_n - f|$ is a maximum. If $x_i \le x' \le x_{i+1}$ $(i = 0, \ldots, n)$ replace $x_i$ by $x'$ if $\{p_n(x') - f(x')\}\{p_n(x_i) - f(x_i)\} > 0$; otherwise replace $x_{i+1}$ by $x'$. If $x' < x_0$ put $x'$ for $x_0$ if $\{p_n(x') - f(x')\}\{p_n(x_0) - f(x_0)\} > 0$; otherwise replace $x_{n+1}$ by $x'$. Operate similarly if $x' > x_{n+1}$. Return now to (1.16) and repeat the calculation with the new set of points. Proceeding in this way we shall, after a finite number of steps (since there is a finite number of selections of $n + 2$ points), reach a polynomial $p_n$ for which the inequality of Theorem 1.4a is valid at all points of the set $S$. Acceleration of the convergence may sometimes be achieved by the second


algorithm of Remes. Since $p_n - f$ changes sign in each of the intervals $[x_0, x_1], [x_1, x_2], \ldots, [x_n, x_{n+1}]$ it has at least one zero in each interval. Let $y_i$ be a typical zero in $[x_i, x_{i+1}]$. In each of the intervals $[a, y_0], [y_0, y_1], \ldots, [y_n, b]$ find a value of $x$, say $z_i$, where $p_n(z_i) - f(z_i)$ is an extremum and has the same sign as $f(x_i)$. If, for some $z_i$,

\[ |p_n(z_i) - f(z_i)| = \max_{x \in S} |p_n(x) - f(x)| \]

work with the set $z_0, \ldots, z_{n+1}$, otherwise find $x'$ so that

\[ |p_n(x') - f(x')| = \max_{x \in S} |p_n(x) - f(x)| \]

and replace one $z_i$ by $x'$ as in the preceding paragraph.
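A single stage of the first algorithm of Remes, on one fixed selection of $n + 2$ points, can be sketched as follows. This is an illustration under stated assumptions rather than a complete program: it assumes NumPy is available, the name `remes_reference_step` is hypothetical, `numpy.polyfit` is used in place of the explicit Lagrange form (it returns the same interpolating polynomial when the degree equals the number of points minus one), and the exchange of points over the full set $S$ is deliberately left out.

```python
import numpy as np


def remes_reference_step(f, xs):
    """One stage on the reference points xs = [x_0, ..., x_{n+1}].

    Returns (eta, p) where p is a numpy.poly1d of degree <= n with
    p(x_i) = f(x_i) + (-1)**i * eta at every reference point.
    """
    xs = np.asarray(xs, dtype=float)
    m = len(xs)                                   # m = n + 2
    # A_i = product over j != i of 1/(x_i - x_j), as in (1.16)
    A = np.array([1.0 / np.prod([xs[i] - xs[j] for j in range(m) if j != i])
                  for i in range(m)])
    signs = (-1.0) ** np.arange(m)
    fx = f(xs)
    # eta from (1.17): eta * sum((-1)^i A_i) = -sum(A_i f(x_i))
    eta = -np.dot(A, fx) / np.dot(signs, A)
    # Interpolate f + (-1)^i eta through the first n + 1 points
    coeffs = np.polyfit(xs[:-1], fx[:-1] + signs[:-1] * eta, m - 2)
    return eta, np.poly1d(coeffs)
```

The value at the last reference point is not imposed explicitly; that it also satisfies the alternating condition is the content of the argument leading from (1.17).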

Exercise 13. Construct a computer program to carry out the first algorithm of Remes and use it to determine some best approximation over a finite set of points.

1.5 $L_2$-norm approximation

The determination of the best polynomial in the $L_2$ or least squares norm involves considerations which are more conveniently handled in a rather more general setting. If $\int_a^b |f|^2 \, dx$ exists we write $f \in L_2(a, b)$ or, more briefly, $f \in L_2$ when no confusion can arise. When $f \in L_2$ and $g \in L_2$ we can introduce the inner product $(f, g)$ by

\[ (f, g) = \int_a^b f g^* \, dx \tag{1.19} \]

where $g^*$ is the complex conjugate of $g$. Although we are only concerned with real functions at the moment, complex-valued ones will occur later and it makes little difference to the analysis to cover both cases at once. We may verify that the right-hand side of (1.19) exists by deriving the Schwarz inequality. Clearly

\[ \int_a^b (\lambda |f| + \mu |g|)^2 \, dx \ge 0 \]

or

\[ \lambda^2 \int_a^b |f|^2 \, dx + 2\lambda\mu \int_a^b |fg| \, dx + \mu^2 \int_a^b |g|^2 \, dx \ge 0 \]

for any real $\lambda$ and $\mu$. The inequality on the quadratic form can hold only if

\[ \left( \int_a^b |fg| \, dx \right)^2 \le \int_a^b |f|^2 \, dx \int_a^b |g|^2 \, dx \]

whence

\[ \left| \int_a^b f g^* \, dx \right| \le \left( \int_a^b |f|^2 \, dx \right)^{1/2} \left( \int_a^b |g|^2 \, dx \right)^{1/2} \]

which constitutes the Schwarz inequality. The norm $\|f\|$ of $f$ is defined by

\[ \|f\| = (f, f)^{1/2}. \tag{1.20} \]

(When other norms are considered, a suffix will be added to this norm to distinguish it from the others.) The norm is always positive unless $f = 0$ almost everywhere. Further consideration of norms will be found in §1.11. It will be remarked that, if $c$ is a complex constant,

\[ (cf, g) = c(f, g); \quad (f, cg) = c^*(f, g); \quad \|cf\| = |c| \, \|f\|; \quad (f, g) = (g, f)^*. \tag{1.21} \]

From the Schwarz inequality

\[ |(f, g)| \le \|f\| \, \|g\|. \tag{1.22} \]

Also

\[ \int_a^b |f + g|^2 \, dx \le \left\{ \left( \int_a^b |f|^2 \, dx \right)^{1/2} + \left( \int_a^b |g|^2 \, dx \right)^{1/2} \right\}^2 \tag{1.23} \]

by the Schwarz inequality. This may be expressed as

\[ \|f + g\| \le \|f\| + \|g\|. \tag{1.24} \]

On replacing $f$ by $f_1 - f_2$ and $g$ by $f_2 - f_3$,

\[ \|f_1 - f_3\| \le \|f_1 - f_2\| + \|f_2 - f_3\|. \tag{1.25} \]

If the norm of $f$ is regarded as the length of $f$, (1.22) states that the modulus of the inner product of $f$ and $g$ is never greater than the product of their lengths. There is an obvious analogy with the scalar product of vectors and, if $(f, g) = 0$, we often say that $f$ and $g$ are orthogonal. Similarly, (1.25), expressed in terms of lengths, is the same as the triangle inequality of vectors. The distance between two functions $f_1$ and $f_2$ is $\|f_1 - f_2\|$ and is zero only when $f_1 = f_2$ almost everywhere. Approximation in the $L_2$-norm is an attempt to reduce the distance between two functions to a minimum, distance being understood in the sense above. An important role is played by orthogonal elements. Suppose there is a finite or infinite set of functions $\phi_1, \phi_2, \ldots$ of $L_2$ such that

\[ (\phi_m, \phi_n) = 0 \quad (m \ne n), \tag{1.26} \]
\[ (\phi_n, \phi_n) = \|\phi_n\|^2 = 1. \tag{1.27} \]

Such a set is said to be an orthonormal set and (1.26) and (1.27) are often abbreviated to $(\phi_m, \phi_n) = \delta_{mn}$. Suppose we want to approximate a function $f \in L_2$ by means of an orthonormal set $\phi_1, \phi_2, \ldots, \phi_N$ using the $L_2$-norm. Then we wish to choose the coefficients $c_n$ so that

\[ \left\| f - \sum_{n=1}^{N} c_n \phi_n \right\| \]

is a minimum. Now, on account of (1.26) and (1.27)

\[ \left\| f - \sum_{n=1}^{N} c_n \phi_n \right\|^2 = \|f\|^2 - \sum_{n=1}^{N} \{ c_n^* (f, \phi_n) + c_n (\phi_n, f) - c_n c_n^* \} \]
\[ = \|f\|^2 - \sum_{n=1}^{N} |(f, \phi_n)|^2 + \sum_{n=1}^{N} |(f, \phi_n) - c_n|^2. \]

Only the third term contains the coefficients $c_n$ and, since no member of the series can be negative, it attains its smallest value of zero when

\[ c_n = (f, \phi_n) \quad (n = 1, 2, \ldots, N). \tag{1.28} \]

Thus (1.28) gives the rule for selecting the coefficients so that the norm is a minimum. When this choice is made

\[ \left\| f - \sum_{n=1}^{N} c_n \phi_n \right\|^2 = \|f\|^2 - \sum_{n=1}^{N} |(f, \phi_n)|^2. \tag{1.29} \]

The left-hand side cannot be negative and so

\[ \|f\|^2 \ge \sum_{n=1}^{N} |(f, \phi_n)|^2 = \sum_{n=1}^{N} |c_n|^2 \tag{1.30} \]

which is known as Bessel's inequality. An orthonormal set is said to be complete if, for every $f \in L_2$, there is a linear combination such that the $L_2$-norm of the difference is arbitrarily small. If $(f, \phi_m) = 0$ for every $\phi_m$ of a complete orthonormal set all the coefficients $c_m$ are zero so that the norm of the difference cannot be made arbitrarily small unless $f = 0$. There is no loss of generality in assuming that the number of elements in a complete orthonormal set is infinite. Letting $N \to \infty$ in Bessel's inequality (1.30), we obtain

\[ \sum_{n=1}^{\infty} |(f, \phi_n)|^2 \le \|f\|^2 \tag{1.31} \]

which shows that the series on the left-hand side is convergent. Therefore

\[ \left\| \sum_{k=m}^{n} (f, \phi_k) \phi_k \right\|^2 = \sum_{k=m}^{n} |(f, \phi_k)|^2 \]

must tend to zero as $m$ and $n$ tend to infinity. It follows (from the Riesz-Fischer theorem) that there is a $g \in L_2$ such that

\[ \lim_{n \to \infty} \left\| g - \sum_{k=1}^{n} (f, \phi_k) \phi_k \right\| = 0. \]

From the Schwarz inequality (1.22)

\[ \left| \left( g - \sum_{k=1}^{n} (f, \phi_k) \phi_k, \, \phi_m \right) \right| \le \left\| g - \sum_{k=1}^{n} (f, \phi_k) \phi_k \right\|. \]

Hence

\[ (g, \phi_m) = \lim_{n \to \infty} \sum_{k=1}^{n} (f, \phi_k)(\phi_k, \phi_m) = (f, \phi_m). \]

Consequently, $(g - f, \phi_m) = 0$ for $m = 1, \ldots$ and since the orthonormal set is complete our earlier remarks entail $f = g$. We may summarize this by saying: if $\phi_1, \phi_2, \ldots$ is a complete orthonormal set every $f \in L_2$ can be expressed as

\[ f = \sum_{n=1}^{\infty} (f, \phi_n) \phi_n, \]

the equality being understood to mean that

\[ \lim_{n \to \infty} \left\| f - \sum_{k=1}^{n} (f, \phi_k) \phi_k \right\| = 0. \]

It follows from (1.29) that, for a complete orthonormal set, $f = \sum_{n=1}^{\infty} c_n \phi_n$ implies that

\[ \|f\|^2 = \sum_{n=1}^{\infty} |c_n|^2. \tag{1.32} \]

If $g = \sum_{n=1}^{\infty} b_n \phi_n$ and we apply (1.32) to $f + g$, $f - g$, $f + ig$, $f - ig$ then, from the identity

\[ \|f + g\|^2 - \|f - g\|^2 + i\|f + ig\|^2 - i\|f - ig\|^2 = 4(f, g), \]

is derived Parseval's formula

\[ (f, g) = \sum_{n=1}^{\infty} c_n b_n^*. \]

Given a set of linearly independent elements $\psi_1, \psi_2, \ldots$ which will approximate any $f \in L_2$ arbitrarily closely in $L_2$-norm we can always manufacture a complete orthonormal set by a method known as the Schmidt process. First define $\phi_1$ by

\[ \phi_1 = \psi_1 / \|\psi_1\|. \]

Then pick $\phi_2 = g_2/\|g_2\|$ where $g_2 = \psi_2 - (\psi_2, \phi_1)\phi_1$; $g_2$ cannot be zero because $\psi_1$ and $\psi_2$ are linearly independent. Clearly $(\phi_2, \phi_1) = 0$. In general, $\phi_n = g_n/\|g_n\|$ where

\[ g_n = \psi_n - \sum_{k=1}^{n-1} (\psi_n, \phi_k) \phi_k. \]

It is important to observe that in the whole of the preceding discussion concerning the minimization of the norm we have not used the specific form (1.19) but only properties of the inner product such as (1.20), (1.21), (1.22), and (1.24). Therefore we can draw the same conclusions if the inner product is defined in another way so long as it has the properties (1.20), (1.21), (1.22), and (1.24). For instance, if we choose

\[ (f, g) = \sum_{i=1}^{M} f(x_i) g^*(x_i) \]

for some fixed $x_i$ we can easily verify that the properties are valid and so we may deduce that $\| f - \sum_{n=1}^{N} c_n \phi_n \|^2$ or $\sum_{i=1}^{M} | f(x_i) - \sum_{n=1}^{N} c_n \phi_n(x_i) |^2$ is a minimum when

\[ c_n = (f, \phi_n) = \sum_{i=1}^{M} f(x_i) \phi_n^*(x_i). \]

It is this kind of problem which arises in fitting data at a discrete number of points by the method of least squares. Note that it is frequently a computational advantage to employ orthonormal polynomials for least squares rather than expansions in non-orthogonal functions because the matrices tend to be diagonally dominant even when round-off error is present. Another possibility is to take

\[ (f, g) = \int_a^b w(x) f(x) g^*(x) \, dx \]

where $w$ is a real non-negative function. This corresponds to varying the contribution from the various parts of the interval according to the weight function $w$. In this connection there is the following interesting result:

THEOREM 1.5. If $\phi_1, \phi_2, \ldots$ is an infinite orthonormal set of polynomials on the finite interval $[a, b]$ with weight function $w$, i.e.

\[ \int_a^b w(x) \phi_m(x) \phi_n^*(x) \, dx = \delta_{mn}, \]

then the orthonormal set is complete.

Proof. Theorem 1.4 ensures that, for continuous $f$, there is a polynomial $p(x)$ such that

\[ |f(x) - p(x)| < \varepsilon. \]

The choice (1.28) guarantees a minimum of the $L_2$-norm so that

\[ \int_a^b w(x) \left| f(x) - \sum_{n=1}^{N} c_n \phi_n(x) \right|^2 dx \le \int_a^b w(x) |f(x) - p(x)|^2 \, dx \]

provided that $N$ is made larger than the degree of $p$. Since the right-hand side does not exceed $\varepsilon^2 \int_a^b w(x) \, dx$ and can be made arbitrarily small we have the desired result. Since any $f \in L_2$ can be approximated as closely as one wishes by continuous functions the proof is terminated.

As an example let $a = -1$, $b = 1$, and $w = 1$; first construct an orthonormal set (which must be complete by Theorem 1.5) from the powers of $x$, i.e. with $\psi_j = x^{j-1}$. The Schmidt process gives

\[ \phi_1 = 1/2^{1/2}, \quad \phi_2 = (3/2)^{1/2} x, \quad \phi_3 = (5/2)^{1/2} \tfrac12 (3x^2 - 1), \ldots, \]

which are multiples of the Legendre polynomials $P_n(x)$ which are defined by Rodrigues' formula

\[ P_n(x) = \frac{1}{n! \, 2^n} \frac{d^n}{dx^n} (x^2 - 1)^n. \]

The first few are

\[ P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac12 (3x^2 - 1), \quad P_3(x) = \tfrac12 (5x^3 - 3x) \]

and they satisfy the recurrence relation

\[ (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x) \]

and have the orthogonal property

\[ \int_{-1}^{1} P_m(x) P_n(x) \, dx = 2\delta_{mn}/(2n + 1). \]
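The Schmidt process applied to the powers of $x$ on $[-1, 1]$ with unit weight can be checked numerically. The sketch below is illustrative only: it assumes NumPy, restricts itself to real polynomials (so the complex conjugate in the inner product is dropped), and the names `inner` and `schmidt` are not from the text.

```python
import numpy as np
from numpy.polynomial import Polynomial


def inner(p, q, a=-1.0, b=1.0):
    """Inner product (p, q): integral of p*q over [a, b] for real polynomials."""
    antiderivative = (p * q).integ()
    return antiderivative(b) - antiderivative(a)


def schmidt(psis):
    """Orthonormalize a list of linearly independent polynomials."""
    phis = []
    for psi in psis:
        g = psi
        for phi in phis:
            g = g - inner(psi, phi) * phi      # remove the component along phi
        phis.append(g / np.sqrt(inner(g, g)))  # normalize to unit length
    return phis


# psi_j = x**(j-1) for j = 1, 2, 3
powers = [Polynomial([0.0] * k + [1.0]) for k in range(3)]
phi1, phi2, phi3 = schmidt(powers)
```

With these inputs `phi3` should agree with $(5/2)^{1/2}\,\tfrac12(3x^2 - 1)$, the normalized multiple of $P_2$.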

In practical calculation it may be more convenient to compute the $\phi_k$ via the recurrence relations directly instead of deriving the analytical expressions first.

A second example is supplied by $a = -1$, $b = 1$, $w = (1 - x^2)^{-1/2}$. Again we start from the powers of $x$ and find for our orthonormal set

\[ \phi_1 = 1/\pi^{1/2}, \quad \phi_2 = (2/\pi)^{1/2} x, \quad \phi_3 = (2/\pi)^{1/2} (2x^2 - 1), \ldots \]

which are multiples of the Chebyshev polynomials. The Chebyshev polynomial $T_n$ is defined by

\[ T_n(x) = \cos(n \cos^{-1} x) = \frac{n! (-2)^n}{(2n)!} (1 - x^2)^{1/2} \frac{d^n}{dx^n} \left[ (1 - x^2)^{n - 1/2} \right]. \]

Some examples are

\[ T_0(x) = 1, \quad T_1(x) = x, \quad T_2(x) = 2x^2 - 1, \quad T_3(x) = 4x^3 - 3x. \]

The term of the highest power in $T_n$ is $2^{n-1} x^n$. The Chebyshev polynomial has a celebrated property concerning the maximum norm, namely

THEOREM 1.5a (CHEBYSHEV). Of all polynomials of degree $n$ in which the coefficient of the highest power is unity the one with the smallest maximum norm on $[-1, 1]$ is $T_n(x)/2^{n-1}$ and

\[ \| T_n/2^{n-1} \|_\infty = 1/2^{n-1}. \]

Here the notation $\|f\|_\infty$ is employed to signify the maximum norm, i.e. $\sup |f|$ over the appropriate interval which, in this case, is $[-1, 1]$.

Proof. Assume that there is a polynomial $p_n(x)$ of degree $n$ and with leading coefficient unity which is of smaller maximum norm than $T_n(x)/2^{n-1}$. Let

\[ q(x) = p_n(x) - T_n(x)/2^{n-1}. \]

Then $q$ is a polynomial of degree at most $n - 1$. Since $p_n$ has a smaller norm than $T_n/2^{n-1}$, $q$ must be negative at the maxima of $T_n/2^{n-1}$ and positive at the minima of $T_n/2^{n-1}$. Now putting $x = \cos\theta$, $T_n(\cos\theta) = \cos n\theta$ so that $T_n(x)$ has zeros at $x = \cos\{(2k - 1)\pi/2n\}$ for $k = 1, 2, \ldots, n$ and therefore possesses $n + 1$ maxima and minima on $[-1, 1]$. Hence $q$ must vanish at least $n$ times which is contrary to its being a polynomial of degree $n - 1$. Thus the first part of the theorem is proved and the second part follows from the form of $T_n$ when $x = \cos\theta$.

Another way of expressing Theorem 1.5a is to say that of all polynomials of degree $n$ with maximum norm unity on $[-1, 1]$, $T_n(x)$ has the largest leading coefficient, namely $2^{n-1}$.

Series of Chebyshev polynomials can be readily summed on the computer by taking advantage of the recurrence formula

\[ T_{n+1}(x) - 2x T_n(x) + T_{n-1}(x) = 0. \]

For instance, if

\[ f(x) = \sum_{n=0}^{N} a_n T_n(x), \]

define $b_{N+1} = 0$, $b_N = a_N$ and then calculate $b_{N-1}, \ldots, b_1$ from

\[ b_n = 2x b_{n+1} - b_{n+2} + a_n. \]

It follows from the recurrence formula for $T_n$ that

\[ f(x) = a_0 - b_2 + b_1 x. \]
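The backward recurrence for summing a Chebyshev series can be written in a few lines. This is a minimal sketch, assuming the coefficients are supplied as a sequence `a = [a_0, ..., a_N]` with the full $a_0$ (not $\tfrac12 a_0$) as the leading term and that the recurrence is $b_n = 2x b_{n+1} - b_{n+2} + a_n$; the function name `chebyshev_sum` is illustrative.

```python
def chebyshev_sum(a, x):
    """Evaluate sum_{n=0}^{N} a[n] * T_n(x) by the backward recurrence."""
    b_next, b_next2 = 0.0, 0.0            # b_{n+1} and b_{n+2}
    for a_n in reversed(a[1:]):           # n = N, N-1, ..., 1
        b_next, b_next2 = a_n + 2.0 * x * b_next - b_next2, b_next
    # Now b_next = b_1 and b_next2 = b_2
    return a[0] - b_next2 + b_next * x
```

Since $T_n(\cos\theta) = \cos n\theta$, the result can be compared with a direct cosine sum at $x = \cos\theta$ as an independent check.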

The round-off characteristics of this method are no worse than those of ordinary polynomial evaluation and the same number of multiplications is used. In fact, the method can be used for any system of polynomials $p_n(x)$ which satisfies a recurrence relation of the form

\[ p_{n+1}(x) - \beta(x) p_n(x) + p_{n-1}(x) = 0 \]

by putting

\[ b_n = \beta(x) b_{n+1} - b_{n+2} + a_n \]

and then

\[ \sum_{n=0}^{N} a_n p_n(x) = (a_0 - b_2) p_0(x) + b_1 p_1(x). \]

Any power series can be expressed as an expansion in Chebyshev polynomials by employing formulae such as

\[ 1 = T_0(x), \quad x = T_1(x), \quad x^2 = \tfrac12 \{T_0(x) + T_2(x)\}, \quad x^3 = \tfrac14 \{3T_1(x) + T_3(x)\}. \]

It is often possible to reduce the degree of an approximating polynomial and thereby economize in computation by exploiting the properties of Chebyshev polynomials. For example, if the function $f$ is approximated by the polynomial $p_{n+1}$ where

\[ p_{n+1}(x) = \sum_{k=0}^{n+1} a_k x^k, \]

consider the polynomial $p_n$ defined by

\[ p_n(x) = p_{n+1}(x) - a_{n+1} T_{n+1}(x)/2^n. \]

Then $p_n$ is of degree $n$ and

\[ p_n - f = p_{n+1} - f - a_{n+1} T_{n+1}(x)/2^n. \]

Thus the error in $p_n$ does not exceed that in $p_{n+1}$ by more than $|a_{n+1} T_{n+1}(x)|/2^n$. Since $|T_{n+1}(x)| \le 1$ on $[-1, 1]$, this additional error can be quite small when $a_{n+1}/2^n$ is small enough. In other words, truncation of the power series by removal of the higher powers by subtracting appropriate multiples of Chebyshev polynomials can lead to an effective measure of economization. Although the properties of Chebyshev polynomials have been described for the interval $[-1, 1]$ they can be extended to other finite intervals such as $[x_1, x_2]$ by first making the substitution

\[ x = \tfrac12 (x_1 + x_2) + \tfrac12 (x_2 - x_1) t, \]

which maps $-1 \le t \le 1$ onto $[x_1, x_2]$.
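Chebyshev economization can be tried on the Taylor polynomial of the exponential function. The sketch below is an illustration under stated assumptions: NumPy is assumed, `economize` is a hypothetical helper name, and `numpy.polynomial.chebyshev.cheb2poly` is used to obtain the power-series coefficients of $T_{n+1}$.

```python
import math
import numpy as np
from numpy.polynomial import chebyshev


def economize(coeffs):
    """Lower the degree of sum(coeffs[k] * x**k) by one (degree >= 1).

    Subtracts a_{n+1} T_{n+1}(x) / 2**n, which cancels the highest power.
    """
    a = np.asarray(coeffs, dtype=float)
    degree = len(a) - 1
    # Power-basis coefficients of T_degree, lowest power first
    t = chebyshev.cheb2poly([0.0] * degree + [1.0])
    reduced = a - a[-1] * t / 2.0 ** (degree - 1)
    return reduced[:-1]                    # the leading coefficient is now zero


# Taylor polynomial of exp(x) up to x**5, economized twice down to degree 3
taylor = [1.0 / math.factorial(k) for k in range(6)]
cubic = economize(economize(taylor))
```

Multiplying `cubic` by 384 gives coefficients that can be compared with the cubic quoted in Exercise 18.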

Exercises

14. Express $1, x, \ldots, x^5$ in terms of Legendre polynomials.

15. Find the polynomial of degree 2 which gives the best $L_2$-norm approximation to $e^x$ on $[0, 1]$.


16. The function $f(x)$ was determined experimentally and found to have the following values

x:     1.00  1.04  1.08  1.12  1.16  1.20
f(x):  8.41  8.63  8.82  9.00  9.17  9.32

Find the polynomial of degree 2 which gives the best approximation in $L_2$-norm.

17. By making the substitution

\[ x = \frac{(\sqrt{2} + 1) y - 1}{(\sqrt{2} - 1) y + 1} \]

express $\tan^{-1} y$ in terms of Chebyshev polynomials of $x$. If only those $T_n$ are retained for which $n \le 7$ show that the recurrence relation method gives $\tan^{-1}(1/\sqrt{3}) = 0.5235986$.

18. By starting from the Taylor series for $e^x$ up to powers of $x^5$ show that Chebyshev truncation leads to

\[ e^x = (382 + 383x + 208x^2 + 68x^3)/384 \]

with an error of not more than one unit in the second decimal place on $[-1, 1]$.

1.6 Rational approximation

Although Weierstrass's theorem tells us that any continuous function can be approximated as closely as we like on a finite interval, the degree of the polynomial may be unduly high for a specified level of accuracy. Again, the presence of a singularity in the complex plane near the real axis may render polynomial approximation awkward. For these reasons it is worth considering whether a rational function will give better accuracy as an approximant than a polynomial. It has been suggested (see, for example, Hart et al. (1968)) that for a given amount of computational effort rational functions give greater accuracy than polynomials.

Consider the possibility of constructing a rational approximation to $f$ in a neighbourhood of the origin; there is no loss of generality in selecting the origin since any other point can be converted to it by a simple change of variable. We try $p_m(x)/q_n(x)$ where $p_m$ and $q_n$ are polynomials of degree $m$ and $n$ respectively, and are supposed to have no common zero since, otherwise, it could be cancelled. One method of specifying $p_m$ and $q_n$ is to require that $p_m/q_n$ and its first $m + n$ derivatives agree with $f$ and its first $m + n$ derivatives at $x = 0$; it is then called a Padé approximant. For example, for a Padé approximant to $\ln(1 + x)$ with $m = 2$ and $n = 2$ we would want the coefficients in

\[ (a_0 + a_1 x + a_2 x^2)/(b_0 + b_1 x + b_2 x^2) \]

chosen so that the expansion of the rational function near $x = 0$ was the same as $x - x^2/2 + \cdots$. To put it another way we wish to make as many powers of $x$ disappear from

\[ a_0 + a_1 x + a_2 x^2 - (b_0 + b_1 x + b_2 x^2)(x - \tfrac12 x^2 + \cdots) \]

as possible. Therefore, select

\[ a_0 = 0, \quad a_1 = b_0, \quad a_2 = b_1 - \tfrac12 b_0, \quad b_2 - \tfrac12 b_1 + \tfrac13 b_0 = 0, \quad -\tfrac12 b_2 + \tfrac13 b_1 - \tfrac14 b_0 = 0 \]

so as to eliminate powers up to and including $x^4$; if we tried to remove $x^5$ we should find $b_0 = b_1 = b_2 = 0$ which is obviously unacceptable. Since we have one more coefficient than equations we normalize by putting $b_0 = 1$. Then $a_1 = 1$, $b_1 = 1$, $a_2 = \tfrac12$, $b_2 = \tfrac16$ and the Padé approximant to $\ln(1 + x)$ is

\[ \frac{x + \tfrac12 x^2}{1 + x + \tfrac16 x^2} \]

agreeing to powers of up to $x^4$ in $\ln(1 + x)$. Other Padé approximants can, of course, be constructed by choosing different values of $m$ and $n$ but, as a matter of practice, it is usually found that the best approximations are obtained by taking $m = n$ or possibly $m = n + 1$ provided that $f$ has a Taylor expansion at the origin.

An alternative form of rational approximation may be derived from Obreschkoff's formula

\[ \sum_{k=0}^{n} (-1)^k \frac{n!(m + n - k)!}{(n - k)!(m + n)!} \frac{(x - x_1)^k}{k!} f^{(k)}(x) = \sum_{k=0}^{m} \frac{m!(m + n - k)!}{(m - k)!(m + n)!} \frac{(x - x_1)^k}{k!} f^{(k)}(x_1) + \frac{1}{(m + n)!} \int_{x_1}^{x} (x - t)^m (x_1 - t)^n f^{(m+n+1)}(t) \, dt \]

which may be verified by integrating the integral by parts $m + n + 1$ times. The integral is effectively of order $(x - x_1)^{m+n+1}$ and so can be ignored to a first approximation; its explicit form can be used to provide an estimate of the error made in such neglect. As an example let $f(x) = x^\mu$ and $x_1 = 1$. Then, dropping the integral, we have with $m = n = 1$

\[ x^\mu - \tfrac12 (x - 1) \mu x^{\mu - 1} = 1 + \tfrac12 (x - 1) \mu \]

or

\[ x^\mu = \frac{x(2 - \mu + \mu x)}{\mu + (2 - \mu) x} \]

as a rational approximation valid near $x = 1$ for any real $\mu$.
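A quick numerical check of the rational approximation to $x^\mu$ near $x = 1$ is sketched below. It assumes the approximation is $x(2 - \mu + \mu x)/\{\mu + (2 - \mu)x\}$, obtained by solving the $m = n = 1$ relation for $x^\mu$; the function name `power_approx` is illustrative.

```python
def power_approx(x, mu):
    """Rational approximation to x**mu near x = 1 (m = n = 1 case)."""
    return x * (2.0 - mu + mu * x) / (mu + (2.0 - mu) * x)


# Compare with the exact power at a point close to 1
x, mu = 1.1, 0.5
error = abs(power_approx(x, mu) - x ** mu)
```

The neglected integral is of third order in $x - 1$, so the error should shrink roughly by a factor of eight when the distance from $x = 1$ is halved.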


Padé approximants usually become increasingly inaccurate as $|x|$ increases. So attempts have been made to minimize $|p_m/q_n - f|$ over an interval. Something like the second algorithm of Remes (§1.4) can be constructed but the algorithm may not converge if the initial approximation is not sufficiently good and, in any case, the solution of non-linear equations is involved at each stage.

A convenient method for evaluating rational functions is by continued fractions, which may also arise in other contexts in numerical work. (Expansions for numerous functions in polynomials, Chebyshev polynomials, rational functions, and continued fractions can be found in Abramowitz and Stegun (1965).) To fabricate a continued fraction suppose we are given $m/n$. Divide $m$ by $n$; let $a_1$ be the quotient and $p$ the remainder so that

\[ \frac{m}{n} = a_1 + \frac{p}{n} = a_1 + \frac{1}{n/p}. \]

Divide $n$ by $p$; let $a_2$ be the quotient and $q$ the remainder; then

\[ \frac{n}{p} = a_2 + \frac{q}{p} = a_2 + \frac{1}{p/q}. \]

Proceeding in this way we obtain

\[ \frac{m}{n} = a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}} = a_1 + \frac{1}{a_2 +}\,\frac{1}{a_3 +} \cdots. \]

More generally we can consider expressions of the form

\[ b_0 + \frac{a_1}{b_1 +}\,\frac{a_2}{b_2 +} \cdots. \]

If the number of terms is finite it is called a terminating continued fraction. Otherwise, it is called an infinite continued fraction and the terminating fraction

\[ f_n = b_0 + \frac{a_1}{b_1 +}\,\frac{a_2}{b_2 +} \cdots \frac{a_n}{b_n} \]

is called the $n$th convergent. If $\lim_{n \to \infty} f_n$ exists, an infinite continued fraction is said to be convergent. It can be proved that, if $a_i = 1$ and the $b_i$ are integers, convergence is always secured. If $f_n = A_n/B_n$ it may easily be verified that

\[ A_n = b_n A_{n-1} + a_n A_{n-2}, \tag{1.33} \]
\[ B_n = b_n B_{n-1} + a_n B_{n-2}, \tag{1.34} \]

subject to $A_{-1} = 1$, $A_0 = b_0$, $B_{-1} = 0$, $B_0 = 1$. Hence

\[ f_{n+1} - f_n = -a_{n+1} B_{n-1} (f_n - f_{n-1})/B_{n+1}. \]
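The recurrences (1.33) and (1.34) give the successive convergents without re-evaluating the fraction from the bottom each time. The following is a minimal sketch using exact rational arithmetic from the Python standard library; the function name `convergents` and the argument layout (`b = [b_0, ..., b_n]`, `a = [a_1, ..., a_n]`) are illustrative choices, and it is assumed that no denominator $B_n$ vanishes.

```python
from fractions import Fraction


def convergents(b, a):
    """Return [f_0, f_1, ..., f_n] for b_0 + a_1/(b_1 + a_2/(b_2 + ...))."""
    A_prev, A = 1, b[0]        # A_{-1}, A_0
    B_prev, B = 0, 1           # B_{-1}, B_0
    result = [Fraction(A, B)]
    for a_k, b_k in zip(a, b[1:]):
        A_prev, A = A, b_k * A + a_k * A_prev    # (1.33)
        B_prev, B = B, b_k * B + a_k * B_prev    # (1.34)
        result.append(Fraction(A, B))
    return result


# 1 + 1/(2 + 1/(2 + 1/(2 + ...))), whose limit is the square root of 2
approximations = convergents([1, 2, 2, 2, 2], [1, 1, 1, 1])
```

For this fraction with positive $a_i$ and $b_i$ the convergents of even order lie below $\sqrt{2}$ and those of odd order lie above it.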

If $a_i$ and $b_i$ are all positive, (1.34) indicates that $0 < a_{n+1} B_{n-1}/B_{n+1} < 1$. Thus $f_{n+1} - f_n$ is numerically less than, and of opposite sign to, $f_n - f_{n-1}$. Now, in this case, $b_0$ is less than the continued fraction since part is omitted while the convergent $b_0 + a_1/b_1$ is greater than the continued fraction because the denominator is too small. Following this route we conclude that, when the $a_i$ and $b_i$ are positive, every convergent of odd order is greater than the continued fraction and every convergent of even order is less than the continued fraction; moreover

\[ f_{n+1} - f_{n-1} = b_{n+1} B_n (f_n - f_{n-1})/B_{n+1} \tag{1.35} \]

so that the convergents of odd order steadily decrease while those of even order steadily increase. These properties make continued fractions very convenient for computation. Since, for any rational function an equivalent terminating continued fraction can be manufactured (clearly, a terminating continued fraction in which $a_i$ and $b_i$ are polynomials is equivalent to a rational function), the continued fraction may be evaluated more economically, as far as the number of arithmetical operations is concerned, than calculating the numerator and denominator of the rational function separately and then dividing. For the conversion of series the following terminating continued fractions may be noted:

\[ \frac{1}{u_1} + \frac{1}{u_2} + \cdots + \frac{1}{u_n} = \frac{1}{u_1 -}\,\frac{u_1^2}{u_1 + u_2 -} \cdots \frac{u_{n-1}^2}{u_{n-1} + u_n}, \tag{1.36} \]

\[ \frac{1}{a_0} - \frac{x}{a_0 a_1} + \frac{x^2}{a_0 a_1 a_2} - \cdots + \frac{(-)^n x^n}{a_0 a_1 \cdots a_n} = \frac{1}{a_0 +}\,\frac{a_0 x}{a_1 - x +} \cdots \frac{a_{n-1} x}{a_n - x}. \tag{1.37} \]

Infinite series may be handled via

\[ \sum_{n=0}^{\infty} a_n x^n = \frac{\alpha_0}{1 -}\,\frac{\alpha_1 x}{1 + \alpha_1 x -}\,\frac{\alpha_2 x}{1 + \alpha_2 x -} \cdots \tag{1.38} \]

where $\alpha_0 = a_0$, $\alpha_n = a_n/a_{n-1}$ $(n \ge 1)$. Alternate expressions can be derived by using the fact that the $n$th convergent can be written as

\[ b_0 + \frac{c_1 a_1}{c_1 b_1 +}\,\frac{c_1 c_2 a_2}{c_2 b_2 +}\,\frac{c_2 c_3 a_3}{c_3 b_3 +} \cdots \frac{c_{n-1} c_n a_n}{c_n b_n} \]

for arbitrary non-zero $c_i$.

Exercises

19. (i) Construct the Padé approximant with $m = n = 2$ for $e^x$ in the neighbourhood of the origin. (ii) Find the maximum norm of the difference between the Padé approximant and $e^x$ on $[0, 1]$. Compare your result with the polynomial of degree 5 obtained by the first algorithm of Remes with $S$ the set 0(0.1)1.

20. Find the Padé approximants with (i) $m = 2$, $n = 2$, (ii) $m = 3$, $n = 2$ for $\sin x$ near the origin.

21. Use Obreschkoff's formula to obtain the approximations (i) $\ln(1 + x)$

=-

=

6(x

+ 2) 2 2 (2x + 1)

3x - 3),

-

x2

1

(11..) ex

x(x

1+-x+-

2 1 1 - -x 2

_12 x2 +12

near the origin. How does (ii) compare with 19(i)?

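22. Find $a$, $b$, and $c$ so that

\[ \max_{0 \le x \le 1} \left| e^x - \frac{a + bx}{1 + cx} \right| \]

is a minimum. Compare the corresponding Padé approximant with $m = n = 1$.

23. Calculate successive convergents to

\[ \text{(i)} \quad 2 + \frac{1}{6 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{11 +}\,\frac{1}{2}, \]
\[ \text{(ii)} \quad \frac{1}{2 +}\,\frac{1}{2 +}\,\frac{1}{3 +}\,\frac{1}{1 +}\,\frac{1}{4 +}\,\frac{1}{2 +}\,\frac{1}{6}. \]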

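24. A metre equals 1.0936 yards. Find limits to the error in taking 222/203 yards as equivalent to a metre.

25. Show that

\[ \text{(i)} \quad \tan x = \frac{x}{1 -}\,\frac{x^2}{3 -}\,\frac{x^2}{5 -} \cdots, \]
\[ \text{(ii)} \quad \ln \frac{1 + x}{1 - x} = \frac{2x}{1 -}\,\frac{x^2}{3 -}\,\frac{(2x)^2}{5 -}\,\frac{(3x)^2}{7 -} \cdots. \]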


26. The numerator and denominator of a rational function, both of degree $n$, are expressed in terms of Chebyshev polynomials. Obtain the formulae converting it to a continued fraction of the form

\[ a_0 + \frac{b_1}{x + a_1 +}\,\frac{b_2}{x + a_2 +} \cdots. \]

1.7 Trigonometric interpolation

The approximation of a function $f$ on $[0, 2\pi]$ by a series of the form

\[ \tfrac12 a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx) \]

is a particular case of the general theory developed in §1.5. Nevertheless some of the formulae are of interest and will be needed subsequently. By the general theory the best $L_2$-norm approximation to $f$ is obtained when $a_n = \alpha_n$ and $b_n = \beta_n$ where

\[ \alpha_n = (1/\pi) \int_0^{2\pi} f(x) \cos nx \, dx, \]
\[ \beta_n = (1/\pi) \int_0^{2\pi} f(x) \sin nx \, dx. \]

The coefficients $\alpha_n$ and $\beta_n$ are, of course, those which would occur in the infinite Fourier series representation of $f$. This infinite series may not converge to $f$ but, if $f$ has only a finite number of discontinuities which are finite jumps, the series converges to $\tfrac12 \{f(x + 0) + f(x - 0)\}$ at interior points and $\tfrac12 \{f(0 + 0) + f(2\pi - 0)\}$ at $x = 0, 2\pi$ (when $f$ is piecewise smooth). However, since at the moment we are concerned with finite trigonometric series the problem of convergence does not arise.

Suppose now that we ask that the trigonometric expansion be specified not by the $L_2$-norm but by being required to agree with $f$ at certain points. Let the points be chosen as $kh$ $(k = 0, 1, \ldots, M)$ where $M$ is a positive integer and $h = 2\pi/M$. Then we try to find $a_n$ and $b_n$ so that

\[ \tfrac12 a_0 + \sum_{n=1}^{N} (a_n \cos nkh + b_n \sin nkh) = f(kh) \quad (k = 1, \ldots, M - 1) \]
\[ = \tfrac12 \{f(0) + f(2\pi)\} \quad (k = 0, M). \tag{1.39} \]

Now

\[ \sum_{k=1}^{M} e^{inkh} = \frac{e^{inh} - e^{i(M+1)nh}}{1 - e^{inh}} \]

unless $e^{inh} = 1$. But $e^{iMnh} = 1$, since $n$ is an integer and so the series is zero if $e^{inh} \ne 1$. If, however, $e^{inh} = 1$, each term in the series is 1 and so

\[ \sum_{k=1}^{M} e^{inkh} = M \quad (\text{if } n/M \text{ is an integer}) \]
\[ = 0 \quad (\text{otherwise}) \tag{1.40} \]

since $n/M$ being an integer is the condition for $e^{inh} = 1$. If $m$ and $n$ are integers we see from (1.40) that

\[ \sum_{k=1}^{M} e^{i(m+n)kh} = M \quad (\text{if } (m + n)/M \text{ is an integer}), \]
\[ \sum_{k=1}^{M} e^{i(n-m)kh} = M \quad (\text{if } (n - m)/M \text{ is an integer}) \]

and otherwise the sum of each series is zero. With $\Re$ denoting the real part

\[ \cos nkh \cos mkh = \tfrac12 \Re \{ e^{i(m+n)kh} + e^{i(n-m)kh} \} \]

and hence

\[ \sum_{k=1}^{M} \cos nkh \cos mkh = 0 \quad \text{or} \quad \tfrac12 M \quad \text{or} \quad M \tag{1.41} \]

according as (a) neither $(n + m)/M$ nor $(n - m)/M$ is an integer, (b) one but not both of $(n + m)/M$ and $(n - m)/M$ is an integer, (c) both $(n + m)/M$ and $(n - m)/M$ are integers. Similarly, from

\[ \sum_{k=1}^{M} \sin nkh \sin mkh = \tfrac12 \Re \sum_{k=1}^{M} \{ e^{i(n-m)kh} - e^{i(n+m)kh} \}, \]
\[ \sum_{k=1}^{M} \cos nkh \sin mkh = \tfrac12 \Im \sum_{k=1}^{M} \{ e^{i(n+m)kh} - e^{i(n-m)kh} \} \]

we deduce that

\[ \sum_{k=1}^{M} \sin nkh \sin mkh = 0 \quad \text{or} \quad -\tfrac12 M \quad \text{or} \quad \tfrac12 M \tag{1.42} \]

according as (a) both $(n + m)/M$ and $(n - m)/M$ are integers or neither is, (b) $(n + m)/M$ is an integer but $(n - m)/M$ is not, (c) $(n - m)/M$ is an integer but $(n + m)/M$ is not, and that

\[ \sum_{k=1}^{M} \cos nkh \sin mkh = 0. \tag{1.43} \]

Multiply the $k$th equation of (1.39) by $\cos mkh$, where $m$ is one of the integers $0, \ldots, N$, and add. Then

\[ \sum_{k=1}^{M-1} f(kh) \cos mkh + \tfrac12 \{f(0) + f(2\pi)\} \cos 2\pi m = \sum_{k=1}^{M} \left\{ \tfrac12 a_0 + \sum_{n=1}^{N} (a_n \cos nkh + b_n \sin nkh) \right\} \cos mkh. \tag{1.44} \]

Suppose now that $M$ is even; select $N = \tfrac12 M$. Then, from (1.40), (1.41), and (1.43) the right-hand side of (1.44) is $\tfrac12 M a_m$ if $m \ne \tfrac12 M$ and $M a_N$ if $m = \tfrac12 M = N$. In a similar way the right-hand side of

\[ \sum_{k=1}^{M-1} f(kh) \sin mkh + \tfrac12 \{f(0) + f(2\pi)\} \sin 2\pi m = \sum_{k=1}^{M} \left\{ \tfrac12 a_0 + \sum_{n=1}^{N} (a_n \cos nkh + b_n \sin nkh) \right\} \sin mkh \]

is $\tfrac12 M b_m$ when $m \ne 0, \tfrac12 M$. Thus the solution to our problem when $M$ is even is

\[ \tfrac12 a_0 + \tfrac12 a_N \cos Nx + \sum_{n=1}^{N-1} (a_n \cos nx + b_n \sin nx) \]

where $N = \tfrac12 M$ and

\[ a_m = \frac{2}{M} \sum_{k=1}^{M} f(kh) \cos mkh, \tag{1.45} \]
\[ b_m = \frac{2}{M} \sum_{k=1}^{M} f(kh) \sin mkh \tag{1.46} \]

with the understanding that f(Mh) means !{f(O) + f(2n)}. It will be observed that there is no other solution since the coefficients am and bm vanish when f is zero in (1.45) and (1.46). If M is odd, an analogous procedure gives the expansion N

tao + L

(an cos nx

n=1

where N (1.46).

= t(M -

+

b; sin nx)

1) and the coefficients am, b". are still given by (1.45) and
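The coefficient formulae (1.45) and (1.46) translate directly into a short program. The following is only a sketch in Python: the function name `trig_interpolant` and the test function $f(x) = x$ are illustrative choices, not part of the text. It builds the interpolant for even or odd $M$ and can be used to check that the nodal conditions (1.39) are met.

```python
import math

def trig_interpolant(f, M):
    """Return p(x), the trigonometric interpolant of f at the points kh, h = 2*pi/M.

    Coefficients follow (1.45)-(1.46); f(Mh) is read as (f(0) + f(2*pi))/2.
    """
    h = 2.0 * math.pi / M
    # Nodal values for k = 1, ..., M (the last one is the mean of the end values).
    vals = [f(k * h) for k in range(1, M)] + [0.5 * (f(0.0) + f(2.0 * math.pi))]
    N = M // 2  # N = M/2 for even M, (M - 1)/2 for odd M
    a = [2.0 / M * sum(v * math.cos(m * k * h) for k, v in enumerate(vals, start=1))
         for m in range(N + 1)]
    b = [2.0 / M * sum(v * math.sin(m * k * h) for k, v in enumerate(vals, start=1))
         for m in range(N + 1)]

    def p(x):
        total = 0.5 * a[0]
        top = N - 1 if M % 2 == 0 else N
        for n in range(1, top + 1):
            total += a[n] * math.cos(n * x) + b[n] * math.sin(n * x)
        if M % 2 == 0:
            # For even M the highest cosine term carries the factor 1/2.
            total += 0.5 * a[N] * math.cos(N * x)
        return total

    return p

# Illustrative use: f(x) = x with M = 4.
p = trig_interpolant(lambda x: x, 4)
for k in range(0, 5):
    print(k, p(k * math.pi / 2))
```

At $k = 0$ and $k = M$ the interpolant returns the mean of the end values rather than $f$ itself, which is the behaviour prescribed by (1.39).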

The analysis of the inner product $\sum_i f(x_i) g^*(x_i)$ in §1.5 demonstrates that, not only does the trigonometric polynomial agree with the function at the specified points, but also it is the same as would be obtained by the method of least squares in fitting the data by a trigonometric polynomial of degree $N$.

Exercises

27a. Find the trigonometric interpolant on $[0, 2\pi]$ for $f(x) = x$ with $M = 4$ and show that it is badly in error at the end-points.

27b. If $f(x) = x$ $(0 \le x \le \pi)$, $= 2\pi - x$ $(\pi \le x \le 2\pi)$ obtain the trigonometric interpolant when $M = 3$ and when $M = 10$. Compare the graphs of the interpolants with the original function.

SOLUTION OF EQUATIONS

1.8 Solution of an equation

Often one is faced with the problem of finding the values of $x$ which satisfy an equation of the form

$$f(x) = 0. \tag{1.47}$$

Such a value of $x$ is called a root of (1.47) or a zero of $f$. Since the number of equations which can be solved analytically is very limited, the devising of numerical techniques is of paramount importance.

It is necessary to be aware right from the start that it will rarely be possible to find the roots of (1.47) exactly by numerical methods. There are several reasons for this. In the first place, unless $f$ is a very elementary function, it will usually have to be replaced by some approximant, perhaps one of the types discussed in preceding sections. Such replacement is bound to introduce some error. Secondly, any computation will usually involve round-off error. Thirdly, any computer can carry only a certain set of rational numbers so that if the root of (1.47) is not a rational number or is a rational number outside the computer set its representation in the computer must inevitably be in error. Given that these sources of error are virtually inescapable it is vital to arrange that techniques produce answers which can be related to the roots of (1.47) and, in particular, do not supply more or fewer zeros of $f$ than were originally present.

Suppose that $f$ is continuous for $a \le x \le b$ and that $f(a)$ and $f(b)$ have opposite signs, i.e. $f(a)f(b) < 0$. Then we know that $f(x) = 0$ has at least one root in $[a, b]$. In the bisection method we aim to locate a root by taking a sequence of intervals, each half the size of the previous one and each containing a root. The actual algorithm is:

Define $a_0 = a$, $b_0 = b$ and then form the numbers $a_1, b_1, a_2, b_2, \ldots$ successively by the following procedure. Put

$$c_r = \tfrac12(a_{r-1} + b_{r-1})$$

and calculate $f(c_r)$. If $f(c_r) = 0$ then $x = c_r$ is the root sought. If $f(c_r) \neq 0$ then either (i) $f(c_r)f(a_{r-1}) > 0$ and then we define $a_r = c_r$, $b_r = b_{r-1}$ or (ii) $f(c_r)f(a_{r-1}) < 0$ and then we define $a_r = a_{r-1}$, $b_r = c_r$. Stop the process when $|a_r - b_r| \le \varepsilon$, where $\varepsilon$ is some pre-assigned number. In general $\varepsilon$ is selected so that desired accuracy is attained or so as to keep the number of iterations down to a specified level. The convergence of the process is governed by Theorem 1.8.

THEOREM 1.8. Under the conditions of the algorithm

(i) $b_r - a_r = (b - a)/2^r$

and, if $x_0$ is the root of $f(x) = 0$,

(ii) $|x_0 - \tfrac12(a_r + b_r)| < \tfrac12(b_r - a_r) = (b - a)/2^{r+1}$.

Proof. If (i) of the algorithm applies

$$b_r - a_r = b_{r-1} - c_r = \tfrac12(b_{r-1} - a_{r-1}).$$

If (ii) applies

$$b_r - a_r = c_r - a_{r-1} = \tfrac12(b_{r-1} - a_{r-1})$$

so that there is the same connection between the lengths of successive intervals in both cases. Part (i) of the theorem is an immediate consequence. For part (ii) remark that

$$x_0 - \tfrac12(a_r + b_r) = \tfrac12(x_0 - a_r) + \tfrac12(x_0 - b_r).$$

Now $x_0 - a_r$ is positive and $x_0 - b_r$ is negative so that the right-hand side must be less than $\tfrac12(x_0 - a_r)$ and greater than $\tfrac12(x_0 - b_r)$. However, $x_0 < b_r$ so that $x_0 - a_r < b_r - a_r$, and $x_0 > a_r$ so that $x_0 - b_r > a_r - b_r$. Thus the right-hand side is smaller than $\tfrac12(b_r - a_r)$ and larger than $\tfrac12(a_r - b_r)$, i.e.

$$|x_0 - \tfrac12(a_r + b_r)| < \tfrac12(b_r - a_r).$$

In the method of false position the mid-point is replaced by the point

$$c_{r+1} = \frac{b_r f(a_r) - a_r f(b_r)}{f(a_r) - f(b_r)}. \tag{1.48}$$

Apart from this change the method of false position has the same procedure as the bisection method. It can be proved that the method of false position converges to a root under the same conditions as Theorem 1.8 but the convergence is generally much slower than that for the bisection method.
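The two bracketing procedures can be sketched side by side. This is an illustration only: the function names, the iteration caps, and the residual-based stopping rule used for false position (`ftol`) are assumptions made here, since for false position the bracket need not shrink to zero and the interval test of the bisection algorithm is not suitable.

```python
def bisection(f, a, b, eps=1e-10, max_iter=200):
    """Bisection: halve [a, b] until |a_r - b_r| <= eps, keeping a sign change."""
    fa = f(a)
    if fa * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        if abs(b - a) <= eps:
            break
        c = 0.5 * (a + b)
        fc = f(c)
        if fc == 0.0:
            return c
        if fc * fa > 0:      # case (i): the root lies in [c, b]
            a, fa = c, fc
        else:                # case (ii): the root lies in [a, c]
            b = c
    return 0.5 * (a + b)

def false_position(f, a, b, ftol=1e-12, max_iter=200):
    """False position: same bracketing logic, but c is given by (1.48)."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        c = (b * fa - a * fb) / (fa - fb)
        fc = f(c)
        if abs(fc) <= ftol:  # assumed stopping rule: small residual
            break
        if fc * fa > 0:
            a, fa = c, fc
        else:
            b, fb = c, fc
    return c

# Illustrative use on f(x) = x^2 - 2 over [1, 2].
print(bisection(lambda x: x * x - 2.0, 1.0, 2.0))
print(false_position(lambda x: x * x - 2.0, 1.0, 2.0))
```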

A relation of the method of false position is the secant method, in which a sequence of points $x_1, x_2, \ldots$ is generated via (1.48) so that

$$x_{r+1} = \frac{x_{r-1} f(x_r) - x_r f(x_{r-1})}{f(x_r) - f(x_{r-1})} \tag{1.49}$$

with $x_1 = a$, $x_2 = b$. Here there is no requirement that $f(a)f(b) < 0$ but now we have no guarantee of convergence. Indeed, if there is convergence, the denominator of (1.49) must approach zero which can make for numerical difficulty. There is, of course, complete failure if $f(x_r) = f(x_{r-1})$. On the other hand, the secant method will, when it converges, usually do so faster than the bisection method or the method of false position.

The iterative methods that have been discussed so far and those to be mentioned subsequently are all of the type

$$x_{r+1} = F(x_r). \tag{1.50}$$

If $\lim_{r\to\infty} x_r = x_0$ and $F$ is continuous in a neighbourhood of $x_0$, $\lim_{r\to\infty} F(x_r) = F(x_0)$. Hence, a convergent iteration with continuous $F$ leads to a root of

$$x = F(x). \tag{1.51}$$

Thus the main question is whether the sequence converges and the answer to this may depend not only on the form of $F$ but also the starting value $x_1$. A somewhat stronger condition than continuity is to require

$$|F(x) - F(y)| \le M|x - y| \tag{1.52}$$

which is a Lipschitz condition. If $F$ is differentiable the mean value theorem asserts that $F(x) - F(y) = F'(\xi)(x - y)$ for some $\xi$ in $(x, y)$. Thus, if $|F'(\xi)| \le M$, $F$ satisfies the Lipschitz condition (1.52). We now prove

THEOREM 1.8a. If $F$ satisfies (1.52) for all $x$, $y$ with $M < 1$ then (1.51) has a unique root $x_0$ and the iteration (1.50) converges to it for any $x_1$.

Proof. From (1.50) and (1.52)

$$|x_{r+1} - x_r| = |F(x_r) - F(x_{r-1})| \le M|x_r - x_{r-1}| \le M^{r-1}|x_2 - x_1|$$

by repeated application. Hence, for any positive integer $s$,

$$|x_{r+s} - x_r| \le |x_{r+s} - x_{r+s-1}| + |x_{r+s-1} - x_{r+s-2}| + \cdots + |x_{r+1} - x_r|$$

$$\le (M^{r+s-2} + M^{r+s-3} + \cdots + M^{r-1})|x_2 - x_1| \le M^{r-1}|x_2 - x_1|/(1 - M).$$

Since $M < 1$, the right-hand side tends to zero as $r \to \infty$ and therefore so does the left-hand side. But this is the standard Cauchy condition for the convergence of the sequence $\{x_r\}$ to a limit $x_0$. Because (1.52) implies that $F$ is continuous, $x_0$ is a solution of (1.51).

To complete the proof it remains to show that there is no other root. Suppose there were another root $y_0$. Then

$$|y_0 - x_0| = |F(y_0) - F(x_0)| \le M|y_0 - x_0|$$

from (1.52). On account of $M < 1$, the only possibility is $y_0 = x_0$ and the proof is terminated.
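A small numerical experiment illustrates the theorem. The sketch below assumes the particular map $F(x) = \tfrac12\cos x$, chosen here because $|F'(x)| \le \tfrac12$ for all $x$, so that (1.52) holds with $M = \tfrac12$; the function name `fixed_point` and the tolerance are likewise illustrative. It also checks the estimate $|x_r - x_0| \le M^{r-1}|x_2 - x_1|/(1 - M)$ obtained by letting $s \to \infty$ in the proof.

```python
import math

def fixed_point(F, x1, tol=1e-12, max_iter=1000):
    """Iterate x_{r+1} = F(x_r) as in (1.50); return the list [x_1, x_2, ...]."""
    xs = [x1]
    for _ in range(max_iter):
        xs.append(F(xs[-1]))
        if abs(xs[-1] - xs[-2]) <= tol:
            break
    return xs

def F(x):
    # Assumed example: |F'(x)| = |sin x| / 2 <= 1/2 everywhere.
    return 0.5 * math.cos(x)

M = 0.5
xs = fixed_point(F, 3.0)
x0 = xs[-1]  # numerical stand-in for the root

# xs[i] holds x_{i+1}, so the bound reads |xs[i] - x0| <= M**i |x_2 - x_1| / (1 - M).
# A small slack absorbs the error in using xs[-1] in place of the exact root.
bounds_hold = all(
    abs(xs[i] - x0) <= M ** i * abs(xs[1] - xs[0]) / (1.0 - M) + 1e-9
    for i in range(len(xs))
)
print(x0, bounds_hold)
```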

The disadvantage of Theorem 1.8a is that it needs the Lipschitz condition (1.52) to hold for all $x$ and $y$. If we are prepared to assume that $x_0$ exists in some interval we can lighten this restriction.

THEOREM 1.8b. Let $x_0 = F(x_0)$ and assume that (1.52) holds with $M < 1$ for all $x$, $y$ in the interval $[x_0 - a, x_0 + a]$ for some $a > 0$. If $x_0 - a < x_1 < x_0 + a$ the iteration (1.50) has the properties

(i) $x_0 - a < x_r < x_0 + a$,

(ii) $\lim_{r\to\infty} x_r = x_0$,

(iii) $|x_{r+1} - x_0| \le M^r|x_2 - x_1|/(1 - M)$.

The result (i) ensures that all iterates stay within the given interval while (ii) shows that the iteration converges to the root. An estimate of the distance of an iterate from the root is supplied by (iii).

Proof. Assume firstly that, for some $r$, $x_0 - a < x_r < x_0 + a$. Then

$$|x_{r+1} - x_0| = |F(x_r) - F(x_0)| \le M|x_r - x_0| \tag{1.53}$$

from (1.52). Hence $|x_{r+1} - x_0| < a$. Therefore, if the result is true for $r$ it is true for $r + 1$. Since $|x_1 - x_0| < a$, the validity of (i) follows by induction.

Inequality (1.53) implies that

$$|x_{r+1} - x_0| \le M^r|x_1 - x_0|$$

whence $\lim_{r\to\infty} |x_{r+1} - x_0| = 0$ and (ii) is proved.

Further

$$|x_2 - x_0| = |F(x_1) - F(x_2) + F(x_2) - F(x_0)| \le M|x_1 - x_2| + M|x_2 - x_0|$$

so that $|x_2 - x_0| \le M|x_1 - x_2|/(1 - M)$. From (1.53) $|x_{r+1} - x_0| \le M^{r-1}|x_2 - x_0|$ and the proof of the theorem is finished.

THEOREM 1.8c. If $F$ is continuous and differentiable on $[x_0 - a, x_0 + a]$ where $x_0 = F(x_0)$, and $|F'(x)| \le M < 1$ then Theorem 1.8b holds and

$$\lim_{r\to\infty} \frac{x_{r+1} - x_0}{x_r - x_0} = F'(x_0).$$

Proof. We have already seen that the differentiability of $F$ entails the conditions of Theorem 1.8b so only the last part needs proof. Now

$$\lim_{r\to\infty} \frac{x_{r+1} - x_0}{x_r - x_0} = \lim_{r\to\infty} \frac{F(x_r) - F(x_0)}{x_r - x_0} = F'(x_0)$$

from the definition of a derivative and Theorem 1.8b (ii).

It should be remarked that Theorem 1.8c states that the iteration converges if $|F'| < 1$ but this does not imply that the iteration diverges if $|F'| \ge 1$. In fact, we could permit $F'(x_0) = 1$ without invalidating the theorem. More generally, if $x - F(x) > 0$ and $F'(x) > 0$ for $a + x_0 \ge x > x_0$ then $a + x_0 \ge x_r > x_0$ has the consequence $x_{r+1} = F(x_r) < x_r$ while the mean value theorem

$$x_{r+1} - x_0 = (x_r - x_0)F'(c_r),$$

with $c_r$ between $x_r$ and $x_0$, shows that $x_{r+1} > x_0$. Therefore, if $a + x_0 \ge x_1 > x_0$, induction demonstrates that $x_0 < x_{r+1} < x_r$ for all $r$. Thus the sequence converges to a limit $L \ge x_0$. By continuity, $L = F(L)$ and so $L = x_0$. Thus the sequence converges to $x_0$. Similarly, the conditions $F(x) - x > 0$, $F'(x) > 0$ for $x_0 - a \le x < x_0$ give a sequence converging to $x_0$ if $x_0 - a \le x_1 < x_0$.

Newton's method for finding $x_0$ so that $f(x_0) = 0$ may be derived in the following manner. Let $x_r$ be an approximation to $x_0$. Then

$$f(x_0) = f(x_r) + (x_0 - x_r)f'(x_r) + \tfrac12(x_0 - x_r)^2 f''\{x_r + \theta(x_0 - x_r)\} \tag{1.54}$$

where $0 < \theta < 1$. If $x_r$ is a good approximation to $x_0$, $x_0 - x_r$ can be expected to be small and then, if $f''$ is not too large, the last term can be neglected, i.e.

$$f(x_0) \approx f(x_r) + (x_0 - x_r)f'(x_r).$$

This will make $f(x_0)$ zero if

$$x_0 - x_r = -f(x_r)/f'(x_r).$$

In other words, if $x_r$ is an approximation to $x_0$, $x_r - f(x_r)/f'(x_r)$ should be a better one. Calling this new approximation $x_{r+1}$ we have the iteration formula

$$x_{r+1} = x_r - \frac{f(x_r)}{f'(x_r)}. \tag{1.55}$$

Note that if $x_r$ converges we expect its limit to be a zero of $f$ if $f'$ does not vanish there. In fact, the iteration will converge to a multiple zero as will be seen later. Sometimes to simplify the computation $f'(x_r)$ is replaced by $f'(x_1)$ but we shall consider only the form (1.55).
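The iteration (1.55) is easily programmed. The following sketch is illustrative rather than definitive: the name `newton`, the stopping rule on the size of the correction, and the iteration cap are assumptions made here. The example computes $\sqrt7$ as the positive zero of $f(x) = x^2 - 7$.

```python
def newton(f, fprime, x1, tol=1e-12, max_iter=100):
    """Newton's iteration (1.55): x_{r+1} = x_r - f(x_r)/f'(x_r)."""
    x = x1
    for _ in range(max_iter):
        d = fprime(x)
        if d == 0.0:
            raise ZeroDivisionError("f'(x_r) vanished; (1.55) cannot be applied")
        step = f(x) / d
        x -= step
        if abs(step) <= tol:  # assumed stopping rule: negligible correction
            break
    return x

# Illustrative use: the positive zero of x^2 - 7, starting from x_1 = 3.
root = newton(lambda x: x * x - 7.0, lambda x: 2.0 * x, 3.0)
print(root)
```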

The eqn (1.55) has the structure of (1.50) if

$$F(x) = x - f(x)/f'(x).$$

Hence

$$F'(x) = f(x)f''(x)/\{f'(x)\}^2$$

and Theorem 1.8c tells us that Newton's method converges to a simple zero of $f$ if $|ff''/f'^2| < 1$ in a neighbourhood of the zero. Since $f$ is small near the zero, the basic assertion is that the method will converge if $x_1$ is close enough to the zero. However, it must not be concluded that, if $x_1$ is closer to one zero than another, the iteration will necessarily converge to the nearby zero. For example, the iteration for

$$f(x) = (x - 1)(x + 1)^3$$

will converge to $-1$ if x 1 = ! even though $x_1$ is closer to $1$ than $-1$.

A modification of Newton's method is Cauchy's method in which $\theta$ is placed equal to zero in (1.54). Then $x_{r+1}$ is chosen as the root of

$$\tfrac12(x_{r+1} - x_r)^2 f''(x_r) + (x_{r+1} - x_r)f'(x_r) + f(x_r) = 0$$

for which $x_{r+1} - x_r$ has the smallest modulus. The obvious disadvantage of Cauchy's method is that it requires the calculation of two derivatives as well as the solution of a quadratic equation.

An iteration scheme which is a generalization of the secant method is Muller's method. For this, three starting values, say $x_1$, $x_2$, and $x_3$, are necessary. Then one constructs a polynomial of degree 2 which has the values $f(x_1)$, $f(x_2)$, and $f(x_3)$ at $x_1$, $x_2$, and $x_3$ respectively. The polynomial has two zeros; choose the one $x_4$ for which $|x_4 - x_3|$ is smallest. Then repeat the process starting with $x_2$, $x_3$, and $x_4$. The polynomial always possesses a root unless $f(x_r) = f(x_{r+1}) = f(x_{r+2})$ when it represents a straight line parallel to the $x$-axis. Hence, provided that this situation is never met, the iteration can proceed. The advantage of Muller's method over Newton's is that no computation of a derivative has to be undertaken. Also Muller's method offers the possibility of finding complex roots, which are excluded by Newton's method when $f$ is real.

To discuss the convergence of an iterative process we say that, if

$$\lim_{r\to\infty} \frac{|x_{r+1} - x_0|}{|x_r - x_0|^p} = b$$

where $b$ is finite and non-zero, the iterative method is of order $p$. If

$$\sup_{r \ge s} \frac{|x_{r+1} - x_0|}{|x_r - x_0|^p} = B$$

we have

$$|x_{r+s+1} - x_0| \le B|x_{r+s} - x_0|^p \le B^{1+p}|x_{r+s-1} - x_0|^{p^2}$$

and, continuing in this way, we obtain

$$|x_{r+s+1} - x_0| \le B^c|x_{s+1} - x_0|^{p^r}$$

where

$$c = 1 + p + p^2 + \cdots + p^{r-1}.$$

If $p = 1$, $c = r$ and

$$|x_{r+s+1} - x_0| \le B^r|x_{s+1} - x_0| \tag{1.56}$$

whereas, if $p \neq 1$, $c = (p^r - 1)/(p - 1)$ and

$$|x_{r+s+1} - x_0| \le B^{-1/(p-1)}\{B^{1/(p-1)}|x_{s+1} - x_0|\}^{p^r}. \tag{1.57}$$

It is evident that, if $p = 1$, convergence is relatively slow and only certain if $B < 1$. On the other hand, if $p > 1$ and

$$|x_{s+1} - x_0|B^{1/(p-1)} < 1$$

convergence will be very fast. Therefore iterative methods of higher order are to be preferred from the point of view of speed of convergence. A theorem on the order of an iterative procedure is

THEOREM 1.8d. Let $\lim_{r\to\infty} x_r = x_0$ where $x_{r+1} = F(x_r)$ and $F$ is continuous on $x_0 - a \le x \le x_0 + a$ $(a > 0)$. (i) If $F'(x)$ exists on $x_0 - a < x < x_0 + a$ and $F'(x_0) \neq 0$, the iterative method is of order 1. (ii) If $F'(x_0) = 0$ and $F''(x)$ is continuous on $x_0 - a < x < x_0 + a$, then the iterative method is of order 2 if $F''(x_0) \neq 0$.

Proof. As in Theorem 1.8c

$$\lim_{r\to\infty} \left|\frac{x_{r+1} - x_0}{x_r - x_0}\right| = |F'(x_0)|$$

so that, when $F'(x_0) \neq 0$, the method is of order 1. In case (ii) $\lim_{r\to\infty} x_r = x_0$ implies that all $x_r$ from some $r$ onwards will certainly lie between $x_0 - a$ and $x_0 + a$. For such $r$ Taylor's theorem gives

$$F(x_r) = F(x_0) + (x_r - x_0)F'(x_0) + \tfrac12(x_r - x_0)^2F''(c_r)$$

where $c_r$ is between $x_r$ and $x_0$. Since $F'(x_0) = 0$,

$$\lim_{r\to\infty} \frac{|x_{r+1} - x_0|}{(x_r - x_0)^2} = \lim_{r\to\infty} \frac{|F(x_r) - F(x_0)|}{(x_r - x_0)^2} = \lim_{r\to\infty} |\tfrac12F''(c_r)| = \tfrac12|F''(x_0)|$$

because $F''$ is continuous and $c_r \to x_0$ since $x_r \to x_0$. Since $F''(x_0) \neq 0$ the proof of the theorem is complete.

In Newton's method $F'(x_0) = 0$ and $F''(x_0) = f''(x_0)/f'(x_0)$. Therefore, if $f''(x_0) \neq 0$, Newton's method is of order 2 for a simple root provided that $f'''$ is continuous on an interval including $x_0$.

If $x_0$ is a $q$-fold root where $f(x_0) = f'(x_0) = \cdots = f^{(q-1)}(x_0) = 0$ but $f^{(q)}(x_0) \neq 0$, Newton's method may still be shown to converge when $f^{(q)}$ is continuous in a neighbourhood of $x_0$. First, observe that

$$x_{r+1} - x_0 = \frac{(x_r - x_0)f'(x_r) - f(x_r)}{f'(x_r)}.$$

By Taylor's theorem

$$f(x_r) = (x_r - x_0)^q f^{(q)}(\xi_1)/q!, \qquad f'(x_r) = (x_r - x_0)^{q-1} f^{(q)}(\xi_2)/(q - 1)!$$

where both $\xi_1$ and $\xi_2$ lie between $x_r$ and $x_0$. Hence

$$x_{r+1} - x_0 = (x_r - x_0)\left\{1 - \frac{f^{(q)}(\xi_1)}{q f^{(q)}(\xi_2)}\right\}.$$

As $x_r \to x_0$, $\xi_1 \to x_0$ and $\xi_2 \to x_0$ so that, from the continuity of $f^{(q)}$,

$$x_{r+1} - x_0 \approx (x_r - x_0)(1 - 1/q).$$

This demonstrates that the convergence is much slower than in the case of a simple root and can be very slow indeed if $q$ is large. For a multiple root the convergence of Newton's method can be improved by adopting the formula

$$x_{r+1} = x_r - q f(x_r)/f'(x_r). \tag{1.58}$$

Using the same technique as just above but taking one extra term in the Taylor expansions we obtain

$$x_{r+1} - x_0 \approx \frac{(x_r - x_0)^2}{q(q + 1)} \frac{f^{(q+1)}(x_0)}{f^{(q)}(x_0)}$$

so that the method is of order 2. However, one should be warned that if (1.58) is employed near a simple root convergence may fail.

It can be demonstrated that the secant method is of order 1.62 approximately and Muller's method of order 1.84 approximately.

A standard scheme for accelerating the convergence of an iteration procedure is Aitken's $\delta^2$-method. In this method, starting from $x_r$ we generate $y_{r+1} = F(x_r)$, $y_{r+2} = F(y_{r+1})$ and then define

$$x_{r+1} = y_{r+2} - \frac{(y_{r+2} - y_{r+1})^2}{y_{r+2} + x_r - 2y_{r+1}}.$$

Analysis reveals that this scheme is of order 2 if $F'(x_0) \neq 1$ and neither $F'(x_0)$ nor $F''(x_0)$ is zero. If $F'(x_0) = 1$ the scheme is of order 1.
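Aitken's $\delta^2$-method as just described can be sketched in a few lines. The names `aitken_step` and `aitken_iterate`, the guard against a vanishing denominator, and the choice $F(x) = \cos x$ are assumptions introduced for illustration only.

```python
import math

def aitken_step(F, x):
    """One accelerated step: y1 = F(x), y2 = F(y1), then the delta-squared formula."""
    y1 = F(x)
    y2 = F(y1)
    denom = y2 + x - 2.0 * y1
    if denom == 0.0:
        # Assumed safeguard: the iterates no longer differ, so accept y2.
        return y2
    return y2 - (y2 - y1) ** 2 / denom

def aitken_iterate(F, x1, tol=1e-12, max_iter=50):
    """Repeat the accelerated step until successive values agree to within tol."""
    x = x1
    for _ in range(max_iter):
        x_new = aitken_step(F, x)
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# Illustrative use: the fixed point of F(x) = cos x, starting from x_1 = 1.
root = aitken_iterate(math.cos, 1.0)
print(root, abs(math.cos(root) - root))
```

For this $F$ the plain iteration (1.50) converges only linearly, since $|F'(x_0)|$ is about $0.67$, so the accelerated scheme needs far fewer evaluations of $F$ to reach the same accuracy.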

Exercises

28. Use the bisection method to solve (i) $8x^3 - 4x - 5 = 0$, (ii) $2x = \tan x$, correct to two decimal places.

29. On $0 \le x \le \tfrac12$, $f(x) = \tfrac12$ and on $\tfrac12 \le x \le 1$, $f(x) = 6x - 1 - 6x^2$. Obtain the value of $c_{r+1}$ in the method of false position.

30. Solve $3 \sin x = 2$ correct to three decimal places by the secant method.

31. Use Newton's method to find $\sqrt7$ correct to 2 decimal places, starting from $x_1 = 3$.

32. Obtain by Newton's method a root of (i) $x^3 - 2x^2 - 5x + 10 = 0$, starting from $x_1 = 3$, (ii) $x^3 - 6x^2 + 13x - 9 = 0$, starting from $x_1 = 2$.

33. Find the root of $27x^3 + 18x - 25 = 0$ between 0 and 1 using the iteration

$$x_{r+1} = (25 - 27x_r^3)/18$$

checking whether Theorem 1.8c is satisfied. Is the iteration better?

34. Examine the iterations

(i) $x_{r+1} = (x_r^2 + c)/b$,

(ii) $x_{r+1} = b - (c/x_r)$

as possible schemes for determining the larger root of $x^2 - bx + c = 0$ when $b > 0$, $\tfrac14 b^2 > c > 0$.

35. What happens when Newton's method is applied to $x^2 - 2x + 2 = 0$?

36. Solve $x^3 = 3$ by Cauchy's method starting from $x_1 = 3$.

37. Find a root of $\sin x + 2 = x$ by Muller's method starting with $x_1 = -1$, $x_2 = 0$, $x_3 = 1$.

38. If $F(x) = x + h(x)f(x)$ find $h$ so that the iteration method is of order 2.

39. To calculate $\sqrt a$ when $a > 0$ the following iteration is suggested:

$$x_{r+1} = \frac{x_r^3 + 3ax_r}{3x_r^2 + a}.$$

Show that it is of order 3.

1.9 Systems of non-linear equations

The solution of simultaneous non-linear equations is complicated and we shall be content to describe how Newton's method can be generalized. Suppose the values of $x$ and $y$ are required which simultaneously satisfy

$$f(x, y) = 0, \qquad g(x, y) = 0.$$

By Taylor's theorem, if we neglect second orders,

$$f(x_{r+1}, y_{r+1}) = f(x_r, y_r) + (x_{r+1} - x_r)f_x + (y_{r+1} - y_r)f_y,$$

$$g(x_{r+1}, y_{r+1}) = g(x_r, y_r) + (x_{r+1} - x_r)g_x + (y_{r+1} - y_r)g_y$$

where $f_x$ denotes the partial derivative $\partial f/\partial x$ and all the partial derivatives are evaluated at $(x_r, y_r)$. If we hope that $(x_{r+1}, y_{r+1})$ is close to a zero we want the left-hand sides to be zero. This can be arranged by putting

$$x_{r+1} = x_r + (g f_y - f g_y)/J, \tag{1.59}$$

$$y_{r+1} = y_r + (f g_x - g f_x)/J \tag{1.60}$$

where $J$ is the Jacobian defined by

$$J = f_x g_y - f_y g_x.$$

Eqns (1.59) and (1.60) constitute the generalization of Newton's method to two equations, all quantities on the right-hand side being calculated at $(x_r, y_r)$.
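A direct transcription of (1.59) and (1.60) is given below as a sketch. The function name `newton_2d`, the stopping rule, and the test system $x^2 + y^2 = 4$, $xy = 1$ are assumptions made for illustration; the partial derivatives are supplied by hand.

```python
def newton_2d(f, g, fx, fy, gx, gy, x, y, tol=1e-12, max_iter=50):
    """Newton's method for f(x, y) = 0, g(x, y) = 0 using (1.59) and (1.60)."""
    for _ in range(max_iter):
        F, G = f(x, y), g(x, y)
        Fx, Fy, Gx, Gy = fx(x, y), fy(x, y), gx(x, y), gy(x, y)
        J = Fx * Gy - Fy * Gx          # Jacobian evaluated at (x_r, y_r)
        if J == 0.0:
            raise ZeroDivisionError("Jacobian vanished at the current iterate")
        dx = (G * Fy - F * Gy) / J     # correction in (1.59)
        dy = (F * Gx - G * Fx) / J     # correction in (1.60)
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) <= tol:   # assumed stopping rule
            break
    return x, y

# Illustrative system: x^2 + y^2 = 4 and x*y = 1, starting from (2, 0.5).
f = lambda x, y: x * x + y * y - 4.0
g = lambda x, y: x * y - 1.0
sol = newton_2d(f, g,
                lambda x, y: 2.0 * x, lambda x, y: 2.0 * y,
                lambda x, y: y, lambda x, y: x,
                2.0, 0.5)
print(sol, f(*sol), g(*sol))
```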

MATRICES

1.10 Matrices

It is assumed that the reader has some acquaintance with the theory of matrices so that the treatment here will be somewhat cursory (see, for example, Liebeck (1969)). A general matrix consists of $mn$ entries arranged in $m$ rows and $n$ columns, giving an $m \times n$ array, to be denoted by a capital letter such as $A$:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$

The symbol $a_{ij}$ denotes the element in the $i$th row and $j$th column and often we shall abbreviate the notation by writing $A = (a_{ij})$. The matrix is called square and of order $n$ if $m = n$. If $n = 1$ so that the matrix consists of a single column we shall call the matrix a column vector and signify its special nature by using bold type, e.g.

$$\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix}.$$

The elements $a_{ii}$ for $i = 1, 2, \ldots, n$ in a square matrix are said to be the diagonal elements. The elementary rules of combination are:

$A = B$ if and only if $a_{ij} = b_{ij}$ for all $i, j$;

$A + B = (a_{ij} + b_{ij})$;

$\alpha A = (\alpha a_{ij})$.

Multiplication of $A$ and $B$ is possible only if $A$ has the same number of columns as $B$ has rows. If $A$ is $m \times n$ and $B$ is $n \times p$ then

$$AB = \left(\sum_{k=1}^{n} a_{ik}b_{kj}\right)$$

the result being an $m \times p$ matrix. In general, two matrices do not commute, i.e. $AB \neq BA$ even if both are square. The unit matrix $I$ of order $n$ is a square matrix all of whose elements are zero except the diagonal ones which are unity. Thus $AI = A$.

The transpose of an $m \times n$ matrix $A = (a_{ij})$ is the $n \times m$ matrix whose $ij$th element is $a_{ji}$. The symbol $A^T$ will be used to indicate the transpose. Note that the transpose $\mathbf{a}^T$ of a column matrix will be a row matrix, i.e. a matrix whose elements lie in a single row. There is no difficulty in verifying that $(A + B)^T = A^T + B^T$, $(A^T)^T = A$, $(AB)^T = B^TA^T$. If $A$ is $m \times n$ and $\mathbf{x}$ is a column matrix with $n$ elements $A\mathbf{x}$ is a column matrix whose $i$th element is

$$\sum_{j=1}^{n} a_{ij}x_j.$$

Observe that $\mathbf{x}^TA^T$ is a row matrix.

If $B$ is an $n \times m$ matrix such that $BA = I$ then $B$ is called a left-inverse of $A$. Similarly, if $C$ is $n \times m$ and $AC = I$ then $C$ is called a right-inverse of $A$. Suppose $A$ is square and has both a left-inverse and a right-inverse then

$$B = BI = B(AC) = (BA)C = IC = C.$$

Thus there is only one left-inverse and only one right-inverse and both are equal. This unique matrix is called the inverse of $A$ and denoted by $A^{-1}$. Clearly,

$$(A^{-1})^{-1} = A, \qquad (AB)^{-1} = B^{-1}A^{-1}$$

but, in general, $(A + B)^{-1} \neq A^{-1} + B^{-1}$.

A matrix is called symmetric if $A = A^T$ and anti-symmetric if $A = -A^T$. A matrix such that $A^{-1} = A^T$ is known as orthogonal.

From now on we shall be concerned primarily with square matrices $A$. It will therefore be assumed that $A$ is square and of order $n$ unless otherwise is specifically stated. It is known that the equations

$$A\mathbf{x} = 0$$

possess a solution with $\mathbf{x} \neq 0$ if and only if $\det A = 0$, where det signifies the determinant of the matrix. The quantities $\lambda_i$ such that

$$A\mathbf{x}_i = \lambda_i\mathbf{x}_i \tag{1.61}$$

has solutions $\mathbf{x}_i \neq 0$ are called the eigenvalues of $A$. The $\lambda_i$ are solutions of

$$\det(A - \lambda I) = 0$$

and are therefore $n$ in number, though some of them may be multiple roots. Since the determinant of the transpose of a matrix is the same as the determinant of the original matrix $\det(A^T - \lambda I) = 0$. Consequently, there are $\mathbf{y}_j \neq 0$ such that

$$A^T\mathbf{y}_j = \lambda_j\mathbf{y}_j. \tag{1.62}$$

Hence $A$ and $A^T$ have the same eigenvalues. Multiply (1.61) by $\mathbf{y}_j^T$ and (1.62) by $\mathbf{x}_i^T$ and subtract. Then

$$\mathbf{y}_j^TA\mathbf{x}_i - \mathbf{x}_i^TA^T\mathbf{y}_j = \lambda_i\mathbf{y}_j^T\mathbf{x}_i - \lambda_j\mathbf{x}_i^T\mathbf{y}_j.$$

The left-hand side vanishes and so

$$(\lambda_i - \lambda_j)\mathbf{y}_j^T\mathbf{x}_i = 0.$$

If $\lambda_i \neq \lambda_j$ then $\mathbf{y}_j^T\mathbf{x}_i = 0$, i.e. the eigenvectors of $A$ and $A^T$ corresponding to distinct eigenvalues are orthogonal.

Moreover, the eigenvectors corresponding to distinct eigenvalues of $A$ are linearly independent. Suppose, to the contrary, that $s$ are linearly dependent and that any smaller number are linearly independent. Then

$$\alpha_1\mathbf{x}_1 + \cdots + \alpha_s\mathbf{x}_s = 0 \tag{1.63}$$

where all the $\alpha_i$ are non-zero. On multiplying by $A$ we obtain

$$\alpha_1\lambda_1\mathbf{x}_1 + \cdots + \alpha_s\lambda_s\mathbf{x}_s = 0.$$

If $\lambda_1 = 0$, $s - 1$ vectors would be linearly dependent contrary to our hypothesis. If $\lambda_1 \neq 0$ multiply (1.63) by $\lambda_1$ and subtract; then

$$\alpha_2(\lambda_2 - \lambda_1)\mathbf{x}_2 + \cdots + \alpha_s(\lambda_s - \lambda_1)\mathbf{x}_s = 0.$$

Since $\lambda_i - \lambda_1 \neq 0$ for $i = 2, \ldots, s$ this gives a linear relation between $s - 1$ vectors. Again, a contradiction occurs and the statement is proved.

One consequence is that, if $A$ has $n$ distinct eigenvalues, $\mathbf{y}_i^T\mathbf{x}_i \neq 0$. For, if this were not true, $\mathbf{y}_i$ would be orthogonal to the $n$ independent vectors $\mathbf{x}_1, \ldots, \mathbf{x}_n$ which is impossible because $\mathbf{y}_i \neq 0$. It is therefore always possible to select $\mathbf{y}_i$ so that $\mathbf{y}_i^T\mathbf{x}_i = 1$.

Moreover, if $A$ has $n$ distinct eigenvalues, define $X$ as the matrix with columns $\mathbf{x}_i$, i.e. $X = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$. Then, with $\mathbf{y}_i$ picked so that $\mathbf{y}_i^T\mathbf{x}_i = 1$,

$$X^{-1} = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n)^T$$

because of the orthogonal relations. Hence

$$X^{-1}AX = X^{-1}(\lambda_1\mathbf{x}_1, \lambda_2\mathbf{x}_2, \ldots, \lambda_n\mathbf{x}_n) = \operatorname{diag}(\lambda_i) \tag{1.64}$$

where diag is used to denote a diagonal matrix, i.e. a matrix whose non-diagonal elements are all zero.

Two matrices $A$ and $B$ are said to be similar if there is a non-singular matrix $R$ (i.e. $\det R \neq 0$) such that $B = R^{-1}AR$. Sometimes, $A$ is said to have undergone a similarity transformation. The eigenvalues of similar matrices are the same because $A\mathbf{x} = \lambda\mathbf{x}$ can be written as

$$(R^{-1}AR)R^{-1}\mathbf{x} = \lambda R^{-1}\mathbf{x}$$

showing that $R^{-1}\mathbf{x}$ is an eigenvector of $R^{-1}AR$. What has been demonstrated above is, if $A$ has $n$ distinct eigenvalues, that $A$ is similar to a diagonal matrix whose entries are the eigenvalues of $A$. If $A$ is also symmetric then $\mathbf{y}_i = \mathbf{x}_i$ and $X^{-1} = X^T$ so that, in this case, there is an orthogonal similarity transformation converting $A$ to diagonal form.

If $A$ is symmetric with multiple eigenvalues it can be shown that there is still an orthogonal similarity transformation which changes $A$ to $\operatorname{diag}(\lambda_i)$. If, however, $A$ has multiple eigenvalues but is not symmetric the situation is more complicated. What can be demonstrated is that there is a non-singular $R$ such that

$$R^{-1}AR = J \tag{1.65}$$

where $J$ is the Jordan canonical form of $A$ and has the following structure: $J$ is a block-diagonal matrix

$$J = \begin{pmatrix} J_1 & & 0 \\ & \ddots & \\ 0 & & J_k \end{pmatrix}$$

where each $J_i$ is either the number $\lambda_i$ or a matrix of the form

$$J_i = \begin{pmatrix} \lambda_i & 1 & & 0 \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_i \end{pmatrix} \tag{1.66}$$

which is an upper triangular matrix since all the elements below the diagonal are zero. The Jordan canonical form is the most compact to which a general matrix can be reduced by a similarity transformation. The same eigenvalue may occur in different $J_i$, but the total number of times that a given eigenvalue occurs in the diagonal of $J$ is the same as the multiplicity of the eigenvalue. The number of linearly independent eigenvectors of $A$ is $k$, i.e. the number of Jordan blocks in the canonical form. In particular, if $J_i$ is $m_i \times m_i$, and $\mathbf{r}_i$ is the $i$th column of $R$ then $\mathbf{r}_1, \mathbf{r}_{m_1+1}, \ldots, \mathbf{r}_{m_1+\cdots+m_{k-1}+1}$ are the eigenvectors of $A$.

If the elements of $A$ are changed continuously, then $\det(A - \lambda I)$ varies continuously and so the eigenvalues of $A$ change continuously. In general, however, the eigenvectors do not alter continuously.

If $p(t) = a_0 + a_1t + \cdots + a_mt^m$ is a polynomial in $t$, a corresponding matrix polynomial $p(A)$ can be defined by

$$p(A) = a_0I + a_1A + \cdots + a_mA^m$$

where, of course, $A^r = AA^{r-1}$. It is immediate that any eigenvector of $A$ is an eigenvector of $p(A)$ with eigenvalue $p(\lambda_i)$. If $A$ has an inverse the eigenvalues of $A^{-1}$ are $1/\lambda_i$.

If the elements of $A$ are complex, the matrix $A^*$ is obtained by replacing each element of $A$ by its complex conjugate. Write $A^H = A^{*T}$. Then a matrix is said to be unitary if $A^HA = I$. It is called Hermitian if $A^H = A$. Note that the Hermitian matrices include the real symmetric matrices. If $\lambda$ is the eigenvalue of a Hermitian matrix $A$ so that $A\mathbf{x} = \lambda\mathbf{x}$ then

$$\mathbf{x}^HA\mathbf{x} = \lambda\mathbf{x}^H\mathbf{x}. \tag{1.67}$$

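Now $(\mathbf{x}^HA\mathbf{x})^H = (\mathbf{x}^TA^*\mathbf{x}^*)^T = \mathbf{x}^HA^H\mathbf{x} = \mathbf{x}^HA\mathbf{x}$ so that $\mathbf{x}^HA\mathbf{x}$ is real. Since $\mathbf{x}^H\mathbf{x}$ is real and non-zero it follows that $\lambda$ is real, i.e. the eigenvalues of a Hermitian matrix are real. Furthermore, if $\mathbf{x}_i$ and $\mathbf{x}_j$ are two eigenvectors,

$$\lambda_i\mathbf{x}_j^H\mathbf{x}_i = \mathbf{x}_j^HA\mathbf{x}_i = (\mathbf{x}_i^HA\mathbf{x}_j)^H = \lambda_j(\mathbf{x}_i^H\mathbf{x}_j)^H = \lambda_j\mathbf{x}_j^H\mathbf{x}_i$$

from which we deduce that $\mathbf{x}_j^H\mathbf{x}_i = 0$, i.e. the vectors are orthogonal, if $\lambda_i \neq \lambda_j$. It can be shown that if $A$ is Hermitian there is a unitary matrix $U$ such that

$$U^HAU = \operatorname{diag}(\lambda_i). \tag{1.68}$$

Moreover, the eigenvectors can be arranged to be mutually orthogonal. Consequently, any vector $\mathbf{y}$ can be expressed in the form

$$\mathbf{y} = \sum_{i=1}^{n} a_i\mathbf{x}_i.$$

Hence

$$\mathbf{y}^HA\mathbf{y} = \mathbf{y}^H\sum_{i=1}^{n} a_i\lambda_i\mathbf{x}_i = \sum_{i=1}^{n} \lambda_i|a_i|^2$$

since $\mathbf{x}_j^H\mathbf{x}_i = 0$ $(i \neq j)$ and the magnitude may be made to satisfy $\mathbf{x}_i^H\mathbf{x}_i = 1$. Put the eigenvalues in order so that $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. Then

$$\lambda_n\sum_{i=1}^{n} |a_i|^2 \ge \mathbf{y}^HA\mathbf{y} \ge \lambda_1\sum_{i=1}^{n} |a_i|^2.$$

In other words, for arbitrary $\mathbf{y}$,

$$\lambda_n\mathbf{y}^H\mathbf{y} \ge \mathbf{y}^HA\mathbf{y} \ge \lambda_1\mathbf{y}^H\mathbf{y} \tag{1.69}$$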

when $A$ is Hermitian and $\lambda_1$, $\lambda_n$ are the least and largest eigenvalues respectively of $A$.

A Hermitian matrix is said to be positive definite if $\mathbf{x}^HA\mathbf{x} > 0$ for every $\mathbf{x} \neq 0$ and positive semi-definite if $\mathbf{x}^HA\mathbf{x} \ge 0$ for every $\mathbf{x} \neq 0$. A deduction from (1.67) is that a Hermitian matrix is positive definite if, and only if, all its eigenvalues are positive. It is positive semi-definite if and only if all its eigenvalues are non-negative.

A measure of the eigenvalues of a matrix is provided by the trace $\operatorname{Tr} A$ defined by

$$\operatorname{Tr} A = a_{11} + a_{22} + \cdots + a_{nn}.$$

Obviously

$$\operatorname{Tr}(kA) = k\operatorname{Tr} A, \tag{1.70}$$

$$\operatorname{Tr}(A + B) = \operatorname{Tr} A + \operatorname{Tr} B. \tag{1.71}$$

Also

$$\operatorname{Tr}(AB) = \sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}b_{ki} = \sum_{k=1}^{n}\sum_{i=1}^{n} b_{ki}a_{ik} = \operatorname{Tr}(BA). \tag{1.72}$$

A deduction from (1.72) is that

$$\operatorname{Tr}(R^{-1}AR) = \operatorname{Tr}(ARR^{-1}) = \operatorname{Tr} A.$$

It therefore follows from the Jordan canonical form (1.65) and (1.66) that

$$\operatorname{Tr} A = \lambda_1 + \lambda_2 + \cdots + \lambda_n. \tag{1.73}$$

Exercises 40. Find the eigenvalues, eigenvectors, and Jordan canonical form of

(i)

C~)

(ii)

(

-10 0) 1 0

-1

1 1 -2

41. If $A$ and $B$ are symmetric prove that $AB$ is symmetric if and only if $AB = BA$.

42. Show that $A$ and $A^T$ are similar.

43. If $\det A \neq 0$ prove that $A^HA$ is positive definite.

44. Prove that the eigenvalues of $A^m(A + \mu I)^{-1}$ are $\lambda_i^m(\lambda_i + \mu)^{-1}$ given $\mu \neq -\lambda_i$ for any $i$.

45. Prove that the eigenvalues of

$$\begin{pmatrix} 1 + \dfrac{1}{2^r}\cos 2^{r+1} & -\dfrac{1}{2^r}\sin 2^{r+1} \\[2mm] -\dfrac{1}{2^r}\sin 2^{r+1} & 1 - \dfrac{1}{2^r}\cos 2^{r+1} \end{pmatrix}$$

are $1 \pm 2^{-r}$. Deduce that the eigenvalues tend to 1 as $r \to \infty$ but that the eigenvectors do not have a limit.

46. $A$ is real positive semi-definite and $R$ is an orthogonal matrix such that $R^TAR = \operatorname{diag}(\lambda_i)$. If $B = \operatorname{diag}(\sqrt{\lambda_i})$ and $C = RBR^T$ prove that $C^2 = A$ so that a square root of $A$ may be defined as $A^{1/2} = C$.

47. Show that the Hermitian matrix $A$ is positive semi-definite if and only if there is a matrix $B$ such that $A = BB^H$.

48. Show that (i) $\operatorname{Tr}(AA^H) > 0$, (ii) if $A$ is anti-symmetric $\operatorname{Tr} A = 0$.

1.11 Matrix norms

The modulus of a complex number gives an idea of its size and it is desirable to have a single number which plays a similar role for matrices and vectors. This quantity will be known as a norm (see also §1.5). We define the norm in terms of its properties, and not by means of a specific formula. In this way it is possible to define many different kinds of norm associated with a vector. In fact, any formula for the norm $\|\mathbf{x}\|$ of a vector $\mathbf{x}$ will be acceptable if it has the properties

(a) $\|\mathbf{x}\| > 0$ if $\mathbf{x} \neq 0$; $\|\mathbf{x}\| = 0$ only if $\mathbf{x} = 0$;

(b) $\|\alpha\mathbf{x}\| = |\alpha|\|\mathbf{x}\|$ for any complex number $\alpha$;

(c) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$.

If $\mathbf{x}$ has elements $x_1, \ldots, x_n$ standard norms are the $l_p$-norms defined by

$$\|\mathbf{x}\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} \qquad (1 \le p < \infty).$$

The $l_\infty$-norm is defined by

$$\|\mathbf{x}\|_\infty = \max_{1 \le i \le n} |x_i|$$

and corresponds to the uniform norm we have already considered.

Norms can always be obtained from inner products as we have seen in §1.5 and we now take this opportunity to define an inner product formally. An inner product $(\mathbf{x}, \mathbf{y})$ is required to satisfy

(a) $(\mathbf{x}, \mathbf{x}) > 0$ if $\mathbf{x} \neq 0$; $(\mathbf{x}, \mathbf{x}) = 0$ only if $\mathbf{x} = 0$;

(b) $(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})^*$;

(c) $(\mathbf{x} + \mathbf{y}, \mathbf{z}) = (\mathbf{x}, \mathbf{z}) + (\mathbf{y}, \mathbf{z})$, $(a\mathbf{x}, \mathbf{y}) = a(\mathbf{x}, \mathbf{y})$.

An inner product supplies a norm via $\|\mathbf{x}\| = (\mathbf{x}, \mathbf{x})^{1/2}$ and the Schwarz inequality

$$|(\mathbf{x}, \mathbf{y})| \le \|\mathbf{x}\|\|\mathbf{y}\|$$

always holds. The $l_2$-norm is often known as the Euclidean norm since it stems from the inner product $\mathbf{x}^H\mathbf{y}$. Note that in inner product notation

$$\mathbf{x}^HA\mathbf{y} = (\mathbf{x}, A\mathbf{y}) = (A^H\mathbf{x}, \mathbf{y}).$$

A matrix norm can also be introduced by asking that it has the properties

(a) $\|A\| > 0$ if $A \neq 0$; $\|A\| = 0$ only if $A = 0$;

(b) $\|\alpha A\| = |\alpha|\|A\|$ for any complex number $\alpha$;

(c) $\|A + B\| \le \|A\| + \|B\|$;

(d) $\|AB\| \le \|A\|\|B\|$.

If $\|\ \|'$ is a matrix norm and $\|\ \|$ is a vector norm, the matrix and vector norms are said to be compatible if

$$\|A\mathbf{x}\| \le \|A\|'\|\mathbf{x}\|. \tag{1.74}$$

A matrix norm can be constructed from a vector norm by defining

IIAII =

sup

lIxll =

1

IIAxll;

(1.75)

such a matrix norm is said to be subordinate to the given vector norm. It is obvious that the subordinate norm is compatible. From (1.75) can be seen by putting A = 1 that any subordinate norm has the property III II = 1. From now on the only matrix norm which will be considered is the one

48

ASPECTS OF NUMERICAL ANALYSIS

specified by (1.75). Corresponding to the vector lp-norms we have n

IIAIII = max

L

laul,

(1.76)

IIAILx> = max

L laul,

(1.77)

l~j~ni=l

n

l~i~nj=l

(1.78) where

u is the largest eigenvalue of AHA. Sometimes IIAII2 is known as the

spectral norm of A. From (1.76) and (1.77) IIAIII = IIAT ll oo. To prove these results we remark that

This inequality shows that the norm certainly does not exceed the value given in (1.76). Moreover, if
$$\sum_{i=1}^{n} |a_{ij}|$$
is largest when $j = k$, choose $x_i = 0$ $(i \ne k)$, $x_i = 1$ $(i = k)$; then the value in (1.76) is actually attained and so (1.76) is proved. The proof for $\|A\|_\infty$ is similar except that in the last stage, if
$$\sum_{j=1}^{n} |a_{ij}|$$
is greatest for $i = k$, we choose $x_i = a_{ki}^*/|a_{ki}|$ $(a_{ki} \ne 0)$, $x_i = 1$ $(a_{ki} = 0)$ to achieve the supremum. For $\|A\|_2$ we remark that $\|Ax\|_2 = (x^H A^H A x)^{1/2}$ and the result follows from (1.69).
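The formulae (1.76) and (1.77) can be checked numerically. The following is a minimal sketch in plain Python; the function names and the $2 \times 2$ test matrix are illustrative assumptions, not part of the text.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def vector_norm_1(x: Vector) -> float:
    """l_1-norm: sum of the moduli of the elements."""
    return sum(abs(xi) for xi in x)


def matrix_norm_1(a: Matrix) -> float:
    """Largest absolute column sum, as in (1.76)."""
    n_cols = len(a[0])
    return max(sum(abs(row[j]) for row in a) for j in range(n_cols))


def matrix_norm_inf(a: Matrix) -> float:
    """Largest absolute row sum, as in (1.77)."""
    return max(sum(abs(aij) for aij in row) for row in a)


def mat_vec(a: Matrix, x: Vector) -> List[float]:
    """Matrix-vector product a x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a: Matrix) -> List[List[float]]:
    return [list(col) for col in zip(*a)]


# Hypothetical test matrix; its second column has the largest absolute sum.
A = [[1.0, -2.0], [3.0, 4.0]]
# Unit vector selecting that column, as in the proof of (1.76).
x = [0.0, 1.0]

print(matrix_norm_1(A), matrix_norm_inf(A))
print(vector_norm_1(mat_vec(A, x)))  # attains the value of matrix_norm_1(A)
```

The vector `x` mirrors the choice made in the proof: it picks out the column whose absolute sum is largest, so the supremum in (1.75) is attained for the $l_1$-norm.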

If $A$ is Hermitian, (1.78) implies that
$$\|A\|_2 = \max_{1 \le i \le n} |\lambda_i|.$$
The spectral radius $\rho(A)$ of a matrix is defined by
$$\rho(A) = \max_{1 \le i \le n} |\lambda_i|.$$
Thus $\|A\|_2 = \{\rho(A^H A)\}^{1/2}$ which simplifies, if $A$ is Hermitian, to $\|A\|_2 = \rho(A)$. In general, if $x$ is an eigenvector of $A$,
$$\|Ax\| = \|\lambda x\| = |\lambda|\,\|x\|$$
which demonstrates via (1.75) that $|\lambda| \le \|A\|$, i.e.
$$\rho(A) \le \|A\| \tag{1.79}$$
for any norm of $A$. However, if the norms are badly chosen the norm and spectral radius need not be close; for example

but the spectral radius is zero. In contrast, it can be shown that there is always some norm which is arbitrarily close to the spectral radius. A useful theorem is:

THEOREM 1.11. $\lim_{r \to \infty} A^r = 0$ if and only if $\rho(A) < 1$.

Proof. It is evident from the Jordan canonical form that $A^r$ can approach the zero matrix if and only if each of its eigenvalues tends to zero. But its eigenvalues are $\lambda_i^r$, which can vanish as $r \to \infty$ if and only if $|\lambda_i| < 1$, and the theorem is proved.

THEOREM 1.11a. If $\rho(A) < 1$ then $(I - A)^{-1}$ exists and
$$(I - A)^{-1} = \lim_{m \to \infty} \sum_{i=0}^{m} A^i.$$

Proof. Since $\rho(A) < 1$, $I - A$ has no zero eigenvalues and so possesses an inverse. Also $(I - A)(I + A + \cdots + A^{m-1}) = I - A^m$ and so $I + A + \cdots + A^{m-1} = (I - A)^{-1} - (I - A)^{-1}A^m$. The result now follows from Theorem 1.11 by letting $m \to \infty$.
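As an illustration of Theorem 1.11a, the partial sums $I + A + \cdots + A^m$ can be formed directly and compared with $(I - A)^{-1}$. The sketch below uses plain Python lists; the helper names and the particular matrix (with both eigenvalues equal to 0.5, so that $\rho(A) < 1$) are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neumann_partial_sum(a: Matrix, m: int) -> Matrix:
    """Return I + A + A^2 + ... + A^m."""
    total = identity(len(a))
    power = identity(len(a))
    for _ in range(m):
        power = mat_mul(power, a)   # next power of A
        total = mat_add(total, power)
    return total


# Upper triangular, eigenvalues 0.5 and 0.5, hence spectral radius 0.5.
A = [[0.5, 0.25], [0.0, 0.5]]
S = neumann_partial_sum(A, 200)

# For this A, I - A = [[0.5, -0.25], [0, 0.5]] whose inverse is [[2, 1], [0, 2]].
print(S)
```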

It will be remarked that a sufficient condition for the validity of Theorems 1.11 and 1.11a is that $\|A\| < 1$, on account of (1.79). A matrix $A$ is called strictly diagonal dominant if
$$\sideset{}{'}\sum_{j=1}^{n} |a_{ij}| < |a_{ii}| \qquad (i = 1, \ldots, n)$$
where the prime on $\sum$ means omit the term $j = i$. The importance of this concept arises from:

THEOREM 1.11b. If $A$ is strictly diagonal dominant then $A^{-1}$ exists.

Proof. Suppose there is $x \ne 0$ such that $Ax = 0$. Let $|x_m| = \max_{1 \le i \le n} |x_i|$. Then, from
$$\sum_{j=1}^{n} a_{mj}x_j = 0,$$
we obtain
$$|a_{mm}|\,|x_m| = \Bigl|\sum_{j \ne m} a_{mj}x_j\Bigr| \le |x_m| \sum_{j \ne m} |a_{mj}|$$

which contradicts the condition of strict diagonal dominance. We can now prove:

THEOREM 1.11c (GERSCHGORIN CIRCLE THEOREM). Every eigenvalue of $A$ lies in one of the complex domains
$$|z - a_{ii}| \le \sideset{}{'}\sum_{j=1}^{n} |a_{ij}| \qquad (i = 1, 2, \ldots, n).$$

Proof. Let $\lambda$ be any eigenvalue which does not lie in one of these domains. Then
$$|\lambda - a_{ii}| > \sideset{}{'}\sum_{j=1}^{n} |a_{ij}| \qquad (i = 1, \ldots, n).$$
Hence $A - \lambda I$ is strictly diagonal dominant and hence, by Theorem 1.11b, has an inverse. But this is impossible because $\lambda$ is an eigenvalue and the theorem is proved. One consequence of Gerschgorin's theorem is that
$$\rho(A) \le \min\Bigl(\max_{i} \sum_{j} |a_{ij}|,\ \max_{j} \sum_{i} |a_{ij}|\Bigr).$$

Gerschgorin's theorem and (1.69) provide rules for locating the positions of eigenvalues. While they may not always be very precise they do at any rate limit the possibilities. A type of matrix often encountered with difference equations is a Stieltjes matrix which is a real positive definite matrix with all its off-diagonal elements non-positive.
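Gerschgorin's theorem is simple to apply in practice. The sketch below lists the discs for a matrix and checks that known eigenvalues fall inside them. The function names are hypothetical, and the test matrix is assumed to be the tridiagonal matrix with 2 on the diagonal and $-1$ on the adjacent diagonals, whose eigenvalues $2 + 2\cos\{j\pi/(n+1)\}$ are those quoted in Exercise 57 with $a = 2$, $b = c = -1$.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def gerschgorin_discs(a: Matrix) -> List[Tuple[float, float]]:
    """Return (centre, radius) for each row: centre a_ii, radius sum of |a_ij|, j != i."""
    discs = []
    for i, row in enumerate(a):
        radius = sum(abs(aij) for j, aij in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


def in_some_disc(z: complex, discs: List[Tuple[float, float]],
                 slack: float = 1e-12) -> bool:
    """True if z lies in at least one disc (slack absorbs rounding)."""
    return any(abs(z - centre) <= radius + slack for centre, radius in discs)


def spectral_radius_bound(a: Matrix) -> float:
    """min(largest absolute row sum, largest absolute column sum)."""
    n = len(a)
    max_row = max(sum(abs(v) for v in row) for row in a)
    max_col = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    return min(max_row, max_col)


n = 4
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
      for j in range(n)] for i in range(n)]
eigenvalues = [2.0 + 2.0 * math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]

discs = gerschgorin_discs(A)
print(discs)
print(all(in_some_disc(lam, discs) for lam in eigenvalues))
print(spectral_radius_bound(A))
```

Here every disc is centred at 2 with radius 1 or 2, so the theorem confines the eigenvalues to the interval [0, 4] without any eigenvalue computation.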

Exercises
50. Prove that $\|x\|_1 \le n\|x\|_\infty$ and $\|x\|_2 \le \sqrt{n}\,\|x\|_\infty$.
51. If $U$ is unitary prove that (i) $\|Ux\|_2 = \|x\|_2$, (ii) $\|UA\|_2 = \|A\|_2$, (iii) $\|UAU^H\|_2 = \|A\|_2$.

52. If $a$ is real and
$$A = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}$$
show that
$$\|A^r\|_2 = a^r\Bigl[1 + \frac{r^2}{2a^2}\Bigl\{1 + \Bigl(1 + \frac{4a^2}{r^2}\Bigr)^{1/2}\Bigr\}\Bigr]^{1/2}.$$

53. Prove that $\max|a_{ij}| \le \|A\|_2 \le n \max|a_{ij}|$, the maximum being taken over all $i$ and $j$.
54. Prove $\|A\|_2^2 \le \|A\|_1\|A\|_\infty$.
55. If $a$ is real show that the spectral radius and spectral norm of

:

(~ :)

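are equal only if $a = 0$.
56. Show that (i) $[\rho\{(A^H A)^{-1}\}]^{-1} \le |\lambda_i|^2 \le \rho(A^H A)$, (ii) $\sum_{i=1}^{n} |\lambda_i|^2 \le \sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij}|^2$.
57. For the tridiagonal matrix
$$\begin{pmatrix} a & b & 0 & \cdots & 0 & 0 \\ c & a & b & \cdots & 0 & 0 \\ 0 & c & a & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & \cdots & c & a & b \\ 0 & 0 & \cdots & 0 & c & a \end{pmatrix}$$
show that
$$\lambda_j = a + 2(bc)^{1/2}\cos\{j\pi/(n + 1)\}.$$
Check that this satisfies Gerschgorin's theorem.
58. In the tridiagonal matrix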

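$$\begin{pmatrix} a_1 & d_1 & & & 0 \\ d_1 & a_2 & d_2 & & \\ & \ddots & \ddots & \ddots & \\ & & d_{n-2} & a_{n-1} & d_{n-1} \\ 0 & & & d_{n-1} & a_n \end{pmatrix}$$
$d_i = (b_ic_i)^{1/2}$ where $b_ic_i > 0$, $a_i \ge |b_i| + |c_{i-1}|$ $(i = 2, \ldots, n - 1)$, $a_1 > |b_1|$ and $a_n > |c_{n-1}|$. Prove that it is positive definite.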

LINEAR EQUATIONS

1.12 Linear equations-direct methods

The solution of the system of linear equations $Ax = b$ where $A$ is non-singular is simple in principle. In fact, the solution can be written as $x = A^{-1}b$. When $A^{-1}$ can be calculated easily by analytical means this is often satisfactory. However, for numerical work, it must be recognized that simple analytical formulae for $A^{-1}$ usually involve the ratio of determinants and the computation of determinants of order 4 or higher is a complex task. Therefore, if we are to realize efficient numerical methods we must seek other ways of finding a solution.


In practice, the systems of equations which arise are frequently of two types:

(i) the matrix $A$ may be of moderate order, say $n < 100$, and dense, i.e. nearly all its elements are non-zero;
(ii) $A$ may be of large order, say $n > 1000$, and sparse, i.e. it contains a large number of zero elements.

The type of matrix is of prime importance in deciding on a method of solution. For dense matrices, direct methods are appropriate and will be described in this section. Sparse matrices should be treated by iterative techniques (see next section) and it may be possible to economize in computer storage by retaining only non-zero elements.

Perhaps the most popular direct method for the numerical solution of a system of linear equations is Gaussian elimination. It has two parts: an elimination or triangularization procedure and back substitution. Its principle is simple and can be illustrated by the problem of finding the unknowns $x_1$, $x_2$, and $x_3$ in
$$\begin{aligned} x_1 + x_2 + 4x_3 &= 7, \\ x_1 - 2x_2 + 6x_3 &= 15, \\ 2x_1 + x_2 - x_3 &= 7. \end{aligned}$$

Subtract the first equation from the second; then subtract twice the first from the third. There results
$$\begin{aligned} x_1 + x_2 + 4x_3 &= 7, \\ -3x_2 + 2x_3 &= 8, \\ -x_2 - 9x_3 &= -7. \end{aligned}$$

The first equation, which was used to remove $x_1$ from the other two equations, is known as the pivot equation and the coefficient of $x_1$ in it is called the pivot. Now, we make the new second equation the pivot equation and use it to eliminate $x_2$ from the third. Thus, by subtracting $\tfrac{1}{3}$ of the second from the last, we reach
$$\begin{aligned} x_1 + x_2 + 4x_3 &= 7, \\ -3x_2 + 2x_3 &= 8, \\ -29x_3/3 &= -29/3. \end{aligned}$$

If these were written in matrix form, the matrix on the left would be upper triangular, which explains why the process is sometimes called triangularization. Clearly, if we reversed our steps we should recover the original equations so the two systems are equivalent. The final step of back substitution is now undertaken. From the last equation $x_3 = 1$. Substituting this value in the second we obtain $x_2 = -2$. Then the first equation gives $x_1 = 5$.

The method can obviously be generalized to a system of $n$ equations in $n$ unknowns and is very easy to program. A simple count of the operations involved reveals that about $\tfrac{1}{3}n^3$ multiplications and additions are required to solve a system. By taking $b$ as a unit vector and by employing the special properties of unit vectors we find that the inverse of $A$ can be found in $n^3$ (and not $\tfrac{1}{3}n^4$ as might be expected) multiplications and additions. If $m_n$ multiplications are required for a determinant of order $n$ and additions are ignored, expansion in co-factors gives $m_{n+1} = (n + 1)m_n$ whence $m_n$ is about $n!$. Thus Gaussian elimination gives a dramatic improvement over Cramer's rule. Even if more sophisticated methods of evaluating determinants are adopted this statement remains true (see, for example, Kunz (1957)).

It may happen that during the elimination the normal pivot is zero. In that case two equations are interchanged so that the pivot is non-zero (there must be at least one equation with non-zero pivot so long as $A$ is non-singular). If, however, the pivot is non-zero but very small compared with other coefficients in its column, numerical instability can arise. This is caused by the fact that computers have a finite word length. As an example suppose that the computer can store only three significant digits in floating point and is working in the base of 10. Let the equations be (Forsythe and Moler (1967))
$$\begin{aligned} 1.00 \times 10^{-4}x_1 + 1.00x_2 &= 1.00, \\ 1.00x_1 + 1.00x_2 &= 2.00. \end{aligned}$$

Gaussian elimination, taking account of the limitation of word length, supplies
$$\begin{aligned} 1.00 \times 10^{-4}x_1 + 1.00x_2 &= 1.00, \\ -1.00 \times 10^{4}x_2 &= -1.00 \times 10^{4}. \end{aligned}$$

From back substitution, $x_2 = 1.00$ and $x_1 = 0.00$, which is obviously incorrect. By reversing the order of the original equations and performing Gaussian elimination we obtain $x_2 = 1.00$ and $x_1 = 1.00$, which is acceptable. The general rule therefore, in eliminating $x_i$ from some equations, is to select as the pivot the coefficient of $x_i$ which has the largest magnitude; this is termed partial pivoting. If, however, the element of largest magnitude in both rows and columns is chosen as pivot the process is known as complete pivoting. According to Wilkinson (1965) and Ralston and Wilf (1967) it is doubtful whether complete pivoting warrants the additional complication and computer time. Wilkinson also shows that partial pivoting, although sufficient, is not always necessary for numerical stability; it can be dispensed with, e.g., if $A$ is a real symmetric positive definite matrix, or if $A$ is strictly diagonal dominant.

Although pivotal strategy can control the difficulty of large multipliers in Gaussian elimination it still leaves open the possibility that the solution is very sensitive to small changes in the coefficients, i.e. the system is ill-conditioned. Suppose that $A$ (which is non-singular) is perturbed to $A + B$ and $b$ to $b + c$.
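Before the effect of such perturbations is examined, the elimination and pivoting rules described above can be sketched in code. The routine below is a minimal illustration, not a production solver; the name `gaussian_solve` and its `pivoting` flag are assumptions for the example. Exact rational arithmetic (`fractions.Fraction`) reproduces the three-equation example, and Python's `decimal` module with three significant digits is used here as a stand-in for the three-digit machine of the Forsythe and Moler example.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def gaussian_solve(a, b, pivoting=True):
    """Solve a x = b by elimination followed by back substitution.

    Works for any numeric type supporting +, -, *, / and abs().
    With pivoting=False rows are interchanged only when the pivot is zero.
    A singular matrix leads to an exception rather than a solution.
    """
    n = len(b)
    a = [list(row) for row in a]   # work on copies
    b = list(b)
    for k in range(n - 1):
        if pivoting:
            # partial pivoting: largest magnitude in column k
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
        else:
            # first non-zero entry in column k
            p = next(i for i in range(k, n) if a[i][k] != 0)
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]              # multiplier
            for j in range(k, n):
                a[i][j] = a[i][j] - m * a[k][j]
            b[i] = b[i] - m * b[k]
    x = [None] * n
    for i in range(n - 1, -1, -1):             # back substitution
        s = b[i]
        for j in range(i + 1, n):
            s = s - a[i][j] * x[j]
        x[i] = s / a[i][i]
    return x


# The three-equation example of the text, in exact arithmetic.
A3 = [[Fraction(1), Fraction(1), Fraction(4)],
      [Fraction(1), Fraction(-2), Fraction(6)],
      [Fraction(2), Fraction(1), Fraction(-1)]]
b3 = [Fraction(7), Fraction(15), Fraction(7)]
exact = gaussian_solve(A3, b3, pivoting=False)

# The two-equation example with every operation rounded to three digits.
with localcontext() as ctx:
    ctx.prec = 3
    A2 = [[Decimal("1.00E-4"), Decimal("1.00")],
          [Decimal("1.00"), Decimal("1.00")]]
    b2 = [Decimal("1.00"), Decimal("2.00")]
    naive = gaussian_solve(A2, b2, pivoting=False)
    pivoted = gaussian_solve(A2, b2, pivoting=True)

print(exact, naive, pivoted)
```

Without pivoting the three-digit computation returns $x_1 = 0$, $x_2 = 1$; with partial pivoting it returns $x_1 = 1$, $x_2 = 1$, in line with the discussion above.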


The perturbations $B$ and $c$ cause a change in $x$, altering it to $x + y$ (say). Then
$$(A + B)(x + y) = b + c.$$

A bound for $y$ is provided by:

THEOREM 1.12. If $\|B\|\,\|A^{-1}\| < 1$ then
$$\|y\| \le C\,\|A^{-1}\|\,(\|c\| + \|B\|\,\|x\|)$$
and
$$\frac{\|y\|}{\|x\|} \le C\kappa\Bigl(\frac{\|c\|}{\|b\|} + \frac{\|B\|}{\|A\|}\Bigr)$$
where $\kappa = \|A\|\,\|A^{-1}\|$ and $C = \|I\| + \|A^{-1}B\|/(1 - \|A^{-1}B\|)$.

Proof. Since $A$ is non-singular
$$(I + A^{-1}B)y = A^{-1}(c - Bx).$$
From (1.79) and Theorem 1.11a, $\|A^{-1}B\| < 1$ implies that $I + A^{-1}B$ has an inverse. Consequently, $y = (I + A^{-1}B)^{-1}A^{-1}(c - Bx)$ whence
$$\|y\| \le \|(I + A^{-1}B)^{-1}\|\,\|A^{-1}\|\,(\|c\| + \|B\|\,\|x\|).$$
Because of the expansion in Theorem 1.11a,
$$\|(I + A^{-1}B)^{-1}\| \le \|I\| + \|A^{-1}B\|(1 - \|A^{-1}B\|)^{-1};$$
also $\|b\| \le \|A\|\,\|x\|$ and the result stated in the theorem follows.

If $\kappa$ is small then small changes in $b$ and $A$ will produce only small changes in $x$ and the equations can be regarded as well-conditioned. However, large $\kappa$ does not necessarily mean that the system is ill-conditioned because only upper bounds occur in Theorem 1.12. Nevertheless, we cannot improve those bounds, when $B = 0$ at any rate, since examples are known (see, for example, Forsythe and Moler (1967)) in which equality is achieved in Theorem 1.12.

We call $\kappa$ the condition number of the system. Its precise value depends upon the choice of norm. If the spectral norm $\|\cdot\|_2$ is selected then, from (1.78), $\kappa = (\mu_1/\mu_n)^{1/2}$ where $\mu_1$ and $\mu_n$ are the largest and smallest eigenvalues of $A^H A$; this $\kappa$ is sometimes described as the spectral condition number.

The condition number can be altered by scaling, i.e. by multiplying each equation in $Ax = b$ by some integer power of 10, in the decimal system, though the same power need not be used for each row. If $\kappa$ is large, whatever scale factors are employed, the equations are ill-conditioned. A small value of $\kappa$ indicates a well-conditioned system. It is desirable to have available a systematic technique for scaling that ensures that $\kappa$ is as small as possible. Unfortunately, no method is known which applies to arbitrary matrices and arbitrary norms. One practical method is to attempt to arrange that $n$ elements of $A$ are of order unity, no two of these elements being in the same row or column, all other elements of $A$ being less than unity in magnitude. Round-off error may be reduced by using the power of the machine number base closest to the largest element.

It should be remarked that a small value of $\det A$ does not necessarily signify ill-conditioning. Examples are available in which $\det A$ is small and $\|y\|_2/\|x\|_2 = \|c\|_2/\|b\|_2$. If, by scaling, it is arranged that $\|A\|_2 = 1$ then $\kappa = 1/\mu_n^{1/2}$ and the spectral condition number is large if and only if $\mu_n$ is small. Let $\det A \to 0$; then $\mu_n \to 0$ provided that $\mu_1$ is fixed. In other words, if $A$ is normalized so as to keep $\mu_1$ fixed then $\det A$ is closely related to the condition of $A$. But, in general, the largeness of the condition number is more significant than the smallness of $\det A$ as a criterion for determining ill-conditioning.

Gaussian elimination is applicable to complex equations if the computer has a facility for complex arithmetic. Otherwise the equations must be separated into their real and imaginary parts and the resulting real equations solved.

There are other inversion algorithms. A popular one is triangular decomposition in which the aim is to write $A$ in the form
$$A = LU \tag{1.80}$$

where $L$ is a lower triangular matrix (i.e. all elements above the diagonal are zero) and $U$ is an upper triangular matrix. If this can be done in such a way that $L$ and $U$ are non-singular then, by putting $Ux = y$, we have to solve the two systems
$$Ly = b, \qquad Ux = y.$$
Since both systems are triangular the first can be solved for $y$ by forward substitution and then $x$ can be determined from the second by back substitution. The effort involved in the substitutions is substantially less than that in the triangular decomposition. Conditions which permit (1.80) are contained in:

THEOREM 1.12a. Let $A_k$ be the matrix formed by the first $k$ rows and columns of $A$. If $A_1, A_2, \ldots, A_{n-1}, A$ are all non-singular, $A = LU$ and the decomposition is unique if the diagonal elements of either $L$ or $U$ are specified.

Proof. Only the situation in which all the diagonal elements of $L$ are chosen to be unity will be considered, the general case being left to the reader. With
$$L = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ l_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{pmatrix}$$
we require that
$$A = LU.$$
To get the first row of $A$ right we need
$$u_{1j} = a_{1j} \qquad (j = 1, \ldots, n).$$
The first column of $A$ will be given correctly if
$$l_{i1} = a_{i1}/u_{11} \qquad (i = 1, \ldots, n)$$
which is possible since $u_{11} = a_{11} \ne 0$ by assumption. Now the second row of $A$ is obtained by taking
$$u_{2j} = a_{2j} - l_{21}u_{1j} \qquad (j = 2, \ldots, n)$$
and then the second column may be realized by
$$l_{i2} = (a_{i2} - l_{i1}u_{12})/u_{22} \qquad (i = 2, \ldots, n).$$
The last formula is legitimate provided that $u_{22} \ne 0$, i.e. $a_{22} - a_{21}a_{12}/a_{11} \ne 0$, which is true since $\det A_2 \ne 0$. Proceeding in this way, a row and a column at a time, we construct $L$ and $U$ and the construction obviously leads to unique elements. Remark that, since $A$ is non-singular, neither $L$ nor $U$ can be singular.

If $A$ is real, symmetric, and positive definite we can show by the same procedure that there is a real lower triangular matrix $L$ such that
$$A = LL^T.$$

This is known as Cholesky decomposition. The algorithm, in this case, is highly stable. If $A$ is Hermitian positive definite then $A = LL^H$ where the diagonal elements of $L$ are positive.

Finally, we observe that it is sometimes possible to improve the accuracy of a computed solution to a system of linear equations by iteration. If $x^{(1)}$ is the first approximation, calculate the residual
$$r^{(1)} = b - Ax^{(1)}$$
as accurately as possible. Then solve the system $Ay = r^{(1)}$ and take $x^{(2)} = x^{(1)} + y$. Clearly this is the first stage of an iterative procedure which, under suitable circumstances, will lead to more accurate numerical values.
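The row-and-column construction used in the proof of Theorem 1.12a translates directly into a short routine. The following is a sketch under the assumptions of that theorem (unit diagonal in $L$, no pivoting); the function names are hypothetical, and exact rational arithmetic is used only so that the factors of the three-equation example of this section can be displayed without rounding.

```python
from fractions import Fraction


def lu_decompose(a):
    """Return (L, U) with unit diagonal in L and a = L U, without pivoting.

    Raises ZeroDivisionError if a leading submatrix A_k (k < n) is singular.
    """
    n = len(a)
    lower = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    upper = [[0] * n for _ in range(n)]
    for k in range(n):
        # row k of U
        for j in range(k, n):
            upper[k][j] = a[k][j] - sum(lower[k][s] * upper[s][j] for s in range(k))
        if k + 1 < n and upper[k][k] == 0:
            raise ZeroDivisionError("singular leading submatrix")
        # column k of L
        for i in range(k + 1, n):
            lower[i][k] = (a[i][k]
                           - sum(lower[i][s] * upper[s][k] for s in range(k))) / upper[k][k]
    return lower, upper


def forward_substitute(lower, b):
    """Solve L y = b for unit lower triangular L."""
    y = []
    for i in range(len(b)):
        y.append(b[i] - sum(lower[i][j] * y[j] for j in range(i)))
    return y


def back_substitute(upper, y):
    """Solve U x = y for upper triangular U."""
    n = len(y)
    x = [0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(upper[i][j] * x[j] for j in range(i + 1, n))) / upper[i][i]
    return x


A = [[Fraction(1), Fraction(1), Fraction(4)],
     [Fraction(1), Fraction(-2), Fraction(6)],
     [Fraction(2), Fraction(1), Fraction(-1)]]
b = [Fraction(7), Fraction(15), Fraction(7)]

L, U = lu_decompose(A)
x = back_substitute(U, forward_substitute(L, b))
print(L, U, x)
```

The multipliers 1, 2 and 1/3 of the elimination appear below the diagonal of `L`, and `U` is the upper triangular system reached by elimination. Once the factors are stored, further right-hand sides (including the residuals of the refinement step) cost only the two substitutions.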


Exercises 59. Is the system with

10- 4 A

=(

0.1

-2 x 100.1 1.0

0.2 -10- 4

4 )

-10- 4

0.2

badly scaled? 60. Find the triangular decomposition of

2 4

(:

-2 4 0

61. Suppose there are two triangular decompositions
$$A = L_1U_1 = L_2U_2$$
with $L_1$ and $L_2$ having units on the diagonal. If $A$ is non-singular prove, without using Theorem 1.12a, that $U_1$ and $U_2$ are not singular. Deduce from $L_1^{-1}L_2 = U_1U_2^{-1}$ that the decomposition is, in fact, unique.

1.13 Iterative methods

Iterative methods, which are appropriate for large sparse matrices, are based upon the idea of starting with an initial guess $x^{(0)}$ to the solution of $Ax = b$ and then deriving a sequence $x^{(1)}, x^{(2)}, \ldots$ which converges to the exact solution. All of the methods are based on rewriting $A$ in the form
$$A = L + D + U$$
where $D$ is a diagonal matrix with diagonal elements the same as those of $A$, $L$ is lower triangular with zeros on the diagonal, and $U$ is upper triangular with zeros on the diagonal. For instance, if we express $Ax = b$ as
$$Dx = b - (L + U)x$$
this suggests the iterative scheme
$$x^{(r+1)} = D^{-1}b - D^{-1}(L + U)x^{(r)}$$
which is known as the Jacobi method. A necessary condition for the application of this method is that all the diagonal elements of $A$ are non-zero. Alternatively, the form $(L + D)x = b - Ux$ suggests the iteration
$$x^{(r+1)} = (L + D)^{-1}b - (L + D)^{-1}Ux^{(r)}$$


which is the Gauss-Seidel method. Again, by introducing the non-zero scalar parameter $\omega$ and writing
$$(D + \omega L)x = \{-\omega U + (1 - \omega)D\}x + \omega b,$$
we derive the procedure
$$x^{(r+1)} = (D + \omega L)^{-1}\{-\omega U + (1 - \omega)D\}x^{(r)} + (D + \omega L)^{-1}\omega b.$$
This is the method of successive over-relaxation (SOR). It reduces to the Gauss-Seidel method if $\omega = 1$. Of course, one does not in practice calculate the inverse matrices on the right-hand sides in the iterative schemes but, instead, solves the linear system which arises before the application of the inverse matrix. One advantage is evident in that zero elements of the matrix need not be stored and successive vectors can be overwritten on their predecessors so that considerable economy of computer storage can be achieved.

The iterations are all examples of taking an equation
$$x = Bx + c$$
and replacing it by
$$x^{(r+1)} = Bx^{(r)} + c.$$

Let $e^{(r)} = x^{(r)} - x$ be the error at a particular stage. Then, by subtraction, we see that $e^{(r+1)} = Be^{(r)}$ whence
$$e^{(r)} = B^r e^{(0)}.$$
Now $x^{(r)}$ converges to $x$ if and only if $e^{(r)}$ approaches zero, which can happen for every choice of $e^{(0)}$ if and only if $B^r \to 0$. From Theorem 1.11 we deduce:

THEOREM 1.13. The iterative scheme converges to the correct solution if and only if $\rho(B) < 1$.

Thus the convergence of the three schemes turns upon the spectral radii of the relevant matrices, i.e. of $D^{-1}(L + U)$, $(L + D)^{-1}U$ and $\mathcal{L}_\omega = (D + \omega L)^{-1}\{-\omega U + (1 - \omega)D\}$. The smaller the spectral radius the more rapid the convergence. One aim of successive over-relaxation is to choose $\omega$ so that the spectral radius of $\mathcal{L}_\omega$ is as small as possible. However, we are limited in our choice by:

THEOREM 1.13a. If $\rho(\mathcal{L}_\omega) < 1$ then $0 < \omega < 2$.

Proof. If $\lambda$ is an eigenvalue of $\mathcal{L}_\omega$, $\det(\mathcal{L}_\omega - \lambda I) = 0$. Now, in the polynomial equation for $\lambda$ which results, the coefficient of $\lambda^n$ is $\pm 1$ and the constant term is $\det \mathcal{L}_\omega$ which, from the structure of $D$, $L$, and $U$, is $(\det D)^{-1}(1 - \omega)^n \det D$. Thus the product of the roots of the characteristic equation is $\pm(1 - \omega)^n$. Hence, at least one of the eigenvalues must have a modulus as great as $|1 - \omega|$. Since $\rho(\mathcal{L}_\omega) < 1$ this implies that $|1 - \omega| < 1$ and the theorem is proved.
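The three schemes of this section can be written as component-wise sweeps, so that no inverse matrix is ever formed. The sketch below is illustrative only: the function names, the stopping rule based on the change between successive iterates, and the small strictly diagonal dominant test system (with solution $(1, 2, 3)$) are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Vector = Sequence[float]


def jacobi_step(a: Matrix, b: Vector, x: Vector) -> List[float]:
    """One Jacobi sweep: every component uses only the old iterate."""
    n = len(b)
    return [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)]


def sor_step(a: Matrix, b: Vector, x: Vector, omega: float = 1.0) -> List[float]:
    """One SOR sweep; omega = 1 gives the Gauss-Seidel method."""
    n = len(b)
    new = list(x)
    for i in range(n):
        # components j < i are already updated, components j > i are still old
        s = sum(a[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (1.0 - omega) * new[i] + omega * (b[i] - s) / a[i][i]
    return new


def iterate(step: Callable[[Matrix, Vector, Vector], List[float]],
            a: Matrix, b: Vector, tol: float = 1e-12,
            max_iter: int = 10_000) -> Tuple[List[float], int]:
    """Repeat `step` from x = 0 until successive iterates differ by less than tol."""
    x = [0.0] * len(b)
    for count in range(1, max_iter + 1):
        new = step(a, b, x)
        change = max(abs(p - q) for p, q in zip(new, x))
        x = new
        if change < tol:
            return x, count
    raise RuntimeError("iteration did not converge")


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

x_jacobi, n_jacobi = iterate(jacobi_step, A, b)
x_gs, n_gs = iterate(sor_step, A, b)
x_sor, n_sor = iterate(lambda a, b, x: sor_step(a, b, x, omega=1.1), A, b)
print(n_jacobi, n_gs, n_sor)
```

In `sor_step` the new values overwrite the old ones as soon as they are available, which is the storage economy mentioned above. For this tridiagonal system the Gauss-Seidel sweep needs fewer iterations than the Jacobi sweep.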


As a consequence of Theorem 1.13a there is no point in considering values of $\omega$ outside the interval (0, 2) and, in practice, it is normal to choose $\omega$ so that $1 < \omega < 2$. In general, the determination of the optimal value of $\omega$ is extremely complicated (see, for example, Mitchell (1969)). It can be shown that the Gauss-Seidel method converges if $A$ is a real positive definite matrix, though the Jacobi method may not, and other conditions for convergence are known (see, for example, Varga (1962)). In general the convergence of the Gauss-Seidel method is faster than that of the Jacobi method and SOR is usually appreciably better than either (see subsequent exercises).

Another iterative scheme which is often employed is the Peaceman-Rachford method. In this method we write $A = A_1 + A_2 + A_3$ where $A_1$, $A_2$, and $A_3$ have certain properties which will not be elaborated here. An intermediate iteration is inserted so that $x^{(r)} \to y^{(r)} \to x^{(r+1)}$ by the process
$$\begin{aligned} (A_1 + \omega_1A_2 + \omega_2I)y^{(r)} &= b - \{A_3 + (1 - \omega_1)A_2 - \omega_2I\}x^{(r)}, \\ (A_3 + \omega_3A_2 + \omega_4I)x^{(r+1)} &= b - \{A_1 + (1 - \omega_3)A_2 - \omega_4I\}y^{(r)}. \end{aligned}$$

The analysis of this scheme is highly complex but it does seem to be a profitable method in connection with difference equations (see, for example, Mitchell (1969)). Particular methods for sparse matrices have been the subject of considerable research in recent years (see, for example, Duff (1976)).

Exercises 62. Find the spectral radii of the Jacobi and Gauss-Seidel methods for

A

= (-:

~:

-;)

and show that both methods converge.
63. If $A$ is symmetric show that an eigenvalue $\lambda$ and the associated eigenvector $u$ (possibly complex) of the Gauss-Seidel method satisfy
$$\{(u, Du) + (u, Lu)\}\lambda = (u, L^Tu) = (u, Lu)^*.$$

By forming $|\lambda|^2$ deduce that, if $A$ is a real positive definite matrix, $|\lambda| < 1$ and that the Gauss-Seidel method converges.
64. Let $A$ be a tridiagonal matrix. Let $\lambda$, $u$ be an eigenvalue and associated eigenvector of the matrix of the Jacobi method so $(L + U)u = \lambda Du$. By applying $(L + U)D^{-1}$ show that $\lambda^2$ is a diagonal element of $(UD^{-1}L + LD^{-1}U)D^{-1}$. If $\mu$ is an eigenvalue of the SOR method use this procedure to demonstrate that
$$(\mu + \omega - 1)^2 = \mu\omega^2\lambda^2.$$
Deduce that the Jacobi and Gauss-Seidel methods either both converge or both diverge for a tridiagonal matrix and that the convergence of the Gauss-Seidel is faster. If all $\lambda$ are real and $\lambda_1$ $(< 1)$ is the largest show that the optimal choice for $\omega$ is $2\{1 + (1 - \lambda_1^2)^{1/2}\}^{-1}$ and determine the corresponding value of $\mu_1$. If, say, $\lambda_1 = 0.995$ then $\mu_1 \approx 0.8$ and the convergence of SOR is much more rapid than that of Gauss-Seidel.

1.14 Matrix eigenvalues

The matrix eigenvalue problem occurs in many applications and is concerned with solving $(A - \lambda I)x = 0$, i.e. with determining the eigenvalues $\lambda$ and associated eigenvectors $x$ of $A$. The eigenvalues must satisfy $\det(A - \lambda I) = 0$ and it is tempting to try expanding this as a polynomial in $\lambda$, whose roots can be found by one of the methods of §1.8. Apart from the difficulty of calculating the coefficients there is a classic example due to Wilkinson (1965) which demonstrates why this should not be done. The polynomial
$$(x - 1)(x - 2)\cdots(x - 20) - 2^{-23}x^{19}$$
is a slight perturbation from a polynomial with zeros at 1, 2, ..., 20. But only ten of its zeros are real and two of the complex ones have imaginary parts of about 2.8 in magnitude. Since round-off error can easily introduce perturbations in the coefficients of a polynomial the characteristic polynomial is never used for the computation of eigenvalues. Indeed, it may be wiser to calculate the zeros of a polynomial by solving an associated eigenvalue problem.

To begin with we discuss an iterative scheme known as the power method.

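THEOREM 1.14. Let $A$ have $n$ linearly independent eigenvectors $x_i$ and the corresponding eigenvalues $\lambda_i$ satisfy
$$|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|.$$
Then the sequence $u_{r+1} = Au_r$ is such that
$$\lim_{r \to \infty} (u_{r+1})_j/(u_r)_j = \lambda_1$$
where $(u_r)_j$ denotes the $j$th element of $u_r$.

Proof. Since the $x_i$ are linearly independent, there are constants $\alpha_i$ such that
$$u_0 = \sum_{i=1}^{n} \alpha_ix_i.$$
Hence
$$u_r = A^ru_0 = \sum_{i=1}^{n} \alpha_i\lambda_i^rx_i$$
and
$$\frac{(u_{r+1})_j}{(u_r)_j} = \lambda_1\,\frac{\alpha_1(x_1)_j + \alpha_2(\lambda_2/\lambda_1)^{r+1}(x_2)_j + \cdots}{\alpha_1(x_1)_j + \alpha_2(\lambda_2/\lambda_1)^{r}(x_2)_j + \cdots}.$$
By hypothesis, $(\lambda_i/\lambda_1)^r \to 0$ as $r \to \infty$ if $i \ne 1$ and the theorem is proved.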
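The iteration of Theorem 1.14 can be sketched in a few lines. In the version below each iterate is rescaled by its element of largest modulus, a common safeguard against overflow that does not alter the ratio on which the theorem rests. The function name, tolerance, starting vector and the $2 \times 2$ symmetric test matrix (eigenvalues 3 and 1) are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def mat_vec(a: Matrix, x: Sequence[float]) -> List[float]:
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def power_method(a: Matrix, u0: Sequence[float], tol: float = 1e-12,
                 max_iter: int = 1000) -> Tuple[float, List[float]]:
    """Estimate the dominant eigenvalue and an eigenvector scaled to unit l_inf norm."""
    scale = max(abs(c) for c in u0)
    u = [c / scale for c in u0]
    lam = 0.0
    for _ in range(max_iter):
        v = mat_vec(a, u)
        # ratio of corresponding elements, taken where u is largest in modulus
        j = max(range(len(u)), key=lambda i: abs(u[i]))
        new_lam = v[j] / u[j]
        norm = max(abs(c) for c in v)
        if norm == 0.0:
            raise ValueError("iterate vanished; choose another starting vector")
        u = [c / norm for c in v]
        if abs(new_lam - lam) < tol:
            return new_lam, u
        lam = new_lam
    raise RuntimeError("power method did not converge")


A = [[2.0, 1.0], [1.0, 2.0]]
lam, vec = power_method(A, [1.0, 0.0])
print(lam, vec)
```

For this matrix the error in the estimate is reduced by roughly the factor $|\lambda_2/\lambda_1| = 1/3$ at each step, as the proof of the theorem suggests.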

Strictly, the proof of the theorem requires that $\alpha_1 \ne 0$, otherwise $\lambda_2$ is obtained as the limit rather than $\lambda_1$. However, round-off error is almost certain to introduce a small component of $x_1$ and the effect of this will be to direct the convergence towards $\lambda_1$. It is also common to normalize the iteration by putting $v_{r+1} = Au_r$ and then defining $u_{r+1} = v_{r+1}/\|v_{r+1}\|_\infty$. The effect of this is that $\|v_r\|_\infty \to |\lambda_1|$ as $r \to \infty$. Furthermore, $u_r \to x_1/\|x_1\|_\infty$ so that the associated eigenvector is supplied at the same time. The iteration fails if there are a number of unequal eigenvalues of the same modulus. There are ways of overcoming this difficulty but details will not be given here. (See, for example, Gourlay and Watson (1973).)

Aitken's method (§1.8) can be employed to accelerate convergence. Another technique is to work with $A - qI$ instead of $A$. The eigenvalues of the new matrix are $\lambda_i - q$ and, provided that $\lambda_1 - q$ is still dominant, it may be possible to choose $q$ so that $|(\lambda_2 - q)/(\lambda_1 - q)|$ is much smaller than $|\lambda_2/\lambda_1|$.

Another iterative plan is that of inverse iteration. In this procedure we form $w_{r+1} = (A - qI)^{-1}w_r$; actually we determine $w_{r+1}$ by solving the linear system
$$(A - qI)w_{r+1} = w_r.$$

By Theorem 1.14 inverse iteration provides an eigenvalue of $(A - qI)^{-1}$, i.e. one of $1/(\lambda_i - q)$. With the normalization described above the associated eigenvector is also obtained. Inverse iteration is capable, by judicious choice of $q$, of finding any eigenvalue or eigenvector of $A$. It also has a fast rate of convergence. It is one of the most powerful and accurate methods available.

While the above methods compute a single eigenvalue at a time there are others which aim for the complete eigensystem right from the beginning, usually at the price of requiring $A$ to be symmetric or Hermitian. Let $e_{ij}$ denote the $(i, j)$ element of the unit matrix. A plane rotation matrix $R_{ij}$ is a matrix derived from the unit matrix by replacing four elements according to the following scheme
$$\begin{pmatrix} e_{ii} & e_{ij} \\ e_{ji} & e_{jj} \end{pmatrix} \to \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
so that
$$R_{ij} = \begin{pmatrix} 1 & & & & \\ & \cos\theta & \cdots & -\sin\theta & \\ & \vdots & \ddots & \vdots & \\ & \sin\theta & \cdots & \cos\theta & \\ & & & & 1 \end{pmatrix},$$
the entries $\cos\theta$ and $\mp\sin\theta$ lying in rows and columns $i$ and $j$, all other diagonal elements being 1 and all other off-diagonal elements 0.

The nomenclature stems from the fact that the replacement represents a rotation of two-dimensional axes through an angle $\theta$. It is immediate that $R_{ij}$ is orthogonal, i.e. $R_{ij}R_{ij}^T = I$. Let $A$ be a real symmetric matrix and then put
$$B = R_{ij}^TAR_{ij}.$$
The elements of $B$ are the same as those of $A$ except for
$$\begin{aligned} b_{ik} = b_{ki} &= a_{ik}\cos\theta + a_{jk}\sin\theta, \qquad k \ne i, j, \\ b_{jk} = b_{kj} &= -a_{ik}\sin\theta + a_{jk}\cos\theta, \qquad k \ne i, j, \\ b_{ii} &= a_{ii}\cos^2\theta + 2a_{ij}\sin\theta\cos\theta + a_{jj}\sin^2\theta, \\ b_{ij} = b_{ji} &= a_{ij}\cos 2\theta + \tfrac{1}{2}(a_{jj} - a_{ii})\sin 2\theta, \\ b_{jj} &= a_{ii}\sin^2\theta - 2a_{ij}\sin\theta\cos\theta + a_{jj}\cos^2\theta. \end{aligned}$$
We now choose $\theta$ so that $b_{ij} = 0$. If $a_{jj} \ne a_{ii}$ we take $\theta$ so that $-\tfrac{1}{4}\pi < \theta < \tfrac{1}{4}\pi$ and $\tan 2\theta = 2a_{ij}/(a_{ii} - a_{jj})$. If $a_{jj} = a_{ii}$ and $a_{ij} \ne 0$ we select $\theta = \tfrac{1}{4}\pi(a_{ij}/|a_{ij}|)$; if $a_{ij} = 0$ no choice is necessary.

So far $i$ and $j$ are at our disposal. They are determined by searching the elements of $A$ above the diagonal and finding the element $a_{ij}$ of maximum modulus. Having fixed $\theta$ the resulting matrix $B$ is denoted by $A_1$. The largest off-diagonal element of $A_1$ is now reduced to zero by the same procedure of applying a plane rotation matrix. Denoting the new matrix by $A_2$ we note that the $(i, j)$ element which was reduced to zero in $A_1$ is no longer zero in $A_2$. However, it can be shown that if the procedure is repeated indefinitely, the limit of the sequence $A_r$ is a diagonal matrix with the eigenvalues of $A$ on its diagonal. This algorithm, which is known as the Jacobi method, is easy to program but it is not very efficient. For small matrices which can be held entirely in the fast store it may be appropriate because it is very reliable.

Elements annihilated by a plane rotation in the Jacobi method may be recreated at later stages. An algorithm to overcome this is provided by the Givens method. Instead of choosing $\theta$ so that $b_{ij} = 0$, pick some $k \ne i$ and ask that $b_{kj} = 0$, i.e. select
$$\tan\theta = a_{jk}/a_{ik}.$$

Suppose, in fact, $A_1 = R_1^TAR_1$ where $R_1$ is the plane rotation which annihilates $a_{13}$ (say, $i = 2$, $j = 3$, $k = 1$). Next form $A_2 = R_2^TA_1R_2$ where $R_2$ reduces the (1, 4) element of $A_1$ to zero with $i = 2$, $j = 4$, $k = 1$; the (1, 3) element which is zero in $A_1$ remains zero in $A_2$. By repeating the process we can make $(n - 2)$ elements in the first row zero; by starting from (2, 4) we can operate on the second row without affecting the first. In this way a real symmetric matrix can be transformed to tridiagonal form by plane rotations.

Denote the tridiagonal matrix by $B$ and let $b_{ii} = b_i$, $b_{i,i+1} = b_{i+1,i} = c_i$. Let $p_r(\lambda)$ be the determinant formed by the first $r$ rows and columns of $B - \lambda I$ and define $p_0(\lambda) = 1$. Then, it may easily be established that
$$\begin{aligned} p_0(\lambda) &= 1, \qquad p_1(\lambda) = b_1 - \lambda, \\ p_r(\lambda) &= (b_r - \lambda)p_{r-1}(\lambda) - c_{r-1}^2p_{r-2}(\lambda) \qquad (r = 2, 3, \ldots, n). \end{aligned}$$

The zeros of $p_{r-1}(\lambda)$ lie between those of $p_r(\lambda)$. Also, define the sequence $s_r(\lambda)$ by $s_r(\lambda) = s_{r-1}(\lambda) + 1$ if $p_r(\lambda)$ is zero or has the same sign as $p_{r-1}(\lambda)$ and by $s_r(\lambda) = s_{r-1}(\lambda)$ otherwise. Starting from $s_0(\lambda) = 0$, we generate $s_r(\lambda)$ as either zero or a positive integer. Define $s(\lambda) = s_n(\lambda)$; then the Sturm sequence property states that $s(\lambda)$ is equal to the number of eigenvalues of $B$ which are strictly greater than $\lambda$; in fact, this property holds if $B$ is symmetric instead of being tridiagonal.

The Sturm sequence property permits the location of an interval $(\lambda_1^{(0)}, \lambda_2^{(0)})$ in which only one eigenvalue $\lambda$ lies, i.e. for some $m < n$, $s(\lambda_2^{(0)}) = m$, $s(\lambda_1^{(0)}) = m + 1$. Let $\mu^{(0)} = \tfrac{1}{2}(\lambda_1^{(0)} + \lambda_2^{(0)})$; then $s(\mu^{(0)})$ is either $m$ or $m + 1$ and a smaller interval for $\lambda$ has been determined. This is akin to the method of bisection (§1.8) and is very efficient for finding the eigenvalues in a particular interval.

Once the eigenvalues have been calculated it is tempting to find the eigenvectors by solving $(B - \lambda_iI)x = 0$ by omitting the last equation of the system and solving the first $(n - 1)$ for $x_1, \ldots, x_{n-1}$ in terms of an arbitrary $x_n$. In general, this is catastrophically unstable and should never be undertaken. The preferred method is inverse iteration.

Another technique for the reduction of a real symmetric $A$ to tridiagonal form is the Givens-Householder method. Here, one considers matrices of the type
$$P = I - 2ww^T$$
where the vector $w$ satisfies $\|w\| = 1$ but is otherwise at our disposal. Observe firstly that $P$ is symmetric. Also
$$PP^T = I - 4ww^T + 4ww^Tww^T = I$$
so that $P$ is, in fact, orthogonal.


If $u$ and $v$ are real vectors such that $\|u\| = \|v\|$ put
$$w = (v - u)/\|v - u\|.$$
Then
$$\|v - u\|^2 = (v, v) - (v, u) - (u, v) + (u, u) = 2(u - v, u)$$
and
$$Pu = v.$$
Thus, with this choice of $w$, $P$ converts a real vector $u$ into another one $v$ of the same length. In particular, let $a_1$ be the first column of $A$. Let $b_1$ be a vector with element $b_{11}$ in the first row and zero elsewhere. Take
$$b_{11} = -a_{11}\|a_1\|/|a_{11}|$$
and then $\|b_1\| = \|a_1\|$ so that we may place $u = a_1$, $v = b_1$ above. Consequently, $PA$ is a matrix whose first column is zeros except the diagonal element which is $b_{11}$. An arbitrary $m \times n$ matrix can be transformed by this approach to the form $QU$ where $Q$ is orthogonal and $U$ upper triangular.

For our particular purposes the transformation is not quite suitable since we want $P^TAP$, rather than $PA$, to be simpler than $A$. However, it indicates the direction in which to go. When $A$ is real and symmetric partition it according to
$$A = \begin{pmatrix} a_{11} & \mathbf{a}^T \\ \mathbf{a} & A_1 \end{pmatrix}$$
and introduce
$$M = \begin{pmatrix} 1 & 0 \\ 0 & P \end{pmatrix}$$
where $P$ is an $(n - 1) \times (n - 1)$ matrix of Givens-Householder type. Then
$$M^TAM = \begin{pmatrix} a_{11} & \mathbf{a}^TP \\ P^T\mathbf{a} & P^TA_1P \end{pmatrix}$$
so that, if we choose $P$ so that the last $n - 2$ elements of $P^T\mathbf{a}$ are zero, the matrix $M^TAM$ will have $n - 2$ zeros on its first row and column. We remark that
$$M = I - 2\omega\omega^T$$
where $\omega = \begin{pmatrix} 0 \\ w \end{pmatrix}$ and $w$ is an $(n - 1)$ vector. This suggests that for the next step we try $M_1 = I - 2\omega_1\omega_1^T$ where

LINEAR EQUATIONS

65

and WI is an (n - 2) vector. There is no difficulty in checking that MIM T AMM1 has the same zeros as M T AM in the first row and column and we can choose WI so that there are n - 3 additional zeros in the second row and column. It is now obvious how we may create a tridiagonal matrix from a real symmetric A by the Givens-Householder process. The Givens-Householder method requires 3 multiplications as compared with ~n3 for the Givens. It is therefore more efficient and generally regarded as one of the best methods for finding the eigenvalues of a Hermitian matrix though the Givens method is sometimes valuable for sparse matrices. For non-symmetric matrices where the two methods reduce A to the more complicated Hessenberg form (which is the same as a tridiagonal matrix below the diagonal but may have non-zero elements anywhere above the diagonal) the situation is less clear. In the symmetric case both methods lead to a tridiagonal matrix and we have already described one way of dealing with eigenvalue problems by Sturm sequences. Another technique is based on triangular decomposition. Suppose

\[ A_1 = L_1 U_1 \]

where $L_1$ is lower triangular with units on the diagonal and $U_1$ is upper triangular (cf. Theorem 1.12a). Let $A_2 = U_1 L_1$. Then $A_2 = L_1^{-1} A_1 L_1$ and is similar to $A_1$. This suggests the iteration: given $A_s$, write it as $A_s = L_s U_s$ and then form $A_{s+1} = U_s L_s$. This is known as the LR algorithm (LR because Rutishauser, who introduced it, called the decomposition left, right instead of lower, upper). It is subject to many shortcomings but its introduction led to the QR algorithm, one of the most powerful devices for the matrix eigenvalue problem.

The QR algorithm is based on the fact that, for any non-singular $A$, there is a decomposition $A = QU$ in which $Q$ is unitary and $U$ is upper triangular. Moreover the decomposition is unique if we impose the condition that the diagonal elements of $U$ are positive. The iteration is now performed as: write $A_s = Q_s U_s$ and then define $A_{s+1} = U_s Q_s$. Since $A_{s+1} = Q_s^H A_s Q_s$, $A_{s+1}$ is unitarily similar to $A_s$ and therefore to $A$. In the QR algorithm the tridiagonal property is preserved, i.e. if $A$ is tridiagonal so are its iterates. In spite of the power of the algorithm it has been suggested that a matrix should be reduced to Hessenberg or tridiagonal form before the algorithm is applied. (See, for example, Ralston and Wilf (1967).)
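The iteration $A_s = Q_sU_s$, $A_{s+1} = U_sQ_s$ can be tried out on a small real symmetric tridiagonal matrix. The following is only a minimal sketch: the helper names (`matmul`, `qr_decompose`, `qr_iterate`) are illustrative, the factorization is done by classical Gram-Schmidt (an assumption made for brevity, not a method taken from the text), and no shifts or deflation are used, so it should not be read as a practical eigenvalue routine.

```python
import math

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def qr_decompose(a):
    """Classical Gram-Schmidt: return (q, u) with a = q u, q orthogonal and
    u upper triangular with positive diagonal (a must be non-singular)."""
    n = len(a)
    q_cols = []                                  # orthonormal columns of q
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col = [a[i][j] for i in range(n)]
        v = col[:]
        for k, qk in enumerate(q_cols):
            u[k][j] = sum(qk[i] * col[i] for i in range(n))
            v = [v[i] - u[k][j] * qk[i] for i in range(n)]
        u[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / u[j][j] for x in v])
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return q, u

def qr_iterate(a, steps):
    """Apply A_{s+1} = U_s Q_s repeatedly, where A_s = Q_s U_s."""
    for _ in range(steps):
        q, u = qr_decompose(a)
        a = matmul(u, q)
    return a

# Real symmetric tridiagonal test matrix; its eigenvalues are 2 and 2 +/- sqrt(2).
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
A_final = qr_iterate(A, 60)
eigenvalues = sorted(A_final[i][i] for i in range(3))
```

For this matrix the diagonal of the iterate settles on the eigenvalues while the corner elements stay at rounding level, in line with the remark that the tridiagonal property is preserved.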

Exercises 65. Use the power method to find the largest eigenvalue of

(a)

62 32 1)1 ,

(1

1

1


66. Use inverse iteration to find the eigenvalues of the matrix in 65(a).

67. Find the eigenvectors of (: :

;)

68. Use the Givens and Givens-Householder methods to reduce to tridiagonal form

2

4 4 2) (a)

(:

4 :

'

2

(b)

2

2-1 -1

2 69. Show how P = I - 2ww" may be used to transform a Hermitian matrix to tridiagonal form.

-2 70. Show that

0

-2

o

0

has two eigenvalues in ( - 2, 0) and is, in fact,

-2

o

-2

negative definite.

71. Find the eigenvalues of (:

-

~

;) by Sturm sequences.

72. In the QR algorithm prove that

1.15 The generalized inverse

It is not uncommon in applications to encounter the problem of solving the system

\[ Ax = b \tag{1.81} \]

where $A$ is not square but $m \times n$. For example, in making observations it may be that we have data from fewer points than we have unknowns so that $m < n$, or we may have more data points than unknowns in which case $m > n$. In the former case the linear system possesses an infinite number of solutions while, in the latter, the system may strictly have no solution because the equations are inconsistent with one another. Yet it may be important for the application to identify a single entity which one is prepared to accept as 'the solution' of the system.

One method which suggests itself is that of least squares. There are at least two ways in which this could be applied. We could consider minimizing the sum of the squares of the residual $r^T r$ where $r = b - Ax$ or we might try minimizing $x^T x$. Let us deal with residuals first.

The rank of a matrix is the order of the largest non-singular submatrix in the matrix. It is not difficult to confirm that, when $A$ is real, $A$ and $A^T A$ have the same rank. With that notion we can formulate

THEOREM 1.15. If real $A$ is $m \times n$ with $m > n$ and of rank $n$, the solution of (1.81) which minimizes $r^T r$ is given by

\[ x = (A^T A)^{-1} A^T b. \]

Proof. Since $A^T A$ is $n \times n$ and of rank $n$, it is non-singular and so the formula makes sense. Also

\[ r^T r = b^T b - x^T A^T b - b^T A x + x^T A^T A x \]

so that $\partial(r^T r)/\partial x_i = 0$ for $i = 1, \ldots, n$ leads to

\[ A^T A x = A^T b \tag{1.82} \]

and the theorem is proved.

For the case $m < n$ we have

THEOREM 1.15a. If real $A$ is $m \times n$ with $m < n$ and of rank $m$, the solution of (1.81) which minimizes $x^T x$ is given by

\[ x = A^T (A A^T)^{-1} b. \]

Proof. The formula makes sense because $A A^T$ is $m \times m$ of rank $m$ and therefore non-singular. The minimum of $x^T x$ subject to $Ax = b$ is found by Lagrange multipliers, i.e. by minimizing $S = x^T x + \lambda^T (b - Ax)$ where $\lambda$ is a column vector with $m$ elements. From $\partial S/\partial x_i = 0$, $i = 1, \ldots, n$ we obtain $x = A^T \lambda$ (after absorbing a factor of $\tfrac{1}{2}$ into $\lambda$) and from $\partial S/\partial \lambda_j = 0$, $j = 1, \ldots, m$ we have $Ax = b$. Hence

\[ A A^T \lambda = Ax = b \]

which can be solved for $\lambda$ and the theorem follows.

It is desirable to relax the conditions on rank in Theorems 1.15 and 1.15a and find a single formula which encompasses all possibilities. To this end we

examine whether there is a matrix $A^+$ such that $x = A^+ b$. In the case of Theorem 1.15 we have

\[ A^+ = (A^T A)^{-1} A^T \]

and we remark that

\[ A^+ A A^+ = (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = (A^T A)^{-1} A^T = A^+. \]

Similarly $A A^+ A = A$. Also $A A^+$ and $A^+ A$ are symmetric. Now, for Theorem 1.15a, $A^+ = A^T (A A^T)^{-1}$ and a check reveals that it has the same three properties. This prompts:

DEFINITION. A matrix $A^+$ with the properties (i) $A^+ A A^+ = A^+$, (ii) $A A^+ A = A$, (iii) $A A^+$ and $A^+ A$ are symmetric, is called the generalized inverse of $A$. If $A$ is complex replace symmetric in (iii) by Hermitian.
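The three defining properties can be checked numerically for the inverse of Theorem 1.15. The sketch below is illustrative only: the $3 \times 2$ matrix, the right-hand side and the helper names are assumptions chosen for the example (fitting a straight line to three points), and the $2 \times 2$ inverse is written out explicitly so that nothing beyond plain Python is needed.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(r) for r in zip(*m)]

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def inverse_2x2(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def close(x, y, tol=1e-12):
    """True if two matrices of the same shape agree element by element."""
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))

# m = 3 > n = 2, rank 2: A+ = (A^T A)^{-1} A^T as in Theorem 1.15.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
At = transpose(A)
A_plus = matmul(inverse_2x2(matmul(At, A)), At)

property_i = close(matmul(matmul(A_plus, A), A_plus), A_plus)    # A+ A A+ = A+
property_ii = close(matmul(matmul(A, A_plus), A), A)             # A A+ A = A
AAp, ApA = matmul(A, A_plus), matmul(A_plus, A)
property_iii = close(AAp, transpose(AAp)) and close(ApA, transpose(ApA))

# Least-squares solution of the inconsistent system A x = b.
b = [[1.0], [2.0], [2.0]]
x = matmul(A_plus, b)
```

Here `x` holds the intercept and slope of the least-squares line through the points (0, 1), (1, 2), (2, 2).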

It has already been verified that the inverses of Theorems 1.15 and 1.15a comply with this definition. If $A$ is square and non-singular, the inverse $A^{-1}$ obviously satisfies it. Moreover, we can show that $A$ possesses only one generalized inverse so that $A^+ = A^{-1}$ when $A^{-1}$ exists. Suppose, in fact, that real $A$ had a second generalized inverse $B^+$. Then, from property (ii)

\[ A^+ A = A^+ A B^+ A, \]

and then property (iii) implies that

\[ A^+ A = B^+ A A^+ A = B^+ A \]

whence $B^+ = A^+ A B^+$. Similarly, $A A^+ = A B^+$ from which

\[ A^+ = A^+ A B^+ \]

and hence $A^+ = B^+$ so that the generalized inverse is unique.

By taking the transpose of the quantities in the definition we deduce that $(A^+)^T = (A^T)^+$ so that, if $A$ is symmetric so is $A^+$. Obviously, $(A^+)^+ = A$ and $A^+$ has the same rank as $A$.

To obtain an explicit formula for $A^+$ it is convenient to derive first some properties of complex $m \times n$ matrices. If $\mu$ is a non-zero number and there are

vectors $u$, $v$ such that

\[ Au = \mu v, \qquad A^H v = \mu u \]

then $\mu$ is known as a singular value of $A$ and $u$, $v$ as the corresponding pair of singular vectors. Now

\[ A^H A u = \mu A^H v = \mu^2 u. \]

Since $A^H A$ is positive semi-definite the values of $\mu^2$ are real and non-negative. Also $A^H A$, which is of order $n \times n$, possesses $n$ linearly independent eigenvectors $u_1, \ldots, u_n$ which can be arranged to be orthonormal. If the rank of $A$ is $k$ so is the rank of $A^H A$ and precisely $k$ of the values of $\mu^2$ are non-zero. Pick the order of the eigenvectors so that $u_1, u_2, \ldots, u_k$ correspond to the non-zero

eigenvalues $\mu_1^2, \ldots, \mu_k^2$. Note that $u_{k+1}, \ldots, u_n$ can be chosen to satisfy $Au_i = 0$. Define $v_i$ for $i = 1, \ldots, k$ by $v_i = Au_i/\mu_i$ with $\mu_i$ the positive square root of $\mu_i^2$. Then $AA^H v_i = AA^H A u_i/\mu_i = \mu_i A u_i = \mu_i^2 v_i$ so that $v_i$ is an eigenvector of $AA^H$ and, since $A^H v_i = \mu_i u_i$, $\mu_1, \ldots, \mu_k$ are the positive singular values of $A$ and $u_i$, $v_i$ the corresponding singular vectors. The vectors $v_i$ are orthonormal because

\[ \mu_i \mu_j (v_i, v_j) = (Au_i, Au_j) = (u_i, A^H A u_j) = \mu_j^2 (u_i, u_j). \]

The set may be completed by adding on $m - k$ orthogonal vectors satisfying $A^H v = 0$; they are automatically eigenvectors of $AA^H$ corresponding to the eigenvalue zero. Define the $n \times n$ matrix $U$ and the $m \times m$ matrix $V$ by

\[ U = (u_1, \ldots, u_n), \qquad V = (v_1, \ldots, v_m). \]

Then we have

THEOREM 1.15b. If $A$ is of rank $k$ there are unitary matrices $U$ and $V$ such that

\[ V^H A U = \begin{pmatrix} \Lambda & 0 \\ 0 & 0 \end{pmatrix} \]

where $\Lambda$ is a diagonal matrix of order $k$ whose diagonal elements are the singular values of $A$.

Proof. By construction $U^H U$ is the $n \times n$ unit matrix so that $U$ is unitary. Similarly $V$ is unitary. Also

\[ AU = (\mu_1 v_1, \ldots, \mu_k v_k, 0, \ldots, 0) \]

and the theorem follows from the orthonormal property of the $v_i$.

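Since the diagonal elements of $\Lambda$ are non-zero, $\Lambda^{-1}$ exists. This fact enables us to state

THEOREM 1.15c. If $A$ is of rank $k$, its generalized inverse is given by

\[ A^+ = U \begin{pmatrix} \Lambda^{-1} & 0 \\ 0 & 0 \end{pmatrix} V^H. \]

Proof.

\[ A^+ A A^+ = U \begin{pmatrix} \Lambda^{-1} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \Lambda & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \Lambda^{-1} & 0 \\ 0 & 0 \end{pmatrix} V^H \]

from Theorem 1.15b. Property (i) of the Definition follows at once. Further, since $U$ and $V$ are unitary,

\[ A = V \begin{pmatrix} \Lambda & 0 \\ 0 & 0 \end{pmatrix} U^H \]

so that

\[ AA^+ = V \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} V^H, \qquad A^+ A = U \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} U^H \]

which show that property (iii) is satisfied. Finally,

\[ AA^+ A = V \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \Lambda & 0 \\ 0 & 0 \end{pmatrix} U^H = V \begin{pmatrix} \Lambda & 0 \\ 0 & 0 \end{pmatrix} U^H = A \]

and the proof is complete.

Remark now that

\[ A^H A A^+ = A^H (AA^+)^H = (AA^+ A)^H = A^H \]

so that $x = A^+ b$ satisfies (1.82) with the affix $T$ replaced by $H$. Moreover, in the analogue of Theorem 1.15a, we need a solution of $AA^H \lambda = b$. But $AA^H$ is Hermitian and so has the structure of Theorem 1.15b with $V = U$ whence $(AA^H)^+ = (A^H)^+ A^+$. Therefore $x = A^H (A^H)^+ A^+ b = A^+ b$. Therefore, we have proved

THEOREM 1.15d. If $A$ is a complex $m \times n$ matrix of rank $k$ the vector $x$ which minimizes (a) $(b - Ax, b - Ax)$ and (b) $(x, x)$ subject to $Ax = b$ is given by $x = A^+ b$ where $A^+$ is specified in Theorem 1.15c.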
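When $A$ is rank deficient neither $(A^TA)^{-1}$ nor $(AA^T)^{-1}$ exists, but the construction of Theorem 1.15c still applies. The sketch below illustrates this for a real matrix of rank one, where there is a single non-zero singular value. It relies on a shortcut valid only for rank one (the trace of $A^TA$ equals $\mu_1^2$ and any non-zero column of $A^TA$ is proportional to $u_1$); the matrix and helper names are assumptions made for the example.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(r) for r in zip(*m)]

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def close(x, y, tol=1e-12):
    """True if two matrices of the same shape agree element by element."""
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))

A = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]          # m = 3, n = 2, rank k = 1

AtA = matmul(transpose(A), A)
mu = math.sqrt(AtA[0][0] + AtA[1][1])             # rank one: trace of A^T A is mu^2
norm = math.hypot(AtA[0][0], AtA[1][0])
u1 = [AtA[0][0] / norm, AtA[1][0] / norm]         # unit eigenvector of A^T A
v1 = [sum(A[i][j] * u1[j] for j in range(2)) / mu for i in range(3)]   # v = A u / mu

# With k = 1 the formula A+ = U diag(1/mu, 0, ...) V^H reduces to u1 v1^T / mu.
A_plus = [[u1[i] * v1[j] / mu for j in range(3)] for i in range(2)]

AAp, ApA = matmul(A, A_plus), matmul(A_plus, A)
penrose_ok = (close(matmul(ApA, A_plus), A_plus)      # (i)   A+ A A+ = A+
              and close(matmul(AAp, A), A)            # (ii)  A A+ A = A
              and close(AAp, transpose(AAp))          # (iii) A A+ symmetric
              and close(ApA, transpose(ApA)))         #       A+ A symmetric
```

For this particular matrix the result is $A^T/70$, which is consistent with the rank-one formula quoted in the exercises of this section.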

It is sometimes possible to derive formulae for $A^+$ which do not involve finding $U$ and $V$ (see exercises). In practice, the system (1.82) may be ill-conditioned and, indeed, worse than the original system. For consider the case when $A$ is square. Then, if $Ax = b$ is ill-conditioned, $\det A$ is likely to be small and $\det(A^T A) = (\det A)^2$ will be much smaller again. There are similar arguments if $A$ is not square. For this reason, when $A$ is real, advantage is sometimes taken of the result derived in the previous section that $A = QU_1$ where $Q$ is orthogonal and $U_1$ upper triangular to solve instead

The condition of the system may then be considerably improved.


Exercises

73. If $u$ and $v$ are non-zero real vectors prove that (i) $u^+ = (u^T u)^{-1} u^T$, (ii) $A^+ = A^T/\{(v^T v)(u^T u)\}$ where $A = uv^T$.

74. If $A = BC$ where $B$ is $m \times k$, $C$ is $k \times n$ and all three matrices are of rank $k$ prove that $A^+ = C^T (CC^T)^{-1} (B^T B)^{-1} B^T$.

75. Give an example in which $(AB)^+ \neq B^+ A^+$.

-1

76. Calculate the singular values of

-2

2

77. If $Ax = b$ is a consistent system prove that

\[ x = \sum_{i=1}^{k} (v_i, b) u_i/\mu_i + \sum_{i=k+1}^{n} \alpha_i u_i \]

in the notation of this section, the $\alpha_i$ being arbitrary constants.

78. If $A$ is Hermitian with eigenvalues $\lambda_i$ and orthonormal eigenvectors $x_i$ show that

\[ A = \sum_{i=1}^{n} \lambda_i x_i x_i^H. \]

79. Prove from (1.83) that $A^+ = (A^H A)^{-1} A^H$ if $A^H A$ is non-singular.

80. If $A^H A$ is non-singular prove that the vector $x$ which minimizes $r^H B r$, where $B$ is positive definite, is $x = (A^H B A)^{-1} A^H B b$.

2 WAVEGUIDES AND DIFFERENCE EQUATIONS

2.1 Introduction

The phenomena of electromagnetism are governed by Maxwell's equations, which may be expressed as

\[ \operatorname{curl} \mathbf{E} + \partial\mathbf{B}/\partial t = 0, \tag{2.1} \]
\[ \operatorname{curl} \mathbf{H} - \partial\mathbf{D}/\partial t = \mathbf{J}, \tag{2.2} \]
\[ \operatorname{div} \mathbf{D} = \rho, \tag{2.3} \]
\[ \operatorname{div} \mathbf{B} = 0, \tag{2.4} \]

where the vector $\mathbf{E}$ is the electric intensity, $\mathbf{B}$ is the magnetic flux density, $\mathbf{D}$ is the electric flux density, $\mathbf{H}$ is the magnetic intensity, $\mathbf{J}$ is the current density, and $\rho$ is the charge density. The charge density and current density are connected by the equation of continuity or conservation of charge

\[ \operatorname{div} \mathbf{J} + \partial\rho/\partial t = 0. \tag{2.5} \]

Equations (2.1)-(2.5) are insufficient to determine the electromagnetic field and have to be supplemented by constitutive equations showing how the field is related to the properties of the medium. The simplest constitutive equations occur in free space, where

\[ \mathbf{D} = \varepsilon_0 \mathbf{E}, \qquad \mathbf{B} = \mu_0 \mathbf{H}, \tag{2.6} \]

where $\mu_0$ and $\varepsilon_0$ are certain constants. They are related by

\[ c = 1/(\mu_0 \varepsilon_0)^{1/2}, \]

where $c$ is the speed of light. In the SI system of units which will be employed here, $\mu_0 = 4\pi \times 10^{-7}$ henry/metre, $c$ is about $3 \times 10^8$ m/s and $\varepsilon_0$ is about $(1/36\pi) \times 10^{-9}$ farad/metre. For many bodies the laws

\[ \mathbf{D} = \varepsilon \mathbf{E}, \qquad \mathbf{B} = \mu \mathbf{H} \tag{2.7} \]

are reasonable. The ratios $\varepsilon/\varepsilon_0$ and $\mu/\mu_0$ are often known as the dielectric constant, or permittivity, and permeability respectively. The permittivity is never less than unity but the permeability can be, though it is very close to unity for many substances.
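As a quick arithmetic check on the constants just quoted, the relation $c = 1/(\mu_0\varepsilon_0)^{1/2}$ can be evaluated directly. The sketch below uses the rounded value of $\varepsilon_0$ given above, so it reproduces the approximate figure $3 \times 10^8$ m/s rather than the measured speed of light; the names `MU_0`, `EPS_0` and `wave_speed` are illustrative.

```python
import math

MU_0 = 4 * math.pi * 1e-7          # henry/metre
EPS_0 = 1e-9 / (36 * math.pi)      # farad/metre, the rounded value quoted in the text

def wave_speed(mu, eps):
    """Evaluate 1/(mu*eps)^(1/2)."""
    return 1.0 / math.sqrt(mu * eps)

c = wave_speed(MU_0, EPS_0)        # close to 3e8 m/s with these rounded constants
```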

There are other constitutive laws for bodies such as ferrites and ferromagnets but they will not concern us in the present context. (See, for example, Jones (1986), where many subsequent statements in this section are substantiated also.) In a conductor, Ohm's law holds in the form

\[ \mathbf{J} = \sigma \mathbf{E}, \tag{2.8} \]

where $\sigma$ is known as the conductivity. Many metals possess a high conductivity and so it is often a theoretical convenience to regard a metal as an ideal perfect conductor in which $\sigma$ is taken to be infinite.

At a boundary which separates one medium from another the parameters $\varepsilon$, $\mu$, and $\sigma$ will often change sharply and it is necessary to have formulae which connect the fields on the two sides of the boundary. These boundary conditions are as follows.

(i) The normal component of $\mathbf{B}$ is continuous.
(ii) Each tangential component of $\mathbf{E}$ is continuous.
(iii) If the conductivities of both media are finite each tangential component of $\mathbf{H}$ is continuous.

If one medium is a perfect conductor, (iii) is not valid and (ii) becomes

(ii)' Each tangential component of $\mathbf{E}$ vanishes on a perfect conductor.

There are two further results which can be helpful.

(iv) The change in the normal component of $\mathbf{D}$ is equal to the surface charge density.
(v) At a perfect conductor $\mathbf{n} \wedge \mathbf{H} = \mathbf{J}_s$, where $\mathbf{n}$ is a unit vector normal to the boundary and $\mathbf{J}_s$ is a surface current density.

In many important cases (ii) implies (i) and it is sufficient to impose only conditions (ii) and (iii) (or only (ii)' for a perfect conductor).

Fields which are produced by currents and charges whose variation with time $t$ is simple harmonic are of considerable importance. If $\varepsilon$, $\mu$, and $\sigma$ are independent of time Maxwell's equations and the constitutive laws are linear. It is therefore possible to consider writing

\[ \mathbf{E} = \mathbf{E}_h e^{i\omega t}, \qquad \mathbf{H} = \mathbf{H}_h e^{i\omega t}, \]

where $i = \sqrt{-1}$ and $\omega$ is a constant, $\omega/2\pi$ being known as the frequency. Here it is understood that $\mathbf{E}_h$ and $\mathbf{H}_h$ do not involve $t$ and the real part of the complex expressions is taken at the end of the analysis. This will be legitimate provided that there are no extraneous circumstances or boundary conditions which introduce non-linearities. The boundary conditions (i)-(iii) and (ii)' are linear so that the suggested procedure is certainly acceptable for them. Accordingly, for harmonic fields subject to (2.7) the governing equations

can be taken as

\[ \operatorname{curl} \mathbf{E}_h + i\omega\mu \mathbf{H}_h = 0, \tag{2.9} \]
\[ \operatorname{curl} \mathbf{H}_h - i\omega\varepsilon \mathbf{E}_h = \mathbf{J}_h, \tag{2.10} \]
\[ \operatorname{div}(\varepsilon \mathbf{E}_h) = \rho_h, \tag{2.11} \]
\[ \operatorname{div}(\mu \mathbf{H}_h) = 0, \tag{2.12} \]

from which all time variations have disappeared. If $\omega \neq 0$, the divergence of (2.9) gives (2.12) and the divergence of (2.10) gives (2.11) when the equation of continuity (2.5) is taken into account. Therefore, for harmonic fields, which are non-static, it is sufficient to employ (2.9) and (2.10) plus the harmonic form of the equation of continuity. Also the boundary condition (i) will be covered automatically by (ii). Conductivity would also be allowed for by replacing $\varepsilon$ by $\varepsilon - i\sigma/\omega$.

It can be shown that in these circumstances the solution of the electromagnetic problem is unique, except possibly if the medium extends to infinity. To ensure uniqueness when the medium is infinite it is necessary to impose extra conditions to guarantee that a source at the origin produces a field which is radiating or outgoing at infinity. If $\mu$ and $\varepsilon$ are constant near infinity introduce $Z_0 = (\mu/\varepsilon)^{1/2}$, the impedance of the medium. Then if $\mathbf{r}$ is the radial vector from the origin and $\mathbf{l}$ a unit vector in the same direction so that $\mathbf{r} = r\mathbf{l}$, the radiation conditions are

\[ r\mathbf{E},\ r\mathbf{H} \text{ bounded}, \qquad r(\mathbf{E} - Z_0 \mathbf{H} \wedge \mathbf{l}) \to 0, \qquad r\{\mathbf{H} - (1/Z_0)\, \mathbf{l} \wedge \mathbf{E}\} \to 0, \tag{2.13} \]

as $r \to \infty$.

2.2 Waveguides

A hollow waveguide is a cylindrical metal tube containing a single medium. Assume that the generators of the cylinder are parallel to the $z$ axis and that the tube is perfectly conducting. Let $C$ be the perimeter of the cross-section in $z$ = constant (Fig. 2.1), and let $s$ and $n$ be tangential and normal respectively to $C$ in this plane. Then the boundary condition of perfect conductivity on $C$ can be expressed as

\[ E_z = 0, \qquad E_s = 0 \]

on $C$, where the suffixes $z$ and $s$ on $E$ indicate the components of $\mathbf{E}$ in the specified directions. Assume that the time variation is harmonic and that $\varepsilon$, $\mu$ are constant. We are then going to search for solutions of (2.9) and (2.10) whose $z$ dependence is entirely given by the factor $e^{-i\kappa z}$, where $\kappa$ is some constant. It will be advantageous to simplify the notation by dropping the subscript $h$ in (2.9) and (2.10).

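Fig. 2.1. Geometry of waveguide cross-section.

Let $\mathbf{k}$ be a unit vector along the $z$ axis. Write

\[ \mathbf{E} = \mathbf{E}_t + E_z \mathbf{k}, \qquad \mathbf{H} = \mathbf{H}_t + H_z \mathbf{k}, \]

so that $\mathbf{E}_t$, $\mathbf{H}_t$ are the transverse components of $\mathbf{E}$, $\mathbf{H}$ respectively and lie in the $(x, y)$ plane. Similarly, let $\operatorname{grad}_t$ denote the gradient operator in the $(x, y)$ plane. Then, with the $z$ dependence assumed, (2.9) and (2.10) can be expressed as

\[ i\kappa \mathbf{E}_t + i\omega\mu\, \mathbf{k} \wedge \mathbf{H}_t + \operatorname{grad}_t E_z = 0, \tag{2.14} \]
\[ i\kappa \mathbf{H}_t - i\omega\varepsilon\, \mathbf{k} \wedge \mathbf{E}_t + \operatorname{grad}_t H_z = 0, \tag{2.15} \]
\[ \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + i\omega\mu H_z = 0, \qquad \frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - i\omega\varepsilon E_z = 0, \tag{2.16} \]

when the current density is zero. Substitute for $\mathbf{H}_t$ in (2.14) from (2.15). Then, since $\mathbf{k} \cdot \mathbf{E}_t = 0$,

\[ (\kappa^2 - k^2) \mathbf{E}_t = i\kappa \operatorname{grad}_t E_z - i\omega\mu\, \mathbf{k} \wedge \operatorname{grad}_t H_z, \tag{2.17} \]

where $k^2 = \omega^2 \mu\varepsilon$. If (2.17) is substituted in (2.15) we obtain

\[ (\kappa^2 - k^2) \mathbf{H}_t = i\omega\varepsilon\, \mathbf{k} \wedge \operatorname{grad}_t E_z + i\kappa \operatorname{grad}_t H_z. \tag{2.18} \]

Eqns (2.17) and (2.18) express $\mathbf{E}_t$ and $\mathbf{H}_t$ in terms of $E_z$ and $H_z$ provided that $\kappa^2 \neq k^2$.

If now (2.17) and (2.18) are employed to eliminate the transverse components from (2.16), the equations

\[ (\nabla_t^2 + k^2 - \kappa^2) E_z = 0, \tag{2.19} \]
\[ (\nabla_t^2 + k^2 - \kappa^2) H_z = 0 \tag{2.20} \]

are obtained, where $\nabla_t^2 \equiv \partial^2/\partial x^2 + \partial^2/\partial y^2$ is the two-dimensional Laplacian. Thus the problem has been reduced to solving the two-dimensional Helmholtz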

equation subject to the boundary conditions. To convert the boundary conditions to a form suitable for (2.20), let $\mathbf{n}$ be the unit vector in the direction of $n$ in Fig. 2.1. Then, from multiplying (2.14) vectorially by $\mathbf{n}$,

\[ -i\omega\mu\, \mathbf{n} \cdot \mathbf{H}_t = i\kappa E_s + \partial E_z/\partial s, \tag{2.21} \]

from which we deduce that $H_n = 0$ on $C$ if $\omega \neq 0$. On the other hand, vectorial multiplication of (2.15) by $\mathbf{k} \wedge \mathbf{n}$ gives

\[ i\kappa H_n + \partial H_z/\partial n + i\omega\varepsilon E_s = 0. \tag{2.22} \]

Since it has just been proved that $H_n = 0$ on $C$, it follows that $\partial H_z/\partial n = 0$ on $C$. Consequently, it has been shown that our boundary conditions imply

\[ E_z = 0, \qquad \partial H_z/\partial n = 0 \tag{2.23} \]

on $C$. Conversely, if (2.23) holds, (2.21) and (2.22) show that $H_n = 0$ and $E_s = 0$ provided that $\kappa^2 \neq k^2$. Hence, it has been demonstrated that our problem has been converted to solving (2.19) and (2.20) subject to (2.23) with the sole exception of the case $\kappa^2 = k^2$.

If $\kappa^2 = k^2$, direct substitution for $\mathbf{H}_t$ from (2.14) into the second of (2.16) gives $\nabla_t^2 E_z = 0$, i.e. $E_z$ is a solution of Laplace's equation which vanishes on its surrounding boundary. Hence $E_z$ is identically zero. From (2.18) it follows that $H_z$ is independent of $x$ and $y$. However, if $A$ is the cross-section of the guide, we infer from (2.16) that

\[ -i\omega\mu \int_A H_z \, dx\, dy = \int_A \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx\, dy = \int_C E_s \, ds = 0 \tag{2.24} \]

on account of the boundary conditions on $E_s$. Hence $H_z$ is identically zero. Then the first of (2.16) can be satisfied by $\mathbf{E}_t = \operatorname{grad}_t \psi$, (2.15) gives $\mathbf{H}_t$, and then the second of (2.16) supplies

\[ \nabla_t^2 \psi = 0. \]

In order to satisfy the boundary condition on $E_s$, $\psi$ must take a constant value on any connected portion of the boundary, though it may have different constant values on disconnected sections.

Solutions of (2.19) subject to $E_z = 0$ do not exist for all values of $\kappa$ but only for certain discrete values, so that we have an eigenvalue problem. Suppose that the eigenvalues are $\nu_m^2$ and the corresponding eigenfunctions are $\phi_m$ so that

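\[ (\nabla_t^2 + \nu_m^2) \phi_m = 0 \tag{2.25} \]

with $\phi_m = 0$ on $C$. Define $\lambda_m$ by

\[ \lambda_m = \begin{cases} (k^2 - \nu_m^2)^{1/2} & (k^2 > \nu_m^2), \\ -i(\nu_m^2 - k^2)^{1/2} & (k^2 < \nu_m^2). \end{cases} \]

Put $H_z = 0$ and then a possible electromagnetic field is given by (2.17) and (2.18) as

\[ E_z = \phi_m \exp(-i\lambda_m z), \]
\[ \mathbf{E}_t = -(i\lambda_m/\nu_m^2) \exp(-i\lambda_m z) \operatorname{grad} \phi_m, \]
\[ \mathbf{H}_t = -(i\omega\varepsilon/\nu_m^2) \exp(-i\lambda_m z)\, \mathbf{k} \wedge \operatorname{grad} \phi_m. \]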

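It is characterized by having no component of the magnetic field in the direction of propagation along the $z$ axis and is therefore known as a transverse magnetic or TM mode. Similarly, if

\[ (\nabla_t^2 + \mu_m^2) \psi_m = 0 \tag{2.26} \]

with $\partial\psi_m/\partial n = 0$ on $C$ and $\kappa_m^2 = k^2 - \mu_m^2$, a transverse electric or TE mode is provided by

\[ E_z = 0, \qquad \mathbf{E}_t = (i\omega\mu/\mu_m^2) \exp(-i\kappa_m z)\, \mathbf{k} \wedge \operatorname{grad} \psi_m, \]
\[ H_z = \psi_m \exp(-i\kappa_m z), \qquad \mathbf{H}_t = -(i\kappa_m/\mu_m^2) \exp(-i\kappa_m z) \operatorname{grad} \psi_m. \]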

A possible solution occurs when $\mu_m$ is zero and $\psi_m$ is a constant. However, (2.24) then forces the constant to be zero. Therefore $\mu_1$ is taken as the first non-zero eigenvalue which occurs.

Finally, there is the case $\kappa^2 = k^2$ in which both $E_z$ and $H_z$ vanish; this is a TEM mode and

\[ E_z = 0, \qquad \mathbf{E}_t = \exp(-ikz) \operatorname{grad} \psi, \]
\[ H_z = 0, \qquad \mathbf{H}_t = (k/\omega\mu) \exp(-ikz)\, \mathbf{k} \wedge \operatorname{grad} \psi. \]

It can be shown (see, for example, Jones (1986)) that any electromagnetic field in a waveguide can be expressed as a sum of constant multiples of all the modes propagating in the positive and negative $z$ directions.

A TE mode propagates without attenuation only if $k^2 > \mu_m^2$; otherwise it is exponentially damped. Therefore every TE mode is damped if $k^2 < \mu_1^2$. Similarly every TM mode is damped if $k^2 < \nu_1^2$. Since it is known that $\mu_1^2 \leq \nu_1^2$ there are no propagating TE or TM modes if $k^2 < \mu_1^2$. For this reason $\mu_1$ is often known as the cut-off wavenumber and $2\pi/\mu_1$ as the cut-off wavelength. The TE mode corresponding to $\mu_1$ is called the fundamental or dominant mode.

The TEM mode propagates without attenuation for all values of $k$. However, it may not always exist. If the cross-section of the guide is simply connected, the boundary consists of one connected piece and $\psi$ must be a constant. Then the field disappears. If the cross-section is multiply connected there may be a TEM mode (e.g. in a coaxial cable where the boundary consists of two separate concentric circles).
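The distinction between propagating and damped modes comes down to which branch of $(k^2 - \mu_m^2)^{1/2}$ is taken. A minimal sketch follows, assuming the same branch convention as in the definition of $\lambda_m$ (real and positive above cut-off, negative imaginary below), so that the factor $\exp(-i\kappa z)$ either has unit modulus or decays as $z$ increases. The function names are illustrative, and the argument `eigenvalue` stands for $\mu_m$ or $\nu_m$.

```python
import cmath
import math

def propagation_constant(k, eigenvalue):
    """Return kappa with kappa^2 = k^2 - eigenvalue^2: real and positive above
    cut-off, negative imaginary below it."""
    difference = k * k - eigenvalue * eigenvalue
    if difference > 0:
        return complex(math.sqrt(difference), 0.0)
    return complex(0.0, -math.sqrt(-difference))

def mode_factor(k, eigenvalue, z):
    """The z dependence exp(-i kappa z) of a single mode."""
    return cmath.exp(-1j * propagation_constant(k, eigenvalue) * z)

def cutoff_wavelength(mu_1):
    """Cut-off wavelength 2*pi/mu_1 for the lowest non-zero eigenvalue mu_1."""
    return 2.0 * math.pi / mu_1

above = mode_factor(5.0, 3.0, 2.0)    # k > eigenvalue: modulus stays 1
below = mode_factor(3.0, 5.0, 1.0)    # k < eigenvalue: decays like exp(-4 z)
```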

Unless otherwise stated, the cross-section will be assumed to be simply connected so that the TEM mode does not arise from now on. Thus our prime objective becomes that of finding $\mu_m$, $\nu_m$, $\phi_m$, and $\psi_m$. If there are known charges and currents present this is still true since the main modification is to add known terms to the right-hand sides of (2.25) and (2.26).

The number of boundaries $C$ for which analytic solutions can be found is strictly limited. In numerical work, however, we can take advantage of approximating functions by their values on a discrete point set; then operations can be reduced to simple arithmetic forms which are convenient for digital computer programs. If the approximation becomes more accurate as the spacing between the points diminishes, a satisfactory procedure will have to be derived. Accordingly, we turn to the question of formulating finite-difference approximations for (2.25) and (2.26).

Exercises

1. The boundaries of a waveguide are the planes $y = 0$ and $y = b$, and the field is independent of $x$. Show that the $y$ dependence of $H_z$ in a TE mode is $\cos(m\pi y/b)$ and find the mode. Is there a TEM mode? If so, find it.

2. A rectangular waveguide has boundaries $x = 0, a$ and $y = 0, b$. Show that the modes are derived from

\[ \psi_{mn} = \cos(m\pi x/a) \cos(n\pi y/b), \qquad \phi_{mn} = \sin(m\pi x/a) \sin(n\pi y/b), \]

indicating any restrictions on the integers $m$ and $n$. What is the cut-off wavenumber?

3. In a circular waveguide of radius $a$, prove that in cylindrical polar coordinates $r$, $\phi$, $z$ the modes are derived from

where Jm is the customary Bessel function and Jm(jmn) = 0, J~(j~n) = O. What modes occur when the perfectly conducting sheets ljJ = 0 and ljJ = in are added? 4. The cross-section of a coaxial line consists of the concentric circles r = band r = a (a > b). Show that there is a TEM mode with t/J = A + Bin r. 5. The cross-section of a waveguide is an equilateral triangle of side a. Take the centre of the triangle as origin with the x axis parallel to one side. Find the modes by considering products of trigonometric functions whose arguments are constant multiples of !(J3x + y + 4a) and !( -J3x + 3y).

2.3 Numerical derivatives

A derivative $f'$ of a function $f$ is defined by

\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}. \]

This suggests that a possible definition for the numerical derivative at $a$, which will be denoted by $F'_a$, might be

\[ F'_a = \frac{f(a + h) - f(a)}{h}. \]

The formula is very simple and requires the evaluation of $f$ at two points. For small enough $h$, one would expect it to be a reasonably good approximation. It is known as the forward difference formula. Since there is no change to the analytical formula when the sign of $h$ is altered, we could equally well adopt the backward difference formula

\[ F'_a = \frac{f(a) - f(a - h)}{h} \]

for the numerical derivative. A glance at Fig. 2.2 will reveal that the two numerical derivatives are not equal in general. The formulae give the slopes of the two chords on opposite sides of the point under consideration. Since the correct direction is that of the tangent, the forward difference gives too low a line and the backward difference too high. The average value might be better, i.e.

\[ F'_a = \frac{f(a + h) - f(a - h)}{2h} \]

or the central difference formula. Because it is the chord joining $f(a - h)$ and $f(a + h)$ the slope certainly looks more satisfactory.

Fig. 2.2. Backward and forward difference formulae.

A general justification of the superiority of the central difference formula stems from Taylor's theorem. For

\[ f(a + h) = f(a) + hf'(a) + \tfrac{1}{2}h^2 f''(\xi) \]

where $a < \xi < a + h$. Hence, for the forward difference formula,

\[ F'_a - f'(a) = \tfrac{1}{2}hf''(\xi). \tag{2.27} \]

Consequently, when $f''$ is bounded, the error in the forward difference formula is $O(h)$. Similarly, the error in the backward difference formula is $O(h)$. On the other hand, from

\[ f(a + h) = f(a) + hf'(a) + \tfrac{1}{2}h^2 f''(a) + \tfrac{1}{6}h^3 f'''(\xi_1), \tag{2.28} \]
\[ f(a - h) = f(a) - hf'(a) + \tfrac{1}{2}h^2 f''(a) - \tfrac{1}{6}h^3 f'''(\xi_2) \tag{2.29} \]

we see that the central difference formula satisfies

\[ F'_a - f'(a) = \tfrac{1}{12}h^2 \{f'''(\xi_1) + f'''(\xi_2)\} \tag{2.30} \]

so that the error is $O(h^2)$. So long as the derivatives of $f$ are well behaved and $h$ is not too large, the central difference should be the most accurate of the three. Moreover, its error decreases quadratically as $h$ is decreased, whereas the decrease is only linear for the other two formulae. In general, therefore, the central difference formula is to be preferred, though one may be obliged to use one of the other two if data are not available on both sides of the point where the numerical derivative is to be calculated.

The impression may be gained that any desired accuracy can be achieved by making $h$ small enough. However, this is not true since there is a limit to the accuracy in the evaluation of $f$. If this is $\epsilon$, the error in $F'_a$ in the forward difference formula may be $2\epsilon/h$ in absolute value. It follows from (2.27) that the error $F'_a - f'(a)$ may be as high as

\[ \tfrac{1}{2}hM + 2\epsilon/h, \]

where $M$ is a bound for $|f''|$. Since this has a minimum for $h^2 = 4\epsilon/M$ there is no point in making $h$ any smaller than $2(\epsilon/M)^{1/2}$.

The right-hand sides of (2.27) and (2.30) are known as the local truncation error of the appropriate difference formulae. By making the local truncation error small we hope to make our difference scheme accurate. Nevertheless, it must not be assumed that after applying it successively over a large number of points, as is required for partial differential equations, the accuracy all over the region will be good; this requires an investigation of the global accuracy which is usually a very complicated matter. With this reservation, difference formulae of higher orders of accuracy can be constructed. For example,

\[ F'_a = \frac{-\tfrac{1}{3}f(a - h) - \tfrac{1}{2}f(a) + f(a + h) - \tfrac{1}{6}f(a + 2h)}{h} \]

has an error of $O(h^3)$.
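The orders of accuracy quoted for these formulae can be observed numerically by halving $h$ and seeing how much the error shrinks. The sketch below does this for $f = \sin$ at $a = 1$; the test function, the step $h = 0.1$ and all function names are assumptions made for illustration. The last helper simply evaluates the bound $2(\epsilon/M)^{1/2}$ on the useful step size for the forward difference.

```python
import math

def forward(f, a, h):
    return (f(a + h) - f(a)) / h

def backward(f, a, h):
    return (f(a) - f(a - h)) / h

def central(f, a, h):
    return (f(a + h) - f(a - h)) / (2 * h)

def four_point(f, a, h):
    """The O(h^3) formula using f at a - h, a, a + h and a + 2h."""
    return (-f(a - h) / 3 - f(a) / 2 + f(a + h) - f(a + 2 * h) / 6) / h

RULES = {"forward": forward, "backward": backward,
         "central": central, "four_point": four_point}

def observed_order(rule, f, exact, a, h):
    """Estimate p in error ~ h^p from the errors at steps h and h/2."""
    coarse = abs(rule(f, a, h) - exact)
    fine = abs(rule(f, a, h / 2) - exact)
    return math.log2(coarse / fine)

# f = sin, so the exact derivative at a = 1 is cos(1).
orders = {name: observed_order(rule, math.sin, math.cos(1.0), 1.0, 0.1)
          for name, rule in RULES.items()}

def smallest_useful_step(eps, m_bound):
    """Step 2*(eps/M)^(1/2) minimizing h*M/2 + 2*eps/h for the forward difference."""
    return 2.0 * math.sqrt(eps / m_bound)
```

With these choices the observed orders come out close to 1, 1, 2 and 3 respectively.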

Higher derivatives may be handled in a similar manner. For instance

\[ F''_a = \frac{f(a + h) - 2f(a) + f(a - h)}{h^2} \tag{2.31} \]

is a suitable formula for the second derivative. By extending the expansions in (2.28) and (2.29) by an additional term, we may verify that the error is $O(h^2)$.

Difference formulae are connected with the theory of interpolation. If the parabola

\[ y = \alpha(x - a)^2 + \beta(x - a) + \gamma \]

is required to interpolate $f$ at $x = a - h$, $a$, $a + h$ it is found that

\[ \alpha = \frac{f(a + h) - 2f(a) + f(a - h)}{2h^2}, \qquad \beta = \frac{f(a + h) - f(a - h)}{2h}, \qquad \gamma = f(a). \]

Then

\[ y'(a) = F'_a, \qquad y''(a) = F''_a, \]

where $F'_a$ and $F''_a$ are given by the central difference formula and (2.31) respectively. Thus numerical derivatives are actually specified by fitting a polynomial to the data and then taking the derivatives of the polynomial. The higher-order formulae need higher-order polynomials and, as has been seen in Chapter 1, these may provide very oscillatory interpolants to the consequent detriment of the numerical derivative.

To find stationary points, i.e. points where $f'(x) = 0$, we can either discover the zeros of the interpolating polynomial just described by the methods of §1.8 or we can carry out inverse interpolation on the values of $F'$ at the data points by the method of §1.2.

It may happen that the data points are not equally spaced. Since interpolating polynomials are still available, difference formulae can still be generated. An example is

\[ F''_a = \frac{2}{b(b + 1)h^2} \{ b f(a - h) - (1 + b) f(a) + f(a + bh) \}. \tag{2.32} \]

The accuracy of such formulae is usually not so high as those with equally spaced data points. All of these ideas can be generalized without difficulty to several variables

and partial derivatives. Thus, possible difference formulae in two dimensions are

\[ F'_a = \frac{f(a + h, b) - f(a - h, b)}{2h}, \qquad F'_b = \frac{f(a, b + k) - f(a, b - k)}{2k}, \]
\[ F''_{aa} = \frac{f(a + h, b) - 2f(a, b) + f(a - h, b)}{h^2}, \qquad F''_{bb} = \frac{f(a, b + k) - 2f(a, b) + f(a, b - k)}{k^2}. \]

If $h = k$, the difference approximation for Laplace's operator is

\[ \frac{f(a, b + h) + f(a - h, b) + f(a + h, b) + f(a, b - h) - 4f(a, b)}{h^2}. \]

Since five points are involved it is known as the five-point difference scheme. By means of Taylor's expansion it may be demonstrated that the local truncation error is of the form

\[ \tfrac{1}{12}h^2 \left( \frac{\partial^4 f}{\partial x^4} + \frac{\partial^4 f}{\partial y^4} \right), \]

the fourth derivatives being evaluated at suitable intermediate points.
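The five-point scheme is easily tried on functions whose Laplacian is known. In the sketch below the test functions, the sample point and the function names are illustrative assumptions. For $f = \sin x \sin y$ the error should fall by a factor of about four when $h$ is halved, reflecting the $h^2$ behaviour of the local truncation error, while for the quadratic harmonic function $x^2 - y^2$ the scheme is exact apart from rounding.

```python
import math

def five_point_laplacian(f, a, b, h):
    """Five-point difference approximation to f_xx + f_yy at (a, b)."""
    return (f(a, b + h) + f(a - h, b) + f(a + h, b) + f(a, b - h)
            - 4 * f(a, b)) / h ** 2

def product_of_sines(x, y):
    return math.sin(x) * math.sin(y)          # Laplacian is -2 sin(x) sin(y)

def saddle(x, y):
    return x * x - y * y                      # harmonic: Laplacian is 0

a, b = 0.7, 0.4
exact = -2 * product_of_sines(a, b)
error_h = abs(five_point_laplacian(product_of_sines, a, b, 0.1) - exact)
error_half = abs(five_point_laplacian(product_of_sines, a, b, 0.05) - exact)
ratio = error_h / error_half                  # close to 4 for an O(h^2) error

harmonic_residual = abs(five_point_laplacian(saddle, a, b, 0.1))
```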

Exercises

6. Show that the errors in

\[ F'_a = \frac{-\tfrac{3}{2}f(a) + 2f(a + h) - \tfrac{1}{2}f(a + 2h)}{h}, \qquad F'_a = \frac{\tfrac{1}{2}f(a - 2h) - 2f(a - h) + \tfrac{3}{2}f(a)}{h} \]

are both $O(h^2)$.

7. Show that the error in

\[ F'_a = \frac{f(a - 2h) - 8f(a - h) + 8f(a + h) - f(a + 2h)}{12h} \]

is $O(h^4)$.

8. Prove that the local truncation error in

\[ F''_a = \frac{-f(a - 2h) + 16f(a - h) - 30f(a) + 16f(a + h) - f(a + 2h)}{12h^2} \]

is $h^4 f^{(6)}(\xi)/90$.

9. Given the table

x      0.40   0.44   0.48   0.52   0.56
f(x)   0.389  0.426  0.462  0.497  0.531

determine $f'(0.48)$.

10. Find the approximate values of $x$ where $f$ is a maximum for the table

x      1.46   1.50   1.54   1.58   1.62   1.66
f(x)   1.994  1.998  2.000  2.000  1.999  1.996

11. Show that a possible difference formula for Laplace's operator in two dimensions is

\[ \alpha f(a - h_1, b)/h_1 + \alpha f(a + h_2, b)/h_2 + \beta f(a, b - k_1)/k_1 + \beta f(a, b + k_2)/k_2 - 2(1/h_1 h_2 + 1/k_1 k_2) f(a, b), \]

where $\alpha = 2/(h_1 + h_2)$, $\beta = 2/(k_1 + k_2)$.

12. Show that a nine-point difference formula for Laplace's operator in two dimensions is

\[ \frac{1}{6h^2} [ 4\{f(a + h, b) + f(a - h, b) + f(a, b + h) + f(a, b - h)\} - 20f(a, b) + f(a + h, b + h) + f(a + h, b - h) + f(a - h, b + h) + f(a - h, b - h) ] \]

and that the local truncation error is $O(h^6)$.

2.4 Properties of difference equations

Formulae for numerical derivatives are not difficult to write down, but assessing their value for solving partial differential equations is another matter. We are not concerned here with determining analytical solutions of difference equations or even with their asymptotic behaviour though many useful results have been obtained in this direction. (See, for example, Milne-Thomson (1960); Dingle and Morgan (1967).) Instead our attention is centred on how solutions of the discrete system which replaces the continuous one are related to those of the partial differential equation.

Our first demand is obviously that the difference scheme should approach the partial differential equation as the distances between the data points become zero. This is usually easy to verify by examination of the local truncation error. In addition, there may also be boundary conditions or initial values that have to be confirmed.

Next, it will be desirable to know how closely the numerical solution $U$ approximates to the theoretical solution $u$ of the partial differential equation. For example, does $u(mh, nh) - U(mh, nh)$ tend to zero as $h \to 0$? For this purpose, it will not be sufficient to hold $m$ and $n$ fixed as $h \to 0$, otherwise conclusions will be restricted to the behaviour at the origin. Thus, convergence as $h \to 0$, $m \to \infty$, $n \to \infty$ in such a way that $mh$ and $nh$ remain fixed will be relevant; this is sometimes known as fixed-station convergence.

Again, if a difference scheme can produce a solution which is not bounded, the numerical effect can be dramatically unpleasant. Any error may then be quickly amplified from step to step and the sought solution rapidly swamped with consequent instability. However, it may be that a growing solution is looked for and that an increasing error term might be tolerated so long as it remains substantially smaller than the growing solution. Nevertheless, it will generally be wiser to reformulate the problem so that the required solution does not get too large and insist that the numerical procedure is stable.

To illustrate the ideas involved let us consider the solution of Poisson's equation (2.33) subject to the boundary condition $u = g(x, y)$ on $C$ when the five-point difference scheme of the preceding section for Laplace's operator is employed. Draw a mesh of squares of side $h$ and let $(mh, nh)$ be a typical point of intersection or node (Fig. 2.3). Nodes inside $C$ which are the centres of four other nodes which are either inside or on $C$ are called regular points. Let the set of regular points, denoted by crosses in Fig. 2.3, be designated $R_h$. Let $R_h'$ be the set of nodes inside or on $C$ which are not regular points but which are one mesh length from at least one regular point. They are marked by points in Fig. 2.3. The reader may care to visualize $R_h$ and $R_h'$ as the discrete analogues of a region and its boundary respectively.

The difference replacement is (2.34) where $U_{m,n} \equiv U(mh, nh)$ and $f_{m,n} = f(mh, nh)$. Strictly, it applies only at the regular points since at points of $R_h'$ it would necessitate asserting that $u$ satisfies Poisson's equation outside $C$. This difficulty will be dealt with later.

Fig. 2.3. Grid for Poisson's equation: × points of $R_h$; • points of $R_h'$.
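As a rough illustration of the five-point replacement for Poisson's equation, the following sketch solves it on the unit square by Gauss-Seidel sweeps and compares two mesh sizes. The test solution $u = \sin\pi x\,\sin\pi y$, the function names, and the sweep count are assumptions chosen for the example, not part of the text; on this simple region every interior node is a regular point.

```python
import math

def solve_poisson(n, f, g, sweeps=2000):
    """Gauss-Seidel solution of the five-point scheme on the unit square.

    n is the number of mesh intervals per side, so h = 1/n.
    """
    h = 1.0 / n
    # U[i][j] approximates u(i*h, j*h); boundary nodes carry the values of g.
    U = [[g(i * h, j * h) if i in (0, n) or j in (0, n) else 0.0
          for j in range(n + 1)] for i in range(n + 1)]
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                # Five-point difference equation solved for the central value.
                U[i][j] = 0.25 * (U[i + 1][j] + U[i - 1][j] + U[i][j + 1]
                                  + U[i][j - 1] - h * h * f(i * h, j * h))
    return U

def max_error(n):
    """Largest nodal error for the assumed test solution sin(pi x) sin(pi y)."""
    u = lambda x, y: math.sin(math.pi * x) * math.sin(math.pi * y)
    f = lambda x, y: -2.0 * math.pi ** 2 * u(x, y)
    U = solve_poisson(n, f, u)
    h = 1.0 / n
    return max(abs(U[i][j] - u(i * h, j * h))
               for i in range(n + 1) for j in range(n + 1))

err_coarse = max_error(4)   # h = 1/4
err_fine = max_error(8)     # h = 1/8
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving $h$ should reduce the error by a factor close to four, in keeping with a local truncation error of order $h^4$ in the difference equation.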


If the right-hand side of (2.34) is zero, the Laplace difference equation

$$U_{m,n+1} + U_{m-1,n} + U_{m+1,n} + U_{m,n-1} - 4U_{m,n} = 0 \tag{2.35}$$

is obtained. An important maximum principle is valid for it because the equation asserts that at a regular point $U_{m,n}$ coincides with the average of its values at the four neighbouring nodes. Therefore, not all the neighbouring values can exceed $U_{m,n}$, nor can every one of them be less than $U_{m,n}$. Thus $U_{m,n}$ cannot be a strict maximum or a strict minimum at a regular point. It follows that, if $U$ vanishes at every point of $R_h'$, $U$ is zero at every regular point, for if $U$ differed from zero on $R_h$ it would have to have either a positive maximum or a negative minimum there, contrary to our previous conclusion. An immediate consequence is that the solution of the Laplace difference equation is unique if the values of $U$ are specified on $R_h'$.

More generally, let $w(x, y)$ be any function such that

$$w_{m,n+1} + w_{m-1,n} + w_{m+1,n} + w_{m,n-1} - 4w_{m,n} \le 0 \tag{2.36}$$

on $R_h$. Then $w$ cannot have a strict minimum on $R_h$. Accordingly, if $w \ge 0$ on $R_h'$ then $w \ge 0$ on $R_h$. This result enables the estimate of a bound on a general function $e(x, y)$. For let $w \ge 0$ on $R_h'$ and denote the left-hand side of (2.36) by $L(w)$. Consider

$$v(x, y) = w(x, y)\max_{R_h}\{-|L(e)|/L(w)\} + \max_{R_h'}|e(x, y)|.$$

Since the last term on the right-hand side is a constant,

$$L(v) = L(w)\max_{R_h}\{-|L(e)|/L(w)\} \le -|L(e)|$$

on account of (2.36). This implies that

$$L(v - e) = L(v) - L(e) \le 0$$

whence, since $v \ge |e|$ on $R_h'$, $v \ge e$ on $R_h$. Also

$$L(v + e) = L(v) + L(e) \le 0$$

gives $v \ge -e$ on $R_h$. Consequently $|e| \le v$ on $R_h + R_h'$. In other words, for arbitrary $e$, on $R_h + R_h'$

$$|e(x, y)| \le w_0\max_{R_h}\{-|L(e)|/L(w)\} + \max_{R_h'}|e(x, y)| \tag{2.37}$$

with $w_0$ the maximum of $|w|$ on $R_h + R_h'$.

Now, suppose that the fourth partial derivatives of $u$ are bounded by $M$. Then, by analysis of the local truncation error,

$$u_{m,n+1} + u_{m-1,n} + u_{m+1,n} + u_{m,n-1} - 4u_{m,n} = h^2 f_{m,n} + O(h^4)$$

where the order term is, in fact, bounded by $\tfrac{1}{6}Mh^4$. We then subtract (2.34).


Then the error $U - u$ in the numerical solution satisfies

$$L(U - u) = O(h^4) \tag{2.38}$$

from which can be deduced

$$|L(U - u)| \le \tfrac{1}{6}Mh^4. \tag{2.39}$$

Choose $w$ above as

$$w = \frac{1}{4}\left\{1 - \frac{(x - x_0)^2 + (y - y_0)^2}{a^2}\right\}$$

where $a$, $x_0$, and $y_0$ are picked conveniently so that $R_h + R_h'$ lies within the circle whose perimeter is $w = 0$. Note that $w \ge 0$, $L(w) = -h^2/a^2$ and we can certainly take $w_0 \le 1$. Hence, from (2.37) and (2.39)

$$|U - u| \le \tfrac{1}{6}Ma^2h^2 + \max_{R_h'}|U - u| \tag{2.40}$$

on $R_h + R_h'$. The inequality (2.40) provides a bound on the error of the numerical solution. It is evident that the error tends to zero as $h \to 0$ provided that the errors on $R_h'$ tend to zero. If we ensure that points of $R_h'$ tend to points of $C$ and that $U$ takes the specified values of $u$ there we can be certain that as the mesh size tends to zero our numerical solution will approach the analytical one. One way of doing this is by the following strategy: if a point $P$ of $R_h'$ is on $C$ put $U(P) = g(P)$; if $P$ is not on $C$ select the point $Q$ of $C$ on the grid lines through $P$ which is nearest to $P$ and then impose $U(P) = g(Q)$. Only the points of $R_h'$ not on $C$ need be considered. For them

$$|U(P) - u(P)| = |u(Q) - u(P)| \le M_1 d$$

where $d$ is the distance between $P$ and $Q$, and $M_1$ is a bound for the first partial derivatives of $u$. By construction $d \le h$ and so

$$\max_{R_h'}|U - u| \le M_1 h.$$

Thus, our rule does ensure that the numerical solution approaches the theoretical one as $h \to 0$. Observe, however, that the treatment of the boundary condition can introduce a larger error than that due to truncation.

The above analysis assumes that the solution to the difference equation is obtained exactly. In practice, it will not be because of round-off error. The effect of round-off, if only $l$ decimal places are retained in the computation of $U_{m,n}$, is to add a term on the right-hand side of (2.38) which possesses a bound of the form $10^{-l}M$. There is a corresponding addition in (2.39) and then (2.40) becomes

$$|U - u| \le Ma^2\left(\frac{h^2}{6} + \frac{10^{-l}}{h^2}\right) + \max_{R_h'}|U - u|.$$

The numerical answer now converges to the exact solution as the mesh size tends to zero only if the number $l$ of decimal places kept in the computation goes to infinity fast enough. The formula suggests that there is little point in choosing $h$ smaller than the value $h_0$ which gives the minimum of the right-hand side, i.e. $h_0^4 = 6 \times 10^{-l}$. Remark also that, for a fixed choice of $l$, the expected size $a^2/h^2$ of the round-off error is proportional to the number of difference equations or mesh points that have been employed.

The boundary condition adopted above is that of Dirichlet in which $u$ is specified on $C$. Often the Neumann boundary condition in which the normal derivative of $u$ is given on $C$ is relevant. More generally, a condition such as

$$\alpha(p)u(p) + \beta(p)\frac{\partial u(p)}{\partial n} = g(p) \tag{2.41}$$

may be imposed at each point $p$ of $C$. If $\beta \equiv 0$, $\alpha \ne 0$ this is the Dirichlet problem, and if $\alpha \equiv 0$, $\beta \ne 0$ it is the Neumann problem. The derivation of (2.40) is unaffected by the boundary condition but the magnitude of the last term depends upon how the boundary condition is implemented.

If the boundary has horizontal or vertical sides, a relatively simple technique may be adopted. For example, in Fig. 2.4 the normal derivative at $p$ can be approximated by the backward difference formula and (2.41) becomes

$$\alpha(p)U(p) + \beta(p)\{U(p) - U(q)\}/h = g(p).$$

It is probably more accurate to take into account that the partial differential equation is satisfied within the boundary. To do this introduce an extra node $s$ and use the central difference formula for the normal derivative. Then (2.41) gives

$$\alpha(p)U(p) + \beta(p)\{U(s) - U(q)\}/2h = g(p). \tag{2.42}$$

To remove the unknown $U(s)$ we treat $p$ as a regular point so that

$$U(s) + U(t) + U(q) + U(r) - 4U(p) = h^2 f(p). \tag{2.43}$$

Elimination of $U(s)$ from (2.42) and (2.43) gives

$$\{2h\alpha(p) + 4\beta(p)\}U(p) - \beta(p)\{2U(q) + U(r) + U(t)\} = 2hg(p) - h^2\beta(p)f(p) \tag{2.44}$$

as the equation to be applied at the node $p$.

Fig. 2.4. Vertical boundary.
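The elimination leading to (2.44) can be checked numerically. In the sketch below, `boundary_row` is a hypothetical helper name and all numerical values are arbitrary illustrative choices: $U(p)$ is obtained from (2.44), the fictitious value $U(s)$ is recovered from (2.42), and the residual of (2.43) is then evaluated.

```python
def boundary_row(alpha, beta, g, f, h):
    """Coefficients of (2.44) at a boundary node p on a vertical side.

    Returns (cp, cq, cr, ct, rhs) such that
    cp*U(p) + cq*U(q) + cr*U(r) + ct*U(t) = rhs.
    """
    cp = 2.0 * h * alpha + 4.0 * beta
    return cp, -2.0 * beta, -beta, -beta, 2.0 * h * g - h * h * beta * f

# Arbitrary illustrative data (assumed values, not taken from the text).
alpha, beta, g, f, h = 0.7, 1.3, 0.4, -2.0, 0.1
Uq, Ur, Ut = 0.9, 0.2, -0.5

cp, cq, cr, ct, rhs = boundary_row(alpha, beta, g, f, h)
Up = (rhs - cq * Uq - cr * Ur - ct * Ut) / cp      # U(p) from (2.44)
Us = Uq + 2.0 * h * (g - alpha * Up) / beta        # U(s) from (2.42)
residual = Us + Ut + Uq + Ur - 4.0 * Up - h * h * f  # left minus right of (2.43)
print(residual)
```

A residual at rounding level indicates that (2.44) is equivalent to imposing (2.42) and (2.43) together.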

Fig. 2.5. General curved boundary.

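When the boundary is curved, the situation is more awkward. A typical situation is indicated in Fig. 2.5 where the normal at $p$ intersects the grid line between $a$ and $b$ at $d$. In what follows a pair of letters with an overline, such as $\overline{pd}$, denotes the distance between the corresponding points. The normal derivative is approximated by the one-sided difference formula $\partial u(p)/\partial n = \{u(p) - u(d)\}/\overline{pd}$ and substitution in (2.41) then gives a relation between $u(p)$ and $u(d)$. The value of $u(d)$ is estimated by linear interpolation so that

$$u(d) = \{\overline{db}\,u(a) + \overline{ad}\,u(b)\}/\overline{ab}$$

and substitution in (2.41) permits $u(p)$ to be expressed in terms of $u(a)$ and $u(b)$. Finally, we apply the difference replacement for Laplace's operator with unequal mesh lengths (Exercise 11) at $b$ to obtain

$$\frac{2}{\overline{cp}}\left\{\frac{u(p)}{\overline{pb}} + \frac{u(c)}{h}\right\} + \frac{2}{\overline{aq}}\left\{\frac{u(q)}{\overline{bq}} + \frac{u(a)}{h}\right\} - \frac{2}{h}\left\{\frac{1}{\overline{pb}} + \frac{1}{\overline{bq}}\right\}u(b) = f(b).$$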

Inserting the derived value for $u(p)$ and a similar one for $u(q)$, we obtain a relation to be satisfied at $b$ by the values of $u$ at $a$, $b$, and $c$ respectively. In general, one can anticipate that less accuracy will occur for a given mesh size for normal derivative problems than for Dirichlet conditions since the representation of the normal derivative is somewhat inexact.

There is another point which has to be watched when the boundary condition is pure Neumann, i.e. $\alpha \equiv 0$ on $C$. No loss of generality is incurred by putting $\beta = 1$. Then $g$ must satisfy a consistency condition if $u$ is to exist, namely

$$\int_C g\,ds = \int_C \frac{\partial u}{\partial n}\,ds = \int_A \nabla^2 u\,dx\,dy = \int_A f\,dx\,dy. \tag{2.45}$$

The matrix of the associated finite difference scheme will be singular and the solution not unique. Even if $g$ is compatible with (2.45), the discrete analogue may not be complied with unless special steps are undertaken. If the discrete condition is not satisfied one can expect that a numerical solution will fail to meet the discrete boundary conditions.

An implicit assumption in the foregoing has been that $u$ possesses at least four bounded partial derivatives. If the boundary possesses a re-entrant corner this will not be true and first derivatives will become infinite as the corner is approached. There is some evidence (Fox 1962) that it pays to work with the nine-point difference formula (Exercise 12) in these circumstances rather than the five-point difference formula though the opposite opinion has also been supported (Duncan 1967).

The preceding ideas can be generalized in a straightforward manner to the elliptic equation

$$A\frac{\partial^2 u}{\partial x^2} + C\frac{\partial^2 u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu = f \tag{2.46}$$

where the functions $A$, $C$, and $F$ satisfy the conditions $A > 0$, $C > 0$, and $F \le 0$ throughout the domain under consideration. One can arrange that the discrete operator takes the form

$$a_1U_{m+1,n} + a_2U_{m,n+1} + a_3U_{m-1,n} + a_4U_{m,n-1} - a_0U_{m,n} \tag{2.47}$$

where $a_0, a_1, \ldots, a_4$ are all positive and

$$a_0 \ge a_1 + a_2 + a_3 + a_4.$$

The theory leading to (2.40) can be adapted to this case with very little modification if $A + C > a(|D| + |E|)$; if this inequality does not hold corresponding results may still be derived but the analysis is more complicated since more complex comparison functions $w$ are involved. If there is also a mixed derivative $B\,\partial^2u/\partial x\,\partial y$ in (2.46) but $AC > B^2$ so the equation is still elliptic, the term can then be removed in principle by a suitable change of coordinates. Should this be impracticable one has to accept that the discrete operator will lose the simple form (2.47) and its desirable properties.

An interesting feature of difference equations is the way in which they reflect whether boundary or initial conditions have been correctly posed. Suppose, for instance, the discrete analogue of Laplace's equation is taken in the form

$$(U_{m+1,n} - 2U_{m,n} + U_{m-1,n})/h^2 + (U_{m,n+1} - 2U_{m,n} + U_{m,n-1})/k^2 = 0 \tag{2.48}$$

where $h$ and $k$ are the increments in mesh in the $x$ and $y$ directions respectively. Set up an initial value problem by specifying, say, $U_{m,0}$ and $U_{m,1}$. By representing the initial data as a Fourier series and concentrating on a single term we are seeking a solution of (2.48) in which $U_{m,0} = e^{imb}$ where $b$ is some known real number. Try as a solution of (2.48)

$$U_{m,n} = \exp\{i(mb + nc)\}. \tag{2.49}$$

Then $b$ and $c$ must satisfy

$$\sin^2\tfrac{1}{2}c + \frac{k^2}{h^2}\sin^2\tfrac{1}{2}b = 0. \tag{2.50}$$

It is evident at once that $c$ cannot be real. The roots must, in fact, occur in conjugate complex pairs and therefore at least one of them has a negative imaginary part. The resulting solution (2.49) of the difference equations grows exponentially as $n \to \infty$, but at a fixed station $n \to \infty$ as $k \to 0$ and so $U_{m,n}$ can take arbitrarily large values even for very small initial data. Consequently, $U_{m,n}$ is unstable in its dependence on initial values.

Although the instability is due to trying to solve an initial value problem for an elliptic equation, the device by which it was detected can be applied to other problems. Suppose that the partial differential equation was the one-dimensional wave equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial y^2} \tag{2.51}$$

instead of Laplace's equation. Then, (2.48) will be appropriate so long as the sign of the second set of parentheses is altered. In place of (2.50)

$$\sin^2\tfrac{1}{2}c = \frac{k^2}{h^2}\sin^2\tfrac{1}{2}b$$

is obtained. If $k/h > 1$ there are $b$ such that the right-hand side is greater than unity and again instability occurs despite the problem being correctly posed. If, however, $k/h \le 1$ all the values of $c$ are real and (2.49) remains bounded as $n \to \infty$. This is an illustration of numerical instability depending on the mesh size and demonstrates that to hope for stability and convergence of the numerical solution to the theoretical solution as $h \to 0$ one must insist that at least $k/h \le 1$. In fact, this is not sufficient, for when $k = h$ the difference equation has a solution which displays instability though not exponential growth. Nevertheless, the notions described form the basis of the von Neumann criterion, which is a necessary condition for numerical stability, in which all exponential solutions of the difference equation are examined and the system is declared unstable unless every possible solution remains bounded.

The von Neumann criterion can be expressed in matrix form. Suppose that we have started from a vector differential equation so that we are concerned with a vector difference $\mathbf{U}_{m,n}$ and suppose further that it has been arranged that in the difference scheme $\mathbf{U}_{m,n+1}$ can be expressed in a linear combination of $\mathbf{U}_{m+r,n}$ with $r$ taking a certain finite number of integer values. Specifically, we assume

$$\mathbf{U}_{m,n+1} = \sum_r A_r(h, k)\,\mathbf{U}_{m+r,n}$$

the $A_r$ being matrices in general. Make the substitution

$$\mathbf{U}_{m,n} = \mathbf{v}_n e^{imb}$$

where $\mathbf{v}_n$ is independent of $m$. Then

$$\mathbf{v}_{n+1} = G\mathbf{v}_n$$

where

$$G = \sum_r A_r(h, k)\,e^{ibr}.$$

Since $\mathbf{v}_n = G^n\mathbf{v}_0$, $G$ is called the amplification matrix. Obviously, if the solution is to remain bounded, $G^n$ must not be allowed to grow. In general, only points in some finite domain will be under consideration. There, take the following as the stability requirement: for some positive $k_0$ there is a constant $C$ such that

$$\|G^n\| \le C$$

for $0 < k < k_0$, $0 \le nk \le K$ and all real $b$.
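For the scalar wave-equation example discussed above, the exponential solutions can be examined numerically. The sketch below assumes the substitution $\xi = e^{ic}$ in (2.49), which turns the wave-equation form of (2.48) into the quadratic $\xi^2 - (2 - 4r^2\sin^2\tfrac12 b)\xi + 1 = 0$ with $r = k/h$; the function name and the sampling of $b$ are illustrative choices. The final lines check one assumed example, $U_{m,n} = (-1)^{m+n}n$, of a solution for $k = h$ that grows without being exponential; it is offered only as an illustration and is not claimed to be the solution quoted in the text.

```python
import cmath
import math

def max_growth(r, samples=721):
    """Largest modulus of xi = e^{ic} over real b for mesh ratio r = k/h."""
    worst = 0.0
    for j in range(samples):
        b = 2.0 * math.pi * j / (samples - 1)
        p = 2.0 - 4.0 * (r * math.sin(0.5 * b)) ** 2
        disc = cmath.sqrt(p * p - 4.0)
        # The two roots of xi^2 - p*xi + 1 = 0.
        worst = max(worst, abs(0.5 * (p + disc)), abs(0.5 * (p - disc)))
    return worst

growth_stable = max_growth(0.9)     # k/h < 1: every root has modulus 1
growth_unstable = max_growth(1.1)   # k/h > 1: some root has modulus above 1
print(growth_stable, growth_unstable)

# With k = h the scheme reduces to
# U[m][n+1] + U[m][n-1] = U[m+1][n] + U[m-1][n].
U = lambda m, n: (-1) ** (m + n) * n   # assumed illustrative solution
residuals = [U(m, n + 1) + U(m, n - 1) - U(m + 1, n) - U(m - 1, n)
             for m in range(-3, 4) for n in range(1, 6)]
worst_residual = max(abs(x) for x in residuals)
print(worst_residual)
```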

The spectral radius $\rho(G)$ (§1.11) has the property $\{\rho(G)\}^n \le \|G^n\|$ so that, on taking $n = K/k$, a necessary condition for stability is that $\rho(G) \le C^{k/K}$. There is no loss of generality in assuming $C \ge 1$; then $C^{k/K} \le 1 + (k/k_0)C^{k_0/K}$ for $0 < k < k_0$. Hence we have the von Neumann necessary condition of stability: there must be a constant $C_1$ such that the modulus of every eigenvalue of the amplification matrix does not exceed $1 + C_1k$ for $0 < k < k_0$ and all real $b$.

A sufficient condition for stability is that for some $M$

$$\|G\| \le 1 + Mk$$

for $0 < k < k_0$. For then $\|G\|^n \le e^{Mnk} \le e^{MK}$ and so $\|G^n\|$ is bounded. If $G$ is a normal matrix, i.e. $G^HG = GG^H$, then $\rho(G) = \|G\|$ and we see that the von Neumann condition is both necessary and sufficient for stability. This includes the cases when $G$ is Hermitian or unitary. The general investigation of necessary and sufficient conditions is involved and culminates in the Kreiss-Buchanan theorem, details of which will be found elsewhere (Richtmyer and Morton 1967).

As an illustration, consider again the wave equation (2.51) but this time put $v = \partial u/\partial y$ and $w = \partial u/\partial x$ so that the system

$$\partial v/\partial y = \partial w/\partial x, \qquad \partial v/\partial x = \partial w/\partial y$$

is acquired. Replace this system by the difference equations (see also Exercise 19)

$$V_{m,n+1} - V_{m,n} = k(W_{m+1,n} - W_{m-1,n})/2h, \tag{2.52}$$

$$W_{m-1,n+1} - W_{m-1,n} = k(V_{m,n+1} - V_{m-2,n+1})/2h. \tag{2.53}$$

Make the substitution $V_{m,n} = V_ne^{imb}$, $W_{m,n} = W_ne^{imb}$ and then, after slight manipulation, the equation

$$\begin{pmatrix} V_{n+1} \\ W_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & ic \\ ic & 1 - c^2 \end{pmatrix}\begin{pmatrix} V_n \\ W_n \end{pmatrix} = G\begin{pmatrix} V_n \\ W_n \end{pmatrix}$$


where $c = (k/h)\sin b$ arises. The characteristic equation of $G$ is

$$\lambda^2 - \lambda(2 - c^2) + 1 = 0.$$

If $c^2 \le 4$, both roots of this equation have absolute value 1. Therefore, if we hold $k/h$ constant as $k \to 0$, the von Neumann necessary condition requires that $k/2h \le 1$. If $c^2 > 4$ the von Neumann condition cannot be met with $k/h$ kept constant. Since $GG^H \ne G^HG$, $G$ is not normal and it cannot be asserted that the von Neumann condition is also sufficient. In fact, instability occurs when $k = 2h$. An alternative more stable difference scheme is furnished by

$$V_{m,n+1} - V_{m,n} = k\{W_{m+1,n} - W_{m-1,n} + W_{m+1,n+1} - W_{m-1,n+1}\}/4h,$$

$$W_{m-1,n+1} - W_{m-1,n} = k\{V_{m,n+1} - V_{m-2,n+1} + V_{m,n} - V_{m-2,n}\}/4h.$$

In this case

$$\left(1 + \tfrac{1}{4}c^2\right)G = \begin{pmatrix} 1 - \tfrac{1}{4}c^2 & ic \\ ic & 1 - \tfrac{1}{4}c^2 \end{pmatrix}.$$
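The two amplification matrices can be compared numerically. This is only a sketch: the helper names are invented for the illustration, and the eigenvalues of each 2 × 2 matrix are taken from its characteristic equation.

```python
import cmath

def eig2(a, b, c, d):
    """Eigenvalues of the 2x2 matrix [[a, b], [c, d]] from its characteristic equation."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return 0.5 * (tr + root), 0.5 * (tr - root)

def G_explicit(c):
    """Entries (row by row) of the amplification matrix of (2.52)-(2.53)."""
    return 1.0, 1j * c, 1j * c, 1.0 - c * c

def G_averaged(c):
    """Entries (row by row) of the amplification matrix of the alternative scheme."""
    s = 1.0 + 0.25 * c * c
    t = 1.0 - 0.25 * c * c
    return t / s, 1j * c / s, 1j * c / s, t / s

def spectral_radius(G):
    return max(abs(z) for z in eig2(*G))

for c in (0.5, 1.9, 2.5, 10.0):
    print(c, spectral_radius(G_explicit(c)), spectral_radius(G_averaged(c)))
```

For $c^2 \le 4$ both schemes should give a spectral radius of 1 to rounding error, while for $c^2 > 4$ only the second one does.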

Now $G$ is a unitary matrix and both its eigenvalues have absolute value 1. The von Neumann condition is both necessary and sufficient and it is met without any restriction on the mesh size. The system is unconditionally stable.

The stability criteria may be applied when the coefficients are variable by introducing new constant coefficients equal to the values of the original ones frozen at some particular point of interest and investigating the modified problem.

Finally, there is an interpretation of Laplace's difference equations via random walks which is worth remarking. Suppose that a particle moves at random in such a fashion that when it leaves a point of $R_h$ the probability of it stepping to each one of the adjacent mesh points is $\tfrac{1}{4}$. In addition, the particle comes to rest whenever it reaches a point of $R_h'$. Let $P_{m,n}$ be the probability that the path of the particle starting at the grid point $(mh, nh)$ terminates at one of the set $S$ of points of $R_h'$. Clearly, $P_{m,n} = 1$ when $(mh, nh)$ lies in $S$, whereas $P_{m,n} = 0$ when $(mh, nh)$ is located elsewhere in $R_h'$. If $(mh, nh)$ is in $R_h$, the probability is the sum of the probabilities of travelling to any adjacent mesh point followed by a path terminating from there. Hence

$$P_{m,n} = \tfrac{1}{4}(P_{m+1,n} + P_{m,n+1} + P_{m-1,n} + P_{m,n-1}).$$

Consequently, $P_{m,n}$ solves Laplace's difference equation subject to taking the value 1 on $S$ and 0 on $R_h' - S$. On this basis, a statistical procedure, known as the Monte Carlo method, may be devised for calculating the solution of a Dirichlet problem.
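A minimal sketch of such a Monte Carlo estimate follows, assuming a unit square with $h = \tfrac14$ and taking $S$ to be the nodes on the side $y = 1$. The function name, the seed, and the number of walks are illustrative choices. By symmetry the exact probability at the centre of the square is $\tfrac14$, which gives a simple check.

```python
import random

def walk_probability(start, n, in_S, walks, rng):
    """Estimate P at `start` by random walks on an (n+1) x (n+1) square grid.

    Interior nodes play the role of R_h; nodes on the sides are absorbing (R_h').
    `in_S(i, j)` says whether an absorbing node belongs to the set S.
    """
    hits = 0
    for _ in range(walks):
        i, j = start
        while 0 < i < n and 0 < j < n:
            # Each of the four neighbours is chosen with probability 1/4.
            di, dj = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
            i, j = i + di, j + dj
        hits += in_S(i, j)
    return hits / walks

rng = random.Random(12345)          # fixed seed so the run is repeatable
n = 4                               # h = 1/4 on the unit square
top = lambda i, j: j == n           # S: nodes on the side y = 1
estimate = walk_probability((2, 2), n, top, 20000, rng)
print(estimate)
```

The statistical error decreases only like the inverse square root of the number of walks, so the approach is best suited to finding the solution at a few isolated points.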

Exercises

t

13. Set up difference equations with h = for solving Laplace's equation on the unit square with u = 1 on y = 1 and u = 0 on the other three sides.


14. In a calculation of the solution of (2.35) the computed value $U'_{m,n}$ agrees exactly with $U_{m,n}$ on $R_h'$ but is found not to satisfy (2.35) exactly because $L(U') = \delta_{m,n}$. Show that $|U' - U|$ does not exceed a constant multiple of $a^2\delta/h^2$, where $\delta = \max|\delta_{m,n}|$ and $a$ is the radius of a circle enclosing $R_h + R_h'$. Deduce that, if in a computation a numerical solution is accepted if the residuals $\delta_{m,n}$ are less than a specified quantity, this quantity should be diminished as the mesh size is decreased.

15. The function $u$ satisfies Laplace's equation on $-2 < x < 2$, $-1 < y < 1$. Is it unique if (a) $\partial u/\partial n = x$ on the boundary, (b) $\partial u/\partial n = 0$ except for the side $y = -1$ where $u = 1$?

16. Obtain an approximate solution of Laplace's equation on 0 < x < 2, 0 < y < 2 subject to u(x,O) = 1, iJu(O, y)/ox = 1 and u = 0 on the rest of the boundary by taking h = i. 17. Find an approximate solution of Laplace's equation on x 2 + y2 < 1, y > 0 subject to u = 0 on y = 0, - 1 ~ x ~ 1 and ou/iJn = x on the remainder of the boundary by taking h = !. 18. The function u satisfies

au au

p ou

ox

x ox

2

2

-+ 2 ----=0 iJy2

P being a positive constant, and u is specified on the boundary which lies entirely in x > o. Show that on Ria + R~ lu - VI ~ !Mah 2J2(1 + tal +

max Rh

Iu - VI

where M is a bound for the partial derivatives of u. Hint: Consider w

= -hM{a 2 -

(x - XO)2 - (y - YO)2}

+ iM(1 + ta)(J2a + x - xo -

y

+ Yo).

19. If the right-hand side of (2.53) is replaced by $k(V_{m,n} - V_{m-2,n})/2h$ show that stability requires $k/h^2$ to be bounded as $k \to 0$, which is undesirably demanding.

20. In the discrete analogue of

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial t^2}$$

a formula of type (2.31) is used for each derivative, with $k$ for increments in $t$ and $h$ for increments in both $x$ and $y$. Prove that stability requires $h \ge k\sqrt{2}$.

2.5 TEM modes

The fields in TEM modes are expressed entirely in terms of a solution $\psi$ of Laplace's equation which takes constant values on the boundary. Therefore the Laplace difference equation of the preceding section is especially relevant. If, for example, the boundary of the cross-section of the guide consists of two distinct portions as in the coaxial line, put $\psi = 0$ on one connected part and $\psi = 1$ on the other. The difference scheme will then supply numerical values for $\psi$. The field components may then be deduced from the formula for a TEM mode but note, however, that they may be less accurate than $\psi$ because numerical derivatives are involved in determining them.

The only matter which merits further attention is how the numerical solution of the difference equations is to be accomplished. The most common practice is to carry it out iteratively using the successive over-relaxation (SOR) method described in §1.13 because the matrices for difference equations are sparse and usually large. The practical implementation of SOR entails the determination of the parameter $\omega$ which occurs. Theorem 1.13a shows that $\omega$ must be limited to the range $0 < \omega < 2$. In Exercise 64 of §1.13 an optimal choice of $\omega$ is given when the matrix $A$ of the linear system of equations is tridiagonal. Actually, the relation given there between the eigenvalues $\lambda$ and $\mu$ of the Jacobi and SOR methods is valid for a wider class of matrices, though the explicit formula for the optimal $\omega$ holds only if all $\lambda$ are real. The wider class consists of matrices which are suitably sparse and whose non-zero elements follow a certain pattern. Choose a number of rows from $A$ and call them $S_1$ rows; the corresponding columns are called $S_1$ columns. Term the remaining rows and corresponding columns $S_2$ rows and $S_2$ columns respectively. There will, of course, be no $S_2$ rows if $S_1$ includes all the rows of $A$. Then we have the following.

YOUNG'S PROPERTY A. A matrix has Property A if every non-zero off-diagonal element lies either in an $S_1$ row and an $S_2$ column or an $S_2$ row and an $S_1$ column.
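Read as a statement about which unknowns are coupled, the definition asks for a split of the indices into two sets such that every non-zero off-diagonal element couples an index from one set with an index from the other. The sketch below tests this by two-colouring the coupling graph; it assumes a structurally symmetric matrix, and the function names and the 3 × 3 block of mesh points are illustrative choices.

```python
from collections import deque

def has_property_A(neighbours):
    """Return True if the indices can be split into S1 and S2 as in Property A.

    `neighbours[i]` lists the columns j != i holding a non-zero entry in row i
    (the pattern is assumed symmetric).
    """
    colour = {}
    for start in neighbours:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if j not in colour:
                    colour[j] = 1 - colour[i]   # put j in the opposite set
                    queue.append(j)
                elif colour[j] == colour[i]:
                    return False                # two coupled indices in one set
    return True

def stencil_graph(n, offsets):
    """Couplings between an n x n block of mesh points for a given stencil."""
    nodes = [(m, k) for m in range(n) for k in range(n)]
    return {p: [(p[0] + a, p[1] + b) for a, b in offsets
                if 0 <= p[0] + a < n and 0 <= p[1] + b < n] for p in nodes}

five = [(1, 0), (-1, 0), (0, 1), (0, -1)]
nine = five + [(1, 1), (1, -1), (-1, 1), (-1, -1)]
five_point_ok = has_property_A(stencil_graph(3, five))
nine_point_ok = has_property_A(stencil_graph(3, nine))
print(five_point_ok, nine_point_ok)
```

For the five-point stencil the split is the familiar one by parity of $m + n$; the diagonal couplings of a nine-point stencil create triangles of mutually coupled nodes, which rule out any such split.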

If there are no $S_2$ rows then a matrix with Property A is diagonal. It is advisable to know conditions under which it can be affirmed that, for a matrix with Property A, the optimal formula

$$\omega_0 = 2\{1 + (1 - \lambda_1^2)^{1/2}\}^{-1} \tag{2.54}$$

is correct. Now, a matrix with Property A can be converted by permutations of the rows and columns into a consistently ordered matrix whose properties will be described later. Then, if $A$ is a positive definite consistently ordered matrix, the formula (2.54) can be confirmed. Despite this being a sufficient but not necessary condition it is highly relevant to difference methods. Broadly speaking, one can say that, given a selection of difference equations for the same partial differential equation, it is wise to restrict oneself to those which have Property A and preferable to pick one which is positive definite when consistently ordered.

The question crops up of how to arrange that a matrix is consistently ordered. In other words, if the $U_{mn}$ are components of $x$, which $U_{mn}$ is to be identified with a particular $x_i$? For five-point difference schemes the natural ordering, in which one goes along a mesh line from left to right and then moves vertically one increment and repeats the process, achieves consistent ordering. Thus, for nine mesh points arranged on a square of side 3, $x_1 = U_{11}$, $x_2 = U_{12}$, $x_3 = U_{13}$, $x_4 = U_{21}$, $x_5 = U_{22}$, .... Another possibility is to take $U_{mn}$ before $U_{pq}$ if $m + n < p + q$, which orders by diagonals. Yet another is to place all $U_{mn}$ with $m + n$ even ahead of those with $m + n$ odd.

After organizing the difference scheme suitably, there still remains the topic of deriving a pragmatic estimate for $\lambda_1$ in (2.54). One device is to select a value of $\omega$, believed to be less than $\omega_0$, and compute

$$\delta_r = \|x_{r+2} - x_{r+1}\|_\infty / \|x_{r+1} - x_r\|_\infty$$

where $x_r$ is an iterate of the vector $x$ above with components $U_{mn}$. Then, as in Theorem 1.14, $\delta_r$ should converge to the spectral radius of the SOR iteration matrix for this $\omega$ and so, when $\delta_r$ has stabilized at $\delta_0$ (say), we make the estimate

$$\lambda_1 = (\delta_0 + \omega - 1)/\omega\delta_0^{1/2}.$$

With the value of $\omega_0$ so gained the process is repeated. If $\omega$ has been chosen too large initially, it will be evident from the oscillation of $\delta_r$. Practical schemes based on this idea have been reported by a number of authors (Kulsrud 1961; Carre 1961; Reid 1966). Another version uses the $l_2$ norm in the definition of $\delta_r$ and changes to $\omega_0$ only when $\omega$ and $\omega_0$ differ significantly.

It may happen that the difference scheme $Ax = b$ can be partitioned so that it can be manipulated into an equivalent system

$$x^{(1)} = A_1x^{(2)} + b_1, \qquad x^{(2)} = A_2x^{(1)} + b_2.$$

There is then the possibility of using SOR with parameter $\omega$ for the first equation of the system and $\omega'$ for the second. The method, known as modified SOR, might be expected to give improved convergence. For details and for information about the symmetric SOR, which has better characteristics than SOR in some circumstances, the reader is referred elsewhere (Young 1971).

SOR is simple for the computer to handle and it is quite feasible to contemplate having 10 000 mesh points on the cross-section of the waveguide. Some advantage accrues from commencing with a relatively coarse mesh. Then the mesh length is halved and the field values already found used as a starting approximation. The mesh halving can be repeated until a specified accuracy is attained. Usually, this gives quicker results than adopting the final mesh at the beginning and it automatically arrives at the mesh size for the desired accuracy. By incorporating it in the computer program complete calculations can be carried out in a minute or two at most of computer time. The rate of convergence is controlled to some extent by the boundary conditions (discussed in more detail in the next section) and, generally speaking, the rate is poorer the more curved the boundary.

In the Dirichlet problem the mesh points carrying boundary values end up on the right-hand side of the equations and the matrix $A$ is symmetric. For the Neumann problem, however, $A$ is slightly non-symmetric because of the grid points near the boundary. For this reason it may be beneficial to construct the difference scheme via a variational formula (Chap. 4) since then the resulting matrix is symmetric.
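The procedure for estimating $\lambda_1$ and $\omega_0$ described above can be sketched as follows for the Laplace difference equation on a unit square with $n$ mesh intervals per side. The boundary data, the trial value $\omega = 1.2$, and the sweep count are assumptions made for the illustration. For this simple geometry the largest Jacobi eigenvalue is known to be $\cos(\pi/n)$, which provides a check on the estimate.

```python
import math

def sor_sweep(U, n, omega):
    """One SOR sweep in the natural ordering; returns the max-norm of the change."""
    change = 0.0
    for i in range(1, n):
        for j in range(1, n):
            gs = 0.25 * (U[i + 1][j] + U[i - 1][j] + U[i][j + 1] + U[i][j - 1])
            new = U[i][j] + omega * (gs - U[i][j])
            change = max(change, abs(new - U[i][j]))
            U[i][j] = new
    return change

def estimate_omega0(n, omega, sweeps=120):
    # Illustrative data: u = 1 on the side y = 1 and u = 0 on the other sides.
    U = [[1.0 if j == n and 0 < i < n else 0.0 for j in range(n + 1)]
         for i in range(n + 1)]
    previous = sor_sweep(U, n, omega)
    delta = 0.0
    for _ in range(sweeps):
        current = sor_sweep(U, n, omega)
        delta = current / previous          # ratio of successive changes
        previous = current
    lam1 = (delta + omega - 1.0) / (omega * math.sqrt(delta))
    omega0 = 2.0 / (1.0 + math.sqrt(1.0 - lam1 * lam1))   # formula (2.54)
    return lam1, omega0

lam1, omega0 = estimate_omega0(10, 1.2)
print(lam1, omega0)
```

In a production program the stabilization of the ratio would be tested rather than fixed in advance, and the sweep would continue with the new $\omega_0$.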


Exercise 21. Find the TEM mode of a waveguide the boundary of whose cross-section consists of (a) two rectangles with the same centre and parallel sides, (b) the same as (a) except that one side of the inner rectangle is zero so that it is a strip, (c) a rectangle containing two circles symmetrically placed with their line of centres parallel to the longer side of the rectangle, and (d) two circles, one containing the other, but not necessarily concentric.

2.6 The dominant mode

To unearth the dominant TE mode requires the solution of

$$(\nabla_t^2 + \mu_1^2)\psi_1 = 0 \tag{2.55}$$

subject to $\partial\psi_1/\partial n = 0$ on $C$. The field components will, in addition, require numerical derivatives when numerical values for $\psi_1$ have been ascertained. To avoid this process it has been suggested (Harrington 1968) that (2.55) should be replaced by the system

$$\partial\psi_1/\partial x = -\mu_1u, \qquad \partial\psi_1/\partial y = -\mu_1v, \qquad \partial u/\partial x + \partial v/\partial y = \mu_1\psi_1$$

so that the fields are evaluated to the same accuracy as $\psi_1$. However, the matrices are now tripled in size and, although some benefit may be drawn in faster convergence because the equations are first order instead of second order, it is not at all evident that improvement occurs for the general problem. Accordingly, we shall concentrate on (2.55).

The problem is very similar to that of the preceding section with the extra feature that $\mu_1$ has to be found. A way of performing this which has been successful in practice (Davies and Muilwyk 1966) will now be described. Let the difference replacement of (2.55) be

$$(A - \lambda)\Psi = 0.$$

If we can find an eigenvalue $\lambda$ we expect, if the mesh is fine enough, that it will be related to a corresponding eigenvalue of the continuous equation by $\lambda = \mu_1^2h^2$. First a coarse mesh is superimposed on the cross-section of the guide. The eigenvalues of the resulting $A$ are computed. One should be zero, as has been seen in §2.2, but is not wanted, so, if that one is ignored, the smallest eigenvalue is chosen as an approximation to $\lambda$; denote it by $\lambda^{(0)}$. Then solve

$$A\Psi^{(1)} = \lambda^{(0)}\Psi^{(0)}$$

by the technique of the foregoing section starting from any convenient mesh. The approximation $\Psi^{(0)}$ may be chosen to suit one's taste, e.g. with components assuming the values 1 and -1 on alternate grid points (putting the components all equal to 1 is inadvisable since the constant eigenfunction is to be rejected). Having found $\Psi^{(1)}$ a new value of $\lambda$ is prepared from

$$\lambda^{(1)} = \Psi^{(1)T}A\Psi^{(1)}/\Psi^{(1)T}\Psi^{(1)}. \tag{2.56}$$


Next, the approximation $\Psi^{(2)}$ is determined from $A\Psi^{(2)} = \lambda^{(1)}\Psi^{(1)}$ and an iteration procedure has been set up. It may be expected to have good qualities because of the maximum principle established in §2.4 (cf. Exercise 14) and the properties of the Rayleigh quotient (2.56) because $A$ is positive semi-definite (§1.10). Acceleration in the convergence of $\lambda^{(k)}$ from three consecutive values by Aitken's $\delta^2$ method (§§1.8 and 1.14) should be considered; it will produce substantial improvement if the error is proportional to a constant power of $h$. To guarantee that the procedure stays away from the constant eigenfunction the weighted average of $\Psi$ can be subtracted from each value of $\Psi$ after each coverage of the mesh, thereby complying with (2.24).

So far little has been said about the approximation to the boundary. The requisite action can be undertaken by the computer and essentially consists of replacing the boundary by a polygon all of whose sides are horizontal, or vertical, or inclined at 45° to the horizontal. On each horizontal mesh line the mesh points nearest to the boundary and consistent with the above are chosen as the points of the polygonal perimeter (Fig. 2.6). Since the polygon is neither wholly inside nor wholly outside, but varies in a somewhat random fashion, the effects of the perturbations from the true boundary should tend to cancel out especially as the mesh is refined. Greater accuracy will, of course, be obtained if the curved boundary is retained and the procedure of §2.4 adopted but at the expense of some increase in complexity. It may well be best to leave this possibility on one side until preliminary results from the polygon are available.

Fig. 2.6. Polygonal approximation of boundary.

The method can also be applied to calculate the first TM mode. Thus the cut-off frequencies (easily determined from $\mu_1$ and $\nu_1$) of both types of mode are found. Also the mode impedances $Z_0(1 - \mu_1^2/k^2)^{-1/2}$ and $Z_0(1 - \nu_1^2/k^2)^{1/2}$, where $Z_0 = (\mu_0/\varepsilon_0)^{1/2}$, follow without difficulty. However, the method has certain deficiencies when it comes to higher-order modes as we shall see in the next section.
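A minimal sketch of the iteration, applied to the first TM mode of a square of side $d$, is given below. The boundary condition $\Psi = 0$ makes $A$ positive definite, so no constant eigenfunction has to be removed. Gauss-Seidel sweeps stand in for SOR, and the mesh, the starting vector, the first guess $\lambda = 1$, and the iteration counts are all assumptions made for the illustration. The exact value for the square is $d\nu_1 = \pi\sqrt{2} \approx 4.4429$.

```python
import math

def apply_A(P, n):
    """A*Psi for the five-point scheme with Psi = 0 on the boundary (TM case)."""
    return [[4.0 * P[i][j] - P[i + 1][j] - P[i - 1][j] - P[i][j + 1] - P[i][j - 1]
             if 0 < i < n and 0 < j < n else 0.0
             for j in range(n + 1)] for i in range(n + 1)]

def dot(P, Q):
    return sum(p * q for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def solve_A(rhs, n, sweeps=300):
    """Gauss-Seidel solution of A*Psi = rhs (a simple stand-in for SOR)."""
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                P[i][j] = 0.25 * (rhs[i][j] + P[i + 1][j] + P[i - 1][j]
                                  + P[i][j + 1] + P[i][j - 1])
    return P

def first_tm_eigenvalue(n, iterations=30):
    # Starting vector: 1 at every interior node; lam = 1 is an arbitrary first guess.
    P = [[1.0 if 0 < i < n and 0 < j < n else 0.0 for j in range(n + 1)]
         for i in range(n + 1)]
    lam = 1.0
    for _ in range(iterations):
        rhs = [[lam * v for v in row] for row in P]
        P = solve_A(rhs, n)                       # A Psi_new = lam * Psi_old
        lam = dot(P, apply_A(P, n)) / dot(P, P)   # Rayleigh quotient (2.56)
        norm = math.sqrt(dot(P, P))
        P = [[v / norm for v in row] for row in P]
    return lam

n = 8                                  # h = d/n on a square of side d
lam = first_tm_eigenvalue(n)
d_nu1 = n * math.sqrt(lam)             # d * nu_1, from lam = nu_1^2 h^2
print(lam, d_nu1)
```

On this coarse mesh the computed $d\nu_1$ falls slightly below the exact value; refining the mesh, or extrapolating in $h$, would close the gap.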

Exercises 22. Find the cut-off frequency of the first TM mode by the preceding method in (a) a rectangle of sides d, 4d/9, (b) a square of side d, (c) a circle of diameter d, and (d) an ellipse with axes d and id. (The theoretical answers are dV 1 = 7.7352, 4.4429, 4.8100, and 7.5543 respectively.) 23. Find the dominant mode of a circular waveguide of diameter d (dILl = 3.6820). 24. A ridge waveguide has boundaries y = !d, - td ~ x ~ td; x = ±!d, 0 ~ y ~ !d; x = ±id, 0 ~ y ~ ~d; y = !d, -!d ~ x ~ !d; y = 0, -td ~ x ~ -id and id ~ x ~ id. Find the dominant mode. (Experimentally dILl ~ 2.25.) 25. A lunar waveguide consists of a circle of radius !d containing a circle of radius 0.286d with the centres displaced so that a piece of metal of length 0.055d along the line of centres joins the two circles. Show that dILl is about 1.95 in the fundamental mode.

2.7 Higher modes

A first attempt to find the second or higher mode might be to start from a reasonable approximation to it and go through the procedure of the preceding section. It can be anticipated that such an attempt would be doomed to failure, since the numerical process would almost certainly produce an element of the first mode which would become accentuated as the iteration advanced, because (2.56) provides a bound only for the first eigenvalue. To circumvent this, the orthogonal properties of the modes can be invoked. Suppose that the first numerical mode $\Psi_1$ has been determined and suppose that the vector $\Psi^{(r)}$ for the second mode has been reached after a relaxation sweep across the guide. Then pick b so that $\Psi_1^T(\Psi^{(r)} - b\Psi_1) = 0$ and use $\Psi^{(r)} - b\Psi_1$ for the next stage of the iteration. If this is done after each sweep the vector should be kept orthogonal to $\Psi_1$ and (2.56) should approach the relevant eigenvalue. The method has been implemented (Pontoppidan 1969) but needs quite a bit of extra computer time and storage. As an alternative (Beaubien and Wexler 1968; Silvester 1970) rewrite the governing partial differential equation for $\psi_n$ as

$$(\nabla_t^2 + \mu_{n-1}^2)\psi_n = (\mu_{n-1}^2 - \mu_n^2)\psi_n.$$

The positive definite character of the operator has now been lost, and this renders impotent most of the methods for determining the optimal parameter $\omega$ in SOR. But a further application of the operator on the left gives

$$(\nabla_t^2 + \mu_{n-1}^2)^2\psi_n = (\mu_{n-1}^2 - \mu_n^2)^2\psi_n \qquad (2.57)$$

and the property of positive semi-definiteness has been regained.
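The loss and recovery of definiteness can be seen numerically in a small sketch. Everything specific here is an assumption made for illustration and is not taken from the text: the tridiagonal matrix `T` is a one-dimensional stand-in for $-\nabla_t^2$ with unit mesh length, and `shift` is an arbitrary value playing the part of $\mu_{n-1}^2$.

```python
import numpy as np

n = 8
# One-dimensional stand-in for -d^2/dx^2 with unit mesh length (assumption)
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
t = np.linalg.eigvalsh(T)                  # eigenvalues in ascending order

shift = t[1] + 0.1 * (t[2] - t[1])         # arbitrary shift just above the second eigenvalue
S = shift * np.eye(n) - T                  # analogue of the shifted operator: indefinite
S2 = S @ S                                 # squared operator, as on the left of (2.57)

eig_S = np.linalg.eigvalsh(S)              # has both signs
eig_S2 = np.linalg.eigvalsh(S2)            # all non-negative: (shift - t_k)^2
nearest = int(np.argmin((shift - t) ** 2)) # mode belonging to the smallest eigenvalue of S2
```

The squared matrix has eigenvalues `(shift - t)**2`, so its smallest eigenvalue belongs to whichever mode lies closest to the shift, which is why a good starting estimate matters for the iteration.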


Before pondering difference replacements for (2.57), we must examine whether there are other solutions to (2.57) besides $\psi_n$. Since (2.57) can be expressed as

$$(\nabla_t^2 + \mu_n^2)(\nabla_t^2 + 2\mu_{n-1}^2 - \mu_n^2)\chi = 0$$

we have whence where

$$\chi = -\frac{1}{2\mu_n^2}\left(x\frac{\partial\psi_n}{\partial x} + y\frac{\partial\psi_n}{\partial y}\right) + \chi_1.$$

If $\mu_{n-1} \neq \mu_n$ and $\chi$ satisfies the same boundary conditions as the sought function $\psi_n$ then so must $\chi_1$; then $2\mu_{n-1}^2 - \mu_n^2$ would have to be an eigenvalue. So, precluding the case where $2\mu_{n-1}^2 - \mu_n^2$ is an eigenvalue, one would hope that only $\psi_n$ is generated. If $\mu_{n-1} = \mu_n$, i.e. the eigenvalue is degenerate with at least two possible distinct modal fields, the situation is more complicated. Even though $\chi_1$ may now be a multiple of $\psi_n$ there is no guarantee that different starting approximations for different modal structures will remain disentangled as the iteration proceeds. Indeed, they may tend to the same final result. In that case, recourse must be had to the more complex techniques of Chapter 1.

Assuming that no degeneracy is involved we may, after resorting to a coarse mesh for an initial $\lambda$ and eigenvector, adopt the iteration of §2.6 provided that we have the difference formula for A which now emanates from the left-hand side of (2.57). The new ingredient is $\nabla_t^4$, the biharmonic operator. A difference replacement can be realized by applying the Laplacian twice. When the five-point Laplacian is employed a 13-point formula is supplied, while the nine-point Laplacian originates a 25-point formula. Unfortunately neither of these formulae possesses Young's Property A, which has been seen in §2.5 to be a desirable concomitant for SOR purposes. Providentially, 17-point biharmonic difference formulae are known (Tee 1963) which do have Young's Property A and whose local truncation error is $O(h^2)$. They are

$$\nabla_t^4 U \approx \frac{1}{12h^4}\{U_{m+3,n} + U_{m,n+3} + U_{m-3,n} + U_{m,n-3} + 3(U_{m+2,n+1} + U_{m+1,n+2} + U_{m-1,n+2} + U_{m-2,n+1} + U_{m-2,n-1} + U_{m-1,n-2} + U_{m+1,n-2} + U_{m+2,n-1}) - 39(U_{m,n+1} + U_{m-1,n} + U_{m,n-1} + U_{m+1,n}) + 128U_{m,n}\}, \qquad (2.58)$$

$$\nabla_t^4 U \approx \frac{1}{108h^4}\{3(U_{m+3,n+2} + U_{m+2,n+3} + U_{m-2,n+3} + U_{m-3,n+2} + U_{m-3,n-2} + U_{m-2,n-3} + U_{m+2,n-3} + U_{m+3,n-2}) + 11(U_{m+3,n} + U_{m,n+3} + U_{m-3,n} + U_{m,n-3}) - 177(U_{m,n+1} + U_{m-1,n} + U_{m,n-1} + U_{m+1,n}) + 640U_{m,n}\}. \qquad (2.59)$$

Of these, (2.59) has greater diagonal dominance than (2.58) and therefore will be preferred in general. If (2.59) is used in conjunction with the five-point Laplacian for the term $2\mu_{n-1}^2\nabla_t^2$ on the left-hand side of (2.57), the appropriate difference formula is prepared. Provided that good starting values are available it should supply higher modes so long as degeneracies do not arise, but then all methods appear to experience difficulty.
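The two 17-point formulae can be checked numerically on a polynomial whose biharmonic is known exactly. The sketch below is hedged: the weights follow (2.58) and (2.59) as read here, with the four axial points at distance 3h in (2.58) given weight 1 and the divisors taken as $12h^4$ and $108h^4$; those particular values are assumptions, fixed by requiring that the weights sum to zero and that quartic polynomials are reproduced exactly. All function names are invented for the illustration.

```python
def axial(a):
    """The four points (+-a, 0), (0, +-a)."""
    return [(a, 0), (-a, 0), (0, a), (0, -a)]


def knight(a, b):
    """The eight points (+-a, +-b), (+-b, +-a)."""
    pts = []
    for sx in (1, -1):
        for sy in (1, -1):
            pts += [(sx * a, sy * b), (sx * b, sy * a)]
    return pts


def build(terms):
    """Map each mesh offset to its weight."""
    stencil = {}
    for weight, points in terms:
        for p in points:
            stencil[p] = weight
    return stencil


S258 = build([(1, axial(3)), (3, knight(2, 1)), (-39, axial(1)), (128, [(0, 0)])])
S259 = build([(3, knight(3, 2)), (11, axial(3)), (-177, axial(1)), (640, [(0, 0)])])


def apply_stencil(stencil, scale, f, x=0.3, y=-0.2, h=0.1):
    """Weighted sum of f over the stencil, divided by scale * h^4."""
    total = sum(w * f(x + i * h, y + j * h) for (i, j), w in stencil.items())
    return total / (scale * h ** 4)


def dominance(stencil):
    """Ratio of the central weight to the sum of the other absolute weights."""
    off = sum(abs(w) for p, w in stencil.items() if p != (0, 0))
    return stencil[(0, 0)] / off


def quartic(x, y):
    # Biharmonic of this polynomial is 24 + 2*8 + 3*24 = 112 everywhere
    return x**4 + 2 * x**2 * y**2 + 3 * y**4 + x**3 * y - 5 * x**2 + y


value_258 = apply_stencil(S258, 12, quartic)
value_259 = apply_stencil(S259, 108, quartic)
```

Both values should agree with 112 up to rounding, and `dominance(S259)` exceeds `dominance(S258)`, in line with the remark that (2.59) has the greater diagonal dominance.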

Exercise 26. Find the first four modes of each type in the waveguides of Exercises 22-25.

2.8 Direct methods

So far attention has been concentrated on the iterative procedure of SOR. There are other techniques of iteration such as ADI (alternating direction implicit), but since we now wish to discuss a direct method we refer the reader elsewhere for details (Mitchell 1969; Mitchell and Griffiths 1980). The basic idea is to take advantage of the fast algorithms which are available for computing Fourier series (see §2.14). If the cross-section is a rectangle this can be accomplished in a satisfactory manner (Hockney 1965, 1970; Buneman 1969) and such methods are currently the fastest available. For other cross-sections one tries to approximate by rectangles and then, by a suitable matrix decomposition, transforms the equations into a style for which Fourier methods are applicable. The method was first described by Buzbee and co-workers (Buzbee et al. 1970; Buzbee et al. 1971) but has been substantially enlarged and extended (Nokes et al. 1974; Nokes 1974).

To fix ideas, consider the problem of solving Poisson's equation in a rectangle with sides a and b parallel to the x and y axes respectively. Superimpose a rectangular mesh, the sides of the rectangle being mesh lines. The mesh lines are equally spaced at a distance h along the x axis but the vertical spacing may be variable, being $k_j$ between rows j and j + 1 (Fig. 2.7). Let there be m small rectangles along a horizontal row and n along a vertical column. At a mesh point in the interior of the rectangle the five-point difference equation is (Exercise 11)

$$U_{i+1,j} + U_{i-1,j} - \varepsilon_j U_{i,j} + \delta_j U_{i,j+1} + \gamma_j U_{i,j-1} = h^2 f_{i,j} \qquad (2.60)$$

where

$$\gamma_j = \frac{2h^2}{k_{j-1}(k_{j-1} + k_j)}, \qquad \delta_j = \frac{2h^2}{k_j(k_{j-1} + k_j)}, \qquad \varepsilon_j = 2 + \frac{2h^2}{k_{j-1}k_j}$$

for j = 1, ..., n - 1. For the Dirichlet problem the values of $U_{i,j}$ at mesh points on the boundary are known and may be transferred to the right-hand side. If these are assumed to be absorbed in the $f_{i,j}$ then we are confronted by the problem of solving (2.60) conditional on U vanishing on the boundary. This suggests an expansion in a Fourier sine series, say

$$U_{i,j} = \sum_{k=1}^{m} U_j(k)\sin(\pi k i/m).$$

If the right-hand side of (2.60) is similarly expanded (cf. §1.7), then equating the coefficients of each Fourier harmonic leads to

$$\delta_j U_{j+1}(k) + \{2\cos(\pi k/m) - \varepsilon_j\}U_j(k) + \gamma_j U_{j-1}(k) = h^2 f_j(k).$$

Fig. 2.7. Rectangular mesh for Fourier method.

Thus for each Fourier mode k, a tridiagonal matrix equation for the unknown $U_j$ is obtained. Each tridiagonal matrix equation may be solved in turn (see Chapter 1) and the cyclic reduction methods of Hockney and Buneman can be considered. However, we wish to take a somewhat different point of view. Return to the original system (2.60) and express it in matrix form. Adopt the natural ordering of mesh points so that the unknown vector x has components $x_1 = U_{1,1}$, $x_2 = U_{2,1}$, ..., $x_m = U_{1,2}$, .... The matrix equation is

$$Ax = b$$

where x is a vector with (m - 1)(n - 1) components and A is a square matrix of the same order. The matrix A can be partitioned into (m - 1) x (m - 1) submatrices by grouping together all the equations from a single mesh row. The result is

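that A has block tridiagonal form, namely

$$A = \begin{pmatrix} L_1 & M_1 & & & \\ N_2 & L_2 & M_2 & & \\ & \ddots & \ddots & \ddots & \\ & & N_{n-2} & L_{n-2} & M_{n-2} \\ & & & N_{n-1} & L_{n-1} \end{pmatrix} \qquad (2.61)$$

where the (m - 1) x (m - 1) matrices $L_i$, $M_i$, $N_i$ are defined by

$$L_i = \begin{pmatrix} -\varepsilon_i & 1 & & \\ 1 & -\varepsilon_i & 1 & \\ & \ddots & \ddots & \ddots \\ & & 1 & -\varepsilon_i \end{pmatrix}, \qquad M_i = \delta_i I_{m-1}, \qquad N_i = \gamma_i I_{m-1}, \qquad (2.62)$$

$I_{m-1}$ being the unit matrix of order (m - 1) x (m - 1). The array for A contains (n - 1) x (n - 1) blocks but all of them except the L, M, and N are zero matrices of order (m - 1) x (m - 1). Now $L_i$, being symmetric, can be reduced to diagonal form by a similarity transformation. In fact, if $\lambda_{ij}$ is the jth eigenvalue of $L_i$ we have (Exercise 57 of Chapter 1)

$$\lambda_{ij} = -\varepsilon_i + 2\cos(j\pi/m).$$

If the matrix Q is such that $Q^{-1}L_iQ$ is in diagonal form the kjth component of Q is given by

$$Q_{kj} = (2/m)^{1/2}\sin(\pi kj/m) \qquad (2.63)$$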

and $Q^{-1}$ has the same elements (§1.7). It will be noted that Q does not involve $\varepsilon_i$ so that only a single Q is required for all the $L_i$. Also $Q^{-1}M_iQ = M_i$ and $Q^{-1}N_iQ = N_i$. Partition x into n - 1 vectors $x_1, x_2, \ldots, x_{n-1}$ each with m - 1 elements and partition b similarly. Put $x_j = QX_j$ and $b_j = QB_j$. Then the matrix equation for the $X_j$ will be the same as that for the $x_j$ except that each $L_i$ will be replaced by the diagonal matrix similar to it and the right-hand side will have $B_j$ instead of $b_j$. Consequently, define new vectors $\hat{X}_j$ and $\hat{B}_j$ so that

$$(\hat{X}_j)_k = (X_k)_j, \qquad (\hat{B}_j)_k = (B_k)_j$$

where $(X_j)_k$ signifies the kth element of the vector $X_j$. Then

$$A_j\hat{X}_j = \hat{B}_j \qquad (2.64)$$

where

$$A_j = \begin{pmatrix} \lambda_{1j} & \delta_1 & & & \\ \gamma_2 & \lambda_{2j} & \delta_2 & & \\ & \ddots & \ddots & \ddots & \\ & & \gamma_{m-2} & \lambda_{m-2,j} & \delta_{m-2} \\ & & & \gamma_{m-1} & \lambda_{m-1,j} \end{pmatrix}.$$

Thus three steps occur:

(i) form $B_j = Q^{-1}b_j$;
(ii) solve the tridiagonal system (2.64);
(iii) form $x_j = QX_j$.

The solution of the tridiagonal system can be carried out by any convenient algorithm. One based on Gaussian elimination, written for the reordered vectors $\hat{X}_j$ and $\hat{B}_j$ of (2.64), is as follows:

(a) put $u_1 = 1/\lambda_{1j}$ and then calculate recursively $u_r = (\lambda_{rj} - \gamma_r\delta_{r-1}u_{r-1})^{-1}$ for r = 2, ..., m - 1;
(b) put $v_1 = (\hat{B}_j)_1$ and then form $v_r = (\hat{B}_j)_r - \gamma_r u_{r-1}v_{r-1}$ for r = 2, ..., m - 1;
(c) set $(\hat{X}_j)_{m-1} = u_{m-1}v_{m-1}$ and then calculate

$$(\hat{X}_j)_{m-1-r} = u_{m-1-r}\{v_{m-1-r} - \delta_{m-1-r}(\hat{X}_j)_{m-r}\}$$

recursively for r = 1, ..., m - 2.

On account of (2.63) the steps (i) and (iii), in fact, involve the summation of Fourier sine series and, in effect, recover the expansions made earlier. The advantage of the latest approach, however, is that it can be generalized in a relatively straightforward fashion, for whenever A has the block tridiagonal structure (2.61) one can follow the same procedure even if $L_i$, $M_i$, and $N_i$ do not have the explicit form (2.62), provided that a single matrix Q can be found which makes them all diagonal in a similarity transformation. Suppose, for example, that the Dirichlet condition on the top side is changed to a Neumann condition in which $\partial U/\partial n$ is given. The Neumann condition is implemented in the form analogous to (2.44), namely that

$$U_{i+1,n} + U_{i-1,n} - \varepsilon_n U_{i,n} + \gamma_n U_{i,n-1}$$

is given where $\gamma_n = 2h^2/k_{n-1}^2$ and $\varepsilon_n = 2 + 2h^2/k_{n-1}^2$, which boils down to the standard interior difference operator applied at a mesh point on the boundary with the value at the point outside equal to that at its reflection in the boundary. There are now n(m - 1) unknowns so that A is now an n x n block array, but otherwise its structure is precisely the same as (2.61) and no change in Q is necessary to operate the procedure. Similarly, if a Neumann condition is imposed on the right-hand side while Dirichlet conditions apply on the other three sides, there are (n - 1)m unknowns. A is still an (n - 1) x (n - 1) block array though the submatrices $L_i$, $M_i$, and $N_i$ are m x m but, apart from that, the only change is that the 1 in the lowest row of $L_i$ becomes 2. In this case

$$\lambda_{ij} = -\varepsilon_i + 2\cos\{(j - \tfrac{1}{2})\pi/m\}$$

and, although $L_i$ is no longer symmetric, Q exists with

$$Q_{kj} = (2/m)^{1/2}\sin\{k(j - \tfrac{1}{2})\pi/m\},$$

$$(Q^{-1})_{kj} = (2/m)^{1/2}\sin\{j(k - \tfrac{1}{2})\pi/m\} \quad (j \neq m), \qquad (Q^{-1})_{km} = \tfrac{1}{2}(2/m)^{1/2}\sin\{(k - \tfrac{1}{2})\pi\}.$$
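Returning to the Dirichlet rectangle, steps (i)-(iii) together with the elimination (a)-(c) can be sketched in Python as follows. This is a minimal illustration and not the published implementation: the function names `thomas` and `solve_poisson_rect`, the array layout (one row of `f` per mesh row), and the use of a dense matrix Q in place of a fast Fourier transform are all assumptions made here, and the difference equation is taken in the form $U_{i+1,j} + U_{i-1,j} - \varepsilon_j U_{i,j} + \delta_j U_{i,j+1} + \gamma_j U_{i,j-1} = h^2 f_{i,j}$ with zero boundary values.

```python
import numpy as np


def thomas(sub, diag, sup, rhs):
    """Tridiagonal solve by the elimination (a)-(c).

    sub[r] multiplies x[r-1] and sup[r] multiplies x[r+1] in row r.
    """
    n = len(diag)
    u = np.empty(n)
    v = np.empty(n)
    x = np.empty(n)
    u[0] = 1.0 / diag[0]
    v[0] = rhs[0]
    for r in range(1, n):                       # steps (a) and (b)
        u[r] = 1.0 / (diag[r] - sub[r] * sup[r - 1] * u[r - 1])
        v[r] = rhs[r] - sub[r] * u[r - 1] * v[r - 1]
    x[n - 1] = u[n - 1] * v[n - 1]              # step (c)
    for r in range(n - 2, -1, -1):
        x[r] = u[r] * (v[r] - sup[r] * x[r + 1])
    return x


def solve_poisson_rect(f, h, k):
    """Solve the five-point system on a rectangle with U = 0 on the boundary.

    f : array of shape (n-1, m-1), interior right-hand side, row j = mesh row
    h : horizontal mesh length
    k : array of the n vertical spacings k_0, ..., k_{n-1}
    """
    ncol = f.shape[1]
    m = ncol + 1
    km, kp = k[:-1], k[1:]                      # k_{j-1} and k_j for each interior row
    gamma = 2 * h * h / (km * (km + kp))
    delta = 2 * h * h / (kp * (km + kp))
    eps = 2 + 2 * h * h / (km * kp)
    idx = np.arange(1, m)
    Q = np.sqrt(2.0 / m) * np.sin(np.pi * np.outer(idx, idx) / m)  # (2.63), Q^{-1} = Q
    B = (h * h * f) @ Q                         # step (i), applied to every row at once
    X = np.empty_like(B)
    for p in range(ncol):                       # step (ii): one tridiagonal system per mode
        lam = -eps + 2 * np.cos((p + 1) * np.pi / m)
        X[:, p] = thomas(gamma, lam, delta, B[:, p])
    return X @ Q                                # step (iii)
```

Replacing the two products with `Q` by a fast sine transform would give the speed the text refers to; the dense products are kept here only so that the sketch depends on nothing beyond NumPy.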

Again, the procedure may be followed through. These are typical problems for which fast Fourier techniques are relevant so long as we choose the denominators in the arguments of the Fourier series to be powers of 2.

Let us now consider the Dirichlet problem for the rather more complicated boundary of Fig. 2.8. Continue to draw the mesh so that each side of the boundary lies along a grid line. The mesh points are placed in their natural order but with those on the interface (shown broken) between the two rectangles omitted. Let the non-interface grid points form the vector $x_1$ of order M, say, and let the interface mesh points in their natural order make the vector $x_2$ of order N, say. Then the unknowns can be represented by the partitioned vector

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

and the difference equations may now be written as

$$\begin{pmatrix} A & B \\ B^T & D \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}. \qquad (2.65)$$

The matrix A, coming from the interior points of the rectangles, is of order M x M and has the same block structure that has been described above. The matrix B, of order M x N, represents the influence of interface points on adjacent interior points, and therefore consists mainly of zeros. The matrix D, of order N x N, is drawn from the coefficients for the interface points in the difference equations which hold on the interface.


Fig. 2.8. Composite region for Fourier method.


The aim is to extract the A so that the fast Fourier transform method may be applied to it. Assume A is non-singular. Then there is a matrix E and vector y such that

$$AE = B, \qquad Ay = b_1.$$

Consequently,

$$x_1 = y - Ex_2$$

and

$$Cx_2 = b_2 - B^Ty \qquad (2.66)$$

where $C = D - B^TE$. Therefore (2.66) is first solved for $x_2$ and then $x_1$ is determined from $Ax_1 = b_1 - Bx_2$. The success of this strategy depends upon C, known as the capacitance matrix, which is independent of $b_1$ and $b_2$, being solely a function of the differential equation and the geometry. Once it is known it does not need to be recalculated when repeatedly solving the same differential equation in the same region. Usually it can be computed without too great an effort or too large a demand on storage. Equation (2.66) can be solved by triangular decomposition (§1.12). The calculation of C and its triangular factors is often known as

preprocessing. Next, turn to the topic of what happens when part of the boundary is not rectilinear, as when a corner in the upper rectangle in Fig. 2.8 is torn off (see Fig. 2.9). The artifice now is to use the same rectangular mesh as in Fig. 2.8 and introduce, as new points to be taken into account in the difference formula, the intersections of the mesh lines with the curved boundary. Set up the usual equations for the region employing, as necessary, the formula for arbitrary spacing (Exercise 11) for interior points near the curvilinear boundary. Do the same for the additional region between the arch and the broken straight lines, prescribing U on the broken lines in any convenient fashion. These last equations solve a Poisson problem outside the domain of interest and are therefore strictly unnecessary. However, by adding them to the first set we are able to preserve the rectangular structure and that is more important than solving two Poisson problems simultaneously, one of which is superfluous. Actually, the rectangular configuration has not quite been conserved because of the irregular spacing formula adopted near the curved portion of the boundary. To overcome this some artificial variables are added. It will fix ideas

Fig. 2.9. Non-rectilinear boundary.


if a simple example is studied. Suppose that the shape of a regular equation is $U_1 + U_2 = f_1$ and of the irregular is $\alpha U_1 + \beta U_2 = f_1$. The latter can be shifted into the guise of the former by inserting a new variable $U_3$ and writing

$$U_1 + U_2 = f_1 - U_3, \qquad U_3 = (\alpha - 1)U_1 + (\beta - 1)U_2.$$

Thus an additional equation is involved and the artificial variable also contributes to the right-hand side of the regular equations. Hence, in our problem, sufficient artificial variables (and their attendant extra equations) are added to render all the irregular equations regular. Denote the artificial variables by the vector $x_3$. Then, the system is

$$\begin{pmatrix} A & B & F \\ B^T & D & 0 \\ G & H & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}. \qquad (2.67)$$

The matrix F represents the addition of the artificial variables to the standard equations whereas G, H, and $b_3$ are concerned with the extra equations required, with the artificial variables on the right-hand side, i.e. in $b_3$. Now define new matrices by

$$B_1 = (B \quad F), \qquad B_2 = \begin{pmatrix} B^T \\ G \end{pmatrix}, \qquad D_1 = \begin{pmatrix} D & 0 \\ H & 0 \end{pmatrix}$$

and new vectors by

$$z = \begin{pmatrix} x_2 \\ x_3 \end{pmatrix}, \qquad c = \begin{pmatrix} b_2 \\ b_3 \end{pmatrix}.$$

Then (2.67) can be expressed as

$$\begin{pmatrix} A & B_1 \\ B_2 & D_1 \end{pmatrix}\begin{pmatrix} x_1 \\ z \end{pmatrix} = \begin{pmatrix} b_1 \\ c \end{pmatrix}$$

and the same path as traced for resolving (2.65) may be followed. Accordingly, solve first

$$Cz = c - B_2y \qquad (2.68)$$

where now $C = D_1 - B_2E_1$, $AE_1 = B_1$, and then find $x_1$ from $Ax_1 = b_1 - B_1z$. Notice that, in fact, the artificial variables are confined to (2.68). Again, the problem has been reduced to dealing with a capacitance matrix followed by fast Fourier transform techniques.

It is clear that, despite attention having been concentrated on the Dirichlet problem, other boundary conditions can be handled because of the generality of the procedure. There is, however, one exceptional case and that is when all the boundary conditions are of Neumann type, for then the matrix A is singular (cf. §2.4), contrary to the assumption made in solving (2.65). Nevertheless, it turns out that by suitable modification it is still possible to devise a similar procedure. For details the reader is referred to the papers already cited.
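The capacitance-matrix strategy for a partitioned system can be sketched on a small dense example. The sketch is illustrative only: the sizes, the random matrices, and the names `solve_A` and `solve_partitioned` are assumptions, and `np.linalg.solve` stands in for the fast Fourier solver that would be applied to A in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 6, 2                                   # hypothetical numbers of interior and interface points
A = rng.standard_normal((M, M))
A = A @ A.T + M * np.eye(M)                   # symmetric positive definite stand-in for A
B = rng.standard_normal((M, N))
D = 20.0 * np.eye(N)
b1 = rng.standard_normal(M)
b2 = rng.standard_normal(N)


def solve_A(rhs):
    """Stand-in for the fast solver of the rectangle problem."""
    return np.linalg.solve(A, rhs)


# Preprocessing: depends only on A, B, D, not on the right-hand sides
E = solve_A(B)                                # A E = B
C = D - B.T @ E                               # capacitance matrix


def solve_partitioned(b1, b2):
    y = solve_A(b1)                           # A y = b1
    x2 = np.linalg.solve(C, b2 - B.T @ y)     # capacitance equation for the interface unknowns
    x1 = solve_A(b1 - B @ x2)                 # back-substitute for the interior unknowns
    return x1, x2


x1, x2 = solve_partitioned(b1, b2)
```

Because `E` and `C` are formed once, each new right-hand side costs two applications of the fast solver plus one small N x N solve, which is the point of treating their calculation as preprocessing.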


Eigenvalues may also be tackled in this way. If the difference equation for Poisson's equation is Ax = b then the corresponding eigenvalue problem for Helmholtz's equation is $Ax = \lambda x$. Inverse iteration (§1.14) will then supply the desired answer via

$$(A - \lambda' I)x^{(r)} = x^{(r-1)} \qquad (2.69)$$

where $\lambda'$ is some approximation to the sought eigenvalue and the process is initiated by some $x^{(0)}$. Considerable saving in computer time in grappling with (2.69) is achieved by omitting the intermediate fast Fourier transform between successive iterations. For, according to (iii) following (2.64), an iteration will end by manufacturing $x_j^{(r)} = QX_j^{(r)}$ and the next stage will commence with (i), in which $X_j^{(r)} = Q^{-1}x_j^{(r)}$ is formed. Therefore $X_j$ found in step (ii) of one iteration can be carried straight over to step (ii) of the next iteration.

The relative merits of SOR and the direct method of this section are difficult to assess for general domains. Broadly speaking, it seems reasonable to expect that the more nearly a cross-section approximates to a union of rectangles the more advantageous the direct method is likely to be. For a highly crooked boundary it is not at all obvious which will have superiority.
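A minimal sketch of the inverse iteration (2.69) follows. It makes several assumptions not in the text: a small one-dimensional difference matrix is used as the test operator, a dense solve replaces the direct Fourier method, and the name `inverse_iteration`, the shift 0.3, and the starting vector are all chosen purely for illustration.

```python
import numpy as np


def inverse_iteration(A, shift, x0, iterations=30):
    """Repeatedly solve (A - shift*I) x_new = x_old and normalize."""
    n = A.shape[0]
    shifted = A - shift * np.eye(n)
    x = x0 / np.linalg.norm(x0)
    for _ in range(iterations):
        x = np.linalg.solve(shifted, x)
        x = x / np.linalg.norm(x)
    return x @ A @ x, x                        # Rayleigh quotient (A symmetric) and vector


n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
exact = 2.0 - 2.0 * np.cos(2.0 * np.pi / (n + 1))   # second eigenvalue of this test matrix
eigenvalue, vector = inverse_iteration(A, shift=0.3, x0=np.arange(1.0, n + 1))
```

The iteration converges to the eigenvalue nearest the shift, here the second one; in the setting of the text each solve would be carried out in the transformed variables so that the Fourier transforms between iterations can be skipped.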

Exercises

27. For a rectangular domain with the Neumann condition on the left side and the Dirichlet condition on the right side show that

$$\lambda_{ij} = -\varepsilon_i + 2\cos\{(j - \tfrac{1}{2})\pi/m\}$$

and that

$$Q_{kj} = (2/m)^{1/2}\sin\{\pi(m + 1 - k)(j - \tfrac{1}{2})/m\}, \qquad (Q^{-1})_{kj} = e_j(2/m)^{1/2}\sin\{\pi(m + 1 - j)(k - \tfrac{1}{2})/m\}$$

where $e_1 = \tfrac{1}{2}$ and $e_j = 1$ if $j \neq 1$. If the Dirichlet condition is replaced by a Neumann condition prove that the corresponding results are

$$\lambda_{ij} = -\varepsilon_i + 2\cos\{(j - 1)\pi/m\},$$

$$Q_{kj} = (Q^{-1})_{kj} = (2/m)^{1/2}f_j\cos\{(j - 1)(k - 1)\pi/m\}$$

where $f_1 = f_{m+1} = \tfrac{1}{2}$ and $f_j = 1$ if $j \neq 1$ or m + 1.

28. If periodic boundary conditions are applied to the rectangle, i.e. U takes the same values on the left and right sides, prove that

$$\lambda_{ij} = -\varepsilon_i + 2\cos\{2\pi(j - 1)/m\},$$

$$Q_{kj} = (2/m)^{1/2}g_j\cos\{2\pi(k - 1)(j - 1)/m\} \quad (1 \le j \le 1 + \tfrac{1}{2}m),$$

$$Q_{kj} = (2/m)^{1/2}\sin\{2\pi(j - \tfrac{1}{2}m - 1)(k - 1)/m\} \quad (1 + \tfrac{1}{2}m < j \le m)$$

where $g_1 = g_{1+m/2} = 1/\sqrt{2}$ and $g_j = 1$ if $j \neq 1$ or $1 + \tfrac{1}{2}m$. For $(Q^{-1})_{kj}$ interchange j and k on the right-hand side except in $g_j$.

29. Explain the detailed steps that would be necessary to undertake preprocessing for (2.67).

30. Calculate the number of operations required for preprocessing and also for the complete direct method.

31. Obtain the form of the matrix equations when the boundary condition is (2.41).

32. Find the eigenvalues of the first five TM modes in an L-shaped waveguide.

2.9 Other equations

Although the theory so far developed has been discussed in the context of two-dimensional equations there is no difficulty in principle in taking it over to problems in higher dimensions, e.g. Poisson's equation in a box. The storage and time requirements are greater but the principles are unaffected. Also there is no reason why the coefficients in the partial differential equation should not depend upon the coordinates, though the direct method effectively excludes dependence on x. Nor is the theory restricted to Cartesian coordinates, though the order in which to take the nodes of the mesh to supply desired matrix properties may not be quite so transparent. For example, the Laplacian with axial symmetry

$$\frac{\partial^2 U}{\partial z^2} + \frac{\partial^2 U}{\partial r^2} + \frac{1}{r}\frac{\partial U}{\partial r}$$

has the discrete analogue

$$\frac{1}{h^2}(U_{m+1,n} + U_{m-1,n}) + \frac{1}{k^2}\left(1 + \frac{k}{2r_n}\right)U_{m,n+1} + \frac{1}{k^2}\left(1 - \frac{k}{2r_n}\right)U_{m,n-1} - \frac{2(h^2 + k^2)}{h^2k^2}U_{m,n}$$

on a rectangular mesh, h and k being mesh lengths parallel to z and r respectively. There is nothing fresh to be said about SOR but, for the direct method, Fourier analysis would have to be parallel to the z axis. For two-dimensional polar coordinates where the Laplacian is

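$$\frac{\partial^2 U}{\partial r^2} + \frac{1}{r}\frac{\partial U}{\partial r} + \frac{1}{r^2}\frac{\partial^2 U}{\partial\theta^2}$$

a mesh of radial lines through the origin and concentric circles can be tried. The mesh length parallel to r can be selected as a constant k but that in the $\theta$ direction will depend on r, being $r_n\theta_0$ on $r = r_n$ where $\theta_0$ is the angle between the radial lines. A discrete analogue in this case is

$$\frac{1}{k^2}\left(1 + \frac{k}{2r_n}\right)U_{m,n+1} + \frac{1}{k^2}\left(1 - \frac{k}{2r_n}\right)U_{m,n-1} + \frac{1}{r_n^2\theta_0^2}(U_{m+1,n} + U_{m-1,n}) - 2\left(\frac{1}{k^2} + \frac{1}{r_n^2\theta_0^2}\right)U_{m,n}.$$

The Fourier analysis in the direct method is now along the $\theta$ direction, with periodic boundary conditions if the whole range $0 \le \theta \le 2\pi$ is involved. When the origin lies in the domain the appropriate formula there is

$$\frac{4}{r_1^2}\left\{\frac{1}{s}(U_{1,1} + U_{2,1} + \cdots + U_{s,1}) - U_{0,0}\right\}$$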

where s is the number of grid points on the innermost mesh circle.

2.10 Conformal mapping

The convenience of rectangular boundaries for numerical methods suggests that it may be worth while transforming the curved boundary of a waveguide into a rectangle. In fact, this method was attempted quite early on (Meinke, Lange, and Ruger 1963; Meinke and Baier 1966; see also Howe 1973). Define the complex variable $\zeta$ by $\zeta = x + iy$. Then the conformal mapping $w = F(\zeta)$ will convert the cross-section in the $\zeta$ plane into another one in the w plane. Further, eqn (2.26) becomes

$$\nabla_w^2\psi_m + \mu_m^2\psi_m/|F'|^2 = 0$$

where $\nabla_w^2$ is the Laplacian in the w plane and $F' = dF/d\zeta$. The equation has to be solved under the condition that the normal derivative of $\psi_m$ vanishes on the transformed boundary. Further, $\phi_m$ satisfies a similar equation with apposite boundary condition. If F is such that the new boundary is rectangular, the original problem of a waveguide of given shape containing a uniform isotropic medium has been converted to one for a rectangular guide filled with a uniaxial transversely inhomogeneous medium. When an explicit formula is available for F, the numerical procedures delineated earlier can be considered for deriving the solution. If the initial boundary is highly curved this may be beneficial since some of the errors associated with boundary fitting will be eliminated. If this is not so or a closed form is not to hand, the value of conformal mapping before starting numerical work is dubious. In particular, if F has to be calculated numerically it is fairly certain that the conformal mapping should be abandoned.

2.11 Waveguides containing dielectric

While the interior of a waveguide is often composed of a single homogeneous dielectric, such as air, this is not by any means always true. When, however, the cross-section consists of two or more separate media as in Fig. 2.10 the form of the modes is changed in general. Assume that the z dependence is still $e^{-i\kappa z}$. Then eqns (2.17)-(2.20) continue to hold between C and $C_1$. They are, in addition, valid inside $C_1$ provided that $\mu$, $\varepsilon$, and k are replaced by $\mu_1$, $\varepsilon_1$, and

$$k_1 = \omega(\mu_1\varepsilon_1)^{1/2}.$$

Fig. 2.10. A dielectric cylinder within a waveguide.

As far as TEM modes are concerned, they would require $\kappa^2 = k^2$ and $\kappa^2 = k_1^2$, which is possible only if $\mu\varepsilon = \mu_1\varepsilon_1$. In general this will not be true and so TEM modes can be ignored in waveguides containing two or more dielectric media. If, indeed, they are present, then Laplace's equation is satisfied throughout with extra boundary conditions on $C_1$. Since these are particular cases of those to be dealt with later, it is not difficult to devise the relevant numerical techniques (see, for example, Seeger 1968) and so we shall not consider TEM modes further.

The boundary conditions are that $E_z$ and $\partial H_z/\partial n$ are zero on C while the tangential components of E and H are continuous across $C_1$. This implies four conditions on $C_1$, namely that

must be continuous across $C_1$. First examine whether TE and TM modes are possible. If $E_z = 0$ then $H_z$ and $\{\kappa/(k^2 - \kappa^2)\}(\partial H_z/\partial s_1)$ must be continuous across $C_1$. Since $\partial H_z/\partial s_1$ is a tangential derivative this is possible only if $k = k_1$ or $\partial H_z/\partial s_1 = 0$. Therefore, apart from these special exceptions, no TE modes can exist. Similarly, no TM modes exist unless $k = k_1$ or $\partial E_z/\partial s_1 = 0$. Consequently, it can be safely accepted that in general the field will not split into TE and TM modes. As a result, two partial differential equations, coupled through the boundary conditions on $C_1$, have to be dealt with in each region simultaneously. The eigenvalue, effectively $\kappa$, appears in both partial differential equations as well as the boundary conditions on $C_1$.

In order to assure that the dimensions of the unknowns are the same it is convenient to put $\psi = H_z$ and $\phi = (\varepsilon_0/\mu_0)^{1/2}E_z$. If a square mesh is employed, then in one region both $\psi$ and $\phi$ satisfy

$$\psi_{m,n+1} + \psi_{m,n-1} + \psi_{m+1,n} + \psi_{m-1,n} - 4\psi_{m,n} + \nu\psi_{m,n} = 0$$

where $\nu = k^2 - \kappa^2$, and in the second region they satisfy the same equation with $\nu$ replaced by $\nu/\tau$ where $\tau = (k^2 - \kappa^2)/(k_1^2 - \kappa^2)$.

Fig. 2.11. Nodal points for a horizontal dielectric interface.

For a point on the dielectric interface consider the horizontal boundary shown in Fig. 2.11 for simplicity. Then $\psi(P) = \psi^{(1)}(P)$ and $\phi(P) = \phi^{(1)}(P)$ where the superscript (1) indicates values appropriate to the lower region. Also

$$(1 + \mu'\tau)\{\psi(E) + \psi(W)\} + 2\mu'\tau\psi^{(1)}(S) + 2\psi(N) - \kappa'(\tau - 1)\{\phi(E) - \phi(W)\} - 4(1 + \mu'\tau)\psi(P) + \nu(1 + \mu')\psi(P) = 0,$$

$$(1 + \varepsilon'\tau)\{\phi(E) + \phi(W)\} + 2\varepsilon'\tau\phi^{(1)}(S) + 2\phi(N) - \kappa''(1 - \tau)\{\psi(E) - \psi(W)\} - 4(1 + \varepsilon'\tau)\phi(P) + \nu(1 + \varepsilon')\phi(P) = 0$$

where $\kappa' = \kappa\sqrt{\mu_0}/(\omega\mu\sqrt{\varepsilon_0})$, $\kappa'' = \kappa\sqrt{\varepsilon_0}/(\omega\varepsilon\sqrt{\mu_0}) = \mu\varepsilon_0\kappa'/(\mu_0\varepsilon)$, $\mu' = \mu_1/\mu$ and $\varepsilon' = \varepsilon_1/\varepsilon$. Let the unknowns be ordered so that all $\phi$ except interface values are taken first, then all $\psi$ apart from interface values, and finally interface values are added. The matrix equation now takes the form

$$Ax = \nu D_2x \qquad (2.70)$$

when the dielectric has rectangular sides parallel to the walls of a rectangular guide. Although excellent methods (Peters and Wilkinson 1969) based on Sturm sequences (§1.14) are known for equations of the type (2.70) they are not necessary when $D_2$ is diagonal, for then we put $y = D_2^{1/2}x$ where $D_2^{1/2}$ is the square root of $D_2$ (Exercise 46, §1.10) and then (2.70) goes over to $D_2^{-1/2}AD_2^{-1/2}y = \nu y$. Further multiplication by diagonal matrices may be executed if modification to elements stemming from Neumann conditions is desired. The equation has now been cast into a shape in which the application of previous procedures can be considered.
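The reduction of a generalized eigenvalue problem with a diagonal right-hand matrix to a standard one can be illustrated briefly. The sketch assumes the form $Ax = \nu D_2x$ with a symmetric A and a positive diagonal $D_2$; the random matrices and all variable names are placeholders invented here, not quantities from the waveguide problem.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
A = A + A.T                               # symmetric stand-in for the difference matrix
d = rng.uniform(0.5, 2.0, n)              # diagonal entries of D_2, assumed positive

s = 1.0 / np.sqrt(d)                      # diagonal of D_2^{-1/2}
A_scaled = s[:, None] * A * s[None, :]    # D_2^{-1/2} A D_2^{-1/2}
nu, Y = np.linalg.eigh(A_scaled)          # standard symmetric eigenproblem for y
X = s[:, None] * Y                        # recover x = D_2^{-1/2} y

residual = A @ X - (d[:, None] * X) * nu[None, :]   # columns of A x - nu D_2 x
```

Since the scaling is diagonal, forming `A_scaled` costs only one multiplication per matrix entry and preserves both symmetry and sparsity, which is why no special generalized solver is needed in this case.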

Exercises

33. A rectangular guide of sides a and b (a > b) has permittivity $\varepsilon_1$ for $0 \le y \le d$ and permittivity $\varepsilon$ for $d \le y \le b$. If $\mu = \mu_1$, check whether the matrices above have Young's Property A.

34. A circular guide of radius a contains a concentric circle of dielectric of radius b. Find the first four modes.

Fig. 2.12. Microstrip transmission lines.

2.12 Microstrip transmission lines

The name microstrip is an abbreviation whose purpose is to describe a microwave circuit which is fabricated by printed-circuit techniques. The microstrip transmission line usually consists of a conducting ribbon or ribbons mounted on a dielectric substrate which is often backed by a large conducting plane. Some typical examples are shown in Fig. 2.12. Note that Figs. 2.12(a) and (b) are essentially equivalent. When the line is totally enclosed, as in Figs. 2.12(c) and (d), the line is often referred to as shielded microstrip. Their simplicity of manufacture carries obvious economic merit as well as advantages in size and reliability. Designing them so that reflections, loss, and spurious coupling are kept to a tolerable level is a far from trivial task. Even when the frequency is low enough (say at the bottom end of the gigahertz range) for the propagation to be regarded as near enough to TEM for a quasistatic approximation to be acceptable, theoretical analysis is difficult. In spite of that, first attempts were in this direction (Wheeler 1965; Stinehelfer 1968). Nevertheless, it was not long before the problems were tackled by means of difference equations and there are now several papers on the subject (Green 1965; Bryant and Weiss 1968; Whiting 1968; Silvester 1968; Cermak and Silvester 1968; Gelder 1970; Hornsby and Gopinath 1969; Corr and Davies 1972).

From our point of view the problems for shielded microstrip lines are natural extensions of those of the preceding section. There are now extra boundary conditions to be satisfied on the conducting strip(s) where $\phi$ must vanish. If these conditions are added on at the end of the other equations the structure of (2.70) is unaltered so that the numerical methods which are applicable there are also available here.

It has been suggested (Mittra and Itoh 1971) that it is profitable to tackle microstrip lines by a hybrid approach in which a certain amount of analytical preprocessing is carried out before any numerical work is tried. It could be anticipated that this would be so when a good portion of the problem can be resolved in an exact fashion by analysis and some numerical difficulties thereby removed. As an illustration consider the problem of Fig. 2.12(a), with a shielding conductor above, operating at a frequency where the quasistatic approximation is valid. Confining attention to fields which are symmetric about the centre line, we have to find a solution to Laplace's equation which complies with the boundary conditions depicted in Fig. 2.13.

Fig. 2.13. Quasistatic problem for the microstrip of Fig. 2.12(a).

The following Fourier series expansions for $\phi$ may then be assumed

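$$\phi = y/d + \sum_{n=1}^{\infty} a_n\sin(n\pi y/d)\cosh(n\pi x/d) \qquad (0 < x < l,\ 0 < y < d)$$

$$\phi = (b - y)/t + \sum_{n=1}^{\infty} b_n\sin\{n\pi(b - y)/t\}\cosh(n\pi x/t) \qquad (0 < x < l,\ d < y < b)$$

where t = b - d. Similarly in the remaining regions

$$\phi = \sum_{n=1}^{\infty} c_n\sin\alpha_ny\,\exp\{-\alpha_n(x - l)\} \qquad (x > l,\ 0 < y < d)$$

$$\phi = \sum_{n=1}^{\infty} c_n\frac{\sin\alpha_nd}{\sin\alpha_nt}\sin\alpha_n(b - y)\exp\{-\alpha_n(x - l)\} \qquad (x > l,\ d < y < b)$$

where the $\alpha_n$ are positive solutions of

$$\varepsilon_1\cos(\alpha_nd)\sin\alpha_nt + \varepsilon\sin\alpha_nd\cos\alpha_nt = 0$$

or

$$(\varepsilon_1 + \varepsilon)\sin\alpha_nb + (\varepsilon - \varepsilon_1)\sin\alpha_n(d - t) = 0. \qquad (2.71)$$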
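The roots $\alpha_n$ of (2.71) have to be found numerically in general. A minimal sketch, assuming only NumPy, scans for sign changes of the first form of (2.71) and refines each bracket by bisection. The function names, the scan step, and the sample dimensions and permittivities used afterwards are assumptions made for illustration, and the scan presumes that neighbouring roots are further apart than the step.

```python
import numpy as np


def dispersion(alpha, d, t, eps1, eps):
    """Left-hand side of the first form of (2.71)."""
    return (eps1 * np.cos(alpha * d) * np.sin(alpha * t)
            + eps * np.sin(alpha * d) * np.cos(alpha * t))


def positive_roots(d, t, eps1, eps, count, step=1e-3):
    """First `count` positive roots, by a sign-change scan plus bisection."""
    b = d + t
    h = step * np.pi / b                       # scan increment, small against pi/b
    roots = []
    lo, f_lo = h, dispersion(h, d, t, eps1, eps)
    while len(roots) < count:
        hi = lo + h
        f_hi = dispersion(hi, d, t, eps1, eps)
        if f_lo * f_hi < 0.0 or f_hi == 0.0:   # root bracketed in (lo, hi]
            a, c, fa = lo, hi, f_lo
            for _ in range(60):                # bisection to machine precision
                mid = 0.5 * (a + c)
                fm = dispersion(mid, d, t, eps1, eps)
                if fa * fm <= 0.0:
                    c = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + c))
        lo, f_lo = hi, f_hi
    return np.array(roots)


# Hypothetical geometry and permittivities, chosen only to exercise the routine
alphas = positive_roots(d=1.0, t=2.0, eps1=9.6, eps=1.0, count=5)
```

When the two permittivities are equal the equation reduces to $\sin\alpha_nb = 0$, so the routine should return $\alpha_n = n\pi/b$, which gives a convenient check.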

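It will be observed that these expressions automatically satisfy the boundary conditions on y = 0, d, b, x = 0 and behave properly as $x \to \infty$. The continuity of $\phi$ and $\partial\phi/\partial x$ on x = l has still to be guaranteed. This can be achieved by requiring that

$$\int_0^d \phi\sin\beta_ny\,dy, \qquad \int_d^b \phi\sin\gamma_n(b - y)\,dy,$$

and the corresponding quantities with $\partial\phi/\partial x$ in place of $\phi$ are the same as x approaches l from above and below, $\beta_n$ being $n\pi/d$ and $\gamma_n$ being $n\pi/t$. Hence

$$-\frac{1}{\beta_n} + \tfrac{1}{2}d(-1)^na_n\cosh\beta_nl = \sum_{m=1}^{\infty}\frac{C_m\beta_n}{\alpha_m^2 - \beta_n^2}, \qquad (2.72)$$

$$\tfrac{1}{2}d(-1)^na_n\sinh\beta_nl = -\sum_{m=1}^{\infty}\frac{C_m\alpha_m}{\alpha_m^2 - \beta_n^2}, \qquad (2.73)$$

$$-\frac{1}{\gamma_n} + \tfrac{1}{2}t(-1)^nb_n\cosh\gamma_nl = \sum_{m=1}^{\infty}\frac{C_m\gamma_n}{\alpha_m^2 - \gamma_n^2}, \qquad (2.74)$$

$$\tfrac{1}{2}t(-1)^nb_n\sinh\gamma_nl = -\sum_{m=1}^{\infty}\frac{C_m\alpha_m}{\alpha_m^2 - \gamma_n^2} \qquad (2.75)$$

where $C_m = c_m\sin\alpha_md$. Eliminating $a_n$ from (2.72) and (2.73), and $b_n$ from (2.74) and (2.75), we obtain

$$(\xi_n - 1)/\beta_n = \sum_{m=1}^{\infty}C_m\left(\frac{1}{\alpha_m - \beta_n} + \frac{\xi_n}{\alpha_m + \beta_n}\right), \qquad (2.76)$$

$$(\eta_n - 1)/\gamma_n = \sum_{m=1}^{\infty}C_m\left(\frac{1}{\alpha_m - \gamma_n} + \frac{\eta_n}{\alpha_m + \gamma_n}\right) \qquad (2.77)$$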

where $\delta_n = \exp(-2\beta_n l)$, $\eta_n = \exp(-2\gamma_n l)$. To tackle (2.76) and (2.77) let $z$ be a complex variable and consider the meromorphic function $f(z)$ defined by

$$f(z) = Kg(z)h(z) \tag{2.78}$$

where

$$g(z) = \exp\left(\frac{z}{\pi}\{d\ln(b/d) + t\ln(b/t)\}\right)\frac{1}{z}\prod_{m=1}^{\infty}\frac{(1 - z/\beta_m)\exp(zd/m\pi)\,(1 - z/\gamma_m)\exp(zt/m\pi)}{(1 - z/\alpha_m)\exp(zb/m\pi)},$$

$$h(z) = 1 + \sum_{m=1}^{\infty}\left(\frac{A_m}{1 - z/\beta_m} + \frac{B_m}{1 - z/\gamma_m}\right)$$

and $K$, $A_m$, and $B_m$ are constants to be determined. As $|z| \to \infty$ the dependence of $g$ on $z$ can be derived by supposing $m$ to be large so that $\alpha_m$ can be approximated by $m\pi/b$. Then, the formula

$$1/z! = e^{\gamma z}\prod_{m=1}^{\infty}\{(1 + z/m)\,e^{-z/m}\}, \tag{2.79}$$

where $\gamma$ is Euler's constant, implies that the asymptotic behaviour of the infinite product in $g$ is

$$\frac{(-zb/\pi)!}{(-zd/\pi)!\,(-zt/\pi)!}$$

apart from a multiplicative constant. Stirling's formula

$$z! \sim (2\pi)^{1/2}\exp\{(z + \tfrac{1}{2})\ln z - z\} \tag{2.80}$$

then supplies

$$|z|^{-1/2}\exp\left(\frac{z}{\pi}\{d\ln d + t\ln t - b\ln b\}\right)$$

unless $z$ is positive real. Consequently, $g(z) \sim |z|^{-3/2}$ as $|z| \to \infty$, except possibly on the real axis. It follows that $f$ has the same asymptotic behaviour provided that $h$ is bounded away from its poles. Also, if $C$ is a closed contour enclosing $z = \pm\beta_n$ and the origin,

$$\frac{1}{2\pi i}\oint_C\left(\frac{1}{z - \beta_n} + \frac{\delta_n}{z + \beta_n}\right)f(z)\,dz = \sum_{m=1} R(\alpha_m)\left(\frac{1}{\alpha_m - \beta_n} + \frac{\delta_n}{\alpha_m + \beta_n}\right) + R(0)(\delta_n - 1)/\beta_n + f(\beta_n) + \delta_n f(-\beta_n) \tag{2.81}$$

where $R(z_0)$ is the residue of $f(z)$ at $z = z_0$ and the upper limit of summation is governed by the number of poles $\alpha_m$ within $C$. If $C$ moves off to infinity so as to embrace the whole plane the left-hand side tends to zero because of the behaviour of $f$ at infinity. Therefore, if the constants at our disposal are elected to make

$$R(0) = -1, \qquad f(\beta_n) + \delta_n f(-\beta_n) = 0 \tag{2.82}$$

we recover (2.76) and can find $C_m$ from $C_m = R(\alpha_m)$. Similarly, (2.77) is coped with if

$$f(\gamma_n) + \eta_n f(-\gamma_n) = 0. \tag{2.83}$$

The problem has thus been converted into discovering $A_m$, $B_m$, and $K$ so that (2.82) and (2.83) are complied with. Now $\delta_n$ decreases exponentially as $n$ increases so that $f(\beta_n)$ and hence $A_n$ is exponentially small as $n \to \infty$. Similarly, $B_n$ decays exponentially; so, as $|z| \to \infty$, $h$ is bounded and $f \sim |z|^{-3/2}$ and a self-consistent scheme has been arrived at. Therefore, for a good approximation, it should be sufficient to ignore any $A_m$ or $B_m$ after the first 10 or so and just solve the first few of (2.82) and (2.83). It is evident that the advantage of this process lies in the rapid convergence of the series in $A_m$ and $B_m$ as compared with those for $C_m$ in (2.76) and (2.77). Remark also that replacing $\delta_n$ by 1 in (2.81) leads, via (2.73) and (2.82), to

$$(-)^n a_n = (2/d)\,f(-\beta_n)\exp(-\beta_n l).$$

As a consequence, the charge density on the lower surface of the strip $y = d$, $0 < x < l$ is

$$\varepsilon_1(\partial\phi/\partial y)_{y=d-0} = \varepsilon_1/d + (2\varepsilon_1/d)\sum_{m=1}^{\infty}\beta_m f(-\beta_m)\exp(-\beta_m l)\cosh\beta_m x.$$

When $x$ approaches $l$, the only singular part can come from large values of $m$ and for these values it is legitimate to put $\beta_m \approx m\pi/d$, $f(-\beta_m) \approx A/|\beta_m|^{3/2}$. Accordingly, the only singular part has the form

$$\sum_{m=1}^{\infty} m^{-1/2}\exp\{-m\pi(l - x)/d\} \approx \{d/(l - x)\}^{1/2}$$

as $x \to l - 0$. There is a similar result for the charge density on the upper surface of the strip. Thus, the structure of $f$ carries a warranty that the edge conditions (Jones 1986) on the metal strip are satisfied. The technique does not suffer from the singularity quandaries which can afflict difference methods. Numerically, it requires the solution of (2.71), which can be carried out by the devices set out in §1.8, and the solution of a relatively small number of simultaneous equations for $A_m$ and $B_m$. Against this must be posed the facts that, for each new configuration, another $f$ must be constructed (assuming a suitable one exists) and that the infinite products in $g$ must be computed. In practice, only a finite number of the brackets can be taken and some error will be incurred. If, in the omitted brackets, $n$ is large enough for $\alpha_n$ to be approximated by $n\pi/b$, an estimate of the error committed can be derived from (2.78). Of course, the numerator of $g$ in this particular problem can be expressed exactly in terms of the factorial function. The general structure has, however, been left since in many typical cases the $\beta_m$ and $\gamma_m$ are not such simple functions of $m$.
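The effect of retaining only a finite number of brackets can be gauged on the product (2.79) itself, for which the exact value is known. The sketch below is an illustration under stated assumptions rather than part of the method: it is restricted to real $z > -1$, the function names are hypothetical, and the numerical value of Euler's constant is supplied by hand. Since each omitted bracket contributes roughly $-z^2/2m^2$ to the logarithm, the relative error after $M$ brackets should be about $z^2/2M$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, the gamma of (2.79)


def reciprocal_factorial(z, brackets):
    """Truncation of (2.79) after `brackets` factors, for real z > -1."""
    log_product = EULER_GAMMA * z
    for m in range(1, brackets + 1):
        # log of one bracket: log(1 + z/m) - z/m
        log_product += math.log1p(z / m) - z / m
    return math.exp(log_product)


def relative_error(z, brackets):
    """Relative error of the truncated product against 1/z! = 1/Gamma(z + 1)."""
    exact = 1.0 / math.gamma(z + 1.0)
    return abs(reciprocal_factorial(z, brackets) - exact) / exact
```

The slow decay of the error, like $1/M$, suggests why an analytical estimate of the omitted brackets is preferable to simply taking more of them.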

Exercises

35. Analyse the microstrip lines of Fig. 2.12(c), (d) by difference methods.
36. Does it make much difference in Fig. 2.12(c) if the dielectric does not extend beyond the ends of the metal strips?
37. Obtain numerical values for the microstrip of Fig. 2.13 by the method described in detail in the text. By letting $b$ become large obtain an idea of the transmission in the line of Fig. 2.12(a).
38. In Fig. 2.13 the structure, instead of going off to infinity in the positive $x$ direction, is terminated at $x = \tfrac{1}{2}a$ so as to form half a boxed microstrip line. If the boundary condition on $x = \tfrac{1}{2}a$, $0 < y < b$, is $\phi = 0$ show that the resulting equations can still be solved by a pertinent $f(z)$.
39. Make a critical comparison, including numerical values, of methods for the boxed microstrip line.
40. The boundary condition on $x = 0$ in Fig. 2.13 is changed from $\partial\phi/\partial x = 0$ to $\phi = 0$, corresponding to antisymmetric modes. Find the $f(z)$ which is germane to this problem.
41. If, in Fig. 2.13, the metal strip is removed from $0 < x < l$, $y = d$, and strip is added on $l < x$, $y = d$, a slot line is produced. Can you discover an appropriate $f(z)$ in this case?
42. In the coupled line of Fig. 2.12(d) the side walls are taken to infinity. Show that two functions $f_1(z)$ and $f_2(z)$ are needed to resolve the matching equations.

2.13 Other methods for guides

This chapter has concentrated on a group of methods for tackling the propagation of waves in guiding structures. These are not, by any means, the only ways which have been devised for attacking these problems. The next chapter will be concerned with a variety of approaches which may be classed under the general heading of variational methods. Some of these are relevant to waveguide calculations. However, before turning to a discussion of these matters, we shall devote the next section to giving the basic facts about fast Fourier transform techniques.

2.14 The fast Fourier transform

There are many occasions when one is faced with calculating a Fourier series or Fourier transform of a function. Let us deal with the case of series (Cooley and Tukey 1965) first so that we require

$$X(j) = \sum_{k=0}^{N-1} a(k)\exp(2\pi ijk/N) \tag{2.84}$$

for $j = 0, 1, \ldots, N - 1$ with given complex coefficients $a(k)$. Suppose that $N$ is a composite number so that it has integer factors $n_1$, $n_2$ such that $N = n_1 n_2$. There are integers $j_0$, $j_1$ such that $j = j_1 n_1 + j_0$ with $0 \le j_0 \le n_1 - 1$, $0 \le j_1 \le n_2 - 1$. Similarly, $k = k_1 n_2 + k_0$ with $0 \le k_0 \le n_2 - 1$, $0 \le k_1 \le n_1 - 1$. Then

$$X(j_1, j_0) = \sum_{k_0=0}^{n_2-1}\sum_{k_1=0}^{n_1-1} a(k_1, k_0)\exp\{2\pi ij(k_1 n_2 + k_0)/N\}$$

$$= \sum_{k_0=0}^{n_2-1}\sum_{k_1=0}^{n_1-1} a(k_1, k_0)\exp\{2\pi i(j_0 k_1 n_2 + jk_0)/N\}$$

$$= \sum_{k_0=0}^{n_2-1} a_1(j_0, k_0)\exp\{2\pi i(j_1 n_1 + j_0)k_0/N\}$$

where

$$a_1(j_0, k_0) = \sum_{k_1=0}^{n_1-1} a(k_1, k_0)\exp(2\pi ij_0 k_1 n_2/N).$$
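The two-step evaluation above can be sketched directly. The code below is an illustration for a single factorization $N = n_1 n_2$, not a full fast transform; the function names are hypothetical, and the direct sum (2.84) is included only as a reference against which the two-step result can be compared.

```python
import cmath


def direct_dft(a):
    """Straightforward evaluation of (2.84), with the positive exponent."""
    N = len(a)
    return [sum(a[k] * cmath.exp(2j * cmath.pi * j * k / N) for k in range(N))
            for j in range(N)]


def two_step_dft(a, n1, n2):
    """Evaluate (2.84) for N = n1 * n2 through the intermediate array a_1."""
    N = n1 * n2
    assert len(a) == N
    # First step: a1[j0][k0] = sum over k1 of a(k1*n2 + k0) exp(2 pi i j0 k1 n2 / N)
    a1 = [[sum(a[k1 * n2 + k0] * cmath.exp(2j * cmath.pi * j0 * k1 * n2 / N)
               for k1 in range(n1))
           for k0 in range(n2)]
          for j0 in range(n1)]
    # Second step: X(j) = sum over k0 of a1(j0, k0) exp(2 pi i j k0 / N), j = j1*n1 + j0
    X = [0j] * N
    for j1 in range(n2):
        for j0 in range(n1):
            j = j1 * n1 + j0
            X[j] = sum(a1[j0][k0] * cmath.exp(2j * cmath.pi * j * k0 / N)
                       for k0 in range(n2))
    return X
```

For example, `two_step_dft(a, 3, 4)` on a sequence of 12 coefficients should reproduce `direct_dft(a)` to rounding error.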

There are $N$ elements in $a_1$ each requiring $n_1$ operations (an operation being a complex multiplication followed by a complex addition) so $Nn_1$ operations are necessary to find all $a_1$. Then $Nn_2$ are needed to calculate the $X$ from the $a_1$ giving the total of $N(n_1 + n_2)$ for the two steps. It is clear how $m$ steps can be carried out if $N = n_1 n_2 \cdots n_m$ and a total of $N(n_1 + n_2 + \cdots + n_m)$ operations will be involved. If $n_1 + n_2 + \cdots + n_m$ can be made much less than $N$ there will be substantially fewer operations than the $N^2$ that would be required by a straightforward computation of (2.84). For example, if $n_1 = n_2 = \cdots = 2$, $m = \log_2 N$ and the total number of operations is $2N\log_2 N$. While other choices are possible, the advantage of this one is that $j = j_{m-1}2^{m-1} + \cdots + j_1 2 + j_0$ with each $j_i$ either 0 or 1 and this information is already contained in the binary representation of $j$. Since the same is true of $k$ it pays to arrange the series in (2.84) so that $N = 2^p$ where $p$ is some positive integer. (For simple procedures based on real arithmetic see Kruseman, Aretz and Zonneveld (1975).)

Turning now to the case of the integral transform (Cooley, Lewis, and Welch 1967) we want

$$G(f) = \int_{-\infty}^{\infty} g(t)\exp(-2\pi ift)\,dt$$

when $g(t)$ is given. By Fourier inversion

$$g(t) = \int_{-\infty}^{\infty} G(f)\exp(2\pi ift)\,df.$$

Let $t_0$ be fixed and let $j$ be any integer. Then, if $F = 1/t_0$,

$$g(jt_0) = \int_{-\infty}^{\infty} G(f)\exp(2\pi ifj/F)\,df = \sum_{k=-\infty}^{\infty}\int_{kF}^{(k+1)F} G(f)\exp(2\pi ifj/F)\,df = \sum_{k=-\infty}^{\infty}\int_0^F G(f + kF)\exp(2\pi ifj/F)\,df$$

on changing the variable of integration from $f$ to $f + kF$. Putting

$$G_a(f) = \sum_{k=-\infty}^{\infty} G(f + kF) \tag{2.85}$$

we have

$$g(jt_0) = \int_0^F G_a(f)\exp(2\pi ifj/F)\,df. \tag{2.86}$$

The function $G_a$ is sometimes said to be an aliased version of $G$. It is a periodic function of $f$ with period $F$ so that it can be expanded in a Fourier series. Thus

$$G_a(f) = \sum_{n=-\infty}^{\infty} c_n\exp(-2\pi ifn/F)$$

where

$$Fc_n = \int_0^F G_a(f)\exp(2\pi ifn/F)\,df.$$

An immediate deduction from (2.86) is that

$$G_a(f) = \frac{1}{F}\sum_{j=-\infty}^{\infty} g(jt_0)\exp(-2\pi ifj/F).$$

Hence, for $n = 0, 1, \ldots, N$,

$$G_a(nF/N) = \frac{1}{F}\sum_{j=-\infty}^{\infty} g(jt_0)\exp(-2\pi inj/N)$$

$$= \frac{1}{F}\sum_{k=-\infty}^{\infty}\sum_{j=kN}^{(k+1)N-1} g(jt_0)\exp(-2\pi inj/N)$$

$$= \frac{1}{F}\sum_{k=-\infty}^{\infty}\sum_{j=0}^{N-1} g(jt_0 + kNt_0)\exp(-2\pi inj/N)$$

$$= \frac{1}{F}\sum_{j=0}^{N-1} g_b(jt_0)\exp(-2\pi inj/N) \tag{2.87}$$

where

$$g_b(t) = \sum_{k=-\infty}^{\infty} g(t + kNt_0).$$
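The chain leading to (2.87) can be tried on a function whose transform is known. The sketch below assumes the Gaussian $g(t) = \exp(-\pi t^2)$, for which $G(f) = \exp(-\pi f^2)$; the function names, the truncation of the sum for $g_b$ to a few terms, and the use of a direct sum in place of the fast algorithm are all illustrative choices.

```python
import cmath
import math


def g(t):
    """Test function (an assumption of this sketch); its transform is exp(-pi f^2)."""
    return math.exp(-math.pi * t * t)


def aliased_samples(t0, N, wraps=3):
    """Samples g_b(j t0), j = 0..N-1, with the sum over k truncated to |k| <= wraps."""
    period = N * t0
    return [sum(g(j * t0 + k * period) for k in range(-wraps, wraps + 1))
            for j in range(N)]


def aliased_transform(t0, N):
    """Values G_a(n F / N), n = 0..N-1, from the finite series (2.87) with F = 1/t0."""
    F = 1.0 / t0
    gb = aliased_samples(t0, N)
    return [sum(gb[j] * cmath.exp(-2j * cmath.pi * n * j / N) for j in range(N)) / F
            for n in range(N)]
```

With $t_0 = 0.25$ (so that $F = 4$) and $N = 32$, the computed values for small $n$ should agree closely with $G(n/8)$, because $G$ is negligible for $|f| > \tfrac{1}{2}F$.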

A stage has been reached where $G_a$ is given at a discrete set of points by the finite Fourier series (2.87). For values at other points one may consider trigonometric interpolation (§1.7) provided $N$ is not too small. However, we first select $F$ or $1/t_0$ sufficiently large that the error committed in (2.85) in taking $G(f) = 0$ for $|f| > \tfrac{1}{2}F$ is negligible. With this choice $G_a(f) = G(f)$ for $|f| \le \tfrac{1}{2}F$ so that $G_a$ can be used to evaluate $G$. Next $N$ is chosen. In order that the fast algorithm for the Fourier series can be employed for (2.87) it is taken as $2^p$ with $p$ large enough to give $G_a$ at a satisfactory number of points, depending on the frequency resolution desired. The alias $g_b$ is available in principle and will involve only a finite summation if $g$ is zero outside a finite interval. Otherwise the series must be truncated and it may be desirable to adopt smoothing techniques to minimize errors. For an account of errors in the fast Fourier transform see Thong and Liu (1977).

3 OPERATORS AND EIGENVALUES

PRELIMINARIES

The literature on operators is vast and covers many different applications, some of which are especially relevant to the variational methods to be discussed later. Although the principles are always the same the details may vary substantially. Therefore, to avoid unnecessary duplication, it is desirable to place the basic theory in as general a setting as possible. It is this motivation that prompts the introduction of Hilbert space so that a fair number of applications can be covered once and for all.

3.1 Hilbert space

It is convenient to start by saying something about the entities with which we are prepared to work. Suppose that we have a set of elements, denoted by $x, y, \ldots$, for which a rule is specified for calculating $x + y$ and for which a meaning can be attributed to $\alpha\cdot x$ where $\alpha$ is any complex number. In other words, we are provided with a rule for the addition of two elements and a rule for the multiplication of an element by a complex number. If these rules obey certain axioms such as $x + (y + z) = (x + y) + z$, $(\alpha + \beta)\cdot x = \alpha\cdot x + \beta\cdot x$, $1\cdot x = x$ (for a full list see Brown and Page (1970)), the elements are said to form a complex linear space. In particular, if $E$ is such a space, when $x$ and $y$ are elements of $E$ so is $\alpha\cdot x + \beta\cdot y$ for any complex numbers $\alpha$ and $\beta$. The axioms require the existence of a zero element $\theta$ of $E$ such that $x + \theta = x$. However, it can be shown that $0\cdot x = \theta$ for all $x$ in $E$ and $\alpha\cdot\theta = \theta$ for all complex numbers $\alpha$. Therefore, no confusion will be caused if the same symbol 0 is used to denote the zero complex number and the zero element of $E$. The notation $x \in E$ indicates that $x$ is an element of $E$. When the complex numbers $\alpha$ above are replaced by real numbers $E$ is a real linear space. Since results for complex spaces can usually be carried over to real spaces by simple modification, our formulation will deal mainly with complex spaces.

If $x_1, x_2, \ldots, x_n$ belong to $E$, the set of elements $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$ generated by giving $\alpha_1, \alpha_2, \ldots, \alpha_n$ all complex values is called the space spanned by $x_1, x_2, \ldots, x_n$. The elements $x_1, \ldots, x_n$ are said to be linearly independent if $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n = 0$ implies $\alpha_1, \ldots, \alpha_n$ are all zero; otherwise they are linearly dependent. A non-empty subset $S$ of $E$ is said to be linearly independent if and only if there is no finite subset of $S$ whose elements are linearly dependent. A linearly independent subset of $E$ which spans $E$ is called a basis for $E$ and then every element of $E$ can be expressed uniquely as a linear combination of elements of the subset. If the basis is finite $E$ is said to be of finite dimension; otherwise it has infinite dimension. A non-empty subset $S$ of $E$ is said to be a linear subspace or linear manifold of $E$ if $\alpha x + \beta y \in S$ whenever $x \in S$ and $y \in S$. Obviously, $E$ is a linear subspace of itself. The linear hull of a non-empty subset $S$ is the set of all finite linear combinations of elements of $S$. Sometimes the linear hull is known as the linear subspace spanned by $S$.

The inner product has already been encountered (§§1.5, 1.11). It is a complex number which satisfies certain laws set out in the earlier sections. A linear space for which an inner product is defined is called an inner product space or pre-Hilbert space. In an inner product space, Schwarz's inequality

$$|(x, y)| \le \|x\|\,\|y\|, \tag{3.1}$$

Minkowski's inequality

$$\|x + y\| \le \|x\| + \|y\|, \tag{3.2}$$

and the triangle inequality

$$\|x - z\| \le \|x - y\| + \|y - z\| \tag{3.3}$$

are all valid. Some examples of inner product spaces are the following.

(a) $E$ is the set of real numbers and $(x, y) = xy$, provided that the multipliers $\alpha, \beta, \ldots$ are restricted to real numbers.

(b) $E$ is the set of complex numbers and $(x, y) = xy^*$, the asterisk indicating a complex conjugate as usual.

(c) $E$ is the space of column matrices with $n$ components which are complex numbers. Now $x$ has the complex numbers $\xi_1, \ldots, \xi_n$ as components and

$$(x, y) = \sum_{j=1}^{n}\xi_j\eta_j^*.$$

(d) $E$ is the space of all sequences $\{\xi_1, \xi_2, \ldots\}$ of complex numbers such that $\sum_{j=1}^{\infty}|\xi_j|^2$ is finite. Here

$$(x, y) = \sum_{j=1}^{\infty}\xi_j\eta_j^*.$$

(e) $E$ is the space of column matrices with $n$ components, each of which is a complex-valued function of the real variable $t$ that is square integrable for $a \le t \le b$. Here

$$(x, y) = \sum_{j=1}^{n}\int_a^b\xi_j(t)\eta_j^*(t)\,dt. \tag{3.4}$$

(f) The same as (e) except that $t$ is a variable in $m$-dimensional space.

(g) With suitable restrictions on the functions in (e) and (f) a possible inner product is

$$(x, y) = \sum_{j=1}^{n}\sum_{k=1}^{n}\int a_{jk}(t)\xi_j(t)\eta_k^*(t)\,dt$$

where $a_{jk}(t)$ is an element of an assigned positive definite Hermitian matrix.

There is one further notion to be discussed before the definition of a Hilbert space. Suppose that $\{x_n\}$ is a sequence of elements of an inner product space $E$ such that $\|x_n - x_m\| \to 0$ as $m$ and $n$ tend to infinity. Then, the question arises as to whether there is an element $y$ in $E$ such that $\|y - x_n\| \to 0$ as $n \to \infty$. Either possibility may occur for general spaces, as can be seen by letting the elements of a space $X$ be the real numbers in $(0, 1]$ so that $0 < x \le 1$. The sequence $\{1/n\}$ tends to 0 which is not in $X$. On the other hand, if $X$ had consisted of the real numbers in $[0, 1]$ so that $0 \le x \le 1$ the limit would have been in $X$. A space in which every sequence that is convergent in the above sense (i.e. every Cauchy sequence) converges to an element in the space is said to be complete.

DEFINITION. A Hilbert space is an inner product space that is complete.

Thus, if $\|x_n - x_m\| \to 0$ as $m$ and $n$ tend to infinity in a Hilbert space $H$ one can be sure that there is an element $y \in H$ such that $\|y - x_n\| \to 0$. Actually, any pre-Hilbert space can be made into a Hilbert space by adding to it all the limits (of convergent sequences) not already contained therein. The examples (a)-(f) are all illustrations of Hilbert spaces. Had, however, the functions in (e) been restricted to continuous functions instead of being square integrable, $E$ would have been an inner product space but not a Hilbert space because the limits of convergent sequences of continuous functions need not be continuous.

It is sometimes convenient to consider linear spaces for which a norm can be defined (satisfying the conditions of §1.11) although an inner product is not available; they are known as normed linear spaces. Such spaces, when they are complete, receive the special name of Banach spaces. Clearly, a Hilbert space is also a Banach space but the converse is not true. As an example, let $x$ be a complex-valued function of $t$ such that $\int_a^b|d^s x(t)/dt^s|^p\,dt$ is finite for $0 \le s \le k$ and some fixed $p$ such that $1 \le p < \infty$. Then

$$\|x\| = \left\{\sum_{s=0}^{k}\int_a^b\left|\frac{d^s x(t)}{dt^s}\right|^p dt\right\}^{1/p}$$

is a possible norm. The space is often denoted by $W^{k,p}$ and known as a Sobolev space. When the derivatives are generalized, a Sobolev space is a Banach space. In particular, the space $W^{k,2}$ is a Hilbert space with inner product

$$(x, y) = \sum_{s=0}^{k}\int_a^b\frac{d^s x(t)}{dt^s}\,\frac{d^s y(t)^*}{dt^s}\,dt.$$

Sobolev spaces are often useful when one is attempting to approximate a function and a number of its derivatives at one and the same time. Note that all the results derived in §1.5 for orthonormal sets are also applicable to Hilbert spaces since the derivation rested primarily on the inner product notation. Thus, if $x_1, x_2, \ldots$ is a complete orthonormal set in a Hilbert space $H$ then, for any $x \in H$,

$$x = \sum_{k=1}^{\infty}(x, x_k)x_k$$

in the sense that

$$\lim_{n\to\infty}\left\|x - \sum_{k=1}^{n}(x, x_k)x_k\right\| = 0.$$

Also

$$(x, y) = \sum_{k=1}^{\infty}(x, x_k)(x_k, y).$$

Further, if $y_1, \ldots, y_m$ are linearly independent elements of $H$, the best approximation in the space spanned by them to $x \in H$ is based on $\|x - \sum_{k=1}^{m}\alpha_k y_k\|^2$ being a minimum and is provided by

$$(x, y_k) - \sum_{s=1}^{m}\alpha_s(y_s, y_k) = 0 \qquad (k = 1, \ldots, m).$$

It may, of course, be desirable first to orthonormalize the $y_1, \ldots, y_m$ by the Schmidt process.
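The best approximation can be illustrated in the space of example (c), with $(x, y) = \sum\xi_j\eta_j^*$. The sketch below takes the route of orthonormalizing first by the Schmidt process and then projecting; it assumes that the given elements are linearly independent, and the function names are hypothetical. The defining property is that the residual is orthogonal to every $y_k$.

```python
def inner(x, y):
    """Inner product of example (c): sum of xi_j times the conjugate of eta_j."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def schmidt(vectors):
    """Orthonormalize linearly independent vectors by the Schmidt process."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = abs(inner(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis


def best_approximation(x, vectors):
    """Element of the span of `vectors` closest to x in the induced norm."""
    approx = [0j] * len(x)
    for e in schmidt(vectors):
        c = inner(x, e)  # coefficient (x, e) of the orthonormal element e
        approx = [ai + c * ei for ai, ei in zip(approx, e)]
    return approx
```

Subtracting the result from $x$ and forming the inner product with each $y_k$ should give zero to rounding error, which is the condition displayed above.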

Exercises

1. Let $S$ be a subspace of $E$ which does not comprise the whole of $E$. Let $x_0 \in E$ but not be in $S$. Show that the set of elements $x + x_0$ for $x \in S$ is not a linear space.
2. Let $X$ be a non-empty subset of the Hilbert space $H$. Let $X^{\perp}$ consist of all those elements $y \in H$ such that $(y, x) = 0$ for every $x \in X$. Show that $X^{\perp}$ is a linear subspace of $H$.
3. If $\{y_n\}$ is a convergent sequence in $X^{\perp}$ prove that its limit is in $X^{\perp}$. This is sometimes expressed as $X^{\perp}$ is closed.
4. Prove that the only element common to $X$ and $X^{\perp}$ is the zero element. If $x \in H$ show that there is a unique $x_1 \in X$ and $x_2 \in X^{\perp}$ such that $x = x_1 + x_2$.
5. Determine a basis for cubic splines.
6. Given a continuous differentiable function $x(t)$ on $[a, b]$ find the best approximation that would be achieved by polynomials of degree 3 when using the norms of the Sobolev spaces (i) $W^{0,2}$, (ii) $W^{1,2}$.

3.2 Linear operators

Let $E$ and $F$ be two complex or two real linear spaces. Let $D$ be a linear subspace of $E$ and $R$ a subset of $F$. Suppose that there is a rule which, for each $x \in D$, assigns an element $y \in R$ to it. Then this relationship is signified by writing $y = Tx$. $T$ is called an operator on $D$ into $F$ or, rather more loosely, an operator from $E$ into $F$. $D$ and $R$ are known as the domain and range of $T$ respectively. The notation $D(T)$ is used to indicate the domain of a particular operator when more than one is present. If $T$ has the property

$$T(\alpha x_1 + \beta x_2) = \alpha Tx_1 + \beta Tx_2$$

for all $x_1$ and $x_2$ in $D$ and all multipliers associated with $E$ and $F$, $T$ is said to be a linear operator. For a linear operator

$$T\cdot 0 = 0, \qquad T(-x) = -Tx.$$

If $R$ contains only complex (real if $E$ and $F$ are real) numbers then a linear operator is often described as a linear functional. From now on all operators will be linear unless otherwise is specified.

If to each $y \in R$ there corresponds one and only one $x \in D$, then there is a linear operator $T^{-1}$ on $R$ onto $D$ such that

$$T^{-1}Tx = x \quad (\text{all } x \in D), \qquad TT^{-1}y = y \quad (\text{all } y \in R).$$

$T^{-1}$ is called the inverse operator or, in short, the inverse of $T$. Evidently, a linear operator $T$ admits the inverse $T^{-1}$ if and only if $Tx = 0$ implies that $x = 0$.

Let $E$ and $F$ be inner product spaces. The inner products need not be defined in the same way, so $\langle\,,\rangle$ is used to denote the inner product in $F$. Suppose that for a given $y_1$ we can find an $x_1 \in E$ such that

$$\langle Tx, y_1\rangle = (x, x_1)$$

for all $x \in D$. The element $x_1$ is determined uniquely by $y_1$ if and only if $D$ and all the limits of its convergent sequences comprise $E$ (in which case, $D$ is said to be dense in $E$). When it is known uniquely we can write $x_1 = T^A y_1$ and the linear operator $T^A$ is termed the adjoint of $T$. The domain of $T^A$ is in $F$ and its range in $E$; also

$$\langle Tx, y\rangle = (x, T^A y)$$

for all $x \in D(T)$ and all $y \in D(T^A)$. The following are some examples.

(a) $E$ and $F$ are the spaces of column matrices with $m$ and $n$ components respectively which are complex numbers. Let $B$ be an $n \times m$ matrix of complex numbers; then the matrix equation $y = Bx$ defines $B$ as a linear operator. Specify the inner products by

$$(x_1, x_2) = \sum_{j=1}^{m}(x_1)_j(x_2^*)_j, \qquad \langle y_1, y_2\rangle = \sum_{j=1}^{n}(y_1)_j(y_2^*)_j$$

where $(x_1)_j$ designates the $j$th component of $x_1$. Then

$$\langle Bx, y\rangle = \sum_{j=1}^{n}(Bx)_j(y^*)_j = \sum_{j=1}^{n}\sum_{k=1}^{m}B_{jk}(x)_k(y^*)_j = \sum_{k=1}^{m}(x)_k\sum_{j=1}^{n}B_{jk}(y^*)_j = (x, B^H y)$$

so that the Hermitian of a matrix of complex numbers is its adjoint. The domains of $T$ and $T^A$ are $E$ and $F$ respectively.

(b) $E$ and $F$ are the spaces of complex-valued functions $x(t)$ which are square integrable in $a \le t \le b$, both inner products being of the type (3.4) with $n = 1$. Assume that

$$\int_a^b\int_a^b|k(s, t)|^2\,ds\,dt$$

is finite. Then a linear operator $T$ can be defined by

$$Tx = \int_a^b k(s, t)x(t)\,dt.$$

Also

$$(Tx, y) = \int_a^b Tx\cdot y^*(s)\,ds = \int_a^b y^*(s)\int_a^b k(s, t)x(t)\,dt\,ds = \int_a^b x(t)\int_a^b k(s, t)y^*(s)\,ds\,dt$$

the interchange in the order of integration being justifiable by the assumption on $k$. Hence

$$T^A y = \int_a^b k^*(s, t)y(s)\,ds$$

which is of the same form as $T$ but the arguments of $k$ are transposed and its complex conjugate is taken. Again the domains of $T$ and $T^A$ are $E$ and $F$ respectively.

(c) $E$ and $F$ are spaces constructed from continuously differentiable functions on $a \le t \le b$, an $x \in E$ being defined by

$$x = \begin{pmatrix} x(t) \\ x(b) \\ x(a) \end{pmatrix}$$

and similarly in $F$. Both scalar products are given the form

$$(x, y) = \int_a^b x(t)y^*(t)\,dt + x(b)y^*(b) + x(a)y^*(a).$$

This time the linear operator $T$ is defined by

$$Tx = \begin{pmatrix} dx/dt \\ 0 \\ x(a) \end{pmatrix}.$$

Then

$$(Tx, y) = \int_a^b\frac{dx}{dt}y^*\,dt + x(a)y^*(a) = x(b)y^*(b) - \int_a^b x\frac{dy^*}{dt}\,dt$$

by integration by parts. Now

$$T^A y = \begin{pmatrix} -dy/dt \\ y(b) \\ 0 \end{pmatrix}$$

and the domains of $T$ and $T^A$ are $E$ and $F$ respectively.

(d) If in (c) we had restricted $D(T)$ to those $x$ such that $x(a) = x(b) = 0$ and defined $Tx = dx/dt$ then with the inner product of (b)

$$(Tx, y) = (x, T^A y)$$

where $T^A y = -dy/dt$. Now $D(T)$ is smaller than $E$ and $D(T^A) = F$. Notwithstanding, $D(T)$ is dense in $E$ because, for example, Fourier sine series are contained in $D(T)$ and form a complete orthonormal set in $E$.

Suppose now that $E$ and $F$ are both the same Hilbert space $H$. Let $T$ be a linear operator with domain $D(T)$ dense in $H$. Then the adjoint $T^A$ exists. If the domain $D(T^A)$ contains $D(T)$ and $Tx = T^A x$ for all $x \in D(T)$ then $T$ is said to be symmetric. For a symmetric operator

$$(Tx, y) = (x, Ty). \tag{3.5}$$

For example, in (d) above $Tx = i\,dx/dt$ gives $T^A y = i\,dy/dt$ so that $T$ is symmetric and in this case $D(T^A)$ possesses elements not in $D(T)$. When $T$ is symmetric and $D(T^A)$ is the same as $D(T)$ then $T$ is called self-adjoint. We have just had an example of a symmetric operator which is not self-adjoint. An illustration of a self-adjoint operator is supplied by the integral operator in (b) above provided that $k(s, t) = k^*(t, s)$ for almost all $s$ and $t$. If $T$ is self-adjoint and admits an inverse $T^{-1}$ then $T^{-1}$ is also self-adjoint. It follows that a symmetric operator $T$ is self-adjoint if $D(T)$ is the same as $H$ or if $R(T)$ coincides with $H$.
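Example (a) lends itself to a direct numerical check. The sketch below, with hypothetical function names and matrices stored as lists of rows, forms the Hermitian of a rectangular complex matrix and allows the two inner products $\langle Bx, y\rangle$ and $(x, B^H y)$ to be compared; the same helper shows that $B^H B$ satisfies (3.5).

```python
def inner(x, y):
    """Sum of the components of x times the conjugated components of y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(B, x):
    """Matrix-vector product y = Bx for a matrix given as a list of rows."""
    return [sum(B[j][k] * x[k] for k in range(len(x))) for j in range(len(B))]


def hermitian(B):
    """Conjugate transpose of an n x m matrix, returned as an m x n matrix."""
    return [[B[j][k].conjugate() for j in range(len(B))]
            for k in range(len(B[0]))]
```

Here `inner(apply(B, x), y)` uses the inner product of $F$ and `inner(x, apply(hermitian(B), y))` that of $E$; both are of the same form, differing only in the number of components.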

Exercises

7. $E$ and $F$ are both the space of continuous complex-valued functions on $0 \le t \le 1$. Show that (i) $y(t) = \int_0^t x(\tau)\,d\tau$, (ii) $y = \int_0^1 x(t)\,dt$, and (iii) $y(t) = tx(t)$ define linear operators, that of (ii) being a linear functional.
8. Let the operators in Exercise 7(i) and (iii) be denoted by $T_1$ and $T_2$ respectively. Prove that $T_1 T_2 x \ne T_2 T_1 x$ for all $x$ so that, in general, linear operators do not commute.
9. Let $H$ be the Hilbert space of square-integrable complex-valued functions over $(-\infty, \infty)$. Define $D$ as consisting of those elements $x(t)$ such that $x(t) \in H$ and $tx(t) \in H$. Prove that $y(t) = tx(t)$ on $D$ defines a self-adjoint operator.
10. If $T$ is symmetric prove that $(Tx, x)$ is real.

3.3 Bounded linear operators

Let $E$ and $F$ be normed linear spaces and let $T$ be a linear operator on $E$ into $F$. If there is a real number $M$ such that

$$\|Tx\| \le M\|x\| \tag{3.6}$$

for all $x \in E$ then $T$ is said to be a bounded linear operator. Here we have employed the same symbol for both the norm on $E$ and the norm on $F$ since this simplifies the notation and should cause no confusion. It is immediately evident that, if $\|x\| \to 0$, $\|Tx\| \to 0$ so that a bounded linear operator is said to be continuous. The norm of a bounded linear operator is defined by

$$\|T\| = \sup_{\|x\| = 1}\|Tx\|.$$

Equally well it could have been taken as $\sup_{x \in E}\|Tx\|/\|x\|$ with $x \ne 0$. Plainly, the norm is the smallest value of $M$ for which (3.6) is valid and so

$$\|Tx\| \le \|T\|\,\|x\|. \tag{3.7}$$

If $S$ is another bounded linear operator on $E$ into $F$, $S + T$ can be specified by

$$(S + T)x = Sx + Tx.$$

Hence

$$\|(S + T)x\| \le \|Sx\| + \|Tx\| \le \|S\|\,\|x\| + \|T\|\,\|x\|$$

so that

$$\|S + T\| \le \|S\| + \|T\|. \tag{3.8}$$

If $U$ is a bounded linear operator on $F$ into the normed linear space $G$, $UT$ can be defined by $(UT)x = U(Tx)$. Therefore

$$\|(UT)x\| \le \|U\|\,\|Tx\| \le \|U\|\,\|T\|\,\|x\|$$

whence

$$\|UT\| \le \|U\|\,\|T\|. \tag{3.9}$$

A linear operator possesses a bounded inverse if and only if there is a positive real number $m$ such that

$$\|Tx\| \ge m\|x\| \tag{3.10}$$

for all $x \in E$. For, if (3.10) holds, $Tx = 0$ implies that $x = 0$ which is the condition for the existence of the inverse $T^{-1}$. Then putting $x = T^{-1}y$ in (3.10), shows that $T^{-1}$ is bounded with $\|T^{-1}\| \le 1/m$.

These ideas permit the definition of powers of $T$ when $F$ is the same space as $E$ via $T^n = TT^{n-1}$. Let $I$ be the identity operator such that $Ix = x$ for all $x \in E$. Then, if $E$ is a Banach space and $T$ is a bounded linear operator on $E$ into $E$,

$$(I - T)^{-1} = I + \sum_{n=1}^{\infty}T^n \tag{3.11}$$

when $\|T\| < 1$ (cf. Theorem 1.11a); in fact, it is true if $\lim_{n\to\infty}\|T^n\|^{1/n} < 1$. The statement (3.11) means that

$$\lim_{m\to\infty}\left\|(I - T)^{-1} - I - \sum_{n=1}^{m}T^n\right\| = 0.$$
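In finite dimensions the series (3.11) can be summed directly. The sketch below is an illustration for small square matrices stored as lists of rows; the function names are hypothetical, and no attempt is made to test the condition $\|T\| < 1$, which is left to the user.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(T, terms):
    """Partial sum I + T + T^2 + ... + T^terms of the series (3.11)."""
    n = len(T)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, T)  # next power of T
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total
```

Multiplying the partial sum by $I - T$ should approach the identity as the number of terms grows, the discrepancy being $T^{m+1}$ after $m$ terms.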

$E$ has to be a Banach space to ensure that completeness is available to guarantee that the limit is within the space. Eqn (3.11) has many applications as we shall see.

Suppose now that $E$ is a Hilbert space $H$. Let $T$ be a bounded linear functional on $H$. Then it can be proved that there is a unique $y \in H$ such that

$$Tx = (x, y)$$

for all $x \in H$ and, moreover, $\|T\| = \|y\|$. By means of this result it may be demonstrated that a bounded linear operator on $H$ into $H$ always has a unique adjoint. Indeed, by Schwarz's inequality

$$|(Tx, y)| \le \|Tx\|\,\|y\| \le \|T\|\,\|x\|\,\|y\|.$$

Thus, if $T'x = (Tx, y)$, $T'$ is a bounded linear functional with $\|T'\| \le \|T\|\,\|y\|$ and there is a $z \in H$ such that $(Tx, y) = T'x = (x, z)$ with $\|T'\| = \|z\|$. Therefore the adjoint is a bounded linear operator with $\|T^A\| \le \|T\|$. However, it is obvious that $(T^A)^A = T$ and therefore $\|T\| = \|(T^A)^A\| \le \|T^A\| \le \|T\|$. Consequently, $\|T^A\| = \|T\|$, i.e. the norm of the adjoint is the same as the norm of $T$.

It follows from the preceding paragraph that

$$\|T^A T\| \le \|T^A\|\,\|T\| \le \|T\|^2.$$

On the other hand

$$\|T\|^2 = \sup_{\|x\| = 1}\|Tx\|^2 = \sup(Tx, Tx) = \sup(T^A Tx, x) \le \sup\|T^A Tx\|\,\|x\| \le \|T^A T\|.$$

Hence

$$\|T^A T\| = \|T\|^2 = \|TT^A\|,$$

the last equality following from the first by applying it to $T^A$.

The equation

$$Tx = \lambda x,$$

where $\lambda$ is a complex number, may or may not have solutions in which $x \ne 0$. If there are such solutions $\lambda$ is called an eigenvalue of $T$ and the solution $x$ is known as an eigenvector or eigenfunction. If $T$ is a self-adjoint linear operator on $H$ into $H$ and $x$ is an eigenvector

$$(Tx, x) = \lambda(x, x).$$

The left-hand side is real (Exercise 10) and $(x, x) > 0$, so $\lambda$ must be real, i.e. the eigenvalues of a self-adjoint operator are real. Also, if $x_j$ is an eigenvector corresponding to the eigenvalue $\lambda_j$,

$$\lambda_j(x_j, x_k) = (Tx_j, x_k) = (x_j, Tx_k) = \lambda_k(x_j, x_k)$$

since $\lambda_k$ is real. If $\lambda_j \ne \lambda_k$ then $(x_j, x_k) = 0$, i.e. the eigenvectors of distinct eigenvalues are orthogonal. If $\lambda_j = \lambda_k$ linearly independent eigenvectors can be arranged to be orthogonal by the Schmidt process of §1.5. Thus any countable set of eigenvectors of a self-adjoint operator can be regarded as forming an orthonormal set.

A linear operator is said to be compact or completely continuous if and only if for every infinite sequence $\{y_j\}$ of bounded elements (i.e. $\|y_j\| \le c$ for all $j$) the sequence $\{Ty_j\}$ has a convergent subsequence. A compact operator is necessarily bounded. For, if $T$ were not bounded there would be a sequence $\{y_j\}$ with $\|y_j\| = 1$ such that $\|Ty_j\| > j$ for $j = 1, 2, \ldots$ and there could be no subsequence of $Ty_j$ which converged.

A compact self-adjoint linear operator will be denoted by $C$. We now wish to show that if $C$ is on $H$ into $H$

$$\|C\| = \sup_{\|x\| = 1}|(Cx, x)|.$$

Firstly,

$$|(Cx, x)| \le \|Cx\| \le \|C\|$$

so that

$$\sup|(Cx, x)| \le \|C\|.$$

Secondly,

$$\|Cx\|^2 = (Cx, Cx) = (C^2 x, x) = (C^2 x/a, ax)$$

where $a^2 = \|Cx\|$. Hence

$$\|Cx\|^2 = \tfrac{1}{4}(C[ax + Cx/a - \{ax - Cx/a\}],\ ax + Cx/a + \{ax - Cx/a\})$$

$$= \tfrac{1}{4}[(C\{ax + Cx/a\},\ ax + Cx/a) - (C\{ax - Cx/a\},\ ax - Cx/a)]$$

$$\le \tfrac{1}{4}\{\|ax + Cx/a\|^2 + \|ax - Cx/a\|^2\}\sup_{\|x\| = 1}|(Cx, x)|$$

$$\le \tfrac{1}{2}\{a^2\|x\|^2 + \|Cx\|^2/a^2\}\sup|(Cx, x)|$$

$$\le \|Cx\|\sup|(Cx, x)|$$

whence $\sup|(Cx, x)| \ge \|Cx\|$ so that $\sup|(Cx, x)| \ge \|C\|$. Combining the two inequalities we have

$$\|C\| = \sup_{\|x\| = 1}|(Cx, x)|.$$

As a consequence there is a sequence $\{y_n\}$ with $\|y_n\| = 1$ such that $|(Cy_n, y_n)| \to \|C\|$ as $n \to \infty$. However, $|(Cy_n, y_n)| \le \|Cy_n\| \le \|C\|$ and therefore $\|Cy_n\| \to \|C\|$ also. Suppose that the limit of $(Cy_n, y_n)$ is denoted by $\lambda_1$; then $\lambda_1 = \pm\|C\|$. Also

$$\|Cy_n - \lambda_1 y_n\|^2 = \|Cy_n\|^2 - 2\lambda_1(Cy_n, y_n) + \lambda_1^2 \to 0$$

as $n \to \infty$. Therefore $Cy_n - \lambda_1 y_n$ converges to zero. Since $C$ is compact, the sequence $\{Cy_n\}$ contains a subsequence $\{Cy_{n_k}\}$ which converges to an element of $H$. On account of the result just proved $\{\lambda_1 y_{n_k}\}$ converges to the same element, $\lambda_1 x$ (say). Since $C$ is bounded it is continuous and so $\{Cy_{n_k}\}$ converges to $Cx$. Hence $Cx = \lambda_1 x$, i.e. $x$ is an eigenvector and $\lambda_1$, with $|\lambda_1| = \|C\|$, the corresponding eigenvalue. Accordingly we have

THEOREM 3.3. For a compact self-adjoint linear operator $C$ on $H$ into $H$ the extremum problem

$$|(Cx, x)|/\|x\|^2 = \text{maximum}$$

has solutions. Each solution is an eigenvector and the eigenvalue is equal in modulus to the maximum attained. Every such compact operator has at least one eigenvalue different from zero.

Alternatively, $|\lambda_1| = \|C\| = \max\|Cx\|/\|x\|$. Again, if $\lambda_1 > 0$ it will be the maximum of $(Cx, x)$ under $\|x\| = 1$; correspondingly, if $\lambda_1 < 0$ it will be the minimum of $(Cx, x)$.

Let $x_1$ be one of the eigenvectors determined by Theorem 3.3 and consider the same extremum problem with elements $x$ orthogonal to $x_1$. The entire argument may be repeated except that we work in a subspace of $H$. The process may be carried out again and again. Thus the eigenvector $x_n$ is a solution of the extremum problem

$$|(Cx, x)|/\|x\|^2 = \text{maximum}$$

under $(x, x_m) = 0$ $(m = 1, 2, \ldots, n - 1)$ and the corresponding eigenvalue $\lambda_n$ is equal in modulus to the maximum. Obviously $|\lambda_1| \ge |\lambda_2| \ge \cdots$.

When $H$ is of finite dimension the process stops when the finite number of eigenvalues has been discovered. When $H$ is of infinite dimension the possibility of an infinite number of eigenvectors for a single eigenvalue must be examined. Suppose the eigenvectors form an orthonormal set $\{x^{(j)}\}$ (which we know to be permissible) and that the eigenvalue $\lambda \ne 0$. Then the sequence $\{x^{(j)}/\lambda\}$ has $\|x^{(j)}/\lambda\| \le 1/|\lambda|$ and so must have a subsequence such that $\{Cx^{(j)}/\lambda\}$ or $\{x^{(j)}\}$ converges to an element of $H$. However, this is impossible since $\|x^{(j)} - x^{(k)}\|^2 = 2$ for $j \ne k$. Hence, corresponding to a non-zero eigenvalue, there is only a finite number of linearly independent eigenvectors. The number of linearly independent eigenvectors for a single eigenvalue is called its multiplicity; if there is only one eigenvector the eigenvalue is called simple. What has just been established is that every non-zero eigenvalue is of finite multiplicity.

Suppose now that $\lambda_n$ does not tend to zero as $n \to \infty$. Then, if $x_n$ is the corresponding orthonormal eigenvector, the argument of the previous paragraph may be applied to the sequence $\{x_n/\lambda_n\}$ to show that a contradiction arises. Hence, in an infinite dimensional $H$, it is necessary that $\lambda_n \to 0$ as $n \to \infty$. This last result demonstrates that $C$ cannot possess a bounded inverse in an infinite dimensional $H$ for (3.10) is violated. It is not, of course, assumed that $C$ has an infinite number of non-zero eigenvalues. It may happen that $\lambda_n = 0$ for all but a finite set of integers $n$.

There is an important expansion theorem associated with compact self-adjoint operators which we consider for infinite dimensional $H$. Let $x$ be any element of $H$ and put

$$y_n = x - \sum_{m=1}^{n}(x, x_m)x_m$$

with the eigenvectors orthonormal. Then

$$(y_n, x_m) = 0 \qquad (m = 1, \ldots, n)$$

and therefore

$$\|Cy_n\| \le |\lambda_{n+1}|\,\|y_n\|.$$

Since (Yn' Yn) = (Yn' x), II Yn II ~ [x]; also An+ CYn converges to zero, i.e,

1

-+

0 as n -+

00

and it follows that

00

Cx

= L

)"m(x, xm)xmo m=l Now, by Bessel's inequality (§1.5), L:=l I(x, xm)1 2 ~ IIxII2 so that the series of numbers on the left is convergent. Hence

I/ktm (x, Xk)Xkl/

Z

= kt I(x, XkW

m

132

OPERATORS AND EIGENVALUES

must tend to zero as m and n tend to infinity. Hence the completeness of H guarantees that there is y E H such that

can be made as small as we like by choosing n large enough. Because of the boundedness of C

is correspondingly small. In other words 00

= L Am(X, xm)x m·

Cy We conclude that Cx

m-I

= Cy or x = y +

= O.

Consequently

L (x, xm)x m·

=z +

x

z where Cz

00

m=1

In summary we have

THEOREM 3.3a. The eigenvectors form an orthonormal set such that for any $x \in H$

$$Cx = \sum_{m=1}^{\infty} \lambda_m (x, x_m)x_m, \qquad x = z + \sum_{m=1}^{\infty} (x, x_m)x_m$$

where $Cz = 0$. If $Cz \neq 0$ for any $z \neq 0$ the set is complete and

$$x = \sum_{m=1}^{\infty} (x, x_m)x_m.$$
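As a finite-dimensional illustration of Theorem 3.3a (an addition for the reader, not part of the original argument), the sketch below takes a real symmetric matrix with a non-trivial null space in place of C and checks both expansions numerically. The function name `eigen_expansion` and the tolerance `tol` are assumptions made for this example; the inner product follows the convention $(x, y) = \sum_i x_i y_i^*$.

```python
import numpy as np

def eigen_expansion(C, x, tol=1e-10):
    """Split x as z + sum_m (x, x_m) x_m and rebuild Cx from the eigen-series."""
    lam, X = np.linalg.eigh(C)                      # real eigenvalues, orthonormal columns x_m
    coeffs = X.conj().T @ x                         # (x, x_m) = sum_i x_i * conj(x_m[i])
    keep = np.abs(lam) > tol * np.abs(lam).max()    # eigenvectors with non-zero eigenvalue
    series_x = X[:, keep] @ coeffs[keep]            # sum_m (x, x_m) x_m
    series_Cx = X[:, keep] @ (lam[keep] * coeffs[keep])  # sum_m lam_m (x, x_m) x_m
    z = x - series_x                                # remainder, expected to satisfy Cz = 0
    return series_Cx, series_x, z

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
C = A @ A.T                                         # symmetric of rank 3, so Cz = 0 has solutions
x = rng.standard_normal(6)
series_Cx, series_x, z = eigen_expansion(C, x)
print(np.linalg.norm(series_Cx - C @ x), np.linalg.norm(C @ z))
```

Both printed norms should be at rounding level, while z itself is not small: the eigenvectors belonging to non-zero eigenvalues are not complete here because $Cz = 0$ has non-zero solutions.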

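One particular consequence of Theorem 3.3a is important. Suppose that $\lambda$ is neither zero nor an eigenvalue. Then, from the first equation of Theorem 3.3a and $(Cx, x_m) = \lambda_m (x, x_m)$,

$$Cx = \sum_{m=1}^{\infty} \frac{\lambda_m}{\lambda_m - \lambda}\,(Cx - \lambda x, x_m)x_m,$$

which may be rewritten as

$$x = -\frac{1}{\lambda}(Cx - \lambda x) + \frac{1}{\lambda}\sum_{m=1}^{\infty} \frac{\lambda_m}{\lambda_m - \lambda}\,(Cx - \lambda x, x_m)x_m.$$

The definition of $\lambda$ entails the existence of $(C - \lambda)^{-1}$. Therefore, if $Cx - \lambda x = y$, we have

$$x = -\frac{1}{\lambda}\,y + \frac{1}{\lambda}\sum_{m=1}^{\infty} \frac{\lambda_m}{\lambda_m - \lambda}\,(y, x_m)x_m. \qquad (3.12)$$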
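Formula (3.12) can be checked against a direct linear solve in finite dimensions. The following is a minimal sketch under that assumption; the name `solve_by_expansion` is hypothetical, and a real symmetric matrix stands in for the compact self-adjoint operator C.

```python
import numpy as np

def solve_by_expansion(C, y, lam):
    """Solve Cx - lam*x = y through the eigen-series, in the manner of (3.12)."""
    mu, X = np.linalg.eigh(C)                  # eigenvalues mu_m and orthonormal eigenvectors x_m
    if np.isclose(lam, 0.0) or np.any(np.isclose(mu, lam)):
        raise ValueError("lam must be neither zero nor an eigenvalue")
    coeffs = X.conj().T @ y                    # (y, x_m)
    series = X @ (mu / (mu - lam) * coeffs)    # sum_m mu_m / (mu_m - lam) * (y, x_m) x_m
    return (-y + series) / lam

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
C = (B + B.T) / 2                              # real symmetric stand-in for C
y = rng.standard_normal(5)
lam = np.linalg.eigvalsh(C).max() + 1.0        # safely outside the spectrum and non-zero
x_series = solve_by_expansion(C, y, lam)
x_direct = np.linalg.solve(C - lam * np.eye(5), y)
print(np.linalg.norm(x_series - x_direct))
```

The guard at the top of the function mirrors the hypothesis of the text: the series is only meaningful when $\lambda$ is neither zero nor an eigenvalue.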


Another offspring of Theorem 3.3a is that, if $(x, x_m) = a_m$,

$$\frac{(x, Cx)}{\|Cx\|^2} = \frac{\sum_{m=1}^{\infty} \lambda_m |a_m|^2}{\sum_{m=1}^{\infty} \lambda_m^2 |a_m|^2}.$$

Hence, if all the eigenvalues are positive,

$$\frac{1}{\lambda_1} = \min \frac{(x, Cx)}{\|Cx\|^2} \qquad (3.13)$$

and other eigenvalues are given by minimizing the same quantity with x orthogonal to the earlier eigenvectors.

These theorems have far-reaching applications but they are only available when it can be proved that an operator is both compact and self-adjoint. Since compactness implies boundedness the discussion is limited to bounded operators. The demonstration of self-adjointness is usually the easier of the two. Often it is relatively easy to prove that T is symmetric, perhaps not for the whole of H, but only for a dense subset S of it. However, given a bounded operator T on a dense subset, there is a unique bounded operator $T_0$ on H such that $T_0x = Tx$ for $x \in S$ and $\|T_0\| = \|T\|$. Effectively, $T_0$ is defined by saying that $T_0x = Tx$ when $x \in S$ and, when $x \notin S$, we select a sequence $\{x_n\}$ from S such that $\|x - x_n\| \to 0$ as $n \to \infty$, defining $T_0x$ as $\lim_{n\to\infty} Tx_n$, i.e. $\|T_0x - Tx_n\| \to 0$. By this device, if T is symmetric on S, $T_0$ is symmetric on H and hence self-adjoint. Therefore, it is sufficient for the self-adjointness of a bounded operator to check that it is symmetric on a dense sub-space of H.

To deal with compactness we must either show that the operator satisfies the original definition or use one of the two following sufficient conditions. The first is that, if for every $\varepsilon > 0$ there is a compact operator $T_\varepsilon$ such that

$$\|Tx - T_\varepsilon x\| \le \varepsilon\|x\|$$

for all $x \in H$ then T is compact. The second says that, if T is a bounded operator on H and $y_j$, $z_k$ are two complete orthonormal sets, then T is compact if

$$\sum_{j=1}^{\infty}\sum_{k=1}^{\infty} |(Ty_j, z_k)|^2$$

is finite.
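The double sum in the second criterion does not depend on which complete orthonormal sets are used. A finite-dimensional sketch of this fact follows; it is only an illustration of the quantity, since every operator on a finite-dimensional space is compact. The helper names `double_sum` and `random_orthonormal_basis` are assumptions of this example.

```python
import numpy as np

def double_sum(T, Y, Z):
    """Return sum_j sum_k |(T y_j, z_k)|^2 for orthonormal columns y_j of Y and z_k of Z."""
    M = Z.conj().T @ (T @ Y)          # entry [k, j] equals (T y_j, z_k) with (x, y) = sum x_i conj(y_i)
    return float(np.sum(np.abs(M) ** 2))

def random_orthonormal_basis(n, rng):
    """Orthonormal basis obtained from the QR factorisation of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4))
Y = random_orthonormal_basis(4, rng)
Z = random_orthonormal_basis(4, rng)
# The two numbers agree: the sum is the squared Frobenius norm whatever the bases.
print(double_sum(T, Y, Z), np.linalg.norm(T, "fro") ** 2)
```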

PARTIAL DIFFERENTIAL EQUATIONS

3.4 Integral and partial differential equations

Let us first examine an integral operator

$$Tx = \int k(s, t)x(t)\,dt$$

with inner product

$$(x, y) = \int x(t)y^*(t)\,dt.$$

Although only one variable of integration is shown explicitly, the integrals may, in fact, be over an n-dimensional region. The Hilbert space is composed of those functions such that $\int |x(t)|^2\,dt$ is finite. Assume that $\iint |k(s, t)|^2\,ds\,dt$ is finite. Then we already know that T is self-adjoint if and only if $k(s, t) = k^*(t, s)$. As to compactness, consider the criterion given at the end of the last section, since T is clearly bounded. Suppose that we have two complete orthonormal sets $y_j$, $z_k$; such sets can be constructed using, for example, polynomials or Fourier series as building blocks. Then, by eqn (1.32)

$$\sum_{k=1}^{\infty} |(Ty_j, z_k)|^2 = \|Ty_j\|^2 = \int \Bigl|\int k(s, t)y_j(t)\,dt\Bigr|^2 ds.$$

But, by Bessel's inequality

$$\sum_{j=1}^{\infty} \Bigl|\int k(s, t)y_j(t)\,dt\Bigr|^2 = \sum_{j=1}^{\infty} \Bigl|\int k^*(s, t)y_j^*(t)\,dt\Bigr|^2 \le \int |k(s, t)|^2\,dt$$

by regarding the integral as the inner product of $k^*$ and $y_j$ and hence

$$\sum_{j=1}^{\infty}\sum_{k=1}^{\infty} |(Ty_j, z_k)|^2 \le \iint |k(s, t)|^2\,dt\,ds < \infty.$$

Consequently, T is compact. Notice that there is no necessity in this proof for T to be self-adjoint.

If $k(s, t) = k^*(t, s)$ so that T is self-adjoint, as well as compact, the theorems of the preceding section are available. Thus, the integral equation

$$\int k(s, t)x(t)\,dt = \lambda x(s)$$

will have square-integrable solutions only when $\lambda$ is an eigenvalue and for each eigenvalue there will be only a finite number of linearly independent eigenvectors. According to Theorem 3.3a, if $X(t)$ is any square-integrable function

$$\int k(s, t)X(t)\,dt = \sum_{m=1}^{\infty} \lambda_m x_m(s)\int X(t)x_m^*(t)\,dt \qquad (3.14)$$

where $x_m$ is a typical eigenvector of the integral equation. If, in addition, the only solution of

$$\int k(s, t)z(t)\,dt = 0$$

is $z = 0$, then

$$X(s) = \sum_{m=1}^{\infty} x_m(s)\int X(t)x_m^*(t)\,dt. \qquad (3.15)$$
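To make the eigenvalue problem for a square-integrable symmetric kernel concrete, the sketch below discretises the integral operator by the midpoint rule (a Nyström-type approximation, which is a choice made here and not a method taken from the text). The test kernel $k(s, t) = \min(s, t)$ on $(0, 1)$ is also an assumption of the example; its eigenvalues are known to be $1/\{(m - \tfrac12)^2\pi^2\}$, and $\iint k^2\,ds\,dt = 1/6$.

```python
import numpy as np

def nystrom_eigen(kernel, n):
    """Midpoint-rule discretisation of x -> int_0^1 k(s, t) x(t) dt on n nodes."""
    h = 1.0 / n
    t = (np.arange(n) + 0.5) * h
    K = kernel(t[:, None], t[None, :]) * h        # symmetric because k(s, t) = k(t, s)
    lam, V = np.linalg.eigh(K)
    order = np.argsort(-np.abs(lam))              # arrange |lam_1| >= |lam_2| >= ...
    X = V[:, order] / np.sqrt(h)                  # columns scaled so that h * sum(x_m**2) = 1
    return lam[order], X, t, h

lam, X, t, h = nystrom_eigen(np.minimum, 400)
print(lam[0], 4 / np.pi**2)                       # leading eigenvalue against the known value
print(np.sum(lam**2), 1 / 6)                      # sum of lam_m^2 against the double integral of k^2

# Discrete counterpart of (3.15) for a sample square-integrable function.
Xfun = t * (1 - t)
coeffs = h * (X.T @ Xfun)                         # approximates int X(t) x_m(t) dt
print(np.linalg.norm(X @ coeffs - Xfun))
```

The last check is exact up to rounding in the discrete setting because the computed eigenvectors form a complete orthonormal set for the discretised problem; it illustrates (3.15) rather than proving it.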

Furthermore, the solution of

$$\int k(s, t)x(t)\,dt - \lambda x(s) = y(s) \qquad (3.16)$$

where $y \in H$ is, by (3.12),

$$x(s) = -\frac{1}{\lambda}\,y(s) + \frac{1}{\lambda}\sum_{m=1}^{\infty} \frac{\lambda_m x_m(s)}{\lambda_m - \lambda}\int y(t)x_m^*(t)\,dt \qquad (3.17)$$

if $\lambda$ is neither zero nor an eigenvalue.

There are naturally applications in which integral operators are not square integrable, e.g. potential theory in higher dimensions. Here it may be possible to write $k = k_1 + k_2$ where $k_1$ leads to a compact self-adjoint operator on the above H whereas $k_2$ leads to a bounded self-adjoint operator (this does not force $k_2$ to be a bounded function of course). Then k gives rise to a bounded self-adjoint operator. If, further, the split can be arranged to depend on a parameter which can be chosen so that, given $\varepsilon > 0$,

$$\Bigl\|\int k_2(s, t)x(t)\,dt\Bigr\| \le \varepsilon\|x\|$$

for all $x \in H$, then k creates a compact operator by the criterion at the end of the last section. This can be done for kernels in potential theory by selecting the parameter as the radius of a sphere surrounding the singularity and taking $k_1$, $k_2$ as the kernel outside and inside the sphere respectively. Allowing the radius to tend to zero supplies the desired result. (See also §6.16.)

Next consider partial differential equations. Start with the equation

$$\nabla_n^2 u + \lambda r u = 0$$

where $\nabla_n^2$ is the Laplacian in n-dimensional space and r is a known function. It will be assumed that the equation holds on an open simply connected point set G of Euclidean space. The boundary of G will be denoted by $\partial G$ and $\bar G$ will be used to denote the union of the sets G and $\partial G$. A basic assumption will be that G is such that the divergence (or Green's) theorem holds. There is no difficulty in extending results to regions which can be split up into a finite number of regions of the assumed type.

The function r will be taken to be real and positive on $\bar G$. In addition, r and its first partial derivatives will be supposed to be continuous on $\bar G$. The Hilbert space H will be chosen real and consists of those real functions u such that

$$\int_G |u(t)|^2 r(t)\,dt < \infty;$$

the inner product is

$$\langle u, v\rangle = \int_G u(t)v(t)r(t)\,dt,$$

the symbol $(\cdot\,, \cdot)$ being reserved for the case $r \equiv 1$. Now specify the operator T by

$$Tu = -\frac{1}{r}\nabla_n^2 u$$

with the domain D(T) of T consisting of those u such that u and its first partial derivatives are continuous on $\bar G$ while the second partial derivatives are continuous on G and $Tu \in H$. Also u is required to satisfy the boundary condition $u = 0$ on $\partial G$. Observe that if the second partial derivatives had been assumed continuous on $\bar G$ the condition $Tu \in H$ would have been automatically fulfilled. From the divergence theorem

$$-\int_G u\nabla_n^2 v\,dt = \int_G \operatorname{grad} u \cdot \operatorname{grad} v\,dt - \int_{\partial G} u\,\mathbf{n}\cdot\operatorname{grad} v\,dS \qquad (3.18)$$

where $\mathbf{n}$ is the unit outward normal to the boundary of G. Hence

$$\int_G (v\nabla_n^2 u - u\nabla_n^2 v)\,dt = \int_{\partial G} \mathbf{n}\cdot(v\operatorname{grad} u - u\operatorname{grad} v)\,dS. \qquad (3.19)$$

If u and v are both in D(T) the right-hand side of (3.19) vanishes because of the boundary condition on $\partial G$ and then (3.19) can be expressed as

$$\langle Tu, v\rangle = \langle u, Tv\rangle. \qquad (3.20)$$
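The symmetry (3.20) with respect to the weighted inner product can be seen in a one-dimensional finite-difference model. This is a sketch under assumptions that are not in the text: G is the interval (0, 1), the Laplacian is replaced by the standard second difference with $u = 0$ at both ends, and the weight $r(t) = 1 + t^2$ and the function names are chosen only for illustration.

```python
import numpy as np

def weighted_laplacian(r_values, h):
    """Matrix of Tu = -(1/r) u'' on interior nodes with u = 0 at both end points."""
    n = r_values.size
    second_diff = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
                   + np.diag(np.ones(n - 1), -1)) / h**2
    return -second_diff / r_values[:, None]      # row i is divided by r(t_i)

def weighted_inner(u, v, r_values, h):
    """Discrete version of <u, v> = int u v r dt."""
    return h * np.sum(u * v * r_values)

n = 50
h = 1.0 / (n + 1)
t = h * np.arange(1, n + 1)
r_values = 1.0 + t**2
T = weighted_laplacian(r_values, h)
rng = np.random.default_rng(3)
u, v = rng.standard_normal(n), rng.standard_normal(n)
print(weighted_inner(T @ u, v, r_values, h), weighted_inner(u, T @ v, r_values, h))
print(np.allclose(T, T.T))                       # False: T is not symmetric for r = 1
```

The matrix T is not symmetric in the ordinary sense when r is not constant, yet the two weighted inner products agree, which is the discrete analogue of (3.20).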

Since D(T) is dense in H it follows that T is symmetric on D(T). The same conclusion can be drawn if the boundary condition associated with the operator T is changed to $\partial u/\partial n = 0$ on $\partial G$ or to $\partial u/\partial n + \sigma u = 0$ on $\partial G$ with $\sigma$ a given function continuous on $\partial G$. For, in both cases, the right-hand side of (3.19) vanishes when $u, v \in D(T)$ and (3.20) results. Actually, more can be deduced. By putting $v = u$ in (3.18), we obtain

$$-\int_G u\nabla_n^2 u\,dt = \int_G \operatorname{grad}^2 u\,dt - \int_{\partial G} u\frac{\partial u}{\partial n}\,dS.$$

It is therefore evident that $\langle u, Tu\rangle \ge 0$, if $u \in D(T)$, for all three boundary conditions provided that $\sigma \ge 0$. In fact, $\langle u, Tu\rangle = 0$ in these circumstances only if $\operatorname{grad} u = 0$, i.e. u = constant on G. Therefore, if $u = 0$ on $\partial G$ or if $\sigma > 0$ somewhere on $\partial G$, the only possibility is that u is identically zero on G. For the Neumann condition $\partial u/\partial n = 0$ the contingency that u is a non-zero constant cannot be excluded.

Restricting attention again to the Dirichlet boundary condition $u = 0$ we have just seen that $Tu = 0$ implies $u \equiv 0$. Hence $T^{-1}$ exists. An explicit expression can be given for it in terms of the Green's function $G(s, t)$ for Laplace's operator which vanishes on the boundary. Thus

$$v = -\nabla_n^2 \int_G G(s, t)v(t)\,dt \qquad (3.21)$$

for those v which, together with their first derivatives, are continuous on $\bar G$. Let

$$k(s, t) = G(s, t)\{r(s)r(t)\}^{1/2}$$

and consider the eigenvalue problem

$$\int_G k(s, t)v(t)\,dt = \kappa v(s). \qquad (3.22)$$

Note that, if $n \le 3$, k is square integrable and, since $G(s, t) = G(t, s)$, the integral operator on the left is compact and self-adjoint. If $n > 3$ the same inference may be drawn by the device in the paragraph following (3.17). Thus the theorems of the preceding section are available. Also, since $G(s, t) = 0$ for $s \in \partial G$, the eigenvectors vanish on the boundary. Let $\kappa_m$, $v_m$ be a typical eigenvalue and eigenvector respectively, and put $w_m = v_m/r^{1/2}$. Then, from (3.21) and (3.22),

$$\nabla_n^2 w_m + \frac{1}{\kappa_m}\,r w_m = 0$$

so that $w_m$ and $1/\kappa_m$ are the eigenvector and corresponding eigenvalue of the partial differential equation. Further

$$\kappa_1 = \max \int_G v(s)\int_G k(s, t)v(t)\,dt\,ds \Big/ \int_G |v(t)|^2\,dt,$$

the maximum being taken over all permissible v, and we know that $\kappa_1$ and all eigenvalues must be positive since $\langle u, Tu\rangle > 0$ for all $u \neq 0$. Put

$$r^{1/2}u = \int_G k(s, t)v(t)\,dt.$$

Then the eigenvalues $\lambda_1, \lambda_2, \ldots$ of the partial differential equation are such that $0 < \lambda_1 \le \lambda_2 \le \cdots$ and

$$\frac{1}{\lambda_1} = \max \frac{-\int_G u\nabla_n^2 u\,dt}{\int_G r^{-1}(\nabla_n^2 u)^2\,dt} = \max \frac{\langle u, Tu\rangle}{\|Tu\|^2} \qquad (3.23)$$

the maximum being taken over functions u which vanish on $\partial G$. Alternatively, we could adopt (3.13) for $\kappa_1$ since all the eigenvalues are positive and then

$$\lambda_1 = \min \frac{-\int_G u\nabla_n^2 u\,dt}{\int_G r u^2\,dt} = \min \frac{\langle u, Tu\rangle}{\|u\|^2} \qquad (3.24)$$

the functions in the minimizing process being subject to the boundary condition $u = 0$. Because of the boundary condition $-\int_G u\nabla_n^2 u\,dt$ in both (3.23) and (3.24) can be replaced by $\int_G \operatorname{grad}^2 u\,dt$; then the competing functions in the minimum may be chosen from the more extensive class of continuous functions on $\bar G$ (with boundary values zero) which have piecewise continuous first derivatives because this class contains functions with continuous first derivatives as a dense subset.

The $v_m$ form a complete orthonormal set with respect to the inner product $(\cdot\,, \cdot)$ because there is no solution to the integral equation when $\kappa = 0$. Now, for any $u \in H$, $(r^{1/2}u, r^{1/2}u) < \infty$ and so by Theorem 3.3a

$$r^{1/2}u = \sum_{m=1}^{\infty} (r^{1/2}u, v_m)v_m.$$

Hence

$$u = \sum_{m=1}^{\infty} \langle u, w_m\rangle w_m$$

showing that any function in H can be expanded in terms of the eigenvectors of the partial differential equation.

We now enquire whether a similar expansion for derivatives exists; namely

$$\operatorname{grad} u = \sum_{m=1}^{\infty} b_m \operatorname{grad} w_m$$

assuming, of course, that u possesses derivatives. Remark, firstly, that

$$\int_G \operatorname{grad} w_m \cdot \operatorname{grad} w_s\,dt = -\int_G w_m\nabla_n^2 w_s\,dt = \lambda_s\langle w_m, w_s\rangle = \lambda_s\delta_{ms}$$

because the eigenvectors $w_m$ are zero on $\partial G$. This suggests

$$\lambda_m b_m = \int_G \operatorname{grad} u \cdot \operatorname{grad} w_m\,dt \qquad (3.25)$$

there being no difficulty about division by $\lambda_m$ since it is never zero. The series then converges in $L^2$ norm if $\int_G \operatorname{grad}^2 u\,dt < \infty$ and so the expansion can fail only if there is a non-zero u for which all $b_m$ vanish. When this happens (3.18) gives, provided that $\nabla_n^2 u \in H$,

$$\int_G w_m\nabla_n^2 u\,dt = 0$$

since $w_m = 0$ on $\partial G$. Hence $\nabla_n^2 u$ is orthogonal to all $w_m$ and must be zero on account of their being complete. Thus, for any u such that $\nabla_n^2 u \in H$

$$\operatorname{grad} u = \operatorname{curl}\mathbf{A} + \sum_{m=1}^{\infty} b_m \operatorname{grad} w_m \qquad (3.26)$$

where $\mathbf{A}$ is arbitrary and $b_m$ is given by (3.25). Thus the $\operatorname{grad} w_m$ are orthogonal but not complete.

Impose now the extra condition that $u = 0$ on $\partial G$. Then (3.25) supplies $b_m = \langle u, w_m\rangle$ so that $b_m$ can be zero for all m only if u vanishes identically. Accordingly, if $\int_G \operatorname{grad}^2 u\,dt < \infty$ and $u = 0$ on $\partial G$ then

$$\operatorname{grad} u = \sum_{m=1}^{\infty} \langle u, w_m\rangle \operatorname{grad} w_m. \qquad (3.27)$$
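The variational characterisation (3.24) and the role of the substitution $w_m = v_m/r^{1/2}$ can be tried out on a one-dimensional model. The sketch below is an illustration under assumptions that do not come from the text: G is the interval (0, 1) with $u = 0$ at both ends, derivatives are replaced by finite differences, and the names `dirichlet_eigenpairs` and `rayleigh_quotient` are hypothetical. For $r \equiv 1$ the exact first eigenvalue is $\pi^2$.

```python
import numpy as np

def dirichlet_eigenpairs(r, n):
    """Finite-difference eigenpairs of u'' + lam * r * u = 0 on (0, 1) with u(0) = u(1) = 0."""
    h = 1.0 / (n + 1)
    t = h * np.arange(1, n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # discrete -d^2/dt^2
    s = 1.0 / np.sqrt(r(t))
    # Symmetrise with r^(1/2), in the same spirit as the kernel G(s, t){r(s) r(t)}^(1/2).
    lam, V = np.linalg.eigh(s[:, None] * A * s[None, :])
    W = s[:, None] * V / np.sqrt(h)          # w_m = v_m / r^(1/2), scaled so that <w_m, w_m> = 1
    return lam, W, t, h

def rayleigh_quotient(u, r_values, h):
    """(int grad^2 u dt) / (int r u^2 dt) for u on interior nodes, zero on the boundary."""
    padded = np.concatenate(([0.0], u, [0.0]))
    grad_sq = np.sum(np.diff(padded) ** 2) / h
    return grad_sq / (h * np.sum(r_values * u**2))

lam, W, t, h = dirichlet_eigenpairs(lambda t: np.ones_like(t), 199)
trial = t * (1.0 - t)                        # admissible: continuous and zero at both ends
print(lam[0], np.pi**2)                      # first eigenvalue against the exact value
print(rayleigh_quotient(trial, np.ones_like(t), h))   # an upper bound for lam[0]
```

Any admissible trial function gives a Rayleigh quotient that is not less than the first eigenvalue; the parabola used here overestimates it only slightly.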

When the boundary condition for the partial differential equation involves the normal derivative the direct application of the preceding technique encounters a difficulty because the existence of $T^{-1}$ cannot be asserted unless $\sigma > 0$. A simple way around this is to redefine T by

$$Tu = -\frac{1}{r}(\nabla_n^2 u - \alpha r u)$$

where $\alpha$ is a positive constant whose value will be specified presently. The partial differential equation remains unaltered if $\lambda$ is replaced by $\lambda + \alpha$. Clearly, the domain and symmetry of T are unaffected by this change but

$$\langle u, Tu\rangle = \int_G (\operatorname{grad}^2 u + \alpha r u^2)\,dt + \int_{\partial G} \sigma u^2\,dS.$$

Evidently $\langle u, Tu\rangle > 0$ if $\sigma \ge 0$; it will now be shown that, even if $\sigma$ is not restricted in sign, $\alpha$ can be picked to make the inner product positive. Let $|\sigma| \le \sigma_0$ on $\partial G$. Let $\mathbf{h}$ be a function on $\bar G$ such that $\mathbf{h} = \mathbf{n}$ on $\partial G$. There are such $\mathbf{h}$, continuous with continuous first derivatives on $\bar G$, if the boundary $\partial G$ is sufficiently smooth. Then

$$\Bigl|\int_{\partial G} \sigma u^2\,dS\Bigr| \le \sigma_0\int_{\partial G} u^2\,dS = \sigma_0\int_{\partial G} \mathbf{n}\cdot\mathbf{h}\,u^2\,dS = \sigma_0\int_G (2u\,\mathbf{h}\cdot\operatorname{grad} u + u^2 \operatorname{div}\mathbf{h})\,dt.$$

Let $\alpha_1$, $\alpha_2$ be bounds for $|\mathbf{h}|$ and $\operatorname{div}\mathbf{h}$ on the region. Then

$$2|u\,\mathbf{h}\cdot\operatorname{grad} u| \le 2\alpha_1|u|\,|\operatorname{grad} u| \le \alpha_1(\eta\operatorname{grad}^2 u + u^2/\eta)$$

for every $\eta > 0$. Hence

$$\Bigl|\int_{\partial G} \sigma u^2\,dS\Bigr| \le \sigma_0\int_G \{\alpha_1(\eta\operatorname{grad}^2 u + u^2/\eta) + 2\alpha_2 u^2\}\,dt$$

and

$$\langle u, Tu\rangle \ge \int_G \{(1 - \sigma_0\alpha_1\eta)\operatorname{grad}^2 u + (\alpha r - 2\sigma_0\alpha_2 - \sigma_0\alpha_1/\eta)u^2\}\,dt.$$

First make $\eta$ small enough for $\sigma_0\alpha_1\eta < 1$ and then select $\alpha$ large enough for $\alpha r > 2\sigma_0\alpha_2 + \sigma_0\alpha_1/\eta$ (possible since r is bounded away from zero). In this way, the right-hand side is positive and the desired result has been achieved.

Having ensured the existence of the inverse of T (cf. (3.10)) we can repeat our analysis. For example, corresponding to (3.23) and (3.24) we have

$$\frac{1}{\lambda_1 + \alpha} = \max \frac{-\int_G u(\nabla_n^2 u - \alpha r u)\,dt}{\int_G r^{-1}(\nabla_n^2 u - \alpha r u)^2\,dt}$$

and

$$\lambda_1 = \min \frac{-\int_G u\nabla_n^2 u\,dt}{\int_G r u^2\,dt}.$$

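The permissible u in these formulae are required to satisfy the boundary condition $\partial u/\partial n + \sigma u = 0$ on $\partial G$. Therefore, the upper integral in the last formula can be converted to

$$\int_G \operatorname{grad}^2 u\,dt + \int_{\partial G} \sigma u^2\,dS.$$

To enlarge the class of competing functions let

$$\mu_1 = \min \Bigl(\int_G \operatorname{grad}^2 u\,dt + \int_{\partial G} \sigma u^2\,dS\Bigr) \Big/ \int_G r u^2\,dt$$

when no boundary condition is imposed on u. In particular, put $u = w_1 + \varepsilon f$ where $w_1$ gives the minimum, f has piecewise continuous first derivatives, and $\varepsilon$ is an arbitrary constant. Then,

$$\int_G \operatorname{grad}^2 u\,dt + \int_{\partial G} \sigma u^2\,dS - \mu_1\int_G r u^2\,dt \ge 0$$

implies that

$$2\varepsilon\Bigl(\int_G \operatorname{grad} w_1 \cdot \operatorname{grad} f\,dt + \int_{\partial G} \sigma w_1 f\,dS - \mu_1\int_G r w_1 f\,dt\Bigr) + \varepsilon^2\Bigl\{\int_G \operatorname{grad}^2 f\,dt + \int_{\partial G} \sigma f^2\,dS - \mu_1\int_G r f^2\,dt\Bigr\} \ge 0.$$

By making $\varepsilon$ small enough the term in $\varepsilon^2$ will be negligible in comparison with that in $\varepsilon$ and, by an appropriate choice of the sign of $\varepsilon$, the inequality is violated unless

$$\int_G \operatorname{grad} w_1 \cdot \operatorname{grad} f\,dt + \int_{\partial G} \sigma w_1 f\,dS - \mu_1\int_G r w_1 f\,dt = 0.$$

Applying the divergence theorem we obtain

$$\int_{\partial G} f\Bigl(\frac{\partial w_1}{\partial n} + \sigma w_1\Bigr)dS - \int_G f(\nabla_n^2 w_1 + \mu_1 r w_1)\,dt = 0.$$

Since f is at our disposal we may elect to have it vanish on $\partial G$ and since such f are dense in H we deduce that

$$\nabla_n^2 w_1 + \mu_1 r w_1 = 0.$$

Once that has been established f may be allowed to take arbitrary values on $\partial G$ so that we must have

$$\frac{\partial w_1}{\partial n} + \sigma w_1 = 0 \qquad (3.28)$$

on $\partial G$. Thus $w_1$ must be the first eigenvector of our original problem and $\mu_1$ can be identified with $\lambda_1$. It has therefore been demonstrated that

$$\lambda_1 = \min \Bigl(\int_G \operatorname{grad}^2 u\,dt + \int_{\partial G} \sigma u^2\,dS\Bigr) \Big/ \int_G r u^2\,dt \qquad (3.29)$$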

subject only to u being continuous with piecewise continuous first derivatives. Higher eigenvalues can be found by minimizing the same expression provided that the u are orthogonal to the earlier eigenvectors. Owing to the fact that no boundary condition is imposed on the trial functions u in (3.29) the condition (3.28) is known as a natural boundary condition.

The eigenvectors for the boundary condition (3.28) form a complete orthonormal set so that the standard expansion for any $u \in H$ holds. For the derivatives there are some slight differences. Consider, firstly, the Neumann problem so that $\sigma$ is identically zero. The orthogonal properties of the $\operatorname{grad} w_m$ are synonymous with those for the Dirichlet problem and (3.25) is still true, though $b_1$ is zero because $\lambda_1 = 0$ and $w_1$ is a constant. On account of $\partial w_m/\partial n$ being zero on $\partial G$ we always have $b_m = \langle u, w_m\rangle$ $(m \neq 1)$ so that the vanishing of the $b_m$ entails u being constant. Consequently, for the Neumann eigenvectors,

$$\operatorname{grad} u = \sum_{m=2}^{\infty} \langle u, w_m\rangle \operatorname{grad} w_m$$

so long as $\operatorname{grad} u$ is square integrable. If $\sigma$ is non-zero, the property of orthogonality now takes the shape

$$\int_G \operatorname{grad} w_m \cdot \operatorname{grad} w_s\,dt + \int_{\partial G} \sigma w_m w_s\,dS = \lambda_s\delta_{ms}$$

and the resultant formula for $b_m$ is

$$\lambda_m b_m = \int_G \operatorname{grad} u \cdot \operatorname{grad} w_m\,dt - \int_{\partial G} \frac{\partial u}{\partial n}\,w_m\,dS.$$

Apart from the case when $\lambda_m = 0$ for some m, $b_m$ then being indeterminate, expansion (3.26) holds in general and (3.27) holds if $\partial u/\partial n + \sigma u = 0$ on $\partial G$.

The above theory can be generalized to the partial differential equation

$$\sum_{j=1}^{n}\sum_{k=1}^{n} \frac{\partial}{\partial x_j}\Bigl(p_{jk}\frac{\partial u}{\partial x_k}\Bigr) + (\lambda r - q)u = 0 \qquad (3.30)$$

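where $p_{jk}$, q, and r are real functions with $p_{jk} = p_{kj}$. In addition, q is required to be continuous on $\bar G$, r is positive continuous on $\bar G$, and the $p_{jk}$, together with their first derivatives, are to be continuous on $\bar G$. Moreover, $p_{jk}$ must satisfy

$$\sum_{j=1}^{n}\sum_{k=1}^{n} p_{jk}\xi_j\xi_k \ge C\sum_{j=1}^{n} \xi_j^2$$

at every point of G, C being a fixed positive number and the $\xi_j$ arbitrary real numbers. If $p_{jk} = p\,\delta_{jk}$ for all j and k the differential operator can be transformed into the Laplacian by the substitution $u_1 = u\sqrt{p}$ provided that p has suitable differentiability.

The analogue of (3.18) for (3.30) can be expressed as

$$-\int_G u\Bigl\{\sum_{j=1}^{n}\sum_{k=1}^{n} \frac{\partial}{\partial x_j}\Bigl(p_{jk}\frac{\partial v}{\partial x_k}\Bigr) - qv\Bigr\}dt = \int_G \Bigl\{\sum_{j=1}^{n}\sum_{k=1}^{n} p_{jk}\frac{\partial u}{\partial x_j}\frac{\partial v}{\partial x_k} + quv\Bigr\}dt - \int_{\partial G} u\sum_{j=1}^{n}\sum_{k=1}^{n} n_j p_{jk}\frac{\partial v}{\partial x_k}\,dS \qquad (3.31)$$

where $n_1, n_2, \ldots$ are the components of $\mathbf{n}$. The normal boundary condition now takes the form

$$\sum_{j=1}^{n}\sum_{k=1}^{n} n_j p_{jk}\frac{\partial u}{\partial x_k} + \sigma u = 0.$$

By putting

$$Tu = -\frac{1}{r}\Bigl\{\sum_{j=1}^{n}\sum_{k=1}^{n} \frac{\partial}{\partial x_j}\Bigl(p_{jk}\frac{\partial u}{\partial x_k}\Bigr) - qu - \alpha r u\Bigr\}$$

it can be shown that $\alpha$ can be chosen sufficiently positive for $\langle u, Tu\rangle$ to be positive if u is non-zero.

Exercises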

11. The sequence $\{f_n\}$ is defined by $f_n = g + \lambda Tg + \lambda^2T^2g + \cdots + \lambda^nT^ng$ with $g \in H$. If $|\lambda|\,\|T\| < 1$, show that $\{f_n\}$ converges to an element f of H which satisfies $f - \lambda Tf = g$.

12. C is compact and self-adjoint. Show that $C^2, C^3, \ldots$ also have these properties and that $\lambda_m^2, \lambda_m^3, \ldots$ respectively are their eigenvalues.

13. The kernel k is such that $k(s, t) = k^*(t, s)$ and

$$\iint |k(s, t)|^2\,ds\,dt < \infty.$$

Let

$$k_m(s, t) = \int k(s, u)k_{m-1}(u, t)\,du \qquad (m \ge 2)$$

and $k_1 = k$; define $A_m = \int k_m(t, t)\,dt$. Prove that $A_p = \sum_{m=1}^{\infty} \lambda_m^p$. Deduce that $\lambda_1^2 \ge A_{2p+2}/A_{2p}$ and that $\lambda_1^{2p} \le A_{2p}/m_1$ where $m_1$ is the multiplicity of $\lambda_1$. Prove also that

$$\lambda_1^2\lambda_2^2 \ge (A_{2p+2}^2 - A_{4p+4})/(A_{2p}^2 - A_{4p})$$

and, if $m_1 = 1$,

$$\lambda_2^{2p} \le (A_{2p}^2 - A_{4p})/2\lambda_1^{2p}m_2$$

where $m_2$ is the multiplicity of $\lambda_2$. These inequalities supply bounds for the first two eigenvalues of the kernel.

3.5 The cavity resonator

In the absence of charges and currents, Maxwell's equations for periodic phenomena are

$$\operatorname{curl}\mathbf{E} + i\omega\mu\mathbf{H} = 0, \qquad \operatorname{div}\mathbf{E} = 0, \qquad (3.32)$$

$$\operatorname{curl}\mathbf{H} - i\omega\varepsilon\mathbf{E} = 0, \qquad \operatorname{div}\mathbf{H} = 0 \qquad (3.33)$$

when $\mu$ and $\varepsilon$ are constant. If the oscillations occur in a perfectly conducting cavity, the condition

$$\mathbf{n} \wedge \mathbf{E} = 0 \qquad (3.34)$$

must be satisfied on the bounding surface. Thus, the problem of determining the eigenvalues $\omega$ for which oscillations are possible has to be faced. The natural attack, in view of the preceding section, would be to employ Green's tensors. However, the analysis is bedevilled by the singularities of Green's tensors, which are constructed from electric and magnetic dipoles, because they render some integration awkward. For this reason, a somewhat indirect technique has to be adopted. (For an alternative approach see Jones (1964).)

At first sight it appears that the equations involving the divergence are superfluous since they can be deduced from the other pair if $\omega \neq 0$. Unfortunately, we cannot assume a priori that $\omega$ is non-zero. If $\omega = 0$, a possible solution is $\mathbf{E} = \operatorname{grad}\psi$ and, if the divergence equation were not present, there are many $\psi$ which would permit the satisfaction of (3.34). With the divergence equation in (3.32), $\psi$ must be a solution of Laplace's equation and then (3.34) is satisfied if $\psi$ is constant on $\partial G$. If G is simply connected, $\psi$ must be constant in G and $\mathbf{E} = 0$. Therefore our consideration will be limited to simply connected G.

When $\omega = 0$, equations (3.33) are uncoupled from equations (3.32) and imply that $\mathbf{H} = \operatorname{grad}\psi$ where $\psi$ is a solution of Laplace's equation. Nothing further can be said about $\psi$, even when G is simply connected, without a boundary condition for $\mathbf{H}$. Now, when $\omega \neq 0$, take an arbitrary small curve C on $\partial G$; then, from (3.34) and Stokes's theorem,

$$0 = \oint_C \mathbf{E}\cdot d\mathbf{s} = \int_S \mathbf{n}\cdot\operatorname{curl}\mathbf{E}\,dS = -i\omega\mu\int_S \mathbf{n}\cdot\mathbf{H}\,dS$$

where S is the smaller part of $\partial G$ enclosed by C. Since C and S are arbitrary, it follows that $\mathbf{n}\cdot\mathbf{H} = 0$ on $\partial G$ when $\omega \neq 0$. If this boundary condition is imposed also when $\omega = 0$, the consequence is that $\psi$ = constant when G is simply connected with the result that $\mathbf{H} = 0$. Accordingly, it will be assumed from now on that G is simply connected and $\omega \neq 0$.

The intricate analysis that enables compact operator theory to be invoked via Green's tensors will not be given (for full details see Muller and Niemeyer 1961); instead the consequences of that investigation will be described. Firstly, since $\omega \neq 0$, substitution from (3.33) permits (3.32) to be rewritten

$$\operatorname{curl}\operatorname{curl}\mathbf{E} = k^2\mathbf{E}, \qquad \operatorname{div}\mathbf{E} = 0 \qquad (3.35)$$

where $k^2 = \omega^2\mu\varepsilon$. In fact, the second equation of (3.35) is forced by the first because $k \neq 0$. The eigenvalue problem becomes one of solving (3.35) subject to (3.34). Then $\mathbf{H}$ is found from (3.32). Now, by (3.35) and the divergence theorem,

$$k^2\int_G \mathbf{E}^*\cdot\mathbf{E}\,dt = \int_G \mathbf{E}^*\cdot\operatorname{curl}\operatorname{curl}\mathbf{E}\,dt = \int_G \operatorname{curl}\mathbf{E}^*\cdot\operatorname{curl}\mathbf{E}\,dt$$

on account of (3.34). Hence $k^2$ is positive and the corresponding $\omega$ is real. It follows that, in an eigenoscillation, $\mathbf{E}$ may be presumed to be real and $\mathbf{H}$ purely imaginary. There is no loss of generality in taking $\omega$ to be positive since changing the sign of $\omega$ merely reverses the sign of $\mathbf{H}$, i.e. converts the electromagnetic field into its complex conjugate.

The theory establishes that there is a (real) set $\{\mathbf{E}_m\}$ such that

$$\operatorname{curl}\operatorname{curl}\mathbf{E}_m = k_m^2\mathbf{E}_m \qquad (3.36)$$

with $\mathbf{n} \wedge \mathbf{E}_m = 0$ on $\partial G$ and $k_m^2 = \omega_m^2\mu\varepsilon > 0$. Since

$$k_m^2\int_G \mathbf{E}_n\cdot\mathbf{E}_m\,dt = \int_G \mathbf{E}_n\cdot\operatorname{curl}\operatorname{curl}\mathbf{E}_m\,dt = \int_G \operatorname{curl}\mathbf{E}_n\cdot\operatorname{curl}\mathbf{E}_m\,dt = k_n^2\int_G \mathbf{E}_n\cdot\mathbf{E}_m\,dt \qquad (3.37)$$

it can be seen that $\{\mathbf{E}_m\}$ can be arranged as an orthonormal set, i.e.

$$\int_G \mathbf{E}_m\cdot\mathbf{E}_n\,dt = \delta_{mn}. \qquad (3.38)$$

This orthonormal set is complete.

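An immediate deduction from (3.37) and (3.38) is that

$$\int_G \operatorname{curl}\mathbf{E}_m\cdot\operatorname{curl}\mathbf{E}_n\,dt = k_m^2\delta_{mn}. \qquad (3.39)$$

Let $\mathbf{e}$ be such that $\operatorname{div}\mathbf{e} = 0$ in G and

$$\int_G \mathbf{e}\cdot\mathbf{e}^*\,dt < \infty.$$

Then, from our general theory, $\mathbf{e}$ can be expanded in the form

$$\mathbf{e} = \sum_{m=1}^{\infty} \mathbf{E}_m\int_G \mathbf{e}\cdot\mathbf{E}_m\,dt. \qquad (3.40)$$

With regard to derivatives, a procedure similar to that for (3.27) gives, by virtue of (3.39),

$$\operatorname{curl}\mathbf{e} = \operatorname{grad}\chi + \sum_{m=1}^{\infty} \frac{1}{k_m^2}\operatorname{curl}\mathbf{E}_m\int_G \operatorname{curl}\mathbf{e}\cdot\operatorname{curl}\mathbf{E}_m\,dt \qquad (3.41)$$

where the scalar $\chi$ satisfies Laplace's equation. If $\mathbf{n} \wedge \mathbf{e} = 0$ on $\partial G$ the term in $\chi$ disappears from (3.41) and the coefficient of $\operatorname{curl}\mathbf{E}_m$ in (3.41) is the same as that of $\mathbf{E}_m$ in (3.40).

If $\mathbf{e}$ is an electric field satisfying (3.32) and (3.33), the associated magnetic intensity $\mathbf{h}$ can be found from (3.32), i.e. via $i\omega\mu\mathbf{h} = -\operatorname{curl}\mathbf{e}$. Invoking (3.41) we have

$$\mathbf{h} = \operatorname{grad}\chi_1 + \sum_{m=1}^{\infty} \frac{1}{k_m^2}\operatorname{curl}\mathbf{E}_m\int_G \mathbf{h}\cdot\operatorname{curl}\mathbf{E}_m\,dt \qquad (3.42)$$

where $\chi_1$ is of the same nature as $\chi$. In general the equation $\operatorname{div}\mathbf{B} = 0$ implies that there is a vector $\mathbf{A}$ such that $\mathbf{B} = \operatorname{curl}\mathbf{A}$. By applying (3.41) to $\operatorname{curl}\mathbf{A}$ we deduce that

$$\mathbf{B} = \operatorname{grad}\chi_2 + \sum_{m=1}^{\infty} \frac{1}{k_m^2}\operatorname{curl}\mathbf{E}_m\int_G \mathbf{B}\cdot\operatorname{curl}\mathbf{E}_m\,dt \qquad (3.43)$$

when $\operatorname{div}\mathbf{B} = 0$.

Analogous to (3.24) is, for a simply connected G,

$$k_1^2 = \min \frac{\int_G (\operatorname{curl}\mathbf{u})^2\,dt}{\int_G \mathbf{u}^2\,dt} \qquad (3.44)$$

where the competing vectors $\mathbf{u}$ are solutions of $\operatorname{div}\mathbf{u} = 0$ subject to $\mathbf{n} \wedge \mathbf{u} = 0$ on $\partial G$.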


UNBOUNDED OPERATORS AND EIGENVALUES

3.6 Unbounded operators

The analysis of self-adjoint operators which are not compact is substantially more difficult than that for compact operators. Perhaps the easiest to deal with and of physical importance are the positive operators, a self-adjoint T being called positive if $(Tx, x) \ge 0$ for all $x \in H$. In fact, the theory covers any self-adjoint operator such that $(Tx, x) \ge a(x, x)$ for some real number a and all $x \in H$. For, if a is negative, define $T_1 = T + (1 - a)I$ and then $T_1$ is a positive operator. Examples of such operators have already been provided by the partial differential operators in the preceding two sections. There we found, by conversion to integral equations, that they enjoy properties similar to compact operators but this is not generally true for positive operators.

It is obvious that $T^2$ is positive for any self-adjoint $T \neq 0$; indeed $TT^A$ and $T^AT$ are positive for every linear operator $T \neq 0$ with an adjoint. Moreover, if $T_1$ and $T_2$ are bounded positive self-adjoint operators and $T_1T_2 = T_2T_1$, then $T_1T_2$ is positive as will be seen shortly.

A bounded positive operator possesses a positive square root, i.e. a positive operator S such that $S^2 = T$. By dividing by $\|T\|$ if necessary we can ensure that the norm of the operator does not exceed unity. The square root can be defined by an iterative process which resembles somewhat Newton's method for the square root of a number. The sequence is specified by

$$S_{m+1} = S_m + \tfrac{1}{2}(T - S_m^2) \qquad (3.45)$$

with $S_0 = 0$. Thus $S_1 = T/2$, $S_2 = T - T^2/8$ and, generally, $S_m$ is a polynomial of degree $2^{m-1}$ in T. It can be proved that, as $m \to \infty$, $S_m$ tends to a positive operator S and clearly, from (3.45), $S^2 = T$. Further, it can be shown that there is no other positive square root. Since $S_m$ is a polynomial in T, $S_mT = TS_m$ and consequently $ST = TS$. For the same reason, if $BT = TB$ then $SB = BS$. Thus, if $T_1$, $T_2$ are bounded self-adjoint positive operators with $T_1T_2 = T_2T_1$ and $S_2^2 = T_2$ we can write $T_1T_2 = S_2T_1S_2$ from which it is obvious that $T_1T_2$ is a positive operator.

Returning to the general self-adjoint operator T we say that $\lambda$ is a point of the spectrum if $T - \lambda I$ does not have a bounded inverse. Obviously, any eigenvalues of T belong to the spectrum but the spectrum need not consist solely of eigenvalues. The eigenvalues constitute the point spectrum while the remaining points of the spectrum are said to form the continuous spectrum. Only real values of $\lambda$ lie in the spectrum. For, if $\lambda = \alpha + i\beta$ with $\alpha$, $\beta$ real and $\beta \neq 0$,

$$(x, (T - \lambda)x) - ((T - \lambda)x, x) = (\lambda - \lambda^*)(x, x) = 2i\beta\|x\|^2.$$

Hence

$$2|\beta|\,\|x\| \le 2\|(T - \lambda)x\|$$

which, by (3.10), is sufficient to ensure that $T - \lambda I$ has a bounded inverse.

For the moment, devote attention to compact operators for which $Cx = 0$ implies $x = 0$ and Theorem 3.3a is applicable. Assume, for simplicity, that the eigenvalues are such that $\lambda_1 > \lambda_2 > \cdots$. Define an operator $E_\lambda$, depending on the parameter $\lambda$, such that

$$E_\lambda x = \sum_{m=1}^{\infty} (x, x_m)x_m \quad (\lambda \ge \lambda_1), \qquad E_\lambda x = \sum_{m=2}^{\infty} (x, x_m)x_m \quad (\lambda_1 > \lambda \ge \lambda_2), \qquad E_\lambda x = \sum_{m=3}^{\infty} (x, x_m)x_m \quad (\lambda_2 > \lambda \ge \lambda_3)$$

and so on. It will be observed that the right-hand side does not vary with $\lambda$ when $\lambda \ge \lambda_1$. Therefore we may say that $E_\lambda$ is constant on $\lambda \ge \lambda_1$. Similarly, it is constant for $\lambda_1 > \lambda \ge \lambda_2$ or, indeed, for $\lambda_m > \lambda \ge \lambda_{m+1}$. Next, if $\lambda > \mu$,

$$(E_\lambda - E_\mu)x = \sum (x, x_m)x_m,$$

the summation being over those m for which $\lambda \ge \lambda_m > \mu$. Thus $E_\lambda - E_\mu$ is a positive operator for $\lambda > \mu$. Further, if $\lambda \ge \mu$,

$$E_\lambda E_\mu x = E_\lambda\sum (x, x_m)x_m = \sum (x, x_m)x_m = E_\mu x,$$

the summation being over those m for which $\lambda_m \le \mu$, since

$$E_\lambda x_m = x_m \quad (\text{if } \lambda_m \le \lambda), \qquad E_\lambda x_m = 0 \quad (\text{if } \lambda_m > \lambda).$$

This shows that $E_\lambda E_\mu = E_\mu$ for $\lambda \ge \mu$. Finally, from Theorem 3.3a,

$$Tx = \sum_{m=1}^{\infty} \lambda_m(E_{\lambda_m} - E_{\lambda_m - 0})x \qquad (3.46)$$

where $E_{\lambda - 0}$ means the limit of $E_{\lambda - \varepsilon}$ as the positive $\varepsilon$ tends to zero. These are the ideas that we wish to generalize for self-adjoint operators which are not compact. The generalization is achieved by means of the Stieltjes integral. The Stieltjes integral

$$\int_a^b f(\beta)\,dg(\beta)$$

is defined as the limit of

$$\sum_{k=1}^{n} f(\xi_k)\{g(\beta_k) - g(\beta_{k-1})\}$$

as $\max_k(\beta_k - \beta_{k-1}) \to 0$ where $a = \beta_0 < \beta_1 < \cdots < \beta_n = b$ and $\beta_{k-1} \le \xi_k \le \beta_k$. Clearly, there is no contribution to the integral from intervals on which g is constant. That observation allows us to write (3.46) as

$$Tx = \int_{-\infty}^{\infty} \lambda\,dE_\lambda x \qquad (3.47)$$

with $\lambda$ as the variable of integration. Actually, the limits of integration could have been chosen finite for (3.46) because $E_\lambda x$ is constant for $\lambda > \lambda_1$ and for $\lambda < 0$ but the infinite limits are more convenient for future developments.

The next step is to release $E_\lambda$ from the obligation of being constant except for discontinuous changes. Instead, $E_\lambda$ will be required to be such that $E_\lambda E_\mu = E_\mu$ for $\lambda \ge \mu$ and $E_\lambda - E_\mu$ is a positive operator for $\lambda > \mu$. These requirements are the same as properties in the compact case but with the necessity for constancy in $E_\lambda$ dropped. It is possible still to have a discontinuity at $\lambda$ where $E_{\lambda - 0} \neq E_\lambda$; such values of $\lambda$ are eigenvalues. Points where $E_\lambda$ is continuous but not constant form the continuous spectrum. On the other hand, intervals where $E_\lambda$ is constant are not in the spectrum.

For instance, let H consist of functions $h(t)$ which are of integrable square for t on the interval (0, 1). Suppose that $Th(t) = th(t)$ so that $\|T\| = 1$. Take $E_\lambda = 0$ for $\lambda < -1$, $E_\lambda = I$ for $\lambda > 1$ and otherwise

$$E_\lambda h(t) = h(t) \quad (\lambda \ge t), \qquad E_\lambda h(t) = 0 \quad (\lambda < t).$$

It can be verified that $E_\lambda$ has the requisite properties and that (3.47) holds. The verification will be left as an exercise (remember that $h(t)$ does not involve $\lambda$). In this case there is no point spectrum because $E_\lambda$ is not discontinuous; the spectrum is entirely continuous.

The formula (3.47) can be shown to be valid for any self-adjoint operator whether bounded or not. Other formulae which can be obtained as generalizations of the compact case are

$$(Tx, y) = \int \lambda\,d(E_\lambda x, y), \qquad (3.48)$$

$$\|Tx\|^2 = \int \lambda^2\,d\|E_\lambda x\|^2, \qquad (3.49)$$

$$f(T)x = \int f(\lambda)\,dE_\lambda x, \qquad (3.50)$$

$f(\lambda)$ being a continuous function of $\lambda$. These results suggest that it is reasonable to write (3.51)
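In finite dimensions both the iteration (3.45) and the spectral formula (3.50) can be carried out explicitly, and they should deliver the same positive square root. The sketch below assumes a real symmetric positive matrix with norm not exceeding unity; the function names `sqrt_by_iteration` and `function_of_operator` and the number of steps are choices made for this illustration.

```python
import numpy as np

def sqrt_by_iteration(T, steps=200):
    """Iterate S_{m+1} = S_m + (T - S_m^2)/2 from S_0 = 0; assumes T positive with norm <= 1."""
    S = np.zeros_like(T)
    for _ in range(steps):
        S = S + 0.5 * (T - S @ S)
    return S

def function_of_operator(T, f):
    """Finite-dimensional f(T) = sum_m f(lam_m) P_m, a discrete counterpart of (3.50)."""
    lam, X = np.linalg.eigh(T)
    return (X * f(lam)) @ X.conj().T          # scales column m of X by f(lam_m)

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
T = Q @ np.diag([0.2, 0.5, 0.7, 1.0]) @ Q.T   # positive, self-adjoint, norm equal to 1
S_iter = sqrt_by_iteration(T)
S_spec = function_of_operator(T, np.sqrt)
print(np.linalg.norm(S_iter - S_spec), np.linalg.norm(S_iter @ S_iter - T))
```

Convergence of the iteration is slow for eigenvalues of T close to zero, since the error at an eigenvalue $t$ shrinks by a factor of roughly $1 - \sqrt{t}$ per step; the spectral route does not have this limitation.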

Exercises

14. H consists of the functions which are square integrable on (0, 1) and $Tx(t) = tx(t)$. Show that $T^{1/2}x(t) = t^{1/2}x(t)$.

15. For the operator T of Exercise 14 verify that the $E_\lambda$ given in the text has the required properties.

16. If H consists of the functions which are square integrable on $(-\infty, \infty)$ and $Tx(t) = -i(d/dt)x(t)$, then

$$-i\frac{d}{dt} = \int_{-\infty}^{\infty} \lambda\,d(UE_\lambda U^{-1})$$

where

$$E_\lambda x(t) = x(t) \quad (t \le \lambda), \qquad E_\lambda x(t) = 0 \quad (t > \lambda)$$

and

$$Ux(t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} e^{ist}x(s)\,ds.$$

3.7 Approximation theorems

It has already been indicated in Exercise 13 how the general theory can lead to techniques which are of practical value in determining the eigenvalues of an operator. Our aim in this section is to derive two theorems which are of wide applicability. We commence by establishing

LEMMA 3.7. For real $\lambda$, $b$ such that $|\lambda - b| < m$ the self-adjoint operator $T - \lambda I$ has a bounded inverse if, and only if, $\|(T - b)x\| \ge m\|x\|$ for every $x \in H$.

Proof. It is obvious that

\[ \|(T - b)x\| \le \|(T - \lambda)x\| + |\lambda - b|\,\|x\|. \]

Therefore, if $\|(T - b)x\| \ge m\|x\|$, the inequality $|\lambda - b| < m$ ensures that $T - \lambda I$ has a bounded inverse. Conversely, if $T - \lambda I$ has a bounded inverse for $\lambda$ satisfying $|\lambda - b| < m$ the choice $\lambda = b$ shows that $T - bI$ has a bounded inverse. Assume that there is an $x$ such that $\|(T - b)x\| < m\|x\|$. Then $\|(T - bI)^{-1}\| > 1/m$. Denoting $\|(T - bI)^{-1}\|$ by $M$ there is, as in §3.3, a sequence $\{y_j\}$ with $\|y_j\| = 1$ such that

\[ ((T - bI)^{-1}y_j, y_j) \to M, \qquad \|(T - bI)^{-1}y_j\| \to M. \]

Putting $x_j = (T - bI)^{-1}y_j$ we have $\|(T - b)x_j\| = 1$ and

\[ (x_j, (T - b)x_j) \to M, \qquad \|x_j\| \to M. \]

Therefore

\[ \|(T - \lambda)x_j\|^2 = \|(T - b)x_j\|^2 + 2(b - \lambda)(x_j, (T - b)x_j) + (\lambda - b)^2\|x_j\|^2 \to \{1 + (b - \lambda)M\}^2. \]

The choice $\lambda - b = 1/M$ gives $|\lambda - b| < m$ and makes the right-hand side zero. This is impossible since $T - \lambda I$ has a bounded inverse for such a $\lambda$. On account of the contradiction the initial assumption must be invalid. Consequently, it must be true that $\|(T - b)x\| \ge m\|x\|$ for all $x$. The Lemma is proved. A useful corollary is

LEMMA 3.7a. If no point of the spectrum of $T$ lies in $(a, c)$ then

\[ ((T - aI)(T - cI)x, x) \ge 0. \]

Proof. Take $b = (a + c)/2$ and $m = (c - a)/2$. Then the absence of the spectrum from $(a, c)$ means that $T - \lambda I$ has a bounded inverse for $|\lambda - b| < m$. Therefore, Lemma 3.7 enforces

\[ \|Tx - \tfrac{1}{2}(a + c)x\|^2 \ge \tfrac{1}{4}(c - a)^2\|x\|^2 \]

which is another way of writing the inequality in the Lemma. It is now possible to demonstrate the following theorem.

THEOREM 3.7 (TEMPLE-KATO). Let $T$ be self-adjoint and $x$ a non-zero element of $H$. Suppose that $b > (x, Tx)/\|x\|^2 > a$ and that there is no point of the spectrum of $T$ in $(a, b)$ other than the isolated eigenvalue $\lambda_0$. Then

\[ \rho - \frac{\sigma^2 - \rho^2}{b - \rho} \le \lambda_0 \le \rho + \frac{\sigma^2 - \rho^2}{\rho - a} \]

where $\rho = (x, Tx)/\|x\|^2$, $\sigma = \|Tx\|/\|x\|$.

Proof. If $\varepsilon$ is sufficiently small there is no point of the spectrum in $(a, \lambda_0 - \varepsilon)$ and so, by Lemma 3.7a, $((T - aI)(T - (\lambda_0 - \varepsilon)I)x, x) \ge 0$ whence $(\lambda_0 - \varepsilon)(\rho - a) \le \sigma^2 - a\rho$. By hypothesis, $\rho > a$ and so the upper bound on $\lambda_0$ follows by letting $\varepsilon \to 0$. A similar procedure for the interval $(\lambda_0 + \varepsilon, b)$ gives the lower bound and the proof is terminated.

As an illustration of the utility of the theorem suppose that $\lambda_0$ is the lowest eigenvalue of $T$ and $\lambda_1$ is the next highest. Assume that it is known that $\lambda_1 \ge c$ and that $\lambda_0$ is the only point of the spectrum below $\lambda_1$. Then choose $a = -\infty$, $b = c$ in Theorem 3.7. For any $\rho < c$, bounds on $\lambda_0$ are provided by

\[ \rho - \frac{\sigma^2 - \rho^2}{c - \rho} \le \lambda_0 \le \rho. \]

The main difficulty in this method is obtaining an estimate of $c$, though this can be done by any means which supplies one-sided bounds. Theorem 3.7 caters for the case when the existence of an isolated eigenvalue is known and bounds on its location are required. The next theorem specifies an interval which guarantees that an eigenvalue lies within it.

THEOREM 3.7a (KRYLOV-WEINSTEIN). If $T$ is self-adjoint then there is at least one eigenvalue $\lambda_0$ such that

\[ \rho - (\sigma^2 - \rho^2)^{1/2} \le \lambda_0 \le \rho + (\sigma^2 - \rho^2)^{1/2} \]

where $\sigma$ and $\rho$ are defined in Theorem 3.7.

Proof. The proof will be given for compact $T$; the proof for general $T$ travels similar lines but using the representation (3.47). When $T$ is compact we can write $x = \sum a_m x_m$ in terms of the orthonormal set $\{x_m\}$. Hence

\[ \sigma^2 - \rho^2 = \|(T - \rho I)x\|^2/\|x\|^2 = \sum_m (\lambda_m - \rho)^2 a_m^2 \Big/ \sum_m a_m^2 \ge \min_m (\lambda_m - \rho)^2 = (\lambda_0 - \rho)^2 \]

for some $\lambda_0$. The result stated in the theorem is an immediate conclusion.
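The two theorems are easy to try out on a small symmetric matrix. The following is only a sketch under stated assumptions: the function names are hypothetical, the matrix is the one that appears in Exercise 17, the trial vector $(1, 1, 1)$ is an arbitrary choice, and the value $b = 3.6$ is taken on trust from the statement of that exercise as a lower bound for the second eigenvalue. For this matrix the lowest eigenvalue is exactly 1, since $(5, 6, 4)$ is an eigenvector, so both intervals can be checked directly.

```python
import math

def rayleigh_quantities(T, x):
    """Return rho = (x, Tx)/||x||^2 and sigma^2 = ||Tx||^2/||x||^2 for a real symmetric matrix T."""
    Tx = [sum(row[j] * x[j] for j in range(len(x))) for row in T]
    xx = sum(v * v for v in x)
    rho = sum(xi * ti for xi, ti in zip(x, Tx)) / xx
    sigma2 = sum(ti * ti for ti in Tx) / xx
    return rho, sigma2

def temple_kato(rho, sigma2, a, b):
    """Bounds of Theorem 3.7; assumes a < rho < b (a may be -infinity)."""
    spread = sigma2 - rho * rho
    return rho - spread / (b - rho), rho + spread / (rho - a)

def krylov_weinstein(rho, sigma2):
    """Interval of Theorem 3.7a; max() guards against tiny negative rounding errors."""
    radius = math.sqrt(max(sigma2 - rho * rho, 0.0))
    return rho - radius, rho + radius

# Matrix of Exercise 17 and a hypothetical trial vector.
T = [[3.4, -2.0, 0.0], [-2.0, 4.0, -2.0], [0.0, -2.0, 4.0]]
x = [1.0, 1.0, 1.0]

rho, sigma2 = rayleigh_quantities(T, x)
tk_lower, tk_upper = temple_kato(rho, sigma2, a=-math.inf, b=3.6)
kw_lower, kw_upper = krylov_weinstein(rho, sigma2)
print("Temple-Kato:", tk_lower, tk_upper)
print("Krylov-Weinstein:", kw_lower, kw_upper)
```

With this trial vector the Temple-Kato interval is noticeably narrower than the Krylov-Weinstein one, which reflects the extra information (the gap to the next eigenvalue) that Theorem 3.7 exploits.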

Exercises

17. For the matrix

\[ \begin{pmatrix} 3.4 & -2 & 0 \\ -2 & 4 & -2 \\ 0 & -2 & 4 \end{pmatrix} \]

use $\rho$ to show that an upper bound for the lowest eigenvalue is 1.14. Show that a lower bound for the next eigenvalue is 3.6 and deduce that a lower bound for the first eigenvalue is 0.96. Will the Krylov-Weinstein theorem help?

18. If $C$ is a compact self-adjoint operator, prove that

\[ \max\,(x, Cx)/(x, x) \ge \lambda_r, \]

the maximum being taken over those $x$ for which $(x, y_j) = 0$ for $j = 1, \ldots, r - 1$, the $y_j$ being some fixed elements of $H$. Deduce that

\[ \lambda_r = \min\{\max\,(x, Cx)/(x, x)\} \]

the minimum being taken over all possible $y_j$.

19. The matrix $A_1$ is formed from the Hermitian matrix $A$ by deleting the $r$th row and the $r$th column. By means of Exercise 18, prove that the eigenvalues of $A_1$ separate those of $A$. This is the basis of a method of estimating the positions of eigenvalues from those of a matrix of lower order.

3.8 Point matching

Consider the problem of finding the eigenvectors of

\[ \sum_{j=1}^{n} \frac{\partial}{\partial x_j}\left(p_j \frac{\partial u}{\partial x_j}\right) + \lambda r u = 0 \tag{3.52} \]

in a domain $G$ with a Dirichlet boundary condition on $\partial G$. While it may be possible to construct solutions of (3.52) (by separation of variables, for example) often they will not comply with the specified boundary condition. Then, an expansion of $u$ in terms of such functions cannot be employed in some of the preceding methods because the trial functions must comply with the boundary condition. However, it might be asked that the coefficients in the expansion be determined by imposing the boundary condition at a sufficient number of points. This is known as point matching and the question arises as to whether it is likely to generate accurate answers. Suppose, for example, that when $\lambda = \lambda'$ the solution $u = u'$ of (3.52) can be found. If $u'$ vanishes on $\partial G$ then an eigenvalue and eigenvector have been found. If not, write

\[ \int_G u'^2 r \, d\tau = \|u'\|^2. \]

Then the following assertion can be made (Fox, Henrici, and Moler 1967).

THEOREM 3.8. If $\eta = \max_{s \in \partial G} |u'(s)| \left(\int_G r \, d\tau\right)^{1/2} / \|u'\|$ and $\eta < 1$, there is an eigenvalue $\lambda$ such that

\[ |\lambda - \lambda'| \le |\lambda'| \, \frac{\sqrt{2}\,\eta + \eta^2}{1 - \eta^2}. \]

Proof. Let

\[ Tu = -\frac{1}{r} \sum_{j=1}^{n} \frac{\partial}{\partial x_j}\left(p_j \frac{\partial u}{\partial x_j}\right) \]

and define

\[ (u, v) = \int_G r u v \, d\tau, \]

only real functions being involved. Then $T$ is self-adjoint. Let $w$ be such that $Tw = 0$ and $w = u'$ on $\partial G$. Now, from $v = u' - w$, calculate $\rho$ and $\sigma$ in Theorem 3.7a. It will be discovered that

\[ \rho = \frac{\{\|u'\|^2 - (w, u')\}\lambda'}{\|u'\|^2 - 2(w, u') + \|w\|^2}, \qquad \sigma^2 = \frac{\lambda'^2\|u'\|^2}{\|u'\|^2 - 2(w, u') + \|w\|^2}. \]

Hence, by the Krylov-Weinstein theorem, there is an eigenvalue $\lambda$ such that $\lambda - \lambda'$ lies between the bounds

\[ \frac{\{(w, u') - \|w\|^2\}\lambda' \pm |\lambda'|\{\|w\|^2\|u'\|^2 - (w, u')^2\}^{1/2}}{\|u'\|^2 - 2(w, u') + \|w\|^2}. \]

In the estimation of bounds it may be noted that

\[ \|w\| \le \left(\max_{G} |w|\right)\left(\int_G r \, d\tau\right)^{1/2} \le \left(\max_{\partial G} |u'|\right)\left(\int_G r \, d\tau\right)^{1/2} \]

by the maximum property of $T$. Hence

\[ \|w\| \le \eta\|u'\| < \|u'\| \tag{3.53} \]

since $\eta < 1$ by hypothesis. To simplify the notation put

\[ U = \|u'\|, \qquad W = \|w\|, \qquad Z = (w, u'), \qquad V = (U^2W^2 - Z^2)^{1/2}. \]

Then a bound for

\[ (|Z - W^2| + V)/(U^2 + W^2 - 2Z) \]

is required subject to $W < U$ on account of (3.53). Since

\[ V^2 = (\sqrt{2}\,WU + Z)^2 - (WU + \sqrt{2}\,Z)^2 \tag{3.54} \]

\[ \phantom{V^2} = (\sqrt{2}\,WU - Z)^2 - (WU - \sqrt{2}\,Z)^2 \tag{3.55} \]

it is evident that

\[ V < \sqrt{2}\,WU - |Z| \tag{3.56} \]

unless $Z = UW/\sqrt{2}$ when $V = Z$. It follows from (3.56) that, when $W^2 \ge Z$,

\[ \frac{|Z - W^2| + V}{U^2 + W^2 - 2Z} \le \frac{W^2 + \sqrt{2}\,WU}{U^2 - W^2}. \tag{3.57} \]

To confirm that (3.57) continues to hold when $W^2 < Z$, observe that

\[ (\sqrt{2}\,WU + W^2)(U^2 + W^2 - 2Z) - (U^2 - W^2)(Z - W^2 + V) = U^2\{\sqrt{2}\,WU - Z - V\} + W^2(\sqrt{2}\,WU - Z + V) + 2UW(UW - \sqrt{2}\,Z). \]

Obviously, when $Z = UW/\sqrt{2} = V$, this is positive. For other values of $Z$ it can be written

\[ \{U(\sqrt{2}\,WU - Z - V) + W(WU - \sqrt{2}\,Z)\}^2/(\sqrt{2}\,WU - Z - V) \]

by virtue of (3.55). Then (3.56) makes this positive and (3.57) has been verified. On inserting $W \le \eta U$ from (3.53) into (3.57) the inequality stated in the theorem is recovered.

To illustrate how this theorem can be applied to the accurate determination of eigenvalues consider the problem of TM modes in a waveguide. In this problem $p_j$ and $r$ are both unity while $n = 2$. Select a convenient point as origin and then try

\[ u' = \sum_{m=0}^{M-1} a_m J_m(\sqrt{\lambda'}\,\rho) \sin m\theta \tag{3.58} \]

where $\rho$, $\theta$ are polar coordinates and $J_m$ is the usual Bessel function. Note that one is not obliged to adopt an expansion of this form any more than one is forced to work with polar coordinates. Any finite sum of solutions of the governing equations is acceptable, but a judicious choice may be valuable in improving convergence. Polar coordinates would seem suited to domains which display some sort of circularity, whereas other coordinates will be appropriate to other shapes. For example, if the boundary has a corner, a suitable choice might be to make the corner the origin and pick the orders of the Bessel functions so that the angular functions vanish on the nearby boundary. A rule is necessary to ascertain the relevant values of $\lambda'$ and the coefficients $a_m$. Choose $M$ points $(\rho_j, \theta_j)$ for $j = 1, \ldots, M$ on $\partial G$ and require $u'$ to be zero at these points, i.e.

\[ \sum_{m=0}^{M-1} a_m J_m(\sqrt{\lambda'}\,\rho_j) \sin m\theta_j = 0 \qquad (j = 1, \ldots, M). \tag{3.59} \]

This is a homogeneous system of linear equations for the $a_m$ and will have a non-trivial solution only when the determinant of the coefficients vanishes. Any root gives a possible value of $\lambda'$ and then the corresponding $a_m$ which solve (3.59) supply a possible $u'$ via (3.58). Having found $u'$, we may compute the maximum of its modulus on the boundary. The quantity $\eta$ will then be known as soon as an estimate of $\|u'\|$ is available. One way of doing this analytically, since a lower bound is sufficient, is to take the largest circle with centre the origin which lies within the cross-section of the guide. If its radius is $d$,

Now it can be checked whether or not $\eta < 1$. If it is not, attempts to make it so may involve increasing $M$, or adjusting the points where $u'$ is zero or, maybe, putting conditions on the derivatives of $u'$ at some points to iron out the larger fluctuations.
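Once $\eta$ is known, turning it into an interval for the eigenvalue is a one-line computation. The sketch below assumes the bound obtained by substituting $W \le \eta U$ from (3.53) into the right-hand side of (3.57), namely $(\sqrt{2}\,\eta + \eta^2)/(1 - \eta^2)$ for the relative error $|\lambda - \lambda'|/|\lambda'|$. The function names and the sample values of $\lambda'$ and $\eta$ are hypothetical and are not taken from the text.

```python
import math

def relative_bound(eta):
    """Bound on |lambda - lambda'| / |lambda'| from (3.57) with W <= eta * U."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1) for the bound to apply")
    return (math.sqrt(2.0) * eta + eta * eta) / (1.0 - eta * eta)

def eigenvalue_interval(lam_trial, eta):
    """Interval guaranteed to contain an eigenvalue, given a trial value and eta."""
    half_width = abs(lam_trial) * relative_bound(eta)
    return lam_trial - half_width, lam_trial + half_width

# Hypothetical numbers: a trial eigenvalue 5.78 with a boundary residual eta = 1e-3.
low, high = eigenvalue_interval(5.78, 1.0e-3)
print(low, high)
```

For small $\eta$ the half-width is approximately $\sqrt{2}\,\eta\,|\lambda'|$, so each extra decimal place gained in the boundary residual yields roughly one more decimal place in the eigenvalue.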

Exercises

20. The axes of an ellipse are $2a$ and $2b$ respectively. Use the preceding technique to validate the following bounds for the first eigenvalue: (i) $a = b$, $5.78318596 < \lambda_1 < 5.78318597$; (ii) $a = 2b$, $3.56672658 < \lambda_1 < 3.56672662$; (iii) $a = 5b$, $2.818069 < \lambda_1 < 2.818071$.

21. An L-shaped waveguide has its shorter sides of unit length and its longer sides are twice that. Show that bounds for the eigenvalues are 9.639723 80
4 VARIATIONAL METHODS AND OPTIMIZATION

THE DERIVATIVE OF AN OPERATOR

4.1 The derivative

The eigenvalues of a compact operator have arisen from the consideration of the maxima of functionals over certain spaces. When dealing with conventional functions on the real line, the maxima can often be located from an investigation of the zeros of the derivative. If we hope to apply a similar technique to cases involving operators, it is necessary to have a rule which specifies how a derivative is to be calculated. Therefore this matter will be studied in the present section. Although several definitions are available, two will be sufficient for our purposes. It should, however, be pointed out for those who consult other references that there is no common agreement on terminology and care must be taken in any comparison of the results of different authors. We start with a definition that stems from the derivative in conventional calculus.

DEFINITION 4.1. Let $X$ and $Y$ be normed linear spaces and $T$ an operator from $X$ to $Y$. If for given $a \in X$ and non-zero $h \in X$ there is an operator $A_a$ such that

\[ \lim_{t \to 0} \|T(a + th) - Ta - tA_a h\|/t = 0 \tag{4.1} \]

then $A_a h \in Y$ is called the Gateaux variation at $a$ in the direction $h$.

The norm in (4.1) is, of course, that of $Y$. More general definitions are available but will not concern us. If (4.1) holds for every $h \in X$ then $A_a$ is called the Gateaux derivative of $T$ at $a$ and we write $A_a = \delta T(a)$. To simplify notation we shall write $\delta T(a) \cdot x$ instead of $(\delta T(a))x$ to indicate the effect of the Gateaux derivative on the element $x$. The statement that $T$ possesses a Gateaux derivative at each $a$ of some set $A$ is often abbreviated to $T$ has a Gateaux derivative on $A$. As an example suppose that $X$ is the real line $R$ and $f(x)$ is a conventional function with values on $R$. If $f$ has a derivative in the conventional sense

\[ \lim_{t \to 0} |f(a + th) - f(a) - thf'(a)|/t = 0. \]

Thus, taking the modulus as norm, we recover (4.1) and

\[ \delta f(a) = f'(a), \qquad \delta f(a) \cdot x = xf'(a). \]

Consequently, the Gateaux derivative agrees with the conventional derivative when the latter exists. This is perhaps not surprising since (4.1) could be regarded as

\[ \delta T(a) \cdot h = \frac{d}{dt} T(a + th)\bigg|_{t=0}. \tag{4.2} \]
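Formula (4.2) lends itself to a quick numerical experiment. The sketch below is illustrative only: the helper name `gateaux`, the central-difference quotient and the step size are assumptions made for the demonstration. It is applied to the function $f(x) = x_1^2(1 + 1/x_2)$ (with $f = 0$ when $x_2 = 0$) that is discussed in the illustration following this formula, and it checks that the variation at the origin is not additive in the direction $h$.

```python
def gateaux(f, a, h, t=1.0e-6):
    """Central-difference estimate of d/dt f(a + t h) at t = 0, cf. (4.2)."""
    forward = f([ai + t * hi for ai, hi in zip(a, h)])
    backward = f([ai - t * hi for ai, hi in zip(a, h)])
    return (forward - backward) / (2.0 * t)

def f(x):
    """f(x) = x1^2 (1 + 1/x2) for x2 != 0 and f(x) = 0 for x2 = 0."""
    x1, x2 = x
    return x1 * x1 * (1.0 + 1.0 / x2) if x2 != 0.0 else 0.0

origin = [0.0, 0.0]
d_h = gateaux(f, origin, [1.0, 1.0])    # direction h
d_k = gateaux(f, origin, [1.0, 2.0])    # direction k
d_sum = gateaux(f, origin, [2.0, 3.0])  # direction h + k
print(d_h, d_k, d_sum)
```

The three estimates agree with $h_1^2/h_2$ for the respective directions, and the value for $h + k$ differs from the sum of the values for $h$ and $k$, so the variation cannot be linear in $h$.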

Another illustration is provided by taking $X$ as the two-dimensional real plane $R^2$ with coordinates $x_1$, $x_2$, and $Y$ as $R$. Let $f(x) = x_1^2(1 + 1/x_2)$ for $x_2 \ne 0$ and $f(x) = 0$ for $x_2 = 0$. Then if $h$ has components $h_1$, $h_2$ with $h_2 \ne 0$

\[ \delta f(0) \cdot h = h_1^2/h_2 \]

as is immediately evident from (4.2). This example demonstrates that a Gateaux derivative need be neither linear nor continuous. If $X$ is $R^n$ and $Y$ is $R$ choose $h$ as $e_j$ which has all zero components except in the $j$th position where it is unity. Then

\[ \delta f(x) \cdot e_j = \frac{\partial f(x)}{\partial x_j} \]

so that partial derivatives are also within the scope of the Gateaux derivative. A somewhat more recondite example is provided when $x$ is a two-vector with components $y(s)$ and $z(s)$ which are real-valued functions of the real variable $s$, continuous on the interval $[a, b]$. Let $F(s, y, z)$ be a real-valued function of the three variables $s$, $y$, $z$ and continuously differentiable both in $y$ and in $z$. Then, if

\[ f(x) = \int_a^b F(s, y(s), z(s)) \, ds \]

and $h$ has components $h_1$, $h_2$,

\[ \delta f(x) \cdot h = \lim_{t \to 0} (1/t) \int_a^b \{F(s, y(s) + th_1(s), z(s) + th_2(s)) - F(s, y, z)\} \, ds = \int_a^b \left(\frac{\partial F}{\partial y} h_1 + \frac{\partial F}{\partial z} h_2\right) ds. \]

In this case $\delta f$ has the form of an integral operator. One of the important features of the Gateaux derivative is supplied by

THEOREM 4.1. If $f$ is a real functional on the normed linear space $X$ and has a local maximum or minimum at $a \in X$ then, if $\delta f(a)$ exists, $\delta f(a) = 0$.

By a local maximum is meant that $f(x) < f(a)$ for all $x$ satisfying $\|x - a\| < d$ for some positive $d$; correspondingly for a local minimum $f(x) > f(a)$.

Proof. If there is $h \in X$ such that $\delta f(a) \cdot h > 0$ then, for sufficiently small $t$, $\{f(a + th) - f(a)\}/t > 0$. Thus $f(a + th) - f(a)$ changes sign with $t$, contradicting the fact that $f$ has a local minimum or maximum at $a$. Hence $\delta f(a) \cdot h > 0$ cannot occur and a similar argument excludes $\delta f(a) \cdot h < 0$. Consequently $\delta f(a) \cdot h = 0$ for all $h \in X$ which is only possible if $\delta f(a) = 0$ and the theorem is proved.

Of course, the theorem does not assert that any solution of $\delta f(a) = 0$ must give a local maximum or minimum. If $f$ is said to be stationary when $\delta f = 0$ the theorem states that at a local maximum or minimum $f$ is stationary but the converse need not be true. If $X$ is an inner product space and there is an element $g \in X$ such that $\delta f(a) \cdot h = (g, h)$ for $h \in X$ then $g$ is called the gradient of the functional $f$ at $a$. This is often expressed as $g = \nabla_a f$ so that

\[ \delta f(a) \cdot h = (\nabla_a f, h). \]

There is no guarantee that the gradient of a functional exists, but the theory of §3.3 ensures that it will if $X$ is a Hilbert space and $\delta f(a) \cdot h$ is a bounded linear functional. To give an indication of an application, we let $E$ and $F$ be inner product spaces with $T$ a linear operator on $E$, and $T^A$, its adjoint, an operator on $F$. Let $y \in D(T^A)$ so that $T^A y \in E$. Form a functional from $y$ and $T^A y$, i.e. a functional on the appropriate subset of the product space $F \times E$. Then write

\[ f(y) = G(y, T^A y) \]

where $G$ is the functional. Consider $G(y, x)$ and suppose that only changes in $y$ are permitted. Assuming a gradient exists we have, if $h \in F$,

\[ \delta G(y, x) \cdot h = (\nabla_y G, h) \]

since the inner product relevant to $F$ has to be employed. On the other hand, for alterations in $x$ alone,

\[ \delta G(y, x) \cdot h_1 = (\nabla_x G, h_1) \]

with $h_1 \in E$. Applying these results to $f$ we obtain

\[ \delta f(y) \cdot h = (\nabla_y G, h) + (\nabla_x G, T^A h) \]

where $x$ is replaced by $T^A y$ after the gradient operations have been performed. Therefore

\[ \delta f(y) \cdot h = (\nabla_y G, h) + (T\nabla_x G, h). \]

This equation shows that the gradient of $f(y)$ is $\nabla_y G + T\nabla_x G$ and $f$ is stationary when

\[ \nabla_y G + T\nabla_x G = 0. \tag{4.3} \]

In general, if $T$ has a Gateaux derivative at $a$, necessary and sufficient conditions for $\delta T(a) \cdot h$ to be linear and continuous in $h$ are as follows.

(i) To each $h$ there is a $\delta$, which may depend on $h$, such that $|t| \le \delta$ implies $\|T(a + th) - T(a)\| \le M\|th\|$ where $M$ is independent of $h$.

(ii) $T(a + th_1 + th_2) - T(a + th_1) - T(a + th_2) + T(a) = o(t)$ as $t \to 0$.

One difficulty with the Gateaux derivative is the defining of a derivative of a derivative. Also it is desirable that the notion of differentiation should entail that differentiable functions are continuous. This suggests the concept of the Frechet derivative.

DEFINITION 4.1a. Let $X$ and $Y$ be normed linear spaces. If, for given $a \in X$, there is a bounded linear operator $L_a$ from $X$ to $Y$ such that

\[ \lim_{\|h\| \to 0} \|T(a + h) - T(a) - L_a h\|/\|h\| = 0 \]

then $L_a$ is called the Frechet derivative of $T$ at $a$ and written $T'(a)$.

The norm in the numerator is that of $Y$, whereas the one in the denominator is that of $X$. The domain of $T'(a)$ need not be the whole of $X$. By replacing $h$ by $th$ and letting $t \to 0$ we recover Definition 4.1 so that the existence of the Frechet derivative implies that of the Gateaux derivative, the two being equal. The converse need not be true but, whenever a continuous Gateaux derivative exists, so does the Frechet derivative and the two are equal. If $f$ is a functional on the Hilbert space $H$ to $R$ and $f$ has a Frechet derivative at $a$, the gradient of $f$ is related to it by

\[ f'(a) \cdot x = (\nabla_a f, x) \]

for any $x \in H$. Since

\[ \|T(a + h) - T(a)\| \le \|T(a + h) - T(a) - T'(a) \cdot h\| + \|T'(a)\|\,\|h\| \]

and $\|T'(a)\|$ is finite, the left-hand side tends to zero as $\|h\| \to 0$. This demonstrates that $T$ is continuous at $a$ and verifies the desirable property mentioned before Definition 4.1a. It may be checked that the Frechet derivative is unique. For, if $S$ is another

possible derivative,

\[ \|(T'(a) - S) \cdot h\| = \|T'(a) \cdot h - S \cdot h\| \le \|T'(a) \cdot h - T(a + h) + T(a)\| + \|T(a + h) - T(a) - S \cdot h\| \le \varepsilon\|h\| \]

with $\varepsilon$ arbitrarily small, $\|h\|$ being taken small enough. For any $x \in X$, put $h = \eta x/\|x\|$ and then $\|h\|$ can be made suitably small by choice of $\eta$. Hence

\[ \|(T'(a) - S) \cdot x\| \le \varepsilon\|x\| \]

confirming that $T'(a) - S$ is a bounded operator. Allowing $\varepsilon \to 0$ we deduce $\|T'(a) - S\| = 0$ and so $T'(a) = S$. Thus further support is lent to the identification of the Frechet and Gateaux derivatives when the Frechet exists. Clearly, if $T$ is a linear operator, $T'(a) = T$ for all $a \in X$. Let $X$ be $R^n$ and $Y$ be $R^m$. Let $f(x)$ have components $f_1(x), \ldots, f_m(x)$ with $x \in R^n$, and assume that $f_1, \ldots, f_m$ have first partial derivatives at $a$. The inner products are defined by

\[ (x^{(1)}, x^{(2)}) = \sum_{r=1}^{n} x_r^{(1)} x_r^{(2)}, \qquad (y^{(1)}, y^{(2)}) = \sum_{r=1}^{m} y_r^{(1)} y_r^{(2)} \]

in an obvious notation. Now

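\[ \Big| f_j(a + h) - f_j(a) - \sum_{r=1}^{n} \frac{\partial f_j(a)}{\partial a_r} h_r \Big| \Big/ \|h\| \to 0 \]

as $\|h\| \to 0$. It may therefore be deduced that

\[ f'(a) \cdot h = \begin{pmatrix} \dfrac{\partial f_1}{\partial a_1} & \dfrac{\partial f_1}{\partial a_2} & \cdots & \dfrac{\partial f_1}{\partial a_n} \\ \vdots & & & \vdots \\ \dfrac{\partial f_m}{\partial a_1} & \cdots & & \dfrac{\partial f_m}{\partial a_n} \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{pmatrix} \]

since the matrix clearly supplies a bounded linear operator. The matrix which represents $f'(a)$ is called the Jacobian matrix of the functional $f$. When $m = n$, the determinant of the Jacobian matrix of $f$ is known as the Jacobian of $f$; it is often denoted by

\[ \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(a_1, a_2, \ldots, a_n)}. \]

The analogue of the derivative of a function of a function in conventional calculus is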

THEOREM 4.1a (CHAIN RULE). Let $X$, $Y$, $Z$ be normed linear spaces. If $T$ is an operator from $X$ to $Y$ with a Gateaux derivative in $X$ and $S$ is an operator from $Y$ to $Z$ with a Frechet derivative in $Y$ then $ST$ has a Gateaux derivative and

\[ \delta ST(a) = S'(T(a))\,\delta T(a). \]

If $T$ has a Frechet derivative then $ST$ has a Frechet derivative and $(ST)'(a) = S'(T(a))T'(a)$.

Proof. Let $h \in X$ and define $y = T(a)$, $y_1 = T(a + th) - T(a)$. Then

\[ \{ST(a + th) - ST(a)\}/t = \{S(y + y_1) - S(y)\}/t \]
\[ = \{S'(y) \cdot y_1 + S(y + y_1) - S(y) - S'(y) \cdot y_1\}/t \]
\[ = S'(y) \cdot \{T(a + th) - T(a)\}/t + \{S(y + y_1) - S(y) - S'(y) \cdot y_1\}(1/\|y_1\|)\|y_1\|/t. \]

Hence

\[ \|S'(T(a))\,\delta T(a) \cdot h - \{ST(a + th) - ST(a)\}/t\| \le \|S'(y)\|\,\|\delta T(a) \cdot h - \{T(a + th) - T(a)\}/t\| + \frac{\|S(y + y_1) - S(y) - S'(y) \cdot y_1\|}{\|y_1\|} \, \frac{\|T(a + th) - T(a)\|}{t}. \tag{4.4} \]

When $T$ has a Gateaux derivative at $a$ it is continuous at $a$ in each direction although it may not be continuous at $a$, i.e. $\|y_1\| \to 0$ as $t \to 0$. Consequently, the first part of the theorem is proved. If $T$ has a Frechet derivative replace $\delta T$ in (4.4) by $T'$. $T$ is now continuous at $a$ and $\|y_1\| \to 0$ for all $h$ such that $\|h\| = 1$. Hence the second part of the theorem is demonstrated and the proof is terminated.

The application of this theorem to the conventional function $f(f_1(x), \ldots, f_m(x))$ leads to the standard formula

\[ \frac{\partial f}{\partial x_j} = \sum_{k=1}^{m} \frac{\partial f}{\partial \eta_k} \frac{\partial \eta_k}{\partial x_j} \]

where $\eta_k = f_k(x_1, \ldots, x_n)$. The reason for introducing two kinds of derivative is that construction of the Gateaux derivative is usually much easier than that of the Frechet derivative because of (4.2). If, moreover, the Gateaux derivative is continuous, as occurs in most applications, it can be identified with the Frechet.
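In finite dimensions the Frechet derivative is represented by the Jacobian matrix, so the chain rule can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the maps `T` and `S` from $R^2$ to $R^2$ are hypothetical examples chosen for the demonstration, and all helper names are invented here. It forms the candidate derivative $S'(T(a))T'(a)$ as a matrix product and evaluates the quotient of Definition 4.1a for shrinking increments.

```python
import math

def T(x):
    """Hypothetical map R^2 -> R^2."""
    return [x[0] * x[1], x[0] + x[1] ** 2]

def T_jacobian(x):
    return [[x[1], x[0]], [1.0, 2.0 * x[1]]]

def S(y):
    """Hypothetical map R^2 -> R^2."""
    return [math.sin(y[0]), y[0] * y[1]]

def S_jacobian(y):
    return [[math.cos(y[0]), 0.0], [y[1], y[0]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def frechet_ratio(F, L, a, h):
    """||F(a + h) - F(a) - L h|| / ||h|| for a candidate derivative matrix L."""
    shifted = F([ai + hi for ai, hi in zip(a, h)])
    base = F(a)
    linear = matvec(L, h)
    residual = [s - b - l for s, b, l in zip(shifted, base, linear)]
    return norm(residual) / norm(h)

def ST(x):
    return S(T(x))

a = [1.0, 2.0]
chain = matmul(S_jacobian(T(a)), T_jacobian(a))  # S'(T(a)) T'(a)
ratios = [frechet_ratio(ST, chain, a, [s, -s]) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The quotient shrinks roughly in proportion to $\|h\|$, which is the behaviour required of a Frechet derivative. A single direction is sampled here, so this is a plausibility check and not a proof.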

Exercises

1. If $f$ is defined from $R^2$ to $R$ by $f(0) = 0$ and f(x) = xlx~/(xi + x~) otherwise, find $\delta f(0)$.

2. If $f$ is defined from $R^2$ to $R$ by $f(x) = x_1$ if $x_2 = 0$, $f(x) = x_2$ if $x_1 = 0$ and $f(x) = 1$ otherwise, show that $f$ does not have a linear Gateaux derivative.

3. Let $x(t)$ be a real-valued function of $t$ continuous on $[0, 1]$ and define

\[ f(x) = \int_0^1 [\tfrac{1}{2}(t + 1)x(t)^2 - x(t)] \, dt. \]

Find $\delta f(x) \cdot h$ and, by choosing $h(t) = (t + 1)x(t) - 1$, deduce that $f$ has a local minimum of $-\tfrac{1}{2} \ln 2$.

4. Prove that (i) $\nabla_x \|x\| = x/\|x\|$, (ii) $\nabla_x \|x\|^2 = 2x$, (iii) $\nabla_x (y, x) = y$, (iv) $\nabla_x (x, Tx) = (T + T^A)x$.

5. If $f$ is defined from $R^2$ to $R$ by $f(0) = 0$ and

\[ f(x) = x_2(x_1^2 + x_2^2)^{3/2}/\{(x_1^2 + x_2^2)^2 + x_2^2\} \qquad (x \ne 0) \]

show that $f$ has a Gateaux derivative but not a Frechet derivative at $x = 0$.

6. Let $X$ be a real normed linear space and $f$ a real functional on $X$. Prove that

\[ f(\alpha x + (1 - \alpha)y) \le \alpha f(x) + (1 - \alpha)f(y) \]

for all $x$, $y$, $\alpha$ such that $\|x\| < 1$, $\|y\| < 1$, $0 \le \alpha \le 1$ if and only if

\[ f(x) - f(y) \ge f'(y) \cdot (x - y). \]

7. $T$ is an operator $R^2$ to $R^2$ such that $Tx = (x_1, x_2^2)^T$. If $f$ is the same as in Exercise 5 show that $fT$ does not have a linear Gateaux derivative at $x = 0$. This example shows that it would not be sufficient for $S$ to have only a Gateaux derivative in Theorem 4.1a.

4.2 Mean-value theorem

The standard mean-value theorem for functions of a single variable states that, if $f(t)$ is continuous on $[a, b]$ and differentiable on $(a, b)$, there is a $t \in (a, b)$ such that $f(b) - f(a) = f'(t)(b - a)$. The purpose of this section is to seek generalizations of this result; they are often helpful in dealing with the convergence of iterative processes. It will be convenient to discuss firstly functionals.

THEOREM 4.2. If $f$, a functional from $R^n$ to $R$, has a linear Gateaux derivative at each point of a convex set $D_0$ in the domain of $f$, then, for any $x, y \in D_0$, there is a $t \in (0, 1)$ such that

\[ f(y) - f(x) = \delta f(x + t(y - x)) \cdot (y - x). \]

A set $D_0$ of a linear space is called convex if, given $x, y \in D_0$, then all of $\alpha x + (1 - \alpha)y$ with $0 \le \alpha \le 1$ are in $D_0$. In geometrical terms, all the points on the straight line joining $x$ and $y$ are in $D_0$.

Proof. For given $x, y \in D_0$, the function $\phi(s) = f(x + s(y - x))$ is differentiable, and hence continuous, on $[0, 1]$; therefore

\[ \phi'(s) = \delta f(x + s(y - x)) \cdot (y - x) \]

for all $s \in [0, 1]$. Hence, by the standard theorem for the single variable,

\[ f(y) - f(x) = \phi(1) - \phi(0) = \delta f(x + t(y - x)) \cdot (y - x) \]

for some $t \in (0, 1)$. The proof is complete.

The theorem does not hold, in general, if $f$ is from $R^n$ to $R^m$ with $m > 1$ (see Exercise 8). It is possible to apply Theorem 4.2 to each component $f_j$ of $f$ but then the $t_j$ need not all be the same so that the component formulae cannot be combined into a single equation of the type in Theorem 4.2. It is, however, possible to obtain an upper bound of wide applicability.

THEOREM 4.2a. Let $X$ and $Y$ be normed linear spaces. Let $T$ be an operator from $X$ to $Y$ which possesses a Frechet derivative at each point of a convex set $D_0$ in $D(T)$. If, for given $x, y \in D_0$,

\[ \sup_{0 \le t \le 1} \|T'(x + t(y - x))\| \le M, \]

then

\[ \|T(y) - T(x)\| \le M\|y - x\|. \]

Proof. For given $\varepsilon > 0$, let $B$ be the set of $t \in [0, 1]$ for which

\[ \|T(x + t(y - x)) - T(x)\| \le Mt\|y - x\| + \varepsilon t\|y - x\|. \tag{4.5} \]

Obviously, $0 \in B$ so that $\sup B$ is well defined. Because $T$ has a Frechet derivative $T(x + t(y - x))$ is continuous in $t$ and so

\[ \|T(x + t_0(y - x)) - T(x)\| \le (M + \varepsilon)t_0\|y - x\| \tag{4.6} \]

where $t_0 = \sup B$. Evidently, since $\varepsilon$ is arbitrary, the theorem will be proved if $t_0 = 1$. Suppose that $t_0 < 1$. Then the existence of $T'$ implies that there is $t_1$ with $t_0 < t_1 < 1$ such that

\[ \|T(x + t_1(y - x)) - T(x + t_0(y - x)) - T'(x + t_0(y - x)) \cdot (t_1 - t_0)(y - x)\| \le \varepsilon(t_1 - t_0)\|y - x\| \]

whence

\[ \|T(x + t_1(y - x)) - T(x + t_0(y - x))\| \le (M + \varepsilon)(t_1 - t_0)\|y - x\|. \]

It follows from (4.6) that $\|T(x + t_1(y - x)) - T(x)\| \le (M + \varepsilon)t_1\|y - x\|$. From (4.5) this implies that $t_1 \in B$ contradicting the definition of $t_0$. Hence $t_0 = 1$ and the proof is terminated.
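The bound of Theorem 4.2a can be tested on a concrete map. The following sketch is illustrative and rests on several assumptions: the map `T` from $R^2$ to $R^2$ is a hypothetical example, the supremum over the segment is estimated by sampling, and the Frobenius norm of the Jacobian is used in place of the operator norm (it is never smaller than the operator norm induced by the Euclidean norm, so it still yields an admissible $M$).

```python
import math

def T(x):
    """Hypothetical map R^2 -> R^2."""
    return [x[0] ** 2, x[0] * x[1]]

def T_jacobian(x):
    return [[2.0 * x[0], 0.0], [x[1], x[0]]]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def frobenius(A):
    return math.sqrt(sum(c * c for row in A for c in row))

def mean_value_bound(x, y, samples=101):
    """Estimate M * ||y - x|| with M taken as the largest sampled Frobenius norm."""
    d = [yi - xi for xi, yi in zip(x, y)]
    M = max(
        frobenius(T_jacobian([xi + (k / (samples - 1)) * di for xi, di in zip(x, d)]))
        for k in range(samples)
    )
    return M * norm(d)

x, y = [0.0, 0.0], [1.0, 1.0]
lhs = norm([ty - tx for tx, ty in zip(T(x), T(y))])  # ||T(y) - T(x)||
rhs = mean_value_bound(x, y)                         # M ||y - x||
print(lhs, rhs)
```

For this particular map the Frobenius norm along the segment equals $t\sqrt{6}$, so the sampled maximum coincides with the true supremum at $t = 1$; in general a sampled maximum is only an estimate of $M$.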

A useful extension is

COROLLARY 4.2a. Under the conditions of Theorem 4.2a, if there is a bounded linear operator $S$ such that

\[ \|T'(x + t(y - x)) - S\| \le M_1 \]

for all $t \in [0, 1]$, then

\[ \|T(y) - T(x) - S(y - x)\| \le M_1\|y - x\|. \]

Proof. Define $f(x) = T(x) - S(x)$. Then $f$ is continuous on $D_0$ and has a Frechet derivative $f'(x) = T'(x) - S$ for all $x \in D_0$. Therefore

\[ \|f'(x + t(y - x))\| \le M_1 \]

and Theorem 4.2a gives

\[ \|T(y) - T(x) - S(y - x)\| = \|f(y) - f(x)\| \le M_1\|y - x\|. \]

The proof is concluded.

By choosing $S = T'(z)$ where $z \in D_0$ we can deduce from Corollary 4.2a that

\[ \|T(y) - T(x) - T'(z) \cdot (y - x)\| \le \sup_{0 \le t \le 1} \|T'(x + t(y - x)) - T'(z)\|\,\|y - x\|. \]

Another approach leading to mean-value theorems is via integration. First, however, a suitable definition of integration is needed (cf. §3.6). Let $t \in [0, 1]$ and let $T(t) \in Y$, $Y$ being a normed linear space. Partition $[0, 1]$ by choosing points $t_0, t_1, \ldots, t_n$ so that $0 = t_0 < t_1 < \cdots < t_n = 1$. Then, if given $\varepsilon > 0$ there are a $y \in Y$ and a $\delta > 0$ such that

\[ \Big\| y - \sum_{j=1}^{n} (t_j - t_{j-1})T(t_j') \Big\| < \varepsilon \tag{4.7} \]

for any partition with $\max(t_j - t_{j-1}) \le \delta$ and $t_j' \in [t_{j-1}, t_j]$, $y$ is called the Riemann integral of $T$ from 0 to 1 and we write

\[ y = \int_0^1 T(t) \, dt. \]

Now, suppose that $X$ is a normed linear space and that

\[ \int_0^1 T(x_0 + t(x_1 - x_0)) \, dt \]

exists for given $x_0, x_1 \in X$ in accordance with the definition just given. Then it is known as the Riemann integral of $T$ from $x_0$ to $x_1$ and denoted by

\[ \int_{x_0}^{x_1} T(x) \, dx. \]

This definition differs slightly from the conventional one for real-valued functions of a real variable. For, let $X = Y = R$ and then, with $T$ the function $f$,

\[ \int_{x_0}^{x_1} f(x) \, dx = \int_0^1 f(x_0 + t(x_1 - x_0)) \, dt = \frac{1}{x_1 - x_0} \int_{x_0}^{x_1} f(s) \, ds \]

on making the change of variable $s = x_0 + t(x_1 - x_0)$. The integral on the right is the usual one so the new definition differs from the customary one by the factor $(x_1 - x_0)$ in the denominator; the two definitions thus agree for conventional integrals only when $x_1 - x_0 = 1$. Nevertheless, this should occasion no difficulty in the following.

If $T$ is a continuous operator, the same argument as is used for conventional integrals may be adopted to show that $\int_{x_0}^{x_1} T(x) \, dx$ exists for all $x_0, x_1 \in X$ when $Y$ is complete, i.e. when $Y$ is a Banach space. If $\phi(t) \in R$ and $\|T(x_0 + t(x_1 - x_0))\| \le \phi(t)$ for $0 \le t \le 1$ then, from (4.7),

\[ \|y\| \le \varepsilon + \sum_{j=1}^{n} (t_j - t_{j-1})\phi(t_j). \]

An immediate deduction is

THEOREM 4.2b. If $\|T(x_0 + t(x_1 - x_0))\| \le \phi(t)$ for $0 \le t \le 1$ then

\[ \Big\| \int_{x_0}^{x_1} T(x) \, dx \Big\| \le \int_0^1 \phi(t) \, dt \]

provided the integrals exist. In particular

\[ \Big\| \int_{x_0}^{x_1} T(x) \, dx \Big\| \le \int_{x_0}^{x_1} \|T(x)\| \, dx. \]

The mean-value theorem can now be established.

THEOREM 4.2c. If $T$ has a continuous Frechet derivative and $Y$ is a Banach space then

\[ T(x_1) - T(x_0) = \int_0^1 T'(x_0 + t(x_1 - x_0)) \cdot (x_1 - x_0) \, dt \]

for all $x_1, x_0 \in X$.

Proof. Let $S(t) = T(x_0 + t(x_1 - x_0))$ and let $S'(t)$ denote

\[ T'(x_0 + t(x_1 - x_0)) \cdot (x_1 - x_0) \]

on account of Theorem 4.1a. Pick the partition of $[0, 1]$ in which $t_j = j/n$. Then

\[ S(1) - S(0) - \sum_{j=1}^{n} S'(t_j)(t_j - t_{j-1}) = \sum_{j=1}^{n} \{S(t_j) - S(t_{j-1}) - (t_j - t_{j-1})S'(t_j)\}. \]

The definition of a derivative implies that, given $\varepsilon > 0$, there is $N_1$ such that

\[ \|S(t_j) - S(t_{j-1}) - (t_j - t_{j-1})S'(t_j)\| \le \varepsilon/2n \]

for $n \ge N_1$. Hence, for $n \ge N_1$,

\[ \Big\| S(1) - S(0) - \sum_{j=1}^{n} S'(t_j)(t_j - t_{j-1}) \Big\| \le \tfrac{1}{2}\varepsilon. \]

From the definition of an integral there is $N_2$ such that

\[ \Big\| \sum_{j=1}^{n} S'(t_j)(t_j - t_{j-1}) - \int_0^1 S'(t) \, dt \Big\| \le \tfrac{1}{2}\varepsilon \]

for $n \ge N_2$. Hence, if $n$ exceeds the larger of $N_1$ and $N_2$

\[ \Big\| S(1) - S(0) - \int_0^1 S'(t) \, dt \Big\| \le \varepsilon \]

which proves the theorem.

One consequence of this theorem is

THEOREM 4.2d. If $Y$ is a Banach space and

\[ \|T'(x) - T'(y)\| \le K\|x - y\|^p \]

for all $x$, $y$ in a convex set $D_0$ with $p > 0$ then

\[ \|T(x_1) - T(x_0) - T'(x_0) \cdot (x_1 - x_0)\| \le K\|x_1 - x_0\|^{p+1}/(p + 1). \]

Proof. Since $p > 0$ the hypothesis on $T'$ signifies that it is continuous. Therefore, by Theorem 4.2c,

\[ T(x_1) - T(x_0) = \int_0^1 T'(x_0 + t(x_1 - x_0)) \cdot (x_1 - x_0) \, dt. \]

Hence, Theorem 4.2b gives

\[ \|T(x_1) - T(x_0) - T'(x_0) \cdot (x_1 - x_0)\| = \Big\| \int_0^1 \{T'(x_0 + t(x_1 - x_0)) - T'(x_0)\} \cdot (x_1 - x_0) \, dt \Big\| \le K \int_0^1 t^p \|x_1 - x_0\|^{p+1} \, dt \]

and the theorem follows at once.
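Both the integral form of the mean-value theorem and the estimate of Theorem 4.2d can be checked with a Riemann sum of the kind used in the definition (4.7). The sketch below is a minimal illustration under stated assumptions: the map `T` from $R^2$ to $R^2$ and all helper names are hypothetical, midpoints are used as the tag points of the partition $t_j = j/n$, and the constant $K = \sqrt{5}$ with $p = 1$ comes from bounding the operator norm of $T'(x) - T'(y)$ by its Frobenius norm for this particular map.

```python
import math

def T(x):
    """Hypothetical map R^2 -> R^2."""
    return [x[0] ** 2, x[0] * x[1]]

def T_prime_apply(x, d):
    """T'(x) . d, with T'(x) the Jacobian matrix [[2 x1, 0], [x2, x1]]."""
    return [2.0 * x[0] * d[0], x[1] * d[0] + x[0] * d[1]]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def riemann_integral(x0, x1, n=1000):
    """Riemann sum for the integral over [0, 1] of T'(x0 + t (x1 - x0)) . (x1 - x0)."""
    d = [b - a for a, b in zip(x0, x1)]
    total = [0.0, 0.0]
    for j in range(1, n + 1):
        t = (j - 0.5) / n  # tag point inside [t_{j-1}, t_j]
        point = [a + t * di for a, di in zip(x0, d)]
        term = T_prime_apply(point, d)
        total = [s + v / n for s, v in zip(total, term)]
    return total

x0, x1 = [1.0, 2.0], [1.5, 1.0]
d = [b - a for a, b in zip(x0, x1)]
increment = [b - a for a, b in zip(T(x0), T(x1))]  # T(x1) - T(x0)
integral = riemann_integral(x0, x1)

linear = T_prime_apply(x0, d)
remainder = norm([i - l for i, l in zip(increment, linear)])
K, p = math.sqrt(5.0), 1
bound = K * norm(d) ** (p + 1) / (p + 1)
print(increment, integral, remainder, bound)
```

Because the integrand is linear in $t$ for this map, the midpoint sums reproduce the increment up to rounding; for a general map the agreement would only improve as $n$ grows.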

Exercises

8. If $f$, from $R^2$ to $R^2$, has components fl(X) = x~, f2(X) = x~ show that there is no $t \in [0, 1]$ such that

\[ f(y) - f(x) = f'(x + t(y - x)) \cdot (y - x) \]

when $x = 0$ and $y$ is the point $(1, 1)$.

9. If $T$ has a Gateaux derivative at each $y$ such that $\|y - a\| < 1$ and $\delta T$ is continuous at $a$ show by Corollary 4.2a that $T$ has a Frechet derivative at $a$.

10. Show that Theorem 4.2d is valid for $p \ge 0$ if the additional assumption that $T'$ is continuous on $D_0$ is incorporated.

4.3 Higher derivatives

The Frechet derivative $T'$ is such that $T' \cdot h$ for fixed $h \in X$ assigns the element $T'(x) \cdot h$ to $x \in X$. It may happen that this operator has a Frechet derivative at $x = a$. If that is so, this second derivative is written as $T''(a) \cdot h$. Acting on $k \in X$ it produces the element $(T''(a) \cdot h) \cdot k$ of $Y$. It is more convenient to rewrite this as $T''(a) \cdot (h, k)$. Since both $h$ and $k$ are in $X$, $(h, k)$ is an element of the product space $X \times X$ and so $T''(a)$ is an operator with domain in $X \times X$ and range in $Y$. By the definition of a Frechet derivative $T''$ is linear for changes in $h$ and for changes in $k$. Therefore $T''$ is a bilinear operator from $X \times X$ to $Y$ and $T''(a)$ is known as the second Frechet derivative of $T$ at $a$. Remark that the existence of $T'$ requires $T$ to be continuous so that a necessary condition for the existence of $T''(a)$ is that $T'(a)$ be continuous. The norm of $T''$ can be calculated in a straightforward manner via

\[ \|T''(a) \cdot h\| = \sup_{\|k\|=1} \|(T''(a) \cdot h) \cdot k\| \]

so that

\[ \|T''(a)\| = \sup_{\|h\|=1} \|T''(a) \cdot h\| = \sup_{\|h\|=1} \sup_{\|k\|=1} \|T''(a) \cdot (h, k)\|. \]

As an example let $f$ be a functional from $R^n$ to $R$. By the definition of the second derivative

\[ \lim_{|k| \to 0} |f'(a + k) \cdot h - f'(a) \cdot h - f''(a) \cdot (h, k)|/|k| = 0. \]

From §4.1,

\[ f'(a) \cdot h = \sum_{j=1}^{n} h_j \, \partial f/\partial a_j \]

when $a$ and $h$ have components $a_1, \ldots, a_n$ and $h_1, \ldots, h_n$ respectively. Further

\[ \{f'(a_1, \ldots, a_i + k_i, \ldots, a_n) \cdot h - f'(a) \cdot h\}/k_i \to \sum_{j=1}^{n} \frac{\partial^2 f}{\partial a_i \partial a_j} h_j. \]

Hence

\[ f''(a) \cdot (h, k) = k^T H h \]

in matrix notation, $H$ being the $n \times n$ Hessian matrix

\[ H = \begin{pmatrix} \dfrac{\partial^2 f}{\partial a_1^2} & \dfrac{\partial^2 f}{\partial a_1 \partial a_2} & \cdots \\ \vdots & & \end{pmatrix}. \]

Thus the second Frechet derivative carries with it all the information contained in the Hessian matrix. The second Frechet derivative has a property of symmetry, as described in the following theorem.

THEOREM 4.3. $T''(a) \cdot (h, k) = T''(a) \cdot (k, h)$.

Proof. Given $\varepsilon > 0$, choose $\delta > 0$ so that $T'(x)$ exists for $\|x - a\| < \delta$ and

\[ \|T'(x) - T'(a) - T''(a) \cdot (x - a)\| \le \varepsilon\|x - a\|. \tag{4.8} \]

Pick $h$ and $k$ so that $\|h\| \le \tfrac{1}{2}\delta$, $\|k\| \le \tfrac{1}{2}\delta$. Let $f$ be the function of the real variable $t$ on $[0, 1]$ defined by

\[ f(t) = T(a + th + k) - T(a + th). \]

By the chain rule

\[ f'(t) = T'(a + th + k) \cdot h - T'(a + th) \cdot h. \]

Hence, from (4.8) and the bilinearity of $T''$, for any $t \in [0, 1]$

\[ \|f'(t) - T''(a) \cdot (k, h)\| \le \|T'(a + th + k) \cdot h - T'(a) \cdot h - T''(a) \cdot (th + k, h)\| + \|T'(a + th) \cdot h - T'(a) \cdot h - T''(a) \cdot (th, h)\| \]
\[ \le \|h\|(\varepsilon\|th + k\| + \varepsilon\|th\|) \le 2\varepsilon\|h\|(\|h\| + \|k\|). \]

Consequently

\[ \|f'(t) - f'(0)\| \le \|f'(t) - T''(a) \cdot (k, h)\| + \|f'(0) - T''(a) \cdot (k, h)\| \le 4\varepsilon\|h\|(\|h\| + \|k\|). \]

By Corollary 4.2a,

\[ \|f(1) - f(0) - T''(a) \cdot (k, h)\| \le \|f(1) - f(0) - f'(0)\| + \|f'(0) - T''(a) \cdot (k, h)\| \]
\[ \le \sup_{0 \le t \le 1} \|f'(t) - f'(0)\| + 2\varepsilon\|h\|(\|h\| + \|k\|) \le 6\varepsilon\|h\|(\|h\| + \|k\|). \]

But

\[ f(1) - f(0) = T(a + h + k) - T(a + h) - T(a + k) + T(a) \]

is symmetric in $h$ and $k$ so that, by exchanging $h$ and $k$, we obtain

\[ \|T''(a) \cdot (h, k) - T''(a) \cdot (k, h)\| \le 6\varepsilon(\|h\| + \|k\|)^2. \tag{4.9} \]

The inequality (4.9) has been derived subject to $\|h\| \le \tfrac{1}{2}\delta$, $\|k\| \le \tfrac{1}{2}\delta$. Suppose now that $h$ and $k$ are arbitrary. Choose $\mu$ $(> 0)$ so that $\mu\|h\| \le \tfrac{1}{2}\delta$, $\mu\|k\| \le \tfrac{1}{2}\delta$. Then, using $\mu h$ and $\mu k$ in place of $h$ and $k$ in (4.9), we have

\[ \mu^2\|T''(a) \cdot (h, k) - T''(a) \cdot (k, h)\| \le 6\varepsilon\mu^2(\|h\| + \|k\|)^2. \]

Hence (4.9) is established without restriction on $h$ and $k$; since $\varepsilon$ is arbitrary the theorem is proved.

Higher derivatives may be defined in an obvious way; for example, $T'''(a)$ is a trilinear operator which converts the element $(h, k, l)$ of $X \times X \times X$ into the element $T'''(a) \cdot (h, k, l)$ of $Y$. In general $T^{(n)}(a) \cdot (h_1, h_2, \ldots, h_n)$ is symmetric in the sense that the interchange of any $h_j$ and $h_k$ leaves it unaltered. Moreover $(T^{(m)})^{(n)} = T^{(m+n)}$. The analogue of Theorem 4.2c is the following.

THEOREM 4.3a (TAYLOR'S THEOREM). If $T$ has $p$ continuous Fréchet derivatives and $Y$ is a Banach space, then

$$T(x + e) = T(x) + T'(x) \cdot e + \frac{1}{2!} T''(x) \cdot (e, e) + \cdots + \frac{1}{(p-1)!} T^{(p-1)}(x) \cdot (e, \ldots, e) + \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!} T^{(p)}(x + te) \cdot (e, \ldots, e) \, dt$$

and, for every e > 0, T(x + e) - T(X) -

II

e)

... -

1

(p - I)!

r
e

,e)ll~elleIlP.

The quantity $(e, \ldots, e)$ contains as many $e$ as the number of derivatives on the attached $T$. The proof of the first part of the theorem is by induction based on Theorem 4.2c. The second part follows from the first part by Theorem 4.2b.
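The identification of $f''(a)$ with the Hessian, the symmetry of Theorem 4.3 and the second-order Taylor behaviour can be checked numerically. The following Python sketch is only an illustration: the functional `f`, the point `a` and the directions `h`, `k` are assumptions chosen for the example, not taken from the text.

```python
import math

def f(a):
    """Illustrative functional on R^2 (an assumption, not from the text)."""
    x, y = a
    return math.sin(x) * math.exp(y) + x * x * y

def grad(a):
    """Components of f'(a), i.e. the first partial derivatives."""
    x, y = a
    return [math.cos(x) * math.exp(y) + 2 * x * y,
            math.sin(x) * math.exp(y) + x * x]

def hessian(a):
    """Hessian matrix H of f at a, computed analytically."""
    x, y = a
    fxx = -math.sin(x) * math.exp(y) + 2 * y
    fxy = math.cos(x) * math.exp(y) + 2 * x
    fyy = math.sin(x) * math.exp(y)
    return [[fxx, fxy], [fxy, fyy]]

def bilinear(H, h, k):
    """f''(a).(h, k) as the double sum over h_i H_ij k_j."""
    return sum(h[i] * H[i][j] * k[j] for i in range(2) for j in range(2))

def taylor_remainder(a, h):
    """f(a + h) - f(a) - f'(a).h - (1/2) f''(a).(h, h)."""
    ah = [a[0] + h[0], a[1] + h[1]]
    g = grad(a)
    linear = g[0] * h[0] + g[1] * h[1]
    return f(ah) - f(a) - linear - 0.5 * bilinear(hessian(a), h, h)

a = [0.3, -0.2]
h, k = [1.0, 2.0], [-0.5, 0.7]
H = hessian(a)

# Symmetry: f''(a).(h, k) should equal f''(a).(k, h).
sym_gap = abs(bilinear(H, h, k) - bilinear(H, k, h))

# The remainder divided by ||h||^2 should shrink as h is scaled towards zero.
ratios = []
for s in (1e-1, 1e-2, 1e-3):
    hs = [s * h[0], s * h[1]]
    norm_sq = hs[0] ** 2 + hs[1] ** 2
    ratios.append(abs(taylor_remainder(a, hs)) / norm_sq)

print(sym_gap, ratios)
```

The printed ratios decrease roughly in proportion to the scale factor, which is the behaviour expected when the remainder is of third order in the increment.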


Exercises

11. Let $f$ be a functional on the Banach space $X$ to $R$. If $f'(a) = 0$ and there is a positive constant $K$ such that $f''(a) \cdot (h, h) \le -K\|h\|^2$ for every $h \in X$ prove that $f$ has a local maximum at $a$.

12. The real-valued function $f(t, u, v)$ on $R^3$ has continuous second partial derivatives with respect to all three variables and

$$F(x) = \int_a^b f(t, x(t), x'(t)) \, dt$$

where $x(t)$ is a real-valued function of $t$, continuously differentiable on $[a, b]$, which vanishes at the endpoints $a$ and $b$. With the norm

$$\|h\| = \max_{a \le t \le b} |h(t)| + \max_{a \le t \le b} |h'(t)|$$

where $h$ has the same properties as $x$, prove that

where $f_2$, for example, means the partial derivative with respect to the second argument.

13. Prove that

$$\lim_{\|h\| \to 0} \{T(x + h) - T(x) - T'(x) \cdot h - \tfrac12 T''(x) \cdot (h, h)\} / \|h\|^2 = 0.$$

14. Show that

$$\left\| T(x + \xi) - T(x) - \cdots - \frac{1}{(p - 1)!} T^{(p-1)}(x) \cdot (\xi, \ldots, \xi) \right\| \le \sup_{0 \le t \le 1} \|T^{(p)}(x + t\xi)\| \, \frac{\|\xi\|^p}{p!}.$$

4.4 Convex functionals

There is a class of functionals which is very important in applications. This section lays out some of their basic properties.

DEFINITION 4.4. A real-valued functional $f$ on a convex set $D_0$ of a normed linear space $X$ is said to be convex if

$$f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y)$$

for all $x, y \in D_0$ and any $\alpha$ satisfying $0 < \alpha < 1$. If equality does not occur in the definition unless $x = y$, then $f$ is said to be strictly convex.

For a linear functional, equality always holds so that a linear functional is convex but not strictly convex. One reason for discussing convex functionals is

THEOREM 4.4. If $f$ is convex on the convex set $D_0$ then, for arbitrary $x^1, x^2, \ldots, x^m$ in $D_0$,

$$f\Big(\sum_{j=1}^{m} a_j x^j\Big) \le \sum_{j=1}^{m} a_j f(x^j)$$

for any non-negative numbers $a_1, \ldots, a_m$ such that $\sum_{j=1}^{m} a_j = 1$.

Proof. The method of proof is by induction on $m$, starting from the observation that the result is true for $m = 2$ by the convexity of $f$ ($m = 1$ corresponds to a trivial statement). Assume that

$$f\Big(\sum_{j=1}^{m-1} b_j x^j\Big) \le \sum_{j=1}^{m-1} b_j f(x^j) \qquad (4.10)$$

for any non-negative $b_j$ such that $\sum_{j=1}^{m-1} b_j = 1$. Let $c = \sum_{j=1}^{m-1} a_j$. If $c = 0$ there is nothing to prove so take $c > 0$. Then, from the convexity of $f$ (note that $c + a_m = 1$),

$$f\Big(\sum_{j=1}^{m} a_j x^j\Big) \le c f\Big(\sum_{j=1}^{m-1} (a_j/c) x^j\Big) + a_m f(x^m) \le c \sum_{j=1}^{m-1} (a_j/c) f(x^j) + a_m f(x^m)$$

from (4.10) with $b_j = a_j/c$. Hence the theorem is proved.

If $D_0$ is open it can be shown that $f$ must be continuous. However, this need not be true if $D_0$ is not open. For example, if, on $0 < t \le 1$, $f(t) = 0$ for $t \ne 1$ and $f(1) = 1$ then $f$ is convex but not continuous. A relation between convexity and the derivative is provided by

THEOREM 4.4a. If $f$ has a linear Gateaux derivative on the convex set $D_0$ then $f$ is convex on $D_0$ if and only if

$$\delta f(y) \cdot (x - y) \le f(x) - f(y) \qquad (4.11)$$

for all $x, y \in D_0$. Furthermore, $f$ is strictly convex if and only if strict inequality holds if $x \ne y$.

This is really a statement that a convex functional is always above its tangent plane at a point.

Proof. If $f$ is convex

$$\{f(y + \alpha(x - y)) - f(y)\} / \alpha \le f(x) - f(y).$$

Allowing $\alpha \to 0$ we obtain (4.11). Conversely, if (4.11) is true,

$$\alpha f(x) + (1 - \alpha) f(y) - f(\alpha x + (1 - \alpha) y) = \alpha \{f(x) - f(z)\} + (1 - \alpha) \{f(y) - f(z)\}$$
$$\ge \alpha \, \delta f(z) \cdot (1 - \alpha)(x - y) + (1 - \alpha) \, \delta f(z) \cdot \alpha (y - x) = 0 \qquad (4.12)$$

where $z = \alpha x + (1 - \alpha) y$. Thus $f$ is convex. If strict inequality holds in (4.11), then it does in (4.12) and $f$ is strictly convex. Finally, if $f$ is strictly convex it is convex and so

$$\delta f(y) \cdot (z - y) \le f(z) - f(y).$$

Adding to this $f(z) < \alpha f(x) + (1 - \alpha) f(y)$ and noting that $z - y = \alpha(x - y)$ we derive

$$\delta f(y) \cdot (x - y) < f(x) - f(y)$$

and the proof is complete.
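The tangent-plane inequality (4.11) and the defining inequality of convexity can be spot-checked numerically. The sketch below is a hedged illustration only: the functional `f` (log-sum-exp plus a quadratic, which is strictly convex), the sampling box and the random seed are all assumptions made for the example.

```python
import math
import random

def f(x):
    """A strictly convex functional on R^2 chosen for illustration."""
    return math.log(math.exp(x[0]) + math.exp(x[1])) + 0.5 * (x[0] ** 2 + x[1] ** 2)

def grad(x):
    """Gradient of f, representing the derivative appearing in (4.11)."""
    e0, e1 = math.exp(x[0]), math.exp(x[1])
    s = e0 + e1
    return [e0 / s + x[0], e1 / s + x[1]]

def tangent_gap(x, y):
    """f(x) - f(y) - f'(y).(x - y); (4.11) says this is non-negative."""
    g = grad(y)
    return f(x) - f(y) - (g[0] * (x[0] - y[0]) + g[1] * (x[1] - y[1]))

def convexity_gap(x, y, alpha):
    """alpha f(x) + (1 - alpha) f(y) - f(alpha x + (1 - alpha) y)."""
    z = [alpha * x[i] + (1 - alpha) * y[i] for i in range(2)]
    return alpha * f(x) + (1 - alpha) * f(y) - f(z)

rng = random.Random(0)  # fixed seed so the run is reproducible
min_tangent_gap = float("inf")
min_convexity_gap = float("inf")
for _ in range(1000):
    x = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
    y = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
    alpha = rng.uniform(0.01, 0.99)
    min_tangent_gap = min(min_tangent_gap, tangent_gap(x, y))
    min_convexity_gap = min(min_convexity_gap, convexity_gap(x, y, alpha))

print(min_tangent_gap, min_convexity_gap)
```

Both minima should be non-negative up to rounding error, in agreement with Theorem 4.4a; a random check of this kind is of course no substitute for the proof.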

Exercises

15. Show that the inequality in Theorem 4.4a can be replaced by

$$\{f'(y) - f'(x)\} \cdot (y - x) \ge 0.$$

Hint: Theorem 4.2c.

16. If $f''(x) \cdot (h, h) \ge 0$ for all $h \in X$ and all $x \in D_0$ show that $f$ is convex. Show that $f''(x) \cdot (h, h) > 0$ for $h \ne 0$ implies strict convexity.

17. If $X$ is a Banach space and $B$ is a bounded linear operator on $X$ to $X$, the logarithmic norm of $B$ is defined by

$$\mu(B) = \lim_{t \to 0+} \{\|I + tB\| - 1\} / t.$$

Show that $\|I + tB\|$ is convex and deduce that $\mu$ always exists. Prove that

(i) $\mu(\alpha B) = \alpha \mu(B)$ for any positive number $\alpha$,
(ii) $\mu(B_1 + B_2) \le \mu(B_1) + \mu(B_2)$,
(iii) $|\mu(B)| \le \|B\|$,
(iv) $|\mu(B_1) - \mu(B_2)| \le \|B_1 - B_2\|$,
(v) $\mu(B + \xi I) = \mu(B) + \mathrm{Re}\, \xi$.

(The logarithmic norm is important in connection with matrices and differential equations.)

NEWTON'S METHOD FOR OPERATORS

4.5 Newton's method

The method of Newton has already been encountered (§§1.8, 1.9) in the context of deriving numerical solutions of equations. It is capable of considerable generalization and, when so generalized, plays a fundamental role in the theory of optimization. Sufficient theory has now been developed to present Newton's method in a wide enough setting.

The problem is essentially that of finding $x \in X$, $X$ being a Banach space, such that $T(x) = 0$ for an operator $T$ on $X$ to the Banach space $Y$. Given $x_1 \in X$ near enough to $x$ we can make the approximation

$$T(x_1 + x - x_1) = T(x_1) + T'(x_1) \cdot (x - x_1).$$

Since the left-hand side is zero, this suggests that

$$x - x_1 = -\{T'(x_1)\}^{-1} T(x_1)$$

provided that $T'(x_1)$ possesses an inverse. This leads to the Newton iterative scheme

$$x_{m+1} = x_m - \{T'(x_m)\}^{-1} T(x_m)$$

for finding a zero of $T$. In practice, because of the difficulty of evaluating the inverse in many cases, the scheme is often modified to

$$x_{m+1} = x_m - (A_m)^{-1} T(x_m)$$

where $A_m$ is some approximation to $T'$ for which an explicit formula can be determined for the inverse. This is particularly likely to be done in optimization procedures.

As has been seen previously for single algebraic equations, there are two basic questions which have to be faced. Under what conditions does the equation have a solution $x_0$ and when will the iteration process converge to it? In the following it will always be assumed that $T'$ exists at all points of a closed convex set $D_0$ and that

$$\|T'(x) - T'(y)\| \le K \|x - y\| \qquad (4.13)$$

for all $x, y \in D_0$.

The first case to be considered is when $A_m$ does not change from iteration to iteration, namely $A_m = A_1$ for all $m$. It will be assumed that $A_1$ is a bounded linear operator with an inverse such that

$$\|A_1^{-1}\| \le 1/a_1. \qquad (4.14)$$

Also, it will be supposed that $A_1$ can be chosen sufficiently close to $T'$ for

$$\|A_1 - T'(x_1)\| \le a_1 c_1, \quad c_1 < 1. \qquad (4.15)$$

Let

$$c_2 = \frac{2K \|A_1^{-1} T(x_1)\|}{a_1 (1 - c_1)^2},$$
$$c_3 = \{1 - (1 - c_2)^{1/2}\}(1 - c_1) a_1 / K,$$
$$c_4 = \{1 + (1 - c_2)^{1/2}\}(1 - c_1) a_1 / K.$$

Then we have

THEOREM 4.5. If $c_2 < 1$ and all $x$ satisfying $\|x - x_1\| < c_3$ are in $D_0$, the equation $T(x) = 0$ has one, and only one, root $x_0$ among those $x$ such that $\|x - x_1\| \le c_3$. The iteration

$$x_{m+1} = x_m - A_1^{-1} T(x_m)$$

converges to $x_0$ and

$$\|x_m - x_0\| \le c_3 - t_m$$

where

$$t_{m+1} = t_m + \{\tfrac12 K t_m^2 - a_1 (1 - c_1) t_m + a_1 \|A_1^{-1} T(x_1)\|\} / a_1$$

subject to $t_1 = 0$. In fact, the equation has a unique root in those $x$ of $D_0$ which satisfy $\|x - x_1\| < c_4$.
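To see how the constants of Theorem 4.5 and the majorizing sequence $\{t_m\}$ behave in a concrete case, the following Python sketch applies the fixed-operator iteration to a scalar equation. Everything specific here is an assumption made for illustration: the cubic `T`, the starting point `x1`, the set $D_0 = [1.9, 2.3]$ used to obtain the Lipschitz constant `K`, and the choice $A_1 = T'(x_1)$ (so that $c_1 = 0$).

```python
def T(x):
    """Illustrative scalar operator (an assumption): a cubic with a root near 2.09."""
    return x ** 3 - 2.0 * x - 5.0

def dT(x):
    """Derivative T'(x)."""
    return 3.0 * x ** 2 - 2.0

x1 = 2.1
K = 6.0 * 2.3           # |T''| <= 6 * 2.3 on D0 = [1.9, 2.3], so (4.13) holds with this K
A1 = dT(x1)             # fixed operator; with A1 = T'(x1) the constant c1 is zero
a1 = abs(A1)            # (4.14): |1/A1| = 1/a1
c1 = 0.0                # (4.15): |A1 - T'(x1)| = 0 <= a1 * c1
eta = abs(T(x1) / A1)   # the quantity ||A1^{-1} T(x1)||

c2 = 2.0 * K * eta / (a1 * (1.0 - c1) ** 2)
c3 = (1.0 - (1.0 - c2) ** 0.5) * (1.0 - c1) * a1 / K
c4 = (1.0 + (1.0 - c2) ** 0.5) * (1.0 - c1) * a1 / K

def iterate(n):
    """Return the first n iterates x_m and the matching majorants t_m (t_1 = 0)."""
    xs, ts = [x1], [0.0]
    for _ in range(n - 1):
        x, t = xs[-1], ts[-1]
        xs.append(x - T(x) / A1)
        ts.append(t + (0.5 * K * t * t - a1 * (1.0 - c1) * t + a1 * eta) / a1)
    return xs, ts

xs, ts = iterate(60)
x0 = xs[-1]  # taken as the limit of the iteration for comparison purposes

# Pairs (actual error, bound c3 - t_m) for the first few iterates.
checks = [(abs(x - x0), c3 - t) for x, t in zip(xs[:8], ts[:8])]
for err, bound in checks:
    print(f"{err:.3e} <= {bound:.3e}")
```

In this example $c_2$ is small, so the ball of radius $c_3$ about $x_1$ lies well inside the assumed $D_0$, and the printed errors stay below the bounds $c_3 - t_m$ up to rounding.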

Proof. Note, firstly, that the conditions $c_2 < 1$ and (4.15) make $c_3$ and $c_4$ positive real so that the inequalities in the theorem make sense. It will be shown now that the sequence $\{t_m\}$ converges to $c_3$. The recurrence relation may be written

$$t_{m+1} - t_m = K (t_m - c_3)(t_m - c_4) / 2a_1 \qquad (4.16)$$

from which it is clear that $t_{m+1} > t_m$ whenever $t_m < c_3$. Furthermore, $t_2 < c_3$ since $K c_4 < 2a_1$. By writing the relation as

$$t_{m+1} - c_3 = (t_m - c_3)\{1 + K (t_m - c_4) / 2a_1\}$$

we see that $K c_4 < 2a_1$ implies that $t_{m+1} < c_3$ when $t_m < c_3$. Thus $\{t_m\}$ is an increasing sequence which is bounded above by $c_3$; therefore, it must converge to a finite limit. This limit can be no other than $c_3$ as is evident from (4.16).

The next step is to show that the convergence of $\{t_m\}$ carries with it the convergence of the iteration scheme for $x_m$. Assume that $\|x_j - x_{j-1}\| \le t_j - t_{j-1}$ and that the $x_j$ are in $D_0$ for $j = 2, 3, \ldots, m$. This is certainly true for $m = 2$ since $\|x_2 - x_1\| = t_2 < c_3$ which, by hypothesis, shows that $x_2$ is in $D_0$. More generally, for $k \le m$,

$$\|x_k - x_1\| \le \sum_{j=1}^{k-1} \|x_{j+1} - x_j\| \le \sum_{j=1}^{k-1} (t_{j+1} - t_j) = t_k \le c_3$$

which verifies that the $x_k$ are in $D_0$. Now, since $A_1(x_m - x_{m-1}) = -T(x_{m-1})$,

$$\|x_{m+1} - x_m\| = \|A_1^{-1} T(x_m)\| \le \|T(x_m)\| / a_1 = \|T(x_m) - T(x_{m-1}) - A_1 (x_m - x_{m-1})\| / a_1$$
$$\le (1/a_1)\{\|T(x_m) - T(x_{m-1}) - T'(x_{m-1}) \cdot (x_m - x_{m-1})\| + \|A_1 - T'(x_{m-1})\| \|x_m - x_{m-1}\|\}.$$

On account of (4.13) and Theorem 4.2d

$$\|x_{m+1} - x_m\| \le (1/a_1)\{\tfrac12 K \|x_m - x_{m-1}\|^2 + \|A_1 - T'(x_{m-1})\| \|x_m - x_{m-1}\|\}.$$

Furthermore

$$\|A_1 - T'(x_{m-1})\| \le \|A_1 - T'(x_1)\| + \|T'(x_{m-1}) - T'(x_1)\| \le a_1 c_1 + K \|x_{m-1} - x_1\|.$$

Hence

$$\|x_{m+1} - x_m\| \le (1/a_1)\{\tfrac12 K (t_m + t_{m-1}) + a_1 c_1\}(t_m - t_{m-1}) = t_{m+1} - t_m \qquad (4.17)$$

from the recurrence relation for $t_m$. Consequently, the inequality is true for $m + 1$ if it is for $m$. Since it is valid for $m = 2$ it holds for general $m$ by induction. In addition

$$\|x_{m+k} - x_m\| \le \sum_{j=m}^{m+k-1} \|x_{j+1} - x_j\| \le \sum_{j=m}^{m+k-1} (t_{j+1} - t_j) = t_{m+k} - t_m. \qquad (4.18)$$

Since $\{t_m\}$ is a convergent sequence it follows that $\{x_m\}$ is a Cauchy sequence and therefore tends to a limit $x_0$ as $m \to \infty$ because $X$ is a Banach space. Let $k \to \infty$ in (4.18). Then $\|x_0 - x_m\| \le c_3 - t_m$ as stated in the theorem. Moreover,

$$\|T(x_m)\| = \|A_1 (x_{m+1} - x_m)\| \le \|A_1\| \|x_{m+1} - x_m\|$$

so that $T(x_m) \to 0$ as $m \to \infty$. The existence of $T'$ implies that $T$ is a continuous operator and so $T(x_0) = 0$.

The only outstanding question is whether $T$ could have a root other than $x_0$. Suppose $T(y_0) = 0$ with $\|y_0 - x_1\| < c_4$. Put $s_1 = \|y_0 - x_1\|$ and use this as a starting point for a sequence $\{s_m\}$ which satisfies the same recurrence relation as $t_m$. From the argument for $\{t_m\}$ it is clear that, when $s_1 < c_3$, $\{s_m\}$ is an increasing sequence which converges to $c_3$. On the other hand, if $s_1 > c_3$, repetition of the argument reveals that $\{s_m\}$ is a decreasing sequence which converges to $c_3$. In either case $s_m \to c_3$ as $m \to \infty$. Also, the recurrence relation indicates that $s_{m+1} \ge t_{m+1}$ if $s_m \ge t_m$. Since $s_1 \ge 0$ it follows that $s_m \ge t_m$ holds for general $m$. Assume that $\|y_0 - x_j\| \le s_j - t_j$ for $j = 1, 2, \ldots, m$; it is known to be true for $m = 1$. Then

$$\|y_0 - x_{m+1}\| \le \|A_1 y_0 - A_1 x_m + T(x_m)\| / a_1$$
$$\le (1/a_1)\{\|T(y_0) - T(x_m) - T'(x_m) \cdot (y_0 - x_m)\| + \|A_1 - T'(x_m)\| \|y_0 - x_m\|\}$$
$$\le (1/a_1)\{\tfrac12 K (s_m + t_m) + a_1 c_1\}(s_m - t_m)$$

as in the derivation of (4.17). From the recurrence relation, the right-hand side is $s_{m+1} - t_{m+1}$. By induction, the inequality is valid for general $m$. Allowing $m \to \infty$ and observing that $\{s_m\}$ and $\{t_m\}$ both tend to the same limit we deduce that $y_0 = x_0$. Therefore, $T(x) = 0$ has the single root $x_0$ for those $x$ in $D_0$ which comply with $\|x - x_1\| < c_4$. The proof is terminated.

Next, it is desirable to introduce more flexibility into the choice for $A_m$. $A_m$ should be permitted some freedom but it should not be allowed to deviate from $T'$ by too much. The conditions which will be imposed are that $A_m$ is a bounded linear operator such that

$$\|A_m^{-1}\| \le 1/a_m, \qquad (4.19)$$
$$\|T'(x_1) - A_1\| \le a_1 - b, \qquad (4.20)$$
$$\|T'(x_m) - A_m\| \le a_m - b_1 + \alpha K \sum_{j=1}^{m-1} \|x_{j+1} - x_j\| \quad (m > 1) \qquad (4.21)$$

where $\alpha \ge 1$, $a_1 \ge b > 0$, $b \ge b_1 > 0$ are all independent of $m$. It is worth remarking that these conditions are fulfilled in the case already considered in which $A_m = A_1$ for all $m$. Choose $a_m = a_1$, $b = b_1 = a_1(1 - c_1)$, $\alpha = 1$. Then (4.19) and (4.20) follow from (4.14) and (4.15) while (4.21) was demonstrated in the proof of Theorem 4.5. Let

$$c_5 = 2\alpha a_1 K \|A_1^{-1} T(x_1)\| / b_1^2,$$
$$c_6 = b_1 \{1 - (1 - c_5)^{1/2}\} / \alpha K,$$
$$c_7 = b_1 \{1 + (1 - c_5)^{1/2}\} / \alpha K.$$

These quantities revert to $c_2$, $c_3$, $c_4$ when $A_m = A_1$ and the choice of the preceding paragraph is made.

THEOREM 4.5a. If $c_5 \le 1$, if all $x$ in $\|x - x_1\| < c_6$ are in $D_0$ and if there is $a_0$ such that $a_m \le a_0$ for $m = 1, 2, \ldots$ then the iteration

$$x_{m+1} = x_m - A_m^{-1} T(x_m)$$

converges to a root $x_0$ of $T(x) = 0$ and

$$\|x_m - x_0\| \le c_6 - t_m$$

where

$$t_{m+1} = t_m + \{\tfrac12 \alpha K t_m^2 - b_1 t_m + a_1 \|A_1^{-1} T(x_1)\|\} / a_m$$

subject to $t_1 = 0$. There is no other root of $T(x) = 0$ in $D_0$ which lies in $\|x - x_1\| < c_7$.

Proof. The proof is similar to that of Theorem 4.5 but there are some subtle differences due to the structure of the right-hand side of (4.21). The first point to notice is that

$$\|x_2 - x_1\| = t_2 = \|A_1^{-1} T(x_1)\| = \{1 + (1 - c_5)^{1/2}\} c_6 b_1 / 2a_1 < c_6$$

by the assumptions on $c_5$ and (4.19)-(4.21). Now assume that $c_6 > t_j > t_{j-1}$, $\|x_j - x_{j-1}\| \le t_j - t_{j-1}$ and the $x_j$ are in $D_0$ for $j = 2, 3, \ldots, m$. This assumption has just been verified for $m = 2$. Since

$$t_{m+1} - t_m = \alpha K (t_m - c_6)(t_m - c_7) / 2a_m \qquad (4.22)$$

it follows that $t_{m+1} > t_m$. Also

$$t_{m+1} - c_6 = (t_m - c_6)\{1 + \alpha K (t_m - c_7) / 2a_m\}$$

and

$$2a_m + \alpha K (t_m - c_7) = 2a_m + \alpha K t_m - 2b_1 + \alpha K c_6 > 2(a_m - b_1 + \alpha K t_m) > 0 \qquad (4.23)$$

on account of (4.21). Hence $t_{m+1} < c_6$. As in Theorem 4.5, but using (4.19),

$$\|x_{m+1} - x_m\| \le \|T(x_m) - T(x_{m-1}) - A_{m-1}(x_m - x_{m-1})\| / a_m$$
$$\le (1/a_m)\{\tfrac12 \alpha K \|x_m - x_{m-1}\|^2 + (a_{m-1} - b_1 + \alpha K t_{m-1})(t_m - t_{m-1})\}$$

from (4.21) and $\alpha \ge 1$. One infers that $\|x_{m+1} - x_m\| \le t_{m+1} - t_m$. Thus, the assumptions made for $m$ imply validity for $m + 1$ so that they hold for all $m$ by induction. The conclusions that $t_m \to c_6$, $x_m \to x_0$ as $m \to \infty$ and that

$$\|x_m - x_0\| \le c_6 - t_m$$

may be deduced now by the same arguments as in Theorem 4.5. That $T(x_0) = 0$ can be inferred from

$$\|T(x_m)\| \le \|A_m\| \|x_{m+1} - x_m\|$$

provided that $A_m$ is uniformly bounded. But, from (4.21),

$$\|A_m\| \le \|T'(x_m)\| + a_m - b_1 + \alpha K \sum_{j=1}^{m-1} \|x_{j+1} - x_j\|$$
$$\le K \|x_m - x_1\| + \|T'(x_1)\| + a_0 - b_1 + \alpha K c_6$$
$$\le (\alpha + 1) K c_6 + \|T'(x_1)\| + a_0 - b_1$$

so that uniform boundedness is verified.

With regard to uniqueness, form the sequence $\{s_m\}$ as in Theorem 4.5 but using the changed recurrence relation. It may be checked that $s_{m+1} \ge t_{m+1}$ if $s_m \ge t_m$ because of (4.21). Therefore, the validity of (4.23) for $t_m$ implies, a fortiori, that the same inequality holds for $s_m$. Hence $\{s_m\}$ is either an increasing or a decreasing sequence converging to $c_6$. Thereafter, the argument runs along similar lines to Theorem 4.5 and the proof is finished.


Sometimes it is awkward to confirm that both (4.19) and (4.21) are true. Therefore, we now give a simpler sufficient condition which ensures their validity. Condition (4.20) is kept and so is (4.19) for $m = 1$. For $m \ge 2$, instead of (4.19) and (4.21), we consider

$$\|A_m - T'(x_m)\| \le \delta + \gamma \sum_{j=1}^{m-1} \|x_{j+1} - x_j\| \qquad (4.24)$$

with $\gamma \ge 0$ and $0 \le 2\delta \le b$. The aim is to show that (4.24) enforces (4.19) and (4.21) with

$$a_m = b - \delta - (K + \gamma) \sum_{j=1}^{m-1} \|x_{j+1} - x_j\|, \quad b_1 = b - 2\delta, \quad \alpha = (2\gamma + K)/K. \qquad (4.25)$$

With this choice

$$a_m - b_1 + \alpha K \sum_{j=1}^{m-1} \|x_{j+1} - x_j\| = \delta + \gamma \sum_{j=1}^{m-1} \|x_{j+1} - x_j\|$$

so that (4.21) holds whenever (4.24) does. Also $a_m \le a_1$ so that the condition on $a_m$ in Theorem 4.5a is satisfied. It remains to check (4.19). The substitution (4.25) changes $c_5$ to $c$ where

$$c = 2a_1 (2\gamma + K) \|A_1^{-1} T(x_1)\| / (b - 2\delta)^2$$

and $c_6$ to $d$ where

$$d = \{1 - (1 - c)^{1/2}\}(b - 2\delta) / (2\gamma + K).$$

In view of Theorem 4.5 we need consider what happens only when

$$\sum_{j=1}^{m-1} \|x_{j+1} - x_j\| < d.$$

Now

$$\|A_m - A_1\| \le \|A_m - T'(x_m)\| + \|T'(x_m) - T'(x_1)\| + \|T'(x_1) - A_1\|$$
$$\le \delta + \gamma \sum_{j=1}^{m-1} \|x_{j+1} - x_j\| + K \|x_m - x_1\| + a_1 - b$$
$$\le \delta + (\gamma + K) \sum_{j=1}^{m-1} \|x_{j+1} - x_j\| + a_1 - b < a_1$$

so long as $c < 1$. Hence

$$\|I - A_1^{-1} A_m\| < 1.$$

According to the remark at the end of Theorem 1.11a, if $\|A\| < 1$, then $(I - A)^{-1}$ exists and $\|(I - A)^{-1}\| \le 1/(1 - \|A\|)$. On replacing $A$ by $I - A_1^{-1} A_m$ we deduce that $A_m^{-1} A_1$ exists and

$$\|A_m^{-1} A_1\| \le a_1 \Big/ \Big\{b - \delta - (\gamma + K) \sum_{j=1}^{m-1} \|x_{j+1} - x_j\|\Big\}.$$

Confirmation of (4.19) with the $a_m$ of (4.25) has been secured. Theorem 4.5a can be applied now. However, the final part involves the sequence $\{s_m\}$ and its properties can be reproduced only if the denominator of the recurrence relation is kept positive. With that proviso we can state

THEOREM 4.5b. If (4.24) holds, if $c < 1$ and all $x$ in $\|x - x_1\| < d$ are in $D_0$ then the iteration

$$x_{m+1} = x_m - A_m^{-1} T(x_m)$$

converges to a root $x_0$ of $T(x) = 0$ and

$$\|x_m - x_0\| \le d - t_m$$

where

$$t_{m+1} = t_m + \{\tfrac12 (2\gamma + K) t_m^2 - (b - 2\delta) t_m + a_1 \|A_1^{-1} T(x_1)\|\} / d_m,$$
$$d_1 = a_1, \quad d_m = b - \delta - (K + \gamma) t_m \quad (m > 1)$$

subject to $t_1 = 0$. There is no other root of $T(x) = 0$ in $D_0$ which lies in

$$\|x - x_1\| < \min[(b - \delta)/(K + \gamma), \; \{1 + (1 - c)^{1/2}\}(b - 2\delta)/(2\gamma + K)].$$

One particular case where Theorem 4.5b can be invoked is in Newton's method. Here $A_m = T'(x_m)$ so that (4.24) is satisfied with $\gamma = 0$ and $\delta = 0$. Moreover, (4.20) can be met by selecting $b = a_1$ with $a_1 = 1/\|\{T'(x_1)\}^{-1}\|$. Hence we have

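THEOREM 4.5c (NEWTON-KANTOROVICH). Let $a_1 = 1/\|\{T'(x_1)\}^{-1}\|$ and $c = 2K \|\{T'(x_1)\}^{-1} T(x_1)\| / a_1$. If $c < 1$ and all $x$ in $\|x - x_1\| < d$, where $d = \{1 - (1 - c)^{1/2}\} a_1 / K$, are in $D_0$ then the iteration

$$x_{m+1} = x_m - \{T'(x_m)\}^{-1} T(x_m)$$

converges to a root $x_0$ of $T(x) = 0$ and

$$\|x_m - x_0\| \le d - t_m$$

where

$$t_{m+1} = t_m + \{\tfrac12 K t_m^2 - a_1 t_m + a_1 \|\{T'(x_1)\}^{-1} T(x_1)\|\} / (a_1 - K t_m) \quad (m \ge 1)$$

subject to $t_1 = 0$. The root is unique in $\|x - x_1\| < a_1/K$.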
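A scalar illustration of the Newton-Kantorovich bound may help to fix ideas. The Python sketch below specializes the quantities of Theorem 4.5b to $\gamma = 0$, $\delta = 0$, $b = a_1$, so that $d = \{1 - (1 - c)^{1/2}\} a_1 / K$ and the denominator of the recurrence is $a_1 - K t_m$. The equation $e^x - 2 = 0$, the starting point `x1` and the set $D_0 = [0.5, 1.0]$ used to fix `K` are assumptions chosen for the example.

```python
import math

def T(x):
    """Illustrative scalar operator (an assumption): T(x) = exp(x) - 2, root log 2."""
    return math.exp(x) - 2.0

def dT(x):
    """Derivative T'(x)."""
    return math.exp(x)

x1 = 0.8
K = math.e                   # Lipschitz constant of T' on the assumed D0 = [0.5, 1.0]
a1 = abs(dT(x1))             # a1 = 1 / ||T'(x1)^{-1}||
eta = abs(T(x1) / dT(x1))    # ||T'(x1)^{-1} T(x1)||
c = 2.0 * K * eta / a1
d = (1.0 - math.sqrt(1.0 - c)) * a1 / K

xs, ts = [x1], [0.0]
for _ in range(6):
    x, t = xs[-1], ts[-1]
    xs.append(x - T(x) / dT(x))                                        # Newton step
    ts.append(t + (0.5 * K * t * t - a1 * t + a1 * eta) / (a1 - K * t))  # majorant

x0 = math.log(2.0)           # exact root, known in closed form for this example
errors = [abs(x - x0) for x in xs]
bounds = [d - t for t in ts]
for err, bound in zip(errors, bounds):
    print(f"{err:.3e} <= {bound:.3e}")
```

Here $c$ is about 0.25, the ball $|x - x_1| < d$ lies inside the assumed $D_0$, and each error is dominated by the corresponding $d - t_m$ until both reach rounding level.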

To assist with practical computations it is desirable to supplement the above theorems with estimates of error bounds which are more readily applicable. Although the ideal theorem is not available because determination of $K$ is entailed the following theorem goes some way to meet the need by clamping the error between successive iterates.

THEOREM 4.5d. Under the conditions of Theorem 4.5b the error $e_m = x_m - x_0$ satisfies

$$\tfrac12 \|x_{m+1} - x_m\| \le \|e_m\| \le 2a_m (t_{m+1} - t_m) / (b_1 - \alpha K t_m).$$

For Newton's method

$$\tfrac12 \|x_{m+1} - x_m\| \le \|e_m\| \le 2(t_{m+1} - t_m).$$

Proof. Consider first the lower bound. From Theorem 4.2c and the fact that $T(x_0) = 0$ we have

$$\|x_{m+1} - x_m\| = \Big\|\int_0^1 A_m^{-1} T'(x_0 + t e_m) \cdot e_m \, dt\Big\| = \Big\|\int_0^1 \{A_m^{-1} T'(x_0 + t e_m) - I\} \cdot e_m \, dt + e_m\Big\|$$
$$\le \Big\{\sup_{t \in [0,1]} \|A_m^{-1} T'(x_0 + t e_m) - I\| + 1\Big\} \|e_m\|$$

from Theorem 4.2b. Hence

$$\|x_{m+1} - x_m\| \le \{\|A_m^{-1} T'(x_m) - I\| + \sup \|A_m^{-1}(T'(x_0 + t e_m) - T'(x_m))\| + 1\} \|e_m\|$$
$$\le \frac{\{a_m - b_1 + \alpha K t_m + K \|e_m\| + a_m\} \|e_m\|}{a_m}$$

from (4.14), (4.15), and (4.13). Consequently, Theorem 4.5a implies that

$$\|x_{m+1} - x_m\| \le \frac{\{2a_m + \alpha K c_6 - b_1\} \|e_m\|}{a_m}$$

and the inequality for the lower bound is established. The upper bound can be deduced immediately from that in Theorem 4.5b and the observation that $c_7 \ge b_1/\alpha K$ in (4.22). As far as Newton's method is concerned it is only necessary to note that $a_m = b_1 - \alpha K t_m$. The proof of the theorem is complete.

Exercises

18. In Theorem 4.5 a new starting point $x_1'$ is chosen such that $\|x_1' - x_1\| \le t_1' < c_4$ and $t_1'$ is taken as the first value of the $t$ sequence instead of zero. Assuming that every $x$ in $\|x - x_1\| < t_1'$ is in $D_0$ show that the iteration converges to the same $x_0$.

19. $S$ is a subspace of a normed linear space, and for each $s \in S$, $\|s\| \le B$. If, for each $s \in S$ and $x_n$, there is a bounded linear operator $A_{ns}$ from $X$ to $Y$ such that

$$\|A_{ns} - T'(x_n)\| \le \alpha \|s\|$$

where $\alpha \ge 0$ is independent of $n$ and $s$, explain what Theorem 4.5b predicts about the iteration $x_{m+1} = x_m - (A_{m s_m})^{-1} T(x_m)$ when the right-hand side of (4.23) is replaced by $\alpha \|s_m\|$.

20. Show that

21. Prove that

$$\|e_{m+1}\| \le \|e_m\| \sup_{t \in [0,1]} \|A_m^{-1} T'(x_0 + t e_m) - I\|.$$

Hence prove that

Use this result to show that in Newton's method

so that the convergence is quadratic for c <

22. If $f$, from $R^2$ to $R^2$, has components

$$f_1 = x^2 + y^2 - 20x + 32, \quad f_2 = y^2 - 2x$$

use Newton's method starting from $x_1 = (2.4, 1.8)$ to find an approximation to the zero $x_0 = (2, 2)$.
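For a system such as the one in Exercise 22, the Newton iteration of this section amounts to solving a small linear system with the Jacobian at each step. The Python sketch below is one possible way of carrying this out, taking the components as $f_1 = x^2 + y^2 - 20x + 32$ and $f_2 = y^2 - 2x$ (both vanish at $(2, 2)$); the helper names `F`, `J`, `solve2` and `newton`, and the number of steps, are assumptions made for the illustration.

```python
def F(v):
    """Components f1, f2 of the map from R^2 to R^2."""
    x, y = v
    return [x * x + y * y - 20.0 * x + 32.0, y * y - 2.0 * x]

def J(v):
    """Jacobian matrix of F, which represents the Frechet derivative F'(v)."""
    x, y = v
    return [[2.0 * x - 20.0, 2.0 * y],
            [-2.0, 2.0 * y]]

def solve2(A, b):
    """Solve the 2x2 linear system A d = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def newton(v, steps=8):
    """Apply x_{m+1} = x_m - {F'(x_m)}^{-1} F(x_m) and record every iterate."""
    history = [list(v)]
    for _ in range(steps):
        Fv = F(v)
        step = solve2(J(v), [-Fv[0], -Fv[1]])
        v = [v[0] + step[0], v[1] + step[1]]
        history.append(list(v))
    return history

history = newton([2.4, 1.8])
for point in history:
    print(point)
```

The linear system is solved rather than the inverse Jacobian being formed, which mirrors the usual practice of avoiding explicit inverses.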

OPTIMIZATION

4.6 Unconstrained optimization

In the optimal design of engineering systems it is frequently necessary to vary the parameters of a system until a minimum or a maximum is reached. For instance, one might be interested in an antenna design with minimum input power or one with maximum gain. The problem is then usually one of seeking a stationary point of a function of several variables. In other words, we are given $f$ from $R^n$ to $R$ and we wish to know those points of $R^n$ at which the Fréchet derivative $f'$ vanishes. This question has already been encountered in the preceding section, and if we replace $T$ there by $f'$ the Newton iteration scheme

$$x_{m+1} = x_m - (f'')^{-1} f'(x_m)$$

suggests itself. Since it has been seen in §4.3 that $f''$ can be identified with the Hessian matrix we can write this as

$$x_{m+1} = x_m - G_m^{-1} g_m \qquad (4.26)$$

where $G_m$ is the Hessian matrix calculated at $x_m$ and $g_m$ is the gradient of $f$ at $x_m$, i.e. $g_m = \nabla f(x_m)$. The theorems of the previous section give some idea of when the iteration is likely to succeed. Only minima need be considered since maxima can be dealt with by changing $f$ to $-f$. Broadly speaking, the theorems state that, if $x_1$ is close enough to the desired solution, the sequence will converge quadratically (Exercise 21) when $G_m$ satisfies a condition of the form (4.13) and is positive definite.

Since (4.26) represents a system of $n$ equations it is normally not the practice to construct $G_m^{-1} g_m$ directly but rather to solve the set of equations $G_m p_m = g_m$. This can be carried out by Cholesky decomposition (§1.12) which is numerically stable when $G$ is positive definite. If $G_m$ is not positive definite alternative strategies must be devised.

The Newton method is efficient in terms of the number of iterations needed for convergence but it does require knowledge of the Hessian and expensive multiplication to find its inverse. Other reasons why the Newton method is not suitable for a general purpose minimization algorithm are the requirement that $x_1$ be close to $x_0$ and the difficulties which arise when $G$ is not positive definite. If $G$ is not positive definite the iteration may stop at a stationary point which is not a minimum (which may be checked by seeing whether $G$ is positive definite there) but, what is worse, it may stick at a point which is not a stationary point (Powell 1966), because $g_m$ is orthogonal to $G_m^{-1} g_m$.

Two techniques are available to overcome the difficulty of $G_m$ not being positive definite. The first is to replace $G_m$ by a matrix which is forced to be positive definite. A simple possibility is to put $G_m + \mu I$ for $G_m$ and ensure that $\mu$ guarantees positive definiteness. The main difficulty is deciding how to choose $\mu$, though an iterative method has been suggested (Hebden 1973; Goldfeld, Quandt, and Trotter 1966). Alternatively, $G_m + A$ may be used as a replacement with $A$ related to the Cholesky decomposition (Gill and Murray 1972, Matthews and Davies 1971) of $G_m$. Another strategy is to approximate $f$ locally by a quadratic function such as $\phi_m$ where
-

X m)

Xm+ I

so that it

< f(x m ) ·

On this basis Fletcher (1972a) has invented a method which leads to f(x m + 1) < f(x m ) via a quadratic programming problem. This algorithm is probably to be preferred to those which replace Gm by a positive definite matrix when X m is a saddlepoint of f.
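The simplest of the modifications mentioned above, replacing $G_m$ by $G_m + \mu I$ and solving the resulting equations by Cholesky decomposition, can be sketched as follows. This is a minimal illustration and not one of the cited algorithms: the rule for increasing $\mu$ (start at `mu0`, multiply by ten until the factorization succeeds), the function names and the test matrix are all assumptions.

```python
import math

def cholesky(G):
    """Return lower-triangular L with G = L L^T, or None if G is not positive definite."""
    n = len(G)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None      # a non-positive pivot signals failure
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_cholesky(L, g):
    """Solve L L^T p = g by forward substitution followed by back substitution."""
    n = len(g)
    y = [0.0] * n
    for i in range(n):
        y[i] = (g[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (y[i] - sum(L[k][i] * p[k] for k in range(i + 1, n))) / L[i][i]
    return p

def shifted_newton_direction(G, g, mu0=1e-3):
    """Solve (G + mu I) p = g, increasing mu until the factorization succeeds."""
    n = len(G)
    mu = 0.0
    while True:
        shifted = [[G[i][j] + (mu if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        L = cholesky(shifted)
        if L is not None:
            return solve_cholesky(L, g), mu
        mu = mu0 if mu == 0.0 else 10.0 * mu

G = [[1.35, 1.65], [1.65, 1.35]]   # indefinite example: eigenvalues 3 and -0.3
g = [1.0, -1.0]
p, mu = shifted_newton_direction(G, g)
print(p, mu)
```

Because the shifted matrix is positive definite, $(g, p) > 0$, so that $-p$ is a descent direction even though the original $G$ is indefinite. How large $\mu$ should be in practice is exactly the difficulty noted in the text.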


The modifications which have been proposed to Newton's method are related to the general notion of selecting a direction $p_m$, and then determining a step size $\alpha_m$ in making the iteration

$$x_{m+1} = x_m + \alpha_m p_m. \qquad (4.27)$$

Observe that here $\alpha_m$ is a positive scalar whereas the other quantities are vectors. The question is how to choose $\alpha_m$ and $p_m$ so that $g_m \to 0$. Of course, in any practical implementation it will rarely be possible to attain the point where $g = 0$ since only a finite number of iterations can be carried out. It will have to be acceptable that the procedure terminates if an $x_m$ is found at which $\|g_m\| < \varepsilon_0$ for some preselected $\varepsilon_0$.

Since one wishes $f$ to be reduced at every iteration an obvious condition to ask is that the direction of search $p_m$ should have a substantial component in the direction of the negative gradient. One way of accomplishing this is to impose

$$-(g_m, p_m) \ge \varepsilon_1 \|g_m\| \|p_m\| \qquad (4.28)$$

for some specified $\varepsilon_1 > 0$. Next, we want to ensure that we can take a non-zero step in the chosen direction, i.e. $\alpha_m \ne 0$. One possible condition is

$$|(g_{m+1}, p_m)| \le \varepsilon_2 |(g_m, p_m)| \qquad (4.29)$$

for some selected $\varepsilon_2$ such that $1 > \varepsilon_2 > 0$. Another is

$$|f(x_m) + \alpha_m (g_m, p_m) - f(x_{m+1})| > \varepsilon_3 \alpha_m |(g_m, p_m)| \qquad (4.30)$$

with $\varepsilon_3 > 0$. If $\varepsilon_2$ is about $\tfrac12$, $\alpha_m$ should turn out to be of reasonable size. Finally, to make sure that there is a non-zero change in $f$, the restriction

$$f(x_m) - f(x_{m+1}) \ge -\varepsilon_4 \alpha_m (p_m, g_m) \qquad (4.31)$$

is imposed, the right-hand side being positive when (4.28) holds.

Algorithms in which $f(x_{m+1}) \le f(x_m)$ are often known as descent methods. One of the earliest was the method of steepest descent in which $p_m = -g_m$ and $\alpha_m$ is varied until $f(x_{m+1})$ is a minimum. The steepest descent method has extremely slow convergence in general, primarily because it makes no allowance for the curvature of $f$, and so it is now rarely used.

Once the direction of search has been settled, the choice of $\alpha_m$ has to be considered. If it is to be chosen so that $f(x_m + \alpha_m p_m)$ is a minimum, the question arises of how this is to be done numerically. Although it is a problem in a single variable only, one cannot usually make a reliable estimate of it from a few values of $f$. Contrariwise, one does not wish to calculate $f$ too often so that frequently a compromise is involved. At any rate, $\alpha_m$ will seldom be located at the precise minimum of $f(x_{m+1})$ in practice though theoretical investigations often assume that an exact line search can be undertaken. Inexact line searches can cause algorithms to behave differently from one another when theory would predict that they gave the same results.

A practical method for computing $\alpha_m$ is to estimate a value $\alpha'$ for which it is certain that $\alpha_m < \alpha'$. Then choose $\alpha_m = \tfrac12 \alpha'$ and if $f$ is, say, decreasing for $\alpha_m > \tfrac12 \alpha'$ test $f$ at $\tfrac12 \alpha' + j\beta$ for $j = 1, 2, \ldots$ until a point $\alpha''$ is found at which $f$ begins to increase. Then repeat the procedure starting at $\alpha''$ with a smaller steplength $\beta_1$. More information on line searches can be found in Fletcher (1987). A useful theorem (see also Wolfe 1971) in connection with descent methods is

THEOREM 4.6. If there are finite $L$ and $M$ such that $f(x) \ge L$ for all $x$ and $\|G\| \le M$ for all $x$ for which $f(x) \le f(x_1)$ the descent method (4.27) will terminate where $\|g\| < \varepsilon_0$ under conditions (4.28), (4.31), and either of (4.29) and (4.30).

Since the customary $L_2$ inner product on $R^n$ has been adopted the condition on $\|G\|$ is equivalent to specifying that all the eigenvalues of $G$ lie between $-M$ and $M$.

Proof. From Theorem 4.2c

$$(g_{m+1}, p_m) = (g_m, p_m) + \Big(p_m, \int_0^1 G(x_m + t(x_{m+1} - x_m)) \alpha_m p_m \, dt\Big).$$

Hence

$$|(g_{m+1}, p_m)| \ge |(g_m, p_m)| - M \|p_m\|^2 \alpha_m.$$

Then (4.29) implies that

$$(1 - \varepsilon_2) |(g_m, p_m)| \le M \alpha_m \|p_m\|^2.$$

Therefore, it follows from (4.28) and (4.31) that

$$f(x_m) - f(x_{m+1}) \ge \varepsilon_4 (1 - \varepsilon_2) |(g_m, p_m)|^2 / M \|p_m\|^2 \ge \varepsilon_4 \varepsilon_1^2 (1 - \varepsilon_2) \|g_m\|^2 / M.$$

Consequently, if $\|g_m\| \ge \varepsilon_0$, $f(x_m) - f(x_{m+1})$ cannot be less than $\varepsilon_4 \varepsilon_1^2 \varepsilon_0^2 (1 - \varepsilon_2)/M$. Since $f$ is bounded below, a finite number of such steps is possible at most, i.e. after a finite number of iterations a point at which $\|g\| < \varepsilon_0$ must be reached. The theorem has been proved when (4.28), (4.29), and (4.31) are valid.

If (4.30) is invoked instead of (4.29), we infer from Theorem 4.3a that

$$f(x_{m+1}) = f(x_m) + \alpha_m (g_m, p_m) + \alpha_m^2 \Big(p_m, \int_0^1 (1 - t) G(x_m + t(x_{m+1} - x_m)) p_m \, dt\Big)$$

and so, from Theorem 4.2b,

$$|f(x_{m+1}) - f(x_m) - \alpha_m (g_m, p_m)| \le \tfrac12 \alpha_m^2 M \|p_m\|^2.$$

We deduce from (4.30) that

$$\tfrac12 \alpha_m M \|p_m\|^2 \ge \varepsilon_3 |(g_m, p_m)|;$$

then (4.28) and (4.31) give

$$f(x_m) - f(x_{m+1}) \ge 2 \varepsilon_4 \varepsilon_3 \varepsilon_1^2 \|g_m\|^2 / M.$$

Again only a finite number of steps with $\|g_m\| \ge \varepsilon_0$ are possible and the proof is finished.

How near the iteration terminates to a point where $\|g\| = 0$ depends upon the flatness of $f$, i.e. on how large the set for which $\|g\| < \varepsilon_0$ is. Remark that the proof does not require (4.28)-(4.31) to be applied at consecutive steps. Therefore we have the following corollary.

COROLLARY 4.6. The conclusion of Theorem 4.6 will hold after a finite number of iterations at which (4.28), (4.31) and one of (4.29), (4.30) are imposed provided that on other steps $f(x_{m+1}) \le f(x_m)$.
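A descent iteration of the form (4.27) with the termination test $\|g\| < \varepsilon_0$ can be sketched in a few lines. The Python example below uses the steepest-descent direction $p_m = -g_m$, checks the angle condition (4.28), and enforces the decrease condition (4.31) by halving $\alpha_m$ from a fixed initial value. Note the substitution: simple backtracking is used here in place of conditions (4.29) or (4.30) to keep the step from becoming needlessly small, and the quadratic objective, the tolerances and the function names are all assumptions made for the illustration.

```python
import math

def f(x):
    """Illustrative objective (an assumption): a quadratic with eigenvalues 1 and 10."""
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad(x):
    """Gradient g of f."""
    return [x[0], 10.0 * x[1]]

def descent(x, eps0=1e-6, eps1=0.1, eps4=1e-4, max_iter=10_000):
    """Iterate x <- x + alpha p until ||g|| < eps0; return the point and iteration count."""
    for m in range(max_iter):
        g = grad(x)
        gnorm = math.hypot(g[0], g[1])
        if gnorm < eps0:
            return x, m
        p = [-g[0], -g[1]]                       # steepest-descent direction
        gp = g[0] * p[0] + g[1] * p[1]           # inner product (g, p), negative here
        pnorm = math.hypot(p[0], p[1])
        assert -gp >= eps1 * gnorm * pnorm       # angle condition (4.28)
        alpha = 1.0
        # Halve alpha until the decrease condition (4.31) is satisfied.
        while f(x) - f([x[0] + alpha * p[0], x[1] + alpha * p[1]]) < -eps4 * alpha * gp:
            alpha *= 0.5
        x = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
    return x, max_iter

x_final, iterations = descent([1.0, 1.0])
print(x_final, iterations)
```

Even on this mildly ill-conditioned quadratic the iteration needs many more steps than the dimension of the problem, which illustrates the slow convergence of steepest descent remarked upon earlier.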

In a large number of algorithms the search direction is taken as $p_m = -H_m g_m$ where $H_m$ is some suitable matrix. For such directions (4.28) can be confirmed in certain conditions.

THEOREM 4.6a. If $\lambda_m^{(1)}$ and $\lambda_m^{(n)}$ are the smallest and largest eigenvalues respectively of the positive definite $H_m$, (4.28) is valid when $\lambda_m^{(1)}/\lambda_m^{(n)}$ is uniformly bounded below by a positive constant.

Proof. Since $(g_m, H_m g_m) \ge \lambda_m^{(1)} \|g_m\|^2$ and

$$\|H_m g_m\| \le \|H_m\| \|g_m\| \le \lambda_m^{(n)} \|g_m\|,$$

$$-(p_m, g_m) / \|p_m\| \|g_m\| \ge \lambda_m^{(1)}/\lambda_m^{(n)} \ge a$$

where $a$ is the positive lower bound. Thus (4.28) is verified with $\varepsilon_1 = a$ and the proof is complete.

On account of Theorem 4.6a it is not unusual in many algorithms to redefine $H_m$ if it is discovered that $\lambda_m^{(1)}/\lambda_m^{(n)}$ is becoming unduly small.

The advantage of working with $H_m$ instead of $G_m^{-1}$ is that $G_m^{-1}$ requires the calculation of the second partial derivatives of $f$, whereas this can be avoided for $H_m$ so long as it is a good enough approximation to $G_m^{-1}$. One simple method is to use differences of gradients, i.e. to replace the $ij$ component of $G_m$ by

$$\{g_i(x_m + h e_j) - g_i(x_m)\} / h$$

where $e_j$ is the $j$th coordinate vector and $h$ is a suitable step length. The new matrix may be made symmetric by replacing $G_m$ by $\tfrac12 (G_m + G_m^T)$; the symmetric matrix may not be positive definite which will entail further modification. Numerical experimentation (Gill, Murray, and Picken 1972) would appear to suggest that $h = 2^{-t/2}$, where $t$ is the number of bits in the word length of the computer, will keep round-off error at a tolerable level. It may, of course, be not worth updating $G_m$ if $\|x_m - x_{m-1}\| < h$.

Gradients can, however, be employed for other purposes than estimating the elements of $G$. For example, let

$$\delta_m = x_{m+1} - x_m, \quad \gamma_m = g_{m+1} - g_m$$

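and then form the iterative scheme for $H$ as

$$H_{m+1} = H_m - \frac{H_m \gamma_m \gamma_m^T H_m}{(\gamma_m, H_m \gamma_m)} + \frac{\delta_m \delta_m^T}{(\delta_m, \gamma_m)} + \phi_m u_m u_m^T \qquad (4.32)$$

starting from some arbitrarily chosen positive definite $H_1$, though the unit matrix is the most often chosen. In (4.32)

$$u_m = \frac{\delta_m}{(\delta_m, \gamma_m)} - \frac{H_m \gamma_m}{(\gamma_m, H_m \gamma_m)} \qquad (4.33)$$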

and
This is desirable because, if $f$ is a quadratic function of the variables in $R^n$, $G$ is a constant matrix and $G\delta_m = y_m$. Thus (4.34) is a property which might be expected of $G^{-1}$, especially as the iterations get close to their limit. Secondly,

$$(h, H_{m+1}h) = (h, H_m h) - \frac{(h, H_m y_m)^2}{(y_m, H_m y_m)} + \frac{(h, \delta_m)^2}{(\delta_m, y_m)} + \phi_m (h, u_m)^2.$$

If $H_m$ is positive definite, $(h, H_m y_m)^2 \le (y_m, H_m y_m)(h, H_m h)$ with equality only when $h$ is parallel to $y_m$. Therefore $H_{m+1}$ is positive definite if $(\delta_m, y_m) > 0$ and $\phi_m \ge 0$. With perfect line searches $(g_{m+1}, \delta_m) = 0$ so that

$$(\delta_m, y_m) = \alpha_m (g_m, H_m g_m)$$

and the condition on $(\delta_m, y_m)$ is achieved when $H_m$ is positive definite. In fact, $(h, H_{m+1}h)$ can vanish when $\phi_m < 0$. It can therefore be concluded that $H_m$ is positive definite for all $m$ when $H_1$ is positive definite, provided that all line searches are perfect.

Thirdly, it can be proved that, when $f$ is a quadratic function, the iteration reaches the minimum of $f$ after $n$ perfect line searches (Powell 1971a, b) when $G$ is positive definite. Even if $G$ is not positive definite the iteration either diverges to $-\infty$ or arrives at the stationary point (Jones 1973) after $n$ perfect line searches. In both cases it is assumed that any iteration on which the line search is not perfect is such that $f$ is not increased. Powell also shows that the iteration converges to the minimum of $f$ when $f$ is a convex functional and all line searches are perfect.

Fourthly, the quasi-Newton algorithms have the extraordinary property (Dixon 1972, 1973) that, starting from a given $x_1$ and $H_1$, the same sequence of points $x_m$ is generated whatever choice is made for $\phi_m$ when all line searches are perfect. The choice $\phi_m = (y_m, H_m y_m)$ gives

$$H_{m+1} = \left(I - \frac{\delta_m y_m^T}{(\delta_m, y_m)}\right) H_m \left(I - \frac{y_m\delta_m^T}{(\delta_m, y_m)}\right) + \frac{\delta_m\delta_m^T}{(\delta_m, y_m)} \tag{4.35}$$

for (4.32) in this case. A more general class of iteration formulae (Huang 1970) is

$$H_{m+1} = H_m - \frac{H_m y_m y_m^T H_m}{(y_m, H_m y_m)} + \psi_m\frac{\delta_m\delta_m^T}{(\delta_m, y_m)} + \phi_m u_m u_m^T \tag{4.36}$$

where $\psi_m$ is another scalar parameter at the disposal of the user. This updating formula has similar properties to (4.32) and criteria for the best estimation of $\psi_m$ have been examined (Biggs 1971).

The quasi-Newton methods can be implemented without calculating first derivatives by using difference formulae such as

$$g_i(x_m) = \{f(x_m + he_i) - f(x_m)\}/h, \tag{4.37}$$

$$g_i(x_m) = \{f(x_m + he_i) - f(x_m - he_i)\}/2h. \tag{4.38}$$
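As an illustration only, the following sketch combines the update (4.35) with the central-difference gradient (4.38) on a small quadratic. The test function `f_quad`, the step `h = 1e-4`, the starting point and the parabola-fit line search (which is a perfect line search only because the test function is quadratic) are all assumptions made for this example and are not taken from the text.

```python
# Sketch: derivative-free quasi-Newton iteration using (4.35) and (4.38).

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def central_diff_grad(f, x, h=1e-4):
    """Estimate the gradient by (4.38); costs 2n function values."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def update_435(H, delta, y):
    """(I - d y^T/rho) H (I - y d^T/rho) + d d^T/rho with rho = (d, y)."""
    n = len(delta)
    rho = dot(delta, y)
    A = [[(1.0 if i == j else 0.0) - delta[i] * y[j] / rho for j in range(n)]
         for i in range(n)]
    AH = [[sum(A[i][k] * H[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # Multiply by A^T on the right and add the rank-one term.
    return [[sum(AH[i][k] * A[j][k] for k in range(n)) + delta[i] * delta[j] / rho
             for j in range(n)] for i in range(n)]

def parabola_step(f, x, p):
    """Step length from a parabola through f at alpha = -1, 0, 1 (exact for quadratics)."""
    f0 = f(x)
    fp = f([xi + pi for xi, pi in zip(x, p)])
    fm = f([xi - pi for xi, pi in zip(x, p)])
    return -0.5 * (fp - fm) / (fp + fm - 2.0 * f0)

def quasi_newton(f, x, iterations):
    n = len(x)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # H_1 = I
    g = central_diff_grad(f, x)
    for _ in range(iterations):
        p = [-v for v in matvec(H, g)]          # search direction -H_m g_m
        alpha = parabola_step(f, x, p)
        delta = [alpha * pi for pi in p]        # delta_m = x_{m+1} - x_m
        x = [xi + di for xi, di in zip(x, delta)]
        g_new = central_diff_grad(f, x)
        y = [a - b for a, b in zip(g_new, g)]   # y_m = g_{m+1} - g_m
        H = update_435(H, delta, y)
        g = g_new
    return x, H

def f_quad(x):
    # Hypothetical quadratic with Hessian [[4, 1], [1, 3]] and minimum at (1/11, 7/11).
    return 2.0 * x[0] ** 2 + x[0] * x[1] + 1.5 * x[1] ** 2 - x[0] - 2.0 * x[1]

x_min, H_final = quasi_newton(f_quad, [0.0, 0.0], iterations=2)
```

With $n = 2$ the iterate after two perfect line searches should sit at the minimum and the final matrix should be close to the inverse Hessian, which is consistent with the properties listed for (4.32).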

The choice of $h$ is dictated by considerations similar to those already mentioned in connection with $G_m^{-1}$. It will be seen that at least $n$ function values are required to estimate first derivatives and so the time spent on this evaluation may greatly exceed that devoted to the line searches, although the line searches are more productive in the sense that they are directly concerned with reducing the value of the function being minimized.

There is another disadvantage of quasi-Newton algorithms. The direction of search $-H_m g_m$ will entail fast convergence only if $g_m$ is a good approximation to the correct gradient of $f$ at $x_m$. Unfortunately, the correct gradient must approach zero as $m \to \infty$ if the iteration is successful and so there are stern difficulties in computing an appropriate $g_m$. Often (4.38) is employed in place of (4.37) when $g_m$ is small in order to secure greater accuracy. However, that involves effectively doubling the number of function evaluations at each stage of the iteration.

The search for ways around these disadvantages leads one to consider conjugate methods. To describe these let us assume that $f$ is a quadratic function. There is no loss of generality in taking

$$f(x) = \tfrac12 (x, G_0 x) \tag{4.39}$$

since a suitable choice of origin puts a quadratic function in this form. The Hessian matrix $G_0$ has constant entries and is assumed to be positive definite so that $f$ has a true minimum at the origin. Two non-zero directions $p$ and $q$ are said to be conjugate if

$$(p, G_0 q) = 0. \tag{4.40}$$

To generate conjugate directions form

$$p_{m+1} = -g_{m+1} + \beta_m p_m \tag{4.41}$$

where

$$\beta_m = \frac{(G_0 x_{m+1}, G_0 p_m)}{(p_m, G_0 p_m)}. \tag{4.42}$$

The choice of $\alpha_m$ which makes $f(x_m + \alpha_m p_m)$ a minimum is

$$\alpha_m = -\frac{(p_m, G_0 x_m)}{(p_m, G_0 p_m)}. \tag{4.43}$$

Then we can state the following theorem.

THEOREM 4.6b. If $f$ is a positive definite quadratic function, then for arbitrary $x_1$ and $p_1 = -g_1$ the directions (4.41) are mutually conjugate and the minimum is achieved after at most $n$ iterations.

Proof. Since $g(x) = G_0 x$, (4.27) gives $g_{m+1} = g_m + \alpha_m G_0 p_m$ and so

$$(g_{m+1}, p_m) = (G_0 x_m, p_m) + \alpha_m (p_m, G_0 p_m) = 0 \tag{4.44}$$

from (4.43). Also

$$(p_m, G_0 p_{m+1}) = (p_m, \beta_m G_0 p_m) - (G_0 p_m, g_{m+1}) = 0 \tag{4.45}$$

from (4.42). Suppose now

$$(p_k, G_0 p_j) = 0, \qquad (g_k, g_j) = 0 \tag{4.46}$$

for $j = 1, \ldots, k - 1$, which are certainly true for $k = 2$ by (4.44) and (4.45). Then

$$(g_{k+1}, g_j) = (g_k + \alpha_k G_0 p_k, g_j) = \alpha_k (G_0 p_k, \beta_{j-1} p_{j-1} - p_j) = 0$$

on account of (4.41) and (4.46) for $j = 1, \ldots, k - 1$. Further

$$(g_{k+1}, g_k) = (g_{k+1}, \beta_{k-1} p_{k-1} - p_k) = \beta_{k-1}(g_{k+1}, p_{k-1}) = \beta_{k-1}(g_k + \alpha_k G_0 p_k, p_{k-1}) = 0$$

so that the second of (4.46) is verified for $k + 1$ in place of $k$. As far as the first of (4.46) is concerned,

$$(p_{k+1}, G_0 p_j) = (\beta_k p_k - g_{k+1}, G_0 p_j) = -(g_{k+1}, G_0 p_j) = -(g_{k+1}, g_{j+1} - g_j)/\alpha_j = 0$$

for $j = 1, \ldots, k - 1$ so long as $\alpha_j \ne 0$. For $j = k$, the result holds by (4.45), so that $(p_{k+1}, G_0 p_j) = 0$ for $j = 1, \ldots, k$ and induction confirms (4.46) for arbitrary $k$ provided that $\alpha_j \ne 0$. However, $\alpha_m = 0$ implies that $p_m$ is orthogonal to $g_m$, which is impossible since $p_{m-1}$ is a linear combination of $g_1, \ldots, g_{m-1}$, all of which are orthogonal to $g_m$. Thus conjugacy has been established. Furthermore, the mutual orthogonality of $g_1, g_2, \ldots$ forces one of them to be zero after at most $n$ iterations and the theorem is proved.

Different ways of writing (4.42) are available by taking advantage of the properties derived in Theorem 4.6b. For instance,

$$\beta_m (p_m, g_m) = (p_{m+1}, g_m) = (p_{m+1}, g_{m+1}) = -(g_{m+1}, g_{m+1}).$$

Thus, since $(p_m, g_m) = -(g_m, g_m)$ and with $y_m = g_{m+1} - g_m$,

$$\beta_m = \frac{(g_{m+1}, g_{m+1})}{(g_m, g_m)} = \frac{(g_{m+1}, y_m)}{(p_m, y_m)} = \frac{(g_{m+1}, y_m)}{(g_m, g_m)}. \tag{4.47}$$
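The construction (4.41)-(4.43) and the equivalent forms of $\beta_m$ can be checked numerically. The sketch below is a hedged illustration: the matrix `G0` and the starting point are hypothetical choices, and the second alternative is written with the denominator $(p_m, y_m)$, $y_m = g_{m+1} - g_m$, which is the usual Hestenes-Stiefel form.

```python
# Sketch: conjugate directions for f(x) = (x, G0 x)/2 using (4.41)-(4.43).

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

# Assumed symmetric positive definite Hessian (leading minors 4, 11, 18).
G0 = [[4.0, 1.0, 0.0],
      [1.0, 3.0, 1.0],
      [0.0, 1.0, 2.0]]

def conjugate_directions(G, x):
    n = len(x)
    g = matvec(G, x)              # g(x) = G0 x
    p = [-gi for gi in g]         # p_1 = -g_1
    directions, betas = [], []
    for m in range(n):
        Gp = matvec(G, p)
        alpha = -dot(p, matvec(G, x)) / dot(p, Gp)         # (4.43)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = matvec(G, x)
        directions.append(p)
        if m == n - 1:
            break
        y = [a - b for a, b in zip(g_new, g)]
        beta = dot(g_new, Gp) / dot(p, Gp)                 # (4.42)
        betas.append((beta,
                      dot(g_new, g_new) / dot(g, g),       # Fletcher-Reeves form
                      dot(g_new, y) / dot(p, y),           # Hestenes-Stiefel form
                      dot(g_new, y) / dot(g, g)))          # Polak-Ribiere form
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]  # (4.41)
        g = g_new
    return x, directions, betas

x_end, dirs, betas = conjugate_directions(G0, [1.0, 2.0, 3.0])
```

For this quadratic the three alternatives agree with (4.42) to rounding error, the directions are mutually conjugate, and `x_end` is at the origin after $n = 3$ steps, as Theorem 4.6b asserts.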

When $f$ is no longer a quadratic function, one can imagine attempting to follow the same strategy since near a minimum the Hessian will be approximately constant and Theorem 4.6b should apply. However, it is now not possible to adopt (4.42) and an explicit formula for $\alpha_m$ cannot be stated. Instead, one of the three forms in (4.47) is employed (naturally one cannot declare that they are all equal now) and a family of conjugate gradient methods is generated. The algorithms are named Fletcher-Reeves, Hestenes-Stiefel and Polak-Ribiere (Hestenes and Stiefel 1952; Fletcher and Reeves 1964; Polak and Ribiere 1969) corresponding to the three alternatives in (4.47) in the order given. When line searches are not perfect it may not be possible to meet the conditions of Theorem 4.6b. In that case it is usual to reset $\beta_m$ to zero and start the conjugate gradient procedure again with $p_{m+1}$.

While conjugate gradient methods are less robust, as well as needing more iterations and function evaluations than quasi-Newton methods, they do have the advantage of storing only vectors and avoiding matrix manipulation. It is not possible to state a preference between Fletcher-Reeves and Polak-Ribiere because each performs better than the other in suitable circumstances, though Polak-Ribiere seems generally to be more effective when the dimension of $g_m$ is large. On the other hand, Fletcher-Reeves is globally convergent even with inexact line searches (Al-Baali 1985) whereas Polak-Ribiere does not enjoy this property. Efforts have been made to combine the numerical performance of Polak-Ribiere with the global convergence of Fletcher-Reeves by specifying conditions on $\beta_m$ which ensure global convergence (Hu and Storey 1991a).

In practice, exact line searches are rarely possible. With inexact line searches it may be more efficient to use directions which are somewhat different from conjugate. Liu and Storey (1991) and Hu and Storey (1991b) suggest combining conjugate gradients and Newton's method by taking

$$p_m = \alpha_m g_m + \beta_m p_{m-1}$$

where

$$\begin{pmatrix}\alpha_m \\ \beta_m\end{pmatrix} = -\begin{pmatrix} g_m^T G_m g_m & g_m^T G_m p_{m-1} \\ p_{m-1}^T G_m g_m & p_{m-1}^T G_m p_{m-1}\end{pmatrix}^{-1}\begin{pmatrix} g_m^T g_m \\ g_m^T p_{m-1}\end{pmatrix}.$$

More generally, one can seek to devise algorithms in which the directions are conjugate to a matrix $A$ so that $(p, Aq) = 0$, often called $A$-conjugacy. If $A$ is a good approximation to the Hessian this may be expected to be effective, especially if the rule for determining the search directions avoids the estimation of too many derivatives. In passing, we note that in the quasi-Newton methods

$$(\delta_{m+1}, H_{m+1}^{-1}\delta_m) = -(\alpha_{m+1}H_{m+1}g_{m+1}, H_{m+1}^{-1}\delta_m) = 0$$

so that consecutive search directions are conjugate with respect to $H_{m+1}^{-1}$, which may be regarded as an estimate of the Hessian. The possibilities of $A$-conjugate methods have yet to be explored thoroughly and it remains to be seen whether they will turn out to be superior to other techniques in practice (Greenstadt 1972).

An alternative to the quasi-Newton and conjugate gradient methods is that of Levenberg-Marquardt (Levenberg 1944; Marquardt 1962). This is concerned with solving

$$(G_k + \nu I)\delta_k = -g_k \tag{4.48}$$

in order to determine $\delta_k$. An equivalent technique is to find $\delta_k$ which minimizes

$$\tfrac12\delta_k^T G_k \delta_k + g_k^T\delta_k$$

subject to $\delta_k^T\delta_k \le h_k^2$, where $h_k$ is chosen as large as is feasible without causing the function being minimized to deviate too far from $f(x_k + \delta_k)$. This is a problem in constrained minimization (see next section) for which it can be shown (More and Sorensen 1982) that $\delta_k$ is a global solution of the minimization if, and only if, (4.48) holds with $\nu \ge 0$, $\nu(h_k^2 - \delta_k^T\delta_k) = 0$ and $G_k + \nu I$ is positive semi-definite; $\delta_k$ is unique if $G_k + \nu I$ is positive definite. It is this global property which has prompted some investigators to deploy the Levenberg-Marquardt method.
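A minimal sketch of the link between the damped system and the restricted-step problem follows, assuming the system takes the usual form $(G_k + \nu I)\delta_k = -g_k$. The $2 \times 2$ matrix `G`, the gradient `g`, the radius `h` and the bisection on $\nu$ are illustrative assumptions, not a prescription from the text; $G$ is taken positive definite so that the step length decreases monotonically as $\nu$ grows.

```python
# Sketch: choose nu so that the damped step has length h (trust radius).
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def solve2(M, r):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - r[0] * M[1][0]) / det]

def damped_step(G, g, nu):
    """Solve (G + nu I) delta = -g."""
    M = [[G[0][0] + nu, G[0][1]], [G[1][0], G[1][1] + nu]]
    return solve2(M, [-g[0], -g[1]])

def norm(v):
    return math.sqrt(dot(v, v))

def step_for_radius(G, g, h, nu_hi=1e3):
    """Return (nu, delta) with nu = 0 if the undamped step already fits."""
    delta = damped_step(G, g, 0.0)
    if norm(delta) <= h:
        return 0.0, delta
    lo, hi = 0.0, nu_hi
    for _ in range(100):                 # bisection on the step length
        mid = 0.5 * (lo + hi)
        if norm(damped_step(G, g, mid)) > h:
            lo = mid
        else:
            hi = mid
    return hi, damped_step(G, g, hi)

def model(G, g, d):
    """Quadratic model (d, G d)/2 + (g, d)."""
    Gd = [dot(row, d) for row in G]
    return 0.5 * dot(d, Gd) + dot(g, d)

G = [[3.0, 1.0], [1.0, 2.0]]   # hypothetical positive definite Hessian
g = [1.0, -2.0]                # hypothetical gradient
h = 0.5                        # hypothetical radius
nu, delta = step_for_radius(G, g, h)
```

Here the undamped step has length about 1.6, so the constraint is active, $\nu > 0$, and the returned step lies on the boundary $\delta^T\delta = h^2$.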


Exercises 23. The following is a list of standard functions for testing optimization procedures, a point in R" being denoted by (X., X2'· .. , XII): (i) l00(xi - X2)2 + (1 - X 1)2; (ii) D~ 1 {a exp( -a j x l ) - b exp(-ajX2)-exp( -aj)+c exp( -IOa j )}2 where aj = jjl0 and (a) a = b = C = 1, (b) a = 1, b = C = 5, (c) a = 1, b = X3' C = 5, (d) a = X3'

b = X4 ,

C

= 5;

(iii) (Xl + 10x2)2 + 5(X3 - X4)2 + (x 2 - 2X3)4 + 10(XI - X4)4; (iv) l00(xf - X2)2 + (1 - X I )2 + 90(x 4 - X~)2; (v) l00(x2 - Xf)8 + (1 - X1)8; 2 (vi) 1 [exp{ -(b j - X3)X j x 1 } _aj]2 where a j=jjl00 and bj = 25 + {50 In(ljaj)}2/3; (vii) (1 - X t )2 + (1 - X10)2 + 1 (XJ - Xj + 1)2.

212

21=

4.7 The effect of constraints

The generalization of quasi-Newton methods to deal with minimization when constraints are present depends upon the form that the constraints take. Many of the methods are based on what happens when the constraints are linear equations, so this type will be discussed first. Suppose that the minimization is required to satisfy $p$ equations of the form

$$(n_j, x) = d_j \qquad (j = 1, \ldots, p) \tag{4.49}$$

where the $n_j$ are constant vectors and the $d_j$ are constant scalars. The vectors $n_j$ should be linearly independent in order that the equations are sensible. The matrix $N$, whose columns are $n_1, n_2, \ldots, n_p$, is of order $n \times p$ and will be taken to be of rank $p$. It will clarify ideas to start with the quadratic function (4.39); for this function the following assertion can be made.

THEOREM 4.7. If $G_0$ is positive definite and $x$ is any vector satisfying (4.49), the minimum of $f$ subject to (4.49) occurs at $x_0$ where

$$x_0 = x - \{G_0^{-1} - G_0^{-1}N(N^T G_0^{-1} N)^{-1}N^T G_0^{-1}\}g(x).$$

Proof. Introducing the Lagrange multipliers $\lambda_1, \ldots, \lambda_p$ we have to find the stationary point of $f(x) - \sum_{j=1}^{p}\lambda_j\{(n_j, x) - d_j\}$. Hence

$$G_0 x_0 - \sum_{j=1}^{p}\lambda_j n_j = 0. \tag{4.50}$$

Also (4.49) requires

$$N^T x_0 = d \tag{4.51}$$

where $d$ is the $p$ vector with components $d_1, \ldots, d_p$. Now (4.50) can be expressed as

$$G_0(x_0 - x) + g(x) - \sum_{j=1}^{p}\lambda_j n_j = 0$$

whence

$$x_0 = x - G_0^{-1}g(x) + G_0^{-1}N\lambda \tag{4.52}$$

because $G_0$ is non-singular, $\lambda$ denoting the $p$ vector with components $\lambda_1, \ldots, \lambda_p$. Since $N^T x = d$, (4.52) and (4.51) imply that

$$N^T G_0^{-1} N\lambda = N^T G_0^{-1} g(x).$$

But $N^T G_0^{-1} N$ is a $p \times p$ matrix which is non-singular because $G_0$ is positive definite and $N$ is of rank $p$. Therefore

$$\lambda = (N^T G_0^{-1} N)^{-1}N^T G_0^{-1} g(x). \tag{4.53}$$

On substituting in (4.52) we obtain the statement of the theorem and the proof is complete.

To extend the ideas occurring here to a general function regard $H_m$ in (4.32) as an approximation to $G_0^{-1}$. Then, instead of conducting a line search in the direction $-H_m g_m$, choose

$$p_m = -\{H_m - H_m N(N^T H_m N)^{-1}N^T H_m\}g_m \tag{4.54}$$

provided that $x_m$ complies with (4.49). When the constraints (4.49) are replaced by the linear inequalities

$$(n_j, x) - d_j \ge 0 \tag{4.55}$$

the suggested procedure is to follow an active set strategy (Fletcher 1971). A constraint is said to be active at $x$ when equality holds for it in (4.55); the active constraints at $x$ form the active set at $x$. An $x$ which satisfies all of (4.55) is often said to be a feasible point, and it will be assumed that all the $x_m$ generated by the iterative procedure are feasible points. Suppose that $x_m$ satisfies all the equalities in the active set; then use (4.54) to find $x_{m+1}$, restricting $N$ to the members of the active set. If this $x_{m+1}$ satisfies the inequalities outside the active set we accept it. If not, reduce $\alpha_m$ in (4.27) until all the inequalities are complied with and one of the constraints not in the active set is an equality. This determines $x_{m+1}$, and the new constraint which has become an equality is added to the active set before $x_{m+2}$ is calculated.

The rule for adding a constraint to the active set is straightforward, but the criterion for dropping a constraint from the active set when $x_{m+1}$ is accepted without modification is more obscure. Suppose that $x_0$ is where the constrained minimum of $f$ occurs and that, for the moment, $N$ refers to the active set of $x_0$. Let $H$ be a positive definite matrix and consider the point $x_0 - \varepsilon\{H - HN(N^T H N)^{-1}N^T H\}y$ where $y$ is an arbitrary vector. If $|\varepsilon|$ is sufficiently small the inequalities in the non-active set will not be violated while the equalities in the active set will be automatically satisfied. Therefore we have a feasible point and, if $\partial f/\partial\varepsilon \ne 0$ at $\varepsilon = 0$, the sign of $\varepsilon$ can be allocated so that $f$ is smaller than at $x_0$. This contradicts the definition of $x_0$ and so

$$(g, \{H - HN(N^T H N)^{-1}N^T H\}y) = 0,$$

$g$ being evaluated at $x_0$. Since $y$ is arbitrary, the consequence is that

$$Hg - HN(N^T H N)^{-1}N^T Hg = 0$$

or, since $H$ is non-singular,

$$g = N\lambda, \qquad \lambda = (N^T H N)^{-1}N^T Hg. \tag{4.56}$$

The notation for $\lambda$ has been chosen by analogy with (4.53). If $v_j^T$ is the $j$th row of $(N^T H N)^{-1}N^T H$, then obviously $v_j^T n_i = \delta_{ij}$. Thus $x_0 + \varepsilon v_j$ is a feasible point provided that $\varepsilon$ is small enough and non-negative, though now $(n_j, x) = d_j + \varepsilon$ so that this constraint is no longer active. Again $x_0$ will not be the minimum if $(v_j, g) < 0$. Hence

$$\lambda_j \ge 0 \qquad (j = 1, \ldots, p) \tag{4.57}$$

are further necessary conditions for $x_0$ to be a minimum. The stipulations (4.56) and (4.57) are often known as Kuhn-Tucker necessary conditions for optimality. Notice that $H$ can be replaced by $I$ in (4.56) with some resulting simplification.

On this basis, calculate $\lambda = (N^T H_m N)^{-1}N^T H_m g_m$ and delete any constraint from the active set for which $\lambda_j < 0$. If it is desired not to remove more than one constraint at a time, choose the one with the most negative $\lambda_j$. It may happen that this rule will cause zigzagging in which the active set oscillates between two possibilities (Zoutendijk 1970) but this ought not to occur when $f$ has continuous second derivatives and $H_m$ is producing a reasonable approximation to the inverse Hessian. If the constraint $(n_q, x) - d_q \ge 0$ is added to the active set the extra column $n_q$ is added to $N$ and then the search in the direction (4.54) is undertaken. No change is made to $H_m$. On the other hand, when a constraint is deleted, $N$ loses a column before the line search is carried out. An alternative procedure has been proposed (Goldfarb 1969) which may be advantageous when $n$ is large and the active set always contains about $n$ constraints.

The generalization of these techniques to non-linear constraints is not easy. One idea is to linearize the constraints and use the linearized versions with the above procedure. A rather more recondite process (Fletcher 1972b) evades previous calculations in estimating the Lagrange multipliers. Suppose the constraint has the form $h(x) \ge 0$. When $x$ is replaced by $x + \delta$ the constraint is linearized to

$$h(x) + (\delta, \nabla h(x)) \ge 0$$

and all constraints are dealt with likewise. Now seek the $\delta$ which minimizes

$$(g, \delta) + \tfrac12 a(\delta, \delta)$$

subject to the linearized constraints, $a$ being a scalar parameter at our disposal. The resulting $\delta$ will depend on $x$ and so will the associated Lagrange multipliers. Any negative multiplier is replaced by zero in accordance with earlier strategy. Let $\lambda(x)$ denote the resulting vector of Lagrange multipliers. The unconstrained minimum of

$$f(x) - (\lambda(x), h(x))$$

where $h$ is now the vector of constraints, turns out to be the required minimum if $a$ is chosen sufficiently large. Thus the techniques of unconstrained minimization can be applied so long as $\lambda(x)$ can be determined (see also §9.20). Extensive information on constrained minimization is given by Fletcher

(1987). Exercises 24. Minimize

Xl -

2x 2 + 3x 3 in R 3 subject to x j ~ 0 (j 2x I 2x I

X2 - 3x 3

-

= 1,2,3) and

= -2,

+ 3x 2 + 4X3 = 1.

25. Find the minimum of X2 - 3x 3 + 2xs in R 6 subject to x j ~ 0 (j

= 1, ... , 6),

xI+3x2-X3+2xs=7, 2x 2 - 4X3 -

X4

= -12,

4X2 - 3X3 - 8xs - X6 = -10. 26. Minimize 3x I - X2 in R 2 subject to Xl ~ 0, 0 ~ X2 ~ 4, Xl + 1X2 ~ 1, Xl + 3X2 ~ 3. 27. If the technique of the last paragraph is applied to f(x) = X subject to X ~ 1, show that (x ~ 2) l(x) = 0

=2-x when a tion is

= 1 and

(x ~ 2)

deduce that the function to be tackled by unconstrained minimizax

(x

~

x - (2 - x)(x - 1)

2)

(x

~

2).

Does this suggest any difficulty which might arise with the method?
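As a numerical companion to Theorem 4.7, the sketch below evaluates the multiplier and the constrained minimizer for a quadratic with a single linear constraint, so that $N$ has one column and $N^T G_0^{-1} N$ is a scalar. The diagonal $G_0$, the vector `n_vec`, the value `d` and the starting points are assumptions chosen only to keep the arithmetic transparent.

```python
# Sketch: constrained minimum of f(x) = (x, G0 x)/2 subject to (n, x) = d.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

G_diag = [2.0, 1.0, 4.0]     # assumed diagonal, positive definite G0
n_vec = [1.0, 1.0, 1.0]      # single constraint vector (p = 1)
d = 3.0

def f(x):
    return 0.5 * sum(gi * xi * xi for gi, xi in zip(G_diag, x))

def grad(x):
    return [gi * xi for gi, xi in zip(G_diag, x)]   # g(x) = G0 x

def apply_G_inverse(v):
    return [vi / gi for vi, gi in zip(v, G_diag)]

def constrained_minimum(x):
    """x must satisfy (n, x) = d; returns (x0, lambda)."""
    Ginv_g = apply_G_inverse(grad(x))
    Ginv_n = apply_G_inverse(n_vec)
    lam = dot(n_vec, Ginv_g) / dot(n_vec, Ginv_n)    # multiplier, one constraint
    x0 = [xi - a + lam * b for xi, a, b in zip(x, Ginv_g, Ginv_n)]
    return x0, lam

x0, lam = constrained_minimum([3.0, 0.0, 0.0])
x0_other, lam_other = constrained_minimum([0.0, 1.0, 2.0])
```

Both feasible starting points lead to the same $x_0 = (6/7, 12/7, 3/7)$ with $\lambda = 12/7$, and the gradient at $x_0$ is $\lambda n$, in line with the stationarity condition used in the proof.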

VARIATIONAL PRINCIPLES

4.8 Variational approach

In the preceding two sections the question of finding a minimum for a function on $R^n$ has been examined. We now wish to consider a similar problem for a functional which may be defined on a space other than $R^n$. It is convenient to commence by reviewing what is involved in finding the constrained minimum of a function on $R^n$. Let a minimum of $f(x)$ be required subject to the constraints $h_1(x) = 0, \ldots, h_p(x) = 0$. Introduce the Lagrange multipliers $\lambda_1, \ldots, \lambda_p$ and define

$$X(x, \lambda) = f(x) - \sum_{j=1}^{p}\lambda_j h_j(x)$$

where $\lambda$ has components $\lambda_1, \ldots, \lambda_p$. Assume that $f$ and each $-h_j$ are convex (§4.4), i.e. each $h_j$ is concave. Then, by Theorem 4.4a,

$$f(x) - f(y) \ge \sum_{k=1}^{n}\frac{\partial f(y)}{\partial y_k}(x_k - y_k), \tag{4.58}$$

$$h_j(x) - h_j(y) \le \sum_{k=1}^{n}\frac{\partial h_j(y)}{\partial y_k}(x_k - y_k) \qquad (j = 1, \ldots, p). \tag{4.59}$$

Now suppose that $y$ corresponds to a stationary point of $f$ subject to the constraints. Let the corresponding values of the Lagrange multipliers be $\mu_1, \mu_2, \ldots, \mu_p$. Assume that $\mu_j \ge 0$ $(j = 1, \ldots, p)$ or, in a briefer notation, $\mu \ge 0$. Then

$$\frac{\partial f(y)}{\partial y_k} = \sum_{j=1}^{p}\mu_j\frac{\partial h_j(y)}{\partial y_k}, \qquad h_j(y) = 0 \qquad (k = 1, \ldots, n;\ j = 1, \ldots, p). \tag{4.60}$$

From (4.58) and (4.60)

$$f(x) - f(y) \ge \sum_{k=1}^{n}\sum_{j=1}^{p}\mu_j\frac{\partial h_j(y)}{\partial y_k}(x_k - y_k) \ge \sum_{j=1}^{p}\mu_j h_j(x) \tag{4.61}$$

by (4.59) and $\mu \ge 0$. If $h_j(x) = 0$ $(j = 1, \ldots, p)$, $f(x) \ge f(y)$. Thus, of all points on $h_j = 0$ $(j = 1, \ldots, p)$, $y$ gives a minimum of $f$ or a maximum of $-f$. For later purposes it is convenient to note that $-f(x)$ can be expressed as $(\nabla_\lambda X(x, \lambda), \lambda) - X(x, \lambda)$.

Then

{of(Y) Ohj(Y)} L Yk - - - L JLj-- k=l 0Yk j=l 0Yk n

p

X(y,JL)

= -f(y)

~ -f(x) -

of(x)

L - - (Yk n

k= 1 OXk

~ -X(x, A) -

p

L

j=l

Ajhj(x) -

±{O!(X) _ f ~ ± Xk{O!(X) - f +

k=l

k=l

Xk) n

oh (x)

p

L L

k=l j=l

OX k

j=l

Aj OhiX)} Xk OXk

OXk

j=l

Aj OhiX)} oXk

-

Aj t:» - (Yk - Xk) OX k

X(x, A)

= (x, VxX) - X

by (4.58) (4.59), and (4.62). Thus, among points satisfying (4.62), Y supplies a minimum of (x, VxX) - X. Since that minimum is -f(y),

(V#!X(y, Jl), Jl)

= (y,

VyX(y, Jl»

(4.63)

when Y satisfies (4.60). In other words, the maximum attained by
(4.62).

If the convexity of $f$ and $-h_j$ is replaced by strict convexity it can be shown that $y$ and $\mu$ are unique. Suppose that $z$ and $\nu$ form another solution of (4.60) with $z \ne y$. Replace $x$ in (4.58) and (4.59) by $z$; the equality signs disappear on account of the strict convexity. The argument leading to (4.61) may be repeated with the conclusion $f(z) > f(y)$. Interchanging the roles of $z$ and $y$ we have $f(y) > f(z)$. The inconsistency makes the starting hypothesis false. Therefore $z = y$ and then $\nu = \mu$ follows from (4.60).

That the maximum of $(\nabla_\lambda X, \lambda) - X$ coincides with the minimum of $(x, \nabla_x X) - X$ can be established under weaker conditions than used above, for example allowing some $h_j(y)$ to be positive so long as the corresponding $\mu_j$ is zero. Rather than go down this avenue it is more advantageous to discuss a more general setting.

The generalization to be considered is applicable to a wide class of variational problems (Noble and Sewell 1972; Sewell 1987). Firstly, the definition of $X$ used hitherto is dropped; instead, $X(x, \lambda)$ is taken to be a real functional of the $n$-vector $x$ and the $m$-vector $\lambda$. Secondly, equations more general than (4.60) are adopted, namely

$$Ty = \nabla_\mu X(y, \mu), \tag{4.64}$$

$$T^A\mu = \nabla_y X(y, \mu) \tag{4.65}$$

where $T$ is some specified operator from the space of $x$ to the space of $\lambda$ and $T^A$ is its adjoint. These do include (4.60), as can be seen by selecting $T$ as the zero operator. In order to obtain results for the functional similar to those for the function, some restriction has to be imposed. A functional $X(x, \lambda)$ will be referred to as saddle-shaped when

$$X(x, \lambda) - X(y, \mu) - (\nabla_y X(y, \mu), x - y) - \langle\nabla_\lambda X(x, \lambda), \lambda - \mu\rangle \ge 0 \tag{4.66}$$

for all permissible $x$, $\lambda$, $y$, $\mu$. The inner products $(\cdot,\cdot)$ and $\langle\cdot,\cdot\rangle$ are the usual ones for vectors of $n$ and $m$ components respectively. If equality holds in (4.66) only when $x = y$ and $\lambda = \mu$, $X$ will be said to be strictly saddle-shaped.

Assume that $X$ is saddle-shaped. Then, from (4.66),

$$\begin{aligned} X(x, \lambda) - X(y, \mu) - \langle\nabla_\lambda X(x, \lambda), \lambda\rangle + \langle\nabla_\mu X(y, \mu), \mu\rangle &\ge (\nabla_y X(y, \mu), x - y) + \langle\nabla_\mu X(y, \mu) - \nabla_\lambda X(x, \lambda), \mu\rangle \\ &= (T^A\mu, x - y) + \langle Ty - \nabla_\lambda X(x, \lambda), \mu\rangle \end{aligned} \tag{4.67}$$

when $y$ satisfies (4.64) and (4.65). Since $(T^A\mu, x) = \langle\mu, Tx\rangle$ (§3.2) the right-hand side can be expressed as $\langle Tx - \nabla_\lambda X(x, \lambda), \mu\rangle$. This is zero if $Tx = \nabla_\lambda X(x, \lambda)$, i.e. $(x, \lambda)$ is a solution of (4.64). An immediate conclusion is that the value of $\langle\nabla_\lambda X(x, \lambda), \lambda\rangle - X(x, \lambda)$ when $(x, \lambda)$ satisfies (4.64) does not exceed its value when $(x, \lambda)$ satisfies (4.65) in addition. To put it another way, $\langle\nabla_\mu X(y, \mu), \mu\rangle - X(y, \mu)$ when $(y, \mu)$ satisfies (4.64) and (4.65) is an upper bound for values of $\langle\nabla_\lambda X(x, \lambda), \lambda\rangle - X(x, \lambda)$ when (4.64) is satisfied by $(x, \lambda)$.

A different bound can be inferred from

$$\begin{aligned} X(y, \mu) - X(x, \lambda) + (\nabla_x X(x, \lambda), x) - (\nabla_y X(y, \mu), y) &\ge \langle\nabla_\mu X(y, \mu), \mu - \lambda\rangle + (\nabla_x X(x, \lambda) - \nabla_y X(y, \mu), y) \\ &= (\nabla_x X(x, \lambda) - T^A\lambda, y) \end{aligned} \tag{4.68}$$

when $y$ satisfies (4.64) and (4.65). Accordingly, $(\nabla_x X(x, \lambda), x) - X(x, \lambda)$ when $\nabla_x X(x, \lambda) = T^A\lambda$ is never below the value attained when $(x, \lambda)$ complies with (4.64) as well.

Next, observe that (4.64) and (4.65) imply

$$\langle\nabla_\mu X(y, \mu), \mu\rangle = \langle Ty, \mu\rangle = (y, T^A\mu) = (y, \nabla_y X(y, \mu)).$$

Thus, the two expressions considered above have precisely the same value for $(y, \mu)$. The bounds derived enable bracketing of this value. Indeed, the value of $\langle\nabla_\mu X(y, \mu), \mu\rangle - X(y, \mu)$ or, equivalently, $(\nabla_y X(y, \mu), y) - X(y, \mu)$ when $y$ and $\mu$ satisfy (4.64) and (4.65) is never less than $\langle\nabla_\lambda X(x, \lambda), \lambda\rangle - X(x, \lambda)$ subject to $\nabla_\lambda X(x, \lambda) = Tx$ nor more than $(\nabla_x X(x, \lambda), x) - X(x, \lambda)$ subject to $\nabla_x X(x, \lambda) = T^A\lambda$.
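The bracketing can be seen in a one-dimensional sketch. Everything specific here is an assumption made for illustration: the functional is taken as $X(x, \lambda) = F(x) - \tfrac12\lambda^2$ with the convex choice $F(x) = \tfrac12 x^2 + x$, and $T$ is multiplication by 2, so that $T^A = T$ and the governing equations reduce to $Tx = -\lambda$, $T\lambda = F'(x)$.

```python
# Sketch: complementary bounds for a scalar saddle-shaped functional.

T = 2.0   # hypothetical 1x1 operator; its adjoint is T itself

def F(x):
    return 0.5 * x * x + x

def dF(x):
    return x + 1.0

def X(x, lam):
    return F(x) - 0.5 * lam * lam

def lower(x):
    """<grad_lam X, lam> - X with lam fixed by T x = grad_lam X = -lam."""
    lam = -T * x
    return (-lam) * lam - X(x, lam)

def upper(x):
    """(grad_x X, x) - X with lam fixed by grad_x X = T^A lam."""
    lam = dF(x) / T
    return dF(x) * x - X(x, lam)

x_exact = -1.0 / (T * T + 1.0)   # solves T^A T x + F'(x) = 0
common = lower(x_exact)          # value shared by both functionals at the solution

trials = [-1.0, -0.5, 0.0, 0.3, 1.0]
bounds = [(lower(t), upper(t)) for t in trials]
```

For every trial value the first entry of each pair stays at or below `common` and the second stays at or above it, so any trial function brackets the stationary value without the equations being solved.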


Sometimes it is possible to solve a subsidiary condition for one quantity in terms of the other. For example, if $\lambda$ could be found in terms of $x$ from $\nabla_\lambda X = Tx$, substitution for $\lambda$ in

$$X(z, \nu) - X(y, \mu) - (T^A\mu, z - y) - \langle Tz, \nu - \mu\rangle > 0,$$

$$X(z, \nu) - X(y, \mu) + \langle\mu, Ty\rangle - \langle Tz, \nu\rangle > 0.$$

Interchanging $(z, \nu)$ and $(y, \mu)$ merely reverses the sign of the left-hand side. This contradiction renders the initial assumption erroneous. Thus $z = y$ and $\nu = \mu$ must hold and uniqueness is proved.

The solution of eqns (4.64) and (4.65) has been converted, by the above analysis, to two variational problems which are connected with each other. One problem involves maximizing some functional, the other requires minimization of a related functional, and the maximum and minimum values of the two functionals respectively are the same. Two such variational principles are said to be complementary or, sometimes, dual. Complementary variational principles are of considerable practical importance when the common stationary value is a quantity of physical interest. Trial functions can be substituted in the complementary functionals; they yield upper and lower bounds at once on the physical quantity sought. Usually, the insertion of trial functions will be far simpler than attempting to solve the governing equations (4.64) and (4.65) and then computing the quantity. Even if the common stationary value is not of direct engineering significance, the separation of the complementary bounds can be a guide to how closely the trial functions approximate solutions of the governing equations.

To indicate how the last sentence can be justified, take $X$ to be given by

$$X(x, \lambda) = F(x) - \tfrac12\langle\lambda, \lambda\rangle. \tag{4.69}$$

Then $\nabla_x X = \nabla F$, $\nabla_\lambda X = -\lambda$ while (4.64) and (4.65) reduce to

$$Tx = -\lambda, \qquad T^A\lambda = \nabla F \tag{4.70}$$

which may be combined into the single equation

$$T^A Tx + \nabla F = 0 \tag{4.71}$$

for $x$. Conversely, any equation of the form

$$Px + \nabla F = 0 \tag{4.72}$$

where $P$ is a positive bounded operator can be expressed as (4.70) since any positive bounded operator may be written as $T^A T$ for some suitable $T$ (§3.6). Even if $P$ is not bounded the decomposition $T^A T$ may still be possible. The specialization (4.69), therefore, still permits the treatment of a large number of situations.

If $F$ is convex $X$ is saddle-shaped. If $F$ is strictly convex $X$ is strictly saddle-shaped and then the general theory demonstrates that the solution of (4.71) (and thereby of (4.72)) is unique. To find the solution of (4.71) we can maximize $-\langle\lambda, \lambda\rangle - X$ subject to $Tx = -\lambda$, i.e. maximize $-\tfrac12(T^A Tx, x) - F$ without any restrictions on $x$. It will be observed that this is immediately applicable to (4.72) since only $T^A T$ or $P$ occurs and the individual $T$ is not needed. The corresponding upper bound is provided by minimizing $(\nabla F, x) - F + \tfrac12\langle\lambda, \lambda\rangle$ subject to $T^A\lambda = \nabla F$. In this case to put the functional in terms of either $x$ or $\lambda$ we need to know the inverse of $T^A$ or $\nabla F$. Suppose the inverse of $T^A$ is known to exist (a similar procedure operates if we can find $x$ given $\lambda$ from $T^A\lambda = \nabla F$). Then the functional

$$(\nabla F, x) - F + \tfrac12((T^A T)^{-1}\nabla F, \nabla F)$$

is minimized over all $x$. Again only the combination $T^A T$ occurs. Thus, in both cases, when given (4.72) the actual forms of $T^A$ and $T$ are not necessary; all that requires to be known is that it is theoretically possible to decompose $P$ as $T^A T$.

If $x_0$ is the solution aimed for

$$\tfrac12(T^A Tx, x) + F - \tfrac12(T^A Tx_0, x_0) - F_0 = \tfrac12((T^A T + F'')(x - x_0), x - x_0)$$

by the mean-value theorem, $F_0$ denoting $F(x_0)$ and $F''$ being evaluated at $x_0 + \theta(x - x_0)$ for some $\theta$ in $(0, 1)$. If $T^A T + F''$ is a positive operator bounded below by the positive number $m$ the right-hand side is certainly not less than $\tfrac12 m\|x - x_0\|^2$. Hence, by taking account of the upper bound of the terms involving $x_0$ on the left-hand side, we obtain

$$\|x - x_0\|^2 \le \frac{1}{m}\{(T^A Tx, x) + 2(\nabla F, x) + ((T^A T)^{-1}\nabla F, \nabla F)\} \tag{4.73}$$

which provides a bound on the deviation of the trial function $x$ from the exact solution $x_0$ as measured by the norm. It has the disadvantage of requiring the inverse of $T^A T$ and this may not always be known.

Returning to our original equations (4.64) and (4.65) we now consider the possibility of replacing one of them by inequalities. Specifically, let us examine the system

$$Ty = \nabla_\mu X(y, \mu), \tag{4.74}$$

$$T^A\mu \le \nabla_y X(y, \mu), \qquad y \ge 0, \tag{4.75}$$

$$(y, T^A\mu - \nabla_y X(y, \mu)) = 0 \tag{4.76}$$

where, as earlier, the inequality on a vector is a shorthand signifying that each component satisfies the inequality. Under (4.74)-(4.76) the right-hand side of (4.67) becomes

$$(\nabla_y X(y, \mu) - T^A\mu, x)$$

when $Tx = \nabla_\lambda X(x, \lambda)$. This quantity is non-negative when $x \ge 0$ on account of (4.75). Hence, the value of $\langle\nabla_\lambda X(x, \lambda), \lambda\rangle - X(x, \lambda)$ subject to $Tx = \nabla_\lambda X(x, \lambda)$ and $x \ge 0$ never exceeds $\langle\nabla_\mu X(y, \mu), \mu\rangle - X(y, \mu)$. Similarly, the right-hand side of (4.68) becomes $(\nabla_x X(x, \lambda) - T^A\lambda, y)$, which is non-negative if $\nabla_x X(x, \lambda) \ge T^A\lambda$. In other words, $(\nabla_y X(y, \mu), y) - X(y, \mu)$ is never above $(\nabla_x X(x, \lambda), x) - X(x, \lambda)$ subject to $\nabla_x X(x, \lambda) \ge T^A\lambda$. Once again complementary variational principles have been obtained.

Analogous arguments show that the system

$$T^A\lambda = \nabla_x X, \qquad Tx \ge \nabla_\lambda X, \qquad \lambda \ge 0, \tag{4.77}$$

$$\langle\lambda, Tx - \nabla_\lambda X\rangle = 0 \tag{4.78}$$

possess complementary principles. Namely, maximize
with the further constraints

O,}

Tx > VAX,

X

~

TAl ~ VxX,

A.

~0

(4.79)

(4.80)

4.9 Examples

This section will be concerned with discussing one or two examples of the preceding theory.

4.9.1 Network analysis

It will be sufficient to study the network shown in Fig. 4.1. Each terminal is numbered and each path connecting terminals is assigned a direction, shown by the arrows, and numbered, the number appearing beside the arrow. Such a configuration is often known as a directed graph.

Fig. 4.1. A typical network.

At each terminal $k$ we assume a voltage $V_k$ and a current $I_k$ (which may be zero) supplied from external sources. With the $j$th path is associated a current $i_j$ in the direction of the arrow and a voltage drop $e_j$. If the vectors $V$ and $e$ have components $V_k$ and $e_j$ respectively it is immediately evident that

$$A^T V = e \tag{4.81}$$

where $A^T$ is the transpose of the matrix $A$ defined by

0

-1

A=

0

-1

0

0

0

0

-1

0

-1

0

0

0

1

1

-1

The fact that no current can accumulate at a terminal implies that

$$I = Ai \tag{4.82}$$

where $I$ and $i$ are vectors with components $I_k$ and $i_j$ respectively. The relation between $i_j$ and $e_j$ will depend upon the nature of the connecting paths. In general, we can write a functional relationship

$$e_j = f_j(i_j) \tag{4.83}$$

with Ohm's law as a particular case in which $e_j = R_j i_j$. Yet more generally we could put $e = F(i)$ to allow for coupling between the paths but that possibility will not be investigated further. Let there be $F_j$ such that $F_j'(i) = f_j(i)$. Then, if the source currents are prescribed so that $I$ is known, (4.81)-(4.83) can be derived from (4.64) and (4.65) via

$$X = \sum_{j=1}^{5} F_j(i_j) + \langle I, V\rangle$$

with the identification $T = A$, $x = i$ and $\lambda = V$. Therefore, one variational principle is to maximize $-\sum_{j=1}^{5} F_j(i_j)$ subject to $Ai = I$. For Ohm's law $F_j(i_j) = \tfrac12 R_j i_j^2$ so that this principle effectively minimizes the heat generated in the circuit. The complementary principle is to minimize

$$(e, i) - \sum_{j=1}^{5} F_j - \langle I, V\rangle$$

under the constraint $A^T V = e$ with $e$ given by (4.83). If $I$ had not been specified but instead had been related to $V$ by $I = \phi(V)$ the equations could be derived from

$$X = \sum_j F_j + \Phi(V)$$

where $\nabla_V\Phi = \phi$, and appropriate variational principles formulated. The network could have been cast in the language of economic distribution instead of electricity but the principles would have remained unaltered (for further details see Birkhoff 1963; Iri 1969).
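The two network principles can be tried on a small resistive circuit. The sketch below uses a hypothetical three-terminal, three-path network (it is not the network of Fig. 4.1): paths run from terminal 1 to 2, 2 to 3 and 1 to 3, the entry of `A` is $+1$ where a path leaves a terminal and $-1$ where it enters, and the resistances and source currents are invented for the example.

```python
# Sketch: complementary bounds for a resistive network obeying Ohm's law.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

A = [[1, 0, 1],      # terminal 1: paths 1 and 3 leave
     [-1, 1, 0],     # terminal 2: path 1 enters, path 2 leaves
     [0, -1, -1]]    # terminal 3: paths 2 and 3 enter
R = [1.0, 1.0, 2.0]            # hypothetical resistances
I_src = [1.0, 0.0, -1.0]       # hypothetical source currents (they sum to zero)

def feasible_currents(s):
    """Every solution of A i = I for this network, with one free parameter s."""
    return [s, s, 1.0 - s]

def lower(i):
    """-sum_j F_j(i_j) with F_j = R_j i_j^2 / 2, to be maximized subject to A i = I."""
    return -sum(0.5 * Rj * ij * ij for Rj, ij in zip(R, i))

def upper(V):
    """(e, i) - sum_j F_j - <I, V> with e = A^T V and i_j = e_j / R_j."""
    e = [sum(A[k][j] * V[k] for k in range(3)) for j in range(3)]
    i = [ej / Rj for ej, Rj in zip(e, R)]
    return dot(e, i) - sum(0.5 * Rj * ij * ij for Rj, ij in zip(R, i)) - dot(I_src, V)

lower_values = [lower(feasible_currents(k / 20.0)) for k in range(21)]
trial_voltages = [[1.0, 0.5, 0.0], [1.0, 0.0, 0.0], [2.0, 1.0, 0.5], [0.3, 0.9, -0.2]]
upper_values = [upper(V) for V in trial_voltages]
```

For this circuit the common stationary value is $-0.5$, reached by the currents $(0.5, 0.5, 0.5)$ and the voltages $(1, 0.5, 0)$; every feasible current distribution gives a value at or below it and every trial voltage vector gives a value at or above it.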

4.9.2 Integral equations

The integral equation

$$f(x(s)) + \int_a^b k(s, t)x(t)\,dt = 0 \qquad (a \le s \le b)$$

where $f$ is a given function and the kernel $k$ is positive definite, is a concrete example of (4.72). Note that the integral equation may be linear or non-linear depending upon the form of $f$. In this case the inner product is taken as

$$(x, y) = \int_a^b x(t)y(t)\,dt.$$

Then

$$F(x) = \int_a^b\left\{\int^{x(t)} f(u)\,du\right\}dt$$

(compare the example just before Theorem 4.1). Thus, one possibility (so long as $F$ is convex) is to maximize

$$-\frac12\int_a^b\int_a^b k(s, t)x(t)x(s)\,dt\,ds - F(x)$$

over $x$. If the inverse of the integral operator is known, perhaps as a differential operator $D$, then another possibility is to minimize

$$\int_a^b f(x(t))x(t)\,dt - F(x) + \frac12\int_a^b f(x(t))\,Df\,dt$$

and thereby obtain complementary bounds.

Another integral equation which can be tackled in this way is

$$x(s) + \int_a^b k(s, t)f(x(t))\,dt = 0 \qquad (a \le s \le b)$$

which is known as a Hammerstein integral equation. Again $k$ is assumed to be positive definite and $f$ a known function. Suppose that $y(t) = f(x(t))$ and imagine that this can be inverted to give $x = h(y)$. Then, in terms of $y$, the integral equation becomes of the previous type and can be handled by introducing

$$F(y) = \int_a^b\int_c^{y(t)} h(u)\,du\,dt.$$

To avoid the inverse $h$ define

$$E(x) = \int_a^b\int^{x(t)} f(u)\,du\,dt.$$

For small changes $\xi$, $\eta$ in $x$ and $y$ respectively, the change in $E(x) + F(y)$ is $(y, \xi) + (x, \eta)$, which is the same as the change in $(x, y)$. Hence, by adjusting the constant $c$ if necessary, we can ensure

$$E(x) + F(y) = (x, y).$$

This permits the conversion of principles involving $F$ into ones containing $E$. Taking account of this the variational principles enunciated for (4.72) are to maximize

$$-\frac12\int_a^b\int_a^b k(s, t)f(x(s))f(x(t))\,ds\,dt + E(x) - \int_a^b x(t)f(x(t))\,dt$$

and to minimize

$$\frac12\int_a^b\int_a^b k(s, t)z(s)z(t)\,ds\,dt + E(x)$$

where

$$x(s) = \int_a^b k(s, t)z(t)\,dt.$$

Also (4.73) offers the possibility of bounding the mean error in a trial function.

4.9.3 Ordinary differential equations

A typical illustration is supplied by the Sturm-Liouville problem of finding $x(t)$ such that

$$\frac{d}{dt}\left(p(t)\frac{dx}{dt}\right) - q(t)x = g(t) \tag{4.84}$$

under the boundary conditions

$$x(a) = c, \qquad [dx/dt]_{t=b} = d. \tag{4.85}$$

The governing equation can be written as the pair

$$p(t)\,dx/dt = -\lambda, \qquad -d\lambda/dt = q(t)x + g(t). \tag{4.86}$$

The occurrence of the boundary conditions (4.85) as well as the system (4.86) makes it desirable to pick inner products which will allow them to be subsumed in the general framework. We could, of course, limit ourselves to spaces of $x$ for which the conditions of (4.85) are true. If we want unrestricted boundary values it is more convenient to think of $x$ in the variational method as having two parts, one being $x(t)$ and the other the boundary values $x(a)$ and $x(b)$. Denote the boundary values by $x(\sigma)$ and then the interpretation is

$$x = \begin{pmatrix} x(t) \\ x(\sigma) \end{pmatrix}.$$

Similarly, the inner product is supposed to consist of two parts so that

$$(x, y) = (x(t), y(t))_t + (x(\sigma), y(\sigma))_\sigma$$

where the inner product $(\,,)_t$ is appropriate to functions defined over $[a, b]$ while $(\,,)_\sigma$ is of a type appropriate to functions defined only at the boundary. For example, one possibility is

$$(x, y) = \int_a^b x(t)\,y(t)\,dt + x(a)y(a) + x(b)y(b). \tag{4.87}$$

In a similar fashion $\lambda$ is split into two parts $\lambda(t)$ and $\lambda(\sigma)$ with an inner product

$$\langle\lambda, \mu\rangle = \langle\lambda(t), \mu(t)\rangle_t + \langle\lambda(\sigma), \mu(\sigma)\rangle_\sigma.$$

The operator $T$ is likewise separated into operators $T_t$ and $T_\sigma$ via

$$T = \begin{pmatrix} T_t & 0 \\ 0 & T_\sigma \end{pmatrix}$$

with the requirement that $\langle\lambda, Tx\rangle = (x, T^A\lambda)$. For our particular problem we can put

$$x^T = (x(t), x(a), x(b)), \qquad \lambda^T = (\lambda(t), \lambda(a), \lambda(b)),$$

$$Tx = \begin{pmatrix} dx/dt \\ x(a) \\ 0 \end{pmatrix}, \qquad T^A\lambda = \begin{pmatrix} -d\lambda/dt \\ 0 \\ \lambda(b) \end{pmatrix},$$

the formula for the adjoint following from

$$-\int_a^b x(t)\frac{d\lambda}{dt}\,dt + x(b)\lambda(b) = \int_a^b \frac{dx}{dt}\lambda\,dt + x(a)\lambda(a)$$

if $\langle\,,\rangle = (\,,)$ and $(\,,)$ is defined by (4.87). Then eqns (4.85) and (4.86) can be expressed in the form of (4.64) and (4.65) if

$$X = \int_a^b \left\{ -\frac{\lambda^2}{2p(t)} + \tfrac12 q x^2 + g x \right\} dt + c\lambda(a) - d\,p(b)\,x(b)$$

it being assumed that p(t) > 0 and q(t) > 0 on [a, b]. There is no difficulty in checking that X is strictly saddle-shaped so that there is a unique solution. The complementary variational principles are minimize

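$$\tfrac12 \int_a^b \left\{ \frac{\lambda^2}{p} + \frac{1}{q}\left(\frac{d\lambda}{dt} + g\right)^2 \right\} dt - c\lambda(a)$$

subject to $\lambda(b) = -d\,p(b)$, and maximize

$$-\tfrac12 \int_a^b \left\{ p\left(\frac{dx}{dt}\right)^2 + q x^2 + 2 g x \right\} dt + d\,p(b)\,x(b)$$

subject to $x(a) = c$.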
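The two principles bracket the common stationary value, which a small numerical sketch makes concrete. The test problem is an assumption chosen for illustration, not taken from the text: $p = q = 1$, $g = 0$ on $[0, 1]$ with $c = 1$, $d = 0$, for which the solution is $x = \cosh(1 - t)/\cosh 1$ and both functionals equal $-\tfrac12\tanh 1$. The trial functions $x = 1$ and $\lambda = \alpha(1 - t)$ are deliberately crude; the function names are hypothetical.

```python
import numpy as np

# Gauss-Legendre quadrature on [a, b]
nodes, weights = np.polynomial.legendre.leggauss(40)
a, b = 0.0, 1.0
t = 0.5 * (b - a) * nodes + 0.5 * (b + a)
w = 0.5 * (b - a) * weights

c, d = 1.0, 0.0                      # boundary data: x(a) = c, x'(b) = d

def p(s): return 1.0 + 0.0 * s       # assumed coefficients p = q = 1, g = 0
def q(s): return 1.0 + 0.0 * s
def g(s): return 0.0 * s

def maximized(x, dx):
    """Functional to be maximized over x with x(a) = c."""
    integrand = p(t) * dx(t)**2 + q(t) * x(t)**2 + 2.0 * g(t) * x(t)
    return -0.5 * np.sum(w * integrand) + d * p(b) * x(b)

def minimized(lam, dlam):
    """Functional to be minimized over lambda with lambda(b) = -d p(b)."""
    integrand = lam(t)**2 / p(t) + (dlam(t) + g(t))**2 / q(t)
    return 0.5 * np.sum(w * integrand) - c * lam(a)

# Trial x = 1 satisfies x(a) = c; trial lambda = alpha (1 - t) satisfies lambda(b) = 0
lower = maximized(lambda s: 1.0 + 0.0 * s, lambda s: 0.0 * s)
alpha = 0.75                          # minimizes (2/3) alpha^2 - alpha
upper = minimized(lambda s: alpha * (1.0 - s), lambda s: -alpha + 0.0 * s)
exact = -0.5 * np.tanh(1.0)
print(lower, exact, upper)
```

Even with these one-parameter trial functions the stationary value is confined to an interval of width 0.125, and the upper bound is already within about 2 per cent.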

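4.9.4 Poisson's equation

Suppose that one is concerned with solving Poisson's equation

$$\nabla^2\phi = f$$

for the scalar function $\phi$ in a domain $\tau$ surrounded by a boundary $\sigma$ on which $\phi$ has to satisfy certain conditions. Explicitly, assume that $\sigma$ consists of two parts $\sigma_1$ and $\sigma_2$ on one of which $\phi$ is specified and on the other the normal derivative of $\phi$ is known. Thus $\phi = g$ on $\sigma_1$ and $n \cdot \operatorname{grad}\phi = h$ on $\sigma_2$, $n$ being the unit outward normal to $\sigma$. Then, if $u = -\operatorname{grad}\phi$, we have

$$-\operatorname{div} u = f, \qquad u = -\operatorname{grad}\phi \tag{4.88}$$

subject to

$$\phi = g \ (\text{on } \sigma_1), \qquad n \cdot u = -h \ (\text{on } \sigma_2). \tag{4.89}$$

If $f$ contains a portion which can be expressed as a divergence so that $f = f_1 + \operatorname{div} F$ we can take account of this by replacing (4.88) and (4.89) by

$$-\operatorname{div} u = f_1, \quad u = F - \operatorname{grad}\phi, \quad \phi = g \ (\text{on } \sigma_1), \quad n \cdot u = -h_1 = n \cdot F - h \ (\text{on } \sigma_2). \tag{4.90}$$

Following the device of §4.9.3 we split $\phi$ into three parts, $\phi(\tau)$ associated with the domain $\tau$ and $\phi(\sigma_1)$, $\phi(\sigma_2)$ from the two portions of the boundary. Then define

$$T\phi = \begin{pmatrix} \operatorname{grad}\phi \\ -n\phi \\ 0 \end{pmatrix}$$

the three rows corresponding to $\tau$, $\sigma_1$ and $\sigma_2$ respectively. For an inner product take

$$\langle u, v\rangle = \int_\tau u(\tau)\cdot v(\tau)\,d\tau + \int_{\sigma_1} u(\sigma_1)\cdot v(\sigma_1)\,d\sigma + \int_{\sigma_2} u(\sigma_2)\cdot v(\sigma_2)\,d\sigma.$$

Then

$$\langle u, T\phi\rangle = \int_\tau u\cdot\operatorname{grad}\phi\,d\tau - \int_{\sigma_1} u\cdot n\,\phi\,d\sigma = \int_{\sigma_2} u\cdot n\,\phi\,d\sigma - \int_\tau \phi\operatorname{div}u\,d\tau$$

by the divergence theorem. Defining

$$(\phi, \psi) = \int_\tau \phi\psi\,d\tau + \int_{\sigma_1}\phi\psi\,d\sigma + \int_{\sigma_2}\phi\psi\,d\sigma$$

we have $\langle u, T\phi\rangle = (\phi, T^A u)$ where

$$T^A u = \begin{pmatrix} -\operatorname{div}u \\ 0 \\ n\cdot u \end{pmatrix}.$$

Consequently, $T^A$ is the adjoint of $T$. Eqn (4.90) can now be written as

$$T\phi = \begin{pmatrix} F - u \\ -gn \\ 0 \end{pmatrix}, \qquad T^A u = \begin{pmatrix} f_1 \\ 0 \\ -h_1 \end{pmatrix}$$

which can be expressed in the form (4.64) and (4.65) on taking

$$X(\phi, u) = \int_\tau (F\cdot u + \phi f_1 - \tfrac12 u^2)\,d\tau - \int_{\sigma_1} g\,n\cdot u\,d\sigma - \int_{\sigma_2} h_1\phi\,d\sigma.$$

The analogue of the left-hand side of (4.66), with $\phi, u, \psi, v$ in place of $x, \lambda, y, \mu$ can be reduced to

$$\int_\tau \tfrac12 (u - v)^2\,d\tau$$

which is positive unless $u = v$. Therefore $X$ is saddle-shaped and the complementary variational principles are as follows.

(i) Minimize

$$\int_\tau (\tfrac12 u^2 - F\cdot u)\,d\tau + \int_{\sigma_1} g\,n\cdot u\,d\sigma \tag{4.91}$$

subject to $-\operatorname{div}u = f_1$ in $\tau$, and $n\cdot u = -h_1$ on $\sigma_2$.

(ii) Maximize

$$-\int_\tau \{\tfrac12 (F - \operatorname{grad}\phi)^2 + \phi f_1\}\,d\tau + \int_{\sigma_2} h_1\phi\,d\sigma \tag{4.92}$$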

subject to $\phi = g$ on $\sigma_1$.

It will be remarked that (ii) takes the customary form of Dirichlet's principle when $F = 0$ and $\sigma_2$ is absent.

To see what happens when inequalities are introduced consider the problem of solving

$$u = F - \operatorname{grad}\phi \ (\text{in } \tau), \qquad \phi = g \ (\text{on } \sigma_1) \tag{4.93}$$

under the restrictions

$$-\operatorname{div}u \le f_1 \ (\text{in } \tau), \qquad n\cdot u \le -h_1 \ (\text{on } \sigma_2), \tag{4.94}$$

$$\phi \ge 0 \ (\text{in } \tau \text{ and on } \sigma), \tag{4.95}$$

$$\int_\tau \phi(\operatorname{div}u + f_1)\,d\tau - \int_{\sigma_2}\phi(n\cdot u + h_1)\,d\sigma = 0.$$

In this case (4.91) is minimized subject to (4.94) while (4.92) is maximized subject to $\phi = g$ on $\sigma_1$ and (4.95).

Exercises

28. Obtain complementary variational principles for the Poisson-Boltzmann equation

$$d^2x/dt^2 = e^x - e^{-x}$$

on $0 < t < 1$, subject to $x(0) = 0$, $x(1) = 1$. With the trial function $x = \sinh at/\sinh a$ show that $a = 1.46$ is optimal and that $\|x - \sinh at/\sinh a\| \le 0.009$ with this choice of $a$.

29. In communication theory the integral equation 1/x(t)

=

r i

Jo

"

sin(t - s) xes) ds x(t - s)

occurs. Formulate complementary variational principles (assuming that the operator is self-adjoint and positive definite). Show that $\|x(t) - 1.36 - 0.06t^2\| \le 0.040$.

30. If $P$ is a positive operator with decomposition $T^A T$ and

$$Px + \nu x = g$$

where $\nu \ne 0$ and $g$ are given prove that a variational principle is to maximize

$$-\tfrac12 (Px, x) - \tfrac12\nu(x, x) + (g, x).$$

Derive a complementary variational principle and express it in terms of the same trial function. If $z$ be used as a trial function in both expressions prove that their difference is $\tfrac12\|Pz + \nu z - g\|^2/\nu$.

31. In the scheme (4.70) the iteration $T^A T x_{n+1} + \nabla F(x_n) = 0$ is introduced. Examine whether the complementary variational expressions converge to one another for the sequence $\{x_n\}$ of trial functions, assuming that the sequence itself converges to a solution of (4.71).

32. Obtain complementary variational principles for the partial differential equation

$$\nabla^2\phi = 4\pi(\phi - 1 + 1/r)^{3/2}$$

in three dimensions, subject to $\phi \to 1 - 1/r$ as $r \to \infty$.

33. If $W(x, \lambda)$ is saddle-shaped and

show that complementary principles are provided by minimizing $(x, \nabla_x W) - W$ and by maximizing

prove that minimizing $Y$ and maximizing $Y - (x, \nabla_x Y) -$

prove that the complementary variational principles are the same as in Exercise 34.

WAVEGUIDES

4.10 The capacitive iris

There are occasions when complementary variational principles arise directly without any recourse to the preceding general theory. An example is provided by placing a thin sheet of metal across part of the cross-section of a waveguide to form what is known as an iris. In particular, consider the iris in a rectangular waveguide constructed by placing strips of metal parallel to the longer side (Fig. 4.2). Let the strips be in the plane $z = 0$ and let the incident field in $z < 0$ be the fundamental mode.

Fig. 4.2. The capacitive iris.

In the fundamental mode

$$E_y = \exp(-i\kappa_0 z)\sin(\pi x/a), \qquad H_x = -(\kappa_0/\omega\mu_0)\exp(-i\kappa_0 z)\sin(\pi x/a),$$

$$H_z = -(\pi/i\omega\mu_0 a)\exp(-i\kappa_0 z)\cos(\pi x/a)$$

and the other field components are zero. Here $a$, $b$ are the sides of the rectangular cross-section with $a > b$ and

$$\kappa_0 = (k^2 - \pi^2/a^2)^{1/2},$$

it being assumed that the frequency is such that $\kappa_0$ is positive. In view of the $x$ dependence of the incident field and the fact that the electric intensity tangential to the iris must vanish, the field produced by the iris must have the same $x$ dependence. Also no $E_x$ can be generated so the modal structure must ensure this. Expressing the total field in terms of such modes, we assume an expansion

$$E_y = \left\{\exp(-i\kappa_0 z) + \sum_{n=0}^\infty a_n \exp(-i\kappa_n|z|)\cos(n\pi y/b)\right\}\sin(\pi x/a),$$

$$-i\omega\mu_0 H_x = \left\{i\kappa_0\exp(-i\kappa_0 z) - \sum_{n=0}^\infty \frac{\kappa_0^2}{i\kappa_n}\,a_n \exp(-i\kappa_n|z|)\cos(n\pi y/b)\,\operatorname{sgn} z\right\}\sin(\pi x/a)$$

and the other transverse components $E_x$ and $H_y$ as zero. As usual, $\operatorname{sgn} z$ is 1 if $z > 0$ and $-1$ if $z < 0$. Also

$$\kappa_n = (k^2 - \pi^2/a^2 - n^2\pi^2/b^2)^{1/2}$$

with $\kappa_n$ negative imaginary when the quantity inside the radical is negative. The determination of the complex constants $a_n$ leads to the complete field.
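The branch convention for $\kappa_n$ is easy to get wrong in computation, so a short sketch may help. It evaluates $\kappa_n$ with the negative imaginary root below cut-off. The guide dimensions ($a = 2.286$ cm, $b = 1.016$ cm) and the frequency of 10 GHz are assumptions chosen for illustration, and the function name `kappa` is hypothetical.

```python
import numpy as np

C0 = 299_792_458.0   # speed of light in vacuum, m/s

def kappa(n, freq_hz, a, b):
    """kappa_n = (k^2 - pi^2/a^2 - n^2 pi^2/b^2)^(1/2); negative imaginary below cut-off."""
    k = 2.0 * np.pi * freq_hz / C0
    radicand = k**2 - (np.pi / a)**2 - (n * np.pi / b)**2
    if radicand >= 0.0:
        return complex(np.sqrt(radicand), 0.0)
    return complex(0.0, -np.sqrt(-radicand))

a, b = 0.02286, 0.01016   # assumed guide dimensions in metres
freq = 10e9               # assumed operating frequency in Hz
modes = [kappa(n, freq, a, b) for n in range(4)]
for n, value in enumerate(modes):
    print(n, value)
```

With these values only $\kappa_0$ is real, so $\exp(-i\kappa_n|z|)$ decays away from the iris for every $n \ge 1$, which is the single-mode situation treated in this section.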

However, in many circumstances and, in particular, if only the fundamental mode can propagate, it will be sufficient to find $a_0$, for $a_0$ is the complex amplitude of the reflected wave and this will be the only significant wave away from the iris in single-mode operation. From now on it will be assumed that only the fundamental mode propagates so that $\kappa_n$ is pure imaginary for $n \ge 1$. If, for the moment, only the fundamental mode is retained and $[H_x]$ denotes the discontinuity in $H_x$ across $z = 0$,

$$E_y/[H_x] = -(1 + a_0)\omega\mu_0/2\kappa_0 a_0$$

at $z = 0$. On account of the continuity of $E_y$ this may be regarded as a shunt impedance $Z$, where

$$Z/Z_0 = -(1 + a_0)/2a_0, \qquad Z_0 = \omega\mu_0/\kappa_0, \tag{4.96}$$

placed across the line at $z = 0$ in the equivalent circuit. Therefore, it will be sufficient for our purposes to evaluate $Z$ ($a_0$ being obtained as a by-product). The aim is to derive an integral equation on $z = 0$ which will permit the evaluation of the field. However, there are two ways in which this can be approached. One will involve the field on the aperture, i.e. the portion of $z = 0$ where there is no metal, and the other will concern the current induced in the metal strip. It will be discovered that these two integral equations lead to complementary variational principles for the quantity sought.

Let $S$ be the perfectly conducting metal portion of the iris and $A$ the aperture section. On $z = 0$, write $E_y$ as $E(y)\sin(\pi x/a)$. Then, from the theory of Fourier series, for $n \ge 1$

$$a_n = \frac{2}{b}\int_0^b E(t)\cos\frac{n\pi t}{b}\,dt = \frac{2}{b}\int_A E(t)\cos\frac{n\pi t}{b}\,dt$$

since $E_y$ vanishes on $S$. Similarly

$$1 + a_0 = \frac{1}{b}\int_A E(t)\,dt. \tag{4.97}$$

The tangential field $H_x$ is continuous across the aperture and so

$$\sum_{n=0}^\infty \frac{a_n}{\kappa_n}\cos\frac{n\pi y}{b} = 0 \qquad (y \text{ in } A).$$

On substituting the integral formulae for the $a_n$ we obtain

$$\frac{1}{\kappa_0}\left\{\frac{1}{b}\int_A E(t)\,dt - 1\right\} + \frac{2}{b}\int_A E(t)\sum_{n=1}^\infty \frac{1}{\kappa_n}\cos\frac{n\pi y}{b}\cos\frac{n\pi t}{b}\,dt = 0 \qquad (y \text{ in } A)$$

which constitutes an integral equation to determine the tangential electric intensity in the aperture. It can be converted into a more convenient version by making the substitution

$$E(t) = \tfrac12 b\,i\,a_0\,g(t)$$

and using (4.97) in the first term. Then

$$\frac{1}{\kappa_0} - \int_A g(t)\sum_{n=1}^\infty \frac{1}{i\kappa_n}\cos\frac{n\pi y}{b}\cos\frac{n\pi t}{b}\,dt = 0 \qquad (y \text{ in } A). \tag{4.98}$$

Once $g$ has been found from this integral equation, $a_0$ can be determined from

$$1 + a_0 = \tfrac12 i a_0\int_A g(t)\,dt$$

as a consequence of (4.97). It follows from (4.96) that

$$Z/Z_0 = -\tfrac14 i\int_A g(t)\,dt. \tag{4.99}$$

Since $\kappa_0$ and $i\kappa_n$ ($n \ge 1$) are real and positive, the operator in (4.98) is real and so $g$ is real. Thus (4.99) implies that $Z$ is pure imaginary. Also, by multiplying (4.98) by $g$ and integrating over $A$, we see that the integral in (4.99) is positive. Consequently, $Z$ is negative imaginary and the iris is capacitive.

The second integral equation can be inferred from a consideration of the current in the metal strip, which is proportional to the discontinuity in $H_x$ across the strip. Let

$$J(y)\sin(\pi x/a) = i\omega\mu_0\{(H_x)_{z=+0} - (H_x)_{z=-0}\}$$

so that

$$J(y) = 2\sum_{n=0}^\infty (\kappa_0^2/i\kappa_n)\,a_n\cos(n\pi y/b).$$

Since there is no current in the aperture, we deduce that

$$a_n = \frac{i\kappa_n}{\kappa_0^2 b}\int_S J(t)\cos\frac{n\pi t}{b}\,dt \quad (n \ge 1), \qquad a_0 = \frac{i}{2\kappa_0 b}\int_S J(t)\,dt.$$

The application of the boundary condition that $E_y$ vanishes on $S$ gives

$$1 + \sum_{n=0}^\infty a_n\cos(n\pi y/b) = 0 \qquad (y \text{ in } S)$$

whence

$$1 + a_0 + \sum_{n=1}^\infty \frac{i\kappa_n}{\kappa_0^2 b}\int_S J(t)\cos\frac{n\pi t}{b}\cos\frac{n\pi y}{b}\,dt = 0 \qquad (y \text{ in } S).$$

Putting

$$J(t) = \kappa_0^2\,b\,(1 + a_0)\,g_1(t)$$

we obtain

$$1 + \sum_{n=1}^\infty i\kappa_n\int_S g_1(t)\cos\frac{n\pi t}{b}\cos\frac{n\pi y}{b}\,dt = 0 \qquad (y \text{ in } S) \tag{4.100}$$

as the integral equation to determine $g_1$ and hence the field. In this case

$$a_0 = \tfrac12 i(1 + a_0)\kappa_0\int_S g_1(t)\,dt$$

and

$$Z_0/Z = -i\kappa_0\int_S g_1(t)\,dt. \tag{4.101}$$

Again, it is clear that $g_1$ is real and $Z$ negative imaginary. Variational principles of a rather different type from the ones already discussed can be derived for both (4.98) and (4.100). However, they are particular cases of a more general theory so their consideration will be temporarily postponed.

4.11 Another form of variational principle

In operator form the integral equations are

$$Tg = f \tag{4.102}$$

where $T$ is a linear operator. It is sufficient for the impedance in (4.99) and (4.101) if the integral of $g$ is determined rather than $g$ itself. In the present language this may be expressed as saying that the determination of $(g, h)$, where $h$ is known, will be enough for our purposes. In fact, taking $h$ as unity would be adequate for (4.99) and (4.101). Let $g'$ be such that

$$T^A g' = h. \tag{4.103}$$

Then

$$(g, h) = (g, T^A g') = (Tg, g') = (f, g') \tag{4.104}$$

which may be viewed as a sort of reciprocity theorem. On account of (4.104) we have

$$(g, h) = \frac{(g, h)(f, g')}{(Tg, g')}$$

which will now be demonstrated to have variational properties. Make a variation in the expression on the right-hand side by replacing $g$ by $g + \varepsilon g_0$ where $\varepsilon$ is considered to be small and $g_0$ is any element in the space under consideration so long as $g + \varepsilon g_0$ is in the space. The right-hand side becomes

$$\frac{(f, g')}{(Tg, g')}\left[(g, h) + \varepsilon\left\{(g_0, h) - \frac{(g, h)(Tg_0, g')}{(Tg, g')}\right\} + O(\varepsilon^2)\right].$$

If $g'$ satisfies (4.103) the coefficient of $\varepsilon$ is zero for all $g_0$. Conversely, if the coefficient of $\varepsilon$ is zero for all $g_0$,

$$(Tg, g')h = (g, h)T^A g'.$$

Multiplication of $g'$ by a constant does not affect this equation. Therefore, if the constant is chosen so that

$$(Tg, g') = (g, h),$$

it follows that $T^A g' = h$, i.e. (4.103) is satisfied. Variations in $g'$ instead of $g$ lead to (4.102). Therefore, it may be concluded that a necessary and sufficient condition for eqns (4.102) and (4.103) to hold is that

$$\frac{(g, h)(f, g')}{(Tg, g')}$$

be stationary for small independent variations in $g$ and $g'$. If trial functions are substituted for $g$ and $g'$ and chosen to make the expression stationary it is hoped that a reasonable approximation to $(g, h)$ is obtained. One advantage of this variational formula is that multiplication of $g$ by a constant does not affect its value.

In the particular case when $T$ is self-adjoint and $h = f$, the expression

$$\frac{(f, g)^2}{(g, Tg)}$$

can be used as a variational formula for $(f, g)$. Additional results are available when $T$ is self-adjoint and $h = f$. In such circumstances another variational expression for $(f, g)$ is

$$2(f, g) - (g, Tg)$$

as may be confirmed from its becoming

$$2(f, g) - (g, Tg) + 2\varepsilon\{(f, g_0) - (Tg, g_0)\} - \varepsilon^2(g_0, Tg_0)$$

when $g$ is replaced by $g + \varepsilon g_0$. If $G$ is an approximation to $g$ and $G = bg_1$ where $g_1$ is specified but $b$ is a constant at our disposal, the variational expression is stationary when

$$(f, g_1) = b(g_1, Tg_1) \quad \text{or} \quad (G, TG) = (f, G). \tag{4.105}$$

It has already been demonstrated that the same equation is satisfied by a trial function which makes the variational expression at the beginning of this section stationary.
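The behaviour of these formulae is easy to see in a finite-dimensional setting. The sketch below is an illustration only: it replaces $T$ by an assumed $2 \times 2$ symmetric positive definite matrix, $f$ by a vector and the inner product by the dot product. It checks that the quotient $(f, G)^2/(G, TG)$ is unaffected by scaling the trial element, and that the scale fixed by (4.105) gives $(f, G)$ equal to that quotient.

```python
import numpy as np

T = np.array([[2.0, 1.0], [1.0, 3.0]])   # assumed symmetric positive definite "operator"
f = np.array([1.0, 2.0])
g = np.linalg.solve(T, f)                 # exact solution of T g = f
exact = f @ g                             # the quantity (f, g) being sought

def quotient(G):
    """Variational formula (f, G)^2 / (G, T G) for self-adjoint T and h = f."""
    return (f @ G)**2 / (G @ T @ G)

g1 = np.array([1.0, 1.0])                 # crude trial element
b_opt = (f @ g1) / (g1 @ T @ g1)          # constant b fixed by (4.105)
G = b_opt * g1

print(exact, quotient(g1), quotient(5.0 * g1), f @ G)
```

Here `exact` is 1.4 while the trial element gives 9/7, roughly 1.2857. The approximation lies below the correct value, as expected for a positive $T$, even though the trial element is far from proportional to `g`.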

For any $G$ satisfying (4.105)

$$(TG, g - G) = (G, f) - (TG, G) = 0$$

and hence

$$(f, g - G) = (g - G, T(g - G)). \tag{4.106}$$

If $T$ is a positive operator, we can conclude that

$$(f, G) \le (f, g).$$

In other words $(f, G)$ always lies below the correct value. If $-T$ is a positive operator then $(f, G)$ always exceeds the correct value. If (4.105) is not necessarily satisfied then in place of (4.106) we obtain

$$(f, g - G) = (G, f - TG) + (g - G, T(g - G)). \tag{4.107}$$

Both (4.106) and (4.107) can be used to provide a bound for the error introduced in approximating $g$ by $G$ if a suitable estimate can be provided for a term of the form $(y, Ty)$. One possibility (see also Jones 1956) is that there is a positive $\mu$ such that $(Ty, T^*y^*) \ge \mu(y, y^*)$ for all $y$ which can arise, the asterisk indicating a complex conjugate. Then, if

$$|(h_1, h_2)|^2 \le (h_1, h_1^*)(h_2, h_2^*),$$

$$|(y, Ty)| \le (Ty, T^*y^*)/\mu^{1/2} \tag{4.108}$$

with $y = g - G$, $Ty = f - TG$ and the right-hand side of (4.108) is known. Consequently, an upper bound to the unknown term on the right-hand sides of (4.106) and (4.107) has been derived. At the same time, a bound becomes available for a quantity such as $(f_1, g)$ where $f_1$ is known because it can be written as $(f_1, G) + (f_1, y)$ and

$$|(f_1, y)|^2 \le (f_1, f_1^*)(Ty, T^*y^*)/\mu.$$

As an application consider the capacitive iris of the previous section. Then (4.98) is of the form (4.102) with $f = 1/\kappa_0$ and $T$ a self-adjoint positive operator. Accordingly, if an approximation $G$ is selected which complies with (4.105), $\int_A G\,dt$ will not exceed $\int_A g\,dt$. Consequently, the value of $Z/i$ obtained from (4.99) by using $G$ will not be less than the correct value. In (4.100) the operator is negative so that $\int_S G_1\,dt$, $G_1$ being an approximation to $g_1$ for which (4.105) is valid, will not be less than the correct value and $Z/i$, given by (4.101), will not be greater than the true answer. Therefore variational methods for $G$ and $G_1$, ensuring (4.105), are complementary. It should, nevertheless, be pointed out that (4.105) consists effectively of replacing $g$ by $G$ in the integral equation and then integrating after multiplication by $G$. Thus there is no actual necessity to go through the variational mechanism.

In particular, take $G$ to be a constant for (4.98) when $A$ extends from $y = 0$ to $y = d$ whereas $S$ covers $y = d$ to $y = b$. Then (4.105) holds if

$$\frac{Gd}{\kappa_0} - \frac{G^2 b^2}{\pi^2}\sum_{n=1}^\infty \frac{1}{i\kappa_n n^2}\sin^2\frac{n\pi d}{b} = 0.$$

Equation (4.99) then gives an approximation $Z_1$ to $Z$ which is

$$\frac{Z_1}{iZ_0} = -\frac{\pi^2 d^2}{4\kappa_0 b^2\sum_{n=1}^\infty (1/n^2 i\kappa_n)\sin^2(n\pi d/b)}. \tag{4.109}$$

For (4.100) make the approximation

$$G_1 = C\cos\frac{\pi(b - y)}{2(b - d)}$$

which guarantees that the current vanishes at the edge $y = d$ of the metal strip. The constant $C$ is picked so that (4.105) holds, i.e.

$$\frac{2(b - d)}{\pi} + C\,\frac{4b^4(b - d)^2}{\pi^2}\sum_{n=1}^\infty \frac{i\kappa_n\cos^2(n\pi d/b)}{\{4n^2(b - d)^2 - b^2\}^2} = 0.$$

The corresponding approximation $Z_2$ to $Z$, stemming from (4.101), is

$$\frac{Z_2}{iZ_0} = -\frac{b^4}{\kappa_0}\sum_{n=1}^\infty \frac{i\kappa_n\cos^2(n\pi d/b)}{\{4n^2(b - d)^2 - b^2\}^2}. \tag{4.110}$$

From the general theory we know that the true $Z$ satisfies $Z_1/i \ge Z/i \ge Z_2/i$. An actual example will provide some idea of how close these simple approximations are. Let $d = \tfrac12 b$ and $\kappa_0 b \ll \pi$ so that $i\kappa_n \approx n\pi/b$. Then from (4.109)

$$Z_1/iZ_0 \approx -0.59\pi/\kappa_0 b$$

since $\sum_{n=0}^\infty (2n + 1)^{-3} \approx 1.05$. Also, from (4.110),

$$Z_2/iZ_0 \approx -0.87\pi/\kappa_0 b$$

because $\sum_{n=1}^\infty 2n(4n^2 - 1)^{-2} \approx 0.25$. The mean of the two values is $-0.73\pi/\kappa_0 b$ which should be compared with the true value of $-0.71\pi/\kappa_0 b$. Both $Z_1$ and $Z_2$ are roughly 20 per cent in error, which is not surprising in view of the crude approximation adopted but is perhaps more accurate than might have been anticipated. One could expect to do much better with more sophisticated trial functions.
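The two numerical values quoted for $d = \tfrac12 b$ can be reproduced with a few lines of code. The sketch below assumes the same limit $i\kappa_n \approx n\pi/b$ and expresses both results in units of $\pi/\kappa_0 b$. The variable names are illustrative. For the strip current the overlap integral of the trial function with $\cos(n\pi y/b)$ is evaluated directly rather than through the closed form in (4.110), because for $d = \tfrac12 b$ the $n = 1$ term of that closed form takes the indeterminate form 0/0 and has to be read as a limit.

```python
import numpy as np

b = 1.0            # guide height (arbitrary units)
d = 0.5 * b        # aperture occupies 0 <= y <= d, strip occupies d <= y <= b
n = np.arange(1, 200001)
ikappa = n * np.pi / b          # i*kappa_n in the limit kappa_0 b << pi

# (4.109): constant trial field in the aperture
S1 = np.sum(np.sin(n * np.pi * d / b)**2 / (n**2 * ikappa))
z1 = -np.pi * d**2 / (4.0 * b * S1)     # Z1/(i Z0) in units of pi/(kappa_0 b)

# (4.110): cosine trial current on the strip. I_n is, up to sign, the integral
# of cos(pi (b - y) / (2 (b - d))) cos(n pi y / b) over d <= y <= b.
L = b - d
alpha, beta = np.pi / (2.0 * L), n * np.pi / b
I_n = 0.5 * L * (np.sinc((alpha - beta) * L / np.pi)
                 + np.sinc((alpha + beta) * L / np.pi))
S2 = np.sum(ikappa * I_n**2)
z2 = -S2 * np.pi * b / (4.0 * L**2)     # Z2/(i Z0) in units of pi/(kappa_0 b)

print(z1, z2, 0.5 * (z1 + z2))
```

The printed values are close to $-0.59$, $-0.87$ and $-0.73$, and they bracket the quoted true value of $-0.71$ from either side, as the complementary principles require.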

4.12 The inductive iris

When the metallic strip is placed parallel to the shorter side of the waveguide instead of the longer an inductive iris is produced. The method of §4.10 may be employed to derive integral equations. In order to display its versatility it will now be assumed that more than one mode may propagate and that the guide has different widths on either side of the iris.

Fig. 4.3. The inductive iris.

The cross-section in the $(x, z)$ plane is shown in Fig. 4.3. The subscript 1 will be used to indicate quantities in the right-hand guide. For simplicity, only excitation by TE modes independent of $y$ will be considered, though more general cases can be handled. As a result the current induced in the iris will be independent of $y$ and only TE modes will be present. Suppose that the TE mode in which $E_y$ is $\exp(-i\lambda_m z)\sin(m\pi x/a)$ is incident on the discontinuity from the left with $\lambda_m = (k^2 - m^2\pi^2/a^2)^{1/2}$. Then a suitable expansion for the field in $z < 0$ is

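$$E_y = \exp(-i\lambda_m z)\sin(m\pi x/a) + \sum_{n=1}^\infty a_n\exp(i\lambda_n z)\sin(n\pi x/a),$$

$$-i\omega\mu_0 H_x = i\lambda_m\exp(-i\lambda_m z)\sin(m\pi x/a) - \sum_{n=1}^\infty i\lambda_n a_n\exp(i\lambda_n z)\sin(n\pi x/a)$$

and in $z > 0$ is

$$E_y = \sum_{n=1}^\infty b_n\exp(-i\nu_n z)\sin(n\pi x_1/a_1),$$

$$-i\omega\mu_0 H_x = \sum_{n=1}^\infty i\nu_n b_n\exp(-i\nu_n z)\sin(n\pi x_1/a_1)$$

where $\nu_n = (k^2 - n^2\pi^2/a_1^2)^{1/2}$. As in §4.10

$$a_n + \delta_{nm} = \frac{2}{a}\int_A E(t)\sin\frac{n\pi t}{a}\,dt, \qquad b_n = \frac{2}{a_1}\int_A E(t_1)\sin\frac{n\pi t_1}{a_1}\,dt_1$$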

where $E$ is the value of $E_y$ on $z = 0$, $t$ and $t_1$ have their origin in $x = 0$ and $x_1 = 0$ respectively, and $\delta_{nm}$ is zero if $n \ne m$ but unity if $n = m$. The continuity of $H_x$ across the aperture requires that

$$i\lambda_m\sin(m\pi x/a) - \sum_{n=1}^\infty i\lambda_n a_n\sin(n\pi x/a) = \sum_{n=1}^\infty i\nu_n b_n\sin(n\pi x_1/a_1)$$

for $x$, $x_1$ in $A$. Hence

$$i\lambda_m\sin\frac{m\pi x}{a} - \sum_{n=1}^\infty \frac{i\lambda_n}{a}\int_A E(t)\sin\frac{n\pi t}{a}\sin\frac{n\pi x}{a}\,dt - \sum_{n=1}^\infty \frac{i\nu_n}{a_1}\int_A E(t_1)\sin\frac{n\pi t_1}{a_1}\sin\frac{n\pi x_1}{a_1}\,dt_1 = 0. \tag{4.111}$$

This is the integral equation for $E$. It can be expressed entirely in terms of $x$ by noting the difference between the origins of $x$ and $x_1$, and then tackled in the same manner that has already been employed. The theory has been developed considerably (Rozzi 1973) so as to improve accuracy in multimode propagation and to include cascades of apertures. Details will be found in Jones (1986); some subsequent exercises are for those who wish to pursue these developments.

Exercises

36. A capacitive iris in a rectangular waveguide, in which only the fundamental mode can propagate, has metal occupying $0 \le y \le \tfrac12 d_1$ and $d + \tfrac12 d_1 \le y \le b$. Show that $iZ_0/Z$ cannot be less than

$$-\frac{4\kappa_0 b^2}{\pi^2 d^2}\sum_{n=1}^\infty \frac{1}{in^2\kappa_n}\left\{\sin\frac{n\pi}{b}(d + \tfrac12 d_1) - \sin\frac{n\pi d_1}{2b}\right\}^2.$$

37. In an inductive iris the metal occupies $0 \le x \le d < a$. If $2 > ka/\pi > 1$ show that the equivalent circuit at the iris is a shunt of impedance $iX$ where

where

K; =

1tX

•

mnx

sln-sln-dx. fo dd 4

•

38. In §4.12, $d' = 0$, $d = a_1$ and $a = 2d' + d$ so that the discontinuity is a symmetric H-plane step. Determine a four-port equivalent network for the $\mathrm{TE}_{10}$ and $\mathrm{TE}_{30}$ modes on each side of the junction.

39. An H-plane oversize section is formed by increasing $a$ to $1.2a$ at one point and then reducing the width to $a$ after a distance $2a$. If $a = 2.286$ cm determine a suitable equivalent circuit for frequencies between 7 GHz and 19 GHz. Use two accessible modes in each section of the guide and $P = 6$. Show that the amplitude of the reflection coefficient of the $\mathrm{TE}_{10}$ mode does not exceed 0.1 between 10 GHz and 18 GHz.

40. A thick inductive iris is formed by inserting a narrower section of guide for a certain length. Discuss the problem of determining an impedance matrix.

41. Two symmetric infinitely thin inductive irises each have aperture $\tfrac12 a$ and are separated by a distance fa in a rectangular waveguide in which only the $\mathrm{TE}_{10}$ mode can propagate. Find the impedance matrix for $P = 5$ when there are 11 accessible modes, three in the guide and eight between the irises. Is there a minimum for the voltage standing wave ratio at any frequency of operation with the $\mathrm{TE}_{20}$ mode cut-off?

4.13 Vector optimization

Earlier sections have dealt with the stationary points of functions on $R^n$ or functionals. Sometimes it is desirable to look at other possibilities. For example, attempting to reproduce a given radiation pattern by minimizing the norm of the difference between the approximation and the given pattern can lead to a ratio of input power to radiated power which is of unsatisfactory magnitude. Is it possible to optimize (in some sense) the norm and the radiated power simultaneously while putting a limit on the input power? This is a problem in vector optimization where the components have to be considered individually (rather than combined into a single quantity such as a norm) and improvement of one component may be detrimental to the performance of another. A brief introduction to such problems will be given here; for a fuller discussion see Kirsch et al. (1978) and Jahn (1986).

Criteria have to be supplied which enable a decision on whether one vector is to be preferred to another. Whatever the criteria the properties of the rule for comparing vectors should bear some resemblance to the properties of $\ge$ in the comparison of real numbers. Let us agree to write $x \succ y$ or $y \prec x$ to indicate that the vector $x$ is as good as or better than $y$. Obviously, it is desirable to have $x$ as good as itself, i.e. $x \succ x$ should be true. Also, if $x$ is better than $y$ and $y$ is better than $z$, it ought to follow that $x$ is better than $z$, i.e. $x \succ y$ and $y \succ z$ should imply $x \succ z$. Such a relation is said to be reflexive and transitive. A further desirable property is that the pair $x \succ y$ and $y \succ x$ should enforce $x = y$.

Usually, in optimization, examples are based on spaces called cones. A convex cone $C$ is a convex set such that, if $x$ is a member of $C$ and the scalar $a \ge 0$, $ax$ is also a member of $C$. It might happen that both $x$ and $-x$ were members of $C$ which would be undesirable from the present point of view. Therefore, a convex semi-cone is introduced which is a cone in which there is no $x$ (except $x = 0$) such that $x$ and $-x$ are both members. If the linear space $X$ contains a convex semi-cone $C$ the relation $x \succ y$ can be defined by saying that $x \succ y$ when $x - y$ is a member of $C$. An easy check confirms that the properties set out in the preceding paragraph follow from this definition.

In the plane of points $(x_1, x_2)$ a convex semi-cone is formed by points in which $x_1 \ge 0$ and $x_2 \ge 0$ are both true. Then $x \succ y$ when $x_1 \ge y_1$ and $x_2 \ge y_2$ are both valid. An example which does not involve points is provided by $n \times n$ positive semi-definite real matrices which form a convex semi-cone in the space of real symmetric $n \times n$ matrices. Then $x \succ y$ when the real symmetric $x$ and $y$ differ by a positive semi-definite matrix.
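The two orderings just described can be written down directly. The following is a small sketch: the function names `succ_orthant` and `succ_psd` are hypothetical, and the numerical tolerance in the matrix test is an assumption needed only because eigenvalues are computed in floating point. It also shows that such an ordering is only partial, since two elements may be incomparable.

```python
import numpy as np

def succ_orthant(x, y):
    """x succeeds y for the semi-cone x1 >= 0, x2 >= 0: x - y lies in the cone."""
    return bool(np.all(np.asarray(x, float) - np.asarray(y, float) >= 0.0))

def succ_psd(x, y, tol=1e-12):
    """x succeeds y for symmetric matrices: x - y is positive semi-definite."""
    diff = np.asarray(x, float) - np.asarray(y, float)
    return bool(np.all(np.linalg.eigvalsh(diff) >= -tol))

p, q, r = (2.0, 3.0), (1.0, 3.0), (3.0, 1.0)
print(succ_orthant(p, q))                       # p is as good as or better than q
print(succ_orthant(p, r), succ_orthant(r, p))   # p and r cannot be compared

A = np.diag([2.0, 1.0])
B = np.diag([1.0, 1.0])
C = np.diag([1.0, 2.0])
print(succ_psd(A, B))                           # A - B is positive semi-definite
print(succ_psd(A, C), succ_psd(C, A))           # A and C cannot be compared
```

The incomparable pairs are the reason why the components of a vector objective have to be considered individually rather than ranked along a single scale.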

The interpretation above of the symbol $\succ$ indicates that $x_0$ should be regarded as best in a set of $x$ whenever $x \succ x_0$ entails $x = x_0$. Of course, there may be several best or none at all in a given set. More generally, one can say that a vector $v(t)$ is best at a point $t_0$ when $v(t) \succ v(t_0)$ forces $v(t) = v(t_0)$. By analogy with the terminology for a maximum one might call $t_0$ a stationary point of $v(t)$.

Consider points $(t_1, t_2)$ in the set $t_1 \ge 0$, $t_2 \ge 0$, $t_1t_2 \le 1$. Let $v(t) = (t_1, t_2)^T$. Then $t_0 = (t_1, 1/t_1)$ is a stationary point of $v(t)$ for any finite non-zero $t_1$. For, suppose $v(T) \succ v(t_0)$ which means $T_1 \ge t_1$, $T_2 \ge 1/t_1$. Then $T_1T_2 \ge 1$ which is not permitted except with equality. Thus, $T_2 = 1/T_1$ and this substitution in $v(T) \succ v(t_0)$ leads to an inconsistency unless $T_1 = t_1$ whence $T_2 = 1/t_1$. The proof is complete. This example reveals that there can be an unlimited number of stationary points in vector optimization.

Theorems about the existence of stationary points and other properties are available but would take us too far afield. The interested reader is referred to the books already quoted. For an application to an antenna problem see Angell and Kirsch (1992).

4.14 Sobolev spaces

The definition of a one-dimensional Sobolev space was mentioned in §3.1. It is now convenient to say something about generalizations and a property analogous to that of the gradient in §4.1. Let $\Omega$ be a domain, i.e. an open set in the real Euclidean space $R^n$. The space $L_p(\Omega)$ with $1 \le p < \infty$ consists of those functions $x(t)$ such that

$$\|x\|_p = \left\{\int_\Omega |x(t)|^p\,dt\right\}^{1/p} < \infty$$

and $\|x\|_p$ is the norm of this space. The definition is unsuitable when $p = \infty$. In that case, functions such that $|x(t)| \le K$ ($K$ is a constant) almost everywhere on $\Omega$ are considered. If $K_0$ is the greatest lower bound of such $K$ then

$$\|x\|_\infty = K_0.$$

For $1 \le p \le \infty$ $L_p(\Omega)$ is a Banach space and, in particular, $L_2(\Omega)$ is a Hilbert space. Any bounded linear functional $F$ on $L_p(\Omega)$ can be expressed as

$$F(x) = \int_\Omega x(t)\,y(t)\,dt$$

for all $x$ in $L_p(\Omega)$. The function $y$, which identifies $F$ and does not change with $x$, is a member of $L_q(\Omega)$ where $q = p/(p - 1)$. Furthermore

$$\|F\| = \|y\|_q.$$

In Sobolev spaces these ideas are extended to include derivatives of functions.

To simplify notation a partial derivative at $(t_1, \ldots, t_n)$ will be written $D^m x$ where

$$D^m x = \frac{\partial^m x}{\partial t_1^{m_1}\,\partial t_2^{m_2}\cdots\partial t_n^{m_n}},$$

the non-negative integers $m_1, m_2, \ldots, m_n$ satisfying

$$m_1 + m_2 + \cdots + m_n = m.$$

We shall write also $\sum_m$ for a sum which involves all possible partial derivatives of order $m$. With this understanding the space $W^{m,p}(\Omega)$ is comprised of all those functions $x(t)$ for which any $D^r x$ is in $L_p(\Omega)$ for $r = 0, 1, \ldots, m$ and the associated norm is

$$\|x\|_{m,p} = \left\{\sum_{r=0}^m \sum_r \|D^r x\|_p^p\right\}^{1/p} \quad (1 \le p < \infty), \qquad \|x\|_{m,\infty} = \max_{0 \le r \le m}\|D^r x\|_\infty.$$

The norms on the right are the $L_p$ norms already defined. The derivatives are allowed to be weak, i.e. distributions or generalized functions so long as they are in $L_p$. It is obvious immediately that $W^{0,p}$ is the same as $L_p$ and that

$$\|x\|_{m,p} \ge \|x\|_p.$$
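For a function of one variable the norm is simple to evaluate numerically. The sketch below is an illustration under stated assumptions: $\Omega = (0, 1)$, $m = 1$, $p = 2$, the test function $x(t) = \sin\pi t$ with its derivative supplied analytically, and Gauss-Legendre quadrature. The helper name `sobolev_norms_1d` is hypothetical.

```python
import numpy as np

def sobolev_norms_1d(x, dx_dt, a, b, p=2.0, order=40):
    """Return (||x||_p, ||x||_{1,p}) on (a, b) by Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    t = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    w = 0.5 * (b - a) * weights
    lp_p = np.sum(w * np.abs(x(t))**p)          # ||x||_p^p
    d_p = np.sum(w * np.abs(dx_dt(t))**p)       # ||Dx||_p^p
    return lp_p**(1.0 / p), (lp_p + d_p)**(1.0 / p)

lp_norm, w12_norm = sobolev_norms_1d(
    lambda t: np.sin(np.pi * t),
    lambda t: np.pi * np.cos(np.pi * t),
    0.0, 1.0)
print(lp_norm, w12_norm)
```

For this function $\|x\|_2 = (1/2)^{1/2}$ and $\|x\|_{1,2} = \{(1 + \pi^2)/2\}^{1/2}$, so the Sobolev norm exceeds the $L_2$ norm by the contribution of the derivative.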

Moreover, it can be shown that $W^{m,p}$ is a Banach space and that $W^{m,2}$ is a Hilbert space with inner product

$$\sum_{r=0}^m \sum_r \int_\Omega D^r x(t)\,D^r y(t)^*\,dt.$$

Every linear functional $F_w$ on $W^{m,p}$ can be expressed as

$$F_w(x) = \sum_{r=0}^m \sum_r \int_\Omega y_r(t)\,D^r x(t)\,dt$$

for some $y_r$ in $L_q$; if $1 < p < \infty$ the $y_r$ are unique. The space $W^{-m,q}(\Omega)$ for $m = 1, 2, \ldots$, and $1 < p < \infty$ consists of those $y$ for which $\|y\|_{-m,q} < \infty$ where

$$\|y\|_{-m,q} = \sup\left|\int_\Omega x(t)\,y(t)\,dt\right| \Big/ \|x\|_{m,p}$$

taken over $x$ in $W^{m,p}(\Omega)$. It is closely related to the space of linear functionals on $W^{m,p}$.

For some boundary value problems it is helpful to utilize Sobolev spaces of fractional order which will be denoted by $W^{\mu,p}$. When $\mu$ is an integer this is defined as before. When $\mu = m + \nu$ where $m$ is a non-negative integer and $0 < \nu < 1$, a norm is defined by

$$\|x\|_{\mu,p} = \left\{\|x\|_{m,p}^p + \sum_m \int_\Omega\!\int_\Omega \frac{|D^m x(t) - D^m x(u)|^p}{|t - u|^{n+\nu p}}\,dt\,du\right\}^{1/p}.$$

The space $W^{-\mu,q}$ is obtained from $W^{\mu,p}$ in the same way that $W^{-m,q}$ is derived from $W^{m,p}$.

If $\Omega$ has a boundary $\partial\Omega$ which is bounded and reasonably smooth (roughly $C^m$ regular), a function $x$ in $W^{m,p}$ is intimately related to its values $x_B$ on the boundary (sometimes called traces). In fact, if $x$ is in $W^{m,p}(\Omega)$ then $x_B$ is in $W^{m-1/p,p}(\partial\Omega)$ and

$$\|x_B\|_{m-1/p,p,\partial\Omega} \le K_1\|x\|_{m,p,\Omega}$$

where the domain in which the norm is to be calculated has been displayed and the constant $K_1$ is independent of $x$. Conversely, if $x_B$ is in $W^{m-1/p,p}(\partial\Omega)$, there is an $x$ in $W^{m,p}(\Omega)$ which has these boundary values and

$$\|x\|_{m,p,\Omega} \le \|x_B\|_{m-1/p,p,\partial\Omega}.$$

The normal derivative $n\cdot\operatorname{grad}x$ will be less smooth than $x$ generally, especially if there are edges, and cannot be expected to be in a space higher than $W^{m-1-1/p,p}(\partial\Omega)$. More general results along these lines are known (see Adams 1975; Lions and Magenes 1972).

The importance of Sobolev space can be seen from an integral equation of the type

$$Lg = f$$

where $L$ is a linear functional. Then, if $g$ is in $W^{\mu,p}$, $f$ is expected to be in $W^{-\mu,q}$ and conversely. Additional information about integral equations and Sobolev spaces can be found in Hsiao and Wendland (1977), Stephan and Wendland (1984), Grisvard (1985), and Hsiao (1989). For the use of Galerkin methods in Sobolev spaces and the relation to the collocation method see Arnold and Wendland (1983, 1985) and Prossdorf and Silbermann (1991).

5 NUMERICAL ASPECTS OF VARIATIONAL METHODS

MINIMAL SYSTEMS

5.1 Galerkin's method

In the foregoing chapter various variational methods have been devised for problems. Advantage can be taken of them only if suitable approximate and analytical techniques are available for implementing them. It is the aim of this chapter to say something about these techniques, especially from a numerical point of view. Often it will be seen that the main purpose of the variational method is to provide bounds on the approximation found. It is convenient to start with the problem of §4.11 where we were seeking the value of $(g, h)$ knowing that

$$Tg = f \qquad (5.1)$$

and $f$ and $h$ are given, $T$ being linear. It was shown that a variational expression for $(g, h)$ is

$$\frac{(g, h)(f, g')}{(Tg, g')}$$

where $g'$ is in the domain of $T^A$. One could attempt, therefore, to determine trial functions which make this expression stationary and so discover an approximate solution of (5.1). There is, however, another way in which (5.1) is attacked directly. The approach is to assume a truncated series as an approximation to $g$. Let $G$ be an approximation to $g$ and assume that

$$G = \sum_{n=1}^{N} a_n b_n$$

where the $b_n$ are known elements in the domain of $T$. They will be called basis elements. The coefficients $a_n$ have to be chosen in an attempt to satisfy (5.1), i.e. to make

$$\sum_{n=1}^{N} a_n Tb_n = f. \qquad (5.2)$$

Introduce a further set of elements $w_1, \ldots, w_M$ which are in the domain of $T^A$. Then, form the inner product of (5.2) with each of the $w_m$ so that

$$\sum_{n=1}^{N} a_n (w_m, Tb_n) = (w_m, f) \qquad (m = 1, \ldots, M). \qquad (5.3)$$

The coefficients $a_n$ are then determined by solving the algebraic equations (5.3). It is common practice to select $M = N$, but this is not necessary. However, if $M \neq N$, (5.3) will have to be solved by using the generalized inverse of a matrix (§1.15). The method just described of constructing an approximate solution to (5.1) is known by various names. Many electrical engineers call it the method of moments, following the popular book of Harrington (1968). It is also known as reaction matching because the same equations can be derived from certain reciprocity concepts. If $M = N$ and $w_n = b_n$, a name of some antiquity is the Galerkin method. In mathematical literature (5.3) is often referred to as the general Galerkin method or, occasionally, as the Petrov-Galerkin method. The method of least squares and the method of weighted residuals are particular cases. To substantiate the last statement note that the residual left by the approximation is $\sum_{n=1}^{N} a_n Tb_n - f$. In an attempt to make this as small as possible in some sense a specific element $W$ is chosen (the choice $W = 1$ is frequent) and then

$$\Bigl(W\Bigl(\sum_{n=1}^{N} a_n Tb_n - f\Bigr),\ \sum_{n=1}^{N} a_n Tb_n - f\Bigr)$$

is minimized. We are following a formal procedure without trying to specify conditions under which the inner product exists. Restricting attention to the real case, for simplicity, a derivative with respect to $a_m$ gives

$$\sum_{n=1}^{N} a_n (WTb_m, Tb_n) = (WTb_m, f) \qquad (m = 1, \ldots, N)$$

which agrees with (5.3) on the identification $w_m = WTb_m$. The general Galerkin method also includes variational methods. Consider the one given at the beginning of this section. Insert trial functions

$$G = \sum_{n=1}^{N} a_n b_n, \qquad G' = \sum_{m=1}^{M} c_m w_m.$$

Make small variations in the coefficients $a_n$ and $c_m$ so that $a_n$ goes to $a_n + \delta a_n$ and $c_m$ to $c_m + \delta c_m$. According to §4.11 the variational expression is stationary when these variations are independent. Denote by $\delta G$ and $\delta G'$ the corresponding changes in $G$ and $G'$. Then, correct to first order, the alteration in the variational expression is

$$\{(G, h)(f, \delta G') + (\delta G, h)(f, G')\}(TG, G') - (G, h)(f, G')\{(T\delta G, G') + (TG, \delta G')\}$$

divided by $(TG, G')^2$. The contribution due to the $\delta a_n$ can be zero for all possible $\delta a_n$ only if

$$(b_n, h)(TG, G') = (G, h)(Tb_n, G') \qquad (n = 1, \ldots, N). \qquad (5.4)$$

Similarly, the portion due to $\delta c_m$ is zero only if

$$(f, w_m)(TG, G') = (f, G')(TG, w_m) \qquad (m = 1, \ldots, M). \qquad (5.5)$$

Now if the set $c_1, \ldots, c_M$ satisfies (5.4) and (5.5) then so does the set $Cc_1, \ldots, Cc_M$ where $C$ is a constant. There is, therefore, a set such that

$$(TG, G') = (G, h). \qquad (5.6)$$

Similarly, the set $a_1, \ldots, a_N$ may be multiplied by a constant without affecting (5.4), (5.5), or (5.6). Therefore, having fixed $G'$, we can choose them so that

$$(TG, G') = (f, G'). \qquad (5.7)$$

From (5.6) and (5.7) it follows that the approximation has been selected so that

$$(G, h) = (f, G') \qquad (5.8)$$

and, from (5.4) and (5.5),

$$(Tb_n, G') = (b_n, h), \qquad (5.9)$$

$$(TG, w_m) = (f, w_m). \qquad (5.10)$$

Equation (5.10) is merely a repetition of (5.3) while (5.9) is the set of equations which would be obtained from $T^A G' = h$ by the general Galerkin method with the roles of $b_n$ and $w_m$ interchanged. Thus the general Galerkin method is equivalent to the variational method. The same conclusion can be drawn when $T$ is self-adjoint and $h = f$. Further, in this case, it can be shown that the alternative variational expression $2(f, g) - (g, Tg)$ is also equivalent to a general Galerkin method, for the first variation is twice $(f, \delta G) - (TG, \delta G)$ which vanishes for arbitrary $\delta a_n$ only if $(TG, b_n) = (f, b_n)$ for $n = 1, \ldots, N$. This is a Galerkin method with $w_m = b_m$. The general Galerkin method is usually easier to set up than the variational method and it was for this reason, coupled with the fact that they lead eventually to equivalent results, that it was suggested earlier that the main function of variational methods will be providing bounds rather than supplying the approximate equations to be solved. At any rate, whatever the original starting point, the adoption of truncated series for an approximate solution or trial function ends up with the system (5.3). Whether this system can be solved satisfactorily depends upon the choice of basis elements $b_n$ and weight elements $w_m$. Since virtually no conditions have been placed on these elements so far, there is considerable scope for choosing them. In a sense this is a disadvantage, because no criteria are offered on what constitutes a good choice. Consequently, we shall try to see what can be done to remedy this deficiency in succeeding sections.

5.2 Minimal systems

It is obvious that we do not wish to have more basis and weight elements than are strictly necessary because keeping $M$ and $N$ as small as is consonant with accuracy is a desirable objective. Our first task therefore is to formulate definitions which indicate how to keep their numbers down. For convenience, the elements will be assumed to be in a Hilbert space $H$ and the properties of a set denoted by $\{\phi_m\}$ will be examined without specific reference as to whether they are basis or weight elements.

DEFINITION 5.2. If $\phi_k$ does not lie in the space spanned by all the other elements of $\{\phi_m\}$ and this is true for every $k$ then $\{\phi_m\}$ is said to be minimal; otherwise, it is called non-minimal.

If $\{\phi_m\}$ is a finite set of linearly independent elements, then $\{\phi_m\}$ is minimal, for no one element can be expressed in terms of the others. On the other hand, a finite set of linearly dependent elements is non-minimal since at least one element can be expanded in terms of the others. Any orthonormal system is minimal, if zero be excluded, because any single element is orthogonal to all the others and so cannot be in their space. If the orthonormal set is complete the addition of any other element of $H$ will give a non-minimal set since the additional element can be expanded in terms of the orthonormal set. However, completeness alone is not sufficient to ensure a minimal set because the infinite sets $t, t^2, t^3, \ldots$ and $t^2, t^3, \ldots$ are both complete on the interval (0, 1). This last example demonstrates that if one desires to work with polynomials on finite intervals and stay within minimal sets it is beneficial to make them linearly independent or orthonormal. A set $\{\psi_n\}$ is said to be biorthonormal to $\{\phi_m\}$ if

$$(\phi_j, \psi_k) = \delta_{jk},$$

the left-hand side being the inner product in $H$. We now have the following theorem.

THEOREM 5.2. $\{\phi_m\}$ is minimal if and only if there is $\{\psi_n\}$ biorthonormal to $\{\phi_m\}$. The biorthonormal set is unique if its elements lie in the space spanned by $\{\phi_m\}$.
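For a finite, linearly independent set the biorthonormal system of Theorem 5.2 can be written down explicitly, which may help to fix ideas. The following is only a sketch under stated assumptions: the elements are real polynomials on (0, 1) with the usual $L^2$ inner product, they are stored as coefficient lists, and the helper names `inner`, `solve` and `biorthonormal` are illustrative and do not come from the text. Each $\psi_k$ is taken as the combination of the $\phi_j$ whose coefficients form a column of the inverse Gram matrix.

```python
from fractions import Fraction

def inner(p, q):
    """Inner product on (0, 1) of real polynomials stored as coefficient lists."""
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination; the matrix is assumed non-singular."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def biorthonormal(phis):
    """Return psi_k in span{phi_m} with (phi_j, psi_k) = delta_jk."""
    n = len(phis)
    degree = max(len(p) for p in phis)
    gram = [[inner(phis[i], phis[j]) for j in range(n)] for i in range(n)]
    psis = []
    for k in range(n):
        unit = [1 if i == k else 0 for i in range(n)]
        c = solve(gram, unit)          # column k of the inverse Gram matrix
        psis.append([sum(c[j] * phis[j][d] for j in range(n) if d < len(phis[j]))
                     for d in range(degree)])
    return psis

phis = [[1], [0, 1], [0, 0, 1]]        # 1, t, t^2
psis = biorthonormal(phis)
for phi in phis:
    print([inner(phi, psi) for psi in psis])   # each row is a row of the identity
```

For the pair $1, t$ this construction gives $\psi_1 = 4 - 6t$ and $\psi_2 = -6 + 12t$. For an infinite set the inverse Gram matrix is not available and the argument of the proof is needed instead.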

Proof. If $\{\phi_m\}$ is minimal pick a $\phi_k$. Then $\phi_k$ must be of the form $u_k + v_k$ where $v_k$ is in the space spanned by the other $\phi_m$ and $u_k$ is a non-zero element which is orthogonal to that space. Hence

$$(\phi_m, u_k) = 0 \quad (m \neq k), \qquad (\phi_m, u_k) = \|u_k\|^2 \quad (m = k).$$

Therefore, by putting $\psi_k = u_k/\|u_k\|^2$ for every $k$, a biorthonormal system is obtained and it lies in the space spanned by $\{\phi_m\}$ by construction. If there were another biorthonormal system $\{\psi_k'\}$ in that space then (
v,

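$$\Bigl\|\phi_j - \sum_{k=1,\,k\neq j}^{K} \alpha_k\phi_k\Bigr\| < \varepsilon \qquad (5.11)$$

for suitable constants $\alpha_k$, $K$ and arbitrary $\varepsilon > 0$, because there is some

$$\Bigl|\Bigl(\phi_j - \sum_{k=1,\,k\neq j}^{K} \alpha_k\phi_k,\ \psi_j\Bigr)\Bigr| \le \varepsilon\|\psi_j\|.$$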

But the left-hand side is unity on account of the biorthonormal conditions. Therefore, by taking $\varepsilon$ sufficiently small, a contradiction arises and the proof is complete.

For some purposes it is desirable to have a more restricted class of $\{\phi_m\}$. Let $\Phi^{(n)}$ be the $n \times n$ matrix such that

$$\Phi^{(n)}_{jk} = (\phi_j, \phi_k).$$

Then $\Phi^{(n)}$ is a Hermitian matrix which is positive semi-definite. Its eigenvalues can, accordingly, be arranged to satisfy

$$0 \le \lambda_1^{(n)} \le \lambda_2^{(n)} \le \cdots \le \lambda_n^{(n)}.$$

From the theory of §3.3 the eigenvalues can be determined by a minimization procedure. In fact

$$\lambda_1^{(n)} = \min \frac{\sum_{j,k} \Phi^{(n)}_{jk} t_j t_k^*}{\sum_{j=1}^{n} |t_j|^2} = \min \frac{\bigl\|\sum_{j=1}^{n} t_j\phi_j\bigr\|^2}{\sum_{j=1}^{n} |t_j|^2} \qquad (5.12)$$

over vectors with components $t_1, \ldots, t_n$. Evidently $\lambda_1^{(n+1)} \le \lambda_1^{(n)}$ as can be seen by placing $t_{n+1} = 0$. In general $\lambda_m^{(n+1)} \le \lambda_m^{(n)}$ from the minimization procedure so that $\lambda_m^{(n)}$ for fixed $m$ does not increase as $n$ increases. (Note that this is only true for fixed $m$, for a similar process shows that $\lambda_{n+1}^{(n+1)} \ge \lambda_n^{(n)}$ so that $\lambda_n^{(n)}$ cannot decrease.) It follows that if $\lambda_1^{(n)}$ does not become zero as $n$ increases, the eigenvalues of $\Phi^{(n)}$ are positive for any $n$. For this reason we introduce the following definition.

DEFINITION 5.2a. The system $\{\phi_m\}$ is called strongly minimal if $\lim_{n\to\infty} \lambda_1^{(n)} > 0$.
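The quantity $\lambda_1^{(n)}$ in this definition is easy to examine numerically for a small system. The sketch below is an illustration only, not a method from the text: it assumes the real space $L^2(0, 1)$, takes the monomials $t, t^2, \ldots, t^n$ mentioned earlier in this section, and finds the eigenvalues of their Gram matrix with a plain cyclic Jacobi iteration. The function names are hypothetical, and the fixed number of sweeps is a simplification that is adequate only for small matrices.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=30):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # rotation angle chosen to annihilate the (p, q) entry
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.hypot(1.0, t)
                s = t * c
                for k in range(n):      # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def gram_monomials(n):
    """Gram matrix of t, t^2, ..., t^n in L2(0, 1): (t^j, t^k) = 1/(j + k + 1)."""
    return [[1.0 / (j + k + 1) for k in range(1, n + 1)] for j in range(1, n + 1)]

smallest = {n: jacobi_eigenvalues(gram_monomials(n))[0] for n in range(1, 7)}
for n, lam in smallest.items():
    print(n, lam)                       # lambda_1^(n) shrinks rapidly with n
```

The computed values fall steadily towards zero as $n$ grows, which is what the definition excludes. For an orthonormal system the Gram matrix is the identity and every eigenvalue equals 1.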

This definition is appropriate if $\{\phi_m\}$ is an infinite set. If $\{\phi_m\}$ is a finite set with $N$ elements it will be called strongly minimal if $\lambda_1^{(N)} > 0$. If $\{\phi_m\}$ is orthonormal, it is evident from (5.12) that $\lambda_1^{(n)} = 1$ (in fact, $\lambda_m^{(n)} = 1$ for all $m$) and so the set is strongly minimal. Per contra, a minimal set need not be strongly minimal. For, if {

THEOREM 5.2a. Every strongly minimal $\{\phi_m\}$ is minimal.

Proof. If $\{\phi_m\}$ is non-minimal, (5.11) holds and, a fortiori,

$$\Bigl\|\phi_j - \sum_{k=1,\,k\neq j}^{K} \alpha_k\phi_k\Bigr\|^2 < \varepsilon^2\Bigl(1 + \sum_{k=1,\,k\neq j}^{K} |\alpha_k|^2\Bigr).$$

Putting $t_j = 1$, $t_k = -\alpha_k$ ($k \neq j$) in (5.12) we see that $\lambda_1^{(n)} < \varepsilon^2$ which is contrary to $\{\phi_m\}$ being strongly minimal. The proof is finished.

Although a minimal set may not be strongly minimal there is the important result (Dovbysh 1968) that it can always be made so. We have

THEOREM 5.2b. If $\{\phi_m\}$ is minimal there are scalars $\alpha_m$ such that $\{\alpha_m\phi_m\}$ is strongly minimal in the space spanned by $\{\phi_m\}$.

Proof. By Theorem 5.2 there is a biorthonormal set $\{\psi_m\}$ which can be chosen to be in the space spanned by $\{\phi_m\}$. Let $\Psi^{(n)}$ be the matrix with elements $(\psi_k, \psi_j)$ and eigenvalues $0 \le \mu_1^{(n)} \le \cdots \le \mu_n^{(n)}$. Let $x$ be the eigenvector of
L (cPj,
A~)Xj.

We can write

$$\sum_{k=1}^{n} x_k\phi_k = \sum_{i=1}^{n} y_i\psi_i$$

since $\{\phi_m\}$ and $\{\psi_m\}$ span the same space. Hence, by the biorthonormal property

$$y_j = \lambda_r^{(n)} x_j. \qquad (5.13)$$


Also n

L

k=1

(t/Jj' t/Jk)Y: =

n

L

k=1

(t/Jj' ¢k)Xk = », = yj /A~)

from (5.13). Thus $1/\lambda_r^{(n)}$ is an eigenvalue of $\Psi^{(n)}$ and to each independent eigenvector of $\Phi^{(n)}$ corresponds an independent eigenvector of $\Psi^{(n)}$. Consequently,

$$\mu_m^{(n)} = 1/\lambda_{n+1-m}^{(n)} \qquad (m = 1, \ldots, n). \qquad (5.14)$$

Choose $\beta_m > 0$ so that

$$\sum_{m=1}^{\infty} \|\psi_m\|^2/\beta_m^2 < \infty \qquad (5.15)$$

which is obviously always possible (e.g. $\beta_m = m\|\psi_m\|$). Then the $\mu_n^{(n)}$ for the set $\{\psi_m/\beta_m\}$ certainly does not exceed the left-hand side of (5.15) and is therefore bounded above. From (5.14) it follows that $\lambda_1^{(n)}$ for $\{\beta_m\phi_m\}$ is bounded below for all $n$ and hence $\{\beta_m\phi_m\}$
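$0 < \lambda_0 \le \lambda_m^{(n)} < \Lambda_0$ for all $m \le n$, and all $n$, the set {

$$\sum_{k=1}^{n} (\phi_k, \phi_j)a_k^{(n)} = (u, \phi_j) \qquad (j = 1, \ldots, n) \qquad (5.16)$$

arise. The resulting approximation $u_n$ to $u$ is given by

$$u_n = \sum_{k=1}^{n} a_k^{(n)}\phi_k. \qquad (5.17)$$

It is then necessary to examine the question as to whether $a_k^{(n)}$ tends to a limit as $n \to \infty$. An answer is provided at once if {

THEOREM 5.2c. If $\{\phi_m\}$ is complete and minimal, there exist constants $a_k$ such that

$$\lim_{n\to\infty} a_k^{(n)} = a_k \qquad (k = 1, 2, \ldots).$$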

If the biorthonormal set $\{\psi_m\}$ satisfies $\|\psi_m\| \le C$ where $C$ is some constant the convergence is uniform with respect to $k$.

Proof. Since {

$$|(u - u_n, \psi_k)| \le \|u - u_n\|\,\|\psi_k\| \to 0$$

as $n \to \infty$. Hence, putting $a_k = (u, \psi_k)$ supplies the desired result. Furthermore, if $\|\psi_m\| \le C$ for all $m$,

$$|a_k - a_k^{(n)}| \le C\|u - u_n\| \to 0$$

independently of $k$ and the theorem is proved. A related theorem is

THEOREM 5.2d. If $\{\phi_m\}$ is complete and strongly minimal then $\sum_{k=1}^{\infty} |a_k|^2 < \infty$ and $\lim_{n\to\infty} \sum_{k=1}^{n} |a_k^{(n)} - a_k|^2 = 0$.

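Proof. Since $\{\phi_m\}$ is strongly minimal, $\lambda_1^{(n)} \ge \lambda_0 > 0$ and

$$\|u_n\|^2 = \sum_{j,k=1}^{n} (\phi_k, \phi_j)a_k^{(n)}a_j^{(n)*} \ge \lambda_1^{(n)} \sum_{k=1}^{n} |a_k^{(n)}|^2 \ge \lambda_0 \sum_{k=1}^{n} |a_k^{(n)}|^2.$$

Also $\|u_n\| \le \|u\|$ and hence, if $p \le n$,

$$\sum_{k=1}^{p} |a_k^{(n)}|^2 \le \|u\|^2/\lambda_0.$$

With $p$ fixed let $n \to \infty$ and then, from Theorem 5.2c,

$$\sum_{k=1}^{p} |a_k|^2 \le \|u\|^2/\lambda_0.$$

Allowing $p \to \infty$, we obtain the first statement of the theorem. Again, if $m > n$,

$$u_m - u_n = \sum_{k=1}^{m} (a_k^{(m)} - a_k^{(n)})\phi_k$$

on the understanding that $a_k^{(n)} = 0$ when $k > n$. In a similar way to the above

$$\lambda_0 \sum_{k=1}^{m} |a_k^{(m)} - a_k^{(n)}|^2 \le \|u_m - u_n\|^2$$

which ensures

$$\sum_{k=1}^{n} |a_k^{(m)} - a_k^{(n)}|^2 \le \|u_m - u_n\|^2/\lambda_0.$$

Let $m \to \infty$ with $n$ fixed; then $a_k^{(m)} \to a_k$ and $u_m \to u$. Hence

$$\sum_{k=1}^{n} |a_k - a_k^{(n)}|^2 \le \|u - u_n\|^2/\lambda_0.$$

Now, $n \to \infty$ gives the last result of the theorem and the proof is finished.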
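The role of minimality in the behaviour of the coefficients $a_k^{(n)}$ can be seen in a small exact computation. The sketch below rests on assumptions that are not in the text: the space is real $L^2(0, 1)$, the element to be approximated is $u(t) = 1$, and the system is the non-minimal set $\phi_k = t^k$ ($k = 1, \ldots, n$) discussed at the start of this section. Equations (5.16) are solved in rational arithmetic so that no round-off enters. The names `solve` and `best_approximation_coefficients` are illustrative.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination; the matrix is assumed non-singular."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def best_approximation_coefficients(n):
    """Solve (5.16) for u(t) = 1 and phi_k = t^k (k = 1..n) in L2(0, 1)."""
    gram = [[Fraction(1, j + k + 1) for k in range(1, n + 1)]
            for j in range(1, n + 1)]                     # (phi_k, phi_j)
    rhs = [Fraction(1, j + 1) for j in range(1, n + 1)]   # (u, phi_j)
    return solve(gram, rhs)

for n in range(1, 6):
    # first coefficient a_1^(n): 3/2, 4, 15/2, 12, 35/2
    print(n, best_approximation_coefficients(n)[0])
```

In this example the first coefficient grows without limit as $n$ increases, so $a_1^{(n)}$ has no limit even though $u_n$ converges to $u$ in norm. This is consistent with Theorem 5.2c, whose hypothesis of minimality fails here.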

5.3 Positive-definite operators

Let $P$ be a positive operator such that $(Pu, u) \ge c\|u\|^2$ for some $c > 0$ so that $(Pu, u) > 0$ if $u \neq 0$; in such a case $P$ will be called a positive-definite operator. Define a new inner product $[\,,]$ by

$$[u, v] = (Pu, v).$$

With the new inner product comes a norm to be denoted by $\|\ \|_P$ which satisfies

$$\|u\|_P^2 = (Pu, u) = \|P^{1/2}u\|^2$$

where $P^{1/2}$ is the square root of $P$ (§3.6). The Hilbert space based on this new inner product and norm will be designated $H_P$; it consists of all elements in the domain of $P^{1/2}$. It will be assumed that $H_P$ is separable. Starting from the equation

$$Pg = f$$

with an approximation

$$g_n = \sum_{k=1}^{n} a_k^{(n)}\phi_k \qquad (5.18)$$

the Galerkin process leads to

$$\sum_{k=1}^{n} [\phi_k, \phi_j]a_k^{(n)} = (f, \phi_j). \qquad (5.19)$$

The right-hand side can be written as $[P^{-1}f, \phi_j]$ or $[g, \phi_j]$ because $P^{-1}$ is a bounded operator since $P$ is positive-definite (§3.3). Eqns (5.19) then become the same as (5.16) with a change of inner product and the best approximation to $g$ being sought instead of $u$. Hence we deduce from Theorems 5.2c and 5.2d

THEOREM 5.3. If $\{\phi_m\}$ is complete and minimal in $H_P$, there exist constants $a_k$ such that, in the Galerkin process for (5.18),

$$\lim_{n\to\infty} a_k^{(n)} = a_k \qquad (k = 1, \ldots)$$

and the convergence is uniform with respect to $k$ if the biorthonormal set satisfies $\|\psi_m\|_P \le C$. If $\{\phi_m\}$ is also strongly minimal in $H_P$ then $\sum_{k=1}^{\infty} |a_k|^2 < \infty$ and $\lim_{n\to\infty} \sum_{k=1}^{n} |a_k^{(n)} - a_k|^2 = 0$.

If $\{\phi_m\}$ is non-minimal in $H_P$, the limit of $a_k^{(n)}$ may not exist as $n \to \infty$ and, indeed, the values of $a_k^{(n)}$ may oscillate widely as $n$ varies. In view of the fact that minimality is specified in $H_P$ rather than $H$ it is desirable to have theorems which permit one to move from one space to another. A convenient procedure is by imbedding. While a general definition is available the following is sufficient for our purposes. If all the elements of $H_1$ lie in $H_2$ and there is a constant $C$ such that

$$\|u\|_2 \le C\|u\|_1 \qquad (5.20)$$

for every $u \in H_1$, $\|\ \|_k$ being the norm of $H_k$, $H_1$ is said to be imbedded in $H_2$.

THEOREM 5.3a. Let $\{\phi_m\}$ lie in $H_1$ and $H_1$ be imbedded in $H_2$. If $\{\phi_m\}$ is (strongly) minimal in $H_2$, it is (strongly) minimal in $H_1$.

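Proof. If $\{\phi_m\}$ is non-minimal in $H_1$ there is a $\phi_j$ such that

$$\Bigl\|\phi_j - \sum_{k=1,\,k\neq j}^{K} \alpha_k\phi_k\Bigr\|_1 < \varepsilon$$

for arbitrary positive $\varepsilon$. Hence, from (5.20),

$$\Bigl\|\phi_j - \sum_{k=1,\,k\neq j}^{K} \alpha_k\phi_k\Bigr\|_2 < C\varepsilon$$

and $\{\phi_m\}$ is non-minimal in $H_2$. Thus, the theorem concerning minimality is proved.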

For strong minimality let $\Phi_1^{(n)}$ and $\Phi_2^{(n)}$ be the matrices in $H_1$ and $H_2$ respectively, with eigenvalues $\lambda_m^{(n)}$ and $\mu_m^{(n)}$. Then $\mu_1^{(n)} \ge \mu_0 > 0$. Also, from (5.12),

$$\lambda_1^{(n)} = \min \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_1^2}{\sum_{j=1}^{n} |t_j|^2} = \min \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_1^2}{\|\sum_{j=1}^{n} t_j\phi_j\|_2^2} \cdot \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_2^2}{\sum_{j=1}^{n} |t_j|^2} \ge \mu_1^{(n)} \min \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_1^2}{\|\sum_{j=1}^{n} t_j\phi_j\|_2^2} \ge \mu_0/C^2$$

from (5.20). Consequently, strong minimality in $H_1$ is shown and the proof is complete.

THEOREM 5.3b. If $H_1$ and $H_2$ can be imbedded in each other and $\{\phi_m\}$ is strong in one it is strong in the other.

Proof. If $\mu_1^{(n)}$ is bounded below, Theorem 5.3a implies that $\lambda_1^{(n)}$ is, and so it has only to be demonstrated that $\lambda_n^{(n)}$ is bounded above if $\mu_n^{(n)} < M_0$. Now

$$\lambda_n^{(n)} = \max \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_1^2}{\sum_{j=1}^{n} |t_j|^2} \le \mu_n^{(n)} \max \frac{\|\sum_{j=1}^{n} t_j\phi_j\|_1^2}{\|\sum_{j=1}^{n} t_j\phi_j\|_2^2} \le M_0C_1^2$$

since $\|u\|_1 \le C_1\|u\|_2$ in this case. The theorem is proved.

The essence of the proof in Theorem 5.3b lies in being able to make the assertion (5.20). Now suppose that $H_P$ is contained in $H$. Then, since $P$ is a positive-definite operator, for $u \in H_P$

$$\|u\|_P^2 = (Pu, u) \ge c\|u\|^2$$

which is an analogue of (5.20). Hence we have from Theorem 5.3a

COROLLARY 5.3a. If $H_P$ is contained in $H$, if $\{\phi_m\}$ is in $H_P$ and is (strongly) minimal in $H$, then $\{\phi_m\}$ is (strongly) minimal in $H_P$.

The advantage of Corollary 5.3a is that minimality need only be demonstrated in $H$. For example, if on the interval (0, 1) $Pu = -d^2u/dt^2$ subject to $u(0) = u(1) = 0$,

$$[u, v] = \int_0^1 \frac{du}{dt}\frac{dv^*}{dt}\,dt$$

and $H_P$ consists of functions which are absolutely continuous, vanish at the endpoints, and have first derivatives of integrable square. With $H$ as the space of functions of integrable square on (0, 1) we see that $\{\sin m\pi t\}$ is in $H_P$ and orthogonal in $H$ (orthonormal after multiplication by $2^{1/2}$). Accordingly, $\{\sin m\pi t\}$ is strongly minimal in $H_P$. A somewhat more general result can be obtained from the following theorem.

THEOREM 5.3c. If $P$ and $Q$ are positive-definite operators and $H_P$ is contained in $H_Q$ there is a $c > 0$ such that

$$\|u\|_P \ge c\|u\|_Q$$

for $u \in H_P$.

Proof. $P^{-1/2}$ is bounded and self-adjoint so that its domain is $H$. Hence, for any $u \in H$, $P^{-1/2}u$ is in the domain of $P^{1/2}$ and therefore in the domain of $Q^{1/2}$. Consequently $Q^{1/2}P^{-1/2}u$ is well defined. Also, suppose that $u_n \to 0$ and $Q^{1/2}P^{-1/2}u_n \to v$. Since $P^{-1/2}$ is bounded,

$$P^{-1/2}u_n \to P^{-1/2}0 = 0.$$

Because $Q^{1/2}$ is self-adjoint it is closed and $Q^{1/2}(P^{-1/2}u_n) \to Q^{1/2}0 = 0$. Therefore $v = 0$ and $Q^{1/2}P^{-1/2}$ is closed. It follows, on account of its domain being $H$, that it is bounded. Consequently, there is $c > 0$ such that $\|Q^{1/2}P^{-1/2}u\| \le \|u\|/c$ for $u \in H$. Putting $P^{-1/2}u = w$, so that $w$ is in the domain of $P^{1/2}$, $\|Q^{1/2}w\| \le \|P^{1/2}w\|/c$ and the proof is terminated. An immediate conclusion from Theorem 5.3a is

COROLLARY 5.3c. Under the conditions of Theorem 5.3c, {
Exercises

1. Examine the minimality properties on the interval $(-1, 1)$ of (i) the Legendre polynomials $\{P_m(t)\}$, (ii) $\{\cos m\pi t\}$, (iii) $t, \sin \pi t, \sin 2\pi t, \ldots$, (iv) $\cos \pi t, \sin \pi t, \cos 2\pi t, \sin 2\pi t, \ldots$, (v) $(1 - t^2)^{1/2}, \cos \pi t, \sin \pi t, \cos 2\pi t, \sin 2\pi t, \ldots$.

2. By employing $Pu = -d^2u/dt^2$ obtain the first few terms in the Galerkin process for (1

+

r) d 2u/dt 2 = -1

on (0, 1) with $u(0) = u'(1) = 0$. Take $\phi_m = t^m$. Does Theorem 5.3 apply?

3. The solution of $\nabla^2u = -1$ in the region $x \ge 0$, $y \ge 0$, $x + y - 1 \le 0$ of the $(x, y)$ plane is sought under the boundary condition that $u = 0$ except on $x + y = 1$ where $\partial u/\partial n = 0$. With $P = -\nabla^2$ show that

4. In Exercise 3, show that the system $\{x^my^m\}$ is non-minimal by Theorem 5.3a. Take $H_1$ to be the space of functions on $0 \le x \le 1$, $0 \le y \le 1$ with derivatives of integrable square and imbed it in $H_P$ by saying that a function in $H_1$ corresponds to the same function in $H_P$ but defined only on the smaller region.

5. If $Pu = -d^2u/dt^2$ and

$$Qu = -\frac{d}{dt}\Bigl\{(2 + t)\frac{du}{dt}\Bigr\}$$

on $(-1, 1)$ with the subsidiary conditions $u(-1) = u(1) = 0$ show that $Q$ is positive definite and that

$$\|u\|_P^2 \le \|u\|_Q^2 \le 3\|u\|_P^2.$$

Prove that $\{\int_{-1}^{t} P_n(s)\,ds\}$ is complete in $H_P$ and, by Corollary 5.3c, strong in $H_Q$.
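The Galerkin equations (5.19) of this section can be assembled and solved exactly in a small case. The sketch below is a worked illustration under assumptions of our own choosing: $Pu = -d^2u/dt^2$ on (0, 1) with $u(0) = u(1) = 0$, $f = 1$, and the hypothetical basis $\phi_m = t^m(1 - t)$, for which $[\phi_k, \phi_j] = \int_0^1 \phi_k'\phi_j'\,dt$ has a closed form. The function names are illustrative.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination; the matrix is assumed non-singular."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def energy(k, j):
    """[phi_k, phi_j] = integral_0^1 phi_k' phi_j' dt for phi_m = t^m (1 - t)."""
    return (Fraction(k * j, k + j - 1)
            - Fraction(k * (j + 1) + (k + 1) * j, k + j)
            + Fraction((k + 1) * (j + 1), k + j + 1))

def load(j):
    """(f, phi_j) for f = 1."""
    return Fraction(1, j + 1) - Fraction(1, j + 2)

def galerkin_coefficients(n):
    """Solve the n equations (5.19) for the coefficients a_k^(n)."""
    matrix = [[energy(k, j) for k in range(1, n + 1)] for j in range(1, n + 1)]
    rhs = [load(j) for j in range(1, n + 1)]
    return solve(matrix, rhs)

print(galerkin_coefficients(3))   # coefficients 1/2, 0, 0
```

Here the exact solution $g = t(1 - t)/2$ lies in the span of the basis, so the process reproduces it for every $n$ and the coefficients do not change with $n$. That is a special feature of this example rather than a general property of the basis.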

5.4 Stability

The importance of minimality in ensuring that the coefficients in the Galerkin process converge as $n \to \infty$ has already been brought out in Theorem 5.3, and Exercise 2 provides an example of divergence in the case of non-minimality. Minimal sets also have a significant role in guaranteeing numerical stability. In the following it will always be assumed that $\{\phi_m\}$ constitutes a standard system, i.e. $\{\phi_m\}$ is complete in $H_P$ and $\phi_1, \ldots, \phi_n$ are linearly independent for any $n$. The exact equations (5.19) to be solved can be expressed as

$$\sum_{k=1}^{n} c_{jk}a_k^{(n)} = f_j \qquad (5.21)$$

where $c_{jk} = [\phi_k, \phi_j]$ and $f_j = (f, \phi_j)$. In practice, $c_{jk}$ and $f_j$ will usually have to be calculated numerically with consequent errors so that the actual, but non-exact, equations solved are

$$\sum_{k=1}^{n} (c_{jk} + d_{jk})b_k^{(n)} = f_j + g_j \qquad (5.22)$$

where $d_{jk}$ and $g_j$ represent the effects of small errors. It will be assumed that $d_{jk} = d_{kj}$. Denote by $C^{(n)}$, $D^{(n)}$ the matrices with elements $c_{jk}$, $d_{jk}$ and by $g$ the column matrix with entries $g_j$. Signify matrix and vector norms by $\|D^{(n)}\|$ and $\|g\|$ respectively; the $l_2$ or spectral norm (§1.11) will be employed. For stability we want solutions of (5.22) to be close to those of (5.21). Suppose that both can be solved exactly, i.e. ignore round-off error in the solution. Then we say that the coefficients are stable if, when $\|D^{(n)}\| < \alpha$ for some $\alpha$ independent of $n$, there are positive constants $\beta$, $\gamma$ such that

$$\|b^{(n)} - a^{(n)}\| \le \beta\|D^{(n)}\| + \gamma\|g\| \qquad (5.23)$$

for arbitrary $g$. Otherwise, the coefficients are said to be unstable. The stability condition assures us that small errors in $C^{(n)}$ and $f$ will only cause small errors in the coefficients.

THEOREM 5.4. The coefficients are stable if and only if $\{\phi_m\}$ is strongly minimal in $H_P$.
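The stability requirement (5.23) can be probed in a small experiment. The sketch below makes several assumptions that are ours rather than the text's: $Pu = -d^2u/dt^2$ on (0, 1) with zero end conditions, the hypothetical basis $\phi_m = t^m(1 - t)$, no matrix error ($D^{(n)} = 0$), and a perturbation $g$ equal to the last unit vector. With $D^{(n)} = 0$ the difference $b^{(n)} - a^{(n)}$ is $C^{(n)-1}g$, which is found in exact rational arithmetic so that round-off in the solution is ignored, as the definition asks.

```python
import math
from fractions import Fraction

def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination; the matrix is assumed non-singular."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def energy(k, j):
    """c_jk = [phi_k, phi_j] for phi_m = t^m (1 - t) and Pu = -u''."""
    return (Fraction(k * j, k + j - 1)
            - Fraction(k * (j + 1) + (k + 1) * j, k + j)
            + Fraction((k + 1) * (j + 1), k + j + 1))

def amplification(n):
    """||b - a|| / ||g|| when D = 0 and g is the n-th unit vector."""
    matrix = [[energy(k, j) for k in range(1, n + 1)] for j in range(1, n + 1)]
    g = [0] * (n - 1) + [1]
    difference = solve(matrix, g)            # b - a = C^{-1} g
    return math.sqrt(sum(float(x) ** 2 for x in difference))

def amplification_sines(n):
    """Same quantity for phi_m = sin(m pi t), where C is diagonal."""
    return 2.0 / (n * math.pi) ** 2

for n in range(1, 7):
    print(n, amplification(n), amplification_sines(n))
```

For stability the first column would have to remain below a fixed $\gamma$ for all $n$. In this experiment it grows quickly with $n$, while the corresponding quantity for the sine system decreases.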

Proof. If the coefficients are stable (5.23) holds for arbitrary $g$ in (5.22). Let $x_1^{(n)}$ be the first eigenvector of $C^{(n)}$ and let $c^{(n)}$ be such that

$$C^{(n)}c^{(n)} = f + x_1^{(n)}.$$

Then $c^{(n)} = a^{(n)} + x_1^{(n)}/\lambda_1^{(n)}$ and the equation corresponds to (5.22) with $D^{(n)} = 0$ and $g = x_1^{(n)}$. Therefore (5.23) implies that

$$\|x_1^{(n)}\|/\lambda_1^{(n)} = \|c^{(n)} - a^{(n)}\| \le \gamma\|x_1^{(n)}\|.$$

Thus $\lambda_1^{(n)} \ge 1/\gamma$ and $\{\phi_m\}$ is strongly minimal. Conversely, when $\{\phi_m\}$ is strongly minimal, $\lambda_1^{(n)} \ge \lambda_0 > 0$ and we choose $\alpha < \lambda_0$. The equations (5.22) then have a unique solution for every $n$. Now

$$\|C^{(n)-1}D^{(n)}\| \le \|C^{(n)-1}\|\,\|D^{(n)}\| \le \frac{\|D^{(n)}\|}{\lambda_1^{(n)}} \le \frac{\alpha}{\lambda_0} < 1$$

since $\alpha < \lambda_0$. Therefore, by means of Theorem 1.12

$$\|b^{(n)} - a^{(n)}\| \le \frac{\|D^{(n)}\|\,\|a^{(n)}\| + \|g\|}{\lambda_0 - \alpha}. \qquad (5.24)$$

If $u$ is the solution of the original problem $\|u_n\|_P \le \|u\|_P$ in the notation of Theorem 5.2c. Moreover,

$$\|u_n\|_P^2 = \sum_{j,k=1}^{n} [\phi_k, \phi_j]a_k^{(n)}a_j^{(n)*} \ge \lambda_1^{(n)} \sum_{k=1}^{n} |a_k^{(n)}|^2 \ge \lambda_0\|a^{(n)}\|^2.$$

Therefore $\|u\|_P \ge \lambda_0^{1/2}\|a^{(n)}\|$ and (5.23) follows from (5.24) with the identification $\gamma = (\lambda_0 - \alpha)^{-1}$, $\beta = \|u\|_P/\lambda_0^{1/2}(\lambda_0 - \alpha)$. The theorem is proved.

While Theorem 5.4 deals with the stability of the coefficients it leaves open the question of whether $v_n = \sum_{k=1}^{n} b_k^{(n)}\phi_k$ differs by much from $u_n$. In other words, is there stability in the sense that

$$\|v_n - u_n\|_P \le \beta_1\|D^{(n)}\| + \gamma_1\|g\| \qquad (5.25)$$

under similar conditions to (5.23)? The answer is provided by the following theorem.

THEOREM 5.4a. The solutions are stable if and only if $\{\phi_m\}$ is strongly minimal in $H_P$.

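Proof. If the solutions are stable choose $c^{(n)}$ as at the beginning of Theorem 5.4 and then $v_n = \sum_{k=1}^{n} c_k^{(n)}\phi_k$ is such that

$$\|v_n - u_n\|_P = \frac{\bigl\|\sum_{k=1}^{n} x_{1k}^{(n)}\phi_k\bigr\|_P}{\lambda_1^{(n)}}$$

where $x_{1k}^{(n)}$ are the components of $x_1^{(n)}$. Also

$$\Bigl\|\sum_{k=1}^{n} x_{1k}^{(n)}\phi_k\Bigr\|_P^2 = \sum_{j,k=1}^{n} [\phi_k, \phi_j]x_{1k}^{(n)}x_{1j}^{(n)*} = x_1^{(n)H}C^{(n)}x_1^{(n)} = \lambda_1^{(n)}\|x_1^{(n)}\|^2.$$

Hence

$$\frac{\|x_1^{(n)}\|}{(\lambda_1^{(n)})^{1/2}} = \|v_n - u_n\|_P \le \gamma_1\|x_1^{(n)}\|$$

showing that $\{\phi_m\}$ is strongly minimal. For the converse, again take $\|D^{(n)}\| \le \alpha < \lambda_0$ where $\lambda_1^{(n)} \ge \lambda_0$. Then

$$\|v_n - u_n\|_P^2 = (C^{(n)}(b^{(n)} - a^{(n)}), b^{(n)} - a^{(n)}) \le \|C^{(n)}(b^{(n)} - a^{(n)})\|\,\|b^{(n)} - a^{(n)}\|.$$

Further

$$\|C^{(n)}(b^{(n)} - a^{(n)})\| = \|(I + D^{(n)}C^{(n)-1})^{-1}(g - D^{(n)}a^{(n)})\| \le \frac{\lambda_0\{\|D^{(n)}\|\,\|a^{(n)}\| + \|g\|\}}{\lambda_0 - \alpha}$$

as in Theorem 5.4. But, from the proof of Theorem 5.4, $\|a^{(n)}\|\lambda_0^{1/2} \le \|u\|_P$ and hence

$$\|v_n - u_n\|_P^2 \le \frac{\lambda_0}{\lambda_0 - \alpha}\Bigl\{\frac{\|D^{(n)}\|\,\|u\|_P}{\lambda_0^{1/2}} + \|g\|\Bigr\}(\beta\|D^{(n)}\| + \gamma\|g\|)$$

from (5.23). Then (5.25) is obtained by taking

$$\beta_1 = \max\Bigl(\beta, \frac{\lambda_0^{1/2}\|u\|_P}{\lambda_0 - \alpha}\Bigr), \qquad \gamma_1 = \max\Bigl(\gamma, \frac{\lambda_0}{\lambda_0 - \alpha}\Bigr)$$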

and the theorem is proved.

The effect of round-off error can be accounted for to some extent by the condition number $\kappa = \|C^{(n)}\|\,\|C^{(n)-1}\|$ (see §1.12). In the present context $\kappa = \lambda_n^{(n)}/\lambda_1^{(n)}$. Since $\lambda_1^{(n)}$ cannot increase and $\lambda_n^{(n)}$ cannot decrease $\kappa$ will have a tendency to increase with $n$ and the conditioning of the equations will degenerate. This situation does not arise if $\{\phi_m\}$ is strong for then $\kappa \le \Lambda_0/\lambda_0$ and the condition number remains bounded as $n \to \infty$. In particular, if $Q$ is a positive-definite operator such that $H_P = H_Q$, Theorem 5.3c shows that there are positive $c_1$ and $c_2$ which satisfy

$$c_1\|u\|_Q \le \|u\|_P \le c_2\|u\|_Q.$$

If $\{\phi_m\}$ is orthonormal in $H_Q$, then Theorems 5.3a and 5.3b imply that $c_1^2 \le \lambda_m^{(n)} \le c_2^2$ so that $\kappa \le c_2^2/c_1^2$ and the condition number is bounded. In general, guidance should always be sought from the condition number even when $\{\phi_m\}$ is strongly minimal because the stability theorems rely on the exact calculation of numbers. Should they, for example, exceed the capacity of the machine, numerical instability could arise even though analytical stability would hold. Another case where numerical instability could occur despite analytical stability is when $\lambda_1^{(n)}$ remains positive but becomes too small to be recognized as positive by the computer. Should iterative procedures (§1.13) be adopted for the solution of (5.19) for fixed $n$ it will be necessary to ensure that the spectral radius of the appropriate matrix is less than unity. In certain circumstances it may be possible to relate this to $\lambda_1^{(n)}$ and $\lambda_n^{(n)}$.

Theorems 5.4 and 5.4a provide information about the stability of the numerical process but make no statement about whether the approximate $Pu_n$ converges to $Pu$. Indeed, there is no warranty that $Pu_n$ converges to $f$. Actually, if $Pu_n$ converges to $f$ for arbitrary $\{\phi_m\}$ then $P$ must be a bounded operator. Therefore, one can be sure that for unbounded $P$ only special choices of {
THEOREM 5.4b. Let $Q$ be a positive-definite operator with the same domain as $P$ and let its spectrum be discrete. Then, if $\{\phi_m\}$ consists of the eigenelements of $Q$,

$$\|Pu_n - f\| \to 0$$

as $n \to \infty$.

Proof. Let $\mu_k$ be the eigenvalue of $Q$ corresponding to $\phi_k$, i.e. $Q\phi_k = \mu_k\phi_k$, and arrange that $\mu_1 \le \mu_2 \le \cdots$. Let $E_n$ be the operator which, for any $v \in H$, gives

$$E_nv = v - \sum_{k=1}^{n} (v, \phi_k)\phi_k.$$

Since the $\{\phi_m\}$ can be taken as orthonormal in $H$ the approximation $u_n$ to the exact solution $u$ of $Pu = f$ can, from (5.16), be expressed as $u - E_nu$. Hence

Now, for any $v \in H$,

$$E_n^2v = E_nv, \qquad QE_nv = E_nQv$$

so that $E_n^2 = E_n$ and $QE_n = E_nQ$. Therefore

Also

$$\|Q^{-1/2}E_nv\|^2 = \sum_{k=n+1}^{\infty} |(v, \phi_k)|^2/\mu_k \le \|v\|^2/\mu_{n+1}$$

with equality when $v = \phi_{n+1}$. Hence $\|Q^{-1/2}E_n\| = 1/\mu_{n+1}^{1/2}$ and

$$\|u - u_n\|_P \le \frac{\|P^{1/2}Q^{-1/2}\|\,\|E_nQu\|}{\mu_{n+1}^{1/2}}.$$

Consequently

$$\|Q^{1/2}(u - u_n)\| \le \|Q^{1/2}P^{-1/2}\|\,\|P^{1/2}(u - u_n)\| \le \frac{K\|E_nQu\|}{\mu_{n+1}^{1/2}}$$

where $K = \|Q^{1/2}P^{-1/2}\|\,\|P^{1/2}Q^{-1/2}\|$. But

$$Q(u - u_n) = Q(I - E_n + E_n)(u - u_n) = Q^{1/2}(I - E_n)Q^{1/2}(u - u_n) + QE_nu$$

since $Q^{1/2}E_n = E_nQ^{1/2}$ and $E_nu_n = 0$. Moreover $\|Q^{1/2}(I - E_n)\| = \mu_n^{1/2}$, as may be proved in a similar way to that for $Q^{-1/2}E_n$, and so

$$\|Q(u - u_n)\| \le (K + 1)\|E_nQu\|$$

because $\mu_n \le \mu_{n+1}$. Thus

$$\|Pu_n - f\| = \|P(u_n - u)\| \le \|PQ^{-1}\|(K + 1)\|E_nQu\|.$$

Since $\|E_nv\| \to 0$ as $n \to \infty$ for any fixed $v \in H$ because $\{\phi_k\}$ is complete, the theorem is proved.

The main difficulty in the application of this theorem is the discovery of an operator $Q$ with the relevant properties. Obviously, if $P$ itself has a discrete spectrum, the appropriate choice is $Q = P$ and then Theorem 5.4b assures that the Galerkin process with the eigenelements of $P$ will converge to the correct solution. Furthermore, the fact that $\{\phi_m\}$ is orthonormal makes certain, via Theorems 5.3a, 5.4, and 5.4a, that the numerical procedures are stable. In general, there are no rules for determining $Q$ and each case has to be treated on its merits. As an illustration, consider the problem of solving

$$-\frac{d}{dt}\Bigl(p(t)\frac{du}{dt}\Bigr) + q(t)u = f(t) \qquad (5.26)$$

on $0 \le t \le 1$ subject to the boundary conditions $u(0) = 0$ and $u(1) = 0$. Define

$$Pu = -\frac{d}{dt}\Bigl(p(t)\frac{du}{dt}\Bigr) + q(t)u.$$

Then $P$ is positive definite with

$$\|u\|_P^2 = \int_0^1 \{p(t)|du/dt|^2 + q(t)|u|^2\}\,dt$$

if $p(t) \ge p_0 > 0$ and $q$ always exceeds $-\nu_1$ where $\nu_1$ is the smallest eigenvalue of $P$ when $q \equiv 0$.

In this case $H_P$ consists of functions which are absolutely continuous for $0 \le t \le 1$, vanish at the endpoints, and have first derivatives of integrable square. In contrast, the domain of $P$ coincides with functions which vanish at the endpoints, have absolutely continuous first derivatives and second derivatives of integrable square. The choice to make in these circumstances is

$$Qu = -d^2u/dt^2$$

for then the domain of $Q$ is the same as that of $P$ because $p$ plays no role in the specification of the domain of $P$. Also the spectrum of $Q$ with the boundary conditions $u(0) = u(1) = 0$ is discrete with eigenelements proportional to $\sin m\pi t$. It follows that an expansion in terms of $\{\sin m\pi t\}$ will provide a Galerkin process which converges in a numerically stable way to the solution of (5.26). Indeed, rather more can be said, because Theorem 5.4b asserts that

$$\|p(u_n'' - u'') + p'(u_n' - u') - q(u_n - u)\| \to 0$$

as $n \to \infty$. Since $\|u_n - u\| \to 0$, $\|u_n' - u'\| \to 0$ it follows that $\|u_n'' - u''\| \to 0$ because $p$ is bounded below and so the derivatives of the approximate solution converge to the derivatives of the exact solution. The analogous result in higher dimensions is that, for the elliptic equation

$$-\sum_{j=1}^{r}\sum_{k=1}^{r} \frac{\partial}{\partial x_j}\Bigl(p_{jk}\frac{\partial u}{\partial x_k}\Bigr) + qu = f \qquad (5.27)$$

subject to u = 0 on the boundary, a suitable {

$Q^{-1/2}$. Then

$$\|Q^{-1/2}v\|_Q^2 = \|v\|^2$$

and, as in Theorem 5.3c, $P^{1/2}Q^{-1/2}$ is a bounded operator so that

$$\|P^{1/2}Q^{-1/2}v\|^2 = \|Q^{-1/2}v\|_P^2.$$

It follows from Theorem 5.3c that there are constants $c_1$ and $c_2$ such that $c_1\|u\|_Q^2 \le \|u\|_P^2 \le c_2\|u\|_Q^2$ for $u \in H_P$ and so $c_1\|v\|^2 \le \|P^{1/2}Q^{-1/2}v\|^2 \le c_2\|v\|^2$. Since the domain of $Q^{-1/2}$ is dense in $H$ the inequality holds for any $v \in H$. Therefore, the spectrum of $Q^{-1/2}PQ^{-1/2}$ lies in the interval $(c_1, c_2)$. Consequently, the spectrum of $Q^{1/2}P^{-1}Q^{1/2} - I$ is in $((1/c_2) - 1, (1/c_1) - 1)$ and

$$\|Q^{1/2}P^{-1}Q^{1/2} - I\| \le \max(|(1/c_2) - 1|, |(1/c_1) - 1|).$$

However, $Q^{1/2}u_0 - Q^{1/2}u_1 = (Q^{1/2}P^{-1}Q^{1/2} - I)Q^{1/2}u_1$ whence

$$\|u_0 - u_1\|_Q \le \|u_1\|_Q \max(|(1/c_2) - 1|, |(1/c_1) - 1|). \qquad (5.28)$$

Replace $Q$ by $cQ$ and $u_1$ by $u_1/c$ with $c > 0$. Then $c_1$ and $c_2$ become $c_1/c$ and $c_2/c$ respectively. By picking $c = 2c_1c_2/(c_1 + c_2)$ the right-hand side of (5.28) attains its lowest value and

$$\Bigl\|u_0 - \frac{(c_1 + c_2)u_1}{2c_1c_2}\Bigr\|_Q \le \frac{(c_2 - c_1)\|u_1\|_Q}{2c_1c_2}. \qquad (5.29)$$

By interchanging the roles of $P$ and $Q$ we can obtain a similar inequality involving $\|\ \|_P$.

Exercises

6. If IID(n) II ~ C t(l\n»I+\' and is not strongly minimal

Ilgll

~

Ilb(n) -

C2(l r » 1/2+" with v> 0, show that even if {
II ~ C(A,\n»,,-1 /2.

Deduce that b(n) -+ a(n) as n -+ 00 for 0 < v < ! and explain why this does not violate Theorem 5.4. (This illustrates that convergence is possible without strongly minimal {
(2m + 1)1/2

I

Pm(2s - 1) ds

but that {xm(l - x)} are not strongly minimal in HQ • 8. If the boundary conditions of (5.26) are (a) u(O) = u'(l) = 0, (b) u'(O) = u'(l) (c) u'(O) - u(O) = 0, u'(l) = 0 show that possible {
= 0,

(a) 2 3/ 2 { (2m - l)n} -1 sin t(2m - l)nt, (b) {t(1 + m 2n2)} -1/2 cos mnt, (c) I, t, (2 1/2jmn) sin mttt.

9. If the eigenelements in Theorem 5.4b are normalized so that a ~ II
II

~

P where


and f3 are positive constants independent of m, prove that the conclusion of the theorem still holds. Use this result to simplify the tPm in Exercise 8, e.g. in 1 (b) cos mtu. 10. If (5.27) holds in a two-dimensional region (r = 2) and u = 0 on the boundary, assume that there is a one-to-one mapping with two continuous derivatives which takes the domain into (a) a circle, (b) a square. Prove that possible {tPm} can be expressed in terms of (a) Bessel functions, (b) trigonometric functions. 11. Generalize Exercise 10 to three dimensions and other boundary conditions. (X

m-

INTEGRAL EQUATIONS

5.5 Compact operators

The theory of the preceding sections has the limitation that it asks for the operator to be positive definite. We now attempt to remove this restriction at the expense of requiring that the operator be compact (§3.3). Recall that a linear operator $K$ is compact if, for every infinite sequence $\{x_j\}$ with $\|x_j\| \le C$, the sequence $\{Kx_j\}$ contains a convergent subsequence. Since compact operators are necessarily bounded, unbounded operators which are not positive definite are covered neither by this section nor by the preceding section. We continue to work in a Hilbert space $H$ though many of the results are valid for a Banach space. The domain and range of $K$ are in $H$. Assume that $\{\phi_m\}$ is complete and minimal in $H$; let $\{\psi_m\}$ be the unique biorthonormal set which exists by Theorem 5.2. The equation to be solved is $(I - K)u = f$. For any $u \in H$ let

$$F_nu = \sum_{k=1}^{n} (u, \psi_k)\phi_k.$$

In order to have the requisite convergence it will be assumed that

CONDITIONS A
(a) $\|F_nKu - Ku\| \to 0$ as $n \to \infty$ for any $u \in H$,
(b) $\|F_nf - f\| \to 0$,
(c) $\sup_n \|F_n\| \le C < \infty$

hold. Since we are taking $\{\phi_m\}$ to be complete $\|F_nu - u\| \to 0$ for any $u \in H$ and Conditions A are automatically satisfied. However, some of the following conclusions are valid when only Conditions A are assumed (Ikebe 1972). It will further be supposed that 1 is not an eigenvalue of $K$ so that there is no problem with the uniqueness of the solution to the equation. The assumption is equivalent to stating that $(I - K)^{-1}$ is a bounded linear operator in $H$. Let $u_n = \sum_{k=1}^{n} a_k^{(n)}\phi_k$ so that $a_k^{(n)} = (u_n, \psi_k)$. Then $F_nu_n = u_n$. Regarding $u_n$ as

242

NUMERICAL ASPECTS OF VARIATIONAL METHODS

an approximation to u one is led to consider

$$(I - K)u_n = f$$

or, rather, by applying the operator $F_n$ to both sides,

$$(I - F_n K)u_n = F_n f. \tag{5.30}$$

Taking the inner product of (5.30) with $\psi_k$ we obtain

$$a_k^{(n)} - \sum_{j=1}^{n} a_j^{(n)}(K\phi_j, \psi_k) = (f, \psi_k) \qquad (k = 1, \ldots, n). \tag{5.31}$$

Since (5.31) is the usual Galerkin set, (5.30) can be adjudged as the approximate operator equation to be solved. On this basis we have

THEOREM 5.5. If Conditions A hold and $(I - K)^{-1}$ exists then

(i) $\|F_n K - K\| \to 0$, (ii) $\|u_n - u\| \to 0$ as $n \to \infty$.

Proof. (i) Let $v$ be any element of $H$ such that $\|v\| \le 1$ and put $y = Kv$. Let $Y$ be the closure of the space of such $y$, i.e. it consists of all $y$ and the limits of convergent sequences $\{y_m\}$. Suppose for each $n$ a $y_n \in Y$ can be found such that

$$\|F_n y_n - y_n\| \ge \varepsilon > 0.$$

For each $y_n$ choose $v_n$ such that $\|y_n - Kv_n\| < 1/n$, which is possible because of the structure of $Y$. Because $K$ is compact, $\{Kv_n\}$ contains a convergent subsequence, say $\{Kv_k\}$, with limit $y_0 \in Y$. Therefore $\lim_{k \to \infty} y_k = y_0$ and

$$\|F_k y_k - y_k\| \le \|F_k(y_k - y_0)\| + \|F_k y_0 - y_0\| + \|y_0 - y_k\|.$$

Each of the terms on the right-hand side tends to zero as $k \to \infty$; the last by what has just been proved and the second by (a), while the first vanishes because of (c) and $y_k \to y_0$. This contradiction means that the supposed $y_n$ do not exist and thus

$$\sup_{\|v\| \le 1} \|F_n K v - K v\| \to 0$$

which proves (i).

(ii) From (5.30)

$$u_n = (I - F_n K)^{-1} F_n f = (I - F_n K)^{-1}(F_n - F_n K)u$$

or

$$u_n - u = (I - F_n K)^{-1}(F_n u - u) \tag{5.32}$$

provided that $(I - F_n K)^{-1}$ exists. Now

$$(I - F_n K)^{-1} = (I - K - F_n K + K)^{-1} = \{I - (I - K)^{-1}(F_n K - K)\}^{-1}(I - K)^{-1} \tag{5.33}$$

and

$$\|(I - K)^{-1}(F_n K - K)\| \le \|(I - K)^{-1}\|\,\|F_n K - K\| < 1$$

by (i) if $n$ is sufficiently large. Therefore $(I - F_n K)^{-1}$ exists for big enough $n$ and (5.32) is valid for such $n$. Also $F_n u - u = F_n K u - K u + F_n f - f$, which tends to zero as $n \to \infty$ by (i) and (b). Hence $\|u_n - u\| \to 0$ so long as $\|(I - F_n K)^{-1}\|$ is bounded. But this follows from (5.33) and the proof is terminated.

Denote the matrix in (5.31) by $A_n$ so that (5.31) may be written

$$a^{(n)} - A_n a^{(n)} = h. \tag{5.34}$$

Now

$$F_n K u_n = \sum_{j=1}^{n} (A_n a^{(n)})_j \phi_j$$

where $(A_n a^{(n)})_j$ is the $j$th component of $A_n a^{(n)}$. From this it is evident that if $a^{(n)}$ is an eigenvector of $A_n$ with eigenvalue $\lambda$, $u_n$ is an eigenelement of $F_n K$ with eigenvalue $\lambda$ and vice versa. Thus $F_n K$ and $A_n$ have the same eigenvalues. Hence we can state the following theorem.

THEOREM 5.5a. $F_n K$ and $A_n$ have the same eigenvalues. If $\sum_{k=1}^{n} a_k^{(n)}\phi_k$ is an eigenelement of $F_n K$ then $a^{(n)}$ is an eigenvector of $A_n$, and conversely.

An important deduction from this theorem is

COROLLARY 5.5a. The spectral radius of $A_n$ does not exceed $\|F_n K\|$.

Therefore, if $\|F_n\| = 1$ and $\|K\| < 1$, Corollary 5.5a indicates that the spectral radius of $A_n$ will be less than 1. Consequently, by Theorem 1.13, all the iterative methods described in §1.13 become available for tackling (5.34), e.g. Gauss-Seidel and SOR. There is therefore considerable advantage for numerical work in trying to arrange that $\|F_n\| = 1$ and $\|K\| < 1$.

The discussion of numerical stability is assisted by introducing a new norm defined by

$$\|a\|_\phi = \Big\| \sum_{k=1}^{n} a_k \phi_k \Big\|$$

and the corresponding subordinate matrix norm (§1.11)

$$\|A\|_\phi = \max \|Aa\|_\phi$$

subject to $\|a\|_\phi \le 1$. Then a bound on the condition number for (5.34) is provided by

THEOREM 5.5b.

$$\limsup_{n \to \infty} \|I - A_n\|_\phi \, \|(I - A_n)^{-1}\|_\phi \le C^2 \|I - K\| \, \|(I - K)^{-1}\|$$

where $C$ is defined in (c) of Conditions A.

Proof. Let $x \in H$ and define $y$ by

$$y = (I - F_n K)F_n x.$$

Then, if

$$F_n x = \sum_{k=1}^{n} c_k \phi_k, \qquad y = \sum_{k=1}^{n} \{c_k - (A_n c)_k\}\phi_k.$$

Hence

$$\|y\| = \|(I - A_n)c\|_\phi$$

and so

$$\|(I - F_n K)F_n\| = \sup_{\|x\| \le 1} \|y\| = \sup_{\|x\| \le 1} \|(I - A_n)c\|_\phi. \tag{5.35}$$

If $x$ is in the space spanned by $\phi_1, \ldots, \phi_n$ then $F_n x = x$ and $\|x\| = \|c\|_\phi$. Therefore, the space $\|x\| \le 1$ includes that in which $\|c\|_\phi \le 1$ and so the right-hand side of (5.35) must be at least $\|I - A_n\|_\phi$. Hence

$$\|I - A_n\|_\phi \le \|(I - F_n K)F_n\|. \tag{5.36}$$

Similarly, it may be proved that

$$\|(I - A_n)^{-1}\|_\phi \le \|(I - F_n K)^{-1}F_n\|. \tag{5.37}$$

Since $\|F_n K - K\| \to 0$ by Theorem 5.5(i), the required result follows from (5.36), (5.37) and the proof is complete.

It is worth remarking that, if $\|F_n\| = 1$, equality holds in (5.36) and (5.37). For then, $1 = \sup_{\|x\| \le 1} \|F_n x\| = \sup \|c\|_\phi$ so that as $x$ ranges over $\|x\| \le 1$, $c$ ranges over $\|c\|_\phi \le 1$. It is also true that

$$\|A_n\|_\phi \le \|F_n K F_n\| \tag{5.38}$$

with equality when $\|F_n\| = 1$.

Although the norm employed in Theorem 5.5b depends upon the choice of $\{\phi_m\}$ it will be seen later that there are important circumstances in which it can be reduced to a more familiar norm. However, the flexibility permitted by the norm of Theorem 5.5b may be more convenient in particular cases. Stability in the sense of Theorems 5.4 and 5.4a can be discussed by related methods and occurs when $\{\phi_m\}$ is orthonormal (see also Yaskova and Yakovlev 1962; Vainikko 1965).

NOTE. For some numerical applications it is desirable to allow $\{\phi_m\}$ to have more freedom than is permitted by the above theory. To this end, the notion of the inner product is dropped but the norm is retained, i.e. a Banach space is adopted rather than a Hilbert space. Now define $F_n u = \sum_{k=1}^{n} c_k \phi_k$ but, instead of specifying $c_k$ by an inner product, we suppose that there are linear operators $\Phi_k$ such that $c_k = \Phi_k(u)$ and $\Phi_k(\phi_j) = \delta_{jk}$. Then, in place of (5.31),

$$a_k^{(n)} - \sum_{j=1}^{n} a_j^{(n)} \Phi_k(K\phi_j) = \Phi_k(f) \qquad (k = 1, \ldots, n) \tag{5.39}$$

is obtained. Apart from these changes the proofs of the theorems go through as before so long as it is remembered that Conditions A are imposed.

Exercises

12. Show that the solution of $u - Ku = f$ can be expressed as $u = f + Kf + \cdots + K^{m-1}f + v$ where $(I - K)v = K^m f$. $K^m f$ is often smoother than $f$ and (b) of Conditions A is covered by (a) so that it may be advantageous to adopt this form of the equation if $K^m f$ is not too awkward to compute.

13. Prove inequality (5.38).

5.6 Integral equations

One of the most valuable applications of the preceding theory is in the area of integral equations. A typical integral equation of the second kind is

$$u(s) - \int_a^b k(s, t)u(t)\,dt = f(s) \tag{5.40}$$

where a one-dimensional form has been adopted for simplicity, though it will be evident that most of the subsequent remarks apply to higher dimensions. The integral operator is compact in the space of continuous functions on $[a, b]$ if $\|v\| = \sup_{[a,b]} |v|$, and in $L^p[a, b]$ for $1 < p < \infty$ if

$$\int_a^b \Big\{ \int_a^b |k(s, t)|^q \,dt \Big\}^{p/q} ds < \infty$$

where $q = p/(p - 1)$ (see also §3.4).

Several methods have been adopted for the numerical solution of integral equations; a selection of the most popular will be discussed.

(a) Point matching

Split the domain of integration into a finite number of subdomains and in each subdomain pick a typical point. Let $t_1, \ldots, t_n$ denote the points chosen and then approximate the integral in (5.40) by a quadrature formula (see §5.12) so that

$$\int_a^b k(s, t)u(t)\,dt \approx \sum_{j=1}^{n} \alpha_j k(s, t_j)u(t_j).$$

For example, if a Riemann sum were employed, $\alpha_j$ would be the length (or area in two dimensions) of the subdomain containing $t_j$. Equation (5.40) becomes approximately

$$u(s) - \sum_{j=1}^{n} \alpha_j k(s, t_j)u(t_j) = f(s). \tag{5.41}$$

If we ask that this equation be satisfied at $s = s_1, \ldots, s_n$ we obtain the algebraic system

$$u(s_i) - \sum_{j=1}^{n} \alpha_j k(s_i, t_j)u(t_j) = f(s_i) \tag{5.42}$$

to determine the unknown values of $u$. The solution of (5.40) via (5.42) is known as point matching, and may be contemplated when continuous functions are being dealt with. It is usual to make $s_i = t_i$ though there is no necessity, in principle, to have the same number of $s$ points as $t$ points; however, if they are not the same in number, solution of (5.42) requires a generalized inverse (§1.15). When $s_i = t_i$ (all $i$), as will be assumed from now on, the values determined from (5.42) may be substituted in the sum in (5.41). In this way, an interpolation formula is derived giving $u(s)$ for all $s$ in terms of its values at certain points.

To place (5.42) in the general framework one might proceed formally with $\phi_m = \alpha_m \delta(t - t_m)$, $\psi_m = \delta(t - t_m)$ where $\delta$ is the usual Dirac $\delta$-function. This would certainly arrive at (5.42) but, unfortunately, $\delta$ is not a continuous function and therefore not a permissible choice in the strict sense. However, we can take advantage of the note at the end of §5.5 and work via (5.39). Let $\{\phi_m\}$ be functions, continuous on $[a, b]$, such that $\phi_k(t_j) = \delta_{jk}$. Then define the operator $\Phi_k$ for any continuous $u$ by

$$\Phi_k(u) = u(t_k).$$

Clearly, $\Phi_k(\phi_j) = \delta_{jk}$ and (5.39) is available. Indeed

$$F_n u = \sum_{k=1}^{n} u(t_k)\phi_k(t),$$

and (5.39) becomes

$$a^{(n)} - A_n a^{(n)} = h \tag{5.43}$$

where $h_j = f(t_j)$, $(A_n)_{ij} = \int_a^b k(t_i, t)\phi_j(t)\,dt$. Let

$$\alpha_j = \int_a^b \phi_j(t)\,dt. \tag{5.44}$$

This may either be taken as a definition of $\alpha_j$ (which is at our disposal) or regarded as a restriction on $\phi_j$. It will be assumed that the points where $\phi_k$ is non-zero can be enclosed in intervals whose total length tends to zero as $n \to \infty$. For example, a cubic B-spline (§1.1) which is non-zero only on an interval which contains $t_k$ but no other $t_j$ would be a possibility.

The norm employed will be $\|v\| = \max_{a \le t \le b} |v(t)|$. Since

$$\Big| \int_a^b k(s, t)v(t)\,dt \Big| \le \max_{a \le y \le b} |v(y)| \int_a^b |k(s, t)|\,dt$$

and a possible choice for $v$ is unity on the whole interval,

$$\|K\| = \max_{a \le s \le b} \int_a^b |k(s, t)|\,dt.$$

With these assumptions understood we can assert the following theorem.

THEOREM 5.6. If $k$ and $f$ are continuous, if $\|F_n v - v\| \to 0$ as $n \to \infty$ for any $v$ continuous on $[a, b]$, and if

$$\max_{a \le t \le b} \sum_{j=1}^{n} |\phi_j(t)| \le \beta$$

then the approximation of point matching converges to the solution of (5.40) as $n \to \infty$.

It is possible to replace $\beta$ by any larger number without invalidating the theorem.

Proof. The assumption on $\|F_n v - v\|$ ensures that Conditions A hold. Hence, by Theorem 5.5(ii), $\|u_n - u\| \to 0$ where $u_n$ is the solution derived from (5.43). Let $B_{ij} = k(t_i, t_j)\alpha_j$. Then

$$B_{ij} - (A_n)_{ij} = \int_a^b \{k(t_i, t_j) - k(t_i, t)\}\phi_j(t)\,dt.$$

Let

$$\varepsilon_{ij} = \max |k(t_i, t_j) - k(t_i, t)|$$

where the maximum is taken as $t$ runs through the values where $\phi_j$ is non-zero. Then

$$|B_{ij} - (A_n)_{ij}| \le \varepsilon_{ij} \int_a^b |\phi_j(t)|\,dt.$$

The supremum or $l_\infty$-norm (§1.11) is given by

$$\|B - A_n\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |B_{ij} - (A_n)_{ij}| \le \beta\varepsilon(b - a)$$

where $\varepsilon$ is the maximum of $\varepsilon_{ij}$ for $i, j = 1, \ldots, n$. Since $\varepsilon \to 0$ as $n \to \infty$ because $k$ is continuous and the length where $\phi_j$ is non-zero vanishes in the limit, $\|B - A_n\|_\infty \to 0$ as $n \to \infty$.

Now, for any $n \times n$ matrix $C$,

$$\|C\|_\phi = \max_{\|a\|_\phi \le 1} \Big\| \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} a_j \phi_i \Big\| \ge \max \Big\| \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} a_j \phi_i \Big\| \qquad (\max |a_j| \le 1/\beta),$$

since $\|a\|_\phi \le \beta \max |a_j|$. By taking $a_j = \pm 1/\beta$ according as $\sum_{i=1}^{n} C_{ij}\phi_i$ is positive or negative, we conclude that

$$\|C\|_\phi \ge \frac{1}{\beta} \max_{a \le t \le b} \sum_{j=1}^{n} \Big| \sum_{i=1}^{n} C_{ij}\phi_i(t) \Big|.$$

With $\phi_k(t_i) = \delta_{ik}$ we see by choosing $t = t_i$ that

$$\|C\|_\phi \ge \frac{1}{\beta} \sum_{j=1}^{n} |C_{ij}| \qquad (i = 1, \ldots, n)$$

whence

$$\|C\|_\infty \le \beta \|C\|_\phi.$$

Therefore, on taking $C = (I - A_n)^{-1}$,

$$\|(I - A_n)^{-1}\|_\infty \le \beta \|(I - A_n)^{-1}\|_\phi \le \beta \|(I - F_n K)^{-1}F_n\| \le \beta \|(I - F_n K)^{-1}\|\,\|F_n\|,$$

the second step following by (5.37). But $\|F_n\|$ is bounded and so is $\|(I - F_n K)^{-1}\|$ for large enough $n$ (by (5.33)). Consequently $\|(I - A_n)^{-1}\|_\infty$ is bounded for large enough $n$. An immediate deduction is that, for sufficiently large $n$,

$$\|(I - A_n)^{-1}(B - A_n)\|_\infty < 1$$

because $\|B - A_n\|_\infty \to 0$. Since

$$(I - B)^{-1} - (I - A_n)^{-1} = \{I - (I - A_n)^{-1}(B - A_n)\}^{-1}(I - A_n)^{-1}(B - A_n)(I - A_n)^{-1}$$

it follows that $(I - B)^{-1}$ exists when $n$ is big enough and

$$\|(I - B)^{-1} - (I - A_n)^{-1}\|_\infty \to 0$$

as $n \to \infty$. The point-matching approximation is $\sum_{i=1}^{n} c_i\phi_i$ where $c_i$ replaces $u(t_i)$ in (5.42). Hence

$$\|a^{(n)} - c\|_\infty = \|\{(I - B)^{-1} - (I - A_n)^{-1}\}h\|_\infty \le \|(I - B)^{-1} - (I - A_n)^{-1}\|_\infty \|f\| \tag{5.45}$$

and the right-hand side tends to zero as $n \to \infty$ by what has just been proved. Since we know $u_n \to u$ the proof is terminated.

It will be noted that the interpolation formula used in the theorem is not necessarily the same as (5.41). On the other hand, the proof of the theorem demonstrates that replacing the coefficients of (5.42) by the more complicated ones of (5.43) makes little difference to the approximation when $n$ is large enough. The inequality (5.45) provides a measure of the difference. Whether one or other is more accurate for smaller values of $n$ is more difficult to determine since the theorem does not supply a bound on the error between the exact and approximate solutions. However, an error bound involving $u_n$ can be derived from (5.32). For we have seen that $\|(I - F_n K)^{-1}\|$ is bounded, uniformly in $n$, by $B_1$ (say) and so

$$\|u - u_n\| \le B_1 \|F_n u - u\|. \tag{5.46}$$

Although the right-hand side of (5.46) embodies the unknown exact solution $u$, it may be possible to make estimates for it which render (5.46) useful.
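The point-matching system (5.42) with $s_i = t_i$, together with the interpolation formula obtained from (5.41), can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the author's program: the function names `point_matching` and `interpolate` are hypothetical, the points are taken equally spaced, and the weights $\alpha_j$ are the trapezoidal ones (the integrals of linear hat functions). The test problem $u(s) - \int_0^1 st\,u(t)\,dt = s$, with exact solution $u(s) = \tfrac{3}{2}s$, is chosen only because it can be checked by hand.

```python
import numpy as np

def point_matching(kernel, f, a, b, n):
    """Solve the point-matching system (5.42) with s_i = t_i (hypothetical helper)."""
    t = np.linspace(a, b, n)                 # equally spaced points t_1..t_n
    h = (b - a) / (n - 1)
    alpha = np.full(n, h)                    # assumed weights: trapezoidal rule
    alpha[0] = alpha[-1] = h / 2.0
    B = kernel(t[:, None], t[None, :]) * alpha[None, :]   # B_ij = k(t_i, t_j) alpha_j
    c = np.linalg.solve(np.eye(n) - B, f(t))               # (I - B) c = h
    return t, alpha, c

def interpolate(kernel, f, t, alpha, c, s):
    """Interpolation formula from (5.41): u(s) = f(s) + sum_j alpha_j k(s, t_j) c_j."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    return f(s) + kernel(s[:, None], t[None, :]) @ (alpha * c)

# Test problem (an assumption for illustration): k(s, t) = s t, f(s) = s on [0, 1].
kernel = lambda s, t: s * t
f = lambda s: s
t, alpha, c = point_matching(kernel, f, 0.0, 1.0, 101)
max_err = np.max(np.abs(c - 1.5 * t))        # error at the matching points
u_mid = interpolate(kernel, f, t, alpha, c, 0.505)[0]   # value between two nodes
```

The interpolated value `u_mid` is obtained without any separate interpolation scheme: the kernel itself carries the nodal values to an arbitrary $s$.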

The main burden in implementing Theorem 5.6 is proving that $\|F_n v - v\| \to 0$ since most of the other conditions are usually verified readily. As a concrete example, let the $t_j$ be equally spaced throughout the interval; specifically pick $t_j = a - h + jh$ with $h = (b - a)/(n - 1)$. A possible choice for $\phi_j$ is provided by linear interpolation, namely

$$\phi_j(t) = (t - t_{j-1})/h \quad (t_{j-1} \le t \le t_j), \qquad \phi_j(t) = (t_{j+1} - t)/h \quad (t_j \le t \le t_{j+1})$$

and zero elsewhere, with obvious adjustments for $\phi_1$ and $\phi_n$. Since the total interval where $\phi_j$ is non-zero is $2h$, which tends to zero as $n \to \infty$, the understood assumption of Theorem 5.6 is met. Also, from (5.44), $\alpha_j = h$. Moreover, $\|F_n\| = 1$. Further, the maximum condition in Theorem 5.6 is clearly satisfied with $\beta = 1$. Finally,

$$\|F_n v - v\| = \max_{a \le t \le b} \Big| \sum_{j=1}^{n} v(t_j)\phi_j(t) - v(t) \Big|. \tag{5.47}$$

If $t_i \le t \le t_{i+1}$,

$$\Big| \sum_{j=1}^{n} v(t_j)\phi_j(t) - v(t) \Big| = \frac{|(t_{i+1} - t)\{v(t_i) - v(t)\} + (t - t_i)\{v(t_{i+1}) - v(t)\}|}{h} \le \max\{|v(t_i) - v(t)|, |v(t_{i+1}) - v(t)|\}. \tag{5.48}$$

Since $t_{i+1} - t_i \to 0$ as $n \to \infty$ and $v$ is continuous the right-hand side of (5.48) and hence of (5.47) tends to zero as $n \to \infty$. Thus Theorem 5.6 is applicable when $\phi_j$ is given by linear interpolation and $\alpha_j = h$.

Better results can be obtained if one is prepared to assume that $v$ has a continuous second derivative. In this case choose any fixed $t_0$ such that $t_j < t_0 < t_{j+1}$ and consider

$$\frac{h\{v(t) - v(t_j)\} - (t - t_j)\{v(t_{j+1}) - v(t_j)\}}{h} - \frac{(t - t_j)(t - t_{j+1})}{(t_0 - t_j)(t_0 - t_{j+1})} \cdot \frac{h\{v(t_0) - v(t_j)\} - (t_0 - t_j)\{v(t_{j+1}) - v(t_j)\}}{h}.$$

This function vanishes for $t = t_j, t_0, t_{j+1}$ and therefore, by Rolle's theorem, its derivative is zero at $t = c_1, c_2$ where $t_j < c_1 < t_0 < c_2 < t_{j+1}$. Hence the second derivative vanishes at $t = c$ where $c_1 < c < c_2$. Accordingly

$$(t_0 - t_j)(t_0 - t_{j+1})v''(c) - \frac{2}{h}\big[h\{v(t_0) - v(t_j)\} - (t_0 - t_j)\{v(t_{j+1}) - v(t_j)\}\big] = 0.$$

Since $t_0$ was selected arbitrarily we can say that

$$v(t) - \sum_{j=1}^{n} v(t_j)\phi_j(t) = \tfrac{1}{2}(t - t_j)(t - t_{j+1})v''(c)$$

on $[t_j, t_{j+1}]$ with $t_j < c < t_{j+1}$. The quantity $(t - t_j)(t - t_{j+1})$ must lie between $\pm\tfrac{1}{4}h^2$ on $[t_j, t_{j+1}]$ and so if $|v''| \le M$ on $(a, b)$

$$\|F_n v - v\| \le \tfrac{1}{8}Mh^2. \tag{5.49}$$

The combination of (5.49) and (5.46) entails

$$\|u_n - u\| \le \tfrac{1}{8}MB_1h^2 = MB_1(b - a)^2/8(n - 1)^2 \tag{5.50}$$

if the second derivative of the exact solution $u$ is bounded by $M$ on $[a, b]$. Thus $u_n$ converges to $u$ by the factor $1/n^2$ as $n$ is increased. Similar analysis can be carried out for other choices of $\phi_j$ but details will not be given.
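The bound (5.49) is easy to check numerically. The sketch below assumes the test function $v = \sin$ on $[0, \pi]$ (so that $M = 1$), uses `numpy.interp` as a stand-in for $F_n v = \sum_j v(t_j)\phi_j$ with linear hat functions, and estimates the maximum norm on a fine grid; the helper name `hat_interp_error` is hypothetical.

```python
import numpy as np

def hat_interp_error(v, a, b, n, n_fine=20001):
    """Estimate ||F_n v - v|| in the max norm for piecewise-linear interpolation."""
    t = np.linspace(a, b, n)            # nodes t_j = a - h + j h
    s = np.linspace(a, b, n_fine)       # fine grid used to approximate the maximum
    Fn_v = np.interp(s, t, v(t))        # sum_j v(t_j) phi_j(s) for hat functions
    return np.max(np.abs(Fn_v - v(s)))

a, b, M = 0.0, np.pi, 1.0               # |v''| = |sin| <= 1 (assumed test case)
results = []
for n in (11, 21, 41):
    h = (b - a) / (n - 1)
    results.append((n, hat_interp_error(np.sin, a, b, n), M * h**2 / 8.0))

for n, err, bound in results:
    print(f"n = {n:3d}   error = {err:.3e}   M h^2 / 8 = {bound:.3e}")
```

Halving $h$ should reduce the observed error by a factor close to 4, in line with the $1/n^2$ behaviour of (5.50).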

(b) Collocation

Suppose now that the integral is not estimated by the quadrature formula of (a) but still

$$u_n = \sum_{j=1}^{n} a_j\phi_j$$

is taken as an approximation to $u$, with the understanding that $a_j$ is to be identified with $u(t_j)$. Substitution in (5.40) gives

$$u_n(s) - \sum_{j=1}^{n} a_j \int_a^b k(s, t)\phi_j(t)\,dt = f(s). \tag{5.51}$$

By requiring that this be satisfied at $s = t_1, \ldots, t_n$ we are led to

$$a_i - \sum_{j=1}^{n} a_j \int_a^b k(t_i, t)\phi_j(t)\,dt = f(t_i) \qquad (i = 1, \ldots, n). \tag{5.52}$$

This is the method of collocation or, because $\phi_j$ is often selected to be non-zero only on a subinterval, subsectional collocation. The system (5.52) is, in fact, identical with (5.43) so that its theory when $k$ is continuous has already been dealt with. We know that $u_n \to u$ and, if $\phi_j$ is a linear interpolant, (5.50) provides a bound to the error.

However, the method of collocation is capable of operating even when $k$ is not continuous (Atkinson 1967). Assume that

(a) $\|K\| = \sup_{a \le s \le b} \int_a^b |k(s, t)|\,dt$ is finite,
(b) $\int_a^b |k(s_1, t) - k(s_2, t)|\,dt \to 0$ uniformly for $a \le s_1, s_2 \le b$ as $|s_1 - s_2| \to 0$.

Then the integral operator is compact and the general theory becomes available. Be that as it may, there are important extensions of the technique when

$$k(s, t) = h(s, t)j(s, t) \tag{5.53}$$

where $h$ is continuous and $j$ satisfies (a) and (b). The method also works if $k$ consists of a finite sum of terms of the type (5.53) but it will be sufficient to supply details for (5.53) alone.

Instead of (5.51) we take

$$u_n(s) - \sum_{j=1}^{n} a_j h(s, t_j) \int_a^b j(s, t)\phi_j(t)\,dt = f(s) \tag{5.54}$$

and then (5.52) is replaced by

$$a_i - \sum_{j=1}^{n} a_j h(t_i, t_j) \int_a^b j(t_i, t)\phi_j(t)\,dt = f(t_i) \qquad (i = 1, \ldots, n). \tag{5.55}$$

The aim is to select the factors of $k$ so that the integrals in (5.55) can be either evaluated analytically or calculated numerically in a reliable way, while retaining simultaneously good convergence properties for the system (5.55). In a sense the method is a mixture of point matching and collocation, the one being used for $h$ and the other for $j$.

It can be shown that, if the $\phi_j$ are suitably chosen, $(I - K_n)^{-1}$ exists and is bounded, by $B$ say, for large enough $n$; here $K_n$ is the operator defined by the sum in (5.54), so that $u_n = K_n u_n + f$. Since

$$u - u_n = Ku - K_n u_n = (K - K_n)u + K_n(u - u_n),$$

we have

$$u - u_n = (I - K_n)^{-1}(K - K_n)u.$$

From this result the error bound

$$\|u - u_n\| \le B\|(K - K_n)u\| \tag{5.56}$$

is deduced. Now, if $P(r)$ is a polynomial of degree $m - 1$ in $r$ which interpolates a function $f(r)$ at $r_1, \ldots, r_m$, the Lagrange interpolation formula (Jones and Jordan 1969) gives

$$f(r) - P(r) = (r - r_1)(r - r_2) \cdots (r - r_m)f^{(m)}(c)/m! \tag{5.57}$$

for $r_1 \le r \le r_m$, $r_1 < c < r_m$ when $f$ has $m$ continuous derivatives. This is a generalization of (5.49). Now choose equally spaced points with $n = (m - 1)p + 1$, $p$ being a positive integer, so that $b - a = (m - 1)ph$. Then, if the $\phi_j$ are the piecewise polynomials of degree $m - 1$ furnished by Lagrange interpolation on successive groups of $m$ points,

$$\Big| h(s, t)v(t) - \sum_{j=1}^{n} v(t_j)h(s, t_j)\phi_j(t) \Big| \le C_m h^m \sup \Big| \frac{\partial^m}{\partial t^m} h(s, t)v(t) \Big|$$

where

$$m!\,C_m = \max_{0 \le t \le m-1} |t(t - 1) \cdots (t - m + 1)|.$$

Therefore

$$\Big| \int_a^b k(s, t)v(t)\,dt - \sum_{j=1}^{n} v(t_j)h(s, t_j) \int_a^b \phi_j(t)j(s, t)\,dt \Big| \le C_m h^m \|J\| \sup \Big| \frac{\partial^m}{\partial t^m} h(s, t)v(t) \Big|$$

where the supremum is taken over all $s$ and $t$ in $[a, b]$, and

$$\|J\| = \sup_{a \le s \le b} \int_a^b |j(s, t)|\,dt.$$

Combining this with (5.56) we obtain the error bound

$$\|u - u_n\| \le BC_m h^m \|J\| \sup \Big| \frac{\partial^m}{\partial t^m} h(s, t)u(t) \Big|. \tag{5.58}$$

In particular, if linear interpolation is employed, $m = 2$ and $C_m = \tfrac{1}{8}$.

The efficacy of the method depends upon the split chosen in (5.53). In the first place it should be selected so that the integrals involving $j$ can be evaluated analytically with reasonable ease if at all possible so as to avoid as much numerical quadrature as is feasible. Secondly, $h(s, t)$ should be such that it has at least a first derivative, and preferably more, so that (5.58) will ensure convergence. For example, suppose $k(s, t) = \ln|\cos s - \cos t|$ on the interval $[0, \pi]$. The split

$$h(s, t) = |s - t|^{1/2} \ln|\cos s - \cos t|, \qquad j(s, t) = |s - t|^{-1/2} \tag{5.59}$$

has an unbounded $\partial h/\partial t$ at $s = t$ so that convergence can be expected to be poor. On the other hand, one can take advantage of the fact that $k$ can be made a sum of terms of the type (5.53). Thus, if $p = s - t$ and $q = s + t$,

$$k(s, t) = \ln\Big|\frac{\sin \tfrac{1}{2}p}{\tfrac{1}{2}p}\Big| + \ln\frac{\sin \tfrac{1}{2}q}{q(2\pi - q)} + \ln|p| + \ln q + \ln(2\pi - q). \tag{5.60}$$

The first two terms can be taken together as $h_1$ with $j_1 = 1$. The remaining terms can be regarded as $j_2$ with $h_2 = 1$. Now the $h$'s are analytic so that (5.58) implies an $h^2$ order of convergence when $m = 2$. Also the integrals involving $j$ can be evaluated exactly. In view of the improved convergence that occurs as well, one can conclude that the split of (5.60) is eminently more satisfactory than that of (5.59). Experimental evidence seems to suggest that $m = 3$ may be desirable for good accuracy with lower values of $n$.

(c) Fourier methods

If the kernel has the form $k(s, t) = w(t)k_0(s, t)$ where

$$w > 0, \qquad \int_a^b w^2(t)\,dt < \infty, \qquad \int_a^b \int_a^b w^2(s)|k_0(s, t)|^2\,dt\,ds < \infty$$

the possibility of working with functions of integrable square can be contemplated. With the inner product

$$(u, v) = \int_a^b w^2(t)u(t)v^*(t)\,dt$$

a Hilbert space $H$ can be defined and the integral operator will be compact in this space. Choose now $\{\phi_m\}$ as an orthonormal set which is complete in $H$ and take, for $v \in H$,

$$F_n v = \sum_{j=1}^{n} (v, \phi_j)\phi_j.$$

Clearly (b) of Conditions A is satisfied and, from Bessel's inequality (§1.5),

$$\|F_n v\|^2 \le \|v\|^2$$

with equality when $v = \phi_1$, so that $\|F_n\| = 1$. The Galerkin equations are (5.34) with

$$(A_n)_{ij} = \int_a^b w^2(s)\phi_i^*(s) \int_a^b k(s, t)\phi_j(t)\,dt\,ds$$

and $h_i = (f, \phi_i)$. Since

$$\|F_n v - v\|^2 \le \sum_{j=n+1}^{\infty} |(v, \phi_j)|^2$$

(5.32) provides the error bound

$$\|u_n - u\| \le \|(I - F_n K)^{-1}\| \Big\{ \sum_{j=n+1}^{\infty} |(u, \phi_j)|^2 \Big\}^{1/2}. \tag{5.61}$$

The norm which occurs in Theorem 5.5b is now

$$\|a\|_\phi = \Big\{ \sum_{j=1}^{n} |a_j|^2 \Big\}^{1/2}$$

which is the usual $l_2$-norm $\|\cdot\|_2$. Accordingly, the corresponding matrix norm is the spectral norm. From (5.38) we deduce that

$$\|A_n\|_2 = \|F_n K F_n\| \le \|K\| \le \Big\{ \int_a^b w^2(s) \int_a^b |k_0(s, t)|^2\,dt\,ds \Big\}^{1/2} \tag{5.62}$$

and from (5.37)

$$\|(I - A_n)^{-1}\|_2 = \|(I - F_n K)^{-1}F_n\|. \tag{5.63}$$

The condition number in Theorem 5.5b satisfies

$$\limsup_{n \to \infty} \|I - A_n\|_2 \, \|(I - A_n)^{-1}\|_2 \le \|I - K\| \, \|(I - K)^{-1}\|.$$
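A small numerical sketch may help to make the orthonormal Galerkin scheme concrete. It assumes $w = 1$ on $[0, 1]$, the orthonormal basis $\phi_m(t) = (2m + 1)^{1/2}P_m(2t - 1)$ (shifted Legendre polynomials, indexed from $m = 0$), Gauss-Legendre quadrature for the inner products, and the test kernel $k(s, t) = st$ with $f(s) = s$, for which $u(s) = \tfrac{3}{2}s$ and $\|K\| = \tfrac{1}{3}$. The helper name `galerkin_system` is hypothetical, and the plain fixed-point iteration $a \leftarrow A_n a + h$ is used here only as the simplest iterative scheme that converges when the spectral radius of $A_n$ is below 1; it is not Gauss-Seidel or SOR.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def galerkin_system(kernel, f, n, quad_order=20):
    """Assemble A_n and h of (5.34) for an orthonormal Legendre basis on [0, 1]."""
    x, w = leggauss(quad_order)                  # nodes and weights on [-1, 1]
    t, w = 0.5 * (x + 1.0), 0.5 * w              # mapped to [0, 1]
    # phi[m, q] = sqrt(2m + 1) P_m(2 t_q - 1): orthonormal on [0, 1] with w = 1
    phi = np.array([np.sqrt(2 * m + 1) * legval(2 * t - 1, np.eye(n)[m])
                    for m in range(n)])
    pw = phi * w                                 # basis values times quadrature weights
    A = pw @ kernel(t[:, None], t[None, :]) @ pw.T   # A_ij = (K phi_j, phi_i)
    h = pw @ f(t)                                # h_i = (f, phi_i)
    return A, h, phi, t

# Assumed test problem: k(s, t) = s t, f(s) = s, exact solution u(s) = 1.5 s.
A, h, phi, t = galerkin_system(lambda s, t: s * t, lambda s: s, n=3)

rho = np.max(np.abs(np.linalg.eigvals(A)))      # spectral radius of A_n
spectral_norm = np.linalg.norm(A, 2)            # ||A_n||_2, to compare with ||K|| = 1/3

a = h.copy()
for _ in range(60):                             # fixed-point iteration a <- A a + h
    a = A @ a + h
u_n = a @ phi                                   # Galerkin solution at the quadrature nodes
```

Because the exact solution lies in the span of the first two basis functions, `u_n` reproduces $\tfrac{3}{2}t$ to rounding error in this particular case; for a general kernel only the convergence described by (5.61) can be expected.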

In certain circumstances it may be possible to ameliorate the convergence of the Galerkin process by modification of the kernel. For simplicity it will be assumed that $w$ is unity; the extension to general $w$ should present no difficulty. Suppose that, for sufficiently continuous $v$,

$$\int_a^b v(t)\phi_j(t)\,dt = \frac{\alpha v(a) - \beta v(b)}{j} + O(1/j^2) \tag{5.64}$$

as $j \to \infty$ for some constants $\alpha$ and $\beta$, one at least of which is non-zero. For example, if $[a, b]$ is $[0, 2\pi]$ and $\phi_j$ … Then

$$\int_a^b \int_a^b k(s, t)\phi_i^*(s)\phi_j(t)\,dt\,ds = \frac{1}{i} \int_a^b \{\alpha^* k(a, t) - \beta^* k(b, t)\}\phi_j(t)\,dt + O(1/i^2) \tag{5.65}$$

as $i \to \infty$ under reasonable conditions on $k$. Moreover, by putting $v = 1$ and $v = t$ in (5.64)

$$\int_a^b \phi_j^*(t)\,dt = \frac{\alpha^* - \beta^*}{j} + O(1/j^2), \qquad \int_a^b (t - a)\phi_j^*(t)\,dt = -\frac{\beta^*(b - a)}{j} + O(1/j^2).$$

Therefore, the same right-hand side as in (5.65) could be obtained by starting not from $k(s, t)$ but from

$$\frac{\alpha^* k(a, t) - \beta^* k(b, t)}{\alpha^* - \beta^*} \tag{5.66}$$

if $\alpha^* \ne \beta^*$, or

$$-\frac{(s - a)\{\alpha^* k(a, t) - \beta^* k(b, t)\}}{\beta^*(b - a)} \tag{5.67}$$

if $\beta^* \ne 0$. By assumption it is not possible for $\beta^* = 0$ and $\alpha^* = \beta^*$ simultaneously so that one of the forms is valid. Consequently, if we subtracted a term of this type from $k(s, t)$ we should have a kernel giving rise to coefficients in the Galerkin equations which are $O(1/i^2)$ as $i \to \infty$. This suggests that, for given $n$, the Galerkin equations for the modified kernel will be more accurate than those for the unmodified.

To be more explicit denote a term of the type (5.66) or (5.67) by $k_1(s, t)$ and consider the modified integral equation

$$u(s) - \int_a^b \{k(s, t) - k_1(s, t)\}u(t)\,dt = f(s) + \int_a^b k_1(s, t)u(t)\,dt.$$

Make the substitution

$$u(s) = f(s) + \int_a^b k_1(s, t)u(t)\,dt + v(s). \tag{5.68}$$

Then

$$v(s) - \int_a^b \{k(s, t) - k_1(s, t)\}v(t)\,dt = g(s) \tag{5.69}$$

where

$$g(s) = \int_a^b \{k(s, t) - k_1(s, t)\}\Big\{f(t) + \int_a^b k_1(t, \tau)u(\tau)\,d\tau\Big\}\,dt.$$

Now, if we write $k_1(s, t) = k_2(s)k_3(t)$ where $k_2$ and $k_3$ are functions of single variables, both of the forms (5.66) and (5.67) will be included, as well as several other possibilities. Then (5.68) can be expressed as

$$u(s) = f(s) + Ak_2(s) + v(s) \tag{5.70}$$

where

$$A = \int_a^b k_3(t)u(t)\,dt$$

is a constant to be determined. Further,

$$g(s) = \int_a^b \{k(s, t) - k_1(s, t)\}\{f(t) + Ak_2(t)\}\,dt = g_1(s) + Ag_2(s)$$

say. Therefore if

$$v_j(s) - \int_a^b \{k(s, t) - k_1(s, t)\}v_j(t)\,dt = g_j(s) \tag{5.71}$$

for $j = 1, 2$, we have $v(s) = v_1(s) + Av_2(s)$ where $v_1$ and $v_2$ are known functions. It then follows from (5.70) that $A$ is given by

$$A\Big[1 - \int_a^b k_3(t)\{k_2(t) + v_2(t)\}\,dt\Big] = \int_a^b k_3(t)\{f(t) + v_1(t)\}\,dt. \tag{5.72}$$

The problem is therefore effectively reduced to finding $v_1$ and $v_2$, i.e. solving (5.71). This can be carried out by Galerkin's process; some additional integrals will have to be computed as compared with the original equation but one can anticipate a much more accurate result for a given value of $n$.
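The bookkeeping in (5.70) to (5.72) can be checked on a discretized problem. The sketch below is only an illustration under assumptions: the kernel $k(s, t) = \tfrac{1}{2}e^{st}$ on $[0, 1]$, the right-hand side $f = \cos$, and the separable term $k_1(s, t) = k_2(s)k_3(t)$ with $k_2(s) = s$, $k_3(t) = 0.3t$ are all chosen arbitrarily (they are not derived from (5.66) or (5.67)), and a Gauss-Legendre quadrature discretization replaces the Galerkin process. It verifies that solving (5.71) for $v_1$, $v_2$ and then (5.72) for $A$ reproduces the solution of the original equation.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Quadrature nodes and weights on [0, 1] (discretization is an assumption).
x, w = leggauss(24)
t, w = 0.5 * (x + 1.0), 0.5 * w

k = lambda s, tt: 0.5 * np.exp(s * tt)     # assumed kernel k(s, t)
f = np.cos(t)                              # assumed right-hand side
k2 = t                                     # assumed k_2(s) = s
k3 = 0.3 * t                               # assumed k_3(t) = 0.3 t

K = k(t[:, None], t[None, :]) * w          # discrete integral operator for k
K1 = np.outer(k2, k3 * w)                  # discrete operator for k_1 = k_2(s) k_3(t)
I = np.eye(t.size)

u_direct = np.linalg.solve(I - K, f)       # original equation (5.40)

M = I - (K - K1)                           # operator of (5.71)
v1 = np.linalg.solve(M, (K - K1) @ f)      # right-hand side g_1
v2 = np.linalg.solve(M, (K - K1) @ k2)     # right-hand side g_2
coeff = 1.0 - (k3 * w) @ (k2 + v2)         # coefficient of A in (5.72)
A = (k3 * w) @ (f + v1) / coeff
u_split = f + A * k2 + v1 + A * v2         # (5.70) with v = v_1 + A v_2

lam = np.linalg.solve(I - K, k2)           # lambda of the uniqueness discussion
coeff_check = 1.0 / (1.0 + (k3 * w) @ lam) # expected value of the coefficient of A
```

A value of `coeff` close to zero would signal that the modified equation is near to non-uniqueness, in which case a different choice of $k_2$ and $k_3$ would be advisable.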

The kernel $k_1$ is only one of many alternatives which could have been subtracted. It has the advantage of simplicity but any other which reproduces (5.65) is acceptable, especially if it is more convenient from a computational point of view. It may be helpful to make $k_2(s)$ a particular polynomial or an exponential, for instance.

The question arises as to whether the solution of (5.71) or (5.69) is unique even when the original integral equation possesses a unique solution. Now, if the right-hand side of (5.69) is zero, $v$ must be a constant multiple of $\lambda$, the unique solution of

$$\lambda(s) - \int_a^b k(s, t)\lambda(t)\,dt = k_2(s).$$

Since this can be written

$$\lambda(s) - \int_a^b \{k(s, t) - k_1(s, t)\}\lambda(t)\,dt = k_2(s)\Big\{1 + \int_a^b k_3(t)\lambda(t)\,dt\Big\}$$

it transpires that the solution of (5.69) is non-unique if and only if

$$1 + \int_a^b k_3(t)\lambda(t)\,dt = 0. \tag{5.73}$$

As a criterion (5.73) will not be particularly helpful unless the exact solution $\lambda$ is known and the integral can be evaluated exactly. Nevertheless it leads to a diagnostic of practical importance. The integral equation (5.71) for $v_2$ can be refashioned to

$$v_2(s) + k_2(s) - \int_a^b k(s, t)\{v_2(t) + k_2(t)\}\,dt = k_2(s)\Big\{1 - \int_a^b k_3(t)\{v_2(t) + k_2(t)\}\,dt\Big\}$$

whence

$$v_2(s) + k_2(s) = \lambda(s)\Big\{1 - \int_a^b k_3(t)\{v_2(t) + k_2(t)\}\,dt\Big\}.$$

Consequently

$$\int_a^b k_3(t)\{v_2(t) + k_2(t)\}\,dt = \Big\{1 + \int_a^b k_3(t)\lambda(t)\,dt\Big\}^{-1} \int_a^b k_3(s)\lambda(s)\,ds$$

and the coefficient of $A$ in (5.72) becomes $\{1 + \int_a^b k_3(t)\lambda(t)\,dt\}^{-1}$. On account of (5.73) the coefficient of $A$ is infinite when non-uniqueness occurs. Although $v_2$ will usually only be known approximately from the Galerkin process, non-uniqueness must be suspected when the coefficient of $A$ in (5.72) is large.

The Galerkin process provides an approximation of the form

$$u(s) = f(s) + Ak_2(s) + \sum_{j=1}^{n} a_j\phi_j(s)$$

and it can be shown (Jones 1973a) that the error in the series is $O(1/n)$ as $n \to \infty$ when the $l_1$ norm is employed. Appreciable benefit can therefore flow from following the procedure described.

Exercise 14. Examine the following integral equations by each of the above methods, making an appropriate choice of $\phi_j$ in each of the cases: (a) u(s) - 2 (b) u(s) -

f

f

(c) u(s) - - 1

2n

stu(t) dt

= s,

(st

+ t 2)u(t) dt = s,

JX

cos(s - t)u(t) dt =

(d) u(s) - -1

J1t {(s -1t

(e) u(s) -

u(t) lnlcos s - cos r] dt

r 8n

(f) u(s) -

S2,

-J(

t)2 - 21s - tl}u(t) dt

LX> u(t) cos(2st)

dt

= sin s,

= 1,

= e-'

Each of the integral equations holds on the interval of integration.

5.7 Equations of the first kind

The integral equations that have just been discussed have all been of the second kind. For an integral equation of the first kind the methods based on positive-definite operators, developed in the earlier sections, may be applicable. Many boundary-value problems can be formulated in terms of integral equations of either the first kind or the second kind, and it is natural to ask whether there are advantages in one or the other. The answer cannot be given in unadorned fashion, however, since it depends upon the circumstances. Suppose that the equation is

$$\mu u + Ku = f$$

where $K$ is a compact operator with a complete set of eigenelements $\{u_j\}$ and associated $\mu_j$ so that $Ku_j = \mu_j u_j$. Then, putting $f = \sum_{j=1}^{\infty} f_j u_j$, we obtain

$$u = \sum_{j=1}^{\infty} \frac{f_j}{\mu + \mu_j} u_j$$

as solution.

For an integral equation of the second kind $\mu \ne 0$ and, since $\mu_j \to 0$ as $j \to \infty$, the convergence of the series is roughly the same as that for the series for $f$. When $\mu = 0$, an integral equation of the first kind arises and the convergence is substantially worse than that for $f$ because of the factor $1/\mu_j$. The convergence becomes increasingly bad the more rapidly the $\mu_j$ tend to zero. In general, the smoother the kernel the faster the $\mu_j$ tend to zero (cf. (5.64)). Therefore, from the standpoint of numerical work, the more singular a kernel the better for an integral equation of the first kind. Otherwise, the algebraic equations resulting from the Galerkin process are likely to be ill conditioned. When the kernel is singular, as in many radiation problems, ill conditioning does not often turn out to be a problem with integral equations of the first kind.

A popular alternative to the Galerkin method for an integral equation of the first kind is based on iteration. In fact, the method can be applied to equations of the second kind as well in suitable circumstances and the treatment here does not differentiate between the two kinds. Let $X$ and $Y$ be Hilbert spaces. The problem is to solve

$$Lx = y$$

where $L$ is a bounded linear operator from $X$ to $Y$. Often for a given equation, $X$ and $Y$ are chosen so that $L$ has the requisite properties. This may well make $X$ and $Y$ different spaces especially when they are Sobolev spaces (see §4.14). To allow for this possibility the inner product in $X$ will be denoted by $(\cdot, \cdot)$ and that in $Y$ by $\langle\cdot, \cdot\rangle$.

The aim is to construct a sequence $\{x_m\}$ such that $x_m \to x$ or $y - Lx_m \to 0$ as $m \to \infty$ and such that the sequence enjoys some of the properties of the conjugate methods of §4.6; for this reason the iteration is frequently termed a conjugate gradient method. The sequence differs according to whether one selects $x_m \to x$ or $y - Lx_m \to 0$ but the principles of construction are the same in both cases. Therefore, only one will be described and the other left as an exercise for the reader. In the one to be discussed we seek to minimize $\|y - Lx_m\|$ where $\|y\|^2 = \langle y, y\rangle$. Put

$$x_m = x_{m-1} + \alpha_m p_m \qquad (m = 1, 2, \ldots)$$

where $\alpha_m$ is a complex scalar and $p_m$ is an element of $X$ to be specified shortly. The opening term $x_0$ may be chosen in any convenient fashion. Write $r_m$ for the residual $y - Lx_m$. Then, since $r_m = r_{m-1} - \alpha_m Lp_m$, it follows from §5.1 that $\|r_m\|$ is stationary for variations in $\alpha_m$ when

$$\langle r_m, Lp_m\rangle = 0 \tag{5.74}$$

or

$$\alpha_m = \langle r_{m-1}, Lp_m\rangle / \|Lp_m\|^2. \tag{5.75}$$

The $p_m$ are to be selected so that conjugacy holds but in a somewhat more general mode than in §4.6. Let $T$ be a bounded linear operator from $Y$ to $X$. Then, we take

$$p_1 = Tr_0, \qquad p_m = Tr_{m-1} + \beta_m p_{m-1} \quad (m = 2, 3, \ldots).$$

The operator $T$ is introduced to permit some control over the convergence and other features of the iteration. The choice of $\beta_m$ is to be such that

$$\langle Lp_m, Lp_{m-1}\rangle = 0 \qquad (m = 2, 3, \ldots) \tag{5.76}$$

so that

$$\beta_m = -\langle LTr_{m-1}, Lp_{m-1}\rangle / \|Lp_{m-1}\|^2. \tag{5.77}$$

Evidently, the iteration ceases if $\alpha_m = 0$ at any stage. There is no guarantee that the residual vanishes at this point. However, the numerator of $\alpha_m$ can be expressed as $\langle r_{m-1}, LTr_{m-1}\rangle$ on account of (5.74). It may be feasible to pick $T$ so that this quantity is non-zero. For example, if the adjoint $L^A$ is known, putting $T = L^A$ makes the numerator $(L^A r_{m-1}, L^A r_{m-1})$. On the other hand, if there is no obvious $T$ and $\alpha_m$ turns out to be zero, it may be acceptable to restart the iteration from the point reached with a different $T$. From now on it will be assumed that $\alpha_m \ne 0$. The conjugacy properties are contained in

THEOREM 5.7. For $n = 2, 3, \ldots$

$$\langle Lp_n, Lp_m\rangle = 0 \quad \text{for } m = 1, \ldots, n - 1,$$

and

$$\langle r_n, LTr_m\rangle = 0 \quad \text{for } m = 0, 1, \ldots, n - 1.$$

Proof. When $n = 2$, the first relation holds by (5.76). Also $\langle r_1, LTr_0\rangle = 0$ by (5.74) and

$$\langle r_2, LTr_0\rangle = \langle r_2, Lp_1\rangle = \langle r_2 - r_1, Lp_1\rangle = -\langle \alpha_2 Lp_2, Lp_1\rangle = 0$$

by (5.74) and (5.76). This gives in addition

$$\langle r_2, LTr_1\rangle = \langle r_2, LTr_1 + \beta_2 Lp_1\rangle = \langle r_2, Lp_2\rangle = 0$$

by (5.74). Therefore, for $n = 2$, the relations are verified. Now suppose they are true for $n = 2, 3, \ldots, N$. Then $\langle Lp_{N+1}, Lp_N\rangle = 0$ by (5.76) and, for $m = 1, \ldots, N - 1$,

$$\langle Lp_{N+1}, Lp_m\rangle = \langle LTr_N + \beta_{N+1}Lp_N, Lp_m\rangle = \langle LTr_N, Lp_m\rangle = \langle LTr_N, (r_{m-1} - r_m)/\alpha_m\rangle = 0$$

by assumption. Further

$$\langle r_{N+1}, Lp_N\rangle = \langle r_N - \alpha_{N+1}Lp_{N+1}, Lp_N\rangle = 0$$

and

$$\langle r_{N+1}, LTr_N\rangle = \langle r_{N+1}, LTr_N + \beta_{N+1}Lp_N\rangle = \langle r_{N+1}, Lp_{N+1}\rangle = 0$$

by (5.74) and (5.76). Moreover, for $m = 0, 1, \ldots, N - 1$

$$\langle r_{N+1}, LTr_m\rangle = \langle r_N - \alpha_{N+1}Lp_{N+1}, LTr_m\rangle = -\alpha_{N+1}\langle Lp_{N+1}, Lp_{m+1} - \beta_{m+1}Lp_m\rangle = 0$$

by what has been proved already. Hence, the relations are valid for $N + 1$ if they are for $N$. Since they hold for $n = 2$, induction completes the proof.


Convergence is covered by

THEOREM 5.7a. If $\alpha_m \neq 0$, $\|r_m\| < \|r_{m-1}\|$ and, if $LT$ has a bounded inverse such that

$$|(y, LTy)|\,\|(LT)^{-1}\| \ge \|y\|^2,$$

then

$$\|r_m\|^2 \le \left[1 - \{\|(LT)^{-1}\|\,\|LT\|\}^{-2}\right]\|r_{m-1}\|^2.$$

Proof.

$$\|r_m\|^2 = (r_m, r_{m-1} - \alpha_m Lp_m) = (r_m, r_{m-1}) = \|r_{m-1}\|^2 - |(r_{m-1}, Lp_m)|^2/\|Lp_m\|^2$$

from (5.75). The first part of the theorem has been demonstrated. For the second part

$$|(r_{m-1}, Lp_m)|^2 = |(r_{m-1}, LTr_{m-1})|^2 \ge \|r_{m-1}\|^4/\|(LT)^{-1}\|^2$$

by hypothesis. Also

$$\|Lp_m\|^2 = (Lp_m, LTr_{m-1}) = \|LTr_{m-1}\|^2 + \beta_m(Lp_{m-1}, LTr_{m-1}) = \|LTr_{m-1}\|^2 - |(Lp_{m-1}, LTr_{m-1})|^2/\|Lp_{m-1}\|^2$$

by (5.77). Thus

$$\|Lp_m\|^2 \le \|LTr_{m-1}\|^2 \le \|LT\|^2\|r_{m-1}\|^2$$

and the inequality of the theorem follows now. The proof is finished.

The inequality of Theorem 5.7a suggests that a good convergence rate is achieved by making $\|(LT)^{-1}\|\,\|LT\|$ as close to unity as is feasible. The straightforward choices of $T = I$ and $T = L^A$ can usually be improved on. For instance, $T = PP^AL^A$ gives $\|LT\| = \|LP\|^2 = \|P^AL^A\|^2$ (§3.3) and one can aim to have $P^AL^A$ or $LP$ a reasonable approximation to the identity. This choice also ensures the non-vanishing of $\alpha_m$. For further details of practical applications see van den Berg (1984), van den Berg and Kleinman (1988), Kleinman et al. (1990), Zwamborn and van den Berg (1991), Sarkar (1991), Xu (1992).
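The iteration of this section can be tried out on a small matrix problem. The sketch below is only illustrative and rests on several assumptions that are not fixed by the text: the spaces are real and finite-dimensional, the equation is written $Lx = f$ with residual $r = f - Lx$, the step length is taken as $\alpha_m = (r_{m-1}, Lp_m)/\|Lp_m\|^2$ (the form used in the proof of Theorem 5.7a), and $T = L^A$ is realized as the matrix transpose. The function name and its arguments are hypothetical.

```python
import numpy as np


def conjugate_iteration(L, f, T, tol=1e-10, max_iter=200):
    """Sketch of the iteration with p_1 = T r_0, p_m = T r_{m-1} + beta_m p_{m-1}.

    Assumes a real matrix L, the problem L x = f, and r = f - L x.
    Returns the approximate solution and the history of residual norms.
    """
    x = np.zeros(L.shape[1])
    r = f - L @ x                      # r_0
    history = [np.linalg.norm(r)]
    p = T @ r                          # p_1 = T r_0
    Lp = L @ p
    for _ in range(max_iter):
        denom = Lp @ Lp                # ||L p_m||^2
        if denom == 0.0:
            break
        alpha = (r @ Lp) / denom       # step length, assumed form of (5.75)
        if alpha == 0.0:
            break                      # the iteration ceases
        x = x + alpha * p
        r = r - alpha * Lp             # r_m = r_{m-1} - alpha_m L p_m
        history.append(np.linalg.norm(r))
        if history[-1] <= tol:
            break
        LTr = L @ (T @ r)
        beta = -(LTr @ Lp) / denom     # (5.77)
        p = T @ r + beta * p           # next direction
        Lp = LTr + beta * Lp           # keeps (L p_{m+1}, L p_m) = 0
    return x, history


# Usage with the choice T = L^A (here the transpose of a real matrix).
L = np.array([[4.0, 1.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
f = np.array([1.0, 2.0, 3.0])
x, history = conjugate_iteration(L, f, L.T)
```

With $T = L^A$ the numerator of $\alpha_m$ is $\|L^A r_{m-1}\|^2$, so the step cannot vanish before the residual does, and the recorded residual norms decrease from step to step as Theorem 5.7a asserts.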

Exercises

15. A perfectly conducting lamina of sides $2a$ and $2b$ is maintained at unit potential. The total charge density $\sigma$ satisfies

$$\int_{-a}^{a}\int_{-b}^{b} \frac{\sigma(s, t)\,ds\,dt}{\{(x - s)^2 + (y - t)^2\}^{1/2}} = 1$$

on the lamina. Find $\sigma$ by collocation, approximating $\sigma$ by piecewise constant functions, and determine the total charge on the lamina. If the potential is represented by the double layer

$$\frac{1}{2\pi}\int v\,\frac{\partial}{\partial n}\left(\frac{1}{r}\right) dS,$$

where $r$ is the distance from a point on the lamina to the point of observation, the integral equation

f

~ v ~ (~) dS 2n on,

v= 1

results. Compare the total charge obtained by this and a variational method with the previous one (Noble 1960, 1971).

16. Investigate the potential of a rectangular solid in a similar way to Exercise 15.

17. Construct a sequence $\{x_m\}$ for $x_m \to x$ on the same lines as in the text but using the inner product $(\cdot, \cdot)$ and norm $\|x\|^2 = (x, x)$. The conditions to be satisfied are $(x - x_m, p_m) = 0$ and $(p_m, p_{m-1}) = 0$. Show that, for $n = 2, 3, \ldots$, $(p_n, p_m) = 0$ for $m = 1, \ldots, n - 1$ and $(x - x_n, Tr_m) = 0$ for $m = 0, 1, \ldots, n - 1$. Show also that

$$\alpha_m = (x - x_{m-1}, p_m)/\|p_m\|^2 = (L^{-1}r_{m-1}, Tr_{m-1})/\|p_m\|^2, \qquad \beta_m = -(Tr_{m-1}, p_{m-1})/\|p_{m-1}\|^2.$$

Deduce that $\|x - x_m\| < \|x - x_{m-1}\|$ if $\alpha_m \neq 0$.

18. Solve the integral equations in Exercises 15 and 16 by iteration.

NUMERICAL TRIAL FUNCTIONS

5.8 Finite elements

One way of constructing trial functions is the method of finite elements. In this method the domain under consideration is split up into a finite number of elements and on each element an approximation is prepared which depends upon a number of parameters. These parameters are then determined by one of the procedures already described. In a sense, therefore, the method is subsumed under the foregoing. However, the name is usually reserved for approximants which are polynomials and elements of particular shape, usually triangles or rectangles in two dimensions and tetrahedra or hexahedra (i.e. solids with six faces) in three dimensions. As a rule, every effort is made to have the sides of the triangle straight or the faces of the tetrahedron flat in the respective cases. This is not always feasible when the element is adjacent to a boundary and triangles with a single curved side may have to be accepted. It is not, however, possible to organize a tetrahedral division in three dimensions so that no tetrahedron has more than one curved side. When dealing with a triangular division in two dimensions it is common practice to regard each triangle as having been transformed into a standard triangle. By this device it is only necessary to quote results for the standard triangle, the formulae for the original division being obtained by transformation.

Fig. 5.1. The standard triangle. [Axes $p$, $q$; vertices at $(1, 0)$, $(0, 1)$, and the origin.]

The standard triangle is taken to be right-angled with the two sides adjoining the right angle of unit length (Fig. 5.1) and placed in the $(p, q)$ plane. A triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ in the $(x, y)$ plane can be gained from the standard triangle by the transformation

$$\left.\begin{aligned} x &= x_3 + (x_1 - x_3)p + (x_2 - x_3)q, \\ y &= y_3 + (y_1 - y_3)p + (y_2 - y_3)q. \end{aligned}\right\} \tag{5.78}$$

The inverse transformation is

$$\left.\begin{aligned} Sp &= x_2y_3 - x_3y_2 + x(y_2 - y_3) - y(x_2 - x_3), \\ Sq &= x_3y_1 - x_1y_3 + x(y_3 - y_1) - y(x_3 - x_1) \end{aligned}\right\} \tag{5.79}$$

where

$$S = (x_1 - x_3)(y_2 - y_3) - (x_2 - x_3)(y_1 - y_3) \tag{5.80}$$

is twice the area of the triangle in the $(x, y)$ plane. So long as the triangle is genuine, i.e. its vertices are not collinear, $S$ is non-zero and $p$, $q$ can be determined. Often $p$ and $q$ are known as isoparametric co-ordinates when the same basis functions are used for interpolating the function as well as describing the geometry.

Let the vertices of the standard triangle be numbered 1, 2, 3 as shown in Fig. 5.1. Suppose that it is desired to interpolate the function $u$ so that it takes the correct values $u_1$, $u_2$, $u_3$ at the vertices. Then the interpolant $U$ is given by

$$U = pu_1 + qu_2 + (1 - p - q)u_3.$$

By means of the transformation (5.79) we can return to the original triangle. On the 23 side of this triangle $U$ will vary linearly from $u_2$ to $u_3$. This will also be true if the same process is carried out for the neighbouring triangle with the same side (Fig. 5.2). Since the linear variation is unique it follows that the interpolant built in this way is continuous over the whole triangular network. Thus a continuous trial function has been manufactured in which the unknowns are the values of the function at the vertices of the triangles.
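The transformation (5.78), its inverse (5.79)-(5.80), and the linear interpolant can be exercised together. The following is a minimal sketch in which the function names and the sample triangle are hypothetical; vertices are supplied in the order 1, 2, 3 used in the text.

```python
def triangle_S(verts):
    """S of (5.80): twice the (signed) area of the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    return (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)


def to_xy(p, q, verts):
    """Transformation (5.78) from the standard triangle to the (x, y) plane."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    x = x3 + (x1 - x3) * p + (x2 - x3) * q
    y = y3 + (y1 - y3) * p + (y2 - y3) * q
    return x, y


def to_pq(x, y, verts):
    """Inverse transformation (5.79); fails for collinear vertices (S = 0)."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    S = triangle_S(verts)
    if S == 0:
        raise ValueError("vertices are collinear")
    p = (x2 * y3 - x3 * y2 + x * (y2 - y3) - y * (x2 - x3)) / S
    q = (x3 * y1 - x1 * y3 + x * (y3 - y1) - y * (x3 - x1)) / S
    return p, q


def linear_interpolant(p, q, u1, u2, u3):
    """U = p u1 + q u2 + (1 - p - q) u3 on the standard triangle."""
    return p * u1 + q * u2 + (1.0 - p - q) * u3


# Usage: a sample triangle and a round trip through (5.78) and (5.79).
verts = [(2.0, 1.0), (4.0, 2.0), (1.0, 5.0)]
x, y = to_xy(0.3, 0.2, verts)
p, q = to_pq(x, y, verts)
```

Because the interpolant is linear in $p$ and $q$, and (5.78) is linear, any function that is linear in $x$ and $y$ is reproduced exactly from its three vertex values.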

Fig. 5.2. Linear interpolation on adjacent triangles.

Fig. 5.3. Quadratic interpolation.

Polynomials of higher degree grant the power of inserting unknown values at points other than the vertices. For example, with quadratics the mid-points 4, 5, 6 (Fig. 5.3) of the sides of the standard triangle can be allowed for. Now

$$U = p(2p - 1)u_1 + q(2q - 1)u_2 + r(2r - 1)u_3 + 4pqu_4 + 4qru_5 + 4rpu_6 \tag{5.81}$$

where $r = 1 - p - q$. Again this provides a trial function which is continuous over the whole triangular network. Cubics permit values at the points of trisection of the sides and one at the centroid of the triangle (Fig. 5.4). In this case

$$U = \tfrac{1}{2}p(3p - 1)(3p - 2)u_1 + \cdot + \cdot + \tfrac{9}{2}pq(3p - 1)u_4 + \tfrac{9}{2}pq(3q - 1)u_5 + \cdot + \cdot + \cdot + \cdot + 27pqru_{10}$$

the dots indicating terms derived in an obvious cyclic manner.

Fig. 5.4. Cubic interpolation.

It may happen that one may wish to take the values of some derivatives as unknowns instead of relying solely on function values. A quadratic possibility for Fig. 5.3 is

$$\begin{aligned} U = {} & \tfrac{1}{2}(p + q + p^2 - 2pq - q^2)u_1 + \tfrac{1}{2}(p + q - p^2 - 2pq + q^2)u_2 + (2pq + r)u_3 \\ & + q(1 - q)\left(\frac{\partial u}{\partial q}\right)_6 + p(1 - p)\left(\frac{\partial u}{\partial p}\right)_5 - 2^{-1/2}(p + q - p^2 - 2pq - q^2)\left(\frac{\partial u}{\partial n}\right)_4 \end{aligned}$$

and $\partial u/\partial n$ is a derivative normal to the 12 side out of the triangle. This formula is based on the simple difference formula for the derivative; thus $(\partial u/\partial q)_6$ arises from the difference of values at $(\tfrac{1}{2}, \tfrac{1}{2})$ and $(\tfrac{1}{2}, -\tfrac{1}{2})$. The significance of the derivatives in the $(x, y)$ plane can, of course, be deduced from (5.78) and (5.79). A cubic formula for Fig. 5.4 is

$$\begin{aligned} U = {} & \{p(3p - 2p^2) - 7pqr\}u_1 + \cdot + \cdot \\ & + p\{(x_1 - x_2)q(r - p) - (x_3 - x_1)r(q - p)\}\left(\frac{\partial u}{\partial x}\right)_1 \\ & + p\{(y_1 - y_2)q(r - p) - (y_3 - y_1)r(q - p)\}\left(\frac{\partial u}{\partial y}\right)_1 + \cdot + \cdot + \cdot + \cdot + 27pqru_{10} \end{aligned} \tag{5.82}$$

the dots representing terms obtained by cyclic interchange. All the interpolants so far supply continuity over the whole triangular network. If basis functions are required that also have continuous derivatives over the complete network, then it is necessary to have quintic polynomials at the very least. The formulae are lengthy and the reader is referred elsewhere (see, for example, Mitchell and Wait 1977) for details.

In three dimensions the natural analogue of the triangle is the tetrahedron, and again a standard tetrahedron can be introduced in a transparent way. Perhaps a more valuable element, however, is the hexahedron. This is first reduced to a standard cube (Fig. 5.5) in the $(p, q, r)$ space by the transformation

$$\begin{aligned} x = {} & pqrx_1 + (1 - p)qrx_2 + (1 - p)(1 - q)rx_3 + p(1 - q)rx_4 + pq(1 - r)x_5 \\ & + (1 - p)q(1 - r)x_6 + (1 - p)(1 - q)(1 - r)x_7 + p(1 - q)(1 - r)x_8 \end{aligned}$$

with similar expressions for $y$ and $z$. An interpolant can be achieved by replacing $x$ by $U$ and $x_i$ by $u_i$. Higher-order polynomials involving nodes on the faces of the cube can also be derived.

Fig. 5.5. The standard cube.

One important problem is still deserving of attention and that is the matter of coping with elements with curved sides. In principle, one can conceive of a transformation which converts such a triangle into the standard triangle but that will rarely be feasible in practice. Assume that the 13 and 23 sides of the triangle are straight but that the 12 side is curved. Consider

$$\begin{aligned} Sl &= (y_2 - y_3)x - (x_2 - x_3)y + x_2y_3 - x_3y_2, \\ Sm &= (y_3 - y_1)x - (x_3 - x_1)y + x_3y_1 - x_1y_3, \end{aligned}$$

where $\tfrac{1}{2}S$ is the area of the triangle with straight sides which has the same vertices as the curved triangle; an explicit expression will be found in (5.80). Then $l = 0$ is the straight 23 side and $l = 1$ is at the vertex 1. Similarly $m = 0$ is the 13 side and $m = 1$ at vertex 2. Therefore, in the $(l, m)$ plane the triangle has sides of unit length along the coordinate axes together with a curved side (Fig. 5.6). The equation of the curved side may be written as $f(l, m) = 0$, and, since $f$ does not vanish at the origin, it may be multiplied by a constant so that $f(0, 0) = 1$ without altering the curve. Although we do not wish to contemplate mapping from the $(l, m)$ plane to the standard triangle for arbitrary $f(l, m)$ we can examine what befalls when the standard triangle is moved to the $(l, m)$ plane by the connection

$$\left.\begin{aligned} l &= p + 2(2L - 1)pq, \\ m &= q + 2(2M - 1)pq. \end{aligned}\right\} \tag{5.83}$$

The sides $p = 0$ and $q = 0$ go into the sides $l = 0$ and $m = 0$ respectively, the image of a point being at the same distance from the origin. Also the mid-point of the third side of the standard triangle ($p = \tfrac{1}{2}$, $q = \tfrac{1}{2}$) transforms into the point $(L, M)$ which can be selected as any convenient point of the third side (Fig. 5.6). In general, the target will be to have $(L, M)$ so that the image of the third side of the standard triangle is acceptably close to $f(l, m) = 0$ but we can be sure of only three points being obtained exactly. The equation of the image of the third side is

$$\alpha(\alpha + \beta l - \alpha m)^2 + (\alpha + \beta)^2 l = (\alpha + \beta)(1 + \alpha)(\alpha + \beta l - \alpha m)$$

where $\alpha = 2(2L - 1)$, $\beta = 2(2M - 1)$. This is a parabola with axis parallel to the line $\beta l = \alpha m$. Thus (5.83) approximates the curved sides in the $(l, m)$ plane by a parabola. Since the relation between $(l, m)$ and $(x, y)$ is linear the approximant will differ from the true curved side unless it is a parabola in the $(x, y)$ plane.

Fig. 5.6. Triangle with one curved side.

If (5.81) is used as an interpolant, $p$ and $q$ are known as isoparametric coordinates and the trial function is continuous over a network composed of straight-sided triangles adjacent to triangles, with two straight and one curved side, around the perimeter of the region. The perimeter is, however, being approximated by a series of parabolas. When $f$ is a quadratic polynomial in $l$ and $m$, an interpolant can be found directly in the $(l, m)$ plane without the intervention of isoparametric coordinates (see Exercise 24). Cubic interpolation can be employed with isoparametric coordinates. Here Fig. 5.4 is the starting point and

$$\begin{aligned} l &= p + \tfrac{9}{2}pq(6l_{10} - l_4 - l_5 - 1) + \tfrac{27}{2}p^2q(l_4 - 2l_{10}) + \tfrac{27}{2}pq^2(l_5 - 2l_{10} + \tfrac{1}{3}), \\ m &= q + \tfrac{9}{2}pq(6m_{10} - m_4 - m_5 - 1) + \tfrac{27}{2}p^2q(m_4 - 2m_{10} + \tfrac{1}{3}) + \tfrac{27}{2}pq^2(m_5 - 2m_{10}). \end{aligned}$$

The points 1, 2, 3, 6, 7, 8, and 9 in the $(p, q)$ plane go over into the same points in the $(l, m)$ plane. In contrast, 4, 5, and 10 become $(l_4, m_4)$, $(l_5, m_5)$, and $(l_{10}, m_{10})$. The first two can be chosen to be any convenient pair of points on the curved side while the third can be made any suitable interior point. Now the approximation to the curved side will be a cubic curve which passes through the two end-points and two other points of the side. In certain circumstances


the image of the 12 side has a simpler equation than in general; for instance, when 14 = Is + t, ms = m4 + (when it is a parabola) or if 110 = t/4 ' ml O = tis. Isoparametric coordinates are easy to use for curved elements but they can be dangerous because they replace a bend by a polynomial curve. The deviation may be significant and generate large numerical errors. In fact, very serious discrepancies may be ·concealed by the simplicity of the approach, for one is apt to visualize the triangle of Fig. 5.3 being mapped into a triangle topologically similar to Fig. 5.6 and this need not be so. To see this, evaluate the Jacobian of the transformation (5.83). It will be discovered that

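$$\begin{vmatrix} \partial l/\partial p & \partial l/\partial q \\ \partial m/\partial p & \partial m/\partial q \end{vmatrix} = 1 + \alpha q + \beta p.$$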
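The determinant just quoted is easy to experiment with. The sketch below is illustrative only: the function names are hypothetical, and the value $L = M = \tfrac{1}{8}$ used in the usage lines is chosen here purely as an example of a fold, with $\alpha = 2(2L - 1)$ and $\beta = 2(2M - 1)$ as in (5.83). Since the determinant is linear in $p$ and $q$, its sign over the standard triangle is settled by its values at the three vertices.

```python
def iso_map(p, q, L, M):
    """The connection (5.83) from the standard triangle to the (l, m) plane."""
    alpha = 2.0 * (2.0 * L - 1.0)
    beta = 2.0 * (2.0 * M - 1.0)
    return p + alpha * p * q, q + beta * p * q


def jacobian(p, q, L, M):
    """Jacobian of (5.83): 1 + alpha*q + beta*p."""
    alpha = 2.0 * (2.0 * L - 1.0)
    beta = 2.0 * (2.0 * M - 1.0)
    return 1.0 + alpha * q + beta * p


def jacobian_positive_on_triangle(L, M):
    """True when the Jacobian stays positive on the closed standard triangle.

    The Jacobian is linear in (p, q), so checking the vertices suffices.
    """
    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    return all(jacobian(p, q, L, M) > 0.0 for p, q in corners)


# Usage: the mid-point of the third side goes to (L, M) ...
mid = iso_map(0.5, 0.5, 0.3, 0.4)
# ... and for the sample choice L = M = 1/8 two distinct points of the
# standard triangle share one image, so the map is not one-to-one.
first = iso_map(1.0 / 3.0, 0.0, 0.125, 0.125)
second = iso_map(2.0 / 3.0, 1.0 / 3.0, 0.125, 0.125)
```

In this example the second pre-image has $p + q = 1$, so it lies on the 12 side of the standard triangle, while the first lies on the side $q = 0$.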

The Jacobian is of one sign only if the straight line 1 + «q + Pp = 0 does not intersect the standard triangle. The line will have a point in common with the triangle if L ~ ! or M ~ i. When the Jacobian does vanish inside the triangle the transformation (5.83) is not one-to-one and, indeed, there are points (1, m) for which there are two corresponding points in the standard triangle. As an illustration let L = M = !. Then corresponding to I = t, m = 0 are p = !, q = 0 and p = ~, q = !. The second point has p + q = 1 and so lies on the side 12 of the standard triangle. In other words, the curved side in the (I, m) plane cuts the 1 axis twice as in Fig. 5.7. Far from having a triangle of the type in Fig. 5.6 we have a figure composed of three separate parts, each a sort of crescent shape. Consequently, there is a necessity for basis functions which deal with curved sides exactly. Investigation of the geometrical problems involved have been undertaken (Wachspress 1973; McLeod and Mitchell 1972) but it seems that substantial difficulties with integration can arise. Estimates of the error incurred in the finite-element method are not easy to come by. Frequently, they are stated in terms of Sobolev space (§§3.1 and 4.14) and then only when all the elements have straight sides or flat faces. For example, in two-dimensional (x, y) space the norm of the Sobolev space W",2

is defined by

$$\|w\|_{2,n}^2 = \sum_{s=0}^{n}\sum_{r=0}^{s} \iint \left|\frac{\partial^s w}{\partial x^r\,\partial y^{s-r}}\right|^2 dx\,dy,$$

the integration being over the domain under consideration.

Fig. 5.7. Multiple-valued mapping.

Suppose that $u$ satisfies an elliptic partial differential equation of order $2m$ or is given by a variational principle of positive-definite type and order $m$. Then trial functions in $W^{m,2}$ are considered. If all the triangles have straight sides and the trial functions $U_n$ are piecewise polynomials of degree $p$ ($p \ge m$) which are complete, the rate of convergence satisfies

where $h$ is a geometrical parameter (effectively the largest side occurring among the elements) which tends to zero as $n \to \infty$. The corresponding result in the $L_2$ norm is but little is known about the maximum norm. It may happen that one desires to employ trial functions that are not in $W^{m,2}$ in order to reduce the constraints in constructs. The finite elements are then said to be non-conforming and the above error bounds are no longer available. The bounds also assume that integration is carried out exactly so must be used with circumspection when the evaluation is by numerical quadrature.

Exercises

19. Show that (5.82) is, on the side of a triangle, a cubic which is uniquely determined by the values of the function and its first partial derivative at the vertices at the ends of the side. What does this imply about the continuity of the trial function over the triangular network?

20. Find the analogue of (5.78) for transforming a tetrahedron to the standard tetrahedron.

21. If $f(l, m) = 0$ is a conic show that

$$f(l, m) = al^2 + blm + cm^2 - (a + 1)l - (c + 1)m + 1$$

to satisfy the normalization conditions. If the conic is a hyperbola with $a = c = 0$ passing through $(L, M)$ show that

$$b = (L + M - 1)/LM$$

so long as neither $L$ nor $M$ is unity.

22. If $\psi(l, m)$ is unity at $(0, 0)$, varies linearly along $l = 0$ and $m = 0$, but vanishes on the curved side of Fig. 5.6 prove that

$$U = u_1W_1 + u_2W_2 + u_3W_3 + u_4W_4, \qquad W_3 = \psi,$$

is an interpolant, where

$$\begin{aligned} W_1 &= \frac{(1 - M)l + Lm - L}{1 - L - M} + \frac{L}{1 - L - M}\psi, \\ W_2 &= \frac{Ml + (1 - L)m - M}{1 - L - M} + \frac{M}{1 - L - M}\psi, \\ W_4 &= \frac{1 - l - m}{1 - L - M} - \frac{1}{1 - L - M}\psi. \end{aligned}$$

When the curved side is the hyperbola of Exercise 21 prove that $\psi = blm - l - m + 1$ and deduce the forms of $W_1$, $W_2$, and $W_4$. Hence demonstrate that this process provides a continuous interpolant across the curved side common to two adjoining triangles.

23. By taking $W_3 = 1 - l - m - (a + c - b)lm/(1 - al - cm)$ in Exercise 22 obtain an interpolant in rational functions for the conic of Exercise 21.

24. For a quadratic interpolant add $u_5W_5 + u_6W_6$ to the formula of Exercise 22, 5 and 6 being the mid-points of the straight sides. For the hyperbola of Exercise 21 put $W_3 = (1 - 2l - 2m)f(l, m)$ and show that

$$W_1 = l(1 - 2f - m/M), \qquad W_4 = lm/LM, \qquad W_5 = 4mf.$$

25. Study the possibility of a cubic interpolant as in Exercise 24 with the points 5, 6, 7, 8 the points of trisection of the straight sides.

Constructing conforming elements which represent vectors with prescribed continuity of components is a more intricate matter. Some of a group which have been devised by Nedelec (1980, 1982) will be described here. It is convenient to start with the representation for a standard tetrahedron. To distinguish quantities associated with the standard tetrahedron from those for any other tetrahedron a caret will be used. Thus, $\hat{x}$ stands for a vector of coordinates for the standard tetrahedron while $\hat{v}$ will be a vector in terms of these coordinates. The vertices of the standard tetrahedron are the origin and $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$. The vector $\hat{v}$ is to be represented by an interpolant $\hat{V}$ which has continuous tangential components across faces. A representation in terms of polynomials of the first degree is

$$\hat{V} = a + b \wedge \hat{x} \tag{5.84}$$

where $a$ and $b$ are constant vectors. The six conditions necessary to determine the coefficients in (5.84) are

$$a_1 = \int_0^1 \hat{v}_1(t, 0, 0)\,dt, \qquad a_2 = \int_0^1 \hat{v}_2(0, t, 0)\,dt, \qquad a_3 = \int_0^1 \hat{v}_3(0, 0, t)\,dt,$$

$$\begin{aligned} b_1 &= a_2 - a_3 + \int_0^1 \{\hat{v}_3(0, 1 - t, t) - \hat{v}_2(0, 1 - t, t)\}\,dt, \\ b_2 &= a_3 - a_1 + \int_0^1 \{\hat{v}_1(t, 0, 1 - t) - \hat{v}_3(t, 0, 1 - t)\}\,dt, \\ b_3 &= a_1 - a_2 + \int_0^1 \{\hat{v}_2(1 - t, t, 0) - \hat{v}_1(1 - t, t, 0)\}\,dt. \end{aligned}$$
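The six edge conditions can be checked numerically. The sketch below is an illustration under stated assumptions rather than Nedelec's own procedure: the function names are hypothetical, the edge integrals are evaluated by a four-point Gauss-Legendre rule, and the third of the conditions for $b$ is taken with the integrand $\hat{v}_2 - \hat{v}_1$ along the edge from $(1, 0, 0)$ to $(0, 1, 0)$, which is the form that a field $a + b \wedge \hat{x}$ satisfies.

```python
import numpy as np

# Gauss-Legendre nodes and weights transferred from [-1, 1] to [0, 1].
_nodes, _weights = np.polynomial.legendre.leggauss(4)
_nodes = 0.5 * (_nodes + 1.0)
_weights = 0.5 * _weights


def edge_integral(g):
    """Approximate the integral of g(t) over 0 <= t <= 1."""
    return sum(w * g(t) for t, w in zip(_nodes, _weights))


def nedelec_coefficients(v):
    """Coefficients a, b of V = a + b ^ x from the six edge conditions.

    v maps a point of the standard tetrahedron (length-3 array) to a
    length-3 array of vector components.
    """
    pt = lambda *c: np.array(c, dtype=float)
    a1 = edge_integral(lambda t: v(pt(t, 0, 0))[0])
    a2 = edge_integral(lambda t: v(pt(0, t, 0))[1])
    a3 = edge_integral(lambda t: v(pt(0, 0, t))[2])
    b1 = a2 - a3 + edge_integral(
        lambda t: v(pt(0, 1 - t, t))[2] - v(pt(0, 1 - t, t))[1])
    b2 = a3 - a1 + edge_integral(
        lambda t: v(pt(t, 0, 1 - t))[0] - v(pt(t, 0, 1 - t))[2])
    b3 = a1 - a2 + edge_integral(
        lambda t: v(pt(1 - t, t, 0))[1] - v(pt(1 - t, t, 0))[0])
    return np.array([a1, a2, a3]), np.array([b1, b2, b3])


def interpolant(a, b, x):
    """Evaluate V = a + b ^ x at the point x."""
    return a + np.cross(b, x)


# Usage: a field already of the form (5.84) is reproduced exactly.
a0 = np.array([1.0, -2.0, 0.5])
b0 = np.array([0.3, 0.7, -1.1])
a, b = nedelec_coefficients(lambda x: a0 + np.cross(b0, x))
```

Only tangential components of the field along the six edges enter the conditions, which is the property exploited in the continuity argument.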

The method of Nedelec is not restricted to polynomials of the first degree but the number of coefficients to be found grows rapidly with degree. For instance, 20 coefficients are needed for quadratic interpolants and 45 for cubics. The tangential components of $\hat{V}$ on $\hat{x}_2 = 0$ are $\hat{V}_1 = a_1 + b_2\hat{x}_3$ and $\hat{V}_3 = a_3 - b_2\hat{x}_1$. From the definitions of $a$ and $b$ it is evident that these components involve only the tangential components of $\hat{v}$. Therefore, if the tangential components of $\hat{v}$ are continuous, an adjacent tetrahedron which has the same face will produce an interpolant which has the same tangential components as $\hat{V}$. It can be checked that this property is valid for each face and so (5.84) provides an interpolant with continuous tangential components. Another feature of (5.84) is that, if $\operatorname{curl} \hat{V} = 0$,

$$\hat{V} = \operatorname{grad}(a \cdot \hat{x}).$$

This result, that $\hat{V}$ is a gradient when its curl vanishes, carries over to the interpolants of higher degree.

Now suppose that a representation is required for $v$ on an arbitrary tetrahedron. The tetrahedron is mapped into the standard tetrahedron by the linear transformation

$$x = B\hat{x} + c \tag{5.85}$$

where the constant vector $c$ and matrix $B$ are known as soon as the position of the arbitrary tetrahedron is fixed. The matrix $B$ is non-singular so long as the arbitrary tetrahedron is a genuine tetrahedron, i.e. has non-zero volume. The connection between a vector for the arbitrary tetrahedron and its counterpart for the standard tetrahedron is fixed by

$$\hat{v} = B^T v. \tag{5.86}$$

The purpose of this relation is to ensure that certain properties are true for both tetrahedra. Let $C$ be the $3 \times 3$ matrix whose $ij$ component is

$$\frac{\partial v_i}{\partial x_j} - \frac{\partial v_j}{\partial x_i}$$

and $\hat{C}$ the corresponding matrix in which $v$, $x$ are replaced by $\hat{v}$, $\hat{x}$ respectively. Then (5.85) and (5.86) imply that

$$\hat{C} = B^T C B.$$

Consequently, when the curl of a vector is zero in one of the tetrahedra the curl of its counterpart in the other tetrahedron also vanishes. If $f$ is a scalar

$$\widehat{\operatorname{grad}}\, f = B^T \operatorname{grad} f$$

which shows, if $n$ is the normal to a face of the arbitrary tetrahedron, that the normal $\hat{n}$ to the corresponding face of the standard tetrahedron is parallel to $B^T n$. If $\hat{t}$ is a vector tangential to the face so that $\hat{t} \cdot \hat{n} = 0$ it follows that $\hat{t} \cdot B^T n = 0$, i.e. $B\hat{t} \cdot n = 0$. Thus, $t$ is parallel to $B\hat{t}$. Hence $v \cdot t$ is a scalar multiple of $\hat{v} \cdot \hat{t}$. In other words, vectors which are tangential to a face of one tetrahedron remain tangential on the other under the transformations (5.85) and (5.86). Therefore, all the properties which have been ascribed to the standard tetrahedron can be attributed now to the arbitrary tetrahedron. To put it another way, an interpolant with continuous tangential components has been obtained for the arbitrary tetrahedron.

Because these elements treat tangential components specially it is easy to arrange that an interpolant has such components zero on a face. Accordingly, they are very convenient for boundary value problems where $n \wedge v = 0$ on a surface provided that the surface can be approximated reasonably by tetrahedra. Nedelec has extended the theory to representations with continuous tangential components on the standard cube (Fig. 5.5) and derived convergence results (see also Girault and Raviart 1986; Monk 1991).

Continuity of normal, instead of tangential, components can be secured by replacing (5.84) by

$$\hat{V} = a + b\hat{x} \tag{5.87}$$

where each of the four conditions to fix $a$ and $b$ is of the form

$$\int (\hat{v} - \hat{V}) \cdot \hat{n}\,dS = 0$$

with the integral over one of the faces. The mapping of the arbitrary tetrahedron to the standard one is still given by (5.85) but the vectors are related by

$$v = B\hat{v} \tag{5.88}$$

instead of (5.86). It follows then that $\hat{v} \cdot \hat{n}$ is a scalar multiple of $v \cdot n$ so that normal components remain normal under the transformation. Furthermore

$$\widehat{\operatorname{div}}\,\hat{v} = \operatorname{div} v. \tag{5.89}$$

These conforming elements are suitable for problems in which the normal component of a vector vanishes on the boundary. As an illustration consider Maxwell's equations of §2.1 in a perfectly conducting cavity resonator. Assume that all functions occurring from now on

are in $L_2$ and let

$$(u, v) = \int u \cdot v$$

with integration over the interior of the resonator. If $\operatorname{curl} u$ is in $L_2$, (2.2) gives

$$(H, \operatorname{curl} u) - \left(\varepsilon\frac{\partial E}{\partial t}, u\right) = (J, u)$$

when $n \wedge u = 0$ on the surface of the resonator. On the other hand, for any $w$, (2.1) supplies

$$(\operatorname{curl} E, w) + \left(\mu\frac{\partial H}{\partial t}, w\right) = 0.$$

If, now, the resonator is approximated by tetrahedra, we can try expressing $E$ in the form derived via (5.84)-(5.86) as long as its tangential components vanish on the boundary of the resonator. Likewise, $u$ can be chosen to be an arbitrary element of the same type. For $H$ and $w$ an obvious choice is from (5.87), (5.88) but it may not be the best; there is some evidence that making $H$ and $w$ piecewise constant vectors is more efficacious. At any rate, the arbitrariness of $u$ and $w$ leads to a set of linear ordinary differential equations for the coefficients in $E$ and $H$. They are subject to whatever initial conditions have been prescribed (in a format to match the foregoing). If the domain is appropriate cubes can be used instead of tetrahedra. Then one obtains differential equations which bear some resemblance to those proposed by Yee (1966) who replaced the space derivatives in (2.1) and (2.2) by centred differences.

5.9 Finite differences

Variational principles furnish a method of deriving finite difference approximations to differential equations, whether ordinary or partial. It will be sufficient to demonstrate the technique for

$$-\frac{d^2u}{dt^2} + \sigma(t)u = f(t)$$

on $a \le t \le b$, subject to the boundary conditions $u(a) = \alpha$, $u(b) = \beta$. The associated variational principle is to minimize

$$\frac{1}{2}\int_a^b \left\{\left(\frac{dw}{dt}\right)^2 + \sigma w^2 - 2wf\right\} dt$$

over functions $w$ which have piecewise continuous first derivatives on $[a, b]$ and which satisfy $w(a) = \alpha$, $w(b) = \beta$. Select points $t_0 = a, t_1, \ldots, t_n, t_{n+1} = b$ on $[a, b]$. Now approximate the integral by making central difference approximations so that the replacements

$$\int_{t_j}^{t_{j+1}} \left(\frac{dw}{dt}\right)^2 dt = \frac{(w_{j+1} - w_j)^2}{t_{j+1} - t_j}, \qquad \int_{t_j}^{t_{j+1}} g(t)\,dt = \tfrac{1}{2}(g_{j+1} + g_j)(t_{j+1} - t_j),$$

where $w_j = w(t_j)$, spring forth. Then the discrete analogue of the integral to be minimized is

$$\frac{1}{2}\sum_{j=0}^{n} \left\{\frac{(w_{j+1} - w_j)^2}{t_{j+1} - t_j} + \left(\tfrac{1}{2}\sigma_{j+1}w_{j+1}^2 + \tfrac{1}{2}\sigma_jw_j^2 - w_{j+1}f_{j+1} - w_jf_j\right)(t_{j+1} - t_j)\right\}.$$

Remembering that $w_0 = \alpha$, $w_{n+1} = \beta$ we minimize this with respect to the parameters $w_1, \ldots, w_n$ leading to

$$\frac{2(w_j - w_{j-1})}{t_j - t_{j-1}} + \frac{2(w_j - w_{j+1})}{t_{j+1} - t_j} + (\sigma_jw_j - f_j)(t_{j+1} - t_{j-1}) = 0 \tag{5.90}$$

for $j = 1, \ldots, n$. Alternatively, a Galerkin process could be followed with the linear interpolant

$$w(t) = w_j + \frac{(w_{j+1} - w_j)(t - t_j)}{t_{j+1} - t_j} \quad (t_j \le t \le t_{j+1}).$$

Then the system

$$\frac{w_j - w_{j-1}}{t_j - t_{j-1}} + \frac{w_j - w_{j+1}}{t_{j+1} - t_j} + \int_{t_{j-1}}^{t_{j+1}} (\sigma w - f)g_j\,dt = 0$$

appears, where $g_j$ is piecewise linear, vanishing at $t_{j-1}$ and $t_{j+1}$ but unity at $t_j$. The system is similar to but different from (5.90). The differences disappear if the integrals are approximated as above.

The advantage of the variational approach is that it automatically takes care of natural boundary conditions in formulating the finite-difference scheme. For example, suppose that the boundary conditions are changed to

$$\alpha_1u(a) - \beta_1u'(a) = \gamma_1; \qquad \alpha_2u(b) + \beta_2u'(b) = \gamma_2.$$

Then we consider the minimization of

$$\frac{1}{2}\int_a^b \left\{\left(\frac{dw}{dt}\right)^2 + \sigma w^2 - 2wf\right\} dt + \frac{1}{2\beta_1}\{\alpha_1w^2(a) - 2\gamma_1w(a)\} + \frac{1}{2\beta_2}\{\alpha_2w^2(b) - 2\gamma_2w(b)\}$$

as long as $\beta_1 > 0$ and $\beta_2 > 0$. There is now no restriction on the behaviour of the trial function at the end-points. The path already described may be followed though there are now two extra parameters $w_0$ and $w_{n+1}$ to be varied. The equations which result from their variation correspond to certain finite-difference replacements for the boundary conditions.

Since the discrete analogue stems from a variational principle it will reproduce any positive-definite properties in the original. This ensures that boundary elements are treated in a way which does not destroy this property, a facility which may not be easily available if a direct attack is made as in Chapter 2. If a differential equation is not self-adjoint so that a variational principle is not to hand, difference equations can be provided by integrating the equation once and then using central differences for derivatives and integrals in the manner already indicated.
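The scheme (5.90) leads to a symmetric tridiagonal system that can be assembled directly on a non-uniform grid. The following sketch is illustrative only: it assumes the fixed end values $w_0 = \alpha$, $w_{n+1} = \beta$, uses a dense solver for brevity, and the function name, its arguments, and the sample grid are hypothetical.

```python
import numpy as np


def solve_two_point(t, sigma, f, alpha, beta):
    """Solve -u'' + sigma(t) u = f(t), u(a) = alpha, u(b) = beta via (5.90).

    t is the increasing grid t_0 = a, ..., t_{n+1} = b (need not be uniform).
    Returns the values w_0, ..., w_{n+1}.
    """
    t = np.asarray(t, dtype=float)
    n = len(t) - 2
    h = np.diff(t)                     # h[j] = t_{j+1} - t_j
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    for j in range(1, n + 1):
        i = j - 1                      # row for the unknown w_j
        span = t[j + 1] - t[j - 1]
        A[i, i] = 2.0 / h[j - 1] + 2.0 / h[j] + sigma(t[j]) * span
        rhs[i] = f(t[j]) * span
        if j > 1:
            A[i, i - 1] = -2.0 / h[j - 1]
        else:
            rhs[i] += 2.0 * alpha / h[j - 1]   # known value w_0
        if j < n:
            A[i, i + 1] = -2.0 / h[j]
        else:
            rhs[i] += 2.0 * beta / h[j]        # known value w_{n+1}
    w = np.linalg.solve(A, rhs)
    return np.concatenate(([alpha], w, [beta]))


# Usage: -u'' + u = 0 on [0, 1] with u(0) = 0, u(1) = 1 on a stretched grid;
# the exact solution is sinh(t)/sinh(1).
s = np.linspace(0.0, 1.0, 41)
t = s + 0.1 * s * (1.0 - s)
w = solve_two_point(t, lambda x: 1.0, lambda x: 0.0, 0.0, 1.0)
error = np.max(np.abs(w - np.sinh(t) / np.sinh(1.0)))
```

For non-negative $\sigma$ the assembled matrix is symmetric and positive definite, reflecting the remark that the discrete analogue inherits positive-definite properties from the variational principle.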

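Exercises

26. Generalize the technique to the equation

$$-\frac{d}{dt}\left\{p(t)\frac{du}{dt}\right\} + \sigma u = f.$$

27. Extend the method to the partial differential equation

$$-\frac{\partial}{\partial x}\left\{p(x, y)\frac{\partial u}{\partial x}\right\} - \frac{\partial}{\partial y}\left\{p(x, y)\frac{\partial u}{\partial y}\right\} + \sigma u = f$$

on a finite domain subject to $\alpha(x, y)u + \beta\,\partial u/\partial n = \gamma$ on the boundary.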

5.10 Comparison between finite difference and finite element

In the preceding section it has been remarked that the equations obtained by finite differences and finite elements may bear a distinct similarity. This is not always true, but it raises the question as to whether one method might be preferred to the other. A brief comparison between the methods may therefore be worthwhile and assist the reader in his choice of procedure.

In finite differences a regular grid is usually adopted, except possibly near the boundary, and the differential equation is replaced by a local difference approximation. Contrariwise, in finite elements the region is divided into elements (usually triangular) and the unknown is approximated by a polynomial (containing parameters) over each element. Equations for the parameters are obtained on a global basis by Galerkin's process or the variational method.

The accuracy of finite differences is usually improved by using more extensive formulae, e.g. by substituting a nine-point approximation for a five-point approximation. In contrast, for finite elements the common technique is to increase the degree of the interpolating polynomial.

The algebraic equations obtained are usually suitable for iterative procedures when derived via finite differences but need not be so for finite elements, so that direct methods are more likely to be relevant to finite elements. On the other hand, finite elements will usually cope with boundary conditions automatically whereas interpolatory approximation is often adopted in finite differences. Finally, while irregular grids are relatively uncommon in finite differences they are habitual in finite elements.

Broadly speaking, therefore, finite elements are probably better at coping with complex geometries whereas finite differences are good for initial value problems and simple geometries. When non-linearities are present there is probably not much to choose between them; the backing of variational principles may well not be available.

5.11 Eigenvalues

From time to time it is desirable to determine eigenvalues by a Galerkin process or to find values of $\lambda$ for which $\lambda u - Ku = 0$. Actually, this spreads its net more widely since it aims at the spectrum (§3.6) of $K$ rather than the eigenvalues alone. The theory tends to be elaborate since eigenvalue problems are non-linear even when the operator is linear. Nevertheless, it can be shown that, if $K$ is compact and conditions A are satisfied, any open set which contains the spectrum of $K$ will also contain the spectrum of $F_nK$ when $n$ is sufficiently large. Indeed, if a sequence of eigenvalues and unit eigenvectors of $F_nK$ converges to $\lambda$ and $u$ as $n \to \infty$, then $\lambda$ is an eigenvalue of $K$ and $u$ is its corresponding unit eigenvector. For a positive definite operator it is sufficient (Dovbysh 1962, 1965) for both eigenvalues and eigenvectors that $\{\phi_m\}$ be strongly minimal in $H_P$.

NUMERICAL INTEG RATION 5.12 Quadrature Frequently, the Galerkin process applied to a practical problem involves numerical integration to calculate the coefficients in the algebraic system. It will not, therefore, be out of place at this juncture to say something about the relative merits of methods for quadrature. In its simplest form quadrature consists of specifying n + 1 interpolation points xo, Xl'.'.' x, on an interval [a, b] and then determining weights \Vo, · · · , Wn so that

f

b

a

n

f(x) dx ~ ,~o wr/(x,)

(5.91)

to some acceptable degree of accuracy. If (5.91) holds exactly for all polynomials of degree p or less but not for a polynomial of degree p + 1 the method is said to be of order p. To find the w, to achieve order p we solve the p + 1 linear alge braic eq ua tions

277

NUMERICAL INTEGRATION

Ib where

~

(X - a)- dx

= ,to w,(x,

(s

- a)

= 0, 1,... , p)

(5.92)

is any convenient constant; this device is known as the method of

undetermined weights.

For example, suppose n = 0. Then (5.92) with s = 0 is satisfied if $w_0 = b - a$. With this choice of $w_0$ it is satisfied for s = 1 only if $x_0 = \tfrac12(a + b)$. If we choose $x_0 = a$, we obtain the forward rectangle rule

$$\int_a^b f(x)\,dx \approx (b-a)f(a)$$

while $x_0 = b$ gives the backward rectangle rule

$$\int_a^b f(x)\,dx \approx (b-a)f(b).$$

Both these methods are of order 0. By comparison, $x_0 = \tfrac12(a + b)$ yields a method of order 1, the mid-point rule

$$\int_a^b f(x)\,dx \approx (b-a)f\{\tfrac12(a+b)\}.$$

By increasing the number of interpolating points, methods of higher order may be attained, but the more profuse the points the more intricate the formulae become. For the moment, suppose the points are equally spaced. Then, for n = 1, we have the trapezoidal rule

$$\int_a^b f(x)\,dx \approx \tfrac12(b-a)\{f(a) + f(b)\}$$

which is of order 1. For n = 2, there is the five-eight rule

$$\int_a^b f(x)\,dx \approx \tfrac{1}{12}(b-a)\{5f(a) + 8f(b) - f(2b-a)\}$$

of order 2 and Simpson's rule

$$\int_a^b f(x)\,dx \approx \tfrac16(b-a)\{f(a) + 4f\{\tfrac12(a+b)\} + f(b)\}$$

of order 3. A method of order 5 is

$$\int_a^b f(x)\,dx \approx \tfrac{1}{180}(b-a)\left\{14f(a) + 64f\left(\frac{3a+b}{4}\right) + 24f\left(\frac{a+b}{2}\right) + 64f\left(\frac{a+3b}{4}\right) + 14f(b)\right\}$$

sometimes known as the Newton-Cotes rule.


To integrate over an interval to a high degree of accuracy there are two courses open. Having subdivided the interval by interpolating points one can base the rule on a single high-order interpolating polynomial. This is usually unsatisfactory and, indeed, there is no guarantee that the result will converge, even for a continuous function, as the number of subdivisions grows. Alternatively, different polynomials can be employed on different subintervals. For instance, if b - a = nh, the application of Simpson's rule in this way would give

$$\int_a^b f(x)\,dx \approx \tfrac13 h\{f(a) + 4f(a+h) + 2f(a+2h) + 4f(a+3h) + \cdots + 4f(b-h) + f(b)\}.$$
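The composite form of Simpson's rule can be sketched as follows. The function name `composite_simpson` is illustrative only, and the sketch assumes that the number of subintervals n is even so that the pattern of weights 1, 4, 2, ..., 4, 1 closes properly.

```python
import math


def composite_simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with n equal subintervals (n even)."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        # Odd interior points carry weight 4, even interior points weight 2.
        total += (4 if j % 2 else 2) * f(a + j * h)
    return h * total / 3
```

For example, `composite_simpson(math.exp, 0.0, 0.5, 4)` may be compared with the exact value `math.exp(0.5) - 1`.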

In general, this procedure furnishes tolerable accuracy and is much less objectionable than its alternative. Error estimates are customarily expressed in terms of a derivative of f. If one does not know the derivative, the estimate may be of little value though it may indicate how convergence will improve with finer subdivision so long as the existence of the derivative can be assumed. A posteriori estimates might be more helpful but are harder to come by. In settling the error it will be assumed that all the interpolation points lie in [a, b] or to the right of b, though the method can be adapted straightforwardly to the more general case. Suppose that all the interpolation points are contained in $[a, \beta]$ with $\beta \ge b$. Then by Taylor's theorem,

$$f(x) = f(a) + (x-a)f'(a) + \cdots + \frac{(x-a)^p}{p!}f^{(p)}(a) + \int_a^x \frac{(x-t)^p}{p!} f^{(p+1)}(t)\,dt.$$

Since the method is of order p

$$\int_a^b f(x)\,dx - \sum_{r=0}^{n} w_r f(x_r) = \int_a^b \int_a^x \frac{(x-t)^p}{p!} f^{(p+1)}(t)\,dt\,dx - \sum_{r=0}^{n} w_r \int_a^{x_r} \frac{(x_r-t)^p}{p!} f^{(p+1)}(t)\,dt.$$

Interchanging the order of integration in the first term on the right-hand side we have

$$\int_a^b f(x)\,dx - \sum_{r=0}^{n} w_r f(x_r) = \int_a^\beta \left\{\frac{(b-t)_+^{p+1}}{p+1} - \sum_{r=0}^{n} w_r (x_r-t)_+^{p}\right\} \frac{f^{(p+1)}(t)}{p!}\,dt$$

where the function $x_+$ is defined in §1.1. According to the mean value or intermediate value theorem for integrals (Jones and Jordan 1969), if

$$\frac{(b-t)_+^{p+1}}{p+1} - \sum_{r=0}^{n} w_r (x_r-t)_+^{p} \tag{5.93}$$

does not change sign as t ranges from a to $\beta$,

$$\int_a^b f(x)\,dx - \sum_{r=0}^{n} w_r f(x_r) = \left\{\frac{(b-a)^{p+2}}{p+2} - \sum_{r=0}^{n} w_r (x_r-a)^{p+1}\right\} \frac{f^{(p+1)}(\xi)}{(p+1)!} \tag{5.94}$$

for some $\xi$ in $(a, \beta)$. Eqn (5.94) provides the desired error estimate, its validity resting on the absence of change of sign in (5.93) which is often easy to check. It may also be noted that, if the right-hand side of (5.94) vanishes, the rule is at least of order p + 1 and not p as originally surmised. For the forward rectangle rule only the first term of (5.93) survives so that (5.94) holds and the error is

$$\tfrac12(b-a)^2 f'(\xi)$$

with $\xi$ in (a, b). Similarly, for the backward rectangle and mid-point rules the errors are

$$-\tfrac12(b-a)^2 f'(\xi) \quad\text{and}\quad \tfrac{1}{24}(b-a)^3 f''(\xi)$$

respectively. The quantity $\xi$ need not, of course, have the same value in each of these formulae. Formula (5.94) may also be confirmed for the trapezoidal, five-eight, Simpson, and Newton-Cotes rules, the respective errors being

$$-\tfrac{1}{12}(b-a)^3 f''(\xi), \qquad \tfrac{1}{24}(b-a)^4 f'''(\xi), \qquad -\tfrac{1}{2880}(b-a)^5 f^{\mathrm{iv}}(\xi), \qquad -\frac{(b-a)^7}{1935360} f^{\mathrm{vi}}(\xi).$$

In all cases $\xi$ lies in (a, b) except for the five-eight rule where it is in (a, 2b - a). It needs to be emphasized that these estimates are theoretical. They can be exceeded in practical computations because no allowance has been made for round-off error in the theory.

The rules given so far have all been for equal spacing between the points of interpolation. It is possible to obtain rules of higher order without increasing n by permitting unequal spacing. The idea behind Gaussian quadrature is to select the n + 1 points so that the method is of order 2n + 1. The device can be displayed most simply by transforming the interval (a, b) to (-1, 1) by

$$x = \tfrac12\{(b-a)y + a + b\}$$

so that

$$\int_a^b f(x)\,dx = \frac{b-a}{2}\int_{-1}^{1} g(y)\,dy$$

where $g(y) = f[\tfrac12\{(b-a)y + a + b\}]$. Now look for a formula such that

$$\int_{-1}^{1} g(y)\,dy \approx \sum_{r=0}^{n} w_r g(y_r)$$

and which is of order 2n + 1.


The points $y_r$ are selected as the zeros of the Legendre polynomial $P_{n+1}$ (§1.5), i.e. $P_{n+1}(y_r) = 0$. There are n + 1 simple zeros in (-1, 1). We first choose the $w_r$ so that the method is of order n. Obviously, this can be done by specifying

$$w_r = \int_{-1}^{1} \prod_{j \ne r} \frac{y - y_j}{y_r - y_j}\,dy.$$

Our target is to show that, in fact, a method of order 2n + 1 has been derived by this choice. Let P(y) be any polynomial of degree 2n + 1 or less. Then by subtracting the appropriate constant multiple of $y^m P_{n+1}(y)$ from P(y) we can write

$$P(y) = Q(y)P_{n+1}(y) + R(y)$$

where Q and R are polynomials of degree at most n. Now

$$\int_{-1}^{1} R(y)\,dy = \sum_{r=0}^{n} w_r R(y_r)$$

by what has been proved already. Further Q, being of degree n, can be expanded as a linear combination of $P_0, \ldots, P_n$ and then the orthogonal properties of Legendre polynomials (§1.5) enforce

$$\int_{-1}^{1} Q(y)P_{n+1}(y)\,dy = 0.$$

Since $P_{n+1}(y_r) = 0$, the sum $\sum_r w_r P(y_r)$ reduces to $\sum_r w_r R(y_r)$. Thus it has been established that Gaussian quadrature is of order 2n + 1. If n = 0, $y_0 = 0$ and $w_0 = 2$. If n = 1, $y_0 = -1/\sqrt3$, $y_1 = 1/\sqrt3$, and $w_0 = w_1 = 1$. If n = 2, $y_0 = -(3/5)^{1/2}$, $y_1 = 0$, $y_2 = (3/5)^{1/2}$, and $w_0 = w_2 = \tfrac59$, $w_1 = \tfrac89$. For higher values of n, the reader should consult the relevant tables. The error estimate (5.94) is still valid and it is found that

$$\int_{-1}^{1} g(y)\,dy - \sum_{r=0}^{n} w_r g(y_r) = \frac{2^{2n+3}\{(n+1)!\}^4}{(2n+3)\{(2n+2)!\}^3}\, g^{(2n+2)}(\xi).$$
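As a small numerical check of the order 2n + 1, the rules for n = 0, 1, 2 can be coded directly and applied on a general interval through the linear change of variable. The names `GAUSS_RULES` and `gauss_legendre` are illustrative; the single-point weight is taken as 2 so that constants are integrated exactly over (-1, 1).

```python
import math

# Nodes and weights on (-1, 1) for n = 0, 1, 2 (n + 1 points each).
GAUSS_RULES = {
    0: ([0.0], [2.0]),
    1: ([-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)], [1.0, 1.0]),
    2: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)], [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]),
}


def gauss_legendre(f, a, b, n):
    """Approximate int_a^b f(x) dx with the (n + 1)-point Gaussian rule."""
    nodes, weights = GAUSS_RULES[n]
    half = 0.5 * (b - a)   # scale factor (b - a)/2 of the change of variable
    mid = 0.5 * (a + b)
    return half * sum(w * f(half * y + mid) for y, w in zip(nodes, weights))
```

With n = 1 a cubic is integrated exactly and with n = 2 a quintic is, whereas $y^6$ on (-1, 1) is not reproduced by the three-point rule.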

To obtain the original integral replace g by f and $2^{2n+3}$ by $(b-a)^{2n+3}$. An alternative point of view is to fix $w_0 = w_1 = \cdots = w_n = w$ and then prosecute a search for $y_r$ which gives a quadrature formula of order n + 1. This is known as Chebyshev quadrature. It will be discovered that the $y_r$ can be real only if $n \ne 7$ and $0 \le n \le 8$. For odd values of n the order is n + 2 rather than n + 1.

At this stage all the quadrature rules have been declared in terms of function values alone but one can entertain the potential presence of derivatives. A convenient formula can be derived by introducing the Bernoulli polynomials $B_n(t)$ defined by

$$z\,\frac{e^{tz}-1}{e^{z}-1} = \sum_{n=1}^{\infty} \frac{z^n}{n!} B_n(t) \tag{5.95}$$

for $|z| < 2\pi$. There is no difficulty in deducing the first few polynomials, namely

$$B_1(t) = t, \qquad B_2(t) = t^2 - t, \qquad B_3(t) = t^3 - \tfrac32 t^2 + \tfrac12 t.$$

By taking a derivative of (5.95) with respect to t we obtain

$$\frac{z^2 e^{tz}}{e^{z}-1} = \sum_{n=1}^{\infty} \frac{z^n}{n!} B_n'(t).$$

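An expansion in powers of z when $|z| < 2\pi$ gives

$$\frac{z}{e^{z}-1} = 1 - \tfrac12 z + \sum_{n=1}^{\infty} (-)^{n+1} \frac{B_n z^{2n}}{(2n)!} \tag{5.96}$$

where the $B_n$ are Bernoulli numbers; $B_1 = \tfrac16$, $B_2 = \tfrac{1}{30}$, $B_3 = \tfrac{1}{42}$, $B_4 = \tfrac{1}{30}$, .... Hence

$$B'_{2n+2}(t) = (2n+2)B_{2n+1}(t), \tag{5.97}$$

$$B'_{2n+1}(t) = (2n+1)\{B_{2n}(t) - (-)^n B_n\} \tag{5.98}$$

for $n \ge 1$. Now define the functions $C_{2n}$ and $S_{2n+1}$ for $n \ge 1$ by

$$C_{2n}(t) = (-)^{n+1}\,\frac{B_{2n}(t) - (-)^n B_n}{(2n)!}, \qquad S_{2n+1}(t) = (-)^{n+1}\,\frac{B_{2n+1}(t)}{(2n+1)!}$$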

on $0 \le t < 1$ and by requiring them to be periodic with period 1 for other values of t. Then, if k is a non-negative integer,

$$\int_k^{k+1} S_{2n+1}(t) f^{(2n+1)}(a+ht)\,dt = -\frac1h \int_k^{k+1} C_{2n}(t) f^{(2n)}(a+ht)\,dt$$

by integration by parts since $S_{2n+1}$ vanishes at the end-points because $B_{2n+1}(0) = 0$ (put t = 0 in (5.95)). Advantage has also been taken of (5.98). Moreover, $C_{2n}(k) = C_{2n}(k+1) = B_n/(2n)!$ so that another integration by parts provides

$$\int_k^{k+1} S_{2n+1}(t) f^{(2n+1)}(a+ht)\,dt = \frac{B_n}{(2n)!\,h^2}\{f^{(2n-1)}(a+kh) - f^{(2n-1)}(a+(k+1)h)\} - \frac{1}{h^2}\int_k^{k+1} S_{2n-1}(t) f^{(2n-1)}(a+ht)\,dt.$$


Proceeding in this way we eventually arrive at

$$\int_k^{k+1} (t - k - \tfrac12) f'(a+ht)\,dt$$

since $C_2'(t) = t - \tfrac12$ on [0, 1). But this integral is

$$\frac{1}{2h}\{f(a+(k+1)h) + f(a+kh)\} - \frac1h\int_k^{k+1} f(a+ht)\,dt.$$

Hence

$$\int_k^{k+1} S_{2n+1}(t) f^{(2n+1)}(a+ht)\,dt = \sum_{r=1}^{n} \frac{(-)^{n+r}B_r}{(2r)!\,h^{2n+2-2r}}\{f^{(2r-1)}(a+kh) - f^{(2r-1)}(a+(k+1)h)\} - \frac{(-)^n}{2h^{2n+1}}\{f(a+(k+1)h) + f(a+kh)\} + \frac{(-)^n}{h^{2n+1}}\int_k^{k+1} f(a+ht)\,dt.$$

It follows, therefore, that for positive integer m

$$\int_0^{m} S_{2n+1}(t) f^{(2n+1)}(a+ht)\,dt = \sum_{r=1}^{n} \frac{(-)^{n+r}B_r}{(2r)!\,h^{2n+2-2r}}\{f^{(2r-1)}(a) - f^{(2r-1)}(a+mh)\} - \frac{(-)^n}{2h^{2n+1}}\left\{f(a) + f(a+mh) + 2\sum_{r=1}^{m-1} f(a+rh)\right\} + \frac{(-)^n}{h^{2n+1}}\int_0^{m} f(a+ht)\,dt.$$

Consequently,

$$\int_a^{a+mh} f(t)\,dt = h\left\{\tfrac12 f(a) + \tfrac12 f(a+mh) + \sum_{r=1}^{m-1} f(a+rh)\right\} + \sum_{r=1}^{n} \frac{(-)^r B_r h^{2r}}{(2r)!}\{f^{(2r-1)}(a+mh) - f^{(2r-1)}(a)\} + (-)^n h^{2n+2}\int_0^{m} S_{2n+1}(t) f^{(2n+1)}(a+ht)\,dt \tag{5.99}$$

for $n \ge 1$, which is known as the Euler-Maclaurin summation formula. The same formula may be used for n = 0 on the understanding that the sum involving the derivatives is absent and $S_1$ is defined by periodicity from $\tfrac12 - t$. If the integral on the right of (5.99) can be regarded as small, (5.99) provides a formula for estimating the integral on the left or, alternatively, can be conceived of as an approximation for the sum on the right. When f is a polynomial the integral can be made zero by choosing n large enough but it would be wrong to imagine that the same result carries over for arbitrary f. Even the assumption that f is analytic is not sufficient to guarantee the validity, as can be seen by the particular example a = 0, $b = 2\pi$, f(t) = 1 + cos 4t. In fact, the last term in (5.99) can be expressed as

$$(-)^{n+1}(b-a)\,\frac{h^{2n+2}}{(2n+2)!}\,B_{n+1} f^{(2n+2)}(\xi)$$

for some $\xi \in (a, b)$, where b = a + mh, and this remainder term need not tend to zero as $n \to \infty$. In spite of this deficiency (5.99) without the remainder term is commonly used in the summation of series and the evaluation of integrals.

It has been tacitly assumed in the foregoing that f and a sufficient number of derivatives are continuous. When this assumption fails special treatment may be necessary. If f or one of its derivatives has a finite discontinuity, the most that is necessary is to split the range of integration at the point of discontinuity into two subintervals and apply the previous quadrature formulae to the subintervals separately.

When f possesses an infinity in the interval of integration a more elaborate manipulation must be undertaken. We may suppose that the singularity occurs at the end of an interval, by splitting the original range if necessary, and that the integral has the form $\int_a^b g(t)S(t)\,dt$ where S is singular at a, e.g. $S(t) = (t-a)^{-1/2}$. On the other hand, g and a sufficient number of its derivatives will be assumed to be continuous. Two methods of attack immediately offer themselves. In the first the interval is split at $a + \varepsilon$. The integral over $(a + \varepsilon, b)$ may be handled by earlier rules. For the interval $(a, a + \varepsilon)$ expand g about t = a in a Taylor series with remainder term. Assuming the individual terms can be evaluated explicitly by analytical means, it only requires an estimate of the remainder by a mean-value theorem, coupled with previous error estimates for $(a + \varepsilon, b)$, to see if $\varepsilon$ can be adjusted so that tolerable accuracy is achieved. The second approach is to take G(t) as the first m terms (say) of the Taylor expansion of g about a and then write the integral as

$$\int_a^b \{g(t) - G(t)\}S(t)\,dt + \int_a^b G(t)S(t)\,dt.$$
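A minimal sketch of this subtraction device follows, for the illustrative choice g(t) = cos t and $S(t) = t^{-1/2}$ on (0, 1) with $G(t) = 1 - \tfrac12 t^2$. These choices, and the names `simpson` and `subtracted_integral`, are assumptions made for the example only. The second integral is then available in closed form, $2 - \tfrac15$, and the first has a bounded integrand to which Simpson's rule applies.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) equal subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return h * total / 3


def subtracted_integral(n=64):
    """int_0^1 cos(t) t^(-1/2) dt via subtraction of G(t) = 1 - t^2/2."""

    def regular_part(t):
        # {g(t) - G(t)} S(t) behaves like t^(7/2)/24 near t = 0, so it tends to 0.
        if t == 0.0:
            return 0.0
        return (math.cos(t) - (1.0 - 0.5 * t * t)) / math.sqrt(t)

    # int_0^1 (1 - t^2/2) t^(-1/2) dt = 2 - 1/5, evaluated analytically.
    explicit_part = 2.0 - 0.2
    return simpson(regular_part, 0.0, 1.0, n) + explicit_part
```

The result can be compared with the term-by-term integral of the cosine series, $\sum_k (-1)^k / \{(2k)!\,(2k + \tfrac12)\}$.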

The first term now contains no singularity and may be tackled by standard quadrature methods. The second term may possibly be evaluated explicitly. Once again we have a method so long as $\int_a^b (t-a)^r S(t)\,dt$ can be calculated explicitly. Certain singularities may be managed by a change of variable. For example,


the substitution $t = a + s^{r/(1-\alpha)}$ with $0 < \alpha < 1$ provides

$$\int_a^b \frac{g(t)}{(t-a)^{\alpha}}\,dt = \frac{r}{1-\alpha}\int_0^{(b-a)^{(1-\alpha)/r}} g(a + s^{r/(1-\alpha)})\,s^{r-1}\,ds.$$
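The change of variable can be tried numerically as in the sketch below. The names `simpson` and `integrate_algebraic_singularity` are hypothetical, Simpson's rule is merely one convenient choice for the transformed integral, and g is assumed smooth on [a, b].

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) equal subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return h * total / 3


def integrate_algebraic_singularity(g, a, b, alpha, r=1, n=64):
    """int_a^b g(t) (t - a)^(-alpha) dt, 0 < alpha < 1, via t = a + s^(r/(1-alpha))."""
    q = r / (1.0 - alpha)
    upper = (b - a) ** (1.0 / q)

    def transformed(s):
        # After substitution the integrand is g(a + s^q) s^(r-1), free of singularity.
        return g(a + s ** q) * s ** (r - 1)

    return q * simpson(transformed, 0.0, upper, n)
```

With g = cos, a = 0, b = 1, $\alpha = \tfrac12$ and r = 1 the substitution is $t = s^2$ and the integral becomes $2\int_0^1 \cos(s^2)\,ds$.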

With r an integer, greater than or equal to unity, the integrand is continuous and the quadrature formulae may be applied at once. If S has one sign on [a, b], Gaussian quadrature may be employed directly and is often extremely effective. Sometimes a change of variable is also helpful (for some tables see Kutt 1973).

Occasionally the principal value of an integral has to be evaluated. Then one possibility is

$$P\int_a^b \frac{g(t)}{t-c}\,dt = \left(\int_a^{c-\varepsilon} + \int_{c+\varepsilon}^{b}\right)\frac{g(t)}{t-c}\,dt + P\int_{c-\varepsilon}^{c+\varepsilon}\frac{g(t)}{t-c}\,dt.$$

Provided that g does not vary much over $(c - \varepsilon, c + \varepsilon)$

$$P\int_{c-\varepsilon}^{c+\varepsilon}\frac{g(t)}{t-c}\,dt \approx g(c)\,P\int_{c-\varepsilon}^{c+\varepsilon}\frac{dt}{t-c} = 0$$

and the singularity has effectively disappeared (see also §6.19).

Another form of singularity arises when the interval of integration is infinite. The obvious procedure to try here is to convert the interval to a finite one by a change of variable. Thus, t = 1/s gives

$$\int_a^{\infty} f(t)\,dt = \int_0^{1/a} f(1/s)\,\frac{ds}{s^2}.$$
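A short sketch of this transformation, assuming a > 0 and using the mid-point rule so that the transformed integrand is never evaluated at s = 0; the names `midpoint` and `tail_integral` are illustrative.

```python
import math


def midpoint(f, a, b, n):
    """Composite mid-point rule with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def tail_integral(f, a, n=400):
    """Approximate int_a^infinity f(t) dt for a > 0 using t = 1/s."""
    return midpoint(lambda s: f(1.0 / s) / (s * s), 0.0, 1.0 / a, n)
```

For $f(t) = 1/(1+t^2)$ and a = 1 the transformed integrand is $1/(1+s^2)$ on (0, 1) and the exact value is $\pi/4$.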

Although this may introduce a singularity at the origin it is of the type already discussed. Alternatively, one might attempt to truncate the interval at b (say) and then estimate the integral from b to $\infty$ analytically or, at any rate, show that it is small. There is an analogue of Gaussian quadrature for integrals of the type $\int_{-\infty}^{\infty} \exp(-t^2)f(t)\,dt$ based on Hermite polynomials and for $\int_0^{\infty} \exp(-t)f(t)\,dt$ employing Laguerre polynomials.

Multiple integrals are much less easy to cope with than single integrals. One reason is that, in general, the domains cannot be specified so simply as in the one-dimensional case and the difficulties multiply as the number of dimensions increases. A method is to adopt repeated application of one-dimensional quadrature. For instance, if a two-dimensional integral be written as

$$\int_a^b \int_{\gamma_1(s)}^{\gamma_2(s)} f(s, t)\,ds\,dt$$

we might first use a quadrature formula for the integral in s and then one for the integral with respect to t. Error estimates are not so easy to derive, however. Another expedient is to replace f by an approximating polynomial but this is only practical when the integral of a polynomial over the domain can be evaluated explicitly (see also §6.18). Some attention has been paid to integrands which oscillate rapidly but details will not be given here.
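Repeated one-dimensional quadrature may be sketched as below, taking the three-point Gaussian rule in each variable. It is assumed for the illustration that s runs over [a, b] and that, for each s, t runs between limits supplied by the hypothetical callables `t_lo` and `t_hi`; the function names are likewise illustrative.

```python
import math

NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss3(f, lo, hi):
    """Three-point Gaussian rule on [lo, hi]."""
    half, mid = 0.5 * (hi - lo), 0.5 * (hi + lo)
    return half * sum(w * f(half * y + mid) for y, w in zip(NODES, WEIGHTS))


def iterated_quadrature(f, a, b, t_lo, t_hi):
    """Integrate f(s, t) over a <= s <= b, t_lo(s) <= t <= t_hi(s)."""
    # The inner rule integrates over t for fixed s; the outer rule then integrates over s.
    return gauss3(lambda s: gauss3(lambda t: f(s, t), t_lo(s), t_hi(s)), a, b)
```

For f(s, t) = st over the triangle 0 ≤ t ≤ s ≤ 1 the inner integral is $\tfrac12 s^3$, a cubic, so the rule reproduces the exact value 1/8.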

Exercises

28. Evaluate $\int_0^{0.5} e^t\,dt$ by the forward and backward rectangle rules, obtaining bounds for the theoretical error. Compare with the mid-point rule.
29. Calculate $\int_0^1 (1 + t^2)^{-1}\,dt$ by the trapezoidal and Simpson's rules, indicating the maximum error expected.
30. Use the Newton-Cotes rule to evaluate $\int_0^1 (t^4 - 1)\,dt$.
31. Evaluate the integral in Exercise 28 by Gaussian quadrature using n = 1, 2, and 3. Bound the theoretical error.
32. Calculate $\int_1^2 t^{-1}\,dt$ by Gaussian quadrature and compare your result with the exact answer.
33. Evaluate the integral in Exercise 28 by the Euler-Maclaurin formula with n = 1, h = 0.125.
34. Find $\sum_{m=1}^{M} m^3$ from the Euler-Maclaurin formula.
35. Calculate (i) $\int_0^1 (1 - t^2)^{1/2}\,dt$, (ii) $\int_0^1 (1 - t^2)^{-1/2}\,dt$ correct to four places of decimals, using Simpson's rule where appropriate.
36. Obtain a numerical value for $\int_0^1 e^t \ln t\,dt$.
37. Devise quadrature schemes for (i)

IIII

s(4 - t 2 ) 112 ds dt,

(ii)

IIII

(4 -

S2 t2)11 2

ds dt.

6 ANTENNAS AND INTEGRAL EQUATIONS

WIRE ANTENNAS

6.1 Introduction

The concept of an antenna as a piece of wire or portion of dielectric which radiates electromagnetic energy is simple enough in principle, but the derivation of quantitative results of value for design purposes is fraught with difficulties. Even when the isolated antenna can be described as a straightforward boundary-value problem, it can rarely be solved with any ease. In fact, the antenna, to be of any use as an element of a communication system, must be coupled with a transmission line or waveguide, and that coupling forms an important but complicated part of any real system. For these reasons a substantial amount of analysis has been devoted to antennas, not always with great success. The advent of powerful computers has made it possible to generate numerical answers to problems which had hitherto defied solution. It must be confessed, however, that the mathematical detail has often obscured the physical principles involved leaving the engineer up in the air when both analysis and computer fail. For example, to reduce pressure on the computer, some type of symmetry may be assumed but the symmetry is usually lost as soon as a transmission line is connected. While it is our purpose to enumerate some of the analytical and numerical techniques that have been tried, it is hoped not to lose sight completely of physical principles which may be helpful.

6.2 The perfectly conducting wire

The first antenna to be considered is a metallic wire whose cross-section is much smaller than a wavelength. The cross-section is assumed to be circular of radius a and the centre of the circle is taken to lie on a curve on which the arc length is s (Fig. 6.1). The basic approximation is to regard the field due to the wire as produced by a current filament along the curve of centres.
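Then, if sources produce the electric intensity $\mathbf{E}^i(\mathbf{x})$ at the point $\mathbf{x}$ in the absence of the wire, the total electric intensity at $\mathbf{x}$ due to both sources and wire is given by

$$\mathbf{E}(\mathbf{x}) = \mathbf{E}^i(\mathbf{x}) + \frac{1}{i\omega\varepsilon_0}(\operatorname{grad}\operatorname{div} + k^2)\int_0^l \mathbf{I}(\boldsymbol{\xi})\psi(\mathbf{x}, \boldsymbol{\xi})\,ds \tag{6.1}$$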

Fig. 6.1. The thin-wire antenna.

for a time dependence $e^{i\omega t}$, k being the wave number and the region outside the wire being free space. The arc length of the axis of the wire has been taken as l and $\mathbf{I}$ is the current vector tangential to the axis, $\boldsymbol{\xi}$ being the point identified by the arc length s. The function $\psi$ is defined by

$$\psi(\mathbf{x}, \boldsymbol{\xi}) = \frac{\exp(-ik|\mathbf{x} - \boldsymbol{\xi}|)}{4\pi|\mathbf{x} - \boldsymbol{\xi}|}. \tag{6.2}$$

The boundary conditions on the metallic wire are that the tangential components of the electric intensity vanish on the surface of the wire. Only the longitudinal component parallel to the axis of the wire will be considered here. Then, if $E_s^i$ is the longitudinal component of $\mathbf{E}^i$ and $\mathbf{x}_p$ is a point on the boundary of the wire,

$$-i\omega\varepsilon_0 E_s^i(\mathbf{x}_p) = \left\{(\operatorname{grad}\operatorname{div} + k^2)\int_0^l \mathbf{I}(\boldsymbol{\xi})\psi(\mathbf{x}_p, \boldsymbol{\xi})\,ds\right\}_s \tag{6.3}$$

follows from (6.1). The current I is to be determined from the integral equation (6.3). If the wire is straight so that the current filament lies along the z axis, (6.3) simplifies to

$$-i\omega\varepsilon_0 E_z^i(z) = \left(\frac{\partial^2}{\partial z^2} + k^2\right)\int_0^l I(s)\psi(z, s)\,ds \tag{6.4}$$

where now

$$\psi(z, s) = \frac{\exp[-ik\{(z-s)^2 + a^2\}^{1/2}]}{4\pi\{(z-s)^2 + a^2\}^{1/2}}. \tag{6.5}$$

It will be noticed that the right-hand side of (6.4) does not vary around the perimeter of the cross-section and therefore (6.4) can be consistent only if $E_z^i$ enjoys the same property. In other words, the representation of the current distribution by an axial current filament may be accepted as a satisfactory approximation only when $E_z^i$ is effectively constant on a cross-section, i.e. for sufficiently thin wires.

Since $\partial\psi(z, s)/\partial z = -\partial\psi(z, s)/\partial s$, an alternative way of writing (6.4) is

$$-i\omega\varepsilon_0 E_z^i(z) = \int_0^l \left\{k^2 I(s)\psi(z, s) + \frac{dI(s)}{ds}\frac{\partial}{\partial z}\psi(z, s)\right\}ds - \frac{\partial}{\partial z}\{I(l)\psi(z, l) - I(0)\psi(z, 0)\}. \tag{6.6}$$

If it be assumed that the current vanishes at the ends of the wire both I(l) and I(0) are zero with the consequence that the last term in (6.6) disappears. The derivative in (6.4) can be removed by introducing two arbitrary constants A and B. There is then the equivalent form

$$\int_0^l I(s)\psi(z, s)\,ds = A\cos kz + B\sin kz - \frac{i\omega\varepsilon_0}{k}\int_0^z E_z^i(t)\sin k(z-t)\,dt. \tag{6.7}$$

Imposition of conditions such as the vanishing of the current at the ends of the wire will supply sufficient information to determine A and B. If the wire is in the form of a hollow tube it is reasonable to take the current at the ends as zero, but for a solid rod, currents will flow across the ends. Neglect of these currents will introduce an error of the order of ka. If the straight wire is modelled by a hollow tube there still arise integral equations like (6.4) and (6.7) but with $\psi(z, s)$ replaced by K(z - s) where

$$K(z) = \frac{1}{8\pi^2}\int_{-\pi}^{\pi} \frac{\exp\{-ik(z^2 + 4a^2\sin^2\tfrac12\phi)^{1/2}\}}{(z^2 + 4a^2\sin^2\tfrac12\phi)^{1/2}}\,d\phi.$$

Although K(z) is singular at z = 0 (like $-\ln|z|/4\pi^2 a$) the difference between K(z - s) and $\psi(z, s)$ is of the order of ka over most of the interval of integration. Predictions from the two models are often reasonably compatible therefore. It has been shown (Jones 1981) that the solution of (6.4) with K exists and is unique when the current vanishes at the ends. Also the solution depends continuously on the incident field (Rynne 1992). Furthermore, these conclusions apply to (6.7) when A and B are properly chosen. However, it is important to note, in connection with numerical solution of (6.7), that when the right-hand side consists solely of cos kz the current I(s) must be singular at the ends, like $1/s^{1/2}$ and $1/(l - s)^{1/2}$ respectively; that is also true if the right-hand side is just sin kz.

Integral equations for wires were first derived by Pocklington (1897) and these were subsequently used to find the current distribution on a semi-infinite wire by MacDonald (1902). In the absence of an external field $E_z^i = 0$ and an approximate solution of (6.4) is sought. Let $\delta \gg a$ be such that $k\delta \ll 1$, which is possible when a is small enough. Then, for $|s - z| > \delta$, the contribution of the integral in (6.4) is $O(1/\delta)$. In $|s - z| \le \delta$, I(s) may be approximated by I(z) and the exponential

replaced by unity. Then (6.4) becomes

$$0 = \frac{1}{2\pi}\ln\left(\frac{2\delta}{a}\right)\left(\frac{\partial^2}{\partial z^2} + k^2\right)I(z) + O\left(\frac{1}{\delta}\right).$$

Thus, as a diminishes, we are forced to conclude that

$$I(z) = A_1\exp(-ikz) + B_1\exp(ikz)$$

where $A_1$ and $B_1$ are constants. Consequently, the current distribution on a thin wire is, to a first approximation, sinusoidal and propagates, without loss of amplitude, at the speed of light. In practice, this will not be quite true because of corrections allowing for a being non-zero but it is sufficiently close to the truth to be a valuable concept. When $I(z) = I_0\exp(-ikz)$ it is possible to express the integral in (6.1) in terms of tabulated functions. Select

{x 2 +

y2

+ (z -

S)2}1/2

+ S- z

as a new variable of integration. Then

J:

10 exp( -iks)'" ds

where .

EI(

= 10 exp( -ikz){Ei( -

iku2) - Ei( -iku 1)}

(6.7)

ix) = - foo exp( +iu) ±IX - d u = C·IX ± I..Sl X,

· C lX=

(6.8)

U

x

foo cos-u d U

-

. f

U

x

SIX= -

00

x

sind t t t

being positive. The parameters U1 and U2 are defined by U2 = R 2 + I - z, = R 1 - z where R 1 and R 2 are the distances to the point of observation from the ends of the wire (Fig. 6.2). The electric field may now be calculated from (6.1) and has the form (the subscript w denoting the contribution of the wire) (Ex)w = ~ [exp{ -ik(~2 + 1)} ikR 2 (z -I)} 4nlC:OB o R2 U2 X

U1

{t _

_ exp( -;kR 1 ) R1

(1 _

ikR1Z)J

(6.9)

U1

with a corresponding expression for E, while (Ez)w =

+

41tlC:O Bo

(ik _

~)[exp{ az

ik(R 2 R2

+ I)} _

exp( -ikR 1 ) ] R1

.

(6.10)


s=o Fig. 6.2. Parameters for field due to current exp( - iks).

The evaluation of (6.9)and (6.10) entails only elementary functions which makes currents of the form exp( -ikz) very convenient to handle. It willalso be noticed that the field involves only distances from the ends of the wire and becomes negligibly small as the length of the wire increases to infinity. Therefore, in this mode of operation, the radiation of a thin wire may be conceived of as originating from the ends of the wire. Of course, the excitation or otherwise of this mode depends upon Ei and so more general possibilities are examined in the next section. If 1(z) = 10 exp(ikz) then (6.9) and (6.10) are altered to (Ex)w = ~ [{ikR 2 (1- z) _ 41tlW8 0 v2

+ (Ez)w =

(

ikR t Vt

z

+

1) exp(

t} exp{ -ik(~2 -I)}

-ikR t ) ] 3' Rt

~ (~ + ik) [ex p{ -ik(R 2 41tlW80

oz

R2

R2

I)} _ exp( -ikR t ) ] R1

6.3 General excitation of the infinite wire

As a start to the problem of general excitation for a straight wire the case in which $E_z^i(z) = \exp(-i\alpha z)$ will be considered. The constant $\alpha$ will be assumed to be real. Suppose firstly that the wire is infinitely long. Then it is plausible to try $I(z) = I_0\exp(-i\alpha z)$. Substitution in (6.4) gives, after the change of variable $s - z = t$,


Now

where $(k^2 - \alpha^2)^{1/2}$ is positive when $k^2 > \alpha^2$ and equals $-i(\alpha^2 - k^2)^{1/2}$ when $k^2 < \alpha^2$. The function $H_0^{(2)}$ is the standard Hankel function of the second kind. Hence in this case

$$I(z) = \frac{4\omega\varepsilon_0\exp(-i\alpha z)}{(k^2 - \alpha^2)H_0^{(2)}\{a(k^2 - \alpha^2)^{1/2}\}}. \tag{6.12}$$

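The solution for the infinite straight wire under general excitation may now be deduced by Fourier transforms. For instance, if

$$E_z^i(z) = \int_{-\infty}^{\infty} f(\alpha)\exp(-i\alpha z)\,d\alpha, \tag{6.13}$$

then

$$I(z) = 4\omega\varepsilon_0\int_{-\infty}^{\infty} \frac{f(\alpha)\exp(-i\alpha z)\,d\alpha}{(k^2 - \alpha^2)H_0^{(2)}\{a(k^2 - \alpha^2)^{1/2}\}}. \tag{6.14}$$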

As an illustration, consider the model when the source is a magnetic frill or ring current of radius b concentric with the wire (Fig. 6.3). Such a model corresponds approximately to the physical situation of an infinite straight perfectly conducting cylinder excited by a coaxial transmission line, the inner conductor having the same radius as the antenna, surrounded by an infinite ground plane (Fig. 6.4). The opening in the ground plane can be regarded as creating a ring of magnetic current with no ground plane, the radius of the ring being the mean of the radii of the inner and outer conductors of the coaxial line. z

Fig. 6.3. Magnetic ring current of radius b.


,

Ground plane

Fig. 6.4. A coaxial line feeding a semi-infinite monopole.

Assuming symmetrical illumination, a solution is required of the equations curl E i

+ iOOJloH i = - <5(r

(6.15)

- b)<5(z)$

(6.16) where r, cfJ, z are cylindrical polar coordinates and $ is a unit vector in the direction of cfJ increasing. It may be verified that the solution obtained by Fourier transform in the absence of the wire is for r > b

E~ = -lib ta>oo KHl?)(Kr)J~(Kb) exp( -iiXZ) d«,

(6.17)

f:oo iXH~)'(Kr)J~(Kb) exp( -iiXZ) da,

(6.18)

E~ =

-lb

H~ = -troeob toooo H~)'(Kr)J~(Kb) exp( -iaz) dz

(6.19)

where ,,2 = k 2 - (X2 = 00 2 PoBo - (X2. The other components of the field are zero. When r < b, the non-zero components are

E~= -lib f:oo KJo(Kr)H~)'(Kb)exp(-iaz)da,

(6.20)

E~ = -!b f"oo aJo(Kr)H~)'(Kb) exp( -iaz)

(6.21)

H~ =

-lroeob

d«,

f:oo J~(Kr)H~)'(Kb) exp( -iaz) de.

(6.22)

Formula (6.20) is the one for the representation analogous to (6.13).


Fig. 6.5. Contour of integration for the infinite wire.

Therefore, from (6.14), the current induced in the wire by the field Ei is given by l(z) = -im';ob

f

1 (Ka)H(2)'(Kb) 0

oo

0

- 00

(21

KH 0 (x«)

exp( -iaz) de.

(6.23)

The contour in (6.23) may be deemed to pass below the branch point a = -k and above the branch point a = k (Fig. 6.5). For positive values of z the contour may be deformed into the path C (Fig. 6.6) which goes parallel to the imaginary axis from -ioo, circles rJ. = k, crosses the branch line, and returns to -ioo. Thus I(z) = - iwc;ob

f

i (Ka)H(2)'(Kb) 0

c

(2)0

KH0 (xc)

exp( - ixz] dec

(6.24)

In the deformation contributions due to the residues at the poles (if any) might be picked up. However, Hb2)(Z) is free of zeros in 0 ~ ph z ~ -n and so there are no poles to be accounted for. When z is large, the integral in (6.24) can be evaluated asymptotically. By the general theory for such integrals (see,for example, Jones 1972)the dominant part comes from a neighbourhood of the branch point a = k. Therefore, as Z --+ 00,

.

I( z)

I'Ov

-lWC;o

f cK

2

exp( -iaz) dCL (1n tKa + y + Pri)

where}' = 0.5772 ... is Euler's constant. Replacing k + a by 2k we may simplify this to 2nwe o . exp( -lkz)fl (z) I(z) (6.25) k I'Ov

where

il (z) =

f

-

00

o t[{2y

- -

exp( - t) dt

+ In(ka2t exp(t ni)/ 2z)}2 + n 2 ]

-k

k

rI

,I

I I I

,

{t I

, I

I

C

I

:

Fig. 6.6. Deformed contour of integration.

.

(6.26)


It will be noticed that (6.25) does not involve b so that the radial position of the magnetic frill is of no consequence as far as the current far along the wire is concerned. Therefore, the same distant current would be obtained if the rod had been excited by a narrow gap in its length. The function It depends only on the single parameter 2z/ka 2 in which tka 2 is the cross-sectional area divided by the wavelength. It may therefore be tabulated conveniently (Kunz 1963; Bach Andersen 1968). This is best done by numerical integration but a rough idea of the behaviour of 11 can be obtained by considering what happens when 2z/ka 2 » 1. In that case, put d2 = 2z/ka 2 exp(y + 11ti) and then

From now on neglect terms of order e- d • An integration by parts gives

f

d

fl(Z) '" Make the approximation

0

e-t dt

eY/d2) ·

In(t

In t Since

So

e

rr

t

(6.27)

In t dt = - y it follows that y

I

fl(Z) '" In(d2/eY) - {In(d2/eY)}2 Consistent with (6.27) this may be written as

11 (z)

~ Ijln d2 •

Thus

I 21n d

11(Z)~--=

1 In{2z/ka exp(y + !-xi)} 2

(6.28)

when IdI » 1. According to (6.28), I/tl decreases steadily as Idl increases whereas ph It increases steadily. Similar behaviour is exhibited by (6.26). The approximation (6.28) should be accurate to within 10 per cent for the modulus when Idl > 10 and for the phase when Idl > 100. On the basis of (6.28) and (6.25) I () z

V exp(Ok) = 41t V exp( - ikz)/t(z) = "'() .I ~ Z - 1 Z Zo

(6.29)

if the strength of the magnetic ring is multiplied by - 2 V and Zo = (po/e o) t /2


is the impedance of free space. The admittance Y(z) is given by Y(z) = 4nfl (z)/Zo

(6.30)

~ 4n/2Z o In d.

The current wave decays slowly with distance along the rod. It travels nearly at the speed predicted by Pocklington for an infinitesimally thin wire. The slow diminution in amplitude means that the wave is guided by the wire over long distances, gradually radiating energy. The radiation field can be calculated from (6.1) and H(x)

= Hi(x) + curl f:<Xl

I(~)"'(x,~) ds.

(6.31)

Thus, when r > b, (6.19), (6.23), and (6.11) supply Z oH<1'

= -lkb

J

oo

{

H~)I (xr) J~(Kb) - Jo(Ka)

- 00

H(2)/(Kb)} ~2)· exp(-ic.(z) de. H 0 (xc) (6.32)

If kr » 1,the derivative of the Hankel function can be replaced by the first term in its asymptotic development so long as a is not near ±k, which can always be arranged by deforming the contour in Fig. 6.5 so that it does not pass too close to these points. Hence

Z oH<1'

~ iikb

f

co

exp{ -i(Kr _ in)}

(~)1/2

-00

n~

H(2)/(Kb)} x { J~(Kb) - Jo(Ka) ~2) exp(-iaz) de. H 0 (xc)

The exponent xr + «z has a stationary point at a = kz/R or a = k cos 6 where Rand 6 are spherical polar coordinates. Hence the method of stationary phase gives Z oH<1' ~ tkb{ Jo(ka sin 6)Hb2)/(kb sin 6) _ J' (kb . 6)H(2)(k . 6)} o sin 0 a sin

exp( -ikR) (2) • RH 0 (ka sin 6)

(6.33)

ZoH~ '" [2 sin 9{y + In(tka exp(tni) sin 9)}]-1 exp( -ikR)

(6.34)

If b « 1, this formula simplifies to

R

and to this order of approximation the radial position of the magnetic ring is of no significance. It will be observed that the radiation is largest when 6 is near 0 or n, consistent with the earlier conclusion that the rod guides the field over long distances. However, it must be borne in mind that near the wire kr is


not large and so the approximation made in going from (6.32) to (6.34) is invalid. Other components of the distant field can be written down from the radiation conditions. Thus E(I = ZoH
4n G = -- (kb)2

Zo

ft 0

n

1Jo(ka sin 8)Hb2)'(kb sin 8) -

J~(kb sin 8)Hb2)(ka sin

x

8)1 2

sin 0 dO (6.35) IHb2)(ka sin 0)1 2

if (6.33) is used even for $\theta \approx 0$. The integral in (6.35) has been investigated numerically (Duncan 1962). If $kb \ll 1$, the simplification of (6.34) may be employed. In the resulting integral, $\sin\theta$ may be replaced by $\theta$ in the interval $(0, \delta)$ where $\ln\delta \gg \ln(\tfrac12 ka\,e^{\gamma})$ and $\ln\sin\theta$ may be neglected in the interval $(\delta, \tfrac12\pi)$. Hence

(6.36)

for $kb \ll 1$. Formula (6.36) is fairly accurate for ka < lo so long as $kb$ does not exceed moderate values because (6.35) is relatively insensitive to variations in $kb$. If one wishes to regard the effect of the antenna in Fig. 6.4 on the coaxial line as the same as a circuit element placed appropriately on the line, the conductance will be $\tfrac12 G$ because the radiation takes place only above the ground plane. There will also be a susceptance to take account of the energy stored in the capacitor composed of the ground plane and rod. Good theoretical estimates of this susceptance do not seem to be available (Bach Andersen 1968) so no details will be given.

It is worth emphasizing that the formulae of this section also hold for thick antennas providing that the initial requirement of $E_z^i$ not varying around the cross-section is met. In particular, this will be true whenever the rod is illuminated symmetrically.

There is one further point to be elucidated and that is the admittance observed at the ring current. The admittance well away from the source has been derived in (6.29). Letting $z \to 0$, $I(0) = Y(0)V$ where, however, $Y(0)$ is no longer expressed in terms of $I_1$ but is obtained from (6.24) by putting $z = 0$. If $kb \ll 1$, only integration near $k$ is relevant and if an analysis similar to that in the derivation of (6.26) is carried out it is found that

$$Y(0) = -\frac{2\pi}{Z_0}\{\ln ka + \gamma + \tfrac12\pi i\}^{-1} \tag{6.37}$$

for very thin wires. The resemblance of the real part of (6.37) to $\tfrac12 G$ will be noted; it is an indication of the connection between the admittance at the ring and the radiated energy.

6.4 The semi-infinite wire

Before turning to the problem of the semi-infinite wire it is helpful to consider a current which is a modified version of the one in the preceding section, namely

$$I(w,z) = (k-w)\frac{h(w)}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{-i\alpha z}\,d\alpha}{(\alpha-w)(k-\alpha)h(\alpha)} \tag{6.38}$$

where $h(\alpha) = \ln[a\{k(k-\alpha)/2\}^{1/2}] + \gamma + \tfrac12\pi i$. In the extra factors $w$ is real with $|w| < k$ and the pole $\alpha = w$ is above the contour of integration. If this pole were absent a current which is a constant multiple of that in (6.25) would be recovered. Now

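$$\int_{-\infty}^{\infty} I(w,s)\,\psi(z,s)\,ds = -(k-w)\frac{h(w)}{8\pi}\int_{-\infty}^{\infty}\frac{H_0^{(2)}\{a(k^2-\alpha^2)^{1/2}\}\,e^{-i\alpha z}}{(\alpha-w)(k-\alpha)h(\alpha)}\,d\alpha$$

from (6.11). For positive $z$ only values near $\alpha = k$ are relevant according to the preceding section, and there $H_0^{(2)}\{a(k^2-\alpha^2)^{1/2}\} \approx -(2i/\pi)h(\alpha)$. Therefore

$$\int_{-\infty}^{\infty} I(w,s)\,\psi(z,s)\,ds = i(k-w)\frac{h(w)}{4\pi^2}\int_{-\infty}^{\infty}\frac{e^{-i\alpha z}\,d\alpha}{(\alpha-w)(k-\alpha)} = -\frac{h(w)\,e^{-ikz}}{2\pi}.$$

It is evident then from (6.4) that $E_z^i$ vanishes for positive $z$. Another useful result is that, when $z \le 0$,

$$I(w,z) = e^{-iwz}. \tag{6.39}$$

In particular, $I(w,0) = 1$. Next suppose that the incident field is a plane wave in which

$$\mathbf{E}^i = (\cos\theta_i\,\mathbf{i} - \sin\theta_i\,\mathbf{k})\exp(-ikx\sin\theta_i - ikz\cos\theta_i). \tag{6.40}$$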

(6.40)

The current induced in an infinitely long wire is

$$I_0(z) = -\frac{4\omega\varepsilon_0\exp(-ikz\cos\theta_i)}{k^2\sin\theta_i\,H_0^{(2)}(ka\sin\theta_i)}$$

by virtue of (6.12). The current

$$I_0(z) - I(k\cos\theta_i, z)\,I_0(0)$$

vanishes for negative $z$ on account of (6.39). Moreover, the boundary condition on the electric intensity is satisfied still for positive $z$ because $I(w,z)$ has no effect there as derived above. Hence, the current induced by the plane wave (6.40) in a semi-infinite wire from $z = 0$ to $z = \infty$ is

$$I_0(z) - I(k\cos\theta_i, z)\,I_0(0).$$

By considering the limit as $\theta_i \to \pi$ we see that a wave travelling along the wire from $z = \infty$ to $z = 0$ with a current of unit amplitude is reflected as a current $-I(-k, z)$. For calculations it assists to have a formula somewhat simpler than (6.38) for positive $z$. It can be obtained by the technique of the preceding section. This leads to

I(w, z)

~

dt

d

e- t

d

fo In(t eY/d ) dt t -

2iz(k - w)h(w) exp( -ikz)

2 -.

lZ(W -

k)

•

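Introduce the approximation (6.27) again. Observe that

$$\int_0^d \frac{d}{dt}\left\{\frac{e^{-t}}{t - iz(w-k)}\right\}dt = \frac{1}{iz(w-k)}$$

to the order of accuracy of the preceding section, and

$$\int_0^d \ln t\,\frac{d}{dt}\left\{\frac{e^{-t}}{t - iz(w-k)}\right\}dt = \lim_{\varepsilon\to+0}\left[\left[\frac{e^{-t}\ln t}{t - iz(w-k)}\right]_\varepsilon^d - \int_\varepsilon^d\frac{e^{-t}\,dt}{t\{t - iz(w-k)\}}\right]$$

$$= \lim_{\varepsilon\to+0}\frac{1}{iz(w-k)}\left\{\ln\varepsilon + \int_\varepsilon^\infty\frac{e^{-t}}{t}\,dt - \int_0^\infty\frac{e^{-t}\,dt}{t - iz(w-k)}\right\}.$$

But

$$\int_\varepsilon^\infty\frac{e^{-t}}{t}\,dt = -\gamma - \ln\varepsilon + o(1)$$

as $\varepsilon \to +0$ and

$$\int_0^\infty\frac{e^{-t}\,dt}{t - iz(w-k)} = -e^{iz(k-w)}\,\mathrm{Ei}\{-iz(k-w)\}$$

so the right-hand side becomes

$$\frac{1}{iz(w-k)}\left[-\gamma + e^{iz(k-w)}\,\mathrm{Ei}\{-iz(k-w)\}\right].$$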

Bringing the result into conformity with (6.27) we have

$$I(w,z) \approx \frac{2h(w)\exp(-ikz)}{\Phi(w,z)} \tag{6.41}$$

where

$$\Phi(w,z) = \ln d^2 + e^{iz(k-w)}\,\mathrm{Ei}\{-iz(k-w)\}. \tag{6.42}$$

It will be noted that (6.41) gives $I(w,0) = 1$ in agreement with (6.39). One conclusion from (6.41) and (6.42) is that the plane wave induces in a semi-infinite wire the current of an infinite wire plus a travelling wave of slowly varying amplitude. This conclusion can be expected to hold until the radius is such that the current in the ends or asymmetry plays a significant part. A semi-infinite antenna which supports a travelling wave by the addition of a resistor one-quarter wavelength from the end has been devised by Altshuler (1961).

6.5 The finite wire

When a finite wire is not specially terminated so that only a single travelling wave develops, any wave generated will be reflected at an end and a pattern of standing waves on the wire can be expected. The problem is more complicated then than that for the infinite wire. Several analytical devices have been employed to tackle the solution of the integral equation (see, for example, Jones 1986); numerical methods will be discussed later. Here the excitation due to the plane wave (6.40) will be examined (for the wire fed by a coaxial line as in Fig. 6.4 consult Bach Andersen 1971). One difficulty is to account properly for the reflection at an end which will depend to some extent on the exact shape of the end and whether it is flat or rounded (King 1956; Jones 1990). In the treatment of this section the shape of the end will be ignored and the analysis based on the theory of the preceding section (for an approach using edge waves see Ufimtsev and Krasnozhen 1992).

Let the wire extend from $z = -l$ to $z = l$. From the preceding section and (6.39) in particular the current

$$I_0(0)\{\exp(-ikz\cos\theta_i) - I(k\cos\theta_i, z+l)\exp(ikl\cos\theta_i) - I(-k\cos\theta_i, l-z)\exp(-ikl\cos\theta_i)\}$$

provides a first correction to the current in the infinite wire. The second term originates from $z = -l$ and the third from $z = l$. However, they fail the boundary conditions at the opposite ends. Since, in each case, they represent waves travelling along the wire towards the ends (see (6.41)) they will create reflected waves of the type $I(-k, z+l)$ and $I(-k, l-z)$ respectively as explained in the preceding section. Therefore, the appropriate representation for the total current is

$$I(z) = I_0(0)\{\exp(-ikz\cos\theta_i) - I(k\cos\theta_i, z+l)\exp(ikl\cos\theta_i) - I(-k\cos\theta_i, l-z)\exp(-ikl\cos\theta_i) + A\,I(-k, z+l) + B\,I(-k, l-z)\}. \tag{6.43}$$

To secure $I(\pm l) = 0$ we need

$$(1-\vartheta^2)A = I(-k\cos\theta_i, 2l)\exp(-ikl\cos\theta_i) + \vartheta\,I(k\cos\theta_i, 2l)\exp(ikl\cos\theta_i),$$

$$(1-\vartheta^2)B = I(k\cos\theta_i, 2l)\exp(ikl\cos\theta_i) + \vartheta\,I(-k\cos\theta_i, 2l)\exp(-ikl\cos\theta_i)$$

where

$$\vartheta = -I(-k, 2l).$$
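The two end conditions behind these formulas form a small linear system, and it can be reassuring to check the closed form numerically. The sketch below does this under stated assumptions: the function names are illustrative, and the sample values standing in for $I(\pm k\cos\theta_i, 2l)$ and for the reflection coefficient are arbitrary placeholders, not values computed from (6.38).

```python
import cmath


def end_amplitudes(i_plus, i_minus, refl, kl_cos):
    """Closed-form A and B that make the current (6.43) vanish at z = -l and z = l.

    i_plus  stands for I(k cos(theta_i), 2l),
    i_minus stands for I(-k cos(theta_i), 2l),
    refl    stands for the end reflection coefficient -I(-k, 2l),
    kl_cos  stands for k * l * cos(theta_i).
    """
    p = i_plus * cmath.exp(1j * kl_cos)
    m = i_minus * cmath.exp(-1j * kl_cos)
    det = 1 - refl ** 2
    return (m + refl * p) / det, (p + refl * m) / det


def end_residuals(a, b, i_plus, i_minus, refl, kl_cos):
    """Values of I(-l)/I0(0) and I(l)/I0(0) from (6.43), using I(w, 0) = 1."""
    p = i_plus * cmath.exp(1j * kl_cos)
    m = i_minus * cmath.exp(-1j * kl_cos)
    at_minus_l = -m + a - refl * b   # terms surviving at z = -l
    at_plus_l = -p - refl * a + b    # terms surviving at z = +l
    return at_minus_l, at_plus_l


# Arbitrary placeholder inputs (hypothetical, for checking the algebra only).
sample = dict(i_plus=0.30 - 0.20j, i_minus=0.25 + 0.10j,
              refl=-0.80 + 0.30j, kl_cos=0.7)
A, B = end_amplitudes(**sample)
res_minus, res_plus = end_residuals(A, B, **sample)
```

Both residuals should vanish to rounding error for any inputs with $\vartheta^2 \ne 1$; the case $\vartheta^2 = 1$ is the resonance at which the closed form breaks down.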

Evidently, $\vartheta$ is the reflection coefficient which multiplies the reflected travelling wave generated by a travelling wave hitting the end. Since $h(-k) = \ln ka + \gamma + \tfrac12\pi i$ and

$$-\Phi(-k, 2l) = \ln(4l/ka^2) - \gamma - \tfrac12\pi i + O(1/kl)$$

when $kl$ is not too small, it is transparent from (6.41) that, in a thin wire ($ka \ll 1$), $\vartheta$ is roughly $-e^{-2ikl}$. Thus, resonance effects can be expected when $2kl$ is an integer multiple of $\pi$ though the resonance becomes less pronounced as the length of the wire increases because of the logarithmic dependence of $\Phi$ on $kl$.

The field scattered by the wire can be found from (6.3). At a large distance $R$ from the origin in a direction making an angle $\theta$ with the positive $z$ axis

$$E_z^s \sim \frac{k^2\sin^2\theta}{4\pi i\omega\varepsilon_0}\,\frac{e^{-ikR}}{R}\int_{-l}^{l} I(\xi)\,e^{ik\xi\cos\theta}\,d\xi$$

and the sole transverse component $E_\theta$ is given by

$$E_\theta = -E_z^s/\sin\theta.$$

Evaluation of the integral is aided by consideration of

$$J(u,v) = \int_{-l}^{l} I(kv, \xi + l)\,e^{ik\xi u}\,d\xi$$

where $u$, $v$ are both real subject to $|u| < 1$, $|v| < 1$. By changing $\xi$ to $-\xi$ we obtain

$$\int_{-l}^{l} I(kv, l - \xi)\,e^{ik\xi u}\,d\xi = J(-u, v)$$

so that both types of integral arising from $I(z)$ are covered by $J$. Substitution from (6.38) and inversion of the order of integration supplies

$$J(u,v) = k(1-v)\frac{h(kv)}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{i(ku-2\alpha)l} - e^{-ikul}}{i(ku-\alpha)(\alpha-kv)(k-\alpha)h(\alpha)}\,d\alpha.$$

There is no singularity at $\alpha = ku$ but later formulae are improved by regarding this point as above the contour of integration. The integral with the second exponential in the integrand can be evaluated immediately by deforming the contour upwards. Hence

$$J(u,v) = k(1-v)\frac{h(kv)}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{i(ku-2\alpha)l}\,d\alpha}{i(ku-\alpha)(\alpha-kv)(k-\alpha)h(\alpha)} + \frac{i\,e^{-ikul}}{k(u-v)}\left\{1 - \frac{(1-v)h(kv)}{(1-u)h(ku)}\right\}.$$

Now write

$$\frac{1}{(ku-\alpha)(\alpha-kv)} = \frac{1}{k(u-v)}\left(\frac{1}{ku-\alpha} + \frac{1}{\alpha-kv}\right)$$

and call on (6.38). The net result is that

$$J(u,v) = e^{-ikvl}\,W(u,v) - \frac{I(kv,2l)\,e^{ikul}}{ik(1-u)} + \frac{i\,e^{-ikul}}{k(u-v)}\left\{1 - \frac{(1-v)h(kv)}{(1-u)h(ku)}\right\}$$

where

$$W(u,v) = \frac{(1-v)\,e^{ik(u+v)l}}{ik(1-u)(u-v)}\,h(kv)\left\{\frac{I(kv,2l)}{h(kv)} - \frac{I(ku,2l)}{h(ku)}\right\}. \tag{6.44}$$

All terms in (6.43) can be integrated explicitly now. As a consequence

$$E_\theta = -\frac{k^2\sin\theta}{4\pi i\omega\varepsilon_0}\,\frac{e^{-ikR}}{R}\,I_0(0)\Bigg[\frac{i}{k(\cos\theta-\cos\theta_i)\sin^2\theta}\bigg\{(1+\cos\theta)(1-\cos\theta_i)\frac{h(k\cos\theta_i)}{h(k\cos\theta)}\,e^{ikl(\cos\theta_i-\cos\theta)}$$

$$\qquad - (1-\cos\theta)(1+\cos\theta_i)\frac{h(-k\cos\theta_i)}{h(-k\cos\theta)}\,e^{ikl(\cos\theta-\cos\theta_i)}\bigg\} - \frac{2ih(-k)}{k\sin^2\theta}\left\{\frac{A\,e^{-ikl\cos\theta}}{h(k\cos\theta)} + \frac{B\,e^{ikl\cos\theta}}{h(-k\cos\theta)}\right\}$$

$$\qquad - W(\cos\theta, \cos\theta_i) - W(-\cos\theta, -\cos\theta_i) + \{A\,W(\cos\theta, -1) + B\,W(-\cos\theta, -1)\}\,e^{ikl}\Bigg]. \tag{6.45}$$

The only quantity which has to be calculated in (6.45) is, on account of (6.44), the integral defined in (6.38). Provided that $kl$ is not too small the formula (6.41) should be an adequate approximation for the integral.

Exercises

1. The current $I_0\exp(-ikz)$ flows in a thin wire along the positive $z$ axis. Show that

$$E_z = \frac{I_0}{4\pi i\omega\varepsilon_0}\left(\frac{\partial}{\partial z} - ik\right)\frac{\exp(-ikR)}{R}$$

where $R$ is the distance from the origin.

2. A travelling wave of current $I_0\exp(-ikz)$ flows along a thin straight wire extending from $z = -\tfrac12 l$ to $z = \tfrac12 l$. Show that the radiation resistance is

$$60\left\{\ln 2kl - \operatorname{Ci} 2kl + \gamma - 1 + \frac{\sin 2kl}{2kl}\right\}.$$

3. If the gain of an antenna is defined as $4\pi$ times the maximum value of the polar diagram divided by the total power radiated show that the gain in dB in Exercise 2 when $kl \gg 1$ is

$$5.97 - 10\log_{10}\frac{0.915 + \log_{10}(l/\lambda)}{l/\lambda}$$

assuming that $\tfrac12\tan 1.16 = 1.16$.

4. In Fig. 6.4 a resistor $R$ is inserted into the antenna at a distance $h$ from the ground plane. Find the admittance $Y$ observed at the reference plane in the coaxial line. If ka = is find the value of $R$ which gives a travelling wave up to the resistor when $l - h = \tfrac14\lambda$ (experimentally $R$ is about 240 ohms). If $R = \infty$ is there a value of $h$ for which there is a travelling wave?

5. Determine the back scatter from a finite thin wire by putting $\theta = \pi - \theta_i$ in (6.45). Find an approximate result for broadside incidence ($\theta_i = \tfrac12\pi$) by making $ka$ very small when $kl$ is large enough for (6.41) to be valid.

6.6 The receiving antenna

The preceding theory has been concerned with the radiation characteristics of a straight wire when excited by a magnetic ring current surrounding the wire or by a plane wave. Actually there is no real loss of generality in concentrating on the transmitting mode because properties of the antenna in a receiving state can be deduced from it. The connection between transmission and reception can be elucidated under very general circumstances and therefore specific reference to the straight wire can be omitted.

The basic feature is to represent the antenna as a network placed in some suitable position. Usually this is somewhere in the feed to the antenna where it is feasible to talk in terms of currents and voltages. Here currents and voltages are used in a generalized sense. For instance, if a single-mode waveguide is conveying the energy to or from the antenna, they can be transverse components of the dominant mode at a convenient reference plane (see §2.2). A place where this representation in currents and voltages is adopted will be called a port.

The antenna will be supposed to possess two surfaces $S_o$, $S_i$, of which it presents $S_o$ to the outside world while $S_i$ is coupled directly to ports (multiple feeds being permitted). The surfaces $S_o$ and $S_i$ are bounded, with $V$ the volume between, and parts of them may coincide (Fig. 6.7). The only requirement on $V$ is that it be linear and passive; the tangential components of $\mathbf{E}$ and $\mathbf{H}$ are continuous across any discontinuities in the medium. Outside $S_o$ is free space though it could be replaced by any linear, homogeneous, isotropic, and lossless medium.

Fig. 6.7. Configuration of an antenna.

When the antenna is transmitting let the current $I_n^T$ and the voltage $V_n^T$ be supplied at the $n$th port. Then, owing to the linearity of the governing equations,

$$V_m^T = \sum_{n=1}^{N} Z_{mn} I_n^T \qquad (m = 1, \ldots, N) \tag{6.46}$$

if there are $N$ ports. The impedances $Z_{mn}$ define a network characterization of the antenna in the transmitting mode. The average power supplied to the system is

$$P^T = \tfrac12\Re\sum_{m=1}^{N} V_m^T I_m^{T*} = \tfrac12\Re\sum_{m=1}^{N}\sum_{n=1}^{N} Z_{mn} I_n^T I_m^{T*} \tag{6.47}$$

from (6.46), the asterisk indicating a complex conjugate. Owing to the linearity of the system

$$P^T = \tfrac12\Re\int_{S_i}\mathbf{E}^T\wedge\mathbf{H}^{T*}\cdot\mathbf{n}\,dS = \tfrac12\Re\int_{S_o}\mathbf{E}^T\wedge\mathbf{H}^{T*}\cdot\mathbf{n}\,dS$$

when the medium between $S_i$ and $S_o$ is lossless. The surface $S_o$ may be replaced by a large sphere at infinity without altering $P^T$. The radiation conditions (§2.1) imply that on this sphere

$$\mathbf{E}^T \sim \mathbf{e}^T\frac{\exp(-ikR)}{R}, \qquad \mathbf{H}^T \sim \mathbf{h}^T\frac{\exp(-ikR)}{R}$$

where the fields $\mathbf{e}^T$, $\mathbf{h}^T$ depend only upon angle and are transverse to the radius vector. Thus, if $\Omega$ is the surface of the unit sphere

$$P^T = \tfrac12\Re\int_\Omega \mathbf{e}^T\wedge\mathbf{h}^{T*}\cdot\mathbf{n}\,d\Omega = \frac{1}{2Z_0}\int_\Omega \mathbf{e}^T\cdot\mathbf{e}^{T*}\,d\Omega$$

since $\mathbf{h}^T = \hat{\mathbf{R}}\wedge\mathbf{e}^T/Z_0$. The dependence of $\mathbf{e}^T$ on $I_n^T$ may be denoted by writing

$$\mathbf{e}^T = \sum_{n=1}^{N} I_n^T\,\mathbf{e}_n^T. \tag{6.48}$$

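Then (6.47) gives the equation

$$\Re\sum_{m=1}^{N}\sum_{n=1}^{N} Z_{mn} I_n^T I_m^{T*} = \frac{1}{Z_0}\sum_{m=1}^{N}\sum_{n=1}^{N} I_n^T I_m^{T*}\int_\Omega \mathbf{e}_n^T\cdot\mathbf{e}_m^{T*}\,d\Omega. \tag{6.49}$$

Put $I_m^T = 1$ and all other currents zero in (6.49). Then

$$\Re Z_{mm} = \frac{1}{Z_0}\int_\Omega \mathbf{e}_m^T\cdot\mathbf{e}_m^{T*}\,d\Omega. \tag{6.50}$$

More generally, it is clear that

$$Z_{mn} + Z_{nm}^* = \frac{2}{Z_0}\int_\Omega \mathbf{e}_n^T\cdot\mathbf{e}_m^{T*}\,d\Omega \tag{6.51}$$

which includes (6.50). Eqn (6.51) relates the impedance elements of the network to the radiated far field (de Hoop 1975; van Bladel 1966).

The derivation of (6.51) is based on the constitutive equations relating the flux densities $\mathbf{D}$, $\mathbf{B}$ linearly to $\mathbf{E}$, $\mathbf{H}$. A further result can be obtained if $\mathbf{D}$ depends only on $\mathbf{E}$ while $\mathbf{B}$ depends only on $\mathbf{H}$. In that case let $\mathbf{E}^{T(1)}$ and $\mathbf{E}^{T(2)}$ be two possible fields produced by different excitations of the transmitting ports. Then

$$\int_{S_o}\{\mathbf{E}^{T(1)}\wedge\mathbf{H}^{T(2)} - \mathbf{E}^{T(2)}\wedge\mathbf{H}^{T(1)}\}\cdot\mathbf{n}\,dS = 0$$

as can be seen by pushing the surface of integration out to infinity and applying the radiation conditions. Because of the particular constitutive relations under consideration the surface $S_o$ can be changed to $S_i$. But, on $S_i$ the description in terms of the ports is valid. Hence

$$\sum_{m=1}^{N} V_m^{T(1)} I_m^{T(2)} = \sum_{m=1}^{N} V_m^{T(2)} I_m^{T(1)}.$$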

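The insertion of (6.46) then enables the deduction

$$Z_{mn} = Z_{nm} \tag{6.52}$$

for any pair of subscripts $m$ and $n$. The symmetry displayed in (6.52) cannot be expected in the general case.

When the antenna is acting as a receptor the excitation will be assumed to be a plane wave travelling in the direction of the unit vector $\mathbf{n}_0$. Consequently the incident field can be expressed as

$$\mathbf{E}^i = \mathbf{E}_0\exp(-ik\mathbf{n}_0\cdot\mathbf{x}), \qquad Z_0\mathbf{H}^i = \mathbf{n}_0\wedge\mathbf{E}_0\exp(-ik\mathbf{n}_0\cdot\mathbf{x})$$

where the complex constant vector $\mathbf{E}_0$ satisfies $\mathbf{n}_0\cdot\mathbf{E}_0 = 0$ in order that the wave be plane. Let the total field generated by this incident wave be $\mathbf{E}^R$, $\mathbf{H}^R$. Now

$$\int_{S_i}(\mathbf{E}^T\wedge\mathbf{H}^R - \mathbf{E}^R\wedge\mathbf{H}^T)\cdot\mathbf{n}\,dS = \int_{S_o}(\mathbf{E}^T\wedge\mathbf{H}^R - \mathbf{E}^R\wedge\mathbf{H}^T)\cdot\mathbf{n}\,dS \tag{6.53}$$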

so long as

(6.54)

holds throughout $V$. The relation (6.54) is a requirement on the electromagnetic properties of the antenna in the transmitting and receiving modes. It will certainly be valid when the medium is reciprocal and may also be true for non-reciprocal media if the constitutive equations have the appropriate form.

The integral on the left-hand side of (6.53) can be rewritten in terms of the currents and voltages on the ports. In the receiving situation let $V_n^R$ be the voltage at the $n$th port and $I_n^R$ the current flowing into the port (i.e. in the opposite direction to $I_n^T$). Then

$$\sum_{m=1}^{N}(V_m^T I_m^R + V_m^R I_m^T) = -\int_{S_o}(\mathbf{E}^T\wedge\mathbf{H}^R - \mathbf{E}^R\wedge\mathbf{H}^T)\cdot\mathbf{n}\,dS.$$

The surface of integration may be shifted out to infinity since (6.54) is certainly satisfied outside $S_o$. Since the difference between $\mathbf{E}^R$, $\mathbf{H}^R$ and $\mathbf{E}^i$, $\mathbf{H}^i$ is a radiating field we deduce that

$$\sum_{m=1}^{N}(V_m^T I_m^R + V_m^R I_m^T) = -\int_{S_R}(\mathbf{E}^T\wedge\mathbf{H}^i - \mathbf{E}^i\wedge\mathbf{H}^T)\cdot\mathbf{n}\,dS$$

where $S_R$ is the surface of the sphere of radius $R$. In view of the special form of $\mathbf{E}^i$ the integral can be evaluated by the method of stationary phase with the result

$$\sum_{m=1}^{N}(V_m^T I_m^R + V_m^R I_m^T) = \frac{4\pi}{i\omega\mu_0}\,\mathbf{e}^T(-\mathbf{n}_0)\cdot\mathbf{E}_0 \tag{6.55}$$

where $\mathbf{e}^T(-\mathbf{n}_0)$ is the value of $\mathbf{e}^T$ in the direction of $-\mathbf{n}_0$. The substitution of (6.46) and (6.48) into (6.55) leads to an equation which must be true for arbitrary $I_n^T$. Hence

$$\sum_{n=1}^{N} Z_{nm} I_n^R + V_m^R = \frac{4\pi}{i\omega\mu_0}\,\mathbf{e}_m^T(-\mathbf{n}_0)\cdot\mathbf{E}_0 \tag{6.56}$$

for $m = 1, \ldots, N$. Eqns (6.56) provide a network description of the receiving situation. The right-hand side represents the driving force due to the incident plane wave while $V_m^R$ is due to an internal voltage source. The impedance matrix is the transpose of the matrix which occurs when the antenna is transmitting. The internal voltage depends upon how the ports are loaded. If, for example, there is a load impedance $Z_{mn}^L$ then

$$V_m^R = \sum_{n=1}^{N} Z_{mn}^L I_n^R \qquad (m = 1, \ldots, N).$$

If all the ports are open $I_n^R = 0$ for all $n$ and measurement of $V_m^R$ determines the right-hand side of (6.56). If the measurements are repeated for two distinct polarizations of the plane wave, $\mathbf{e}_m^T(-\mathbf{n}_0)$ is determined. By varying $\mathbf{n}_0$ the variations of $\mathbf{e}^T$ over the unit sphere can be found. Thus the radiation pattern when transmitting can be derived from the characteristics of the antenna when it is subject to an incident plane wave, provided that the condition (6.54) is complied with.

It is of some interest to examine under what circumstances the power dissipated in the load is a maximum for a given incident plane wave, i.e. when the receiving antenna is matched for maximum power transfer. Let $Z$ and $Z^L$ be the matrices with elements $Z_{mn}$ and $Z_{mn}^L$ respectively. Then (6.56) may be abbreviated to

$$(Z' + Z^L)I = V$$

where the prime indicates a transpose (to avoid using $T$ which has just been employed to identify the transmitting situation). The power dissipated in the load is

$$\tfrac12\Re\,I^* Z^L I = \tfrac12\Re\,V^* Y^{*\prime} Z^L Y V$$

where $Y$ is the inverse of $Z' + Z^L$. Thus the dissipated power is

$$\tfrac12\Re\,V^*(Y^{*\prime} - Y^{*\prime} Z' Y)V.$$

If changes in $Z^L$ are made so that $Y$ becomes $Y + \varepsilon Y_1$, the alteration in dissipated power is

$$\tfrac12\Re\,V^*(\varepsilon Y_1^{*\prime} - \varepsilon Y_1^{*\prime} Z' Y - \varepsilon Y^{*\prime} Z' Y_1 - \varepsilon^2 Y_1^{*\prime} Z' Y_1)V. \tag{6.57}$$

Since $\Re\,V^* Z V = \Re\,V^* Z^{*\prime} V$, this is stationary if

$$\tfrac12\Re\,V^*(Y_1^{*\prime} - Y_1^{*\prime} Z' Y - Y_1^{*\prime} Z^* Y)V = 0.$$

But $Y_1$ is an arbitrary complex matrix so this is possible only if $(Z' + Z^*)Y$ is the unit matrix, i.e.

$$Z^L = Z^*. \tag{6.58}$$

Granted the truth of (6.58), maximum power transfer must occur because the only remaining term in (6.57) is negative since the real part of $Z'$ is forced to be positive semi-definite in a passive system. The condition for matching therefore is that the received load impedance should be the complex conjugate of the impedance matrix of the antenna when transmitting. In this optimal regime the powers absorbed by the load and internally by the network are obviously the same. Hence the maximum power in the load is half that delivered by the sources. The condition of optimality (6.58) is clearly independent of the incoming plane wave.
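For a single port the matching condition reduces to the familiar scalar statement $Z^L = Z^*$, which is easy to confirm by brute force. The following is a minimal sketch under stated assumptions: the antenna impedance of 73 + 42j ohms is a hypothetical illustrative value, the drive is normalized to one volt, and the function names are not from the text. The efficiency expression is the one quoted in Exercise 7 below.

```python
def load_power(z_antenna, z_load, v_drive=1.0):
    """Power in the load, 0.5 * Re(Z_L) * |I|^2, with (Z + Z_L) I = V for one port."""
    current = v_drive / (z_antenna + z_load)
    return 0.5 * z_load.real * abs(current) ** 2


def load_efficiency(z_antenna, z_load):
    """Single-port efficiency 1 - |Z* - Z_L|^2 / |Z + Z_L|^2."""
    mismatch = abs(z_antenna.conjugate() - z_load) ** 2
    return 1.0 - mismatch / abs(z_antenna + z_load) ** 2


z_antenna = complex(73.0, 42.0)      # hypothetical antenna impedance in ohms
matched = z_antenna.conjugate()      # load predicted by the matching condition
p_matched = load_power(z_antenna, matched)

# Brute-force search over passive loads on a 1-ohm grid.
candidates = [complex(r, x) for r in range(1, 201) for x in range(-150, 151)]
best = max(candidates, key=lambda z: load_power(z_antenna, z))
```

On this grid the search returns the conjugate load, and the matched power equals $|V|^2/(8\,\Re Z)$, i.e. half of the power delivered by the source into the matched combination.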

Exercises 6. If ER - Ei

--

eS exp(- ikR)/ R as R

-+

i

00 show that

2 E* eS *(n ) (Z~n+Z~':)I~*I:+eS.es*dQ= -fit o~ o. m= 1 n= 1 Zo a 1WfJo

L L N

N

7. If there is only a single port, define the efficiency $\eta$ of the load by

$$\eta = \frac{16 P^T P^L}{|V^T I^R + V^R I^T|^2}$$

where $P^L$ is the power dissipated in the load. Prove that

$$\eta = 1 - \frac{|Z_{11}^* - Z_{11}^L|^2}{|Z_{11} + Z_{11}^L|^2}$$

and deduce that the maximum efficiency of unity is obtained when the load is matched.

6.7 Numerical methods

The analysis so far presented is admirable in bringing out the physical features of the behaviour but it has its limitations. Much of it relies upon the antenna being straight and of very small radius, as well as being illuminated by a relatively simple field. When one or more of these conditions are not met, numerical techniques need to be employed. For the moment, it will be assumed that the representation (6.1) is still a valid approximation, i.e. attention will be confined to antennas of small cross-section. The governing integral equation is then (6.3) and its form will now be considered in more detail. The point $\mathbf{x}_P$ may be taken as a point $\mathbf{x}$ of the filament to a first approximation. Let $\mathbf{t}(\mathbf{x})$ be a unit vector tangential to the filament at the point $\mathbf{x}$. Then if $\boldsymbol{\xi}$ is the point where the arc length is $s$, the integral equation may be written as

$$-i\omega\varepsilon_0\,\mathbf{t}(\mathbf{x})\cdot\mathbf{E}^i(\mathbf{x}) = \int I(s)\left[k^2\,\mathbf{t}(\mathbf{x})\cdot\mathbf{t}(\boldsymbol{\xi}) + \{\mathbf{t}(\mathbf{x})\cdot\operatorname{grad}\}\{\mathbf{t}(\boldsymbol{\xi})\cdot\operatorname{grad}\}\right]\psi_1(\mathbf{x},\boldsymbol{\xi})\,ds \tag{6.59}$$

where

$$\psi_1(\mathbf{x},\boldsymbol{\xi}) = \frac{\exp(-ik|\mathbf{x} - \boldsymbol{\xi} + \mathbf{a}|)}{4\pi|\mathbf{x} - \boldsymbol{\xi} + \mathbf{a}|}.$$

The vector $\mathbf{a}$ has magnitude equal to the radius of the cross-section at $\boldsymbol{\xi}$. Its direction is parallel to $\mathbf{t}(\boldsymbol{\xi})\wedge(\mathbf{x} - \boldsymbol{\xi})$; if $\mathbf{t}$ and $\mathbf{x} - \boldsymbol{\xi}$ are parallel, any convenient orientation perpendicular to $\mathbf{t}$ may be chosen so long as the direction varies continuously along the wire.

Eqn (6.59) is of the first kind, and efforts to solve such equations numerically go back as far as 1879 when Maxwell was attempting to determine the capacitance of a square metal plate. In essence, his method was that of collocation (§5.6(b)) to find the parameters in a finite-element analysis (§5.8). Nowadays Galerkin (§5.1) and point-matching (§5.6(a)) techniques are available as well as Fourier methods (§5.6(c)). Nevertheless, it is still by no means a trivial business to deduce the best way of tackling (6.59).

One general observation should be made. It has been explained in §5.7 that it is advantageous for certain numerical aspects if the kernel of an integral equation of the first kind is as singular as possible. Therefore, results for tiny radii of cross-section are likely to be more reliable than those for thicker wires.

The fundamental idea in all approaches is to select a set of known basis functions {

Piecewise linear or triangle functions:

$$\phi_m(s) = \frac{h - |s - s_{m-1}|}{h}$$

Piecewise sinusoidal:

$$\phi_m(s) = \frac{\sin k(h - |s - s_{m-1}|)}{\sin kh}$$

Trigonometric:

Quadratic:

$$\phi_m(s) = 1 + B_m(s - s_{m-1}) + C_m(s - s_{m-1})^2$$

In the triangle and piecewise sinusoidal approximations the points of subdivision are equally spaced with $s_m - s_{m-1} = h$ for all $m$. It will also be noticed that these two require an extra segment of length $h$ on the ends so that there are $M + 1$ basis functions rather than $M$. The trigonometric and quadratic representations contain two constants $B_m$ and $C_m$ which are at our disposal; they may be adjusted so as to satisfy extra conditions such as the continuity of $I$.

To apply the boundary conditions that $I(0) = 0$, $I(l) = 0$ it is merely necessary to set $a_1 = 0$ and $a_M = 0$ for the pulse basis functions. For the piecewise linear and piecewise sinusoidal put $a_1$ and $a_{M+1}$ both zero. In the case of the trigonometric and quadratic basis functions the constants $B_1$, $C_1$, $B_{M+1}$, and $C_{M+1}$ of the end-most intervals are adjusted to make $I(0)$ and $I(l)$ vanish.

Both the trigonometric and piecewise sinusoidal expansions have the advantage that a numerical integration can be evaded for a straight antenna by drawing benefit from (6.10). For example,

$$\left(\frac{\partial^2}{\partial z^2} + k^2\right)\int_{s_{m-2}}^{s_m}\sin k(h - |s - s_{m-1}|)\,\psi\,ds = \frac{k}{4\pi}\{\psi_m(z) + \psi_{m-2}(z) - 2\psi_{m-1}(z)\cos kh\} \tag{6.61}$$

where

$$\psi_m(z) = \frac{\exp[-ik\{(z - s_m)^2 + a^2\}^{1/2}]}{\{(z - s_m)^2 + a^2\}^{1/2}}.$$
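The triangle and piecewise sinusoidal basis functions listed above are simple enough to tabulate directly, which also makes it easy to see how closely they agree on a fine subdivision. The sketch below is illustrative only: the wavelength, the choice $h = \lambda/40$, the wire length, and the coefficient values are assumptions made for the example, and each basis function is centred on a node as in the formulas above.

```python
import math


def triangle(s, centre, h):
    """Triangle function (h - |s - centre|) / h, zero outside |s - centre| < h."""
    return max(0.0, (h - abs(s - centre)) / h)


def piecewise_sinusoid(s, centre, h, k):
    """sin k(h - |s - centre|) / sin kh, zero outside |s - centre| < h."""
    d = abs(s - centre)
    if d >= h:
        return 0.0
    return math.sin(k * (h - d)) / math.sin(k * h)


def expand_current(s, coeffs, nodes, basis):
    """Current approximation: sum of coefficient times basis function."""
    return sum(a * basis(s, c) for a, c in zip(coeffs, nodes))


wavelength = 1.0                      # assumed unit wavelength
k = 2.0 * math.pi / wavelength
h = wavelength / 40.0                 # assumed fine subdivision, h << wavelength
nodes = [m * h for m in range(21)]    # assumed wire of length 0.5 wavelength
coeffs = [math.sin(math.pi * m / 20) for m in range(21)]
coeffs[0] = coeffs[-1] = 0.0          # end coefficients set to zero

# Largest pointwise gap between the two basis shapes on one support.
samples = [-h + 2.0 * h * j / 400 for j in range(401)]
max_gap = max(abs(triangle(s, 0.0, h) - piecewise_sinusoid(s, 0.0, h, k))
              for s in samples)

current_tri = lambda s: expand_current(s, coeffs, nodes,
                                       lambda x, c: triangle(x, c, h))
current_sin = lambda s: expand_current(s, coeffs, nodes,
                                       lambda x, c: piecewise_sinusoid(x, c, h, k))
```

With these assumed values the two shapes differ by well under one per cent of their peak, and both expansions vanish at the ends of the wire once the end coefficients are zero.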

Therefore if an expansion of $I$ is made in terms of the piecewise sinusoidal basis function and subsectional collocation is employed with matching at $s = s_n$, the system of equations

$$\frac{k}{\sin kh}\sum_{m=2}^{M} a_m\{\psi_m(s_n) + \psi_{m-2}(s_n) - 2\psi_{m-1}(s_n)\cos kh\} = -4\pi i\omega\varepsilon_0 E_z^i(s_n) \tag{6.62}$$

is deduced from (6.61) and (6.4). The coefficients $a_1$ and $a_{M+1}$ are missing because of the boundary conditions at the end of the wire and (6.62) is implemented at $n = 1, \ldots, M - 1$. There are therefore exactly the right number of equations to determine the unknown coefficients $a_m$ and their solution may be dealt with by the techniques of §§1.12 and 1.13. Apparently, as $M$ increases from 3 or 4, the numerical answer converges rapidly to the correct solution, but this is illusory because further increase in $M$ makes the linear system (6.62) ill conditioned (§1.12) (Taylor and Wilton 1972). This is related to the fact that $\psi_m(s_m)$ is very large for small radii so that the combination $a_n + a_{n+2} - 2a_{n+1}\cos kh$ dominates in (6.62). This combination is a multiple of the discontinuity of the derivative of the representation (6.60) as $s$ goes from just above $s_n$ to just below. Thus the main effort of the process is expended in trying to secure the continuity of the first derivative of the representation. Because of this emphasis with its concentration on the points $s_n$ the coefficients $a_m$ found may well be such that the approximate current gives an $E_z$ which is badly in error when $s$ is not a matching point, as may be verified easily in practice (Pearson and Butler 1975). Only when the discontinuity in the derivative can be expected to be naturally small is (6.62) likely to furnish good results. Such an eventuality will arise when the current distribution on the antenna is sinusoidal, i.e. when $kl = n\pi$ and the antenna is resonant. Although (6.62) dispenses with the need for numerical integration, the procedure cannot be recommended except possibly for the resonant antenna whose length is an integral number of half wavelengths or almost so. One may attempt to improve the smoothness by averaging the equations instead of enforcing them at the solitary points $s_n$.
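The remark that the dominant combination is naturally small for a sinusoidal current can be checked in a few lines. The sketch below evaluates $a_n + a_{n+2} - 2a_{n+1}\cos kh$ for coefficients sampled from a sinusoid and, for contrast, from a linear ramp; the spacing, phase offset, and number of samples are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math


def derivative_jump_terms(coeffs, kh):
    """Combination a_n + a_{n+2} - 2 a_{n+1} cos(kh) for consecutive triples."""
    return [coeffs[n] + coeffs[n + 2] - 2.0 * coeffs[n + 1] * math.cos(kh)
            for n in range(len(coeffs) - 2)]


kh = 2.0 * math.pi / 40.0             # assumed spacing of one fortieth of a wavelength

# Coefficients sampled from a sinusoidal current: the combination cancels
# identically because sin(x - kh) + sin(x + kh) = 2 sin(x) cos(kh).
sinusoid = [math.sin(kh * n + 0.3) for n in range(30)]
sinusoid_terms = derivative_jump_terms(sinusoid, kh)

# Coefficients sampled from a linear ramp: the combination does not cancel.
ramp = [0.1 * n for n in range(30)]
ramp_terms = derivative_jump_terms(ramp, kh)
```

For the sinusoidal samples every term is zero to rounding error, whereas for the ramp the terms grow along the wire, which is the situation in which the collocation equations spend their effort on the derivative discontinuity.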
In other words, a Galerkin procedure (§5.1) is adopted in which integration is performed after multiplication by a suitable weight function. Thus, in (6.62), $\psi_m(s_n)$ is replaced by $\int_0^l \psi_m(z)w_n(z)\,dz$. If the weight functions are chosen as piecewise sinusoidal, numerical integration is again escaped. In practice, this choice yields good results (Butler and Wilton 1975) when the antenna is split into 30 or 40 segments. A typical output is shown in Fig. 6.8. However, the selection of weight functions is crucial because it is known that some options deliver erroneous answers. For example (Klein and Mittra 1975), if the weight functions are pulse functions whose width need not be $h$, wildly different approximations arise for different pulse widths and the differences are not due to the numerical integration involved. While the influence of weight functions is incompletely understood it is always wise to check how well the approximate current arrived at reproduces $E_z^i$ over the whole length of the wire. If the boundary condition

Fig. 6.8. Input resistance as a function of frequency when l/2a = 2000.

is satisfied everywhere with tolerable accuracy one can hope to have attained a reasonable approximation, but wide discrepancies in various intervals will, in general, imply a poor approximation to the current. All of these remarks apply with equal strength if the triangle functions are adopted for the basis since, when $h \ll \lambda$ (which is advisable to keep proper track of phase variations), there is not much difference between the piecewise sinusoidal and triangle functions.

A trigonometric basis offers also the opportunity of avoiding numerical integration. The constants $B_m$ and $C_m$ are selected so that the current vanishes at the ends and the current, together with its first derivative, is continuous across $s = s_m$ ($m = 1, \ldots, M - 1$). There are exactly the right number of conditions to fix the $B_m$ and $C_m$. Then the field is matched at $M$ points so that the $a_m$ are educed. This method, which overcomes the embarrassment of the discontinuous first derivative described above, has accuracy comparable with the Galerkin method (with piecewise sinusoidal weight functions) already mentioned.

Another possibility for the straight antenna is to work with the integral equation (6.7). Because of the integration on the right-hand side the equations are less critically dependent on the values of the applied electric field at particular points. Furthermore, the absence of a derivative on the left-hand side means that it is less sensitive to discontinuities in the approximating currents. Accordingly, the use of triangle functions will not cause trouble. By matching at $s = s_0, \ldots, s_M$ we obtain $M + 1$ equations for the $M - 1$ current coefficients and the two unknown constants $A$ and $B$. The accuracy gained by this method is of the same order as that for the trigonometric basis except, perhaps, at the lower values of $M$.

Fig. 6.9. The gap where the feed is inserted.

There is one topic which has not yet been alluded to and that is the feeding arrangement. The case of the antenna as a transmitting device may be handled by propounding the existence of a gap $l_1 \le s \le l_2$ in which the feed is introduced (Fig. 6.9). Then, if

$$\int_{l_1}^{l_2} E_z(s)\,ds = -V,$$

$V$ may be regarded as the voltage of the source. The input impedance $Z$ and input admittance $Y$ of the antenna can then be defined by

$$Z = V/I(l_2) = 1/Y. \tag{6.63}$$
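The definition of the source voltage as minus the integral of $E_z$ across the gap, followed by (6.63), translates directly into a short quadrature. The sketch below uses a composite trapezoidal rule; the uniform gap field, the gap width, and the feed current are hypothetical numbers chosen only so that the result can be checked by hand, and the function names are illustrative.

```python
def gap_voltage(e_z, l1, l2, samples=200):
    """V = -integral of E_z(s) ds from l1 to l2, by the composite trapezoidal rule."""
    step = (l2 - l1) / samples
    total = 0.5 * (e_z(l1) + e_z(l2))
    for j in range(1, samples):
        total += e_z(l1 + j * step)
    return -total * step


def input_impedance(voltage, current_at_gap):
    """Z = V / I(l2); the admittance is the reciprocal."""
    return voltage / current_at_gap


# Hypothetical check case: a uniform field of -100 V/m across a 0.02 m gap
# corresponds to a 2 V source.
voltage = gap_voltage(lambda s: -100.0, 0.0, 0.02)
z_in = input_impedance(voltage, complex(0.02, 0.01))   # assumed feed current in amperes
y_in = 1.0 / z_in
```

In a real calculation `e_z` would be the field reproduced by the approximate current, and repeating the integration over a wider interval gives a rough indication of how far the computed field departs from zero on the wire.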

However, things are not quite straightforward because now it is required that the current in the wire produce zero $E_z$ on the wire and the correct $E_z$ in the gap. However, the integral equation holds only on the wire and so, if only points on the wire are considered, we are led to a homogeneous system of linear equations whose solution is, consequently, non-unique. Two ways out of this impasse have been suggested. The first is to take $E_z$ in the gap as $-V/(l_2 - l_1)$ and then to require the current to reproduce this value. The non-uniqueness disappears but it is found, in practice, that no reliance can be placed on the impedance (Miller and Deadrick 1973) calculated by this means. Some improvement occurs if the end segments of the wire adjacent to the gap are made the same length as the gap. This is because, as pointed out in §6.3, the input susceptance is very responsive to small changes near the gap. The input conductance is relatively insensitive, so any numerical method that entails fluctuations in the conductance for minor variations in the gap length may be safely discarded as unreliable.

The second method is to specify the current at the gap, say $I(l_2) = I(l_1) = 1$. The system of equations from the wire then becomes determinate. With $I(s)$ known, $E_z$ can be evaluated in the gap and thus $V$ acquired. In fact, the integral can be taken from 0 to $l$ since $E_z$ is supposed to be zero on the wire. Owing to the approximations carried out this will not be exactly true and the alterations in $V$ with integration length provide a measure of reliability. Usually $V$ settles down once the integration extends beyond two or three central segments. It is, of course, vital to employ a numerical method which eliminates the spikiness in $E_z$ already referred to in connection with piecewise sinusoidal basis functions. On the whole, this method leads to more satisfactory values than the first. As soon as a satisfactory approximant to the current has been achieved, the far-field pattern can be computed from

. E(x) ~ EI(X)

k2

exp( - ikR)

4nlcoGo

R

+ - .-

x

L am M

L[.(S) - {'(S).~} ~J m= 1

cPm(s) exp

Ck~ ;») ds

(6.64)

where the fast Fourier transform (§2.14) may be expedient. The first term is absent when the antenna is transmitting. The power gain of the transmitter may be taken as

$$\frac{\{|E_\theta|^2 + |E_\phi|^2\}R^2}{30\,|I|^2\,\Re Z}$$

where $I$ is the current at the feed. The total power radiated also follows in a straightforward way and, if it be written as $\tfrac12|I|^2 R_r$, the radiation resistance $R_r$ is known. The input power is $\tfrac12|I|^2\,\Re Z$ so the radiation efficiency is $R_r/\Re Z$. A measure of the back-scattering by a receiving antenna is the radar cross-section $\sigma_r$. This is defined for an incident plane wave by means of the intensity scattered back towards the origins of the plane wave. Thus

$$\sigma_r = \lim_{R\to\infty}\frac{4\pi R^2\,|\mathbf{E}_s|^2}{|\mathbf{E}^i|^2}$$

where $\mathbf{E}_s$ is the field scattered in the specified direction. The radar cross-section depends not only on the geometry and materials of the antenna but also on the wavelength, polarization, and direction of arrival of the incident wave. The problem of determining the reflection characteristics of an antenna of complex shape via the radar cross-section is therefore a formidable task (cf. §7.7). If the incident wave is not truly plane the radar cross-section may misrepresent the back-scattering because of the influence of the curvature of the wavefront. Practical implementations often seek to take advantage of the fast Fourier transform (see §§1.7, 2.14, Brigham 1988), which will be abbreviated to FFT.
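The gain, efficiency, and radar cross-section expressions of this section reduce to a few lines of arithmetic once the fields are known. The following Python sketch is illustrative only. The function names and argument conventions are assumptions introduced here, the factor 30 is taken from the gain expression above (it corresponds to a free-space impedance of $120\pi$ ohms), and the radar cross-section is evaluated at a large but finite $R$ as a stand-in for the limit.

```python
import math

def power_gain(E_theta, E_phi, R, I_feed, Z_in):
    """Power gain {|E_theta|^2 + |E_phi|^2} R^2 / (30 |I|^2 Re Z)."""
    field_sq = abs(E_theta) ** 2 + abs(E_phi) ** 2
    return field_sq * R ** 2 / (30.0 * abs(I_feed) ** 2 * complex(Z_in).real)

def radiation_efficiency(R_r, Z_in):
    """Radiation efficiency R_r / Re Z."""
    return R_r / complex(Z_in).real

def radar_cross_section(E_s, E_i, R):
    """Finite-R estimate of 4 pi R^2 |E_s|^2 / |E_i|^2 (assumes R is in the far field)."""
    return 4.0 * math.pi * R ** 2 * abs(E_s) ** 2 / abs(E_i) ** 2

# Consistency check with a hypothetical lossless isotropic radiator:
# all the input power (1/2)|I|^2 Re Z is spread uniformly over the sphere of radius R.
I_feed, Z_in, R = 2.0, 50.0 + 20.0j, 1.0e3
P_rad = 0.5 * abs(I_feed) ** 2 * Z_in.real
E_mag = math.sqrt(60.0 * P_rad) / R      # |E|^2 / (2 * 120 pi) = P_rad / (4 pi R^2)
gain = power_gain(E_mag, 0.0, R, I_feed, Z_in)
```

For this idealized radiator the gain comes out as unity, which is a quick way of confirming that the normalization in the gain expression has been coded consistently.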


Recall that the FFT is concerned with the discrete Fourier transform

$$\tilde e_n = \sum_{k=0}^{N-1} e_k\,e^{-2\pi i nk/N} \qquad (n = 0, 1, \ldots, N-1) \tag{6.64a}$$

with inversion formula

$$e_k = \frac{1}{N}\sum_{n=0}^{N-1}\tilde e_n\,e^{2\pi i kn/N} \qquad (k = 0, 1, \ldots, N-1). \tag{6.64b}$$

Further, if

$$e_k = \sum_{p=0}^{N-1} f_p\,g_{k-p} \qquad (k = 0, 1, \ldots, N-1) \tag{6.64c}$$

then

$$\tilde e_n = \sum_{k=0}^{N-1}\sum_{p=0}^{N-1} f_p\,g_{k-p}\,e^{-2\pi i nk/N} = \sum_{p=0}^{N-1} f_p\,e^{-2\pi i np/N}\sum_{q=-p}^{N-1-p} g_q\,e^{-2\pi i nq/N}.$$

Since

$$\sum_{q=-p}^{-1} g_q\,e^{-2\pi i nq/N} = \sum_{r=N-p}^{N-1} g_{r-N}\,e^{-2\pi i nr/N}$$

it follows, provided

$$g_{r-N} = g_r \qquad (r = 1, 2, \ldots, N-1), \tag{6.64d}$$

that

$$\tilde e_n = \tilde f_n\,\tilde g_n. \tag{6.64e}$$

If the sequence $\{g_r\}$ does not satisfy (6.64d) it is still possible to arrive at (6.64e) by working with modified transforms. In effect, the sequences are lengthened so as to achieve the desired property. Take $G_n = g_n$ for $n = 1-N, 2-N, \ldots, N-1$ and

$$G_n = g_{n+1-2N} \quad (n = N, N+1, \ldots, 2N-2), \qquad G_n = g_{n-1+2N} \quad (n = 2-2N, 3-2N, \ldots, -N).$$

Then $G_{n-2N+1} = G_n$ for $n = 1, 2, \ldots, 2N-2$. The sequence (6.64c) is kept intact by padding $\{f_p\}$ with zeros, i.e. $F_p = f_p$ $(p = 0, 1, \ldots, N-1)$, $F_p = 0$ $(p = N, \ldots, 2N-2)$. Now (6.64c) can be embedded in

$$E_k = \sum_{p=0}^{2N-2} F_p\,G_{k-p} \qquad (k = 0, 1, \ldots, 2N-2),$$

$$\tilde E_n = \sum_{k=0}^{2N-2} E_k\,e^{-2\pi i nk/(2N-1)}.$$
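The embedding just described can be checked numerically. The sketch below is a minimal illustration under stated assumptions, not the book's program: the helper names (`dft`, `toeplitz_sum_direct`, `toeplitz_sum_dft`) are hypothetical, the sequence $g_r$ is stored in a dictionary keyed by the offsets $1-N, \ldots, N-1$, and a naive $O(M^2)$ transform stands in for an FFT routine so that the example needs only the standard library.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform in the sense of (6.64a) and (6.64b)."""
    M = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for n in range(M):
        s = sum(x[k] * cmath.exp(sign * 2j * cmath.pi * n * k / M) for k in range(M))
        out.append(s / M if inverse else s)
    return out

def toeplitz_sum_direct(f, g):
    """e_k = sum_p f_p g_{k-p}, evaluated term by term as in (6.64c)."""
    N = len(f)
    return [sum(f[p] * g[k - p] for p in range(N)) for k in range(N)]

def toeplitz_sum_dft(f, g):
    """Same sums via transforms of length 2N-1, after padding f and extending g."""
    N = len(f)
    M = 2 * N - 1
    F = list(f) + [0.0] * (N - 1)                     # F_p = 0 for p = N..2N-2
    # G_n = g_n for n = 0..N-1 and G_n = g_{n+1-2N} for n = N..2N-2
    G = [g[n] if n < N else g[n - M] for n in range(M)]
    product = [a * b for a, b in zip(dft(F), dft(G))]  # the analogue of (6.64e)
    return dft(product, inverse=True)[:N]              # first N entries reproduce e_k

f = [1.0, 2.0, 3.0]
g = {-2: 0.5, -1: -1.0, 0: 2.0, 1: 3.0, 2: -4.0}      # does not satisfy (6.64d)
direct = toeplitz_sum_direct(f, g)
via_dft = toeplitz_sum_dft(f, g)
```

Only the first $N$ entries of the lengthened sequence are needed; the remaining $N-1$ entries are a by-product of the embedding. In practice the naive transform would be replaced by an FFT of convenient length.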


Consider the integral equation

$$E(z) = \int_a^b I(s)K(z-s)\,ds$$

holding for $z$ in $(a, b)$. Let $\chi(s)$ be the characteristic function which is unity in $(a, b)$ and zero elsewhere. The integral equation can be written

$$E(z) = \int_{-\infty}^{\infty}\chi(s)I(s)K(z-s)\,ds = \frac{1}{4\pi^2}\int_{-\infty}^{\infty} e^{i\alpha z}K(\alpha)\tilde I\,d\alpha$$

in terms of continuous transforms. One method of attack is to approximate the continuous transforms by discrete ones but the resulting inaccuracy leads generally to poor convergence (Peters and Volakis 1989). A better alternative is to try

$$\chi(s)I(s) = \sum_m a_m\,\phi(s - s_m)$$

where $\phi$ is a basis function. This supplies

$$\tilde I = \tilde\phi\sum_m a_m\,e^{-i\alpha s_m}$$

which may be substituted directly in the above integral. It may happen that there are singularities present; perhaps $K$ involves derivatives or the sought $I$ is singular (as can occur for (6.7) with arbitrary $A$ and $B$). In that case it is probably advisable to stay with

$$E(z) = \sum_m a_m\int_{-\infty}^{\infty}\phi(s)K(z - s_m - s)\,ds.$$

If the $s_m$ are separated by intervals of equal length and point matching is applied, a sequence of the type (6.64c) is derived. Alternatively, one may apply pulse functions as testing functions to obtain a similar sort of sequence. Whichever route is chosen one ends up with a matrix equation. This may be tackled by direct inversion or by conjugate gradients (when the method is usually denoted by CGFFT). The question of whether to prefer a direct inversion or CGFFT is not easy to settle. The answer depends on the convergence properties, the likely round-off error, the conditioning of the matrix, and the location of its eigenvalues. A discussion of the matter can be found in Peterson et al. (1988), Peterson (1989). Finally, it should be remarked that CGFFT can be extended to higher dimensions. For instance, if

$$\tilde e_{nm} = \sum_{k=0}^{N-1}\sum_{p=0}^{M-1} e_{kp}\exp\left(-2\pi i\frac{nk}{N} - 2\pi i\frac{mp}{M}\right)$$


and

$$e_{kp} = \sum_{n=0}^{N-1}\sum_{m=0}^{M-1} f_{nm}\,g_{k-n,\,p-m}$$

then $\tilde e_{nm} = \tilde f_{nm}\tilde g_{nm}$ provided that $g_{r-N,\,s-M} = g_{rs}$; the sequences can be extended to verify this as for a single dimension (see, for example, Peters and Volakis 1988).

Exercises

8. A plane wave falls on a straight receiving antenna of length $l$ and radius $a$. Compare the current obtained at the centre of the wire by the various numerical methods described in this section as $M$ increases from 5. How large does $M$ have to be for you to have confidence in the results? Take $a = 0.01\lambda$ and give $l$ a number of values between $0.3\lambda$ and $1.5\lambda$. Would your conclusions be modified if (i) $a = 0.001\lambda$, (ii) $a = 0.1\lambda$?

9. Examine how the input admittance of a centre-fed antenna varies as the width of the gap is narrowed. Use both methods above and decide which you prefer. Consider the same values of $a$ and $l$ as in Exercise 8.

10. Calculate the radiation pattern and radiation efficiency for the antennas in Exercise 9. Do the variations in the pattern surprise you?

6.8 Curved antennas

It is pretty futile to try to circumvent numerical integration by working with piecewise sinusoidal basis functions when the antenna is not straight unless perhaps it is polygonal in shape. Even then extreme care would be necessary to mitigate the bother caused by discontinuities in derivatives already encountered. One may also anticipate that approximating the antenna by a polygon will lead to numerical difficulties because of the discontinuities in slope so embodied. Therefore, it is fitting to investigate schemes which incorporate an appropriate amount of continuity right from the beginning. Apposite basis functions are the B-splines discussed in §1.1 and now commonly utilized for computer-aided design in mechanical engineering, but it will be convenient to consider them more fully than was done there. Let $t_1, t_2, \ldots, t_m$ be $m$ real numbers such that $t_1 < t_2 < \cdots < t_m$; these are to be the knots of the spline. With $x_+^n$ meaning $x^n$ if $x > 0$ and $0$ if $x \le 0$, the B-spline of degree $n - 1$ is defined by

$$M_{ni}(t) = \sum_{r=i-n}^{i}\frac{(t_r - t)_+^{n-1}}{\omega'(t_r)} \tag{6.65}$$

where $\omega(t) = (t - t_{i-n})(t - t_{i-n+1})\cdots(t - t_i)$ and the prime is a derivative with respect to the argument. The definition (6.65) is more general than that of §1.1, allowing the knots to be arbitrarily spaced and the degree of the B-spline to be chosen freely. On taking $n = 4$ and the knots equispaced at integer intervals, we can confirm without difficulty that (6.65) gives $\tfrac14 B_i$ as defined in (1.14).


The knots of $M_{ni}$ are at $t_{i-n}, t_{i-n+1}, \ldots, t_i$ and $M_{ni}$ vanishes identically for $t \le t_{i-n}$ and $t \ge t_i$. For $t_{i-n} < t < t_i$ the B-spline is strictly positive (Schoenberg 1946; Curry and Schoenberg 1947, 1966). In addition, $M_{ni}$ and its derivatives of order $1, 2, \ldots, n - 2$ are continuous for all $t$. A general spline for the finite interval $a \le t \le b$ can be constructed by taking $t_1 > a$, $t_m < b$ and adding $n$ knots at each end so that

Then

$$S(t) = \sum_{i=1}^{m+n} c_i M_{ni}(t).$$

An arbitrary function may be likewise expanded and the coefficients determined by reproducing the function values at $m + n$ nodes which may be distributed conveniently apart from some mild restrictions on their relation to knots. The equations for the coefficients have a simple structure because every row of the matrix has at most four non-zero elements (three if the nodes and knots coincide). It is tempting to substitute the definition (6.65) and write the expansion as

$$\sum_{i=1}^{n} a_i t^{i-1} + \sum_{j=1}^{m} b_j (t - t_j)_+^{n-1}$$

but this is unprofitable for numerical purposes, being especially bad for interpolation. The equations to determine $a_i$ and $b_j$ are extremely ill conditioned even if $n$ is small. Moreover, even when $a_i$ and $b_j$ are known accurately, the evaluation of the sum engenders many arithmetic operations and can experience severe loss of accuracy due to cancellation (Carasso 1966). Nor is it really satisfactory to compute $M_{ni}$ direct from (6.65) on account of the instability which arises. Instead, it is better to use the recurrence relation (6.66). This process is stable, accurate, and fast (Cox 1972) as well as permitting one to take advantage of the fact that $M_{1j}(t)$ is zero except in $(t_{j-1}, t_j)$ where it is $1/(t_j - t_{j-1})$. Stable algorithms for derivatives and integrals of B-splines are also available (de Boor 1972, 1973). The B-splines of (6.65) are normalized so that

$$\int_{-\infty}^{\infty} M_{ni}(t)\,dt = 1/n.$$

For our purposes, renormalization possesses some convenience. Define (6.67)


Then

$$\sum_{i=1}^{m+n} N_{ni}(t) = 1 \tag{6.68}$$

for $a \le t \le b$, as may be established quickly by induction from (6.66) and the fact that (6.68) is true when $n = 1$. From now on only cubic B-splines will be discussed, so that $n = 4$. For simplicity, $N_i$ will be written instead of $N_{4i}$. In view of the particularly simple form which B-splines assume when the interval between the knots is unity, a parameter will be introduced to convert the knots to integer values. In other words, a class of parametric B-splines will be the subject of investigation. Such parametric B-splines are used extensively in the aircraft industry in planning the geometry of an aeroplane (Roberts and Rundle 1972). Let $N_i(\sigma)$ be the normalized cubic B-spline which has knots at the integers $i - 4$, $i - 3$, $i - 2$, $i - 1$, $i$. Then $N_{i+1}(\sigma)$ has the same shape as $N_i$ but is shifted along the $\sigma$ axis by a unit length. It is intended to regard $\sigma$ as a parameter so that a given curve is described as $\sigma$ goes from 1 to $T$. Only the integer values of $\sigma$ will be made to correspond exactly to specific points on the curve so that in between the curve is being approximated. Therefore, the approximant for the curve is

$$x(\sigma) = \sum_{i=2}^{T+3} X_i N_i(\sigma), \qquad y(\sigma) = \sum_{i=2}^{T+3} Y_i N_i(\sigma). \tag{6.69}$$
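The recurrence (6.66) and the renormalization (6.67) are cited above but their displays are not reproduced in this text. The Python sketch below therefore rests on assumptions: it uses the commonly quoted recurrence for B-splines normalized as in (6.65), namely $M_{ni}(t) = \{(t - t_{i-n})M_{n-1,i-1}(t) + (t_i - t)M_{n-1,i}(t)\}/(t_i - t_{i-n})$, and takes $N_{ni} = (t_i - t_{i-n})M_{ni}$ as the renormalized spline. The function names are hypothetical.

```python
def m_spline(n, i, t, knot):
    """M_{n,i}(t), supported on (knot(i-n), knot(i)), by the assumed recurrence."""
    if n == 1:
        lo, hi = knot(i - 1), knot(i)
        # M_{1,i} is 1/(t_i - t_{i-1}) inside its interval and zero elsewhere
        return 1.0 / (hi - lo) if lo <= t < hi else 0.0
    lo, hi = knot(i - n), knot(i)
    left = m_spline(n - 1, i - 1, t, knot)
    right = m_spline(n - 1, i, t, knot)
    return ((t - lo) * left + (hi - t) * right) / (hi - lo)

def n_spline(n, i, t, knot):
    """Renormalized spline, assumed here to be (t_i - t_{i-n}) M_{n,i}(t)."""
    return (knot(i) - knot(i - n)) * m_spline(n, i, t, knot)

def integer_knot(j):
    """Knots at the integers, as used for the parametric cubic splines."""
    return float(j)

# Cubic case n = 4 with integer knots: values of N_5 at its interior knots
peak = n_spline(4, 5, 3.0, integer_knot)     # sigma = i - 2
side = n_spline(4, 5, 4.0, integer_knot)     # sigma = i - 1
# Only N_3, ..., N_6 are non-zero at sigma = 2.3; their sum illustrates (6.68)
total = sum(n_spline(4, i, 2.3, integer_knot) for i in range(3, 7))
```

With integer knots the cubic spline takes the values 2/3 at its central knot and 1/6 at the two neighbouring knots, and the four splines overlapping any point sum to unity. The recursive form is chosen for brevity; a production routine would tabulate the lower orders rather than recompute them.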

Only two-dimensional curves are treated in detail but the extension to three dimensions is immediate. The coefficients $X_i$ and $Y_i$ are fixed, first by requiring that, when $\sigma = j$ $(j = 1, \ldots, T)$, $x = x_j$ and $y = y_j$ where $(x_j, y_j)$ are specified points on the given curve. This gives $2T$ conditions for $2T + 4$ unknowns. If the curve is closed, periodicity supplies the missing information. Otherwise, two extra points can be designated for matching at half-integer values of $\sigma$, for example $\sigma = T - \tfrac12$. Since, from (1.14),

$$N_i(\sigma) = \tfrac16\{(\sigma-i)_+^3 - 4(\sigma-i+1)_+^3 + 6(\sigma-i+2)_+^3 - 4(\sigma-i+3)_+^3 + (\sigma-i+4)_+^3\}, \tag{6.70}$$

$N_i(i) = 0 = N_i(i - 4)$, $N_i(i - 1) = \tfrac16$, $N_i(i - 2) = \tfrac23$, $N_i(i - 3) = \tfrac16$. Thus the matrix associated with finding $X_i$ and $Y_i$ will be tridiagonal with the only non-zero elements on each row being $\tfrac16$, $\tfrac23$, $\tfrac16$ except possibly for those rows relating to the two extra points or special properties of the curve. Computation of the $X_i$ and $Y_i$ should therefore be rapid.

The arc length between successive data points becomes unity when the $\sigma$ variable is employed. It is therefore desirable to aim at making the spacing between data points roughly uniform so that the spline curve is reasonably representative of the original. At any rate, the arc lengths between successive data points should not fluctuate wildly. It is, of course, vital to choose $T$ large enough to ensure that the approximant is a tolerable replica of the given curve.

The B-spline representation guarantees continuity of the first two derivatives. If the antenna exhibits discontinuities, say in slope, it may be undesirable for the approximating curve to display higher continuity there. In that case, the antenna should be split into a number of macro-elements, in each of which the B-spline approximation is adopted but without undue continuity enforced at their joins. For simplicity of presentation, it will be assumed in the following that there is only one macro-element.

As soon as eqn (6.69) of the wire has been settled, the evaluation of the integrals in (6.59) has to be dealt with. In these both $\mathbf{x}$ and $\boldsymbol{\xi}$ are replaced by the same formula (6.69) but $\sigma$ and $\sigma'$ are employed as their respective parameters to distinguish between them. Then

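$$\mathbf{t}(\boldsymbol{\xi}) = \frac{d\boldsymbol{\xi}}{ds} = \frac{d\boldsymbol{\xi}}{d\sigma'}\frac{d\sigma'}{ds}$$

and

$$\frac{ds}{d\sigma'} = \left\{\left(\frac{d\xi}{d\sigma'}\right)^2 + \left(\frac{d\eta}{d\sigma'}\right)^2\right\}^{1/2}$$

where $\boldsymbol{\xi} \equiv (\xi, \eta)$, again restricting attention to two-dimensional wires. Consequently, $\mathbf{t}$ is readily available. The derivatives are evaluated analytically before the substitution (6.69). By this process the integral equation goes over to the form

$$f(\sigma) = \int_1^T I(\sigma')K(\sigma, \sigma')\,d\sigma'$$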

where $f$ and $K$ are known but $I$ has to be found. It is now desired to insert a suitable approximation for $I$ so that a numerical solution is revealed. A general method will be described though CGFFT may be applicable if $K(\sigma, \sigma') = K(\sigma - \sigma')$. Since the typical distance between knots for $\sigma'$ is unity a canonical integral to be computed is

$$\int_m^{m+1} I(\sigma')K(\sigma, \sigma')\,d\sigma'$$

where $m$ is a positive integer. The first stage is to approximate the integral by a quadrature formula. It need not be of high order and a choice appropriate to our method of approximation by cubic splines such as a Gaussian method of order 3 (§5.12) should be sufficient since the kernel is non-singular. (If $a$ is very small it may be helpful to consider the methods for singular kernels discussed later in §6.19.) Hence

$$\int_m^{m+1} I(\sigma')K(\sigma, \sigma')\,d\sigma' \approx \sum_{r=1}^{n} w_r\,I(m + \sigma_r)\,K(\sigma, m + \sigma_r);$$

here $w_r$ and $\sigma_r$ are the weights and points respectively required by the quadrature rule applied to the interval $(0, 1)$. Next we assume that $I$ can be approximated by cubic B-splines with the same knots as those of the representation of the wire, i.e.

$$I(\sigma') = \sum_{i=2}^{T+3} I_i N_i(\sigma')$$

where the complex constants $I_i$ are to be determined from the integral equation and any end conditions imposed. For the interval under consideration only four of the B-splines will be non-zero. Also $N_{i+1}(\sigma) = N_i(\sigma - 1)$ and hence

$$\int_m^{m+1} I(\sigma')K(\sigma, \sigma')\,d\sigma' \approx \sum_{r=1}^{n} w_r K(\sigma, m + \sigma_r)\sum_{j=1}^{4} I_{m+j}N_j(\sigma_r) \approx \sum_{j=1}^{4} I_{m+j}\sum_{r=1}^{n} w_r N_j(\sigma_r)K(\sigma, m + \sigma_r).$$

The constant $w_r N_j(\sigma_r)$ is independent of $m$ and so, once computed for one interval, is available for all intervals. By point matching at the integer values of $\sigma$ the values of $I_i$ solve the linear system of equations resulting from the integral equation. The antenna characteristics and radiation pattern may then be determined as in previous sections. A numerical procedure for bent antennas has therefore been devised.
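The remark that the constants $w_r N_j(\sigma_r)$ need be computed only once can be made concrete. The Python sketch below is a minimal illustration under stated assumptions: it takes the "Gaussian method of order 3" to mean the three-point Gauss-Legendre rule mapped to $(0, 1)$, evaluates $N_j$ from the truncated-power form (6.70), and uses hypothetical names (`cubic_bspline`, `interval_row`) that do not come from the text.

```python
import math

def cubic_bspline(i, sigma):
    """N_i(sigma) from (6.70), with knots at the integers i-4, ..., i."""
    coeffs = (1.0, -4.0, 6.0, -4.0, 1.0)
    return sum(c * max(sigma - i + k, 0.0) ** 3 for k, c in enumerate(coeffs)) / 6.0

# Three-point Gauss-Legendre rule transferred from (-1, 1) to (0, 1)
_h = 0.5 * math.sqrt(3.0 / 5.0)
POINTS = (0.5 - _h, 0.5, 0.5 + _h)
WEIGHTS = (5.0 / 18.0, 4.0 / 9.0, 5.0 / 18.0)

# C[j-1][r] = w_r N_j(sigma_r) for j = 1..4: the same table serves every interval
C = [[w * cubic_bspline(j, s) for w, s in zip(WEIGHTS, POINTS)] for j in range(1, 5)]

def interval_row(kernel, sigma, m):
    """Coefficients multiplying I_{m+1}, ..., I_{m+4} for the integral over (m, m+1)."""
    return [sum(C[j][r] * kernel(sigma, m + POINTS[r]) for r in range(3))
            for j in range(4)]

# With a unit kernel the coefficients reduce to the integrals of N_j over (0, 1)
row = interval_row(lambda sigma, sigma_prime: 1.0, 0.0, 5)
```

With a unit kernel the four coefficients are 1/24, 11/24, 11/24, and 1/24, which the three-point rule reproduces exactly because the integrands are cubics. Assembling the full matrix then amounts to calling `interval_row` for each matching point and each interval and accumulating the entries into the columns $m+1, \ldots, m+4$.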

Exercises

11. A linearly polarized plane wave with its electric vector parallel to the $z$ axis is incident on a circular antenna of radius $b$ in the plane $y = 0$. The direction of propagation of the incident wave makes an angle $\chi$ with the $x$ axis and the origin is at the centre of the antenna. If $kb \ll 1$, show that the current in the wire is approximately the same at all points of the wire and decreases steadily from its value at $\chi = 0$ to zero at $\chi = \tfrac12\pi$. By considering the cases $\lambda/4a = 500$, 1000, and 2000 with $kb$ round about 1, demonstrate that the greatest current at $(b, 0, 0)$ occurs when the length of the loop is about 4 per cent greater than the wavelength.

12. Calculate the radiation pattern in $z = 0$ and $y = 0$ of the scattered field for the cases of Exercise 11.

13. The incident field of Exercise 11 with $\chi = 0$ falls on a helical antenna wound on a cylinder of radius $b$ with pitch angle $\beta$, the axis of the helix being the $z$ axis. If the helix has only three or four turns examine the radiation pattern of the scattered field and decide whether it is elliptically polarized in general. Would it be circularly polarized if $kb = 2\tan\beta$?

14. If in Exercise 13, $kb$ is about 1 and the direction of propagation of the incident wave is not perpendicular to the axis of the helix, find the current induced at the central point of the antenna. For what range of lengths of wire ($\beta$ being fixed) is the current a maximum when the direction of propagation is along the axis of the helix?

6.9 Log-periodic antennas

The log-periodic antenna was introduced as a mechanism for producing frequency-independent patterns over moderate bandwidths (Rumsey 1957; DuHamel and Isbell 1957). The aim is to provide a structure whose electrical properties vary periodically with the logarithm of the frequency. How this is achieved may be seen by approaching first the problem of the log-periodic transmission line. Suppose that it is asked that the voltage and current at the position $x$ and complex frequency $\omega$ shall be equal to those at the position $\tau x$ and frequency $\omega/\tau$, where the real constant $\tau$ is known as the log-periodic expansion parameter. Then

$$V(x, \omega) = V(x', \omega'), \qquad I(x, \omega) = I(x', \omega') \tag{6.71}$$

where $x' = \tau x$, $\omega' = \omega/\tau$. Now

$$\frac{d}{dx}V(x, \omega) + Z(x, \omega)I(x, \omega) = 0, \qquad \frac{d}{dx}I(x, \omega) + Y(x, \omega)V(x, \omega) = 0,$$

$$\frac{d}{dx'}V(x', \omega') + Z(x', \omega')I(x', \omega') = 0, \qquad \frac{d}{dx'}I(x', \omega') + Y(x', \omega')V(x', \omega') = 0$$

where $Z$ and $Y$ are the series impedance and shunt admittance respectively. Sources could be included but are omitted to simplify the analysis. Dividing the first set of equations by $\tau$ and using (6.71) we deduce that

$$Z(x, \omega) = \tau Z(x', \omega'), \qquad Y(x, \omega) = \tau Y(x', \omega').$$

These are the conditions to be satisfied if the transmission line is to be log-periodic. Put $x = \tau^p$, $\omega = \tau^q$. Then the equation for $Z$ becomes

$$Z(\tau^p, \tau^q) = \tau Z(\tau^{p+1}, \tau^{q-1}).$$


The general solution of this difference equation is

$$Z(\tau^p, \tau^q) = \tau^{-p}F_1(p, q)\,G_1(p + q)$$

where $F_1$ is periodic with period 1 in both its arguments and $G_1$ is an arbitrary function. Hence

$$Z(x, \omega) = \frac{F(\ln x, \ln\omega)\,G(\omega x)}{x}$$

where the periods of $F$ are both $\ln\tau$ and $G$ is arbitrary. On account of (6.71), the structure of $V$ will be similar without the denominator $x$; the actual functions occurring may, however, be different. If (6.71) holds for all $\tau$, then $F \equiv 1$ in order to satisfy its periodicity condition. Consequently, if $G$ is independent of frequency a transmission line of frequency-independent structure has been obtained. Notice that (6.71) specifies phase only to within a multiple of $2\pi$ and this flexibility may be important in practical circumstances.

For two-dimensional electromagnetic fields in a medium with permittivity $\varepsilon$, permeability $\mu$, and conductivity $\sigma$ (assumed independent of $\omega$) the analogue of (6.71) is expressed in terms of the polar coordinates $\rho$, $\phi$ when the fields are independent of $z$. For example, (6.72) and a similar equation for $\mathbf{H}$ are conditions for log-periodicity. By operating with Maxwell's equations in the same way as for the transmission line we find

$$\varepsilon = F(\ln\rho, \eta\phi)\,G\{\rho\exp(-\eta\phi)\}$$

with parallel formulae for $\mu$ and $\sigma\rho$. Here $\eta = -(\ln\tau)/\phi_0$ and $F$ and $G$ have the same properties as in the formula for $Z$. Evidently $G$ will be constant on a logarithmic spiral $\rho = C\exp(\eta\phi)$. If $\phi_0 = 0$, the equation for $\varepsilon$ reduces to

$$\varepsilon = F(\ln\rho)\,G(\phi).$$

Thus if the material constants are arranged to spiral logarithmically, a log-periodic structure is produced. In practice, it has to be of finite size and only experiment can confirm that the lack of infinite periodicity does not destroy the desired properties. In fact, antennas of moderate size operate satisfactorily and are usually sufficiently compact for the tapering of the conductivity $\sigma$ with $\rho$ to be ignored. Indeed, the elements are often deformed into straight lines (the spiral is almost straight for angular deviations which are not too large) especially when $\tau$ is not far from unity. For example, the monopole and zigzag are exhibited in Figs. 6.10 and 6.11, where the relative distance from the origin of the points marked is also displayed (Isbell 1960; Greiser and Mayes 1961). The sought-for frequency independence is attained and some useful polarization effects are also available as a bonus.

Fig. 6.10. A log-periodic monopole.

Fig. 6.11. A log-periodic zigzag.
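The log-periodicity condition $Z(x, \omega) = \tau Z(\tau x, \omega/\tau)$ for the transmission line, and the general solution $Z = F(\ln x, \ln\omega)G(\omega x)/x$ derived earlier in this section, can be checked numerically. The Python sketch below is illustrative only: the particular $F$ (a product of trigonometric factors with period $\ln\tau$) and the particular $G$ are arbitrary choices made here, and the function name is hypothetical.

```python
import math

def make_log_periodic_impedance(tau, G):
    """Return Z(x, omega) = F(ln x, ln omega) G(omega x) / x for a sample periodic F."""
    period = math.log(tau)          # F must have period ln(tau) in both arguments

    def F(u, v):
        # Arbitrary illustrative choice; any function with these periods would do
        return (2.0 + math.cos(2.0 * math.pi * u / period)) * \
               (1.5 + math.sin(2.0 * math.pi * v / period))

    def Z(x, omega):
        return F(math.log(x), math.log(omega)) * G(omega * x) / x

    return Z

tau = 0.8
Z = make_log_periodic_impedance(tau, lambda u: 1.0 / (1.0 + u * u))

# Compare Z(x, omega) with tau * Z(tau x, omega / tau) at a few sample points
samples = [(0.3, 2.0), (1.7, 0.4), (5.0, 11.0)]
mismatch = max(abs(Z(x, w) - tau * Z(tau * x, w / tau)) / abs(Z(x, w))
               for x, w in samples)
```

The mismatch is at the level of rounding error, since scaling $x$ by $\tau$ and $\omega$ by $1/\tau$ leaves $\omega x$ unchanged and shifts each argument of $F$ by one period. The same check applied to a non-periodic $F$ would fail, which is a convenient way of seeing why the periodicity is essential.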

Exercises 15. In three dimensions the relation corresponding to (6.72) is E(R, (), ljJ, w) = tE(tR, 8, ljJ -

in spherical polar coordinates R, 8,
6.10 Loads and arrays

Once a satisfactory technique has been established for a single wire there is, in principle, no quandary about what to do for arrays. Simultaneous integral equations arise, but since the numerical procedure replaces them by linear algebraic equations there is no fresh feature. Admittedly, the size of the matrices soon becomes huge as the number of elements shoots up. Nevertheless, with a sound numerical technique and taking what advantage one can of any sparseness, the main limitation will be the size of the computer. Further remarks about junctions are contained in the next section.

No awkwardness should be encountered when an antenna is loaded provided that the load may be treated as a lumped circuit. Then the additional equation $V = Z_l I$ has to be satisfied at the load, $V$ being an appropriately defined voltage. Such an equation can be incorporated by following a similar route to that described in the second method for input impedance in §6.7.

Exercises

18. Compare your predictions for the radiation pattern of a Yagi array with the experimental results given by Fishenden and Wiblin (1949).

19. Find the radar cross-section for two concentric coplanar circular wires and compare your results with the measurements of Gans (1965).

20. Repeat Exercise 17 for the monopole of Fig. 6.10 and compare the radar cross-sections for the two antennas.

21. Repeat Exercise 20 when the side elements are swept down at an angle of 60° to the central radius in herring-bone fashion.

22. Two parallel wires of lengths $0.5\lambda$ and 4l are placed so that a wire joining their centres is perpendicular to both. Examine the radar cross-section in the central plane perpendicular to the parallel wires and show that for certain separations the main scattering occurs in the direction from the larger to smaller element. In effect, the wave travelling along the central wire is reflected by the longer wire in such a way as to reinforce the radiation for the shorter at some angles. The structure is often known as a back-fire antenna.

23. Six circular wires are arranged so that their planes are parallel and their centres lie on a straight line. The dimensions are such that the antenna is log-periodic. If $\tau = 0.8$ find the radar cross-section for a plane wave incident along the line of centres as a function of the diameter of the smallest circle.

SOLID ANTENNAS

6.11 Wire grid models

It is well known experimentally that a wire grid is equivalent to a conducting sheet provided that the mesh spacing is small compared with the wavelength. On this basis the idea that a solid perfectly conducting antenna might be modelled by replacing its surface by an equivalent wire grid was proposed (Richmond 1966). Only the integral equations for thin wires need to be solved. Typical possibilities for a sphere plus monopole and an aircraft are shown in Figs. 6.12 and 6.13. Owing to the simplicity of this approach and its malleability in adapting to complex structures, it has been widely used. At least two computer programs are to hand for its implementation (Burke and Selden 1973; Forgan 1974). The application of such programs to problems like the sphere of Fig. 6.12 yields promising results, especially when collated with experimental measurements (Albertsen et al. 1972) (see Fig. 6.14). Other elementary shapes such as discs and squares also offer encouragement. However, no theory has been developed to back up the method, so that assessment of how to select the wire radius and positions of the grid wires rests on numerical experimentation and comparison with laboratory measurements. Unfortunately, this leaves open the contingency that in a new situation the numerical predictions cannot be relied on until proved in the laboratory, in which case it may be more cost effective not to undertake the computation. It is therefore worthwhile indicating some of the points where experience has shown that particular care is called for.

Fig. 6.12. Wire-grid model for a sphere with monopole (monopole of five segments).

Fig. 6.13. Wire-grid model of an aircraft.

The programs are based on the trigonometric expansions of §6.7. Generally speaking, these give satisfactory results when segments of equal length are employed but their performance when there are junctions between several segments of different length has room for considerable improvement. In this respect, the cubic B-splines of §6.8 with their automatic conversion to equispacing should be superior. One way of overcoming the difficulty is always to use segments of equal length, but this may involve many more segments than are necessary to model the antenna. An irregular structure may need very short segments in some directions to specify its outline but long segments may be perfectly adequate in other directions. Such variations may, of course, entail the use of higher-order quadrature formulae such as the Gauss eight-point rule.

Another problem is the choice of wire radius. A close mesh of very thin wire should be ideal as a prototype of the body but it incurs a vast computational effort. It therefore seems preferable to make the radius as large as is consistent with the radius being less than one-hundredth of a wavelength and adjacent wire segments not overlapping. This may force different radii in different parts of the grid which can cause difficulty at junctions because of the introduction of capacitance thereby. The way round this in the RAE program (Forgan 1974) is to taper the wires near a junction so that there is a smooth transition; the wire radius in the integrals is then a function of position.

Again, with large systems of segments small errors may easily accumulate so that it is wise to prevent them doing so if possible. For instance, the radiated power should be calculated by integration of the radiation pattern because this procedure is relatively insensitive to small errors.

Fig. 6.14. Radiation pattern of a tA. sphere plus monopole: solid curve, measured (Lyngby); chain curve, computed (Lyngby); broken curve, computed (AMP); dotted curve, computed (RAE).

Exercises

24. A plane wave is incident along the axis of a circular sheet of perfectly conducting metal. Find the radar cross-section as the radius varies and compare your results with those of Richmond (1966).

25. A square antenna contains a central rectangular slot parallel to the sides. Find the radar cross-section for a plane wave incident perpendicular to the plane of the antenna and linearly polarized (a) parallel to the longer side of the rectangle, (b) in the perpendicular direction (see also Miller and Morton 1970).

26. Determine the radiation pattern for Fig. 6.12 when the length of the monopole is tA. (compare also Tesche and Neureuther 1970).

27. Find the radar cross-section of a spheroid for incidence along a principal axis and check against the results of Oshiro et al. (1966).

28. Model a cone with a flat base by radial wires on the base and their continuations along generators of the slant sides. Determine the scattered pattern for a plane wave incident along the axis towards (a) the apex, (b) the base.

29. A quarter-wave monopole is added to the base of the cone of Exercise 28 and is along the axis away from the apex. It is fed by a small magnetic frill surrounding it in the base. Calculate the radiation pattern produced by this transmitting antenna (Thiele et al. 1969).

30. Model a paraboloidal antenna by a wire grid and calculate its transmitting characteristics for excitation from a small dipole at its focus (Poggio and Miller 1970).

31. A cone-sphere antenna is manufactured by placing a hemisphere on the base of a cone so that the perimeters of the base and hemisphere coincide. Compute the scattering characteristics for a wave incident along the axis (Mautz and Harrington 1969).

6.12 The electric-field integral equation

The derivation of integral equations for antennas of arbitrary shape commences from a standard representation for electromagnetic fields in terms of surface integrals. Let $S$ be a closed surface which can be the boundary of the antenna but may be some other conveniently disposed surface. The infinite region outside $S$ will be denoted by $S_+$ while the bounded interior will be called $S_-$ (Fig. 6.15). Points in $S_+$ or $S_-$ will be distinguished by capital letters such as $P$ and $Q$; moreover, the notation will be simplified so that, for example, $\psi(P, Q) = \psi(\mathbf{x}_P, \mathbf{x}_Q)$ where the right-hand side is defined by (6.2). Lower-case letters such as $p$ and $q$ will signify points of $S$. The radiation conditions to be imposed on scattered fields, assuming the medium outside $S$ to have constant $\mu$ and $\varepsilon$, are

$$R\mathbf{E},\ R\mathbf{H}\ \text{bounded}, \qquad R(\mathbf{E} - Z\mathbf{H}\wedge\hat{\mathbf{R}}) \to 0, \qquad R(\mathbf{H} - \hat{\mathbf{R}}\wedge\mathbf{E}/Z) \to 0 \tag{6.73}$$

as $R \to \infty$, $\hat{\mathbf{R}}$ being a unit vector along the radius from the origin and $Z = (\mu/\varepsilon)^{1/2}$ the impedance of the medium. The problem for which a solution is sought is to satisfy Maxwell's equations in $S_+$, the radiation conditions (6.73) at infinity, and to have a prescribed tangential electric intensity on $S$, i.e. if $\mathbf{n}_p$ is a unit normal from the point $p$ of $S$ into $S_+$ it is required that

$$\mathbf{n}_p\wedge\mathbf{E}(p) = \mathbf{n}_p\wedge\mathbf{E}_0(p) \tag{6.74}$$

on $S$ where $\mathbf{n}\wedge\mathbf{E}_0$ is a known field. The field $\mathbf{E}_0$ might originate from sources inside $S$, as when $S$ encloses a waveguide with radiating slots. In that case (6.74) is effectively the condition for a transmitting antenna. On the other hand, $\mathbf{E}_0$ might be due to a signal whose origins lie outside $S$. Then (6.74) corresponds to $S$ being perfectly conducting and $\mathbf{E}$ is the field scattered by such a receiving antenna. It may be desirable to specify alternative boundary conditions, e.g. of impedance type, on $S$ but that will not alter the essence of the principles involved and so (6.74) will be concentrated on. Observe that the field $\mathbf{E}$, $\mathbf{H}$ does not have any sources in $S_+$; they must all be on $S$ or in $S_-$.

Fig. 6.15. The antenna configuration.

Having formulated the fundamental antenna or exterior problem, one has to consider what techniques are to be adopted to solve it. First the approach will be that of integral equations solely and some of the considerations in solving them, including finite-element methods. For a method based on an infinite system of state-space differential equations see Hizal (1974). The basic field representation can be written, on the assumption that $\mathbf{E}$ and $\mathbf{H}$ satisfy the radiation conditions (6.73) and Maxwell's equations

$$\operatorname{curl}\mathbf{E} + ikZ\mathbf{H} = 0, \qquad \operatorname{curl}Z\mathbf{H} - ik\mathbf{E} = 0, \qquad \operatorname{div}\mathbf{E} = 0, \qquad \operatorname{div}\mathbf{H} = 0$$

as (see, for example, Jones 1986)

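$$\int_S\left[\{\mathbf{n}_q\wedge\mathbf{E}(q)\}\wedge\operatorname{grad}_q\psi(P, q) + \{\mathbf{n}_q\cdot\mathbf{E}(q)\}\operatorname{grad}_q\psi(P, q) - i\omega\mu\{\mathbf{n}_q\wedge\mathbf{H}(q)\}\psi(P, q)\right]dS_q = \begin{cases}\mathbf{E}(P) & (P \in S_+),\\ 0 & (P \in S_-),\end{cases} \tag{6.75}$$

$$\int_S\left[\{\mathbf{n}_q\wedge\mathbf{H}(q)\}\wedge\operatorname{grad}_q\psi(P, q) + \{\mathbf{n}_q\cdot\mathbf{H}(q)\}\operatorname{grad}_q\psi(P, q) + i\omega\varepsilon\{\mathbf{n}_q\wedge\mathbf{E}(q)\}\psi(P, q)\right]dS_q = \begin{cases}\mathbf{H}(P) & (P \in S_+),\\ 0 & (P \in S_-)\end{cases} \tag{6.76}$$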

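where, now, $k^2 = \omega^2\mu\varepsilon$ and the time dependence $e^{i\omega t}$ has been suppressed. The symbol $\mathbf{n}_q$ indicates the unit normal at $q$ of $S$ into $S_+$ and the subscript on a vector operator shows the variable to which it is applied, i.e.

$$\operatorname{grad}_q\psi(P, q) = \left(\mathbf{i}\frac{\partial}{\partial x_q} + \mathbf{j}\frac{\partial}{\partial y_q} + \mathbf{k}\frac{\partial}{\partial z_q}\right)\psi(P, q)$$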

where $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are unit vectors along the Cartesian axes. The integrals in (6.75) and (6.76) contain the normal components $\mathbf{n}\cdot\mathbf{E}$ and $\mathbf{n}\cdot\mathbf{H}$, but the boundary condition (6.74) involves only tangential components. It is therefore advantageous to express the normal components in terms of tangential ones. This may be accomplished by introducing the surface divergence of a tangential vector $\mathbf{u}$ on $S$. A number of results about operators on surfaces is collected in the appendix to this chapter. References to equations there are distinguished by the letter A. The proofs in the appendix differ from the following.

Fig. 6.16. The calculation of surface divergence.

Draw a small curve $C$ on $S$ and let $\boldsymbol{\nu}$ be the unit vector which is perpendicular to both $\mathbf{n}$ and $C$, as well as being out of the portion of $S$ containing $\mathbf{n}$ (Fig. 6.16). If the area of this portion of $S$ enclosed by $C$ is $A$, the surface divergence $\operatorname{Div}\mathbf{u}$ of the tangential vector $\mathbf{u}$ is defined by

$$\operatorname{Div}\mathbf{u} = \lim\frac{1}{A}\oint_C\mathbf{u}\cdot\boldsymbol{\nu}\,ds$$

where the integration is around C and the limit is taken as C contracts to the point where n is drawn. If C is arbitrary it follows, by covering the enclosed part of S by a grid of curves each surrounding a small area, that

$$\int_{S_C} \operatorname{Div}\mathbf{u}\, dS = \oint_C \mathbf{u}\cdot\mathbf{v}\, ds \tag{6.77}$$

where Sc is the portion of S with C as the rim out of which v is pointing. If S is closed then C can be allowed to contract to a point in such a way that Sc becomes S. Hence

$$\int_S \operatorname{Div}\mathbf{u}\, dS = 0 \tag{6.78}$$
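The relations (6.77) and (6.78) can be checked numerically on a simple closed surface. The sketch below is only an illustration under stated assumptions: the surface is the unit sphere and the tangential field is the hypothetical choice $\mathbf{u} = \sin\theta\,\hat{\boldsymbol{\theta}}$, for which the surface divergence in spherical polar co-ordinates is $2\cos\theta$. The function names are invented for this example.

```python
import math

def sphere_divergence_integral(n_theta=400):
    """Midpoint-rule estimate of the integral of Div u over the unit sphere.

    Assumed field: u = sin(theta) * theta_hat, so that
    Div u = (1/sin(theta)) d/dtheta (sin(theta) * sin(theta)) = 2 cos(theta).
    Nothing depends on phi, so the phi integral contributes a factor 2*pi.
    """
    h = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        total += 2.0 * math.cos(theta) * math.sin(theta) * h  # Div u * sin(theta) dtheta
    return 2.0 * math.pi * total

def polar_cap_check(theta0, n_theta=400):
    """Return both sides of (6.77) for the cap 0 <= theta <= theta0."""
    h = theta0 / n_theta
    lhs = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        lhs += 2.0 * math.cos(theta) * math.sin(theta) * h
    lhs *= 2.0 * math.pi
    # On the rim v = theta_hat, so u.v = sin(theta0); the rim has length 2*pi*sin(theta0).
    rhs = 2.0 * math.pi * math.sin(theta0) ** 2
    return lhs, rhs

if __name__ == "__main__":
    print(sphere_divergence_integral())   # close to zero, as in (6.78)
    print(polar_cap_check(1.0))           # the two values agree, as in (6.77)
```

As the cap angle approaches $\pi$ the rim contracts to a point and the line integral vanishes, which is the limiting argument used to pass from (6.77) to (6.78).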

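when S is closed. The formulae (6.77) and (6.78) are in agreement with (A.17) and (A.18). The element $d\mathbf{s}$ of arc of C satisfies $d\mathbf{s} = \mathbf{n} \wedge \mathbf{v}\, ds$. Hence

$$\operatorname{Div}(\mathbf{n} \wedge \mathbf{E}) = \lim \frac{1}{A}\oint_C \mathbf{n} \wedge \mathbf{E}\cdot\mathbf{v}\, ds = -\lim \frac{1}{A}\oint_C \mathbf{E}\cdot d\mathbf{s} = -\lim \frac{1}{A}\int_{S_C} (\mathbf{n}\cdot\operatorname{curl}\mathbf{E})\, dS$$

by Stokes's theorem.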

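By Maxwell's equations, $-i\omega\mu\mathbf{H}$ may be substituted for curl E. The integrand may be regarded as constant if C is sufficiently small and hence

$$\operatorname{Div}(\mathbf{n} \wedge \mathbf{E}) = i\omega\mu\,\mathbf{n}\cdot\mathbf{H}. \tag{6.79}$$

It may be shown similarly that

$$\operatorname{Div}(\mathbf{n} \wedge \mathbf{H}) = -i\omega\varepsilon\,\mathbf{n}\cdot\mathbf{E}. \tag{6.80}$$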

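These results are implied also by (A.25). The surface divergence requires only derivatives along the surface and so (6.79) and (6.80) permit the determination of normal components from tangential ones. Write

$$\mathbf{n} \wedge \mathbf{H} = -\mathbf{j}, \qquad -\varepsilon\,\mathbf{n}\cdot\mathbf{E} = \rho. \tag{6.81}$$

Then, from (6.80),

$$\operatorname{Div}\mathbf{j} + i\omega\rho = 0. \tag{6.82}$$

The vector j may be thought of as a surface electric current while ρ is a surface electric charge. Thus (6.82) is a continuity equation on the surface. Surface magnetic currents and charges may also be introduced via

$$\mathbf{n} \wedge \mathbf{E} = \mathbf{j}_m, \qquad -\mu\,\mathbf{n}\cdot\mathbf{H} = \rho_m \tag{6.83}$$

with the surface continuity equation

$$\operatorname{Div}\mathbf{j}_m + i\omega\rho_m = 0 \tag{6.84}$$

from (6.79). For our immediate purpose, the substitution (6.83) will not be made since $\mathbf{n} \wedge \mathbf{E}$ is known on S from (6.74). Therefore the following electromagnetic field will be considered:

$$\mathbf{E}(P) = \int_S \Big[\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(P, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(P, q) + i\omega\mu\,\mathbf{j}(q)\psi(P, q)\Big]\, dS_q, \tag{6.85}$$

$$\mathbf{H}(P) = \int_S \Big[-\mathbf{j}(q) \wedge \operatorname{grad}_q \psi(P, q) + \frac{\operatorname{Div}\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\}}{i\omega\mu}\operatorname{grad}_q \psi(P, q) + i\omega\varepsilon\,\mathbf{n}_q \wedge \mathbf{E}_0(q)\psi(P, q)\Big]\, dS_q. \tag{6.86}$$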

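In (6.85) and (6.86) it is assumed that (6.82) is valid so that ρ is known once j has been specified. Thus the integrals of (6.85) and (6.86) involve only the single unknown tangential surface current j. No matter how j is chosen, so long as (6.82) is satisfied, (6.85) and (6.86) are solutions of Maxwell's equations in $S_+$. The verification is straightforward so that only the hardest step need be considered in detail. Put $\mathbf{n}_q \wedge \mathbf{E}_0(q) = \mathbf{a}$. Then

$$\operatorname{curl}_P \int_S \mathbf{a} \wedge \operatorname{grad}_q \psi(P, q)\, dS_q = \int_S \{\mathbf{a}\operatorname{div}_P \operatorname{grad}_q \psi(P, q) - (\mathbf{a}\cdot\operatorname{grad}_P)\operatorname{grad}_q \psi(P, q)\}\, dS_q = \int_S \{(\mathbf{a}\cdot\operatorname{grad}_q)\operatorname{grad}_q \psi(P, q) - \mathbf{a}\nabla_q^2\psi(P, q)\}\, dS_q$$

because ψ depends only on $|\mathbf{x}_P - \mathbf{x}_q|$. Since ψ satisfies Helmholtz's equation in $S_+$ the last term reproduces the final term of (6.86). Further

$$\int_S (\mathbf{a}\cdot\operatorname{grad}_q)\operatorname{grad}_q \psi(P, q)\, dS_q = -\operatorname{grad}_P \int_S \mathbf{a}\cdot\operatorname{grad}_q \psi\, dS_q = -\operatorname{grad}_P \int_S \{\operatorname{Div}(\mathbf{a}\psi) - \psi\operatorname{Div}\mathbf{a}\}\, dS_q = -\int_S \operatorname{Div}\mathbf{a}\,\operatorname{grad}_q \psi(P, q)\, dS_q \tag{6.87}$$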

from (6.78) because $\psi\mathbf{a}$ is a tangential vector. The second term of (6.86) has consequently been recovered. A similar procedure is necessary in evaluating curl H but no other terms give any difficulty.

It will be remarked that the argument applies equally if P is in $S_-$, i.e. (6.85) and (6.86) are also solutions of Maxwell's equations inside S. In contrast to (6.75) and (6.76), it cannot be guaranteed that a free choice of j will lead to an identically zero field in $S_-$. Of course, if the right choice is made it is expected that the field will be zero in the interior.

Again, (6.85) and (6.86) comply with the radiation conditions (6.73) for arbitrary choice of j. This may be confirmed by incorporating the asymptotic behaviour of ψ as $P \to \infty$ (which is permissible because S is bounded) and by noting that

$$\int_S (\operatorname{Div}\mathbf{j} + ik\,\mathbf{j}\cdot\hat{\mathbf{x}}_P)\exp(ik\,\mathbf{x}_q\cdot\hat{\mathbf{x}}_P)\, dS_q = \int_S \operatorname{Div}\{\mathbf{j}\exp(ik\,\mathbf{x}_q\cdot\hat{\mathbf{x}}_P)\}\, dS_q = 0$$

from (6.78) since S is closed. Consequently, the representation (6.85) and (6.86) satisfies all the conditions which have been demanded of the solution with the sole exception of (6.74). The next aim is to rectify this deficiency by letting P tend to a point of S. Some care has to be exercised in this process because ψ becomes singular as P approaches S. While a rigorous analysis can be undertaken (Kellogg 1929) we shall be content with a plausible investigation which gives the same results. For more extensive properties see Colton and Kress (1983).

Fig. 6.17. Determination of the discontinuities of surface integrals.

The singularity of ψ(p, q) is integrable so that, if f is reasonably continuous on S,

$$\lim_{P \to p} \int_S f(q)\psi(P, q)\, dS_q = \int_S f(q)\psi(p, q)\, dS_q. \tag{6.88}$$

Integrals with derivatives of ψ, however, display discontinuities. Split S into two sections, $S_1$ and $S_2$, of which $S_2$ is a small region surrounding p and $S_1$ the remainder of S (Fig. 6.17). If S has a continuously turning tangent plane on a neighbourhood of p it may be supposed that $S_2$ is effectively a circular disc of radius δ with centre p. The z axis may also be taken along the outward normal at p. An integral over $S_1$ is continuous as $P \to p$ and causes no trouble. In the limit as $\delta \to 0$ it will lead to a principal value. With regard to $S_2$, assume firstly that P is on the z axis, a small distance z from p. Then the phase of ψ can be ignored and

$$\int_{S_2} f(q)\psi(P, q)\, dS_q \approx \frac{f(p)}{4\pi}\int_0^{2\pi}\!\!\int_0^{\delta} \frac{t\, dt\, d\phi}{(t^2 + z^2)^{1/2}} = \tfrac{1}{2}f(p)\{(\delta^2 + z^2)^{1/2} - z\}. \tag{6.89}$$

As $z \to 0$ this may be approximated by $\tfrac{1}{2}f(p)\{\delta + \tfrac{1}{2}z^2/\delta - z\}$. P may now be moved a small distance r off the z axis so long as a solution of Laplace's equation is retained. Thus, in general

A derivative with respect to z along the normal gives a non-zero value as $P \to p$, whereas tangential derivatives supply no contribution in the limit. Hence

$$-\lim_{P \to p} \operatorname{grad}_P \int_S f(q)\psi(P, q)\, dS_q = \lim_{P \to p} \int_S f(q)\operatorname{grad}_q \psi(P, q)\, dS_q = \tfrac{1}{2}f(p)\mathbf{n}_p + \fint_S f(q)\operatorname{grad}_q \psi(p, q)\, dS_q \tag{6.90}$$

where the bar on the integral sign signifies a principal value. If P is inside S, at $P_i$ say, the only difference is that $-z$ in (6.89) is changed to $z$. Therefore

$$\lim_{P_i \to p} \int_S f(q)\operatorname{grad}_q \psi(P_i, q)\, dS_q = -\tfrac{1}{2}f(p)\mathbf{n}_p + \fint_S f(q)\operatorname{grad}_q \psi(p, q)\, dS_q. \tag{6.91}$$
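The origin of the terms $\pm\tfrac{1}{2}f(p)\mathbf{n}_p$ can be illustrated with the disc integral of (6.89). The following sketch assumes $f = 1$ and the static kernel $1/(4\pi R)$ (the phase of ψ being ignored, as in the text), compares the closed form with a direct quadrature, and evaluates the normal derivative on either side of the disc. The function names are illustrative only.

```python
import math

def disc_closed_form(z, delta):
    """(1/(4*pi)) * integral over a disc of radius delta of dS / sqrt(t^2 + z^2).

    |z| is used so that the same formula serves on both sides of the disc.
    """
    return 0.5 * (math.sqrt(delta ** 2 + z ** 2) - abs(z))

def disc_quadrature(z, delta, n=20000):
    """Midpoint rule in the radial variable t; the angle integrates to 2*pi."""
    h = delta / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t / math.sqrt(t * t + z * z) * h
    return total * 2.0 * math.pi / (4.0 * math.pi)

def normal_derivative(z, delta):
    """d/dz of the closed form, valid for z != 0."""
    sign = 1.0 if z > 0 else -1.0
    return 0.5 * (z / math.sqrt(delta ** 2 + z ** 2) - sign)

if __name__ == "__main__":
    print(disc_closed_form(0.05, 0.1), disc_quadrature(0.05, 0.1))
    print(normal_derivative(1e-6, 0.1), normal_derivative(-1e-6, 0.1))
```

Just outside the disc the z derivative tends to $-\tfrac{1}{2}$ and just inside to $+\tfrac{1}{2}$, so that minus the gradient with respect to P contributes $+\tfrac{1}{2}\mathbf{n}_p$ in (6.90) and $-\tfrac{1}{2}\mathbf{n}_p$ in (6.91).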

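From (6.88) and (6.90) it follows that

$$\lim_{P \to p} \mathbf{E}(P) = \tfrac{1}{2}\{\mathbf{n}_p \wedge \mathbf{E}_0(p)\} \wedge \mathbf{n}_p - \frac{1}{2}\frac{\rho(p)}{\varepsilon}\mathbf{n}_p + \fint_S \Big[\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(p, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(p, q) + i\omega\mu\,\mathbf{j}(q)\psi(p, q)\Big]\, dS_q, \tag{6.92}$$

$$\lim_{P \to p} \mathbf{H}(P) = -\tfrac{1}{2}\mathbf{j}(p) \wedge \mathbf{n}_p + \frac{1}{2}\frac{\operatorname{Div}\{\mathbf{n}_p \wedge \mathbf{E}_0(p)\}}{i\omega\mu}\mathbf{n}_p + \fint_S \Big[-\mathbf{j}(q) \wedge \operatorname{grad}_q \psi(p, q) + \frac{\operatorname{Div}\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\}}{i\omega\mu}\operatorname{grad}_q \psi(p, q) + i\omega\varepsilon\,\mathbf{n}_q \wedge \mathbf{E}_0(q)\psi(p, q)\Big]\, dS_q. \tag{6.93}$$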

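Now apply the boundary conditions (6.74) to (6.92). Then

$$\mathbf{n}_p \wedge \fint_S \Big\{i\omega\mu\,\mathbf{j}(q)\psi(p, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(p, q)\Big\}\, dS_q = \tfrac{1}{2}\mathbf{n}_p \wedge \mathbf{E}_0(p) - \mathbf{n}_p \wedge \fint_S \{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(p, q)\, dS_q. \tag{6.94}$$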
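The step from (6.92) to (6.94) relies on the vector identity $\mathbf{n} \wedge [\{\mathbf{n} \wedge \mathbf{E}_0\} \wedge \mathbf{n}] = \mathbf{n} \wedge \mathbf{E}_0$ for a unit vector n, together with the fact that the term proportional to $\mathbf{n}_p$ disappears under the vector product with $\mathbf{n}_p$. A quick numerical spot check of the identity, with randomly chosen vectors (the helper names are invented for this sketch), might read:

```python
import math
import random

def cross(a, b):
    """Vector product a ^ b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def random_unit_vector(rng):
    """Normalised Gaussian sample, i.e. a random direction."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-6:
            return tuple(c / norm for c in v)

def max_identity_error(trials=200, seed=1):
    """Largest component-wise error in n ^ ((n ^ E0) ^ n) = n ^ E0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        n = random_unit_vector(rng)
        e0 = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        lhs = cross(n, cross(cross(n, e0), n))
        rhs = cross(n, e0)
        worst = max(worst, max(abs(l - r) for l, r in zip(lhs, rhs)))
    return worst

if __name__ == "__main__":
    print(max_identity_error())  # rounding error only
```

With this identity the free term $\tfrac{1}{2}\{\mathbf{n}_p \wedge \mathbf{E}_0(p)\} \wedge \mathbf{n}_p$ of (6.92) contributes $\tfrac{1}{2}\mathbf{n}_p \wedge \mathbf{E}_0(p)$, which combines with the boundary value $\mathbf{n}_p \wedge \mathbf{E}_0(p)$ to leave the factor $\tfrac{1}{2}$ on the right-hand side of (6.94).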

The right-hand side of (6.94) is known in principle and so (6.94) constitutes an integral equation to determine the unknown j. It is known as the electric-field integral equation (EFIE). For an elongated antenna it effectively degenerates to one of the forms discussed in connection with wires. The integral equation (6.94) only applies at points where the tangent plane is continuously turning. At edges or conical points the formulae (6.90) and (6.91) are no longer valid and neither is (6.94). However, (6.74) cannot be strictly applied at such points since n is not well defined there. Therefore, although (6.94) can be imposed arbitrarily close to an edge, it must be avoided actually on an edge, an aspect to be watched when employing point matching in numerical techniques.

When $\mathbf{E}_0$ is due to an electromagnetic field arriving from outside S, the integration on the right-hand side of (6.94) can be avoided. The incident field has no sources inside S and so

$$\int_S \Big[\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(P_i, q) + \{\mathbf{n}_q\cdot\mathbf{E}_0(q)\}\operatorname{grad}_q \psi(P_i, q) - i\omega\mu\{\mathbf{n}_q \wedge \mathbf{H}_0(q)\}\psi(P_i, q)\Big]\, dS_q = -\mathbf{E}_0(P_i) \tag{6.95}$$

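from (6.75). Let $P_i \to p$ and then (6.95) becomes, on account of (6.91),

$$\fint_S \Big[i\omega\mu\{\mathbf{n}_q \wedge \mathbf{H}_0(q)\}\psi(p, q) - \{\mathbf{n}_q\cdot\mathbf{E}_0(q)\}\operatorname{grad}_q \psi(p, q)\Big]\, dS_q = \tfrac{1}{2}\mathbf{E}_0(p) + \fint_S \{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(p, q)\, dS_q. \tag{6.96}$$

Multiply (6.96) vectorially by $\mathbf{n}_p$ and add to (6.94). Relabel $\mathbf{j} + \mathbf{n}_q \wedge \mathbf{H}_0$ and $\rho/\varepsilon + \mathbf{n}\cdot\mathbf{E}_0$ as $\mathbf{j}$ and $\rho/\varepsilon$ respectively; no violation of (6.82) occurs by virtue of (6.80). Hence the electric-field integral equation

$$\mathbf{n}_p \wedge \fint_S \Big\{i\omega\mu\,\mathbf{j}(q)\psi(p, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(p, q)\Big\}\, dS_q = \mathbf{n}_p \wedge \mathbf{E}_0(p) \tag{6.97}$$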

is obtained, when the sources of excitation are in $S_+$.

6.13 Uniqueness

Despite the fact that it has been demonstrated that a solution to our problem satisfies (6.94) (or (6.97)) it has not been shown that solving (6.94) necessarily provides the answer to our problem. If (6.94) possesses only a single solution the difficulty is resolved, but if (6.94) has several solutions the identification of the desired one has to be settled. Now, if (6.94) has two solutions, their difference must satisfy

$$\mathbf{n}_p \wedge \fint_S \Big\{i\omega\mu\,\mathbf{j}(q)\psi(p, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(p, q)\Big\}\, dS_q = 0 \tag{6.98}$$

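and the question may be reformulated to ask whether (6.98) holds for any $\mathbf{j} \neq 0$. It will be shown now that there are such solutions. Consider the electromagnetic field $\mathbf{E}^0$, $\mathbf{H}^0$ defined by

$$\mathbf{E}^0(P) = \int_S \Big\{i\omega\mu\,\mathbf{j}(q)\psi(P, q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q \psi(P, q)\Big\}\, dS_q,$$

$$\mathbf{H}^0(P) = -\int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(P, q)\, dS_q.$$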

Owing to (6.90) and (6.98), $\lim_{P \to p} \mathbf{n}_p \wedge \mathbf{E}^0(P) = 0$. Accordingly, $\mathbf{E}^0$, $\mathbf{H}^0$ is an electromagnetic field in $S_+$ which satisfies the radiation conditions and has vanishing tangential components of $\mathbf{E}^0$ on S. By the standard uniqueness theorem (Jones 1986) for radiating fields $\mathbf{E}^0 \equiv 0$ and $\mathbf{H}^0 \equiv 0$ in $S_+$. By a similar argument based on (6.91), we can assert that the tangential component of $\mathbf{E}^0(P_i)$ vanishes as $P_i$ tends to S. It does not follow that $\mathbf{E}^0$ is identically zero in $S_-$ because there exist values of k for which electric modes of oscillation take place inside S which is then acting as a cavity resonator (§3.5). Now from (6.92) and (6.93)

$$\mathbf{E}^0_+(p) - \mathbf{E}^0_-(p) = -\frac{\rho(p)}{\varepsilon}\mathbf{n}_p, \tag{6.99}$$

$$\mathbf{H}^0_+(p) - \mathbf{H}^0_-(p) = -\mathbf{j}(p) \wedge \mathbf{n}_p \tag{6.100}$$

where the subscripts + and - signify the values obtained as S is approached from $S_+$ and $S_-$ respectively. It has been shown already that $\mathbf{E}^0_+(p) = 0$, $\mathbf{H}^0_+(p) = 0$. Therefore, if $\mathbf{E}^0(P_i)$ is identically zero and thereby $\mathbf{H}^0(P_i) \equiv 0$, ρ and j must be identically zero. Consequently, (6.94) has a unique solution unless k has a value at which the interior of S resonates in an electric mode. When k does have a resonant value there is more than one solution of (6.94). For suppose $\mathbf{E}_i$, $\mathbf{H}_i$ is an interior electric mode. Choose $\mathbf{j}(q) = -\mathbf{n}_q \wedge \mathbf{H}_i(q)$ and define ρ from (6.82). Then, by letting $P_i \to p$, we can check the validity of (6.98) and confirm the existence of a non-trivial solution. While (6.94) fails only for isolated values of k the shortcoming can be disastrous for the application of numerical methods. The matrix of the algebraic system which replaces the integral equation must be singular or nearly so when k is in a neighbourhood of a point of non-uniqueness. How far k must be from such a point to eliminate the uncertainty is less easy to say. Figure 6.18 shows the relative error as the wavenumber varies for a sphere in an acoustic field where a similar phenomenon rears its ugly head. Clearly, errors start to make themselves manifest at wavenumbers as little as half the first resonant wavenumber. The situation is worse at the higher wavenumbers because the interior eigenvalues tend to cluster with increasing frequency. It may therefore be concluded that numerical results founded on the EFIE must be viewed with suspicion unless the dimensions of the body do not exceed

Fig. 6.18. The relative error in surface pressure on a vibrating sphere: solid curve, no correction for interior resonance; broken curve, exact theory. (Horizontal axis: wave number times radius, ka.)

half a wavelength when they are probably acceptable provided that errors have not crept in from sources other than interior resonance. The EFIE has one virtue which does not seem to have been exploited to any extent. It is transparent from Fig. 6.18 that it exposes in a distinct fashion where resonance occurs and so it could supply a means of finding the wavenumbers of electric modes of oscillation.

Exercise 32. A perfectly conducting sphere of radius a has its centre at the origin. In terms of the spherical polar co-ordinates R, θ, φ an electromagnetic field is required outside the sphere with

$$(E_0)_\theta = 0, \qquad (E_0)_\phi = h_1^{(2)}(ka)\sin\theta$$

where $h_1^{(2)}$ is the spherical Hankel function defined by $h_1^{(2)}(z) = (\pi/2z)^{1/2} H_{3/2}^{(2)}(z)$. Assume that $j_\theta = 0$ and $j_\phi = C\sin\theta$ where C is a constant. Show that the EFIE (6.94) is satisfied independent of C whenever k satisfies $j_1(ka) = 0$ where $j_1$ is the spherical Bessel function.
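To give a feeling for where the condition $j_1(ka) = 0$ of the exercise places the troublesome wavenumbers, the zeros of the spherical Bessel function $j_1(x) = \sin x/x^2 - \cos x/x$ can be located by bisection. The sketch below is an illustration only; the bracketing intervals $(m\pi, m\pi + \tfrac{1}{2}\pi)$ follow from the equivalent condition $\tan x = x$, and the function names are assumptions of this example.

```python
import math

def spherical_j1(x):
    """Spherical Bessel function of the first kind of order one."""
    return math.sin(x) / x ** 2 - math.cos(x) / x

def bisect(f, a, b, tol=1e-13):
    """Plain bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(200):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0.0:
            b = mid
        else:
            a, fa = mid, fm
        if b - a < tol:
            break
    return 0.5 * (a + b)

def j1_roots(count):
    """First `count` positive zeros of j1, one in each (m*pi, m*pi + pi/2)."""
    return [bisect(spherical_j1, m * math.pi, m * math.pi + 0.5 * math.pi)
            for m in range(1, count + 1)]

if __name__ == "__main__":
    for ka in j1_roots(3):
        print(ka)
```

The first two zeros lie near ka = 4.4934 and ka = 7.7253.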

6.14 The magnetic-field integral equation

The downfall of the EFIE cannot be attributed to posing the original problem incorrectly because that does have a unique solution. The fault must lie with the representation of E. Perhaps that for H is better.

Suppose we let $P \to p$ in (6.86). We are immediately faced with the problem that we have no boundary condition on $\mathbf{n} \wedge \mathbf{H}$ and j is unknown. However, from (6.93) (cf. (6.100))

$$\mathbf{n}_p \wedge \{\mathbf{H}_+(p) - \mathbf{H}_-(p)\} = -\mathbf{j}(p).$$

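It is desirable that the representation should have the same property as (6.76) since that is exact. Therefore j ought to be such that $\mathbf{n}_p \wedge \mathbf{H}_-(p) = 0$. Accepting this condition leads to the integral equation

$$-\tfrac{1}{2}\mathbf{j}(p) + \mathbf{n}_p \wedge \fint_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = \mathbf{n}_p \wedge \fint_S \Big[\frac{\operatorname{Div}\{\mathbf{n}_q \wedge \mathbf{E}_0(q)\}}{i\omega\mu}\operatorname{grad}_q \psi(p, q) + i\omega\varepsilon\,\mathbf{n}_q \wedge \mathbf{E}_0(q)\psi(p, q)\Big]\, dS_q \tag{6.101}$$

which is called the magnetic field integral equation (MFIE). If $\mathbf{E}_0$ is caused by a field incident from outside S the right-hand side of (6.101) may be replaced by $-\mathbf{n}_p \wedge \mathbf{H}_0(p)$ in a similar way to (6.97) provided that $\mathbf{j} + \mathbf{n}_q \wedge \mathbf{H}_0$ is again relabelled as j. Whether (6.101) has a unique solution turns on whether

$$-\tfrac{1}{2}\mathbf{j}(p) + \mathbf{n}_p \wedge \fint_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = 0 \tag{6.102}$$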

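is satisfied by non-trivial j. Define an inner product (§3.2) for tangential vectors j and h by

$$(\mathbf{j}, \mathbf{h}) = \int_S \mathbf{j}\cdot\mathbf{h}^*\, dS_q. \tag{6.103}$$

Then the adjoint (§3.2) of (6.102) is

$$\tfrac{1}{2}\mathbf{h}(p) + \fint_S \{\mathbf{n}_q \wedge \mathbf{h}(q)\} \wedge \operatorname{grad}_q \psi^*(p, q)\, dS_q = 0. \tag{6.104}$$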

By the Fredholm alternative (to be discussed in the following section), whenever (6.102) possesses a non-trivial solution so does (6.104) and vice versa. Therefore construction of a non-trivial solution of (6.104) will imply non-uniqueness of (6.102). Take the complex conjugate of (6.104) and then multiply vectorially by $\mathbf{n}_p$. With $\mathbf{j}_m(p) = \mathbf{n}_p \wedge \mathbf{h}^*(p)$ there results

$$\tfrac{1}{2}\mathbf{j}_m(p) + \mathbf{n}_p \wedge \fint_S \mathbf{j}_m \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = 0. \tag{6.105}$$

Thus to every non-trivial solution of (6.104) corresponds a non-trivial solution of (6.105). Conversely, when (6.105) is satisfied non-trivially so is (6.104) as may be confirmed by defining $\mathbf{h}^* = \fint_S \mathbf{j}_m \wedge \operatorname{grad}_q \psi(p, q)\, dS_q$ whence $\tfrac{1}{2}\mathbf{j}_m = -\mathbf{n} \wedge \mathbf{h}^*$ and the result follows by taking the complex conjugate. Hence, it is sufficient for our purposes to demonstrate the non-uniqueness of (6.105). Consider the electromagnetic field

$$\mathbf{E}'(P) = \int_S \mathbf{j}_m(q) \wedge \operatorname{grad}_q \psi(P, q)\, dS_q, \tag{6.106}$$

$$\mathbf{H}'(P) = \int_S \Big\{i\omega\varepsilon\,\mathbf{j}_m(q)\psi(P, q) - \frac{\rho_m(q)}{\mu}\operatorname{grad}_q \psi(P, q)\Big\}\, dS_q \tag{6.107}$$

where $\rho_m$ is given by (6.84). It satisfies Maxwell's equations in $S_+$, the radiation conditions, and $\mathbf{n} \wedge \mathbf{E}'_+(p) = 0$ on account of (6.90) and (6.105). Hence it is identically zero in $S_+$ and $\mathbf{n} \wedge \mathbf{H}'_+(p) = 0$. An immediate consequence, by virtue of (6.91), is that $\mathbf{n} \wedge \mathbf{H}'_-(p) = 0$. Thus (6.105) can have non-trivial solutions only when there are interior magnetic modes of oscillation. On the other hand, if $\mathbf{E}_m$, $\mathbf{H}_m$ is a magnetic mode of resonance choose $\mathbf{j}_m = \mathbf{n} \wedge \mathbf{E}_m$ and then the application of (6.90) and (6.91) to (6.106) implies (6.105). The deduction is that the uniqueness of (6.101) fails precisely for those values of k for which there are interior modes of magnetic resonance. Thus the MFIE suffers from the same disadvantage as the EFIE, and numerical results based on the MFIE should be accepted only with prudence.

Notwithstanding, if a choice has to be made between the MFIE and the EFIE, the better gambit is the MFIE. One reason is that the EFIE is a singular integral equation of the first kind, whereas the MFIE is one of the second kind, for which the theory has sounder foundations. The second motive for the selection is numerical in origin. Often the algebraic system, substituting for an integral equation, is increased in size by subdividing a mesh on the surface with the aim of improving the accuracy. However, as the size of a mesh element diminishes, the value of the integral over it tends to decrease. Thus the elements of the matrix for the EFIE will be becoming smaller as its order is becoming larger. There will therefore be a tendency for the matrix to become ill conditioned. In contrast, the MFIE contains a j which does not involve integration. Its matrix is therefore likely to exhibit more and more diagonal dominance with steadily decreasing mesh size. Hence, reduction in mesh size should not cause the deterioration of accuracy for the MFIE that would be expected with the EFIE.
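The scaling argument about matrix entries can be illustrated with a deliberately simple one-dimensional model. The sketch below is not a discretization of the EFIE or MFIE: it assumes a hypothetical smooth kernel $1/(1 + |s - t|)$ on the unit interval and a midpoint rule with n panels, and compares a first-kind matrix (integral term only) with a second-kind matrix (the same integral term plus a free term $-\tfrac{1}{2}$ on the diagonal).

```python
def kernel(s, t):
    """Hypothetical bounded model kernel (not the kernel of (6.94) or (6.101))."""
    return 1.0 / (1.0 + abs(s - t))

def first_kind_matrix(n):
    """Midpoint-rule matrix for the integral term alone; entries scale with h = 1/n."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return [[kernel(s, t) * h for t in pts] for s in pts]

def second_kind_matrix(n):
    """Same integral term plus a free term -1/2 on the diagonal."""
    a = first_kind_matrix(n)
    for i in range(n):
        a[i][i] -= 0.5
    return a

def largest_entry(m):
    """Largest absolute value among all entries."""
    return max(abs(x) for row in m for x in row)

def diagonal_to_offdiagonal_ratio(m):
    """Smallest ratio |m_ii| / max_{j != i} |m_ij| over the rows."""
    ratios = []
    for i, row in enumerate(m):
        off = max(abs(x) for j, x in enumerate(row) if j != i)
        ratios.append(abs(row[i]) / off)
    return min(ratios)

if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, largest_entry(first_kind_matrix(n)),
              diagonal_to_offdiagonal_ratio(second_kind_matrix(n)))
```

In this toy model every entry of the first-kind matrix shrinks in proportion to the panel width, whereas the diagonal of the second-kind matrix stays near $-\tfrac{1}{2}$ and increasingly dominates each individual off-diagonal entry. This mirrors the qualitative argument only; it says nothing quantitative about the conditioning of the actual EFIE or MFIE matrices.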
Before concluding this section there is one further implication of the Fredholm alternative that needs checking. This is the assertion that at a resonant value of k, (6.101) possesses a solution if and only if the right-hand side is orthogonal (with respect to the inner product) to every non-trivial solution of (6.104). For the receiving antenna the right-hand side is a multiple of $\mathbf{n} \wedge \mathbf{H}_0$ and the requirement is

$$\int_S \mathbf{n}_q \wedge \mathbf{H}_0(q)\cdot\mathbf{h}^*(q)\, dS_q = 0. \tag{6.108}$$

Now

$$\int_S \mathbf{n}_q \wedge \mathbf{H}_0(q)\cdot\mathbf{h}^*(q)\, dS_q = -\int_S \mathbf{H}_0(q)\cdot\mathbf{j}_m(q)\, dS_q = \int_S \{\mathbf{H}_m(q)\cdot\mathbf{n}_q \wedge \mathbf{E}_0(q) - \mathbf{H}_0(q)\cdot\mathbf{n}_q \wedge \mathbf{E}_m(q)\}\, dS_q$$

since $\mathbf{j}_m$ stems from a magnetic mode of oscillation. The last integral is zero because the divergence theorem can be applied in $S_-$ since neither field has any singularity there. Hence (6.108) is verified for a receiving antenna with no sources inside S.

6.15 The Fredholm alternative

This section is a digression from the theory of antennas because its aim is to establish the Fredholm alternative for operators sufficiently general to cover the needs of the preceding section. The notation of Chapter 3 will be used but since (6.102) and (6.104) are plainly not self-adjoint a generalized version of the theory of that chapter will be necessary. The proof will be restricted to operators defined on Hilbert spaces but, in fact, the results hold for normed linear spaces though the proofs have to be changed (see Colton and Kress 1983).

Let T be a compact operator on a complex Hilbert space H into H. By §3.3 T is both bounded and continuous. Let $N_1$ be the set of x such that $Tx = x$. It is obvious that $N_1$ is a linear subspace of H. Indeed, $N_1$ is of finite dimension and thereby closed (§3.1). For, if $N_1$ contained an infinite basis, an infinite orthonormal set $\{x_n\}$ could be constructed by the Schmidt process (§1.5). Then $Tx_n = x_n$ and so the definition of a compact operator demands that $\{x_n\}$ has a convergent subsequence. However, $\|x_m - x_n\|^2 = 2$ prevents the existence of any such convergent subsequence. Hence $N_1$ is of finite dimension.

Since T is into H, $T^2$ can be defined by $T^2 x = T(Tx)$ and higher powers of T follow in a similar manner. Now if $\{x_j\}$ is a bounded sequence so is $\{Tx_j\}$ since T is bounded. Hence $T(Tx_j)$ has a convergent subsequence because T is compact. Consequently, $T^2$ is compact and, in general, so is $T^n$.

Let $N_m$ be the set of x such that $(T - I)^m x = 0$, I being the identity operator. Since $T^n$ is compact it follows from the result for $N_1$ that $N_m$ is of finite dimension. Clearly, if $x \in N_m$, then $x \in N_{m+1}$ and the question arises whether the dimension of $N_m$ increases without limit as m grows. Suppose there is $x \in N_{m+1}$ which is not in $N_m$. Then $(T - I)^{m-n} x \in N_{n+1}$ for $n \le m$ but does not belong to $N_n$. Therefore, either $N_{n+1}$ and $N_n$ are different spaces for all n or they coincide for some n and do not change for any further increase in n. To show that the first possibility cannot occur choose in each $N_n$ an element $x_n$ which is orthogonal to all members of $N_{n-1}$ and such that $\|x_n\| = 1$. Then, for $n > m$,

$$Tx_n - Tx_m = \{(T - I)x_n - (T - I)x_m - x_m\} + x_n = y + x_n$$

where $y \in N_{n-1}$. Since $x_n$ is orthogonal to y

$$\|Tx_n - Tx_m\|^2 = \|y\|^2 + \|x_n\|^2 \ge 1.$$

Therefore $\{Tx_n\}$ does not possess a convergent subsequence, contradicting T being compact. We deduce that there is a finite integer ν such that $N_\nu = N_{\nu+1} = N_{\nu+2} = \cdots$.

An associated space is $M_n$ which consists of all those y which can be written as $y = (T - I)^n x$ for some $x \in H$, i.e. $M_n$ is the range of the operator $(T - I)^n$. Evidently, $M_n$ is linear and it is, in fact, a closed linear subspace of H. This will be proved for $M_1$ and the general demonstration left to the reader. Let $\{y_m\}$ be a convergent sequence in $M_1$. If $y_m = (T - I)x_m$ put $x_m = u_m + v_m$ where $u_m \in N_1$ and $v_m$ is orthogonal to $N_1$. Then $y_m = (T - I)v_m$. To show that $\|v_m\|$ is bounded, suppose that there is a subsequence on which $\|v_m\| \to \infty$. Then $(T - I)v_m/\|v_m\| = y_m/\|v_m\| \to 0$ since $\{y_m\}$ is convergent and hence bounded. The sequence $\{Tv_m/\|v_m\|\}$ contains a subsequence $\{Tv_{m_j}/\|v_{m_j}\|\}$ which converges. Therefore $\{Tv_{m_j} - (T - I)v_{m_j}\}/\|v_{m_j}\|$ or $\{v_{m_j}/\|v_{m_j}\|\}$ is also convergent. If its limit is v then

$$(T - I)v = \lim_{j \to \infty} \frac{(T - I)v_{m_j}}{\|v_{m_j}\|} = 0$$

i.e. $v \in N_1$. However, $v_{m_j}$ is orthogonal to $N_1$ and so

$$\left\| v - \frac{v_{m_j}}{\|v_{m_j}\|} \right\|^2 = \|v\|^2 + 1$$

which is a contradiction. Hence $\|v_m\|$ is bounded. Consequently, $\{Tv_m\}$ possesses a convergent subsequence $\{Tv_{m_k}\}$. Since $\{y_m\}$ is convergent the sequence $\{Tv_{m_k} - (T - I)v_{m_k}\}$ or $\{v_{m_k}\}$ converges also. Let its limit be $v_0$. Then

$$y_{m_k} = (T - I)v_{m_k} \to (T - I)v_0$$

and so the limit of $\{y_m\}$ is in $M_1$, i.e. $M_1$ is closed.

Obviously an element of $M_{n+1}$ is an element of $M_n$, but the spaces cannot all be different. If they could, pick for each $M_n$ a $y_n$ which is orthogonal to $M_{n+1}$ and of unit norm. A similar sort of argument to that applied to $N_m$ demonstrates that this is impossible. Accordingly, there is a finite integer μ such that $M_\mu = M_{\mu+1} = M_{\mu+2} = \cdots$.

The remarkable fact is that $\mu = \nu$. The common value is known as the Riesz number of T. In order to prove this the spaces are first supplemented by $M_0$, which is to be the same as H, and $N_0$, which contains only the zero element. These spaces are consistent with the earlier definitions of $M_n$ and $N_m$ since $(T - I)^0$ can be regarded as the identity operator.
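In finite dimensions every linear operator is compact, so the chains $N_0 \subset N_1 \subset \cdots$ and $M_0 \supset M_1 \supset \cdots$ can be examined directly with matrices. The sketch below uses a hypothetical 3 × 3 matrix T chosen for illustration, computes the dimensions of $N_m$ and $M_m$ from the rank of $(T - I)^m$ in exact rational arithmetic, and reads off the index at which both chains stop changing.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(m):
    """Rank by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def chain_dimensions(t, max_power):
    """List of (dim N_m, dim M_m) for m = 0, ..., max_power."""
    n = len(t)
    a = [[t[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]  # T - I
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]          # (T - I)^0
    dims = []
    for _ in range(max_power + 1):
        r = rank(power)
        dims.append((n - r, r))
        power = matmul(power, a)
    return dims

def riesz_number(t):
    """Smallest m at which the chains stop changing."""
    dims = chain_dimensions(t, len(t) + 1)
    return next(m for m in range(len(dims) - 1) if dims[m] == dims[m + 1])

# Hypothetical example: a 2 x 2 Jordan block for eigenvalue 1 plus a simple eigenvalue 2.
T_EXAMPLE = [[1, 1, 0],
             [0, 1, 0],
             [0, 0, 2]]

if __name__ == "__main__":
    print(chain_dimensions(T_EXAMPLE, 3))
    print(riesz_number(T_EXAMPLE))
```

For this matrix the null spaces grow and the ranges shrink for two steps and then remain fixed, so both chains stabilize at the same index, in line with the statement $\mu = \nu$.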


Next, the aim is to show that any $x \in H$ can be expressed as $x = y + z$ where $y \in M_\mu$ and $z \in N_\mu$. Moreover, the only element $M_\mu$ and $N_\mu$ have in common is the zero element. Briefly, $H = M_\mu \oplus N_\mu$. Now $(T - I)^\mu x$ is in $M_\mu$ and therefore in $M_{2\mu}$ since $M_{2\mu} = M_\mu$. Hence there is some $w \in H$ such that $(T - I)^\mu x = (T - I)^{2\mu} w$. Therefore if we write

$$x = \{x - (T - I)^\mu w\} + (T - I)^\mu w$$

the first member is an element of $N_\mu$ whereas the second lies in $M_\mu$. The desired decomposition has therefore been achieved if zero is the only common element of $M_\mu$ and $N_\mu$. Suppose that there is a $z \neq 0$ which is in both $M_\mu$ and $N_m$ ($m \ge 0$). For $n \ge \mu$, $z \in M_n$ and so, for each n, there is a $z_n$ such that $z = (T - I)^n z_n$. Since $z \in N_m$ it follows that $z_n \in N_{m+n}$ but is not in $N_n$, contrary to the properties of $N_n$ as soon as $n \ge \nu$. Hence the only possible common element of $N_m$ and $M_\mu$ is zero, which is a rather stronger result than was stated. At any rate $H = M_\mu \oplus N_\mu$.

One consequence of this result is that the solution of $(T - I)x = y$ is unique when $x, y \in M_\mu$. There is a solution $x \in M_\mu$ since $M_{\mu+1} = M_\mu$ and it will be unique if $(T - I)x = 0$ implies that $x = 0$. However, this is true by the preceding paragraphs because x is in both $M_\mu$ and $N_1$, whose only common element is zero.

Let now $x \in N_{\mu+1}$ so that $(T - I)^{\mu+1} x = 0$. The equation $(T - I)y = 0$ has solution $y = (T - I)^\mu x$ which is in $M_\mu$ and so $y = 0$. Therefore $x \in N_\mu$, i.e. $N_{\mu+1} = N_\mu$ whence $\mu \ge \nu$. An instant deduction is that $\nu = 0$ when $\mu = 0$.

If $\mu \ge 1$, let $y = (T - I)^{\mu-1} x \in M_{\mu-1}$ which is not in $M_\mu$. Since $z = (T - I)^\mu x \in M_\mu$ the equation $(T - I)^\mu w = z$ has the unique solution $w_0$ (say) in $M_\mu$. Hence $x - w_0 \in N_\mu$. On the other hand $(T - I)^{\mu-1}(x - w_0) = y - (T - I)^{\mu-1} w_0 \neq 0$ since $(T - I)^{\mu-1} w_0 \in M_\mu$ but y is not in $M_\mu$. Hence $x - w_0$ is not in $N_{\mu-1}$ and so $N_{\mu-1} \neq N_\mu$. Therefore $\mu \le \nu$.

Combining the last two paragraphs we conclude that $\mu = \nu$. There is no longer any necessity to distinguish between μ and ν.

Suppose that $\mu = 0$. Since $M_0 = H$, the equation $(T - I)x = y$ has a unique solution for any $y \in H$. The correspondence could be expressed as $x = (T - I)^{-1} y$ and the inverse is, in fact, bounded. For, if this were not so, there would be a sequence $\{y_n\}$ such that $\|x_n\|/\|y_n\|$ was unbounded. Put $z_n = x_n/\|x_n\|$; then $(T - I)z_n = y_n/\|x_n\| \to 0$. On account of $\|z_n\| = 1$, the sequence $\{Tz_n\}$ contains a convergent subsequence $\{Tz_{n_k}\}$ with limit, say, $z_0$. Then

$$z_{n_k} = Tz_{n_k} - (T - I)z_{n_k} \to z_0.$$

By virtue of the continuity of $T - I$, $(T - I)z_{n_k} \to (T - I)z_0$. The limit of the left-hand side is zero and hence $z_0 = 0$ by uniqueness. This contradicts $\|z_0\| = \lim\|z_{n_k}\| = 1$ and so there is a finite C such that $\|(T - I)^{-1}\| < C$. Further, if $T^A y = y$,

$$0 = (x, T^A y - y) = (Tx - x, y)$$

for all $x \in H$. Since $\mu = 0$ this shows that y cannot be in $M_1 = M_0 = H$ unless $y = 0$. Accordingly, when $\mu = 0$ the homogeneous adjoint equation possesses only a trivial solution and the inverse $(T^A - I)^{-1}$ exists. It is indeed bounded by the same constant as $(T - I)^{-1}$ because, in general,

$$\|T^A y\|^2 = (T^A y, T^A y) = (TT^A y, y) \le \|TT^A y\|\,\|y\| \le \|T\|\,\|T^A y\|\,\|y\|.$$

If $\mu \ge 1$ the situation is different because the equation $Tx = x$ always admits the solution $x = (T - I)^{\mu-1} z$ where $z \in N_\mu$ but is not in $N_{\mu-1}$. Thus $T - I$ has no inverse. However, it has been proved above that $Tx - x = y$ has a unique solution when both x and y are in $M_\mu$. Arguing as in the case $\mu = 0$ we deduce that there is a C such that for $x \in M_\mu$, $\|x\| \le C\|(T - I)x\|$. Since $(T - I)^n x \in M_\mu$, it follows that $\|x\| \le C^\mu\|(T - I)^\mu x\|$ for $x \in M_\mu$.

When the Riesz number is non-zero the equation $Tx - x = y$ may or may not have a solution. If there is a solution it will not be unique in general. It has been seen that, if y is in $M_\mu$, there is a solution but any x satisfying $Tx = x$ could be added unless x is forced to be in $M_\mu$. For arbitrary y note that, when $T^A w = w$,

$$(y, w) = (Tx - x, w) = (x, T^A w - w) = 0$$

which is a necessary condition that y must obey for a solution to exist. It turns out to be a sufficient condition as well. In fact, since λT is compact when λ is complex, the following theorem can be demonstrated.

THEOREM 6.15 (THE FREDHOLM ALTERNATIVE). If T is a compact operator from H into H EITHER the equations $x - \lambda Tx = 0$ and $w - \lambda^* T^A w = 0$ have only the solutions $x = 0$ and $w = 0$, and $x - \lambda Tx = y$ possesses a unique solution OR the equations $x - \lambda Tx = 0$ and $w - \lambda^* T^A w = 0$ each have the same number of linearly independent solutions $x^{(1)}, \ldots, x^{(r)}$ and $w^{(1)}, \ldots, w^{(r)}$. Then $x - \lambda Tx = y$ has a solution if and only if $(w^{(j)}, y) = 0$ for $j = 1, \ldots, r$.

One important conclusion from this theorem is that, if it can be shown that $x - \lambda Tx = 0$ has no solution other than $x = 0$, the existence of a solution to $x - \lambda Tx = y$ is guaranteed.
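The second alternative of the theorem can be seen at work in the smallest possible setting. The following sketch is a finite-dimensional illustration under stated assumptions: T is a hypothetical real 2 × 2 matrix, λ = 1, the adjoint is the transpose, and solvability of $x - Tx = y$ is decided by comparing the ranks of the matrix and the augmented matrix.

```python
from fractions import Fraction

def rank(m):
    """Rank by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def solvable(a, y):
    """True when a x = y has a solution (rank of a equals rank of [a | y])."""
    augmented = [row + [yi] for row, yi in zip(a, y)]
    return rank(a) == rank(augmented)

def transpose_times(a, w):
    """Product of the transpose of a with the vector w."""
    return [sum(a[i][j] * w[i] for i in range(len(a))) for j in range(len(a[0]))]

def inner(u, v):
    """Real inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

T = [[1, 1],
     [0, 0]]                                   # hypothetical operator
A = [[(1 if i == j else 0) - T[i][j] for j in range(2)] for i in range(2)]  # I - T
W = [1, 1]                                     # solves the homogeneous adjoint equation

if __name__ == "__main__":
    print(transpose_times(A, W))                          # zero vector
    print(solvable(A, [1, -1]), inner(W, [1, -1]))        # solvable, orthogonal to W
    print(solvable(A, [1, 0]), inner(W, [1, 0]))          # not solvable, not orthogonal
```

Here $x - Tx = 0$ has the non-trivial solution (1, 0), the adjoint equation has the solution W, and a right-hand side y admits a solution exactly when it is orthogonal to W.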

When the second alternative is relevant and y has the necessary orthogonality the solution of $x - \lambda Tx = y$ can be written as

$$x = z + \sum_{m=1}^{r} b_m x^{(m)}$$

where z is a particular solution and the $b_m$ are arbitrary constants. It will be observed that Theorem 6.15 also resolves an eigenvalue problem. In particular, any eigenvalue λ (which may be complex) is of finite multiplicity by the second part of the theorem. It may be deduced that the eigenvalues $\lambda_n$ are countable and have no limit point in the finite part of the complex plane. Often applications provide an operator T which is bounded but not compact, though some positive integer power $T^n$ is compact. In that case, Theorem 6.15 still stands.

When the Riesz number of λT is non-zero the second alternative of Theorem 6.15 is pertinent. Any solution of $(I - \lambda T)^2 x = 0$ must be of the form

$$x - \lambda Tx = \sum_{m=1}^{r} b_m x^{(m)}.$$

If this equation can be solved for x with at least one of the constants $b_m$ non-zero the Riesz number must be two or greater since $x - \lambda Tx \neq 0$. If no such solution exists the Riesz number must be one. The existence of a solution entails

$$\sum_{m=1}^{r} b_m (x^{(m)}, w^{(j)}) = 0$$

for $j = 1, \ldots, r$. Consequently, the Riesz number is either one or greater than one according as the $r \times r$ matrix with entries $(x^{(m)}, w^{(j)})$ is non-singular or singular. When the matrix is non-singular it is possible to choose $X_k = \sum_{m=1}^{r} a_{km} x^{(m)}$ such that $(X_k, w^{(j)}) = \delta_{kj}$ where $\delta_{kj}$ is the Kronecker symbol. Since $X_k$ is also an eigenfunction this means that there are eigenfunctions such that the matrix is the unit matrix.

COROLLARY 6.15. The Riesz number of λT is one if, and only if, the matrix with entries $(x^{(m)}, w^{(j)})$ is non-singular or, equivalently, there are linearly independent eigenfunctions $X_k$ such that $(X_k, w^{(j)})$ is the unit matrix.

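As an application of Theorem 6.15 we establish the interrelations between the solutions of (6.102), (6.105), and its adjoint

$$\tfrac{1}{2}\mathbf{k}(p) - \fint_S \{\mathbf{n}_q \wedge \mathbf{k}(q)\} \wedge \operatorname{grad}_q \psi^*(p, q)\, dS_q = 0 \tag{6.109}$$

assuming that the operators are compact as will be verified shortly. If j is any solution of (6.102) let $\mathbf{E}_2$, $\mathbf{H}_2$ be the electromagnetic field in which $\mathbf{E}_2$ is defined by

$$\mathbf{E}_2(P) = \int_S \big[i\omega\mu\,\mathbf{j}\psi(P, q) - \{\rho(q)/\varepsilon\}\operatorname{grad}_q \psi(P, q)\big]\, dS_q.$$

By virtue of (6.102) $\mathbf{n} \wedge \mathbf{H}_2^- = 0$. Hence, application of the analogue of (6.75) for the interior $S_-$ provides

$$\mathbf{n}_p \wedge \Big[\tfrac{1}{2}\mathbf{E}_2(p) + \fint_S \{\mathbf{n}_q \wedge \mathbf{E}_2(q)\} \wedge \operatorname{grad}_q \psi(p, q)\, dS_q\Big] = 0.$$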

Thus $\mathbf{n}_p \wedge \mathbf{E}_2(p)$ is a solution of (6.105). To every linearly independent j there corresponds an $\mathbf{n}_p \wedge \mathbf{E}_2(p)$. If the $\mathbf{n}_p \wedge \mathbf{E}_2(p)$ were not also linearly independent there would be some non-zero j for which $\mathbf{n}_p \wedge \mathbf{E}_2(p) = 0$. Since $\mathbf{n}_p \wedge \mathbf{E}_2^+(p) = \mathbf{n}_p \wedge \mathbf{E}_2^-(p) = \mathbf{n}_p \wedge \mathbf{E}_2(p)$ the uniqueness theorem of the exterior problem would imply that the field was identically zero in $S_+$ and so $\mathbf{n}_p \wedge \mathbf{H}_2^+(p) = 0$. But, since $\mathbf{n}_p \wedge \mathbf{H}_2^-(p) = 0$, this would necessitate $\mathbf{j} = 0$ and a zero field, contrary to assumption. Hence there are as many linearly independent $\mathbf{n}_p \wedge \mathbf{E}_2(p)$ as there are j. But (6.102) and (6.105) have the same number of linearly independent solutions by Theorem 6.15. Accordingly, every solution of (6.105) can be expressed in the form $\mathbf{n}_p \wedge \mathbf{E}_2(p)$ derived from a solution of (6.102).

Next, it will be shown that $\mathbf{j}_m^*$ is a solution of (6.105) whenever $\mathbf{j}_m$ is. Let $\mathbf{j}_m$ generate the electromagnetic field $\mathbf{E}_m$, $\mathbf{H}_m$ inside S. Then $\mathbf{n}_p \wedge \mathbf{H}_{m-} = 0$ and $\mathbf{j}_m = \mathbf{n}_p \wedge \mathbf{E}_{m-}$. However, the field $\mathbf{E}_m^*$, $-\mathbf{H}_m^*$ also satisfies Maxwell's equations in $S_-$ and has a tangential magnetic intensity which is zero on S. Therefore $\mathbf{j}_m^*$ satisfies (6.105). In view of the conclusion of the preceding paragraph, to each solution j of (6.102) is related a solution $\mathbf{j}_m$ of (6.105) such that

$$\mathbf{j}_m^*(p) = \mathbf{n}_p \wedge \mathbf{E}_2(p).$$

On account of the way in which (6.105) was constructed from (6.104) we infer that every solution of (6.104), the adjoint of (6.102), satisfies

(6.110) The result enables us to show that the Riesz number of (6.102) is one when it possesses non-trivial solutions. If it were not one, then j would be orthogonal to all solutions of the adjoint (6.104) by Corollary 6.15. In particular, it would be orthogonal to the h of (6.110). Since j = - n 1\ H~ this means

Is n

A

H~. E2 * as, = O.

But then the exterior uniqueness theorem would require Dp 1\ H~ = 0 on S or j = 0 in contradiction to i being non-trivial. Consequently, the Riesz number must be one. An immediate inference is that the Riesz number of (6.105) is one. Accordingly, by Corollary 6.15, there are linearly independent J, spanning the space of solutions of (6.105) and linearly independent k, solutions of the adjoint (6.109) such that

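(6.111)

An explicit formula for $\mathbf{J}_r$ in terms of the $\mathbf{k}_s$ can be derived. Multiply (6.109) vectorially by $\mathbf{n}_p$ and take the complex conjugate. It is obvious then that $\mathbf{n} \wedge \mathbf{k}^*$ satisfies (6.102). Hence $\mathbf{n} \wedge \mathbf{k}_s^*$ can be used to generate a field of the type $\mathbf{E}_2$. By what has been proved above $\mathbf{J}_r = \mathbf{n} \wedge \mathbf{E}_2$ for some $\mathbf{E}_2$. Therefore, there are constants $a_{rs}$ such that

$$\mathbf{J}_r(p) = \mathbf{n}_p \wedge \sum_{s=1}^{N} a_{rs} \int_S \{i\omega\mu\,\mathbf{n}_q \wedge \mathbf{k}_s^*\,\psi(p, q) + (1/i\omega\varepsilon) \operatorname{Div}(\mathbf{n}_q \wedge \mathbf{k}_s^*) \operatorname{grad}_q \psi(p, q)\}\, dS_q \tag{6.112}$$

when there are $N$ of the $\mathbf{k}_s$.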


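6.16 Compactness and other properties of the MFIE

To apply the Fredholm alternative to (6.102) it is necessary to show that the operator $T$, defined by

$$T\mathbf{j} = \mathbf{n}_p \wedge \int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(p, q)\, dS_q,$$

is compact. The theory at the beginning of §3.4 cannot be cited because the kernel is not square integrable. Instead, advantage is taken of the device described after eqn (3.17) of that section. Define $\boldsymbol{\Psi}_1$ and $\boldsymbol{\Psi}_2$ by

$$\boldsymbol{\Psi}_1 = \operatorname{grad}_q \psi(p, q) \quad (|\mathbf{x}_p - \mathbf{x}_q| \ge \delta), \qquad \boldsymbol{\Psi}_1 = 0 \quad (|\mathbf{x}_p - \mathbf{x}_q| < \delta),$$

$$\boldsymbol{\Psi}_2 = 0 \quad (|\mathbf{x}_p - \mathbf{x}_q| \ge \delta), \qquad \boldsymbol{\Psi}_2 = \operatorname{grad}_q \psi(p, q) \quad (|\mathbf{x}_p - \mathbf{x}_q| < \delta).$$

Then $T = T_1 + T_2$ where

$$T_1\mathbf{j} = \mathbf{n}_p \wedge \int_S \mathbf{j}(q) \wedge \boldsymbol{\Psi}_1\, dS_q, \qquad T_2\mathbf{j} = \mathbf{n}_p \wedge \int_S \mathbf{j}(q) \wedge \boldsymbol{\Psi}_2\, dS_q.$$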

The kernel for $T_1$ has no singularity and is square integrable. Therefore, by §3.4, $T_1$ is compact. Hence $T$ will be compact, by the first criterion at the end of §3.3, if it can be shown that the norm of $T_2$ can be made arbitrarily small. For given $\mathbf{x}_p$ select temporary axes with the origin at $\mathbf{x}_p$ and the $z$ axis along $\mathbf{n}_p$, assuming that $S$ has a continuously turning tangent plane near $\mathbf{x}_p$. The integral in $T_2$ will then be effectively limited to $x^2 + y^2 \le \delta^2$ and the equation of the surface will be approximately $2z = ax^2 + by^2$ if the $x$ and $y$ axes are arranged to lie in the planes of principal curvature.

Fig. 6.19. The geometrical configuration for the norm of $T_2$.

If $\mathbf{j}$ has components $j_1$, $j_2$, and $j_3$ parallel to the coordinate axes

$$j_3 = axj_1 + byj_2 \tag{6.113}$$

because $\mathbf{j}$ is tangential to the surface. Also

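$$T_2\mathbf{j} = \int_S \{(\mathbf{k} \cdot \boldsymbol{\Psi}_2)\mathbf{j} - (\mathbf{k} \cdot \mathbf{j})\boldsymbol{\Psi}_2\}\, dS_q$$

and now $\psi = e^{-ikr}/4\pi r$ where $r^2 = x^2 + y^2 + z^2$. Hence

$$|T_2\mathbf{j}| \le \int_{r \le \delta} \Big\{\Big|\frac{\partial\psi}{\partial z}\Big|(|j_1| + |j_2|) + |j_3|\Big(\Big|\frac{\partial\psi}{\partial x}\Big| + \Big|\frac{\partial\psi}{\partial y}\Big|\Big)\Big\}\, dS_q.$$

Now $|z|/r^2 \le |a| + |b|$ and so, if (6.113) is substituted,

$$|T_2\mathbf{j}| \le B \int_{r \le \delta} \frac{|\mathbf{j}|}{r}\, dS_q$$

for some finite constant $B$. Hence, in general,

$$|T_2\mathbf{j}|^2 \le B^2 \Big\{\int_{|\mathbf{x}_p - \mathbf{x}_q| \le \delta} \frac{|\mathbf{j}|}{|\mathbf{x}_p - \mathbf{x}_q|}\, dS_q\Big\}^2 \le B^2 \int_{|\mathbf{x}_p - \mathbf{x}_q| \le \delta} \frac{|\mathbf{j}|^2}{|\mathbf{x}_p - \mathbf{x}_q|}\, dS_q \int_{|\mathbf{x}_q| \le \delta} \frac{dS_q}{|\mathbf{x}_q|}$$

by the Schwarz inequality. Consequently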

$$\begin{aligned}
\|T_2\mathbf{j}\|^2 = \int_S |T_2\mathbf{j}|^2\, dS_p &\le B^2 \int_S \Big\{\int_S \frac{|\mathbf{j}|^2}{|\mathbf{x}_p - \mathbf{x}_q|}\, dS_q\Big\}\, dS_p \int_{|\mathbf{x}_q| \le \delta} \frac{dS_q}{|\mathbf{x}_q|} \\
&\le B^2 \int_S |\mathbf{j}|^2\, dS_q \int_{|\mathbf{x}_p| \le d} \frac{dS_p}{|\mathbf{x}_p|} \int_{|\mathbf{x}_q| \le \delta} \frac{dS_q}{|\mathbf{x}_q|} \\
&\le B^2 \|\mathbf{j}\|^2 \int_{|\mathbf{x}_p| \le d} \frac{dS_p}{|\mathbf{x}_p|} \int_{|\mathbf{x}_q| \le \delta} \frac{dS_q}{|\mathbf{x}_q|}
\end{aligned}$$

where $d$ is the maximum separation of two points of $S$. Since $S$ is bounded, $d$ is finite and the first integral is finite. The second integral can be made as small as we like by making $\delta$ sufficiently tiny. Hence it can be arranged that $\|T_2\mathbf{j}\|^2 \le \varepsilon^2\|\mathbf{j}\|^2$ for any $\varepsilon > 0$, i.e. $\|T_2\| \le \varepsilon$ and therefore $T$ is compact.

Although compactness has been proved for a surface which has a local quadratic approximation (for a more precise argument, including a proof that $T$ is a bounded operator, see Muller 1969 (see also §7.7)) a check of the proof reveals that it is still valid if $|z|/r^2 < 1/r^{\alpha}$ where $0 \le \alpha < 1$, which may be otherwise stated as: the angle between two normals to $S$ must not exceed a finite multiple of $|\mathbf{x}_p - \mathbf{x}_q|^{1-\alpha}$. Compactness cannot be expected when the direction of the normal changes discontinuously.
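The smallness of the integral of $1/|\mathbf{x}_q|$ over $|\mathbf{x}_q| \le \delta$ can be checked in the flat-patch limit, where it equals $2\pi\delta$. The following is a minimal sketch only: it assumes a plane disc in place of the curved patch and a midpoint rule on a Cartesian grid, and the function name and grid size are choices made here rather than anything prescribed by the text.

```python
import math

def disc_integral_inverse_r(delta, n=400):
    """Midpoint-rule estimate of the integral of 1/r over the flat disc r <= delta.

    Assumption: the patch is plane. With n even no cell centre falls on r = 0.
    """
    h = 2.0 * delta / n
    total = 0.0
    for i in range(n):
        x = -delta + (i + 0.5) * h
        for j in range(n):
            y = -delta + (j + 0.5) * h
            r = math.hypot(x, y)
            if r <= delta:
                total += h * h / r
    return total

# The estimate shrinks in proportion to delta, as 2*pi*delta does.
for delta in (0.4, 0.2, 0.1):
    print(delta, disc_integral_inverse_r(delta), 2.0 * math.pi * delta)
```

The weak $1/r$ singularity is integrable in two dimensions, which is why even this crude rule stays close to $2\pi\delta$.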


The argument that our integral operator is compact is legitimate for any complex value of $k$. In particular, the static operator in which $k = 0$ is compact. However, the solution of the static integral equation is unique subject to the static surface continuity condition $\operatorname{Div} \mathbf{j} = 0$. To prove this define

$$\mathbf{J} = \int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi_0(P, q)\, dS_q \tag{6.114}$$

where $\psi_0$ is $\psi$ with $k = 0$. Then $\operatorname{div}_p \mathbf{J} = 0$ and $\operatorname{curl}_p \mathbf{J} = 0$ (from (6.87) and the continuity condition) in both $S_+$ and $S_-$. Thus $\mathbf{J} = \operatorname{grad} V$ where $V$ satisfies Laplace's equation in both regions. The static analogue of (6.102) implies that $\mathbf{n}_p \wedge \mathbf{J}_- = 0$ and so $V_-$ is constant on $S$. Hence $V$ is constant throughout $S_-$ which entails $\mathbf{J} = 0$ in $S_-$. Therefore $\mathbf{n}_p \cdot \mathbf{J}_- = 0$. But (6.90) and (6.91) require the normal component of $\mathbf{J}$ to be continuous whence $\mathbf{n}_p \cdot \mathbf{J}_+ = 0$. Consequently, the normal derivative of $V$ vanishes on $S$ and, since $V$ tends to zero at infinity, $V$ must be zero in $S_+$. Therefore $\mathbf{n}_p \wedge \mathbf{J}_+ = 0$ with the implication from (6.90) and (6.91) that $\mathbf{j} = 0$ and the statement about uniqueness for the static problem is proved.

Having established this fact, we can drop the surface continuity condition and still retain uniqueness, for, although $\operatorname{curl}_p \mathbf{J} = 0$ cannot now be asserted, it is true that $\operatorname{curl}_p \operatorname{curl}_p \mathbf{J} = 0$ by virtue of (6.87). Accordingly,

$$\int_{S_-} (\operatorname{curl} \mathbf{J})^2\, d\tau = \int_S \mathbf{n}_q \cdot \{\mathbf{J}_- \wedge \operatorname{curl}_q \mathbf{J}_-\}\, dS_q = 0 \tag{6.115}$$

by means of the static analogue of (6.102). Therefore $\operatorname{curl}_p \mathbf{J} = 0$ in $S_-$ and we conclude, as before, that $\mathbf{J} = 0$ in $S_-$. In particular, $\mathbf{n}_p \wedge \operatorname{curl} \mathbf{J}_- = 0$ on $S$. It follows from (6.87), (6.90), and (6.91) that $\mathbf{n}_p \wedge \operatorname{curl} \mathbf{J}_+ = 0$. A repetition of (6.115) for $S_+$, bearing in mind the behaviour of $\mathbf{J}$ and $\operatorname{curl} \mathbf{J}$ at infinity, then leads to $\operatorname{curl}_p \mathbf{J} = 0$ in $S_+$. A return to the previous case has therefore been arrived at and uniqueness of the static integral equation has been demonstrated without imposition of the surface continuity condition.

6.17 Other integral equations

Unique solutions of the MFIE and EFIE are not available for some values of $k$. Various means of overcoming the lack of uniqueness have been devised. The simplest way is to give $k$ a small negative imaginary part when solving the integral equation. The solution then becomes unique at the price of introducing an approximation to the surface current. Since the current will be determined normally by a numerical approximation the extra error may be of no great significance, especially in calculation of the far field (for which $k$ is returned to its real value). This possibility has not been explored very fully so there is little information on its effectiveness.

Another suggestion stems from the observation that the EFIE and MFIE fail for different kinds of interior resonance so that a combination of them might have the desired uniqueness. Multiply (6.94) vectorially by $\beta\mathbf{n}_p(\varepsilon/\mu)^{1/2}$ and add it to (6.101). There results

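$$\beta\Big(\frac{\varepsilon}{\mu}\Big)^{1/2} \mathbf{n}_p \wedge \Big[\mathbf{n}_p \wedge \int_S \Big\{i\omega\mu\,\mathbf{j}(q)\psi(p, q) - \frac{\rho(q)}{\varepsilon} \operatorname{grad}_q \psi(p, q)\Big\}\, dS_q\Big] - \tfrac{1}{2}\mathbf{j}(p) + \mathbf{n}_p \wedge \int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = \text{known} \tag{6.116}$$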

which is called the combined field integral equation (CFIE). The uniqueness of the CFIE turns on whether (6.116) has a non-trivial solution when the right-hand side is zero. Under this condition (6.116) states that there is an electromagnetic field with no sources inside $S_-$ such that

(6.117)

Since the field has no sources in $S_-$ the divergence theorem and Maxwell's equations supply

$$\int_S \mathbf{n}_q \cdot \{\mathbf{E}_-(q) \wedge \mathbf{H}_-^*(q) + \mathbf{E}_-^*(q) \wedge \mathbf{H}_-(q)\}\, dS_q = 0.$$

Substitute for $\mathbf{n} \wedge \mathbf{H}_-$ from (6.117) to obtain

$$(\beta + \beta^*) \int_S |\mathbf{n}_q \wedge \mathbf{E}_-(q)|^2\, dS_q = 0. \tag{6.118}$$

So long as $\beta + \beta^* \ne 0$ we conclude that $\mathbf{n}_p \wedge \mathbf{E}_-(p) = 0$; it follows from (6.117) that $\mathbf{n}_p \wedge \mathbf{H}_-(p) = 0$. The field representation (6.85) implies that $\mathbf{n}_p \wedge \mathbf{E}_+(p) = 0$ when $\mathbf{n}_p \wedge \mathbf{E}_-(p) = 0$. The exterior uniqueness theorem now enforces $\mathbf{n}_p \wedge \mathbf{H}_+(p) = 0$. On account of the continuity of the tangential components of $\mathbf{H}$ we have $\mathbf{j} \equiv 0$ (cf. (6.100)). Therefore, the solution of the CFIE is unique provided that $\beta$ has a non-zero real part. That a solution of the CFIE exists can be deduced immediately from Theorem 6.15 since the sum of two compact operators is compact.

The CFIE is superior to both the MFIE and EFIE in possessing no uniqueness difficulties. The cost of this advantage is a substantial increase in computer requirements. Indeed, it is only in recent years that computer power has become great enough for the CFIE to be a feasible proposition. Even today the computer is strained if the body is large or the boundary is at all complicated. A wide choice is open for $\beta$; a positive real number is an obvious option. Theoretically, the solution of the CFIE should be independent of $\beta$ but no numerical approximation can be expected to display this property. The size of the discrepancies which occur with variations in $\beta$ furnishes a measure of the numerical accuracy. Actually, $\beta$ must not be selected too large otherwise the CFIE is too close to the EFIE for numerical convenience. Similarly, if $\beta$ is too small the CFIE is not sufficiently far away from the MFIE. A value of $\beta$ in the region of 1 would seem to be about right.

The next avenue to explore is whether a representation through a magnetic current might be an improvement, for example

$$\mathbf{E}(P) = \int_S \mathbf{J}(q) \wedge \operatorname{grad}_q \psi(P, q)\, dS_q.$$

The integral equation

$$\tfrac{1}{2}\mathbf{J}(p) + \mathbf{n}_p \wedge \int_S \mathbf{J} \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = \mathbf{n}_p \wedge \mathbf{E}_0(p) \tag{6.119}$$

is the outcome. Since the left-hand side of (6.119) coincides with that of (6.105) the discussion of §6.14 reveals that uniqueness fails at interior magnetic resonances. Whether (6.119) can be solved at such frequencies depends, according to Theorem 6.15, on whether

$$\int_S \mathbf{n}_q \wedge \mathbf{E}_0 \cdot \mathbf{k}^*\, dS_q = 0$$

for any solution $\mathbf{k}$ of the adjoint (6.109). While this may be true for a particular $\mathbf{E}_0$ it will be false in general. In that sense (6.119) is worse than the MFIE. Nevertheless, it is possible to adjust the right-hand side of (6.119) so that it can be solved at magnetic resonance. The wrong boundary condition is being satisfied now but a correction can be introduced to compensate the error. At a resonant frequency consider

$$\tfrac{1}{2}\mathbf{J}'(p) + \mathbf{n}_p \wedge \int_S \mathbf{J}' \wedge \operatorname{grad}_q \psi(p, q)\, dS_q = \mathbf{n}_p \wedge \mathbf{E}_0(p) - \sum_{r=1}^{N} a_r\mathbf{J}_r \tag{6.120}$$

where $\mathbf{J}_r$ satisfies (6.111) and (6.112). Select $a_r =$ and then, by (6.111), the right-hand side of (6.120) complies with the conditions of Theorem 6.15. Hence (6.120) can be solved for $\mathbf{J}'$. The solution $\mathbf{J}'$ will not be unique. However, different $\mathbf{J}'$ produce the same exterior field. This is because the different $\mathbf{J}'$ differ by a linear combination of the $\mathbf{J}_r$ and it has been shown already that such currents give rise to a field which is identically zero in $S_+$. Of course, this feature may not be reproduced exactly in a numerical approximation.

The field generated by $\mathbf{J}'$ does not meet the correct boundary conditions because of the extra terms on the right-hand side of (6.120). To balance these bring in an additional field. Noting that (6.112) can be written

$$\mathbf{J}_r(p) = \mathbf{n}_p \wedge \lim_{P \to p} \mathbf{E}_r(P)$$

we see that adding $\sum_{r=1}^{N} a_r\mathbf{E}_r$ to the field due to $\mathbf{J}'$ furnishes an electromagnetic field that has the right boundary values. Thus, a representation via a magnetic current has been obtained whether or not interior resonance is occurring.

The representation through a magnetic current has some practical disadvantages. Firstly, the frequencies of interior magnetic resonance need to be known in order to be aware of when the right-hand side of the integral equation has to be modified. Secondly, the modification cannot be undertaken until the eigenfunctions of (6.105) and its adjoint are available. Hence the labour involved is considerably more than for the MFIE and EFIE. The comparison with the CFIE is not quite so clear cut because $\mathbf{J}_r$ and $\mathbf{k}_r$ need to be calculated only once for a given shape, being independent of the applied field. Notwithstanding, the errors incurred in their numerical computation are likely to render the method generally inferior to the CFIE.

Another attempt to find an alternative to the CFIE can be based on a representation by Green's tensors. This possibility was investigated by Neave (1987) who showed that non-uniqueness is present but was able to circumvent it by a method of Jones (1974, 1984). A description of the method will be given without invoking tensors. Choose an origin inside $S_-$ and let $R$, $\theta$, $\phi$ be spherical polar coordinates. The spherical harmonic $Y_n^m$ is defined by

$$Y_n^m(\theta, \phi) = \Big\{\frac{(n - |m|)!\,(2n + 1)}{(n + |m|)!\,4\pi}\Big\}^{1/2} P_n^{|m|}(\cos\theta)\, e^{im\phi} \tag{6.121}$$

where $P_n^{|m|}$ is the standard associated Legendre polynomial. The spherical harmonics have the orthogonal property

(6.122)

Let $P$ and $Q$ denote $(R, \theta, \phi)$ and $(R_1, \theta_1, \phi_1)$ respectively, and define

$$\chi(P, Q) = -ik \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{mn} h_n^{(2)}(kR) h_n^{(2)}(kR_1) Y_n^m(\theta, \phi) Y_n^{m*}(\theta_1, \phi_1),$$

$$\psi_1(P, Q) = \psi(P, Q) + \chi(P, Q)$$

where $h_n^{(2)}$ is the spherical Hankel function (Jones 1986). The function $\psi_1$ satisfies the same radiation condition as $\psi$. In addition, if the origin be excluded, it has the same singularities as $\psi$ and is a solution of the same partial differential equation. Therefore, $\psi_1$ can be employed in place of $\psi$ in representations in $S_+$. Accordingly, a modified EFIE can be obtained with $\psi_1$ substituted for $\psi$. Uniqueness or otherwise of the modified EFIE turns upon whether

$$\mathbf{n}_p \wedge \int_S [i\omega\mu\,\mathbf{j}(q)\psi_1(p, q) - \{\rho(q)/\varepsilon\} \operatorname{grad}_q \psi_1(p, q)]\, dS_q = 0 \tag{6.123}$$


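has non-trivial solutions. If

$$\mathbf{E}^0(P) = \int_S [i\omega\mu\,\mathbf{j}(q)\psi_1(P, q) - \{\rho(q)/\varepsilon\} \operatorname{grad}_q \psi_1(P, q)]\, dS_q$$

the argument at the beginning of §6.13 shows that (6.123) entails $\mathbf{E}^0$ being identically zero in $S_+$ and $\mathbf{n} \wedge \mathbf{E}^0_+ = 0$ whence $\mathbf{n} \wedge \mathbf{E}^0_- = 0$. Draw a sphere with centre the origin and entirely within $S_-$. Let its radius be $a$ and its surface be $\Omega$. Then, from

$$\int_S (\mathbf{n} \wedge \mathbf{E}^0_- \cdot \mathbf{H}^{0*}_- + \mathbf{n} \wedge \mathbf{E}^{0*}_- \cdot \mathbf{H}^0_-)\, dS = 0,$$

$$\int_\Omega (\mathbf{n} \wedge \mathbf{E}^0 \cdot \mathbf{H}^{0*} + \mathbf{n} \wedge \mathbf{E}^{0*} \cdot \mathbf{H}^0)\, d\Omega = 0. \tag{6.124}$$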

Now, if $P$ is closer to the origin than $Q$, there is the expansion

$$\psi(P, Q) = -ik \sum_{n=0}^{\infty} \sum_{m=-n}^{n} j_n(kR) h_n^{(2)}(kR_1) Y_n^m(\theta, \phi) Y_n^{m*}(\theta_1, \phi_1). \tag{6.125}$$

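This expansion can be used in the integral for $\mathbf{E}^0$ when $P$ is near $\Omega$. In view of the definition of $\chi$ it follows that near $\Omega$

$$\mathbf{E}^0 = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \{j_n(kR) + a_{mn} h_n^{(2)}(kR)\} Y_n^m(\theta, \phi)\, \mathbf{p}_{mn}$$

with $\mathbf{p}_{mn}$ a constant vector. From Maxwell's equations

$$i\omega\mu\mathbf{H}^0 = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \mathbf{p}_{mn} \wedge \operatorname{grad}[\{j_n(kR) + a_{mn} h_n^{(2)}(kR)\} Y_n^m(\theta, \phi)]$$

in the same region. Therefore

$$-i\omega\mu\mathbf{E}^0 \wedge \mathbf{H}^{0*} = (\mathbf{E}^0 \cdot \operatorname{grad})\mathbf{E}^{0*} - \sum_{r=0}^{\infty} \sum_{s=-r}^{r} (\mathbf{E}^0 \cdot \mathbf{p}_{sr}^*) \operatorname{grad}[\{j_r(kR) + a_{sr}^* h_r^{(1)}(kR)\} Y_r^{s*}(\theta, \phi)].$$

But, since $\operatorname{div} \mathbf{E}^0 = 0$,

$$(\mathbf{E}^0 \cdot \operatorname{grad})\mathbf{E}^{0*} - (\mathbf{E}^{0*} \cdot \operatorname{grad})\mathbf{E}^0 = \operatorname{curl}(\mathbf{E}^{0*} \wedge \mathbf{E}^0)$$

and

$$\int_\Omega \mathbf{n} \cdot \operatorname{curl}(\mathbf{E}^{0*} \wedge \mathbf{E}^0)\, d\Omega = 0$$

because $\Omega$ is closed. Accordingly, when (6.122) is invoked, (6.124) reduces to

$$\sum_{n=0}^{\infty} \sum_{m=-n}^{n} |\mathbf{p}_{mn}|^2 \big[\{j_n(ka) + a_{mn} h_n^{(2)}(ka)\}\{j_n'(ka) + a_{mn}^* h_n^{(1)\prime}(ka)\} - \{j_n(ka) + a_{mn}^* h_n^{(1)}(ka)\}\{j_n'(ka) + a_{mn} h_n^{(2)\prime}(ka)\}\big] = 0$$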

or

$$\sum_{n=0}^{\infty} \sum_{m=-n}^{n} |\mathbf{p}_{mn}|^2 (1 - |1 + 2a_{mn}|^2) = 0$$

on simplifying by means of the Wronskian relations for spherical Bessel functions. When all the second factors are of the same sign and non-zero the series cannot vanish unless $\mathbf{p}_{mn} = 0$ for all $m$ and $n$. In that case $\mathbf{E}^0 = 0$ in the neighbourhood of $\Omega$. By analyticity $\mathbf{E}^0$ vanishes throughout $S_-$ and the solution of the modified EFIE is unique. Thus, it has been shown that the solution of the modified EFIE is unique provided that either

$$|1 + 2a_{mn}| < 1$$

for $n = 0, 1, \ldots$, $m = -n, \ldots, n$, or

$$|1 + 2a_{mn}| > 1$$

for $n = 0, 1, \ldots$, $m = -n, \ldots, n$.

The main problem in implementing the modified EFIE is in the selection of the coefficients $a_{mn}$. Colton and Kress (1983) have shown how they can be picked to satisfy the impedance boundary condition on a sphere and thereby draw benefit from a theory due to Ursell (1973). Various criteria have been offered by Kleinman and Roach (1982) but so far there does not seem to be available a selection which permits the replacement of the infinite sum in $\chi$ by a simple analytical function. Such a function, or one which estimated the contribution of the terms for large $n$, would be a valuable practical asset. Without such a function the series has to be truncated in a numerical calculation. The truncated series eliminates some but not all of the interior resonances. As a consequence uniqueness cannot be guaranteed at all frequencies and numerical inaccuracy can occur at resonances which have not been eliminated (Brandt et al. 1985).

Yet another technique designed to achieve uniqueness is based on the application of (6.75) to the representation (6.85) with $P$ in $S_-$. The process leads to

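$$\int_S [i\omega\mu\,\mathbf{j}(q)\psi(P_i, q) - \{\rho(q)/\varepsilon\} \operatorname{grad}_q \psi(P_i, q)]\, dS_q = -\int_S \{\mathbf{n}_q \wedge \mathbf{E}_0(q)\} \wedge \operatorname{grad}_q \psi(P_i, q)\, dS_q \tag{6.126}$$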

holding for every $P_i$ in $S_-$. It is known as the interior electric integral equation; it should be noted that the unknowns are defined on a different domain from the one on which (6.126) holds so that the standard theory of integral equations is inapplicable. Notwithstanding, it can be shown that there is uniqueness so long as (6.126) is satisfied at every $P_i$ in $S_-$. The difficulty in practice of implementing (6.126) is that any numerical scheme will cope with only a finite number of $P_i$. There is no known method of choosing that finite number of points so that satisfaction of (6.126) at them will ensure that (6.126) is met at all interior points. In other words the uniqueness of the analytical formulation cannot be certified to carry over to a numerical calculation.

An analogous interior magnetic integral equation can be derived from (6.86) and (6.76). It is

$$\int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi(P_i, q)\, dS_q = \text{known} \tag{6.127}$$

for every $P_i$ in $S_-$. Its deficiencies are the same as those of the interior electric integral equation. The Schenck method (1968) is to employ the MFIE and supplement it by requiring that (6.127) be satisfied at a number of interior points as well. In some cases it is found that a few interior points are sufficient to give good results but in other cases several interior points fail to prevent poor answers. The lack of knowledge on how to select the interior points for reliable calculations is the stumbling block. There is also the alternative of supplementing the EFIE by satisfying (6.126) at a number of interior points. In general, the results may be expected to be poorer than those based on the MFIE. The reason for this stems from the fact that the usual numerical methods split $S$ into sub-areas by a mesh and assume some approximating formula for the unknown on each sub-area. In the absence of error estimates accuracy is checked by reducing the size of the mesh until two successive computations agree to a prescribed number of figures. As the size of the sub-area diminishes, the value of the integral over it will tend to decrease. Thus the coefficients in the algebraic system derived from the integrals will become smaller and smaller at the same time as the order of the system becomes larger and larger. Consequently, there is a tendency for the determinant of the matrix to approach zero and the system becomes ill conditioned; for very fine meshes some of the numbers may even be too small to be retained by the machine. These comments apply in particular to the EFIE and the interior integral equations when used by themselves. On the other hand, the MFIE contains a term $\mathbf{j}$ which does not involve integration. Therefore the MFIE escapes the above defect and becomes more and more diagonally dominant as the mesh size decreases. Methods based on the MFIE are hence to be preferred since a reduction in mesh size may cause a deterioration of accuracy in other methods.
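A toy one-dimensional analogue may help to picture the scaling argument. The sketch below is illustrative only: it assumes a smooth stand-in kernel $\exp(-|x - y|)$ on $[0, 1]$ discretized by the midpoint rule, and the $\tfrac{1}{2}$ added to the diagonal merely imitates the integral-free term of the MFIE. Neither the kernel nor the helper names come from the text.

```python
import math

def first_kind_matrix(n):
    """Entries h*K(x_i, x_j) for the stand-in kernel K = exp(-|x - y|) (an assumption)."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    return [[h * math.exp(-abs(xi - xj)) for xj in xs] for xi in xs]

def second_kind_matrix(n):
    """Same entries with 1/2 added on the diagonal, imitating the free term of the MFIE."""
    a = first_kind_matrix(n)
    for i in range(n):
        a[i][i] += 0.5
    return a

def largest_entry(a):
    """Largest absolute entry of the matrix."""
    return max(abs(v) for row in a for v in row)

def diagonal_to_offdiagonal_ratio(a):
    """Smallest diagonal entry divided by the largest off-diagonal entry."""
    n = len(a)
    diag = min(abs(a[i][i]) for i in range(n))
    off = max(abs(a[i][j]) for i in range(n) for j in range(n) if i != j)
    return diag / off

# As the mesh is refined every first-kind entry shrinks like h, while the
# second-kind diagonal stays near 1/2 and dominates each individual entry more.
for n in (10, 20, 40):
    print(n, largest_entry(first_kind_matrix(n)),
          diagonal_to_offdiagonal_ratio(second_kind_matrix(n)))
```

The ratio printed here compares entries one at a time; it is meant only to show the trend described above, not to measure conditioning.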
The CFIE enjoys the same advantage of diagonal dominance as the mesh size diminishes and is superior to the MFIE in having no uniqueness problems.

Instead of imposing (6.127) directly the expansion (6.125) can be inserted and the behaviour near the origin examined. It will be similar to that in the discussion of the modified EFIE with $a_{mn} = 0$. If the right-hand side is expanded in a similar structure the conclusion is that

$$\int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \psi_{mn}(q)\, dS_q = \text{known} \tag{6.128}$$

for $n = 0, 1, \ldots$, $m = -n, \ldots, n$ where

If all these equations, which are imposed at a single point, are satisfied, it follows from the analytic nature of the field that (6.127) holds throughout $S_-$. Therefore the proposal is that the MFIE should be solved subject to the enforcement of a finite number of (6.128). Such a system can be expected to be overdetermined but a solution can be specified by least squares.

The Schenck method requires the selection of a number of interior points with no governing criterion other than that they should avoid the nodal surfaces of the interior modes of resonance. Unfortunately, the location of the nodal surfaces is not known, so that the placing of the interior points has to be based on experience and intuition. The implementation of (6.128) removes the arbitrariness in the selection of interior points, replacing the selection by a demand that certain partial derivatives of the left-hand side of (6.127) should vanish at the origin. On the other hand the computation of (6.127) needs only the computation of $\psi$ whereas each of (6.128) involves a different $\psi_{mn}$. It may be possible to have the best of both worlds by choosing the points of the Schenck method at positions which are used for the calculation of derivatives by difference formulae. Provided that the choices of positions were close enough for the differences to approximate the derivatives and sufficiently separated for numerical distinguishability one would hope that the Schenck method would achieve the theoretical advantages of (6.128).

The final method to be mentioned in this section may be regarded as combining in a certain sense the modified and interior integral equations. It will be referred to as the mixed method. Its aim is to preserve the simplicity of the MFIE and EFIE, avoiding the complexity of the CFIE, but offering only a measure of protection against non-uniqueness. The technique (Tobin et al. 1987) is to replace $\psi(p, q)$ in the EFIE by

$$\psi(p, q) - \beta\psi(p - \delta\mathbf{n}, q)$$

thereby obtaining the mixed EFIE. The parameter $\delta$ must be positive and small enough for the point $p - \delta\mathbf{n}$ to be in $S_-$ for every $p$ of $S$. Not all surfaces will be able to meet this criterion. For example, when $S$ has a sharp edge, there will be a problem when $p$ is near the edge. Although this limitation influences the validity of the analytical argument it may be less serious in an approximation where $S$ is subdivided into patches and the normal to a patch can be positioned appropriately. The mixed MFIE is obtained from the MFIE by the same replacement for $\psi(p, q)$.

To investigate the uniqueness of the mixed EFIE let $\mathbf{j}$ be a solution when the right-hand side vanishes. Define $\mathbf{E}^0$, $\mathbf{H}^0$ as at the beginning of §6.13. Then the mixed EFIE asserts that

(6.129)

If $\delta$ is small and $\beta$ is not too close to unity, (6.129) requires effectively that $\mathbf{n}_p \wedge \mathbf{E}^0(p) = 0$. Subject to that condition, $\mathbf{n}_p \wedge \mathbf{E}^0(p - \delta\mathbf{n}) = 0$ follows from (6.129). The argument of §6.13 may be repeated and one concludes that there is uniqueness provided that there is no electric mode of oscillation which resonates simultaneously in a cavity with boundary $S$ and in a smaller one inside $S$. The boundary of the smaller cavity is essentially parallel to $S$, being the displacement inwards of $S$ by a distance $\delta$ along the normal. The restriction to be placed on $\delta$ in order to prevent the occurrence of the electric mode is not easy to assess. Although quite a bit is known about how eigenvalues of cavities grow and cluster this is insufficient for our purpose. The point is that while two cavities may have the same eigenvalue, they may not support the same eigenfunction. It is only when the two cavities support the same eigenvalue and eigenfunction that $\delta$ is affected. A clue is proffered by the sphere of radius $a$; spheres in which $ka$ differs by about $\pi$ can support the same mode. Therefore it would be wise to insist that $k\delta < \pi$.

The discussion has been plausible rather than rigorous but indicates that the mixed method has a reasonable chance of success provided that $\delta$ can be chosen properly. It must be small enough to allow the construction of the smaller cavity, large enough to differentiate usefully the mixed EFIE from the EFIE, yet not so large as to permit the existence of the simultaneous electric mode of oscillation. Probably, a value of $k\delta$ between 1 and $\tfrac{1}{2}\pi$ is about right so long as the construction of the inner cavity does not force it to be smaller. As regards $\beta$ a value of $\pm i$ would seem to be adequate. Like considerations apply to the mixed MFIE.
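The remark about spheres whose $ka$ differ by about $\pi$ can be illustrated with the zeros of a spherical Bessel function, whose spacing settles down to $\pi$. The sketch below uses $j_1(x) = \sin x/x^2 - \cos x/x$ purely as a representative radial function; the actual resonance conditions for the electric modes of a sphere involve other combinations, so this indicates the spacing only. The function names are choices made here.

```python
import math

def j1(x):
    """Spherical Bessel function of order one."""
    return math.sin(x) / x**2 - math.cos(x) / x

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j1_zeros(count):
    """First `count` positive zeros; the n-th lies between n*pi and (n + 1/2)*pi."""
    return [bisect(j1, n * math.pi, (n + 0.5) * math.pi) for n in range(1, count + 1)]

zeros = j1_zeros(5)
spacings = [b - a for a, b in zip(zeros, zeros[1:])]
print(zeros)      # values of ka at which this radial function vanishes
print(spacings)   # successive differences, to be compared with pi
```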

Exercises

33. It may be shown that, when $\mathbf{j}$ is continuous, the integral in the MFIE has continuous first partial derivatives. Deduce that, if the right-hand side of the MFIE is continuous with continuous first partial derivatives, the solution $\mathbf{j}$ has the same properties when $k$ does not correspond to an interior resonance.

34. An electromagnetic field is required in $S_-$ such that $\mathbf{n} \wedge \mathbf{E}_- = \mathbf{g}$ on $S$ where $\mathbf{g}$ is continuously differentiable. Show that it is necessary that

$$\int_S \mathbf{g} \cdot \mathbf{h}^*\, dS = 0$$

for every solution $\mathbf{h}$ of (6.104). If these conditions are satisfied show that the electromagnetic field can be found, it being expressed in terms of a magnetic current only which has the same continuity properties as in Exercise 33.

35. Let the set $\{\mathbf{j}_r\}$ be a basis spanning the space of solutions of (6.102) and let $\{\mathbf{J}_r\}$ be formed from it by the rule $\mathbf{J}(p) = \mathbf{n}_p \wedge \mathbf{E}_2(p)$ as in §6.13. Prove that the matrix with elements $\int_S \mathbf{n}_q \wedge \mathbf{j}_r(q) \cdot \mathbf{J}_s(q)\, dS_q$ is symmetric and has non-zero determinant. Deduce that, for each solution $\mathbf{k}'$ of (6.109), there is a corresponding $\mathbf{J}'$ such that

$$\int_S (\mathbf{k}' - \mathbf{J}') \cdot \mathbf{k}^*\, dS = 0$$

for all solutions $\mathbf{k}$ of (6.109).

36. For the exterior problem in which $\mathbf{n} \wedge \mathbf{E}_+ = \mathbf{g}$ on $S$ choose $\mathbf{k}'$ (in the notation of Exercise 35) so that

$$\int_S (\mathbf{g} - \mathbf{k}') \cdot \mathbf{k}^*\, dS = 0$$

for all solutions $\mathbf{k}$ of (6.109). Find a solution in terms solely of an electric current such that $\mathbf{n} \wedge \mathbf{E}' = \mathbf{J}'$ on $S$. Deduce that a solution of $\mathbf{n} \wedge (\mathbf{E}_+ - \mathbf{E}') = \mathbf{g} - \mathbf{J}'$ can be found involving only magnetic current and hence that a solution of the exterior problem can be determined though the solution may not be unique.

37. Do you think that there would be trouble from non-uniqueness in the wire-grid model?

38. If in (6.116) $\beta$ is taken as a function of $p$ instead of being constant prove that uniqueness still holds provided that $\beta$ never vanishes nor assumes purely imaginary values.

6.18 Numerical considerations for surfaces

Whatever integral equation an investigator decides is most suited to his needs, the problem of the numerical approximation for surface integrals will have to be faced. Usually, the surface will be divided into sub-areas or patches and the surface integral expressed as a sum of integrals over the patches. No approximation is involved in this process so that local approximations can be introduced without worrying about the global properties of the integrand. There are two different aspects to be considered. One is where the surface is given and the other is where the designer is attempting to draw the surface subject to certain criteria about its shape and boundaries. The latter problem lies more in the realm of computer-aided design and here techniques such as Coons' patches (Coons 1967; Forrest 1968), in which a patch is defined in terms of four boundary curves, the slopes across these boundary curves, and the corner points and twists there, may well be relevant. However, we shall concentrate on the situation in which the surface is known and the main concern is to arrive at its scattering properties.

The splitting of the surface into patches permits the transfer of ideas that originated with finite elements. One of the first approaches (Wait 1973; Wilton 1973) used plane triangular finite elements (subsequently extended to quadratic triangular elements) with either piecewise constant, linear, or quadratic approximation to the unknown function. Every triangular surface element is mapped on to a plane right-angled triangle (cf. §5.8); the same quadrature nodes can then be employed to compute all integrals over patches of the same type. More complicated representations for functions are possible, though in finite elements it is usual for the approximant to be an interpolation polynomial with parameters which are the actual function values (or maybe derivatives) at specified nodes.
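The mapping of each triangular patch on to one reference triangle can be sketched as follows. The example assumes plane triangles, the affine map $\mathbf{x} = \mathbf{v}_0 + \xi(\mathbf{v}_1 - \mathbf{v}_0) + \eta(\mathbf{v}_2 - \mathbf{v}_0)$ on to the right-angled triangle $\xi \ge 0$, $\eta \ge 0$, $\xi + \eta \le 1$, and a three-point rule that is exact for quadratics. The function names and the particular rule are illustrative assumptions and are not taken from the references cited.

```python
import math

# Three-point rule on the reference triangle; exact for polynomials of degree <= 2.
REFERENCE_NODES = [(1 / 6, 1 / 6), (2 / 3, 1 / 6), (1 / 6, 2 / 3)]
REFERENCE_WEIGHTS = [1 / 6, 1 / 6, 1 / 6]

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def integrate_over_triangle(f, v0, v1, v2):
    """Integrate f(x, y, z) over the plane triangle with vertices v0, v1, v2."""
    e1 = tuple(b - a for a, b in zip(v0, v1))
    e2 = tuple(b - a for a, b in zip(v0, v2))
    jacobian = math.sqrt(sum(c * c for c in cross(e1, e2)))  # twice the area
    total = 0.0
    # The same reference nodes and weights serve every patch of this type.
    for (xi, eta), w in zip(REFERENCE_NODES, REFERENCE_WEIGHTS):
        point = tuple(a + xi * p + eta * q for a, p, q in zip(v0, e1, e2))
        total += w * f(*point)
    return jacobian * total

# Usage: the area of a triangle with legs 2 and 3.
print(integrate_over_triangle(lambda x, y, z: 1.0,
                              (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)))
```

Only the Jacobian and the mapped node positions change from patch to patch, which is the economy referred to above.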
If the function being interpolated is unknown every extra node adds an equation to the algebraic system to be solved eventually. To set against that is the fact that, with more nodes, higher-order approximations with a reduction in the number of elements are feasible. There is some evidence that higher-order methods pay off (Hess 1973). It is advantageous to have as many nodes on the boundary as possible so as to improve efficiency; also including derivatives by means of Hermite elements (Strang and Fix 1973) increases the order but becomes unwieldy when the surface is represented parametrically.

Continuity of the interpolant and maybe some derivatives is often required if convergence of the numerical approximation to the analytical solution is to be secured. Elements which comply with theoretical requirements are often said to be conforming. Elements which do not have the requisite degree of continuity are frequently simpler to work with; they are known as non-conforming. With non-conforming elements there will be convergence at some places but not at others and Irons' patch test (Irons and Razzaque 1972; Ciarlet 1975) provides a means of deciding which elements are well behaved.

If the surface is not closed special consideration of the behaviour at its boundaries will be necessary, and here it is important to note that replacement of the boundary by a polygon may lead to the Babuska paradox in which the numerical approximation converges but to the solution of the wrong problem (Birkhoff 1969). For this reason methods have been developed for treating a boundary exactly (Marshall and Mitchell 1973) but they would generally need a prohibitive labour for integral equations. Representation of the surface by portions of quadrics also has been tried (Van Buren 1970).

These general points, together with the observations of §6.7 in the simpler case of the wire antenna, prompt one to continue the pattern of §6.8 and avail oneself of cubic B-splines. If $\xi$ and $\eta$ are surface parameters, the natural generalization is provided by the bicubic (because it is a cubic in both $\xi$ and $\eta$) B-spline basis $N_i(\xi)N_j(\eta)$. The analogue of (6.69) for a three-dimensional surface is then

$$x(\xi, \eta) = \sum_{i=2}^{T+3} \sum_{j=2}^{T+3} X_{ij} N_i(\xi) N_j(\eta),$$

$$y(\xi, \eta) = \sum_{i=2}^{T+3} \sum_{j=2}^{T+3} Y_{ij} N_i(\xi) N_j(\eta),$$

$$z(\xi, \eta) = \sum_{i=2}^{T+3} \sum_{j=2}^{T+3} Z_{ij} N_i(\xi) N_j(\eta)$$

thereby mapping the surface onto the square $1 \le \xi \le T$, $1 \le \eta \le T$ in the $(\xi, \eta)$ plane. (The image can be made a rectangle if different ranges for $\xi$ and $\eta$ are utilized.) As in §6.8 the coefficients $X_{ij}$, $Y_{ij}$, $Z_{ij}$ are fixed by specifying the points corresponding to integer values of $\xi$ and $\eta$. Again special treatment at any perimeter is necessary but the associated matrix is still tridiagonal. In order that the mapping be invertible it is necessary that the three determinants

$$D_1 = \frac{\partial(y, z)}{\partial(\xi, \eta)}, \qquad D_2 = \frac{\partial(z, x)}{\partial(\xi, \eta)}, \qquad D_3 = \frac{\partial(x, y)}{\partial(\xi, \eta)}$$

do not vanish simultaneously. When that is so a typical surface integral is converted according to

where $Q_m$ is a unit square and

$$J^2 = D_1^2 + D_2^2 + D_3^2.$$

The procedure of §6.8 may now be followed, the principal differences being the extra variable and the singular integrands which will be discussed in the next section.
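The role of $D_1$, $D_2$, $D_3$ and $J$ can be tried out on a surface whose area is known. The sketch below is illustrative only: it assumes a sphere parametrized by polar and azimuthal angles in place of a B-spline surface, takes the partial derivatives by central differences, and integrates over the whole parameter rectangle with a midpoint rule rather than over unit squares. All function names are choices made here.

```python
import math

def sphere(xi, eta, radius=1.0):
    """Stand-in parametrization: xi is the polar angle, eta the azimuth."""
    return (radius * math.sin(xi) * math.cos(eta),
            radius * math.sin(xi) * math.sin(eta),
            radius * math.cos(xi))

def jacobian(surface, xi, eta, step=1e-5):
    """Return (D1, D2, D3, J) with the partial derivatives taken by central differences."""
    plus, minus = surface(xi + step, eta), surface(xi - step, eta)
    d_xi = [(a - b) / (2 * step) for a, b in zip(plus, minus)]
    plus, minus = surface(xi, eta + step), surface(xi, eta - step)
    d_eta = [(a - b) / (2 * step) for a, b in zip(plus, minus)]
    d1 = d_xi[1] * d_eta[2] - d_xi[2] * d_eta[1]   # d(y, z)/d(xi, eta)
    d2 = d_xi[2] * d_eta[0] - d_xi[0] * d_eta[2]   # d(z, x)/d(xi, eta)
    d3 = d_xi[0] * d_eta[1] - d_xi[1] * d_eta[0]   # d(x, y)/d(xi, eta)
    return d1, d2, d3, math.sqrt(d1 * d1 + d2 * d2 + d3 * d3)

def surface_integral(f, surface, xi_range, eta_range, n=100):
    """Midpoint rule for the integral of f over the surface, weighted by J."""
    (a, b), (c, d) = xi_range, eta_range
    h_xi, h_eta = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        xi = a + (i + 0.5) * h_xi
        for j in range(n):
            eta = c + (j + 0.5) * h_eta
            total += f(*surface(xi, eta)) * jacobian(surface, xi, eta)[3]
    return total * h_xi * h_eta

# Usage: the area of the unit sphere, to be compared with 4*pi.
area = surface_integral(lambda x, y, z: 1.0, sphere, (0.0, math.pi), (0.0, 2 * math.pi))
print(area, 4 * math.pi)
```

At the poles of this parametrization all three determinants vanish together, which is an instance of the non-invertibility mentioned above; the midpoint rule never samples those points.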

Exercises

39. Prove that $J\mathbf{n} = D_1\mathbf{i} + D_2\mathbf{j} + D_3\mathbf{k}$ where $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are unit vectors parallel to the Cartesian coordinates.

40. If $r^2 = (x_q - x_p)^2 + (y_q - y_p)^2 + (z_q - z_p)^2$ show that

$$\frac{\partial r}{\partial n_q} = \frac{(x_q - x_p)D_1(q) + (y_q - y_p)D_2(q) + (z_q - z_p)D_3(q)}{rJ(q)}.$$

41. Use Exercise 40 to find $r$ and its normal derivatives at $p$ and $q$ for (a) a sphere, (b) an ellipsoid, and (c) a truncated hyperboloid by adopting the following strategy (for a suitable parameter representation): (i) assume $p$ is fixed and $\mathbf{x}_p$, $\partial\mathbf{x}/\partial n_p$ are known; (ii) determine $\mathbf{x}_q$ from the values of $\xi$ and $\eta$ for $q$; (iii) calculate the partial derivatives of $x$, $y$, and $z$ with respect to $\xi$ and $\eta$; (iv) evaluate $D_1$, $D_2$, and $D_3$.

6.19 Singular integrals

The kernel of the MFIE contains a weak singularity of the form $1/r$, as the analysis of §6.16 reveals, and the same is true of at least part of the kernel of the EFIE. Therefore, although Gaussian quadrature is suitable for patches not containing the singularity, a different process is essential for elements in the immediate neighbourhood of the singularity. Because of the weakness of the singularity it is probably sufficient to withhold special treatment except for the nearest four elements. Of the one-dimensional methods that have been invented to deal with singularities the one which seems to be most adaptable to surface integrals is the following (Takahasi and Mori 1973). Let $f(t)$ be a function which may have singularities at $\pm 1$ but not elsewhere. By making the transformation

\[ t = \frac{2}{\pi^{1/2}}\int_0^u \exp(-v^2)\,dv = \operatorname{erf} u \]

we obtain

\[ \int_{-1}^{1} f(t)\,dt = \frac{2}{\pi^{1/2}}\int_{-\infty}^{\infty} f(\operatorname{erf} u)\exp(-u^2)\,du. \]


The integral on the right is approximated by the trapezoidal rule with constant interval h. There results

\[ \int_{-1}^{1} f(t)\,dt \approx \frac{2h}{\pi^{1/2}}\sum_{n=-N}^{N} f(\operatorname{erf} nh)\exp(-n^2h^2) \tag{6.130} \]

which is a quadrature rule with the weights $(2h/\pi^{1/2})\exp(-n^2h^2)$ attached to the points $\operatorname{erf} nh$. Owing to the rapid convergence of (6.130) it is rarely necessary for $N$ to exceed 5 when $h$ is of the order of unity. The switch to integrals over a square is immediate by changing both variables of integration to erf. Although (6.130) applies even if $f$ has no singularities it tends to be less efficient and that is why its application should be limited. For singularities of higher order such as occur in a Cauchy principal value the following device can be helpful. If $f$ has no singularities

\[ \mathrm{P}\!\int_{-1}^{1} \frac{f(t)}{t}\,dt = \lim_{\epsilon\to+0}\left(\int_{-1}^{-\epsilon} + \int_{\epsilon}^{1}\right)\frac{f(t)}{t}\,dt \]

\[ = \lim_{\epsilon\to+0}\int_{\epsilon}^{1}\{f(t) - f(-t)\}\frac{dt}{t} = \int_{0}^{1}\{f(t) - f(-t)\}\frac{dt}{t} \]

by changing t to - t in the first integral on the right of the first line. Now the singularity has disappeared and straightforward quadrature can be accepted.
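Both devices of this section are short enough to try directly. The following is a minimal sketch, not a production routine: `erf_rule` implements (6.130) as written, and `principal_value` applies the folding of the Cauchy principal value onto $(0, 1)$ followed by a plain midpoint rule (the choice of midpoint rule, the function names, and the default parameters are assumptions made here for illustration). In double precision $\operatorname{erf} nh$ rounds to exactly 1 once $nh$ exceeds about 5.8, so for integrands singular at $\pm 1$ the product $Nh$ is kept below that.

```python
import math

def erf_rule(f, h=0.5, N=10):
    """Rule (6.130): trapezoidal sum in u after the substitution t = erf u."""
    total = 0.0
    for n in range(-N, N + 1):
        u = n * h
        total += f(math.erf(u)) * math.exp(-u * u)
    return 2.0 * h / math.sqrt(math.pi) * total

def principal_value(f, panels=2000):
    """P-integral of f(t)/t over (-1, 1), folded to {f(t) - f(-t)}/t on (0, 1)."""
    step = 1.0 / panels
    total = 0.0
    for j in range(panels):
        t = (j + 0.5) * step          # midpoints never touch t = 0
        total += (f(t) - f(-t)) / t
    return total * step

# Endpoint singularities: the exact value of this integral is pi.
approx_pi = erf_rule(lambda t: 1.0 / math.sqrt(1.0 - t * t))

# Principal value of exp(t)/t over (-1, 1), about 2.1145018.
pv_exp = principal_value(math.exp)
```

With $h = 0.5$ and $N = 10$ the first result differs from $\pi$ by roughly the size of the neglected tails of the $u$-integral, which illustrates why few points are needed.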

Exercises

42. Evaluate $\int_{-1}^{1}(1+t)^p(1-t)^q\,dt$ numerically for various values of $p$ and $q$ between $-1$ and $0$. Check how close your answer is to $\pi$ when $p = q = -\tfrac{1}{2}$.
43. The integrals (a) $\int_0^T t^{-1/2}(1+t)^{-1}\,dt$, (b) $\int_0^T t^{-3/4}\,dt/(1-t)$ with $T \ge 2$ are supposed to be approximations to $\pi$. By evaluating them numerically for increasing values of $T$ decide how large $T$ must be for them to be accurate to (i) 1 per cent, (ii) 0.01 per cent.

6.20 The algebraic system

Once the coefficient matrix and the vector from the right-hand side of the integral equations have been calculated the algebraic system must be solved. Since the matrix is not sparse in general, direct methods are appropriate with the proviso that the matrix is not too large. As the size of the matrix grows the computation of the solution of the algebraic system will equal and surpass the effort devoted to the preparation of the matrix because the former expands as the cube of the order while the latter augments as the square. It may then be profitable to switch to iterative methods. The Gauss-Seidel and SOR iterative methods (§1.13) demand for convergence that the spectral radius be less than unity (Theorem 1.13). It has also been seen in §2.5 that the possession of Young's Property A is a desirable asset. Unfortunately, it is difficult, if not impossible, to verify that the integral equations of scattering lead to systems with the desired prerequisites. Even Gerschgorin's theorem (1.11c) is inadequate for the purpose. In view of these deficiencies Burton (1976) has suggested that large matrices should be dealt with by the conjugate gradient method of optimization (§4.6). The particular version recommended is due to Hestenes (1956). (See also §5.7.) Let the system of linear equations be

\[ M\mathbf{u} = \mathbf{v} \]

with the usual inner product $(\mathbf{u},\mathbf{v}) = \sum_{i=1}^{N} u_i v_i^*$ when the vectors are of order $N$. An initial approximation $\mathbf{u}_0$ is first assumed. Then set $\mathbf{q}_{-1} = 0$, $c_{-1} = 1$ and $\mathbf{r}_0 = \mathbf{v} - M\mathbf{u}_0$. The $i$th iteration now consists of the operations

\[ c_i = \|\mathbf{r}_i\|^2, \quad b_{i-1} = c_i/c_{i-1}, \quad \mathbf{q}_i = \mathbf{r}_i + b_{i-1}\mathbf{q}_{i-1}, \quad \mathbf{p}_i = M^{A}\mathbf{q}_i, \]

\[ d_i = \|\mathbf{p}_i\|^2, \quad a_i = c_i/d_i, \quad \mathbf{u}_{i+1} = \mathbf{u}_i + a_i\mathbf{p}_i, \quad \mathbf{r}_{i+1} = \mathbf{r}_i - a_iM\mathbf{p}_i. \]

Despite every iteration depending on $M$ and its adjoint the process is feasible if the matrix is stored by columns and transfers from the backing store are carried out in fairly large blocks, for one may get away with one complete matrix transfer per iteration. The efficiency of the method stems from the tendency of the eigenvalues of $M$ to cluster around a single point when it is drawn from an integral equation of the second kind. Suppose the operator involved is $I - T$ where $T$ is compact. Then the eigenvalues of the operator equation will be the values of $\lambda$ such that

\[ (I - T)w = \lambda w. \]

The only limit point of the eigenvalues of T is zero so the eigenvalues of I - T have the unique limit point 1. Since M reproduces approximately the properties of I - T we expect that many of its eigenvalues will be near unity.
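The iteration listed above is compact enough to write out in full. What follows is a minimal sketch in plain Python (dense lists, no blocking or backing-store management), assuming that $M^A$ is the conjugate transpose of $M$; the function names, the stopping tolerance, and the small test matrix of the form $I - T$ are illustrative choices, not taken from the text.

```python
def matvec(M, x):
    """Return M x for a dense matrix stored as a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def adjoint_matvec(M, x):
    """Return M^A x, with M^A taken to be the conjugate transpose of M."""
    rows, cols = len(M), len(M[0])
    return [sum(M[i][j].conjugate() * x[i] for i in range(rows))
            for j in range(cols)]

def norm_squared(x):
    return sum(abs(xi) ** 2 for xi in x)

def hestenes_cg(M, v, u0=None, tol=1e-10, max_iter=50):
    """Solve M u = v by the iteration quoted in the text; returns (u, iterations)."""
    n = len(v)
    u = list(u0) if u0 is not None else [0j] * n
    r = [vi - wi for vi, wi in zip(v, matvec(M, u))]   # r_0 = v - M u_0
    q = [0j] * n                                       # q_{-1} = 0
    c_prev = 1.0                                       # c_{-1} = 1
    threshold = tol * tol * max(norm_squared(v), 1e-300)
    for iteration in range(max_iter):
        c = norm_squared(r)                            # c_i = ||r_i||^2
        if c <= threshold:
            return u, iteration
        b = c / c_prev                                 # b_{i-1}
        q = [ri + b * qi for ri, qi in zip(r, q)]      # q_i
        p = adjoint_matvec(M, q)                       # p_i = M^A q_i
        d = norm_squared(p)                            # d_i
        a = c / d                                      # a_i
        u = [ui + a * pi for ui, pi in zip(u, p)]      # u_{i+1}
        Mp = matvec(M, p)
        r = [ri - a * mi for ri, mi in zip(r, Mp)]     # r_{i+1}
        c_prev = c
    return u, max_iter

# Assumed test system: identity plus a small perturbation, mimicking I - T.
M_test = [[1.0, 0.1, 0.05j],
          [0.02, 1.0, 0.1],
          [0.05, -0.1j, 1.0]]
u_exact = [1.0, 2j, -1.0]
v_test = matvec(M_test, u_exact)
u_found, n_iterations = hestenes_cg(M_test, v_test)
```

Each pass needs one product with $M$ and one with its adjoint, which is the cost the text has in mind; for a matrix whose eigenvalues cluster near unity, as in this toy case, very few passes are required.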

Exercises

In the following exercises use first 5 and then 10 knots per wavelength in the spline approximation as a check on whether phase variations are adequately taken care of.
44. Calculate the scattering cross-section of a sphere of radius $a$ as a function of wavelength by means of the EFIE and the MFIE paying particular attention to what happens when $ka$ is near 2.75 and 4.50. Do you get the same results from the CFIE?
45. Repeat Exercises 26-31 using (a) EFIE, (b) MFIE, and (c) CFIE. How would you rate these methods in comparison with the wire-grid model?
46. Instead of making most of the collocation points coincide with the spline knots, choose them to be all different from the spline knots. Use scattering from a sphere as a test bed. It will be found that the algebraic system becomes ill conditioned, possibly even singular.


47. The method described finds the coefficients in the spline expansion of the unknown. Reformulate it so that the values of the unknown at the spline knots are the quantities to be found.
48. For scattering by a sphere examine how the number of iterations in the conjugate gradient method depends upon the distribution of knots and the alternative representations of Exercise 47. Do the eigenvalues of the algebraic system display a tendency to cluster?
49. Instead of collocating set up the algebraic system by Galerkin's method (§5.1) using some simple weight functions and repeat the calculations of Exercise 45. Assess the merits and defects of the various numerical methods tried.

6.21 The null-field method

An alternative to the surface integral equation and its variants is to focus attention on an interior integral equation and especially the equivalent set of equations (6.128). This was first developed systematically by Waterman (1971) under the name of extended boundary condition but the description null-field method (Bates 1968) seems preferable. In the method $\mathbf{j}$ is expanded in terms of a convenient set of basis functions on the surface. By limiting the number of terms in the expansion to the same number of equations (6.128) as are imposed, a linear system is obtained for the unknown coefficients. Increasing the number of coefficients and equations supplies an audit of the accuracy. In this connection it should be remarked that in solving an integral equation by expansion in known functions and truncation the accuracy can be improved and an error bound derived by modification of the kernel (Jones 1972a) (see §5.6(c)).

There is, of course, no reason why $\psi$ should be expanded in terms of the spherical wavefunctions $\psi_{nm}$ only. Any other handy set such as circular cylinder or elliptic functions could be employed. This fact has been exploited (Bates and Wong 1974; Wall and Bates 1975) and some successful results achieved. None the less, the theoretical backing for the null-field method must be regarded as less satisfactory until a priori bounds are discovered for the error when the geometry of the surface bears little resemblance to the natural geometry inherent in the expansion of $\psi$.

Exercise 50. Try the null-field method on a selection of the problems 26-31 and compare the results with those of previous methods.

6.22 The impedance boundary condition

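Frequently, the surface of the antenna will not be perfectly conducting but the field will not penetrate deeply into the body, particularly if it is coated. Then it may be enough to account for the properties by imposing an impedance boundary condition on the surface. Suppose that the field $\mathbf{E}^i$, $\mathbf{H}^i$ falls on the body from the exterior and that the boundary condition on $S$ is

\[ \mathbf{n}\wedge\mathbf{E} = Z_1Z\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{H}) \tag{6.131} \]

so that $Z_1$ is the impedance relative to the characteristic impedance $Z = (\mu/\varepsilon)^{1/2}$. Then the representation to replace (6.85) and (6.86) is

\[ \mathbf{E}(P) = \mathbf{E}^i(P) + \int_S\left[-Z_1Z\{\mathbf{n}_q\wedge\mathbf{j}(q)\}\wedge\operatorname{grad}_q\psi(P,q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q\psi(P,q) + i\omega\mu\,\mathbf{j}(q)\psi(P,q)\right]dS_q, \tag{6.132} \]

\[ \mathbf{H}(P) = \mathbf{H}^i(P) - \int_S\left[\mathbf{j}(q)\wedge\operatorname{grad}_q\psi(P,q) + \frac{Z\operatorname{Div}\{Z_1\mathbf{n}_q\wedge\mathbf{j}(q)\}}{i\omega\mu}\operatorname{grad}_q\psi(P,q) + i\omega\varepsilon ZZ_1\,\mathbf{n}_q\wedge\mathbf{j}(q)\psi(P,q)\right]dS_q, \tag{6.133} \]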

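where advantage has been taken of (6.131) to substitute for the tangential electric field in terms of $\mathbf{j}$. Now let $P \to p$ and apply (6.131). There results

\[ \mathbf{n}_p\wedge\left(\mathbf{n}_p\wedge\int_S\left[i\omega\mu\,\mathbf{j}(q)\psi(p,q) - Z_1Z\{\mathbf{n}_q\wedge\mathbf{j}(q)\}\wedge\operatorname{grad}_q\psi(p,q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q\psi(p,q)\right]dS_q\right) \]

\[ {} - Z_1Z\,\mathbf{n}_p\wedge\int_S\left[\mathbf{j}(q)\wedge\operatorname{grad}_q\psi(p,q) + \frac{Z\operatorname{Div}\{Z_1\mathbf{n}_q\wedge\mathbf{j}(q)\}}{i\omega\mu}\operatorname{grad}_q\psi(p,q) + i\omega\varepsilon ZZ_1\,\mathbf{n}_q\wedge\mathbf{j}(q)\psi(p,q)\right]dS_q \]

\[ = -\mathbf{n}_p\wedge(\mathbf{n}_p\wedge\mathbf{E}^i) - Z_1Z\,\mathbf{n}_p\wedge\mathbf{H}^i. \tag{6.134} \]

This is the analogue of the EFIE and reduces to it when $Z_1 = 0$. There is also a companion to the MFIE obtained by asking that $\mathbf{n}\wedge\mathbf{H}_- = 0$; it is

\[ \tfrac{1}{2}\mathbf{j}(p) + \mathbf{n}_p\wedge\int_S\left[\mathbf{j}(q)\wedge\operatorname{grad}_q\psi(p,q) + \frac{Z\operatorname{Div}\{Z_1\mathbf{n}_q\wedge\mathbf{j}(q)\}}{i\omega\mu}\operatorname{grad}_q\psi(p,q) + i\omega\varepsilon ZZ_1\,\mathbf{n}_q\wedge\mathbf{j}(q)\psi(p,q)\right]dS_q = \mathbf{n}_p\wedge\mathbf{H}^i(p). \tag{6.135} \]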


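A parallel to (6.116) is formed by combining (6.134) and (6.135) so as to achieve

\[ \mathbf{n}_p\wedge\left(\mathbf{n}_p\wedge\int_S\left[i\omega\mu\,\mathbf{j}(q)\psi(p,q) - Z_1Z\{\mathbf{n}_q\wedge\mathbf{j}(q)\}\wedge\operatorname{grad}_q\psi(p,q) - \frac{\rho(q)}{\varepsilon}\operatorname{grad}_q\psi(p,q)\right]dS_q\right) + \tfrac{1}{2}\beta Z\,\mathbf{j}(p) \]

\[ {} + (\beta - Z_1)Z\,\mathbf{n}_p\wedge\int_S\left[\mathbf{j}(q)\wedge\operatorname{grad}_q\psi(p,q) + i\omega\varepsilon ZZ_1\,\mathbf{n}_q\wedge\mathbf{j}(q)\psi(p,q) + \frac{Z\operatorname{Div}\{Z_1\mathbf{n}_q\wedge\mathbf{j}(q)\}}{i\omega\mu}\operatorname{grad}_q\psi(p,q)\right]dS_q = \text{known}. \tag{6.136} \]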

These integral equations are more complicated than those for perfect conductivity but otherwise there are no essentially fresh features. As far as uniqueness is concerned only (6.136) will be examined. The vanishing of the left-hand side implies that

\[ \mathbf{n}_p\wedge\{\mathbf{n}_p\wedge\mathbf{E}_-(p)\} + (Z_1 - \beta)Z\,\mathbf{n}_p\wedge\mathbf{H}_-(p) = 0. \]

Comparing this with (6.117) we see that $\mathbf{n}\wedge\mathbf{E}_- = \mathbf{n}\wedge\mathbf{H}_- = 0$ so long as $\beta - Z_1$ is neither zero nor pure imaginary. Since $\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{E}) + Z_1Z\,\mathbf{n}\wedge\mathbf{H}$ is continuous across $S$ it follows that

\[ \mathbf{n}_p\wedge\{\mathbf{n}_p\wedge\mathbf{E}_+(p)\} + Z_1Z\,\mathbf{n}_p\wedge\mathbf{H}_+(p) = 0. \tag{6.137} \]

Therefore, assuming an exterior uniqueness theorem for an impedance boundary condition, $\mathbf{n}_p\wedge\mathbf{H}_+(p) = 0$ and $\mathbf{j}$ is identically zero. Thus, (6.136) possesses a unique solution provided that $\beta$ is chosen so that $\beta - Z_1$ is not zero or pure imaginary. The exterior uniqueness theorem just assumed must now be verified. Let $\mathbf{E}$, $\mathbf{H}$ be an electromagnetic field which has no sources outside $S$ and satisfies the radiation conditions (6.73). Suppose it also complies with the boundary condition (6.137). Let $\Omega_R$ be a large sphere of radius $R$. Then

\[ \int_{\Omega_R}(\mathbf{E}\wedge\mathbf{H}^* + \mathbf{E}^*\wedge\mathbf{H})\cdot\mathbf{n}\,d\Omega = \int_S(\mathbf{E}\wedge\mathbf{H}^* + \mathbf{E}^*\wedge\mathbf{H})\cdot\mathbf{n}\,dS. \tag{6.138} \]

The left-hand side of (6.138) can be written as

\[ -\frac{1}{2Z}\int_{\Omega_R}\left\{|\mathbf{E} - Z\mathbf{H}\wedge\hat{\mathbf{R}}|^2 + |Z\mathbf{H} - \hat{\mathbf{R}}\wedge\mathbf{E}|^2 - 2|\mathbf{E}|^2 - 2Z^2|\mathbf{H}|^2 + Z^2|\mathbf{H}\cdot\hat{\mathbf{R}}|^2 + |\mathbf{E}\cdot\hat{\mathbf{R}}|^2\right\}d\Omega. \]

On account of the radiation conditions (6.73) the first two terms of the integrand give a zero contribution as $R \to \infty$. Further, a scalar product with $\hat{\mathbf{R}}$ of the second pair of (6.73) indicates that the last two terms can be ignored as $R \to \infty$.

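Hence, as $R \to \infty$,

\[ \int_{\Omega_R}(|\mathbf{E}|^2 + Z^2|\mathbf{H}|^2)\,d\Omega = Z\int_S(\mathbf{E}\wedge\mathbf{H}^* + \mathbf{E}^*\wedge\mathbf{H})\cdot\mathbf{n}\,dS = -\int_S|\mathbf{n}\wedge\mathbf{E}|^2\,\frac{Z_1 + Z_1^*}{|Z_1|^2}\,dS. \]

The two sides have opposite sign if $\operatorname{Re}Z_1 \ge 0$ and so can be consistent only if both are zero, but that is impossible unless $\mathbf{E} \equiv 0$ and $\mathbf{H} \equiv 0$. Consequently, the exterior uniqueness theorem is valid if $\operatorname{Re}Z_1 \ge 0$.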
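The surface integrand in the last step follows from the boundary condition by a few lines of vector algebra. The lines below are a sketch of that step, assuming $Z$ real and writing $\mathbf{E}_t = -\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{E})$ for the tangential part of $\mathbf{E}$ (a symbol introduced here only for convenience), so that $|\mathbf{E}_t| = |\mathbf{n}\wedge\mathbf{E}|$.

\[
\begin{aligned}
\mathbf{n}\wedge\mathbf{E} = Z_1Z\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{H})
&\;\Longrightarrow\; \mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{E}) = -Z_1Z\,\mathbf{n}\wedge\mathbf{H}
\;\Longrightarrow\; \mathbf{n}\wedge\mathbf{H} = \frac{\mathbf{E}_t}{Z_1Z},\\
(\mathbf{E}\wedge\mathbf{H}^*)\cdot\mathbf{n} &= -\mathbf{E}\cdot(\mathbf{n}\wedge\mathbf{H}^*) = -\frac{\mathbf{E}\cdot\mathbf{E}_t^*}{Z_1^*Z} = -\frac{|\mathbf{n}\wedge\mathbf{E}|^2}{Z_1^*Z},\\
(\mathbf{E}\wedge\mathbf{H}^* + \mathbf{E}^*\wedge\mathbf{H})\cdot\mathbf{n} &= -\frac{|\mathbf{n}\wedge\mathbf{E}|^2}{Z}\left(\frac{1}{Z_1^*} + \frac{1}{Z_1}\right) = -\frac{|\mathbf{n}\wedge\mathbf{E}|^2\,(Z_1 + Z_1^*)}{Z\,|Z_1|^2}.
\end{aligned}
\]

Since $Z_1 + Z_1^* = 2\operatorname{Re}Z_1$, the sign of the surface term is governed entirely by $\operatorname{Re}Z_1$.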

An important observation is that neither the integral equations nor the exterior uniqueness theorem depend on $Z_1$ being a constant. Hence (6.136) will provide the unique solution for variable impedance subject to $\operatorname{Re}Z_1 \ge 0$ at all points of $S$ and $\beta$ being picked so that $\beta - Z_1$ is not zero nor pure imaginary, e.g. $\beta = -5$ would be a suitable choice.

Exercises

51. Prove that it is not necessary for $\beta$ to be constant so long as the conditions on $\beta - Z_1$ are still met.
52. Reformulate the theory of this section so as to be applicable to two-dimensional fields around cylindrical obstacles parallel to the $z$ axis. Allow the longitudinal impedance (for $E_z$) to be different from the transverse impedance.
53. Is the theory of Exercise 52 adaptable to fields whose $z$ dependence is $\exp(-ikz\cos\theta)$?
54. A plane wave with its electric vector perpendicular to the $z$ axis is incident on the symmetric double wedge of Fig. 6.20. The surface impedance is constant and purely reactive, i.e. $\operatorname{Re}Z_1 = 0$. Calculate the radar cross-section and the analogous quantity for the forward scattering in the direction of incidence as $|Z_1|$ increases from 0 to 1. Is there any evidence for the existence of a surface wave on the wedge when $Z_1 \approx \pm 0.36i$? How do you account for the behaviour of the radar cross-section between $|Z_1| = 0.5$ and $|Z_1| = 0.6$?

Fig. 6.20. Double wedge (overall length marked $8\lambda$).

55. The wedge on the right of Fig. 6.20 is replaced by a portion of circular cylinder which fits smoothly on the left-hand wedge. Repeat the investigation of Exercise 54.
56. Undertake a study of the scattering by a sphere with a surface impedance.


6.23 Absorbing boundary conditions

Instead of tackling scattering problems by integral equations one can contemplate a numerical solution of Maxwell's equations by finite differences or finite elements (see Chapters 2 and 5). It is also possible to consider hybrid methods which combine integral equations with finite differences or finite elements (Mei 1974; Shaw 1974) but such methods will be left on one side here (see §6.28). For simple shapes the formulation via integral equations is generally more efficient than finite differences which require the determination of the electric and magnetic fields at a large number of mesh points in the medium surrounding the scatterer. The comparison is less clear-cut when the shape is complex or the body is not perfectly conducting when the integral equations may have to be three dimensional. The advantage which finite differences offer is the generation of matrices which are sparse and banded; computational algorithms for handling such matrices effectively are highly developed. Integral equations, on the other hand, lead to dense matrices which are expensive to fill and invert for bodies of complicated structure. Finite differences and variational methods are very successful for waveguide problems (Chapters 2 and 4) but two of the properties which contribute to their success are absent from the scattering problem. The first loss is the positive-definiteness of the operators so that bounds on the errors are no longer available. The second deficiency originates because $S_+$ is of infinite extent and no numerical procedure can cover it with mesh points. To get around this difficulty an artificial boundary surrounding $S$ is placed at a large, but finite, distance away (McDonald and Wexler 1972; Silvester and Hsieh 1971; James 1973) and only the region inside the artificial boundary is subdivided into meshes. While this process keeps the number of mesh points finite it leaves open the question of the boundary condition to be imposed on the artificial boundary. It should ensure that the scattered field is outgoing at the artificial boundary, allow the artificial boundary to be close enough to the scatterer to hold the number of meshes down and preserve the sparse nature of the matrices. In fact, these targets are incompatible (MacCamy and Marin 1980) and a compromise has to be sought (Engquist and Majda 1977; Kriegsmann and Morawetz 1980). Any boundary condition other than the one satisfied exactly by the scattered field will result in the numerical calculation including both incoming and outgoing waves. So practical boundary conditions are designed to favour outgoing waves against incoming while not being able to exclude incoming waves entirely. Since their purpose is to eliminate the reflection of outgoing waves as incoming they are known as absorbing boundary conditions. Various absorbing boundary conditions and ways of deriving them have been devised but only one will be given here; for others see Cooray and Costache (1991). The first case to be considered is Helmholtz's equation

\[ \nabla^2p + k^2p = 0 \tag{6.139} \]

in three dimensions (for two dimensions see exercises). Let $R$, $\theta$, $\phi$ be spherical polar coordinates and let $R = R_0$ be a sphere which encloses $S$ totally. Then, in $R > R_0$, a scattered wave has the representation

\[ p = e^{-ikR}\sum_{n=0}^{\infty}p_n(\theta,\phi)/R^{n+1} \tag{6.140} \]

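where the $p_n$ satisfy the recurrence relation

\[ 2ik(n+1)p_{n+1} + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial p_n}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2p_n}{\partial\phi^2} + n(n+1)p_n = 0 \qquad (n = 0, 1, \ldots) \tag{6.141} \]

in order to comply with (6.139) (Jones 1986). Remark that, if $p_0 = 0$, the field $p$ is identically zero outside $S$. From (6.140)

\[ \frac{\partial p}{\partial R} + \left(ik + \frac{1}{R}\right)p = -e^{-ikR}\sum_{n=0}^{\infty}\frac{np_n}{R^{n+2}} = -e^{-ikR}\sum_{n=0}^{\infty}\frac{(n+1)p_{n+1}}{R^{n+3}}. \]

Therefore

\[ 2\left(ik + \frac{1}{R}\right)\left\{\frac{\partial p}{\partial R} + \left(ik + \frac{1}{R}\right)p\right\} = -e^{-ikR}\sum_{n=0}^{\infty}\{2ik(n+1)p_{n+1} + n(n+1)p_n - n(n-1)p_n\}/R^{n+3}. \]

The last term on the right-hand side vanishes at $n = 0$ and $n = 1$. Its omission causes an error of $O(1/R^5)$ therefore. The remaining terms can be modified by (6.141). Hence

\[ 2\left(ik + \frac{1}{R}\right)\left\{\frac{\partial p}{\partial R} + \left(ik + \frac{1}{R}\right)p\right\} = \frac{1}{R^2}\left\{\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial p}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2p}{\partial\phi^2}\right\} \tag{6.142} \]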

with an error of $O(1/R^5)$. Equation (6.142) constitutes an absorbing boundary condition for a spherical artificial boundary. Since the right-hand side involves only tangential derivatives of $p$ the relation is, in essence, one between the values of $p$ on a sphere and its normal derivative. Formulae which commit even smaller errors have been constructed by Bayliss et al. (1982) by applying the operator $\partial/\partial R$. They can be converted to relations between tangential and normal derivatives by invoking (6.139). However, their increased accuracy is offset by substantial complication in the equation so that they are used rarely in practice. It is not always convenient to employ a sphere as artificial boundary. For other shapes a relation between tangential and normal derivatives analogous to (6.142) is still feasible though it has to be expressed in terms of the surface operators Div, Grad, Curl defined in the appendix to this chapter. It is


(Jones 1992b)

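\[ \left\{1 - \frac{i}{k}\left[H + (H^2 - K)^{1/2}\right]\right\}\left\{\frac{\partial p}{\partial n} + (ik + H)p\right\} = \frac{1}{2ik}\left\{1 + \frac{i}{k}\left[H - (H^2 - K)^{1/2}\right]\right\}\{(H^2 - K)p + \operatorname{Div}\operatorname{Grad}p\} + \frac{1}{2k^2}\operatorname{Div}[\operatorname{Curl}(\mathbf{n}\wedge\operatorname{Grad}p)]_t, \tag{6.143} \]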

where $H$, $K$ are the mean and Gaussian curvatures respectively. The notation $[\ ]_t$ signifies the tangential component of the vector. The unit vector $\mathbf{n}$ is the outward normal to the boundary and $\partial p/\partial n$ is the normal derivative. A simpler version which yet seems tolerably accurate was proposed by Jones (1988a). In two dimensions an alternative has been suggested by Mittra et al. (1989). Another form, for Laplace's equation in three dimensions, has been given by Khebir et al. (1990). A comparison of the predictions of the two-dimensional version of (6.142) with an exact solution has been made by Mittra et al. (1989). They conclude that the absorbing boundary condition is acceptable provided that the scattered field does not include to any great extent cylindrical harmonics above the first 40. If the surface of the scatterer has too many significant wrinkles, predictions from the absorbing boundary condition can be expected to display appreciable errors. Next, absorbing boundary conditions for Maxwell's equations will be derived. Since the Cartesian components of $\mathbf{E}$ satisfy (6.139) we can state that

\[ \mathbf{E} = e^{-ikR}\sum_{n=0}^{\infty}\mathbf{c}_n(\theta,\phi)/R^{n+1} \tag{6.144} \]

where each Cartesian component $c_n$ of $\mathbf{c}_n$ satisfies

\[ 2ik(n+1)c_{n+1} + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial c_n}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2c_n}{\partial\phi^2} + n(n+1)c_n = 0 \qquad (n = 0, 1, \ldots). \tag{6.145} \]

In addition, $\operatorname{div}\mathbf{E} = 0$ enforces

\[ \mathbf{c}_0\cdot\mathbf{i}_1 = 0, \tag{6.146} \]

\[ ik\,\mathbf{c}_{n+1}\cdot\mathbf{i}_1 = (1 - n)\,\mathbf{c}_n\cdot\mathbf{i}_1 + \frac{1}{\sin\theta}\left\{\frac{\partial}{\partial\theta}(\sin\theta\,\mathbf{c}_n\cdot\mathbf{i}_2) + \frac{\partial}{\partial\phi}(\mathbf{c}_n\cdot\mathbf{i}_3)\right\} \qquad (n = 0, 1, \ldots) \tag{6.147} \]

where $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$ are unit vectors in the directions of $R$-, $\theta$-, $\phi$-increasing


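respectively (Jones 1986). If $\mathbf{c}_0 = 0$, the scattered field outside $S$ is identically zero. Furthermore, (6.146) makes the distant field essentially transverse in nature. In view of (6.144) and (6.145) $p$ can be replaced by $\mathbf{E}$ in (6.142) as far as Cartesian components are concerned. This provides an absorbing boundary condition for $\mathbf{E}$. Obviously, the magnetic intensity satisfies the same absorbing boundary condition. However, these absorbing boundary conditions are not completely satisfactory because they decouple the electric and magnetic fields in contrast to Maxwell's equations. Nor are they strictly analogous to (6.142) which connects possible sources of radiating solutions of Helmholtz's equation represented by Kirchhoff's integral. Corresponding sources for Maxwell's equations are the tangential components of the electric and magnetic fields. Therefore, an absorbing boundary condition in terms of tangential components only is desirable, especially if one plans to employ Nedelec's conforming elements (§5.8). Although the desired relation can be obtained by direct manipulation of (6.144) it is easier to derive it from an exact result for the sphere. Let $\mathbf{h} = (\mu/\varepsilon)^{1/2}\mathbf{H}$ and put $\mathbf{E} = \mathbf{p}\,e^{-ikR}$, $\mathbf{h} = \mathbf{q}\,e^{-ikR}$. Then Maxwell's equations give

\[ -ik(\mathbf{q} - \mathbf{i}_1\wedge\mathbf{p}) = \operatorname{curl}\mathbf{p}, \tag{6.148} \]

\[ ik(\mathbf{i}_1\wedge\mathbf{q} + \mathbf{p}) = \operatorname{curl}\mathbf{q}. \tag{6.149} \]

Let $\mathbf{r} = \mathbf{i}_1\wedge(\mathbf{q} - \mathbf{i}_1\wedge\mathbf{p})$; from (6.148)

\[ -ik\mathbf{r} = \mathbf{i}_1\wedge\operatorname{curl}\mathbf{p}. \tag{6.150} \]

Since $-\mathbf{i}_1\wedge(\mathbf{i}_1\wedge\mathbf{p})$ is the tangential component of $\mathbf{p}$ on the sphere, (6.149) supplies

\[ ik\mathbf{r} = [\operatorname{curl}\mathbf{q}]_t. \]

Hence

\[ 2ik\mathbf{r} = [\operatorname{curl}\mathbf{q}]_t - \mathbf{i}_1\wedge\operatorname{curl}\mathbf{p}. \]

According to (A.23) expressed in spherical polars

\[ \operatorname{curl}\mathbf{W} = \operatorname{Curl}\mathbf{W} + \mathbf{i}_1\wedge\frac{\partial\mathbf{W}}{\partial R} \]

and so

\[ 2ik\mathbf{r} = \partial\mathbf{r}/\partial R + [\operatorname{Curl}\mathbf{q}]_t - \mathbf{i}_1\wedge\operatorname{Curl}\mathbf{p}. \]

The right-hand side should be written now entirely in terms of $\mathbf{i}_1\wedge\mathbf{q}$ and $\mathbf{i}_1\wedge\mathbf{p}$. From (A.24) in spherical polars

\[ \operatorname{Curl}\mathbf{W} = \{\operatorname{Grad}(\mathbf{W}\cdot\mathbf{i}_1)\}\wedge\mathbf{i}_1 + \mathbf{i}_1\wedge\mathbf{W}/R + \mathbf{i}_1\operatorname{Div}(\mathbf{W}\wedge\mathbf{i}_1). \]

Evidently

\[ \mathbf{i}_1\wedge\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{W}) = -\mathbf{i}_1\wedge\mathbf{W}/R. \]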


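Therefore

\[ 2ik\mathbf{r} = \partial\mathbf{r}/\partial R + 2\mathbf{r}/R + \mathbf{i}_1\wedge\{\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{q}) - \mathbf{i}_1\wedge\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{p})\} - \mathbf{i}_1\wedge\{\operatorname{Grad}(\mathbf{q}\cdot\mathbf{i}_1)\} - \operatorname{Grad}(\mathbf{p}\cdot\mathbf{i}_1). \]

Call on (6.79) and (6.80) in the form

\[ \operatorname{Div}(\mathbf{i}_1\wedge\mathbf{p}) = ik\,\mathbf{i}_1\cdot\mathbf{q}, \qquad \operatorname{Div}(\mathbf{i}_1\wedge\mathbf{q}) = -ik\,\mathbf{i}_1\cdot\mathbf{p} \]

to obtain

\[ (2ik + 1/R)\mathbf{r} = \frac{\partial\mathbf{r}}{\partial R} + \frac{3\mathbf{r}}{R} + \mathbf{i}_1\wedge\{\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{q}) - \mathbf{i}_1\wedge\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{p})\} - \frac{i}{k}\{\operatorname{Grad}\operatorname{Div}(\mathbf{i}_1\wedge\mathbf{q}) - \mathbf{i}_1\wedge\operatorname{Grad}\operatorname{Div}(\mathbf{i}_1\wedge\mathbf{p})\}. \]

Up to this point no approximations have been made. Now bring in (6.144) and (6.150) which imply

\[ -ik\mathbf{r} = \sum_{n=0}^{\infty}\left[\left\{\frac{\partial}{\partial\theta}(\mathbf{c}_n\cdot\mathbf{i}_1) + n\,\mathbf{c}_n\cdot\mathbf{i}_2\right\}\mathbf{i}_2 + \left\{\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}(\mathbf{c}_n\cdot\mathbf{i}_1) + n\,\mathbf{c}_n\cdot\mathbf{i}_3\right\}\mathbf{i}_3\right]\Big/R^{n+2}. \]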

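Substitution in $-ik(\partial\mathbf{r}/\partial R + 3\mathbf{r}/R)$ changes only $1/R^{n+2}$ to $(1 - n)/R^{n+3}$. The factor $1 - n$ removes the term $n = 1$ from the series and the term $n = 0$ vanishes by virtue of (6.146). Consequently, taking $\partial\mathbf{r}/\partial R + 3\mathbf{r}/R$ to be zero commits an error of order $1/R^5$ which is the same as the order of the error in (6.142). With these terms omitted only tangential derivatives occur in the formula for $\mathbf{r}$. They have no effect on $e^{-ikR}$ which can be multiplied through therefore. Hence, our absorbing boundary condition on a sphere is

\[ (2ik + 1/R)\,\mathbf{i}_1\wedge(\mathbf{h} - \mathbf{i}_1\wedge\mathbf{E}) = \mathbf{i}_1\wedge\{\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{h}) - \mathbf{i}_1\wedge\operatorname{Curl}(\mathbf{i}_1\wedge\mathbf{E})\} - \frac{i}{k}\{\operatorname{Grad}\operatorname{Div}(\mathbf{i}_1\wedge\mathbf{h}) - \mathbf{i}_1\wedge\operatorname{Grad}\operatorname{Div}(\mathbf{i}_1\wedge\mathbf{E})\}. \tag{6.151} \]

An alternative form has been proposed by Peterson (1988). Recognising that $\mathbf{i}_1$ is the unit outward normal to the spherical surface we can rewrite (6.151) as

\[ (2ik + H)\,\mathbf{n}\wedge(\mathbf{h} - \mathbf{n}\wedge\mathbf{E}) = \mathbf{n}\wedge\{\operatorname{Curl}(\mathbf{n}\wedge\mathbf{h}) - \mathbf{n}\wedge\operatorname{Curl}(\mathbf{n}\wedge\mathbf{E})\} - \frac{i}{k}\{\operatorname{Grad}\operatorname{Div}(\mathbf{n}\wedge\mathbf{h}) - \mathbf{n}\wedge\operatorname{Grad}\operatorname{Div}(\mathbf{n}\wedge\mathbf{E})\} \tag{6.152} \]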

where $H$ is the mean curvature as before. For absorbing boundary condition (6.152) there is no obligation for the surface to be spherical (Jones 1988b). Once the tangential components have been settled the normal components can be deduced from (6.79) and (6.80), i.e.

\[ \operatorname{Div}(\mathbf{n}\wedge\mathbf{E}) = ik\,\mathbf{n}\cdot\mathbf{h}, \qquad \operatorname{Div}(\mathbf{n}\wedge\mathbf{h}) = -ik\,\mathbf{n}\cdot\mathbf{E}. \]
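Returning to the scalar condition (6.142), the claimed $O(1/R^5)$ error can be observed numerically. The sketch below is an illustration under stated assumptions: the scattered field is taken to be a single axisymmetric multipole, $p_n = c_nP_l(\cos\theta)$, for which (6.141) reduces to a terminating recurrence for the constants $c_n$ and the tangential operator on the right of (6.142) is simply $-l(l+1)p/R^2$. The function names are hypothetical.

```python
import cmath
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) from the three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def radial_coefficients(l, k):
    """Constants c_n generated by (6.141) when p_n = c_n P_l(cos theta)."""
    c = [1.0 + 0.0j]
    for n in range(l):
        c.append(-(n * (n + 1) - l * (l + 1)) * c[n] / (2j * k * (n + 1)))
    return c

def abc_residual(l, k, R, theta):
    """Left-hand side minus right-hand side of (6.142) for the multipole of degree l."""
    c = radial_coefficients(l, k)
    angular = legendre(l, math.cos(theta))
    phase = cmath.exp(-1j * k * R)
    p = phase * angular * sum(cn / R ** (n + 1) for n, cn in enumerate(c))
    dp_dR = phase * angular * sum(
        cn * (-1j * k - (n + 1) / R) / R ** (n + 1) for n, cn in enumerate(c))
    lhs = 2 * (1j * k + 1 / R) * (dp_dR + (1j * k + 1 / R) * p)
    rhs = -l * (l + 1) * p / R ** 2   # tangential operator acting on P_l
    return lhs - rhs

dipole_error = abs(abc_residual(1, 2.0, 3.0, 0.7))
quadrupole_ratio = (abs(abc_residual(2, 2.0, 3.0, 0.7))
                    / abs(abc_residual(2, 2.0, 6.0, 0.7)))
```

For $l = 1$ the condition is satisfied exactly (only the terms $n = 0$ and $n = 1$ are present), while for $l = 2$ doubling $R$ reduces the residual by the factor $2^5 = 32$, in line with the stated order of the error.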


We shall indicate briefly now how the absorbing boundary condition can be incorporated in the finite element method. First, consider solutions of Helmholtz's equation (6.139) with the absorbing boundary condition (6.142) imposed on the spherical surface $\Omega$ enclosing $S$. Let $v$ be a testing function. Then

\[ \int_\Omega v\frac{\partial p}{\partial n}\,d\Omega = \int_T(\operatorname{grad}v\cdot\operatorname{grad}p - k^2pv)\,d\tau + \int_S v\frac{\partial p}{\partial n}\,dS \tag{6.153} \]

where $T$ is the volume between $\Omega$ and $S$. On $\Omega$ utilise (6.142) for $\partial p/\partial n$. The replacement is simplified by noting that the right-hand side of (6.142) is $\operatorname{Div}\operatorname{Grad}p$ and that

\[ \int_\Omega v\operatorname{Div}\operatorname{Grad}p\,d\Omega = -\int_\Omega\operatorname{Grad}v\cdot\operatorname{Grad}p\,d\Omega \]

from (A.15) and (A.18). Thus, only first derivatives will be needed throughout (6.153) with a consequent improvement in the sparsity of the matrix which arises in the finite element method. A similar reduction in order of the derivatives can be achieved for Maxwell's equations subject to the absorbing boundary condition (6.152). Here, one might wish to substitute for $\mathbf{n}\wedge\mathbf{h}$ from (6.152) in $\int_\Omega\mathbf{v}\cdot\mathbf{n}\wedge\mathbf{h}\,d\Omega$ with $\mathbf{v}$ doing the testing. The formula

fa v.Grad Diva
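Exercises

57. In two dimensions the analogue of (6.140) is

\[ p = e^{-ikr}\sum_{n=0}^{\infty}p_n(\phi)/r^{n+1/2} \]

in terms of the polar coordinates $r$, $\phi$. The coefficients $p_n$ satisfy

\[ 2ik(n+1)p_{n+1} + \left(n + \tfrac{1}{2}\right)^2p_n + \frac{\partial^2p_n}{\partial\phi^2} = 0 \qquad (n = 0, 1, \ldots). \]

Show that an absorbing boundary condition on a circle is

\[ \left(1 - \frac{i}{kr}\right)\left\{\frac{\partial p}{\partial r} + \left(ik + \frac{1}{2r}\right)p\right\} = \frac{1}{2ikr^2}\left(\tfrac{1}{4}p + \frac{\partial^2p}{\partial\phi^2}\right) \]

and that another is

\[ \frac{\partial p}{\partial r} + \left(ik + \frac{1}{2r}\right)p = \frac{1}{2ikr^2}\left(1 + \frac{i}{kr}\right)\left(\tfrac{1}{4}p + \frac{\partial^2p}{\partial\phi^2}\right). \]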


58. Examine what happens to (6.152) when the surface is taken as an infinitely long circular cylinder. Compare the resulting formula with those of Exercise 57 for (a) E-polarized, (b) H-polarized waves.
59. If the first absorbing boundary condition in Exercise 57 is rewritten as

\[ \left(1 - \frac{i\kappa}{k}\right)\left\{\frac{\partial p}{\partial n} + \left(ik + \tfrac{1}{2}\kappa\right)p\right\} = \frac{1}{2ik}\left(\tfrac{1}{4}\kappa^2p + \frac{\partial^2p}{\partial\sigma^2}\right) \]

where $\kappa$ is the curvature and $\sigma$ the arc-length of the boundary, it could be applicable to non-circular boundaries. Try it for an elliptical boundary (Jones and Kriegsmann 1990).

6.24 The surface radiation condition

In 1987 Kriegsmann, Taflove, and Umashankar made the novel suggestion that the scattered wave should satisfy approximately an absorbing boundary condition applied directly on the surface of the scatterer. Naturally, the condition must be in a form applicable to any shape as in (6.143), (6.152), or Exercise 59. When a condition designed for a radiating wave some distance from the scatterer is imposed on the scatterer itself it is known as a surface radiation condition. Solving the scattering problem by means of a surface radiation condition simplifies the analysis considerably. For a perfect conductor it may give the surface current immediately or, at worst, require the solution of a surface differential equation. So, there is great interest in learning whether the surface radiation condition is a valid approximation. To date, no theoretical justification has been forthcoming but there has been extensive numerical investigation. Most of it has dealt with Helmholtz's equation in two and three dimensions (Kriegsmann and Moore 1988; Moore et al. 1988; Jin et al. 1989; Teymur 1992; Jones 1992a) but three-dimensional electromagnetic scattering has been studied by Murch (1991). The general conclusion is that the surface radiation condition works well in a wide variety of circumstances for a band of frequencies in the middle to low range provided that the scatterer is not too elongated or deformed. This is consistent with the observations of Mittra et al. (1989) on the absorbing boundary condition which were delineated in §6.23. To widen the scope of the surface radiation condition Teymur (1992) has put forward perturbation techniques. They look promising and may be useful also in improving the performance of absorbing boundary conditions. How far they will extend the scope of the surface radiation condition remains to be seen. Finally, it should be remarked that the simplicity of the surface radiation condition makes it a candidate for starting any method which can refine a first approximation.

Exercises

60. Try the surface radiation condition for perfectly conducting obstacles irradiated by a plane wave in cases where an exact solution is available, e.g. the boundary is circular or spherical.
61. Repeat Exercise 60 but allow the obstacle to be penetrable.
62. Compare the results for the scattering of a two-dimensional plane wave by a perfectly conducting square by the surface radiation condition and by the method of moments.

DIELECTRIC ANTENNAS

Antennas can be constructed from dielectric materials which cannot be accounted for adequately by means of an impedance boundary condition. In that case there is no escaping from the introduction of interior fields. The complexity of the resulting analysis can be considerable. As a guide to what happens we commence with the propagation along circular cylinders.

6.25 The infinite dielectric circular rod

Let the dielectric rod be of radius $a$ with material constants $\varepsilon$, $\mu$ and have no conductivity. The surrounding medium will also be assumed to be nonconducting with material constants $\varepsilon_0$, $\mu_0$. Choose cylindrical polar coordinates $r$, $\phi$, $z$ with the $z$ axis along the axis of the rod. Assume that the dependence on $z$ of all fields is $\exp(-i\alpha z)$ where $\alpha$ is a real constant. In order that the field be finite at $r = 0$ Bessel functions of the first kind must be employed inside the cylinder. Outside the cylinder the Hankel function $H^{(2)}$ is appropriate in order to preserve the proper behaviour at infinity. The expansions compatible with these injunctions and Maxwell's equations are

\[ E_z = \begin{cases}\displaystyle\sum_{m=-\infty}^{\infty}a_mJ_m(\kappa r)\exp(im\phi - i\alpha z) & (r \le a)\\[2ex] \displaystyle\sum_{m=-\infty}^{\infty}c_mH_m^{(2)}(\kappa_0r)\exp(im\phi - i\alpha z) & (r \ge a),\end{cases} \]

\[ H_z = \begin{cases}\displaystyle\sum_{m=-\infty}^{\infty}b_mJ_m(\kappa r)\exp(im\phi - i\alpha z) & (r \le a)\\[2ex] \displaystyle\sum_{m=-\infty}^{\infty}d_mH_m^{(2)}(\kappa_0r)\exp(im\phi - i\alpha z) & (r \ge a)\end{cases} \]

where $\kappa^2 = \omega^2\mu\varepsilon - \alpha^2$, $\kappa_0^2 = \omega^2\mu_0\varepsilon_0 - \alpha^2$. The transverse components $\mathbf{E}_t$, $\mathbf{H}_t$ of the field can be deduced from (§2.2)

\[ \kappa^2\mathbf{E}_t = i\omega\mu\,\mathbf{k}\wedge\operatorname{grad}_tH_z - i\alpha\operatorname{grad}_tE_z, \]

\[ \kappa^2\mathbf{H}_t = -\frac{ik^2}{\omega\mu}\,\mathbf{k}\wedge\operatorname{grad}_tE_z - i\alpha\operatorname{grad}_tH_z \]

where $\mathbf{k}$ is a unit vector along the $z$ axis and $\operatorname{grad}_t$ is the gradient in the $(x, y)$ plane.


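The continuity of the tangential components of $\mathbf{E}$ and $\mathbf{H}$ at $r = a$ requires the satisfaction of the four equations

\[ a_mJ_m(\kappa a) = c_mH_m^{(2)}(\kappa_0a), \]

\[ \frac{1}{\kappa^2}\left\{i\omega\mu\kappa\,b_mJ_m'(\kappa a) + \frac{\alpha m}{a}a_mJ_m(\kappa a)\right\} = \frac{1}{\kappa_0^2}\left\{i\omega\mu_0\kappa_0\,d_mH_m^{(2)\prime}(\kappa_0a) + \frac{\alpha m}{a}c_mH_m^{(2)}(\kappa_0a)\right\}, \]

\[ \frac{1}{\kappa^2}\left\{-i\omega\varepsilon\kappa\,a_mJ_m'(\kappa a) + \frac{\alpha m}{a}b_mJ_m(\kappa a)\right\} = \frac{1}{\kappa_0^2}\left\{-i\omega\varepsilon_0\kappa_0\,c_mH_m^{(2)\prime}(\kappa_0a) + \frac{\alpha m}{a}d_mH_m^{(2)}(\kappa_0a)\right\}, \]

\[ b_mJ_m(\kappa a) = d_mH_m^{(2)}(\kappa_0a). \]

These are homogeneous equations for the coefficients $a_m$, $b_m$, $c_m$, and $d_m$. They will have a non-trivial solution only if the determinant of the coefficients is zero, i.e.

\[ \left\{\frac{\mu J_m'(\kappa a)}{\kappa J_m(\kappa a)} - \frac{\mu_0H_m^{(2)\prime}(\kappa_0a)}{\kappa_0H_m^{(2)}(\kappa_0a)}\right\}\left\{\frac{\varepsilon J_m'(\kappa a)}{\kappa J_m(\kappa a)} - \frac{\varepsilon_0H_m^{(2)\prime}(\kappa_0a)}{\kappa_0H_m^{(2)}(\kappa_0a)}\right\} = \left(\frac{\alpha m}{\omega a}\right)^2\left(\frac{1}{\kappa^2} - \frac{1}{\kappa_0^2}\right)^2. \tag{6.154} \]

When the rod is stimulated symmetrically $m = 0$ and (6.154) decouples into two equations. Since $J_0' = -J_1$ and $H_0^{(2)\prime} = -H_1^{(2)}$ they may be written as

\[ \frac{\kappa_0H_0^{(2)}(\kappa_0a)}{\varepsilon_0H_1^{(2)}(\kappa_0a)} = \frac{\kappa J_0(\kappa a)}{\varepsilon J_1(\kappa a)} \tag{6.155} \]

for the TM modes and

\[ \frac{\kappa_0H_0^{(2)}(\kappa_0a)}{\mu_0H_1^{(2)}(\kappa_0a)} = \frac{\kappa J_0(\kappa a)}{\mu J_1(\kappa a)} \tag{6.156} \]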

for the TE modes. Both (6.155) and (6.156) have the same structure so that it will be sufficient to discuss one of them in drawing conclusions about the qualitative behaviour (see also Arnbak 1969) of the values of $\alpha$ which satisfy them. The first question to be answered is whether an undamped wave can propagate at the speed of light in the medium around the rod. If so, $\alpha = \omega(\mu_0\varepsilon_0)^{1/2}$ and $\kappa_0 = 0$. The left-hand side of (6.155) then vanishes and so $J_0(\kappa a) = 0$ must be satisfied since $\kappa = 0$ is not permitted. Hence, if $j_{0p}$ is the $p$th zero of $J_0$, there are symmetric TM modes with $\alpha^2 = \alpha_{0p}^2 = \omega^2\mu\varepsilon - j_{0p}^2/a^2$. Their frequencies are given by $\omega^2(\mu\varepsilon - \mu_0\varepsilon_0) = j_{0p}^2/a^2$ and thus there is a cut-off frequency $j_{01}/\{2\pi a(\mu\varepsilon - \mu_0\varepsilon_0)^{1/2}\}$ below which no TM modes of this type will propagate. For $\alpha$ near $\alpha_{0p}$ the wave outside the dielectric decays very slowly with $r$ and so only a small part of the energy flow takes place within the rod. An important practical regime is when $\varepsilon \gg \varepsilon_0$. The right-hand side of (6.155) will then be small unless $J_1$ is near zero whereas the left-hand side will not be small unless $\kappa_0 a$ is as well. Therefore, when $\kappa_0 a$ is not small, i.e. the

mode is well away from cut-off, the match required by (6.155) can be attained only if $J_1(\kappa a) = 0$. Hence, in this case,

$$\alpha^2 = \omega^2\mu\varepsilon - j_{1p}^2/a^2 \tag{6.157}$$

where $j_{1p}$ is the $p$th non-zero root of $J_1$. The zero root must be omitted because of the factor $\kappa$ in the numerator of the right-hand side of (6.155). The largeness of $\varepsilon/\varepsilon_0$ ensures that the formula (6.157) will not make $\kappa_0 a$ small unless $\omega^2\mu\varepsilon$ is not far from $j_{1p}^2/a^2$. When $\omega^2\mu\varepsilon$ is well above $j_{1p}^2/a^2$, $\kappa_0$ will be negative imaginary and of large modulus. The field outside the cylinder will decay rapidly as the distance from the rod increases. Most of the energy flow will now be within the dielectric. This is borne out by the observation that (6.157) applies to a waveguide whose walls require the tangential magnetic field to vanish.

A graph of $\alpha$ against $\omega a(\mu_0\varepsilon_0)^{1/2}$ when $\varepsilon \gg \varepsilon_0$ consists essentially of the straight line $\alpha = \omega(\mu_0\varepsilon_0)^{1/2}$ for the lower values of $\omega a$ and the curve $\alpha = (\omega^2\mu\varepsilon - j_{1p}^2/a^2)^{1/2}$ for the higher values of $\omega a$. The transition between the two is effected by a smooth curve in the neighbourhood of the point of intersection where $\omega^2(\mu\varepsilon - \mu_0\varepsilon_0) = j_{1p}^2/a^2$. Some notion of the shape of the transition curve can be obtained by considering the solutions of (6.155) when $|\kappa_0 a|$ is small. In such a situation $\kappa$ may be approximated by the constant $\kappa_1$ where $\kappa_1^2 = \omega^2(\mu\varepsilon - \mu_0\varepsilon_0)$. Eqn. (6.155) now reduces to

$$(\kappa_0 a)^2\{\ln(\tfrac12\kappa_0 a) + \gamma + \tfrac12\pi i\} = -2/\eta \tag{6.158}$$

where $\eta = 2\varepsilon J_1(\kappa_1 a)/\varepsilon_0\kappa_1 a J_0(\kappa_1 a)$. The constant $\eta$ is a measure of the effective permittivity since it reduces to $\varepsilon/\varepsilon_0$ when $|\kappa_1 a| \ll 1$. For negative imaginary values of $\kappa_0$ put $\kappa_0 = -i\kappa'$ where $\kappa'$ is positive. Then

$$(\kappa' a)^2\{\ln(\tfrac12\kappa' a) + \gamma\} = 2/\eta. \tag{6.159}$$

The smallness of $\kappa' a$ compels $\eta$ to be negative for a solution of (6.159) to exist. Therefore $j_{0p} < \kappa_1 a$ …

$$\frac{H_m^{(2)\prime}(\kappa_0 a)}{H_m^{(2)}(\kappa_0 a)} \approx -\frac{m}{\kappa_0 a}\left\{1 - \frac{(\kappa_0 a)^2}{2m(m-1)}\right\}.$$

The dominant singularity in (6.154) is ostensibly $1/(\kappa_0 a)^4$ but the coefficient of this term vanishes automatically. So the terms in $1/(\kappa_0 a)^2$ have to be brought in and they lead to

$$(\mu_0\varepsilon + \mu\varepsilon_0)\,\kappa a\,\frac{J_{m-1}(\kappa a)}{J_m(\kappa a)} = m(\varepsilon - \varepsilon_0)(\mu_0 - \mu) + \frac{\mu_0\varepsilon_0\kappa^2 a^2}{m - 1}$$

as the equation for $\omega$ when $J_m'(z) = J_{m-1}(z) - mJ_m(z)/z$ is employed. The solutions of this transcendental equation for $\kappa a$ supply the cut-off frequencies for the hybrid modes in which $m > 1$; in general $\kappa = 0$ will not be a solution. When $m = 1$ we have

$$\frac{H_1^{(2)\prime}(\kappa_0 a)}{H_1^{(2)}(\kappa_0 a)} \approx -\frac{1}{\kappa_0 a}\left(1 + \kappa_0^2 a^2\ln\kappa_0 a\right).$$

The highest-order singularity $1/(\kappa_0 a)^4$ is again removed by cancellation. The next-largest terms are of order $(\ln\kappa_0 a)/(\kappa_0 a)^2$ and these cannot be eliminated unless $\kappa J_1(\kappa a)$ is small. To agree with this, one possibility is $\omega^2(\mu\varepsilon - \mu_0\varepsilon_0) \approx 0$ and there is then no cut-off frequency for the hybrid wave with $m = 1$.
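As a numerical illustration of the symmetric TM case, the sketch below evaluates the cut-off frequency $j_{01}/\{2\pi a(\mu\varepsilon - \mu_0\varepsilon_0)^{1/2}\}$ and solves (6.155) for the lowest surface wave above cut-off. It is only a sketch under stated assumptions: $\kappa_0 = -i\kappa'$ with $\kappa' > 0$, so that $H_0^{(2)}/H_1^{(2)}$ becomes $-iK_0(\kappa' a)/K_1(\kappa' a)$ and (6.155) can be written in the pole-free form $\varepsilon_r J_1(u)\,wK_0(w) + uJ_0(u)K_1(w) = 0$ with $u = \kappa a$, $w = \kappa' a$, $u^2 + w^2 = v^2 = \omega^2a^2(\mu\varepsilon - \mu_0\varepsilon_0)$. The function names, the quadrature-based Bessel routines and the sample values are illustrative choices, not part of the text.

```python
import math

J01 = 2.404825557695773    # first zero of J0
J11 = 3.8317059702075125   # first non-zero root of J1
C_LIGHT = 299792458.0      # speed of light in vacuum (m/s)


def bessel_j(n, x, steps=400):
    """J_n(x) from Bessel's integral (1/pi) int_0^pi cos(n t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def bessel_k(n, x, steps=4000):
    """K_n(x), x > 0, from int_0^inf exp(-x cosh t) cosh(n t) dt (trapezoidal rule)."""
    t_max = math.acosh(max(50.0 / x, 2.0))   # integrand is negligible beyond t_max
    h = t_max / steps
    total = 0.5 * math.exp(-x)               # endpoint t = 0; the far endpoint is negligible
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(n * t)
    return total * h


def tm_cutoff_frequency(a, eps_r, mu_r=1.0, zero=J01):
    """Cut-off frequency (Hz) of a symmetric TM mode; eps_r, mu_r are relative constants."""
    return zero * C_LIGHT / (2.0 * math.pi * a * math.sqrt(eps_r * mu_r - 1.0))


def tm_mismatch(u, v, eps_r):
    """Pole-free form of (6.155) for kappa_0 = -i kappa' (u = kappa a, w = kappa' a)."""
    w = math.sqrt(v * v - u * u)
    return eps_r * bessel_j(1, u) * w * bessel_k(0, w) + u * bessel_j(0, u) * bessel_k(1, w)


def solve_tm01(v, eps_r):
    """Bisection for u = kappa a of the lowest TM surface wave; needs v > j01."""
    if v <= J01:
        raise ValueError("below the lowest TM cut-off: no surface-wave root")
    lo, hi = J01, min(v, J11) * (1.0 - 1e-9)   # mismatch > 0 at lo, < 0 at hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if tm_mismatch(mid, v, eps_r) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("cut-off (Hz):", tm_cutoff_frequency(5e-3, 10.0))
    u = solve_tm01(3.0, 10.0)
    print("kappa*a =", u, " kappa'*a =", math.sqrt(9.0 - u * u))
```

The bracket $j_{01} < u < \min(v, j_{11})$ is chosen because the mismatch is positive where $J_0$ vanishes and negative where either $w \to 0$ or $J_1$ vanishes; for $\varepsilon \gg \varepsilon_0$ and $v$ well above cut-off the root approaches $j_{11}$, in line with (6.157).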

In general, (6.154) has to be solved numerically though a graphical approach is a helpful adjunct. For example, $\kappa a$ and $i\kappa_0 a$ are taken as Cartesian coordinates and the Hankel function is replaced via $H_m^{(2)}(z) = (2i/\pi)\exp(\tfrac12 m\pi i)K_m(iz)$. Then the curve whose equation is (6.154) is drawn. The curve $(\kappa a)^2 + (i\kappa_0 a)^2 = \omega^2a^2(\mu\varepsilon - \mu_0\varepsilon_0)$ is then traced. The required values of $\alpha$ may be deduced from the intersections of the two curves.

There is no difficulty in confirming that for all these modes the time-averaged Poynting vector has no component normal to the cylindrical surface when $\kappa_0$ is purely imaginary. Consequently there is no net energy flow out of the cylinder. The modes can therefore be called surface waves, i.e. they are waves which propagate along the interface between two different media without any transfer of energy across the dividing surface (other than that necessary to make good resistive losses).

The possibility of complex $\alpha$ should not be ignored. That such values are feasible may be recognized by returning to the approximation (6.158) and allowing $\kappa_0$ to be arbitrarily complex, but keeping $\eta$ as a real constant. Let $\kappa_0 a = d\,e^{i\delta}$ where $d \ll 1$. For $0 \le \delta \le \tfrac12\pi$

$$d^2e^{2i\delta}(\tfrac12\pi i + \delta i + \ln\tfrac12 d) = -2/\eta$$

if the term involving $\gamma$ is dropped. The real and imaginary parts of this equation may be rearranged as

$$\tfrac12 d = \exp\{-(\tfrac12\pi + \delta)\cot 2\delta\}, \tag{6.160}$$

$$d^2 = \frac{2}{\eta}\,\frac{\sin 2\delta}{\tfrac12\pi + \delta}. \tag{6.161}$$

If $d$ is plotted as a function of $\delta$, first from (6.160) and then from (6.161), the intersections of the curves will provide any solutions in the ranges under consideration. Evidently, for consistency of (6.161), $\eta > 0$ so that $j_{1p} < \kappa_1 a < j_{0,p+1}$ and any solutions occur in a different regime from those of (6.159). The curve of (6.160) increases monotonically with $\delta$, starting from the value 0 at $\delta = 0$. In contrast, that of (6.161) increases steadily from 0 at $\delta = 0$, passes through a maximum for some $\delta < \tfrac12\pi$, and then falls steadily to zero at $\delta = \tfrac12\pi$. Hence there is precisely one solution and it lies in the interval $0 < \delta < \tfrac14\pi$; as $\eta \to \infty$ it converges on $\delta = 0$ while simultaneously $d \to 0$. In this wave $\kappa_0$ has a positive imaginary part and so the field grows as $r \to \infty$. Such waves are known as leaky waves. As $\kappa_1 a \to j_{0,p+1}$, $\eta \to \infty$ and $\delta \to 0$. This suggests that the leaky wave which originates for $\kappa_1 a$ just below $j_{0,p+1}$ converts to the $(p + 1)$th surface wave as $\kappa_1 a$ traverses the surface-mode cut-off frequency. Furthermore, $\cos 2\delta$ is positive at the solution of (6.160) and (6.161) so that $\Re\alpha^2 < \omega^2\mu_0\varepsilon_0$. For $\delta$ and $d$ near zero, $\Im\alpha$ will be of the order of $d^2$ and we deduce that $\Re\alpha < \omega(\mu_0\varepsilon_0)^{1/2}$. Consequently, the leaky wave travels with a phase speed faster than light along the $z$ axis. There is, in addition, a radiation loss associated with this fast wave.
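The intersection of the curves (6.160) and (6.161) can also be located numerically. The following is a minimal sketch, assuming $\eta > 0$ is given and that the root is sought in $0 < \delta < \tfrac14\pi$ as described above; the function names and the sample value $\eta = 50$ are illustrative only. The result is checked by substituting $\kappa_0 a = d\,e^{i\delta}$ back into (6.158) with the $\gamma$ term dropped.

```python
import cmath
import math


def d_squared_160(delta):
    """d^2 obtained from (6.160): d/2 = exp{-(pi/2 + delta) cot(2 delta)}."""
    a = 0.5 * math.pi + delta
    return 4.0 * math.exp(-2.0 * a / math.tan(2.0 * delta))


def d_squared_161(delta, eta):
    """d^2 obtained from (6.161): d^2 = (2/eta) sin(2 delta) / (pi/2 + delta)."""
    return (2.0 / eta) * math.sin(2.0 * delta) / (0.5 * math.pi + delta)


def leaky_wave_root(eta):
    """Bisection for the crossing of (6.160) and (6.161) in 0 < delta < pi/4."""
    if eta <= 0.0:
        raise ValueError("(6.161) requires eta > 0")
    lo, hi = 1e-9, 0.25 * math.pi - 1e-9
    if d_squared_160(hi) <= d_squared_161(hi, eta):
        raise ValueError("no crossing in 0 < delta < pi/4 for this eta")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if d_squared_160(mid) < d_squared_161(mid, eta):
            lo = mid      # curve (6.160) still below curve (6.161)
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    return math.sqrt(d_squared_161(delta, eta)), delta


def residual_158(d, delta, eta):
    """Left side minus right side of (6.158), gamma dropped, kappa_0 a = d exp(i delta)."""
    z = d * cmath.exp(1j * delta)
    return z * z * (1j * (0.5 * math.pi + delta) + math.log(0.5 * d)) + 2.0 / eta


if __name__ == "__main__":
    d, delta = leaky_wave_root(50.0)
    print("d =", d, "delta =", delta, "|residual| =", abs(residual_158(d, delta, 50.0)))
```

Since the approximation behind (6.158) needs $d \ll 1$, a root returned with $d$ of order unity should be treated as outside the range of validity rather than as a leaky-wave solution.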

Exercises

63. An optical fibre consists of a circularly cylindrical inner dielectric core of refractive index $N_1$ surrounded by an annular cladding of smaller refractive index $N_2$. Surrounding the cladding is a dielectric with a refractive index $N_3$ not far from $N_2$. Set up the equation governing the modes of propagation. If the outer medium is constructed of black glass so as to be highly lossy while the core and cladding are only slightly lossy, are there any convenient approximations which can be made? At a wavelength of 0.9 μm with $N_1 = 1.61$ extensive numerical investigations for (a) core radius 20 μm, cladding thickness 5 μm, $N_2 = N_3 = 0.99N_1$, (b) core radius 40 μm, cladding thickness 5 μm, $N_2 = N_3 = 0.96N_1$ have been carried out by Roberts (1972, 1973, 1975). Further details can be found in Snyder and Love (1983).

64. Tackle the problem of propagation along a circular dielectric rod by assuming that the exterior field satisfies a surface radiation condition on the surface of the rod. Compare the modes so obtained with the exact ones (Jones 1989).

6.26 Modal excitation

The excitation of modes on an infinite dielectric rod by a given source is the topic of this section. To fix ideas, the source will be taken as the magnetic frill of §6.3 (Figs 6.3 and 6.4). The incident field is then expressed as in eqns (6.17)-(6.22) with $\kappa$ replaced by $\kappa_0$ to conform to the notation of the preceding section. The scattered field may be assumed to be represented by similar integrals but with the integrands being expansions of the type at the beginning of §6.25. In fact, for the excitation considered here, it is sufficient to limit the series to $m = 0$ and put $b_0 = 0$ and $d_0 = 0$. Apply the boundary condition of the continuity of the tangential components of the field at $r = a$. Then it will be found that the total field (i.e. the sum of the incident and scattered fields) is given in $r > b$ by (cf. Becker and Meister 1973)

$$H_\phi = -\tfrac14\omega\varepsilon_0 b\int_{-\infty}^{\infty}\left[\frac{H_1^{(2)}(\kappa_0 b)\{\varepsilon_0\kappa J_1(\kappa_0 a)J_0(\kappa a) - \kappa_0\varepsilon J_0(\kappa_0 a)J_1(\kappa a)\}}{\varepsilon\kappa_0 J_1(\kappa a)H_0^{(2)}(\kappa_0 a) - \varepsilon_0\kappa H_1^{(2)}(\kappa_0 a)J_0(\kappa a)} + J_1(\kappa_0 b)\right]H_1^{(2)}(\kappa_0 r)\exp(-i\alpha z)\,d\alpha \tag{6.162}$$

while on $r = a$ (6.163)

At first glance it may look as though (6.162) and (6.163) have branch points at the zeros of $\kappa$ and $\kappa_0$. However, study of the numerator and denominator reveals that the zeros of $\kappa$ do not correspond to branch points, so that the only singularities are due to $\kappa_0$ (and any poles which the denominator may produce). To ensure that no poles lie on the path of integration (Fig. 6.5) slight dissipation is assumed to be present so that poles for $\Re\alpha > 0$ are displaced downwards and those for $\Re\alpha < 0$ upwards. One might try to calculate the field on the rod approximately by deforming the contour of integration into that of Fig. 6.6. The drawback to such a procedure is that, in the deformation to the position shown by broken lines, poles corresponding to leaky waves might be passed over. To avoid their consideration it is convenient to redraw the branch line from $\alpha = k$ ($k^2 = \omega^2\mu_0\varepsilon_0$) as shown in Fig. 6.21 and deform the contour of integration into $C_1$. Of course, such a deformation will introduce contributions from the poles responsible for surface waves. For large $|z|$ the important part of $C_1$ is near $\alpha = k$ and so, to a first approximation, $\kappa$ may be treated as the constant $\kappa_1$. Thus (6.163) becomes approximately

H4>

- 1 b . - roeo 11 1 4n

f Cl

H\2)(K ob) exp( -ia.z) d (2) (2) a. aK H 2 o11 0 (Koa) - H 1 (Koa) 1

~ [H\2)(KOb)J 1(Ka) exp + cueeo -b i..J a

p

dL/da.

(')J -ta.z

(6.164) «=«p

where (6.165) and $\alpha_p$ is a typical zero of $L$ which becomes real when the dissipation is removed. For the surface waves $\kappa_0$ is negative imaginary and the Hankel function may conveniently be replaced by the modified Bessel function $K$ as in the previous section.

Fig. 6.21. Contour to avoid leaky waves.

When $b = a$, the integral in (6.164) can be simplified by introducing the variable of integration $t = \kappa_0/k$ and combining the integrals on the two sides of the branch cut. The result is

Htj)

1

= - - 2 weo'12ka2 2n

foo texp{ -ikz(l2

(1 - t)

0

1/2

t 2)1/2} M

(.)J

~ [H\2)(Koa)J1 (xe) + weeo f..J exp -taz p

dL/da

dt (6.166)

«=ap

where

$$M = \{\tfrac12 ak\eta t J_0(kat) - J_1(kat)\}^2 + \{\tfrac12 ak\eta t Y_0(kat) - Y_1(kat)\}^2 \tag{6.167}$$

and $(1 - t^2)^{1/2}$ is negative imaginary when $t > 1$. The far field where $kr \gg 1$ may be evaluated from (6.162) by means of the method of stationary phase, the same device as was used for (6.33) being adopted. Any poles encountered in moving the contour to the path of stationary phase can be neglected because they will have $\Im\kappa_0 < 0$ and so make a contribution exponentially small in comparison with that of the point of stationary phase. Hence

(6.168) where now $\kappa = (\omega^2\mu\varepsilon - k^2\cos^2\theta)^{1/2}$. Any consistent choice of the radical in $\kappa$ may be employed since no branch point is involved. If $b = a$, (6.168) simplifies to

i exp(-ikR) ZoHtj) "" -- ek J1(Ka) n R x {ek sin fJJ 1(Ka)H b2)(ka sin fJ) - eoKH\2)(ka sin (J)Jo(Ka)}

-1.

(6.169)

The radiation conductance $G$ may, as pointed out at the end of §6.3, be calculated from the real part of (6.166) with $z = 0$. Thus

(6.170) Some general observations can be made on the strength of these formulae, especially when $\varepsilon\mu \gg \varepsilon_0\mu_0$. Suppose, firstly, that $ka(\varepsilon\mu - \varepsilon_0\mu_0)^{1/2} \ll (\varepsilon_0\mu_0)^{1/2}$. Then no surface wave or leaky wave can be initiated and the residue terms disappear from (6.166) and (6.170). Also (6.169) becomes roughly $Z_0H_\phi$

f'toJ

I (k )2 . 8 exp( - ikR) tn a SIn

R

while (6.170) gives

on approximating $M$ by $4/(\pi kat)^2$. Consequently, the pattern is that of a small loop, the main effect of the dielectric being to alter the magnitude of the field. For $1 < ka(\varepsilon\mu/\varepsilon_0\mu_0 - 1)^{1/2} < j_{01} \approx 2.4$, $ka$ will be quite small. The denominator of (6.169) will have a complex zero corresponding to a solution of (6.158) and the phenomenon of a leaky wave is manifest. When $\theta$ is near the real part of the complex angle, the radiation pattern will have a maximum. The rod is acting as a fairly efficient radiator, the leaky wave on the rod decaying exponentially with distance from the magnetic frill. Confirmation of the radiation is further provided by $G$ which is still essentially equal to the integral term in (6.170). As $ka(\varepsilon\mu/\varepsilon_0\mu_0 - 1)^{1/2}$ approaches 2.4, $|\eta| \to \infty$ and the cut-off frequency of the surface wave is near. The radiation pattern and conductance change to those of a metal wire of the same diameter. The current now falls very slowly along the rod and the radiation of energy is inefficient. In fact, the conductance behaves as if the rod were metal for quite a wide range on either side of 2.4. With $2.4 < ka(\varepsilon\mu/\varepsilon_0\mu_0 - 1)^{1/2}$
Exercises 65. Find the analogues of (6.162) and (6.163) when the source is not cylindrically symmetric so that hybrid modes may be generated. 66. For z » 1, what is the analogue of (6.25) deduced from (6.163)?


6.27 The finite rod When the rod is not infinite in length reflection at an end can be regarded as creating a magnetic ring current there. Suppose its magnitude is - 2 ~ times that in (6.15). Then a terminating impedance should be introduced at the end z = I related to the upcoming wave by - 2naHt/J = ~ ~ and a similar insertion should be made at the end, z = -I. However, ~ cannot be taken as independent of the position of the source on account of the different decays of the various waves that can be excited on the dielectric. Therefore, in general, ~ must be a function of I and the position of the source. Once this dependence has been settled the reflections can be added together and, if 1(0) = VY where - 2 V is the strength of the exciting ring current, Y= Y 00

2{ Y(I)}2 ~(/)

+

Y(2/) >:(1) >:(2/)

(6.171 )

where Yoo is the admittance of the infinite antenna and 2(l1 ka )2 fOOt exp{ - ikz(1 - t 2)1/2} Y(z)=-dt 0 (1 - t 2 ) 1 / 2 M

-z,

from (6.166)

Exercise 67. If there is a well-established surface wave ~ can be regarded as independent of 1. Determine the simplification which occurs in (6.171).

6.28 General shapes

When the dielectric is not a circular cylinder there is little hope of finding analytical solutions unless the obstacle has a very simple shape like a sphere or special conditions (e.g. low frequencies or small refractive index) are applicable (Jones 1986). General shapes, therefore, can be tackled only by numerical techniques. In this section, some of the various methods will be set out. It will be presumed that the dielectric occupies the region $S_-$ of Fig. 6.15 and that $\mu$ and $\varepsilon$ are its permeability and permittivity respectively. In $S_+$ the corresponding quantities are taken as the constants $\mu_0$ and $\varepsilon_0$ respectively. To accommodate fundamental solutions in both regions it will be convenient to make a slight change of notation and put $k^2 = \omega^2\mu\varepsilon$, $k_0^2 = \omega^2\mu_0\varepsilon_0$ with

$$\psi_0(x, \xi) = \frac{\exp(-ik_0|x - \xi|)}{4\pi|x - \xi|}.$$

An incident wave $E_i$, $H_i$ strikes the dielectric obstacle from $S_+$. As a result, the body generates in $S_+$ a scattered field $E_s$, $H_s$ which has to satisfy the radiation conditions at infinity. In addition, a field $E_t$, $H_t$ is transmitted into $S_-$. The tangential components of the total field are to be continuous across $S$, i.e.

$$n\wedge(E_i + E_s)_+ = n\wedge(E_t)_-, \qquad n\wedge(H_i + H_s)_+ = n\wedge(H_t)_-. \tag{6.172}$$

Inside the dielectric Maxwell's equations can be written as

$$\operatorname{curl}E + i\omega\mu_0 H = i\omega(\mu_0 - \mu)H, \qquad \operatorname{curl}H - i\omega\varepsilon_0 E = i\omega(\varepsilon - \varepsilon_0)E.$$

In this context, the influence of the dielectric is represented as due to volume distributions of electric and magnetic current placed in the same medium as the exterior. Consequently, the total field $E$, $H$ can be expressed as

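$$E(P) = E_i(P) + (\operatorname{grad}\operatorname{div} + k_0^2)\int_{S_-}(\varepsilon/\varepsilon_0 - 1)E(Q)\psi_0(P, Q)\,dx_Q - i\omega\operatorname{curl}\int_{S_-}(\mu - \mu_0)H(Q)\psi_0(P, Q)\,dx_Q, \tag{6.173}$$

$$H(P) = H_i(P) + (\operatorname{grad}\operatorname{div} + k_0^2)\int_{S_-}(\mu/\mu_0 - 1)H(Q)\psi_0(P, Q)\,dx_Q + i\omega\operatorname{curl}\int_{S_-}(\varepsilon - \varepsilon_0)E(Q)\psi_0(P, Q)\,dx_Q. \tag{6.174}$$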

This representation has continuous tangential components and therefore automatically satisfies the boundary conditions (6.172). The application of (6.173) and (6.174) with $P$ in $S_-$ delivers integral equations to determine $E$ and $H$ inside the dielectric. Substitution of these values in the integrals of (6.173) and (6.174) when $P$ is in $S_+$ then provides the field outside the dielectric and the problem is solved. One great advantage of these volume integral equations is that there is no necessity for $\varepsilon$ and $\mu$ to be constant. They are therefore available for arbitrary inhomogeneous dielectrics. Indeed, replacing $E$, $H$ by $E_i$, $H_i$ in the integrals is the standard way of obtaining Rayleigh scattering as the first approximation of an iteration procedure which may be expected to converge if the combination of frequency and material deviation is suitably low.

Unfortunately, (6.173) and (6.174) suffer from a grave defect from a practical point of view. In effect, six scalar three-dimensional integral equations have to be solved simultaneously. The effort demanded of the computer is therefore at least two orders of magnitude greater than that for a metallic scatterer. For this reason a good deal of attention has been devoted to finding other formulations. Here the methods appropriate to arbitrary isotropic non-conducting dielectrics will be discussed though the principles carry over to more general dielectrics with much elaboration of detail. The special case of the homogeneous isotropic dielectric is studied in the next section.

Fig. 6.22. Scattering by a dielectric.

Basically, all the formulations are the same initially. A boundary $B$ is drawn enclosing the dielectric (Fig. 6.22) and inside $B$ solutions are generated by finite differences or finite elements. The methods differ in the location and shape of $B$ as well as in the treatment of the field on it. In the first case $B$ is taken sufficiently far away from $S$ for an absorbing boundary condition to be applicable. Often the shape of $B$ is circular or spherical but it can be adjusted to accommodate any special features of $S$. The problem has now become a normal interior one for finite differences with boundary conditions on $S$ and $B$. As regards finite elements, suppose that (6.153) is being employed. Select representative functions $v_1, v_2, \ldots, v_N$ and assume

$$p = \sum_{n=1}^{N} a_n v_n.$$

Then, putting $v = v_1, \ldots, v_N$ successively in (6.153) leads to a matrix equation for the $a_n$ which can be tackled by any suitable method such as conjugate gradients. If $p$ vanishes on $S$ the $v_j$ will be chosen to be zero on $S$ so that the assumed expansion for $p$ satisfies the boundary condition; in addition, the integral over $S$ will disappear from (6.153). The integral is removed automatically if the normal derivative of $p$ vanishes on $S$. When the scatterer is penetrable the $v_j$ should be selected to be continuous through $S$ when $p$ is; then the integral over $S$ can be transformed into a volume integral over $S_-$ (similar to that over $T$) by applying the boundary conditions satisfied by $p$ in the transition through $S$. More complicated arrangements may be necessary to attain the transformation when $p$ is not continuous across $S$. The procedure can be adapted to vector fields.

The second approach makes $B$ coincide with $S$ and applies a surface radiation

condition. For instance, we could require $E_s$, $H_s$ to satisfy (6.152) on $S$. Substitution from (6.172) furnishes a relation between the tangential components of $E_t$, $H_t$. Thus, the problem has been converted to finding a solution $E_t$, $H_t$ inside $S_-$ which complies with this relation. The converted problem is amenable to finite differences or finite elements.

As has been pointed out, the absorbing boundary condition is not entirely successful in keeping out unwanted reflections. Methods have been suggested which avoid using it. One of these is the unimoment method (Mei 1974; Chang and Mei 1974, 1976; Morgan and Mei 1974, 1979; for dielectric obstacles in waveguides see Mur et al. 1976). For simplicity of description $B$ will be taken as spherical but other choices are permissible. Moreover, only solutions of Helmholtz's equation will be discussed; the principles carry over to Maxwell's equations. The fundamental idea is to find out what happens to a basis function defined on $B$ in a truly radiating field. For simplicity again the spherical harmonic $Y_n^m$ will be chosen as the basis function but selections from other complete sets are perfectly acceptable. With $p = Y_n^m$ on $B$ the interior problem is solved by finite differences or finite elements. It will be assumed that there are no non-trivial solutions when $p = 0$ on $B$ so that the question of non-uniqueness does not have to be raised. This assumption would have to be verified in practice since it is not obvious that a dielectric-loaded sphere would not resonate at inconvenient frequencies. Given a unique solution, the values of the interior normal derivative on $B$ can be derived, say $\partial p/\partial n = q_{nm}$. It may be that $q_{nm}$ will emerge as a constant multiple of $Y_n^m$ but, in general, it will contain contributions from other spherical harmonics. Now suppose that the total field on $B$ can be represented adequately by

$$p = \sum_{n,m} P_{nm}Y_n^m, \qquad \partial p/\partial n = \sum_{n,m} P_{nm}q_{nm} \tag{6.175}$$

as determined by the internal calculation. The total field on $B$ is known to consist of the incident field $p^i$ and a radiating field $p^s$. If both of these are expanded in terms of $Y_n^m$ we must have

$$P_{nm}^s = P_{nm} - P_{nm}^i$$

in an obvious notation. However, since $p^s$ is radiating,

$$p^s = \sum_{n,m} P_{nm}^s Y_n^m h_n^{(2)}(kR)/h_n^{(2)}(kR_0)$$

outside $B$, $R_0$ being the radius of $B$. Continuity of the normal derivatives through $B$ demands

$$\sum_{n,m} P_{nm}q_{nm} = \sum_{n,m}\left\{kP_{nm}^s h_n^{(2)\prime}(kR_0)/h_n^{(2)}(kR_0) + [\partial P_{nm}^i/\partial R]_{R=R_0}\right\}Y_n^m.$$

Sufficient information has been acquired to enable the determination of the $P_{nm}$ or $P_{nm}^s$.

One disadvantage of the unimoment method is the necessity to know $p^s$ exactly outside $B$. In effect, this limits it to those $B$ which fit separable solutions of the governing equations. The limitation can be surmounted by employing an integral representation for the field outside $B$, e.g. (6.75) with $S$ replaced by $B$. Allowing the point of observation to tend to $B$ provides the information to fix $P_{nm}$ though the theory of integral equations is called on.

Because the unimoment method handles individual basis functions the potential exists for splitting off from the incident field some harmonics to be dealt with by the unimoment method while the scattering of the rest of the incident field is resolved by an absorbing boundary condition. Whether this is reasonable depends on the extent to which $q_{nm}$ contains harmonics other than $Y_n^m$. If $q_{nm}$ is relatively free of other harmonics a reasonable splitting should be attainable. For a comparison of the relative effectiveness of the methods for tackling general dielectrics see Peterson (1989).

Exercise 68. The following is a selection of problems for trying out methods: (a) the sphere with invariable interior; (b) the sphere with material changing radially; (c) a sphere composed of segments of constant materials; (d) an annular circular cylinder; (e) an elliptic cylinder; (f) a square cylinder; (g) a biconical antenna; (h) a lens. The incident wave might be plane or come from a point dipole.

6.29 Homogeneous isotropic dielectric

It has been observed that the volume integral equations are expensive in computer effort. Therefore, it is worth attempting to find surface integral equations when $\varepsilon$ and $\mu$ are both constant in order to save an order of magnitude in computer expenditure. From this point onwards $\varepsilon$ and $\mu$ will be regarded as constant in $S_-$. From (6.75) and (6.76)

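$$E_s(P) = \int_S\left[\{n_q\wedge E_s(q)\}\wedge\operatorname{grad}_q\psi_0(P, q) + \{n_q\cdot E_s(q)\}\operatorname{grad}_q\psi_0(P, q) - i\omega\mu_0\{n_q\wedge H_s(q)\}\psi_0(P, q)\right]dS_q, \tag{6.176}$$

$$H_s(P) = \int_S\left[\{n_q\wedge H_s(q)\}\wedge\operatorname{grad}_q\psi_0(P, q) + \{n_q\cdot H_s(q)\}\operatorname{grad}_q\psi_0(P, q) + i\omega\varepsilon_0\{n_q\wedge E_s(q)\}\psi_0(P, q)\right]dS_q \tag{6.177}$$

where $P \in S_+$ and both right-hand sides vanish identically for $P$ in $S_-$. Similarly

$$E_t(P_i) = -\int_S\left[\{n_q\wedge E_t(q)\}\wedge\operatorname{grad}_q\psi(P_i, q) + \{n_q\cdot E_t(q)\}\operatorname{grad}_q\psi(P_i, q) - i\omega\mu\{n_q\wedge H_t(q)\}\psi(P_i, q)\right]dS_q, \tag{6.178}$$

$$H_t(P_i) = -\int_S\left[\{n_q\wedge H_t(q)\}\wedge\operatorname{grad}_q\psi(P_i, q) + \{n_q\cdot H_t(q)\}\operatorname{grad}_q\psi(P_i, q) + i\omega\varepsilon\{n_q\wedge E_t(q)\}\psi(P_i, q)\right]dS_q \tag{6.179}$$

for $P_i$ in $S_-$ and the pair of right-hand sides is identically zero when $P_i$ is in $S_+$. Let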

iWPt

=

Then, from (6.81)-(6.84), n , E, that

n /\ E,

n.E,

= j: -

iwp;

-Div jo

=

-Div j;.

= - Pt/e and n. H, = - P; / JI.

0 /\

Ei ,

= - jt - 0 /\ Hi, n.H, = -o.Hi - p;/Jlo

0 /\

= -Pt/eo - o.Ei ,

Also (6.172) implies

H,

from (6.79) and (6.80). Now, if E, and H, are replaced by E i and Hi in the right-hand sides of (6.176) and (6.177) the integrals give - E i and - Hi for P in S_ because the sources of the incident field are outside S. Combining these integrals with those of (6.176) and (6.177) for a point inside S_ we obtain op /\ lim Pi-P

f

{j; /\ grad, t/Jo(1);, q) - Pt grad, t/Jo(P;, q) eo

+ up /\ lim

f{ -it

Pi-P

A

grad, ljJo(~, q) -

iWJlojtt/Jo(~, q)}

P;

Jlo

grad,

as, =

The next objective is to rid the integrands of p, and

Pi-P

f S

p, grad,

t/Jo(~, q) as, -

lim P-p

= np

f S

A

Ei ,

(6.180)

ljJo(~, q)

+ iweoj;t/lo(.l~, q)} dS q =

op /\ {lim

-up /\

p;.

-up /\

Hi. (6.181)

With P in S +

Pt grad, t/J(P, q) dS q }

Is p, grad, {ljJo(p, q) -ljJ(p, q)} as,


because (6.90) and (6.91) hold for both t/J and t/Jo. But

I

PI gradq{",o(p, q) - "'(p, q)}

as, = -grad p = grad,

=-

f

S

grad,

I

PI{"'O(P, q) - "'(p, q)}

as,

{"'o(p, q) - "'(p, q)} div jl

~Sq lW

fj S

I·

grad, {'"o(p, q) _ "'(p, q)}

~Sq lW

from (6.78). The right-hand side of (6.178) is zero in S +. Therefore, if we take the multiple en A of it and add it to eo times (6.180) we derive via (6.90) and (6.91)

-!(E

+ Eo)j; + up r.

Is [j; r. grad, {Eo"'o(p, +

ijt{k~t/lo(p, q) - k 2t/1(p, q)}/w

- (j

= -eoD p

q) - E"'(p, q)}

A

t

.gra

d) grad, {t/Jo(p, q) - t/J(p, q)}] dS •

q

lW

Ei .

q

(6.182)

Similar operations with (6.181) and (6.179) lead to

t(j.t + Jlo)jl + up /\

Is

[jl /\ grad, {Jl"'(p, q) - Jlo"'o(p, q)}

+ ij;{k~t/lo(p, q) - (j;.grad

=

-JloD p A Hi.

q

)

k 2t/J(p, q)}/w

grad q {"'o(P: q) - "'(p, q)}] as, lW

(6.183)

The integral equations (6.182) and (6.183) are the pair sought. They constitute four simultaneous scalar linear integral equations and are appreciably easier to tackle than (6.173) and (6.174) but substantially harder than if $S$ were a perfect conductor. It can be shown that the operators in (6.182) and (6.183) are compact, but the proof will not be given here. The matter of uniqueness will be settled in the next section.


6.30 Uniqueness for the homogeneous isotropic dielectric If the right-hand sides of (6.182) and (6.183) are placed equal to zero the integral equations may possess a solution jo, j~. If so, (6.182) and (6.183) are not uniquely soluble. Therefore, to guarantee uniqueness we want to show that jo and j~ must be identically zero. Define Po and p~ by iwpo = - Div jo, iwp~ = - Div j~. Then construct the following fields:

1 Is {-iwJljot/t(P, q) - j~ H = Is {- iwej~t/t(P, q) + jo E =

1

E2

=

A

grad, t/t(P, q) +

:0 grad, t/t(P, q)} dSq,

(6.184)

A

gradqt/t(P, q) +

~ grad, t/t(P, q)} dSq,

(6.185)

Is {-iwJlJ't/to(P, q) - j~

A

grad, t/to(P, q) + :: grad, t/to(P, q)}

as, (6.186)

H2 =

Is {-iweJ~t/to(P, q) + jo

A

grad, t/to(P, q) + :: grad, t/to(P, q)}

as, (6.187)

The integral equations satisfied by jo and j~ have been built in such a way that

eon /\ (E 2 ) -

= en /\ (E t ) +, J.lon /\ (H 2 ) - = J.ln /\ (H t ) _ .

(6.188)

Now the equations satisfied by (6.184) and (6.185) in S+ can be expressed as curl eEt

+ iwe(J.lH t) = 0,

curl JlH l - iWJl(eE t)

=0

whereas, in S_, (6.186) and (6.187) give

Moreover, the radiation conditions at infinity may be set forth as

R{JlH 1

+ (~r/2eEl

R{ -eEl - a

A

JlH1 (~)

A

a} ~

0,

/2} -1 ~ O.

Therefore the field JlH l, -eEl in S+ and J.l OH2 , -eoE 2 in S_ is an electromagnetic field which satisfies the radiation conditions and, by (6.188), has continuous tangential components. Let us assume that such a field must be identically zero. Then (Ej}, = 0 and


(H 1) +

= O.

Hence, from the jump in the representations across S,

Similarly

Consider the field E 2 ' " 2 in S+ and -E 1, -HI in S_. By what has just been said its tangential components are continuous through S and it obeys the radiation conditions at infinity. The assumption at the beginning of the paragraph makes the field identically zero. It follows that io == 0, io = 0 and the uniqueness of (6.182) and (6.183) is established. The assumption in the last paragraph can be justified as follows. If E, H is a field with the given properties

$$\int_{\Omega_R}(E\wedge H^* + E^*\wedge H)\cdot n\,d\Omega = 0.$$

Rewriting the integral as in §6.22 we deduce that $\int_{\Omega_R}|E|^2\,d\Omega \to 0$ as $R \to \infty$ and this is impossible unless $E$ and $H$ are identically zero. The uniqueness property of (6.182) and (6.183) makes them similar to the CFIE for a perfect conductor. Despite being known for some years such equations do not seem to have been used for numerical purposes until recently (Rao and Wilton 1990).

Exercises

69. Show that the uniqueness demonstrated above holds for complex permeability and permittivity provided that $\Re(i\omega\varepsilon) \ge 0$, $\Re(i\omega\mu) \ge 0$ and $0 \ge \operatorname{ph}\,\omega^2\mu_0\varepsilon_0 > -\pi$. Deduce that (6.182) and (6.183) have a unique solution for complex material constants.

70. Prove that the unique solution of (6.182) and (6.183) does furnish a solution of the dielectric scattering problem.

71. Reformulate the theory of this section for two-dimensional scattering.

APPENDIX: Geometry of surfaces

Assume that the surface $S$ is specified by the two parameters $\sigma^1$, $\sigma^2$ so that a point on it can be designated by $x(\sigma^1, \sigma^2)$. The curves $\sigma^1$ = constant and $\sigma^2$ = constant need not be orthogonal though it is often convenient to select them as lines of curvature. The direction of the unit normal $n$ to $S$ will be chosen so that $\sigma^1$, $\sigma^2$, $n$ form a right-handed system. In addition, the convention will be adopted that $n$ is an outward normal when $S$ is convex, i.e. $n$ points away from the centres of curvature; this can always be arranged by relabelling $\sigma^1$ and $\sigma^2$ if necessary. With this convention the principal curvatures are positive for a synclastic surface.


Let

$$g_{jk} = \frac{\partial x}{\partial\sigma^j}\cdot\frac{\partial x}{\partial\sigma^k} = g_{kj}$$

and $g = g_{11}g_{22} - g_{12}^2$.

For notational convenience it will be assumed that a repeated affix such as $k$ which occurs twice in a character or product means summation over $k = 1$ and $k = 2$. Thus $a_kb^k$ stands for $a_1b^1 + a_2b^2$ and $b_j^j$ for $b_1^1 + b_2^2$, but a sum such as $a_k + b_k$ is unaffected by the convention. Then, the element of length $ds$ can be expressed as

$$ds^2 = dx\cdot dx = g_{jk}\,d\sigma^j\,d\sigma^k. \tag{A.1}$$

The element of surface area $dS$ is given by

$$dS = \sqrt{g}\,d\sigma^1\,d\sigma^2. \tag{A.2}$$

Related to $g_{jk}$ is $g^{jk}$ where $g^{11} = g_{22}/g$, $g^{12} = g^{21} = -g_{12}/g$, $g^{22} = g_{11}/g$. It can be deduced immediately that

$$g^{jk} g_{km} = \delta^j_m \tag{A.3}$$

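where $\delta^j_m = 1$ if j = m and 0 otherwise. The vectors $\partial\mathbf{x}/\partial\sigma^1$ and $\partial\mathbf{x}/\partial\sigma^2$ are tangential to S and so

$$\mathbf{n} = \frac{1}{\sqrt{g}}\, \frac{\partial \mathbf{x}}{\partial \sigma^1} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^2}$$

in accordance with the convention on the direction of $\mathbf{n}$. It follows that

$$\mathbf{n} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^k} = \frac{1}{\sqrt{g}} \left( g_{1k} \frac{\partial \mathbf{x}}{\partial \sigma^2} - g_{2k} \frac{\partial \mathbf{x}}{\partial \sigma^1} \right) \tag{A.4}$$

whence

$$\mathbf{n} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^1} = \sqrt{g}\, g^{2k} \frac{\partial \mathbf{x}}{\partial \sigma^k}, \qquad \mathbf{n} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^2} = -\sqrt{g}\, g^{1k} \frac{\partial \mathbf{x}}{\partial \sigma^k}. \tag{A.5}$$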

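Although (A.5) is usually the preferred form there are times when (A.4) is more helpful. Since $\mathbf{n} \cdot \mathbf{n} = 1$, we have $\mathbf{n} \cdot \partial\mathbf{n}/\partial\sigma^k = 0$ which implies that $\partial\mathbf{n}/\partial\sigma^k$ is tangential to S. Consequently, it must be expressible in the form

$$\frac{\partial \mathbf{n}}{\partial \sigma^k} = b^j_k\, \frac{\partial \mathbf{x}}{\partial \sigma^j}. \tag{A.6}$$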

The quantities $b^j_k$ are related to the second fundamental tensor $b_{jk}$ of the surface defined by (A.7)

because (A.8)

from (A.6). It can be shown that

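$$K = b^1_1 b^2_2 - b^1_2 b^2_1, \qquad H = \tfrac{1}{2} b^j_j \tag{A.9}$$

where H is the mean curvature and K is the Gaussian curvature of the surface. When $\sigma^1$ = constant and $\sigma^2$ = constant are the lines of curvature, $b^1_1 = \kappa_1$, $b^2_2 = \kappa_2$, $b^1_2 = b^2_1 = 0$ where $\kappa_1$ and $\kappa_2$ are the principal curvatures; then

$$H = \tfrac{1}{2}(\kappa_1 + \kappa_2), \qquad K = \kappa_1 \kappa_2.$$

Observe that, if $\mathbf{T}$ is a tangential vector with $\mathbf{T} = T^i\, \partial\mathbf{x}/\partial\sigma^i$ and $\mathbf{U}$ is another tangential vector,

$$\mathbf{T} \cdot \mathbf{U} = g_{ik} T^i U^k. \tag{A.10}$$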
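As a numerical check on (A.6) and (A.9), the sketch below evaluates the metric, the unit normal and the coefficients $b^j_k$ by central differences for a sphere of radius a described by polar angle and azimuth. The parametrization, the step sizes and all function names are illustrative assumptions rather than anything prescribed in the text. With the outward normal convention adopted above, both principal curvatures of the sphere should come out close to 1/a.

```python
import math

def sphere(s1, s2, a=2.0):
    """Point x(sigma^1, sigma^2) on a sphere of radius a (polar angle, azimuth)."""
    return (a * math.sin(s1) * math.cos(s2),
            a * math.sin(s1) * math.sin(s2),
            a * math.cos(s1))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def diff(f, s1, s2, k, h):
    """Central difference of a vector-valued f with respect to sigma^(k+1)."""
    d1, d2 = (h, 0.0) if k == 0 else (0.0, h)
    fp, fm = f(s1 + d1, s2 + d2), f(s1 - d1, s2 - d2)
    return tuple((p - m) / (2.0 * h) for p, m in zip(fp, fm))

def unit_normal(x, s1, s2, h=1e-5):
    """n = (dx/dsigma^1 ^ dx/dsigma^2) / sqrt(g)."""
    c = cross(diff(x, s1, s2, 0, h), diff(x, s1, s2, 1, h))
    norm = math.sqrt(dot(c, c))
    return tuple(ci / norm for ci in c)

def mean_and_gaussian_curvature(x, s1, s2, h=1e-4):
    """Return (H, K) from the mixed coefficients b^j_k of (A.6)."""
    t = [diff(x, s1, s2, k, 1e-5) for k in range(2)]        # tangent vectors
    g11, g12, g22 = dot(t[0], t[0]), dot(t[0], t[1]), dot(t[1], t[1])
    g = g11 * g22 - g12 ** 2
    g_inv = ((g22 / g, -g12 / g), (-g12 / g, g11 / g))       # g^{jk}

    def normal(u, v):
        return unit_normal(x, u, v)

    dn = [diff(normal, s1, s2, k, h) for k in range(2)]      # dn/dsigma^k
    # dn/dsigma^k = b^j_k dx/dsigma^j  =>  b^j_k = g^{jm} (dx/dsigma^m . dn/dsigma^k)
    b = [[sum(g_inv[j][m] * dot(t[m], dn[k]) for m in range(2)) for k in range(2)]
         for j in range(2)]
    H = 0.5 * (b[0][0] + b[1][1])
    K = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    return H, K

H, K = mean_and_gaussian_curvature(sphere, 1.0, 0.5)
print(H, K)  # expected to be close to 1/a = 0.5 and 1/a**2 = 0.25 for a = 2
```

Finite differences are used only to keep the sketch independent of the surface; for a surface given analytically the derivatives could be supplied exactly.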

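The surface gradient of a scalar u, called Grad u, is defined by

$$\operatorname{Grad} u = g^{jk}\, \frac{\partial u}{\partial \sigma^j}\, \frac{\partial \mathbf{x}}{\partial \sigma^k}. \tag{A.11}$$

On account of (A.10)

$$\mathbf{T} \cdot \operatorname{Grad} u = g_{ik} T^i g^{jk} \frac{\partial u}{\partial \sigma^j} = T^j \frac{\partial u}{\partial \sigma^j} \tag{A.12}$$

from (A.3). The surface divergence of the tangential vector $\mathbf{T}$, denoted by Div $\mathbf{T}$, is defined by

$$\operatorname{Div} \mathbf{T} = \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial \sigma^j} \left( \sqrt{g}\, T^j \right). \tag{A.13}$$

Therefore

$$\operatorname{Div} \operatorname{Grad} u = \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial \sigma^j} \left( \sqrt{g}\, g^{jk} \frac{\partial u}{\partial \sigma^k} \right) \tag{A.14}$$

and

$$\operatorname{Div}(u\mathbf{T}) = u \operatorname{Div} \mathbf{T} + T^j \frac{\partial u}{\partial \sigma^j} = u \operatorname{Div} \mathbf{T} + \mathbf{T} \cdot \operatorname{Grad} u \tag{A.15}$$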

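by virtue of (A.12). Let C be a curve on S and let s be arc length along C. Then a unit tangent $\mathbf{t}$ to C is given by

$$\mathbf{t} = \frac{d\mathbf{x}}{ds} = \frac{\partial \mathbf{x}}{\partial \sigma^k}\, \frac{d\sigma^k}{ds}$$

with $\sigma^1$ and $\sigma^2$ expressed in terms of s on C. A unit normal $\boldsymbol{\nu}$ to C in S such that $\boldsymbol{\nu}$, $\mathbf{t}$, $\mathbf{n}$ form a right-handed system is

$$\boldsymbol{\nu} = \mathbf{t} \wedge \mathbf{n} = \sqrt{g} \left( g^{1k} \frac{\partial \mathbf{x}}{\partial \sigma^k} \frac{d\sigma^2}{ds} - g^{2k} \frac{\partial \mathbf{x}}{\partial \sigma^k} \frac{d\sigma^1}{ds} \right)$$

from (A.5). Thus

$$\mathbf{T} \cdot \boldsymbol{\nu} = \sqrt{g} \left( T^1 \frac{d\sigma^2}{ds} - T^2 \frac{d\sigma^1}{ds} \right) \tag{A.16}$$

on invoking (A.3). Now let C be a closed curve enclosing the portion $\Sigma$ of S. Then

$$\int_\Sigma \operatorname{Div} \mathbf{T}\, dS = \iint \frac{\partial}{\partial \sigma^j} \left( \sqrt{g}\, T^j \right) d\sigma^1\, d\sigma^2$$

from (A.2) and (A.13). Removing the partial derivatives by integration we have

$$\int_\Sigma \operatorname{Div} \mathbf{T}\, dS = \int_C \sqrt{g} \left( T^1 \frac{d\sigma^2}{ds} - T^2 \frac{d\sigma^1}{ds} \right) ds = \int_C \mathbf{T} \cdot \boldsymbol{\nu}\, ds \tag{A.17}$$

by (A.16); (A.17) is the surface analogue of the divergence theorem for volumes. If S is a closed surface choose $\Sigma$ to be the whole of S so that C disappears. We infer from (A.17) that

$$\int_S \operatorname{Div} \mathbf{T}\, dS = 0 \tag{A.18}$$

when S is closed. By applying (A.18) to (A.15) we deduce that

$$\int_S u \operatorname{Div} \mathbf{T}\, dS = -\int_S \mathbf{T} \cdot \operatorname{Grad} u\, dS \tag{A.19}$$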

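when S is closed. Other analogues of formulae for integrals in three dimensions can be derived. First, remark that

$$\mathbf{n} \wedge \operatorname{Grad} u = \frac{1}{\sqrt{g}} \left( \frac{\partial u}{\partial \sigma^1} \frac{\partial \mathbf{x}}{\partial \sigma^2} - \frac{\partial u}{\partial \sigma^2} \frac{\partial \mathbf{x}}{\partial \sigma^1} \right)$$

by virtue of (A.4) and (A.3). Evidently

$$\operatorname{Div}(\mathbf{n} \wedge \operatorname{Grad} u) = 0. \tag{A.20}$$

On the other hand, integration furnishes

$$\int_\Sigma \mathbf{n} \wedge \operatorname{Grad} u\, dS = \int_C u\, \mathbf{t}\, ds.$$

Furthermore, from (A.6) and (A.9),

$$2H\mathbf{n}\sqrt{g} = \frac{\partial \mathbf{n}}{\partial \sigma^1} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^2} - \frac{\partial \mathbf{n}}{\partial \sigma^2} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^1}$$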

so that, on integrating the derivatives of $\mathbf{n}$,

from (A.5). Hence

$$\int_C u \boldsymbol{\nu}\, ds = \int_\Sigma (2Hu\mathbf{n} + \operatorname{Grad} u)\, dS. \tag{A.22}$$

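For the surface curl, called Curl, it is advantageous not to confine the definition to surface vectors. Assume that S is sufficiently regular for points nearby to be identified by coordinates of the type $\mathbf{x}(\sigma^1, \sigma^2) + \tau\mathbf{n}$ where $\tau$ is the (small) distance along the normal. Let $\mathbf{W}$ be a space vector which is defined in the neighbourhood where this representation is valid. Then Curl $\mathbf{W}$ is defined by

$$\operatorname{Curl} \mathbf{W} = \left[ \operatorname{curl} \mathbf{W} - \mathbf{n} \wedge \partial\mathbf{W}/\partial\tau \right]_{\tau = 0}. \tag{A.23}$$

When $\mathbf{W} = W_3 \mathbf{n} + W^i\, \partial\mathbf{x}/\partial\sigma^i$ this leads to

$$\operatorname{Curl} \mathbf{W} = (\operatorname{Grad} W_3) \wedge \mathbf{n} + W^j b^k_j\, \mathbf{n} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^k} + \mathbf{n} \operatorname{Div}(\mathbf{W} \wedge \mathbf{n}). \tag{A.24}$$

The formula (A.24) may be confirmed easily when $\sigma^1$ = constant and $\sigma^2$ = constant are lines of curvature; it then holds generally because of its tensor nature. The middle term of (A.24) can be rewritten in various ways through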

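Note that

$$\operatorname{Div}(\mathbf{W} \wedge \mathbf{n}) = \mathbf{n} \cdot \operatorname{Curl} \mathbf{W} = \mathbf{n} \cdot \operatorname{curl} \mathbf{W}.$$

It cannot be presumed that

$$\operatorname{Curl} \operatorname{Grad} u = 0$$

holds in general nor that $\operatorname{Div}[\operatorname{Curl} \mathbf{W}]_t = 0$, t signifying a tangential element. Useful combinations are

$$[\operatorname{Curl}(\mathbf{n} \wedge \operatorname{Grad} u)]_t = \frac{1}{\sqrt{g}} \left( \frac{\partial u}{\partial \sigma^1} b^k_2 - \frac{\partial u}{\partial \sigma^2} b^k_1 \right) \mathbf{n} \wedge \frac{\partial \mathbf{x}}{\partial \sigma^k}$$

and

$$-\operatorname{Div}[\operatorname{Curl}(\mathbf{n} \wedge \operatorname{Grad} u)]_t = \frac{1}{\sqrt{g}} \frac{\partial}{\partial \sigma^1} \left\{ \left( \frac{\partial u}{\partial \sigma^1} b_{22} - \frac{\partial u}{\partial \sigma^2} b_{21} \right) \frac{1}{\sqrt{g}} \right\} + \frac{1}{\sqrt{g}} \frac{\partial}{\partial \sigma^2} \left\{ \left( \frac{\partial u}{\partial \sigma^2} b_{11} - \frac{\partial u}{\partial \sigma^1} b_{12} \right) \frac{1}{\sqrt{g}} \right\}. \tag{A.25}$$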

7 TRANSIENT PHENOMENA

Most electromagnetic transmitters operate long enough at a single frequency for the analysis by time-harmonic waves to be appropriate. However, it is possible to produce short pulses with a broad frequency spectrum so that the prediction of events for general time variations is desirable. The effects of the radiation from lightning discharges and of high-powered optical pulses then become amenable to investigation. This chapter is, therefore, devoted to the problem of transients.

7.1 Finite methods

The general problem is to solve equations of the form

$$\operatorname{curl} \mathbf{E} + \mu \frac{\partial \mathbf{H}}{\partial t} = -\mathbf{J}', \qquad \operatorname{curl} \mathbf{H} - \varepsilon \frac{\partial \mathbf{E}}{\partial t} = \mathbf{J} \tag{7.1}$$

where J and J' are known electric and magnetic currents. The quantities $\mu$ and $\varepsilon$ may depend on the space variables but do not vary with the time t. Causality insists that there shall be no disturbance until the source is switched on. Thereafter the field propagates behind wavefronts which travel with speeds and in directions characteristic of the medium. No energy flow can be detected by an observer before such a wavefront has passed over him. This property affords the opportunity of avoiding the difficulty with harmonic waves of satisfying the radiation conditions at infinity on a mesh of finite size which led to the introduction of absorbing boundary conditions. For electromagnetic pulses the role of the radiation conditions is taken over by causality. There is now a clear-cut wavefront, ahead of which there is no disturbance. Therefore the mesh does not have to go off to infinity; it merely has to extend as far as the most distant wavefront which has emanated from the obstacle. The progress of each wavefront in time and space can be traced from its initiation and no artificial boundary to account for the behaviour at infinity is needed.

Nevertheless, there may be circumstances when an absorbing boundary condition may be called into play. For example, in calculations over a long time the disturbance may have spread far enough for the number of mesh points to have become unmanageable. In such cases one may wish to deploy the analogue of (6.142) in a homogeneous medium. It is

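$$2\left( \frac{1}{v}\frac{\partial}{\partial t} + \frac{1}{R} \right) \left\{ \frac{\partial p}{\partial R} + \left( \frac{1}{v}\frac{\partial}{\partial t} + \frac{1}{R} \right) p \right\} = \frac{1}{R^2} \left\{ \frac{1}{\sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial p}{\partial \theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2 p}{\partial \phi^2} \right\}$$

where v is the speed of propagation of the waves. This condition can be derived from

$$p = \sum_{n=0}^{\infty} \frac{p_n(vt - R, \theta, \phi)}{R^{n+1}}$$

where

$$\frac{2}{v}(n + 1) \frac{\partial}{\partial t} p_{n+1}(vt, \theta, \phi) + \frac{1}{\sin\theta} \frac{\partial}{\partial \theta} \left\{ \sin\theta \frac{\partial}{\partial \theta} p_n(vt, \theta, \phi) \right\} + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial \phi^2} p_n(vt, \theta, \phi) + n(n + 1) p_n(vt, \theta, \phi) = 0 \qquad (n = 0, 1, \ldots).$$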

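Likewise, the analogue of (6.152) is

$$\frac{1}{v}\frac{\partial}{\partial t} \left( \frac{2}{v}\frac{\partial}{\partial t} + H \right) \mathbf{n} \wedge (\mathbf{h} - \mathbf{n} \wedge \mathbf{E}) = \frac{1}{v}\frac{\partial}{\partial t}\, \mathbf{n} \wedge \{ \operatorname{Curl}(\mathbf{n} \wedge \mathbf{h}) - \mathbf{n} \wedge \operatorname{Curl}(\mathbf{n} \wedge \mathbf{E}) \} + \operatorname{Grad} \operatorname{Div}(\mathbf{n} \wedge \mathbf{h}) - \mathbf{n} \wedge \operatorname{Grad} \operatorname{Div}(\mathbf{n} \wedge \mathbf{E})$$

where $v = 1/(\mu\varepsilon)^{1/2}$. For some purposes versions integrated with respect to time may be preferable.

In applying finite differences or finite elements to the governing equations two options are open. One may leave the time derivatives untouched or discretize them. In the former case ordinary differential equations in t are arrived at. In the latter case mesh points at equal time intervals are specified for E and H but those for H are at the mid-points of the intervals for E; this is sometimes called a leapfrog scheme (Mitchell 1969; Arbanel and Gottlieb 1976).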
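The staggering of E and H in time can be illustrated by a one-dimensional sketch. It assumes a source-free form of (7.1) reduced to a single transverse pair of components, with units normalised so that $\varepsilon = \mu = 1$ and the sign convention $\partial H/\partial t = \partial E/\partial x$, $\partial E/\partial t = \partial H/\partial x$; the grid size, the Gaussian excitation and the function name are illustrative choices only. Because each update touches only neighbouring nodes, the computed disturbance cannot advance by more than one cell per step, which mirrors the remark that the mesh need extend only as far as the most distant wavefront.

```python
import math

def leapfrog_1d(n_cells=200, n_steps=50, courant=0.5):
    """March E (integer nodes, integer times) and H (half nodes, half times).

    Units are normalised so that eps = mu = 1; courant = v*dt/dx should not exceed 1.
    """
    e = [0.0] * (n_cells + 1)   # E at x = i*dx
    h = [0.0] * n_cells         # H at x = (i + 1/2)*dx
    for step in range(n_steps):
        # Gaussian excitation imposed at the left-hand node (a hard source).
        e[0] = math.exp(-((step - 30) / 10.0) ** 2)
        # H advances from t - dt/2 to t + dt/2 using E at time t.
        for i in range(n_cells):
            h[i] += courant * (e[i + 1] - e[i])
        # E advances from t to t + dt using H at time t + dt/2.
        for i in range(1, n_cells):
            e[i] += courant * (h[i] - h[i - 1])
    return e, h

e, h = leapfrog_1d()
front = max(i for i, value in enumerate(e) if value != 0.0)
print(front)  # index of the farthest disturbed node; it never exceeds n_steps
```

Everything beyond the front is exactly zero, so no boundary treatment is needed there until the front reaches the edge of the mesh.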

7.2 Integral equations in the time domain

In principle, solutions in the time domain can be obtained from those of the harmonic variation in Chapter 6 by means of a Fourier transform. The process is assisted by the fast Fourier transform (§2.14). Nevertheless, the transformation is not a trivial exercise since the harmonic problem must be solved for a large range of frequencies and it is evident from Chapter 6 that the computation of results for a single frequency can be quite formidable. Notwithstanding, Tijhuis et al. (1989) have indicated that discretizing the frequency over a sufficient range gets around some of the problems which arise in a direct approach in the time domain.

An examination of what is involved in working directly in the time domain will be undertaken now. Let S be a closed surface such that the normal components of J and J' are always zero on it. For sources of finite extent this can be arranged simply by making S large enough. Then, if $\varepsilon$ and $\mu$ are constant within S, the general

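solution of (7.1) inside S is (Jones 1986)

$$\begin{aligned} \mathbf{E}(P, t) = {} & -\frac{\partial}{\partial t}\, \mu \int \mathbf{J}(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q - \operatorname{grad} \frac{1}{\varepsilon} \int \rho(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q - \operatorname{curl} \int \mathbf{J}'(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q \\ & - \int_S \left[ \{ \mathbf{n}_q \wedge \mathbf{E}(q, T) \} \wedge \operatorname{grad}_q \Psi(P, q) - \operatorname{grad}_q \Psi(P, q) \wedge \left\{ \mathbf{n}_q \wedge \frac{\partial \mathbf{E}(q, T)/\partial t}{v\Psi} \right\} \right. \\ & \quad + \{ \mathbf{n}_q \cdot \mathbf{E}(q, T) \} \operatorname{grad}_q \Psi(P, q) + \left\{ \mathbf{n}_q \cdot \frac{\partial}{\partial t} \mathbf{E}(q, T) \right\} \frac{\operatorname{grad}_q \Psi(P, q)}{v\Psi} \\ & \quad \left. - \mu \Psi(P, q)\, \mathbf{n}_q \wedge \frac{\partial \mathbf{H}(q, T)}{\partial t} \right] dS_q, \end{aligned} \tag{7.2}$$

$$\begin{aligned} \mathbf{H}(P, t) = {} & -\frac{\partial}{\partial t}\, \varepsilon \int \mathbf{J}'(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q - \operatorname{grad} \frac{1}{\mu} \int \rho'(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q + \operatorname{curl} \int \mathbf{J}(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q \\ & - \int_S \left[ \{ \mathbf{n}_q \wedge \mathbf{H}(q, T) \} \wedge \operatorname{grad}_q \Psi(P, q) - \operatorname{grad}_q \Psi(P, q) \wedge \left\{ \mathbf{n}_q \wedge \frac{\partial \mathbf{H}(q, T)/\partial t}{v\Psi} \right\} \right. \\ & \quad + \{ \mathbf{n}_q \cdot \mathbf{H}(q, T) \} \operatorname{grad}_q \Psi(P, q) + \left\{ \mathbf{n}_q \cdot \frac{\partial}{\partial t} \mathbf{H}(q, T) \right\} \frac{\operatorname{grad}_q \Psi(P, q)}{v\Psi} \\ & \quad \left. + \varepsilon \Psi(P, q)\, \mathbf{n}_q \wedge \frac{\partial \mathbf{E}(q, T)}{\partial t} \right] dS_q \end{aligned} \tag{7.3}$$

where $v^2 = 1/\mu\varepsilon$, $\operatorname{div} \mathbf{J} + \partial\rho/\partial t = 0$, $\operatorname{div} \mathbf{J}' + \partial\rho'/\partial t = 0$,

$$\Psi(P, Q) = \frac{1}{4\pi |\mathbf{x}_P - \mathbf{x}_Q|},$$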

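and $T = t - |\mathbf{x}_P - \mathbf{x}_Q|/v$ in the volume integrals, while $\mathbf{x}_Q$ is replaced by $\mathbf{x}_q$ in the surface integrals. Equations (7.2) and (7.3) express the field in terms of the retarded time T, exemplifying the fact that disturbances move at speed v and do not occur before any sources are switched on.

In special circumstances it may occur that the charges and currents are confined to the interior of the surface $\Sigma$, which itself is entirely within S, but the normal components of J and J' do not vanish on $\Sigma$. In that case, the terms

$$\frac{1}{\varepsilon} \int_\Sigma \mathbf{n}_q \cdot \mathbf{J}_1(q, T) \operatorname{grad}_q \Psi(P, q)\, dS_q \qquad \text{and} \qquad \frac{1}{\mu} \int_\Sigma \mathbf{n}_q \cdot \mathbf{J}'_1(q, T) \operatorname{grad}_q \Psi(P, q)\, dS_q,$$

where

$$\frac{\partial \mathbf{J}_1(q, t)}{\partial t} = \mathbf{J}(q, t), \qquad \frac{\partial \mathbf{J}'_1(q, t)}{\partial t} = \mathbf{J}'(q, t),$$

must be added to (7.2) and (7.3) respectively. Equivalently, the negative volume integrals involving $\rho$ and $\rho'$ can be replaced by

$$\frac{1}{\varepsilon} \operatorname{grad} \operatorname{div} \int \mathbf{J}_1(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q \qquad \text{and} \qquad \frac{1}{\mu} \operatorname{grad} \operatorname{div} \int \mathbf{J}'_1(Q, T) \Psi(P, Q)\, d\mathbf{x}_Q.$$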
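The role of the retarded time in (7.2) and (7.3) can be made concrete with a short sketch. It evaluates $T = t - |\mathbf{x}_P - \mathbf{x}_Q|/v$ and one sample of the product of a source with the kernel $1/4\pi|\mathbf{x}_P - \mathbf{x}_Q|$, returning zero whenever the retarded time precedes the instant at which the source is switched on. The function names, the switch-on parameter `t_on` and the scalar source are hypothetical, introduced only for illustration.

```python
import math

def retarded_time(t, x_p, x_q, v):
    """T = t - |x_P - x_Q| / v for observation point x_p and source point x_q."""
    return t - math.dist(x_p, x_q) / v

def retarded_kernel_sample(source, t, x_p, x_q, v, t_on=0.0):
    """source(T) * Psi(P, Q) for a source that is silent before t_on."""
    distance = math.dist(x_p, x_q)
    T = t - distance / v
    psi = 1.0 / (4.0 * math.pi * distance)
    # Causality: nothing is observed until the signal has had time to travel.
    return source(T) * psi if T >= t_on else 0.0

v = 3.0e8
print(retarded_time(2.0e-8, (3.0, 0.0, 0.0), (0.0, 0.0, 0.0), v))
```

An observer 3 m from the source therefore sees nothing for the first 10 ns after switch-on, whatever the source does in that interval.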

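The derivation of the integral equation for the thin wire follows similar lines to those of §6.2. Again, the current is assumed to be concentrated on the axis of the wire and the magnetic current is dropped. The charge density is also condensed to the axis. In view of the general continuity relation

$$\operatorname{div} \mathbf{J}(\mathbf{x}, t) + \frac{\partial \rho}{\partial t} = 0$$

the condensed charge may be expressed in terms of the current $I(\boldsymbol{\xi}, t)$ at the point $\boldsymbol{\xi}$ where the unit tangent is $\mathbf{t}(\boldsymbol{\xi})$ by writing it as

$$-\int_{-\infty}^{t} \{ \mathbf{t}(\boldsymbol{\xi}) \cdot \operatorname{grad}_\xi \} I(\boldsymbol{\xi}, u)\, du.$$

The surface S in (7.2) may be removed to infinity. Its contribution is then zero because no fields produced by the wire reach there in finite time. Hence, if the incident wave is $\mathbf{E}^i$, the total electric intensity is given by

$$\mathbf{E}(P, t) = \mathbf{E}^i(P, t) - \frac{\partial}{\partial t}\, \mu \int_0^l \mathbf{t}(\boldsymbol{\xi}) I(\boldsymbol{\xi}, T) \Psi(P, \boldsymbol{\xi})\, d\zeta + \operatorname{grad} \frac{1}{\varepsilon} \int_0^l \int_{-\infty}^{T} \{ \mathbf{t}(\boldsymbol{\xi}) \cdot \operatorname{grad}_\xi \} I(\boldsymbol{\xi}, u)\, du\, \Psi(P, \boldsymbol{\xi})\, d\zeta \tag{7.4}$$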

where $\zeta$ is the arc length and l is the length of the antenna. It has been assumed that the current vanishes at the ends of the wire. The boundary condition that the axial component of E vanishes on the surface of the wire may now be applied. If the wire is straight and along the z axis the total field is

$$\mathbf{E}(P, t) = \mathbf{E}^i(P, t) - \frac{\partial}{\partial t}\, \mu \hat{\mathbf{z}} \int_0^l I(\zeta, T) \Psi(P, \zeta)\, d\zeta + \operatorname{grad} \frac{1}{\varepsilon} \int_0^l \int_{-\infty}^{T} \frac{\partial}{\partial \zeta} I(\zeta, u)\, du\, \Psi(P, \zeta)\, d\zeta \tag{7.5}$$

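and the resulting integral equation is

$$\frac{1}{\mu} E^i_z(z, t) = \frac{\partial}{\partial t} \int_0^l I(\zeta, T) \Psi'(z, \zeta)\, d\zeta - v^2 \frac{\partial}{\partial z} \int_0^l \int_{-\infty}^{T} \frac{\partial}{\partial \zeta} I(\zeta, u)\, du\, \Psi'(z, \zeta)\, d\zeta \tag{7.6}$$

where

$$\Psi'(z, \zeta) = \frac{1}{4\pi \{ (z - \zeta)^2 + a^2 \}^{1/2}}$$

and

$$T = t - \{ (z - \zeta)^2 + a^2 \}^{1/2}/v.$$

If the derivatives outside the integrals in (7.6) are taken inside we obtain

$$\frac{1}{\mu} E^i_z(z, t) = \int_0^l \Psi'(z, \zeta) \left[ \frac{\partial}{\partial t} I(\zeta, T) + \frac{z - \zeta}{R_1}\, v \left\{ \frac{\partial}{\partial \zeta} I(\zeta, T) + \frac{v}{R_1} \int_{-\infty}^{T} \frac{\partial}{\partial \zeta} I(\zeta, u)\, du \right\} \right] d\zeta \tag{7.7}$$

where $R_1 = \{ (z - \zeta)^2 + a^2 \}^{1/2}$. An integration by parts gives the alternative form

$$\begin{aligned} \frac{1}{\mu} E^i_z(z, t) = {} & \int_0^l \Psi'(z, \zeta) \frac{\partial}{\partial t} I(\zeta, T)\, d\zeta + v^2 \int_{-\infty}^{T_l} \left[ \frac{\partial}{\partial \zeta} I(\zeta, u) \right]_{\zeta = l} du\, \Psi'(z, l) \\ & - v^2 \int_{-\infty}^{T_0} \left[ \frac{\partial}{\partial \zeta} I(\zeta, u) \right]_{\zeta = 0} du\, \Psi'(z, 0) - v^2 \int_0^l \int_{-\infty}^{T} \frac{\partial^2 I(\zeta, u)}{\partial \zeta^2}\, du\, \Psi'(z, \zeta)\, d\zeta \end{aligned} \tag{7.8}$$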

where $T_0$ and $T_l$ are the values of T at $\zeta = 0$ and $\zeta = l$ respectively. If the current $I(\zeta, t) = I_0(t - \zeta/v)$ flows in a straight wire the integrals in (7.5) can be evaluated explicitly with the result

$$E_x(P, t) - E^i_x(P, t) = -\frac{x}{4\pi\varepsilon v} \left[ \frac{I_0\{ t - (R_2 + l)/v \}}{u_2 R_2} - \frac{I_0(t - R_1/v)}{u_1 R_1} \right], \tag{7.9}$$

$$E_z(P, t) - E^i_z(P, t) = \frac{\mu v}{4\pi} \left[ \frac{I_0\{ t - (R_2 + l)/v \}}{R_2} - \frac{I_0(t - R_1/v)}{R_1} \right] \tag{7.10}$$

where the various parameters are defined in §6.2 (Fig. 6.2). However, it must be remembered that (7.5) does not correctly represent the field when the current fails to be zero at the ends as it does here. According to the paragraph after eqns (7.2) and (7.3) it is then necessary to add to the right-hand side of (7.5)

$$-\frac{1}{\varepsilon} \operatorname{grad} \{ I_1(l, T_l) \Psi(P, l) - I_1(0, T_0) \Psi(P, 0) \}$$

where $\partial I_1(x, t)/\partial t = I(x, t)$. The effect of this addition is to increase the right-hand side of (7.9) by

$$\frac{x}{4\pi\varepsilon} \left[ \frac{1}{vR_2^2} I_0\left\{ t - \frac{R_2 + l}{v} \right\} + \frac{1}{R_2^3} I_1\left\{ t - \frac{R_2 + l}{v} \right\} - \frac{1}{vR_1^2} I_0\left( t - \frac{R_1}{v} \right) - \frac{1}{R_1^3} I_1\left( t - \frac{R_1}{v} \right) \right] \tag{7.11}$$

and to supplement the right-hand side of (7.10) by

$$-\frac{1}{4\pi\varepsilon} \frac{\partial}{\partial z} \left[ \frac{1}{R_2} I_1\left\{ t - \frac{R_2 + l}{v} \right\} - \frac{1}{R_1} I_1\left( t - \frac{R_1}{v} \right) \right]. \tag{7.12}$$

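When the wire is curved the appropriate integral equation is

$$\frac{\mathbf{t}(\mathbf{x}) \cdot \mathbf{E}^i(\mathbf{x})}{\mu} = \frac{\partial}{\partial t} \int_0^l I(\zeta, T)\, \mathbf{t}(\mathbf{x}) \cdot \mathbf{t}(\boldsymbol{\xi}) \Psi_1(\mathbf{x}, \boldsymbol{\xi})\, d\zeta - v^2 \int_0^l \mathbf{t}(\mathbf{x}) \cdot \operatorname{grad} \left[ \int_{-\infty}^{T} \{ \mathbf{t}(\boldsymbol{\xi}) \cdot \operatorname{grad}_\xi \} I(\zeta, u)\, du\, \Psi_1(\mathbf{x}, \boldsymbol{\xi}) \right] d\zeta \tag{7.13}$$

where $\Psi_1(\mathbf{x}, \boldsymbol{\xi}) = 1/4\pi |\mathbf{x} - \boldsymbol{\xi} + \mathbf{a}|$ and $\mathbf{a}$ has the same connotation as in §6.7. Once more it has been presumed that the current disappears at the ends of the wire. If this is not so, an appropriate adjustment must be made to (7.13) as in the case of the straight wire. As soon as the axial current is known the far field can be determined from

$$\mathbf{E}(\mathbf{x}, t) \sim \mathbf{E}^i(\mathbf{x}, t) - \frac{\mu}{4\pi R} \frac{\partial}{\partial t} \int_0^l \left[ \mathbf{t}(\boldsymbol{\xi}) - \left\{ \mathbf{t}(\boldsymbol{\xi}) \cdot \frac{\mathbf{x}}{R} \right\} \frac{\mathbf{x}}{R} \right] I(\zeta, T)\, d\zeta \tag{7.14}$$

where now

$$T = t - \frac{R - (\mathbf{x} \cdot \boldsymbol{\xi})/R}{v}.$$

The formula (7.14) is valid whether the current vanishes at the end-points or not.

Exercise
1. Find the analogues of (7.9)-(7.12) when $I(\zeta, t) = I_0(t + \zeta/v)$.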

7.3 Numerical methods for thin wires in the time domain

The numerical solution of the integral equations of the preceding section might be thought to present no radical problems which have not been encountered for the case of harmonic excitation. From one point of view they might be regarded as the same apart from containing an extra parameter, the time, so that the arguments about the desirable properties of the basis functions would still apply, though with redoubled force because of the necessity to allow for the additional parameter. (The presence of the infinite integral is not a genuine difficulty because the actual integration is only over the finite time that has elapsed from the stimulation of the current.) One might think of using basis functions with arguments $t - \zeta/v$ so as to avoid numerical integration by taking advantage of (7.9)-(7.12). However, this procedure is likely to be efficacious only in particularly favourable situations and must be rejected for curved antennas. Instead one must place space and time variables on the same footing, but the interpolation with respect to time must be done rather carefully.

The integral equation will be imposed at the times $t_1, t_2, \ldots$ in increasing order of magnitude. There is no loss of generality in assuming that the antenna is not excited before t = 0 so that it is permissible to have $t_1 \ge 0$. Suppose that the axis of the wire is represented by cubic B-splines as in §6.8. The points on the axis corresponding to the knots will be separated by certain distances; let $R_{\min}$ be the minimum distance which occurs between any pair. Then the time intervals must be arranged so that $t_k - t_{k-1} \le R_{\min}/v$. The time retardation in the integral equation will then ensure that, when matching at a particular knot, the current distribution in all axial segments but those immediately adjacent to the knot will be that of an earlier time interval. Normally, the subdivision of the filament will be selected so that the geometric representation is good and the subsections are sufficiently small to resolve the spatial variation of the pulse on the wire. The inequality $t_k - t_{k-1} \le R_{\min}/v$ then prescribes an upper limit to the length of time interval which can be tolerated.
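The upper limit on the time interval can be computed directly from the knot positions. The sketch below takes the minimum distance over all pairs of knots, as the text specifies, and divides by the wave speed; the function name and the sample straight wire with knots every 1 cm are illustrative assumptions.

```python
import math
from itertools import combinations

def max_time_step(knots, v):
    """Largest admissible t_k - t_(k-1), namely R_min / v.

    knots is a sequence of (x, y, z) points on the wire axis and R_min is the
    smallest distance between any pair of them.
    """
    r_min = min(math.dist(p, q) for p, q in combinations(knots, 2))
    return r_min / v

# Hypothetical straight wire, 10 cm long, with knots every 1 cm.
knots = [(0.0, 0.0, 0.01 * i) for i in range(11)]
print(max_time_step(knots, 3.0e8))  # roughly 3.3e-11 s
```

For a long wire the pairwise search grows quadratically with the number of knots; restricting it to neighbouring knots is a natural shortcut when the wire does not fold back on itself.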
In order to achieve the necessary temporal resolution one may, of course, be forced to choose even smaller time intervals. Usually, the points $t_k$ are equispaced in time though the computational efficiency would be improved if the intervals were lengthened as the response settled down. Once the interpolation points have been decided, the integral equation is enforced so that, in simplified notation,

$$\int I(\sigma', t_k - f(\sigma, \sigma')) K(\sigma, \sigma')\, d\sigma' = g(\sigma, t_k)$$

where $\sigma$ is the spline variable and f accounts for the retardation. The integral is approximated by a quadrature rule so that for a typical subsection

$$\int_m^{m+1} I(\sigma', t_k - f(\sigma, \sigma')) K(\sigma, \sigma')\, d\sigma' \approx \sum_{r=1}^{n} w_r I(m + \sigma_r, t_k - f(\sigma, m + \sigma_r)) K(\sigma, m + \sigma_r).$$

If all the time intervals are of length $t_0$ and $t_k = kt_0$ the current is now replaced in the interval $((p - \tfrac{1}{2})t_0, (p + \tfrac{1}{2})t_0)$ by

$$I(\sigma', t) = \sum_{i} \sum_{j=p-2}^{p} I_{ij} B_i(\sigma') B_j\left( \frac{t}{t_0} - \frac{1}{2} \right). \tag{7.15}$$

Point matching at the values of $\sigma$ now leads to an algebraic system to determine the $I_{ij}$. For any space segment not adjoining the matching knot, f will exceed unity and the $I_{ij}$ will come from times $t_{k-1}$ or earlier. The only $I_{ij}$ involving $t_k$ will be those in a segment next door to a matching knot and here f can be set zero to a first approximation. If the $I_{ij}$ at $t_{k-1}$ and earlier are known we are thus led to a relatively simple algebraic system for them at $t_k$ in terms of previous values. Hence, by marching in time from t = 0 we can determine succeeding values of the coefficients; the interpolation in time (7.15) has been made backward to warrant this possibility. It also ensures that the matrix inverse, once calculated for one time, need not be recomputed because it is essentially geometric in character. The effect of a gap when the antenna is fed at a point of its length causes no further complication than in the harmonic case.

A typical response for a centre-fed antenna with a feed time variation $\exp\{-(1 - t/t')^2\}$, with t' being about $10^{-10}$ s, is shown in Fig. 7.1. The impedance was calculated by a Fourier transform of the time response. Figure 7.2 depicts the behaviour of the far field in the plane perpendicular to the antenna for both the time and frequency domains (again obtained by Fourier transforms).

Fig. 7.1. Centre-fed antenna with Gaussian source.

Fig. 7.2. Far field of antenna in Fig. 7.1.

Marching in time is an attractive procedure because it is relatively straightforward to evaluate the current from earlier values. However, the approximations performed to reach (7.15) bring in errors which may accumulate as time progresses so that the computed current begins to display instability (see §7.5 also). The matter has been examined thoroughly by Davies (1992b). She shows that the crucial factors are the radius a of the wire and d, the separation of consecutive mesh points on the wire; the length of the wire is immaterial to stability. For stability it is essential that d > 5a; for accuracy it is desirable that $d < \lambda/10$ where $\lambda$ is the smallest significant wavelength in the spectrum of any pulse. If these conditions are satisfied she demonstrates that the algorithms under consideration are stable, converge to the correct solution, and become more accurate as d is decreased. An example reveals that instability sets in if a is too large. If $nd = \lambda$ the conditions can be restated as d > 5a and the maximum frequency is (60/an) GHz when a is in millimetres.
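The two mesh conditions and the restated frequency limit lend themselves to a small helper. The following sketch simply encodes d > 5a, $d < \lambda/10$ and the bound 60/(an) GHz, which follows from $\lambda = nd > 5an$ with a wave speed of $3 \times 10^8$ m/s; the function names and sample values are hypothetical.

```python
def mesh_is_acceptable(a, d, wavelength):
    """Return (stable, accurate) for wire radius a, mesh spacing d and the
    smallest significant wavelength, all in the same length unit."""
    stable = d > 5.0 * a
    accurate = d < wavelength / 10.0
    return stable, accurate

def max_frequency_ghz(a_mm, n):
    """Upper frequency limit 60/(a n) GHz for radius a in millimetres and
    n mesh intervals per wavelength."""
    return 60.0 / (a_mm * n)

print(mesh_is_acceptable(0.001, 0.006, 0.1))  # 1 mm radius, 6 mm spacing, 10 cm wavelength
print(max_frequency_ghz(1.0, 10))             # 1 mm radius, ten intervals per wavelength
```

A thicker wire therefore lowers the highest frequency that can be treated both stably and accurately with a given number of intervals per wavelength.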

Exercises
2. Calculate the driving-point current of a centre-fed straight-wire antenna fed by a unit voltage step when l/2a = 74.2 and compare your results with those of Sayre and Harrington (1968).
3. A unit step of electric field is incident on the antenna of Exercise 2. Calculate the induced current and far field in the plane perpendicular to the wire.
4. Use the excitation of Fig. 7.1 to find how the input admittance of a V-antenna varies with frequency.
5. A circular ring is struck by a Gaussian pulse (§7.5). Find the radiated field in the time domain and show that the radar cross-section is a maximum when the wavelength is practically equal to the perimeter.
6. Choose a log-periodic antenna and find its frequency response by making a Gaussian pulse fall on it in the time domain.

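7.4 Perfectly conducting bodies

For perfectly conducting bodies a return to (7.2) and (7.3) is pertinent, but the volume integrals can now be removed. To shorten the writing, only the receiving antenna subject to the incident illumination $\mathbf{E}^i$, $\mathbf{H}^i$ will be discussed but the modifications to handle the transmission regime are straightforward. On the boundary S, define

$$\mathbf{n}_q \wedge \mathbf{H}(q, t) = -\mathbf{j}(q, t), \qquad -\varepsilon \mathbf{n}_q \cdot \mathbf{E}(q, t) = \rho(q, t). \tag{7.16}$$

The surface conservation of charge differs from (6.82) and now takes the form

$$\operatorname{Div} \mathbf{j}(q, t) + \frac{\partial \rho(q, t)}{\partial t} = 0. \tag{7.17}$$

The total field can now be expressed as

$$\mathbf{E}(P, t) = \mathbf{E}^i(P, t) + \int_S \left[ \mu \frac{\partial}{\partial t} \mathbf{j}(q, T) \Psi(P, q) - \left\{ \rho(q, T) + \frac{\partial}{\partial t} \frac{\rho(q, T)}{v\Psi} \right\} \frac{1}{\varepsilon} \operatorname{grad}_q \Psi(P, q) \right] dS_q, \tag{7.18}$$

$$\mathbf{H}(P, t) = \mathbf{H}^i(P, t) - \int_S \left\{ \mathbf{j}(q, T) + \frac{\partial}{\partial t} \frac{\mathbf{j}(q, T)}{v\Psi} \right\} \wedge \operatorname{grad}_q \Psi(P, q)\, dS_q \tag{7.19}$$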

when P is in $S_+$ (Fig. 6.15). The change of sign in (7.18) and (7.19) as compared with (7.2) and (7.3) is caused by the point of observation now being outside S. The requirement that the tangential component of the electric intensity be zero gives, when applied to (7.18), the EFIE

$$\mathbf{n}_p \wedge \int_S \left[ \mu \frac{\partial}{\partial t} \mathbf{j}(q, T) \Psi(p, q) - \left\{ \rho(q, T) + \frac{\partial}{\partial t} \frac{\rho(q, T)}{v\Psi} \right\} \frac{1}{\varepsilon} \operatorname{grad}_q \Psi(p, q) \right] dS_q = -\mathbf{n}_p \wedge \mathbf{E}^i(p, t). \tag{7.20}$$

It should be remarked that the difference in sign between (7.20) and (6.97) is caused by the boundary condition in Chapter 6 being (6.74), $\mathbf{n} \wedge \mathbf{E} = \mathbf{n} \wedge \mathbf{E}_0$, which is a reversal of the sign as compared with here. Annexed to the EFIE is (7.17). It would enable us to replace $\partial\rho(q, T)/\partial t$ by $-\operatorname{Div} \mathbf{j}(q, T)$ but one $\rho$ would still remain in (7.20). This awkwardness can be overcome by taking a time derivative of (7.20) after which $\rho$ can be completely eliminated by means of (7.17).

The MFIE may be constructed in the same way as in §6.14. Note that the second integrand of (7.19) is not discontinuous in crossing S because the $\Psi$ in the denominator renders it less singular than the first. We obtain

$$-\tfrac{1}{2} \mathbf{j}(p, t) + \mathbf{n}_p \wedge \int_S \left\{ \mathbf{j}(q, T) + \frac{\partial}{\partial t} \frac{\mathbf{j}(q, T)}{v\Psi} \right\} \wedge \operatorname{grad}_q \Psi(p, q)\, dS_q = \mathbf{n}_p \wedge \mathbf{H}^i(p, t). \tag{7.21}$$

Other integral equations are rarely considered because it is felt that the time-marching procedure already described for the thin wire prevents the occurrence of the uniqueness problems which are so obnoxious in the frequency domain. While this may be true theoretically it cannot be accepted unequivocally when the integral equation is approximated numerically (see next section).

The MFIE has two big advantages over the EFIE. Firstly, it is an integral equation of the second kind. Secondly, it does not have to be complemented by the continuity equation (7.17). Admittedly, this is evaded by the EFIE in time-derivative form but at the cost of second-order derivatives which will tend to engender less accuracy in any interpolation process than for the MFIE. The derivatives might, naturally, be taken outside the integral, thereby reducing the singularity of the integrand, and commuted to finite differences, but whether any improvement in accuracy will follow is dubious.

The MFIE further brings out with great clarity the step-by-step operation in time. Since the integral is a principal value the contribution of the self-patch (where there is virtually no retardation in time) may be neglected in the first place (cf. §6.19). Then (7.21) tells us that $\mathbf{j}$ is $-2\mathbf{n} \wedge \mathbf{H}^i$ corrected by an integral which depends only upon the past of $\mathbf{j}$. Thus the MFIE can be regarded as an updating formula for $\mathbf{j}$ starting from known initial values. The mechanism is generated as follows. At the instant the incident wave strikes the body, put $\mathbf{j} = 0$ in the integral so $\mathbf{j} = -2\mathbf{n} \wedge \mathbf{H}^i$ at the points of impact. At the next time step, replace $\mathbf{j}$ by 0 or $-2\mathbf{n} \wedge \mathbf{H}^i$ depending on whether the point of integration was not or was one of the points of impact previously. Then (7.21) supplies $\mathbf{j}$ at the new time step. The procedure can obviously be repeated and $\mathbf{j}$ is obtained at all times.

Should the wavefront of the incident field be such that a shadow region should be expected on the obstacle, there can be no disturbance in it until the creeping waves (§8.21) arrive. Up to this event the field in such a region should remain small. This observation provides one internal check at any rate on how well the updating formula is operating.
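The updating character of (7.21) can be seen in a deliberately crude scalar caricature, in which the surface integral is replaced by a weighted sum over earlier samples of a single unknown. This is only a sketch of the marching structure: the weights in `kernel` are hypothetical, the vector and spatial structure of the real equation is ignored, and the self-patch term is neglected so that the sum starts one step in the past.

```python
def march_mfie_toy(incident, kernel):
    """Scalar analogue of marching (7.21) in time.

    j[k] = -2*incident[k] + 2*sum_{m>=1} kernel[m-1]*j[k-m],
    so each new value needs only the incident field now and the past of j.
    """
    j = []
    for k, h_inc in enumerate(incident):
        past = sum(w * j[k - m]
                   for m, w in enumerate(kernel, start=1) if k - m >= 0)
        j.append(-2.0 * h_inc + 2.0 * past)
    return j

# A unit impulse of incident field followed by silence.
print(march_mfie_toy([1.0, 0.0, 0.0], [0.25]))
```

At the first step no past values exist and the result is just minus twice the incident sample, matching the starting rule described above; later values are driven entirely by the history.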

Exercise 7. Derive integral equations analogous to the EFIE and MFIE for two-dimensional scattering.

7.5 Numerical matters

Nothing much needs to be added to the discussion of interpolation in §7.3 other than remarking the further complication of an extra space variable. Notwithstanding, a rather more detailed examination of the choice of patch size and step length must be undertaken.

From a theoretical point of view, it would be ideal if the incident wave were a Dirac $\delta$ function because the response to any incident field could then be synthesized by a simple convolution. However, this is impracticable in numerical work, though it has been studied analytically (Jones 1967; Maystre 1987; King 1992); instead it is more common to work with a Gaussian pulse of the type

$$\mathbf{E}^i, \mathbf{H}^i \propto \frac{b}{\pi^{1/2}} \exp\left\{ -b^2 \left( t - \frac{x}{v} \right)^2 \right\}. \tag{7.22}$$

The right-hand side of (7.22) approaches the $\delta$ function as $b \to \infty$ and is tiny enough at sufficiently negative times to be regarded as negligible. The Fourier transform with respect to time of $(b/\pi^{1/2}) \exp(-b^2 t^2)$ is effectively $\exp(-\omega^2/4b^2)$. Roughly speaking, therefore, the solution to the pulse problem for (7.22) will supply solutions in the frequency domain for all $\omega$ less than 2b. Consequently, the larger b is, the more harmonic frequencies will be covered. Unfortunately, the effective width of the pulse in time is 2/b and fewer than five time intervals over this width will scarcely manage to attain the temporal resolution of the pulse. This means that the time interval should not exceed 2/5b. The inequality to allow for retardation will be met if the side of a patch is about 2v/5b. Any size much greater will, in any case, not resolve the spatial width of the pulse which is 2v/b. It follows from these considerations that the wavelengths in the frequency domain that will be adequately treated will not be less than four or five times the side of a surface patch.

As an illustration suppose that b is $10^{10}$ s$^{-1}$, which is a typical sort of value. The spatial width of the pulse is then about 0.1 m and its upper cut-off frequency about $10^9$ Hz. The above considerations suggest a time step of around $10^{-11}$ s and a patch side of 1 cm approximately. Thus, for a body 1 m across, roughly $10^4$ patches would be required. Unless the body were highly elongated or wire shaped, a wider pulse, with consequent improvement in step length and patch size, would probably be adequate. Although the guidelines are likely to be conservative, they indicate that there is a trade-off between the choice of b and the demands on the computer.

There is another aspect. If the width of the scatterer is $l_s$, the range of wave numbers that can be handled should be within $kl_s \le 2\pi l_s/l_p$ where $l_p$ is the spatial width of the incoming pulse. The right-hand side of the inequality is $2\pi$ for a pulse the width of the scatterer. A reduction in pulse width with the intention of extending the range of frequency response and of obtaining more accurate information on the scattering carries with it the concomitant of a proportionate increase in the number of time steps while the number of surface elements augments by a square law. Narrowing the pulse soon exposes the fact that high-frequency information has to be bartered for rapidly rising computational cost.

The Gaussian pulse is never actually zero away from infinity and this presents another slight difficulty. So long as b is reasonably large its rise is sufficiently quick to take all induced currents and charges as zero until the incident field has reached $10^{-5}$ of its maximum value. No criterion can be given for when the computation should cease since this depends on the response of the obstacle, which is frequently oscillatory.
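The rules of thumb in this section reduce to simple arithmetic in b and v, sketched below. The function name and dictionary keys are illustrative, and the patch count is estimated, as an assumption, by squaring the ratio of body size to patch side. For $b = 10^{10}$ s$^{-1}$ the unrounded figures are a time step of $4 \times 10^{-11}$ s, a patch side of 1.2 cm and a spatial pulse width of 6 cm, which the text quotes as order-of-magnitude values.

```python
def gaussian_pulse_guidelines(b, v=3.0e8, body_size=1.0):
    """Rule-of-thumb discretization for the Gaussian pulse (7.22).

    b is the pulse parameter in 1/s, v the wave speed in m/s and body_size
    the linear dimension of the scatterer in metres.
    """
    time_step = 2.0 / (5.0 * b)      # five intervals across the width 2/b
    patch_side = v * time_step       # about 2v/5b
    return {
        "pulse_width_time": 2.0 / b,
        "pulse_width_space": 2.0 * v / b,
        "omega_max": 2.0 * b,
        "time_step": time_step,
        "patch_side": patch_side,
        "patches": (body_size / patch_side) ** 2,
    }

for name, value in gaussian_pulse_guidelines(1.0e10).items():
    print(name, value)
```

Doubling b halves the time step and the patch side, so the number of time steps doubles while the patch count quadruples, which is the square law noted above.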

Fig. 7.3. The impulse response of a sphere.

One way of checking whether a program is performing adequately is by comparison with analytical results for one of the few shapes for which these are available or with experiment. The standard test-bed is the sphere and some results (Bennett and Weeks 1968) for the reaction to a Gaussian input are shown in Fig. 7.3. These are based on the MFIE. The corresponding backscatter in the frequency domain is compared in Fig. 7.4 with calculations from the exact solution in a Mie series. It is evident that the agreement is good for wave numbers which do not exceed 5/a. On the other hand, if the Mie series is converted to the time domain and the comparison made there, Fig. 7.5 is obtained. Again the agreement is highly satisfactory.

Fig. 7.4. Backscatter in the frequency domain of a sphere: curve, Mie series; points, time-domain calculation.

Fig. 7.5. Backscattering impulse response of a sphere: curve, Mie series converted to time domain; points, direct time-domain calculation.

These encouraging results turned out to be misleading. Subsequently, it became clear that, if marching in time is carried on long enough, the surface current obtained from the MFIE grows exponentially (Rynne 1985, 1986; Tijhuis 1984). Changing to the EFIE does not resolve the difficulty (Rynne and Smith 1990); indeed, Smith (1990) has reproduced the instability analytically for a sphere. The cause of the instability for the MFIE is interior magnetic resonance. Ideally, the interior resonances should not give trouble because they would be purely imaginary eigenvalues of the matrix equation. However, in the numerical approximation they are displaced slightly off the imaginary axis and, therefore, are potential sources of instability. That instability does originate from such a displacement was confirmed by the above authors through careful analysis of the exponential behaviour and current shape.

To counter the instability Tijhuis et al. (1989) proposed solution in the transform domain but Rynne suggested a simpler scheme. He pointed out that smoothing the current by averaging over three consecutive time steps effectively pushed the unstable source into a region where its influence was nullified. This modification is now commonly incorporated in time-marching schemes and works well in most situations; however, in some circumstances, it seems to delay the onset of instability rather than dispose of it completely (Davies 1992a).
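Averaging over three consecutive time steps can be sketched as a simple filter. The weights 1/4, 1/2, 1/4 used here are an assumption, since the text states only that three consecutive steps are averaged, and the filter is applied to a finished record for illustration whereas a marching scheme would apply it as the computation proceeds. The function name is hypothetical.

```python
def three_point_average(samples):
    """Replace each interior sample by (j[k-1] + 2*j[k] + j[k+1]) / 4.

    The end samples are left unchanged because they lack a neighbour.
    """
    if len(samples) < 3:
        return list(samples)
    smoothed = [samples[0]]
    for k in range(1, len(samples) - 1):
        smoothed.append(0.25 * (samples[k - 1] + 2.0 * samples[k] + samples[k + 1]))
    smoothed.append(samples[-1])
    return smoothed

# An oscillation that alternates in sign from one step to the next is removed
# in the interior, while a steady value passes through unchanged.
print(three_point_average([1.0, -1.0, 1.0, -1.0, 1.0]))
print(three_point_average([2.0, 2.0, 2.0, 2.0]))
```

With these weights the component that changes sign at every step, typical of a growing numerical instability, is suppressed while slowly varying currents are hardly affected.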

Exercises 8. Formulate time-dependent integral equations for two-dimensional scatterers. 9. Obtain the integral equation in the time domain analogous to those of §6.26 for dielectric obstacles. 10. Compute the scattering from a sphere and compare your calculations with Figs. 7.3-7.5. See if you can reproduce the instability described above. How are your results affected by Rynne's scheme? 11. A cone of 15° half-angle is terminated smoothly by a portion of spherical surface. Find the impulse response under an incident Gaussian pulse whose spatial width is the same as the length of the scatterer when the direction of the incoming radiation is (a) along the axis of the cone towards the cone, (b) along the axis of the cone towards the sphere, and (c) perpendicular to the axis of the cone. Use the MFIE and plot your results in the plane containing the axis of the cone and the direction of incidence in the form of Fig. 7.3. Compare the current induced on the obstacle with the position of the incident pulse at various times as it moves over the scatterer. 12. Repeat Exercise 11 for the case when the obstacle is a length of circular cylinder capped by a hemisphere at each end. 13. In Exercises 11 and 12 make a Fourier transform from the time domain and compare your results with those derived directly in the frequency domain. 14. Carry out the process of Exercise 12 but in the opposite direction, i.e. make a Fourier transform of the harmonic results and compare in the time domain. 15. Replace the structures in Exercises 11 and 12 by wire-grid models and see if these models give appreciably different results from those already obtained.

7.6 The harmonic approach versus the impulse response

The Fourier transform enables one to transfer from the time to the frequency domain and vice versa, but the choice of the domain in which to undertake the computations calls for delicate judgement. Naturally, if the input is a pulse and the immediate transient is wanted the time domain is the correct selection, but other cases are much less clear-cut.

TRANSIENT PHENOMENA

In harmonic waves a matrix inverse is determined at a given frequency. The inverse is independent of the incident field so that, once it has been found, a change of incident excitation can be dealt with in a simple manner. However, as soon as the frequency is changed, the matrix inverse has to be recomputed. In contrast, the pulse solution combined with a fast Fourier transform provides results over a wide frequency band but only for a single direction of excitation. Whenever the position of the source is altered the entire computation has to be repeated. Running counter to this disadvantage is the tendency for the pressure on computer storage to be lighter by a factor of 10 or more in the time domain than in the frequency domain, though the storage grows with the time interval being studied. The amount of computer time involved is a further consideration but the evidence is inconclusive. In some numerical experiments (Miller 1972) where the complete response was obtained the frequency domain required about five times as much time as the pulse approach for wires. Set against this is the behaviour for surfaces where the harmonic solution was faster by a factor of five than the time domain. Another observation was that the time domain was always worse than the frequency domain in monostatic calculations for both wires and surfaces. For the bistatic response the time domain was more efficient than the harmonic approach for wires, but the reverse was true for surfaces. In these confusing circumstances only the following broad guidance can be offered.

(i) For early transients and short times use the time domain.
(ii) If many incident fields at few frequencies are to be investigated employ the harmonic approach.
(iii) If few incident fields at many frequencies are needed work via incident pulses.
(iv) Consider using a finite difference method (§7.1) in the time domain with a Gaussian incident pulse and, if the frequency response is required, apply the FFT to the time response.

7.7 The Laplace transform

A classic method in analysis for tackling transients is the Laplace transform. Normally, the inversion back to the time domain is not dissimilar to the Fourier transform already mentioned in the preceding sections. However, the conjecture that the time response of a body might be represented as a series of exponentials led to the idea that a body might be characterized by the singularities in the complex frequency domain of the inverse Laplace transform (Baum 1971). Since these singularities are often simple poles and depend solely on the geometry of the obstacle they need be found only once. Their storage requirements are minimal and the computation of even broadband frequency response is relatively straightforward. As soon as the relation between the singularities and the radar pulse signature is known, either can be characterized by the other.
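The link between a series of exponentials in time and simple poles in the complex frequency plane can be sketched numerically. The pole and residue values below are hypothetical and chosen only for illustration; each pole is paired with its complex conjugate so that the time response is real.

```python
import numpy as np

# Hypothetical pole/residue data, for illustration only (Re s_n < 0: damped modes).
poles = np.array([-0.5 + 3.0j, -1.2 + 7.5j])
residues = np.array([1.0 - 0.4j, 0.3 + 0.2j])

def time_response(t, poles, residues):
    """Sum of damped exponentials: each pole and its conjugate give 2 Re{R exp(s_n t)}."""
    t = np.asarray(t, dtype=float)
    return 2.0 * np.real(np.exp(np.outer(t, poles)) @ residues)

def frequency_response(omega, poles, residues):
    """Transform evaluated on s = i*omega as a sum of simple-pole terms."""
    s = 1j * np.asarray(omega, dtype=float)[:, None]
    return np.sum(residues / (s - poles) + residues.conj() / (s - poles.conj()), axis=1)

# Cross-check at one frequency by direct numerical integration over t >= 0.
t = np.linspace(0.0, 60.0, 600001)
w = 2.0
y = time_response(t, poles, residues) * np.exp(-1j * w * t)
dt = t[1] - t[0]
numeric = dt * (y.sum() - 0.5 * (y[0] + y[-1]))      # trapezoidal rule
analytic = frequency_response([w], poles, residues)[0]
```

Only the short pole and residue lists need to be stored; the response at any frequency follows from a finite sum, which is the economy the paragraph above describes.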


Supporting theory for the idea has been forthcoming (Marin and Latham 1972; Tesche 1973; Marin 1973, 1974) and it is the aim of this section to explain the foundations. The Laplace transform $e$ of $E$ is defined by

\[ e(s) = \int_{-\infty}^{\infty} E(t)\, e^{-st}\, dt. \tag{7.23} \]
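As a quick check of the definition (7.23), the following sketch evaluates the transform of a causal decaying exponential numerically and compares it with the closed form $1/(s + a)$. The decay rate, the value of $s$, and the truncation of the integral are assumptions chosen for illustration.

```python
import numpy as np

def laplace_numeric(E, s, t_max=80.0, n=800001):
    """Trapezoidal estimate of the integral of E(t) exp(-s t) over 0 <= t <= t_max."""
    t = np.linspace(0.0, t_max, n)
    y = E(t) * np.exp(-s * t)
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

a = 1.5                        # hypothetical decay rate

def E(t):
    """Causal signal exp(-a t); it vanishes for t < 0, so the integral starts at 0."""
    return np.exp(-a * t)

s = 0.4 + 2.0j                 # Re s > -a, so the transform converges
approx = laplace_numeric(E, s)
exact = 1.0 / (s + a)          # closed form, with a simple pole at s = -a
```

The closed form continues analytically to the whole $s$ plane apart from the pole at $s = -a$, even though the defining integral converges only to the right of it.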

So long as $E$ is non-zero only for $t$ greater than some finite constant and is bounded by an exponential as $t$ becomes infinite, the Laplace transform exists as a regular function of the complex variable $s$ provided that $\Re s$ is greater than some constant $\sigma_0$. Then, for $\Re s > \sigma_0$, Maxwell's equations in a time-independent isotropic medium go over to

\[ \operatorname{curl} e + s\mu h = 0, \qquad \operatorname{curl} h - s\varepsilon e = 0 \tag{7.24} \]

when there are no sources. If a time-dependent solution is sought which is outgoing at infinity and is such that $n \wedge E = n \wedge E_0$ on the surface $S$ then (7.24) must be solved in such a way that $n \wedge e = n \wedge e_0$ on $S$. The condition at infinity is that $e$ and $h$ must decay at least exponentially there since $\sigma_0 > 0$. The problem is then very similar to that of §6.14 and the MFIE can be formulated as

-tj(p)

+ up

A

Is j(q)

A

=

Is {(uq.ho) grad,
Up A

grad,
eo
as,

(7.25)

where

\[ \phi(x, y) = \frac{\exp\{-s(\mu\varepsilon)^{1/2}|x - y|\}}{4\pi|x - y|}. \tag{7.26} \]
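A minimal sketch of the kernel (7.26) for complex $s$ follows. The free-space values of $\mu$ and $\varepsilon$ and the sample points are assumptions made for illustration.

```python
import numpy as np

MU0 = 4e-7 * np.pi            # permeability of free space (H/m), assumed medium
EPS0 = 8.8541878128e-12       # permittivity of free space (F/m), assumed medium

def phi(x, y, s, mu=MU0, eps=EPS0):
    """Kernel (7.26): exp(-s*sqrt(mu*eps)*|x - y|) / (4*pi*|x - y|) for complex s."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return np.exp(-s * np.sqrt(mu * eps) * r) / (4.0 * np.pi * r)

x, y = [0.0, 0.0, 0.0], [1.0, 2.0, 2.0]      # separation |x - y| = 3 m
static = phi(x, y, 0.0)                      # s = 0: static kernel 1/(4*pi*r)
harmonic = phi(x, y, 1j * 1e9)               # s = i*omega: pure phase factor
damped = phi(x, y, 1e9)                      # Re s > 0: exponential decay with distance
```

For $\Re s < 0$ the same expression grows with distance, which is the behaviour met later for the exterior modes.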

The quantity $n \cdot h_0$ can, of course, be expressed in terms of $n \wedge e_0$. The integral equation (7.25) holds for $\Re s > \sigma_0$ and it will be seen presently that this restriction guarantees a unique solution. Despite this asset we now wish to drop the restriction and consider the solution of (7.25) for any complex value of $s$. The proof of §6.16 that the integral operator is compact may be carried over and may also be adapted to show that the kernel associated with the square of the operator is square integrable. This observation enables an explicit solution of the integral equation to be written down. While this formula is not of great value in actual computation it is of tremendous theoretical importance and has some consequences which are of practical significance. To simplify the writing, matrix notation will be adopted and the integral equation cast into the form

\[ j(x) - \int C(x, y)\, j(y)\, dy = g(x) \tag{7.27} \]

where $g$ is a specified vector and $C$ is a known matrix. Explicit forms for $C$ and $g$ can be deduced from (7.25) but these are not necessary to the current discussion. In order to comply with the known properties of (7.25) it is assumed that the integral operator in (7.27) is compact and that $\int C(x, t)\, C(t, y)\, dt$ is square integrable. The first iteration of (7.27) is

\[ j(x) - \int H(x, y)\, j(y)\, dy = g(x) + \int C(x, y)\, g(y)\, dy \tag{7.28} \]

with $H(x, y) = \int C(x, t)\, C(t, y)\, dt$. An integral equation of the type (7.28) can always be converted to a single scalar equation (Goursat 1942). For our purposes it will be sufficient to demonstrate the technique in two-dimensional space, denoting components by subscripts. Suppose (7.28) holds on the set $S$; then obviously a constant vector $x_0$ can be chosen such that the set $S_0$ formed by $x + x_0$, $x \in S$, is disjoint from $S$. Define the functions $j'$ and $K$ on the union of $S$ and $S_0$ by

\[ j'(x) = \begin{cases} j_1(x) & (x \in S) \\ j_2(x - x_0) & (x \in S_0), \end{cases} \]

\[ K(x, y) = \begin{cases} H_{11}(x, y) & (x \in S,\ y \in S) \\ H_{12}(x, y - x_0) & (x \in S,\ y \in S_0) \\ H_{21}(x - x_0, y) & (x \in S_0,\ y \in S) \\ H_{22}(x - x_0, y - x_0) & (x, y \in S_0). \end{cases} \]

Similarly, if the right-hand side of (7.28) be signified by $G$, specify $G'$ by

\[ G'(x) = \begin{cases} G_1(x) & (x \in S) \\ G_2(x - x_0) & (x \in S_0). \end{cases} \]

Then (7.28) transforms to a scalar integral equation on $S \cup S_0$, namely

\[ j'(x) - \int K(x, y)\, j'(y)\, dy = G'(x), \]

the integration now being over $S \cup S_0$. This scalar equation is exactly equivalent to (7.28) because $S$ and $S_0$ are disjoint. Since $K$ is square integrable, standard Fredholm theory (Cochran 1972) allows one to say that, when the solution of

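(7.28) is unique,

\[ j'(x) = G'(x) + \int \Gamma(x, y)\, G'(y)\, dy \tag{7.29} \]

where the resolvent $\Gamma$ is given by

\[ \Gamma(x, y) = D(x, y)/D. \]

The number $D$ is defined by $D = \sum_{n=0}^{\infty} d_n$ where $d_0 = 1$ and

\[ d_n = \frac{(-1)^n}{n!} \int \cdots \int \begin{vmatrix} 0 & K(t_1, t_2) & \cdots & K(t_1, t_n) \\ K(t_2, t_1) & 0 & \cdots & K(t_2, t_n) \\ \vdots & \vdots & \ddots & \vdots \\ K(t_n, t_1) & K(t_n, t_2) & \cdots & 0 \end{vmatrix}\, dt_1 \cdots dt_n \]

for $n \ge 1$. The function $D(x, y)$ has the structure

\[ D(x, y) = \sum_{n=0}^{\infty} D_n(x, y) \]

where $D_0(x, y) = K(x, y)$ while, for $n \ge 1$,

\[ D_n(x, y) = d_n K(x, y) + \int D_{n-1}(x, t)\, K(t, y)\, dt = d_n K(x, y) + \int K(x, t)\, D_{n-1}(t, y)\, dt. \]

In addition, the resolvent satisfies

\[ \Gamma(x, y) = K(x, y) + \int K(x, t)\, \Gamma(t, y)\, dt \tag{7.30} \]

\[ \phantom{\Gamma(x, y)} = K(x, y) + \int \Gamma(x, t)\, K(t, y)\, dt. \tag{7.31} \]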
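The resolvent relations can be checked on a discretised equation. The sketch below uses a simple midpoint (Nystrom) rule on the interval [0, 1]; the kernel, the right-hand side, and the grid size are hypothetical choices for illustration, and the resolvent is obtained by a linear solve rather than from the determinant series.

```python
import numpy as np

# Midpoint-rule discretisation of  j(x) - integral K(x, y) j(y) dy = G(x)  on [0, 1].
n = 200
x = (np.arange(n) + 0.5) / n          # quadrature nodes
w = 1.0 / n                           # equal quadrature weights

# Hypothetical smooth kernel and right-hand side, for illustration only.
K = 0.6 * np.exp(-np.abs(x[:, None] - x[None, :]))
G = np.sin(np.pi * x)

I = np.eye(n)
Kw = K * w                            # matrix of the integral operator
Gamma = np.linalg.solve(I - Kw, K)    # discrete resolvent kernel

j_direct = np.linalg.solve(I - Kw, G)         # direct solution of the equation
j_resolvent = G + (Gamma * w) @ G             # solution written as in (7.29)

res_730 = Gamma - (K + Kw @ Gamma)            # residual of (7.30)
res_731 = Gamma - (K + (Gamma * w) @ K)       # residual of (7.31)
```

Both residuals vanish to rounding error and the two solutions agree, as long as the discrete operator `I - Kw` is invertible, i.e. away from the zeros of $D$.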

Any solution of (7.27) satisfies (7.28). Therefore, when the solution of (7.28) is unique, (7.27) has only a single solution and it is given by (7.29). If it is known only that the solution of (7.27) is unique it cannot be deduced that (7.28) has a unique solution. What the uniqueness of (7.27) does imply is that any solution of the homogeneous form of (7.28) must satisfy

\[ j(x) + \int C(x, y)\, j(y)\, dy = 0 \tag{7.32} \]

because the left-hand side is obliged to be a solution of the homogeneous form of (7.27). It is only when (a) there is no solution of (7.32) other than the trivial one and (b) the solution of (7.27) is unique that it can be asserted that (7.28) has uniqueness with (7.29) being the sought solution. Verification that in the scattering problem the uniqueness of (7.27) enforces $j \equiv 0$ as the solution of (7.32) will come later.

The kernel $K$ which originates from (7.25) contains the parameter $s$, and the behaviour of the resolvent as $s$ varies in the complex plane is of interest. Results from the customary theory of integral equations cannot be taken over straight away because the parametric dependence in the conventional theory is through multiplication by a complex constant whereas $C$, $H$, and $K$ involve $s$ via an exponential factor which also includes the variable of integration. $H$ is a matrix whose elements are regular functions of $s$ in the complex $s$ plane and so $K$ has the same regularity. Therefore, $d_n$ and $D_n(x, y)$ depend upon $s$ in a regular manner. Hence, $D$ and $D(x, y)$, being composed of uniformly convergent series of regular functions, are regular. The implication is that $D$ and $D(x, y)$ are entire functions in the $s$ plane so that the only possible singularities of $\Gamma$ are poles at zeros of $D$.

It will first be shown that the solution of (7.28), or equivalently (7.27) in the scattering problem, is not unique at $s = s_n$ if and only if $D = 0$ at $s = s_n$. In the course of the proof it will be helpful to indicate the dependence of a quantity on $s$ explicitly; for example, $d_n(s)$ will be written in place of $d_n$ on such occasions. Now, it is readily checked from the defining formulae that

\[ (n + 1)\, d_{n+1} = -\int \{D_n(x, x) - d_n K(x, x)\}\, dx \]

with which the special formula $d_1 = 0$ is in harmony. Hence

\[ \int \{\Gamma(x, x) - K(x, x)\}\, dx = -\sum_{n=0}^{\infty} (n + 1)\, d_{n+1} \Big/ D. \tag{7.33} \]

Consider now the series $\sum_{n=0}^{\infty} d_n(s_n)(s/s_n)^n$. It is certainly convergent for $|s/s_n| \le 1$ and approaches $D$ as $s \to s_n$ from $|s/s_n| < 1$. Therefore, it behaves like $(s - s_n)^m$ ($m \ge 1$) where $m$ is the order of the zero at $s_n$. The derivative with respect to $s$ commands like properties and so has the form $m(s - s_n)^{m-1}$ near $s_n$. Consequently, the right-hand side of (7.33) has a simple pole at $s = s_n$. However, $K$ has no singularity in the $s$ plane. Therefore $\Gamma$ cannot be regular at $s = s_n$ and so (7.29) is unable to provide a unique solution.

Conversely, suppose that the homogeneous integral equation has a non-trivial solution at $s = s_n$ but $\Gamma$ is regular there. By the second part of Theorem 6.15 there is a non-trivial solution $j_1$ of the homogeneous adjoint equation at $s = s_n$. Let $s$ be near $s_n$ and consider the integral equation

\[ j(x) - \int K(x, y, s)\, j(y)\, dy = j_1(x), \tag{7.34} \]

the dependence of the kernel on $s$ being explicitly noted. Equation (7.34) has a unique solution which can be expressed in the form (7.29); if this form be substituted in (7.34) we have the identity

\[ j_1(x) + \int \Gamma(x, y, s)\, j_1(y)\, dy - \int K(x, y, s) \left\{ j_1(y) + \int \Gamma(y, t, s)\, j_1(t)\, dt \right\} dy = j_1(x). \]

Because of the assumed regularity of $\Gamma$, the limit can be taken as $s \to s_n$ and then $j_1(x) + \int \Gamma(x, y, s_n)\, j_1(y)\, dy$ is a solution of (7.34) with $s = s_n$. However, that is contrary to Theorem 6.15 which declares that the right-hand side of (7.34) must be orthogonal to $j_1$ when $s = s_n$ for a solution to exist. Hence $\Gamma$ cannot be regular at $s = s_n$ and the converse has been proved.

Thus the zeros of $D$ or, equivalently, the poles of $\Gamma$ identify completely the values of $s$ for which (7.27) is non-unique. The zeros of $D$ will be isolated points unless $D$ is identically zero for all $s$. If $D$ were identically zero for all $s$ there would not be a unique solution for any $s$, in contradiction to what has been proved already in §6.16 for the static kernel. The conclusion to be drawn is that $D$ possesses only isolated zeros. The general theory of §6.15 tells us that at a zero of $D$ there is only a finite number of linearly independent solutions of the homogeneous equations, i.e. for $s = s_n$ the independent eigenelements are finite in number.

There is a connection between the eigenelements and the order of the pole in $\Gamma$. Let $m$ be the order of the pole so that when $s$ is near $s_n$

\[ \Gamma(x, y) = \frac{u_m(x, y)}{(s - s_n)^m} + \frac{u_{m-1}(x, y)}{(s - s_n)^{m-1}} + \cdots \]

where $u_m, u_{m-1}, \ldots$ are independent of $s$. Substitution in (7.30) gives

\[ \frac{u_m(x, y)}{(s - s_n)^m} + \frac{u_{m-1}(x, y)}{(s - s_n)^{m-1}} + \cdots = K(x, y, s) + \int K(x, t, s) \left\{ \frac{u_m(t, y)}{(s - s_n)^m} + \cdots \right\} dt \]

where the dependence of $K$ on $s$ has been explicitly indicated. Equating the dominant terms on either side we obtain

\[ u_m(x, y) = \int K(x, t, s_n)\, u_m(t, y)\, dt. \]

Thus, for fixed $y$, $u_m$ must be an eigenelement of $K$ at $s = s_n$. The next-order terms give

\[ u_{m-1}(x, y) = \int \left\{ K(x, t, s_n)\, u_{m-1}(t, y) + u_m(t, y)\, \frac{\partial K(x, t, s_n)}{\partial s_n} \right\} dt \tag{7.34a} \]

if $m > 1$. If $m = 1$, $K(x, y, s_n)$ must be added to the right-hand side. According to Theorem 6.15, $u_{m-1}$ can exist for $m > 1$ only if the term involving $u_m$ in (7.34a) (regarded as a function of $x$) is orthogonal to all the solutions of the adjoint equation. In other words, if $v$ is any solution of

\[ v(x) = \int K^*(t, x, s_n)\, v(t)\, dt \tag{7.35} \]

it is necessary for $m > 1$ that

\[ \int v^*(x) \int u_m(t, y)\, \frac{\partial K(x, t, s_n)}{\partial s_n}\, dt\, dx = 0. \tag{7.36} \]

Conversely, if (7.36) holds and $m = 1$, then $u_0$ can exist only if

\[ \int v^*(x)\, K(x, y, s_n)\, dx = 0 \]

which implies from (7.35) that $v \equiv 0$. However, this contradicts what has been proved above, namely that a pole of $\Gamma$ entails non-trivial solutions of the homogeneous integral equation and hence of the adjoint. It follows that $m > 1$. Thus (7.36) is a necessary and sufficient condition for $m > 1$.

If all the poles of the resolvent are simple the kernel is said to be semi-simple. Clearly, (7.36) must fail for a semi-simple kernel. Since $u_m$ itself is an eigenelement of $K$, (7.36) can be cast into a slightly simpler form and we can state the following: A kernel is semi-simple if and only if there are eigenelements $u$, $v$ of the integral equation and its adjoint respectively such that

\[ \int v^*(x) \int u(t)\, \frac{\partial}{\partial s_n} K(x, t, s_n)\, dt\, dx \ne 0 \tag{7.37} \]

for every $s_n$ for which the kernel possesses eigenelements. Since the resolvent of the adjoint is the adjoint of the resolvent of the original equation, we could interchange the roles of the adjoint and original in determining whether a kernel is semi-simple. However, the criterion takes exactly the same form as (7.37). We may note in passing that when the dependence of $K$ on $s$ is linear, i.e. $K(x, y, s) = sK_0(x, y)$, as in the classical eigenvalue problem though not in the scattering integral equation, $\partial K/\partial s = K_0$. Then (7.37) reduces to

\[ \int v^*(x)\, u(x)\, dx \ne 0. \tag{7.38} \]
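A finite-dimensional analogue may help to fix the idea behind (7.38). In the sketch below the "kernel" is $sK_0$ with $K_0$ a 2 x 2 matrix; both example matrices and the step sizes used to estimate the pole order are hypothetical choices for illustration. The pole order of $(I - sK_0)^{-1}$ at $s_n = 1$ is estimated from how fast its norm grows as $s \to s_n$.

```python
import numpy as np

def pole_order(K0, s_n, d1=1e-2, d2=1e-3):
    """Estimate the order of the pole of (I - s K0)^(-1) at s_n from its growth rate."""
    I = np.eye(K0.shape[0])

    def size(d):
        return np.linalg.norm(np.linalg.inv(I - (s_n + d) * K0), 2)

    return np.log(size(d2) / size(d1)) / np.log(d1 / d2)

K_diag = np.array([[1.0, 0.0], [0.0, 0.5]])   # diagonalisable example
K_jord = np.array([[1.0, 1.0], [0.0, 1.0]])   # Jordan block (defective) example

u = np.array([1.0, 0.0])        # K0 u = u for both matrices, so s_n = 1
v_diag = np.array([1.0, 0.0])   # adjoint eigenvector: K_diag^T v = v
v_jord = np.array([0.0, 1.0])   # adjoint eigenvector: K_jord^T v = v

print(np.vdot(v_diag, u), pole_order(K_diag, 1.0))   # non-zero product, order near 1
print(np.vdot(v_jord, u), pole_order(K_jord, 1.0))   # zero product, order near 2
```

The Jordan block is the matrix counterpart of a kernel that is not semi-simple: the product of the eigenvector with the adjoint eigenvector vanishes and the pole is double.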

Hence the existence of eigenelements satisfying (7.38) guarantees that a kernel is semi-simple in the classical eigenvalue problem (Lalesco 1912). If the kernel is not semi-simple, the application of the operator $I - K_0$ to (7.34a) reveals that

\[ (I - K_0)^2 u_{m-1} = 0. \]

A solution of such an equation is often known as a generalized eigenelement. If (7.38) is valid there can be no generalized eigenelements. To put it another way, a semi-simple kernel does not possess generalized eigenelements in the classical eigenvalue problem; the Riesz number is 0 or 1.

Further properties of $u_m$ may be derived by working with the adjoint. It is then evident that

\[ u_m(x, y) = \sum_r a_r\, u_r(x)\, v_r^*(y) \]

where $u_r$ and $v_r$ are eigenelements of the original and adjoint equations respectively. Hence, for $s$ near $s_n$ when the kernel is semi-simple

\[ j'(x) = G'(x) + (s - s_n)^{-1} \sum_r a_r\, u_r(x) \int G'(y)\, v_r^*(y)\, dy. \tag{7.39} \]

However,

\[ \begin{aligned} \int G'(x)\, v_r^*(x)\, dx &= \int \left\{ j'(x) - \int K(x, y, s)\, j'(y)\, dy \right\} v_r^*(x)\, dx \\ &= \int \left[ \int \{K(x, y, s_n) - K(x, y, s)\}\, j'(y)\, dy \right] v_r^*(x)\, dx \\ &\approx -(s - s_n) \int\!\!\int \frac{\partial}{\partial s_n} K(x, y, s_n)\, j'(y)\, dy\, v_r^*(x)\, dx. \end{aligned} \tag{7.40} \]

The $u_r(x)$ can be arranged to be orthonormal. When this has been done the $v_r$ can be selected so that they are orthogonal to $\int (\partial/\partial s_n) K(x, y, s_n)\, u_p(y)\, dy$ unless $p = r$ when a non-zero result ensues because the kernel is semi-simple. Then, if (7.40) be substituted in (7.39) and the coefficients of $u_r$ compared, we discover that consistency demands that

\[ a_r = -1 \Big/ \int\!\!\int \frac{\partial}{\partial s_n} K(x, y, s_n)\, u_r(y)\, dy\, v_r^*(x)\, dx. \tag{7.41} \]
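The coefficient formula (7.41) can be tested on a matrix analogue in which the kernel depends on $s$ through an exponential, as the scattering kernel does. The 2 x 2 matrix $M$ and the form $K(s) = e^{-s}M$ are hypothetical choices for illustration; all quantities are real here, so complex conjugates reduce to plain transposes.

```python
import numpy as np

# Hypothetical 2x2 "kernel" with exponential dependence on s: K(s) = exp(-s) M.
M = np.array([[2.0, 1.0], [0.5, 3.0]])

def K(s):
    return np.exp(-s) * M

def dK(s):
    return -np.exp(-s) * M                 # derivative of K with respect to s

lam_r, U = np.linalg.eig(M)                # right eigenvectors: M u = lam u
lam_l, V = np.linalg.eig(M.T)              # left eigenvectors:  M^T v = lam v
k = np.argmax(lam_r.real)
lam = lam_r[k].real
u = U[:, k].real
u = u / np.linalg.norm(u)                  # normalised eigenelement
v = V[:, np.argmin(np.abs(lam_l - lam))].real

s_n = np.log(lam)                          # exp(-s_n) * lam = 1, so I - K(s_n) is singular
a = -1.0 / (v @ dK(s_n) @ u)               # coefficient as in (7.41)
predicted = a * np.outer(u, v)             # leading coefficient a u v^T of the pole term

ds = 1e-6
Gamma = np.linalg.solve(np.eye(2) - K(s_n + ds), K(s_n + ds))   # resolvent near the pole
observed = ds * Gamma                      # (s - s_n) * Gamma should approach a u v^T
```

The agreement between `observed` and `predicted` does not depend on how the adjoint eigenvector is scaled, because any scale factor in `v` cancels between the coefficient and the outer product.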

A similar, but more complicated, route may be followed to determine the coefficient when the kernel is not semi-simple.

7.8 The location of the poles

Having described the general theory we want to show now that (7.27) and (7.32) go together as far as uniqueness is concerned in the scattering context, so that


the only singularities of the resolvent are poles. In doing so some information about the location of the poles will appear. The homogeneous form of (7.25) states that $n \wedge h_- = 0$ on $S$. By taking the complex conjugate we discover that if $s_n$ is a pole so is $s_n^*$. It will therefore be sufficient to limit consideration to $\Im s \ge 0$. Now

\[ 2\Re \int_S (n \wedge h_- \cdot e_-^*)\, dS = (s + s^*) \int_{S_-} (\varepsilon |e|^2 + \mu |h|^2)\, dx. \]

Therefore, the left-hand side can be zero only if either $\Re s = 0$ or the electromagnetic field is identically zero in $S_-$. In the latter case $n \wedge e_- = 0$ on $S$ and hence $n \wedge e_+ = 0$ since the tangential components of the representation of $e$ in terms of $j$ are continuous across $S$. If $\Re s \ge 0$ the field will be radiating at infinity ($\Re s = 0$) or decaying exponentially ($\Re s > 0$). In either case the standard exterior uniqueness theorem obliges the electromagnetic field to be identically zero in the exterior $S_+$. Therefore $n \wedge h_+ = 0$ and hence $j = 0$. Hence it has been demonstrated that (7.25) is non-unique only when either the field produced by $j$ in $S_-$ is non-zero and $\Re s = 0$ or the field in $S_-$ is identically zero and $\Re s < 0$. The former corresponds to interior modes of magnetic resonance whereas the latter are exterior modes of electric resonance since $n \wedge e_+ = 0$. On account of $\Re s < 0$ the exterior modes grow at infinity rather than diminish. Both types of mode exist for the sphere and so must be allowed for in scattering by general obstacles. Consider now the electromagnetic field in which

Is j' grad, c/J(P, q) dS b(P) = Is {esj' c/J(P, q) + s~ Div J' grad, c/J(P, q)} dS e(P)

=

A

q,

(7.42) q•

If j' is the tangential component of the electric field in an interior magnetic mode e vanishes identically in S+ by the discussion after (6.106). Then D A e , = 0 and

ti' + Dp A

L

j' A grad, c/J(p, q) ss,

= O.

(7.43)

If (7.42) is used as the representation of the exterior electric mode then (7.43) also results on account of the boundary condition. Thus with each solution of the homogeneous form of (7.25) can be associated a solution of (7.43), the j' being the tangential component of the interior electric field of the magnetic mode (9ts = 0) and of (7.42) (als < 0). We wish to show that the reverse is also true so that (7.25) and (7.43) (and thereby (7.27) and (7.32» stand or fall together in the matter of uniqueness.


The dimension of the space of solutions of (7.43) is the same as that of its adjoint which, as has been seen in §6.14, has elements n 1\ j •. Therefore (7.25) and (7.43) have the same dimension. Hence the solutions of the homogeneous form of (7.25) are not one-to-one with those of (7.43) only if there is one such that the electric field produced by it satisfies n 1\ e _ = 0 when ~s = o. However, n 1\ e, = 0 with the consequence n 1\ h, = 0 from which follows j = O. Hence the relationship must be one-to-one and the case 9ls = 0 has been dealt with. When fits < 0 let the field be defined by (7.42) with j' satisfying (7.43). Then, from (6.86)

hOD =

-1

{(n

A

h)

A

grad, eJ>(~, q)

+ sen A

+

Div(n

(n

A

h)

A

grad,

1\

e_) grad, cP(~, SJL

for ~ in S_. However, by virtue of (7.43), and (7.44) indicates that when s = s;

1

e_eJ>(~, q)

- 0 1\

q)} as,

(7.44)

e., = j'. Comparison of (7.42)

eJ>(~, q) as, = 0

(7.45)

Allowing ~ to tend to a point of S we deduce that n 1\ b satisfies (7.25). Remark that if n 1\ b = 0 then n 1\ e., = 0 so that j' = o. Once again, the one-to-one correspondence has been established. The kernel of (7.25) is semi-simple according to (7.37) if, for each 5,., there is a solution of the homogeneous version of (7.25) and a solution of (7.43) such that

Suppose that 9lS,. = O. Then, from the foregoing analysis, one j' can be associated with the electric intensity of the interior magnetic mode generated by j. Therefore, (7.46) will be met for 9lS,. = 0, if it can be shown that

fs

ab_ as

o.-

1\

e _ dS =/;

o.

(7.47)

This will be proved by assuming the contrary, i.e. it will be demonstrated that, when the left-hand side of (7.47) is zero, the field must vanish identically. Since

$n \wedge h_- = 0$, the assumption gives

Oh_ oe_} fs n. {- os e _ - h_ os dS = {e·(S8 oeos + 8e) + ah. SJlh - ae. see s: os os

o=

=

A

A -

f

1-

- h. (SJl ah

os

+ Jlh)} dx

(ee,e - Jlh.h) dx

(7.48)

from the divergence theorem and (7.24), together with the derivative with respect to s. The theory of§3.5 informs us that in any interior mode of oscillation e may be taken as purely real and h as purely imaginary. Hence (7.48) implies that e and h are identically zero. Thus the failure of (7.47) entails the disappearance of the electromagnetic field. Accordingly, (7.47) and thereby (7.46) must hold for a magnetic mode. In other words, all the poles of the resolvent which have 91s" = 0 are simple. The exterior electric mode is relevant when 9ls" < 0 and then the analogue of (7.47) is

f s

n .h ,

A

oe+

-dS #- O. OS

(7.49)

No proof has so far been forthcoming as to whether or not (7.49) is true for general obstacles. It is certainly valid for the sphere and one may conjecture that it has universal veracity. From now on it will be assumed that the scattering kernel is semi-simple. Then advantage may be taken of (7.41). Let j, be an orthonormal set of eigenelements of (7.25) when s = s; and select associated solutions j~ of (7.43) satisfies the adjoint of (7.25» such that (remember n A

j::

(Jle)1/21 j~(p).1 jr(q) /\ grad, exp{ -Sn(J.te)1/2I x p - xql} dSq dSp

=0

if m #- r but is non-zero if m = r. More briefly this may be expressed in inner product notation as .'*) -- 0 (AJ"· Jm (r :F m)

(r

= m)

with an obvious significance for the operator A. If the right-hand side of (7.25) be denoted by ji the solution of (7.25) may be expressed as

j(p) = -2j;(p)

-1

[(p, q).j;(q) dSq

(7.50)

where the resolvent [ is a dyadic. The dyadic has only simple poles in the s plane


and near a pole s

= s; r(

- p,q

)

= (s -

S n

)-1 ~ j,(p){n q /\ j;(q)} c: (A . . '*) r J" J,

(7.51)

on account of (7.39) and (7.41). Since .[ is meromorphic a representation valid in the whole s plane could be constructed for .[ by Mittag-Leffler's theorem (Goursat 1942) but that is unnecessary for our purposes. Note that the effect of (7.51) is to give a term in j, via the coupling between the incident field and the interior tangential electric field accompanying the natural modes of oscillation. If e is the field of (7.42) and ji = n /\ hh the coefficient is proportional to n /\ e, • hi dS. Now, if hi has no sources inside S this is the same as n /\ e.. h dS at s = Sn' This last integral vanishes if is pure imaginary because then n /\ h = O. Thus the residue of disappears at any pole corresponding to an interior magnetic resonance when the incident field comes from outside S. In other words that is no coupling between the exterior incident field and the magnetic field inside at frequencies of interior resonance. Before leaving this theoretical investigation we derive some properties which are valuable when making deformations in the complex s plane. For a surface vector g define f by

Is

Is

f(p)

s;

r

= -!g(p) + Dp 1\ = -!g(p) +

Is g(q)

1\

grad, c/>(p, q) as,

Tg

in the notation of §6.16. According to §6.16, T can be split as T1 + T2 where T2 comes from a domain of S of diameter 2b surrounding p and, for any fixed s, b can be chosen small enough for IIT2g112 ~ 8111g\l2 for any 81 > O. The proof of this statement is unaffected if ~S is replaced by any larger value. Therefore, as f!is -+ 00, we first choose ~ so that II T2 g 11 2 ~ 8fllg\l2 and then make ~s so large that IITlgl/ 2 ~ efllgl1 2. Evidently, as ~S -+ 00,

- t1 +

Further

11(1- 2T)-lg - gil so long as

00

~

L

r= 1

T -+

(211 TglI)' ~

4e 111gl1 < 1. Allowing

8 1 -+

00

L

,= 1

-

t1.

(7.52)

(4e 111gll)' ~ 48 1\1gll(1 - 4e 111g1D- 1

0 we see that

(-t1 + tv : -+ -t1 (7.53) as Bls -+ 00. The behaviour as lsi -+ 00 in general needs more elaborate analysis. Let g be continuous on S and let the maximum value of Igl attained on S be denoted by maxlg], Then, for fixed ~ ( < 1), there is a finite K such that IT1gI ~ K exp{lsld(jl8)1/2} maxlg]


where d is the maximum separation between any pair of points of S. On the other hand, (6.114) implies that IT2KI ~ K~ exp{lsld(jle)1/2} maxlg],

Thus, as lsi -+

00.

(7.54) Now considering only the variations of T with s, we observe that, for given p, f is an entire function of s which, by virtue of (7.54), is of exponential type and of order 1. Hence, by the properties of the minimum modulus (Titchmarch 1934), there are, for given p, arbitrarily large circles of radius lsi on which

+ ,,)lsl(jle)1/2} maxlg]

If(p)1 > exp{ -(d for any" >

o. Writing g = (-t + T)-lf this may be expressed as I( -t + T)-lfl < exp{(d + ,,)lsl(pe)1/2}lf(p)1

(7.55)

on some circles of arbitrarily large radius. Broadly speaking, (7.55) may be interpreted as stating that [is meromorphic and of order 1, its growth at infinity being dictated by the largest distance between any brace of points of S. Meromorphic functions of order 1 can be factorized as the ratio of entire functions of order 1 but that fact will not be needed here. On the basis of the method of moments Wilton (1981) has conjectured that, for convex bodies, there are contours progressing to infinity on which [(p, q)

"'-I

"'-I

R+ exp{ -slx p

-

xql(JLe)1/2}

R_ exp [slx, - Xql(Jle)1/2}

(Rs-+oo)

(7.56)

(Rs -+ - (0)

(7.57)

where R+ and R_ have algebraic growth at most at infinity. Obviously in Rs < 0 the contours must dodge the poles of [. Bearing in mind (7.30) we can see that iteration of the kernel will verify (7.56). As regards Rs < 0 consider what happens when the right-hand side of (7.57) is substituted for r. The integrand contains an exponential with exponent s{Jle)1/2{lx t - xql - Ii"p - x.l] but no other factor with exponential behaviour. When lsi is large the integral may be estimated by the method of steepest descent (Jones 1982, 1986). As x, moves over the body the exponent is stationary at x, = x p and x, = x q , the convexity of the body being presumed. The exponential contribution from x, = x q has exponent -slx p - xqIVte)1/2 which is the same as that of the kernel; the remainder of the kernel can be recovered by adjusting R_ suitably. On the other hand, the contribution from x, = x p balances I. Accordingly, it has been demonstrated that (7.56) and (7.57) are valid for convex bodies. They may hold for other shapes but the above argument suggests that this is not so in general since one can envisage boundaries where other stationary points have to be included in the asymptotic estimation.


7.9 The impulse response

Suppose that the antenna is acting as a perfectly conducting receptor under the influence of the illumination $e_0$, $h_0$ from outside. Then the field outside may be expressed as

\[ h(P) = h_0(P) - \int_S j(q) \wedge \operatorname{grad}_q \phi(P, q)\, dS_q \tag{7.58} \]

leading to the integral equation

\[ -\tfrac{1}{2} j(p) + n_p \wedge \int_S j(q) \wedge \operatorname{grad}_q \phi(p, q)\, dS_q = n_p \wedge h_0(p). \tag{7.59} \]

The same equation could be obtained from (7.25) by the device of §6.14, relabelling $j + n \wedge h_0$ as $j$ and then changing the sign of $h_0$. From the theory of the preceding section the solution of (7.59) is, from (7.50),

\[ j(p) = -2 n_p \wedge h_0(p) - \int_S \boldsymbol{\Gamma}(p, q) \cdot n_q \wedge h_0(q)\, dS_q \tag{7.60} \]

which enables the determination of the total field from (7.58). The integral in (7.58), indeed, supplies an expansion in terms of the natural modes of the body but only those representing the exterior electric modes arise in view of the remarks of the preceding section after (7.51). Typical conduct emerges when the incident wave is plane and impulsive, its time dependence being decreed by a $\delta$ function. Naturally, the performance for other incident fields can then be deduced by convolution. To focus thoughts let $H_0 = l_0\, \delta\{t - t_0 - (\mu\varepsilon)^{1/2} x\}$ where $l_0$ is a constant vector. Forcing the plane wave to travel along the $x$ axis causes no loss of generality since the orientation of the axes is at our disposal. Moreover, convolution of the resulting time-dependent current with $f(t_0)$ supplies the behaviour under more general excitation, the incident wave then being

\[ l_0 \int_{-\infty}^{\infty} f(t_0)\, \delta\{t - t_0 - (\mu\varepsilon)^{1/2} x\}\, dt_0 = l_0 f\{t - (\mu\varepsilon)^{1/2} x\}. \]
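The passage from an impulse response to the response under a general pulse shape $f$ is a convolution, which can be sketched on a sampled time axis. The impulse response (a single delayed, damped oscillation), the Gaussian pulse, and the time step below are hypothetical and serve only to illustrate the operation.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 20.0, dt)

# Hypothetical impulse response: one damped oscillation switched on at a delay t_d.
t_d = 2.0
impulse_response = np.where(
    t >= t_d, np.exp(-0.8 * (t - t_d)) * np.cos(5.0 * (t - t_d)), 0.0
)

# Hypothetical incident pulse shape f: a Gaussian centred at t = 3.
f = np.exp(-((t - 3.0) / 0.4) ** 2)

# Discrete form of the convolution integral; the factor dt stands for dt_0.
response = np.convolve(f, impulse_response)[: t.size] * dt

# Sifting property: a discrete delta at t_0 = k*dt merely delays the impulse response.
k = 150
delta = np.zeros_like(t)
delta[k] = 1.0 / dt
shifted = np.convolve(delta, impulse_response)[: t.size] * dt
```

Because the impulse response vanishes before the delay `t_d`, the convolved response stays negligible until the leading edge of the pulse has had time to arrive, in keeping with causality.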

The assumed form of $H_0$ makes $h_0 = l_0 \exp\{-s(\mu\varepsilon)^{1/2} X\}$ where $X = x + t_0/(\mu\varepsilon)^{1/2}$. With $n_p \wedge l_0 = -j_0(p)$, (7.60) gives

\[ j(p) = 2 j_0(p) \exp\{-s(\mu\varepsilon)^{1/2} X_p\} + \int_S \boldsymbol{\Gamma}(p, q) \cdot j_0(q) \exp\{-s(\mu\varepsilon)^{1/2} X_q\}\, dS_q \tag{7.61} \]

where $X_p = x_p + t_0/(\mu\varepsilon)^{1/2}$, $X_q = x_q + t_0/(\mu\varepsilon)^{1/2}$ in which $x_p$, $x_q$ are the values of $x$ at the point of observation $\mathbf{x}_p$ and the point of integration $\mathbf{x}_q$ respectively. Returning to the time domain we obtain for the current induced in the obstacle

\[ J(p, t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} e^{st} j(p)\, ds \]

where $c$ is sufficiently positive for all singularities of the integrand to lie to the left of the contour of integration. From (7.61)

\[ J(p, t) = 2 j_0(p)\, \delta\{T - (\mu\varepsilon)^{1/2} x_p\} + J_1(p, t) \tag{7.62} \]

where

\[ J_1(p, t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} e^{sT} \int_S \boldsymbol{\Gamma}(p, q) \cdot j_0(q) \exp\{-s(\mu\varepsilon)^{1/2} x_q\}\, dS_q\, ds \tag{7.63} \]

with T = t - to. The first term of (7.62) represents the direct effect of the excitation and does not contribute until the incident wave reaches the point of observation. The next question to be answered is whether J 1 is zero until the same time; for convenience (7.63) will be abbreviated to

Jt(p, t) = -1. 21tl

f

C

i OO

+

c-ioo

esTj l (q) ds.

In drawing inferences from (7.62) it will be assumed that the x coordinates of the body satisfy Xl ~ X ~ X2. There is no reason why Xl should not be negative if that will assist the calculation. However, it is easier to understand what is going on when X 1 is positive and so the restriction Xl> 0 will be imposed. When Rs --+ 00, (7.52) and (7.53) imply that (7.59) can be solved by iteration. Thus

it(p) = 40, /\

Is

io(q) exp{ _S(jl&)1/2 Xq } /\ grad, cP(p, q)

as, + iterates

(7.64)

as Rs ---. 00. It is evident that, for the first term on the right of (7.64), the contour in the integral for J 1 can be pushed to the right with a consequent contribution of zero to J 1 when T/(pe)1/2 < x q + Ixp - xql. It is readily verified that higher iterates in (7.64) make no contribution to J 1 under the same condition. The inequality holds for all xp and xq when T < x 1 (pe)1/2. Hence, no current is induced in the obstacle until the incident pulse strikes it. Consequently, the dictates of causality are obeyed. Furthermore, if the point of observation p is such that T < x p (pe)1/2, the inequality remains valid since x p ~ x q + [x, - xql. Thus, there is no current at a point of the surface before the incident pulse

reaches it. The fact that J 1 = 0 when the inequality is satisfied leads to another inference. For values of T such that Xl < T(J1.e)1/2 < X 2 the integration over S in (7.63) can be limited to those points x q of S for which x q + [x, - xql < T/{Pf,) 1/2. Of course, once T is large enough the whole of S will be involved. If T/(j.lG)1/2 > X2 + d the limitation on the growth of demonstrated in the

r


last section empowers deformation of the contour to the left in (7.60). Only the poles of [ need to be taken into account and J 1 (p, t) =

L exp(slIT) L j~,(P~,. n

r

f

(AnJnr, Jnr >

S

Dq A

j~r (q).jo(q)

exp{ -sn(jJe)1/2x q } dSq (7.65)

when $T/(\mu\varepsilon)^{1/2} > x_2 + d$, i.e. the pulse has gone about twice the length of the body from hitting it initially. The subscript $n$ has been added to the quantities in (7.51) in order to identify a particular pole. The summation over $r$ may depend on $n$ since the number of eigenelements of (7.25) may vary with the pole under consideration.

Nominally, all the poles of $\Gamma$ may be allowed for in (7.65). However, it has already been pointed out, after (7.52), that the interior magnetic resonances can be ignored when the sources of the incident field are outside $S$. Therefore, the summation in (7.65) can be limited to those values of $n$ for poles corresponding to the exterior electric modes and so all terms are exponentially damped in time. At first sight (7.65) does not appear to be real, but it must be remembered that if $s_n$ is a pole so is $s_n^*$ with associated eigenelement $\mathbf{j}_{nr}^*$. Thus each term of (7.65) has a matching complex conjugate so that $\mathbf{J}$ is real, as it needs to be.

The weight to be attached to the expansion (7.65) is considerable. It demonstrates that after a certain interval of time the induced current at any point has a particularly simple time dependence, being a series of decaying exponentials whose coefficients do not vary with time. Moreover, the exponentials, though not the coefficients, are independent of the position of the point on the surface. Representing the current in terms of the poles by an expansion like (7.65) is known as the singularity expansion method or SEM for short.

The delay which has to be accepted before (7.65) is valid is quite short in practice. For example, with a sphere of radius m in air the delay is a little more than 3 ns. For an ellipsoid whose largest and smallest semi-axes are $a$ and $c$ respectively, the delay does not exceed $12a$ ns roughly and may be as little as $6(a + c)$ ns, depending upon the direction of incidence of the incoming pulse.

When the obstacle is convex (7.56) and (7.57) can be exploited.
To simplify the presentation without invalidating the principles we will just write

*

with $\mathbf{P}_m$ and $\mathbf{Q}_m$ independent of $s$. By virtue of (7.56)

$$\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} \Gamma(p, q)\exp\{sT - s(\mu\varepsilon)^{1/2}x_q\}\,ds = \mathbf{0} \tag{7.66}$$

when $T/(\mu\varepsilon)^{1/2} < x_q + |x_p - x_q|$ since the contour can be deformed to the right. This is consistent with what has been established already for the general obstacle. When $T/(\mu\varepsilon)^{1/2} > x_q - |x_p - x_q|$ the contour may be deformed to the left on account of (7.57) and

$$\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} \Gamma(p, q)\exp\{sT - s(\mu\varepsilon)^{1/2}x_q\}\,ds = \sum_m \mathbf{P}_m(x_p)\mathbf{Q}_m(x_q)\exp[s_m\{T - (\mu\varepsilon)^{1/2}x_q\}]. \tag{7.67}$$

Therefore

$$\mathbf{J}_1(p, t) = \sum_m e^{s_m T}\mathbf{P}_m(x_p)\int_S \mathbf{Q}_m(x_q)\cdot\mathbf{j}_0(q)\,H\{T/(\mu\varepsilon)^{1/2} - x_q - |x_p - x_q|\}\exp\{-s_m(\mu\varepsilon)^{1/2}x_q\}\,dS_q, \tag{7.68}$$

$H(x)$ being the usual Heaviside step function.

For $T/(\mu\varepsilon)^{1/2} > x_2 + d$, the step function is unity for all $x_q$ and (7.68) is equivalent to (7.65). At earlier times the integral in (7.68) is time dependent. Therefore, an attempt to express the SEM response in pure exponentials for all time as in (7.65) will be in error, certainly in the early stages; later on it should be perfectly satisfactory. That error does occur in actual calculations has been affirmed by numerical experiments (Michalski 1982; Baum and Pearson 1981). A source of potential numerical error can be seen from (7.66) and (7.67). From these it can be inferred that the series in (7.67) must be identically zero for

$$x_q - |x_p - x_q| < T/(\mu\varepsilon)^{1/2} < x_q + |x_p - x_q|.$$

In any numerical scheme only a finite number of the poles of $\Gamma$ can be determined and it is impossible to make a finite series identically zero over the prescribed interval. Nevertheless, if enough of the poles in a neighbourhood of the imaginary axis are found, the predictions of (7.68) can be expected to be reliable for all times. Eventually, they will coincide with those of (7.65). There are two reasons for this. One is that the series in (7.67), although not exactly zero in the relevant range, will be approximately so. The other is that the missing poles have a negative real part of largish magnitude and their influence, if present, would diminish rapidly with increase of $T$. The overlap between (7.66) and (7.67) means that the step function in (7.68) could be taken as $H\{T - (\mu\varepsilon)^{1/2}x_q\}$. While this is true in the exact formula it will be in error in a numerical calculation; yet particular examples indicate that the error soon disappears (Michalski 1982).

There is another point. Wavefronts in sharp pulses are dominated by high frequencies in the spectrum. But, at high frequencies, the back of a convex target is in shadow and the current correspondingly small (§8.19). Therefore, the major contribution of the integral in (7.68) will have occurred by the time the pulse reaches the shadow boundary or a bit beyond. Hence, (7.68) can be expected to be in agreement with (7.65) much earlier than $(x_2 + d)(\mu\varepsilon)^{1/2}$. How much earlier depends on the shape of the body and the direction of illumination. For points in the illuminated region it should not be later than about $(x_1 + \tfrac{1}{2}d)(\mu\varepsilon)^{1/2}$ on average, i.e. the pulse has travelled about half the length of the body after first striking it. For a point in the shadow the agreement should take place not too long after the arrival of the incident pulse there.

The formulae (7.63), (7.65) and (7.68) will be valuable practically only if there are ways of calculating the positions of the poles and their associated residues. This problem will be examined in the next section.

7.10 Practical determination of the positions of the poles

To locate the poles it is necessary to solve the integral equation

$$-\tfrac{1}{2}\mathbf{j}(p) + \mathbf{n}_p \wedge \int_S \mathbf{j}(q) \wedge \operatorname{grad}_q \phi(p, q)\,dS_q = 0. \tag{7.69}$$

If this can be undertaken analytically then there is little more to be said. However, the analysis is likely to be intractable for all except the very simplest shapes and so recourse to other methods is inevitable. The direct approach to (7.69) is by numerical approximation, replacing (7.69) by a finite algebraic system based on any appropriate method from Chapter 6. A solution will exist only if the determinant of the coefficients vanishes, leading to an equation in $s$ which will be satisfied by the $s_n$ or rather by approximations to the $s_n$. Once the $s_n$ are known, approximations to the eigenelements of (7.69) can be constructed from the solutions of the algebraic system. The procedure may be clearer if (7.69) is replaced by its scalar equivalent as in §7.6, so consider

$$j'(x) - \int K(x, y)\,j'(y)\,dy = 0.$$

The algebraic system corresponding to this is

$$\sum_{n=1}^{N} Z_{mn}(s)\,j'_n = 0 \qquad (m = 1, \ldots, N) \tag{7.70}$$

where the dependence of the coefficients on $s$, due to $K$ involving $s$, has been explicitly displayed. The equations have a non-trivial solution only if

$$\det[Z_{mn}(s)] = 0. \tag{7.71}$$
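A minimal sketch of the search implied by (7.70) and (7.71) is given below. The matrix `z_matrix` is a hypothetical 2 x 2 stand-in (not obtained from any integral equation) chosen so that the zeros of its determinant are known to be $s = -1 \pm 2i$; the function names and starting guesses are likewise assumptions made for illustration. Muller's method is coded directly, and the null vector of $Z(s_n)$ is taken from a singular value decomposition.

```python
import cmath
import numpy as np


def z_matrix(s):
    """Hypothetical 2x2 stand-in for Z(s); det Z = (s + 1)**2 + 4."""
    return np.array([[s + 1.0, 2.0], [-2.0, s + 1.0]], dtype=complex)


def det_z(s):
    return complex(np.linalg.det(z_matrix(s)))


def muller(f, x0, x1, x2, tol=1e-12, max_iter=50):
    """Muller's method: fit a parabola through three points and step to its nearer zero."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        h1, h2 = x1 - x0, x2 - x1
        d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2
        a = (d2 - d1) / (h2 + h1)
        b = a * h2 + d2
        root_term = cmath.sqrt(b * b - 4.0 * a * f2)
        # The larger denominator gives the smaller step, i.e. the nearer zero.
        if abs(b + root_term) >= abs(b - root_term):
            denom = b + root_term
        else:
            denom = b - root_term
        step = -2.0 * f2 / denom
        x0, x1, x2 = x1, x2, x2 + step
        if abs(step) < tol:
            return x2
    raise RuntimeError("Muller iteration did not converge")


# Three starting guesses in the upper half plane (an arbitrary choice).
pole = muller(det_z, 0.0, 0.5j, 1.0j)

# Non-trivial solution of Z(pole) j = 0: the right singular vector that
# belongs to the smallest singular value.
_, _, vh = np.linalg.svd(z_matrix(pole))
mode = vh[-1].conj()
```

For a genuine scatterer `z_matrix` would be replaced by the discretized operator, and the search would be repeated from different starting points to pick up further poles.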

The $N$ roots of (7.71) are regarded as approximations to the $s_n$. Obviously, not all of the $s_n$ can be covered but the dominant ones should be there if $N$ is large enough and the accuracy should improve as $N$ increases. In any case, the $s_n$ which have a much more negative real part than the others are not generally of much significance. Practical methods for solving (7.71) are those of Newton (§1.8) and of Muller (§1.8). Having found a solution of (7.71) the corresponding $j'_n$ is determined from (7.70) and regarded as an approximation to $j'$. Alternatively, since we expect $Z^{-1}$ ($Z$ being the matrix with elements $Z_{mn}$) to have a pole at $s = s_n$ we can try to discover the residue directly by evaluating

$$\eta\,Z^{-1}(s_n + \eta)$$

for small $\eta$ until the result does not alter appreciably with $\eta$. Results of reasonable accuracy have been obtained in this way (Tesche 1973). Notwithstanding, the effort to be expended is not trifling. Not only have the algebraic equations (7.71) and (7.70) to be solved but also the integrals in (7.63), (7.65) or (7.68) have to be evaluated, so the labour is at least of the same order of magnitude as a solution in the frequency domain and probably substantially more. Moreover, the method is not always as efficient in the early time stages as the updating process described in the initial sections of this chapter. Another strategy will, therefore, be set forth now.

7.11 Prony's method and modifications

The fact that the surface current after a period of transient behaviour, which lasts at most for the time to transmit twice the greatest diameter of the obstacle, settles down to the exponential evanescence of (7.65) suggests the following tactics. First, calculate the current at a point by working in the time domain with an updating integral equation until sufficient time has elapsed for a reasonable stretch where (7.65) is valid to be available. By sampling in this stretch, arrive at the values of the coefficients for all subsequent times. The input pulse is taken as a Gaussian approximation to the $\delta$ function as in §7.5.

The main problem is to discover the coefficients and exponents in the expansion of the current in exponentials, i.e. when the step function in (7.68) has become 1 permanently. One approach to elucidating them is by Prony's method (Whittaker and Robinson 1952). Suppose that the real current is scalar (the adaptation to vectors is straightforward) and has the representation

$$J(t) = \sum_{n=1}^{N} a_n \exp(s_n t). \tag{7.72}$$

The $a_n$ and $s_n$ are to be determined from observations on $J$ at equal time intervals; if the only observations available are at unequal time intervals interpolation will be necessary to supply values on a uniform time scale. For a general current there may be no knowledge of $N$ a priori and so it is desirable to have a method capable of fixing $N$ when it cannot be specified by other considerations. Let the observations be made at $t = 0, \tau, 2\tau, \ldots, (M - 1)\tau$ where $M \ge 2N$. Denote $J(r\tau)$ by $J_r$. Let $P$ satisfy $N \le P \le M - N$. Then, it follows from (7.72) that

$$\sum_{p=1}^{P} b_p J_{r+P-p} = -J_{r+P} \tag{7.73}$$

provided that $z = \exp(s_n\tau)$ satisfies

$$z^P + \sum_{p=1}^{P} b_p z^{P-p} = 0 \tag{7.74}$$

for $n = 1, \ldots, N$. Clearly, if (7.74) is to furnish all the exponentials in (7.72), it is necessary to make $P \ge N$. By requiring (7.73) to hold for $r = 0, 1, \ldots, M - P - 1$ we obtain the matrix equation

$$A\mathbf{b} = \mathbf{c} \tag{7.75}$$

where $\mathbf{b} = (b_1\, b_2 \cdots b_P)^T$, $\mathbf{c} = -(J_P\, J_{P+1} \cdots J_{M-1})^T$ and $A$ is an $(M - P) \times P$ matrix with entry $A_{ij} = J_{i-1+P-j}$.

The principle of the standard Prony method when $N$ is known is to take $P = N$ and then solve (7.75) for $\mathbf{b}$; this is feasible since $A$ and $\mathbf{c}$ are known quantities. Once $\mathbf{b}$ has been determined $s_n$ can be found from the roots of (7.74); in this connection Muller's method (§1.8) is often valuable for eliciting the roots of a polynomial. The $s_n$ will be indeterminate to the extent of a multiple of $2\pi i/\tau$ but that is inevitable without a change of sampling interval. With the $s_n$ known the $a_n$ are obtained by satisfying (7.72) at the sampling points. Usually, least squares or a similar approximation will be involved in this determination, as it will in the solution of (7.75).

Even in the best circumstances the standard Prony method is suspect. It is very sensitive to small alterations of the data and may exhibit instability if the representation (7.72) is not strictly valid. Some consideration of how matters might be improved is pertinent therefore. Also the question of finding $N$ when it is not given in advance needs to be settled.

In general, $A$ is a rectangular matrix and (7.75) is solved by means of the generalized inverse $A^+$ (§1.15) or, what comes to the same thing in practice, by least squares (Theorem 1.15). Thus

$$\mathbf{b} = A^+\mathbf{c}$$

where, according to Theorem 1.15c when $A$ is of rank $k$,

$$A^+ = U\begin{pmatrix}\Lambda^{-1} & 0 \\ 0 & 0\end{pmatrix}V^H$$

in which $U$, $V$ are unitary matrices and $\Lambda$ is a diagonal matrix of order $k$ whose elements $\lambda_1, \ldots, \lambda_k$ are the positive singular values of $A$. In (7.75) the rank of $A$ cannot exceed $N$, however large $P$ and $M$, because any $N + 1$ consecutive observations of $J$ are linearly dependent. There are now two cases which can occur: (a) $A$ has at least one singular value which is zero, and (b) no singular value of $A$ is zero. In case (a) $N$ can be identified with the rank of $A$ or, equivalently, the number of non-zero eigenvalues of $A^H A$. In case (b) it is clear that $P$ is not large enough and so $P$ is increased until $A$ moves into case (a); this entails also increasing $M$ (Kumaresan and Tufts 1982 recommend

choosing $M = 3P$). Having determined $N$ one can contemplate returning to the original Prony method but the sensitivity referred to has not been obviated.

In any event the determination of $N$ is not quite as simple as has been stated. Unless the observations and numerical procedures are extremely accurate it is highly unlikely that any singular value of $A$ will be exactly zero. Nevertheless, those which should be exactly zero can be expected to be small. So arrange the calculated singular values in decreasing order so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_P$ and form $(\lambda_m - \lambda_{m+1})/(\lambda_m + \lambda_{m+1})$ for $m = 1, \ldots, P - 1$. The value $k$ of $m$ for which this ratio has its largest value (which would be 1 if $\lambda_{k+1} = 0$ exactly) is taken as the rank of $A$. Calling into play singular values to find $N$ when applying the Prony method is often known as the singular value decomposition Prony method, or SVD-Prony method.

In practice there are two difficulties associated with the above method of fixing $N$. The first is that few observations of genuine signals will be free of noise and measurement errors. The second is that in a genuine scattering problem $N$ can be theoretically infinite although many of the exponentials will be heavily damped and contribute little to the long-term behaviour of the current. Since one is interested primarily in a few dominant terms these extra exponentials are effectively acting as noise. The presence of noise renders Prony's method with $P = N$ untrustworthy and, furthermore, the rank of $A$ cannot be estimated reliably unless $P$ is very much larger than $N$ (regarded as the number of dominant exponentials of interest), perhaps $P = 30N$ or greater. Therefore, it is common to use large values of $P$ and $M$, large enough to ensure that the rank of $A$ has settled down; often the values are doubled as a cross-check that the rank of $A$ has become stable. All subsequent calculations are done with these large values of $P$ and $M$.
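The steps from (7.73) to the rank test can be sketched in a few lines of NumPy. The test signal, the noise level, the sampling interval and all function names below are assumptions chosen only for illustration; the final selection of roots by amplitude is one heuristic among several possible ones and is not claimed to be reliable in general.

```python
import numpy as np


def prony_system(samples, P):
    """Matrix A and vector c of (7.75), written with 0-based indices."""
    J = np.asarray(samples, dtype=float)
    M = len(J)
    A = np.array([[J[i + P - 1 - j] for j in range(P)] for i in range(M - P)])
    c = -J[P:]
    return A, c


def estimate_rank(singular_values):
    """k maximising (l_k - l_{k+1}) / (l_k + l_{k+1}); values in decreasing order."""
    lam = np.asarray(singular_values)
    ratios = (lam[:-1] - lam[1:]) / (lam[:-1] + lam[1:])
    return int(np.argmax(ratios)) + 1


def svd_prony(samples, P, tau):
    A, c = prony_system(samples, P)
    U, lam, Vh = np.linalg.svd(A, full_matrices=False)
    k = estimate_rank(lam)
    # Rank-k generalized inverse applied to c: b = V diag(1/lam) U^H c.
    b = Vh[:k].conj().T @ ((U[:, :k].conj().T @ c) / lam[:k])
    z = np.roots(np.concatenate(([1.0], b))).astype(complex)  # roots of (7.74)
    return k, z, np.log(z) / tau  # exponents, up to multiples of 2*pi*i/tau


def fit_amplitudes(samples, z):
    """Least-squares fit of (7.72) at the sampling points: J_r = sum_n a_n z_n**r."""
    r = np.arange(len(samples))[:, None]
    vandermonde = z[None, :] ** r
    a, *_ = np.linalg.lstsq(vandermonde, np.asarray(samples, dtype=complex), rcond=None)
    return a


# Hypothetical test signal: one complex-conjugate pair (N = 2) plus weak noise.
tau, M, P = 0.1, 30, 10
s_true = np.array([-0.1 + 2.0j, -0.1 - 2.0j])
a_true = np.array([1.0 - 0.5j, 1.0 + 0.5j])
t = tau * np.arange(M)
clean = (a_true[None, :] * np.exp(np.outer(t, s_true))).sum(axis=1).real
samples = clean + 1e-8 * np.random.default_rng(0).standard_normal(M)

rank, z_all, s_all = svd_prony(samples, P, tau)
a_all = fit_amplitudes(samples, z_all)
# Heuristic only: keep the `rank` roots with the largest fitted amplitude.
candidates = np.argsort(np.abs(a_all))[-rank:]
```

Because $P > N$ here, `z_all` contains $P$ roots of which only `rank` belong to the signal, so some criterion for discarding the remainder is still required.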
Clearly, working with such large values of $P$ and $M$ will entail substantial computation (Ross and Dudley 1988) as does any robust method for estimating the exponential terms in $J$.

Trouble arises from another source when $P$ exceeds $N$. For then (7.74) has more than $N$ roots. Some of these are spurious, being generated by the method, and not part of $J$. Deciding which roots are true and which spurious is not easy. Assuming that $P$ is large enough for the rank of $A$ to be certain we know how many of the roots are true; the rest can be said to be caused by the noise in the system. It is evident that any roots of (7.74) with $|z| \ge 1$ are spurious and must be rejected because they correspond to $s_n$ which do not have a negative real part. Next, increasing $P$ by 1 or 2 should not affect the true roots much but is likely to modify the spurious ones. Therefore, the roots with $|z| < 1$ which move least as $P$ varies are the potential true candidates. If there are more than $k$ (the rank of $A$) of them the only course open is to examine their contribution to $J$. Those with an $a_n$ less than one per cent of the largest $a_n$ occurring can be considered for rejection, selecting first those $s_n$ with the most negative real part. If all of these criteria fail to eliminate all spurious roots the suspicion must be that $P$ is not large enough to ascertain the rank of $A$ correctly.

To cut down the computational effort in the SVD-Prony method Younan

and Taylor (1991) suggest that the data should be pre-processed to reduce the noise content by passing them through a low-pass filter first. The basic idea is to form

$$J_1(q) = \sum_{m=0}^{M-1} J_m \exp(-2\pi i qm/M)$$

and then construct

$$J_2(n) = \Big[\tfrac{1}{2}\{J_1(0) + J_1(M)\} + \sum_{q=1}^{Q}\{J_1(q)\exp(2\pi i qn/M) + J_1(M - q)\exp(-2\pi i qn/M)\}\Big]\Big/M.$$
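The construction of $J_2$ from $J_1$ amounts to keeping the mean and the $Q$ lowest harmonics of a discrete Fourier transform. A minimal sketch with NumPy follows; the function name `low_pass`, the restriction $Q < M/2$ and the test record are assumptions made for the illustration, and the cut-off $Q$ is simply supplied by the caller. Since $J_1$ is periodic in $q$ with period $M$, $J_1(M) = J_1(0)$ and the first bracket reduces to $J_1(0)$.

```python
import numpy as np


def low_pass(samples, Q):
    """Return J2(n): the record rebuilt from its mean and its Q lowest harmonics."""
    J = np.asarray(samples, dtype=float)
    M = len(J)
    if not 0 <= Q < M / 2:
        raise ValueError("Q must satisfy 0 <= Q < M/2")
    J1 = np.fft.fft(J)          # J1[q] = sum_m J[m] exp(-2*pi*i*q*m/M)
    keep = np.zeros(M, dtype=bool)
    keep[0] = True              # the mean, (J1(0) + J1(M)) / 2 = J1(0)
    keep[1:Q + 1] = True        # harmonics q = 1, ..., Q
    keep[M - Q:] = True         # their partners M - q
    # ifft supplies the factor 1/M; the result is real for real data.
    return np.fft.ifft(np.where(keep, J1, 0.0)).real


# Hypothetical record: a smooth part (harmonics 0 and 2) plus a rapid ripple.
M = 32
n = np.arange(M)
smooth = 1.0 + np.cos(2.0 * np.pi * 2 * n / M)
ripple = 0.3 * np.sin(2.0 * np.pi * 10 * n / M)
filtered = low_pass(smooth + ripple, Q=3)
```

In this example the ripple lies at harmonic 10, above the cut-off, so `filtered` reproduces `smooth` to rounding error.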

To set a value on $Q$ let $e_n = J_n - J_2(n)$ and form the circular serial correlation $\sum_{n=1}^{M} e_n e_{n+Q}$. The value of $Q$ is steadily increased from 1; that value at which the serial correlation achieves its minimum is the chosen $Q$. Now that $Q$ is known, data for the SVD-Prony method are generated from $J_2(n)$. According to Younan and Taylor this process enables one to work with a much smaller value of $M$ than is required without filtering. Moreover, the filtering does not cause any deterioration in the accuracy of the estimates for $s_n$ and $a_n$.

Other robust methods for extracting $s_n$ and $a_n$ from current waveforms have been proposed by Goodman (1983), Rothwell (1987), Park and Cordaro (1988), Hua and Sarkar (1989).

Exercises

16. A scalar current $I(t)$ is observed to have the following values at the times shown:

t      10    20    30    40    50    60    70    80    90
I(t)   6460  6090  5642  5049  4417  3623  2401  983   142

If $I(t)$ is approximated by $A_1\exp(s_1 t) + A_2\exp(s_2 t) + A_3\exp(s_3 t)$ show that $\exp(s_1)$, $\exp(s_2)$, and $\exp(s_3)$ are the roots of $z^3 - 3.029z^2 + 3.788z - 1.687 = 0$ in Prony's method.

17. A scalar current $I(t)$ provides the data

t      0    8    16   24   32   40
I(t)   248  345  421  481  529  569

If Prony's method with three terms is used show that an approximation to $I(t)$ is $629.4(1.001)^t - 381.4(0.9670)^t + 0.0005(1.236)^t$.

18. It is known that $J(t) = 2e^{-3t}\sin 2t + e^{-t}\cos 2t$. Construct data and assess the accuracy of Prony's method with P = N = M = 4.

Fig. 7.6. Data to be sampled. [Plot of the waveform, with values between about -3.2 and 2.4, against Time (ns) from 0 to 20.]

19. A scalar current is found to have the waveform displayed in Fig. 7.6. Take samples at intervals of ! ns and use Prony's method to locate 40 poles. Show that most of the poles are not very distant from the imaginary axis. Calculate the waveform of the exponential series over 0-100 ns and decide whether it is sufficiently accurate in the first 20 ns for one to have faith in the extrapolation.

20. Assume $J(t) = e^{st}$. Find (7.74) when $P = 2$ and $M = 2$. Is it clear which root is spurious? What is the effect of increasing $P$ and $M$ to 3?

21. Add some noise to the current in Exercise 18 and compare the predictions of the SVD-Prony method with and without filtering.

22. Calculate the impulse response of a perfectly conducting sphere by the methods of the last two sections and compare the computer requirements of them critically.

23. A centre-fed dipole of length 1 m and radius t m is excited by the Gaussian pulse $\exp\{-25 \times 10^{18}(t - 5.556 \times 10^{-10})^2\}$. Determine the induced current not far from the feed as in §7.3 during the first 30 ns. Extrapolate from there by Prony's method. Find the input impedance in the frequency domain from $J = \sum_{n=1}^{N} a_n/(i\omega - s_n)$ and show that it gives acceptable values up to about 2 GHz.

24. A prolate spheroid with semi-axes $a$, $b$ is illuminated along its axis of symmetry. Trace the movement of poles as $a/b$ varies.

25. A finite circular cylinder of length $l$ and radius $a$ is subject to a plane impulse from outside. The positions of the poles depend upon $l$, $a$ and the angle of incidence. Plot their behaviour for various typical situations. By taking $l$ very small obtain the response of a circular disc.

26. Assuming that the theory applies to two coaxial circular cylinders obtain the impulse response of a single vertical cylinder above a horizontal perfectly conducting ground plane.

8 GEOMETRIC THEORY OF DIFFRACTION

The advent of the computer has meant that the small number of analytical solutions to electromagnetic scattering problems can be supplemented by numerical investigation. Antecedent chapters have indicated ways in which answers can be fabricated numerically and reveal that, despite the power of the computer, the accuracy falls off as the frequency increases. Another route must therefore be pioneered to handle the scattering from bodies which are large compared with the wavelength. The method will be one of approximate analytical technique, often conjoined to numerical methods as well. The classical high-frequency approach is that of geometrical optics, but it fails to account adequately for the behaviour in shadows and the influence of edges. To cope with these problems an extended version, known as the geometric theory of diffraction (GTD), has been devised (Keller 1957, 1962). It is the purpose of this chapter to describe the main features of the geometric theory of diffraction.

8.1 The high-frequency approximation

It will be assumed that the fields vary harmonically in time according to the factor $\exp(i\omega t)$ where $\omega$ is real. Then Maxwell's equations take the form

$$\operatorname{curl}\mathbf{E} + i\omega\mu\mathbf{H} = 0, \qquad \operatorname{curl}\mathbf{H} - i\omega\varepsilon\mathbf{E} = 0 \tag{8.1}$$

where the real $\mu$ and $\varepsilon$ account for the permeability and permittivity of the medium. The refractive index $N$ of the medium is defined by $N = (\mu\varepsilon/\mu_0\varepsilon_0)^{1/2}$ where $\mu_0$ and $\varepsilon_0$ refer to free space. The real quantity $\omega(\mu_0\varepsilon_0)^{1/2}$ will be denoted by $k_0$. If the medium is not isotropic $\mu$ and $\varepsilon$ must be replaced in (8.1) by tensors. The resulting analysis then becomes much more complicated in detail without affecting the basic principles of the method. For simplicity, therefore, only the case of scalar $\mu$ and $\varepsilon$ will be treated.

When the wavelength is so small that significant medium changes occur only over distances large compared with it, a reasonable assumption is that in local regions the field behaves as if it were in a homogeneous medium. Thus, locally, the wave may be expected to look like a plane wave. This suggests that $\mathbf{E}$ might have the form $\mathbf{E}_0\exp(-ik_0 L)$. However it is more likely that this offers a first approximation. Therefore, we introduce the more recondite assumption or

Ansatz that

$$\mathbf{E} = \exp(-ik_0 L)\left(\mathbf{E}_0 + \frac{\mathbf{E}_1}{k_0} + \frac{\mathbf{E}_2}{k_0^2} + \cdots\right), \qquad \mathbf{H} = \exp(-ik_0 L)\left(\mathbf{H}_0 + \frac{\mathbf{H}_1}{k_0} + \frac{\mathbf{H}_2}{k_0^2} + \cdots\right), \tag{8.2}$$

where $L, \mathbf{E}_0, \mathbf{E}_1, \ldots, \mathbf{H}_0, \mathbf{H}_1, \ldots$ may depend upon position but are independent of $k_0$. There is no a priori knowledge that the expansion (8.2) is valid whether as a convergent or as an asymptotic series. Even in special circumstances it can be extremely difficult to prove the legitimacy of the expansion. Be that as it may, experience is overwhelmingly in favour of its introduction. It tenders the possibility of progress when exact or numerical solutions are out of the question. Even when the expansion is confuted there may be others of a similar nature which will succeed as will be seen.

If (8.2) is substituted in (8.1) and it is agreed to take derivatives term by term then

$$\sum_{r=0}^{\infty}\frac{\operatorname{curl}\mathbf{E}_r}{k_0^r} - ik_0\operatorname{grad} L \wedge \sum_{r=0}^{\infty}\frac{\mathbf{E}_r}{k_0^r} + i\omega\mu\sum_{r=0}^{\infty}\frac{\mathbf{H}_r}{k_0^r} = 0, \tag{8.3}$$

$$\sum_{r=0}^{\infty}\frac{\operatorname{curl}\mathbf{H}_r}{k_0^r} - ik_0\operatorname{grad} L \wedge \sum_{r=0}^{\infty}\frac{\mathbf{H}_r}{k_0^r} - i\omega\varepsilon\sum_{r=0}^{\infty}\frac{\mathbf{E}_r}{k_0^r} = 0. \tag{8.4}$$

If $L$, $\mathbf{E}_r$, $\mathbf{H}_r$ and their derivatives are finite and vary significantly only over several wavelengths, the coefficients of separate powers of $k_0$ may be equated to zero in (8.3) and (8.4). There results

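$$(\mu_0\varepsilon_0)^{1/2}\operatorname{grad} L \wedge \mathbf{E}_0 = \mu\mathbf{H}_0, \qquad (\mu_0\varepsilon_0)^{1/2}\operatorname{grad} L \wedge \mathbf{H}_0 = -\varepsilon\mathbf{E}_0, \tag{8.5}$$

$$\operatorname{grad} L \wedge \mathbf{E}_m - \frac{\mu\mathbf{H}_m}{(\mu_0\varepsilon_0)^{1/2}} = -i\operatorname{curl}\mathbf{E}_{m-1}, \tag{8.6}$$

$$\operatorname{grad} L \wedge \mathbf{H}_m + \frac{\varepsilon\mathbf{E}_m}{(\mu_0\varepsilon_0)^{1/2}} = -i\operatorname{curl}\mathbf{H}_{m-1} \tag{8.7}$$

for $m = 1, 2, \ldots$. From (8.5) it is evident that, so long as $\mu$ and $\varepsilon$ are non-zero,

$$\mathbf{E}_0\cdot\operatorname{grad} L = 0, \qquad \mathbf{H}_0\cdot\operatorname{grad} L = 0, \qquad \mathbf{H}_0\cdot\mathbf{E}_0 = 0. \tag{8.8}$$

Furthermore, if $\mathbf{H}_0$ is eliminated from (8.5),

$$\{(\operatorname{grad} L)^2 - N^2\}\mathbf{E}_0 = 0$$

when account is taken of (8.8). If $\mathbf{E}_0$ is not to be identically zero it is necessary that

$$(\operatorname{grad} L)^2 = N^2. \tag{8.9}$$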

Eqn (8.9) is a partial differential equation for $L$, the function which defines the surfaces of constant phase, i.e. the wavefronts. The function $L$ is known as the eikonal and (8.9) is called the eikonal equation. From (8.8) it can be seen that $\mathbf{E}_0$ and $\mathbf{H}_0$ are transverse to $\operatorname{grad} L$, i.e. are transverse to the direction of propagation of the wavefront. Since $\mathbf{E}_0$ is also perpendicular to $\mathbf{H}_0$ the Poynting vector is parallel to $\operatorname{grad} L$ and so the direction of energy flow is normal to the wavefront. Consequently, the field due to $\mathbf{E}_0$, $\mathbf{H}_0$ has locally all the properties of a plane wave and may be expected to be a reasonable first approximation at high frequencies. It will, however, require close scrutiny at points where $N$ becomes small or changes discontinuously and at points where $\mathbf{E}_0$ or $\mathbf{H}_0$ is predicted to be large.

The curves which have at each point the direction of the energy flow in the field are known as rays. It has just been shown that the rays are normal to the wavefronts. In general media there are two characteristic velocities. One is the wave velocity which is the rate of displacement of the wavefront in the direction normal to itself. The other is the energy or ray velocity. For the isotropic medium just considered the velocities can be identified with one another. For anisotropic media the wave and ray velocities may not coincide; the distinction between the two is then vital and must not be forgotten.

From (8.6)

$$\operatorname{curl}\left(\frac{1}{\mu}\operatorname{grad} L \wedge \mathbf{E}_m\right) - \frac{\operatorname{curl}\mathbf{H}_m}{(\mu_0\varepsilon_0)^{1/2}} = -i\operatorname{curl}\left(\frac{1}{\mu}\operatorname{curl}\mathbf{E}_{m-1}\right)$$

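while, from (8.7),

$$\operatorname{div}(\varepsilon\mathbf{E}_m) = -(\mu_0\varepsilon_0)^{1/2}\operatorname{div}(\operatorname{grad} L \wedge \mathbf{H}_m) = (\mu_0\varepsilon_0)^{1/2}\operatorname{grad} L\cdot\operatorname{curl}\mathbf{H}_m = i\varepsilon\operatorname{grad} L\cdot\mathbf{E}_{m+1}$$

when (8.7) with $m$ replaced by $m + 1$ is noted. Hence

$$\mu\operatorname{curl}\left(\frac{1}{\mu}\operatorname{curl}\mathbf{E}_{m-1}\right) - \operatorname{grad}\left(\frac{1}{\varepsilon}\operatorname{div}\varepsilon\mathbf{E}_{m-1}\right) = i\operatorname{grad} L\,\operatorname{div}\mathbf{E}_m - \frac{i\mu\operatorname{curl}\mathbf{H}_m}{(\mu_0\varepsilon_0)^{1/2}} - i\operatorname{grad} L \wedge \operatorname{curl}\mathbf{E}_m - i\mu\mathbf{E}_m\operatorname{div}\left(\frac{1}{\mu}\operatorname{grad} L\right) - 2i(\operatorname{grad} L\cdot\operatorname{grad})\mathbf{E}_m - \frac{i}{\mu}(\mathbf{E}_m\cdot\operatorname{grad}\mu)\operatorname{grad} L. \tag{8.10}$$

Multiply (8.6) by $\operatorname{grad} L\,\wedge$ and add $\mu/(\mu_0\varepsilon_0)^{1/2}$ times (8.7) to obtain

$$\frac{1}{\varepsilon}(\operatorname{div}\varepsilon\mathbf{E}_{m-1})\operatorname{grad} L = \frac{\mu\operatorname{curl}\mathbf{H}_{m-1}}{(\mu_0\varepsilon_0)^{1/2}} + \operatorname{grad} L \wedge \operatorname{curl}\mathbf{E}_{m-1} \tag{8.11}$$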

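by virtue of (8.9). Replace $m - 1$ by $m$ in (8.11) and then substitute in (8.10). There results

$$2i(\operatorname{grad} L\cdot\operatorname{grad})\mathbf{E}_m + i\mu\mathbf{E}_m\operatorname{div}\left\{\frac{1}{\mu}\operatorname{grad} L\right\} + \frac{i}{\mu\varepsilon}\{\mathbf{E}_m\cdot\operatorname{grad}(\mu\varepsilon)\}\operatorname{grad} L = \operatorname{grad}\left(\frac{1}{\varepsilon}\operatorname{div}\varepsilon\mathbf{E}_{m-1}\right) - \mu\operatorname{curl}\left(\frac{1}{\mu}\operatorname{curl}\mathbf{E}_{m-1}\right). \tag{8.12}$$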

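Similarly, or by observing that the equations for $\mathbf{H}_m$ are the same as those for $\mathbf{E}_m$ with $\mu$ and $\varepsilon$ replaced by $-\varepsilon$ and $-\mu$ respectively, it may be shown that

$$2i(\operatorname{grad} L\cdot\operatorname{grad})\mathbf{H}_m + i\varepsilon\mathbf{H}_m\operatorname{div}\left(\frac{1}{\varepsilon}\operatorname{grad} L\right) + \frac{i}{\mu\varepsilon}\{\mathbf{H}_m\cdot\operatorname{grad}(\mu\varepsilon)\}\operatorname{grad} L = \operatorname{grad}\left(\frac{1}{\mu}\operatorname{div}\mu\mathbf{H}_{m-1}\right) - \varepsilon\operatorname{curl}\left(\frac{1}{\varepsilon}\operatorname{curl}\mathbf{H}_{m-1}\right). \tag{8.13}$$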

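The only derivatives of $\mathbf{E}_m$ and $\mathbf{H}_m$ which occur in (8.12) and (8.13) are normal to the wavefront. Therefore (8.12) and (8.13) are ordinary differential equations for $\mathbf{E}_m$ or $\mathbf{H}_m$ along a ray. They are known as transport equations.

When $\mathbf{E}_{m-1} = 0$ and $\mathbf{H}_{m-1} = 0$ the right-hand sides of (8.12) and (8.13) vanish. The scalar product of (8.12) with $\mathbf{E}_m$ then gives

$$\operatorname{grad} L\cdot\operatorname{grad}\mathbf{E}_m^2 + \mu\mathbf{E}_m^2\operatorname{div}\left(\frac{1}{\mu}\operatorname{grad} L\right) = 0$$

since now $\mathbf{E}_m\cdot\operatorname{grad} L = 0$ from (8.7). The equation may be written as

$$\operatorname{div}\left(\frac{1}{\mu}\mathbf{E}_m^2\operatorname{grad} L\right) = 0. \tag{8.14}$$

Similarly

$$\operatorname{div}\left(\frac{1}{\varepsilon}\mathbf{H}_m^2\operatorname{grad} L\right) = 0. \tag{8.15}$$

In particular, it is always true that

$$\operatorname{div}\left(\frac{1}{\mu}\mathbf{E}_0^2\operatorname{grad} L\right) = 0, \qquad \operatorname{div}\left(\frac{1}{\varepsilon}\mathbf{H}_0^2\operatorname{grad} L\right) = 0. \tag{8.16}$$

Exercises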

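1. In a homogeneous anisotropic medium $\mathbf{D} = \boldsymbol{\varepsilon}\cdot\mathbf{E}$, $\mathbf{B} = \boldsymbol{\mu}\cdot\mathbf{H}$ where $\boldsymbol{\varepsilon}$, $\boldsymbol{\mu}$ are real symmetric positive definite tensors. Assuming an approximation of the form $\mathbf{E} = \mathbf{E}_0\exp(-ik_0 L)$, etc. show that $\mathbf{D}_0$ is perpendicular to $\mathbf{H}_0$ and $\operatorname{grad} L$ while $\mathbf{B}_0$ is perpendicular to $\mathbf{E}_0$ and $\operatorname{grad} L$. If $\mathbf{S} = \mathbf{E}_0 \wedge \mathbf{H}_0$ define the energy velocity as $2\mathbf{S}/(\mathbf{E}_0\cdot\mathbf{D}_0 + \mathbf{H}_0\cdot\mathbf{B}_0)$. Show that the phase speed $v$ of the wavefront is equal to the projection of the energy velocity on the wave normal.

If $\mu$ is a scalar and $\boldsymbol{\varepsilon}$ has elements $\varepsilon_1, \varepsilon_2, \varepsilon_3$ with respect to its principal axes show that $v$ satisfies Fresnel's equation

$$\frac{n_1^2}{v^2 - v_1^2} + \frac{n_2^2}{v^2 - v_2^2} + \frac{n_3^2}{v^2 - v_3^2} = 0$$

where $v_i^2 = 1/\mu\varepsilon_i$ and $n_1, n_2, n_3$ are the components of $\operatorname{grad} L$ along the principal axes. If $\mathbf{D}_0'$, $\mathbf{D}_0''$ correspond to two different solutions of Fresnel's equation show that $\mathbf{D}_0'\cdot\mathbf{D}_0'' = 0$.

2. In Exercise 1 assume that $\mathbf{S}$ is parallel to neither $\boldsymbol{\varepsilon}\cdot\operatorname{grad} L$ nor $\boldsymbol{\mu}\cdot\operatorname{grad} L$. Prove that $\operatorname{div}(\mathbf{E}_0 \wedge \mathbf{H}_0) = 0$.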

8.2 Geometrical optics

In an isotropic medium the approximation $\mathbf{E} = \mathbf{E}_0\exp(-ik_0 L)$ is often known as geometrical optics, though the name is sometimes applied to the full expansion (8.2). Since the rays are normal to the wavefront the tangent to a ray must be parallel to $\operatorname{grad} L$. Hence a unit vector $\mathbf{s}$ along the tangent is

$$\mathbf{s} = \frac{1}{N}\operatorname{grad} L \tag{8.17}$$

when cognisance is taken of the eikonal equation (8.9). Let $\kappa$ be the curvature at the same point and $\boldsymbol{\nu}$ a unit vector in the direction of the radius of curvature. Then $\kappa\boldsymbol{\nu} = d\mathbf{s}/ds$ where $s$ is the arc length measured along the ray. Now

$$\frac{d\mathbf{s}}{ds} = (\mathbf{s}\cdot\operatorname{grad})\mathbf{s} = -\mathbf{s} \wedge \operatorname{curl}\mathbf{s}$$

and so

$$\kappa = -\boldsymbol{\nu}\cdot\mathbf{s} \wedge \operatorname{curl}\mathbf{s}.$$

On using (8.17) and replacing $\operatorname{grad}(1/N)$ by $-(1/N)\operatorname{grad}\ln N$ we obtain

$$\kappa = \boldsymbol{\nu}\cdot\operatorname{grad}\ln N \tag{8.18}$$

since $\boldsymbol{\nu}$ is perpendicular to $\operatorname{grad} L$. Equation (8.18) carries the information that the rate of change of $N$ in the direction of the centre of curvature is positive. This means that a ray bends towards the region of higher refractive index $N$ or, equivalently, towards a region with a lower speed of light. In a homogeneous medium $N$ is independent of position and the right-hand side of (8.18) vanishes. Thus the curvature $\kappa$ is zero and the rays are straight lines.

The behaviour of the magnitude of $\mathbf{E}_0$ can be found by considering what happens to a tube of rays. Let the tube of rays intersect the wavefronts $L_1$ and $L_2$ in the surface elements $dS_1$ and $dS_2$ respectively (Fig. 8.1). No power will flow across the sides of the tube because the energy moves along a ray; the

flow through any normal cross-section of the tube must therefore be constant. Hence

or, from (8.5),

$$\left(\frac{\varepsilon_1}{\mu_1}\right)^{1/2}|\mathbf{E}_0|_1^2\,dS_1 = \left(\frac{\varepsilon_2}{\mu_2}\right)^{1/2}|\mathbf{E}_0|_2^2\,dS_2. \tag{8.19}$$

Fig. 8.1. Propagation along a ray tube.

This is known as the intensity law of geometrical optics. It may also be obtained by integrating (8.16) over a ray tube and remembering the eikonal equation. Similarly, $(\mu/\varepsilon)^{1/2}\mathbf{H}_0$ satisfies the intensity law of geometrical optics. The intensity law permits the determination of the field amplitude at any point on the ray when it is known at a single point. The form (8.19) is often valuable though other representations are sometimes helpful (§8.3). If $\mu$ is independent of position (8.19) can be expressed in terms of the refractive index $N$ as

$$N_1|\mathbf{E}_0|_1^2\,dS_1 = N_2|\mathbf{E}_0|_2^2\,dS_2. \tag{8.20}$$

When the medium is homogeneous an illuminating formula can be deduced from (8.19). Let the ray through the point A of the wavefront L 1 be the z axis. Take A as origin and choose the (x, z) and (y, z) planes to be the planes containing the principal radii of curvature PI and P2 of the wavefront at A (Fig. 8.2). A ray through B, a point on the x axis adjacent to A, will intersect the z axis at 0 1 and 0lB = Pl' Similarly, a ray through the adjacent point C on the y axis will intersect the z axis at O 2 where 02C = P2' Take a radius of curvature as positive when the centre of curvature is on the negative z axis as in Fig. 8.2; otherwise it is negative. Continue the rays through A, B, and C until they meet the wavefront L 2 at A', B', and C' respectively. Often the ray AA' which passes through the middle of the small element of L 1 is known as an axial ray, whereas

GEOMETRIC THEORY OF DIFFRACTION

440

L, Fig. 8.2. Propagation of energy in a homogeneous isotropic medium.

rays such as BB' through the periphery of the element are called paraxial rays. The paraxial rays form the boundary of a tube. The rays are normal to L 2 and so the normals at A' and B' intersect at 0h i.e. the (x', z') plane contains a principal radius of curvature. This radius of curvature can be no other than PI + s where s is the constant length of ray cut off by the wavefront L 1 and L 2 . Similarly, the (y', z') plane contains the other principal radius of curvature P2 + s. Let dS I be the element of area surrounding A and let the paraxial rays produce the element of area dS2 about A'. Corresponding points in the two areas are related by

, Ipt+sl

x = - - x, PI

y

s ' = !P2+ - - y. P2

l

Hence dS 2 Since N I

= N2 ,

= I(Pt + S)(P2 + S)I dS I • PtP2

(8.21)

(8.20) gives 1/2

E PtP2 E I ob - (PI + S)(P2 + S) I 011 I 1

(8.22)

in a homogeneous medium. When PI = P2 the wavefront is spherical and we have the customary inverse square law of decrease of intensity with distance. It will be observed that (8.22) will give trouble when PI or P2 is negative, i.e. geometrical optics becomes suspect when a principal centre of curvature of a wavefront is in the region into which energy is propagating. Then focusing effects can be anticipated (astigmatism will occur If PI =F P2) and some modification of the theory will be necessary. More generally, we can deduce from (8.21) the relation between dS t and dS2 when the medium is inhomogeneous, for, when the two surfaces are close

together, we may expect (8.21) to continue to hold. In other words, on a small displacement the change in an element of area $dS_1$ is $\{1/\rho_1(s) + 1/\rho_2(s)\}\,dS_1\,ds$ where $\rho_1$ and $\rho_2$ are the principal radii of curvature at s. Hence, in general,

$$dS_2 = dS_1 \exp\int_{s_1}^{s_2}\left\{\frac{1}{\rho_1(s)} + \frac{1}{\rho_2(s)}\right\} ds \tag{8.23}$$

which expresses the expansion of the element of area as an integral of the mean curvature of the wavefront. Equation (8.23) can be inserted in (8.19) to provide the change in intensity from $s_1$ to $s_2$ as

$$|\mathbf{E}_0|_2^2 = \left(\frac{\mu_2\varepsilon_1}{\mu_1\varepsilon_2}\right)^{1/2} |\mathbf{E}_0|_1^2 \exp\left[-\int_{s_1}^{s_2}\left\{\frac{1}{\rho_1(s)} + \frac{1}{\rho_2(s)}\right\} ds\right]. \tag{8.24}$$
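As a numerical check on (8.21)-(8.23), the following sketch integrates the mean-curvature term for a homogeneous medium, where the principal radii grow as $\rho_1 + s$ and $\rho_2 + s$, and compares the result with the closed form (8.21). The function names and the sample radii are illustrative assumptions, not part of the text.

```python
import math

def area_ratio_closed_form(rho1, rho2, s):
    """dS2/dS1 from (8.21) for a homogeneous medium."""
    return abs((rho1 + s) * (rho2 + s) / (rho1 * rho2))

def area_ratio_integral(rho1_of_s, rho2_of_s, s1, s2, n=2000):
    """dS2/dS1 from (8.23): exponential of the integral of 1/rho1 + 1/rho2,
    evaluated here with Simpson's rule on n (even) panels."""
    if n % 2:
        n += 1
    h = (s2 - s1) / n

    def g(s):
        return 1.0 / rho1_of_s(s) + 1.0 / rho2_of_s(s)

    total = g(s1) + g(s2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(s1 + k * h)
    return math.exp(total * h / 3.0)

def amplitude_ratio(rho1, rho2, s):
    """|E0|_2 / |E0|_1 from (8.22)."""
    return math.sqrt(1.0 / area_ratio_closed_form(rho1, rho2, s))

# Illustrative values (assumed): radii 2 and 5 at the first wavefront, ray length 3.
rho1, rho2, s = 2.0, 5.0, 3.0
closed = area_ratio_closed_form(rho1, rho2, s)
numeric = area_ratio_integral(lambda t: rho1 + t, lambda t: rho2 + t, 0.0, s)
print(closed, numeric, amplitude_ratio(rho1, rho2, s))
```

In an inhomogeneous medium only `area_ratio_integral` applies, with the radii supplied along the ray by whatever ray-tracing procedure is in use.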

8.3 The ray and transport equations

A standard method of generating solutions of a partial differential equation such as the eikonal equation (8.9) is to introduce the curves which satisfy the ordinary differential equations

$$\frac{dx}{d\sigma} = \frac{\partial L}{\partial x}, \qquad \frac{dy}{d\sigma} = \frac{\partial L}{\partial y}, \qquad \frac{dz}{d\sigma} = \frac{\partial L}{\partial z}. \tag{8.25}$$

The parameter $\sigma$ varies along a curve of solution and is connected to the arc length s by

$$ds = N\,d\sigma \tag{8.26}$$

on account of the eikonal equation. It is obvious from (8.25) that the curves are normal to the wavefronts and therefore we can identify them with rays. Yet the most convenient form is not furnished by (8.25) because the unknown L is involved. To eliminate L, observe that

$$\frac{d^2x}{d\sigma^2} = \frac{d}{d\sigma}\left(\frac{\partial L}{\partial x}\right) = \frac{\partial^2 L}{\partial x^2}\frac{dx}{d\sigma} + \frac{\partial^2 L}{\partial x\,\partial y}\frac{dy}{d\sigma} + \frac{\partial^2 L}{\partial x\,\partial z}\frac{dz}{d\sigma} = \frac{1}{2}\frac{\partial}{\partial x}(\operatorname{grad}^2 L)$$

from (8.25). We infer from the eikonal equation (8.9) that

$$\frac{d^2x}{d\sigma^2} = N\frac{\partial N}{\partial x}, \qquad \frac{d^2y}{d\sigma^2} = N\frac{\partial N}{\partial y}, \qquad \frac{d^2z}{d\sigma^2} = N\frac{\partial N}{\partial z}. \tag{8.27}$$

These are differential equations for the rays which can be tracked directly and, in particular, by numerical methods. They can be solved when suitable initial data are supplied for some value of $\sigma$ (see also §8.24). The differential equations for the rays can be expressed in terms of the arc

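length via (8.26). They become

$$\frac{d}{ds}\left(N\frac{dx}{ds}\right) = \frac{\partial N}{\partial x}, \qquad \frac{d}{ds}\left(N\frac{dy}{ds}\right) = \frac{\partial N}{\partial y}, \qquad \frac{d}{ds}\left(N\frac{dz}{ds}\right) = \frac{\partial N}{\partial z} \tag{8.28}$$

subject to

$$\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2 + \left(\frac{dz}{ds}\right)^2 = 1. \tag{8.29}$$

Once the rays have been found the eikonal L can be determined by remarking that its change between two points on a ray is $\int N\,ds$. The magnitude of $\mathbf{E}_0$ can be found from the intensity law and so the geometrical optics field is known completely as soon as the direction of $\mathbf{E}_0$ is discovered. To unravel the change in polarization along a ray a return is made to the transport equation (8.12) which now reduces to

$$2(\operatorname{grad} L\cdot\operatorname{grad})\mathbf{E}_0 + \mu\mathbf{E}_0\operatorname{div}\left(\frac{1}{\mu}\operatorname{grad} L\right) + \frac{1}{\mu\varepsilon}\{\mathbf{E}_0\cdot\operatorname{grad}(\mu\varepsilon)\}\operatorname{grad} L = 0. \tag{8.30}$$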

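Introduce the vector $\mathbf{P} = \mathbf{E}_0/|\mathbf{E}_0|$; then

$$\frac{d\mathbf{P}}{d\sigma} = \frac{1}{|\mathbf{E}_0|}\frac{d\mathbf{E}_0}{d\sigma} - \frac{\mathbf{E}_0}{|\mathbf{E}_0|^3}\,\mathbf{E}_0\cdot\frac{d\mathbf{E}_0}{d\sigma}.$$

But, from (8.30), (8.25), and (8.8),

$$\mathbf{E}_0\cdot\frac{d\mathbf{E}_0}{d\sigma} = -\tfrac{1}{2}\mu|\mathbf{E}_0|^2\operatorname{div}\left(\frac{1}{\mu}\operatorname{grad} L\right).$$

Hence (8.30) gives

$$\frac{d\mathbf{P}}{d\sigma} + \frac{1}{N}(\mathbf{P}\cdot\operatorname{grad} N)\operatorname{grad} L = 0. \tag{8.31}$$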

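It may be verified readily that $\mathbf{H}_0/|\mathbf{H}_0|$ satisfies (8.31) also. One consequence of (8.31) is that $\mathbf{P}\cdot d\mathbf{P}/d\sigma = 0$ so that $|\mathbf{P}|^2$ is independent of $\sigma$, which is consistent with $\mathbf{P}$ being a unit vector. Let $\mathbf{b}$ be a unit vector in the direction of the binormal to the ray. Then $\mathbf{b} = \mathbf{s}\wedge\boldsymbol{\nu}$ and the Serret-Frenet formulae are

$$\frac{d\mathbf{s}}{ds} = \kappa\boldsymbol{\nu}, \qquad \frac{d\boldsymbol{\nu}}{ds} = -\kappa\mathbf{s} + \tau\mathbf{b}, \qquad \frac{d\mathbf{b}}{ds} = -\tau\boldsymbol{\nu}$$

where $\tau$ is the torsion of the ray. Then, from (8.28),

$$\operatorname{grad} N = \frac{d(N\mathbf{s})}{ds} = N\kappa\boldsymbol{\nu} + \mathbf{s}\frac{dN}{ds}. \tag{8.32}$$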

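Since $\mathbf{P}\cdot\mathbf{s} = 0$ it follows from (8.26) that (8.31) can be written as

$$\frac{d\mathbf{P}}{ds} + \kappa(\mathbf{P}\cdot\boldsymbol{\nu})\mathbf{s} = 0.$$

Because $\mathbf{P}$ is perpendicular to $\mathbf{s}$, it is a linear combination of $\boldsymbol{\nu}$ and $\mathbf{b}$, say $\mathbf{P} = \alpha\boldsymbol{\nu} + \beta\mathbf{b}$. Further $\alpha^2 + \beta^2 = 1$ because $\mathbf{P}$ is a unit vector. Hence

$$\alpha\frac{d\boldsymbol{\nu}}{ds} + \boldsymbol{\nu}\frac{d\alpha}{ds} + \beta\frac{d\mathbf{b}}{ds} + \mathbf{b}\frac{d\beta}{ds} + \kappa\alpha\mathbf{s} = 0$$

or

$$\boldsymbol{\nu}\left(\frac{d\alpha}{ds} - \tau\beta\right) + \mathbf{b}\left(\frac{d\beta}{ds} + \tau\alpha\right) = 0$$

on account of (8.32). Since $\mathbf{b}$ and $\boldsymbol{\nu}$ are perpendicular, their coefficients must vanish. So, if $\alpha = \cos\theta$ and $\beta = \sin\theta$,

$$\frac{d}{ds}\exp(i\theta) + i\tau\exp(i\theta) = 0$$

whence

$$\exp(i\theta) = \exp(i\theta_1)\exp\left(-i\int_{s_1}^{s}\tau\,ds\right)$$

where $\theta_1$ is the value of $\theta$ at $s_1$. Hence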

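$$\mathbf{E}_0 = |\mathbf{E}_0|\left\{\boldsymbol{\nu}\cos\left(\theta_1 - \int_{s_1}^{s}\tau\,ds\right) + \mathbf{b}\sin\left(\theta_1 - \int_{s_1}^{s}\tau\,ds\right)\right\}. \tag{8.33}$$

From (8.5)

$$\mathbf{H}_0 = |\mathbf{H}_0|\left\{\mathbf{b}\cos\left(\theta_1 - \int_{s_1}^{s}\tau\,ds\right) - \boldsymbol{\nu}\sin\left(\theta_1 - \int_{s_1}^{s}\tau\,ds\right)\right\}. \tag{8.34}$$

Equations (8.33) and (8.34) show how the polarization varies as one moves along a ray. The vectors $\mathbf{E}_0$, $\mathbf{H}_0$ start in the plane of $\boldsymbol{\nu}$ and $\mathbf{b}$ with $\mathbf{E}_0$ making an angle $\theta_1$ with $\boldsymbol{\nu}$ and then rotate through the angle $-\int\tau\,ds$ with respect to $\boldsymbol{\nu}$ as $\mathbf{E}_0$, $\mathbf{H}_0$, $\boldsymbol{\nu}$, and $\mathbf{b}$ travel down the ray (Fig. 8.3). If the ray is a plane curve the torsion $\tau$ disappears. The vector $\mathbf{E}_0$ then stays in the plane of $\boldsymbol{\nu}$ and $\mathbf{b}$ keeping at a constant angle to $\boldsymbol{\nu}$. In particular, if the medium is homogeneous the vector $\mathbf{E}_0$ always lies in a fixed plane containing the ray.

Once $\mathbf{E}_0$ is known we may contemplate the possibility of determining higher-order approximations from the transport equation (8.12). Write (8.12) as

$$2(\operatorname{grad} L\cdot\operatorname{grad})\mathbf{E}_m + \mu\mathbf{E}_m\operatorname{div}\left(\frac{1}{\mu}\operatorname{grad} L\right) + \frac{1}{\mu\varepsilon}\{\mathbf{E}_m\cdot\operatorname{grad}(\mu\varepsilon)\}\operatorname{grad} L = \mathbf{C}_m$$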

Fig. 8.3. Rotation of polarization along a ray.

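where

$$\mathbf{C}_m = i\mu\operatorname{curl}\left(\frac{1}{\mu}\operatorname{curl}\mathbf{E}_{m-1}\right) - i\operatorname{grad}\left(\frac{1}{\varepsilon}\operatorname{div}\varepsilon\mathbf{E}_{m-1}\right)$$

is supposed to be known. Let $\mathbf{P}_m = \mathbf{E}_m/|\mathbf{E}_0|$. Then the following equation is obtained by the same means as (8.31) was derived:

$$\frac{d\mathbf{P}_m}{ds} + \frac{(\mathbf{P}_m\cdot\operatorname{grad} N)\mathbf{s}}{N} = \frac{\mathbf{C}_m}{2N|\mathbf{E}_0|}. \tag{8.35}$$

A scalar product with $\mathbf{s}$ yields

$$\frac{d(N\mathbf{s}\cdot\mathbf{P}_m)}{ds} = \frac{\mathbf{C}_m\cdot\mathbf{s}}{2|\mathbf{E}_0|}.$$

Hence

$$\left[N\mathbf{s}\cdot\mathbf{P}_m\right]_{s_1}^{s_2} = \int_{s_1}^{s_2}\frac{\mathbf{C}_m\cdot\mathbf{s}}{2|\mathbf{E}_0|}\,ds \qquad (m \geq 1) \tag{8.36}$$

which supplies the change in the component of $\mathbf{P}_m$ normal to the wavefront in going along a ray from $s_1$ to $s_2$ since $\mathbf{C}_m$ and $\mathbf{E}_0$ are supposed to be known. If (8.36) is supplemented by an equation for the components of $\mathbf{P}_m$ tangential to the wavefront then $\mathbf{P}_m$ has been found. Take a scalar product of (8.35) with $\mathbf{P}$. Then

$$\mathbf{P}\cdot\frac{d\mathbf{P}_m}{ds} = \frac{\mathbf{P}\cdot\mathbf{C}_m}{2N|\mathbf{E}_0|}.$$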

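On the other hand, a scalar product of (8.31) with $\mathbf{P}_m$ leads to

$$\mathbf{P}_m\cdot\frac{d\mathbf{P}}{ds} = -\kappa(\mathbf{P}\cdot\boldsymbol{\nu})(\mathbf{P}_m\cdot\mathbf{s}).$$

By addition

$$\frac{d(\mathbf{P}\cdot\mathbf{P}_m)}{ds} = \frac{\mathbf{P}\cdot\mathbf{C}_m}{2N|\mathbf{E}_0|} - \kappa(\mathbf{P}\cdot\boldsymbol{\nu})(\mathbf{P}_m\cdot\mathbf{s})$$

and so

$$\left[\mathbf{P}\cdot\mathbf{P}_m\right]_{s_1}^{s_2} = \int_{s_1}^{s_2}\left\{\frac{\mathbf{P}\cdot\mathbf{C}_m}{2N|\mathbf{E}_0|} - \kappa(\mathbf{P}\cdot\boldsymbol{\nu})(\mathbf{P}_m\cdot\mathbf{s})\right\}ds. \tag{8.37}$$

The integrand is known because $\mathbf{P}$ has already been found while $\mathbf{P}_m\cdot\mathbf{s}$ is given by (8.36). Accordingly, the tangential component of $\mathbf{P}_m$ parallel to $\mathbf{E}_0$ has been determined. In (8.37), $\mathbf{P}$ can be replaced by $\mathbf{H}_0/|\mathbf{H}_0|$ without altering its validity and so all components of $\mathbf{P}_m$ are now available. A similar investigation of (8.13) provides the components of $\mathbf{H}_m$.

It should be remarked that the determination of $\mathbf{C}_m$ may not be trivial in practice. The procedure described supplies the field at points of rays, i.e. in ray coordinates. The vector operators in $\mathbf{C}_m$ have therefore to be expressed in terms of these coordinates or the ray coordinates have to be inverted to Cartesian form. In either case the formulae may be complicated although there is no difficulty of principle.

There is another way of writing the differential equations for a ray which can be revealing. In vector notation (8.25) is abbreviated to

$$\frac{d\mathbf{x}}{d\sigma} = \operatorname{grad} L. \tag{8.38}$$

Also

$$\frac{dL}{d\sigma} = N^2, \tag{8.39}$$

$$\frac{d(\operatorname{grad} L)}{d\sigma} = N\operatorname{grad} N \tag{8.40}$$

as in the derivation of (8.27). Equations (8.38)-(8.40) constitute an autonomous system of first-order differential equations for $\mathbf{x}$, grad L, and L. For some purposes the fact that they are first order may make them preferable to the second-order system of (8.27). Let $\mathbf{k}_0 = k_0\operatorname{grad} L$. Then (8.38) and (8.40) can be expressed as

$$\frac{d\mathbf{x}}{d\sigma} = \frac{\mathbf{k}_0}{k_0}, \qquad \frac{d\mathbf{k}_0}{d\sigma} = k_0N\operatorname{grad} N.$$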

Introduce the speed $c = 1/(\mu\varepsilon)^{1/2}$ and put $ds/dt = c$, the symbol t indicating that it is of the dimensions of time. Then

$$\frac{d\mathbf{x}}{dt} = \frac{c\,\mathbf{k}_0}{k_0N}, \qquad \frac{d\mathbf{k}_0}{dt} = -k_0N\operatorname{grad} c. \tag{8.41}$$
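The first-order system (8.38) and (8.40) lends itself to direct numerical integration. The sketch below is one possible arrangement, not a method prescribed by the text: it advances $\mathbf{x}$ and $\mathbf{p} = \operatorname{grad} L$ in the (x, z) plane with the classical fourth-order Runge-Kutta rule, using $N\operatorname{grad} N = \tfrac{1}{2}\operatorname{grad}(N^2)$. The function `trace_ray`, its arguments, and the test medium $N^2 = 1 + z$ are assumptions chosen for illustration; for that medium the ray through the origin has the closed form $x = 2C\{(1 + z - C^2)^{1/2} - (1 - C^2)^{1/2}\}$, which serves as a check.

```python
import math

def trace_ray(grad_n2, start, direction, dsigma, steps):
    """Integrate dx/dsigma = p, dp/dsigma = N grad N = (1/2) grad(N^2)
    in the (x, z) plane with the classical fourth-order Runge-Kutta rule.
    `direction` is grad L at the start, so |direction| must equal N there."""
    def rhs(state):
        x, z, px, pz = state
        gx, gz = grad_n2(x, z)
        return (px, pz, 0.5 * gx, 0.5 * gz)

    def shifted(state, slope, factor):
        return tuple(s + factor * k for s, k in zip(state, slope))

    state = (start[0], start[1], direction[0], direction[1])
    path = [state]
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * dsigma))
        k3 = rhs(shifted(state, k2, 0.5 * dsigma))
        k4 = rhs(shifted(state, k3, dsigma))
        state = tuple(
            s + dsigma * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        path.append(state)
    return path

# Assumed test medium N^2 = 1 + z, for which grad(N^2) = (0, 1).
C = 0.6                                   # horizontal component of grad L
p0 = (C, math.sqrt(1.0 - C * C))          # |p0| = N(0) = 1
path = trace_ray(lambda x, z: (0.0, 1.0), (0.0, 0.0), p0, 0.01, 100)
x_end, z_end = path[-1][0], path[-1][1]
# Closed-form ray through the origin for this medium.
x_exact = 2 * C * (math.sqrt(1 + z_end - C * C) - math.sqrt(1 - C * C))
print(x_end, z_end, x_exact)
```

Working with $\sigma$ rather than s keeps the right-hand sides polynomial in this example; a conversion to arc length follows from (8.26) if required.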

Before leaving the subject we derive an alternative, but equivalent, form for (8.24). Suppose that $\mathbf{x}$, L, and grad L are given on some surface S. On the surface they will be functions of two parameters, say $\sigma_2$ and $\sigma_3$. Off S values are determined by the integration of (8.38)-(8.40). They will be functions of $\sigma$, $\sigma_2$, and $\sigma_3$ with $\sigma_2$ and $\sigma_3$ constant on any particular ray. Thus $\mathbf{x}$ can be regarded as a function of $\sigma$, $\sigma_2$, and $\sigma_3$. To assist temporarily with the notation replace $\sigma$ by $\sigma_1$ and let $\boldsymbol{\sigma}$ be the vector with components $\sigma_1$, $\sigma_2$, and $\sigma_3$. Form the Jacobian

$$J = \frac{\partial(\mathbf{x})}{\partial(\boldsymbol{\sigma})} = \frac{\partial(x_1, x_2, x_3)}{\partial(\sigma_1, \sigma_2, \sigma_3)}.$$

Then

$$\frac{\partial J}{\partial\sigma_1} = \sum_{j=1}^{3}\sum_{i=1}^{3}\frac{\partial}{\partial\sigma_j}\left(\frac{\partial L}{\partial x_i}\right)\operatorname{cof}\frac{\partial x_i}{\partial\sigma_j}$$

where cof $a_{ij}$ stands for the co-factor of $a_{ij}$ in J. Now $\operatorname{cof}(\partial x_i/\partial\sigma_j) = J\,\partial\sigma_j/\partial x_i$ and so

$$\frac{\partial J}{\partial\sigma_1} = J\sum_{j=1}^{3}\sum_{i=1}^{3}\frac{\partial}{\partial\sigma_j}\left(\frac{\partial L}{\partial x_i}\right)\frac{\partial\sigma_j}{\partial x_i} = J\sum_{i=1}^{3}\frac{\partial^2 L}{\partial x_i^2} = J\nabla^2 L. \tag{8.42}$$

Reverting to our original notation and substituting for $\nabla^2 L$ in (8.16), we obtain

$$\frac{d}{d\sigma}\left(\frac{JE_0^2}{\mu}\right) = 0 \tag{8.43}$$

i.e. $JE_0^2/\mu$ is independent of $\sigma$ on a ray. Changes of J account for the variation of the cross-section of a tube of rays during motion along the tube. Instead of calculating the integral of mean curvature as in (8.24) we have now to evaluate the Jacobian determinant of the relations between ray and Cartesian coordinates (see also §8.24 and R. M. Jones 1968).

8.4 The stratified medium

A medium in which $\mu$ and $\varepsilon$ are constant on any one of a family of parallel planes is said to be stratified. The speed of propagation and refractive index will then vary only in a direction perpendicular to these planes.

Let the family of planes be parallel to z = 0. There is no loss of generality in taking the z axis vertically upwards. Then N is a function of z only. Only rays in the (x, z) plane need to be attended to because those in other planes through the z axis can be obtained by rotation about the z axis. The ray equations (8.28) reduce to

$$\frac{d}{ds}\left(N\frac{dx}{ds}\right) = 0, \qquad \frac{d}{ds}\left(N\frac{dz}{ds}\right) = \frac{\partial N}{\partial z}.$$

Hence

$$\frac{dx}{ds} = \frac{C}{N} \tag{8.44}$$

where C is a constant. From (8.29)

$$\frac{dz}{ds} = \pm\left(1 - \frac{C^2}{N^2}\right)^{1/2}. \tag{8.45}$$

Assume that the ray starts from the origin and that N(0) is positive. Measure s positively along the ray from the origin. Then C will be positive on rays which are propagating to the right. The upper sign in (8.45) will assign a ray travelling upwards, whereas the lower sign furnishes a ray going downwards. To fix ideas, let us concentrate on the upward ray. Then, after division of (8.45) by (8.44),

$$\frac{dz}{dx} = \left(\frac{N^2}{C^2} - 1\right)^{1/2}. \tag{8.46}$$

Consequently, the choice of C decides the slope of the ray as it leaves the origin. Clearly, the ray will be real only if $N^2 > C^2$; it will be assumed that this condition holds. Integration of (8.46) discloses the equation of a ray through the origin as

$$x = \int_0^z\left\{\frac{N^2(w)}{C^2} - 1\right\}^{-1/2} dw. \tag{8.47}$$

Explicit evaluation of the integral is rarely possible and numerical integration will have to be resorted to. Fortunately, some properties can be inferred without knowledge of the explicit value of the integral. If the tangent to the ray makes an angle $\theta$ with the z axis, $dx/ds = \sin\theta$. Hence (8.44) implies that

$$N(z)\sin\theta = C = N(0)\sin\theta_0 \tag{8.48}$$

where $\theta_0$ is the value of $\theta$ at the origin. In a region where N increases with z, $\theta$ decreases and the ray bends towards the vertical. In contrast, if N decreases with z, i.e. the phase speed increases with height, the ray bends away from the vertical. There then arises the possibility that the ray is bent so far around that it starts to come down. In essence, the ray is being reflected and returned to

Fig. 8.4. Rays when N has a minimum at z = m.

its original level. Such reflection can take place only if at some height h the ray is horizontal, i.e. $\theta = \tfrac{1}{2}\pi$. Thus the phenomenon of reflection requires the existence of N(h) such that

$$N(h) = N(0)\sin\theta_0. \tag{8.49}$$

No real angle of launching is allowed unless N(h) < N(0), i.e. the refractive index at altitude h is less than that at launching. Suppose that, in fact, the refractive index decreases to a minimum at z = m and then increases. Let $\theta_m$ be the angle such that

$$\sin\theta_m = \frac{N(m)}{N(0)}.$$

If $\theta_0 < \theta_m$, the condition (8.49) can never be met for any value of h. A ray launched at such an angle will always travel upwards and steadily depart from the level z = 0 (Fig. 8.4). If $\theta_0 > \theta_m$ the ray will be reflected and turn back at the height h given by (8.49). The horizontal distance $x_h$ to the point of turning is, from (8.47) and (8.48), supplied by

$$x_h = N(0)\sin\theta_0\int_0^h\{N^2(w) - N^2(0)\sin^2\theta_0\}^{-1/2}\,dw.$$

In the critical case $\theta_0 = \theta_m$ we have h = m. Now, for w near m, $N^2(w) - N^2(m) = O\{(w - m)^2\}$ because of the minimum of N(w) at w = m. The integrand in $x_h$ thus becomes non-integrable and $x_h \to \infty$ as $\theta_0 \to \theta_m$. Therefore, the limiting ray $\theta_0 = \theta_m$ approaches the altitude z = m asymptotically. As $\theta_0$ increases from zero there is no reflection until $\theta_0 = \theta_m$ at which value $x_h$ is infinite. Further increase of $\theta_0$ must force $x_h$ to decrease. Yet eventually $x_h$ must go back to infinity because that is what happens when $\theta_0 = \tfrac{1}{2}\pi$. Hence $x_h$ has a minimum $x_m$ for some value of $\theta_0$. Thus, for $x_m < x_h < \infty$, there are

at least two rays with different angles of launching which turn down at this value of $x_h$. The equation for the reflected ray is not delivered by (8.47) when $x > x_h$ since in this range the lower sign must be adopted in (8.45) to make the ray migrate downwards. The appropriate formula is

$$x = x_h - N(h)\int_h^z\{N^2(w) - N^2(h)\}^{-1/2}\,dw \tag{8.50}$$

when $x \geq x_h$. The eikonal or phase variation on a ray is estimated by

$$L = \int N\,ds,$$

the integration being along a ray. For an upgoing ray

$$L = \int_0^z\frac{N^2(w)}{\{N^2(w) - C^2\}^{1/2}}\,dw = Cx + \int_0^z\{N^2(w) - C^2\}^{1/2}\,dw \tag{8.51}$$

when L = 0 at the origin. For $\theta_0 > \theta_m$, (8.51) is still correct while the ray is journeying upwards but, after reflection, must be replaced by

$$L = N(h)x + \int_0^h\{N^2(w) - N^2(h)\}^{1/2}\,dw - \int_h^z\{N^2(w) - N^2(h)\}^{1/2}\,dw \tag{8.52}$$

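where now x is given by (8.50). The rays under consideration are plane curves and so, according to the theory of the preceding section, the polarization of the geometrical optics field is invariant. It therefore remains to evaluate the amplitude of the field. Formulae (8.19), (8.24), and (8.43) are available. For the sake of illustration we shall employ (8.43) but we must take care to acknowledge its three-dimensional character. Let r, $\phi$, z be cylindrical polar coordinates. The cylindrical symmetry of the propagation guarantees that the equation of a ray in a plane $\phi$ = constant is

$$f(r, z, \theta_0) = 0. \tag{8.53}$$

Characterize a ray by $\theta_0$ and $\phi$; then, in the definition of J, $\sigma_2 = \theta_0$ and $\sigma_3 = \phi$. Hence

$$J = r\left(\frac{\partial r}{\partial\theta_0}\frac{\partial z}{\partial\sigma} - \frac{\partial r}{\partial\sigma}\frac{\partial z}{\partial\theta_0}\right).$$

From (8.53),

$$\frac{\partial f}{\partial r}\frac{\partial r}{\partial\sigma} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\sigma} = 0$$

and

$$\frac{\partial f}{\partial r}\frac{\partial r}{\partial\theta_0} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\theta_0} + \frac{\partial f}{\partial\theta_0} = 0$$

so that

$$\frac{\partial r}{\partial\sigma} = -\lambda\frac{\partial f}{\partial z}, \qquad \frac{\partial z}{\partial\sigma} = \lambda\frac{\partial f}{\partial r}.$$

Hence $J = -r\lambda\,\partial f/\partial\theta_0$. On account of (8.26)

$$\lambda^2\left\{\left(\frac{\partial f}{\partial r}\right)^2 + \left(\frac{\partial f}{\partial z}\right)^2\right\} = N^2.$$

Then $|\mathbf{E}_0|$ is determined from

$$\frac{r(\varepsilon/\mu)^{1/2}|\mathbf{E}_0|^2\,|\partial f/\partial\theta_0|}{\{(\partial f/\partial r)^2 + (\partial f/\partial z)^2\}^{1/2}} = \text{constant} \tag{8.54}$$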

on a ray. The geometrical optics field is therefore completely known. There is, however, a difficulty in that $\partial f/\partial\theta_0$ is zero on a reflected ray at the point of turning. Hence, geometrical optics ceases to hold near the point of turning. Indeed, there is also a change of phase of $\tfrac{1}{2}\pi$ introduced by passing through the point of turning as will be seen later (§8.14) when we come back to the consideration of how to cope with the failure of geometric optics.
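The turning height of (8.49) and the horizontal distance $x_h$ can be evaluated numerically even though the integrand has an inverse-square-root singularity at w = h. The sketch below is an assumed arrangement rather than the text's own procedure: it locates h by bisection and removes the end-point singularity with the substitution $w = h - u^2$ before applying the midpoint rule. The profile $N^2 = 1 - z$ is an illustrative choice for which $h = \cos^2\theta_0$ and $x_h = \sin 2\theta_0$ in closed form.

```python
import math

def turning_height(n, c, z_max, tol=1e-12):
    """Solve N(h) = C by bisection on [0, z_max]; assumes N decreases there."""
    lo, hi = 0.0, z_max
    if n(lo) < c or n(hi) > c:
        raise ValueError("no turning point bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n(mid) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def turning_range(n, theta0, z_max, panels=400):
    """Turning height h and horizontal distance x_h from (8.47)-(8.49).
    The substitution w = h - u^2 removes the end-point singularity
    before the midpoint rule is applied in the variable u."""
    c = n(0.0) * math.sin(theta0)
    h = turning_height(n, c, z_max)
    du = math.sqrt(h) / panels
    total = 0.0
    for k in range(panels):
        u = (k + 0.5) * du
        w = h - u * u
        total += 2.0 * u / math.sqrt(n(w) ** 2 - c * c) * du
    return h, c * total

# Assumed profile N^2 = 1 - z, launched at 60 degrees to the vertical.
theta0 = math.radians(60.0)
h, x_h = turning_range(lambda z: math.sqrt(1.0 - z), theta0, 0.9)
print(h, x_h, math.sin(2 * theta0))
```

Near the critical angle $\theta_m$ of a profile with a minimum the transformed integrand is no longer bounded, in keeping with $x_h \to \infty$ there, so this simple quadrature should not be trusted in that limit.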

Exercises

3. A horizontally stratified medium has $N^2 = 1 + z$. Show that an upgoing ray through $z = z_0$ is

$$x = 2C(1 + z - C^2)^{1/2} - 2C(1 + z_0 - C^2)^{1/2}$$

and find the equation of a downgoing ray. If $1 + z_0 > C^2 > 1$ show that the downgoing ray starts to rise eventually and then has equation

$$x = 2C\{(1 + z_0 - C^2)^{1/2} + (1 + z - C^2)^{1/2}\}.$$

Prove also that the point of turning lies on the curve

$$x = 2(1 + z)^{1/2}(z_0 - z)^{1/2}.$$

4. In Exercise 3 show that on the downgoing ray

$$L = Cx + \tfrac{2}{3}(1 + z_0 - C^2)^{3/2} \pm \tfrac{2}{3}(1 + z - C^2)^{3/2}$$

indicating where the different signs are to be employed.

5. In a horizontally stratified medium $N^2 = N_1^2(1 + a^2z^2)$. A downward ray leaves x = 0, $z = z_0$ at an angle $\cos^{-1}\{C/N_1(1 + a^2z_0^2)^{1/2}\}$ to the x axis. If $C > N_1$ show that the ray turns at $z = h = (C^2 - N_1^2)^{1/2}/aN_1$ and that its equation thereafter is

$$x = \frac{C}{N_1a}\ln\frac{\{z + (z^2 - h^2)^{1/2}\}\{z_0 + (z_0^2 - h^2)^{1/2}\}}{h^2}.$$

Prove that the change of phase along the ray is

the upper or lower sign being used according as the rising or falling part of the ray is under consideration.

6. By commencing with the vector form of (8.28), prove that the differential equations for rays in two-dimensional cylindrical polar coordinates r, $\phi$ are

$$N(\ddot{r} - r\dot{\phi}^2) + \dot{N}\dot{r} = \frac{\partial N}{\partial r}, \qquad N(r\ddot{\phi} + 2\dot{r}\dot{\phi}) + \dot{N}r\dot{\phi} = \frac{1}{r}\frac{\partial N}{\partial\phi},$$

the dots signifying derivatives with respect to s. In a radially stratified medium $\partial N/\partial\phi = 0$. Prove that in such a medium the rays have equations of the form

8.5 Fermat's principle

Draw any curve joining the points $P_0$ and P. Then the line integral $\int_{P_0}^{P} N\,ds$ is known as the optical path length between $P_0$ and P along the curve. In general, the optical path length will depend not only upon the end-points $P_0$ and P but also on the actual path traced between them. If the curve is a ray, the optical path length is $L(P) - L(P_0)$ on account of (8.39). An immediate deduction is that the optical path length is the same for every ray between two wavefronts. Comparison of (8.28) with the equations of motion of a particle in the theory of mechanics informs us that the optical path length is stationary on a ray. This way of characterizing a ray is embodied in

Fermat's principle: The rays between $P_0$ and P are those curves along which the optical path length is stationary with respect to infinitesimal variations in the path.

There may, of course, be several rays joining $P_0$ and P because of the various routes which can be open. The inferences to be drawn from this guiding principle will now be examined. From the outset it must be recognized that the principle does not require the optical path length to be a minimum; only stationarity with respect to local variations of path is asked for. If this point is forgotten significant rays may be omitted.

When the medium is homogeneous the optical path length is a constant multiple of the geometrical path length. A straight line denotes a minimum for both; therefore Fermat's principle asserts that in a homogeneous medium the rays are straight lines, in harmony with previous conclusions.

Consider the problem of an interface separating two different homogeneous media with $P_0$ and P on the same side of the interface (Fig. 8.5). Of the paths

Fig. 8.5. Fermat's principle applied to reflection.

joining $P_0$ and P, the straight line $P_0P$ is one path where the arc length is stationary. In fact, it is an absolute minimum of all paths between $P_0$ and P. It corresponds to the direct ray from $P_0$ to P and carries the primary radiation as if there were no interface present.

The next eventuality to investigate is whether there is a point O on the interface for which $P_0O + OP$ is stationary for small displacements of O on the interface. The connecting paths $P_0O$ and OP must be straight lines because the medium is homogeneous. Let O' be a point near to O with its position specified by $\mathbf{t}\,\delta s$ where $\mathbf{t}$ is a unit vector tangential to the interface at O. Draw $\mathbf{n}$ the unit vector along the normal to the interface at O and let $\mathbf{a}$, $\mathbf{b}$ be unit vectors along $P_0O$ and OP respectively. Then the vectors $P_0O$ and OP may be expressed as $l_1\mathbf{a}$ and $l_2\mathbf{b}$ respectively. Let the lengths of $P_0O'$ and $O'P$ be $l_1 + \delta l_1$ and $l_2 + \delta l_2$ respectively. The difference between the optical path length along $P_0O'P$ and that along $P_0OP$ is $N(\delta l_1 + \delta l_2)$. According to Fermat's principle this must vanish to the first order. Now, if $\mathbf{a} + \delta\mathbf{a}$ is a unit vector along $P_0O'$,

$$(l_1 + \delta l_1)(\mathbf{a} + \delta\mathbf{a}) = l_1\mathbf{a} + \mathbf{t}\,\delta s$$

or

$$l_1\,\delta\mathbf{a} + \mathbf{a}\,\delta l_1 = \mathbf{t}\,\delta s$$

correct to the first order. But $\mathbf{a}\cdot\delta\mathbf{a} = 0$ because $\mathbf{a} + \delta\mathbf{a}$ is a unit vector. Hence

$$\delta l_1 = \mathbf{t}\cdot\mathbf{a}\,\delta s.$$

Similarly, $\delta l_2 = -\mathbf{t}\cdot\mathbf{b}\,\delta s$. As a consequence, Fermat's principle enforces

$$(\mathbf{a} - \mathbf{b})\cdot\mathbf{t} = 0$$

for all $\mathbf{t}$ in the tangent plane at O. Therefore, the plane containing $\mathbf{a}$ and $\mathbf{b}$ is perpendicular to the tangent plane, i.e. the incident ray, reflected ray, and surface normal are all in the same plane. Also the incident and reflected rays make equal angles with the interface normal. This is in accordance with the customary laws of reflection. Note that no assertion is made that there is only a single point of reflection; there could be several relevant points depending on the relative positions of $P_0$ and P. Snell's laws of refraction may also be verified by placing $P_0$ and P on opposite sides of the interface but the details will be omitted.

These results might have been anticipated since the approximation of geometrical optics is one of a locally plane wave, and they suggest that even when the media are inhomogeneous Snell's laws are valid for reflection and refraction at a discontinuity. Nevertheless they must be regarded as an unexpected bonus. The reason for that statement is that geometrical optics cannot be expected to be valid near the interface because the discontinuity in N violates the hypothesis on which the theory is based, namely that the material characteristics shall be slowly varying. Actually, there is no real paradox because it is not genuinely affirmed that the rays are reflected and refracted at an interface according to Snell's laws. What is contended is that, away from the interface, the reflected and refracted waves can be thought of as composed of rays and that, if the rays were continued back to the interface (despite the ray picture being no longer a proper description there), they would comply with Snell's laws (see also §8.28).
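The stationarity argument for reflection can be checked numerically for a plane interface. The sketch below is an illustration under assumed data (the interface is z = 0, and the points $P_0$ and P are arbitrary choices): a golden-section search finds the point O that makes $P_0O + OP$ stationary, after which $(\mathbf{a} - \mathbf{b})\cdot\mathbf{t}$ is evaluated directly. For this convex path length the stationary point is a minimum, which is a special case; the principle itself asks only for stationarity.

```python
import math

def path_length(x, p0, p):
    """Length P0-O-P for a reflection point O = (x, 0) on the plane z = 0."""
    return math.hypot(x - p0[0], p0[1]) + math.hypot(p[0] - x, p[1])

def stationary_point(p0, p, tol=1e-10):
    """Golden-section search for the minimum of the path length
    between the feet of P0 and P on the interface."""
    a, b = min(p0[0], p[0]), max(p0[0], p[0])
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if path_length(c, p0, p) < path_length(d, p0, p):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Assumed positions of P0 and P, both above the interface z = 0.
p0, p = (0.0, 1.0), (3.0, 2.0)
x = stationary_point(p0, p)
l1 = math.hypot(x - p0[0], p0[1])
l2 = math.hypot(p[0] - x, p[1])
a_vec = ((x - p0[0]) / l1, -p0[1] / l1)   # unit vector along P0 -> O
b_vec = ((p[0] - x) / l2, p[1] / l2)      # unit vector along O -> P
# With tangent t = (1, 0), (a - b).t is the difference of the x components.
print(x, a_vec[0] - b_vec[0])
```

The vanishing of the last quantity is the statement that the incident and reflected rays make equal angles with the normal.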

Exercise

7. By placing $P_0$ and P on opposite sides of the interface in Fig. 8.5 verify that the incident ray, refracted ray, and surface normal lie in a plane. Show also that, if $\cos(\mathbf{a}, \mathbf{t})$ is the cosine of the angle between $\mathbf{a}$ and $\mathbf{t}$,

$$N_1\cos(\mathbf{a}, \mathbf{t}) = N_2\cos(\mathbf{b}, \mathbf{t})$$

where $N_1$ and $N_2$ are the refractive indices of the two media.

Fermat's principle tells us the trajectory followed by a ray on reflection or refraction but says nothing about what happens to the amplitude of the field. The resolution of this difficulty depends upon the exact solution of reflection by an interface. However, it will be enough to discuss a plane wave and extract the reflection and refraction coefficients from that theory because of the local nature of the geometrical optics approximation. In other words, supplementary information is abstracted from a canonical problem. Let the incident, reflected, and refracted rays be distinguished by the superscripts i, r, and t respectively. Let the incident and reflected rays make an angle $\theta$ with the normal to the interface (Fig. 8.6) whereas the refracted ray is inclined at an angle $\theta_1$, where $N\sin\theta = N_1\sin\theta_1$, N and $N_1$ being the refractive indices on each side of the interface. It is already known that the three rays through O lie in a plane, called the plane of incidence. The electric field may

Fig. 8.6. Reflection and refraction at an interface.

therefore be split into a component orthogonal to the plane of incidence (such a wave will be designated electrically polarized) and a component in the plane of incidence (giving rise to a magnetically polarized wave). For an electrically polarized wave the reflection coefficient $R_E$ and transmission coefficient $T_E$ are given by

$$R_E = \frac{\mu_1N\cos\theta - \mu(N_1^2 - N^2\sin^2\theta)^{1/2}}{\mu_1N\cos\theta + \mu(N_1^2 - N^2\sin^2\theta)^{1/2}}, \tag{8.55}$$

$$T_E = \frac{2\mu_1N\cos\theta}{\mu_1N\cos\theta + \mu(N_1^2 - N^2\sin^2\theta)^{1/2}} \tag{8.56}$$

while in the case of the magnetically polarized wave

$$R_M = \frac{\mu N_1^2\cos\theta - \mu_1N(N_1^2 - N^2\sin^2\theta)^{1/2}}{\mu N_1^2\cos\theta + \mu_1N(N_1^2 - N^2\sin^2\theta)^{1/2}}, \tag{8.57}$$

$$T_M = \frac{(\mu_1\varepsilon/\mu\varepsilon_1)^{1/2}\,2\mu N_1^2\cos\theta}{\mu N_1^2\cos\theta + \mu_1N(N_1^2 - N^2\sin^2\theta)^{1/2}}. \tag{8.58}$$

In (8.57) the components are in the directions of $\mathbf{H}^i$ and $\mathbf{H}^r$ in Fig. 8.6 while the extra factor $(\mu_1\varepsilon/\mu\varepsilon_1)^{1/2}$ in (8.58) takes account of the fact that the electric intensity rather than the magnetic intensity is involved. If the boundary should be perfectly conducting these simplify to $R_E = -1$, $R_M = 1$.

In homogeneous media the field on a reflected ray may now be taken as

$$\mathbf{E}^r = \left|\frac{\rho_1^r\rho_2^r}{(\rho_1^r + s^r)(\rho_2^r + s^r)}\right|^{1/2}\exp(-ik_0L^r)\,\mathbf{R}\mathbf{E}^i(O) \tag{8.59}$$

from (8.22). The superscript r indicates quantities appropriate to the reflected

wavefront and $\mathbf{R}$ is a reflection matrix. When the field is dissected into its electrically and magnetically polarized parts

$$\mathbf{R} = \begin{pmatrix} R_E & 0 \\ 0 & R_M \end{pmatrix} \tag{8.60}$$

where $R_E$ and $R_M$ are given by (8.55) and (8.57). There is a similar representation for the transmitted wave. If the media are not homogeneous we can still apply

$$\mathbf{E}^r(O) = \mathbf{R}\mathbf{E}^i(O), \qquad \mathbf{E}^t(O) = \mathbf{T}\mathbf{E}^i(O)$$

but now N, $N_1$, $\mu$, and $\mu_1$ must have their values at O in $\mathbf{R}$ and $\mathbf{T}$. Away from the boundary, the field has to be determined by the laws already established for tubes and the possibility of a rotation of polarization cannot be ignored.

In (8.59) the curvature of the reflected wavefront must be known before calculations can be carried out and we now investigate the affiliation with the curvatures of the incident wavefront and interface. Let the z axis in Fig. 8.2 go from A towards A' and let $\xi$, $\eta$ be any Cartesian coordinates in the (x, y) plane with origin A. Then, on the axial and paraxial rays,

(8.61)

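correct to the second order. Here $\boldsymbol{\xi}$ is a vector in the (x, y) plane and $\mathbf{C}$ is a symmetric matrix governing the behaviour of the curvature. In fact,

$$\det\mathbf{C}(0) = \frac{1}{\rho_1\rho_2}, \qquad \operatorname{trace}\mathbf{C}(0) = \frac{1}{\rho_1} + \frac{1}{\rho_2}. \tag{8.62}$$

Since $\operatorname{grad}^2 L$ is constant a derivative with respect to z, followed by putting $\boldsymbol{\xi} = 0$, z = 0, gives $(\partial^2L/\partial z^2)_0 = 0$. Hence (8.61) reduces to

$$L(\boldsymbol{\xi}, z) = L(0, 0) + Nz + \tfrac{1}{2}N\boldsymbol{\xi}^T\mathbf{C}(0)\boldsymbol{\xi} \tag{8.63}$$

so long as z is correct to the second order.

Now choose the origin to be the point O of the frontier in Fig. 8.6. Let $t_1$, $t_2$ be coordinates perpendicular to the unit normal $\mathbf{n}$ with $t_1$ perpendicular to the plane of incidence. If $\boldsymbol{\xi}^i$, $z^i$ and $\boldsymbol{\xi}^r$, $z^r$ are the coordinates relevant to the incident and reflected waves select $\xi^i$ and $\xi^r$ as perpendicular to the plane of incidence. Let $\mathbf{t}$ be the tangential position of a point on the boundary near O; the displacement of the point in the direction of $\mathbf{n}$ will be $-\tfrac{1}{2}\mathbf{t}^T\mathbf{C}_b\mathbf{t}$ where $\mathbf{C}_b$ is the curvature matrix of the interface at O. Hence, for results correct to the second order, we may take

$$z^i = t_2\sin\theta + \tfrac{1}{2}\mathbf{t}^T\mathbf{C}_b\mathbf{t}\cos\theta, \qquad \xi^i = t_1, \qquad \eta^i = -t_2\cos\theta.$$

Hence, from (8.63),

$$L^i = L^i(0, 0) + Nt_2\sin\theta + \tfrac{1}{2}N\mathbf{t}^T\mathbf{C}_b\mathbf{t}\cos\theta + \tfrac{1}{2}N\boldsymbol{\xi}^{iT}\mathbf{C}^i\boldsymbol{\xi}^i.$$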

For the reflected ray

$$z^r = t_2\sin\theta - \tfrac{1}{2}\mathbf{t}^T\mathbf{C}_b\mathbf{t}\cos\theta, \qquad \xi^r = t_1, \qquad \eta^r = t_2\cos\theta,$$

$$L^r = L^r(0, 0) + Nt_2\sin\theta - \tfrac{1}{2}N\mathbf{t}^T\mathbf{C}_b\mathbf{t}\cos\theta + \tfrac{1}{2}N\boldsymbol{\xi}^{rT}\mathbf{C}^r\boldsymbol{\xi}^r.$$

The phases of the incident and reflected waves must agree on the interface, i.e. $L^i = L^r$. That demands firstly that $L^i(0, 0) = L^r(0, 0)$ which fixes the constant in the eikonal. Secondly,

$$\mathbf{t}^T\mathbf{C}_b\mathbf{t}\cos\theta + \tfrac{1}{2}\boldsymbol{\xi}^{iT}\mathbf{C}^i\boldsymbol{\xi}^i = \tfrac{1}{2}\boldsymbol{\xi}^{rT}\mathbf{C}^r\boldsymbol{\xi}^r.$$

This equation holds for arbitrary small $\mathbf{t}$ and so

$$\mathbf{C}^r = \begin{pmatrix} 2C_{b11}\cos\theta + C^i_{11} & 2C_{b12} - C^i_{12} \\ 2C_{b12} - C^i_{12} & 2C_{b22}\sec\theta + C^i_{22} \end{pmatrix} \tag{8.64}$$

where $C_{ij}$ is a typical element of the matrix $\mathbf{C}$. Equation (8.64) specifies the curvature matrix of the reflected wavefront at O and the principal radii of curvature may be deduced from (8.62). There is now enough information to infer the curvature elsewhere.

As regards the transmitted wave a parallel analysis throws up $L^t(0, 0) = L^i(0, 0)$ and

$$N_1\mathbf{C}^t = \begin{pmatrix} C_{b11}(N\cos\theta - N_1\cos\theta_1) + NC^i_{11} & -C_{b12}(N\cos\theta\sec\theta_1 - N_1) + NC^i_{12}\cos\theta\sec\theta_1 \\ -C_{b12}(N\cos\theta\sec\theta_1 - N_1) + NC^i_{12}\cos\theta\sec\theta_1 & C_{b22}\sec\theta_1(N\cos\theta\sec\theta_1 - N_1) + NC^i_{22}\cos^2\theta\sec^2\theta_1 \end{pmatrix}. \tag{8.65}$$

It will be remarked that (8.64) gives trouble when $\theta = \tfrac{1}{2}\pi$, i.e. when the incident ray is tangential to the interface. Also (8.65) is unsatisfactory when the transmitted ray is tangential to the boundary. To deal with such exceptional points more canonical problems have to be solved. We shall return to these matters later (§8.19). In the meantime, we observe that, apart from exceptional points, geometrical optics converts the problem of propagation to one of solving a system of ordinary differential equations such as (8.38)-(8.40). It is therefore appropriate to spend some time on the practical solution of ordinary differential equations.
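A short numerical illustration of (8.64) and (8.62) may help to fix the bookkeeping. The sketch below assembles the curvature matrix of the reflected wavefront from those of the interface and the incident wavefront and then extracts the principal curvatures as the eigenvalues of the symmetric 2 x 2 matrix. The function names, the tuple representation of matrices, and the sample surface are assumptions for illustration; signs follow whatever convention is adopted for $\mathbf{C}_b$ and $\mathbf{C}^i$.

```python
import math

def reflected_curvature(cb, ci, theta):
    """Curvature matrix of the reflected wavefront from (8.64).
    cb and ci are symmetric 2x2 matrices given as ((c11, c12), (c12, c22))."""
    cos_t = math.cos(theta)
    c11 = 2.0 * cb[0][0] * cos_t + ci[0][0]
    c12 = 2.0 * cb[0][1] - ci[0][1]
    c22 = 2.0 * cb[1][1] / cos_t + ci[1][1]
    return ((c11, c12), (c12, c22))

def principal_curvatures(c):
    """Eigenvalues of a symmetric 2x2 matrix, i.e. 1/rho_1 and 1/rho_2 by (8.62)."""
    half_trace = 0.5 * (c[0][0] + c[1][1])
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    root = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return half_trace - root, half_trace + root

# Assumed example: plane incident wave (zero curvature) on a surface whose
# curvature matrix is (1/a) times the identity with a = 4.
a = 4.0
cb = ((1.0 / a, 0.0), (0.0, 1.0 / a))
ci = ((0.0, 0.0), (0.0, 0.0))
normal = principal_curvatures(reflected_curvature(cb, ci, 0.0))
oblique = principal_curvatures(reflected_curvature(cb, ci, math.radians(60.0)))
print(normal, oblique)
```

At normal incidence both curvatures equal 2/a, whereas at oblique incidence the two differ, so the reflected pencil is astigmatic; the sec $\theta$ factor is also where the breakdown at grazing incidence shows up numerically.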

NUMERICAL SOLUTION OF ORDINARY DIFFERENTIAL EQUATIONS

A vast literature exists concerning solving ordinary differential equations, and so it will be necessary to restrict what follows to the salient points. It is hoped that in the process the reader will garner sufficient knowledge of the terminology and properties to be able to pick out the main features in more specialized texts such as Lambert (1991) and Butcher (1987).

8.6 Multistep methods

A typical first-order differential equation is $y' = f(x, y)$ where the prime indicates a derivative with respect to x. Usually, a solution is required on some interval (a, b) such that $y(a) = \eta$, i.e. an initial-value problem has to be solved. By employing vector notation, we may place a system of first-order differential equations in the same form, namely

$$\mathbf{y}' = \mathbf{f}(x, \mathbf{y}), \qquad \mathbf{y}(a) = \boldsymbol{\eta}.$$

The initial-value problem for a higher-order differential equation can always be transformed into one for a first-order system, so confining our discussion to the differential equation in the first sentence will not be a serious limitation in bringing out the basic principles that will be encountered.

All numerical methods solve a differential equation by steps, i.e. values are found at $x_n = a + nh$ where h is the steplength. An efficient computer program should select the steplength automatically at any stage subject to overrule by the user. The target will be to make the steps as large as possible while consistent with any error bounds imposed by the user.

Let $y_n$ be the value found by the computer at $x_n$. In general it will differ from the value $y(x_n)$ attained by the exact solution there. If $f_n$ is written for $f(x_n, y_n)$ an equation

$$\sum_{j=0}^{k}\alpha_jy_{n+j} = h\sum_{j=0}^{k}\beta_jf_{n+j} \tag{8.66}$$

in which $\alpha_k = 1$ is said to produce a linear multistep method of stepnumber k or a linear k-step method. If $\beta_k = 0$ the method is explicit because the current value $y_{n+k}$ can be computed immediately from earlier values. If $\beta_k \neq 0$ the method is implicit and finding $y_{n+k}$ obliges one to solve a non-linear equation of the form

$$y_{n+k} = h\beta_kf(x_{n+k}, y_{n+k}) + g$$

g being known. Usually, the iterative techniques of §1.8 (§4.5 for a system) will be involved.

By Taylor's theorem

$$y(x_n + h) = y(x_n) + hy'(x_n) + \tfrac{1}{2}h^2y''(x_n) + \cdots.$$

Truncating this expansion after two terms and substituting for y' from the differential equation, we obtain

$$y(x_n + h) = y(x_n) + hf(x_n, y(x_n)). \tag{8.67}$$

If $y(x_n)$ is now replaced by its approximation $y_n$, there ensues

$$y_{n+1} = y_n + hf_n. \tag{8.68}$$

The formula (8.68) provides an explicit linear one-step method, known as Euler's rule. Since the inaccuracy in (8.67) is $O(h^2)$, the error in Euler's rule may be said to be of $O(h^2)$. The inclusion of the third term in the Taylor expansion together with the expansion of $y(x_n - h)$ yields the mid-point rule

$$y_{n+2} - y_n = 2hf_{n+1}$$

with an error of $O(h^3)$. Another method with error of $O(h^3)$ is the trapezoidal rule

$$y_{n+1} - y_n = \tfrac{1}{2}h(f_n + f_{n+1}).$$
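The three rules just quoted can be compared on a simple test problem. The sketch below is illustrative only: the test equation $y' = y$, $y(0) = 1$ on [0, 1], the exact second starting value supplied to the mid-point rule, and the use of fixed-point iteration for the implicit trapezoidal rule are all assumptions made for the demonstration. The $O(h^2)$ and $O(h^3)$ statements above refer to a single step; accumulated over the roughly 1/h steps needed to reach a fixed x, the errors are expected to scale as h for Euler's rule and as $h^2$ for the other two.

```python
import math

def euler(f, a, eta, h, n):
    """Euler's rule (8.68): y_{n+1} = y_n + h f_n."""
    x, y = a, eta
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

def midpoint(f, a, eta, y1, h, n):
    """Mid-point rule y_{n+2} - y_n = 2 h f_{n+1}; needs a second starting value y1."""
    y_prev, y_curr = eta, y1
    for k in range(1, n):
        y_prev, y_curr = y_curr, y_prev + 2.0 * h * f(a + k * h, y_curr)
    return y_curr

def trapezoidal(f, a, eta, h, n, sweeps=50):
    """Trapezoidal rule; the implicit equation is solved by fixed-point iteration."""
    x, y = a, eta
    for _ in range(n):
        fn = f(x, y)
        y_next = y + h * fn                      # Euler predictor as first guess
        for _ in range(sweeps):
            y_next = y + 0.5 * h * (fn + f(x + h, y_next))
        x, y = x + h, y_next
    return y

f = lambda x, y: y                               # test problem y' = y, y(0) = 1
exact = math.e
for h in (0.01, 0.005):
    n = round(1.0 / h)
    print(h,
          abs(euler(f, 0.0, 1.0, h, n) - exact),
          abs(midpoint(f, 0.0, 1.0, math.exp(h), h, n) - exact),
          abs(trapezoidal(f, 0.0, 1.0, h, n) - exact))
```

Halving h should roughly halve the Euler error and quarter the other two. The fixed-point sweeps converge here because $\tfrac{1}{2}h\,|\partial f/\partial y|$ is small; a stiff problem would call for a Newton-type iteration instead.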

A different way of generating multistep methods is to start from the exact equation

$$y(x) - y(x_n) = \int_{x_n}^{x}y'(t)\,dt = \int_{x_n}^{x}f(t, y(t))\,dt. \tag{8.69}$$

The integral is evaluated by approximating f by a polynomial. For example, if P(x) is the quadratic polynomial which takes the values $f_{n+1}$, $f_n$, and $f_{n-1}$ at $x_{n+1}$, $x_n$, and $x_{n-1}$ respectively, we write

$$y_{n+1} - y_n = \int_{x_n}^{x_{n+1}}P(t)\,dt = \frac{h}{12}(5f_{n+1} + 8f_n - f_{n-1}) \tag{8.70}$$

which is called a two-step Adams-Moulton method and is implicit. The Newton-Cotes quadrature formulae (§5.12), which are based on polynomial approximation, all lead to multistep methods, for instance, Simpson's rule

$$y_{n+2} - y_n = \tfrac{1}{3}h(f_{n+2} + 4f_{n+1} + f_n) \tag{8.71}$$

which is an implicit two-step method.

There are thus many multistep methods (for a selection see Lambert 1973, 1991) and their proliferation accounts for some of the difficulty in assessing their relative merits. One essential requirement is that as $h \to 0$ the numerical solution should tend to the exact solution. Since we wish to compare them at a fixed x we must also ask that, as $h \to 0$, $n \to \infty$ in such a way that nh does not alter. If nh = c in this procedure, we may say that a linear multistep method is convergent if, for all initial values, $y_n \to y(a + c)$ for any c satisfying $0 \leq c \leq b - a$ and for all solutions $y_n$ complying with the starting conditions $y_m = \eta_m(h)$ for which $\eta_m(h) \to \eta$ as $h \to 0$ (m = 0, 1, ..., k - 1).

With a linear multistep method is associated the linear difference operator $\mathcal{L}$ defined by

$$\mathcal{L}[z(x); h] = \sum_{j=0}^{k} \{\alpha_j z(x + jh) - h\beta_j z'(x + jh)\} \tag{8.72}$$

where $z$ is any function which is continuously differentiable on $[a, b]$. If $z$ and $z'$ are expanded as Taylor series about $x$,

$$\mathcal{L}[z(x); h] = C_0 z(x) + C_1 h z'(x) + \cdots \tag{8.73}$$

where

$$C_0 = \alpha_0 + \alpha_1 + \cdots + \alpha_k,$$

$$C_1 = \alpha_1 + 2\alpha_2 + \cdots + k\alpha_k - (\beta_0 + \beta_1 + \cdots + \beta_k),$$

$$C_q = \frac{\alpha_1 + 2^q\alpha_2 + \cdots + k^q\alpha_k}{q!} - \frac{\beta_1 + 2^{q-1}\beta_2 + \cdots + k^{q-1}\beta_k}{(q-1)!} \tag{8.74}$$

for $q = 2, 3, \ldots$. A multistep method is said to be of order $p$ if $C_0 = C_1 = \cdots = C_p = 0$ but $C_{p+1} \ne 0$; $C_{p+1}$ is then known as the error constant. If $y$ is the theoretical solution of the differential equation, $\mathcal{L}[y(x); h]$ is called the local truncation error.
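The coefficients (8.74) lend themselves to exact rational arithmetic. The sketch below assumes the method is written as $\sum_j \alpha_j y_{n+j} = h\sum_j \beta_j f_{n+j}$ with the coefficients supplied as lists `alpha` and `beta` indexed by $j$; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def error_coefficients(alpha, beta, q_max=8):
    """Return [C_0, ..., C_{q_max}] of (8.74) as exact fractions."""
    coeffs = [Fraction(sum(alpha))]                      # C_0
    for q in range(1, q_max + 1):
        # j**(q-1) with j = 0, q = 1 evaluates to 1, which reproduces C_1.
        a_part = sum(j ** q * a for j, a in enumerate(alpha))
        b_part = sum(j ** (q - 1) * b for j, b in enumerate(beta))
        coeffs.append(Fraction(a_part) / factorial(q)
                      - Fraction(b_part) / factorial(q - 1))
    return coeffs

def order_and_error_constant(alpha, beta, q_max=8):
    """Return (p, C_{p+1}) where C_{p+1} is the first non-vanishing coefficient."""
    for q, c in enumerate(error_coefficients(alpha, beta, q_max)):
        if c != 0:
            return q - 1, c
    raise ValueError("all coefficients up to q_max vanish; increase q_max")

half, third = Fraction(1, 2), Fraction(1, 3)
# Trapezoidal rule: y_{n+1} - y_n = (h/2)(f_n + f_{n+1})
print(order_and_error_constant([-1, 1], [half, half]))
# Simpson's rule (8.71): y_{n+2} - y_n = (h/3)(f_{n+2} + 4 f_{n+1} + f_n)
print(order_and_error_constant([-1, 0, 1], [third, 4 * third, third]))
```

For the trapezoidal rule this gives order 2 with error constant $-1/12$, and for Simpson's rule order 4 with error constant $-1/90$.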

Exercise 8. Find the order and error constant of (a) Euler's rule, (b) Quade's method

$$y_{n+4} - \tfrac{8}{19}(y_{n+3} - y_{n+1}) - y_n = \tfrac{6h}{19}(f_{n+4} + 4f_{n+3} + 4f_{n+1} + f_n).$$

Suppose now that a multistep method is convergent, i.e. as $h \to 0$, $n \to \infty$ with $nh = c$, $y_n \to y(a + c)$. Then $y_{n+j} \to y(a + c)$ for $j = 0, 1, \ldots, k$. Hence $y(a + c) = y_{n+j} + \theta_{jn}(h)$ where $\theta_{jn} \to 0$ for $j = 0, 1, \ldots, k$. Then

$$y(a + c) \sum_{j=0}^{k} \alpha_j = h \sum_{j=0}^{k} \beta_j f_{n+j} + \sum_{j=0}^{k} \alpha_j \theta_{jn}(h).$$

In the limit as $h \to 0$, both terms on the right-hand side vanish. Since $y(a + c)$ is not in general zero, $\sum_{j=0}^{k} \alpha_j = 0$. Thus a necessary condition for convergence is, from (8.74), $C_0 = 0$, i.e. the method is of order zero at least. In addition, $(y_{n+j} - y_n)/jh \to y'(a + c)$ for $j = 1, \ldots, k$ necessitates

$$y_{n+j} - y_n = jh\,y'(a + c) + jh\,\psi_{jn}(h)$$

where $\psi_{jn} \to 0$. Hence

$$\sum_{j=0}^{k} \alpha_j (y_{n+j} - y_n) = h y'(a + c) \sum_{j=0}^{k} j\alpha_j + h \sum_{j=0}^{k} j\alpha_j \psi_{jn}(h).$$

Since $\sum_{j=0}^{k} \alpha_j = 0$, the left-hand side equals $h\sum_{j=0}^{k} \beta_j f_{n+j}$ and division by $h$ gives

$$\sum_{j=0}^{k} \beta_j f_{n+j} = y'(a + c) \sum_{j=0}^{k} j\alpha_j + \sum_{j=0}^{k} j\alpha_j \psi_{jn}(h)$$

or

$$f(a + c, y(a + c)) \sum_{j=0}^{k} \beta_j = y'(a + c) \sum_{j=0}^{k} j\alpha_j$$

after the limiting process. The differential equation can be satisfied if and only if $\sum_{j=0}^{k} j\alpha_j = \sum_{j=0}^{k} \beta_j$, i.e. $C_1 = 0$. A multistep method of order $p \ge 1$ is said to be consistent. Thus, the above demonstration shows that $y_n$ converges to a solution of the differential equation only if the multistep method is consistent. Although consistency is necessary for convergence it is not sufficient. The stability of the numerical scheme is connected with the polynomials

$$\rho(\zeta) = \sum_{j=0}^{k} \alpha_j \zeta^j, \qquad \sigma(\zeta) = \sum_{j=0}^{k} \beta_j \zeta^j \tag{8.75}$$

which are called the first and second characteristic polynomials respectively. A multistep method is consistent if and only if $\rho(1) = 0$, $\rho'(1) = \sigma(1)$. Consequently, in a consistent method the polynomial $\rho(\zeta)$ always has a zero at $\zeta = 1$. This zero is called the principal zero and the remaining zeros $\zeta_2, \ldots, \zeta_k$ are designated spurious zeros. If $f$ is identically zero, the multistep method becomes the linear difference equation $\sum_{j=0}^{k} \alpha_j y_{n+j} = 0$ with a solution

$$y_n = d_1 + d_2 \zeta_2^n + \cdots + d_k \zeta_k^n$$

when the zeros of $\rho$ are distinct. Clearly, a crisis will develop if $|\zeta_s| > 1$ for any $s$ but zeros with $|\zeta_s| \le 1$ entail no trouble. Multiple zeros give rise to secular terms which do not affect the situation unless $|\zeta_s| = 1$, when there is blow up. Thus, parasitic solutions are avoided by insisting that $|\zeta_s| \le 1$ with any zero such that $|\zeta_s| = 1$ being simple. For this reason a multistep method is said to be zero-stable if no zero of $\rho(\zeta)$ has modulus greater than unity and if every zero of modulus 1 is simple. In a consistent zero-stable method $\sigma(1) \ne 0$. Otherwise $\rho'(1) = 0$ which implies a double zero at least at $\zeta = 1$. It can be proved that a multistep method is convergent if and only if it is consistent and zero-stable. Indeed, consistency keeps the local truncation error down while zero-stability controls the propagation of error as the calculation proceeds. There is no point in considering multistep methods which are not both consistent and zero-stable. In this connection it should be noted that no zero-stable method can have order exceeding $k + 1$ ($k$ odd) or $k + 2$ ($k$ even).

Adams methods are those in which $\rho(\zeta) = \zeta^k - \zeta^{k-1}$. The group is subdivided into the explicit or Adams-Bashforth methods and the implicit or Adams-Moulton methods. Methods in which $\rho(\zeta) = \zeta^k - \zeta^{k-2}$ are known as Nyström when explicit and Milne-Simpson when implicit. All the spurious roots in Adams methods are at the origin and so the family is zero-stable. The Nyström and Milne-Simpson groups are also zero-stable because they have one spurious root at $-1$ and the rest at the origin.
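The effect of a spurious zero outside the unit circle can be seen by iterating the homogeneous difference equation directly. The sketch below is illustrative only: the two coefficient sets, $\rho(\zeta) = \zeta^2 - 1$ (zeros $\pm 1$) and $\rho(\zeta) = \zeta^2 + 4\zeta - 5$ (zeros $1$ and $-5$), and the size of the perturbation in the second starting value are assumptions chosen for the demonstration.

```python
def homogeneous_solution(alpha, start, steps):
    """Iterate sum_j alpha_j y_{n+j} = 0 (the case f = 0); alpha[-1] must be non-zero."""
    k = len(alpha) - 1
    y = list(start)                       # the k starting values y_0, ..., y_{k-1}
    for _ in range(steps):
        recent = y[-k:]                   # y_n, ..., y_{n+k-1}
        y.append(-sum(a * v for a, v in zip(alpha[:-1], recent)) / alpha[-1])
    return y

eps = 1e-10                               # small error in the second starting value
# rho = zeta^2 - 1: both zeros on the unit circle and simple (zero-stable).
stable = homogeneous_solution([-1.0, 0.0, 1.0], [1.0, 1.0 + eps], 30)
# rho = zeta^2 + 4 zeta - 5: spurious zero at -5 (not zero-stable).
unstable = homogeneous_solution([-5.0, 4.0, 1.0], [1.0, 1.0 + eps], 30)

print("zero-stable, final value:    ", stable[-1])
print("not zero-stable, final value:", unstable[-1])
```

In the first case the perturbation stays at the level of `eps`; in the second it is multiplied by roughly $5$ at every step, so the parasitic component swamps the principal solution $y_n = 1$ after a few dozen steps.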


Exercises

9. The initial value problem $y' = 4xy^{1/2}$, $y(0) = 1$ is to be solved on $0 \le x \le 2$ by

$$y_{n+2} - y_{n+1} = \tfrac{1}{3}h(3f_{n+1} - 2f_n).$$

Show that the method is zero-stable but not consistent. By taking $y_1 = (1 + h^2)^2$ calculate the numerical solution for $h = 0.1$ and show that it differs substantially from the exact solution at $x = 2$ ($y(2) = 25$). Repeat the calculation with $h = 0.05$, $h = 0.025$ to demonstrate that the situation worsens as the step length decreases.

10. Prove that the method

$$y_{n+2} - (1 + a)y_{n+1} + ay_n = \tfrac{1}{2}h\{(3 - a)f_{n+1} - (1 + a)f_n\}$$

is consistent. Show that it is zero-stable for $a = 0$ but not for $a = -5$. For these two values of $a$ solve numerically the initial value problem of Exercise 9 for the same values of $h$ and $y_1 = (1 + h^2)^2$ to exemplify the effects of instability. Are the effects mitigated by reducing the step length?

11. Is Quade's method (Exercise 8) zero-stable?

12. Show that the method

$$y_{n+2} + (a - 1)y_{n+1} - ay_n = \tfrac{1}{4}h\{(a + 3)f_{n+2} + (3a + 1)f_n\}$$

has order 2 if $a \ne -1$ and order 3 if $a = -1$. Is the method convergent when $a = -1$?

13. Prove that, for any value of $a$ for which

$$y_{n+2} + ay_{n+1} - ay_n - y_{n-1} = \tfrac{1}{2}(3 + a)h(f_{n+1} + f_n)$$

is zero-stable, the order cannot exceed 2.

A $k$-step method in which $k > 1$ requires more starting data than provided by the initial value. Therefore, if $k$ is kept fixed, means must be discovered whereby additional data are created. If $f$ is differentiable a sufficient number of times, derivatives of the differential equation permit the determination of $y_1$ by a Taylor expansion about $x = a$ with an error of order $O(h^{p+1})$ if the method is of order $p$. Repetition of the process at $x = a + h, a + 2h, \ldots$ will then supply the requisite number of starting values. The execution is cumbersome at best and fails completely if $f$ does not possess enough derivatives. Most practical programs, therefore, vary $k$. At the first step $k = 1$ and then $k$ is increased as more data become available. There is the accompanying advantage that the degree of interpolation can be adjusted as well as the step length although the optimal choice of $k$ and step length is a complicated matter. To assist with this choice we need some idea of the errors being committed by the algorithm. It can be proved that for all Adams methods the local truncation error satisfies

$$|\mathcal{L}[y(x_n); h]| \le h^{p+1} K \max_{x \in [a,b]} |y^{(p+1)}(x)| \tag{8.76}$$

where the method is of order $p$ and

$$K = \frac{1}{p!}\left|\int_0^k \sum_{j=0}^{k} \{\alpha_j (j - t)_+^p - p\beta_j (j - t)_+^{p-1}\}\,dt\right| = |C_{p+1}|.$$

A bound similar to (8.76) prevails for other methods but the definition of $K$ is different. Inequality (8.76) instructs us how the error is behaving locally but leaves open the accumulation of error as the computation progresses. Round-off error may have to be accommodated; suppose on each application of the method it does not exceed $K_1 h^{q+1}$. Let the maximum error in the $k$ starting values be $\delta$ and assume that, for $a \le x \le b$, $|f(x, y) - f(x, Y)| \le L|y - Y|$. Then, for an Adams-Bashforth or Nyström method, the global error satisfies

$$|y_n - y(x_n)| \le \{\delta + (x_n - a)(h^p K \max|y^{(p+1)}| + h^q K_1)\} \exp\Big\{L(x_n - a) \sum_{j=0}^{k-1} |\beta_j|\Big\}. \tag{8.77}$$

The bound for any zero-stable method has a similar structure. The snag with both bounds (8.76) and (8.77) is being compelled to estimate $y^{(p+1)}$. Frequently, no analytical bound is available and one must be constructed from the computed solution, probably by numerical differentiation, a notoriously inaccurate undertaking. Even then the bounds in (8.76) and (8.77) can be very conservative. One can, however, assert with confidence that if the bound for the local error is $O(h^{p+1})$ that for the global error will be not better than $O(h^p)$.

Two ways suggest themselves for circumventing the impediment in error estimation: (i) adopt a criterion which forces the global error to die out as the numerical reckoning proceeds; (ii) invent a method for which the evaluation of an error bound is feasible. The possibility (i) will be considered first, (ii) being left on one side until later. The theoretical solution of the initial value problem satisfies

$$\sum_{j=0}^{k} \alpha_j y(x_{n+j}) = h \sum_{j=0}^{k} \beta_j f[x_{n+j}, y(x_{n+j})] + \mathcal{L}[y(x_n); h]$$

the last term being the local truncation error. If $R_{n+k}$ is the round-off error introduced at the $n$th application of the method, the numerical solution provides $y_n$ where

$$\sum_{j=0}^{k} \alpha_j y_{n+j} = h \sum_{j=0}^{k} \beta_j f(x_{n+j}, y_{n+j}) + R_{n+k}.$$

Assume that $\partial f/\partial y$ has the constant value $\lambda$ and that $\mathcal{L}[y(x_n); h] - R_{n+k} = \phi$ where $\phi$ is a constant. Then, from subtraction of the above equations, the global error $e_n = y(x_n) - y_n$ satisfies

$$\sum_{j=0}^{k} (\alpha_j - h\lambda\beta_j)e_{n+j} = \phi.$$

The general solution of this linear difference equation is, when the multistep method is consistent and zero-stable,

$$e_n = \sum_{s=1}^{k} c_s r_s^n - \frac{\phi}{h\lambda \sum_{j=0}^{k} \beta_j} \tag{8.78}$$

where $c_s$ is an arbitrary constant and the $r_s$ are the roots of

$$\rho(r) - h\lambda\sigma(r) = 0. \tag{8.79}$$

It has been supposed in (8.78) that the roots of this polynomial equation are simple. Put $h_1 = h\lambda$. As $h_1 \to 0$, (8.79) approaches $\rho(r) = 0$ and $r_s \to \zeta_s$; in particular, $r_1 \to 1$. No other $r_s$ tends to 1 because of zero-stability. Now, when the method is of order $p$, $\mathcal{L}[z(x); h] = O(h^{p+1})$ so that, choosing $z(x) = e^{\lambda x}$, we obtain

$$\rho(\exp h_1) - h_1\sigma(\exp h_1) = O(h_1^{p+1})$$

or

$$(\exp h_1 - r_1)(\exp h_1 - r_2) \cdots (\exp h_1 - r_k) = O(h_1^{p+1}).$$

Since $\exp h_1 - r_s$ does not tend to zero as $h_1 \to 0$ for $s \ne 1$ we conclude that

$$r_1 = \exp h_1 + O(h_1^{p+1}) \tag{8.80}$$

as $h_1 \to 0$. An immediate deduction is that $r_1 > 1$ for sufficiently small positive $h_1$. The derivation of (8.80) is based on $\partial f/\partial y = \lambda$ and so holds strictly only for the differential equation $y' = \lambda y + g(x)$. In general, a solution will contain a multiple of $e^{\lambda x}$. The error equation (8.78) embraces a term $r_1^n$ which, on account of (8.80), is $\exp\{\lambda(x_n - a)\} + O(h_1^{p+1})$. Therefore, if $e_n$ is dominated by $r_1^n$ the error grows no more rapidly than the true solution, i.e. the relative error remains roughly invariable. Consequently, a multistep method is said to be relatively stable for any $h_1$ for which $|r_s| < |r_1|$, $s \ne 1$.

The derivation of the notion of relative stability has been founded on the assumption that $\partial f/\partial y = \lambda$ which will be untrue in general. The restriction can be dispensed with if $\lambda$ is picked as a bound or typical value of $\partial f/\partial y$. The choice may be relevant only to some particular subinterval. A change in subinterval may cause an alteration in the $h_1$ for which relative stability is available. Given this flexibility in the selection of $\lambda$, the step length can be controlled by requiring $h_1$ to be such that relative stability occurs.

Exercises

14. For the method of Exercise 10 with $a \ge -1$ show that in (8.76) $K = \tfrac{1}{12}(a + 5)$. Show that the bound given by (8.76) for the initial value problem of Exercise 9 is $20h^3$ when $a = 0$.

15. In applying Euler's rule to $y' = \lambda y$ ($\lambda < 0$), $y(0) = 1$ show that, when $h < -1/\lambda$, the global error is non-negative and does not exceed $\tfrac{1}{2}x_n h \max y''$ when there is no round-off error. Thus (8.77) overestimates by the factor $\exp(|\lambda|x_n)$.

16. If $-1 \le a < 1$ find for what $h_1$ the method of Exercise 10 is relatively stable.

17. If $-1 \le a < 1$ show that

$$y_{n+2} - (1 + a)y_{n+1} + ay_n = \frac{h}{12}\{(5 + a)f_{n+2} + 8(1 - a)f_{n+1} - (1 + 5a)f_n\}$$

is relatively stable for $h_1 > 3(a + 1)/\{2(a - 1)\}$.
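The relative-stability criterion $|r_s| < |r_1|$ and the estimate (8.80) can be checked numerically for a two-step method, since (8.79) is then a quadratic. The sketch below uses Simpson's rule (8.71), for which $\rho(r) = r^2 - 1$ and $\sigma(r) = \tfrac{1}{3}(r^2 + 4r + 1)$; the choice of method and the sample values $h_1 = \pm 0.1$ are illustrative assumptions.

```python
import cmath
import math

def simpson_roots(h1):
    """Roots (r1, r2) of rho(r) - h1*sigma(r) = 0 for Simpson's rule (8.71)."""
    # (1 - h1/3) r^2 - (4 h1/3) r - (1 + h1/3) = 0
    a = 1.0 - h1 / 3.0
    b = -4.0 * h1 / 3.0
    c = -(1.0 + h1 / 3.0)
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    # Label as principal root r1 the root closest to exp(h1), cf. (8.80).
    roots.sort(key=lambda r: abs(r - math.exp(h1)))
    return roots[0], roots[1]

for h1 in (0.1, -0.1):
    r1, r2 = simpson_roots(h1)
    print(f"h1 = {h1:+.1f}  |r1 - exp(h1)| = {abs(r1 - math.exp(h1)):.2e}"
          f"  |r1| = {abs(r1):.5f}  |r2| = {abs(r2):.5f}"
          f"  relatively stable: {abs(r2) < abs(r1)}")
```

For these sample values the principal root agrees with $\exp h_1$ to within about $10^{-7}$, consistent with (8.80) for a method of order 4, while the spurious root dominates when $h_1 = -0.1$, so the criterion fails there.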

The other device to combat awkwardness in error estimation has already been alluded to and consists of combining explicit and implicit methods in a particular fashion. First, calculate $y_{n+k}$ by an explicit method. Relabel it as $y_{n+k}^{[0]}$ and regard it as a prediction which we wish to improve. The amelioration is achieved by iterating an implicit method, say

$$y_{n+k}^{[s+1]} + \sum_{j=0}^{k-1} \alpha_j y_{n+j} = h\beta_k f(x_{n+k}, y_{n+k}^{[s]}) + h \sum_{j=0}^{k-1} \beta_j f_{n+j}$$

with $y_{n+k}^{[0]}$ as starting value. After $m$ cycles $y_{n+k}^{[m]}$ is reached and this is deemed to be a corrected value which is to be taken as the appropriate value of $y_{n+k}$. The whole manoeuvre is then copied at the next step and so on. Thus, at each step there is a prediction followed by a correction. The explicit method is the predictor while the implicit performs as corrector and the amalgamation is known as a predictor-corrector method.

The number of iterations $m$ may be fixed in advance, which has the merit of limiting the demands on the computer but the disadvantage of making the error and stability characteristics depend upon both the predictor and corrector, or $m$ may be left free and the iteration stopped when $|y_{n+k}^{[s+1]} - y_{n+k}^{[s]}|$ is less than some pre-assigned tolerance. This can always be arranged if the iteration converges, which will be guaranteed if $h < 1/(L|\beta_k|)$. Since now $y_{n+k}^{[m]}$ is independent of $y_{n+k}^{[0]}$ the stability and error traits will be those of the corrector alone. The pressure on the computer may, however, be heavy because there may be many iterations at each step.

A priori there is no reason why the predictor and corrector should not have different step numbers and different orders. Notwithstanding, substantial benefit accrues from allotting them the same step number $k$ and the same order $p$, for then it can be shown that, subject to certain conditions, the local truncation error is

$$\frac{C_{p+1}}{C_{p+1}^* - C_{p+1}} \big(y_{n+k}^{[m]} - y_{n+k}^{[0]}\big) + O(h^{p+2}) \tag{8.81}$$

where $C_{p+1}^*$ and $C_{p+1}$ are the error constants of the predictor and corrector respectively. The conditions for (8.81) are met if an Adams-Bashforth method is used as predictor, followed by an Adams-Moulton method as corrector. This is the combination employed by many practical programs. Clearly (8.81) evades the problem of estimating y
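A predictor-corrector step with the error estimate (8.81) might be sketched as follows. This is an illustration under stated assumptions, not a reproduction of any particular program: the predictor is the two-step Adams-Bashforth method (error constant $C_3^* = 5/12$), the corrector is the trapezoidal rule regarded as a two-step method with $\alpha_0 = \beta_0 = 0$ (error constant $C_3 = -1/12$), a single correction ($m = 1$) is applied, and the test problem $y' = -y$ and all names are hypothetical.

```python
import math

def pece_ab2_trapezoidal(f, x0, y0, y1, h, steps):
    """Predict with Adams-Bashforth (k = 2), correct once with the trapezoidal rule."""
    xs = [x0, x0 + h]
    ys = [y0, y1]                           # y1 is a starting value supplied by the caller
    fs = [f(xs[0], ys[0]), f(xs[1], ys[1])]
    estimates = []
    for _ in range(steps):
        x_new = xs[-1] + h
        y_pred = ys[-1] + 0.5 * h * (3.0 * fs[-1] - fs[-2])    # predict
        f_pred = f(x_new, y_pred)                              # evaluate
        y_corr = ys[-1] + 0.5 * h * (f_pred + fs[-1])          # correct
        # (8.81) with C_3 / (C*_3 - C_3) = (-1/12) / (5/12 + 1/12) = -1/6
        estimates.append(-(y_corr - y_pred) / 6.0)
        xs.append(x_new)
        ys.append(y_corr)
        fs.append(f(x_new, y_corr))                            # evaluate
    return xs, ys, estimates

h = 0.1
xs, ys, estimates = pece_ab2_trapezoidal(lambda x, y: -y, 0.0, 1.0, math.exp(-h), h, 9)
print("global error at x = 1:         ", ys[-1] - math.exp(-1.0))
print("largest local error estimate:  ", max(abs(e) for e in estimates))
```

The estimate costs one subtraction per step because the predicted and corrected values are both already available, which is the economy the text attributes to predictor-corrector methods.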
$$y_{n+1} - y_n = hg(x_n, y_n, h)$$

for some suitably chosen $g$. These are all explicit and are the only ones that will be discussed here, though implicit methods have been investigated. A comprehensive treatment is in Butcher (1987). A Runge-Kutta method is said to be of order $p$ if $p$ is the largest integer for which

$$y(x + h) - y(x) - hg(x, y(x), h) = O(h^{p+1})$$

and consistent if $g(x, y, 0) \equiv f(x, y)$. If $g$ is continuous in all its variables and Lipschitz in $y$ it can be proved that a Runge-Kutta method is convergent if and only if it is consistent. If

$$g(x, y, h) = \sum_{m=1}^{M} w_m k_m,$$

$$k_1 = f(x, y), \qquad k_r = f\Big(x + ha_r,\; y + h\sum_{s=1}^{r-1} b_{rs}k_s\Big), \qquad a_r = \sum_{s=1}^{r-1} b_{rs}$$

for $r = 2, 3, \ldots, M$ the Runge-Kutta method is said to have $M$ stages. Note that $\sum_{m=1}^{M} w_m = 1$ is necessary for consistency and therefore for convergence. There are numerous Runge-Kutta methods, matching the fecundity of parameters available. One two-stage method of the second order is obtained by putting $w_1 = \tfrac{1}{2}$, $w_2 = \tfrac{1}{2}$, $a_2 = 1$ with the result

$$y_{n+1} - y_n = \tfrac{1}{2}h(k_1 + k_2), \qquad k_1 = f(x_n, y_n), \qquad k_2 = f(x_n + h, y_n + hk_1). \tag{8.82}$$

Another two-stage method stems from $w_1 = 0$, $w_2 = 1$, $a_2 = \tfrac{1}{2}$. Heun's third-order three-stage method is

$$y_{n+1} - y_n = \tfrac{1}{4}h(k_1 + 3k_3)$$

with

$$k_1 = f(x_n, y_n), \qquad k_2 = f(x_n + \tfrac{1}{3}h, y_n + \tfrac{1}{3}hk_1), \qquad k_3 = f(x_n + \tfrac{2}{3}h, y_n + \tfrac{2}{3}hk_2).$$

Perhaps the most popular Runge-Kutta method is the four-stage fourth-order

$$y_{n+1} - y_n = \tfrac{1}{6}h(k_1 + 2k_2 + 2k_3 + k_4),$$

$$k_1 = f(x_n, y_n), \quad k_2 = f(x_n + \tfrac{1}{2}h, y_n + \tfrac{1}{2}hk_1), \quad k_3 = f(x_n + \tfrac{1}{2}h, y_n + \tfrac{1}{2}hk_2), \quad k_4 = f(x_n + h, y_n + hk_3). \tag{8.83}$$
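A direct transcription of (8.83) is short enough to test. The sketch below is illustrative; the function names and the test problem $y' = y$, $y(0) = 1$ on $[0, 1]$ are assumptions. Halving the step length should reduce the global error by a factor close to $2^4 = 16$ if the method is indeed of fourth order.

```python
import math

def rk4_step(f, x, y, h):
    """One step of the four-stage fourth-order method (8.83)."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(x + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def rk4_solve(f, a, y_a, b, n):
    """Advance from x = a to x = b in n equal steps."""
    h = (b - a) / n
    x, y = a, y_a
    for _ in range(n):
        y = rk4_step(f, x, y, h)
        x += h
    return y

def rhs(x, y):
    # Illustrative test problem y' = y with exact solution exp(x).
    return y

err_h = abs(rk4_solve(rhs, 0.0, 1.0, 1.0, 10) - math.e)
err_half = abs(rk4_solve(rhs, 0.0, 1.0, 1.0, 20) - math.e)
print(f"error (h = 0.1):  {err_h:.3e}")
print(f"error (h = 0.05): {err_half:.3e}")
print(f"ratio:            {err_h / err_half:.2f}")
```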

It is perhaps worth pointing out that when there are five or more stages the order is always less than the number of stages. The local truncation error is

$$y(x_{n+1}) - y(x_n) - hg(x_n, y(x_n), h).$$

For a method of order $p$ it is $O(h^{p+1})$. If

$$|f(x, y)| < Q, \qquad \left|\frac{\partial^{i+j} f(x, y)}{\partial x^i\,\partial y^j}\right| < \frac{P^{i+j}}{Q^{j-1}} \qquad (i + j \le p)$$

the modulus of the local truncation error does not exceed

$$(\tfrac{1}{3} + |\tfrac{2}{3} - a_2|)h^3P^2Q$$

for a method of order 2. In a method of order 3, the bound is a multiple of $h^4P^3Q$ while for (8.83) it is $73h^5P^4Q/720$. If the local truncation error is bounded by $Kh^{p+1}$ the global error satisfies

$$|y(x_n) - y_n| \le \frac{h^pK}{L}[\exp\{L(x_n - a)\} - 1]$$

in the absence of round-off error. These error bounds are usually difficult to apply in practice and direct control of step length by them is out of the question. Instead, results are usually based on two distinct runs. For example, if the local truncation error with step length $h$ is

$$y(x_{n+1}) - y_{n+1} = \psi(x_n, y(x_n))h^{p+1} + O(h^{p+2})$$

the corresponding error when the method is applied at $x_{n-1}$ with step length $2h$ is

$$y(x_{n+1}) - y_{n+1}^* = \psi(x_{n-1}, y(x_{n-1}))(2h)^{p+1} + O(h^{p+2}) = \psi(x_n, y(x_n))(2h)^{p+1} + O(h^{p+2}).$$

Hence

$$y_{n+1} - y_{n+1}^* = (2^{p+1} - 1)\psi(x_n, y(x_n))h^{p+1} + O(h^{p+2})$$

which suggests $(y_{n+1} - y_{n+1}^*)/(2^{p+1} - 1)$ as an estimate of the error. Thus error estimation for Runge-Kutta methods involves substantial computational effort and has a tendency to be expensive, much more so than that for predictor-corrector methods where the additional computation is negligible.

As regards stability, try the general Runge-Kutta method on the test equation $y' = \lambda y$. There results $y_{n+1} - y_n = P_M(h_1)y_n$ where $h_1 = h\lambda$ and $P_M$ is a polynomial of degree $M$. Consequently

$$y_n = r_1^n y_0$$

where $r_1 = 1 + P_M(h_1)$. Therefore, we say that the method is absolutely stable if $|r_1| < 1$. When $M$ is the same as the order $p$ the formula for $r_1$ simplifies to

$$r_1 = 1 + h_1 + \frac{h_1^2}{2!} + \cdots + \frac{h_1^p}{p!}.$$

From this it can be deduced that a first-order one-stage method is absolutely stable if $-2 < h_1 < 0$, a second-order two-stage method is absolutely stable for $-2 < h_1 < 0$, a third-order three-stage method is absolutely stable for $-2.51 < h_1 < 0$ and a fourth-order four-stage method is absolutely stable for $-2.78 < h_1 < 0$. In particular, the popular variety (8.83) is absolutely stable for $-2.78 < h_1 < 0$. It is unwise to use a Runge-Kutta method outside the region of absolute stability.

Efforts have been made to choose the parameters in Runge-Kutta methods so as to improve the stability characteristics when $M \ge 5$. Choices have also been made to reduce the local truncation bounds since the traditional values were selected to suit mechanical desk calculators rather than electronic machines. However, the theoretical errors were not greatly diminished although the actual computed errors were sensitive to alterations in the parameters.
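The quoted intervals of absolute stability can be recovered from the truncated exponential series for $r_1$ by locating the point on the negative real axis where $|r_1|$ first reaches 1. The following sketch uses bisection; the bracketing interval $[-4, -1]$ and the function names are assumptions made for the illustration.

```python
from math import factorial

def r1(h1, p):
    """r_1 = 1 + h1 + h1^2/2! + ... + h1^p/p! for a p-stage method of order p."""
    return sum(h1 ** j / factorial(j) for j in range(p + 1))

def stability_limit(p, lo=-4.0, hi=-1.0, iters=60):
    """Left end of the real interval of absolute stability, found by bisection.

    Assumes |r1(lo)| > 1 and |r1(hi)| < 1, which holds for p = 1, 2, 3, 4.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if abs(r1(mid, p)) < 1.0:
            hi = mid        # mid is inside the stable interval; move the stable end left
        else:
            lo = mid        # mid is outside; move the unstable end right
    return 0.5 * (lo + hi)

for p in (1, 2, 3, 4):
    print(f"order {p}: absolutely stable for {stability_limit(p):.3f} < h1 < 0")
```

The computed limits are approximately $-2$, $-2$, $-2.513$ and $-2.785$, in agreement with the values stated in the text to the precision given there.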

Exercises

18. Show that (8.83) applied to $y' = \lambda y$ gives $y_{n+1}/y_n = e^{\lambda h} + O((\lambda h)^5)$.

19. Prove that (8.82) is the same as a predictor-corrector in which the predictor is Euler's rule and the corrector one application of the trapezoidal rule.

20. The method (8.83) is applied to the initial-value problem $y' = \exp(10(x - y))$, $y(0) = 0.1$ on $[0, 1]$. Show that $P = 10$ and that an estimate of $Q$ is 0.37. Deduce that the step length should be less than $10^{-2}$ for the local truncation error to be less than $10^{-8}$. If $h = 0.1$, find the global error bound and demonstrate that it is very conservative by comparing the numerical and analytical solutions.

21. Use the estimate $(y_{n+1} - y_{n+1}^*)/(2^{p+1} - 1)$ to adjust the error in employing (8.83) with $h = 0.1$ for $y' = 1 + y^2$, $y(0) = 1$ and compare your estimate with the actual error.

22. The initial-value problem

$$y' = -5xy^2 + \frac{5}{x} - \frac{1}{x^2}, \qquad y(1) = 1$$

is to be solved on $[1, 10]$ by means of (8.83). Assuming that a reasonable estimate for $\partial f/\partial y$ is $-10$, find which of $h = 0.2, 0.3, 0.4$ correspond to absolute stability. Compare the numerical solutions for these three step lengths.
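The two-run error estimate $(y_{n+1} - y_{n+1}^*)/(2^{p+1} - 1)$ described before these exercises can be sketched for (8.83) as below. To mirror the localizing assumption of the derivation, the values at $x_{n-1}$ and $x_n$ are taken from the exact solution; the test problem $y' = -y$ and all names are illustrative assumptions, and the problem differs from that of Exercise 21.

```python
import math

def rk4_step(f, x, y, h):
    """One step of the four-stage fourth-order method (8.83)."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(x + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def doubling_estimate(f, x_prev, y_prev, y_mid, h, p=4):
    """Return (y_{n+1}, estimated local truncation error of the step of length h).

    y_prev and y_mid stand for accurate values at x_{n-1} = x_prev and x_n = x_prev + h.
    """
    y_new = rk4_step(f, x_prev + h, y_mid, h)        # one step h from x_n
    y_star = rk4_step(f, x_prev, y_prev, 2.0 * h)    # one step 2h from x_{n-1}
    return y_new, (y_new - y_star) / (2 ** (p + 1) - 1)

def rhs(x, y):
    # Illustrative test problem y' = -y with exact solution exp(-x).
    return -y

h = 0.1
y_new, estimate = doubling_estimate(rhs, 0.0, 1.0, math.exp(-h), h)
actual = math.exp(-2.0 * h) - y_new                  # y(x_{n+1}) - y_{n+1}
print(f"estimated local error: {estimate:.3e}")
print(f"actual local error:    {actual:.3e}")
```

The estimate reproduces the sign and, to within roughly ten per cent here, the size of the actual local error, at the price of an extra application of the method with the doubled step.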

8.8 Extrapolation

Polynomial extrapolation is sometimes adopted to refine the accuracy of previous methods. Suppose that one of the described methods has yielded the approximation $y(x; h)$ with the step length $h$. A basic step length $H$ is selected (typically $H \gg h$) and then polynomial extrapolation is employed to furnish approximations at $x_0 + jH$, $j = 0, 1, \ldots$, where $x_0$ is the starting point. Let $N_0$ be a positive integer. Apply the numerical method $N_0$ times with step length $H/N_0$ to generate the approximation $y(x_0 + H; h_0)$ where $h_0 = H/N_0$. Replace $N_0$ by the positive integer $N_1$ ($N_1 > N_0$) and repeat the exercise, thereby obtaining $y(x_0 + H; h_1)$ with $h_1 = H/N_1$. Suppose that for the given numerical method

$$y(x; h) = y(x) + A_1h + A_2h^2 + \cdots \tag{8.84}$$

as $h \to 0$, the coefficients $A_1, A_2, \ldots$ being independent of $h$. Then

$$\frac{h_1y(x_0 + H; h_0) - h_0y(x_0 + H; h_1)}{h_1 - h_0} = y(x) + O(h^2) \tag{8.85}$$

so that the left-hand side of (8.85) ought to be nearer the theoretical solution than either of the two approximations from which it is built. If the left-hand side is denoted by $y_1(x_0 + H; H)$ the whole process is repeated for $y' = f(x, y)$, $y(x_0 + H) = y_1(x_0 + H; H)$ to determine a value at $x_0 + 2H$. Actually, (8.85) is the simplest of the possibilities. Further intermediate steps $h_2, h_3, \ldots, h_r$ could be introduced with $h_0 > h_1 > h_2 > \cdots > h_r$ and then a linear combination formed in which the error was $O(h_0^{r+1})$. In this way, high accuracy might be attained when $r$ is sufficiently large. To implement the method of extrapolation it must be ensured that the numerical method satisfies (8.84) or preferably

$$y(x; h) = y(x) + A_2h^2 + A_4h^4 + \cdots$$

because of the enhanced accuracy that flows from the same numerical effort. Methods which have this property have been investigated and can be classified in terms of the $\rho(\zeta)$ and $\sigma(\zeta)$ of §8.6, but details will be omitted here. (The interested reader should consult Lambert (1991).) In order to have numerical stability in the extrapolation procedure the step sizes $h_1, h_2, \ldots$ must decrease fairly rapidly. While it is true that the accuracy improves swiftly with increasing $r$ so does the workload, and values of $r$ greater than 5 are rarely practicable. In some ways there is a kinship with a Runge-Kutta method of variable step. The size of $H$ must be relatively large for the extrapolation to be efficient with the consequence that the production of results is expensive.
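The simplest extrapolation (8.85) can be tried with Euler's rule as the underlying method, for which an expansion of the form (8.84) is assumed to hold. The sketch below is illustrative only; the test problem $y' = y$, the basic step $H = 1$ and the choices $N_0 = 10$, $N_1 = 20$ are assumptions.

```python
import math

def euler(f, x0, y0, H, n):
    """Cover the basic step H with n applications of Euler's rule (step H / n)."""
    h = H / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

def extrapolate(f, x0, y0, H, n0, n1):
    """Return y(x0+H; h0), y(x0+H; h1) and the combination on the left of (8.85)."""
    h0, h1 = H / n0, H / n1
    y_h0 = euler(f, x0, y0, H, n0)
    y_h1 = euler(f, x0, y0, H, n1)
    return y_h0, y_h1, (h1 * y_h0 - h0 * y_h1) / (h1 - h0)

def rhs(x, y):
    # Illustrative test problem y' = y, y(0) = 1, exact value e at x = 1.
    return y

y_h0, y_h1, y_ex = extrapolate(rhs, 0.0, 1.0, 1.0, 10, 20)
for label, value in (("h0 = 0.10", y_h0), ("h1 = 0.05", y_h1), ("extrapolated", y_ex)):
    print(f"{label:>13}: value = {value:.5f}  error = {abs(value - math.e):.5f}")
```

The extrapolated value is considerably closer to the exact solution than either Euler approximation, as (8.85) suggests, although it required both runs to obtain it.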


8.9 Systems of differential equations

Most of the preceding theory carries through for systems if $y$ is replaced by $\mathbf{y}$ and statements such as $y_n \to y(a + c)$ are understood to mean $\|\mathbf{y}_n - \mathbf{y}(a + c)\| \to 0$ in some appropriate norm. One difference is that $\partial f/\partial y$ must be replaced by the Jacobian matrix $\partial\mathbf{f}/\partial\mathbf{y}$ (§4.1) so that $L = \|\partial\mathbf{f}/\partial\mathbf{y}\|$ and then the constant $\lambda$ used in the scalar stability theory is replaced by an eigenvalue of the Jacobian. Thus conditions on $h\lambda$ become conditions on $h\lambda_i$ for every eigenvalue $\lambda_i$ of the Jacobian. If the Jacobian is a constant matrix, all of whose eigenvalues have a negative real part, the solution consists of a transient, which decays exponentially, and a steady state. The solution will take a long time to settle down if the real part of one of the eigenvalues is small. On the other hand, if $|\mathrm{Re}\,\lambda_i| \gg 1$ small step lengths will be necessary to ensure stability. If both possibilities occur simultaneously so that

$$\max_i |\mathrm{Re}\,\lambda_i| \gg \min_i |\mathrm{Re}\,\lambda_i|$$

we are forced when finding the steady state to integrate over a long range with tiny step lengths. In an unsatisfactory situation like this the system is said to be stiff. Considerable research has been devoted to stiff systems and the interested reader should consult Lambert (1991).
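The difficulty posed by a stiff system can be illustrated with Euler's rule, whose interval of absolute stability is $-2 < h\lambda_i < 0$. In the sketch below the decoupled system $y_i' = \lambda_i y_i$ with eigenvalues $-1$ and $-1000$ is an assumed example: the slow component needs a long range of integration, while the fast one dictates the step length.

```python
import math

def euler_system(lambdas, y0, h, steps):
    """Euler's rule for the decoupled linear system y_i' = lambda_i * y_i."""
    y = list(y0)
    for _ in range(steps):
        y = [yi + h * lam * yi for yi, lam in zip(y, lambdas)]
    return y

lambdas = [-1.0, -1000.0]        # illustrative eigenvalues; their ratio is 1000

# h * lambda_i lies in (-2, 0) for both eigenvalues: 5000 steps reach x = 5.
ok = euler_system(lambdas, [1.0, 1.0], 0.001, 5000)
# h * lambda_2 = -2.5 lies outside the interval: the fast component blows up.
bad = euler_system(lambdas, [1.0, 1.0], 0.0025, 100)

print("h = 0.001 :", ok, " (exact slow component:", math.exp(-5.0), ")")
print("h = 0.0025:", bad)
```

With the larger step the component belonging to $\lambda = -1000$ grows like $(-1.5)^n$ even though the exact solution has decayed to nothing, so the step length is governed by a transient that is of no interest.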

CANONICAL PROBLEMS

8.10 Geometrical optics revisited

The high-frequency approximation of geometrical optics as enunciated in the first sections of this chapter involves finding the rays, determining the amplitude by the intensity law and ascertaining the rotation of polarization (if necessary). Only the solution of ordinary differential equations arises and we can now regard this as always possible in principle either by analytical or numerical means, so geometrical optics offers a practical scheme of calculating the field at short wavelengths. However, if the cross-section of a tube of rays shrinks to zero, the intensity law predicts an infinite amplitude which is obviously unsatisfactory. Therefore, complementary analysis is required to cope with such a contingency. There may be other circumstances in which a predicament materializes and so some generalities are in order.

The general technique is to introduce a canonical problem which is susceptible to analysis and which reproduces the main features of the ray picture. Then predictions from the analysis are interpreted so as to explain the modifications to be incorporated in the ray theory. Since geometrical optics is a local approximation it will usually be sufficient to set the canonical problem in a homogeneous medium and often the consideration can be restricted to what happens to a plane wave. This approach has already been typified in §8.5 where the reflection of a tube of rays at an interface has been treated by means of the canonical problem in which a plane wave is reflected by a plane interface. The general method of extending the validity of geometrical optics by the judicious involvement of canonical problems could well be called the geometrical theory of diffraction, although it was originally developed for scattering by edges and curved boundaries. Various canonical problems in the geometrical theory of diffraction will now be discussed.

8.11 Focusing

The particular problem mentioned in the preceding section occurs when two adjacent rays intersect so that the tube collapses momentarily. In a homogeneous medium two adjacent rays from a wavefront intersect at a principal centre of curvature. The loci of the two principal centres of curvature of a given wavefront are two surfaces known as caustics or focal surfaces. Every normal to a wavefront is tangential to both caustics. In the special case of a spherical wavefront all the rays of any pencil intersect in a common point; such a pencil is said to be homocentric. A point source of light produces a homocentric pencil but the pencil does not remain homocentric in general after reflection or refraction. Instead the light becomes concentrated on the caustics and astigmatism occurs.

A suitable canonical problem is a wavefront with a known field which, if translated into rays, would have a caustic in the region of interest. It is convenient to formulate this in terms of an irradiated perfectly conducting obstacle in a medium in which the refractive index $N$ has the constant value of unity. Let the incident field be $\mathbf{E}^i$. Then the total electric intensity can be expressed as

$$\mathbf{E}(P) = \mathbf{E}^i + (\operatorname{grad}\operatorname{div} + k_0^2)\int_S \frac{\mathbf{n}_q \wedge \mathbf{H}\,\psi(P, q)}{i\omega\varepsilon_0}\,dS_q$$

where $\psi(P, q) = \exp(-ik_0|\mathbf{x}_p - \mathbf{x}_q|)/4\pi|\mathbf{x}_p - \mathbf{x}_q|$. Although $\mathbf{H}$ is not known exactly on $S$ we can take advantage of the high frequency. The canonical problem for reflection suggests that we shall not go far wrong in assuming, to a first approximation, that the current distribution over the illuminated part of $S$ is the same as if at every point the incident wave were reflected as if it impinged on the infinite (perfectly conducting) tangent plane. On the shadow part the current distribution is assumed to be zero (Macdonald 1913). Thus, on the illuminated part $S_1$ (Fig. 8.7) $\mathbf{n} \wedge \mathbf{H} = 2\mathbf{n} \wedge \mathbf{H}^i$ and elsewhere $\mathbf{n} \wedge \mathbf{H} = 0$. The approximation, often described as physical optics, should be valid when the radii of curvature of $S$ and of the incident wavefront are large compared with the wavelength. The condition for the incident wave can always be met if its source is not too close to $S$.

Fig. 8.7. The illuminated and shadow regions of an obstacle.

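Let the source of the primary field be an electric dipole of moment $\mathbf{p}_0$ at the point $O$ with coordinates $\mathbf{x}_0$. Then

$$\mathbf{E}^i = (\operatorname{grad}\operatorname{div} + k_0^2)\,\mathbf{p}_0\frac{\exp(-ik_0R)}{4\pi\varepsilon_0R}, \qquad \mathbf{H}^i = \frac{\omega(\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{p}_0}{4\pi R^2}\Big(k_0 - \frac{i}{R}\Big)\exp(-ik_0R)$$

where $R = |\mathbf{x} - \mathbf{x}_0|$. Hence

$$\mathbf{E} = \mathbf{E}^i - \frac{1}{2\pi\varepsilon_0}(\operatorname{grad}\operatorname{div} + k_0^2)\int_{S_1}\Big(\frac{1}{R_q} + ik_0\Big)\mathbf{n}_q \wedge \{(\mathbf{x}_q - \mathbf{x}_0) \wedge \mathbf{p}_0\}\frac{\exp(-ik_0R_q)}{R_q^2}\,\psi\,dS_q$$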

where $R_q = |\mathbf{x}_q - \mathbf{x}_0|$. In view of the short wavelength the integral can be evaluated asymptotically by the method of stationary phase and the main contribution will come from those points of $S_1$ where $R_q + |\mathbf{x} - \mathbf{x}_q|$ is stationary. The close connection with Fermat's principle is immediately exposed. Split the domain surrounding $O$ into three portions $D_1$, $D_2$, and $D_3$. $D_2$ constitutes the exterior of the tangent cone from $O$ to $S$ (Fig. 8.7) whereas $D_1$ is the interior of the tangent cone from $O$ to $S_1$. The rest of the interior of the tangent cone is designated $D_3$. If $S$ has the shape shown in Fig. 8.7 and the point of observation is in $D_1 + D_2$ the stationary point $Q$ is the point of contact of a prolate spheroid with $O$ and $P$ as foci and touching $S$. If the point of observation is in $D_3$ it will be denoted by $P'$ and the stationary point $Q$ is the point of intersection of $OP'$ and $S_1$.

Fig. 8.8. Axes with the stationary point as origin.

It is now convenient to choose $Q$ as origin, with the $z$ axis along the outward normal to $S_1$ and the $\eta$ axis in the plane of incidence (Fig. 8.8). Let $\theta$ be the angle of incidence so that $O$ is the point $(0, -R_0\sin\theta, R_0\cos\theta)$ with respect to the axes at $Q$. Assume, firstly, that the point of observation is in $D_3$ so that it will be at $(0, R_1\sin\theta, -R_1\cos\theta)$ on the broken line in Fig. 8.8. When the variable of integration $(\xi, \eta)$ is near $Q$

$$R_q + |\mathbf{x} - \mathbf{x}_q| \approx R_0 + R_1 + \frac{1}{2}\Big(\frac{1}{R_0} + \frac{1}{R_1}\Big)(\xi^2 + \eta^2\cos^2\theta).$$

Hence

$$\int \exp\{-ik_0(R_q + |\mathbf{x} - \mathbf{x}_q|)\}\,dS_q \sim \exp\{-ik_0(R_0 + R_1)\}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\Big\{-\tfrac{1}{2}ik_0\Big(\frac{1}{R_0} + \frac{1}{R_1}\Big)(\xi^2 + \eta^2\cos^2\theta)\Big\}\,d\xi\,d\eta \sim \frac{2\pi R_0R_1\exp\{-ik_0(R_0 + R_1)\}}{ik_0(R_0 + R_1)\cos\theta}.$$

In this case $R_0 + R_1 = R$ and $\mathbf{n}_Q \cdot \mathbf{R}_0 = -R_0\cos\theta$ so that

$$\mathbf{E} \sim \mathbf{E}^i - (\operatorname{grad}\operatorname{div} + k_0^2)\Big\{\mathbf{p}_0 - \frac{(\mathbf{n}_Q \cdot \mathbf{p}_0)\mathbf{R}_0}{\mathbf{n}_Q \cdot \mathbf{R}_0}\Big\}\frac{\exp(-ik_0R)}{4\pi\varepsilon_0R}$$

on neglecting terms of smaller order. When $k_0R_1$ is large the second term under the vector operator can be omitted and so, to this order of approximation, $\mathbf{E}$ vanishes in the shadow in conformity with geometrical optics.

For $P$ in $D_1 + D_2$ the analysis is more complicated. Let the equation of $S_1$ near $Q$ be

$$z \approx -\tfrac{1}{2}\boldsymbol{\xi}^TC_b\boldsymbol{\xi} - \tfrac{1}{6}(D\xi^3 + 3E\xi^2\eta + 3F\xi\eta^2 + G\eta^3)$$

where $C_b$ is the curvature matrix as in §8.5. For points near $Q$

$$R_q + |\mathbf{x} - \mathbf{x}_q| \approx R_0 + R_1 + \tfrac{1}{2}\boldsymbol{\xi}^TC\boldsymbol{\xi}$$

when only quadratic terms are retained, $C$ being the matrix defined by

$$C = 2C_b\cos\theta + \Big(\frac{1}{R_0} + \frac{1}{R_1}\Big)\begin{pmatrix} 1 & 0 \\ 0 & \cos^2\theta \end{pmatrix}.$$

Hence

$$\int \exp\{-ik_0(R_q + |\mathbf{x} - \mathbf{x}_q|)\}\,dS_q \sim \exp\{-ik_0(R_0 + R_1)\}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp(-\tfrac{1}{2}ik_0\boldsymbol{\xi}^TC\boldsymbol{\xi})\,d\xi\,d\eta = \frac{2\pi\exp\{-ik_0(R_0 + R_1) - i\delta\}}{k_0|\det C|^{1/2}}$$

where $\delta = 0$, $\tfrac{1}{2}\pi$, or $-\tfrac{1}{2}\pi$ according as the eigenvalues of $C$ have opposite signs, are both positive, or are both negative. It is a simple matter to check that

$$\det C = \Big(\kappa_1^r + \frac{1}{R_1}\Big)\Big(\kappa_2^r + \frac{1}{R_1}\Big)\cos^2\theta$$

where $\kappa_1^r$, $\kappa_2^r$ are the principal curvatures of the reflected wavefront, obtained from the curvature matrix $C^r$ of (8.64) because the incident wave is spherical so that $C^i$ has only the diagonal elements $1/R_0$. If we adopt the rule that

$$\Big(\kappa^r + \frac{1}{R_1}\Big)^{1/2} = \Big|\kappa^r + \frac{1}{R_1}\Big|^{1/2}\exp(-\tfrac{1}{2}\pi i)$$

when $R_1\kappa^r + 1 < 0$ the phase change $\delta$ is automatically taken care of. Then, if $k_0R_0$ and $k_0R_1$ are both large, we obtain

$$\mathbf{E} \sim \mathbf{E}^i + \Big(\kappa_1^r + \frac{1}{R_1}\Big)^{-1/2}\Big(\kappa_2^r + \frac{1}{R_1}\Big)^{-1/2}\{2(\mathbf{E}^i \cdot \mathbf{n})\mathbf{n} - \mathbf{E}^i\}_Q\,\frac{\exp(-ik_0R_1)}{R_1} \tag{8.86}$$

where $\{\ \}_Q$ signifies the value at $Q$. If the disposition of the surface is as in Fig. 8.8, $C_{b11} > 0$ and $C_{b11}C_{b22} > C_{b12}^2$. Then both $\kappa_1^r$ and $\kappa_2^r$ are positive. Formula (8.86) holds for all $R_1$ and no focusing occurs. Indeed this is true whenever $\kappa_1^r$ and $\kappa_2^r$ are both positive. On the other hand, if either of the curvatures is negative (which can happen if $S$ is anticlastic or synclastic but concave towards $O$ so that $C_{b11} < 0$), there is the possibility of focusing. Now $\kappa^r + 1/R_1$ is certainly positive for small $R_1$ so that as $R_1$ increases and $\kappa^r + 1/R_1$ passes through a zero the phase in (8.86) augments by $\tfrac{1}{2}\pi$ according to the rule for the square root delineated above. But the amplitude in (8.86) is the same as that of geometrical optics so long as neither square root is near zero so that the following interpretation presents itself.

The formulae of geometrical optics may be used when focusing takes place provided that when a caustic of an astigmatic beam is crossed in the direction of propagation the phase is advanced by $\tfrac12\pi$.

After passing two caustics the phase advance will be $\pi$ and this is true even if the two caustics coincide at a point (in which case the beam has a focus and $\kappa_1^r = \kappa_2^r$). The above derivation breaks down if $\theta$ is near $\tfrac12\pi$. Then $Q$ is near the penumbral curve (i.e. the curve where the incident rays are tangent to the obstacle) and the approximate current distribution is not satisfactory. A new canonical problem must then be proposed (§8.19). Both (8.86) and geometrical optics fail when $R_1\kappa_1^r$ or $R_1\kappa_2^r$ is near $-1$. Then the cubic terms in $z$, hitherto neglected, become important. By keeping the cubic terms an asymptotic evaluation supplies a field which is uniformly valid, i.e. it makes an even transition from geometrical optics through the field at the caustic and back to geometrical optics (plus phase advance) again. The main feature is that the field at a caustic is larger than that at other points by a factor of the order of $k_0^{1/6}$. It is the attempt to emulate this behaviour that causes the amplitude of geometrical optics to go infinite. If the two caustics coincide at a focus the factor of increase is $k_0^{1/3}$. All these estimates depend upon the non-vanishing of certain quantities and when this is not so higher-order calculations are inevitable but the formulae are extremely unwieldy, though investigations based on catastrophe theory have been carried out.

The formula (8.86) has assumed that there is a single stationary point $Q$ to be considered. For an arbitrary obstacle there may be several stationary points and then the contributions from the reflection at each must be added in order to obtain the total field at $P$. It is also possible that a ray may suffer multiple reflection, i.e. strike the body again before reaching $P$. Then repeated applications of the formula will sometimes resolve the problem although, in some circumstances, a new canonical problem may be involved. If the obstacle is not perfectly conducting (8.86) may still be employed so long as the appropriate Fresnel reflection coefficient is inserted.
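The square-root rule behind the caustic phase advance can be tried numerically. The following is a minimal Python sketch, not part of the original text: the names `branch_sqrt` and `reflected_factor` are hypothetical, and the scalar factor is assumed to have the form $(\kappa_1^r + 1/R_1)^{-1/2}(\kappa_2^r + 1/R_1)^{-1/2}/R_1$ multiplying the reflected field in (8.86).

```python
import cmath
import math


def branch_sqrt(x):
    """Square root of a real x, with x**0.5 = |x|**0.5 * exp(-i*pi/2) when x < 0."""
    if x >= 0.0:
        return complex(math.sqrt(x), 0.0)
    return math.sqrt(-x) * cmath.exp(-0.5j * math.pi)


def reflected_factor(kappa1, kappa2, R1):
    """Assumed scalar factor of (8.86): (k1 + 1/R1)^(-1/2) (k2 + 1/R1)^(-1/2) / R1.

    kappa1, kappa2 are the principal curvatures of the reflected wavefront
    at the reflection point; R1 is the distance travelled after reflection.
    """
    a = kappa1 + 1.0 / R1
    b = kappa2 + 1.0 / R1
    if a == 0.0 or b == 0.0:
        raise ValueError("observation point on a caustic: (8.86) is not valid here")
    return 1.0 / (branch_sqrt(a) * branch_sqrt(b) * R1)


# No caustic crossed: the factor is real and positive.
no_caustic = reflected_factor(0.5, 0.5, 2.0)
# One caustic crossed (kappa1 = -1 puts a caustic at R1 = 1): phase advance pi/2.
one_caustic = reflected_factor(-1.0, 0.5, 2.0)
# Two caustics crossed: phase advance pi.
two_caustics = reflected_factor(-1.0, -1.0, 2.0)
```

With these illustrative curvatures the argument of the factor is $0$, $\tfrac12\pi$ and $\pi$ respectively, while the modulus is the geometrical-optics divergence factor.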

Exercises

23. A circular cylinder of radius $a$ is placed in free space with its axis along the $z$ axis. An electrically polarized wave in which $E_z^i = \exp(-ik_0x)$ is incident along the $x$ axis. The boundary condition on the cylinder is $\mathbf{n}\wedge\mathbf{H} + (\varepsilon_0/\mu_0)^{1/2}Z\mathbf{E}_t = 0$ where $\mathbf{E}_t$ is the tangential component of $\mathbf{E}$. Show that, according to geometrical optics, the electric intensity of the reflected wave is

$$\frac{\cos\beta + Z}{\cos\beta - Z}\left(\frac{d_1}{2r_1 + d_1}\right)^{1/2}\exp\{-ik_0(x_1 + r_1)\}$$

where $r_1$ is the distance to the point of observation from the point of reflection $(x_1, y_1)$, $\beta$ is the angle of incidence at $(x_1, y_1)$, and $d_1 = a\cos\beta$.

24. A parabolic cylinder has equation $y^2 = -2a^2x + a^4$. It is irradiated in $y^2 > -2a^2x + a^4$ by an electrically polarized wave in which

$$E_z^i = \exp\{-ik_0(x\cos\phi_0 + y\sin\phi_0)\} \qquad (0 < \phi_0 \le \pi).$$

If the boundary condition is the same as in Exercise 23 show that the formula of Exercise 23 for the reflected wave still holds provided that $\exp(-ik_0x_1)$ is replaced by $\exp\{-ik_0(x_1\cos\phi_0 + y_1\sin\phi_0)\}$ and $d_1$ is understood to mean the projection of the radius of curvature at $(x_1, y_1)$ on the incident ray. Would this also be true if the excitation were from $y^2 < -2a^2x + a^4$ and what happens if $\phi_0 = 0$?

8.12 Reflection by stratification

In §8.4 we came across rays being reflected by a minimum of refractive index in a stratified medium. The rays will intersect one another and one might anticipate that there will be a phase advance of $\tfrac12\pi$ as in the preceding section. However, the canonical problem of §8.11 is not appropriate because what we are effectively dealing with in the stratified medium is a problem in which the refractive index has a zero. To ascertain what happens it is sufficient to consider two-dimensional waves in a stratified medium since this does not radically affect the conclusions. Let the fields be independent of the variable $y$ and have $E_y$ as the sole electric component. Let $\mu$ have the constant value $\mu_0$; the analysis is thereby substantially simplified without loss of the chief characteristics. Then

$$\frac{\partial^2E_y}{\partial x^2} + \frac{\partial^2E_y}{\partial z^2} + k_0^2N^2E_y = 0$$

where $N^2 = \varepsilon/\varepsilon_0$. Let the $x$ dependence be $\exp(-ik_0\alpha x)$ so that

$$\frac{\partial^2E_y}{\partial z^2} + k_0^2(N^2 - \alpha^2)E_y = 0. \qquad (8.87)$$

Thus as $z$ varies there can be values for which $N^2 - \alpha^2 \le 0$ and geometrical optics is not immediately applicable. If $N$ has a single minimum the number of zeros is at most two and so the discussion will be limited to the case of two zeros. Change the notation and rewrite (8.87) as

$$\frac{d^2w}{dz^2} + k_0^2q(z)w = 0. \qquad (8.88)$$

Without specifying branches precisely for the time being substitute $\xi = ik_0\int^{z}q^{1/2}(t)\,dt$, $w = q^{-1/4}W(\xi)$ and obtain

$$\frac{d^2W}{d\xi^2} - W = -S(\xi)W \qquad (8.89)$$


where

This differential equation may be converted into the integral equation

since we aim to have $E_y$ and $dE_y/dz$ continuous. If $q$ is always positive, choose $q^{1/2}$ likewise and then a possible solution of (8.89) is

(8.90)

As $z \to -\infty$, one may expect this to approach $e^{-\xi}$. Consequently, $e^{-\xi}$ can be regarded as an incident wave of unit amplitude propagating through the medium. The integral in (8.90) represents a correction which may become significant as $z$ increases but is often negligible. No attempt will be made here or subsequently to specify conditions on $q$ which warrant statements that an integral is small though studies have been conducted in this matter (Heading 1975; Olver 1965).

We now turn to the case when $q$ has two zeros. Specifically we assume that $q(z) = (z - z_1)(z - z_2)f(z)$ where $z_1$, $z_2$ are real with $z_2 > z_1$ while $f(z)$ is positive for all real $z$ and regular in a complex $z$ domain which includes the whole of the real axis. Zeros of $q$ off the real axis are irrelevant to the current purpose. We first establish an exact energy relationship. For $z > z_2$ take $q^{1/2}$ positive and define

From our argument for (8.90) we can regard this as a unit transmitted wave. For $z < z_1$ take $q^{1/2}$ positive also and define

$$\xi = ik_0\int_{z_1}^{z}q^{1/2}(t)\,dt,$$

$$W_2(\xi) = e^{-\xi} + \tfrac12\int_{-i\infty}^{\xi}(e^{-\xi+t} - e^{\xi-t})S(t)W_2(t)\,dt,$$

$$W_3(\xi) = e^{\xi} + \tfrac12\int_{-i\infty}^{\xi}(e^{-\xi+t} - e^{\xi-t})S(t)W_3(t)\,dt.$$

$W_2$ can be considered as a unit incident wave whereas $W_3$ is a unit reflected wave. Let $w$ be a solution of (8.88) for all real $z$ such that

$$w(z) = \begin{cases} q^{-1/4}\{W_2(\xi) + RW_3(\xi)\} & (z < z_1),\\ q^{-1/4}TW_1(\xi) & (z > z_2).\end{cases}$$

Now, since $q$ is real for real $z$, (8.88) implies

$$ww^{*\prime\prime} - w^{*}w'' = 0$$

whence $\Im\,ww^{*\prime}$ is constant for all real $z$. But, in $z > z_2$,

$$\Im\,ww^{*\prime} = \Im\left\{q^{-1/4}TW_1\,q^{-1/4}T^{*}\left(\frac{dW_1}{d\xi}\right)^{*}(ik_0q^{1/2})^{*}\right\} \to TT^{*}k_0$$

as $z \to \infty$. Similarly, from $z < z_1$, we find $\Im\,ww^{*\prime} \to (1 - RR^{*})k_0$ as $z \to -\infty$. Since these must be the same we have conservation of energy expressed by

$$RR^{*} + TT^{*} = 1. \qquad (8.91)$$

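It must be emphasized that (8.91) is relevant to the exact solution. If we assumed the approximation

$$w = \begin{cases} q^{-1/4}(e^{-\xi} + \hat{R}e^{\xi}) & (z < z_1),\\ q^{-1/4}\hat{T}e^{-\xi} & (z > z_2),\end{cases}$$

and substituted in $\Im\,ww^{*\prime} = \text{constant}$ we would obtain

$$\hat{R}\hat{R}^{*} + \hat{T}\hat{T}^{*} = 1. \qquad (8.92)$$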

However, if $T$ is small it may be swamped by the errors in $R$ so that all that could properly be deduced from (8.92) is that $|R|$ is approximately 1. To obtain a better result we must determine the exact $R$ or $T$. The difficulty of relating $R$ and $T$ originates from their being connected by what supervenes between $z_1$ and $z_2$. In this interval a solution which is negligible at $z_1$ may be dominant at $z_2$ so that straightforward matching without appreciable error is hard to achieve. The way around the snag is to split the real axis into two parts each of which contains only a solitary zero and use solutions which can pass through one zero (Jones 1964; Olver 1974). Let $z_0$ be a fixed point between $z_1$ and $z_2$. Then two uniformly valid asymptotic solutions of (8.88) on $[z_0, \infty)$ are

$$w_1 = \pi^{1/2}k_0^{1/6}\left(\frac{\zeta}{q}\right)^{1/4}\mathrm{Bi}(-k_0^{2/3}\zeta), \qquad w_2 = \pi^{1/2}k_0^{1/6}\left(\frac{\zeta}{q}\right)^{1/4}\mathrm{Ai}(-k_0^{2/3}\zeta),$$

where Ai and Bi are the standard Airy functions and

$$\zeta = \pm\left|\tfrac32\int_{z_2}^{z}q^{1/2}(t)\,dt\right|^{2/3},$$

the upper or lower sign being taken as $z > z_2$ or $z < z_2$. Let $z \to \infty$ so that $\zeta \to \infty$. The Airy functions may be replaced by their asymptotic expansions for large negative argument with the result

$$w_1 \sim -q^{-1/4}\sin\left\{k_0\int_{z_2}^{z}q^{1/2}(t)\,dt - \tfrac14\pi\right\}, \qquad w_2 \sim q^{-1/4}\cos\left\{k_0\int_{z_2}^{z}q^{1/2}(t)\,dt - \tfrac14\pi\right\}.$$

Comparison with the behaviour of $W_1$ as $z \to \infty$ reveals that

$$q^{-1/4}W_1 = \exp(-\tfrac14\pi i)(w_2 + iw_1).$$

In $(-\infty, z_0]$ we employ

$$w_3 = \pi^{1/2}k_0^{1/6}\left(\frac{\zeta_1}{q}\right)^{1/4}\mathrm{Bi}(-k_0^{2/3}\zeta_1), \qquad w_4 = \pi^{1/2}k_0^{1/6}\left(\frac{\zeta_1}{q}\right)^{1/4}\mathrm{Ai}(-k_0^{2/3}\zeta_1)$$

where

$$\zeta_1 = \mp\left|\tfrac32\int_{z_1}^{z}q^{1/2}(t)\,dt\right|^{2/3},$$

the upper or lower sign being adopted according as $z > z_1$ or $z < z_1$. Allowing $z \to -\infty$, especially through a sequence of values in which the cosine in the Airy function asymptotic expansion takes alternately the values zero and unity, permits the identification

$$q^{-1/4}W_2 = e^{-\pi i/4}(w_3 + iw_4), \qquad q^{-1/4}W_3 = e^{-\pi i/4}(w_4 + iw_3).$$

It follows that $w_3 + iw_4 + R(w_4 + iw_3)$ and $T(w_2 + iw_1)$ represent the same solution to (8.88). They and their derivatives must therefore be continuous at $z = z_0$. Hence

$$R = \frac{\mathcal{W}(w_4, w_1) - \mathcal{W}(w_3, w_2) - i\{\mathcal{W}(w_4, w_2) + \mathcal{W}(w_3, w_1)\}}{\mathcal{W}(w_4, w_2) - \mathcal{W}(w_3, w_1) + i\{\mathcal{W}(w_3, w_2) + \mathcal{W}(w_4, w_1)\}},$$

$$T = \frac{-2\mathcal{W}(w_3, w_4)}{\mathcal{W}(w_4, w_2) - \mathcal{W}(w_3, w_1) + i\{\mathcal{W}(w_3, w_2) + \mathcal{W}(w_4, w_1)\}}$$

where

$$\mathcal{W}(w_i, w_j) = w_iw_j' - w_i'w_j$$

is the Wronskian evaluated at $z = z_0$. Since we know the Wronskian is a constant independent of $z$ we could calculate it at any convenient point. However, the advantage of $z = z_0$ is that the argument of the Airy functions is large and positive so that asymptotic expansions are available. In fact

$$w_1 \sim Y^{-1}e^{k_0f}, \quad w_2 \sim \tfrac12Y^{-1}e^{-k_0f}, \quad w_3 \sim Y^{-1}e^{k_0g}, \quad w_4 \sim \tfrac12Y^{-1}e^{-k_0g},$$

$$w_1' \sim -k_0Ye^{k_0f}, \quad w_2' \sim \tfrac12k_0Ye^{-k_0f}, \quad w_3' \sim k_0Ye^{k_0g}, \quad w_4' \sim -\tfrac12k_0Ye^{-k_0g}$$

where

$$Y = |q(z_0)|^{1/4}, \qquad f = \int_{z_0}^{z_2}|q(t)|^{1/2}\,dt, \qquad g = \int_{z_1}^{z_0}|q(t)|^{1/2}\,dt.$$

Therefore $\mathcal{W}(w_4, w_1) = 0$, $\mathcal{W}(w_3, w_2) = 0$ and

$$\mathcal{W}(w_4, w_2) \sim \tfrac12k_0\exp(-k_0l), \qquad \mathcal{W}(w_3, w_1) \sim -2k_0\exp(k_0l), \qquad \mathcal{W}(w_3, w_4) \sim -k_0$$

where

$$l = f + g = \int_{z_1}^{z_2}|q(t)|^{1/2}\,dt.$$

Consequently

$$R = i, \qquad T = \exp(-k_0l). \qquad (8.93)$$

There is no point in retaining the exponentially small terms in $R$ because only the first term in the asymptotic expansions has been kept so that the most we can assert is that $R = i + O(1/k_0)$. However, the exact relation (8.91) coupled with (8.93) for $T$ does allow the statement

$$|R|^2 = 1 - \exp(-2k_0l). \qquad (8.94)$$
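As a numerical illustration of (8.93) and (8.94), the sketch below evaluates the barrier integral $l$ by a midpoint rule and then forms $R$ and $T$ from the Wronskian quotients using the leading asymptotic values $\mathcal{W}(w_4,w_1) = \mathcal{W}(w_3,w_2) = 0$, $\mathcal{W}(w_4,w_2) = \tfrac12k_0e^{-k_0l}$, $\mathcal{W}(w_3,w_1) = -2k_0e^{k_0l}$, $\mathcal{W}(w_3,w_4) = -k_0$. It is only a sketch under stated assumptions: the function names are hypothetical, and the parabolic profile $q(z) = (z - z_1)(z - z_2)$ with $z_1 = -1$, $z_2 = 1$, $k_0 = 3$ is chosen purely for illustration (for it $l = \tfrac12\pi$).

```python
import math


def barrier_integral(q, z1, z2, n=20000):
    """l = integral of |q(t)|**0.5 over [z1, z2], by the midpoint rule."""
    h = (z2 - z1) / n
    return h * sum(math.sqrt(abs(q(z1 + (i + 0.5) * h))) for i in range(n))


def connection_coefficients(k0, l):
    """R and T from the Wronskian quotients with leading asymptotic values.

    W(w4,w1) and W(w3,w2) vanish to this order, so only three Wronskians remain.
    """
    w42 = 0.5 * k0 * math.exp(-k0 * l)   # W(w4, w2)
    w31 = -2.0 * k0 * math.exp(k0 * l)   # W(w3, w1)
    w34 = -k0                            # W(w3, w4)
    denom = w42 - w31
    R = -1j * (w42 + w31) / denom
    T = -2.0 * w34 / denom
    return R, T


# Illustrative barrier: q < 0 between the zeros z1 = -1 and z2 = 1.
z1, z2 = -1.0, 1.0
l = barrier_integral(lambda z: (z - z1) * (z - z2), z1, z2)
R, T = connection_coefficients(3.0, l)
```

For this example $R$ differs from $i$, and $T$ from $\exp(-k_0l)$, only by exponentially small amounts, and the quotients satisfy $|R|^2 + |T|^2 = 1$ to rounding error, in line with (8.91).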

The interpretation of these formulae is that when the refractive index possesses two zeros there is a barrier which prevents the transmission of all but a minute amount of energy. The amplitude of the reflected wave is the same as that of the incident wave to within an exponentially small factor. The phase, however, is advanced by $\tfrac12\pi$. Consequently, when a ray is reflected by a minimum in refractive index the reflected field may be evaluated by geometrical optics provided that a phase advance of $\tfrac12\pi$ is inserted. Of course, the result will not be legitimate very near the reflecting level but only some distance below it.

One application of this theory is to propagation over a plane earth. Suppose that $N$ is constant and equal to 1 up to a height $h_1$, passes through a positive minimum $N(m)$ between $h_1$ and $h_2$ and becomes constant again at altitudes greater than $h_2$ (Fig. 8.9). If a ray is launched at an angle $\theta_0$ to the vertical the theory of §8.4 informs us that if $\sin\theta_0 < N(m)$ the ray will not be reflected but will continue steadily away from the earth after refraction through the layer.

Fig. 8.9. Ray when the refractive index has a minimum above a plane earth.

If $\sin\theta_0 > N(m)$ the ray will be reflected at a height $h$ where $N(h) = \sin\theta_0$, the abscissa of the point of reversing direction vertically being

$$x_h = \sin\theta_0\int_{h_0}^{h}\{N^2(w) - \sin^2\theta_0\}^{-1/2}\,dw$$

where $h_0$ is the height of the source above the earth. (It is assumed that the source is below the layer and in the region in which $N = 1$.) As $\theta_0$ increases a reflected ray first appears when $\sin\theta_0$ is just greater than $N(m)$ and then $x_h$ is virtually infinite. Increasing $\theta_0$ further decreases $x_h$ but $x_h$ must subsequently increase because obviously $x_h \to \infty$ as $\theta_0 \to \tfrac12\pi$. Hence $x_h$ has a minimum for some value of $\theta_0$. This minimum can be zero only if the ray with $\theta_0 = 0$ is reflected, i.e. if $N(m) = 0$. In general, there are at least two rays which turn around at any value of $x_h$ greater than the minimum. A reflected ray will eventually strike the earth. If $N(m) > 0$, the point of impact must be beyond a certain distance because $x_h$ cannot be below its minimum. The minimum horizontal range which must be traversed before a ray hits the earth is known as the skip distance. The field due to the reflected rays is often known as the sky wave. The sky wave is responsible for the ability to transmit signals great distances because it can penetrate to places where the direct wave from the source has been reduced to negligible proportions by horizon effects.

When the reflected ray strikes the earth it will be reflected by the earth at the same angle to the vertical. It will therefore return to the upper layer and be reflected back to the earth again. The phenomenon is therefore one of multiple reflection. If the source is a vertical transmitter which would give $\exp(-ik_0R)/R$ in free space the sky wave after $m$ reflections at the earth and $l$ reflections at the variable layer is

$$\frac{\exp(\tfrac12l\pi i)\,D\{R_v(\theta_0)\}^m\exp(-ik_0L)}{d}$$

where $R_v(\theta_0)$ is the Fresnel coefficient for the reflection at the earth at an angle of incidence $\theta_0$, $L$ is the optical path length, $d$ is the geometrical length of the ray, and $D$ is a divergence factor representing the effect of spreading of the rays. The only influence of the layer reflection other than in $D$ is the phase advance of $\tfrac12\pi$ on each occasion. The total field at a point is obtained by summing over all the reflected rays which pass through it and adding that which arrives directly without reflection. In the situation of Fig. 8.9 there may be rays emitted downwards which strike the earth before going to the transition layer; their contribution must be included. If both the transmitting and receiving points are on the surface of the earth the formula for the sky wave is

$$\sum\frac{D\{R_v(\theta_0)\}^m\{1 + R_v(\theta_0)\}^2\exp\{-ik_0L + \tfrac12(m + 1)\pi i\}}{d}$$

where $m$ is the number of earth reflections between the transmitter and receiver. The factor $1 + R_v$ is included once because the receiver is at a point where both a downcoming and an upgoing ray occur and a second time because the transmitter is on the earth. The summation must be carried out over those values of $\theta_0$ which supply a ray through the point of observation, i.e. which satisfy $x = 2(m + 1)x_h$ where $x$ is the distance between transmitter and receiver. Also

$$L = \int N\,ds = 2(m + 1)\int_{0}^{h}N^2(w)\{N^2(w) - \sin^2\theta_0\}^{-1/2}\,dw.$$

With regard to $D$ the rays in a small cone occupy an area $d^2\sin\theta_0\,d\theta_0\,d\psi$ in free space. Here they fill an area $x\cos\theta_0\,dx\,d\psi$. Hence

$$D = d\left(\frac{\tan\theta_0}{x}\left|\frac{\partial\theta_0}{\partial x}\right|\right)^{1/2}$$

and the sky wave can now be computed. In practice, the determination of rays when the positions of the transmitter and receiver are specified has to be carried out by trial and error. One way is to trace a ray from the transmitter. If it misses the receiver by being at a range $x_i$ instead of $x_R$ at the height $h_R$ of the receiver, a new angle of launching based on the iteration $\operatorname{cosec}\theta_{i+1} = \operatorname{cosec}\theta_i - N(h_T)(x_R - x_i)/N^2(h_R)\,\partial x_i/\partial N(h_R)$ is tried with $h_T$ the altitude of the transmitter. Alternatively, and probably more effectively, rays are traced for a selected set of launching angles and then, if two consecutive ones bracket the receiver, a better approximation is derived by interpolation.
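The bracketing-and-interpolation procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's program: the function names are hypothetical, the source is taken on the ground ($h_0 = 0$) with a single hop so that the range is $2x_h$, and the profile $N^2(w) = 1 - aw$ is an assumption chosen because it gives the closed form $x_h = \sin 2\theta_0/a$ for checking. The integrable singularity at the turning height is removed by the substitution $w = h - u^2$.

```python
import math


def turning_height(N2, theta0, h0, h_max, tol=1e-12):
    """Bisection for N2(h) = sin(theta0)**2, assuming N2 decreases on [h0, h_max]."""
    target = math.sin(theta0) ** 2
    lo, hi = h0, h_max
    if N2(hi) > target:
        raise ValueError("ray is not turned back below h_max")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if N2(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def x_h(N2, theta0, h0, h_max, n=2000):
    """Horizontal distance to the turning point.

    With w = h - u**2 the integrand 2u / sqrt(N2(h - u**2) - sin^2(theta0))
    stays bounded as u -> 0; the midpoint rule never evaluates u = 0.
    """
    s2 = math.sin(theta0) ** 2
    h = turning_height(N2, theta0, h0, h_max)
    du = math.sqrt(h - h0) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        total += 2.0 * u / math.sqrt(N2(h - u * u) - s2)
    return math.sin(theta0) * total * du


def find_launch_angles(range_of, x_R, angles):
    """Linear interpolation between consecutive launch angles that bracket x_R."""
    ranges = [range_of(t) for t in angles]
    hits = []
    for (t0, x0), (t1, x1) in zip(zip(angles, ranges), zip(angles[1:], ranges[1:])):
        if (x0 - x_R) * (x1 - x_R) < 0.0:
            hits.append(t0 + (x_R - x0) * (t1 - t0) / (x1 - x0))
    return hits


a = 0.01                                   # assumed gradient of N^2 (per unit height)
N2 = lambda w: 1.0 - a * w                 # illustrative profile only
one_hop_range = lambda t: 2.0 * x_h(N2, t, 0.0, 1.0 / a)
angles = [math.radians(d) for d in range(1, 90)]
hits = find_launch_angles(one_hop_range, 150.0, angles)
```

For this profile two launching angles reach the receiver at range 150, which illustrates the remark that in general at least two rays turn around at a given distance; a finer grid of angles, or a second interpolation pass, tightens the estimate.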

Exercises 25. Show that the change in range at constant height due to a change in launching angle satisfies

-OX = -No + cot 80 080

N~

I(

NN") 1- dx ,2 N

0

Deduce the intensity of a ray.

26. Assuming that $N$ is given only at a sequence of discrete data levels interpolate (i) $1/N$ linearly between levels, (ii) $N^2$ by quadratics with continuous first derivatives, and (iii) $N^2$ by cubic B-splines. Set up a program for tracing rays and compare the three possibilities when $N(z) = 1 + z/a$ and when $1/N(z) = 1 + z/a$. Note particularly the comparative positions of any caustics. If $N$ has a parabolic variation, first increasing from the surface of the earth to a maximum at height $h_1$, then decreasing to a height $h_2$, and thereafter decreasing linearly to a height $h_3$ where there is a perfectly reflecting boundary, draw a ray diagram.

8.13 Edges

An essential feature of the procedure in applying geometrical optics to scattering by an obstacle is the substitution of the tangent plane at a point for the local surface. Such a replacement is plausible if the principal radii of curvature at the point are large compared with the wavelength. If the wavelength is too large another canonical problem may be necessary but the more awkward situation is that in which the radii of curvature are large in some places and small in others. If one radius of curvature is large and the other small another canonical problem is involved. When one radius of curvature is actually zero as at an edge the geometrical theory of diffraction selects as its canonical problem the scattering by a semi-infinite plane. This section is concerned with elucidating the properties of this canonical problem.

Let the perfectly conducting semi-infinite plane occupy that part of $y = 0$ on which $x$ is negative (Fig. 8.10) and lie in free space. Represent $x$, $y$ by the cylindrical polar coordinates $r$, $\phi$. Then we consider an incident electrically polarized plane wave in which the $z$ component of the electric intensity is given by

$$E^i = \exp\{-ik_0r\cos(\phi - \phi_0)\}$$

where $\phi_0$ is a real angle satisfying $0 < \phi_0 < \pi$. In parallel there is a magnetically polarized incident wave in which

$$H^i = \exp\{-ik_0r\cos(\phi - \phi_0)\}.$$

Fig. 8.10. Plane wave incident on a semi-infinite plane.

The boundary condition on the semi-infinite plane is that the total $E$ vanishes there or, correspondingly, $\partial H/\partial y = 0$. A detailed derivation of the solution will not be given. That the formulae displayed below do comply with all conditions of the problem may be readily confirmed by the reader. The solution is expressed in terms of the function $F(w)$ defined by

$$F(w) = \exp(iw^2)\int_{w}^{\infty}\exp(-it^2)\,dt \qquad (8.95)$$

which has been tabulated (Clemmow and Mumford 1952; see also Banos and Johnston 1970) for complex $w$; it may also be written otherwise by Fresnel's integrals. One useful attribute is that

$$F(-w) = \pi^{1/2}\exp(iw^2 - \tfrac14\pi i) - F(w). \qquad (8.96)$$

Integration by parts in (8.95) easily furnishes

$$F(w) = -\frac{i}{2w} + \frac{1}{4w^3} + O\left(\frac{1}{|w|^5}\right) \qquad (8.97)$$

for $\tfrac12\pi > \operatorname{ph} w > -\pi$. Formula (8.96) caters for the asymptotic conduct in other ranges of $\operatorname{ph} w$ (see also §8.24). By putting

we see that

$$F(w) = \tfrac12\pi^{1/2}\exp(-\tfrac14\pi i) - w + O(|w|^2) \qquad (8.98)$$

for $|w| \ll 1$.
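The properties (8.96)-(8.98) are easy to check numerically for real $w$. The sketch below is an illustration rather than a tabulation method from the text: it assumes that $F(w) = \exp(iw^2)\int_w^\infty\exp(-it^2)\,dt$, the form consistent with (8.96)-(8.98), splits the integral as $\int_0^\infty - \int_0^w$ with $\int_0^\infty\exp(-it^2)\,dt = \tfrac12\pi^{1/2}\exp(-\tfrac14\pi i)$, and evaluates the finite part by Simpson's rule. The name `F` and the step count are choices made here.

```python
import cmath
import math

HALF_LINE = 0.5 * math.sqrt(math.pi) * cmath.exp(-0.25j * math.pi)  # integral over [0, inf)


def F(w, n=4000):
    """Assumed F(w) = exp(i w^2) * integral_w^inf exp(-i t^2) dt, for real w.

    The finite integral over [0, w] uses the composite Simpson rule with
    n (even) panels; a negative w simply gives a negative step.
    """
    h = w / n
    total = 1.0 + cmath.exp(-1j * w * w)          # end points t = 0 and t = w
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * cmath.exp(-1j * t * t)
    finite = total * h / 3.0
    return cmath.exp(1j * w * w) * (HALF_LINE - finite)


large = F(6.0)    # compare with -i/(2w) + 1/(4w^3), cf. (8.97)
small = F(0.01)   # compare with F(0) - w, cf. (8.98)
```

Simpson's rule is adequate for moderate real $w$ (roughly $|w| \lesssim 10$ with the default step count); for complex $w$ a complementary-error-function routine would be the more natural tool.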


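The electrically polarized solution is

$$E = \pi^{-1/2}\exp(-ik_0r + \tfrac14\pi i)\left[F\{(2k_0r)^{1/2}\sin\tfrac12(\phi - \phi_0)\} - F\{(2k_0r)^{1/2}\sin\tfrac12(\phi + \phi_0)\}\right]. \qquad (8.99)$$

As $k_0r \to \infty$,

$$E \to \begin{cases}\exp\{-ik_0r\cos(\phi - \phi_0)\} - \exp\{-ik_0r\cos(\phi + \phi_0)\} & (-\pi \le \phi < -\phi_0),\\ \exp\{-ik_0r\cos(\phi - \phi_0)\} & (-\phi_0 < \phi < \phi_0),\\ 0 & (\phi_0 < \phi \le \pi).\end{cases}$$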

The first discloses the presence of a plane wave reflected from the face of the diffracting sheet as if it were infinite in extent; the second contains the incident wave alone while the third indicates a zero field behind the screen. These formulae bring to light geometrical optics. For there is a shadow zone behind the plane where there is no field (Fig. 8.11), an illuminated region where the incident plane wave exists, and a reflection sector where a reflected wave is manifest. The field of geometrical optics is discontinuous across the lines $\phi = \pm\phi_0$ which is why the more recondite expression (8.99) is wanted to secure a smooth transition. Extract the geometrical optics terms from (8.99), leaving what is known as the diffracted field $E_d$. When $k_0r \gg 1$ and $\phi$ is not close to $\pm\phi_0$,

$$E_d \sim \left(\frac{2}{\pi k_0r}\right)^{1/2}\frac{\cos\tfrac12\phi\,\sin\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\exp(-ik_0r - \tfrac14\pi i) \qquad (8.100)$$

from (8.96) and (8.97). The implication of (8.100) is that the diffraction field behaves as if it originates from a line source placed along the diffracting edge but with an amplitude factor which depends upon the angle of observation. Experimentally, this may be confirmed by measurements in the shadow zone where the electromagnetic field appears to issue from the edge.
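A short numerical comparison shows how the exact half-plane field, the geometrical optics field and the edge-diffracted term fit together. This is a hedged sketch: it assumes the forms $F(w) = \exp(iw^2)\int_w^\infty\exp(-it^2)\,dt$, $E = \pi^{-1/2}\exp(-ik_0r + \tfrac14\pi i)[F\{(2k_0r)^{1/2}\sin\tfrac12(\phi - \phi_0)\} - F\{(2k_0r)^{1/2}\sin\tfrac12(\phi + \phi_0)\}]$ for (8.99), and $(2/\pi k_0r)^{1/2}\cos\tfrac12\phi\sin\tfrac12\phi_0\,(\cos\phi_0 - \cos\phi)^{-1}\exp(-ik_0r - \tfrac14\pi i)$ for (8.100). All function names and the sample values $k_0r = 50$, $\phi_0 = \pi/3$ are illustrative.

```python
import cmath
import math

HALF_LINE = 0.5 * math.sqrt(math.pi) * cmath.exp(-0.25j * math.pi)


def F(w, n=4000):
    """Assumed F(w) = exp(i w^2) * integral_w^inf exp(-i t^2) dt (real w, Simpson rule)."""
    h = w / n
    total = 1.0 + cmath.exp(-1j * w * w)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * cmath.exp(-1j * t * t)
    return cmath.exp(1j * w * w) * (HALF_LINE - total * h / 3.0)


def total_E(k0r, phi, phi0):
    """Total field in the assumed form of (8.99)."""
    a = math.sqrt(2.0 * k0r)
    pref = cmath.exp(-1j * k0r + 0.25j * math.pi) / math.sqrt(math.pi)
    return pref * (F(a * math.sin(0.5 * (phi - phi0))) - F(a * math.sin(0.5 * (phi + phi0))))


def geometrical_optics_E(k0r, phi, phi0):
    """Incident minus reflected, incident alone, or zero, by angular sector."""
    inc = cmath.exp(-1j * k0r * math.cos(phi - phi0))
    ref = cmath.exp(-1j * k0r * math.cos(phi + phi0))
    if phi < -phi0:
        return inc - ref
    if phi < phi0:
        return inc
    return 0j


def diffracted_E(k0r, phi, phi0):
    """Edge-diffracted field in the assumed form of (8.100)."""
    amp = math.sqrt(2.0 / (math.pi * k0r))
    ang = math.cos(0.5 * phi) * math.sin(0.5 * phi0) / (math.cos(phi0) - math.cos(phi))
    return amp * ang * cmath.exp(-1j * k0r - 0.25j * math.pi)


k0r, phi0 = 50.0, math.pi / 3
# Residual of "geometrical optics + diffracted" against the total field,
# sampled in the shadow, illuminated and reflection sectors.
residuals = [abs(total_E(k0r, p, phi0) - geometrical_optics_E(k0r, p, phi0)
                 - diffracted_E(k0r, p, phi0)) for p in (2.5, 0.0, -2.5)]
```

Away from $\phi = \pm\phi_0$ the residuals are small compared with the diffracted term itself, whereas on the shadow boundary the total field is close to half the incident wave and the asymptotic form is not applicable.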

Fig. 8.11. Geometrical optics in diffraction by a semi-infinite plane (regions marked Shadow and Illuminated).

Fig. 8.12. The asymptotic formula fails in the shaded regions.

The formula (8.100) must be dropped when $\phi$ is near $\pm\phi_0$ because then a small argument may occur in $F$ even though $k_0r$ is large. A notion of the regions of invalidity can be obtained by drawing the curves $2k_0r\sin^2\tfrac12(\phi \pm \phi_0) = 10$. The curves are parabolas with foci at the origin and axes $\phi = \pm\phi_0$ (Fig. 8.12). The larger $k_0r$ is, the narrower the parabolas are. The formula (8.100) should be avoided in the shaded regions. In these regions the full expression (8.99) must be used; there is no simpler approximation. According to (8.99)

$$E = \tfrac12\exp(-ik_0r) + O\{(k_0r)^{-1/2}\}$$

when $\phi = \phi_0$ so that the diffracted field is of the same order of magnitude as the incident wave. A similar statement is true at $\phi = -\phi_0$. This explains in another way why it is not legitimate to employ (8.100) in the shaded regions of Fig. 8.12. In a region where the diffracted and incident waves are of comparable magnitude there is the prospect of interference. It is borne out by calculations. A typical example is shown in Fig. 8.13 where the field three wavelengths behind the screen is plotted, i.e. on $y = 6\pi/k_0$, with an incident plane wave at normal incidence ($\phi_0 = \tfrac12\pi$) to the plane. The interference fringes in the illuminated region are clearly visible as well as the steady decay which transpires as one moves into the shadow. For magnetic polarization

$$H = \pi^{-1/2}\exp(-ik_0r + \tfrac14\pi i)\left[F\{(2k_0r)^{1/2}\sin\tfrac12(\phi - \phi_0)\} + F\{(2k_0r)^{1/2}\sin\tfrac12(\phi + \phi_0)\}\right] \qquad (8.101)$$

which differs from (8.99) only in the sign of the second term. The general structure of the field of geometrical optics is the same as in Fig. 8.11.

Fig. 8.13. Interference fringes for a plane wave at normal incidence (horizontal axis: distance in wavelengths).

When $k_0r \gg 1$, the diffracted field is given by

$$H_d \sim \left(\frac{2}{\pi k_0r}\right)^{1/2}\frac{\sin\tfrac12\phi\,\cos\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\exp(-ik_0r - \tfrac14\pi i) \qquad (8.102)$$

provided that $\phi$ is away from $\pm\phi_0$, i.e. the shaded regions of Fig. 8.12 must again be excluded. Once more, the diffracted field looks as though it stems from a line source along the diffracting edge.

So far the direction of propagation of the plane wave has been in the $(x, y)$ plane and perpendicular to the diffracting edge. This restriction will now be lifted and the case of general oblique incidence considered. Let the electric intensity of the primary wave be

$$\mathbf{E} = (l_0\mathbf{i} + m_0\mathbf{j} + n_0\mathbf{k})\exp\{-ik_0(x\sin\theta_0\cos\phi_0 + y\sin\theta_0\sin\phi_0 + z\cos\theta_0)\}$$

where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are unit vectors parallel to the Cartesian axes and

$$(l_0\cos\phi_0 + m_0\sin\phi_0)\sin\theta_0 + n_0\cos\theta_0 = 0.$$

The direction of propagation makes an angle $\theta_0$ with the $z$ axis (which is along the edge) and the wave degenerates to the non-oblique case when $\theta_0 = \tfrac12\pi$. If the $z$ dependence of all fields is taken to be $\exp(-ik_0z\cos\theta_0)$, the governing equations assume the same guise as in two dimensions except that $k_0$ is replaced by $k_0\sin\theta_0$. The solution may be deduced from (8.99) and (8.101) by such a replacement. It is found that, if $\mathbf{E}^{(3)}$ denotes the total three-dimensional field,

$$\mathbf{E}^{(3)} = n_0\exp(-ik_0z\cos\theta_0)\frac{(k_0\mathbf{k}\sin^2\theta_0 - i\cos\theta_0\,\mathrm{grad})\mathcal{E}^{(3)}}{k_0\sin^2\theta_0} - i\exp(-ik_0z\cos\theta_0)(l_0\sin\phi_0 - m_0\cos\phi_0)\frac{\mathbf{k}\wedge\mathrm{grad}\,\mathcal{H}^{(3)}}{k_0\sin\theta_0} \qquad (8.103)$$

where $\mathcal{E}^{(3)}$ and $\mathcal{H}^{(3)}$ are given by (8.99) and (8.101) with $k_0\sin\theta_0$ in place of $k_0$. Also

$$\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\mathbf{H}^{(3)} = -in_0\exp(-ik_0z\cos\theta_0)\frac{\mathbf{k}\wedge\mathrm{grad}\,\mathcal{E}^{(3)}}{k_0\sin^2\theta_0} - \exp(-ik_0z\cos\theta_0)(l_0\sin\phi_0 - m_0\cos\phi_0)\frac{(k_0\mathbf{k}\sin^2\theta_0 - i\cos\theta_0\,\mathrm{grad})\mathcal{H}^{(3)}}{k_0\sin\theta_0}. \qquad (8.104)$$

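When $k_0r\sin\theta_0$ is large the $F$s can be estimated asymptotically provided that $\phi$ is not close to $\pm\phi_0$, i.e. so long as the point of observation is not near the shadow boundary or boundary of the reflected wave. Then for the diffracted field

$$E^{(3)}_{dr} \sim -n_0\cot\theta_0\,\frac{\cos\tfrac12\phi\,\sin\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\left(\frac{2}{\pi k_0r\sin\theta_0}\right)^{1/2}\exp(-ik_0r\sin\theta_0 - ik_0z\cos\theta_0 - \tfrac14\pi i), \qquad (8.105)$$

$$E^{(3)}_{d\phi} \sim -(l_0\sin\phi_0 - m_0\cos\phi_0)\frac{\sin\tfrac12\phi\,\cos\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\left(\frac{2}{\pi k_0r\sin\theta_0}\right)^{1/2}\exp\{-ik_0(r\sin\theta_0 + z\cos\theta_0) - \tfrac14\pi i\}, \qquad (8.106)$$

$$E^{(3)}_{dz} \sim n_0\,\frac{\cos\tfrac12\phi\,\sin\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\left(\frac{2}{\pi k_0r\sin\theta_0}\right)^{1/2}\exp\{-ik_0r\sin\theta_0 - ik_0z\cos\theta_0 - \tfrac14\pi i\} \qquad (8.107)$$

whence

$$E^{(3)}_{dr}\sin\theta_0 + E^{(3)}_{dz}\cos\theta_0 \sim 0, \qquad (8.108)$$

$$E^{(3)}_{dr}\cos\theta_0 - E^{(3)}_{dz}\sin\theta_0 \sim -\frac{E^{(3)}_{dz}}{\sin\theta_0}. \qquad (8.109)$$

Moreover

$$\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\mathbf{H}^{(3)}_d \sim (\hat{\mathbf{r}}\sin\theta_0 + \mathbf{k}\cos\theta_0)\wedge\mathbf{E}^{(3)}_d \qquad (8.110)$$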

where $\hat{\mathbf{r}}$ is a unit vector in the direction of the cylindrical polar radial vector. These formulae are valid so long as $k_0r$ is large, i.e. points of observation almost on the edge are excluded, and $\sin\theta_0$ is not too small. Thus the incident wave must not have a direction of propagation practically coincident with the edge. It is transparent from (8.108) and (8.110) that the electric and magnetic fields are perpendicular to each other and both are orthogonal to $\hat{\mathbf{r}}\sin\theta_0 + \mathbf{k}\cos\theta_0$. Thus, locally, the diffracted field has the appearance of a plane wave moving parallel to $\hat{\mathbf{r}}\sin\theta_0 + \mathbf{k}\cos\theta_0$ and, generally, can be visualized as propagating along rays of this type. Hence the field looks like a progressive wave emanating from the edge. If a cone of semi-angle $\theta_0$ and apex on the edge be drawn with its axis along the edge, the field on the cone travels along a generator, being constant on any given generator apart from variations in the distance to the apex (Fig. 8.14). The existence of such a cone has been confirmed experimentally. If $\theta_0 = \tfrac12\pi$, the cone degenerates to a plane perpendicular to the edge and the generators are radial lines from the edge, agreeing with the two-dimensional behaviour described earlier.

Fig. 8.14. The cone of diffracted rays produced by an edge.

It is now convenient to rewrite (8.105)-(8.110) differently. Introduce the coordinates $\xi$ and $\eta$ on a ray so that both are perpendicular to the ray and form a right-hand system with the direction of propagation. Choose $\xi$ to lie in the plane containing the ray and edge. Denote the incident ray and a ray emitted by the edge by the superscripts $i$ and $e$ respectively. For example, $\hat{\boldsymbol{\eta}}^e$ will lie in the opposite direction to the unit $\phi$ vector and $E^e_\xi = E^{(3)}_{dz}\sin\theta_0 - E^{(3)}_{dr}\cos\theta_0$. Then (8.105)-(8.110) give

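$$\begin{pmatrix}E^e_\xi\\ E^e_\eta\end{pmatrix} = \begin{pmatrix}D_E & 0\\ 0 & D_H\end{pmatrix}\begin{pmatrix}E^i_{\xi0}\\ E^i_{\eta0}\end{pmatrix}\frac{\exp(-ik_0r\operatorname{cosec}\theta_0)}{(r\operatorname{cosec}\theta_0)^{1/2}} \qquad (8.111)$$

where

$$D_E = \left(\frac{2}{\pi k_0}\right)^{1/2}\exp(-\tfrac14\pi i)\,\frac{\cos\tfrac12\phi\,\sin\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\operatorname{cosec}\theta_0, \qquad (8.112)$$

$$D_H = \left(\frac{2}{\pi k_0}\right)^{1/2}\exp(-\tfrac14\pi i)\,\frac{\sin\tfrac12\phi\,\cos\tfrac12\phi_0}{\cos\phi_0 - \cos\phi}\operatorname{cosec}\theta_0 \qquad (8.113)$$

and $E^i_{\xi0}$, $E^i_{\eta0}$ are the components of the incident field along $\hat{\boldsymbol{\xi}}^i$ and $\hat{\boldsymbol{\eta}}^i$ at the point of the edge from where the ray starts. The magnetic intensity can be obtained from (8.110). These results form the basis of the geometrical theory of diffraction. For the canonical problem of an impedance half-plane see Senior (1989).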
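The diffraction coefficients lend themselves to a small helper routine. The sketch below is illustrative only: it takes $D_E$ and $D_H$ in the forms $(2/\pi k_0)^{1/2}\exp(-\tfrac14\pi i)\operatorname{cosec}\theta_0\,(\cos\phi_0 - \cos\phi)^{-1}$ times $\cos\tfrac12\phi\sin\tfrac12\phi_0$ and $\sin\tfrac12\phi\cos\tfrac12\phi_0$ respectively (the forms obtained from (8.100) and (8.102) with $k_0\sin\theta_0$ in place of $k_0$), and the names `diffraction_coefficients`, `edge_ray_field` and the tolerance `eps` are hypothetical.

```python
import cmath
import math


def diffraction_coefficients(k0, phi, phi0, theta0, eps=1e-9):
    """Assumed forms of D_E and D_H for the straight edge; angles in radians."""
    denom = math.cos(phi0) - math.cos(phi)
    if abs(denom) < eps:
        raise ValueError("on a shadow or reflection boundary the coefficients are infinite")
    common = (math.sqrt(2.0 / (math.pi * k0)) * cmath.exp(-0.25j * math.pi)
              / (denom * math.sin(theta0)))
    D_E = common * math.cos(0.5 * phi) * math.sin(0.5 * phi0)
    D_H = common * math.sin(0.5 * phi) * math.cos(0.5 * phi0)
    return D_E, D_H


def edge_ray_field(E_xi0, E_eta0, k0, r, phi, phi0, theta0):
    """Components (E_xi, E_eta) on an edge ray at cylindrical radius r, cf. (8.111)."""
    s = r / math.sin(theta0)                      # distance along the ray, r cosec(theta0)
    D_E, D_H = diffraction_coefficients(k0, phi, phi0, theta0)
    spread = cmath.exp(-1j * k0 * s) / math.sqrt(s)
    return D_E * E_xi0 * spread, D_H * E_eta0 * spread


# Normal incidence (theta0 = pi/2) with the incident field along the edge.
k0, phi0 = 20.0, math.pi / 3
E_xi, E_eta = edge_ray_field(1.0, 0.0, k0, 4.0, 2.5, phi0, 0.5 * math.pi)
```

At $\theta_0 = \tfrac12\pi$ the $\xi$ component reproduces the two-dimensional diffracted field, and $D_E$ vanishes on the screen ($\phi = \pm\pi$), as the boundary condition on $E$ requires.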

8.14 Edge rays

To enable the geometrical theory of diffraction (which will hereinafter be abbreviated to GTD) to cope with edges we postulate that they have an effect which can be modelled by the behaviour in §8.13. In other words, an Ansatz is introduced stating that the rays behave in the same way even if the edge is curved. The incident ray from $P_0$ to any point $O$ of the edge is imagined to initiate a set of edge rays all disposed on a right circular cone with vertex $O$, axis the tangent to the edge at $O$, and with one generator the continuation of the incident ray (Fig. 8.15). Fermat's principle will then make $P_0OP$ a contributory ray when $O$ is chosen so that the aforementioned cone passes through $P$. If the incident ray happens to strike the edge at right angles the edge rays spread out all over the plane perpendicular to the edge as depicted in Fig. 8.16. An edge ray may itself encounter an edge. It will then originate its own cone of edge rays, often called doubly diffracted rays to indicate that two edges have been involved. There is nothing new in principle unless the second edge is near the shadow or reflection boundaries of the first (see §8.16) but it may be extraordinarily complicated to decide which rays go through a particular point.

Fig. 8.15. Cone of rays at a curved edge.

Fig. 8.16. Edge rays for perpendicular incidence.

The knowledge of the trajectories of the rays has to be supplemented by a prescription for the amplitude of the field to make GTD complete. According to §8.2 the field on an edge ray can be computed from

$$\mathbf{E}^e(P) = \left\{\frac{\rho_1^e\rho_2^e}{(\rho_1^e + s)(\rho_2^e + s)}\right\}^{1/2}\mathbf{E}^e(Q)\exp\{-ik_0(L_0 + s)\} \qquad (8.114)$$

when $N = 1$. Here $\mathbf{E}^e(Q)$ is the field at the point $Q$ of the ray, $L_0$ the value of $L$ there, $s$ the distance from $Q$ to $P$, and $\rho_1^e$, $\rho_2^e$ the principal radii of curvature at $Q$ of the wavefront produced by the edge rays. It would be pleasant to pick $Q$ as the point $O$ of the edge in Fig. 8.15 so as to connect $\mathbf{E}^e(Q)$ with the incident ray. Unfortunately, all the rays on the cone intersect at the apex so that the edge is a caustic of the diffracted rays. Hence, if $Q$ coincides with $O$, $\rho_2^e$ is zero and $\mathbf{E}^e(Q)$ would need to be infinite to provide a finite non-zero value for $\mathbf{E}^e(P)$ from (8.114). Contrariwise, (8.114) declares that $(\rho_2^e)^{1/2}\mathbf{E}^e(Q)$ is finite as $Q$ tends to $O$. Let its limit be $\mathbf{E}(O)$. Then

$$\mathbf{E}^e(P) = \left\{\frac{\rho_1^e}{(\rho_1^e + s)s}\right\}^{1/2}\mathbf{E}(O)\exp\{-ik_0(L_0 + s)\}$$

where now $s$ is measured along the edge ray from $O$. The radius of curvature $\rho_1^e$ is the distance of $O$ from the second caustic of the rays. Its evaluation is straightforward. Let $\theta_0$ be the angle between the incident ray and the tangent $OT$ to the edge (Fig. 8.17). Let $\boldsymbol{\nu}$ be a unit vector along the principal normal $ON$ which lies in the osculating plane. Unit vectors along $P_0O$ and $OP$ are denoted by $\hat{\mathbf{s}}^i$ and $\hat{\mathbf{s}}^e$ respectively. Then, if $\kappa_0$ is the curvature of the edge at $O$,

$$\frac{1}{\rho_1^e} = -\kappa_0\boldsymbol{\nu}\cdot\hat{\mathbf{s}}^e\operatorname{cosec}^2\theta_0 - \operatorname{cosec}\theta_0\frac{d\theta_0}{d\sigma_0} \qquad (8.115)$$

where $\sigma_0$ is the arc length of edge in the direction $OT$ and the derivative is evaluated at $O$. The change in $\theta_0$ is related to the curvature of the incident wavefront as is brought out by writing (8.115) as

$$\frac{1}{\rho_1^e} = C^i_{11} + \kappa_0\boldsymbol{\nu}\cdot(\hat{\mathbf{s}}^i - \hat{\mathbf{s}}^e)\operatorname{cosec}^2\theta_0 \qquad (8.116)$$

where $C^i$ is the curvature matrix of §8.5 but based on the coordinates $\xi^i$, $\eta^i$ of §8.13.

Fig. 8.17. Parameters for the calculation of $\rho_1^e$.

It remains to find E(O)exp( - ikoLo). The suggestion is that it should be joined to Ei(O), the incident field at 0, in a similar manner to that in which they are when the edge is straight, i.e, E(O) exp( -ikoLo) = DEi(O) where D is the matrix with diagonal elements ~,~ and it is understood that the two sides are expressed in terms of their respective ray coordinates. Now, if we do not stray too far from the edge s will be negligible compared with p~ and the diffracted field will have the form

e, "

$s^{-1/2}\mathbf{E}(O)\exp\{-ik_0(L_0 + s)\}$. Indeed, for an incident plane wave and straight edge $\rho_1^e$ is infinite. In the notation of §8.13 $s = r\,\mathrm{cosec}\,\theta_0$ and (8.111) is correctly reproduced by the proposed choice for $\mathbf{E}(O)$. Hence, according to GTD, the field on an edge ray is

$$\mathbf{E}^e(P) = \left\{\frac{\rho_1^e}{(\rho_1^e + s)s}\right\}^{1/2}\exp(-ik_0 s)\,D\,\mathbf{E}^i(O). \tag{8.117}$$
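The amplitude factor in (8.117) is easy to explore numerically. The following is a minimal sketch, not part of the original analysis: the function name `edge_ray_spreading` is invented for illustration, and the diffraction matrix D and the phase are deliberately left out. It shows how the factor reduces to $s^{-1/2}$ when $\rho_1^e$ is infinite (plane wave on a straight edge).

```python
import math


def edge_ray_spreading(rho1, s):
    """Amplitude factor {rho1 / ((rho1 + s) s)}^{1/2} appearing in (8.117).

    rho1 is the caustic distance (math.inf for a plane wave on a straight
    edge) and s > 0 is the distance along the edge ray from the edge.
    """
    if s <= 0.0:
        raise ValueError("s must be positive; the edge (s = 0) is a caustic")
    if math.isinf(rho1):
        return 1.0 / math.sqrt(s)
    denom = (rho1 + s) * s
    if denom == 0.0 or rho1 / denom < 0.0:
        # Past the second caustic the square root becomes imaginary and a
        # phase jump has to be tracked; that is outside this sketch.
        raise ValueError("point lies on or beyond the second caustic")
    return math.sqrt(rho1 / denom)


if __name__ == "__main__":
    for rho1 in (0.5, 5.0, 500.0, math.inf):
        print(rho1, edge_ray_spreading(rho1, s=2.0))
```

As $\rho_1^e$ grows the printed values approach $s^{-1/2}$, which is the cylindrical spreading recovered in the text for a straight edge.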

The quantities $\phi_0$ and $\phi$ in D in (8.117) must now be taken as angles made with the continuation of NO by the projections of the incident ray $P_0O$ and of the edge ray OP respectively on the normal plane at O.


To find higher-order terms in the expansion reversion to (8.36) and (8.37) is necessary. For example, since $\mu$ and $\varepsilon$ are constant and N = 1, from (8.36)

$$\hat{\mathbf{s}}^e\cdot\mathbf{E}_m|J(s)|^{1/2} = -\tfrac{1}{2}i\int_{s_1}^{s}|J(\sigma)|^{1/2}\,\hat{\mathbf{s}}^e\cdot\nabla^2\mathbf{E}_{m-1}\,d\sigma + \hat{\mathbf{s}}^e\cdot\mathbf{E}_m(s_1)|J(s_1)|^{1/2}$$

where J is the Jacobian of §8.3 because s and $\sigma$ can be identified with each other. In fact, $J(s) = s(1 + s/\rho_1^e)\sin^2\theta_0$. The iteration is commenced by replacing $\mathbf{E}_0$ by $\mathbf{E}^e$ as given by (8.117). However, we cannot put $s_1 = 0$ because of the singularities which occur there. This difficulty is overcome by means of Hadamard's finite part. If $\int_\delta^s f(t)\,dt$ is expanded in powers (possibly fractional) of $\delta$ as $\delta \to +0$, let $S(\delta)$ denote those which are negative and therefore give rise to a singularity in the limit. Then

$$\mathrm{fin}\int_0^s f(t)\,dt = \lim_{\delta\to+0}\left\{\int_\delta^s f(t)\,dt - S(\delta)\right\}.$$
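The definition of the finite part can be tried out numerically. The sketch below is an illustration only and rests on an assumption that is not in the text: the integrand has the form $f(t) = t^{-3/2}g(t)$ with g smooth, so that the only negative power of $\delta$ is $S(\delta) = 2g(0)\delta^{-1/2}$. The function names are hypothetical.

```python
import math


def finite_part(g, s, n=2000):
    """fin of the integral of t^{-3/2} g(t) over (0, s), g smooth at 0.

    Splitting g(t) = g(0) + (g(t) - g(0)) leaves an integrable remainder
    plus g(0) * (2 delta^{-1/2} - 2 s^{-1/2}); dropping the singular piece
    S(delta) = 2 g(0) delta^{-1/2} gives the finite part.
    """
    g0 = g(0.0)
    upper = math.sqrt(s)
    h = upper / n
    total = 0.0
    # Substitution t = u^2 turns the remainder into a smooth integrand;
    # the midpoint rule never samples u = 0.
    for k in range(n):
        u = (k + 0.5) * h
        total += 2.0 * (g(u * u) - g0) / (u * u)
    return total * h - 2.0 * g0 / math.sqrt(s)


def finite_part_by_limit(delta, s):
    """Apply the limit definition to f(t) = t^{-3/2} (1 + t) in closed form."""
    def antiderivative(t):
        return -2.0 / math.sqrt(t) + 2.0 * math.sqrt(t)

    integral = antiderivative(s) - antiderivative(delta)
    singular_part = 2.0 / math.sqrt(delta)  # S(delta)
    return integral - singular_part


if __name__ == "__main__":
    print(finite_part(lambda t: 1.0 + t, 4.0))
    for delta in (1e-2, 1e-4, 1e-8):
        print(delta, finite_part_by_limit(delta, 4.0))
```

Both routes should agree as $\delta \to +0$; for this test integrand the exact finite part is $2s^{1/2} - 2s^{-1/2}$.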

Letting $s_1$ tend to zero we have

$$\hat{\mathbf{s}}^e\cdot\mathbf{E}_m|J(s)|^{1/2} = \hat{\mathbf{s}}^e\cdot\mathbf{e}_m - \tfrac{1}{2}i\,\mathrm{fin}\int_0^s|J(\sigma)|^{1/2}\,\hat{\mathbf{s}}^e\cdot\nabla^2\mathbf{E}_{m-1}\,d\sigma \tag{8.118}$$

where $\mathbf{e}_m = \mathrm{fin}\,\mathbf{E}_m(0)|J(0)|^{1/2}$. Regrettably, (8.118) still cannot be implemented because there is no rule for specifying $\mathbf{e}_m$. To resolve this deficiency requires uniformly valid formulae which will be discussed in the next section.

8.15 Uniformly valid approximations

The formula (8.117) has certain failings. It is discontinuous at the shadow and reflection boundaries ($\phi = \pm\phi_0$). Indeed, an infinite field is predicted there because D is infinite there. Also the edge wave approaches infinity at the edge where s = 0, so that edge conditions may not be fulfilled. The higher-order terms, as exemplified by (8.118), cannot be determined. Finally, there are discontinuities in the field or its derivatives at points of free space which lie on $\phi = -\pi$ or $\phi = \pi$. The formula (8.117) is therefore non-uniform in that its domain of validity excludes certain regions. An expansion which removes some of these objections by being continuous across the shadow and reflection boundaries while displaying the correct edge behaviour is called uniformly valid.

The analysis for the uniformly valid expansion will be restricted to the case when the edge is plane. To be precise we consider a finite aperture, such as a circle, in an infinite perfectly conducting plane screen. Points will be identified by the ray coordinates s, $\sigma_0$, and $\phi$ where, as before, $\sigma_0$ is the arc length along the edge to the point originating the edge ray. Assume that the incident field has an asymptotic expansion

$$\mathbf{E}^i = \exp(-ik_0L^i)\sum_{m=0}^{\infty}\frac{\mathbf{E}^i_m}{k_0^m},\qquad \mathbf{H}^i = \exp(-ik_0L^i)\sum_{m=0}^{\infty}\frac{\mathbf{H}^i_m}{k_0^m} \tag{8.119}$$


where $\mathbf{E}^i_m$ and $\mathbf{H}^i_m$ satisfy the equations of §8.1. Let $L^e = L^i(\sigma_0) + s$ so that $L^e$ is the eikonal of the edge ray. Introduce the quantity $\psi$ defined by

$$\psi^2 = L^e - L^i.$$

Since the distance along an incident ray to a point is never more than that in going to the point via an edge ray by Fermat's principle, $L^e - L^i \ge 0$ and so $\psi$ is real. To fix its sign note firstly that it vanishes on the shadow boundary where $L^e = L^i$. Further, $\sin\tfrac{1}{2}(\phi - \phi_0)$ is zero on the shadow boundary; indeed, this is true whenever $\phi - \phi_0 = 2n\pi$ for integer n. Accordingly, choose $\psi$ to have the same sign as $\sin\tfrac{1}{2}(\phi - \phi_0)$. Then, as $\phi$ varies, $\psi$ will repeat itself at intervals of $4\pi$ but not of $2\pi$, so $\psi$ is double valued in the physical space. We now seek an electromagnetic field $\mathbf{e}$, $\mathbf{h}$ which complies with the Ansatz (Ahluwalia et al. 1968)

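$$\mathbf{e} = \pi^{-1/2}\exp(-ik_0L^e + \tfrac{1}{4}\pi i)\left\{F(k_0^{1/2}\psi)\sum_{m=0}^{\infty}\frac{\mathbf{E}^i_m}{k_0^m} + k_0^{-1/2}\sum_{m=0}^{\infty}\frac{\mathbf{F}_m}{k_0^m}\right\}, \tag{8.120}$$

$$\mathbf{h} = \pi^{-1/2}\exp(-ik_0L^e + \tfrac{1}{4}\pi i)\left\{F(k_0^{1/2}\psi)\sum_{m=0}^{\infty}\frac{\mathbf{H}^i_m}{k_0^m} + k_0^{-1/2}\sum_{m=0}^{\infty}\frac{\mathbf{G}_m}{k_0^m}\right\}. \tag{8.121}$$

Since $dF(w)/dw = 2iwF(w) - 1$, (8.120) and (8.121) will satisfy Maxwell's equations if

$$i\,\mathrm{grad}\,L^e\wedge\mathbf{F}_0 + \mathrm{grad}\,\psi\wedge\mathbf{E}^i_0 = i\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\mathbf{G}_0, \tag{8.122}$$

$$i\,\mathrm{grad}\,L^e\wedge\mathbf{G}_0 + \mathrm{grad}\,\psi\wedge\mathbf{H}^i_0 = -i\left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2}\mathbf{F}_0, \tag{8.123}$$

$$i\,\mathrm{grad}\,L^e\wedge\mathbf{F}_m + \mathrm{grad}\,\psi\wedge\mathbf{E}^i_m = i\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\mathbf{G}_m + \mathrm{curl}\,\mathbf{F}_{m-1}, \tag{8.124}$$

$$i\,\mathrm{grad}\,L^e\wedge\mathbf{G}_m + \mathrm{grad}\,\psi\wedge\mathbf{H}^i_m = -i\left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2}\mathbf{F}_m + \mathrm{curl}\,\mathbf{G}_{m-1} \tag{8.125}$$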

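for $m \ge 1$. By the same method as was adopted for deriving (8.12) and (8.13), and bearing in mind that $\mathbf{E}^i_m$, $\mathbf{H}^i_m$ satisfy (8.5)-(8.7), we obtain

$$2(\mathrm{grad}\,L^e\cdot\mathrm{grad})\mathbf{F}_m + \mathbf{F}_m\nabla^2L^e = -i\nabla^2\mathbf{F}_{m-1} + 2i(\mathrm{grad}\,\psi\cdot\mathrm{grad})\mathbf{E}^i_m + i\mathbf{E}^i_m\nabla^2\psi, \tag{8.126}$$

$$2(\mathrm{grad}\,L^e\cdot\mathrm{grad})\mathbf{G}_m + \mathbf{G}_m\nabla^2L^e = -i\nabla^2\mathbf{G}_{m-1} + 2i(\mathrm{grad}\,\psi\cdot\mathrm{grad})\mathbf{H}^i_m + i\mathbf{H}^i_m\nabla^2\psi. \tag{8.127}$$

Write the right-hand sides of (8.126) and (8.127) as $\mathbf{f}_m$ and $\mathbf{g}_m$ respectively. Then,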

from (8.42),

whence

$$J^{1/2}(s)\mathbf{F}_m(s) = \tfrac{1}{2}\int_0^s J^{1/2}(\sigma)\mathbf{f}_m(\sigma)\,d\sigma + [J^{1/2}\mathbf{F}_m]_0 \tag{8.128}$$

where $[\ ]_0$ means the value as $s \to 0$. Since the edge conditions force $\mathbf{F}_m$ to be no more singular than $s^{-1/2}$ as $s \to 0$ and J disappears like s, the last term in (8.128) is a finite quantity depending on $\sigma_0$ and $\phi$. Similarly

$$J^{1/2}(s)\mathbf{G}_m(s) = \tfrac{1}{2}\int_0^s J^{1/2}(\sigma)\mathbf{g}_m(\sigma)\,d\sigma + [J^{1/2}\mathbf{G}_m]_0. \tag{8.129}$$
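The function F that carries the transition behaviour in (8.120) and (8.121) can be evaluated directly. The sketch below is illustrative and assumes the definition $F(w) = \exp(iw^2)\int_w^\infty \exp(-it^2)\,dt$, which is not restated in this section but is consistent with the relation $dF/dw = 2iwF - 1$ quoted above. The function names, the rotated integration path and the Simpson quadrature are choices made for this example only.

```python
import cmath
import math


def fresnel_F(w, n=4000, tau_max=8.0):
    """F(w) = exp(i w^2) * integral from w to infinity of exp(-i t^2) dt (assumed form)."""
    if w < 0.0:
        # F(w) + F(-w) = sqrt(pi) exp(i w^2 - i pi/4) for real w.
        head = math.sqrt(math.pi) * cmath.exp(1j * (w * w - math.pi / 4))
        return head - fresnel_F(-w, n, tau_max)
    # Path t = w + tau * exp(-i pi/4) turns the oscillatory integrand into
    # exp(-tau^2 - 2 w exp(i pi/4) tau), which decays like a Gaussian.
    rot = cmath.exp(1j * math.pi / 4)
    h = tau_max / n  # n must be even for Simpson's rule

    def integrand(tau):
        return cmath.exp(-tau * tau - 2.0 * w * rot * tau)

    total = integrand(0.0) + integrand(tau_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return cmath.exp(-1j * math.pi / 4) * total * h / 3.0


def two_term_asymptotic(w):
    """Leading terms for |w| large: Heaviside part, -i/(2w) and 1/(4w^3)."""
    head = 0.0
    if w < 0.0:
        head = math.sqrt(math.pi) * cmath.exp(1j * (w * w - math.pi / 4))
    return head - 1j / (2.0 * w) + 1.0 / (4.0 * w ** 3)


if __name__ == "__main__":
    for w in (-6.0, -1.0, 0.0, 1.0, 6.0):
        print(w, fresnel_F(w))
```

Away from w = 0 the two-term formula tracks the quadrature closely, while near w = 0 (the shadow boundary, where $\psi$ vanishes) only the full function stays finite; that is the reason the Ansatz keeps F rather than its expansion.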

The integrands in both (8.128) and (8.129) are finite so that the only singularity of $\mathbf{F}_m$ and $\mathbf{G}_m$ predicted by these equations as $s \to 0$ is of order $s^{-1/2}$ and stems from the last terms. If the last terms are absent $\mathbf{F}_m$ and $\mathbf{G}_m$ vanish at the edge. Both $\mathbf{f}_m$ and $\mathbf{g}_m$ can be regarded as known, so both $\mathbf{F}_m$ and $\mathbf{G}_m$ are determined as soon as the last terms are specified. This has to be done by the edge conditions, as will be seen shortly.

Let $\mathbf{i}_2$ be a unit vector in the direction of increasing $\phi$ and let $e_\phi$ be the component of $\mathbf{e}$ along $\mathbf{i}_2$. Then, if the dependence on $\phi$ is explicitly indicated,

$$\mathbf{e} = e_\phi(\phi)\mathbf{i}_2 + \tilde{\mathbf{e}}(\phi),\qquad \mathbf{h} = h_\phi(\phi)\mathbf{i}_2 + \tilde{\mathbf{h}}(\phi). \tag{8.130}$$

Now if $\phi$ is replaced by $2\pi - \phi$ a solution of Maxwell's equations is still available provided that the signs of $e_\phi$ and $\tilde{\mathbf{h}}$ are reversed. Therefore consider the electromagnetic field

$$\{e_\phi(\phi) + e_\phi(2\pi - \phi)\}\mathbf{i}_2 + \tilde{\mathbf{e}}(\phi) - \tilde{\mathbf{e}}(2\pi - \phi),\qquad \{h_\phi(\phi) - h_\phi(2\pi - \phi)\}\mathbf{i}_2 + \tilde{\mathbf{h}}(\phi) + \tilde{\mathbf{h}}(2\pi - \phi) \tag{8.131}$$

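as a candidate for solving the diffraction problem in $-\pi \le \phi \le \pi$.

$$F(k_0^{1/2}\psi) \sim \pi^{1/2}\exp(ik_0\psi^2 - \tfrac{1}{4}\pi i)H(-\psi) - \frac{i}{2k_0^{1/2}\psi} + \frac{1}{4k_0^{3/2}\psi^3}$$

when $\psi$ is not near zero, H being the usual Heaviside unit function. Thus the incident wave is properly reproduced in (8.120) where $\psi < 0$, while the remaining terms of (8.120) become

$$\pi^{-1/2}\exp(-ik_0L^e + \tfrac{1}{4}\pi i)\left[k_0^{-1/2}\left(\mathbf{F}_0 - \frac{i\mathbf{E}^i_0}{2\psi}\right) + k_0^{-3/2}\left\{\mathbf{F}_1 + \frac{\mathbf{E}^i_0}{4\psi^3} - \frac{i\mathbf{E}^i_1}{2\psi}\right\}\right] \tag{8.132}$$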

Fig. 8.18. The local Cartesian axes at a point of the edge.

and more terms could be calculated by carrying the expansion of F further.

To complete the non-uniform expansion needs the determination of the last terms in (8.128) and (8.129). Select local Cartesian axes $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{t}_3$ at a point of the edge so that $\mathbf{t}_1$ and $\mathbf{t}_3$ coincide with $-\boldsymbol{\nu}$ and the tangent to the edge in previous notation (Fig. 8.18). The components of the electric and magnetic intensities parallel to $\mathbf{t}_3$ must be bounded as $s \to 0$ in order to comply with the edge conditions. This can be arranged by requiring $\mathbf{t}_3\cdot[\mathbf{F}_m]_0$ and $\mathbf{t}_3\cdot[\mathbf{G}_m]_0$ to be finite. It follows that

$$\mathbf{t}_3\cdot[J^{1/2}\mathbf{F}_m]_0 = 0,\qquad \mathbf{t}_3\cdot[J^{1/2}\mathbf{G}_m]_0 = 0. \tag{8.133}$$

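These edge conditions carry certain implications about the other components of $\mathbf{F}_m$ and $\mathbf{G}_m$ at s = 0 as consequences of (8.122)-(8.125). A scalar product of (8.122) with $\mathbf{t}_3$ gives

$$\sin\theta_0(\cos\phi\,\mathbf{t}_2 - \sin\phi\,\mathbf{t}_1)\cdot\left(i\mathbf{F}_0 + \frac{\mathbf{E}^i_0}{2\psi}\right) = \sin\theta_0(\cos\phi_0\,\mathbf{t}_2 - \sin\phi_0\,\mathbf{t}_1)\cdot\frac{\mathbf{E}^i_0}{2\psi} + i\left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2}\mathbf{t}_3\cdot\mathbf{G}_0$$

since

$$\mathrm{grad}\,L^e = \sin\theta_0\cos\phi\,\mathbf{t}_1 + \sin\theta_0\sin\phi\,\mathbf{t}_2 + \cos\theta_0\,\mathbf{t}_3.$$

Multiplying by $J^{1/2}$ and applying (8.133) leads to

$$(\cos\phi\,\mathbf{t}_2 - \sin\phi\,\mathbf{t}_1)\cdot\left[\left(\mathbf{F}_0 - \frac{i\mathbf{E}^i_0}{2\psi}\right)J^{1/2}\right]_0 = -i(\cos\phi_0\,\mathbf{t}_2 - \sin\phi_0\,\mathbf{t}_1)\cdot\left[\frac{J^{1/2}\mathbf{E}^i_0}{2\psi}\right]_0 \tag{8.134}$$

which supplies another component of $[J^{1/2}\mathbf{F}_0]_0$. On the other hand, a scalar product of (8.123) with $\mathrm{grad}\,L^e$ provides

$$\mathrm{grad}\,L^e\cdot\left(\mathbf{F}_0 - \frac{i\mathbf{E}^i_0}{2\psi}\right) = 0$$

when it is recalled that $\mathrm{grad}\,L^i\wedge\mathbf{H}^i_0 = -(\varepsilon_0/\mu_0)^{1/2}\mathbf{E}^i_0$. Hence

$$(\cos\phi\,\mathbf{t}_1 + \sin\phi\,\mathbf{t}_2)\cdot\left[\left(\mathbf{F}_0 - \frac{i\mathbf{E}^i_0}{2\psi}\right)J^{1/2}\right]_0 = i\cot\theta_0\,\mathbf{t}_3\cdot\left[\frac{\mathbf{E}^i_0J^{1/2}}{2\psi}\right]_0 \tag{8.135}$$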


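which furnishes the final part of $[J^{1/2}\mathbf{F}_0]_0$. For $\mathbf{G}_0$ it is only necessary to replace $\mathbf{F}_0$ and $\mathbf{E}^i_0$ in (8.134) and (8.135) by $\mathbf{G}_0$ and $\mathbf{H}^i_0$ respectively. Since $\mathbf{F}_0 - i\mathbf{E}^i_0/2\psi = [J^{1/2}(\mathbf{F}_0 - i\mathbf{E}^i_0/2\psi)]_0/J^{1/2}$ we deduce that the first contribution of the non-uniform expansion (8.132) to $e_\xi$ is, from (8.133) and (8.135),

$$\frac{\pi^{-1/2}\exp(-ik_0L^e - \tfrac{1}{4}\pi i)\,E^i_{0\xi}}{2^{3/2}k_0^{1/2}\sin\theta_0\,s^{1/2}(1 + s/\rho_1^e)^{1/2}\sin\tfrac{1}{2}(\phi - \phi_0)}.$$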

When the corresponding term with $2\pi - \phi$ for $\phi$ is subtracted we obtain complete agreement with the appropriate part of (8.117). Similarly (8.134) and (8.131) ensure compliance with the remainder of (8.117).

The rest of (8.132) gives the next higher-order term which defeated the non-uniform theory of the preceding section because of the lack of starting values. The uniform theory does not have this deficiency. Take the scalar product of (8.124) with $\mathbf{t}_3$. Then

$$(\cos\phi\,\mathbf{t}_2 - \sin\phi\,\mathbf{t}_1)\cdot\left[\left(\mathbf{F}_1 - \frac{i\mathbf{E}^i_1}{2\psi}\right)J^{1/2}\right]_0 = -i(\cos\phi_0\,\mathbf{t}_2 - \sin\phi_0\,\mathbf{t}_1)\cdot\left[\frac{J^{1/2}\mathbf{E}^i_1}{2\psi}\right]_0 - i\,\mathbf{t}_3\cdot[J^{1/2}\,\mathrm{curl}\,\mathbf{F}_0]_0\,\mathrm{cosec}\,\theta_0. \tag{8.136}$$

Also, from the scalar product of (8.125) with grad L", e { ' } gradL. F1 - -iE~ + - 1 curI H i o + l curI G o =0

2'" 2'"

whence

(cos tPtl + sin tPt 2) . [ { F I - iE

i

1

+ _1 curl H~ + i curl G O}JI/2J

2'" 2'"

= cot Oot 3 • [{-iE \

-

0 -1

2'" 2'"

curl "0i- ' 1 curl Go } J

1/ 2

J.

(8.137)

0

From (8.133), (8.136) and (8.137) the finite part of F 1 + E~/4",3 - iEl/2'" as s --+ 0 can be deduced and thereby the starting values for second-order terms on account of (8.132). These beginning values will, in general, involve derivatives of lower-order terms. Suppose, in fact, that l/Jo = n so that the incident ray travels towards the screen and grazes it. Then DB of (8.113) vanishes and the evaluation of the leading term in E" necessitates proceeding to a higher order. For simplicity in deriving the leading term it will be assumed that the incident field is horizontally polarized so that the only non-vanishing component is E~". Then, from (8.133)-(8.135), o

iE~

ii2E~,,(O)

F - 2t/J = 2 J 3/ 2

1/ 2

sin t< 4> - 4>0)

(8.138)


with E~,,(O) signifying the value at s = O. It follows that the next order term in the non-uniform expansion is given by i

+ E~

J1/2(F _ iE 1

2t/1

1

4t/13

)ei

2

= fin J1/2(F _ 1

iE~ + E~ )ei 2t/1

4t/13

2

.

From (8.136), the right-hand side is 3/2

2

i 1·E 1" (0) • 1

SIn

2(cP - cPo)

. It 3

-

1/2

e[J

curl Fo]o cosec

eo + fin (J

i 1 2e / 12 e E 0 ) 3

4t/1

Now

iE~) + -ecurl it 3 i it 3 . Eo - -2egrad t/1 /\ Eo

t 3ecurl Fo = t 3ecurl ( Fo - -

2t/1

2t/1

2tfJ

and so an alternative expression is

{iE i1,,(O)

+ [t 3 ecurl

23/ 2 sin t(cP

E~]o cosec

- cPo)

+ fin J 1/2

eo}

t3cosec 9 {4~3 grad L E~ i

0,

/\

i

curl (Fo _

i~;)}.

But,

where

Xl = sin 80 cos

cP,

X2

= sin eo sin cP, X3

= cos

80

and the x j are the Cartesian coordinates corresponding to the base vectors t j . Hence, for small s,

t/1 = (2S)I/2 sin eo

sin

t
{I - .

2

8 SIn 00

.b~ t
SIn

cPo)

}.

Therefore

J1/2f

fin - -

t/13

=

sin

eo

.- - - - - - {2 1/2 sin 00 sin t
[{-21PI + 8·sm e

2

.3~

II 1(,1" ,I" )}f(O) Uo sin 2 'P - 'Po

+ aOS f(O)J.

(8.139)


Putting lPo fin J 1/2

(F

= n and inserting (8.138) in the iE~

Eh). _

2~

4~3

1 - - + - .1 2 - -

0 +-1 1/2 sin 8 4 {2 sinOocost

-

x

2

{iE:,,(O) + [curl EhJo·t3 cosec Oo} 2 cos !l/J

-1+ 2p~

3b} Eo. (O)+X.grad Eo. (0)]

8sin 20ocos2 tl/J

"

"

(dOo

cos 00

3/2

[{

l/J }3

curl we obtain

• 2

sm 00

0 2 1 ) . cos 80 + sIn 80 - + "0 cos 00 cos Il/J duo oUo

-

Eb,,(O) . sin 80 cos tl/J

(8.140)

For the total field (see (8.131» we must add on the same quantity with l/J replaced by 21t - l/J. This removes the first and last terms on the right-hand side of (8.140). Also p~ is unaffected by the change, as can be seen from (8.115), so its contribution drops out. Moreover, because the incident rays are tangent to the screen, oL i/oX2 == 0 and 02 L i(0)/oXj OX2 == O. Hence b is unaltered by the change from l/J to 21t - l/J. Accordingly, only the term involving X is left and A. . A. · = 2 sm . 80 sm . 'PA. 0E h" X('P).grad Eo" - X(2n - 'P). gra dEo" -. oX 2

Hence

/

p~ }1 2 exp( -ikos) DR !- E~ (0) (p~ + s)s ko ox 2 "

E~ = { "

where

(8.141)

(8.142) Note that

[OD

H] DH = - 1. i sin 00 OlPo 4Jo = 1r I

(8.143)

The occurrence of the derivative in (8.141) explains why it is difficult to find starting values for the higher order terms in the non-uniform expansion when the uniformly valid approximation is unknown. In the case of grazing incidence from the screen
(8.145)


It must be remembered in both (8.141) and (8.144) that $\xi$, $\eta$ refer on the left-hand side to coordinates attached to the edge ray and on the right-hand side to those on the incident ray. Formulae with the same character as (8.141) and (8.144) will arise when the incidence is not grazing but $\mathbf{E}^i_0$ vanishes at the edge.

8.16 Double edge diffraction

The scattering of an edge ray by an edge can be handled by the preceding theory provided that the second edge lies in a region where an expansion of the form (8.119) is valid. This will not be true if the second edge lies on the shadow or reflection boundary of the first edge. When this occurs another canonical problem must be solved. To simplify the analysis only the case of two straight edges will be examined. This problem has been discussed rigorously (Jones 1973b) but here we shall be content with an approximate treatment (Boersma 1975) which is a natural extension of the method of the preceding section and whose results are in agreement with the rigorous theory. It will be assumed that the waves are electrically polarized parallel to the edges so that only the electric vector need be considered. It will be denoted by the letter u generically.

To set the scene a single half-plane will first be treated by a uniform approach. Let the incident wave be $u^i(x, y)$, with the ray through the edge making an angle $\phi_0$ with the positive x axis (Fig. 8.19). Then if u(x, y) is the total field (which is zero on the conductor) we obtain from (8.131) and (8.120)

$$u(x, y) = U(r, \phi) - U(r, 2\pi - \phi)$$

where

$$U(r, \phi) = \pi^{-1/2}\exp(-ik_0\psi^2 + \tfrac{1}{4}\pi i)\,u^i(x, y)F(k_0^{1/2}\psi) + \pi^{-1/2}\exp\{-ik_0(L^i_0 + r) + \tfrac{1}{4}\pi i\}\,k_0^{-1/2}\sum_{m=0}^{\infty}\frac{F_m}{k_0^m}.$$

$L^i_0$ is the value of $L^i$ at the edge, $L^e = L^i_0 + r$, and $\psi^2 = L^i_0 + r - L^i$. The edge conditions require $F_m$ to be finite at the edge and so we deduce immediately from (8.128) (with r in place of s) that $r^{1/2}F_m \to 0$ as $r \to 0$.

Fig. 8.19. Scattering of a wave by a perfectly conducting half-plane.

Now, as is easily verified,

1 d

-1/2 -

da

J

J 1/2 E~. ) ( --

t/JP

1

= -t/Jp-1

.

2

2

i

{E~V

i

"

2

t/J + 2 grad e . grad E~ + (1 - p)E;" grad t/J} (8.146)

--"\I Em-I.

2t/JP

Applying this result to the indefinite integral form of (8.128) we have

J 2{F. - iE~} = ! f[J 2'" 2 1/

m

=~

f

J

1

I /2{

~ V2E i _ I } da 2'"

-iV2F _ 1

m

12 {( -i)V

m

2(F

iE;; 1)

m- 1 -

i 1 im - 1V2} (t/J grad Em-I·grad t/J-Emi - 1 grad 2t/J+2t/JE t/J) d

~3

U.

Apply (8.146) with p = 3 and

J1/2{F. _iE~+ E~-1}=~fJ1/2{(-i)V2(F. _ _iE~-1)+_i_V2Ei _ }da. 2'" 4",3 2 2t/J 4",3 2 m

m

I

m

Further reduction may be carried out in this way. Let

(8.147)

where $(\tfrac{1}{2})_0 = 1$, $(\tfrac{1}{2})_n = \tfrac{1}{2}(\tfrac{1}{2} + 1)\cdots(\tfrac{1}{2} + n - 1)$ $(n = 1, 2, \ldots)$.
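The symbol $(\tfrac{1}{2})_n$ is a rising factorial. The sketch below computes it and uses it to build the large-argument series of F under an assumption: the series is taken in the form $-(i/2w)\sum_n(\tfrac{1}{2})_n(i/w^2)^n$, which is what term-by-term substitution into $dF/dw = 2iwF - 1$ from §8.15 gives. The function names are hypothetical. The check is that a truncated sum satisfies the differential relation up to exactly one leftover term.

```python
def pochhammer(a, n):
    """Rising factorial (a)_n = a (a + 1) ... (a + n - 1), with (a)_0 = 1."""
    result = 1.0
    for j in range(n):
        result *= a + j
    return result


def coeff(n):
    """Coefficient a_n of w^{-2n-1} in the assumed series -(i/2w) sum (1/2)_n (i/w^2)^n."""
    return -0.5j * (1j ** n) * pochhammer(0.5, n)


def partial_sum(w, terms):
    """Sum of a_n w^{-2n-1} for n = 0 .. terms."""
    return sum(coeff(n) * w ** (-2 * n - 1) for n in range(terms + 1))


def relation_residual(w, terms):
    """S'(w) - 2 i w S(w) + 1 for the truncated series S, differentiated term by term."""
    derivative = sum(-(2 * n + 1) * coeff(n) * w ** (-2 * n - 2)
                     for n in range(terms + 1))
    return derivative - 2j * w * partial_sum(w, terms) + 1.0


if __name__ == "__main__":
    print([pochhammer(0.5, n) for n in range(4)])
    print(relation_residual(3.0, 4))
```

The residual is the single term $-(2N + 1)a_Nw^{-2N-2}$ left by truncating at n = N, which shows the series is formal (asymptotic) rather than convergent, since $(\tfrac{1}{2})_n$ grows factorially.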

Then, for the definite integral, we have

J l l 2 t; = fin J 112 t;

- ti

the analogue of (8.118). From (8.115), 1/p~ finally

J:

J112 V 2 Pm - l do ,

= 0 and so J = r in this case. Hence, (8.148)

Note that

fin r 112 po = fin (

-T = l /2

ir E

i

)

i 2312 E~(O, 0) cosec 1<4> - 4>0)' (8.149)


There are similar results for Fm , involving a linear combination of the first m derivatives of the incident field at the edge. For example, fin rl / 2F1

=-

i2-3/2E~(0, 0) cosec

!(cP - cPo)

+ ~~ 2- 3/2 E ~(O, 0) cosec" i<4> - 4>0) + 2 - 7/2 oE~(O, 0) cosec 3 21(,k 0/ -

or

,k)

(8.150)

0/0

where

b=

a2Li

-2

ox

o2Li a2Li cos? cP + 2 - - cos cP sin cP + - 2 sin? cP Ox oy

oy

evaluated at the edge. From (8.148) and (8.149)

Fo = -

and, since

v 2 Fo = -

i2 - 3/2 r- 1/2 E h(o, 0) cosec

!( cP - cPo)

i2 - 5/2 r- 5/2 E~(O, 0) cosec'

(8.151)

!
(8.148) and (8.150) give

F1 = 2 - 3/2 r-l/2 cosec !(cP - cPo) {- E~ (0, 0) +

3b

32

E~(O, 0) cosec" i<4> - 4>0)

+ -1 oE~(O, 0) cosec 2 21(,k 0/ -

or

4

+ 2In terms of

7 2 3 2Eh(0, / r- /

0) cosec

3

!
,k)}

0/0

(8.152)

i;

U(r,

{F(k A/

2

"')

1/2n~o (t)1I(:Jn",-2n-l}

+ tiko

•

+ n- 1 / 2 exp{ -iko(L~ +

r)

+

ini}

L 00

m=O

F.

~.

k'O

(8.153)

This form is convenient for asymptotics because

(8.154)

H being the Heaviside unit function, as $w \to \pm\infty$. Thus when F can be replaced by its asymptotic expansion the series in the first brace of (8.153) cancels that in (8.154). For $0 \le \phi \le \pi$, this property may be used for $U(r, 2\pi - \phi)$ since then the argument of F never vanishes (it is assumed that 0 <

is not small

u(x, y) '" n- 1/2 exp( - ikot/J 2 + ini)ui(x, y)

x {F(kA/2t/1)

1

2 "to (t)"CO~2)"}

+ tik o / t/1 - 1

+ n- 1/2 exp{ -iko(L~ + r) + ini}ko1/2 00

x

L

m=O

{Fm(r,l/J)-Fm(r,2n-l/J)}/k~.

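(8.155)

The leading terms are, from (8.148) and (8.149),

$$u(x, y) \sim \pi^{-1/2}\exp(-ik_0\psi^2 + \tfrac{1}{4}\pi i)\,u^i(x, y)\{F(k_0^{1/2}\psi) + \tfrac{1}{2}ik_0^{-1/2}\psi^{-1}\} - \frac{\exp(-ik_0r - \tfrac{1}{4}\pi i)}{2(2\pi k_0r)^{1/2}}\,u^i(0, 0)\,g(\phi_0, \phi) \tag{8.156}$$

where

$$g(\phi_1, \phi_2) = \mathrm{cosec}\,\tfrac{1}{2}(\phi_1 - \phi_2) + \mathrm{cosec}\,\tfrac{1}{2}(\phi_1 + \phi_2). \tag{8.157}$$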
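A short numerical check helps to see how g combines with the $\tfrac{1}{2}ik_0^{-1/2}\psi^{-1}$ term of (8.156). The sketch below is illustrative and assumes plane-wave incidence with the edge at the origin, so that $L^i = r\cos(\phi - \phi_0)$ and $L^e = r$; the function names are invented for the example. Under that assumption $\psi^2 = 2r\sin^2\tfrac{1}{2}(\phi - \phi_0)$ and the $\mathrm{cosec}\,\tfrac{1}{2}(\phi - \phi_0)$ contributions cancel, leaving a single cosecant.

```python
import math


def cosec(x):
    return 1.0 / math.sin(x)


def g(phi1, phi2):
    """g of (8.157)."""
    return cosec(0.5 * (phi1 - phi2)) + cosec(0.5 * (phi1 + phi2))


def psi_plane_wave(r, phi, phi0):
    """psi for a plane wave (assumed L^i = r cos(phi - phi0), L^e = r),
    signed like sin((phi - phi0) / 2)."""
    return math.sqrt(2.0 * r) * math.sin(0.5 * (phi - phi0))


if __name__ == "__main__":
    r, phi, phi0 = 3.0, 2.1, 0.7
    psi = psi_plane_wave(r, phi, phi0)
    print(psi ** 2, r - r * math.cos(phi - phi0))
    # Net angular factor once the 1/psi term is merged with g(phi0, phi):
    print(g(phi0, phi) + cosec(0.5 * (phi - phi0)), cosec(0.5 * (phi + phi0)))
```

Each printed pair should agree to rounding error, which is the algebra behind the single-cosecant diffracted term for plane-wave incidence.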

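If the incident wave is the plane wave

$$u_0 = \exp\{-ik_0(x\cos\phi_0 + y\sin\phi_0)\}$$

then $\psi = (2r)^{1/2}\sin\tfrac{1}{2}(\phi - \phi_0)$ and the total field given by (8.156) is

$$u_1(x, y) \sim \pi^{-1/2}\exp(-ik_0r + \tfrac{1}{4}\pi i)F\{(2k_0r)^{1/2}\sin\tfrac{1}{2}(\phi - \phi_0)\} - \frac{\exp(-ik_0r - \tfrac{1}{4}\pi i)}{2(2\pi k_0r)^{1/2}}\,\mathrm{cosec}\,\tfrac{1}{2}(\phi + \phi_0)$$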

consistent with earlier results (§8.13). The result analogous to (8.155) when 0 ~ 4> u(x, y) '" u'(x, y) - n- 1/2 exp( -ik X

{F(kA/2tfr) + tik o / 1

(8.158)

-n is

~

otii

~)

+ 0/0

2

+ !ni)ui(x, -

y)

2 "to (t)"CO~2)"} tfr- l

+ n- 1 / 2 exp{ -iko(L~ + r) + !ni}ko1/ 2

L {Fm(r, cP) m=O 00

x

Fm(r, 27t - cP)}/k~

(8.159)

where ifJ is t/J with 2n - 4> for l/J. Since tii = f//(2n - to -
o

~)

0/

(8.160)

Fig. 8.20. Configuration of two half-planes.

which reduces to

U1(X, y)

"W

uo(x, y) -

-

1[-1/2

exp( -ikor

+ ini)F{(2k or)1/2 sin!
exp( - ikor - i-ni) 1( c/J c/J) 2(21tk r)1/2 cosec I 0 o

cPo)}

(8.161)

for an incident plane wave.

Suppose now that the plane wave falls on two half-planes as shown in Fig. 8.20. The edge of the lower half-plane is chosen as origin and the edge of the upper half-plane is at x = h, y = d. Polar coordinates $(r_1, \phi_1)$ and $(r_2, \phi_2)$ with respect to the two edges are used, the upper edge being at $(r^{(1)}, \phi^{(1)})$ with respect to the lower edge. It is assumed that $\phi_0$ and $\phi^{(1)}$ are not both near zero nor both near $\pi$ so that grazing incidence and the second half-plane almost absent or almost the whole plane are excluded cases. The total field produced by the lower half-plane in the plane wave is $u_1$ as set out in (8.158) with $r_1$ and $\phi_1$ in place of r and $\phi$ respectively. Write $u_1 = u_0 + v_1$ and let the total field of the upper half-plane when $v_1$ is incident be $v_1 + v_2$, i.e. $v_2$ is the field scattered by the upper half-plane under the illumination $v_1$.

Suppose, firstly, that $\phi_0 < \phi^{(1)} - \delta$, $\delta > 0$ so that the upper diffracting edge lies in the shadow region $\phi_1 > \phi_0$. According to (8.158) the incident field $v_1 = u_1 - u_0$ can be approximated by substituting

(8.162)


l/Jo·

when

Diffraction of -

Uo

produces the total field

-uo(x, y) - exp{ -ik or (1 ) cos(
n- 1/2 exp( - ik ot/1 2 + *ni)u 1(x, y){ F(k~/2t/1)

+ !ik- 1 / 2t/1 -I}

_ exp( -ik or2

- lni) u (h d) (~(l) ~ ) 2(2nko'2)1/2 l' g 'P ,'P2

where now t/1 = (r(1) + r2 - '1)1/2 with t/1 positive (negative) for ( <)

VI (x,

y)

~,

+ V2(X, y) '"

exp{ - iko(r(l) + r 2 ) }

2n:(2k r ) 1/2 g(4>0' 4>1) o1 x [F{ - k~/2(r(1) + r2 - r1)1/2} - !iko1/2(r(1) + r2 - r1 ) -1/2]

_ i exp{ - iko(r(1) + '2)} (~ ~(1» (~(1) ~ ) (1 ) )1/2 g 'Po, 'P g 'P ''P2 8nor k « r2

+ V1(X, y) -

exp{ -ik or(1) cos(
The formula (8.163) is valid uniformly for 0 ~ cP2 ~ n so long as r 2 is not small. If r2 is small, (8.156) fails because terms in U(r, 21t - l/J(l) + ~, the upper edge is illuminated by the plane wave uo and

u 1(x, y) '" uo(x, y) -

'l -

*ni) 2(2n:k r )1/2 g(4)o, 4>1)' o1

exp( - iko

By the same process as has just been adopted v1(x, y) +

V2(X,

exp{ - iko(r (1 ) + r2)}

2n:(2k r ) 1/2 g(4)o, 4>1) o1 x [F{k~/2(r(1) + r2 - r1)1/2} + !ik 1/2(r(1)

y) '" -

o

+ r2 -

_ i exp{ -ik o(r(1) + r 2 ) } (~ ~(1») (~(1) ~ ) (1 ) )1/2 g 'Po, 'P g 'P ,'P2 8nor k « r2 Again this formula is uniform for 0 ~
r1) '-: 1/2] (8.164)

'2 is not small.


Exercise 27. Show that, when r2 is small, the term exp{ - iko(r( 1) + r2)} (,I,. ,1,.(1) 21t(2k r+)1/2 g 0/0, 0/ o x

_

)

}' +

[F{ -kA/ 2(r(1 ) +

r2 -

o

r+)1/2} - !ik 1/2(r (l) + r2 - r+)-1/2]

must be added to (8.163) and (8.164). Here r +,}' + are polar coordinates with (0,2d) as centre and the initial line is the line joining (0, 2d) to (h,d).

The above demonstrates that, when l ~t the uniform theory may be applied in a standard fashion. When
y) '"

-1t -

1/2 exp( - ik oTI

+ !1li)F( -

where t/J 1 = (2rl ) I /2 sin t(
-

kA12t/J 1)

exp( - ik orl

-.

2(21lk or1 )

!ni)

1/2

l1 ~ cosec 2\'1'1

At. )

+ '1'0

exp(ikorfiD[tn 1/2 exp( -!ni) + q~O e;~;;:q~: k1l+ 1/2rfi~q+ I}

This yields 1

vt(X, Y) '" - 2UO(X, Y) -

exp( - ikor l - i1li) 1 1/2 cosec n
- 1l- 1/2 UO( X , y) exp(ini)

exp( -lq1li)

L ex>

q=O

q!(2q

2

+

1)

+

k~+ 1/2t/J~q+ 1.

Diffraction of the plane wave - tu o furnishes the total field

-tuo(x, y) - texp( -ik or ( l)

v 1

(x - h, y - d).

The contribution of the cylindrical wave is

_ exp{ - iko(r(1 ) + r2)} cosec 1(~ + ~ ){F(k1/2a/l) + lik - 1/2 2n(2k rl ) I /2 o

2 '1'1

'1'0

0."

2

0

a/1

-

I}

."

i exp{ - iko(r(1)

+ r2)} g(
8nk o(r(1)r2 )1/2 sin
Each term in the infinite series in

VI

is now treated separately as an incident


field. Let ~

= exp{-iko'l COS(cPl - cPo)}t/Jf q + 1

be a typical incident field with L i = '1 COS(cPl - cPo). In view of the factor k~+ 1/2 in VI the full expansion (8.155) must be used rather than the approximation (8.156). However, the final field will be required only to O(ko1) and so it will q be sufficient to stop at 1. Now since t/Jf + 1 and its first 2q derivatives vanish at the diffracting edge it follows from (8.148) that is zero for m = 0, ... , q + 1 when q ~ 1. Hence only for q = can t; playa role. But, when q = 0, (8.149) makes evident that Fo = 0, and then since at/J 1/0'2 = (2,(1) -1/2 sin(cP2 - cPo) at the edge we deduce from (8.152) that

t.;

P= t

t:

°

1 cos !-(4)2 - 4>0) . 8(,(1)r2)1/2 sin 2 !
Thus the total field Uq corresponding to Uq ~

~

is

n- 1/ 2 exp{-iko(r(l) + r2 ) + ini}t/Jf q + 1

+

exp( - I·k0( '(1) +'2 ) + 1·) 4 1tl b h(;I" ;1,,) 8( k 3 (1) )1 /2 Oq 0/2' 0/0 n 0' '2

where boq is the usual Kronecker symbol and

t<4>2 - 4>0) cos t<4>2 + 4>0) . • 2 11;1" ;I" ) + . 2 11;1" ;I" ) sm 2\0/2 - w sm 2\ 0/2 + 0/0

h(;I" ;1,,) = cos 0/2' %

(8.165)

The substitution F(w) =

pl/2

exp(-!nil

L {(-)'" exp(imni)w"'}/(!m)! 00

",=0

leads to Uq ~ ti exp{ - iko(,(l)

X

+ r2) + (1 -

tq)ni}koq-l/2,,2q+ 1

f (_)m exp(!mni) kgmt/J'; _ exp{- iko(,(l) + '2) + !nil

",=0

(1m - q - !)!

2n

(8.166)


where fI = t/J I/t/J2' Thus the total field caused by the infinite series in -n

- 1/2

co

(1

')kq + 1/2

q!(2q

+

1 ' ) " exp - 2qnl exp(4n1 ~ q=O

0

U

q

1)

co (

(1)

~

n provided that

t/Ji

=

,(1)

()m

)q 2q + 1 eo

( + '2)}" = 21n - 1/2 exp{'k -I 0 r c: - " q=O q!(2q +

in 0 ~ lP2 that t/J~ -

VI is

1) m=O

(1

i)

11m exp 4 m1t1 k-l0 m •.,,2 (!m - q - i)!

,~, -

'2 is not small. To cope with the series we note

+ '2 - '1 > 0 which implies 0 < fI < 1. Consequently

Writing

L 00

F(x, y) =

q=O

(_

)qx 2q + 1

q!(2q

L + 1) m=O 00

(_)m exp(!mni)ym 1

1

(2m - q - 2)!

(8.167)

it may be shown (see Exercise 28) that, for real x and y with Ixl < 1,

-2exp{iy 2(1 - x 2 )

+ ini}F(-xy)H(-y)

(8.168)

where G(x, y)

= j exptix")

G(x, y)

= -G( -x.

f

00

x

exp(- it 2 ) t

2

+Y

2

dt

y)

Therefore the total field due to the infinite series is

o),}

(x

>

(x

< 0).

(8.169)


Collecting together all the terms we find that V 1(x,

y)

+ vix, y) '" ~ exp{ - iko(r(1) + r2)} 2n x [ G{kA/ 2(r(1)

+

r2

+ -

r1)1/2, (2kor1)1/2 sin !
-

i(2'1)1 /2 sin 1<
exp{ - iko(r( 1) + r2)} 11 ~ 1/2 cosec 2\'PI 2n(2kor1 )

+ r2 -

x [F{kA I2(r(l)

r1)1 /2}

_ i exp{ - iko(r(1) + r2)} 87tko(r(1 )r2)1/2

tik

+

~ )

+ 'Po

oI /2(, (l) +'2 -

{9(cPo, cP2)

sin cPo +

'1)-1 /2]

h(cP cP)} 2, 0

t

- !uo(x, y)H(
lP t)}H(lP 0 -

-

(8.170)

The formula (8.170) is uniformly valid in 0 ~ lP2 ~ n when lPo = lP(1) so long as r, is not small. Ifr2 is small extra terms must be added as in previous formulae. On the shadow boundary
f

G(x, y) = exp(ix 2)

00

xlY

exp( -iy 2t 2) t

2

+1

dt

-+

y

tan- 1 -

x

Hence, on account of (8.98), the field on the shadow boundary •

(1)

v1 + V2 '" exp{ -lko(r

{1

+ r2)} 27t tan

-1

.

lP2 = lPo is

(r(1) )1/2 - 4I} · r2

(8.171)

Actually, $v_1 + v_2$ is the total field due to the diffraction by the upper half-plane of the incident field $v_1$. Furthermore, $v_1$ is the scattered field emanating from the lower half-plane under the excitation of $u_0$. The diffraction of $u_0$ by the upper half-plane provides a total field of

$$u_0(x, y) + \exp(-ik_0r^{(1)})\,v_1(x - h, y - d)$$

of which the leading term on the shadow boundary is $\tfrac{1}{2}u_0$. We infer that the total field on the shadow boundary due to the diffraction of $u_0$ by the two half-planes when their interaction is ignored is

$$\left\{\frac{1}{4} + \frac{1}{2\pi}\tan^{-1}\left(\frac{r_2}{r^{(1)}}\right)^{1/2}\right\}u_0$$

to a first approximation, provided that $r_2$ is not small. It will be observed that this approaches the value of $\tfrac{1}{2}u_0$ for a single half-plane as $r_2 \to \infty$. In other words, when $\phi_0 = \phi^{(1)}$, the field on the shadow boundary a large distance from the two half-planes is the same as that for one half-plane. It must, however, not be forgotten that this result is based on the neglect of the interaction between the half-planes.
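The behaviour of this shadow-boundary factor is quickly tabulated. The sketch below is illustrative only: it evaluates the multiplier of $u_0$ in the form $\tfrac{1}{4} + (1/2\pi)\tan^{-1}(r_2/r^{(1)})^{1/2}$, with `r_sep` standing for $r^{(1)}$; the function name is hypothetical, and the formula inherits the text's restrictions (interaction neglected, $r_2$ not small).

```python
import math


def shadow_boundary_factor(r2, r_sep):
    """Multiplier of u0 on the common shadow boundary of two half-planes.

    r2    : distance from the upper (second) edge to the field point
    r_sep : distance r^(1) between the two edges
    """
    return 0.25 + math.atan(math.sqrt(r2 / r_sep)) / (2.0 * math.pi)


if __name__ == "__main__":
    for ratio in (0.01, 0.1, 1.0, 10.0, 1000.0):
        print(ratio, shadow_boundary_factor(ratio, 1.0))
```

The values rise monotonically from about 1/4 towards 1/2 as $r_2/r^{(1)}$ increases, matching the single half-plane limit quoted in the text.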

Exercises

28. By taking a partial derivative of (8.167) with respect to x prove that

for |x| < 1. Substitute from (8.95) in the integrand and interchange the order of integration if y > 0. If y < 0 use (8.96) first. Hence show the legitimacy of (8.168).

29. Repeat the analysis of this section for a magnetically polarized plane wave.

30. Examine the possibility of finding the diffraction by three parallel half-planes.

8.17 Emission from a waveguide

A remark at the end of the preceding section pointed out that the interaction between the two half-planes was neglected. In this section a problem is considered in which the interaction must be taken into account if accuracy is to be achieved. Consider a parallel-plate waveguide (Fig. 8.21) which is carrying a TE mode; it is desired to determine the radiation from the open end and the reflection coefficients within the guide. The exciting mode is taken as

$$u_0(x, y) = \sin\left(\frac{N\pi y}{a}\right)\exp(-i\kappa_Nx)$$

where N is a positive integer and the propagation constant $\kappa_n$ is defined by

$$\kappa_n = (k_0^2 - n^2\pi^2/a^2)^{1/2},$$

being either positive real or negative imaginary. Put

$$\sin\theta_n = \frac{n\pi}{k_0a},\qquad \cos\theta_n = \frac{\kappa_n}{k_0}.$$
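The mode and the angles $\theta_n$ can be checked numerically. The sketch below is a hedged illustration with invented function names: it takes $\kappa_n$ as the square root of $k_0^2 - n^2\pi^2/a^2$ on the branch that is positive real or negative imaginary, and verifies that the mode equals a difference of two plane waves inclined at $\pm\theta_N$ to the guide axis, divided by 2i.

```python
import cmath
import math


def propagation_constant(n, k0, a):
    """kappa_n, chosen positive real or negative imaginary."""
    kappa = cmath.sqrt(complex(k0 * k0 - (n * math.pi / a) ** 2, 0.0))
    if kappa.imag > 0.0:
        kappa = -kappa  # cut-off mode: select the negative imaginary root
    return kappa


def mode(x, y, n, k0, a):
    """u0(x, y) = sin(n pi y / a) exp(-i kappa_n x)."""
    return math.sin(n * math.pi * y / a) * cmath.exp(-1j * propagation_constant(n, k0, a) * x)


def plane_wave_pair(x, y, n, k0, a):
    """Same field written as two plane waves at angles +theta_n and -theta_n."""
    sin_t = n * math.pi / (k0 * a)
    cos_t = propagation_constant(n, k0, a) / k0  # complex when the mode is cut off
    first = cmath.exp(-1j * k0 * (x * cos_t - y * sin_t))
    second = cmath.exp(-1j * k0 * (x * cos_t + y * sin_t))
    return (first - second) / 2j


if __name__ == "__main__":
    k0, a = 10.0, 1.0
    for n in (1, 2, 4):  # n = 4 is cut off for these values
        print(n, mode(-0.3, 0.4, n, k0, a), plane_wave_pair(-0.3, 0.4, n, k0, a))
```

For a cut-off mode $\sin\theta_n$ exceeds unity and $\cos\theta_n$ is imaginary, so the "plane waves" are evanescent, but the identity still holds.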

Fig. 8.21. Radiation from a parallel-plate waveguide.

The incident mode may be decomposed as

$$u_0(x, y) = \frac{\exp\{-ik_0(x\cos\theta_N - y\sin\theta_N)\} - \exp\{-ik_0(x\cos\theta_N + y\sin\theta_N)\}}{2i}$$

which expresses it as a linear combination of two plane waves, one travelling in the direction of $\theta_N$ and the other along $-\theta_N$. The plane wave in the direction of $\theta_N$ has amplitude $-1/2i$ and, when it strikes the upper edge, generates a diffracted wave $v_1$ according to (8.161) given by

$$v_1(x, y) = (-)^N\frac{\exp(-ik_0r_1 - \tfrac{1}{4}\pi i)}{4i(2\pi k_0r_1)^{1/2}}\,g(\theta_N, \phi_1) \tag{8.172}$$

in $0 \le \phi_1 \le \pi$. The field $v_1$ hits the lower half-plane and causes a scattered field $v_2$ which then scatters $v_3$ from the upper half-plane. The process continues and there is a similar one due to the other plane wave in $u_0$. This iterative procedure was first introduced by Chester (1950, 1950a) in the time domain and subsequently discussed in the frequency domain in the context of rays by a number of authors (Boersma 1974, 1975a; Lee and Boersma 1975; Batorsky and Felsen 1973; Lee 1972; Bowman 1970; Yee and Felsen 1969; Yee et al. 1969; Felsen and Yee 1968).

The interaction field $v_2$ can be evaluated by noting that it is due to a source at (0, a) and so, by symmetry, the scattered field is the same as that at the image point when the excitation is due to the image source. Therefore, take the incident wave as

$$(-)^N\frac{\exp(-ik_0r_{-1} - \tfrac{1}{4}\pi i)}{4i(2\pi k_0r_{-1})^{1/2}}\,g(\theta_N, \phi_{-1}).$$

Then, from (8.160), we deduce that for 0 vz(x, y)

=

(- )N+l exp{ -iko(r

41ti(2k )1/Z o

~

4>

~

1t,

+ a)}

X [g«(JN' 4J -1) {F(_ k I / 2 ,11) 0 'I' r_1/21

-

lik -1 / 2 ,112

0

'I'

1} -

i

4(k) oar 1/2

g(£J

.\-n)g(l1t ,I,.)J

N' 2

2 ' 0/

where $\psi = (r + a - r_{-1})^{1/2}$ with the same convention on sign as has been employed hitherto. The wave $v_2$ strikes the upper half-plane and initiates $v_3$. On account of the rapid variation of F near to $\phi = \tfrac{1}{2}\pi$ a procedure similar to that of the preceding section must be followed. Since this has to be done several times it pays to adopt a general approach. Indeed, it is convenient to start from the representation

where $\psi_1 = (r_1 + a - r)^{1/2}$ with the same convention on sign as has been employed to date. While more complicated than (8.172), because

and all other $a_{1q}$ and $b_{1q}$ vanish, it has the advantage of being in a form which is suitable for generalization. By replacing $r_1$ and $\phi_1$ by $r_{-1}$ and $\phi_{-1}$ respectively we deduce that, to find $v_2$, we need the field scattered by the lower half-plane from the incident wave $\exp(-ik_0r_{-1})\psi_{-1}^q$ where $\psi_{-1} = (r_{-1} + a - r_{-2})^{1/2}$. According to (8.155), the scattered wave in $0 \le \phi \le \pi$ is

1/2

,11 exp (Ok 1 0'1'

2

1t1 - 1 or- 1 ),IIQ + 4look 'I' - 1

_ exp{ -iko(a + r) - !1ti}(h ,I,.)~ 2(21tk r)1/Z g Z ,'I' Oq o with an error which is O(k0 (1 / 2 )Q-

l ).

Here [tq] is the largest integer which does


not exceed !q. Consequently, in 0 ~ 4> ~ n, V2(X,

y)

= - (nk o)- 1/2 exp{ - iko(a +

{F( - k~/2",) -

+ ialO(a, in)

2(2kor)1/2 9

+

ini}

+ kC; 1/2blir-1' cP -1)}

x [qto {ali r-1' cP -1)

x

r)

X

(k~/2", -l)q

tikC; 1/2", - 1 (~~:q (t)nCo~2)

n}

~)J

(In

2 ,0/

o

correct to O(k 1). Now, on account of (8.166),

1 i (l/2)q 1 (i)n F(z)+-2h)n 2"

L

Zn=O

Z

= 1

+ - -n

!n 1/ 2 i exp( -iqni) 1.)' q+l

(1.

- 2 q - 2 .Z

1 12

2 zq

00

{(3 1).}" exp "4q - 4 nl i..J

m=O

()m (1 i) m exp 4mnl Z 1 1 . ("4m - "4q)!

(8.174)

Hence V 2(X,

y)

= ko1/2 exp{ - iko(a + r)} x [-

x

f

q=O

q Ha 1k-l' cP-l) + kC;1/2blq(r_l, cP-l)}( - )ql1 exp(iqni)

~ exp(imni) (k 1/2 ,J,)m

i..J.lm

m=O(2

1), - Iq ·

0."

~ ii1q(r- h cP-l) ()q

x q~O (-tq _

t)!

+ exp(hri)ii1O(a, in) 2(2nkor)1/2

where

-

9

~ exp(ini)

+ 2 k1/2,J, 0 v

(1 i) q exp -a:qm '1

(tn

A..)J

2 ,0/

(8.175)

n = t/J -1/t/J. If (8.175) is written as

V2(X,

y)

= ko1/2 exp{ -iko(a + r)}

L {a2q(r,4» 00

q=O

+ ko1/2b2q(r, 4»}(k~/2t/J)q (8.176)

it follows that

(8.177)


(8.178) By commencing with (8.176) we can deduce V3(X, y)

= ko1/2 exp{-ik o(2a + r1)}

L {ii 3q(r1, 4>1) + ko1/2b3q(r1, 4>1)}(k5/ 2t/J )q 00

q=O

where Q3q(r1 , 4>1) is related to a 2 m (r, 4» by a formula of the same type as i8.I77) except that '1 is exchanged for '11 = t/!It/! 1. A similar remark is true for b3q. Sufficient quantities have now been determined for a calculation of the field. Let v 2 n(x, y) = exp(-ik or)w2n(r, 4». Then, according to (8.155), (8.151), and (8.152), a non-uniform expansion of V2 n + 1 for n > 1 is exp{- iko(r + a) - !ni} or1 2 1 1 3i 1 {cos !(!n - 4> 1) X [ -W2n(a, 2n)g(21t, 4>1) - - - w 2n(a, In) -.-3 - 1 - 1 - - 8koa sin 2(21t - 4> 1)

_

v2n+ 1 - - 2(2nk - -1 -)1/2 ---

cos! !(!n + 4>1)} __i_ 3 1 1. sin 2(2 n + 4>1) 4k o

+ . X

{w

2n

(a, in) r1

+

[~

ar

1 1

W

(,I,,)J

2n r, 'P

} rt

=0

{cosec"! 1)+ cosec! t(tn + 4> I)}

It is non-uniform because it fails in the vicinity of $\phi_1 = \pm\tfrac12\pi$. Now

o

w2n(a, in) = k 1/2 exp{-ik o(2n - 1)a}{a2n,0(a, in)

aw~(r, 4»J [ urI 2n

rt

=0

=

1

-2: a

-

+ ko1/2b 2n,0(a, in)},

1/2.1. . a2n,l(a'2n)cos4>lexP{-lko(2n-1)a}.

Hence

(8.179)


By means of some rather elaborate analysis it may be shown that

(8.180) The same expression. holds for a

(a

2n.l

,

1 )

21t

=(-

Q2 n + 1

) N + 1 + 2n

.

8Ina

with n +

(8

t for n. Also,

1 ) 2n - 1

~ -3/2(2 _ )-3/2 L. m n m

g N,!n

1/2

(8.181 )

m= 1

and

where $h(\theta) = 2^{1/2}(2 + \cos\theta)\sin\tfrac12\theta/\cos^2\theta$. The total diffracted field from the upper edge due to the plane wave striking it can now be calculated from (8.172), (8.179), (8.180), (8.181), and (8.182). It is

$\bar v_1$

+

~ _ L. V 2n + 1

n=1

(- )N exp( - iko r 1 - !ni)

= ---2(-2-k-)-1/-2-n

o~

With regard to the plane wave striking the lower edge we have

The fields v2 , V4' ... successively scattered from the upper edge can be calculated by the same method as has just been described. Thus the field diffracted by the upper edge due to the incident mode is _

vd(r1' 4J1) =

LV= 00

n=1

_

n

( _

)N exp( - ikor1 - !ni) 2(2 k )1/2 !(4J1) n

o~

(8.183)


where f(
=-

I'

2 1g(ON'

+

exp(-ini)S 4i(2nk

oa)I/2

-~ [g(ON' 1n)h(n 16rckoa

10 1-) g(2n,

+

g(tn,
_ ~ exp( -2mkai) S - m~1 (2m)3/2

+(

_)N ~ exp{ -(2m - l)kai) m~1 (2m-l)3/2 '

Similarly the field diffracted by the lower edge is given by exp( - ikor - !rci) 2(2nk r)1/2 f(
vd(r,
ur(x, y)

= ~ c: n= 1

. (nrc R; SIn -

y)

a

exp (.lKnX ) ·

Obviously

. R" exptix,»)

y) . (nrc = ~2 fa0 Ur(X, Y) sm ----;; d y,

Therefore, it is necessary to consider integrals of the form

fo exp a

{.

2

2 1/2 inrcy} + y) + ---

F(y)

2 1/4 dy y ) and it is sufficient for x to be large and negative. The method of stationary phase is therefore applicable. The point of stationary phase occurs at y = - x tan On and the integral has asymptotic value (2rck )1/2 __ 0_ exp( -irci + iKnx)F(-x tan On).

-Iko(x

a

(x

2

+

Kn

Use this result with (8.183) and (8.185), remembering that

R" = {1

+ (-

)N+,,} f(n - 0,,) .

2aKn

(8.186)

The derivation of (8.186) is strictly valid only when $\kappa_n$ is real. In fact, (8.186) holds even when the mode is not propagating, but an additional argument is needed because the point of stationary phase is then no longer in the interval of integration. The formula (8.186) brings out the point that the reflection coefficient vanishes when $N + n$ is odd, consistent with there being no coupling between modes of different symmetries.

It will be noted that (8.186) breaks down at $\kappa_n = 0$ and, on account of (8.184), at $\kappa_N = 0$. In other words, it fails at the cut-off frequencies of the $n$th and $N$th modes. At other cut-off frequencies the reflection coefficient is continuous but has discontinuities in slope; as the frequency increases the modulus comes down at an angle to the vertical until cut-off is reached, when it immediately starts climbing vertically, while the phase descends vertically and then rises at an angle to the vertical. The predictions of ray theory described here are in excellent agreement with calculations based on the exact solution which can be obtained for this problem (Weinstein 1948, 1969; Heins 1948). The theory has been extended to staggered waveguides (Driessen and Jull 1989) and found to be in agreement with experiment.

Exercises

31. The TM mode $u_0 = \cos(N\pi y/a)\exp(-i\kappa_N x)$ is incident on the end of a parallel-plate waveguide. Show that the field diffracted by the upper plate is

vd(rlt
= (-)

where F 1(tP l )

=

N

exp( - ikor1

-

irei)

2(21tk rd l / 2 o

F1(
1 exp( -!rei) 1 1 -IG1«(JN, tPl) + - - 1 / i SG 1«(JN' Ire)G 1(Ire, 4>1) 2(2rek oa)

2 1 1 1 + -is- [G1(ON' Ire)H 1(re - 4>1) - G1(Ire, 4>1){G 1(ON' Ire) - H1(ON)}],

16rek oa

G1 «(J, 4»

= cosec t«(J - 4» -

H1 (4) ) =

21/ 2 cos !4>(2 -

cosec t(O + 4», cos 4» sec2

4>.

Deduce that the reflection coefficient for the nth mode is -i{1 + (- )N+II}f(re - ( 11) / = 0 when the coefficient is halved. 32. Check the statements made about (8.186) at cut-off frequencies. 33. The plane wave exp(ikoY) is incident on the parallel-plate waveguide of Fig. 8.21. Show that the distant total field produced in the direction 4J = - ire is 2aK n unless n

(a) 1/2{ 1 - .~1

1 1 exp(-ikor) [ 2- 21t;

. } x exp( - 21nk oa)

00

=+=

(

1

(2n)I/2 - (2n

1)

+

1)1/2

exp( -!rei)] 1/2 2(2rek or)

the upper and lower sign being taken according as the incident wave is electrically or magnetically polarized.

34. Try the same type of problem as Exercise 33 but with the plates staggered.

35. Repeat Exercise 33 but with a line source on $x = 0$, $y > a$.

36. Examine the possibility of extending the theory of this section to a cylindrical waveguide.


37. A slit in an infinite plane occupies Ixl ~ a, y = 0 and is irradiated from y < 0 by the plane wave exp(- ikoY). Find a uniform asymptotic expansion for the field, using polar coordinates (rl, 4>1) at the left-hand edge and (r2, n - 4>2) at the right-hand edge (so that the slit is in 4>1 = 0 and 4J2 = 0). Show that, in the electricallypolarized case, the diffracted field is exp(- iko'l - ini) (I 2(2nk

-

' d 1/2 o

g

2

n

exp(- iko'2- !ni) (I n

A" ) _

,'1-'1

2(2nk o'2)1/2

g

At. )

2 , '1-'2

iexp(-2ika) {eXP(-ikor1) 1At. exp(-ik o' 2) lA,.} O(k-3/2) 4 k 1/2 1/2 sec 2'1-'1 + 1/2 sec 2'1-'2 + 0 • n Oa'l

'2

For magnetic polarization replace g(1n, 4» by G1 (! n, 4J) (defined in Exercise 31) and drop the third term. Compare your formulae with the analytical results of Luneburg and Westpfahl (1975) and Sologub (1972). 38. Repeat Exercise 37 with the aperture and screen interchanged. Confirm that your results are in accord in Babinet's principle. 39. In Exercise 37 the right-hand screen is such that the field vanishes whereas the left-hand screen makes the normal derivative zero. Show that the transmission coefficient is 1 cos(2koa - in)

1- -2n1/2

(k oa)3/2

-

9

32n 1/2(k oa)S/2

. 2k 1 sinf a - -n) 0

4

(for non-perfectly conducting screens see Senior 1975).

40. A splash plate for the waveguide of Fig. 8.21 is constructed by making $y = d$ ($d > 0$) perfectly conducting. If $k_0 d \gg 1$, calculate its effect by imaging the field found without a splash plate. If the splash plate is of finite length determine the additional field due to the edges to a first approximation.

8.18 The wedge

A canonical problem for what happens when the edge is on a surface which is not infinitesimally thin is provided by the wedge. The angle of the wedge will be taken as $\delta$, with free space occupying $-\pi < \phi < \pi - \delta$ (Fig. 8.22). When $\delta = 0$ the wedge degenerates into the semi-infinite plane already considered. The illumination will be assumed to be due to a line source at $L$ producing an incident field of $-\tfrac14 i H_0^{(2)}(k_0|\mathbf{x} - \mathbf{x}_1|)$ where $\mathbf{x}_1$ is the position of $L$. It will be supposed that the wedge is perfectly conducting, though solutions are available for impedance boundary conditions with different impedances on the two faces (Van Dantzig 1958; Lauwerier 1959, 1960, 1961; Williams 1959, 1961; Malyuzinec 1958). For electric polarization

Ez

= w(
-

(8.187)

while for magnetic polarization (8.188)

Fig. 8.22. Geometry for scattering by a wedge.

the $z$ axis being along the edge. Here

$$w(\phi) = -\frac{1}{8\pi}\int_{\infty-\pi i\mu}^{\infty+\pi i\mu}\frac{H_0^{(2)}[k_0\{r^2 + r_1^2 - 2rr_1\cosh(u/\mu)\}^{1/2}]}{\cosh u - \cos\mu\phi}\,\sinh u\,du \qquad (8.189)$$

where $1/\mu = 2 - \delta/\pi$ and the contour of integration passes to the left of the point where $\cosh(u/\mu) = (r^2 + r_1^2)/2rr_1$ and to the right of the imaginary axis (Fig. 8.23). Alternative ways of writing (8.189) are, when $r < r_1$,

$$= -\tfrac12 i\mu\sum_{n=1}^{\infty}J_{n\mu}(k_0r)H_{n\mu}^{(2)}(k_0r_1)\cos n\mu\phi - \tfrac14 i\mu J_0(k_0r)H_0^{(2)}(k_0r_1)$$

where the principal value of the integral is taken. For $r > r_1$, interchange $r$ and $r_1$ in these formulae. When $k_0r \gg 1$, the integral in (8.189) can be evaluated asymptotically by the method of stationary phase, the stationary points being at $u = \pm\pi i\mu$. In deforming the contour to the points of stationary phase, poles may be passed over. The contributions of the poles represent the field of geometrical optics, each corresponding to some image of the source in the faces of the wedge. Remark that it is possible for there to be no shadow and for more than one reflected wave to exist, e.g. when $\phi_1 = -\tfrac12\delta$. Stationary phase provides the diffracted wave

±

(8.190) so long as $\phi$ is not near a shadow boundary or the boundary of a reflected wave. Consequently

$$E_z^d \sim -\frac{\mu i\exp\{-ik_0(r + r_1)\}}{2\pi k_0(rr_1)^{1/2}I}\,\sin\mu\pi\,\sin\mu(\phi + \pi)\,\sin\mu(\phi_1 + \pi), \qquad (8.191)$$

$$H_z^d \sim -\frac{\mu i\exp\{-ik_0(r + r_1)\}}{2\pi k_0(rr_1)^{1/2}I}\,\sin\mu\pi\,\{\cos\mu\pi - \cos\mu(\phi + \pi)\cos\mu(\phi_1 + \pi)\} \qquad (8.192)$$

where

$$I = \cos^2\mu\pi - 2\cos\mu\pi\cos\mu(\phi + \pi)\cos\mu(\phi_1 + \pi) + \tfrac12\cos 2\mu(\phi + \pi) + \tfrac12\cos 2\mu(\phi_1 + \pi).$$

Fig. 8.23. The contour of integration for the wedge.

to that when $\delta = 0$, the field on an edge ray in the notation of §8.13 is

-ikor

(E~E ~) = (DE0

0 )(E~o) exp( cosec 00) DB E~o (r cosec 00)1/2

(8.193)

where DE

=(-

2 ) DB = ( rck o

2

)1/2

exp( - lrci)

I

rck o 1/2

exp( - ini)

I

4

u sin ut: sin J..l(cP + re) sin J..lcPo cosec 00' (8.194)

J1{cos J1n - cos J1(4)

+ n) cos J14>0} cosec 00 (8.195)

and now

$$I = \cos^2\mu\pi - 2\cos\mu\pi\cos\mu(\phi + \pi)\cos\mu\phi_0 + \tfrac12\cos 2\mu(\phi + \pi) + \tfrac12\cos 2\mu\phi_0.$$

The relation (8.193) reduces to (8.111) when $\delta = 0$ and $\mu = \tfrac12$. By means of (8.193)-(8.195) general edges of wedge type can be handled as in §8.14. The formulae so derived will be non-uniform because (8.193) is not uniformly valid. If a uniform expression is desired then (8.190) must be supplanted by one which copes with the passage of a point of stationary phase through a pole. In view of the local approximation in GTD by a plane wave, it will be sufficient to consider an incident plane wave. The appropriate form for $w$ can be ascertained by multiplying (8.189) by $(8\pi k_0r_1)^{1/2}\exp(ik_0r_1 + \tfrac14\pi i)$ and letting $r_1 \to \infty$. This leads to

$$w_p(\phi) = \frac{1}{2\pi i}\int_{\infty-\pi i\mu}^{\infty+\pi i\mu}\frac{\exp\{ik_0r\cosh(u/\mu)\}}{\cosh u - \cos\mu\phi}\,\sinh u\,du \qquad (8.196)$$

and the incident wave makes an angle $\phi_0$ with the positive axis where $\phi_0 = \pi + \phi_1$. The integrand has simple poles which are on the imaginary axis at $u = i(2n\pi \pm \mu\phi)$, where $n$ is any integer. Deform the contour of integration into the straight lines joining $\infty - \pi i\mu$, $-\pi i\mu$, $\pi i\mu$, $\infty + \pi i\mu$ with indentations around any poles encountered. Because of the integrand being odd, the only contribution from the imaginary axis comes from the poles so that

$$w_p(\phi) = \sum_n \exp\Big\{ik_0r\cos\Big(\phi + \frac{2n\pi}{\mu}\Big)\Big\} + \frac{\mu}{2\pi i}\int_{-\infty}^{\infty}\frac{\exp(-ik_0r\cosh t)\,\sinh\mu(t + \pi i)}{\cosh\mu(t + \pi i) - \cos\mu\phi}\,dt$$

the summation being over those values of $n$ for which $-\pi < \phi + 2n\pi/\mu < \pi$. The residue series is obviously responsible for the incident and reflected waves

as well as the shadow regions. When no pole is near $t = 0$

$$w_p(\phi) \sim \sum_n \exp\Big\{ik_0r\cos\Big(\phi + \frac{2n\pi}{\mu}\Big)\Big\} + \frac{\mu\sin\mu\pi}{\cos\mu\pi - \cos\mu\phi}\,\frac{\exp(-ik_0r - \tfrac14\pi i)}{(2\pi k_0r)^{1/2}} \qquad (8.197)$$

the diffracted wave in which is consistent with (8.190). When a pole is near $t = 0$ let it be at $t = t_N = i(\phi - \pi + 2N\pi/\mu)$. Write

$$f(t) = \frac{\sinh\mu(t + \pi i)}{\cosh\mu(t + \pi i) - \cos\mu\phi}$$

and put $\cosh t = 1 + \tfrac12 v^2$. Then

$$\frac{\mu}{2\pi i}\int_{-\infty}^{\infty}f(t)\exp(-ik_0r\cosh t)\,dt = \frac{\mu}{2\pi i}\exp(-ik_0r)\int_{-\infty}^{\infty}\exp(-\tfrac12 ik_0rv^2)\,f(t)\,\frac{dt}{dv}\,dv.$$

If $v = v_N$ when $t = t_N$,

$$f(t)\frac{dt}{dv} = \frac{A}{v - v_N} + B$$

plus a series which is regular in a neighbourhood of the origin. Clearly $A = 1/\mu$ and

$$B = f(0)\Big(\frac{dt}{dv}\Big)_{v=0} + \frac{A}{v_N} = \frac{i\sin\mu\pi}{\cos\mu\pi - \cos\mu\phi} + \frac{1}{\mu v_N}.$$

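Now, if $\Im v_N < 0$,

$$\int_{-\infty}^{\infty}\frac{\exp(-\tfrac12 ik_0rv^2)}{v - v_N}\,dv = -\frac{i(2\pi)^{1/2}\exp(-\tfrac14\pi i)}{(k_0r)^{1/2}}\int_0^{\infty}\exp\Big(-itv_N + \frac{it^2}{2k_0r}\Big)dt = 2\pi^{1/2}\exp(-\tfrac14\pi i)F\{i(\tfrac12 k_0r)^{1/2}v_N\}.$$

Accordingly

$$\frac{\mu}{2\pi i}\int_{-\infty}^{\infty}f(t)\exp(-ik_0r\cosh t)\,dt \sim \pi^{-1/2}\exp(-ik_0r - \tfrac34\pi i)F\{i(\tfrac12 k_0r)^{1/2}v_N\} + \frac{\mu B}{2\pi i}\Big(\frac{2\pi}{k_0r}\Big)^{1/2}\exp(-ik_0r - \tfrac14\pi i)$$

when $\Im v_N < 0$. If $\Im v_N > 0$ it is only necessary to replace $F$ by $-F\{-i(\tfrac12 k_0r)^{1/2}v_N\}$.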

Since

$$v_N = -2i\cos\tfrac12(\phi + 2N\pi/\mu)$$

it follows that

$$w_p(\phi) \sim {\sum_n}'\exp\Big\{ik_0r\cos\Big(\phi + \frac{2n\pi}{\mu}\Big)\Big\} + \pi^{-1/2}\exp(-ik_0r + \tfrac14\pi i)F\Big\{-(2k_0r)^{1/2}\cos\tfrac12\Big(\phi + \frac{2N\pi}{\mu}\Big)\Big\} + \Big\{\frac{\mu\sin\mu\pi}{\cos\mu\pi - \cos\mu\phi} + \tfrac12\sec\tfrac12\Big(\phi + \frac{2N\pi}{\mu}\Big)\Big\}\frac{\exp(-ik_0r - \tfrac14\pi i)}{(2\pi k_0r)^{1/2}} \qquad (8.198)$$

where the prime on $\sum$ indicates that $n = N$ is to be omitted. The formula (8.198) provides a smooth transition as $\phi + 2N\pi/\mu$ passes through $\pi$ in contrast to the singular behaviour of (8.197). The transition as $\phi + 2N\pi/\mu$ passes through $-\pi$ is also given by (8.198). Oblique incidence may be covered by the device of §8.13. An approximate treatment of the impedance wedge has been given by Syed and Volakis (1992).
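The non-uniform result (8.197) can be explored numerically. The following is a minimal sketch, assuming only the relation $1/\mu = 2 - \delta/\pi$, the residue condition $-\pi < \phi + 2n\pi/\mu < \pi$, and the angular factor $\mu\sin\mu\pi/(\cos\mu\pi - \cos\mu\phi)$ of the diffracted term; the function names are illustrative and are not taken from the text.

```python
import math


def wedge_mu(delta):
    """Return mu from 1/mu = 2 - delta/pi for a wedge angle delta (radians)."""
    return 1.0 / (2.0 - delta / math.pi)


def residue_indices(phi, mu):
    """Integers n with -pi < phi + 2*n*pi/mu < pi (terms kept in the residue sum)."""
    n_max = int(mu * (1.0 + abs(phi) / math.pi) / 2.0) + 1
    return [n for n in range(-n_max, n_max + 1)
            if -math.pi < phi + 2.0 * n * math.pi / mu < math.pi]


def diffraction_factor(phi, mu):
    """Angular factor mu*sin(mu*pi)/(cos(mu*pi) - cos(mu*phi)) of the diffracted term."""
    return mu * math.sin(mu * math.pi) / (math.cos(mu * math.pi) - math.cos(mu * phi))


if __name__ == "__main__":
    mu_half_plane = wedge_mu(0.0)  # delta = 0: semi-infinite plane
    print(mu_half_plane, residue_indices(0.0, mu_half_plane))
    # The factor grows without bound as phi approaches pi, which is the
    # singular behaviour that the uniform formula (8.198) is designed to remove.
    for phi in (0.0, 3.0, 3.1, 3.14):
        print(phi, diffraction_factor(phi, mu_half_plane))
```

For $\delta = 0$ the factor reduces to $-\tfrac12\sec\tfrac12\phi$, and for $\delta = \pi$ (a plane) it vanishes, as one would expect when there is no edge.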

Exercise

41. A wedge of height $h$ is placed symmetrically on the side of the splash plate of Exercise 40 facing the waveguide. Examine how this affects the radiation pattern for various values of $h$ and a selection of wedge angles.

When the interior of the wedge plays a part as, for example, when it is dielectric, little progress has been made with solving the canonical problem except in the special cases where $\delta = \pi$ (plane interface) or where the dielectric constants differ only slightly from the environment (Rawlins 1972).

8.19 The effect of curvature

The next problem to be examined is that of a curved obstacle (Fig. 8.24). The first matter to be disposed of is the rays which can be anticipated on the basis of Fermat's principle. If the medium is homogeneous the absolute minimum of the optical path length is provided by the straight line $P_0P$. It corresponds to the direct ray from $P_0$ to $P$ and carries the primary radiation as if there were no obstacle present. There is also a reflected ray $P_0P$, satisfying Snell's laws as in §8.5. In addition, there is the stationary path $P_0QRSP$ in which $P_0Q$ and $PS$ are tangential to the boundary of the obstacle whereas $QRS$ is on the surface of the target. A ray of this type is called a creeping ray because of the way it sticks to the boundary of the obstacle. Other creeping rays exist. For example, in Fig. 8.25(a) $P_0S'RQ'P$ is similar in type to that of Fig. 8.24 but going around the body in the opposite sense. In Fig. 8.25(b) $P_0QRSOQRSP$ does more than a complete circuit of the obstacle. Clearly, an infinite number of creeping rays can be constructed. The total field at $P$ is the sum of all contributions from the direct ray, reflected rays, and all creeping rays. If all these contributions had to be computed it would be a tremendous task but, fortunately, it turns out that most of the creeping rays can be neglected because they are heavily damped. The above picture has omitted any rays in the interior of the body; should there be interior rays there may be contributions at $P$ from rays which have emerged from the body after refraction and internal reflection.

Fig. 8.24. Rays for scattering by an obstacle.

Fig. 8.25. Examples of creeping rays: (a), (b).

The canonical problem for creeping rays is the circular cylinder of radius $a$ illuminated by a line source at (r1,
=

L 00

m=-oo

exp{im(

for' > '1. Hence the total electric intensity in the field is given by E;

= -iiHb2)(k olx -

XlI)

+ ii

00

L

m=-oo

exp{im(


in order to satisfy the boundary and radiation conditions. The series representation is admirable at low frequencies but as $k_0a$ increases the convergence becomes too slow for computational convenience. Efforts have therefore been made to convert it to a more suitable form at high frequencies (White 1922; Fok 1946, 1965; Franz and Deppermann 1952; Imai 1954; Franz 1954; Friedlander 1954; Wu and Rubinow 1956; Wu 1956; Beckmann and Franz 1957; Goriainov 1958; Logan 1969). Noting that $J_m = \tfrac12(H_m^{(1)} + H_m^{(2)})$ we see that, when $r < r_1$,

$$E_z = -\tfrac18 i\sum_{m=-\infty}^{\infty}\Big\{H_m^{(1)}(k_0r) - \frac{H_m^{(1)}(k_0a)}{H_m^{(2)}(k_0a)}H_m^{(2)}(k_0r)\Big\}H_m^{(2)}(k_0r_1)\exp\{im(\phi - \phi_1)\}.$$

The Poisson summation formula is a device for transforming a series rapidly convergent for low values of a parameter into one which converges well for large values of the parameter. By means of it

$$E_z = -\tfrac18 i\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}\Big\{H_\nu^{(1)}(k_0r) - \frac{H_\nu^{(1)}(k_0a)}{H_\nu^{(2)}(k_0a)}H_\nu^{(2)}(k_0r)\Big\}H_\nu^{(2)}(k_0r_1)\exp\{i\nu(\phi - \phi_1 - 2n\pi)\}\,d\nu. \qquad (8.199)$$

The singularities of the integrand are simple poles at the zeros of the denominator. They are located in the second and fourth quadrants. Denote those in the fourth quadrant by $\nu_s$, i.e.

$$H_{\nu_s}^{(2)}(k_0a) = 0.$$

(8.200) The poles in the second quadrant are then at v = - Vs since H
valid for Ivl »X, [phv] ~ !1t - fJ (fJ > 0) and H~~(z) = exp(v1ti)H~I)(z) it may be verified that deformation in the lower half-plane is permissible when l/J - l/Jl < 2n1t and in the upper half-plane when l/J - l/Jl > 2n1t. Consequently,

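for $0 < \phi - \phi_1 < 2\pi$,

$$E_z = \tfrac14\pi\sum_{s=1}^{\infty}\frac{H_{\nu_s}^{(1)}(k_0a)H_{\nu_s}^{(2)}(k_0r)H_{\nu_s}^{(2)}(k_0r_1)}{\dot H_{\nu_s}^{(2)}(k_0a)\{1 - \exp(-2\pi i\nu_s)\}}\,[\exp\{i\nu_s(\phi - \phi_1 - 2\pi)\} + \exp\{-i\nu_s(\phi - \phi_1)\}]$$

where $\dot H_\nu^{(2)}(z) = \partial H_\nu^{(2)}(z)/\partial\nu$. Another way of writing the series comes from the Wronskian

$$H_\nu^{(1)}(z)H_\nu^{(2)\prime}(z) - H_\nu^{(1)\prime}(z)H_\nu^{(2)}(z) = -\frac{4i}{\pi z}$$

so that

$$H_{\nu_s}^{(1)}(k_0a)H_{\nu_s}^{(2)\prime}(k_0a) = -\frac{4i}{\pi k_0a}.$$

Applying the recurrence relation

$$zH_\nu^{(2)\prime}(z) - \nu H_\nu^{(2)}(z) = -zH_{\nu+1}^{(2)}(z)$$

we derive

$$H_{\nu_s}^{(1)}(k_0a) = \frac{4i}{\pi k_0a\,H_{\nu_s+1}^{(2)}(k_0a)}.$$

Also, if $B_\nu$ and $C_\nu$ are any pair of Bessel functions of order $\nu$,

$$\int^x B_\nu(kr)C_\nu(kr)\frac{dr}{r} = -\frac{kx}{2\nu}\{\dot B_{\nu+1}(kx)C_\nu(kx) - \dot B_\nu(kx)C_{\nu+1}(kx)\} + \frac{B_\nu(kx)C_\nu(kx)}{2\nu}.$$

Hence

$$N_s^2 = \nu_s\int_a^{\infty}\{H_{\nu_s}^{(2)}(k_0t)\}^2\frac{dt}{t} = -\tfrac12 k_0a\,H_{\nu_s+1}^{(2)}(k_0a)\dot H_{\nu_s}^{(2)}(k_0a).$$

Accordingly

$$E_z = -\tfrac12 i\sum_{s=1}^{\infty}\frac{H_{\nu_s}^{(2)}(k_0r)H_{\nu_s}^{(2)}(k_0r_1)}{\{1 - \exp(-2\pi i\nu_s)\}N_s^2}\,[\exp\{i\nu_s(\phi - \phi_1 - 2\pi)\} + \exp\{-i\nu_s(\phi - \phi_1)\}]. \qquad (8.201)$$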

Equation (8.201) was derived for $r < r_1$. When $r > r_1$ the only alteration to (8.199) is to interchange $r$ and $r_1$; therefore, the same applies to (8.201). Hence (8.201) holds without restriction on $r$ so long as $0 < \phi - \phi_1 < 2\pi$. Since $k_0a$ is large the standard asymptotic expansion of a Hankel function of large argument shows that $H_\nu^{(2)}(k_0a) = 0$ cannot have any solutions for small and moderate values of $|\nu|$. Thus $|\nu_s|$ must be at least of the same order of magnitude as $k_0a$. Now, a uniformly valid asymptotic expression for a Hankel function of large argument and order is

$$H_\nu^{(2)}(x) \sim 2^{3/2}\nu^{-1/3}\exp(\tfrac13\pi i)\Big\{\frac{\xi}{(x/\nu)^2 - 1}\Big\}^{1/4}\mathrm{Ai}\{\nu^{2/3}\xi\exp(\tfrac13\pi i)\} \qquad (8.202)$$

where Ai is the Airy function and

If $x/\nu$ is near unity $\xi \approx 2^{1/3}(1 - \nu/x)$ and

$$H_\nu^{(2)}(x) \approx 2^{4/3}\nu^{-1/3}\exp(\tfrac13\pi i)\,\mathrm{Ai}(y) \qquad (8.203)$$

where $y = 2^{1/3}\exp(\tfrac13\pi i)x^{-1/3}(x - \nu)$. Consequently, when $\nu_s$ is near $k_0a$ it is related to a zero of the Airy function. It is well known that $\mathrm{Ai}(\alpha_s e^{\pi i}) = 0$ where $\alpha_s$ is real and positive; the first few are

$$\alpha_1 = 2.338,\quad \alpha_2 = 4.088,\quad \alpha_3 = 5.520,\quad \alpha_4 = 6.787,\quad \alpha_5 = 7.944,\quad \alpha_6 = 9.023. \qquad (8.204)$$

Hence the poles near $k_0a$ are given by

$$\nu_s = k_0a + (k_0a)^{1/3}2^{-1/3}\exp(-\tfrac13\pi i)\alpha_s. \qquad (8.205)$$
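As an illustration of (8.205), the sketch below evaluates the first six poles for a given $k_0a$ from the Airy zeros listed in (8.204). It is only a numerical reading of the formula; the function name and the sample value $k_0a = 50$ are assumptions made for the example.

```python
import cmath

# Values alpha_s quoted in (8.204): Ai(-alpha_s) = 0.
AIRY_ZEROS = (2.338, 4.088, 5.520, 6.787, 7.944, 9.023)


def creeping_wave_poles(k0a):
    """Approximate poles nu_s = k0a + (k0a/2)^(1/3) exp(-i pi/3) alpha_s, as in (8.205)."""
    scale = (k0a / 2.0) ** (1.0 / 3.0)
    rotation = cmath.exp(-1j * cmath.pi / 3.0)
    return [k0a + scale * rotation * alpha for alpha in AIRY_ZEROS]


if __name__ == "__main__":
    for s, nu in enumerate(creeping_wave_poles(50.0), start=1):
        print(s, nu)
```

Every pole produced this way has a real part exceeding $k_0a$ and a negative imaginary part, so it lies in the fourth quadrant, and the imaginary part grows in magnitude with $s$. The approximation is only meant for poles near $k_0a$, so the later entries should be treated with more caution.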

As $s$ increases and $\nu$ deviates appreciably from $k_0a$ this formula fails and (8.202) shows that the poles approach $-i\infty$. It is evident from (8.205) that the first term in the [ ] of (8.201) becomes more exponentially damped as $\phi$ decreases, whereas the second term does so when $\phi$ increases. Thus the first term corresponds to a creeping wave travelling in the clockwise direction while the second term is an anticlockwise creeping wave. In view of the size of the imaginary part of $\nu_s$ the factor $1 - \exp(-2\pi i\nu_s)$ in (8.201) can be replaced by unity without significant loss. When $x - \nu \gg x^{1/3}$, the asymptotic formula

$$\mathrm{Ai}(z) \sim \frac{\exp(-\tfrac23 z^{3/2})}{2\pi^{1/2}z^{1/4}} \qquad (|\mathrm{ph}\,z| < \pi) \qquad (8.206)$$

converts (8.202) to

$$H_\nu^{(2)}(x) \sim \Big(\frac{2}{\pi}\Big)^{1/2}(x^2 - \nu^2)^{-1/4}\exp i\Big\{\tfrac14\pi - (x^2 - \nu^2)^{1/2} + \nu\sec^{-1}\Big(\frac{x}{\nu}\Big)\Big\}.$$

Hence, if $k_0r/\nu_s$ is bounded away from 1,

$$E_z \sim -\tfrac12 i\Big(\frac{2}{\pi}\Big)^{1/2}\exp(\tfrac14\pi i)\sum_{s=1}^{\infty}\frac{H_{\nu_s}^{(2)}(k_0r_1)}{N_s^2(k_0^2r^2 - \nu_s^2)^{1/4}}\exp\Big\{i\nu_s\sec^{-1}\frac{k_0r}{\nu_s} - i(k_0^2r^2 - \nu_s^2)^{1/2}\Big\}\,[\exp\{i\nu_s(\phi - \phi_1 - 2\pi)\} + \exp\{-i\nu_s(\phi - \phi_1)\}]. \qquad (8.207)$$

For a first approximation (8.205) indicates that $\nu_s$ is $k_0a$. If also $r_1$ is not too

near to $a$

$$E_z \sim \frac{1}{\pi}\sum_s\frac{\exp\{-ik_0(r_0 + r')\}}{N_s^2k_0(r_0r')^{1/2}}\Big\{\exp\Big(-\frac{i\nu_sr''}{a}\Big) + \exp\Big(-\frac{i\nu_sr'''}{a}\Big)\Big\} \qquad (8.208)$$

where

$$r_0 = (r_1^2 - a^2)^{1/2},\qquad r' = (r^2 - a^2)^{1/2},$$

$$\frac{r''}{a} = \phi - \phi_1 - \sec^{-1}\Big(\frac{r}{a}\Big) - \sec^{-1}\Big(\frac{r_1}{a}\Big),\qquad \frac{r'''}{a} = 2\pi - \phi + \phi_1 - \sec^{-1}\Big(\frac{r}{a}\Big) - \sec^{-1}\Big(\frac{r_1}{a}\Big).$$

Fig. 8.26. The ray-theory lengths for a circular cylinder.

This result may be interpreted in the following way. The upper incident ray tangent to the circle (Fig. 8.26) hits it at the point of glancing incidence or upper penumbral point and then initiates a creeping ray in the anticlockwise direction. The creeping ray travels along an arc of length $r''$ and then covers a distance $r'$ to the point of observation along the tangent at the other end of the arc. In other words, the incident wave at the penumbral point is multiplied by

$$\frac{4\exp(\tfrac14\pi i - i\nu_sr''/a)}{(2\pi k_0)^{1/2}N_s^2}\,\frac{\exp(-ik_0r')}{r'^{1/2}}$$

in order to produce the cylindrical wave at the point of observation. The exponential decay associated with the arc of boundary traversed is due to the steady shedding of energy in a tangential direction as the creeping ray goes round. Via (8.205) and (8.203) we may derive the approximation

$$N_s^2 = -2^{7/3}\exp(-\tfrac23\pi i)(k_0a)^{-1/3}\{\mathrm{Ai}'(-\alpha_s)\}^2. \qquad (8.209)$$

There is a similar creeping ray from the lower penumbral point and it is launched with a diffraction coefficient calculated by the same rules. If the source is on the boundary put r 1 = a in (8.207) and a picture similar to that of Fig. 8.26 emerges.
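The damping of a creeping ray along its arc can be estimated from the exponential factor $\exp(-i\nu_sr''/a)$ alone. The sketch below is a rough illustration under that assumption: it uses the approximation (8.205) for the first pole, ignores the launching coefficient $N_s^2$ and the tangential spreading, and uses function names that are purely illustrative.

```python
import cmath
import math


def first_pole(k0a, alpha=2.338):
    """First creeping-wave pole from (8.205); alpha is the first Airy zero of (8.204)."""
    return k0a + (k0a / 2.0) ** (1.0 / 3.0) * cmath.exp(-1j * cmath.pi / 3.0) * alpha


def arc_attenuation(nu_s, arc_over_a):
    """Modulus of exp(-i*nu_s*arc/a), which equals exp(Im(nu_s)*arc/a)."""
    return abs(cmath.exp(-1j * nu_s * arc_over_a))


if __name__ == "__main__":
    for k0a in (10.0, 50.0):
        nu = first_pole(k0a)
        quarter = arc_attenuation(nu, 0.5 * math.pi)  # a quarter of the circumference
        circuit = arc_attenuation(nu, 2.0 * math.pi)  # one complete circuit
        print(k0a, quarter, circuit)
```

The value for a complete circuit is the modulus of $\exp(-2\pi i\nu_s)$, which is already extremely small for moderate $k_0a$; this is why rays that have encircled the cylinder can usually be dropped.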

Fig. 8.27. Regions associated with creeping rays.

Creeping waves which have made a complete circuit of the boundary are multiplied by the factor $\exp(-2\pi i\nu_s)$ and so are rarely large enough to be of any significance. Therefore, the first one or two terms of (8.207) or (8.208) should be sufficient for a good approximation provided that the arc traversed on the cylinder is non-zero. This means that it is in the deep shadow of Fig. 8.27 that a term or two will suffice. As the diffracted field is exponentially small, the shadow behind a large obstacle is darker when the boundary is curved than when it has a straight edge. In the penumbra or region of deep illumination the residue series (8.201) converges so slowly that it is not helpful. The dominant terms in the deep illumination will be the direct and reflected rays, though it is quite difficult to prove rigorously (Ursell 1957, 1968; Babich 1962; Buslaev 1964; Grimshaw 1966; Brown 1966; Ludwig 1967; Morawetz and Ludwig 1968; Leppington 1968; Bloom and Matkowsky 1969). The penumbral region needs a formula which makes the transition between deep shadow and deep illumination. The penumbra occupies the space limited by tangents drawn at arc lengths of approximately $a(k_0a)^{-1/3}$ from the penumbral point and is shown shaded in Fig. 8.27. An alternative representation is derived from (8.199). Attention will first be concentrated on the range $\nu \ge 0$. Replacing $H_\nu^{(1)}$ by $J_\nu$, as is legitimate, we see from (8.202), (8.206), and

$$J_\nu(x) \sim 2^{1/2}\nu^{-1/3}\Big\{\frac{\xi}{(x/\nu)^2 - 1}\Big\}^{1/4}\mathrm{Ai}(\nu^{2/3}\xi e^{\pi i}) \qquad (8.210)$$

that the integrand diminishes exponentially for $\nu > k_0r$ (note that (8.199) holds for $r < r_1$). Therefore, the integral from $k_0r$ to infinity is discarded for a first approximation. Also only dominant terms will be retained in the following; a


rigorous treatment (Jones 1963) demonstrates that this is justified. Both $r$ and $r_1$ will be kept bounded away from $a$. Between 0 and $k_0r$, (8.202) and (8.206) may be adopted for $H_\nu^{(2)}(k_0r_1)$. If we do likewise for $H_\nu^{(1)}(k_0r)$ we obtain for the first part of (8.199)

$$-\frac{i}{4\pi}\int_0^{k_0r}\frac{\exp i\{(k_0^2r^2 - \nu^2)^{1/2} - \nu\sec^{-1}(k_0r/\nu) - (k_0^2r_1^2 - \nu^2)^{1/2} + \nu\sec^{-1}(k_0r_1/\nu) + \nu\alpha\}}{(k_0^2r^2 - \nu^2)^{1/4}(k_0^2r_1^2 - \nu^2)^{1/4}}\,d\nu$$

where $\alpha = \phi - \phi_1 - 2n\pi$. The exponent has a point of stationary phase where

$$\cos^{-1}\frac{\nu}{k_0r_1} - \cos^{-1}\frac{\nu}{k_0r} = -\alpha$$

provided that $\alpha$ is negative and greater than $-\tfrac12\pi$; otherwise there is no stationary point. Since the left-hand side is an increasing function of $\nu$ there is no solution unless $r_1\cos\alpha > r$. The contribution of this stationary point is

$$-\frac{i\exp(-ik_0|\mathbf{x} - \mathbf{x}_1| + \tfrac14\pi i)}{(8\pi k_0|\mathbf{x} - \mathbf{x}_1|)^{1/2}}.$$

It therefore reproduces the incident field. Since the stationary value of $\nu/k_0$ may be interpreted as the perpendicular distance from the centre of the circle to the incident ray, the point of observation must lie in the region of deep illumination. In particular, if $\nu < k_0a$, it corresponds to a point on a direct ray which will strike the cylinder but before the point of impact. If we do exactly the same with the second part of (8.199) we obtain

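$$\frac{i}{4\pi}\int_0^{k_0a}\frac{\exp i\{-(k_0^2r^2 - \nu^2)^{1/2} + \nu\sec^{-1}(k_0r/\nu) - (k_0^2r_1^2 - \nu^2)^{1/2} + \nu\sec^{-1}(k_0r_1/\nu) + 2(k_0^2a^2 - \nu^2)^{1/2} - 2\nu\sec^{-1}(k_0a/\nu) + \nu\alpha\}}{(k_0^2r^2 - \nu^2)^{1/4}(k_0^2r_1^2 - \nu^2)^{1/4}}\,d\nu.$$

In this case the stationary point satisfies

$$\cos^{-1}\Big(\frac{\nu}{k_0r_1}\Big) + \cos^{-1}\Big(\frac{\nu}{k_0r}\Big) - 2\cos^{-1}\Big(\frac{\nu}{k_0a}\Big) = -\alpha. \qquad (8.211)$$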
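The geometrical meaning of a stationary point can be checked numerically for the simpler direct-ray condition $\cos^{-1}(\nu/k_0r_1) - \cos^{-1}(\nu/k_0r) = -\alpha$ obtained from the first part of (8.199). The sketch below, with illustrative function names and sample values that are assumptions, solves that condition by bisection and compares $\nu/k_0$ with the perpendicular distance from the centre to the straight line joining source and observer.

```python
import math


def stationary_nu(k0, r, r1, alpha, tol=1e-12):
    """Solve acos(nu/(k0*r1)) - acos(nu/(k0*r)) = -alpha on [0, k0*r] by bisection.

    Returns None when the existence conditions quoted in the text fail
    (r < r1, -pi/2 < alpha < 0 and r1*cos(alpha) > r).
    """
    if not (r < r1 and -0.5 * math.pi < alpha < 0.0 and r1 * math.cos(alpha) > r):
        return None

    def g(nu):
        # Increasing in nu; negative at nu = 0 and positive at nu = k0*r.
        return math.acos(nu / (k0 * r1)) - math.acos(nu / (k0 * r)) + alpha

    lo, hi = 0.0, k0 * r
    while hi - lo > tol * k0 * r:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ray_distance_from_centre(r, r1, alpha):
    """Perpendicular distance from the origin to the line through source and observer."""
    separation = math.sqrt(r * r + r1 * r1 - 2.0 * r * r1 * math.cos(alpha))
    return r * r1 * abs(math.sin(alpha)) / separation


if __name__ == "__main__":
    k0, r, r1, alpha = 10.0, 2.0, 5.0, -0.3
    nu = stationary_nu(k0, r, r1, alpha)
    print(nu / k0, ray_distance_from_centre(r, r1, alpha))
```

The two printed numbers should agree to within the bisection tolerance, which is the statement that the stationary value of $\nu/k_0$ is the distance of the incident ray from the centre of the circle.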

It can exist only if v < koa and C( < O. It is responsible for a reflected ray which is in harmony with Snell's laws. However, the asymptotic behaviour of the Hankel functions of argument koa alters as v passes through koa and for v > koa the appropriate integral is


The equation for the stationary point is now

$$\cos^{-1}\Big(\frac{\nu}{k_0r_1}\Big) + \cos^{-1}\Big(\frac{\nu}{k_0r}\Big) = -\alpha. \qquad (8.212)$$

For its existence it is necessary that -n < a. < 0 and rl cos a. < r. This time the incident wave is provided in that part in n <

x

exp i{ v sec -1(k or/v) - (k5r2- V2)1/2 + v sec -1(k or1/v)- (k5ri - \'2)1 /2 + va.} dv (k5r2 - v2)1/4(k ori - V2)1/4

has to be discussed. Since v is near koa we may expand the exponent as far as quadratic powers and ignore the variation of v in the denominator. After the substitution v = koa + (k oa)I/3/3/2 1 / 3 the result is _ exp(-tni) (tkoa)I/3exp(-.ikoL) fAi(-peXp(-tni» 4n k o(r2 - a2)1/4(ri - a2)1/4 Ai(- Pexp(tni»

x

eX

Z}

_ ti(tkoa)Z/3 L1P dP P{ - i(t koa )1/3 LbP a k

o

where

a a - -i; = cos - l - + cos - l - + a., Q

L1

= (r2 -

r

a2)-1 /2

r1

+ (rr -

a 2) -

1/2.

As with the usual asymptotic approach we wish to make the limits of integration ± 00 and drop the quadratic terms as far as possible. Lack of convergence at 00 prevents the immediate implementation of this process. So, for P> 0, take

advantage of the fact that

$$\mathrm{Ai}(z\exp(-\tfrac13\pi i)) = \exp(\tfrac13\pi i)\,\mathrm{Ai}(z\exp(\pi i)) - \exp(\tfrac23\pi i)\,\mathrm{Ai}(z\exp(\tfrac13\pi i)). \qquad (8.213)$$

The first term causes no trouble at infinity and the second may be dealt with separately. Hence, we obtain

i(!k oa ) 1/ 3 exp( - ikoL) {I 1 - 2/ e, '" 2 1/2k ( 2 i)I/4( 2 2)1/4 P IkoLbhkoa) 3} nor

exp( -ik oL)F{k6/ 2 L bla(2L 1) 1/2} 23/2nk&/2(r2 _ a2)1/4(ri _ a2)1/4L1/2

+ where p(r ) =

exp(!ni) 2n 1/ 2

f

00

0

(8.214)

Ai(Jl)exp( - iJlr) d u Ai(J.l exp( - ini»

exp( -tni) 2n 1 / 2

+

'1 - a

- a

fO _

00

Ai( - J.l exp( -!ni» (.) d exp -IJlr u. Ai( - Jl exp(!ni»

(8.215)

So far only that part of (8.199) has been examined in which v ~ O. A contribution has been found when 0 ~ 4> - 4>1 ~ 2n only if n = 1 and it represents the effect of the lower penumbral point in Fig. 8.26 by putting ~ = 4> - 0 use (8.213) in the numerator and then _ exp(!ni)

p (r ) -

2n

1/2

f

Ai(Jl)exp( - iJ.lr) d

00

ooexp(-2/31ti)

•

2.

Al(J.l exp( -j1tl»

Jl

+

1

~

2n

r

(r > 0)

(8.216)

The contour can now be deformed over the poles of the integrand to give p(r) =_1 __

2n 1/ 2r

L

s= 1

exp(ini - itXsr exp(-tni» 2n 1 / 2 {Ai'( -c<s)}2

(r > 0).

(8.217)

At the shadow boundary $L_b = 0$ and $L_b$ becomes positive as one moves towards the deep shadow. Substitute (8.217) in (8.214) and then, if $L_b$ is sufficiently positive for (8.97) to be applicable, (8.208) is recovered for the upper penumbral point when account is taken of (8.205) and (8.209). Thus (8.214) supplies a continuous transition from the shadow boundary to the deep shadow. Moreover, since $p(0)$ is finite from (8.215), (8.217) shows that (8.208) blows up like $1/L_b$ as the shadow boundary is approached.

When r < 0, deform the contour of integration of the first integral of (8.215) into the radius vector through exp(t1ri). Reverse the sign of J.l and employ (8.213). The consequence is that _ exp( -lni) p(t ) 2n 1/2

f

00

00

exp( - 2/31ti)

Ai(Jl exp( -tni» exp(illr) A·( (1 i) dll 1 J.l exp 3n 1

1

+~ 2n t

(r < 0) (8.218)

For IIlI » 1, the asymptotic formulae can be substituted for the Airy function and a point of stationary phase appears. In fact, it can be shown that p(r) '"

----k foo_ 2n

00

exp(!iJl3 /2

'" .1{ _r)I/2 exp(.1.ni 2

4

+ iJlr) dJl +

-k2n

t

3 ) + _1_ + Lit 12 2n 1/2 t

(8.219)

as r -. - 00. If Lb is small but large enough in magnitude for (8.219), (8.96), and (8.97) to be valid we regain from (8.214) the incident and reflected waves on observing that . 1 L~ L1=L---2

2 a L1 ' 1 L3

(8.220)

Lr=L--~ 24 a 2

in the vicinity of the shadow boundary. However, as $L_b$ increases further in magnitude the matching is unsuccessful because of the miscarriage of the approximations for $L^i$ and $L^r$. The correct way to use the transition formula (8.214) therefore is as follows. Start at the shadow boundary $L_b = 0$. As $L_b$ becomes more negative a situation will be reached where the incident and reflected field are reproduced. As soon as that occurs use the true incident and reflected waves for any larger deviation from the shadow boundary. On the other hand, for positive $L_b$, the residue series should be used from the point where it first becomes applicable. Values on the shadow boundary require $p(0)$ which is given by

$$p(0) = 0.354\exp(\tfrac16\pi i). \qquad (8.221)$$

For later purposes it is worth remarking the following identification

$$L = L_1^i + L^c + L_b, \qquad (8.222)$$

$$L_1 = 1/L_1^i + 1/L^c \qquad (8.223)$$

where $L_1^i$ is the length of the incident ray to the penumbral point $P_1$ and $L^c$ is the length of a ray from the point of observation to a point of tangency. $L_b$ is the distance along the boundary between the two points of contact, being counted negative for observation above the shadow boundary as in Fig. 8.28 and positive for the points on the shadow side.

Fig. 8.28. Parameters for points on the illuminated side.

The method has been described for the case of electric polarization but it works equally well for an impedance boundary condition

$$\frac{\partial E_z}{\partial n} + k_0Z(\tfrac12 k_0a)^{-1/3}E_z = 0 \qquad (8.224)$$

the factor $(\tfrac12 k_0a)^{-1/3}$ being inserted to simplify some later formulae. The expression (8.201) still has veracity with the understanding that

$$N_s^2 = \nu_s\int_a^{\infty}\{H_{\nu_s}^{(2)}(k_0t)\}^2\frac{dt}{t} \qquad (8.225)$$

with $\nu_s$ a solution of

$$H_{\nu_s}^{(2)\prime}(k_0a) + Z(\tfrac12 k_0a)^{-1/3}H_{\nu_s}^{(2)}(k_0a) = 0 \qquad (8.226)$$

instead of (8.200). The early solutions are given by (8.205) with $\alpha_s e^{\pi i}$ replaced by $\delta_s$ where $\mathrm{Ai}'(\delta_s) + Z e^{-\pi i/3}\mathrm{Ai}(\delta_s) = 0$. With this change (8.208) can still be used but now

=

_[2 7/3 e-21Ei/3(koa)-1/3{Ai'(t5s)}2{Z2 - t5s e21Ei/3}]/Z2 (8.227) instead of (8.209). If Z = 0 exchange Z/Ai'(~s) and -exp(tni)/Ai(~s)' The transition formula (8.214) can also be used ifpz(r) is put in place of p(r) where Z Ai(Jl) - Ai'(u) _. e1Ei/6 roo N;

pz( r)

= 21[1/ 2 J0

e"i/3 Ai'(Jt e- 2ld /3) + ZAi(Jl e- 2"i/3) e

lilt

dJl

e- 1Ei/6 fO e- 1Ei/3Ai'(-Jl e- 1Ei/3) + ZAi( -Jl e- 1Ei/3 ) -illt e dJl (8.228) • 21[1/2 _ 00 e1Ei/3Ai'(- Jl e1Ei/3) + ZAi(- Jl e 1Ei/3)

+-

534

GEOMETRIC THEORY OF DIFFRACTION

Table 8.1

               $p_\infty(\tau)$              $p_0(\tau)$
  $\tau$    Modulus   Phase (deg)    Modulus   Phase (deg)
  -3        0.883      -93.5         0.891     104.3
  -2        0.623       -1.8         0.790     192.4
  -1.5      0.549      +21.6         0.688     211.3
  -1.0      0.484       32.5         0.560     219.5
  -0.5      0.419       34.2         0.429     218.6
   0        0.354       30.0         0.308     210.0
   0.5      0.290       22.9         0.204     193.7
   1.0      0.232       15.2         0.129     167.8
   1.5      0.183        8.48        0.0894    130.6
   2.0      0.144       +3.79        0.0793     92.0
   3.0      0.0953      +0.12        0.0812     45.1
   4.0      0.0705      -0.12        0.0745     22.3
   5.0      0.0564      -0.02        0.0623      9.71
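As a quick arithmetic check on Table 8.1, the following sketch converts the tabulated modulus and phase at $\tau = 0$ into complex numbers and compares $p_0(0)$ with $-0.308\exp(\tfrac{1}{6}\pi i)$, the value quoted in (8.231). The helper name `from_polar_deg` is an illustrative choice and does not come from the text.

```python
import cmath
import math


def from_polar_deg(modulus, phase_deg):
    """Complex number from a tabulated modulus and a phase in degrees."""
    return cmath.rect(modulus, math.radians(phase_deg))


# Entries of Table 8.1 at tau = 0.
p_inf_0 = from_polar_deg(0.354, 30.0)
p_0_0 = from_polar_deg(0.308, 210.0)

# A phase of 210 degrees is the same as a factor -exp(i*pi/6).
closed_form = -0.308 * cmath.exp(1j * math.pi / 6)
mismatch = abs(p_0_0 - closed_form)
```

The same conversion can be applied to any other row when the tabulated values are needed as complex numbers in a transition formula.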

Observe that Poo(t)

1

Pz (r ) = -1/2 21l t 11

~ 1\ -t)

+

= p(t). The formulae

analogous to (8.217) and (8.219) are

Z2 exp{ini + ibst exp( -!xi)} L s= {b exp(tni) - Z2}21l 1/ 2 {Ai'(b s

1

(r

s)}2

1/2 Z - tit 1· r : 3 1. exp(41t1 + TIlt) Z + lIt

1 +~ 21l

> 0)

(t--+-oo).

(8.229) (8.230)

t

Some values of $p_0$ and $p_\infty$ are shown in Table 8.1; in particular,

$$p_0(0) = -0.308\exp(\tfrac{1}{6}\pi i).$$

(8.231) Values of pz(O) when Z exp(ini) is a real constant are also available (Wait and Conda 1959). The field on the boundary may also be of interest. The preceding results then need some modification because r = a, but the treatment follows the same lines. It is found that near the penumbral point Po(O)

E, = oEz On

i exp(Vri - ikoL) {lk L lk -2/3} (8nk )1/2(d _ a2)1/4 qz "2 0 b(2 oa) o

= ik (~)1/3 Z 0

koa

f

exp(ini - ikoL) {l-k L (lk a)2)1/4 qz 2 0 b 2 0 - a

(81lk0 )1/2('12

(8.232)

l

2/3}

(8.233)

where ( ) _ exp(-tni) 00 exp( -iJlt) d qz r 2ni _ co exp(ini)Ai'(p exp(-jn:i» + ZAi(p exp(-jn:i» u.

(8.234)


Table 8.2

             $Zq_Z(\tau)$ ($Z = \infty$)        $q_0(\tau)$
  $\tau$    Modulus   Phase (deg)    Modulus   Phase (deg)
  -1.0      2.16      -115.9         1.861     -15.43
  -0.5      1.38      -106.8         1.682      +1.52
   0.0      0.77      -120           1.399       0
   0.3      0.515     -134.8         1.197      -6.06
   0.6      0.327     -152.9         0.991     -14.23
   1.0      0.167     -180.1         0.738     -26.63
   1.5      0.066     -215.9         0.488     -42.57
   2.0      0.025     -251.6         0.315     -57.98
   3.0      0.0033    -320.7         0.130     -87.57
   4.0      0.00043   -358.0         0.054    -116.75

Moving away from the penumbral point soon leads to geometrical optics or a residue series in the shadow because

(r) qz

=L

.=

exp{ic5.r exp( -!xi)} 1 (Z2 exp(-ixi) - c5.)Ai(c5.)

2t exp(!it 3 ) t-iZ

(r

-+ -

(0).

(r > 0)

(8.235) (8.236)

Therefore (8.232) and (8.233) may be employed in the same manner as (8.214). A selection of values for $q_Z$ appears in Table 8.2. Additional computations for various values of $Z$, including complex values, have also been made (Rice 1954; Fok 1946, 1965; Wait and Conda 1958). By reciprocity these results will also predict the field off the cylinder due to a line source on the cylinder.

Although the circular cylinder is the easiest to analyse theoretically, it is not completely suitable as a canonical problem because its radius of curvature is constant. A parabolic cylinder is better from this point of view; it has been studied (Jones 1964; Lowdon 1970) but the detail will not be presented here. Its conclusions agree with the approach described in the next section.

8.20 Generalization

The analysis of the preceding section has shown that (8.201) is valid for a variety of boundary conditions (8.224) with a pertinent choice of $\nu_s$. The aim is now to broaden its range of validity (Ludwig 1975). First suppose that $Z$ is a function


of $\phi$ which varies continuously but only slightly within a wavelength. Persist in making $\nu_s$ satisfy (8.226); $\nu_s$ will then be a function of $\phi$, which may be indicated by writing $\nu_s(\phi)$. Evidently, $\nu_s(\phi)$ will be large but it will be assumed, on account of (8.205), that its derivative with respect to $\phi$ is of a lower order of magnitude. Now consider

$$E_z = -\tfrac{1}{2}i\sum_{s=1}^{\infty}\frac{H^{(2)}_{\nu_s}(k_0 r)\,H^{(2)}_{\nu_s}(k_0 r_1)}{N_s^2}\exp\{-iP_s(\phi)\} \qquad (8.237)$$

as a candidate for the contribution from an upper penumbral point, the more general $P_s$ superseding the $\nu_s(\phi - \phi_1)$ of (8.201) but $N_s$ still being given by (8.225). $P_s$ will be large and so will its derivatives. If (8.237) is entered in the governing partial differential equation and the derivatives of $\nu_s$ and $N_s$ with respect to $\phi$ are neglected by virtue of their smallness, the disappearance of the largest terms demands

$$P_s' = \pm\nu_s.$$

The upper sign is the proper choice for the penumbral point under consideration and so

$$P_s(\phi) = \int_{\phi_1}^{\phi}\nu_s(t)\,dt$$

which reduces to the right value when $Z$ is a constant. When $r$, $\phi$ are very near to $r_1$, $\phi_1$ respectively, $\nu_s$ and $N_s$ are effectively constant so that (8.237) displays the correct singularity for a source. Consequently, (8.237) can be regarded as a satisfactory solution when $Z$ is variable.

Next, suppose that the boundary is not circular, but is convex. To adapt (8.237) to this situation take $a$ in (8.226) as the radius of curvature of the boundary at the point under consideration. The polar coordinates $r$, $\phi$ are no longer adequate. Instead use the arc length $\sigma$ of the boundary (increasing in the same sense as $\phi$ does) and a transverse coordinate $R$. On the boundary $R = a$ and $\partial R/\partial n = 1$, so that $R$ does not change significantly with a wavelength alteration in $\sigma$. The choice of these coordinates is reasonable if we do not stray too far from the boundary. Now, $\nu_s$ is a function of $\sigma$, as determined by (8.226), and

$$E_z = -\tfrac{1}{2}i\sum_{s=1}^{\infty}\frac{H^{(2)}_{\nu_s}(k_0 R)\,H^{(2)}_{\nu_s}(k_0 R_1)}{N_s^2}\exp\{-iP_s(\sigma)\} \qquad (8.238)$$

with $N_s$ defined by (8.225). In view of the choice of $R$ the boundary condition (8.224) is satisfied. Therefore (8.238) is appropriate to an upper penumbral point if

$$\frac{dP_s}{d\sigma} = \frac{\nu_s}{a}. \qquad (8.239)$$

The expression (8.238) can be expected to be valid only when both the source and point of observation are near the boundary because $R$ and $\sigma$ are coordinates local to the boundary.


For a distant source, the plane-wave approximation can be employed. In this case, $r_1$ is taken to infinity before the generalizations of (8.237) and (8.238) are attempted. It is found that for an incident plane wave of unit amplitude

$$E_z = 2\sum_{s=1}^{\infty}\exp\{-iP_s(\sigma)\}\,\frac{H^{(2)}_{\nu_s}(k_0 R)}{N_s^2} \qquad (8.240)$$

where now $P_s$ is zero at the penumbral point where the incident ray grazes the body. Although (8.240) is valid only near the body, the Hankel function can be approximated as soon as $R$ is somewhat different from $a$ and (8.240) has a cylindrical wave structure similar to (8.208). By matching to a suitable cylindrical wave $E_z$ can be continued to points well away from the body. When $Z$ is infinite and the Hankel functions are replaced by their asymptotic formulae the basic difference between (8.208) and (8.239) is that the product of $\nu_s$ and the angle travelled round the cylinder is replaced by

$$k_0\sigma + (\tfrac{1}{2}k_0)^{1/3}\alpha_s e^{-\pi i/3}\int a^{-2/3}\,d\sigma.$$

It may thus be surmised that (8.214) will be relevant to a uniformly valid representation if $L_b/a^{2/3}$ is replaced by $\int a^{-2/3}\,d\sigma$.

Some rewriting of (8.214) will improve its generality. Suppose that not too close to the source the incident field is $\gamma^i\exp(-ik_0L^i)$ and that a line source $-\tfrac{1}{4}iH_0^{(2)}$ at the point of observation would generate nearby a field $\gamma^c\exp(-ik_0L^c)$, $L^c$ being the optical path length from the point of observation. Then, by virtue of (8.220), (8.214) may be expressed as

J

E; '"

i

~/2 exp(!xi - ikoL)F[{ko(L - L i)1 /2}]

1t

-

41t1/2
f:

a- 2/3 da} (8.241)

where $P_1$ is the penumbral point and $B$ is the point of departure from the boundary of the tangential ray to the point of observation. The values of $a$ and $\gamma^i$ at $P_1$ are denoted by $a_1$ and $\gamma_1^i$ respectively, whereas $a_B$ and $\gamma_B^c$ are the values of $a$ and $\gamma^c$ at $B$. $L$ is the optical path length from source to point of observation via the boundary as defined by (8.222) (cf. Fig. 8.38). The square root $(L - L^i)^{1/2}$ is chosen to be positive in the shadow and negative in the illuminated region.

The field (8.241) is entirely consistent with (8.214) when the boundary is circular and is employed in a similar fashion in predicting the field, but is applicable for arbitrary convex boundaries. It exhibits reciprocity in that the field is unaltered by interchange of the source point and point of observation. Clearly, impedance boundary conditions are also within grasp by the introduction of $p_Z$ as defined in (8.228). In particular, the analogues of (8.232)


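and (8.233) are

$$E_z = \gamma_1^i\exp(-ik_0L)\,q_Z\Big\{(\tfrac{1}{2}k_0)^{1/3}\int_{P_1}^{C}a^{-2/3}\,d\sigma\Big\} \qquad (8.242)$$

$$\frac{\partial E_z}{\partial n} = -k_0(\tfrac{1}{2}k_0a)^{-1/3}Z\gamma_1^i\exp(-ik_0L)\,q_Z\Big\{(\tfrac{1}{2}k_0)^{1/3}\int_{P_1}^{C}a^{-2/3}\,d\sigma\Big\} \qquad (8.243)$$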

where $C$ is the point of the boundary under consideration. When the source is very close to the boundary but not actually on it further investigation has to be undertaken (Jones 1967).

The medium outside the cylinder may not be homogeneous and its refractive index $N$ may vary from point to point. The laws (8.241)-(8.243) are still serviceable provided that $L$, $L^i$ are optical path lengths and $1/a$ is replaced by (Jones 1963)

$$\frac{1}{N}\Big(\frac{1}{a} + \frac{1}{\rho_0}\Big)$$

where $\rho_0$ is the radius of curvature of the tangential ray, being counted positive when the centre of curvature is on the same side of the boundary tangent as the inhomogeneous medium. Also $d\sigma$ becomes $N\,d\sigma$ and, in (8.243), the right-hand side is multiplied by $N_C$. The boundary condition now takes the form

$$\frac{\partial E_z}{\partial n} + k_0N\Big\{\frac{2}{k_0N}\Big(\frac{1}{a} + \frac{1}{\rho_0}\Big)\Big\}^{1/3}ZE_z = 0.$$

The formulae (8.238) and (8.240) are also applicable so long as $k_0N$ is put for $k_0$ and $a$, $\sigma$ are modified as already indicated.
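The boundary integral $\int a^{-2/3}\,d\sigma$ that enters (8.241)-(8.243) is easy to evaluate numerically for a given profile. The sketch below is an illustration under stated assumptions, not a procedure taken from the text: the boundary is taken to be an ellipse $x = A\cos t$, $y = B\sin t$ (for which $a^{-2/3}\,d\sigma = (AB)^{2/3}(A^2\sin^2 t + B^2\cos^2 t)^{-1/2}\,dt$), the medium is homogeneous, the boundary is perfectly conducting with electric polarization, and only the exponential factor $\exp\{-\alpha_1 e^{\pi i/6}(\tfrac{1}{2}k_0)^{1/3}\int a^{-2/3}\,d\sigma\}$ of the first creeping mode is converted to decibels, all amplitude prefactors being ignored. The function names, the Maclaurin-series evaluation of Ai and the bisection bracket are choices made for this example.

```python
import math

AI0 = 0.355028053887817     # Ai(0)
AIP0 = -0.258819403792807   # Ai'(0)


def airy_ai(x, terms=60):
    """Ai(x) from its Maclaurin series; adequate for |x| up to about 9."""
    f = g = 0.0
    a = b = 1.0                      # coefficients of x**(3k) and x**(3k+1)
    for k in range(terms):
        n = 3 * k
        f += a * x ** n
        g += b * x ** (n + 1)
        a /= (n + 2) * (n + 3)
        b /= (n + 3) * (n + 4)
    return AI0 * f + AIP0 * g


def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with an even number n of intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * func(lo + j * h)
    return total * h / 3.0


def damping_integral(semi_x, semi_y, t0, t1):
    """Integral of a**(-2/3) d(sigma) along the ellipse between t0 and t1."""
    scale = (semi_x * semi_y) ** (2.0 / 3.0)
    return scale * simpson(
        lambda t: 1.0 / math.hypot(semi_x * math.sin(t), semi_y * math.cos(t)),
        t0, t1)


def first_mode_attenuation_db(k0, integral, alpha):
    """Decay in dB of |exp{-alpha e^{i pi/6} (k0/2)^{1/3} integral}|."""
    exponent = alpha * math.cos(math.pi / 6) * (0.5 * k0) ** (1.0 / 3.0) * integral
    return 20.0 * math.log10(math.e) * exponent


# alpha_1 is the smallest positive root of Ai(-alpha) = 0.
alpha_1 = bisect(lambda u: airy_ai(-u), 2.0, 3.0)
```

For a circle of radius $a$ the integral reduces to $a^{1/3}$ times the angle subtended, which gives a simple check on `damping_integral`. Calling `first_mode_attenuation_db(k0, damping_integral(A, B, t0, t1), alpha_1)` for two profiles with the same end points then indicates how sensitive the exponential damping is to the shape of the boundary.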

Exercises 42. Determine the field in the shadow region when the obstacle is a perfectly conducting (a) parabola, (b) ellipse. How would your results be modified if the boundary possessed variable impedance? 43. For perfect conductivity in Exercise 42 find the field in the penumbra for both electric and magnetic polarization. 44. Show that the scattering coefficient of a perfectly conducting convex cylinder in a homogeneous medium under an electrically polarized plane wave is 2 S / 3 n 1/ 2

2 + - - (a l / 3 + a l / 3 )J p(O) k~/3D

1

2

where $D$ is the length of wavefront intercepted by the obstacle and $a_1$, $a_2$ are the radii of curvature at the penumbral points. Is it sufficient to replace $p(0)$ by $p_0(0)$ for magnetic polarization?
45. The medium above the perfectly conducting plane $y = 0$ is stratified so that the refractive index is $(1 + qy)^{1/2}$ where $q$ is a positive constant. Determine the field in the penumbra and deep shadow for both types of polarization.
46. A circular cylinder of radius $a$ is surrounded by a medium in which the refractive index $N$ increases steadily with radial distance but is independent of angle. Determine the field in the penumbra due to a line source parallel to the axis of the cylinder.


The source and point of observation are at radial distances a + h I and a + h respectively where 1 » hla, ho/a »(k oa)- 2/3. Show that propagation takes place as if it were in a homogeneous medium but the cylinder had an effective radius a, so long as the actual distance a(

a

e={l+a(dN)

dr ,=a

}-I.

Compare your results with those of Exercise 45 when q = 2/a e •

8.21 Optimal curvature

Sometimes obstacles are designed to shield regions from high-frequency radiation. It may be conjectured that such obstacles should have curved boundaries because the shadow is deeper than when edges are present. The damping which occurs is governed by the integral along the boundary in (8.241). It has been suggested (Leppington 1970) that the shadow would be darkest if the curvature were optimized so that the integral became as large as possible. However, the practical effects do not seem to be large. The attenuation of the optimal design has been compared (Butler 1972) with that due to an ellipse with the same penumbral point and it was found that the optimal boundary gave less than 2 dB extra shielding over the elliptic cylinder. Indeed, for the smaller angular deviations into the shadow the elliptic cylinder offered somewhat greater attenuation than the optimal design. Both shapes provided appreciably better shielding than a sharp edge.

It therefore appears that it is sufficient to ensure that the boundary has a large radius of curvature. The degree of shielding afforded will be relatively insensitive to the precise shape of the boundary and little will be gained by attempting to make it fit the optimal design.

8.22 The diffraction matrix for a curved boundary

Creeping rays in the deep shadow of a perfectly conducting cylinder in a homogeneous medium make a convenient starting point for deriving the polarization relations in diffraction by a curved boundary. The ray-based coordinates introduced in §8.13 provide a satisfactory description here too. From (8.241) and (8.217), on a creeping ray

( EE~~) = (D~0

0 )(E~(Pl)) exp{ -iko(Lb D~

E~(Pl)

(L C ) 1/2

+ LC)}

(8.244)

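where $P_1$ is the penumbral point, $L_b$ and $L^c$ are as already defined and

$$D_E^c = \frac{(a_1a_B)^{1/6}\exp(-\tfrac{1}{12}\pi i)}{2\pi^{1/2}(\tfrac{1}{2}k_0)^{1/6}}\sum_{s=1}^{\infty}\frac{\exp\{-\alpha_s\exp(\tfrac{1}{6}\pi i)(\tfrac{1}{2}k_0)^{1/3}\int_{P_1}^{B}a^{-2/3}\,d\sigma\}}{\{\mathrm{Ai}'(-\alpha_s)\}^2}, \qquad (8.245)$$

Table 8.3

  $s$    $\mathrm{Ai}(-\beta_s)$    $\mathrm{Ai}'(-\alpha_s)$
  1       0.536       0.701
  2      -0.419      -0.803
  3       0.380       0.865
  4      -0.358      -0.911
  5       0.342       0.947
  6      -0.330      -0.978

$$D_H^c = \frac{(a_1a_B)^{1/6}\exp(-\tfrac{1}{12}\pi i)}{2\pi^{1/2}(\tfrac{1}{2}k_0)^{1/6}}\sum_{s=1}^{\infty}\frac{\exp\{-\beta_s\exp(\tfrac{1}{6}\pi i)(\tfrac{1}{2}k_0)^{1/3}\int_{P_1}^{B}a^{-2/3}\,d\sigma\}}{\beta_s\{\mathrm{Ai}(-\beta_s)\}^2}, \qquad (8.246)$$

$\beta_s$ being a positive number such that $\mathrm{Ai}'(-\beta_s) = 0$. Some typical values of $\beta_s$ are

$$\beta_1 = 1.019,\quad \beta_2 = 3.248,\quad \beta_3 = 4.820,\quad \beta_4 = 6.163,\quad \beta_5 = 7.372,\quad \beta_6 = 8.488 \qquad (8.247)$$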

and values of as appear in (8.204); other relevant constants are given in Table 8.3. The magnetic intensity on a creeping ray is given by

He = (::) 1/2 ie

A

Ee

(8.248)

where LC is a unit vector along the creeping ray towards the point of observation. As in §8.13. (8.244) may be applied immediately to oblique incidence on a cylinder though now the creeping ray will follow a helical-type path on the cylinder and a must be taken as the radius of curvature in the osculating plane of this path. For general convex targets the creeping ray will follow a geodesic on the surface in accordance with Fermat's principle. Analytic expressions for geodesics are not easily derivable except for the simplest surfaces and it will usually be necessary to solve the governing differential equation numerically. Furthermore, each ray of an incident tube will traverse its own geodesic and so the cross-section of the tube will, in general, be different at the point of departure from the surface from that at incidence. Therefore, the field will be altered by the factor C where C 2 is the ratio of the cross-sections at the beginning and end of the geodesic on the obstacle. 'In general, C will have to be calculated numerically. Again, when the creeping rays leave the surface the spreading of energy due to variation of the cross-section of a tube must be allowed for by a factor {PJ./(p~ + L )}1/2 as for the edge rays of §8.14. Here p~ is the principal radius of curvature associated with the tube (the surface is a caustic), but it cannot C


be specified further, in contrast to the situation for edge rays, and it must usually be evaluated numerically. On this basis the field on a creeping ray in the deep shadow of a convex perfect conductor satisfies

= C(D E ( E~) E~ 0

0

)(E~(Pl)){

DB E~(Pl)

P~

L C(p1

+

_}1/2 exp{

LC)

-ik (L 0

b

+L

C) }

(8.249)

with a the radius of curvature of the geodesic and the magnetic intensity given by (8.248). The prediction (8.249) is founded on the approximation (8.205). More accurate expansions have been discussed (Franz and Klante 1959; Hong 1967) but details are omitted here. The penumbra requires the transition formula (8.241) in its entirety. It leads to E C = exp(!ni)n- l /2Ei exp{ -iko(L - Li)}F[{ko(L - L i )}l/2]

+ (~E

:~J(~t~~:~){LC(P;~ Lc)f

/ 2

exp{ -iko(Lb + Lcn

(8.250)

where now (8.251) (8.252)

where $\rho_1$, $\rho_2$ are the principal radii of curvature of the incident tube at $P_1$, can be helpful near the shadow boundary. The formula (8.250) must be utilized according to the same conventions as (8.214) and (8.241), i.e. starting from the shadow boundary and then transferring as soon as possible to either (8.249) or the geometrical optics field in the illuminated zone.

Radiation from sources on a convex object can be calculated from (8.242) and (8.243) by invoking reciprocity (Pathak and Kouyoumjian 1974).

8.23 Diffraction by a discontinuity in curvature

The preceding exploration has tacitly assumed that there is no discontinuity in the radius of curvature. When a discontinuity does occur another canonical problem needs to be probed (Weston 1962, 1965; Senior 1971, 1972; Boyd 1977; Sharples 1962a, b). Primary concern is not with the influence on a passing creeping wave (which is expected to be weak) nor with the creeping waves

Fig. 8.29. Parameters for a discontinuity in curvature.

generated by the sources induced at the discontinuity but with the effect on geometrical optics in the illuminated region. The model selected is a cylinder tangential to the $x$ axis and with the discontinuity at the origin (Fig. 8.29). The radii of curvature to the right and left of the discontinuity will be designated $a_1$ and $a_2$ respectively. The incident wave will be assumed plane and electrically polarized. Thus the primary field is

$$E^i = \exp\{-ik_0r\cos(\phi - \phi_0)\}$$

the subscript $z$ being dropped for convenience. The angle $\phi_0$ is subject to $0 < \phi_0 < \pi$ in order to prevent grazing incidence. The equation of the boundary curve is $y = f(x)$ where $f$ and $f'$ are continuous. Both $f(0)$ and $f'(0)$ are zero but $f''(+0) = 1/a_1$ and $f''(-0) = 1/a_2$. The boundary condition for the total field $E(x, y)$ is

$$E(x, f(x)) = 0. \qquad (8.253)$$

With the intent of employing matched asymptotics, start by concentrating on the neighbourhood of the origin and introduce stretched coordinates $X$, $Y$ where

$$X = k_0x, \qquad Y = k_0y.$$

Assume that for $k_0$ large

$$E(x, y) = E(k_0^{-1}X, k_0^{-1}Y) \sim \sum_{m=0}^{\infty}k_0^{-m}E_m(X, Y)$$

which will be called an inner expansion of $E$. The partial differential equation to be satisfied by $E_m$ is

$$\Big(\frac{\partial^2}{\partial X^2} + \frac{\partial^2}{\partial Y^2} + 1\Big)E_m = 0.$$

The boundary condition (8.253) transforms to

$$\sum_{m=0}^{\infty}k_0^{-m}E_m\{X, k_0f(X/k_0)\} = 0.$$

Expanding $k_0f(X/k_0)$ about the origin we obtain to a first approximation

$$k_0f(X/k_0) = \frac{X^2}{2k_0a_1}$$

for $X > 0$ and the same formula with $a_2$ for $X < 0$. Hence the boundary condition may be written

$$E_0(X, 0) + \frac{1}{k_0}\Big\{E_1(X, 0) + \frac{1}{2}\frac{X^2}{a_{1,2}}\frac{\partial}{\partial Y}E_0(X, 0)\Big\} + \dots = 0$$

where $a_1$ or $a_2$ is taken according as $X > 0$ or $X < 0$. This will be applied as

$$E_0(X, 0) = 0, \qquad E_1(X, 0) = -\frac{1}{2}\frac{X^2}{a_{1,2}}\frac{\partial}{\partial Y}E_0(X, 0). \qquad (8.254)$$

Similar boundary conditions can be derived for the higher-order coefficients but will not be needed for our purposes. To complete the picture the conditions at infinity need to be specified. Now

$$E^i = \exp\{-i(X\cos\phi_0 + Y\sin\phi_0)\}$$

and so it can be regarded as the incident field for $E_0$. Had terms in inverse powers of $k_0$ appeared they would have had to be treated as incident fields for $E_1, \dots$. At any rate, $E_0$ has to be a radiating field after the subtraction of $E^i$ and $E_1$ consists entirely of outgoing waves. Thus

$$E_0(X, Y) = \exp\{-i(X\cos\phi_0 + Y\sin\phi_0)\} - \exp\{-i(X\cos\phi_0 - Y\sin\phi_0)\}$$

and so, from (8.254),

$$E_1(X, 0) = \frac{iX^2}{a_{1,2}}\sin\phi_0\exp(-iX\cos\phi_0).$$

By means of the Green's function which vanishes on the plane we obtain

$$-E_1(X, Y) = \frac{Y\sin\phi_0}{2a_2}\int_{-\infty}^{0}\frac{X'^2\exp(-iX'\cos\phi_0)}{\{(X' - X)^2 + Y^2\}^{1/2}}H_1^{(2)}[\{(X' - X)^2 + Y^2\}^{1/2}]\,dX' + \frac{Y\sin\phi_0}{2a_1}\int_{0}^{\infty}\frac{X'^2\exp(-iX'\cos\phi_0)}{\{(X' - X)^2 + Y^2\}^{1/2}}H_1^{(2)}[\{(X' - X)^2 + Y^2\}^{1/2}]\,dX'.$$


Strictly, these integrals do not converge at infinity. This is because (8.254) has been applied for large values of IXI where it is not truly valid. In essence, the largeness of X is limited by the requirement to keep x small. Therefore, tolerable approximations may be expected so long as the intervals of integration are visualized as finite with significant contributions for small X'. When X 2 + y 2 » 1, insert the asymptotic approximation for the Hankel function and then

Restricting attention to Y < 0, we see that the exponent is a monotonic function of X' if X sin -l/Jo there is a stationary point at X' = R cosec l/Jo sin(l/J + l/Jo) and it is straightforward to certify that this corresponds to a reflected wave emanating from the right portion of the cylinder, so once again the diffracted field stems from the origin. An expansion in powers of X' supplies a contribution from the origin of 2 exp(!ni - iR) -; R 3 / 2 (cos l/J - cos l/JO)3·

(

8)1/

Consequently the diffracted part of $E_1$ is given by

$$E_1^d(X, Y) = \Big(\frac{2}{\pi R}\Big)^{1/2}\frac{\sin\phi\sin\phi_0}{(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\exp(\tfrac{1}{4}\pi i - iR). \qquad (8.255)$$

Not too far from the cylinder the field on a ray should have standard form and contain factors such as $\{\rho/(\rho + s)\}^{1/2}\exp(-ik_0s)$ where $\rho$ is the radius of curvature of a pencil at the cylinder and $s$ is the distance along a ray from the cylinder. As $s$ diminishes this field should merge with the field in stretched coordinates as $R$ increases. As far as reflection is concerned there is no new feature and so the calculation will be omitted. The diffraction pattern can be matched with (8.255) if

$$E^d(x, y) = \frac{1}{k_0}\Big(\frac{2}{\pi k_0r}\Big)^{1/2}\frac{\sin\phi\sin\phi_0}{(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\exp(\tfrac{1}{4}\pi i - ik_0r). \qquad (8.256)$$

Obviously, failure of (8.256) must be accepted in the vicinity of


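The same method applied to magnetic polarization provides

$$H^d(x, y) = \frac{1}{k_0}\Big(\frac{2}{\pi k_0r}\Big)^{1/2}\frac{1 - \cos\phi\cos\phi_0}{(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\exp(\tfrac{1}{4}\pi i - ik_0r). \qquad (8.257)$$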

The effect of a discontinuity in curvature is more subdued than that of an edge. Nevertheless it may be important in determining the polarization characteristics of a reflecting surface. Slight flaws in manufacture might not be sufficient to raise a surface kink but might be large enough to cause the curvature to change discontinuously and there could then be untoward cross-polarization.

There is no reason why $a_1$ should not be negative, corresponding to a concavity on one side of the discontinuity. However, the reflected pencil will now be concentrated because of focusing and the diffracted field is unlikely to be significant in comparison except possibly at very great distances from the obstacle.

Oblique incidence can be dealt with as in §8.13. The diffracted rays from the discontinuity now lie on a cone. The analogue of (8.117) is

$$\mathbf{E}^d = \Big\{\frac{\rho_1}{(\rho_1 + s)s}\Big\}^{1/2}\exp(-ik_0s)\,D'\mathbf{E}^i(0) \qquad (8.258)$$

where

$$D' = \begin{pmatrix} D'_E & 0 \\ 0 & D'_H \end{pmatrix}$$

and

$$D'_E = \frac{\exp(\tfrac{1}{4}\pi i)}{k_0}\Big(\frac{2}{\pi k_0}\Big)^{1/2}\frac{\sin\phi\sin\phi_0}{(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\operatorname{cosec}^2\theta_0, \qquad (8.259)$$

$$D'_H = \frac{\exp(\tfrac{1}{4}\pi i)}{k_0}\Big(\frac{2}{\pi k_0}\Big)^{1/2}\frac{1 - \cos\phi\cos\phi_0}{(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\operatorname{cosec}^2\theta_0. \qquad (8.260)$$

The method described in this section will handle impedance boundary conditions, curves of discontinuity on three-dimensional surfaces or discontinuities at a point such as the vertex of a cone (Kaminetzky and Keller 1972). There is no necessity for the medium to be homogeneous. Discontinuities on interfaces between two media are also within the scope of this technique. The discontinuities may, of course, be in higher derivatives than the second.

Exercises
47. If the boundary condition (8.253) is replaced by $\partial E/\partial n + ik_0Z(x)E = 0$ where $\partial/\partial n$ is a normal derivative directed out of the obstacle and discontinuities due to $Z$ are ignored, show that

$$E^d = \frac{1}{k_0}\Big(\frac{2}{\pi k_0r}\Big)^{1/2}E^i(0)\,\frac{\sin\phi\sin\phi_0\{1 - \cos\phi\cos\phi_0 - Z^2(0)\}}{\{\sin\phi_0 - Z(0)\}\{\sin\phi + Z(0)\}(\cos\phi - \cos\phi_0)^3}\Big(\frac{1}{a_2} - \frac{1}{a_1}\Big)\exp(\tfrac{1}{4}\pi i - ik_0r).$$


48. If in Exercise 47 $Z$ is continuous but its derivative jumps by an amount $Z'$ in going from negative to positive values of $x$, show that the result of Exercise 47 still holds provided that $1/a_2 - 1/a_1$ is replaced by

$$\frac{1}{a_2} - \frac{1}{a_1} - Z'(\cos\phi - \cos\phi_0)\{1 - \cos\phi\cos\phi_0 - Z^2(0)\}^{-1}.$$

49. The medium above the boundary in Fig. 8.29 is homogeneous with refractive index $N$. The conditions on the interface are $E^{(1)} = a(x)E^{(2)}$, $\partial E^{(1)}/\partial n = b(x)\,\partial E^{(2)}/\partial n$ where (1) and (2) refer to the lower and upper media respectively. Find the diffracted ray in the lower medium assuming that $a$, $b$, and their derivatives are continuous.
50. In Fig. 8.29 $f$ and its first $r - 1$ derivatives are continuous. The boundary condition (8.253) is applied. Prove that (8.256) may still be used provided that (cos

Other canonical problems (sometimes called etalon problems in the Russian literature) have been studied. For example, a curved surface which also possesses an edge has been investigated (Borovikov 1973), as has the scattering from a cone (Goryainov 1961; Borovikov 1962; Senior 1971a; Senior and Uslenghi 1971, 1973). The effect of a surface impedance when the boundary is curved has also been examined (Bahar 1971). Creeping waves generally follow geodesics on the surface but it is not necessary that both radii of curvature be finite. A disc, when illuminated by a plane wave at grazing incidence with the electric vector in the plane of the disc, exhibits scattering behaviour which is compatible with a wave reflected from the front edge together with one which has circumscribed the edge in shadow (Hong 1967; Ryan and Peters 1968; Senior 1969). Although creeping rays are customarily regarded as determined by the field at the shadow boundary, care is required if the creeping rays come to a focus on the diffracting surface since more complicated effects may then arise (Kazarinoff and Senior 1962). Once rays have left their generating surface they are governed by the general considerations of §§8.2, 8.3. Often the propagation will be in an inhomogeneous medium which may have curved boundaries (Lewis 1965; Ahluwalia et al. 1974) and frequently the rays may form caustic surfaces within the medium, when another canonical problem must be solved (Kravtsov 1964; Ludwig 1966; Lewis et al. 1967; Tokatly and Kinber 1971; Gazazian and Kinber 1971). Conditions under which rays do not produce caustics have been formulated (Bloom and Kazarinoff 1976). The tracing of rays is, in many applications, a matter of integrating ordinary differential equations numerically, as already indicated. However, the determination of the field also involves the solution of the transport equations in order to fix the amplitudes of the field components. In effect, this is much the same as calculating the cross-section of a ray tube (see (8.19)). Sometimes


this can be carried out analytically by evaluating the Jacobian O(Xh X2' X3)

J1 = -

-

O(S,

-

-

G 2, G3)

where (Xl' X2 , X3) is the point of the ray where the arc length is s and the parameters characterizing the ray are G 2' G 3' If the equations of the ray are

where (), 4> are perhaps angles showing the initial direction of launching of the ray 011

of og Of){( of og

1 J = ( ao al/J - al/J ao

of og

of Og)2

aX2 aX3 - aX 3aX2

of Og)2 (of og

+ ( aX3 aX I - aX I aX3

of Og)2}-1/2

+ aX I aX2 - aX2 aX I

(8.261)

and other formulae are given in §8.3. When $f$ and $g$ are known only numerically, use of (8.261) or other analytical expressions can necessitate awkward numerical derivatives. One way of overcoming this is to trace a large number of trajectories starting in uniformly distributed directions; then the local density of rays (and hence the field amplitude) can be estimated, for instance by measuring the distance between adjacent rays. Such a procedure is uneconomical in computer time and its precision diminishes as the range along a ray increases.

A more efficient method has been proposed (Hayes 1970) though it is in no way cheap in computer time. It consists of solving the differential equations for the element of area of the phase front. One advantage is that it is also applicable to anisotropic media. Let the eikonal be $L$ and put $\mathbf{p} = \operatorname{grad} L$ so that the eikonal equation is

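$$p^2 = N^2. \qquad (8.262)$$

The equations of the rays, when $L$ is selected as the parameter to vary along a ray, are

$$\frac{d\mathbf{x}}{dL} = \frac{\mathbf{p}}{N^2}, \qquad \frac{d\mathbf{p}}{dL} = \frac{1}{N}\operatorname{grad} N \qquad (8.263)$$

where $L$ is related to the arc length $s$ by $(ds/dL)^2 = 1/N^2$. Suppose that the solution of (8.263) starting from a specified point in the direction characterized by $\theta$, $\phi$ is

$$\mathbf{x} = \mathbf{x}(L, \theta, \phi), \qquad \mathbf{p} = \mathbf{p}(L, \theta, \phi).$$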


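Write

$$\mathbf{Q}_\theta = \frac{\partial\mathbf{x}}{\partial\theta}, \qquad \mathbf{Q}_\phi = \frac{\partial\mathbf{x}}{\partial\phi}.$$

Then an element of area of the phase front $L$ = constant is given by

$$\delta\Sigma = \mathbf{Q}_\theta \wedge \mathbf{Q}_\phi\,\delta\theta\,\delta\phi. \qquad (8.264)$$

The cross-section $\delta A$ of a ray tube now follows from

$$\delta A = \delta\Sigma\,\frac{\mathbf{p}}{p}\cdot\frac{d\mathbf{x}}{ds} = \delta\Sigma \qquad (8.265)$$

though it should be noted that the factor of $\delta\Sigma$ is not unity in anisotropic media in general. Now, by applying $\partial/\partial\theta$ (which implies that $L$ is constant) to (8.263)

$$\frac{d\mathbf{Q}_\theta}{dL} = \frac{1}{N^2}\mathbf{R}_\theta - \frac{2}{N^3}\mathbf{p}(\mathbf{Q}_\theta\cdot\operatorname{grad} N) \qquad (8.266)$$

where $\mathbf{R}_\theta = \partial\mathbf{p}/\partial\theta$. Also

$$\frac{d\mathbf{R}_\theta}{dL} = \frac{1}{N}(\mathbf{Q}_\theta\cdot\operatorname{grad})\operatorname{grad} N - \frac{1}{N^2}(\mathbf{Q}_\theta\cdot\operatorname{grad} N)\operatorname{grad} N. \qquad (8.267)$$

Similarly, if $\mathbf{R}_\phi = \partial\mathbf{p}/\partial\phi$, both (8.266) and (8.267) hold if $\theta$ is replaced by $\phi$ ... which are usually not zero.

The calculation of the field has now been converted to the numerical integration of the 18 ordinary differential equations (8.263), (8.266), and (8.267) (for both $\theta$ and $\phi$). The process is started by stating the position of the source and the initial angles $\theta$, $\phi$ of the ray. The initial value of $\mathbf{p}$ then follows from the eikonal equation. These six conditions, coupled with the 12 on $\mathbf{Q}$ and $\mathbf{R}$, are sufficient for the integration to proceed. Any of the methods described earlier, such as the fourth-order Runge-Kutta or predictor-corrector, may be employed. Accuracy may be checked by comparing the value of $p$ obtained by numerical integration with that provided by the eikonal equation. Furthermore, the behaviour of the phase front area can be measured and, if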

11

Pec5~1 pl<5~1


is not sufficiently small (say, less than 10- 3 ) , the accuracy could be regarded as inadequate. If a perfectly reflecting body is encountered the method will still cope so long as Snell's laws are appropriate, i.e. creeping and edge rays can be ignored. New starting values will be wanted at the body for the differential equations. Denote these by the subscript 2, while values on the ray just before the body is reached are signified by the subscript 1. Then, if n is the unit normal from the body into the medium, P2 = PI and

P2

= PI -

and the same relationship (8.268) is valid for

R 2i

=

R li

-

(8.268)

2(PI•n)n

x/lxl

and Q. However,

(an) n ro.

1\ 1\ x.) 2(R t.n)n - 2(pI·n) .----OX i n.x,

(8.269)

in which the last term is negligible unless the curvature is large.
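The integration of the ray equations (8.263) and the accuracy check against the eikonal equation can be illustrated by a minimal sketch. It is not the full 18-equation scheme: only $\mathbf{x}$ and $\mathbf{p}$ are advanced, in two dimensions, by a classical fourth-order Runge-Kutta step. The medium is assumed to be the stratified one of Exercise 45, $N = (1 + qy)^{1/2}$, with an arbitrary illustrative value of $q$; the function names and the step count are likewise assumptions made for this example.

```python
import math

Q = 0.5  # illustrative value of q in N**2 = 1 + q*y


def index(x, y):
    """Refractive index N of the assumed stratified medium."""
    return math.sqrt(1.0 + Q * y)


def grad_index(x, y):
    """Gradient of N; only the y component is non-zero here."""
    return (0.0, 0.5 * Q / index(x, y))


def rhs(state):
    """Right-hand sides of (8.263): dx/dL = p/N**2, dp/dL = grad(N)/N."""
    x, y, px, py = state
    n = index(x, y)
    gx, gy = grad_index(x, y)
    return (px / n ** 2, py / n ** 2, gx / n, gy / n)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of length h in L."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * h))
    k3 = rhs(shifted(k2, 0.5 * h))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def trace_ray(theta, length, steps=200):
    """Ray launched from the origin at angle theta to the x axis."""
    n0 = index(0.0, 0.0)
    state = (0.0, 0.0, n0 * math.cos(theta), n0 * math.sin(theta))
    h = length / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def eikonal_defect(state):
    """Departure |p**2 - N**2| from the eikonal equation p**2 = N**2."""
    x, y, px, py = state
    return abs(px * px + py * py - index(x, y) ** 2)
```

After `trace_ray` has been called, `eikonal_defect` applied to the returned state gives the kind of check described in the text: a defect that is not small suggests that the step length should be reduced. In this stratified medium the $x$ component of $\mathbf{p}$ is conserved, which provides a second check.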

Exercises 51. Prove (8.261). 52. Show that pop/o() = p. R8 • 53. In a certain medium the eikonal equation is p2 = (N - M. p)2 where M is a known vector. Show that the differential equations of the rays are

dx =~(M +~) N p ,

dL dp dL

= ~ {grad N N

(p.grad)M}.

Prove also that t5A = (p + M. p) t5~lp + pMI- 1 and that dQ 1 { (P.R)P} pM + P 1 = R - - 2 - - --2-(Q·grad N) + -(Q.grad)M. dL pN p pN N

-

54. Justify (8.268) and (8.269).

8.24 Reflector antennas

Communication links at microwave frequencies often place severe demands on the designers of antennas. Usually the specifications call for a radiation or reception pattern which forms a pencil beam, i.e. the radiation is concentrated in a small angle about a certain direction, often known as the boresight. The sidelobes into which radiation outside the main beam goes are commonly restricted in their maxima or location and there may be limitations on the total radiation outside the main beam. The main reason for these constraints is the reduction of interference with other antennas. The probability of such interference grows with the expansion of microwave telecommunications which is

Fig. 8.30. (a) On-axis front-fed reflector; (b) offset front-fed reflector.

occurring steadily because of its involvement in point-to-point communication, passing information via satellites, microwave broadcasting, and radioastronomy. Similar considerations can also be relevant in the context of radar where a highly directional beam may be essential to secure an accurate fix on a target. The designer is consequently faced with the two objectives of high efficiency and low sidelobes; not infrequently, these have to be met at the least possible cost and with the best possible match at the feed. A solution of many years standing is to employ a reflector, there being two main types according as the reflector is fed from a source symmetrically or unsymmetrically placed relative to the reflector (Fig. 8.30).The source is usually a horn at the end of a waveguide. Many aspects of the design are unaffected by whether the antenna is for transmission or for reception; there are in any case close connections between the two on account of reciprocity theorems. Many microwave antennas operate with a single direction of linear polarization. However, to augment the possible traffic, antennas may be pressed to perform with two perpendicular linear polarizations at the same frequency simultaneously. Such frequency re-use antennas push design to its limits in order to achieve the desired polarization discrimination. The reflector for on-axis feed must, in effect, be a paraboloid with uniform phase over the aperture of the reflector. With the feed at the focus a good pencil beam with quite low sidelobes will be produced. If the beam has to be moved about in order to scan a certain region either the whole structure has to be steered mechanically or the feed has to be shifted. In the latter case a spherical reflector may be worth consideration though the sidelobes may then rise to too high a level. Beam scanning may raise the problem of blockage of the antenna aperture by the feed; an offset feed may then be viewed favourably. 
However, offset designs are usually more complicated and costly than those with on-axis feed. The overall effectiveness of an antenna system is highly dependent upon the efficiency with which the feed couples to the nearby field. Early designs were based on a conventional circular waveguide operating in a single mode. Better efficiency at narrow bandwidths can be attained by multi-mode feeds (Ludwig


1969); another method is to use single hybrid-mode excitation from a corrugated waveguide (Clarricoats and Saha 1971; Clarricoats et al. 1975; Parini et al. 1975). The detailed design of an antenna system is clearly a matter for the specialist so that here we shall be content with a description of the principles involved in this task, leaving the reader to find complete calculations elsewhere (see for example James 1976). Assuming a point source the first duty is to determine the trajectories of the incident and reflected rays. For a given point of observation this means discovering the point or points on the reflector profile which make the optical path length stationary. Two simultaneous equations will have to be solved and, in general, this must be undertaken numerically. Once the points of reflection have been settled the curvature matrices $C^i$ and $C^r$ of §8.5 are evaluated. The formulae of §8.5 will then supply the field of geometrical optics. Thus a first approximation to the radiation pattern is derived and a similar approach may be adopted for general excitation. When the feed is not a point source the situation is more complicated but systematic procedures have been developed (Westcott 1983; Westcott and Brickell 1984). Geometrical optics neglects the diffraction due to the rim of the reflector. To furnish a correction for diffraction the trajectories of edge rays are necessary. These require the points on the rim where the optical path length is stationary; again a numerical procedure such as Newton's method (§§4.5, 1.8, 1.9) will be involved. The angles

[Figure labels: hyperboloid subreflector; paraboloid main reflector]

Fig. 8.31. Typical Cassegrain antenna.

than a single reflector. They are typed as Cassegrain, the prototype of which consists of a paraboloidal main reflector and a hyperboloidal subreflector (Fig. 8.31). More uniform illumination of the main reflector and higher efficiency than a front-fed paraboloid can be accomplished by this arrangement. A common feed for a Cassegrain antenna is a conical hybrid mode (Clarricoats and Salema 1973). One aim of the original Cassegrain design was to contrive a uniform phase over the aperture of the main reflector. One can commence with geometrical optics despite the subreflector being habitually not many wavelengths in dimension so that diffraction effects are not negligible. The contingency of deviations from predicted patterns due to the struts keeping the subreflector in position must not be forgotten. They are likely to cause deterioration of cross-polarization discrimination as well as scattering energy into the far sidelobes. The far sidelobes will also be influenced by diffraction at the rims of two reflectors and energy from the feed going past the subreflector (what might be termed spillover); there may be spillover of the energy from the subreflector to the main reflector as well. Blockage of the aperture of the main reflector will spring from the subreflector and struts, exerting an effect on the nearer sidelobes. An extra degree of freedom is available in dual reflectors in that the position and profile of the subreflector can be modified. One possibility is to create a specified phase and amplitude distribution in the aperture of the main reflector via geometrical optics (Galindo 1964; Williams 1965, Potter 1967), so long as the main reflector is in the far field of the subreflector. Cross-polarization and systematic design have been examined (Westcott and Brickell 1982; Westcott et al. 1984). Optimization of a different nature can be fulfilled by means of reciprocity theory (Wood 1970, 1971, 1972). 
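According to a standard reciprocity theorem (Jones 1986), if $\mathbf{E}_1$ is the electric intensity due to a current source $\mathbf{a}$ at $\mathbf{x}_1$ and $\mathbf{E}_2$ is that due to $\mathbf{b}$ at $\mathbf{x}_2$, the same conductors being present in both cases, then

$$\mathbf{b}\cdot\mathbf{E}_1(\mathbf{x}_2) = \mathbf{a}\cdot\mathbf{E}_2(\mathbf{x}_1).$$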

Let now $\mathbf{E}_1$ be the field when a plane wave of unit power is incident along the axis of the main reflector and the subreflector is ignored. Suppose that when the feed is transmitting unit power it induces a current on the subreflector in harmony with physical optics. In other words, a current density $\mathbf{n} \wedge \mathbf{H}_t$ is produced on the subreflector, where $\mathbf{n}$ is the unit normal and $\mathbf{E}_t$, $\mathbf{H}_t$ is the field due to the feed at the subreflector. Regard this current density as responsible for the field $\mathbf{E}_2$ via the main reflector. Then the reciprocity theorem tells us that the field radiated along the boresight is effectively

$$I = \int \mathbf{n} \wedge \mathbf{H}_t \cdot \mathbf{E}_1 \, dS \tag{8.270}$$

the integration being over the front of the subreflector. Hence a measure of the overall efficiency of the dual reflector system is

$$\eta = I I^* \tag{8.271}$$

provided that physical optics is acceptable on the subreflector. The field $\mathbf{E}_1$ can be estimated by GTD or any other convenient method and so if the field from the feed is known either by computation or measurement the integral in (8.270) can be evaluated. The efficiency is thereby determined. An arbitrary profile can be assumed for the subreflector and the efficiency maximized by any of the optimization procedures, such as conjugate gradients, described in Chapter 4. (In practice, shaping of the subreflector is much more important than modification of the main reflector.) The effect of small deviations of profile may also be predicted (Poulton 1975). One may also maximize $\eta$ at the same time as minimizing the power delivered to the subreflector by the feed. Moreover, by integrating over a range of frequencies and then optimizing, we have the possibility of arriving at a subreflector profile which will be optimally efficient over a band of frequencies.

LEAKY RAYS

8.25 Gaussian beams and complex sources

The function $\exp(-ikR)/R$, where $R^2 = (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2$, is a solution of Helmholtz's equation in $(x, y, z)$ for any fixed $(x_0, y_0, z_0)$. Even if $x_0$, $y_0$, and $z_0$ are complex this statement remains true. Suppose, in particular, that $x_0 = y_0 = 0$ and $z_0 = -ib$ with $b$ positive real, so that $R = \{\rho^2 + (z + ib)^2\}^{1/2}$ where $\rho = (x^2 + y^2)^{1/2}$. Then the real values of $(x, y, z)$ at which $R$ vanishes are given by $z = 0$, $\rho = b$. Thus $R$ is a multiple-valued function in $(x, y, z)$ space.


To render it single valued introduce a cut in $z = 0$, $\rho \le b$ with $R$ specified to be $z + ib$ on $\rho = 0$ when $z > 0$ and to be $-(z + ib)$ when $z < 0$. Values elsewhere are determined by continuity. Such a definition is consistent with the customary one for $R$ when $b = 0$. If $\rho^2 \ll z^2 + b^2$ and $z > 0$

$$\frac{\exp(-ikR)}{R} \approx \frac{1}{z + ib}\exp\left\{-ik\left(z + ib + \frac{\rho^2}{2(z + ib)}\right)\right\} \tag{8.272}$$

$$\approx \frac{1}{(z^2 + b^2)^{1/2}}\exp\left[-ikz\left\{1 + \frac{\rho^2}{2(z^2 + b^2)}\right\} + kb - \frac{kb\rho^2}{2(z^2 + b^2)} - i\tan^{-1}\frac{b}{z}\right]. \tag{8.273}$$

Let the range of observation points which satisfy $\rho^2 \ll z^2 + b^2$ be called the paraxial region. Then these formulae show that in that part of the paraxial region where $z^2 \ll b^2$ the wave propagates parallel to the z axis with little distortion of the phase front but subject to an exponential decay perpendicular to the z axis, the exponent being proportional to $\rho^2$. In fact, the field falls to $1/e$ of its value at $\rho = 0$ when $\rho = (2b/k)^{1/2}$. Thus, for $z^2 \ll b^2$, a well-collimated beam is formed in the paraxial region (Fig. 8.32). For larger values of $z$ the $1/e$ points lie on a hyperboloid which is asymptotic to the cone $\tfrac12 kb\rho^2 = z^2$. Fields of this type are known as Gaussian beams. They are important in beam waveguides and lasers (Goubau and Schwering 1961; Kogelnik and Li 1966). Often the plane $z = 0$ is known as the beam waist and $(2b/k)^{1/2}$ is reckoned as the spot size at the beam waist.
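The paraxial formulae can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the numerical values of `k` and `b` are illustrative assumptions. It compares the complex-source field $\exp(-ikR)/R$ with the approximation (8.272) at a paraxial point, and confirms that the waist amplitude falls to $1/e$ at $\rho = (2b/k)^{1/2}$.

```python
import cmath
import math


def complex_source_field(rho, z, k, b):
    """exp(-ikR)/R for a source at (0, 0, -ib), intended for z > 0.

    cmath.sqrt returns the root with non-negative real part, which agrees
    with the branch R = z + ib on the axis when z > 0.
    """
    R = cmath.sqrt(rho**2 + (z + 1j * b) ** 2)
    return cmath.exp(-1j * k * R) / R


def paraxial_field(rho, z, k, b):
    """Paraxial approximation (8.272), valid for rho^2 << z^2 + b^2."""
    q = z + 1j * b
    return cmath.exp(-1j * k * (q + rho**2 / (2 * q))) / q


def spot_size(k, b):
    """Radius (2b/k)^(1/2) at which the waist field falls to 1/e."""
    return math.sqrt(2 * b / k)


if __name__ == "__main__":
    k, b = 2 * math.pi, 20.0  # illustrative values (unit wavelength)
    w0 = spot_size(k, b)
    exact = complex_source_field(w0, 2.0, k, b)
    approx = paraxial_field(w0, 2.0, k, b)
    print("relative error of (8.272):", abs(exact - approx) / abs(exact))
    ratio = abs(paraxial_field(w0, 0.0, k, b)) / abs(paraxial_field(0.0, 0.0, k, b))
    print("waist amplitude ratio:", ratio, "1/e:", math.exp(-1))
```

The exact field is only evaluated for $z > 0$ here, so the choice of branch for $R$ across the cut in $z = 0$ does not arise.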

[Figure labels: hyperboloid; $b$; $(2b/k)^{1/2}$; beam centre; $z$; paraxial region]

Fig. 8.32. The paraxial region of a Gaussian beam.


The above derivation demonstrates that a Gaussian beam is an approximate representation in the paraxial region of an exact solution of Helmholtz's equation due to a source at a complex point (Deschamps 1971). Outside the paraxial region the Gaussian approximation is no longer valid. Nevertheless, it is clear that any known solutions for real point sources which can be continued analytically to sources at complex points will provide solutions for excitation by a Gaussian beam. For example, by employing the standard formulae for a magnetic dipole we may obtain the electromagnetic beam field whose components are, if $G = \exp(-ikR)/4\pi R$,

$$E_x = -\frac{m\omega^2}{cR}(z + ib)G, \qquad E_y = 0, \qquad E_z = \frac{m\omega^2}{cR}\,xG,$$

$$H_x = -\frac{mk^2}{\mu_0 R^2}\,xyG, \qquad H_y = \frac{mk^2}{\mu_0 R^2}\{x^2 + (z + ib)^2\}G, \qquad H_z = -\frac{mk^2}{\mu_0 R^2}(z + ib)yG$$

at a distant point. At the centre this beam is transverse electromagnetic. Beams with other polarizations can be produced by giving the magnetic dipole other orientations and by utilizing the electric dipole representation. Two-dimensional Gaussian beams can be generated in the same way by starting from the line source

$$G_2 = -\tfrac14 i H_0^{(2)}(kr_2)$$

where $r_2 = \{y^2 + (z + ib)^2\}^{1/2}$. For large $kr_2$

$$G_2 \approx \left(\frac{1}{8\pi kr_2}\right)^{1/2}\exp(-ikr_2 - \tfrac14\pi i)$$

and in the paraxial region $y^2 \ll z^2 + b^2$

$$r_2 \approx z + ib + \tfrac12 y^2(z + ib)^{-1}.$$

By this device problems such as the reflection of a Gaussian beam by a dielectric can be resolved (Raj et al. 1973). Allowing a complex source to go off to infinity gives rise to a plane wave propagating along a complex angle of incidence so that the wave is evanescent. If such a wave is electrically polarized and strikes a perfectly conducting half-plane the solution is still given by (8.99) but now $\phi_0$ is the complex angle $\phi_r$ - i0 is real, failure now occurs within two ellipses which become smaller as

they are included there is no need to alter (8.97) for other ranges we have F () W

1 1/2 exptiw . 2= 21t

F() w = n 1/2 exp (ilW 2

-

1 0)

i 2w

41t1 -

-

+ - 1 3 + ...

(0 < ph w < !1t),

(8.274)

(in < ph w < in).

(8.275)

4w

.1..) 4n1 - - i + - 1 3 + 2w 4w

.··

-in < ph W < in but for

It is evident from (8.274), (8.275), and (8.99) that a term of the same order of magnitude as the incident wave is present for n ~

Exercises

55. Derive the beams for magnetic dipoles parallel to the x and z axes, as well as those for electric dipoles.

56. A line source parallel to the x axis produces a Gaussian beam field $E_x = -\tfrac14\omega\mu_0 H_0^{(2)}(kr_2)$ in a medium $z < 0$ with constants $\mu_0$, $\varepsilon_0$, where

$$r_2^2 = (y + ib\sin\alpha)^2 + (z + d + ib\cos\alpha)^2.$$

Show that in the paraxial region

$$\{(z + d)\sin\alpha - y\cos\alpha\}^2 \ll \{(z + d)\cos\alpha + y\sin\alpha\}^2 + b^2$$

the beam can be visualized as travelling at an angle $\alpha$ to the z axis from $(0, -d)$. In $z > 0$, $\varepsilon_0$ is changed to $\varepsilon_1$. Show that, in the reflected wave,

$$E_x \sim -\frac{\omega\mu_0 R(\theta)}{2(2\pi kD)^{1/2}}\exp(-ikD + \tfrac14\pi i)$$

where

$$D^2 = (y + ib\sin\alpha)^2 + (z - d - ib\cos\alpha)^2, \qquad \sin\theta = \frac{y + ib\sin\alpha}{D}.$$


57. In Exercise 56 put $y = \rho_0\sin\theta_0$, $z - d = -\rho_0\cos\theta_0$. Prove that, when $\theta_0$ is near $\alpha$, $R(\theta) = R(\alpha) + \delta R$ where

$$\frac{\delta R}{R(\alpha)} \approx \frac{2\sin\alpha}{(\varepsilon_1/\varepsilon_0 - \sin^2\alpha)^{1/2}}\,\frac{\rho_0}{\rho_0 + ib}\,(\theta_0 - \alpha).$$

Show that the centre of the reflected beam, defined by those $\theta_0$ for which $\Re\{-ikD + \delta R/R(\alpha)\}$ is a maximum, occurs at

$$\theta_0 = \alpha + \Re\left\{\frac{\rho_0 - ib}{kb\rho_0}\,\frac{2\sin\alpha}{(\varepsilon_1/\varepsilon_0 - \sin^2\alpha)^{1/2}}\right\}$$

when $D$ is replaced by its paraxial approximation. Deduce that, when $\alpha > \sin^{-1}(\varepsilon_1/\varepsilon_0)^{1/2}$, the reflected beam has suffered a lateral displacement of $2\sin\alpha/\{k(\sin^2\alpha - \varepsilon_1/\varepsilon_0)^{1/2}\}$ parallel to the y axis (Horowitz and Tamir 1971). If $\alpha < \sin^{-1}(\varepsilon_1/\varepsilon_0)^{1/2}$ prove that the beam shift depends upon the incident beam width and the observation distance $\rho_0$ from the image source.

8.26 Complex rays

The notion of a source at a complex point invites the proposal that energy propagates from it to the point of observation by means of rays which must, of course, lie in a complex space in order to pass through the source. Examination of the theory of §§8.1-8.3 reveals that most of it is still valid if $L$ is taken to be complex instead of real, though some features such as the curvature of rays cannot be accepted. Complex $L$ implies that the local plane wave of geometrical optics is inhomogeneous. If $L$ is complex the eikonal equation (8.9) and the ray equations (8.25) and (8.27) are still valid provided that one does not reject the possibility that $x$, $y$, and $z$ are complex. Suppose, for example, that $N = 1$. Then, from (8.27),

$$\frac{d^2x}{d\sigma^2} = \frac{d^2y}{d\sigma^2} = \frac{d^2z}{d\sigma^2} = 0$$

whence

$$\frac{x - x_0}{A} = \frac{y - y_0}{B} = \frac{z - z_0}{C} \tag{8.276}$$

where $A^2 + B^2 + C^2 = 1$ on account of (8.9) and (8.25). In (8.276) any of the quantities $x$, $y$, $z$, $x_0$, $y_0$, $z_0$, $A$, $B$, $C$ may be complex and so (8.276) is the equation of a complex ray. However, at the point of observation $x$, $y$, and $z$ must be real so that the complex initial point $(x_0, y_0, z_0)$ and complex slope $(A, B, C)$ must be chosen to ensure that this is so. In the Gaussian beam $x_0$ and $y_0$ are both zero whereas $z_0$ is pure imaginary. Another aspect is that it will not be possible to trace $L$ unless its initial values at complex points are known. It is usually feasible to surmount this difficulty by means of analytic continuation. It is not always necessary to make all the variables complex. To illustrate this


point we consider the two-dimensional problem in which

$$L = (y^2 - b^2)^{1/2} \tag{8.277}$$

on $z = 0$. In view of the preceding section this is equivalent to specifying the spot size of a Gaussian beam at the beam waist. The theory of complex rays should therefore be competent to reconstruct the Gaussian beam in $z > 0$. Allow $y$ to be complex but keep $z$ real. On $z = 0$ let $L$ satisfy (8.277) even when $y$ is complex, there being no difficulty about the analytic continuation in the paraxial region. On the complex ray $y - y_0 = \lambda z$ the value of $L$ is given by

$$L = \{z^2 + (y - y_0)^2\}^{1/2} + L(y_0, 0) = \{z^2 + (y - y_0)^2\}^{1/2} + (y_0^2 - b^2)^{1/2}. \tag{8.278}$$

The relation between $\lambda$ and $y_0$ has to be determined. Now, from (8.277), $\partial L(y, 0)/\partial y = y/(y^2 - b^2)^{1/2}$ and hence the eikonal equation (8.9) supplies $\partial L(y, 0)/\partial z = ib/(y^2 - b^2)^{1/2}$. It follows from (8.25) that the slope of the ray must satisfy $dy/dz = y_0/ib$ on $z = 0$. Thus $\lambda = y_0/ib$ and the equation of the ray is $y - y_0 = y_0 z/ib$. Eliminating $y_0$ from (8.278) by means of this equation, we obtain $L = \{y^2 + (z + ib)^2\}^{1/2}$ in agreement with the earlier formula for the Gaussian beam. There are consequently two alternative viewpoints. Either the Gaussian beam may be visualized as composed of complex rays passing through a single point in complex $(y, z)$ space or as consisting of rays in complex $y$ and real $z$ space whose sources are distributed over $z = 0$. Equations which are entirely real can be achieved by separating the complex equations into their real and imaginary parts (Felsen 1976). Let $L = L_r + iL_i$ where $L_r$ and $L_i$ are real. Then (8.9) becomes

$$(\operatorname{grad} L_r)^2 - (\operatorname{grad} L_i)^2 = N^2, \qquad \operatorname{grad} L_r \cdot \operatorname{grad} L_i = 0. \tag{8.279}$$

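Let $\mathbf{s}_r$, $\mathbf{s}_i$ be the unit vectors

$$\mathbf{s}_r = \frac{\operatorname{grad} L_r}{|\operatorname{grad} L_r|}, \qquad \mathbf{s}_i = \frac{\operatorname{grad} L_i}{|\operatorname{grad} L_i|}$$

and $s_r$, $s_i$ the corresponding arc lengths on the curves to which they are tangential. Analogous to (8.28) we have

$$\frac{d}{ds_r}\bigl(|\operatorname{grad} L_r|\,\mathbf{s}_r\bigr) = \operatorname{grad}|\operatorname{grad} L_r|, \qquad \frac{d}{ds_i}\bigl(|\operatorname{grad} L_i|\,\mathbf{s}_i\bigr) = \operatorname{grad}|\operatorname{grad} L_i|. \tag{8.280}$$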


From the second part of (8.279), $L_i$ is constant on a trajectory parallel to $\mathbf{s}_r$ and such a trajectory is therefore known as a phase path. $L_r$ is constant on trajectories parallel to $\mathbf{s}_i$, and so they are called equiphase curves. Phase paths and equiphase curves are orthogonal to one another because of (8.279). Corresponding to (8.18) the curvatures of phase paths and equiphase curves are given by $\kappa_r = \boldsymbol{\nu}_r \cdot \operatorname{grad}\ln|\operatorname{grad} L_r|$, $\kappa_i = \boldsymbol{\nu}_i \cdot \operatorname{grad}\ln|\operatorname{grad} L_i|$. Thus a phase path bends whenever $|\operatorname{grad} L_r|$ varies along an equiphase curve and, mutatis mutandis, there is similar bending of equiphase curves. In an inhomogeneous plane wave the phase paths and equiphase curves are orthogonal straight lines. If the spacing between equiphase curves contracts $|\operatorname{grad} L_r|$ must increase and then (8.279) implies that $|\operatorname{grad} L_i|$ also grows when $N$ is constant. Thus the field becomes steadily more damped and evanescent in this direction. Conversely, when the spacing between equiphase curves expands $|\operatorname{grad} L_i|$ diminishes. Should the situation be reached where $|\operatorname{grad} L_i| \to 0$ the possibility of a non-evanescent field arises. Thus there can be a conversion from a zone of evanescence to non-evanescence with consequent leakage of energy. For this reason complex rays are sometimes called leaky rays. However, it is important to realize that phase paths and equiphase curves are real curves which should not be identified with complex or leaky rays, these being straight lines in a complex space when $N$ is constant. Amplitudes can be dealt with via (8.8), (8.16), and (8.30) which may also be split into their real and imaginary parts if desired.
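The complex-ray reconstruction of the Gaussian-beam eikonal in this section can be verified numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: the function names are hypothetical, and the principal branches of the square roots are assumed, which is adequate for $z > 0$ and $|y| < b$. It finds the complex starting point $y_0$ of the ray through a real point $(y, z)$, evaluates $L$ from (8.278), and compares it with $\{y^2 + (z + ib)^2\}^{1/2}$.

```python
import cmath


def ray_start(y, z, b):
    """Complex starting point y0 on z = 0 of the ray reaching real (y, z).

    Obtained by solving the ray equation y - y0 = y0*z/(ib) for y0.
    """
    return 1j * b * y / (z + 1j * b)


def eikonal_along_ray(y, z, b):
    """L from (8.278): length along the ray plus the initial value on z = 0."""
    y0 = ray_start(y, z, b)
    path = cmath.sqrt(z**2 + (y - y0) ** 2)
    initial = cmath.sqrt(y0**2 - b**2)  # L(y0, 0), as in (8.278)
    return path + initial


def eikonal_closed_form(y, z, b):
    """Closed form {y^2 + (z + ib)^2}^(1/2) of the Gaussian-beam eikonal."""
    return cmath.sqrt(y**2 + (z + 1j * b) ** 2)


if __name__ == "__main__":
    for y, z, b in [(0.3, 2.0, 1.0), (0.5, 0.7, 2.0)]:
        print(eikonal_along_ray(y, z, b), eikonal_closed_form(y, z, b))
```

Outside the range assumed above the two square roots in (8.278) may need to be continued onto other branches, which this sketch does not attempt.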

Exercises

58. If (8.277) is replaced by L(y, 0) = f(y) where f is analytic show that the slope dyjdz of the complex ray through (Yo, 0) is j'(yo)[l - {j'(YO)}2]-1/2. 59. Suppose that in two dimensions the phase paths are concentric circles with centre at the origin and that, on r = a, [grad L, I = fJ (> 1). If N = 1 prove that, for r ~ a, [grad Lrl = fJajr and [grad Ld = (fJ2 a2jr2 - 1)1/2. Deduce that L, = Bs, and L, = L ia + fJa(tanh r - tanh ra-r+ra )

where cosh r = paJr and the subscript a indicates that r = a. Examine what happens as r --+ fJa.

8.27 Optical fibres

The simplest kind of optical fibre consists of a circular cylinder of dielectric material surrounded by another dielectric. In more sophisticated versions there may be additional concentric annular regions of dielectric and the refractive index may vary in a radial direction transverse to the cylindrical axis. Such structures may be tackled in terms of modes by employing separation of variables (§6.25). Modal expansions are not, however, as straightforward as for


metal waveguides because an appreciable part of the energy can be transported outside the guiding cylinder and because a contribution from a continuous spectrum may have to be allowed for. A full modal investigation of an optical fibre can therefore be highly complicated. For this reason the discussion of this section will be limited to those circumstances in which the propagation can be regarded as taking place along rays. This simplification has the advantage of avoiding the global difficulties of modes at the price of concentrating on frequencies for which the ray picture is adequate. A further benefit is that, because rays depend only upon local properties, generalization to fibres which are not strictly circular or whose refractive index defies exact enumeration of the modes becomes a possibility. For more detail see Snyder and Love (1983). It will be sufficient for our purposes to consider what happens at a curved interface where the dielectric constants change discontinuously, but are uniform everywhere else. Such an interface may be able to support a surface wave whose energy is confined to an immediate neighbourhood of the boundary. A two-dimensional model of the situation is that in which the phase paths are concentric circles, one of the phase paths coinciding with the dielectric interface (Fig. 8.33). Then, if $|\operatorname{grad} L_r| = \beta$ on $r = a$,

$$|\operatorname{grad} L_r| = \frac{\beta a}{r}, \qquad |\operatorname{grad} L_i| = \left(\frac{\beta^2 a^2}{r^2} - N_1^2\right)^{1/2}$$

where $N_1$ is the refractive index of the outer medium. So long as $\beta > N_1$ there is a region of evanescence adjacent to the interface. As $r$ increases to $\beta a/N_1$,

[Figure labels: phase fronts; transition region; $N_1$; rays; caustic; phase paths; $N$]

Fig. 8.33. Model of evanescent field outside a curved dielectric.


$|\operatorname{grad} L_i|$ decreases to zero. Thus the evanescent wave is converted to a non-evanescent wave which carries energy in $r > \beta a/N_1$ along conventional rays tangential to the caustic $r = \beta a/N_1$. In other words, after a period of exponential decay the complex ray is transformed into a real ray whose diminution of amplitude is due solely to spreading by radiation. Rays which reappear as conventional real ones after traversing a region of evanescence are sometimes known as tunnelling rays. Although the rays in the state of propagation lose amplitude only by radiation, they will be exponentially smaller than rays which have not tunnelled because of the dissipation of energy in reaching the caustic. Nevertheless there may be an appreciable leakage of energy from the caustic. It is important to observe that tunnelling rays do not occur for a plane interface since the caustic moves off to infinity as $a \to \infty$. Hence tunnelling rays represent a surface wave which is peculiar to a curved dielectric. The amplitude changes which take place at a dielectric interface may be calculated from Fresnel's laws for a plane boundary to a first approximation (§8.5). To obtain the appropriate coefficients in three dimensions, recall that if the plane wave

$$\mathbf{E}^i = (l\mathbf{i} + m\mathbf{j} + n\mathbf{k})\exp\{-ikN(x\sin\theta_0\cos\phi_0 + y\sin\theta_0\sin\phi_0 + z\cos\theta_0)\}$$

with

$$l\sin\theta_0\cos\phi_0 + m\sin\theta_0\sin\phi_0 + n\cos\theta_0 = 0$$

is incident on $y = 0$, the reflected wave has components (the exponential factor being omitted)

$$E_x^r = R_E l + \frac{(R_E + R_M)m\cos\theta_i\sin\theta_0\cos\phi_0}{\sin^2\theta_i}, \tag{8.281}$$

$$E_y^r = R_M m, \tag{8.282}$$

$$E_z^r = R_E n + \frac{(R_E + R_M)m\cos\theta_i\cos\theta_0}{\sin^2\theta_i} \tag{8.283}$$

with $R_E$ and $R_M$ given by (8.55) and (8.57) respectively, $\theta$ being replaced by the angle of incidence $\theta_i$ where $\cos\theta_i = \sin\theta_0\sin\phi_0$. Similarly, for the transmitted field

(8.284) (8.285) (8.286)

where $T_E$ and $T_M$ are given by (8.56) and (8.58) respectively ($\theta_i$ being substituted for $\theta$). Noteworthy is the fact that components of the reflected and transmitted waves tangential to the interface can be created by the normal components of the incident field. Also, when dealing with rays in general, the twisting of the plane of polarization as described in §8.3 must be borne in mind. The formulae for the reflection and transmission coefficients are defined for all angles of incidence $\theta_i$ but there are important differences of interpretation if $\sin\theta_i > N_1/N$; such an eventuality can arise only when $N_1/N < 1$. If $N_1/N < 1$ define the critical angle of incidence $\theta_c$ by $\sin\theta_c = N_1/N$. For $\theta_i > \theta_c$, $(N_1^2 - N^2\sin^2\theta_i)^{1/2}$ is taken to be negative imaginary and the transmitted wave with coefficients (8.284)-(8.286) will be exponentially attenuated as it moves along. Thus, unless the coefficients are modified, the phenomenon of tunnelling which occurs for curved interfaces will be concealed. By consideration of the canonical problem it is elicited (Snyder and Love 1975; Jones 1978) that the appropriate reflection coefficients are those of (8.55) and (8.57) with $\{(N_1/N)^2 - \sin^2\theta_i\}^{1/2}$ replaced by

$$-\frac{\mathrm{Ai}'(u_0\exp(\tfrac13\pi i))}{\{\tfrac12 kN\rho_0\exp(\tfrac12\pi i)\}^{1/3}\,\mathrm{Ai}(u_0\exp(\tfrac13\pi i))}$$

where Ai is the standard Airy function. The quantity $\rho_0$ is the radius of curvature of the section of the interface by the plane of incidence, i.e. the plane containing the incident ray and surface normal, whereas

$$u_0 = \frac{(\tfrac12 kN\rho_0)^{2/3}(\sin^2\theta_c - \sin^2\theta_i)}{\{1 + \rho_0(\sin^2\theta_c - \sin^2\theta_i)/b\}^{2/3}} \tag{8.287}$$

$b$ being the larger of the principal radii of curvature. It is understood that $\sin^2\theta_c = N_1^2/N^2$ even when $N_1 > N$. Corresponding results for transmission are more recondite because of the presence of a caustic when $\theta_i > \theta_c$. These formulae are relevant when the incident ray is on the concave side of the interface; no change to $R_E$ and $R_M$ is necessary for incidence from the convex side because there is then no tunnelling. Denote the reflection coefficients just delineated by $\bar R_E$ and $\bar R_M$. Unless $\theta_i \approx \theta_c$, $|u_0|$ will be large since ray theory demands radii of curvature which are not small. Hence the asymptotic formulae

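$$\mathrm{Ai}(w) \sim \frac{\exp(-\tfrac23 w^{3/2})}{2\pi^{1/2}w^{1/4}}, \qquad \mathrm{Ai}'(w) \sim -\frac{w^{1/4}}{2\pi^{1/2}}\exp(-\tfrac23 w^{3/2}) \qquad (|\mathrm{ph}\,w| < \pi) \tag{8.288}$$

$$\mathrm{Ai}(w) \sim \frac{\exp(\tfrac14\pi i)}{\pi^{1/2}w^{1/4}}\cos(\tfrac23 iw^{3/2} - \tfrac14\pi), \qquad \mathrm{Ai}'(w) \sim \frac{w^{1/4}\exp(-\tfrac14\pi i)}{\pi^{1/2}}\sin(\tfrac23 iw^{3/2} - \tfrac14\pi) \qquad (\tfrac13\pi < \mathrm{ph}\,w < \tfrac53\pi) \tag{8.289}$$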


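may be employed. It follows that $\bar R_E \sim R_E$, $\bar R_M \sim R_M$ when $\theta_i$ is not near $\theta_c$. Consequently, the reflection coefficients are, to a first approximation, those of a plane interface when the angle of incidence is not approaching critical. For a particular polarization a guide to the energy which is transmitted is $1 - |\bar R_E|^2$ or $1 - |\bar R_M|^2$. From

$$\exp(\tfrac13\pi i)\,\mathrm{Ai}'(x\exp(\tfrac13\pi i))\,\mathrm{Ai}(x\exp(-\tfrac13\pi i)) - \exp(-\tfrac13\pi i)\,\mathrm{Ai}(x\exp(\tfrac13\pi i))\,\mathrm{Ai}'(x\exp(-\tfrac13\pi i)) = \frac{\exp(-\tfrac12\pi i)}{2\pi} \tag{8.290}$$

and

$$\mathrm{Ai}(x\exp(\pi i)) = \exp(\tfrac13\pi i)\,\mathrm{Ai}(x\exp(\tfrac13\pi i)) + \exp(-\tfrac13\pi i)\,\mathrm{Ai}(x\exp(-\tfrac13\pi i)) \tag{8.291}$$

it is discovered that

$$1 - |\bar R_E|^2 = \frac{4\mu\mu_1 NN_1\cos\theta_i|\cos\theta_r|H(u_0)}{N^2\mu_1^2\cos^2\theta_i + 2\mu\mu_1 NN_1\cos\theta_i|\cos\theta_r|H + \mu^2N_1^2|\cos\theta_r|^2HJ}, \tag{8.292}$$

$$1 - |\bar R_M|^2 = \frac{4\varepsilon\varepsilon_1 NN_1\cos\theta_i|\cos\theta_r|H(u_0)}{N^2\varepsilon_1^2\cos^2\theta_i + 2\varepsilon\varepsilon_1 NN_1\cos\theta_i|\cos\theta_r|H + \varepsilon^2N_1^2|\cos\theta_r|^2HJ} \tag{8.293}$$

where $N_1\cos\theta_r = N(\sin^2\theta_c - \sin^2\theta_i)^{1/2}$, $\cos\theta_r$ being negative imaginary when $\theta_i > \theta_c$, and

$$\frac{1}{H(u_0)} = 4\pi|u_0|^{1/2}\,|\mathrm{Ai}(u_0\exp(\tfrac13\pi i))|^2, \tag{8.294}$$

$$J(u_0) = \frac{4\pi|\mathrm{Ai}'(u_0\exp(\tfrac13\pi i))|^2}{|u_0|^{1/2}} \tag{8.295}$$

with $u_0 = |u_0|\exp(-\pi i)$ when negative. When $|u_0| \gg 1$ it may be inferred from (8.288) and (8.289) that

$$H(u_0) \sim \begin{cases} 1 & (u_0 > 0) \\ \exp(-\tfrac43|u_0|^{3/2}) & (u_0 < 0), \end{cases} \tag{8.296}$$

$$J(u_0) \sim \begin{cases} 1 & (u_0 > 0) \\ \exp(\tfrac43|u_0|^{3/2}) & (u_0 < 0). \end{cases} \tag{8.297}$$

Hence, for $|u_0| \gg 1$,

$$1 - |\bar R_E|^2 \sim \frac{4\mu\mu_1 NN_1\cos\theta_i\cos\theta_r}{(N\mu_1\cos\theta_i + N_1\mu\cos\theta_r)^2} \qquad (u_0 > 0) \tag{8.298}$$

$$1 - |\bar R_E|^2 \sim \frac{4\mu\mu_1 NN_1\cos\theta_i|\cos\theta_r|\exp(-\tfrac43|u_0|^{3/2})}{N^2\mu_1^2\cos^2\theta_i + N_1^2\mu^2|\cos\theta_r|^2} \qquad (u_0 < 0). \tag{8.299}$$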


When $|u_0|$ is not large, $\theta_i \approx \theta_c$ because of the huge factor $kN\rho_0$ in (8.287). Furthermore, the second and third terms of the denominator of (8.292) can be viewed as products of $(\tfrac12 kN\rho_0)^{-1/3}$ and a bounded quantity. They are therefore negligible provided that

$$\tfrac12 kN\rho_0\,\mu_1^3\cos^3\theta_c \gg \mu^3. \tag{8.300}$$

Accordingly, for moderate values of $u_0$

$$1 - |\bar R_E|^2 \approx \frac{4N_1\mu\mu_1\cos\theta_i|\cos\theta_r|H(u_0)}{N\mu_1^2\cos^2\theta_i}. \tag{8.301}$$

Since $\theta_i \approx \theta_c$, the denominator does not differ by much from $|N\mu_1\cos\theta_i + N_1\mu\cos\theta_r|^2/N$. On this basis (8.298), (8.299), and (8.301) may be combined into the single formula

$$1 - |\bar R_E|^2 = \frac{4NN_1\mu\mu_1\cos\theta_i|\cos\theta_r|H(u_0)}{|N\mu_1\cos\theta_i + N_1\mu\cos\theta_r|^2} \tag{8.302}$$

with tolerable accuracy over the whole range of $u_0$. There is a corresponding formula for $\bar R_M$ deduced from (8.293) by exchanging $\mu$, $\mu_1$ and $\varepsilon$, $\varepsilon_1$ in (8.302). From (8.287), there is the simplification

$$u_0 \approx (\tfrac12 kN\rho_0)^{2/3}(\sin^2\theta_c - \sin^2\theta_i) \tag{8.303}$$

when $\theta_i \approx \theta_c$. However, (8.303) must be dropped in favour of (8.287) when $\theta_i - \theta_c$ becomes increasingly positive, otherwise (8.302) will overestimate the amount of transmitted energy. While (8.302) is satisfactory for an incident wave that is E polarized some care is necessary when both polarizations exist because then cross-multiplication may produce algebraic decay rather than exponential for $u_0 \ll -1$ and additional terms in (8.296) and (8.297) may be necessary. For the special case of a circular optical fibre, one principal radius of curvature $b$ is infinite and the other is the radius $a$ of the cylinder. Let the z axis be the axis of the cylinder and suppose that a typical ray in the cylinder travelling in the direction of $z$ increasing makes an angle $\theta_z$ with the z axis. At its point of impact with the cylindrical boundary let the projection of the ray on the cross-section of the cylinder make an angle $\theta_s$ with the radius vector from the centre of the cross-section to the point of impact (Fig. 8.34). Then

$$\cos\theta_i = \sin\theta_z\cos\theta_s. \tag{8.304}$$

Meridional rays pass through the axis of the cylinder and are such that $\theta_s = 0$, $\cos\theta_i = \sin\theta_z$. Rays in which $\theta_s \ne 0$ are called skew rays. To illustrate the procedure we shall concentrate solely on rays which are polarized so that (8.302) is applicable. For meridional rays $\rho_0/b$ is unity and $\rho_0$ is infinite. Thus $u_0$ will be positive or negative infinity according as $\theta_i$ is less or greater than $\theta_c$. Consequently, a meridional ray will either radiate


Fig. 8.34. Angles for propagation along a fibre.

energy by refraction as if at a plane interface ($\theta_i < \theta_c$) or radiate no energy at all ($\theta_i > \theta_c$), there being a virtually instantaneous transition when $\theta_i$ passes through $\theta_c$. If the initiating energy in a hollow cone between the angles $\theta_z$ and $\theta_z + d\theta_z$ is $2\pi G(\theta_z)\sin\theta_z\,d\theta_z$ the energy which leaves the fibre by refraction in a distance $z$ is

where $n$ is the number of reflections in a distance $z$ given by

$$n = \frac{z}{2a}\tan\theta_z.$$

If $N_1 > N$, the lower limit of integration is 0. Sources off the axis will be responsible for skew rays and maybe meridional rays. If a ray starts from $z = 0$ at the point whose polar coordinates are $(c, \chi)$ and goes in a direction making an angle $\theta_z$ with the cylindrical axis with its projection on $z = 0$ having the inclination of the polar angle $\chi_0$

$$a\sin\theta_s = c\sin(\chi - \chi_0) \tag{8.305}$$

at the point of reflection. Suppose that $G(\theta_z, c, \chi, \chi_0)$ is a measure of the incident energy along the ray so that

$$G(\theta_z, c, \chi, \chi_0)\,c\,dc\,d\chi\,\sin\theta_z\,d\theta_z\,d\chi_0$$


is the energy launched from a small area about $(c, \chi)$ into an elemental solid angle about the specified direction of propagation. Then the energy which radiates from a length $z$ of the cylinder due to the reflection of this elementary beam is

where $\nu$ is the number of reflections given by

$$\nu = \frac{1}{2}\,\frac{z\tan\theta_z}{a\cos\theta_s}$$

and $\theta_i$ is defined by (8.304). Moreover, from (8.287),

$$u_0 = (\tfrac12 ka\,\mathrm{cosec}^2\theta_s\,\mathrm{cosec}^2\theta_z)^{2/3}\left\{\left(\frac{N_1}{N}\right)^2 - 1 + \sin^2\theta_z\cos^2\theta_s\right\}. \tag{8.306}$$

The total radiation due to all rays is obtained by integration, in which it is advantageous to convert $\chi_0$ to $\theta_s$ via (8.305). Thus, if $\Gamma$ is the total radiation

$$\Gamma = 4\int_0^a\int_0^{2\pi}\int_0^{\pi/2}\int_0^{\sin^{-1}(c/a)} G(\theta_z, c, \chi, \chi_0)\{1 - |\bar R_E|^{2\nu}\}\,\frac{a\cos\theta_s}{(c^2 - a^2\sin^2\theta_s)^{1/2}}\,c\,dc\,d\chi\,\sin\theta_z\,d\theta_z\,d\theta_s. \tag{8.307}$$

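If, in particular, $G$ is independent of $c$ and $\chi_0$ put

$$\int_0^{2\pi} G(\theta_z, c, \chi, \chi_0)\,d\chi = \frac{2G_0(\theta_z)}{a^2}$$

and then, because of the structure of $\bar R_E$ and (8.306),

$$\Gamma = 8\int_0^{\pi/2}\int_0^{\pi/2} G_0(\theta_z)\{1 - |\bar R_E|^{2\nu}\}\sin\theta_z\,d\theta_z\,d\theta_s. \tag{8.308}$$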

The value $\theta_s = 0$ is admitted in (8.307) and (8.308) so that both include the radiation due to meridional rays. Also tunnelling is allowed for. The separate contributions from tunnelling and refraction can be assessed by limiting $\theta_s$ to a value less than that for which $N\cos\theta_s = (N^2 - N_1^2)^{1/2}\,\mathrm{cosec}\,\theta_z$ (this limit being $\tfrac12\pi$ if $N_1 > N$) in refraction and letting $\theta_s$ range above this limit for tunnelling. Obviously, as soon as $\theta_s$ exceeds the limiting value by a significant amount the contribution will be exponentially small on account of (8.299). Whether the rays near critical incidence will supply an appreciable effect depends on the relative sizes of various parameters but they should not be ignored without calculation. The integrand of (8.308) decreases as $\theta_s$ increases so that the rays which are not far off meridional tend to radiate more than the rays which are more skew. On the other hand, the integrand (apart from $G_0$) is an increasing function of $\theta_z$ with the consequence that isotropic sources will produce more radiated energy from their rays with larger values of $\theta_z$.
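The geometrical relations used in this section lend themselves to a short numerical illustration. The sketch below is not from the original text: the function names and sample values are assumptions chosen for illustration. It evaluates the angle of incidence from (8.304), the number of reflections $\nu$, and the limiting skew angle separating refraction from tunnelling; it does not attempt the Airy-function reflection coefficients.

```python
import math


def incidence_angle(theta_z, theta_s):
    """Angle of incidence from (8.304): cos(theta_i) = sin(theta_z) cos(theta_s)."""
    return math.acos(math.sin(theta_z) * math.cos(theta_s))


def reflections(z, a, theta_z, theta_s):
    """Number of reflections in a length z of a fibre of radius a."""
    return 0.5 * z * math.tan(theta_z) / (a * math.cos(theta_s))


def limiting_skew_angle(N, N1, theta_z):
    """theta_s at which N cos(theta_s) = (N^2 - N1^2)^(1/2) cosec(theta_z).

    Returns pi/2 when N1 >= N (the text's limit for N1 > N), and 0 when
    every ray with this theta_z already lies beyond critical incidence.
    """
    if N1 >= N:
        return math.pi / 2
    c = math.sqrt(N * N - N1 * N1) / (N * math.sin(theta_z))
    return 0.0 if c >= 1.0 else math.acos(c)


def ray_regime(N, N1, theta_z, theta_s):
    """Classify a ray following the discussion in the text."""
    theta_i = incidence_angle(theta_z, theta_s)
    if N1 >= N or theta_i < math.asin(N1 / N):
        return "refraction"
    if theta_s == 0.0:
        return "no radiation (meridional)"
    return "tunnelling"


if __name__ == "__main__":
    N, N1 = 1.5, 1.0  # illustrative core and cladding indices
    tz = math.radians(60.0)
    print("limiting skew angle (deg):", math.degrees(limiting_skew_angle(N, N1, tz)))
    for ts_deg in (10.0, 45.0):
        ts = math.radians(ts_deg)
        print(ts_deg, ray_regime(N, N1, tz, ts), reflections(100.0, 1.0, tz, ts))
```

With $\theta_s = 0$ the function `reflections` reduces to the meridional count $n = (z/2a)\tan\theta_z$.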

Exercise 60. By assuming that the z dependence of all fields is exp( -ikN cos (Js) in the fibre problem, deduce that when two-dimensional rays are refracted by a curve of radius of curvature a

with

/ Uo = (~kN2 a )2 3{(Nt)2 _ sin2 o.} s1n o, N and (Js the (two-dimensional) angle of incidence. What is the analogous result for the other polarization?

9 SOURCE DETECTION

9.1 General considerations

The location of the origins of an electromagnetic disturbance can be of considerable importance, either because one wishes to know where a source of scattering or radiation is situated or because one wishes to eliminate its influence on other signals. A related problem of practical significance is the determination of the properties of a medium from its effect on a wave; there are obvious applications in the detection of minerals, in the investigation of materials, and in assessing the state of the atmosphere for propagation purposes. It is vital to get straight right away that problems of this type do not usually have a unique solution and that assertions about the location of sources are highly dependent upon the assumptions made in analysing the signals. A common supposition is that there is a closed surface $S$ containing all the sources and that a homogeneous isotropic medium lies between $S$ and the observer $O$ (Fig. 9.1). The general representations (6.75) and (6.76), which are exact, demonstrate that the observed field is equivalent to that generated by a suitable distribution of sources on $S$. One can infer that these are apparent (i.e. not real) sources if one has a priori knowledge that the sources are not actually on $S$. However, a similar representation in terms of apparent sources exists for any closed surface surrounding $S$ but excluding $O$. Therefore, even the determination of apparent sources is subject to a fair degree of ambiguity. The most that can be asserted is that, given $S$, the field can be accounted for by a certain distribution of sources on $S$.

Fig. 9.1. Surface containing sources.

One may also have to be equivocal about the source distribution because there are sources which do not radiate energy at infinity. To see this (Friedlander 1973) consider Maxwell's equations in a homogeneous isotropic dielectric

$$\operatorname{curl}\mathbf{E} + i\omega\mu\mathbf{H} = 0, \qquad \operatorname{curl}\mathbf{H} - i\omega\varepsilon\mathbf{E} = \mathbf{j} \tag{9.1}$$

where the current density j is reasonably smooth and vanishes outside a finite region. Apply the operator (curl curl − k²), where k² = ω²με, to (9.1). Then a current density is produced which supplies an electromagnetic field in which the electric intensity is −iωμj and the magnetic intensity is curl j. Since j vanishes outside a finite region so does this electromagnetic field and so a source distribution which is responsible for a non-radiating field has been created. It has been proved (Bleistein and Cohen 1977) that, if j is zero outside a finite domain, eqns (9.1) possess a non-radiating solution if and only if (9.2) where c = 1/(με)^{1/2},

$$\mathbf{j}(\boldsymbol{\kappa}, \omega) = \int \mathbf{j}(\mathbf{x}, \omega)\exp(i\boldsymbol{\kappa}\cdot\mathbf{x})\,d\mathbf{x} \tag{9.3}$$

and the integration in (9.3) is over the whole space. Convergence of (9.3) is immediate since j is non-zero only in a finite domain. Current densities for which (9.2) is satisfied need to be treated with care. It must therefore be recognized that the formulation for source detection may often not yield a unique solution and that, even when additional information is available, the difficulty may not be resolved. For instance, suppose it is alleged that the sources are to be regarded as radiating in free space and their positions are to be determined from distant observations by tracing rays backwards. Then, consider the situation in which there is a point source M in a dielectric of higher refractive index than where the observer is located, the interface between the two being a plane (Fig. 9.2). The observer will place the apparent source at the point where nearby rays focus, i.e. on the caustic of the refracted rays. An observer at O therefore puts the source at M' where OM' is tangent to the caustic. Thus the apparent location of the source depends upon the

Fig. 9.2. An observer at O places M at M'.

position of the observer and its apparent direction from the observer will not be that of the true source except at normal incidence. Only detailed information about the medium containing the source would allow any likelihood of detecting the real source from the caustic traced by the apparent sources. Additional knowledge may be forthcoming from other approaches. For example, by probing at other frequencies, say optical or infrared, it may be possible to deduce properties of the medium, or again a source might give out a variety of disturbances some of which were more easily analysed than others. However, one does need to be convinced that it is the same source which emits the different kinds of radiation before this latter avenue is open. The lack of uniqueness may appear, at first sight, to contradict everyday experience because telescopes and eyes seem to supply unambiguous images (except maybe in optical illusions). However, objects which convey sharp images usually do so by radiation which is incoherent in space. Highly coherent radiation from lasers is often afflicted by speckle and good visual holograms usually demand that the illumination be sufficiently diffuse. Similarly, polished objects viewed by specularly reflected light are often difficult to recognize. The knowledge that the radiation from a source is spatially incoherent can be a valuable piece of information. Even if the theory predicted a unique solution, enough data, as uncontaminated by noise as practicable, has to be collected for the application of the theory in a numerically sound fashion. It is clear therefore that the detection of sources can be fraught with anxiety for the experimenter and a considerable body of experience and skill may have to be built up before interpretations of physical significance can be accepted with confidence.

INVERSE SCATTERING

During the past decade numerous theorems and methods for the problem of inverse scattering have been published, e.g. Roger (1981), Sleeman (1982), Colton and Kress (1983), Colton and Sleeman (1983), Jones (1985), Colton and Monk (1985, 1986, 1987, 1990, 1992), Kirsch and Kress (1987a, 1987b), Kirsch et al. (1988), Jones and Mao (1989), Zion (1989), Ari and Firth (1990), Sylvester and Uhlmann (1990), Bates et al. (1991), Colton and Paivarinta (1991), Wombell and Murch (1992). It is impossible to do justice to all these developments here and so attention will be confined to a couple of simple approaches which, none the less, are of value in practical circumstances.

9.2 Low frequencies

The first problem to be investigated is that in which it is known that a far field has been scattered by a perfectly conducting obstacle in an otherwise


homogeneous isotropic dielectric but the shape and location of the scatterer are to be found. It will be sufficient for illustrative purposes to consider the two-dimensional case; the discussion can easily be extended in principle to three-dimensional problems (Imbriale and Mittra 1970). Let the scatterer have generators parallel to the z axis and be irradiated by a plane wave with its electric intensity perpendicular to the (x, y) plane. Denoting the electric intensity by E and choosing the x axis in the direction of propagation of the plane wave we have E^i = exp(−ikx) in the incident wave, k being the wavenumber of the medium. A scattered field E^s will be produced such that E^i + E^s is zero on the obstacle and

$$E^s \sim \left(\frac{2}{\pi k r}\right)^{1/2} A(\phi)\exp\{-i(kr - \tfrac{1}{4}\pi)\} \tag{9.4}$$

as the distance r from the origin increases, φ being the polar angle which is zero on the x axis. The pattern function A(φ) is supposed to be known and from it the position of the scatterer is to be determined. Write A(φ) = Σ a_n exp(inφ). The coefficients a_n can be calculated by standard Fourier analysis (cf. §§1.5, 2.14) and, in practice, the expansion is limited to a finite number of terms. Outside the minimum radius r = a of the circle enclosing the obstacle (Fig. 9.3) it is known that (9.4) is the asymptotic expansion of (9.5)

where H_n^(2) is the Hankel function of the second kind. Actually, a is not given because the location of the scatterer is unknown. However, if E^i + E^s is calculated from (9.5) for decreasing r a value of r will be reached at which E^i + E^s is zero at a point on the circumference of the circle. This value of r will be adjudged to be a and the point as the point of contact between the minimum circle and perimeter of the obstacle. In practice, a zero will rarely be achieved precisely and it will be necessary to be satisfied when

Fig. 9.3. Geometry for inverse scattering.

E^s + E^i is less than some pre-assigned tolerance (which must not be tighter than the data can attain). To discover another point on the perimeter of the body select a new origin (r_0, φ_0) and repeat the process. The new pattern function A'(φ) is given by

$$A'(\phi) = A(\phi)\exp\{-ikr_0\cos(\phi - \phi_0)\} = \sum_n a'_n \exp(in\phi)$$

where

$$a'_n = \sum_m a_m J_{n-m}(kr_0)\exp\{i(n-m)(\tfrac{3}{2}\pi - \phi_0)\}.$$
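The change of origin can be checked numerically. The sketch below is an illustration only: it assumes the Jacobi-Anger expansion exp(−iz cos θ) = Σ (−i)^p J_p(z) exp(ipθ), evaluates J_p from Bessel's integral by the trapezoidal rule, and uses made-up coefficients a_m. The function names are hypothetical and are not taken from the text.

```python
import numpy as np

def bessel_j(order, z, samples=512):
    """J_order(z) for integer order from Bessel's integral (trapezoidal rule)."""
    tau = 2.0 * np.pi * np.arange(samples) / samples
    return float(np.mean(np.cos(order * tau - z * np.sin(tau))))

def shift_coefficients(a, k_r0, phi0, n_max):
    """Fourier coefficients of A(phi) * exp{-i k r0 cos(phi - phi0)}.

    `a` maps an integer order m to a_m; orders -n_max..n_max are returned.
    """
    shifted = {}
    for n in range(-n_max, n_max + 1):
        total = 0j
        for m, a_m in a.items():
            p = n - m
            # (-i)^p exp(-i p phi0) is the Jacobi-Anger factor for the shift
            total += a_m * bessel_j(p, k_r0) * (-1j) ** p * np.exp(-1j * p * phi0)
        shifted[n] = total
    return shifted

def pattern(coeffs, phi):
    """Evaluate sum_n c_n exp(i n phi) on an array of angles."""
    return sum(c * np.exp(1j * n * phi) for n, c in coeffs.items())

# Hypothetical pattern coefficients and new origin (k*r0 = 1.2, phi0 = 0.7 rad).
a = {-1: 0.3 - 0.1j, 0: 1.0, 1: 0.2j, 2: -0.05}
k_r0, phi0 = 1.2, 0.7
phi = np.linspace(-np.pi, np.pi, 181)

direct = pattern(a, phi) * np.exp(-1j * k_r0 * np.cos(phi - phi0))
series = pattern(shift_coefficients(a, k_r0, phi0, n_max=25), phi)
max_error = np.max(np.abs(direct - series))
print(max_error)
```

The truncation order n_max must grow with k r_0, since the Bessel factors only become negligible once |n − m| exceeds k r_0 by a comfortable margin.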

Then the minimum radius r = b (Fig. 9.3) is obtained via (9.5) with a'_n for a_n and r, φ measured from the new origin. The above procedure is generally satisfactory for single convex scatterers. For more complicated obstacles it may be supplemented by drawing a circle of radius d and centre (r_0, φ_0) whose interior lies entirely outside r = a. Inside this new circle

$$E^s = \sum_n c_n J_n(kr')\exp(in\phi)$$

where r' is measured from (r_0, φ_0) and

$$c_n = \sum_m a_m H^{(2)}_{n-m}(kr_0)\exp\{i(n-m)(\pi - \phi_0) - \tfrac{1}{2}m\pi i\}.$$

Now increase r' until a zero of E^i + E^s is discovered; this corresponds to another point on the perimeter. If A(φ) is not given for the whole range |φ| ≤ π but only for φ_1 ≤ φ ≤ φ_2 say, then an approximation A_n to a_n is found by minimizing (§1.5)

$$\int_{\phi_1}^{\phi_2}\Big|A(\phi) - \sum_n A_n\exp(in\phi)\Big|^2 d\phi.$$

Thereafter, the procedure is followed as before with A_n substituting for a_n. It must be realized that it is rarely possible to make observations as complete as required by theory. Measurements can usually be made at only a discrete number of angles and have to be extended by interpolation or other devices so as to cover a continuous range. Variations in time and wavenumber will be similarly incomplete. Therefore estimates deduced from real data are bound to be in error as well as suffering from the uncertainties (which cannot be avoided) from the noise inherent in any measurement. So predictions in inverse scattering theory are always highly idealized. The procedure seems to work tolerably well when a typical dimension of the scatterer is somewhat less than a wavelength but at higher frequencies the computation time becomes excessive and the method of the next section should be tried.
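The least-squares step for a restricted angular range can be sketched as follows. This is a minimal illustration with noise-free synthetic data at discrete angles; the coefficient values, the 120° range and the function name are assumptions chosen for the example, and the integral is replaced by a sum over sample angles.

```python
import numpy as np

def fit_pattern_coefficients(phi, samples, order):
    """Least-squares estimate of A_n, |n| <= order, from samples of A(phi)."""
    orders = np.arange(-order, order + 1)
    design = np.exp(1j * np.outer(phi, orders))   # one column per exp(i n phi)
    coeffs, *_ = np.linalg.lstsq(design, samples, rcond=None)
    return orders, coeffs

# Hypothetical true coefficients, observed only for |phi| <= 120 degrees.
true = {-1: 0.4j, 0: 1.0, 1: -0.3, 2: 0.1}
phi = np.linspace(-2.0 * np.pi / 3.0, 2.0 * np.pi / 3.0, 61)
samples = sum(c * np.exp(1j * n * phi) for n, c in true.items())

orders, fitted = fit_pattern_coefficients(phi, samples, order=2)
for n, c in zip(orders, fitted):
    print(n, np.round(c, 6))
```

With exact data and a model that contains every true order, the fit reproduces the coefficients. The conditioning of the design matrix deteriorates as the angular range narrows or the number of terms grows, which is one reason noisy data limit what can be recovered.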


Exercises

1. From the known analytical solution for a circular cylinder of radius 1/k use the above procedure to locate 8-12 points on the boundary. Repeat the calculation when the far-field data are restricted to |φ| ≤ 120°.
2. By means of one of the methods described in earlier chapters calculate the far field scattered by an elliptic cylinder with semi-axes 1/k and 1/2k. Use this as data for the inverse problem and examine how much inaccuracy is introduced if the data are restricted to |φ| ≤ ½π. Does the orientation of the cylinder have much effect?
3. Repeat Exercise 2 for two circular cylinders, each of radius 1/k, whose centres are separated by 6/k, the line of centres being perpendicular to the direction of propagation of the incident plane wave.
4. Formulate the procedure for three dimensions and test it on a sphere and a spheroid.

9.3 High frequencies

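At high frequencies the location of a perfectly conducting convex obstacle or target is assisted by the approximation of physical optics. The representation of the field is

$$\mathbf{E}^s(P) = \frac{1}{i\omega\varepsilon}(\operatorname{grad}\operatorname{div} + k^2)\int_S \mathbf{n}_q\wedge\mathbf{H}\,\psi(P,q)\,dS_q,$$

where

$$\psi(P,q) = \frac{\exp(-ik|\mathbf{x}_p - \mathbf{x}_q|)}{4\pi|\mathbf{x}_p - \mathbf{x}_q|}$$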

and n_q is the unit normal from the interior of S to the exterior. H is not known exactly on S but, at high frequencies, the reasonable approximation can be made that each small portion will be subject to the incident wave and a secondary field generated by the rest of the surface. To a first approximation the secondary field may be neglected. Thus we may suppose that an incident plane wave illuminates a section of S called the lit side and denoted by L in Fig. 9.4. The remainder of S is dark and denoted by D. On L, n·n_q < 0 where n is the unit vector in the direction of propagation of the plane wave. In the shadow region D the current distribution is assumed to be zero and on the lit side L the approximation

$$\mathbf{n}_q\wedge\mathbf{H} = 2\mathbf{n}_q\wedge\mathbf{H}^i$$

is introduced. Thus

$$\mathbf{E}^s(P) = \frac{2}{i\omega\varepsilon}(\operatorname{grad}\operatorname{div} + k^2)\int_L \mathbf{n}_q\wedge\mathbf{H}^i\,\psi(P,q)\,dS_q,$$

Fig. 9.4. The lit and dark portions of the target.

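with the result that at distant points

$$\mathbf{E}^s(\mathbf{x}) \sim \frac{k^2\exp(-ik|\mathbf{x}|)}{2\pi\varepsilon|\mathbf{x}|}\int_L \{\mathbf{P} - (\mathbf{P}\cdot\hat{\mathbf{x}})\hat{\mathbf{x}}\}\exp(ik\hat{\mathbf{x}}\cdot\mathbf{x}_q)\,dS_q$$

where x̂ is a unit vector in the direction of x and iωP = n_q ∧ H^i. Write

$$\mathbf{E}^s(\mathbf{x}) \sim \mathbf{A}(\mathbf{n},\hat{\mathbf{x}})\,\frac{\exp(-ik|\mathbf{x}|)}{|\mathbf{x}|}$$

so that

$$\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) = \frac{\omega k}{2\pi}\left(\frac{\mu}{\varepsilon}\right)^{1/2}\int_L \{\mathbf{P} - (\mathbf{P}\cdot\hat{\mathbf{x}})\hat{\mathbf{x}}\}\exp(ik\hat{\mathbf{x}}\cdot\mathbf{x}_q)\,dS_q.$$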

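Let the incident plane wave be

$$\mathbf{E}^i = \mathbf{e}_0\exp(-ik\mathbf{n}\cdot\mathbf{x}), \qquad \mathbf{H}^i = \left(\frac{\varepsilon}{\mu}\right)^{1/2}\mathbf{n}\wedge\mathbf{e}_0\exp(-ik\mathbf{n}\cdot\mathbf{x})$$

where e_0 is a fixed unit vector such that n·e_0 = 0. Then

$$\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) = \frac{k}{2\pi i}\int_L \big[(\mathbf{n}_q\cdot\mathbf{e}_0)\mathbf{n} - (\mathbf{n}_q\cdot\mathbf{n})\mathbf{e}_0 - \{(\mathbf{n}_q\cdot\mathbf{e}_0)(\mathbf{n}\cdot\hat{\mathbf{x}}) - (\mathbf{n}_q\cdot\mathbf{n})(\mathbf{e}_0\cdot\hat{\mathbf{x}})\}\hat{\mathbf{x}}\big]\exp\{ik\mathbf{x}_q\cdot(\hat{\mathbf{x}} - \mathbf{n})\}\,dS_q.$$

Now irradiate the target with the plane wave e_0 exp(ikn·x), travelling in the opposite direction to the previous one. Then L and D interchange roles so that

$$\mathbf{A}(-\mathbf{n},\hat{\mathbf{x}}) = -\frac{k}{2\pi i}\int_D \big[(\mathbf{n}_q\cdot\mathbf{e}_0)\mathbf{n} - (\mathbf{n}_q\cdot\mathbf{n})\mathbf{e}_0 - \{(\mathbf{n}_q\cdot\mathbf{e}_0)(\mathbf{n}\cdot\hat{\mathbf{x}}) - (\mathbf{n}_q\cdot\mathbf{n})(\mathbf{e}_0\cdot\hat{\mathbf{x}})\}\hat{\mathbf{x}}\big]\exp\{ik\mathbf{x}_q\cdot(\hat{\mathbf{x}} + \mathbf{n})\}\,dS_q.$$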


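Hence

$$\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) + \{\mathbf{A}(-\mathbf{n},-\hat{\mathbf{x}})\}^* = \frac{k}{2\pi i}\int_S \big[(\mathbf{n}_q\cdot\mathbf{e}_0)\mathbf{n} - (\mathbf{n}_q\cdot\mathbf{n})\mathbf{e}_0 - \{(\mathbf{n}_q\cdot\mathbf{e}_0)(\mathbf{n}\cdot\hat{\mathbf{x}}) - (\mathbf{n}_q\cdot\mathbf{n})(\mathbf{e}_0\cdot\hat{\mathbf{x}})\}\hat{\mathbf{x}}\big]\exp\{ik\mathbf{x}_q\cdot(\hat{\mathbf{x}} - \mathbf{n})\}\,dS_q \tag{9.6}$$

where the asterisk indicates a complex conjugate. Therefore

$$\mathbf{n}\cdot[\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) + \{\mathbf{A}(-\mathbf{n},-\hat{\mathbf{x}})\}^*] = \frac{k}{2\pi i}\int_S \mathbf{n}_q\cdot\big[\{1 - (\mathbf{n}\cdot\hat{\mathbf{x}})^2\}\mathbf{e}_0 + (\mathbf{e}_0\cdot\hat{\mathbf{x}})(\mathbf{n}\cdot\hat{\mathbf{x}})\mathbf{n}\big]\exp\{ik\mathbf{x}_q\cdot(\hat{\mathbf{x}} - \mathbf{n})\}\,dS_q.$$

By the divergence theorem

$$\mathbf{n}\cdot[\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) + \{\mathbf{A}(-\mathbf{n},-\hat{\mathbf{x}})\}^*] = \frac{k^2}{2\pi}\,\mathbf{e}_0\cdot\hat{\mathbf{x}}\,(1 - \mathbf{n}\cdot\hat{\mathbf{x}})\int_V \exp\{ik\mathbf{x}_q\cdot(\hat{\mathbf{x}} - \mathbf{n})\}\,dV \tag{9.7}$$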

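where V is the volume enclosed by S. Eqn (9.7) constitutes a basic identity (Bojarski 1967; Lewis 1969; Perry 1974; Bleistein and Bojarski 1975) from which the shape of the obstacle is obtained. Let H_V(x) be unity when x is in V and zero elsewhere. Then (9.7) can be written as (9.8) where

$$B = \frac{2\pi\,\mathbf{n}\cdot[\mathbf{A}(\mathbf{n},\hat{\mathbf{x}}) + \{\mathbf{A}(-\mathbf{n},-\hat{\mathbf{x}})\}^*]}{k^2\,\mathbf{e}_0\cdot\hat{\mathbf{x}}\,(1 - \mathbf{n}\cdot\hat{\mathbf{x}})}, \qquad \boldsymbol{\varrho} = k(\mathbf{n} - \hat{\mathbf{x}})$$

and the integration in (9.8) is over all space. Regarding B as known from far-field measurements we can view (9.8) as giving the Fourier transform of H_V, i.e. H_V can be found from

$$H_V(\mathbf{x}) = \frac{1}{8\pi^3}\int_{-\infty}^{\infty} B\exp(i\boldsymbol{\varrho}\cdot\mathbf{x})\,d\boldsymbol{\varrho}. \tag{9.9}$$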

Consequently H_V can be determined and the location of the target thereby discovered provided that B can be calculated for all ϱ. All values of ϱ can be covered by picking x̂ to differ from n by a small vector in a chosen direction and then allowing k to range over large values. The advantage of this choice is that physical optics can be expected to be a good approximation at high


frequencies for observations within a small cone about the direction of propagation of the incident plane wave. Alternatively, we can set x̂ = −n so that only observations of backscattering are involved and again physical optics may be anticipated to furnish good results. Unfortunately, (9.7) is then nugatory since both sides vanish identically. However, by scalar multiplication of (9.6) by e_0 we can still retain (9.8) by putting

$$B = \frac{2\pi\,\mathbf{e}_0\cdot[\mathbf{A}(\mathbf{n},-\mathbf{n}) + \{\mathbf{A}(-\mathbf{n},\mathbf{n})\}^*]}{k^2}. \tag{9.10}$$

Nevertheless, a large number of measurements of the amplitude and phase of the scattered wave will still be required. A check on accuracy is provided by the fact that strictly H_V should take only the values 0 or 1. If, therefore, (9.9) supplies an H_V which differs significantly from either of these values it is clear that errors have been committed. In particular, this will be true if the evaluation of the integral in (9.8) leads to an appreciable imaginary part. Actually, it is easier to compute n·grad H_V in which the integrand of (9.8) is multiplied by iϱ·n, eliminating a small denominator in B when n is near x̂. Since grad H_V vanishes except on S a trace of the boundary is obtained. Usually, real data will be band limited so that the resolution of the boundary will be no better than a half wavelength. An obvious disadvantage of the above technique is the requirement to make observations for all values of ϱ. The procedure can be adapted to obviate this difficulty at the cost of solving an integral equation. Suppose that measurements are possible only for ϱ ranging over a volume T. Then, instead of B being known for all ϱ, only B H_T(ϱ) is available where H_T is unity for ϱ in T and zero elsewhere. Since B satisfies (9.8), the Fourier convolution theorem gives (9.11) where

$$h_T(\mathbf{x}) = \frac{1}{8\pi^3}\int_{-\infty}^{\infty} H_T(\boldsymbol{\varrho})\exp(i\boldsymbol{\varrho}\cdot\mathbf{x})\,d\boldsymbol{\varrho}.$$

Since the left-hand side of (9.11) and h_T are both known, (9.11) is an integral equation to determine H_V. It is an integral equation of the first kind with kernel h_T. Such integral equations are generally ill posed and there is no guarantee that a solution exists. Even if a solution exists it may not be unique and may not depend continuously on the data. Notwithstanding, it can be tackled by methods of preceding chapters and some success has been reported (Bojarski 1974) in obtaining numerical solutions. The method can be modified to deal with the detection of protrusions on plane sheets when observations are limited to one side (Bleistein 1976).
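The effect of band limiting on the recovered characteristic function can be seen in a one-dimensional analogue. The sketch below is an illustration under stated assumptions, not the three-dimensional procedure: the target is an interval, its transform is known in closed form, and the inverse integral is truncated at a hypothetical maximum wavenumber rho_max.

```python
import numpy as np

def transform_of_interval(rho, centre, half_width):
    """B(rho): integral of exp(-i rho x) over |x - centre| <= half_width."""
    # np.sinc(u) = sin(pi u) / (pi u), so this equals 2 sin(rho w) / rho
    return (2.0 * half_width * np.sinc(rho * half_width / np.pi)
            * np.exp(-1j * rho * centre))

def band_limited_inverse(x, centre, half_width, rho_max, n_rho=4001):
    """(1 / 2 pi) * integral of B exp(i rho x) over |rho| <= rho_max."""
    rho = np.linspace(-rho_max, rho_max, n_rho)
    weights = np.full(n_rho, rho[1] - rho[0])
    weights[[0, -1]] *= 0.5                       # trapezoidal end weights
    integrand = transform_of_interval(rho, centre, half_width) * np.exp(1j * rho * x)
    return np.sum(weights * integrand) / (2.0 * np.pi)

# Interval of half-width 1 centred at x = 0.5, data available for |rho| <= 40.
inside = band_limited_inverse(0.5, centre=0.5, half_width=1.0, rho_max=40.0)
outside = band_limited_inverse(3.5, centre=0.5, half_width=1.0, rho_max=40.0)
print(inside, outside)
```

The reconstruction is close to 1 inside the interval and close to 0 outside, with a negligible imaginary part, which is the accuracy check described in the text. The edges are smeared over a distance of order 1/rho_max, so a narrower band gives a coarser trace of the boundary.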


Exercises

5. From the known solution for a circular cylinder of radius 20/k use the above method to estimate the boundary. Repeat the calculation if the data are restricted to a wedge of 120°.
6. Repeat Exercise 2 with semi-axes of 20/k and 10/k.
7. Repeat Exercise 3 with the radii of the cylinders 10/k and their centres separated by 50/k.
8. Repeat Exercise 5 for a sphere of radius 10/k.
9. Test the above procedure on (a) a spheroid, (b) a parallelepiped.

9.4 Scattering in the time domain

Undoubtedly, one of the simplest methods of getting a fair idea of the shape of an obstacle in a homogeneous medium is to irradiate it with a sharp pulse and observe the backscatter. Then the time taken for the first return determines the nearest point of the obstacle. In other words it fixes a tangent plane to the target. By sending in the pulse from different directions one can delimit the target by a set of tangent planes. Additional information about prominent features of the obstacle may be deducible also from a study of the time history of the return. To obtain the complete boundary would require illumination from all possible directions; this could demand a prohibitive expenditure of effort. However, if a priori knowledge of the shape were available, such as the obstacle being spheroidal, the task would be lightened appreciably as it would if the only objective were the detection of a particular feature. More generally, the technique of the previous section is also capable of coping with pulse problems (Bleistein 1976a). Define the time transform F(ω) of f(t) by

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\exp(-i\omega t)\,dt.$$

For real f this implies that

$$\{F(\omega)\}^* = F(-\omega) \tag{9.12}$$

which enables F to be determined for negative values of ω from its behaviour for positive ω. Further

$$f(t) = \frac{1}{\pi}\,\Re\int_0^{\infty} F(\omega)\exp(i\omega t)\,d\omega. \tag{9.13}$$

When backscattering data are utilized, x̂ = −n and the functional dependence of B may be denoted by B(n, ω). From the definition of A it follows that B satisfies (9.12) whether given by (9.8) or (9.10). Therefore, introduce a real function b such that

$$\omega^2 B(\mathbf{n},\omega) = \int_{-\infty}^{\infty} b(\mathbf{n},t)\exp(-i\omega t)\,dt. \tag{9.14}$$


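Now, the left-hand side of (9.9) is real so that we can write

$$H_V(\mathbf{x}) = \frac{1}{8\pi^3}\,\Re\int_{-\infty}^{\infty} B(\mathbf{n},\omega)\exp\left(\frac{2i\omega\,\mathbf{n}\cdot\mathbf{x}}{c}\right)d\boldsymbol{\varrho}$$

where c = 1/(με)^{1/2}. By carrying out the integration with respect to ω, employing (9.13), we have

$$H_V(\mathbf{x}) = \frac{1}{\pi^2c^3}\int_\Omega b\left(\mathbf{n},\frac{2\mathbf{n}\cdot\mathbf{x}}{c}\right)d\Omega \tag{9.15}$$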

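where dΩ is the surface element of the unit sphere. Eqn (9.15) offers the possibility of making observations in the time domain if b can be related to the time-varying field. Otherwise, a Fourier transform will be necessary so that the results of the preceding section can be applied in the frequency domain. It is convenient to deduce the relation for b from an exact result which will now be derived. Suppose the excitation in the time domain is such that the incident electric intensity near the target is the plane wave

$$\mathbf{e}^i = \mathbf{e}_0 f\left(t - \frac{\mathbf{n}\cdot\mathbf{x}}{c}\right).$$

Then in the frequency domain the incident electric intensity is E^i = F(ω)e_0 exp(−ikn·x) and

$$\mathbf{E}^s(\mathbf{x}) \sim F(\omega)\mathbf{A}(\mathbf{n},\hat{\mathbf{x}},\omega)\,\frac{\exp(-ik|\mathbf{x}|)}{|\mathbf{x}|},$$

the dependence of A on ω now being explicitly indicated. This holds for positive ω and is extended to negative values by means of (9.12). Hence, in the time domain the distant scattered field is given by

$$\mathbf{e}^s(\mathbf{x},t) \sim \frac{\mathbf{e}(\mathbf{n},\hat{\mathbf{x}},t - |\mathbf{x}|/c)}{|\mathbf{x}|},$$

where

$$\mathbf{e}(\mathbf{n},\hat{\mathbf{x}},t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\mathbf{A}(\mathbf{n},\hat{\mathbf{x}},\omega)\exp(i\omega t)\,d\omega \tag{9.16}$$

or, in terms of positive ω only,

$$\mathbf{e}(\mathbf{n},\hat{\mathbf{x}},t) = \Re\,\frac{1}{\pi}\int_0^{\infty} F(\omega)\mathbf{A}(\mathbf{n},\hat{\mathbf{x}},\omega)\exp(i\omega t)\,d\omega. \tag{9.17}$$

If

$$\mathbf{A}(\mathbf{n},\hat{\mathbf{x}},\omega) = \int_{-\infty}^{\infty} \mathbf{a}(\mathbf{n},\hat{\mathbf{x}},t)\exp(-i\omega t)\,dt$$

(9.16) can be expressed in the convolution form

$$\mathbf{e}(\mathbf{n},\hat{\mathbf{x}},t) = \int_{-\infty}^{\infty} f(t - \tau)\,\mathbf{a}(\mathbf{n},\hat{\mathbf{x}},\tau)\,d\tau. \tag{9.18}$$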


Eqns (9.16)-(9.18) are equivalent forms of the exact relation referred to above. When f(t) = δ(t) or F(ω) = 1, e(n, x̂, t) = a(n, x̂, t). Thus, in general, (9.18) shows that the scattered wave in the time domain is the convolution of the incident signal waveform f and the scattered waveform for an impulsive source. We also conclude that a(n, x̂, t) = 0 for t < 0 to be consistent with the finite rate of spreading of the disturbance caused by the impulsive source. When the approximation of physical optics is made, (9.14) and (9.10) imply that

$$b(\mathbf{n},t) = \begin{cases} 2\pi c^2\,\mathbf{e}_0\cdot\mathbf{a}(\mathbf{n},-\mathbf{n},t) & (t > 0) \\ 2\pi c^2\,\mathbf{e}_0\cdot\mathbf{a}(-\mathbf{n},\mathbf{n},-t) & (t < 0). \end{cases} \tag{9.19}$$

Substitution in (9.18) supplies

$$\mathbf{e}_0\cdot\mathbf{e}(\mathbf{n},\hat{\mathbf{x}},t) = \frac{1}{2\pi c^2}\int_0^{\infty}\{f(t - \tau)\,b(\mathbf{n},\tau) + f(t + \tau)\,b(-\mathbf{n},\tau)\}\,d\tau. \tag{9.20}$$

The left-hand side is known by measurement and f is prescribed by the incident wave so that (9.20) constitutes an integral equation to determine b. Once solved the position of the obstacle is provided by (9.15). Should the incident source be impulsive (9.20) does not have to be solved because a can be measured directly and b found from (9.19).
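The convolution relation (9.18) is easy to explore numerically for a single scalar component. The sketch below is illustrative only: the causal impulse response a(t) = t exp(−t) and the narrow Gaussian pulse are made-up stand-ins, and the integral is approximated by a discrete convolution on a uniform time grid.

```python
import numpy as np

dt = 0.005
t = np.arange(0.0, 10.0, dt)

# Hypothetical causal impulse response of the target (zero for t < 0).
a = t * np.exp(-t)

# Incident waveform: a unit-area Gaussian pulse centred at t0, standing in
# for a slightly smeared, delayed impulse.
t0, sigma = 1.0, 0.03
f = np.exp(-0.5 * ((t - t0) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# e(t) = integral of f(t - tau) a(tau) d tau, approximated on the grid.
e = np.convolve(f, a)[: t.size] * dt

index = round(3.0 / dt)                 # sample at t = 3
print(e[index], a[round(2.0 / dt)])     # response should resemble a(t - t0)
```

For a sufficiently narrow pulse the scattered waveform is close to the impulse response delayed by t0, which is the sense in which an impulsive source lets a be measured directly. For broader pulses the same arrays define a discrete deconvolution problem, which inherits the ill-conditioning of the integral equation.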

Exercises

10. If f(t) = δ'(t) find b in terms of e and see what formula results for grad H_V(x).
11. Locate a sphere by examining the scattering in the time domain of (a) an impulsive plane wave, (b) a Gaussian plane wave pulse.

9.5 Moving targets

The exact formula (9.16) refers to a target which is stationary with respect to the source of the irradiating field. If the source or target is moving it needs modification. The case of an obstacle travelling with constant velocity while the source is fixed will be considered here. Formulae when the roles of the obstacle and source are interchanged can be dealt with by a simple transformation. If, however, different parts of the system possess different velocities a more elaborate attack has to be mounted (Jones 1964, 1977, Ffowcs Williams and Hawkings 1969). Let x, t be the coordinate system in which the sources are at rest and x', t' a system in which the target is stationary. The two coordinate systems are related by a Lorentz transformation. It will be assumed that the spatial axes of the two systems are aligned and that the target moves along the positive x_1 axis with speed v. By writing formulae in terms of the velocity v, however, the special choice of the axes can be eliminated.


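The Lorentz transformation can be expressed as

$$x'_1 = \beta(x_1 - vt), \qquad x'_2 = x_2, \qquad x'_3 = x_3, \qquad t' = \beta\left(t - \frac{vx_1}{c^2}\right), \tag{9.21}$$

$$x_1 = \beta(x'_1 + vt'), \qquad t = \beta\left(t' + \frac{vx'_1}{c^2}\right) \tag{9.22}$$

where β = (1 − v²/c²)^{−1/2}.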

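In the x, t system the incident wave is

$$\mathbf{e}^i = \mathbf{e}_0 f(t - \mathbf{n}\cdot\mathbf{x}/c), \qquad \mathbf{h}^i = (\varepsilon/\mu)^{1/2}\,\mathbf{n}\wedge\mathbf{e}^i.$$

On substitution from (9.22)

$$f\left(t - \frac{\mathbf{n}\cdot\mathbf{x}}{c}\right) = f\left\{\gamma\left(t' - \frac{\mathbf{n}'\cdot\mathbf{x}'}{c}\right)\right\} = g\left(t' - \frac{\mathbf{n}'\cdot\mathbf{x}'}{c}\right)$$

where

$$\gamma = \beta\left(1 - \frac{\mathbf{n}\cdot\mathbf{v}}{c}\right), \qquad n'_1 = \frac{\beta(n_1 - v/c)}{\gamma}, \qquad n'_2 = \frac{n_2}{\gamma}, \qquad n'_3 = \frac{n_3}{\gamma}. \tag{9.23}$$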

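In the x', t' system the incident wave is

$$\mathbf{e}^{i\prime} = \beta\mathbf{e}^i + (1 - \beta)\frac{(\mathbf{e}^i\cdot\mathbf{v})\mathbf{v}}{v^2} + \beta\mu\,\mathbf{v}\wedge\mathbf{h}^i = \mathbf{e}'_0\,g\left(t' - \frac{\mathbf{n}'\cdot\mathbf{x}'}{c}\right)$$

where

$$\mathbf{e}'_0 = \gamma\mathbf{e}_0 + (\mathbf{e}_0\cdot\mathbf{v})\left\{(1 - \beta)\frac{\mathbf{v}}{v^2} + \frac{\beta}{c}\mathbf{n}\right\}. \tag{9.24}$$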

Thus if the incident plane wave has angular frequency ω in the x, t space it appears to have angular frequency γω in the system in which the target is stationary. This change in frequency represents the Doppler shift due to motion. In addition, the apparent direction of propagation is n' instead of n. As is clear from (9.24) the polarization of the wave is also altered (unless e_0 is perpendicular to v); this is to accommodate not only the change in direction of propagation but also the interdependence of electric and magnetic fields. Indicate explicitly the dependence of A on the polarization by writing A(e_0, n, x̂, ω). In the x', t' system the obstacle is motionless so that (9.16) can be applied. Hence

$$\mathbf{e}^{s\prime}(\mathbf{x}',t') \sim \frac{\mathbf{e}'(\mathbf{n}',\hat{\mathbf{x}}',t' - |\mathbf{x}'|/c)}{|\mathbf{x}'|}$$

where

$$\mathbf{e}'(\mathbf{n}',\hat{\mathbf{x}}',t') = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\omega)\mathbf{A}(\mathbf{e}'_0,\mathbf{n}',\hat{\mathbf{x}}',\omega)\exp(i\omega t')\,d\omega.$$


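Now

$$\gamma G(\omega) = F\left(\frac{\omega}{\gamma}\right)$$

and, from (9.21),

$$t' - \frac{|\mathbf{x}'|}{c} = t' - \frac{\hat{\mathbf{x}}'\cdot\mathbf{x}'}{c} = \frac{1}{\gamma_1}\left(t - \frac{\hat{\mathbf{x}}\cdot\mathbf{x}}{c}\right)$$

where

$$\gamma_1 = \beta\left(1 - \frac{\hat{\mathbf{x}}\cdot\mathbf{v}}{c}\right), \qquad \hat{x}'_1 = \frac{\beta(\hat{x}_1 - v/c)}{\gamma_1}, \qquad \hat{x}'_2 = \frac{\hat{x}_2}{\gamma_1}, \tag{9.25}$$

in analogy with (9.23). Hence

$$\mathbf{e}'\left(\mathbf{n}',\hat{\mathbf{x}}',t' - \frac{|\mathbf{x}'|}{c}\right) = \mathbf{e}_1\left(\mathbf{n}',\hat{\mathbf{x}}',t - \frac{|\mathbf{x}|}{c}\right)$$

with

$$\mathbf{e}_1(\mathbf{n}',\hat{\mathbf{x}}',t) = \frac{\gamma_1}{2\pi\gamma}\int_{-\infty}^{\infty} F\left(\frac{\gamma_1\omega}{\gamma}\right)\mathbf{A}(\mathbf{e}'_0,\mathbf{n}',\hat{\mathbf{x}}',\gamma_1\omega)\exp(i\omega t)\,d\omega. \tag{9.26}$$

The distant field observed in the x, t system is now obtained from

$$\mathbf{e}^s(\mathbf{x},t) = \beta\mathbf{e}^{s\prime} + (1 - \beta)\frac{(\mathbf{e}^{s\prime}\cdot\mathbf{v})\mathbf{v}}{v^2} - \frac{\beta}{c}\,\mathbf{v}\wedge(\hat{\mathbf{x}}'\wedge\mathbf{e}^{s\prime}). \tag{9.27}$$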

Comparison of (9.26) with (9.16) reveals that (9.26) reduces to (9.16) when the target is immobile so that v = 0, β = 1 = γ = γ_1, x̂ = x̂', n = n'. Also (9.26) displays evidence of the effect of motion in four places. There is a frequency shift in the incident signal waveform due to the factor γ_1/γ. The scattering amplitude A has a frequency shift caused by the presence of γ_1ω, an angular distortion due to n', x̂' replacing n, x̂ and a polarization warping because e'_0 substitutes for e_0. An additional polarization twist also occurs in the conversion back to source coordinates through (9.27). The whole of (9.27) can be expressed in terms of x, t by means of (9.23)-(9.26). The integral in (9.26) can be written in forms analogous to (9.17) and (9.18) if desired. For further details and experimental measurements see Vogel (1991).
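The Doppler factor and the aberrated direction of (9.23) can be evaluated directly. The following sketch assumes, as in the text, that the target moves along the positive x_1 axis; the function name and the sample values are hypothetical.

```python
import math

def target_frame_direction(n, v_over_c):
    """Return (gamma, n') for unit propagation direction n and speed v along x1.

    gamma multiplies the angular frequency seen in the target frame and
    n' is the apparent direction of propagation in that frame.
    """
    beta = 1.0 / math.sqrt(1.0 - v_over_c ** 2)
    gamma = beta * (1.0 - n[0] * v_over_c)
    n_prime = (beta * (n[0] - v_over_c) / gamma, n[1] / gamma, n[2] / gamma)
    return gamma, n_prime

gamma, n_prime = target_frame_direction((0.6, 0.8, 0.0), 0.5)
norm_squared = sum(component ** 2 for component in n_prime)
print(gamma, n_prime, norm_squared)

# With the target at rest nothing changes.
print(target_frame_direction((0.6, 0.8, 0.0), 0.0))
```

The direction n' comes out as a unit vector, as it must for a plane wave in the target frame, and gamma is less than 1 here because the target recedes from the wave (n·v > 0), so the incident frequency is shifted downwards.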

THE INVERSE SOURCE PROBLEM

9.6 Harmonic sources

The assumption made in this section is that the observed radiation is to be accounted for by a distribution of sources placed in a homogeneous isotropic medium. Then a solution of (9.1) is sought such that j is confined to some part

Fig. 9.5. The inverse-source problem.

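of the interior T of a closed surface S. The tangential components of both E and H are to be measured on S and the aim is to determine j from these data (Fig. 9.5). Now

$$\mathbf{E}(\mathbf{x}) = (\operatorname{grad}\operatorname{div} + k^2)\frac{1}{i\omega\varepsilon}\int_T \mathbf{j}(\mathbf{x}_q)\psi(P,q)\,d\mathbf{x}_q, \tag{9.28}$$

$$\mathbf{H}(\mathbf{x}) = \operatorname{curl}\int_T \mathbf{j}(\mathbf{x}_q)\psi(P,q)\,d\mathbf{x}_q, \tag{9.29}$$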

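where x or P denotes the point of observation and ψ(P, q) = exp(−ikr)/4πr with r the distance between P and q. Since j is non-zero only in a portion of T there is no loss in making the volume of integration the whole of T. The fields inside S can be exposed by means of volume and surface integrals. Although it is customary to base such a representation on retarded potentials, it will prove to be advantageous here to employ advanced potentials. The appropriate relations are

$$\mathbf{E}(\mathbf{x}) = (\operatorname{grad}\operatorname{div} + k^2)\frac{1}{i\omega\varepsilon}\left\{\int_T \mathbf{j}(\mathbf{x}_q)\psi^*(P,q)\,d\mathbf{x}_q - \int_S \mathbf{n}_q\wedge\mathbf{H}\,\psi^*(P,q)\,dS_q\right\} - \operatorname{curl}\int_S \mathbf{n}_q\wedge\mathbf{E}\,\psi^*(P,q)\,dS_q, \tag{9.30}$$

$$\mathbf{H}(\mathbf{x}) = \operatorname{curl}\left\{\int_T \mathbf{j}(\mathbf{x}_q)\psi^*(P,q)\,d\mathbf{x}_q - \int_S \mathbf{n}_q\wedge\mathbf{H}\,\psi^*(P,q)\,dS_q\right\} + \frac{1}{i\omega\mu}(\operatorname{grad}\operatorname{div} + k^2)\int_S \mathbf{n}_q\wedge\mathbf{E}\,\psi^*(P,q)\,dS_q, \tag{9.31}$$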


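where n_q is the unit normal out of S. The surface integrals in (9.30) and (9.31) cannot be reckoned to vanish because advanced potentials have been utilized. They are, however, known from the measurements of the tangential components made on S. Hence

$$\mathbf{E}(\mathbf{x}) = (\operatorname{grad}\operatorname{div} + k^2)\frac{1}{i\omega\varepsilon}\int_T \mathbf{j}(\mathbf{x}_q)\psi^*(P,q)\,d\mathbf{x}_q + \mathbf{g}(\mathbf{x}), \tag{9.32}$$

$$\mathbf{H}(\mathbf{x}) = \operatorname{curl}\int_T \mathbf{j}(\mathbf{x}_q)\psi^*(P,q)\,d\mathbf{x}_q + \mathbf{h}(\mathbf{x}) \tag{9.33}$$

where g and h are known. Combining (9.32) with (9.28) and (9.33) with (9.29) we obtain

$$(\operatorname{grad}\operatorname{div} + k^2)\frac{1}{i\omega\varepsilon}\int_T \mathbf{j}(\mathbf{x}_q)\{\psi^*(P,q) - \psi(P,q)\}\,d\mathbf{x}_q + \mathbf{g}(\mathbf{x}) = 0, \tag{9.34}$$

$$\operatorname{curl}\int_T \mathbf{j}(\mathbf{x}_q)\{\psi^*(P,q) - \psi(P,q)\}\,d\mathbf{x}_q + \mathbf{h}(\mathbf{x}) = 0. \tag{9.35}$$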

These two integral equations determine j from the given data. Notice that they are exact and do not require measurements to be made in the far field or the approximation of physical optics to be adopted. The integral equations are not independent because (9.35) is the curl of (9.34) since curl g + iωμh = 0. Nevertheless, they serve as useful verifiers of one another in numerical work where their relationship will not be exact. The kernel of (9.34) (or (9.35)) is non-singular; indeed

$$\psi^*(P,q) - \psi(P,q) = \frac{ik}{2\pi}\,j_0(kr)$$

where j_0 is the spherical Bessel function. Nevertheless, (9.34) is an ill-posed and ill-conditioned integral equation. Despite this fact, experiments with synthetic data (i.e. generated analytically) suggest that a tolerable if not very precise indication of the source distribution can be achieved (Bojarski 1974).
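The non-singular character of the kernel is simple to confirm numerically. The sketch below compares ψ* − ψ, formed directly from ψ = exp(−ikr)/4πr, with the closed form (ik/2π) j_0(kr), where j_0(x) = sin x / x; the wavenumber and distances are arbitrary test values and the function names are hypothetical.

```python
import cmath
import math

def kernel_difference(k, r):
    """psi* - psi computed directly from psi = exp(-i k r) / (4 pi r), r > 0."""
    psi = cmath.exp(-1j * k * r) / (4.0 * math.pi * r)
    return psi.conjugate() - psi

def kernel_closed_form(k, r):
    """(i k / 2 pi) j0(k r), which stays finite as r tends to zero."""
    x = k * r
    j0 = math.sin(x) / x if x != 0.0 else 1.0
    return 1j * k * j0 / (2.0 * math.pi)

k = 3.0
errors = [abs(kernel_difference(k, r) - kernel_closed_form(k, r))
          for r in (1e-3, 0.1, 1.0, 7.5)]
print(errors)
print(kernel_closed_form(k, 0.0))   # limiting value i k / (2 pi)
```

Although each of ψ and ψ* is singular at r = 0, their difference tends to ik/2π. The smoothness of this kernel is what makes the integral equation of the first kind so sensitive to errors in the data.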

Exercises

12. Place an electric dipole at the origin and then calculate g and h if S is a sphere of radius a with centre at the origin. Then solve (9.34) and (9.35) numerically to see how well they predict the source.
13. Repeat Exercise 12 for a magnetic dipole.
14. Repeat Exercise 12 with two parallel electric dipoles on the z axis and equidistant from the origin.
15. If h = 0 and T is a sphere of radius a find the solutions of (9.35) by using the addition theorem to expand j_0 in a series of spherical harmonics. (This provides confirmation of the non-uniqueness referred to already.)
16. If T is a sphere of radius a, find the values of λ for which

$$\int_T \mathbf{j}(\mathbf{x}_q)\,j_0(kr)\,d\mathbf{x}_q = \lambda\,\mathbf{j}(\mathbf{x})$$

inside T and show that they can be expressed as

$$2\pi a^3\big[\{j_m(ka)\}^2 - j_{m-1}(ka)\,j_{m+1}(ka)\big]$$

for m = 0, 1, 2, .... As m → ∞ show that these eigenvalues tend to zero rapidly (like m^{−2m−3}), thereby verifying that (9.34) and (9.35) are ill conditioned and ill posed.

If the tangential components of E and H can be observed only over part of S, say S_1, and not over the remainder S_2 put

$$\mathbf{g} = \mathbf{g}_1 + \mathbf{g}_2 = \int_{S_1} + \int_{S_2}.$$

Now g_2 is unknown. However, it can be expressed in terms of the unknown j by finding E and H on S_2 from (9.28) and (9.29). Then (9.34) becomes an integral equation of the first kind for j in which the kernel also contains an integral over S_2. While the problem can be tackled in principle the integral equation is likely to become more ill behaved as S_1 shrinks in size.

9.7 Inhomogeneities

The method of the preceding section is also adaptable to the investigation of finite regions of inhomogeneity in an otherwise homogeneous medium. Suppose, in particular, that the permittivity is ε + ε' where ε' is non-zero only over a finite region. Excite the medium with a known wave E^i, e.g. a plane wave. Then the scattered wave satisfies (9.1) with j = iωε'(E^i + E^s). The current density j is determined from (9.34) or (9.35) and then E^s can be calculated from (9.28). Since E^i is known it follows that ε' can be computed. Variations in μ can also be coped with by using representations of the electromagnetic field in terms of magnetic current (cf. §7.2). The method does not seem to be very satisfactory from a numerical point of view and other modes of tackling the problem have been put forward (Cohen and Bleistein 1977). For scalar problems one suggestion (Bates, Boerner, and Dunlop 1976) has been the introduction of the Rytov approximation in which E = E^i E' and the equation for E' handled. For stratified media where the variations are one dimensional more intricate analysis is feasible (Meyer 1975, 1975a, 1976; Bates and Wall 1976). Another technique, which is applicable to the general inverse source problem, is to expand j in a finite series of known functions such as polynomials but with coefficients to be discovered. The resulting field on S can then be calculated from (9.28) and (9.29). The coefficients are now found by minimizing the sum of the squares of the differences between the predicted and measured tangential


components of E and H on S. In general, this will involve an optimization technique such as conjugate gradients as described in Chapter 4. It is likely that this method will be superior numerically to the integral equation approach if the expansion functions are well chosen. Simultaneous probing at several frequencies is also possible with this approach.
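The expansion approach can be illustrated with a scalar, one-dimensional stand-in for (9.28). In the sketch below a line source on the z axis has a polynomial density with unknown coefficients, the field is sampled at points on a surrounding circle, and the coefficients are recovered by linear least squares. The geometry, the coefficient values and the function name are all assumptions made for the example, and because the toy problem is linear in the coefficients an ordinary least-squares solve takes the place of the conjugate-gradient optimization mentioned in the text.

```python
import numpy as np

def field_matrix(sensors, nodes, weights, k, n_terms):
    """Column c holds the field at each sensor due to the source density z**c."""
    # distance from every sensor (x, z) to every quadrature node on the z axis
    r = np.sqrt(sensors[:, [0]] ** 2 + (sensors[:, [1]] - nodes[None, :]) ** 2)
    green = np.exp(-1j * k * r) / (4.0 * np.pi * r)
    powers = nodes[None, :] ** np.arange(n_terms)[:, None]   # (n_terms, n_nodes)
    return (green * weights) @ powers.T

k = 2.0 * np.pi                                     # unit wavelength
nodes, weights = np.polynomial.legendre.leggauss(40)
nodes, weights = 0.5 * nodes, 0.5 * weights         # source occupies |z| <= 0.5

angles = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
sensors = 2.0 * np.column_stack([np.cos(angles), np.sin(angles)])   # (x, z) pairs

true_coeffs = np.array([1.0, -0.5, 2.0])            # density 1 - 0.5 z + 2 z**2
matrix = field_matrix(sensors, nodes, weights, k, n_terms=3)
measured = matrix @ true_coeffs                     # synthetic, noise-free data

estimate, *_ = np.linalg.lstsq(matrix, measured, rcond=None)
print(np.round(estimate, 8))
```

With exact synthetic data the coefficients are reproduced. Adding noise to `measured` or raising `n_terms` shows how quickly the estimate degrades, which bears on the remark that the expansion functions need to be well chosen.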

Exercises

17. Repeat Exercises 12-14 using the optimization technique with the expansion functions chosen as polynomials. Do you think some other choice would give better results?
18. Choose a simple ε' and compute the field scattered from an incident plane wave. Use the values as data to determine ε' by (a) integral equations, (b) the optimization technique. Which do you think is the better method? What happens if you vary the frequency?
19. In fields which depend only on z, μ and ε have the constant values μ_0 and ε_0 except for 0 ≤ z ≤ 1 where ε = ε_0 e^z. Use the known form of ε to generate data for the inverse problem and see how accurate your solution to the inverse problem is. Consider a number of frequencies. Repeat the investigation for (i) ε = ε_0(1 + z), (ii) ε = ε_0(1 + sin πz), (iii) ε = ε_0(1 + sinh 2z).

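9.8 Statistical considerations

Many natural sources change their positions and frequencies with time. The assumption of monochromatic radiation is then inappropriate. While it would be possible to contemplate the formulation of the integral equation (9.34) in the time domain, it is doubtful whether sufficient measurements could be undertaken to make the calculation of the corresponding $g$ a practical possibility. For the inverse source problem an outgoing solution of

$$\operatorname{curl}\mathbf{E} + \mu\frac{\partial\mathbf{H}}{\partial t} = 0, \qquad \operatorname{curl}\mathbf{H} - \varepsilon\frac{\partial\mathbf{E}}{\partial t} = \mathbf{J}(\mathbf{x}, t)$$

is required. Assuming that $\mathbf{J}$ is zero outside some finite region, we have from eqn (2) of Chapter 7

$$\mathbf{E}(\mathbf{x}, t) = -\frac{\mu}{4\pi}\frac{\partial}{\partial t}\int_{-\infty}^{\infty}\frac{\mathbf{J}(\mathbf{x}', t - |\mathbf{x} - \mathbf{x}'|/c)}{|\mathbf{x} - \mathbf{x}'|}\,d\mathbf{x}' - \frac{1}{4\pi\varepsilon}\operatorname{grad}\int_{-\infty}^{\infty}\frac{\rho(\mathbf{x}', t - |\mathbf{x} - \mathbf{x}'|/c)}{|\mathbf{x} - \mathbf{x}'|}\,d\mathbf{x}' \tag{9.36}$$

where the integration is over all space and

$$\operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t} = 0.$$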


If random processes are operating it will usually be necessary to consider averages for comparison with experimental results. It is customary to regard $\mathbf{E}$ as a typical member of an ensemble of functions which characterize the statistical properties of the process. Common practice is to suppose that the ensembles which are encountered in electromagnetism are stationary and ergodic. Roughly speaking, an ensemble is stationary if all ensemble averages are independent of the time origin. Likewise, an ergodic process is one in which each ensemble average agrees with the corresponding time average for a typical member of the ensemble. Thus, in stationary ergodic random processes attention can be restricted to time averages and the absolute position of the time origin is irrelevant.

Any practical disturbance will exist only during some finite time interval, say $-T \le t \le T$. It is mathematically desirable to allow $T \to \infty$ so that the assumption of stationarity can be invoked. For this reason the time average is defined by

$$\langle h \rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} h(t)\,dt.$$

It is clear at once that, if this limit is non-zero, $\int_{-\infty}^{\infty} h(t)\,dt$ must be divergent. Since we expect to deal with non-zero averages this difficulty is bound to arise and there will be technical problems with Fourier analysis. To surmount these problems we work with truncated functions such as $h_T$ where

$$h_T = \begin{cases} h(t) & (|t| \le T) \\ 0 & (|t| > T), \end{cases}$$

carry out the desired procedures with $h_T$, and then proceed to the limit as $T \to \infty$. Provided that suitable smoothing is accomplished by ensemble averaging, the limit customarily exists and is the same as would be achieved by direct formal Fourier analysis. Accordingly, the technicalities will be ignored in the following and Fourier transforms will be manipulated as if there were no difficulties.

On this understanding the average

$$\langle \mathbf{E}(\mathbf{x}_1, t_1 + \tau)\cdot\mathbf{E}(\mathbf{x}_2, t_2 + \tau)\rangle$$

can be introduced. In view of the assumption that the random process is stationary, only the time difference $t_2 - t_1$ will occur. It is therefore sufficient to concentrate on

$$\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_2, t + \tau)\rangle$$

which is known as the cross-correlation function at $\mathbf{x}_1$ and $\mathbf{x}_2$. When $\mathbf{x}_1 = \mathbf{x}_2$ this becomes

$$\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_1, t + \tau)\rangle$$

which is often called the autocorrelation function.


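An associated normalized quantity is

$$\gamma_{12}(\tau) = \frac{\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_2, t + \tau)\rangle}{\{\langle E^2(\mathbf{x}_1, t)\rangle\langle E^2(\mathbf{x}_2, t)\rangle\}^{1/2}}$$

which may be deemed a measure of the degree of coherence. As a consequence of the Schwarz inequality (§1.5)

$$0 \le |\gamma_{12}(\tau)| \le 1.$$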
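As a rough numerical illustration of the time average and the normalized correlation, the sketch below estimates $\gamma_{12}$ from sampled scalar signals. The function name `degree_of_coherence`, the use of a finite record in place of the limit $T \to \infty$, and the synthetic noise signals are all assumptions made for the example.

```python
import numpy as np

def degree_of_coherence(e1, e2, lag):
    """Estimate gamma_12 at a non-negative integer sample lag from finite records.

    The averages in numerator and denominator use the same window, so the
    Schwarz inequality guarantees |gamma_12| <= 1 for the estimate as well.
    """
    n = len(e1) - lag
    a, b = e1[:n], e2[lag:lag + n]
    return np.mean(a * b) / np.sqrt(np.mean(a ** 2) * np.mean(b ** 2))

rng = np.random.default_rng(0)
n_samples = 20000
e1 = rng.normal(size=n_samples)
# e2 shares 60% of its amplitude with e1, so gamma_12(0) should be close to 0.6.
e2 = 0.6 * e1 + 0.8 * rng.normal(size=n_samples)

gamma_same = degree_of_coherence(e1, e1, 0)      # fully coherent with itself
gamma_partial = degree_of_coherence(e1, e2, 0)   # partially coherent
gamma_lagged = degree_of_coherence(e1, e2, 50)   # white noise decorrelates at any non-zero lag
```

Here `gamma_same` is 1 up to rounding, `gamma_partial` is near 0.6, and `gamma_lagged` is near zero, which corresponds to the coherent, partially coherent and incoherent cases respectively.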

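If $|\gamma_{12}(\tau)| = 1$ the disturbance is often said to be coherent, whereas if $|\gamma_{12}(\tau)| = 0$ it is called incoherent; for other values of $|\gamma_{12}(\tau)|$ there is partial coherence. Write

$$\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_2, t + \tau)\rangle = \int_{-\infty}^{\infty} G_{12}(\omega)\exp(-i\omega\tau)\,d\omega, \tag{9.37}$$

$$\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_1, t + \tau)\rangle = \int_{-\infty}^{\infty} G_{1}(\omega)\exp(-i\omega\tau)\,d\omega. \tag{9.38}$$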

Then the cross-correlation and autocorrelation functions are being effectively analysed into their Fourier components. That is why $G_1$ is termed the power spectral density and $G_{12}$ the cross-power spectral density. Since most random processes are not periodic with respect to $\tau$, $G_{12}$ and $G_1$ are liable to vary continuously with $\omega$ rather than contain a number of discrete frequencies. They are obviously functions of $\mathbf{x}_1$ and $\mathbf{x}_2$ as well as $\omega$ in general. By Fourier inversion

$$G_{12}(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_2, t + \tau)\rangle\exp(i\omega\tau)\,d\tau, \tag{9.39}$$

$$G_{1}(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_1, t + \tau)\rangle\exp(i\omega\tau)\,d\tau. \tag{9.40}$$

Since the correlation functions are real, the spectral densities must satisfy

$$\{G_{12}(\omega)\}^* = G_{12}(-\omega), \qquad \{G_{1}(\omega)\}^* = G_{1}(-\omega) \tag{9.41}$$

as may be seen directly from either (9.37) or (9.39) (cf. (9.12)). On account of (9.41) the spectral density is known at negative frequencies as soon as its values at positive frequencies are available. Therefore, the mean frequency $\bar\omega$ is defined by

$$\bar\omega = \frac{\int_0^\infty \omega|G_1(\omega)|^2\,d\omega}{\int_0^\infty |G_1(\omega)|^2\,d\omega}$$

and the effective spectral width $\Delta\omega$ by

$$(\Delta\omega)^2 = \frac{\int_0^\infty (\omega - \bar\omega)^2|G_1(\omega)|^2\,d\omega}{\int_0^\infty |G_1(\omega)|^2\,d\omega}. \tag{9.42}$$


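The coherence time $\Delta\tau$ is specified by

$$(\Delta\tau)^2 = \frac{\int_{-\infty}^{\infty}\tau^2|g_1(\tau)|^2\,d\tau}{\int_{-\infty}^{\infty}|g_1(\tau)|^2\,d\tau} \tag{9.43}$$

where

$$g_1(\tau) = \langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_1, t + \tau)\rangle. \tag{9.44}$$

The mean value of $\tau$ is zero because (9.41) implies, via (9.38) and (9.44), that $|g_1(\tau)|^2$ is an even function of $\tau$. Now

$$\int_{-\infty}^{\infty}|g_1(\tau)|^2\,d\tau = 2\pi\int_{-\infty}^{\infty}|G_1(\omega)|^2\,d\omega = 4\pi\int_{0}^{\infty}|G_1(\omega)|^2\,d\omega$$

because the integrand is even by virtue of (9.41). Similarly

$$\int_{-\infty}^{\infty}\tau^2|g_1(\tau)|^2\,d\tau = 4\pi\int_{0}^{\infty}\left|\frac{\partial G_1(\omega)}{\partial\omega}\right|^2 d\omega.$$

Hence, from (9.43),

$$(\Delta\tau)^2 = \frac{\int_0^\infty |\partial G_1(\omega)/\partial\omega|^2\,d\omega}{\int_0^\infty |G_1(\omega)|^2\,d\omega}. \tag{9.45}$$

If $\alpha$ is real

$$\int_0^\infty |h_1 + \alpha h_2|^2\,d\omega \ge 0.$$

Choosing

$$\alpha = -\frac{\int_0^\infty (h_1h_2^* + h_1^*h_2)\,d\omega}{2\int_0^\infty |h_2|^2\,d\omega}$$

we derive

$$\left\{\int_0^\infty (h_1h_2^* + h_1^*h_2)\,d\omega\right\}^2 \le 4\int_0^\infty |h_1|^2\,d\omega\int_0^\infty |h_2|^2\,d\omega. \tag{9.46}$$

Make the selection

$$h_1 = (\omega - \bar\omega)G_1(\omega), \qquad h_2 = \frac{\partial G_1(\omega)}{\partial\omega}.$$

Then

$$\int_0^\infty (h_1h_2^* + h_1^*h_2)\,d\omega = \left[(\omega - \bar\omega)|G_1(\omega)|^2\right]_0^\infty - \int_0^\infty |G_1(\omega)|^2\,d\omega$$

after an integration by parts. The contribution from the upper limit must vanish if (9.42) is to have a meaning. The lower limit does not contribute if $G_1(0) = 0$, which will now be assumed. Hence, from (9.46),

$$\left(\int_0^\infty |G_1(\omega)|^2\,d\omega\right)^2 \le 4\int_0^\infty (\omega - \bar\omega)^2|G_1(\omega)|^2\,d\omega\int_0^\infty\left|\frac{\partial G_1}{\partial\omega}\right|^2 d\omega.$$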


It follows from (9.42) and (9.45) that

$$\Delta\omega\,\Delta\tau \ge \tfrac{1}{2} \tag{9.47}$$

when $G_1(0) = 0$. Thus the effective frequency range is related to the reciprocal of the time over which the signal remains coherent. The smaller the time of coherence the greater the bandwidth involved. The definitions of spectral width and coherence time are satisfactory in the sense that they are based on the measurable correlation $g_1$ (or its Fourier transform) rather than the electric intensity itself which may be rapidly varying.

There is another feature to be noted. In any region which does not contain sources

$$\left(\nabla_2^2 - \frac{1}{c^2}\frac{\partial^2}{\partial\tau^2}\right)\langle \mathbf{E}(\mathbf{x}_1, t)\cdot\mathbf{E}(\mathbf{x}_2, t + \tau)\rangle = 0$$

where $\nabla_2$ refers to the point $\mathbf{x}_2$, because each component of $\mathbf{E}$ satisfies such an equation. This permits the cross-correlation function to be represented by Kirchhoff's integral and the way in which it propagates can be determined. In particular, if it and its normal derivative are known on a closed surface it can be found elsewhere provided that no source intervenes. Correspondingly

$$(\nabla_2^2 + k^2)G_{12}(\omega) = 0, \qquad k = \omega/c,$$

in any region where there are no sources. If $G_{12}$ and its normal derivative were measured on a closed surface an integral equation analogous to (9.34) and (9.35) could be set up for the source term appropriate to the cross-power spectral density partial differential equation. Alternatively, the technique described at the end of the preceding section could be adopted. In either case we arrive at some indication of the average source distribution. However, this direct attack does not seem to have been explored to any extent yet.

Although the correlations have been defined in terms of $\mathbf{E}\cdot\mathbf{E}$ it is straightforward to handle other quantities. For instance, the cross-correlation of a single component of the field

$$\langle E_x(\mathbf{x}_1, t)E_x(\mathbf{x}_2, t + \tau)\rangle$$

or that of two components

$$\langle E_x(\mathbf{x}_1, t)E_y(\mathbf{x}_2, t + \tau)\rangle$$

(which might be relevant to cross-polarization effects) or between the electric and magnetic intensities might be considered. Only minor modifications to the foregoing analysis would be needed.
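Returning to the spectral width (9.42) and the coherence time (9.45), the inequality (9.46) implies that their product cannot fall below 1/2 when $G_1(0) = 0$. The sketch below checks this numerically for one hypothetical spectral density, $G_1(\omega) = \omega e^{-\omega}$ on $\omega \ge 0$, chosen only because it vanishes at the origin and its moments are known in closed form ($\bar\omega = 3/2$, $(\Delta\omega)^2 = 3/4$, $(\Delta\tau)^2 = 1$). The helper names and the grid are illustrative assumptions.

```python
import numpy as np

def trapezoid(f, x):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

def width_and_coherence_time(G1, w):
    """Return (mean frequency, spectral width, coherence time) from (9.42) and (9.45)."""
    power = np.abs(G1) ** 2
    norm = trapezoid(power, w)
    w_mean = trapezoid(w * power, w) / norm
    d_omega = np.sqrt(trapezoid((w - w_mean) ** 2 * power, w) / norm)
    dG = np.gradient(G1, w)                       # numerical dG1/d(omega)
    d_tau = np.sqrt(trapezoid(np.abs(dG) ** 2, w) / norm)
    return w_mean, d_omega, d_tau

w = np.linspace(0.0, 40.0, 40001)                 # positive frequencies only
G1 = w * np.exp(-w)                               # hypothetical spectrum with G1(0) = 0
w_mean, d_omega, d_tau = width_and_coherence_time(G1, w)
product = d_omega * d_tau
```

For this spectrum the product is $\sqrt{3}/2 \approx 0.87$, comfortably above the bound; narrowing the spectrum lengthens the coherence time accordingly.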


9.9 Correlation techniques

The first scheme to be discussed is based on measurements of a single component of the electric field such as $E_x$. This satisfies

$$\nabla^2E_x - \frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2} = \mu\frac{\partial J_x}{\partial t} + \frac{1}{\varepsilon}\frac{\partial\rho}{\partial x} = -q(\mathbf{x}, t)$$

and $q$ furnishes a guide to the sources. At large distances from the source distribution

$$E_x(\mathbf{x}, t) \sim \frac{1}{4\pi|\mathbf{x}|}\int q\left(\mathbf{x}', t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}\right)d\mathbf{x}'. \tag{9.48}$$

If now correlations are defined in terms of this sole component, as explained at the end of §9.8, at large distances

$$g_1(0) = \langle\{E_x(\mathbf{x}, t)\}^2\rangle = \frac{1}{16\pi^2c^4|\mathbf{x}|^2}\int S(\mathbf{y})\,d\mathbf{y} \tag{9.49}$$

$$= \frac{1}{16\pi^2c^4|\mathbf{x}|^2}\int R(\mathbf{y})\,d\mathbf{y} \tag{9.50}$$

where

$$S(\mathbf{y}) = \int\left\langle c^4 q\left(\mathbf{x}', t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}\right)q\left(\mathbf{y}, t - \frac{|\mathbf{x} - \mathbf{y}|}{c}\right)\right\rangle d\mathbf{x}',$$

$$R(\mathbf{y}) = \left\langle 4\pi c^4|\mathbf{x}|E_x(\mathbf{x}, t)\,q\left(\mathbf{y}, t - \frac{|\mathbf{x} - \mathbf{y}|}{c}\right)\right\rangle.$$

In view of the similarity of (9.49) and (9.50) to the corresponding result for a simple point source both $R$ and $S$ can be thought of as the source strength per unit volume. Since they involve averages, measurement of them may be feasible. However, the insertion of probes close to the sources can scarcely be avoided and the noise generated thereby may pollute the readings. Even if this were not so the correlations to be observed may well be small with consequent doubt as to their accuracy and reliability. For example, should the right-hand side of (9.49) or (9.50) turn out to be negative the observations would have to be rejected. In any case there is no certainty that $R$ and $S$ are being correctly interpreted. Any function, whose integral over space was zero, could be added to $R$ or $S$ without invalidating (9.49) or (9.50). So there is extensive scope for alternative interpretation and there seems to be no convincing argument why the additional function should be zero rather than some other value.

Correlation can also arise in connection with harmonic problems. An illustration is the finding of a local deviation in a medium which is not varying with time. Suppose that $\varepsilon$ has the constant value $\varepsilon_0$ except in some finite region.


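Then a plane wave, whose wavelength is much larger than the inhomogeneity, can be fired at the medium. The Rayleigh-Gans or Born approximation is then reasonable and the second part of (9.1) can be written as

$$\operatorname{curl}\mathbf{H} - i\omega\varepsilon_0\mathbf{E} = i\omega(\varepsilon - \varepsilon_0)\,\mathbf{e}_0\exp(-ik\mathbf{x}\cdot\mathbf{i})$$

if the x axis is picked to be along the direction of propagation of the incident wave. The distant scattered field is therefore

$$\mathbf{E}^s(\mathbf{x}) \sim \frac{k^2\exp(-ik|\mathbf{x}|)}{4\pi\varepsilon_0|\mathbf{x}|}\{\mathbf{e}_0 - (\mathbf{e}_0\cdot\hat{\mathbf{x}})\hat{\mathbf{x}}\}\int_{-\infty}^{\infty} n(\mathbf{y})\exp\{ik\mathbf{y}\cdot(\hat{\mathbf{x}} - \mathbf{i})\}\,d\mathbf{y}$$

where $n = \varepsilon - \varepsilon_0$ and $k^2$ is now $\omega^2\mu\varepsilon_0$. Clearly $n$ is zero outside the blob of deviating matter. Hence

$$|\mathbf{E}^s(\mathbf{x})|^2 \sim \frac{k^4}{16\pi^2\varepsilon_0^2|\mathbf{x}|^2}|\mathbf{e}_0 - (\mathbf{e}_0\cdot\hat{\mathbf{x}})\hat{\mathbf{x}}|^2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} n(\mathbf{y})n(\mathbf{z})\exp\{ik(\mathbf{y} - \mathbf{z})\cdot(\hat{\mathbf{x}} - \mathbf{i})\}\,d\mathbf{y}\,d\mathbf{z}$$

$$\sim \frac{k^4|\mathbf{e}_0 - (\mathbf{e}_0\cdot\hat{\mathbf{x}})\hat{\mathbf{x}}|^2}{16\pi^2\varepsilon_0^2|\mathbf{x}|^2}\int_{-\infty}^{\infty}\exp\{ik\mathbf{w}\cdot(\hat{\mathbf{x}} - \mathbf{i})\}\int_{-\infty}^{\infty} n(\mathbf{z})n(\mathbf{w} + \mathbf{z})\,d\mathbf{z}\,d\mathbf{w}. \tag{9.51}$$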

The inner integral in (9.51) can be perceived as the autocorrelation function of $n$ (though now in space rather than time) so that measurements of the far field yield some information about this function. It is perhaps most profitable to make some simple assumption about the form of the autocorrelation function, such as that it is Gaussian with adjustable parameters, with the intent of evaluating the right-hand side of (9.51). Comparison with the measured left-hand side should then sanction the determination of the parameters, optimization techniques being utilized if convenient. If tolerable agreement can be secured by adjustment of the parameters some idea can be obtained of the rate at which the autocorrelation function falls to zero. That enables one to get some notion of the correlation between deviations from the average and gain some indication of the extent of the inhomogeneity. This procedure is likely to be especially valuable in eliciting the properties of biological specimens which are usually subject to statistical variability.

9.10 Far-field cross-correlation technique

There is another way in which correlation measurements can be related to source distributions. Let the observations be made at points sufficiently distant from the sources for (9.48) to hold. Then, if $|\mathbf{x}_1| = |\mathbf{x}_2|$, the cross-correlation function $g_{12}$ is given by

$$g_{12}(\tau) = \langle E_1(\mathbf{x}_1, t)E_1(\mathbf{x}_2, t + \tau)\rangle = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\langle q\left(\mathbf{y}, t - \frac{|\mathbf{x}_1 - \mathbf{y}|}{c}\right)q\left(\mathbf{z}, t + \tau - \frac{|\mathbf{x}_2 - \mathbf{z}|}{c}\right)\right\rangle d\mathbf{y}\,d\mathbf{z}$$

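with the slight generalization that $E_1$ is an arbitrary, but fixed, component of the field and $q$ its associated effective source. The assumption of statistical stationarity enables this to be written as

$$g_{12}(\tau) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\langle q(\mathbf{y}, t)\,q\left(\mathbf{z}, t + \tau + \frac{|\mathbf{x}_1 - \mathbf{y}|}{c} - \frac{|\mathbf{x}_2 - \mathbf{z}|}{c}\right)\right\rangle d\mathbf{y}\,d\mathbf{z}.$$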

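The cross-power spectral density is, from (9.39),

$$G_{12}(\omega) = \frac{1}{32\pi^3|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\{-ik(|\mathbf{x}_1 - \mathbf{y}| - |\mathbf{x}_2 - \mathbf{z}|)\}\langle q(\mathbf{y}, t)q(\mathbf{z}, t + \tau)\rangle\exp(i\omega\tau)\,d\tau\,d\mathbf{y}\,d\mathbf{z}.$$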

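The quantity

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\langle q(\mathbf{y}, t)q(\mathbf{z}, t + \tau)\rangle\exp(i\omega\tau)\,d\tau$$

is the cross-power spectral density of the source distribution; denote it by $q(\mathbf{y}, \mathbf{z}, \omega)$. Then

$$G_{12}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} q(\mathbf{y}, \mathbf{z}, \omega)\exp\{-ik(|\mathbf{x}_1 - \mathbf{y}| - |\mathbf{x}_2 - \mathbf{z}|)\}\,d\mathbf{y}\,d\mathbf{z}. \tag{9.52}$$

Likewise, the power spectral density at $\mathbf{x}_1$ is given by

$$G_{1}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} q(\mathbf{y}, \mathbf{z}, \omega)\exp\{-ik(|\mathbf{x}_1 - \mathbf{y}| - |\mathbf{x}_1 - \mathbf{z}|)\}\,d\mathbf{y}\,d\mathbf{z}. \tag{9.53}$$

Define

$$Q_1(\mathbf{z}, \omega) = \int_{-\infty}^{\infty} q(\mathbf{y}, \mathbf{z}, \omega)\exp\{-ik(|\mathbf{x}_1 - \mathbf{y}| - |\mathbf{x}_1 - \mathbf{z}|)\}\,d\mathbf{y}. \tag{9.54}$$

Then

$$G_{1}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty} Q_1(\mathbf{z}, \omega)\,d\mathbf{z}, \tag{9.55}$$

$$G_{12}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty} Q_1(\mathbf{z}, \omega)\exp\{ik(|\mathbf{x}_2 - \mathbf{z}| - |\mathbf{x}_1 - \mathbf{z}|)\}\,d\mathbf{z}. \tag{9.56}$$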

Fig. 9.6. Determination of cross-power spectral density. (A fixed sensor at $\mathbf{x}_1$ and a movable sensor at $\mathbf{x}_2$, at angular separation $\theta$.)

The resemblance between (9.55) and (9.49) suggests that $Q_1(\mathbf{z}, \omega)$ might be called the apparent source strength per unit volume at the point $\mathbf{z}$, radiating at the angular frequency $\omega$. From (9.54), $Q_1$ relies on observations at $\mathbf{x}_1$ and so it is the apparent source strength as viewed from that point. In general, changes in the reference point $\mathbf{x}_1$ will entail alterations in the apparent source distribution.

Determining $Q_1$ from (9.55) suffers from many of the objections levelled against (9.49) and (9.50). However, there is more opportunity for manoeuvre with (9.56) because of the variation with $\mathbf{x}_2$. Suppose that there is a fixed sensor at $\mathbf{x}_1$ and that there is a traversing sensor at $\mathbf{x}_2$ which is always maintained at the same distance from the origin as the fixed sensor (Fig. 9.6). The record of simultaneous signals at the two sensors is Fourier analysed at a given angular separation $\theta$ by passing it through narrow-band filters which will transmit only a small range of frequencies centred on the $\omega/2\pi$ at which information about $Q_1$ is desired. The cross-correlation function of the filtered signal then leads to $G_{12}(\omega)$ as a function of $\theta$. In ideal circumstances, values for all $\theta$ are feasible but in practical situations the magnitude of $G_{12}$ frequently diminishes with increasing angular separation so that the correlation may not be measurable when $\theta$ exceeds, say, 60°. Also the observations will probably be made at discrete rather than continuous values of $\theta$. Notwithstanding, we shall assume that $G_{12}$ is known as a continuous function of $\theta$ on a circle with centre at the origin and whose circumference passes through $\mathbf{x}_1$.

To find $Q_1$ from $G_{12}$ the relationship (9.56) has to be inverted. Unfortunately, $G_{12}$ is known only on a circle so that there is no hope of recovering full knowledge of $Q_1$. It therefore pays to make simplifying assumptions about the source distribution. To put it another way, the measured cross-power spectral density is accounted for by a simple, but artificial, distribution of sources. Take the plane of the circle as the $(x, y)$ plane with $\mathbf{x}_1$ on the x axis. It will now be assumed that the sources are not very far from the y axis. Then (9.56) simplifies to

$$G_{12}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty} q_1(y, \omega)\exp\{ik(|\mathbf{x}_2 - y\mathbf{j}| - |x_1\mathbf{i} - y\mathbf{j}|)\}\,dy$$


where

$$q_1(z_2, \omega) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Q_1(\mathbf{z}, \omega)\,dz_1\,dz_3$$

and $\mathbf{i}$, $\mathbf{j}$ are unit vectors parallel to the x and y axes respectively. It will now be postulated that the points of observation are many wavelengths from the places where $q_1$ is non-zero. This is a non-trivial assumption because the positions of the apparent sources are unknown and forcing them to lie on the y axis may oblige them to be some distance from the genuine sources about whose rough location a fair degree of confidence may have been felt. Actually, the postulate can only be truly checked a posteriori by examining whether (a) the resulting apparent sources are, in fact, nowhere near the circle of measurement and (b) increasing $|\mathbf{x}_1|$ leaves the apparent sources substantially unchanged. At any rate, squares and higher powers of $y$ will now be neglected so that $|x_1\mathbf{i} - y\mathbf{j}| \approx x_1$ and $|\mathbf{x}_2 - y\mathbf{j}| \approx x_1 - y\sin\theta$. Then

$$G_{12}(\omega) = \frac{1}{16\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty} q_1(y, \omega)\exp(-iky\sin\theta)\,dy. \tag{9.57}$$

Eqn (9.57) can be regarded as a Fourier transform in which $k\sin\theta$ is the transform variable. But $k\sin\theta$ ranges only from $-k$ to $k$ so that $G_{12}$ is constrained to that interval. By defining $G_{12}$ to be zero outside the interval (which is plausible when $|G_{12}|$ drops sufficiently rapidly with increasing $\theta$) we can take a Fourier inverse which will give a $q_1'$ which will satisfy (9.57) on the interval of measurement. The pertinent formula is

$$q_1'(y, \omega) = 8\pi|\mathbf{x}_1|^2k\int_{-\pi/2}^{\pi/2} G_{12}(\omega)\exp(iky\sin\theta)\cos\theta\,d\theta. \tag{9.58}$$

In essence $q_1'$ is a diffraction-limited image of the apparent source distribution $q_1$. Assuming that the far-field approximation is legitimate we can be sure that the resolving power is independent of $|\mathbf{x}_1|$ because it is necessary only to maintain the same range of $\theta$ for differing $\mathbf{x}_1$. On the other hand, a disadvantage is that the method prejudges the issue by insisting that the apparent sources be located on the y axis.

It is advisable to see how well (9.58) performs when the real sources are actually on the y axis. To this end let

$$q(\mathbf{x}, t) = Q(y)\delta(x)\delta(z)\cos\{\omega_0t + \phi(y)\} \tag{9.59}$$

where $\omega_0$ is positive and the real function $Q(y)$ is assumed to be zero outside some finite interval. The phase $\phi(y)$ is chosen so that $\phi(0) = 0$, which may necessitate a change of time origin but that is of no significance here. From (9.59)

$$q(\mathbf{x}', \mathbf{x}'', \omega) = \tfrac{1}{4}Q(y')\delta(x')\delta(z')Q(y'')\delta(x'')\delta(z'')\big(\delta(\omega + \omega_0)\exp[i\{\phi(y'') - \phi(y')\}] + \delta(\omega - \omega_0)\exp[-i\{\phi(y'') - \phi(y')\}]\big).$$

Substitute in (9.57) via (9.54). Because of (9.41) only positive frequencies need be considered so that the term involving $\delta(\omega + \omega_0)$ can be omitted without loss. Hence

$$G_{12}(\omega) = \frac{\delta(\omega - \omega_0)}{64\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Q(y')Q(y'')\exp[-i\{\phi(y'') - \phi(y') + k(|x_1\mathbf{i} - y'\mathbf{j}| - |\mathbf{x}_2 - y''\mathbf{j}|)\}]\,dy'\,dy''$$

$$\sim \frac{\delta(\omega - \omega_0)}{64\pi^2|\mathbf{x}_1|^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Q(y')Q(y'')\exp[-i\{\phi(y'') - \phi(y') + ky''\sin\theta\}]\,dy'\,dy''$$

subject to the same conditions as imposed in (9.57). From (9.58)

$$q_1'(y, \omega) = \frac{\delta(\omega - \omega_0)}{4\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Q(y')Q(y'')\frac{\sin k(y - y'')}{y - y''}\exp[-i\{\phi(y'') - \phi(y')\}]\,dy'\,dy''.$$

For large values of $k$, $(\sin kx)/x$ is approximately $\pi\delta(x)$ as far as integration is concerned (Jones 1966, 1982). Hence, for high frequencies

$$q_1'(y, \omega) = \tfrac{1}{4}\delta(\omega - \omega_0)Q(y)\exp\{-i\phi(y)\}\int_{-\infty}^{\infty} Q(y')\exp\{i\phi(y')\}\,dy'. \tag{9.60}$$

Thus $q_1'$ reproduces the amplitude and phase of the genuine source apart from a normalization factor. To get rid of the normalization factor put

$$q_1'(y, \omega) = q_1''(y, \omega)\delta(\omega - \omega_0)$$

and suppose that

$$\int_{-\infty}^{\infty} Q(y')\exp\{i\phi(y')\}\,dy' = A\exp(i\alpha)$$

where $A$ is real. Then

$$q_1''(0, \omega) = \tfrac{1}{4}Q(0)A\exp(i\alpha).$$

Since $q_1''$ is presumed to be known from the measurements, $\alpha$ can be found to within a multiple of $\pi$. The indeterminacy springs from the fact that we do not know whether $Q(0)A$ is positive or negative. Also

$$\int_{-\infty}^{\infty} q_1''(y, \omega)\,dy = \tfrac{1}{4}A^2$$

since $Q$ is a real function. Therefore, if we define

$$q_1'''(y, \omega) = \frac{2q_1''(y, \omega)\exp(-i\alpha)}{\left(\int_{-\infty}^{\infty} q_1''(y, \omega)\,dy\right)^{1/2}} \tag{9.61}$$

we obtain

$$q_1'''(y, \omega) = \pm Q(y)\exp\{-i\phi(y)\} \tag{9.62}$$

and all quantities on the right-hand side of (9.61) can be obtained from measurements. The ambiguity of sign in (9.62) is not surprising because if the sign of $q$ is changed, $G_{12}$ remains unaltered. Consequently, the absolute sign of a source distribution can never be found by correlation techniques. Of course, as $y$ varies, the sign in (9.62) stays invariable, be it plus or minus.

From (9.62) $Q(y)$ and $\phi(y)$ can be determined, apart from a possible sign error, and hence the actual sources found instead of the apparent sources. This leads to the suggestion that $q_1'''$ as defined by (9.61) should be employed as the indicator for apparent sources in general. It gives the actual sources when they are on a line and can be expected to be a good predictor if they do not depart too widely from a straight line.

The proposal to adopt $q_1'''$ rests on the high-frequency approximation deployed in deriving (9.60). Whether $q_1'''$ would be at all reliable at low frequencies is an open question. The necessity for high frequencies also enforces measurements at sufficiently many values of $\theta$ to resolve the structure of the field.

Even if the sources are not nearly rectilinear $q_1'''$ can perform a helpful function as a standardizing mechanism, for it says that the cross-power spectral density agrees with that produced by a certain (fictitious) rectilinear distribution. Therefore, relations between actual source distributions may be clarified by comparing their (fictitious) rectilinear equivalents. It is, of course, true that the real sources may not be perfectly correlated as in (9.59) but that is irrelevant. The idea of standardization is that polar correlation measurements should always be interpreted by means of a rectilinear perfectly correlated distribution but it is not argued that this will coincide with the actual sources in general.
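The chain from (9.57) to (9.62) can be traced numerically for a perfectly correlated line source. The sketch below is only illustrative: the choices of $Q$, $\phi$ and $k$ are hypothetical, the factor $\delta(\omega - \omega_0)$ is dropped, the constants $1/16\pi^2|\mathbf{x}_1|^2$ and $8\pi|\mathbf{x}_1|^2$ are combined into the single factor $1/2\pi$ of an ordinary Fourier inverse, and the variable `s` stands for $k\sin\theta$.

```python
import numpy as np

def trapz_weights(x):
    """Trapezoidal quadrature weights on a uniform grid."""
    w = np.full(x.shape, x[1] - x[0])
    w[[0, -1]] *= 0.5
    return w

k = 20.0                                   # wavenumber, large compared with the source detail
y = np.linspace(-8.0, 8.0, 1601)           # position along the y axis (y = 0 at index 800)
s = np.linspace(-k, k, 801)                # s = k sin(theta): only |s| <= k is measurable
wy, ws = trapz_weights(y), trapz_weights(s)

Q = np.exp(-0.5 * y ** 2)                  # hypothetical real amplitude
phi = 0.3 * y + 0.2 * y ** 2               # hypothetical phase with phi(0) = 0
source = Q * np.exp(-1j * phi)             # the quantity on the right of (9.62)

# Synthetic "measured" cross-power spectral density, scaled so that
# G(s) = integral of q1(y) exp(-i s y) dy, with q1 = (1/4) A exp(i alpha) Q exp(-i phi).
A_exp_ialpha = np.sum(wy * Q * np.exp(1j * phi))
G12 = 0.25 * A_exp_ialpha * (np.exp(-1j * np.outer(s, y)) @ (wy * source))

# Band-limited Fourier inverse over |s| <= k, the scaled counterpart of (9.58).
q2 = (np.exp(1j * np.outer(y, s)) @ (ws * G12)) / (2 * np.pi)

# Normalization as in (9.61): alpha from the value at y = 0, A^2/4 from the integral.
alpha = np.angle(q2[len(y) // 2])
q3 = 2 * q2 * np.exp(-1j * alpha) / np.sqrt(np.sum(wy * q2).real)

max_error = np.max(np.abs(q3 - source))
```

Because the spectrum of this smooth source lies well inside $|s| \le k$, the normalized image agrees with $Q\exp(-i\phi)$ to within quadrature error; taking `alpha` from `np.angle` selects the plus sign in (9.62) here only because $Q(0) > 0$ in the example.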

HOLOGRAPHIC TECHNIQUES

9.11 Basic principles of holography

In holography (Gabor 1948, 1949, 1951, 1956) an object is illuminated by a coherent light wave and the diffraction pattern or hologram recorded on a photographic plate. After suitable processing the plate, when re-illuminated conveniently, conveys an image of the object. Holography therefore represents a means of solving the inverse scattering problem from a photographic copy of the far field. It has the obvious benefit of providing a permanent record of the solution.

Fig. 9.7. Formation of a hologram.

The basic theory which validates the reconstruction of the image will now be derived. Since the primary concern is with optical frequencies it will be sufficient to consider scalar fields only (Born and Wolf 1965; Wolf and Shewell 1970). A monochromatic wave $u^i$ is incident on an object and produces the field $u_1$ in the aperture $A$ of the plane $z = z_1$ with $z_1 < 0$ (Fig. 9.7). The aperture is designed so that light which misses the object does not pass beyond $z = z_1$. The field $u_1$ spreads in $z > z_1$ and has added to it a reference field $u^r$ (of the same frequency as $u^i$) so that the total field is $u_1 + u^r$. This total field is recorded on a photographic plate at $z = 0$. The plate responds to the intensity $I$ of the total field where

$$I = |u_1 + u^r|^2 = I_1 + I^r + u_1u^{r*} + u_1^*u^r, \tag{9.63}$$

$I_1$ and $I^r$ being the intensities of $u_1$ and the reference wave respectively at the plate. The plate is now processed so that its effect is to make a wave incident on it emerge multiplied by the factor $a_0 + a_1I$ where $a_0$ and $a_1$ are real constants. Usually, $a_0$ is positive and $a_1$ negative.

The hologram so formed is placed in the plane $z = 0$ and illuminated by the monochromatic reconstruction field $v^c$ (Fig. 9.8). We shall discuss only the situation in which $v^c$ is the same frequency as $u^i$ but the case when they are of different frequencies can be considered. The

Fig. 9.8. Reconstruction of object.

field appearing on the far side of the hologram is then

$$v(0) = a_1v^cu_1u^{r*} + a_1v^cu_1^*u^r + v^c\{a_0 + a_1(I_1 + I^r)\}. \tag{9.64}$$

If matters have been arranged aright $v$ will reproduce an image of the object somewhere in $z > 0$. Circumstances under which this is possible (Leith and Upatnieks 1963) will now be investigated. Let $u^r$ and $v^c$ be plane waves so that

$$u^r = B\exp(-ik\mathbf{n}^r\cdot\mathbf{x}), \qquad v^c = C\exp(-ik\mathbf{n}^c\cdot\mathbf{x}).$$

Then (9.64) gives $v(0) = v_1(0) + v_2(0) + v_3(0)$ where

$$v_1(0) = a_1CB^*u_1\exp\{ik(\mathbf{n}^r - \mathbf{n}^c)\cdot\mathbf{x}\}, \tag{9.65}$$

$$v_2(0) = a_1CBu_1^*\exp\{-ik(\mathbf{n}^r + \mathbf{n}^c)\cdot\mathbf{x}\}, \tag{9.66}$$

$$v_3(0) = C\{a_0 + a_1(I_1 + BB^*)\}\exp\{-ik\mathbf{n}^c\cdot\mathbf{x}\} \tag{9.67}$$

with $\mathbf{x} \equiv (x, y, 0)$. Suppose now that $\mathbf{n}^r = \mathbf{n}^c$ so that the reference and reconstruction waves travel in the same direction. Then, from (9.65), $v_1(0) = a_1CB^*u_1$ and so $v_1$ is proportional to the field created at $z = 0$ by the light coming from the original obstacle in the absence of any reference wave. Thus, in $z > 0$, $v_1$ will be a constant multiple of $u_1$ and will appear to be propagating from a virtual image of the original object, this virtual image being in precisely the same position relative to the hologram as was the original object relative to the photographic plate.
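The decomposition of the emergent field into the three terms (9.65) to (9.67) is an algebraic identity that is easy to confirm numerically. In the sketch below the object field on the plate is replaced by arbitrary complex samples along one transverse coordinate, and the values of $B$, $C$, $a_0$, $a_1$ and the direction cosines are hypothetical; only the x components of $\mathbf{n}^r$ and $\mathbf{n}^c$ enter because the plate lies in $z = 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 2 * np.pi                                # wavenumber for unit wavelength
x = np.linspace(-10.0, 10.0, 512)            # transverse coordinate on the plate z = 0

# Arbitrary complex samples standing in for the object field u1 on the plate.
u1 = rng.normal(size=x.size) + 1j * rng.normal(size=x.size)

B, C = 1.5 * np.exp(0.4j), 0.8 * np.exp(-1.1j)   # complex amplitudes of the plane waves
n_r = n_c = 0.2                                   # x components of n^r and n^c (taken equal)
a0, a1 = 1.0, -0.05                               # plate response a0 + a1 * I

u_r = B * np.exp(-1j * k * n_r * x)          # reference wave on the plate
v_c = C * np.exp(-1j * k * n_c * x)          # reconstruction wave on the hologram

intensity = np.abs(u1 + u_r) ** 2            # what the plate records
v = v_c * (a0 + a1 * intensity)              # field emerging from the hologram

v1 = a1 * C * np.conj(B) * u1 * np.exp(1j * k * (n_r - n_c) * x)
v2 = a1 * C * B * np.conj(u1) * np.exp(-1j * k * (n_r + n_c) * x)
v3 = C * (a0 + a1 * (np.abs(u1) ** 2 + abs(B) ** 2)) * np.exp(-1j * k * n_c * x)
```

With `n_r == n_c` the term `v1` is just a constant multiple of `u1`, which is the statement that it carries the virtual image.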

Define $U(l, m)$ by

$$U(l, m) = \frac{k^2}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u_1(x, y, 0)\exp\{ik(lx + my)\}\,dx\,dy; \tag{9.68}$$

then a representation for $u_1$ valid for all $z$ is

$$u_1(z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(l, m)\exp\{-ik(lx + my + nz)\}\,dl\,dm \tag{9.69}$$

where $n^2 = 1 - l^2 - m^2$, $n$ being positive when $l^2 + m^2 < 1$ and negative imaginary when $l^2 + m^2 > 1$. In fact, we shall suppose that $U(l, m)$ is effectively zero when $l^2 + m^2 > 1$ so that only real $n$ need be considered in (9.69). Eqn (9.69) is a representation in plane waves, homogeneous if $l^2 + m^2 < 1$ and inhomogeneous otherwise. Since evanescent waves do not radiate, the waves with $l^2 + m^2 < 1$ are often described as the visible part of the spectrum while the inhomogeneous waves are the invisible part.

Let $v_2$ be the field which coincides with $u_1^*$ on $z = 0$. Then

$$v_2(z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\{U(-l, -m)\}^*\exp\{-ik(lx + my + nz)\}\,dl\,dm = \left\{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(l, m)\exp\{-ik(lx + my - nz)\}\,dl\,dm\right\}^* \tag{9.70}$$

on account of our assumption about $U$. It follows from (9.69) and (9.70) that

$$v_2(x, y, z) = \{u_1(x, y, -z)\}^* \tag{9.71}$$

which is a sort of reciprocity theorem. That (9.69) is valid for $z > 0$ and $z < 0$ is a consequence of $u_1$ travelling to the right near $z = 0$.

Having established (9.71) in general we now turn to (9.66) when $\mathbf{n}^r = \mathbf{n}^c$. If the components of $\mathbf{n}^r$ are $(n_1^r, n_2^r, n_3^r)$

$$\frac{k^2}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u_1^*(x, y, 0)\exp\{-2ik\mathbf{n}^r\cdot\mathbf{x} + ik(lx + my)\}\,dx\,dy = \{U(2n_1^r - l, 2n_2^r - m)\}^*$$

and hence

$$v_2(z) = a_1CB\left\{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(l + 2n_1^r, m + 2n_2^r)\exp\{-ik(lx + my - nz)\}\,dl\,dm\right\}^*. \tag{9.72}$$
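The reciprocity relation (9.71) lends itself to a quick numerical check with a discrete plane-wave (angular spectrum) representation. The sketch below is a one-dimensional analogue under stated assumptions: the field is sampled on a periodic grid, the FFT replaces the integrals in (9.68) and (9.69), and the grid spacing is chosen coarser than half a wavelength so that every sampled plane wave is homogeneous, mirroring the assumption that $U$ vanishes in the invisible part of the spectrum. The function name `propagate` is illustrative.

```python
import numpy as np

def propagate(u0, dx, k, z):
    """Propagate a sampled 1-D field from the plane z = 0 to the plane z.

    Each plane wave acquires the factor exp(-i k n z), as in (9.69), with
    k n = sqrt(k^2 - kx^2). Evanescent components are not supported.
    """
    kx = 2 * np.pi * np.fft.fftfreq(u0.size, d=dx)
    if np.max(np.abs(kx)) >= k:
        raise ValueError("grid too fine: evanescent components would be present")
    kz = np.sqrt(k ** 2 - kx ** 2)
    return np.fft.ifft(np.fft.fft(u0) * np.exp(-1j * kz * z))

rng = np.random.default_rng(2)
k = 2 * np.pi                      # unit wavelength
dx = 1.0                           # sample spacing, so |kx| <= pi < k
u0 = rng.normal(size=256) + 1j * rng.normal(size=256)   # arbitrary field on z = 0

z = 7.5
v2 = propagate(np.conj(u0), dx, k, z)          # field equal to u1* on z = 0, carried to z
mirror = np.conj(propagate(u0, dx, k, -z))     # complex conjugate of u1 at -z
```

The two arrays agree to rounding error, which is the discrete counterpart of $v_2(x, y, z) = \{u_1(x, y, -z)\}^*$.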

We shall now make the additional assumption that $U(l, m)$ is negligible unless $l^2 + m^2 \ll 1$. Then squares of $l$ and $m$ can be ignored and (9.69) becomes

$$u_1(x, y, z) = \exp(-ikz)\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(l, m)\exp\{-ik(lx + my)\}\,dl\,dm. \tag{9.73}$$

If, in (9.72), $l + 2n_1^r$ and $m + 2n_2^r$ are taken as new variables

$$n = \{1 - (l - 2n_1^r)^2 - (m - 2n_2^r)^2\}^{1/2} \approx \{\nu^2 + 4(ln_1^r + mn_2^r)\}^{1/2}$$

on rejecting squares of $l$, $m$ and putting $\nu^2 = 1 - 4(n_1^r)^2 - 4(n_2^r)^2$. Whenever $\nu$ is not near zero, the approximation

$$n \approx \nu + \frac{2(ln_1^r + mn_2^r)}{\nu}$$

is legitimate. Consequently, in these circumstances

$$v_2(z) = a_1CB\left\{u_1\left(x - \frac{2n_1^rz}{\nu},\; y - \frac{2n_2^rz}{\nu},\; -\frac{z}{\nu}\right)\right\}^*\exp\left\{-2ik(n_1^rx + n_2^ry) + \frac{ikz(1 - \nu^2)}{\nu}\right\}. \tag{9.74}$$

Thus $v_2$ carries much the same information for positive $z$ that $u_1$ does for negative $z$. Since $u_1$ must account for the presence of the obstacle in negative $z$, $v_2$ must produce a real image of the object to the right of the hologram. If most of the object lies pretty well on $z = z_1$, the image will occur on $z = -\nu z_1$. Apart from a factor of proportionality and a phase factor, the complex amplitudes of corresponding points of the object and image are complex conjugates of one another. However, whereas the object is disposed about the z axis, the image is displaced from it. In fact, if the object point $(x_o, y_o, z_1 + \epsilon)$, $\epsilon$ being small, corresponds to the image point $(x_i, y_i, z_i)$

$$x_i = x_o - 2n_1^rz_1, \qquad y_i = y_o - 2n_2^rz_1, \qquad z_i = -\nu(z_1 + \epsilon). \tag{9.75}$$

These relations show firstly that the image is displaced by a distance $(1 - \nu^2)^{1/2}|z_1|$ from the z axis. Secondly, transverse changes in which only $x_o$ and $y_o$ vary are reproduced exactly but longitudinal displacements in which $\epsilon$ alters are subject to a magnification $\nu$. Thus the image is a distorted replica of the object, being squashed lengthwise (Fig. 9.9).

The analysis requires that $\nu$ be not small. Since $\nu$ is zero when the reference wave makes an angle of 30° with the z axis, we want the direction of propagation of the reference wave to be inclined at somewhat less than 30° to the z axis. However, for a good image the angle must not be too small. The reason is that $v_3$ essentially reproduces the reconstruction wave and a radiating wave. The image will therefore be easily distinguished from $v_1$ (another radiating wave) and $v_3$ provided that the image is not in the same direction as the reconstruction wave. The formulae (9.75) indicate that this is true provided that $n_1^r$ and $n_2^r$ are not too small.

Fig. 9.9. Positions of real and virtual images.

Exercises

20. If $\mathbf{n}^r\cdot\mathbf{k} = \mathbf{n}^c\cdot\mathbf{k}$ and $\mathbf{n}^r\wedge\mathbf{k} = -\mathbf{n}^c\wedge\mathbf{k}$, where $\mathbf{k}$ is a unit vector along the z axis, prove that $v_2$ is responsible for a real image on the z axis. (Hint: eqn (9.71).) Is this image a distorted version of the object? Prove also that $v_1$ has a virtual image which is displaced from the z axis and subject to longitudinal deformation.

21. If $\mathbf{n}^c$ is parallel to the z axis but $\mathbf{n}^r$ is not, show that $v_1$ entails a virtual image and $v_2$ a real image, both images being displaced from the z axis and being deformed versions of the object.

22. In the situation of Exercise 21 prove that $v(0)$ is real. (Because of the photographic processing it is non-negative.) If $U(l, m)$ is non-zero only where $|l| \le L$, $|m| \le L$ where $18L^2 \le (n_1^r)^2 + (n_2^r)^2$ prove that

$$u_1(x, y) = \frac{1}{\pi^2a_1CB^*}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} v(0)\,\frac{\sin kL(x - x')\sin kL(y - y')}{(x - x')(y - y')}\exp\{-ik(n_1^rx' + n_2^ry')\}\,dx'\,dy'.$$

This demonstrates that the amplitude and phase of the scattered field can be determined from measurements of the intensity of the field emerging from the hologram. Deduce that when the reference wave makes an angle of about 30° with the z axis details down to about nine wavelengths of the illuminating light can be distinguished (Wolf 1970).

9.12 Location of an inhomogeneity

In the detection of an inhomogeneity in an otherwise uniform medium by holography the blob of deviant material is treated as the object. The incident wave $u^i$ is chosen to be a plane wave moving in the direction of the unit vector $\mathbf{n}^i$. If the blob is sufficiently small for the Rayleigh-Gans or Born approximation to be applicable then, at large distances,

$$u_1(\mathbf{x}) \sim \frac{A^i\exp(-ik|\mathbf{X}|)}{4\pi|\mathbf{X}|}\int_{-\infty}^{\infty} n(\mathbf{y})\exp\{ik\mathbf{y}\cdot(\hat{\mathbf{X}} - \mathbf{n}^i)\}\,d\mathbf{y}$$

where $u^i = A^i\exp(-ik\mathbf{n}^i\cdot\mathbf{x})$ at the blob, $\mathbf{X} = \mathbf{x} - z_1\mathbf{k}$, and $n$ represents the difference between the refractive index in the blob and that in the surrounding medium. According to Exercise 22 measurements of the intensity emergent from the hologram will give $u_1$ and thereby supply the angular Fourier transform of $n$. By varying the angle of incidence the three-dimensional Fourier transform becomes available and so $n$ follows by Fourier inversion (Wolf 1969). Although this is theoretically true, the fact that the resolution cannot be brought below nine wavelengths imposes severe practical limitations.

9.13 Field in the aperture of an antenna

The theory of holography developed above has concentrated on optical aspects, but the principles have been applied in other areas. One such is the calculation of the aperture distribution of an antenna from measurements of the radiation pattern (Bates 1971; Napier and Bates 1971, 1973; Ransom and Mittra 1971). To avoid the complication of vector fields, which intensify the detail without affecting the principles, the analysis is restricted to scalar fields.

The field $u$ is generated from the aperture $A$ in the antenna at $z = z_1$ and the aperture distribution is to be inferred from measurements on $z = 0$ (Fig. 9.10). There is therefore a distinct similarity to the problem treated in §9.11. Assuming that $u$ is zero on the antenna outside the aperture

$$u(x, y, 0) = -\frac{1}{2\pi}\frac{\partial}{\partial z_1}\int_A u(x', y', z_1)\frac{\exp[-ik\{(x - x')^2 + (y - y')^2 + z_1^2\}^{1/2}]}{\{(x - x')^2 + (y - y')^2 + z_1^2\}^{1/2}}\,dx'\,dy'. \tag{9.76}$$

Because of the postulate on u, the integral in (9.76) can be extended over the whole plane z = Z 1. Then the measured u is the convolution of the required distribution and a known function. A direct solution of (9.76) by discretization or Galerkin methods is notoriously awkward. Instead, it is more remunerative to invert the convolution equation which is, in effect, what holography does. The inversion of a convolution is a common occurrence because the effect of many linear devices on a signal is to form a convolution. For example, this happens when the reception pattern of an antenna, together with any linear filtering and detection, is taken into account. Again, correlation functions, transmission through a fluctuating medium, X-ray diffraction, crystoholography, and the restoration of images blurred by motion are all examples

HOLOGRAPHIC TECHNIQUES

Fig. 9.10. Geometry for aperture calculation (measurement plane at z = 0; antenna aperture at z = z_1).

where convolution plays an important part (Jones and Misell 1970; Sondhi 1972; Frieden and Burke 1972; Bates et al. 1974; Bates and Gough 1975; Bates and Lewitt 1975). A few remarks on the difficulties involved are therefore in order. It is sufficient to discuss convolution in one dimension, so we consider the problem of finding $g$ given $f$ and $h$ in

$$f(x) = \int_{-\infty}^{\infty} g(t)\,h(x - t)\,dt. \tag{9.77}$$

Define Fourier transforms by capital letters so that, for example,

$$F(\alpha) = \int_{-\infty}^{\infty} f(x)\exp(-i\alpha x)\,dx.$$

Then (9.77) leads to

$$F(\alpha) = G(\alpha)H(\alpha)$$

whence

$$G(\alpha) = \frac{F(\alpha)}{H(\alpha)}. \tag{9.78}$$
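The division in (9.78) can be tried on sampled data. The following is only a sketch under stated assumptions: the convolution is taken to be periodic (circular) on 15 samples, a plain discrete Fourier transform stands in for a fast transform, and the names `dft`, `circular_convolve` and `deconvolve`, the three-point blur `h` and the noise level are all hypothetical choices made for illustration. The blur is chosen so that the smallest value of |H| is about 0.011, which makes the effect of a small perturbation of f visible.

```python
import cmath
import random

def dft(values, inverse=False):
    """Plain O(n^2) discrete Fourier transform, standing in for an FFT."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                    for j in range(n))
        out.append(total / n if inverse else total)
    return out

def circular_convolve(g, h):
    """Periodic, discrete version of the convolution (9.77)."""
    n = len(g)
    return [sum(g[t] * h[(x - t) % n] for t in range(n)) for x in range(n)]

def deconvolve(f, h):
    """Discrete analogue of (9.78): divide the transforms, then invert."""
    F, H = dft(f), dft(h)
    G = [Fk / Hk for Fk, Hk in zip(F, H)]
    return [value.real for value in dft(G, inverse=True)]

n = 15
g_true = [1.0 if 4 <= t < 8 else 0.0 for t in range(n)]

# Hypothetical symmetric blur: H_k = 0.5 + 0.5*cos(2*pi*k/n), small but never zero for odd n.
h = [0.0] * n
h[0], h[1], h[-1] = 0.5, 0.25, 0.25
min_abs_H = min(abs(Hk) for Hk in dft(h))

# Noise-free data: the division recovers g to rounding error.
f_clean = circular_convolve(g_true, h)
g_clean = deconvolve(f_clean, h)
err_clean = max(abs(u - w) for u, w in zip(g_clean, g_true))

# Slightly noisy data: the same division now carries the noise divided by H.
random.seed(1)
f_noisy = [value + random.gauss(0.0, 1e-3) for value in f_clean]
g_noisy = deconvolve(f_noisy, h)
err_noisy = max(abs(u - w) for u, w in zip(g_noisy, g_true))

print("min |H| =", min_abs_H)
print("error without noise =", err_clean)
print("error with noise    =", err_noisy)
```

Comparing the two printed errors shows how strongly the recovered g depends on the quality of the data wherever |H| is small.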

Thus $g$ is the convolution of $f$ and the inverse transform of the reciprocal of $H$. Conditions for the validity of this process depend upon the properties one is prepared to ascribe to $f$, $g$, and $h$. For example, (9.77) can be formulated in terms of generalized functions (Jones 1966; 1982) but for our purposes ample scope is available by sticking to conventional functions. In that case $F/H$ and $F$ would need to be square integrable when $h$ is integrable. At any rate, whenever $H$ vanishes for a real $\alpha$, $F$ must do likewise if an unacceptable singularity in $G$ is to be evaded. We might attempt to find $g$ by applying a fast (inverse) Fourier transform to (9.78). This should work satisfactorily so long as $|H|$ is never small compared with $|F|$. It has already been remarked that $F$ and $H$ should vanish together but,

regrettably, $F$ is obtained from measurements and so suffers noise contamination. For that reason there is likely to be some discrepancy between a zero of $F$ and the corresponding one of $H$. The result is that the inversion will be very noise sensitive. The complaint will be aggravated if $H$ is also derived from observations. In other words, the inverse transform should be employed if $|H|$ is always large enough to swamp the noise in $F$ but when this is not so an alternative technique is apposite.

One possibility is to deform the contour off the real axis to one where $|H|$ is never small. $F$ must then be calculated for complex $\alpha$ from the data for $f$. Unfortunately, the net result is to amplify the noise from the data so the inversion will probably be as intolerably noise sensitive as before.

Nevertheless, it has been suggested (Bates et al. 1976) that it is advantageous to study (9.78) in the complex $\alpha$ plane with particular reference to zeros in the left half-plane where $\Re(\alpha) < 0$. Wherever $F$ has a zero either $G$ or $H$ must have one also. By identifying those zeros of $F$ which agree with those of $H$ we can deduce the positions of the zeros of $G$. In practice, when $F$ and $H$ are obtained from measured data, there will not be precise coincidence and it will be necessary to resort to determining by inspection which pairs are closest together, supplemented by minimizing the sum of the squares of distances between pairs where it is not clear which are correspondents. There may be some zeros of $H$ left over by this process. The data are then inconsistent but just ignoring the surplus zeros of $H$ may be successful at the expense of increasing the uncertainty in $g$. Any zeros of $F$ now remaining are attributed to $G$. Assuming that properties of $g$ can be deduced from the zeros of $G$, a matter which will be turned to shortly (§9.14), a method has been devised for carrying out the inverse convolution.

It might happen that, instead of knowing $H$, we were given $|H|$ for real $\alpha$.
Since $|H|^2$ is the Fourier transform of the autocorrelation of $h$ for real $\alpha$ we can find $|H|^2$ in the complex plane from this autocorrelation. In such circumstances $h$ is usually real and the autocorrelation is then an even function. As a consequence the complex conjugate of a zero of $H$ is also a zero of $|H|^2$. Despite the zeros of $|H|^2$ occurring in complex conjugate couples the ones belonging to $H$ can be identified as those common to $F$. Thus the zeros to be associated with $G$ can be determined. Consequently we can hope to find $g$ even though information on the phase of $H$ is lacking. Another case in which progress can be made is that in which $g(t)$ and $h(t)$ are known to be zero outside the intervals $(t_0, t_0 + l_g)$ and $(-t_1 - l_h, -t_1)$ respectively ($l_g$, $l_h$ being positive) and the modulus of the Fourier transform of $g(t) + h(-t)$ is available. Effectively, this is what transpires in holography if we make the further supposition that $t_0 - t_1 - l_h$ exceeds the greater of $l_g$ and $l_h$. The convolution of $g(t)$ and $g(-t)$ or autocorrelation of $g$ vanishes outside $(-l_g, l_g)$ while that of $h$ disappears outside $(-l_h, l_h)$. On the other hand, $f$ is zero outside $(t_0 - t_1 - l_h, t_0 - t_1 + l_g)$ which is disjoint from $(-l_g, l_g)$ and $(-l_h, l_h)$ because of our supposition. Thus, when $x$ surpasses the greater of $l_g$

and $l_h$, the autocorrelation of $g(t) + h(-t)$ reproduces that of $f$ in the whole interval where it is non-zero and nothing else. But this autocorrelation is the inverse transform of the square of the modulus of the Fourier transform. Consequently, $f$ can be discovered from the information supplied by an inverse transform. If, now, $|H|$ is known the method of the last paragraph provides $g$.

9.14 Zeros of entire functions

In this section the relation between $f$ and the zeros of its transform will be discussed. It will be assumed, mainly on physical grounds, that $f$ is non-zero only on the interval $(-a, a)$; a simple shift places this interval anywhere on the real axis. Then

$$F(\alpha) = \int_{-a}^{a} f(x)\exp(-i\alpha x)\,dx. \tag{9.79}$$

We shall have in mind that $f$ is continuous though this is by no means essential. The integral in (9.79) converges uniformly for every bounded (complex) $\alpha$ and so do its derivatives with respect to $\alpha$. The integrand is a continuous function of $\alpha$ and $x$ for real $x$ and complex $\alpha$. Furthermore, the integrand is a regular function of $\alpha$ for the real $x$ involved. Hence $F(\alpha)$ is a regular function of $\alpha$ for all finite complex $\alpha$ and is, in fact, an entire (or integral) function. Moreover

$$|F(\alpha)| < C\exp(a|\Im\alpha|) \tag{9.80}$$

for some finite constant $C$ as $|\alpha| \to \infty$. Thus $F(\alpha)$ is actually an entire function of exponential type. As a matter of fact, a theorem of Paley and Wiener (1934) demonstrates the converse, namely that, if $F$ is an entire function satisfying (9.80), $f$ is zero outside $(-a, a)$. It is transparent from (9.80) that $F$ is of less than exponential growth on the real $\alpha$ axis. An application of Carleman's theorem (Titchmarsh 1939) reveals that the zeros of $F$ must cluster about the real $\alpha$ axis. More specific information is available from consideration of the behaviour of $f$ near its end-points. Suppose that

$$f(x) \sim (x + a)^{e_1} B_1 \quad (x \to -a), \qquad f(x) \sim (a - x)^{e_2} B_2 \quad (x \to a). \tag{9.81}$$

Actually, (9.81) holds if both $e_1$ and $e_2$ are non-positive. If either is positive there may be more important contributions from interior points where the derivative of $f$ is discontinuous (Jones 1966a, 1969). If the derivative of $f$ is continuous (9.81) is valid if neither $e_1$ nor $e_2$ exceeds unity. We shall assume that the conditions are such that it is legitimate to employ (9.81). The case of most interest is when $e_1 = e_2$. Then (9.81) shows that if $F$ vanishes for some particular $\alpha$ it does so when $\alpha$ is increased by an integer multiple of $\pi/a$. So there is an infinite string of zeros running roughly parallel to the real axis. They cannot be far from the real axis otherwise one of the exponentials would dominate and $F$ could not be zero; this is in accordance with Carleman's theorem. In the particular case $e_1 = e_2$ and $B_1 = B_2$ the zeros of (9.81) occur at $\alpha = \pi(\tfrac12 e_1 + n)/a$, $n$ being a large positive integer. If, instead, $B_1 = -B_2$ the large zeros of $F$ are at $\alpha = \pi(\tfrac12 e_1 + \tfrac12 + n)/a$. A similar investigation on the negative real axis places zeros at $\alpha = -\pi(\tfrac12 e_1 + n)/a$ and $\alpha = -\pi(\tfrac12 e_1 + \tfrac12 + n)/a$ respectively in the two cases. Of course, these locations of the zeros are only a first approximation because lower-order terms have been neglected in (9.81). More refined calculations may push the zeros somewhat off the real axis.

The device for fabricating $F$ from its zeros consists in taking care of the large zeros analytically whereas those at a finite distance from the origin are handled by inspection. However, some additional information about $f$ is essential because the large zeros when $e_1 = \tfrac12$ and $B_1 = B_2$ occupy the same positions as those when $e_1 = -\tfrac12$ and $B_1 = -B_2$. Suppose therefore that $B_1 = B_2$. Now

$$\int_{-a}^{a}\left(1 - \frac{x^2}{a^2}\right)^{\nu - 1/2}\exp(-i\alpha x)\,dx = (\nu - \tfrac12)!\,\pi^{1/2}\,a\,\frac{J_\nu(\alpha a)}{(\tfrac12\alpha a)^{\nu}}. \tag{9.82}$$

Since the integrand of (9.82) has the same end-point behaviour as $f$ when $\nu = e_1 + \tfrac12$ the large zeros of $J_\nu(\alpha a)/\alpha^\nu$ will agree with those of $F$. Let $\alpha_1, \alpha_2, \ldots$ be the zeros of $F$ arranged so that $|\alpha_1| \le |\alpha_2| \le \cdots$ and suppose that those for $n > 2N$ coincide with those of the Bessel function (note that an entire function can have only a finite number of zeros within a finite distance of the origin). Consider

$$\tilde F(\alpha) = \frac{J_\nu(\alpha a)}{\alpha^\nu}\,\frac{\prod_{n=1}^{2N}(\alpha_n - \alpha)}{\prod_{n=1}^{N}(j_{\nu n}^2 - a^2\alpha^2)} \tag{9.83}$$

where $\nu = e_1 + \tfrac12$ and $j_{\nu n}$ is the $n$th positive zero of $J_\nu$ so that $J_\nu(j_{\nu n}) = 0$ and $j_{\nu n} > j_{\nu, n-1}$. The zeros of the denominator remove those of the Bessel function which may not be in agreement with ones in $F$. Thus $\tilde F$ has the same zeros as $F$. Also $\tilde F$ is entire because $J_\nu(\alpha a)/\alpha^\nu$ is entire and the zeros of the denominator are cancelled, as just explained, so that there is no trouble from that quarter. Hence an entire function has been manufactured with the same zeros as $F$.
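The rule for the large zeros quoted in this section can be checked on a case where the transform is known in closed form. As an illustrative assumption (not taken from the text) let f(x) = a² − x² on (−a, a), so that e_1 = e_2 = 1 and B_1 = B_2 = 2a; direct integration of (9.79) then gives F(α) = 4{sin(αa) − αa cos(αa)}/α³. The sketch below locates zeros of this F by bisection and compares them with the predicted positions α = π(½e_1 + n)/a. The helper names `phi` and `bisect` are hypothetical.

```python
import math

def phi(z):
    """Numerator of F for f(x) = a^2 - x^2, written in terms of z = alpha * a."""
    return math.sin(z) - z * math.cos(z)

def bisect(func, lo, hi, iters=80):
    """Simple bisection; assumes func changes sign between lo and hi."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = 1.0
e1 = 1.0   # end-point exponent of the assumed f
gaps = []
for n in (5, 20, 80):
    # phi changes sign between n*pi and (n + 1/2)*pi for n >= 1.
    zero = bisect(phi, n * math.pi, (n + 0.5) * math.pi) / a
    predicted = math.pi * (0.5 * e1 + n) / a
    gaps.append(predicted - zero)
    print(n, zero, predicted, predicted - zero)
```

The gap between the computed and predicted zeros shrinks as n grows, in line with the remark that the quoted positions are a first approximation in which lower-order terms have been neglected.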

SYNTHESIS OF RADIATION PATTERNS

Now $F/\tilde F$ is an entire function without zeros on account of the coincidence of the zeros of $F$ and $\tilde F$. Hence $F = A\exp(b\alpha)\tilde F$ for some complex constants $A$ and $b$. By allowing $|\alpha| \to \infty$ we see from (9.83) that (9.80) will be violated unless $b = 0$. Thus $F$ is, in fact, a constant multiple of $\tilde F$. Therefore, by inversion of $\tilde F$, $f$ is found to within a constant multiple. This constant multiple cannot be resolved unless a further piece of information, such as the specification of a non-zero value of $f$ or $F$ at some particular point, is available. If $B_1 = -B_2$, the contender to replace $J_\nu$ is the Struve function $\mathbf{H}_\nu$. If $|B_1| \ne |B_2|$ a linear combination of $J_\nu$ and $\mathbf{H}_\nu$ would be appropriate. The preceding approach is based on the assumption of data uncontaminated by noise. Procedures when noise is a vital feature of the scene have been proposed (Bates and Napier 1972, Bates 1969) but the cardinal principle that only a few zeros not far from the origin necessitate special treatment remains unaltered.

SYNTHESIS OF RADIATION PATTERNS

9.15 General considerations

The problems that have been considered hitherto in this chapter are, in essence, all examples of solving an equation of the type

$$Tf = g \tag{9.84}$$

where $T$ is an operator. Measurements of $g$ have been assumed to be available and we have to determine $f$, the originator of these measurements. Often, it is handy to be able to design antennas which radiate a prescribed pattern. Then we have a recipe for $g$ and require the antenna or $f$ that is responsible for it. We are still faced with the resolution of (9.84) but there are important differences of emphasis. If $g$ is chosen arbitrarily, it may not be a possible radiation pattern and exact synthesis is then out of the question. None the less, an approximate synthesis may be accessible and be accurate enough for design purposes in the sense that $Tf$ is sufficiently close to $g$. If $g$ is a realizable pattern then (9.84) has a solution; whether it is unique is of less significance than in inverse probing because it is enough to find a solution acceptable to the designer. The designer will, of course, wish to know if large changes in the radiation will arise if he does not construct $f$ precisely; this will depend upon the properties of $T$. Conversely, he will also wish to be aware if small changes in the prescription of $g$ will enforce large alterations of $f$; a dominant role will be played by $T^{-1}$ in this context.

Some idea of what happens can be gained from a discussion of a one-dimensional array which occupies the interval $(-a, a)$ of the $x$ axis. The excitation $f(x)$ is taken to be continuous on this interval and zero outside.

In other words, the excitation is aperture limited. If the angle $\theta$ is measured from the broadside direction the radiation pattern is given by

$$g(\alpha) = e(\alpha)\int_{-a}^{a}\exp(i\alpha x)f(x)\,dx \tag{9.85}$$

where $\alpha = k\sin\theta$ and $e(\alpha)$ is the pattern due to an element of the aperture. Evidently, $g$ is defined only for $-k \le \alpha \le k$ but the discussion of (9.79) indicates that the values in this interval must be those attained by a product of $e(\alpha)$ and an entire function of exponential type which satisfies (9.80). Thus $g$ cannot be chosen arbitrarily for (9.85) to be soluble. On the other hand, if $g$ is such that (9.85) can be solved the solution is unique, for the difference between any two solutions for the same $g$ would have zero Fourier transform in $-k \le \alpha \le k$ and the only entire function with this property is itself zero. If $g$ is continuous without being a member of the class for which (9.85) possesses a solution it has been proved (Bouwkamp and De Bruijn 1946) that there are functions $f_n$ with corresponding pattern functions $g_n$ such that $g_n$ can be made as close to $g$ as desired by increasing $n$. However, as $n$ grows, $f_n$ becomes highly oscillatory and it becomes steadily more difficult to create a design incorporating it. The problem is related to supergain and superdirectivity; it may be regarded as an unwelcome growth in the norm of the aperture function. To put it another way superdirectivity demands aperture currents which are prohibitively large for a given radiated power and which are highly susceptible to errors of manufacture (Woodward and Lawson 1948).

In many practical applications $g(\alpha)$ is not prescribed for all $\alpha$. Rather its qualitative behaviour is described, e.g. the main beam being of a certain width and the side lobes being below a specified threshold or having nulls in designated positions. In essence, this means that $f$ has to reproduce $g$ only at some discrete values of $\alpha$ and give reasonable performance in between. This allows $f$ much more freedom and goes some way to resolving the difficulties associated with the solubility of (9.85).
For, if $f$ is physically realizable and does reproduce $g$ at the discrete $\alpha$, it is bound to generate a feasible radiation pattern. Instead, the question becomes more one of finding $\nu$, $N$, $\alpha_n$ in (9.83) to fit the desired properties than one of solving (9.85). With this point of view in mind we shall examine in the next few sections how a suitable $f$ might be obtained under various conditions. Two points have to be addressed: (i) the accuracy with which $f$ reproduces the desired values of $g$, and (ii) the relation between the main beam and side lobes; they will be dealt with in that order. Despite the concentration on (9.85) the fundamental ideas carry over to two-dimensional finite arrays where the detail is more complicated but still tractable and iteration can be applied to secure improvement (Hussain et al. 1992).

9.16 Synthesis by series expansion

One method of attacking (9.85) is to expand $f$ via a series of known functions through

$$f(x) = \sum_{n=0}^{\infty} f_n h_n(x) \tag{9.86}$$

and determine the coefficients $f_n$ by sampling $g$ at a number of points $\alpha = \lambda_m$. By this means we arrive at the equations

$$h(\lambda_m) = \sum_{n=0}^{\infty} f_n\int_{-a}^{a} h_n(x)\exp(i\lambda_m x)\,dx \tag{9.87}$$

if $h(\alpha)$ is written for $g(\alpha)/e(\alpha)$. The equations (9.87) simplify considerably if it is possible to choose the functions $h_n$ so that

$$\int_{-a}^{a} h_n(x)\exp(i\lambda_m x)\,dx = \begin{cases} 0 & (m \ne n) \\ 1 & (m = n) \end{cases} \tag{9.88}$$

i.e. the sets $\{h_n\}$ and $\{\exp(-i\lambda_m x)\}$ are biorthogonal (§5.2), for then (9.87) reduces to $f_n = h(\lambda_n)$ and the coefficients of (9.86) are determined immediately from pattern values at the sampling points. To find the correspondence between $h_n$ and $\lambda_m$ let

$$K(\alpha) = \prod_{m=0}^{\infty}\left(1 - \frac{\alpha}{\lambda_m}\right)$$

on the understanding that $1 - \alpha/\lambda_m$ is replaced by $-\alpha$ if $\lambda_m = 0$. Under proper conditions on the incrementation of $\lambda_m$ it can be demonstrated (Levinson 1940) that $K$ is effectively bounded by $\exp(a|\Im\alpha|)$ as $|\alpha| \to \infty$. Therefore, if we define

$$h_n(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{K(\alpha)\exp(-i\alpha x)}{(\alpha - \lambda_n)K'(\lambda_n)}\,d\alpha, \tag{9.89}$$

$h_n$ is zero for $x < -a$ and $x > a$ as can be seen by deforming the contour to infinity. Furthermore, by Fourier inversion

$$\frac{K(\alpha)}{(\alpha - \lambda_n)K'(\lambda_n)} = \int_{-a}^{a} h_n(x)\exp(i\alpha x)\,dx.$$
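The biorthogonality condition (9.88) can be verified numerically for the simplest choice of sampling points. The sketch below assumes equally spaced points λ_m = mπ/a with the Fourier basis h_n(x) = exp(−inπx/a)/(2a); the factor 1/(2a) is an assumption added here so that the integral in (9.88) equals exactly 1 when m = n. The function names, the midpoint quadrature and the sample coefficients are hypothetical.

```python
import cmath
import math

def h_fourier(n, x, a):
    """Assumed basis function exp(-i n pi x / a) / (2a) on (-a, a)."""
    return cmath.exp(-1j * n * math.pi * x / a) / (2.0 * a)

def overlap(n, m, a=1.0, points=400):
    """Midpoint-rule estimate of the integral in (9.88) with lambda_m = m pi / a."""
    dx = 2.0 * a / points
    lam_m = m * math.pi / a
    total = 0j
    for j in range(points):
        x = -a + (j + 0.5) * dx
        total += h_fourier(n, x, a) * cmath.exp(1j * lam_m * x) * dx
    return total

def pattern_sample(coeffs, m, a=1.0):
    """h(lambda_m) from (9.87) for a finite set of coefficients {n: f_n}."""
    return sum(f_n * overlap(n, m, a) for n, f_n in coeffs.items())

# Hypothetical excitation coefficients f_n.
coeffs = {-1: 0.5, 0: 1.0, 1: 0.25j}
recovered = {m: pattern_sample(coeffs, m) for m in coeffs}

print(abs(overlap(2, 2)), abs(overlap(2, 3)))
print(recovered)
```

With this basis each pattern sample h(λ_m) returns the coefficient f_m directly, which is the simplification that (9.88) is designed to deliver.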

The substitution $\alpha = \lambda_m$ leads instantly to (9.88). Consequently, (9.89) determines $h_n$ once the sampling points have been chosen. The advantage of this is that the sampling points can be placed conveniently for measurement and can be arranged so that they fit in with any specified end-point behaviour consistent with §9.14. The theory holds strictly

if there is one and precisely one sampling point within a distance of $\pi/4a$ of $n\pi/a$ for every integer $n$ and no other sampling points. However, this is not likely to be of any significance in practical synthesis when the expansion in (9.86) will probably be limited to a finite number of terms and $g$ itemized at a modest number of points. Indeed, values of $g$ are at hand only for $-k \le \alpha \le k$. Therefore, if $f$ were expanded in a Fourier series ($h_n(x) = \exp\{-in\pi x/a\}$, $\lambda_n = n\pi/a$) the number of sampling points available would be $2[ka/\pi] + 1$ where $[x]$ is the largest integer which does not exceed $x$. The number of determinable coefficients in $f$ would be the same and could be increased (if this were felt to improve accuracy) only at the expense of making $ka$ larger, i.e. putting more wavelengths into the aperture. More sampling points with fixed $ka$ can be achieved by defining $\lambda_n$ suitably and finding the associated $h_n$. There is always the danger, however, that if the number of coefficients is much larger than twice the number of wavelengths in the aperture $f$ will be oscillatory and of high norm, as described in the previous section.

More freedom of action can be secured by dropping the condition (9.88) on the basis functions $h_n$. Also it is well to recognize that, in practice, (9.86) will be constrained to a finite number of terms. Therefore, put

$$f(x) = \sum_{n=1}^{N} f_n h_n(x)$$

whence

$$g(\alpha) = \sum_{n=1}^{N} f_n e(\alpha)\int_{-a}^{a} h_n(x)\exp(i\alpha x)\,dx. \tag{9.90}$$

Obviously, (9.90) is exactly soluble only when $g$ is in the space spanned by the elements of the right-hand side. Whether this is a severe limitation will depend upon the pattern to be synthesized and the basis functions employed. When $g$ has to be reproduced at only a finite number of values of $\alpha$, all that is necessary is to make sure that $N$ is large enough. If $g$ is specified at $\alpha = \lambda_m$ then (9.90) implies

$$g(\lambda_m) = \sum_{n=1}^{N} f_n e(\lambda_m)\int_{-a}^{a} h_n(x)\exp(i\lambda_m x)\,dx \tag{9.91}$$

for $m = 1, \ldots, M$. Define the column vectors $f$, $g$ to have the components $f_1, f_2, \ldots$ and $g(\lambda_1), g(\lambda_2), \ldots$ respectively. Then, if $B$ is the $M \times N$ matrix with elements

$$B_{mn} = e(\lambda_m)\int_{-a}^{a} h_n(x)\exp(i\lambda_m x)\,dx, \tag{9.92}$$

(9.91) can be expressed in matrix form as

$$g = Bf. \tag{9.93}$$
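To make the matrix form (9.93) concrete, the sketch below assembles B for point sources. It assumes, as (9.90) suggests, that B_mn = e(λ_m) ∫ h_n(x) exp(iλ_m x) dx, and it borrows the element pattern e(α) = (1 − α²/k²)^{1/2} and the basis h_n(x) = δ(x − x_n) from Exercise 23, so that B_mn = e(λ_m) exp(iλ_m x_n). The source positions, the sampling points and the function names are hypothetical choices.

```python
import cmath
import math

def element_pattern(alpha, k):
    """e(alpha) = (1 - alpha^2/k^2)^(1/2), taken as zero outside |alpha| <= k."""
    return math.sqrt(max(0.0, 1.0 - (alpha / k) ** 2))

def build_B(lams, xs, k):
    """B[m][n] = e(lambda_m) * exp(i * lambda_m * x_n) for delta-function sources."""
    return [[element_pattern(lam, k) * cmath.exp(1j * lam * x) for x in xs]
            for lam in lams]

def matvec(B, f):
    """Pattern samples g = B f."""
    return [sum(b_mn * f_n for b_mn, f_n in zip(row, f)) for row in B]

a = 1.0
k = 2.0 * math.pi / a                                   # ka = 2*pi, as in Exercise 23
xs = [-0.5 * a, 0.0, 0.5 * a]                           # N = 3 sources, one at the origin
lams = [k * s for s in (-0.8, -0.4, 0.0, 0.4, 0.8)]     # M = 5 sampling points

B = build_B(lams, xs, k)
g = matvec(B, [1.0, 1.0, 1.0])                          # uniform excitation

for lam, g_m in zip(lams, g):
    print(round(lam, 4), g_m)
```

For a uniform, symmetric excitation the computed samples are real and symmetric about α = 0, and the broadside sample equals the number of sources.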

The problem has therefore been converted to the solution of the algebraic system (9.93) subject to whatever constraints are imposed on the norms. The norms that we shall adopt are

$$\|f\|^2 = f^H C f = \sum_{i,j}^{N} f_i^* C_{ij} f_j, \qquad \|g\|^2 = g^H D g = \sum_{i,j}^{M} g_i^* D_{ij} g_j \tag{9.94}$$

where $f^H$ is the conjugate transpose of $f$ (§1.10), and $C$ and $D$ are positive definite $N \times N$ and $M \times M$ matrices respectively. Frequently, it will be opportune to make $C$ and $D$ diagonal or even unit matrices. We shall assume that $C$ is such that $\sum_i f_i^* C_{ii} f_i \le \|f\|^2$.

Suppose that a solution $\tilde f$ of (9.93) has been obtained by some method. Then often $\tilde f$ will not reproduce $g$ exactly but generate instead $\tilde g = B\tilde f$. The synthesis error may then be defined by

$$E = \frac{\|g - \tilde g\|^2}{\|g\|^2}. \tag{9.95}$$

Another measure of the merit of $\tilde f$ is the quality factor

$$Q = \frac{M\|\tilde f\|^2}{\|\tilde g\|^2}. \tag{9.96}$$

9.17 Construction errors

It has to be realized that aberrations in fabrication will involve departures from $\tilde f$ in any actual construction. So, before investigating methods of finding $\tilde f$, we try to estimate the effect of small errors. Assume that $\tilde f$ becomes $\hat f$ owing to the introduction of errors where

$$\hat f_n = (1 + r_n)\tilde f_n.$$

Here $r_n$ is a random variable with zero mean and variance $v^2$. It will also be supposed that the data points are sufficiently separated for the $r_n$ to be uncorrelated with one another. In terms of expectations

$$E(r_n) = 0, \qquad E(r_n^* r_m) = v^2\delta_{nm}.$$

The restriction $v^2 \ll 1$ will also be imposed since it is anticipated that the errors are small. From (9.94),

$$E(\|\hat f\|^2) = \|\tilde f\|^2 + v^2\sum_{n=1}^{N} C_{nn}|\tilde f_n|^2$$

implying $\|\tilde f\|^2 \le E(\|\hat f\|^2) \le (1 + v^2)\|\tilde f\|^2$ by our assumption on $C$. As would be expected, the norm of $\hat f$ does not differ appreciably on average from that of $\tilde f$.

Analogous to (9.95) the average error in the radiated pattern is taken as

$$\frac{E(\|\hat g - \tilde g\|^2)}{\|\tilde g\|^2} = \frac{v^2\sum_{n=1}^{N} P_{nn}|\tilde f_n|^2}{\|\tilde g\|^2} \tag{9.97}$$

where $P_{ij}$ is a typical element in the matrix $P = B^H D B$, and $\hat g = B\hat f$. We cannot assert that the right-hand side of (9.97) is small even though $v^2$ is much less than unity because we cannot guarantee that the sum in the numerator is not much greater than the denominator. To estimate the quality put

(9.98)

when terms of the order of $v^4$ are neglected. Another figure of merit which can be relevant arises from $G = b\hat g$ where $b^2 = \|\tilde g\|^2/E(\|\hat g\|^2)$. Then $E(\|G\|^2) = \|\tilde g\|^2$ which may be interpreted as saying that $G$ is a normalized pattern whose average 'power' is the same as that of $\tilde g$. Hence

$$E_n = \frac{E(\|G - \tilde g\|^2)}{\|\tilde g\|^2} = 2(1 - b). \tag{9.99}$$

It can be forecast that $b$ will be near unity if there is little pattern deterioration so that the smaller $E_n$ is the more likely it is that the average design pattern does not deviate far from that required. So, to a certain extent, a small $E_n$ will warrant a design being robust against minor constructional errors. If $\tilde f$, $\hat f$ are exchanged for $f$, $\tilde f$ respectively the above analysis can be used to provide estimates of the errors that will result from imperfections in the solution of (9.93) by numerical methods.

9.18 Constrained aperture norm

A solution of (9.93) can be obtained by means of the generalized inverse (§1.15) of a matrix. According to Exercise 80 of Chapter 1 the solution of (9.93) which minimizes $\|g - Bf\|^2$ when $B^H B$ is positive definite is

$$f = (B^H D B)^{-1} B^H D g. \tag{9.100}$$

As already explained this is sensitive to small errors and leads to unduly high values of the norm of $f$. To prevent this the concept of regularization (Tihonov 1944, 1964; Deschamps and Cabayan 1972) is introduced; however, if it is desired to use (9.100) a possible numerical method is discussed in §9.21.

We seek to minimize

$$J = \|g - Bf\|^2 + \mu\|f\|^2 \tag{9.101}$$

where $\mu \ge 0$. The extra term $\mu\|f\|^2$ is a penalty function whose aim is to impose an increasing penalty on $J$ as the norm of $f$ gets too big. The parameter $\mu$ controls the level of the penalty. If $\mu = 0$ we recover (9.100) so we shall have in mind that $\mu > 0$. However, $\mu$ must not be too great otherwise there will be too much of a discrepancy between the $f$ found from (9.101) and that sought. Keep $\mu$ fixed. The minimization will depend upon $\mu$, which will be denoted by writing $f_\mu$. Variations of $f^H$ alone in $J$ indicate that the minimizing solution must satisfy

$$(B^H D B + \mu C)f_\mu = B^H D g. \tag{9.102}$$

Since the matrix operating on $f_\mu$ is positive definite it possesses an inverse and

$$f_\mu = (B^H D B + \mu C)^{-1} B^H D g. \tag{9.103}$$
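The regularized system (9.102) is small enough to solve directly. The sketch below is an illustration under stated assumptions rather than a recommended numerical method: C and D are taken as unit matrices, B is built for the point sources and element pattern of Exercise 23 (so B_mn = e(λ_m) exp(iλ_m x_n)), the target pattern is g = 1, and a plain Gaussian elimination stands in for a library solver. All function names, source positions and sampling points are hypothetical.

```python
import cmath
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def regularised_solution(B, g, mu):
    """Solve (B^H B + mu I) f = B^H g, i.e. (9.102) with C = D = I."""
    rows, cols = len(B), len(B[0])
    A = [[sum(B[m][i].conjugate() * B[m][j] for m in range(rows))
          + (mu if i == j else 0.0) for j in range(cols)] for i in range(cols)]
    rhs = [sum(B[m][i].conjugate() * g[m] for m in range(rows)) for i in range(cols)]
    return solve(A, rhs)

def norm2(v):
    """Squared norm with a unit weighting matrix."""
    return sum(abs(c) ** 2 for c in v)

a = 1.0
k = 2.0 * math.pi / a
xs = [-0.5 * a, 0.0, 0.5 * a]
lams = [k * s for s in (-0.8, -0.4, 0.0, 0.4, 0.8)]
B = [[math.sqrt(1.0 - (lam / k) ** 2) * cmath.exp(1j * lam * x) for x in xs]
     for lam in lams]
g = [1.0] * len(lams)

norms, errors = [], []
for mu in (1e-3, 1e-1, 1e1):
    f_mu = regularised_solution(B, g, mu)
    fitted = [sum(b * f for b, f in zip(row, f_mu)) for row in B]
    norms.append(norm2(f_mu))
    errors.append(norm2([gm - pm for gm, pm in zip(g, fitted)]) / norm2(g))
    print(mu, norms[-1], errors[-1])
```

Raising μ lowers the norm of the excitation at the cost of a larger synthesis error, which is the trade-off the penalty term is meant to control.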

Useful theoretical results can be adduced by importing the eigenfunctions $\phi_i$ and eigenvalues $\mu_i$ satisfying

$$B^H D B\phi_i = \mu_i C\phi_i. \tag{9.104}$$

Since $C$ is positive definite and $B^H D B$ positive semi-definite $\mu_i \ge 0$ and the $\phi_i$ can be normalized so that (§1.10)

$$\phi_j^H C\phi_i = \delta_{ij}, \qquad \phi_j^H B^H D B\phi_i = \mu_i\delta_{ij}. \tag{9.105}$$

Also any vector can be expressed as a linear combination of the $\phi_i$ so that

$$f_\mu = \sum_i a_{i\mu}\phi_i.$$

Since $f$ itself can be expanded similarly it follows that $g$ is representable by

$$g = \sum_i b_i B\phi_i.$$

Substitution in (9.102) and application of (9.105) leads immediately to $a_{i\mu} = \mu_i b_i(\mu + \mu_i)^{-1}$ so that

$$f_\mu = \sum_i \mu_i b_i(\mu + \mu_i)^{-1}\phi_i. \tag{9.106}$$

From (9.106) we deduce that

$$\|f_\mu\|^2 = \sum_i \mu_i^2|b_i|^2(\mu + \mu_i)^{-2}$$

and that the synthesis error is

$$E = \sum_i \mu^2\mu_i|b_i|^2(\mu + \mu_i)^{-2}\Big/\|g\|^2.$$

Consequently, as $\mu$ increases the norm $\|f_\mu\|$ decreases monotonically whereas $E$ increases steadily. The price to be paid for holding down the norm of $f$ is larger error in the radiated pattern. If $g$ is corrupted by noise, the change of $f_\mu$ is that $b_i$ becomes $b_i(1 + r_i)$. Then, if $f$ is the solution of (9.93) when there is no contamination by noise, a measure of the effect of $\mu$ is

$$\frac{E(\|f_\mu - f\|^2)}{\|f\|^2} = \sum_i\frac{(v^2\mu_i^2 + \mu^2)|b_i|^2}{(\mu + \mu_i)^2\|f\|^2}.$$

As $\mu$ increases this quantity first decreases steadily and then grows monotonically, the minimum occurring at the value of $\mu$ which satisfies

$$\sum_i\frac{(\mu\mu_i - v^2\mu_i^2)|b_i|^2}{(\mu + \mu_i)^3} = 0. \tag{9.107}$$
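The condition (9.107) is a scalar equation for μ once the eigenvalues μ_i and the coefficients b_i are known, so it can be solved by bisection. The sketch below reads (9.107) as Σ_i (μμ_i − v²μ_i²)|b_i|²/(μ + μ_i)³ = 0 and the noise measure as Σ_i (v²μ_i² + μ²)|b_i|²/{(μ + μ_i)²‖f‖²} with ‖f‖² = Σ_i |b_i|²; both readings, the sample values of μ_i, b_i and v, and the function names are assumptions made for illustration.

```python
def noise_measure(mu, eigs, b, v):
    """Assumed form of E(||f_mu - f||^2) / ||f||^2 in the eigenvector basis."""
    norm_f = sum(abs(bi) ** 2 for bi in b)
    total = sum((v * v * mi * mi + mu * mu) * abs(bi) ** 2 / (mu + mi) ** 2
                for mi, bi in zip(eigs, b))
    return total / norm_f

def stationarity(mu, eigs, b, v):
    """Left-hand side of (9.107) as read here."""
    return sum((mu * mi - v * v * mi * mi) * abs(bi) ** 2 / (mu + mi) ** 3
               for mi, bi in zip(eigs, b))

def optimal_mu(eigs, b, v, iters=200):
    """Bisection between v^2*min(mu_i) and v^2*max(mu_i), where (9.107) changes sign.

    Assumes all eigenvalues are positive.
    """
    lo = v * v * min(eigs)
    hi = v * v * max(eigs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if stationarity(mid, eigs, b, v) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical eigenvalues, expansion coefficients and noise level.
eigs = [4.0, 1.0, 0.25]
b = [1.0, 0.5, 0.25]
v = 0.1

mu_opt = optimal_mu(eigs, b, v)
print("mu_opt =", mu_opt, " v^2 =", v * v)
print("noise measure at mu_opt =", noise_measure(mu_opt, eigs, b, v))
```

Every term of the sum is non-positive at μ = v² min μ_i and non-negative at μ = v² max μ_i, which is why those two values bracket the root and why the best μ is of the order of v² when the eigenvalues are of order unity.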

Thus, by choosing $\mu$ from (9.107), we can ensure that $f_\mu$ is in its most stable regime against noise disturbances. The smallness of $v$ suggests that the best choice of $\mu$ will be round about $v^2$. Hence, when information is available about the noise characteristics, an optimal design can be attempted, though it involves the solution of (9.104). The corresponding values of $\|f_\mu\|$ and the synthesis error can be derived from formulae already given. An alternative approach (Mautz and Harrington 1973) is to require that $\|f_\mu\|^2 \le K$ where $K$ is some constant prescribed in advance. In view of the monotonic decrease of $\|f_\mu\|$ with increasing $\mu$, the least value of $\mu$ which will be successful satisfies

$$\sum_i \mu_i^2|b_i|^2(\mu + \mu_i)^{-2} = K. \tag{9.108}$$
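Because the left-hand side of (9.108) decreases monotonically with μ, the least admissible μ can again be found by bisection. The sketch below assumes the eigenvalues μ_i and coefficients b_i are already available and that K > 0; the sample values and the function names are hypothetical. When K is at least Σ_i |b_i|² the constraint is already met at μ = 0, and the sketch simply returns zero in that case.

```python
def norm_squared(mu, eigs, b):
    """||f_mu||^2 = sum_i mu_i^2 |b_i|^2 / (mu + mu_i)^2."""
    return sum((mi * abs(bi)) ** 2 / (mu + mi) ** 2 for mi, bi in zip(eigs, b))

def least_mu(eigs, b, K, iters=200):
    """Smallest mu >= 0 with ||f_mu||^2 <= K (assumes K > 0 and positive eigenvalues)."""
    if norm_squared(0.0, eigs, b) <= K:
        return 0.0                      # constraint inactive: no penalty needed
    lo, hi = 0.0, 1.0
    while norm_squared(hi, eigs, b) > K:
        hi *= 2.0                       # expand until the constraint is satisfied
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if norm_squared(mid, eigs, b) > K:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical eigenvalues and expansion coefficients; sum |b_i|^2 = 1.3125.
eigs = [4.0, 1.0, 0.25]
b = [1.0, 0.5, 0.25]

mu_K = least_mu(eigs, b, K=1.0)
print("mu for K = 1.0:", mu_K, " norm^2:", norm_squared(mu_K, eigs, b))
print("mu for K = 2.0:", least_mu(eigs, b, K=2.0))
```

The value obtained this way depends only on K and not on the noise level, so it need not be close to the noise-optimal choice.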

Observe that there is no solution to (9.108) if $K$ exceeds $\sum_i|b_i|^2$. When there is a solution it may deviate widely from that of (9.107) so that substantial noise corruption may occur. Several other criteria for selecting $\mu$ have been proposed (Hansen 1992). One that has been found to work quite well in practice is to draw a graph of the points $(\|g - Bf_\mu\|, \|f_\mu\|)$ as $\mu$ varies from small to large values. Often its shape is as shown in Fig. 9.11. The optimal $\mu$ is picked as that value which occurs at the sharp corner, indicated by the arrow in Fig. 9.11.

So far it is implicit that $g$ is known in both magnitude and phase. Should it transpire that the magnitude alone of the pattern has to be matched (as is frequently the eventuality in practice) the procedure needs modification. Let $h$ be the vector whose $m$th component is $|g(\lambda_m)|$. Then a minimum of (9.109) is pursued.

Fig. 9.11. The corner identifying the optimal $\mu$ (plot of $\|f_\mu\|$ against $\|g - Bf_\mu\|$).

When $D$ is diagonal, as will now be assumed, the awkward modulus in (9.109) can be avoided. Consider

$$J_2 = \sum_m D_{mm}\Big|h_m\exp(-i\beta_m) - \sum_q B_{mq}f_q\Big|^2 + \mu\|f\|^2$$

which we aim to minimize with respect to $f$ and the parameters $\beta_m$ simultaneously. The variation of $J_2$ with respect to $\beta_m$ shows that

$$\exp(-i\beta_m) = \frac{\sum_q B_{mq}f_q}{\left|\sum_q B_{mq}f_q\right|}. \tag{9.110}$$

On substitution in $J_2$, $J_2$ resumes the form of $J_1$. Thus the minimization of $J_2$ will lead to the same results as working with $J_1$. The variation of $J_2$ with respect to $f^H$ supplies (9.102) again with $g$ replaced by a vector with components $\exp(-i\beta_m)h_m$. The function $f_\mu$ may still be expanded in terms of the $\phi_i$ but, on account of (9.110), the coefficients now satisfy non-linear equations. Theoretical treatment can therefore be involved but numerical work will proceed with the minimization of $J_2$ directly.

Exercises

23. If $e(\alpha) = (1 - \alpha^2/k^2)^{1/2}$ and $h_n(x) = \delta(x - x_n)$ synthesize the radiation pattern $g(\alpha) = 1$ when $ka = 2\pi$. Space the $x_n$ equally throughout the interval with one at the origin and take $N$ to be (i) 3, (ii) 5, (iii) 7. Consider various values of $M$ and $\mu$ (including zero), computing the norm of $f$ and the synthesis error. Take $C$ and $D$ to be unit matrices. Add a small amount of random noise to $g$ and repeat your calculations, examining the robustness of the synthesis. Is your best value of $\mu$ comparable with $v^2$?

24. Repeat Exercise 23 with $g(\alpha)$ equal to (a) $(1 - \alpha^2/k^2)^2$, (b) $(1 - \alpha^2/k^2)^2\exp(i\sin\alpha)$, (c) $\tfrac12 - \alpha^2/k^2$.

25. Try the synthesis of Exercise 24 when only $|g(\alpha)|$ is given.

26. Consider some of the previous questions for wider apertures.

27. Examine the effect of changing $C$ and $D$ from unit matrices to diagonal matrices with unequal diagonal elements.

28. Study the problem of synthesizing $g(\alpha) = (1 - \alpha^2/k^2)^{-1/2}$ and decide whether weighting $D$ in favour of observations away from $\alpha = \pm k$ is helpful.

29. By minimizing

$$\|g - Bf\|^2 + \mu M\frac{\|f\|^2}{\|Bf\|^2}$$

carry out a synthesis in which the quality factor is controlled. Show that

$$\left\{\left(1 - \frac{\mu M\|f\|^2}{\|Bf\|^4}\right)B^H D B + \frac{\mu M C}{\|Bf\|^2}\right\}f = B^H D g.$$

See how your answers with this synthesis compare with those in Exercise 23.

9.19 Directivity

Hitherto, consideration has centred on trying to make the synthesized pattern close to specification on average, but when an antenna has to discriminate against signals arriving from sources other than in some specified direction it may be desirable to ask that it has a main beam which falls to zero within a certain angle and insignificant sidelobes. Remark that it is pointless to attempt a design in which there are no sidelobes because an entire function which vanishes on a stretch of the interval $(-k, k)$ is zero everywhere. We are concerned only with patterns which are symmetrical about $\alpha = 0$ which is assumed to coincide with the maximum of the main beam. However, it is convenient to work first with a variable $y$ which takes values from $-1$ to 1.

Consider the Chebyshev polynomial $T_{2m}(by)$ where $b > 1$ (§1.5). As $by$ increases from $-1$, $T_{2m}(by)$ passes through maxima of 1 and minima of $-1$ until $by$ reaches 1 (there being $2m + 1$ of these stationary points); thereafter it steadily increases. Hence its maximum for $-1/b \le y \le 1$ is $T_{2m}(b)$. The null nearest to $y = 1$ is at $y_1 = (1/b)\cos(\pi/4m)$. Suppose now that $p(y)$ is a polynomial of degree $m$ in $y^2$ such that $p(1) = T_{2m}(b)$ and that $|p| \le 1$ at any stationary points (excluding $y = 1$). Suppose further that $p$ vanishes at $y_2 > y_1$. Then, for $y < y_1$, $T_{2m}(by) - p(y)$ is non-negative at maxima of $T_{2m}$ and non-positive at minima. The same assertion cannot be made at the maximum $y = 1$, but positivity is assured at $y = y_2$. Since $T_{2m}(by) - p(y)$ vanishes at $y = 1$ and is a function of $y^2$ it must have $2m + 2$ zeros. Being a polynomial of degree $2m$ it must be identically zero. Thus $T_{2m}(by)$ gives the null nearest to $y = 1$ of all polynomials with the stated properties. Now make the change of variable $y = 1 - b_1\alpha^2/b$.
Then the preceding result can be interpreted as stating that $T_{2m}(b - b_1\alpha^2)$ is a radiation pattern which has a main beam to sidelobe ratio of $T_{2m}(b)$ on $\alpha^2 < (1 + b)/b_1$ and, of all polynomials of degree $2m$ in $\alpha^2$ with the same main beam to sidelobe ratio, it gives the narrowest main beam.
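The properties of $T_{2m}(by)$ used in this argument are easy to confirm numerically. The sketch below assumes the standard closed forms $T_n(x) = \cos(n\arccos x)$ for $|x| \le 1$ and $T_n(x) = \cosh(n\,\mathrm{arccosh}\,x)$ for $x > 1$; the values m = 4 and b = 1.1 and the function name `chebyshev` are hypothetical choices.

```python
import math

def chebyshev(n, x):
    """T_n(x) from the trigonometric and hyperbolic closed forms."""
    if abs(x) <= 1.0:
        return math.cos(n * math.acos(x))
    sign = -1.0 if (x < 0.0 and n % 2 == 1) else 1.0
    return sign * math.cosh(n * math.acosh(abs(x)))

m = 4
b = 1.1

# Main beam to sidelobe ratio: the value at y = 1.
ratio = chebyshev(2 * m, b)

# Null nearest to y = 1, quoted as y1 = (1/b) cos(pi / 4m).
y1 = math.cos(math.pi / (4 * m)) / b
value_at_null = chebyshev(2 * m, b * y1)

# Largest magnitude over the sidelobe region -1/b <= y <= y1.
steps = 2000
sidelobe_peak = max(
    abs(chebyshev(2 * m, b * (-1.0 / b + j * (y1 + 1.0 / b) / steps)))
    for j in range(steps + 1)
)

print("main beam to sidelobe ratio T_2m(b) =", ratio)
print("first null y1 =", y1, " T_2m(b*y1) =", value_at_null)
print("sidelobe peak =", sidelobe_peak)
```

The sidelobe region stays within unit magnitude while the value at y = 1 exceeds unity, so increasing b raises the ratio at the price of moving the first null away from y = 1.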

Let $b_1 = a^2/8m^2$ and $b = 1 + C^2/8m^2$. Allow $m \to \infty$. We infer that $\cos\{(a\alpha)^2 - C^2\}^{1/2}$ is the pattern with the narrowest main beam of those with a main beam to sidelobe ratio of $\cosh C$. The large zeros of this pattern occur at $\pm(n + \tfrac12)\pi/a$. These may not be consistent with the behaviour enforced by conditions at the aperture end-points as described in §9.14. However, it is a simple matter to shift the large zeros, say those for $n \ge N$, by replacing the factors in the infinite product for the cosine. The result is

)'}2(aa _1 ), 2 t · -1t 2et · SIn . ( .) aa. - 1 2 e 1n

{( 1 e

1)' +

------~

(

aa. 1t

2 eI

•

aa.-!e t 1t

nN-1I [1 - 2{C2 +a( P nn [1- (+a cx 2

(X2

n=

2"1)2 1t 2}

2 2

N- 1 n= 1

n

1

"let

)2 2

]

]

1t

where

p = {C 2

+ (N

- t)2 n2}1/2'
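The passage to the limit $m \to \infty$ can be checked numerically. The following Python sketch is illustrative only: the values of $a$ and $C$, the sample points and all function names are assumptions introduced here and do not come from the text. It evaluates $T_{2m}(b - b_1\alpha^2)$ with $b_1 = a^2/8m^2$ and $b = 1 + C^2/8m^2$ for a large $m$ and compares it with $\cos\{(a\alpha)^2 - C^2\}^{1/2}$.

```python
import math

def cheb(n, x):
    """Chebyshev polynomial T_n(x) for real x."""
    if abs(x) <= 1.0:
        return math.cos(n * math.acos(x))
    # Outside [-1, 1]: T_n(x) = cosh(n arccosh|x|), with sign (-1)^n when x < -1.
    val = math.cosh(n * math.acosh(abs(x)))
    return val if (x > 0 or n % 2 == 0) else -val

def ideal_pattern(a, C, alpha):
    """cos{(a*alpha)^2 - C^2}^(1/2), which becomes a cosh when a*alpha < C."""
    z2 = (a * alpha) ** 2 - C ** 2
    return math.cos(math.sqrt(z2)) if z2 >= 0 else math.cosh(math.sqrt(-z2))

def chebyshev_pattern(m, a, C, alpha):
    """T_{2m}(b - b1*alpha^2) with b1 = a^2/8m^2 and b = 1 + C^2/8m^2."""
    b1 = a ** 2 / (8 * m ** 2)
    b = 1 + C ** 2 / (8 * m ** 2)
    return cheb(2 * m, b - b1 * alpha ** 2)

a, C = 1.0, 3.0                       # illustrative values, not from the text
alphas = [0.0, 1.0, 2.5, 4.0, 7.0]    # sample points inside and outside the main beam
max_diff = max(abs(chebyshev_pattern(2000, a, C, x) - ideal_pattern(a, C, x))
               for x in alphas)
print("largest difference at the sample points:", max_diff)
print("main beam to sidelobe ratio cosh C:", ideal_pattern(a, C, 0.0))
```

The value at $\alpha = 0$ reproduces the main beam to sidelobe ratio $\cosh C$, and the difference between the two patterns shrinks as $m$ is increased.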

The main beam width in this pattern is larger than in the ideal pattern, the factor (Taylor 1955) being $p$. This factor is usually only a few per cent greater than unity.

The question of directivity can also be approached by planning to maximize the radiation in a prescribed direction while holding the total radiated power fixed. If $\alpha = \lambda_0$ is the special direction this comes down to maximizing $|g(\lambda_0)|^2/\|Bf\|^2$ where $g(\lambda_0)$ is obtained from (9.91). Experimentally, it is not easy to produce a directivity that is much in excess of that generated by an aperture distribution which is uniform in both magnitude and phase. It is normally preferable to recognize this fact and avoid the theoretical superdirectivity feasible under ideal circumstances by constraints on the source norm or quality factor. A better quantity to maximize is therefore

$$\frac{|g(\lambda_0)|^2}{\|Bf\|^2 + \mu\|f\|^2}$$

or a similar expression based on the quality factor. Apart from matters of detail this problem is similar to those already considered for minimization and so will not be discussed further.

9.20 Penalty functions

The expressions (9.101) and (9.109) are not by any means the only ones which could be adopted in order to impose constraints. For example, if it is wished to make $\|f\|^2 \le K$ a possible penalty function is

$$J_3 = \|g - Bf\|^2 + \tfrac12 u(K - \|f\|^2 - v)_-^2$$


where $x_-$ is $x$ if $x \le 0$ and $0$ if $x \ge 0$. The constraint is taken into account only when it is violated, as it were. The extra parameter $v$ permits the avoidance of unpleasant behaviour of the second derivatives of $J_3$ at $v = 0$. Indeed the method can be generalized to cover more than one constraint (see also §4.7). Suppose that the constraining relations are $c_m(f) \ge 0$. Then consider

$$J_4 = \|g - Bf\|^2 + \tfrac12\sum_m u_m\{c_m(f) - v_m\}_-^2.$$

A strategy for dealing with $J_4$ is based on iteration (Fletcher 1973). Let $\mathbf{u}$ and $\mathbf{v}$ be vectors with components $u_m$ and $v_m$ respectively. Select some starting values $\mathbf{u}^{(1)}$ and $\mathbf{v}^{(1)}$ (often $\mathbf{v}^{(1)} = 0$ will be satisfactory). Calculate the minimizer $f^{(1)}$ of $J_4$. If

$$\|c(f^{(1)})\|_\infty = \max_m |\min\{c_m(f^{(1)}), v_m\}| \le \varepsilon$$

where $\varepsilon$ is a pre-assigned tolerance, stop. Otherwise set $K^{(1)} = \|c(f^{(1)})\|_\infty$ and define

$$\mathbf{v}^{(2)} = \mathbf{v}^{(1)} - (\mathbf{N}^T\mathbf{G}^{-1}\mathbf{N})^{-1}\mathbf{c}(f^{(1)})$$

where $\mathbf{G}$ is the Hessian matrix of $J_4$ and $\mathbf{N}$ is the matrix $(\operatorname{grad} c_1, \operatorname{grad} c_2, \ldots)$. Increase each $u_m$ if necessary so that

$$u_m^{(2)} \ge 4u_m^{(1)}\left|\frac{v_m^{(2)} - v_m^{(1)}}{c_m(f^{(1)})}\right|.$$

With these new values recalculate the minimizer $f^{(2)}$ of $J_4$. Either $\|c(f^{(2)})\|_\infty \ge K^{(1)}$ or $\|c(f^{(2)})\|_\infty < K^{(1)}$. In the latter case repeat the procedure just described. In the former replace $u_m^{(2)}$ by $10u_m^{(2)}$ for all those $m$ for which $|c_m(f^{(2)})| \ge K^{(1)}$ and return to finding the minimizer of $J_4$. The evidence of numerical experiments is that this strategy is attractive in practice.
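The way in which a one-sided penalty term acts only when the constraint is violated can be seen in a small numerical experiment. The Python sketch below is a deliberately simplified stand-in for the strategy described above and is not Fletcher's algorithm: it treats a single scalar unknown $f$ with $B = 1$, keeps $v = 0$ throughout, finds the minimizer by a crude grid search, and merely multiplies $u$ by 10 until the violation falls below the tolerance. All numerical values and function names are assumptions made for illustration.

```python
def minus_part(x):
    """x_- : equal to x when x <= 0 and to 0 when x >= 0."""
    return x if x < 0.0 else 0.0

def J3(f, g, K, u, v=0.0):
    """Scalar version of ||g - Bf||^2 + (u/2)(K - ||f||^2 - v)_-^2 with B = 1."""
    return (g - f) ** 2 + 0.5 * u * minus_part(K - f * f - v) ** 2

def minimise_on_grid(g, K, u, lo=0.0, hi=3.0, steps=30000):
    """Crude grid search for the minimiser of J3 (adequate for a one-variable toy)."""
    best_f, best_J = lo, J3(lo, g, K, u)
    for i in range(1, steps + 1):
        f = lo + (hi - lo) * i / steps
        val = J3(f, g, K, u)
        if val < best_J:
            best_f, best_J = f, val
    return best_f

g, K, eps = 2.0, 1.0, 5e-3      # target, bound on f^2, tolerance (all assumed)
u = 1.0
while True:
    f = minimise_on_grid(g, K, u)
    violation = abs(min(K - f * f, 0.0))   # |min{c(f), v}| with c = K - f^2, v = 0
    if violation <= eps:
        break
    u *= 10.0                               # stiffen the penalty and minimise again

print("u =", u, " f =", f, " violation =", violation)
```

In this toy problem the unconstrained minimizer $f = 2$ violates $f^2 \le 1$, and the penalized minimizers approach the constrained solution $f = 1$ from outside the feasible region as $u$ grows. Updating $v$ as in the strategy above is intended to avoid having to drive $u$ to very large values.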

ARRAY SIGNAL PROCESSING

9.21 Adaptive beam forming

Radar surveillance in which existing targets are tracked and new targets observed places heavy demands on antenna facilities. Because of its importance in connection with satellites, military applications, and air traffic control (to mention a few examples) efforts have been devoted to making antenna structures which are economical by being multi-purpose. These purposes are sometimes in conflict. For instance, transmission may call for maximum power in given directions, whereas reception will require information to be absorbed from directions (perhaps unknown in advance) with maximum efficiency. It will therefore frequently be necessary to scan a portion of space either mechanically or electronically. Mechanical scanning will usually be sufficiently slow that the pattern can be estimated instantaneously from the position of the array as if it were not moving, but electronic sweeping may be rapid enough for time to enter as a new factor in the calculations. In any case the beam of the antenna is being regularly adapted as a result of the information available.

The information itself is normally obtained by samples which need only be taken at intervals of $1/2B$ where $B$ is the communication bandwidth. While there is no lower limit to the duration of the sample, the information may be corrupted in the absence of suitable precautions. Aliasing arises from the introduction of additional unwanted information if the bandwidth being sampled is greater than that required for signal communication.

Fig. 9.12. Geometry of beam steering.

To obtain some idea of the considerations involved in beam steering, let a plane wave irradiate a linear array of equi-spaced receptors (Fig. 9.12). The difference in phase between two adjacent receptors is $kd\sin\theta$ so that if all the receptors are alike the signal in the $n$th is

$$s_n = \exp\{i(n - 1)kd\sin\theta\}$$

taking the phase of the first as zero. If these signals are given a phase delay $\delta_n$ and added, the net signal is

$$S(\theta) = \sum_{n=1}^{N}\exp\{i(n - 1)kd\sin\theta + i\delta_n\}. \tag{9.111}$$

If $\delta_n = (n - 1)\delta$ for all $n$, corresponding to an inserted constant phase delay between adjacent elements,

$$S(\theta) = \exp\{\tfrac12 i(N - 1)u\}\,\frac{\sin\tfrac12 Nu}{\sin\tfrac12 u}$$

where $u = kd\sin\theta + \delta$. The normalized power pattern is

$$\frac{|S(\theta)|^2}{|S_{u=0}|^2} = \left(\frac{\sin\tfrac12 Nu}{N\sin\tfrac12 u}\right)^2. \tag{9.112}$$
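The closed form for $S(\theta)$ and the normalized pattern (9.112) are easily checked by direct summation. The Python sketch below is an illustration under assumed values ($N = 8$, half-wavelength spacing $kd = \pi$, $\delta = 0.6$); the function names are hypothetical. It compares the sum (9.111) with the closed form and locates the maximum of (9.112) on a grid of angles, which should fall at $\theta = -\sin^{-1}(\delta/kd)$.

```python
import cmath
import math

def array_sum(N, kd, theta, delta):
    """Direct evaluation of (9.111) with delta_n = (n - 1) * delta."""
    u = kd * math.sin(theta) + delta
    return sum(cmath.exp(1j * (n - 1) * u) for n in range(1, N + 1))

def closed_form(N, kd, theta, delta):
    """exp{i(N-1)u/2} sin(Nu/2)/sin(u/2), valid when sin(u/2) is non-zero."""
    u = kd * math.sin(theta) + delta
    return cmath.exp(0.5j * (N - 1) * u) * math.sin(0.5 * N * u) / math.sin(0.5 * u)

def power_pattern(N, kd, theta, delta):
    """Normalized power pattern (9.112); the limiting value 1 is used where sin(u/2) = 0."""
    u = kd * math.sin(theta) + delta
    if abs(math.sin(0.5 * u)) < 1e-12:
        return 1.0
    return (math.sin(0.5 * N * u) / (N * math.sin(0.5 * u))) ** 2

N, kd, delta = 8, math.pi, 0.6     # illustrative values, not from the text
thetas = [math.radians(-90 + 0.01 * i) for i in range(18001)]
peak_theta = max(thetas, key=lambda th: power_pattern(N, kd, th, delta))
predicted = -math.asin(delta / kd)

print("peak found at (deg):", math.degrees(peak_theta))
print("predicted     (deg):", math.degrees(predicted))
```

With $kd = \pi$ the quantity $u$ passes through zero only once in the visible range, so there is a single main beam; larger spacings would admit further maxima of equal height.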

From (9.112) it can be seen that, when $\delta = 0$, the maximum occurs at $\theta = 0$, i.e. the main beam is in the broadside direction. If $\delta \ne 0$, the maximum is at $\theta = -\sin^{-1}(\delta/kd)$ and the main beam has been shifted. Thus, weighting the signals by phase permits beam shifting and if the magnitudes are also weighted it may be possible to introduce some of the effects discussed under synthesis. Therefore we consider beam forming in which

$$S = \sum_{n=1}^{N} w_n^* s_n = \mathbf{w}^H\mathbf{s}$$

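where $\mathbf{w}$ and $\mathbf{s}$ are the weight and signal column vectors with components $w_m$ and $s_m$ respectively. Now suppose that a plane wave of unit amplitude from a certain direction, called the look direction, produces a signal $\mathbf{l}$. The magnitude is correctly reproduced if

$$\mathbf{w}^H\mathbf{l} = 1. \tag{9.113}$$

To ensure that the look direction is always preferentially treated, (9.113) is applied as a constraint. Then the weights are adjusted so that the power output $|S|^2$ is a minimum. If $\lambda$ is a Lagrange multiplier the objective function to be minimized is

$$\mathbf{w}^H\mathbf{S}\mathbf{w} + \lambda(\mathbf{w}^H\mathbf{l} - 1) + \lambda^*(\mathbf{l}^H\mathbf{w} - 1)$$

where $\mathbf{S} = \mathbf{s}\mathbf{s}^H$. Variations in $\mathbf{w}^H$ alone lead immediately to $\mathbf{S}\mathbf{w} + \lambda\mathbf{l} = 0$ and then (9.113) implies that $\lambda^* = -1/\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}$. Hence the optimal choice of $\mathbf{w}$ is given by

$$\mathbf{w}_{\mathrm{opt}} = \frac{\mathbf{S}^{-1}\mathbf{l}}{\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}} \tag{9.114}$$

and the best beam former is

$$S = \mathbf{w}_{\mathrm{opt}}^H\mathbf{s} = \frac{\mathbf{l}^H\mathbf{S}^{-1}\mathbf{s}}{\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}}.$$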
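A small numerical illustration of the constrained minimum may be helpful. The Python sketch below computes weights of the form $\mathbf{S}^{-1}\mathbf{l}/\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}$ for a three-element example. It rests on assumptions that are not in the text: since $\mathbf{s}\mathbf{s}^H$ formed from a single signal vector is singular, the matrix used here is an assumed positive-definite covariance (unit noise plus one strong interfering wave), the look direction is broadside, and the helper names are hypothetical.

```python
import cmath

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (complex entries)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= t * M[c][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def quad(w, S):
    """Output power w^H S w (real for Hermitian S)."""
    Sw = [sum(sij * wj for sij, wj in zip(row, w)) for row in S]
    return inner(w, Sw).real

def optimal_weights(S, l):
    """Weights S^{-1} l / (l^H S^{-1} l), the form of (9.114)."""
    x = solve(S, l)
    return [xi / inner(l, x) for xi in x]

# Assumed example: unit noise plus an interferer of power 10 with steering vector a.
a = [cmath.exp(1j * k) for k in range(3)]
S = [[(1.0 if i == j else 0.0) + 10.0 * a[i] * a[j].conjugate()
      for j in range(3)] for i in range(3)]
l = [1 + 0j, 1 + 0j, 1 + 0j]          # broadside look direction
w_opt = optimal_weights(S, l)
uniform = [li / 3 for li in l]         # also satisfies the constraint w^H l = 1

print("constraint w^H l:", inner(w_opt, l))
print("power, optimal weights:", quad(w_opt, S))
print("power, uniform weights:", quad(uniform, S))
```

Both weight vectors satisfy the constraint, but the optimal weights give the smaller output power, equal to $1/\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}$.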

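The method has affinities with those in §9.18 but is somewhat simpler in character. In both, substantial computational effort may have to be expended in the evaluation of inverses of matrices unless the following iterative algorithm (Frost 1972) is adopted. If $\mathbf{w}_j$ is the result of some iteration, define

$$\mathbf{w}_{j+1} = \mathbf{w}_j - \beta\left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right)\mathbf{S}\mathbf{w}_j$$

where $\mathbf{I}$ is the unit matrix and $\beta$ is a non-zero feedback factor which controls the rate of convergence of the iteration. Then $\mathbf{w}_{j+1}^H\mathbf{l} = \mathbf{w}_j^H\mathbf{l} = 1$ so long as (9.113) was satisfied at the start of the iteration. If $\mathbf{w}_j \to \mathbf{w}_0$ as $j \to \infty$, we have $\mathbf{w}_0^H\mathbf{l} = 1$ and

$$\left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right)\mathbf{S}\mathbf{w}_0 = 0$$

whence

$$\mathbf{w}_0 = \frac{\mathbf{S}^{-1}\mathbf{l}\,\mathbf{l}^H\mathbf{S}\mathbf{w}_0}{\mathbf{l}^H\mathbf{l}}. \tag{9.115}$$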

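Substituting for $\mathbf{w}_0$ from (9.115) in $\mathbf{w}_0^H\mathbf{l} = 1$ we obtain $\mathbf{l}^H\mathbf{S}\mathbf{w}_0 = \mathbf{l}^H\mathbf{l}/\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}$ which, when inserted in the right-hand side of (9.115), shows that $\mathbf{w}_0$ coincides with $\mathbf{w}_{\mathrm{opt}}$ as given by (9.114). To discuss the rate of convergence let $\mathbf{v}_j = \mathbf{w}_j - \mathbf{w}_0$. Then

$$\mathbf{v}_{j+1} = \mathbf{v}_j - \beta\left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right)\mathbf{S}\mathbf{v}_j.$$

Since $\mathbf{l}^H\mathbf{v}_j = 0$,

$$\mathbf{v}_j = \left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right)\mathbf{v}_j$$

so that $\mathbf{v}_{j+1} = (\mathbf{I} - \beta\mathbf{Q})\mathbf{v}_j$ where

$$\mathbf{Q} = \left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right)\mathbf{S}\left(\mathbf{I} - \frac{\mathbf{l}\mathbf{l}^H}{\mathbf{l}^H\mathbf{l}}\right).$$

Because $\mathbf{Q}$ is Hermitian there is a matrix $\mathbf{P}$ such that $\mathbf{P}^H\mathbf{Q}\mathbf{P} = \boldsymbol{\Lambda}$ and $\mathbf{P}^{-1} = \mathbf{P}^H$, $\boldsymbol{\Lambda}$ being diagonal with elements the eigenvalues of $\mathbf{Q}$. Hence

$$\mathbf{P}^H\mathbf{v}_{j+1} = (\mathbf{I} - \beta\boldsymbol{\Lambda})\mathbf{P}^H\mathbf{v}_j.$$

Thus the system is stable and the error decays rapidly to zero provided that

$$0 < \beta < 2/\lambda_{\max}. \tag{9.116}$$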
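The behaviour of such a projected iteration can be observed on a small example. The Python sketch below is an illustration under stated assumptions rather than a reproduction of any published implementation: it takes the update $\mathbf{w}_{j+1} = \mathbf{w}_j - \beta(\mathbf{I} - \mathbf{l}\mathbf{l}^H/\mathbf{l}^H\mathbf{l})\mathbf{S}\mathbf{w}_j$, starts from $\mathbf{w}_1 = \mathbf{l}/\mathbf{l}^H\mathbf{l}$, uses an assumed positive-definite $\mathbf{S}$ (unit noise plus one interferer), and chooses $\beta$ as the reciprocal of the trace of $\mathbf{S}$, which is a convenient value satisfying (9.116) because no eigenvalue of $\mathbf{Q}$ can exceed that trace. The helper names are hypothetical.

```python
import cmath

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def project(x, l):
    """(I - l l^H / l^H l) x: remove the component of x along l."""
    c = inner(l, x) / inner(l, l)
    return [xi - c * li for xi, li in zip(x, l)]

def iterate(S, l, beta, steps):
    """Repeat w <- w - beta (I - l l^H / l^H l) S w from a start with w^H l = 1."""
    ll = inner(l, l)
    w = [li / ll for li in l]
    for _ in range(steps):
        g = project(matvec(S, w), l)
        w = [wi - beta * gi for wi, gi in zip(w, g)]
    return w

# Assumed example: unit noise plus an interferer of power 10.
a = [cmath.exp(1j * k) for k in range(3)]
S = [[(1.0 if i == j else 0.0) + 10.0 * a[i] * a[j].conjugate()
      for j in range(3)] for i in range(3)]
l = [1 + 0j, 1 + 0j, 1 + 0j]

beta = 1.0 / sum(S[i][i].real for i in range(3))   # 1/trace(S), below 2/lambda_max
w = iterate(S, l, beta, 2000)
residual = project(matvec(S, w), l)                 # vanishes at the limit w_0

print("constraint w^H l:", inner(w, l))
print("largest residual component:", max(abs(r) for r in residual))
```

The constraint is preserved at every step and the residual $(\mathbf{I} - \mathbf{l}\mathbf{l}^H/\mathbf{l}^H\mathbf{l})\mathbf{S}\mathbf{w}$ tends to zero, which is the condition leading to (9.115). A smaller $\beta$, or a matrix with widely spread eigenvalues, makes the decay correspondingly slower.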

The restriction (9.116) can be severe in practice and convergence may be slow. There may also be instability due to noisy data as already explained. For that reason it is probably advantageous to employ the theory for linear constraints set out in §4.7 or to use the following method of elimination (Gill and Murray 1974). It is based on a Givens-Householder transformation (§1.14). Let $\mathbf{L}$ be the vector with zero elements except for the first which is $l_1(\mathbf{l}^H\mathbf{l})^{1/2}/|l_1|$ $\{(\mathbf{l}^H\mathbf{l})^{1/2}$ if $l_1 = 0\}$. Choose $\mathbf{u}$ so that

$$\mathbf{l} - 2\mathbf{u}\mathbf{u}^H\mathbf{l} = \mathbf{L}$$

and $\mathbf{u}^H\mathbf{u} = 1$. This is always feasible and makes the matrix $\mathbf{I} - 2\mathbf{u}\mathbf{u}^H$ unitary (§1.10). If

$$\mathbf{y} = \mathbf{w} - 2\mathbf{u}\mathbf{u}^H\mathbf{w}, \qquad \mathbf{y}^H\mathbf{L} = \mathbf{w}^H\mathbf{l}.$$

Thus the constraint (9.113) is met if and only if

$$y_1^* = \frac{|l_1|}{l_1(\mathbf{l}^H\mathbf{l})^{1/2}}. \tag{9.117}$$

Hence an equivalent minimization problem is that of $\mathbf{y}^H(\mathbf{I} - 2\mathbf{u}\mathbf{u}^H)\mathbf{S}(\mathbf{I} - 2\mathbf{u}\mathbf{u}^H)\mathbf{y}$ subject to (9.117). Let $\mathbf{b}$ be the first column of the matrix $(\mathbf{I} - 2\mathbf{u}\mathbf{u}^H)\mathbf{S}(\mathbf{I} - 2\mathbf{u}\mathbf{u}^H)$ after the first element has been removed and $\mathbf{B}$ the submatrix after the deletion of the first row and column. Then, if $\mathbf{y}_2$ is $\mathbf{y}$ after the extraction of $y_1$,

$$\mathbf{B}\mathbf{y}_2 = -y_1\mathbf{b}. \tag{9.118}$$

The matrix $\mathbf{B}$, in view of the unitary character of $\mathbf{I} - 2\mathbf{u}\mathbf{u}^H$, can be written in its Cholesky decomposition (§1.12) $\mathbf{L}\mathbf{L}^H$ and so $\mathbf{y}_2$ can be found from (9.118) by solving two sets of linear equations. Since $y_1$ is known from (9.117), $\mathbf{w}_{\mathrm{opt}}$ can be calculated without difficulty. This method can be efficient especially when combined with the penalty function technique described earlier. A possible disadvantage is that the optimization is carried out afresh with each set of signals so that any earlier information about the values of the weights is wasted.
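The elimination can be followed step by step on a small example. The Python sketch below is a minimal illustration under assumptions not found in the text: the reflection vector is taken as $\mathbf{u} = (\mathbf{l} - \mathbf{L})/\|\mathbf{l} - \mathbf{L}\|$ (one standard choice, which requires that $\mathbf{l}$ is not already a multiple of the first unit vector), the reduced system is taken in the form $\mathbf{B}\mathbf{y}_2 = -y_1\mathbf{b}$, the matrix $\mathbf{S}$ is an assumed positive-definite covariance, and all function names are hypothetical. The result is compared with weights computed directly from $\mathbf{S}^{-1}\mathbf{l}/\mathbf{l}^H\mathbf{S}^{-1}\mathbf{l}$.

```python
import cmath
import math

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def cholesky_solve(B, rhs):
    """Solve B x = rhs for Hermitian positive-definite B via B = L L^H."""
    n = len(rhs)
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = B[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = cmath.sqrt(s.real) if i == j else s / L[j][j]
    z = [0j] * n                                   # forward substitution: L z = rhs
    for i in range(n):
        z[i] = (rhs[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0j] * n                                   # back substitution: L^H x = z
    for i in range(n - 1, -1, -1):
        x[i] = (z[i] - sum(L[k][i].conjugate() * x[k]
                           for k in range(i + 1, n))) / L[i][i]
    return x

def householder_weights(S, l):
    n = len(l)
    norm_l = math.sqrt(inner(l, l).real)
    L1 = l[0] * norm_l / abs(l[0]) if l[0] != 0 else norm_l
    d = [l[0] - L1] + list(l[1:])                  # l - L (assumed non-zero)
    nd = math.sqrt(inner(d, d).real)
    u = [di / nd for di in d]
    H = [[(1.0 if i == j else 0.0) - 2 * u[i] * u[j].conjugate()
          for j in range(n)] for i in range(n)]    # I - 2 u u^H
    SH = [[sum(S[i][k] * H[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    A = [[sum(H[i][k] * SH[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]                        # (I - 2uu^H) S (I - 2uu^H)
    y1 = 1 / L1.conjugate()                        # from y_1^* L_1 = 1
    b = [A[i][0] for i in range(1, n)]
    B = [row[1:] for row in A[1:]]
    y2 = cholesky_solve(B, [-y1 * bi for bi in b])
    return matvec(H, [y1] + y2)                    # w = (I - 2uu^H) y

a = [cmath.exp(1j * k) for k in range(3)]
S = [[(1.0 if i == j else 0.0) + 10.0 * a[i] * a[j].conjugate()
      for j in range(3)] for i in range(3)]
l = [1 + 0j, 1 + 0j, 1 + 0j]

w = householder_weights(S, l)
x = cholesky_solve(S, l)
w_ref = [xi / inner(l, x) for xi in x]             # S^{-1} l / (l^H S^{-1} l)
print("largest difference:", max(abs(p - q) for p, q in zip(w, w_ref)))
```

Only the reduced system of order $N - 1$ has to be solved once the reflection has been applied, which is the source of the efficiency mentioned above.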

9.22 Simultaneous multiple beams

The observation that different values of $\delta$ in (9.112) cause shifts in the main beam prompts the suggestion that by attaching a number of independent feeding systems, each incorporating a different phase shift, the array could be operated as if its maximum beam were in various directions simultaneously, i.e. a multiple-beam mode would be achieved by means of multiple feeders. Such a mode could be attractive so it is vital to establish the point that if the antenna and feeds are lossless and passive there are strict limitations on the patterns which are permitted (White 1962; Allen 1961; Shelton and Kelleher 1961). It will be convenient to discuss the antenna when transmitting since reciprocity will then cope with reception. Suppose that for a unit drive in the $m$th feed the radiated electric intensity is $\mathbf{E}_m(\theta, \phi)$ in the direction indicated by the spherical polar angles $\theta$, $\phi$. Then if the drive is $v_m$ the electric intensity is $v_m\mathbf{E}_m$. Consequently, if all feeds are operated simultaneously, the output is $\sum_m v_m\mathbf{E}_m$ and the input power is of the form $\sum_m R_m|v_m|^2$ since the feeds are independent. On the other hand, the output power is proportional to

$$\int\Big(\sum_m v_m\mathbf{E}_m\Big)\cdot\Big(\sum_n v_n\mathbf{E}_n\Big)^* d\Omega,$$

the integration being over the unit sphere. In the lossless passive case the two powers must be the same for arbitrary $v_m$.


We therefore conclude that

$$\int\mathbf{E}_m\cdot\mathbf{E}_n^*\,d\Omega = 0 \qquad (m \ne n) \tag{9.119}$$

i.e. multiple independent beams can be formed by a lossless passive antenna only when the individual beam patterns are orthogonal over the unit sphere. For one-dimensional arrays where the individual beams possess the same polarization the scalar product in (9.119) can be replaced by multiplication. For a uniformly illuminated aperture as at the beginning of §9.21, the pattern is approximately

$$\frac{\sin(p - p_0)}{p - p_0} \qquad (p = \tfrac12 kd\sin\theta)$$

when it is sufficiently sharp. Also the integration may be extended to infinity without great loss. Then, since

$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,\frac{\sin(x + t)}{x + t}\,dx = \frac{\pi\sin t}{t}$$

the patterns $\sin(p - n_m\pi)/(p - n_m\pi)$ are orthogonal if the integers $n_m$ and $n_n$ are different. However, we have already seen that uniformly illuminated apertures are not optimal from the point of view of side lobes, so either resistance has to be deliberately introduced or the system made active instead of passive so as to render orthogonality unnecessary. Alternatively, the aperture may be split into independent apertures.

There are interesting facts to be gleaned from processing the data from independent apertures in an appropriate way. Consider the particular case of (9.111) with $\delta_n = (n - 1)\delta$ and put $p = \tfrac12(kd\sin\theta + \delta)$. If (9.111) is multiplied by $\exp\{-i(N - 1)p\}$, the phase increases steadily from $\exp\{-i(N - 1)p\}$ at the first element to $\exp\{i(N - 1)p\}$ at the last so the point distant $\tfrac12(N - 1)d$ from the first may be regarded as the phase centre of the array. Also,

$$S_N(p) = \exp\{i(N - 1)p\}\,\frac{\sin Np}{\sin p}$$

the subscript $N$ indicating the number of elements under consideration. Suppose now that a further $M$ elements are added at the right-hand end of the array. Their pattern will be $\exp(2iNp)S_M(p)$. The real signal from the first aperture is $D_N(p)\cos\{\omega t + (N - 1)p\}$ and from the second is $D_M(p)\cos\{\omega t + (2N + M - 1)p\}$ where $D_N(p) = \sin Np/\sin p$. If these are multiplied together the result is

$$\tfrac12 D_N(p)D_M(p)[\cos\{2\omega t + (3N + M - 2)p\} + \cos(M + N)p].$$

If this is processed so that the sinusoidally varying term is eliminated the demodulated output

$$D(p) = D_N(p)D_M(p)\cos(M + N)p \tag{9.120}$$

may be generated. The output (9.120) of a multiplicative array may be interpreted as the product of the directional responses $D_N$, $D_M$ of the two individual parts of the aperture multiplied by the interference pattern of two elements situated at the phase centres of the two parts. If $M = N$, the first zero of $D(p)$ occurs at $p = \pi/4N$ whereas $D_{2N}$ vanishes first at $p = \pi/2N$ so that the main beam obtained by multiplying the responses of two halves of an array has half the width of that of the whole aperture. Moreover, $D(p)$ is negative for $\pi/4N \le p \le 3\pi/4N$ so that, if the recorder accepts only positive $D$, the first side lobe can be suppressed. Thus signal processing can improve the directional qualities of an array.

Another form of processing is rectification. In linear rectification the output is the amplitude of the sinusoidally varying term. Therefore, if the signal from the whole $N + M$ array is rectified linearly, the output is

$$D_l(p) = |D_{N+M}(p)|.$$

In square-law rectification the output is the square of the amplitude so that

$$D_s(p) = \{D_{N+M}(p)\}^2.$$

Clearly, rectification does not improve the main beam characteristics in the same way as multiplication. However, before deciding that multiplication is superior to rectification it is important to examine the performance when more than one source excites the array (Shaw and Davies 1964). Let one wave be of unit amplitude from the direction indicated by $p_1$ and a second wave be of real amplitude $A$ from the direction $p_2$. The two waves have the same frequency but differ in phase by $\phi$. Then the real response of the $N$ portion of the aperture can be expressed as

$$D_N(p_1)\cos\omega t + AD_N(p_2)\cos(\omega t + \phi).$$

Consequently,

$$D_l = \{D_{N+M}^2(p_1) + A^2D_{N+M}^2(p_2) + 2AD_{N+M}(p_1)D_{N+M}(p_2)\cos\phi\}^{1/2}, \qquad D_s = D_l^2,$$

$$D = \{D_N(p_1)D_M(p_1) + A^2D_N(p_2)D_M(p_2)\}\cos(N + M)p + AD_N(p_2)D_M(p_1)\cos\{(M + N)p - \phi\} + AD_N(p_1)D_M(p_2)\cos\{(M + N)p + \phi\}.$$

If there is sufficient incoherence between the waves for the average of $\phi$ to be zero, integration will lead to mean values

$$\bar{D}_s = D_{N+M}^2(p_1) + A^2D_{N+M}^2(p_2), \qquad \bar{D} = \{D_N(p_1)D_M(p_1) + A^2D_N(p_2)D_M(p_2)\}\cos(N + M)p.$$


Thus the response for multiple targets can be evaluated by superposition (except for linear rectification) and the superiority of multiplicative processing is retained in these circumstances. If $\phi$ remains relatively constant in the time interval of relevance, as is likely in radar applications, the performance depends upon the probability that $\phi$ assumes values which allow discrimination. It is found that multiplicative processing always gives better resolution than square-law rectification but the improvement falls off as the difference between source strengths increases unless the first sidelobe of the multiplicative system is suppressed.
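The narrowing of the main beam by multiplicative processing of a single source can be confirmed numerically. The Python sketch below is illustrative only; it assumes the demodulated output in the form $D_N(p)D_M(p)\cos(M + N)p$ with $D_N(p) = \sin Np/\sin p$, takes $M = N = 6$ as an arbitrary example, and uses hypothetical function names. It locates the first zero of the multiplicative output and of the response $D_{2N}$ of the whole aperture.

```python
import math

def D(N, p):
    """Directional response sin(Np)/sin(p), with the limiting value N at p = 0."""
    s = math.sin(p)
    return float(N) if abs(s) < 1e-12 else math.sin(N * p) / s

def multiplicative(N, M, p):
    """Demodulated output of the multiplicative array: D_N(p) D_M(p) cos{(M + N)p}."""
    return D(N, p) * D(M, p) * math.cos((M + N) * p)

def first_zero(f, step=1e-5, p_max=1.0):
    """First sign change of f on (0, p_max), located on a uniform grid."""
    p, prev = step, f(step)
    while p < p_max:
        p += step
        cur = f(p)
        if prev * cur <= 0.0:
            return p
        prev = cur
    return None

N = 6                                             # illustrative half-array size
z_mult = first_zero(lambda p: multiplicative(N, N, p))
z_full = first_zero(lambda p: D(2 * N, p))

print("first zero, multiplicative:", z_mult, " pi/4N =", math.pi / (4 * N))
print("first zero, whole aperture:", z_full, " pi/2N =", math.pi / (2 * N))
print("output at p = pi/2N:", multiplicative(N, N, math.pi / (2 * N)))
```

The two zeros fall at $\pi/4N$ and $\pi/2N$ respectively, and the multiplicative output is negative at $p = \pi/2N$, inside the interval where the first side lobe can be suppressed by accepting only positive output.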

Exercises

30. If

$$g(p) = \frac{\sin(p - \pi)}{p - \pi} + (2 + 4a)\frac{\sin p}{p} + \frac{\sin(p + \pi)}{p + \pi}$$

show that the patterns $g(p)$ and $g(p + q)$ are orthogonal for any value of $a$ if $q = 3\pi$. Are there any other values of $q$ for which this is true?

9.23 Time-varying arrays

In electronic beam scanning the shifting of the beam is accomplished by varying the properties of the array in time. This is one instance of altering array parameters with the objective of realizing desired radiation characteristics or improving the system's capacity to handle information. The parameters which might be varied are the weights to be attached to individual elements, the frequency of operation, and the physical dimensions of the array (including the physical location of the phase centre). The parts of an aperture can be sampled in time either by having elements placed there whose weights are zero except at certain time intervals or by having an antenna move physically over the aperture so that it is at a prescribed location at a set time. In the latter case we often speak of a synthetic aperture created by the movement of the antenna. Radio astronomers rely on the motion of the earth for a synthetic aperture of enormous proportions (Ryle 1952).

Consider the array of Fig. 9.12 with $N$ odd and suppose that the phase $\delta_n$ varies with time according to the law

$$\delta_n = (n - \tfrac12 N - \tfrac12)\omega_0 t.$$

In effect, each element of the array is being sampled at intervals of $2\pi/\omega_0$. Then, if the phase of the incident wave is zero at the central element, the signal produced when the outputs of the elements are added is

$$S(\theta) = \cos\omega t\,D_N(\tfrac12\omega_0 t + \tfrac12 kd\sin\theta). \tag{9.121}$$

For $|t| < \pi/\omega_0$ the maximum of $D_N$ in (9.121) occurs at the angle $\theta$ satisfying

$$\sin\theta = -\omega_0 t/kd$$


provided, of course, that $\pi < kd$, i.e. the spacing between elements is at least half a wavelength. Thus as $t$ increases the direction of the main beam moves from $\theta = \theta_m$ where $\theta_m = \sin^{-1}(\pi/kd)$ to $\theta = -\theta_m$. Owing to the periodic nature of $D_N$, a new main beam enters at $\theta = \theta_m$ as $t$ passes through $\pi/\omega_0$ while the old one leaves at $\theta = -\theta_m$. Consequently, the main beam is swept through the angle $2\theta_m$ with a repetition period of $2\pi/\omega_0$. To extract all available information $\omega_0$ must be equal to the signal bandwidth as pointed out at the commencement of §9.21. The system effectively provides $N$ simultaneous information channels and can resolve signals from $N$ different directions at a cost of a scanning bandwidth of $N\omega_0$. Each channel has to cope with a different frequency and must be supplied with a narrow band filter to prevent the (wide-band) noise being $N$ times worse on addition than in a single channel.

It is therefore evident that by varying the parameters of an array in time a multitude of radiation patterns can be generated (Shanks and Bickmore 1959; Shanks 1961; Milne 1964) which can be separated by proper processing. The information capacity of a solitary array is thereby improved substantially. Synthetic apertures may be subsumed under the same scheme because a moving antenna can be visualized as a series of fixed antennas which are sampled at the correct times. There is also no reason why the general principles cannot be applied to arrays which are not straight lines or planes (Johnson 1968).
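The sweep of the main beam with time can be traced numerically. The Python sketch below is an illustration under assumed values: $N = 7$ elements, $kd = 1.1\pi$ (chosen so that no second beam of equal height is visible at the instants examined), and three values of $\omega_0 t$ inside $(-\pi, \pi)$. The function names are hypothetical. It sums the element outputs with phases $(n - \tfrac12 N - \tfrac12)(kd\sin\theta + \omega_0 t)$, which reproduces $D_N(\tfrac12\omega_0 t + \tfrac12 kd\sin\theta)$, and finds the angle of the maximum on a grid.

```python
import cmath
import math

def summed_output(N, kd, theta, w0t):
    """Sum over n of exp{i(n - N/2 - 1/2)(kd sin(theta) + w0 t)} for odd N."""
    centre = 0.5 * (N + 1)
    phase = kd * math.sin(theta) + w0t
    return sum(cmath.exp(1j * (n - centre) * phase) for n in range(1, N + 1))

def D(N, p):
    """sin(Np)/sin(p), with the limiting value N at p = 0."""
    s = math.sin(p)
    return float(N) if abs(s) < 1e-12 else math.sin(N * p) / s

def beam_direction(N, kd, w0t, n_grid=18001):
    """Angle between -90 and 90 degrees maximising |summed output| (grid search)."""
    thetas = [math.radians(-90 + 180 * i / (n_grid - 1)) for i in range(n_grid)]
    return max(thetas, key=lambda th: abs(summed_output(N, kd, th, w0t)))

N, kd = 7, 1.1 * math.pi                  # illustrative values, not from the text
directions = {w0t: beam_direction(N, kd, w0t) for w0t in (-2.0, 0.0, 2.0)}

for w0t, th in directions.items():
    print("w0 t =", w0t,
          " beam at (deg):", round(math.degrees(th), 2),
          " predicted (deg):", round(math.degrees(math.asin(-w0t / kd)), 2))
```

As $\omega_0 t$ increases the maximum moves from positive to negative $\theta$, in agreement with $\sin\theta = -\omega_0 t/kd$.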

REFERENCES Abramowitz, M. and Stegun, I. A. (1965). Handbook of mathematical functions. Dover, New York. Adams, R. A. (1975). Sobolev spaces. Academic Press, New York. Ahlberg, 1. H. E., Nilson, E. N., and Walsh, 1. L. (1967). The theory of splines and their applications. Academic Press, New York. Ahluwalia, D. S., Lewis, R. M., and Boersma, J. (1968). SIAM J. apple Math. 16, 783. Ahluwalia, D. S., Keller, J. B., and Matkowsky, B. 1. (1974). J. acoust. Soc. Am. 55, 7. AI-Baali, M. (1985). IMA Jour. Num. Anal S, 121. Albertsen, N. C., Hansen, J. E., and Eilskov Jensen, N. (1972). Reps. 104, 108. Technical University of Denmark, Lyngby, Denmark. Allen, 1. L. (1961). IRE Trans. Antennas Propag. AP-9, 350. Altshuler, E. E. (1961). IEEE Trans. Antennas Propag. AP-9, 324. Angell, T. S. and Kirsch, A. (1992). Math. Meth. Appl. Sci. 15, 647. Arbanel, S. and Gottlieb, D. (1976). J. Compo Phys. 21, 351. Ari, N. and Firth, J. R. (1990). Inverse problems 6, 299. Arnback. J. (1969). Electron. Lett. 5, 41. Arnold, D. N. and Wendland, W. L. (1983). Math. Compo 41, 349. Arnold, D. N. and Wendland, W. L. (1985). Numer. Math. 47, 317. Atkinson, K. E. (1967). SIAM J. numer. Anal. 4, 337. Babich, V. M. (1962). DokI. Akad. Nauk SSSR 146, 571. Bach Andersen, 1. (1968). Radio Sci. 3, 432. Bach Andersen, J. (1971). Metallic and dielectric antennas. Polyteknisk, Lyngby. Bahar, E. (1971). J. math. Phys. 12, 186. Banos, A., Jr. and Johnston, G. L. (1970). UCLA Rep. No. R-65. Bates, R. H. T. (1968). Proc. lEE 115, 1443. Bates, R. H. T. (1969). Mon. Not. R. astr. Soc. 142, 413. Bates, R. H. T. (1971). Int. J. eng. Sci. 9, 1107. Bates, R. H. T. and Gough, P. T. (1975). IEEE Trans. Comput. C-24, 449. Bates, R. H. T. and Lewitt, R. M. (1975). Optik 43, 529. Bates, R. H. T. and Napier, P. J. (1972). Mon. Not. R. astr. Soc. 158, 405. Bates, R. H. T. and Wall, D. J. N. (1976). IEEE Trans. Antennas Propag. AP-24, 251. Bates, R. H. T. and Wong, C. T. (1974). Appl. sci. Res. 29, 19. Bates, R. H. 
T., Boerner, W. M., and Dunlop, G. R. (1976a). Opt. Commun. 18, 421. Bates, R. H. T., Napier, P. 1., McKinnon, A. E., and McDonnell, M. 1. (1976b). Optik 44, 183, 253. Bates, R. H. T., Gough, P. T., and Peters, T. M. (1974). Int. Optical Computing Con! Zurich, p. 100. IEEE Catalog. No. 74CH0862-3C. Bates, R. H. T., Smith, V. A., and Murch, R. 0.(1991). Physics Reports 201, 185. Batorsky, D. V. and Felsen, L. B. (1973). Radio Sci. 8, 547. Baum, C. E. (1971). Interaction Notes 63, 88. Baum, C. E. and Pearson, L. W. (1981). Electromagnetics 1, 209. Bayliss, A., Gunzburger, M., and Turkel, E. (1982). SIAM J. Appl. Math. 42, 430.


Beaubien, M. 1. and Wexler, A. (1968). IEEE Trans. MTf-16, 1007. Becker, K-D. and Meister, E. (1973). Forschunqsber. Landes Nordrhein- Westfalen No. 1175. Beckmann, P. and Franz, W. (1957). Z. Naturf. 128, 533. Bennett, C. L. and Weeks, W. L. (1968). Purdue University Tech. Rep. TR-EE68-11. Biggs, M. C. (1971). J. Inst Math. Appl. 8, 315. BirkhotT, G. (1963). Q. apple Math. 21, 160. BirkhotT, G. (1969). In Approximation with special emphasis on splinefunctions (ed. I. J. Schoenberg). Academic Press, New York. Bladel, J. van (1966). Arch. elektr. Uebertr. 20, 447. Bleistein, N. (1976). J. acoust. Soc. Am. 59, 1259. Bleistein, N. (1976a). J. acoust. Soc. Am. 60, 1249. Bleistein, N. and Bojarski, N. N. (1975). University of Denver Rep. MS-R-7501. Bleistein, N. and Cohen, 1. K. (1977). J. math. Phys. 18, 194. Bloom, C. A. and Matkowsky, B. 1. (1969). Arch. ration. Mech. Anal. 33, 71. Bloom, C. O. and Kazarinoff, N. D. (1976). Short wave radiation problems in inhomogeneous media. Springer, New York. Boersma, 1. (1974). Proc. IEEE 62, 1475. Boersma, 1. (1975). Q. J. Mech. Appl. Math. 38, 405. Boersma, 1. (1975a). SIAM J. apple Math. 29, 164. Bojarski, N. N. (1967). Syracuse University Research Corporation Report. Bojarski, N. N. (1974). Naval Systems Command Rep. NOOOI9-73-C-9312/F. Boor, C. de (1972). J. Approx. Theory 6, 50. Boor, C. de (1973). M.R.C. Tech. Rep. 1333, University of Wisconsin. Born, M. and Wolf, E. (1965). Principles of optics (3rd edn). Pergamon Press, Oxford. Borovikov, V. A. (1962). Dokl. Akad. Nauk SSSR 144, 527. Borovikov, V. A. (1973). Inst. Prikl. Mat. Akad. Nauk SSSR, Preprint 63. Bouwkamp, C. J. and De Bruijn, N. G. (1946). Philips Res. Rep. 1, 135. Bowman,1. J. (1970). SIAM J. apple Math. 18, 818. Boyd, W. G. C. (1977). Proc. R. Soc. 356, 315. Brandt, D. W., Eftimu, C., and Huddleston, P. L. (1985). Proc. 4th Int. Conf. on Antennas and Propagation (ICAP85), lEE Conference Publication 248, 434. Brigham, E. O. (1988). 
The fast Fourier transform and its applications. Prentice Hall, Englewood Cliffs, NJ. Brown, A. L. and Page, A. (1970). Elements of functional analysis. Van Nostrand Rheinhold, New York. Brown, W. P., Jr. (1966). J. Math. Anal. 15, 355. Broyden, C. G. (1970). J. Inst. Math. Appl. 6, 79, 222. Bryant, T. G. and Weiss, J. A. (1968). IEEE Trans. MTf-16, 1021. Buneman, O. (1969). Rep. 294, Stanford University Institute for Plasma Research, Stanford, Calif. Burke, G. 1. A. and Selden, E. S. (1973). Antenna modelling program. Rep. IS-R-72/10. Information Systems Division, MB Associates, Menlo Park, Calif. Burton, A. J. (1976). NatI. Phys. Lab. Rep. NPL OC5/535. Buslaev, V. S. (1964). Trudy Matem. Inst. V.A. Stekloo, Mat. Fiz. 73, 2, 14. Butcher, 1. C. (1987). The numerical analysis of ordinary differential equations: RungeKutta and general linear methods. Wiley, Chichester. Butler, C. M. and Wilton, D. R. (1975). IEEE Trans. Antennas Propag. AP-23, 534. Butler, G. F. (1972). RAE Tech. Rep. 72126. Royal Aircraft Establishment, Farnborough.


Buzbee, B. L., Dorr, F. W., George, J. A., and Golub, G. H. (1971). SIAM J. numer. Anal. 8, 722. Buzbee, B. L., Golub, C. H., and Nielson, C. W. (1970). SIAM J. numer. Anal. 7, 627. Carasso, C. (1966). Methodes numerique pour l' obtention de functions-spline. These de Jeme Cycle, Universite de Grenoble. Carre, B. A. (1961). Comput. J. 4, 73. Cermak, I. A. and Silvester, P. (1968). Proc. lEE 115,1341. Chang, S. K. and Mei, K. K. (1974). URSI Symp. on Electromagnetic Wave Theory, p. 160. Institution of Electrical Engineers, London. Chang, S. K. and Mei, K. K. (1976). IEEE Trans. Antennas Prop. AP-24, 34. Chester, W. (1950). Phil.·Trans. Roy. Soc. 242A, 527. Chester, W. «1950a). Proc. R. Soc. 203A, 31 Ciarlet, P. G. (1975). Seminaire de mathematiques superieures. Universite de Montreal. Clarricoats, P. 1. B. and Saha, P. K. (1971). Proc. Inst. Electr. Eng. 118, 1167. Clarricoats, P. 1. B. and Salema, C. E. R. C. (1973). Proc. Inst. Electr. Eng. 120, 741. Clarricoats, P. 1. B., Olver, A. D., and Chong, S. L. (1975). Proc. lnst. Electr. Eng. 122, 1173. Clemmow, P. C. and Munford, C. (1952). Phil. Trans. Roy. Soc. London 245A, 189. Cochran, J. A. (1972). The analysisof linearintegralequations. McGraw-Hill, New York. Cohen, 1. K. and Bleistein, N. (1977). SIAM J. Appl. Math. 32, 784. Colton, D. and Kress, R. (1983). Integral equation methods in scattering theory. Wiley, New York. Colton, D. and Monk, P. (1985). SIAM J. Appl. Math. 45, 1039. Colton, D. and Monk, P. (1986). SIAM J. Appl. Math. 46,506. Colton, D. and Monk, P. (1987). SIAM J. Sci. Stat. Comput. 8, 278. Colton, D. and Monk, P. (1990). Inverse Problems 6,935. Colton, D. and Monk, P. (1992). IMA J. Appl. Math. 49, 163. Colton, D. and Paivarinta, L. (1991). University of Delaware Technical Report 91-16. Colton, D. and Sleeman, B. D. (1983). IMA J. Appl. Math. 31, 253. Cooley, 1. W. and Tukey, 1. W. (1965). Math. Comput. 19, 297. Cooley, 1. W., Lewis, P. A. W., and Welch, P. D. (1967). IEEE Trans. AU-15, 79. 
Coons, S. A. (1967). MIT Project. MAC. Rep. No. MAC-TR-41, MIT, Cambridge, Mass. Cooray, F. R. and Costache, G. I. (1991). J. Electromagn. Waves Appl. 5, 1041. Corr, D. G. and Davies, 1. B. (1972). IEEE Trans. MlT-20, 669. Cox, M. G. (1972). J. Inst. Math. Appl. 10, 134. Curry, H. B. and Schoenberg, I. 1. (1947). Bull. Am. Math. Soc. 53, 1114. Curry, H. B. and Schoenberg, I. 1. (1966). J. Analyse Math. 17, 71. Davidon, W. C. (1959). AEC Res. Dev. Rep. ANL-5990. Davies, J. B. and Muilwyk, C. A. (1966). Proc. lEE 113,277. Davies, P. 1. (1992a). University of Dundee M.O.D. Report No.3. Davies, P. 1. (1992b). University of Dundee M.O.D. Report No.4. Deschamps, G. A. (1971). Electron. Lett. 7,684. Deschamps, G. A. and Cabayan, H. S. (1972). IEEE Trans. Antennas Propag. AP-20, 268. Dingle, R. B. and Morgan, G. J. (1967). Appl. Sci. Res. 18, 221. Dixon, L. C. W. (1972). Math. Prog. 2 (3), 383. Dixon, L. C. W. (1973). Math. Prog. 3 (3), 345. Dovbysh, L. N. (Gagen-Torn) (1962). Trudy Matem. Inst. V.A. Steklov 66, 190. Dovbysh, L. N. (1965). Trudy Matern. Inst. V.A. Steklov 84, 78. Dovbysh, L. N. (1968). Trudy Matem. Inst. V.A. Steklov 96, 188.


Driessen, P. F. and Jull, E. V. (1989). Electromagnetics 9, 147. Duff, I. S. (1976). AERE Rep. No. CSS28, AERE, Harwell. DuHamel, R. H. and Isbell, D. E. (1957). IRE Natl. Conv. Rec. Pt I, 119. Duncan, 1. W. (1967). IEEE Trans. MTT-15, 575. Duncan, R. H. (1962). J. Res. Nat. Bur. Stand. 66D, 181. Engquist, B. and Majda, A. (1977). Math. Compo 31, 629. Felsen, L. B. (1976). J. Opt. Soc. Am. 66, 751. Felsen, L. B. and Yee, H. Y. (1968). J. Acoust. Soc. Am. 44, 1028. Ffowcs Williams, J. E. and Hawkings, D. L. (1969). Phil. Trans. Roy. Soc. 264A, 321. Fishenden, R. M. and Wiblin, E. R. (1949). Proc. Inst. Electr. Eng. 96, 5. Fletcher, R. (1970). Comput. J. 13, 317. Fletcher, R. (1971). AERE Rep. TP 453. AERE, Harwell. Fletcher, R. (1972a). Math. Prog. 2, 133. Fletcher, R. (1972b). AERE Rep. TP 478. AERE, Harwell. Fletcher, R. (1973). AERE Rep. CSS2. AERE, Harwell. Fletcher, R. (1987). Practical methods of optimization (2nd edn). Wiley, Chichester. Fletcher, R. and Powell, M. J. D. (1963). Comput. J. 6, 163. Fletcher, R. and Reeves, C. M. (1964). Comput. J. 7, 149. Fok, V. A. (1946). J. Phys. 10, 399. Fok, V. A. (1965). Electromagnetic diffraction and propagation problems. Pergamon Press, Oxford. Forgan, D. H. (1974). RAE Tech. Rep. 74077. Royal Aircraft Establishment, Farnborough. Forrest, A. R. (1968). Ph.D. Dissertation, University of Cambridge. Forsythe, G. E. and Moler, C. (1967). Computer solution of linear algebraic equations. Prentice-Hall, Englewood Cliffs, NJ. Fox, L. (1962). Numerical solutionofordinaryand partialdifferential equations. Pergamon Press, Oxford. Fox, L., Henrici, P., and Moler, C. (1967). SIAMJ. Numer. Anal. 4, 89. Franz, W. (1954). Z. Naturf. 9a, 705. Franz, W. and Deppermann, K. (1952). Annln Phys. (6) 10, 361. Franz, W. and Klante, K. (1959). IRE Trans. Antennas Propag. AP-7, 568. Freeman, T. L., Delves, L. M., and Reid, 1. K. (1974). J. Inst. Math. Appl. 14, 145. Frieden, B. R. and Burke, J. J. (1972). J. Opt. Soc. Am. 62, 1202. 
Friedlander, F. G. (1954). Commun. Pure. Appl. Math. 7, 705. Friedlander, F. G. (1973). Proc. London math. Soc. 3, 27, 551. Frost, O. L. (1972). Proc. IEEE 60, 926. Gabor, D. (1948). Nature (London) 161,777. Gabor, D. (1949). Proc. Roy. Soc. 197A, 454. Gabor, D. (1951). Proc. Phys. Soc. B 64, 449. Gabor, D. (1956). Rev. Mod. Phys. 28, 260. Galindo, V. (1964). IEEE Trans. Antennas Propag. AP-12, 403. Gans, M. 1. (1965). Proc. IEEE 53,1081. Gazazian, E. D. and Kinber, B. Yeo (1971). Radiofizika 14, 1219. Gelder, D. (1970). Proc. lEE 117, 699. Gill, P. E. and Murray, W. (1972). J. Inst. Math. AppI. 9, 91. Gill, P. E. and Murray, W. (1974). Numerical methods for constrained optimisation. Academic Press, New York. Gill, P. E., Murray, W., and Picken, S. M. (1972a). Nat. phys. Lab. N.A.C. 24. Gill, P. E., Murray, W., and Pitfield, R. A. (1972b). Nat. phys. Lab. N.A.C. 11.


Girault, V. and Raviart, P.-A. (1986). Finite element methods for Navier-Stokes equations. Springer, Berlin. Goldfarb, D. (1969). SIAM J. Appl. Math. 17, 739. Goldfarb, D. (1970). Math. Comp. 24, 23. Goldfeld, S. M., Quandt, R. E., and Trotter, H. F. (1966). Econometrica 34, 541. Goodman, D. M. (1983). Lawrence Livermore National Lab. Report UCID-19767. Goriainov, A. S. (1958). Radio Eng. Electron. (USSR) 3, 23. Goriainov, A. S. (1961). Radiotekh. Elektron. 6, 47. Goubau, G. and Schwering, F. (1961). IRE Trans. Antennas Propag. AP-9, 248. Gourlay, A. R. and Watson, G. A. (1973). Computational methods for matrix eigenproblems. Wiley, London. Goursat, E. (1942). Cours d'analyse mathematique, Tomes II, III. Gauthier-Villars, Paris. Green, A. (1976). Ph.D. Thesis, Polytechnic Institute, New York. Green, H. E. (1965). IEEE Trans. Microwave Theory Tech. MTT-13, 676. Greenstadt, J. (1972). Math. Comput. 26, 145. Greiser, J. W. and Mayes, P. E. (1961). Proc. Natl. Electron. Conf. 17, 193. Grimshaw, R. (1966). Commun. Pure Appl. Math. 19, 167. Grisvard, P. (1985). Elliptic problems in nonsmooth domains. Pitman, Boston. Hallen, E. (1956). IRE Trans. Antennas Propag. AP-4, 479. Hansen, P. C. (1992). SIAM Review 34, 561. Harrington, R. F. (1968). Field computation by moment methods. Macmillan, New York. Hart, J. F., Cheney, E. W., Lawson, C. L., Maehly, H. J., Mesztenyi, C. K., Rice, J. R., Thatcher, H. C., and Witzgall, C. (1968). Computer approximations. Wiley, New York. Hayes, W. P. (1970). Proc. Roy. Soc. 320A, 209. Heading, J. (1975). Proc. Roy. Soc. Edinburgh 73A, 51. Hebden, M. D. (1973). AERE Rep. T.P. 515. AERE, Harwell. Heins, A. E. (1948). Q. J. Appl. Math. 6, 157, 215. Hess, J. L. (1973). Conf. Methods Appl. Mech. Eng. 2, 1. Hestenes, M. R. (1956). Proc. Symp. Appl. Maths 6, 83. Hestenes, M. and Stiefel, E. (1952). J. Res. Nat. Bur. Stand. 49, 409. Hizal, A. (1974). J. Phys. D: Appl. Phys. 7, 248. Hockney, R. W. (1965). J. Assoc. Comput. Mach. 12, 95. Hockney, R. W. (1970). Methods Comput. Phys. 9, 135. Hong, S. (1967). J. Math. Phys. 8, 1223. Hoop, A. T. de (1975). Philips Res. Rep. 30, 302. Hornsby, J. S. and Gopinath, A. (1969). IEEE Trans. MTT-17, 684. Horowitz, B. R. and Tamir, T. (1971). J. Opt. Soc. Am. 61, 586. Howe, D. (1973). J. Inst. Math. Appl. 12, 125. Hsiao, G. C. (1989). J. Comp. Math. 7, 121. Hsiao, G. C. and Wendland, W. L. (1977). J. Math. Anal. Appl. 58, 449. Hu, Y. F. and Storey, C. (1991a). J. Optim. Theory Appl. 71, 399. Hu, Y. F. and Storey, C. (1991b). J. Optim. Theory Appl. 69, 139. Hua, Y. and Sarkar, T. K. (1989). IEEE Trans. Antennas Prop. AP-37, 229. Huang, H. Y. (1970). J. Optim. Theory Appl. 5, 405. Hussain, M. A., Yu, K-B. and Noble, B. (1992). General Electric Preprint. Ikebe, Y. (1972). SIAM Rev. 14, 465. Imai, I. (1954). Z. Phys. 137, 31. Imbriale, W. A. and Mittra, R. (1970). IEEE Trans. Antennas Propag. AP-18, 633. Iri, M. (1969). Network flow, transportation and scheduling. Academic Press, New York.


Irons, B. M. and Razzaque, A. (1972). In The mathematical foundations of the finite element method with applications to partial differential equations (ed. A. K. Aziz). Academic Press, New York. Isbell, D. E. (1960). IRE Trans. Antennas Propag. AP-8, 260. Jahn, J. (1986). Mathematical vector optimization in partially ordered linear spaces. Peter Lang, Frankfurt. James, G. L. (1976). The geometrical theory of diffraction. Institution of Electrical Engineers, London. James, J. H. (1973). ARL Rep. ARL/R/R4 Admiralty Research Laboratory. Jin, J. M., Volakis, J. L., and Liepa, V. V. (1989). IEEE Trans. Antennas Prop. AP-37, 118. Johnson, M. A. (1968). Proc. IEEE 56, 1801. Jones, A. F. and Misell, D. L. (1970). J. Phys. A. 3, 462. Jones, D. S. (1956). IRE Trans. AP-4, 297. Jones, D. S. (1963). Phil. Trans. Roy. Soc. 255A, 341. Jones, D. S. (1964). The theory of electromagnetism. Pergamon Press, Oxford. Jones, D. S. (1966). Generalised functions. McGraw-Hill, New York. Jones, D. S. (1966a). J. Inst. Math. Appl. 2, 197. Jones, D. S. (1967). Proc. Camb. Phil. Soc. 63, 1145. Jones, D. S. (1969). Phil. Trans. Roy. Soc. 265A, 1. Jones, D. S. (1972). SIAM Rev. 14, 286. Jones, D. S. (1973). J. Inst. Math. Appl. 12, 63. Jones, D. S. (1973a). Proc. Roy. Soc. Edinburgh A 71, 263. Jones, D. S. (1973b). Q. J. Mech. Appl. Math. 26, 1. Jones, D. S. (1977). Proc. Aerospace Sci. 17, 149. Jones, D. S. (1974). Q. J. Mech. Appl. Math. 27 (1), 129. Jones, D. S. (1981). IEE Proc. 128, 114. Jones, D. S. (1982). The theory of generalised functions. Cambridge University Press. Jones, D. S. (1984). Math. Proc. Camb. Phil. Soc. 96, 173. Jones, D. S. (1985). Applicable Analysis 19, 181. Jones, D. S. (1986). Acoustic and electromagnetic waves. Oxford University Press. Jones, D. S. (1988a). J. Sound Vib. 121, 37. Jones, D. S. (1988b). IMA J. Appl. Math. 41, 21. Jones, D. S. (1989). Q. J. Mech. Appl. Math. 42, 457. Jones, D. S. (1990). SIAM J. Appl. Math. 50, 547. Jones, D. S. (1992a). In Huygens Principle 1690-1990: theory and applications (ed. H. Blok, H. A. Ferwerda, and H. K. Kuiken). Elsevier, Amsterdam. Jones, D. S. (1992b). IMA J. Appl. Math. 48, 163. Jones, D. S. and Kriegsmann, G. A. (1990). SIAM J. Appl. Math. 50, 559. Jones, D. S. and Mao, X. Q. (1989). Inverse Problems 5, 731. Jones, D. S. and Jordan, D. W. (1969). Introductory analysis, Vol. I. Wiley, New York. Jones, R. M. (1968). Radio Sci. 3, 93. Kaminetzky, L. and Keller, J. B. (1972). SIAM J. Appl. Math. 22, 109. Kazarinoff, N. D. and Senior, T. B. A. (1962). IRE Trans. Antennas Propag. AP-10, 634. Keller, J. B. (1957). J. Appl. Phys. 28, 426. Keller, J. B. (1962). J. Opt. Soc. Am. 53, 116. Kellogg, O. D. (1929). Foundations of potential theory. Ungar, New York. Khebir, A., Kouki, A. B., and Mittra, R. (1990). IEEE Trans. Microwave Theory Tech. MTT-38, 1427. King, R. (1956). The theory of linear antennas. Harvard University Press, Cambridge, Mass.


King, I. D. (1992). DRA Malvern Memorandum No. 4625. Kirsch, A. and Kress, R. (1987a). In Inverse problems (ed. H. Engl and C. W. Groetsch), p. 279. Academic Press, Boston. Kirsch, A. and Kress, R. (1987b). In Boundary elements IX, Vol. 3 (ed. C. A. Brebbia et al.) p. 3. Springer, Berlin. Kirsch, A., Warth, W., and Werner, J. (1978). Notwendige Optimalitätsbedingungen und ihre Anwendung. Springer, Berlin. Kirsch, A., Kress, R., Monk, P., and Zinn, A. (1988). Inverse Problems 4, 749. Klein, C. A. and Mittra, R. (1975). IEEE Trans. Antennas Propag. AP-23, 258. Kleinman, R. E. and Roach, G. F. (1982). Proc. Roy. Soc. A383, 313. Kleinman, R. E., Roach, G. F., Schuetz, I. S., Shirron, J., and van den Berg, P. M. (1990). Wave Motion 12, 161. Kogelnik, H. and Li, T. (1966). Proc. IEEE 54, 1312. Kravtsov, Yu. A. (1964). Radiofizika 7, 664, 1049. Kriegsmann, G. A. and Moore, T. G. (1988). Wave Motion 10, 277. Kriegsmann, G. A. and Morawetz, C. S. (1980). SIAM J. Sci. Stat. 1, 371. Kriegsmann, G. A., Taflove, A., and Umashankar, K. R. (1987). IEEE Trans. Antennas Prop. AP-35, 135. Kruseman Aretz, F. E. J. and Zonneveld, J. A. (1975). Philips Res. Rep. 30, 288. Kulsrud, H. E. (1961). Commun. Assoc. Comput. Mach. 4, 184. Kumaresan, R. and Tufts, D. W. (1982). IEEE Trans. Acoust. Speech Signal Processing ASSP-30, 833. Kunz, K. S. (1957). Numerical analysis. McGraw-Hill, New York. Kunz, K. S. (1963). J. Res. Nat. Bur. Stand. 67D, 417. Kutt, H. R. (1973). Rep. WISK 132, National Research Institute of Mathematical Science, Pretoria. Lalesco, T. (1912). Introduction à la théorie des équations intégrales, pt. III. Hermann & Fils, Paris. Lambert, J. D. (1973). Computational methods in ordinary differential equations. Wiley, New York, 1973. Lambert, J. D. (1991). Numerical methods for ordinary differential equations. Wiley, Chichester. Lauwerier, H. A. (1959). K. Ned. Akad. Wet. Amsterdam 62A, 476. Lauwerier, H. A. (1960). K. Ned. Akad. Wet. Amsterdam 63A, 355. Lauwerier, H. A. (1961). K. Ned. Akad. Wet. Amsterdam 64A, 123, 348. Lee, S. W. (1972). J. Math. Phys. 13, 656. Lee, S. W. and Boersma, J. (1975). J. Math. Phys. 16, 1746. Leith, E. N. and Upatnieks, J. (1963). J. Opt. Soc. Am. 53, 1377. Leppington, F. G. (1968). Proc. Camb. Phil. Soc. 64, 1131. Leppington, F. G. (1970). RAE Tech. Rep. 70183. Royal Aircraft Establishment, Farnborough. Levenberg, K. (1944). Quart. Appl. Math. 2, 164. Levinson, N. (1940). Gap and density theorems. Am. Math. Soc. Coll. Publ., Vol. 26. Lewis, R. M. (1965). Arch. Ration. Mech. Anal. 20, 191. Lewis, R. M. (1969). IEEE Trans. Antennas Propag. AP-17, 308. Lewis, R. M., Bleistein, N., and Ludwig, D. (1967). Commun. Pure Appl. Math. 20, 295. Liebeck, H. (1969). Algebra for scientists and engineers. Wiley, London. Lions, J. L. and Magenes, E. (1972). Non-homogeneous boundary value problems and applications. Springer, Berlin. Liu, Y. and Storey, C. (1991). J. Optim. Theory Appl. 69, 129.


Logan, N. A. (1969). Reps. LMSD-288087 and LMSD-288088, Missiles and Space Division, Lockheed Aircraft Corp. Lootsma, F. A. (1972). Numerical methods for non-linear optimisation. Academic Press, New York. Lowdon, T. A. (1970). Q. J. Mech. Appl. Math. 23, 315. Ludwig, A. C. (1969). Jet Propulsion Lab. Pasadena SPS37-26 4, 200. Ludwig, D. (1966). Commun. Pure Appl. Math. 19, 215. Ludwig, D. (1967). Commun. Pure Appl. Math. 20, 103. Ludwig, D. (1975). SIAM Rev. 17, 1. Lüneburg, E. and Westpfahl, K. (1975). Annln Phys. 7, 166. MacCamy, R. and Marin, S. (1980). Int. J. Math. Math. Sci. 3, 311. McDonald, B. H. and Wexler, A. (1972). IEEE Trans. Microwave Theory Tech. MTT-21, 841. MacDonald, H. M. (1902). Electric waves, Cambridge University Press, London. MacDonald, H. M. (1913). Phil. Trans. Roy. Soc. 212A, 299. McLeod, R. and Mitchell, A. R. (1972). J. Inst. Math. Applic. 10, 382. Malyuzinec, G. D. (1958). Annln Phys. 6, 107. Marin, L. (1973). IEEE Trans. Antennas Propag. AP-21, 809. Marin, L. (1974). IEEE Trans. Antennas Propag. AP-22, 266. Marin, L. and Latham, R. W. (1972). Proc. IEEE 60, 640. Marquardt, D. W. (1963). SIAM J. 11, 431. Marshall, J. A. and Mitchell, A. R. (1973). J. Inst. Math. Appl. 12, 355. Masterman, P. H. and Clarricoats, P. J. B. (1971). Proc. IEE 118, 51. Matthews, A. and Davies, D. (1971). Comput. J. 14, 293. Mautz, J. R. and Harrington, R. F. (1969). Appl. Sci. Res. 20, 405. Mautz, J. R. and Harrington, R. F. (1973). Syracuse Univ. Tech. Rep. TR-73-9. Maystre, D. (1987). J. Mod. Optics 34, 1433. Mei, K. K. (1974). IEEE Trans. Antennas Propag. AP-22, 760. Meinke, H. H. and Baier, W. (1966). Nachrichtentech. Z. 11, 662. Meinke, H. H., Lange, K. P., and Ruger, J. F. (1963). Proc. IEEE 51, 1436. Meyer, R. E. (1975). Proc. IEEE 63, 1070. Meyer, R. E. (1975a). SIAM J. Appl. Math. 29, 481. Meyer, R. E. (1976). J. Math. Phys. 17, 1039. Michalski, K. A. (1982). Electromagnetics 2, 201. Mikhlin, S. G. (1956). Dokl. Akad. Nauk SSSR 106 (3), 391. Miller, E. K. (1972). Lawrence Livermore Lab. Rep. UCRL-51276. Miller, E. K. and Deadrick, F. J. (1973). Rep. UCRL 74818. Lawrence Livermore Laboratory, University of California. Miller, E. K. and Morton, J. B. (1970). IEEE Trans. Antennas Propag. AP-18, 290. Milne, K. (1964). Radio Electron. Eng. 28, 89. Milne-Thomson, L. M. (1960). The calculus of finite differences. Macmillan, London. Mitchell, A. R. (1969). Computational methods in partial differential equations. Wiley, London. Mitchell, A. R. and Griffiths, D. F. (1980). The finite difference method in partial differential equations. Wiley, Chichester. Mitchell, A. R. and Wait, R. (1977). The finite element method in partial differential equations. Wiley, New York. Miura, R. (1963). J. Res. Nat. Bur. Stand. 67D, 245. Mittra, R. and Itoh, T. (1971). IEEE Trans. MTT-19, 47.


Mittra, R., Itoh, T., and Li, T. (1972). IEEE Trans. MTT-20, 96. Mittra, R., Ramahi, G., Khebir, A., Gordon, R., and Kouki, A. B. (1989). IEEE Trans. Magnetics 25, 3034. Monk, P. (1991). SIAM J. Numer. Anal. 28, 1610. Moore, T. G., Kriegsmann, G. A., and Taflove, A. (1988). IEEE Trans. Antennas Prop. AP-36, 1329. Morawetz, C. S. and Ludwig, D. (1968). Commun. Pure Appl. Math. 21, 187. More, J. J. and Sorensen, D. C. (1982). Argonne Nat. Lab. Report ANL-82-8. Morgan, M. A. and Mei, K. K. (1974). URSI Symp. on Electromagnetic Wave Theory, p. 163. Institution of Electrical Engineers, London. Morgan, M. A. and Mei, K. K. (1979). IEEE Trans. Antennas Prop. AP-27, 202. Muller, C. (1969). Foundations of the mathematical theory of electromagnetic waves. Springer, Berlin. Muller, C. and Niemeyer, H. (1961). Arch. Ration. Mech. Anal. 7, 305. Mur, G., Quak, D., and van Dijk, G. J. (1976). ICCAD Conf. on Numerical Methods. Santa Margherita Ligure. Murch, R. D. (1991). University of Dundee M.O.D. Report No. 4. Murch, R. D., Tan, G. D. H., and Wall, D. J. N. (1988). Inverse Problems 4, 1117. Napier, P. J. and Bates, R. H. T. (1971). Int. J. Eng. Sci. 9, 1193. Napier, P. J. and Bates, R. H. T. (1973). Proc. Inst. Electr. Eng. 120, 30. Neave, G. (1987). Q. J. Mech. Appl. Math. 40, 57. Nedelec, J. C. (1980). Numer. Math. 35, 315. Nedelec, J. C. (1982). Numer. Math. 39, 97. Noble, B. (1960). Proc. Symp. Int. Computer Center, Rome, p. 540. Springer, New York. Noble, B. (1971). Conf. on Application of Numerical Analysis, Dundee, p. 137. Springer, New York. Noble, B. and Sewell, M. J. (1972). J. Inst. Math. Appl. 9, 123. Nokes, S. A. (1974). Ph.D. Thesis, University of London. Nokes, S. A., Bernal, M. J. M., and Davies, J. B. (1974). 1974 URSI Symp. on Electromagnetic Wave Theory, IEE Conf. Publ. No. 114, p. 12, IEE, Stevenage. Olver, F. W. J. (1965). J. Res. Nat. Bur. Stand. 69B, 291. Olver, F. W. J. (1974). Asymptotics and special functions. Academic Press, New York. Oshiro, F. K., Torres, F. P., and Heath, H. C. (1966). U.S. Air Force Avionics Lab. Tech. Rep. AFAL-TR-66-162. Otis, G. (1974). J. Opt. Soc. Am. 64, 1545. Paley, R. E. A. C. and Wiener, N. (1934). Fourier transforms in the complex domain. Am. Math. Soc. Coll. Publ., Vol. 19. Parini, C., Clarricoats, P. J. B., and Olver, A. D. (1975). Electron Lett. 11, 567. Park, S. W. and Cordaro, J. T. (1988). IEEE Trans. Electromagn. Compat. 30, 145. Pathak, P. H. and Kouyoumjian, R. G. (1974). Proc. IEEE 62, 1438. Pearson, L. W. and Butler, C. M. (1975). IEEE Trans. Antennas Propag. AP-23, 295. Perry, W. K. (1974). IEEE Trans. Antennas Propag. AP-22, 826. Peters, G. and Wilkinson, J. H. (1969). Comput. J. 12, 398. Peters, T. J. and Volakis, J. L. (1988). IEEE Trans. Antennas Prop. AP-36, 518. Peters, T. J. and Volakis, J. L. (1989). J. Electromagn. Waves Appl. 3, 675. Peterson, A. F. (1988). Microwave Opt. Technol. Lett. 1, 62. Peterson, A. F. (1989). J. Electromagn. Waves Appl. 3, 87. Peterson, A. F., Smith, C. F., and Mittra, R. (1988). IEEE Trans. Antennas Prop. AP-36, 1177. Pocklington, H. C. (1897). Proc. Camb. Phil. Soc. 9, 324.


Poggio, A. J. and Miller, E. K. (1970). MBA Rep. No. MB-TM-70/20. Polak, E. and Ribiere, G. (1969). Rev. Fr. Inform. 16-RI, 35. Pontoppidan, K. (1969). European Microwave Conf., IEE Conf. Publ. 58, p. 99. IEE, Stevenage. Potter, P. D. (1967). Microwave J. AP-15, 727. Poulton, G. T. (1975). Proc. 5th European Microwave Conf., p. 61. IEE. Powell, M. J. D. (1966). In Numerical analysis: an introduction (ed. J. Walsh). Academic Press, London. Powell, M. J. D. (1971a). J. Inst. Math. Appl. 7, 21. Powell, M. J. D. (1971b). U.K.A.E.A. Res. Rep. TP 459. Prossdorf, S. and Silbermann, B. (1991). Numerical analysis for integral and related operator equations. Birkhauser, Basel. Ra, J. W., Bertoni, H. L., and Felsen, L. B. (1973). SIAM J. Appl. Math. 24, 396. Ralston, A. and Wilf, H. S. (1967). Mathematical methods for digital computers, Vol. II. Wiley, New York. Ransom, P. L. and Mittra, R. (1971). Proc. IEEE 59, 1029. Rao, S. M. and Wilton, D. R. (1990). Electromagnetics 10, 407. Rawlins, A. D. (1972). Ph.D. Thesis, University of Surrey. Reid, J. K. (1966). Comput. J. 9, 200. Rice, S. O. (1954). Bell Syst. Tech. J. 33, 417. Richmond, J. H. (1965). Proc. IEEE 53, 796. Richmond, J. H. (1966). IEEE Trans. Antennas Propag. AP-14, 782. Richtmyer, R. D. and Morton, K. W. (1967). Difference methods for initial-value problems. Wiley-Interscience, New York. Roberts, A. and Rundle, K. (1972). British Aircraft Corp. Rep. Aero MA 19. Roberts, R. (1972). Inst. Phys. Fibre Optical Communications. Inst. of Physics. Roberts, R. (1973). 8th Progress Rept., University of Dundee. Roberts, R. (1975). Electron. Lett. 11, No. 22. Roger, A. (1981). IEEE Trans. Antennas Prop. AP-29, 232. Ross, M. P. and Dudley, D. G. (1988). IEEE Trans. Antennas Prop. AP-36, 1192. Rothwell, E. J. (1987). IEEE Trans. Antennas Prop. AP-35, 913. Rozzi, T. E. (1973). Int. J. Circuit. Theory. Appl. 1, 161. Rozzi, T. E. and Mecklenbrauker, W. F. G. (1974). Philips Res. Rep. M.S. 8433. Rumsey, V. H. (1957). IRE Natl. Conv. Rec. Pt. 1, 114. Ryan, C. E., Jr. and Peters, L. Jr. (1968). IEEE Trans. Antennas Propag. AP-16, 274. Ryle, M. (1952). Proc. Roy. Soc. 211A, 351. Rynne, B. P. (1985). IMA J. Appl. Math. 35, 297. Rynne, B. P. (1986). Electromagnetics 6, 129. Rynne, B. P. (1992). IMA J. Appl. Math. 49, 35. Rynne, B. P. and Smith, P. D. (1990). J. Electromagn. Waves Appl. 4, 1181. Sard, A. and Weintraub, S. (1971). A book of splines. Wiley, New York. Sarkar, T. K. (1991). Application of conjugate gradient method to electromagnetics and signal analysis. Elsevier, New York. Sayre, E. P. and Harrington, R. F. (1968). Proc. Int. Antennas Propagation Symp., Boston, p. 160. IEEE. Schenk, H. A. (1968). J. Acoust. Soc. Am. 44, 41. Schoenberg, I. J. (1946). Q. Appl. Math. 4, 45, 112. Seeger, J. A. (1968). Proc. IEEE 56, 1393. Senior, T. B. A. (1969). IEEE Trans. Antennas Propag. AP-17, 378, 751. Senior, T. B. A. (1971). Electron. Lett. 7 (10), 87.


Senior, T. B. A. (1971a). Appl. Sci. Res. 23, 459. Senior, T. B. A. (1972). IEEE Trans. Antennas Propag. AP-20, 326. Senior, T. B. A. (1975). Radio Sci. 10, 645, 911. Senior, T. B. A. (1989). Electromagnetics 9, 187. Senior, T. B. A. and Uslenghi, P. L. E. (1971). Radio Sci. 6, 393. Senior, T. B. A. and Uslenghi, P. L. E. (1973). Radio Sci. 8, 247. Sewell, M. J. (1987). Maximum and minimum principles. Cambridge University Press. Shampine, L. F., Watts, H. A., and Davenport, S. M. (1976). SIAM Rev. 18, 376. Shanks, H. E. (1961). IRE Trans. Antennas Propag. AP-9, 162. Shanks, H. E. and Bickmore, R. W. (1959). Can. J. Phys. 37, 263. Shanno, D. F. (1970). Math. Comput. 24, 647. Sharples, A. (1962a). Proc. Camb. Phil. Soc. 58, 662. Sharples, A. (1962b). Q. J. Mech. Appl. Math. 15, 253. Shaw, E. and Davies, D. E. N. (1964). Radio Electron. Eng. 27, 279. Shaw, R. P. (1974). J. Acoust. Soc. Am. 56, 1354. Shelton, J. P. and Kelleher, K. S. (1961). IRE Trans. Antennas Propag. AP-9, 154. Silvester, P. (1968). Proc. IEE 115, 43. Silvester, P. (1970). IEEE Trans. MTT-18, 63. Silvester, P. and Hsieh, M. S. (1971). Proc. Inst. Electr. Eng. 118, 1743. Sleeman, B. D. (1982). IMA J. Appl. Math. 29, 113. Smith, P. D. (1990). Electromagnetics 10, 439. Snyder, A. W. and Love, J. D. (1975). IEEE Trans. Microwave Theory Tech. MTT-23, 134. Snyder, A. W. and Love, J. D. (1983). Optical waveguide theory. Chapman and Hall, London. Sologub, V. G. (1972). USSR Comput. Math. Math. Phys. 12, 135. Sondhi, M. M. (1972). Proc. IEEE 60, 842. Stephan, E. P. and Wendland, W. L. (1984). Applicable Analysis 18, 183. Stinehelfer, H. E., Sr. (1968). IEEE Trans. MTT-16, 439. Strang, G. and Fix, G. J. (1973). An analysis of the finite element method. Prentice-Hall, Englewood Cliffs, N.J. Syed, H. H. and Volakis, J. L. (1992). Electromagnetics 12, 33. Sylvester, J. and Uhlmann, G. (1990). In Inverse problems in partial differential equations, p. 101. SIAM, Philadelphia. Takahasi, H. and Mori, M. (1973). Num. Math. 21, 206. Taylor, C. D. and Wilton, D. R. (1972). IEEE Trans. Antennas Propag. AP-20, 772. Taylor, T. T. (1955). IRE Trans. Antennas Propag. AP-3, 16. Tee, G. J. (1963). Computer J. 6, 177. Tesche, F. M. (1973). IEEE Trans. Antennas Propag. AP-21, 53. Tesche, F. M. and Neureuther, A. R. (1970). IEEE Trans. Antennas Propag. AP-18, 692. Teymur, M. (1992). IMA J. Appl. Math. 48, 217. Thiele, G. A., Travieso-Davies, M., and Jones, H. S. (1969). Proc. Conf. on Environmental Effects on Antenna Performance. Pergamon Press. Thong, T. and Liu, B. (1977). IEEE Trans. CAS-24, 132. Tihonov, A. N. (1944). Dokl. Akad. Nauk SSSR 39, 332. Tihonov, A. N. (1964). Sov. Math. Dokl. 5, 835. Tijhuis, A. G. (1984). Radio Science 19, 1311. Tijhuis, A. G., Wiemans, R., and Kuester, E. F. (1989). J. Electromagn. Waves Appl. 3, 485. Titchmarsh, E. C. (1939). The theory of functions. Oxford University Press, London.


Tobin, A. R., Yaghjian, A. D., and Bell, M. M. (1987). Digest of the National Radio Science Meeting (URSI), Boulder, Colorado. Tokatly, V. I. and B. Ye. Kinber (1971). Radiofizika 14, 761. Ufimtsev, P. Y. and Krasnozhen, A. P. (1992). Electromagnetics 12, 121. Ursell, F. J. (1957). Proc. Camb. Phil. Soc. 53, 115. Ursell, F. J. (1968). Proc. Camb. Phil. Soc. 64, 171. Ursell, F. (1973). Proc. Camb. Phil. Soc. 74, 117. Van Blaricum, M. L. and Mittra, R. (1975). IEEE Trans. Antennas Propag. AP-23, 777. Van Buren, A. L. (1970). NRL Rep. 7160. U.S. Govt., Washington. Van Dantzig, D. (1958). K. Ned. Akad. Wet. Amsterdam 61A, 384. van den Berg, P. M. (1984). IEEE Trans. Antennas Prop. AP-32, 1063. van den Berg, P. M. and Kleinman, R. E. (1988). IEEE Trans. Antennas Prop. AP-36, 1418. Vainikko, G. M. (1968). Dokl. Akad. Nauk SSSR 179, 1029. Vainikko, G. M. (1965). Uch. Zap. Tartusk. Inst. 73, 182. Varga, R. S. (1962). Matrix iterative analysis. Prentice-Hall, Englewood Cliffs, N.J. Vogel, M. H. (1991). Ph.D. Thesis, University of Delft. Wachspress, E. L. (1973). J. Inst. Math. Applic. 11, 83. Wait, J. R. and Conda, A. M. (1958). Trans. Inst. Radio Eng. AP-6, 1957. Wait, J. R. and Conda, A. M. (1959). J. Res. Nat. Bur. Stand. 63D, 181. Wait, R. (1973). Ministry of Defence Res. Rep. University of Dundee. Wall, D. J. N. and Bates, R. H. T. (1975). IEEE Trans. Microwave Theory Tech. MTT-23, 605. Waterman, P. C. (1965). Proc. IEEE 53, 805. Waterman, P. C. (1971). Phys. Rev. D 4, 825. Weinstein, L. A. (1948). Izv. Akad. Nauk Ser. Fiz. 12, 144, 166. Weinstein, L. A. (1969). The theory of diffraction and the factorization method. Golem Press, Boulder, Colorado. Westcott, B. S. (1983). Shaped reflector antenna design. Research Studies Press, Letchworth. Westcott, B. S. and Brickell, F. (1982). IEE Proc. 129, 307. Westcott, B. S. and Brickell, F. (1984). IEE Proc. 131, 9. Westcott, B. S., Graham, R. K., and Brickell, F. (1984). IEE Proc. 131, 365. Weston, V. H. (1962). IRE Trans. Antennas Propag. AP-10, 775. Weston, V. H. (1965). Radio Sci. 69D, 1257. Wheeler, H. A. (1965). IEEE Trans. MTT-13, 172. White, F. P. (1922). Proc. Roy. Soc. 100A, 505. White, W. D. (1962). IRE Trans. Antennas Propag. AP-10, 430. Whiting, K. B. (1968). IEEE Trans. MTT-16, 889. Whittaker, E. T. and Robinson, G. (1952). The calculus of observations. Blackie, London. Wilkinson, J. H. (1965). The algebraic eigenvalue problem. Clarendon Press, Oxford. Williams, W. E. (1959). Proc. Roy. Soc. 252A, 376. Williams, W. E. (1961). Appl. Sci. Res. 98, 21. Williams, W. F. (1965). Microwave J. 8, 79. Wilton, D. T. (1973). Ministry of Defence Res. Rep., Pts 1, 2, University of Dundee. Wilton, D. R. (1981). Electromagnetics 1, 403. Wolf, E. (1969). Opt. Commun. 1, 153. Wolf, E. (1970). J. Opt. Soc. Am. 60, 18. Wolf, E. and Shewell, J. R. (1970). J. Math. Phys. 11, 2254. Wolfe, P. (1971). SIAM Rev. 13, 185.


Wombell, R. J. and Murch, R. D. (1992). J. Electromagn. Waves Appl. 7, 687. Wood, P. J. (1970). Electron. Lett. 6, 326. Wood, P. J. (1971). Marconi Rev. 34, 149. Wood, P. J. (1972). Marconi Rev. 35, 121. Woodward, P. M. and Lawson, J. D. (1948). J. Inst. Electr. Eng. 95, 363. Wu, T. T. (1956). Phys. Rev. 104, 1201. Wu, T. T. and Rubinow, S. I. (1956). J. Appl. Phys. 27, 1032. Xu, J. (1992). SIAM Review 34, 581. Yaskova, G. N. and Yakovlev, M. (1962). Trudy. Matem. Inst. V.A. Steklov 66, 182. Yee, H. Y. and Felsen, L. B. (1969). IEEE Trans. Microwave Theory Tech. MTT-17, 73, 671. Yee, H. Y., Felsen, L. B., and Keller, J. B. (1969). SIAM J. Appl. Math. 16, 268. Yee, K. (1966). IEEE Trans. Antennas Prop. AP-16, 302. Younan, N. H. and Taylor, C. D. (1991). Electromagnetics 11, 223. Young, D. M. (1971). Iterative solution of large linear systems. Academic Press, London. Zoutendijk, G. (1970). In Integer and non-linear programming (ed. J. Abadie). North-Holland, Amsterdam. Zinn, A. (1989). Inverse Problems 5, 239. Zwamborn, A. P. M. and van den Berg, P. M. (1991). IEEE Trans. Antennas Prop. AP-39, 224.

INDEX

absorbing boundary condition 366 and dielectric 383 and finite elements 371 and Helmholtz's equation 367, 371 for Laplace's equation 368 and Maxwell's equations 368 numerical study 368 in time domain 395 A-conjugacy 190 active set 192 adaptive beam forming 618 ADI method 100 adjoint operator 124 norm of 128 admittance, of dielectric rod 381 of infinite wire 295, 296, 302 shunt 321 of transmitting antenna 312, 316 aircraft, wire grid model of 326 Aitken's δ²-method 39, 61 algebraic equations associated with integral equation 360 algorithm, BFGS 186 DFP 186 Fletcher-Reeves 189 Hestenes-Stiefel 189 LR 65 Polak-Ribiere 189 QR 65, 66 quasi-Newton 186, 187, 189 algorithms of Remes 14, 15, 25 alias 118, 619 alternating direction implicit method 100 amplification matrix 91 angle of incidence, critical 562 anisotropic medium 436, 547, 548 and high frequencies 437 Ansatz for high frequencies 435 antenna, admittance in transmission 312, 316 back-fire 324 Cassegrain 551 circular 320, 324, 403 as cone-sphere 328 conical 327 cylindrical 433, 474 disc 433 frequency re-use 550 gain of 302 helical 320 impedance of 303 impedance symmetry 304 input conductance 312 input susceptance 312 laminar 327 loaded 302, 306 log-periodic 321, 403 matched 306 maximum power transfer 306 network characterization 303 parabolic 475 paraboloidal 328 power gain 313 power transfer 306 receiving 302 reflector 549 spherical 327, 433 spheroidal 325, 337, 361, 362 V 403 wire 286 as wire grid 324 zigzag 322, 323 aperture field in holography 602 aperture function, norm of 608, 612 aperture-limited array 608 aperture, ray theory of 517 synthetic 625 apertures, independent processing 623 apparent source strength and diffraction 594 approximation 12 finite difference 78 L2 norm 15, 23, 28 rational 23 area of surface element 390 array 323 of antennas 324, 619 of circular wires 324 multiplicative 624 phase centre of 623 power pattern 620 rectified 624 time-varying 625 Yagi 324 array synthesis 607 with aperture limitation 608 associated Legendre polynomial 351 astigmatism 440, 470

autocorrelation function 586, 591, 604 axial ray 439 Babinet's principle 517 Babuska paradox 358 back scatter 302 back substitution 52 backward difference 79 backward rectangle rule 277, 279 Banach space 122, 165, 166, 219, 245 and logarithmic norm 172 bandwidth and coherence time 589 basis 121 basis element 222 basis function 4, 308 piecewise constant 308 piecewise linear 309 piecewise sinusoidal 309 pulse 308 quadratic 309 triangle 309 trigonometric 309 beam, and side lobes 624 collimated 554 Gaussian 554 narrowest 616, 617 beam former, best 620 beam forming, adaptive 618 beam sweeping 619, 626 beams, and orthogonality 623 multiple 622 beam waist 554, 558 Bernoulli number 281 Bernoulli polynomial 280 Bessel's inequality 17 bicubic B-spline 358 biharmonic operator 99 binormal 442 biological specimen 591 biorthonormal set 225, 241 bisection method 31 boresight 549 Born approximation 591, 602 boundary approximation 97, 103, 105 boundary condition, absorbing 366 Dirichlet 87, 137 extended 362 impedance 362 natural 141 Neumann 87, 88 periodic 107, 109 boundary value in Sobolev space 221 Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm 186 B-spline 8, 316 algorithm for 317 bicubic 358 cubic 8, 318, 401 parametric 318 in two dimensions 358

canonical problem 453, 469, 482, 517, 523, 541 capacitance of square plate 308 catastrophe theory 474 Cauchy principal value 360 Cauchy's method 36, 39 causality 395, 424 caustic 470, 560, 561 intensity at 474 phase change at 474 uniformly valid field 474 caustics and rays 546 cavity resonator 143, 272 and finite elements 272 central difference 7, 79 CFIE 349, 361 and choice of parameter 349 MFIE and EFIE compared 354 uniqueness of 349 CGFFT 315, 319 chain rule for operator derivatives 161, 162 characteristic function 315 characteristic polynomial 460 charge, conservation of 72 charge density 72 surface 73 charge, surface 331 Chebyshev polynomial 20, 616 recurrence formula for 21 Chebyshev series 22, 23, 28 truncation 22, 23 Chebyshev theorem 21 Cholesky decomposition 56, 182 closure of space 242 coaxial line 78 coherence 587 partial 587 coherence time 588 and bandwidth 589 collocation 251, 261, 308 and Sobolev space 221 subsectional 251, 309 column vector 41 combined field integral equation (CFIE) 349 communication theory integral equation 207 compact linear operator and extremum property 130 compactness, for MFIE 346 for MFIE with complex wavenumber 348 for static MFIE 348 compact operator 129, 130, 142, 147, 151 conditions A 241 and Riesz number 341 complex eikonal 558 complex line source 555, 556 complex ray 557 complex source 553 concave functional 195 condition number 54 and scaling 54 spectral 54 conditions A 241 conductance, radiation 296, 379, 380 conductivity 73 conductor 73 perfect 73 cone 327 convex 218 of edge rays 488 cone-sphere 328, 409 conformal mapping 109 conforming element 358 conjugate direction 188 conjugate gradient 189, 190, 315, 361 conjugate gradient method 259, 262, 315 for integral equation 259 conjugate method 188 conservation of charge 72 surface 404 constitutive equations 72 continued fraction 25 convergent 25 infinite 25 nth convergent 25 terminating 25 continuous spectrum 146 convergence, fixed-station 83 convex cone 218 convex functional 170, 172, 187, 208 and Gateaux derivative 171 convex semi-cone 218 of matrices 218 convex set 162 convolution, inversion of 602 Coons' patch 357 coordinates, isoparametric 263, 267, 268 corrector 464 correlation, and source strength 590 circular serial 431 in frequency domain 591 in time domain 590 coupling between resonant modes 421 Cramer's rule 53 creeping ray 523, 539 damping on 526 shedding of energy from 527 in three dimensions 540, 546 creeping waves 405 on a disc 546 and focusing 546 critical angle of incidence 562 cross-correlation for source location 592 cross-correlation function 586, 589 and partial differential equation 589 cross-polarization 552 and discontinuous curvature 545 cross-power spectral density 587, 592, 593 and partial differential equation 589 cross-section, radar 313 scattering 361 cube, standard 265, 272 curl, surface 393 current, body and causality 424 ring 291 surface 331 current density 72 surface 73 curvature, Gaussian 391 mean 391 cut-off frequency in dielectric rod 374, 376 cut-off wavelength 77, 98 cut-off wavenumber 77 Davidon-Fletcher-Powell (DFP) algorithm 186 decomposition, Cholesky 56, 182 triangular 55, 57 derivative, of operator 156 Frechet 159, 162, 167 Gateaux 156, 162, 167 higher 169 numerical 79 numerical and global accuracy 80 numerical and local truncation error 80 second Frechet 167 derivatives, chain rule for operator 161, 162 descent method 183 det 42 detection, of inhomogeneity 584, 601 of stratification 584, 585 determinant 42 Jacobian 160 diag 43 diagonal element 41 dielectric, and absorbing boundary condition 383 and finite differences 383 and finite elements 383 and Gaussian beam 555 homogeneous isotropic 385 inhomogeneous 383 and surface radiation condition 384 volume integral equation 382 dielectric constant 72 dielectric interface and surface wave 560, 561 dielectric rod, admittance of 381 effective permittivity of 375 energy flow in 374, 375, 380 of finite length 381

644 dielectric rod (cont.) hybrid modes on 375 infinite 373 modal excitation 377 modes on 374 surface wave on 376, 378, 380 difference, backward 79 central 7, 79 five-point 82, 84, 89 forward 79 nine-point 83, 89 difference equation 83 convergence of 83 initial value for 89 instability of 83 maximum principle for 85 difference equations, and preprocessing 105 direct methods for 100 differential equation, Adams-Bashforth method 460, 462, 464 Adams-Moulton method 458, 460, 464 difference operator of 459 Euler's method 458, 464 extrapolation method 468 Heun's method 466 mid-point rule 458 Milne-Simpson method 460 multistep method 457 Nystrom method 460, 462 predictor-corrector method 464, 467 Quade's method 459, 461 Runge- Kutta method 465 Simpson's rule 458 Sturm-Liouville problem 203 trapezoidal rule 458 and variational principle 203 differential equations, autonomous system of 445 for rays 441, 445 state-space 329 steady state 469 stiff 469 systems of 469 transient 469 diffracted field 484 as plane wave locally 487 diffraction, and apparent source strength 594 by circular cylinder 523 by circular cylinder with impedance boundary 533 by cone 546 by curved boundary with impedance 546 by discontinuity in curvature 541 by discontinuity in curvature on dielectric 546 by discontinuity in curvature with impedance 545, 546

INDEX by discontinuity in curvature with oblique incidence 545 by edge 482 by edge at oblique incidence 486 by ellipse 538 by general curved object 536 by half-plane with complex source 555 with inhomogeneous medium 538 and interference 485 by open end of waveguide 509 by parabolic cylinder 535, 536 by radially stratified cylinder 538 by screen 517 by slit 517 in stratified medium 538 by two edges 499 by wedge 517 X-ray 602 diffraction matrix 488 for curved boundary 539 in penumbra 541 for wedge 520 directivity of radiation pattern 616 Dirichlet's principle 207 discontinuous curvature and cross-polarization 545 divergence factor 481 divergence, surface 330, 391 divergence theorem, surface 392 domain 124 of Euclidean space 219 dominant mode 77, 96 Doppler shift 580 doubly diffracted ray 489 dyadic 420 economic distribution 202 edge ray 489 efficiency of antenna 307 maximum 307 efficiency, radiation 313, 316 EFIE 334, 361 CFIE and MFIE compared 354 and MFIE compared 339 mixed 355 modified 351 non-uniqueness of 335 numerical errors in 336 and resonance 337 singularity of 359 in time domain 404 in time domain, instability of 407, 409 in time domain, numerical properties 406 eigenelement, generalized 417 eigenfunction of operator 129 eigenvalue, matrix 42 multiplicity of 131 operator 129

simple 131 eigenvalues, and Galerkin's method 276 bounds for 143, 149-55 of compact operator 343 continuity of 44, 46 finding 60 and Givens-Householder method 63 Givens method 63 of Hermitian matrix 45, 152 of inverse source integral equation 584 Jacobi method 62 and LR algorithm 65 of positive definite operator 276 and QR algorithm 65 of self-adjoint operator 129, 142 eigenvector of operator 129 eigenvectors, continuity of 44, 46 orthogonality of 42, 45 eigenvectors of partial differential equation 138 expansion in 138 eikonal 436 complex 558 eikonal equation 436 electrically polarized wave 454 electric dipole, complex 556 electric-field integral equation (EFIE) 334 electric flux density 72 electric intensity 72 electric mode and resolvent pole 420 electromagnetic surface radiation condition in time domain 396 element, conforming 358 non-conforming 358 emission from waveguide 509 energy flow, and rays 438 on complex rays 559 in dielectric rod 374, 375, 380 at high frequencies 436 from non-radiating source 569 energy velocity 436 ensemble 586 ergodic 586 stationary 586 entire function 605 of exponential type 605 zeros of 606 equation, difference 83 of continuity 72 integral 134 partial differential 135 pivot 52 Poisson's 84, 233 root of 31 solution by Gaussian elimination 52 solution of 31, 51 zero of 31 equations, constitutive 72 ill-conditioned 53


iterative methods for 57 Maxwell's 72 residual in 56 well-conditioned 54 equiphase curve 559 error constant 459 etalon problem 546 Euler's constant 293 Euler-Maclaurin summation formula 282,285 evanescent field 559, 561 expansion, full-domain 308 subdomain 308 expansion parameter, log-periodic 321 expansion theorem 132, 138 explicit method 457 extended boundary condition 362 exterior uniqueness theorem 364 extremum property of compact operator 130 false position, method of 9, 32, 39 far-field, and cross-correlation 592 of antenna 295, 301 of moving target 581 in time domain 579 fast Fourier transform (FFT) 106, 117, 313 feasible point 192 Fermat's principle 451, 489, 493, 523, 540 and reflection 452 and refraction 453 FFT 106, 117, 313, 410 fibre, optical 377, 559 field, reconstruction 597 reference 597 finite difference and finite element, comparison 275 finite differences 78 and dielectric 383 in time domain 396 and variational principle 273 finite element 262 conforming 270 with continuous normal components 272 with continuous tangential components 272 Nedelec 270,369 non-conforming 269 finite elements, and absorbing boundary condition 371 and cavity resonator 272 and dielectric 383 in time domain 396 five-eight rule 277, 279 Fletcher-Reeves algorithm 189 floating point 53 focal surface 470 focus 474 intensity at 474 phase change at 474

focusing 440, 470 forward difference 79 forward rectangle rule 277, 279 Fourier method for integral equation 253 Fourier transform 291 fraction, continued 25 Frechet derivative 159, 162, 167 Fredholm alternative 338, 340-4 Fredholm's theorem 343 frequency 73 frequency range, effective 589 Fresnel's equation 438 Fresnel's integrals 483 functional, concave 195 continuity of 171 convex 170, 172, 187, 208 derivative of 158 gradient of 158, 162, 208 Jacobian matrix of 160 linear 124 maximum of 158, 170, 195 saddle-shaped 197 stationary 158 strictly convex 170, 172 strictly saddle-shaped 197, 207 function, aliased 118 autocorrelation 586, 591, 604 basis 4, 308 characteristic 315 cross-correlation 586, 589 entire 605 expansion in orthonormal set 132 pyramid 2 triangle 2 weight 19 functions, orthogonal 16 fundamental mode 77 fundamental tensor, second 390 gain, antenna 302 power 313 Galerkin's method 222, 308, 310 and eigenvalues 276 general 223 for partial differential equation 239 and Sobolev space 221 and variational principle 222 Gateaux derivative 156, 162, 167 and convex functional 171 Gateaux variation 156 Gaussian beam 554 and dielectric 555 lateral displacement 557 and reflection 556 Gaussian elimination 52 Gaussian pulse 402, 406, 432, 579 Gauss-Seidel method 58, 243

INDEX generalized eigenelement 417 generalized inverse 68, 223, 429 geodesic, surface 540, 546 geometrical optics 434, 438, 469 intensity law 439 geometric theory of diffraction (GTD) 434,470 geometry of surfaces 389 Gerschgorin circle theorem 50, 51 Givens-Householder reduction 63, 66 Givens method 63, 66 glancing incidence 527 glass, black 377 global accuracy 80 gradient, of functional 158, 162, 208 surface 391 graph, directed 201 Green's tensor 143 for modified EFIE 351 GTD 434,470 and edges 489 uniformly valid 492 Hadamard's finite part 492 Hammerstein integral equation 203 Hankel function, uniformly valid expansion 525 harmonic response versus impulse response 409 harmonic, spherical 351 Heaviside step function 426 Helmholtz's equation 367 Hermite interpolation 4 Hestenes-Stiefel algorithm 189 high frequency, Ansatz at 435 Hilbert space 122, 219, 340 separable 228 hologram 570, 596 wave from 601 holography 596 of antenna 602 and inhomogeneity 601 homocentric pencil 470 homogeneous isotropic dielectric 385 integral equations 387 uniqueness 388 H-plane oversize section 217 H-plane step, symmetric 217 hull, linear 121 hybrid method 366 hybrid modes 375 cut-off 376 ill-conditioning 53 illuminated region 484 image, blurred 602 diffraction-limited 594 real 600 virtual 598

INDEX imbedding 231 and strong set 232 impedance 74 of medium 328 mode 97 series 321 impedance boundary condition 362, 533 transverse 365 uniqueness with 364 impedance half-plane 489 impedance matrices for transmitting and receiving 305 impedance matrix 303 symmetry of 304 transpose of 305 implicit method 457 impulse response of body 423 impulse response versus harmonic response 409 incidence, glancing 527 plane of 453 incoherence 587 inhomogeneity, detection of 584, 601 inhomogeneous dielectric 383 inhomogeneous medium 538,546 at high frequencies 440 initial value problem 89, 457 inner product 15, 19,47 inner product space 121 input conductance 312 input susceptance 312 instability, of difference scheme 83 numerical 53, 90 integral equation 134, 245 and algebraic equations 360 combined field 349 of communication theory 207 and compact operator 241 and conjugate gradients 259 electric field 334 electric field in time 404 of the first kind 258 Fourier method 253 Hammerstein 203 interior electric 353 interior magnetic 354 inverse source problem 583 and iteration 259 magnetic current 350, 357 magnetic field 338 magnetic field in time 404 mixed electric field 355 mixed magnetic field 355 modified electric field 351 for potential of lamina 261 resolvent of 413 and Sobolev space 221 and variational principle 202 volume 382


for wire in time domain 398 integral equations for homogeneous isotropic dielectric 387 integral operator, compactness of 245,251,254 integral, Riemann 164 Stieltjes 147 integration, numerical 276 intensity law, in stratified medium 450 of geometrical optics 439 interference and diffraction 485 interference fringe 486 interior electric integral equation 353 interior magnetic integral equation 354 internal reflection 523 interpolant 4 bilinear 11 linear 10, 11 piecewise Hermite 5 interpolation 1, 81 bilinear 12 in two dimensions 9 inverse 9 Lagrange 252 linear 12 piecewise Hermite 4 triangular 12 trigonometric 28 inverse, generalized 68, 223, 429 of linear operator 127 inverse interpolation 9 linear 9 inverse matrix 42 inverse operator 124 inverse scattering 570 by circular cylinder 573, 577 by ellipse 573, 577 at high frequencies 573 at low frequencies 571 by parallelepiped 577 and protrusion 576 by sphere 573, 577, 579 by spheroid 573, 577 in time domain 577 inverse source integral equation 583 and eigenvalues 584 inverse square law 440 iris 208 capacitive 209, 211, 217 inductive 215, 217 thick inductive 217 Irons' patch test 358 isoparametric coordinates 263, 267, 268 iteration, convergence of 58, 59 Gauss-Seidel 58, 59, 60 inverse 61, 66 Jacobi 57, 59 Peaceman-Rachford 59 power method 60, 65


Jacobian, for rays 446 in Newton's method 40 Jacobi method 57 for eigenvalues 62 Jordan canonical form 44, 46 kernel 142 semi-simple 416 knot 6 Kreiss-Buchanan theorem 91 Krylov-Weinstein theorem 151 Kuhn-Tucker conditions 193

Lagrange interpolation formula 252 Lagrange multiplier 67, 191 Laplace's equation and Monte Carlo method 92 Laplace's operator and differences 82, 83 Laplace transform 410 and MFIE 411 lateral displacement on reflection 557 leaky ray 553, 559 leaky wave 377, 378, 380 radiation loss 377 leapfrog scheme 396 least squares, method of 19, 30, 223 Legendre polynomial 20, 22, 233, 234, 240, 280 Levenberg-Marquardt method 190 global property 190 linear dependence 120 linear equation, solution of 51 linear functional 124, 170 bounded 128 convexity of 170 linear independence 120 linear k-step method 457 linear operator 124 completely continuous 129 compact 129, 340 inverse of 127 linear space, complex 120 real 120 line, coaxial 78 line search 183, 187 line source, complex 555, 556 Lipschitz condition 33 l~-norm 47

load for maximum power transfer 306 local truncation error 459, 461 in differences 80, 82, 83 logarithmic norm 172 log-periodic antenna 321, 403 three-dimensional 323 log-periodic expansion parameter 321 look direction 620 Lorentz transformation 580

lp-norm 47

LR algorithm 65

magnetic current integral equation 350, 357 EFIE, MFIE and CFIE 351 non-uniqueness 350 magnetically polarized wave 454 magnetic dipole, complex 556 magnetic field integral equation (MFIE) 338 magnetic flux density 72 magnetic frill 291, 327, 377 magnetic intensity 72 magnetic mode and resolvent pole 420 magnetic oscillation 339 magnetic resonance 339 manifold, linear 121 marching in time 402 matched asymptotics 542 matching 306, 307 matrices, as convex semi-cone 218 similarity of 43, 46 matrix 40 amplification 91 anti-symmetric 42 block-diagonal 44 block tridiagonal 102 capacitance 105, 106 consistently ordered 94 continuity of 44 dense 52 determinant of 42 diagonal 43 diagonal similarity 43 eigenvalue of 42 eigenvalues of 60 generalized inverse of 68, 223 Hermitian 44, 45, 46, 71 Hessenberg 65 Hessian 168, 182 inverse of 42 Jacobian 160 Jordan form of 44 left-inverse of 41 lower triangular 55 non-singular 43 normal 91 norm of 47 order of 41 orthogonal 42, 46 plane rotation 61 positive definite 45, 46 positive semi-definite 45, 46 rank of 67 right-inverse of 41 row 41 singular value of 68 sparse 52

spectral radius of 48 square 41 square root of 46 Stieltjes 50 strictly diagonal dominant 49 subordinate norm of 47 symmetric 42, 46 trace of 45 transpose of 41 tridiagonal 51, 59 unit 41 unitary 44, 45, 50 upper triangular 44 Young's property A for 94 matrix polynomial 44 maximum principle 85 Maxwell's equations 72 absorbing boundary condition for 368 mean-value theorem for functions 162 mean-value theorems for operators 162-7 medium, radially stratified 451 stratified 446, 450 meridional ray 564 radiation from 566 mesh 84 node of 84 regular point of 84 method of elimination 621 method of moments 223 MFIE 338, 361 adjoint of 338 and adjoint, relation between solutions 345 compactness of integral operator in 346-8 compactness of transformed 411 complex conjugate of adjoint 338 continuity of 356 EFIE and CFIE compared 354 and EFIE compared 339 and Laplace transform 411 mixed 355 non-uniqueness of 338 and relation to complex conjugate of adjoint 345 Riesz number of 345 and Schenck method 354 semi-simple nature of 419 singularity of 359 solutions of 356 in time domain 404 in time domain, instability of 407, 409 uniqueness for static 348 microstrip 112 boxed 116 and quasi-static approximation 113 shielded 112 slotted 117 mid-point rule 277, 279 minimal set 225


and stability 234 mixed EFIE 355 uniqueness of 356 mixed MFIE 355 mode 77, 374 higher 98 mode impedance 97 modified EFIE 351 coefficients in 353 uniqueness of 353 moments, method of 223 Monte Carlo method 92 moving scatterer 579 Muller's method 9, 36, 39 multiplicative array 624 multistep method 457 characteristic polynomial 460 consistent 460 convergence of 458, 459, 460 error constant in 459 global error 462 local truncation error 459, 461 order of 459 relatively stable 463, 464 stability of 460 zero-stable 460, 462 Nedelec finite element 270, 369 network analysis 200 Newton's method, convergence of 38 error in 180, 181 for one equation 35, 39, 190 for operators 172-81 for two equations 40 Newton-Cotes rule 277, 279 Newton-Kantorovich theorem 179 nodal point 6 node 6 of mesh 84 noise, and beam sweeping 626 in data 607 and method of elimination 621 non-conforming element 358 non-uniqueness, of EFIE 335 of magnetic current integral equation 350 of MFIE 338 norm, and numerical stability 243 Euclidean 47 L2 13, 16 least squares 13 logarithmic 172 maximum 12, 21, 27 of matrix 47 of operator 127 of vector 46 spectral 48, 54, 234 subordinate 47

norm (cont.) uniform 12, 47 norms, compatibility of 47 null-field method 362 numerical integration 276 numerical method for rays 547 numerical methods, for curved antenna 316 for ordinary differential equations 456 for poles 418, 427-31 for surfaces 357 for wire 307 for wire in time domain 400 numerical stability of equations 234 numerical trial function 262 Obreschkoff's formula 24 Ohm's law 73, 201, 202 operator 124 adjoint 124 biharmonic 99 bilinear 167 bounded linear 127 bounded positive 146 central difference 7 continuous 127, 165 derivative of 156 domain of 124 eigenfunction of 129 eigenvalue of 129 identity 128 integral 126 inverse 124 linear 124 logarithmic norm of 172 norm of 127, 128 positive 146, 198, 208, 214 positive-definite 230 range of 124 Riemann integral for 164 self-adjoint 126, 127 symmetric 126 trilinear 169 unbounded 146 zero of 173 optical fibre 377, 559 circular 564 and surface radiation condition 377 optical path length 451 optimal curvature 539 optimization, A-conjugacy 190 and active set strategy 192 BFGS algorithm 186 conjugate gradient 189, 361 conjugate method 188 constrained 191 descent method in 183 DFP algorithm 186

Kuhn-Tucker conditions 193 Levenberg-Marquardt method 190 steepest descents 183 unconstrained 181 vector 218 ordering, consistent 94 natural 94 orthogonal functions 16 orthogonal radiation patterns 623 orthonormal set 17 complete 17 expansion in complete 18 in Hilbert space 123 Pade approximant 23, 25, 27 paraboloid 328 paraxial ray 440 paraxial region 554, 556 Parseval's formula 18 partial differential equation 135 elliptic 269, 275 and Galerkin's method 239 patch, Coons' 357 Irons' test for 358 surface 357 patch size in time domain 406 pattern, radiation 316 Peaceman-Rachford method 59 penalty function 613, 617 pencil beam 549 pencil, homocentric 470 penumbral curve 474 penumbral point 527 penumbral transition 528, 532, 534, 541 permeability 72 permittivity 72 Petrov-Galerkin method 223 phase centre of array 623 phase path 559 physical optics 470, 551 pivot 52 pivoting, complete 53 partial 53 plane of incidence 453 point, feasible 192 point matching 152, 246, 308 and interpolation 246 point spectrum 146 Poisson's equation 84, 233 in polar coordinates 108 and variational principle 205 Poisson-Boltzmann equation 207 Poisson summation formula 524 Polak-Ribiere algorithm 189 polarization, of wave 455 on ray 562 ray change of 442

polynomial, associated Legendre 351 Bernoulli 280 Chebyshev 20 Hermite 284 Laguerre 284 Legendre 20, 233, 234, 240, 280 matrix 44 port 302 efficiency of 307 positive-definite operator 230, 237 and Galerkin method 231 positive operator 146, 198, 208, 214 adjoint of 146 square root of 146, 149 power method 60, 65 power spectral density 587 power transfer in antenna 306 Poynting vector 296 predictor 464 preprocessing 105, 107, 108 principal value 284, 360 principal zero 460 product, inner 15, 19, 47 programming, quadratic 182 Prony's method 428, 432 propagation over earth 481 pyramid function 2

QR algorithm 65, 66 quadratic programming 182 quadrature 276 backward rectangle rule 277, 279, 285 Chebyshev 280 with discontinuity 283 errors in 279 five-eight rule 277, 279, 285 forward rectangle rule 277, 279, 285 Gaussian 279, 284, 285 Gaussian, error in 280 with Hermite polynomials 284 with infinity 283 with Laguerre polynomials 284 mid-point rule 277, 279, 285 Newton-Cotes rule 277, 279, 285 on infinite interval 284 order of 276 for principal value 284 Simpson's rule 277, 279, 285 for singular integral 359 trapezoidal rule 277, 279, 285 weights of 276 quadrature weights for singular integral 360 quality factor 611, 612, 616 quasi-Newton algorithm 186, 187, 189

radar cross-section 313, 323, 324, 327, 365, 403 radially stratified medium 451 radial stratification, diffraction by 538 radiation, coherent 587 incoherent 570, 587 partially coherent 587 radiation conditions 74 radiation condition, surface 372 radiation conductance 296, 379, 380 radiation efficiency 313, 316 radiation pattern 316 and Chebyshev polynomial 616 directivity of 616 of linear array 608, 616 with narrowest mainbeam 616, 617 normalized 612 synthesis of 607 radiation patterns, orthogonal 623, 625 radiation resistance 296, 302, 313 radius, spectral 48 random signal 586 random walk 92 range 124 rank 67 rational function as continued fraction 26 ray 436 axial 439 bending in stratified medium 447 bending of 438 change of polarization on 442 complex 557 differential equations for 441, 445 doubly diffracted 489 edge 489 intensity in stratified medium 450 leaky 553, 559 meridional 564 paraxial 440 polarization in stratified medium 449 polarization on 562 reflection due to refractive index 447 Serret-Frenet formulae on 442 skew 564 in stratified medium 447 and torsion in plane 443 torsion of 442 tunnelling 561, 566 ray equations 441 Rayleigh-Gans approximation 591, 602 Rayleigh scattering 382 rays, and caustics 546 and divergence factor 481 Jacobian for 446 tracing of 547 tube of 438 ray velocity 436 reaction matching 223


reciprocity, in transmitting and receiving 305 in wedge field 519 reciprocity theorem 212, 552 in holography 599 reconstruction field 597 rectification processing 624 reference field 597 reflected beam, lateral displacement 557 reflection, and Fermat's principle 452 at end of dielectric rod 381 of Gaussian beam 556 internal 523 by stratification 475 by stratification, phase change in 479 of travelling wave 300 reflection coefficient, of end of waveguide 515, 516 on finite wire 300 reflection region 484 reflector antenna 549 overall efficiency 553 and reciprocity theorem 552 refraction and Fermat's principle 453 refractive index 434 effect on ray 448 regular point 84 Remes, first algorithm of 14, 15 second algorithm of 15, 25 residual 56 of approximation 239, 240, 241 resistance, radiation 296, 302, 313 resolvent 413 dyadic 420 growth of 422 location of poles of 418 as meromorphic dyadic 422 numerical determination of poles 427-31 poles of 414 relation of poles to resonant modes 418 resonance, and EFIE 337 magnetic 339 resonant modes, coupling between 421 resonator, cavity 143, 272 response, bistatic 410 monostatic 410 retarded time 397 Riemann integral for operator 164 Riesz-Fischer theorem 18 Riesz number 341 criterion for unit 344 of MFIE 345 of semi-simple kernel 417 value of 344 ring current 261 rod, dielectric 373 Rodrigues' formula 20 Rolle's theorem 250

root of equation 31 round-off error and stability 236 round-off in recurrence 21 Runge phenomenon 8 Runge-Kutta method 465 absolutely stable 467 consistent 465 convergent 465 global error 466 local truncation error 466 order of 465 stability of 467 stages of 465 Rytov approximation 584

saddle-shaped functional 197 scaling 54 scatterer, impulse response of 423 moving 579 scattering coefficient of cylinder 538 scattering cross-section 361 scattering, inverse 570 Rayleigh 382 Schenck method 354 Schmidt process 18, 20 Schwarz inequality 15, 121 secant method 9, 33, 39 SEM 425 validity of 426 semi-cone, convex 218 semi-simple kernel 416 condition for 416 Riesz number of 417 sequence, Sturm 63 Serret-Frenet formulae 442 set, active 192 biorthonormal 225, 241 convex 162 minimal 225 orthonormal 17 strong 228 strongly minimal 227, 308 shadow, behind curved object 528 deep 528, 539 and optimal curvature 539 shadow zone 484 side lobe suppression 624 signal, random 586 signal processing 618 similarity transformation 43 Simpson's rule 277, 279 singular integral, numerical consideration of 359 singularity expansion method (SEM) 425 singular value 68, 71, 429 singular value decomposition 430

INDEX singular vector 68 skew ray 564 skip distance 480 sky wave 480 slot line 117 Snell's laws 453, 523 Sobolev space 122, 123, 219, 269 and collocation method 221 and Galerkin method 221 and integral equation 221 solution of equations 31 SOR 58, 59, 60, 93, 107, 108, 243 modified 95 symmetric 95 source, apparent location of 569 complex 553 determination of 581 location of 568 non-radiating 569 source strength, apparent 593, 596 space, Banach 122, 165, 166, 219, 245 basis for 121 closed 123 closure of 242 complete 122 dense 124 finitely dimensional 121 Hilbert 122, 219, 340 imbedding of 231 infinitely dimensional 121 inner product 121, 158 linear 120 normed linear 122, 340 pre-Hilbert 121, 122 Sobolev 122,219, 269 spanned by elements 120 spanned space 120 speckle 570 spectral density, cross-power 587 power 587 spectral radius 48 spectral width, effective 587 spectrum 146 continuous 146 invisible part 599 point 146 visible part 599 sphere, impulse response 407, 432 spherical harmonic 351 spillover 552 splash plate 517, 522 spline 6 B- 8, 316 bicubic B- 358 cardinal 8, 9 cubic 6, 123 knot 6 node 6


quadratic 6 spot size 554, 558 spurious zero 460 square root, of bounded positive operator 146, 149 of matrix 46 stability, associated norm 243 and minimal sets 234 numerical 90 numerical and von Neumann criterion 90, 91,92 and round-off error 236 of solutions 234 unconditional 91 stability criterion 91 staggered half-planes, diffraction by 503 standard cube 265, 272 standard system 234 standard tetrahedron 265, 270 standard triangle 262 statistical ensemble 586 step function, Heaviside 426 steplength 457 step length in time domain 406 stepnumber 457 Stieltjes integral 147 Stirling's formula 115 stratification, detection of 584, 585 reflection by 475 stratified medium 446, 450 and conservation of energy 477 diffraction in 538 stretched coordinates 542 strongly minimal set and convergence 240 strongly minimal system 227,308 strong set 228 and imbedding 232 Struve function 607 Sturm-Liouville problem 203 Sturm sequence 63 subspace, linear 121 span of 121 successive over-relaxation (SOR) 58 superdirectivity 608, 617 supergain 608 surface area 390 surface charge 331 surface curl 393 surface current 331 surface divergence 330, 391 surface divergence theorem 392 surface, geometry of 389 numerical method for 357 second fundamental tensor 390 surface gradient 391 surface integral, edge behaviour 335 limit of 333-4 surface patch 357


surface radiation condition 372 and dielectric 384 and optical fibre 377 in time domain 395, 396 surface wave, on dielectric discontinuity 560, 561 on dielectric rod 376, 378, 380 SVD-Prony method for poles 430, 432 synthesis, by series 609 construction errors in 611 of directional pattern 616 with optimal parameter in regularization 614 with penalty function 613, 617 in presence of noise 614 with quality factor control 616 of radiation pattern 607 with regularization 612 synthesis error 611, 613 target, biological 591 Taylor's theorem for operators 169 TEM mode 77, 78, 93, 96, 110 TE mode 77, 78, 110, 154, 509 on dielectric rod 374 Temple-Kato theorem 150 tensor, Green's 143, 351 tetrahedron, standard 265, 270 time, coherence 588 marching in 402 retarded 397 TM mode 77, 97, 108, 110, 516 on dielectric rod 374 torsion, and ray in plane 443 of ray 442 trace 45, 46 in Sobolev space 221 transformation, similarity 43 transmission line, log-periodic 321 microstrip 112 series impedance 321 shunt admittance 321 transport equation 437 transpose of matrix 41 transverse electric (TE) mode 77 transverse electromagnetic (TEM) mode 77 transverse magnetic (TM) mode 77 trapezoidal rule 277, 279 travelling wave 298, 301, 302 reflection of 300 triangle function 2 triangle, interpolation on 263 standard 262 triangular decomposition 55, 57 tunnelling ray 561, 566 unbounded operator 146 undetermined weights, method of 277

unimoment method 384 and uniqueness 384 uniqueness of CFIE 349 for homogeneous isotropic dielectric 388 with impedance boundary condition 364 of interior electric integral equation 353 of mixed EFIE 356 of modified EFIE 353 in wire grid model 357 uniqueness theorem, exterior 364 V-antenna 403 variational principle, and finite differences 273 for operator equation 212 with self-adjoint operator 213 variational principles 194 complementary 198, 200, 202-8, 214 dual 198 vector, column 41 norm of 46 singular 68 vectors, comparison of 218 von Neumann criterion 90 waist, beam 554, 558 walk, random 92 wave, creeping 405 electrically polarized 454 leaky 377 magnetically polarized 454 polarization of 455 scattered by moving target 579 scattered in time domain 579 sky 480 wavefront 436 waveguide 74 circular 98, 111, 516 corrugated 551 with dielectric 109 elliptical 98, 155 L-shaped 108, 155 lunar 98 parallel-plate 509, 516 rectangular 98, 111, 209, 216, 217 ridge 98 with splash plate 517 with splash plate and wedge 522 square 98 triangular 78 and variational principles 210 wavelength, cut-off 77, 98 wavenumber, cut-off 77 wave velocity 436 wedge, dielectric 522 double 365

impedance 517, 522 perfectly conducting 517 reciprocity for 519 Weierstrass theorem 13 weighted residuals, method of 223 weight element 225 weight function 19 wire, admittance of 295 back scatter from 302 with coaxial feed 291 curved 316 of finite length 299 and Gaussian pulse 402, 403 of infinite length 290 integral equation in time 398 numerical methods for 307 numerical methods with time 400 propagation on 289 resonant lengths of 300

of semi-infinite length 297 thick 296 travelling wave on 298, 301, 302 wire antenna 286 wire grid model 324, 409 uniqueness with 357

Young's property A 94

zero element 120 zero, of equation 31 of operator 173 principal 460 spurious 460 zero-stability 460 zigzag 322, 323
