HARMONICI ANALYSIS IL
t
EgflMg MTWO MOCO)Ff
Harmonic Analysis:
A Gentle Introduction Carl L. DeVito, PhD University of Arizona
JONES AND BARTLETT PUBLISHERS BOSTON
Sudbury, Massachusetts SINGAPORE LONDON
TORONTO
World Headquarters Jones and Bartlett
Jones and Bartlett Publishers Canada 6339 Ormindale Way Mississauga, Ontario L5V 1J2 CANADA
Publishers
40 Tall Pine Drive Sudbury, MA 01776 978-443-5000
[email protected] www.jbpub.com
Jones and Bartlett Publishers International
Barb House, Barb Mews London W6 7PA UK
Jones and Bartlett's books and products are available through most bookstores and online booksellers. To contact Jones and Bartlett Publishers directly, call 800.832-0034, fax 978443-8000, or visit our website www.jbpub.com. Substantial discounts on bulk quantities of Jones and Bartlett's publications are available to corporations, professional associations, and other qualified organizations. For details
and specific discount information, contact the special sales department at Jones and Bartlett via the above contact information or send an email to
[email protected].
Copyright 0 2007 by Jones and Bartlett Publishers, Inc. ISBN-13: 978-0-7637-3893-8 ISBN-10: 0-7637-3893-X
All rights reserved. No part of the material protected by this copyright may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.
Library or Congress Cataloging-in-Publication Data DeVito, Carl L. Harmonic analysis : a gentle introduction / Carl L. DeVito. p. cm. Includes bibliographical references and index. ISBN-13: 978-0-7637-3893-8 ISBN-10: 0-7637-3893-X 1. Harmonic analysis. I. Title. QA403.D457 2007
515'.2433-dc22 2006010535
6048
Production Credits Acquisitions Editor: Tim Anderson Production Director: Amy Rose Editorial Assistant: Laura Pagluica Production Editor. Tracey Chapman Production Assistant: Jamie Chase Manufacturing Buyer: Therese Connell Marketing Manager. Andrea DeFronzo Composition: Northeast Compositors Cover Design: Timothy Dziewit Cover Image: 0 Bob Wainsworth/ShutterStock, Inc. Printing and Binding: Malloy, Inc. Cover Printing: John Pow Co.
Printed in the United States of America
10 09 08 07 06
10 9 8 7 6 5 4 3 2
1
Contents Preface
v
Chapter O
Preliminaries
1
l
1. Set Theory
2. Relations and Functions
3. The Real NumberSystem
4 7
4. The Complex Number System
5. Analysis
Chapter 1
Classical Harmonic Analysis 23 1. The Dirichlct Problem for a Disk 24 2. Continuous Functions on the Unit Circle 25 3. The Method of Fourier 29 4. Uniform Convergence 32 5. The Formulas of Euler 39 6. Ces3ro Convergence 43 7. Feje is Theorem 47 8. At Last the Solution
Chapter 2
II
15
52
Extensions of the Classical Theory 57 1. Functions on (-tt. tt) 57 2. Functions on Other Intervals 61 3. Functions with Special Properties 65
4. Pointwise Convergence of the Fourier Series 70
iv
Contents
Chapter 3
Fourier Series in Hilbert Space 79 I . Normed Vector Spaces 80 2. Convergence in Nonmed Spaces 84 3. Inner Product Spaces 90 4. Infinite Orthonormal Sets, Hilbert Space
5. The Completion 6. Wavelets
Chapter 4
103
107
The Fourier Transform 113 1. The Fourier Transform on Z 113 2. Invertible Elements in CI (Z) 119 3. The Fourier Transform on R 123 4. Naive Group Theory 128 5. Not So Naive Group Theory 131 6. Finite Fourier Transform 135 7. An Application 143 8. Some Algebraic Matters 150 9. Prime Numbers
154
10. Euler's Phi Function
Chapter S
159
Abstract Algebra 165 1. Groups 165 2. Morphisms 169 3. Rings 172 4. Fields 175
Appendix A Linear Algebra Appendix B The Completion
179
181
Appendix C Solutions to the Starred Problems Index 213
187
96
Preface Many branches of mathematics come together in harmonic analysis, each adding
richness to the subject and each giving insights into this fascinating field. My purpose in writing this book is to give the reader an appreciation for all this interaction without overwhelming him or her with technical details. I also hope to inspire the reader to learn more about harmonic analysis and about mathematics in general. This book can be read by any student who has had a course in advanced calculus and has some acquaintance with linear algebra. My students were mostly mathematics majors in their junior or senior year. Fourier transforms, the fast Fourier transform, even abstract algebra, have all made their way into modern engineering and into many areas of modern science. Some of my students were majoring in these subjects. For them, I usually asked that they knew some linear algebra and had previously taken an "Advanced Analysis for Engineers" type of course. Typically these students had trouble with the proofs. but, they told me, they had little interest in the proofs and were more concerned with understanding the definitions and doing the calculations. Because of the varied backgrounds of my students, I have tried to make the book as selfcontained as possible. The text begins with a very elementary preliminary chapter (Chapter Zero). This contains the kind of material that every student has seen but many have forgotten. It is there mainly for reference so that a student who has forgotten some fact about complex numbers, or some theorem from calculus, can refresh his or her memory. The only thing that might be new to some readers is the derivation of the Cauchy-Ricmann equations. These are used to convince the student that Laplace's equation has lots of particular solutions. We refer to this fact in Chapter One where we define harmonic functions. It also tells the student why solving partial differential equations requires a new technique. Chapter One contains a discussion of the Dirichlet problem for a circle with continuous boundary data. Here harmonic functions and periodic functions are treated. We also touch on ordinary differential equations and the method of separation of variables for partial differential equations. There is a section
v
vi
Preface
dealing with uniform convergence. This can, of course, be skipped or assigned as reading if the students are familiar with this topic. We use this material to derive the Euler formulas. The theorem of Fejer, that the Ceshro means of the Fourier series of a continuous function on the circle converge to that function uniformly over the circle, is proved in Section 7. We use that result to solve the Dirichlet problem and to prove the Weierstrass approximation theorem. The purpose of Chapter One is to give the students a thorough foundation in basic analysis as it relates to trigonometric series. Once this has been mastered, the student is ready to learn something about the applications of these series and about the more abstract approaches to their study.
In Chapter Two we discuss the way a trigonometric series can be used to represent functions of arbitrary period, functions defined only on an interval, and functions with discontinuities. Lots of examples of functions and their Fourier series are given. Much of this material can be covered quickly, and it is not used in any subsequent section. There is, however, a theorem about pointwise convergence that will be new to most students. In Chapter Three we look at Fourier series in Hilbert space. Here a little linear algebra is needed, and a brief discussion of the material required is given in Appendix A. We discuss inner product spaces and normed spaces and their interrelation. Banach and Hilbert spaces are defined and their basic properties investigated. A proof that every normed space has a completion that is a Banach space is given in Appendix B. There we also prove that the completion of an inner product space is a Hilbert space. These proofs are lengthy and a bit tedious, which is why we put them in an appendix. Instructors can cover them or simply assign them as reading. In the remainder of Chapter Three, we define L2(T), T being the unit circle, to be the completion of the continuous functions on T, C(T), with the integral inner product. We then prove that {eia1/ 27r} is an orthonormal basis for this Hilbert space. Once again the theorem of Fejer is used. We end Chapter Three with a brief discussion of wavelets. Here we define the Hilbert space L2 (R) and show how certain functions give rise to orthonormal bases for this space. Chapter Four deals with the Fourier transform. We start with the vector space Pt (Z); sequences {a(n)} such that E Ia(n) < oo. We define the convolution operation and the Fourier transform for elements of this space. The advantage in doing that is that all the proofs can be given, and in this case they are quite transparent. Next we look at elements of fl(Z) that are invertible with respect to convolution. To show that there is a connection between these ideas and Fourier series, we discuss Wiener's theorem: If f E C(T) is never zero and the Fourier series of f is absolutely convergent, then the Fourier series of 1/f is absolutely convergent. This connection motivates a further investigation into the properties of P(Z) under convolution, and we end Section 2 with a sketch of the
Preface
vii
proof of Wiener's result. Although the proof is, necessarily, incomplete. I believe I have done enough to give the reader a pretty good idea of what is involved.
In Section 3 we discuss the Fourier transform on R. Because we are not assuming a knowledge of the Lebesgue integral we pursue, as far as possible, the
analogy with t' (Z). Many examples of functions and their Fourier transforms are presented, and we show, by means of an example, how the Fourier transform can be used to solve a differential equation. Reviewing the first three sections of this chapter we observe that the Fourier transform involves maps from Z into T and continuous maps from R into T.
that "preserve the algebra" [i.e., maps c such that c(rn + n) = c(m) c(n)]. This, of course, takes us into group theory and the notion of a homomorphism. We do not use this terminology, however, until Chapter Five. Here we define,
very simply, subsets of C that are groups under addition or under multiplication. The discussion soon leads us to Z (the integers modulo n) and its additive structure. By investigating maps from Z,, into T that "preserve the algebra" we are led to the group of nth roots of unity and the finite Fourier transform. This transform is of central importance in certain branches of engineering (e.g., digital signal processing). We discuss its properties and show how it arises, rather mysteriously, in Lagrange's treatment of equations. This is done in Section 7. The calculations are a little lengthy, but this material is not used in later sections and so it can he skipped or assigned as reading. Sections 8-10 contain some number theory. In Section 8, we show when a power of a primitive nth root of unity is again primitive, we discuss the multiplicative structure of Z,, and characterize those elements that have inverses, and we define the Euler phi function. Some instructors may want to skip sections 9 and 10 and go directly to Chapter Five. Finding a formula for the phi function requires us to discuss primes, and to keep things interesting, we mention perfect numbers and their relation to Mersenne primes. At this point we have done enough modular arithmetic to prove the interesting half of the Lucas-Lehmer test, which tells us when 2p - 1 is a prime. We end Chapter Four with a brief discussion of how the phi function enters into the creation of cyphers. The final chapter of the hook contains the basic definitions from abstract algebra. Here we define groups and point out the many examples of groups that have occurred at various places in the text. We define homomorphisms and isomorphisms and explain that the maps from Z into T, discussed in connection with the Fourier transform, are examples of these things. The Fourier transform itself is an isomorphism between rings, and so we define rings and fields and point out these facts. A short treatment of the properties of finite fields (these do arise in the applications of harmonic analysis) is also given. The detailed discussion of the Dirichlet problem given in Chapter One is intended as motivation for the study of trigonometric series. Instructors who wish to by-pass this material and begin their course with the series themselves
viii
Preface
may easily do so. Simply cover Sections 4, 5, 6, and 7 of Chapter One and omit Section 8. The remainder of the text is then accessible. Those instructors who want to get to the Fourier Transform as rapidly as possible may do so by covering Sections 4-7 of Chapter One, then Sections 1
and 2 of Chapter Three, and the material in Section 5 up to Theorem 4 (in particular, Definition 1 and Theorem 3). 1 have tried to get the reader actively involved in the development of the mathematics. To accomplish this, I put some results, results referred to later in the text, in the exercises. Any such exercise is marked with a star and should be worked, or at least read, by anyone using the book. Solutions to the starred problems are given in Appendix C. I received helpful suggestions about various parts of the book from a great many people. I would now like to thank Nirmal Bose (Pennsylvania State University), James Kang (California State Polytechnic University), David Walnut (George Mason University), Selin Aviyente (Michigan State University), Kasso Okoudjou (Cornell University), Ilya Pollak (Purdue University), and John Brillhart and William Farris (University of Arlizona). It is my pleasure to also thank my Belgian colleague Maurice Hasson for reading the entire manuscript and making many valuable suggestions. I would also like to thank Victoria Milne for typing the manuscript and having infinite patience during its many revisions.
Chapter Zero
Preliminaries
All the mathematics needed to read this book is outlined in this chapter. The advantage in doing this is that the material is then easy for us to reference and, if necessary, easy for the reader to review. The notations and conventions used in the book are also explained here, but these are, for the most part, quite standard.
1.
Set Theory
In ordinary English the word we use for a collection of objects depends on what those objects are. So a collection of cows is a herd, while a collection of sheep is a flock. A collection of owls is a parliament, while a collection of ravens is a murder. We even have a name for a collection of unicorns. It's called a blessing. In the terminology of mathematics the word "set" is used for any collection of objects, whatever those objects may be. These collections, however. must be well defined; i.e., it must be clear just which objects are in a given collection and which objects are not. The collection of all great novels, for example. is not
a set because there is no general agreement, even among experts, as to what constitutes a "great" novel. Moreover, are we to include in our collection novels yet to be written or only those in print today? One faces similar difficulties with any collection of physical entities. The only "real" sets are collections of mathematical objects. Still, it is often illuminating to consider collections of physical entities and to call them sets provided one understands that this practice is really just a pedagogical device whose value is more psychological than mathematical. The objects comprising a set are called its "members" or its "elements." Given a set X and an object x we write x E X (read "x belongs to X") when 1
2
Chapter Zero
Preliminaries
x is a member of the set X, and we write x 0 X when this is not the case. A set is completely determined by its elements. More formally:
Definition 1. Two sets X and Y are said to be equal, written X = Y, if every element of X is an element of Y and every element of Y is an element of X. Sets that contain only a few objects can be specified by simply listing their
members. To emphasize the fact that we want to consider the set, a new "entity" formed by the act of collecting these objects together, we list them between a pair of curly brackets. So {a, b, c} is the set consisting of the three letters it. b, and c from the English alphabet. This notation is impractical when we work with sets that contain a great many members. The next definition will help us deal with that situation.
Definition 2. Let X, Y be two sets. We shall say that X is a subset of Y, and write X C Y, if every element of X is also an element of Y.
When X C Y, we shall say that X is contained in or included in Y, and we shall say that Y contains, or includes, X. It is clear that X = Y if, and only if. we have both X C Y and Y C X. This observation is useful because in many problems the easiest way to show that two sets are equal is to show that each contains the other. (See Exercises 2, problem 1b.) It is often convenient to define a set by specifying that its members are all elements of a known set, X say, and that they are distinguished from the other elements of X by the fact that they satisfy a certain condition. For example, if X is the set of all atoms, we may want to consider the subset of X defined as follows: {x E X ix is a hydrogen atom}. The vertical bar is read "such that" and the total expression reads "the set of all x in X such that x is a hydrogen atom." When this notation is used, it is assumed that the condition specified is clear enough to define a unique subcollection of X.
Definition 3. Given two sets X and Y we define two new sets as follows: (a) X U Y(read "X union Y") is the set consisting of those objects that are members of X or of Y, and (b) X fl Y (read "X intersect Y") is the set consisting of those objects that are members of X and Y.
It is clear that X UY contains both X and Y and that X flY is contained in both X and Y. Moreover, because we use "or" in the inclusive sense, we always have X fl Y C X U Y. The set with no members is called the empty set and is denoted by 0. From Definition 2 we see that 0 is a subset of any set X. This is consistent with Definition 3 because 0= 0 fl X C X.
Definition 4. Two sets X and Y are said to be disjoint if X fl Y = 0. It is a curious fact that the empty set is contained in, yet disjoint from, every set.
1. Set Theory
3
A set that contains a single element, say x, is called a singleton, or the singleton defined by z. We denote this set by {x}. If y is another object different from x, then the set {x, y} is called the unordered pair defined by x and y. This terminology reflects the fact that {x, y} = {y, x}. We shall need to consider the ordered pair defined by x and y.
Definition 5. Given two objects x and y we define their ordered pair, (x, y), to be the set ({x), {x, y}}. Given two sets X and Y. the set X x Y = ((x, y) I x E
X and y E Y} is called the Cartesian product of X and Y. We leave it to the reader to show that (x, y) 34 (y, z) unless we happen to have x = y (Exercises 1, problem 2). The Cartesian product of a set X with itself, X x X, is sometimes denoted by X2.
Exercises 1 A number of useful facts, facts that are referred to later in the text, are scattered among the exercises. In some of the exercises we introduce terminology that
is also used later. Any problem that is referred to later in the text is marked with an asterisk.
1. Show that 0 is not equal to {e} and that neither of these sets is equal to {0, {0}}. *2. Given two objects x and y we have defined (x, y) to be the set { {x}, {x, y}).
(a) Show that (x, y) = (y, x) if. and only if, x = y. (b) Show that (x, y) = (u, v) if, and only if, x = u and y = vv. This exercise justifies our calling (x, y) the ordered pair defined by x and Y.
*3. Let X,Y be two sets. We define X -Y to be {x E XIx V Y}. Show that
X-Y=X-(XnY).
*4. Let E be a fixed set and let X be any subset of E. The set E - X is called the complement of X (in E) and is denoted CE(X ). Let X. Y denote arbitrary subsets of E. (a) Prove DeMorgan's theorem; i.e., show that
i. CE(X U Y) = CE(X) n CE(Y); CE(X nY) = CE(X) UCE(Y).
ii.
(b) Show that X-Y=Xn(CE(XnY)]=XnCE(Y). (c) Show that the following are equivalent:
i. XUY=E ii. CE(X) C Y iii. CE(Y) C X
4
Chapter Zero
Preliminaries
(d) Show that the following are equivalent:
i. XC_Y ii. CE(X) CE(Y) iii. X U Y = Y
iv. XnY=X 2.
Relations and Functions
We want to introduce some concepts, and the terminology associated with them, that will be useful throughout the book. To enhance the applicability of these ideas, we present them in a very general abstract setting. Some concrete examples, however, are given here and many more are given in subsequent sections. Throughout this section, X and Y denote nonempty sets and X x Y denotes
their Cartesian product (Section 1, Definition 5).
Definition 1. Any nonempty subset R of X x Y is called a relation from X to Y. We shall say that x E X and y E Y are R related, and we shall write xRy, if (x, y) E R. The two sets D(R) C X and R(R) C Y, called, respectively, the domain and the range of R, are defined as follows: V(R) = {x E XI there is
ayEYwith(x,y)ER}, R(R)={yEYIthere is an xEXwith (x,y)ER}. We can define a relation from the set M of all men to the set W of all women by taking R = ((m, w) E M x W1 m and w are married (to each other)}. Then D(R) is the set of all married men and R(R) is the set of all married women.
A relation from a set X to itself (i.e., a subset of X x X) is called a relation
on X. A relation R on X is said to be (a) reflexive, if xRx for every x E X; (b) symmetric, if x, y in X and xRy implies yRx; (c) transitive, if for x, y, z in X, we have xRz whenever we have both xRy and yRz. A relation that has all three of these properties (i.e., a relation that is reflexive, symmetric, and transitive) is called an equivalence relation on X. Referring to the set W we can take w - w' to mean that the two women w and w' are sisters. (Let's agree that two, not necessarily different, women are sisters if they have the same parents.) It is easy to see that - is an equivalence relation on W.
Definition 2. A relation R from a set X to a set Y is called a function from X to Y or a mapping from X into Y, if (a) the domain of R is X and (b) whenever (x, y) and (x, y') are both in R, we must have y = y'. We can rephrase (b) as xRy and xRy' together imply y = y'. A relation R that has property (b) but not (a) is sometimes called a partial function from X to Y. In this case, R is a function from X' to Y where X' = D(R).
2. Relations and Functions
5
In a monogamous, or a polyandrous, society, the relation from M to IV defined above is a partial function. In a polygamous society, it is not. When f is a partial function from X to Y, it is customary to write y = f (x) instead of x f y or (x, y) E f . We shall also use the symbol f : X -. Y to abbreviate the phrase "f is a function from X into Y." Let us emphasize that when we write f : X -+ Y the domain off is all of X. The set Y, in this case, is often called the codomain of f and the range of f is, of course, a subset of Y. Y we can, for each y E Y, define a subset pre f(y) of X as Given f : X follows: pre f(y) = {x E X I f (x) = y}. We call this set the preimage of y. Note
that because f is a function, the sets pref(y) and pref(y') are disjoint unless y = y'. Moreover, pref(y) 54 0 if, and only if, y is in the range of f. There are two important special cases here. (i) When f (x) = f (x') implies x = x'. In this case, every nonempty pre (y) is a singleton. (ii) When the range of f is all of Y. In this case, no pre (y) is empty.
Definition 3. A function f from X to Y is said to be one-to-one if f (x) _ f (x') implies x = x'. We shall say that f is onto or that f maps X onto Y if the range of f is Y. Observe that when f : X -+ Y is both one-to-one and onto we can define a function g : Y - X by setting g(y) = pre f(y) for every y E Y. A set whose elements are themselves sets is often called a family of sets. Given a family of sets A we can always "name" the members of A as follows:
Choose a set I and a one-to-one, onto function p : I -+ A. We call V an indexing of A and we call I the index set. It is customary to denote the set W(A) E A by AA and to write A = {AAA A E I}. We now define
UA= UAA={xIxEAAforsomeAE1} AE1
and
nA=nAA={x(xEAAfor every AEI} AEI
A family of sets A = {AAI ,\ E 1} is said to be pairwise disjoint if AAflA,, _ 0 whenever A, ,u are in I and A 54 p.
Given a function f from X to Y, the set R(f) indexes the family of nonempty preimages of f; i.e., {pref(y)ly E R(f)}. This is a family of nonempty pairwise disjoint sets whose union is X.
Axiom of Choice: When f : X -+ Y is onto it seems evident that there "should be" a function g : Y -. X that is one-to-one; for each of the sets pre f (y) is nonempty so we may set g(y) equal to any member of this set. This seems to
6
Chapter Zero
Preliminaries
give us a function, and because these preimages are pairwise disjoint, g must be one-to-one. The trouble with this is that we have not specified which element of pre f(y) is to be g(y); hence, we have not really defined anything here. There
is, in fact, no way to specify g(y) in the general case. To define g, we must appeal to a famous axiom: Given any nonempty family of nonempty sets, there is a function that assigns to each set in the family of an element of that set. This is the axiom of choice. It has many applications and many equivalent forms and we discuss some of them later (see Section 2 of Chapter 4).
Exercises 2
*1. Let = be an equivalence relation on a set X. For each x E X let [x] _ {y E X I x
y}. We call [x] the equivalence class defined by x.
(a) Show that for each x E X, (x) 00. (b) Show that for x, y in X, [x] fl [y] 00 if, and only if, [x] = [y]. Thus the equivalence classes of X are either identical or they are disjoint.
*2. Let f : X -+ Y and let g be a partial function from Y to a set Z.
(a) If the domain of g contains the range of f, show that setting (g o f)(x) = g[f (x)] for each x E X defines a function, g o f , from X to Z. We call this the composition of f and g. (b) Show that f is both one-to-one and onto if, and only if, there is a
function g : Y -+ X such that (f o g)(y) = y for all y E Y and (go f)(x) = x for each x E X. When such a g exists, we call it the inverse off and denote it by f -1. *3. For any set X, let P(X) denote the family of all subsets of X.
(a) Define a relation = on P(X) as follows: For Y, Z E P(X), Y - Z if, and only if, there is a function f from Z to Y that is both one-to-one and onto. Show that = is an equivalence relation on P(X). (b) Define a relation < on P(X) as follows: For Y, Z in P(X), Y < Z if, and only if, Y C Z. i. Show that the relation < is reflexive and transitive.
ii. Show that < is antisymmetric, meaning Y < Z and Z < Y together imply Y = Z.
A relation R on a set X that is reflexive, antisymmetric, and transitive is called a partial order relation on X or a partial ordering of X. The term "partial" refers to the fact that there may be elements of X that are incomparable, i.e., elements x, y
of X such that neither xRy nor yRx is true. A partial order relation on a set X is called a total order relation or a total ordering of X if any two elements of X are comparable.
3. The Real Number System
7
4. Suppose that for some set X, we have a function g : X - P(X) that is onto. Then for each x E X, g(x) E P(X) (i.e., g(x) is a subset of X) and so it makes sense to ask whether or not x is an element of g(x). Now let Xo = {x E X I x it g(x)}. This is a subset of X, hence an element of P(X). There is an xo E X such that g(xo) = X0 because g is onto.
(a) Show that xo E g(xo) leads to a contradiction. (b) Show that xo ¢ g(xo) leads to a contradiction. We conclude that for any set X there is no function from X to P(X) that is onto. (c) The famous Russell paradox arises when we consider the "set" of all sets that are not members of themselves (e.g., the set of all books is not a book). If there were such a set, let's call it S, then we have both S E S and S S. Paradoxes like this led to the creation of axiomatic set theory.
*5. Let {A,\ I A E I} be a family of sets and suppose that there is a set X such that Aa C X for every A E 1. Prove the strong form of DeMorgan's theorems, i.e., show that (a) Cx [UAEI A,\]= nAEI Cx (A,\) and (b) Cx [nAEI AA] =
3.
Cx(AA)
The Real Number System
The real numbers were introduced into mathematics to provide a foundation for analysis (i.e., calculus and its off-shoots). We assume the reader has some familiarity with this set, but a brief review of its basic properties is well worthwhile.
To begin with, any two real numbers can be added and multiplied and the result is a real number. Moreover, this set, which we shall denote by R, contains two special members 0 and 1 that have the properties:
(1) x+0=0+x=x forallxER (2)
These numbers, which are clearly unique (see Exercises 3, problem 1), are called, respectively, the additive and multiplicative identities of R. Each element x E R has a unique additive inverse, that is, there is a unique element (-x) E R such that
(3) x + (-x) = (-x) + x = 0
Chapter Zero
8
Preliminaries
Thus we may, for any two real numbers x and y, define x - y to mean x + (-y). Any nonzero real number x has a unique multiplicative inverse. This is the number, we shall denote it by x-1, for which x-x-1=x_1'x=1
(4)
For any x, y in J with y $ 0, we define x/y to mean We assume that the reader is familiar with the various properties of these arithmetic operations. There is a subset P of IR that has the following properties:
(5) For any x E R exactly one of the statements x E P, (-x) E P, x = 0, is true.
(6) If x, y are in P so also are x + y and x y.
It is easy to see that the number 1 is an element of the set P, whereas the number 0 is not. The members of P are called the positive real numbers. Given x, y, in lR we say that x is less than y or y is greater than x, and write x < y, if y - x belongs to P. This gives us a relation on the set IR that we shall call the order relation on R. It is clear from (5) that for any two distinct real numbers x and y we have x < y or y < x (but not both), and from (6) we see
zEP. Because 1 E P we see from (6) that 2 = 1+1 is in P and so is 3 = 2 + 1, etc. The subset of R generated in this way is denoted by N and is called the set of natural numbers. It has many fascinating properties but the one that we shall need in the sequel is this: (7) Any nonempty set of natural numbers contains a smallest member.
The set R does not have property (7) because it has no smallest member. However, R does have an important property that can be stated in terms of the order relation. First, we need some terminology. Let S be a nonempty set of real numbers. A real number a such that a < s (read "a is less than or equal to 9") for all s E S is said to be a lower bound for S. Similarly, a number b E R is said to be an upper bound for S if s < b for all s E S. A nonempty subset of
R is said to be a bounded set if it has both an upper and a lower bound. By convention the empty set, 0, is also called a bounded set.
Definition 1. Let S be a nonempty set of real numbers. A real number be is said to be the (see Exercises 3, problem 2c) least upper bound for S if: (1) bo is an upper bound for S, and (ii) if b is any upper bound for S, then bo < b. We may now state one of the key properties of the set R.
3. The Real Number System
9
(8) Any nonempty set of real numbers that has an upper bound has a least upper bound. There are many ways to study the real numbers. In some treatments (8) is taken as an axiom and referred to as the axiom of continuity. In others it is a theorem and called the completeness property of R. Let us show how some of the most familiar facts about R follow easily from (8). The results we have in mind are well known but, as we shall see in the sequel, extremely useful.
Theorem 1. Let a, b be any two positive real numbers. Then there is a natural number n such that a < n b. Proof. If a < b, we need only take n = 1. Suppose that nb < a for every natural number n. Then the set {nbln E N), which is certainly not empty, has an upper bound, the number a, and hence, by (8), has a least upper bound that we shall denote by ao. Now for any it E N we have (n + 1)b < ao, which implies the inequality nb < ao - b for all n E N. Thus the number ao - b is also an upper bound for the set {nbin E N}, and this contradicts the fact that ao is the least upper bound for this set.
Corollary 1. Let a, b be any two positive real numbers. Then a = n b + r where n is a natural number or zero, and r is a real number that satisfies the inequalities 0 < r < b.
Suppose b < a. Then in E Nia < nb} is, by our theorem, nonempty and because it is a subset of N, it contains a smallest member that we shall call in, here we are using property (7). Thus we may write (m - 1)b < a < mnb and because b < a we know that 2 < in. It follows that the number in - 1, that we shall denote by n, is a natural number. Thus, except when a = mb in which case r = 0, we have a = (m -1)b+r = nb+r where r is the real number a - nb. Clearly, r < b in either case. Proof.
Theorem 1 has another useful corollary. Before we can state it we must recall two important subsets of R. First, we have the set of integers, Z, obtained from N by adjoining zero and the additive inverse of each of its members. So
(9) Z={0,±1,±2,...} Next there is the set Q of rational numbers,
(10) Q={xERIx=m/nwhere m,n are in Zand a
0}
Corollary 2. Between any two distinct real numbers there is a rational number.
10
Chapter Zero
Preliminaries
Proof. Let a. b be two real numbers with a < b. We want to find two integers in. n such that a < '-'t` < b. Now the numbers 1 and b - a are positive, so there is, by the theorem, a
natural number n such that 1 < n(b - a). Choose, and fix, such an n. Now, again by our theorem, the set {k E Zina < k} is nonempty. Let m be its smallest member. Then no < m < na + 1 < no + n(b - a) = nb, giving us a < ;" < b and proving the corollary. The set lR together with its order relation is, as we have just seen, a sophisticated mathematical object with an intricate structure. The members of R are lined up (linearly ordered) by this relation but not at all like the beads on a string. Between any two distinct real numbers there are always more real numbers. Thus given a specific number it is meaningless to speak of the "next"
one. Moreover, removing a number does separate the line into two parts; it does leave a "gap," but this gap has no edges.
Exercises 3 1.
(a) Show that the additive identity of R is unique; i.e., if o and o' both satisfy equation (1), show that o = o'. (b) Show that the multiplicative identity of R is unique.
2.
(a) Given real numbers x and y with x < y, show that
i. x+z
(b) For x, y in R we have defined x < y to mean x < y or x = y. Show that < is a total ordering of R (Exercise 2, problem 3b). (c) Show that a set of real numbers that has a least upper bound has exactly one least upper bound. (d) Show that any nonempty set S C R that has a lower bound has a greatest lower bound; i.e., there is a number ao such that i. ao is a lower bound for S, and ii. if a is any lower bound for S then a < ao. (e) Leibnitz wanted to add infinitesimals to the real number system; i.e.,
quantities j such that 0 < j < n for every n E N. Show that no member of R has this property.
3. For any x E R define IxI to be x when x E P or x = 0 and define IxI to be (-x) otherwise. Show that for any x, y in R we have (a) Ix + yI <- IxI + Iyi (b) I IxI - Iyii <- Ix - yI
(c) Ix' yI = IxI Iyl
4. The Complex Number System
11
4. Suppose that for some set S we have a function f : N -+ S that is onto. Show that there is a function g : S -' N that is one-to-one. Explain why you do not need the axiom of choice in this case. 5.
(a) Show that there are real numbers that are not rational as follows: Consider the real number V,'2-. If this were rational, there would be integers m, n such that f = n . Clearly, we may suppose that in and n have no common factors. Now deduce a contradiction. (b) Let a be a real number that is not rational; such numbers are called irrational. Show that i. a + q is irrational for every q E Q, and a q is irrational for every nonzero element of Q. (c)
Given a, b in IR with a < b, modify the proof of Corollary 2 to show that there are integers in, n for which a < (n) < b. Thus, between any two distinct real numbers, there is an irrational number.
6. For any two real numbers a and b, let (a, b) = {x E RI a < x < b} and let
[a,b]={xE]Rla<x
(b) When (a, b) 0 0, we call this set an open interval, and when [a, b] 0 0 we call this set a closed interval. In either case, a and b are the end points of the interval and b - a is its length. We have seen that every open interval contains both rational and irrational numbers. Is this true of every closed interval? 7. A purely fictitious sleep researcher discovers a man with a strange pattern of dreaming. On any closed interval of time during which he is asleep he
dreams a great deal, but never on the (open) middle one-third of this interval. Suppose this man sleeps for 1 hour. Describe, as precisely as possible, the set on which his dreaming takes place. Hint: Model his nap by the interval [U, 1]. No dreaming takes place on the interval (1. s)
4.
The Complex Number System
The reawakening of Europe, which began in the Italian city states, is usually associated with the remarkable achievements in painting, sculpture. and architecture. But it was an intellectual rebirth and so, not surprisingly, it had a mathematical side. Perhaps the two most important developments were the discovery of the laws of perspective by Brunelleschi, which led to the creation of projective geometry, and the discovery of the system of complex numbers. These numbers are of fundamental importance in modern physics and modern electrical engineering, but when they were first introduced they were viewed
12
Chapter Zero
Preliminaries
with skepticism and distrust. Let us briefly review how they came into being and give a quick survey of their basic properties. One might believe that the study of the quadratic equation would lead to complex quantities. This, however, was not the case. Quadratics whose solution involved square roots of negative numbers were simply dismissed as having no solutions. It was in connection with the cubic equation that this could not be done. A formula for solving the general cubic was found by Scipione del Ferro and, independently, by Niccolo Fontana, also known as Ta'taglia. In 1545 this formula was published by Gerolomo Cardano in a book entitled Ars Magna wherein he clearly states that he obtained the formula from Ta'taglia and that the result was also found by del Ferro. For some reason the formula has become associated with Cardano's name, leading many to believe he stole it. But his book is widely available and the facts can be easily checked (see Note 1). Now the equation x3 = 15x + 4 was discussed by Bombelli in his algebra book written in 1572. Applying Ta'taglia's formula, he obtained the root
r=
a
2+ -121+
V2
-
-121
(1)
Bombelli wondered if, perhaps, the cube roots appearing in (1) might be written in the form p +,V--q and p - vf--q, what we today would call complex conjugates (see Exercises 4, problem 1). He was able to show that they are and 2 (see Exercises 4, problem 10). Putting these in (1) indeed 2+ he obtained r = 4, which is easily seen to satisfy the equation. Now expressions involving square roots of negative numbers had, on occasion, been considered before. In fact Cardano, in his book, used them to find two numbers whose sum is 10 and product is 40. But Bombelli went on to develop the arithmetic of these expressions, giving us the rules we use today. In modern notation he worked with quantities of the form a + bi where a, b are real and is = -1 and he developed the arithmetic operations on these quantities. Thus the system of complex numbers was born.
A major advance in our understanding of equations came in the 1620s when the English mathematician Thomas Harriot began writing them in the modern way, i.e., as a polynomial set equal to zero. So x3 = 15x + 4 became x3 - 15x - 4 = 0. Harriot noted that whenever the number r was a root of the equation p(x) = 0, he could find a polynomial q(x) such that p(x) = (x-r)q(x). Clearly, q(x) must have degree one less than that of p(x), and thus he discovered
the fundamental fact that the degree of p(x) tells us the number of roots of the equation p(x) = 0. By combining this with the work done in Italy, it became clear that every quadratic, every cubic, and every quartic (fourthdegree equations, first solved by Lodovico Ferrari, a student of Cardano's) had
two, three, and four, respectively, complex solutions. The natural question then was whether one could show that every polynomial equation of degree n had n complex roots. Various people did make progress with this question, but
4. The Complex Number System
13
complete success did not come until nearly 200 years later (1820) with the work of Gauss. Thus we see that complex numbers, despite their mysterious nature, became central to the theory of equations. Still people were skeptical. Euler, who gave us the definition (generalizing an important observation of DeMoivre; see Exercises 4, problem 3b)
e'v =cosy + i sin y,
yER
(2)
wrote in 1770: "All such expressions as Vf--1, vr--2, etc., are completely impos-
sible or imaginary numbers, since they represent roots of negative quantities; and of such numbers we may truly assert that they are neither nothing, nor greater than nothing, nor less than nothing, which necessarily constitutes them imaginary or impossible." The first clue as to how one might define complex numbers rigorously came
from a geometric interpretation of these quantities given, independently, by a Norwegian surveyor named Wessel, who presented his ideas to the Danish Academy of Sciences in 1797; by the astronomer Bessel in 1799; and by a Swiss bookkeeper named Robert Argand in 1806. These results made the connection
between complex numbers and points in the plane, i.e., ordered pairs of real numbers. Essentially what they did was to identify the symbol a + bi with the point (a, b) or, if one preferred, with the segment joining (0, 0) to (a, b). We shall exploit this representation below.
Let us turn now to Gauss' definition of the system of complex numbers first given in 1831. He defined his basic set, C, to be the Cartesian product of R with itself, the set R x R. Next he needed arithmetic operations on the members of C and these he defined as follows:
(a,b)+(c,d)=(a+c,b+d) (a, b) (c, d) _ (ac - bd, ad + bc)
(3) (4)
for all (a, b) E C and all (c, d) E C.
It is easy to see that (0, 0) and (1, 0) are, respectively, the additive and multiplicative identities of C. Each complex number (a, b) has a unique additive inverse, (-a, -b) of course, and if it is not zero, a unique multiplicative inverse (see Exercises 4, problem 1b). It is customary to identify R with the subset {(x, y) E Cly = 0}; thus every real number is also a complex number. Note that
the real number -1 is now identified with (-1, 0). Since (0,1) (0, 1) _ (-1, 0), by (4), (0, 1) is the mysterious i.
Now that we know what complex numbers really "are" we shall, as is customary, write them as a + bi, a and b real, i2 = -1. Thus C = {a + bi la, b E R}. We may identify any z = a + bi in C with the line segment joining (0, 0) to (a, b). The length of this segment, '/a2 + b , is called the modulus, or absolute value, of z; we shall write Izi = \/a2 _+b2. It is clear from the geometry that we may write a = I z I cos 9, b = I z I sin 0 where 0 is the angle between our segment
14
Chapter Zero
Preliminaries
and the positive x axis. The angle 0 is referred to as an argument of z; obviously
0 is not unique. Thus, for any z = a + bi in C, we have z =a+ bi = lzl(cos 0 + i sin 0) = [zl eie
(5)
These last two expressions are called the polar forms of z. For any two points
of C, say zl = x1 + iy and z2 = x2 + 42, we define the distance between these points to be Iz1 - z21. This is just Vl'(Xl - x2)2 + (yl - y2) hence for many problems we may identify the complex plane with the plane of analytic geometry.
Exercises 4 1. For any complex number z = a + bi define x, read "z bar," to be a - bi. This is called the conjugate of z. *(a) Show that z z = a2 + b2 = 1z12. *(b) If z # 0, compute the multiplicative inverse of z, z-and show that it is unique. (c) Define f : C C by setting f (z) = 'z for all z. Show that (i)
f(z) f(w); (ii) f(z+w) = f(z)+f(w); (iii) f(az) =af(z) when a is real.
(d) Use (c) to show that the complex roots of a polynomial equation having real coefficients must occur in conjugate pairs; i.e., if p(z) = 0,
then we must also have p(z) = 0.
2. Show that e" + 1 = 0. This formula was once thought to have mystical significance because it combines the e of analysis, the 7r of geometry, the zero and one of arithmetic, and the i of algebra. *3. Let z and w be two complex numbers whose polar forms are, respectively,
z = r(cos0+isin0),w = s(cos0+isin¢). (a)
(b)
Show that z w = rs(cos(0 + 0) + i sin(0 + 0)). Prove DeMoivre's formula: For every natural number n, z" = r" (cos nO + i sin n0).
Hint: Use the fact that any nonempty set of natural numbers has a smallest member. *4. Let z, w be any two complex numbers. Show that (a) Iz wi = Izi Mwl and (b) Iz + wI < IzI + IwJ.
*5. Assume the fundamental theorem of algebra: Any polynomial equation, with complex coefficients, has a complex root. Use this to show that an equation of degree n has n complex roots.
5. Analysis
15
*6. Given any function f from R to C show that there are two functions g and h, both from R to R, such that f (x) = g(x) + ih(x) for all x E R. R and v : C - R by setting, for each z = ai + Hint: Define u : C ib,u(z) = a and v(z) = b. Incidently, a is called the real part of the complex number a + bi, and b is called its imaginary part. This custom is unfortunate because it forces us to live with the peculiar fact that the imaginary part of a complex number is real. 7. We refer here to section 3, specifically to the subset P of R with properties
(5) and (6). Show that there is no subset of C that has these properties.
8. Describe geometrically the sets defined by the following inequalities:
(a)lzJ<1;(b)lz+il <1;(c)IzI>2;(d)Izl=1. 9. Find the real part, the imaginary part (see Exercises 4, problem 6), and the absolute value of (a) (3i - 2)3; (b) ;.+i ; (c) 2+3i 10. Compute (2 + i)3 and (2 - i)3.
11. Show that for any natural number n the number i" is equal to one, and of course only one, of the numbers i° it iz 0.
5.
Analysis
The basic notions of analysis, limit, continuity, and differentiability are formally
the same whether our functions map R to R. R to C, C to R, or C to C. To exploit this fact, we shall let lid and K' denote either R or C.
Definition 1. Let so E K and let p be a fixed, positive, real number. The set Up(so) = {s E K is - sot < p} is called an open neighborhood of so or, sometimes, just a neighborhood of so. If S C K and so E S we shall say that so is an interior point of S if for some p > 0, UU(so) C S. The set S is called an open set if each of its points is an interior point.
We leave it to the reader to show that 0 and K are open sets and that (i) the union of any family of open sets is an open set, and (ii) the intersection of any finite family of open sets is an open set. A subset of K is called a closed set if its complement is an open set. From (i) and (ii) and the theorem of DeMorgan (Exercises 2, problem 5) we immediately obtain (iii) the intersection of any family of closed sets is a closed set, and (iv) the union of any finite family of closed sets is a closed set. A nonempty subset S of K is said to be bounded if there is a positive real number M such that Isi < M for all s E S. The empty set is also taken to be a bounded set. We recall that any set that is both closed and bounded is called
16
Chapter Zero u Preliminaries
a compact set. These behave very much like finite sets. In particular, when S is a compact set, we can, from any family of open sets whose union contains S, extract a finite subfamily whose union contains S. If a function f is defined on U,(so), except possibly at so itself, and if for some to E K we have: Given e > 0 there is a 6 > 0 such that If(s) - tol < e whenever 0 36 Is - soI < S, then we shall say that to is the limit of f(s) as s approaches so, and we shall write lim f (s) = to.
Definition 2. Let f : S -+ K' and let so be an interior point of S. We shall say that f is continuous at so if lim f (s) = f (so). We shall say that f is differentiable at so if lira f (s) - f (so) exists. This limit is denoted by f'(so). $-.$o
8-80
It is easy to see (see Exercises 5, problem 2) that if f is differentiable at so, it is continuous there. We stress that differentiability is defined only at interior points of the domain S of f. The following amusing example shows that this restriction is not to be taken lightly.
Example I One can easily show that for any natural number n, (*)
n summands on the right-hand side of (*). Now the derivative of 712 is 2n and that of n is 1, hence (**) Dividing by n, which is valid because n is a natural number and hence not zero,
we get2=1.
Let us specialize now to partial functions from R to R. More specifically, we consider a function f whose domain is a closed interval [a, b]. We already know what it means to say that f is continuous at any point xo E (a, b). We shall say that f is continuous at x = a if given e > 0 there is a 8 > 0 such that
If (a) - f (x) I < e whenever x E [a, bJ and x - a < 8. We shall say that f is continuous at x = b if given e > 0 there is a 6 > 0 such that If (b) - f (x.) I < E whenever x E [a, b] and b - x < 6. Not surprisingly, we say that f is continuous on [a. bJ when it is continuous at each point of (a, b) and also continuous at x = a and at x = b. Let us now review some of the basic properties of realvalued functions that are continuous on closed intervals (remember a, b are in R so the length of [a, b] is finite). We shall omit the proofs.
Theorem 1. There are points x1 i x2 of ]a, b] such that f(xj) < f (X) < A X2) for all x in [a, b].
5. Analysis
17
This theorem simply states that f has a maximum and a minimum on [a, b]. Suppose now that y is any number that is between the two extremes of f . Then:
Theorem 2. There is a point xo E [a, b] such that f (xo) = y. Definition 3. Let f : [a, b] - K'. We shall say that f is uniformly continuous on [a, b] if, given e > 0, there is a b > 0 such that If (x1) - f (X2)1 < e whenever x1, x2 are in (a, b] and Ixi - x21 < b.
Theorem 3. If f :
R is continuous on [a, b], then it is uniformly con-
[a, b]
tinuous on this interval.
Theorem 4. If f :
[a, b] --+ R is continuous on [a, b], then the Riemann inte gral of f, fa° f (x)dx exists. Moreover, there is a point xo E [a, b] such that
fa f (x)dx = f (xo) (b - a). The second statement is an immediate consequence of theorems 1 and 2. Finally, we have the mean-value theorem:
Theorem 5. If f : [a, b] -+ R is continuous and has a derivative at each point of (a, b), then to any h such that a < a + h < b there corresponds a number 0,0<0< 1 such that f (a + h) - f (a) = h f'(a + Oh). Let us turn now to functions from C to C. Simple examples are as follows: f(z) = z2 = (x + iy)2 = (x2 - y2) + i(2xy)
(1)
f(z) = e= = eT+iy = ex (cos y + i sin y) = e' cosy+i(e?siny)
(2)
It is clear, as we have illustrated, that any f : C -+ C can be written in the form
f(z) = u(x,y) + iv(x,y)
(3)
where u and v are real-valued functions defined on R2. Suppose now that f is differentiable at zo = xo+iyo. Then f is defined on some neighborhood UP(zo) and lim Z
f(z) - f(zo) = f'(xo)
ZO
z - zo
We must obtain the same number for this limit no matter how z approaches zo. Let us take, first, z = (xo + Ax) + iyo, where Ax is so small that z E UP(zo). Then we find that
f '(zo) = AX-0 lim
u(xo + Ox, yo) - u(xo, yo) + i lim v(xo + Ox, yo) - v(xo, yo) Ox Ax
= Ox(xo,yo)+iex(xo,yo)
Chapter Zero
18
Preliminaries
But by taking z = xo + i(yo + Ay) we obtain, similarly.
f'(zo) _
-i
5b (xo, yo) +
a- (xo, yo)
(4)
Thus at any point at which f has a derivative, we have
atc_av OX
(5)
ay
and
Uu
av
ay
ax
(6)
These are the Cauchy-Riemann equations named for the Frenchman A. L. Cauchy and the German G. F. B. Riemann, two pioneers in complex function theory.
It is clear that u(x, y) = x2 - y2, v(x. y) = 2xy satisfy these equations as do u(x, y) = e= cos y and v(x, y) = e' sin y. Suppose now that we have two real-valued functions u and v defined on 1R2 that satisfy the Cauchy-Riemann equations at some point. Suppose further that u and v have continuous first and second partial derivatives at this point. Then equations (5) and (6) give: 02u
_
02v
axe - axay
(7)
and
021,
a2v
aye = - 8yax
(8)
Under our assumptions, 02" = aye=, hence u must satisfy 02.U
0,2 U
axe + aJ2 = °
(9)
A similar calculation shows that v must satisfy the same equation. Equation (9) is called Laplace's equation or, sometimes, the potential equation. We have already seen four solutions to this equation, and our discussion above shows how we can generate many more (see Exercises 5, problem 4). This is a very important observation as we shall see. A function g : N -a S (recall that N = {1, 2,3,... 1), where S is an arbitrary set, is called a sequence of points of S, or, more simply, a sequence in S. It is I. customary to set g(na) = s for all ii and to speak of the sequence
5. Analysis
19
Definition 4. Let {sn}I I be a sequence in K. We shall say that {sn} is convergent to the element so E K, and we shall write lim sn = so, if given n -+oo e > 0 there is a natural number no such that Iso - snI < e whenever n > no. A sequence that is not convergent is said to be divergent.
When f is a partial function from K to K' and {sn} is a sequence in the is a sequence in the range of f. We can use the behavior of sequences constructed in this way to characterize those points at which f is continuous.
domain of f, then If
Theorem 6. Let f be a partial function from K to K' and let so be an interior point of the domain of f. Then f is continuous at so if, and only is a sequence in the domain of f that if, lim f(sn) = f(so) whenever n-oo converges to so.
Proof. Suppose, first, that f is continuous at so and that {s, } is a sequence
in the domain of f that converges to so. Given e > 0, we use the continuity of f to find 5 > 0 such that If (so) - f (s)I < e whenever Is - sol < 5: remember so is an interior point of the domain of f. Next we use the fact that lim s = so to find a natural number no such that Isn - soI < 5 whenever n > no. Thus for the given e > 0, we have If (sn) - f (so) I < e whenever n > no, and this means If (sn)} is convergent to f (so) as claimed. Now suppose that lim f (so) = f (so) whenever {sn} converges to so and yet f is not continuous at this point. We shall deduce a contradiction. Negating the definition of continuity at so, we see that for some e > 0 we can, for every 6 > 0, find an element s6 in the domain of f for which Is6 - sot < 6 and yet If(s6) - f(so)I > e. We now construct a sequence as follows. Setting 6 = I we find sl such that Is' - soI < 1 and I f (sl) - f (so) I > e. Next we set 6 = 2 and find ,s2 so that 152 - soI < 2 and If 02) - f(so)I > e. We continue in this 1 have been found, we set 6 = n and find s,, so that way. After 81, 82, ..., Isn - sop < n and If (sn) - f (so)I > E. It is clear that we have constructed a sequence {sn} that converges to so. But it is also clear that If (sn)} could not converge to f (so) because If (sn) - f(so)I > e for all natural numbers n. This is our contradiction.
nx
Definition 5. A sequence {sn} C K is called a Cauchy sequence if, given any e > 0, there is a natural number no such that Isn - 8n, I < E whenever m and n both exceed no.
If {sn} converges to so, then because Isn - snl < Isn - soI + Iso - sn,i we see that this sequence must be a Cauchy sequence. For the set IIt, it is in fact the case that a sequence converges if, and only if, it is a Cauchy sequence. This statement is equivalent to the completeness axiom for R (Section 3, paragraph just after equation (8) ). It follows easily that a sequence in C is convergent if, and only if, it is a Cauchy sequence (see Exercises 5, problem 1).
Chapter Zero
20
Preliminaries
Exercises 5 1. Let (z,, } be a sequence in C, and for each n let zn = xn + iyn, where xn and yn are in R.
(a) Show that
is a Cauchy sequence if, and only if, both {xn} and
{ y,, } are Cauchy sequences.
lim zn = xo + iyo if, and only if, limxn = xo and (b) Show that n-x limy,, = yo.
*2. If the function f is differentiable at a point so E K, show that it is continuous at this point.
*3. Suppose f : lR -+ C. Then we may write f (x) = g(x) + ih(x), where g and h map 1R to R.
lim f (x) = uo + ivo if, and only if, we have both (a) Show that X-zo lim g(x) = uo and lim h(x) = vo. l- XO 2-SO
(b) Show that f is continuous on [a, b] if, and only if, both g and h are continuous on this interval. So fn f (x)dx can be defined to be fQ g(x)dx + i f, h(x)dx, and we shall do so. (c) Show that f is differentiable at a point xo if, and only if, both g and h are differentiable at this point. Show that, when this is the case, f'(xo) = g'(xo) + ih'(xo). 4.
(a) The function f (z) = z3 is differentiable at every point in C. Find the real and imaginary parts of this function and show that together they satisfy the Cauchy-Riemann equations. Show that each of these functions satisfies Laplace's equations. (b) Do the real and imaginary parts of the function f (z) = z satisfy the Cauchyy-Riemann equations?
5. Let f be a partial function from R -+ K' with domain a closed interval [a, b). Show that f is continuous on [a, b] if, and only if, we have (i) f is continuous on (a. b); (ii) lim f (xn) = f (a) whenever {xn} C [a, b] and C [a, b] and licexn = b. lira .r,, = a; (iii) lim f f (b) whenever
6. Let f be a partial function from K to K' that is uniformly continuous (Definition 3) on its domain.
(a) Show that {f(sn)} is a Cauchy sequence whenever {sn} C D(f) is a Cauchy sequence.
(b) Show that f (x) = z , D(f) = (0, 1), is not uniformly continuous.
5. Analysis
21
*7. Let {nk}k 1 be a sequence of natural numbers such that nk < nk+1 for all k. Given a sequence {sn}, the sequence {sn,t} is called a subsequence Of {8n}-
(a) Show that all subsequences of a convergent sequence are convergent to the same limit. (b) Show that any subsequence of a Cauchy sequence is a Cauchy sequence. (c) If {sn} is a Cauchy sequence and if it has a subsequence that converges, show that {sn} is convergent.
(d) If {sn} is a Cauchy sequence, show that there is a positive real number M such that is,, I < M all n.
Notes 1. Cardano's book, Ars Magna, published in 1545, has been translated by T. Richard Witmer (The Great Art. M.I.T. Press, Cambridge, MA, 1968). On page 8 Cardano states that the formula for solving a cubic was found by Scipione del Ferro and, independently, by Ta'taglia. He also states that the formula was given to him by Ta'taglia. Nowhere does he claim it as his own.
2. Bombelli's work, and a comment from him on how he was led to these ideas, can be found in Number-the Language of Science, by Tobias Dantzig (Doubleday and Co., Inc., Garden City, NY, 4th edition, 1956). He discusses Bombelli's derivation of the laws of complex arithmetic and also the work of Harriot (see pp. 184-186 and p. 188). Euler's quote about the impossibility of taking square roots of negative numbers can be found on page 191. The geometric interpretation of complex numbers by Wessel and later by Argand is discussed on page 101. 3. Ralph P. Boas, in his wonderful book, Invitation to Complex Analysis (Random House, Inc., New York, 1987), states that the geometric interpretation of complex numbers was also found by Bessel in 1799 (see the notes on p. 13).
Chapter One
Classical Harmonic Analysis
The central question in this subject first arose in the 18th century when D'Alembert, Euler, and Bernoulli investigated the problem of the vibrating string. We imagine an elastic string, of finite length, stretched along the x-axis with both end points fixed. If the string is displaced to form a curve, say f (x), and then released, the y-coordinate of each of its points will vibrate (i.e., vary with time), and because the end points are fixed, y depends on x. So we'll have y = y(x,t)
(1)
with y(x, 0) = f (x) and f (x) is known. It was soon discovered that the function y(x, t) could be found provided one could represent f (x) as a series of sines and cosines, i.e., as a trigonometric series. More explicitly, provided one could find
constants ao,a1,a2, and b1,b2, such that 1
AX) = ao + 2
00
(an cos nx + b,, sin nx.)
(2)
n=1
The details of the analysis connecting the physical problem and the series representation are discussed later. Many writers begin their discussions of classical harmonic analysis with the series (2). We believe, however, that it is more instructive to stay closer to the historical development. So we shall begin with a problem of mathematical physics. A problem whose solution is, we believe, especially illuminating.
23
24
Chapter One
1.
The Dirichlet Problem for a Disk
Classical Harmonic Analysis
Before we can state the problem we have in mind we need some terminology.
Definition 1. Let C be a circle in the plane (i.e., R2) and let G be the set of all points inside C. A real-valued function u defined in G is said to be harmonic in G if (i) u has continuous partial derivatives of order one and of order two at every point of this set, and (ii) at each point of our set u satisfies the equation
0'u +2=0.
The equation just written is called the potential equation or, sometimes, Laplace's equation. In Chapter Zero (Section 5) we gave examples of harmonic functions and indicated how one can find many others. These functions have some remarkable properties.
Theorem 1. Let C be a circle of diameter d in R2 and let G be the set of all points inside C. Suppose that u is a real-valued function that is defined and continuous on G U C and is harmonic in G. Then the values of u in G cannot exceed the maximum of u on C, nor can they be less than the minimum of u on C. Proof. Let ire be the maximum of u on C, and suppose that at some point
(xo, yo) E G, u(xo, yo) = 1'1 > in. We may suppose that 1L1 is the maximum of u on G U C. Now translate the origin to (xo, yo). We obtain a function we shall denote by u that has all of the properties of our original function and has maximum m over C and maximum M over G U C but attains this last value at (0, 0). Now define an auxiliary function v(x, y) = u(x, y) + 't2 (x2 + y2). Clearly, v(0, 0) = M and, if (x, y) E C, then v(x, y) m+(M-r) d2 = M2 m < M. Thus z' attains its maximum over G U C at some point of G. But, at every point of G 02v
02u
8x2 + 8y2
8x2
824'
82u
/M-m\
+ 8y2 + 2 (
d2
1
=2
M-m d2
>0
This is a contradiction because, at the maximum of v, none of its second partials can be positive.
To prove the second statement of the theorem we apply the result just obtained to the function -u(x, y). This proof is due to the Russian mathematician I. I. Privalov. The problem we shall investigate here, and in the remainder of this chapter, is this: Let C be a circle in R2 and let G be the set of all points inside C. Suppose
that we are given a real-valued function f that is defined and continuous on C. Can we find a function u that is harmonic in G, continuous on G U C, and equal to f on C?
2. Continuous Functions on the Unit Circle
25
This is called the Dirichlet problem (or, simply, the DP) for C with boundary data f. Theorem 1 has an immediate and, as we shall see, very important, consequence.
Theorem 2. Solutions to the Dirichlet problem for a circle are unique. Proof. Using the notation introduced above, suppose that uu and u2 are two
solutions to the DP for the disk G U C with boundary data f. Then the function uu - u2 is harmonic in G, continuous on G U C, and identically zero on C. By Theorem 1 this function must he identically zero at all points of G. Thus ul = u2 on G U C.
Exercises 1 *1. We shall solve the DP for the unit disk, i.e., the disk whose boundary is the circle of radius one centered at the origin. To do that it is convenient to use polar coordinates: x = r cos 0, y = r sin 0. Show that under this I Ou 1 Ou 02u 02u 92u = 0. + + transformation, axe + aye = 0 becomes
are r 5r -j a02 2. Let C be a circle in the plane and let G be the set of all points inside C. Suppose that fl. f2 are two real-valued functions that are defined and continuous at each point of C and that these functions differ by less than some c > 0 at each such point. If it, and u2 are solutions to the (DP) for G U C with boundary data fu and f2, respectively, show that uu and u2 differ by less than a at each point of G.
2.
Continuous Functions on the Unit Circle
The boundary data for the DP are continuous functions on a circle. Such functions have some interesting properties that we shall explore here. Recall that K denotes either the set of real numbers, R, or the set of complex numbers, C.
Definition 1. For any function h: R - K, the set P(h) = {a E RIh(x+o) = h(x) for all x} is called the set of periods of h. We shall say that h is a periodic function when this set contains a nonzero number.
For any function h the set P(h) contains the number zero, and in certain
cases it contains nothing else. For example, when h(x) = x or h(x) = e=, we have P(h) = {0}. At the other extreme, we have P(h) = R when h(x) is a constant; so constant functions are periodic. Less trivial examples are the functions sin x and cos x. We recall that
P(sinx) = P(cosx) = {2nir I n E Z} The set of periods of a function has some structure.
26
Chapter One
Classical Harmonic Analysis
Lemma 1. For any function h: JR K we have (i) the number (-a) E P(h) whenever a E P(h), and (ii) if a. 0 are in P(h), so are both the numbers
a+d.a-i3.
Proof. When a E P(h) we have h(x+a) = h(x) for all x. Hence, when x = y-a we have h(y) = h(y - a), and this is true for any y. Given a, f3 in P(h) we see that h(x+(a+,0)) = h((x+a)+Q] = h(x+a) _ h(x) for all x, first because 13 E P(h) and then because a does. The fact that (a - i3) is in P(h) follows from what we have just shown.
Corollary 1. If a E P(h), then {no n E Z} C P(h). When a is the smallest positive member of P(h), these two sets coincide.
{na I a E Z}, it is sufficient to show that Proof To show that P(h) P(h) P {na I n E N). If this last statement is false, then In E N I no 0 P(h)} is nonempty, and hence it contains a smallest member no. From lemma 1 we see that no > 2, and so (no - 1) E N. Furthermore, (no - 1)a must be in P(h).
But then, again using lemma 1, (no - 1)a + a = noa is in P(h), which is a contradiction. Now suppose that P(h) has a smallest positive element a. We want to show
that P(h) = {na I n E Z}. Given any 0 E P(h) suppose, first, that # > 0. Then a < 3 and we can find an integer k such that /3 = ka + r, 0 < r < a. But both 3 and ka are in P(h). hence r E P(h), and so this number must be zero. Thus /3 = ka. k E N. When E3 is negative, then (-(3) E P(h). and this number is positive. So, (-[3) = ka for k E N, telling us that /3 = (-k)a. A constant function is, obviously, a periodic function with no smallest positive period. A less trivial example is given in Exercises 2 (problem 2). When a periodic function h has a smallest positive period, this number is sometimes called the period of h, or the Period of h. We shall call every element
of P(h) a period of h, and when we need to talk about the smallest positive member of P(h) we shall refer to it as such.
Lemma 2. If a continuous function on J has arbitrarily small periods, it is a constant.
Proof. We shall prove this result for real-valued functions and leave the case of complex valued functions for the exercises (see Exercises 2, problem 6d). So let f : JR -+ It be a continuous function and let {a,, } C P(f) be positive numbers such that lim a = 0. Now for each fixed n and any real number x, n-00
2. Continuous Functions on the Unit Circle
27
there is, by the mean-value theorem for integrals (Chapter Zero), a real number
x;, such that =+e .,
L
f(t)dt = f(x)[(x+ a) - x] =
and x < x,*, < x + a.. Thus 1
an
fx+'" f (t)dt = f (x,,)
Now f is continuous and an -+ 0, hence 1
liman- J=
=+Q
f (t)dt = f (x)
However, because an E P(f) we have, for any x and y, r:+Q
J
f (t)dt =
ry
y+Q
f (t)dt
(see Exercises 2, problem 3). Thus letting n -. oo, we find that f (x) = f (y) showing that f is a constant. Corollary 2. Any nonconstant, continuous, periodic function has a smallest positive period. Proof.
Again we shall prove our result only for real-valued functions. So let
f : R -+ R be nonconstant, continuous, and periodic. By lemma 1, the set P+(f) _ {a E P(f) I a > 0} is nonempty. Let ao be the greatest lower bound for this set (Chapter Zero). If ao is zero, then a sequence in P+ (f ), a sequence of positive periods of f, must converge to it; i.e., f would have arbitrarily small periods. This cannot be the case, because f is not constant, and so ao must be positive. To complete the proof we need only show that ao E P+(f ). We argue by contradiction. If ao is not in this set, then we must have {an } C P+ (f) such that limn. oo an = ao. But then, because f is continuous, we have f (x + ao) = lim f (x+an) for any fixed x. However, f (x+an) = f (x), so f (x+ao) = f (x) n-oo for any x, and this says ao E P(f). Combining our results we see that if f is a nonconstant, continuous, periodic
function, then f has a smallest positive period ao and P(f) = {nao I n E Z}. In all that follows we shall denote the unit circle (the circle of radius one centered at the origin) by T. We shall solve the DP for this circle. Note that if f is a continuous function on T, then we may define f (O) to be f (cos 0, sin 0) for
28
Chapter One
Classical Harmonic Analysis
all 0 E R. It is clear that f is a continuous function on R and that 2n is a period of this function. Thus every continuous function on T gives us a continuous function on R that has 27r for a period. Conversely, given a continuous function f on R that has 21r for a period, we may define f (cos 0, sin 8) to be f (O), for all 0, obtaining a function f that is continuous on T.
Definition 2. A function that has 27r for a period, whether or not it is the smallest period of this function, is called a 27r-periodic function.
Exercises 2 1. Let f : R
K. Show that P(f) = R if, and only if, f is constant.
2. Recall the set of rational numbers Q. Define X(x) = 1 when x E Q, X(x) = 0 when x E R \ Q. Show that P(X) = Q; hence, X has no smallest positive period.
* 3. Let f : R - R be continuous and periodic and let a > 0 be any element of P(f ). Show that for any real numbers x and y we have f = +R f (t)dt = f(t)dt. * 4. Let a, b be two fixed complex numbers and let A be a positive real number. Let g(x) = a cos Ax + bsin Ax. Show that P(g) = {2-n' In E Z}.
5. Let f,g be two periodic functions. (a) If the set P(f) f1 P(g) contains a nonzero element, show that a f + bg (here a, b are any complex constants) and f - g are periodic functions. (b) When f has a smallest positive period a and g has a smallest positive period 3, show that P(f) f1 P(g) contains a nonzero number if, and only if, a = rf3 for some rational number r q' 0.
*6. Leth:R-*C (a) Show that there are functions f, g, both from R to R, such that h(x) = f (x) + ig(x) for all x. (b) Show that P(h) = P(f) f1 P(g). Hence if h is periodic, then both f and g are periodic; the converse is false. (c) Show that h is continuous if, and only if, both f and g are continuous.
(d) If h is continuous and has arbitrarily small periods, show that h is a constant. (e) If h is continuous, periodic, and nonconstant, show that h has a smallest positive period. * 7. If f : R - C is continuous and periodic, show that it is uniformly continuous on R (Chapter Zero, Section 5, Definition 3).
3. The Method of Fourier
3.
29
The Method of Fourier
The DP requires that we solve a certain partial differential equation subject to a given boundary condition. The mathematical details are a bit complicated, and it is wise to begin with a simpler problem involving the solution of an ordinary differential equation. The relevance of this simpler problem will become clear very soon. We begin with the equation
it a real constant
e"(8) + µe(B) = 0,
(1)
We seek nontrivial (i.e., not identically zero) solutions to (1) that are 27rperiodic. There are three cases:
(i) µ = 0. Here the general solution to our equation (e"(8) = 0) is easily seen to be O(9) = cob + ao
where co and ao are constants. This is nontrivial and 27r-periodic only when co is zero and ao is not. (ii) µ < 0, say p = -A2 where A is a positive real number. So (1) becomes
9"(8) - A2O(8) = 0 We set 0(8) = emo, m to be determined, and putting this into our equation, we find that (m2 - A2)emo = 0. It follows that m = ±A and the general solution to our equation is 0(8) = cleae + c2e-ae
Clearly, there is no choice of the constants cl and c2 that gives us a function that is both nontrivial and 27r-periodic. The problem has no solutions when µ < 0 (see Exercises 3, problem 1).
(iii) µ > 0, let us say p = A2 for some A > 0. Again putting 0(8) = emo into our equation (i.e., into a"(8) + A2O(8) = 0) we find that m = ±iA. Hence, the general solution is
oa(8) = clae'ae +
c2Ae-taa
= as cos a8 + ba sin A0
where aa, b, are arbitrary constants, not both zero. We are seeking solutions that are 27r-periodic. Now
P(6a) = {n (T) I n E Z}
(Exercises 2, problem 4)
and so this condition is satisfied if, and only if, A is a natural number.
30
Chapter One
Classical Harmonic Analysis
To sum up, equation (1) has nontrivial 27r-periodic solutions when, and only when. p = nz, n = 0, 1, 2,- and these are
9 (0) = a cos nO + 6 sin n0
(2)
Let us turn now to the DP for the unit circle T. The geometry suggests that we use polar coordinates, and we let C be the set of all points inside T. Thus we are given a real-valued function f (0) that is continuous on T, and we seek a function u(r, 0) that satisfies 52
ru +
r - + -- a02 = 0
(Exercises 1, problem 1)
(3)
at all points of C, that is continuous on C U T, and is equal to f (9) on T. Now equation (1) has, depending on p, a general solution that can be represented by a simple formula. Equation (3), however, has so many particular solutions (Chapter Zero) that we cannot expect to represent all of them in any simple way. Another approach is needed, and the standard one is to restrict ourselves to solutions u(r, 0) that can be written in the form R(r)9(0). This may seem like a drastic assumption and a severe limitation on what we might find. One might wonder if there are any solutions that can be "factored" in this way. And, even if there are such solutions, might there not be other perhaps more interesting solutions that cannot be so written'? Are we to ignore these? Both objections disappear. however, once we show that our restriction does in fact lead us to a solution, because, as we have seen (theorem 2 of Section 1), solutions to the DP for a circle are unique. This is what justifies our entire approach.
So let us set u(r, 0) = R(r)O(9) and put this into (3). We find, after some algebra,
r2R"+rR'
A"
9 R Because r and 0 are independent variables, it follows that both expressions are equal to the same constant, say, it. Thus we have two ordinary differential equations
r2R"(r) + rR'(r) - pR(r) = 0
(4)
9"(0) + p9(0) = 0
(5)
and
For each fixed p we find the general solution to (4) and to (5) and their product
is a solution to (3). However, we want u(r,O) = R(r)9(9) to be equal to the given function f (0) when r = 1, i.e., on T. This says that the solutions to (5)
3. The Method of Fourier
31
that are of interest here must be 2a-periodic. Our discussion at the beginning of this section shows that we will find such solutions when, and only when,
µ = n2 where n = 0,1,2, and that these solutions are a,, cosno+b"sinn8
(6)
Equation (4) now becomes
r2R"(r) + rR'(r) - n2R(r) = 0
(7)
and its general solution, when n # 0, is easily seen to be
R(r) = c1r" + c2r-"
(8)
(see Exercises 3, problem 2). Because R(r)O(0) is to be harmonic inside the unit circle, and c2r-" is undefined at the origin, we must take c2 = 0. When n = 0, the general solution to (7) is also undefined at the origin and hence cannot be harmonic (see Exercises 3, problem 3). We have found a whole class of solutions to (3) that are 27r-periodic when r = 1. They are
r"(a.cosn9+bnsinn9),
n=0,1,2,3,
(9)
It would be remarkable if any one of these solutions was equal to the given function f (0) on T. However, because (3) is linear (meaning sums and scalar multiples of solutions to this equation are again solutions), the early mathematicians simply "added" them all up to obtain
u(r,0) = ao+
r"(ancosn8+b sin n49)
(10)
n=1
and then asked if one could choose the constants an and bn so that f (6) = u(1, 0) = ao + F,0"(an cos nb + bn sin n6)
(11)
n=1
This is the problem that so troubled D'Alembert, Euler. and Bernoulli. Modern mathematicians have learned to be wary of infinite sums. We immediately ask about the meaning of (11). Some form of convergence is implied, but convergence in what sense?
Exercises 3 1. Let A be a fixed, positive, real number.
(a) Show that aea= + be-AZ = 0 for all x if, and only if, a = b = 0.
32
Chapter One
Classical Harmonic Analysis
(b) Set f (x) = cleA= +c2(-'-a= where Cl, c2 are constants not both zero.
Show that P(f) = {0}, i.e., if f (x + a) = f (x) for all x, then a = 0.
* 2. Use the substitution r = et to show that equation (7) is equivalent to R - n2R = 0 where R = de R. Next find the general solution to (7).
3. Solve the equation r2R"(r) + rR'(r) = 0. 4. Solve the equations
+
On
= 0 and
On
- ou = 0 by setting u(x, y) =
X (x)Y(y)
4.
Uniform Convergence
The basic facts about uniformly convergent sequences and series are probably known to most readers. A quick review of the main theorems may, however, be helpful.
Definition 1. Let D C lR be a nonempty set and, for each n E N, let fn : D K. We call { fn } a sequence of functions in D. We shall say that our sequence converges to the function g: D - K uniformly over D if, given any c > 0, there is a natural number N such that
Ig(x) - fn(x)I < e for all x E D. whenever n > N. Our first result, though not. terribly surprising, is useful.
Lemma 1. Let D C R be a nonempty set. A sequence of functions { f,j in D converges uniformly over this set if, and only if, to each e > 0 there corresponds
a natural number N such that
Ifn(x)-fm(x)I <E for all x in D, whenever both m and n exceed N. Proof. Suppose that our condition holds for the sequence (f,,). Then for each x E D, If,, (x)} C K is a Cauchy sequence (Chapter Zero, Section 5, Definition 5). Any such sequence converges to a point of K that we shall denote by g(x). Now given e > 0 we can find a natural number N such that I fn (x) -fm(x)I < E
for all x E D, whenever both m and n exceed N. Letting m -. oo in this inequality gives us I f,,(x) - g(x)I < e for all x E D, whenever n > N. But this says { f } converges to g uniformly over D. and the proof is complete in this case.
4. Uniform Convergence
33
Now suppose that the given sequence { fn} of functions in D converges to a function g uniformly over this set. Then, given e > 0, we can find a natural
number N such that Ig(x) - fn(x)I < e/2 for all x in D, whenever n > N. But then If(x) - fm(x)I < I fn (x) - g(x) I + Ig(x) - fm(x)I < e for all x in D, whenever both m and n are > N. Thus if,, } satisfies our condition. Given a sequence {uj} in D we may construct another sequence in this set by letting fl(x) = u1(x)
f2 (x)
= ul (x) + U2 (-T)
f.+ (x)
j=1
When {f,,) converges to a function uo uniformly over D we say that the series 00
E uj(x) j=1 00
is uniformly convergent to uo(x) over D. We sometimes write uo (x) = E u)(x). j=1
Lemma 1 has an important consequence:
Corollary 1. (Weierstrass M-test) Let {uj} be a sequence of functions in D and let {lbfj} be a sequence of positive numbers. Suppose that
(i) Iuj(x)I < Mj for all x E D, and each fixed j; 00
(ii) E Mj is convergent. J-1 m
Then the series E uj (x) converges uniformly over D. j=1
Proof. Because EMj is convergent we can, given E > 0, choose a natural number
N such that m
n
j=1
j=1
I>Mj->M,I<e
34
Chapter One
Classical Harmonic Analysis
whenever both m and n exceed N. Thus if m > n, m
E Mj < e j=n+1
fn(x) = whenever n > N. Now, for each n, let
n
E uj (x). j=1
Then, when m > n,
m
n
Ifm(x) - fn(x)I = I >uj(x) - Euj(x)I j=1
j=1
m
= I E uj(x)I j+1
m
E Iuj(x)I j=n+1
M
E Mj j=n+1
by (i). Combining this with our first observation we see that
I fo(x) - f,n(x)I < E when m > n > N. By Lemma 1 the sequence {fn}, and hence the series Euj(x), converges uniformly over D.
There are theorems that tell us when a certain property that is shared by each term of a convergent sequence of functions is also shared by the limit of this sequence. The main applications of these results, for us, are to series of functions that are defined on some interval of R and converge uniformly over this interval. So let [a, b ] C R, let uj : [a, b] - K f o r each j = 1, 2, ... , and, for
=
each rl, let fo(x)
n
1: uj(x). We shall assume that {fn} converges to uo(x) j=1
uniformly over [a, b].
Lemma 2. If xo E [a, b] and each u j (x) is continuous at this point, then uo is also continuous at this point. In particular, if each uj is continuous on [a, b], then uo is continuous on this interval.
Proof. First suppose that xo E (a, b). We may write Iu0(x) - u0(x0)I :5 Iuo(x) - fn(x)I + Ifn(x) - fn(x0)1 + lfn(xo) - u0(x0)I Now suppose that e > 0 is given. We first choose, and fix, fn so that
Iuo(x) - fn(x)I < 6
4. Uniform Convergence
35
for all x in [a, b]. For this n we use the continuity of f at xo to choose 6 > 0 such that I fn(x) - fn(x0)1 < 3
whenever Ix - xoI < 6. It then follows that Iuo(x) - uo(x0)I < E
whenever Ix - xoI < 6, which shows that uo is continuous at x0.
The cases x0 = a and x0 = b are left to the reader.
Lemma 3. If each of the functions uj(x) is continuous on [a, b], then
jb
=
jb
dx =
u3(x) j=1
fb j=1
Proof. By lemma 2 the function uo(x) is continuous on [a,b], and hence its Riemann integral exists. Now
fl
rb I
Ja
//b
uo(x)dXl
fn(x)d, - J
< J I f (x) - uo(x)Idx
a
a
and because {f n } converges to uo uniformly over (a, b) we can, given E > 0, choose a natural number N so that
I MX) - uo(x)I <
E
b
a
for all x E [a, b], whenever n exceeds N. Thus b
I fa f(x)dx -
fb
b
=I 1
fa
-
fb
whenever n > N. This proves the lemma.
One cannot, in general, find the derivative of uo(x) by differentiating the series Eu,(x) term-by-term. In fact, uo(x) may not have a derivative even when every uj(x) does (see below). However, we do have the following:
Lemma 4. Suppose that each of the functions u, (x) is continuous on [a, b], and has a continuous derivative at every point of (a, b). Suppose further that the series u, (x) j=1
36
Chapter One
Classical Harmonic Analysis
converges to a function v(x) uniformly over (a, b). Then uo(x) is differentiable on this (open) interval and U0, (X) = v(x)
at each x E (a, b).
Proof. By our assumptions each of the functions
fn (x) _ >ui(x) j=1
is continuous on [a, b] and has a continuous derivative on (a, b). Moreover, the sequence If,, (x)} converges to v(x) uniformly over (a, b). It follows, from lemma
2, that v(x) is continuous on this interval. Now given x E (a, b) we can choose a number x0 < x such that [xo, x] C (a, b)Then by lemma 3
L v(t)dt = li1III fn(t)dt = li{f(x)
- f(xo)} = uo(x) - uo(xo)
a
Thus uo(x) = z Z v(t)dt + uo(xo) o
It follows that uo is differentiable at x and that
A uniformly convergent series of even such well-behaved functions as sines and cosines can converge to a function with very strange properties. The following examples illustrate this point. It is said that in a lecture given in 1861, Riemann asserted that the function sinn27rx
R(x) _ n=1
2
is continuous at every x but does not have a derivative at any point. The Mtest shows that the series converges uniformly over R and so the limit function is certainly continuous by lemma 2. When Karl Weierstrass heard about Rie111a1m's claim, he tried to prove that R(x) was nowhere differentiable. He was unable to do so but, in the course of trying he came up with the following set of functions,
W.,&)
a"' cosbn7rx n=1
4. Uniform Convergence
37
and was able to show that when 0 < a < 1. b is an odd integer and ab > 1 + z 7r,
these functions are nowhere differentiable. Again the M-test shows that the series converges uniformly over R when 0 < a < 1; hence, the associated functions are continuous. In 1916, G. H. Hardy improved on this result by showing that the functions Wa,b(X) are nowhere differentiable when 0 < a < 1, b > 1, and ab > 1. Hardy also showed that R(x) has no derivative when x is an irrational number or a rational number of a certain kind. Surprisingly, .1. Gerver proved that R(x) does have a derivative at certain other rational points. This he did in 1969 while he was an undergraduate student at Columbia University.
Exercises 4 For problems 1 and 2, T is the unit circle and C is the set of all points that are inside T. * 1. Suppose that we have a sequence of functions each of which is continuous on C U T and harmonic in G. If this sequence converges uniformly over T, show that it converges uniformly over G U T. * 2.
(a) For any fixed n show that r" (an cos n9 + bn sin nB) is harmonic in G.
(b) For p fixed, 0 < p < 1 show that F _,On, > np", and > n2 p" are convergent.
(c) Suppose that, for some M, lan1 < M and lbnl < Al all n. Set u (r, 0) _ Er" (a,, cos n6 + bn sin n6) for all (r, 0) E G. Show that n=1
u (r, 0) is harmonic in G.
3. Recall that, for a given sequence ao, as,... of real numbers,
is n=0
called a(real) power series. To any series there corresponds an R, 0 < R < oo. such that E a x" converges for all x with Jxi < R and diverges for all x with jxi > R. R is called the radius of the series. (a) For each fixed p. p < R, shows that the series converges uniformly on [-p, pl.
For each x E (-R, R), define f (x) to be Y, anx", and show that f n-0 is continuous on (-R, R). (c) If [a, b] < (-R, R) show that
(b)
fb
r=
f
Ja
dx = n=0
n=0
38
Chapter One
Classical Harmonic Analysis
x (d) Show that > x" converges to
at each point of (-1,1), but that
1
n-0 convergence is not uniform over this interval.
anx" has radius R, R 0 0 and suppose that the series
4. Suppose that n=o
has radius ff. n=1
(a) If jxj < R, choose xo so that jxj < Ixol < R and show that for some constant, call it A, we have janxo I < A for all n. (b) With x and xo as in (a) write na"x »-1
n-I
n
x
XO
x0
n = -anx0 -
and conclude that
Inanxn-1j < A
nrn-1
IxoI
where A is the number found in (a) and r = j xo
(c) Show that
E
A
nrn-1 is convergent and conclude from this that
00
I
II
converges whenever jxl < R. Thus R:5 IT.
(d) Now suppose that R < Ixj < R' and note that Ianxnl =
jnanx"- III ` I < Inanxn II
n
when jx.I < n. Deduce a contradiction and conclude that R' = R. 00 (e)
Show that E a.nx" is differentiable at every x. E (-R, R) and that n=o 00
its derivative is
nanx"-
n=1 00
00
5. Suppose that we have two power series, > a, x" and E bnxn. Suppose n=0
that for some r > 0 we have F, anxn =
that a = b,, for n=0,1,2 .....
n=o
bnxn whenever jxj < r. Show
5. The Formulas of Euler
5.
39
The Formulas of Euler
The boundary data for the DP is a real-valued function that is continuous on the unit circle, on T. Let us denote the set of all such functions by C,.(T). So every element of Cr(T) is a real-valued function that is continuous on T or, equivalently, a real-valued function that is defined on R is continuous and is 27r-periodic. Our discussion thus far has led us to ask if every f E Cr(T) can be "represented" by a trigonometric series. In 1777 Euler made the first real progress toward answering this question. We shall present his results here. Let us suppose that the trigonometric series 00 1
no +
2(an cos nO + bn sin n9)
(1)
n=1
converges to the function f (0) uniformly over R (or, equivalently, over T). We are supposing also that the coefficients in this series are real, so clearly f E Cr(T). Now given e > 0, we can find a natural number N such that 1
sinne)}I<e
if (0)-(2no
(2)
n=1
for all 9, whenever m > N. We may write, for each n,
eine - e-in9
ein9 + e-in8 cos nO
=
sirs B =
2
2i
Putting these expressions into (2) we obtain
I1(9)-{2ao+r(a,:
n=1 \
ib 2n)eine+
J
or, setting co = ' ao, c,, = 'n- ib,. , c_,, =
(an 2ibn)e__sne}I<e
J
n=1 a_ Z4bgj
(3)
for each n = 1, 2, ... .
M
if(0) - E
Cne,nei
<
(4)
for all 0, when m > N.
Definition 1. When D C R, D 54 0, and we have a set of functions uj CO
D --+ K, one for each j E Z, we say that the series
uj(x) converges to the j=-oo function v(z) uniformly over D whenever the sequence fo(x) = uo(x), fi(x) _
u_1(x) + uo(x) + ul(x),..., fn(x) = E' j=_nuj(z),... converges to v(x)
40
Classical Harmonic Analysis
Chapter One
uniformly over D. More precisely, we say that the series converges to v(x) uniformly over D if, given e > 0, we can find N such that n
IV(x) - E uj(x)I < e j=-n
for all x E D, when n > N.
The inequalities (2) and (4) show that the series (1) converges to f (0) uniformly over R if, and only if, the series 00
1: seine
(5)
00
converges to this function uniformly over R.
We began by assuming that (1) converged to f(9) uniformly over R, and we have just seen that this is equivalent to assuming that (5) converges to f (0) uniformly over R. The results we want can be obtained, very easily, from the latter series. Choose, and fix, an integer p. Then the series
(
00
eneine') a-iPe
=E
-00
-00
l Cnei(n-P)e
converges to f (8)e-' uniformly over R; equivalently, uniformly over T. Thus by lemma 4.3 we have
TV
f (B)e-iped9 = I a 0o
(tce'("-P)O
dB
J
rx
cn J/a+(n-P)ed9 W
00
It is easy to evaluate these integrals. Clearly,
f
e=(n-p)ed0
=0
x
when n q6 p, and when n = p, the integral is equal to 27r. Thus our last infinite series reduces to a single term: 21rcc. And this gives
cP = 2- J
f
all p E Z
(6)
5. The Formulas of Euler
41
Thus the coefficients in the series (5) can be computed from f (0).
The coefficients in the series (1) can also be computed from this func-
tion. We have ao = 2co, and a little algebra gives an = c + c_n, b _ i(cn - c_n), it = 1,2..... Hence (6) gives us
a bn =
R I
1
1 IJ
f (0) cos nOd9,
n = 0,1, 2,...
(7)
f(0) sin nOdO,
it = 1,2, ...
(8)
and here we see why it is customary to put the 1 before the ao. Without this factor the formula (7) would not be valid when it = 0. Given a function f E Cr(T), the numbers obtained from f using formulas (7) and (8) are called the (real) Fourier coefficients of f. We sometimes denote them by an(f ), bn (f ). The trigonometric series (1) whose coefficients are the Fourier coefficients of f is called the (real) Fourier series of f. The numbers obtained from f by using (6) are called the complex Fourier coefficients of f and are sometimes written cn (f ). The series (5) is called the complex Fourier series of f when its coefficients are the numbers c (f ). We may now ask whether the Fourier series of a given function converges to that function. This is a tricky matter. The discussion of these series is intricate and technically challenging, and it is not even possible to state many of the results unless we first discuss the Lebesgue integral. One of the major reasons for interest in that integral is the profound impact it had on the study of Fourier series. In the next few sections we discuss the convergence of the Fourier series of
a function in Cr(T) and use the results we obtain to solve the DP. A discussion of these series for more general functions is given in the next chapter. We end this section with an important theorem due to Riemann.
Theorem 1. For any f E Cr (T), lim c (f) = 0. Ink moo
Proof. Given f E Cr(T) we may write
-en(f) = e"en(f) =
1
27r
ff Ir
(t)e-`n<<
n idt
A
In this integral we set u = t - M to obtain
-Cn(f) = 27r 1
1 f (u + n" )e-fnudn = 27r
where we have used problem 3 of Exercises 2. Now
2cn(f) = Cn(f) - [-Cn(f)]
* f (u + 7 )e'"du n
Chapter One
42
Classical Harmonic Analysis
and so we have 7r
2cn(f) =2a 1
J
(f (t) - f(t+ n
i)}C-inter
From this we find that 1
21
f
7F
If(t)-f(t+n)Ile-intldt<_maxAlf(t)-f(t+-)I
A
Now f E C(T), and so this function is uniformly continuous on R (Exercises 2, problem 7). Thus, given e > 0 we can choose a natural number N such that
I f(t) - f (t + n) I < 2e whenever Inj > N, and this holds for any t E [--ir, it]. It follows that Icn(f)I _< e
whenever Inj > N, and this proves our theorem.
an(f) = 0 and Corollary 1. For any function f E C,.(T) we have lim 00 lim bn(f) =0. n-oo 71
Proof.
Because an(f) = cn(f) + c-n(f) and bn(f) = i{cn(f) - c-n(f)} the
theorem shows that lim an(f) = 0 = It-00 lim bn(f ). Exercises 5 1. Define f (x) = x2 for -it < x < it, and f (x+21r) = f (x) for all x. Clearly, f E Cr(T) (a)
Compute the real and the complex Fourier coefficients of f and show
that the real and complex Fourier series of this function converge uniformly over R. (b) Assuming that the real Fourier series of f converges to this function, 1 00 ao (-1)n+1 and (ii) compu te t he num bers (i ) n2 .
E n2
n=1
E
n=1
* 2. For f E C,(T) show that the Fourier coefficients of f are bounded; i.e., show that there is a real number µ' such that Ian(f )I < µ' and Ibn(f )l < µ' for all n. 3. If f E Cr(T) has Fourier coefficients {cn} and if f' exists and is also in Cr(T), show that lim ncn = 0. If f" also exists and is in Cr(T), show In]-.oo
that
lim n2cn = 0.
lnI-oo
6. Ceshro Convergence
6.
43
Cesaro Convergence
A modified form of convergence was introduced into mathematics by the Italian mathematician Ernesto Ceshro. This notion seems, at first glance at least, somewhat contrived and, perhaps, artificial. A major result of Leopold Fejer (next section) shows, however, that this type of convergence is the natural one to use in connection with the Fourier series of a function in C,(T).
Definition 1. Given a sequence ao, ai, a2, ... in K the numbers ai = a3 = ao+ 3 +a, ... are called the Ceshro means of this ao a2 = a 2a o is Ceshro convergent to the number a if
sequence. We shall say that limn-op an = a.
One can easily show that a convergent sequence is Ceshro convergent to its (ordinary) limit. There are, however, sequences that do not converge in the ordinary sense that are convergent in this new sense (see Exercises 6, problems 1 and 2). So Ceshro convergence "contains" and extends the ordinary notion of convergence. As one might expect, a series is said to be Ceshro convergent (or, sometimes,
Ceshro summable) to the number s' if its sequence of partial sums is Ceshro convergent to s'. This can get a bit messy. The mth Ceshro mean, call it am, 00
for the series E ak is -00
am =
F_0=0 ak
+ Ek=- l ak +
Ek-_2 ak
...
r
1(m-i) ak
(1)
In
We shall now derive an expression for the mth Ceshro mean of the complex Fourier series of a function f E C,(T). It turns out that there is a nice, compact, expression for this mean in terms of an integral. Given f E C,(T) the nth partial sum of its complex Fourier series is n
n
sn(xi f) =
k
F
Ck(J
k=-n f"
2ir," 1
)eikr =
eikz [±.
k-n
p y)
rf(y)e-'kdy] J-
(2)
ik(=-Y)] dy
=-n
J
The m th Ceshro mean of this series is
+... + sm-1(xif) am(xi f) = s0(xif) +si(x;f)m
(3)
Combining this with (2) gives us *
am(x;f) =
27r
, f(y)
rLk=o + 0 Qik(z-v) + ... + +
'n-1 k--(+n-i)
eik(x-v)1 J dy
(4)
44
Chapter One
Classical Harmonic Analysis
Observe that the integrand contains the mth Cesi),ro mean of the series 00
eik(x-y)
(5)
k=-oo
If we denote this mean by K,. (x - y), then (4) becomes IX
am (x; f) = 2a
f (y)Km (x - y)dy,
m = 1, 2, 3 ...
(6)
We shall use this expression in the next section.
The functions K.(x), m = 1, 2, ... , are the Ceshro means of the series eikx. These functions have some useful properties:
oo
Lemma 1. For each fixed m we have (1) hrn(x) =
I
in
r --Cams) 1 --Cams) _ 1
-CM x
r sm mx 2
1
m ` ain
x 1
(ii) Kn,(x) > O for all x (iii)
2n
j Km(x)dx = 1
Furthermore, if I is any open interval containing zero, then Gm sup{K,n(x) I x E (-7r, 1r) \ I} = 0 m-0c
Proof. Let its set sn(x) _ Kn+l (x) =
e'kx so that we may write 8n(x)
+
8n-1(x) + ... + 81(x) + 80(x)
n+1
and
K,,(x) =
8n-1(x) + sn-2(x) + ... + 81(x) + 80(x)
n
then
n
n
n
-n
0
1
(n + 1)Kn+1(x) - nKn(x) = sn(x) = E eikx = [1 eik= + E e-ikx
(1)
These last two series are (finite) geometric series and can be summed to give 1 - e`(n+1)x
1 -e'x
6. Cesiiro Convergence.
45
and
11 - e-'= respectively. Adding these expressions and doing a little algebra gives us
nK.(x) = cosnx - cos(n + 1)x
(n +
(2)
1 - cos x
Now Kj(x) = 1 for all x. and putting this into (2) we find K2(x). Putting K2(x) into (2) gives us K3(x), and so on. In this way we find that
(1 - cos mx ) 1 -coSx
1
K. (S) = m
which is the first expression in (i). Using our knowledge of the trigonometric functions we can transform this into 1
(sinInix)
fit
Sin 2x
which is the second expression in (i). Note that this proves (ii) as well.
To prove (iii) we consider the function that is 1 for all x. The Fourier coefficients of this function are Ck
rn
=
27r J
1e-'x=da.
for k E Z. Clearly, Ck = 0 when k 54 0, and co = 1. Thus the Fourier series of this function consists of a single term, which is 1. Now when we compute the mth Cesilro mean of this series, the 1 occurs m times in the numerator. But we also divide by m. Hence am (x; f = 1) = 1
for all x and each m = 1, 2, .... We now refer to equation (6) above. For the function we are treating here this equation becomes 1
1=
27r
f 1 . Km(x - y)dy
This last expression is true for any x, so we may take x = 0. Finally, because K,,,(-y) = Km(y) by (i), we have proved (iii). Now suppose that 0 < 6 < 7r and 6 < IxI < a. Then 2x and 16 are both in the interval where the sine function is monotonically increasing. Hence 1
2
1
(sin 26 I < (sin 2x I
2
46
Chapter One
Classical Harmonic Analysis
Given an open interval I that contains zero, we may choose b > 0 so that
(-5.6) C I. Then sup{Kn(x) I x E (-r, 7r; \ I} < sup{Kn(x) 16 < IxI < rr}
But this last term is less than or equal to sup
1
n
sin nx
( sin 2x )
2
1
< n1 sin Z6I2
Letting n -+ oc in this last expression proves our claim.
Exercises 6 1. Show that the sequence {(-1)n+1}0O
which is clearly divergent, is
Cesaro convergent zero.
*2. If the sequence {an}n°_1 is convergent to b, show that it is also Cesaro convergent to b.
* 3. We call {km}n°_1 Fejer's kernal. Show that Km(x) = > (I - IkJ )e-iks IkI<m
m
00
4. Suppose that the series E an is Cesaro convergent to b. Suppose further n=1
that the sequence {nan}O°__1 is bounded. Show that the original series must converge to b. Results of this kind (i.e., results that give conditions under which a series that converges in some general sense must converge in the ordinary sense) are called Tauberian theorems. 5. Let {an} and {bn} be two sequences of real numbers. Suppose that (a)
i.
ii.
a,, > an+i for all n; lim an = 0; n -:J0
iii. The series E bn has bounded partial sums.
(b) Show that E anbn is convergent. Hint: Let so = bo, Si = bo + n
bl,...,sn =>bj. Write aobo+albs +... as aoso+a1(sl - so)+ j=o a2(s2 - 81) + ...,
(c) Show that anbn is convergent even if the bn's are complex, just as long as the partial sums of E 6n are bounded.
7.
47
Fejer's Theorem
00 in
oc
(d) Show that
- is convergent. More generally, show that n n=1 converges when IzI = 1 and z $ 1.
n=1
Zn
n
(e) If we take bn = (-1)n, show that this test, which is due to the Norwegian mathematician Niels Abel, gives us the familiar Leibnitz test for alternating series.
7.
Fejer's Theorem
Scientists who are engaged in SETI (the search for extraterrestrial intelligence) call it the Fermi paradox. It first arose during a lunchtime conversation among
the physicists Enrico Fermi, Emil Konopinski, Edward Teller, and Herbert York, which took place at Los Alamos sometime in the year 1950. They were speculating about when we could expect to achieve faster-than-light travel (as-
suming this is possible). Each man gave his opinion, and the talk turned for awhile to scientific subjects of more immediate interest, when Fermi suddenly asked "So where is everybody?" It was an abrupt change of subject, but everyone knew what he meant. Given the age of the universe and the widely held belief that technically sophisticated civilizations are fairly common, even the most pessimistic estimate of when faster-than-light travel is achieved implied that visits from "elsewhere" should be. not frequent perhaps, but not really uncommon either. Yet we have had none! This question has been discussed by scientists ever since, but with no general agreement on an answer. It is said that when the Hungarian physicist Leo Szilard heard of it he quipped, "They are among us, but they call themselves Hungarians." Szilard was famous for his off-beat sense of humor and, no doubt, would claim that his facetious remark also explained why the tiny country of Hungary has produced so many outstanding mathematicians. We shall discuss the work of one of them, Leopold Fejer. here. We shall then use his result to complete our solution to the DP.
Theorem 1. The Ceshro means of the Fourier series of any real-valued function that is continuous on the unit circle converge to that function uniformly over the circle.
Proof. Let f E Cr(T) and let an(x) denote the nth Cesaro mean of its (complex) Fourier series. Given e > 0, we shall show that there is a natural number N such that
Ian(x)- f(x)I <e
48
Chapter One
Classical Harmonic Analysis
for all x E T, whenever n > N. We shall use formula (6) and Lemma 1 of Section 6. From (iii) of that lemma we may write, for any fixed x, 7r
a (x) - f (x) =
2Jr
f
f(t)K, (x - t)dt -
1J KK(t)f (x)dt 2Tr
In the first of these integrals we set u = x - t and use the 21r-periodicity of the integrand to write
f (t)K1(x - t)dt = - r
f (x -
J 'x f (x -
J -A f (x x
We may now write
f(x) =
2n
f{f(x - t) - f(x)}Kn(t)dt
From this last expression we obtain
Ion(X) -f(x)I =
21r r_,6
2R
If(x-t)-f(x)IK.(t)dt
,c
I f (x - t) - f (x) I KK(t)dt -I
1 27r
"I-
(1)
If (x - t) - f (x)IK,(t)dt
where tS > 0 is to be determined. The first of these integrals can be estimated as follows: 1 2r J
6
If(x-t)-f(x)IKK(t)dt
< sup
-6
<
I f (x - t) - f (x)I 2a1
(2)
6
1 6
sup I f (x - t) - f (x) I 1 2a
-6
J x
K (t)dt = sup I f (x - t) - f (x) I -6
where our change in the limits of integration is justified by (ii) of Lemma 1 of Section 6 and we have also used (iii) of the lemma. Let us now estimate the second integral appearing in (1). We have 1
2;r
Jts>6 2
If (x - t) - f(x)IK,,(t)dt
1
f max If(x)I]
2'Jr L-A G=GT
I016
(3)
7.
Fejer's Theorem
49
Now suppose c > 0 is given. We use the uniform continuity of f to find 6 > 0 such that
If(x - t) - f(x)I < 2 whenever t E [-6, 6] and x E [-7r, 7r]. With 6 now fixed we see by (2) that the first integral on the right-hand side of (1) is less than 2. To finish the proof, we use the last statement in Lemma 1 of Section 6. It tells us that we can choose a natural number N so that
sun K (t) <
ltl>6
n
E
4_max7rIf(t)I
whenever n > N. Combining this with (3) we see that the second integral on the right-hand side of (1) is less than 2 whenever n > N.
Definition 1. A sequence { fn} of functions defined on a set D is said to converge pointwise on D if for each x E D the sequence of numbers { fn(x) I is convergent. A series Euj(x) of functions defined in D is pointwise convergent in D if its sequence of partial sums converges pointwise on D.
If f E Cr(T) and if the Fourier series of f converges at 80 E T, then its limit must be f (80). This follows from Fejer's theorem. The partial sums of the Fourier series, when 8 = 80, converge and so their Cesaro means must converge to the (ordinary) limit. But we know that these means converge to f (8o), and this proves our claim. We have shown the following:
Corollary 1. If f E Cr(T) and if the Fourier series of f converges at any point 8o E T, then this series converges to f (80).
Remark 1. We have seen that the Fourier series for the function f (x) = x2, -7r < x < zr, f (x + 27r) = f (x) all x, converges uniformly over T. Because uniform convergence over T clearly implies pointwise convergence on T, we now
know that the limit of this series is f (x). This establishes the results found in Exercises 5, problem 1. Let n1, n2, ..., nt be a finite set of integers, some of which may be negative, and let cl, c2, . . ., ct be complex numbers. The function
t T(8) _ E CieinjO j=1=1
50
Chapter One
Classical Harmonic Analysis
is called a trigonometric polynomial. It is clear that r(O) is in C(T). Moreover, the Cesi ro means of any f E Cr(T) are trigonometric polynomials. Thus:
Corollary 2. Given any f E Cr(T) and any E > 0 there is a trigonometric polynomial r, (O) such that If(B) - r;(0)1 < E for all 0 E T.
Corollary 3. (Weierstrass Approximation Theorem) A continuous real-valued function defined on a closed bounded interval can be approximated, as closely as we wish, on that interval by a polynomial.
Proof Let a, b be real numbers with a < b, and let f be a real-valued function that is defined and continuous on [a. b). Given E > 0 we shall construct a polynomial Pr(x) such that If (x) - P,(x)I < e for all x E [a, b]. Let us suppose, first, that [a, b] C (-ir,a). Then we may extend f, in any convenient way, to a function f that is continuous on [-7r, 7r] and satisfies f (-ir) = 1(7r). Then, by Corollary 2, we can find a trigonometric polynomial r(x) such that I f (x) - r(x)I < 2 for all x E [-7r, ir]. Now r(x) can
be written as
cje'",=, where the cg's are complex numbers, the nj's are j=1
integers, and t is a natural number. For each fixed j we may expand the function e'n-,x in a MacLaurin series that converges to this function uniformly over [-a, xr]. Hence for each j = 1, 2, ... , e we can choose a natural number kj such k;
that max
ICjezn,x
-w<x<w
- cj
I :n
jx)k
E
< 2 f. Thus the function cjem x can be
k=0
(i
approximated. to within ', by a polynomial Pj(x) = cj k=0
[-7r, in. It follows that
Pj(x)I <
_ max<w I f(x) j=1
I
I
I
max 1/(x) - > cjein,xI + -w<x<w max I E cjein,r - E Pj(x)I -w<x<w j=1
+ 2
j=1
- -
j=1
Pj(x)I <
j=1
2
+
2f
e=E
kx)k
over
7.
Fejcr's Theorem
51
Now any finite sum of polynomials is a polynomial, so we may set PE(x) = I
E Po(x), and the proof is complete in this case.
i-i If [a, b] is not contained in (-ir, ir) we first define p(x) = (b-a)x+a and note that this function is continuous and maps [0, 1] onto [a, b]. The inverse (Chapter
Zero, Exercises 2, problem 2b) of gyp, p-'(x) _, is also continuous and, of course, maps [a, b] onto [0, 1]. Now f o pp is a continuous function on [0, 1] and so by the first part of the
proof we can, given e > 0, find a polynomial P(x) such that max I (f 0 ;p O<x
P)(x)l < E. It follows then that max 1(f -
Poop-')xI = ma<x 1(f o y - P) o
cp't (x)I < omax 1(f op - P)(x)I < e. Because P o p-' is clearly a polynomial, z
we are done.
Remark 2. It is customary to define harmonic functions to be real-valued. and consequently the boundary data for the DP must also be real-valued. This is why we restricted our attention to functions in Cr(T). It is clear, however, that if f is a continuous complex-valued function on T. we may use formula
(6) of Section 5 to define numbers c11(f) and formulas (7) and (8) of that section to define numbers a (f) and b (f ). These numbers are called the Fourier coefficients of f . The series F_ c (f )e'ne is called the complex form of the Fourier series for f , and the series 1 as (f) + E (a (f) cos n9 + b (f) sin n9) is called the real form of the Fourier series for this function. This terminology, although a little misleading, should not cause any problems. Observe that from Exercises 2, problem 6c, we must have lim c (f) = 0, lim an (f) = IIlim00bn (f) = 0. 11--00 Inj00 Furthermore, for every e > 0, there is a trigonometric polynomial r,(9) such
that 1f(0)-rr(9)I <e forall9ET.
Exercises 7
r 1. Given any f E Cr(T), show that there is a sequence of trigonometric polynomials rn that converges to f uniformly over T. * 2.
(a) Show that the complex Fourier series of any trigonometric polynomial r(9) is r(9).
(b) If r(ip) _ E cue4'j''. find the real Fourier series of r(cp).
Chapter One
52
Classical Harmonic Analysis
* 3. Show that -1 + 2Re
[ 1 _re,-,
is equal to 1+--2r (Chapter Zero,
Exercises 4, problem 6).
4. Let f (x) = lxi for -ir < x < a and let f (x + 2a) = f (x) for all x. Then fECr(T) (a) Compute the real Fourier series of f (x). (b) Discuss the convergence of this Fourier series and find its limit.
5. Suppose that f E Cr(T) and that f and f" both exist and are in Cr(T). Show that the Fourier series for f converges to this function uniformly over 7'.
* 6. If f E Cr(T) and if every Fourier coefficient of f is zero, show that f is identically zero on T. 7. If f is a real-valued function that is defined and continuous on [a, b], show
that there is a sequence {P"(x)} of polynomials that converges to f(x) uniformly over this interval.
8. Let f be a real-valued function that is defined and continuous on 10, 1]. 1
The moments of f are the numbers
J0
f (x)x"dx, where n = 0,1, 2, ... .
If two functions have the same moments, show that they are identical.
9. If f is a continuous real-valued function on [0, 1], we set B"(f,x) = n
(k) f (iX'(1 n k=o
x)"_k
for each n = 1, 2, .... These are the Bern-
stein polynomials of f . It can be shown that they converge to f uniformly over [0.1].
8.
At Last the Solution
Let us now solve the DP for the unit circle T with boundary data f. The real Fourier coefficients of f, {a"}, {b"}, are bounded (Exercises 5, problem 2) and so the function 1
00
u(r,8) = 2ao+F, r"(a"cosn9+b"sin nO) "=1
(1)
8. At Last the Solution
53
is harmonic in G (recall that G is the interior of T). This was shown in Exercise 4, problem 2. Using formulas (7) and (8) of Section 5 we may write
llJ
u(r, 0) = 9 + E r" (an cos nO + 6n sin n9) _ n=1
(J2fl
ao
2*
f (p) cos npdcp) cos nO
rn {
f (cp)dcp +
+ (j2"fcpsinncodw)sinno}
j2w
00 /=
f (cp)dcp + E J
2
f (cp)r" cos n(cp - O)dtp
n=1 0
The series (1) converges uniformly over any closed subdisk of G (i.e., when
r < ro where ro < 1 is fixed), and so by Lemma 3 of Section 4 this last sum can be written as follows:
12,, u(r, B)
f (cp)
27r
11+2
r" cos n(cp
9)
dp
n=1
In the series inside the integral we set cp - 0 = w and note that 00
00
1+2F, r"COSnw=-1+2>r"COSnw= n=0
n=1
[rneu'i] = -1 + 2Re I
- 1 + 2Re
1 - re"'' 1
1 - r2
1+r2-2rcosw where Re denotes the real part (Chapter Zero, Exercises 4, problem 6) of the expression in brackets and we have used problem 3 of Exercises 7. Thus 1
u(r, 8) =
27r
ff 2,
1 - r2
(cp)1
+ r2 - 2r cos(,p - B)
(2)
dcp
and we have shown that when f E Cr(T), the function defined by this integral is harmonic in G. We now want to do two things. First, we want to show that the series (1) solves the DP; i.e., we want to show that u(1, 6) = f (9) on T. Second, we want to show that (2) solves the DP in the sense that 1
f (B) = u(1, 6) = rlim
212,
1 - r2
f (cP)1
+ r2 - 2r cos(cp - B)
dcp
Chapter One
54
Classical Harmonic Analysis
This gives us a nice, compact, representation of the solution: We shall do this in two steps. Suppose, temporarily, that the Fourier coefficients of f are such that the series 0C
-gaol + F(Ia, l + 1b.,I)
(3)
n=1
is convergent. Then the Fourier series for f converges uniformly over T, and so its limit must be f (9) (Section 7, Corollary 1). Thus the series (1) converges uniformly over G U T to a function u(r, 0) that is continuous on this set and, moreover, u(1, 0) = f (0). We have already seen that u(r, 9) is harmonic in G, and so under the assumption (3), the function u(r, 0) given by (1) is the solution to the DP. In particular, we must have lim u(r,0) = f(0)
r-.1-
and because in C, u(r, 0) is given by (2), we must also have
_z
zn
rl-I
2T f f
12r cos(cp - 0)
r2
dip = f (O)
So we have proved the two things we wanted to prove, but only for functions f E CT(T) that satisfy (3). Now suppose that our boundary data are merely continuous on T. We may that converges to f (0) choose a sequence of trigonometric polynomials uniformly over T (Section 7. Corollary 2). Now each r,,(9) certainly satisfies (3), by problem 2 of Exercise 7, and so for each n the function 1
u,, (r, 9) =
2r
Jzx
T,(() 1+r
1-r2 z - 2r
0) dW
is harmonic in G, continuous on G U T, and equal to rn(0) on T. It follows, because limrn(0) = f(9) uniformly over T (Exercises 4, problem 1) that the sequence {un (r, 0)) converges uniformly over G U T to a function u(r, 0) that is continuous on this set and satisfies u(1, 0) = f (0) on T. But clearly, in G,
_z
11, f(g) u(r, 9) = 27r
1
+ rz
?2r
0) d`P
(4)
because {r } converges to f (0) uniformly over (0, 2r]. Now u is continuous on C U T, u = f on T. and u is given by (4) in G, hence
a
lim 1 fu f ()1 + rz 1lr cos ('p- 9) d = f (0) for each 0 E T. r-.I-2r
8. At Last the Solution
55
Exercises 8 1. Equation (4) is called Poisson's integral. The family of functions Pr(9) _
r+r2ro, 0 < r < 1 is called Poisson's kernel. One should compare (4) with equation (6) of Section 6. See also Lemma 1 of that section. Show
that (a) Pr(O) > 0 and Pr is continuous on T for all r E [0, 1].
(b) 2r f f r Pr(6)dO = 1 for 0 < r < 1. (c) If 0 < b < 7r, then Um sup IPr(9)I = 0 r--1 X01>6
-
Pr(O) Hint: If 6 < 18 < 7r, then Pr( O) < I
' -2'
2. Verify that Pr(9) is harmonic in G.
3. The formula (4) enables us to write down the solution to the DP for a circle of radius R centered at the origin. If f is continuous on the circumference of this circle, then setting r = k and s = Rye gives 1
u(r, B)
27rR
R2-r2
2''
f (s) R2 + r2 - 2Rr co.s(p - 9) dg
Notes 1. The work of D'Alembert, Euler. and Bernoulli on the vibrating string problem is discussed in Trigonometric Series- A Survey, by R. L. Jeffery (University of Toronto Press, Toronto, 1965).
2. A discussion of Riemann's function, and the mysteries surrounding its origin, can be found in P. L. Butzer and E. L. Stark, "Riemann's example of a continuous nondifferentiable function in the light of two letters (1865) of Christoffel to Prym," Bull. Soc. Math.Belg. 38 (1986), pp. 45 -73.
J. Gerver published two papers about this function: "The differentiability of the Riemann function at certain rational multiples of 7r," Amer. J. Math. 92 (1970), pp. 33-55. "More on the differentiability of the Riemann function," Amer. J. Math. 93 (1970), pp. 33-41.
3. The Fermi paradox is the subject of a book, Where Is Everybody?, by Stephen Webb (Copernicus Books in association with Praxis Publishing, Ltd., 2002). See especially pages 17-18 and 28-29.
Chapter Two
Extensions of the Classical Theory
Fourier series are used to solve many problems arising in engineering and in the physical sciences. The functions encountered here are generally real-valued and defined only on some interval of finite length. They are not periodic and often have several points of discontinuity. To discuss the mathematics behind these applications, we must reformulate the definitions given in the last chapter to, somehow, include more general functions.
It was the French physicist Fourier who stated, in 1807, that an arbitrary function could be represented by a trigonometric series provided the coefficients
in that series were computed from the Euler formulas. He offered no proof of this, but he used these series extensively to solve physical problems and his work brought harmonic analysis to the attention of his colleagues. This is why these series now bear his name.
1.
Functions on (-ir, 7r)
The Euler formulas imply that any function we use them on must be integrable.
We are supposing that the reader is familiar with the Riemann integral and most, perhaps all, functions arising in applications are integrable in this sense. So throughout this chapter "function" means "Riemann integrable function."
57
58
Chapter Two
Extensions of the Classical Theory
Definition 1. Let the function f be defined on (-7r, 7r). Then the Fourier coefficients of f are
a_
1
/"`
f(x)cosnxdx, n=0,1,2,...
(1)
f (x) sin nx dx , n = 1, 2,3....
(2)
7r
bn = 1
r'T
J
n
The (real) Fourier series of f is 2 ao + > (an cos nx + bn sin nx)
(3)
1
It is customary to write
x f (x) - : ao +
(an cos nx + bn sin nx) n=1
to indicate that the series is the Fourier series of the given function. i.e., that the an's and were computed from f (x) using the Euler formulas. We can define a new function, j (x), by setting
j (x) = 7 ao +
(an cos nx + bn sin nx) n=I
at every x at which the series is convergent. The hope, of course, is that AX) equals f (x) for each x E (- 7r, ir). Now if the series does converge, pointwise. on (-s, a), then because sine and cosine are 2r,-periodic, the series converges everywhere except, perhaps, at odd multiples of it. Thus f is defined at all points of R except, perhaps, at the points mentioned. Moreover, f is 27r-periodic, meaning f (x + 2;r) = j (x) all x ¢ {n7r I n E Z. n odd}. Thus, unless f is 2, periodic, the best we can hope for is that f (x) = j (x) in (-7r,-,r) or [-it, 7r], and the following examples show this clearly.
Example 1 Consider the function f (x) = x2 on (-7r, r). We have seen that
f(x)- 3 +4(n2)ncosnr
1. Functions on (-7r, 7r)
59
and the series converges, uniformly in fact, for all x (Chapter One, Exercises 5, problem 1). Thus
f(x) = 3 +4 (n2)n cosnx is a 21r-periodic function on R, a function in C(T). It is tempting to say that the series converges to x2, but this is true only on 1-7r, 7r] (as we saw in Chapter One, Section 7, Remark 1, this particular series converges to x2 at both 7r and -7r). If we set x = 27r, the series converges to f (27r) = f (0 + 27r) = j(0) = 0 and not to f (27r) = 47r2. The graph of f (x) = x2 is well-known; the graph of f is shown below:
Example 2 Let f (x) = x for all x E (-7r, 7r). Then
f (x) - 2
00
n=1
n
sin nx
as is easily seen. We shall see that this series actually converges at each point of (-7r, 7r) and that its limit, I (x), is equal to f (x) on this interval. But it is clear that the series converges at every odd multiple of 7r as well and that its limit is zero. Thus f (nir) = 0 for all odd n (also for even n in this case). The graph of f is shown below:
The dots are on the x-axis at ±7r, ±37r, ±57r, etc.
60
Chapter Two
Extensions of the Classical Theory
aa
Recall that a function f defined on (-a, a) is said to be an even function if f (-x) = f (x) for all x in this interval. Itris easy to see that, in this case,
f(x)dx=2Jf(x)dx
a
0
A function g(x) on (-a, a) is said to be an odd function if g(-x) = -g(x) for all x in this interval. Again one can easily see that, in this case, jg(x)dx = 0
an
Now when f is an even function on (-r, , 7r), then f (x) cos nx is also an even function for all n = 0, 1, 2,- ... Hence.
an =
r
7r l
f (x) cos nx dx = 2 JA f (x) cos nx dx, 7
,r
for each n.
o
Also, because f (x) sin nx is an odd function for all n = 1, 2, ... , b,+ =
1
r
for each n
f (x) sin nx dx = 0
fir
Similarly, when g(x) is an odd function on (-7r, ir) we have
a=
n=0,1,2,---
g(x) cos nx dx = 0, 7r
Ir
and
b=1
J
g(x) sin nx dx =
x
2
ir
n=1,2,.--
9(x) sin nx dx, lo
Examples 1 and 2 illustrate these results. The function f (X) = x2 is even on (-?r. ir). We have seen that its Fourier series contains only even functions, every b,, = 0. and that 7r2
1
Zao = 3 ,
a2 \
soap=2(3
I
r2(-1)^1
and a,= 2 I`
n2
J,
n = 1,2,...
The function f (x) = x is an odd function on (-7r, 7r). Its Fourier series
contains only odd functions, each a = 0 and b,, = 2, ' , n = 1, 2, Exercises 1 1.
(a) Show that the Fourier series of f (x) = x on (-a, a) is 00 (-1)n+l 2 sinnx n=1
n
. .
2. Functions on Other Intervals
61
(b) Show that the Fourier series of f (x) = e' on (-7r, 7r) is
(-1)° I2+1+n2 (cosnx - nsinnx)
2sinh7r
1
7r
n=1
(c) Let f (x) = 1 when -7r < x < 0, and let f (x) = 2 when 0 < x < 7r. 3 + 1 1 - I)1) sin nx. Show that f 2
7r
rt
n=i
(d) Let g(x) = 0 when -7r < x < 0, and let g(x) = sin x when 0 < x < 7r. 2 cco s2ni Show that g - + sin x n=i
2
a function on (-a, a) and let g be an odd function 2. (a) Let f be an even on this interval. Show that a
a
f
fa
f(x)dx
f(x)dx = 2
and
J0
a
g(x)dx = 0
J
a
(b) Show that f g(x) = f (x)g(x) is an odd function. 3. Give f defined on R set fe(x) = f(Z) (-_) and set fe(x) _ f(=)+s -_) Observe that f (x) _ f ,(x) + fo(x). z (a) Show that fo is an odd function and fe is an even function. (b) Compute ff and fe for f (x) = e=.
2.
Functions on Other Intervals
Suppose now that our function is given only on (-a, a). By analogy with what we did in the last section, we would expect that the Fourier series of the given function should converge to a 2a-periodic function. Now the period of sin bx is b T. This is 2a when b = 7r/a. So we would expect our series to be of the form
x 2 ao +
[a cos (nx) + bn sin ( nQx 11
(1)
JJ
Given a function f on (-a, a), we define the Fourier coefficients of f to be the numbers an = a
a
J
f (x) cos (nQ dx,
n = 0,1, 2.
a
and
bn =
1
n
/'
n7rx
J
f (x) sin ( a ) dx, 0
n= 1, 2, 3,
62
Extensions of the Classical Theory
Chapter Two
Having found these numbers we define the Fourier series of f to be (1). As an example, consider f (x) = x2 on (-a, a). We find that a2
f (x) _
3
-f-
4,2
Z)n COs
n
7r2
(n1rx ( a
When a = r we obtain the series found in the last section (Example 1). It is clear that this series converges uniformly over R to a function f that is continuous and 2a-periodic (i.e., 2a is a period of this function). Recall that Fejer's theorem tells us that any function in C(T) is the uniform limit of the Cesb.ro means of its Fourier series. Given a continuous 2a-periodic function f we may regard f as a function on the circle with radius a/7r centered at the origin. A change of variable in Fejer's theorem tells us that any such f is the uniform limit of the Cesbro means of its Fourier series (i.e., the series (1)).
Very often the functions arising in applications are defined only on an interval of the form (0, a). Given f (x) on this interval there are two, among many, natural ways to extend it to (-a, a). We may define fe(x) _
f(x)
for 0 < x < a
if (-x)
for -a < x < 0
This is called the even extension of f. The Fourier series of this function is called the Fourier cosine series for f. We have 00
fe(x)
21
ao+Eancos(nax) n=
where a
a =aa f f(x)cos("X)dx,
n=0,1,2,
0
If this series converges, its limit will be a 2a-periodic function, the periodic extension of fe(x). We may also define
fe(x) _
1f(x)
for 0 < x < a
- f (-x) for -a < x < 0
This is called the odd extension of f and its Fourier series 00
fe(x)
Eb,, sin n=i
(nrrx) a
2. Functions on Other Intervals
63
where 2 b,, =
a,j0
(nax)-
dx,
fe (x) sin
n=1,2,3,
is called the Fourier sine series for f. Again, if the series converges, its limit will be the 2a-periodic extension of ff(x). These series are called the half-range expansions of f (x). Let us give some examples.
Example 1 technically, fo is not Let f (x) = x on (O,ir). Then fa(x) = x on defined at 0, but this does not effect its Fourier series. That series is the one found earlier, specifically,
00 (-1)n+l sin nx
fe(x) ^' 2 n=1
The even extension of f (x), fe(x), coincides with IxI (except, again, at x = 0). Here we have AE(x) ti
- 4a 2
00
cos(2n - 1)x
(2n - 1)2
n=
Example 2 Suppose we modify Example 1 by taking g(x) = x on (0, 1). Then go(x) = x
on (-1,1) and
bn = 2J x sin nirx dx = 1
2 cosnir = n7r
2(-1)n+l
n7r
Thus 2
E
go(x) - n n=1
(_1)n+1
sin nax
n
The even extension, ge(x), coincides with IxI on (-1, 1), giving us 1
ao=21 xdx=1 0
r2
an = 2J xcosnaxdx = 0
r1272
(1- co ma) _ -2(1 ns ( I)n)
0
64
Extensions of the Classical Theory
Chapter Two
and 1
gr(x) '
2[1 - (-l)n]
°r0
- L..
2
n 27r2
COSn7rx
Exercises 2 1.
(a) Let f(x)=-1 when-a<x<0, f(x)=1when0<x
-
f~-4 l( 2n- 1 1)sinlLr(2n -a1)axl 7r
1
J
(b) Let f (x) = x + x2 for -1 < x < 1. Show that 1 °° f ~ 3-1 + 27r- n=1 )(-1)n r712n2 cosnlrx - -sinn7rx n
(c) Let f(x)=0when-2<x<1, f(x)=1when1<x<2.Show that
f~
1
'O
1
n7r
1
`n=1n Lsin 2
4
nirx
nir
nr, x
co 2 + (cos n7r - cos 2) sin 2 ]
2. Here we compute some half-range expansions:
(a) If f (x) = sin x when 0 < x < 7r, then fe (x) = sin x and 2
4
x
cos 2nx
fe(x) ~ 7r- - 7rn_14n2-1 (h) If f (x) = cosx when 0 < x < a, then
4n2-x
fP(x)=cosx
fe(x)~7rz an=14n2-1 (c) If f (x) = es on (0, 7r), then fe(x) ~
and fe(x) ~
e
_
00 1
7r
nsln
2
- - n=1 [1 -
(-1)ne"]
n s+1
(d) if f (x) = A - x on (0, 7r), then fe(x)
...
2
E n-1
sinnx, fe(x) ~ n
1
E [1 - (-1)ne"] n +1 rn=1
7r
2
F
4 r. cos(2n - Z)x 7r
n-1-1
(2n - 1)
3. Functions With Special Properties
65
Functions With Special Properties
3.
The functions that arise in the applications of Fourier analysis to engineering are, as illustrated by the examples given above, usually continuous except at a finite number of points. Moreover, at these exceptional points the functions often have one-sided derivatives (defined below). Here we shall investigate the relevant properties of functions of this type.
Let a, b be real numbers with a < b. A function f that is defined and f (a + h) andh_0+ lim f (b - h), continuous on (a, b) and for which the limits, h lim 0+ both exist (i.e., they are real numbers) can obviously be extended to a function f that is defined and continuous on [a, b]; just take I (x) = f (x) in (a, b) and
set f (a) equal to the first of these limits and f (b) equal to the second one. Because f is continuous on [a, b], it can be approximated by a polynomial on that interval (Chapter One, Section 7, Corollary 3). So it follows that f is bounded on (a, b) and, given e > 0, there is a polynomial P, (x) such that If (X) - PE(x)I <,F
for all x E (a, b).
We can characterize these functions another way. Recall (Chapter Zero. Section 5, Definition 3):
Definition 1. The function g(x) defined on (a. b) is uniformly continuous on that interval if. given e > 0, there is a d > 0 such that I g(x.)-g(y)I < e whenever x, y E (a, b) and Ix - yI < d.
It is easy to see that if g is uniformly continuous on (a, b), lim g(a + h) h-0+ and lim g(b - h) both exist, and conversely, if f is continuous on (a, b) and h
0'
these one-sided limits exist for f , then f is uniformly continuous on (a, b) (see Exercises 3, problem 4).
Definition 2. A function f defined on (a, b) is said to be piecewise continuous, or sectionally continuous, on this interval if there is a finite (perhaps empty) set {xJ}j=1 C (a,b) such that f is discontinuous at each x. and f is uniformly continuous on each of the intervals (x,. xj+i ), j = 0,... , n where x0 = a, b. We shall call {xj } +0 the subdivision of f. Given any y E (a. b) we set
f(y+0)=hlim+f(y+h) and
f(y-0)=hlim04 f(y - h)
Chapter Two s Extensions of the Classical Theory
66
When f is piecewise continuous on (a, b) these limits always exist. For any such function the numbers f(a + O) =
f(a + h) hlnn
and
f(b-O)=hlim f(b-h) also exist.
Theorem 1. If f is piece-wise continuous on (a, b), then lim
rb
a-oo Ja
f (x) sin Ax dx =
0and lim Jf(x)coe3Axdx=O. Proof. It is clearly sufficient to prove this result for a function f that is uniformly continuous on (a, b). Now given e > 0 we must show that b
L for all A > A; i.e., given e > 0 we must show that we can find A such that this inequality is true. Now we may choose a polynomial PP(x) such that
f
bIf(x)-PE(x)Idx<2
for all we need do is choose PP(x) so that E
If (X) - Pe(x)I < 2(b - a)
for all x E (a, b). Clearly then
j
b
b
f(x)cosAxdx
Ja
f (x) cos Ax dx -
Ja
b
PP (x) cos ax dx +
I f Pe (x) cos Ax dx a
jb <_J
a
bIf(x)-Pe(x)IlcosAxIdx+
<-2+
j
b
P(x)as axdx
PP(x)cosAxdx
3. Functions With Special Properties
67
In this last integral we may integrate by parts. We find
I PP (x) cos Ax dx =
PE (x)Asin Ax
- ! r Pa (x) sin Ax dx
and so
fPi(x)cosAxdx
< PE(b)sinAb
P.(a)sinAa
b
1
L
P,\(x)sinAxdx
We can choose Al, so that the first term is less than e/4 when A > Al. The term containing the integral can be estimated as follows:
1
A
1b
Jn
P, (x) sin ax dx < sup IPE(x)l (b
_ a)
a
where we recall that P,(x) is a polynomial; hence, P, (X) is uniformly continuous on (a, b) and, in particular, bounded on that interval. Thus we can find A2 such
that this integral is less than e/4 whenever A > A2. Clearly then jb
f(x)cosAxdx <E
whenever A > max(Al, A2), and the proof is complete. A similar argument shows that b
lim I f (x) sin Ax dx = 0
A-ac a
Corollary 1. The Fourier coefficients of a piecewise continuous function tend to zero as n -+ oo. This generalizes Riemann's theorem (Chapter One, Section 5, Theorem 1).
68
Chapter Two
Extensions of the Classical Theory
We recall that if xo E (a, b), we say that f is differentiable at this point provided lim f (xo + h) - f (xo)
h-0
It
exists. We denote this limit by f'(xo). Now
f(xo + h) =
[f(xo + h) -f(xo)J h+f(xo)
and so, when f'(xo) exists, li m f (xo + h) = f (xo); i.e., differentiability of f at xo implies the continuity of f at that point. We shall now define the onesided derivatives of f obtaining a useful generalization of the derivative that can exist even at discontinuities of f. We shall use the notation introduced just after Definition 2.
Definition 3. Let f be piecewise continuous on (a, b), and let xo E (a, b). Then the right-hand derivative of f at x0, f+(xo), is defined to be f(xo+h)- f(xo+0)
lim
h-o+
h
provided that this limit exists. We define the left-hand derivative of f at xo, f' (xo), to be lim
f (xo - h) - f (xo - 0) h
h-»0*
provided that this limit exists.
It is clear that when f has a derivative at xo, then both its right- and left-hand derivatives exist and both are equal to f'(xo). The function f (x) = 2x, x < 1, f (x) = 1 - x, 1 < x, however, is discontinuous, hence nondifferentiable, at x = 1, but f' (1) and f' (1) both exist. The first of these is -1; the second is 2.
Lemma 1. If the two functions f,g both have a right-hand derivative (or a left-hand derivative) at a point xo, then also so does f (x) g(x). Proof. For any It > 0 we may write f (xo + h)g(xo + h) - f (xo + 0)g(xo + 0) It 1
f =f (xo + h) I g(xo + h) h g(xo + 0) + g(xo + 0) Lf (xo + h) It f (xo + 0) l J I and the result follows from this.
3. Functions With Special Properties
69
Not surprisingly, we can define the right-hand derivative at a and the lefthand derivative at b for a piecewise continuous function f on (a, b). They are f(a + h) - f(a + 0)
f+(a) = lim
h
h-.0+
and
f ' (b) = hlim,
f (b - h) h f (b - 0)
We can give conditions on f that imply the existence of these derivatives.
Lemma 2. Let f be uniformly continuous on (a, b). Suppose that f'(x) exists at every point of this interval and that f' is uniformly continuous here. Then the right- and left-hand derivatives off at a and b, respectively, exist. Furthermore,
f+' (a) = f'(a + 0) and f' (b) = f'(b - 0). Proof. We may define a continuous function f on [a, b] by setting f (x) = f (x) when x E (a, b), 1(a) = f (a + 0). 1(b) = f (b - 0). Clearly, f has a derivative at each point of (a, b); in fact, f'(x) = f'(x) and every point of (a, b). By the mean value theorem (Chapter Zero, Section 5. Theorem 5) we may write f(a + h) - f(a)
_ !'(a + eh)
h
where h > 0, and 0. 0 < 0 < 1. depends on h. But this is just f(°+h)nf(a+o) _ f'(a + Oh) and, letting h -. 0+ proves the result at x = a. The proof for the point x = b is similar.
Corollary 1. If f is piecewise continuous on (a, b) with subdivision {xj }; o (see Definition 2). and if f' is piecewise continuous on (a, b) with the same subdivision, then the one-sided derivatives of f exist at every point of [a, b].
Exercises 3 * 1. Show that f (x) = 21,, r) has both a left- and a right-hand derivative at x = 0. What is li m f (x)? 2. Let f (x) = x on (-7r, 7r). Let f (x + 2ir) = f (x). Compute:
(a) f (zr - 0);
(b) f (a + 0):
(c) f'(a - 0);
(d) f'(a + 0).
3. Let f (x) = x if 0 < x < 2 , f(x) = a - x if z < x < a. Compute: (a) f (2 - 0):
(b) f (z + 0);
(c) f'(z - 0):
(d) f'(2 + 0).
70
Chapter Two
Extensions of the Classical Theory
4. Some authors define f to be piecewise continuous on (a, b) if there is a finite (perhaps empty) set {xj 1 such that f is continuous on (x xj+l ),
j = 0,1, ... , n, where a = xo and b = xn+i, and lim f (x), lim f (x)
®-xt z- .xI exist for each j = 1, 2, ... n, lim f (x) exists and lim f (x) exists. Show xa+ x-bthat this is equivalent to our Definition 2 as follows. Clearly, it suffices to prove the result when {x.7 } = ¢. (a) If f is continuous on (a, b) and =lim f (x), xlim f (x) both exist,
define f on La, b] by letting f (x) = f (x) when x E (a, b), 1(a) _ xlim+ f (x), f (b) =xlim f (x). Then f is continuous on the closed bounded interval [a, b). Now use Theorem 3 of Chapter Zero, Section
5 to show that f is uniformly continuous on (a, b). (b) Now suppose that f is uniformly continuous on (a, b). We must show
that lim f (x) and lim f (x) both exist. x-a+ x-bi.
If {xn} C (a, b) is a Cauchy sequence (Chapter Zero, Exercises 5, Definition 5), show that {f (xn)} is a Cauchy sequence.
ii. Choose {xn} C (a, b) such that n-oo lim xn = a and let a = lim f (xn ). Given e > 0, find 6 > 0 such that Ia - f (a + h) I < e n-oo
whenever 0 < h < 6. Thus lim f (x) = a. The limit of f at b can be treated similarly.
4.
Pointwise Convergence of the Fourier Series
In proving Fejer's theorem we made use of an integral representation of the Ceshro means of the Fourier series. This involved a sequence of functions that we called Fejer's kernel. Recall
am(x+ f) = tar
f"
Km
0
(x - y)f (y)dy
where
Km(x)
_
1
1 - cos mx
m [ 1 - coax,
_
1
m
sin I'mx (sin
2
Zx
Here we shall derive an integral representation for the partial sums of a Fourier series. This, too, involves a sequence of functions that we shall call Dirichlet's kernel. We find that
8n(x;f)= 1
a
f f(t)Dn(t-x)dt
4. Pointwise Convergence of the Fourier Series
71
where
=
sin(n + z)x
Dn(x)
2 sin 1 x
We begin with a lemma due to Lagrange. n
Lemma 1. 2 > cos k9 = -1 +
sin(n + z)9 Sm 29
k=1
n
n
Proof. 2(cos 9 + cos 29 + ... + cos no) = > eik9 + k=1
> e-ike. We may sum each k=1
of these finite geometric series to get 2
n
cos kO =
e:ne)
1 - ei9
a-(I- e-ine
+
1 - e-i9
k=1
ei9(1 - ein9)
(1 - -ine)
1-e0
;i9(1 - e-ie)
ei9(1 - ein9)
(1
- e-'no)
eie - ei(n+1)9 _ 1 + e-in9
1-eie
1 - ei9
1 - eie
In this last quotient we multiply numerator and denominator by a-i9/2 to get n
coo k9 =
ei3 - ei(n+ J)9
-i(n+i)e
e-i4 - ei3
k=1
=-
- e-i +
e-i(n+J)e - ei(n+e)9
(e-i9/2 - eie/2)
e-i3
- eij
+
e-i3 - eii
cos(n + 1)9 - isin(n + 1)9 - {cos(n + 1)9 + isin(n + z)9}
cost -isinz -(cost+isinz)
--1- 2isin(n+z)9 --1+sin(n+1)9 -2i sin 2 and the proof is complete.
sin 19
72
Chapter Two
Extensions of the Classical Theory
Now given a function f on (-r, r) the nth partial suns of the Fourier series of f can be transformed into an integral as follows: n
-9n (X; f) k=1
>(coskxj f(t) cosktdt.
2r f x f(t)dt+ + sin kx
_J
f
x
x
k=1
f(t.)sinktdt)
f(t) { 2 + E(coskxeoskt i sinkxsinkt)1 dt k-1
1
=! J
T
1
Z+
f (t)
t
cos k(t - z) dt
k=1
Using Lemma 1 inside the curly brackets we obtain ( f) = J x f(t) sin[(n + 1 )(t - r)] s x: 2 sin(! (t - x)[
-
dt
(1)
71,
We shall use (1) to investigate the convergence of the Fourier series of f. There is no loss in generality in assuming that f is 2r-periodic, and we shall assume that from now on (Section 1, just after Definition 1). In (1) we set u = t - x to obtain
sn(x,f) =
1 f x_r 21r
.f(u+x)sin[(n i 2)uldu sill I u
In here we replace -L "nl("+4)21 by Dn(u). The reader can check that the function D (u) is 2r-periodic. Thus (Chapter One, Exercises 2, problem 3) our last expression for 4 (x; f) becomes x
s,, (x; f) =
J
f (u + x)Dn (u)du
Lemma 2. For each fixed n the function D8(u) is even (Section 1, just after Example 2) and, furthermore, fWX
Dn(u)du=1
4. Pointwise Convergence of the Fourier Series
73
Proof. We shall use Lemma 1. We have /F
J/
Dn(u)du
=
r
f
1
1
r,
2
1
sin[(n+ 2)u]du 2sin 2u
x
du + E
J
J
cos kudu
I
[2] (21T)
7r
Theorem 1. Let f be a 2r-periodic function that is piecewise continuous on (-r, r). Then the Fourier series off converges to 2 [f (x + 0) + f (x - 0)]
at every point x where f has both a left- and a right-hand derivative.
Proof. Let x be a point where f has both a left- and a right-hand derivative. There is no loss in generality in assuming that x E (-jr, r). We must show that
lim sn(x; f) =n-x lim 1 f (u + x)Dn(u)du = f (x + 0) 2+ f (x - 0)
n-00
and to do this it suffices to show that liin
n
00
J0
* f(u+x)Dn(u)du = f(x2 0),
lim 0 f(u+x)Dn(u)du=
n-.oc
a
f(x-0) 2
By lemma 2 the first of these is equivalent to showing that
nlim J
7r[f(u+x)
- f(x+0)]Dn(u)du = 0
This integral can be written as follows: jEf(u+x) - f(x + 0)]
sin[n + 2 )u] A 2r sin z u
=
[f(u+x)- f(x + 0)1
2rJ o
sin 2u
1
2
)uIdu
74
Chapter Two
Extensions of the Classical Theory
By Theorem 1 of Section 3 this tends to zero as n oo, provided the function in the square brackets is piecewise continuous on (0, ir). The only potential difficulty is that the limit, as u 0+, might not exist. This, however, is not really a problem because Jim
f(u + x) - f(x + 0)
u_o+
sin Zu
lim -o+
1/ (u+x)- f(x+0)1 [ U
=2 Um u-0+
J
2
sin ju
12
f(u+x)- f(x+0) u
and this exists because f has a right-hand derivative at x. The fact that 0
lim
n-oo J n
[f (u + x) - f (x - 0)]Dn(u)du = 0
can be proved in a similar way.
Remarks. (1) At any point xo where the function f is continuous and has one-sided derivatives, the series converges to f (xo). This is true, in particular, at any point where f (x) is differentiable.
(2) An analogous theorem is true for any piecewise continuous, 2a-periodic function (see Section 2 for the definition of the Fourier series of such a function).
(3) If f is defined only on (-a, 7r), then the theorem applies to the periodic extension of f to all of R. Hence, if f is piecewise continuous, its Fourier series converges to I' [f (x + 0) + f (x - 0)] at each interior point (i.e., point
of (-7r, 7r)) where both one-sided derivatives exist. At both end points, x = zr and x = -A; however, the series converges to 1 W r (r - 0) + A -7r +
0)] provided f has a right-hand derivative at x = -7r and a left-hand derivative at z = s. Thus if the series is to converge to f (-ir + 0) at x = -a or to f (ir - 0) when z = ir, it is necessary that f (-zr + 0) = f (ir - 0).
4. Pointwise Convergence of the Fourier Series
75
Let us return now to equation (1). Recall
sn(x;f) =
1
a
if
f(t)sin[(n+ 2)(t -x)] dt 2sin[2(t-x)]
1 n
1
_=
(1)
f (t) sin(n + 2 )u du sin Zu
/'o f (x + u) sin(n + 2 )u
du
2
2a f0
f (x + u) sin(n + 11 u)u du sin 2 u
where we have set u = t - x. In the first integral on the right we set v = -u to obtain o f (x - v) sin(n + 2 )v (-dv) = 1 / f (x - v) sin(n + I 1 )v dv 27r o T7r x sin av sin 1v Hence (1) becomes 1
an(x; f) = - 2a
+
//'x f (x + u) sin(n + 51 )du du sin Zu Jo W f(x - v)sin(n + j1 )v
1
2rr fo
sin 1v
dv
Wsin(n+2)u{f(x+u)+f(x-u)}du 2a
sin Z u
We can put this into a more illuminating form by first noting that when
f(x)=1for all xwe have ao=2and sn(x;f)=1for all n. Thus /lr sin(n+ 2)u( 21r fo
1
sin Zu
2du
)
Multiplying this by a and subtracting from sn(x; f) gives us
an(x;f)-a
ZA*Sln(nn o
I
)u{ f (x + u) + f lx - u) - 2s}du
Thus a necessary and sufficient condition that sn(x; f) converge to s is that this last integral tends to zero as n oo. There are tests for convergence that are based on giving conditions on f for this to be true (see Notes).
76
Extensions of the Classical Theory
Chapter Two
Exercises 4 1. In each of the following cases discuss the convergence of the real Fourier series of the function given.
(a) f(x)=.r, -r<x
(c) f(x)=1, when -r<x<0and f(x)=2when 0<x
Let f(t)=it -tfor 0
°O sin nt n
k
(b) Let us set fk(t) = 2 E
sin nt - (r - t) and investigate the behavior n
of .fk for large k near the discontinuity t = 0 of f. Note first that sink + !I )t
fi(t) =
sin
by Lemma 1.
zt (c) The first critical point (i.e., the first point where f is zero) of fk to the right oft = 0 is tk = +
J'k s k -t )t dt-
Show that fk(tk) =
0
r.
Z
(d) In the integral in part (c) set y = (k + )t and use the fact that lim (e)
sill l/11
y
= 1 to show that lim fk(tk) k-oc.
z
2 loff
sinydy y
-r
Numerical evaluation of this last quantity shows it to be 0.562. Now as k - oc, tk - 0. but, as we have shown, the difference between f (t) and its kth partial sum (i.e., fk) does not tend uniformly to zero. However large k is, there are points arbitrarily close to zero at which fk is near 0.562.
4. Pointwise Convergence of the Fourier Series
77
Notes The work of Fourier is discussed in the hook by Jeffery listed in the Notes to Chapter One. The tests mentioned at the end of Section 4 can be found in The Theory of Functions, by E. C. Titchmarsh (Oxford University Press, 2nd edition, London, 1939), see pp. 406-410. Further discussions of the engineering applications of Fourier series can be found in Advanced Engineering Mathematics, by Erwin Kreyszig (John Wiley & Sons. Inc., 8th edition, New York, 1999), see pp. 525-648.
Chapter Three
Fourier Series in Hilbert Space
There is another way to look at Fourier series. Before we can present this more geometric point of view, we must first review some linear algebra. We begin informally and leave a more systematic discussion to the next section. We recall that the vector space lit' has an inner (sometimes called "dot" or "scalar") product. For x = (xl,..., xn) , y = (y1, ..., yn) we have n
<x,y>=Exjyj
(1)
j=1
In the space Cn we must modify (1) a little bit. For z 1 = (z1, ..., zn) and w1 = (wzf ..., wn) we have n
< z, w >= E zjwj
(2)
j=1
where the bar denotes complex conjugation (Chapter Zero, Exercises 4, problem 1).
If we set e 1 = (1, 0,..., 0), e 2 = (0, 1, o,..., 0), ... e _ (0, 0,..., 1). then clearly < e j, e k >= 0 when j i4 k, and this product is 1 when j = k. Also, for any vector u , n
E ej J=1
79
(3)
80
Chapter Three
Fourier Series in Hilbert Space
With these results in mind, let us turn now to C(T). Equation (2) leads us to define
< f, g >=
for any f, g in C(T)
f (t)g(t)dt
J
Note that for any fixed n E Z, < eine, eint >= 2a eint
eint
eimt
and so the set 0 if m # n and is I if E Z satisfies 2a 2i > n = M. Equation (3) leads us to ask if, for f E C(T), we may write: oc
f
eint
Cf, n--oO 010
n=lam-oo
2rr>
2ir
-
I < f, tine > eint
00
(2a _
n=-oo
eint
,
f(t)e-intdt l eint l
So we are asking if f can be represented by its Fourier series, but we have not specified in just what sense the series is to "represent" the function. Some kind of convergence must be involved, but the convergence we have in mind here is neither pointwise nor uniform. It is a notion of convergence induced on C(T) by its inner product. This will be made clear below.
1.
Normed Vector Spaces
To proceed with our new way of looking at Fourier series, we must first say something about infinite dimensional vector spaces. The terminology of linear algebra is, of course, needed here. A brief discussion of this terminology is given in Appendix A. Let us begin by recalling some conventions. If D is a nonempty subset of K
and f, g are maps from D into K, then f +q is the function defined as follows: (f +g)(t) = f(t) + g(t) for all tfD. Also, when AeK, the function of is defined by (Af)(t) = Af(t) for all tED. When we consider a vector space of functions defined on D (i.e., a set of functions that has the properties listed in Definition 1 of Appendix A), then the values of these functions determine our scalars. So if all of our functions are real-valued, our scalars are real numbers, and when all of our functions are complex-valued, our scalars are complex numbers.
1. Normed Vector Spaces
81
Recall that, in very elementary physics, a vector is defined to be a quantity having both length and direction. In mathematics we first define a vector space and then say that the elements of that space are to be called vectors. We know
nothing about them except that they obey certain axioms. This is the basis for the often repeated remark that mathematics is the subject in which we don't know what we are talking about. Many of the vector spaces that arise in analysis, however, have a function defined on them that enables us to assign
a "length" to each element of the space. Such a function is called a "norm." The formal definition is this:
Definition 1. Let E be a vector space over K. A nonnegative real-valued function p on E is said to be a norm on E if
(a) p(x+y)<_p(x)+p(y)for all x,y in E; (b) p(ax)=I,Ip(x)for all Ain Kand all xEE; (c) p(x) = 0 if, and only if, z is the zero vector.
When p is a norm on E, it is customary to denote, for each xfE, the number p(z) by II x II . A vector space on which a norm is defined is called a normed space. If we want to emphasize the norm, say II II, on E we speak of the normed space (E, II II). Let us look at a few examples. (i) For any fixed nEN the vector space Rn consists of all ordered n-tiples of
real numbers. So if iER', then
(x1 i ..., xn ). We define II x II to be n E xj2 j=1
Because Vi = (Axl, ..., Axn) we see that n
IIaxII2= (,\2X;)
= A2Ex;
j=1
j=1
4
n
=IAI
> x? j=1
=IAIII11I2
Also, II x II2 = 0 if, and only if, n
Ex? =0 j=1
Clearly, this will happen when, and only when, xj = 0 for each j. So the 112 has properties (b) and (c) of Definition 1. Property (a), the so-called triangle inequality, is harder to prove. We present the proof after we discuss inner products. function II
82
Chapter Three
Fourier Series in Hilbert Space
(ii) We can obtain an important generalization of It" by considering all "square summable" sequences of real numbers, i.e., all functions x : N -. R such
that x Dx(n))2 < oc n=1
Writing x(n) = xn for each n, we are considering all sequences x = (x1, x2, ...) such that 00
x2k<00 k=1
The set of all such sequences is denoted by e2(Nl) (read "e two of NP'). We shall see, once we discuss inner products, that £(N) is a vector space and that the function
[x] n=1
is a norm on this space.
(iii) There are natural analogues of (i) and (ii) for the vector space of ordered n-tuples of complex numbers, C", and the vector space of square summable sequences of complex numbers. The latter is denoted by £(N) or, when there is no possibility of confusion, by &(N). In the first of these, however, we must define I1z 112 = to be
j.1
and in the second
111112 = [E I zn 12 I n=1
(see Exercises 1, problem 2).
J
(iv) Of special importance for us is the vector space F1 (Z) (read "t one of T'). It consists of all functions a : Z - C such that 00
I a(n) j< oc n=-oc,
1. Normed Vector Spaces
83
Observe that when a, b are in 11 (Z), then a + b is also in this set: for 00
00
00
I (a + b)(n) I= E I a(n) + b(n) I<-00
-00 00
{Ia(n) I + I b(n) I} -00
00
_ F, I a(n) I+ E I b(n) I 00
-00
and these last two sums are finite. It is also clear that when adf 1(Z), so does as for any scalar A. One can now check that our set has all of the properties listed in Definition 1 of Appendix A, and so f' (Z) is a vector space. For each ael'(Z) we define Ila111 to be 00
a(n)I -00
This is a nonnegative real-valued function and Ilalll = 0 if, and only if, a (n) = 0 for all neZ; i.e., if, and only if, a is the zero vector in fl (Z). Also, IIAaIII =1 A I Ilalll for any a and any scalar A. The triangle inequality (property (a) in Definition 1) was proved in the paragraph just above. Thus (f' (Z), II.111) is a normed space. We shall see more examples of normed spaces in the later sections.
Definition 2. Let (E,11-11) be a normed space and let x oeE. A subset S of E is said to be a neighborhood of xo if for some e > 0 we have {xeE I III - I o II <
e} C S. A subset of E is said to be an open set if it is a neighborhood of each of its points, and a subset of E is said to be a closed set if its compliment, in E, is an open set.
Given x0EEand any e>0,the set O,(xo)={xEEIIII -loll <E}is called the open ball of radius e centered at xo, and the set f3 (io) = {xeE I III - Iohl < e} is called the closed ball of radius E centered at xo. Observe that an open ball is an open set and a closed ball is a closed set (see Exercises 1, problem 3). For any A C E and any scalar .1, let AA = {.fix I xeA}. We have 0, (0) = {xeE 111111 <- e} = {(Ex) 111111 5 1} = 0,(0)
(1)
For anysetBCEletA+B={x+y I eAand yeB} Then
0,(Io)={xeEIIIx--loll <E} _{yeEI y=xo+e with 111 II 51}_{IO}+e8j(0)
(2)
Chapter Three
84
Fourier Series in Hilbert Space
because
{xo}+6,31 (0) = {xo+ex
1
11-;11
<_ 1}
(3)
Hence we see that if we call 131(0) the unit ball of E, any closed ball centered at zero is a scalar times the unit ball and any closed ball with center xo is just a translate to .so of a closed ball with center at zero. Similar remarks apply to open balls. Exercises 1 * 1. Let (E, II 11) be a normed space and let x, y be elements of E. Show that 111x11-IIy1II<_II -yII. -
Hint: Writex=x-y+yand y=y-x+x. * 2. Find a vector (z1, z2)eC2 such that zi + z2 = 0 and I z1 12 + I z2 1296 0. * 3.
(a) Show that an open ball, say OE(xo). is an open set. Given yeO, (xo),
find 6 > 0 such that 06 (y) C 0e (xo). (b) Show that a closed ball, say f3,(io), is a closed set. Given y & (xo),
find 6>0such that 06(y)n,3E(xo)=o. (c) Show that the union of any collection of open sets is an open set, and the intersection of any finite collection of open sets is an open set.
(d) Show that the intersection of any family of closed sets is a closed set, and that the union of any finite collection of closed sets is a closed set.
4. In R" the vectors e1 = (1,0,0,...,0), e2 = (0,1,0,...,0), e3 = (0,0,1, 0, ..., 0) , ..., e " = (0, ..., 0,1) are linearly independent. An infinite subset of a vector space is said to be linearly independent if each of its finite subsets is linearly independent. Find an infinite linearly independent subset of e2(N) and find such a set in t' (Z). What does this tell you about the dimension of these vector spaces?
2.
Convergence in Normed Spaces
The presence of a norm on a vector space enables us to define convergence for sequences in that space. Once we have this concept, we shall be able to discuss convergent series in, and continuous functions on, the space.
2. Convergence in Normed Spaces
85
Definition 1. Let (E, II' II) be a normed space and let{ an }° be a sequence n=
of vectors in E. We shall say that this sequence is convergent to the vector
and we shall write n"o xn = lo, if for any given e > 0 there is a natural number N such that Ilxn - xoll < e whenever n > N. Note that the sequence
1x n }
C E converges to zoeE if, and only if, the
sequence { xn - xo} converges to the zero vector of E. As a simple application
of this idea, let us use convergent sequences to characterize the closed subsets of (E,II.II).
Lemma 1. A nonempty subset C of E is closed in (E, II
. II) if, and only if, every sequence of points of C that is convergent converges to a point in C.
Proof. Suppose first that C is closed and that {xn} C C is convergent to a point E. If V V C, then y is in the complement of C, E\C, which is an open set. Hence, for some e > 0 we must have O, (y) C E\C (Section 1, Definition 2). Because {in} converges to y, there must be an integer N such that xneOE (b) whenever n > N. Thus we have reached a contradiction
because, when n > N, xn is in both C and E\C.
Now suppose that C is not a closed set. We shall construct a sequence
fin-1 C C that converges to a point y that is not in C. Because C is not closed, E\C is not open, and hence, in particular, it cannot be empty. Thus there must be a point yeE\C such that every neighborhood of y intersects C; otherwise E\C would be a neighborhood of each of its points. So for each n = 1, 2,... we can find aneOi (y) fl C. Clearly, {xn } is a sequence in C whose limit is
C.
The Fourier transform is of fundamental importance in harmonic analysis. It is a linear map between two normed spaces. Whenever we speak of a linear map between normed spaces El, E2 we shall always assume that these spaces are defined over the same field K (i.e., both are vector spaces over R or both are vector spaces over C). Recall that a map T : El - E2 is said to be linear if
(i) T(x+y)=T(x)+T(y) for all x,y inE1 and (ii) T (Ax) = AT (x) for all
and all scalars A.
Observe that T must map the zero vector in El onto the zero vector in E2.
Chapter Three
86
Fourier Series in Hilbert Space
Definition 2. Let (El, II 111) and (E2,11-112) be two normed spaces and let T : El -. E2 be a linear map. We shall say that T is a continuous linear map converges to the vector T (xo) for II - 112 whenever if the sequence IT the sequence J-;.1 C El converges to the vector xo f o r
. II
111.
Our first result gives some especially useful characterizations of continuous linear maps. It is because of the first two of these that continuous linear maps are often called "bounded" linear mappings.
Theorem 1. Let (E1, II T : El
- 111) and (E2, II . 112) be two normed spaces and let E2 be a linear mapping. Then the following are equivalent:
(i) There is a real number M such that IIT (x) 112
M II x II 1 for all x eEl ;
(ii) There is a real number M such that IIT (x) 112 5 M whenever xeE1 and IlxIll < 1;
(iii) T is a continuous linear map.
Proof. It is clear that (i) implies (ii). Suppose that T has property (ii). Then given xeE1, x 14 0, we must have lIT ( )II2 < M. Because T is linear we
_
II=II
se that IIT (x) 112 5 M111111 for any nonzero vector xeE1. But any linear we
map takes the zero vector to the zero vector, so our inequality holds when x is 0 as well. Thus (i) and (ii) are equivalent. Let us now show that (i) implies (iii). If 1x- I is any sequence in El
that converges to a vector xofE1, for the zero vector of El for II
-
111.
II
111, then {xo -
converges to
But by (i) we must have IIT (xo - xn) II2 <
M11 (xo - xn) 111, and so it follows from the linearity of T that n"- T
T (xo) for all II - 112Finally, let us show that (iii) implies (ii). We shall assume (iii) and the negation of (ii) and deduce a contradiction: We are supposing that, given any M, we can find x e E1 such that I I x I I 1 <- 1 and IIT (x) II:, > M. Choose
x 1 eE1 so that Ilxl II 1
<- 1 and IIT (x 1) II2 > 12. Next choose x 2EE1 so that
Ilx2l11 5 1 and IIT (x 2) 112 > 22. Continue in this w a y . After x 1 , x 2, ... x have been chosen so that Il xi 111
<-
1 and IIT (xi) Il2 > j2 for j = 1, 2,..., n,
choose xn+1eE1 so that llx,+llll 5 1 and lIT (xn+1) 112 > (n+ 1)2.
2. Convergence in Normed Spaces
Now the sequence { IIT
} C E1 satisfies the inequalities
II
II1 <
87 and
112 > n for all n. But then the sequence { n" } C E. converges to
the zero vector of E1 for II
II
whereas the sequence
IT
_1 (.)} C
E2 does
not converge. This contradicts (iii) and completes the proof.
A linear map from a vector space to itself is called a linear operator on that space. We have two natural linear operators on the space (2 (N). They are the right-shift operator
Ar(z)=Ar({z}_1) = {0, Z1, Z2,...}
for all zel2(N)
and the left-shift operator
A! (Z) = Ae
1) = {Zn}°n__2
for all x e12 (N)
Clearly, IIAr (z) 112 = 111112 for every z , and II At (z) 112:5 11_z 112 for every
Thus these linear operators are continuous on (2(N). Because Ar and Ai map 12 (N) to itself, we can compose them (Chapter Zero, Exercises 2, problem 2). We see that A, o Ar is the identity operator (i.e., the map I such that I z = z for all z) whereas Ar o A, is not.
The space 11 (Z) consists of all sequences z = (z,,}oo such that 00
-00
This last number is
II z II 1
Clearly, 00
E z -00
is convergent and so we may define a map gyp: 11 (z) - C by setting
for every tell (Z). This mapping is easily seen to be linear. A linear map from a vector space into its field of scalars is called a linear functional on the space. Because K has a norm (the absolute value function [see Chapter Zero, Exercises
88
Chapter Three
Fourier Series in Hilbert Space
3, problem 3 and Section 4], is a norm on K), we may consider the continuous linear functionals on a normed space. Note that p is continuous because
x
oo
E=nl<-EIznl=llzllu -00
00
for every Z et' (Z).
D
Now let us consider a fixed nonempty subset D of K and a function f K. We shall say that f is a bounded function or that f is bounded on
D, if there is a real number M such that I f (t) I< M for all teD. Such a function need not have a maximum in D but the least upper bound of the set {I f (t) II tED}, what we call the supremium of I f J. is a real number (Chapter Zero, Section 3, Definition 1 and Remark (8)). Now let B (D) be the set of all functions from D into K that are bounded on D. This is a vector space over K (Appendix A) and we can define a norm on this space by setting IIf III = sup I f I= dub {I f (t) II teD) (we call this the sup norm) for each f eB (D). Furthermore, we can easily characterize the sequences of functions that converge for this noun. We have:
Lemma 2. A sequence (f,,} C B (D) converges to the function foeB (D) for the norm II ' III if, and only if, this sequence converges to fo uniformly over D.
Proof. Let { fn} C B(D) and suppose that this sequence converges to foeB (D) for 11 ' I I oo Then, given e > 0, there is an integer N such that I I f o - fn I I oo < e whenever n > N. But this says I fo (t) - fn (t) 1:5 dub {I fo (t) - fn (t) II teD} < e, for every teD, when n > N. Hence, {f,,) converges to fo uniformly over D. Now suppose that the sequence { f} converges to fo uniformly over D.
Then given e > 0 there is an N such that fo (t) - fn (t) I< a for all teD I
whenever n > N. Clearly then, Ilfo - fnIIoo = dub {I fo (t) - fn (t) II teD} < e whenever n > N showing {fn} converges to fu for II ' Ilao. In Chapter One, Section 4, Lemma 1, we showed that a sequence If- I of functions defined on D is uniformly convergent over D if given e > 0 there is an N such that I In (t) - fm (t) I< a for all teD whenever both m and n exceed N. It follows that a sequence { fn} C B (D) converges for II Iloo if, and only if, given e > 0 there is an N such that II fn - fn Iloo < e whenever both m and n exceed N. Normed spaces that have this property are extremely important in both pure and applied mathematics. Before we can discuss them we must define our terms.
Definition 3. Let (E,11 . II) be a normed space. A sequence J xn j of vectors in E is said to be a Cauchy sequence if, given e > 0, there is an N such that
2. Convergence in Normed Spaces
89
Ilxn - x m II < e whenever m and n both exceed N. A normed space (E, II II ) is said to be a complete normed space, or a Banach space, if every Cauchy sequence of points of E converges to a point of E.
We have just noted that the space (B (D), II ' IIa,) is a Banach space. In particular, (B (T), II ' II ), T the unit circle, is a Banach space. Observe that the continuous functions on T are bounded and hence C (T) C B (T). Furthermore, any sequence in C (T) that is a Cauchy sequence must have its limit in C (T). Because it is Cauchy it converges to an element of B (T) and because each fn is continuous, its limit must actually be in C (T) (Chapter One, Section 4, Lemma 2). Thus (C (T), II ' IIoo) is another example of a complete normed space.
Exercises 2 1. Let (E, II ' II) be a normed space and let
{in} C E converge to xoEE for
11-11.
(a) Show that the sequence of real numbers { II
converges to the
number IIxoII
(b) Show that { an } is a Cauchy sequence (Definition 3).
i
s 2. Let S be a nonempty subset of a normed space (E, II ' II) . A point o of E is said to be an adherent point of S if every neighborhood of x0 contains infinitely many points of S. The set of all adherent points of S is denoted by S', and the set S U S', which we shall call the closure of S, is denoted by S.
(a) Show that ioeS' if, and only if, there is a sequence of distinct points of S that converges to 10(b) Show that (S')' C S'and hence S' is a closed set. (c) Show that S is a closed set and show that any closed set that contains S must also contain S. (d) If H is a linear subspace of E, show that H is also a linear subspace
of E. A linear subspace of E that is also a closed set is called a closed linear subspace of E.
(e) Show that C (T) is a closed linear subspace of (B (T), II II)
3. For each nEN let fen (t) = ten for -Tr < t < 7r and let f2n (t + 27r) _ f2n (t) for all t. Then {fen}n__1 C C (T). Show that any finite subset of {fen} is linearly independent.
90
Chapter Three
Fourier Series in Hilbert Space
4. Let El, E2 be two normed spaces and let T : El - E2 be a continuous linear map.
(a) Show that the null space (Appendix A) of T is a closed linear subspace of El. (b) More generally, show that for any closed subset C of E2 the set T-' (C) = { xeE1 I T (x) eC} is a closed subset of El. 5. We can define two linear functionaks on C (T) as follows:
(a) I (f) = f", f (t) dt for each f eC(T); (b) For any fixed a E T, let cp,(f) = f (s) for each f eC(T). Show that these maps are continuous when C (T) has the sup norm.
6. Recall the space f1(Z) (Section 1, example (iv)). For each aet'(Z) let
(a) =
a(n)einb _x
(a) Show that F (a) eC (T) and show that the map taking each a to F (a), call it F, is linear. (b) Show that .F is a continuous linear mapping from (e (Z), II II t) into (C(T) - [I.). II
3.
Inner Product Spaces
Returning once again to elementary physics and our informal discussion, we recall that, the dot product of two vectors u, v in R2, or R3, is the product of their lengths and the cosine of the angle between these vectors. Thus { u , v) = II u II II v II cos B. Because we have an inner product on R" (introduction to this
chapter, equation (1)) and we have a norm on R"(Section 1, example (i)), we might define the angle between u, veR" to be cos-1
1, ,v
{
}. To carry this
11-i; 1111-;U
out we'd have to show that the quantity in curly brackets is between -1 and +1; otherwise the arccosine wouldn't be real. This is, as we shall see, actually the case. So we can give "direction" to the vectors in R. But this is not, as far as I know, very useful. What is important about an inner product is that it enables its to define "orthogonality" of vectors. In R2, or R3, two non-zero vectors are orthogonal if the angle between them is '. In this case the cosine is zero, so the inner product of the two vectors vanishes. In the general case when
we have an inner product defined on the vectors of a given space, we define two vectors to be orthogonal if their inner product is zero. This is an extremely useful notion as we shall we below. Let us now give formal definitions of these concepts.
3. Inner Product Spaces
91
Definition 1. Let V be a vector space over K. The set of all ordered pairs (u, v) whose members are in V is denoted by V x V (Chapter Zero, Section 1, definition 5). A function p from V x V into K is said to be an inner product on V if (a) F o r any u i , u 2, v in V and any scalars a.,3 we have V ( u i + 3u2 ,
_
ap (ui, V) + (b) For any u , v in V, p (u , v) = cp (e , u) , where the bar denotes complex conjugation (Chapter Zero, Exercises 4, problem 1);
(c) For all eeV, cp (e . V) > 0 with equality holding when, and only when, is the zero vector. When cp is an inner product on V, it is customary to write cp
u, a ) in place of
(u , e) and to call the pair (V (,)) an inner product space. For the remainder of this section we shall suppose that any vectors we
consider are members of a fixed inner product space (V. (,) ).
It is convenient to define for each v eV 11v 11 to be ((e , v))
this the norm of e. It is clear that IIa t' II =1 a I
11_v II
and to call
for any eeV and any
acK, and it is clear that II v II = 0 if. and only if, a is the zero vector. To prove is a norm, we must also show that it satisfies the triangle inequality that II (Definition 1(a) of Section 1); i.e., we must show that 11'u + v II < 11u II + 11v II for all u , v in V. This will follow from Theorem 2 below (Corollary 1). -
II
A vector eeV is called a unit vector when 11_v II = 1. Such a vector is sometimes said to be normalized. We can normalize any nonzero vector v by multiplying it by the scalar A set of vectors is said to be normalized if II v ll each vector in the set is a unit vector. Definition 2. Two vectors i , v in V are said to be orthogonal if (u , e > = 0. A nonempty set of vectors is said to be an orthogonal set if any two distinct vectors in this set are orthogonal. Finally, an orthogonal set that is also normalized is called an orthonormal set.
It is easy to check that the functions defined by equations (1) and (2) in the introduction to this chapter are inner products on R" and C", respectively. If, for each j = 1, 2,..., n, we let ei be the vector (0,..., 1, 0,..., 0), where the 1 is the jth place, then { ej }" is an orthonormal subset of R" and also of C". =1
92
Chapter Three
Fourier Series in Hilbert Space
Theorem 1. (Bessel's inequality, weak form). Let { v j }m be a finite orthoj=1
normal set in V. Then for any vector v eV we have
(2< II v
I(2
j=1
Proof. The proof is by direct calculation using the properties listed in Definitions
I and 2. It is convenient to set cj = (v , -;j ) for j = 1, 2,..., m, and to use different indices in our sums. We have
r m
m
ECj91j,
0
j=1
Ckvk =
k-1
"I
in
j=1
E CjCk(vf,vk) =
j,k=1
k=1 m
m IIv1I2-EIcj12-`ICk12+IC.,
12
j=1
j-1
k=1
because v j, vk } = 0 unless j = k. Thus we have ,n
05IIUIi2-E IC1I2 j=1
and we are done.
Theorem 2. (Cauchy, Schwarz, Bunyakowski). For any two vectors u, v in V we have I (u, v) I< IIuIIIIt'II Proof. The result is trivial when
is the zero vector (see Exercises 3, problem
la). When this is not the case, {
} is a finite orthonormal set. Thus, by v 11211- 112. Theorem 1 we have I \u Ilvp v I2< II 1L II2, giving us ((u, v } 12 II u v Taking square roots we have our result. p
q11
Remark 1. In our discussion of 1R we suggested that one might be able to define the angle between two nonzero vectors u, v by using cos' 1 ('EillTheorem 2 shows that when the inner product is real, the quantity in curly brackets is always between -1 and 1. Thus, in this case, the arccosine is always a real number.
Corollary 1. For any two vectors u,1 in V we have ((u +111 S II u II + II v II
3. Inner Product Spaces
Proof. First note that (u , v) + (v , u) is real and less than 2
93 Hence
we may write
IIu+ 1112=(u+v,u+v)=(u,u++(v,u+v)=(u,u)+ +(V,u)J +(v, v) <
11-u112 +21 (u,;,) I +IIvii2
But by Theorem 2 this last quantity is < IIuII2+2IIuIIIIvII+IIvII2 = (IIuiI+IIvII)2 Taking square roots we have our result.
Remark 2. We have now shown that when (V (,)) is an inner product space
((v, v )) # is a norm on V, i.e., setting II v II = ((v, a ))
the function v
I
for every veV, (V, II . II) is a normed space. We shall call this the norm defined by, or associated with, the inner product. Unless the contrary is explicitly stated, whenever we speak of a norm on an inner product space we shall mean the norm defined by the inner product.
We have now shown, in particular, that the functions defined on It" and C" in examples (i) and (iii) of Section 1 are norms on these spaces. They are the norms defined by the inner products mentioned in the introduction to this chapter (equations (1) and (2)).
Remark 3. An inner product space (V, (,))always has, as we have just seen, a norm. Hence we can discuss open sets, closed sets, and convergent sequences in the space as well as continuous mappings on the space.
Let us now recall the set 4 (N) (Section 1, example (ii)). This consists of all sequences {xn }' of real numbers such that
00
1
00
xn is convergent: i.e.,
n=1
it is the set of all functions x : N - R such that
x (n)2 is convergent. We n=1
know how to add two such functions and how to multiply any such function by a scalar, but are the results still in 4 (N)? Clearly, \ {xn}n° 1 = {Axn} 14 (N) whenever {xn}n is in this set and AeR. Let us show that for any {xn}w 1
1
and any {yn}n in P (N) their sum is again in this set; i.e., let us show that 00
1
Oc
when E xR < oo and n=1
00
y,2, < oo then E (xn +?/n)2 is finite. n=1
n=1
For each fixed N we have {xn}n 1 and {y,,}n 1 in the space RN. This is, as we have seen, an inner product space and so we may use the C.S.B. inequality
Chapter Three
94
Fourier Series in Hilbert Space
(Theorem 2) to obtain N
/
N
+
({xn}n1=1 , {Yn}nN=1) 1=1 E xnYn I
(1)2 2
(i xn
oo we see that whenever {xn} and {yn} are in 4(N) the
Letting N
x
series E xnyn is convergent. Furthermore, because n=1
N
N
N
.N
E(xn+yn)2=Exn+2>xnyn+)1Jn n=1
n=1
n=1
n=1
we see, by again letting N --* oo, that {xn + yn}°O 1 is in PR (N). It is easy to check that PAR (N) is a vector space over R and if we define for 00
each x = {xn},
1 , y = {yn}1 in this space < x, y > to be
xnyn, we n=1
easily see that (4 (N), (,)) is an inner product space. Thus 11 - 112, where for each
{xn }n 1,11 -;112 = ( xn)
,
is a norm.
n=1
Exercises 3
In problems 1-3 we assume that all vectors considered are in a fixed inne product space (V, (,)). * 1.
(a) Show that (u, o) = 0 = (o, u) for any (b) Prove the parallelogram identity: For any u, v -x112=2(I1u112+IIU112)
(c) Prove the Pythagorean relation: if < u, v >= 0, then II u + 1112 = I11112+111112.
(d) Prove the polar identity: For any u, v 4111u+1112-IIZ-1112+111u+i1112-iII
-ivII2}=(u,v)
2. Show that any finite subset of an orthonormal set in V is linearly independent.
3. Inner Product Spaces
95
* 3. Let { u n } and 1_v _l be two sequences in V that converge, for the norm
of V, to uo and vo, respectively. nit
(a) Show that for any
00
(u n, i)
(b) Show that ,limo (un, vn) = (u0, vo)
.
4. On the space C (T) define, for any f, g, < f , g > to be f f (t) g (t )dt.
(a) Show that (,) is an inner product on C (T). (b) Denote the norm associated with this inner product by II
.
.
lr"
112 so
I f (t) I2 dt) Compute Ile" "112 and Ile:nell. (Section 2, just after Definition 3) for any fixed nEZ.
Ill 112 = ((f, f)) = (f
(c) Show that { 2a I neZ }
,
{ coo nt }
sets in (C (T),<,>).
n=1
and { $'n } x
are orthonormal
o
5. Find an infinite orthonormal set in tx (N).
6. Recall the set 12 (N) (Section 1, example (iii). This consists of all sequences of complex numbers, z = {zn}, such that E I zn
12 <
oo.
n=1 00
(a) Show that for any {zn} , {wn} in 12 (N) ,
x,111, is convergent. n=1
(b) For any two nonzero complex numbers z, w let a = zw . Note that a I= 1 and azfv =1 z1'u I. Hence if {zn} and {wn} are in t2 (N),
so is {anzn } where an =
" W"
. Thus we see that
I zn wn I is n=1
convergent.
(c) For any z = {zn},w = {wn} in t2 we have defined z + w to be {zn + wn}, and for any AEC we have defined A z to be {Azn } . Show
that z + w and Al are in
(N) and that t2 (N) is a vector space
over C.
(d) For any
{zn},w={wn}in t2c (N)define (z,w) to be 00 Eznwn n=1
Show that (,) is an inner product on t22c (N) and that the norm associated with this inner product is the one defined in Section 1, example (iii).
(e) Find an infinite orthonormal subset of (tc (N), (, ))
.
Fourier Series in Hilbert Space
96
Chapter Three
4.
Infinite Orthonormal Sets, Hilbert Space
Let us begin our discussion with a fixed inner product space (V, (, )) and an orthornornial sequence (i.e., a sequence that is an orthonormal set) {vj} C V. We have seen several examples of such sequences (Exercises 3, problems 4c, 5, and 6e). Suppose that for some u FV and some scalars {aj } we have (1)
TL
j=1
What we mean, of course, is that u =
llnl aj v j where convergence n -+ 00 j=1
is for the norm on V defined by its inner product. Then, by problem 3a of Exercises 3, we may write
_ Tl. t n) \\
=
Un
lim k--+ oo
=
j-1
lim
k
_
It
j=1
k
o aj vj, vn
(2)
It
lim
k - +oo
Eaj
v j , -j,, }
= an for each fixed n.
j=1
Thus when (1) is true the scalars are uniquely determined by u. They are
aj=(u.vj),j=1,2,...
(3)
We can immediately apply our result to the space C (T). Recall that, for all f, g in this space, (f, g) = f'., f (t) g (t) dt. Furthermore, the sequence eint
I nEZ 27r
I
is
orthonormal (Exercises 3. problem 4c). Now suppose that,
for some f eC (T) . Pint
f=
an
(4)
2;r
Then, as we saw above
a
\
f,
x
eint
)=
2r,
/
1
2ir J
f (t) a-tntdt =
27rCn
(f)
for every n.
(5)
4. Infinite Orthonormal Sets, Hilbert Space
97
Putting this into (4) we find that °°
(
f=
x
1
2a
o0
eint
f f (t)
e-intdt)
E c" (f)
eine
(6)
where, we recall, that c (f) = zx f w, f (t) a-'ntdt (Chapter One. Section 5, formula 6) Thus (4) is the complex Fourier series for f .
Returning now to (V, (,)) and the orthonormal sequence { v, } we can generalize Theorem 1 of Section 3.
Theorem 1. (Bessel's inequality, strong form). For any vector ueV we have 00
12<_ IIu112
I
Let us set a;
Proof.
. Then
f o r each j = 1, 2, M
1o; I2<....But, by the theorem we are
Ia1 l2
;.1
generalizing (Section 3, Theorem 1) each term of this increasing sequence of real numbers is bounded by 1 1 7 u1 1 2 . Thus E I (u , v; ) 12 converges and this j=1 sum is bounded by 11 u 11 as well. 00
Corollary 1. For any f eC (T) we have 27r (:L I cn (f) 7r(n=1 f {Ia,.
(f) I2+Ibn(f)12) I
12)
<
11f112
and
<_ IIiII2
/
Thus far we have discussed what we can say when we know that the series 00
E (u v;) v, converges to u . There is a kind of inner product space for which ,
=1
every series formed in this way will converge; i.e., for every vector ti in this 00
kind of space the series E (u , u; ) v, converges. =1
Definition 1. Given an inner product space (V, (,)) we recall that for each v e V, II v 11
=
((v v) ) 4. We shall say that (V, (,)) is a Hilbert space if (V, Il ,
is a complete normed space (Section 2, Definition 3).
II )
Chapter Three
98
Fourier Series in Hilbert Space
We may now state the following:
Corollary 2. If (V, (, )) is a Hilbert space, then for every vector ueV the series
j .1
(u , v j } i
converges to a point in V.
Proof. Given ueV set aj = (u, vj) for each j = 1,2,.... Then for any natural numbers in, n with n < m we may write
_
m
n
m
I1Eajvj-EajUJ112=l1 E ajvj112 j=1 j=1 j=n+1 m _
m
m
E I aj 12 E a,;,, M a, uj = j=n+1
(7)
j=n+
j=n+1
W because { rj } is an orthonormal set. But by Theorem 1, the series M
convergent. Hence, given e > 0 we can choose N so that
1 aj 12 is
j=1
1 aj 12< e when j=n+1
m > n > N. It follows from (7) that
aj v j
is a Cauchy sequence in
j=1
nal Because this is a Hilbert space, every such sequence converges to a point of V. (V, II
II)
Unfortunately, the series j=1
(u , v j } v j , even when it does converge, does
To see why this is so, suppose that the series not necessarily converge to is does converge to u and suppose, further, that (u, u 1 > 0. Then { vj } j=2
x
an orthonormal set, and the series
(u , v j ) vj does converge, but not to u.
j=2 To ensure that the series converges to the vector that generates it, we must somehow guarantee that we have not left any vectors out of our orthonormal set. We can do this by requiring that our set has the following property.
Definition 2. Let (V, (,)) be an inner product space. An orthonormal subset S of V is said to be a maximal orthonormal set if the conditions (i) S' an orthonormal set and (ii) S C S', imply that S = S. We leave it to the reader to show that an orthonormal set S is maximal if
(u, o) = 0 for all veS implies u = 0 (Exercises 4, problem 1).
4. Infinite Orthonormal Sets, Hilbert Space
99
The discussion just before Definition 2 has led some to call a maximal orthonormal set a "complete orthonormal set." We shall not use this terminology.
Hilbert spaces first arose in the papers of David Hilbert on the subject of integral equations, and it is well known that he was attracted to this area by the pioneering work of Ivar Fredholm. These spaces play an important role in establishing the foundations of modern quantum mechanics. Their importance for the study of trigonometric series stems from the following result.
Theorem 2. Let (H, (,)) be a Hilbert space and let {vj
be an orthonormal sequence of vectors in H. If this sequence has any one of the four j=1
following properties, it has all four of them: (a)
1v 400 00
j=1
is a maximal orthonormal set; cc
(b) For each ueH the series E (u, v j) v j converges to j=1
(c) (Parseval's relation) F o r e a c h ueH we have 1 1 ' u1 1 2 =
00
1 (u, v j) 12;
j=1
(d) (also called Parseval's relation) For any two vectors u, v in H we have ao
j=1
Proof Assume that { v j } is maximal and let u be any vector in H. We have 00
seen, in Corollary 2 above, that E (u , -;j ) v j converges to some element of j=1
H. We must show that its limit is actually the given vector u. For any fixed n
r!-E(m
we have
00
j=
i{,V,1Vvn =(/\ tl\\
m
oo E( 7Lr'Vj)vi,vn
lim
j=1
100
Chapter Three
Fourier Series in Hilbert Space
and by Exercises 3, problem 3a this is m
Jim
ii -
(u+ vj> vj+ vn m
lim
1{, 11n
u, v
11
, vn 1
4L, vn 1
( 1L, 7In1 = ()
=1
Thus u = E (u , v j ) v j must be the zero vector because 1v j } j
is a maximal orthonormal set (Exercises 4, problem 1). So we have shown that (a) implies (b). Now suppose our set has property (b). Then for any ueH we may j=1
write
`
1
1lufl
\ =(u,u)= (ti 00
i=1
\ vj)vj,ti /
lira
=
m
i=1
and again by Exercises 3, problem 3a, this becomes m
Jim
m-+oc lim
vj,u- \ _
j=1
m
n` \ \ (a,v,)(vj.41)
lim oo
j=1
m
(u, j=1
= m lim y
00
rMl (u, vj} I2= r l (u' vj) j`1
I2
j=1=1
Thus (b) implies (c). But for any vector ucH a direct calculation shows that
m
v,II2= IIuiI2- I (u
IIu j=1
;vj1
I2
j=1
It follows from this that (c) implies (b) and hence these two conditions are equivalent.
Suppose now that (v l has property (c) and let u , v be any two vectors j
4. Infinite Orthonormal Sets, Hilbert Space
101
in H. To prove that (c) implies (d), it suffices to show that (b) implies (d). Now
(u,v)= j=1
u,
m
lim
(v, vj vj j=1
lim = m-00 j=1
,vJ)\u'v3)=u\u,v1)\v'v,) j=1
m--moon 1=1
And this is (d). Finally, we show that (d) implies (a). We shall show that assuming our set has property (d), the only vector in H that is orthogonal to every vj is the zero vector. So suppose veH and 0 for every j. Then by (d), we have 00
IIvII2 j=1
By our own assumption each term of the series is zero showing that II V II = 0
and hence that v is the zero vector.
Definition 3. Let (H, (,)) be a Hilbert space. An orthonormal sequence that has any one, and hence all four, of the properties listed in Theorem 2 is called an orthonormal basis for H.
Let us show that the inner product space (e4 (N), (,)) is a Hilbert space. oo
Given a Cauchy sequence { xn } in this space we first note that for each n=1 fixed n x n = I xnl , xn2, xn3, -4
(1)
Now given e > 0 we can find an integer N such that
Ilxn - xrII2
(2)
when m,n > N. Using (1) we may rewrite (2) as follows: 00
E (xnj - xmj)2 < f2 j=1
(3)
Chapter Three
102
Fourier Series in Hilbert Space
for in, n > N. For any fixed j the sequence {xnj }n 1 is a Cauchy sequence of real numbers because 00
- xm) I2= (xn
when m, n > N. Let us set xoj = 11 xnj for each j = 1, 2,.... We shall first e4 (N). Choose and fix n > N and note that for any k we show that 1
have {xoj}
and
in the space Rk. Because II - 112 is a norm on this
space we may use the triangle inequality to write 2
j=1 xoj
9
k
<
((xoj - xnj)Z j=1=11
k
+ E xnj j_1
(4)
Letting k - oc on the right-hand side of (4) we have, by (3). k
j=1
4
<E+IIxnII2
Letting k -' oo we see that xo} C12 (N) . 00
Finally, letting in -+ oo in equation (3) we obtain E (xnj - xoj )2 < E2 j=1
when n > N, and this s a y s that 1 1
o - -;,,112 < e when n > N.
Exercises 4 * 1.
(a) Let (V, (,)) be an inner product space and let S be an orthonormal subset of V. Show that S is a maximal orthonormal set if, and only if, the only vector in V that is orthogonal to every element of S is the zero vector. (b) For each let en = (0,..., 0, 1, 0,...) where the one is in the nth place. Clearly, { e n)' C eR (N) and also { e n } C P,2 (RI) (Section
nil
1, examples (ii) and (iii)). Show that this set is a maximal orthonormal sequence in PR (111) and also in t (111) .
(c) Recall the inner product on C (T) (Exercises 3, problem 4). Show e/ne
nt
that the orthonormal sequence f72=ir I neZ } is maximal. Hint: See Chapter One, Section 7.
111
103
5. The Completion
2. Let (H1, (,) 1) and (H2, (1)2) be two Hilbert spaces over the same field. Suppose that {-;j}' C H1 and {i-vj}' C H2 are maximal orthoj=1
j=1
normal sequences (let us stress that both of these are infinite sequences). M
wj and, given u = E aj vj in H. define T (x)
For each j let T
j-I tobeE 1ajwj. (a) Show that T is well defined. (b) Show that T is a linear one-to-one mapping from H1 onto H2. (c) For any vectors u , v in H1 show that (u , v )1 = (T (u) , T
(r))
;
in particular, 11 u II I= I I T (u) 112 for every u eH1.
The Completion
5.
The Lebesgue integral had a profound effect on the study of trigonometric series. It led, among many other things. to the creation of the Lebesgue classes;
also known as the LP spaces (read "L.P."). Here 1 < p < x, and for each such p, LP is a Banach space whose elements are equivalence classes (Chapter Zero, Exercises 2, problem 1) of functions defined on, let us say, [-7r, ar] in. For f fLP [-1r, ir] the LP-norm off is defined as follows: Ilflip
=(f If(t)IPdt),
In this context, equal functions have the same integral, so this is well defined. It is customary to "abuse language" and to refer to the elements of LP as LP -functions. The connection to trigonometric series is this:
Theorem 1. For each fixed p. with 1 < p < oo, and each f eLP [-7r. irl tile Cesaro means of the Fourier series of f converge to this function for the LPnorm.
When p = 2, and only then, 11 112 is the norm associated with an inner product. Thus L2 [-ir, 7r] is a Hilbert space and here we have a stronger result:
Theorem 2. The Fourier series of any function in L2 [- 7r, a) converges to that function for the L2-norm. We shall not prove Theorem 1 (see the Notes for a reference). Theorem 2
x
eint
follows immediately once we show that { 2 1
l
is an orthonormal basis for
11
L2 [-ir, ir] (Section 4, Theorem 2) . We shall do this very soon.
104
Chapter Three
Fourier Series in Hilbert Space
Since we are not assuming a knowledge of the Lebesgue integral, so we must define the LP-spaces in a different way.
Definition 1. Let (E, II II) be a normed space and let D be a subset of E. We shall say that D is a dense subset of E or that D is dense in E if for any i eE we can find a sequence { F/ n } S D such that n"m V. = i. If (V, (,)) is an inner product space and D C V, we shall say that D * is dense in V if it is -
dense in (V, II II) where II
II
is the norm associated with (,).
We recall that the set of trigonometric polynomials (Chapter One, Section 7, just before Corollary 2) is dense in (C (T), II - Iloo). Theorem 1, quoted above, shows that C (T) is dense in (L" [-w, 7r) , lI . IIp) for each p. We shall use this fact to define the LP-spaces.
Theorem 3. For any normed space (E, II II) there is a Banach space (BI, II .
and a linear map T : E
111)
B1 such that:
(a) For every xeE, 11 II = IIT (x)111;
(b) The set (it is actually a linear subspace) T (E) _ {T (x) I xeE} is dense in (BI, 11 - 111).
Furthermore, if (B2, II - 112) is another Banach space that has properties (a)
and (b), then there is a linear map T' : Bl -+ B2 that is one-to-one, onto, and norm preserving (i.e., IIT' (x) 112 = Ilxlll for every xeBj). A proof of this theorem can be found in Appendix B. We call (B1, II - 111) the completion of (E, II - II) and often say, identifying E with T (E), that E is dense in its completion. On the space C (T) we can define a function II II1 by setting IlfIII =
f* If(t)Idt
for every fEC (T). It is easy to see that this gives us a norm on C (T). The completion of (C (T), II . 111) is defined to be the Banach space L' [-7r, r]. When p > 1, and p : 2, it is not so easy to show that II IIp, defined to be
(1-111(t) Ip
dt)
for each PC(T), is a norm. It is, however, and the completion of (C (T,) II . IIp) is defined to be LP [-1r, 7r] . Because we do not need these spaces, we do not carry this discussion any further.
5. The Completion
105
For the special case p = 2 we have a stronger form of Theorem 3:
Theorem 4. Given an inner product space (V (,)) there is a Hilbert space (H1, ()1) and a linear map T : V -' H1 such that
(a) For all u, v in V, (Tu,Tv) (b) The set (it is actually a linear subspace) T (V) = IT (v) v eV } is dense in (H1,(,)1). Furthermore, if (H2, WO is another Hilbert space that has properties (a) and (b), then there is a linear map ' : H1 -+ H2 that is one-to-one, onto, and for which (7-u , 7-v% = (u , v) for all u , v in H1. 2
A proof of this theorem can be found in Appendix B. We call (H1, (, )1) the completion of (V, (,)) and, as in the case of a normed space, we say V is dense in its completion. We have defined for any f, gEC (T), (f, g) = f ", f (t) g (t)dt, and we have
seen that this is an inner product C (T). The completion of (C (T) , (, )) is denoted by L2 (T) (read "L two of T").
The inner product and norm of L2 (T), when restricted to elements of C (T), is just the inner product and norm as defined above. Thus the sequence etnt 100 2rr
is an orthonormal sequence in L2 (T). We shall show that it is an _ao
orthonormal basis for this Hilbert space. First we need:
Lemma 1. An orthonormal sequence { v j } in a Hilbert space (H, (, )) is an orthonormal basis for H if, and only if, the linear span of { v j } (Appendix A) is dense in H.
Proof. Recall that the linear span of { v j }, denoted by lin { v j }, is the set of all (finite) linear combinations of elements of { v j }.
Suppose first that { v j } is an orthonormal basis for H, and let v eH be given. Then by part (b) of Theorem 2 (Section 4) we have 00
E(v,vj)> vj j=1
Where convergence is for the norm of H. But then the sequence 00
(v,vj)vj j_1
n=1
Chapter Three
106
Fourier Series in Hilbert Space
which is clearly contained in lin { v }, converges to v showing that the linear span is dense in H. Now suppose that lin { v j } is dense in H. To show that { -;j } is an orthonormal basis for H, we need only show that it is maximal (Section 4, Theorem 2). So suppose v EH and 0 for every j. We must show that v is the zero vector. Let e > 0 be given. By our assumption there is a iaf lin { v j } such that II v - w II < e. Thus, by Theorem 2 of Section 3, we have
11v112=1(v,;)1=1(v-iu,v)1<11v-w11I14 <ell4 because wE lin { v 41 and (v , v,) = 0 for all d so (v , w) = 0. Now if v is not the zero vector, then 11111 < e for any e > 0, which is clearly a contradiction.
Theorem 5. The set { e sR }
is an orthonormal basis for the Hilbert space
L2 (T).
Proof. According to Lemma 1 we need only show that the linear span of the given sequence is dense in L2 (T). Let ffL2 (T) and e > 0 be given and use the fact that C (T) is dense in this space to find (T) such that Ilf - 9112 < a. 100 Now the linear span of { `427 . is dense in (C (T), 11 II,o) (Chapter One, Corollary 2, Section 7). Thus there is an element r of this linear span such that I19-7-IIo <
z 7257
But then 119-7-112 = fR* 1 g (t)-r (t) 12 dt <
e
R
(2a) = a
.
Combining our inequalities we have Ilf - 7-112 < IIf - 9112 + ll9 - r112 <
+2=eandwearedone.
//
Corollary 1. For any ffLz (T) the series00 0 (f, 00
eent \ eint
72a) 7 2 converges to f
for the L2-norm. In particular for any f e (C (T)), the series (\/
f
eint
\/) e"K= ==
00 ( l -00
00
j
A
f (t) a-snt
converges to f for the L2-norm. For any such f we also have 00
27r1: 1 c.(f)12
o
R
If(t)12dt=IIfhI2
eCne
6. Wavelets
107
Exercises 5 is contained in this * 1. There is no inner product on L1 [-7r, ir] . Still, {ei"t space. Show that the linear span of this sequence is dense in (L1 [-a, ir] , ' [I
/
00
eint \
*2. When f eC (T) the numbers an = (f, =) satisfy EIan12<00. 00 00
Given a sequence of numbers (On}'. such that
I
Qn 12< oe we
On for any n. The may ask if there is an f eC (T) such that (f, answer is "no." Show, however, that there is an element geL2 (T) such eint \ )3n for all n. that g, * 3.
Let C(R) be the set of all complex-valued functions that are defined and continuous on R. (a) Show that C (R) is a vector space over C (see Section 1 for a discussion of the algebraic operations on functions).
(b) An element feC (R) is said to have compact support if there are real numbers a, b such that {xeR I f (x) 0} C [a, b] . Let Co (R) be the set of all functions in C (R) that have compact support. Show that Co (R) is a vector space over C. (c) For each f eCo (R) let III III = f . I f (x) I dx. Show that II ' II i is a norm on Co (R). The completion of (Co (R) , II ' 11 1) is a Banach space that we shall denote by L1 (R) (read "L one of R").
4. For each f,g in Co (R) define (f, g) to be f . f(x)g(x)dx. Show that (,) is an inner product on Co (R). The completion of (Co (R), (,)) is a Hilbert space that we shall denote by L2 (R).
6.
Wavelets
One of the major applications of Fourier series is to the analysis of real-world eint l data. We have seen that I n E Z . is an orthonormal basis for L2[0,2a]
nt
172P
(Section 5, Theorem 5) and so any function in this space can be approximated,
as closely as we wish in the L2 norm, by a finite linear combination (what physicists call a "superposition") of sines and cosines. There are many instances,
however, when such sums fail to reflect some of the subtleties in the data. A better tool, in these instances, are the wavelet bases. This subject can be traced to some research in pure mathematics; however, many important results
1 1 1)-
108
Chapter Three
Fourier Series in Hilbert Space
came from the investigations of geophysicists and others interested in signal processing. There is now an extensive literature on wavelets and their many important applications, and we can only give the reader a very small sample of the mathematical ideas underlying the subject. A string of measurements, one-dimensional data, or a signal can be thought 00
I f(t) 12 dt
of as a function of time, f (t). The energy in the signal is given by f 00
and because this is finite, f E L2 (R) (Exercises 5, problem 4). In many cases of interest f has compact support (Exercises 5, problem 3b) and many sharp spikes or sharp discontinuities. To represent such functions in a series, we seek a basis for L2 (R), but we want our basis to represent our function at different scales (resolutions). In this way both gross or large-scale features and small subtle features of our data are captured in the representation.
Definition 1. A wavelet is a function 0 (t) E L2 (R) such that {21 w (2Jt - k) j, k E Z} is an orthonormal basis for L2 (R). Note that we have here both translations (the role of k) and dilations of the function ii, the so-called mother wavelet. At this point it is not at all clear that any wavelets exist. There are many, however, and, in applications, a judicious choice of yields a powerful new way to analyze data. Wavelets can be constructed from a multiresolution analysis. This is, by definition, a sequence of closed linear subspaces { V2 } of L2 (R) such that
0) CV_1CVOCV,C...; (ii) U {Vj I j E Z} is dense (Section 5, Definition 1) in L2 (R);
(iii) n {Vy I j E Z} = (01; (iv) f (t) E V2 if, and only if, f (2-it) E VO;
(v) There is a function cp E Vo, called the scaling function, such that {
In applications, a multiresolution analysis is a mathematical framework for decomposing a signal into components of different scales. The spaces V correspond to these scales. If any V,, is given then by (iv), any other V,,, can be found because Hence, a multiresolution analysis is completely {f (2"`-"t) I f E determined by the scaling function V (sometimes called the "father wavelet"). Because the translates of cp give us an orthonormal basis for Ve, the translates of p (2"t) gives us an orthonormal basis for each V". Unfortunately, however, the union of these bases is not, in general, a basis for L2 (R). To obtain an
6. Wavelets
109
orthonormal basis of wavelets, we construct a new sequence of subspaces as follows. For each fixed n E Z let Wn = {w E Vn+1 I `iv-, v) = 0 for all v E Vn}; Wn is called the orthogonal complement of V,, in Vn+1 Every vector in Vn+1 can be written, in just one way, as a sum of one vector from Vn and one from Wn (proved below). To indicate this, we write Vn+1 = Vn ® Wn.
We can repeat this process. Vn = Vn-1 ® Wn_1 and so Vn+1 = Vn-1 03 Wn-1® Wn. Continuing in this way we find that every vector in L2 (R) can be written as an infinite sum of vectors one from each of the spaces Wn. In symbols
L2 (R) = ® Wn. These spaces are mutually orthogonal and f (t) E W,, if, n=-oo
and only if, f (2-"t) E W. We see from this last fact that once we find a function whose translates give us an orthonormal basis for WO, we can obtain
from that function an orthonormal basis for every Wn, and because of the mutual orthogonality of the Wn the union of these bases is an orthonormal basis for L2 (R).
The problem of finding a wavelet can be now outlined as follows. Given a multiresolution analysis {V,,}, and recall this means we are given a scaling function V whose translates constitute an orthonormal basis for VO, construct the sequence {Wn } and find a function 0 whose translates constitute an orthonormal basis for WO. The function , is a wavelet. It turns out that such a 1/i exists for every multiresolution analysis, and the proof of this fact actually gives a method of constructing 0. We have:
Theorem 1. Let {V,,} be a multiresolution analysis with the scaling ao
(-1)k hl_k'P (2t - k), where hk =
function gyp. Then the function z
f
-00
oo
p (t) V (2t - k)dt is a mother wavelet; i.e., {291{' (2nt - m) I m, n E Z} 00
is an orthonormal basis for L2 (R).
Perhaps the simplest example of the process discussed above is found by
using the Haar scaling function H (t), where H (t) = 1 for 0 < t < 1 and H (t) = 0 for all other t. It is easy to see that the integer translates of H (t) give us an orthonormal set in L2 (R). These translates generate the space VO. The function H (t) has its support in [0, 1], whereas the function (2tt- n) H has its support in f 2 , n 2 111. These overlap when n = 0 in the set 10,11 and
when n = 1 in the set
r 12,111
. Thus for the Haar function ho = hl = 1 and
(x) = w (2x) - p (2x - 1). The function 0 (x) is called the Haar wavelet. As we have already said, there are many others and not all of them have compact support. For more about this fascinating subject and for a proof of Theorem 1, we refer the reader to the references listed in Notes.
Chapter Three
110
Fourier Series in Hilbert Space
We end this section with a proof of a property of Hilbert spaces that played a key role in the construction above.
Lemma 1. Let (H. <, >) be a Hilbert space, let M be a closed linear subspace of H, and let y E H\M. Then there is a unique vector into E M such that 11V - ;no ll =inf { 11 y - mll f in E MI. Furthermore, ( y - trco, m) = 0 for all
mEM. Proof. Let d= inf { I y- m I I in E M} and choose a sequence { nt }
such that d =
CM
Then, by Exercises 3, problem 1b, we may write
11 y - mn 11
T$
1
-
-;nr)II2
II(y -mk) +
-
I2
211 y
- mk I + 211 y - mt 112. We may write this as -mkl12+211
JIM1-mkll=211
-hill'-II2y-(m1+ink)112.
The very last term in this expression can be written as 4
M1 +Ink y
in, +'nt.
2
is in M. It
2
mr
follows that 4 II y
mk II
2
Imr-'nkll2<2Iy-..
> 4d2 and so V2+211
-.nr112-4d2.
Recalling the way {mn} was defined we see that our last inequality implies that {m++} is a Cauchy sequence (Section 2, Definition 3). Because H
is a Hilbert space this sequence converges, because {mn} C M and M is closed, its limit, say m0, is in M (Section 2, Lemma 1). Thus 11 y
lim ly -
- mOII
d.
Let us now show that m0 is unique. Suppose we had two vectors m1i m2, in M such that II y X11. Then, as above 11 y
=d= 1 I(y -m1) + (y -m2)112+II(y -m,)
-
2
11V
-
m'111
2 11-y
- (V -»m2)112
- m2112. This may be written as
6. Wavelets
II"a2-ml112 =4d2-4
of the proof 4
ml + m2 Y
, but as we discussed in the first part
2
ml + m2 Y
111
2
2
> 4d2 and so IIm2 - ml II
2
< 0, giving us
m2=ml. Finally, we must show that (y - mo, m) = 0 for all in E M. Let us set
y o = y - mo. Note that for any in E M and any scalar 6, ino + dm is in M, and so
d2IIp-(rno+bm)II2-IIp0-bm1I2=(po-bmo-bm>
=
p0II2-6(m,po) -3(po,m) +161211'nl12. I2
Because I I p ° I I2
= p - mo II
I
= d2 this becomes
(*) 0 < -b (m, po) - b (po.'n) + Now suppose
(p0,
m)
1612
IImII2.
0 for some m E M. Then m could not be zero
(Exercises ,3roblem p la) and so we may set 6 = (p0'm ) IImII2
Putting this into (*) we obtain 0<
-2l
which is a contradiction. IImII2
Iiml2
Imll2
Theorem 2. Let (H, (,)) be a Hilbert space and let M be a closed linear subspace of H. Then M1 = 1; E H I < ; , m >= 0 for all m } is a closed linear subspace of H. Furthermore, (a) M n M1 = J0 }, and (b) each v E H can be written in one, and only one, way as a sum of a vector in M and a vector
in M1. Proof. The proof that M1 is a closed linear subspace we leave to the reader (Exercises 6, problem 1). If v E MnM1, then (v, v> = 0, which gives v = 0 (Section 3, Definition 1c).
For any v E H\M, there is a unique moM such that II
° - mo = II
mf { II v - m II I m E M } and, as we showed in Lemma 1,
(v - mo, m) = 0 for all m E M. Thus v - mo is in M1, i.e., v - ma = p for
112
Chapter Three
Fourier Series in Hilbert Space
some vector p E M1. It follows then that mo + p, where mo E M, p E M1. Suppose now that v = mo+ where nao E M. p' E M. Then + p,
- m')
(p - p). Call this vector h. Clearly h E M, = because h = rrao and h E M1, because h = p' - p. Thus, by (a) h = 0 and we haveino=rao, p = p mo + p' so (me
Exercises 6 1. Let (H, (,)) be a Hilbert space and let M be a closed linear subspace of H.
(a) Show that M1 = (w E H I (w, v) = 0 for all V E m} is a closed linear subspace of H.
(b) Define M11 to be (M1)1 and show that M11 = M.
Notes 1. A discussion of the foundations of quantum mechanics can be found in Introduction to Hilbert Spaces with Applications, by Lokenath Dehnath and Piotr Mikusinski (Academic Press, 2nd edition, New York, 1999). See Chapter 7. 2. A proof of Theorem 1 (Section 5) can be found in Functional Analysis, by Carl L. DeVito (Academic Press, New York, 1978), pp. 92-93.
3. Wavelets are discussed in the book listed in note 1 above. See Chapter 8, and for a proof of Theorem 1 (Section 6) see page 441 (their notation is slightly different from ours.) Wavelets are also treated in Wavelets and Operators, by Yves Meyer (Cambridge University Press, Cambridge, 1992), in A Mathematical Introduction to Wavelets, by P. Wojtaszczyk (Cambridge University Press, Cambridge, 1997), and in Wavelets-Algorithms and Applications, by Yves Meyer, translated and revised by Robert D. Ryan (Society for Industrial and Applied Mathematics, Philadelphia. 1993). This last reference contains an interesting discussion of the
functions of Riemann and Weierstrass (see the discussion at the end of Section 4, Chapter 1) and their connections to wavelets. See pages 115118.
Chapter Four
The Fourier Transform
In the last three chapters we have been concerned with the Fourier series and the problem of recapturing a function from its Fourier series. Here we shall change our point of view and investigate the properties of a certain mapping between pairs of "function spaces." By the term "function space" we mean a collection of functions that is a vector space over C when addition and scalar multiplication are defined in the "usual" way. The mapping, called the Fourier transform, is the object of interest here.
There are a number of natural questions that arise at this point. Why change the emphasis in this way? What, if any, is the advantage in doing so? At this early stage these questions are hard to answer. Suffice it to say that the Fourier transform plays a central role in many areas of engineering, of physics, of chemistry, and even, in recent times, of astronomy. We discuss some of these applications later. So the effort involved in understanding this mapping is well justified. Moreover, the associated mathematics is fascinating and very beautiful.
1.
The Fourier Transform on Z
We recall the space e'(Z) = {a : Z --, C I EIa(n)I < oo} defined in Chapter Three, Section 1, example iv, and the space C(T) of continuous functions on the circle. There is a natural mapping between these spaces: For each a E t' (Z)
we set ¶ (a) = E' a(n)eine and note that because E I a(n)I < oo, the series converges uniformly over T to an element of C(T). The map 3 is called the Fourier transform on Z. Its properties are easily proved, and so our results here provide us with a model for the discussion of this mapping in more complicated settings; i.e., we shall seek analogues of the theorems proved here for the Fourier transform on R and on Z (defined later). 113
Chapter Four
114
The Fourier Transform
Lemma 1. The map 3 : fl(Z) -' C(T) is linear and has an inverse. Proof. For any a, b in fl (Z) we have (a + b) (n) = a(n) + b(n) for all n E Z. Thus 3'(a + b) = E[(a + b)(n)Jeine = E[a(n) + b(n)Jeine = E[a(n)eine + b(n)einoJ
= Ea(n)eine + Eb(n)eine =%a) +%b)
It is easy to see that given A E C, 3(Aa) = l(a) for every a E t'(Z) and so 3' is a linear mapping. We have already noted that for any a E CI (Z) the series Ea(n)eine converges uniformly over T. If we denote this sum by a(8), then (Chapter One, Section 5) it follows that
a(n) =
r a(8)e-inede,
for every n E Z
Thus 3` is one-to-one and, in fact, a'' exists and is given by -' (a{B}J = {
f_a(e)e-t9do}_oo 27r
It is convenient to denote 3(a) by a(8), or just a, and we shall often use this notation below.
Recall that fl(Z) and C(T) both have norms (Chapter Three, Section 2, Definition 1). For each a E £'(T), IIaIII = E Ia(n)I, and for each f E C(T), Illll - = maXOET If(a)1
Lemma 2. The map 9' : fl (Z) - C(T) is continuous and, in fact, is norm decreasing; i.e., Ilalla 5 IIaIII for all a. Proof
I E a(n)einel <_ > la(n)I Ilein°II. 5 > Ia(n)I = IIaIII The continuity of
now follows from Theorem 1 of Chapter Three, Section 2.
There is a very useful operation on tI (Z) called "convolution." It is a kind of multiplication.
Definition 1. For any a, b in £'(Z) and any n in Z, we define (a * b)(n) to be Zm=_x a(n-m)b(rn). This sequence {(a*b)(n)}n _00 is called the convolution of a and b, or a convolved with b, and is denoted by a * b.
1. The Fourier Transform on Z
115
We must show that a * b is in L1(Z) whenever a and b are both in this space. The next lemma proves that and gives us a bound for the norm of the convolution.
Lemma 3. For any a, bin tl (Z) the convolution, a*b, is in @1 (Z) and IIa*bII 1 IIaII1
Ilbill.
Proof. Because Ilalll = E O Ia(n)I we have, for any fixed m, Ilalll = r_ Ia(n m)I. Now
IIa*bII1=EI(a*b)(n)I =EIEa(n-m)b(m)I -00
n
m
Ia(n - m) I Ib(m)I = E Ib(m)I(E Ia(n - m)I) m
m
n
n
Ib(m)I = IIaIIIIlbIIi
= Ilall1 M
The function e(n) = 1 when n = 0, e(n) = 0 when n 54 0 is clearly in t' (Z). Note that a * e = e * a = a for all a and, further, 'e(O) = 1 for all 9 E T.
Two functions f, g in C(T) can be multiplied. We simply set (f g)(9) _ f (9)g(9) for all 9 E T and observe that f g E C(T). The Fourier transform, besides preserving sums and scalar multiples (Lemma 1), also preserves "multiplication." More precisely:
Lemma 4. For any a, b in P1(Z), 3'(a * b) = Q (a) 8'(b). In another notation: Proof. In the expression
* b(9) = 1
{an
- m)b(m) } eine
we set k = n - m to obtain
E E a(k)b(m) I ei(k+m)o = >2 eike {>a(k)b(m)etme} =
{a(k)e'} {>b(m)eIme} = a(B) ,
(9)
m
To see how all of this might be useful in the study of Fourier series, consider
the following result of N. Wiener: Suppose that f E C(T) is never zero, and
116
Chapter Four
The Fourier Transform
the Fourier series of f is absolutely convergent. Then the Fourier series of 1 is absolutely convergent. As one might expect, a proof of this result using classical methods is quite difficult. The reader may want to try to find one. Our discussion of the Fourier transform, however, enables us to reformulate the theorem as follows:
It is clear that a function in C(T) is in the range of ) if, and only if, its Fourier series is absolutely convergent. Thus the hypothesis of Wiener's theorem
tells us that there is an a E e' (Z) such that a = f. Suppose now that we can find bE e' (Z) for which a * b = e. Then £(a * b) = a b = :3`(e) = 1; i.e., a(9) . b(O) = f (9) . b(9) = 1 for all 0. Because f (O) is never zero, this says that b = 1. Thus 1 is in the range of !a, it is equal to b, and hence its Fourier series is absolutely convergent. In other words Wiener's theorem, a theorem about Fourier series, is equivalent to the following: If a E e'(Z) and a is never zero, then there is a b E t'(Z) such that a * b = e; i.e., if a E e'(Z) and a never vanishes, then a has a "multiplicative inverse" in e' . We shall investigate this result more fully in the next section. To begin, let us define our terms.
Definition 2. An element a of e' (Z) is said to be invertible if there is an element b in e' (Z) such that a * b = b * a = e. We call b the (see Exercises 1, problem 2(e)) inverse of a and denote it by a'.
It is clear that if a E e' has an inverse, then a is never zero, because a* a-' = a(a-') = F = 1. What is difficult is showing that when a is never zero, a has an inverse. We end this section with a proof that (e' (Z), II II 1) is a Banach space; i.e., every Cauchy sequence of points in this space converges to a point of this space (Chapter Three, Section 2, Definition 3).
Lemma 5. The space (e' (Z), II 11 1) is a complete normed space. -
Proof. Let {a'}'1 be a Cauchy sequence in e' (Z) and, given e > 0, choose N so that IIa"-am1I1 <e whenever both m and n exceed N. Then for any fixed jo and any fixed k with jo < k, we have k
00
Ia"(jo) -a' (jo) I <_ E Ia"(.?) - am(J)I 5 E Ia"U) - am(j)l< E
j=-k
00
whenever both m and n exceed N. It follows that {a"(j)}n is a Cauchy 1
sequence in K for each fixed j. Because K is complete, this sequence converges.
1. The Fourier Transform on Z
117
Let b(j) = liiun_x a" (j) for each j E Z. Now things get a little bit Inessy. We let
bk =I ..,O.b(-k),b(-k+l), .b(O),b(1),. ,b(k),0,. -} and we let, for each it.
bk = {... ,O,a"(-k),... a"(0) ... a"(k) 0 ...} Then bk and bk are in f'(Z) and we have k
IIbkIII <- Ilbk-bkII1+IIbkIII <-E+:Ia"(j)l J= -k
whenever it > N. If we fix n > N and let k tend to infinity we find that 00
_x
Ib(J)I < e + IIa"III
showing that b = {b(j)}' 1o is in f'(Z). Finally, we shall show that line".-.00 IIa" - b111 = 0. As we saw above, for any fixed k, k
j=-k
la"(j) - am(j)l < e
whenever both in and n exceed N. Letting in go to infinity we get k
E la"(j) - b(j)I < e
j=-k
whenever it > N. Now choose, and fix, it > N and let k go to infinity. We find
IIa" - bIlI se whenever n > N.
Exercises 1
1. Let a E f'(Z) and, for a fixed in E Z. let a...(n) = a(n - in.) for all n. Show that a,,,(0) = e+1mea(g)
118
Chapter Four
The Fourier Transform
* 2. For any a, b, c in e' (Z) and any A E C show that:
(a) a*b=b*a: (h) A(a * b) = (Aa) * b = a * (Ab);
(c) a*(b*c)=(a*b)*c;
(d) a*(b+c)=a*b+a*c. (e) Show that the inverse of a, when it exists, is unique.
* 3. Let {b,, } C e' and suppose that lim b = bo for the e'-norm. Show that lima * b = a * bo for any a E 0. If a E 1' is fixed, show that the map Ta : f' --+ V defined by TQ(b) = a * b is linear and continuous.
4. Let C E f' be defined as follows: c(n) = 1 if n = 0 or n = 1, and c(n) = 0 for all other n.
(a) Show that (c * a)(n) = a(n) + a(n - 1), here a E t' is arbitrary. (b) Let c' = /c, c2 = c * c, .... Show that for any fixed m > 0 Cm(n) _ (n) = n.(+n, n .' 5. Define g E 1'(Z) by letting g(1) = 1, g(n) = 0 for n # 1.
* (a) Let g' = g, gz = g * g, ... and show that fork fixed, gk(n) = 1 when n = k and gk(n) = 0 when n 34 k.
* (b) Let g`' (n) = I when n = -1, g- (n) = 0 for n # 1. Show that g-' is the inverse of g.
* (c) Compute (g-')k for each positive integer k.
* (d) For any fixed a E 1' show that a = Eo c a(n) gn where convergence is for the e'-norm; here go = e. (e) Given a, b in 1' use the series representation of these elements given in (d) to find a series representation for a * b.
(f) Use the representation given in (d) to show that to every continuous linear map cp : t' (Z) - C there corresponds a unique sequence {cn } such that
i.
6. If f E C(T) is never zero, show that 1 is in C(T).
2. Invertible Elements in el (Z)
2.
Invertible Elements
119
in £'(Z)
We recall that e(n) = 0 when n 54 0, e(0) = 1, and an element a of 1l is invertible if there is an element a-1 E P1 such that a * a-1 = a-1 * a = e. For a E Q1, a 0 0, we set a° = e, a1 = a and, for any integer k > 1, we set ak=a*ak-1
Theorem 1. Let a E 1'(Z) and suppose that Ilalll < 1. Then the element (e - a) is invertible. Furthermore, if a 36 0,
(e -a)-' = E a j=0
where convergence is for the 11-norm.
Proof. For each n > 1 let us set bn =
j=0 aj and consider the sequence
{bn} C 6. When r > n we have Ilbm-bn111=11
a'1115 j=n+1
Ila!111
j=n+1
M
E Ilalli (Lemma 3 of Section 1 of this chapter) n+1
E Ilalli < lllallllalll n+1
because the series of real numbers Eo Ilalli is geometric, and because Ilalll < 1,
it converges to 1'u
But, again because Ilalll < 1, limn_.a, Ilalli+1 = 0
showing that {bn} is a Cauchy sequence in (11, II Ill). By Lemma 5 of Section 1, this sequence converges to an element of 0 that we shall denote by cc
F,
aj
j=0
We shall now show that this sum is the inverse of (e - a). We have
(e-a)*
(ai) =(e-a)* (Iima3) r
=slim {(e - a) * -00
(Exercises 1, problem 3)
0
n
n
= noo lim
E aj]
aj j=0
aJ+1
j=0
lim (e. - an+1) = e
120
Chapter Four
The Fourier Transform
By Exercises 1, problem 2(a), reversing the factors give the same result.
Corollary 1. Let a E e1(Z) be invertible and suppose that b E e1 satisfies IIa - bll I < T' . Then b is invertible. IF,
Proof. We have Ile-a-1*bII, = IIa-1*(a-b)II, 5 IIa-1II,IIa-bllI < 1 by Lemma 3 of Section 1 and our hypothesis. Thus, by Theorem 1, [e- (e-a'1 *b)] = a-1 *b is invertible. Call this c. So c = a-1 * b and c is invertible. But then b = a * c, and this last convolution is invertible (its inverse is c-1 * a-1). It follows from Corollary 1 that the set of all invertible elements of e1 is open
and hence that its complement, what is sometimes called the set of singular elements of e1, is a closed set.
Definition 1. Recall that a linear map from e1(Z) into C is called a linear functional on e1 (Z). A linear functional gyp, that is not identically zero, is called a multiplicative functional if cp(a * b) = V(a) Wp(b) for all a, b in t1 (Z). If yp is a multiplicative functional, then W(a) 34 0 for some a. Consequently, cp(a) = cp(a * e) = v (a) ,p(e), and we see that W(e) must be one. Furthermore, if b is invertible, then Wp(b) 54 0 and, in fact, cp(b-1) = w . Theorem 1 tells us
more about these functionals.
Corollary 2. If V is a multiplicative functional on e1 (Z), then for every a in this space we have IV(a)I S I1all1. In particular, every multiplicative functional on e1 (Z) is continuous on this space (Chapter Three, Section 2, Theorem 1).
Proof. We argue by contradiction. Suppose that for some a we have Icp(a)I > llall1. Set A = ,p(a) and note that because I X 1 < 1, Theorem 1 tells us that 0 showing that because O is a (e - j4 0 is invertible. But then cp (e linear map, V (z) 34 1. This contradicts the definition of A. I
I I
Let us now choose, and fix, a multiplicative functional cp on e1 and consider M(cp) = {a E e1 I v(a) = 0}. The properties of M(W) are easily proved but worth listing.
(a) M(V) is closed. C M(c,) and lima,, = ao. Then Suppose clearly this gives us cp(ao) = 0; i.e., ao E M(ap)-
and
(b) M(V) is a linear subspace of e1, M(ap) 0 e1, and if N is a linear subspace for which M(V) 9 N, then either M((p) = N or N = V.
2. Invertible Elements in 1'(Z)
121
Because c is linear, it is clear that M(W) is a linear subspace of e', and
because p is not identically zero, M(W) 54 L'; in fact, we may choose ao E el such that p(ao) 0. Now given b E V set a = o . Clearly v(b - aao) = 0 so b - aao = c E M(AO), and we see that b = c + aao. Now suppose M(ap) C N. If M(W) = N, we are done. If there is a b E N\M(cp), then b = c + aao for some scalar a and some c E M. Because a 34 0 and
N is a linear subspace, we see that ao E N. But then N = el as claimed. (c) If a E M(V) and b E t', then b* a E M(ap). Because v(b * a) =,p(b) v(a) this is obvious.
Definition 2. A nonempty subset I of e' (Z) is said to be an ideal in el (Z) if (a) 1 is a linear subspace; (b) whenever a E I and b E t' (Z), b * a is in I. An ideal I is called a proper ideal if 1 # L'(Z). Finally, a proper ideal is called a maximal ideal if the only ideals that contain it are itself and V (Z).
We observe that a proper ideal cannot contain e nor can it contain any invertible element. Thus if I is a proper ideal, then I is contained in the set of singular elements of el (defined just before Definition 1). Because this last set is closed, we see that the closure (Chapter 3, Exercises 2, problem 2) of I is contained in it. We leave it to the reader to show that the closure of an ideal is an ideal and we have shown:
Lemma 1. The closure of a proper ideal in t' is a proper ideal in el. In particular, every maximal ideal in t' is closed. Let A be a nonempty family of nonempty sets. A subfamily N of A is said to be a nest if for any sets A, B in N either A C B or B C A. A nest M is called a maximal nest if the conditions (i) N is a nest and (ii) M C N together imply that M = N. So a maximal nest M is as large as possible; more precisely, if
A¢ M, then there must be a set B E M such that neither A C B nor B C A is true. There is a useful equivalent form of the axiom of choice (Chapter Zero, Section 2) that was first proved by Hausdorff. Maximum Principle: Given any
nest N in A there is a maximal nest M in A such that NC M.
Lemma 2. Every proper ideal in £1(Z) is contained in a maximal ideal. Proof. Let I be a proper ideal in el(Z) and let Z be the family of all proper ideals in this set. Then {I} is a nest in Z. By the maximum principle there is a maximal nest M in Z that contains {I}. It is convenient to index the ideals in M (Chapter Zero, Section 2, just after Definition 3). So we may write
M = {I,, Ia E A}, and we know that because {I} C M, I = IQ for some a E A. Now let J = IJ0EA IQ (Chapter Zero, Section 2, just after Definition 3). It is clear that I C J. We shall show that J is a maximal ideal.
122
Chapter Four
The Fourier Transform
Our first task is to show that J is a linear subspace of t' (Z). So given a, b in J we must show that any linear combination of these two vectors is in J. But
we have a E Ia and b E Is for a,,3 in A. Because M is a nest, either Ia C 1q or Is C I,,. Let us suppose the former. Then a and b are in II, and because Is is a linear subspace of e' (Z), any linear combination of a and b are in Iq. But Ip C J so we see that J is a linear subspace of e' (Z). Next we must show that for any a E J and any c E el(Z), a * c is in J. Now a E Ia for some a and because I, is an ideal a * c is in this set. Thus a * c E IQ C J; and we have shown that J is an ideal in t' (Z) that contains I. Now .1 = VQEA to and each Ia is a proper ideal. Thus e ¢ Ia all a E A, and so e tf J and we see that J is a proper ideal. Finally, we must show that J is a maximal ideal. Suppose J' is a proper
ideal in tl(Z) and suppose that J C X. Then M U {J'} is a nest in I and M C M U {J'}. Because M is a maximal nest, we must have M = M U {J'};
thus J' E M and so J' = I.y for some ry E A. Clearly, we have reached a contradiction because Iy C J. Our discussion above, see (b) and (c), shows that the null space of any multiplicative functional is a maximal ideal in el (Z). Using some abstract algebra and some complex function theory (Liouville's theorem), one can show that the converse is true. Because we are not supposing that our readers are familiar with these topics we shall only state: (*) Any proper closed ideal of i'(Z) is contained in the null space of a multiplicative functional on el (Z).
Theorem 2. Given a maximal ideal M in el(Z), there is a 0 E T such that
M= {aEel(Z)Ia(O)=0}. Proof. Let cp be a multiplicative functional on el such that M = {a E e' cp(a) = 0}. Such a cp exists by (*). Now recall the element g defined in Exercises 1
1, problem 5: g(1) = 1, g(n) = 0 for n 34 1. Because Iw(a)I S IIafl, for every a Eel and because light = 1, we see that A = (p(g) must satisfy IAI < 1. But IIg'II i = 1 so 111 < 1 also and we conclude that A = W(g) = e+i6 for some fixed 0 E T.
Now for any a E t' we have
p(a) =
Ckim
k-00
a(n)gnJ k
= lim
k-00
00
a(n)ene
a(n)W(g)" =
-k
(-k a(n)ger/
-00
Thus V(a) is just a evaluated at 8, and so M = {a I cp(a) = 0} = {a E Il a(6) = 0} as claimed.
3. The Fourier Transform on IR
123
Theorem (N. Wiener). If f E C(T) is never zero and if the Fourier series of f converges absolutely, then the Fourier series of 1 converges absolutely. Proof. We cannot claim to have here a complete proof because we simply stated (*) and it is crucial. We are only giving a sketch. As we have seen we can choose
b E t' such that b = f and our hypothesis tells us that b is never zero on T. Suppose that b does not have an inverse in tl, and let I = {a * b I a E tl}. Clearly, I is an ideal and because e ¢ I, it is a proper ideal in 1'. Thus there is a multiplicative functional V such that I C M = {a E t' I W(a) = 0}. Then for some 0 E T we must have M = {a E t1 I a(9) = 0}. But this says b(9) = 0,
which is a contradiction. It follows that b is invertible in t1; i.e., there is a b-1 E 11 such that b-1 * b = e. Clearly, (b-1 * b) = F or (b'1) f = 1. Exercises 2
*1. For any fixed aEtl(Z),letI(a)={a*bIbEi'(Z)}. (a) Show that I(a) is an ideal in t'(Z) that contains a. (b) Show that I(a) is a proper ideal if, and only if, a is not invertible. (c) Show that the closure of a proper ideal is a proper ideal. 2. For any a E e1(Z), let o(a) = {A E CI (a - \e) is not invertible). We call a(a) the spectrum of a, and we call C\o(a) _- p(a) the resolvent set of a. (a) Show that o(a) is a closed set by showing that p(a) is an open set.
(b) Show that any \ E C such that Al I> Ilalli is in p(a) and hence a(a) c {A E Cl Al I<- IIalli}.
3.
The Fourier Transform on IR
The theory here is very much like that for the case of Z except that the definitions, and the proofs of the theorems, involve subtleties not found in the discrete case. Let us begin informally, pursuing, as much as possible, the analogy with V (Z). Because this space consists of all a : Z - C such that la(n)l < oo; we "should" define V(R) to be
{f
1700 If(t)Idt
Assuming that this is a vector space we set, for all f E L' (R),
IIfIli =
lf(t)ldt
Chapter Four
124
The Fourier Transform
With a "proper" interpretation of the integral, we can show that (L' (R), 11
11 1)
is a Banach space. (See Chapter Three, Exercises 5, problem 3(c) for a construction of this space.) The integral needed is that of Lebesgue, an extension of the classical integral of Riemann. We shall not assume a knowledge of this integral and so our proofs of the theorems stated below are only "sketches"; i.e., they are incomplete. For each f E L1(IR) we define
f (x) = f 00f (t)e{xtdt Observe that f is also defined for all real numbers. This is a problem, and it is wise to distinguish the domain of f, which is R, from that of f . We shall do so by agreeing that f is defined on a "second" real line that we shall denote by IR.
For each a E f1(Z), a is a continuous function on the unit circle; i.e., a is an element of C(T). Let C(R) be the vector space of all continuous functions on A. Then we have:
Lemma 1. For each f E L1(R), ?E C(A). Moreover, the map :3` that takes each f to f is linear and one-to-one. Proof. Let f E L1(R) be given. Then 00
rx
f (x + h) = f f(t)e&(:+h)tdt =
J
00
Hence
etnt f(t)es=tdt
x
x
f (x + h) - f(x) = f (eiht - 1)e'xtf(t)dt and
x If(x+h) - f(x)I <_ f lf(t)I
Iesht
- lldt
As h --+ 0 so does Ic'ht - lI and, one can show, so does the integral. Thus f E C(R).
It is easy to see that a : L' (R) - C(R) is a linear map. To show that 3 is one-to-one, we need only show that one can recapture f E L1(R) from a knowledge of its Fourier transform f . To do this one uses a kind of Cesaro convergence for integrals. We have R
f (t) = lim
,
`l7f
fR
(1
I
lf J
(x)e-iztdr
3. The Fourier Transform on R
125
where convergence is for the Li-norm (compare this to Chapter One, Exercises 6, problem 3).
Note. When f happens to be integrable this formula gives
f
O0
M) =
27r
f (x)e-'=tdx
which is valid in most practical applications.
Lemma 2. For any f E L' (R),
f (x) = 0. Furthermore,
maxIf(x)I 5 Ill Iii
zER
XER
Proof. The first statement is called the Riemann-Lebesgue theorem. It generalizes the theorem of Riemann given in Section 5 of Chapter 1 (Theorem 1). We have
f (X) = J f (t)e''tdt and
- f (X) =
r x etz(t+;) f(t)dt = r
J 0
00
e4--f(t - - )dt X
J
Hence
21f(x)l 5 f
If(t) - f(t - -)Idt
For any f E L'(R) this last integral tends to zero a IxI - 00. We leave the second statement to the reader. Let us illustrate our results by calculating the Fourier transform of some specific functions. In these examples a is a fixed, positive, real number.
(a) Let U(t) = 1 when t > 0, U(t) = 0 when t < 0. Set f (t) = e-atU(t).
f
00
If (t)Idt =
f
00
e-atdt =
1
a
o
so f is in L'(R). Now 00
f (X) = f e-atU(t)etztdt = f 00 e(-a+'=)tdt 00
= ar-. lim
o
e(-a+iz)M
l -a+ix I
-
1
1
-a+ixj
=
1
a-ix
Note that f is continuous and that liml=I-00 If(x)I = line
a
1
+z
= 0.
126
Chapter Four
The Fourier Transform
(b) Let la(t) = 1 for ItI
a, 1a(t.) = 0 for Iti > a. eiax - e- sax
ja
la(x)
e'xtdt =
= 2sinax x
ix
a
Again we note that the Fourier transform is continuous and tends to zero as 1X I - oc.
(c) Referring to example (b), let k(t) = (1 - ItI)lI(t). Because k(-t) = k(t) we see that
k(x) =
J
I
k(t)cosxtdt = 2 J I (1 - t)cosxtdt
I
o
(1 - cosx1 4 y =2I x2 J- x2 sin
stn(a)l2 xl k2) - L x/2 J
(d) Consider the Abel kernel f (t) =
f(x) =
J
x x fu e(a+:Z)tdt + e-ahuIetxedt _ e(-a+u)tdt J x x fo 1
= lim M-.-oc La+ix
-
e(a+ix)M
a+ ix
I + liIn
e(-a+ix)N
1
N-oc -a + ix - I\-a+ix
1
J
2a
a2+x2 (e) Here we simply note that when f is differentiable we have
d f (x)
x = dx J
f
jut f °`
which is the Fourier transform of i t f (t). More generally, Fourier transform of (it) n f (t).
(t)]etdt
I (x) is the
In the Space PI (Z) we found a kind of multiplication. There is an analogous operation here.
Definition 1. For any f, g E L' (R) we define (f * 9)(x) = foc f (x - y)9(y)dy Lemma 3. For any f ,gin LI (R) the function f *g is in L I (R) and Ill * g (I I < Ilf III 119111.
3. The Fourier Transform on R
127
Proof
Of * 9II = /1(1 * g)(x)ldx = <
f f dx
ff I
f(x - y)9(y)dyl ds
If(x-y)II9(y)Idy= f I9(y)Idy I If(x - y)ldx
= IIgIIl fCQ If (x - y)ldx = IIgII1 Ill Iii
assuming the interchange in the order of integration is valid.
Lemma 4. For any f, g in L1(R) we have `£(f *g) = t3(f)a(g) or, in another notation, (f * g) = f . 9 Proof (f *9)(x) = f (f * 9)(t)e,=tdt = f e42C (f f (t - u)g(u)du) dt setting w = t - u this becomes
f
f
etx(w+u)du f f (w)g(u)dw =
e`xug(u)du / etxw f(w)dw
=9(x)-f(x)
J
A typical application of the Fourier transform is to solve differential equations. As an illustration of this consider
- t t2 + n2u(t) = f (t) where f (t) is assumed known. Applying !J' we get
3 [- dt2 +
n2u]
(:x)2u + a2u =
(x2
+ a2)u = f
Thus u(x)
= (x2 + a2) f (X) = g(x)f(X)
where, from (d),
g(t) =
1
e-aItl
2a
It follows, then, from Lemma 4 that u (t)
e-a2nltl
1
* f(t)
2n
foo
e-alt alf(s)ds
128
Chapter Four
The Fourier Transform
Exercises 3 1. Let f (t) = e2*t I I (t). Compute f (x).
2. Let f E L' (R) and let a, h be real numbers. Show that (a) For a j6 0, (b)
Q[f(at)] = Tjf(p);
f(t - It)] = e+imhf(x);
(c) Qr[eiat f(t)] = f (x + a); (d)
at) f (t)) =' f (x - a) +'f (x + a).
3. For any f, g, h in L' (R) and any A E C show that
(a) f *g=g*f; (b) A(f * g) _ (Af) * g = f * (Ag);
(c) f * (g * h) _ (f * g) * h;
(d) f *(g+h)= f *g+f *h. *4. Show that there is no function e(t) E L'(R) such that e * f = f * e = f for all f E Ll (R). Hint: If e exists, then e * e = e. Now apply the Fourier transform.
4.
Naive Group Theory
We can get a deeper understanding of the Fourier transform by using some ideas from modern algebra. These are usually discussed abstractly, but, we believe, it may be better to begin with some concrete examples.
Definition 1. A nonempty subset A of C is said to be an additive group if (a - b) eA whenever both a and b are in A.
It is clear that C, R, Q, Z and even {0} are additive groups. The last of these is called the trivial group. We might also recall that, for any function f : R -+ K, the set P (f) is an additive group (Chapter One, Section 2, Lemma 1). It is a nontrivial group when, and only when, the function f is periodic. Many of the properties of P (f) are shared by any additive group.
Lemma 1. Let A be an additive group. Then (i) (kA; (ii) for any aeA the number (-a) is in A; (iii) when a and b are in A so is (a + b); (iv) for any aeA the set {na I neZ} is contained in A.
4. Naive Group Theory
129
we must have (a - a) Proof. We may assume that A is nontrivial. Then for in A, which proves (i). Now that we know OeA, (0 - a) is in A for every aeA, proving (ii). To prove (iii) we note that if a, b are in A, then a, -b are in A, and
so a - (-b) is in A. Finally, let aeA and suppose that na is not in A for some positive integer n. Let no be the smallest positive integer such that noa /A.Then by (iii) we see that a+a = 2a,-A; hence no > 2. It follows that (no - 1) aeA because of the way no was defined. But, then, again using (iii), (no - 1) a + a = noa is in A, which is a contradiction. We have shown that naEA for every positive integer n. By (i), however, 0 a is in A, and by (ii) (-na) = (-n) a is in A for every n. Thus {na I nEZ} C A as claimed. When A, B are additive groups and B C A we shall say that B is a subgroup of A. In particular, when aeA the set {na I nEZ}, which is clearly an additive group, is a subgroup of A.
Clearly, every additive group is a subgroup of C, but it is often useful to specify that we are working with a subgroup of R or a subgroup of Z.
Corollary 1. Let A be a subgroup of the additive group R. If A contains a smallest positive member, say ao, then A = {naa I neZ}. Proof. We have seen that {nao I neZ} C A. Given b&A suppose, first, that b is positive. Then ao < b, and we may write b = kao+r where k is a positive integer and 0 < r < aa. Because both b and kao are in A, it follows that rcA and hence that r = 0. Thus be {nao I neZ}.When b is negative, (-b) is a positive member
of A and so (-b) = kao for some integer k. But then b = (-k) ao and we are done.
Corollary 2. If A is a subgroup of the additive group Z, then either A = {0} or A = {ne I neZ} where a is the smallest positive member of A.
Definition 2. A nonempty subset M of C is said to be a multiplicative group if xy-1eM whenever both x and y are in M. It is clear that C\ {0} , R\ {0}, and Q\ {0} are multiplicative groups. So are the sets { 11 1, -1 and T (the unit circle). We have an analogue of Lemma 1.
Lemma 2. Let M be a multiplicative group. Then (i) If M; (ii) for any xEM the number x-1 is in M; (iii) when x and y are in M so is xy; (iv) for any xeM the set {x" I neZ} is contained in M.
When M, N are multiplicative groups and N C M, we shall say that N is a subgroup of M. In particular, when xeM the set {x" I neZ}, which is clearly a multiplicative group, is a subgroup of M.
130
Chapter Four
The Fourier Transform
It is now necessary to specify the arithmetic operation under which a given set is a group. Unless it is clear, from the context, what operation we mean, we shall denote an additive group by (A, +) and a multiplicative group by (M, .).
The exponential function, f (x) = e2, maps R onto the positive real numbers. The set of all positive real numbers, let us denote this by R+, is clearly a multiplicative group. Thus f maps (IR, +) onto (IR+, ) and, furthermore, f (x + y) = f (x) f (y) for all x, y in R. This particular function is, of course, both one-to-one and onto. Let us look at this kind of mapping in a slightly more general context.
Let (A, +) be an additive group, let (M, ) be a multiplicative group, and let f : A M be a function such that f (a + b) = f (a) f (b) for all a, b in A.
Because 0 ' M and f (a) = f (a + 0) = f (a) f (0), we see that f (0) = 1.
But then 1 = f (0) = f (a + (-a)) = f (a) f (-a), giving us the fact that f (-a) = f (a)-1. Using these observations we can show that the range of f is a subgroup of (M, .). To do this suppose x, y elements in the range of f are given. We must show that xy-' is in the range of f. But we have a, b in A such that f (a) = x and f (b) = y. Thus (a - b) fA, because A is an additive group,
and f (a - b) = f (a + (-b)J = f (a) f (-b) = xy-1, proving our claim. Let us now show that {arA I f (a) = 1} is a subgroup of (A, +). Given a, b in this set (i.e., given a, b in A such that f (a) = 1 = f (b)) we must show that a - b is in this set (i.e., we must show that f (a - b) = 1). But f (a - b) = f (a + (-b)) = f (a) f (-b) = f (a) f (b)-1 = 1 and we are finished. Exercises 4 * 1. Prove Lemma 2.
+ 2. The function f (x) = In x maps IR+ onto It Moreover, for any x, y in IIt+ we have f (xy) = f (x) + f (y). This particular function is one-to-one and onto. Now let (M, ) be a multiplicative group, let (A, +) be an additive
group, and let f : M - A satisfy f (xy) = f (x) + f (y) for all x, y in M. (a) Show that the range of f is a subgroup of (A, +). (b) Show that {xfM I f (x) = 0} is a subgroup of (M, ). (c) Show that U4 = {±1,±i} is a multiplicative group. (d) Let f : Z -. U4 be defined as follows: f (n) = in for every n. Show that f (m + n) = f (m) f (n) for all m, n in Z. Identify the range of
f and If (n)=1}. (e) Let g : Z -' U4 be defined as follows: g (n) = (-1)" for every n. Show that g (m + n) = g (m) g (n) for all m, n in Z. Identify the range of g and {nfZ I g (n) = +1} .
5. Not So Naive Group Theory
131
* 3. Let n be a fixed positive integer. Let U,, = {zeC I z" = 1} .
(a) Show that U. is a multiplicative group; in fact, it is a subgroup of T.
(b) Let p (x) be a given polynomial. The complex number r is a root of the equation p (x) = 0 if, and only if, p (r) = 0. We recall that
r is a root of the equation p (x) = 0 if, and only if, x - r is a factor of the polynomial p (x). The largest positive integer k such that (x - r)' is a factor of p (x) is called the multiplicity of the root r. A root of multiplicity one is called a simple root. Show that any root of p (x) = 0 that has multiplicity greater than one is also a root of p' (x) = 0 (the derivative of p (x)). Conclude that every root of x" - 1 = 0 is simple and hence that U,, contains n distinct numbers.
4. Let Q (f) = (a + b V2- I a. b are in Q 1. Clearly, Q (v C R.
(a) Show that Q (f) is a subgroup of (R, +). (b) Show that Q (v'-2) \ {0} is a subgroup of (R\ {0} , .) .
# 5. When (M, ) is a multiplicative group and
the set {x" I neZ} C M,
as we have seen. If there is an xeM such that {x" I ncZ} = Al, then we say that (M, ) is a cyclic group and we say that x is a generator of M.
(a) Show that (U4, ) is cyclic and find all of its generators. (b) Show that (U,,, ) is a cyclic group and identify a generator. Hint: Every element of U" is an element of T and hence can be written as e'0.
(c) An additive group (A, +) is said to be cyclic if there is an such that A = {na I neZ}. We have seen that any subgroup of Z is cyclic. Is Z cyclic?
5.
Not So Naive Group Theory
We have seen that for any f : R - K, the set P (f) = {aeR I f (x + ne) = f (x) for all x) is an additive group. Observe that for real numbers x, y we have f (x) = f (y) when x - y is in P (f ); for then x - y = aeP (f) and f (x) = f (y + a) = f (y). Thus f is constant on certain subsets of R. We can get a better understanding of the structure of these sets by making use of the group theory we discussed in the last section.
Definition 1. Let A be an additive group and let C be a subgroup of A. We shall say that the elements a, b of A are congruent modulo C, and we shall write a m b mod C, if (a - b) EC.
132
Chapter Four
The Fourier Transform
We shall give many examples of this very soon. For now we observe that congruence modulo C gives us a relation on the set A; it is in fact, as we shall now show, an equivalence relation (Chapter Zero. Section 2).
Lemma 1. For any additive group A and any subgroup C of A, congruence modulo C is an equivalence relation on A.
Proof. We must show that this relation is reflexive, symmetric, and transitive. To prove that it is reflexive we must show that for any aEA, a - a mod C; i.e., we must show that for any aeA, (a - a) is in C. But this is immediate because a - a = 0 and 0 is in any subgroup of A (Section 4, Lemma 1, (i)).
Next we must show that for any a, b E A such that a - 6 mod C, we also have b - a mod C. Again this is immediate because (a - b) E C and so - (a - b) = b - a is in C by Lemma 1. (ii), of Section 4. Finally, suppose that a. b, c are in A and a - b mod C, b - c niod C. We must show that a - c mod C. The first of these gives us (a - b) E C, and the second gives (b - c) E C. By Lemma 1, (iii), of Section 4, we may add two elements of C. Thus (a - b) + (b - c) = (a - c) E C. This, of course, says
a - cmodC. For any aeA, [a] = {bfA I a - b mod C} is called the equivalence class containing a (Chapter Zero, Exercises 2, problem 1). The set of equivalence classes is denoted by A/C; read "A mod C." We began our discussion by noting that for any f : R - K, the set P (f ) is a subgroup of the additive group R and that f (x) = f (y) whenever x =y mod P (f ). Thus f is constant on the equivalence classes of R; i.e., f is defined on the elements of R/P (f ). As an illustration of this, suppose f has smallest positive period 27r. Because P (f) is then {(2-,r) n I neZ }, we see that x - y mod P (f) simply means x - y = (2ir) k for some integer k. But any real number x can be written as x = (2a) k+r where k is an integer and 0 < r < 27r. Thus each equivalence class contains a number (and clearly only one such number) from the set (0, 2ir). So we may regard f as being defined on this last set. This is, of course, just another way of saying that f is a function on the unit circle. For subgroups of the additive group Z the construction of the equivalence
classes is simple and illuminating. To be specific, we take A = Z and C = {6n [ neZ}. Then a - b modC means a - b is a multiple of six. Now for any aEZ we may write a = 6n + r, where n is an integer, r is an integer, and 0 < r < 6. Thus any aEZ is equivalent to one, and clearly only one, of the numbers 0, 1,2,3,4,5. Thus A/C = {[0] , [1], [21, [31, [41, [5]}, where [0] _
{nEZI n-OmodC}.... The choice of six is simply one of convenience. We should note here that C = {6n I rzEZ} is usually denoted by (6) and the set A/C = Z/ (6) is often denoted by Z6. Furthermore, a - b mod C is often written a 6 mod 6.
5. Not So Naive Group Theory
133
Given any subgroup C of the additive group Z, we have seen that C = {ne I ncZ) where a is the smallest positive member of C (Corollary 2 to Lemma
1 of Section 4). We shall denote this subgroup by (e). Note that Z/C = Z/ (t) consists of the a equivalence classes [0], [1] , ..., [t - 1]. This set is denoted by Zt. Returning now to the case of an additive group A and a subgroup C of A, let us note that there is a natural "addition" on the elements of A/C. Given [a], [b] in this set we choose an element of [a], let us take a, and an element of [b], take b, form their sum in A and set [a] + [b] equal to the equivalence class containing a + b. To show that this is well defined, we must show that if a'f (a) and b'f (b), then a' + b' and a + b are in the same equivalence class; i.e., we
must show that (a + b) - (a' + b') mod C. This is very easy. Because a'f [a] and b'f [b] we have a - a' in C and b - b' in C. Thus (a - a') + (b - b') is in C (Section 4, Lemma 1, (iii)), which says (a + b) - (a' + b') mod C. Let us illustrate this operation for the case of Z6 = Z/ (6). To add [3] and [4] we take any element of [3], let's take 3 (you can take 45 if you prefer). and any element of [4], we'll take 4 but you can take 52 if you wish, and add them.
So, because 3 + 4 = 7, [3] + [4] = [7] = [1] because 7 - 1 mod 6 (note that 45 + 52 = 97 so [3] + [4] = [97], but 97
1 mod 6).
Following this pattern we can construct an "addition table" for Z6. It is customary, and convenient, to denote the elements of this set by 0, 1, 2, 3, 4, 5. We have: 0
1
2
3
4
0
0
1
2
3
4
5
1
1
2
3
4
5
0
2
2
3
4
5
0
1
3
3
4
5
0
1
2
4
4
5
0
1
2
3
5
5
0
1
2
3
4
5
We should stress that the elements of Z6 are not complex numbers and the
operation + is not ordinary addition. The table shows that 0 is the identity element of Z6. We mean a + 0 = 0 + a = a for every afZ6. Furthermore. given any afZ6 there is a bfZ6 such that a + b = 0. We call b the additive inverse of a. So we can define a kind of subtraction: a - b is taken to mean a plus the additive inverse of b. For example, 3 - 2 is
to mean 3 + 4 because 4 is the additive inverse of 2. Thus 3 - 2 is 1 while
3-4=3+2=5.
There is a natural map
remainder one gets when n is divided by 6. So cp6 (12) = 0, cps (11) = 5, 1, etc. Observe that, for any in, n in Z, cps (m + n) = cp6 (m) + cps (n).
134
Chapter Four
Thb Fourier Transform
We warn the reader that in writing m + n we are using ordinary addition, whereas in writing cpr, (m) + cps (n) we are using the operation defined in Zs. Interpreting this sum as an ordinary addition leads to erroneous results.
Exercises 5 * 1. Let M be a multiplicative group and let N be a subgroup of M.
(a) For any s, t in M define s =_ t mod N to mean st-'cN. Show that congruence modulo N is an equivalence relation on M. (b) The set of equivalence classes of M under the relation defined in (a) is denoted by M/N. For any Is), [t] in M/N define [s) * It] as follows: choose se [s] , to [t] and set [s] * [t] = [st]. Show that this operation is well defined; i.e., if s'e [s] and t'e [t] show that s't' - st mod N. (c) Show that the element [1] EM/N is a multiplicative identity for this set; i.e., show that [s] * (1) = [11 * (s] = Is] for all Is) in MIN. (d) For each Is] eM/N show that there is a It] eM/N such that [s] * [t] _ I11.
(e) Let 119 be the multiplicative group of all nonzero complex numbers and let N = T (the unit circle). Identify the elements of M/N.
2. Compute the addition table for Z9. Define V9 (n) to be the remainder one gets when n is divided by 9. Show that when n > 0, cpq (n) can be computed as follows: Add the digits of n. If the result is greater than 9, add its digits. Continue until you get a number between 0 and 9. 3.
(a) We refer to problem 2. Show that for all m, n in Z, cp9 (m + n) _ 'ps (m) +,p9 (n) (b) Compute W9 (2315) + V9 (7489). Show that this is Sp9 (9804).
(c) Find
I cp9 (n) = 01.
* 4. Let A be an additive group and let C be a subgroup of A. Define V : A -y A/C as follows: cp (a) = [a] for every aeA. (a) Show that cp (a + b) = cp (a) +,p (b) for all a, beA.
(b) What is {aeA I p (a) _ [0])?
6. Finite Fourier Transform
6.
135
Finite Fourier Transform
The Fourier transform plays a fundamental role in modern digital signal processing (see Notes). This application requires the handling of enormous amounts of numerical data. I have been told that a color TV picture, for example, requires the processing of about 8 million bits of data per second. What is clearly needed here is an efficient numerical technique for computing the Fourier transforms involved because, in most cases, they cannot be found analytically. The fast Fourier transform algorithm is such a technique. Its efficacy is remarkable. In some cases 1 billion computations can be replaced by 1 million, one one-thousandth of the original number. What one does is replace the Fourier transform of a function defined on R by the Fourier transform of a function defined on Ze for large e. The algorithms are then applied to this replacement. We shall concern ourselves with the problem of how the Fourier transform can be defined on Zt. Once we have done that, we investigate its properties and give some applications. In the case of the additive group Z we considered
e'(Z)={a:Z-ClIa(n)1
_
J
x a (n - m) b (m) and note we need
For a, b in e' (Z) we defined (a * b) (n) _
here the group operation in Z; i.e., n - m = n + (-m) where (-m) is the additive inverse of m. We have seen that subtraction is possible in Zt and so we can define convolution for functions on this set. Let us do that and work out a simple case in detail. To begin with let t-1
e'(Zr)= a:Zt-CjEja(n)1
Clearly, every function on Ze is in e' (Z,), so our notation is a little silly. We use it to emphasize the analogy between what we are doing here and what we did on Z.
Definition 1. For any a, bee' (Zt) we define the convolution of a and b, a * b, 1-1
a (k - m) b (m).
as follows: (a * b) (k) = m=o
Let us stress that the subtraction is to take place in Z1. To get a "feeling" for this operation, let us work out the convolution of two elements a, bin e' (Z3).
136
Chapter Four
The Fourier Transform
We have: 2
(a * b) (0) = E a (0 - j) b((j) = a (0 - 0) b (0) + a (0 - 1) b (1) + a (0 - 2) b (2) j=o
=a(0)b(0)+a(2)b(1)+a(1)b(2) because 0 - 1 is the additive inverse of 1 in Z3 and this is 2, while the additive inverse of 2 is 1. Similarly, 2
(a *b)(1) _ Ea(1 - j)b(j) =a(1 -0)b(0) +a(1 - 1)b(1)+a(1 -2)b(2) 3-o
=a(1)b(0)+a(0)b(1)+a(2)b(2) because, in Z3, 1 - 2 = 1 + 1 = 2. 2
(a*b)(2) =Ea(2- j)b(j) = a (2 - 0) b (0) + a (2 - 1) b (1) + a (2 - 2) b (2) j-o =a(2)b(0)+a(1)b(1)+a(0)b(2) Observe that if we write a and b as (ao, a,, a2) and (bo, bi, b2), respectively, then a * b = (aobo + alb, + alb2, albo + aob, + a2b2, a2bo + albs + aob2).
This last expression clearly shows that 11 (Z3) contains an identity. The function e = (1, 0, 0) satisfies a * e = e * a = a for every a. This operation is sometimes referred to as "cyclic convolution." Its basic properties are easily established.
Lemma 1. For any a, b, c in e1 (Ze) we have
(a) a*b=b*a; (b) a*(b*c)=(a*b)*c; (c) o(a*b)_(aa)*b=a*(ab)for anycEC; (d) a*(b+c)=a*b+a*c; (e) There is an element eet' (Ze) such that a * e = e * a = a for every a. The Fourier transform on Z takes each function in el (Z) to a continuous function on T; the unit circle in the complex plane. Recall that for aee1 (Z), d (9) _ a (n) e"`9. Let us examine, a little more closely, eiie for 9 fixed. We can define ce : Z T by setting ce (n) = e'ne for each neZ; remember 9 is fixed. Observe that ce (m + n) = et(m+")e = eime . ein9 = ce (m) , ce (n) for all m. n in Z. Maps like this have a name.
6. Finite Fourier Transform
137
Definition 2. A map c : Z -. T such that c (tit + n) = C (tit) c (n) for all in, nFZ is Called a character of Z.
Lemma 2. To every character c of Z there corresponds a unique real number Bc[0, 21r) such that c (n) = (cie)" for every nfZ. Proof. Because c (n) is never zero and c (n + 0) = e (n) c (0), we see that c (0) = 1. Furthermore, c (n) = c (1 + 1 + ... + 1) = c (1)" for every natural number it.
Combining these observations we have I = c(0) = c. (n + (-n)) = c(n) c(-n) and so r (n) = c (1)" for every integer it. Now c (l) E T and so c (1) = coy for some real number y. If y V [0,27r) we may write y = (2a) k + 0 where k is an integer and 0 E (0, 2a). Clearly. then c(n) = (c ,lo)" = ri(2kv+9)n = (..i2knff WO)" = (c'B)" for every n E Z. This proves the lemma because it is clear that no two 0's in (0, 2a) can give the same character of Z.
Corollary 1. There is a one-to-one correspondence between the characters of Z and the points of T. Proof. Because there is a one-to-one correspondence between the points of T and the set [0. 2r), this follows immediately from our lemma.
For aef t (Z) , a (0) = F°`,., a (n) ei"B. If we think of 0 as fixed, then we
are taking the character eie and applying it to each n. co (n) = e'r'e. then multiplying by a (n) and finally summing over the domain of a: i.e.. summing over Z. We will do something similar for functions in P (Z,). To begin, however, we must first identify the "characters of Z,."
Definition 3. Let f be a natural number and consider the set Z,. A reap (n) for all m. n y : Z, T is called a character of Z, if y (m + n) = y (m) in Z,.
Lemma 3. There is a one-to-one correspondence between the characters of Z, and the solutions to the equation zt = 1. Proof. Let ur be a complex number such that wf = 1. Define a map y : Zr - T by setting y (k) = wk for each keZ,. Clearly, y (tit + it) = ittr"k" = urn w" = y (tit) -y (it) for any in, it in Z,. Furthermore, if in - it mod F, then y (tit) = y (it). so -y is well defined and is clearly a character of Z.
Suppose now that y is a given character of Z,. Then because y (k + 0) = y (k) y (0) we see that y (0) = 1. Also. for any kfZ, we have .y (k) = y (1 + ... + 1) = )(1)k . Thus y is completely determined`t.by y (1). But y (F) = y (0) = y (1)' = I and soy (1) is a root of the equation
= 1.
Chapter Four
138
The Fourier Transform
Definition 4. Let f be a natural number. The set of all solutions to the equation zt = 1 is denoted by Ut. The elements of this set are called fth roots of unity.
We have seen (Exercises 4, problem 3) that Ut is a multiplicative group containing f elements. Let us look more closely at U4 and its multiplication table: 1
-1
i
-i
1
-1
i
-i
-1
-1
1
-i
i
-i
-i
i
1
-1
1
It is, of course, true that 14 = 1, but 4 is not the smallest positive integer for which this is true. Clearly, 11 = 1; we say 1 has order 1. Similarly, (-1)4 = 1,
but the smallest positive integer for which this is true is 2; we say (-1) has order 2. The remaining elements i and -i each has order 4; 4 is the smallest positive integer n such that i" = 1 and it is the smallest positive integer n each
that (-i)" = 1. Note that {i" I neZ} is just {1, -1, i, -i}; i.e., it coincides with U4. Thus U4 is a cyclic group and i is a generator of this group (Exercises 4, problem 5). Clearly, -i is also a generator of this group.
Definition 5. Let t be a natural number and let
The order of a, O (a), is the smallest natural number k such that ak = 1. When 0 (a) = f we say that a is a primitive fth root of unity.
Clearly, e2*'lt = cos + isin T is a primitive fth root. When f = 3, this gives us cos z3 + i sin s3 = -+i. When f = 4, we get cos 24 + i sin 21 = i.
Lemma 4. The order of each element of Ut is a divisor of f. An element w of U, is a primitive fth root if, and only if, it generates the group Ut.
and 0 (a) = k, then clearly k < f. Now f = qk + r where q and r are integers and 0 < r < k. Thus 1 = at = qk+r = (ak)° ar = (1)° ar = a'', which says. because k is the smallest natural number such that ak = 1, that Proof. If
r=0.
Now suppose that wfUt has order f. Then Ut 2 {w' I 1 < j < f} so to show that these sets are equal we need only show that the last one contains f distinct members. But if wp = w9 for 1 < p < q:5 f, then w4_y = 1 and 0 < q - p < f. This contradicts the fact that w has order f. Thus w generates Ut.
6. Finite Fourier Transform
139
Finally, let w be a generator of Ur. If 0(w) = k < f, then{w' I jeZ} = { tn' . w2, w3, ..., wk } .
and this contains k < I elements. So the order of w must
be e.
Definition 6. Let f be a fixed natural number and let ur be the primitive tth For any aff' (Zr) we define a (wk) = F'a(j) (wk)3 and call this root o?=a.0
function, its domain is Ur, the Fourier transform For the case of a function adfl (Z3) this becomes 2
a (w°) _
a (j) (w°)' = a (0) + a (1) + a (2) J_° 2
a(w) = Ea.(j)vi = a(0)+a(1)w+a(2)w2 j=0 2
a(w2) =Ea(.I)(w2)' =a(0)+a(1)w2+a(2)w4 J=O
A more convenient way to write this is
f a (w°) & (w)
1=
8 (w2)
1
1
1
1
w
w2
1
w2
w'
a (0) a (1) a (2)
Of course. because w = e2i'13(Us, 111 4 = w.
Recall that the map.F takes f' (Z) into C (T). So if act' (Z), F (a) = a is defined on T. Furthermore- F is linear, invertible, and "preserves" convolution.
The map Fr takes each function ad' (Z1) and transforms it into a function a on (It. However, it is customary, or at least very common, to regard a as also having domain Zr. This is done as follows: Recall y (k) = wk is a map from Zr to Ur that satisfies y (k + ru) = y (k) y (m) all k, m. in Zr. When w is primitive, as it is in this discussion, y is both one-to-one and onto. Thus we may identify (Zr. +) with (Ur, ). With this understanding a (w°) = a (0).
a(w) = a(1).....a W- 1) = a(f - 1). Our Fourier transform may now be written Fr (a) (k) = a (k) _ r-1
Ea(j)(III k)' _ E a (j) (c'P
in matrix notation:
a=°
j=0
a (0)
1
1
l
...
1
a (0)
a(1)
1
u}
w2
...
wr-'
a(1)
a((- 1)
1
(ur-')
(t+'1-1)2
(wr-t)r '
a(f - 1)
140
The Fourier Transform
Chapter Four
Theorem 1. For each fixed l the map .fit is linear and preserves convolution; i.e., Ft (a * b) = .fit (a) Ft (b) Proof a-1
a-1
t-1
Ft(a*b)(n) =>(a*b)(k)(wn)k = k=0
t-
=i(j)
[a(i)b(k_i)j
k=0 j=0
j)
j=0
(wn)k
(wn)k
k=0
E-1
a (9) (wn)j E
1=O b (k -
j)
(wn)k-j
= Fit (a) (n) .Ft (b) (n)
a (j) (Wk)j. Then
Let us set -'F;-1 (a) = 7
t-1
Theorem 2. For any act' (Ze) we have a (k) = .Pt (a) (k) _
EaU) (alkyl j=0
t-1 and a (k) = Ft 1 (a) (k) = 71E,& U) (YU
k)-1.
j=0
Proof. In this last sum we replace a (j) by its definition obtaining 1
F
[-1
Ea(j) j=0
(wk)J
t-1
= , j ( 'n=o- ' a (n) (wf)n
(wk)J
j=o
t-1
E a (n)
a-1
(w,i-k)i
n= n=0
When n = k this inner sum is e, and when n 96 k it is zero (Exercises 6, problem 1). Thus
l t-1 a (j) (+o j=0
k)J
= a(k)
6. Finite Fourier Transform
141
We have seen that the Fourier transform on Z,, the finite Fourier transform, maps functions defined on Z, to functions defined on the set of characters of Z,, the set Z,. Now Zl can be identified with U,, and this last set can be identified with Z,. So the transform 3r maps functions defined on Z, to functions defined on Zc. The transform is linear and so in this, finite, case it can be represented by a matrix. Moreover, the transform preserves convolution and is invertible. We shall see (Exercises 6, problem 1) that when w E U1 (i.e., when w is
an fth root of unity) and w y6 1, then E?_ow1 = 0. Now suppose that w is a primitive fth root of unity. Then for any m, 0 < m < e - 1, wm is in U, and, because w is primitive. wm 1. Hence Et_o(wm)J = 0. Let us work out the case e = 3. Recall:
(f) = f =
1
1
1
1
w
w2
f
w2 wd
1
where, of course, w4 = w. Consider now 1
1
1
1
1
1
1
w
w2
1
w
w2
1
w2
w4
1
=
w2 w°
3
1+MT+w2
1+w2+W4
1 + w+ w2
3
1+ w+,D2
1 + w2 + w4
1 + w + w2
3
we have used the fact that sv = 1/w (Exercises 6, problem 2(b)). This last matrix is just
3
1
0 0
0
1
0
0
0
1
and, it is easy to see, reversing the factors gives us the same result. Thus to invert `), we take its matrix and replace each entry by its complex conjugate and multiply the result by 3. In another notation: 2
2
%(f)(k) = f(k) _ Ef(J)(Wk)4 _ f(J)(e+ZP )k' J=0
J=0
Chapter Four
142
The Fourier Transform
and 2
2
9(j)('Dk)j = 3
(9)(k) = 3 J=o
9(3)(e-
211
)ki
i=o
Remark on Notation. Some writers define the Fourier transform in a slightly different way to obtain a symmetry between the transform and its inverse. They set
and obtain 1
t-1
72: t g(j) (CVk)j i=0
This is also done in connection with the Fourier transform on R. One writes
`3(f) = f(x) =
1 tar
r00
J
, f(t)e+ixtdt
so that l-1(9)
=
1
00 9(x)e-'xtdx
tar J_,o
We see little advantage in doing this, but the reader should be aware that it is not uncommon.
Exercises 6 + 1. Let w E Ut, w 76 1. One of the properties of tth roots of unity that helps make the fast Fourier transform algorithm so efficient is the following: E-a = 0. Follow the steps below to give two proofs of this. wJ-
(a) Set r = rJ_ow). Show that wr = r and hence, if r
0, we have a
contradiction.
(b) Factor n - 1. 2. Let W E Ut. Show that (a) w-1 = wt-1; (b) Gr = 1/w; (c) if t is a prime (an integer larger than one that has no factors other than itself and one), then every element of U, except +1 is a primitive tth root of unity.
7. An Application
143
3. Show that e'(Zt) contains an element e such that e * f = f for every f E e1(ZZ). Compute cci(e). An element f of t'(Z1) is said to be invertible if there is an element g E e' (Z,) such that f * g = e.
(a) Show that f is invertible if, and only if, Z (f) is never zero.
(b) If f is not invertible find all g such that f * g = 0. (c) f is called a divisor of zero if there is g y6 0 such that f * g = 0. Show that f is a divisor of zero if, and only if, f is not invertible. (d) If f is invertible we have an element f -' E e' (Z1) such that f * f -' = e. Show that f is unique and that -1)(n) = 1(f)(in) for
every nEZ,. 4. Write the Fourier transform on t' (Z2) in matrix form and then invert the matrix. 5.
(a) Let f E e1(Z2) be given by f = (0, 1). Compute !ZMf). Similarly, compute 3'3 (f) where f = (0, 0, 1). (b) Characterize all g E e' (ZZ) such that g * g * terms of 3't(g).
* g (e factors) is e. in
6. Given f = (ao, a,, a2), 9 = (bo, b1, b2) in e'(Z3) compute:
(a) f and
(b) f 9; (c) f * 9;
(d) (f * 9) 7. Continuous characters of R. Let c : R - T be a continuous mapping such that c (x + y) = c (x) c (y) for all x, y in R. Show that there is an element a E R such that c (x) =
(et°)= for all x E R. Hint: Recall that Q is dense in R (Chapter Zero, Section 3, Corollary 2 to Theorem 1).
7.
An Application
There are some interesting, and surprising, connections between the finite Fourier transform and the theory of polynomial equations. These were found by Lagrange in his study of the cubic and quartic. The calculations involved are a bit messy, and we shall try to present the underlying ideas first so as not to lose sight of them in the technical details. Let us start with the simplest case of all, the well-known quadratic.
144
Chapter Four
The Fourier Transform
Suppose we are given the equation
ax2+bx.+c=0
(1)
where a 96 0 and the roots are xl, 22. The French mathematician Vieta first noted the relation between the roots and coefficients. If this case they are .r1 + x2 =
(2)
a
a
(3) a As functions of x1i x2 these are clearly symmetric, meaning they are invariant when we interchange the two variables. It turns out that any symmetric function of xl, 22 can be represented in terms of the two functions given in (2) and (3); i.e., any symmetric function of the roots can be expressed in terms of the coefficients. We shall prove this only for the special cases we need for our discussion. As an example, consider 0 = (xl - 22)2. This is clearly invariant x1=1'2 =
when we reverse our variables. We can write 0 in terms of a, b, and c as follows:
A = (x1 - x2)2 = xj + x2 - 251x2 = xj +X 2 + 221x2 - 42122 = (Si + x2)2 4x122 c -b 2 b2 - 4ac
(4)
where we have used (2) and (3). Combining (2) and (4) we have the system b (5)
XI +X2 XI
a
- x2 =
a
This is easily solved, but note that it is equivalent to: b 1
1
21
a
x2
b2 - 4ac
(6) 1
-1
a
1
The coefficient matrix in (6) is the finite Fourier transform obtained using the primitive square root of unity. We may solve (1) by solving (5) or (6). In the latter case
(7)
7. An Application
145
Let us turn now to the cubic
x3+Px2+qx+r=0
(8)
with roots xl, x2, x3. Lagrange chose, and fixed, a primitive cube root of unity w and considered the form Al = (x1 + wx2 + w2x3)3
(9)
He noted that this is not symmetric. Three of the six permutations of x1, x2, x3 do leave it invariant, but three transform it into A2 = (xl +w2x2 +wx3)3
(10)
Similarly, the three permutations that leave (9) invariant also leave (10) invariant, whereas the other three transform it into (9). This means that Al +L2 and A I A2 are invariant under all six permutations and hence can be expressed
in terms of p, q, and r; we shall do this explicitly below. It follows then that Al and A2 can be found by solving the equation
X2-(01+A2)X+(01A2)=0 and once we have those numbers we may write
xl+x2+x3=-P XI + wx2 + w2x3 =
01
(12)
xl + W2x2 + WS3 = 3 A2 or 1
1
XI
w
w2
X2
w2
IC
1(
X3
=
(13)
OrN2
Again we see that the coefficient matrix is the finite Fourier transform. As we have seen in Section 6, the system (13) can be solved by inverting the matrix. We find that the solutions to (8) are xi
X2
X3
(14)
146
Chapter Four
The Fourier Transform
We now show how one can express Al and 02 in terms of p, q, and r. Recall xl, x2, x3 are roots of
x3+px2+qx+r=0
(8)
The Vieta formulas in this case are
X1+x2+x3=-p
(15)
xlx2 + xlx3 + X2X3 = q
(16)
X1X2z3 = -r
(17)
Observe that (-p)2 = (XI +x2+x3)2 = x1 +z2+x3+2(x1x2+xlx3+x2x3) and hence, using (16), we have
xi+x2+x3=p2-2q
(18)
Now if we put xI, x2i x3 into (8) and add the results we get
x1+xz+x3+p(x1+x2+x3)+q(xl+x2+x3)+3r=0 and putting (15) and (18) into this equation we find
xi+xZ+x3=-p(p2-2q)+pq-3r=-p3+3pq-3r
(19)
We are trying to express 2102 and Al + A2 in terms of p, q, and r. The results obtained above enable us to do this for 01A2 rather easily. Let 6 = (xl + wx2 + W2x3)(xl + W2 X2 + WX3)
= xi + x2 + x3 + w(xlx3 + xlx2 + x2x3) + w2(xlx3 + x1x2 + x2x3) (20) _ (p2 - 2q) + (w + w2)q
where we have used (18) and (16). Now w is a primitive cube root of unity, hence not equal to 1. Thus w2 + w + 1 = 0 (Exercises 6, problem 1). Using this in (20), we get
6 =p2 -2q-q = p2 - 3q and so 01A2 = (x1 + wx2 + w2x3)3(x1 + w2x2 + wx3)3 = (p2 - 3q)3
(21)
The expression for Al + A2 in terms of p, q, and r is more difficult to obtain. We need a preliminary result first. Let
A=xix2+xlx2+xix3+xlxa+x2x3+x2x3
(22)
7. An Application
147
By equations (15) and (16) we may write -pq = (x1 + X2 + x3)(xlx2 + XI x3 + x2x3) = X2
2 + xjx3 + X1X2X3
+ XIX2 + xlx2x3 + x2x3 + x1x2x3 + x1x3 + x2x3
=.1-3r Hence
A= -pq+3r
(23)
A direct calculation shows that
(a+b+c)3=a3+b3+c3+3(a2b+ab2+a2c+ac2+b2c+bc2)+6abc (24)
Using this we find that
Al = (x1 + wx2 + w2x3)3 = xi + x3 + x3 + 3[xjwx2 + x1(wx2)2 + xiw2x3 + x1(w2x3)2 + (wx2)2w2x3 + (wx2)(w2x3)21 + 6x1x2x3
= x1 + x2 + x3 + 3w(xjx2 + x1x3 + x2x3) + 3w2[xix2 + xjx3 + x2x3 + 6x1x2x3
and, similarly, we find that
A2 = (x1 + w2x2 + wx3)3 = x1 + x2 + x3 + 3w2[xlx2 + xjx3 +x2x3 + 3w[xlx2 + xjx3 + x2x31 + 6x1x2x3
Adding and again using the fact that w2 + w + 1 = 0 we find that
Al + A2 = 2(x1 + x2 + x3) - 3(4x2 + x1x2 + xjx3 + x1x3 + x2x3 + x2x3) + 12x1x2x3
Finally, using (19), (23), and (17) we get Al + A2 = (x1 + wx2 + w2x3)3 + (x1 + w2x2 + wx3)3
(25)
= 2(-p3 + 3pq - 3r) - 3(-pq + 3r) - 12r = -2p3 + 9pq - 27r Equation (11) is then
72-[A1+A21X+AlA2=X2-[-2p3+9pq-27r]X+(p2-3q)3=0
148
Chapter Four
The Fourier Transform
Rom this we can solve the cubic as we showed earlier.
Note. In applying this method we must choose cube roots of Al and A2. We must be careful to do that in such a way that 'VA I A2 = p2 - 3q. To illustrate how this method may be used, we shall solve an equation with known roots. We take
x3-6x2+11x-6=0 whose roots are easily seen to be 1, 2, and 3. Clearly, p = -6, q = 11, and r = -6. Thus 01A2 = (p2 - 3q)3 = (36 - 3(11)]3 = 27 and
Al+L2=-2p3+9pq-27r = -2(-6)3 + 9(-6)(11) - 27(-6) = 0 Hence our quadratic is simply
X2+27=0 and, solving this, we find that
A2 = -i 33/2
Al = i 33/2,
We must take the cube root of each of these quantities. Of the three cub roots of i, we select -i. With this choice we find that
'01=-if and '02=if = -i23 = 3 = p2 - 3q. The roots of our equation,
Note that
x1i x2, x3, are given by x1
1
1
x2
1
w
2
X3
1
w2
w
1
-p
V Y2
7. An Application
149
which leads to (taking w = 11 + i$)
X1 = 3(-P+ Y O1 +
X2 = 3(-p+w2' = X3=
9
A2) = 3[-(-6) - if +if] = 2
01 +w
V A-2)
= 3[6-w2i\ 3+wiV3]
3[6+if(w-w2)] = 3[6+(if)(if)]
=
3[6-3] = 1
3(-P+w' O1 +w2 3 D2) = 3[6+w(-iV3)+w2(f)]
= 3[6 + (w2 - w)if] = 3[6 - (if)(if)] = 3(9) = 3 Exercises 7 1. Use Lagrange's method to solve:
(a) x3-1=0 (b) x3-5x2+8x-4=0 2. A permutation of a finite set is a one-to-one map from this set onto itself. There are six permutations of the set {1, 2, 3}. These may be written as
a=
d=
1
2
3
2
3
1
1
2
3
2
1
3
b=(1 2
3
2
1
'
II\\
3
e=
_ C=
1
2
3
1
3
2
f-C 1
2
3
3
1
2
where we understand that a(1) = 2, a(2) = 3, a(3) = 1, etc. Let us call x1 + wx2 + w2x3i y, and let a(y) = xa(1) + wxa(2) + W2xa(3) _ x2+wx3+W2T.1. (a) Show that a(y) = w2y, and f (y) = wy. (b) Show that b(y) = w2c(y), and d(y) = wc(y). (c) Conclude that y3 = (x1 + wx2 + w2x3)3 is unaffected by a, e, and f whereas b, c, and d transform it into (x1 + w2x2 + wx3)3.
3. Let A = (x1 - x2)(x1 - x3)(x2 - x3). Referring to problem 2. take a(0) = [xa(1) - xa(2)][xa(1) - xa(3)][xa(2) - xa(3)] and similarly for the other permutations. Which permutations map A to A (these are called even) and which map A to -A (these are called odd)?
150
8.
Chapter Four
The Fourier Transform
Some Algebraic Matters
We have seen that, for a fixed natural number n, the multiplicative group (U,,, ) is generated by any primitive nth root of unity. In particular, if w is a primitive
6th root of unity, then w'1 = 1, six is the smallest positive integer for which this is true, and U6 = {wo, w, w2, w3, w4, w5 }. It is easy to see that w2, which is
a 6th root of unity, is also a cube root of unity. The same is true of
w4. The
element w3 is a square root of unity, whereas ws is primitive.
There is a problem here that we shall solve in this section. A complex number z is an nth root of unity if z" = 1. But, for such a number z, we may have zk = 1 for k < n. The smallest positive integer for which this is true was called the order of z (Section 6, Definition 5), 9(z), and we have seen that 9(z) must be a divisor of n. What we shall do is figure out how to compute the order of every nth root. We have already seen how to begin. We choose a primitive nth root w and note that because U. is equal to {art 10 < j < n - 1), every nth root of unity is equal to a" f o r some j = 0,1, 2, .. , n - 1. So all we need do is figure out how to compute the order of '. The problem can be solved by recalling a little number theory. When a, b are integers we say a is a factor, or a divisor, of b if there is an integer in such that am = b. When this is the case we say a divides b (or divides evenly into b). and write alb. A useful observation is this:
Lemma 1. If a, b, c, and d are integers, if a + b = c, and if d divides any two of the numbers a, b, c, then it divides the third one. Definition 1. Let a, b be two positive integers. A positive integer d is said to be the greatest common divisor of a and b if (i) dia and dub; (ii) any integer c that divides both a and b also divides d.
So. for example. the numbers 18 and 24 have 1, 2, 3, and 6 as common divisors. The greatest of these is, of course, 6.
Lemma 2. Any two positive integers a, b have a unique greatest common divisor d. Furthermore, there are integers xo and yo such that d = axe + byo. Proof. Let .11(a, b) = {ax + by x. y are in Z}. Clearly, a = a 1 + b 0 and b = a - 0 + b 1 are in M(a, b), and because (ax, + by,) - (ax2 + by2) = a(x, - X2) + b(y, - y2), this set is an additive group (Section 4, Definition 1). Hence. by Corollary 2 to Lemma 1 (Section 4), there is a smallest positive I
integer d E M(a, b) and M(a, b) = {kd I k E Z}. It follows that dia and lib. However, because d E M(a, b), there are integers xo and yo such that d = axo + Mjo. From this, and Lemma 1, we see that any common divisor of a and b must divide d.
To see that d is unique we simply note that any two greatest common divisors of a and b must divide each other. Because they are positive integers they must then be equal.
8. Some Algebraic Matters
151
Definition 2. Given two positive integers a and b, we denote their greatest common divisor (their g.c.d.) by (a, b). We shall say that a and b are relatively prime when (a, b) = 1.
From Lemma 2 we see that when (a, b) = 1 there are integers xo and yo such that 1 = axo + byo. The converse is also true. If, for two natural numbers a and b, there are integers x0 and yo such that 1 = axo + by0, then, by Lemma 1, a and b must be relatively prime. This has a useful consequence. Suppose
t = (a, b), so e = ax + by for some integers x and y. Then 1 = (f )x + (t )y. showing that the natural numbers 7 and b are relatively prime. If a, b, c are natural numbers and albc we cannot conclude that alb or that aic. For example, 6124 so 618 3, but 6 does not divide either 8 or 3. This observation gives meaning to our next result.
Lemma 3. Let a, b, c be natural numbers and suppose that a and b are relatively prime. If albs, then a1e.
Proof. We have I = (a, b) and so 1 = ax + by for some integers x and y. Thus c = (ac)x + (bc)y and hence, by Lemma 1, ale. Recall that a prime number is an integer greater than 1 whose only factors
are itself and 1. If p is a prime and a is a natural number, then either pla or (p, a) = 1. Thus:
Corollary 1. Let a, b be natural numbers, let p be a prime, and suppose that plab. Then either pea or plb. We are now ready to compute the order of wk, where w is a primitive nth
root of unity and 0 < k < n - 1. Recall that the order of wk is the smallest positive integer d such that (wk)d = 1. We have seen that if for some positive integer m, (wk)"' = 1, then dim. It is easy enough to repeat the argument here. Because d < m. m = qd + r for integers q and r with 0 < r < d. But (Wk) m = 1 and (wk)d = 1 so (wk)'' = 1, showing r must be zero.
Theorem 1. Let w be a primitive nth root of unity. Then the order of wk is k.i
; i.e., the integer obtained by dividing n by the g.c.d. of n and k. In
particular, wk is primitive if, and only if, n and k are relatively prime.
Proof. Suppose that the order of wk is d and that e = (k, n). We want to show that d = . Because these are positive integers we need only show that they divide each other. Now (wk)n/t = (wn)k/l = (1)k/f = 1 and so, by the observation we made immediately above, Because (wk)d = 1, nikd. It follows that (7)l(7).
Chapter Four
152
The Fourier Transform
which shows () I () d. But ' and k are relatively prime. Thus ()Id and we are done.
The number theory we have discussed here has some further interesting applications that we shall now explore. We first note that any additive subgroup of Z has an unusual property. Given a subgroup (t) of Z we have (a b) E (e)
whenever b E (e), and this is true for all a E Z. It is this property that enables us to introduce a new operation, a kind of multiplication, onto the set Z1. First we need:
Lemma 4. Let (e) be any additive subgroup of Z and suppose that a - a' mod (e), b =_ b' mod (e). Then, a b = a' b' mod (e). Proof. Because a - a' mod (t), we have a = a' + kt for some integer k. Also, b = b'+ me for some integer m. Hence a b = a'. Y+ ((a'm)e+ (b'k)e+ (mke)e], which proves our result.
Definition 3. Let a be a fixed natural number greater than 1. For any U, b in Zt we define a b to be (ab); note here that a denotes the equivalence class, modulo 1, that contains a.
It is clear, from our lemma, that the binary operation is well defined. Moreover, because (ab) = (ba) we have a b = b a, because (a(bc)J = (ab)c] we
a=a
have
Let us compute the multiplication tables in two cases: 0 0
0
1
0
2 0
3 0
0
1
2
3
4
2
0
2
4
1
3
4
0
0
3 4
1
3
(Z5, )
4 2
1
2
3
4
5
0
0
0
0
0
0
1
0
1
2
3
4
5
2
0
2
4
0
2
4
3
0
3
0
3
0
3
4
0
4
2
0
4
2
5
0
5
4
3
2
1
0
1
3
0 0
4
2 1
(Za, )
These tables display some interesting features of this new operation, but before we can discuss them we need some additional terminology. To begin with, the element 1 is called the multiplicative identity of Zt. An element b of Zt is said
to be invertible if it has a multiplicative inverse; i.e., an element a E Zt such 1. that The table for (Z5, ) shows that every nonzero element of this set has a multiplicative inverse. This is not true in Ze. There we see that only I and 5 have inverses, whereas 2,!i, i do not. Even stranger things happen in (Z6, ).
8. Some Algebraic Matters
153
The table shows that 2.3 = 0 and that 3.4 = 6, while none of the factors is zero. We have developed the number theory we need to explain these "weird" results. Again we must start with some terminology.
Definition 4. Two elements a, 6 of Z, are said to be proper divisors of zero if a54 034
We observe that a proper divisor of zero could not have a multiplicative inverse (see Exercises 8, problem 2).
Theorem 2. The set (Zt, ) contains proper divisors of zero if, and only if, f is a composite integer (i.e., an integer greater than one that is not a prime). k f 1 < n < f.. Clearly then, 10 i 0 a whereas (kn) = 0, showing that Zt contains proper
Proof. Suppose first that a is composite, sot =
divisions of zero. Now suppose that there are proper divisions of zero in Zt. Let us say k, Iii
are nonzero, yet k n is zero. Then kn = mi for some integer m. If a were a prime it would follow that elk, giving us k = 0, or fIn and implying that n = 0. Thus f must be a composite integer.
Corollary 1. When a is a prime the set Zt does not contain proper divisors of zero.
Our next result tells us which elements of Zt have, and which do not have, multiplication inverses. Its proof also shows the connection between the latter elements and the proper divisors of zero.
Theorem 3. An element k of Zt has a multiplicative inverse if, and only if, the integers k and t are relatively prime.
Proof. Suppose that (k, t) = 1. Then there are integers x and y such that 1 = kx + £y. Thus kx =_ 1 mod (P) showing that a is the multiplicative inverse of k.
Next we suppose that k and t are not relatively prime; i.e., (k, 1) = n > 1.
Then k = k1n and i = Pin where 1 < kl < k and 1 < Pi < t. It follows that k f i = (kin) ei = k1(n1i) = kiI E (1). Thus k and ti are proper divisors of zero.
Remark 1. The proof of Theorem 3 shows that when k E Z1, k 0 6 and k is not invertible, then k is a proper divisor of zero. More precisely, there is an ti E Zt such that k and ti are proper divisors of zero.
154
Chapter Four
The Fourier Transform
Corollary 1. When t is a prime, every nonzero member of Zt has a multiplicative inverse. Moreover, for a, 6 in Zt we have a. b = b if, and only if, one of these factors is zero.
Remark 2. Given a natural number rn let E(m) = {k E NJ k < m and (k, m) = 1}. Euler defined a function
Exercises 8 1. Let w be a primitive nth root of unity. Compute the order of each of the numbers ud, j = 0. 1, ... , n - 1 when: (a) n = 6;
(b) n = 12;
(c) n is a prime number. * 2. Suppose that a and 6 are proper divisors of zero in Zt. Show that neither of these elements can have a multiplicative inverse in Zt. # 3. Suppose that the integers k, a., b and n satisfy ka = kb mod n. If (k, n) =
1, show that a - b mod n. Hint: In Z we have ka = Z. 4. Compute V(1), V(2), V(3). p(4), and V(p) where p is a prime. (a) If p is not a prime show that 1 - 2
(p - 1) _ 6 in Zp.
(b) If p is an odd prime note that (p - 1) is its own inverse in Zp. Conclude that 1 - 2 6---2) = 1 in Zp. (c) Use (a) and (b) to show that p is a prime if, and only if, 1.2 (p - 1) - -1 mod p. This is called Wilson's theorem.
9.
Prime Numbers
Primes have been investigated since ancient times. A great deal is known about them, and yet many mysteries remain. To begin our discussion let us restrict our attention to the set N = 11, 2, 3.... } of natural numbers.
Given n and d in N we shall say that d is a divisor, or a factor, of n, and we shall write din, if there is an m E N such that dm = n. Observe that any factor of d is also a factor of n.
Definition 1. A natural number n > 1 is said to be a prime if the only divisors of n are n and 1.
9. Prime Numbers
155
Clearly 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, ... are primes. A number n. > 1 that is not a prime is called a composite number.
For any n E N let D(n) = {d E NI dIn}, so D(n) is the set of all factors of n. When n > 1 the set D(n)\{1} 54 0, and so it has a smallest member p. Observe that p must be a prime; for if 1 < q < p and q1p, then qJn is smaller than the smallest divisor of n. Thus every n > 1 has a prime factor.
Theorem 1. There are infinitely many prime numbers. Proof. Suppose that the set of primes is finite and let this set be denoted by pt, p2, ... Pk, k E N. Set
Q=pi'N2. ... Pk+1 and note that no pj can divide Q; for, if it did, then by Lemma 1 of Section 8, it would have to divide 1. But then any prime factor of Q, and there is at least one, is not. on our list of all primes, an obvious contradiction.
If n is a composite number, then n = plgl, where p, is the smallest prime
divisor of n. If ql is a prime, then n. is the product of two primes. If qi is composite, then qt = p2g2, where p2 is the smallest prime divisor of ql. Thus n = plp2q2. Continuing in this way we should be able to write n as a product of prime numbers. This is, in fact, the case as we shall now prove.
Theorem 2. Any integer greater than 1 is either a prime or it can be written as a product of primes. Proof. We argue by contradiction. Let Al be the set of natural numbers greater than 1 that are not prime and cannot be written as a product of primes. Suppose
M # 0. Then there is a smallest element m of M. Clearly m is not a prime
and som=ab,where 1 1, to ask if there is more than one way to factor it into primes. It turns out that apart from permuting the factors, this can be done in just one way.
Theorem 3. Given n > 1 we may find primes pi < p2 < numbers al , a2,
, aA such that
n = pi pat ... pAk This can be done in one, and only one, way.
< pit and natural
Chapter Four a The Fourier Transform
156
Proof. Assume that the theorem is false. Then there is a smallest integer ri1 > 1
that has two distinct factorizations: m = pl' pk`, where p1 < p2 < .. < A are primes and al, ... , ak are natural numbers, and ri1 = qb' qt° where q1 <
.
. < of are primes and b1, ... , be are natural numbers.
It follows that P1 Igii ... qe' and so p1 Iqb' for some i, 1 < i < f (Section 8, Corollary 1 to Lemma 3). From this we see that p1lgi and p1 = qi. A similar argument shows that q, = pj for some j, I < j < k. Thus
Pi:P,=qj Sqi=P1 giving us p1 = Q1.
Now r < m and so this number (i.e.. p1) can be written, as described above, in just one way. But Pt
ak a,-1 a2 b2 ... qebe p2 ... pk = q1bi-1 a2
- P1
It follows that k = f, p2 = q2, ... , Pk = gk, and a1 - 1 = b1 - 1, a2 = b2.... , ak = bk. Thus the factorization of m is unique, and we have reached a contradiction. For any n E N let us set a(n) equal to the sum of all the divisors of n. We observe that
a(7)=8<2.7, a(12)=28>2.12, a(6)=12=2(6) The number n is called deficient when a(n) < 2n, it is called abundant when a(n) > 2n, and it is said to be perfect when a(n) = 2n. This terminology goes back to the numerologist of antiquity. Perfect numbers were discussed by Euclid, and he gave a means of generating them. He considered numbers of the form n = 2P-1(2P - 1) When 2P - 1 is a prime, the factors of n are 1, 2, 22, ... , 2P-1.
1
(2P - 1),
2 (2P - 1), ... ,
2P-1 (2P - 1)
and so a(n) = 1 + 2 +
+ 2P-1 + (2P - 1)(1 + 2 +
+ 2P-1)
= (2P - 1) + (2P - 1)(2P - 1)
= (2P - 1)(1 + (2P - 1)] = 2P(2P - 1) = 2n
To sum up these observations: Whenever 2P-1 is a prime, the number 2P-1(2P1) is perfect.
9. Prime Numbers
157
We turn now to the question of when 2p - 1 is prime. It is easy to see that when n is composite so also is 2" - 1 (see Exercises 9, problem 5). Thus we need only consider the numbers
Mp=2p-1 where p = 2, 3, 5, 7, 11, .... These numbers were discussed by the French cleric
Mersenne and have come to be called Mersenne numbers. Note that M2 =
22-1=3, M3=23-1=7, M5=25-1=31, M7=27-1=127 are all primes. Unfortunately, M11 = 211 - I = 2047 = (23)(89). Still the first four Mersenne numbers give us four perfect numbers: 22-1(3) = 6, 23-1(7) _ 28, 25-1(31) = 496, and 27-1(127) = 8128. In 1876 Lucas showed that M127 = 170,141,183,460,469,231,731,687,303,715,884,105,727
is a prime. It is now known that M11213, M19737, M21701, and M232U9 are primes. Given a prime p the Mersenne number Mp may, or may not, be a prime. It is useful then to have some test, especially one that can be easily programmed on a computer, to determine when M. is a prime. Such a test was discovered by Lucas and put in final form by Lehmer. He showed that Mp is a prime if, and only if, MplSp_1 where the numbers S" are defined as follows: S1 = 4, S"+1 = S n - 2, n = 1, 2, 3, ... . We shall give a proof of the "interesting" half of this result; i.e., that when MpISp_1, Mp is a prime. A few preliminary results are needed here.
ad=land
Lemma1. Let a=2+f,
a_2m-1
Proof.
When m = 1 we have S1 = a + a, which is clearly true. Assume
that there is a smallest integer k + 1 for which our formula fails. Then, for m = 1, 2, ... , k the formula is true. Thus (a2k-1 /
Sk+1 = Sk - 2 =
= =
+ `2
(a2k-1I1\
Zk
2k
a2k-1`1z
/
+
-2 2(aa)2k-1 + (?i2,-,)2
-2
a +a
because as = 1. This is a contradiction and the lemma is proved. ir2Y-2
Suppose now that Mp divides Sp_1. Then Mp must divide a2°-2 + and so this sum is equal to RMp for some integer R. Multiplying by a2°-2 gives us
a2"-1 =
RMpaV-2
-1
(1)
158
Chapter Four
The Fourier Transform
and squaring this gives us
a2° = (RMpa2'-2 - 1)2 = KMp + 1
(2)
for some integer K. Assume that. A1p is a composite number. Then Mp has a prime factor q with q2 < MP (see Exercises 9, problem 2). Recalling the definition of Mp we see that q 0 2. Define a set X = {a + bV'I a, b E ZQ}. Clearly, X contains q2 elements, among them 0 and 1. For x, y in X we define x y as follows: x = a + bV', y = c+df, (ac+3bd)+(ad+bc)f, where the terms (ac+3bd) and (ad+bc) are found in Z?. Clearly, x y E X. For any x E X we define x1 = x, x2 = x x, etc., and we shall say that x is invertible if there is a y E X such that x y = 1. Let Y he the set of invertible elements of X and note that Y contains at most q2 - 1 elements.
Lemma 2. For each y E Y, yr - 1 for some natural number r < q2 - 1. Proof. If y E Y, then each of the elements y, y2, y3'. .. , yq2-1 is in Y. There
are two cases: (i) If the numbers listed are all distinct, then one of them must equal 1 (because Y contains at most q2 - 1 members) and we are done. (ii) If ye - y" for 1 < t < m, then 1 and again we are done. For each y E Y let 0(y) be the smallest positive integer such that yO(V) = 1.
Then if r is any integer such that y' - I we must have O(y)I r (see Section 6, the discussion just below Definition 5). a2p_' = -1 mod q by (1) and alp = 1 Now or E X and because qI A1,
mod q by (2). Thus a E Y and O(a)l 2p. and O(a) cannot be less than 2p because a2°-` - -1. (Note that because q > 2 we do not have 1 - -1.) Thus
O(a)=2p
Exercises 9 1. If dl n and rl d, show that r) n. * 2. If n is a composite number and p is the smallest prime factor of n, show that p2 < n. 3. Factor the following numbers into primes: (a) 1001 (b) 756
10. Euler's Phi Function
159
(c) At some point in his life, Euler became blind. It was after that that he factored 1,000,009 into (293)(3413).
4. For any n > I consider F_ 1 where the sum is over all divisors d of n. Show that this sum is less than 2 when n is deficient, greater than 2 when
n is abundant, and equal to 2 when n is perfect. Hint: Compute E 55. When n is composite, show that 2" - 1 is composite. 6. Euler showed that every even perfect number is the form 2 p- L My, where
Mp is prime. At the time of this writing, it is not known if odd perfect numbers exist.
10.
Euler's Phi Function
Given m E N the number of invertible elements of Zm coincides with the number of integers k such that 1 < k < m and (k, m) = 1. This is also the number of primitive mth roots of unity. As we have seen (Section 8, Remark 2) the function W(m) gives us this number. Our purpose here is to derive a formula for computing rp(m) from m.
Lemma 1. Fix b E N and consider the set Zb. It consists of the equivalence classes F--l. Suppose that r > 0 is an integer and a E N satisfies (a, b) = 1. Then r, r+ r), -(2a+ r),. . . , ((6 - 1)a + r) are all the elements of Zb.
Proof. The list given contains b elements of Zb. It suffices only to show that these elements are distinct. Now if sa+r - to+r mod b, where 0 < s, t < b-1, then sa to mod b. However, because (a, b) = 1 this tells us that s - t mod b (Exercises 8, problem 3), and this in turn, because 0 < s, t < b - 1 gives us
s=t.
To illustrate the power of Lemma 1, we use it to prove a result that has found application in securing computer files.
Theorem 1 (Fermat). Let a, p be natural numbers and suppose that p is a prime. If p does not divide a, then ap-1 = 1 mod p. Because p does not divide a it cannot divide any of the numbers a, 2a, 3a,... , (p - 1)a. By Lemma 1, with r = 0, the set a, Ea, 3a, ... , p - la and 0, coincides with Z,. It follows that 5, 2a, ... , (p - 1)a coincides with the set 1,... F-71. Thus, in Z,, we have Proof.
aa-1 .1.2...(p_1)=1.2...(p-1)
Chapter Four
160
The Fourier Transform
or, equivalently, by
aP-1 . 1 .2...(p- 1) __ 1 .2...(p- 1) mod p
But (1 -2. (p - 1), p) = 1 and so by Exercises 8, problem 3, we must have aP-1
=- 1
mod p.
a mod p.
Corollary 1. If p is a prime and a is a natural number, then aP Let us return now to our discussion of the phi function.
Theorem 2. If a, b are natural numbers and (a, b) = 1, then y(ab) = V(a)cp(b). Proof. Consider an integer k, 0 < k < ab. We may write k = aq + r, where
0 < r < a - 1. But note that we also have 0 < q < b - 1. In fact, each k. 0 < k < ab can be written in one, and only one, way ask = aq + r with q and r as given above.
Now given k, 0 < k < ab we ask if (k, a) = 1. This may be written a) = 1. and this is true if, and only if, (r, a) = 1. There are y(a)
(aq + r.,
numbers r that are smaller than a and prime to a. Choose and fix one of them, say r1. By Lemma 1 the list r1, a+r1, 2a+r1.... , (b-1)a+ru, contains b numbers distinct clod b. Hence this list contains exactly
both a and b. But the number of integers that are prime to both a and b is y(ab) and, as we have just seen, there are :p(a)V(b) such numbers.
Corollary 1. If p is a prime and n is a natural number, then ip(p") = PP (1y ).
Proof. The natural numbers that are < p" and not prime top" are p, 2p, 3p..... p" -1p. There are then, p"- I such numbers. Hence the number of natural num-
bers that are < p" and prime to p" must be p" - p" = p"(1 -
.1).
We have seen that any natural number m > 1 is either a prime or it can be written as a product of primes in "essentially" only one way. When m is a prime p(n1) = m - 1. When in = pI' pk` we have:
Corollary 2. ip(m) = '(pi' ...p7) = m(1 -P1i)... (1 - L). P Proof. It is clear that (p;' ,
1 when i 1-1 j. Thus
'(p'' ... pk) = 4(pi') ... p(pk' )
10. Euler's Phi Function
(1 - - ). Hence we see that
By Corollary 1 we know that V(m) = Al,
161
(I_ P1 /
pk` I 1 - pk
P2' (1 -
=Pi' .p2,...pAk (1 - p,) ... C1 - pk 1
=m
1-Pt1)...(1-Pk\
When p, a are natural numbers and p is a prime, the assertion that p does not divide a means (p, a) = 1. Fermat showed that we then have ap-1 =- 1 mod p. Recall that V(p) = p - 1 so we have aPip> _- 1 mod p. Euler showed that for any natural number in if (m, a) = 1, then a'('") _- I mod in. This generalization of Fermat's result also plays a role in securing computerized information. We shall discuss this application further after we prove Euler's result.
Theorem 3. Let a, in be natural numbers and suppose that (a, m.) = 1. Then a91'>
1 mod in.
Proof. Let the n = V(in) natural numbers that are < in and prime to in be denoted by a1, a2, ... , an. Note that (aa,, m) = 1 for j = 1, 2,... , n. Furthermore, if aa; _- aaj mod in, then a; _- aj mod in by Exercises 8, problem 3. Because 1 _< a;, a. < in, this means a; = aj. Thus the n elements of Z,", aa1, aa2,... , aa" are distinct and each of them is invertible in Z",. Hence they must coincide with at, a2, ... , a". But then, in Zm, (aal(aa2)...(Qa")=TI .i2...(1"
or, equivalently, a" . a 1
. a2 ... a" = a a2 ... a" mod in.
Finally, because (at - . an, in) = 1 we may again use problem 3 of Exercises
8 to conclude that an
1 mod in.
Let us finish up with a brief discussion of how the theorems of Fermat and Euler are used in connection with the problem of coding and decoding confidential messages. We shall assume that our message must first be converted
into an integer M. We must first encode M to obtain the encoded message E. This is then sent to our correspondent who must, somehow, derive M from E. As a first example, suppose we choose a prime r and a natural number s such that (s, r - 1) = 1. Given M we encode it by setting E = M
162
Chapter Four
The Fourier Transform
The problem now is to obtain M from E. So we want to find t so that
E' = Al" - M mod r. Suppose we can find t so that at = k(r - 1) + 1 for some integer k. Then
Et = (A,ln)t = A!"' =
AAIk(r-')+' = (Mr-1)k
Because r is a prime, Fermat's theorem tells us that Mr- = 1 mod r and so E' - Al mod r as required. To find t so that st. = k(r-1)+1, we use Euler's theorem. The equation says
that at = 1 mod (r-1). Now gp(r-l) = 1 mod (r-1) provided (s,r-1) = 1, and we chose a so that this is true. Hence all we need to do is take t =
s`P(r-1i-
With this choice st = S0(r-1) - I mod (r - 1) as required. As an example, let us take r = 17 and s = 3. Then (3,17 - 1) = (3,16) = 1 as required. Given M we find E = M3 and send this to our correspondent. To decode he must have t such that at - 1 mod (r - 1). We have seen that
t = s7 = 377 = 2187. We may reduce this by noting that 2187 - 11 mod 16. Thus to find M we simply compute E" mod 17. A variant on this technique concerns "public access" codes. Here the r is published openly. It is a many digit number that those using the code know is the product of two very large primes p and q. One chooses s so that (s, v(r)) = 1 and encodes the message, Al, as before. So E = M' is sent.
We now seek t so that E' = M'' - 1L! mod r. Again we want at 1 mod V(r) or at = 1 for some k. So we take t = Then 4t = V(`p(r)) = 1 mod p(r) by Euler's theorem. Finally, E' = A!"' = Mkw(r)+ = )Ma(r)]kM = M mod r because Mw(r) 1 mod r. Note that because the receiver knows that r = pq. s/he also knows V(r) _
(p - l)(9 - 1). Exercises 10 1. Prove Corollary 1 to Fermat's theorem. 2. Compute ya(rn) for the following numbers: (a) in. = (64)(81)(625); (b) in = 21A:
(c) rn=72.19.372. 3. If p is a prime, show that tp(1) + ap(p) + p(p2) + + :p(p") = p"
4. There is a conjecture, unproven at the time of this writing, that n is a prime if, and only if.
10. Euler's Phi Function
163
Notes 1. The axiom of choice, and its many equivalent forms, is discussed in "General Topology" by John L. Kelley (D. Van Nostrand Company, Inc., 1955). See pp. 32-35. A proof of statement (*) used in the discussion of Weiner's theorem (Section 2) can be found in "Introduction to Topology and Modern Analysis" by George F. Simmons (McGraw-Hill Book Company, New York, 1963). See p. 321. See also pp. 308-310 for a further discussion of related ideas. Another proof of Wiener's theorem is given in "An Introduction to Harmonic Analysis" by Yitzhak Katznelson (John Wiley & Sons, Inc., New York, 1968). This is done on p. 202 where (*) is also proved.
2. The Fourier transform on R, using Lebesque integration, is treated in Katznelson's book (Note 1 above) and in the book "Fourier 'transforms" by R. R. Goldberg (Cambridge University Press, New York, 1961). Goldberg, in an appendix, discusses the Fourier transform on a locally compact
abelian group, and Katznelson has a chapter on this topic. A more extensive discussion can be found in "Abstract Harmonic Analysis" by E. Hewitt and K. A. Ross (Springer-Verlag, New York, 1963). 3. Lagrange's work on the theory of equations is discussed in detail by Harold
M. Edwards in his book "Galois Theory" (Springer, New York, 1984). See p. 2 and pp. 18-22. Edwards also discusses some related work of Vandermonde and, on p. 19, gives a brief (and to me at least. surprising) biography of Lagrange: "Unlike Vandermonde, who was French but did not have a French name, Lagrange had a French name but was not French. He was Italian (he was born with the name Lagrangia and his native city was Turin) and at the time of the publication of his 116flexions he was a member of Frederick the Great's Academy in Berlin. When he left Berlin
in 1787 he went to Paris, where he spent the rest of his life and where, of course, he was a leading member of the scientific community; this has tended to reinforce the impression that he was French. He was certainly the greatest mathematician of the generation between Euler and Gauss, and, indeed, has a secure place among the great mathematicians of all time."
4. Our definition of the set X and its algebra leaves something to be desired (see Chapter Five, Exercises 3, problem 2). The interested reader can also consult the article by J. W. Bruce, "A Really Trivial Proof of the LucasLehmer Test" in The American Mathematical Monthly, vol. 100, number 4, April 1993, pp. 370-371. Mersenne numbers are useful in digital signal processing, and certain finite Fourier transforms are sometimes called Mersenne number transforms (see 2 in the Notes to Chapter 5).
164
Chapter Four
The Fourier Transform
5. The Euler Phi function is discussed in "Introduction to the Theory of Numbers" by Leonard Eugene Dickson (Dover Publication, Inc., New York, 1957), see pp. 7 and 8, and in "Number Theory in Science and Communication" by M. R. Schroeder (Springer-Verlag, New York, 1986) on pp. 113-116. Schroeder also discusses using the phi function for "PublicKey Encryption" on pp. 118-126.
An application of Euler's function to signal processing is given in a paper
by N. K. hose and Kim in the JEEE Trans. Signal Processing, vol. 39, no. 10. Oct. 1991, pp. 2167-2173.
Chapter Five
Abstract Algebra
Many of the topics discussed earlier in the text can be treated more systematically by bringing in some ideas from modern algebra. To do this, we must raise the level of abstraction a little bit. The gain in insight, however, is well worth the price we must pay to achieve it. In this last chapter, we very briefly discuss groups, rings, algebras, and fields and indicate how they arise in harmonic analysis.
1.
Groups
Groups are found in many areas of mathematics-we have seen lots of them already-and also in physics and certain parts of chemistry. One might expect that a concept with such a wide range of applicability would be defined abstractly. This is the case, and it takes some getting used to. To begin with, there is only one binary operation (see below) in a general group. This can be something very familiar like addition or multiplication or it can be something more exotic like convolution or function composition. The examples below make this clear. When we say that a nonempty set G is closed under a single-valued binary operation, we mean that we have a function cp : C x G G. It is convenient, and customary, to denote cp(g, h) by go h (read "g circle h") for each pair (g, h)
E G x G and to say that G is closed under the binary operation o; we shall always assume that o is single valued.
Definition 1. Let G be a nonempty set that is closed under a binary operation o. We shall say that the pair (G, o) is a group if (a) for all g, h, k in G we have g o (h o k) = (g o h) o k; (b) there is an element e in G such that 165
166
Chapter Five
Abstract Algebra
g o t= e o y= g for every g E G; (c) for each g E G there is an element h E G
such that goh = hog=e. We call (a) the associative law and say that the operation o is associative. The element e, whose existence is postulated in (b), is called the identity of G. The element h discussed in (c) is called the inverse of the given element g E G and is usually denoted by g-1; we shall see that, given g, the inverse clement is uniquely determined. The pair (C,+) is clearly a group because we know that addition of complex numbers is an associative operation, that C has an identity (the number 0), and that each z E C has an inverse (the number (-z)). The pair (C. ) is not a group. It is true that multiplication of complex numbers is associative, and the number 1 is the identity for this operation. However, the number zero, which certainly belongs to C. does not have a multiplicative inverse. It. is easily seen, however, that (C\{0}, ) is a group. Let us show how (a). (b), and (c) of Definition 1 imply that the identity and inverses are unique.
Lemma 1. In any group (C, o) there is only one identity element e. Furthermore. given g E G there is only one element h E G such that g o h = hog = e.
Prof. We are assuming that g o e = e o g = g for all g E G. Suppose that c' is an other element of G for which g o e' = e' o g = g for all g E G. Then, in
particular, e=eoe'=e,'oe.ande.'=e'oe=eo e'. Because e.oe'=e'oe.we see that e = e'. Now let g E C and suppose we have two elements h, k in G such that goh =
hog=eandgok=kog=e.Thenh=hoe=ho(gok]= [hog]ok=eok = k and we are done.
Corollary 1. For any g E G, (g- 1)
-i= g.
Proof. Given g E G there is an element g-' such that g o g-1 = 9-1 09 = e. But this says g is an inverse of g'. Now the inverse of g-1 is, by definition, (g-1)-'. Thus. by Lemma 1. g = (g-1)-'.
Definition 2. Let (C, o) be a group and let H be a nonempty subset of G. If (H, o) is a group we shall say that H is a subgroup of G.
if H is a subgroup of (C, o), then because H 0 0 there is an element h E H. But because (H, o) is a group, h-1 must belong to H. Furthermore, because H is closed under the operation o, h o h -1 = e is in H. Lemma 2. Let (G. o) be a group and let H be a nonempty subset of G. Then H is a subgroup of G if, and only if, g o h-1 E H whenever g and h are in H.
1. Groups
167
Proof. Our discussion just above this lemma shows that our condition is necessary. Suppose now that H satisfies this condition. Then because H 34 0 we
can choose h E H and note that e = It o h-' E H. Thus (H, o) satisfies (a) and (b) of Definition 1; we have just proved (b), and (a) follows from the fact that (G, o) is a group. To prove (c) choose any h E H and note that because
e E H, h-' = e o h-' is in H. Finally, we must show that H is closed under o; i.e., we must show that g o h E H whenever g, h E H. So let us suppose g. h E H are given. Then h-' E H as we have seen. Hence g o (h -') -' E H and, by Corollary 1, this is just g o h. We have seen that (C, +) is a group. By Lemma 2 then a nonempty subset
A of C is a subgroup of C if (a - b) E A whenever a, b are in A. Referring to Chapter Four, Section 4, Definition 1, we see that what we have called an additive group is simply a subgroup of (C, +). Similarly, because (C\{O}, ) is a group, a nonempty subset M of (C\{0}) is a subgroup if x y-' E M whenever x and y are in M. Referring to Definition 2 of Chapter Four, Section 4, we see that what we have called a multiplicative group is simply a subgroup of (C\{0}, ).
Definition 3. A group (G, o) is said to be an abelian group, or a commutative group, if the binary operation o satisfies the commutative law; i.e., if goh = hog for all g, h E G. Because addition and multiplication of complex numbers are commutative operations, all subgroups of (C, +) and of (C\ {o), ) are abelian groups.
Recall that given an additive group A and a subgroup C of A we said that elements a. b of A were congruent modulo C, and we wrote a =- b mod C, when (a - b) E C (Chapter Four, Section 5, Definition 1). We showed that this gives us an equivalence relation on A, and we denoted the set of equivalence classes by A/C. In the case of a multiplicative group M with subgroup N. we defined t mod N to mean s t -' E N (Chapter Four, Exercises 5, problem 1). In s
this case too we get an equivalence relation on M, and we denoted the set of equivalence classes by M/N. Not surprisingly, we can do this for any group G and any of its subgroups H. We sketch the process here and leave the details, which are almost identical to those in the special cases mentioned above, to the reader.
Given a group (G, o) and a subgroup H of G we define for g, h E G. g It mod H to mean that goh'' E H. We say g is congruent to It modulo H.
Lemma 3. Congruence modulo H is an equivalence relation on G. The set of equivalence classes of G defined by the relation congruence modulo H is denoted by G/H. We have a natural mapping VH : C -+ G/H defined
168
Chapter Five
Abstract Algebra
by setting vH (g) equal to the equivalence class of G (i.e., the element of G/H)
that contains g, for each g E G. Calling this equivalence class [g] we have
OH(9)=[9] for each geC. Definition 4. Let (G, o) be an abelian group and let H be a subgroup of g. For each [g], [h] in G/H define [g] * [h] to he [g o h].
Lemma 4. Referring to Definition 4 we claim that (C/H, *) is an abelian group. Proof. We must first show that * is well defined. So suppose we are given [g], [h] in G/H and we choose two elements 9', g in [g] and two elements h', h in [h]. We must show that g' o h' __ g o h mod H, for then we have [g' o h'] = [g o h]. Now g' og-' and h' o h -1 are in H and so their "product," (g' o g-') o (h' o h - '). is in
H. But (g'og-')o(h'oh-1) =g'o[g-' o (h'oh-1)] = q'o[(g-' oh') oh-1], and because (G, o) is abelian g-1 oh' = h'og-1. Hence (g' o g-') o (h' o h-1) = gIo[(h'og-1)oh-1] =y
o [h' o (g- 1 o h- 1)]
=(9'oh')o(9oh)-1
because
(,q o h) -' = h-1 o g -' =g-1 oh-1 in this case. The fact that * is associative and commutative follows from the fact that o has these properties. More explicitly,
[g]*([h]*[k])=[g]*[hok]=[go (h o k)] = [(g o h) o k) = [g o h] * [k] = ([g] * [h]) * [k]
and
[g] * [h] = [9oh] = [hog] = [h] * [9]
for all g, h, k in C.
The identity element of G/H is just [e] because [g] * [e] = [g o e] = 1.91 and [e] * [g] = [e o g] = [g]. Finally, for any
[h] E GI H, [h] * [h-'] _ [h o h-'] _ [e] = [h-' o h] = [h-'] * [h] and so
[h]-1
= [h-']
Corollary 1. For any fixed 71 E N, (Z,,,+) is an abelian group (Chapter Four, Section 5).
There are many other familiar mathematical objects that are groups. Any vector space, for instance, is an abelian group under addition (Appendix A). In particular then, the set of all 2 x 2 matrices with real entries is an abelian group under addition. The nonsingular 2 x 2 matrices are a group under multiplication, but this one is nonabelian. Another example of a nonabelian group that we have considered is the set of permutations on (1, 2, 3) (Chapter Four, Exercises 7, problem 2). Here the binary operation is function composition. and because
aob=(i;3)=c,whereas boa=(213)=d.weseethat aob#boa.
2. Morphisms
169
Exercises 1 1. Let. (C, o) be a group and let H be a subgroup of C. Show that congruence modulo H is an equivalence relation on C. Do it in steps: (a) show that for any g E G, g =_ g mod H; (b) if g, h. E C and g - h mod H. show that
h . g znod H; (c) if y, h, k are in C and g . h, h
k, show that y . k.
2. Consider the set (Z,,, ) for n fixed. Let E be the set of all elements of is an abelian group. this set that have inverses. Show that 3. Let (C, o) be an abelian group and let H be a subgroup of G. For each y E G let .pH(y) = [yj E G/H. So 'pH : G-. G/H. (a) Show that 'pH (g o h) ='pH (g) *'pH (h) for all g, h in G. (b) What is {g E G I 'pH (g) = [eI}?
2.
Morphisms
T such that c(m. + n) _ We have had occasion to consider mappings c : Z c(m) c(n) for all m, n E Z, and also mappings c : Zr, -+ T that have this property (Chapter Four, Section 6, Definition 2 and Definition 3). Mappings of this kind play a fundamental role in the general theory of groups.
Definition 1. Let (C, o) and (C', *) be two groups. A function 'p : C -- C' is said to be a homomorphism from G into G' if tp(g o h) = cp(g) * 'p(h) for all
g,hinG. In Chapter Four. Section 6 we defined certain maps that we called the characters of Z and of Z,,. We see now that these are simply the homomorphisms from (Z, +) and (Z,,, +), respectively, into (T, .). The basic properties of any homomorphism are easily proved.
Lemma 1. Let. (C, o) and (C', *) be two groups and let p : C -. G' be a homomorphism. Then (a) 'p(e) = e' here e is the identity element of G and r is the identity of G'; (b) for every g E G, Sp(g"') = cp(g)-t; (c) the set Ker = {g E GI Mp(g) = e'} is a subgroup of G.
For any g E C we have V (g) = 'p (g o e) = 'p (g) * 'p (e). Multiplying on the left by 'p(g)-z we see that e' = 'p(e) as claimed. This proves (a). Now Proof.
let g E G be given. We have g-z 09 = 909- 1 = e and so ip(g`) * cp(s) = ,p(g) *,p(g-') = e' by (a). It now follows from Lemma 1 of the last section that yo(g-') = Mp(g)-'. This proves (b). The set Ker p is nonempty because, by (a). it contains C. To show that it is a subgroup we need only show that when g, h are in this set. so also is
170
Chapter Five
Abstract Algebra
g o h-1 (Section 1, Lemma 2). But Ker W(g) = {g E G I w (g) = e'}, so all we must do is show that p(g o h-1) = e' given that 9 (g) = e' and p (h) = e'- Now = e'* (e')-l = e', here we have 'P(g o h-1) = 5o(g) *'p(h-1) ='p(g) * [V used (b). (h)]-1
Definition 2. Given a homomorphism V : G - G', the set Ker W = {g E GI cp(g) = e'}, where e' is the identity element of G', is called the kernel of cp.
We have just shown that the kernel of any homomorphism on G is a subgroup of C.
Corollary 1. For g, h E G we have Mp(g) = W(h) if, and only if, g o h-1 E Ker cp.
Proof. Suppose that v(g) = cp(h) then W(g) * 1p(h)-1 = e', but because V is a homomorphism, this can be written as cp (g) * p (h)-1 = :p (g) * So p (g o h-1), and because this is equal to e', g o h-' E Ker cp. Now suppose that g o h-1 is in the kernel of gyp, then e' = cp(g o h-1) _ cp(g) *;p(h-1) = Mp(g) * cp(h)-1. Multiplying on the right by V(h) we see that 1P(h) _ V(g).
Corollary 2. The homomorphism
Because we always have p(e) = e'", assuming that p is one-to-one = {e}, then
immediately implies Ker V = {e}. Conversely, if we assume Ker V(g) = (h) tells us that g o h-1 = e and hence that g = h.
Definition 3. A homomorphism between two groups is called a monomorphism if it is one-to-one . A monolnorphism that is also onto is called an isomorphism. Finally, two groups are said to be isomorphic if there is an isomorphism from one of these groups onto the other.
The set R is a group under addition and the set R+, the set of positive real numbers, is a group under multiplication. Note that f (x) = ex is an isomorphism from the group (R, +) onto the group (ll8+, ) and that g (x) = In x is an isomorphism from (R+, ) onto (JR, +). Long before the advent of the computer the fact that the log function is an isomorphism was exploited to simplify lengthy calculations. In these applications the base 10, the so-called common logarithms, was used. Let (G, o) be a group, and let K be a subgroup of G. Two elements g, h of G goh-1 is in K, and this is an equivalence relation on are congruent modulo K if G (Section 1, Lemma 3). We want to characterize the equivalence class defined by an element g E G; i.e., the set {h E G I g h mod K} that we denoted by
[g]. Note that h E [g] if, and only if, hog'' E K. Thus h o g-1 = k E K, and
2. Morphisms
171
so h = k o g. This shows that [g] C {k o g I k E K}; let's call this last set Kg.
IfhEKg,thenh=kogforsomekEKandhencehog-1 = k E K; i.e., Kg C [g]. So we have, for any g E G, [g] = Kg. We will use this very soon. Note that [g] * [h] = (Kg) * (Kh) = K (g o h).
Theorem 1. Let (G, o) and (G', *) be two abelian groups and let W : G -. C' be a homomorphism from G onto G'. Let K = KerW. Then the groups (G/K, *)
and (G', *) are isomorphic. Furthermore, the map b : G/K - G' defined by ' (Kg) = V (g) is an isomorphism between these two groups. Proof. We first show that i/i is well defined. Suppose that Kg = Kg' for g, g'
in C, we must show that >G(Kg) = W(g) = cp(g') = 0 (Kg'). But because Kg = Kg' and e E K (e is the identity of G), we see that g = k o g' for some k E K. Hence V(g) = 5o (k o g') = o(k) * W (g') = e' * ep (g') _ V(g'), where e' is
the identity of G', and we have used the fact that k E K = Ker V. Thus 0 is well defined.
Given g' E G' there is a g E G such that V(g) = g', because W is onto. But then 10 (Kg) = g' showing that 10 is onto.
Suppose that io (Kg) = i,(Kg'). Then, as we saw in the first paragraph of E Ker p = K. Hence Kg = Kg', and we have shown that 0 is one-to-one. Finally, 0 [(Kg) * (Kg')] = V) [K (g o g')] = p (g o g') = V (g) * (g') _ this proof, W (g) = W (g') and so cp (g) * cp (g')-1 = e'. This says go (g')
p (Kg) * 1G(Kg')
Corollary 1. For any fixed n E N, the groups (Z,,, +) and (U,,, ) are isomorphic.
Proof. Let w be a primitive nth root of unity. Define V : Z - U,, by setting V(k) = wk for every k E Z. Clearly, (p is a homomorphism because ip (k + 1) = wk+1 = wkw1 = cp (k). p (1). It is onto because w is primitive. Note that Ker p =
{kEZIcp(k)=1} = {kEZIwk=1} = {kEZIk=nlforsomelEZ} = {nl I 1 E Z} = (n). Thus Z/ (n), which is (Z,,, +), is isomorphic to (U,,, ) .
Remark 1. Theorem 1 is also true for nonabelian groups. This is because the kernel of a homomorphism is a special kind of subgroup, called a normal subgroup. We shall not discuss these matters any further (but see Exercises 2, problem 4) except to say that when G is nonabelian, G/H need not be a group.
Exercises 2 1. Let (G, o) and (G', *) be two groups, and let
172
Chapter Five
Abstract Algebra
2. Let (C. o) be a group, and let V : G - G be an isomorphism. Because V maps G onto itself, we call V an automorphism. (a) Show that H = {g E G I cp (g) = g} is a subgroup of G. (b) Define p : (C, +) -, (C, +) by setting V(z) = a (the complex conjugate of z) for each z E C. Show that this is an automorphism and identify {z E C I tp(z) = z}.
3. Show that f (x) = e`? is a monomorphism from (R, +) into (C\ {o} , ) and identify Im(f ).
4. Given a group (G.o) we may define, for each fixed g E G, p9(h)
g-tohogforallItEG.
(a) Show that p9 : C G is an automorphism (problem 2 above). (b) A subgroup H of G is said to be a normal subgroup if gyp, (H) _ {yo9 (h) I It E H} C H for every fixed g E G. Show that the kernel of any homomorphism on G is a normal subgroup. (c) If (G. o) is abelian, show that every subgroup of C is a normal subgroup.
3.
Rings
The subgroups of the group (Z, +) are of the form (1) = {kl I k E Z} where l = 0, 1, 2.... is fixed. These have the unusual property that n a E (1) whenever a E (1) and n E Z. This is what enabled us to define multiplication in ZI (Chapter Four. Section 8, Definition 3, Lemma 4 and the discussion just before
Lemma 4). We can get. a better understanding of what is happening here by introducing another abstract algebraic structure.
Definition 1. A ring (R, +,) is a nonempty set R that is closed under two associative binary operations + and for which (a) (R, +) is an abelian group; (b) given a, b. c in R, we have both a. (b+c) = a b+a c and (a+b) c = a c+b c. If we also have (c) a It = It a for all a, It in R, then we may say that the ring (R, +. ) is commutative.
Given a ring (R, +, ) the identity element of (R, +) is denoted by 0 and the inverse of any a E R is denoted by (-a). When R has had a multiplicative identity, we shall denote it by 1 and say that (R, +, ) is a ring with unit. To avoid triviality, we shall always assume, given a ring with unit, that 0 I. It is clear that (Z. +. ) is a commutative ring with unit. The set of all 2 x 2 matrices with real entries is, under matrix addition and matrix multiplication, a uoncommutative ring that has a unit. Let us show that the laws of signs, that we all memorized in high school, are consequences of the axioms for a ring.
3. Rings
173
Lemma 1. In any ring (R, +. ) we have (i) a 0 = 0 a = 0 for every a E R;
(ii) a (-b) = -(a b) = (-a) b for all a. b in R; (iii) (-a) (-b) = a b for all a, b in R.
Proof. Foranya,binRwehave a = (b + 0) a = b a + 0 a
a = 0. Now that (i) has been proved we may use it to prove (ii) and (iii). Because 0 = a 0 = a [b + (-b)] =
a b+ a (-b) we see that a (-b) is the inverse, in (R. +), of a b. But there is only one inverse of a b, - (a b) (Section 1. Lemma 1). Thus a (-b)
(a b).
The proof that (-a) b = -(a b) is similar. Finally, 0 = (-a) 0 = (-a) [b + (-b)] = (-a) b + (-a) (-b). Using (ii) we may rewrite this as 0 = -(a b) + (-a) (-b). But this clearly shows that (-a) (-b) is the inverse in (R,+) of -(a b) and since that inverse is a b, In the study of a given ring, certain of its subrings are of special importance. We define these now.
Definition 2. Let (R, +, ) be a ring. A nonempty subset I of R is called an ideal in R if (a) I is a subgroup of (R. +) ; (h) for any a E I and any r E R both a r and r a are in I. An ideal I is called a proper ideal if 154 R. Remark 1. We have seen that (11(Z) , +, *) is a commutative ring with unit (Chapter Four. Section 1, Definition 1, Lemma 3 and Exercises 1, problem 2). When we defined the ideals in this ring, however, we were looking at it as an "algebra over C" (Chapter Four, Section 2, Definition 2). A ring (R, +. ) is called an "algebra over C" if (a) (R, +) is a vector space over C (Appendix A): (b) for each a,b in R and each \ E C, A(a b) = (Aa) b = a (Ab). In a ring (R, +. ) we want our ideals to be subgroups of (R, +); in an algebra (R. +, ) over C we want our ideals to be linear subspaces of the vector space (R, +). The proof that (11 (Z) , +, *) is an algebra over C was given in (Chapter Three, Section 1, example (iv) and Chapter Four, Exercises 1, problem 2(b)).
Let us focus now on a fixed commutative ring with unit, (R, +. ), and a proper ideal I in R. Because I is a subgroup of the abelian group (R. +). defining a - b mod I to mean (a - b) E I gives us an equivalence relation on R (Section 1, Lemma 3). Also, the equivalence classes are the sets a + I = {a + b I b E I }
(Section 2, the discussion just above Theorem 1). We know that R/I is an abelian group when we define (a + I) + (b + I) to be (a + b) + 1. We can define
a multiplication in R/I. What we do is set (a + I) * (b + I) equal to a b + I. We shall now prove:
Theorem 1. (R/I, +. *) is a commutative ring with unit. The map : R R/I defined by setting cp (a) = a + I for all a E R satisfies (i) (a + b) _ :p(a) + p(b) for all a. b in R; (ii) V (a b) = V (a) * p (b) for all a, b in R.
174
Chapter Five
Abstract Algebra
Proof. We already know that (R/I, +) is an abelian group and that the map W satisfies (a). Let us show that * is well defined. To do this we must show that if a = a' mod I and b = b' mod I then a b a'. b' mod I. Now (a - a') E I, hence (a - a') b E I. Also, because b - b' is in I, a' (b - 6') E I. Thus we have both in I. Adding we get a b - a' b'in I which says a 6 = a' b' mod 1, and this is what we wanted to show.
Observe that * is commutative for (a + I) * (b + I) = a b + I = b a + I = (b + I) * (a + I) because (R. ) is commutative. Similarly, * is associative. Note
that (a+I) * ((b+l)+(c+I)] = (a+l) * ((b+c)+I] = a. (b+c)+I =
(a b + a . c) + I = (a b + I) + (a. c + I ), showing that (R/I, +, *) is a ring. The fact that (1 + I) is a unit in this ring and that cp satisfies (ii) are immediate from the definitions.
Corollary 1. For each fixed 1, (Z1, +, ) is a commutative ring with unit. Proof. Zt = Z/(1) and the set (1) = {k1 I k E Z} is an ideal in(Z, +, ). Because this last ring is commutative and has a unit, the corollary is an immediate consequence of our theorem.
Exercises 3 1. We have seen that the ideals in (Z, +, ) are of the form (a) = {k a I k E Z}.
(a) Show that (a) c (b) if, and only if, b I a. (b) A proper ideal I in a ring (R, +, ) is said to be a maximal ideal if the only ideals that contain I are R and I itself. What are the maximal ideals in (Z,+, )? 2. We refer to Chapter Four, Section 9 and the discussion just before Lemma 2. Let q be a fixed odd prime number and let X = {a + bf I a, b E Zq }.
For al + bl f and a2 + b2 f in X define their sum to be (al + a2) + (b1 + b2) f , where al + a2 and bl + b2 are computed in Zq.
(a) Show that (X, +) is an abelian group. tobe(ala2+3bi62)+(alb2+a2bl) (b) Define (al f, where ala2 + 3b1b2 and alb2 + a2b1 are computed in Zq. Show that (X, +, ) is a commutative ring with unit.
3. Let (R, +, ) be a commutative ring with unit. For any fixed a E R let (a) = {r a f r E R}. Show that (a) is an ideal in R. 4. Show that, in a commutative ring with unit, every proper ideal is contained in a maximal ideal (defined in 1(c) above).
4. Fields
4.
175
Fields
W e are all familiar with (Q, +, ) , (R, +, ) , and (C, +, ) . Each of these is a commutative ring with unit, but they are very special rings in that each of their nonzero elements has a multiplicative inverse. Such rings are called fields, and in this section we discuss finite fields, sometimes called Galois fields. These
are of interest in pure mathematics, and they have proved useful in digital signal processing [see Note 21.
Definition 1. A field (F, +, ) is a commutative ring with unit in which every nonzero element has a multiplicative inverse.
We have seen that (Z,, +, ) is a commutative ring with unit (Section 3, Corollary 1 to Theorem 1). Furthermore, by Corollary 1 to Theorem 3 (Chapter Four, Section 8), (Z,, +, ) is a field if, and only if, 1 is a prime. Given a field (F, +, ) it is convenient to denote the additive and multiplicative identities of F by 0 and 1, respectively. If a E F, we set a, a+a,
and in general n a = a + (n - 1) a, n = 3,4,.... Consider now the sequence of field elements 1 2 1, 3. L... There are two possibilities: We can have m ! n 1 whenever m 96 n, or we can have n 1 = m 1 for some m, n with n < m. In the latter case we see that (m - n) 1 = 0.
Definition 2. Let (F, +, ) be a field. If n 1 96 0 for all n E N, we shall say that F has characteristic zero. If F does not have characteristic zero, then the smallest element n of N such that n 1 = 0 is taken to be the characteristic of F.
It is clear that Q, R, and C each have characteristic zero and that Zp has characteristic p.
Theorem 1. The characteristic of a field is either zero or a prime number. Proof. If (F, +, ) is a field having characteristic n > 0, then n is the smallest
positive integer such that n 1 = 0. If n is a composite number, then n = pq where 1 < p < n and 1 < q < n. It is clear, however, that 0 = n 1 = (pq) I = (p 1) (q 1). If the field element p 1 34 0, it has a multiplicative inverse (p 1)-1. Hence, recalling that (p 1)-1 0 = 0 (Section 3, Lemma 1, (a)), we see that 0 = (p 1) -1 - 0 = (p 1)-' [(p 1) (q 1)] = q 1. But this is impossible because 1 < q < n and n is the characteristic of F. Now let (F, +, ) be a field with characteristic p 0 0. Set 0. 1 = 0 and note
that {0.1,1.1, 2. 1, ...} = {(n - 1) 11 n E N} is just the set {0.1,1 1, ..., (p - 1)1 }. Furthermore, if m 1 and n 1 are in this last set, then m 1 + n 1 = (m + n) 1 and (m-1)-(n- 1) = (mn) 1 provided that the suns, m + n, and the product, mn, are carried out mod p.
Chapter Five
176
Abstract Algebra
Now we can define a map V : Zp - F by setting V (n) = n 1 for each n E Zp. We are using n to mean two different things. In p(n), n is an element of Zp, and in n 1, n is the integer n. This should not cause the reader any trouble. Introducing new notation to distinguish the two n's seems unnecessarily pedantic at this point. Observe that V (m + n) = rp(m) + W(n) and V (mn) = V (rn) p(n) for all m, n in Zp. We have shown that every field of characteristic p "contains" the field Zp.
Theorem 2. Let (F, +, ) be a field with a finite, say q, number of elements. Then q = p", where p is the characteristic of F, and n E N. Proof. It is clear, because F has only finitely many elements, that its characteristic must be p 96 0. Now F contains Zp, and we may regard F to be a vector space whose scalars come from Zp (Appendix A.) Let v1, ..., v be a basis for
this vector space. Then every element x E F can be written in one, and only one, way as a linear combination of the basis elements. Hence x = Fnj=1 (kjvj, where each aj E Zp. Note also that any such sum is in F. Because there are p choices for each coefficient and there are n coefficients, F contains p" elements.
Remark 1. It is true, although we shall not prove it, that for any prime p and any n E N there is a field containing p" elements. Furthermore, there is only one such field in that any two are isomorphic. As we mentioned above, such fields are called Galois fields. They are often denoted by GF (p"). Exercises 4 1. Let (F, +, ) be a field and let a, b E F. If a b = 0, show that either a = 0
orb=0. (a) If (F, +.) is a field, show that the only ideals in F are {0} and F itself.
(b) If (R, +, ) is a commutative ring with unit and if the only ideals in R are {0} and R itself, show that (R, +, ) is a field 2. Let (F, +, ) be a field with characteristic p
0.
(a) For any a, b E F, show that (a + b)' = ap+bp. Hint: Use the binomial theorem and the fact that p is a prime. (b) Let G = {xp I x E F}. Show that (G, +, ) is a field.
Notes
177
Notes 1. A very accessible introduction to modern algebra can be found in "Abstract Algebra" by J. N. Herstein (Macmillan Publishing Co., 2nd Edition, New York, 1990).
2. Groups, rings, fields, and their application to digital signal processing are discussed in "Fast Algorithms for Digital Signal Processing" by Richard E. Blahut (Addison-Wesley Publishing Co., Reading, MA, 1985). See pp. 21-33. The Fourier transform in an arbitrary field is discussed on p. 31. There is also a treatment of finite fields (pp. 155-165). Euler's phi function is discussed on pp. 156 and 157, and Mersenne numbers and associated finite fields are used on pp. 182-185.
3. There are many applications of group theory to various areas of physics. These are discussed in "Applications of Finite Groups" by John S. Lomont (Academic Press, New York, 1961).
Appendix A
Linear Algebra
Here we shall recall some terminology from linear algebra
Definition 1. A nonempty set V is said to be a vector space over K if it has all of the following properties:
1. There is a rule that assigns to each pair u, u of V a unique element of V called their sum and denoted by u + v. This rule is called vector addition. 2. Vector addition satisfies
(a) The Commutative Law; i.e., u + v = v + u for all u, v in V.
(b) The Associative Law; i.e., (u + v) + w = u + (v + w) for all u, v, u' in V.
(c) There is a unique element 0 E V for which u + 0 = 0 + u for all u E V. We call 0 the additive identity of V.
(d) For each u E V there is a unique element (-u) E V such that
u+(-u)=(-u)+u=0.
3. There is a rule that assigns to each pair a, v, where a E K and v E V, a unique element of V called the product of a and v denoted by av. The rule is called scalar multiplication. 4. Scalar multiplication satisfies.
(a) a(,3u)=(af)uforalla,J)inKandalluE V; (b) a(u+v)=au+avfor all a in K and all u, v E V; 179
180
Appendix A
Linear Algebra
(c) (a + 3) it = av + I3u for all a,Q in K and all u E V;
(d) lu=uforalluEV. The elements of a vector space V over K are called vectors, the elements of K are called scalars, and K is called the field of scalars. Now let V be a vector space over K.
Definition 2. A nonempty subset W of V is said to be a linear subspace if (a) For any two vectors it, v, in W the vector u + v is in W; (a) For any vector w in W and any scalar a (i.e., any element of a of K), the vector aw is in W. Let S be a nonempty subset of V. A vector v is said to be a linear combination of vectors in S if there is a finite set u1i...uk in S and scalars al, ...ak such that v = k=1 ajui. The set of all vectors in V that are linear combinations of vectors in S is denoted by lin S. It is easy to see that lin S is a linear subspace that contains S. We call it the linear subspace spanned by, or generated by, S. In particular, if lin S = V we say that S spans V.
Definition 3. Let V, W be two vector spaces over the same K, and let T : W. We shall say that T is a linear function, or linear map, if (i) T (u + v) = V T (u) + T (v) for all u, v in V; (ii) T (av) = aT (v) for all a in K and all v in V.
Corresponding to any linear map T : V -' W we have two linear subspaces: the range of T (Definition 1 of Section 2, Chapter Zero), which is a linear subspace of W; and the null space, or kernel, of T defined to be {v E V I T (v) = 0}.
Appendix B
The Completion
Definition 1. Let (E, 1 1-1 1 ) be a normed space. A Banach space (E, I I I I ) is said to be a completion of (E, 11-11) if (a) There is a linear map T : E
E such that
IT
=
II
x
II
for every,
I IIi
x E E (we say that the map T is norm preserving);
(b) The linear subspace T (E) = IT-; I x E E} is dense in (E, II
II
)
We shall show that every normed space has a completion and that, in a certain sense, this completion is unique. The construction is quite straightforward; it is similar to Cantor's construction of the reals from the rationals but lengthy and a little bit tedious. We shall leave some of the more routine verifications to the reader so as not to obscure the underlying ideas. We begin with some simple facts about Cauchy sequences (Definition 3 of Section 2, Chapter Three) in a normed space (E, II II ). 1.
if x,yare in E,then lll=II-II4I< Ix
- VII.
2. If { x n } C E is a Cauchy sequence, then the sequence of real numbers
{IIxnII} 3. If {
0 then
is convergent. and { U n } are two Cauchy sequences in E and if I'm00 II x
II
-00 ''m IIVnII-
We read the proofs for the reader. 181
"n (x f )
Appendix B
182
The Completion
Theorem 1. Every normed space has a completion. Proof.
Let (E,
be a normed space and let C be the set of all Cauchy
sequences of points of E. We may define an equivalence relation (Chapter Zero,
Section 2) on C by agreeing to say { xn } and { v n } are equivalent when tim I S n - y n
11
= 0. For each { x n } E C, let
[ { x n}
] be the equivalence
class (Chapter Zero, Exercises 2, Problem 1) that contains {xn}, and let E be the set of all equivalence classes in C. We can define two operations on the elements of k as follows: For any [{xn}], [{Un}] we let +
[{yn}] =
[{xn+yn}j,andforanyscalar \welet.\[{xn}] =
[{'\xn}]
It is easy to show that E, with these operations, is a vector space.
We have seen that for any {xn} E C, n""oo IIxnII exists ((2) above). Furthermore if { x n } and (;,,,I are equivalent, then n'1o
IIxnII = n"°D II;nVI
((3) above). Thus we may define, for any [{xn}] E E,
III{xn}III1
n 1-+n00 IIxnII
It is easy to see that II'II1 is a norm on E. Now given any x E E we consider the sequence x, x,...; i.e., the element
{in) E C such that in = x all n. Then, denoting this sequence by
{x},[{x}] EEand
Il[{x}]II1 -IIxII We define a map T:E
setting T; = [{ x }] for every x E E. It is clear that T is a well-defined linear map that is norm preserving (i.e., II T x II
t=
II x II for all x ).
Let. us show that the linear subspace T (E) is dense in (E, II'II1) Given [{
E E the sequence fix-) is in C and so, given e > 0, there is an N
such that
Ixn -
x'nII<e
whenever in, n > N. Choose, and fix k > N and consider the sequence xk, xk, .... We have
[{fin}] -
[{xk,
xk.,...}]
=
[fin - k} 9l:11
Appendix B
The Completion
183
and
II[{xn}] - [{xk,xk,...}]III = II[{xn - xk} lim n--+oollxn -
xkll<e
Thus, as k -. oo. the sequence
xl, xl, ... }]
x2, ...}] , ...
,
converges to the given element of R for
x[{!k,
xk, ...}] ,...
1. Because each term of this sequence as claimed.
is in T (E), we see that T (E) is dense in (E,
Note that we have just shown that for any [{xn}] in k the sequence T-1 1, Ti2, ...T x k, ... converges to [{ below.
for
II
II1. We use this observation
To finish the proof we need only show that (El 11-111) is complete. So let
IV-1 be a Cauchy sequence in E and choose, as we may because T (E) is
dense in E, v E E such that 1
IITv -
Vnlll < n
We do this for each n = 1, 2.... Consider the sequence { v n } contained in E that we have constructed. We note that
I1
vn-vm1=JIT(vn-vm)II1-IITvn-TVm
IITV'-
n+
-
IIVn
Vn 11
L
+01n-1/mlll+Il1m-TUm11,
- Vmlll + nl
where we have used the fact that T is norm preserving. Because 1V -l is a Cauchy sequence, our inequality shows that 1;-) is also a Cauchy sequence.
Set V =
and recall that
converges to V for IIII1
184
Appendix B
The Completion
It follows from the inequality
Vn - Tvnlll + IITvn - VIII
IITvn n +
- VII'
that the Cauchy sequence ( Vn } is convergent to a point of E. While we have the proof before us, let us show that when (V, (,)) is an inner product space and 11- 11 is the norm induced on V by the inner product, then the construction applied to (V, I I II) gives us a space (V, I I II) that is actually a
Hilbert space. First note that when { v n } , {wn } are in C (the set of all Cauchy sequences of (V, II'II)) then
(vn,wn>-(vm,wm I=
I(v - vm, wn - wm)I <
I
vn -
°mll llwn
- wm11
by Theorem 2 of Section 3, Chapter 3. Hence n' (vn, wn) exists in K. Furthermore, if { V- } and { u } are equivalent, respectively, to{ v n } and {wn}, then
I(Un,wn)-(vnwn)I
- wnll
showing that lim (v n, wn ) = lim (v'n V n ). Hence we may define, for any [{ vn}] , [{wn}] in V, an inner product by setting ([{vn}]
, [{wn}]) = lim(vn,wn)
It is easy to see that the norm associated with this inner product is just I I ' I I 1. Thus (V, 1 1-1 1 1 ) is complete showing that (V. (,)) is a Hilbert space.
Finally, we note that the map T : V -i V defined in the proof of Theorem 1 satisfies
(Tu,Tv) = (u, v) for all u, v in V.
Appendix B
The Completion
185
We have proved:
Theorem 2. Given any inner product space (V, (,)) there is a Hilbert space and there is a linear map T : V -- V such that (T-u, Ti for all it, r in V (we say that T preserves the inner product), and the linear subspace T (V) is dense in V.
Definition 2. Given an inner product space (V, (,)) a Hilbert space having the properties stated in Theorem 2 is called a completion of (V, (.)).
Let its now show that all completions of a normed or an inner product space are "essentially" the same.
Theorem 3. Given any two completion of a normed space, there is a linear norm preserving map from either of these completions onto the other. Given any two completions of an inner product space, there is a linear map from either of these completion onto the other that preserves inner product. Proof. Let (E. II'II) be a normed space, and let (E1, I1'II) and (E2, 0112) be two completions of this space. Then we have two linear norm-preserving maps T,
:E
E, and T2 : E - E2 such that T1 (E) is dense in (E1.11.111) and
T2 (E) is dense in (E2,11-112). To simplify the notation, let its simply say that E
is a dense linear subspace of both (El,
and (E2,11.112)
We may define a map T : E, - E2 as follows: For each r E E C E1 CE (r) = r E E2. For y E E,\E we find first a sequence
we set T
such that lira
IT
y for 11'II1
Then {x, } is a Cauchy sequence for
and
is a Cauchy sequence for II'II2 We set T (y) equal to the
limit of {T
for 11
112
Note that if { i } and { x' } both converge to {/ for 11-11,, { T n converges to zero for both II.11, and 11'112: because
T}
{ r } and { c' } are in E.
Thus lint T (r ) (for 11'112) and we sec: that. T is well defined. It is easy to see that T is linear. Hence to show that T is one-to-one (Chapter Zero, Section 2) we need only show that IV E E, I T(y) = 0 E E2 } contains
only the zero vector of E,. Suppose that
is in this last set, and let {
be
a sequence in E that converges toy for 11' II 1 But then limn :r = lim T O E E2 for 11112 and so, because II.111 = 11 I12 on E. lim r = 0 E E1 for 11
Thus 9 is the zero vector of E.
II 1
186
Appendix B
The Completion
Given any z E E2 we can find a sequence V-1 C E that converges to z for II' II2 But then { i" } converges to an element y E El for 11-111; because it
is Cauchy for El. Clearly then, T (y) = z, and we have shown that the map T is onto.
Finally, let y E E1 , and consider II y II 1 and IIT (y) II2. We have T (y) = limT
(x")wliere {x"} C E and lim a" = y for II'II. But then lim III"II
II, and lim II ;"II2 lim II T (x") II2 y = IIT M II2 (Chapter Three, Exercises 2, problem 1(a) ). Because II' II , and II' II2 coincide on E, we see that I T (y) II2 I
U II I
and we have proved that T is norm preserving.
Now let (V, (,)) be an inner product space, and let (Vi (, )1) and (V2 WO) be two completions of this space. We regard V to be a dense linear subspace of the Hilbert space (VI, 11-111) and a dense linear subspace of the Hilbert space
(V21 II'II2) As above we define a one-to-one, linear map T : Vi - V2 that is onto. Let u, v be any two elements of Vi and find in V such
that lim u" = u and lim v" = v for II'II1 Then
CT (u) ,T (0)2
= lim (T = lim
\u")'T \v"))2 (-"'v-)2
=lim\u"'v")I=(u'v)i showing that T preserves the inner product. Now that we have proved Theorem 3 we shall speak of the completion of a normed or inner product space. We shall also identify the space with its "copy"
in its completion. Thus we shall say that any normed space is a dense linear subspace of its completion.
Appendix C
Solutions to the Starred Problems
Note: Throughout this appendix we shall write "iff" in place of the phrase "if. and only if."
Chapter Zero 0.1.2 We have defined (x, y) to be {{x} , {x, y}} .
(a) (x, y) = (y, x) if x = y
If x = y, then we have (x, y) = (x, x) and (y, x) = (x, x), and these are the same. Now suppose (x, y) = (y, x), then {{x} , {x, y}} = {I y} , {y, x} I. We want to show that x = y. Suppose this is false. Then we have {x} a fly}, {y, x} } and {x} 0 {y, x}, hence {x} = {y}.
But this gives us x = y and we are done.
(b) (x, y) = (u, v) if x = u and y = v. Assume (x, y) = (u, v). Suppose
first that u = v. Then (u,v) _ ({u},{u,v}} = {{u},{u,u}} = {{u} , {u}} _ {{u}}. Now {x} a (u, v) and so {x} = {u}, giving us x = u. Furthermore, (x, y) must be a singleton (because, as we have
just seen (u, v) is a singleton). Hence {x, y} = {x} and because ye {x, y} ,
y = x. Thus x = y = u = v in this case. Similarly, if
x=y,thenx=y=u=v.
Suppose now that x y and hence u 34 v. Then both (x, y) and (u, v) contain one singleton: {x} and {u}. Clearly then, {x} = {u}. giving us u = x. Furthermore, each ordered pair contains exactly one unordered pair: {x, y} and {u, v}, hence {x, y} = {u, v}. Now ye {x, y} so ye {u, v}. If y = u, then y = x, which is false. Thus y = v.
187
188
Appendix C
Solutions to the Starred Problems
0.1.3 X- Y= X n Y
Let xfX - Y. Then xEX and x V Y. Thus xrX and x ' X n Y, hence
xeX-XnY. Thus X-YcX-XnY. Now letxEX-XnY. Then xEX and x j X n Y. Thus x cannot be in Y. So xEX, x 9 Y giving us
xEX -Y.ThusX - XnYcX -Y. 0.1.4
(a) Show (i) CE (X u Y) = CE (X) n CE (Y) and (ii) CE (X n Y) _ CE (X) U CE (Y).
i. xECE(XUY)iffxjXUYiffxjXandxeYiffzECE(X) ii.
and xECE (Y) if xECE (X) n CE M. xECE (XnY) iffx e X n Y iff x y' X or x e Y if xECE (X) or x(CE (Y) if zECE (X) U CE (Y).
(b) X - Y = Xn[CE(XnY)) = XnCE(Y) xEX - Y if xeX and x e Y if xeX and xECE (Y) iff xEX n CE (Y).
ThusX-Y =XnCE(Y). By problem 1.3, X -Y =X-XnY and by what we have just shown X - X n Y = X n CE (X n Y).
Thus X- Y = XnCE(XnY).
(c) XUY=Eiff{xEE IxeX}CYiffCE(X)
Y. By symmetry
XUY=E if CE (Y)CX.
(d) If X C Y and xECE (Y), then x e Y so x j X; i.e., if zECE (Y), then XECE (X), giving us CE (Y) c CE (X). Now suppose CE (X) 2 CE (Y). By what has just been shown, CE [CE (X)] e CE [CE (Y)]; i.e., X g Y. Thus X C Y. If X U Y = Y, then for any xEX, x(X U Y and so Conversely, if X C Y, then any xEX U Y must be in X g Y or in Y; i.e., X U Y C Y. But clearly Y C X U Y and so X U Y= Y. If X n Y = X, then xeX implies xeX n Y and hence xeY. Thus X CZ Y. Conversely, if X C Y, then for any xEX. xEX and xEY hence xEX n Y.
Thus X CXnY. But X D X n Y and so X = X n Y. 0.2.1
(a) Because - is an equivalence relation it is reflexive (property (a) of the definition). Thus, for every xEX, x - x and so xE [x] showing that this set is nonempty. (b) Suppose ze [x] n (y]. Then x - z and y - z. Because an equivalence relation is symmetric, we have x - z and z - y (property (b) of the definition). But then by property (c) of the definition we must have x - y. It follows then that [x) = [y). Now suppose [x] # [y). Then x 0 y and because zE [x] n [y] implies, as we
have just shown, that x - y, we must have [x] n [y] = 0.
Chapter Zero
189
0.2.2 f : X - Y, g is a partial function from Y into Z. (a) Suppose R (f) C D (g). Then for any xeX, f (x) eR (f) C D (g) and so g [f (x)] has meaning; i.e., it is a unique element of Z. Thus when R (f) C D (g) , g [f (x)] is a well-defined function from X into Z. (b) Let f : X - Y and suppose that there is a function g : Y -, X such that (fog) (y) = y for all ycY and (gof) (x) = x for all xeX. Then f is onto because it maps the element g (y) eX onto the element
yeY showing that R (f) = Y. It is also one-to-one. For suppose f (x) = f (x'). Then x = g [f (x)] = g [f (x')] = xi. Now suppose f : X - Y is given to be one-to-one and onto. Then for each ycY the set pre f (y) is a singleton. We set g (y) equal to the unique element of this set. Then clearly g [f (x)] = x for every xeX X. We also have f [g (y)] = y, by the way g was defined, for every yeY.
0.2.3 We have a set X and its power set P (X). (a) We define, for Y, Z in P (X), Y =_ Z to mean there is a one-to-one mapping f from Y onto Z. Clearly, Y - Y for every YeP (X) for we may define f (y) = y for all yeY and in this way obtain a map f : Y - Y that is one-to-one and onto. Suppose that Y = Z. Then we have g : Y -i Z, g is one-to-one and onto. By problem 2(b) we know that the map g- I : Z -. Y exists and that it is one-to-one and onto. Thus Z - Y showing that our relation = is symmetric.
Finally, suppose Y e Z and Z- W. Let f: Y--+ Z and g: Z W be one-to-one, onto functions. Consider gof : Y -+ W. This map is onto because given any weW there is a zEZ such that g (z) = w,
and there is a yeY such that f (y) = z, hence (gof) (y) = w. It is also a one-to-one mapping because (gof) (y) = gof (y') implies f (y) = f (y') because g is one-to-one, and this implies y = y' because f is one-to-one. Thus Y e W showing that - is transitive. (b) For Y, Z in P (X) define Y < Z to mean Y C Z. It is clear that for any YeP (X), Y C Y and so Y < Y. Thus our relation is reflexive.
Suppose that Y< Z and Z< W. Then Y C Z and Z C W hence Y C W, which means Y <W. Thus our relation is transitive. Finally, if Y < Z and Z < Y, then we have both Y C Z and Z C Y, which means Y = Z. So our relation is antisymmetric. 0.2.5 We have a family of sets {Aa I .\el} with Aa C X for each ael. Then (a) Cx [a<,A,] = ACJ Cx (AA)
We have xeCx [a ,,A.\] if x 0 UAa iff x 0 Aa for every Ael if xeCx (Ax) for every Ael iffxen L,Cx (A,).
190
Solutions to the Starred Problems
Appendix C
(b) Cx [ Ael n A,] = uAdCx (AA) We have xcCx [nAA] if x 0 , I A. iff x V A,\ for some W if xcCx (AA) for some Ael iff xea'jCx (A,). 0.4.1
(a) Given z = a + bi, z = a - bi and so z z =a 2 - abi + abi - b2i2 = a2 + b2 = Iz12.
(b) Ifz=a+bi00, thenlzI
0. Now
a multiplicative inverse. 1. Then Supposewehave z
z
i
z
w = (cos
i sin
s (cos (cos
i sin
i sin
i (sin 0 cos p + cos 0 sin gyp)]
0 cos cp
= rs[cos(0+W)+isin(0+cp)]. (b) We want to show that z" = r" (cos 0 + i sin 0) for every natural number n. Clearly, this is true when n = 1. Suppose that for some n, it is false. Then (n j N I z" 34 r" (cos n0 + i sin n6) } is not empty. Let no be the smallest member of this set. As we have already noted, no > 1. Now no - 1 is a natural number and, from the way no was defined, z.,O-1 = r"O' 1 [cos (no - 1) 0 + i sin (no - 1) 0]. But then z"° = z
r"°` [cos(no - 1)0+isin(no - 1)0], z"0` = r (cos 0 + i sin 0) giving us, by (a), zn° = r"O (cos no0 + i sin no0) . This is a contradiction.
0.4.4 We are given two complex numbers, z, w.
(a) z = reie and w = se'B. Now le'° I = Icos a + i sin al = (cos' a+ sine a) l = 1 for any real a. Thus Iz wI = rs = Izl IwI. (b) We want to show that Iz + wI < Iz) + IwI.
Note that Iz+w12 = (z+w) (F+--w) _ (z+w) (z+w) and hence Iz+WI2=Iz12+zw+2.w+IwI2.
Now write z = re'B, w = se'e, then z w + i w = rs [e'(e-'°l + e'(v-O) _
rs[cos(0- p)+isin(0-
cos (0 - gyp) = cos (9p - 0) and sin (0 - gyp) _ - sin (
191
Chapter Zero
0.4.5 Let p(z) be a polynomial with complex coefficients having degree n > 0. For any complex number r we may divide p(z) by (z - r) to get p(z) = (z - r) q (z) + p (z). Here p (z) - 0 or deg p (z) < deg (z - r) ; i.e., p (z)
must be a constant. Let ri be a root of the equation p (z) = 0. Then
p(rl)=0,sop(z)=-0and we have p(z)=(z-rl)q,(z). Clearly, q (z) has degree n - 1. If n - 1 54 0, then the equation qi (z) = 0 has a complex root r2. As before qi (z) = (z - r2) q2 (z) where q2 (z) has degree n - 2. Thus p (z) = (z - ri) (z - r2) q2 W. Continue in this way. The process
q (z) because then the
ends when we have p (z) = (z - rl) . . . (z -
degree of qn (z) must be zero; i.e., q (z) must be nonzero constant.
0.4.6 For anyz=a+biin C, u (z) = a and v (z) = b. Now let f :R - Che given. Then f (x) = a + bi = u (z) + v (z) i = (uof) (x) + (vof) (x) i. and we may set g (x) _ (uof) (x), h (x) = (vof) (x) for all xtR obtaining two functions from R to R such that f (x) = g (x) + ih (x) for all x.
0.5.2 We are told that f is differentiable at soEK. We want to show that f is continuous at this point; i.e., we want to show that ,1im° f (s) = f (so). The trick is to write
f (s) _
{f(8I(80)] (s - so) + f (so)
Taking the limit as s approaches so the quantity in the square brackets tends to f' (so) and hence si' e° f (s) = f (so) as was to be shown.
0.5.3 We are given f : R - C and we write f (x) = g (x) + ih (x) where g. h map to R to R+ (problem 0.4.6). (a) If (x) - (uo + ivo)I2 = I(g (x) - uo) + i (h (x) - vo)12 = Ig (x) - uol2+ g (x) = uo and Ih (x) - vole. Now if xii z° f (x) = uo +ivo, then lim x-x° h (x) = vo. Conversely, if lim g (x) = uo and lun it (x) = vo, then lim f (x) = uo + ivo. (b) f is continuous on [a, bJ iff both g and h are continuous on this interval. This follows immediately from (a) and the definitions. (c)
f (xx - xo
xo)
- (uo + ivo) I2
2
xoxo)
= I g (xx -
- uo
+Ih(x)-h(xo) x - xo
2
-z/o
- x° = uo + ivo, then g and h are differentiable and Thus if x-x° iir xxx° g'(xo) = no, h'(xo) = vo. Conversely, if g'(xo) = uo, h'(xo) = vo, then
f' (xo) exists and is equal to no + ivo.
192
Appendix C
Solutions to the Starred Problems
'n' snn = and let {snk } be any subsequence of {4,i} noo Then given e > 0 there is an N such that ISO - snI < E whenever n > N. Also, there is a k0 such that nk > N when k > k0. It follows
0.5.7 (a) Suppose
that ISO - snk I < E whenever k > k0, showing that k-soo fi'n' snk = s0.
(b) Suppose that {sn} is a Cauchy sequence. Then given e > 0 there is an N that is Isn - 8m1 < £ whenever both m and n exceed N. If {snk } is any subsequence of (a, j, then there is a k0 such that nk > N when k > k0. It follows that Isnk - sn, I < £ whenever k and 1 exceed k0.
(c) Let {sn} be a Cauchy sequence and let {snk} be a subsequence such that s0. Then given E > 0 there is a k0 such that
Isnk - s0I < z when k > k0. We can also find an N such that I sn - sin I < z when m and n both exceed N. Now if n > N we can choose k > k0 so that nk > N. Then £ 180 - s,(
180 - snkl + Isnk - snI <
2
+
E
2 = £
Hence n`!no Sit = s0.
(d) Let {sn} be a Cauchy sequence. Then we can choose N so that IS,, - s,,.1 < 1 when m, n > N. In particular, we have !s f < Is, - sN I+ 18N1 <_ 1 + ISN1 for all n > N. Let K be all integer that is greater than 1811, 1821, ..., IsN I. Then we have I sn I < 1 + K for all n.
Chapter One 1.1.1 We have u = u (x, y) and x = r cos 9. y = r sin 9.
C921
On ax
au Oy
=
On
cos 9 +
ar - ax a + ay Or ax 02u _ [O2uOx are - cos B axe a + 8yax & J
On
a,J
sin B
a2u a02u ax 1
+ sin B I
= ar cos20+2sin0cos0y + a`"sinB y2
2
an ay
axay ar + sy2 ar J
193
Chapter One because
a2u a2u axay - ayax
au ax + au ay = 1921 au (-r sin B) + ay ae ax Oy
au
aB - ax a8 au
aZu
00ax (-rcose) +
ay
(r cos B)
+ (-rsin6)
Y Y-
ra2u ax axe ae +
a2u ayl ax aB
a2u ax
a2u ay + (rcosB) [axayTo + aye a9] 2
+r 2 sin2 0 a22
_ -r cos 0 a - r sin 0
- 2r2 sin 0 cos 0 axay
a2u 82 + r2 cos2 Bl now
a2u
1 a2u
1 au
a2u
a2u
5re+rar+r2T02 =8x2+a 22 y
1.2.3 f : R - lR is continuous and a > 0 is any element of P (f). We distinguish several cases:
(a) a>0 Case 1.
0 < x. Then x+a
L
x+a
x
=f -f =f
a
+f
x+a
Ix
Now
f where
f (t)
dt= jf (u + a) du= f f (u) du 0
we set t = u + a and used the fact that f (u + a) = f (u). It
now follows that x+a x
a 0
194
Solutions to the Starred Problems
Appendix C
x < 0. Here we have two subcases:
Case 2.
+0 =
flo X
a
+ f.
a
0
X
- = f. +10
f, a
-
But
J
r
f(t)dt=J f(u+a)du=Je JJJJ
:
r
f(u)du hence
-r -x+a r-z+a
J
=
fa
a
0 0
r z+a
nl
0
r0
If",. - 1
-r
J
Again -z+a = J-x So f -X
=
1
(b) a< 0 Then
1r x-a
y
0
_Iy+a_la
where we have set y = x - a. 1.2.4 We have g (x) = a cos Ax + bsin Ax, where a, b are complex numbers not both zero and A is a positive constant. It is clear that P(g) {n (Za) n E Z}. and if a = 0 (or if b = 0), these two sets are equal. We want to show that these sets are equal when neither a nor b is zero. Let a E P (g), so that g (x + a) = g (x) for all x. Then a cos A (x + a) + bsin A (x + a) = a cos Ax + bsin Ax for all x. A little algebra gives us: [a cos Aa + b sin Aa - a) cos Ax + [b cos Aa - a sin .\a - bj sin Ax = 0
for all x.
Chapter One
195
We need a simple result: Lemma. If A cos Ax + B sin Ax = 0 for all x, then A = B = 0. Proof. Set x = 0 to see that A = 0. Then B sin Ax = 0 for all x. Clearly, This proves the lemma. B = 0 (take, for instance, x = We can now conclude that a (cos as - 1)+b sin .1a = 0 and that b (cos .1a - 1)2A).
a sin Aa = 0. Multiply the first of these by a, the second by b, and add the results to obtain (a2 + b2) (cos as - 1) = 0.
where k E Z. If a2 + b2 = 0, then a = fib and clearly b 54 0 (otherwise we'd have both a and b zero). Thus when a = ib our equation becomes ib(cosAa - 1) + b sin as = 0 or i (cos as - 1) + sin as = 0. Equating real and imaginary parts gives cos Aa = 1. The case a = -bi is similar.
If a2 + b2 54 0, then cos as = 1; hence as = 2kir and a
1.2.6 Given f : R -' C there are functions g, h that map R to R such that f (x) = g (x) + ih (x) for all x (see Chapter Zero, Exercises 4, problem 6).
We must show that (a) P (f) = P (g) (P (h). It is clear that P (g) f1 P (h) C P(f ). Suppose a E P (f ). Then f (x + a) _
f (x) for all x, giving g (x + a) + ih (x + a) = g (x) + ih (x) for all x. Equating real and imaginary parts we see that a E P (g) and a E P (h). (b) If f is periodic, then P (f) contains a nonzero member. Thus P (g) and P (h) both contain this nonzero number showing that g and h are periodic. To see that the converse is false, consider f (x) = sin 27rx + i sin (27rvf2-) x. Then P (g) = In I n E Z} and P (h) = { 7 I n E Z}. Clearly, P (g) n P (h) = {0}; hence P (f) = {0} and f is not periodic. (c) If f is continuous, periodic, and nonconstant, then f has a smallest positive period.
We have seen that f is continuous if g and h are (see Chapter Zero, Exercises 5, problem 3(b)). If f does not have a smallest positive period, then P (f) contains arbitrarily small positive numbers. But because
P (f) = P (g) n P (h) C P (g) we see that P (g), and similarly P (h), contains arbitrarily small positive numbers. It follows from Lemma 2 of
Chapter One, Section 2 that both g and h are constant, and this is a contradiction.
1.2.7 Let f : R on R.
C be continuous and periodic. Then f is uniformly continuous
Let e > 0 be given. We must show that we can find a 6 > 0 such
that Ix - yj < 6 implies 11(x) - f (y)I < e. Let a E P (f ), a > 0.
196
Appendix C
Solutions to the Starred Problems
Then by Theorem 3 of Chapter Zero, Section 5 we know that f is uniformly continuous on [0, a]. Thus, for the given e, we can find 5 so that If (x) - f (y) I < e whenever x, y are in [0, a] and Ix - yI < 6. Clearly, we may suppose that 6 < a.
Now let x, y E R be such that Ix - yI < 6. We may suppose x < y. Consider{[na, (n + 1) a] I n E Z}. There are two cases:
(i) na<x
(ii) x<no
(a) If aea=+be-" =0 for all x, then a = b = 0. Setting x = 0 gives the equation a+b = 0. Differentiating and setting x = 0 in the result gives A(a - b) = 0. Solving these equations we see that a = b = 0. (b) Set f (x) = clear + c2e-at where c1 and c2 are constants not both zero. Suppose f (x + a) = f (x) for all x. Then 0 = cl (eaa - 1) ea=+ c2 (e_t - 1) e'ax for all x. It follows from part (a) that
c1 (e"" - 1) = 0 = c2 (e-ax - 1). If c1 3& 0, then ea)' = 1, giving
us aA = 0. Because A > 0, a = 0. If c200,then e-«a= 1, and again we get a = 0. Thus P (f) _ {0} and the function f is not periodic.
1.3.2 Equation (7) of Chapter One, Section 3 is r2R" (r)+rR' (r)-n2R (r) = 0. Set r = et. Then de = & or, demoting the derivative with respect
to t by R, R=R'et. Now R = e`
d
,-,R'- di] + R' A et = e2t R,
+ et R'. Thus R - R = e2t R"
We now have r2R" = r2 `Ie.-2t (R - R)] = R-R and rR' = et (e-ti) _ R. Using these equations (7) becomes R - n2R = 0. The general solution to this last equation is R = c1e"t +c2e-"t and because r = et we see that
R = c1 r" + c2r-". 1.3.3 We want to solve r2R" (r) + rR' (r) = 0. Again we use r = et. We saw in the last problem (1.3.2) that e2tR" = R - R and etR' = R. Using these, our equation becomes R = 0. Hence R = c and R = ct + d where c and d are constants. Because r = et, R(r) = c In r + d.
Chapter One
197
1.4.1 We have a sequence { un (r, 0) ' , of functions that are continuous on GUT and harmonic in G. If {un (1, 0)} is uniformly convergent over T, then {un (r, 0)} is uniformly convergent over GUT.
Given e > 0 we must show that we can find an integer N such that lun (r, 8) - un, (r, 0)1 < e for all (r, 8) E GUT, whenever m, n > N (Lemma I of Chapter One, Section 4). Now from our hypothesis we can
find, given e > 0, N so that
I
(1,0) - un,(1,0I < c for all 0 E T
whenever m, n > N. Thus -e < un (1, 0) - u,n (1, 0) < e for all 0 E T because our functions are real-valued. Because they are also continuous on T, we have
-E <
0 T{un (1,0) - Urn(1,0)} < Max E
E
{un (1, 0) - un, (1, 0) } < E
whenever m, n > N. But by Theorem 1 of Chapter One, Section 1 we have
-E < 0 E
<
1.4.2
T
{un (r, 0) - un, (r, B)} {un (1, 0) - um (1, 0)} <- (r.0) E1G U T
(r, 8) E G U T max
GET
tun (r, 0) - un, (r, 8) }
{u, (1,0)-urn(1,0)} <e
(a) For n E N, n fixed, Un (r, 0) = r" (an cos n0 + bn sin nO) is harmonic in G, i.e., inside the unit circle. Bun
8r
= nr n
(n - 1) 2Un = n
r"-2 (an cos n0 + bn sin n0) 2
aun
80 =
r" (-nan sin n8 + nbn cos n0), 2u
= r" (-n2an cosn0 - n2bn sin n0) a2un + 1 Bun + 1 02U" = n (n - 1) Tn-1 (an Cosn0 8r2 r Or r2 a2 +bn sin nO) + nr"-2 (an cos n0 + b sin n0)
- n2r"-2 (an cos n0 - bn sin 0)
[n(n-1)+n-n 2]r"-2[ancosn0-bnsin0J
=0 forall (r,0)EG El 1 n2 pn (b) For any fixed p, 0 < p < 1, the series En°_o pn E' n= , npn. n=
are convergent.
198
Appendix C
Solutions to the Starred Problems
Because these are series with positive terms, we can apply the ratio test. Clearly, Pn+t
P = lim pn = lim
(n + 1)pn+i = lim (71 + 1)2 Pn+1 npn n2pn
Because we are given that p < 1, all three series converge.
(c) Suppose that for some M, Ian1 < M and IbnI < M for all n. Then u (r, 0) = En 1 rn (an cos n9 + bn sin n.9) is harmonic in G, i.e., for
0<0<2aandr<1.
Irn (an cos n9 + bn sin n9)1 < rn2M for every n. Because E rn is convergent, the M-test (Corollary 1 to Lemma 1, Chapter One, Section 4) shows that the series given converges uniformly in G. Similarly, each of the series
02
rn (an cos n0 + bn sin n9) =
nrn-1 (an cos n0 + bn sin n9)
r" (an cos n0 + bn sin n,9) =
n (n -1) r
_ 2 (an
cos n9
5T2
+bn sin n0)
F0 rn (an cos n0 + bn sin n0) = 82
a02 rn (an cos n9 + bn sin n9) =
nrn (-a sin n9 + bn cos n9) n2rn (-an cos n9 - bn sin n9)
is uniformly convergent. Writing un (r, 0) for rn (an cos n0 + bn sin n0)
we may use Lemma 4 of Chapter One. Section 4 to obtain 02u
1 au
1 E12u
are + r Or + r2 002
ra2un
1 aun
1 a2un
are + r ar + r2 002
which is zero by (a). 1.5.2 If f E Cr (T), then there is an M such that Ian (f) I < A'! and I bn (f ) 1 < M for every n.
Because f is continuous on T there is a number u such that If (0)1 <
p for all 0 E T. Thus because an (f) = -1 f, f (0) cos nOd0 we have Ian (f )1 < n f If (0)1 d0 < E (2;r) = 2p for n = 0, 1, 2,... and because bn (f) = T f °`T f (0) sin n0d0 we have I bn (f) 1 <- 2p, n =1, 2, ....
199
Chapter One
1.6.2 Suppose that the sequence {an}n 1 is convergent to b. We claim that the , n = 1, 2, ... also converge to b. numbers Qn = n jQ-bj-la1+a2+...+an-nbl< Ia1-bI+...+Ian-bI
n
n
Now given c > 0 we may choose N so that Ian - bI < 2 when n > N. Then
at -bI+...+IaN_1 -bI + IaN -bI+...+Ian -bI
to - bI <
n
n
(n-N+1)E
< Ia1 -bI +...+IaN_1 -bI +
n
n
2
We note that "- N+1 <- 1 and that by choosing n > N sufficiently large n we can make the first term less than 1; because the numerator of this term is fixed. 1.6.3
0
K. (X)
1
E
m
eikx + [ eikx + ... +
u
i(o)r
=C1- 0
fe'(°)x+(1-
m
+ (1
k=-((
e(m-1)r
tx
I)er+rl- 1e-'x+ mJ
m
- mm- 1
mI
> eikr
-(m-1)
-t
0
m-I
...
e'(m-1)r
/ IkI M
m//
e'k:
m-1
k=-(m-1) \
IkI
e-1kx
mJ
(i_tI)ek r Rkl<m
as was to be shown.
1.7.1 For any f E C,. (T) there is a sequence of trigonometric polynomials that converges to f uniformly over T. Given any n E N we can find (Corollary 2 to Theorem 1, Chapter One. Section 7) a trigonometric polynomial r,, such that If (0) - rn (0) 1 < for all 0 E T. The sequence (T,,} chosen in this way converges to f uniformly over T; i.e. given E > 0 we choose N E N such that < E and note that If (0) - rn (0)I < n < e for all 0 E T whenever n > N.
200
Appendix C
1.7.2
(a) Let r (0) =
Solutions to the Starred Problems
cjei^'e. Then f, r (0) a-enedO = 0 unless n = n2 for some j. When n = nJ the integral is 2ir. Hence the Euler formulas (Chapter One, Section 5) show that the complex Fourier series of r (0) is r (0).
(b) Given r (gyp) = L.+`.=1
and, as .1 cj in part (a), the 1 uler formulas sI ow that this last sum is the real Fourier series of r (
1.7.3
1
_
1
1-rcosw+isinw)
1-rei'4
1 (1-rcosw)+irsinw 1-rcosw-irsinw (1 - rcosw) + irsinw _ 1-rcosw+irsinw
_
_
_ Now - 1 + 2Re
1
1 - re1°'
(1 - rcosw)2 + (rsinw)2 1 -rcosw + it sin w 1 - 2rcosw + r2 (cos2w + sinew) 1 - rcosw rsinw
1-2rcosw+r2 +i 1 -2rcosw+r2 2(1-rcosw) 1 + I - 2rcosw + r2
-1+2rcosw-r2+2-2rcosw 1-2rcosw+r2 1 - r2 1 -- 2rcosw + r2
1.7.6 If f E Cr (7) and c,, (f) = 0 for all n, then the partial sums, and hence the Cer'ro means, of the Fourier series of f are all identically zero. Because these means converge to f uniformly over T, f must be identically zero.
Chapter Two 2.3.1
lim
x
x-02sinlix
x-Osin(2) lim
lirn
z
1
x--+ 0 Zcos(Z) lim O f (O + h) - f (0 + 0) f+ (0) = h h
where f (0 + 0) =
lim f (0 + h) h 0+
-1
Chapter Three
201
and in this case f (0 + 0) = 1. Thus h 2
lim
f+ (0) - h-+0+ lim h
_
sin (2)
-1
h
=
lim
i - sin (Z )
h-+0+ hsin(Z)
2 - 12 cos (2 )
0+ z Cas 2) + sin (h) sin (2) lim
_
a h-.0+ a sin(g)+1Cos(z)+2cos(2)
-0
Similarly, f_ (0) = 0. 2.3.4 (b) We are given f uniformly continuous on (a, b).
i. Let {xn} be a Cauchy sequence in (a, b). We claim that if is a Cauchy sequence. Let a > 0 be given. Because f is uniformly
continuous, we can choose 8 > 0 so that Ix - yI < 6 implies If (x) - f (y)I < e. Because {xn} is a Cauchy sequence, we can
find N such that I xn - xn, I < 8 when m, n _> N. But then If (xn) - f (x,n)I < a when m, n > N, and we are done. ii. Now let {x,,} C (a, b) and suppose nHM x = a. Because (f (xn)} is a Cauchy sequence it converges to, say a. We want to show that h""o f (a + h) = a; i.e., given e > 0 we must show that there is a b > 0 such that if (a + h) - al < e when 0 < h < b. First use the uniform continuity off to find b so that If (x) - f (y)I < 2 when Ix - yi < b. Now If (a + h) - al < If (a + h) - f (xn)I+I f (xn) - al- Choose
N such that a < XN < a + h and If (xN) - aI < K. Note that when O < h < b we have I (a + h) - xN I < b and hence If (a + h) - f (xN)I < z. Thus 0 < h < d gives If (a + h) - al < e and we are done. The proof that h,i f (b - h) exists is similar.
O+
Chapter Three
3.1.1 For anyyinE,x=x-y+y and sollxll
Ix-
IIxII-Ilyll
greater than, or equal to, both II x ll have
IIIxII
- llylll! III - VII'
- II y Il and II y I - II x II , and so we
202
Appendix C
Solutions to the Starred Problems
3.1.2 The vector (1, i) E C2 has the property that 1112 + IiI2 = 2, whereas
12+i2=0. 3.1.3 (a) Let y E AE (xo). Then
II_
o - y < e. Let us take 6 to be less
than the smaller of the numbers
Note
xo - y III E - V yll . that with this choice d + 1 10 - y II < e. Supposell Z - y < 5. II
Thenllz- loll
I
+lly - ioll <6+Ily
xoll <e. Thus
Os (y) C o (ao). (b) Suppose y V j (io). Then ll y -loll > E. Let us take 6 < loll -E and consider Os (;).If z E 06 (),then I I y =-xo l
II (x
ll
y)
(xo
y) Il and this, by 3.1.1, is
'11(_ Thus 06 (y)nfe( o)=0.
-E)l>e.
(c) Let {OR1a E I} be a collection of open sets and let 0 = 0 Oa. If E 0, then i E 0. for some a E 1. Because 0,, is open there is an e > 0 such that OF (x) C 00. But then OE (x) C 0, showing that 0 is an open set. Now let {Oj}"=1 be a finite family of open sets and let
O=n" Oj.If i E O, then i E Ojfor every j. Ttius because Ojis an open set, we have ej > 0 such that 0E, (x) C Off, and this is true for j = 1, 2...., n. Let e = nun {ej Then 0E (cc) C OE, (i) C Oi
for j = 1, 2, , n. Hence 0E (x) C 0. If 0 = 0, it is an open set by definition, for it is a neighborhood of each of its points.
(d) A set is closed if its complement is open. Let {Ca I a E I} be a family of closed sets and let C = an f C,,. Then E-C = E- «EI C. = OE, (E - C0) (Chapter Zero, Exercises 2, problem 5). Because each
Ca is closed, each E - C,, is open and so by (c), E - C is open. Thus C is closed. Now let
be a finite family of closed sets and let C = UjL 1Cj.
Then E-C = E-UJ=1C, = n;-r (E - CJ) by (Chapter Zero, Exercises 1, problem 4). Because each of the sets E - C, is open, so also is E - C by (c). We conclude that C is a closed set.
Chapter Three
3.2.1 (E,
and
C E given, and
203
converges to ao for all II'II
111o - x l This last quantity (a) By 3.1.1 we have x o l l - l x n 11 l < tends to zero as n - oc, and the first quantity is nonnegative. Thus l
lim
n-.oo llxnll =
11x0
(b) Given113_-xm11 e > 0 we choose N so that Ilxo - xn1I < 2 when n > N. Then
11(xn-xo)
1110-.cm11 <Ewhen m,n>N.
3.2.2 S is a nonempty set, S' the set of adherent points of S, S = S U S'. (a) s0 E S' if there is a sequence of distinct points of S that converges
to x0. If {gn} C S, sn # sn when n # m and nl-moo sn = xo, then given E > 0 there is an N such that s E OE (xo) whenever n > N. Clearly,
(s n I n > NJ is an infinite set and because e > 0
is arbitrary we see that i o E S'. Next
we choose 82 E O() fl S and s 2 0 s 1. Then choose s 3 E 01 (ao) n S such that s3 V { s t, s2}. Continue in this way. The sequence { sn } so obtained consists of distinct points of S, and
clearly this sequence converges to x0.
(b) We shall solve this problem in stages. S is closed if S' C S. If S is a closed set, then, by part (a) and Lemma 1, S' C S. Now suppose S' C S. Let C S and let n1-- s n = y E E. To prove that S is closed we need only show that y E S. There are two cases: If infinitely many of the s n are distinct, then every neighborhood of y contains infinitely many points of S. Hence y E S', but S' C S so y E S in this case. In the second case only a finite number of the s are distinct. But then there is an N such that s,, = s n+lfor all n > N. Because lim s n = V, it follows that y E S. (S')' C S' and so S' is a closed set.
Let y0 E (S')'. By (a) there is a sequence { yn} S S' such that li'° V,, = y0 and the yn are distinct. Given e > 0 we must show noc that O£ (yo) contains infinitely many points of S; this shows that
204
Solutions to the Starred Problems
Appendix C
y o E S'. Choose N such that s E S such that II s
1 I y o - y N II < z . Then the set of all
- yN II < 2 is infinite because VN E S'. But
for any such s, 117 -yoll
S is a closed set. If C is closed, S C C, then S C C. Let yo E (S U S')' and let P-1 C S U S' be a sequence of distinct points that converges to yo. Infinitely many of the Vn are in S, or infinitely
many of the y are in S. In the first case yo E S', and the second case yo E (S')'.
Now let C be a closed set that contains S. We need only show that S' C C; but this is immediate from (a) and Lemma 1. (d) Let H be a linear subspace of E. Then H is a linear subspace. Given y E H and A E k we must show A E H. Clearly, we may assume
A ,E 0. Now y E H = HUH'. If y E H, we are done. If V E H'. we can find {h} C H such that '- h = . Given e > 0
- I < > for all n > N. But then
there is an N such that V-
IIAhn
- Ayll= Alllhn -
I <eforalln>N.Thus limAhn
=Ay
and because every Ah n E H, Ay E H' C if. Now let x , y E H. We must show that .x + y is in H. If a and y are in H there is nothing to do. If :c E H, V E H', we choose { y n } C H
such that lim yn = y. Then lim (x +
(i +
E H for all n, (i + y) E H. Finally, if rr, y E H'
choose { x n} C H
Clearly, (;n +
3.3.1
y and because
y n}
C H such that lim in = x and lim y n=
E H for all n, and lim
(x n +
:c + y.
(a) For any u E V we have (u.0) _ (u.2 0) _ (2 0, u) _ 2(0 , u) = 2(-u,0 ). If (u 0) # 0, then we could divide and ,
obtain 2=1. Note: v+0 = v all v,so 0+0 = 0,so2.0 = 0.
Chapter Three
205
(b)
IIu+ 12+IIu-vII2=("+t1,
+V)+(u-V, u-v)
= IIuIl2_+IIVIl2_+IIuII2+IIvII2 2(Ilzd112+IIvll2) (c)
-IIq2
+IIv1I2
because (u,v) =(v,u) =0. (d)
Now
IIu+x11-Ilv-vi1=2(u,v) +2(u,u)
IIu+i;112 =
-i(v,v>] =
as we see from (b).
+iv,+iv> _ (u,u>-i(u,v}+i [(v,u> Ilull2+i(v,u)+IIvI12 soillu+ivll2-=Ilull2++iIIv112
and Also IIu-
i1I2=(u-iv,u-iv+ -IIu_II2+i(u,v) -i(v,u)
Thus illy+ivII
-iv112 +IIv112801 lu
-ills-iv112=2(u,v)-2(v,u)
206
Solutions to the Starred Problems
Appendix C
and so
IIu+vI12- llu
- vIll +Lllu+ivll2-illu -ivll2 = 4(u,v 4
4
3.3.3 { u } and { v " } are sequences in V that converge to u o and v o. (a) For any w E/V, lim (i n,i) _ (uo, vr>. We write
) - \uo'u'/I =
I(u"
C.S.B. inequality (Theorem 2). Because w is fixed and lim II un
uo'u')I :5 IIu"
- uoll
Ilu'll by the
- uOII = 0, we are done. (b) lim (un, Vn) = (vo, v0) . Here we write vn) - (uo, vo)I = vn) - (unf'00)+(un,;0) - (u0';0)1 = I(un,
l(un,
I(urt vn -
+
(un
v0) + (un - u0i -v0)I
-zp. v0)I < Ilunll Ilvn -
I(un' vn - v0)I v011
+ IIu"
- uoII P011
Now II un - u011 -+ 0 as n - oo so the last term tends to zero. Because { u n } is convergent the set { II u -11
I n = 1, 2,... } is bounded; i.e., there
is an M such that Il v n 11 < M all n; to see this just note that 11-u un
-uo+uoll <
n ll
I1un-uoII+IIu011 and I1un-u0I1 , 0. Because
l v n - v o ll ~ 0 we see that the first term in our inequality tends to zero. 3.4.1
(a)
S is an orthonormal subset of V. We claim S is maximal if (u, v) =
0 all v E S implies u= 0. First suppose that S is maximal and let ul E V, (u, v) = 0 for
all V E S. If u
0 , then S # S U
and this last set is
orthononnal. This contradicts the maximality of S. Now suppose that the orthonormal set S has the property stated. If S C S', where S' is orthonormal, we must show that S=S'. Suppose
u E S' - S. Then 11u = 1 because u E S', but we also must I
have (u, V) = 0 for all v E S. This implies u = 0, which is a contradiction.
Chapter Four
207
(b) Any z = {zj }'1 such that (z , e ,) = 0 for all n must be the zero vector because (z , e n) = zn for each n. Thus, by (a) the set { e n } is maximal.
(c) Let f E C (T) and suppose that (f, e'2' .> = 2x f R f /t) a-intdt = 0 for every n. Then all the Fourier coefficients Jof f are zero. It follows that the partials sums of the Fourier series of f , and hence the CesAro means of this series, are zero. Because these means converge
uniformly to f (Theorem I of Section 7, Chapter 1) we see that f must be identically zero.
3.5.1 The linear span of the set {e'nt } in L' coincides with the set of trigonometric polynomials (see Chapter One. Section 7 just above Corollary 2). We want to show that this set is dense in L' [-7r, 7r] for II'II I; i.e., we want to find, for any given f E L' [-Tr, a], a sequence{an} that converges to f for the L'-norm. Now C (T) is dense in L' [-tr, a] and so, for the given f, we can find {gn } C C (T) such that (g,,) converges to f for the L'-norm. For each fixed n we can find an such that _ 'nax * Ign (t) - on (t)I < *n (Chapter One, Section 7, Corollary 2). Clearly, 11 g,, - anlll < for each and n. But then 11f - 0'.111 <- III - 9n lI 1 + I I9n - 0'.111 <_ 11f - g-11 I + we see that {an} converges to f for the L'-norm.
<
3.5.2 For notational convenience let us set v n = "2' , for each n. By Theorem 5, { u n }
is an orthonormal basis for L2 (T). Consider the sequence
R I2 is convergent, this On v n °O}k-o C L2 (T). Because F 00 Nn sequence is Cauchy (see the proof of Corollary 2 to Theorem 1 in Section 4). Thus 00 On v n converges to an element f E L2 (T). But then, by k
I
the discussion at the very beginning of Section 4, we must have
f,
Zn
= Qn for every n.
Chapter Four 4.1.2 (a) (a * b) (n) = Em a (n - m) b (m) = E. a (k) b (n - k) = (b * a) (n),
where we have set k = n - m (remember n is fixed). (b) A(a * b) = A {(a * b) (n)} = A {Em a (n - m) b(m)} = {F_ Aa (n - m) b(m)} = {E a(n - m) Ab (m), and the first of these is Aa * b, whereas the second is a * Ab.
(c) [a * (b * c)] (n) = Em a (n - m) (b * c) (m) _ Em a (n - m) (Ek b (m - k) c(k)) = Ek c (k) (Em a (n - m) b (m - k)) if we let I = m - k this becomes
208
Appendix C
Solutions to the Starred Problems
Eke(k)(El a In -(I+ k)jb(l)) =
Eke(k)(E a[(n-k)-1]b(1))=Ek(a*b)(n-k)e(k)= [(a * b) * c] (n).
(d) [a * (b + c)] (n)
a (n - m) (b + c) (m) _
E,a(n-m) {b (m)+c(m)}=Em a(n-in)b(in)+ E,n a (n - m) c (m) = (a * b) (n) + (a * c) (n)
(e) Supposea*c=c*a=eanda*b=b*a=e. Then b=b*c= b*(a*c) = (b*a)*c=(e)*c=c.
4.1.3
llbn - boll,. Because = IIa * (bn - bo)II1 <_ ,t", Ilbn - boll, = 0. we see that "ma a * bn = a * bo for every fixed IIa * b - a * boII1
IIalll
aE11(Z). If we define Ta (b) = (a * b) for all b E 11, where aE 11 is fixed, then Ta is linear by 4.1.2 parts (b) and (d). This map is also continuous because IITa (b)III = Ila * bII1 S IIall1 IIbII I for all b E P.
4.1.5
(a) g(1)=1, g(n)=0alln76 and this is zero unless in = I and n - in. = 1. Thus (g * g) (n) = 1
when n = 2 and is zero when n 0 2. Suppose gk (n) = 1 when it = k and is zero when n 96 k. Then gk+1 (n) = g * g' (n) = Em g (n - m) gk (m) is zero unless in = k and it - in = 1. Thus
gk/1(k+1)=1,and gk+1(n)=0if n#k+1.
(b) g-1 (n) = 1 when n = -1, g-1 (n) = 0 when n 36 -1. Then g * g-1 (n) = E. g (n - m) g-' (in), and this is zero unless in = -1 and it - in = 1. So g * g"' (0) = 1, and g * g-' (n) = 0 for it 54 0; i.e., g * g-' = c. Now g_' * g (n) _ E g-' (n - m) g (m.) is zero unless in =land s-in =-1.So g-1*g=e and we have shown that 9- is the inverse of g. (g-')2 (it)
(c)
= (g-1 *g-1) (n) = Eg-1 (n - in) g-' (m). This is zero unless in = -1 and n - in = -1. So (g-1)2 (n) = 1 when it = -2 and is zero when n $ -2. As in (a) we see that. (g-1)k (n) = I when
n = -k and is zero when n # -k.
(d) a E l' so a = {a (n)}. Also
O. a (n) gn = ... + a (-1) g-' +
a (0) g° + a (1) g1 + ... = {a (n)}',,, and this is just a.
4.2.1 We have aE11(z)fixed, I(a)={a*bIbEl'}. (a) Because a * e = a, a E 1(a). Clearly, a * b + a * c = a* (b + c), and X (a * b) = a*(Aa) by 4.1.2 (b) and (d). Thus I (a) is a linear subspace of V. To show that I (a) is an ideal, we must show that when c E I (a)
anddEl', d*ce I(a). Now CE 1(a) andsoc= a*bforsome b E 11. Also d * c = d * (a * b) = (d * a) * b = (a * d) * b = a * (d * b)
by 4.1.2 (c) and (a).
Chapter Four
209
(b) We want to show I (a) is a proper if a is not invertible. First suppose
that a is invertible. Then there is an a-' E 1' such that a * a-' = a-1 * a = e. Hence e E I (a). But then b = b * e E I (a) for every b E It showing that I (a) is not proper. Thus when I (a) is proper, a cannot be invertible. Now suppose that a is noninvertible. Then e ¢ 1(a) for, otherwise. e = a * b = b * a for some b E 11, which would
mean a is invertible. Thus I (a) must be proper. (c) As discussed just before Lemma 1 we need only show that the closure of an ideal is an ideal. Let I be an ideal in 1' and let I be the closure
of I. Then 1, and hence I (by 3.2.2 (f)), is a linear subspace of 1'.
Let aE1andletbE11- We must show that b * a is in I. If a E 1.
i'
this is immediate. If a E I\I, then there is a sequence (a,) C I such that a = a for the 11 norm. But then b * a E I for all n. and un b * an = b * a for the 1'-norm (4.1.3). Thus b * a E I and we are done.
4.3.4 Suppose that for some e r= L' (R) we have e * f = f for every f E L' (R). Then e * e = e. Hence a (x) . e (x) = e (x) ore (x) [e (x) - 11 = O for all x E R. Now a (x) is a continuous function (Lemma 1), so either a (x) = o for all x or a (x) = 1 for all x. The first of these would mean that e is the zero element of L1 (R) and in that case e * f = 0 # f. The second possibility can be ruled out because Ixl- (x) = 0 by Lemma 2. 4.4.1 We refer to the statement of Lemma 2. Here M is a multiplicative group.
Because M y6 0 there is an x E M. But then x- x-1 = 1 E M and we have proved (i). To prove (ii) we note that for any x E M, 1 - x-' is in M.
When x, y are in M, x and y' are in M by (ii). Hence x (y-') -' = x y is in M. It is now clear that x" E M for every n E Z. 4.4.2 f : M -+ A where M is a multiplicative group and A is an additive group.
Furthermore f (x y) = f (x) + f (y) all x, y in M.
(a) For any x E M, f (x) = f (x 1) = f (x) + f (1) showing that f (1) = 0 and hence 0 is in the range of f. Also f (1) = f (x x f (x) +f (x-1)' so f (x-') = -f (x) Now let a, b be in the range of f . We must show that a - b is in the
range of f. There are x, y in M such that f (x) = a and f (y) = b.
Hence f (x y= f (x) + f (y') = a + (-b) and we are done. (b) {x E M I f (x) = 0} 0 0 because it contains 1 (see (a)). If x, y are in
this set, then f (x y-1) = f (x) + f (y 1) = f (x) + [- f (y)]. But both f (x) and f (y) are zero, so x Y-' is in our set. (c) U4 = {fl,±i}. Note that -i and (-i)-1 = _`_ _ i.
210
Appendix C a Solutions to the Starred Problems (d) f : Z -+ U4, f (n) = in all n. Then f (m + n) = im+n = im . in = f (m) f (n). For positive n, in = i4q+r = (i4) it = Y'' because i4 = 1. Here 0 < r < 4 so in is i0, i1, i2 or i3. All of these, io = 1, i, i2 = -1,
and i3 = -i, are in U4. So the range is U.1, because i-n = -l, , and this is in U4 for each positive n. When n = 0, f (n) = I E U4.
{nEZIf(n)=1} = {nEZIin=1} = {nEZIi4q+r=1} _ In I n = 4q + r, and r = 0} = all integer multiples of four.
(e) g : Z U4, g (n) = (-1)n. Here the range of g is {+1, -1} and {nEZIg(n)=1}= all even integers. 4.4.3 Here it is a fixed positive integer and U4 = {z E C Iz" = 1}.
= 1. Thus (a) If x, y E U,,, then xn = 1 and yn = 1. Also (y-1)' = y_ 1)n (x = x" (y-1)n = 1 and hence x y-1 is in Un. If Z" = 1, then 1 = JZnj = IZIn by 0.4.4 (a) . Thus IZI = 1 because the only positive real number whose nth power is 1, is 1.
(h) If p (x) = 0 has a root r of multiplicity k > 1, then p (x) _ (x - r)' q (x). But then p' (x) = (x - r)k q' (x) + k (x - r)k-1 q (x) by the product rule. Because k > 1, k - 1 > 0 and we see that
p'(r)=0.
Now set p(x) = x" - 1. Because p' (x) = nxn-1 its only solutions are x = 0 (with multiplicity n - 1). Because zero is clearly not a root of p (x), p (x) and p' (x) have no common roots. Thus all roots of p (x) are simple. 4.4.5
(a) Clearly, {in I n E Z} = {i° i1,i2,i3} = U4, so i is a generator. So also is -i. (b) For n fixed. e' = cos 2' = n + i sin L' n is clearly in Un; for (e'?")" n
k c2i' = cos 27r + i sin 27r = 1 . But the n numbers (e' za l , 0 < k < n
are in U,,, and they are distinct. So we have a generator of Un. (c) The number 1 is a generator of the additive group Z.
4.5.1 M is a multiplicative group; N is a subgroup of M.
(a) For s, t E M, s - t modN is to mean
st-1
E N. Because N is a multiplicative group, 1 E N (Section 4, Lemma 2). Thus for any
s E M, s- s-1 E N and so s- s modN. Suppose s =- t modN. Then s t-' E N. But then (s t-1)-1 E N (Section 4, Lemma 2) and so t s' 1 E N. It follows that t =- s modN. Finally, let s - t 1nodN, t u modN. Then st-1 and to-1 are in N. But then (st-1) (tu-1) = su-1 is in N, and we see that s - u modN (again we used Lemma 2 of Section 4).
211
Chapter Four
(b) Let s' E [s], and t' E [t]. Then Ws-' E N and
t't-'
E N. Hence
(SIR- 1) (et-1) = (s't') (st)-' E N showing s't' __ st modN.
(c) ForanysEM, (d) For any s E Al., s-1 E Al (Lemma 2, Section 4 of Chapter 4). Hence
[g] * [sl = [s-1] * [s] = [ss-'] =
[s-14]
_ [1] (e) For z1, z2 E C\101, z1 = z2 u1odN means z1 z2 E N. which means z E N. Now z1 = r1er0jz2 = r2e10a, and N = T (the unit circle). So z = L e'(°t -02), and this will be in T if 1. So our equivalence classes are concentric circles centered at tie origin. 4.5.4 C is a subgroup of the additive group A. yo : A is defined by setting V (a) = [a] all a E A. Then V (a + b) = [a + b] _ [a] + [b] = y; (a) +.p (b)
for all a, bE A. Also {aEAIc(a)=[0]}
{aEAIaEC}=C. 4.6.1
(a) Let r =
(aEA[aaOmodC} _
o iw`and so wr = -_o wJ" 1. Now we may list the
summands as follows: r = w0 + w1 + w2 + w3 + ... + w1-2 +
wt-1 and
...+wt-2+wl-1+wt.
wr=wl+W2+w3+
Observe that they differ only in that w° is in the sum defining r and w1 is in the sum-defining wr. But w° = 1 = wl so they are equal. (b) w" - 1 = (w - 1) (wi-1 + w"-2 + ... + w + 1). Now if w" = 1, then
at least one of the factors on the right must be zero. But w 34 1; hence the second factor must be zero.
4.6.2 We have w E U1. (a) w-1 (b) w w = jw12 = 1 and so _ (c) the order of each w E U, is a division of I (Definition 5, Lemma 4). When 1 is a prime every member of Ut, except +1, has order 1. Thus every such member of U,. is primitive 1t1iroot of unity. 4.8.2 If a, b are in Z, and a b = 0, then if a has an inverse (a)-1 we would have (a)-' (a b) = (a)-1 0 = 0. But (a)-' (a b) = [(o.)-' a.] b = 1 b giving us b = U. Similarly, b cannot have inverse.
4.8.3 Suppose ka = kb modn and (k, n) = 1. Then k a = k b in Z" because (k, n) = 1, k has an inverse (k)
(k)-1
(k a) =
(k)-1
-1
E Z,a (see Theorem 3). But then
(k b), giving us a = b or, equivalently, a
b mod n.
4.9.2 We have n = p q where p is the smallest prime factor of it. Here q is a prime or, if not, q has a prime factor q1. Clearly, p < q1 because q1 is a prime factor of n. But then p2 < pQ1 < pq = it.
212
Appendix C
Solutions to the Starred Problems
4.9.5 If to is composite, then n = pq where 1 < p < n, 1 < q < n. Thus 2" - 1 = 2P9 - 1 = (2P)Q -- 1 = (2P - 1) {(2P)Q"1 + (2P) q-2 +... + 1]. Because 2P - 1 > 0 and the second factor is also positive, 2" - 1 is composite.
Index Basis, orthonormal, 101. 006 108,
A
Abel kernel, 126 Abelian group; see Group, abelian Abel's test, 47(5e) Absolute value, 10(3), 13, 14(4) Abstract algebra, 165 Abundant number, 156 Additive group; see Group, addititivc
Additive identity, 7 3 179 Additive inverse, 7 13 Adherent point, 89(2) Analysis, 1.5=21
Analytic geometry, 14 Angle, 1 14 94, 22 Approximation theorem (Weierstrass), 55., fi-5 66
Argand, R. U-21 Associative law, 166. 179 Axiom of choice, 5 6 11(4). 121. L63 Axiom of continuity, 9 Axiomatic set theory, Axioms for a field, 125 Axioms for a group, 165 Axioms for an algebra, 173 Axioms for a ring, L72 Axioms for a vector space, 179 Automorphism, 172(2) B Ball, closed, 83_
Ball, open, 81 84(3b) Ball, radius epsilon, 8.3 Ball, unit, 84
Banach space, 82 L04
109
Belongs to, 1
Bernoulli, 23 31.55 Bernstein polynomials, 52(9) Bessel, 11 21 Bessel's inequality (strong form), 92 (weak form), 92 Binary operation, 165 Blahut, E., 122 Boas, R.P., 21 Bombelli, 12.21 Bose and Kim, 164 Bound, greatest lower, 10(2d) Bound, least upper, & 9, 10(2c) Bound, lower, & 1 Bound, upper, 8
Boundary data, 25 51 52 54 Bounded function, 88 Bounded linear map, 86 Bounded sequence, 21(7d), 42(2). 46(4. Q Bounded set, 15, 21(7d) Bruce, J.W., 163 Brunelleschi, U Bunyakowski, (C.S.B. inequality), 92 Butzer, P.L. and Stark, E.L., 55 C Cantor set, 11(7)
Cardano, G., 12.21 Cartesian product, 3 4, 13 Cauchy, A.L., IS
214
Index
Cauchy (C. S.B. inequality), 92 Cauchy-Riemann equations, 18 20(4b) Cauchy sequence. 200. 6a), 21(7), 88 181 Cesilro, E., 43
Cesaro convergent, 43 46(l. 2 4) Cesdro mean, 43 44, 42 Cesilro summable, 43 Character of the additive group of integers, (Z. +). 137 Character of the additive of integers modulo n. (Z., +), 137 Character of the additive group of real numbers, (R, +), 143(7) Characteristic of a field, 125 Circle, 24 Circle, unit (T), 25, 27 Classical harmonic analysis, 23=55 Closed ball, 83 84(3b)
Closed set, 15 83 84(3). 85 90(4b) Closed under a binary operation, 165 Closure of an ideal, 121 Closure of a set, 89(2) Closure of a subspace, 89(2d) Codomain, 5 Collection, 1
Commutative group; see Group. Abelian Commutative law, 179 Commutative ring, L72 Compact set, lb Compact support, 107(3b), 105 Comparable elements, 6(3b) Complement, orthogonal, 109, 111, 112(1)
Complement, set-theoretic, 30). 15 Complete, inner product space, 97 1
Complete normed space, 82 Complete orthonormal set; see Maximal orthonormal set Completeness property of R. 2 Completion, 144, 105. Appendix B, 181-186
Complex conjugate, 14(l) Complex Fourier coefficients, 41.51 Complex Fourier series, 41.51 Complex numbers (C), I t-14 Composite number, 153, 155 Composition of functions, 6(2) C°(complex n-space), 79. C°(complex n-space ), norm of, 82 iii Congruence modulo a subgroup, 11. 167
Contained in, 2 Continuous function, 16 17 199,
203b, 5126 27 28(3), 34 Continuous linear map, 86 Convergent sequence, 12.20(1). 21(7a. c), 85 Convergent series, 33 3 Al Convolution, 114 118(2-5). 126, 128(3). 135
C,(T), Continuous (real-valued) functions on the unit circle, 399, 4l.
42, 47.49,50.51.52 C(T), continuous functions on the unit circle, 25-28.51(remark 11, 8(). 89
CO(R), continuous functions on R with compact support, 141 (3b) Cubic equation, 145 Cyclic group; see Group, cyclic D D'alembert, 23,
55
Danish Academy of Sciences, 13 Dantzig, T., 21 Debnath, L. and Mikusinski, P., L12 Deficient number, 156
del Ferro, 2 21 De Morgan, 3(421 Dense set, 104, 14S Derivative, 6 17, 35, 36, 37, 42(3) Derivative, one-sided, 68 69 73 24 ,
DeVito. C.L., 112
Dickson, L.E., 164
Index
215
Differentiable, 16 17 20(2, 3c), 35
Even function, 60, 61(2. 31 62.64(2)
42(3), 126 Differential equation, ordinary, 29 30 31, 31. L27
F Factor, see Divisor
Differential equation, partial, 8 25.
Family of sets, 5 6 7 5 L6
30, 32 Dilation, 1D8 Direction, 90 Dirichlet kernel, 7-0
Fcjdr, L., 47 FejEr's kernel, 46(3). 74
Dirichlet problem (DP). 24, 25 29
Fermiparadox, 47 55
47 5 Disjoint sets, 2 Disjoint sets, pairwise, S_
Distance, 14 Divergent, 12
Divisor (factor) of a natural number, 154 Divisor of an integer, 15.0 Divisor of zero, 1.53 Domain of a function, 5. Domain of a relation, 4 E
Edwards, H.M.. 1.63 Equal sets, 2 Equivalence class, 6(1), 103. 132, 134(l), 167 Equivalence relation, 4 61l, -31132,
1340). 167 Equation, Laplace's, 18 20(4a), 24 Equation, ordinary differential; see Differential equation, ordinary Equation, partial differential; see Differential equation, partial Equation polynomial, 12. 143 Equations, Cauchy-Riemann; see Cauchy-Riemann equations Euclid, 156
Fej6r's theorem, 47 62 102(lc). L06 Fermat, P., 159 L61
Ferarri, L., L Field, 115 Finite dimensional, 79-82. 84(4) Fontana. N. (Ta'taglia), 12, 21
Fourier, J-BJ., 57 72 Fourier coefficients, 41 42.42(2.3:
51.58.61.68 Fourier cosine series, 62 63 Fourier series, 41, 47.49 51, 58. 61,
70 23 Fourier sine series, 62.63 Fourier transform, 85 Fourier transform on R, 123 Fourier transform on Z, 113 Fourier transform on Zr,, 139 Frederick the great, 163 Fredholm,99 Function, 4, 5, 15(6). 23 57 Function space, 113 Function, symmetric. 144 Functional, linear, 87 Functional, multiplicative. 120 Functions, addition of, 80 113 Functions, multiplication of, 80 115 Functions, multiplication by scalars, 80 LL3 Functions, L.R. L03
Euclidean norm, 8), 82 iii
G
Eulcr, L., 13.21 23 31 55 159(3c,
Galois field, 126 Gauss, K.F., 13
61 16x.1
Euler formulas, 39 40 4L 57 Euler's phi function, L5.4 1. 164 Euler's totient function; see Euler's phi function
Generator of a cyclic group; see
Group, cyclic Geophysicists, 1.001
Gerver, J., 37 54
216
Index
Gibbs phenomenon, 76(2 Goldberg, R.R., 163 Greatest common divisor, 150 151 Greatest lower bound, 10(2d). 27 Group, Abelian, 167 Group, abstract, 165 Group, additive, 128 Group. cyclic, 131(5) Group. multiplicative, 122
Intersection, 2 Invariant, 144, 145
H
J
Haar wavelet, 109 Hardy, GH.. 37 Harmonic analysis. 23 Harmonic function, 24 37(l, 2) Harriot, T., 12, 21 Hausdorff maximal principle, 121 Herstein, J.H., L72 Hewitt, E. and Ross, K.A., 163 Hilbert, D., 22 Hilbert space, 97 L05 Homomorphism, 169 Hungarians, 47
Jeffery, R.L., a 27
Inverse, additive, 7 0- 114 125 Inverse element, 1 66 119 L66 Inverse function, 6(2b). 51 Inverse, multiplicative, & 13 Invertible, 116 120 Isomorphic, 120 Isomorphism, 174
K
Katznelsen, Y., 163 Kelley, J., 163 Kernel, Dirichlet's, 20 Kernel, FejEr's, 44 24 Kernel, Poisson's, 55(1) Kernel (null space), Appendix A, 90(4a). 180 Kernel of a homomorphism, 170 Konopinski, E., 42 Kreyszig, E., 72
I
Ideal, proper, 121, 123(1), 123 Ideal, maximal, 121 Identity element, 166, L22 Identity operator, 81 Imaginary part of a complex number, 1
Independent set; see Linearly independent set Inequality, Bessel's, (strong) 27 (weak) 22 Inequality, Cauchy-SchwarzBunyakowski, 22 Inequality, triangle, l0(3a), 14(4b), 81 Infinite dimensional spaces, 80,84(4) Inner product, 79.91 Integers (Z), 9 Integral equations, 22 Integral, Lebesgue, 41, 103. 124 Integral, Riemann, 17.20(3b), 35 57 Integrable, 52
L
Lagrange, G., 143, L63 Lagrange's identity, 7.1 Lagrange's solution to the cubic, 145
Laplace's equation, l.& 24.30 Laplace's equation in polar co-ordinates, 25 Lebesgue integral; see Integral, Lebesgue Left-shift operater, 81
Limit, 16, 20(3), 4142 Linear combination, Appendix A, 107 Linear functional, 87 Linear mapping, Appendix A, 85 Linear operater, 82 Linear subspace. Appendix A, 89(2d), 90(4a) Linearly independent set, 84(4), 89(3) 11(Z) (1-one of Z), 82 iv 84(4), 11 LL6
Index
217
Lomont, J.S.. L72
0
L.P. function, L031
Odd function, 60.6a -31!k2.
L.P. spaces. 10 IHI4 1i(N) (complex 12 of N), 95(6) li(N) (real 62 of N), 82 ii 82 iii
One-to-one function, 5
84L4L 93. 101 (complex L-2 of Z), 82 l (Z) (real 1-2 of Z), 82
Open interval, 1116) Open neighborhood, 15 Open set, 15, 83, 84(3 ) Operator, icft-shift, 82 Operator, right-shift, 81 Order of an nth root, 138 Orthogonal complement, 109. 111,
L'(R) (L-1 of R), 107(4), L23 LI(T) (L-1 of T), LOA of R), 1070c) L2(R) L2(T) (L-2 of T), 11)5. 106 Lucas-Lehmer test, L52
Onto function, 5 7(4) Open ball, 83.84(3a)
112(1) Orthogonal set, 91
Orthogonal vectors, 91
M
MacLaurin's series, 50 Matrix, 139, 141, 168, 122 Maximal ideal, see Ideal, maximal Maximal Nest, 121 Maximal orthonormal set, 28.141, 102(1)
Maximum, 12.24 Mersenne numbers, I57 L63 Meyer, Y., LIZ
Minimum, 17.24 Modulus, L3 Monomorphism, 1.20 Morphorism, 169 Multiplicity of a root, 131(3b) Multiresolution analysis, 1118. 109 N
Natural numbers (N), 8 83 Neighborhood, Nest, 121 Norm, 81 Norm, Euclidean, 8 Z 82 iii Norm, from an inner product, 93 Norm, sup, 88 Normed space, 81 Normalized vector. 91 Normal subgroup. see Subgroup, normal
Null space (kernel). Appendix A. 180
Orthonormal basis. 101. 1116
Orthonormal set, 91 96(infnite). 98(maximal)
P Pairwise disjoint sets, 5 Paradox. Fermi's, 47 55 Paradox, Russell's, 7(4c) Parallelogram identity, 24(b) Parseval's relation, 99& and d) Partial differential equation: see Differential equation, partial Partial function, 4 5.6(21 Partial order relation, 66(3b) Perfect numbers, LS6 Periodic function. 25.21.28(1-7). 22 Permutation, 145. 142(2. 3J, 168 P(f), set of periods of a function f. 25. 26 Piecewise continuous function, 65.66.
6b, 69.7()(4). 23.24 Poisson's integral, 5,_,15 1)
Poisson's kernel, 5511) Pointwise convergence, 4y 20.73. Polar form of a complex number, 14 Polar identity. 94(Id) Polynomial, 12 5(L 52(7 ). 65. Polynomial, trigonometric, 49. 54.
51(1.2) Preimage.190(4)
218
Index
Prime numbers, 11 1.54 Primitive root of unity, 13
Privalov, LL 24
Sectionally continuous function; see Piecewise continuous function Sequence, 18
Product, Cartesian; see Cartesian
Sequence, bounded, 21(7d). 422
product Product, inner, 79, 11 Pythagorean relation, 94(Ic)
Sequence, Cauchy, 19 220 5, §1 21(71, 88
151
Q Q (rational numbers), 9. 11(5) Quadratic equation, 144 Quantum mechanics, 99. 112 R
Rational numbers (Q), 9 Real numbers (R), 2 Real part of a complex number, 15(6) Relation. 4 Relation, equivalence; see Equivalence relation Relation, partial order, see Partial order relation Relation, total order, see Total order relation Relatively prime integers, 151 Resolvent set, 123(2) Riemann, B., 36 41. 68 Riemann integral; see Integral, Riemann Right-shift operator. 82 Ring, 172 Ring, commutative, L22 R (real n-space). 72 IR"(real n-space) norm of, Roots of an equation, 12 Roots of unity, 131(3), 138 Russell paradox; see Paradox, Russell's Ryan, R.D., 112 S Scalar multiplication, 179 Schroeder, M.R., 164 Schwarz (C.S.B. inequality), 92
Sequence, convergent, 19 49 85 Sequence, divergent, 19.46(1) Sequence, of functions, 32, 33.39 49 Series, 23, 33 Series, convergent, 33 39 Series, Fourier, 41 58, 6.k 2Q Series, of functions, 33, 32 Series, of numbers, 43, 46(5)
Series, power, 370, 1 5 Series, summable, 43
Set, L 2 3 SETI, 42 Signal processing, L08 Silard, L., 42 Simmons, G.F., 163 Singular element, 120 Space, Banach, 88 89 Space, Hilbert, 92 Space, inner product, 91 Space, nonmed, 81 Space, of functions, 84 Space, vector, Appendix A, 179 Span, linear, Appendix A, 1811 Spectrum. 123(2) Subgroup, 129, 186
Subgroup, normal, 11 172(4) Subset, 2 Subspace, Appendix A, 1811 Sums, infinite, 31 Superposition, 1112 (see also linear combination) Support, 107(3b)
Symmetric function; see Function, symmetric
T Ta'taglia (N. Fontana), 12.21 Tauberian theorems, 464)
Index
Teller, E., 42 Total order relation, 663b , 8 Totient function; see Euler's totient function Transform, Fourier; see Fourier transform Translation, LOS Triangle inequality, 10(3a), 14(4b), 81 Trigonometric polynomial; see Polynomial, trigonometric Trigonometric series, 23 U Un, 131(3), 1.311
Uniformly continuous function, 17 20(6) 2BL71 65 69 70(4)
Uniformly convergent sequence, 32 33. 36. 37, 37(1). 39, 47,
510,9 ) Uniformly covergent series, 33. 39
Union, 2 5 Unit circle (T), 22 Unit, ( ring with unit), 112
219
Vector, normalized, 91 Vector, unit, 91 Vector space, Appendix A, 179 Vector space of functions, 80 Vieta, 144, Lob
W Wavelet, 14$ Webb, S., 5S Weierstrass approximation theorem,
55 65 66 Weierstrass, K.. 36 Weierstrass M-test, 33 36 31 Wessel, 13.21 Wiener, N ., 115 , 123 , L63 Witmer, T. R ., 21 Wojtaszczyk , P., 112
Y York, H. 42
V
Z Z (the integers), 2 Zero, proper divisor of, 153
Vandermonde, L63
Zero, vector, 179
Vector, Appendix A, 179
Z (the integers modulo n), 133
Carl L. DeVito
HARMONIC ANALYSIS bglI
Ii
o
o go®
Many branches of mathematics come together in harmonic analysis; each adds richness to the subject and
provides insights into this fascinating field. DeVito's Hamronfc Analysis: A Gentle Introduction presents a clear, comprehensive introduction to Fourier analysis and Harmonic analysis, and provides numerous examples and
models, leaving students with a clear understanding of the theory.
Key Features of Harmonic Analysis: A Gentle Introduction: Material is presented in a unified way, avoiding the "cookbook" character of many texts. For example, in Chapter one an interesting boundary value problem is solved. The discussion then leads naturally to harmonic functions,
periodic functions, and Fourier series. The various ways such series can converge are investigated in three separate chapters. Roots of unity and the fast Fourier transform are discussed. Here the symbiotic interplay between algebra and analysis is emphasized and exploited.
Students are brought into the discussion by providing them with problems keyed to the material being discussed. Such problems are marked with a star. Solutions to starred exercises throughout the text are provided in an appendix.
The structure and format of the text is flexible, allowing the instructor to tailor it to the needs of the course and
the needs of the students. There is a discussion of the number theory that arises in connection with the Fourier transform and a chapter on the relevant structures from abstract algebra.
ISBN
(n-
90000
lanes and Bartlett Publishers 40 Tall Pine Drive Sudbury, MAO 1775
978.443.5000 mloOlbpub cnm
www.jbpub.com
10 0-7637-3893-x
9