Birkhauser Advanced Texts Basler Lehrbucher
Edited by Herbert Amann, Zurich University Ranee Kathryn Brylinski, Penn State University
V.S. Sunder
Functional Analysis: Spectral Theory

Birkhäuser Verlag
Basel · Boston · Berlin

Author: V.S. Sunder, The Institute of Mathematical Sciences, Taramani, Chennai 600113, India

1991 Mathematics Subject Classification 46-01
A CIP catalogue record for this book is available from the Library of Congress, Washington D.C., USA
Deutsche Bibliothek Cataloging-in-Publication Data
Sunder, Viakalathur S.: Functional analysis: spectral theory / V. S. Sunder. - Basel; Boston; Berlin: Birkhäuser, 1998 (Birkhäuser advanced texts)
ISBN 3-7643-5892-0 (Basel ...) ISBN 0-8176-5892-0 (Boston)
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use the permission of the copyright owner must be obtained. © 1997 Hindustan Book Agency (India). Authorized edition by Birkhäuser Verlag, P.O. Box 133, CH-4010 Basel, Switzerland, for exclusive distribution worldwide except India. Printed on acid-free paper produced from chlorine-free pulp. TCF ∞. Printed in Germany. ISBN 3-7643-5892-0, ISBN 0-8176-5892-0
987654321
Contents
Preface

1 Normed spaces
   1.1 Vector Spaces
   1.2 Normed spaces
   1.3 Linear operators
   1.4 The Hahn-Banach theorem
   1.5 Completeness
   1.6 Some topological considerations

2 Hilbert Spaces
   2.1 Inner product spaces
   2.2 Some preliminaries
   2.3 Orthonormal bases
   2.4 The adjoint operator
   2.5 Strong and weak convergence

3 C*-Algebras
   3.1 Banach algebras
   3.2 Gelfand-Naimark theory
   3.3 Commutative C*-algebras
   3.4 Representations of C*-algebras
   3.5 The Hahn-Hellinger theorem

4 Some Operator Theory
   4.1 The spectral theorem
   4.2 Polar decomposition
   4.3 Compact operators
   4.4 Fredholm operators and index

5 Unbounded Operators
   5.1 Closed operators
   5.2 Symmetric and self-adjoint operators
   5.3 Spectral theorem and polar decomposition

Appendix
   A.1 Some linear algebra
   A.2 Transfinite considerations
   A.3 Topological spaces
   A.4 Compactness
   A.5 Measure and integration
   A.6 The Stone-Weierstrass theorem
   A.7 The Riesz representation theorem

Bibliography

Index
Preface
This book grew out of a course of lectures on functional analysis that the author gave during the winter semester of 1996 at the Institute of Mathematical Sciences, Madras. The one difference between the course of lectures and these notes stems from the fact that while the audience of the course consisted of somewhat mature students in the Ph.D. programme, this book is intended for Master's level students in India; in particular, an attempt has been made to fill in some more details, as well as to include a somewhat voluminous Appendix, whose purpose is to (at least temporarily) fill in the possible gaps in the background of the prospective Master's level reader. The goal of this book is to begin with the basics of normed linear spaces, quickly specialise to Hilbert spaces and to get to the spectral theorem for (bounded as well as unbounded) operators on separable Hilbert space. The first couple of chapters are devoted to basic propositions concerning normed vector spaces (including the usual Banach space results - such as the Hahn-Banach theorems, the Open Mapping theorem, Uniform boundedness principle, etc.) and Hilbert spaces (orthonormal bases, orthogonal complements, projections, etc.). The third chapter is probably what may not usually be seen in a first course on functional analysis; this is no doubt influenced by the author's conviction that the only real way to understand the spectral theorem is as a statement concerning representations of commutative C*-algebras.
Thus, this chapter begins with the standard Gelfand theory of commutative Banach algebras, and proceeds to the Gelfand-Naimark theorem on commutative C*-algebras; this is then followed by a discussion of representations of (not necessarily commutative) C*-algebras (including the GNS construction which establishes the correspondence between cyclic representations and states on the C*-algebra, as well as the so-called "noncommutative Gelfand-Naimark theorem" which asserts that C*-algebras admit faithful representations on Hilbert space); the final section of this chapter is devoted to the Hahn-Hellinger classification of separable representations of a commutative C*-algebra (or equivalently, the classification of separable spectral measures on the Borel σ-algebra of a compact Hausdorff space).
The fourth chapter is devoted to more standard "operator theory" in Hilbert space. To start with, the traditional form of the spectral theorem for a normal operator on a separable Hilbert space is obtained as a special case of the theory discussed in Chapter 3; this is followed by a discussion of the polar decomposition of operators; we then discuss compact operators and the spectral decomposition of normal compact operators, as well as the singular value decomposition of general compact operators. The final section of this chapter is devoted to the classical facts concerning Fredholm operators and their "index theory". The fifth and final chapter is a brief introduction to the theory of unbounded operators on Hilbert space; in particular, we establish the spectral and polar decomposition theorems. A fairly serious attempt has been made at making the treatment almost self-contained. There are seven sections in the Appendix which are devoted to: (a) discussing and establishing some basic results in each of the following areas: linear algebra, transfinite considerations (including Zorn's lemma and a "naive" treatment of infinite cardinals), general topology, compact and locally compact Hausdorff spaces, and measure theory; and (b) a proof of the Stone-Weierstrass theorem, and finally, a statement of the Riesz Representation theorem (on measures and continuous functions). Most statements in the appendix are furnished with proofs, the exceptions to this being the sections on measure theory and the Riesz representation theorem. The intended objective of the numerous sections in the Appendix is this: if a reader finds that (s)he does not know some "elementary" fact from, say, linear algebra, which seems to be used in the main body of the book, (s)he can go to the pertinent section in the Appendix, to attempt a temporary stop-gap filler.
The reader who finds the need to pick up a lot of "background material" from some section of the Appendix, should, at the earliest opportunity, try to fill in the area in which a lacuna in one's training has been indicated. In particular, it should be mentioned that the treatment in Chapter 3 relies heavily on various notions from measure theory, and the reader should master these prerequisites before hoping to master the material in this chapter. A few "standard" references have been listed in the brief bibliography; these should enable the reader to fill in the gaps in the appendix. Since the only way to learn mathematics is to do it, the book is punctuated with a lot of exercises; often, the proofs of some propositions in the text are relegated to the exercises; further, the results stated in the exercises are considered to be on the same footing as "properly proved" propositions, in the sense that we have freely used the statements of exercises in the subsequent treatment in the text. Most exercises, whose solutions are not immediate, are furnished with fairly elaborate hints; thus, a student who is willing to sit down with pen and paper and "get her hands dirty" should be able to go through the entire book without too much difficulty. Finally, all the material in this book is very "standard" and no claims for originality are made; on the contrary, we have been heavily influenced by the
treatment of [Sim] and [Yos] in various proofs; thus, for instance, the proofs given here for the Open Mapping Theorem, Urysohn's lemma and Alexander's sub-base theorem are more or less the same as the ones found in [Sim], while the proofs of the Weierstrass as well as the Stone-Weierstrass theorems are almost identical to the ones in [Yos]; furthermore, the treatment in §A.2 has been influenced a little by [Hal], but primarily, the proofs of this section on cardinal numbers are a specialisation of the proofs in [MvN] for corresponding statements in a more general context. Acknowledgements. The author would like to thank the Institute of Mathematical Sciences for the excellent facilities it provided him during his preparation of these notes; he also wishes to thank the students of his class - in particular, M. Rajesh - for having made the process of giving the course an enjoyable experience; he would also like to thank R. Bhatia and M. Krishna for their constant encouragement in this venture. Finally, the author would like to thank Paul Halmos for having taught him most of this material in the first instance, for having been a great teacher, and later, colleague and friend. This book is fondly dedicated to him and to the author's parents (for reasons too numerous to go into here).
Chapter 1 Normed spaces
1.1 Vector Spaces
We begin with the fundamental notion of a vector space. Definition 1.1.1 A vector space is a (non-empty) set V that comes equipped with the following structure:
(a) There exists a mapping V × V → V, denoted (x, y) ↦ x + y, referred to as vector addition, which satisfies the following conditions, for arbitrary x, y, z ∈ V:
(i) (commutative law) x + y = y + x;
(ii) (associative law) (x + y) + z = x + (y + z);
(iii) (additive identity) there exists an element in V, always denoted by 0 (or 0_V, if it is important to draw attention to V), such that x + 0 = x;
(iv) (additive inverses) for each x in V, there exists an element of V, denoted by −x, such that x + (−x) = 0.

(b) There exists a map ℂ × V → V, denoted by (α, x) ↦ αx, referred to as scalar multiplication, which satisfies the following conditions, for arbitrary x, y ∈ V, α, β ∈ ℂ:
(i) (associative law) α(βx) = (αβ)x;
(ii) (distributive law) α(x + y) = αx + αy, and (α + β)x = αx + βx; and
(iii) (unital law) 1x = x.

Remark 1.1.2 Strictly speaking, the preceding axioms define what is usually called a "vector space over the field of complex numbers" (or simply, a complex vector space). The reader should note that the definition makes perfect sense if every occurrence of ℂ is replaced by ℝ; the resulting object would be a vector space over the field of real numbers (or a "real vector space"). More generally, we could replace ℂ by an abstract field 𝕂, and the result would be a vector space over 𝕂, and 𝕂 is called the "underlying field" of the vector space. However, in these notes, we
Chapter 1. Normed spaces
2
shall always confine ourselves to complex vector spaces, although we shall briefly discuss vector spaces over general fields in §A.1. Furthermore, the reader should verify various natural consequences of the axioms, such as: (a) the 0 of a vector space is unique; (b) additive inverses are unique - meaning that if x, y ∈ V and if x + y = 0, then necessarily y = −x; more generally, we have the cancellation law, meaning: if x, y, z ∈ V and if x + y = x + z, then y = z; (c) thanks to associativity and commutativity of vector addition, the expression ∑ᵢ₌₁ⁿ xᵢ = x₁ + x₂ + ⋯ + xₙ has an unambiguous (and natural) meaning, whenever x₁, …, xₙ ∈ V. In short, all the "normal" rules of arithmetic hold in a general vector space. □

Here are some standard (and important) examples of vector spaces. The reader should have little or no difficulty in checking that these are indeed examples of vector spaces (by verifying that the operations of vector addition and scalar multiplication, as defined below, do indeed satisfy the axioms listed in the definition of a vector space).

Example 1.1.3 (1) ℂⁿ is the (unitary) space of all n-tuples of complex numbers; its typical element has the form (ξ₁, …, ξₙ), where ξᵢ ∈ ℂ ∀ 1 ≤ i ≤ n. Addition and scalar multiplication are defined co-ordinatewise: thus, if x = (ξᵢ), y = (ηᵢ), then x + y = (ξᵢ + ηᵢ) and αx = (αξᵢ).

(2) If X is any set, let ℂ^X denote the set of all complex-valued functions on the set X, and define operations co-ordinatewise, as before, thus: if f, g ∈ ℂ^X, then (f + g)(x) = f(x) + g(x) and (αf)(x) = αf(x). (The previous example may be regarded as the special case of this example with X = {1, 2, …, n}.)

(3) We single out a special case of the previous example. Fix positive integers m, n; recall that an m × n complex matrix is a rectangular array of numbers of the form
A = ⎛ a¹₁  a¹₂  …  a¹ₙ ⎞
    ⎜ a²₁  a²₂  …  a²ₙ ⎟
    ⎜  ⋮    ⋮        ⋮ ⎟
    ⎝ aᵐ₁  aᵐ₂  …  aᵐₙ ⎠

The horizontal (resp., vertical) lines of the matrix are called the rows (resp., columns) of the matrix; hence the matrix A displayed above has m rows and n columns, and has entry aⁱⱼ in the unique position shared by the i-th row and the j-th column of the matrix. We shall, in the sequel, simply write A = ((aⁱⱼ)) to indicate that we have a matrix A which is related to the aⁱⱼ as stated in the last paragraph. The collection of all m × n complex matrices will always be denoted by M_{m×n}(ℂ) - with the convention that we shall write Mₙ(ℂ) for M_{n×n}(ℂ). It should be clear that M_{m×n}(ℂ) is in natural bijection with ℂ^X, where X = {(i, j) : 1 ≤ i ≤ m, 1 ≤ j ≤ n}. □
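The indexing convention and the bijection with ℂ^X can be made concrete in code; the following sketch (not part of the text) models a 2 × 3 complex matrix as a list of rows, so that the entry aⁱⱼ sits at A[i-1][j-1], and flattens it into a function on X = {(i, j)}.

```python
# A 2 x 3 complex matrix as a list of rows: entry a^i_j lives at A[i-1][j-1].
A = [[1 + 2j, 0j, 3 + 0j],
     [4 + 0j, 5j, 6 - 1j]]

m, n = len(A), len(A[0])

# the natural bijection with C^X, X = {(i,j) : 1 <= i <= m, 1 <= j <= n}:
# the matrix becomes the function (i, j) |-> a^i_j, stored here as a dict
as_function = {(i, j): A[i - 1][j - 1]
               for i in range(1, m + 1) for j in range(1, n + 1)}

assert as_function[(1, 3)] == 3 + 0j   # row 1, column 3
assert as_function[(2, 2)] == 5j       # row 2, column 2
assert len(as_function) == m * n       # |X| = mn entries, one per position
```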
The following proposition yields one way of constructing lots of new examples of vector spaces from old.

Proposition 1.1.4 The following conditions on a non-empty subset W ⊂ V are equivalent:
(i) x, y ∈ W, α ∈ ℂ ⟹ x + y, αx ∈ W;
(ii) x, y ∈ W, α ∈ ℂ ⟹ αx + y ∈ W;
(iii) W is itself a vector space if vector addition and scalar multiplication are restricted to vectors coming from W.
A non-empty subset W which satisfies the preceding equivalent conditions is called a subspace of the vector space V.
The (elementary) proof of the proposition is left as an exercise to the reader. We list, below, some more examples of vector spaces which arise as subspaces of the examples occurring in Example 1.1.3.

Example 1.1.5 (1) {x = (ξ₁, …, ξₙ) ∈ ℂⁿ : ∑ᵢ₌₁ⁿ ξᵢ = 0} is a subspace of ℂⁿ and is consequently a vector space in its own right.

(2) The space C[0,1] of continuous complex-valued functions on the unit interval [0,1] is a subspace of ℂ^[0,1], while the set Cᵏ(0,1) consisting of complex-valued functions on the open interval (0,1) which have k continuous derivatives is a subspace of ℂ^(0,1). Also, {f ∈ C[0,1] : ∫₀¹ f(x)dx = 0} is a subspace of C[0,1].

(3) The space ℓ^∞ = {(α₁, α₂, …, αₙ, …) ∈ ℂ^ℕ : supₙ |αₙ| < ∞} (of bounded complex sequences), where ℕ denotes the set of natural numbers, may be regarded as a subspace of ℂ^ℕ; similarly the space ℓ¹ = {(α₁, α₂, …, αₙ, …) : αₙ ∈ ℂ, ∑ₙ |αₙ| < ∞} (of absolutely summable complex sequences) may be regarded as a subspace of ℓ^∞ (and consequently of ℂ^ℕ).

(4) The examples in (3) above are the two extremes of a continuum of spaces defined by ℓ^p = {(α₁, α₂, …, αₙ, …) ∈ ℂ^ℕ : ∑ₙ |αₙ|^p < ∞}, for 1 ≤ p < ∞. It is a (not totally trivial) fact that each ℓ^p is a vector subspace of ℂ^ℕ.

(5) For p ∈ [1, ∞], n ∈ ℕ, the set ℓ^p_n = {α ∈ ℓ^p : α_k = 0 ∀ k > n} is a subspace of ℓ^p which is in natural bijection with ℂⁿ. □

1.2 Normed spaces
Recall that a set X is called a metric space if there exists a function d : X × X → [0, ∞) which satisfies the following conditions, for all x, y, z ∈ X:

d(x, y) = 0 ⟺ x = y                       (1.2.1)
d(x, y) = d(y, x)                          (1.2.2)
d(x, y) ≤ d(x, z) + d(z, y)                (1.2.3)
The function d is called a metric on X and the pair (X, d) is what is, strictly speaking, called a metric space. The quantity d(x, y) is to be thought of as the distance between the points x and y. Thus, the third condition on the metric is the familiar triangle inequality. The most familiar example, of course, is ℝ³, with the metric defined by

d(x, y) = √( ∑ᵢ₌₁³ (xᵢ − yᵢ)² ) .
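A quick numerical sketch (not part of the text) of the three metric axioms (1.2.1)-(1.2.3) for this Euclidean metric, at some sample points of ℝ³:

```python
import math

def d(x, y):
    # the euclidean metric on R^3: the square root of the sum of the
    # squared coordinate differences
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

x = (1.0, 2.0, 3.0)
y = (-1.0, 0.5, 2.0)
z = (0.0, 0.0, 4.0)

assert d(x, x) == 0.0                 # condition (1.2.1)
assert d(x, y) == d(y, x)             # condition (1.2.2), symmetry
assert d(x, y) <= d(x, z) + d(z, y)   # condition (1.2.3), triangle inequality
```

Of course a finite check is only an illustration; the axioms hold for all points.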
As with vector spaces, any one metric space gives rise to many metric spaces thus: if Y is a subset of X, then (Y, d|_{Y×Y}) is also a metric space, and is referred to as a (metric) subspace of X. Thus, for instance, the unit sphere S² = {x ∈ ℝ³ : d(x, 0) = 1} is a metric space; and there are clearly many more interesting metric subspaces of ℝ³. It is more customary to write ‖x‖ = d(x, 0), refer to this quantity as the norm of x and to think of S² as the set of vectors in ℝ³ of unit norm.

The set ℝ³ is an example of a set which is simultaneously a metric space as well as a vector space (over the field of real numbers), where the metric arises from a norm. We will be interested in more general such objects, which we now pause to define.

Definition 1.2.1 A norm on a vector space V is a function V ∋ x ↦ ‖x‖ ∈ [0, ∞) which satisfies the following conditions, for all x, y ∈ V, α ∈ ℂ:

‖x‖ = 0 ⟺ x = 0                           (1.2.4)
‖αx‖ = |α| ‖x‖                             (1.2.5)
‖x + y‖ ≤ ‖x‖ + ‖y‖                       (1.2.6)

and a normed vector space is a pair (V, ‖·‖) consisting of a vector space and a norm on it.
Example 1.2.2 (1) It is a fact that ℓ^p (and hence ℓ^p_n, for any n ∈ ℕ) is a normed vector space with respect to the norm defined by

‖α‖_p = ( ∑ₙ |αₙ|^p )^{1/p} for 1 ≤ p < ∞, and ‖α‖_∞ = supₙ |αₙ| .        (1.2.7)

This is easy to verify for the cases p = 1 and p = ∞; we will prove in the sequel that ‖·‖₂ is a norm; we will not need this fact for other values of p and will hence not prove it. The interested reader can find such a proof in [Sim] for instance; it will, however, be quite instructive to try and prove it directly. (The latter exercise/effort will be rewarded by a glimpse into notions of duality between appropriate pairs of convex maps.)
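A brief numerical sketch (not part of the text) of the norm in (1.2.7), for vectors of ℓ^p_3 and a few values of p, using only the standard library:

```python
import math

def lp_norm(a, p):
    # the norm of equation (1.2.7); p = math.inf gives the sup-norm
    if p == math.inf:
        return max(abs(t) for t in a)
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

x = [3.0, -4.0, 0.0]
y = [1.0, 1.0, -2.0]

assert lp_norm(x, 2) == 5.0           # (9 + 16)^(1/2)
assert lp_norm(x, 1) == 7.0           # 3 + 4 + 0
assert lp_norm(x, math.inf) == 4.0    # the largest |coordinate|

# the triangle inequality (1.2.6), here for p = 2:
s = [xi + yi for xi, yi in zip(x, y)]
assert lp_norm(s, 2) <= lp_norm(x, 2) + lp_norm(y, 2)
```

Spot-checks like these do not, of course, replace the proof that ‖·‖_p is a norm.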
(2) Given a non-empty set X, let B(X) denote the space of bounded complex-valued functions on X. It is not hard to see that B(X) becomes a normed vector space with respect to the norm (denoted by ‖·‖_∞) defined by

‖f‖_∞ = sup{ |f(x)| : x ∈ X } .        (1.2.8)
(The reader, who is unfamiliar with some of the notions discussed below, might care to refer to §A.4 and §A.6 in the Appendix, where (s)he will find definitions of compact and locally compact spaces, respectively, and will also have the pleasure of getting acquainted with "functions vanishing at infinity".)

It should be noted that if X is a compact Hausdorff space, then the set C(X) of continuous complex-valued functions on X is a vector subspace of B(X) (since continuous functions on X are necessarily bounded), and consequently a normed vector space in its own right. More generally, if X is a locally compact Hausdorff space, let C₀(X) denote the space of all complex-valued continuous functions on X which "vanish at infinity"; thus f ∈ C₀(X) precisely when f : X → ℂ is a continuous function such that for any positive ε, it is possible to find a compact subset K ⊂ X such that |f(x)| < ε whenever x ∉ K. The reader should verify that C₀(X) ⊂ B(X) and hence, that C₀(X) is a normed vector space with respect to ‖·‖_∞.

(3) Let C^k_b(0,1) = {f ∈ Cᵏ(0,1) : ‖f⁽ʲ⁾‖_∞ < ∞ for all 0 ≤ j ≤ k}, where we write f⁽ʲ⁾ to denote the j-th derivative of the function f - with the convention that f⁽⁰⁾ = f. This becomes a normed space if we define

‖f‖ = ∑ⱼ₌₀ᵏ ‖f⁽ʲ⁾‖_∞ .        (1.2.9)
More generally, if Ω is an open set in ℝⁿ, let C^k_b(Ω) denote the space of complex-valued functions on Ω which admit continuous partial derivatives of order at most k which are uniformly bounded functions. (This means that for any multi-index α = (α₁, α₂, …, αₙ), where the αᵢ are any non-negative integers satisfying |α| = ∑ⱼ₌₁ⁿ αⱼ ≤ k, the partial derivative ∂^α f = ∂^{|α|} f / (∂x₁^{α₁} ∂x₂^{α₂} ⋯ ∂xₙ^{αₙ}) exists, is continuous - with the convention that if αⱼ = 0 ∀ j, then ∂^α f = f - and is bounded.) Then, the vector space C^k_b(Ω) is a normed space with respect to the norm defined by

‖f‖ = ∑_{α : 0 ≤ |α| ≤ k} ‖∂^α f‖_∞ .   □
Note that - verify! - a normed space is a vector space which is a metric space, the metric being defined by

d(x, y) = ‖x − y‖ .        (1.2.10)
It is natural to ask if the "two structures" - i.e., the vector (or linear) and the metric (or topological) structures - are compatible; this amounts to asking if the vector operations are continuous. Recall that a function f : X → Y between metric spaces is said to be continuous at x₀ ∈ X if, whenever {xₙ} is a sequence in X which converges to x₀ - i.e., if limₙ d_X(xₙ, x₀) = 0 - then also the sequence {f(xₙ)} converges to f(x₀); further, the function f is said to be continuous if it is continuous at each point of X.

Exercise 1.2.3 (1) If X, Y are metric spaces, there are several ways of making X × Y into a metric space; if d_X, d_Y denote the metrics on X, Y respectively, verify that both the following specifications define metrics on X × Y:
d₁( (x₁, y₁), (x₂, y₂) ) = d_X(x₁, x₂) + d_Y(y₁, y₂)
d_∞( (x₁, y₁), (x₂, y₂) ) = max{ d_X(x₁, x₂), d_Y(y₁, y₂) }

Also show that a sequence {(xₙ, yₙ)} in X × Y converges to a point (x₀, y₀) with respect to either of the metrics dᵢ, i ∈ {1, ∞}, if and only if we have "co-ordinate-wise convergence" - i.e., if and only if {xₙ} converges to x₀ with respect to d_X, and {yₙ} converges to y₀ with respect to d_Y.
(2) Suppose X is a normed space; show that each of the following maps is continuous (where we think of the product spaces in (a) and (c) below as being metric spaces via equation 1.2.10 and Exercise (1) above):
(a) X × X ∋ (x, y) ↦ x + y ∈ X;
(b) X ∋ x ↦ −x ∈ X;
(c) ℂ × X ∋ (α, x) ↦ αx ∈ X.

(3) Show that a composite of continuous functions between metric spaces - which makes sense in a consistent fashion only if the domains and targets of the maps to be composed satisfy natural inclusion relations - is continuous, as is the restriction to a subspace of the domain of a continuous function.
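The two product metrics of Exercise 1.2.3 (1) can be sketched numerically (this is not part of the text, and only an illustration), taking X = Y = ℝ with the usual metric:

```python
def d1(p, q, dX, dY):
    # d1((x1,y1),(x2,y2)) = dX(x1,x2) + dY(y1,y2)
    return dX(p[0], q[0]) + dY(p[1], q[1])

def dinf(p, q, dX, dY):
    # dinf((x1,y1),(x2,y2)) = max{dX(x1,x2), dY(y1,y2)}
    return max(dX(p[0], q[0]), dY(p[1], q[1]))

dR = lambda a, b: abs(a - b)   # the usual metric on R, used for both factors

# the sequence (1/n, 2 + 1/n^2) converges co-ordinate-wise to (0, 2),
# hence converges to (0, 2) in both product metrics:
limit = (0.0, 2.0)
seq = [(1.0 / n, 2.0 + 1.0 / n ** 2) for n in range(1, 1001)]
assert d1(seq[-1], limit, dR, dR) < 2e-3
assert dinf(seq[-1], limit, dR, dR) < 2e-3

# dinf <= d1 <= 2*dinf pointwise, which is why the two metrics yield
# exactly the same convergent sequences:
for point in seq[:10]:
    assert dinf(point, limit, dR, dR) <= d1(point, limit, dR, dR) \
           <= 2 * dinf(point, limit, dR, dR)
```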
1.3 Linear operators
The focus of interest in the study of normed spaces is on the appropriate classes of mappings between them. In modern parlance, we need to identify the proper class of morphisms in the category of normed spaces. (It is assumed that the reader is familiar with the rudiments of linear algebra; the necessary material is quickly discussed in the appendix - see §A.1 - for the reader who does not have the required familiarity.) If we are only interested in vector spaces, we would concern ourselves with linear transformations between them - i.e., if V, W are vector spaces, we would be
interested in the class L(V, W) of mappings T : V → W which are linear in the sense that

T(αx + βy) = αTx + βTy, ∀ α, β ∈ ℂ, x, y ∈ V.        (*)
When V = W, we write L(V) = L(V, V). We relegate some standard features (with the possible exception of (2)) of such linear transformations to an exercise.

Exercise 1.3.1 (1) If T ∈ L(V, W), show that T preserves collinearity, in the sense that if x, y, z are three points (in V) which lie on a straight line, then so are Tx, Ty, Tz (in W).
(2) If V = ℝ³ (or even ℝⁿ), and if T : V → V is a mapping which (i) preserves collinearity (as in (1) above), and (ii) maps V onto V, then show that T must satisfy the algebraic condition displayed in (*). (This may be thought of as a reason for calling the above algebraic condition "linearity".)

(3) Show that the following conditions on a linear transformation T ∈ L(V, W) are equivalent:
(i) T is 1-1; i.e., x, y ∈ V, Tx = Ty ⟹ x = y;
(ii) ker T (= {x ∈ V : Tx = 0}) = {0};
(iii) T "preserves linear independence", meaning that if X = {xᵢ : i ∈ I} ⊂ V is a linearly independent set, so is T(X) = {Txᵢ : i ∈ I} ⊂ W.
(A set X as above is said to be linearly independent if, whenever x₁, …, xₙ ∈ X, the only choice of scalars α₁, …, αₙ ∈ ℂ for which the equation α₁x₁ + ⋯ + αₙxₙ = 0 is satisfied is the trivial one: α₁ = ⋯ = αₙ = 0. In particular, any subset of a linearly independent set is also linearly independent. A set which is not linearly independent is called "linearly dependent". It must be noted that {0} is a linearly dependent set, and consequently, that any set containing 0 is necessarily linearly dependent.)

(4) Show that the following conditions on a linear transformation T ∈ L(V, W) are equivalent:
(i) T is invertible - i.e., there exists S ∈ L(W, V) such that ST = id_V and TS = id_W, where we write juxtaposition (ST) for composition (S ∘ T);
(ii) T is 1-1 and onto;
(iii) if X = {xᵢ : i ∈ I} is a basis for V - equivalently, a maximal linearly independent set in V, or equivalently, a linearly independent spanning set for V - then T(X) is a basis for W.
(A linearly independent set is maximal if it is not a proper subset of a larger linearly independent set.)

(5) When the equivalent conditions of (4) are satisfied, T is called an isomorphism and the spaces V and W are said to be isomorphic; if T is an isomorphism, the transformation S of (i) above is unique, and is denoted by T⁻¹.
(6) Show that GL(V) = {T ∈ L(V) : T is invertible} is a group under multiplication.

(7) (i) If B_V = {x₁, x₂, …, xₙ} is a basis for V, and B_W = {y₁, y₂, …, y_m} is a basis for W, show that there is an isomorphism between the spaces L(V, W) and M_{m×n}(ℂ) given by L(V, W) ∋ T ↦ [T]^{B_W}_{B_V}, where the matrix [T]^{B_W}_{B_V} = ((tⁱⱼ)) is defined by

Txⱼ = ∑ᵢ₌₁ᵐ tⁱⱼ yᵢ .        (1.3.11)

(ii) Show that [ST]^{B_{V₃}}_{B_{V₁}} = [S]^{B_{V₃}}_{B_{V₂}} [T]^{B_{V₂}}_{B_{V₁}} whenever this makes sense. (Recall that if A = ((aⁱₖ)) is an m × n matrix, and if B = ((bᵏⱼ)) is an n × p matrix, then the product AB is defined to be the m × p matrix C = ((cⁱⱼ)) given by

cⁱⱼ = ∑ₖ₌₁ⁿ aⁱₖ bᵏⱼ , ∀ 1 ≤ i ≤ m, 1 ≤ j ≤ p .)
(iii) If we put W = V, yᵢ = xᵢ, show that the isomorphism given by (i) above maps GL(V) onto the group GL(n, ℂ) of invertible n × n matrices.
(iv) The so-called standard basis Bₙ = {eᵢ : 1 ≤ i ≤ n} for V = ℂⁿ is defined by e₁ = (1, 0, …, 0), …, eₙ = (0, 0, …, 1); show that this is indeed a basis for ℂⁿ.
(v) (In the following example, assume that we are working over the real field ℝ.) Compute the matrix [T]^{B₂}_{B₂} of each of the following (obviously) linear maps on ℝ²:
(a) T is the rotation about the origin in the counter-clockwise direction by an angle of π/4;
(b) T is the projection onto the x-axis along the y-axis;
(c) T is mirror-reflection in the line y = x.

When dealing with normed spaces - which are simultaneously vector spaces and metric spaces - the natural class of mappings to consider is the class of linear transformations which are continuous.

Proposition 1.3.2 Suppose V, W are normed vector spaces, and suppose T ∈ L(V, W); then the following conditions on T are equivalent:
(i) T is continuous at 0 (= 0_V);
(ii) T is continuous;
(iii) the quantity defined by

‖T‖ = sup{ ‖Tx‖ : x ∈ V, ‖x‖ ≤ 1 }        (1.3.12)

is finite; in other words, T maps bounded sets of V into bounded sets of W.
(A subset B of a metric space X is said to be bounded if there exist x₀ ∈ X and R > 0 such that d(x, x₀) ≤ R ∀ x ∈ B.)
Proof. Before proceeding to the proof, notice that

T ∈ L(V, W) ⟹ T0_V = 0_W .        (1.3.13)

In the sequel, we shall simply write 0 for the zero vector of any vector space.

(i) ⟹ (ii):

xₙ → x in V ⟹ (xₙ − x) → 0 ⟹ T(xₙ − x) → T0 = 0 ⟹ (Txₙ − Tx) → 0 ⟹ Txₙ → Tx in W .

(ii) ⟹ (iii): To say that ‖T‖ = ∞ means that we can find a sequence {xₙ} in X such that ‖xₙ‖ ≤ 1 and ‖Txₙ‖ ≥ n. It follows from equation 1.3.13 that if the xₙ are as above, then the sequence {(1/√n)xₙ} would converge to 0 while the sequence {T((1/√n)xₙ)} would not converge to 0, contradicting the assumption (ii).

(iii) ⟹ (i): Notice to start with, that if ‖T‖ < ∞, and if 0 ≠ x ∈ V, then

‖Tx‖ = ‖x‖ · ‖T(x/‖x‖)‖ ≤ ‖x‖ · ‖T‖ ,

and hence

xₙ → 0 ⟹ ‖xₙ‖ → 0 ⟹ ‖Txₙ‖ ≤ ‖T‖ ‖xₙ‖ → 0

and the proposition is proved. □
Remark 1.3.3 The collection of continuous linear transformations from a normed space V into a normed space W will always be denoted, in these notes, by the symbol ℒ(V, W). As usual, we shall use the abbreviation ℒ(V) for ℒ(V, V). In view of condition (iii) of Proposition 1.3.2, an element of ℒ(V, W) is also described by the adjective "bounded". In the sequel, we shall use the expression bounded operator to mean a continuous (equivalently, bounded) linear transformation between normed spaces. □

We relegate some standard properties of the assignment T ↦ ‖T‖ to the exercises.

Exercise 1.3.4 Let V, W be non-zero normed vector spaces, and let T ∈ ℒ(V, W).

(1) Show that

‖T‖ = sup{ ‖Tx‖ : x ∈ V, ‖x‖ ≤ 1 }
    = sup{ ‖Tx‖ : x ∈ V, ‖x‖ = 1 }
    = sup{ ‖Tx‖/‖x‖ : x ∈ V, x ≠ 0 }
    = inf{ K > 0 : ‖Tx‖ ≤ K‖x‖ ∀ x ∈ V } .
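These equalities can be illustrated numerically (a sketch, not part of the text, and an illustration rather than a proof). Take T to be the shear map (x₁, x₂) ↦ (x₁ + x₂, x₂) on ℝ² with the ℓ² norm; its operator norm is its largest singular value, (1 + √5)/2. The sketch compares two of the supremum formulas on a grid of unit vectors:

```python
import math

# T is the shear (x1, x2) |-> (x1 + x2, x2) on R^2, with the euclidean norm;
# its operator norm is the largest singular value, (1 + sqrt(5))/2.
def T(x):
    return (x[0] + x[1], x[1])

def norm2(x):
    return math.hypot(x[0], x[1])

# a grid of unit vectors, sampling the sphere {x : ||x|| = 1}
unit = [(math.cos(2 * math.pi * k / 3600), math.sin(2 * math.pi * k / 3600))
        for k in range(3600)]

# the sup over the unit sphere, and the ratio form sup ||Tx||/||x||
# evaluated on (non-unit) multiples of the same vectors
sup_eq = max(norm2(T(u)) for u in unit)
sup_ratio = max(norm2(T((3 * u[0], 3 * u[1]))) / (3 * norm2(u)) for u in unit)

golden = (1 + math.sqrt(5)) / 2
assert abs(sup_eq - golden) < 1e-4     # the sampled supremum approximates ||T||
assert abs(sup_eq - sup_ratio) < 1e-9  # two of the formulas in (1) agree
```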
(2) ℒ(V, W) is a vector space, and the assignment T ↦ ‖T‖ defines a norm on ℒ(V, W).

(3) If S ∈ ℒ(W, X), where X is another normed space, show that ST (= S ∘ T) ∈ ℒ(V, X) and ‖ST‖ ≤ ‖S‖ · ‖T‖.

(4) If V is a finite-dimensional normed space, show that L(V, W) = ℒ(V, W); what if it is only assumed that W (and not necessarily V) is finite-dimensional? (Hint: let V be the space ℙ of all polynomials, viewed as a subspace of the normed space C[0,1] (equipped with the "sup"-norm as in Example 1.2.2 (2)), and consider the mapping φ : ℙ → ℂ defined by φ(f) = f′(0).)

(5) Let λ₁, …, λₙ ∈ ℂ and 1 ≤ p ≤ ∞; consider the operator T ∈ ℒ(ℓ^p_n) defined by T(x₁, …, xₙ) = (λ₁x₁, …, λₙxₙ), and show that ‖T‖ = max{ |λᵢ| : 1 ≤ i ≤ n }. (You may assume that ℓ^p_n is a normed space.)

(6) Show that the equation U(T) = T(1) defines a linear isomorphism U : ℒ(ℂ, V) → V which is isometric in the sense that ‖U(T)‖ = ‖T‖ for all T ∈ ℒ(ℂ, V).
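Exercise (5) can be explored numerically; the sketch below (not part of the text, with an arbitrary illustrative choice of the λᵢ and of p) checks ‖Tx‖_p ≤ max|λᵢ| · ‖x‖_p on a few vectors, and notes that the bound is attained at a standard basis vector:

```python
lam = [2.0, -3.0, 0.5]   # an illustrative choice of lambda_1, ..., lambda_n
p = 4                    # one value of p; the claim holds for every 1 <= p <= oo

def T(x):
    # the diagonal operator of (5): multiply the i-th coordinate by lambda_i
    return [l * xi for l, xi in zip(lam, x)]

def norm_p(x):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

lam_max = max(abs(l) for l in lam)   # here 3 = |lambda_2|

# ||Tx||_p <= max|lambda_i| * ||x||_p on some sample vectors ...
for x in ([1.0, 1.0, 1.0], [0.3, -2.0, 5.0], [1.0, 0.0, 0.0]):
    assert norm_p(T(x)) <= lam_max * norm_p(x) + 1e-12

# ... and the bound is attained at the basis vector e_2, so ||T|| = max|lambda_i|:
e2 = [0.0, 1.0, 0.0]
assert abs(norm_p(T(e2)) - lam_max) < 1e-9
```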
1.4 The Hahn-Banach theorem
This section is devoted to one version of the celebrated Hahn-Banach theorem. We begin with a lemma.

Lemma 1.4.1 Let M be a linear subspace of a real vector space X. Suppose p : X → ℝ is a mapping such that
(i) p(x + y) ≤ p(x) + p(y) ∀ x, y ∈ X; and
(ii) p(tx) = t p(x) ∀ x ∈ X, 0 ≤ t ∈ ℝ.
Suppose φ₀ : M → ℝ is a linear map such that φ₀(x) ≤ p(x) for all x ∈ M. Then, there exists a linear map φ : X → ℝ such that φ|_M = φ₀ and φ(x) ≤ p(x) ∀ x ∈ X.

Proof. The proof is an instance of the use of Zorn's lemma. (See §A.2, if you have not yet got acquainted with Zorn's lemma.) Consider the partially ordered set P, whose typical member is a pair (Y, ψ), where (i) Y is a linear subspace of X which contains M; and (ii) ψ : Y → ℝ is a linear map which is an extension of φ₀ and satisfies ψ(x) ≤ p(x) ∀ x ∈ Y; the partial order on P is defined by setting (Y₁, ψ₁) ≤ (Y₂, ψ₂) precisely when (a) Y₁ ⊂ Y₂, and (b) ψ₂|_{Y₁} = ψ₁. (Verify that this prescription indeed defines a partial order on P.) Furthermore, if C = {(Yᵢ, ψᵢ) : i ∈ I} is any totally ordered set in P, an easy verification shows that an upper bound for the family C is given by (Y, ψ), where Y = ⋃_{i∈I} Yᵢ and ψ : Y → ℝ is the unique (necessarily linear) map satisfying ψ|_{Yᵢ} = ψᵢ for all i.
Hence, by Zorn's lemma. the partially ordered set P has a maximal element, call it (Y, 'l/J). The proof of the lemma will be completed once we have shown that
Y=X. Suppose Y =I- X; fix Xo E X -Y, and let Y1 = Y +llb o = {y+txO : y E Y, t E lR}. The definitions ensure that Y1 is a subspace of X which properly contains Y. Also, notice that any linear map 'l/Jl : Y1 ---> lR which extends 'l/J is prescribed uniquely by the number to = 'l/Jl (xo) (and the equation 'l/Jl (y + txo) = 'l/J(y) + tto). We assert that it is possible to find a number to E lR such that the associated map 'l/Jl would - in addition to extending 'l/J - also satisfy 'l/Jl :::: p. This would then establish the inequality (Y, 'l/J) :::: (Y1, 'l/Jl), contradicting the maximality of (Y, 'l/J); this contradiction wpuld then imply that we must have had Y = X in the first place, and the proof would be complete. First observe that if Yl, Y2 E Yare arbitrary, then,
ψ(y₁ + y₂) ≤ p(y₁ + y₂) ≤ p(y₁ − x₀) + p(y₂ + x₀) ;

and consequently,

sup{ψ(y₁) − p(y₁ − x₀) : y₁ ∈ Y} ≤ inf{p(y₂ + x₀) − ψ(y₂) : y₂ ∈ Y} .  (1.4.14)

Let t₀ be any real number which lies between the supremum and the infimum appearing in equation 1.4.14. We now verify that this t₀ does the job. Indeed, if t > 0, and if y ∈ Y, then, since the definition of t₀ ensures that ψ(y₂) + t₀ ≤ p(y₂ + x₀) ∀ y₂ ∈ Y, we find that:

ψ₁(y + tx₀) = ψ(y) + tt₀ = t[ψ(y/t) + t₀] ≤ t p(y/t + x₀) = p(y + tx₀) .

Similarly, if t < 0, then, since the definition of t₀ also ensures that ψ(y₁) − t₀ ≤ p(y₁ − x₀) ∀ y₁ ∈ Y, we find that:

ψ₁(y + tx₀) = ψ(y) + tt₀ = −t[ψ(y/(−t)) − t₀] ≤ −t p(y/(−t) − x₀) = p(y + tx₀) .

Thus, ψ₁(y + tx₀) ≤ p(y + tx₀) ∀ y ∈ Y, t ∈ ℝ, and the proof of the lemma is complete. □
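A concrete instance (not from the text) may help: take X = ℝ², M = {(s, 0) : s ∈ ℝ}, p(x₁, x₂) = |x₁| + |x₂| and φ₀(s, 0) = s. With x₀ = (0, 1), the supremum and infimum of equation 1.4.14 can be computed explicitly:

```latex
% sup and inf of (1.4.14) for this choice of p, phi_0, x_0:
\sup_{s \in \mathbb{R}} \bigl\{ s - p\bigl((s,0) - (0,1)\bigr) \bigr\}
   = \sup_{s} \{ s - |s| - 1 \} = -1, \qquad
\inf_{s \in \mathbb{R}} \bigl\{ p\bigl((s,0) + (0,1)\bigr) - s \bigr\}
   = \inf_{s} \{ |s| + 1 - s \} = 1.
% Any t_0 in [-1, 1] works: the extension phi(x_1, x_2) = x_1 + t_0 x_2
% satisfies phi(x) <= |x_1| + |t_0| |x_2| <= p(x) on all of R^2.
```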
It is customary to use the notation V* for the space L(V, ℂ) of continuous (equivalently, bounded) linear functionals on the normed space V.

We are ready now for the Hahn-Banach theorem, which guarantees the existence of "sufficiently many" continuous linear functionals on any normed space. (Before proceeding to this theorem, the reader should spend a little time thinking about why V* ≠ {0} for a normed space V ≠ {0}.)

Theorem 1.4.2 (Hahn-Banach theorem) Let V be a normed space and let V₀ be a subspace of V. Suppose φ₀ ∈ V₀*; then there exists a φ ∈ V* such that
(i) φ|V₀ = φ₀; and
(ii) ‖φ‖ = ‖φ₀‖.
Proof. We first consider the case when V is a "real normed space" and V₀ is a (real) linear subspace of V. In this case, apply the preceding lemma with X = V, M = V₀ and p(x) = ‖φ₀‖·‖x‖, to find that the desired conclusion follows immediately.

Next consider the case of complex scalars. Define ψ₀(x) = Re φ₀(x) and χ₀(x) = Im φ₀(x), ∀ x ∈ V₀, and note that if x ∈ V₀, then

ψ₀(ix) + iχ₀(ix) = φ₀(ix) = iφ₀(x) = −χ₀(x) + iψ₀(x) ,

and hence, χ₀(x) = −ψ₀(ix) ∀ x ∈ V₀. Observe now that ψ₀ : V₀ → ℝ is a "real" linear functional of the real normed linear space V₀, which satisfies:

|ψ₀(x)| ≤ |φ₀(x)| ≤ ‖φ₀‖·‖x‖ ∀ x ∈ V₀ ;

deduce from the already established real case that ψ₀ extends to a real linear functional - call it ψ - of the real normed space V such that |ψ(x)| ≤ ‖φ₀‖·‖x‖ ∀ x ∈ V. Now define φ : V → ℂ by φ(x) = ψ(x) − iψ(ix), and note that φ is a ℂ-linear map which extends φ₀. Finally, if x ∈ V and if φ(x) = re^{iθ}, with r > 0, θ ∈ ℝ, then,

|φ(x)| = e^{−iθ}φ(x) = φ(e^{−iθ}x) = ψ(e^{−iθ}x) ≤ ‖φ₀‖·‖e^{−iθ}x‖ = ‖φ₀‖·‖x‖ ,

and the proof of the theorem is complete. □
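The passage from ψ back to φ via φ(x) = ψ(x) − iψ(ix) is easy to check in finite dimensions. The following sketch (all names hypothetical, not from the text) verifies it for a random ℂ-linear functional on ℂ⁴:

```python
import random

# Check (not from the text): on C^n, a C-linear functional phi is recovered
# from its real part psi = Re(phi) via phi(x) = psi(x) - i*psi(ix),
# exactly as in the proof of Theorem 1.4.2.
random.seed(0)
n = 4
c = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

def phi(x):          # a C-linear functional on C^n
    return sum(ck * xk for ck, xk in zip(c, x))

def psi(x):          # its real part, an R-linear functional
    return phi(x).real

x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
ix = [1j * xk for xk in x]
reconstructed = complex(psi(x), -psi(ix))   # psi(x) - i*psi(ix)
assert abs(reconstructed - phi(x)) < 1e-12
```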
We shall postpone the deduction of some easy corollaries of this theorem until we have introduced some more terminology in the next section. Also, we postpone the discussion of further variations of the Hahn-Banach theorem - which are best described as Hahn-Banach separation results, and are stated most easily using the language of "topological vector spaces" - to the last section of this chapter, which is where topological vector spaces are discussed.
1.5 Completeness
A fundamental notion in the study of metric spaces is that of completeness. It would be fair to say that the best - and most useful - metric spaces are the ones that are complete. To see what this notion is, we start with an elementary observation. Suppose (X, d) is a metric space, and {xₙ : n ∈ ℕ} is a sequence in X; recall that to say that this sequence converges is to say that there exists a point x ∈ X such that xₙ → x - or equivalently, in the language of epsilons, this means that for each ε > 0, it is possible to find a natural number N such that d(xₙ, x) < ε whenever n ≥ N. This is easily seen to imply the following condition:

Cauchy criterion: For each ε > 0, there exists a natural number N such that d(xₘ, xₙ) < ε whenever n, m ≥ N.

A sequence which satisfies the preceding condition is known as a Cauchy sequence. Thus, the content of our preceding discussion is that any convergent sequence is a Cauchy sequence.

Definition 1.5.1 (a) A metric space is said to be complete if every Cauchy sequence in it is convergent.
(b) A normed space which is complete with respect to the metric defined by the norm is called a Banach space.

The advantage with the notion of a Cauchy sequence is that it is a condition on only the members of the sequence; on the other hand, to check that a sequence is convergent, it is necessary to have the prescience to find a possible x to which the sequence wants to converge. The following exercise will illustrate the use of this notion.

Exercise 1.5.2 (1) Let X = C[0, 1] be endowed with the sup-norm ‖·‖_∞, as in Example 1.2.3(2).
(a) Verify that a sequence {fₙ} converges to f ∈ X precisely when the sequence {fₙ} of functions converges uniformly to the function f.
(b) Show that ℂ is a complete metric space with respect to d(z, w) = |z − w|, and use the fact that the limit of a uniformly convergent sequence of continuous functions is continuous, to prove that X is complete (in the metric coming from the sup-norm), and consequently, a Banach space.
(c) Show, more generally, that the space C₀(Z), where Z is a locally compact Hausdorff space - see Example 1.2.3(2) - is a Banach space.
(2) Suppose (X, ‖·‖) is a Banach space; a series ∑_{n=1}^∞ xₙ is said to converge in X if the sequence {sₙ} of partial sums, defined by sₙ = ∑_{k=1}^n xₖ, is a convergent sequence. Show that an "absolutely summable" series is convergent, provided the ambient normed space is complete - i.e., show that if {xₙ} is a sequence in X such
that the series ∑_{n=1}^∞ ‖xₙ‖ of non-negative numbers is convergent (in ℝ), then the series ∑_{n=1}^∞ xₙ converges in X. Show, conversely, that if every absolutely summable series in a normed vector space X is convergent, then X is necessarily complete.
(3) Show that the series ∑_{n=1}^∞ (1/n²) sin nx converges to a continuous function on [0,1]. (Try to think of how you might prove this assertion using the definition of a convergent sequence, without using the notion of completeness and the fact that C[0,1] is a Banach space.)
(4) Adapt the proof of (1)(b) to show that if W is a Banach space, and if V is any normed space, then L(V, W) is also a Banach space.
(5) Show that the normed space ℓᵖ is a Banach space, for 1 ≤ p ≤ ∞.
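For item (3) of the above exercise, the partial sums sₙ(x) = ∑_{k=1}^n k⁻² sin kx are uniformly Cauchy, since ‖sₘ − sₙ‖_∞ ≤ ∑_{k=n+1}^m k⁻². A numerical sketch of this bound (an illustration, not from the text):

```python
import math

# Partial sums of the series in Exercise 1.5.2(3); the tail of sum 1/k^2
# dominates the uniform distance between partial sums (Weierstrass M-test).
def s(n, x):
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))

grid = [j / 200 for j in range(201)]          # sample points of [0, 1]
sup_diff = max(abs(s(400, x) - s(200, x)) for x in grid)
tail_bound = sum(1.0 / k**2 for k in range(201, 401))
assert sup_diff <= tail_bound                 # uniform Cauchy estimate
assert tail_bound < 0.005                     # the tail is already tiny
```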
As a special case of Exercise 1.5.2(4) above, we see that even if V is not necessarily complete, the space V* is always a Banach space. It is customary to refer to V* as the dual space of the normed space V. The justification for the use of the term "dual" in the last sentence lies in the Hahn-Banach theorem. To see this, we first list various consequences of this very important theorem as exercises.

Exercise 1.5.3 Let X be a normed space, with dual space X*.
(1) If 0 ≠ x₀ ∈ X, show that there exists a φ ∈ X* such that ‖φ‖ = 1 and φ(x₀) = ‖x₀‖. (Hint: set X₀ = ℂx₀ = {αx₀ : α ∈ ℂ}, consider the linear functional φ₀ ∈ X₀* defined by φ₀(αx₀) = α‖x₀‖, and appeal to the Hahn-Banach theorem.)
(2) Show that the equation

(j(x))(φ) = φ(x)  (1.5.15)

defines an isometric mapping j : X → (X*)*. (It is customary to write X** for (X*)*.)
(3) If X₀ is a closed subspace of X - i.e., X₀ is a vector subspace which is closed as a subset of the metric space X - let X/X₀ denote the quotient vector space; (thus, X/X₀ is the set of equivalence classes with respect to the equivalence relation x ∼_{X₀} y ⇔ (x − y) ∈ X₀; it is easy to see that X/X₀ is a vector space with respect to the natural vector operations; note that a typical element of X/X₀ may - and will - be denoted by x + X₀, x ∈ X). Define

‖x + X₀‖ = inf{‖x − x₀‖ : x₀ ∈ X₀} = dist(x, X₀) .  (1.5.16)

Show that
(i) X/X₀ is a normed vector space with respect to the above definition of the norm, and that this quotient space is complete if X is;
(ii) if x ∈ X, then x ∉ X₀ if and only if there exists a non-zero linear functional φ ∈ X* such that φ(x) ≠ 0 and φ(y) = 0 ∀ y ∈ X₀. (Hint: Apply Exercise (1) above to the space X/X₀.)

Remark 1.5.4 A Banach space X is said to be reflexive if the mapping j of Exercise 1.5.3(2) is surjective - i.e., j(X) = X**. (Since any dual space is complete, a reflexive space is necessarily complete, and this is why we have - without loss of generality - restricted ourselves to Banach spaces in defining this notion.) It is a fact that ℓᵖ - see Example 1.2.2(4) - is reflexive if and only if 1 < p < ∞. □
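The quotient norm of equation 1.5.16 can be seen concretely in a toy example (hypothetical, not from the text): in X = ℝ² with the Euclidean norm and X₀ = {(t, t) : t ∈ ℝ}, one has ‖x + X₀‖ = |x₁ − x₂|/√2, which a crude grid minimisation confirms:

```python
import math

# dist(x, X0) for X0 = {(t, t)}: minimise ||x - (t, t)|| over a grid of t.
# The closed form |x1 - x2|/sqrt(2) is the distance to the diagonal line.
def quotient_norm(x, samples=40001, span=10.0):
    best = float("inf")
    for k in range(samples):
        t = -span + 2 * span * k / (samples - 1)
        best = min(best, math.hypot(x[0] - t, x[1] - t))
    return best

x = (3.0, -1.0)
closed_form = abs(x[0] - x[1]) / math.sqrt(2)
assert abs(quotient_norm(x) - closed_form) < 1e-3
```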
In the next sequence of exercises, we outline one procedure for showing how to "complete" a normed space.

Exercise 1.5.5 (1) Suppose Y is a normed space and Y₀ is a (not necessarily closed) subspace in Y. Show that:
(a) if Y₁ denotes the closure of Y₀ - i.e., if Y₁ = {y ∈ Y : ∃ {yₙ}_{n=1}^∞ ⊂ Y₀ such that yₙ → y} - then Y₁ is a closed subspace of Y; and
(b) if Z is a Banach space, and if T ∈ L(Y₀, Z), then there exists a unique operator T̃ ∈ L(Y₁, Z) such that T̃|Y₀ = T; further, if T is an isometry, show that so also is T̃.
(2) Suppose X is a normed vector space. Show that there exists a Banach space X̃ which admits an isometric embedding of X onto a dense subspace X₀; i.e., there exists an isometry T ∈ L(X, X̃) such that X₀ = T(X) is dense in X̃ (meaning that the closure of X₀ is all of X̃). (Hint: Consider the map j of Exercise 1.5.3(2), and choose X̃ to be the closure of j(X).)
(3) Use Exercise (1) above to conclude that if Z is another Banach space such that there exists an isometry S ∈ L(X, Z) whose range is dense in Z, and if X̃ and T are as in Exercise (2) above, then there exists a unique isometry U ∈ L(X̃, Z) such that U ∘ T = S and U is onto.

Hence the space X̃ is essentially uniquely determined by the requirement that it is complete and contains an isometric copy of X as a dense subspace; such a Banach space is called a completion of the normed space X.

We devote the rest of this section to a discussion of some important consequences of completeness in the theory of normed spaces. We begin with two results pertaining to arbitrary metric (and not necessarily vector) spaces, namely the celebrated "Cantor intersection theorem" and the "Baire category theorem".

Theorem 1.5.6 (Cantor intersection theorem) Suppose C₁ ⊃ C₂ ⊃ ⋯ ⊃ Cₙ ⊃ ⋯ is a non-increasing sequence of non-empty closed sets in a complete metric space X. Assume further that

diam(Cₙ) = sup{d(x, y) : x, y ∈ Cₙ} → 0 as n → ∞.

Then ⋂_{n=1}^∞ Cₙ is a singleton set.
Proof. Pick xₙ ∈ Cₙ - this is possible since each Cₙ is assumed to be non-empty. The hypothesis on the diameters of the Cₙ's shows that {xₙ} is a Cauchy sequence in X, and hence the sequence converges to a limit, say x. The assumptions that the Cₙ's are nested and closed are easily seen to imply that x ∈ ⋂ₙCₙ. If also y ∈ ⋂ₙCₙ, then, for each n, we have d(x, y) ≤ diam(Cₙ). This clearly forces x = y, thereby completing the proof. □

Exercise 1.5.7 Consider the four hypotheses on the sequence {Cₙ} in the preceding theorem - viz., non-emptiness, decreasing property, closedness, shrinking of diameters to 0 - and show, by example, that the theorem is false if any one of these hypotheses is dropped.

Before we get to the next theorem, we need to introduce some notions. Call a set (in a metric space) nowhere dense if its closure has empty interior; thus a set A is nowhere dense precisely when every non-empty open set contains a non-empty open subset which is disjoint from A. (Verify this last statement!) A set is said to be of first category if it is expressible as a countable union of nowhere dense sets.

Theorem 1.5.8 (Baire Category Theorem) No non-empty open set in a complete metric space is of the first category; or, equivalently, a countable intersection of dense open sets in a complete metric space is also dense.
Proof. The equivalence of the two assertions is an easy exercise in taking complements and invoking de Morgan's laws, etc. We content ourselves with proving the second of the two formulations of the theorem.

For this, we shall repeatedly use the following assertion, whose easy verification is left as an exercise to the reader: If B is a closed ball of positive diameter - say δ - and if U is a dense open set in X, then there exists a closed ball B₀ which is contained in B ∩ U and has a positive diameter which is less than δ/2.

Suppose {Uₙ}_{n=1}^∞ is a sequence of dense open sets in the complete metric space X. We need to verify that if U is any non-empty open set, then U ∩ (⋂_{n=1}^∞ Uₙ) is not empty. Given such a U, we may clearly find a closed ball B₀ of positive diameter - say δ - such that B₀ ⊂ U. Repeated application of the preceding italicised assertion and an easy induction argument yield a sequence {Bₙ}_{n=1}^∞ of closed balls of positive diameter such that:
(i) Bₙ ⊂ Bₙ₋₁ ∩ Uₙ ∀ n ≥ 1;
(ii) diam Bₙ < (1/2)ⁿδ ∀ n ≥ 1.
An appeal to Cantor's intersection theorem completes the proof. □

Remark 1.5.9 In the foregoing proof, we used the expression "diameter of a ball" to mean the diameter of the set as defined, for instance, in the statement of the Cantor intersection theorem. In particular, the reader should note that if B = {y ∈ X : d(x, y) < δ} is the typical open ball in a general metric space, then
the diameter of B might not be equal to 2δ in general. (For instance, consider X = ℤ, x = 0, δ = 1, in which case the "ball" B in question is actually a singleton and hence has diameter 0.) It is in this sense that the word diameter is used in the preceding proof. □

We give a flavour of the kind of consequence that the Baire category theorem has, in the following exercise.

Exercise 1.5.10 (i) Show that the plane cannot be covered by a countable number of straight lines;
(ii) more generally, show that a Banach space cannot be expressed as a countable union of "translates of proper closed subspaces";
(iii) where do you need the hypothesis that the subspaces in (ii) are closed?

As in the preceding exercise, the Baire category theorem is a powerful "existence result". (There exists a point outside the countably many lines, etc.) The interested reader can find applications in [Rud], for instance, of the Baire category theorem to prove the following existence results: (i) there exists a continuous function on [0,1] which is nowhere differentiable; (ii) there exists a continuous function on the circle whose Fourier series does not converge anywhere. The reader who does not understand some of these statements may safely ignore them.

We now proceed to use the Baire category theorem to prove some of the most useful results in the theory of linear operators on Banach spaces.

Theorem 1.5.11 (Open mapping theorem) Suppose T ∈ L(X, Y) where X and Y are Banach spaces, and suppose T maps X onto Y. Then T is an open mapping - i.e., T maps open sets to open sets.
Proof. For Z ∈ {X, Y}, let us write B_r^Z = {z ∈ Z : ‖z‖ < r}. Since a typical open set in a normed space is a union of translates of dilations of the unit ball, it is clearly sufficient to prove that T(B_1^X) is open in Y. In fact it is sufficient to show that there exists an r > 0 such that B_r^Y ⊂ T(B_1^X). (Reason: suppose this latter assertion is true; it is then clearly the case that B_{rε}^Y ⊂ T(B_ε^X) ∀ ε > 0. Fix y ∈ T(B_1^X), say y = Tx, ‖x‖ < 1; choose 0 < ε < 1 − ‖x‖ and observe that y + B_{rε}^Y ⊂ y + T(B_ε^X) = T(x + B_ε^X) ⊂ T(B_1^X), so we have verified that every point of T(B_1^X) is an interior point.)

The assumption of surjectivity shows that Y = ⋃ₙ T(B_n^X); deduce from the Baire category theorem that (since Y is assumed to be complete) there exists some n such that T(B_n^X) is not nowhere dense; this means that the closure T(B_n^X)⁻ of T(B_n^X) has non-empty interior; since T(B_n^X)⁻ = nT(B_1^X)⁻, this implies that T(B_1^X)⁻ has non-empty interior. Suppose y + B_s^Y ⊂ T(B_1^X)⁻; this is easily seen to imply that B_s^Y ⊂ T(B_2^X)⁻. Hence, setting t = s/2, we thus see that B_t^Y ⊂ T(B_1^X)⁻, and hence also B_{εt}^Y ⊂ T(B_ε^X)⁻ ∀ ε > 0. Thus, we have produced a positive number t with the following property:

δ, ε > 0, y ∈ Y, ‖y‖ < εt ⇒ ∃ x ∈ X such that ‖x‖ < ε and ‖y − Tx‖ < δ .  (1.5.17)

Suppose now that y₀ ∈ B_t^Y. Apply 1.5.17 with ε = 1, δ = (1/2)t, y = y₀ to find x₀ ∈ X such that ‖x₀‖ < 1 and ‖y₀ − Tx₀‖ < (1/2)t. Next apply 1.5.17 with ε = 1/2, δ = (1/2)²t, y = y₁ = y₀ − Tx₀, to find x₁ ∈ X such that ‖x₁‖ < 1/2 and ‖y₁ − Tx₁‖ < (1/2)²t. By repeating this process inductively, we find vectors yₙ ∈ Y, xₙ ∈ X such that ‖yₙ‖ < (1/2)ⁿt, ‖xₙ‖ < (1/2)ⁿ, yₙ₊₁ = yₙ − Txₙ ∀ n ≥ 0.

Notice that the inequalities ensure that the series ∑_{n=0}^∞ xₙ is an "absolutely summable" series in the Banach space X and consequently - by Exercise 1.5.2(2) - the series converges to a limit, say x, in X. On the other hand, the definition of the yₙ's implies that

yₙ₊₁ = yₙ − Txₙ = yₙ₋₁ − Txₙ₋₁ − Txₙ = ⋯ = y₀ − T(x₀ + x₁ + ⋯ + xₙ) .

Since ‖yₙ₊₁‖ → 0, we may conclude from the continuity of T that y₀ = Tx. Finally, since clearly ‖x‖ < 2, we have now shown that B_t^Y ⊂ T(B_2^X), and hence B_{t/2}^Y ⊂ T(B_1^X), and the proof of the theorem is complete. □
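The successive-approximation scheme in the proof of Theorem 1.5.11 can be caricatured in finite dimensions (a sketch, not from the text; the matrices are made up): if A is any approximate right inverse of T with ‖I − TA‖ < 1, then the corrections xₙ = Ayₙ, yₙ₊₁ = yₙ − Txₙ decay geometrically, and ∑ xₙ is an exact preimage of y₀:

```python
# Each step solves y_n only approximately, with geometric error decay;
# summing the corrections x_n yields an exact solution of T x = y_0.
def matvec(T, x):
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

T = [[2.0, 1.0], [0.0, 1.0]]
A = [[0.45, -0.45], [0.0, 0.9]]    # crude approximate inverse: ||I - TA|| = 0.1
y0 = [1.0, 2.0]

y = y0[:]
x = [0.0, 0.0]
for _ in range(200):               # x_n = A y_n, y_{n+1} = y_n - T x_n
    xn = matvec(A, y)
    x = [xi + xni for xi, xni in zip(x, xn)]
    y = [yi - ti for yi, ti in zip(y, matvec(T, xn))]

residual = max(abs(a - b) for a, b in zip(matvec(T, x), y0))
assert residual < 1e-10
```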
Corollary 1.5.12 Suppose X, Y are Banach spaces and suppose T ∈ L(X, Y). Assume that T is 1-1.
(1) Then the following conditions on T are equivalent:
(i) ran T = T(X) is a closed subspace of Y;
(ii) the operator T is bounded below, meaning that there exists a constant c > 0 such that ‖Tx‖ ≥ c‖x‖ ∀ x ∈ X.
(2) In particular, the following conditions are equivalent:
(i) T is 1-1 and maps X onto Y;
(ii) there exists a bounded operator S ∈ L(Y, X) such that ST = id_X and TS = id_Y; in other words, T is invertible in the strong sense that T has an inverse which is a bounded operator.

Proof. (1) Suppose T is bounded below and suppose Txₙ → y; then {Txₙ}ₙ is a Cauchy sequence; but the assumption that T is bounded below then implies that
also {xₙ}ₙ is a Cauchy sequence, which must converge to some point, whose image under T must be y in view of the continuity of T. Conversely, suppose Y₀ = T(X) is closed; then Y₀ is also a Banach space in its own right. An application of the open mapping theorem to T ∈ L(X, Y₀) now implies that there exists a constant ε > 0 such that B_ε^{Y₀} ⊂ T(B_1^X) (in the notation of the proof of Theorem 1.5.11). We now assert that ‖Tx‖ ≥ (ε/2)‖x‖ ∀ x ∈ X. Indeed, suppose x ∈ X and ‖Tx‖ = r; then T((ε/2r)x) ∈ B_ε^{Y₀} ⊂ T(B_1^X). Since T is 1-1, this clearly implies that (ε/2r)x ∈ B_1^X, whence ‖x‖ < 2r/ε; this proves our assertion.
(2) The implication (ii) ⇒ (i) is obvious. Conversely, if T is a bijection, it admits an inverse map which is easily verified to be linear; the boundedness of the inverse is a consequence of the fact that T is bounded below (in view of (1) above). □

Before proceeding to obtain another important consequence of the open mapping theorem, we need to set up some terminology, which we do in the guise of an exercise.

Exercise 1.5.13 The graph of a function f : X → Y is, by definition, the subset G(f) ⊂ X × Y defined by

G(f) = {(x, f(x)) : x ∈ X} .
Recall that if X and Y are metric spaces, then the product X × Y can be metrised in a variety of ways - see Exercise 1.2.3(1) - in such a way that a sequence in X × Y converges precisely when each of the co-ordinate sequences converges.
(i) Show that if f is a continuous map of metric spaces, then its graph is closed (as a subset of the product) - i.e., if {xₙ} is a sequence in X and if there exists (x, y) ∈ X × Y such that xₙ → x and f(xₙ) → y, then necessarily y = f(x).
(ii) What about the converse? i.e., if a function between metric spaces has a closed graph, is the function necessarily continuous?

The content of the next result is that for linear mappings between Banach spaces, the requirements of continuity and having a closed graph are equivalent. In view of Exercise 1.5.13(i), we only state the non-trivial implication in the theorem below.

Theorem 1.5.14 (Closed Graph Theorem) Suppose X and Y are Banach spaces, and suppose T : X → Y is a linear mapping which has a closed graph. Then T is continuous, i.e., T ∈ L(X, Y).
Proof. The graph G(T) is given to be a closed subspace of the Banach space X ⊕₁ Y (by which, of course, we mean the space X × Y with norm given by ‖(x, y)‖ = ‖x‖ + ‖y‖). Hence, we may regard G(T) as a Banach space in its own right.
Consider the mapping S : G(T) → X defined by the equation S(x, Tx) = x. This is clearly a linear 1-1 map of G(T) onto X. Also, it follows from the definition of our choice of the norm on G(T) that S is a bounded operator with ‖S‖ ≤ 1. It now follows from Corollary 1.5.12(1) that the operator S is bounded below; i.e., there exists a constant c > 0 such that ‖x‖ ≥ c(‖x‖ + ‖Tx‖). Hence ‖Tx‖ ≤ ((1 − c)/c)‖x‖, and the proof is complete. □
Remark 1.5.15 Suppose T ∈ L(X, Y), where X and Y are Banach spaces, and suppose that T is only assumed to be 1-1. (An example of such an operator, with X = Y = ℓ², is given by T(x₁, x₂, ..., xₙ, ...) = (x₁, (1/2)x₂, ..., (1/n)xₙ, ...).) Let Y₀ be the (not necessarily closed, and hence not necessarily Banach) subspace of Y defined by Y₀ = T(X). Let T⁻¹ : Y₀ → X be the (necessarily linear) inverse of T. Then T⁻¹ has a closed graph, but is not continuous unless T happened to be bounded below (which is not the case in the example cited above). Hence the hypothesis of completeness of the domain cannot be relaxed in the closed graph theorem. (Note that T⁻¹ is bounded precisely when T is bounded below, which, under the assumed completeness of X, happens precisely when Y₀ is complete - see Corollary 1.5.12.) This example also shows that the "onto" requirement in the Open Mapping Theorem is crucial, and cannot be dropped. □

We conclude with another very important and useful result on operators between Banach spaces.
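For the operator T of the above remark, the failure of being bounded below is visible on the standard basis vectors: ‖Teₙ‖ = (1/n)‖eₙ‖. A quick sketch (helper names are hypothetical, not from the text):

```python
# The diagonal operator of Remark 1.5.15, acting on finitely supported
# sequences: slot k (0-indexed) is scaled by 1/(k+1).
def T(x):
    return [xk / (k + 1) for k, xk in enumerate(x)]

def norm(x):
    return sum(v * v for v in x) ** 0.5

# e_n has a single 1 in slot n; T shrinks it by the factor 1/n, so the
# ratios ||T e_n|| / ||e_n|| tend to 0 and no bound ||Tx|| >= c||x|| holds.
ratios = []
for n in (1, 10, 100):
    e = [0.0] * n
    e[-1] = 1.0
    ratios.append(norm(T(e)) / norm(e))

assert all(abs(r - 1.0 / n) < 1e-12 for r, n in zip(ratios, (1, 10, 100)))
```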
Theorem 1.5.16 (Uniform Boundedness Principle) Suppose X and Y are Banach spaces. The following conditions on an arbitrary family {Tᵢ : i ∈ I} ⊂ L(X, Y) of operators are equivalent:
(i) {Tᵢ : i ∈ I} is uniformly bounded - i.e., sup_{i∈I} ‖Tᵢ‖ < ∞;
(ii) {Tᵢ : i ∈ I} is "pointwise" bounded - i.e., for each x ∈ X, the family {Tᵢx : i ∈ I} is bounded in Y: sup_{i∈I} ‖Tᵢx‖ < ∞.

Proof. We only prove the non-trivial implication; so suppose (ii) holds. Let Aₙ = {x ∈ X : ‖Tᵢx‖ ≤ n ∀ i ∈ I}. The hypothesis implies that X = ⋃_{n=1}^∞ Aₙ. Notice also that each Aₙ is clearly a closed set. Conclude from the Baire category theorem that there exists a positive constant r > 0, an integer n, and a vector x₀ ∈ X such that (x₀ + B_r^X) ⊂ Aₙ. It follows that B_r^X ⊂ A₂ₙ. Hence,

‖x‖ ≤ 1 ⇒ (r/2)x ∈ B_r^X ⇒ ‖Tᵢ((r/2)x)‖ ≤ 2n ∀ i ∈ I ⇒ sup_{i∈I} ‖Tᵢx‖ ≤ 4n/r .

This proves the theorem. □
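The completeness of X in Theorem 1.5.16 cannot be dropped: on the (incomplete) space of finitely supported sequences, the functionals φₙ(x) = n·xₙ are pointwise bounded but not uniformly bounded. A sketch (an illustration, not from the text):

```python
# phi_n(x) = n * x_n (1-indexed). Each finitely supported x meets only
# finitely many non-zero phi_n, so sup_n |phi_n(x)| is finite for each x,
# yet ||phi_n|| = n grows without bound.
def phi(n, x):
    return n * x[n - 1] if n <= len(x) else 0.0

x = [1.0, -2.0, 3.0]                 # a finitely supported sequence
pointwise = max(abs(phi(n, x)) for n in range(1, 50))
assert pointwise == 9.0              # sup_n |phi_n(x)| is finite for this x

# evaluating phi_n on the sup-norm-one vector e_n exhibits ||phi_n|| >= n
norms = [abs(phi(n, [0.0] * (n - 1) + [1.0])) for n in range(1, 6)]
assert norms == [1.0, 2.0, 3.0, 4.0, 5.0]
```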
l.6. Some topological considerations
21
We conclude this section with some exercises on the uniform boundedness principle. (This "principle" is also sometimes referred to as the Banach-Steinhaus theorem.)

Exercise 1.5.17 (1) Suppose S is a subset of a (not necessarily complete) normed space X which is weakly bounded, meaning that φ(S) is a bounded set in ℂ, for each φ ∈ X*. Then show that S is norm bounded, meaning that sup{‖x‖ : x ∈ S} < ∞. (Hint: Apply the uniform boundedness principle to the family {j(x) : x ∈ S} ⊂ X**, in the notation of Exercise 1.5.3(2).)
(2) Show that the completeness of Y is not needed for the validity of the uniform boundedness principle. (Hint: reduce to the complete case by considering the completion Ỹ of Y.)
(3) Suppose {Tₙ : n = 1, 2, ...} ⊂ L(X, Y) is a sequence of bounded operators, where X, Y are Banach spaces, and suppose the sequence converges "strongly"; i.e., assume that {Tₙx} is a convergent sequence in Y, for each x ∈ X. Show that the equation Tx = limₙ Tₙx defines a bounded operator T ∈ L(X, Y).
1.6 Some topological considerations
This section is devoted to some elementary facts concerning "topological vector spaces", which are more general than normed spaces - in much the same way as a topological space is more general than a metric space. The following text assumes that the reader knows at least the extent of "general topology" that is treated in the appendix; the reader who is unfamiliar with the contents of §§A.3 and A.4 should make sure she understands that material at least by the time she finishes the contents of this section.

Definition 1.6.1 A topological vector space - henceforth abbreviated to t.v.s. - is a vector space X which is at the same time a topological space in such a way that the "vector operations are continuous"; explicitly, we demand that the following maps (whose domains are equipped with the natural product topology) are continuous:
(i) ℂ × X ∋ (α, x) ↦ αx ∈ X;
(ii) X × X ∋ (x, y) ↦ x + y ∈ X.

If X, Y are topological vector spaces, the set of all continuous linear transformations T : X → Y is denoted by L(X, Y), and we write L(X) = L(X, X).
We leave some simple verifications as an exercise for the reader.

Exercise 1.6.2 (1) Suppose X is a t.v.s. Show that:
(a) the map X ∋ x ↦ Tₓ ∈ Homeo(X), defined by Tₓ(y) = x + y, is a homomorphism of the additive group of the vector space X into the group Homeo(X) of homeomorphisms of the t.v.s. X; (recall that a homeomorphism of a topological space is a continuous bijection whose inverse is also continuous;)
22
Chapter 1. Normed spaces
(b) the map ℂˣ ∋ α ↦ D_α ∈ L(X), defined by D_α x = αx, is a homomorphism from the multiplicative group of non-zero complex numbers into the multiplicative group G(L(X)) of invertible elements of the algebra L(X);
(c) with the preceding notation, all the translation and dilation operators Tₓ, D_α belong to Homeo(X); in particular, the group of homeomorphisms of X acts transitively on X - i.e., if x, y are arbitrary points in X, then there exists an element L in this group such that L(x) = y.
(2) Show that N_sym = {U : U is an open neighbourhood of 0 which is symmetric in the sense that U = −U = {−x : x ∈ U}} is a neighbourhood base for 0, meaning that if V is any open neighbourhood of 0, then there exists a U ∈ N_sym such that U ⊂ V.

The most important examples of topological vector spaces are the so-called locally convex ones; these are the spaces where the topology is given by a family of seminorms.

Definition 1.6.3 A seminorm on a vector space X is a function p : X → [0, ∞) which satisfies all the properties of a norm except for "positive-definiteness"; specifically, the conditions to be satisfied by a function p as above, in order to be called a seminorm, are: for all α ∈ ℂ, x, y ∈ X, we demand that

p(x) ≥ 0  (1.6.18)
p(αx) = |α|p(x)  (1.6.19)
p(x + y) ≤ p(x) + p(y)  (1.6.20)
Example 1.6.4 If X is a vector space, and if T : X → Y is a linear map of X into a normed space Y, then the equation

p_T(x) = ‖Tx‖

defines a seminorm on X. In particular, if φ : X → ℂ is a linear functional, then the equation

p_φ(x) = |φ(x)|  (1.6.21)

defines a seminorm on X. □
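A numerical check of Example 1.6.4 (an illustration, not from the text): for a linear map T : ℝ² → ℝ² with non-trivial kernel, p_T(x) = ‖Tx‖ satisfies the three seminorm conditions of Definition 1.6.3 but vanishes at a non-zero vector, so it is not a norm:

```python
import math, random

T = [[1.0, -1.0], [2.0, -2.0]]       # kernel contains (1, 1)

def p(x):                             # p_T(x) = ||Tx||, the Euclidean norm of Tx
    y = [sum(T[i][j] * x[j] for j in range(2)) for i in range(2)]
    return math.hypot(*y)

random.seed(1)
for _ in range(100):
    x = [random.uniform(-5, 5) for _ in range(2)]
    y = [random.uniform(-5, 5) for _ in range(2)]
    a = random.uniform(-5, 5)
    assert p(x) >= 0                                                    # positivity
    assert abs(p([a * xi for xi in x]) - abs(a) * p(x)) < 1e-9          # homogeneity
    assert p([xi + yi for xi, yi in zip(x, y)]) <= p(x) + p(y) + 1e-9   # triangle

assert p([1.0, 1.0]) == 0.0          # p_T is not a norm: it kills (1, 1)
```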
Remark 1.6.5 Every normed space is clearly a t.v.s. in an obvious manner. More generally, suppose P = {pᵢ : i ∈ I} is a family of seminorms on a vector space X; for each x ∈ X and i ∈ I, consider the map λ_{i,x} : X → [0, ∞) defined by λ_{i,x}(y) = pᵢ(y − x). Let τ_P denote the weak topology on X which is induced by the family {λ_{i,x} : i ∈ I, x ∈ X} of maps - see Proposition A.3.10 for the definition of weak topology. We may then conclude the following facts concerning this topology on X (from Exercise A.3.11):
(a) For each fixed x ∈ X, i ∈ I, ε > 0, define U(i, x, ε) = {y ∈ X : pᵢ(y − x) < ε}; then the family S_P = {U(i, x, ε) : i ∈ I, x ∈ X, ε > 0} defines a sub-base for the topology τ_P, and consequently the family B_P = {∩_{j=1}^n U(i_j, x_j, ε_j) : {i_j}_{j=1}^n ⊂ I, {x_j}_{j=1}^n ⊂ X, {ε_j}_{j=1}^n ⊂ (0, ∞), n ∈ ℕ} is a base for the topology τ_P. (The reader should verify that the family of finite intersections of the above form where (in any particular finite intersection) all the ε_j's are equal, also constitutes a base for the topology τ_P.)
(b) A net - see Definition 2.2.3 - {x_λ : λ ∈ Λ} converges to x in the topological space (X, τ_P) if and only if the net {pᵢ(x_λ − x) : λ ∈ Λ} converges to 0 in ℝ, for each i ∈ I. (Verify this; if you do it properly, you will use the triangle inequality for seminorms, and you will see the need for the functions λ_{i,x} constructed above.)

It is not hard to verify - using the above criterion for convergence of nets in X - that X, equipped with the topology τ_P, is a t.v.s.; further, τ_P is the smallest topology with respect to which (X, τ_P) is a topological vector space with the property that pᵢ : X → ℝ is continuous for each i ∈ I. This is called the t.v.s. structure on X which is induced by the family P of seminorms on X.

However, this topology can be somewhat trivial; thus, in the extreme case where P is empty, the resulting topology on X is the indiscrete topology - see Example A.3.2(1); a vector space with the indiscrete topology certainly satisfies all the requirements we have so far imposed on a t.v.s., but it is quite uninteresting. To get interesting t.v.s., we normally impose the further condition that the underlying topological space is a Hausdorff space. It is not hard to verify that the topology τ_P induced by a family of seminorms satisfies the Hausdorff separation requirement precisely when the family P of seminorms separates points in X, meaning that if x ≠ 0, then there exists a seminorm pᵢ ∈ P such that pᵢ(x) ≠ 0. (Verify this!)

We will henceforth encounter only t.v.s. where the underlying topology is induced by a family of seminorms which separates points in the above sense. □

Definition 1.6.6 Let X be a normed vector space.
(a) The weak topology on X is the topology τ_P, where P = {p_φ : φ ∈ X*}, the p_φ's being defined as in equation 1.6.21, and, of course, the symbol X* denotes the Banach dual space of X.
(b) The weak* topology on the dual space X* is the topology τ_P, where P = {p_{j(x)} : x ∈ X}, where the p_φ's are as before, and j : X → X** is the canonical inclusion of X into its second dual.
(c) In order to distinguish between the different notions, the natural norm topology (which is the same as τ_{‖·‖}) on X is referred to as the strong topology on X.

It should be clear that any normed space (resp., any Banach dual space) is a t.v.s. with respect to the weak (resp., weak*) topology. Further, it follows from the definitions that if X is any normed space, then:
(a) any set in X which is weakly open (resp., closed) is also strongly open (resp., closed); and
(b) any set in X* which is weak* open (resp., closed) is also weakly open (resp., closed).

Remark 1.6.7 (1) Some care must be exercised with the adjective "strong"; there is a conflict between the definition given here and the sense in which the expression "strong convergence" is used for operators, as in §2.5, for instance; but both forms of usage have become so widespread that this is a (minor) conflict that one has to get used to.
(2) The above definitions also make sense for topological vector spaces; one only has to note that X* is the space of continuous linear functionals on the given t.v.s. X. In this case, the weak* topology on X* should be thought of as the topology induced by the family of "evaluation functionals" on X* coming from X. □

Exercise 1.6.8 (1) Let X be a normed space. Show that the weak topology on X is a Hausdorff topology, as is the weak* topology on X*.
(2) Verify all the facts that have been merely stated, without a proof, in this section.
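To see how much coarser the weak topology of Definition 1.6.6 is than the strong one, consider the basis vectors eₙ of ℓ²: for any φ ∈ (ℓ²)* given by a square-summable sequence y, φ(eₙ) = yₙ → 0, while ‖eₙ‖ = 1 for all n; so eₙ → 0 weakly but not strongly. A sketch with one concrete φ (hypothetical, not from the text):

```python
# A fixed vector y of l^2 determines phi = <., y>; evaluating phi on the
# basis vectors e_n just reads off y_n, which tends to 0, even though
# ||e_n|| = 1 for every n (no norm convergence).
y = [1.0 / (k + 1) for k in range(10000)]      # y_k = 1/(k+1), square-summable

def phi_of_e(n):                                # phi(e_n) = y_n (0-indexed)
    return y[n]

values = [abs(phi_of_e(n)) for n in (0, 9, 99, 999, 9999)]
assert values == [1.0, 0.1, 0.01, 0.001, 0.0001]
assert all(values[i] > values[i + 1] for i in range(4))   # tends to 0
```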
Our next goal is to establish a fact which we will need in the sequel.

Theorem 1.6.9 (Alaoglu's theorem) Let X be a normed space. Then the unit ball of X* - i.e., ball X* = {φ ∈ X* : ‖φ‖ ≤ 1} - is a compact Hausdorff space with respect to the weak* topology.

Proof. Notice the following facts to start with:
(1) if φ ∈ ball X* and if x ∈ ball X, then |φ(x)| ≤ 1;
(2) if f : ball X → ℂ is a function, then there exists a φ ∈ ball X* such that f = φ|_{ball X} if and only if the function f satisfies the following conditions:
(i) |f(x)| ≤ 1 ∀ x ∈ ball X; and
(ii) whenever x, y ∈ ball X and α, β ∈ ℂ are such that αx + βy ∈ ball X, then f(αx + βy) = αf(x) + βf(y).

Let Z = 𝔻^{ball X} denote the space of functions from ball X into 𝔻, where 𝔻 = {z ∈ ℂ : |z| ≤ 1} is the closed unit disc in ℂ; then, by Tychonoff's theorem (see Theorem A.4.15), Z is a compact Hausdorff space with respect to the product topology. Consider the map F : ball X* → Z defined by F(φ) = φ|_{ball X}.
The definitions of the weak* topology on ball X* and the product topology on Z show easily that a net {φ_i : i ∈ I} converges with respect to the weak* topology to φ in ball X* if and only if the net {F(φ_i) : i ∈ I} converges with respect to the product topology to F(φ) in Z. Also, since the map F is clearly 1-1, we find that F maps ball X* (with the weak* topology) homeomorphically onto its image ran F (with the subspace topology inherited from Z). Thus, the proof of the theorem will be complete once we show that ran F is closed in Z - since a closed subspace of a compact space is always compact. This is where fact (2) stated at the start of the proof is needed. Whenever x, y ∈ ball X and α, β ∈ ℂ are such that αx + βy ∈ ball X, define

K_{x,y,α,β} = {f ∈ Z : f(αx + βy) = αf(x) + βf(y)} ;

note that K_{x,y,α,β} is a closed subset of Z (when defined) and consequently compact; the content of fact (2) above is that the image ran F is nothing but the set ∩_{x,y,α,β} K_{x,y,α,β}, and the proof is complete, since an arbitrary intersection of closed sets is closed. □

The reason that we have restricted ourselves to discussing topological vector spaces where the topology is induced by a family of seminorms in the above fashion is that most "decent" t.v.s. arise in this manner. The object of the following exercises is to pave the way for identifying topological vector spaces of the above sort with the so-called locally convex topological vector spaces.

Exercise 1.6.10 (1) Let X be a vector space. Show that if p : X → ℝ is a seminorm on X, and if C = {x ∈ X : p(x) < 1} denotes the associated "open unit ball", then C satisfies the following conditions:
(a) C is convex - i.e., x, y ∈ C, 0 ≤ t ≤ 1 ⇒ tx + (1 - t)y ∈ C;
(b) C is "absorbing" - i.e., if x ∈ X is arbitrary, then there exists ε > 0 such that εx ∈ C;
(c) C is "balanced" - i.e., x ∈ C, α ∈ ℂ, |α| ≤ 1 ⇒ αx ∈ C; and
(d) if x ∈ X, then the set {λ > 0 : λx ∈ C} is an open set. (Hint: this is actually just the open interval (0, 1/p(x)), where 1/0 is to be interpreted as ∞.)
(2) (i) Show that any absorbing set contains 0.
(ii) If C is a convex and absorbing set in X, define p_C : X → ℝ by

p_C(x) = inf{λ > 0 : λ⁻¹x ∈ C} ,    (1.6.22)

and show that the map p_C satisfies the conditions of Lemma 1.4.1.

(3) In the converse direction to (1), if C is any subset of X satisfying the conditions (1)(a)-(d), show that there exists a seminorm p on X such that C = {x ∈ X : p(x) < 1}. (Hint: Define p by equation 1.6.22, and verify that this does the job.) The function p is called the Minkowski functional associated with the set C.
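The Minkowski functional of equation 1.6.22 can be computed numerically for concrete convex sets. The following Python sketch is purely illustrative (the choice of C, the bisection routine and its tolerances are ours, not part of the text): it takes C to be the open unit ball of the ℓ¹-norm on ℝ², for which p_C is exactly the ℓ¹-norm.

```python
def minkowski(x, in_C, upper=1e6, tol=1e-9):
    """Numerically evaluate p_C(x) = inf{lam > 0 : x/lam in C} by bisection.

    Assumes C is convex, balanced and absorbing, so membership of x/lam
    in C is monotone in lam (shrinking the point toward 0 keeps it in C).
    """
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_C(tuple(t / mid for t in x)):
            hi = mid   # x/mid already lies in C: the infimum is <= mid
        else:
            lo = mid   # x/mid lies outside C: the infimum is > mid
    return hi

# C = open unit ball of the l1-norm; its Minkowski functional is the l1-norm.
in_l1_ball = lambda x: sum(abs(t) for t in x) < 1

print(minkowski((0.3, -1.2), in_l1_ball))   # -> approximately 1.5 = |0.3| + |-1.2|
```

Note how homogeneity (p_C(tx) = t p_C(x) for t > 0) and balancedness (p_C(-x) = p_C(x)) fall out of the construction, matching Lemma 1.4.1.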
The proof of the following proposition is a not very difficult use of the Minkowski functionals associated with convex, balanced, absorbing sets; we omit the proof; the reader is urged to try and prove the proposition, failing which she should look up a proof (in [Yos], for instance).

Proposition 1.6.11 The following conditions on a topological vector space X are equivalent:
(i) the origin 0 has a neighbourhood base consisting of convex, absorbing and balanced open sets;
(ii) the topology on X is induced by a family of semi-norms - see Remark 1.6.5.

A topological vector space which satisfies these equivalent conditions is said to be locally convex. We conclude this section (and chapter) with a version of the Hahn-Banach separation theorem.

Theorem 1.6.12 (a) Suppose A and B are disjoint convex sets in a topological vector space X, and suppose A is open. Then there exists a continuous linear functional φ : X → ℂ and a real number t such that

Re φ(a) < t ≤ Re φ(b)  ∀ a ∈ A, b ∈ B .
(b) If X is a locally convex topological vector space, and if A ⊂ X (resp., B ⊂ X) is a compact (resp., closed) convex set with A ∩ B = ∅, then there exist a continuous linear functional φ : X → ℂ and t_1, t_2 ∈ ℝ such that

Re φ(a) < t_1 < t_2 ≤ Re φ(b)  ∀ a ∈ A, b ∈ B .
Proof. By considering real and imaginary parts, as in the proof of the Hahn-Banach theorem, it suffices to treat the case of a real vector space X.

(a) Let C = A - B = {a - b : a ∈ A, b ∈ B}, and fix x_0 = a_0 - b_0 ∈ C. The hypothesis ensures that D = C - x_0 is a convex open neighbourhood of 0, and consequently D is an absorbing set. As in Exercise 1.6.10(2)(ii), define

p(x) = inf{α > 0 : α⁻¹x ∈ D} .

Then p(x) ≤ 1 ∀ x ∈ D. Conversely, suppose p(x) < 1 for some x ∈ X; then there exists some 0 < α < 1 such that α⁻¹x ∈ D; since D is convex, 0 ∈ D and α⁻¹ > 1, it follows that x ∈ D; thus p(x) < 1 ⇒ x ∈ D. In particular,

0 ∉ C ⇒ (-x_0) ∉ D ⇒ p(-x_0) ≥ 1 .

Since p(tx) = t p(x) ∀ x ∈ X, ∀ t > 0 - verify this! - we find thus that

p(-t x_0) ≥ t  ∀ t > 0 .    (1.6.23)
Now define φ_0 : X_0 = ℝx_0 → ℝ by φ_0(t x_0) = -t, and note that φ_0(x) ≤ p(x) ∀ x ∈ ℝx_0. (Reason: Let x = t x_0; if t ≥ 0, then φ_0(x) = -t ≤ 0 ≤ p(x); if t < 0, then φ_0(x) = -t = |t| ≤ p(-|t|x_0) = p(x), by equation 1.6.23.) Deduce from Lemma 1.4.1 that φ_0 extends to a linear functional φ : X → ℝ such that φ|_{X_0} = φ_0 and φ(x) ≤ p(x) ∀ x ∈ X.

Notice that φ(x) ≤ 1 whenever x ∈ D ∩ (-D); since V = D ∩ (-D) is an open neighbourhood of 0, it follows that φ is "bounded" and hence continuous. (Reason: if x_n → 0, and if ε > 0 is arbitrary, then (εV is also an open neighbourhood of 0, and so) there exists n_0 such that x_n ∈ εV ∀ n ≥ n_0; this implies that |φ(x_n)| ≤ ε ∀ n ≥ n_0.)

Finally, if a ∈ A, b ∈ B, then a - b - x_0 ∈ D, and so φ(a) - φ(b) + 1 = φ(a - b - x_0) ≤ 1; i.e., φ(a) ≤ φ(b). It follows that if t = inf{φ(b) : b ∈ B}, then φ(a) ≤ t ≤ φ(b) ∀ a ∈ A, b ∈ B. Now, if a ∈ A, since A is open, it follows that a - εx_0 ∈ A for sufficiently small ε > 0; hence also φ(a) + ε = φ(a - εx_0) ≤ t; thus, we do indeed have φ(a) < t ≤ φ(b) ∀ a ∈ A, b ∈ B.

(b) For each x ∈ A, (since 0 ∉ B - x, and since addition is continuous) it follows that there exists a convex open neighbourhood V_x of 0 such that B ∩ (x + V_x + V_x) = ∅. Since {x + V_x : x ∈ A} is an open cover of the compact set A, there exist x_1, ..., x_n ∈ A such that A ⊂ ∪_{i=1}^n (x_i + V_{x_i}). Let V = ∩_{i=1}^n V_{x_i}; then V is a convex open neighbourhood of 0, and if U = A + V, then U is open (since U = ∪_{a∈A} (a + V)) and convex (since A and V are); further, A + V ⊂ ∪_{i=1}^n (x_i + V_{x_i} + V) ⊂ ∪_{i=1}^n (x_i + V_{x_i} + V_{x_i}); consequently, we see that U ∩ B = ∅. Then, by (a) above, we can find a continuous linear functional φ : X → ℝ and a scalar t ∈ ℝ such that φ(u) < t ≤ φ(b) ∀ u ∈ U, b ∈ B. Then φ(A) is a compact subset of (-∞, t), and so we can find t_1, t_2 ∈ ℝ such that sup φ(A) < t_1 < t_2 < t, and the proof is complete.
□

The reason for calling the preceding result a "separation theorem" is this: given a continuous linear functional φ on a topological vector space X, the set H = {x ∈ X : Re φ(x) = c} (where c is an arbitrary real number) is called a "hyperplane" in X; thus, for instance, part (b) of the preceding result says that a compact convex set can be "separated" from a closed convex set from which it is disjoint, by a hyperplane.

An arbitrary t.v.s. may not admit any non-zero continuous linear functionals. (Example?) In order to ensure that there are "sufficiently many" elements in X*, it is necessary to establish something like the Hahn-Banach theorem, and for this to be possible, we need something more than just a topological vector space; this is where local convexity comes in.
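In finite dimensions the separation phenomenon is easy to visualise. The following Python sketch is our own illustrative example (not from the text): A is the closed unit disc in ℝ² (compact, convex), B is the closed half-plane {x : x_1 ≥ 2}, and φ(x) = x_1; the code checks numerically that φ, together with a pair of thresholds, separates sampled points of the two sets as in part (b) of the theorem.

```python
import math
import random

random.seed(0)

# phi(x) = x1 is a continuous linear functional on R^2
def phi(x):
    return x[0]

# A: closed unit disc (compact, convex), sampled on circles of several radii
sample_A = [(r * math.cos(t), r * math.sin(t))
            for r in (0.0, 0.5, 1.0)
            for t in [2 * math.pi * k / 40 for k in range(40)]]

# B: closed half-plane {x : x1 >= 2} (closed, convex), sampled at random
sample_B = [(2.0 + 3.0 * random.random(), 6.0 * random.random() - 3.0)
            for _ in range(100)]

# sup phi(A) = 1 and inf phi(B) = 2, so t1 = 1.2 and t2 = 1.8 separate strictly
t1, t2 = 1.2, 1.8
assert max(phi(a) for a in sample_A) < t1 < t2 <= min(phi(b) for b in sample_B)
print("A and B separated by the hyperplanes {x1 = 1.2} and {x1 = 1.8}")
```

Of course the theorem's content lies in infinite dimensions, where no such picture is available; the sketch only illustrates what the conclusion asserts.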
Chapter 2 Hilbert Spaces
2.1 Inner product spaces
While normed spaces permit us to study "geometry of vector spaces", we are constrained to discussing those aspects which depend only upon the notion of "distance between two points". If we wish to discuss notions that depend upon the angle between two lines, we need something more - and that something more is the notion of an inner product.

The basic notion is best illustrated in the example of the space ℝ² that we are most familiar with, where the most natural norm is what we have called ‖·‖_2. The basic fact from plane geometry that we need is the so-called cosine law, which states that if A, B, C are the vertices of a triangle and if θ is the angle at the vertex C, then

2(AC)(BC) cos θ = (AC)² + (BC)² - (AB)² .

If we apply this to the case where the points A, B and C are represented by the vectors x = (x_1, x_2), y = (y_1, y_2) and (0, 0) respectively, we find that

2‖x‖·‖y‖·cos θ = ‖x‖² + ‖y‖² - ‖x - y‖² = 2(x_1 y_1 + x_2 y_2) .

Thus, we find that the function of two (vector) variables given by
⟨x, y⟩ = x_1 y_1 + x_2 y_2    (2.1.1)

simultaneously encodes the notion of angle as well as distance (and has the explicit interpretation ⟨x, y⟩ = ‖x‖ ‖y‖ cos θ). This is because the norm can be recovered from the inner product by the equation

‖x‖ = ⟨x, x⟩^{1/2} .    (2.1.2)

The notion of an inner product is the proper abstraction of this function of two variables.
Definition 2.1.1 (a) An inner product on a (complex) vector space V is a mapping V × V ∋ (x, y) ↦ ⟨x, y⟩ ∈ ℂ which satisfies the following conditions, for all x, y, z ∈ V and α, β ∈ ℂ:
(i) (positive definiteness) ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 ⇔ x = 0;
(ii) (Hermitian symmetry) ⟨x, y⟩ = \overline{⟨y, x⟩};
(iii) (linearity in the first variable) ⟨αx + βz, y⟩ = α⟨x, y⟩ + β⟨z, y⟩.

An inner product space is a vector space equipped with a (distinguished) inner product.

(b) An inner product space which is complete in the norm coming from the inner product is called a Hilbert space.

Example 2.1.2 (1) If z = (z_1, ..., z_n), w = (w_1, ..., w_n) ∈ ℂ^n, define

⟨z, w⟩ = Σ_{i=1}^n z_i \overline{w_i} ;    (2.1.3)

it is easily verified that this defines an inner product on ℂ^n.
(2) The equation

⟨f, g⟩ = ∫_{[0,1]} f(x) \overline{g(x)} dx    (2.1.4)

is easily verified to define an inner product on C[0, 1]. □
As in the (real) case discussed earlier of ℝ², it is generally true that any inner product gives rise to a norm on the underlying space via equation 2.1.2. Before verifying this fact, we digress with an exercise that states some easy consequences of the definitions.

Exercise 2.1.3 Suppose we are given an inner product space V; for x ∈ V, define ‖x‖ as in equation 2.1.2, and verify the following identities, for all x, y, z ∈ V, α ∈ ℂ:
(1) ⟨x, y + αz⟩ = ⟨x, y⟩ + \overline{α}⟨x, z⟩;
(2) ‖x + y‖² = ‖x‖² + ‖y‖² + 2 Re⟨x, y⟩;
(3) two vectors in an inner product space are said to be orthogonal if their inner product is 0; deduce from (2) above and an easy induction argument that if {x_1, x_2, ..., x_n} is a set of pairwise orthogonal vectors, then

‖Σ_{i=1}^n x_i‖² = Σ_{i=1}^n ‖x_i‖² ;

(4) ‖x + y‖² + ‖x - y‖² = 2(‖x‖² + ‖y‖²); draw some diagrams and convince yourself as to why this identity is called the parallelogram identity;
(5) (Polarisation identity) 4⟨x, y⟩ = Σ_{k=0}^3 i^k ⟨x + i^k y, x + i^k y⟩, where, of course, i = √-1.
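Identities (4) and (5) are easy to confirm numerically. A small Python check (our own sketch; the standard inner product on ℂ⁵ and random Gaussian vectors are illustrative choices, not part of the exercise):

```python
import random

random.seed(1)

def inner(x, y):
    # standard inner product on C^n, conjugate-linear in the second variable
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

n = 5
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

# (4) parallelogram identity
lhs = (norm_sq([a + b for a, b in zip(x, y)])
       + norm_sq([a - b for a, b in zip(x, y)]))
assert abs(lhs - 2 * (norm_sq(x) + norm_sq(y))) < 1e-9

# (5) polarisation identity: 4<x,y> = sum_{k=0}^{3} i^k <x + i^k y, x + i^k y>
rhs = sum((1j ** k) * inner([a + (1j ** k) * b for a, b in zip(x, y)],
                            [a + (1j ** k) * b for a, b in zip(x, y)])
          for k in range(4))
assert abs(4 * inner(x, y) - rhs) < 1e-9
print("parallelogram and polarisation identities verified numerically")
```

Expanding ⟨x + i^k y, x + i^k y⟩ term by term and summing over k = 0, 1, 2, 3 shows why only the 4⟨x, y⟩ term survives; the code simply confirms that cancellation.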
The first (and very important) step towards establishing that any inner product defines a norm via equation 2.1.2 is the following celebrated inequality.

Proposition 2.1.4 (Cauchy-Schwarz inequality) If x, y are arbitrary vectors in an inner product space V, then

|⟨x, y⟩| ≤ ‖x‖·‖y‖ .
Further, this inequality is an equality if and only if the vectors x and y are linearly dependent.

Proof. If y = 0, there is nothing to prove; so we may assume, without loss of generality, that ‖y‖ = 1 (since the statement of the proposition is unaffected upon scaling y by a constant). Notice now that, for arbitrary α ∈ ℂ,

0 ≤ ‖x - αy‖² = ‖x‖² + |α|² - 2 Re(α⟨y, x⟩) .
A little exercise in the calculus shows that this last expression is minimised for the choice α_0 = ⟨x, y⟩, for which choice we find, after some minor algebra, that

0 ≤ ‖x - α_0 y‖² = ‖x‖² - |⟨x, y⟩|² ,

thereby establishing the desired inequality. The above reasoning shows that (if ‖y‖ = 1) if the inequality becomes an equality, then we should have x = α_0 y, and the proof is complete. □

Remark 2.1.5 (1) For future reference, we note here that the positive definite property ‖x‖ = 0 ⇒ x = 0 was never used in the proof of the Cauchy-Schwarz inequality; in particular, the inequality is valid for any positive semi-definite sesquilinear form on V; i.e., suppose B : V × V → ℂ is a mapping which is positive-semidefinite - meaning only that B(x, x) ≥ 0 ∀ x - and satisfies the conditions of Hermitian symmetry and linearity (resp., "conjugate" linearity) in the first (resp., second) variable (see the definition of inner product and Exercise 2.1.3(1)); then
|B(x, y)|² ≤ B(x, x) · B(y, y) .

(2) Similarly, any sesquilinear form B on a complex vector space satisfies the polarisation identity, meaning:

4B(x, y) = Σ_{k=0}^3 i^k B(x + i^k y, x + i^k y)  ∀ x, y ∈ V . □

Corollary 2.1.6 Any inner product gives rise to a norm via equation 2.1.2.
Proof. Positive-definiteness and homogeneity with respect to scalar multiplication are obvious; as for the triangle inequality,

‖x + y‖² = ‖x‖² + ‖y‖² + 2 Re⟨x, y⟩ ≤ ‖x‖² + ‖y‖² + 2‖x‖·‖y‖ = (‖x‖ + ‖y‖)² ,

and the proof is complete. □
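Both the Cauchy-Schwarz inequality and the minimising choice α_0 = ⟨x, y⟩ appearing in the proof of Proposition 2.1.4 can be tested numerically. In this Python sketch (illustrative only; the dimension and the random trial values of α are our own choices), y is normalised first, exactly as in the proof:

```python
import random

random.seed(2)

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

n = 6
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
ny = norm(y)
y = [t / ny for t in y]            # normalise y, as in the proof

# the inequality itself (here ||y|| = 1)
assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12

# alpha0 = <x,y> minimises ||x - alpha y||^2; the minimum is ||x||^2 - |<x,y>|^2
alpha0 = inner(x, y)
best = norm([a - alpha0 * b for a, b in zip(x, y)]) ** 2
assert abs(best - (norm(x) ** 2 - abs(alpha0) ** 2)) < 1e-9

for _ in range(200):               # no random trial alpha does better
    alpha = complex(random.gauss(0, 2), random.gauss(0, 2))
    trial = norm([a - alpha * b for a, b in zip(x, y)]) ** 2
    assert trial >= best - 1e-9
print("Cauchy-Schwarz and the minimising choice alpha0 verified")
```

The random trials are of course no proof; they merely exhibit the completed-square structure ‖x - αy‖² = (‖x‖² - |⟨x, y⟩|²) + |α - ⟨x, y⟩|² that drives the argument.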
Exercise 2.1.7 We use the notation of Examples 1.2.2(1) and 2.1.2.
(1) Show that, for z, w ∈ ℂ^n,

|Σ_{i=1}^n z_i \overline{w_i}|² ≤ (Σ_{i=1}^n |z_i|²)(Σ_{i=1}^n |w_i|²) .

(2) Deduce from (1) that the series Σ_{i=1}^∞ α_i \overline{β_i} converges for any α, β ∈ ℓ², and that

|Σ_{i=1}^∞ α_i \overline{β_i}|² ≤ ‖α‖² ‖β‖² ;

deduce that ℓ² is indeed (a vector space, and in fact) an inner product space, with respect to the inner product defined by

⟨α, β⟩ = Σ_{i=1}^∞ α_i \overline{β_i} .    (2.1.5)

(3) Write down what the Cauchy-Schwarz inequality translates into in the example of C[0, 1].

(4) Show that the inner product is continuous as a mapping from V × V into ℂ. (In view of Corollary 2.1.6 and the discussion in Exercise 1.2.3, this makes sense.)
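Exercise 2.1.7(2) can be illustrated with a concrete pair of square-summable sequences. In the sketch below (our own choice of sequences, truncated at N terms: α_i = 1/i and β_i = 1/(i+1)), the partial sums of ⟨α, β⟩ stay below ‖α‖ ‖β‖, and in fact the series telescopes to 1:

```python
import math

N = 100000
alpha = [1.0 / i for i in range(1, N + 1)]        # sum of squares -> pi^2/6
beta = [1.0 / (i + 1) for i in range(1, N + 1)]   # also square-summable

ip = sum(a * b for a, b in zip(alpha, beta))      # partial sum of <alpha, beta>
norm_a = math.sqrt(sum(a * a for a in alpha))
norm_b = math.sqrt(sum(b * b for b in beta))

# Cauchy-Schwarz for the truncations, hence (letting N -> oo) for l^2
assert ip <= norm_a * norm_b

# here the inner product series can be summed exactly:
# sum_{i=1}^{N} 1/(i(i+1)) = 1 - 1/(N+1)
assert abs(ip - (1.0 - 1.0 / (N + 1))) < 1e-9
```

The point of (2) is precisely that such partial sums are Cauchy, so the inner product of two ℓ² vectors is always a well-defined complex number.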
2.2 Some preliminaries
This section is devoted to a study of some elementary aspects of the theory of Hilbert spaces and the bounded linear operators between them. Recall that a Hilbert space is a complete inner product space - by which, of course, is meant a Banach space where the norm arises from an inner product. We shall customarily use the symbols H, K and variants of these symbols (obtained using subscripts, primes, etc.) for Hilbert spaces. Our first step is to arm ourselves with a reasonably adequate supply of examples of Hilbert spaces.
Example 2.2.1 (1) ℂ^n is an example of a finite-dimensional Hilbert space, and we shall soon see that these are essentially the only such examples. We shall write ℓ²_n for this Hilbert space.

(2) ℓ² is an infinite-dimensional Hilbert space - see Exercises 2.1.7(2) and 1.5.2(5). Nevertheless, this Hilbert space is not "too big", since it is at least equipped with the pleasant feature of being a separable Hilbert space - i.e., it is separable as a metric space, meaning that it has a countable dense set. (Verify this assertion!)

(3) More generally, let S be an arbitrary set, and define

ℓ²(S) = { x = ((x_s))_{s∈S} : Σ_{s∈S} |x_s|² < ∞ } .

(The possibly uncountable sum might be interpreted as follows: a typical element of ℓ²(S) is a family x = ((x_s)) of complex numbers which is indexed by the set S, and which has the property that x_s = 0 except for s coming from some countable subset of S (which depends on the element x), and which is such that the possibly non-zero x_s's, when written out as a sequence in any (equivalently, some) way, constitute a square-summable sequence.) Verify that ℓ²(S) is a Hilbert space in a natural fashion.
(4) This example will make sense to the reader who is already familiar with the theory of measure and Lebesgue integration; the reader who is not may safely skip this example; the subsequent exercise will effectively recapture this example, at least in all cases of interest. In any case, there is a section in the Appendix - see §A.5 - which is devoted to a brief introduction to the theory of measure and Lebesgue integration.

Suppose (X, ℬ, μ) is a measure space. Let 𝓛²(X, ℬ, μ) denote the space of ℬ-measurable complex-valued functions f on X such that ∫_X |f|² dμ < ∞. Note that |f + g|² ≤ 2(|f|² + |g|²), and deduce that 𝓛²(X, ℬ, μ) is a vector space. Note next that |f \overline{g}| ≤ (1/2)(|f|² + |g|²), and so the right-hand side of the following equation makes sense, if f, g ∈ 𝓛²(X, ℬ, μ):

⟨f, g⟩ = ∫_X f \overline{g} dμ .    (2.2.6)

It is easily verified that the above equation satisfies all the requirements of an inner product with the solitary possible exception of the positive-definiteness axiom: if ⟨f, f⟩ = 0, it can only be concluded that f = 0 a.e. - meaning that {x : f(x) ≠ 0} is a set of μ-measure 0 (which might very well be non-empty). Observe, however, that the set N = {f ∈ 𝓛²(X, ℬ, μ) : f = 0 a.e.} is a vector subspace of 𝓛²(X, ℬ, μ); now consider the quotient space L²(X, ℬ, μ) = 𝓛²(X, ℬ, μ)/N, a typical element of which is thus an equivalence class of square-integrable functions, where two functions are considered to be equivalent if they agree outside a set of μ-measure 0.
For simplicity of notation, we shall just write L²(X) for L²(X, ℬ, μ), and we shall denote an element of L²(X) simply by such symbols as f, g, etc., and think of these as actual functions, with the understanding that we shall identify two functions which agree almost everywhere. The point of this exercise is that equation 2.2.6 now does define a genuine inner product on L²(X); most importantly, it is true that L²(X) is complete and is thus a Hilbert space. □

Exercise 2.2.2 (1) Suppose X is an inner product space. Let X̄ be a completion of X regarded as a normed space - see Exercise 1.5.5. Show that X̄ is actually a Hilbert space. (Thus, every inner product space has a Hilbert space completion.)
(2) Let X = C[0, 1] and define

⟨f, g⟩ = ∫_0^1 f(x) \overline{g(x)} dx .
Verify that this defines a genuine (positive-definite) inner product on C[0, 1]. The completion of this inner product space is a Hilbert space - see (1) above - which may be identified with what was called L²([0, 1], ℬ, m) in Example 2.2.1(4), where (ℬ is the σ-algebra of Borel sets in [0, 1] and) m denotes the so-called Lebesgue measure on [0, 1].

In the sequel, we will have to deal with "uncountable sums" repeatedly, so it will be prudent to gather together some facts concerning such sums, which we shall soon do in some exercises. These exercises will use the notion of nets, which we pause to define and elucidate with some examples.

Definition 2.2.3 A directed set is a partially ordered set (I, ≤) which - in addition to the usual requirements of reflexivity (i ≤ i), antisymmetry (i ≤ j, j ≤ i ⇒ i = j) and transitivity (i ≤ j, j ≤ k ⇒ i ≤ k), which have to be satisfied by any partially ordered set - satisfies the following property:

∀ i, j ∈ I, ∃ k ∈ I such that i ≤ k, j ≤ k .

A net in a set X is a family {x_i : i ∈ I} of elements of X which is indexed by some directed set I.

Example 2.2.4 (1) The motivating example for the definition of a directed set is the set ℕ of natural numbers with the natural ordering, and a net indexed by this directed set is nothing but a sequence.
(2) Let S be any set and let F(S) denote the class of all finite subsets of S; this is a directed set with respect to the order defined by inclusion; thus, F_1 ≤ F_2 ⇔ F_1 ⊆ F_2. This will be referred to as the directed set of finite subsets of S.

(3) Let X be any topological space, and let x ∈ X; let N(x) denote the family of all open neighbourhoods of the point x; then N(x) is directed by "reverse inclusion"; i.e., it is a directed set if we define U ≤ V ⇔ V ⊆ U.
(4) If I and J are directed sets, then the Cartesian product I × J is a directed set with respect to the "co-ordinate-wise ordering" defined by (i_1, j_1) ≤ (i_2, j_2) ⇔ i_1 ≤ i_2 and j_1 ≤ j_2. □

The reason that nets were introduced in the first place was to have analogues of sequences when dealing with more general topological spaces than metric spaces. In fact, if X is a general topological space, we say that a net {x_i : i ∈ I} in X converges to a point x if, for every open neighbourhood U of x, it is possible to find an index i_0 ∈ I with the property that x_i ∈ U whenever i_0 ≤ i. (As with sequences, we shall, in the sequel, write i ≥ j to mean j ≤ i in an abstract partially ordered set.)

Exercise 2.2.5 (1) If f : X → Y is a map of topological spaces, and if x ∈ X, then show that f is continuous at x if and only if the net {f(x_i) : i ∈ I} converges to f(x) in Y whenever {x_i : i ∈ I} is a net which converges to x in X. (Hint: use the directed set of Example 2.2.4(3).)
(2) Define what should be meant by saying that a net in a metric space is a "Cauchy net", and show that every convergent net is a Cauchy net.

(3) Show that if X is a metric space which is complete - meaning, of course, that Cauchy sequences in X converge - then every Cauchy net in X also converges. (Hint: for each n, pick i_n ∈ I such that i_1 ≤ i_2 ≤ ... ≤ i_n ≤ ..., and such that d(x_i, x_j) < 1/n whenever i, j ≥ i_n; now show that the net should converge to the limit of the Cauchy sequence {x_{i_n}}_{n∈ℕ}.)

(4) Is the Cartesian product of two directed sets directed with respect to the "dictionary ordering"?

We are now ready to discuss the problem of "uncountable sums". We do this in a series of exercises following the necessary definition.
Definition 2.2.6 Suppose X is a normed space, and suppose {x_s : s ∈ S} is some indexed collection of vectors in X, whose members may not all be distinct; thus, we are looking at a function S ∋ s ↦ x_s ∈ X. We shall, however, be a little sloppy and simply call {x_s : s ∈ S} a family of vectors in X - with the understanding being that we are actually talking about a function as above.

If F ∈ F(S), define x(F) = Σ_{s∈F} x_s - in the notation of Example 2.2.4(2). We shall say that the family {x_s : s ∈ S} is unconditionally summable, and has the sum x, if the net {x(F) : F ∈ F(S)} converges to x in X. When this happens, we shall write

x = Σ_{s∈S} x_s .

Exercise 2.2.7 (1) If X is a Banach space, then show that a family {x_s : s ∈ S} ⊂ X is unconditionally summable if and only if, given any ε > 0, it is possible to find a finite subset F_0 ⊂ S such that ‖Σ_{s∈F} x_s‖ < ε whenever F is a finite subset of S which is disjoint from F_0. (Hint: use Exercise 2.2.5(3).)
(2) If we take S to be the set ℕ of natural numbers, and if the sequence {x_n : n ∈ ℕ} ⊂ X is unconditionally summable, show that if π is any permutation of ℕ, then the series Σ_{n=1}^∞ x_{π(n)} is convergent in X and the sum of this series is independent of the permutation π. (This is the reason for the use of the adjective "unconditional" in the preceding definition.)
(3) Suppose {a_s : s ∈ S} ⊂ [0, ∞); regard this family as a subset of the complex normed space ℂ, and let A = sup{Σ_{s∈F} a_s : F ∈ F(S)}. Show that: (a) the family is unconditionally summable if and only if A < ∞; and that, in that case, (b) Σ_{s∈S} a_s = A.
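The force of the adjective "unconditional" shows up already for S = ℕ. The following Python illustration is our own (not part of the exercise): the alternating harmonic series converges, but not unconditionally, since a rearrangement changes its sum; a family of non-negative terms, by contrast, sums to the same value in every order, as item (3) predicts.

```python
import math

# partial sums of the alternating harmonic series 1 - 1/2 + 1/3 - ... (-> log 2)
def alt_harmonic_partial(n_terms):
    return sum((-1) ** k / (k + 1) for k in range(n_terms))

# the rearrangement "two positive terms, then one negative":
# 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ...   (-> (3/2) log 2)
def rearranged_partial(n_blocks):
    s = 0.0
    for b in range(n_blocks):
        s += 1.0 / (4 * b + 1) + 1.0 / (4 * b + 3) - 1.0 / (2 * b + 2)
    return s

# the two orderings have different limits, so this family is NOT
# unconditionally summable
assert abs(alt_harmonic_partial(400000) - math.log(2)) < 1e-5
assert abs(rearranged_partial(200000) - 1.5 * math.log(2)) < 1e-5

# the positive family 1/(n+1)^2 IS unconditionally summable: every ordering
# of partial sums converges to pi^2/6
x = [1.0 / (n + 1) ** 2 for n in range(200000)]
assert abs(sum(x) - math.pi ** 2 / 6) < 1e-4
assert abs(sum(reversed(x)) - math.pi ** 2 / 6) < 1e-4
```

For real or complex scalars, unconditional summability is in fact equivalent to absolute summability, which is why the positive family above is immune to rearrangement.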
2.3 Orthonormal bases
Definition 2.3.1 A collection {e_i : i ∈ I} in an inner product space is said to be orthonormal if

⟨e_i, e_j⟩ = δ_{ij}  ∀ i, j ∈ I ,

where δ_{ij} denotes the Kronecker delta. Thus, an orthonormal set is nothing but a set of unit vectors which are pairwise orthogonal; here, and in the sequel, we say that two vectors x, y in an inner product space are orthogonal if ⟨x, y⟩ = 0, and we write x ⊥ y.
Example 2.3.2 (1) In ℓ²_n, for 1 ≤ i ≤ n, let e_i be the element whose i-th co-ordinate is 1 and all other co-ordinates are 0; then {e_1, ..., e_n} is an orthonormal set in ℓ²_n.

(2) In ℓ², for 1 ≤ n < ∞, let e_n be the element whose n-th co-ordinate is 1 and all other co-ordinates are 0; then {e_n : n = 1, 2, ...} is an orthonormal set in ℓ². More generally, if S is any set, an entirely similar prescription leads to an orthonormal set {e_s : s ∈ S} in ℓ²(S).

(3) In the inner product space C[0, 1] - with inner product as described in Exercise 2.2.2 - consider the family {e_n : n ∈ ℤ} defined by e_n(x) = exp(2πinx), and show that this is an orthonormal set; hence this is also an orthonormal set when regarded as a subset of L²([0, 1], m) - see Exercise 2.2.2(2). □

Proposition 2.3.3 Let {e_1, e_2, ..., e_n} be an orthonormal set in an inner product space X, and let x ∈ X be arbitrary. Then:
(i) if x = Σ_{i=1}^n α_i e_i, with α_i ∈ ℂ, then α_i = ⟨x, e_i⟩ for each i;
(ii) if x belongs to the subspace spanned by {e_1, ..., e_n}, then x = Σ_{i=1}^n ⟨x, e_i⟩ e_i;
(iii) (Bessel's inequality) Σ_{i=1}^n |⟨x, e_i⟩|² ≤ ‖x‖².
Proof. (i) If x is a linear combination of the e_j's as indicated, compute ⟨x, e_i⟩, and use the assumed orthonormality of the e_j's, to deduce that α_i = ⟨x, e_i⟩.

(ii) This is an immediate consequence of (i).

(iii) Write y = Σ_{i=1}^n ⟨x, e_i⟩ e_i, z = x - y, and deduce from (two applications of) Exercise 2.1.3(3) that

‖x‖² = ‖y‖² + ‖z‖² ≥ ‖y‖² = Σ_{i=1}^n |⟨x, e_i⟩|² . □

We wish to remark on a few consequences of this proposition; for one thing, (i) implies that an arbitrary orthonormal set is linearly independent; for another, if we write ⋁{e_i : i ∈ I} for the vector subspace spanned by {e_i : i ∈ I} - i.e., this consists of the set of linear combinations of the e_i's, and is the smallest subspace containing {e_i : i ∈ I} - it follows from (i) that we know how to write any element of ⋁{e_i : i ∈ I} as a linear combination of the e_i's.

We shall find the following notation convenient in the sequel: if S is a subset of an inner product space X, let [S] denote the smallest closed subspace containing S; it should be clear that this could be described in either of the following equivalent ways: (a) [S] is the intersection of all closed subspaces of X which contain S, and (b) [S] is the closure of ⋁S. (Verify that (a) and (b) describe the same set.)

Lemma 2.3.4 Suppose {e_i : i ∈ I} is an orthonormal set in a Hilbert space H. Then the following conditions on an arbitrary family {α_i : i ∈ I} of complex numbers are equivalent:
(i) the family {α_i e_i : i ∈ I} is unconditionally summable in H;
(ii) the family {|α_i|² : i ∈ I} is unconditionally summable in ℂ;
(iii) sup{Σ_{i∈F} |α_i|² : F ∈ F(I)} < ∞, where F(I) denotes the directed set of finite subsets of the set I;
(iv) there exists a vector x ∈ [{e_i : i ∈ I}] such that ⟨x, e_i⟩ = α_i ∀ i ∈ I.
Proof. For each F ∈ F(I), let us write x(F) = Σ_{i∈F} α_i e_i; thus we find, from Exercise 2.1.3(3), that

‖x(F)‖² = Σ_{i∈F} |α_i|² = A(F) , (say);

thus we find, from Exercise 2.2.7(1), that the family {α_i e_i : i ∈ I} (resp., {|α_i|² : i ∈ I}) is unconditionally summable in the Hilbert space H (resp., ℂ) if and only if, for each ε > 0, it is possible to find F_0 ∈ F(I) such that

‖x(F)‖² = A(F) < ε  ∀ F ∈ F(I \ F_0) ;

this latter condition is easily seen to be equivalent to condition (iii) of the lemma; thus, the conditions (i), (ii) and (iii) of the lemma are indeed equivalent.
(i) ⇒ (iv): Let x = Σ_{i∈I} α_i e_i; in the preceding notation, the net {x(F) : F ∈ F(I)} converges to x in H. Hence also the net {⟨x(F), e_i⟩ : F ∈ F(I)} converges to ⟨x, e_i⟩, for each i ∈ I. An appeal to Proposition 2.3.3(i) completes the proof of this implication.

(iv) ⇒ (iii): This follows immediately from Bessel's inequality. □
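Proposition 2.3.3 and Bessel's inequality are easy to test in ℂ^n. The sketch below is our own illustration (Gram-Schmidt orthonormalisation of random vectors is a standard device, though not discussed in the text): it checks orthonormality, the identity ‖Σ_i ⟨x, e_i⟩ e_i‖² = Σ_i |⟨x, e_i⟩|² of Exercise 2.1.3(3), and Bessel's inequality for a random x.

```python
import random

random.seed(4)

def inner(x, y):
    # standard inner product on C^n (conjugate-linear in the second variable)
    return sum(a * b.conjugate() for a, b in zip(x, y))

def rand_vec(dim):
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(dim)]

# Gram-Schmidt: orthonormalise 3 random vectors in C^5
dim, k = 5, 3
es = []
for _ in range(k):
    v = rand_vec(dim)
    for e in es:
        c = inner(v, e)                        # component of v along e
        v = [a - c * b for a, b in zip(v, e)]
    n = inner(v, v).real ** 0.5
    es.append([a / n for a in v])

# orthonormality: <e_i, e_j> = delta_ij
for i in range(k):
    for j in range(k):
        assert abs(inner(es[i], es[j]) - (1 if i == j else 0)) < 1e-9

x = rand_vec(dim)
coeffs = [inner(x, e) for e in es]
y = [sum(coeffs[i] * es[i][t] for i in range(k)) for t in range(dim)]

# Exercise 2.1.3(3):  ||sum_i <x,e_i> e_i||^2 = sum_i |<x,e_i>|^2
assert abs(inner(y, y).real - sum(abs(c) ** 2 for c in coeffs)) < 1e-9

# Bessel's inequality:  sum_i |<x,e_i>|^2 <= ||x||^2
assert sum(abs(c) ** 2 for c in coeffs) <= inner(x, x).real + 1e-9
```

Since k < dim here, the Bessel inequality is generically strict; equality for every x would require the orthonormal set to be a basis, which is the content of the next proposition.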
We are now ready to establish the fundamental proposition concerning orthonormal bases in a Hilbert space.

Proposition 2.3.5 The following conditions on an orthonormal set {e_i : i ∈ I} in a Hilbert space H are equivalent:
(i) {e_i : i ∈ I} is a maximal orthonormal set, meaning that it is not strictly contained in any other orthonormal set;
(ii) x ∈ H ⇒ x = Σ_{i∈I} ⟨x, e_i⟩ e_i;
(iii) x, y ∈ H ⇒ ⟨x, y⟩ = Σ_{i∈I} ⟨x, e_i⟩⟨e_i, y⟩;
(iv) x ∈ H ⇒ ‖x‖² = Σ_{i∈I} |⟨x, e_i⟩|².

Such an orthonormal set is called an orthonormal basis of H.

Proof. (i) ⇒ (ii): It is a consequence of Bessel's inequality and (the equivalence (iii) ⇔ (iv) of) the last lemma that there exists a vector, call it x_0 ∈ H, such that x_0 = Σ_{i∈I} ⟨x, e_i⟩ e_i. If x ≠ x_0, and if we set f = (1/‖x - x_0‖)(x - x_0), then it is easy to see that {e_i : i ∈ I} ∪ {f} is an orthonormal set which contradicts the assumed maximality of the given orthonormal set.
(ii) ⇒ (iii): For F ∈ F(I), let x(F) = Σ_{i∈F} ⟨x, e_i⟩ e_i and y(F) = Σ_{i∈F} ⟨y, e_i⟩ e_i, and note that, by the assumption (ii) (and the meaning of the uncountable sum), and the assumed orthonormality of the e_i's, we have

⟨x, y⟩ = lim_F ⟨x(F), y(F)⟩ = lim_F Σ_{i∈F} ⟨x, e_i⟩ \overline{⟨y, e_i⟩} = Σ_{i∈I} ⟨x, e_i⟩⟨e_i, y⟩ .

(iii) ⇒ (iv): Put y = x.

(iv) ⇒ (i): Suppose {e_i : i ∈ I ∪ J} is an orthonormal set and suppose J is not empty; then for j ∈ J, we find, in view of (iv), that

‖e_j‖² = Σ_{i∈I} |⟨e_j, e_i⟩|² = 0 ,

which is impossible in an orthonormal set; hence it must be that J is empty - i.e., the maximality assertion of (i) is indeed implied by (iv). □

Corollary 2.3.6 Any orthonormal set in a Hilbert space can be "extended" to an orthonormal basis - meaning that if {e_i : i ∈ I} is any orthonormal set in a Hilbert space H, then there exists an orthonormal set {e_j : j ∈ J} such that I ∩ J = ∅ and {e_i : i ∈ I ∪ J} is an orthonormal basis for H. In particular, every Hilbert space admits an orthonormal basis.
Proof. This is an easy consequence of Zorn's lemma. (For the reader who is unfamiliar with Zorn's lemma, there is a small section - see §A.2 - in the appendix which discusses this very useful variant of the Axiom of Choice.) □
Remark 2.3.7 (1) It follows from Proposition 2.3.5(ii) that if {e_i : i ∈ I} is an orthonormal basis for a Hilbert space H, then H = [{e_i : i ∈ I}]; conversely, it is true - and we shall soon prove this fact - that if an orthonormal set is total in the sense that the vector subspace spanned by the set is dense in the Hilbert space, then such an orthonormal set is necessarily an orthonormal basis.

(2) Each of the three examples of an orthonormal set that is given in Example 2.3.2 is in fact an orthonormal basis for the underlying Hilbert space. This is obvious in cases (1) and (2). As for (3), it is a consequence of the Stone-Weierstrass theorem - see Exercise A.6.10(i) - that the vector subspace of finite linear combinations of the exponential functions {exp(2πinx) : n ∈ ℤ} is dense in {f ∈ C[0, 1] : f(0) = f(1)} (with respect to the uniform norm - i.e., with respect to ‖·‖_∞); in view of Exercise 2.2.2(2), it is not hard to conclude that this orthonormal set is total in L²([0, 1], m), and hence, by remark (1) above, this is an orthonormal basis for the Hilbert space in question.

Since exp(±2πinx) = cos(2πnx) ± i sin(2πnx), and since it is easily verified that cos(2πmx) ⊥ sin(2πnx) ∀ m, n = 1, 2, ..., we find easily that

{1 = e_0} ∪ {√2 cos(2πnx), √2 sin(2πnx) : n = 1, 2, ...}

is also an orthonormal basis for L²([0, 1], m). (Reason: this is an orthonormal set, and it spans the same vector subspace as is spanned by the exponential basis.) (Also, note that these are real-valued functions, and that the inner product of two real-valued functions is clearly real.) It follows, in particular, that if f is any (real-valued) continuous function defined on [0, 1], then such a function admits the following Fourier series (with real coefficients):
f(x) = a_0 + Σ_{n=1}^∞ ( a_n cos(2πnx) + b_n sin(2πnx) ) ,

where the meaning of this series is that we have convergence of the sequence of the partial sums to the function f with respect to the norm in L²[0, 1]. Of course, the coefficients a_n, b_n are given by

a_0 = ∫_0^1 f(x) dx ,
a_n = 2 ∫_0^1 f(x) cos(2πnx) dx ,  ∀ n > 0 ,
b_n = 2 ∫_0^1 f(x) sin(2πnx) dx ,  ∀ n > 0 .
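For a concrete function, the coefficient formulas and Parseval's identity (Proposition 2.3.5(iv)) can be checked numerically. In the sketch below (our own example: f(x) = x, whose exact coefficients are a_0 = 1/2, a_n = 0, b_n = -1/(πn)), the integrals are approximated by midpoint Riemann sums:

```python
import math

# f(x) = x on [0,1]; approximate the coefficient integrals by midpoint sums
N = 200000
xs = [(k + 0.5) / N for k in range(N)]

a0 = sum(t for t in xs) / N
def a(n):
    return 2.0 * sum(t * math.cos(2 * math.pi * n * t) for t in xs) / N
def b(n):
    return 2.0 * sum(t * math.sin(2 * math.pi * n * t) for t in xs) / N

assert abs(a0 - 0.5) < 1e-9                          # a_0 = 1/2
for n in (1, 2, 5):
    assert abs(a(n)) < 1e-6                          # a_n = 0 for this f
    assert abs(b(n) - (-1.0 / (math.pi * n))) < 1e-6 # b_n = -1/(pi n)

# Parseval for the real basis {1, sqrt(2) cos, sqrt(2) sin}:
#   ||f||^2 = a0^2 + (1/2) sum_n (a_n^2 + b_n^2),
# and ||f||^2 = integral of x^2 over [0,1] = 1/3
parseval = a0 ** 2 + 0.5 * sum((1.0 / (math.pi * n)) ** 2
                               for n in range(1, 2000))
assert abs(parseval - 1.0 / 3.0) < 1e-4
```

Note the factor 1/2 in the Parseval sum: it converts the coefficients a_n, b_n of the unnormalised cosines and sines into the Fourier coefficients with respect to the orthonormal vectors √2 cos(2πnx) and √2 sin(2πnx).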
The theory of Fourier series was the precursor to most of modern functional analysis; it is for this reason that if {e_i : i ∈ I} is any orthonormal basis of any Hilbert space, it is customary to refer to the numbers ⟨x, e_i⟩ as the Fourier coefficients of the vector x with respect to the orthonormal basis {e_i : i ∈ I}. □
It is a fact that any two orthonormal bases for a Hilbert space have the same cardinality, and this common cardinal number is called the dimension of the Hilbert space; the proof of this statement, in its full generality, requires facility with infinite cardinal numbers and arguments of a transfinite nature; rather than giving such a proof here, we adopt the following compromise: we prove the general fact in the appendix - see §A.2 - and discuss here only the cases that we shall be concerned with in these notes.

We temporarily classify Hilbert spaces into three classes of increasing complexity; the first class consists of finite-dimensional ones; the second class consists of separable Hilbert spaces - see Example 2.2.1(2) - which are not finite-dimensional; the third class consists of non-separable Hilbert spaces.

The purpose of the following result is to state a satisfying characterisation of separable Hilbert spaces.

Proposition 2.3.8 The following conditions on a Hilbert space H are equivalent:
(i) H is separable;
(ii) H admits a countable orthonormal basis.

Proof. (i) ⇒ (ii): Suppose D is a countable dense set in H and suppose {e_i : i ∈ I} is an orthonormal basis for H. Notice that

i ≠ j ⇒ ‖e_i - e_j‖² = 2 .    (2.3.7)

Since D is dense in H, we can, for each i ∈ I, find a vector x_i ∈ D such that ‖x_i - e_i‖ < 1/2. The identity 2.3.7 shows that the map I ∋ i ↦ x_i ∈ D is necessarily 1-1; since D is countable, we may conclude that so is I.
(ii) ⇒ (i): If I is a countable (finite or infinite) set and if {eᵢ : i ∈ I} is an orthonormal basis for H, let D be the set whose typical element is of the form Σ_{j∈J} aⱼeⱼ, where J is a finite subset of I and the aⱼ are complex numbers whose real and imaginary parts are both rational; it can then be seen that D is a countable dense set in H. □

Thus, non-separable Hilbert spaces are those whose orthonormal bases are uncountable. It is probably fair to say that any true statement about a general non-separable Hilbert space can be established as soon as one knows that the statement is valid for separable Hilbert spaces; it is probably also fair to say that almost all useful Hilbert spaces are separable. So, the reader may safely assume that all Hilbert spaces in the sequel are separable; among these, the finite-dimensional ones are, in a sense, "trivial", and one need only really worry about infinite-dimensional separable Hilbert spaces.
2.3. Orthonormal bases
We next establish a lemma which will lead to the important result which is sometimes referred to as "the projection theorem".

Lemma 2.3.9 Let M be a closed subspace of a Hilbert space H; (thus M may be regarded as a Hilbert space in its own right;) let {eᵢ : i ∈ I} be any orthonormal basis for M, and let {eⱼ : j ∈ J} be any orthonormal set such that {eᵢ : i ∈ I ∪ J} is an orthonormal basis for H, where we assume that the index sets I and J are disjoint. Then, the following conditions on a vector x ∈ H are equivalent:
(i) x ⊥ y ∀ y ∈ M;
(ii) x = Σ_{j∈J} (x, eⱼ)eⱼ .

Proof. The implication (ii) ⇒ (i) is obvious. Conversely, it follows easily from Lemma 2.3.4 and Bessel's inequality that the "series" Σ_{i∈I} (x, eᵢ)eᵢ and Σ_{j∈J} (x, eⱼ)eⱼ converge in H. Let the sums of these "series" be denoted by y and z respectively. Further, since {eᵢ : i ∈ I ∪ J} is an orthonormal basis for H, it should be clear that x = y + z. Now, if x satisfies condition (i) of the Lemma, it should be clear that y = 0 and hence x = z, thereby completing the proof of the lemma. □
We now come to the basic notion of orthogonal complement.

Definition 2.3.10 The orthogonal complement S⊥ of a subset S of a Hilbert space is defined by S⊥ = {x ∈ H : x ⊥ y ∀ y ∈ S} .

Exercise 2.3.11 If S₀ ⊂ S ⊂ H are arbitrary subsets, show that S⊥ ⊂ S₀⊥ and that

S⊥ = (⋁ S)⊥ = ([S])⊥ .

Also show that S⊥ is always a closed subspace of H.
We are now ready for the basic fact concerning orthogonal complements of closed subspaces.

Theorem 2.3.12 Let M be a closed subspace of a Hilbert space H. Then,
(1) M⊥ is also a closed subspace;
(2) (M⊥)⊥ = M;
(3) any vector x ∈ H can be uniquely expressed in the form x = y + z, where y ∈ M, z ∈ M⊥;
(4) if x, y, z are as in (3) above, then the equation Px = y defines a bounded operator P ∈ £(H) with the property that

‖Px‖² = (Px, x) = ‖x‖² − ‖x − Px‖² , ∀ x ∈ H .
Proof. (1) This is easy - see Exercise 2.3.11.

(2) Let I, J, {eᵢ : i ∈ I ∪ J} be as in Lemma 2.3.9. We assert, to start with, that in this case, {eⱼ : j ∈ J} is an orthonormal basis for M⊥. Suppose this were not true; since this is clearly an orthonormal set in M⊥, this would mean that {eⱼ : j ∈ J} is not a maximal orthonormal set in M⊥, which implies the existence of a unit vector x ∈ M⊥ such that (x, eⱼ) = 0 ∀ j ∈ J; such an x would satisfy condition (i) of Lemma 2.3.9, but not condition (ii).

If we now reverse the roles of M, {eᵢ : i ∈ I} and M⊥, {eⱼ : j ∈ J}, we find from the conclusion of the preceding paragraph that {eᵢ : i ∈ I} is an orthonormal basis for (M⊥)⊥, from which we may conclude the validity of (2) of this theorem.

(3) The existence of y and z was demonstrated in the proof of Lemma 2.3.9; as for uniqueness, note that if x = y₁ + z₁ is another such decomposition, then we would have y − y₁ = z₁ − z ∈ M ∩ M⊥; but

w ∈ M ∩ M⊥ ⇒ w ⊥ w ⇒ ‖w‖² = 0 ⇒ w = 0 .

(4) The uniqueness of the decomposition in (3) is easily seen to imply that P is a linear mapping of H into itself; further, in the notation of (3), we find (since y ⊥ z) that ‖x‖² = ‖y‖² + ‖z‖² = ‖Px‖² + ‖x − Px‖²; this implies that ‖Px‖ ≤ ‖x‖ ∀ x ∈ H, and hence P ∈ £(H). Also, since y ⊥ z, we find that

‖Px‖² = ‖y‖² = (y, y + z) = (Px, x) ,

thereby completing the proof of the theorem. □
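The decomposition in (3)-(4) is easy to visualise in coordinates. The sketch below is an illustration only, assuming numpy; the subspace and vector are arbitrary random choices. It builds the projection P = QQ* from an orthonormal basis Q of a two-dimensional subspace M ⊂ ℂ⁴ and checks that z = x − Px ⊥ M and that ‖Px‖² = (Px, x).

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal basis for a 2-dimensional subspace M of C^4, via QR.
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
Q, _ = np.linalg.qr(A)            # columns of Q are orthonormal and span M

P = Q @ Q.conj().T                # orthogonal projection onto M

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = P @ x                         # the component of x in M
z = x - y                         # the component of x in M-perp

# z is orthogonal to M, and P is idempotent and self-adjoint:
overlap = float(np.linalg.norm(Q.conj().T @ z))
idempotent = np.allclose(P @ P, P)
selfadjoint = np.allclose(P, P.conj().T)

# ||Px||^2 = (Px, x), where (u, v) = sum_i u_i conj(v_i);
# np.vdot conjugates its first argument, so (Px, x) = np.vdot(x, y).
quad_gap = abs(np.linalg.norm(y) ** 2 - np.vdot(x, y))
```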
The following corollary to the above theorem justifies the final assertion made in Remark 2.3.7(1).

Corollary 2.3.13 The following conditions on an orthonormal set {eᵢ : i ∈ I} in a Hilbert space H are equivalent:
(i) {eᵢ : i ∈ I} is an orthonormal basis for H;
(ii) {eᵢ : i ∈ I} is total in H - meaning, of course, that H = [{eᵢ : i ∈ I}].

Proof. As has already been observed in Remark 2.3.7(1), the implication (i) ⇒ (ii) follows from Proposition 2.3.5(ii). Conversely, suppose (i) is not satisfied; then {eᵢ : i ∈ I} is not a maximal orthonormal set in H; hence there exists a unit vector x such that x ⊥ eᵢ ∀ i ∈ I; if we write M = [{eᵢ : i ∈ I}], it follows easily that x ∈ M⊥, whence M⊥ ≠ {0}; then, we may deduce from Theorem 2.3.12(2) that M ≠ H - i.e., (ii) is also not satisfied. □
Remark 2.3.14 The operator P ∈ £(H) constructed in Theorem 2.3.12(4) is referred to as the orthogonal projection onto the closed subspace M. When it is necessary to indicate the relation between the subspace M and the projection P, we will write P = P_M and M = ran P; (note that M is indeed the range of the operator P;) some other facts about closed subspaces and projections are spelt out in the following exercises. □
Exercise 2.3.15 (1) Show that (S⊥)⊥ = [S], for any subset S ⊂ H.
(2) Let M be a closed subspace of H, and let P = P_M.
(a) Show that P_{M⊥} = 1 − P_M, where we write 1 for the identity operator on H (the reason being that this is the multiplicative identity of the algebra £(H)).
(b) Let x ∈ H; show that the following conditions are equivalent: (i) x ∈ M; (ii) x ∈ ran P (= PH); (iii) Px = x; (iv) ‖Px‖ = ‖x‖.
(c) Show that M⊥ = ker P = {x ∈ H : Px = 0}.

(3) Let M and N be closed subspaces of H, and let P = P_M, Q = P_N; show that the following conditions are equivalent: (i) N ⊂ M; (ii) PQ = Q; (iii) QP = Q; (i)' M⊥ ⊂ N⊥; (ii)' (1 − Q)(1 − P) = 1 − P.
(4) With M, N, P, Q as in (3) above, show that the following conditions are equivalent: (i) M ⊥ N - i.e., N ⊂ M⊥; (ii) PQ = 0; (iii) QP = 0.

(5) When the equivalent conditions of (4) are met, show that:
(a) [M ∪ N] = M + N = {x + y : x ∈ M, y ∈ N}; and
(b) (P + Q) is the projection onto the subspace M + N;
(c) more generally, if {Mᵢ : 1 ≤ i ≤ n} is a family of closed subspaces of H which are pairwise orthogonal, show that their "vector sum" defined by Σᵢ₌₁ⁿ Mᵢ = {Σᵢ₌₁ⁿ xᵢ : xᵢ ∈ Mᵢ ∀ i} is a closed subspace and the projection onto this subspace is given by Σᵢ₌₁ⁿ P_{Mᵢ}.

If M₁, ..., Mₙ are pairwise orthogonal closed subspaces - see Exercise 2.3.15(5)(c) above - and if M = Σᵢ₌₁ⁿ Mᵢ, we say that M is the direct sum of
the closed subspaces Mᵢ, 1 ≤ i ≤ n, and we write

M = M₁ ⊕ M₂ ⊕ ⋯ ⊕ Mₙ ;   (2.3.8)

conversely, whenever we use the above symbol, it will always be tacitly assumed that the Mᵢ's are closed subspaces which are pairwise orthogonal and that M is the (closed) subspace spanned by them.
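The identities of Exercise 2.3.15(4)-(5) can be observed concretely. The sketch below is an illustration only, assuming numpy, with the arbitrary choices M = span{e₁, e₂} and N = span{e₃} in ℝ⁴; for these orthogonal subspaces one has PQ = 0, and P + Q is again a projection, onto M ⊕ N.

```python
import numpy as np

e = np.eye(4)

# Projections onto the orthogonal subspaces M = span{e1, e2}, N = span{e3}.
P = np.outer(e[0], e[0]) + np.outer(e[1], e[1])
Q = np.outer(e[2], e[2])

PQ_zero = np.allclose(P @ Q, 0)     # M perp N  =>  PQ = 0

R = P + Q                            # candidate projection onto M + N
R_is_projection = np.allclose(R @ R, R) and np.allclose(R, R.T)

# R fixes vectors of M + N and annihilates the complement (spanned by e4):
v = e[0] + 2 * e[2]
fixes = np.allclose(R @ v, v)
kills = np.allclose(R @ e[3], 0)
```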
2.4 The adjoint operator
We begin this section with an identification of the bounded linear functionals - i.e., the elements of the Banach dual space H* - of a Hilbert space.
Theorem 2.4.1 (Riesz lemma) Let H be a Hilbert space.
(a) If y ∈ H, the equation

φ_y(x) = (x, y)

defines a bounded linear functional φ_y ∈ H*; and furthermore,

‖φ_y‖_{H*} = ‖y‖_H .   (2.4.9)

(b) Conversely, if φ ∈ H*, there exists a unique element y ∈ H such that φ = φ_y.
Proof. (a) Linearity of the map φ_y is obvious, while the Cauchy-Schwarz inequality shows that φ_y is bounded and that ‖φ_y‖ ≤ ‖y‖. Since φ_y(y) = ‖y‖², it easily follows that we actually have equality in the preceding inequality.

(b) Suppose conversely that φ ∈ H*. Let M = ker φ. Since ‖φ_{y₁} − φ_{y₂}‖ = ‖y₁ − y₂‖ ∀ y₁, y₂ ∈ H, the uniqueness assertion is obvious; we only have to prove existence. Since existence is clear if φ = 0, we may assume that φ ≠ 0, i.e., that M ≠ H, or equivalently that M⊥ ≠ {0}. Notice that φ maps M⊥ 1-1 into ℂ; since M⊥ ≠ {0}, it follows that M⊥ is one-dimensional. Let z be a unit vector in M⊥. The y that we seek - assuming it exists - must clearly be an element of M⊥ (since φ(x) = 0 ∀ x ∈ M). Thus, we must have y = αz for some uniquely determined scalar 0 ≠ α ∈ ℂ. With y defined thus, we find that φ_y(z) = ᾱ; hence we must have ᾱ = φ(z). Since any element in H is uniquely expressible in the form x + γz for some x ∈ M, γ ∈ ℂ, we find easily that we do indeed have φ = φ_y. □

It must be noted that the mapping y ↦ φ_y is not quite an isometric isomorphism of Banach spaces; it is not a linear map, since φ_{αy} = ᾱφ_y; it is only "conjugate-linear". The Banach space H* is actually a Hilbert space if we define the inner product on it by (φ_y, φ_z) = (z, y); that this equation satisfies the requirements of an inner product is an easy consequence of the Riesz lemma (and the already stated conjugate-linearity of the mapping y ↦ φ_y); that this inner product actually gives rise to the norm on H* is a consequence of the fact that ‖y‖ = ‖φ_y‖.
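In the finite-dimensional case the Riesz lemma is completely transparent: if φ(x) = Σᵢ cᵢxᵢ on ℂ³, the representing vector is y = c̄, since (x, y) = Σᵢ xᵢ ȳᵢ = Σᵢ cᵢxᵢ. A sketch, assuming numpy, with arbitrarily chosen coefficients:

```python
import numpy as np

c = np.array([1 + 2j, 0.5 + 0j, -1j])        # phi(x) = sum_i c_i x_i
y = np.conj(c)                               # candidate representing vector

x = np.array([0.3 - 1j, 2.0 + 0j, 1 + 1j])   # arbitrary test vector

phi_x = np.sum(c * x)
# (x, y) = sum_i x_i conj(y_i); np.vdot conjugates its FIRST argument,
# so (x, y) = np.vdot(y, x).
inner = np.vdot(y, x)

repr_gap = abs(phi_x - inner)
# ||phi|| = ||y||, as in equation 2.4.9:
norm_gap = abs(np.linalg.norm(y) - np.linalg.norm(c))
```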
Exercise 2.4.2 (1) Where is the completeness of H used in the proof of the Riesz lemma; more precisely, what can you say about X* if you only know that X is a (not necessarily complete) inner product space? (Hint: Consider the completion of X.)

(2) If T ∈ £(H, K), where H, K are Hilbert spaces, prove that

‖T‖ = sup{|(Tx, y)| : x ∈ H, y ∈ K, ‖x‖ ≤ 1, ‖y‖ ≤ 1} .

(3) If T ∈ £(H) and if (Tx, x) = 0 ∀ x ∈ H, show that T = 0. (Hint: Use the Polarisation identity - see Remark 2.1.5(2). Notice that this proof relies on the fact that the underlying field of scalars is ℂ - since the Polarisation identity does. In fact, the content of this exercise is false for "real" Hilbert spaces; for instance, consider rotation by 90° in ℝ².)
We now come to a very useful consequence of the Riesz representation theorem; in order to state this result, we need some terminology.

Definition 2.4.3 Let X, Y be inner product spaces.
(a) A mapping B : X × Y → ℂ is said to be a sesquilinear form on X × Y if it satisfies the following conditions:
(i) for each fixed y ∈ Y, the mapping X ∋ x ↦ B(x, y) ∈ ℂ is linear; and
(ii) for each fixed x ∈ X, the mapping Y ∋ y ↦ B(x, y) ∈ ℂ is conjugate-linear;
in other words, for arbitrary x₁, x₂ ∈ X, y₁, y₂ ∈ Y and α₁, α₂, β₁, β₂ ∈ ℂ, we have

B(α₁x₁ + α₂x₂, β₁y₁ + β₂y₂) = Σ²_{i,j=1} αᵢ β̄ⱼ B(xᵢ, yⱼ) .

(b) A sesquilinear form B : X × Y → ℂ is said to be bounded if there exists a finite constant 0 < K < ∞ such that

|B(x, y)| ≤ K·‖x‖·‖y‖ , ∀ x ∈ X, y ∈ Y .
Examples of bounded sesquilinear maps are obtained from bounded linear operators, thus: if T ∈ £(X, Y), then the assignment

(x, y) ↦ (Tx, y)

defines a bounded sesquilinear map. The useful fact, which we now establish, is that for Hilbert spaces, this is the only way to construct bounded sesquilinear maps.

Proposition 2.4.4 Suppose H and K are Hilbert spaces. If B : H × K → ℂ is a bounded sesquilinear map, then there exists a unique bounded operator T : H → K such that

B(x, y) = (Tx, y) , ∀ x ∈ H, y ∈ K .
Proof. Temporarily fix x ∈ H. The hypothesis implies that the map

K ∋ y ↦ \overline{B(x, y)}

defines a bounded linear functional on K with norm at most K‖x‖, where K and B are related as in Definition 2.4.3(b). Deduce from the Riesz lemma that there exists a unique vector in K - call it Tx - such that \overline{B(x, y)} = (y, Tx) - equivalently, B(x, y) = (Tx, y) - for all y ∈ K,
and that, further, ‖Tx‖ ≤ K‖x‖. The previous equation unambiguously defines a mapping T : H → K which is seen to be linear (as a consequence of the uniqueness assertion in the Riesz lemma). The final inequality of the previous paragraph guarantees the boundedness of T. Finally, the uniqueness of T is a consequence of Exercise 2.4.2(2). □

As in Exercise 2.4.2(1), the reader is urged to see to what extent the completeness of H and K are needed in the hypothesis of the preceding proposition. (In particular, she should be able to verify that the completeness of H is unnecessary.)

Corollary 2.4.5 Let T ∈ £(H, K), where H, K are Hilbert spaces. Then there exists a unique bounded operator T* ∈ £(K, H) such that

(T*y, x) = (y, Tx) ∀ x ∈ H, y ∈ K .   (2.4.10)
Proof. Consider the (obviously) bounded sesquilinear form B : K × H → ℂ defined by B(y, x) = (y, Tx); an application of Proposition 2.4.4 now yields the desired operator T* ∈ £(K, H). □
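For matrices, the adjoint is just the conjugate transpose. The sketch below is an illustration only, assuming numpy and randomly chosen data; it checks the defining identity 2.4.10 using the text's convention that the inner product is linear in its first slot.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T_star = T.conj().T                # the adjoint of T, as a matrix

def ip(u, v):
    # (u, v) = sum_i u_i conj(v_i): linear in the first slot, as in the text.
    return np.sum(u * np.conj(v))

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# (T* y, x) = (y, T x), equation 2.4.10:
gap = abs(ip(T_star @ y, x) - ip(y, T @ x))
```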
Definition 2.4.6 If T, T* are as in Corollary 2.4.5, the operator T* is called the adjoint of the operator T.

The next proposition lists various elementary properties of the process of adjunction - i.e., the passage from an operator to its adjoint.

Proposition 2.4.7 Let H₁, H₂, H₃ be Hilbert spaces, and suppose T, T₁ ∈ £(H₁, H₂), S ∈ £(H₂, H₃), and suppose α ∈ ℂ. Then:
(a) (αT + T₁)* = ᾱT* + T₁* ;
(b) (ST)* = T*S* ;
(c) (T*)* = T ;
(d) if T is invertible, then so is T*, and (T*)⁻¹ = (T⁻¹)* ;
(e) ‖T*‖ = ‖T‖ ;
(f) ‖T‖² = ‖T*T‖ .
Proof. Most of these assertions are verified by using the fact that the adjoint operator T* is characterised by the equation 2.4.10.
(a) For arbitrary x ∈ H₁, y ∈ H₂, note that

(y, (αT + T₁)x) = ᾱ(y, Tx) + (y, T₁x) = ᾱ(T*y, x) + (T₁*y, x) = ((ᾱT* + T₁*)y, x) ,

and deduce that (αT + T₁)* = ᾱT* + T₁*.

The proofs of (b), (c) and (d) are entirely similar to the one given above for (a) and the reader is urged to verify the details.

(e) This is an immediate consequence of equation 2.4.10 and (two applications, once to T and once to T*, of) Exercise 2.4.2(2).

(f) If x ∈ H₁, note that

‖Tx‖² = (Tx, Tx) = (T*Tx, x) ≤ ‖T*Tx‖·‖x‖ (by Cauchy-Schwarz) ≤ ‖T*T‖·‖x‖² ;

since x was arbitrary, deduce that ‖T‖² ≤ ‖T*T‖. The reverse inequality follows at once from (e) above and Exercise 1.3.4(3). □

In the remainder of this section, we shall discuss some important classes of bounded operators.
Definition 2.4.8 An operator T ∈ £(H), where H is a Hilbert space, is said to be self-adjoint if T = T*.
Proposition 2.4.9 Let T ∈ £(H), where H is a Hilbert space.
(a) T is self-adjoint if and only if (Tx, x) ∈ ℝ ∀ x ∈ H;
(b) there exists a unique pair of self-adjoint operators Tᵢ, i = 1, 2, such that T = T₁ + iT₂; this decomposition is given by T₁ = (T + T*)/2, T₂ = (T − T*)/2i, and it is customary to refer to T₁ and T₂ as the real and imaginary parts of T and to write Re T = T₁, Im T = T₂.

Proof. (a) If T ∈ £(H), we see that, for arbitrary x,

(Tx, x) = (x, T*x) = \overline{(T*x, x)} ,

and consequently, (Tx, x) ∈ ℝ ⟺ (Tx, x) = (T*x, x). Since T is determined - see Exercise 2.4.2(3) - by its quadratic form (i.e., the map H ∋ x ↦ (Tx, x) ∈ ℂ), the proof of (a) is complete.

(b) It is clear that the equations

Re T = (T + T*)/2 , Im T = (T − T*)/2i

define self-adjoint operators such that T = Re T + i Im T. Conversely, if T = T₁ + iT₂ is such a decomposition, then note that T* = T₁ − iT₂, and conclude that T₁ = Re T, T₂ = Im T. □
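The Cartesian decomposition of (b) is the operator analogue of z = Re z + i Im z. A numerical sketch, for illustration only, assuming numpy and an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

T1 = (T + T.conj().T) / 2          # Re T
T2 = (T - T.conj().T) / 2j         # Im T

# Both parts are self-adjoint, and they recombine to give T:
both_selfadjoint = (np.allclose(T1, T1.conj().T)
                    and np.allclose(T2, T2.conj().T))
recombines = np.allclose(T, T1 + 1j * T2)
```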
Notice, in view of (a) and (b) of the preceding proposition, that the quadratic forms corresponding to the real and imaginary parts of an operator are the real and imaginary parts of the quadratic form of that operator. Self-adjoint operators are the building blocks of all operators, and they are by far the most important subclass of all bounded operators on a Hilbert space. However, in order to see their structure and usefulness, we will have to wait until after we have proved the fundamental spectral theorem - which will allow us to handle self-adjoint operators with exactly the same facility that we have when handling real-valued functions. Nevertheless, we have already seen one special class of self-adjoint operators, as shown by the next result.
Proposition 2.4.10 Let P ∈ £(H). Then the following conditions are equivalent:
(i) P = P_M is the orthogonal projection onto some closed subspace M ⊂ H;
(ii) P = P² = P*.
Proof. (i) ⇒ (ii): If P = P_M, the definition of an orthogonal projection shows that P = P²; the self-adjointness of P follows from Theorem 2.3.12(4) and Proposition 2.4.9(a).
(ii) ⇒ (i): Suppose (ii) is satisfied; let M = ran P, and note that

x ∈ M ⇒ ∃ y ∈ H such that x = Py ⇒ Px = P²y = Py = x ;   (2.4.11)

on the other hand, note that

y ∈ M⊥ ⟺ (y, Pz) = 0 ∀ z ∈ H ⟺ (Py, z) = 0 ∀ z ∈ H (since P = P*) ⟺ Py = 0 ;   (2.4.12)

hence, if z ∈ H and x = P_M z, y = P_{M⊥} z, we find from equations 2.4.11 and 2.4.12 that Pz = Px + Py = x = P_M z. □

The next two propositions identify two important classes of operators between Hilbert spaces.
Proposition 2.4.11 Let H, K be Hilbert spaces; the following conditions on an operator U ∈ £(H, K) are equivalent:
(i) if {eᵢ : i ∈ I} is any orthonormal set in H, then {Ueᵢ : i ∈ I} is also an orthonormal set in K;
(ii) there exists an orthonormal basis {eᵢ : i ∈ I} for H such that {Ueᵢ : i ∈ I} is an orthonormal set in K;
(iii) (Ux, Uy) = (x, y) ∀ x, y ∈ H;
(iv) ‖Ux‖ = ‖x‖ ∀ x ∈ H;
(v) U*U = 1_H.

An operator satisfying these equivalent conditions is called an isometry.
Proof. (i) ⇒ (ii): There exists an orthonormal basis for H.

(ii) ⇒ (iii): If x, y ∈ H and if {eᵢ : i ∈ I} is as in (ii), then

(Ux, Uy) = Σ_{i,j∈I} (x, eᵢ) \overline{(y, eⱼ)} (Ueᵢ, Ueⱼ) = Σ_{i∈I} (x, eᵢ) \overline{(y, eᵢ)} = (x, y) .

(iii) ⇒ (iv): Put y = x.

(iv) ⇒ (v): If x ∈ H, note that

(U*Ux, x) = ‖Ux‖² = ‖x‖² = (1_H x, x) ,

and appeal to the fact that a bounded operator is determined by its quadratic form - see Exercise 2.4.2(3).

(v) ⇒ (i): If {eᵢ : i ∈ I} is any orthonormal set in H, then

(Ueᵢ, Ueⱼ) = (U*Ueᵢ, eⱼ) = (eᵢ, eⱼ) = δᵢⱼ . □
Proposition 2.4.12 The following conditions on an isometry U ∈ £(H, K) are equivalent:
(i) if {eᵢ : i ∈ I} is any orthonormal basis for H, then {Ueᵢ : i ∈ I} is an orthonormal basis for K;
(ii) there exists an orthonormal set {eᵢ : i ∈ I} in H such that {Ueᵢ : i ∈ I} is an orthonormal basis for K;
(iii) UU* = 1_K;
(iv) U is invertible;
(v) U maps H onto K.
An isometry which satisfies the above equivalent conditions is said to be unitary.

Proof. (i) ⇒ (ii): Obvious.

(ii) ⇒ (iii): If {eᵢ : i ∈ I} is as in (ii), and if x ∈ K, observe that

UU*x = UU*(Σ_{i∈I} (x, Ueᵢ)Ueᵢ) = Σ_{i∈I} (x, Ueᵢ)UU*Ueᵢ = Σ_{i∈I} (x, Ueᵢ)Ueᵢ = x ,

the first equality because {Ueᵢ : i ∈ I} is an orthonormal basis for K, and the third because U*U = 1_H (since U is an isometry).
(iii) ⇒ (iv): The assumption that U is an isometry, in conjunction with the hypothesis (iii), says that U* = U⁻¹.

(iv) ⇒ (v): Obvious.

(v) ⇒ (i): If {eᵢ : i ∈ I} is an orthonormal basis for H, then {Ueᵢ : i ∈ I} is an orthonormal set in K, since U is isometric. Now, if z ∈ K, pick x ∈ H such that z = Ux, and observe that

‖z‖² = ‖Ux‖² = ‖x‖² = Σ_{i∈I} |(x, eᵢ)|² = Σ_{i∈I} |(z, Ueᵢ)|² ,

and since z was arbitrary, this shows that {Ueᵢ : i ∈ I} is an orthonormal basis for K. □

Thus, unitary operators are the natural isomorphisms in the context of Hilbert spaces. The collection of unitary operators from H to K will be denoted by U(H, K); when H = K, we shall write U(H) = U(H, H).

We list some elementary properties of unitary and isometric operators in the next exercise.

Exercise 2.4.13 (1) Suppose that H and K are Hilbert spaces and suppose {eᵢ : i ∈ I} (resp., {fᵢ : i ∈ I}) is an orthonormal basis (resp., orthonormal set) in H
(resp., K), for some index set I. Show that: (a) dim H ≤ dim K; and (b) there exists a unique isometry U ∈ £(H, K) such that Ueᵢ = fᵢ ∀ i ∈ I.
(2) Let H and K be Hilbert spaces. Show that:
(a) there exists an isometry U ∈ £(H, K) if and only if dim H ≤ dim K;
(b) there exists a unitary U ∈ £(H, K) if and only if dim H = dim K.

(3) Show that U(H) is a group under multiplication, which is a (norm-) closed subset of the Banach space £(H).

(4) Suppose U ∈ U(H, K); show that the equation

£(H) ∋ T ↦ UTU* ∈ £(K)   (2.4.13)

defines a mapping (ad U) : £(H) → £(K) which is an "isometric isomorphism of Banach *-algebras", meaning that:
(a) ad U is an isometric isomorphism of Banach spaces: i.e., ad U is a linear mapping which is 1-1, onto, and is norm-preserving; (Hint: verify that it is linear and preserves norm and that an inverse is given by ad U*.)
(b) ad U is a product-preserving map between Banach algebras; i.e., (ad U)(T₁T₂) = ((ad U)(T₁))((ad U)(T₂)), for all T₁, T₂ ∈ £(H);
(c) ad U is a *-preserving map; i.e., ((ad U)(T))* = (ad U)(T*) for all T ∈ £(H).
(5) Show that the map U ↦ (ad U) is a homomorphism from the group U(H) into the group Aut £(H) of all automorphisms (= isometric isomorphisms of the Banach *-algebra £(H) onto itself); further, verify that if Uₙ → U in U(H, K), then (ad Uₙ)(T) → (ad U)(T) in £(K) for all T ∈ £(H).
A unitary operator between Hilbert spaces should be viewed as "implementing an inessential variation": thus, if U ∈ U(H, K) and if T ∈ £(H), then the operator UTU* ∈ £(K) should be thought of as being "essentially the same as T", except that it is probably being viewed from a different observer's perspective. All this is made precise in the following definition.
Definition 2.4.14 Two operators T ∈ £(H) and S ∈ £(K) (on two possibly different Hilbert spaces) are said to be unitarily equivalent if there exists a unitary operator U ∈ U(H, K) such that S = UTU*.

We conclude this section with a discussion of some examples of isometric operators, which will illustrate the preceding notions quite nicely.
Example 2.4.15 To start with, notice that if H is a finite-dimensional Hilbert space, then an isometry U ∈ £(H) is necessarily unitary. (Prove this!) Hence, the notion of non-unitary isometries of a Hilbert space into itself makes sense only in infinite-dimensional Hilbert spaces. We discuss some examples of a non-unitary isometry in a separable Hilbert space.

(1) Let H = ℓ² (= ℓ²(ℕ)). Let {eₙ : n ∈ ℕ} denote the standard orthonormal basis of H (consisting of sequences with a 1 in one co-ordinate and 0 in all other co-ordinates). In view of Exercise 2.4.13(1)(b), there exists a unique isometry S ∈ £(H) such that Seₙ = eₙ₊₁ ∀ n ∈ ℕ; equivalently, we have

S(α₁, α₂, α₃, ...) = (0, α₁, α₂, ...) .

For obvious reasons, this operator is referred to as a "shift" operator; in order to distinguish it from a near relative, we shall refer to it as the unilateral shift. It should be clear that S is an isometry whose range is the proper subspace M = {e₁}⊥, and consequently, S is not unitary. A minor computation shows that the adjoint S* is the "backward shift":

S*(α₁, α₂, α₃, ...) = (α₂, α₃, α₄, ...) ,
and that SS* = P_M (which is another way of seeing that S is not unitary). Thus S* is a left-inverse, but not a right-inverse, for S. (This, of course, is typical of a non-unitary isometry.) Further - as is true for any non-unitary isometry - each power Sⁿ, n ≥ 1, is a non-unitary isometry.
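On finitely supported sequences the unilateral shift and its adjoint are just "prepend a zero" and "drop the first coordinate". The sketch below (an illustration only, assuming numpy, with an arbitrary test vector) exhibits ‖Sx‖ = ‖x‖, S*S = 1, and SS* = P_M.

```python
import numpy as np

def S(a):        # unilateral shift: (a1, a2, ...) -> (0, a1, a2, ...)
    return np.concatenate(([0.0], a))

def S_star(a):   # backward shift: (a1, a2, ...) -> (a2, a3, ...)
    return a[1:]

x = np.array([3.0, -1.0, 2.0, 0.5])

isometry_gap = abs(np.linalg.norm(S(x)) - np.linalg.norm(x))   # ||Sx|| = ||x||
left_inverse = np.allclose(S_star(S(x)), x)                    # S* S = 1

# S S* x kills the first coordinate: the projection onto {e1}-perp.
projected = S(S_star(x))
```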
(2) The "near-relative" of the unilateral shift, which was referred to earlier, is the so-called bilateral shift, which is defined as follows: consider the Hilbert space 1i = £2(Z) with its standard basis {en: n E Z}. The bilateral shift is the unique isometry B on 1i such that Ben = en+l 'Vn E Z. This time, however, since B maps the standard basis onto itself, we find that B is unitary. The reason for the terminology "bilateral shift" is this: denote a typical element of 1i as a "bilateral" sequence (or a sequence extending to infinity in both directions); in order to keep things straight, let us underline the O-th co-ordinate of such a sequence; thus, if x = 2:~=-oo ane n , then we write x = (... ,a-I, ao, al," .); we then find that B( ... ,a-l,aO,al,"')
= (... ,a-2,a-l,aO, ... ).
(3) Consider the Hilbert space H = L²([0, 1], m) (where, of course, m denotes "Lebesgue measure") - see Remark 2.3.7(2) - and let {eₙ : n ∈ ℤ} denote the exponential basis of this Hilbert space - see Example 2.3.2(3). Notice that |eₙ(x)| is identically equal to 1, and conclude that the operator defined by

(Wf)(x) = e₁(x)f(x) ∀ f ∈ H

is necessarily isometric; it should be clear that this is actually unitary, since its inverse is given by the operator of multiplication by e₋₁. It is easily seen that Weₙ = eₙ₊₁ ∀ n ∈ ℤ. If U : ℓ²(ℤ) → H is the unique unitary operator such that U maps the n-th standard basis vector to eₙ, for each n ∈ ℤ, it follows easily that W = UBU*. Thus, the operator W of this example is unitarily equivalent to the bilateral shift (of the previous example).

More is true; let M denote the closed subspace M = [{eₙ : n ≥ 1}]; then M is invariant under W - meaning that W(M) ⊂ M - and it should be clear that the restricted operator W|_M ∈ £(M) is unitarily equivalent to the unilateral shift.

(4) More generally, if (X, B, μ) is any measure space - see the appendix (§A.5) - and if φ : X → ℂ is any measurable function such that |φ| = 1 μ-a.e., then the equation

M_φ f = φf , f ∈ L²(X, B, μ)

defines a unitary operator on L²(X, B, μ) (with inverse given by the operator of multiplication by φ̄). □
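A discrete analogue of (4): multiplying a vector pointwise by a unimodular sequence preserves norms and is inverted by multiplying by the complex conjugate. (A sketch, for illustration only, assuming numpy, with randomly chosen phases.)

```python
import numpy as np

rng = np.random.default_rng(4)
n = 512
phi = np.exp(1j * rng.uniform(0, 2 * np.pi, n))   # |phi| = 1 pointwise

f = rng.standard_normal(n) + 1j * rng.standard_normal(n)

Mf = phi * f                    # the multiplication operator M_phi
back = np.conj(phi) * Mf        # multiplication by conj(phi) inverts it

norm_gap = abs(np.linalg.norm(Mf) - np.linalg.norm(f))   # M_phi is isometric
inverts = np.allclose(back, f)
```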
2.5 Strong and weak convergence
This section is devoted to a discussion of two (vector space) topologies on the space £(H, K) of bounded operators between Hilbert spaces.

Definition 2.5.1 (1) For each x ∈ H, consider the seminorm p_x defined on £(H, K) by p_x(T) = ‖Tx‖; the strong operator topology is the topology on £(H, K) defined by the family {p_x : x ∈ H} of seminorms (see Remark 1.6.5).
(2) For each x ∈ H, y ∈ K, consider the seminorm p_{x,y} defined on £(H, K) by p_{x,y}(T) = |(Tx, y)|; the weak operator topology is the topology on £(H, K) defined by the family {p_{x,y} : x ∈ H, y ∈ K} of seminorms (see Remark 1.6.5).

Thus, it should be clear that if {Tᵢ : i ∈ I} is a net in £(H, K), then the net converges to T ∈ £(H, K) with respect to the strong operator topology precisely when ‖Tᵢx − Tx‖ → 0 ∀ x ∈ H, i.e., precisely when the net {Tᵢx : i ∈ I} converges to Tx with respect to the strong (i.e., norm) topology on K, for every x ∈ H. Likewise, the net {Tᵢ} converges to T with respect to the weak operator topology if and only if ((Tᵢ − T)x, y) → 0 for all x ∈ H, y ∈ K, i.e., if and only if the net {Tᵢx} converges to Tx with respect to the weak topology on K, for every x ∈ H.

At the risk of abusing our normal convention for use of the expressions "strong convergence" and "weak convergence" (which we adopted earlier for general Banach spaces), in spite of the fact that £(H, K) is itself a Banach space, we adopt the following convention for operators: we shall say that a net {Tᵢ} converges to T in £(H, K) (i) uniformly, (ii) strongly, or (iii) weakly, if it is the case that we have convergence with respect to the (i) norm topology, (ii) strong operator topology, or (iii) weak operator topology, respectively, on £(H, K).

Before discussing some examples, we shall establish a simple and useful lemma for deciding strong or weak convergence of a net of operators. (Since the strong and weak operator topologies are vector space topologies, a net {Tᵢ} converges strongly or weakly to T precisely when {Tᵢ − T} converges strongly or weakly to 0; hence we will state our criteria for nets which converge to 0.)

Lemma 2.5.2 Suppose {Tᵢ : i ∈ I} is a net in £(H, K) which is uniformly bounded - i.e., sup{‖Tᵢ‖ : i ∈ I} = K < ∞.
Let S₁ (resp., S₂) be a total set in H (resp., K) - i.e., the set of finite linear combinations of elements in S₁ (resp., S₂) is dense in H (resp., K). Then:
(a) {Tᵢ} converges strongly to 0 if and only if limᵢ ‖Tᵢx‖ = 0 for all x ∈ S₁; and
(b) {Tᵢ} converges weakly to 0 if and only if limᵢ (Tᵢx, y) = 0 for all x ∈ S₁, y ∈ S₂.
Proof. Since the proofs of the two assertions are almost identical, we only prove one of them, namely (b). Let x ∈ H, y ∈ K be arbitrary, and suppose ε > 0 is given. We assume, without loss of generality, that ε < 1. Let M = 1 + max{‖x‖, ‖y‖} and K = 1 + supᵢ ‖Tᵢ‖; (the 1 in these two definitions is for the sole reason of
ensuring that K and M are not zero, and we need not worry about dividing by zero). The assumed totality of the sets Sᵢ, i = 1, 2, implies the existence of vectors x' = Σ_{k=1}^{m} αₖxₖ, y' = Σ_{j=1}^{n} βⱼyⱼ, with the property that xₖ ∈ S₁, yⱼ ∈ S₂ ∀ k, j, and such that ‖x − x'‖ < ε/3KM and ‖y − y'‖ < ε/3KM. Since ε < 1 ≤ K, M, it follows that ‖x'‖ ≤ ‖x‖ + 1 ≤ M; similarly, also ‖y'‖ ≤ M. Let N = 1 + max{|αₖ|, |βⱼ| : 1 ≤ k ≤ m, 1 ≤ j ≤ n}. Since I is a directed set, the assumption that limᵢ (Tᵢxₖ, yⱼ) = 0 ∀ j, k implies that we can find an index i₀ ∈ I with the property that |(Tᵢxₖ, yⱼ)| < ε/3N²mn whenever i ≥ i₀, 1 ≤ k ≤ m, 1 ≤ j ≤ n. It then follows that for all i ≥ i₀, we have:

|(Tᵢx, y)| ≤ |(Tᵢx, y − y')| + |(Tᵢ(x − x'), y')| + |(Tᵢx', y')|
≤ 2KM·(ε/3KM) + Σ_{k=1}^{m} Σ_{j=1}^{n} |αₖβ̄ⱼ|·|(Tᵢxₖ, yⱼ)|
< (2ε/3) + N²mn·(ε/3N²mn)
= ε ,

and the proof is complete. □
Example 2.5.3 (1) Let S ∈ £(ℓ²) be the unilateral shift - see Example 2.4.15(1). Then the sequence {(S*)ⁿ : n = 1, 2, ...} converges strongly, and hence also weakly, to 0. (Reason: the standard basis {eₘ : m ∈ ℕ} is total in ℓ², the sequence {(S*)ⁿ : n = 1, 2, ...} is uniformly bounded, and (S*)ⁿeₘ = 0 ∀ n > m.) On the other hand, {Sⁿ = ((S*)ⁿ)* : n = 1, 2, ...} is a sequence of isometries, and hence certainly does not converge strongly to 0. Thus, the adjoint operation is not "strongly continuous".

(2) On the other hand, it is obvious from the definition that if {Tᵢ : i ∈ I} is a net which converges weakly to T in £(H, K), then the net {Tᵢ* : i ∈ I} converges weakly to T* in £(K, H). In particular, conclude from (1) above that both the sequences {(S*)ⁿ : n = 1, 2, ...} and {Sⁿ : n = 1, 2, ...} converge weakly to 0, but the sequence {(S*)ⁿSⁿ : n = 1, 2, ...} (which is the constant sequence given by the identity operator) does not converge to 0; thus, multiplication is not "jointly weakly continuous".

(3) Multiplication is "separately strongly (resp., weakly) continuous" - i.e., if {Sᵢ} is a net which converges strongly (resp., weakly) to S in £(H, K), and if A ∈ £(K, K₁), B ∈ £(H₁, H), then the net {ASᵢB} converges strongly (resp., weakly) to ASB in £(H₁, K₁). (For instance, the "weak" assertion follows from the equation (ASᵢBx, y) = (Sᵢ(Bx), A*y).)

(4) Multiplication is "jointly strongly continuous" if we restrict ourselves to uniformly bounded nets - i.e., if {Sᵢ : i ∈ I} (resp., {Tⱼ : j ∈ J}) is a uniformly bounded net in £(H₁, H₂) (resp., £(H₂, H₃)) which converges strongly to S (resp.,
T), and if K = I x J is the directed set obtained by coordinate-wise ordering as in Example 2.2.4(4), then the net {Tj 0 Si : (i,j) E K} converges strongly to To S. (Reason: assume, by translating if necessary, that S = 0 and T = 0; (the assertion in (3) above is needed to justify this reduction); if x E HI and if E > 0, first pick io E I such that II Sixil < ~, Vi ~ i o, where M = 1 + SUPj IITj II; then pick an arbitrary jo E J, and note that if (i,j) ~ (io,jo) in K, then IITjSixl1 :::; M11Six11 < E.) (5) The purpose of the following example is to show that the asserted joint strong continuity of multiplication is false if we drop the restriction of uniform boundedness. Let N = {N E £(H) : N 2 = O}, where H is infinite-dimensional. We assert that N is strongly dense in £(H). To see this, note that sets of the form {T E £(H) : II(T - To)xill < E, VI:::; i :::; n}, where To E £(H), {Xl, ... , Xn} is a linearly independent set in H, n E Nand E > 0, constitute a base for the strong operator topology on £('H). Hence, in order to prove the assertion, we need to check that every set of the above form contains elements of N. For this, pick vectors YI, ... ,Yn such that {Xl, ... , Xn , YI, ... , Yn} is a linearly independent set in H, and such that IIToxi - Yi II < E Vi; now consider the operator T defined thus: TXi = Yi, TYi = 0 Vi and Tz = 0 whenever z ..1 {Xl, ... ,Xn , YI,"" Yn}; it is seen that TEN and that T belongs to the open set exhibited above. Since N #- £(H), the assertion (of the last paragraph) shows three things: (a) multiplication is not jointly strongly continuous; (b) if we choose a net in N which converges strongly to an operator which is not in N, then such a net is not uniformly bounded (because of (4) above); thus, strongly convergent nets need not be uniformly bounded; and (c) since a strongly convergent sequence is necessarily uniformly bounded - see Exercise 1.5.17(3) - the strong operator topology cannot be described with "just sequences". 
We wish now to make certain statements concerning (arbitrary) direct sums.
Proposition 2.5.4 Suppose {Mᵢ : i ∈ I} is a family of closed subspaces of a Hilbert space H, which are pairwise orthogonal - i.e., Mᵢ ⊥ Mⱼ for i ≠ j. Let Pᵢ = P_{Mᵢ} denote the orthogonal projection onto Mᵢ. Then the family {Pᵢ : i ∈ I} is unconditionally summable, with respect to the strong (and hence also the weak) operator topology, with sum given by P = P_M, where M is the closed subspace spanned by ∪_{i∈I} Mᵢ.

Conversely, suppose {Mᵢ : i ∈ I} is a family of closed subspaces, suppose Pᵢ = P_{Mᵢ}, suppose the family {Pᵢ : i ∈ I} is unconditionally summable with respect to the weak topology, and suppose the sum P is an orthogonal projection; then it is necessarily the case that Mᵢ ⊥ Mⱼ for i ≠ j, and we are in the situation described by the previous paragraph.

Proof. For any finite subset F ⊂ I, let P(F) = Σ_{i∈F} Pᵢ, and deduce from Exercise 2.3.15(5)(c) that P(F) is the orthogonal projection onto the subspace M(F) = Σ_{i∈F} Mᵢ.
Chapter 2. Hilbert Spaces
Consider the mapping F(I) ∋ F ↦ P(F) ∈ L(H) - where F(I) denotes the net of finite subsets of I, directed by inclusion, as in Example 2.2.4(2). The net {P(F) : F ∈ F(I)} is thus seen to be a net of projections and is hence a uniformly bounded net of operators. Observe that F_1 ≤ F_2 ⇒ F_1 ⊂ F_2 ⇒ M(F_1) ⊂ M(F_2). Hence M is the closure of ∪_{F∈F(I)} M(F). The set S = ∪_{F∈F(I)} M(F) ∪ M^⊥ is clearly a total set in H, and if x ∈ S, then the net {P(F)x} is eventually constant and equal to x or 0, according as x ∈ ∪_{F∈F(I)} M(F) or x ∈ M^⊥. It follows from Lemma 2.5.2 that the net {P(F)} converges strongly to P.

In the converse direction, assume without loss of generality that all the M_i are non-zero subspaces, fix i_0 ∈ I and let x ∈ M_{i_0} be a unit vector. Notice, then, that

1 ≥ ⟨Px, x⟩ = Σ_{i∈I} ⟨P_i x, x⟩ ≥ ⟨P_{i_0} x, x⟩ = 1 ;

conclude that ⟨P_i x, x⟩ = 0 ∀ i ≠ i_0, i.e., M_{i_0} ⊥ M_i for i ≠ i_0, and the proof of the proposition is complete. □
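A finite-dimensional instance of the proposition is easy to check numerically; the sketch below (the subspaces and the helper `proj` are our own choices) verifies that the sum of the projections onto pairwise orthogonal subspaces is again a projection, namely the one onto the span:

```python
import numpy as np

# Pairwise orthogonal subspaces of C^5: M1 = span{e1, e2}, M2 = span{e3}.
E = np.eye(5)
B1, B2 = E[:, :2], E[:, 2:3]

def proj(B):
    """Orthogonal projection onto the column space of B (full column rank)."""
    return B @ np.linalg.inv(B.T @ B) @ B.T

P1, P2 = proj(B1), proj(B2)
P = P1 + P2

assert np.allclose(P1 @ P2, 0)          # M1 ⊥ M2 forces P1 P2 = 0
assert np.allclose(P @ P, P)            # the sum is idempotent ...
assert np.allclose(P, P.T)              # ... and self-adjoint: a projection
assert np.allclose(P, proj(E[:, :3]))   # namely the projection onto M1 + M2
```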
If M and M_i, i ∈ I are closed subspaces of a Hilbert space H as in Proposition 2.5.4, then the subspace M is called the (orthogonal) direct sum of the subspaces M_i, i ∈ I, and we write

M = ⊕_{i∈I} M_i .    (2.5.14)

Conversely, if the equation 2.5.14 is ever written, it is always tacitly understood that the M_i's constitute a family of pairwise orthogonal closed subspaces of a Hilbert space, and that M is related to them as in Proposition 2.5.4.

There is an "external" direct sum construction for an arbitrary family of Hilbert spaces, which is closely related to the notion discussed earlier, and which is described in the following exercise.

Exercise 2.5.5 Suppose {H_i : i ∈ I} is a family of Hilbert spaces; define

H = { ((x_i)) : x_i ∈ H_i ∀i, Σ_{i∈I} ||x_i||^2 < ∞ } .

(a) Show that H is a Hilbert space with respect to the inner product defined by ⟨((x_i)), ((y_i))⟩ = Σ_{i∈I} ⟨x_i, y_i⟩.
(b) Let M_j = {((x_i)) ∈ H : x_i = 0 ∀ i ≠ j} for each j ∈ I; show that M_i, i ∈ I is a family of closed subspaces such that H = ⊕_{i∈I} M_i, and that M_i is naturally isomorphic to H_i, for each i ∈ I.
2.5. Strong and weak convergence
In this situation, also, we shall write H = ⊕_{i∈I} H_i, and there should be no serious confusion as a result of using the same notation for both "internal direct sums" (as in equation 2.5.14) and "external direct sums" (as above).

There is one very useful fact concerning operators on a direct sum, and matrices of operators, which we hasten to point out.

Proposition 2.5.6 Suppose H = ⊕_{j∈J} H_j and K = ⊕_{i∈I} K_i are two direct sums of Hilbert spaces - which we regard as internal direct sums. Let P_i (resp., Q_j) denote the orthogonal projection of K (resp., H) onto K_i (resp., H_j).

(1) If T ∈ L(H, K), define T^i_j x = P_i T x, ∀ x ∈ H_j; then T^i_j ∈ L(H_j, K_i) ∀ i ∈ I, j ∈ J;

(2) The operator T is recaptured from the matrix ((T^i_j)) of operators by the following prescription: if x = ((x_j)) ∈ H, and if Tx = ((y_i)) ∈ K, then y_i = Σ_{j∈J} T^i_j x_j, ∀ i ∈ I, where the series is interpreted as the sum of an unconditionally summable family in K (with respect to the norm in K); (thus, if we think of elements of H (resp., K) as column vectors indexed by J (resp., I) - whose entries are themselves vectors in appropriate Hilbert spaces - then the action of the operator T is given by "matrix multiplication" by the matrix ((T^i_j)).)

(3) For finite subsets I_0 ⊂ I, J_0 ⊂ J, let P(I_0) = Σ_{i∈I_0} P_i and Q(J_0) = Σ_{j∈J_0} Q_j. Then the net {P(I_0)TQ(J_0) : k = (I_0, J_0) ∈ K = F(I) × F(J)} - where K is ordered by coordinate-wise inclusion - converges strongly to T, and

||T|| = lim_{(I_0,J_0)∈K} ||P(I_0)TQ(J_0)|| = sup_{(I_0,J_0)∈K} ||P(I_0)TQ(J_0)|| .    (2.5.15)

(4) Conversely, a matrix ((T^i_j)), where T^i_j ∈ L(H_j, K_i), ∀ i ∈ I, j ∈ J, defines a bounded operator T ∈ L(H, K) as in (2) above, precisely when the family {||S||} - where S ranges over all those matrices which are obtained by replacing all the entries outside a finite number of rows and columns of the given matrix by 0 and leaving the remaining finite "rectangular submatrix" of entries unchanged - is uniformly bounded.

(5) Suppose M = ⊕_{l∈L} M_l is another (internal) direct sum of Hilbert spaces, and suppose S ∈ L(K, M) has the matrix decomposition ((S^l_i)) with respect to the given direct sum decompositions of K and M, respectively; then the matrix decomposition for S ∘ T ∈ L(H, M) (with respect to the given direct sum decompositions of H and M, respectively) is given by (S ∘ T)^l_j = Σ_{i∈I} S^l_i ∘ T^i_j, where this series is unconditionally summable with respect to the strong operator topology.

(6) The assignment of the matrix ((T^i_j)) to the operator T is a linear one, and the matrix associated with T* is given by (T*)^j_i = (T^i_j)*.
Proof. (1) Since ||P_i|| ≤ 1, it is clear that T^i_j ∈ L(H_j, K_i) and that ||T^i_j|| ≤ ||T|| (see Exercise 1.3.4(3)).

(2) Since Σ_{j∈J} Q_j = id_H - where this series is interpreted as the sum of an unconditionally summable family, with respect to the strong operator topology in L(H) - it follows from "separate strong continuity" of multiplication - see Example 2.5.3(3) - that

y_i = P_i T x = P_i T ( Σ_{j∈J} Q_j x ) = Σ_{j∈J} P_i T Q_j x = Σ_{j∈J} T^i_j x_j ,

as desired.

(3) Since the nets {P(I_0) : I_0 ∈ F(I)} and {Q(J_0) : J_0 ∈ F(J)} are uniformly bounded and converge strongly to id_K and id_H respectively, it follows from "joint strong continuity of multiplication on uniformly bounded sets" - see Example 2.5.3(4) - that the net {P(I_0)TQ(J_0) : (I_0, J_0) ∈ F(I) × F(J)} converges strongly to T, and that ||T|| ≤ lim inf_{(I_0,J_0)} ||P(I_0)TQ(J_0)||. (The limit inferior and limit superior of a bounded net of real numbers are defined exactly as such limits of a bounded sequence of real numbers are defined - see equations A.5.16 and A.5.17, respectively, for these latter definitions.) Conversely, since P(I_0) and Q(J_0) are projections, it is clear that ||P(I_0)TQ(J_0)|| ≤ ||T|| for each (I_0, J_0); hence ||T|| ≥ lim sup_{(I_0,J_0)} ||P(I_0)TQ(J_0)||, and the proof of (3) is complete.

(4) Under the hypothesis, for each k = (I_0, J_0) ∈ K = F(I) × F(J), define T_k to be the (clearly bounded linear) map from H to K defined by "matrix multiplication" by the matrix which agrees with ((T^i_j)) in rows and columns indexed by indices from I_0 and J_0 respectively, and has 0's elsewhere; then {T_k : k ∈ K} is seen to be a uniformly bounded net of operators in L(H, K); this net is strongly convergent. (Reason: S = ∪_{J_0∈F(J)} Q(J_0)H is a total set in H, and the net {T_k x : k ∈ K} is easily verified to converge whenever x ∈ S; and we may appeal to Lemma 2.5.2.) If T denotes the strong limit of this net, it is easy to verify that this operator does indeed have the given matrix decomposition.

(5) This is proved exactly as was (2) - since the equation Σ_{i∈I} P_i = id_K implies that Σ_{i∈I} S P_i T = ST, both "series" being interpreted in the strong operator topology.

(6) This is an easy verification, which we leave as an exercise for the reader. □
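The block-matrix bookkeeping of Proposition 2.5.6 can be illustrated in finite dimensions; in the sketch below (the dimensions and the helper `slices` are our own choices), parts (2) and (6) are verified for a random operator:

```python
import numpy as np

rng = np.random.default_rng(1)

# H = H1 ⊕ H2 with dimensions 2 + 3, K = K1 ⊕ K2 with dimensions 1 + 4.
dims_H, dims_K = [2, 3], [1, 4]
T = rng.standard_normal((5, 5))

def slices(dims):
    """Index ranges picking out each summand of the direct sum."""
    out, start = [], 0
    for d in dims:
        out.append(slice(start, start + d))
        start += d
    return out

sH, sK = slices(dims_H), slices(dims_K)
# Block T^i_j = (compression of T to a map H_j -> K_i).
blocks = [[T[sK[i], sH[j]] for j in range(2)] for i in range(2)]

# (2): T is recaptured by block "matrix multiplication".
x = rng.standard_normal(5)
y = T @ x
for i in range(2):
    yi = sum(blocks[i][j] @ x[sH[j]] for j in range(2))
    assert np.allclose(yi, y[sK[i]])

# (6): the (j, i) block of T* is the adjoint of the (i, j) block of T.
Tstar = T.T.conj()
for i in range(2):
    for j in range(2):
        assert np.allclose(Tstar[sH[j], sK[i]], blocks[i][j].T.conj())
```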
We explicitly spell out a special case of Proposition 2.5.6, because of its importance. Suppose H and K are separable Hilbert spaces, and suppose {e_j} and {f_i} are orthonormal bases for H and K, respectively. Then, we have the setting of Proposition 2.5.6, if we set H_j = ℂe_j and K_i = ℂf_i respectively. Since H_j and K_i are 1-dimensional, any operator S : H_j → K_i determines and is determined by a uniquely defined scalar λ, via the requirement that S e_j = λ f_i. If we unravel what the proposition says in this special case, this is what we find:
For each i, j, define t^i_j = ⟨Te_j, f_i⟩; think of an element x ∈ H (resp., y ∈ K) as a column vector x = ((x_j = ⟨x, e_j⟩)) (resp., y = ((y_i = ⟨y, f_i⟩))); then the action of the operator is given by matrix-multiplication, thus: if Tx = y, then y_i = Σ_j t^i_j x_j - where the series is unconditionally summable in ℂ.

In the special case when H = K has finite dimension, say n, if we choose and fix one orthonormal basis {e_1, ..., e_n} for H - thus we choose the f's to be the same as the e's - then the choice of any such (ordered) basis sets up a bijective correspondence between L(H) and M_n(ℂ) which is an "isomorphism of *-algebras", where (i) we think of L(H) as being a *-algebra in the obvious fashion, and (ii) the *-algebra structure on M_n(ℂ) is defined thus: the vector operations, i.e., addition and scalar multiplication, are as in Example 1.1.3(2), the product of two n × n matrices is as in Exercise 1.3.1(7)(ii), and the "adjoint" of a matrix is the "complex conjugate transpose" - i.e., if A is the matrix with (i, j)-th entry given by a^i_j, then the adjoint is the matrix A* with (i, j)-th entry given by the complex conjugate of a^j_i. This is why, for instance, a matrix U ∈ M_n(ℂ) is said to be unitary if U*U = I_n (the identity matrix of size n), and the set U(n) of all such matrices has a natural structure of a compact group.

Exercise 2.5.7 If H is a Hilbert space, and if x, y ∈ H, define T_{x,y} : H → H by T_{x,y}(z) = ⟨z, y⟩x. Show that

(a) T_{x,y} ∈ L(H) and ||T_{x,y}|| = ||x||·||y||.
(b) T*_{x,y} = T_{y,x}, and T_{x,y} T_{z,w} = ⟨z, y⟩ T_{x,w}.
(c) if H is separable, if {e_1, e_2, ..., e_n, ...} is an orthonormal basis for H, and if we write E_{ij} = T_{e_i, e_j}, show that E_{ij} E_{kl} = δ_{jk} E_{il}, and E*_{ij} = E_{ji}, for all i, j, k, l.
(d) Deduce that if H is at least two-dimensional, then L(H) is not commutative.
(e) Show that the matrix representing the operator E_{kl} with respect to the orthonormal basis {e_j} is the matrix with a 1 in the (k, l) entry and 0's elsewhere.
(f) For 1 ≤ i, j ≤ n, let E^i_j ∈ M_n(ℂ) be the matrix which has 1 in the (i, j) entry and 0's elsewhere. (Thus {E^i_j : 1 ≤ i, j ≤ n} is the "standard basis" for M_n(ℂ).) Show that E^i_j E^k_l = δ_{jk} E^i_l ∀ i, j, k, l, and that Σ^n_{i=1} E^i_i is the (n × n) identity matrix I_n.

In the infinite-dimensional separable case, almost exactly the same thing - as in the case of finite-dimensional H - is true, with the only distinction now being that not all infinite matrices will correspond to operators; to see this, let us re-state the description of the matrix thus: the j-th column of the matrix is the sequence of Fourier coefficients of Te_j with respect to the orthonormal basis {e_i : i ∈ ℕ}; in particular, every column must define an element of ℓ^2. Let us see precisely which infinite matrices do arise as the matrix of a bounded operator. Suppose T = ((t^i_j))^∞_{i,j=1} is an infinite matrix. For each n, let T_n denote
the "n-th truncation" of T defined by

(T_n)^i_j = t^i_j if 1 ≤ i, j ≤ n, and (T_n)^i_j = 0 otherwise.

Then it is clear that these truncations do define bounded operators on H, the n-th operator being given by

T_n x = Σ^n_{i,j=1} t^i_j ⟨x, e_j⟩ e_i .

It follows from Proposition 2.5.6(4) that T is the matrix of a bounded operator on H if and only if sup_n ||T_n|| < ∞. It will follow from our discussion in the next chapter that if A_n denotes the n × n matrix with (i, j)-th entry t^i_j, then ||T_n||^2 is nothing but the largest eigenvalue of the matrix A*_n A_n (all of whose eigenvalues are real and non-negative).

We conclude with a simple exercise concerning "direct sums" of operators.

Exercise 2.5.8 Let H = ⊕_{i∈I} H_i be a(n internal) direct sum of Hilbert spaces. Let ⊕_{i∈I} L(H_i) = {T = ((T_i))_{i∈I} : T_i ∈ L(H_i) ∀i, and sup_i ||T_i|| < ∞}.

(a) If T ∈ ⊕_{i∈I} L(H_i), show that there exists a unique operator ⊕_i T_i ∈ L(H) with the property that (⊕_i T_i) x_j = T_j x_j, ∀ x_j ∈ H_j, ∀ j ∈ I; further, ||⊕_i T_i|| = sup_i ||T_i||. (The operator ⊕_i T_i is called the "direct sum" of the family {T_i} of operators.)
(b) If T, S ∈ ⊕_{i∈I} L(H_i), show that ⊕_i (T_i + S_i) = (⊕_i T_i) + (⊕_i S_i), and ⊕_i (T_i S_i) = (⊕_i T_i)(⊕_i S_i).
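The criterion of Proposition 2.5.6(4), together with the promised description of ||T_n||^2 as the largest eigenvalue of A*_n A_n, can be illustrated on a concrete infinite matrix; the rank-one example below is our own choice, not from the text:

```python
import numpy as np

# A concrete infinite matrix: t^i_j = 2^{-(i+j)} for i, j >= 1. It is the matrix
# of a bounded (rank-one) operator, so its truncations have uniformly bounded
# norms, as Proposition 2.5.6(4) requires.
def truncation(n):
    i = np.arange(1, n + 1)
    return np.power(2.0, -(i[:, None] + i[None, :]))   # A_n, the n x n corner

norms = []
for n in (1, 2, 5, 10, 20):
    A = truncation(n)
    # ||T_n||^2 = largest eigenvalue of A_n* A_n (see the discussion above)
    lam_max = np.linalg.eigvalsh(A.T @ A)[-1]
    assert np.isclose(np.sqrt(lam_max), np.linalg.norm(A, 2))
    norms.append(np.sqrt(lam_max))

# The truncation norms increase to a finite limit: here the matrix is
# ((u_i u_j)) with u_i = 2^{-i}, so ||T|| = sum_i 4^{-i} = 1/3.
assert norms == sorted(norms)
assert abs(norms[-1] - 1/3) < 1e-9
```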
Chapter 3
C*-Algebras

3.1 Banach algebras
A Banach algebra is a normed algebra which is complete as a normed space. A normed (or Banach) algebra is said to be unital if it has a multiplicative identity - i. e., if there exists an element, which we shall denote simply by 1, such that Ix = xl = x\lx. Example 3.1.2 (1) The space B(X) of all bounded complex-valued functions on a set X - see Example 1.2.2(2) - is an example of a Banach algebra, with respect to pointwise product of functions - i.e., (fg)(x) = f(x)g(x). Notice that this is a commutative algebra - i.e., xy = yx \Ix, y. (2) If A o is a normed algebra, and if 60 is a vector subspace which is closed under mUltiplication, then 60 is a normed algebra in its own right (with the structure induced by the one on A o) and is called a normed subalgebra of A o; a normed subalgebra 6 of a Banach algebra A is a Banach algebra (in the induced structure) if and only if 6 is a closed subspace of A. (As in this example, we shall use the notational convention of using the subscript 0 for not necessarily complete normed algebras, and dropping the 0 subscript if we are dealing with Banach algebras.) 61
(3) Suppose A_0 is a normed algebra, and A denotes its completion as a Banach space. Then A acquires a natural Banach algebra structure as follows: if x, y ∈ A, pick sequences {x_n}, {y_n} in A_0 such that x_n → x, y_n → y. Then, pick a constant K such that ||x_n||, ||y_n|| ≤ K ∀n, note that

||x_n y_n - x_m y_m|| = ||x_n(y_n - y_m) + (x_n - x_m)y_m|| ≤ ||x_n(y_n - y_m)|| + ||(x_n - x_m)y_m|| ≤ ||x_n||·||y_n - y_m|| + ||x_n - x_m||·||y_m|| ≤ K( ||y_n - y_m|| + ||x_n - x_m|| ) ,

and conclude that {x_n y_n} is a Cauchy sequence and hence convergent; define xy = lim_n x_n y_n.

To see that the "product" we have proposed is unambiguously defined, we must verify that the above limit is independent of the choice of the approximating sequences, and depends only on x and y. To see that this is indeed the case, suppose x_{i,n} → x, y_{i,n} → y, i = 1, 2. Define x_{2n-1} = x_{1,n}, x_{2n} = x_{2,n}, y_{2n-1} = y_{1,n}, y_{2n} = y_{2,n}, and apply the preceding reasoning to conclude that {x_n y_n} is a convergent sequence, with limit z, say; then also {x_{1,n} y_{1,n} = x_{2n-1} y_{2n-1}} converges to z, since a subsequence of a convergent sequence converges to the same limit; similarly, also x_{2,n} y_{2,n} → z. Thus, we have unambiguously defined the product of any two elements of A.

We leave it to the reader to verify that (a) with the product so defined, A indeed satisfies the requirements of a Banach algebra, and (b) the multiplication, when restricted to elements of A_0, coincides with the initial multiplication in A_0, and hence A_0 is a dense normed subalgebra of A.

(4) The set C_c(ℝ), consisting of continuous functions on ℝ which vanish outside a bounded interval (which interval will typically depend on the particular function in question), is a normed subalgebra of B(ℝ), and the set C_0(ℝ), which may be defined as the closure in B(ℝ) of C_c(ℝ), is a Banach algebra. More generally, if X is any locally compact Hausdorff space, then the set C_c(X) of continuous functions on X which vanish outside a compact set in X is a normed subalgebra of B(X), and its closure in B(X) is the Banach algebra C_0(X) consisting of continuous functions on X which vanish at ∞. In particular, if X is a compact Hausdorff space, then the space C(X) of all continuous functions on X is a Banach algebra.
(5) The Banach space ℓ^1(ℤ) is a Banach algebra, if multiplication is defined thus: if we think of elements of ℓ^1 as functions f : ℤ → ℂ with the property that ||f||_1 = Σ_{n∈ℤ} |f(n)| < ∞, then the (so-called convolution) product is defined by

(f * g)(n) = Σ_{m∈ℤ} f(m) g(n - m) .    (3.1.1)

To see this, note - using the change of variables l = n - m - that

Σ_{m,n} |f(m) g(n - m)| = Σ_{m,l} |f(m) g(l)| = ||f||_1 · ||g||_1 < ∞ ,
and conclude that if f, g ∈ ℓ^1(ℤ), then, indeed, the series defining (f * g)(n) in equation 3.1.1 is (absolutely) convergent. The argument given above actually shows that also ||f * g||_1 ≤ ||f||_1 · ||g||_1. An alternative way of doing this would have been to verify that the set c_c(ℤ), consisting of functions on ℤ which vanish outside a finite set, is a normed algebra with respect to convolution product, and then saying that ℓ^1(ℤ) is the Banach algebra obtained by completing this normed algebra, as discussed in (3) above.

There is a continuous analogue of this, as follows: define a convolution product in C_c(ℝ) - see (4) above - as follows:

(f * g)(x) = ∫^∞_{-∞} f(y) g(x - y) dy ;    (3.1.2)

a "change of variable argument" entirely analogous to the discrete case discussed above shows that if we define

||f||_1 = ∫^∞_{-∞} |f(x)| dx ,    (3.1.3)

then C_c(ℝ) becomes a normed algebra when equipped with the norm ||·||_1 and convolution product; the Banach algebra completion of this normed algebra - at least one manifestation of it - is the space L^1(ℝ, m), where m denotes Lebesgue measure.

The preceding constructions have analogues in any "locally compact topological group"; thus, suppose G is a group which is equipped with a topology with respect to which multiplication and inversion are continuous, and suppose that G is a locally compact Hausdorff space. For g ∈ G, let L_g be the homeomorphism of G given by left multiplication by g; it is easy to see (and useful to bear in mind) that G is homogeneous as a topological space, i.e., the group of homeomorphisms acts transitively on G (meaning that given any two points g, h ∈ G, there is a homeomorphism - namely L_{hg^{-1}} - which maps g to h).

Useful examples of topological groups to bear in mind are: (i) the group G(A) of invertible elements in a unital Banach algebra A - of which more will be said soon - (with respect to the norm and the product in A); (ii) the group U(H) of unitary operators on a Hilbert space H (with respect to the norm and the product in L(H)); and (iii) the finite-dimensional (or matricial) special cases of the preceding examples GL_n(ℂ) and the compact group U(n, ℂ) = {U ∈ M_n(ℂ) : U*U = I_n} (where U* denotes the complex conjugate transpose of the matrix U). (When A and H are one-dimensional, we recover the groups ℂ^× (of non-zero complex numbers) and 𝕋 (of complex numbers of unit modulus); also note that in case A (resp., H) is finite-dimensional, then the examples in (i) (resp., (ii)) above are instances of locally compact (resp., compact) groups.)

Let C_c(G) be the space of compactly supported continuous functions on G, regarded as being normed by ||·||_∞; the map L_g of G induces a map, via composition, on C_c(G) by λ_g(f) = f ∘ L_{g^{-1}}. (The reason for the inverse is "functorial",
meaning that it is only this way that we can ensure that λ_{gh} = λ_g ∘ λ_h); thus g ↦ λ_g defines a "representation" of G on C_c(G) (meaning a group homomorphism λ : G → G(L(C_c(G))), in the notation of example (i) of the preceding paragraph). The map λ is "strongly continuous", meaning that if g_i → g (is a convergent net in G), then ||λ_{g_i} f - λ_g f||_∞ → 0 ∀ f ∈ C_c(G).

Analogous to the case of ℝ or ℤ, it is a remarkable fact that there is always a bounded linear functional m_G ∈ C_c(G)* which is "(left-)translation invariant" in the sense that m_G ∘ λ_g = m_G ∀ g ∈ G, and satisfies the 'regularity condition' that m_G(f) > 0 whenever 0 ≠ f ∈ C_c(G) and f ≥ 0. It is then a consequence of the Riesz Representation theorem (see the Appendix - §A.7) that there exists a unique measure - which we shall denote by μ_G - defined on the Borel sets of G such that m_G(f) = ∫_G f dμ_G. This measure μ_G inherits translational invariance from m_G - i.e., μ_G ∘ L_g = μ_G. It is a fact that the functional m_G, and consequently the measure μ_G, is (essentially) uniquely determined by the translational invariance condition (up to scaling by a positive scalar); this unique measure is referred to as (left) Haar measure on G. It is then the case that C_c(G) can alternatively be normed by

||f||_1 = ∫_G |f| dμ_G ,

and that it now becomes a normed algebra with respect to convolution product

(f * g)(t) = ∫_G f(s) g(s^{-1}t) dμ_G(s) .    (3.1.4)
The completion of this normed algebra - at least one version of it - is given by L^1(G, μ_G).

(6) Most examples presented so far have been commutative algebras - meaning that xy = yx ∀x, y. (It is a fact that L^1(G) is commutative precisely when G is commutative, i.e., abelian.) The following one is not. If X is any normed space and X is not 1-dimensional, then L(X) is a non-commutative normed algebra (prove this fact!); further, L(X) is a Banach algebra if and only if X is a Banach space. (The proof of one implication goes along the lines of the proof of the fact that C[0,1] is complete; for the other, consider the embedding of X into L(X) given by L_y(x) = φ(x)y, where φ is a fixed linear functional of norm 1.) In particular - see Exercise 2.5.7 - M_n(ℂ) is an example of a non-commutative Banach algebra for each n = 2, 3, .... □

Of the preceding examples, the unital algebras are B(X) (X any set), C(X) (X a compact Hausdorff space), ℓ^1(ℤ) (and more generally, L^1(G) in case the group G is equipped with the discrete topology), and L(X) (X any normed space). Verify this assertion by identifying the multiplicative identity in these examples, and showing that the other examples do not have any identity element.
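A minimal sketch of the convolution product 3.1.1 on finitely supported sequences, i.e., on c_c(ℤ) ⊂ ℓ^1(ℤ); the dictionary representation and the sample functions are our own choices:

```python
import numpy as np

# Finitely supported f, g : Z -> C, represented as dicts {n : f(n)}.
f = {0: 1.0, 1: -2.0, 3: 0.5}
g = {-1: 2.0, 2: 1.0}

def conv(f, g):
    """Convolution product (f * g)(n) = sum_m f(m) g(n - m), equation 3.1.1."""
    h = {}
    for m, fm in f.items():
        for k, gk in g.items():
            h[m + k] = h.get(m + k, 0.0) + fm * gk
    return h

def norm1(f):
    return sum(abs(v) for v in f.values())

h = conv(f, g)
assert np.isclose(h[2], f[0] * g[2] + f[3] * g[-1])   # the m-sum for n = 2
assert norm1(h) <= norm1(f) * norm1(g) + 1e-12        # ||f*g||_1 <= ||f||_1 ||g||_1
assert conv(f, {0: 1.0}) == f                          # delta_0 is the identity
assert conv(f, g) == conv(g, f)                        # this algebra is commutative
```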
Just as it is possible to complete a normed algebra and manufacture a Banach algebra out of an incomplete normed algebra, there is a (and in fact, more than one) way to start with a normed algebra without identity and manufacture a near relative which is unital. This, together with some other elementary facts concerning algebras, is dealt with in the following exercises.

Exercise 3.1.3 (1) If A_0 is a normed algebra, consider the algebra defined by A_0^+ = A_0 ⊕ ℂ; thus, A_0^+ = {(x, α) : x ∈ A_0, α ∈ ℂ} as a vector space, and the product and norm are defined by (x, α)(y, β) = (xy + αy + βx, αβ) and ||(x, α)|| = ||x|| + |α|. Show that with these definitions,
(a) A_0^+ is a unital normed algebra;
(b) the map A_0 ∋ x ↦ (x, 0) ∈ A_0^+ is an isometric isomorphism of the algebra A_0 onto the subalgebra {(x, 0) : x ∈ A_0} of A_0^+;
(c) A_0 is a Banach algebra if and only if A_0^+ is a Banach algebra.
(2) Show that multiplication is jointly continuous in a normed algebra, meaning that x_n → x, y_n → y ⇒ x_n y_n → xy.

(3) If A_0 is a normed algebra, show that a multiplicative identity, should one exist, is necessarily unique; what can you say about the norm of the identity element?

(4) (a) Suppose I is a closed ideal in a normed algebra A_0 - i.e., I is a (norm-)closed subspace of A_0, such that whenever x, y ∈ I, z ∈ A_0, α ∈ ℂ, we have αx + y, xz, zx ∈ I; then show that the quotient space A_0/I - with the normed space structure discussed in Exercise 1.5.3(3) - is a normed algebra with respect to multiplication defined (in the obvious manner) as: (z + I)(z' + I) = (zz' + I).
(b) if I is a closed ideal in a Banach algebra A, deduce that A/I is a Banach algebra.
(c) in the notation of Exercise (1) above, show that {(x, 0) : x ∈ A_0} is a closed ideal - call it I - of A_0^+ such that A_0^+/I ≅ ℂ.

We assume, for the rest of this section, that A is a unital Banach algebra with (multiplicative) identity 1; we shall assume that A ≠ {0}, or equivalently, that 1 ≠ 0. As with 1, we shall simply write λ for λ1. We shall call an element x ∈ A invertible (in A) if there exists an element - which is necessarily unique and which we shall always denote by x^{-1} - such that xx^{-1} = x^{-1}x = 1, and we shall denote the collection of all such invertible elements of A as G(A). (This is obviously a group with respect to multiplication, and is sometimes referred to as the "group of units" in A.) We gather together some basic facts concerning units in a Banach algebra in the following proposition.

Proposition 3.1.4 (1) If x ∈ G(A), and if we define L_x (to be the map given by left-multiplication by x) as L_x(y) = xy, ∀ y ∈ A, then L_x ∈ L(A) and further, L_x is an invertible operator, with (L_x)^{-1} = L_{x^{-1}};
(2) If x ∈ G(A) and y ∈ A, then xy ∈ G(A) ⇔ yx ∈ G(A) ⇔ y ∈ G(A).

(3) If {x_1, x_2, ..., x_n} ⊂ A is a set of pairwise commuting elements - i.e., x_i x_j = x_j x_i ∀ i, j - then the product x_1 x_2 ··· x_n is invertible if and only if each x_i is invertible.

(4) If x ∈ A and ||1 - x|| < 1, then x ∈ G(A) and

x^{-1} = Σ^∞_{n=0} (1 - x)^n ,    (3.1.5)

where we use the convention that y^0 = 1. In particular, if λ ∈ ℂ and |λ| > ||x||, then (x - λ) is invertible and

(x - λ)^{-1} = - Σ^∞_{n=0} x^n / λ^{n+1} .    (3.1.6)

(5) G(A) is an open set in A and the mapping x ↦ x^{-1} is a homeomorphism of G(A) onto itself.

Proof. (1) Obvious.

(2) Clearly if y is invertible, so is xy (resp., yx); conversely, if xy (resp., yx) is invertible, so is y = x^{-1}(xy) (resp., y = (yx)x^{-1}).

(3) One implication follows from the fact that G(A) is closed under multiplication. Suppose, conversely, that x = x_1 x_2 ··· x_n is invertible; it is clear that x_i commutes with x, for each i; it follows easily that each x_i also commutes with x^{-1}, and that the inverse of x_i is given by the product of x^{-1} with the product of all the x_j's with j ≠ i. (Since all these elements commute, it does not matter in what order they are multiplied.)

(4) The series on the right side of equation 3.1.5 is (absolutely, and hence) summable (since the appropriate geometric series converges under the hypothesis); let {s_n = Σ^n_{i=0} (1 - x)^i} be the sequence of partial sums, and let s denote the limit; clearly each s_n commutes with (1 - x), and also, (1 - x)s_n = s_n(1 - x) = s_{n+1} - 1; hence, in the limit, (1 - x)s = s(1 - x) = s - 1, i.e., xs = sx = 1. As for the second equation in (4), we may apply equation 3.1.5 to (1 - x/λ) in place of x, and justify the steps

(x - λ)^{-1} = -λ^{-1} (1 - x/λ)^{-1} = -λ^{-1} Σ^∞_{n=0} (x/λ)^n = - Σ^∞_{n=0} x^n / λ^{n+1} .
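The Neumann series 3.1.5 used in part (4) converges geometrically, which is easy to check numerically; a sketch for a matrix x with ||1 - x|| < 1 (the matrix and the cutoff are our own choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)

# An element x of the Banach algebra M_3(C) with ||1 - x|| < 1.
n = 3
E = 0.3 * rng.standard_normal((n, n))
E /= max(1.0, 2 * np.linalg.norm(E, 2))    # ensure ||E|| <= 1/2
x = np.eye(n) - E                           # so ||1 - x|| = ||E|| < 1

# Partial sums s_N = sum_{k=0}^{N} (1 - x)^k of the series 3.1.5.
s, term = np.zeros((n, n)), np.eye(n)
for _ in range(60):
    s += term
    term = term @ E                         # next power of (1 - x)

# x s = (1 - E)(1 + E + ... + E^59) = 1 - E^60, so s inverts x up to ~2^-60.
assert np.linalg.norm(np.eye(n) - x @ s, 2) < 1e-12
assert np.allclose(s, np.linalg.inv(x))
```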
(5) The assertions in the first half of (4) imply that 1 is in the interior of G(A) and is a point of continuity of the inversion map; by the already established (1) and (2) of this proposition, we may "transport this local picture at 1" via the linear homeomorphism L_x (which maps G(A) onto itself) to a "corresponding local picture at x", and deduce that any element of G(A) is an interior point and is a point of continuity of the inversion map. □

We now come to the fundamental notion in the theory.

Definition 3.1.5 (a) Let A be a unital Banach algebra, and let x ∈ A. Then the spectrum of x is the set, denoted by σ(x), defined by
σ(x) = {λ ∈ ℂ : (x - λ) is not invertible} ,    (3.1.7)

and the spectral radius of x is defined by

r(x) = sup{ |λ| : λ ∈ σ(x) } .    (3.1.8)

(This is thus the radius of the smallest disc centered at 0 in ℂ which contains the spectrum. Strictly speaking, the last sentence makes sense only if the spectrum is a bounded set; we shall soon see that this is indeed the case.)

(b) The resolvent set of x is the complementary set ρ(x) defined by

ρ(x) = ℂ - σ(x) = {λ ∈ ℂ : (x - λ) is invertible} ,    (3.1.9)

and the map

ρ(x) ∋ λ ↦ R_x(λ) = (x - λ)^{-1} ∈ G(A)    (3.1.10)

is called the resolvent function of x.

To start with, we observe - from the second half of Proposition 3.1.4(4) - that

r(x) ≤ ||x|| .    (3.1.11)

Next, we deduce - from Proposition 3.1.4(5) - that ρ(x) is an open set in ℂ and that the resolvent function of x is continuous. It follows immediately that σ(x) is a compact set.

In the proof of the following proposition, we shall use some elementary facts from complex function theory; the reader who is unfamiliar with the notions required is urged to fill in the necessary background from any standard book on the subject (such as [Ahl], for instance).

Proposition 3.1.6 Let x ∈ A; then,

(a) lim_{|λ|→∞} ||R_x(λ)|| = 0; and

(b) (Resolvent equation)

R_x(λ) - R_x(μ) = (λ - μ) R_x(λ) R_x(μ) , ∀ λ, μ ∈ ρ(x) ;
(c) the resolvent function is "weakly analytic", meaning that if φ ∈ A*, then the map φ ∘ R_x is an analytic function (defined on the open set ρ(x) ⊂ ℂ), such that lim_{|λ|→∞} φ ∘ R_x(λ) = 0.

Proof. (a) It is an immediate consequence of equation 3.1.6 that there exists a constant C > 0 such that

||R_x(λ)|| ≤ C / |λ| , ∀ |λ| > ||x|| + 1 ,

and assertion (a) follows.

(b) If λ, μ ∈ ρ(x), then

(λ - μ) R_x(λ) R_x(μ) = (λ - μ)(x - λ)^{-1}(x - μ)^{-1} = R_x(λ) ( (x - μ) - (x - λ) ) R_x(μ) = R_x(λ) - R_x(μ) .

(c) If φ ∈ A* is any continuous linear functional on A, it follows from Proposition 3.1.4(5) and the resolvent equation above, that if μ ∈ ρ(x), and if λ is sufficiently close to μ, then λ ∈ ρ(x), and

lim_{λ→μ} φ( (R_x(λ) - R_x(μ)) / (λ - μ) ) = lim_{λ→μ} φ( R_x(λ) R_x(μ) ) = φ( R_x(μ)^2 ) ,

thereby establishing (complex) differentiability, i.e., analyticity, of φ ∘ R_x at the point μ. It is an immediate consequence of (a) above and the boundedness of the linear functional φ that lim_{|λ|→∞} φ ∘ R_x(λ) = 0, and the proof is complete. □

Theorem 3.1.7 If A is any Banach algebra and x ∈ A, then σ(x) is a non-empty compact subset of ℂ.

Proof. Assume the contrary, which means that ρ(x) = ℂ. Let φ ∈ A* be arbitrary. In view of Proposition 3.1.6(c), the assumption ρ(x) = ℂ means that φ ∘ R_x is an "entire" function - meaning a function which is "complex-differentiable" throughout ℂ - which vanishes at infinity; it then follows from Liouville's theorem that φ ∘ R_x(λ) = 0 ∀ λ ∈ ℂ. (For instance, see [Ahl] for Liouville's theorem.) Since φ was arbitrary, and since A* "separates points of A" - see Exercise 1.5.3(1) - we may conclude that R_x(λ) = 0 ∀ λ ∈ ℂ; but since R_x(λ) ∈ G(A), this is absurd, and the contradiction completes the proof. □
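Both the expansion 3.1.6 and the resolvent equation of Proposition 3.1.6(b) can be verified numerically for a matrix; in the sketch below, the matrix, the points λ, μ, and the cutoffs are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((4, 4))
I = np.eye(4)

def R(lam):
    """Resolvent R_x(lam) = (x - lam)^{-1}, defined off sigma(x)."""
    return np.linalg.inv(x - lam * I)

lam, mu = 10 + 7j, -8 - 2j                 # both of modulus > ||x||
assert min(abs(lam), abs(mu)) > np.linalg.norm(x, 2)

# Equation 3.1.6: (x - lam)^{-1} = - sum_n x^n / lam^{n+1} for |lam| > ||x||.
S = sum(np.linalg.matrix_power(x, n) / lam ** (n + 1) for n in range(200))
assert np.allclose(R(lam), -S)

# The resolvent equation of Proposition 3.1.6(b).
assert np.allclose(R(lam) - R(mu), (lam - mu) * R(lam) @ R(mu))

# Proposition 3.1.6(a): ||R(lam)|| -> 0 as |lam| -> infinity.
assert np.linalg.norm(R(1e6), 2) < 1e-5
```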
Remark 3.1.8 Suppose A = M_n(ℂ). It is then a consequence of basic properties of the determinant mapping - see the Appendix (§A.1) - that if T ∈ M_n(ℂ), then λ ∈ σ(T) if and only if λ is an eigenvalue of the matrix T, i.e., p_T(λ) = 0, where p_T(z) = det(T - z) denotes the characteristic polynomial of the matrix T. On the other hand, it is true - see Exercise A.1.19(c) - that any polynomial of degree n, with leading coefficient equal to (-1)^n, is the characteristic polynomial of some matrix in M_n(ℂ). Hence the statement that σ(T) is a non-empty set, for every T ∈ M_n(ℂ), is equivalent to the statement that every complex polynomial has a complex root, viz., the so-called Fundamental Theorem of Algebra. Actually, the fundamental theorem is the statement that every complex polynomial of degree N is expressible as a product of N polynomials of degree 1; this version is easily derived, by an induction argument, from the slightly weaker statement earlier referred to as the fundamental theorem. □

Proposition 3.1.9 If p(z) = Σ^N_{n=0} a_n z^n is a polynomial with complex coefficients, and if we define p(x) = a_0 · 1 + Σ^N_{n=1} a_n x^n for each x ∈ A, then

(i) p(z) ↦ p(x) is a homomorphism from the algebra ℂ[z] of complex polynomials onto the subalgebra of A which is generated by {1, x};

(ii) (spectral mapping theorem) if p is any polynomial as above, then

σ( p(x) ) = p( σ(x) ) = { p(λ) : λ ∈ σ(x) } .

Proof. The first assertion is easily verified. As for (ii), temporarily fix λ ∈ ℂ; if p(z) = Σ^N_{n=0} a_n z^n, assume without loss of generality that a_N ≠ 0, and (appeal to the fundamental theorem of algebra - see Remark 3.1.8 - to) conclude that there exist λ_1, ..., λ_N such that

p(z) - λ = a_N ∏^N_{i=1} (z - λ_i) .

(Thus the λ_i are all the zeros, counted according to multiplicity, of the polynomial p(z) - λ.) Deduce from part (i) of this Proposition that

p(x) - λ = a_N ∏^N_{i=1} (x - λ_i) .

Conclude now, from Proposition 3.1.4(3), that

λ ∉ σ( p(x) ) ⇔ λ_i ∉ σ(x) ∀ 1 ≤ i ≤ N ⇔ λ ∉ p( σ(x) ) ,

and the proof is complete. □
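The spectral mapping theorem is easy to check numerically for A = M_n(ℂ), where the spectrum is the set of eigenvalues (Remark 3.1.8); the polynomial and the matrix below are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((5, 5))

# p(z) = 2z^3 - z + 4, applied to the matrix T (constant term times the identity).
def p_scalar(z):
    return 2 * z**3 - z + 4

def p_matrix(T):
    I = np.eye(T.shape[0])
    return 2 * np.linalg.matrix_power(T, 3) - T + 4 * I

# sigma(p(T)) = p(sigma(T)): compare the two eigenvalue lists as sorted sets.
lhs = np.linalg.eigvals(p_matrix(T))
rhs = p_scalar(np.linalg.eigvals(T))
assert np.allclose(np.sort_complex(lhs), np.sort_complex(rhs), atol=1e-6)
```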
The next result is a quantitative strengthening of the assertion that the spectrum is non-empty. Theorem 3.1.10 (spectral radius formula) If A is a Banach algebra and if x E A, then (3.1.12) r(x) = lim Ilxnll';'. n--->oo
Proof. To start with, fix an arbitrary ¢ E A*, and let F = ¢ 0 Rx; by definition, this is analytic in the exterior of the disc {A E C : IAI :::; r(x)}; on the other hand, we may conclude from equation 3.1.6 that F has the Laurent series expansion
F(A) = _ ~ ¢(xn) L An+1 ' n=O
which is valid in the exterior of the disc $\{\lambda \in \mathbb{C} : |\lambda| \leq \|x\|\}$. Since $F$ vanishes at infinity, the function $F$ is actually analytic at the point at infinity, and consequently the Laurent expansion above is actually valid in the larger region $|\lambda| > r(x)$. So if we temporarily fix a $\lambda$ with $|\lambda| > r(x)$, then we find, in particular, that
$$\lim_{n \to \infty} \frac{\phi(x^n)}{\lambda^n} = 0.$$
Since $\phi$ was arbitrary, it follows from the uniform boundedness principle - see Exercise 1.5.17(1) - that there exists a constant $K > 0$ such that $\|x^n\| \leq K|\lambda|^n \ \forall n$. Hence, $\|x^n\|^{\frac{1}{n}} \leq K^{\frac{1}{n}}|\lambda|$. By allowing $|\lambda|$ to decrease to $r(x)$, we may conclude that
$$\limsup_n \|x^n\|^{\frac{1}{n}} \leq r(x). \qquad (3.1.13)$$
On the other hand, deduce from Proposition 3.1.9(ii) and the inequality 3.1.11 that $r(x) = r(x^n)^{\frac{1}{n}} \leq \|x^n\|^{\frac{1}{n}}$, whence we have
$$r(x) \leq \liminf_n \|x^n\|^{\frac{1}{n}}, \qquad (3.1.14)$$
and the theorem is proved. $\Box$
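The spectral radius formula can be watched in action in $M_2(\mathbb{C})$ with the operator norm. The deliberately non-normal matrix below (our own example) has $\|x\|$ around 10 while $r(x) = 0.5$, yet $\|x^n\|^{1/n}$ creeps down towards $r(x)$:

```python
import numpy as np

# x is upper triangular with eigenvalues 0.5 and 0.4, so r(x) = 0.5,
# but the large off-diagonal entry makes the operator norm about 10.
x = np.array([[0.5, 10.0],
              [0.0,  0.4]])

r = max(abs(np.linalg.eigvals(x)))                       # r(x) = 0.5
seq = [np.linalg.norm(np.linalg.matrix_power(x, n), 2) ** (1.0 / n)
       for n in (1, 5, 20, 80)]

assert seq[0] > 5.0                  # ||x|| is far larger than r(x)
assert abs(seq[-1] - r) < 0.05       # but ||x^n||^(1/n) approaches r(x)
```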
Before proceeding further, we wish to draw the reader's attention to one point, as a sort of note of caution; this pertains to the possible dependence of the notion of the spectrum on the ambient Banach algebra. Thus, suppose $A$ is a unital Banach algebra, and suppose $B$ is a unital Banach subalgebra of $A$; suppose now that $x \in B$; then, for any intermediate Banach subalgebra $B \subset C \subset A$, we can talk of
$$\rho_C(x) = \{\lambda \in \mathbb{C} : \exists z \in C \text{ such that } z(x - \lambda) = (x - \lambda)z = 1\},$$
and $\sigma_C(x) = \mathbb{C} - \rho_C(x)$. The following result describes the relations between the different notions of spectra.
Proposition 3.1.11 Let $1, x \in B \subset A$, as above. Then,
(a) $\sigma_B(x) \supset \sigma_A(x)$;
(b) $\partial(\sigma_B(x)) \subset \partial(\sigma_A(x))$, where $\partial(\Sigma)$ denotes the "topological boundary" of the subset $\Sigma \subset \mathbb{C}$. (This is the set defined by $\partial\Sigma = \overline{\Sigma} \cap \overline{\mathbb{C} - \Sigma}$; thus, $\lambda \in \partial\Sigma$ if and only if there exist sequences $\{\lambda_n\} \subset \Sigma$ and $\{z_n\} \subset \mathbb{C} - \Sigma$ such that $\lambda = \lim_n \lambda_n = \lim_n z_n$.)

Proof. (a) We need to show that $\rho_B(x) \subset \rho_A(x)$. So suppose $\lambda \in \rho_B(x)$; this means that there exists $z \in B \subset A$ such that $z(x - \lambda) = (x - \lambda)z = 1$; hence, $\lambda \in \rho_A(x)$.
(b) Suppose $\lambda \in \partial(\sigma_B(x))$; since the spectrum is closed, this means that $\lambda \in \sigma_B(x)$ and that there exists a sequence $\{\lambda_n\} \subset \rho_B(x)$ such that $\lambda_n \to \lambda$. Since (by (a)) $\lambda_n \in \rho_A(x) \ \forall n$, we only need to show that $\lambda \in \sigma_A(x)$; if this were not the case, then $(x - \lambda_n) \to (x - \lambda) \in \mathcal{G}(A)$, which would imply - by Proposition 3.1.4(5) - that $(x - \lambda_n)^{-1} \to (x - \lambda)^{-1}$; but the assumptions that $\lambda_n \in \rho_B(x)$ and that $B$ is a Banach subalgebra of $A$ would then force the conclusion $(x - \lambda)^{-1} \in B$, contradicting the assumption that $\lambda \in \sigma_B(x)$. $\Box$

The purpose of the next example is to show that it is possible to have strict inclusion in Proposition 3.1.11(a).
Example 3.1.12 In this example, and elsewhere in this book, we shall use the symbols $\mathbb{D}$, $\overline{\mathbb{D}}$ and $\mathbb{T}$ to denote the open unit disc, the closed unit disc and the unit circle in $\mathbb{C}$, respectively. (Thus, if $z \in \mathbb{C}$, then $z \in \mathbb{D}$ (resp., $\overline{\mathbb{D}}$, resp., $\mathbb{T}$) if and only if the absolute value of $z$ is less than (resp., not greater than, resp., equal to) 1.)
The disc algebra is, by definition, the class of all functions which are analytic in $\mathbb{D}$ and have continuous extensions to $\overline{\mathbb{D}}$; thus,
$$A(\mathbb{D}) = \{f \in C(\overline{\mathbb{D}}) : f|_{\mathbb{D}} \text{ is analytic}\}.$$
It is easily seen that $A(\mathbb{D})$ is a (closed subalgebra of $C(\overline{\mathbb{D}})$ and consequently a) Banach algebra. (Verify this!) Further, if we let $T : A(\mathbb{D}) \to C(\mathbb{T})$ denote the "restriction map", it follows from the maximum modulus principle that $T$ is an isometric algebra isomorphism of $A(\mathbb{D})$ into $C(\mathbb{T})$. Hence, we may - and do - regard $B = A(\mathbb{D})$ as a Banach subalgebra of $A = C(\mathbb{T})$.
It is another easy exercise - which the reader is urged to carry out - to verify that if $f \in B$, then $\sigma_B(f) = f(\overline{\mathbb{D}})$, while $\sigma_A(f) = f(\mathbb{T})$. Thus, for example, if we consider $f_0(z) = z$, then we see that $\sigma_B(f_0) = \overline{\mathbb{D}}$, while $\sigma_A(f_0) = \mathbb{T}$.
The preceding example more or less describes how two compact subsets in $\mathbb{C}$ must look, if they are to satisfy the two conditions stipulated by (a) and (b) of Proposition 3.1.11. Thus, in general, if $\Sigma_0, \Sigma \subset \mathbb{C}$ are compact sets such that $\Sigma \subset \Sigma_0$ and $\partial\Sigma_0 \subset \partial\Sigma$, then it is the case that $\Sigma_0$ is obtained by "filling in some of the holes in $\Sigma$". (The reader is urged to make sense of this statement and to then try and prove it.) $\Box$
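The dichotomy in the example above can be seen numerically: $1/(z - \lambda)$, restricted to $\mathbb{T}$, has an analytic extension to $\mathbb{D}$ exactly when its negative-index Fourier coefficients vanish. A sketch (the sample size and the name `neg_energy` are ours):

```python
import numpy as np

# Sample g(e^{it}) = 1/(e^{it} - lam) on the circle and inspect the
# negative Fourier coefficients via the FFT: indices N/2..N-1 of the
# normalised FFT approximate the coefficients of z^{-N/2}, ..., z^{-1}.
N = 1024
t = 2 * np.pi * np.arange(N) / N
z = np.exp(1j * t)

def neg_energy(lam):
    c = np.fft.fft(1.0 / (z - lam)) / N
    return np.sum(np.abs(c[N // 2:]) ** 2)

assert neg_energy(2.0) < 1e-10   # |lam| > 1: 1/(z - lam) lies in A(D)
assert neg_energy(0.5) > 0.1     # |lam| < 1: invertible in C(T) only
```

This matches $\sigma_B(f_0) = \overline{\mathbb{D}}$ versus $\sigma_A(f_0) = \mathbb{T}$: for $0 < |\lambda| < 1$ the inverse of $f_0 - \lambda$ exists in $C(\mathbb{T})$ but not in the disc algebra.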
3.2 Gelfand-Naimark theory
Definition 3.2.1 (a) Recall - see Exercise 3.1.3(4) - that a subset $I$ of a normed algebra $A_0$ is said to be an ideal if the following conditions are satisfied, for all choices of $x, y \in I$, $z \in A_0$, and $\alpha \in \mathbb{C}$:
$$\alpha x + y, \ \ xz, \ \ zx \in I.$$
(b) A proper ideal is an ideal $I$ which is distinct from the trivial ideals $\{0\}$ and $A_0$.
(c) A maximal ideal is a proper ideal which is not strictly contained in any larger proper ideal.

Remark 3.2.2 According to our definitions, $\{0\}$ is not a maximal ideal; in order for some of our future statements to be applicable in all cases, we shall find it convenient to adopt the convention that in the single exceptional case when $A_0 = \mathbb{C}$, we shall consider $\{0\}$ as a maximal ideal. $\Box$

Exercise 3.2.3 (1) Show that the (norm-) closure of an ideal in a normed algebra is also an ideal.
(2) If $I$ is a closed ideal in a normed algebra $A_0$, show that the quotient normed space $A_0/I$ - see Exercise 1.5.3(3) - is a normed algebra with respect to the natural definition $(x + I)(y + I) = (xy + I)$, and is a Banach algebra if $A_0$ is.
(3) Show that if $A$ is a unital normed algebra, then the following conditions on a non-zero ideal $I$ (i.e., $I \neq \{0\}$) are equivalent:
(i) $I$ is a proper ideal;
(ii) $I \cap \mathcal{G}(A) = \emptyset$;
(iii) $1 \notin I$.
(4) Deduce from (3) above that the closure of a proper ideal in a unital Banach algebra is also a proper ideal, and hence conclude that any maximal ideal in a unital Banach algebra is closed.
(5) Consider the unital Banach algebra $A = C(X)$, where $X$ is a compact Hausdorff space.
(a) For a subset $S \subset X$, define $I(S) = \{f \in A : f(x) = 0 \ \forall x \in S\}$.
Let $I$ be a closed ideal in $C(X)$. Define $F = \{x \in X : f(x) = 0 \ \forall f \in I\}$. Then clearly $I \subset I(F)$. Conversely, let $f \in I(F)$ and let $U = \{x : |f(x)| < \epsilon\}$, $V = \{x : |f(x)| < \frac{\epsilon}{2}\}$, so $U$ and $V$ are open sets such that $F \subset V \subset \overline{V} \subset U$. First deduce from Urysohn's lemma (Theorem A.4.23) that there exists $h \in C(X)$ such that $h(\overline{V}) = \{0\}$ and $h(X - U) = \{1\}$. Set $f_1 = fh$, and note that $\|f_1 - f\| < \epsilon$. For each $x \notin V$, pick $f_x \in I$ such that $f_x(x) = 1$; appeal to compactness of $X - V$ to find finitely many points $x_1, \ldots, x_n$ such that $X - V \subset \cup_{i=1}^{n}\{y \in X : |f_{x_i}(y)| > \frac{1}{2}\}$; set $g = 4\sum_{i=1}^{n} |f_{x_i}|^2$, and note that $g \in I$ - since $|f_{x_i}|^2 = f_{x_i}\overline{f_{x_i}}$ - and $|g(y)| > 1 \ \forall y \in X - V$; conclude from Tietze's extension theorem - see Theorem A.4.24 - that there exists an $h_1 \in C(X)$ such that $h_1(y) = \frac{1}{g(y)} \ \forall y \notin V$. Notice now that $f_1 g h_1 \in I$ and that $f_1 g h_1 = f_1$, since $f_1$ is supported in $X - \overline{V}$. Since $\epsilon > 0$ was arbitrary, and since $I$ was assumed to be closed, this shows that $f \in I$.)
(d) Show that if $F_i$, $i = 1, 2$ are closed subsets of $X$, then $I(F_1) \subset I(F_2) \iff F_1 \supset F_2$, and hence deduce that the maximal ideals in $C(X)$ are in bijective correspondence with the points of $X$. (Hint: Urysohn's lemma.)
(e) If $S \subset X$, and if the closure of $I(S)$ is $I(F)$, what is the relationship between $S$ and $F$?
(6) Can you show that there are no proper ideals in $M_n(\mathbb{C})$?
(7) If $I$ is an ideal in an algebra $A$, show that:
(i) the vector space $A/I$ is an algebra with the natural structure;
(ii) there exists a 1-1 correspondence between ideals in $A/I$ and ideals in $A$ which contain $I$. (Hint: Consider the correspondence $\mathcal{J} \mapsto \pi^{-1}(\mathcal{J})$, where $\pi : A \to A/I$ is the quotient map.)

Throughout this section, we shall assume that $A$ is a commutative Banach algebra, and unless it is explicitly stated to the contrary, we will assume that $A$ is a unital algebra. The key to the Gelfand-Naimark theory is the fact that there are always lots of maximal ideals in such an algebra; the content of the Gelfand-Naimark theorem is that the collection of maximal ideals contains a lot of information about $A$, and that in certain cases, this collection "recaptures" the algebra. (See Exercise 3.2.3(5) for an example where this is the case.)
Proposition 3.2.4 If $x \in A$, then the following conditions are equivalent:
(i) $x$ is not invertible;
(ii) there exists a maximal ideal $I$ in $A$ such that $x \in I$.

Proof. The implication (ii) $\Rightarrow$ (i) is immediate - see Exercise 3.2.3(3). Conversely, suppose $x$ is not invertible. If $x = 0$, there is nothing to prove, so assume $x \neq 0$. Then, $I_0 = \{ax : a \in A\}$ is an ideal in $A$. Notice that $I_0 \neq \{0\}$ (since it contains $x = 1x$) and $I_0 \neq A$ (since it does not contain 1); thus, $I_0$ is a proper ideal. An easy application of Zorn's lemma now shows that there exists a maximal ideal $I$ which contains $I_0$, and the proof is complete. $\Box$
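A toy illustration of Proposition 3.2.4, in the commutative Banach algebra $C(X)$ for a three-point space $X$ (i.e., $\mathbb{C}^3$ with pointwise operations; the concrete elements are our own choices): an element fails to be invertible exactly when it vanishes somewhere, and it then lies in the maximal ideal of functions vanishing at that point.

```python
import numpy as np

# C(X) for a 3-point X is C^3 under coordinatewise operations.
A = lambda *vals: np.array(vals, dtype=complex)

f = A(0.0, 2.0, -1.0)     # f vanishes at the first point: not invertible
g = A(1.0, 3.0, -2.0)     # never vanishes: invertible, with inverse 1/g

assert np.any(f == 0) and not np.any(g == 0)
assert np.allclose(g * (1.0 / g), A(1.0, 1.0, 1.0))

# I = {h : h(x0) = 0} is an ideal containing f: stable under sums and
# under multiplication by arbitrary algebra elements.
h = A(0.0, 5.0, 1.0)
assert (f + h)[0] == 0 and (g * h)[0] == 0
```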
The proof of the preceding proposition shows that any commutative Banach algebra which contains a non-zero element that is not invertible must contain non-trivial (maximal) ideals. The next result disposes of commutative Banach algebras which do not satisfy this requirement.

Theorem 3.2.5 (Gelfand-Mazur theorem) The following conditions on a unital commutative Banach algebra $A$ are equivalent:
(i) $A$ is a division algebra - i.e., every non-zero element is invertible;
(ii) $A$ is simple - i.e., there exist no proper ideals in $A$; and
(iii) $A = \mathbb{C}1$.

Proof. (i) $\Rightarrow$ (ii): If $A$ contains a proper ideal, then (by Zorn) it contains maximal ideals, whence (by the previous proposition) it contains non-zero non-invertible elements.
(ii) $\Rightarrow$ (iii): Let $x \in A$; pick $\lambda \in \sigma(x)$; this is possible, in view of the already established fact that the spectrum is always non-empty; then $(x - \lambda)$ is not invertible and is consequently contained in the ideal $A(x - \lambda) \neq A$; deduce from the hypothesis (ii) that $A(x - \lambda) = \{0\}$, i.e., $x = \lambda 1$.
(iii) $\Rightarrow$ (i): Obvious. $\Box$
The conclusion of the Gelfand-Naimark theorem is, loosely speaking, that commutative unital Banach algebras are "like" the algebra of continuous functions on a compact Hausdorff space. The first step in this direction is the identification of the points in the underlying compact space.

Definition 3.2.6 (a) A complex homomorphism on a commutative Banach algebra $A$ is a mapping $\phi : A \to \mathbb{C}$ which is a non-zero algebra homomorphism - i.e., $\phi$ satisfies the following conditions:
(i) $\phi(\alpha x + y) = \alpha\phi(x) + \phi(y)$, $\phi(xy) = \phi(x)\phi(y) \ \forall x, y \in A, \alpha \in \mathbb{C}$;
(ii) $\phi$ is not identically equal to 0.
The collection of all complex homomorphisms on $A$ is called the spectrum of $A$ - for reasons that will become clear later - and is denoted by $\hat{A}$.
(b) For $x \in A$, we shall write $\hat{x}$ for the function $\hat{x} : \hat{A} \to \mathbb{C}$ defined by $\hat{x}(\phi) = \phi(x)$.

Remark 3.2.7 (a) It should be observed that for unital Banach algebras, condition (ii) in Definition 3.2.6 is equivalent to the requirement that $\phi(1) = 1$. (Verify this!)
(b) Deduce, from Exercise 3.2.3(5), that in case $A = C(X)$, with $X$ a compact Hausdorff space, there is an identification of $\hat{A}$ with $X$ in such a way that $\hat{f}$ is identified with $f$, for all $f \in A$. $\Box$
Lemma 3.2.8 Let $A$ denote a unital commutative Banach algebra.
(a) The mapping $\phi \mapsto \ker\phi$ sets up a bijection between $\hat{A}$ and the collection of all maximal ideals in $A$;
(b) $\hat{x}(\hat{A}) = \sigma(x)$;
(c) if it is further true that $\|1\| = 1$, then the following conditions on a map $\phi : A \to \mathbb{C}$ are equivalent:
(i) $\phi \in \hat{A}$;
(ii) $\phi \in A^*$, $\|\phi\| = 1$, and $\phi(xy) = \phi(x)\phi(y) \ \forall x, y \in A$.

Proof. (a) Let $\phi \in \hat{A}$ and define $I = \ker\phi$. Since $\phi$ is an algebra homomorphism of $A$ onto $\mathbb{C}$, it follows - see Exercise 3.2.3(7) - that $I$ is a maximal ideal in $A$. Conversely, if $I$ is a maximal ideal in $A$, it follows from Exercise 3.2.3(7) that $A/I$ is simple, and hence, by the Gelfand-Mazur theorem, we see that $A/I = \mathbb{C}1$; consequently, the quotient map $A \to A/I = \mathbb{C}$ gives rise to a complex homomorphism $\phi_I$ which clearly has the property that $\ker\phi_I = I$. Finally, the desired conclusion follows from the following facts: (i) two linear functionals on a vector space with the same kernel are multiples of one another, and (ii) if $\phi \in \hat{A}$, then $A = \ker\phi \oplus \mathbb{C}1$ (as vector spaces), and $\phi(1) = 1$.
(b) If $x \in A$, $\phi \in \hat{A}$, then $(x - \phi(x)1) \in \ker\phi$, and hence, by Proposition 3.2.4, we see that $\phi(x) \in \sigma(x)$. Conversely, it follows from Proposition 3.2.4 and (a) of this Lemma that $\sigma(x) \subset \hat{x}(\hat{A})$.
(c) Deduce from (b) and the inequality 3.1.11 that if $\phi \in \hat{A}$, then, for any $x \in A$, we have
$$|\phi(x)| \leq r(x) \leq \|x\|,$$
and hence $\|\phi\| \leq 1$. On the other hand, since $\phi(1) = 1$, we must have $\|\phi\| = 1$, and hence (i) $\Rightarrow$ (ii); the other implication is obvious. $\Box$

Corollary 3.2.9 Let $A$ be a non-unital commutative Banach algebra; then $\phi \in \hat{A} \Rightarrow \phi \in A^*$ and $\|\phi\| \leq 1$.
Proof. Let $A^+$ denote the "unitisation" of $A$ as in Exercise 3.1.3(1). Define $\phi^+ : A^+ \to \mathbb{C}$ by $\phi^+(x, \alpha) = \phi(x) + \alpha$; it is easily verified that $\phi^+ \in \widehat{A^+}$; hence, by Lemma 3.2.8(c), we see that $\phi^+ \in (A^+)^*$ and $\|\phi^+\| \leq 1$, and the desired conclusion follows easily. $\Box$

In the sequel, given a commutative Banach algebra $A$, we shall regard $\hat{A}$ as a topological space with respect to the subspace topology it inherits from ball $A^* = \{\phi \in A^* : \|\phi\| \leq 1\}$, where this unit ball is considered as a compact Hausdorff space with respect to the weak*-topology - see Alaoglu's theorem (Theorem 1.6.9).
Theorem 3.2.10 Let $A$ be a commutative Banach algebra, with spectrum $\hat{A}$. Then,
(1) $\hat{A}$ is a locally compact Hausdorff space (with respect to the weak*-topology), which is compact in case the Banach algebra $A$ contains an identity; and
(2) the map $x \mapsto \hat{x}$ defines a mapping $\Gamma : A \to C_0(\hat{A})$ which is a contractive homomorphism of Banach algebras (where, of course, $C_0(\hat{A})$ is regarded as a Banach algebra as in Example 3.1.2(4)) - meaning that the following relations hold, for all $x, y \in A, \alpha \in \mathbb{C}$:
(i) $\|\Gamma(x)\| \leq \|x\|$;
(ii) $\Gamma(\alpha x + y) = \alpha\Gamma(x) + \Gamma(y)$; and
(iii) $\Gamma(xy) = \Gamma(x)\Gamma(y)$.
The map $\Gamma$ is referred to as the Gelfand transform of $A$.

Proof. (1) Consider the following sets: for fixed $x, y \in A$, let $K_{x,y} = \{\phi \in \text{ball } A^* : \phi(xy) = \phi(x)\phi(y)\}$; and let $V = \{\phi \in \text{ball } A^* : \phi \neq 0\}$. The definition of the weak* topology implies that $K_{x,y}$ (resp., $V$) is a closed (resp., open) subset of ball $A^*$; hence $K = \cap_{x,y \in A} K_{x,y}$ is also a closed, hence compact, set in the weak* topology. Notice now that $\hat{A} = K \cap V$, and since an open subset of a (locally) compact Hausdorff space is locally compact - see Proposition A.6.2(3) - and Hausdorff, we see that $\hat{A}$ is indeed a locally compact Hausdorff space. In case $A$ has an identity, then $F = \{\phi \in \text{ball } A^* : \phi(1) = 1\}$ is weak* closed, and $\hat{A} = K \cap F$ is a closed subset of ball $A^*$, and the proof of (1) is complete.
(2) The definition of the weak* topology guarantees that $\hat{x}$ is a continuous function on $\hat{A}$, for every $x \in A$; it is also obvious that the mapping $\Gamma : x \mapsto \hat{x}$ is linear and preserves products; (for example, if $x, y \in A$ and if $\phi \in \hat{A}$ is arbitrary, then (by the definition of a complex homomorphism) we have: $\widehat{xy}(\phi) = \phi(xy) = \phi(x)\phi(y) = \hat{x}(\phi)\hat{y}(\phi)$, whence $\widehat{xy} = \hat{x}\hat{y}$).
To complete the proof, we need to verify that $\Gamma$ maps $A$ into $C_0(\hat{A})$, and that $\|\Gamma(x)\| \leq \|x\|$.
First consider the case when $A$ has an identity. In this case, $C_0(\hat{A}) = C(\hat{A})$, while, by Lemma 3.2.8(b), we find that
$$\|\Gamma(x)\| = \sup\{|\hat{x}(\phi)| : \phi \in \hat{A}\} = \sup\{|\lambda| : \lambda \in \sigma(x)\} = r(x) \leq \|x\|,$$
and the proof of the unital case is complete.
Now suppose $A$ does not have an identity. To start with, deduce, from Corollary 3.2.9, that $\Gamma$ is indeed contractive, so we only need to verify that the functions of the form $\hat{x}$ do indeed vanish at infinity.
Let $A^+$ be the unitisation of $A$, and let $\hat{A} \ni \phi \mapsto \phi^+ \in \widehat{A^+}$ be as in the proof of Corollary 3.2.9. Notice now that $\widehat{A^+} = \{\phi^+ : \phi \in \hat{A}\} \cup \{\phi_0\}$, where $\phi_0(x, \alpha) = \alpha$, since $\phi_0$ is the unique complex homomorphism of $A^+$ whose restriction to $A$ is identically equal to 0. On the other hand, (the definitions of) the topologies on $\hat{A}$ and $\widehat{A^+}$ show that a net $\{\phi_i : i \in I\}$ converges to $\phi$ in $\hat{A}$ if and only if $\{\phi_i^+ : i \in I\}$ converges to $\phi^+$ in $\widehat{A^+}$; in other words, the map $\phi \mapsto \phi^+$ maps $\hat{A}$ homeomorphically onto its image. Thus, we find - from the compactness of $\widehat{A^+}$, and from the nature of the topologies on the spaces concerned - that we may identify $\widehat{A^+}$ with the one-point compactification of the locally compact Hausdorff space $\hat{A}$, with the element $\phi_0$ playing the role of the point at infinity; notice now that if we identify $\hat{A}$ as a subspace of $\widehat{A^+}$ in this manner, then $\widehat{(x, 0)}$ is a continuous function on $\widehat{A^+}$ which extends $\hat{x}$ and which vanishes at "$\infty$" (since $\phi_0(x, 0) = 0$); thus, (by Proposition A.6.7(a), for instance) we find that indeed $\hat{x} \in C_0(\hat{A})$, $\forall x \in A$, and the proof is complete. $\Box$
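For $A = \ell^1(\mathbb{Z})$ (taken up in detail in Example 3.2.14(2) below), the Gelfand transform sends a summable sequence $x$ to the function $\hat{x}(z) = \sum_n x(n)z^n$ on $\mathbb{T}$, and the contractivity asserted in the theorem becomes $\|\hat{x}\|_\infty \leq \|x\|_1$. A quick numerical sketch (the particular sequence and the sampling density are our own choices):

```python
import numpy as np

# A finitely supported element of l^1(Z), stored as a dict n -> x(n).
coeffs = {0: 1.0, 1: -2.0, -3: 0.5j}

# Sample the unit circle densely and evaluate x-hat(z) = sum x(n) z^n.
z = np.exp(2j * np.pi * np.linspace(0, 1, 2048, endpoint=False))
xhat = sum(c * z**n for n, c in coeffs.items())

sup_norm = np.max(np.abs(xhat))               # approximates ||Gamma(x)||
l1_norm = sum(abs(c) for c in coeffs.values())  # ||x||_1

assert sup_norm <= l1_norm + 1e-12            # contractivity of Gamma
```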
Corollary 3.2.11 Let $\Gamma$ be the Gelfand transform of a commutative Banach algebra $A$. The following conditions on an element $x \in A$ are equivalent:
(i) $x \in \ker\Gamma$ - i.e., $\Gamma(x) = 0$;
(ii) $\lim_n \|x^n\|^{\frac{1}{n}} = 0$.
In case $A$ has an identity, these conditions are equivalent to the following two conditions also:
(iii) $x \in I$, for every maximal ideal $I$ of $A$;
(iv) $\sigma(x) = \{0\}$.

Proof. Suppose first that $A$ has an identity. Then, since $\|\Gamma(x)\| = r(x)$, the equivalence of the four conditions follows easily from Proposition 3.2.4, Lemma 3.2.8 and the spectral radius formula.
For non-unital $A$, apply the already established unital case of the corollary to the element $(x, 0)$ of the unitised algebra $A^+$. $\Box$

Definition 3.2.12 An element of a commutative Banach algebra $A$ is said to be quasi-nilpotent if it satisfies the equivalent conditions of Corollary 3.2.11. The radical of $A$ is defined to be the ideal consisting of all quasi-nilpotent elements of $A$. A commutative Banach algebra is said to be semi-simple if it has trivial radical (or equivalently, it has no non-zero quasi-nilpotent elements, which is the same as saying that the Gelfand transform of $A$ is injective).

A few simple facts concerning the preceding definitions are contained in the following exercises.
Exercise 3.2.13 (1) Consider the matrix $N \in M_n(\mathbb{C})$ defined by $N = ((n^i_j))$, where $n^i_j = 1$ if $i = j - 1$, and $n^i_j = 0$ otherwise. (Thus, the matrix has 1's on the "first super-diagonal" and has 0's elsewhere.) Show that $N^k = 0 \iff k \geq n$, and hence deduce that if $A = \{\sum_{i=1}^{n-1} \alpha_i N^i : \alpha_i \in \mathbb{C}\}$ (the subalgebra of $M_n(\mathbb{C})$ which is generated by $\{N\}$), then $A$ is a commutative non-unital Banach algebra, all of whose elements are (nilpotent, and consequently) quasi-nilpotent.
(2) If $A$ is as in (1) above, show that $\hat{A}$ is empty, and consequently the requirement that $\hat{A}$ is compact does not imply that $A$ must have an identity.
(3) Show that every nilpotent element (i.e., an element with the property that some power of it is 0) in a commutative Banach algebra is necessarily quasi-nilpotent, but that a quasi-nilpotent element need not be nilpotent. (Hint: For the second part, let $N_n$ denote the operator considered in (1) above, and consider the operator given by $T = \oplus_{n=1}^{\infty} \frac{1}{n} N_n$ acting on $\oplus_{n=1}^{\infty} \mathbb{C}^n$.)

We now look at a couple of the more important applications of the Gelfand-Naimark theorem in the following examples.
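Before turning to those examples, Exercise 3.2.13 can be explored numerically. The block-diagonal matrix below is our own finite-dimensional truncation of a scaled direct sum of shift matrices: each block is nilpotent, and $\|T^k\|^{1/k}$ shrinks as $k$ grows, as quasi-nilpotence requires.

```python
import numpy as np

def shift(n):
    """The n x n matrix with 1's on the first super-diagonal."""
    return np.eye(n, k=1)

# N^k = 0 exactly when k >= n, here for n = 5.
N5 = shift(5)
assert np.any(np.linalg.matrix_power(N5, 4)) and \
       not np.any(np.linalg.matrix_power(N5, 5))

# A truncated direct sum of blocks (1/n) N_n, n = 1, ..., 39.
blocks = [shift(n) / n for n in range(1, 40)]
offsets = np.cumsum([0] + [b.shape[0] for b in blocks])
T = np.zeros((offsets[-1], offsets[-1]))
for b, s in zip(blocks, offsets):
    T[s:s + b.shape[0], s:s + b.shape[0]] = b

vals = [np.linalg.norm(np.linalg.matrix_power(T, k), 2) ** (1.0 / k)
        for k in (1, 4, 16)]
assert vals[0] > vals[1] > vals[2]     # ||T^k||^(1/k) decreases
```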
Example 3.2.14 (1) Suppose $A = C(X)$, where $X$ is a compact Hausdorff space. It follows from Exercise 3.2.3(5)(d) that the typical complex homomorphism of $A$ is of the form $\phi_x(f) = f(x)$, for some $x \in X$. Hence the map $X \ni x \mapsto \phi_x \in \hat{A}$ is a bijective map which is easily seen to be continuous (since $x_i \to x \Rightarrow f(x_i) \to f(x)$ for all $f \in A$); since $X$ is a compact Hausdorff space, it follows that the above map is actually a homeomorphism, and if $\hat{A}$ is identified with $X$ via this map, then the Gelfand transform gets identified with the identity map on $A$.

(2) Let $A = \ell^1(\mathbb{Z})$, as in Example 3.1.2(5). Observe that $A$ is "generated, as a Banach algebra" by the elements $e_{\pm 1}$, the basis vectors indexed by 1 and $-1$; in fact, $e_1^n = e_n \ \forall n \in \mathbb{Z}$, and $x = \sum_{n \in \mathbb{Z}} x(n)e_n$, the series converging (unconditionally) in the norm. If $\phi \in \hat{A}$ and $\phi(e_1) = z$, then note that for all $n \in \mathbb{Z}$, we have $|z^n| = |\phi(e_n)| \leq 1$, since $\|e_n\| = 1 \ \forall n$; this clearly forces $|z| = 1$. Conversely, it is easy to show that if $|z| = 1$, then there exists a unique element $\phi_z \in \hat{A}$ such that $\phi_z(e_1) = z$. (Simply define $\phi_z(x) = \sum_{n \in \mathbb{Z}} x(n)z^n$, and verify that this defines a complex homomorphism.) Thus we find that $\hat{A}$ may be identified with the unit circle $\mathbb{T}$; notice, incidentally, that since $\mathbb{Z}$ is the infinite cyclic group with generator 1, we also have a natural identification of $\mathbb{T}$ with the set of all homomorphisms of the group $\mathbb{Z}$ into the multiplicative group $\mathbb{T}$; and if $\mathbb{Z}$ is viewed as a locally compact group with respect to the discrete topology, then all homomorphisms on $\mathbb{Z}$ are continuous.
More generally, the preceding discussion has a beautiful analogue for general locally compact Hausdorff abelian groups. If $G$ is such a group, let $\hat{G}$ denote the set of continuous homomorphisms from $G$ into $\mathbb{T}$; for $\gamma \in \hat{G}$, consider the equation $\phi_\gamma(f) = \int_G f(t)\overline{\gamma(t)}\,dm_G(t)$, where $m_G$ denotes (some choice, which is unique up to normalisation, of) a (left = right) Haar measure on $G$ - see Example 3.1.2(5).
3.2. Gelfand-Naimark theory
79
(The complex conjugate appears in this equation for historic reasons, which ought to become clearer at the end of this discussion.) It is then not hard to see that $\phi_\gamma$ defines a complex homomorphism of the Banach algebra $L^1(G, m_G)$. The pleasant fact is that every complex homomorphism of $L^1(G, m_G)$ arises in this manner, and that the map $\hat{G} \ni \gamma \mapsto \phi_\gamma \in (L^1(G, m_G))^{\wedge}$ is a bijective correspondence.
The (so-called Pontrjagin duality) theory goes on to show that if the (locally compact Hausdorff) topology on $(L^1(G, m_G))^{\wedge}$ is transported to $\hat{G}$, then $\hat{G}$ acquires the structure of a locally compact Hausdorff group, which is referred to as the dual group of $G$. (What this means is that (a) $\hat{G}$ is a group with respect to pointwise operations - i.e., $(\gamma_1 \cdot \gamma_2)(t) = \gamma_1(t)\gamma_2(t)$, etc.; and (b) this group structure is "compatible" with the topology inherited from $(L^1(G, m_G))^{\wedge}$, in the sense that the group operations are continuous.)
The elements of $\hat{G}$ are called characters on $G$, and $\hat{G}$ is also sometimes referred to as the character group of $G$. The reason for calling $\hat{G}$ the "dual" of $G$ is this: each element of $G$ defines a character on $\hat{G}$ via evaluation, thus: if we define $\chi_t(\gamma) = \gamma(t)$, then we have a mapping $G \ni t \mapsto \chi_t \in \hat{\hat{G}} = (\hat{G})^{\wedge}$; the theory goes on to show that the above mapping is an isomorphism of groups which is a homeomorphism of topological spaces; thus we have an identification $G \cong \hat{\hat{G}}$ of topological groups.
Under the identification of $(L^1(G, m_G))^{\wedge}$ with $\hat{G}$, the Gelfand transform becomes identified with the mapping $L^1(G, m_G) \ni f \mapsto \hat{f} \in C_0(\hat{G})$ defined by
$$\hat{f}(\gamma) = \int_G f(t)\overline{\gamma(t)}\,dm_G(t); \qquad (3.2.15)$$
the mapping $\hat{f}$ defined by this equation is referred to, normally, as the Fourier transform of the function $f$ - for reasons stemming from specialisations of this theory to the case when $G \in \{\mathbb{R}, \mathbb{Z}, \mathbb{T}\}$. Thus, our earlier discussion shows that $\hat{\mathbb{Z}} = \mathbb{T}$ (so that, also, $\hat{\mathbb{T}} = \mathbb{Z}$). (It is a fact that, in general, $\hat{G}$ is discrete if and only if $G$ is compact.) It is also true that the typical element of $\hat{\mathbb{R}}$ is of the form $\gamma_t(s) = \exp(its)$, so that $\hat{\mathbb{R}}$ may be identified with $\mathbb{R}$ as a topological group. Further, if we choose the right normalisations of Haar measure, we find the equations
$$\hat{f}(n) = \frac{1}{2\pi}\int_0^{2\pi} f(e^{i\theta})e^{-in\theta}\,d\theta, \quad n \in \mathbb{Z},$$
$$\hat{f}(x) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f(y)e^{-ixy}\,dy, \quad x \in \mathbb{R},$$
which are the equations defining the classical Fourier transforms. It is because of the occurrence of the complex conjugates in these equations that we also introduced a complex conjugate earlier, so that it could be seen that the Gelfand transform specialised, in this special case, to the Fourier transform. $\Box$
3.3 Commutative C*-algebras
By definition, the Gelfand transform of a commutative Banach algebra is injective precisely when the algebra in question is semi-simple. The best possible situation would, of course, arise when the Gelfand transform turned out to be an isometric algebra isomorphism onto the appropriate function algebra. For this to be true, the Banach algebra would have to behave "like" the algebra $C_0(X)$; in particular, in the terminology of the following definition, such a Banach algebra would have to possess the structure of a commutative C*-algebra.
Definition 3.3.1 A C*-algebra is, by definition, a Banach algebra $A$, which is equipped with an "involution" $A \ni x \mapsto x^* \in A$ which satisfies the following conditions, for all $x, y \in A$, $\alpha \in \mathbb{C}$:
(i) (conjugate-linearity) $(\alpha x + y)^* = \bar{\alpha}x^* + y^*$;
(ii) (product-reversal) $(xy)^* = y^*x^*$;
(iii) (order two) $(x^*)^* = x$; and
(iv) (C*-identity) $\|x^*x\| = \|x\|^2$.
The element $x^*$ is called the adjoint of the element $x$, and the mapping $x \mapsto x^*$ is referred to as the "adjoint-map" or as "adjunction".

Example 3.3.2 (1) An example of a commutative C*-algebra is furnished by $C_0(X)$, where $X$ is a locally compact Hausdorff space; it is clear that this algebra has an identity precisely when $X$ is compact.
(2) An example of a non-commutative C*-algebra is provided by $\mathcal{L}(\mathcal{H})$, where $\mathcal{H}$ is a Hilbert space of dimension at least two. (Why two?)
(3) If $A$ is a C*-algebra, and if $B$ is a subalgebra which is closed (with respect to the norm) and is "self-adjoint" - meaning that it is closed under the mapping $x \mapsto x^*$ - then $B$ has the structure of a C*-algebra in its own right. In this case, we shall call $B$ a C*-subalgebra of $A$.
(4) This is a non-example. If $G$ is a locally compact abelian group - or more generally, if $G$ is a locally compact group whose left Haar measure is also invariant under right translations - define an involution on $L^1(G, m_G)$ by $f^*(t) = \overline{f(t^{-1})}$; then this is an involution of the Banach algebra which satisfies the first three algebraic requirements of Definition 3.3.1, but does NOT satisfy the crucial fourth condition; that is, the norm is not related to the involution by the C*-identity; nevertheless the involution is still not too badly behaved with respect to the norm, since it is an isometric involution. (We ignore the trivial case of a group with only one element, which is the only case when the C*-identity is satisfied.) $\Box$

So as to cement these notions and get a few simple facts enunciated, we pause for an exercise.
Exercise 3.3.3 (1) Let $A$ be a Banach algebra equipped with an involution $x \mapsto x^*$ which satisfies the first three conditions of Definition 3.3.1.
(a) Show that if $A$ is a C*-algebra, then $\|x^*\| = \|x\| \ \forall x \in A$. (Hint: Use the submultiplicativity of the norm in any Banach algebra, together with the C*-identity.)
(b) Show that $A$ is a C*-algebra if and only if $\|x\|^2 \leq \|x^*x\|$ for all $x \in A$. (Hint: Use the submultiplicativity of the norm in any Banach algebra, together with (a) above.)
(2) An element $x$ of a C*-algebra is said to be self-adjoint if $x = x^*$.
(a) Show that any element $x$ of a C*-algebra is uniquely expressible in the form $x = x_1 + ix_2$, where $x_1$ and $x_2$ are self-adjoint elements. (Hint: Imitate the proof of Proposition 2.4.9.)
(b) If $A$ is a C*-algebra which has an identity 1, show that 1 is necessarily self-adjoint.
(3) Let $A$ be a C*-algebra, and let $S$ be an arbitrary subset of $A$. (This exercise gives an existential, as well as a constructive, description of the "C*-subalgebra generated by $S$".)
(a) Show that $\cap\{B : S \subset B, \ B$ is a C*-subalgebra of $A\}$ is a C*-subalgebra of $A$ that contains $S$, which is minimal with respect to this property, in the sense that it is contained in any other C*-subalgebra of $A$ which contains $S$; this is called the C*-subalgebra generated by $S$, and is denoted by $C^*(S)$.
(b) Let $S_1$ denote the set of all "words in $S \cup S^*$"; i.e., $S_1$ consists of all elements of the form $w = a_1 a_2 \cdots a_n$, where $n$ is any positive integer and either $a_i \in S$ or $a_i^* \in S$ for all $i$; show that $C^*(S)$ is the closure of the set of all finite linear combinations of elements of $S_1$.

In order to reduce assertions about non-unital C*-algebras to those about unital C*-algebras, we have to know that we can embed any non-unital C*-algebra as a maximal ideal in a unital C*-algebra (just as we did for Banach algebras). The problem with our earlier construction is that the norm we considered earlier (in Exercise 3.1.3(1)) will usually not satisfy the crucial C*-identity. We get around this difficulty as follows.
Proposition 3.3.4 Suppose $A$ is a C*-algebra without identity. Then there exists a C*-algebra $A^+$ with identity which contains a maximal ideal that is isomorphic to $A$ as a C*-algebra.

Proof. Consider the mapping $A \ni x \mapsto L_x \in \mathcal{L}(A)$ defined by
$$L_x(y) = xy \quad \forall y \in A.$$
It is a consequence of the submultiplicativity of the norm (with respect to products) in any Banach algebra, and the C*-identity, that this map is an isometric algebra homomorphism of $A$ into $\mathcal{L}(A)$.
Let $A^+ = \{L_x + \alpha 1 : x \in A, \alpha \in \mathbb{C}\} \subset \mathcal{L}(A)$, where 1 denotes the identity operator on $A$, equipped with the involution $(L_x + \alpha 1)^* = L_{x^*} + \bar{\alpha}1$. If $z = L_x + \alpha 1 \in A^+$ and $y \in A$, then
$$\|z(y)\|^2 = \|xy + \alpha y\|^2 = \|(y^*x^* + \bar{\alpha}y^*)(xy + \alpha y)\| = \|y^* \cdot ((z^*z)(y))\| \leq \|z^*z\| \cdot \|y\|^2,$$
where we have used the C*-identity (which is valid in $A$) in the second equality, and the submultiplicativity of the norm (with respect to products), the definition of the norm on $A^+$, and Exercise 3.3.3(1)(b) in the last step. Since $y$ was arbitrary, this shows that, indeed, we have $\|z\|^2 \leq \|z^*z\|$, and the proof is complete. $\Box$

Exercise 3.3.5 (1) Show that if $Y$ is a closed subspace of a Banach space $X$, and if $F$ is a finite-dimensional subspace of $X$, then the vector sum $Y + F$ is also a closed subspace of $X$. (Hint: Consider the quotient map $\pi : X \to X/Y$, note that $F + Y = \pi^{-1}(\pi(F))$, and appeal to Exercise A.6.5(1)(c) and the continuity of $\pi$.)
(2) Show that if, in (1) above, the requirement that $F$ is finite-dimensional is relaxed to just requiring that $F$ is closed, then the conclusion is no longer valid in general. (Hint: Let $T \in \mathcal{L}(\mathcal{H})$ be an operator which is 1-1, but whose range is not closed - see Remark 1.5.15, for instance; let $X = \mathcal{H} \oplus \mathcal{H}$, $Y = \mathcal{H} \oplus \{0\}$, and $F = G(T)$, the graph of $T$.)
(3) If $A$ is any (not necessarily commutative) unital Banach algebra, and if $x \in A$, show that the equation
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \qquad (3.3.16)$$
defines a mapping $A \ni x \mapsto e^x \in \mathcal{G}(A)$ with the property that
(i) $e^{x+y} = e^x e^y$, whenever $x$ and $y$ are two elements which commute with one another;
(ii) for each fixed $x \in A$, the map $\mathbb{R} \ni t \mapsto e^{tx} \in \mathcal{G}(A)$ defines a (continuous) homomorphism from the additive group $\mathbb{R}$ into the multiplicative (topological) group $\mathcal{G}(A)$.

We are now ready to show that in order for the Gelfand transform of a commutative Banach algebra to be an isometric isomorphism onto the algebra of all continuous functions on its spectrum which vanish at infinity, it is necessary and sufficient that the algebra have the structure of a commutative C*-algebra. Since the necessity of the condition is obvious, we only state the sufficiency of the condition in the following formulation of the Gelfand-Naimark theorem for commutative C*-algebras.

Theorem 3.3.6 Suppose $A$ is a commutative C*-algebra; then the Gelfand transform is an isometric *-algebra isomorphism of $A$ onto $C_0(\hat{A})$; further, $\hat{A}$ is compact if and only if $A$ has an identity.
Proof. Case (a): $A$ has an identity.
In view of Theorem 3.2.10, we only need to show that
(i) if $x \in A$ is arbitrary, then $\Gamma(x^*) = \Gamma(x)^*$ and $\|\Gamma(x)\| = \|x\|$;
(ii) $\hat{A}$ is compact; and
(iii) $\Gamma$ maps onto $C(\hat{A})$.
(i) We first consider the case of a self-adjoint element $x = x^* \in A$. Define $u_t = e^{itx}$, $\forall t \in \mathbb{R}$; then, notice that the self-adjointness assumption, and the continuity of adjunction, imply that
$$u_t^* = \left(\sum_{n=0}^{\infty} \frac{(itx)^n}{n!}\right)^* = \sum_{n=0}^{\infty} \frac{(-itx)^n}{n!} = u_{-t};$$
hence, by the C*-identity and Exercise 3.3.5(3), we find that for any $t \in \mathbb{R}$,
$$\|u_t\|^2 = \|u_{-t}u_t\| = 1.$$
Now, if $\phi \in \hat{A}$, since $\|\phi\| = 1$, it follows that $|\phi(u_t)| \leq 1 \ \forall t \in \mathbb{R}$; on the other hand, since $\phi$ is (continuous and) multiplicative, it is seen that $\phi(u_t) = e^{it\phi(x)}$; in other words, $|e^{it\phi(x)}| \leq 1 \ \forall t \in \mathbb{R}$; this clearly implies that $\phi(x) \in \mathbb{R}$. Thus, we have shown that $\hat{x}$ is a real-valued function, for every self-adjoint element $x \in A$. Thanks to the Cartesian decomposition (of an element as a sum of a self-adjoint and a "skew-adjoint" element), we may now conclude that $\Gamma$ is indeed a homomorphism of *-algebras.
Again, if $x = x^* \in A$, observe that
$$\|x\|^2 = \|x^*x\| = \|x^2\|,$$
and conclude, by an easy induction, that $\|x\|^{2^n} = \|x^{2^n}\|$ for every positive integer $n$; it follows from the spectral radius formula that
$$r(x) = \lim_n \|x^{2^n}\|^{\frac{1}{2^n}} = \|x\|;$$
on the other hand, we know that $\|\Gamma(x)\| = r(x)$ - see Lemma 3.2.8(b); thus, for self-adjoint $x$, we find that indeed $\|\Gamma(x)\| = \|x\|$; the case of general $x$ follows from this, the fact that $\Gamma$ is a *-homomorphism, and the C*-identity (in both $A$ and $C(\hat{A})$), thus:
$$\|\Gamma(x)\|^2 = \|\Gamma(x)^*\Gamma(x)\| = \|\Gamma(x^*x)\| = \|x^*x\| = \|x\|^2,$$
thereby proving (i).
(ii) This follows from Theorem 3.2.10(1).
(iii) It follows from the already established (i) that $\Gamma(A)$ is a norm-closed self-adjoint subalgebra of $C(\hat{A})$ which is easily seen to contain the constants and to separate points of $\hat{A}$; according to the Stone-Weierstrass theorem (Theorem A.6.9), the only such subalgebra is $C(\hat{A})$.
Case (b): A does not have a unit.
Let A⁺ and I be as in the proof of Proposition 3.3.4. By the already established Case (a) above, and Exercise 3.2.3(5)(d), we know that Γ_{A⁺} is an isometric *-isomorphism of A⁺ onto C(Â⁺), and that there exists a point φ₀ ∈ Â⁺ such that Γ_{A⁺}(I) = {f ∈ C(Â⁺) : f(φ₀) = 0}. As in the proof of Theorem 3.2.10, note that if π : A → I is the natural isometric *-isomorphism of A onto I, then the map φ ↦ φ∘π defines a bijective correspondence between Â⁺ − {φ₀} and Â which is a homeomorphism (from the domain, with the subspace topology inherited from Â⁺, onto the range); finally, it is clear that if x ∈ A, and if φ₀ ≠ φ ∈ Â⁺, then (Γ_{A⁺}(π(x)))(φ) = (Γ_A(x))(φ∘π), and it is thus seen that even in the non-unital case, the Gelfand transform is an isometric *-isomorphism of A onto C₀(Â).

Finally, if X is a locally compact Hausdorff space, then since C₀(X) contains an identity precisely when X is compact, the proof of the theorem is complete. □

Before reaping some consequences of the powerful Theorem 3.3.6, we wish to first establish the fact that the notion of the spectrum of an element of a unital C*-algebra is an "intrinsic" one, meaning that it is independent of the ambient algebra, unlike the case of general Banach algebras (see Proposition 3.1.11 and Example 3.1.12).

Proposition 3.3.7 Let A be a unital C*-algebra, let x ∈ A, and suppose B is a C*-subalgebra of A such that 1, x ∈ B. Then σ_A(x) = σ_B(x); in particular, or equivalently, if we let A₀ = C*({1, x}), then σ_A(x) = σ_{A₀}(x).
3.3. Commutative C* -algebras
85
Proof. We already know that σ_B(x) ⊇ σ_A(x) - see Proposition 3.1.11; to prove the reverse inclusion, we need to show that ρ_A(x) ⊂ ρ_B(x); i.e., we need to show that if λ ∈ ℂ is such that (x − λ) admits an inverse in A, then that inverse should actually belong to B; in view of the spectral mapping theorem, we may assume, without loss of generality, that λ = 0; thus we have to prove the following assertion:

If 1, x ∈ B ⊂ A, and if x ∈ G(A), then x⁻¹ ∈ B.

We prove this in two steps.
Case (i): x = x*.
In this case, we know from Theorem 3.3.6 - applied to the commutative unital C*-algebra A₀ = C*({1, x}) - that σ_{A₀}(x) ⊂ ℝ; since any closed subset of the real line is, when viewed as a subset of the complex plane, its own boundary, we may deduce from Proposition 3.1.11 (applied to the inclusions A₀ ⊂ B ⊂ A) that

σ_A(x) ⊂ σ_B(x) ⊂ σ_{A₀}(x) = ∂(σ_{A₀}(x)) ⊂ ∂(σ_A(x)) ⊂ σ_A(x) ;

consequently, all the inclusions in the previous line are equalities, and the proposition is proved in this case.
Case (ii): Suppose x ∈ G(A) is arbitrary; then also x* ∈ G(A), and consequently x*x ∈ G(A); by Case (i), this implies that (x*x)⁻¹ ∈ B; then (x*x)⁻¹x* is an element of B which is a left-inverse of x; since x is an invertible element of A, deduce that x⁻¹ = (x*x)⁻¹x* ∈ B, and the proposition is completely proved. □

Thus, in the sequel, when we discuss C*-algebras, we may talk unambiguously of the spectrum of an element of the algebra; in particular, if T ∈ 𝓛(H), and if A is any unital C*-subalgebra of 𝓛(H) which contains T, then the spectrum of T, regarded as an element of A, is nothing but σ_{𝓛(H)}(T), which, of course, is given by the set {λ ∈ ℂ : T − λ is either not 1-1 or not onto}. Further, even if we have a non-unital C*-algebra A, we will talk of the spectrum of an element x ∈ A, by which we will mean the set σ_{A⁺}(x), where A⁺ is any unital C*-algebra containing A (or equivalently, where A⁺ is the unitisation of A, as in Proposition 3.3.4); it should be noticed that if x belongs to some C*-algebra without identity, then its spectrum must necessarily contain 0.

Since we wish to obtain various consequences, for a general not necessarily commutative C*-algebra, of Theorem 3.3.6 (which is a result about commutative C*-algebras), we introduce a definition which permits us to regard appropriate commutative C*-subalgebras of a general C*-algebra.

Definition 3.3.8 An element x of a C*-algebra is said to be normal if x*x = xx*.
Exercise 3.3.9 Let A be a C*-algebra and let x ∈ A. Show that the following conditions are equivalent: (i) x is normal; (ii) C*({x}) is a commutative C*-algebra; (iii) if x = x₁ + ix₂ denotes the Cartesian decomposition of x, then x₁x₂ = x₂x₁.
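In a matrix algebra these equivalences can be checked by direct computation. The following sketch (ours, using NumPy; not part of the text) verifies condition (iii) for a normal and a non-normal element of M₂(ℂ); the function name `cartesian_parts` is our own.

```python
import numpy as np

def cartesian_parts(x):
    # Cartesian decomposition x = x1 + i*x2 with x1, x2 self-adjoint
    x1 = (x + x.conj().T) / 2
    x2 = (x - x.conj().T) / 2j
    return x1, x2

# a normal element: any unitary, e.g. a rotation by pi/2
u = np.array([[0.0, -1.0], [1.0, 0.0]])
u1, u2 = cartesian_parts(u)
assert np.allclose(u.conj().T @ u, u @ u.conj().T)   # u*u = uu*
assert np.allclose(u1 @ u2, u2 @ u1)                 # parts commute

# a non-normal element: the nilpotent shift
n = np.array([[0.0, 1.0], [0.0, 0.0]])
n1, n2 = cartesian_parts(n)
assert not np.allclose(n.conj().T @ n, n @ n.conj().T)
assert not np.allclose(n1 @ n2, n2 @ n1)             # parts do not commute
```

The nilpotent shift is the standard minimal example of non-normality: its Cartesian parts generate all of M₂(ℂ), which is not commutative.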
We now come to the first of our powerful corollaries of Theorem 3.3.6; this gives us a continuous functional calculus for normal elements of a C* -algebra.
Proposition 3.3.10 Let x be a normal element of a C* -algebra A, and define
A₀ = C*({1, x}) if A has an identity, and A₀ = C*({x}) otherwise.
Then,
(a) if A has an identity, there exists a unique isometric isomorphism of C*-algebras, denoted by C(σ(x)) ∋ f ↦ f(x) ∈ A₀, with the property that h(x) = x, where h : σ(x) (⊂ ℂ) → ℂ is the identity function defined by h(z) = z; and

(b) if A does not have an identity, and if we agree to use the (not incorrect) notation C₀(σ(x)) = {f ∈ C(σ(x)) : f(0) = 0}, then there exists a unique isometric isomorphism of C*-algebras, denoted by C₀(σ(x)) ∋ f ↦ f(x) ∈ A₀, with the property that h(x) = x, where h is as in (a) above.

Proof. (a) Let Σ = σ(x). Notice first that x̂ (= Γ_{A₀}(x)) : Â₀ → Σ is a surjective continuous mapping. We assert that this map is actually a homeomorphism; in view of the compactness of Â₀ (and the fact that ℂ is a Hausdorff space), we only need to show that x̂ is 1-1; suppose x̂(φ₁) = x̂(φ₂), φⱼ ∈ Â₀; it follows that φ₁|_V = φ₂|_V, where V = {Σ_{i,j=0}ⁿ a_{i,j} xⁱ(x*)ʲ : n ∈ ℕ, a_{i,j} ∈ ℂ}; on the other hand, the hypothesis (together with Exercise 3.3.3(3)(b)) implies that V is dense in A₀, and the continuity of the φᵢ's implies that φ₁ = φ₂, as asserted. Hence
x̂ is indeed a homeomorphism of Â₀ onto Σ.
Consider the map given by

C(Σ) ∋ f ↦ Γ_{A₀}⁻¹(f ∘ x̂) ∈ A₀ ;

it is easily deduced that this is indeed an isometric isomorphism of the C*-algebra C(Σ) onto A₀ with the desired properties. On the other hand, it follows from the Stone-Weierstrass theorem - see Exercise A.6.10(iii) - that the set V' = {f : f(z) = Σ_{i,j=0}ⁿ a_{i,j} zⁱ z̄ʲ, n ∈ ℕ, a_{i,j} ∈ ℂ} is dense in C(Σ); so, any continuous *-homomorphism of C(Σ) is determined by its values on the set V', and hence by its values on the set {f₀, f₁}, where fⱼ(z) = zʲ, j = 0, 1; and the proof of the corollary is complete.

(b) Suppose A does not have an identity; again let Σ = σ(x), and observe that 0 ∈ Σ - see the few paragraphs preceding Definition 3.3.8. Thus, if A⁺ denotes the unitisation of A as in Proposition 3.3.4, and if we regard A as a maximal ideal in A⁺ (by identifying x ∈ A with L_x ∈ A⁺), we see that A₀ gets identified with the maximal ideal I₀ = {L_y : y ∈ A₀} of A₀⁺ = {L_y + α : y ∈ A₀, α ∈ ℂ}. Under the isomorphism of C(Σ) with A₀⁺ that is guaranteed by applying (a) to the element L_x ∈ A₀⁺, it is easy to see that what we have called C₀(Σ) gets mapped onto A₀, and this proves the existence of an isomorphism with the desired properties. Uniqueness is established along exactly the same lines as in (a). □
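For matrices, the continuous functional calculus of Proposition 3.3.10 can be computed explicitly from a spectral decomposition. The following sketch is ours (not from the text), and `apply_function` is a name we introduce; it checks that the calculus sends the identity function to x, behaves as a *-homomorphism, and is isometric on a concrete normal matrix.

```python
import numpy as np

def apply_function(x, f):
    # spectral decomposition x = v diag(lam) v^{-1}; f(x) applies f pointwise
    # to the eigenvalues (for a normal x, v can be taken unitary)
    lam, v = np.linalg.eig(x)
    return v @ np.diag(f(lam)) @ np.linalg.inv(v)

x = np.array([[0.0, -1.0], [1.0, 0.0]])    # normal, sigma(x) = {i, -i}

# h(x) = x for the identity function h(z) = z
assert np.allclose(apply_function(x, lambda z: z), x)
# the calculus is a *-homomorphism: conjugation gives x*, |z|^2 gives x* x
assert np.allclose(apply_function(x, np.conj), x.conj().T)
assert np.allclose(apply_function(x, lambda z: z * np.conj(z)), x.conj().T @ x)
# it is isometric: ||f(x)|| equals the sup of |f| over sigma(x)
f_of_x = apply_function(x, lambda z: 1 + z)
assert abs(np.linalg.norm(f_of_x, 2) - abs(1 + 1j)) < 1e-10
```

Here `np.linalg.norm(·, 2)` is the operator (largest singular value) norm, matching the C*-norm of 𝓛(ℂ²); for the non-normal shift matrix this identity between norm and spectral data fails, which is one way to see why normality is the right hypothesis.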
The moral to be drawn from the preceding proposition is this: if something is true for continuous functions defined on a compact subset of ℂ (resp., ℝ), then it is also true for normal (resp., self-adjoint) elements of a C*-algebra. We make precise the sort of thing we mean by the foregoing "moral" in the following proposition.

Proposition 3.3.11 Let A be a C*-algebra.
(a) The following conditions on an element x ∈ A are equivalent:
(i) x is normal, and σ(x) ⊂ ℝ;
(ii) x is self-adjoint.

(b) The following conditions on an element u ∈ A are equivalent:
(i) u is normal, and σ(u) ⊂ 𝕋;
(ii) A has an identity, and u is unitary, i.e., u*u = uu* = 1.

(c) The following conditions on an element p ∈ A are equivalent:
(i) p is normal, and σ(p) ⊂ {0, 1};
(ii) p is a projection, i.e., p = p² = p*.

(d) The following conditions on an element x ∈ A are equivalent:
(i) x is normal, and σ(x) ⊂ [0, ∞);
(ii) x is positive, i.e., there exists a self-adjoint element y ∈ A such that x = y².

(e) If x ∈ A is positive (as in (d) above), then there exists a unique positive element y ∈ A such that x = y²; this y actually belongs to C*({x}) and is given by y = f(x), where f(t) = √t; this unique element y is called the positive square root of x and is denoted by y = x^{1/2}.
(f) Let x be a self-adjoint element in A. Then there exists a unique decomposition x = x₊ − x₋, where x₊, x₋ are positive elements of A (in the sense of (d) above) which satisfy the condition x₊x₋ = 0.

Proof. (a) (i) ⇒ (ii): By Proposition 3.3.10, we have an isomorphism C*({x}) ≅ C(σ(x)) in which x corresponds to the identity function f₁ on σ(x); if σ(x) ⊂ ℝ, then f₁, and consequently also x, is self-adjoint.

(ii) ⇒ (i): This also follows easily from Proposition 3.3.10.

The proofs of (b) and (c) are entirely similar. (For instance, for (i) ⇒ (ii) in (b), you would use the fact that if z ∈ 𝕋, then |z|² = 1, so that the identity function f₁ on σ(u) would satisfy f̄₁f₁ = f₁f̄₁ = 1.)

(d) For (i) ⇒ (ii), put y = f(x), where f ∈ C(σ(x)) is defined by f(t) = √t, where √t denotes the non-negative square root of the non-negative number t. The implication (ii) ⇒ (i) follows from Proposition 3.3.10 and the following two facts: the square of a real-valued function is a function with non-negative values; and, if f ∈ C(X), with X a compact Hausdorff space, then σ(f) = f(X). (Prove this last fact!)
(e) If f is as in the proof of (d), and if we define y = f(x), it then follows (since f(0) = 0) that y ∈ C*({x}), and that y is positive and y² = x. Suppose now that z is some other positive element of A such that x = z². It follows that if we write A₀ = C*({z}), then x ∈ A₀, and consequently also y ∈ C*({x}) ⊂ A₀; but now, by Theorem 3.3.6, there is an isomorphism of A₀ with C(X), where X = σ(z) ⊂ [0, ∞), and under this isomorphism both z and y correspond to non-negative continuous functions on X, such that the squares of these two functions are identical; since the non-negative square root of a non-negative number is unique, we may conclude that z = y.

(f) For existence of the decomposition, define x± = f±(x), where f± is the continuous function on ℝ (and hence on any subset of it) defined by f±(t) = max{0, ±t}, and note that f±(t) ≥ 0, f₊(t) − f₋(t) = t and f₊(t)f₋(t) = 0 for all t ∈ ℝ. Also, note that f±(0) = 0, so that f±(x) ∈ C*({x}).

For uniqueness, suppose x = x₊ − x₋ is a decomposition of the desired sort. Then note that x₋x₊ = x₋* x₊* = (x₊x₋)* = 0, so that x₊ and x₋ are commuting positive elements of A. It follows that A₀ = C*({x₊, x₋}) is a commutative C*-algebra which contains {x₊, x₋, x}, and consequently also f±(x) ∈ C*({x}) ⊂ A₀. Now appeal to Theorem 3.3.6 (applied to A₀), and the fact (which you should prove for yourself) that if X is a compact Hausdorff space, if f is a real-valued continuous function on X, and if gⱼ, j = 1, 2, are non-negative continuous functions on X such that f = g₁ − g₂ and g₁g₂ = 0, then necessarily g₁(x) = max{f(x), 0} and g₂(x) = max{−f(x), 0}, and conclude the proof of the proposition. □
Given an element x of a C*-algebra which is positive in the sense of Proposition 3.3.11(d), we shall write x ≥ 0 or 0 ≤ x. We wish to show that the set of positive elements of a C*-algebra forms a positive cone - meaning that if x and y are positive elements of a C*-algebra, and if a, b are non-negative real numbers, then also ax + by ≥ 0; since clearly ax, by ≥ 0, we only need to show that a sum of positive elements is positive; we proceed towards this goal through a lemma.

Lemma 3.3.12 Let A be a unital Banach algebra, and suppose x, y ∈ A. Then,

σ(xy) ∪ {0} = σ(yx) ∪ {0} .
Proof. We wish to show that if λ ≠ 0, then (λ − xy) ∈ G(A) ⇔ (λ − yx) ∈ G(A); by replacing x by x/λ if necessary, it is sufficient to consider the case λ = 1. By the symmetry of the problem, it clearly suffices to show that if (1 − yx) is invertible, then so is (1 − xy).

We first present the heuristic and non-rigorous motivation for the proof. Write (1 − yx)⁻¹ = 1 + yx + yxyx + yxyxyx + ⋯ ; then,

(1 − xy)⁻¹ = 1 + xy + xyxy + xyxyxy + ⋯ = 1 + x(1 − yx)⁻¹y .

Coming back to the (rigorous) proof, suppose u = (1 − yx)⁻¹; thus, u − uyx = u − yxu = 1. Set v = 1 + xuy, and note that

v(1 − xy) = (1 + xuy)(1 − xy) = 1 + xuy − xy − xuyxy = 1 + x(u − 1 − uyx)y = 1 ,

and an entirely similar computation shows that also (1 − xy)v = 1, thus showing that (1 − xy) is indeed invertible (with inverse given by v). □

Proposition 3.3.13 If A is a C*-algebra, if x, y ∈ A, and if x ≥ 0 and y ≥ 0, then also (x + y) ≥ 0.
Proof. To start with (by embedding A in a larger unital C*-algebra, if necessary), we may assume that A is itself a unital C*-algebra. Next, we may (by scaling both x and y down by the same small positive scalar, if necessary) assume, without loss of generality, that ||x||, ||y|| ≤ 1. Thus r(x) ≤ 1 and we may conclude that σ(x) ⊂ [0, 1], and consequently deduce that σ(1 − x) ⊂ [0, 1], and hence that (1 − x) ≥ 0 and ||1 − x|| = r(1 − x) ≤ 1. Similarly, also ||1 − y|| ≤ 1. Then,

||1 − (x + y)/2|| = (1/2) ||(1 − x) + (1 − y)|| ≤ 1 .

Since (x + y)/2 is clearly self-adjoint, this means that σ((x + y)/2) ⊂ [0, 2], whence σ(x + y) ⊂ [0, 4] (by the spectral mapping theorem), and the proof of the proposition is complete. □

Corollary 3.3.14 If A is a C*-algebra, let A_sa denote the set of self-adjoint elements of A. If x, y ∈ A_sa, say x ≥ y (or equivalently, y ≤ x) if it is the case that (x − y) ≥ 0. Then this defines a partial order on the set A_sa.

Proof. Reflexivity follows from 0 ≥ 0; transitivity follows from Proposition 3.3.13; anti-symmetry amounts to the statement that x = x*, x ≥ 0 and −x ≥ 0 imply that x = 0; this is because such an x should have the property that σ(x) ⊂ [0, ∞) and σ(x) ⊂ (−∞, 0]; this would mean that σ(x) = {0}, and since x = x*, it would follow that ||x|| = r(x) = 0. □
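Both Lemma 3.3.12 and Proposition 3.3.13 are easy to test numerically in a matrix algebra; the sketch below is ours (not from the text) and uses M₃(ℝ) with a fixed random seed.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3))

# Lemma 3.3.12: sigma(ab) u {0} = sigma(ba) u {0}.  Here even the full
# spectra agree, which we check via the characteristic polynomials.
assert np.allclose(np.poly(a @ b), np.poly(b @ a))

# Proposition 3.3.13: matrices of the form z z^T are positive, and the
# smallest eigenvalue of a sum of two such matrices stays non-negative.
x = a @ a.T
y = b @ b.T
assert np.linalg.eigvalsh(x).min() >= -1e-10
assert np.linalg.eigvalsh(y).min() >= -1e-10
assert np.linalg.eigvalsh(x + y).min() >= -1e-10
```

Comparing characteristic polynomials (rather than sorted eigenvalue lists) avoids any ambiguity in how nearly-equal complex eigenvalues get ordered.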
Lemma 3.3.15 Suppose z is an element of a C*-algebra and that z*z ≤ 0; then z = 0.

Proof. Deduce from Lemma 3.3.12 that the hypothesis implies that also zz* ≤ 0; hence, we may deduce from Proposition 3.3.13 that z*z + zz* ≤ 0. However, if z = u + iv is the Cartesian decomposition of z, note then that z*z + zz* = 2(u² + v²); hence, we may deduce, from Proposition 3.3.13, that z*z + zz* ≥ 0. Thus, we find that z*z + zz* = 0. This means that u² = −v²; arguing exactly as before, we find that u² ≥ 0 and u² ≤ 0, whence u² = 0, and so u = 0 (since ||u||² = ||u²|| for self-adjoint u). Hence also v = 0, and the proof of the lemma is complete. □

Proposition 3.3.16 The following conditions on an element x in a C*-algebra are equivalent:
(i) x ≥ 0;
(ii) there exists an element z ∈ A such that x = z*z.
Proof. The implication (i) ⇒ (ii) follows upon setting z = x^{1/2} - see Proposition 3.3.11(e).

As for (ii) ⇒ (i), let x = x₊ − x₋ be the canonical decomposition of the self-adjoint element x into its positive and negative parts; we then find (since x₋x₊ = 0) that x₋xx₋ = −x₋³ ≤ 0; but x₋xx₋ = (zx₋)*(zx₋), and we may conclude from Lemma 3.3.15 that zx₋ = 0; hence x₋³ = −x₋xx₋ = 0, whence x₋ = 0, and so z*z = x = x₊ ≥ 0, as desired. □

3.4 Representations of C*-algebras
We will be interested in representing an abstract C*-algebra as operators on Hilbert space; i.e., we will be looking at unital *-homomorphisms from abstract unital C*-algebras into the concrete unital C*-algebra 𝓛(H) of operators on Hilbert space. Since we wish to use lower case Roman letters (such as x, y, z, etc.) to denote elements of our abstract C*-algebras, we shall, for the sake of clarity of exposition, use Greek letters (such as ξ, η, ζ, etc.) to denote vectors in Hilbert spaces, in the rest of this chapter. Consistent with this change in notation (from Chapter 2, where we used symbols such as A, T, etc., for operators on Hilbert space), we shall adopt the following further notational conventions in the rest of this chapter: upper case Roman letters (such as A, B, M, etc.) will be used to denote C*-algebras as well as subsets of C*-algebras, while calligraphic symbols (such as 𝒮, 𝓜, etc.) will be reserved for subsets of Hilbert spaces.

We now come to the central notion of this section, and some of the related notions.

Definition 3.4.1 (a) A representation of a unital C*-algebra A is a *-homomorphism π : A → 𝓛(H) (where H is some Hilbert space), which will always be assumed to be a "unital homomorphism", meaning that π(1) = 1 - where the symbol 1 on the left (resp., right) denotes the identity of the C*-algebra A (resp., 𝓛(H)).

(b) Two representations πᵢ : A → 𝓛(Hᵢ), i = 1, 2, are said to be equivalent if there exists a unitary operator U : H₁ → H₂ with the property that π₂(x) = Uπ₁(x)U* ∀ x ∈ A.

(c) A representation π : A → 𝓛(H) is said to be cyclic if there exists a vector ξ ∈ H such that {π(x)ξ : x ∈ A} is dense in H. (In this case, the vector ξ is said to be a cyclic vector for the representation π.)
We commence with a useful fact about *-homomorphisms between unital C*-algebras.

Lemma 3.4.2 Suppose π : A → B is a unital *-homomorphism between unital C*-algebras; then,
(a) x ∈ A ⇒ σ(π(x)) ⊂ σ(x); and
(b) ||π(x)|| ≤ ||x|| ∀ x ∈ A.
3.4. Representations of C* -algebras
91
Proof. (a) Since π is a unital algebra homomorphism, it follows that π maps invertible elements to invertible elements; this clearly implies the asserted spectral inclusion.

(b) If x = x* ∈ A, then also π(x) = π(x*) = π(x)* ∈ B, and since the norm of a self-adjoint element in a C*-algebra is equal to its spectral radius, we find (from (a)) that

||π(x)|| = r(π(x)) ≤ r(x) = ||x|| ;

for general x ∈ A, deduce that

||π(x)||² = ||π(x)*π(x)|| = ||π(x*x)|| ≤ ||x*x|| = ||x||² ,

and the proof of the lemma is complete. □
Exercise 3.4.3 Let {πᵢ : A → 𝓛(Hᵢ)}_{i∈I} be an arbitrary family of representations of (an arbitrary unital C*-algebra) A; show that there exists a unique (unital) representation π : A → 𝓛(H), where H = ⊕_{i∈I} Hᵢ, such that π(x) = ⊕_{i∈I} πᵢ(x). (See Exercise 2.5.8 for the definition of a direct sum of an arbitrary family of operators.) The representation π is called the direct sum of the representations {πᵢ : i ∈ I}, and we write π = ⊕_{i∈I} πᵢ. (Hint: Use Exercise 2.5.8 and Lemma 3.4.2.)

Definition 3.4.4 If H is a Hilbert space, and if S ⊂ 𝓛(H) is any set of operators, then the commutant of S is denoted by the symbol S', and is defined by
S' = {x' ∈ 𝓛(H) : x'x = xx' ∀ x ∈ S} .
Proposition 3.4.5 (a) If S ⊂ 𝓛(H) is arbitrary, then S' is a unital subalgebra of 𝓛(H) which is closed in the weak-operator topology (and consequently also in the strong operator and norm topologies) on 𝓛(H).

(b) If S is closed under formation of adjoints - i.e., if S = S*, where of course S* = {x* : x ∈ S} - then S' is a C*-subalgebra of 𝓛(H).
(c) (i) If S ⊂ T ⊂ 𝓛(H), then S' ⊃ T';
(ii) if we inductively define S'(n+1) = (S'(n))' and S'(1) = S', then

S' = S'(2n+1) ∀ n ≥ 0, and S ⊂ S'' = S'(2) = S'(2n+2) ∀ n ≥ 0 .
(d) Let 𝓜 be a closed subspace of H, let p denote the orthogonal projection onto 𝓜, and let S ⊂ 𝓛(H); then the following conditions are equivalent:
(i) x(𝓜) ⊂ 𝓜, ∀ x ∈ S;
(ii) xp = pxp ∀ x ∈ S.
If S = S* is "self-adjoint", then the above conditions are also equivalent to
(iii) p ∈ S'.

Proof. The topological assertion in (a) is a consequence of the fact that multiplication is "separately weakly continuous" - see Example 2.5.3(3); the algebraic assertions are obvious.

(b) Note that, in general,

y ∈ (S*)' ⇔ yx* = x*y ∀ x ∈ S ⇔ xy* = y*x ∀ x ∈ S ⇔ y* ∈ S' ,

so that (S*)' = (S')*, for any S ⊂ 𝓛(H).
(c) (i) is an immediate consequence of the definition, as is the fact that

S ⊂ S'' ;   (3.4.17)

applying (i) to the inclusion 3.4.17, we find that S' ⊃ S'''; on the other hand, if we replace S by S' in the inclusion 3.4.17, we see that S' ⊂ S'''; the proof of (c)(ii) is complete.

(d) For one operator x ∈ 𝓛(H), the conditions that x(𝓜) ⊂ 𝓜, or that xp = pxp, are both seen to be equivalent to the requirement that if ((x_{ij}))_{1≤i,j≤2} is the "matrix of x with respect to the direct sum decomposition H = 𝓜 ⊕ 𝓜^⊥" (as discussed in Proposition 2.5.6), then x_{21} = 0; i.e., the "matrix of x" has the form

[x] = [ a  b ; 0  c ] ,

where, of course, a ∈ 𝓛(𝓜), b ∈ 𝓛(𝓜^⊥, 𝓜) and c ∈ 𝓛(𝓜^⊥) are the appropriate "compressions" of x - where a compression of an operator x ∈ 𝓛(H) is an operator of the form z = P_𝓜 ∘ x|_𝓝 for some subspaces 𝓜, 𝓝 ⊂ H.

If S is self-adjoint, and if (i) and (ii) are satisfied, then we find that xp = pxp and x*p = px*p for all x ∈ S; taking adjoints in the second equation, we find that px = pxp; combining with the first equation, we thus find that px = xp ∀ x ∈ S; i.e., p ∈ S'.

Conversely, (iii) clearly implies (ii), since if px = xp, then pxp = xp² = xp. □

We are now ready to establish a fundamental result concerning self-adjoint subalgebras of 𝓛(H).

Theorem 3.4.6 (von Neumann's density theorem) Let A be a unital *-subalgebra of 𝓛(H). Then A'' is the closure of A in the strong operator topology.
Proof. Since A ⊂ A'', and since A'' is closed in the strong operator topology, we only need to show that A is strongly dense in A''. By the definition of the strong topology, we need to prove the following:

Assertion: Let z ∈ A''; then for any n ∈ ℕ, ξ₁, …, ξₙ ∈ H and ε > 0, there exists an x ∈ A such that ||(x − z)ξᵢ|| < ε for 1 ≤ i ≤ n.

Case (i): We first prove the assertion when n = 1.
Let us write ξ = ξ₁, and let 𝓜 denote the closure of the set Aξ = {xξ : x ∈ A}. Note that Aξ is a vector space containing ξ (since A is an algebra containing 1), and hence 𝓜 is a closed subspace of H which is obviously stable under A, meaning that x𝓜 ⊂ 𝓜 ∀ x ∈ A. Hence, if p is the projection onto 𝓜, we may deduce (from Proposition 3.4.5(d)) that p ∈ A' and that pξ = ξ. Since z ∈ A'', we find that zp = pz, whence z𝓜 ⊂ 𝓜; in particular, zξ ∈ 𝓜; by definition of 𝓜, this means there exists x ∈ A such that ||zξ − xξ|| < ε.

Case (ii): n ∈ ℕ arbitrary.
Let Hⁿ = H ⊕ ⋯ ⊕ H (n terms); as in Proposition 2.5.6, we shall identify 𝓛(Hⁿ) with Mₙ(𝓛(H)). Let us adopt the following notation: if a ∈ 𝓛(H), let us write a⁽ⁿ⁾ for the element of Mₙ(𝓛(H)) with (i,j)-th entry given by δ_{ij} a; thus,
a⁽ⁿ⁾ = diag(a, a, …, a) ,

the n × n matrix with a in each diagonal entry and 0 in every off-diagonal entry.
With this notation, define

A⁽ⁿ⁾ = {a⁽ⁿ⁾ : a ∈ A} ,

and observe that A⁽ⁿ⁾ is a unital *-subalgebra of 𝓛(Hⁿ). We claim that

(A⁽ⁿ⁾)' = {((b_{ij})) : b_{ij} ∈ A' ∀ i, j}   (3.4.18)

and that

(A⁽ⁿ⁾)'' = {z⁽ⁿ⁾ : z ∈ A''} .   (3.4.19)

Let b = ((b_{ij})). Then b a⁽ⁿ⁾ = ((b_{ij} a)), while a⁽ⁿ⁾ b = ((a b_{ij})); the validity of equation 3.4.18 follows.
As for equation 3.4.19, begin by observing that (in view of equation 3.4.18) we have e_{ij} ∈ (A⁽ⁿ⁾)', where e_{ij} is the matrix which has entry 1 = id_H in the (i,j)-th place and 0's elsewhere; hence, if y ∈ (A⁽ⁿ⁾)'', we should, in particular, have y e_{ij} = e_{ij} y for all 1 ≤ i, j ≤ n. This is seen, fairly easily, to imply that there must exist some z ∈ 𝓛(H) such that y = z⁽ⁿ⁾; another routine verification shows that z⁽ⁿ⁾ will commute with ((b_{ij})) for arbitrary b_{ij} ∈ A' if and only if z ∈ A'', thus completing the proof of equation 3.4.19.

Coming back to the assertion, if z ∈ A'', consider the element z⁽ⁿ⁾ ∈ (A⁽ⁿ⁾)'' and the vector ξ = (ξ₁, ξ₂, …, ξₙ) ∈ Hⁿ, and appeal to the already established Case (i) to find an element of A⁽ⁿ⁾ - i.e., an element of the form a⁽ⁿ⁾, with a ∈ A - such that ||(z⁽ⁿ⁾ − a⁽ⁿ⁾)ξ|| < ε; this implies that ||(z − a)ξᵢ|| < ε for 1 ≤ i ≤ n. □

Corollary 3.4.7 (Double commutant theorem) The following conditions on a unital *-subalgebra M ⊂ 𝓛(H) are equivalent:
(i) M is weakly closed;
(ii) M is strongly closed;
(iii) M = M''.
A subalgebra M as above is called a von Neumann algebra.

Proof. (i) ⇒ (ii): Obvious.
(ii) ⇒ (iii): This is an immediate consequence of the density theorem above.
(iii) ⇒ (i): This follows from Proposition 3.4.5(a). □
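In finite dimensions, commutants can be computed mechanically: the commutant of a finite set of n × n matrices is the nullspace of the linear map b ↦ (xb − bx)_{x∈S}. The sketch below is ours (not from the text); it uses the column-major identity vec(AXB) = (Bᵀ ⊗ A)vec(X) and an SVD to extract the nullspace, and checks the double commutant for the diagonal subalgebra of M₂(ℂ).

```python
import numpy as np

def commutant_basis(mats, tol=1e-10):
    # commutant of a set of n x n matrices, as the nullspace of
    # b |-> x b - b x; vec(x b - b x) = (I (x) x - x^T (x) I) vec(b)
    n = mats[0].shape[0]
    eye = np.eye(n)
    m = np.vstack([np.kron(eye, x) - np.kron(x.T, eye) for x in mats])
    _, s, vh = np.linalg.svd(m)
    null_rows = vh[np.sum(s > tol):]                 # basis of the nullspace
    return [v.reshape(n, n, order="F") for v in null_rows]

x = np.diag([1.0, 2.0])
comm = commutant_basis([x])                          # S' for S = {x}
assert len(comm) == 2                                # S' = diagonal matrices
for b in comm:
    assert np.allclose(x @ b, b @ x)                 # each b really commutes
    assert abs(b[0, 1]) < 1e-10 and abs(b[1, 0]) < 1e-10

# S'' is the commutant of S', here again the 2-dimensional diagonal algebra
bis = commutant_basis(comm)
assert len(bis) == 2
```

For a matrix with distinct eigenvalues, S'' recovers exactly the algebra generated by x and 1, a small instance of the double commutant theorem.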
Remark 3.4.8 Suppose π : A → 𝓛(H) is a representation; a closed subspace 𝓜 ⊂ H is said to be π-stable if π(x)(𝓜) ⊂ 𝓜, ∀ x ∈ A. In view of Proposition 3.4.5(d), this is equivalent to the condition that p ∈ π(A)', where p is the projection of H onto 𝓜. Thus π-stable subspaces of H are in bijective correspondence with projections in the von Neumann algebra π(A)'. In particular - since p ∈ M ⇒ 1 − p ∈ M for any von Neumann algebra M - we see that the orthogonal complement of a π-stable subspace is also π-stable.

Every π-stable subspace yields a new representation - call it π|_𝓜 - by restriction: thus π|_𝓜 : A → 𝓛(𝓜) is defined by π|_𝓜(x)(ξ) = π(x)(ξ), ∀ x ∈ A, ξ ∈ 𝓜. The representation π|_𝓜 is called a sub-representation of π.

Lemma 3.4.9 Any representation is equivalent to a direct sum of cyclic (sub-)representations; any separable representation is equivalent to a countable direct sum of cyclic representations.

Proof. Notice, to start with, that if π : A → 𝓛(H) is a representation and if ξ ∈ H is arbitrary, then the subspace 𝓜 = closure of {π(x)ξ : x ∈ A} is a closed subspace which is π-stable, and, by definition, the sub-representation π|_𝓜 is cyclic (with cyclic
vector ξ). Consider the non-empty collection 𝒫 whose typical element is a non-empty collection S = {𝓜ᵢ : i ∈ I} of pairwise orthogonal non-zero π-stable subspaces which are cyclic (in the sense that the sub-representation afforded by each of them is a cyclic representation). It is clear that the set 𝒫 is partially ordered by inclusion, and that if C = {S_λ : λ ∈ Λ} is any totally ordered subset of 𝒫, then S = ∪_{λ∈Λ} S_λ is an element of 𝒫; thus every totally ordered set in 𝒫 admits an upper bound. Hence, Zorn's lemma implies the existence of a maximal element S = {𝓜ᵢ : i ∈ I} of 𝒫. Then 𝓜 = ⊕_{i∈I} 𝓜ᵢ is clearly a π-stable subspace of H, and so also is 𝓜^⊥. If 𝓜^⊥ ≠ {0}, pick a non-zero ξ ∈ 𝓜^⊥, let 𝓜₀ = closure of {π(x)ξ : x ∈ A}, and observe that S ∪ {𝓜₀} is a member of 𝒫 which contradicts the maximality of S. Thus, it should be the case that 𝓜^⊥ = {0}; i.e., H = ⊕_{i∈I} 𝓜ᵢ, and the proof of the first assertion is complete.

As for the second, if H = ⊕_{i∈I} 𝓜ᵢ is an orthogonal decomposition of H as a direct sum of non-zero cyclic π-stable subspaces as above, let ξᵢ be any unit vector in 𝓜ᵢ. Then {ξᵢ : i ∈ I} is an orthonormal set in H, and the assumed separability of H implies that I is necessarily countable. □

Before beginning the search for cyclic representations, a definition is in order.

Definition 3.4.10 A state on a unital C*-algebra A is a linear functional φ : A → ℂ which satisfies the following two conditions:
(i) φ is positive, meaning that φ(x*x) ≥ 0 ∀ x ∈ A - i.e., φ assumes non-negative values on positive elements of A; and
(ii) φ(1) = 1.

If π : A → 𝓛(H) is a representation, and ξ is a unit vector in H, the functional φ, defined by φ(x) = ⟨π(x)ξ, ξ⟩ for all x ∈ A, yields an example of a state. It follows from Lemma 3.4.2 that in fact φ is a bounded linear functional, with ||φ|| = φ(1) = 1. The content of the following proposition is that this property characterises a state.
Proposition 3.4.11 The following conditions on a linear functional φ : A → ℂ are equivalent:
(i) φ is a state;
(ii) φ ∈ A* and ||φ|| = φ(1) = 1.
Proof. (i) ⇒ (ii): Notice, to begin with, that since any self-adjoint element is expressible as a difference of two positive elements - see Proposition 3.3.11(f) - φ(x) ∈ ℝ whenever x = x*; it follows now from the Cartesian decomposition that φ(x*) is the complex conjugate of φ(x), for all x ∈ A. Now, consider the sesquilinear form B_φ : A × A → ℂ defined by

B_φ(x, y) = φ(y*x) , ∀ x, y ∈ A .   (3.4.20)
It follows from the assumed positivity of φ (and the previous paragraph) that B_φ is a positive-semidefinite sesquilinear form on A - as in Remark 2.1.5; in fact, that remark now shows that

|φ(y*x)|² ≤ φ(x*x) · φ(y*y) ∀ x, y ∈ A .   (3.4.21)

In particular, setting y = 1 in equation 3.4.21, and using the obvious inequality x*x ≤ ||x||²·1 and the positivity of φ, we find that

|φ(x)|² ≤ φ(x*x)·φ(1) ≤ ||x||²·φ(1)² = ||x||² ,

so that, indeed, φ ∈ A* and ||φ|| ≤ 1; since ||1|| = 1 = φ(1), we find that ||φ|| = φ(1) = 1, as desired.
(ii) ⇒ (i): We shall show that if x = x*, and if σ(x) ⊂ [a, b], then φ(x) ∈ [a, b]. (This will imply that φ is positive and that φ(1) = 1.) Set c = (a + b)/2 and r = (b − a)/2, and note that σ(x − c) ⊂ [−r, r] and consequently ||x − c|| ≤ r; hence it follows (from the assumptions (ii)) that

|φ(x) − c| = |φ(x − c)| ≤ ||x − c|| ≤ r ;

in other words, we indeed have φ(x) ∈ [a, b] as asserted, and the proof is complete. □
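On a matrix algebra, the canonical states are φ(x) = tr(ρx) for a density matrix ρ (positive, trace one). The sketch below is ours (not from the text); it checks positivity, normalisation, and the Cauchy-Schwarz inequality (3.4.21) numerically on M₂(ℂ).

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = w @ w.conj().T
rho = rho / np.trace(rho)                      # density matrix: rho >= 0, tr = 1

def phi(x):
    # the state x |-> tr(rho x) on M_2(C)
    return np.trace(rho @ x)

x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

assert abs(phi(np.eye(2)) - 1) < 1e-12         # phi(1) = 1
assert phi(x.conj().T @ x).real >= -1e-12      # phi(z* z) >= 0
assert abs(phi(x.conj().T @ x).imag) < 1e-12
lhs = abs(phi(y.conj().T @ x)) ** 2            # inequality (3.4.21)
rhs = phi(x.conj().T @ x).real * phi(y.conj().T @ y).real
assert lhs <= rhs + 1e-10
```

The vector states φ(x) = ⟨xξ, ξ⟩ of Definition 3.4.10 are the special case where ρ is the rank-one projection onto ℂξ.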
We are almost ready for the fundamental construction due to Gelfand, Naimark and Segal, which associates cyclic representations to states. We will need a simple fact in the proof of this basic construction, which we isolate as an exercise below, before proceeding to the construction.

Exercise 3.4.12 Suppose {xᵢ⁽ᵏ⁾ : i ∈ I} is a total set in Hₖ, for k = 1, 2, such that ⟨xᵢ⁽¹⁾, xⱼ⁽¹⁾⟩_{H₁} = ⟨xᵢ⁽²⁾, xⱼ⁽²⁾⟩_{H₂} for all i, j ∈ I. Show that there exists a unique unitary operator U : H₁ → H₂ such that U xᵢ⁽¹⁾ = xᵢ⁽²⁾ ∀ i ∈ I.
Theorem 3.4.13 (GNS construction) Let φ be a state on a unital C*-algebra A; then there exists a cyclic representation π_φ : A → 𝓛(H_φ) with a cyclic vector ξ_φ of unit norm such that

φ(x) = ⟨π_φ(x)ξ_φ, ξ_φ⟩ ∀ x ∈ A .   (3.4.22)

The triple (H_φ, π_φ, ξ_φ) is unique in the sense that if π : A → 𝓛(H) is a cyclic representation with cyclic vector ξ such that φ(x) = ⟨π(x)ξ, ξ⟩ ∀ x ∈ A, then there exists a unique unitary operator U : H → H_φ such that Uξ = ξ_φ and Uπ(x)U* = π_φ(x) ∀ x ∈ A.

Proof. Let B_φ be the positive-semidefinite sesquilinear form defined on A using the state φ as in equation 3.4.20. Let N_φ = {x ∈ A : B_φ(x, x) = 0}. It follows from
the inequality 3.4.21 that x ∈ N_φ if and only if φ(y*x) = 0 ∀ y ∈ A; this implies that N_φ is a vector subspace of A which is in fact a left-ideal (i.e., x ∈ N_φ ⇒ zx ∈ N_φ ∀ z ∈ A). Deduce now that the equation

⟨x + N_φ, y + N_φ⟩ = φ(y*x)

defines a genuine inner product on the quotient space V = A/N_φ. For notational convenience, let us write η(x) = x + N_φ, so that η : A → V; since N_φ is a left-ideal in A, it follows that each x ∈ A unambiguously defines a linear map L_x : V → V by the prescription L_x η(y) = η(xy).

We claim now that each L_x is a bounded operator on the inner product space V and that ||L_x||_{𝓛(V)} ≤ ||x||_A. This amounts to the assertion that

φ(y*x*xy) = ||L_x η(y)||² ≤ ||x||² ||η(y)||² = ||x||² φ(y*y)

for all x, y ∈ A. Notice now that, for each fixed y ∈ A, if we consider the functional ψ(z) = φ(y*zy), then ψ is a positive linear functional; consequently, we find from Proposition 3.4.11 that ||ψ|| = ψ(1) = φ(y*y); in particular, we find that for arbitrary x, y ∈ A, we must have φ(y*x*xy) = ψ(x*x) ≤ ||ψ|| · ||x*x||; in other words, φ(y*x*xy) ≤ ||x||² φ(y*y), as asserted.

Since V is a genuine inner product space, we may form its completion - call it H_φ - where we think of V as a dense subspace of H_φ. We may deduce from the previous paragraph that each L_x extends uniquely to a bounded operator on H_φ, which we will denote by π_φ(x); the operator π_φ(x) is defined by the requirement that π_φ(x)η(y) = η(xy); this immediately implies that π_φ is a unital algebra homomorphism of A into 𝓛(H_φ). To see that π_φ preserves adjoints, note that if x, y, z ∈ A are arbitrary, then

⟨π_φ(x)η(y), η(z)⟩ = φ(z*(xy)) = φ((x*z)*y) = ⟨η(y), π_φ(x*)η(z)⟩ ,

which implies, in view of the density of η(A) in H_φ, that π_φ(x)* = π_φ(x*), so that π_φ is indeed a representation of A on H_φ. Finally, it should be obvious that ξ_φ = η(1) is a cyclic vector for this representation.

Conversely, if (H, π, ξ) is another triple which also "works" for φ as asserted in the statement of the second half of Theorem 3.4.13, observe that

⟨π(x)ξ, π(y)ξ⟩ = φ(y*x) = ⟨π_φ(x)ξ_φ, π_φ(y)ξ_φ⟩

for all x, y ∈ A; the assumptions that ξ and ξ_φ are cyclic vectors for the representations π and π_φ respectively imply, via Exercise 3.4.12, that there exists a unique unitary operator U : H → H_φ with the property that U(π(x)ξ) = π_φ(x)ξ_φ for all x ∈ A; it is clear that U has the properties asserted in Theorem 3.4.13. □
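The GNS construction can be run concretely in finite dimensions. The toy computation below is ours (not from the text): for A = M₂(ℂ) with the vector state φ(x) = x[0,0] = ⟨xe₁, e₁⟩, we form the Gram matrix of B_φ on a basis of A, quotient out its nullspace, represent left multiplication on the quotient, and check equation (3.4.22); all helper names are our own.

```python
import numpy as np

basis = []                                   # matrix units E_ij of M_2(C)
for i in range(2):
    for j in range(2):
        e = np.zeros((2, 2), dtype=complex)
        e[i, j] = 1.0
        basis.append(e)

def phi(x):
    # the vector state phi(x) = <x e1, e1>
    return x[0, 0]

# Gram matrix G[i, j] = phi(b_i* b_j) of the form B_phi on the basis
G = np.array([[phi(bi.conj().T @ bj) for bj in basis] for bi in basis])

d, u = np.linalg.eigh(G)
keep = d > 1e-10
C = np.sqrt(d[keep])[:, None] * u[:, keep].conj().T   # coordinates on A/N_phi
assert C.shape[0] == 2                                # dim H_phi = 2

def left_mult(x):
    # matrix of a |-> x a in the chosen basis of A
    return np.array([(x @ b).flatten() for b in basis]).T

def pi(x):
    # the induced GNS representation on H_phi
    return C @ left_mult(x) @ np.linalg.pinv(C)

xi = C @ np.eye(2, dtype=complex).flatten()           # xi_phi = eta(1)
rng = np.random.default_rng(3)
x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

assert abs(np.vdot(xi, xi) - 1) < 1e-10               # unit cyclic vector
assert abs(np.vdot(xi, pi(x) @ xi) - phi(x)) < 1e-8   # equation (3.4.22)
assert np.allclose(pi(x) @ pi(y), pi(x @ y))          # pi is a homomorphism
```

Here N_φ = {x : xe₁ = 0} has dimension 2, so H_φ ≅ ℂ² and π_φ is (equivalent to) the identity representation, as the uniqueness half of the theorem predicts for a vector state.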
Chapter 3. C*-Algebras
We now wish to establish that there exist "sufficiently many" representations of any C*-algebra.

Lemma 3.4.14 Let x be a self-adjoint element of a C*-algebra A. Then there exists a cyclic representation π of A such that ‖π(x)‖ = ‖x‖.
Proof. Let A_0 = C*({1, x}) be the commutative unital C*-subalgebra generated by x. Since A_0 ≅ C(σ(x)), there exists - see Exercise 3.2.3(5)(d) - a complex homomorphism φ_0 on A_0 such that |φ_0(x)| = ‖x‖. Notice that ‖φ_0‖ = 1 = φ_0(1).

By the Hahn-Banach theorem, we can find a φ ∈ A* such that φ|A_0 = φ_0 and ‖φ‖ = ‖φ_0‖. It follows then that ‖φ‖ = 1 = φ(1); hence φ is a state on A, by Proposition 3.4.11. If π_φ is the (cyclic) GNS representation afforded by the state φ as in Theorem 3.4.13, we then see that

‖x‖ = |φ_0(x)| = |φ(x)| = |⟨π_φ(x)ξ_φ, ξ_φ⟩| ≤ ‖π_φ(x)‖;

since the reverse inequality ‖π_φ(x)‖ ≤ ‖x‖ holds in view of Lemma 3.4.2(b), the proof of the lemma is complete. □
Theorem 3.4.15 If A is any C*-algebra, there exists an isometric representation π : A → L(H). If A is separable, then we can choose the Hilbert space H above to be separable.
Proof. Let {x_i : i ∈ I} be a dense set in A. For each i ∈ I, pick a cyclic representation π_i : A → L(H_i) such that ‖π_i(x_i* x_i)‖ = ‖x_i* x_i‖ - which is possible, by the preceding lemma. Note that the C*-identity then shows that ‖π_i(x_i)‖ = ‖x_i‖ for all i ∈ I.

Let π = ⊕_{i∈I} π_i; deduce from Lemma 3.4.2 and the previous paragraph that, for arbitrary i, j ∈ I, we have ‖π_j(x_i)‖ ≤ ‖x_i‖ = ‖π_i(x_i)‖, and hence we see that ‖π(x_i)‖ = ‖x_i‖ ∀ i ∈ I; since the set {x_i : i ∈ I} is dense in A, we conclude that the representation π is necessarily isometric.

Suppose now that A is separable. Then, in the above notation, we may assume that I is countable. Further, note that if ξ_i ∈ H_i is a cyclic vector for the cyclic representation π_i, then {π_i(x_j)ξ_i : j ∈ I} is a countable dense set in H_i; it follows that each H_i is separable, and since I is countable, H must also be separable. □

Hence, when proving many statements regarding general C*-algebras, it suffices to consider the case when the algebra in question is concretely realised as a C*-algebra of operators on some Hilbert space. The next exercise relates the notion of positivity that we have for elements of abstract C*-algebras to what is customarily referred to as "positive-definiteness" (or "positive-semidefiniteness") in the context of matrices.
Exercise 3.4.16 Show that the following conditions on an operator T ∈ L(H) are equivalent:
(i) T is positive, when regarded as an element of the C*-algebra L(H) (i.e., T = T* and σ(T) ⊂ [0, ∞), or T = S² for some self-adjoint element S ∈ L(H), etc.);
(ii) there exists an operator S ∈ L(H, K) such that T = S*S, where K is some Hilbert space;
(iii) ⟨Tx, x⟩ ≥ 0 ∀ x ∈ H.
(Hint: for (i) ⇒ (ii), set K = H, S = T^{1/2}; the implication (ii) ⇒ (iii) is obvious; for (iii) ⇒ (i), first deduce from Proposition 2.4.9(a) that T must be self-adjoint; let T = T_+ - T_- be its canonical decomposition as a difference of positive operators; use the validity of the implication (i) ⇒ (iii) of this exercise, applied to the operator T_-, to deduce that it must be the case that ⟨T_- x, x⟩ = 0 ∀ x ∈ H; since T_- is "determined by its quadratic form", conclude that T_- = 0, or equivalently, that T = T_+ ≥ 0.)
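For matrices, the equivalences of Exercise 3.4.16 can be verified numerically. The sketch below is our own illustration (random data, numpy): starting from condition (ii) with T = S*S, it checks condition (i) (self-adjointness, spectrum in [0, ∞)), spot-checks condition (iii), and constructs the square root T^{1/2} used in the hint.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
T = S.conj().T @ S                          # condition (ii): T = S*S

# condition (i): T = T* and sigma(T) contained in [0, infinity)
assert np.allclose(T, T.conj().T)
assert np.all(np.linalg.eigvalsh(T) >= -1e-12)

# condition (iii): <Tx, x> >= 0 (spot-check on random vectors)
for _ in range(100):
    x = rng.normal(size=4) + 1j * rng.normal(size=4)
    q = np.vdot(x, T @ x)                   # = <Tx, x>
    assert q.real >= -1e-10 and abs(q.imag) < 1e-10

# the positive square root T^(1/2) from the hint for (i) => (ii)
w, V = np.linalg.eigh(T)
sqrtT = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T
assert np.allclose(sqrtT @ sqrtT, T)
```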
3.5 The Hahn-Hellinger theorem

This section is devoted to the classification of separable representations of a separable commutative C*-algebra. Hence, throughout this section, we will be concerned with the commutative unital C*-algebra C(X), where X is a compact Hausdorff space which is metrisable. (The reason for this is that, for a compact Hausdorff space X, the separability of C(X) is equivalent to the metrisability of X - i.e., to the topology on X coming from some metric on X.)

It is an immediate consequence of the Riesz representation theorem - see §A.7 in the Appendix - that there is a bijective correspondence between states φ on (the commutative unital C*-algebra) C(X) on the one hand, and probability measures μ on (X, B_X) on the other, given by integration, thus:
φ(f) = ∫_X f dμ.  (3.5.23)
The next thing to notice is that the GNS triple (H_μ, π_μ, ξ_μ) associated with this state (as in Theorem 3.4.13) may easily be seen to be given as follows: H_μ = L²(X, B_X, μ); ξ_μ is the constant function which is identically equal to 1; and π_μ is the "multiplication representation" defined by π_μ(f)ξ = fξ, ∀ f ∈ C(X), ξ ∈ L²(X, μ).
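When X is a finite set, the multiplication representation becomes a diagonal-matrix representation, and the correspondence between the state and the measure is transparent. The following toy sketch (our illustration, with hypothetical weights) realizes π_μ(f)ξ = fξ on L²(X, μ) for X = {0, 1, 2} and checks that it is a *-homomorphism whose vector state at ξ_μ = 1 is integration against μ.

```python
import numpy as np

# X = {0,1,2} with probability weights mu; L^2(X, mu) = C^3 with weighted inner product
mu = np.array([0.5, 0.3, 0.2])
inner = lambda xi, eta: np.sum(mu * xi * np.conj(eta))

def pi(f):
    # multiplication by the "function" f (a vector of values), as a diagonal matrix
    return np.diag(f)

f = np.array([1.0 + 1j, -2.0, 0.5j])
g = np.array([0.3, 1.0 - 1j, 2.0])
xi = np.array([1.0, 2.0, 3.0 + 1j])
eta = np.array([0.5j, 1.0, -1.0])

# multiplicativity: pi(fg) = pi(f) pi(g)
assert np.allclose(pi(f * g), pi(f) @ pi(g))

# *-preservation with respect to the weighted inner product:
# <pi(f) xi, eta> = <xi, pi(conj f) eta>
assert np.isclose(inner(pi(f) @ xi, eta), inner(xi, pi(np.conj(f)) @ eta))

# the state at the cyclic vector xi_mu = 1 is integration: <pi(f)1, 1> = sum f dmu
xi_mu = np.ones(3)
assert np.isclose(inner(pi(f) @ xi_mu, xi_mu), np.sum(f * mu))
```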
Thus, we may, in view of this bijection between states on C(X) and probability measures on (X, B_X), deduce the following specialisation of the general theory developed in the last section - see Lemma 3.4.9 and Theorem 3.4.13.

Proposition 3.5.1 If π : C(X) → L(H) is any separable representation of C(X) - where X is a compact metric space - then there exists a (finite or infinite) countable collection {μ_n}_n of probability measures on (X, B_X) such that π is equivalent to ⊕_n π_{μ_n}.
The problem with the preceding proposition is the lack of "canonicalness" in the construction of the sequence of probability measures μ_n. The rest of this section is devoted, in a sense, to establishing the exact amount of "canonicalness" in this construction. The first step is to identify at least some situations where two different measures can yield equivalent representations.
Lemma 3.5.2 (a) If μ and ν are two probability measures defined on (X, B_X) which are mutually absolutely continuous - see §A.5 - then the representations π_μ and π_ν are equivalent.
(b) If μ and ν are two finite positive measures defined on (X, B_X) which are mutually singular - see §A.5 - and if we let λ = μ + ν, then π_λ ≅ π_μ ⊕ π_ν.
Proof. (a) Let φ = (dν/dμ)^{1/2}, and note that (by the defining property of the Radon-Nikodym derivative) the equation Uξ = φξ, ξ ∈ H_ν, defines an isometric linear operator U : H_ν → H_μ - where, of course, we write H_λ = L²(X, B, λ). (Reason: if ξ ∈ H_ν, then

‖Uξ‖²_{H_μ} = ∫_X |ξ|² (dν/dμ) dμ = ∫_X |ξ|² dν = ‖ξ‖²_{H_ν}.)

In an identical fashion, if we set ψ = (dμ/dν)^{1/2}, then the equation Vη = ψη defines an isometric operator V : H_μ → H_ν; but the uniqueness of the Radon-Nikodym derivative implies that ψ = φ^{-1}, and that V and U are inverses of one another. Finally, it is easy to deduce from the definitions that

U π_ν(f)ξ = π_μ(f)Uξ

whenever ξ ∈ H_ν, f ∈ C(X); in other words, U is a unitary operator which implements the equivalence of the representations π_ν and π_μ.

(b) By hypothesis, there exists a Borel partition X = A ⊔ B such that μ = μ|_A and ν = ν|_B - where, as in §A.5, we use the notation μ|_E to denote the measure defined by μ|_E(F) = μ(E ∩ F). It is easily seen then that also μ = λ|_A and ν = λ|_B; the mapping H_λ ∋ f ↦ (1_A f, 1_B f) ∈ H_μ ⊕ H_ν is a unitary operator that establishes the desired equivalence of representations. □

Our goal, in this section, is to prove the following classification, up to equivalence, of separable representations of C(X), with X a compact metric space. Before stating our main result, we shall fix some notation. Given a representation π, and 1 ≤ n ≤ ℵ_0, we shall write π^n to denote the direct sum of n copies of π. (Here, the symbol ℵ_0 refers, of course, to the "countable infinity".)
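Part (a) of Lemma 3.5.2 can be tested concretely in a finite toy model. The sketch below (our illustration, with hypothetical weights) takes two mutually absolutely continuous weights on a three-point set and checks that Uξ = (dν/dμ)^{1/2} ξ is an isometry from L²(ν) to L²(μ) intertwining the two multiplication representations.

```python
import numpy as np

mu = np.array([0.5, 0.3, 0.2])      # two mutually absolutely continuous
nu = np.array([0.2, 0.5, 0.3])      # probability weights on X = {0,1,2}

inner_mu = lambda a, b: np.sum(mu * a * np.conj(b))
inner_nu = lambda a, b: np.sum(nu * a * np.conj(b))

rn = nu / mu                        # Radon-Nikodym derivative d(nu)/d(mu)
U = np.diag(np.sqrt(rn))            # U xi = (d nu/d mu)^(1/2) xi : H_nu -> H_mu

xi = np.array([1.0, -2.0j, 3.0])
# U is isometric from L^2(nu) to L^2(mu):
# sum mu * rn * |xi|^2 = sum nu * |xi|^2
assert np.isclose(inner_mu(U @ xi, U @ xi), inner_nu(xi, xi))

# U intertwines the multiplication representations: U pi_nu(f) = pi_mu(f) U
f = np.array([2.0, 1j, -1.0])
assert np.allclose(U @ np.diag(f) @ xi, np.diag(f) @ U @ xi)
```

The intertwining identity holds here because diagonal matrices commute, which is exactly the statement that U implements an equivalence of π_ν and π_μ.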
Theorem 3.5.3 (Hahn-Hellinger theorem) Let X be a compact metric space.

(1) If π is a separable representation of C(X), then there exists a probability measure μ defined on (X, B_X) and a family {E_n : 1 ≤ n ≤ ℵ_0} of pairwise disjoint Borel subsets of X such that
(i) π ≅ ⊕_{1≤n≤ℵ_0} π^n_{μ|E_n};
(ii) μ is supported on ⊔_{1≤n≤ℵ_0} E_n (meaning that the complement of this set has μ-measure zero).

(2) Suppose μ_i, i = 1, 2, are two probability measures, and suppose that for each i = 1, 2 we have a family {E_n^{(i)} : 1 ≤ n ≤ ℵ_0} of pairwise disjoint Borel subsets of X such that μ_i is supported on ⊔_{1≤n≤ℵ_0} E_n^{(i)}; then the following conditions are equivalent:
(i) ⊕_{1≤n≤ℵ_0} π^n_{μ_1|E_n^{(1)}} ≅ ⊕_{1≤n≤ℵ_0} π^n_{μ_2|E_n^{(2)}};
(ii) the measures μ_i are mutually absolutely continuous, and further, μ_i(E_n^{(1)} Δ E_n^{(2)}) = 0 for all n and for i = 1, 2,
where, of course, the symbol Δ has been used to signify the "symmetric difference" of sets.
Proof of (1). According to Proposition 3.5.1, we can find a countable family - say {μ_n : n ∈ N} - of probability measures on (X, B_X) such that

π ≅ ⊕_{n∈N} π_{μ_n}.  (3.5.24)

Let {ε_n : n ∈ N} be any set of strictly positive numbers such that Σ_{n∈N} ε_n = 1, and define μ = Σ_{n∈N} ε_n μ_n. The definitions imply the following facts: (i) μ is a probability measure; and (ii) if E ∈ B_X, then μ(E) = 0 ⇔ μ_n(E) = 0 ∀ n ∈ N.

In particular, it follows from (ii) that each μ_n is absolutely continuous with respect to μ; hence, if we set A_n = {x ∈ X : (dμ_n/dμ)(x) > 0}, it then follows from Exercise A.5.22(4) that μ_n and μ|_{A_n} are mutually absolutely continuous; also, it follows from (ii) above that μ is supported on ∪_{n∈N} A_n. We thus find now that

π ≅ ⊕_{n∈N} π_{μ|A_n}.  (3.5.25)

We will find it convenient to make the following assumption (which we clearly may): N = {1, 2, ..., n} for some n ∈ ℕ (in case N is finite), or N = ℕ (in case N is infinite). We will also find the introduction of certain sets very useful for purposes of "counting".
Let K = {1, 2, ..., |N|} (where |N| denotes the cardinality of N), and for each k ∈ K, define

E_k = {x ∈ X : Σ_{n∈N} 1_{A_n}(x) = k}.  (3.5.26)

Thus, x ∈ E_k precisely when x belongs to exactly k of the A_n's; in particular, {E_k : k ∈ K} is easily seen to be a Borel partition of ∪_{n∈N} A_n. (Note that we have deliberately omitted 0 in our definition of the set K.) Thus we may now conclude - thanks to Lemma 3.5.2(b) - that

π ≅ ⊕_{n∈N} ⊕_{k∈K} π_{μ|(A_n ∩ E_k)}.  (3.5.27)

Next, for each n ∈ N, k ∈ K, l ∈ ℕ such that 1 ≤ l ≤ k, define

A_{n,k,l} = {x ∈ A_n ∩ E_k : Σ_{j=1}^{n} 1_{A_j}(x) = l}.  (3.5.28)

(Thus, x ∈ A_{n,k,l} if and only if x ∈ E_k and further, n is exactly the l-th index j for which x ∈ A_j; i.e., x ∈ A_n ∩ E_k and there are exactly l values of j for which 1 ≤ j ≤ n and x ∈ A_j.) A moment's reflection on the definitions shows that

A_n ∩ E_k = ⊔_{l∈ℕ, 1≤l≤k} A_{n,k,l}, ∀ n ∈ N, k ∈ K,
E_k = ⊔_{n∈N} A_{n,k,l}, ∀ k ∈ K, 1 ≤ l ≤ k.

We may once again appeal to Lemma 3.5.2(b) and deduce from the preceding equations that

π ≅ ⊕_{n∈N} ⊕_{k∈K} π_{μ|(A_n ∩ E_k)}
  ≅ ⊕_{n∈N} ⊕_{k∈K} ⊕_{l∈ℕ, 1≤l≤k} π_{μ|A_{n,k,l}}
  ≅ ⊕_{k∈K} ⊕_{l∈ℕ, 1≤l≤k} π_{μ|E_k}
  ≅ ⊕_{k∈K} π^k_{μ|E_k},

and the proof of (1) of the theorem is complete. □
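The counting argument just used is purely combinatorial and can be simulated on a finite set. The sketch below (our illustration; the sets A_1, A_2, A_3 are hypothetical indicator data) computes E_k = {x : Σ_n 1_{A_n}(x) = k} and the sets A_{n,k,l}, and verifies the two partition identities invoked in the proof.

```python
import numpy as np

# X = {0,...,7}; three "supports" A_1, A_2, A_3 given as indicator rows
A = np.array([
    [1, 1, 1, 0, 0, 1, 0, 0],   # A_1
    [1, 1, 0, 1, 0, 1, 0, 0],   # A_2
    [1, 0, 0, 0, 1, 1, 0, 0],   # A_3
], dtype=int)
N = A.shape[0]

counts = A.sum(axis=0)                       # sum_n 1_{A_n}(x) for each x
E = {k: set(np.where(counts == k)[0]) for k in range(1, N + 1)}

# E_k = points lying in exactly k of the A_n's; here x = 0, 5 lie in all three
assert E[3] == {0, 5}

# A_{n,k,l} = {x in A_n intersect E_k : sum_{j<=n} 1_{A_j}(x) = l}
def A_nkl(n, k, l):
    return {x for x in range(8)
            if A[n - 1, x] and counts[x] == k and A[:n, x].sum() == l}

# identity 1: A_n intersect E_k = disjoint union over l of A_{n,k,l}
for n in range(1, N + 1):
    for k in range(1, N + 1):
        union = set().union(*(A_nkl(n, k, l) for l in range(1, k + 1)))
        assert union == {x for x in range(8) if A[n - 1, x] and counts[x] == k}

# identity 2: E_k = disjoint union over n of A_{n,k,l}, for each fixed l <= k
for k in range(1, N + 1):
    for l in range(1, k + 1):
        union = set().union(*(A_nkl(n, k, l) for n in range(1, N + 1)))
        assert union == E[k]
```

Identity 2 is the step that lets each copy of π_{μ|E_k} be assembled once per multiplicity index l, yielding the summand π^k_{μ|E_k}.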
We will have to proceed through a sequence of lemmas before we can complete the proof of (2) of the Hahn-Hellinger theorem. Thus far, we have established the "existence of the canonical decomposition" of an arbitrary separable representation of C(X); we shall in fact use this existence half fairly strongly in the proof of the "uniqueness" of the decomposition.

In the rest of this section, we assume that X is a compact metric space, and that all Hilbert spaces (and consequently all representations, as well) which we deal with are separable; further, all measures we deal with will be finite positive measures defined on (X, B_X).
Lemma 3.5.4 Let φ ∈ L^∞(X, B_X, μ); then there exists a sequence {f_n}_n ⊂ C(X) such that
(i) sup_n ‖f_n‖_{C(X)} < ∞; and
(ii) the sequence {f_n(x)}_n converges to φ(x), for μ-almost every x ∈ X.

Proof. Since C(X) is dense in L²(X, B_X, μ), we may find a sequence {h_n}_n in C(X) such that ‖h_n - φ‖_{L²(X,B_X,μ)} → 0; since every sequence converging to 0 in L²(X, B_X, μ) is known - see [Hall], for instance - to contain a subsequence which converges μ-a.e. to 0, we can find a subsequence - say {g_n}_n - of {h_n}_n such that g_n(x) → φ(x) a.e.

Let ‖φ‖_{L^∞} = K, and define a continuous "retraction" of ℂ onto the disc of radius K + 1 as follows:

r(z) = z,                if |z| ≤ K + 1;
r(z) = (K + 1) z / |z|,  if |z| ≥ K + 1.

Finally, consider the continuous functions defined by f_n = r ∘ g_n; it should be clear that ‖f_n‖ ≤ K + 1 for all n, and that (by the assumed a.e. convergence of the sequence {g_n}_n to φ, together with the fact that |φ| ≤ K a.e.) also f_n(x) → φ(x) μ-a.e. □

Lemma 3.5.5 (a) Let π : C(X) → L(H) be a representation; then there exists a probability measure μ defined on B_X, and a representation π̃ : L^∞(X, B_X, μ) → L(H) such that:
(i) π̃ is isometric;
(ii) π̃ "extends" π in the sense that π̃(f) = π(f) whenever f ∈ C(X); and
(iii) π̃ "respects bounded convergence" - meaning that whenever φ is the μ-a.e. limit of a sequence {φ_n}_n ⊂ L^∞(X, μ) which is uniformly bounded (i.e., sup_n ‖φ_n‖_{L^∞(μ)} < ∞), then the sequence {π̃(φ_n)}_n converges in the strong operator topology to π̃(φ).
(b) Suppose that, for i = 1, 2, π̃_i : L^∞(X, B_X, μ_i) → L(H_i) is a representation which is related to a representation π_i : C(X) → L(H_i) as in (a) above; suppose π_1 ≅ π_2; then the measures μ_1 and μ_2 are mutually absolutely continuous.

Proof. (a) First consider the case when π is a cyclic representation; then there exists a probability measure μ such that π ≅ π_μ; take this μ as the measure in the statement of (a), and define π̃ to be the natural "multiplication representation" given by π̃(φ)ξ = φξ, ∀ φ ∈ L^∞(μ), ξ ∈ L²(μ); then statement (ii) is obvious, while (iii) is an immediate consequence of the dominated convergence theorem - see Proposition A.5.16(3). As for (i), suppose φ ∈ L^∞(μ) and ε > 0; if E = {x : |φ(x)| ≥ ‖φ‖ - ε}, then μ(E) > 0, by the definition of the norm on L^∞(X, B_X, μ); hence if ξ = μ(E)^{-1/2} 1_E, we find that ξ is a unit vector in L²(μ) such that ‖π̃(φ)ξ‖ ≥ ‖φ‖ - ε; the validity of (a) - at least in the cyclic case under consideration - follows now from the arbitrariness of ε and from Lemma 3.4.2 (applied to the C*-algebra L^∞(μ)).
For a general (separable) π, we may (by the already established part (1) of the Hahn-Hellinger theorem) assume without loss of generality that π = ⊕_n π^n_{μ|E_n}; then define π̃ = ⊕_n (π̃_{μ|E_n})^n, where the summands are defined as in the last paragraph. We assert that this π̃ does the job.

Since μ is supported on ⊔_n E_n (by (1)(ii) of Theorem 3.5.3), and since each π̃_{μ|E_n} is an isometric representation of L^∞(E_n, μ|E_n), it is fairly easy to deduce that π̃, as we have defined it, is indeed an isometric *-homomorphism of L^∞(X, μ), thus establishing (a)(i). The assertion (ii) follows immediately from the definition and the already established cyclic case.

As for (iii), notice that the Hilbert space underlying the representation π̃ is of the form H = ⊕_{1≤n≤ℵ_0} ⊕_{1≤m≤n} H_{n,m}, where H_{n,m} = L²(E_n, μ|E_n), ∀ 1 ≤ m ≤ n ≤ ℵ_0. So, if {φ_k}_k, φ are as in (iii), we see that π̃(φ_k) has the form x^{(k)} = ⊕_{1≤m≤n≤ℵ_0} x^{(k)}_{n,m}, where x^{(k)}_{n,m} = π̃_{μ|E_n}(φ_k) for 1 ≤ m ≤ n ≤ ℵ_0. Similarly π̃(φ) has the form x = ⊕_{1≤m≤n≤ℵ_0} x_{n,m}, where x_{n,m} = π̃_{μ|E_n}(φ) for 1 ≤ m ≤ n ≤ ℵ_0. The assumption that sup_k ‖φ_k‖ < ∞ shows that the sequence {x^{(k)}}_k is uniformly bounded in norm; further, if we let S = ∪_{1≤m≤n≤ℵ_0} H_{n,m} - where we naturally regard the H_{n,m}'s as subspaces of the Hilbert space H - then we find that S is a total set in H and that x^{(k)}ξ → xξ whenever ξ ∈ S (by the already established cyclic case); thus we may deduce - from Lemma 2.5.2 - that the sequence {x^{(k)}}_k does indeed converge strongly to x, and the proof of (a) is complete.

(b) Suppose U : H_1 → H_2 is a unitary operator such that U π_1(f) U* = π_2(f) for all f ∈ C(X). Define μ = ½(μ_1 + μ_2), and note that both μ_1 and μ_2 are absolutely continuous with respect to (the probability measure) μ. Let φ be any bounded measurable function on X. By Lemma 3.5.4, we may find a sequence {f_k}_k ⊂ C(X) such that sup_k ‖f_k‖ < ∞ and such that f_k(x) → φ(x) for all x ∈ X - N, where μ(N) = 0; then also μ_i(N) = 0, i = 1, 2; hence, by the assumed property (iii) of the π̃_i's, we find that {π̃_i(f_k)}_k converges in the strong operator topology to π̃_i(φ); from the assumed intertwining nature of the unitary operator U, we may deduce that U π̃_1(φ) U* = π̃_2(φ) for every bounded measurable function φ : X → ℂ. In particular, we see that
U π̃_1(1_E) U* = π̃_2(1_E), ∀ E ∈ B_X.  (3.5.29)
Since the π̃_i's are isometric, we find that, for E ∈ B_X,

μ_1(E) = 0 ⇔ π̃_1(1_E) = 0 ⇔ π̃_2(1_E) = 0 (by 3.5.29) ⇔ μ_2(E) = 0,

thereby completing the proof of the lemma. □
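Property (iii) above - strong convergence under uniformly bounded pointwise convergence - can be seen concretely for multiplication operators, where it genuinely differs from norm convergence. The sketch below is our own finite-dimensional model of L²(μ) (hypothetical geometric weights): the multipliers φ_k = 1_{{n ≥ k}} are bounded by 1 and converge pointwise to 0, so π̃(φ_k)ξ → 0 for each fixed ξ, yet each π̃(φ_k) is a projection of norm 1.

```python
import numpy as np

# Model of L^2(mu) on {0,...,N-1} with finite weights mu({n}) = 2^{-(n+1)}
N = 30
mu = 0.5 ** np.arange(1, N + 1)
norm = lambda xi: np.sqrt(np.sum(mu * np.abs(xi) ** 2))

# phi_k = indicator of {n >= k}: uniformly bounded by 1, converges pointwise to 0
def mult(k):
    return np.diag((np.arange(N) >= k).astype(float))

xi = np.ones(N)                              # a fixed vector in L^2(mu)
tail_norms = [norm(mult(k) @ xi) for k in range(N)]

# strong convergence: || pi(phi_k) xi || decreases to (essentially) 0
assert all(a >= b for a, b in zip(tail_norms, tail_norms[1:]))
assert tail_norms[-1] < 1e-4

# but no norm convergence: each pi(phi_k) is a non-zero projection of norm 1,
# witnessed by the unit vector supported at the single point n = k
for k in range(N):
    e_k = np.zeros(N); e_k[k] = mu[k] ** -0.5    # unit vector in L^2(mu)
    assert np.isclose(norm(mult(k) @ e_k), 1.0)
```

This is why (iii) is stated for the strong operator topology: the sequence {π̃(φ_k)}_k cannot converge in operator norm.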
Remark 3.5.6 Let π : C(X) → L(H) be a separable representation; first notice from Lemma 3.5.5(b) that the measure μ of Lemma 3.5.5(a) is uniquely determined up to mutual absolute continuity; also, the proof of Lemma 3.5.5(b) shows
that if the representations π_1 and π_2 of C(X) are equivalent, then the representations π̃_1 and π̃_2 of L^∞(X, μ) are equivalent (with the two equivalences being implemented by the same unitary operator). Furthermore, it is a consequence of Lemma 3.5.4 and Lemma 3.5.5(a)(iii) that the *-homomorphism π̃ is also uniquely determined by the condition that it satisfies (a)(i)-(iii). Thus, the representation π uniquely determines the C*-algebra L^∞(X, μ) and the representation π̃ as per Lemma 3.5.5(a)(i)-(iii). □

Lemma 3.5.7 If π, μ, π̃ are as in Lemma 3.5.5(a), then

(π(C(X)))'' = π̃(L^∞(X, μ)).

Thus, π̃(L^∞(X, μ)) is the von Neumann algebra generated by π(C(X)).
Proof. Let us write A = π(C(X)), M = π̃(L^∞(X, μ)). Thus, we need to show that M = A''. Since A is clearly commutative, it follows that A ⊂ A'; since A is a unital *-subalgebra of L(H), and since A' is closed in the strong operator topology, we may conclude, from the preceding inclusion and von Neumann's density theorem - see Theorem 3.4.6 - that we must have A'' ⊂ A'. Next, note that, in view of Lemma 3.5.4, Lemma 3.5.5(a)(iii) and Theorem 3.4.6, we necessarily have M ⊂ A''. Thus, we find that we always have

M ⊂ A'' ⊂ A'.  (3.5.30)
Case (i): π is cyclic.
In this case, we assert that we actually have A' = A'' = M. We may assume that the underlying Hilbert space is H = L²(X, μ), and that π̃ is the multiplication representation of L^∞(X, μ). In view of 3.5.30, we need to show that A' ⊂ M. So suppose x ∈ A'; we wish to conclude that x = π̃(φ) for some φ ∈ L^∞(X, μ). Notice that if this were true, it would have to be the case that φ = x ξ_μ, where ξ_μ denotes the constant function 1, as in the notation of Theorem 3.4.13. So, define φ = x ξ_μ; we need to show that φ ∈ L^∞(X, μ) and that x = π̃(φ). For this, begin by deducing from the inclusion 3.5.30 that if ψ ∈ L^∞(X, μ) is arbitrary, then

xψ = x π̃(ψ) ξ_μ = π̃(ψ) x ξ_μ = π̃(ψ)φ = ψφ.

Further, if we set E_r = {s ∈ X : |φ(s)| > r}, note that r 1_{E_r} ≤ |φ| 1_{E_r}, and consequently,

‖1_{E_r}‖_H ≤ (1/r) ‖φ 1_{E_r}‖_H = (1/r) ‖x 1_{E_r}‖_H ≤ (1/r) ‖x‖_{L(H)} ‖1_{E_r}‖_H,

which clearly implies that μ(E_r) = 0 ∀ r > ‖x‖_{L(H)}; in other words, φ ∈ L^∞(X, μ); but then x and π̃(φ) are two bounded operators on H which agree on the dense set L^∞(X, μ), and hence x = π̃(φ), as desired.
Case (ii): Suppose π = π_μ^n, for some 1 ≤ n ≤ ℵ_0.
Thus, in this case, we may assume that H = H_μ^n is the (Hilbert space) direct sum of n copies of H_μ = L²(X, μ). We may - and shall - identify an operator x on H_μ^n with its representing matrix ((x(i,j))), where x(i,j) ∈ L(H_μ) ∀ i, j ∈ ℕ such that 1 ≤ i, j ≤ n (as in Proposition 2.5.6). In the notation of the proof of Theorem 3.4.6, we find that A = π_μ(C(X))^{(n)} = {x^{(n)} : x ∈ π_μ(C(X))} (where x^{(n)} is the matrix with (i,j)-th entry given by x or 0 according as i = j or i ≠ j). Arguing exactly as in the proof of Theorem 3.4.6 (and using the fact, proved in Case (i) above, that π_μ(C(X))' = π_μ(L^∞(X, μ))), we find that y ∈ A' precisely when the entries of the representing matrix satisfy y(i,j) = π_μ(φ_{i,j}) ∀ i, j, for some family {φ_{i,j} : 1 ≤ i, j ≤ n} ⊂ L^∞(X, μ). Again, arguing as in the proof of Theorem 3.4.6, and using the fact, proved in Case (i) above, that π_μ(L^∞(X, μ))' = π_μ(L^∞(X, μ)), we then find that A'' = {z^{(n)} : z ∈ π_μ(L^∞(X, μ))}; in other words, this says exactly that A'' = π_μ^n(L^∞(X, μ)) = M, as desired.

Case (iii): π arbitrary.
By the already proved Theorem 3.5.3(1), we may assume, as in the notation of that theorem, that π = ⊕_{1≤n≤ℵ_0} π^n_{μ|E_n}. Let P_n = π̃(1_{E_n}), for 1 ≤ n ≤ ℵ_0, and let H_n = ran P_n, so that the Hilbert space underlying the representation π (as well as π̃) is given by H = ⊕_{1≤n≤ℵ_0} H_n. By 3.5.30, we see that P_n ∈ A'' ⊂ A'. This means that any operator in A' has the form ⊕_n x_n, with x_n ∈ L(H_n); in fact, it is not hard to see that in order that such a direct sum operator belong to A', it is necessary and sufficient that x_n ∈ π^n_{μ|E_n}(C(X))' ∀ 1 ≤ n ≤ ℵ_0.

From this, it follows that an operator belongs to A'' if and only if it has the form ⊕_n z_n, where z_n ∈ π^n_{μ|E_n}(C(X))'' ∀ 1 ≤ n ≤ ℵ_0; deduce now from (the already established) Case (ii) that there must exist φ_n ∈ L^∞(E_n, μ|E_n) such that z_n = (π̃_{μ|E_n})^n(φ_n) ∀ n; in order for ⊕_n z_n to be bounded, it must be the case that the family {‖z_n‖ = ‖φ_n‖_{L^∞(E_n, μ|E_n)}}_n is uniformly bounded, or equivalently, that the equation φ = Σ_n 1_{E_n} φ_n defines an element of L^∞(X, μ); in other words, we have shown that an operator belongs to A'' if and only if it has the form z = π̃(φ) for some φ ∈ L^∞(X, μ), i.e., A'' ⊂ M, and the proof of the lemma is complete. □

Remark 3.5.8 The reader might have observed our use of the phrase "the von Neumann subalgebra generated" by a family of operators. Analogous to our notation in the C*-case, we shall write W*(S) for the von Neumann subalgebra generated by a subset S ⊂ L(H). It must be clear that for any such set S, W*(S) is the smallest von Neumann algebra contained in L(H) and containing S. For a more constructive description of W*(S), let A denote the set of linear combinations of "words in S ∪ S* ∪ {1}"; then A is the smallest *-subalgebra containing S ∪ {1},
its norm closure is C*(S ∪ {1}), and W*(S) is the weak (equivalently, strong) operator closure of A. Also, W*(S) = (S ∪ S*)''. □
We need just one more ingredient in order to complete the proof of the Hahn-Hellinger theorem, and that is provided by the following lemma.
Lemma 3.5.9 Let π = π_μ^n and H = H_μ^n denote the (Hilbert space) direct sum of n copies of H_μ = L²(X, μ), where 1 ≤ n ≤ ℵ_0. Consider the following conditions on a family {p_i : i ∈ I} ⊂ L(H):
(i) {p_i : i ∈ I} is a family of pairwise orthogonal projections in π(C(X))'; and
(ii) if F ∈ B_X is any Borel set such that μ(F) > 0, and if π̃ is associated to π as in Lemma 3.5.5 (also see Remark 3.5.6), then p_i π̃(1_F) ≠ 0 ∀ i ∈ I.
Then:
(a) there exists a family {p_i : i ∈ I} which satisfies conditions (i) and (ii) above, such that |I| = n;
(b) if n < ∞ and if {p_i : i ∈ I} is any family satisfying (i) and (ii) above, then |I| ≤ n.
Proof. To start with, note that since H = H_μ^n, we may identify operators on H with their representing matrices ((x(i,j))) (where, of course, x(i,j) ∈ L(H_μ) ∀ 1 ≤ i, j ≤ n). Further, A = π(C(X)) consists of those operators x whose matrices satisfy x(i,j) = δ_{i,j} π_μ(f) for some f ∈ C(X); and our analysis of Case (ii) in the proof of Lemma 3.5.7 shows that A' consists of precisely those operators y ∈ L(H) whose matrix entries satisfy y(i,j) ∈ π_μ(C(X))' = π_μ(L^∞(X, μ)).

(a) For 1 ≤ m ≤ n, define p_m(i,j) = δ_{i,j} δ_{i,m} 1 (where 1 denotes the identity operator on H_μ); the concluding remark of the last paragraph shows that {p_m : 1 ≤ m ≤ n} is a family of pairwise orthogonal projections in A' - i.e., this family satisfies condition (i) above. If F is as in (ii), note that π̃(1_F) is the projection q ∈ L(H) whose matrix is given by q(i,j) = δ_{i,j} π_μ(1_F), and it follows that for any 1 ≤ m ≤ n, the operator p_m q is not zero, since it has the non-zero entry π_μ(1_F) in the (m,m)-th place; and (a) is proved.
(b) In order to prove that |I| ≤ n (< ∞), it is enough to show that |I_0| ≤ n for every finite subset I_0 ⊂ I; hence, we may assume, without loss of generality, that I is a finite set.

Suppose {p_m : m ∈ I} is a family of pairwise orthogonal projections in A' which satisfies condition (ii) above. In view of the concluding remarks of the first paragraph of this proof, we can find a family {φ_m(i,j) : m ∈ I, 1 ≤ i, j ≤ n} ⊂ L^∞(X, μ) such that the matrix of the projection p_m is given by p_m(i,j) = π_μ(φ_m(i,j)). Notice now that

p_m = p_m p_m* ⇒ p_m(i,i) = Σ_{j=1}^{n} p_m(i,j) p_m(i,j)*, ∀ m ∈ I;
in view of the definition of the φ_m(i,j)'s and the fact that π_μ is an (injective) *-homomorphism, it is fairly easy to deduce now that we must have

φ_m(i,i) = Σ_{j=1}^{n} |φ_m(i,j)|² a.e., ∀ 1 ≤ i ≤ n, ∀ m ∈ I.  (3.5.31)
i:- k,
then notice that
n
PmPk = 0
=?
PmP~ = 0
=?
LPm(i,j)Pk(l,j)* = 0, VI :::; i, l :::; n, j=1
which is seen to imply, as before, that n
L¢m(i,j)¢k(l,j)
=
0 a.e., VI:::; i,l:::; n, Vm
i:- k
E I.
(3.5.32)
j=1
Since I is finite, as is n, and since the union of a finite number of sets of measure zero is also a set of measure zero, we may, assume that the functions ¢m(i,j) (have been re-defined on some set of measure zero, if necessary, so that they now) satisfy the conditions expressed in equations 3.5.31 and 3.5.32 not just a.e., but in fact, everywhere; thus, we assume that the following equations hold pointwise at every point of X: n
L
l¢m(i,j)1 2
¢n,(i, i) VI :::; i :::; n, Vm E I
j=1 n
L¢m(i,j)¢k(l,j)
o VI
:::; i, l :::; n, Vm
i:- k
E I .
(3.5.33)
j=1
Now, for each mEl, define Fm = {s EX: (¢m(i,i))(s) = 0 VI :::; i:::; n}. Then, if s tJ- F m, we can pick an index 1 :::; im :::; n such that (¢rn (irn, in,)) (s) i:- 0; so, if it is possible to pick a point s which does not belong to Fm for all mEl, then we could find vectors
Vm
[
(¢rr,(im,I))(s) (¢m (i m , 2) ) (s ) .
1 ,m E I
(¢m(im, n) )(s) which would constitute a set of non-zero vectors in en which are pairwise orthogonal in view of equations 3.5.33; this would show that indeed III :::; n. So, our proof will be complete once we know that X i:- UmEIFm; we make the (obviously) even stronger assertion that f.1(Fm) = 0 Vm. Indeed, note that if
s ∈ F_m, then φ_m(i,i)(s) = 0 ∀ 1 ≤ i ≤ n; but then, by equation 3.5.31, we find that φ_m(i,j)(s) = 0 ∀ 1 ≤ i, j ≤ n; this implies that φ_m(i,j) 1_{F_m} = 0 a.e. ∀ 1 ≤ i, j ≤ n; this implies that (every entry of the matrix representing the operator) p_m π̃(1_{F_m}) is 0; but the family {p_m : m ∈ I} satisfies condition (ii), and so this can happen only if μ(F_m) = 0. □

We state, as a corollary, the form in which we will need to use Lemma 3.5.9.

Corollary 3.5.10 Let π = ⊕_{1≤n≤ℵ_0} π^n_{μ|E_n}. Suppose A ∈ B_X is such that μ(A) > 0, and let 1 ≤ n < ∞; then the following conditions on A are equivalent:
(a) there exists a family {p_i : 1 ≤ i ≤ n} of pairwise orthogonal projections in π(C(X))' such that
(i) p_i = p_i π̃(1_A) ∀ 1 ≤ i ≤ n; and
(ii) F ∈ B_X, μ(A ∩ F) > 0 ⇒ p_i π̃(1_F) ≠ 0 ∀ 1 ≤ i ≤ n;
(b) A ⊂ ∪_{n≤m≤ℵ_0} E_m (mod μ) - or equivalently, μ(A ∩ E_k) = 0 ∀ 1 ≤ k < n.
Proof. If we write e_n = π̃(1_{E_n}), it is seen - from the proof of Case (iii) of Lemma 3.5.7, for instance - that any projection p ∈ π(C(X))' has the form p = Σ_n q_n, where q_n = p e_n is, when thought of as an operator on e_n H, an element of π^n_{μ|E_n}(C(X))'.

(a) ⇒ (b): Fix 1 ≤ k < ∞, and suppose μ(A ∩ E_k) ≠ 0; set q_i = p_i e_k and note that we can apply Lemma 3.5.9(b) to the representation π^k_{μ|A∩E_k} and the family {q_i : 1 ≤ i ≤ n}, to arrive at the conclusion that n ≤ k; in other words, if 1 ≤ k < n, then μ(A ∩ E_k) = 0.

(b) ⇒ (a): Fix any m ≥ n such that μ(A ∩ E_m) > 0. Then, apply Lemma 3.5.9(a) to the representation π^m_{μ|A∩E_m} to conclude the existence of a family {q_i^{(m)} : 1 ≤ i ≤ m} of pairwise orthogonal projections in π^m_{μ|A∩E_m}(C(X))' with the property that q_i^{(m)} π̃^m_{μ|A∩E_m}(1_F) ≠ 0 whenever F is a Borel subset of A ∩ E_m such that μ(F) > 0. We may, and will, regard each q_i^{(m)} as a projection in the big ambient Hilbert space H, such that q_i^{(m)} = q_i^{(m)} π̃(1_{A∩E_m}), so that q_i^{(m)} ∈ π(C(X))'. Now, for 1 ≤ i ≤ n, if we define

p_i = Σ_{n≤m≤ℵ_0, μ(A∩E_m)>0} q_i^{(m)},

it is clear that {p_i : 1 ≤ i ≤ n} is a family of pairwise orthogonal projections in π(C(X))', and that p_i = p_i π̃(1_A) ∀ 1 ≤ i ≤ n; further, if F ∈ B_X satisfies μ(A ∩ F) > 0, then (since μ(A ∩ E_k) = 0 ∀ 1 ≤ k < n) there must exist an m ≥ n such that μ(A ∩ F ∩ E_m) > 0; in this case, note that (p_i π̃(1_F)) π̃(1_{A∩E_m}) = q_i^{(m)} π̃(1_{A∩F∩E_m}), which is non-zero by the way in which the q_i^{(m)}'s were chosen. Hence it must be that p_i π̃(1_F) ≠ 0; thus, (a) is verified. □
We now complete the proof of the Hahn-Hellinger theorem.
Proof of Theorem 3.5.3(2). Suppose that for i = 1, 2, μ^{(i)} is a probability measure on (X, B_X), and {E_n^{(i)} : 1 ≤ n ≤ ℵ_0} is a collection of pairwise disjoint Borel sets such that μ^{(i)} is supported on ⊔_{1≤n≤ℵ_0} E_n^{(i)}, and suppose the associated representations ⊕_{1≤n≤ℵ_0} π^n_{μ^{(i)}|E_n^{(i)}}, i = 1, 2, are equivalent.

It follows from the proof of Lemma 3.5.5 that a choice for the measure associated to these representations is given by μ^{(i)}, for i = 1, 2; we may conclude from Lemma 3.5.5(b) that the measures μ^{(1)} and μ^{(2)} are mutually absolutely continuous; by Lemma 3.5.2(a), it is seen that π_{μ^{(1)}|F} is equivalent to π_{μ^{(2)}|F} for any F ∈ B_X. Hence, we may assume, without loss of generality, that μ^{(1)} = μ^{(2)} = μ (say). Hence we need to show that if {E_n^{(i)} : 1 ≤ n ≤ ℵ_0}, i = 1, 2, are two collections of pairwise disjoint Borel sets such that

μ(X - ⊔_{1≤n≤ℵ_0} E_n^{(i)}) = 0, i = 1, 2,  (3.5.34)

and

π_1 = ⊕_{1≤n≤ℵ_0} π^n_{μ|E_n^{(1)}} ≅ ⊕_{1≤n≤ℵ_0} π^n_{μ|E_n^{(2)}} = π_2,  (3.5.35)

then it is the case that μ(E_n^{(1)} Δ E_n^{(2)}) = 0 ∀ 1 ≤ n ≤ ℵ_0.

Let H_i be the Hilbert space underlying the representation π_i, and let U be a unitary operator U : H_1 → H_2 such that U π_1(f) = π_2(f) U ∀ f ∈ C(X); then - see Remark 3.5.6 - also U π̃_1(φ) = π̃_2(φ) U ∀ φ ∈ L^∞(X, μ). In particular, we see that

U π̃_1(1_E) = π̃_2(1_E) U, ∀ E ∈ B_X.  (3.5.36)

Thus, we need to show that if 1 ≤ k ≠ m ≤ ℵ_0, then μ(E_m^{(1)} ∩ E_k^{(2)}) = 0. (By equation 3.5.34, this will imply that E_m^{(1)} ⊂ E_m^{(2)} (mod μ).) By the symmetry of the problem, we may assume, without loss of generality, that 1 ≤ k < m ≤ ℵ_0. So fix k, m as above, and choose a finite integer n such that k < n ≤ m. Now apply (the implication (b) ⇒ (a) of) Corollary 3.5.10 to the representation π_1 and the set E_m^{(1)} to deduce the existence of a family {p_i : 1 ≤ i ≤ n} of pairwise orthogonal projections in π_1(C(X))' such that

p_i = p_i π̃_1(1_{E_m^{(1)}}), ∀ 1 ≤ i ≤ n,  (3.5.37)

and

F ∈ B_X, μ(E_m^{(1)} ∩ F) > 0 ⇒ p_i π̃_1(1_F) ≠ 0, ∀ 1 ≤ i ≤ n.  (3.5.38)
If we now set q_i = U p_i U*, it is easy to deduce - in view of equation 3.5.36 and the consequent fact that U π_1(C(X))' U* = π_2(C(X))' - that equations 3.5.37 and 3.5.38 translate into the fact that {q_i : 1 ≤ i ≤ n} is a family of pairwise orthogonal projections in π_2(C(X))' such that

q_i = q_i π̃_2(1_{E_m^{(1)}}), ∀ 1 ≤ i ≤ n,  (3.5.39)

and

F ∈ B_X, μ(E_m^{(1)} ∩ F) > 0 ⇒ q_i π̃_2(1_F) ≠ 0, ∀ 1 ≤ i ≤ n.  (3.5.40)

We may now conclude from (the implication (a) ⇒ (b) of) Corollary 3.5.10 - when applied to the representation π_2 and the set E_m^{(1)} - that μ(E_m^{(1)} ∩ E_k^{(2)}) = 0, and the proof of the theorem is finally complete. □

In the remainder of this section, we shall re-phrase the results obtained so far in this section using the language of spectral measures.

Given a separable representation π : C(X) → L(H), consider the mapping B_X ∋ E ↦ π̃(1_E) that is canonically associated to this representation (see Remark 3.5.6). Let us, for convenience, write P(E) = π̃(1_E); thus the assignment E ↦ P(E) associates a projection in L(H) to each Borel set E. Further, it follows from the key continuity property of the representation π̃ - viz. Lemma 3.5.5(a)(iii) - that if E = ⊔_{n=1}^{∞} E_n is a partition of the Borel set E into countably many Borel sets, then P(E) = Σ_{n=1}^{∞} P(E_n), meaning that the family {P(E_n)}_{n=1}^{∞} is unconditionally summable with respect to the strong operator topology and has sum P(E). Thus, this is an instance of a spectral measure in the sense of the next definition.

Definition 3.5.11 A spectral measure on the measurable space (X, B_X) is a mapping E ↦ P(E) from B_X into the set of projections of some Hilbert space H, which satisfies the following conditions:
(i) P(∅) = 0, P(X) = 1 (where, of course, 1 denotes the identity operator on H); and
(ii) P is countably additive, meaning that if {E_n}_n is a countable collection of pairwise disjoint Borel sets in X with E = ∪_n E_n, then P(E) = Σ_n P(E_n), where the series is interpreted in the strong operator topology on H.

Observe that the strong convergence of the series in (ii) above is equivalent to the weak convergence of the series - see Proposition 2.5.4 - and they are both equivalent to requiring that if {E_n}_n is as in (ii) above, then the projections {P(E_n)}_n have pairwise orthogonal ranges and the range of P(E) is the direct sum of the ranges of the P(E_n)'s.
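For a self-adjoint matrix, a spectral measure in the sense of Definition 3.5.11 is obtained by summing eigenprojections over eigenvalues lying in a given Borel set. The sketch below is our own illustration (a 3×3 example): it builds P(E) for a matrix with spectrum {1, 2} and checks conditions (i) and (ii) along with orthogonality of ranges.

```python
import numpy as np

T = np.diag([1.0, 2.0, 2.0]) + 0j           # self-adjoint operator on C^3
w, V = np.linalg.eigh(T)                    # sigma(T) = {1, 2}

def P(E):
    # spectral measure: sum of eigenprojections with eigenvalue in the Borel set E,
    # where E is given as a boolean predicate on real numbers
    idx = [i for i, lam in enumerate(w) if E(lam)]
    if not idx:
        return np.zeros_like(T)
    W = V[:, idx]
    return W @ W.conj().T

one = np.eye(3)
# condition (i): P(empty) = 0 and P(X) = 1
assert np.allclose(P(lambda lam: False), 0)
assert np.allclose(P(lambda lam: True), one)

# additivity over the disjoint sets {1} and {2}, plus projection properties
P1 = P(lambda lam: abs(lam - 1) < 1e-9)
P2 = P(lambda lam: abs(lam - 2) < 1e-9)
assert np.allclose(P1 + P2, one)
assert np.allclose(P1 @ P1, P1) and np.allclose(P1, P1.conj().T)
assert np.allclose(P1 @ P2, 0)              # pairwise orthogonal ranges
```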
Given a spectral measure as above, fix $\xi, \eta \in H$, and notice that the equation

$\mu_{\xi,\eta}(E) = \langle P(E)\xi, \eta \rangle \qquad (3.5.41)$
defines a complex measure $\mu_{\xi,\eta}$ on $(X, \mathcal{B}_X)$. The Cauchy-Schwarz inequality implies that the complex measure $\mu_{\xi,\eta}$ has total variation norm bounded by $\|\xi\| \cdot \|\eta\|$. It is then fairly straightforward to verify that, for fixed $f \in C(X)$, the assignment

$(\xi, \eta) \mapsto \int_X f \, d\mu_{\xi,\eta} \qquad (3.5.42)$

defines a sesquilinear form on $H$; further, in view of the statement above concerning the total variation norm of $\mu_{\xi,\eta}$, we find that

$\left| \int_X f \, d\mu_{\xi,\eta} \right| \le \|f\|_{C(X)} \cdot \|\xi\| \cdot \|\eta\| ;$

hence, by Proposition 2.4.4, we may deduce the existence of a unique bounded operator - call it $\pi(f)$ - on $H$ such that

$\langle \pi(f)\xi, \eta \rangle = \int_X f \, d\mu_{\xi,\eta} .$
After a little more work, it can be verified that the passage $f \mapsto \pi(f)$ actually defines a representation $\pi : C(X) \to \mathcal{L}(H)$. It is customary to use such expressions as

$\pi(f) = \int_X f \, dP = \int_X f(x) \, dP(x)$
to denote the operator thus obtained from a spectral measure. Thus, one sees that there is a bijective correspondence between separable representations of $C(X)$ and spectral measures defined on $(X, \mathcal{B}_X)$ and taking values in projection operators in a separable Hilbert space. Thus, for instance, one possible choice for the measure $\mu$ that is associated to the representation $\pi$ (which is, after all, only determined up to mutual absolute continuity) is given, in terms of the spectral measure $P(\cdot)$, by

$\mu(E) = \sum_n \frac{1}{2^n} \|P(E)\xi_n\|^2 ,$
where $\{\xi_n\}_n$ is an orthonormal basis for $H$. Further, the representation which we have so far denoted by $\tilde\pi : L^\infty(X, \mu) \to \mathcal{L}(H)$ can be expressed, in terms of the spectral measure $P(\cdot)$, thus:

$\langle \tilde\pi(\varphi)\xi, \eta \rangle = \int_X \varphi \, d\mu_{\xi,\eta} ,$

or, equivalently,

$\tilde\pi(\varphi) = \int_X \varphi \, dP = \int_X \varphi(\lambda) \, dP(\lambda) .$
Hence, the Hahn-Hellinger theorem can be regarded as a classification, up to a natural notion of equivalence, of separable spectral measures on (X, Bx). We summarise this re-formulation as follows.
3.5. The Hahn-Hellinger theorem
Given a probability measure $\mu$ defined on $(X, \mathcal{B})$, let $P_\mu : \mathcal{B} \to \mathcal{L}(L^2(X, \mathcal{B}, \mu))$ be the spectral measure defined by

$\langle P_\mu(E) f, g \rangle = \int_E f(x) \overline{g(x)} \, d\mu(x) .$
For $1 \le n \le \aleph_0$, we have a natural spectral measure $P_\mu^n : \mathcal{B} \to \mathcal{L}((L^2(X, \mathcal{B}, \mu))^n)$ obtained by defining $P_\mu^n(E) = \oplus_{k \in I_n} P_\mu(E)$, where $H^n$ denotes the direct sum of $n$ copies of $H$ and $I_n$ is any set with cardinality $n$. In this notation, we may re-phrase the Hahn-Hellinger theorem thus:

Theorem 3.5.12 If $X$ is a compact Hausdorff space and if $P : \mathcal{B} \to \mathcal{L}(H)$ is a "separable" spectral measure (meaning that $H$ is separable), then there exist a probability measure $\mu : \mathcal{B} \to [0, 1]$ and a partition $X = \coprod_{0 \le n \le \aleph_0} E_n$ of $X$ into measurable sets such that $P$ is unitarily equivalent to $\oplus_{0 < n \le \aleph_0} P_{\mu_n}^n$, where $\mu_n$ denotes the restriction of $\mu$ to $E_n$.

If $\tilde P : \mathcal{B} \to \mathcal{L}(\tilde H)$ is another spectral measure, with associated probability measure $\tilde\mu$ and partition $X = \coprod_{0 \le n \le \aleph_0} \tilde E_n$, then $P$ and $\tilde P$ are unitarily equivalent - meaning that there exists a unitary operator $U : H \to \tilde H$ such that $\tilde P(E) = U P(E) U^* \ \forall E \in \mathcal{B}_X$ - if and only if $\mu$ and $\tilde\mu$ are mutually absolutely continuous, and $\mu(E_n \Delta \tilde E_n) = 0 \ \forall \, 0 \le n \le \aleph_0$.
Remark 3.5.13 Let $X$ be a locally compact Hausdorff space, and let $\hat X$ denote the one-point compactification of $X$. Then any spectral measure $P : \mathcal{B}_X \to \mathcal{L}(H)$ may be viewed as a spectral measure $P_1$ defined on $\mathcal{B}_{\hat X}$, with the understanding that $P_1(\{\infty\}) = 0$, or equivalently, $P_1(E) = P(E - \{\infty\})$. Hence we may deduce that the above formulation of the Hahn-Hellinger theorem is just as valid under the assumption that $X$ is a locally compact Hausdorff space; in particular, we shall use this remark in case $X = \mathbb{R}$. □
Chapter 4 Some Operator Theory
4.1 The spectral theorem
This section is devoted to specialising several results of the preceding chapter to the case of a "singly generated" commutative $C^*$-algebra, or equivalently to the case of $C(\Sigma)$, where $\Sigma$ is a compact subset of the complex plane, or equivalently, to facts concerning a single normal operator on a separable Hilbert space. We should also remark that we are going to revert to our initial notational convention, whereby $x, y, z$, etc., denote vectors in Hilbert spaces, and $A, B, S, T, U$, etc., denote operators between Hilbert spaces. (We should perhaps apologise to the reader for this "jumping around"; on the other hand, this was caused by the author's experience with the discomfort faced by beginning graduate students with vectors being $\xi, \eta$, etc., and operators being $x, y, z$, etc.)

The starting point is the following simple consequence of Proposition 3.3.10.

Lemma 4.1.1 The following conditions on a commutative unital $C^*$-algebra $A$ are equivalent:
(a) there exists an element $x \in A$ such that $A = C^*(\{1, x\})$ and $\sigma(x) = \Sigma$;
(b) $A \cong C(\Sigma)$.
The next step is the following consequence of the conclusions drawn in §3.5.

Theorem 4.1.2 (Spectral theorem (bounded case)) (a) Let $\Sigma \subset \mathbb{C}$ be compact and let $H$ denote a separable Hilbert space. Then there exists a 1-1 correspondence between (i) normal operators $T \in \mathcal{L}(H)$ such that $\sigma(T) \subset \Sigma$, and (ii) spectral measures $\mathcal{B}_\Sigma \ni E \mapsto P(E) \in \mathcal{P}(H)$ (where we write $\mathcal{P}(H)$ for the set of projection operators onto closed subspaces of $H$). This correspondence is given by the equation

$\langle Tx, y \rangle = \int_\Sigma \lambda \, d\mu_{x,y}(\lambda) , \quad \text{where} \quad \mu_{x,y}(E) = \langle P(E)x, y \rangle .$
Chapter 4. Some Operator Theory
116
(b) If $T, P(E)$ are as above, then the spectrum of $T$ is the "support of the spectral measure $P(\cdot)$", meaning that $\lambda \in \sigma(T)$ if and only if $P(U) \ne 0$ for every open neighbourhood $U$ of $\lambda$.
Proof. (a) If $T \in \mathcal{L}(H)$ is a normal operator such that $\sigma(T) = \Sigma_0 \subset \Sigma$, then, according to Proposition 3.3.10, there exists a unique representation $C(\Sigma_0) \ni f \mapsto f(T) \in C^*(\{1, T\}) \subset \mathcal{L}(H)$ such that $f_j(T) = T^j$ if $f_j(\lambda) = \lambda^j$, $j = 0, 1$; from the concluding discussion in §3.5, this representation gives rise to a unique spectral measure $\mathcal{B}_{\Sigma_0} \ni E \mapsto P(E) \in \mathcal{P}(H)$ such that

$\langle f(T)x, y \rangle = \int_{\Sigma_0} f \, d\mu_{x,y} \quad \forall f, x, y . \qquad (4.1.1)$

We may think of $P$ as a spectral measure defined for all Borel subsets of $\Sigma$ (in fact, even for all Borel subsets of $\mathbb{C}$) by the rule $P(E) = P(E \cap \Sigma_0)$. Conversely, every spectral measure as in (a)(ii) clearly gives rise to a (representation of $C(\Sigma)$ on $H$, and consequently a) normal operator $T \in \mathcal{L}(H)$ such that equation 4.1.1 is valid for ($f_j(\lambda) = \lambda^j$, $j = 0, 1$, and hence for all) $f \in C(\Sigma)$. Since homomorphisms of $C^*$-algebras "shrink spectra" - see Lemma 3.4.2(a) - we find that $\sigma(T) \subset \Sigma$.

(b) Let $T = \int_\Sigma \lambda \, dP(\lambda)$ as in (a) above. Define the measure $\mu$ by $\mu(E) = \sum_n 2^{-n} \|P(E)e_n\|^2$, where $\{e_n\}_n$ is an orthonormal basis for $H$. Then, by our discussion in §3.5 on spectral measures, we have an isometric *-isomorphism $L^\infty(\Sigma_0, \mu) \ni \phi \mapsto \int_{\Sigma_0} \phi \, dP \in \mathcal{L}(H)$, where $\Sigma_0 = \sigma(T)$. Under this isomorphism, the function $f_1(\lambda) = \lambda$, $\lambda \in \Sigma_0$, corresponds to $T$; but it is easy to see that $\lambda_0 \in \sigma_{L^\infty(\Sigma_0, \mu)}(f_1)$ if and only if $\mu(\{\lambda : |\lambda - \lambda_0| < \epsilon\}) > 0$ for every $\epsilon > 0$. (Reason: the negation of this latter condition is clearly equivalent to the statement that the function $\lambda \mapsto \frac{1}{\lambda - \lambda_0}$ belongs to $L^\infty(\Sigma_0, \mu)$.) Since $\mu(E) = 0 \Leftrightarrow P(E) = 0$, the proof of (b) is complete. □

Corollary 4.1.3 A complex number $\lambda_0$ belongs to the spectrum of a normal operator $T \in \mathcal{L}(H)$ if and only if $\lambda_0$ is an "approximate eigenvalue for $T$", meaning that there exists a sequence $\{x_n : n \in \mathbb{N}\}$ of unit vectors in $H$ such that $\|Tx_n - \lambda_0 x_n\| \to 0$.

Proof. An approximate eigenvalue of any (not necessarily normal) operator must necessarily belong to the spectrum of that operator, since an invertible operator is necessarily bounded below - see Remark 1.5.15. On the other hand, if $T$ is normal, if $P(\cdot)$ is the associated spectral measure, and if $\lambda_0 \in \sigma(T)$, then $P(U) \ne 0$ for every neighbourhood $U$ of $\lambda_0$. In particular, we can find a unit vector $x_n$ belonging to the range of the projection $P(U_n)$, where $U_n = \{\lambda \in \mathbb{C} : |\lambda - \lambda_0| < \frac{1}{n}\}$. Since $|(\lambda - \lambda_0)1_{U_n}(\lambda)| < \frac{1}{n} \ \forall \lambda$, we find that $\|(T - \lambda_0)P(U_n)\| \le \frac{1}{n}$, and in particular, $\|(T - \lambda_0)x_n\| \le \frac{1}{n} \ \forall n$. □
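Corollary 4.1.3 has a concrete finite-dimensional counterpart: the minimum of $\|(T - \lambda_0)x\|$ over unit vectors $x$ equals the smallest singular value of $T - \lambda_0$, which vanishes exactly when $\lambda_0 \in \sigma(T)$, and for a normal $T$ equals the distance from $\lambda_0$ to $\sigma(T)$. A Python sketch (function name ad hoc):

```python
import numpy as np

T = np.diag([1.0, 2.0, 3.0])           # a normal operator on C^3

def best_approx_eigvec(T, lam0):
    """Smallest singular value of T - lam0 (the minimum of ||(T - lam0)x||
    over unit vectors x) together with a unit vector attaining it."""
    n = T.shape[0]
    _, s, Vh = np.linalg.svd(T - lam0 * np.eye(n))
    return s[-1], Vh[-1].conj()        # singular values are sorted descending

res, x = best_approx_eigvec(T, 2.0)    # 2.0 lies in sigma(T)
assert abs(res) < 1e-12
assert np.isclose(np.linalg.norm(x), 1.0)
assert np.linalg.norm(T @ x - 2.0 * x) < 1e-10

res_out, _ = best_approx_eigvec(T, 2.5)  # 2.5 is at distance 0.5 from sigma(T)
assert abs(res_out - 0.5) < 1e-12
```

In infinite dimensions the minimum need not be attained, which is precisely why the corollary speaks of a sequence of unit vectors rather than an eigenvector.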
Remark 4.1.4 Note that the assumption of normality is crucial in the above proof that every spectral value of a normal operator is an approximate eigenvalue; indeed, if $U$ is a non-unitary isometry - such as the unilateral shift, for instance - then $U$ is not invertible, i.e., $0 \in \sigma(U)$, but $0$ is clearly not an approximate eigenvalue of any isometry. □

We now make explicit something which we have been using all along; and this is the fact that whereas we formerly only had a "continuous functional calculus" for normal elements of abstract $C^*$-algebras (which we have already used with good effect, in Proposition 3.3.11, for instance), we now have a measurable functional calculus for normal operators on a separable Hilbert space. We shall only make a few small remarks in lieu of the proof of the following proposition, which is essentially just a re-statement of results proved in §3.5.

Proposition 4.1.5 Let $T$ be a normal operator on a separable Hilbert space $H$. Let $\Sigma = \sigma(T)$, let $P(\cdot)$ be the unique spectral measure associated to $T$, and let $\mu(E) = \sum_n 2^{-n}\|P(E)e_n\|^2$, where $\{e_n\}_n$ is any orthonormal basis for $H$. Then the assignment
$\phi \mapsto \phi(T) = \int_\Sigma \phi \, dP$
defines an isometric *-isomorphism of $L^\infty(\Sigma, \mu)$ onto the von Neumann subalgebra $W^*(\{1, T\})$ of $\mathcal{L}(H)$ generated by $\{1, T\}$, such that
(i) $\phi_j(T) = T^j$, where $\phi_j(\lambda) = \lambda^j$, $j = 0, 1$; and
(ii) the representation is "weakly continuous", meaning that if $\{\phi_n\}_n$ is a sequence in $L^\infty(\Sigma, \mu)$ which converges to $\phi$ with respect to the "weak * topology" (see Remark A.5.15), then $\phi_n(T) \to \phi(T)$ in the weak operator topology.
Further, the representation is uniquely determined by conditions (i) and (ii), and is called the "measurable functional calculus" for $T$.
Remarks on the proof: Note that if $\pi : C(\Sigma) \to \mathcal{L}(H)$ is the "continuous functional calculus" for $T$, and if $\tilde\pi$ and $\mu$ are associated with $\pi$ as in Proposition 3.5.5, then

$\tilde\pi(\phi) = \int_\Sigma \phi \, dP ,$

where $P(E) = \tilde\pi(1_E)$ (and we may as well assume that $\mu$ is given as in the statement of this theorem). The content of Lemma 3.5.7 is that the image of this representation of $L^\infty(\Sigma, \mu)$ is precisely the von Neumann algebra generated by $\pi(C(\Sigma)) = C^*(\{1, T\})$. It is a fairly simple matter (which only involves repeated applications of the Cauchy-Schwarz inequality) to verify that if $\{\phi_n\}_n$ is a sequence in $L^\infty(\Sigma, \mu)$, then to say that the sequence $\{\tilde\pi(\phi_n)\}_n$ converges to $\tilde\pi(\phi)$ with respect to the weak operator topology is exactly equivalent to requiring that $\int_\Sigma (\phi_n - \phi) f \, d\mu \to 0$ for every $f \in L^1(\Sigma, \mu)$, and this proves (ii). The final statement is essentially contained in Remark 3.5.6. □
We now discuss some simple consequences of the "measurable functional calculus" for a normal operator.
Corollary 4.1.6 (a) Let $U \in \mathcal{L}(H)$ be a unitary operator. Then there exists a self-adjoint operator $A \in \mathcal{L}(H)$ such that $U = e^{iA}$, where the right-hand side is interpreted as the result of the continuous functional calculus for $A$; further, given any $a \in \mathbb{R}$, we may choose $A$ to satisfy $\sigma(A) \subset [a, a + 2\pi]$.
(b) If $T \in \mathcal{L}(H)$ is a normal operator, and if $n \in \mathbb{N}$, then there exists a normal operator $A \in \mathcal{L}(H)$ such that $T = A^n$.
Proof. (a) Let $\theta : \mathbb{C} \to [a, a + 2\pi)$ be the unique (necessarily measurable) function with the property that $z = |z|e^{i\theta(z)}$ for all $z \ne 0$. Then note that if $U$ is unitary, we have $U = e^{i\theta(U)}$; set $A = \theta(U)$.
(b) This is proved like (a), by taking some measurable branch of the logarithm defined everywhere in $\mathbb{C}$ and defining the $n$-th root using this branch of the logarithm.
□

Exercise 4.1.7 Show that if $M \subset \mathcal{L}(H)$ is a von Neumann algebra, then $M$ is generated as a Banach space by the set $\mathcal{P}(M) = \{p \in M : p = p^* = p^2\}$. (Hint: Thanks to the Cartesian decomposition, it is enough to be able to express any self-adjoint element $x = x^* \in M$ as a norm-limit of finite linear combinations of projections in $M$; in view of Proposition 3.3.11(f), we may even assume that $x \ge 0$; by Proposition A.5.9(3), we can find a sequence $\{\phi_n\}_n$ of simple functions such that $\{\phi_n(t)\}_n$ is a non-decreasing sequence which converges to $t$, for all $t \in \sigma(x)$, and such that the convergence is uniform on $\sigma(x)$; deduce that $\|\phi_n(x) - x\| \to 0$.)
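Corollary 4.1.6(a) can be checked numerically in a finite-dimensional case: for a unitary matrix $U$, applying the branch $\theta : z \mapsto \arg z \in [0, 2\pi)$ to the eigenvalues of $U$ produces a self-adjoint $A$ with $U = e^{iA}$. A Python sketch; the eigenbasis of this particular $U$ is automatically orthonormal, since a normal matrix with distinct eigenvalues has orthogonal eigenvectors:

```python
import numpy as np

t = 0.7
U = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])     # a unitary on C^2

lam, V = np.linalg.eig(U)                   # eigenvalues e^{it}, e^{-it}
theta = np.angle(lam) % (2 * np.pi)         # the branch theta(z) in [0, 2*pi)
A = V @ np.diag(theta) @ V.conj().T         # A = theta(U), by functional calculus

assert np.allclose(A, A.conj().T)           # A is self-adjoint
# exponentiate through the same eigenbasis: e^{iA} = V diag(e^{i theta}) V*
expiA = V @ np.diag(np.exp(1j * theta)) @ V.conj().T
assert np.allclose(expiA, U)                # U = e^{iA}
```

Note that $\sigma(A) = \{0.7,\ 2\pi - 0.7\} \subset [0, 2\pi)$, matching the freedom of choice of the interval $[a, a + 2\pi]$ in the corollary.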
4.2 Polar decomposition
In this section, we establish the very useful polar decomposition for bounded operators on Hilbert space. We begin with a few simple observations and then introduce the crucial notion of a partial isometry.

Lemma 4.2.1 Let $T \in \mathcal{L}(H, K)$. Then,

$\ker T = \ker(T^*T) = \ker(T^*T)^{\frac{1}{2}} = \mathrm{ran}^\perp T^* . \qquad (4.2.2)$

In particular, also

$\ker^\perp T = \overline{\mathrm{ran}} \, T^* .$
(In the equations above, we have used the notation $\mathrm{ran}^\perp T^*$ and $\ker^\perp T$ for $(\mathrm{ran} \, T^*)^\perp$ and $(\ker T)^\perp$, respectively.)

Proof. First observe that, for arbitrary $x \in H$, we have

$\|Tx\|^2 = \langle T^*Tx, x \rangle = \langle (T^*T)^{\frac{1}{2}}x, (T^*T)^{\frac{1}{2}}x \rangle = \|(T^*T)^{\frac{1}{2}}x\|^2 ,$

whence it follows that

$\ker T = \ker(T^*T)^{\frac{1}{2}} . \qquad (4.2.3)$
Notice next that

$x \in \mathrm{ran}^\perp T^* \Leftrightarrow \langle x, T^*y \rangle = 0 \ \forall y \in K \Leftrightarrow \langle Tx, y \rangle = 0 \ \forall y \in K \Leftrightarrow Tx = 0 ,$

and hence $\mathrm{ran}^\perp T^* = \ker T$. "Taking perps" once again, we find - because of the fact that $\mathcal{V}^{\perp\perp} = \overline{\mathcal{V}}$ for any linear subspace $\mathcal{V} \subset H$ - that the last statement of the lemma is indeed valid.

Finally, if $\{p_n\}_n$ is any sequence of polynomials with the property that $p_n(0) = 0 \ \forall n$ and such that $\{p_n(t)\}$ converges uniformly to $\sqrt{t}$ on $\sigma(T^*T)$, it follows that $\|p_n(T^*T) - (T^*T)^{\frac{1}{2}}\| \to 0$, and hence,

$x \in \ker(T^*T) \ \Rightarrow \ p_n(T^*T)x = 0 \ \forall n \ \Rightarrow \ (T^*T)^{\frac{1}{2}}x = 0 ,$

and hence we see that also $\ker(T^*T) \subset \ker(T^*T)^{\frac{1}{2}}$; since the reverse inclusion is clear, the proof of the lemma is complete. □

Proposition 4.2.2 Let $H, K$ be Hilbert spaces; then the following conditions on an operator $U \in \mathcal{L}(H, K)$ are equivalent:
(i) $U = UU^*U$;
(ii) $P = U^*U$ is a projection;
(iii) $U|_{\ker^\perp U}$ is an isometry.
An operator which satisfies the equivalent conditions (i)-(iii) is called a partial isometry.

Proof. (i) $\Rightarrow$ (ii): Clearly $P = P^*$, while the assumption (i) clearly implies that $P^2 = U^*UU^*U = U^*U = P$.

(ii) $\Rightarrow$ (iii): Let $M = \mathrm{ran} \, P$. Then notice that, for arbitrary $x \in H$, we have $\|Px\|^2 = \langle Px, x \rangle = \langle U^*Ux, x \rangle = \|Ux\|^2$; this clearly implies that $\ker U = \ker P = M^\perp$, and that $U$ is isometric on $M$ (since $P$ is the identity on $M$).

(iii) $\Rightarrow$ (ii): Let $M = \ker^\perp U$. For $i = 1, 2$, suppose $z_i \in H$, and $x_i \in M$, $y_i \in M^\perp$ are such that $z_i = x_i + y_i$; then note that

$\langle Uz_1, Uz_2 \rangle = \langle Ux_1, Ux_2 \rangle = \langle x_1, x_2 \rangle \ \ (\text{since } U|_M \text{ is isometric}) \ = \langle x_1, z_2 \rangle ,$

and hence $U^*U$ is the projection onto $M$.
(ii) $\Rightarrow$ (i): Let $M = \mathrm{ran} \, U^*U$; then (by Lemma 4.2.1) $M^\perp = \ker U^*U = \ker U$, and so, if $x \in M$, $y \in M^\perp$ are arbitrary, and if $z = x + y$, then observe that $Uz = Ux + Uy = Ux = U(U^*Uz)$. □

Remark 4.2.3 Suppose $U \in \mathcal{L}(H, K)$ is a partial isometry. Setting $M = \ker^\perp U$ and $N = \mathrm{ran} \, U$ ($= \overline{\mathrm{ran}} \, U$), we find that $U$ is identically $0$ on $M^\perp$, and $U$ maps $M$ isometrically onto $N$. It is customary to refer to $M$ as the initial space, and to $N$ as the final space, of the partial isometry $U$. On the other hand, upon taking adjoints in condition (i) of Proposition 4.2.2, it is seen that $U^* \in \mathcal{L}(K, H)$ is also a partial isometry. In view of Lemma 4.2.1, we find that $\ker U^* = N^\perp$ and that $\mathrm{ran} \, U^* = M$; thus $N$ is the initial space of $U^*$ and $M$ is the final space of $U^*$. Finally, it follows from Proposition 4.2.2(ii) (and the proof of that proposition) that $U^*U$ is the projection (of $H$) onto $M$, while $UU^*$ is the projection (of $K$) onto $N$. □

Exercise 4.2.4 If $U \in \mathcal{L}(H, K)$ is a partial isometry with initial space $M$ and final space $N$, show that if $y \in N$, then $U^*y$ is the unique element $x \in M$ such that $Ux = y$.

Before stating the polar decomposition theorem, we introduce a convenient bit of notation: if $T \in \mathcal{L}(H, K)$ is a bounded operator between Hilbert spaces, we shall always use the symbol $|T|$ to denote the unique positive square root of the positive operator $|T|^2 = T^*T \in \mathcal{L}(H)$; thus, $|T| = (T^*T)^{\frac{1}{2}}$.

Theorem 4.2.5 (Polar Decomposition) (a) Any operator $T \in \mathcal{L}(H, K)$ admits a decomposition $T = UA$ such that
(i) $U \in \mathcal{L}(H, K)$ is a partial isometry;
(ii) $A \in \mathcal{L}(H)$ is a positive operator; and
(iii) $\ker T = \ker U = \ker A$.
(b) Further, if $T = VB$ is another decomposition of $T$ as a product of a partial isometry $V$ and a positive operator $B$ such that $\ker V = \ker B$, then necessarily $U = V$ and $B = A = |T|$. This unique decomposition is called the polar decomposition of $T$.

(c) If $T = U|T|$ is the polar decomposition of $T$, then $|T| = U^*T$.
Proof. (a) If $x, y \in H$ are arbitrary, then

$\langle Tx, Ty \rangle = \langle T^*Tx, y \rangle = \langle |T|^2 x, y \rangle = \langle |T|x, |T|y \rangle ,$

whence it follows - see Exercise 3.4.12 - that there exists a unique unitary operator $U_0 : \overline{\mathrm{ran}} \, |T| \to \overline{\mathrm{ran}} \, T$ such that $U_0(|T|x) = Tx \ \forall x \in H$. Let $M = \overline{\mathrm{ran}} \, |T|$ and let $P = P_M$ denote the orthogonal projection onto $M$. Then the operator $U = U_0 P$
clearly defines a partial isometry with initial space $M$ and final space $N = \overline{\mathrm{ran}} \, T$, which further satisfies $T = U|T|$ (by definition). It follows from Lemma 4.2.1 that $\ker U = \ker |T| = \ker T$.

(b) Suppose $T = VB$ as in (b). Then $V^*V$ is the projection onto $\ker^\perp V = \ker^\perp B = \overline{\mathrm{ran}} \, B$, which clearly implies that $B = V^*VB$; hence, we see that $T^*T = BV^*VB = B^2$; thus $B$ is a, and hence the, positive square root of $|T|^2$, i.e., $B = |T|$. It then follows that $V(|T|x) = Tx = U(|T|x) \ \forall x$; by continuity, we see that $V$ agrees with $U$ on $\overline{\mathrm{ran}} \, |T|$, but since this is precisely the initial space of both partial isometries $U$ and $V$, we see that we must have $U = V$.

(c) This is an immediate consequence of the definition of $U$ and Exercise 4.2.4. □

Exercise 4.2.6 (1) Prove the "dual" polar decomposition theorem; i.e., each $T \in \mathcal{L}(H, K)$ can be uniquely expressed in the form $T = BV$, where $V \in \mathcal{L}(H, K)$ is a partial isometry, $B \in \mathcal{L}(K)$ is a positive operator, and $\ker B = \ker V^* = \ker T^*$. (Hint: Consider the usual polar decomposition of $T^*$, and take adjoints.)
(2) Show that if $T = U|T|$ is the (usual) polar decomposition of $T$, then $U|_{\ker^\perp T}$ implements a unitary equivalence between $|T| \, |_{\ker^\perp |T|}$ and $|T^*| \, |_{\ker^\perp |T^*|}$. (Hint: Write $M = \ker^\perp T$, $N = \ker^\perp T^*$, $W = U|_M$; then $W \in \mathcal{L}(M, N)$ is unitary; further $|T^*|^2 = TT^* = U|T|^2U^*$; deduce that if $A$ (resp., $B$) denotes the restriction of $|T|$ (resp., $|T^*|$) to $M$ (resp., $N$), then $B^2 = WA^2W^*$; now deduce, from the uniqueness of the positive square root, that $B = WAW^*$.)

(3) Apply (2) above to the case when $H$ and $K$ are finite-dimensional, and prove that if $T \in L(V, W)$ is a linear map of vector spaces (over $\mathbb{C}$), then $\dim V = \mathrm{rank}(T) + \mathrm{nullity}(T)$, where $\mathrm{rank}(T)$ and $\mathrm{nullity}(T)$ denote the dimensions of the range and kernel, respectively, of the map $T$.

(4) Show that an operator $T \in \mathcal{L}(H, K)$ can be expressed in the form $T = WA$, where $A \in \mathcal{L}(H)$ is a positive operator and $W \in \mathcal{L}(H, K)$ is unitary, if and only if $\dim(\ker T) = \dim(\ker T^*)$. (Hint: In order for such a decomposition to exist, show that it must be the case that $A = |T|$ and that $W$ should agree, on $\ker^\perp T$, with the $U$ of the polar decomposition, so that $W$ must map $\ker T$ isometrically onto $\ker T^*$.)

(5) In particular, deduce from (4) that in case $H$ is a finite-dimensional inner product space, then any operator $T \in \mathcal{L}(H)$ admits a decomposition as the product of a unitary operator and a positive operator. (In view of Proposition 3.3.11(b) and (d), note that when $H = \mathbb{C}$, this boils down to the usual polar decomposition of a complex number.)

Several problems concerning a general bounded operator between Hilbert spaces can be solved in two stages: in the first step, the problem is "reduced", using the polar decomposition theorem, to a problem concerning positive operators on a Hilbert space; and in the next step, the positive case is settled using the spectral theorem. This is illustrated, for instance, in Exercise 4.2.7(2).
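In finite dimensions, Theorem 4.2.5 is computable from the singular value decomposition: if $T = WSV^*$ with $S$ diagonal and non-negative, then $|T| = VSV^*$ and the partial isometry is $U = WV^*$ (the matrix below is invertible, so $U$ is in fact unitary). A Python sketch, which also checks Proposition 4.2.2(i) and Theorem 4.2.5(c):

```python
import numpy as np

T = np.array([[2., 1.],
              [0., 1.]])                 # an invertible operator on C^2

W, s, Vh = np.linalg.svd(T)              # T = W diag(s) Vh
absT = Vh.conj().T @ np.diag(s) @ Vh     # |T| = (T*T)^{1/2} = V S V*
U = W @ Vh                               # the (here unitary) partial isometry

assert np.allclose(U @ absT, T)                    # T = U |T|
assert np.allclose(absT @ absT, T.conj().T @ T)    # |T|^2 = T*T, |T| >= 0
assert np.allclose(U @ U.conj().T @ U, U)          # U = U U*U  (Prop 4.2.2(i))
assert np.allclose(U.conj().T @ T, absT)           # |T| = U*T  (Thm 4.2.5(c))
```

For a singular $T$ the same formulas apply after zeroing the columns of $W V^*$ corresponding to vanishing singular values, which is exactly the passage from the unitary $U_0$ to the partial isometry $U = U_0 P_M$ in the proof.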
Exercise 4.2.7 (1) Recall that a subset $\mathcal{C}$ of a (real or complex) vector space $V$ is said to be convex if it contains the "line segment joining any two of its points"; i.e., $\mathcal{C}$ is convex if $x, y \in \mathcal{C}$, $0 \le t \le 1 \Rightarrow tx + (1 - t)y \in \mathcal{C}$.
(a) If $V$ is a normed (or simply a topological) vector space, and if $\mathcal{C}$ is a closed subset of $V$, show that $\mathcal{C}$ is convex if and only if it contains the mid-point of any two of its points - i.e., $\mathcal{C}$ is convex if and only if $x, y \in \mathcal{C} \Rightarrow \frac{1}{2}(x + y) \in \mathcal{C}$. (Hint: The set of dyadic rationals, i.e., numbers of the form $\frac{m}{2^n}$, is dense in $\mathbb{R}$.)
(b) If $S \subset V$ is a subset of a vector space, show that there exists a smallest convex subset of $V$ which contains $S$; this set is called the convex hull of the set $S$, and we shall denote it by the symbol $co(S)$. Show that $co(S) = \{\sum_{i=1}^n \theta_i x_i : n \in \mathbb{N}, \ x_i \in S, \ \theta_i \ge 0, \ \sum_{i=1}^n \theta_i = 1\}$.
(c) Let $\mathcal{C}$ be a convex subset of a vector space; show that the following conditions on a point $x \in \mathcal{C}$ are equivalent: (i) $x = \frac{1}{2}(y + z)$, $y, z \in \mathcal{C} \Rightarrow x = y = z$; (ii) $x = ty + (1 - t)z$, $0 < t < 1$, $y, z \in \mathcal{C} \Rightarrow x = y = z$. The point $x$ is called an extreme point of a convex set $\mathcal{C}$ if $x \in \mathcal{C}$ and if $x$ satisfies the equivalent conditions (i) and (ii) above.
(d) It is a fact, called the Krein-Milman theorem - see [Yos], for instance - that if $K$ is a compact convex subset of a Banach space (or more generally, of a locally convex topological vector space which satisfies appropriate "completeness conditions"), then $K = \overline{co}(\partial_e K)$, where $\partial_e K$ denotes the set of extreme points of $K$. Verify the above fact in case $K = \mathrm{ball}(H) = \{x \in H : \|x\| \le 1\}$, where $H$ is a Hilbert space, by showing that $\partial_e(\mathrm{ball}\ H) = \{x \in H : \|x\| = 1\}$. (Hint: Use the parallelogram law - see Exercise 2.1.3(4).)
(e) Show that $\partial_e(\mathrm{ball}\ X) \ne \{x \in X : \|x\| = 1\}$, when $X = \ell^1_n$, $n > 1$. (Thus, not every point on the unit sphere of a normed space need be an extreme point of the unit ball.)

(2) Let $H$ and $K$ denote (separable) Hilbert spaces, and let $\mathbb{B} = \{A \in \mathcal{L}(H, K) : \|A\| \le 1\}$ denote the unit ball of $\mathcal{L}(H, K)$. The aim of the following exercise is to show that an operator $T \in \mathbb{B}$ is an extreme point of $\mathbb{B}$ if and only if either $T$ or $T^*$ is an isometry. (See (1)(c) above for the definition of an extreme point.)
(a) Let $\mathbb{B}_+ = \{T \in \mathcal{L}(H) : T \ge 0, \|T\| \le 1\}$. Show that $T \in \partial_e\mathbb{B}_+ \Leftrightarrow T$ is a projection. (Hint: suppose $P$ is a projection and $P = \frac{1}{2}(A + B)$, $A, B \in \mathbb{B}_+$; then for arbitrary $x \in \mathrm{ball}(H)$, note that $0 \le \frac{1}{2}(\langle Ax, x \rangle + \langle Bx, x \rangle) \le 1$; since $\partial_e[0, 1] = \{0, 1\}$, deduce that $\langle Ax, x \rangle = \langle Bx, x \rangle = \langle Px, x \rangle \ \forall x \in (\ker P \cup \mathrm{ran}\ P)$; but $A \ge 0$ and $\ker P \subset \ker A$ imply that $A(\mathrm{ran}\ P) \subset \mathrm{ran}\ P$; similarly also $B(\mathrm{ran}\ P) \subset \mathrm{ran}\ P$; conclude (from Exercise 2.4.2) that $A = B = P$. Conversely, if $T \in \mathbb{B}_+$ and $T$ is not a projection, then it must be the case - see Proposition 3.3.11(c) - that there exists $\lambda \in \sigma(T)$ such that $0 < \lambda < 1$; fix $\epsilon > 0$ such that $(\lambda - 2\epsilon, \lambda + 2\epsilon) \subset (0, 1)$; since $\lambda \in \sigma(T)$, deduce that $P \ne 0$ where $P = 1_{(\lambda - \epsilon, \lambda + \epsilon)}(T)$; notice now that if we set $A =$
4.3. Compact operators
123
$T - \epsilon P$, $B = T + \epsilon P$, then the choices ensure that $A, B \in \mathbb{B}_+$, $T = \frac{1}{2}(A + B)$, but $A \ne T \ne B$, whence $T \notin \partial_e\mathbb{B}_+$.)
(b) Show that the only extreme point of $\mathrm{ball}\ \mathcal{L}(H) = \{T \in \mathcal{L}(H) : \|T\| \le 1\}$ which is a positive operator is $1$, the identity operator on $H$. (Hint: Prove that $1$ is an extreme point of $\mathrm{ball}\ \mathcal{L}(H)$ by using the fact that $1$ is an extreme point of the unit disc in the complex plane; for the other implication, by (a) above, it is enough to show that if $P$ is a projection which is not equal to $1$, then $P$ is not an extreme point in $\mathrm{ball}\ \mathcal{L}(H)$; if $P \ne 1$, note that $P = \frac{1}{2}(U_+ + U_-)$, where $U_\pm = P \pm (1 - P)$.)
(c) Suppose $T \in \partial_e\mathbb{B}$; if $T = U|T|$ is the polar decomposition of $T$, show that $|T| \, |_M$ is an extreme point of the set $\{A \in \mathcal{L}(M) : \|A\| \le 1\}$, where $M = \ker^\perp |T|$, and hence deduce, from (b) above, that $T = U$. (Hint: if $|T| = \frac{1}{2}(C + D)$, with $C, D \in \mathrm{ball}\ \mathcal{L}(M)$ and $C \ne |T| \ne D$, note that $T = \frac{1}{2}(A + B)$, where $A = UC$, $B = UD$, and $A \ne T \ne B$.)
(d) Show that $T \in \partial_e\mathbb{B}$ if and only if $T$ or $T^*$ is an isometry. (Hint: suppose $T$ is an isometry; suppose $T = \frac{1}{2}(A + B)$, with $A, B \in \mathbb{B}$; deduce from (1)(d) that $Tx = Ax = Bx \ \forall x \in H$; thus $T \in \partial_e\mathbb{B}$; similarly, if $T^*$ is an isometry, then $T^* \in \partial_e\mathbb{B}$. Conversely, if $T \in \partial_e\mathbb{B}$, deduce from (c) that $T$ is a partial isometry; suppose it is possible to find unit vectors $x \in \ker T$, $y \in \ker T^*$; define $U_\pm z = Tz \pm \langle z, x \rangle y$, and note that $U_\pm$ are partial isometries which are distinct from $T$ and that $T = \frac{1}{2}(U_+ + U_-)$.)
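The "conversely" direction of (2)(d) above can be seen concretely on $\mathbb{C}^2$: the rank-one partial isometry $T = e_1 e_1^*$ has unit vectors in both $\ker T$ and $\ker T^*$, so it is the midpoint of two distinct partial isometries and hence not extreme. A Python sketch (names ad hoc):

```python
import numpy as np

T = np.diag([1.0, 0.0])             # partial isometry: e1 -> e1, e2 -> 0
x = np.array([0.0, 1.0])            # unit vector in ker T
y = np.array([0.0, 1.0])            # unit vector in ker T*

def U(sign):
    # U_pm z = T z +- <z, x> y, as in the hint of Exercise 4.2.7(2)(d)
    return T + sign * np.outer(y, x.conj())

Up, Um = U(+1), U(-1)
for V in (Up, Um):
    assert np.allclose(V @ V.conj().T @ V, V)    # each is a partial isometry
assert np.allclose(0.5 * (Up + Um), T)           # T is their midpoint
assert not np.allclose(Up, T) and not np.allclose(Um, T)
```

Here $U_\pm = \mathrm{diag}(1, \pm 1)$ are even unitary, exhibiting $T$ as a non-extreme point of the unit ball.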
4.3 Compact operators
Definition 4.3.1 A linear map $T : X \to Y$ between Banach spaces is said to be compact if it satisfies the following condition: for every bounded sequence $\{x_n\}_n \subset X$, the sequence $\{Tx_n\}_n$ has a subsequence which converges with respect to the norm in $Y$. The collection of compact operators from $X$ to $Y$ is denoted by $\mathcal{K}(X, Y)$ (or simply $\mathcal{K}(X)$ if $X = Y$).

Thus, a linear map is compact precisely when it maps the unit ball of $X$ into a set whose closure is compact - or equivalently, if it maps bounded sets into totally bounded sets; in particular, every compact operator is bounded.

Although we have given the definition of a compact operator in the context of general Banach spaces, we shall really only be interested in the case of Hilbert spaces. Nevertheless, we state our first result for general Banach spaces, after which we shall specialise to the case of Hilbert spaces.

Proposition 4.3.2 Let $X, Y, Z$ denote Banach spaces.
(a) $\mathcal{K}(X, Y)$ is a norm-closed subspace of $\mathcal{L}(X, Y)$.
(b) If $A \in \mathcal{L}(Y, Z)$, $B \in \mathcal{L}(X, Y)$, and if either $A$ or $B$ is compact, then $AB$ is also compact.
(c) In particular, $\mathcal{K}(X)$ is a closed two-sided ideal in the Banach algebra $\mathcal{L}(X)$.
Proof. (a) Suppose $A, B \in \mathcal{K}(X, Y)$ and $\alpha \in \mathbb{C}$, and suppose $\{x_n\}$ is a bounded sequence in $X$; since $A$ is compact, there exists a subsequence - call it $\{y_n\}$ - of $\{x_n\}$ such that $\{Ay_n\}$ is a norm-convergent sequence; since $\{y_n\}$ is a bounded sequence and $B$ is compact, we may extract a further subsequence - call it $\{z_n\}$ - with the property that $\{Bz_n\}$ is norm-convergent. It is clear then that $\{(\alpha A + B)z_n\}$ is a norm-convergent sequence; thus $\alpha A + B$ is compact; in other words, $\mathcal{K}(X, Y)$ is a subspace of $\mathcal{L}(X, Y)$.

Suppose now that $\{A_n\}$ is a sequence in $\mathcal{K}(X, Y)$ and that $A \in \mathcal{L}(X, Y)$ is such that $\|A_n - A\| \to 0$. We wish to prove that $A$ is compact. We will do this by a typical instance of the "diagonal argument" described in Remark A.4.9. Thus, suppose $S_0 = \{x_n\}_n$ is a bounded sequence in $X$. Since $A_1$ is compact, we can extract a subsequence $S_1 = \{x_n^{(1)}\}_n$ of $S_0$ such that $\{A_1 x_n^{(1)}\}_n$ is convergent in $Y$. Since $A_2$ is compact, we can extract a subsequence $S_2 = \{x_n^{(2)}\}_n$ of $S_1$ such that $\{A_2 x_n^{(2)}\}_n$ is convergent in $Y$. Proceeding in this fashion, we can find a sequence $\{S_k\}$ such that $S_k = \{x_n^{(k)}\}_n$ is a subsequence of $S_{k-1}$ and $\{A_k x_n^{(k)}\}_n$ is convergent in $Y$, for each $k \ge 1$. Let us write $z_n = x_n^{(n)}$; since $\{z_n : n \ge k\}$ is a subsequence of $S_k$, note that $\{A_k z_n\}_n$ is a convergent sequence in $Y$, for every $k \ge 1$.

The proof of (a) will be completed once we establish that $\{Az_n\}_n$ is a Cauchy sequence in $Y$. Indeed, suppose $\epsilon > 0$ is given; let $K = 1 + \sup_n \|z_n\|$; first pick an integer $N$ such that $\|A_N - A\| < \frac{\epsilon}{3K}$; next, choose an integer $n_0$ such that $\|A_N z_n - A_N z_m\| < \frac{\epsilon}{3} \ \forall n, m \ge n_0$; then observe that if $n, m \ge n_0$, we have:

$\|Az_n - Az_m\| \le \|(A - A_N)z_n\| + \|A_N z_n - A_N z_m\| + \|(A_N - A)z_m\| < \frac{\epsilon}{3K} \cdot K + \frac{\epsilon}{3} + \frac{\epsilon}{3K} \cdot K = \epsilon .$

(b) Let $\mathbb{B}$ denote the unit ball in $X$; we need to show that $(AB)(\mathbb{B})$ is totally bounded (see Definition A.4.6); this is true in case (i) $A$ is compact, since then $B(\mathbb{B})$ is bounded, and $A$ maps bounded sets to totally bounded sets, and (ii) $B$ is compact, since then $B(\mathbb{B})$ is totally bounded, and $A$ (being bounded) maps totally bounded sets to totally bounded sets. □
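Proposition 4.3.2(a) is the norm-closedness of $\mathcal{K}(X, Y)$, and it is most often applied to norm-limits of finite-rank operators. The following Python sketch (a finite matrix model, with the "diagonal" operator $e_k \mapsto e_k/k$ cut off at dimension $N = 100$ purely for computation) checks that the rank-$n$ truncation lies within $\frac{1}{n}$ of the original in operator norm:

```python
import numpy as np

N = 100
T = np.diag(1.0 / np.arange(1, N + 1))   # T e_k = e_k / k

def truncate(T, eps):
    """Keep only the diagonal entries of this positive diagonal T that
    are >= eps; the result has finite rank (here: rank <= 1/eps)."""
    d = np.diag(T).copy()
    d[d < eps] = 0.0
    return np.diag(d)

for n in [2, 5, 10]:
    Tn = truncate(T, 1.0 / n)
    assert np.linalg.matrix_rank(Tn) == n                  # entries 1/k >= 1/n iff k <= n
    assert np.linalg.norm(T - Tn, 2) <= 1.0 / n + 1e-12    # error is 1/(n+1) <= 1/n
```

On $\ell^2$ the same truncations have finite rank and converge to the genuinely infinite-dimensional diagonal operator in norm, so that operator is compact by part (a).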
Corollary 4.3.3 Let $T \in \mathcal{L}(H_1, H_2)$, where $H_i$ are Hilbert spaces. Then
(a) $T$ is compact if and only if $|T|$ ($= (T^*T)^{\frac{1}{2}}$) is compact;
(b) in particular, $T$ is compact if and only if $T^*$ is compact.

Proof. If $T = U|T|$ is the polar decomposition of $T$, then also $U^*T = |T|$ - see Theorem 4.2.5; so each of $T$ and $|T|$ is a multiple of the other. Now appeal to Proposition 4.3.2(b) to deduce (a) above. Also, since $T^* = |T|U^*$, we see that the compactness of $T$ implies that of $T^*$; and (b) follows from the fact that we may interchange the roles of $T$ and $T^*$. □
Recall that if $T \in \mathcal{L}(H, K)$ and if $M$ is a subspace of $H$, then $T$ is said to be "bounded below" on $M$ if there exists an $\epsilon > 0$ such that $\|Tx\| \ge \epsilon\|x\| \ \forall x \in M$.

Lemma 4.3.4 If $T \in \mathcal{K}(H_1, H_2)$ and if $T$ is bounded below on a subspace $M$ of $H_1$, then $M$ is finite-dimensional. In particular, if $N$ is a closed subspace of $H_2$ such that $N$ is contained in the range of $T$, then $N$ is finite-dimensional.

Proof. If $T$ is bounded below on $M$, then $T$ is also bounded below (by the same constant) on $\overline{M}$; we may therefore assume, without loss of generality, that $M$ is closed. If $M$ contains an infinite orthonormal set, say $\{e_n : n \in \mathbb{N}\}$, and if $T$ is bounded below by $\epsilon$ on $M$, then note that $\|Te_n - Te_m\| \ge \epsilon\sqrt{2} \ \forall n \ne m$; then $\{e_n\}_n$ would be a bounded sequence in $H$ such that $\{Te_n\}_n$ would have no Cauchy subsequence, thus contradicting the assumed compactness of $T$; hence $M$ must be finite-dimensional.

As for the second assertion, let $M = T^{-1}(N) \cap (\ker^\perp T)$; note that $T$ maps $M$ 1-1 onto $N$; by the open mapping theorem, $T$ must be bounded below on $M$; hence, by the first assertion of this lemma, $M$ is finite-dimensional, and so also is $N$. □
The purpose of the next exercise is to convince the reader of the fact that compactness is an essentially "separable phenomenon" , so that we may ~ and will ~ restrict ourselves in the rest of this section to separable Hilbert spaces. Exercise 4.3.5 (a) Let T E .c(H) be a positive operator on a (possibly non-separable) Hilbert space H. Let E > 0 and let SE = {f(T)x : f E C(O"(T)), f(t) = o 'It E [0, If ME = [SEl denotes the closed subspace generated by SE! then show that ME C ran T. (Hint: let 9 E C(O"(T)) be any continuous function such that g( t) = c j 'It::::: ~; for instance, you could take
En.
g(t) =
{
h E2
if t ::::: ~ if 0 ::; t ::;
~
then notice that if f E C(O"(T)) satisfies f(t) = 0 'It::; E, then f(t) = tg(t)f(t) 'It; deduce that SE is a subset of N = {z E H : z = Tg(T)z}; but N is a closed subspace of H which is contained in ran T.) (b) Let T E K(HI' H 2 ), where HI, H2 are arbitrary (possibly non-separable) Hilbert spaces. Show that kerJ. T and ran T are separable Hilbert spaces. (Hint: Let T = UITI be the polar decomposition, and let ME be associated to ITI as in (a) above; show that U(M E ) is a closed subspace of ran T and deduce from Lemma 4.3.4 that ME is finite-dimensional; note that kerJ. T = kerJ. ITI is the closure of U;;'=l M ~, and that ran T = U (kerJ. T).)
In the following discussion of compact operators between Hilbert spaces, we shall make the standing assumption that all Hilbert spaces considered here are
separable. This is for two reasons: (a) by Exercise 4.3.5, this is no real loss of generality; and (b) we can feel free to use the measurable functional calculus for normal operators, whose validity was established - in the last chapter - only under the standing assumption of separability.

Proposition 4.3.6 The following conditions on an operator $T \in \mathcal{L}(H_1, H_2)$ are equivalent:
(a) $T$ is compact;
(b) $|T|$ is compact;
(c) if $M_\epsilon = \mathrm{ran}\ 1_{[\epsilon, \infty)}(|T|)$, then $M_\epsilon$ is finite-dimensional, for every $\epsilon > 0$;
(d) there exists a sequence $\{T_n\}_{n=1}^\infty \subset \mathcal{L}(H_1, H_2)$ such that (i) $\|T_n - T\| \to 0$, and (ii) $\mathrm{ran}\ T_n$ is finite-dimensional, for each $n$;
(e) $\mathrm{ran}\ T$ does not contain any infinite-dimensional closed subspace of $H_2$.

Proof. For $\epsilon > 0$, let us use the notation $1_\epsilon = 1_{[\epsilon, \infty)}$ and $P_\epsilon = 1_\epsilon(|T|)$.

(a) $\Rightarrow$ (b): See Corollary 4.3.3.

(b) $\Rightarrow$ (c): Since $t \ge \epsilon 1_\epsilon(t) \ \forall t \ge 0$, we find easily that $|T|$ is bounded below (by $\epsilon$) on $\mathrm{ran}\ P_\epsilon$, and (c) follows from Lemma 4.3.4.

(c) $\Rightarrow$ (d): Define $T_n = TP_{\frac{1}{n}}$; notice that $0 \le t(1 - 1_{\frac{1}{n}}(t)) \le \frac{1}{n} \ \forall t \ge 0$; conclude that $\| \, |T|(1 - 1_{\frac{1}{n}}(|T|)) \, \| \le \frac{1}{n}$; if $T = U|T|$ is the polar decomposition of $T$, deduce that $\|T - T_n\| \le \frac{1}{n}$; finally, the condition (c) clearly implies that each ($P_{\frac{1}{n}}$, and consequently) $T_n$ has finite-dimensional range.

(d) $\Rightarrow$ (a): In view of Proposition 4.3.2(a), it suffices to show that each $T_n$ is a compact operator; but any bounded operator with finite-dimensional range is necessarily compact, since any bounded set in a finite-dimensional space is totally bounded.

(a) $\Rightarrow$ (e): See Lemma 4.3.4.

(e) $\Rightarrow$ (c): Pick any bounded measurable function $g$ such that $g(t) = \frac{1}{t} \ \forall t \ge \epsilon$; then $tg(t) = 1 \ \forall t \ge \epsilon$; deduce that $|T|g(|T|)x = x \ \forall x \in M_\epsilon$, and hence that $M_\epsilon = |T|(M_\epsilon)$ is a closed subspace of ($\mathrm{ran}\ |T|$, and consequently of) the initial space of the partial isometry $U$; deduce that $T(M_\epsilon) = U(M_\epsilon)$ is a closed subspace of $\mathrm{ran}\ T$; by condition (e), this implies that $T(M_\epsilon)$ is finite-dimensional. But $|T|$, and consequently $T$, is bounded below (by $\epsilon$) on $M_\epsilon$; in particular, $T$ maps $M_\epsilon$ 1-1 onto $T(M_\epsilon)$; hence $M_\epsilon$ is finite-dimensional, as desired. □

We now discuss normal compact operators.

Proposition 4.3.7 Let $T \in \mathcal{K}(H)$ be a normal (compact) operator on a separable Hilbert space, and let $E \mapsto P(E) = 1_E(T)$ be the associated spectral measure.
4.3. Compact operators
(a) If ε > 0, let P_ε = P({λ ∈ ℂ : |λ| ≥ ε}); then ran P_ε is finite-dimensional.

(b) If Σ = σ(T) − {0}, then:
(i) Σ is a countable set;
(ii) λ ∈ Σ ⇒ λ is an eigenvalue of finite multiplicity; i.e., 0 < dim ker(T − λ) < ∞;
(iii) the only possible accumulation point of Σ is 0; and
(iv) there exist scalars λ_n ∈ Σ, n ∈ N, and an orthonormal basis {x_n : n ∈ N} of ran P(Σ) such that

T x = Σ_{n∈N} λ_n ⟨x, x_n⟩ x_n , for all x ∈ H .
Proof. (a) Note that the function defined by the equation

g(λ) = { 1/λ   if |λ| ≥ ε
       { 0     otherwise

is a bounded measurable function on σ(T) such that g(λ)λ = 1_{F_ε}(λ), where F_ε = {z ∈ ℂ : |z| ≥ ε}; it follows that g(T)T = 1_{F_ε}(T) = P_ε; since T is compact, the projection P_ε = g(T)T is also compact, and a compact projection must have finite-dimensional range, as asserted in (a).

(b) Since ran P_ε is finite-dimensional, the set Σ ∩ F_ε is finite; write Σ ∩ F_ε = {λ₁, …, λ_n} with the λ_i distinct and P_i = P({λ_i}) ≠ 0 for all i; then P_ε = Σ_{i=1}^n P_i is a decomposition of P_ε as a finite sum of non-zero (pairwise orthogonal) projections, where P_i = P({λ_i}); since P_i is the projection onto ker(T − λ_i) (see Exercise 4.3.8), we thus find that (i) {λ_i : 1 ≤ i ≤ n} = Σ ∩ F_ε; and (ii) each λ_i is an eigenvalue of T which has finite multiplicity. By allowing ε to decrease to 0 through a countable sequence of values, we thus find that we have proved (i)-(iii) (since a countable union of finite sets is countable).

For (iv), observe therefore that 1 − P({0}) = P(Σ) = Σ_{λ∈Σ} P({λ}), and hence T = Σ_{λ∈Σ} λ P({λ}); finally, if {x_n(λ) : n ∈ N_λ} is an orthonormal basis for ran P({λ}), for each λ ∈ Σ, then P({λ})x = Σ_{n∈N_λ} ⟨x, x_n(λ)⟩ x_n(λ) for all x ∈ H; letting {x_n}_{n∈N} be an enumeration of ∪_{λ∈Σ} {x_n(λ) : n ∈ N_λ}, we find that we do indeed have the desired decomposition of T. □

Exercise 4.3.8 Let X be a compact Hausdorff space and let B_X ∋ E ↦ P(E) be a spectral measure; let μ be a measure which is "mutually absolutely continuous" with respect to P - thus, for instance, we may take μ(E) = Σ_n ||P(E) e_n||², where {e_n}_n is some orthonormal basis for the underlying Hilbert space H - and let π : C(X) → L(H) be the associated representation.
Chapter 4. Some Operator Theory
(a) Show that the following conditions are equivalent:
(i) H is finite-dimensional;
(ii) there exists a finite set F ⊂ X such that μ = μ|_F, and such that ran P({x}) is finite-dimensional, for each x ∈ F.

(b) If x₀ ∈ X, show that the following conditions on a vector ξ ∈ H are equivalent:
(i) π(f)ξ = f(x₀)ξ for all f ∈ C(X);
(ii) ξ ∈ ran P({x₀}).
Remark 4.3.9 In the notation of Proposition 4.3.7, note that if λ ∈ Σ, then λ features (as some λ_n) in the expression for T that is displayed in Proposition 4.3.7(iv); in fact, the number of n for which λ = λ_n is exactly the dimension of (the eigenspace corresponding to λ, which is) ker(T − λ). Furthermore, there is nothing canonical about the choice of the sequence {x_n}; all that can be said about the sequence {x_n}_n is that if λ ∈ Σ is fixed, then {x_n : λ_n = λ} is an orthonormal basis for ker(T − λ). In fact, the "canonical formulation" of Proposition 4.3.7(iv) is this: T = Σ_{λ∈Σ} λ P_λ, where P_λ = 1_{{λ}}(T).

The λ_n's admit a "canonical choice" in some special cases; thus, for instance, if T is positive, then Σ is a subset of (0, ∞) which has no limit points except possibly 0; so the members of Σ may be arranged in non-increasing order; further, in the expression in Proposition 4.3.7(iv), we may assume that N = {1, 2, …, n} or N = {1, 2, …}, according as dim ran |T| = n or ℵ₀, and also λ₁ ≥ λ₂ ≥ ⋯. In this case, it must be clear that if λ₁ is an eigenvalue of "multiplicity m₁" - meaning that dim ker(T − λ₁) = m₁ - then λ_n = λ₁ for 1 ≤ n ≤ m₁; and similar statements are valid for the "multiplicities" of the remaining eigenvalues of T. □

We leave the proof of the following Proposition to the reader, since it is obtained by simply piecing together our description (see Proposition 4.3.7 and Remark 4.3.9) of normal compact operators and the polar decomposition, to arrive at a canonical "singular value decomposition" of a general compact operator.
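For a normal matrix, the content of Proposition 4.3.7(iv) and of this remark can be checked directly. The sketch below (my own finite-dimensional illustration; the matrix, eigenvalues and seed are arbitrary choices) builds a normal operator with the eigenvalue 2i of multiplicity two, and verifies both the expansion Tx = Σ λ_n ⟨x, x_n⟩ x_n and the multiplicity count dim ker(T − λ).

```python
import numpy as np

# A normal operator: a unitarily conjugated diagonal matrix, with
# eigenvalue 2j of multiplicity 2 and eigenvalue -1 of multiplicity 1.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
lam = np.array([2j, 2j, -1.0 + 0j])
T = Q @ np.diag(lam) @ Q.conj().T

assert np.allclose(T @ T.conj().T, T.conj().T @ T)     # T is normal

# The columns x_n of Q are an orthonormal eigenbasis, and
# T x = sum_n lam_n <x, x_n> x_n, with <x, x_n> = x_n^* x.
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
Tx = sum(lam[n] * (Q[:, n].conj() @ x) * Q[:, n] for n in range(3))
assert np.allclose(T @ x, Tx)

# 2j recurs twice among the lam_n, and indeed dim ker(T - 2j) = 2:
assert np.linalg.matrix_rank(T - 2j * np.eye(3)) == 1   # rank 1, nullity 2
```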
Proposition 4.3.10 Let T ∈ K(H₁, H₂). Then T admits the following decomposition:

T x = Σ_{n∈N} λ_n ( Σ_{k∈I_n} ⟨x, x_k⁽ⁿ⁾⟩ y_k⁽ⁿ⁾ ) , for all x ∈ H₁ ,   (4.3.4)

where
(i) σ(|T|) − {0} = {λ_n : n ∈ N}, where N is some countable set; and
(ii) {x_k⁽ⁿ⁾ : k ∈ I_n} (resp., {y_k⁽ⁿ⁾ : k ∈ I_n}) is an orthonormal basis for ker(|T| − λ_n) (resp., T(ker(|T| − λ_n)) = ker(|T*| − λ_n)).

In particular, we may assume that N = {1, 2, …, n} or {1, 2, …} according as dim ran |T| < ∞ or = ℵ₀, and that λ₁ ≥ λ₂ ≥ ⋯; if the sequence {s_n = s_n(T)} - which is indexed by {n ∈ ℕ : 1 ≤ n ≤ dim H₁} in case H₁ is finite-dimensional, and by ℕ if H₁ is infinite-dimensional - is defined by

s_n(T) = { λ₁   if 0 < n ≤ card(I₁)
         { λ₂   if card(I₁) < n ≤ (card(I₁) + card(I₂))     (4.3.5)
         { …

and s_n(T) = 0 for any remaining values of n, then

s₁(T) ≥ s₂(T) ≥ ⋯ ≥ s_n(T) ≥ ⋯

is a sequence of (uniquely defined) non-negative real numbers, called the sequence of singular values of the compact operator T.

Some more useful properties of singular values are listed in the following exercises.

Exercise 4.3.11 (1) Let T ∈ L(H) be a positive compact operator on a Hilbert space. In this case, we write λ_n = s_n(T), since (T = |T| and consequently) each λ_n is then an eigenvalue of T. Show that
λ_n = max_{dim M = n} min { ⟨Tx, x⟩ : x ∈ M, ||x|| = 1 } ,   (4.3.6)

where the maximum is to be interpreted as a supremum, over the collection of all subspaces M ⊂ H with appropriate dimension, and part of the assertion of the exercise is that this supremum is actually attained (and is consequently a maximum); in a similar fashion, the minimum is to be interpreted as an infimum which is attained. (This identity is called the max-min principle and is also referred to as the "Rayleigh-Ritz principle".)
(Hint: To start with, note that if M is a finite-dimensional subspace of H, then the minimum in equation 4.3.6 is attained, since the unit sphere of M is compact and T is continuous. Let {x_n : n ∈ ℕ} be an orthonormal set in H such that Tx = Σ_{n∈ℕ} λ_n ⟨x, x_n⟩ x_n - see Proposition 4.3.7 and Remark 4.3.9. Define M_n = [{x_j : 1 ≤ j ≤ n}]; observe that λ_n = min{⟨Tx, x⟩ : x ∈ M_n, ||x|| = 1}; this proves the inequality ≤ in 4.3.6. Conversely, if dim M = n, argue that there must exist a unit vector x₀ ∈ M ∩ M_{n−1}^⊥ (since the projection onto M_{n−1} cannot be injective on M), to conclude that min{⟨Tx, x⟩ : x ∈ M, ||x|| = 1} ≤ λ_n.)
(2) If T : H₁ → H₂ is a compact operator between Hilbert spaces, show that

s_n(T) = max_{dim M = n} min { ||Tx|| : x ∈ M, ||x|| = 1 } .   (4.3.7)

(Hint: Note that ||Tx||² = ⟨|T|²x, x⟩, apply (1) above to |T|², and note that s_n(|T|²) = s_n(|T|)².)

(3) If T ∈ K(H₁, H₂) and if s_n(T) > 0, then show that (n ≤ dim H₂, so that s_n(T*) is defined, and) s_n(T*) = s_n(T). (Hint: Use Exercise 4.2.6(2).)
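All three parts of the exercise can be spot-checked numerically in finite dimensions. The sketch below (an illustration of my own; the matrices are random and the subspace count is an arbitrary choice) verifies the max-min formula (4.3.6) for a positive matrix, the identity s_n(T)² = eigenvalues of |T|² = T*T from the hint to (2), and s_n(T*) = s_n(T) from (3).

```python
import numpy as np

rng = np.random.default_rng(2)

# --- (1): the max-min (Rayleigh-Ritz) principle for a positive matrix ---
A = rng.standard_normal((5, 5))
T = A @ A.T + np.eye(5)                        # a positive invertible operator
evals, evecs = np.linalg.eigh(T)
order = np.argsort(evals)[::-1]                # lam_1 >= lam_2 >= ...
lam, X = evals[order], evecs[:, order]

def min_rayleigh(B):
    """min{ <Tx,x> : x in span of the columns of B, ||x|| = 1 };
    for B with orthonormal columns this is the bottom eigenvalue of B^T T B."""
    return np.linalg.eigvalsh(B.T @ T @ B).min()

n = 3
assert np.isclose(min_rayleigh(X[:, :n]), lam[n - 1])  # M_n attains lam_n
for _ in range(200):                           # no n-dimensional M does better
    B, _ = np.linalg.qr(rng.standard_normal((5, n)))
    assert min_rayleigh(B) <= lam[n - 1] + 1e-10

# --- (2) and (3): singular values via |T|^2, and s_n(T*) = s_n(T) ---
C = rng.standard_normal((7, 4)) + 1j * rng.standard_normal((7, 4))
s = np.linalg.svd(C, compute_uv=False)         # s_1 >= ... >= s_4
assert np.allclose(s, np.linalg.svd(C.conj().T, compute_uv=False))
eig = np.sort(np.linalg.eigvalsh(C.conj().T @ C))[::-1]   # spectrum of C*C
assert np.allclose(s ** 2, eig)
```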
We conclude this section with a discussion of an important class of compact operators.
Lemma 4.3.12 The following conditions on a linear operator T ∈ L(H₁, H₂) are equivalent:
(i) Σ_n ||T e_n||² < ∞, for some orthonormal basis {e_n}_n of H₁;
(ii) Σ_m ||T* f_m||² < ∞, for every orthonormal basis {f_m}_m of H₂;
(iii) Σ_n ||T e_n||² < ∞, for every orthonormal basis {e_n}_n of H₁.
If these equivalent conditions are satisfied, then the sums of the series in (i)-(iii) are independent of the choice of the orthonormal bases and are all equal to one another.

Proof. If {e_n}_n (resp., {f_m}_m) is any orthonormal basis for H₁ (resp., H₂), then note that

Σ_n ||T e_n||² = Σ_n Σ_m |⟨T e_n, f_m⟩|² = Σ_m Σ_n |⟨e_n, T* f_m⟩|² = Σ_m ||T* f_m||² ,

and all the assertions of the lemma are seen to follow. □
Definition 4.3.13 An operator T E £(HI' H 2) is said to be a Hilbert-Schmidt operator if it satisfies the equivalent conditions of Lemma 4.3.12, and the HilbertSchmidt norm of such an operator is defined to be
||T||₂ = ( Σ_n ||T e_n||² )^{1/2} ,   (4.3.8)

where {e_n}_n is any orthonormal basis for H₁. The collection of all Hilbert-Schmidt operators from H₁ to H₂ will be denoted by L²(H₁, H₂). Some elementary properties of the class of Hilbert-Schmidt operators are contained in the following proposition.
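The basis-independence asserted in Lemma 4.3.12 is easy to test in finite dimensions, where ||T||₂ is the familiar Frobenius norm. A sketch (my own illustration; the second orthonormal basis is produced by a QR factorisation of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((5, 5))

def hs_norm_sq(T, basis):
    """sum_n ||T e_n||^2 over the columns e_n of an orthonormal basis."""
    return sum(np.linalg.norm(T @ basis[:, n]) ** 2
               for n in range(basis.shape[1]))

E = np.eye(5)                                      # the standard basis
F, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # another orthonormal basis

# The sum is independent of the orthonormal basis (Lemma 4.3.12) ...
assert np.isclose(hs_norm_sq(T, E), hs_norm_sq(T, F))
# ... and agrees with the square of the Frobenius norm.
assert np.isclose(hs_norm_sq(T, E), np.linalg.norm(T, 'fro') ** 2)
```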
Proposition 4.3.14 Suppose T ∈ L(H₁, H₂) and S ∈ L(H₂, H₃), where H₁, H₂, H₃ are Hilbert spaces.
(a) T ∈ L²(H₁, H₂) ⇒ T* ∈ L²(H₂, H₁); furthermore, ||T*||₂ = ||T||₂ ≥ ||T||_∞, where we write ||·||_∞ to denote the usual operator norm;
(b) if either S or T is a Hilbert-Schmidt operator, then so is ST, and

||ST||₂ ≤ { ||S||₂ ||T||_∞   if S ∈ L²(H₂, H₃)
          { ||S||_∞ ||T||₂   if T ∈ L²(H₁, H₂) ;     (4.3.9)
(c) L²(H₁, H₂) ⊂ K(H₁, H₂); and conversely,
(d) if T ∈ K(H₁, H₂), then T is a Hilbert-Schmidt operator if and only if Σ_n s_n(T)² < ∞; in fact,

||T||₂² = Σ_n s_n(T)² .

Proof. (a) The equality ||T||₂ = ||T*||₂ was proved in Lemma 4.3.12. If x is any unit vector in H₁, pick an orthonormal basis {e_n}_n for H₁ such that e₁ = x, and note that

||T||₂ = ( Σ_n ||T e_n||² )^{1/2} ≥ ||Tx|| ;

since x was an arbitrary unit vector in H₁, deduce that ||T||₂ ≥ ||T||_∞, as desired.

(b) Suppose T is a Hilbert-Schmidt operator; then, for an arbitrary orthonormal basis {e_n}_n of H₁, we find that

Σ_n ||ST e_n||² ≤ ||S||_∞² Σ_n ||T e_n||² ,

whence we find that ST is also a Hilbert-Schmidt operator and that ||ST||₂ ≤ ||S||_∞ ||T||₂; if, instead, S is a Hilbert-Schmidt operator, then so is S*, and, by the case already proved, T*S* is also a Hilbert-Schmidt operator, with ||ST||₂ = ||(ST)*||₂ = ||T*S*||₂ ≤ ||T*||_∞ ||S*||₂ = ||S||₂ ||T||_∞.

(c) Let M_ε = ran 1_{[ε,∞)}(|T|); then M_ε is a closed subspace of H₁ on which T is bounded below by ε; so, if {e₁, …, e_N} is any orthonormal set in M_ε, we find that Nε² ≤ Σ_{n=1}^N ||T e_n||² ≤ ||T||₂², which clearly implies that dim M_ε is finite (and cannot be greater than (||T||₂/ε)²). We may now infer from Proposition 4.3.6 that T is necessarily compact.

(d) Let Tx = Σ_n s_n(T) ⟨x, x_n⟩ y_n for all x ∈ H₁, as in Proposition 4.3.10, for an appropriate orthonormal (finite or infinite) sequence {x_n}_n (resp., {y_n}_n) in H₁ (resp., in H₂). Then notice that ||T x_n|| = s_n(T), and that Tx = 0 if x ⊥ x_n for all n. If we compute the Hilbert-Schmidt norm of T with respect to an orthonormal basis obtained by extending the orthonormal set {x_n}_n, we find that ||T||₂² = Σ_n s_n(T)², as desired. □

Probably the most useful fact concerning Hilbert-Schmidt operators is their connection with integral operators. (Recall that a measure space (Z, B_Z, μ) is said to be σ-finite if there exists a partition Z = ⨆_{n=1}^∞ E_n, such that E_n ∈ B_Z and μ(E_n) < ∞ for all n. The reason for our restricting ourselves to σ-finite measure spaces is that it is only in the presence of some such hypothesis that Fubini's theorem - see Proposition A.5.18(d) - regarding product measures is applicable.)
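Before turning to integral operators, the norm relations of Proposition 4.3.14 can be spot-checked numerically; the following sketch (my own illustration with random real matrices, not from the text) tests the inequality of (a), both halves of (4.3.9), and the identity ||T||₂² = Σ s_n(T)² of (d).

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((6, 4))    # T : H_1 -> H_2
S = rng.standard_normal((3, 6))    # S : H_2 -> H_3

hs = lambda A: np.linalg.norm(A, 'fro')   # Hilbert-Schmidt norm ||.||_2
op = lambda A: np.linalg.norm(A, 2)       # operator norm ||.||_oo

# (a): ||T*||_2 = ||T||_2 >= ||T||_oo
assert np.isclose(hs(T.T), hs(T))
assert hs(T) >= op(T)

# (b): ||ST||_2 <= ||S||_oo ||T||_2  and  ||ST||_2 <= ||S||_2 ||T||_oo
assert hs(S @ T) <= op(S) * hs(T) + 1e-12
assert hs(S @ T) <= hs(S) * op(T) + 1e-12

# (d): ||T||_2^2 = sum of the squared singular values
s = np.linalg.svd(T, compute_uv=False)
assert np.isclose(hs(T) ** 2, np.sum(s ** 2))
```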
Proposition 4.3.15 Let (X, B_X, μ) and (Y, B_Y, ν) be σ-finite measure spaces. Let H = L²(X, B_X, μ) and K = L²(Y, B_Y, ν). Then the following conditions on an operator T ∈ L(K, H) are equivalent:
(i) T ∈ L²(K, H);
(ii) there exists k ∈ L²(X × Y, B_X ⊗ B_Y, μ × ν) such that

(Tg)(x) = ∫_Y k(x, y) g(y) dν(y)   μ-a.e., for every g ∈ K .   (4.3.10)
If these equivalent conditions are satisfied, then ||T||₂ = ||k||_{L²(μ×ν)}.

Proof. (ii) ⇒ (i): Suppose k ∈ L²(μ × ν); then, by Tonelli's theorem - see Proposition A.5.18(c) - we can find a set A ∈ B_X such that μ(A) = 0 and such that x ∉ A ⇒ k^x ( = k(x, ·) ) ∈ L²(ν), and further,

||k||²_{L²(μ×ν)} = ∫_{X−A} ||k^x||²_{L²(ν)} dμ(x) .

It follows from the Cauchy-Schwarz inequality that if g ∈ L²(ν), then k^x g ∈ L¹(ν) for all x ∉ A; hence equation 4.3.10 does indeed meaningfully define a function Tg on X − A, so that Tg is defined almost everywhere; another application of the Cauchy-Schwarz inequality shows that

||Tg||²_{L²(μ)} = ∫_{X−A} | ∫_Y k(x, y) g(y) dν(y) |² dμ(x)
             = ∫_{X−A} |⟨k^x, ḡ⟩|² dμ(x)
             ≤ ∫_{X−A} ||k^x||²_{L²(ν)} ||g||²_{L²(ν)} dμ(x)
             = ||k||²_{L²(μ×ν)} ||g||²_{L²(ν)} ,

and we thus find that equation 4.3.10 indeed defines a bounded operator T ∈ L(K, H).

Before proceeding further, note that if g ∈ K and f ∈ H are arbitrary, then (by Fubini's theorem) we find that

⟨Tg, f⟩ = ∫_X (Tg)(x) f̄(x) dμ(x)
        = ∫_X ( ∫_Y k(x, y) g(y) dν(y) ) f̄(x) dμ(x)
        = ⟨k, f ⊗ ḡ⟩_{L²(μ×ν)} ,   (4.3.11)

where we have used the notation (f ⊗ g) to denote the function on X × Y defined by (f ⊗ g)(x, y) = f(x) g(y).
Suppose now that {e_n : n ∈ N} and {g_m : m ∈ M} are orthonormal bases for H and K respectively; then notice that {ḡ_m : m ∈ M} is also an orthonormal basis for K; deduce from equation 4.3.11 above and Exercise A.5.19 that

Σ_{m∈M, n∈N} |⟨T g_m, e_n⟩_H|² = Σ_{m∈M, n∈N} |⟨k, e_n ⊗ ḡ_m⟩_{L²(μ×ν)}|² = ||k||²_{L²(μ×ν)} ;

thus T is a Hilbert-Schmidt operator with Hilbert-Schmidt norm agreeing with the norm of k as an element of L²(μ × ν).

(i) ⇒ (ii): If T : K → H is a Hilbert-Schmidt operator, then, in particular - see Proposition 4.3.14(c) - T is compact; let

T g = Σ_n λ_n ⟨g, g_n⟩ f_n

be the singular value decomposition of T (see Proposition 4.3.10). Thus {g_n}_n (resp., {f_n}_n) is an orthonormal sequence in K (resp., H) and λ_n = s_n(T). It follows from Proposition 4.3.14(d) that Σ_n λ_n² < ∞, and hence we find that the equation

k = Σ_n λ_n f_n ⊗ ḡ_n

defines a unique element k ∈ L²(μ × ν); if T̃ denotes the "integral operator" associated to the "kernel function" k as in equation 4.3.10, we find from equation 4.3.11 that, for arbitrary g ∈ K and f ∈ H, we have

⟨T̃g, f⟩_H = ⟨k, f ⊗ ḡ⟩_{L²(μ×ν)} = Σ_n λ_n ⟨f_n ⊗ ḡ_n, f ⊗ ḡ⟩_{L²(μ×ν)} = Σ_n λ_n ⟨f_n, f⟩_H ⟨g, g_n⟩_K = ⟨Tg, f⟩_H ,

whence we find that T̃ = T, and so T is, indeed, the integral operator induced by the kernel function k. □
Exercise 4.3.16 If T and k are related as in equation 4.3.10, we say that T is the integral operator induced by the kernel k, and we shall write T = Int k. For i = 1, 2, 3, let H_i = L²(X_i, B_i, μ_i), where (X_i, B_i, μ_i) is a σ-finite measure space.
(a) Let h ∈ L²(X₂ × X₃, B₂ ⊗ B₃, μ₂ × μ₃), k, k₁ ∈ L²(X₁ × X₂, B₁ ⊗ B₂, μ₁ × μ₂), and let S = Int h ∈ L²(H₃, H₂), T = Int k, T₁ = Int k₁ ∈ L²(H₂, H₁); show that
(i) if α ∈ ℂ, then T + αT₁ = Int (k + αk₁);
(ii) if we define k*(x₂, x₁) = k̄(x₁, x₂), then k* ∈ L²(X₂ × X₁, B₂ ⊗ B₁, μ₂ × μ₁) and T* = Int k*;
(iii) TS ∈ L²(H₃, H₁) and TS = Int (k ∗ h), where

(k ∗ h)(x₁, x₃) = ∫_{X₂} k(x₁, x₂) h(x₂, x₃) dμ₂(x₂)

for (μ₁ × μ₃)-almost all (x₁, x₃) ∈ X₁ × X₃.
(Hint: for (ii), note that k* is a square-integrable kernel, and use equation 4.3.11 to show that Int k* = (Int k)*; for (iii), note that |(k ∗ h)(x₁, x₃)| ≤ ||k^{x₁}||_{L²(μ₂)} ||h_{x₃}||_{L²(μ₂)} to conclude that k ∗ h ∈ L²(μ₁ × μ₃); again use Fubini's theorem to justify interchanging the orders of integration in the verification that Int (k ∗ h) = (Int k)(Int h).)
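On a uniform grid, the correspondence T = Int k becomes the matrix k(x_i, y_j)·Δ, and the rules of this exercise reduce to matrix identities. The sketch below (my own discretised illustration on [0,1] with arbitrary smooth kernels; the grid size and the kernels are assumptions, not from the text) checks the adjoint rule (ii), the composition rule (iii), and the approximate equality of the Hilbert-Schmidt norm with the L² norm of the kernel.

```python
import numpy as np

# Discretise X = Y = [0,1] into m midpoints with uniform weight dx = 1/m.
m = 50
dx = 1.0 / m
x = (np.arange(m) + 0.5) * dx

k = np.exp(-np.subtract.outer(x, x) ** 2)   # a kernel k(x1, x2)
h = np.cos(np.subtract.outer(x, x))         # a kernel h(x2, x3)

Int = lambda ker: ker * dx                  # (Int ker) g ~ ker @ g * dx
T, S = Int(k), Int(h)

# (ii): the adjoint is induced by k*(x2, x1) = conj(k(x1, x2)).
assert np.allclose(T.conj().T, Int(k.conj().T))

# (iii): TS = Int(k * h), with (k*h)(x1,x3) = int k(x1,x2) h(x2,x3) dx2.
k_conv_h = k @ h * dx
assert np.allclose(T @ S, Int(k_conv_h))

# ||Int k||_2 equals the discretised L^2 norm of the kernel.
assert np.isclose(np.linalg.norm(T, 'fro'),
                  np.sqrt(np.sum(np.abs(k) ** 2) * dx * dx))
```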
4.4 Fredholm operators and index
We begin with a fact which is a simple consequence of results in the last section.

Lemma 4.4.1 Suppose I ⊂ L(H) is an ideal in the C*-algebra L(H).
(a) If I ≠ {0}, then I contains every "finite-rank operator" - i.e., if K ∈ L(H) and if ran K is finite-dimensional, then K ∈ I.
(b) If H is separable and if I is not contained in K(H), then I = L(H).

Proof. (a) If K is as in (a), then (by the singular value decomposition) there exists an orthonormal basis {y_n : 1 ≤ n ≤ d} for ker^⊥ K and an orthonormal basis {z_n : 1 ≤ n ≤ d} for ran K such that

K v = Σ_{n=1}^d s_n(K) ⟨v, y_n⟩ z_n , for all v ∈ H .

If I ≠ {0}, we can find 0 ≠ T ∈ I, and a unit vector x ∈ H such that Tx ≠ 0; define A_n, B_n ∈ L(H), 1 ≤ n ≤ d, by

B_n v = ⟨v, y_n⟩ x   and   A_n z = ||Tx||⁻² ⟨z, Tx⟩ z_n ,

and note that A_n T B_n v = ⟨v, y_n⟩ z_n for all v ∈ H; in particular, K = Σ_{n=1}^d s_n(K) A_n T B_n ∈ I.
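The factorisation used in part (a) can be checked concretely in finite dimensions. In the sketch below (my own numerical illustration; the particular formulas for A_n and B_n are the reconstruction used above, one of several choices that work), an arbitrary finite-rank K is recovered from any T with Tx ≠ 0.

```python
import numpy as np

rng = np.random.default_rng(6)
dim = 5
K = rng.standard_normal((dim, dim))
K[:, -2:] = 0                        # a "finite-rank" operator (rank 3)

T = rng.standard_normal((dim, dim))  # any nonzero operator in the ideal
x = rng.standard_normal(dim)
x /= np.linalg.norm(x)               # a unit vector with Tx != 0
Tx = T @ x
assert np.linalg.norm(Tx) > 0

# Singular value decomposition: K v = sum_n s_n <v, y_n> z_n.
Z, s, Yh = np.linalg.svd(K)
d = int(np.sum(s > 1e-12))           # rank of K

recon = np.zeros_like(K)
for n in range(d):
    y_n, z_n = Yh[n, :], Z[:, n]
    B_n = np.outer(x, y_n)                   # B_n v = <v, y_n> x
    A_n = np.outer(z_n, Tx) / (Tx @ Tx)      # A_n z = <z, Tx> z_n / ||Tx||^2
    recon += s[n] * A_n @ T @ B_n            # A_n T B_n v = <v, y_n> z_n

assert np.allclose(recon, K)                 # K factors through T
```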
(b) Suppose there exists a non-compact operator T ∈ I; then, by Proposition 4.3.6(e), there exists an infinite-dimensional closed subspace N ⊂ ran T. If M = T⁻¹(N) ∩ ker^⊥ T, it follows from the open mapping theorem that there exists a bounded operator S₀ ∈ L(N, M) such that S₀T|_M = id_M; hence if we write S = S₀P_N, then S ∈ L(H) and S T P_M = P_M; thus, P_M ∈ I; since M is an infinite-dimensional subspace of the separable Hilbert space H, we can find an isometry U ∈ L(H) such that UU* = P_M; but then, id_H = U* P_M U ∈ I, and hence I = L(H). □

Corollary 4.4.2 If H is an infinite-dimensional, separable Hilbert space, then K(H) is the unique non-trivial (norm-)closed ideal in L(H).
Proof. Suppose I is a closed non-trivial ideal in L(H); since the set of finite-rank operators is dense in K(H) (by Proposition 4.3.6(d)), it follows from Lemma 4.4.1(a) and the assumption that I is a non-zero closed ideal, that I ⊇ K(H); the assumption that I ≠ L(H) and Lemma 4.4.1(b) ensure that I ⊂ K(H). □

Remark 4.4.3 Suppose H is a non-separable Hilbert space; let {e_i : i ∈ I} denote any one fixed orthonormal basis for H. There is a natural equivalence relation on the collection 2^I of subsets of I given by I₁ ∼ I₂ if and only if |I₁| = |I₂| (i.e., there exists a bijective map f : I₁ → I₂); we shall think of every equivalence class as determining a "cardinal number α such that α ≤ dim H". Let us write J for the set of "infinite cardinal numbers" so obtained; thus,
J = { {I₁ ⊂ I : I₁ ∼ I₀} : I₀ ⊂ I, I₀ is infinite }
  = {ℵ₀, … , α, … , dim H} .

Thus we say α ∈ J if there exists an infinite subset I₀ ⊂ I (which is uniquely determined up to ∼) such that α = |I₀|. For each α ∈ J, consider the collection K_α(H) of operators on H with the property that if N is any closed subspace contained in ran T, then dim N < α. It is a fact that {K_α(H) : α ∈ J} is precisely the collection of all closed ideals of L(H).

Most discussions of non-separable Hilbert spaces degenerate, as in this remark, to an exercise in transfinite considerations involving infinite cardinals; almost all "operator theory" is, in a sense, contained in the separable case; and this is why we can safely restrict ourselves to separable Hilbert spaces in general. □

Proposition 4.4.4 (Atkinson's theorem) If T ∈ L(H₁, H₂), then the following conditions are equivalent:
(a) there exist operators S₁, S₂ ∈ L(H₂, H₁) and compact operators K_i ∈ L(H_i), i = 1, 2, such that

S₁ T = 1_{H₁} + K₁   and   T S₂ = 1_{H₂} + K₂ ;

(b) T satisfies the following conditions: (i) ran T is closed; and (ii) ker T and ker T* are both finite-dimensional;
(c) there exist S ∈ L(H₂, H₁) and finite-dimensional subspaces N_i ⊂ H_i, i = 1, 2, such that

S T = 1_{H₁} − P_{N₁}   and   T S = 1_{H₂} − P_{N₂} .
Proof. (a) ⇒ (b): Begin by fixing a finite-rank operator F such that ||K₁ − F|| < 1/2 (see Proposition 4.3.6(d)); set M = ker F and note that if x ∈ M, then

||S₁|| ||Tx|| ≥ ||S₁Tx|| = ||x + K₁x|| = ||x + (K₁ − F)x|| ≥ ||x|| − ||K₁ − F|| ||x|| ≥ (1/2)||x|| ,

which shows that T is bounded below on M; it follows that T(M) is a closed subspace of H₂; note, however, that M^⊥ is finite-dimensional (since F maps this space injectively onto its finite-dimensional range); hence T satisfies condition (i) thanks to Exercise A.6.5(3) and the obvious identity ran T = T(M) + T(M^⊥).

As for (ii), since S₁T = 1_{H₁} + K₁, note that K₁x = −x for all x ∈ ker T; this means that ker T is a closed subspace which is contained in ran K₁, and the compactness of K₁ now demands the finite-dimensionality of ker T. Similarly, ker T* ⊂ ran K₂*, and condition (ii) is verified.

(b) ⇒ (c): Let N₁ = ker T, N₂ = ker T* (= ran^⊥ T); thus T maps N₁^⊥ 1-1 onto ran T; condition (b) and the open mapping theorem imply the existence of a bounded operator S₀ ∈ L(N₂^⊥, N₁^⊥) such that S₀ is the inverse of the restricted operator T|_{N₁^⊥}; if we set S = S₀P_{N₂^⊥}, then S ∈ L(H₂, H₁) and, by definition, we have ST = 1_{H₁} − P_{N₁} and TS = 1_{H₂} − P_{N₂}; by condition (ii), both subspaces N_i are finite-dimensional.

(c) ⇒ (a): Obvious. □
Remark 4.4.5 (1) An operator which satisfies the equivalent conditions of Atkinson's theorem is called a Fredholm operator, and the collection of Fredholm operators from H₁ to H₂ is denoted by F(H₁, H₂); as usual, we shall write F(H) = F(H, H). It must be observed - as a consequence of Atkinson's theorem, for instance - that a necessary and sufficient condition for F(H₁, H₂) to be non-empty is that either (i) H₁ and H₂ are both finite-dimensional, in which case L(H₁, H₂) = F(H₁, H₂), or (ii) neither H₁ nor H₂ is finite-dimensional, and dim H₁ = dim H₂.

(2) Suppose H is a separable infinite-dimensional Hilbert space. Then the quotient Q(H) = L(H)/K(H) (of L(H) by the ideal K(H)) is a Banach algebra, which is called the Calkin algebra. If we write π_K : L(H) → Q(H) for the quotient mapping, then we find that an operator T ∈ L(H) is a Fredholm operator precisely when π_K(T) is invertible in the Calkin algebra; thus, F(H) = π_K⁻¹(G(Q(H))), where G(Q(H)) denotes the group of invertible elements of Q(H). (It is a fact, which we shall not need and consequently not go into here, that the Calkin algebra is in fact a C*-algebra - as is the quotient of any C*-algebra by a norm-closed *-ideal.)
(3) It is customary to use the adjective "essential" to describe a property of an operator T ∈ L(H) which is actually a property of the corresponding element π_K(T) of the Calkin algebra; thus, for instance, the essential spectrum of T is defined to be

σ_ess(T) = {λ ∈ ℂ : (T − λ) ∉ F(H)} .   (4.4.12)

□
The next exercise is devoted to understanding the notions of Fredholm operator and essential spectrum, at least in the case of normal operators.

Exercise 4.4.6 (1) Let T ∈ L(H₁, H₂) have polar decomposition T = U|T|. Then show that:
(a) T ∈ F(H₁, H₂) ⇔ U ∈ F(H₁, H₂) and |T| ∈ F(H₁).
(b) A partial isometry is a Fredholm operator if and only if both its initial and final spaces have finite co-dimension (i.e., have finite-dimensional orthogonal complements).
(Hint: for both parts, use the characterisation of a Fredholm operator which is given by Proposition 4.4.4(b).)

(2) If H₁ = H₂ = H, consider the following conditions on an operator T ∈ L(H): (i) T is normal; (ii) U and |T| commute. Show that (i) ⇒ (ii), and find an example to show that the reverse implication is not valid in general. (Hint: if T is normal, then note that

|T|² = T*T = TT* = U |T|² U* , whence |T|² U = U |T|² U* U = U |T|² ;

thus U commutes with |T|²; approximate |T| by a sequence of polynomials in |T|², to conclude that U and |T| commute; for the "reverse implication", let T denote the unilateral shift, and note that U = T and |T| = 1.)

(3) Suppose T = U|T| is a normal operator as in (2) above. Then show that the following conditions on T are equivalent:
(i) T is a Fredholm operator;
(ii) there exists an orthogonal direct-sum decomposition H = M ⊕ N, where dim N < ∞, with respect to which T has the form T = T₁ ⊕ 0, where T₁ is an invertible normal operator on M;
(iii) there exists an ε > 0 such that 1_{𝔻_ε}(T) = 1_{{0}}(T) = P₀, where (a) E ↦ 1_E(T) denotes the measurable functional calculus for T, (b) 𝔻_ε = {z ∈ ℂ :
(4) Let T E £(7-i) be normal; prove that the following conditions on a complex number A are equivalent:
(i) A E CJess(T); (ii) there exists an orthogonal direct-sum decomposition 7-i = M EB N, where dim N < 00, with respect to which T has the form T = TI EB A, where (TI - A) is an invertible normal operator on M; (iii) there exists E > 0 such that lllJ1,+,\(T) = l{,\}(T) = P,\, where][))E + A denotes the E-disc around the point A, and P,\ is some finite-rank projection. (Hint: apply (3) above to T - A.) We now come to an important definition.
Definition 4.4.7 1fT defined by
E
F(7-iI, 7-i2) is a Fredholm operator, its index is the integer
ind T
= dim(ker T) - dim(ker T*).
Several elementary consequences of the definition are discussed in the following remark. Remark 4.4.8 (1) The index of a normal Fredholm operator is always O. (Reason: If T E £(7-i) is a normal operator, then ITI2 = IT* 12, and the uniqueness of the square root implies that ITI = IT* I; it follows that ker T = ker ITI = ker T*.)
(2) It should be clear from the definitions that if T = UITI is the polar decomposition of a Fredholm operator, then ind T = ind U. (3) If7-i1 and 7-i 2 are finite-dimensional, then £(7-iI' 7-i 2 ) = F(7-i 1 , 7-i2) and ind T = dim 7-i 1 - dim 7-i 2\IT E £(7-i I, 7-i2); in particular, the index is independent of the operator in this case. (Reason: let us write p = dim(ran T) (resp., p* = dim(ran T*)) and v = dim(ker T) (resp., v* = dim(ker T*)) for the rank and nullity ofT (resp., T*); on the one hand, deduce from Exercise 4.2.6(3) that if dim 7-ii = ni, then
4.4. Fredholm operators and index p p
= nl - v and = p*; hence,
p*
=
139
n2 - v*; on the other hand, by Exercise 4.2.6(2), we find that
ind T
=v
- v*
=
(nl - p) - (n2 - p)
= nl
- n2 .)
(4) If S = UTV, where S ∈ L(H₁, H₄), U ∈ L(H₃, H₄), T ∈ L(H₂, H₃), V ∈ L(H₁, H₂), and if U and V are invertible (i.e., are 1-1 and onto), then S is a Fredholm operator if and only if T is, in which case ind S = ind T. (This should be clear from Atkinson's theorem and the definition of the index.)

(5) Suppose H_i = N_i ⊕ M_i and dim N_i < ∞, for i = 1, 2; suppose T ∈ L(H₁, H₂) is such that T maps N₁ into N₂, and such that T maps M₁ 1-1 onto M₂. Thus, with respect to these decompositions, T has the matrix decomposition

T = [ A  B ]
    [ 0  D ] ,

where D is invertible; then it follows from Atkinson's theorem that T is a Fredholm operator, and the assumed invertibility of D implies that ind T = ind A = dim N₁ − dim N₂ (see (3) above). □

Lemma 4.4.9 Suppose H_i = N_i ⊕ M_i, for i = 1, 2; suppose T ∈ L(H₁, H₂) has the associated matrix decomposition

T = [ A  B ]
    [ C  D ] ,

where A ∈ L(N₁, N₂), B ∈ L(M₁, N₂), C ∈ L(N₁, M₂), and D ∈ L(M₁, M₂); assume that D is invertible - i.e., D maps M₁ 1-1 onto M₂. Then

T ∈ F(H₁, H₂) ⇔ (A − BD⁻¹C) ∈ F(N₁, N₂) ,

and ind T = ind(A − BD⁻¹C); further, if it is the case that dim N_i < ∞, i = 1, 2, then T is necessarily a Fredholm operator and ind T = dim N₁ − dim N₂.

Proof. Let U ∈ L(H₂) (resp., V ∈ L(H₁)) be the operator which has the matrix decomposition

U = [ 1_{N₂}  −BD⁻¹ ]        V = [ 1_{N₁}    0    ]
    [   0     1_{M₂} ] ,         [ −D⁻¹C  1_{M₁} ] ,

with respect to H₂ = N₂ ⊕ M₂ (resp., H₁ = N₁ ⊕ M₁). Note that U and V are invertible operators, and that

U T V = [ A − BD⁻¹C   0 ]
        [     0       D ] ;
since D is invertible, we see that ker(UTV) = ker(A − BD⁻¹C) and that ker(UTV)* = ker(A − BD⁻¹C)*; also, it should be clear that UTV has closed range if and only if (A − BD⁻¹C) has closed range; we thus see that T is a Fredholm operator precisely when (A − BD⁻¹C) is Fredholm, and that ind T = ind(A − BD⁻¹C) in that case. For the final assertion of the lemma (concerning finite-dimensional N_i's), appeal now to Remark 4.4.8(5). □

We now state some simple facts in an exercise, before proceeding to establish the main facts concerning the index of Fredholm operators.

Exercise 4.4.10 (1) Suppose D₀ ∈ L(H₁, H₂) is an invertible operator; show that there exists ε > 0 such that if D ∈ L(H₁, H₂) satisfies ||D − D₀|| < ε, then D is invertible. (Hint: let D₀ = U₀|D₀| be the polar decomposition; write D = U₀(U₀*D), note that ||D − D₀|| = ||U₀*D − |D₀|||, and that D is invertible if and only if U₀*D is invertible, and use the fact that the set of invertible elements of the Banach algebra L(H₁) forms an open set.)
(2) Show that a function φ : [0, 1] → ℤ which is locally constant is necessarily constant.

(3) Suppose H_i = N_i ⊕ M_i, i = 1, 2, are orthogonal direct sum decompositions of Hilbert spaces.
(a) Suppose T ∈ L(H₁, H₂) is represented by the operator matrix

T = [ A  B ]
    [ 0  D ] ,

where A and D are invertible operators; show, then, that T is also invertible and that T⁻¹ is represented by the operator matrix

T⁻¹ = [ A⁻¹  −A⁻¹BD⁻¹ ]
      [  0      D⁻¹    ] .

(b) Suppose T ∈ L(H₁, H₂) is represented by the operator matrix

T = [ A  B ]
    [ C  0 ] ,
where B is an invertible operator; show that T ∈ F(H₁, H₂) if and only if C ∈ F(N₁, M₂), and that if this happens, then ind T = ind C.

Theorem 4.4.11 (a) F(H₁, H₂) is an open set in L(H₁, H₂) and the function ind : F(H₁, H₂) → ℤ is "locally constant"; i.e., if T₀ ∈ F(H₁, H₂), then there exists δ > 0 such that whenever T ∈ L(H₁, H₂) satisfies ||T − T₀|| < δ, it is then the case that T ∈ F(H₁, H₂) and ind T = ind T₀.

(b) T ∈ F(H₁, H₂), K ∈ K(H₁, H₂) ⇒ (T + K) ∈ F(H₁, H₂) and ind(T + K) = ind T.

(c) S ∈ F(H₂, H₃), T ∈ F(H₁, H₂) ⇒ ST ∈ F(H₁, H₃) and ind(ST) = ind S + ind T.
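The proof of (a) below rests on the block-matrix identity of Lemma 4.4.9. That identity, UTV = (A − BD⁻¹C) ⊕ D, can be verified numerically; the sketch below (my own illustration with randomly generated blocks, which are invertible with probability one) assembles U, T and V exactly as in the lemma.

```python
import numpy as np

rng = np.random.default_rng(8)
p, q = 3, 4                                # dim N_i = p, dim M_i = q
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q))            # invertible with probability 1
Dinv = np.linalg.inv(D)

T = np.block([[A, B], [C, D]])
U = np.block([[np.eye(p), -B @ Dinv],
              [np.zeros((q, p)), np.eye(q)]])
V = np.block([[np.eye(p), np.zeros((p, q))],
              [-Dinv @ C, np.eye(q)]])

schur = A - B @ Dinv @ C                   # the "Schur complement" block
target = np.block([[schur, np.zeros((p, q))],
                   [np.zeros((q, p)), D]])
assert np.allclose(U @ T @ V, target)      # U T V = (A - B D^{-1} C) (+) D
```

Since U and V are triangular with identity diagonal blocks, they are invertible, so T and the block-diagonal matrix have the same kernel dimensions up to these invertible factors, which is how the lemma transfers the Fredholm property and the index.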
Proof. (a) Suppose T₀ ∈ F(H₁, H₂). Set N₁ = ker T₀ and N₂ = ker T₀*, so that N_i, i = 1, 2, are finite-dimensional spaces and we have the orthogonal decompositions H_i = N_i ⊕ M_i, i = 1, 2, where M₁ = ran T₀* and M₂ = ran T₀. With respect to these decompositions of H₁ and H₂, it is clear that the matrix of T₀ has the form

T₀ = [ 0   0  ]
     [ 0   D₀ ] ,

where the operator D₀ : M₁ → M₂ is (a bounded bijection, and hence) invertible. Since D₀ is invertible, it follows - see Exercise 4.4.10(1) - that there exists a δ > 0 such that D ∈ L(M₁, M₂), ||D − D₀|| < δ ⇒ D is invertible. Suppose now that T ∈ L(H₁, H₂) and ||T − T₀|| < δ; let

T = [ A  B ]
    [ C  D ]

be the matrix decomposition associated to T; then note that ||D − D₀|| < δ, and consequently D is an invertible operator. Conclude from Lemma 4.4.9 that T is a Fredholm operator and that

ind T = ind(A − BD⁻¹C) = dim N₁ − dim N₂ = ind T₀ .
(b) If T is a Fredholm operator and K is compact, as in (b), define T_t = T + tK, for 0 ≤ t ≤ 1. It follows from Proposition 4.4.4 that each T_t is a Fredholm operator; further, it is a consequence of (a) above that the function [0, 1] ∋ t ↦ ind T_t is a locally constant function on the interval [0, 1]; the desired conclusion follows easily - see Exercise 4.4.10(2).
(c) Let us write K₁ = H₁ ⊕ H₂ and K₂ = H₂ ⊕ H₃, and consider the operators U ∈ L(K₂), R ∈ L(K₁, K₂) and V ∈ L(K₁) defined, by their matrices with respect to the afore-mentioned direct-sum decompositions of these spaces, as follows:

U = [  1_{H₂}     0    ]     R = [ T   ε·1_{H₂} ]     V = [  1_{H₁}      0    ]
    [ −(1/ε)S   1_{H₃} ] ,       [ 0      S     ] ,       [ −(1/ε)T   1_{H₂} ] ,

where we first choose ε > 0 to be so small as to ensure that R is a Fredholm operator with index equal to ind T + ind S; this is possible by (a) above, since the operator R₀, which is defined by modifying the definition of R so that the "off-diagonal" terms are zero and the diagonal terms are unaffected, is clearly a Fredholm operator with index equal to the sum of the indices of S and T.

It is easy to see that U and V are invertible operators - see Exercise 4.4.10(3)(a) - and that the matrix decomposition of the product URV ∈ F(K₁, K₂) is given by:

U R V = [     0        ε·1_{H₂} ]
        [ −(1/ε)ST        0     ] ,
which is seen - see Exercise 4.4.10(3)(b) - to imply that ST ∈ F(H₁, H₃) and that ind(ST) = ind R = ind S + ind T, as desired. □

Example 4.4.12 Fix a separable infinite-dimensional Hilbert space H; for definiteness' sake, we assume that H = ℓ². Let S ∈ L(H) denote the unilateral shift - see Example 2.4.15(1). Then S is a Fredholm operator with ind S = −1, and ind S* = 1; hence Theorem 4.4.11(c) implies that if n ∈ ℕ, then Sⁿ ∈ F(H) with ind(Sⁿ) = −n and ind((S*)ⁿ) = n; in particular, there exist Fredholm operators with all possible indices.

Let us write F_n = {T ∈ F(H) : ind T = n}, for each n ∈ ℤ. First consider the case n = 0. Suppose T ∈ F₀; then it is possible to find a partial isometry U₀ with initial space equal to ker T and final space equal to ker T*; now define T_t = T + tU₀. Observe that t ≠ 0 ⇒ T_t is invertible; hence the map [0, 1] ∋ t ↦ T_t ∈ L(H) (which is clearly norm-continuous) is seen to define a path - see Exercise 4.4.13(1) - which is contained in F₀ and connects T to an invertible operator; on the other hand, the set of invertible operators is a path-connected subset of F₀ - see Exercise 4.4.13(2)(e); it follows that F₀ is path-connected.

Next consider the case n < 0. Suppose T ∈ F_n, n < 0. Then note that T(S*)^{−n} ∈ F₀ (by Theorem 4.4.11(c)), and since (S*)^{−n}S^{−n} = 1, we find that T = T(S*)^{−n}S^{−n} ∈ F₀S^{−n}; conversely, since Theorem 4.4.11(c) implies that F₀S^{−n} ⊂ F_n, we thus find that F_n = F₀S^{−n}. For n > 0, we find, by taking adjoints, that F_n = (F_{−n})* = (S*)ⁿF₀.

We conclude that for all n ∈ ℤ, the set F_n is path-connected; on the other hand, since the index is "locally constant", we can conclude that {F_n : n ∈ ℤ} is precisely the collection of "path-components" (= maximal path-connected subsets) of F(H). □

Exercise 4.4.13 (1) A path in a topological space X is a continuous function f : [0, 1] → X; if f(0) = x and f(1) = y, then f is called a path joining (or connecting) x to y. Define a relation ∼ on X by stipulating that x ∼ y if and only if there exists a path joining x to y. Show that ∼ is an equivalence relation on X.
The equivalence classes associated to the relation ∼ are called the path-components of X; the space X is said to be path-connected if X is itself a path-component.
(2) Let H be a separable Hilbert space. In this exercise, we regard £(H) as being topologised by the operator norm.
4.4. Fredholm operators and index
(a) Show that the set L_sa(H) of self-adjoint operators on H is path-connected. (Hint: Consider t ↦ tT.)
(b) Show that the set L₊(H) of positive operators on H is path-connected. (Hint: Note that if T ≥ 0 and t ∈ [0,1], then tT ≥ 0.)
(c) Show that the set GL₊(H) of invertible positive operators on H forms a connected set. (Hint: If T ∈ GL₊(H), use straight line segments to first connect T to ‖T‖·1, and then ‖T‖·1 to 1.)
(d) Show that the set U(H) of unitary operators on H is path-connected.
(e) Use the polar decomposition to show that the set GL(H) of invertible operators on H is path-connected. (Hint: If U ∈ U(H), find a self-adjoint A such that U = e^{iA} - see Corollary 4.1.6(a) - and look at Uₜ = e^{itA}.)

We would like to conclude this section with the so-called "spectral theorem for a general compact operator". As a preamble, we start with an exercise which is devoted to "algebraic (possibly non-orthogonal) direct sums" and associated non-self-adjoint projections.

Exercise 4.4.14 (1) Let H be a Hilbert space, and let M and N denote closed subspaces of H. Show that the following conditions are equivalent:
(a) H = M + N and M ∩ N = {0};
(b) every vector z ∈ H is uniquely expressible in the form z = x + y with x ∈ M, y ∈ N.

(2) If the equivalent conditions of (1) above are satisfied, show that there exists a unique E ∈ L(H) such that Ez = x, whenever z and x are as in (1)(b) above. (Hint: note that z = Ez + (z − Ez), and use the closed graph theorem to establish the boundedness of E.)

(3) If E is as in (2) above, then show that:
(a) E = E²;
(b) the following conditions on a vector x ∈ H are equivalent: (i) x ∈ ran E; (ii) Ex = x;
(c) ker E = N.
The operator E is said to be the "projection on M along N".

(4) Show that the following conditions on an operator E ∈ L(H) are equivalent:
(i) E = E²;
(ii) there exists a closed subspace M ⊂ H such that E has the following operator-matrix with respect to the decomposition H = M ⊕ M^⊥:

E = [ 1_M  B ]
    [  0   0 ]   (for some B ∈ L(M^⊥, M));
Chapter 4. Some Operator Theory
(iii) there exists a closed subspace N ⊂ H such that E has the following operator-matrix with respect to the decomposition H = N^⊥ ⊕ N:

E = [ 1_{N^⊥}  0 ]
    [    C     0 ]   (for some C ∈ L(N^⊥, N));
(iv) there exist closed subspaces M, N satisfying the equivalent conditions of (1) such that E is the projection on M along N.
(Hint: (i) ⇒ (ii): M = ran E (= ker(1 − E)) is a closed subspace and Ex = x ∀ x ∈ M; since M = ran E, (ii) follows. The implication (ii) ⇒ (i) is verified by easy matrix multiplication. Finally, if we let (i)* (resp., (ii)*) denote the condition obtained by replacing E by E* in condition (i) (resp., (ii)), then (i) ⇔ (i)* ⇔ (ii)*; take adjoints to find that (ii)* ⇔ (iii). The implication (i) ⇔ (iv) is clear.)

(5) Show that the following conditions on an idempotent operator E ∈ L(H) - i.e., E² = E - are equivalent:
(i) E = E*;
(ii) ‖E‖ = 1.
(Hint: Assume E is represented in matrix form, as in (4)(iii) above; notice that x ∈ N^⊥ ⇒ ‖Ex‖² = ‖x‖² + ‖Cx‖²; conclude that ‖E‖ = 1 ⇔ C = 0.)

(6) If E is the projection onto M along N - as above - show that there exists an invertible operator S ∈ L(H) such that SES⁻¹ = P_M. (Hint: Assume E and B are related as in (4)(ii) above; define

S = [ 1_M    B     ]
    [  0   1_{M^⊥} ] ;

deduce from (a transposed version of) Exercise 4.4.10 that S is invertible, and that

[ 1_M    B     ] [ 1_M  B ] [ 1_M   −B    ]   [ 1_M  0 ]
[  0   1_{M^⊥} ] [  0   0 ] [  0   1_{M^⊥} ] = [  0   0 ] .)
(7) Show that the following conditions on an operator T ∈ L(H) are equivalent:
(a) there exist closed subspaces M, N as in (1) above such that (i) T(M) ⊂ M and T|_M = A; and (ii) T(N) ⊂ N and T|_N = B;
(b) there exists an invertible operator S ∈ L(H, M ⊕ N) - where the direct sum considered is an "external direct sum" - such that STS⁻¹ = A ⊕ B.

We will find the following bit of terminology convenient. Call operators Tᵢ ∈ L(Hᵢ), i = 1, 2, similar if there exists an invertible operator S ∈ L(H₁, H₂) such that T₂ = ST₁S⁻¹.
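The non-orthogonal projections of Exercise 4.4.14 are easy to visualise in finite dimensions. The following sketch (a concrete 2×2 example assumed here purely for illustration, not taken from the text) exhibits an idempotent E that is the projection on M = span{(1,0)} along N = span{(1,1)}, together with the similarity SES⁻¹ = P_M of part (6):

```python
import numpy as np

# E is idempotent, fixes M = span{(1,0)}, and kills N = span{(1,1)};
# with respect to H = M (+) M-perp it has the matrix [[1, B], [0, 0]], B = -1.
E = np.array([[1.0, -1.0],
              [0.0,  0.0]])
print(np.allclose(E @ E, E))        # True: E is idempotent
print(np.allclose(E, E.T))          # False: E is not an orthogonal projection
print(np.linalg.norm(E, 2) > 1)     # True, consistent with (5): E != E* <=> ||E|| > 1

# The similarity of part (6): S = [[1, B], [0, 1]] conjugates E to P_M.
S = np.array([[1.0, -1.0],
              [0.0,  1.0]])
P_M = S @ E @ np.linalg.inv(S)
print(np.allclose(P_M, np.diag([1.0, 0.0])))  # True
```
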
Lemma 4.4.15 The following conditions on an operator T ∈ L(H) are equivalent:
(a) T is similar to an operator of the form T₀ ⊕ Q ∈ L(M ⊕ N), where
(i) N is finite-dimensional;
(ii) T₀ is invertible, and Q is nilpotent.
(b) T ∈ F(H), ind(T) = 0, and there exists a positive integer n such that ker Tⁿ = ker Tᵐ ∀ m ≥ n.
Proof. (a) ⇒ (b): If STS⁻¹ = T₀ ⊕ Q, then it is obvious that STⁿS⁻¹ = T₀ⁿ ⊕ Qⁿ, which implies - because of the assumed invertibility of T₀ - that ker Tⁿ = S⁻¹({0} ⊕ ker Qⁿ), and hence, if n = dim N, then for any m ≥ n, we see that ker Tᵐ = S⁻¹({0} ⊕ N). In particular, ker T is finite-dimensional; similarly, ker T* is also finite-dimensional, since (S*)⁻¹T*S* = T₀* ⊕ Q*; further,
ran T = S⁻¹( ran(T₀ ⊕ Q) ) = S⁻¹( M ⊕ ran Q ),
which is closed since S⁻¹ is a homeomorphism, and since the sum of the closed subspace M ⊕ {0} and the finite-dimensional space {0} ⊕ ran Q is closed in M ⊕ N - see Exercise A.6.5(3). Hence T is a Fredholm operator. Finally,
ind(T) = ind(STS⁻¹) = ind(T₀ ⊕ Q) = ind(Q) = 0.
(b) ⇒ (a): Let us write Mₙ = ran Tⁿ and Nₙ = ker Tⁿ for all n ∈ ℕ; then, clearly, Mₘ ⊂ Mₙ and Nₙ ⊂ Nₘ whenever m ≥ n. We are told that Nₘ = Nₙ ∀ m ≥ n. The assumption ind T = 0 implies that ind Tᵐ = 0 ∀ m, and hence we find that dim(ker T*ᵐ) = dim(ker Tᵐ) = dim(ker Tⁿ) = dim(ker T*ⁿ) < ∞ for all m ≥ n. But since ker T*ᵐ = Mₘ^⊥, and since Mₘ ⊂ Mₙ (so that Mₘ^⊥ ⊃ Mₙ^⊥), the equality of these finite dimensions shows that Mₘ^⊥ = Mₙ^⊥; as the subspaces Mₘ = ran Tᵐ are closed (each Tᵐ being Fredholm), we may conclude that Mₘ = Mₙ ∀ m ≥ n. Let N = Nₙ, M = Mₙ, so that we have
N = ker Tᵐ and M = ran Tᵐ ∀ m ≥ n.   (4.4.13)
The definitions clearly imply that T(M) ⊂ M and T(N) ⊂ N (since M and N are actually invariant under any operator which commutes with Tⁿ). We assert that M and N yield an algebraic direct sum decomposition of H (in the sense of Exercise 4.4.14(1)). Firstly, if z ∈ H, then Tⁿz ∈ Mₙ = M₂ₙ, and hence we can find v ∈ H such that Tⁿz = T²ⁿv; thus z − Tⁿv ∈ ker Tⁿ; i.e., if x = Tⁿv and y = z − x, then x ∈ M, y ∈ N and z = x + y; thus, indeed, H = M + N. Notice that T (and hence also Tⁿ) maps M onto itself; in particular, if z ∈ M ∩ N, we can find an x ∈ M such that z = Tⁿx; the assumption z ∈ N implies that 0 = Tⁿz = T²ⁿx; this means that x ∈ N₂ₙ = Nₙ, whence z = Tⁿx = 0; since
z was arbitrary, we have shown that N ∩ M = {0}, and our assertion has been substantiated.
If T₀ = T|_M and Q = T|_N, the (already proved) fact that M ∩ N = {0} implies that Tⁿ is 1-1 on M; thus T₀ⁿ is 1-1, and hence T₀ is 1-1; it has already been noted that T₀ maps M onto M; hence T₀ is indeed invertible; on the other hand, it is obvious that Qⁿ is the zero operator on N. □

Corollary 4.4.16 Let K ∈ K(H); assume 0 ≠ λ ∈ σ(K); then K is similar to an operator of the form K₁ ⊕ A ∈ L(M ⊕ N), where
(a) K₁ ∈ K(M) and λ ∉ σ(K₁); and
(b) N is a finite-dimensional space, and σ(A) = {λ}.
Proof. Put T = K − λ; then the hypothesis and Theorem 4.4.11 ensure that T is a Fredholm operator with ind(T) = 0. Consider the non-decreasing sequence
ker T ⊂ ker T² ⊂ ... ⊂ ker Tⁿ ⊂ ... .   (4.4.14)
Suppose ker Tⁿ ≠ ker Tⁿ⁺¹ ∀ n; then we can pick a unit vector xₙ ∈ (ker Tⁿ⁺¹) ∩ (ker Tⁿ)^⊥ for each n. Clearly the sequence {xₙ}ₙ is an orthonormal set. Hence, limₙ ‖Kxₙ‖ = 0 (by Exercise 4.4.17(2)(iii)). On the other hand,
xₙ ∈ ker Tⁿ⁺¹ ⇒ Txₙ ∈ ker Tⁿ ⇒ ⟨Txₙ, xₙ⟩ = 0 ⇒ ⟨Kxₙ, xₙ⟩ = λ,
contradicting the hypothesis that λ ≠ 0 and the already drawn conclusion that Kxₙ → 0.
Hence, it must be the case that ker Tⁿ = ker Tⁿ⁺¹ for some n ∈ ℕ; it follows easily from this that ker Tⁿ = ker Tᵐ ∀ m ≥ n. Thus, we may conclude from Lemma 4.4.15 that there exists an invertible operator S ∈ L(H, M ⊕ N) - where N is finite-dimensional - such that STS⁻¹ = T₀ ⊕ Q, where T₀ is invertible and σ(Q) = {0}; since K = T + λ, conclude that SKS⁻¹ = (T₀ + λ) ⊕ (Q + λ); set K₁ = T₀ + λ, A = Q + λ, and conclude that indeed K₁ is compact, λ ∉ σ(K₁) and σ(A) = {λ}. □
Exercise 4.4.17 (1) Let X be a metric space; if x, x₁, x₂, … ∈ X, show that the following conditions are equivalent:
(i) the sequence {xₙ}ₙ converges to x;
(ii) every subsequence of {xₙ}ₙ has a further subsequence which converges to x.
(Hint: for the non-trivial implication, note that if the sequence {xₙ}ₙ does not converge to x, then there must exist a subsequence whose members are "bounded away from x".)
(2) Show that the following conditions on an operator T ∈ L(H₁, H₂) are equivalent:
(i) T is compact;
(ii) if {xₙ} is a sequence in H₁ which converges weakly to 0 - i.e., ⟨x, xₙ⟩ → 0 ∀ x ∈ H₁ - then ‖Txₙ‖ → 0;
(iii) if {eₙ} is any infinite orthonormal sequence in H₁, then ‖Teₙ‖ → 0.
(Hint: for (i) ⇒ (ii), suppose {yₙ}ₙ is a subsequence of {xₙ}ₙ; by compactness, there is a further subsequence {zₙ}ₙ of {yₙ}ₙ such that {Tzₙ}ₙ converges, to z, say; since zₙ → 0 weakly, deduce that Tzₙ → 0 weakly; this means z = 0, since strong convergence implies weak convergence; by (1) above, this proves (ii). The implication (ii) ⇒ (iii) follows from the fact that any orthonormal sequence converges weakly to 0. For (iii) ⇒ (i), deduce from Proposition 4.3.6(c) that if T is not compact, there exists an ε > 0 such that M_ε = ran 1_{[ε,∞)}(|T|) is infinite-dimensional; then any infinite orthonormal set {eₙ : n ∈ ℕ} in M_ε would violate condition (iii).)

We are finally ready to state the spectral theorem for a compact operator.

Theorem 4.4.18 Let K ∈ K(H) be a compact operator on a Hilbert space H. Then,
(a) λ ∈ σ(K) − {0} ⇒ λ is an eigenvalue of K and λ is "isolated" in the sense that there exists ε > 0 such that 0 < |z − λ| < ε ⇒ z ∉ σ(K);
(b) if λ ∈ σ(K) − {0}, then λ is an eigenvalue with "finite algebraic multiplicity" in the strong sense described by Corollary 4.4.16;
(c) σ(K) is countable, and the only possible accumulation point of σ(K) is 0.
Proof. Assertions (a) and (b) are immediate consequences of Corollary 4.4.16, while (c) follows immediately from (a). □
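The finite-dimensional shadow of Lemma 4.4.15 and Corollary 4.4.16 is the Jordan decomposition. The following sketch (with an illustrative 4×4 matrix assumed here, not from the text) shows the kernels of Tᵐ, T = K − λ, growing until they stabilise at the algebraic multiplicity of λ:

```python
import numpy as np

# K has eigenvalue lam = 2 with algebraic multiplicity 3 but geometric
# multiplicity 1 (a 3x3 Jordan block), plus a simple eigenvalue 5.
lam = 2.0
J = np.array([[lam, 1, 0],
              [0, lam, 1],
              [0, 0, lam]])
K = np.block([[J, np.zeros((3, 1))],
              [np.zeros((1, 3)), np.array([[5.0]])]])

T = K - lam * np.eye(4)
dims = []
for m in range(1, 6):
    Tm = np.linalg.matrix_power(T, m)
    dims.append(4 - np.linalg.matrix_rank(Tm))  # dim ker T^m = 4 - rank T^m
print(dims)  # kernel dimensions grow 1, 2, 3 and then stabilise at 3
```
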
Chapter 5 Unbounded Operators
5.1 Closed operators
This chapter will be concerned with some of the basic facts concerning unbounded operators on Hilbert spaces. We begin with some definitions. It will be necessary for us to consider operators which are not defined everywhere in a Hilbert space, but only on some linear subspace which is typically not closed; in fact, our primary interest will be in operators which are defined on a dense proper subspace. (Since we shall only consider Hilbert spaces, we adopt the following conventions throughout this chapter.) Symbols H and K (and primed and subscripted variations thereof) will be reserved for separable complex Hilbert spaces, and the symbol D (and its variants) will be reserved for linear (typically non-closed) subspaces of Hilbert spaces.

Definition 5.1.1 A linear operator "from H to K" is a linear map T : D → K, where D is a linear subspace of H; the subspace D is referred to as the domain (of definition) of the operator T, and we shall write D = dom(T) to indicate this relationship between D and T. The linear operator T is said to be densely defined if D = dom T is dense in H.

The reason for our interest in densely defined operators lies in the following fact.

Proposition 5.1.2 Let T : D (⊂ H) → K be a linear operator; assume that T is densely defined.
(a) The following conditions on a vector y ∈ K are equivalent:
(i) there exists a constant C > 0 such that |⟨Tx, y⟩| ≤ C‖x‖ ∀ x ∈ D;
(ii) there exists a vector z ∈ H such that ⟨Tx, y⟩ = ⟨x, z⟩ for all x ∈ D.
(b) Let D* denote the set of all y ∈ K which satisfy the equivalent conditions of (a); then
(i) if y ∈ D*, there exists a unique z ∈ H satisfying the requirements listed in (a)(ii); and
(ii) there exists a unique linear operator T* : D* → H such that
⟨Tx, y⟩ = ⟨x, T*y⟩ ∀ x ∈ D, y ∈ D*.

Proof. (a) (i) ⇒ (ii): If (i) is satisfied, then the mapping D ∋ x ↦ ⟨Tx, y⟩ ∈ ℂ defines a bounded linear functional on D which consequently - see Exercise 1.5.5(1)(b) - has a unique extension to a bounded linear functional on D̄ = H; the assertion (ii) follows now from the Riesz representation theorem.
(ii) ⇒ (i): Obvious, by Cauchy-Schwarz.
(b) (i) The uniqueness of z is a direct consequence of the assumed density of D.
(ii) If y ∈ D*, define T*y to be the unique element z as in (a)(ii); the uniqueness assertion in (b)(i) above clearly implies the linearity of T* (in exactly the same manner in which the linearity of the adjoint of a bounded operator was established). The operator T* has the stipulated property by definition, and it is uniquely determined by this property in view of the uniqueness assertion (b)(i). □

Definition 5.1.3 The adjoint of a densely defined operator T (as in Proposition 5.1.2) is the unique operator T* whose domain and action are as prescribed in part (b) of Proposition 5.1.2. (It should be noted that we will talk about the adjoint of an operator only when it is densely defined; hence if we do talk about the adjoint of an operator, it will always be understood - even if it has not been explicitly stated - that the original operator is densely defined.)
We now list a few useful examples of unbounded operators.

Example 5.1.4 (1) Let H = L²(X, B, μ), where (X, B, μ) is a σ-finite measure space; given any measurable function φ : X → ℂ, let
D_φ = { f ∈ L²(X, B, μ) : φf ∈ L²(X, B, μ) },   (5.1.1)
and define M_φ to be the operator given by
M_φ f = φf, ∀ f ∈ D_φ = dom M_φ.
</gr-replace>
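To see how such a multiplication operator can fail to be bounded while still being densely defined, consider the following sketch (the special case φ(x) = x on L²(ℝ, m) is an illustrative assumption, not prescribed by the text):

```latex
% Take (X, \mathcal{B}, \mu) = (\mathbb{R}, \mathcal{B}_{\mathbb{R}}, m), \phi(x) = x,
% and test M_\phi against the unit vectors f_n = 1_{[n, n+1]} \in D_\phi:
\|f_n\|^2 = \int_n^{n+1} 1 \, dm = 1, \qquad
\|M_\phi f_n\|^2 = \int_n^{n+1} x^2 \, dm(x) \;\geq\; n^2 ,
% so \|M_\phi f_n\|/\|f_n\| \geq n and M_\phi is unbounded; nevertheless D_\phi
% is dense in L^2(\mathbb{R}), since it contains every f \in L^2 which is
% supported in a bounded set.
```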
(2) If k : X × Y → ℂ is a measurable function, where (X, B_X, μ) and (Y, B_Y, ν) are σ-finite measure spaces, we can associate the integral operator Int k, whose natural domain D_k is the set of those g ∈ L²(Y, ν) which satisfy the following two conditions: (a) k(x, ·)g ∈ L¹(Y, ν) for μ-almost every x, and (b) the function given by ((Int k)g)(x) = ∫_Y k(x, y)g(y) dν(y) (which, by (a), is μ-a.e. defined) satisfies (Int k)g ∈ L²(X, μ).
(3) A very important source of examples of unbounded operators is the study of differential equations. To be specific, suppose we wish to study the differential expression
τ = −d²/dx² + q(x)   (5.1.2)
on the interval [a, b]. By this we mean we want to study the passage f ↦ τf, where (τf)(x) = −f''(x) + q(x)f(x), it being assumed, of course, that the function f is appropriately defined and is sufficiently "smooth" as to make sense of τf. The typical Hilbert space approach to such a problem is to start with the Hilbert space H = L²([a, b]) = L²([a, b], B_{[a,b]}, m), where m denotes Lebesgue measure restricted to [a, b], and to study the operator Tf = τf defined on the "natural domain" dom T consisting of those functions f ∈ H for which the second derivative f'' "makes sense" and τf ∈ L²([a, b]).
(4) Let ((aᵢⱼ))₀≤ᵢ,ⱼ<∞ ...

Exercise 5.1.5 (1) Let Tᵢ : Dᵢ (⊂ H) → K, i = 1, 2, be linear operators. Show that the following conditions are equivalent:
(i) G(T₁) ⊂ G(T₂), where the graph G(T) of a linear operator T "from H to K" is defined by G(T) = {(x, y) ∈ H ⊕ K : x ∈ dom T, y = Tx};
(ii) D₁ ⊂ D₂ and T₁x = T₂x ∀ x ∈ D₁.
When these equivalent conditions are met, we say that T₂ is an extension of T₁, or that T₁ is a restriction of T₂, and we denote this relationship by T₂ ⊃ T₁ or by T₁ ⊂ T₂.
(2) (a) If S and T are linear operators, show that S ⊂ T* if and only if ⟨Tx, y⟩ = ⟨x, Sy⟩ for all x ∈ dom T, y ∈ dom S.
(b) If S ⊂ T, and if S is densely defined, then show that (also T is densely defined, so that T* makes sense, and that) S* ⊃ T*.

We now come to one of the fundamental observations.

Proposition 5.1.6 Suppose T : D (⊂ H) → K is a densely defined linear operator. Then the graph G(T*) of the adjoint operator is a closed subspace of the Hilbert space K ⊕ H; in fact, we have:
G(T*) = {(Tx, −x) : x ∈ D}^⊥.   (5.1.3)
Proof. Let us write S = {(Tx, −x) : x ∈ D}; thus S is a linear subspace of K ⊕ H. If y ∈ dom T*, then, by definition, we have ⟨Tx, y⟩ = ⟨x, T*y⟩ ∀ x ∈ D, or equivalently, (y, T*y) ∈ S^⊥. Conversely, suppose (y, z) ∈ S^⊥; this means that ⟨Tx, y⟩ + ⟨−x, z⟩ = 0 ∀ x ∈ D, or equivalently, ⟨Tx, y⟩ = ⟨x, z⟩ ∀ x ∈ D. Deduce from Proposition 5.1.2(a)(ii) that y ∈ dom T* and that T*y = z; thus, we also have S^⊥ ⊂ G(T*). □

Definition 5.1.7 A linear operator T : D (⊂ H) → K is said to be closed if its graph G(T) is a closed subspace of H ⊕ K. A linear operator T is said to be closable if it has a closed extension.
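In finite dimensions, formula (5.1.3) can be checked by direct matrix algebra; here is a small numerical sketch (the matrix T is an arbitrary illustrative choice):

```python
import numpy as np

# Check of (5.1.3): inside K (+) H, the graph of T* is orthogonal to
# S = {(Tx, -x) : x in H}.  Here H = K = C^2 and T is a random 2x2 matrix.
rng = np.random.default_rng(1)
T = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

S = np.vstack([T, -np.eye(2)])           # columns span {(Tx, -x)}
G = np.vstack([np.eye(2), T.conj().T])   # columns span G(T*) = {(y, T*y)}

# <(Tx, -x), (y, T*y)> = <Tx, y> - <x, T*y> = 0 for every x and y:
print(np.allclose(S.conj().T @ G, 0))  # True
```
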
Thus Proposition 5.1.6 shows that the adjoint of a densely defined operator is always a closed operator. Some further consequences of the preceding definitions and proposition are contained in the following exercises.

Exercise 5.1.8 In the following exercise, it will be assumed that D ⊂ H is a linear subspace.
(a) Show that the following conditions on a linear subspace S ⊂ H ⊕ K are equivalent:
(i) there exists some linear operator T "from H to K" such that S = G(T);
(ii) S ∩ ({0} ⊕ K) = {(0, 0)}.
(Hint: Clearly (i) implies (ii), since linear maps map 0 to 0; to see that (ii) implies (i), define D = P₁(S), where P₁ denotes the projection of H ⊕ K onto the first co-ordinate, and note that (ii) may be re-phrased thus: if x ∈ D, then there exists a unique y ∈ K such that (x, y) ∈ S.)
(b) Show that the following conditions on the linear operator T : D → K are equivalent:
(i) T is closable;
(ii) if {xₙ}ₙ is a sequence in D such that xₙ → 0 ∈ H and Txₙ → y ∈ K, then y = 0.
(Hint: (ii) is just a restatement of the fact that if (0, y) belongs to the closure of G(T), then y = 0. It is clear now that (i) ⇒ (ii). Conversely, if condition (ii) is satisfied, we find that the closure S of G(T) satisfies condition (a)(ii) above, and it follows from this and (a) that S is the graph of an operator which is necessarily a closed extension - in fact, the closure, in the sense of (c) below - of T.)

(c) If T : D → K is a closable operator, then show that there exists a unique closed operator - always denoted by T̄, and referred to as the closure of T - such that G(T̄) is the closure of the graph of T; and deduce that T̄ is the smallest closed extension of T, in the sense that it is a restriction of any closed extension of T.
We now reap some consequences of Proposition 5.1.6.

Corollary 5.1.9 Let T : D (⊂ H) → K be a densely defined linear operator. Then T is closable if and only if T* is densely defined, in which case T̄ = T**. In particular, if T is a closed densely defined operator, then T = T**.
Proof. If T* were densely defined, then T** = (T*)* makes sense, and it follows (from Exercise 5.1.5(2)(a), for instance) that T ⊂ T**; hence T** is a closed extension of T.
Conversely, suppose T is closable. Let y ∈ (dom T*)^⊥; note that (y, 0) ∈ G(T*)^⊥, and deduce from equation 5.1.3 that (y, 0) belongs to the closure of {(Tx, −x) : x ∈ dom T}; this, in turn, implies that (0, y) belongs to the closure of G(T), which is G(T̄), and consequently that y = 0. So "T closable" implies "T* densely defined".
Let W : H ⊕ K → K ⊕ H denote the (clearly unitary) operator defined by W(x, y) = (y, −x); then equation 5.1.3 can be re-stated as
G(T*) = ( W(G(T)) )^⊥.   (5.1.4)
On the other hand, observe that W* : K ⊕ H → H ⊕ K is the unitary operator given by W*(y, x) = (−x, y), and equation 5.1.4 (applied to T* in place of T) shows that
G(T**) = ( (−W*)(G(T*)) )^⊥ = ( W*(G(T*)) )^⊥ = ( W*( (W(G(T)))^⊥ ) )^⊥ (by 5.1.4) = ( (G(T))^⊥ )^⊥,
which is the closure of G(T), i.e., G(T̄). □
We thus find that the class of closed densely defined operators between Hilbert spaces is closed under the adjoint operation (which is an "involutory" operation on this class). We shall, in the sequel, be primarily interested in this class; we
shall use the symbol L_c(H, K) to denote the class of densely defined closed linear operators T : dom T (⊂ H) → K.
Some more elementary facts about (domains of combinations of) unbounded operators are listed in the following exercise.

Exercise 5.1.10 (a) The following conditions on a closed operator T are equivalent: (i) dom T is closed; (ii) T is bounded.
(Hint: (i) ⇒ (ii) is a consequence of the closed graph theorem. (ii) ⇒ (i): if x belongs to the closure of dom T, and if {xₙ}ₙ ⊂ dom T is such that xₙ → x, then {Txₙ}ₙ converges, to y say; thus (x, y) ∈ G(T), and in particular (since T is closed) x ∈ dom T.)

(b) If T (resp., T₁) : dom T (resp., dom T₁) (⊂ H) → K and S : dom S (⊂ K) → K₁ are linear operators, the composite ST, the sum T + T₁ and the scalar multiple αT are the operators defined "pointwise" by the obvious formulae, on the following domains:
dom(ST) = {x ∈ dom T : Tx ∈ dom S},
dom(T + T₁) = dom T ∩ dom T₁,
dom(αT) = dom T.
Show that:
(i) T + T₁ = T₁ + T;
(ii) S(αT + T₁) ⊃ αST + ST₁;
(iii) if ST is densely defined, then T*S* ⊂ (ST)*;
(iv) if A ∈ L(H, K) is an everywhere defined bounded operator, then (T + A)* = T* + A* and (AT)* = T*A*.
(c) In this problem, we only consider (one Hilbert space H and) linear operators "from H into itself". Let T be a linear operator, and let A ∈ L(H) (so A is an everywhere defined bounded operator). Say that T commutes with A if it is the case that AT ⊂ TA.
(i) Why is the inclusion oriented the way it is? (Also answer the same question for the inclusions in (b) above!)
(ii) For T and A as above, show that
AT ⊂ TA ⇒ A*T* ⊂ T*A*;
in other words, if A and T commute, then A* and T* commute. (Hint: Fix y ∈ dom T* = dom A*T*; it is to be verified that then also A*y ∈ dom T* and T*(A*y) = A*T*y; for this, fix x ∈ dom T; then, by hypothesis, Ax ∈ dom T and TAx = ATx; then observe that
⟨x, A*T*y⟩ = ⟨Ax, T*y⟩ = ⟨T(Ax), y⟩ = ⟨ATx, y⟩ = ⟨Tx, A*y⟩,
which indeed implies - since x E dom T was arbitrary - that A*y E dom T* and that T*(A*y) = A*T*y.)
(d) For an arbitrary set S ⊂ L_c(H) (of densely defined closed operators "from H to itself"), define
S′ = {A ∈ L(H) : AT ⊂ TA ∀ T ∈ S},
and show that:
(i) S′ is a unital weakly closed subalgebra of L(H);
(ii) if S is self-adjoint - meaning that S = S* = {A* : A ∈ S} - then S′ is also self-adjoint, and is consequently a von Neumann algebra.
5.2 Symmetric and self-adjoint operators
Definition 5.2.1 A densely defined linear operator T : D (⊂ H) → H is said to be symmetric if it is the case that T ⊂ T*. A linear operator T : D (⊂ H) → H is said to be self-adjoint if T = T*.
Note that symmetric operators are assumed to be densely defined - since we have to first make sense of their adjoint! Also, observe (as a consequence of Exercise 5.1.5(2)(b)) that any (densely defined) restriction of any symmetric (in particular, a self-adjoint) operator is again symmetric, and so symmetric operators are much more common than self-adjoint ones. Some elementary consequences of the definitions are listed in the following proposition.
Proposition 5.2.2 (a) The following conditions on a densely defined linear operator T₀ : dom T₀ (⊂ H) → H are equivalent:
(i) T₀ is symmetric;
(ii) ⟨T₀x, y⟩ = ⟨x, T₀y⟩ ∀ x, y ∈ dom T₀;
(iii) ⟨T₀x, x⟩ ∈ ℝ ∀ x ∈ dom T₀.
(b) If T₀ is a symmetric operator, then
‖(T₀ ± i)x‖² = ‖T₀x‖² + ‖x‖², ∀ x ∈ dom T₀,   (5.2.5)
and, in particular, the operators (T₀ ± i) are (bounded below, and consequently) 1-1.
(c) If T₀ is symmetric, and if T is a self-adjoint extension of T₀, then necessarily T₀ ⊂ T ⊂ T₀*.

Proof. (a) Condition (ii) is essentially just a re-statement of condition (i) - see Exercise 5.1.5(2)(a); the implication (ii) ⇒ (iii) is obvious, while (iii) ⇒ (ii) is a consequence of the polarisation identity.
(b) follows from (a)(iii) - simply "expand" the left side and note that the cross terms cancel out under the hypothesis.
(c) This is an immediate consequence of Exercise 5.1.5(2)(b). □

Example 5.2.3 Consider the operator D = d/dt of differentiation, which may be applied on the space of functions which are at least once differentiable. Suppose we are interested in the interval [0,1]. We then regard the Hilbert space H = L²[0,1] - where the measure in question is usual Lebesgue measure. Consider the following possible choices of domains for this operator:
D₀ = C_c^∞(0, 1) (⊂ H),
D₁ = {f ∈ C¹([0, 1]) : f(0) = f(1) = 0},
D₂ = C¹([0, 1]).

Let us write Tⱼf = iDf, f ∈ Dⱼ, j = 0, 1, 2. Notice that T₀ ⊂ T₁ ⊂ T₂, and that if f, g ∈ D₂, then since (fḡ)′ = f′ḡ + fḡ′ and since ∫₀¹ h′(t) dt = h(1) − h(0), we see that

f(1)ḡ(1) − f(0)ḡ(0) = ∫₀¹ (fḡ)′(t) dt = ∫₀¹ ( f(t)ḡ′(t) + f′(t)ḡ(t) ) dt,

and consequently we find (what is customarily referred to as the formula obtained by "integration by parts"):

i( f(1)ḡ(1) − f(0)ḡ(0) ) = −⟨f, iDg⟩ + ⟨iDf, g⟩.
In particular, we see that T₂ is not symmetric (since there exist f, g ∈ D₂ such that f(1)ḡ(1) − f(0)ḡ(0) ≠ 0), but that T₁ (and hence also T₀) is indeed symmetric; in fact, we also see that T₂ ⊂ T₁* ⊂ T₀*. □
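The boundary-term formula above can be checked numerically; in the following sketch the choices f(t) = t² and g(t) = t + 1 (both in D₂ but not in D₁) are illustrative assumptions, not taken from the text:

```python
# Check of  i(f(1)g(1) - f(0)g(0)) = -<f, iDg> + <iDf, g>  on H = L^2[0,1],
# for the real-valued choices f(t) = t^2 and g(t) = t + 1.

def simpson(h, a=0.0, b=1.0, n=1000):
    """Composite Simpson rule for a complex-valued integrand h on [a, b]."""
    step = (b - a) / n
    s = h(a) + h(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * h(a + k * step)
    return s * step / 3

f  = lambda t: t * t
df = lambda t: 2 * t
g  = lambda t: t + 1
dg = lambda t: 1.0

# <u, v> = integral of u(t) * conj(v(t)) over [0, 1]
inner = lambda u, v: simpson(lambda t: u(t) * complex(v(t)).conjugate())

lhs = 1j * (f(1) * complex(g(1)).conjugate() - f(0) * complex(g(0)).conjugate())
rhs = -inner(f, lambda t: 1j * dg(t)) + inner(lambda t: 1j * df(t), g)
print(abs(lhs - rhs) < 1e-9)  # True: the boundary terms account for the asymmetry
```
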
Lemma 5.2.4 Suppose T : dom T (⊂ H) → H is a (densely defined) closed symmetric operator. Suppose λ ∈ ℂ is a complex number with non-zero imaginary part. Then (T − λ) maps dom T 1-1 onto a closed subspace of H.

Proof. Let λ = a + ib, with a, b ∈ ℝ, b ≠ 0. It must be observed, exactly as in equation 5.2.5 (which corresponds to the case a = 0, b = ∓1 of equation 5.2.6), that
‖(T − λ)x‖² = ‖(T − a)x‖² + b²‖x‖² ≥ b²‖x‖², ∀ x ∈ dom T,   (5.2.6)
with the consequence that (T − λ) is indeed bounded below. In particular, (T − λ) is injective. If xₙ ∈ dom T and if (T − λ)xₙ → y (say), then it follows that {xₙ} is also a Cauchy sequence, and that if x = limₙ xₙ, then (x, y + λx) belongs to the closure of the graph of the closed operator T; hence x ∈ dom T and (T − λ)x = y, thereby establishing that (T − λ) indeed maps dom T onto a closed subspace. □
In the sequel, we shall write ker S = {x ∈ dom S : Sx = 0} and ran S = {Sx : x ∈ dom S}, for any linear operator S.

Lemma 5.2.5 (a) If T : dom T (⊂ H) → K is a closed operator, then ker T is a closed subspace of H.
(b) If T : dom T (⊂ H) → K is a densely defined operator, then (ran T)^⊥ = ker T*.

Proof. (a) Suppose xₙ → x, where xₙ ∈ ker T ∀ n; then
(xₙ, 0) = (xₙ, Txₙ) → (x, 0) in H ⊕ K,
and the desired conclusion is a consequence of the assumption that G(T) is closed.
(b) If y ∈ K, note that
y ∈ (ran T)^⊥ ⇔ ⟨Tx, y⟩ = 0 ∀ x ∈ dom T ⇔ y ∈ ker T*,
as desired. □
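For matrices, Lemma 5.2.5(b) is the familiar statement that the orthogonal complement of the column space of T is the null space of T*. A quick numerical sketch (with a randomly chosen illustrative T):

```python
import numpy as np

# (ran T)-perp = ker T* for a 3x2 complex matrix T: a vector y is orthogonal
# to every column of T exactly when T* y = 0.
rng = np.random.default_rng(0)
T = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

# Orthonormal basis of ker T* via the SVD of T* (rows of Vh beyond the rank).
U, s, Vh = np.linalg.svd(T.conj().T)
kerTstar = Vh[np.count_nonzero(s > 1e-12):].conj().T  # columns span ker T*

print(kerTstar.shape[1] == 1)                        # True: dim ker T* = 3 - 2
print(np.allclose(T.conj().T @ kerTstar, 0))         # True: T* y = 0
print(np.allclose(kerTstar.conj().T @ T, 0))         # True: y is orthogonal to ran T
```
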
Proposition 5.2.6 The following conditions on a closed symmetric operator T : dom T( c H) ---> H are equivalent: (i) ran (T - i) = ran (T + i) = H; (ii) ker (T*+i) = ker (T*-i) = {O}; (iii) T is self-adjoint.
Proof. (i) ⇔ (ii): This is an immediate consequence of Lemma 5.2.5(b), Exercise 5.1.10(b)(iv), and Lemma 5.2.4.
(i) ⇒ (iii): Let x ∈ dom T* be arbitrary; since (T − i) is assumed to be onto, we can find a y ∈ dom T such that (T − i)y = (T* − i)x; on the other hand, since T ⊂ T*, we find that (x − y) ∈ ker(T* − i) = (ran(T + i))^⊥ = {0}; in particular, x ∈ dom T and T*x = Tx; i.e., also T* ⊂ T, as desired.
(iii) ⇒ (ii): If T = T*, then ker(T* ± i) = ker(T ± i) = {0}, which is trivial by Proposition 5.2.2(b). □
We now turn to the important construction of the Cayley transform of a closed symmetric operator.

Proposition 5.2.7 Let T₀ : dom T₀ (⊂ H) → H be a closed symmetric operator. Let R₀(±) = ran(T₀ ± i) = ker^⊥(T₀* ∓ i). Then,
(a) there exists a unique partial isometry U_{T₀} ∈ L(H) such that
U_{T₀}z = (T₀ − i)x  if z = (T₀ + i)x,   and   U_{T₀}z = 0  if z ∈ R₀(+)^⊥ ;   (5.2.7)
furthermore,
(i) U_{T₀} has initial (resp., final) space equal to R₀(+) (resp., R₀(−));
(ii) 1 is not an eigenvalue of U_{T₀}.
We shall refer to U_{T₀} as the Cayley transform of the closed symmetric operator T₀.
(b) If T ⊃ T₀ is a closed symmetric extension of T₀, and if the initial and final spaces of the partial isometry U_T are denoted by R(+) and R(−) respectively, then
(i) U_T "dominates" the partial isometry U_{T₀} in the sense that R(+) ⊃ R₀(+) and U_T z = U_{T₀}z, ∀ z ∈ R₀(+), so that also R(−) ⊃ R₀(−); and
(ii) 1 is not an eigenvalue of U_T.
(c) Conversely, suppose U is a partial isometry with initial space R, say, and suppose (i) U "dominates" U_{T₀} (meaning that R ⊃ R₀(+) and Uz = U_{T₀}z ∀ z ∈ R₀(+)), and (ii) 1 is not an eigenvalue of U. Then there exists a unique closed symmetric extension T ⊃ T₀ such that U = U_T.

Proof. (a) Equation 5.2.5 implies that the passage
R₀(+) ∋ z = (T₀ + i)x ↦ (T₀ − i)x = y ∈ R₀(−)   (5.2.8)
defines an isometric (linear) map of R₀(+) onto R₀(−). Hence there does indeed exist a partial isometry with initial (resp., final) space R₀(+) (resp., R₀(−)) which satisfies the rule 5.2.8; the uniqueness assertion is clear, since a partial isometry is uniquely determined by its action on its initial space.
As for (ii), suppose Uz = z for some partial isometry U. Then ‖z‖ = ‖Uz‖ = ‖UU*Uz‖ ≤ ‖U*Uz‖; deduce from Exercise 2.3.15(2)(b) that z = U*Uz = U*z. Hence 1 is an eigenvalue of a partial isometry U if and only if it is an eigenvalue of U*; and ker(U − 1) = ker(U* − 1) ⊂ ker(U*U − 1).
In particular, suppose U_{T₀}z = z for some z ∈ H. It then follows from the last paragraph that z ∈ R₀(+). Finally, if z, x, y are as in 5.2.8, then observe that
z = T₀x + ix,   y = T₀x − ix,
x = (1/2i)(z − y) = (1/2i)(1 − U_{T₀})z,   T₀x = (1/2)(z + y) = (1/2)(1 + U_{T₀})z;   (5.2.9)

and so, we find that
U_{T₀}z = z ⇒ x = (1/2i)(1 − U_{T₀})z = 0 ⇒ z = T₀x + ix = 0.
(b) If z ∈ R₀(+), then there exists a (necessarily unique, by Lemma 5.2.4) x ∈ dom T₀, y ∈ R₀(−) as in 5.2.8; then z = T₀x + ix = Tx + ix ∈ R(+) and U_T z = Tx − ix = T₀x − ix = y, and hence U_T does "dominate" U_{T₀}, as asserted in (i); assertion (ii) follows from (a)(ii) (applied with T in place of T₀).
(c) Suppose U, R are as in (c). Let D = {x ∈ H : ∃ z ∈ R such that x = z − Uz}. Then the hypothesis that U "dominates" U_{T₀} shows that
D = (1 − U)(R) ⊃ (1 − U_{T₀})(R₀(+)) = dom T₀,
and hence D is dense in H. This D will be the domain of the T that we seek.
If x ∈ D, then by the definition of D and the assumed injectivity of (1 − U), there exists a unique z ∈ R such that 2ix = z − Uz. Define Tx = (1/2)(z + Uz).
We first verify that T is closed; so suppose G(T) ∋ (xₙ, yₙ) → (x, y) ∈ H ⊕ H. Thus there exist {zₙ}ₙ ⊂ R such that 2ixₙ = zₙ − Uzₙ and 2yₙ = 2Txₙ = zₙ + Uzₙ for all n. Then
zₙ = yₙ + ixₙ  and  Uzₙ = yₙ − ixₙ;   (5.2.10)
hence zₙ → y + ix = z (say) and Uz = lim Uzₙ = y − ix; since R is closed, it follows that z ∈ R and that z − Uz = 2ix, z + Uz = 2y; hence x ∈ D and Tx = y, thereby verifying that T is indeed closed.
With x, y, z as above, we find that (for arbitrary x ∈ dom T)
⟨Tx, x⟩ = ⟨(1/2)(z + Uz), (1/2i)(z − Uz)⟩
        = (−1/4i)[ ‖z‖² − ‖Uz‖² + ⟨Uz, z⟩ − ⟨z, Uz⟩ ]
        = (−1/4i) · 2i · Im⟨Uz, z⟩ ∈ ℝ,
and conclude - see Proposition 5.2.2(a)(iii) - that T is indeed symmetric. It is clear from the definitions that indeed U = U_T and that T₀ ⊂ T. □

We spell out some facts concerning what we have called the Cayley transform in the following remark.

Remark 5.2.8 (1) The proof of Proposition 5.2.7 actually establishes the following fact: let 𝒯 denote the set of all (densely defined) closed symmetric operators "from H to itself"; then the assignment
𝒯 ∋ T ↦ U_T ∈ 𝒰   (5.2.11)
defines a bijective correspondence from 𝒯 to 𝒰, where 𝒰 is the collection of all partial isometries U on H with the following two properties: (i) (U − 1) is 1-1, and (ii) ran((U − 1)U*U) is dense in H; further, the "inverse transform" is the map
𝒰 ∋ U ↦ T = i(1 + U)(1 − U)^(−1)   (5.2.12)
(as dictated by the relations 5.2.9), where we have written (1 − U)^(−1) to denote the inverse of the 1-1 map (1 − U)|_{ran(U*U)}; thus, the domain of T is just ran((U − 1)U*U).
Note that the condition (ii) (defining the class 𝒰 in the last paragraph) is equivalent to the requirement that ker(U*U(U* − 1)) = {0}; note that U*U(U* − 1) = U* − U*U = U*(1 − U) (since U* = U*UU* for a partial isometry U); hence, condition (ii) is equivalent to the condition that ker(U*(1 − U)) = {0}; and this condition clearly implies that (1 − U) must be 1-1. Thus, (ii) ⇒ (i), and we may hence equivalently define
𝒰 = {U ∈ L(H) : U = UU*U, ker(U* − U*U) = {0}}.

(2) Suppose T is a self-adjoint operator; then it follows from Proposition 5.2.6 that the Cayley transform U_T is a unitary operator, and this is the way in which we shall derive the spectral theorem for (possibly unbounded) self-adjoint operators from the spectral theorem for unitary operators. □

It is time we introduced the terminology that is customarily used to describe many of the objects that we have already encountered.
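In the bounded (here: finite-dimensional) case, the Cayley transform and its inverse from Remark 5.2.8 can be verified directly; the matrix A below is an arbitrary illustrative self-adjoint choice:

```python
import numpy as np

# For a self-adjoint matrix A, U = (A - i)(A + i)^{-1} is unitary, 1 is not
# an eigenvalue of U, and i(1 + U)(1 - U)^{-1} recovers A.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, -3.0]])
assert np.allclose(A, A.conj().T)  # A = A*

I2 = np.eye(2)
U = (A - 1j * I2) @ np.linalg.inv(A + 1j * I2)

print(np.allclose(U.conj().T @ U, I2))                       # True: U is unitary
print(np.allclose(1j * (I2 + U) @ np.linalg.inv(I2 - U), A)) # True: A is recovered
```
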
Definition 5.2.9 Let T be a closed (densely defined) symmetric operator. The closed subspaces

  𝒟₊(T) = ker(T* − i) = {x ∈ dom T* : T*x = ix}
  𝒟₋(T) = ker(T* + i) = {x ∈ dom T* : T*x = −ix}

are called the (positive and negative) deficiency spaces of T, and the cardinal numbers δ±(T) = dim 𝒟±(T) are called the deficiency indices of T. Thus, if U_T is the Cayley transform of the closed symmetric operator T, then 𝒟₊(T) = ker U_T and 𝒟₋(T) = ker U_T*.
Corollary 5.2.10 Let T ∈ 𝒯 (in the notation of Remark 5.2.8(1)).
(a) If T ⊂ T₁ and if also T₁ ∈ 𝒯, then 𝒟±(T) ⊇ 𝒟±(T₁).
(b) A necessary and sufficient condition for T to be self-adjoint is that δ₊(T) = δ₋(T) = 0.
(c) A necessary and sufficient condition for T to admit a self-adjoint extension is that δ₊(T) = δ₋(T).

Proof. Assertions (a) and (b) are immediate consequences of Proposition 5.2.7(b) and Proposition 5.2.6. As for (c), if T₁ is a symmetric extension of T, then, by Proposition 5.2.7(b), it follows that 𝒟±(T₁) ⊂ 𝒟±(T). In particular, if T₁ is self-adjoint, then the Cayley transform U₁ of T₁ is a unitary operator which dominates the Cayley transform of T and consequently maps 𝒟₊(T)⊥ onto 𝒟₋(T)⊥; this implies that δ₊(T) = δ₋(T).
Conversely, suppose δ₊(T) = δ₋(T). Then there exists a unitary operator U₁ such that U₁|_{𝒟₊(T)⊥} = U_T. (Why?) If it so happens that 1 is an eigenvalue of U₁, define

  Ux = −x  if U₁x = x ,  and  Ux = U₁x  if x ∈ ker⊥(U₁ − 1) ,

and note that U is also a unitary operator which "dominates" U_T and which does not have 1 as an eigenvalue; so that U must be the Cayley transform of a self-adjoint extension of T. □
5.3 Spectral theorem and polar decomposition
In the following exercises, we will begin the process of applying the spectral theorem for bounded normal operators to construct some examples of unbounded closed densely defined normal operators.
Exercise 5.3.1 (1) Let H = L²(X, B, μ) and let 𝒞 denote the collection of all measurable functions φ : X → ℂ; for φ ∈ 𝒞, let M_φ be as in Example 5.1.4(1). Show that if φ, ψ ∈ 𝒞 are arbitrary, then:
(a) M_φ is densely defined.
(b) M_φ is a closed operator.
(c) M_φ* = M_φ̄.
(d) M_{αφ+ψ} ⊇ αM_φ + M_ψ, for all α ∈ ℂ.
(e) M_φ M_ψ ⊆ M_{φψ}, with dom(M_φ M_ψ) = dom M_ψ ∩ dom M_{φψ}.
(f) M_φ̄ M_φ = M_φ M_φ̄ = M_{|φ|²}.
(Hint: (a) let Eₙ = {x ∈ X : |φ(x)| ≤ n} and let Pₙ = M_{1_{Eₙ}}; show that X = ∪ₙEₙ, hence the sequence {Pₙ}ₙ converges strongly to 1, and note that ran Pₙ = {f ∈ L²(μ) : f = 0 a.e. outside Eₙ} ⊂ dom M_φ; (b) if fₙ → f in L²(μ), then there exists a subsequence {f_{n_k}}_k which converges to f a.e.; (c) first note that dom M_φ = dom M_φ̄ and deduce from Exercise 5.1.5(2)(a) that M_φ̄ ⊂ M_φ*; then observe that if Eₙ, Pₙ are as above, then ran Pₙ ⊂ dom M_φ* and in fact Pₙ M_φ* ⊂ M_φ* Pₙ = M_{φ̄1_{Eₙ}} ∈ ℒ(H); i.e., if g ∈ dom M_φ* and n is arbitrary, then Pₙ M_φ* g = φ̄ 1_{Eₙ} g a.e.; deduce (from the fact that M_φ* is closed) that M_φ* g = φ̄ g; i.e., M_φ* ⊂ M_φ̄; (d)-(f) are consequences of the definition.)

(2) Let Sₙ : 𝒟ₙ (⊂ Hₙ) → Kₙ be a linear operator, for n = 1, 2, .... Define S : 𝒟 (⊂ ⊕ₙHₙ) → ⊕ₙKₙ by (Sx)ₙ = Sₙxₙ, where 𝒟 = {x = ((xₙ))ₙ : xₙ ∈ 𝒟ₙ ∀ n, Σₙ ‖Sₙxₙ‖² < ∞}. Then show that:
(a) S is a linear operator;
(b) S is densely defined if and only if each Sₙ is;
(c) S is closed if and only if each Sₙ is.
(Hint: (a) is clear, as are the "only if" statements in (b) and (c); if 𝒟ₙ is dense in Hₙ for each n, then the set {((xₙ)) ∈ ⊕ₙHₙ : xₙ ∈ 𝒟ₙ ∀ n and xₙ = 0 for
all but finitely many n} is (a subset of 𝒟 which is) dense in ⊕ₙHₙ; as for (c), if 𝒟 ∋ x⁽ᵏ⁾ → x ∈ ⊕ₙHₙ and Sx⁽ᵏ⁾ → y ∈ ⊕ₙKₙ, where x⁽ᵏ⁾ = ((xₙ⁽ᵏ⁾)), x = ((xₙ)) and y = ((yₙ)), then we see that xₙ⁽ᵏ⁾ ∈ dom Sₙ ∀ n, xₙ⁽ᵏ⁾ → xₙ and Sₙxₙ⁽ᵏ⁾ → yₙ for all n; so the hypothesis that each Sₙ is closed would then imply that xₙ ∈ dom Sₙ and Sₙxₙ = yₙ ∀ n; i.e., x ∈ dom S and Sx = y.)

(3) If Sₙ, S are the operators in (2) above, with domains specified therein, let us use the notation S = ⊕ₙSₙ. If S is densely defined, show that S* = ⊕ₙSₙ*.

(4) Let X be a locally compact Hausdorff space and let B_X ∋ E ↦ P(E) denote a separable spectral measure on X. To each measurable function φ : X → ℂ, we wish to construct a closed operator
  ∫_X φ dP = ∫_X φ(λ) dP(λ)   (5.3.13)
satisfying the following constraints: (i) if P = P_μ, then ∫_X φ dP agrees with the operator we denoted by M_φ in (1) above; and (ii) ∫_X φ d(⊕ᵢPᵢ) = ⊕ᵢ ∫_X φ dPᵢ, for any countable direct sum of spectral measures, where the direct sum on the right is interpreted as in (2) above.
(a) Show that there exists a unique way to define an assignment 𝒞 × 𝒫 ∋ (φ, P) ↦ ∫_X φ dP ∈ ℒ_c(H) - where 𝒫 denotes the class of all separable spectral measures on (X, B) - in such a way that the requirements (i) and (ii) above are satisfied. (Hint: by the Hahn-Hellinger theorem - see Remark 3.5.13 - there exist a probability measure μ : B_X → [0, 1] and a partition {Eₙ : 0 ≤ n ≤ ℵ₀} ⊂ B_X such that P ≅ ⊕_{0≤n≤ℵ₀} (P_{μ|Eₙ})^{⊕n}; appeal to (1)(a),(b) and (2)(a),(b) above.)
(b) (An alternative description of the operator ∫_X φ dP - constructed in (a) above - is furnished by this exercise.) Let φ : X → ℂ be a measurable function and let P : B_X → ℒ(H) be a separable spectral measure; set Eₙ = {x ∈ X : |φ(x)| < n} and Pₙ = P(Eₙ). Then show that:
(i) φ1_{Eₙ} ∈ L∞(X, B, μ) and Tₙ = ∫_X φ1_{Eₙ} dP is a bounded normal operator, for each n;
(ii) in fact, if we set T = ∫_X φ dP (as in (a) above), then Tₙ = T Pₙ;
(iii) the following conditions on a vector x ∈ H are equivalent: (α) supₙ ‖Tₙx‖ < ∞; (β) {Tₙx}ₙ is a Cauchy sequence;
(iv) if we define 𝒟_φ^{(P)} to be the set of those vectors in H which satisfy the equivalent conditions of (iii), then 𝒟_φ^{(P)} is precisely equal to dom(∫_X φ dP); and further,
(v) Show that

  ( ∫_X φ dP )* = ∫_X φ̄ dP ,

and that

  ( ∫_X φ̄ dP ) ( ∫_X φ dP ) = ( ∫_X φ dP ) ( ∫_X φ̄ dP ) .

(Hint: Appeal to (1)(c) and (3) of these exercises for the assertion about the adjoint; for the other, appeal to (1)(f) and (2).)
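A toy discrete model makes part (b) concrete: take X = {0, ..., N−1} and let P({k}) be the projection onto the k-th basis vector, so that ∫_X φ dP is just the diagonal operator diag(φ(0), ..., φ(N−1)). The sketch below works only under that assumption (the helper names `P_n`, `T_n` are ours, not the text's): it checks that Tₙ = T Pₙ and that, for a vector in the domain, the truncations converge.

```python
import numpy as np

N = 40
phi = (np.arange(N) ** 2).astype(float)   # a rapidly growing "symbol" φ
T = np.diag(phi)                          # discrete model of ∫φ dP

def P_n(n):
    """P_n = P(E_n), where E_n = {x : |φ(x)| < n}."""
    return np.diag((np.abs(phi) < n).astype(float))

def T_n(n):
    """T_n = ∫ φ·1_{E_n} dP: a bounded truncation of the full operator."""
    return np.diag(np.where(np.abs(phi) < n, phi, 0.0))

# A rapidly decaying vector belongs to the domain: the norms ||T_n x||
# increase to a finite supremum, and T_n x converges to Tx.
x = np.exp(-np.arange(N, dtype=float))
norms = [np.linalg.norm(T_n(n) @ x) for n in (5, 50, 500, 5000)]
```

In the genuinely infinite-dimensional setting the same picture holds, except that vectors with slowly decaying coordinates fall outside the domain because supₙ ‖Tₙx‖ = ∞.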
Remark 5.3.2 (1) The operator ∫_X φ dP constructed in the preceding exercises is an example of an unbounded normal operator, meaning a densely defined operator T which satisfies the condition TT* = T*T - where, of course, this equality of two unbounded operators is interpreted as the equality of their graphs.
(2) Thus, for example, if T is a bounded normal operator on a separable Hilbert space H, and if P_T(E) = 1_E(T) for E ∈ B_{σ(T)}, and if φ : σ(T) → ℂ is any measurable function, then we shall define (the "unbounded functional calculus" for T by considering) the (in general unbounded) operator φ(T) by the rule

  φ(T) = ∫_{σ(T)} φ dP_T .   (5.3.14)
It should be noted that it is not strictly necessary that the function φ be defined everywhere; it suffices that it be defined P-a.e. (i.e., in the complement of some set N such that P(N) = 0). Thus, for instance, so long as λ₀ is not an eigenvalue of T, we may talk of the operator φ(T) where φ(λ) = (λ − λ₀)⁻¹ - this follows easily from Exercise 4.3.8. □
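For a matrix, the rule 5.3.14 amounts to applying φ to the eigenvalues. The following sketch (an illustration with a real symmetric stand-in for T; the helper `phi_of_T` is a hypothetical name) checks the functional calculus on the resolvent function φ(λ) = (λ − λ₀)⁻¹ for a λ₀ outside the spectrum:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
T = (B + B.T) / 2                 # real symmetric stand-in for T = T*
lam, V = np.linalg.eigh(T)        # spectral decomposition T = V diag(lam) V^T

def phi_of_T(phi):
    """Functional calculus: φ(T) = ∫ φ dP_T = Σ_k φ(λ_k)·(projection onto k-th eigenvector)."""
    return (V * phi(lam)) @ V.T

lam0 = lam.max() + 1.0            # λ0 is not an eigenvalue of T
R = phi_of_T(lambda t: 1.0 / (t - lam0))
```

For an unbounded self-adjoint T, φ(T) defined this way is bounded exactly when φ is essentially bounded on the spectrum, which is why the resolvent above is the standard example.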
We are now ready for the spectral theorem.

Theorem 5.3.3 (Spectral theorem for unbounded self-adjoint operators) (a) If T is a self-adjoint operator on H, then there exists a spectral measure (or equivalently, a projection-valued strongly continuous mapping) B_ℝ ∋ E ↦ P(E) ∈ ℒ(H) such that

  T = ∫_ℝ λ dP(λ)   (5.3.15)

in the sense of equation 5.3.13.
(b) Furthermore, the spectral measure associated with a self-adjoint operator is unique in the following sense: if Tᵢ is a self-adjoint operator on Hᵢ with associated spectral measure Pᵢ : B_ℝ → ℒ(Hᵢ), for i = 1, 2, then the following conditions on a unitary operator W : H₁ → H₂ are equivalent:
(i) T₂ = W T₁ W*;
(ii) P₂(E) = W P₁(E) W* ∀ E ∈ B_ℝ.
Proof. (a) If U = U_T is the Cayley transform of T, then U is a unitary operator on H; let P_U : B_{σ(U)} → ℒ(H) be the associated spectral measure (defined by P_U(E) = 1_E(U)). The functions
  f : ℝ → 𝕋 − {1} , f(t) = (t − i)/(t + i) ,  and  g : 𝕋 − {1} → ℝ , g(z) = i(1 + z)/(1 − z) ,
are easily verified to be homeomorphisms which are inverses of one another. Our earlier discussion on Cayley transforms shows that U − 1 is 1-1 and hence P_U({1}) = 0; hence f ∘ g = id_{σ(U)} (P_U-a.e.). Our construction of the Cayley transform also shows that T = g(U) in the sense of the unbounded functional calculus - see 5.3.14; it is a routine matter to now verify that equation 5.3.15 is indeed satisfied if we define the spectral measure P : B_ℝ → ℒ(H) by the prescription P(E) = P_U(f(E)).
(b) This is an easy consequence of the fact that if we let Pₙ⁽ⁱ⁾ = Pᵢ([n − 1, n)) and Tₙ⁽ⁱ⁾ = Tᵢ|_{ran Pₙ⁽ⁱ⁾}, then Tᵢ = ⊕_{n∈ℤ} Tₙ⁽ⁱ⁾. □

We now introduce the notion, due to von Neumann, of an unbounded operator being affiliated to a von Neumann algebra. This relies on the notion - already discussed in Exercise 5.1.10(c) - of a bounded operator commuting with an unbounded one.

Proposition 5.3.4 Let M ⊂ ℒ(H) be a von Neumann algebra; then the following conditions on a linear operator T "from H into itself" are equivalent:
(i) A′T ⊂ TA′, ∀ A′ ∈ M′;
(ii) U′T = TU′, ∀ unitary U′ ∈ M′;
(iii) U′TU′* = T, ∀ unitary U′ ∈ M′.
If it is further true that T = T*, then the preceding three conditions are all equivalent to the following fourth condition:
(iv) P_T(E) ∈ M ∀ E ∈ B_ℝ, where P_T denotes the unique spectral measure associated with T, as in Theorem 5.3.3.

If T and M are as above, then we shall say that T is affiliated with M and we shall write T η M.

Proof. (i) ⇒ (ii): The hypothesis (i) amounts to the requirement that if x ∈ dom T, and if A′ ∈ M′ is arbitrary, then A′x ∈ dom T and TA′x = A′Tx. In particular, if U′ is any unitary element of M′, then U′x, U′*x ∈ dom T and TU′x = U′Tx and TU′*x = U′*Tx. Since U′ is a bijective map, it follows easily from this that we actually have TU′ = U′T.
The implication (ii) ⇔ (iii) is an easy exercise, while the implication (ii) ⇒ (i) is a consequence of the simple fact that every element A′ ∈ M′ is expressible as a linear combination of (no more than four) unitary elements of M′. (Proof of the
above fact: by looking at the Cartesian decomposition, and using the fact that M′ is closed under taking "real" and "imaginary" parts, it is seen that it suffices to prove that any self-adjoint element of a C*-algebra is a linear combination of two unitary elements; but this is true because, if A is self-adjoint and if ‖A‖ ≤ 1, then we may write A = ½(U₊ + U₋), where U± = g±(A), and g±(t) = t ± i√(1 − t²).)
Finally, the implication (iii) ⇔ (iv) - when T is self-adjoint - is a special case of Theorem 5.3.3(b) (applied to the case Tᵢ = T, i = 1, 2). □

It should be noted that if T is a (possibly unbounded) self-adjoint operator on H, if φ ↦ φ(T) denotes the associated "unbounded functional calculus", and if M ⊂ ℒ(H) is a von Neumann algebra of bounded operators on H, then T η M ⇒ φ(T) η M for every measurable function φ : ℝ → ℂ; also, it should be clear - from the double commutant theorem, for instance - that if T ∈ ℒ(H) is an everywhere defined bounded operator, then T η M ⇔ T ∈ M. We collect some elementary properties of this notion of affiliation in the following exercises.
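The decomposition used in the proof above - any self-adjoint contraction A is the mean of the two unitaries U± = g±(A) = A ± i√(1 − A²) - can be verified numerically. A minimal sketch, with a random Hermitian matrix as stand-in (the `clip` guards against tiny negative values of 1 − λ² caused by rounding):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2
A = A / np.linalg.norm(A, 2)           # self-adjoint with ||A|| <= 1

lam, V = np.linalg.eigh(A)
root = np.sqrt(np.clip(1.0 - lam ** 2, 0.0, None))   # spectral sqrt(1 - A^2)
U_plus = (V * (lam + 1j * root)) @ V.conj().T        # g_+(A) = A + i*sqrt(1 - A^2)
U_minus = (V * (lam - 1j * root)) @ V.conj().T       # g_-(A) = A - i*sqrt(1 - A^2)
```

Each eigenvalue λ ± i√(1 − λ²) lies on the unit circle, which is exactly why g±(A) are unitary.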
Exercise 5.3.5 (1) Let S, T denote linear operators "from H into itself", and let M ⊂ ℒ(H) be a von Neumann algebra such that S, T η M. Show that:
(i) if T is closable, then T̄ η M;
(ii) if T is densely defined, then also T* η M;
(iii) ST η M.
(2) Given any family 𝒮 of linear operators "from H into itself", let 𝒲_𝒮 denote the collection of all von Neumann algebras M ⊂ ℒ(H) with the property that T η M ∀ T ∈ 𝒮; and define

  W*(𝒮) = ∩ {M : M ∈ 𝒲_𝒮} .

Then show that:
(i) W*(𝒮) is the smallest von Neumann subalgebra M of ℒ(H) with the property that T η M ∀ T ∈ 𝒮;
(ii) if 𝒮 ⊂ 𝒯 ⊂ ℒ_c(H), then

  W*(𝒮) ⊂ W*(𝒯) = W*(𝒯*) = (𝒯 ∪ 𝒯*)′′ ,

where, of course, 𝒯* = {T* : T ∈ 𝒯}, and 𝒮′ = {A ∈ ℒ(H) : AT ⊂ TA ∀ T ∈ 𝒮} (as in Exercise 5.1.10(d)).
(3) Let (X, B) be any measurable space and suppose B ∋ E ↦ P(E) ∈ ℒ(H) is a countably additive projection-valued spectral measure (taking values in projection operators in ℒ(H)), where H is assumed to be a separable Hilbert space with orthonormal basis {eₙ}_{n∈ℕ}. For x, y ∈ H, let p_{x,y} : B → ℂ be the complex measure defined by p_{x,y}(E) = (P(E)x, y). Let us write p_α = Σₙ αₙ p_{eₙ,eₙ}, when either α = ((αₙ))ₙ ∈ ℓ¹
or α ∈ (ℝ₊)^ℕ (so that p_α makes sense as a finite complex measure or a (possibly infinite-valued) positive measure defined on (X, B)).
(a) Show that the following conditions on a set E ∈ B are equivalent: (i) P(E) = 0; (ii) p_α(E) = 0 for some α = ((αₙ))ₙ where αₙ > 0 ∀ n ∈ ℕ; (iii) p_α(E) = 0 for every α = ((αₙ))ₙ where αₙ ≥ 0 ∀ n ∈ ℕ.
(b) Show that the collection N_P = {E ∈ B : P(E) = 0} is a σ-ideal in B - meaning that if {Nₙ : n ∈ ℕ} ⊂ N_P and E ∈ B are arbitrary, then ∪_{n∈ℕ} Nₙ ∈ N_P and N₁ ∩ E ∈ N_P.
(c) Let ℒ⁰(X, B) denote the algebra of all (B, B_ℂ)-measurable complex-valued functions on X. Let Z_P denote the set of functions h ∈ ℒ⁰(X, B) such that h = 0 P-a.e. (meaning, of course, that P({x : h(x) ≠ 0}) = 0). Verify that Z_P is an ideal in ℒ⁰(X, B) which is closed under pointwise convergence of sequences; in fact, show more generally, that if {fₙ} is a sequence in Z_P and if f ∈ ℒ⁰(X, B) is such that fₙ → f P-a.e., then f ∈ Z_P.
(d) With the notation of (c) above, define the quotient spaces

  L⁰(X, B, P) = ℒ⁰(X, B)/Z_P   (5.3.16)
  L∞(X, B, P) = ℒ∞(X, B)/(ℒ∞(X, B) ∩ Z_P)   (5.3.17)
where we write ℒ∞(X, B) to denote the class of all P-a.e. bounded measurable complex-valued functions on X. Verify that L∞(X, B, P) is a Banach space with respect to the norm defined by

  ‖f‖_{L∞(X,B,P)} = inf { sup{|f(x)| : x ∈ E} : (X − E) ∈ N_P } .
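A toy discrete illustration of the norm in (d), under the assumption that X = {0, ..., 9} and P({k}) is a rank-one projection for k in a chosen support and zero otherwise (so the complement of the support is a P-null set): functions agreeing P-a.e. have equal L∞(X, B, P) norms, and their difference has norm 0.

```python
import numpy as np

SUPP = np.array([0, 1, 2, 3, 4, 5, 6])   # points with P({k}) != 0
NULL = np.array([7, 8, 9])               # {7, 8, 9} then lies in N_P

def norm_Linf_P(f):
    """||f||_{L∞(X,B,P)} = inf over P-null N of sup_{X-N}|f| = sup over the support."""
    return float(np.max(np.abs(f[SUPP])))

f = np.zeros(10)
f[SUPP] = np.linspace(0.0, 1.0, len(SUPP))
g = f.copy()
g[NULL] = 1e6                            # g = f  P-a.e., wildly different off-support
```

This is exactly the quotient by Z_P in coordinates: whatever a function does on a P-null set is invisible to the norm.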
(4) Let T be a self-adjoint operator on H, and let P_T : B_ℝ → ℒ(H) be the associated spectral measure as in Theorem 5.3.3. Show that:
(i) we have a bounded functional calculus for T; i.e., there exists a unique *-algebra isomorphism L∞(ℝ, B_ℝ, P_T) ∋ φ ↦ φ(T) ∈ W*({T}) with the property that 1_E ↦ 1_E(T) (= P_T(E)) for all E ∈ B_ℝ; and
(ii) if S ∈ ℒ_c(H) is such that S η W*({T}), then there exists φ ∈ L⁰(ℝ, B_ℝ, P_T) such that S = φ(T) in the sense that S = ∫ φ(λ) dP(λ).

Before we can get to the polar decomposition for unbounded operators, we should first identify just what we should mean by an unbounded self-adjoint operator being positive. We begin with a lemma.

Lemma 5.3.6 Let T ∈ ℒ_c(H, K). Then,
(a) for arbitrary z ∈ H, there exists a unique x ∈ dom(T*T) such that z = x + T*Tx;
(b) in fact, there exists an everywhere defined bounded operator A ∈ ℒ(H) such that
(i) A is a positive 1-1 operator satisfying ‖A‖ ≤ 1; and
(ii) (1 + T*T)⁻¹ = A, meaning that ran A = dom(1 + T*T) = {x ∈ dom T : Tx ∈ dom T*} and Az + T*T(Az) = z ∀ z ∈ H.

Proof. Temporarily fix z ∈ H; then, by Proposition 5.1.6, there exist unique elements x ∈ dom(T) and y ∈ dom(T*) such that

  (0, z) = (y, T*y) − (Tx, −x) .
Thus, z = T*y + x, where y = Tx; i.e., x ∈ dom(T*T) and z = (1 + T*T)x, thereby establishing (a). With the preceding notation, define Az = x, so that A : H → H is an everywhere defined (clearly) linear operator satisfying (1 + T*T)A = 1, so that A is necessarily 1-1. Notice that
  (Az, z) = (x, z) = ‖x‖² + ‖Tx‖² ,   (5.3.18)
and hence ‖Az‖² = ‖x‖² ≤ (Az, z) ≤ ‖Az‖ ‖z‖. In other words, A is indeed a bounded positive contraction; i.e., 0 ≤ A ≤ 1. Hence (b)(i) is true, and (b)(ii) is a consequence of the construction. □

Proposition 5.3.7 (1) The following conditions on a closed densely defined operator S ∈ ℒ_c(H) are equivalent:
(i) S is self-adjoint and (Sx, x) ≥ 0 ∀ x ∈ dom S;
(i)′ S is self-adjoint and σ(S) ⊂ [0, ∞), where of course σ(S) is called the spectrum of S, and ρ(S) = ℂ − σ(S) is the resolvent set of S which consists of those complex scalars λ for which (S − λ)⁻¹ ∈ ℒ(H); (thus, λ ∉ σ(S) precisely when (S − λ) maps dom S 1-1 onto all of H;)
(ii) there exists a self-adjoint operator T acting in H such that S = T²;
(iii) there exists T ∈ ℒ_c(H, K) (for some Hilbert space K) such that S = T*T.
A linear operator S ∈ ℒ_c(H) is said to be a positive self-adjoint operator if the preceding equivalent conditions are met.
(2) Furthermore, a positive operator has a unique positive square root - meaning that if S is a positive self-adjoint operator acting in H, then there exists a unique positive self-adjoint operator acting in H - which is usually denoted by the symbol S^{1/2} - such that S = (S^{1/2})².

Proof. (i) ⇔ (i)′: If S is self-adjoint, let P : B_ℝ → ℒ(H) be the (projection-valued) spectral measure associated with S as in Theorem 5.3.3; as usual, if x ∈ H, let p_{x,x} denote the (genuine scalar-valued finite) positive measure defined by p_{x,x}(E) = (P(E)x, x) = ‖P(E)x‖². Given that S = ∫_ℝ λ dP(λ), it is easy to
see that the (second part of) condition (i) (resp., (i)′) amounts to saying that (α) ∫_ℝ λ dp_{x,x}(λ) ≥ 0 for all x ∈ H (resp., (α)′ P is supported in [0, ∞) - meaning that P((−∞, 0)) = 0); the proof of the equivalence (i) ⇔ (i)′ reduces now to the verification that conditions (α) and (α)′ are equivalent; but this is an easy verification.
(i) ⇒ (ii): According to the hypothesis, we are given that S = ∫_ℝ λ dP(λ), where P : B_ℝ → ℒ(H) is some spectral measure which is actually supported on the positive line [0, ∞). Now, let T = ∫_ℝ λ^{1/2} dP(λ), and deduce (from Exercise 5.3.1(1)(c)) that T is indeed a self-adjoint operator and that S = T².
(ii) ⇒ (iii): Obvious.
(iii) ⇒ (i): It is seen from Lemma 5.3.6 that (1 + T*T)⁻¹ = A is a positive bounded operator A (which is 1-1). It follows that σ(A) ⊂ [0, 1] and that A = ∫_{[0,1]} λ dP_A(λ), where P_A : B_{[0,1]} → ℒ(H) is the spectral measure given by P_A(E) = 1_E(A). Now it follows readily that T*T is the self-adjoint operator given by S = T*T = ∫_{[0,1]} ((1/λ) − 1) dP_A(λ); finally, since ((1/λ) − 1) ≥ 0 P_A-a.e., we find also that (Sx, x) ≥ 0.
(2) Suppose that S is a positive operator, and suppose T is some positive self-adjoint operator such that S = T². Let P = P_T be the spectral measure associated with T, so that T = ∫_ℝ λ dP(λ). Define M = W*({T}) (≅ L∞(ℝ, B_ℝ, P)); then T η M by definition, and so - see Exercise 5.3.5(1)(iii) - also S η M. It follows from Proposition 5.3.4 that

  S η M ⇒ 1_E(S) ∈ M ∀ E ∈ B_{ℝ₊} .

This implies that W*({S}) = {1_E(S) : E ∈ B_ℝ}′′ ⊂ M. If we use the symbol S^{1/2} to denote the positive square root of S that was constructed in the proof of (i) ⇒ (ii) above, note that S^{1/2} η W*({S}) ⊂ M; let φ ∈ L⁰(ℝ, B_ℝ, P) be such that S^{1/2} = φ(T); the properties of the operators concerned now imply that φ is a measurable function which is (i) non-negative P-a.e., and (ii) satisfies φ(λ)² = λ² P-a.e.; it follows that φ(λ) = λ P-a.e., thus establishing the uniqueness of the positive square root. □

Theorem 5.3.8 Let T ∈ ℒ_c(H, K). Then there exists a unique pair U, |T| of operators satisfying the following conditions:
(i) T = U|T|;
(ii) U ∈ ℒ(H, K) is a partial isometry;
(iii) |T| is a positive self-adjoint operator acting in H; and
(iv) ker T = ker U = ker |T|.

Furthermore, |T| is the unique positive square root of the positive self-adjoint operator T*T.
Proof. As for the uniqueness assertion, first deduce from condition (iv) that U*U|T| = |T|, and then deduce that |T| has to be the unique positive square root of T*T; the uniqueness of the U in the polar decomposition is proved exactly as in the bounded case.
Let Pₙ = 1_{[0,n]}(T*T) and let Hₙ = ran Pₙ. Then {Pₙ} is an increasing sequence of projections such that Pₙ → 1 strongly. Observe also that Hₙ ⊂ dom φ(T*T) for every continuous (and more generally, any "locally bounded" measurable) function φ : ℝ → ℂ, and that, in particular,

  x ∈ Hₙ ⇒ x ∈ dom T*T and ‖Tx‖ = ‖ |T|x ‖ .   (5.3.19)

(Reason: ‖Tx‖² = (Tx, Tx) = (|T|²x, x) = ‖|T|x‖².) An appeal to Exercise 3.4.12 now ensures the existence of an isometric operator U₀ : ran |T| → K such that U₀(|T|x) = Tx for all x ∈ ∪ₙHₙ. Let U = U₀ ∘ 1_{(0,∞)}(|T|). Then it is clear that U is a partial isometry with initial space ker⊥|T|, and that U|T|x = Tx for all x ∈ ∪ₙHₙ.
We now claim that the following conditions on a vector x ∈ H are equivalent:
(i) x ∈ dom T;
(ii) supₙ ‖TPₙx‖ < ∞;
(ii)′ supₙ ‖|T|Pₙx‖ < ∞;
(i)′ x ∈ dom |T|.
The equivalence (ii) ⇔ (ii)′ is an immediate consequence of equation (5.3.19); then, for any x ∈ H and n ∈ ℕ, we see that |T|Pₙx is the sum of the family {|T|(Pⱼ − Pⱼ₋₁)x : 1 ≤ j ≤ n} of mutually orthogonal vectors - where we write P₀ = 0 - and hence condition (ii)′ is seen to be equivalent to the convergence of the (orthogonal, as well as collapsing) sum

  Σ_{j=1}^{∞} |T|(Pⱼ − Pⱼ₋₁)x = limₙ |T|Pₙx .

Since Pₙx → x, the fact that |T| is closed shows that the convergence of the displayed limit above would imply that x ∈ dom |T| and that |T|x = limₙ |T|Pₙx; hence (ii)′ ⇒ (i)′. On the other hand, if x ∈ dom |T|, then since |T| and Pₙ commute, we have:

  |T|x = limₙ Pₙ|T|x = limₙ |T|Pₙx ,

and the sequence {‖|T|Pₙx‖}ₙ is convergent and necessarily bounded, thus (i)′ ⇒ (ii)′.
We just saw that if x ∈ dom |T|, then {|T|Pₙx}ₙ is a convergent sequence; the fact that the partial isometry U has initial space given by ∪ₙHₙ implies
that also {TPₙx = U(|T|Pₙx)}ₙ is convergent; since T is closed, this implies x ∈ dom T, and so (i)′ ⇒ (i).
We claim that the implication (i) ⇒ (i)′ would be complete once we prove the following assertion: if x ∈ dom T, then there exists a sequence {x_k}_k ⊂ ∪ₙHₙ such that x_k → x and Tx_k → Tx. (Reason: if this assertion is true, if x ∈ dom T, and if x_k ∈ ∪ₙHₙ is as in the assertion, then ‖|T|x_k − |T|x_l‖ = ‖Tx_k − Tx_l‖ (by equation (5.3.19)) and hence also {|T|x_k}_k is a (Cauchy, hence convergent) sequence; and finally, since |T| is closed, this would ensure that x ∈ dom |T|.)
In order for the assertion of the previous paragraph to be false, it must be the case that the graph G(T) properly contains the closure of the graph of the restriction T₀ of T to ∪ₙHₙ; equivalently, there must be a non-zero vector y ∈ dom T such that (y, Ty) is orthogonal to (x, Tx) for every x ∈ ∪ₙHₙ; i.e.,

  x ∈ ∪ₙHₙ ⇒ (x, y) + (Tx, Ty) = 0
           ⇒ ((1 + T*T)x, y) = 0 ;

hence y ∈ ((1 + T*T)(∪ₙHₙ))⊥. On the other hand, if z ∈ dom T*T = dom |T|², notice that

  (1 + T*T)z = (1 + |T|²)z = limₙ Pₙ(1 + |T|²)z = limₙ (1 + |T|²)Pₙz ,

and hence (ran(1 + T*T))⊥ = ((1 + T*T)(∪ₙHₙ))⊥. Thus, we find that in order for the implication (i) ⇒ (i)′ to be false, we should be able to find a non-zero vector y such that y ∈ (ran(1 + T*T))⊥; but by Lemma 5.3.6(a), there is no such non-zero vector y. Thus, we have indeed completed the proof of the equation dom T = dom |T|. Finally, since the equation T = U|T| has already been verified on dom |T|, the proof of the existence half of the polar decomposition is also complete. □
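For matrices, Theorem 5.3.8 reduces to the familiar polar decomposition, and every ingredient of the proof can be computed: |T| as the positive square root of T*T (via the spectral theorem), and U as the partial isometry sending |T|x to Tx. A hedged sketch (the pseudo-inverse implements "extend by 0 on ker |T|"; for the random T below, T*T happens to be invertible, so U is in fact an isometry):

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((5, 3))        # T : R^3 -> R^5, everywhere defined

# |T| = unique positive square root of T*T, via the spectral theorem:
lam, V = np.linalg.eigh(T.T @ T)
absT = (V * np.sqrt(np.clip(lam, 0.0, None))) @ V.T

# U|T|x = Tx on ran|T|, and U = 0 on ker|T| = (ran|T|)^perp:
U = T @ np.linalg.pinv(absT)
```

The point of the chapter is that for an unbounded closed densely defined T the same formulas survive, once "square root" and "partial isometry" are interpreted through the spectral theorem rather than through eigendecompositions.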
Appendix
A.1 Some linear algebra
In this section, we quickly go through some basic linear algebra - i.e., the study of finite-dimensional vector spaces. Although we have restricted ourselves to complex vector spaces in the text, it might be fruitful to discuss vector spaces over general fields in this appendix. We begin by recalling the definition of a field.

Definition A.1.1 A field is a set - which we will usually denote by the symbol 𝕂 - which is equipped with two binary operations called addition and multiplication, respectively, such that the following conditions are satisfied:

(1) (Addition axioms) There exists a map 𝕂 × 𝕂 ∋ (α, β) ↦ (α + β) ∈ 𝕂, such that the following conditions hold, for all α, β, γ ∈ 𝕂:
(i) (commutativity) α + β = β + α;
(ii) (associativity) (α + β) + γ = α + (β + γ);
(iii) (zero) there exists an element in 𝕂, always denoted simply by 0, such that α + 0 = α;
(iv) (negatives) there exists an element in 𝕂, always denoted by −α, with the property that α + (−α) = 0.
(2) (Multiplication axioms) There exists a map 𝕂 × 𝕂 ∋ (α, β) ↦ (αβ) ∈ 𝕂, such that the following conditions hold, for all α, β, γ ∈ 𝕂:
(i) (commutativity) αβ = βα;
(ii) (associativity) (αβ)γ = α(βγ);
(iii) (one) there exists an element in 𝕂, always denoted simply by 1, such that 1 ≠ 0 and α1 = α;
(iv) (inverses) if α ≠ 0, there exists an element in 𝕂, always denoted by α⁻¹, with the property that αα⁻¹ = 1.
(3) (Distributive law) Addition and multiplication are related by the following axiom, valid for all α, β, γ ∈ 𝕂:

  α(β + γ) = αβ + αγ .

Some simple properties of a field are listed in the following exercises.
Exercise A.1.2 Let 𝕂 be a field.

(1) Show that if α, β, γ ∈ 𝕂 are such that α + β = α + γ, then β = γ; deduce, in particular, that the additive identity 0 is unique, and that the additive inverse −α is uniquely determined by α.

(2) Show that if α, β, γ ∈ 𝕂 are such that αβ = αγ and if α ≠ 0, then β = γ; deduce, in particular, that the multiplicative identity 1 is unique, and that the multiplicative inverse α⁻¹ of a non-zero element α is uniquely determined by α.

(3) Show that α · 0 = 0 for all α ∈ 𝕂; and conversely, if αβ = 0 for some β ≠ 0, show that α = 0. (Thus, a field has no "zero divisors".)

(4) Prove that if α₁, ..., αₙ ∈ 𝕂, and if π ∈ Sₙ is any permutation of {1, ..., n}, then

  α₁ + (α₂ + (⋯ + (αₙ₋₁ + αₙ)⋯)) = α_{π(1)} + (α_{π(2)} + (⋯ + (α_{π(n−1)} + α_{π(n)})⋯)) ,

so that the expression Σᵢ₌₁ⁿ αᵢ may be given an unambiguous meaning.

(5) If α₁ = ⋯ = αₙ = α in (4), define nα = Σᵢ₌₁ⁿ αᵢ. Show that if m, n are arbitrary positive integers, and if α ∈ 𝕂, then
(i) (m + n)α = mα + nα;
(ii) (mn)α = m(nα);
(iii) −(nα) = n(−α);
(iv) if we define 0α = 0 and (−n)α = −(nα) (for n ∈ ℕ), then (i) and (ii) are valid for all m, n ∈ ℤ;
(v) (n·1)α = nα, where the 1 on the left denotes the 1 in 𝕂 and we write n·1 for Σᵢ₌₁ⁿ 1.

Remark A.1.3 If 𝕂 is a field, there are two possibilities: either (i) m·1 ≠ 0 ∀ m ∈ ℕ, or (ii) there exists a positive integer m such that m·1 = 0; if we let p denote the smallest positive integer with this property, it follows from (5)(ii), (5)(v) and (3) of Exercise A.1.2 that p should necessarily be a prime number. We say that the field 𝕂 has characteristic equal to 0 or p according as possibility (i) or (ii) occurs for 𝕂.

Example A.1.4 (1) The sets of real (ℝ), complex (ℂ) and rational (ℚ) numbers are examples of fields of characteristic 0.
(2) Consider ℤ_p, the set of congruence classes of integers modulo p, where p is a prime number. If we define addition and multiplication "modulo p", then ℤ_p is a field of characteristic p, which has the property that it has exactly p elements.
For instance, addition and multiplication in the field ℤ₃ = {0, 1, 2} are given thus:

  0 + x = x ∀ x ∈ {0, 1, 2}; 1 + 1 = 2, 1 + 2 = 0, 2 + 2 = 1;
  0 · x = 0, 1 · x = x ∀ x ∈ {1, 2}, 2 · 2 = 1.
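The tables above are just arithmetic modulo 3. A minimal sketch in code, valid for any prime p; the multiplicative inverse is computed via Fermat's little theorem (a⁻¹ ≡ a^{p−2} mod p), a method added here as an aside and not taken from the text:

```python
p = 3  # any prime

# The full addition and multiplication tables of Z_p:
add = {(a, b): (a + b) % p for a in range(p) for b in range(p)}
mul = {(a, b): (a * b) % p for a in range(p) for b in range(p)}

def inverse(a):
    """Multiplicative inverse of a nonzero class a in Z_p (Fermat: a^(p-2) mod p)."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, p - 2, p)
```

Primality of p is what guarantees every nonzero class is invertible; for composite moduli the same tables define only a commutative ring with zero divisors.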
(3) Given any field 𝕂, let 𝕂[t] denote the set of all polynomials in the indeterminate variable t, with coefficients coming from 𝕂; thus, the typical element of 𝕂[t] has the form p = α₀ + α₁t + α₂t² + ⋯ + αₙtⁿ, where αᵢ ∈ 𝕂 ∀ i. If we add and multiply polynomials in the usual fashion, then 𝕂[t] is a commutative ring with identity - meaning that it has addition and multiplication defined on it, which satisfy all the axioms of a field except for the axiom demanding the existence of inverses of non-zero elements. Consider the set of all expressions of the form p/q, where p, q ∈ 𝕂[t]; regard two such expressions, say pᵢ/qᵢ, i = 1, 2, as being equivalent if p₁q₂ = p₂q₁; it can then be verified that the collection of all (equivalence classes of) such expressions forms a field 𝕂(t), with respect to the natural definitions of addition and multiplication.
(4) If 𝕂 is any field, consider two possibilities: (a) char 𝕂 = 0; in this case, it is not hard to see that the mapping ℤ ∋ m ↦ m·1 ∈ 𝕂 is a 1-1 map, which extends to an embedding of the field ℚ into 𝕂 via m/n ↦ (m·1)(n·1)⁻¹; (b) char 𝕂 = p ≠ 0; in this case, we see that there is a natural embedding of the field ℤ_p into 𝕂. In either case, the "subfield generated by 1" is called the prime field of 𝕂; (thus, we have seen that this prime field is isomorphic to ℚ or ℤ_p, according as the characteristic of 𝕂 is 0 or p). □

Before proceeding further, we first note that the definition of a vector space that we gave in Definition 1.1.1 makes perfectly good sense if we replace every occurrence of ℂ in that definition with 𝕂 - see Remark 1.1.2 - and the result is called a vector space over the field 𝕂. Throughout this section, the symbol V will denote a vector space over 𝕂. As in the case of ℂ, there are some easy examples of vector spaces over any field 𝕂, given thus: for each positive integer n, the set 𝕂ⁿ = {(α₁, ..., αₙ) : αᵢ ∈ 𝕂 ∀ i} acquires the structure of a vector space over 𝕂 with respect to coordinate-wise definitions of vector addition and scalar multiplication. The reader should note that if 𝕂 = ℤ_p, then 𝕂ⁿ is a vector space with pⁿ elements and is, in particular, a finite set! (The vector spaces ℤ₂ⁿ play a central role in practical day-to-day affairs related to computer programming, coding theory, etc.)

Recall (from §1.1) that a subset W ⊂ V is called a subspace of V if x, y ∈ W, α ∈ 𝕂 ⇒ (αx + y) ∈ W. The following exercise is devoted to a notion which is the counterpart, for vector spaces and subspaces, of something seen several times in the course of this text - for instance, Hilbert spaces and closed subspaces, Banach algebras and closed subalgebras, C*-algebras and C*-subalgebras, etc.; its content is that there exist two descriptions - one existential, and one "constructive" - of the "sub-object", and that both descriptions describe the same object.
Exercise A.1.5 Let V be a vector space over 𝕂.
(a) If {Wᵢ : i ∈ I} is any collection of subspaces of V, show that ∩_{i∈I} Wᵢ is a subspace of V.
(b) If S ⊂ V is any subset, define ∨S = ∩{W : W is a subspace of V, and S ⊂ W}, and show that ∨S is a subspace which contains S and is the smallest such subspace; we refer to ∨S as the subspace generated by S; conversely, if W is a subspace of V and if S is any set such that W = ∨S, we shall say that S spans or generates W, and we shall call S a spanning set for W.
(c) If S₁ ⊂ S₂ ⊂ V, then show that ∨S₁ ⊂ ∨S₂.
(d) Show that ∨S = {Σᵢ₌₁ⁿ αᵢxᵢ : αᵢ ∈ 𝕂, xᵢ ∈ S, n ∈ ℕ}, for any subset S of V. (The expression Σᵢ₌₁ⁿ αᵢxᵢ is called a linear combination of the vectors x₁, ..., xₙ; thus the subspace spanned by a set is the collection of all linear combinations of vectors from that set.)
Lemma A.1.6 The following conditions on a set S ⊂ (V − {0}) are equivalent:
(i) if S₀ is a proper subset of S, then ∨S₀ ≠ ∨S;
(ii) if n ∈ ℕ, if x₁, x₂, ..., xₙ are distinct elements of S, and if α₁, α₂, ..., αₙ ∈ 𝕂 are such that Σᵢ₌₁ⁿ αᵢxᵢ = 0, then αᵢ = 0 ∀ i.
A set S which satisfies the above conditions is said to be linearly independent; and a set which is not linearly independent is said to be linearly dependent.

Proof. (i) ⇒ (ii): If Σᵢ₌₁ⁿ αᵢxᵢ = 0, with x₁, ..., xₙ being distinct elements of S and if some coefficient, say αⱼ, is not 0, then we find that xⱼ = Σ_{i≠j} βᵢxᵢ, where βᵢ = −αᵢ/αⱼ; this implies that ∨(S − {xⱼ}) = ∨S, contradicting the hypothesis (i).
(ii) ⇒ (i): Suppose (ii) is satisfied and suppose S₀ ⊂ (S − {x}); (thus S₀ is a proper subset of S;) then we assert that x ∉ ∨S₀; if this assertion were false, we should be able to express x as a linear combination, say x = Σᵢ₌₁ⁿ βᵢxᵢ, where x₁, ..., xₙ are elements of S₀, where we may assume without loss of generality that the xᵢ's are all distinct; then, setting x = x₀, α₀ = 1, αᵢ = −βᵢ for 1 ≤ i ≤ n, we find that x₀, x₁, ..., xₙ are distinct elements of S and that the αᵢ's are scalars not all of which are 0 (since α₀ ≠ 0), such that Σᵢ₌₀ⁿ αᵢxᵢ = 0, contradicting the hypothesis (ii). □

Proposition A.1.7 Let S be a subset of a vector space V; the following conditions are equivalent:
(i) S is a minimal spanning set for V;
(ii) S is a maximal linearly independent set in V.
A set S satisfying these conditions is called a (linear) basis for V.
A.1. Some linear algebra
Proof. (i) ⇒ (ii): The content of Lemma A.1.6 is that if S is a minimal spanning set for ∨S, then (and only then) S is linearly independent; if S_1 ⊃ S is a strictly larger set than S, then we have V = ∨S ⊂ ∨S_1 ⊂ V, whence V = ∨S_1; since S is a proper subset of S_1 such that ∨S = ∨S_1, it follows from Lemma A.1.6 that S_1 is not linearly independent; hence S is indeed a maximal linearly independent subset of V.
(ii) ⇒ (i): Let W = ∨S; suppose W ≠ V; pick x ∈ V − W; we claim that S_1 = S ∪ {x} is linearly independent; indeed, suppose ∑_{i=1}^n α_i x_i = 0, where x_1, ..., x_n are distinct elements of S_1, and not all α_i's are 0; since S is assumed to be linearly independent, it must be the case that there exists an index i ∈ {1, ..., n} such that x = x_i and α_i ≠ 0; but this would mean that we can express x as a linear combination of the remaining x_j's, and consequently that x ∈ W, contradicting the choice of x. Thus, it must be the case that S is a spanning set for V, and since S is linearly independent, it follows from Lemma A.1.6 that in fact S must be a minimal spanning set for V. □
Lemma A.1.8 Let L = {x_1, ..., x_m} be any linearly independent set in a vector space V, and let S be any spanning set for V; then there exists a subset S_0 ⊂ S such that (i) S − S_0 has exactly m elements, and (ii) L ∪ S_0 is also a spanning set for V. In particular, any spanning set for V has at least as many elements as any linearly independent set.

Proof. The proof is by induction on m. If m = 1, since x_1 ∈ V = ∨S, there exists a decomposition x_1 = ∑_{j=1}^n α_j y_j, with y_j ∈ S. Since x_1 ≠ 0 (why?), there must be some j such that α_j y_j ≠ 0; it is then easy to see (upon dividing the previous equation by α_j - which is non-zero - and rearranging the terms of the resulting equation appropriately) that y_j ∈ ∨{y_1, ..., y_{j−1}, x_1, y_{j+1}, ..., y_n}, so that S_0 = S − {y_j} does the job.
Suppose the lemma is valid when L has (m − 1) elements; then we can, by the induction hypothesis (applied to L − {x_m} and S), find a subset S_1 ⊂ S such that S − S_1 has (m − 1) elements, and such that ∨(S_1 ∪ {x_1, ..., x_{m−1}}) = V. In particular, since x_m ∈ V, this means we can find vectors y_j ∈ S_1 and scalars α_i, β_j ∈ 𝕂, for i < m, j ≤ n, such that x_m = ∑_{i=1}^{m−1} α_i x_i + ∑_{j=1}^n β_j y_j. The assumed linear independence of L shows that (x_m ≠ 0, and hence that) β_{j_0} y_{j_0} ≠ 0, for some j_0. As in the last paragraph, we may deduce from this that y_{j_0} ∈ ∨({y_j : 1 ≤ j ≤ n, j ≠ j_0} ∪ {x_1, ..., x_m}); this implies that V = ∨(S_1 ∪ {x_1, ..., x_{m−1}}) = ∨(S_0 ∪ L), where S_0 = S_1 − {y_{j_0}}, and this S_0 does the job. □
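The size bound of Lemma A.1.8 can be tested computationally over 𝕂 = ℚ; the following Python sketch (not part of the original text) decides linear independence by Gaussian elimination, using exact `Fraction` arithmetic to avoid floating-point trouble:

```python
from fractions import Fraction

def rank(vectors):
    """Row-reduce a list of vectors over Q and return the rank,
    i.e. the size of a maximal linearly independent subset."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        # find a pivot in column c at or below row r
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        p = rows[r][c]
        rows[r] = [x / p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                t = rows[i][c]
                rows[i] = [a - t * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_independent(vectors):
    return rank(vectors) == len(vectors)

print(is_independent([[1, 0, 0], [0, 1, 0]]))   # True
print(is_independent([[1, 2, 3], [2, 4, 6]]))   # second = 2 * first: False
```

The rank computed here is the common size of all minimal spanning subsets of the given list, which is exactly the quantity the lemma controls.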
Exercise A.1.9 Suppose a vector space V over 𝕂 has a basis with n elements, for some n ∈ ℕ. (Such a vector space is said to be finite-dimensional.) Show that:
(i) any linearly independent set in V has at most n elements;
(ii) any spanning set for V has at least n elements;
(iii) any linearly independent set in V can be "extended" to a basis for V.
Corollary A.1.10 Suppose a vector space V over 𝕂 admits a finite spanning set. Then it has at least one basis; further, any two bases of V have the same number of elements.
Proof. If S is a finite spanning set for V, it should be clear (in view of the assumed finiteness of S) that S must contain a minimal spanning set; i.e., S contains a finite basis - say B = {v_1, ..., v_n} - for V (by Proposition A.1.7). If B_1 is any other basis for V, then by Lemma A.1.8 (applied with S = B and L as any finite subset of B_1), we find that B_1 is finite and has at most n elements; by reversing the roles of B and B_1, we find that any basis of V has exactly n elements. □

It is a fact, which we shall prove in §A.2, that every vector space admits a basis, and that any two bases have the same "cardinality", meaning that a bijective correspondence can be established between any two bases; this common cardinality is called the (linear) dimension of the vector space, and denoted by dim_𝕂 V.
In the rest of this section, we restrict ourselves to finite-dimensional vector spaces. If {x_1, ..., x_n} is a basis for (the n-dimensional vector space) V, we know that any element x of V may be expressed as a linear combination ∑_{i=1}^n α_i x_i; the fact that the x_i's form a linearly independent set ensures that the coefficients are uniquely determined by x; thus we see that the mapping 𝕂^n ∋ (α_1, ..., α_n) ↦ ∑_{i=1}^n α_i x_i ∈ V establishes a bijective correspondence between 𝕂^n and V; this correspondence is easily seen to be a linear isomorphism - i.e., if T denotes this map, then T is 1-1 and onto, and it satisfies T(αx + y) = αTx + Ty, for all α ∈ 𝕂 and x, y ∈ 𝕂^n.
As in the case of ℂ, we may - and will - consider linear transformations between vector spaces over 𝕂; once we have fixed bases B_1 = {v_1, ..., v_n} and B_2 = {w_1, ..., w_m} for the vector spaces V and W respectively, there is a 1-1 correspondence between the set L(V, W) of (𝕂-)linear transformations from V to W on the one hand, and the set M_{m×n}(𝕂) of m × n matrices with entries coming from 𝕂, on the other; the linear transformation T ∈ L(V, W) corresponds to the matrix ((t^i_j)) ∈ M_{m×n}(𝕂) precisely when Tv_j = ∑_{i=1}^m t^i_j w_i for all j = 1, 2, ..., n. Again, as in the case of ℂ, when V = W, we take B_2 = B_1, and in this case the assignment T ↦ ((t^i_j)) sets up a linear isomorphism of the (𝕂-)vector space L(V) onto the vector space M_n(𝕂), which is an isomorphism of 𝕂-algebras - meaning that if products in L(V) and M_n(𝕂) are defined by composition of transformations and matrix multiplication (defined exactly as in Exercise 1.3.1(7)(ii)) respectively, then the passage (defined above) from linear transformations to matrices respects these products. (In fact, this is precisely the reason that matrix multiplication is defined in this manner.) Thus the study of linear transformations on an n-dimensional vector space over 𝕂 is exactly equivalent - once we have chosen some basis for V - to the study of n × n matrices over 𝕂.
We conclude this section with some remarks concerning determinants. Given a matrix A = ((a^i_j)) ∈ M_n(𝕂), the determinant of A is a scalar (i.e., an element of
𝕂), usually denoted by the symbol det A, which captures some important features of the matrix A. In case 𝕂 = ℝ, these features can be quantified as consisting of two parts: (a) the sign (i.e., positive or negative) of det A determines whether the linear transformation of ℝ^n corresponding to A "preserves" or "reverses" the "orientation" of ℝ^n; and (b) the absolute value of det A is the quantity by which volumes of regions in ℝ^n get magnified under the application of the transformation corresponding to A.
The group of permutations of the set {1, 2, ..., n} is denoted S_n; as is well known, every permutation is expressible as a product of transpositions (i.e., those permutations which interchange two letters and leave everything else fixed); while the expression of a permutation as a product of transpositions is far from unique, it is nevertheless true - see [Art], for instance - that the number of transpositions needed in any decomposition of a fixed σ ∈ S_n is either always even or always odd; a permutation is called even or odd depending upon which of these possibilities occurs for that permutation. The set A_n of even permutations is a subgroup of S_n which contains exactly half the number of elements of S_n; i.e., A_n has "index" 2 in S_n. The so-called alternating character of S_n is the function ε : S_n → {+1, −1} defined by

    ε_σ = { +1   if σ ∈ A_n
          { −1   otherwise ,                                    (A.1.1)

which is easily seen to be a homomorphism from the group S_n into the multiplicative group {+1, −1}. We are now ready for the definition of the determinant.
Definition A.1.11 If A = ((a^i_j)) ∈ M_n(𝕂), the determinant of A is the scalar defined by

    det A = ∑_{σ∈S_n} ε_σ ∏_{i=1}^n a^i_{σ(i)} .
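The defining formula can be computed literally for small matrices (an illustrative Python sketch, not part of the original text); here ε_σ is computed by counting inversions, whose parity agrees with the parity of the permutation:

```python
from itertools import permutations

def eps(sigma):
    """Alternating character: +1 for even permutations, -1 for odd,
    computed by counting inversions of the tuple sigma."""
    inv = sum(1 for i in range(len(sigma))
                for j in range(i + 1, len(sigma)) if sigma[i] > sigma[j])
    return 1 if inv % 2 == 0 else -1

def det(A):
    """Leibniz formula: det A = sum over sigma in S_n of
    eps(sigma) * prod_i A[i][sigma(i)].  O(n * n!), so only for tiny n."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = eps(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total

print(det([[1, 2], [3, 4]]))    # -> -2
```

This brute-force sum is useful only as a check against the definition; practical computation uses the row expansions described below, or row reduction.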
The best way to think of a determinant stems from viewing a matrix A = ((a^i_j)) ∈ M_n(𝕂) as the ordered tuple of its rows; thus we think of A as (a^1, a^2, ..., a^n), where a^i = (a^i_1, ..., a^i_n) denotes the i-th row of the matrix A. Let R : 𝕂^n × ⋯ × 𝕂^n (n terms) → M_n(𝕂) denote the map given by R(a^1, ..., a^n) = ((a^i_j)), and let D : 𝕂^n × ⋯ × 𝕂^n (n terms) → 𝕂 be defined by D = det ∘ R. Then R is clearly a linear isomorphism of vector spaces, while D is just the determinant mapping, but when viewed as a mapping from 𝕂^n × ⋯ × 𝕂^n (n terms) into 𝕂. We list some simple consequences of the definitions in the following proposition.
Proposition A.1.12 (1) If A′ denotes the transpose of the matrix A, then det A = det A′.
(2) The mapping D is an "alternating multilinear form": i.e.,
(a) if x^1, ..., x^n ∈ 𝕂^n and σ ∈ S_n are arbitrary, then

    D(x^{σ(1)}, ..., x^{σ(n)}) = ε_σ D(x^1, x^2, ..., x^n) ;

and
(b) if all but one of the n arguments of D are fixed, then D is linear as a function of the remaining argument; thus, for instance, if x^1, y^1, x^2, x^3, ..., x^n ∈ 𝕂^n and α ∈ 𝕂 are arbitrary, then

    D(αx^1 + y^1, x^2, ..., x^n) = α D(x^1, x^2, ..., x^n) + D(y^1, x^2, ..., x^n) .

Proof. (1) Let B = A′, so that b^i_j = a^j_i ∀ i, j. Since σ ↦ σ^{−1} is a bijection of S_n such that ε_σ = ε_{σ^{−1}} ∀ σ ∈ S_n, we have

    det A = ∑_{σ∈S_n} ε_σ ∏_{i=1}^n a^i_{σ(i)}
          = ∑_{σ∈S_n} ε_σ ∏_{j=1}^n a^{σ^{−1}(j)}_j
          = ∑_{σ∈S_n} ε_{σ^{−1}} ∏_{j=1}^n b^j_{σ^{−1}(j)}
          = ∑_{π∈S_n} ε_π ∏_{j=1}^n b^j_{π(j)}
          = det B ,

as desired.
(2) (a) Let us write y^i = x^{σ(i)}. Then,

    D(x^{σ(1)}, ..., x^{σ(n)}) = D(y^1, ..., y^n)
        = ∑_{τ∈S_n} ε_τ ∏_{j=1}^n y^j_{τ(j)}
        = ∑_{τ∈S_n} ε_τ ∏_{j=1}^n x^{σ(j)}_{τ(j)}
        = ∑_{τ∈S_n} ε_τ ∏_{i=1}^n x^i_{τ(σ^{−1}(i))}
        = ε_σ ∑_{τ∈S_n} ε_{τσ^{−1}} ∏_{i=1}^n x^i_{τσ^{−1}(i)}
        = ε_σ D(x^1, ..., x^n) .

(b) This is an immediate consequence of the definition of the determinant and the distributive law in 𝕂. □
Exercise A.1.13 (a) Show that if x^1, ..., x^n ∈ 𝕂^n are vectors which are not all distinct, then D(x^1, ..., x^n) = 0. (Hint: if x^i = x^j, where i ≠ j, let τ ∈ S_n be the transposition (ij), which switches i and j and leaves other letters unaffected; then, on the one hand, ε_τ = −1, while on the other, x^{τ(k)} = x^k ∀k; now appeal to the alternating character of the determinant, i.e., Proposition A.1.12(2)(a).)
(b) Show that if the rows of a matrix are linearly dependent, then its determinant must vanish.
(Hint: if the i-th row is a linear combination of the other rows, use the multilinearity of the determinant (i.e., Proposition A.1.12(2)) and (a) of this exercise.)
(c) Show that if a matrix A is not invertible, then det A = 0.
(Hint: note that the transpose matrix A′ is also not invertible, and deduce from Exercise 1.3.1(4)(iii) that the (columns of A′, or equivalently the) rows of A are linearly dependent; now use (b) above.)
The way one computes determinants in practice is by "expanding along rows" (or columns), as follows: first consider the sets X^i_j = {σ ∈ S_n : σ(i) = j}, and notice that, for each fixed i, S_n is partitioned by the sets X^i_1, ..., X^i_n. Suppose A = ((a^i_j)) ∈ M_n(𝕂); for arbitrary i, j ∈ {1, 2, ..., n}, let A^i_j denote the (n − 1) × (n − 1) matrix obtained by deleting the i-th row and the j-th column of A. Notice then that, for any fixed i, we have

    det A = ∑_{j=1}^n a^i_j ( ∑_{σ∈X^i_j} ε_σ ∏_{k≠i} a^k_{σ(k)} ) .            (A.1.2)
After some careful book-keeping involving several changes of variables, it can be verified that the sum within the parentheses in equation A.1.2 is nothing but (−1)^{i+j} det A^i_j, and we find the following useful prescription for computing the determinant, which is what we called "expanding along rows":

    det A = ∑_{j=1}^n (−1)^{i+j} a^i_j det A^i_j , ∀i.                          (A.1.3)
An entirely similar reasoning (or an adroit application of Proposition A.1.12(1) together with equation A.1.3) shows that we can also "expand along columns": i.e.,

    det A = ∑_{i=1}^n (−1)^{i+j} a^i_j det A^i_j , ∀j.                          (A.1.4)
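Equation A.1.3 can be turned directly into a recursive procedure (an illustrative Python sketch, not part of the original text; with 0-based indices the sign (−1)^{i+j} along the first row becomes (−1)^j):

```python
def minor(A, i, j):
    """The submatrix A^i_j: delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row,
    i.e. equation (A.1.3) with i = 0:
    det A = sum_j (-1)^j * A[0][j] * det(minor(A, 0, j))."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

print(det([[0, -1], [1, 0]]))   # rotation by 90 degrees: determinant 1
```

Expanding along any other row, or along a column as in equation A.1.4, produces the same value; the choice matters only for efficiency (one expands along a row or column with many zeros).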
We are now ready for one of the most important features of the determinant.
Proposition A.1.14 A matrix A is invertible if and only if det A ≠ 0; in case det A ≠ 0, the inverse matrix B = A^{−1} is given by the formula

    b^i_j = (−1)^{i+j} (det A)^{−1} det A^j_i .

Proof. That a non-invertible matrix must have vanishing determinant is the content of Exercise A.1.13.
Suppose conversely that d = det A ≠ 0. Define b^i_j = (−1)^{i+j} d^{−1} det A^j_i. We find that if C = AB, then c^i_j = ∑_{k=1}^n a^i_k b^k_j, ∀ i, j. It follows directly from equation A.1.3 that c^i_i = 1 ∀i. In order to show that C is the identity matrix, we need to verify that

    ∑_{k=1}^n (−1)^{k+j} a^i_k det A^j_k = 0 , ∀ i ≠ j.                         (A.1.5)

Fix i ≠ j, and let P be the matrix with l-th row p^l, where

    p^l = { a^l   if l ≠ j
          { a^i   if l = j ,

and note that (a) the i-th and j-th rows of P are identical (and equal to a^i), which implies, by Exercise A.1.13, that det P = 0; (b) P^j_k = A^j_k ∀k, since P and A differ only in the j-th row; and (c) p^j_k = a^i_k ∀k. Consequently, we may deduce that the expression on the left side of equation A.1.5 is nothing but the expansion, along the j-th row, of the determinant of P; thus equation A.1.5 is indeed valid. Thus we have AB = I; this implies that A is onto, and since we are in finite dimensions, we may conclude that A must also be 1-1, and hence invertible. □

The second most important property of the determinant is contained in the next result.

Proposition A.1.15 (a) If A, B ∈ M_n(𝕂), then

    det(AB) = (det A)(det B) .                                                  (A.1.6)

(b) If S ∈ M_n(𝕂) is invertible, and if A ∈ M_n(𝕂), then

    det(SAS^{−1}) = det A .
Proof. (a) Let AB = C, and let a^i, b^i and c^i denote the i-th rows of A, B and C respectively; then, since c^i_j = ∑_{k=1}^n a^i_k b^k_j, we find that c^i = ∑_{k=1}^n a^i_k b^k. Hence we see that

    det(AB) = D(c^1, ..., c^n)
            = D( ∑_{k_1=1}^n a^1_{k_1} b^{k_1}, ..., ∑_{k_n=1}^n a^n_{k_n} b^{k_n} )
            = ∑_{k_1,...,k_n=1}^n a^1_{k_1} ⋯ a^n_{k_n} D(b^{k_1}, ..., b^{k_n})
            = ∑_{σ∈S_n} ( ∏_{i=1}^n a^i_{σ(i)} ) D(b^{σ(1)}, ..., b^{σ(n)})
            = ∑_{σ∈S_n} ε_σ ( ∏_{i=1}^n a^i_{σ(i)} ) D(b^1, ..., b^n)
            = (det A)(det B) ,
where we have used (i) the multilinearity of D (i.e., Proposition A.1.12(2)(b)) in the third line, (ii) the fact that D(x^1, ..., x^n) = 0 if x^i = x^j for some i ≠ j, so that D(b^{k_1}, ..., b^{k_n}) can be non-zero only if there exists a permutation σ ∈ S_n such that σ(i) = k_i ∀i, in the fourth line, and (iii) the fact that D is alternating (i.e., Proposition A.1.12(2)(a)) in the fifth line.
(b) If I denotes the identity matrix, it is clear that det I = 1; it follows from (a) that det S^{−1} = (det S)^{−1}, and hence the desired conclusion (b) follows from another application of (a) (to a triple product). □

The second part of the preceding proposition has the important consequence that we can unambiguously talk of the determinant of a linear transformation of a finite-dimensional vector space into itself.

Corollary A.1.16 If V is an n-dimensional vector space over 𝕂, if T ∈ L(V), if B_1 = {x_1, ..., x_n} and B_2 = {y_1, ..., y_n} are two bases for V, and if A = [T]_{B_1}, B = [T]_{B_2} are the matrices representing T with respect to the basis B_i, for i = 1, 2, then there exists an invertible matrix S ∈ M_n(𝕂) such that B = S^{−1}AS, and consequently, det B = det A.

Proof. Consider the unique (obviously invertible) linear transformation U ∈ L(V) with the property that Ux_j = y_j, 1 ≤ j ≤ n. Let A = ((a^i_j)), B = ((b^i_j)), and let S = ((s^i_j)) = [U]^{B_2}_{B_1} be the matrix representing U with respect to the two bases (in the notation of Exercise 1.3.1(7)(i)); thus, by definition, we have:

    Tx_j = ∑_{i=1}^n a^i_j x_i ,
    Ty_j = ∑_{i=1}^n b^i_j y_i ,
    y_j  = ∑_{i=1}^n s^i_j x_i ;
deduce that

    Ty_j = ∑_{i=1}^n b^i_j y_i
         = ∑_{i,k=1}^n b^i_j s^k_i x_k
         = ∑_{k=1}^n ( ∑_{i=1}^n s^k_i b^i_j ) x_k ,

while, also,

    Ty_j = T( ∑_{i=1}^n s^i_j x_i )
         = ∑_{i=1}^n s^i_j T x_i
         = ∑_{i,k=1}^n s^i_j a^k_i x_k
         = ∑_{k=1}^n ( ∑_{i=1}^n a^k_i s^i_j ) x_k ;
conclude that

    [T]^{B_2}_{B_1} = SB = AS .

Since S is invertible, we thus have B = S^{−1}AS, and the proof of the corollary is complete (since the last statement follows from Proposition A.1.15(b)). □

Hence, given a linear transformation T ∈ L(V), we may unambiguously define the determinant of T as the determinant of any matrix which represents T with respect to some basis for V; thus, det T = det [T]_B, where B is any (ordered) basis of V. In particular, we have the following pleasant consequence.

Proposition A.1.17 Let T ∈ L(V); define the characteristic polynomial of T by the equation

    p_T(λ) = det(T − λ1_V) , λ ∈ 𝕂 ,                                           (A.1.7)
where 1_V denotes the identity operator on V. Then the following conditions on a scalar λ ∈ 𝕂 are equivalent:
(i) (T − λ1_V) is not invertible;
(ii) there exists a non-zero vector x ∈ V such that Tx = λx;
(iii) p_T(λ) = 0.
Proof. It is easily shown, using induction and expansion of the determinant along the first row (for instance), that p_T(λ) is a polynomial of degree n in λ, with the coefficient of λ^n being (−1)^n. The equivalence (i) ⇔ (iii) is an immediate consequence of Corollary A.1.16 and Proposition A.1.14. The equivalence (i) ⇔ (ii) is a consequence of the fact that a 1-1 linear transformation of a finite-dimensional vector space into itself is necessarily onto, and hence invertible. (Reason: if B is a basis for V, and if T is 1-1, then T(B) is a linearly independent set with n elements in the n-dimensional space V, and any such set is necessarily a basis.) □
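As a numerical illustration (not in the original text; the helper names are ad hoc), the equivalence of (ii) and (iii) can be checked directly for a 2 × 2 real matrix, where p_T(λ) = det(T − λI) = λ² − (tr T)λ + det T:

```python
def char_poly_2x2(A, lam):
    """p_T(lam) = det(T - lam*I) = lam^2 - tr(T)*lam + det(T) for 2x2 T."""
    (a, b), (c, d) = A
    return lam * lam - (a + d) * lam + (a * d - b * c)

def apply(A, v):
    """Matrix-vector product A v."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

A = [[2, 1], [1, 2]]                       # eigenvalues 1 and 3
assert char_poly_2x2(A, 1) == 0 and char_poly_2x2(A, 3) == 0

# matching eigenvectors: A(1,1) = 3*(1,1) and A(1,-1) = 1*(1,-1)
assert apply(A, [1, 1]) == [3, 3]
assert apply(A, [1, -1]) == [1, -1]
```

Each root of the characteristic polynomial is thus witnessed by a non-zero vector satisfying Tx = λx, exactly as conditions (ii) and (iii) predict.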
Definition A.1.18 If T, λ, x are as in Proposition A.1.17(ii), then λ is said to be an eigenvalue of T, and x is said to be an eigenvector of T corresponding to the eigenvalue λ.

We conclude this section with a couple of exercises.

Exercise A.1.19 Let T ∈ L(V), where V is an n-dimensional vector space over 𝕂.
(a) Show that there are at most n distinct eigenvalues of T. (Hint: a polynomial of degree n with coefficients from 𝕂 cannot have (n + 1) distinct roots.)
(b) Let 𝕂 = ℝ, V = ℝ², and let T denote the transformation of rotation (of the plane about the origin) by 90° in the counter-clockwise direction.
(i) Show that the matrix of T with respect to the standard basis B = {(1, 0), (0, 1)} is given by

    [T]_B = [ 0  −1 ]
            [ 1   0 ] ,

and deduce that p_T(λ) = λ² + 1. Hence, deduce that T has no (real) eigenvalue.
(ii) Give a geometric proof of the fact that T has no (real) eigenvalue.
(c) Let p(λ) = (−1)^n λ^n + ∑_{k=0}^{n−1} α_k λ^k be any polynomial of degree n with leading coefficient (−1)^n. Show that there exists a linear transformation T on any n-dimensional space with the property that p_T = p. (Hint: consider the matrix
    A = [ 0  0  0  ⋯  0  β_0     ]
        [ 1  0  0  ⋯  0  β_1     ]
        [ 0  1  0  ⋯  0  β_2     ]
        [ 0  0  1  ⋯  0  β_3     ]
        [ ⋮  ⋮  ⋮      ⋮  ⋮      ]
        [ 0  0  0  ⋯  1  β_{n−1} ] ,

where β_j = (−1)^{n+1} α_j , ∀ 0 ≤ j < n.)
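The hint can be checked numerically (an illustrative sketch, not in the original text; for n = 3 one has β_j = (−1)⁴ α_j = α_j, so the last column is just the α's):

```python
def det3(M):
    """3x3 determinant, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def companion3(b0, b1, b2):
    """The matrix of the hint for n = 3: sub-diagonal 1's,
    last column (b0, b1, b2)."""
    return [[0, 0, b0],
            [1, 0, b1],
            [0, 1, b2]]

# target polynomial p(t) = -t^3 + 3t^2 - t + 2, i.e. a0=2, a1=-1, a2=3
a0, a1, a2 = 2, -1, 3
A = companion3(a0, a1, a2)           # beta_j = alpha_j when n = 3
p = lambda t: -t**3 + a2 * t**2 + a1 * t + a0
for t in (-2, -1, 0, 1, 2, 3):
    M = [[A[r][c] - t * (1 if r == c else 0) for c in range(3)] for r in range(3)]
    assert det3(M) == p(t)           # characteristic polynomial matches p
```

Since det(A − tI) agrees with p at more than three points and both are cubics, they coincide, which is exactly the assertion of part (c) for this sample polynomial.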
(d) Show that the following conditions on 𝕂 are equivalent:
(i) every non-constant polynomial p with coefficients from 𝕂 has a root - i.e., there exists an α ∈ 𝕂 such that p(α) = 0;
(ii) every linear transformation on a finite-dimensional vector space over 𝕂 has an eigenvalue.
(Hint: use Proposition A.1.17 and (c) of this exercise.)
A.2 Transfinite considerations
One of the most useful arguments when dealing with "infinite constructions" is the celebrated Zorn's lemma, which is equivalent to one of the basic axioms of set theory, the "Axiom of Choice". The latter axiom amounts to the statement that if {X_i : i ∈ I} is a non-empty collection of non-empty sets, then their Cartesian product ∏_{i∈I} X_i is also non-empty. It is a curious fact that this axiom is actually equivalent to Tychonoff's theorem on the compactness of the Cartesian product of compact spaces; in any case, we just state Zorn's lemma below, since the question of "proving" an axiom does not arise, and then discuss some of the typical ways in which it will be used.

Lemma A.2.1 (Zorn's lemma) Suppose a non-empty partially ordered set (P, ≤) satisfies the following condition: every totally ordered set in P has an upper bound (i.e., whenever C is a subset of P with the property that any two elements of C are comparable - meaning that if x, y ∈ C, then either x ≤ y or y ≤ x - then there exists an element z ∈ P such that x ≤ z ∀x ∈ C). Then P admits a maximal element - i.e., there exists an element z ∈ P such that if x ∈ P and z ≤ x, then necessarily x = z.

A prototypical instance of the manner in which Zorn's lemma is usually applied is contained in the proof of the following result.

Proposition A.2.2 Every vector space (over an arbitrary field 𝕂), with the single exception of V = {0}, has a basis.
Proof. In view of Proposition A.1.7, we need to establish the existence of a maximal linearly independent subset of the given vector space V.
Let P denote the collection of all linearly independent subsets of V. Clearly the set P is partially ordered with respect to inclusion. We show now that Zorn's lemma is applicable to this situation.
To start with, P is not empty. This is because V is assumed to be non-zero, and so, if x is any non-zero vector in V, then {x} ∈ P.
Next, suppose C = {S_i : i ∈ I} is a totally ordered subset of P; thus, for each i ∈ I, we are given a linearly independent set S_i in P; we are further told that whenever i, j ∈ I, either S_i ⊂ S_j or S_j ⊂ S_i.
Let S = ∪_{i∈I} S_i; we contend that S is a linearly independent set; for, suppose ∑_{k=1}^n α_k x_k = 0, where n ∈ ℕ, α_k ∈ 𝕂 and x_k ∈ S for 1 ≤ k ≤ n; by definition, there exist indices i_k ∈ I such that x_k ∈ S_{i_k} ∀k; the assumed total ordering on C clearly implies that we may assume, after re-labelling if necessary, that S_{i_1} ⊂ S_{i_2} ⊂ ⋯ ⊂ S_{i_n}. This means that x_k ∈ S_{i_n} ∀ 1 ≤ k ≤ n; the assumed linear independence of S_{i_n} now shows that α_k = 0 ∀ 1 ≤ k ≤ n. This establishes our contention that S is a linearly independent set. Since S_i ⊂ S ∀ i ∈ I, we have in fact verified that the arbitrary totally ordered set C ⊂ P admits an upper bound, namely S.
Thus, we may conclude from Zorn's lemma that there exists a maximal element, say S, in P; i.e., we have produced a maximal linearly independent set, namely S, in V, and this is the desired basis. □

The following exercise lists a couple of instances in the body of this book where Zorn's lemma is used. In each case, the solution is an almost verbatim repetition of the above proof.

Exercise A.2.3 (1) Show that every Hilbert space admits an orthonormal basis.
(2) Show that every (proper) ideal in a unital algebra is contained in a maximal (proper) ideal. (Hint: If I is an ideal in an algebra A, consider the collection P of all (proper) ideals of A which contain I; show that this is a non-empty set which is partially ordered by inclusion; verify that every totally ordered set in P has an upper bound (namely the union of its members); and show that a maximal element of P is necessarily a maximal ideal in A.)

The rest of this section is devoted to an informal discussion of cardinal numbers. We will not be very precise here, so as not to get into various logical technicalities. In fact we shall not define what a cardinal number is, but we shall talk about the cardinality of a set.
Loosely speaking, we shall agree to say that two sets have the same cardinality if it is possible to set up a 1-1 correspondence between their members; thus, if X, Y are sets, and if there exists a 1-1 function f which maps X onto Y, we shall write |X| = |Y| (and say that X and Y have the same cardinality). When this happens, let us agree to also write X ∼ Y.
Also, we shall adopt the convention that the symbol ⊔_{i∈I} X_i (resp., A ⊔ B) denotes the union of a collection of (resp., two) sets which are pairwise disjoint.
We come now to an elementary proposition, whose obvious proof we omit.

Proposition A.2.4 If X = ⊔_{i∈I} X_i and Y = ⊔_{i∈I} Y_i are partitions of two sets such that |X_i| = |Y_i| ∀ i ∈ I, then also |X| = |Y|.

Thus, "cardinality is a completely additive function" (just as a measure is a countably additive function, while there is no "countability restriction" now).
More generally than before, if X and Y are sets, we shall say that |X| ≤ |Y| if there exists some subset Y_0 ⊂ Y such that |X| = |Y_0|. It should be clear that
"≤" is a reflexive (i.e., |X| ≤ |X| ∀X) and transitive (i.e., |X| ≤ |Y| and |Y| ≤ |Z| imply |X| ≤ |Z|) relation. The next basic fact is the so-called Schroeder-Bernstein theorem, which asserts that "≤" is an anti-symmetric relation. (We are skirting close to serious logical problems here - such as: what is a cardinal number? does it make sense to talk of the set of all cardinal numbers? etc.; but we ignore such questions and hide behind the apology that we are only speaking loosely, as stated earlier. The reader who wants to know more about such logical subtleties might look at [Hal], for instance, for a primer, as well as an introduction to more serious (and less "naive") literature on the subject.)
Theorem A.2.5 Let X and Y be sets. Then,

    |X| ≤ |Y| and |Y| ≤ |X|  ⇒  |X| = |Y| .
Proof. The hypothesis translates into the existence of 1-1 functions f : X → Y and g : Y → X; to prove the theorem, we need to exhibit a 1-1 function which maps X onto Y.
Let g(Y) = Y_0. Set h = g ∘ f. Then h is a 1-1 map of X into itself. Define X_n = h^{(n)}(X) and Y_n = h^{(n)}(Y_0), where h^{(n)} denotes the composition of h with itself n times. The definitions imply that

    X = X_0 ⊃ Y_0 ⊃ X_1 ⊃ Y_1 ⊃ X_2 ⊃ ⋯ ,

and consequently, we see that, if we set X_∞ = ∩_{n=0}^∞ X_n, then also X_∞ = ∩_{n=0}^∞ Y_n.
It follows that we have the following partitions of X and Y_0 respectively:

    X   = ( ⊔_{n=0}^∞ (X_n − Y_n) ) ⊔ ( ⊔_{n=0}^∞ (Y_n − X_{n+1}) ) ⊔ X_∞ ,
    Y_0 = ( ⊔_{n=1}^∞ (X_n − Y_n) ) ⊔ ( ⊔_{n=0}^∞ (Y_n − X_{n+1}) ) ⊔ X_∞ ;

consider the function φ : X → Y_0 defined by

    φ(x) = { h(x)   if x ∈ X_n − Y_n for some n
           { x      otherwise .

Since h maps X_n − Y_n bijectively onto X_{n+1} − Y_{n+1} for each n, it is clear that φ is a bijection, and consequently, g^{−1} ∘ φ is a bijection of X onto Y (where, of course, g^{−1} denotes the map from Y_0 to Y which is the inverse of the bijection g : Y → Y_0). □
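The construction can be made concrete (an illustrative sketch, not from the text): a point x lies in some X_n − Y_n exactly when tracing preimages x ↦ g^{−1}(x) ↦ f^{−1}(g^{−1}(x)) ↦ ⋯ eventually stops at a point of X outside the image of g. The code below assumes every such backward chain terminates (i.e., X_∞ plays no role in the example), so that the chase always halts:

```python
def schroeder_bernstein(f, g, f_inv, g_inv):
    """Given injections f: X -> Y and g: Y -> X, with partial inverses
    that return None outside the respective images, build the bijection
    of the proof: x goes to f(x) when its backward chain stops in X
    (i.e. x lies in some X_n - Y_n), and to g_inv(x) otherwise.
    Assumes every backward chain terminates."""
    def bij(x):
        cur, in_X = x, True
        while True:
            prev = g_inv(cur) if in_X else f_inv(cur)
            if prev is None:
                return f(x) if in_X else g_inv(x)
            cur, in_X = prev, not in_X
    return bij

# toy example: X = {0,1,2,...}, Y = {1,2,3,...}, f(n) = n+1, g(m) = m
bij = schroeder_bernstein(
    f=lambda n: n + 1,
    g=lambda m: m,
    f_inv=lambda y: y - 1,                     # f is onto Y here
    g_inv=lambda x: x if x >= 1 else None)     # g(Y) misses 0
print([bij(n) for n in range(5)])              # -> [1, 2, 3, 4, 5]
```

In this toy case every chain walks down to 0 and stops in X, so every point is mapped via f, and the resulting bijection ℕ → ℕ − {0} is n ↦ n + 1.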
The next step is to show that ≤ is a "total order".

Lemma A.2.6 Let X and Y be arbitrary sets. Then, either |X| ≤ |Y| or |Y| ≤ |X|.
Proof. Assume, without loss of generality, that both X and Y are non-empty sets. In particular, there exist (singleton) subsets X_0 ⊂ X, Y_0 ⊂ Y such that |X_0| = |Y_0|.
Let P = {(A, f, B) : A ⊂ X, B ⊂ Y, and f : A → B is a bijection}. Then, by the previous paragraph, P ≠ ∅. Define an ordering on P as follows: if (A, f, B), (C, g, D) ∈ P, say that (A, f, B) ≤ (C, g, D) if it is the case that A ⊂ C and g|_A = f (so that, in particular, we should also have B ⊂ D). It should be easy to see that this defines a partial order on P; further, if C = {(A_i, f_i, B_i) : i ∈ I} is a totally ordered set in P, it should be clear that if we define A = ∪_{i∈I} A_i and B = ∪_{i∈I} B_i, then there exists a unique map f of A onto B with the property that f|_{A_i} = f_i; in other words, (A, f, B) is an upper bound for C.
Thus Zorn's lemma is applicable; if (X_0, f, Y_0) is a maximal element of P - whose existence is assured by Zorn's lemma - then it cannot be the case that both X − X_0 and Y − Y_0 are non-empty; for, in that case, we could arbitrarily pick x ∈ X − X_0, y ∈ Y − Y_0, and define the bijection F of A = X_0 ∪ {x} onto B = Y_0 ∪ {y} by F(x) = y and F|_{X_0} = f; then the element (A, F, B) would contradict the maximality of (X_0, f, Y_0).
In conclusion, either X_0 = X (in which case |X| ≤ |Y|) or Y_0 = Y (in which case |Y| ≤ |X|). □
Definition A.2.7 A set X is said to be infinite if there exists a proper subset X_0 ⊂ X (so X_0 ≠ X) such that |X| = |X_0|. (Thus X is infinite if and only if there exists a 1-1 function of X onto a proper subset of itself.) A set X is said to be finite if it is not infinite.

Lemma A.2.8 A set X is infinite if and only if |ℕ| ≤ |X|.
Proof. Suppose X is an infinite set; then there exists a 1-1 function f : X → X − {x_0}, for some x_0 ∈ X. Inductively define x_{n+1} = f(x_n) for all n = 0, 1, 2, .... If we set X_n = f^{(n)}(X), where f^{(n)} denotes the composition of f with itself n times, it follows that x_n ∈ X_n − X_{n+1} ∀n, whence x_n ≠ x_m ∀ n ≠ m; i.e., the map n ↦ x_n is a 1-1 map of ℕ into X, and hence |ℕ| ≤ |X|.
Conversely, if g : ℕ → X is a 1-1 map, consider the map f : X → X defined by

    f(x) = { g(n + 1)   if x = g(n)
           { x          if x ∉ g(ℕ) ,

and note that f : X → X − {g(1)} is a 1-1 map, whence X is infinite. □
188
The following bit of natural notation will come in handy; if X and Yare sets, let us write IXI < WI if it is the case that IXI S; WI and IXI =f:. WI· The next result is a sort of "Euclidean algorithm" for general cardinal numbers. Proposition A.2.9 Let X and Y be arbitrary sets. Suppose Y is not empty. Then, there exists a partition
X
(A.2.8)
where I is some (possibly empty) index set, such that IXil = WI \I i E I, and IRI < WI; further, if I is infinite, then there exists a (possibly different) partition X = UiE! Yi with the property that WI = IYiI \Ii E I. (The R in equation A.2.8 is supposed to signify the "remainder after having divided X by yll.) Proof. If it is not true that IXI < WI and we may set I desired sort.
So suppose
WI
S;
IXI.
WI
S;
lXI,
then, by Lemma A.2.6, we must have
= (/) and R = X to obtain a decomposition of the
Let P denote the collection of all families S = {Xi:
i E I} of pairwise disjoint subsets of X, where I is some non-empty index set, with the property that IXil = WI \Ii E I. The assumption WI S; IXI implies that
there exists Xo C X such that non-empty.
WI = IXol;
thus {Xo} E P, and consequently Pis
It should be clear that P is partially ordered by inclusion, and that if C = {S_λ : λ ∈ Λ} is a totally ordered set in P, then S = ∪_{λ∈Λ} S_λ is an element of P which is "greater than or equal to" (i.e., contains) each member S_λ of C. Thus, every totally ordered set in P has an upper bound. Hence, by Zorn's lemma, there exists a maximal element in P, say S = {X_i : i ∈ I}. Let R = X − (∪_{i∈I} X_i). Clearly, then, equation A.2.8 is satisfied. Further, if it is not the case that |R| < |Y|, then, by Lemma A.2.6, we must have |Y| = |A| for some A ⊂ R; but then the family S′ = S ∪ {A} would contradict the maximality of S. This contradiction shows that it must have been the case that |R| < |Y|, as desired.
Finally, suppose we have a decomposition as in equation A.2.8, such that I is infinite. Then, by Lemma A.2.8, we can find a subset I_0 = {i_n : n ∈ ℕ} of I, where i_n ≠ i_m ∀ n ≠ m. Since |R| ≤ |X_{i_1}|, we can find a subset R_0 ⊂ X_{i_1} such that |R| = |R_0|; notice now that
    |X_{i_n}| = |X_{i_{n+1}}| ∀ n ∈ ℕ ,
    |X_i| = |X_i| ∀ i ∈ I − I_0 ,
    |R| = |R_0| ,

and deduce from Proposition A.2.4 that

    |X| = |( ⊔_{n∈ℕ} X_{i_n} ) ⊔ ( ⊔_{i∈I−I_0} X_i ) ⊔ R|
        = |( ⊔_{n∈ℕ} X_{i_{n+1}} ) ⊔ ( ⊔_{i∈I−I_0} X_i ) ⊔ R_0| ;               (A.2.9)
since the set on the right of equation A.2.9 is clearly a subset of Z = ⊔_{i∈I} X_i, we may deduce - from Theorem A.2.5 - that |X| = |Z|; if f : Z → X is a bijection (whose existence is thus guaranteed), let Y_i = f(X_i) ∀ i ∈ I; these Y_i's do what is wanted of them. □

Exercise A.2.10 (0) Show that Proposition A.2.9 is false if Y is the empty set, and find out where the non-emptiness of Y was used in the proof presented above.
(1) Why is Proposition A.2.9 a generalisation of the Euclidean algorithm (for natural numbers)?
Proposition A.2.11 A set X is infinite if and only if X is non-empty and there exists a partition X = ⊔_{n=1}^∞ A_n with the property that |X| = |A_n| ∀n.

Proof. We only prove the non-trivial ("only if") implication.
Case (i): X = ℕ. Consider the following listing of the elements of ℕ × ℕ:

    (1,1); (1,2), (2,1); (1,3), (2,2), (3,1); (1,4), ..., (4,1); ...

this yields a bijection f : ℕ → ℕ × ℕ; let A_n = f^{−1}({n} × ℕ); then ℕ = ⊔_{n∈ℕ} A_n and |A_n| = |ℕ| ∀ n ∈ ℕ.
Case (ii): X arbitrary. Suppose X is infinite. Thus there exists a 1-1 function f of X onto a proper subset X_1 ⊂ X; let Y = X − X_1, which is non-empty, by hypothesis. Inductively define Y_1 = Y, and Y_{n+1} = f(Y_n); note that {Y_n : n = 1, 2, ...} is a sequence of pairwise disjoint subsets of X with |Y| = |Y_n| ∀n.
By Case (i), we may write ℕ = ⊔_{n=1}^∞ A_n, where |A_n| = |ℕ| ∀n. Set W_n = ⊔_{k∈A_n} Y_k, and note that |W_n| = |Y × ℕ| ∀n. Set R = X − ∪_{n=1}^∞ Y_n, and observe that X = (⊔_{n=1}^∞ W_n) ⊔ R. Deduce - from the second half of Proposition A.2.9 - that there exists an infinite set I such that |X| = |Y × ℕ × I|. Observe now that Y × ℕ × I = ⊔_{n=1}^∞ (Y × A_n × I), and that |Y × ℕ × I| = |Y × A_n × I| ∀n. Thus, we find that we do indeed have a partition of the form X = ⊔_{n=1}^∞ X_n, where |X| = |X_n| ∀n, as desired. □
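The diagonal listing of ℕ × ℕ used in Case (i) can be made explicit (an illustrative sketch, not in the original text):

```python
from itertools import count, islice

def diagonal_pairs():
    """Enumerate N x N along anti-diagonals:
    (1,1); (1,2),(2,1); (1,3),(2,2),(3,1); ...
    The n-th component A_n of the partition consists of the positions
    at which pairs of the form (n, *) appear."""
    for s in count(2):               # s = i + j, the diagonal index
        for i in range(1, s):
            yield (i, s - i)

pairs = list(islice(diagonal_pairs(), 10))
print(pairs)   # first terms: (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ...
```

Every pair (i, j) appears exactly once (on diagonal s = i + j, in position i), which is precisely the bijectivity of f asserted in the proof.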
The following corollary follows immediately from Proposition A.2.11, and we omit the easy proof.
Corollary A.2.12 If X is an infinite set, then |X × ℕ| = |X|.
It is true, more generally, that if X and Y are infinite sets such that |Y| ≤ |X|, then |X × Y| = |X|; we do not need this fact and so we do not prove it here. The version with Y = ℕ is sufficient, for instance, to make sense of the dimension of a Hilbert space.

Proposition A.2.13 Any two orthonormal bases of a Hilbert space have the same cardinality; consequently, it makes sense to define the dimension of a Hilbert space H as the cardinality of any orthonormal basis.

Proof. Suppose {e_i : i ∈ I} and {f_j : j ∈ J} are two orthonormal bases of a Hilbert space H. First consider the case when I or J is finite. Then the conclusion that |I| = |J| is a consequence of Corollary A.1.10. We may assume, therefore, that both I and J are infinite.

For each i ∈ I, since {f_j : j ∈ J} is an orthonormal set, we know from Bessel's inequality (see Proposition 2.3.3) that the set J_i = {j ∈ J : ⟨e_i, f_j⟩ ≠ 0} is countable. On the other hand, since {e_i : i ∈ I} is an orthonormal basis, each j ∈ J must belong to at least one J_i (for some i ∈ I). Hence J = ∪_{i∈I} J_i. Set K = {(i, j) : i ∈ I, j ∈ J_i} ⊂ I × J; thus, the projection onto the second factor maps K onto J, which clearly implies that |J| ≤ |K|. Pick a 1-1 function f_i : J_i → ℕ, and consider the function F : K → I × ℕ defined by F(i, j) = (i, f_i(j)). It must be obvious that F is a 1-1 map, and hence we find (from Corollary A.2.12) that

|J| ≤ |K| = |F(K)| ≤ |I × ℕ| = |I|.
By reversing the roles of I and J, we find that the proof of the proposition is complete. □
Exercise A.2.14 Show that every vector space (over any field) has a basis, and that any two bases have the same cardinality. (Hint: Imitate the proof of Proposition A.2.13.)
A.3 Topological spaces

A topological space is a set X where there is a notion of "nearness" of points (although there is no metric in the background), and which (experience shows) is the natural context in which to discuss notions of continuity. Thus, to each point x ∈ X, there is singled out some family N(x) of subsets of X, whose members are called "neighbourhoods of the point x". (If X is a metric space, we say that N is a
neighbourhood of a point x if it contains all points sufficiently close to x; i.e., if there exists some ε > 0 such that N ⊇ B(x, ε) = {y ∈ X : d(x, y) < ε}.) A set is thought of as being "open" if it is a neighbourhood of each of its points; i.e., U is open if and only if U ∈ N(x) whenever x ∈ U. A topological space is an axiomatisation of this set-up; what we find is a simple set of requirements (or axioms) to be met by the family, call it τ, of all those sets which are open, in the sense above.

Definition A.3.1 A topological space is a set X, equipped with a distinguished collection τ of subsets of X, the members of τ being referred to as open sets, such that the following axioms are satisfied:
(1) X, ∅ ∈ τ; thus, the universe of discourse and the empty set are assumed to be open;
(2) U, V ∈ τ ⇒ U ∩ V ∈ τ (and consequently, any finite intersection of open sets is open); and
(3) {U_i : i ∈ I} ⊂ τ ⇒ ∪_{i∈I} U_i ∈ τ.
The family τ is called the topology on X. A subset F ⊂ X will be said to be closed precisely when its complement X − F is open.

Thus a topology on X is just a family of sets which contains the whole space and the empty set, and which is closed under the formation of finite intersections and arbitrary unions. Of course, every metric space is a topological space in a natural way: we declare that a set is open precisely when it is expressible as a union of open balls. Some other easy (but perhaps somewhat pathological) examples of topological spaces are given below.

Example A.3.2 (1) Let X be any set; define τ = {X, ∅}. This is a topology, called the indiscrete topology.

(2) Let X be any set; define τ = 2^X = {A : A ⊂ X}; thus, every set is open in this topology, and this is called the discrete topology.

(3) Let X be any set; the so-called "co-finite topology" is defined by declaring a set to be open if it is either empty or if its complement is finite.

(4) Replacing every occurrence of the word "finite" in (3) above by the word "countable" gives rise to a topology called, naturally, the "co-countable topology" on X. (Of course, this topology is interesting only if the set X is uncountable, just as the co-finite topology is interesting only for infinite sets X; in the contrary case, the resulting topology degenerates into the discrete topology.) □

Just as convergent sequences are a useful notion when dealing with metric spaces, we will find that nets (see Definition 2.2.3, as well as Example 2.2.4(3)) will be very useful while dealing with general topological spaces. As an instance, we cite the following result:
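For a finite set X, the axioms of Definition A.3.1 can be checked mechanically. The following sketch (the helper `is_topology` is my own, not from the text) verifies the axioms for two of the examples above; note that on a finite set, closure under pairwise unions already yields closure under arbitrary unions.

```python
# A sketch (helper name is my own, not from the text): checking the axioms of
# Definition A.3.1 for a family tau of subsets of a finite set X.
from itertools import combinations

def is_topology(X, tau):
    tau = {frozenset(s) for s in tau}
    X = frozenset(X)
    if X not in tau or frozenset() not in tau:
        return False              # axiom (1): X and the empty set must be open
    for U, V in combinations(tau, 2):
        if U & V not in tau:      # axiom (2): finite intersections
            return False
        if U | V not in tau:      # axiom (3): on a finite set, closure under
            return False          # pairwise unions gives arbitrary unions
    return True

X = {1, 2, 3}
print(is_topology(X, [set(), X]))             # True: the indiscrete topology
print(is_topology(X, [set(s) for s in
      ([], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3])]))  # True: discrete
print(is_topology(X, [set(), {1}, {2}, X]))   # False: {1} ∪ {2} is not open
```

The last family fails only axiom (3): it is closed under intersections but not under unions.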
Appendix
192
Proposition A.3.3 The following conditions on a subset F of a topological space X are equivalent:
(i) F is closed; i.e., U = X − F is open;
(ii) if {x_i : i ∈ I} is any net in F which converges to a limit x ∈ X, then x ∈ F.

Proof. (i) ⇒ (ii): Suppose x_i → x as in (ii); suppose x ∉ F; then x ∈ U, and since U is open, the definition of a convergent net shows that there exists some i₀ ∈ I such that x_i ∈ U ∀i ≥ i₀; but this contradicts the assumption that x_i ∈ F ∀i.

(ii) ⇒ (i): Suppose (ii) is satisfied; we assert that if x ∈ U, then there exists an open set B_x ⊂ U such that x ∈ B_x. (This will exhibit U as the union ∪_{x∈U} B_x of open sets and thereby establish (i).) Suppose our assertion were false; this would mean that there exists an x ∈ U such that every open neighbourhood V of x (i.e., every open set containing x) contains a point x_V ∈ V ∩ F. Then (see Example 2.2.4(3)) {x_V : V ∈ N(x)} would be a net in F which converges to the point x ∉ F, and the desired contradiction has been reached. □

We gather a few more simple facts concerning closed sets in the next result.
Proposition A.3.4 Let X be a topological space, and let us temporarily write ℱ for the class of all closed sets in X.

(1) The family ℱ has the following properties (and a topological space can clearly be equivalently defined as a set X with a distinguished class ℱ of subsets of X, called the closed sets, which satisfy the following properties):
(a) X, ∅ ∈ ℱ;
(b) F₁, F₂ ∈ ℱ ⇒ (F₁ ∪ F₂) ∈ ℱ;
(c) if I is an arbitrary set, then {F_i : i ∈ I} ⊂ ℱ ⇒ ∩_{i∈I} F_i ∈ ℱ.

(2) If A ⊂ X is any subset of X, the closure of A is the set, which will always be denoted by the symbol Ā, defined by

Ā = ∩ {F ∈ ℱ : A ⊂ F};   (A.3.10)

then
(a) Ā is a closed set, and it is the smallest closed set which contains A;
(b) A ⊂ B ⇒ Ā ⊂ B̄;
(c) x ∈ Ā ⇔ U ∩ A ≠ ∅ for every open set U containing x.

Proof. The proof of (1) is a simple exercise in complementation (and uses nothing more than the so-called "de Morgan's laws"). (2)(a) is a consequence of (1)(c), while (b) follows immediately from (a); and (c) follows from the definitions (and the fact that a set is closed precisely when its complement is open). □
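In a finite topological space, the intersection in equation (A.3.10) can be evaluated literally; the following sketch (the helper `closure` and the particular topology are my own illustrations, not from the text) computes Ā as the intersection of all closed supersets of A.

```python
# Illustrative sketch (helper name is my own, not from the text): computing the
# closure of equation (A.3.10) in a small finite topological space.
def closure(A, X, tau):
    A, X = frozenset(A), frozenset(X)
    closed = [X - frozenset(U) for U in tau]   # closed sets = complements of open sets
    result = X
    for F in closed:
        if A <= F:                             # intersect over closed supersets of A
            result &= F
    return result

X = {1, 2, 3}
tau = [set(), {1}, {1, 2}, X]                  # a topology on X (check the axioms!)
print(sorted(closure({1}, X, tau)))            # [1, 2, 3]: the only closed superset of {1} is X
print(sorted(closure({2}, X, tau)))            # [2, 3]
print(sorted(closure({3}, X, tau)))            # [3]: {3} is already closed
```

The outputs illustrate (2)(a): each result is the smallest closed set containing the given set.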
The following exercise lists some simple consequences of the definition of the closure operation, and also contains a basic definition.

Exercise A.3.5 (1) Let A be a subset of a topological space X, and let x ∈ X; show that x ∈ Ā if and only if there is a net {x_i : i ∈ I} in A such that x_i → x.

(2) If X is a metric space, show that nets can be replaced by sequences in (1).

(3) A subset D of a topological space X is said to be dense if D̄ = X. (More generally, a set A is said to be "dense in a set B" if B ⊂ Ā.) Show that the following conditions on the subset D ⊂ X are equivalent:
(i) D is dense (in X);
(ii) for each x ∈ X, there exists a net {x_i : i ∈ I} in D such that x_i → x;
(iii) if U is any non-empty open set in X, then D ∩ U ≠ ∅.

In dealing with metric spaces, we rarely have to deal explicitly with general open sets; open balls suffice in most contexts. These are generalised in the following manner.

Proposition A.3.6 (1) Let (X, τ) be a topological space. The following conditions on a subcollection ℬ ⊂ τ are equivalent:
(i) U ∈ τ ⇒ ∃ {B_i : i ∈ I} ⊂ ℬ such that U = ∪_{i∈I} B_i;
(ii) x ∈ U, U open ⇒ ∃ B ∈ ℬ such that x ∈ B ⊂ U.
A collection ℬ satisfying these equivalent conditions is called a base for the topology τ.

(2) A family ℬ is a base for some topology τ on X if and only if ℬ satisfies the following two conditions:
(a) ℬ covers X, meaning that X = ∪_{B∈ℬ} B; and
(b) B₁, B₂ ∈ ℬ, x ∈ B₁ ∩ B₂ ⇒ ∃ B ∈ ℬ such that x ∈ B ⊂ (B₁ ∩ B₂).

The elementary proof of this proposition is left as an exercise for the reader. It should be fairly clear that if ℬ is a base for a topology τ on X, and if τ' is any topology such that ℬ ⊂ τ', then necessarily τ ⊂ τ'. Thus, if ℬ is a base for a topology τ, then τ is the "smallest topology" with respect to which all the members of ℬ are open. However, as Proposition A.3.6(2) shows, not every collection of sets can be a base for some topology. This state of affairs is partially remedied in the following proposition.

Proposition A.3.7 (a) Let X be a set and let S be an arbitrary family of subsets of X. Then there exists a smallest topology τ(S) on X such that S ⊂ τ(S); we shall refer to τ(S) as the topology generated by S.
(b) Let X, S, τ(S) be as in (a) above. Let ℬ = {X} ∪ {B : ∃ n ∈ ℕ and S₁, S₂, ..., S_n ∈ S such that B = ∩_{i=1}^n S_i}. Then ℬ is a base for the topology τ(S); in particular,
a typical element of τ(S) is expressible as an arbitrary union of finite intersections of members of S.

If (X, τ) is a topological space, a family S is said to be a sub-base for the topology τ if it is the case that τ = τ(S).

Proof. For (a), we may simply define

τ(S) = ∩ {τ' : τ' is a topology and S ⊂ τ'},

and note that this does the job.

As for (b), it is clear that the family ℬ, as defined in (b), covers X and is closed under finite intersections; consequently, if we define τ = {∪_{B∈ℬ₀} B : ℬ₀ ⊂ ℬ}, it may be verified that τ is a topology on X for which ℬ is a base; on the other hand, it is clear from the construction (and the definition of a topology) that if τ' is any topology which contains S, then τ' must necessarily contain τ, and hence τ = τ(S). □

The usefulness of sub-bases may be seen as follows: very often, in wanting to define a topology, we will find it natural to require that sets belonging to a certain class S should be open; the topology we seek is then any topology which is at least as large as τ(S), and this minimal topology is quite a good one to work with, since we know, by the last proposition, precisely what the open sets in this topology look like. In order to make all this precise, as well as for other reasons, we need to discuss the notion of continuity in the context of topological spaces.

Definition A.3.8 A function f : X → Y between topological spaces is said to be:
(a) continuous at the point x ∈ X, if f⁻¹(U) is an open neighbourhood of x in the topological space X whenever U is an open neighbourhood of the point f(x) in the topological space Y;
(b) continuous if it is continuous at each x ∈ X, or equivalently, if f⁻¹(U) is an open set in X whenever U is an open set in Y.

The proof of the following elementary proposition is left as an exercise for the reader.

Proposition A.3.9 Let f : X → Y be a map between topological spaces.

(1) If x ∈ X, then f is continuous at x if and only if {f(x_i) : i ∈ I} is a net converging to f(x) in Y whenever {x_i : i ∈ I} is a net converging to x in X.
(2) The following conditions on f are equivalent:
(i) f is continuous;
(ii) f⁻¹(F) is a closed subset of X whenever F is a closed subset of Y;
(iii) {f(x_i) : i ∈ I} is a net converging to f(x) in Y whenever {x_i : i ∈ I} is a net converging to x in X;
(iv) f⁻¹(B) is open in X whenever B belongs to some base for the topology on Y;
(v) f⁻¹(S) is open in X whenever S belongs to some sub-base for the topology on Y.
(3) The composition of continuous maps is continuous; i.e., if f : X → Y and g : Y → Z are continuous maps of topological spaces, then g ∘ f : X → Z is continuous.

We are now ready to illustrate what we meant in our earlier discussion of the usefulness of sub-bases. Typically, we have the following situation in mind: suppose X is just some set, {Y_i : i ∈ I} is some family of topological spaces, and we have maps f_i : X → Y_i, ∀i ∈ I. We would want to topologise X in such a way that each of the maps f_i is continuous. This, by itself, is not difficult, since if X is equipped with the discrete topology, any map from X into any topological space is continuous; but if we want to topologise X in an efficient, as well as natural, manner with respect to the requirement that each f_i be continuous, then the method of sub-bases tells us what to do. Let us make all this explicit.
Then,
(a) fi is continuous as a mapping from the topological space (X, T) into the topological space (Xi, Ti), for each i E I; (b) the topology T is the smallest topology on X for which (a) above is valid; and consequently, this topology is independent of the choice of the sub-bases Si, i E I and depends only upon the data {fi, Ti, i E I}. This topology T on X is called the weak topology induced by the family {fi : i E I} of maps, and we shall write T = T( {fi : i E I}).
The proposition is an immediate consequence of Proposition A.3.9(2)(v) and Proposition A.3. 7. Exercise A.3.11 Let X, Xi, Ti, fi and T = T( {fi : i E I}) be as in Proposition A.3.10. (a) Suppose Z is a topological space and g : Z ---+ X is a function. Then, show that 9 is continuous if and only if fi 0 9 is continuous for each i E I.
(b) Show that the family ℬ = {∩_{j=1}^n f_{i_j}⁻¹(V_{i_j}) : i₁, ..., i_n ∈ I, n ∈ ℕ, V_{i_j} ∈ τ_{i_j} ∀j} is a base for the topology τ.
(c) Show that a net {x_λ : λ ∈ Λ} converges to x in (X, τ) if and only if the net {f_i(x_λ) : λ ∈ Λ} converges to f_i(x) in (X_i, τ_i), for each i ∈ I.
As in a metric space, any subspace (= subset) of a topological space acquires a natural structure of a topological space in the manner indicated in the following exercise.

Exercise A.3.12 Let (Y, τ) be a topological space, and let X ⊂ Y be a subset. Let i_{X→Y} : X → Y denote the inclusion map. Then the subspace topology on X (or the topology on X induced by the topology τ on Y) is, by definition, the weak topology τ({i_{X→Y}}). This topology will be denoted by τ|_X.
(a) Show that τ|_X = {U ∩ X : U ∈ τ}, or equivalently, that a subset F ⊂ X is closed in (X, τ|_X) if and only if there exists a closed set F₁ in (Y, τ) such that F = F₁ ∩ X.
(b) Show that if Z is some topological space and f : Z → X is a function, then f is continuous when regarded as a map into the topological space (X, τ|_X) if and only if it is continuous when regarded as a map into the topological space (Y, τ).

One of the most important special cases of this construction is the product of topological spaces. Suppose {(X_i, τ_i) : i ∈ I} is an arbitrary family of topological spaces. Let X = ∏_{i∈I} X_i denote their Cartesian product. Then the product topology is, by definition, the weak topology on X induced by the family {π_i : X → X_i | i ∈ I}, where π_i denotes, for each i ∈ I, the natural "projection" of X onto X_i. We shall denote this product topology by ∏_{i∈I} τ_i. Note that if ℬ_i is a base for the topology τ_i, then a base for the product topology is given by the family ℬ, where a typical element of ℬ has the form
B = {x ∈ X : π_i(x) ∈ B_i ∀i ∈ I₀},

where I₀ is an arbitrary finite subset of I and B_i ∈ ℬ_i for each i ∈ I₀. Thus, a typical basic open set is prescribed by constraining finitely many co-ordinates to lie in specified basic open sets of the appropriate spaces.

Note that if Z is any set, then maps f : Z → X are in a 1-1 correspondence with families {f_i : Z → X_i | i ∈ I} of maps, where f_i = π_i ∘ f; and it follows from Exercise A.3.11 that if Z is a topological space, then the mapping f is continuous precisely when each f_i is continuous. To make sure that you have really understood the definition of the product topology, you are urged to work out the following exercise.

Exercise A.3.13 If (X, τ) is a topological space, and if I is any set, let X^I denote the space of functions f : I → X.
(a) Show that X^I may be identified with the product ∏_{i∈I} X_i, where X_i = X ∀i ∈ I.
(b) Let X^I be equipped with the product topology; fix x₀ ∈ X and show that the set D = {f ∈ X^I : f(i) = x₀ for all but a finite number of i's} is dense in X^I, meaning that if U is any non-empty open set in X^I, then D ∩ U ≠ ∅.
(c) If Λ is a directed set, show that a net {f_λ : λ ∈ Λ} in X^I converges to a point f ∈ X^I if and only if the net {f_λ(i) : λ ∈ Λ} converges to f(i), for each i ∈ I. (In other words, the product topology on X^I is nothing but the topology of "pointwise convergence".)

We conclude this section with a brief discussion of "homeomorphisms".
Definition A.3.14 Two topological spaces X and Y are said to be homeomorphic if there exist continuous functions f : X → Y, g : Y → X such that f ∘ g = id_Y and g ∘ f = id_X. A homeomorphism is a map f as above; i.e., a continuous bijection between two topological spaces whose (set-theoretic) inverse is also continuous.

The reader should observe that requiring that a function f : X → Y be a homeomorphism is more than just requiring that f be 1-1, onto and continuous; if only so much is required of the function, then the inverse f⁻¹ may fail to be continuous; an example of this phenomenon is provided by the function f : [0, 1) → 𝕋 defined by f(t) = exp(2πit).

The proof of the following proposition is elementary, and left as an exercise to the reader.
Proposition A.3.15 Suppose f is a continuous bijective map of a topological space X onto a space Y; then the following conditions are equivalent:
(i) f is a homeomorphism;
(ii) f is an open map; i.e., if U is an open set in X, then f(U) is an open set in Y;
(iii) f is a closed map; i.e., if F is a closed set in X, then f(F) is a closed set in Y.
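The failure of continuity of the inverse in the example f(t) = exp(2πit) above can be seen numerically; the following sketch is my own illustration, not from the text.

```python
# A numerical illustration (my own, not from the text) of why the continuous
# bijection f(t) = exp(2*pi*i*t) of [0, 1) onto the circle is not a
# homeomorphism: two points of the circle just "before" and just "after" the
# point 1 are close together, but their preimages sit at opposite ends of [0, 1).
import cmath
import math

def f(t):
    return cmath.exp(2j * math.pi * t)

t1, t2 = 0.999, 0.001
print(abs(f(t1) - f(t2)))   # tiny: the images nearly coincide on the circle
print(abs(t1 - t2))         # ~0.998: f^{-1} "tears" the circle apart at 1
```

In the language of Proposition A.3.15, f fails to be an open map: the image of the open set [0, 1/2), for instance, is a half-circle containing its endpoint 1.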
A.4 Compactness

This section is devoted to a quick review of the theory of compact spaces. For the uninitiated reader, the best way to get a feeling for compactness is by understanding this notion for subsets of the real line. The features of compact subsets of ℝ (and, more generally, of any ℝⁿ) are summarised in the following result.
Theorem A.4.1 The following conditions on a subset K ⊂ ℝⁿ are equivalent:
(i) K is closed and bounded;
(ii) every sequence in K has a subsequence which converges to some point in K;
(iii) every open cover of K has a finite sub-cover; i.e., if {U_i : i ∈ I} is any collection of open sets such that K ⊂ ∪_{i∈I} U_i, then there exists a finite sub-collection {U_{i_j} : 1 ≤ j ≤ n} which still covers K (meaning that K ⊂ ∪_{j=1}^n U_{i_j});
(iv) suppose {F_i : i ∈ I} is any family of closed sets which has the finite intersection property with respect to K, i.e., ∩_{i∈I₀} F_i ∩ K ≠ ∅ for every finite subset I₀ ⊂ I; then it is necessarily the case that ∩_{i∈I} F_i ∩ K ≠ ∅.
A subset K ⊂ ℝⁿ which satisfies the equivalent conditions above is said to be compact. □

We will not prove this here, since we shall be proving more general statements. To start with, note that conditions (iii) and (iv) of Theorem A.4.1 make sense for any subset K of a topological space X, and are easily seen to be equivalent (by considering the sets F_i = X − U_i); also, while conditions (i) and (ii) make sense in any topological space, they may not be very strong in general. What we shall see is that conditions (ii)-(iv) are equivalent for any subset of a metric space, and that these are equivalent to a stronger version of (i). We begin with the appropriate definition.

Definition A.4.2 A subset K of a topological space is said to be compact if it satisfies the equivalent conditions (iii) and (iv) of Theorem A.4.1.

"Compactness" is an intrinsic property, as asserted in the following exercise.

Exercise A.4.3 Suppose K is a subset of a topological space (X, τ). Show that K is compact when regarded as a subset of the topological space (X, τ) if and only if K is compact when regarded as a subset of the topological space (K, τ|_K); see Exercise A.3.12.

Proposition A.4.4 (a) Let ℬ be any base for the topology underlying a topological space X. Then a subset K ⊂ X is compact if and only if any cover of K by open sets belonging to the base ℬ admits a finite sub-cover.
(b) A closed subset of a compact set is compact.

Proof. (a) Clearly every "basic open cover" of a compact set admits a finite sub-cover. Conversely, suppose {U_i : i ∈ I} is an open cover of a set K which has the property that every cover of K by members of ℬ admits a finite sub-cover. For each i ∈ I, find a subfamily ℬ_i ⊂ ℬ such that U_i = ∪_{B∈ℬ_i} B; then {B : B ∈ ℬ_i, i ∈ I} is a cover of K by members of ℬ; so there exist B₁, ..., B_n in this family such that K ⊂ ∪_{j=1}^n B_j; for each j = 1, ..., n, by construction, we can find i_j ∈ I such that B_j ⊂ U_{i_j}; it follows that K ⊂ ∪_{j=1}^n U_{i_j}.
(b) Suppose C ⊂ K ⊂ X, where K is a compact subset of X and C is a subset of K which is closed in the subspace topology of K. Thus, there exists a closed
set F ⊂ X such that C = F ∩ K. Now, suppose {U_i : i ∈ I} is an open cover of C. Then {U_i : i ∈ I} ∪ {X − F} is an open cover of K; so there exists a finite subset I₀ ⊂ I such that K ⊂ ∪_{i∈I₀} U_i ∪ (X − F), which clearly implies that C ⊂ ∪_{i∈I₀} U_i. □

Corollary A.4.5 Let K be a compact subset of a metric space X. Then, for any ε > 0, there exists a finite subset N_ε ⊂ K such that K ⊂ ∪_{x∈N_ε} B(x, ε) (where, of course, the symbol B(x, ε) = {y ∈ X : d(y, x) < ε} denotes the open ball with centre x and radius ε). Any set N_ε as above is called an ε-net for K.
Proof. The family of all open balls with (arbitrary centres and) radii bounded by ε/2 clearly constitutes a base for the topology of the metric space X. (Verify this!) Hence, by the preceding proposition, we can find y₁, ..., y_n ∈ X such that K ⊂ ∪_{i=1}^n B(y_i, ε/2). We may assume, without loss of generality, that no proper sub-family of these n open balls covers K; i.e., we may assume that there exists x_i ∈ K ∩ B(y_i, ε/2) ∀i. Then clearly K ⊂ ∪_{i=1}^n B(x_i, ε), and so N_ε = {x₁, ..., x_n} is an ε-net for K. □

Definition A.4.6 A subset A of a metric space X is said to be totally bounded if, for each ε > 0, there exists a finite ε-net for A.

Thus, compact subsets of metric spaces are totally bounded. The next proposition provides an alternative criterion for total boundedness.

Proposition A.4.7 The following conditions on a subset A of a metric space X are equivalent:
(i) A is totally bounded;
(ii) every sequence in A has a Cauchy subsequence.
Proof. (i) ⇒ (ii): Suppose S₀ = {x_k}_{k=1}^∞ is a sequence in A. We assert that there exist (a) a sequence {B_n = B(z_n, 2⁻ⁿ)} of open balls (with shrinking radii) in X, and (b) sequences S_n = {x_k^{(n)}}_{k=1}^∞, n = 1, 2, ..., with the property that S_n ⊂ B_n and S_n is a subsequence of S_{n−1}, for all n ≥ 1.

We construct the B_n's and the S_n's inductively. To start with, pick a finite ½-net N_{1/2} for A; clearly, there must be some z₁ ∈ N_{1/2} with the property that x_k ∈ B(z₁, ½) for infinitely many values of k; define B₁ = B(z₁, ½) and choose S₁ to be any subsequence {x_k^{(1)}}_{k=1}^∞ of S₀ with the property that x_k^{(1)} ∈ B₁ ∀k. Suppose now that open balls B_j = B(z_j, 2⁻ʲ) and sequences S_j, 1 ≤ j ≤ n, have been chosen so that S_j is a subsequence of S_{j−1} which lies in B_j, ∀ 1 ≤ j ≤ n. Now let N_{2^{−(n+1)}} be a finite 2^{−(n+1)}-net for A; as before, we may argue that there must exist some z_{n+1} ∈ N_{2^{−(n+1)}} with the property that if B_{n+1} = B(z_{n+1}, 2^{−(n+1)}), then x_k^{(n)} ∈ B_{n+1} for infinitely many values of k; we choose S_{n+1} to be a subsequence of S_n which lies entirely in B_{n+1}.
Thus, by induction, we conclude the existence of sequences of open balls B_n and sequences S_n with the asserted properties. Now define y_n = x_n^{(n)} and note that (i) {y_n} is a subsequence of the initial sequence S₀, and (ii) if n, m ≥ k, then y_n, y_m ∈ B_k and hence d(y_n, y_m) < 2^{1−k}; this is the desired Cauchy subsequence.

(ii) ⇒ (i): Suppose A is not totally bounded; then there exists some ε > 0 such that A admits no finite ε-net; this means that given any finite subset F ⊂ A, there exists some a ∈ A such that d(a, x) ≥ ε ∀x ∈ F. Pick x₁ ∈ A (this is possible, since we may take F = ∅ in the previous sentence); then pick x₂ ∈ A such that d(x₁, x₂) ≥ ε (this is possible, by setting F = {x₁} in the previous sentence); then (set F = {x₁, x₂} in the previous sentence and) pick x₃ ∈ A such that d(x₃, x_i) ≥ ε, i = 1, 2; and so on. This yields a sequence {x_n}_{n=1}^∞ in A such that d(x_n, x_m) ≥ ε ∀ n ≠ m. This sequence clearly has no Cauchy subsequence, thereby contradicting the assumption (ii), and the proof is complete. □

Corollary A.4.8 If A is a totally bounded set in a metric space, so also are its closure Ā and any subset of A.

Remark A.4.9 The argument to prove (i) ⇒ (ii) in the above proposition has a very useful component which we wish to single out: starting with a sequence S = {x_k}, we proceeded, by some method (which method was dictated by the specific purpose on hand, and is not relevant here) to construct sequences S_n = {x_k^{(n)}} with the property that each S_{n+1} was a subsequence of S_n (and with some additional desirable properties); we then considered the sequence {x_n^{(n)}}_{n=1}^∞. This process is sometimes referred to as the diagonal argument.
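The nested-subsequence idea behind Proposition A.4.7 and Remark A.4.9 can be simulated for a finite list of points; the following sketch (the helper `clustering_subsequence` and the sample sequence are my own, not from the text) repeatedly keeps the half of the current interval containing "most" of the surviving terms, standing in for the "infinitely many" of the proof.

```python
# A finite simulation (my own, not from the text) of the shrinking-balls
# construction of Proposition A.4.7: after `depth` halvings of [0, 1], all
# surviving terms lie in an interval of length 2**-depth, so they are pairwise
# that close to one another.
import math

def clustering_subsequence(x, depth=8):
    """x: a finite list standing in for a sequence in [0, 1).
    Returns the (increasing) indices that survive `depth` halvings."""
    indices = list(range(len(x)))
    lo, hi = 0.0, 1.0
    for _ in range(depth):
        mid = (lo + hi) / 2
        left = [i for i in indices if x[i] <= mid]
        right = [i for i in indices if x[i] > mid]
        # keep the fuller half, together with the corresponding half-interval
        indices, (lo, hi) = (left, (lo, mid)) if len(left) >= len(right) else (right, (mid, hi))
    return indices

x = [math.sin(n * n) % 1 for n in range(10000)]   # an aimless sequence in [0, 1)
idx = clustering_subsequence(x)
vals = [x[i] for i in idx]
print(idx == sorted(idx))                  # True: the indices form a subsequence
print(max(vals) - min(vals) <= 2 ** -8)    # True: the surviving terms cluster
```

A genuinely infinite sequence would require the diagonal pick y_n = x_n^{(n)} of the proof; here the finite analogue is simply the surviving index set.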
(2) Show that a subset of ffi.n is bounded if and only if it is totally bounded. (Hint: by Corollary A.4.8 and (1) above, it is enough to establish the total-boundedness of the set given by K = {x = (Xl, ... , Xn) : IXi I ~ N Vi}; given E, pick k such that the diameter of an n-cube of side is smaller than (, and consider the points of the form (T' ~, ... , 'f) in order to find an (-net.)
i
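The hint in (2) can be carried out explicitly; the following sketch (the helper `grid_eps_net` is my own, not from the text) builds the grid ε-net for the cube, using the fact that an n-cube of side 1/k has diameter √n/k.

```python
# A sketch (my own, not from the text) of the hint: grid points with spacing
# 1/k form an eps-net for the cube {x : |x_i| <= N for all i} as soon as the
# diameter sqrt(n)/k of an n-cube of side 1/k is smaller than eps.
import math
from itertools import product

def grid_eps_net(N, n, eps):
    k = math.ceil(math.sqrt(n) / eps) + 1          # ensures sqrt(n)/k < eps
    steps = [i / k for i in range(-N * k, N * k + 1)]
    return list(product(steps, repeat=n)), 1 / k

net, spacing = grid_eps_net(N=1, n=2, eps=0.5)
# any point of the square [-1, 1]^2 is within half a cell-diagonal of the net:
print(math.sqrt(2) * spacing / 2 < 0.5)            # True
print(len(net))                                    # 81 grid points suffice here
```

Note the trade-off: the size of the net grows like (2Nk + 1)ⁿ, which is finite for every ε but explodes as ε shrinks or n grows; total boundedness asserts only finiteness, not efficiency.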
We are almost ready to prove the announced equivalence of the various conditions for compactness; but first, we need a technical result which we take care of before proceeding to the desired assertion.

Lemma A.4.11 The following conditions on a metric space X are equivalent:
(i) X is separable; i.e., there exists a countable set D which is dense in X (meaning that X = D̄);
(ii) X satisfies the second axiom of countability, meaning that there is a countable base for the topology of X.
Proof. (i) ⇒ (ii): Let ℬ = {B(x, r) : x ∈ D, 0 < r ∈ ℚ}, where D is some countable dense set in X and ℚ denotes the (countable) set of rational numbers. We assert that this is a base for the topology on X; for, if U is an open set and y ∈ U, we may find s > 0 such that B(y, s) ⊂ U; pick x ∈ D ∩ B(y, s/2) and a positive rational number r such that d(x, y) < r < s/2, and note that y ∈ B(x, r) ⊂ U; thus, for each y ∈ U, we have found a ball B_y ∈ ℬ such that y ∈ B_y ⊂ U; hence U = ∪_{y∈U} B_y.

(ii) ⇒ (i): If ℬ = {B_n}_{n=1}^∞ is a countable base for the topology of X (where we may assume that each B_n is a non-empty set, without loss of generality), pick a point x_n ∈ B_n for each n and verify that D = {x_n}_{n=1}^∞ is indeed a countable dense set in X. □

Theorem A.4.12 The following conditions on a subset K of a metric space X are equivalent:
(i) K is compact;
(ii) K is complete and totally bounded;
(iii) every sequence in K has a subsequence which converges to some point of K.
Proof. (i) ⇒ (ii): We have already seen that compactness implies total boundedness. Suppose now that {x_n} is a Cauchy sequence in K. Let F_n be the closure of the set {x_m : m ≥ n}, for n = 1, 2, .... Then {F_n}_{n=1}^∞ is clearly a decreasing sequence of closed sets in X; further, the Cauchy criterion implies that diam F_n → 0 as n → ∞. By invoking the finite intersection property, we see that there must exist a point x ∈ ∩_{n=1}^∞ F_n ∩ K. Since the diameters of the F_n's shrink to 0, we may conclude that (such an x is unique and that) x_n → x as n → ∞.

(ii) ⇔ (iii): This is an immediate consequence of Proposition A.4.7.

(ii) ⇒ (i): For each n = 1, 2, ..., let N_{1/n} be a 1/n-net for K. It should be clear that D = ∪_{n=1}^∞ N_{1/n} is a countable dense set in K. Consequently K is separable; hence, by Lemma A.4.11, K admits a countable base. In order to check that K is compact in X, it is enough, by Exercise A.4.3, to check that K is compact in itself. Hence, by Proposition A.4.4, it is enough to check that any countable open cover of K has a finite sub-cover; by the reasoning that goes into establishing the equivalence of conditions (iii) and (iv) in Theorem A.4.1, we have, therefore, to show that if {F_n}_{n=1}^∞ is a sequence of closed sets in K such that ∩_{n=1}^N F_n ≠ ∅ for each N = 1, 2, ..., then necessarily ∩_{n=1}^∞ F_n ≠ ∅. For this, pick a point x_N ∈ ∩_{n=1}^N F_n for each N; appeal to the hypothesis to find a subsequence {y_n} of {x_n} such that y_n → x for some point x ∈ K. Now notice that, for each N, {y_m}_{m≥N} is a sequence in ∩_{n=1}^N F_n which converges to x; conclude that x ∈ F_n ∀n; hence ∩_{n=1}^∞ F_n ≠ ∅. □

In view of Exercise A.4.10(2), we see that Theorem A.4.12 does indeed generalise Theorem A.4.1.
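The 1/n-nets invoked in the proof of (ii) ⇒ (i) can be produced by the greedy procedure implicit in the (ii) ⇒ (i) argument of Proposition A.4.7; the following sketch (the helper `greedy_eps_net` and the random data are my own, not from the text) illustrates it for a finite set of points in the plane.

```python
# A sketch (my own, not from the text) of greedily producing an eps-net: keep
# admitting points lying at distance >= eps from all points admitted so far.
# For a totally bounded (here: finite) set the process must stop, and the
# admitted points then form an eps-net.
import math
import random

def greedy_eps_net(points, eps):
    net = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in net):
            net.append(p)
    return net

random.seed(0)
A = [(random.random(), random.random()) for _ in range(500)]
net = greedy_eps_net(A, 0.2)
# every point of A lies within eps of some point of the net ...
print(all(any(math.dist(p, q) < 0.2 for q in net) for p in A))          # True
# ... while the net points are pairwise >= eps apart, exactly the kind of
# eps-separated family that cannot go on forever in a totally bounded set
print(all(math.dist(p, q) >= 0.2 for p in net for q in net if p != q))  # True
```

If the underlying set were not totally bounded, this greedy process could run forever, which is precisely the ε-separated sequence built in the proof of Proposition A.4.7.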
We now wish to discuss compactness in general topological spaces. We begin with some elementary results.
Proposition A.4.13 (a) Suppose f : X → Y is a continuous map of topological spaces; if K is a compact subset of X, then f(K) is a compact subset of Y; in other words, a continuous image of a compact set is compact.

(b) If f : X → ℝ is continuous, and if K is a compact set in X, then (i) f(K) is bounded, and (ii) there exist points x, y ∈ K such that f(x) ≤ f(z) ≤ f(y) ∀z ∈ K; in other words, a continuous real-valued function on a compact set is bounded and attains its bounds.
Proof. (a) If {U_i : i ∈ I} is an open cover of f(K), then {f⁻¹(U_i) : i ∈ I} is an open cover of K; if {f⁻¹(U_i) : i ∈ I₀} is a finite sub-cover of K, then {U_i : i ∈ I₀} is a finite sub-cover of f(K).
(b) This follows from (a), since compact subsets of ℝ are closed and bounded (and hence contain their supremum and infimum). □

The following result is considerably stronger than Proposition A.4.4(a).
Theorem A.4.14 (Alexander sub-base theorem) Let S be a sub-base for the topology τ underlying a topological space X. Then a subspace K ⊂ X is compact if and only if any open cover of K by a sub-family of S admits a finite sub-cover.

Proof. Suppose S is a sub-base for the topology of X and suppose that any open cover of K by a sub-family of S admits a finite sub-cover. Let ℬ be the base generated by S; i.e., a typical element of ℬ is a finite intersection of members of S. In view of Proposition A.4.4(a), the theorem will be proved once we establish that any cover by members of ℬ admits a finite sub-cover (by Exercise A.4.3, it suffices to treat the case K = X, which we do from now on).

Assume this is false; thus, suppose U₀ = {B_i : i ∈ I₀} ⊂ ℬ is an open cover of X which does not admit a finite sub-cover. Let P denote the set of all families U with U₀ ⊂ U ⊂ ℬ which have the property that no finite sub-family of U covers X. (Notice that any such U is an open cover of X, since U ⊃ U₀.) It should be obvious that P is partially ordered by inclusion, and that P ≠ ∅ since U₀ ∈ P. Suppose {U_i : i ∈ I} is a totally ordered subset of P; we assert that then U = ∪_{i∈I} U_i ∈ P. (Reason: clearly U₀ ⊂ ∪_{i∈I} U_i ⊂ ℬ; further, suppose {B₁, ..., B_n} ⊂ ∪_{i∈I} U_i; the assumption that the family {U_i} is totally ordered then shows that in fact there must exist some i ∈ I such that {B₁, ..., B_n} ⊂ U_i; by definition of P, it cannot be the case that {B₁, ..., B_n} covers X; thus, we have shown that no finite sub-family of U covers X.) Hence Zorn's lemma is applicable to the partially ordered set P.

Thus, if the theorem were false, it would be possible to find a family U ⊂ ℬ which is an open cover of X, and which further has the following properties: (i) no finite sub-family of U covers X; and (ii) U is a maximal element of P, which means that whenever B ∈ ℬ − U, there exists a finite sub-family U_B of U such that U_B ∪ {B} is a (finite) cover of X.
Now, by definition of ℬ, each element B of U has the form B = S₁ ∩ S₂ ∩ ⋯ ∩ S_n, for some S₁, ..., S_n ∈ S. Assume for the moment that none of the S_i's belongs to U. Then property (ii) of U (in the previous paragraph) implies that for each 1 ≤ i ≤ n, we can find a finite sub-family U_i ⊂ U such that U_i ∪ {S_i} is a finite open cover of X. Since this is true for all i, it follows that ∪_{i=1}^n U_i ∪ {B = ∩_{i=1}^n S_i} is a finite open cover of X; but since B ∈ U, we have produced a finite sub-cover of X from U, which contradicts the defining property (i) (in the previous paragraph) of U. Hence at least one of the S_i's must belong to U.

Thus, we have shown that if U is as above, then each B ∈ U is of the form B = ∩_{i=1}^n S_i, where S_i ∈ S ∀i, and further there exists at least one i₀ such that S_{i₀} ∈ U. The passage B ↦ S_{i₀} yields a mapping U ∋ B ↦ S(B) ∈ U ∩ S with the property that B ⊂ S(B) for all B ∈ U. Since U is an open cover of X by definition, and since each B is contained in S(B), we find that U ∩ S is an open cover of X by members of S, which admits a finite sub-cover, by hypothesis. But U ∩ S ⊂ U, so this finite sub-cover is a finite sub-family of U which covers X, contradicting property (i). The contradiction we have reached is a consequence of our assuming that there exists an open cover U₀ ⊂ ℬ which does not admit a finite sub-cover. The proof of the theorem is complete. □

We are now ready to prove the important result, due to Tychonoff, which asserts that the topological product of an arbitrary family of compact spaces is compact.

Theorem A.4.15 (Tychonoff's theorem) Suppose {(X_i, τ_i) : i ∈ I} is a family of non-empty topological spaces. Let (X, τ) denote the product space X = ∏_{i∈I} X_i, equipped with the product topology. Then X is compact if and only if each X_i is compact.
Proof. Since X_i = π_i(X), where π_i denotes the (continuous) projection onto the i-th co-ordinate, it follows from Proposition A.4.13(a) that if X is compact, so is each X_i.

Suppose conversely that each X_i is compact. For a subset A ⊂ X_i, let A^i = π_i^{-1}(A). Thus, by the definition of the product topology, S = {U^i : U ∈ τ_i, i ∈ I} is a sub-base for the product topology τ. Thus, we need to prove the following: if J is any set, if J ∋ j ↦ i(j) ∈ I is a map, if A(i(j)) is a closed set in X_{i(j)} for each j ∈ J, and if ∩_{j∈F} A(i(j))^{i(j)} ≠ ∅ for every finite subset F ⊂ J, then it is necessarily the case that ∩_{j∈J} A(i(j))^{i(j)} ≠ ∅.

Let I_1 = {i(j) : j ∈ J} denote the range of the mapping j ↦ i(j). For each fixed i ∈ I_1, let J_i = {j ∈ J : i(j) = i}; observe that {A(i(j)) : j ∈ J_i} is a family of closed sets in X_i; we assert that ∩_{j∈F} A(i(j)) ≠ ∅ for every finite subset F ⊂ J_i. (Reason: if this were empty, then ∩_{j∈F} A(i(j))^{i(j)} = (∩_{j∈F} A(i(j)))^i would have to be empty, which would contradict the hypothesis.) We conclude from the compactness of X_i that ∩_{j∈J_i} A(i(j)) ≠ ∅. Let x_i be any point from this intersection. Thus x_i ∈ A(i(j)) whenever j ∈ J and i(j) = i. For i ∈ I − I_1, pick an arbitrary point x_i ∈ X_i. It is now readily verified that if x ∈ X is the point
Appendix
such that π_i(x) = x_i ∀ i ∈ I, then x ∈ A(i(j))^{i(j)} ∀ j ∈ J, and the proof of the theorem is complete. □

Among topological spaces, there is an important subclass of spaces which exhibit some pleasing features (in that certain pathological situations cannot occur). We briefly discuss these spaces now.

Definition A.4.16 A topological space is called a Hausdorff space if, whenever x and y are two distinct points in X, there exist open sets U, V in X such that x ∈ U, y ∈ V and U ∩ V = ∅. Thus, any two distinct points can be "separated" (by a pair of disjoint open sets). (For obvious reasons, the preceding requirement is sometimes also referred to as the Hausdorff separation axiom.)

Exercise A.4.17 (1) Show that any metric space is a Hausdorff space.

(2) Suppose {f_i : X → X_i : i ∈ I} is a family of functions from a set X into topological spaces (X_i, τ_i), and each (X_i, τ_i) is a Hausdorff space; show that the weak topology τ({f_i : i ∈ I}) on X (which is induced by this family of maps) is a Hausdorff topology if and only if the f_i's separate points - meaning that whenever x and y are distinct points in X, there exists an index i ∈ I such that f_i(x) ≠ f_i(y).

(3) Show that if {(X_i, τ_i) : i ∈ I} is a family of topological spaces, then the topological product space (∏_{i∈I} X_i, ∏_{i∈I} τ_i) is a Hausdorff space if and only if each (X_i, τ_i) is a Hausdorff space.

(4) Show that a subspace of a Hausdorff space is also a Hausdorff space (with respect to the subspace topology).

(5) Show that Exercise (4), as well as the "if" assertion of Exercise (3) above, are consequences of Exercise (2).

We list some elementary properties of Hausdorff spaces in the following proposition.

Proposition A.4.18 (a) If (X, τ) is a Hausdorff space, then every finite subset of X is closed.
(b) A topological space is a Hausdorff space if and only if "limits of convergent nets are unique" - i.e., if and only if, whenever {x_i : i ∈ I} is a net in X, and if the net converges to both x ∈ X and y ∈ X, then x = y.
(c) Suppose K is a compact subset of a Hausdorff space X and suppose y ∉ K; then there exist open sets U, V in X such that K ⊂ U, y ∈ V and U ∩ V = ∅; in particular, a compact subset of a Hausdorff space is closed.

(d) If X is a Hausdorff space, and if C and K are compact subsets of X which are disjoint - i.e., if C ∩ K = ∅ - then there exists a pair of disjoint open sets U, V in X such that K ⊂ U and C ⊂ V.
Proof. (a) Since finite unions of closed sets are closed, it is enough to prove that {x} is closed, for each x ∈ X; the Hausdorff separation axiom clearly implies that X − {x} is open.

(b) Suppose X is Hausdorff, suppose a net {x_i : i ∈ I} converges to x ∈ X, and suppose x ≠ y ∈ X. Pick open sets U, V as in Definition A.4.16; by definition of convergence of a net, we can find i_0 ∈ I such that x_i ∈ U ∀ i ≥ i_0; it follows then that x_i ∉ V ∀ i ≥ i_0, and hence the net {x_i} clearly does not converge to y. Conversely, suppose X is not a Hausdorff space; then there exists a pair x, y of distinct points which cannot be separated. Let N(x) (resp., N(y)) denote the directed set of open neighbourhoods of the point x (resp., y) - see Example 2.2.4(3). Let I = N(x) × N(y) be the directed set obtained from the Cartesian product as in Example 2.2.4(4). By the assumption that x and y cannot be separated, we can find a point x_i ∈ U ∩ V for each i = (U, V) ∈ I. It is fairly easily seen that the net {x_i : i ∈ I} simultaneously converges to both x and y.

(c) Suppose K is a compact subset of a Hausdorff space X. Fix y ∉ K; then, for each x ∈ K, find open sets U_x, V_x so that x ∈ U_x, y ∈ V_x and U_x ∩ V_x = ∅; now the family {U_x : x ∈ K} is an open cover of the compact set K, and we can hence find x_1, ..., x_n ∈ K such that K ⊂ U = ∪_{i=1}^n U_{x_i}; conclude that if V = ∩_{i=1}^n V_{x_i}, then V is an open neighbourhood of y such that U ∩ V = ∅; and the proof of (c) is complete.

(d) By (c) above, we may, for each y ∈ C, find a pair U_y, V_y of disjoint open subsets of X such that K ⊂ U_y and y ∈ V_y. Now, the family {V_y : y ∈ C} is an open cover of the compact set C, and hence we can find y_1, ..., y_n ∈ C such that if V = ∪_{i=1}^n V_{y_i} and U = ∩_{i=1}^n U_{y_i}, then U and V are disjoint open sets which satisfy the assertion in (d). □
Corollary A.4.19 (1) A continuous mapping from a compact space to a Hausdorff space is a closed map (see Proposition A.3.15).

(2) If f : X → Y is a continuous map, if Y is a Hausdorff space, and if K ⊂ X is a compact subset such that f|_K is 1-1, then f|_K is a homeomorphism of K onto f(K).
Proof. (1) Closed subsets of compact sets are compact; continuous images of compact sets are compact; and compact subsets of Hausdorff spaces are closed.
(2) This follows directly from (1) and Proposition A.3.15.
□
Proposition A.4.18 shows that in compact Hausdorff spaces, disjoint closed sets can be separated (just like points). There are other spaces which have this property; metric spaces, for instance, have this property, as shown by the following exercise.
Exercise A.4.20 Let (X, d) be a metric space. Given a subset A ⊂ X, define the distance from a point x to the set A by the formula

d(x, A) = inf { d(x, a) : a ∈ A } . (A.4.11)

(a) Show that d(x, A) = 0 if and only if x belongs to the closure Ā of A.

(b) Show that

|d(x, A) − d(y, A)| ≤ d(x, y) , ∀ x, y ∈ X,

and hence conclude that X ∋ x ↦ d(x, A) ∈ ℝ is a continuous function.

(c) If A and B are disjoint closed sets in X, show that the function defined by

f(x) = d(x, A) / ( d(x, A) + d(x, B) )

is a (meaningfully defined) continuous function f : X → [0, 1] with the property that

f(x) = { 0 if x ∈ A ; 1 if x ∈ B . (A.4.12)
(d) Deduce from (c) that disjoint closed sets in a metric space can be separated (by disjoint open sets). (Hint: if the closed sets are A and B, consider the set U (resp., V) of points where the function f of (c) is "close to 0" (resp., "close to 1").)

The preceding exercise shows, in addition to the fact that disjoint closed sets in a metric space can be separated by disjoint open sets, that there exist lots of continuous real-valued functions on a metric space. It is a fact that the two notions are closely related. To get to this fact, we first introduce a convenient notion, and then establish the relevant theorem.

Definition A.4.21 A topological space X is said to be normal if (a) it is a Hausdorff space, and (b) whenever A, B are closed sets in X such that A ∩ B = ∅, it is possible to find open sets U, V in X such that A ⊂ U, B ⊂ V and U ∩ V = ∅.

The reason that we had to separately assume the Hausdorff condition in the definition given above is that the Hausdorff axiom is not a consequence of condition (b) of the preceding definition. (For example, let X = {1, 2} and let τ = {∅, {1}, X}; then τ is indeed a non-Hausdorff topology, and there do not exist a pair of non-empty closed sets which are disjoint from one another, so that condition (b) is vacuously satisfied.)

We first observe an easy consequence of normality, and state it as an exercise.
Exercise A.4.22 Show that a Hausdorff space X is normal if and only if it satisfies the following condition: whenever A ⊂ W ⊂ X, where A is closed and W is open, there exists an open set U such that A ⊂ U ⊂ Ū ⊂ W. (Hint: consider B = X − W.)
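Returning to Exercise A.4.20: the separating function f(x) = d(x, A)/(d(x, A) + d(x, B)) of part (c) is easy to experiment with numerically. A minimal sketch on X = ℝ, where the hypothetical finite sets A and B below stand in for disjoint closed sets:

```python
def dist(x, pts):
    # d(x, A) = inf{ d(x, a) : a in A }, formula (A.4.11), for a finite set
    return min(abs(x - p) for p in pts)

def separator(x, A, B):
    # f(x) = d(x, A) / (d(x, A) + d(x, B)); the denominator is non-zero
    # because A and B are disjoint and closed (Exercise A.4.20(c))
    dA, dB = dist(x, A), dist(x, B)
    return dA / (dA + dB)

A, B = [-2.0, -1.0], [1.0, 2.0]   # hypothetical disjoint closed sets in R
```

As the exercise predicts, f vanishes on A, equals 1 on B, and takes values in [0, 1] in between; the sets {f < 1/2} and {f > 1/2} then separate A and B by disjoint open sets.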
Thus, we find - from Proposition A.4.18(d) and Exercise A.4.20(d) - that compact Hausdorff spaces and metric spaces are examples of normal spaces. One of the results relating normal spaces and continuous functions is the useful Urysohn's lemma, which we now establish.

Theorem A.4.23 (Urysohn's lemma) Suppose A and B are disjoint closed subsets of a normal space X; then there exists a continuous function f : X → [0, 1] such that

f(x) = { 0 if x ∈ A ; 1 if x ∈ B . (A.4.13)
Proof. Write Q_2 for the set of "dyadic rational numbers" in [0, 1]. Thus, Q_2 = { k/2^n : n = 0, 1, 2, ..., 0 ≤ k ≤ 2^n }.

Assertion: There exist open sets {U_r : r ∈ Q_2} such that

(i) A ⊂ U_0, U_1 = X − B; and
(ii) r, s ∈ Q_2, r < s ⇒ Ū_r ⊂ U_s. (A.4.14)

Proof of assertion: Define Q_2(n) = { k/2^n : 0 ≤ k ≤ 2^n }, for n = 0, 1, 2, .... Then, we clearly have Q_2 = ∪_{n=0}^∞ Q_2(n). We shall use induction on n to construct U_r, r ∈ Q_2(n). First, we have Q_2(0) = {0, 1}. Define U_1 = X − B, and appeal to Exercise A.4.22 to find an open set U_0 such that A ⊂ U_0 ⊂ Ū_0 ⊂ U_1. Suppose that we have constructed open sets {U_r : r ∈ Q_2(n)} such that condition A.4.14 is satisfied whenever r, s ∈ Q_2(n). Notice now that if t ∈ Q_2(n+1) − Q_2(n), then t = (2m+1)/2^{n+1} for some unique integer 0 ≤ m < 2^n. Set r = 2m/2^{n+1} and s = (2m+2)/2^{n+1}, and note that r < t < s and that r, s ∈ Q_2(n). Now apply Exercise A.4.22 to the inclusion Ū_r ⊂ U_s to deduce the existence of an open set U_t such that Ū_r ⊂ U_t ⊂ Ū_t ⊂ U_s. It is a simple matter to verify that condition A.4.14 is now satisfied for all r, s ∈ Q_2(n+1); and the proof of the assertion is complete.
Now define the function f : X → [0, 1] by the following prescription:

f(x) = inf { t ∈ Q_2 : x ∈ U_t } if x ∈ U_1 ; f(x) = 1 if x ∉ U_1 .

This function is clearly defined on all of X, takes values in [0, 1], is identically equal to 0 on A, and is identically equal to 1 on B. So we only need to establish the continuity of f in order to complete the proof of the theorem. We check continuity of f at a point x ∈ X; suppose ε > 0 is given. (In the following proof, we will use the (obvious) fact that Q_2 is dense in [0, 1].)

Case (i): f(x) = 0: In this case, pick r ∈ Q_2 such that r < ε. The assumption that f(x) = 0 implies that x ∈ U_r; also y ∈ U_r ⇒ f(y) ≤ r < ε.

Case (ii): 0 < f(x) < 1: First pick p, t ∈ Q_2 such that f(x) − ε < p < f(x) < t < f(x) + ε. By the definition of f, we can find s ∈ Q_2 ∩ (f(x), t) such that x ∈ U_s. Then pick r ∈ Q_2 ∩ (p, f(x)) and observe that f(x) > r ⇒ x ∉ U_r ⇒ x ∉ Ū_p. Hence we see that V = U_s − Ū_p is an open neighbourhood of x; it is also easy to see that y ∈ V ⇒ p ≤ f(y) ≤ s, and hence y ∈ V ⇒ |f(y) − f(x)| < ε.

Case (iii): f(x) = 1: The proof of this case is similar to part of the proof of Case (ii), and is left as an exercise for the reader. □
We conclude this section with another result concerning the existence of "sufficiently many" continuous functions on a normal space.
Theorem A.4.24 (Tietze's extension theorem) Suppose f : A → [−1, 1] is a continuous function defined on a closed subspace A of a normal space X. Then there exists a continuous function F : X → [−1, 1] such that F|_A = f.

Proof. Let us set f = f_0 (for reasons that will soon become clear). Let A_0 = {x ∈ A : f_0(x) ≤ −1/3} and B_0 = {x ∈ A : f_0(x) ≥ 1/3}; then A_0 and B_0 are disjoint sets which are closed in A and hence also in X. By Urysohn's lemma, we can find a continuous function g_0 : X → [−1/3, 1/3] such that g_0(A_0) = {−1/3} and g_0(B_0) = {1/3}. Set f_1 = f_0 − g_0|_A and observe that f_1 : A → [−2/3, 2/3].

Next, let A_1 = {x ∈ A : f_1(x) ≤ −(1/3)(2/3)} and B_1 = {x ∈ A : f_1(x) ≥ (1/3)(2/3)}; and as before, construct a continuous function g_1 : X → [−(1/3)(2/3), (1/3)(2/3)] such that g_1(A_1) = {−(1/3)(2/3)} and g_1(B_1) = {(1/3)(2/3)}. Then define f_2 = f_1 − g_1|_A = f_0 − (g_0 + g_1)|_A, and observe that f_2 : A → [−(2/3)^2, (2/3)^2].

Repeating this argument indefinitely, we find that we can find (i) a sequence {f_n}_{n=0}^∞ of continuous functions on A such that f_n : A → [−(2/3)^n, (2/3)^n], for each n; and (ii) a sequence {g_n}_{n=0}^∞ of continuous functions on X such that g_n(X) ⊂ [−(1/3)(2/3)^n, (1/3)(2/3)^n], for each n; and such that these sequences satisfy

f_n = f_0 − (g_0 + g_1 + ... + g_{n−1})|_A .

The series Σ_{n=0}^∞ g_n is absolutely summable, and consequently, summable in the Banach space C_b(X) of all bounded continuous functions on X. Let F be the limit of this sum. Finally, the estimate we have on f_n (see (i) above) and the equation displayed above show that F|_A = f_0 = f, and the proof is complete. □
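The estimates driving this argument can be checked numerically: since ‖g_n‖ ≤ (1/3)(2/3)^n, the partial sums of Σ g_n are uniformly bounded by Σ_{n=0}^∞ (1/3)(2/3)^n = 1, which is why the limit F again takes values in [−1, 1]. A small, purely illustrative sanity check of this geometric bound:

```python
def partial_bound(N):
    # sum of the uniform bounds (1/3)(2/3)^n for n = 0, ..., N-1;
    # the closed form of this geometric sum is 1 - (2/3)^N
    return sum((1 / 3) * (2 / 3) ** n for n in range(N))

bounds = [partial_bound(N) for N in (1, 5, 50)]
```

The partial bounds increase towards 1 but never exceed it, matching the claim that |F(x)| ≤ Σ ‖g_n‖ ≤ 1 for every x.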
Exercise A.4.25 (1) Show that Urysohn's lemma is valid with the unit interval [0, 1] replaced by any closed interval [a, b], and thus justify the manner in which Urysohn's lemma was used in the proof of Tietze's extension theorem. (Hint: use appropriate "affine maps" to map any closed interval homeomorphically onto any other closed interval.)

(2) Show that Tietze's extension theorem remains valid if [−1, 1] is replaced by (a) ℝ, (b) ℂ, (c) ℝ^n.
A.5 Measure and integration
In this section, we review some of the basic aspects of measure and integration. We begin with the fundamental notion of a σ-algebra of sets.

Definition A.5.1 A σ-algebra of subsets of a set X is, by definition, a collection B of subsets of X which satisfies the following requirements:
(a) X ∈ B;
(b) A ∈ B ⇒ A^c ∈ B, where A^c = X − A denotes the complement of the set A; and
(c) {A_n}_{n=1}^∞ ⊂ B ⇒ ∪_n A_n ∈ B.

A measurable space is a pair (X, B) consisting of a set X and a σ-algebra B of subsets of X.

Thus, a σ-algebra is nothing but a collection of sets which contains the whole space and is closed under the formation of complements and countable unions. Since ∩_n A_n = (∪_n A_n^c)^c, it is clear that a σ-algebra is closed under the formation of countable intersections; further, every σ-algebra always contains the empty set (since ∅ = X^c).
Example A.5.2 (1) The collection 2^X of all subsets of X is a σ-algebra, as is the two-element collection {∅, X}. These are called the "trivial" σ-algebras; if B is any σ-algebra of subsets of X, then {∅, X} ⊂ B ⊂ 2^X.

(2) The intersection of an arbitrary collection of σ-algebras of subsets of X is also a σ-algebra of subsets of X. It follows that if S is any collection of subsets of X, and if we set B(S) = ∩ { B : S ⊂ B, B is a σ-algebra of subsets of X }, then B(S) is a σ-algebra of subsets of X, and it is the smallest σ-algebra of subsets of X which contains the initial family S of sets; we shall refer to B(S) as the σ-algebra generated by S.

(3) If X is a topological space, we shall write B_X = B(τ), where τ is the family of all open sets in X; members of this σ-algebra are called Borel sets, and B_X is called the Borel σ-algebra of X.

(4) Suppose {(X_i, B_i) : i ∈ I} is a family of measurable spaces. Let X = ∏_{i∈I} X_i denote the Cartesian product, and let π_i : X → X_i denote the i-th co-ordinate projection, for each i ∈ I. Then the product σ-algebra B = ∏_{i∈I} B_i is the σ-algebra of subsets of X defined as B = B(S), where S = {π_i^{-1}(A_i) : A_i ∈ B_i, i ∈ I}, and (X, B) is called the "product measurable space".
(5) If (X, B) is a measurable space, and if X_0 ⊂ X is an arbitrary subset, define B|_{X_0} = {A ∩ X_0 : A ∈ B}; equivalently, B|_{X_0} = {i^{-1}(A) : A ∈ B}, where i : X_0 → X denotes the inclusion map. The σ-algebra B|_{X_0} is called the σ-algebra induced by B. □

Exercise A.5.3 (1) Let H be a separable Hilbert space. Let τ (resp., τ_w) denote the family of all open (resp., weakly open) sets in H. Show that B_H = B(τ) = B(τ_w). (Hint: It suffices to prove that if x ∈ H and if ε > 0, then B(x, ε) = {y ∈ H : ‖y − x‖ ≤ ε} ∈ B(τ_w); pick a countable (norm-) dense set {x_n} in H, and note that B(x, ε) = ∩_n { y : |⟨(y − x), x_n⟩| ≤ ε‖x_n‖ }.)
(2) Let τ_n (resp., τ_s, τ_w) denote the set of all subsets of L(H) which are open with respect to the norm topology (resp., strong operator topology, weak operator topology) on L(H). If H is separable, show that B_{L(H)} = B(τ_n) = B(τ_s) = B(τ_w). (Hint: use (1) above.)

We now come to the appropriate class of maps in the category of measurable spaces.

Definition A.5.4 If (X_i, B_i), i = 1, 2, are measurable spaces, then a function f : X_1 → X_2 is said to be measurable if f^{-1}(A) ∈ B_1 ∀ A ∈ B_2.

Proposition A.5.5 (a) The composite of measurable maps is measurable.

(b) If (X_i, B_i), i = 1, 2, are measurable spaces, and if B_2 = B(S) for some family S ⊂ 2^{X_2}, then a function f : X_1 → X_2 is measurable if and only if f^{-1}(A) ∈ B_1 for all A ∈ S.

(c) If X_i, i = 1, 2, are topological spaces, and if f : X_1 → X_2 is a continuous map, then f is measurable as a map between the measurable spaces (X_i, B_{X_i}), i = 1, 2.

Proof. (a) If (X_i, B_i), i = 1, 2, 3, are measurable spaces, and if f : X_1 → X_2 and g : X_2 → X_3 are measurable maps, and if A ∈ B_3, then g^{-1}(A) ∈ B_2 (since g is measurable), and consequently, (g ∘ f)^{-1}(A) = f^{-1}(g^{-1}(A)) ∈ B_1 (since f is measurable).
(b) Suppose f^{-1}(A) ∈ B_1 for all A ∈ S. Since (f^{-1}(A))^c = f^{-1}(A^c), f^{-1}(X_2) = X_1 and f^{-1}(∪_i A_i) = ∪_i f^{-1}(A_i), it is seen that the family B = {A ⊂ X_2 : f^{-1}(A) ∈ B_1} is a σ-algebra of subsets of X_2 which contains S; consequently, B ⊃ B(S) = B_2, and hence f is measurable.

(c) Apply (b) with S as the set of all open sets in X_2. □
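To make Example A.5.2(2) concrete: on a finite set, the generated σ-algebra B(S) can be computed by brute force, repeatedly closing up under complements and unions until nothing new appears (on a finite set, finite unions suffice). A minimal sketch:

```python
def generated_sigma_algebra(X, S):
    # smallest family containing S that contains X and the empty set and is
    # closed under complementation and union; iterate to a fixed point
    X = frozenset(X)
    fam = {X, frozenset()} | {frozenset(s) for s in S}
    while True:
        new = {X - A for A in fam} | {A | B for A in fam for B in fam}
        if new <= fam:
            return fam
        fam |= new

sigma = generated_sigma_algebra({1, 2, 3}, [{1}])
```

For S = {{1}} on X = {1, 2, 3} this produces {∅, {1}, {2, 3}, X}: the smallest σ-algebra containing the set {1}.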
Some consequences of the preceding proposition are listed in the following exercises.

Exercise A.5.6 (1) Suppose {(X_i, B_i) : i ∈ I} is a family of measurable spaces. Let (X, B) be the product measurable space, as in Example A.5.2(4). Let (Y, B_0) be any measurable space.

(a) Show that there exists a bijective correspondence between maps f : Y → X and families {f_i : Y → X_i}_{i∈I} of maps, this correspondence being given by f(y) = ((f_i(y))); the map f is written as f = ∏_{i∈I} f_i.

(b) Show that if f, f_i are as in (a) above, then f is measurable if and only if each f_i is measurable. (Hint: Use the family S used to define the product σ-algebra, Proposition A.5.5(b), and the fact that π_i ∘ f = f_i ∀i, where π_i denotes the projection of the product onto the i-th factor.)
(2) If (X, B) is a measurable space, if f_i : X → ℝ, i ∈ I, are measurable maps (with respect to the Borel σ-algebra on ℝ), and if F : ℝ^I → ℝ is a continuous map (where ℝ^I is endowed with the product topology), then show that the map g : X → ℝ defined by g = F ∘ (∏_{i∈I} f_i) is a (B, B_ℝ)-measurable function.

(3) If (X, B) is a measurable space, and if f, g : X → ℝ are measurable maps, show that each of the following maps is measurable: |f|, af + bg (where a, b are arbitrary real numbers), fg, f^2 + 3g^3, x ↦ sin(g(x)).

(4) The indicator function of a subset E ⊂ X is the real-valued function on X, always denoted in these notes by the symbol 1_E, which is defined by

1_E(x) = { 1 if x ∈ E ; 0 if x ∉ E .

Show that 1_E is measurable if and only if E ∈ B. (For this reason, the members of B are sometimes referred to as measurable sets. Also, the correspondence E ↦ 1_E sets up a bijection between the family of subsets of X on the one hand, and functions from X into the 2-element set {0, 1} on the other; this is one of the reasons for using the notation 2^X to denote the class of all subsets of the set X.)

In the sequel, we will have occasion to deal with possibly infinite-valued functions. We shall write ℝ̄ = ℝ ∪ {∞, −∞}. Using any reasonable bijection between ℝ̄ and [−1, 1] - say, the one given by the map
f(x) = { ±1 if x = ±∞ ; x/(1 + |x|) if x ∈ ℝ

- we may regard ℝ̄ as a (compact Hausdorff) topological space, and hence equip it with the corresponding Borel σ-algebra. If we have an extended real-valued function defined on a set X - i.e., a map f : X → ℝ̄ - and if B is a σ-algebra on X, we shall say that f is measurable if f^{-1}(A) ∈ B ∀ A ∈ B_ℝ̄. The reader
should have little difficulty in verifying the following fact: given a map f : X → ℝ̄, let X_0 = f^{-1}(ℝ), f_0 = f|_{X_0}, and let B|_{X_0} be the induced σ-algebra on the subset X_0 - see Example A.5.2(5); then the extended real-valued function f is measurable if and only if the following conditions are satisfied: (i) f^{-1}({a}) ∈ B, for a ∈ {−∞, ∞}, and (ii) f_0 : X_0 → ℝ is measurable.

We will, in particular, be interested in limits of sequences of functions (when they converge). Before systematically discussing such (pointwise) limits, we pause to recall some facts about the limit-superior and the limit-inferior of sequences. Suppose, to start with, that {a_n} is a sequence of real numbers. For each (temporarily fixed) m ∈ N, define
b_m = inf { a_n : n ≥ m } and c_m = sup { a_n : n ≥ m } , (A.5.15)

with the understanding, of course, that b_m = −∞ (resp., c_m = +∞) if the sequence {a_n} is not bounded below (resp., bounded above). Then b_m ≤ b_{m+1} ≤ c_{m+1} ≤ c_m for all m, and in particular, b_m ≤ c_k for all m, k; thus, {b_m : m ∈ N} (resp., {c_m : m ∈ N}) is a non-decreasing (resp., non-increasing) sequence of extended real numbers. Consequently, we may find extended real numbers b, c such that b = sup_m b_m = lim_m b_m and c = inf_m c_m = lim_m c_m. We define b = lim inf_n a_n and c = lim sup_n a_n. Thus, we have

lim inf_n a_n = lim_{m→∞} inf_{n≥m} a_n = sup_{m∈N} inf_{n≥m} a_n , (A.5.16)

and

lim sup_n a_n = lim_{m→∞} sup_{n≥m} a_n = inf_{m∈N} sup_{n≥m} a_n . (A.5.17)
We state one useful alternative description of the "inferior" and "superior" limits of a sequence in the following exercise.

Exercise A.5.7 Let {a_n : n ∈ N} be any sequence of real numbers. Let L be the set of all "limit-points" of this sequence; thus, l ∈ L if and only if there exists some subsequence {a_{n_k}} for which l = lim_{k→∞} a_{n_k}. Thus L ⊂ ℝ̄. Show that:
(a) lim inf a_n, lim sup a_n ∈ L;
(b) l ∈ L ⇒ lim inf a_n ≤ l ≤ lim sup a_n; and
(c) the sequence {a_n} converges to an extended real number if and only if lim inf a_n = lim sup a_n.
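Formulas (A.5.16) and (A.5.17) can be tested on truncated tails of a concrete sequence; here a_n = (−1)^n (1 + 1/n), an illustrative choice whose limit-points are −1 and 1:

```python
def lim_inf_sup(a, m_max):
    # b_m = inf_{n >= m} a_n and c_m = sup_{n >= m} a_n on truncated tails;
    # liminf = sup_m b_m and limsup = inf_m c_m, as in (A.5.16)/(A.5.17)
    bs = [min(a[m:]) for m in range(m_max)]
    cs = [max(a[m:]) for m in range(m_max)]
    return max(bs), min(cs)

a = [(-1) ** n * (1 + 1 / n) for n in range(1, 1001)]
liminf, limsup = lim_inf_sup(a, 500)
```

Only the first half of the truncated sequence is used for the tail index m, so that each truncated tail still approximates a genuine tail of the infinite sequence; the two returned values then approximate −1 and 1.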
If X is a set and if {f_n : n ∈ N} is a sequence of real-valued functions on X, we define the extended real-valued functions lim inf f_n and lim sup f_n in the obvious pointwise manner, thus:

(lim inf f_n)(x) = lim inf f_n(x) , (lim sup f_n)(x) = lim sup f_n(x) .
Proposition A.5.8 Suppose (X, B) is a measurable space, and suppose {f_n : n ∈ N} is a sequence of real-valued measurable functions on X. Then,
(a) sup_n f_n and inf_n f_n are measurable extended real-valued functions on X;
(b) lim inf f_n and lim sup f_n are measurable extended real-valued functions on X;
(c) the set C = {x ∈ X : {f_n(x)} converges to a finite real number} is measurable, i.e., C ∈ B; further, the function f : C → ℝ defined by f(x) = lim_n f_n(x) is measurable (with respect to the induced σ-algebra B|_C). (In particular, the pointwise limit of a (pointwise) convergent sequence of measurable real-valued functions is also a measurable function.)

Proof. (a) Since inf_n f_n = − sup_n (−f_n), it is enough to prove the assertion concerning suprema. So, suppose f = sup_n f_n, where {f_n} is a sequence of measurable real-valued functions on X. In view of Proposition A.5.5(b), and since the class of sets of the form {t ∈ ℝ̄ : t > a} - as a varies over ℝ - generates the Borel σ-algebra B_ℝ̄, we only need to check that {x ∈ X : f(x) > a} ∈ B, for each a ∈ ℝ; but this follows from the identity

{x ∈ X : f(x) > a} = ∪_{n=1}^∞ {x ∈ X : f_n(x) > a} .

(b) This follows quite easily from repeated applications of (a) (or rather, the extension of (a) which covers suprema and infima of sequences of extended real-valued functions).

(c) Observe, to start with, that if Z is a Hausdorff space, then Δ = {(z, z) : z ∈ Z} is a closed set in Z × Z; hence, if f, g : X → Z are (B, B_Z)-measurable functions, then (f, g) : X → Z × Z is a (B, B_{Z×Z})-measurable function, and hence {x ∈ X : f(x) = g(x)} ∈ B. In particular, if we apply the preceding fact to the case when f = lim sup_n f_n, g = lim inf_n f_n and Z = ℝ̄, we find that if D is the set of points x ∈ X for which the sequence {f_n(x)} converges to a point in ℝ̄, then D ∈ B. Since ℝ is a measurable set in ℝ̄, the set F = {x ∈ X : lim sup_n f_n(x) ∈ ℝ} must also belong to B, and consequently, C = D ∩ F ∈ B. Also, since f|_C is clearly a measurable function, all parts of (c) are proved. □
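The identity used in the proof of (a), {x : sup_n f_n(x) > a} = ∪_n {x : f_n(x) > a}, can be checked pointwise on a finite toy space; the functions below are hypothetical stand-ins for measurable maps:

```python
X = range(8)
fns = [lambda x, k=k: ((x * k) % 5) / 4 for k in range(1, 4)]  # toy functions

def sup_f(x):
    # pointwise supremum of the finite family {f_n}
    return max(f(x) for f in fns)

a = 0.5
lhs = {x for x in X if sup_f(x) > a}              # {f > a}
rhs = set().union(*({x for x in X if f(x) > a} for f in fns))
```

The two sets agree for every threshold a, which is exactly why measurability of each f_n forces measurability of sup_n f_n.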
The following proposition describes, in a sense, how to construct all possible measurable real-valued functions on a measurable space (X, B).

Proposition A.5.9 Let (X, B) be a measurable space.

(1) The following conditions on a (B, B_ℝ)-measurable function f : X → ℝ are equivalent:
(i) f(X) is a finite set;
(ii) there exist n ∈ N, real numbers a_1, ..., a_n ∈ ℝ and a partition X = ∐_{i=1}^n E_i such that E_i ∈ B ∀i, and f = Σ_{i=1}^n a_i 1_{E_i}.

A function satisfying the above conditions is called a simple function (on (X, B)).

(2) The following conditions on a (B, B_ℝ)-measurable function f : X → ℝ are equivalent:
(i) f(X) is a countable set;
(ii) there exist a countable set I, real numbers a_n, n ∈ I, and a partition X = ∐_{n∈I} E_n such that E_n ∈ B ∀ n ∈ I, and f = Σ_{n∈I} a_n 1_{E_n}.

A function satisfying the above conditions is called an elementary function (on (X, B)).

(3) If f : X → ℝ is any non-negative measurable function, then there exists a sequence {f_n} of non-negative simple functions on (X, B) such that:
(i) f_1(x) ≤ f_2(x) ≤ ... ≤ f_n(x) ≤ f_{n+1}(x) ≤ ..., for all x ∈ X; and
(ii) f(x) = lim_n f_n(x) = sup_n f_n(x) ∀ x ∈ X.

Proof. (1) The implication (ii) ⇒ (i) is obvious, and as for (i) ⇒ (ii), if f(X) = {a_1, ..., a_n}, set E_i = f^{-1}({a_i}), and note that these do what they are expected to do.

(2) This is proved in the same way as was (1).

(3) Fix a positive integer n, let I_k^n = [k/2^n, (k+1)/2^n), and define E_k^n = f^{-1}(I_k^n), for k = 0, 1, 2, .... Define h_n = Σ_{k=0}^∞ (k/2^n) 1_{E_k^n}. It should be clear that (the E_k^n's inherit measurability from f, and consequently) h_n is an elementary function in the sense of (2) above. The definitions further show that, for each x ∈ X, we have

h_n(x) ≤ h_{n+1}(x) ≤ f(x) < h_n(x) + 1/2^n .

Thus, the h_n's form a non-decreasing sequence of non-negative elementary functions which converge uniformly to the function f. If we set f_n = min{h_n, n}, it is readily verified that these f_n's are simple functions which satisfy the requirements of (3). □
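The approximating sequence in the proof of (3) is concrete enough to compute: for a non-negative value f(x), h_n(x) = ⌊f(x)·2^n⌋/2^n, and f_n = min{h_n, n}. A minimal sketch:

```python
import math

def f_n(value, n):
    # value = f(x) >= 0; h_n(x) = k/2^n on E_k^n = f^{-1}([k/2^n, (k+1)/2^n)),
    # which is floor(value * 2^n) / 2^n; truncating at n gives a simple function
    h = math.floor(value * 2 ** n) / 2 ** n
    return min(h, n)

samples = [0.0, 0.3, 1.75, math.pi]   # sample values of a function f >= 0
```

One can check that f_n(x) increases with n and that f(x) − f_n(x) < 2^{−n} as soon as n > f(x), which is the monotone pointwise (indeed, eventually uniform-on-bounded-ranges) convergence asserted in (i) and (ii).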
Now we come to the all-important notion of a measure. (Throughout this section, we shall use the word measure to denote what should be more accurately referred to as a positive measure; we will make the necessary distinctions when we need to.)

Definition A.5.10 Let (X, B) be a measurable space. A measure on (X, B) is a function μ : B → [0, ∞] with the following two properties:
(0) μ(∅) = 0; and
(1) μ is countably additive - i.e., if E = ∐_{n=1}^∞ E_n is a countable "measurable" partition, meaning that E, E_n ∈ B ∀n, then μ(E) = Σ_{n=1}^∞ μ(E_n).

A measure space is a triple (X, B, μ), consisting of a measurable space together with a measure defined on it. A measure μ is said to be finite (resp., a probability measure) if μ(X) < ∞ (resp., μ(X) = 1).
We list a few elementary consequences of the definition in the next proposition.

Proposition A.5.11 Let (X, B, μ) be a measure space; then,
(1) μ is "monotone": i.e., A, B ∈ B, A ⊂ B ⇒ μ(A) ≤ μ(B);
(2) μ is "countably subadditive": i.e., if E_n ∈ B ∀ n = 1, 2, ..., then μ(∪_{n=1}^∞ E_n) ≤ Σ_{n=1}^∞ μ(E_n);
(3) μ is "continuous from below": i.e., if E_1 ⊂ E_2 ⊂ ... ⊂ E_n ⊂ ... is an increasing sequence of sets in B, and if E = ∪_{n=1}^∞ E_n, then μ(E) = lim_n μ(E_n);
(4) μ is "continuous from above if it is restricted to sets of finite measure": i.e., if E_1 ⊃ E_2 ⊃ ... ⊃ E_n ⊃ ... is a decreasing sequence of sets in B, if E = ∩_{n=1}^∞ E_n, and if μ(E_1) < ∞, then μ(E) = lim_n μ(E_n).
Proof. (1) Since B = A ∐ (B − A), deduce from the positivity and countable additivity of μ that

μ(A) ≤ μ(A) + μ(B − A)
     = μ(A) + μ(B − A) + μ(∅) + μ(∅) + ...
     = μ( A ∐ (B − A) ∐ ∅ ∐ ∅ ∐ ... )
     = μ(B) .

(2) Define S_n = ∪_{i=1}^n E_i, for each n = 1, 2, ...; then it is clear that {S_n} is an increasing sequence of sets and that S_n ∈ B ∀n; now set A_n = S_n − S_{n−1} for each n (with the understanding that S_0 = ∅); it is seen that {A_n} is a sequence of pairwise disjoint sets and that A_n ∈ B for each n; finally, the construction also ensures that

S_n = ∪_{i=1}^n E_i = ∐_{i=1}^n A_i , ∀ n = 1, 2, .... (A.5.18)
Further, we see that

∪_{n=1}^∞ E_n = ∐_{n=1}^∞ A_n , (A.5.19)

and hence, by the assumed countable additivity and the already established monotonicity of μ, we see that

μ(∪_{n=1}^∞ E_n) = μ(∐_{n=1}^∞ A_n) = Σ_{n=1}^∞ μ(A_n) ≤ Σ_{n=1}^∞ μ(E_n) ,

since A_n ⊂ E_n ∀n.
(3) If the sequence {E_n} is already increasing, then, in the notation of the proof of (2) above, we find that S_n = E_n and that A_n = E_n − E_{n−1}. Since countable additivity implies "finite additivity" (by tagging on infinitely many copies of the empty set, as in the proof of (1) above), we may deduce from equation A.5.18 that

μ(E_n) = μ(S_n) = Σ_{i=1}^n μ(A_i) ;

similarly, we see from equation A.5.19 that

μ(E) = Σ_{n=1}^∞ μ(A_n) ;

the desired conclusion is a consequence of the last two equations.
(4) Define F_n = E_1 − E_n, and note that (i) F_n ∈ B ∀n, (ii) {F_n} is an increasing sequence of sets, and (iii) ∪_{n=1}^∞ F_n = E_1 − ∩_{n=1}^∞ E_n; the desired conclusion is a consequence of one application of (the already proved) (3) above to the sequence {F_n}, and the following immediate consequence of (1): if A, B ∈ B, A ⊂ B and if μ(B) < ∞, then μ(A) = μ(B) − μ(B − A). □

We now give some examples of measures; we do not prove the various assertions made in these examples; the reader interested in further details may consult any of the standard texts on measure theory (such as [Hall], for instance).

Example A.5.12 (1) Let X be any set, let B = 2^X, and define μ(E) to be n if E is finite and has exactly n elements, and define μ(E) to be ∞ if E is an infinite set. This μ is easily verified to define a measure on (X, 2^X), and is called the counting measure on X.
For instance, if X = ℕ, if E_n = {n, n+1, n+2, ...}, and if μ denotes counting measure on ℕ, we see that {E_n} is a decreasing sequence of sets in ℕ, with ∩_{n=1}^∞ E_n = ∅; but μ(E_n) = ∞ ∀n, and so lim_n μ(E_n) ≠ μ(∩_n E_n); thus, if we do not restrict ourselves to sets of finite measure, a measure need not be continuous from above - see Proposition A.5.11(4).

(2) It is a fact, whose proof we will not go into here, that there exists a unique measure m defined on (ℝ, B_ℝ) with the property that m([a, b]) = b − a whenever a, b ∈ ℝ, a < b. This measure is called Lebesgue measure on ℝ. (Thus the Lebesgue measure of an interval is its length.) More generally, for any n ∈ ℕ, there exists a unique measure m_n defined on (ℝ^n, B_{ℝ^n}) such that if B = ∏_{i=1}^n [a_i, b_i] is the n-dimensional "box" with sides [a_i, b_i], then m_n(B) = ∏_{i=1}^n (b_i − a_i); this measure is called n-dimensional Lebesgue measure; thus, the n-dimensional Lebesgue measure of a box is just its (n-dimensional) volume.

(3) Suppose {(X_i, B_i, μ_i) : i ∈ I} is an arbitrary family of "probability spaces": i.e., each (X_i, B_i) is a measurable space and μ_i is a probability measure defined on it. Let (X, B) be the product measurable space, as in Example A.5.2(4). (Recall that B is the σ-algebra generated by all "finite-dimensional cylinder sets"; i.e., B = B(C), where a typical member of C is obtained by fixing finitely many co-ordinates (i.e., a finite set I_0 ⊂ I), fixing measurable sets E_i ∈ B_i ∀ i ∈ I_0, and looking at the "cylinder C with base ∏_{i∈I_0} E_i" - thus C = {((x_i)) ∈ X : x_i ∈ E_i ∀ i ∈ I_0}.) It is a fact that there exists a unique probability measure μ defined on (X, B) with the property that if C is as above, then μ(C) = ∏_{i∈I_0} μ_i(E_i). This measure μ is called the product of the probability measures μ_i. It should be mentioned that if the family I is finite, then the initial measures μ_i do not have to be probability measures for us to be able to construct the product measure.
(In fact mₙ is the product of n copies of m.) In fact, existence and uniqueness of the "product" of finitely many measures can be established if we only impose the condition that each μᵢ be a σ-finite measure - meaning that Xᵢ admits a countable partition Xᵢ = ∪_{n=1}^∞ Eₙ such that Eₙ ∈ Bᵢ and μᵢ(Eₙ) < ∞ for all n. It is only when we wish to construct the product of an infinite family of measures that we have to impose the requirement that (at least all but finitely many of) the measures concerned are probability measures. □
Given a measure space (X, B, μ) and a non-negative (B, B_ℝ)-measurable function f : X → ℝ, there is a unique (natural) way of assigning a value (in [0, ∞]) to the expression ∫ f dμ. We will not go into the details of how this is proved; we will, instead, content ourselves with showing the reader how this "integral" is computed, and leave the interested reader to go to such standard references as [Hall], for instance, to become convinced that all these things work out as they are stated to. We shall state, in one long proposition, the various features of the assignment f ↦ ∫ f dμ.
Proposition A.5.13 Let (X, B, μ) be a measure space; let us write M₊ to denote the class of all functions f : X → [0, ∞) which are (B, B_ℝ)-measurable.
(1) There exists a unique map M₊ ∋ f ↦ ∫ f dμ ∈ [0, ∞] which satisfies the following properties:
(i) ∫ 1_E dμ = μ(E) ∀ E ∈ B;
(ii) f, g ∈ M₊, a, b ∈ [0, ∞) ⇒ ∫ (af + bg) dμ = a ∫ f dμ + b ∫ g dμ;
(iii) for any f ∈ M₊, we have
∫ f dμ = limₙ ∫ fₙ dμ
for any non-decreasing sequence {fₙ} of simple functions which converges pointwise to f. (See Proposition A.5.9(3).)
(2) Further, in case the measure μ is σ-finite, and if f ∈ M₊, then the quantity ∫ f dμ, which is called the integral of f with respect to μ, admits the following interpretation as "area under the curve": let A(f) = {(x, t) ∈ X × ℝ : 0 ≤ t ≤ f(x)} denote the "region under the graph of f"; let B ⊗ B_ℝ denote the natural product σ-algebra on X × ℝ (see Example A.5.2), and let λ = μ × m denote the product measure - see Example A.5.12(3) - of μ with Lebesgue measure on ℝ; then,
(i) A(f) ∈ B ⊗ B_ℝ; and
(ii) ∫ f dμ = λ(A(f)). □
Exercise A.5.14 Let (X, B, μ) be a measure space. Then show that
(a) a function f : X → ℂ is (B, B_ℂ)-measurable if and only if Re f and Im f are (B, B_ℝ)-measurable, where, of course, Re f and Im f denote the real and imaginary parts of f, respectively;
(b) a function f : X → ℝ is (B, B_ℝ)-measurable if and only if f± are (B, B_ℝ)-measurable, where f± are the "positive and negative parts of f" (as in the proof of Proposition 3.3.11(f), for instance);
(c) if f : X → ℂ then f = (f₁ − f₂) + i(f₃ − f₄), where {fⱼ : 1 ≤ j ≤ 4} are the non-negative functions defined by f₁ = (Re f)₊, f₂ = (Re f)₋, f₃ = (Im f)₊, f₄ = (Im f)₋; show further that 0 ≤ fⱼ(x) ≤ |f(x)| ∀ x ∈ X;
(d) if f, g : X → [0, ∞) are non-negative measurable functions on X such that f(x) ≤ g(x) ∀x, then ∫ f dμ ≤ ∫ g dμ.
Using the preceding exercise, we can talk of the integral of appropriate complex-valued functions. Thus, let us agree to call a function f : X → ℂ integrable with respect to the measure μ if (i) f is (B, B_ℂ)-measurable, and if (ii)
∫ |f| dμ < ∞. (This makes sense since the measurability of f implies that of |f|.) Further, if {fⱼ : 1 ≤ j ≤ 4} are as in Exercise A.5.14(c) above, then we may define
∫ f dμ = ( ∫ f₁ dμ − ∫ f₂ dμ ) + i ( ∫ f₃ dμ − ∫ f₄ dμ ) .
It is an easy exercise to verify that the set of μ-integrable functions forms a vector space and that the map f ↦ ∫ f dμ defines a linear functional on this vector space.
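The construction in Proposition A.5.13(1)(iii) can be made concrete numerically. The following sketch (plain Python; the grid-based approximation of Lebesgue measure, the test function, and all names are our own illustration, not the text's) computes the integral of f(x) = x² on [0, 1] via the non-decreasing dyadic simple functions sₙ of Proposition A.5.9(3):

```python
# Integral of f(x) = x^2 over [0,1] via the non-decreasing simple functions
# s_n(x) = floor(2^n * f(x)) / 2^n (truncated at height n), as in Prop. A.5.9(3);
# the Lebesgue measure of each level set is approximated by counting grid points.
def simple_integral(f, n, grid=10**5):
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid          # midpoint of the i-th grid cell
        v = min(f(x), n)              # truncate so s_n is bounded
        total += (int(v * 2**n) / 2**n) * (1.0 / grid)
    return total

f = lambda x: x * x
vals = [simple_integral(f, n) for n in (1, 2, 4, 8)]
assert all(vals[i] <= vals[i + 1] + 1e-12 for i in range(3))  # monotone in n
assert abs(vals[-1] - 1/3) < 1e-2                             # converges to 1/3
```

The monotone increase of the approximations toward ∫ f dμ = 1/3 is exactly the limit asserted in part (1)(iii).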
Remark A.5.15 The space L¹(X, B, μ): If (X, B, μ) is a measure space, let ℒ¹ = ℒ¹(X, B, μ) denote the vector space of all μ-integrable functions f : X → ℂ. Let N = {f ∈ ℒ¹ : ∫ |f| dμ = 0}; it is then not hard to show that in order for a measurable function f to belong to N, it is necessary and sufficient that f vanish μ-almost everywhere, meaning that μ({f ≠ 0}) = 0. It follows that N is a vector subspace of ℒ¹; define L¹ = L¹(X, B, μ) to be the quotient space ℒ¹/N; (thus two integrable functions define the same element of L¹ precisely when they agree μ-almost everywhere); it is true, furthermore, that the equation
‖f‖₁ = ∫ |f| dμ
can be thought of as defining a norm on L¹. (Strictly speaking, an element of L¹ is not one function, but a whole equivalence class of functions, (any two of which agree a.e.), but the integral in the above equation depends only on the equivalence class of the function f; in other words, the above equation defines a semi-norm on ℒ¹, which "descends" to a norm on the quotient space L¹.) It is further true that L¹(X, B, μ) is a Banach space.
In an exactly similar manner, the spaces Lᵖ(X, B, μ) are defined; since the cases p ∈ {1, 2, ∞} will be of importance for us, and since we shall not have to deal with other values of p in this book, we shall only discuss these cases.
The space L²(X, B, μ): Let ℒ² = ℒ²(X, B, μ) denote the set of (B, B_ℂ)-measurable functions f : X → ℂ such that ∫ |f|² dμ < ∞; exactly as before, L² = L²(X, B, μ) is defined to be the quotient ℒ²/N, the equation ‖f‖₂ = ( ∫ |f|² dμ )^{1/2} defines a norm on L², and L² is in fact a Hilbert space with respect to the inner product ⟨f, g⟩ = ∫ f ḡ dμ.
It should be mentioned that much of what was stated here for p = 1, 2 has extensions that are valid for p ≥ 1; (of course, it is only for p = 2 that L² is a Hilbert space;) in general, we get a Banach space Lᵖ with
‖f‖_p = ( ∫ |f|ᵖ dμ )^{1/p} .
The space L^∞(X, B, μ): The case L^∞ deserves a special discussion because of certain features of the norm in question; naturally, we wish, here, to regard bounded measurable functions with the norm being given by the "supremum norm"; the only mild complication is caused by our being forced to consider two measurable functions as being identical if they agree a.e.; hence, what happens on a set of (μ-)measure zero should not be relevant to the discussion; thus, for instance, if we are dealing with the example of Lebesgue measure m on ℝ, then since m(A) = 0 for every countable set, we should regard the function f(x) = 0 ∀x as being the same as the function g which is defined to agree with f outside the set ℕ of natural numbers but satisfies g(n) = n ∀ n ∈ ℕ; note that g is not bounded, while f is as trivial as can be.
Hence, define ℒ^∞ = ℒ^∞(X, B, μ) to be the class of all (B, B_ℂ)-measurable functions which are essentially bounded; thus f ∈ ℒ^∞ precisely when there exists E ∈ B (which could very well depend upon f) such that μ(X − E) = 0 and f is bounded on E; thus the elements of ℒ^∞ are those measurable functions which are "bounded a.e." Define
‖f‖_∞ = inf{K > 0 : ∃ N ∈ B such that μ(N) = 0 and |f(x)| ≤ K whenever x ∈ X − N} .
Finally, we define L^∞ = L^∞(X, B, μ) to be the quotient space L^∞ = ℒ^∞/N, where, as before, N is the set of measurable functions which vanish a.e. (which is also the same as {f ∈ ℒ^∞ : ‖f‖_∞ = 0}). It is then the case that L^∞ is a Banach space.
It is a fact that if μ is a finite measure (and even more generally, but not without some restrictions on μ), and if 1 ≤ p < ∞, then the Banach dual space of Lᵖ(X, B, μ) may be naturally identified with L^q(X, B, μ), where the "dual-index" q is related to p by the equation 1/p + 1/q = 1; equivalently, q is defined thus:
q = p/(p − 1) if 1 < p < ∞ , and q = ∞ if p = 1 ,
where the "duality" is given by integration, thus: if f ∈ Lᵖ, g ∈ L^q, then it is true that fg ∈ L¹ and if we define
φ_g(f) = ∫ fg dμ ,
then the correspondence g ↦ φ_g is the one which establishes the desired isometric isomorphism L^q ≅ (Lᵖ)*. □
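On a finite measure space this duality can be checked by hand. Here is a small illustration (plain Python; the weighted counting measure, the dual pair p = 3, q = 3/2, and all names are our own choices) that the functional φ_g attains the norm ‖g‖_q on the unit ball of Lᵖ:

```python
import math

# Finite measure space: X = {0,1,2}, mu({i}) = w[i] (weighted counting measure).
w = [0.5, 1.0, 2.0]
g = [3.0, -1.0, 2.0]
p, q = 3.0, 1.5          # dual indices: 1/p + 1/q = 1

def norm(f, r):
    # L^r norm w.r.t. the weights: (sum |f|^r w)^(1/r)
    return sum(abs(x)**r * wi for x, wi in zip(f, w)) ** (1.0 / r)

def pair(f, gg):
    # the duality pairing phi_g(f) = integral of f*g d(mu)
    return sum(x * y * wi for x, y, wi in zip(f, gg, w))

# Hoelder's inequality: |phi_g(f)| <= ||f||_p ||g||_q
f = [1.0, 2.0, -1.0]
assert abs(pair(f, g)) <= norm(f, p) * norm(g, q) + 1e-12

# The extremal f* = sign(g)|g|^(q-1), normalized to unit p-norm, attains the
# bound, so ||phi_g|| = ||g||_q:
fstar = [math.copysign(abs(x)**(q - 1), x) for x in g]
ns = norm(fstar, p)
fstar = [x / ns for x in fstar]
assert abs(norm(fstar, p) - 1.0) < 1e-12
assert abs(pair(fstar, g) - norm(g, q)) < 1e-9
```

The identity (q − 1)p = q, which makes the extremal choice work, is exactly the dual-index relation 1/p + 1/q = 1.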
We list some of the basic results on integration theory in one proposition, for convenience of reference.
Proposition A.5.16 Let (X, B, μ) be a measure space.
(1) (Fatou's lemma) If {fₙ} is a sequence of non-negative measurable functions on X, then
∫ lim infₙ fₙ dμ ≤ lim infₙ ∫ fₙ dμ .
(2) (monotone convergence theorem) Suppose {fₙ} is a sequence of non-negative measurable functions on X, suppose f is a non-negative measurable function, and suppose that for almost every x ∈ X, it is true that the sequence {fₙ(x)} is a non-decreasing sequence of real numbers which converges to f(x); then
limₙ ∫ fₙ dμ = ∫ f dμ .
(3) (dominated convergence theorem) Suppose {fₙ} is a sequence in Lᵖ(X, B, μ) for some p ∈ [1, ∞); suppose there exists g ∈ Lᵖ such that |fₙ| ≤ g a.e., for each n; suppose further that the sequence {fₙ(x)} converges to f(x) for μ-almost every x in X; then f ∈ Lᵖ and ‖fₙ − f‖_p → 0 as n → ∞.
Remark A.5.17 It is true, conversely, that if {fₙ} is a sequence which converges to f in Lᵖ, then there exists a subsequence, call it {f_{n_k} : k ∈ ℕ}, and a g ∈ Lᵖ such that |f_{n_k}| ≤ g a.e., for each k, and such that {f_{n_k}(x)} converges to f(x) for μ-almost every x ∈ X; all this is for p ∈ [1, ∞). Thus, modulo the need for passing to a subsequence, the dominated convergence theorem essentially describes convergence in Lᵖ, provided 1 ≤ p < ∞. (The reader should be able to construct examples to see that the statement of the dominated convergence theorem is not true for p = ∞.) □
For convenience of reference, we now recall the basic facts concerning product measures (see Example A.5.12(3)) and integration with respect to them.
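A minimal numerical illustration of the dominated convergence theorem (plain Python, with Riemann sums standing in for the Lebesgue integral; the example and all names are our own): fₙ(x) = xⁿ on [0, 1] is dominated by g ≡ 1 and converges to 0 almost everywhere, so ‖fₙ‖₁ → 0:

```python
# f_n(x) = x^n on [0,1]: |f_n| <= g = 1 (an L^1 dominating function), and
# f_n(x) -> 0 for every x < 1, i.e. almost everywhere; dominated convergence
# then gives ||f_n - 0||_1 -> 0 (here ||f_n||_1 = 1/(n+1) exactly).
def integral(h, grid=10**5):
    # midpoint Riemann sum, standing in for the Lebesgue integral on [0,1]
    return sum(h((i + 0.5) / grid) for i in range(grid)) / grid

norms = [integral(lambda x, n=n: x ** n) for n in (1, 5, 25, 125)]
assert all(norms[i] > norms[i + 1] for i in range(3))  # strictly decreasing
assert norms[-1] < 0.01                                # ~ 1/126, tending to 0
```

Note that the convergence fₙ → 0 fails at the single point x = 1, which is a set of Lebesgue measure zero; this is exactly the "almost every x" allowance in the theorem.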
Proposition A.5.18 Let (X, B_X, μ) and (Y, B_Y, ν) be σ-finite measure spaces. Let B = B_X ⊗ B_Y be the "product σ-algebra", which is defined as the σ-algebra generated by all "measurable rectangles" - i.e., sets of the form A × B, A ∈ B_X, B ∈ B_Y.
(a) There exists a unique σ-finite measure λ (which is usually denoted by μ × ν and called the product measure) defined on B with the property that
λ(A × B) = μ(A)ν(B) , ∀ A ∈ B_X, B ∈ B_Y ; (A.5.20)
(b) more generally than in (a), if E ∈ B is arbitrary, define the vertical (resp., horizontal) slices of E by Eˣ = {y ∈ Y : (x, y) ∈ E} (resp., E_y = {x ∈ X : (x, y) ∈ E}); then
(i) Eˣ ∈ B_Y ∀ x ∈ X (resp., E_y ∈ B_X ∀ y ∈ Y);
(ii) the function x ↦ ν(Eˣ) (resp., y ↦ μ(E_y)) is a non-negative measurable extended real-valued function on X (resp., Y); and
(iii) λ(E) = ∫_X ν(Eˣ) dμ(x) = ∫_Y μ(E_y) dν(y) . (A.5.21)
(c) (Tonelli's theorem) Let f : X × Y → ℝ₊ be a non-negative (B, B_ℝ)-measurable function; define the vertical (resp., horizontal) slices of f to be the functions fˣ : Y → ℝ₊ (resp., f_y : X → ℝ₊) given by fˣ(y) = f(x, y) = f_y(x); then,
(i) fˣ (resp., f_y) is a (B_Y, B_ℝ)-measurable (resp., (B_X, B_ℝ)-measurable) function, for every x ∈ X (resp., y ∈ Y);
(ii) the function x ↦ ∫_Y fˣ dν (resp., y ↦ ∫_X f_y dμ) is a (B_X, B_ℝ̄)-measurable (resp., (B_Y, B_ℝ̄)-measurable) extended non-negative real-valued function on X (resp., Y), and
∫_X ( ∫_Y fˣ dν ) dμ(x) = ∫_{X×Y} f dλ = ∫_Y ( ∫_X f_y dμ ) dν(y) . (A.5.22)
(d) (Fubini's theorem) Suppose f ∈ L¹(X × Y, B, λ); if the vertical and horizontal slices of f are defined as in (c) above, then
(i) fˣ (resp., f_y) is a (B_Y, B_ℂ)-measurable (resp., (B_X, B_ℂ)-measurable) function, for μ-almost every x ∈ X (resp., for ν-almost every y ∈ Y);
(ii) fˣ ∈ L¹(Y, B_Y, ν) for μ-almost all x ∈ X (resp., f_y ∈ L¹(X, B_X, μ) for ν-almost all y ∈ Y);
(iii) the function x ↦ ∫_Y fˣ dν (resp., y ↦ ∫_X f_y dμ) is a μ-a.e. (resp., ν-a.e.) meaningfully defined (B_X, B_ℂ)-measurable (resp., (B_Y, B_ℂ)-measurable) complex-valued function, which defines an element of L¹(X, B_X, μ) (resp., L¹(Y, B_Y, ν)); further,
∫_X ( ∫_Y fˣ dν ) dμ(x) = ∫_{X×Y} f dλ = ∫_Y ( ∫_X f_y dμ ) dν(y) . (A.5.23)
□
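For counting measure on a finite product, Tonelli's theorem reduces to the statement that a non-negative matrix can be summed in either order; a trivial check (plain Python, with our own toy data):

```python
# Tonelli for counting measure on {0,...,m-1} x {0,...,n-1}: for a non-negative
# "function" given by a matrix, both iterated sums equal the total sum over the
# product space (equation A.5.22 with mu = nu = counting measure).
a = [[1.0, 2.0, 0.5],
     [0.0, 3.0, 4.0]]
row_then_col = sum(sum(row) for row in a)                        # dy first
col_then_row = sum(sum(a[i][j] for i in range(2)) for j in range(3))  # dx first
assert row_then_col == col_then_row == 10.5
```

For signed (or complex) entries the same exchange is Fubini's theorem, and the integrability hypothesis Σ|aᵢⱼ| < ∞ is what guarantees it in the infinite case.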
One consequence of Fubini's theorem, which will be used in the text, is stated as an exercise below.
Exercise A.5.19 Let (X, B_X, μ) and (Y, B_Y, ν) be σ-finite measure spaces, and let {eₙ : n ∈ ℕ} (resp., {f_m : m ∈ M}) be an orthonormal basis for L²(X, B_X, μ) (resp., L²(Y, B_Y, ν)). Define eₙ ⊗ f_m : X × Y → ℂ by (eₙ ⊗ f_m)(x, y) = eₙ(x) f_m(y). Use the definition
of B_X ⊗ B_Y to show that each eₙ ⊗ f_m is (B_X ⊗ B_Y, B_ℂ)-measurable; use Fubini's theorem to check that {eₙ ⊗ f_m}_{n,m} is an orthonormal set; now, if k ∈ L²(X × Y, B_X ⊗ B_Y, μ × ν), apply Tonelli's theorem to |k|² to deduce the existence of a set A ∈ B_X such that (i) μ(A) = 0, (ii) x ∉ A ⇒ kˣ ∈ L²(Y, B_Y, ν), and
∫_X ‖kˣ‖²_{L²(ν)} dμ(x) = ∫_X ( Σ_{m∈M} |⟨kˣ, f_m⟩|² ) dμ(x) = Σ_{m∈M} ∫_X |⟨kˣ, f_m⟩|² dμ(x) ;
set g_m(x) = ⟨kˣ, f_m⟩, ∀ m ∈ M, x ∉ A, and note that each g_m is defined μ-a.e., and that
‖k‖²_{L²(μ×ν)} = Σ_{m∈M} ‖g_m‖²_{L²(μ)} = Σ_{m∈M} Σ_{n∈ℕ} |⟨g_m, eₙ⟩|² ;
note finally - by yet another application of Fubini's theorem - that ⟨g_m, eₙ⟩ = ⟨k, eₙ ⊗ f_m⟩, to conclude that ‖k‖² = Σ_{m,n} |⟨k, eₙ ⊗ f_m⟩|² and consequently that {eₙ ⊗ f_m : m ∈ M, n ∈ ℕ} is indeed an orthonormal basis for L²(μ × ν).
We conclude this discussion on measure theory with a brief reference to absolute continuity and the fundamental Radon-Nikodym theorem.
Proposition A.5.20 Let (X, B, μ) be a measure space, and let f : X → [0, ∞) be a non-negative integrable function. Then the equation
ν(E) = ∫_E f dμ = ∫ 1_E f dμ (A.5.24)
defines a finite measure on (X, B), which is sometimes called the "indefinite integral" of f; this measure has the property that
E ∈ B, μ(E) = 0 ⇒ ν(E) = 0 . (A.5.25)
If measures μ and ν are related as in condition A.5.25, the measure ν is said to be absolutely continuous with respect to μ. The following result is basic, and we will have occasion to use it in Chapter III. We omit the proof. Proofs of various results in this section can be found in any standard reference on measure theory, such as [Hall], for instance. (The result is true in greater generality than is stated here, but this will suffice for our needs.)
Theorem A.5.21 (Radon-Nikodym theorem) Suppose μ and ν are finite measures defined on the measurable space (X, B). Suppose ν is absolutely continuous with respect to μ. Then there exists a non-negative function g ∈ L¹(X, B, μ) with the property that
ν(E) = ∫_E g dμ , ∀ E ∈ B ;
the function g is uniquely determined μ-a.e. by the above requirement, in the sense that if g₁ is any other function in L¹(X, B, μ) which satisfies the above condition, then g = g₁ μ-a.e.
Further, it is true that if f : X → [0, ∞) is any measurable function, then
∫ f dν = ∫ fg dμ .
Thus, a finite measure is absolutely continuous with respect to μ if and only if it arises as the indefinite integral of some non-negative function which is integrable with respect to μ; the function g, whose μ-equivalence class is uniquely determined by ν, is called the Radon-Nikodym derivative of ν with respect to μ, and both of the following expressions are used to denote this relationship:
g = dν/dμ , dν = g dμ . (A.5.26)
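On a finite set the Radon-Nikodym derivative can be computed directly; the sketch below (plain Python with exact rationals; the two measures and all names are our own toy choices) verifies ν(E) = ∫_E g dμ for g = dν/dμ:

```python
from fractions import Fraction as F

# mu = counting measure on X = {0,1,2,3}; nu is given by point masses.
# nu is absolutely continuous w.r.t. mu, and its Radon-Nikodym derivative
# is just the ratio of point masses: g(x) = nu({x}) / mu({x}).
mu = {x: F(1) for x in range(4)}
nu = {0: F(1, 2), 1: F(0), 2: F(3), 3: F(1, 4)}
g = {x: nu[x] / mu[x] for x in range(4)}

def measure(m, E):
    return sum(m[x] for x in E)

def integral(h, m, E):
    # integral of h over E w.r.t. the measure m
    return sum(h[x] * m[x] for x in E)

for E in [{0}, {1, 2}, {0, 2, 3}, set(range(4))]:
    assert measure(nu, E) == integral(g, mu, E)   # nu(E) = integral_E g dmu
```

If some μ({x}) were 0 while ν({x}) > 0, no such g could exist, which is exactly the absolute-continuity hypothesis of the theorem.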
We will outline a proof of the uniqueness of the Radon-Nikodym derivative in the following sequence of exercises.
Exercise A.5.22 Let (X, B, μ) be a finite measure space. We use the notation M₊ to denote the class of all non-negative measurable functions on (X, B).
(1) Suppose f ∈ M₊, E ∈ B, and a, b ≥ 0 satisfy a1_E ≤ f ≤ b1_E μ-a.e. Show that
aμ(E) ≤ ∫_E f dμ ≤ bμ(E) .
(2) If f ∈ M₊ and if ∫_A f dμ = 0 for some A ∈ B, show that f = 0 a.e. on A, i.e., show that μ({x ∈ A : f(x) > 0}) = 0. (Hint: Consider the sets Eₙ = {x ∈ A : 1/n ≤ f(x) ≤ n}, and appeal to (1).)
(3) If f ∈ ℒ¹(X, B, μ) satisfies ∫_E f dμ = 0 for every E ∈ B, show that f = 0 a.e.; hence deduce the uniqueness assertion in the Radon-Nikodym theorem. (Hint: Write f = g + ih, where g and h are real-valued functions; let g = g₊ − g₋ and h = h₊ − h₋ be the canonical decompositions of g and h as differences of non-negative functions; show that (2) is applicable to each of the functions g±, h± and the set A = X.)
(4) Suppose dν = g dμ as in Theorem A.5.21. Let A = {x : g(x) > 0}, and let μ|_A be the measure defined by
(μ|_A)(E) = μ(E ∩ A) ∀ E ∈ B ; (A.5.27)
show that
(i) μ|_A is absolutely continuous with respect to μ and moreover, d(μ|_A) = 1_A dμ; and
(ii) μ|_A and ν are mutually absolutely continuous, meaning that each measure is absolutely continuous with respect to the other, or equivalently, that they have the same null sets. (Hint: use (2) above to show that ν(E) = 0 if and only if g vanishes μ-a.e. on E.)
Two measures are said to be singular with respect to one another if they are "supported on disjoint sets"; explicitly, if μ and ν are measures defined on (X, B), we write μ ⊥ ν if it is possible to find A, B ∈ B such that X = A ∪ B and μ = μ|_A, ν = ν|_B (in the sense of equation A.5.27). The purpose of the next exercise is to show that for several purposes, deciding various issues concerning two measures - such as absolute continuity of one with respect to the other, or mutual singularity, etc. - is as easy as deciding corresponding issues concerning two sets - such as whether one is essentially contained in the other, or essentially disjoint from the other, etc.
Exercise A.5.23 Let {μᵢ : 1 ≤ i ≤ n} be a finite collection of finite positive measures on (X, B). Let λ = Σ_{i=1}^n μᵢ; then show that:
(a) λ is a finite positive measure and μᵢ is absolutely continuous with respect to λ, for each i = 1, ..., n;
(b) if gᵢ = dμᵢ/dλ, and if Aᵢ = {gᵢ > 0}, then
(i) μᵢ is absolutely continuous with respect to μⱼ if and only if Aᵢ ⊂ Aⱼ (mod λ) (meaning that λ(Aᵢ − Aⱼ) = 0); and
(ii) μᵢ ⊥ μⱼ if and only if Aᵢ and Aⱼ are disjoint mod λ (meaning that λ(Aᵢ ∩ Aⱼ) = 0).
We will find the following consequence of the Radon-Nikodym theorem (and Exercise A.5.23) very useful. (This theorem is also not stated here in its greatest generality.)
Theorem A.5.24 (Lebesgue decomposition) Let μ and ν be finite measures defined on (X, B). Then there exists a unique decomposition ν = ν_a + ν_s with the property that (i) ν_a is absolutely continuous with respect to μ, and (ii) ν_s and μ are singular with respect to each other.
Proof. Define λ = μ + ν; it should be clear that λ is also a finite measure and that both μ and ν are absolutely continuous with respect to λ. Define
f = dμ/dλ , g = dν/dλ , A = {f > 0} , B = {g > 0} .
Define the measures ν_a and ν_s by the equations
ν_a(E) = ∫_{E∩A} g dλ , ν_s(E) = ∫_{E−A} g dλ ;
the definitions clearly imply that ν = ν_a + ν_s; further, the facts that ν_a is absolutely continuous with respect to μ, and that ν_s and μ are mutually singular, are both immediate consequences of Exercise A.5.23(b).
Suppose ν = ν₁ + ν₂ is some other decomposition of ν as a sum of finite positive measures with the property that ν₁ is absolutely continuous with respect to μ and ν₂ ⊥ μ. Notice, to start with, that both the νᵢ are necessarily absolutely continuous with respect to ν and hence also with respect to λ. Write gᵢ = dνᵢ/dλ and Fᵢ = {gᵢ > 0}, for i = 1, 2. We may now deduce from the hypotheses and Exercise A.5.23(b) that we have the following inclusions mod λ:
F₁, F₂ ⊂ B , F₁ ⊂ A , F₂ ⊂ (X − A) .
Note that d(ν_a)/dλ = 1_A g and d(ν_s)/dλ = 1_{X−A} g, and hence we have the following inclusions (mod λ):
F₁ ⊂ B ∩ A = { d(ν_a)/dλ > 0 } and F₂ ⊂ B − A = { d(ν_s)/dλ > 0 } ;
whence we may deduce from Exercise A.5.23(b) that ν₁ (resp., ν₂) is absolutely continuous with respect to ν_a (resp., ν_s). The equation ν = ν₁ + ν₂ must necessarily imply (again because of Exercise A.5.23(b)) that B = F₁ ∪ F₂ (mod λ), which can only happen if F₁ = A ∩ B and F₂ = B − A (mod λ), which means (again by the same exercise) that ν₁ (resp., ν₂) and ν_a (resp., ν_s) are mutually absolutely continuous. □
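For measures on a finite set, the recipe of this proof is transparent: restrict ν to the set where μ lives to get the absolutely continuous part, and to the complement to get the singular part. A toy check (plain Python with exact rationals; all data and names are our own illustration):

```python
from fractions import Fraction as F

# mu and nu on X = {0,1,2,3}. With lambda = mu + nu, the set A = {dmu/dlambda > 0}
# is simply {x : mu({x}) > 0}; restricting nu to A gives nu_a, and restricting
# nu to X - A gives nu_s, exactly as in the proof above.
mu = {0: F(1), 1: F(2), 2: F(0), 3: F(0)}
nu = {0: F(1, 2), 1: F(0), 2: F(5), 3: F(1)}
A = {x for x in mu if mu[x] > 0}

nu_a = {x: (nu[x] if x in A else F(0)) for x in nu}
nu_s = {x: (nu[x] if x not in A else F(0)) for x in nu}

assert all(nu[x] == nu_a[x] + nu_s[x] for x in nu)    # nu = nu_a + nu_s
assert all(nu_a[x] == 0 for x in nu if mu[x] == 0)    # nu_a << mu
assert all(mu[x] == 0 for x in nu if nu_s[x] > 0)     # nu_s and mu singular
```

The two parts live on the disjoint sets A and X − A, which is the "supported on disjoint sets" formulation of singularity given before Exercise A.5.23.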
A.6 The Stone-Weierstrass theorem
We begin this section by introducing a class of spaces which, although not necessarily compact, nevertheless exhibit several features of compact spaces and are closely related to compact spaces in a manner we shall describe. At the end of this section, we prove the very useful Stone-Weierstrass theorem concerning the algebra of continuous functions on such spaces. The spaces we shall be concerned with are the so-called locally compact spaces, which we now proceed to describe.
Definition A.6.1 A topological space X is said to be locally compact if it has a base B of sets with the property that given any open set U in X and a point x ∈ U, there exists a set B ∈ B and a compact set K such that x ∈ B ⊂ K ⊂ U.
Euclidean space ℝⁿ is easily seen to be locally compact, for every n ∈ ℕ. More examples of locally compact spaces are provided by the following proposition.
Proposition A.6.2 (1) If X is a Hausdorff space, the following conditions are equivalent:
(i) X is locally compact;
(ii) every point in X has an open neighbourhood whose closure is compact.
(2) Every compact Hausdorff space is locally compact.
(3) If X is a locally compact Hausdorff space, and if A is a subset of X which is either open or closed, then A, with respect to the subspace topology, is locally compact.
Proof. (1) (i) ⇒ (ii): Recall that compact subsets of Hausdorff spaces are closed - see Proposition A.4.18(c) - while closed subsets of compact sets are compact in any topological space - see Proposition A.4.4(b). It follows that if B is as in Definition A.6.1, then the closure of every set in B is compact.
(ii) ⇒ (i): Let B be the class of all open sets in X whose closures are compact. For each x ∈ X, pick a neighbourhood U_x whose closure - call it K_x - is compact. Suppose now that U is any open neighbourhood of x. Let U₁ = U ∩ U_x, which is also an open neighbourhood of x. Now, x ∈ U₁ ⊂ U_x ⊂ K_x.
Consider K_x as a compact Hausdorff space (with the subspace topology). In this compact space, we see that (a) K_x − U₁ is a closed, and hence compact, subset of K_x, and (b) x ∉ K_x − U₁. Hence, by Proposition A.4.18(c), we can find open sets V₁, V₂ in K_x such that x ∈ V₁, K_x − U₁ ⊂ V₂ and V₁ ∩ V₂ = ∅. In particular, this means that V₁ ⊂ K_x − V₂ = F (say); the set F is a closed subset of the compact set K_x and is consequently compact (in K_x and hence also in X - see Exercise A.4.3). Hence F is a closed set in (the Hausdorff space) X, and consequently, the closure, in X, of V₁ is contained in F and is compact. Also, since V₁ is open in K_x, there exists an open set V in X such that V₁ = V ∩ K_x; but since V₁ ⊂ cl(V₁) ⊂ F ⊂ U₁ ⊂ K_x, we find that also V₁ = V ∩ K_x ∩ U₁ = V ∩ U₁; i.e., V₁ is open in X. Thus, we have shown that for any x ∈ X and any open neighbourhood U of x, there exists an open set V₁ ∈ B such that x ∈ V₁ ⊂ cl(V₁) ⊂ U, and such that cl(V₁) is compact; thus, we have verified local compactness of X.
(2) This is an immediate consequence of (1).
(3) Suppose A is a closed subset of X. Let x ∈ A. Then, by (1), there exists an open set U in X and a compact subset K of X such that x ∈ U ⊂ K. Then
x ∈ A ∩ U ⊂ A ∩ K; but A ∩ K is compact (in X, being a closed subset of a compact set, and hence also compact in A), and A ∩ U is an open set in the subspace topology of A. Thus A is locally compact.
Suppose A is an open set. Then by Definition A.6.1, if x ∈ A, then there exist sets U, K ⊂ A such that x ∈ U ⊂ K ⊂ A, where U is open in X and K is compact; clearly this means that U is also open in A, and we may conclude from (1) that A is indeed locally compact, in this case as well. □
In particular, we see from Proposition A.4.18(a) and Proposition A.6.2(3) that if X is a compact Hausdorff space, and if x₀ ∈ X, then the subspace A = X − {x₀} is a locally compact Hausdorff space with respect to the subspace topology. The surprising and exceedingly useful fact is that every locally compact Hausdorff space arises in this fashion.
Theorem A.6.3 Let X be a locally compact Hausdorff space. Then there exists a compact Hausdorff space X̂ and a point in X̂ - which is usually referred to as the "point at infinity", and denoted simply by ∞ - such that X is homeomorphic to the subspace X̂ − {∞} of X̂. The compact space X̂ is customarily referred to as the one-point compactification of X.
Proof. Define X̂ = X ∪ {∞}, where ∞ is an artificial point (not in X) which is adjoined to X. We make X̂ a topological space as follows: say that a set U ⊂ X̂ is open if either (i) U ⊂ X, and U is an open subset of X; or (ii) ∞ ∈ U, and X̂ − U is a compact subset of X.
Let us first verify that this prescription does indeed define a topology on X̂. It is clear that ∅ and X̂ are open according to our definition. Suppose U and V are open in X̂; there are four cases: (i) U, V ⊂ X: in this case U, V and U ∩ V are all open sets in X; (ii) U is an open subset of X and V = X̂ − K for some compact subset K of X: in this case, since X − K is open in X, we find that U ∩ V = U ∩ (X − K) is an open subset of X; (iii) V is an open subset of X and U = X̂ − K for some compact subset K of X: in this case, since X − K is open in X, we find that U ∩ V = V ∩ (X − K) is an open subset of X; (iv) there exist compact subsets C, K ⊂ X such that U = X̂ − C, V = X̂ − K: in this case, C ∪ K is a compact subset of X and U ∩ V = X̂ − (C ∪ K). We find that in all four cases, U ∩ V is open in X̂. A similar case-by-case reasoning shows that an arbitrary union of open sets in X̂ is also open, and so we have indeed defined a topology on X̂. Finally it should be obvious that the subspace topology that X inherits from X̂ is the same as the given topology.
Since open sets in X are open in X̂ and since X is Hausdorff, it is clear that distinct points in X can be separated in X̂. Suppose now that x ∈ X; then, by Proposition A.6.2(1), we can find an open neighbourhood U of x in X such that the closure (in X) of U is compact; if we call this closure K, then V = X̂ − K is an open neighbourhood of ∞ such that U ∩ V = ∅. Thus X̂ is indeed a Hausdorff space.
Suppose now that {Uᵢ : i ∈ I} is an open cover of X̂. Then, pick a Uⱼ such that ∞ ∈ Uⱼ; since Uⱼ is open, the definition of the topology on X̂ implies that X̂ − Uⱼ = K is a compact subset of X. Then we can find a finite subset I₀ ⊂ I such that K ⊂ ∪_{i∈I₀} Uᵢ; it follows that {Uᵢ : i ∈ I₀ ∪ {j}} is a finite subcover, thereby establishing the compactness of X̂. □
Exercise A.6.4 (1) Show that X is closed in X̂ if and only if X is compact. (Hence if X is not compact, then X is dense in X̂; this is the reason for calling X̂ a compactification of X (since it is "minimal" in some sense).)
(2) The following table has two columns which are labelled X and X̂ respectively; if the i-th row has spaces A and B appearing in the first and second columns, respectively, show that B is homeomorphic to Â.
     X                        X̂
1.   {1, 2, 3, ...}           {0} ∪ {1/n : n = 1, 2, ...}
2.   [0, 1)                   [0, 1]
3.   ℝⁿ                       Sⁿ = {x ∈ ℝⁿ⁺¹ : ‖x‖₂ = 1}
The purpose of the next sequence of exercises is to lead up to a proof of the fact that a normed vector space is locally compact if and only if it is finite-dimensional.
Exercise A.6.5 (1) Let X be a normed space, with finite dimension n, say. Let ℓ¹_n be the vector space ℂⁿ equipped with the norm ‖·‖₁ - see Example 1.2.2(1). Fix a basis {x₁, ..., xₙ} for X and define T : ℓ¹_n → X by T(α₁, ..., αₙ) = Σ_{i=1}^n αᵢxᵢ.
(a) Show that T ∈ 𝓛(ℓ¹_n, X), and that T is 1-1 and onto.
(b) Prove that the unit sphere S = {x ∈ ℓ¹_n : ‖x‖ = 1} is compact, and conclude (using the injectivity of T) that inf{‖Tx‖ : x ∈ S} > 0; hence deduce that T⁻¹ is also continuous.
(c) Conclude that any finite-dimensional normed space is locally compact and complete, and that any two norms on such a space are equivalent - meaning that if ‖·‖ᵢ, i = 1, 2 are two norms on a finite-dimensional space X, then there exist constants k, K > 0 such that k‖x‖₁ ≤ ‖x‖₂ ≤ K‖x‖₁ for all x ∈ X.
(2) (a) Let X₀ be a closed subspace of a normed space X; suppose X₀ ≠ X. Show that, for each ε > 0, there exists a vector x ∈ X such that ‖x‖ = 1 and ‖x − x₀‖ ≥ 1 − ε for all x₀ ∈ X₀. (Hint: Consider a vector of norm (1 − 2ε) in the quotient space X/X₀ - see Exercise 1.5.3(3).)
(b) If X is an infinite-dimensional normed space, and if ε > 0 is arbitrary, show that there exists a sequence {xₙ}_{n=1}^∞ in X such that ‖xₙ‖ = 1 for all n, and ‖xₙ − x_m‖ ≥ (1 − ε), whenever n ≠ m. (Hint: Suppose unit vectors x₁, ..., xₙ have been constructed so that any two of them are at a distance of at least (1 − ε) from one another; let Xₙ be the subspace spanned by {x₁, ..., xₙ};
deduce from (1)(c) above that Xₙ is a proper closed subspace of X and appeal to (2)(a) above to pick a unit vector xₙ₊₁ which is at a distance of at least (1 − ε) from each of the first n xᵢ's.)
(c) Deduce that no infinite-dimensional normed space is locally compact.
(3) Suppose M and N are closed subspaces of a Banach space X and suppose N is finite-dimensional; show that M + N is a closed subspace of X. (Hint: Let π : X → X/M be the natural quotient mapping; deduce from (1) above that π(N) is a closed subspace of X/M, and that, consequently, N + M = π⁻¹(π(N)) is a closed subspace of X.)
(4) Show that the conclusion of (3) above is not valid, in general, if we only assume that N and M are closed, even when X is a Hilbert space. (Hint: Let T ∈ 𝓛(H) be a 1-1 operator whose range is not closed - for instance, see Remark 1.5.15; and consider X = H ⊕ H, M = H ⊕ {0}, N = G(T) = {(x, Tx) : x ∈ H}.)
We now wish to discuss continuous (real- or complex-valued) functions on a locally compact space. We start with an easy consequence of Urysohn's lemma, Tietze's extension theorem and one-point compactifications of locally compact Hausdorff spaces.
Proposition A.6.6 Suppose X is a locally compact Hausdorff space.
(a) If A and K are disjoint subsets of X such that A is closed and K is compact, then there exists a continuous function f : X → [0, 1] such that f(A) = {0}, f(K) = {1}.
(b) If K is a compact subspace of X, and if f : K → ℝ is any continuous function, then there exists a continuous function F : X → ℝ such that F|_K = f.
Proof. (a) In the one-point compactification X̂, consider the subsets K and B = A ∪ {∞}. Then, X̂ − B = X − A is open (in X, hence in X̂); thus B and K are disjoint closed subsets in a compact Hausdorff space; hence, by Urysohn's lemma, we can find a continuous function g : X̂ → [0, 1] such that g(B) = {0}, g(K) = {1}. Let f = g|_X.
(b) This follows from applying Tietze's extension theorem (or rather, from its extension stated in Exercise A.4.25(2)) to the closed subset K of the one-point compactification of X, and then restricting the continuous function (which extends f : K → ℝ) to X. □
The next result introduces a very important function space. (By the theory discussed in Chapter 3, these are the most general commutative C*-algebras.)
Proposition A.6.7 Let X̂ be the one-point compactification of a locally compact Hausdorff space X. Let C(X̂) denote the space of all complex-valued continuous functions on X̂, equipped with the sup-norm ‖·‖_∞.
(a) The following conditions on a continuous function f : X → ℂ are equivalent:
(i) f is the uniform limit of a sequence {fₙ}_{n=1}^∞ ⊂ C_c(X); i.e., there exists a sequence {fₙ}_{n=1}^∞ of continuous functions fₙ : X → ℂ such that each fₙ vanishes outside some compact subset of X, and such that the sequence {fₙ} converges uniformly to f;
(ii) for each ε > 0, there exists a compact subset K ⊂ X such that |f(x)| < ε whenever x ∉ K;
(iii) f extends to a continuous function F : X̂ → ℂ with the property that F(∞) = 0.
The set of functions satisfying these equivalent conditions is denoted by C₀(X).
(b) Let I = {F ∈ C(X̂) : F(∞) = 0}; then I is a maximal ideal in C(X̂), and the mapping F ↦ F|_X defines an isometric isomorphism of I onto C₀(X).
Proof. (a) (i) ⇒ (ii): If ε > 0, pick n such that ‖fₙ − f‖_∞ < ε; let K be a compact set such that fₙ(x) = 0 ∀ x ∉ K; then, clearly, |f(x)| < ε ∀ x ∉ K.
(ii) ⇒ (iii): Define F : X̂ → ℂ by
F(x) = f(x) if x ∈ X , and F(∞) = 0 ;
the function F is clearly continuous at all points of X (since X is an open set in X̂); the continuity of F at ∞ follows from the hypothesis on f and the definition of open neighbourhoods of ∞ in X̂ (as complements of compact sets in X).
(iii) ⇒ (i): Fix n. Let Aₙ = {x ∈ X̂ : |F(x)| ≥ 1/n} and Bₙ = {x ∈ X̂ : |F(x)| ≤ 1/2n}. Note that Aₙ and Bₙ are disjoint closed subsets of X̂, and that in fact Aₙ ⊂ X (so that, in particular, Aₙ is a compact subset of X). By Urysohn's theorem we can find a continuous function φₙ : X̂ → [0, 1] such that φₙ(Aₙ) = {1} and φₙ(Bₙ) = {0}. Consider the function fₙ = (Fφₙ)|_X; then fₙ : X → ℂ is continuous and vanishes outside the compact set {x ∈ X : |F(x)| ≥ 1/2n}, so that fₙ ∈ C_c(X); further, |f(x) − fₙ(x)| = |F(x)|(1 − φₙ(x)) ≤ 2/n for all x ∈ X, and consequently the sequence {fₙ} converges uniformly to f.
(b) This is obvious. □
Now we proceed to the Stone-Weierstrass theorem via its more classical special case, the Weierstrass approximation theorem.
Theorem A.6.8 (Weierstrass approximation theorem) The set of polynomials is dense in the Banach algebra C[a, b], where −∞ < a ≤ b < ∞.
Proof. Without loss of generality - why? - we may restrict ourselves to the case where a = 0, b = 1.
We begin with the binomial theorem
Σ_{k=0}^{n} C(n,k) x^k y^{n−k} = (x + y)^n ; (A.6.28)
first set y = 1 − x and observe that
Σ_{k=0}^{n} C(n,k) x^k (1 − x)^{n−k} = 1 . (A.6.29)
Next, differentiate equation A.6.28 once (with respect to x, treating y as a constant) to get:
Σ_{k=0}^{n} k C(n,k) x^{k−1} y^{n−k} = n(x + y)^{n−1} ; (A.6.30)
multiply by x, and set y = 1 − x, to obtain
Σ_{k=0}^{n} k C(n,k) x^k (1 − x)^{n−k} = nx . (A.6.31)
Similarly, another differentiation (of equation A.6.30 with respect to x), subsequent multiplication by x², and the specialisation y = 1 − x yields the identity
Σ_{k=0}^{n} k(k − 1) C(n,k) x^k (1 − x)^{n−k} = n(n − 1)x² . (A.6.32)
We may now deduce from the preceding equations that
Σ_{k=0}^{n} (k − nx)² C(n,k) x^k (1 − x)^{n−k} = n²x² − 2nx·nx + (nx + n(n − 1)x²) = nx(1 − x) . (A.6.33)
(Here, and in what follows, C(n,k) denotes the binomial coefficient n!/(k!(n − k)!).)
In order to show that any complex-valued continuous function on [0, 1] is uniformly approximable by polynomials (with complex coefficients), it clearly suffices (by considering real and imaginary parts) to show that any real-valued continuous function on [0,1] is uniformly approximable by polynomials with real coefficients. So, suppose now that f : [0,1] --> lR is a continuous real-valued function, and that E > is arbitrary. Since [0,1] is compact, the function f is bounded; let M = Ilfll=. Also, since f is uniformly continuous, we can find 0 > Osuch that If(x) - f(y)1 < E whenever Ix - yl < o. (If you do not know what uniform
°
continuity means, the $\epsilon$-$\delta$ statement given here is the definition; try to use the compactness of $[0,1]$ and prove this assertion directly.)

We assert that if
$$p(x) \;=\; \sum_{k=0}^{n} f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1-x)^{n-k} ,$$
and if $n$ is sufficiently large, then $\|f - p\| < 2\epsilon$. For this, first observe - thanks to equation (A.6.29) - that if $x \in [0,1]$ is temporarily fixed, then
$$|f(x) - p(x)| \;=\; \Big| \sum_{k=0}^{n} \Big(f(x) - f\big(\tfrac{k}{n}\big)\Big) \binom{n}{k} x^k (1-x)^{n-k} \Big| \;\leq\; S_1 + S_2 ,$$
where $S_i = \big| \sum_{k \in I_i} \big(f(x) - f(\frac{k}{n})\big) \binom{n}{k} x^k (1-x)^{n-k} \big|$, and the sets $I_i$ are defined by $I_1 = \{k : 0 \leq k \leq n, \ |\frac{k}{n} - x| < \delta\}$ and $I_2 = \{k : 0 \leq k \leq n, \ |\frac{k}{n} - x| \geq \delta\}$.

Notice now, by the defining property of $\delta$, that
$$S_1 \;\leq\; \epsilon \sum_{k \in I_1} \binom{n}{k} x^k (1-x)^{n-k} \;\leq\; \epsilon \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} \;=\; \epsilon ,$$
while
$$S_2 \;\leq\; 2M \sum_{k \in I_2} \binom{n}{k} x^k (1-x)^{n-k} \;\leq\; 2M \sum_{k \in I_2} \frac{(k-nx)^2}{n^2\delta^2} \binom{n}{k} x^k (1-x)^{n-k} \;\leq\; \frac{2M}{n^2\delta^2} \sum_{k=0}^{n} (k-nx)^2 \binom{n}{k} x^k (1-x)^{n-k} \;=\; \frac{2M\,x(1-x)}{n\delta^2} \;\leq\; \frac{M}{2n\delta^2} \;\to\; 0 \ \text{ as } n \to \infty ,$$
where we used (a) the identity (A.6.33) in the fourth step above, and (b) the fact that $x(1-x) \leq \frac{1}{4}$ in the fifth step; and the proof is complete. □

We now proceed to the Stone-Weierstrass theorem, which is a very useful generalisation of the Weierstrass theorem.
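The polynomial $p$ constructed in the proof above is precisely the $n$-th Bernstein polynomial of $f$. As a quick numerical illustration (the test function $f(x) = |x - \frac{1}{2}|$ is an arbitrary choice, not from the text), the sup-norm error over $[0,1]$ shrinks as $n$ grows:

```python
from math import comb

def bernstein(f, n, x):
    # B_n(f)(x) = sum_{k=0}^{n} f(k/n) C(n,k) x^k (1-x)^(n-k):
    # exactly the approximating polynomial p from the proof above
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)        # continuous on [0,1], with a kink at 1/2
grid = [j / 100 for j in range(101)]

errors = {}
for n in (5, 25, 125):
    errors[n] = max(abs(f(x) - bernstein(f, n, x)) for x in grid)
    print(n, errors[n])

# the uniform error decreases (roughly like 1/sqrt(n) at the kink)
assert errors[125] < errors[25] < errors[5]
```

The slow $1/\sqrt{n}$ rate at the kink is typical of Bernstein approximation; the proof only needs that the error eventually drops below $2\epsilon$, not a rate.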
Theorem A.6.9 (a) (Real version) Let $X$ be a compact Hausdorff space; let $A$ be a (not necessarily closed) subalgebra of the real Banach algebra $C_{\mathbb{R}}(X)$; suppose that $A$ satisfies the following conditions:
(i) $A$ contains the constant functions (or equivalently, $A$ is a unital subalgebra of $C_{\mathbb{R}}(X)$); and
(ii) $A$ separates points in $X$ - meaning, of course, that if $x, y$ are any two distinct points in $X$, then there exists $f \in A$ such that $f(x) \neq f(y)$.
Then, $A$ is dense in $C_{\mathbb{R}}(X)$.

(b) (Complex version) Let $X$ be as above, and suppose $A$ is a self-adjoint subalgebra of $C(X)$; suppose $A$ satisfies conditions (i) and (ii) above. Then, $A$ is dense in $C(X)$.

Proof. To begin with, we may replace $A$ by its closure, consequently assume that $A$ is closed, and seek to prove that $A = C_{\mathbb{R}}(X)$ (resp., $C(X)$).

(a) Begin by noting that since the function $t \mapsto |t|$ can be uniformly approximated on any compact interval of $\mathbb{R}$ by polynomials (thanks to the Weierstrass theorem), it follows that $f \in A \Rightarrow |f| \in A$; since $x \vee y = \max\{x,y\} = \frac{x+y+|x-y|}{2}$, and $x \wedge y = \min\{x,y\} = x + y - (x \vee y)$, it follows that $A$ is a "sub-lattice" of $C_{\mathbb{R}}(X)$ - meaning that $f, g \in A \Rightarrow f \vee g, f \wedge g \in A$.

Next, the hypothesis that $A$ separates points of $X$ (together with the fact that $A$ is a vector space containing the constant functions) implies that if $x, y$ are distinct points in $X$ and if $s, t \in \mathbb{R}$ are arbitrary, then there exists $f \in A$ such that $f(x) = s$, $f(y) = t$. (Reason: first find $f_0 \in A$ such that $f_0(x) = s_0 \neq t_0 = f_0(y)$; next, find constants $a, b \in \mathbb{R}$ such that $as_0 + b = s$ and $at_0 + b = t$, and set $f = af_0 + b1$, where, of course, $1$ denotes the constant function 1.)

Suppose now that $f \in C_{\mathbb{R}}(X)$ and that $\epsilon > 0$. Temporarily fix $x \in X$. For each $y \in X$, we can - by the previous paragraph - pick $f_y \in A$ such that $f_y(x) = f(x)$ and $f_y(y) = f(y)$. Next, choose an open neighbourhood $U_y$ of $y$ such that $f_y(z) > f(z) - \epsilon \ \forall z \in U_y$. Then, by compactness, we can find $\{y_1, \ldots, y_n\} \subset X$ such that $X = \bigcup_{i=1}^{n} U_{y_i}$. Set $g^x = f_{y_1} \vee f_{y_2} \vee \cdots \vee f_{y_n}$, and observe that $g^x(x) = f(x)$ and that $g^x(z) > f(z) - \epsilon \ \forall z \in X$.

Now, we can carry out the procedure outlined in the preceding paragraph for each $x \in X$. If $g^x$ is as in the last paragraph, then, for each $x \in X$, find an open neighbourhood $V_x$ of $x$ such that $g^x(z) < f(z) + \epsilon \ \forall z \in V_x$. Then, by compactness, we may find $\{x_1, \ldots, x_m\} \subset X$ such that $X = \bigcup_{j=1}^{m} V_{x_j}$; finally, set $g = g^{x_1} \wedge g^{x_2} \wedge \cdots \wedge g^{x_m}$, and observe that the construction implies that $f(z) - \epsilon < g(z) < f(z) + \epsilon \ \forall z \in X$, thereby completing the proof of (a).

(b) This follows easily from (a), upon considering real and imaginary parts. (This is where we require that $A$ is a self-adjoint subalgebra in the complex case.) □
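The lattice identities used at the start of the proof of (a) - namely $x \vee y = \frac{x+y+|x-y|}{2}$ and $x \wedge y = x + y - (x \vee y)$, applied pointwise to functions - are worth a minimal sanity check; the two test functions below are arbitrary illustrations:

```python
import math

def join(f, g):
    # f v g = (f + g + |f - g|) / 2, the pointwise maximum
    return lambda x: (f(x) + g(x) + abs(f(x) - g(x))) / 2

def meet(f, g):
    # f ^ g = f + g - (f v g), the pointwise minimum
    return lambda x: f(x) + g(x) - join(f, g)(x)

f = lambda x: math.sin(x)
g = lambda x: x * x - 1

for x in [j / 10 for j in range(-30, 31)]:
    assert abs(join(f, g)(x) - max(f(x), g(x))) < 1e-12
    assert abs(meet(f, g)(x) - min(f(x), g(x))) < 1e-12
```

This is exactly why closure of $A$ under $|\cdot|$ (plus the vector-space operations) forces closure under $\vee$ and $\wedge$.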
We state some useful special cases of the Stone-Weierstrass theorem in the exercise below.

Exercise A.6.10 In each of the following cases, show that the algebra $A$ is dense in $C(X)$:
(i) $X = \mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$, $A = \bigvee \{z^n : n \in \mathbb{Z}\}$; thus, $A$ is the class of "trigonometric polynomials";
(ii) $X$ a compact subset of $\mathbb{R}^n$, and $A$ is the set of functions of the form
$$f(x_1, \ldots, x_n) \;=\; \sum_{k_1, \ldots, k_n = 0}^{N} a_{k_1, \ldots, k_n}\, x_1^{k_1} \cdots x_n^{k_n} ,$$
where $a_{k_1, \ldots, k_n} \in \mathbb{C}$;
(iii) $X$ a compact subset of $\mathbb{C}^n$, and $A$ is the set of (polynomial) functions of the form
$$f(z_1, \ldots, z_n) \;=\; \sum_{k_1, \ldots, k_n, l_1, \ldots, l_n = 0}^{N} a_{k_1, \ldots, k_n, l_1, \ldots, l_n}\, z_1^{k_1} \cdots z_n^{k_n}\, \bar{z}_1^{l_1} \cdots \bar{z}_n^{l_n} ;$$
(iv) $X = \{1, 2, \ldots, N\}^{\mathbb{N}}$, and $A = \bigvee \{\omega_{k_1, \ldots, k_n} : n \in \mathbb{N},\ 1 \leq k_1, \ldots, k_n \leq N\}$, where $\omega_{k_1, \ldots, k_n}$ denotes the indicator function of the cylinder set $\{x \in X : x_j = k_j \ \text{for} \ 1 \leq j \leq n\}$.
The "locally compact" version of the Stone-Weierstrass theorem is the content of the next exercise.

Exercise A.6.11 Let $A$ be a self-adjoint subalgebra of $C_0(X)$, where $X$ is a locally compact Hausdorff space. Suppose $A$ satisfies the following conditions:
(i) if $x \in X$, then there exists $f \in A$ such that $f(x) \neq 0$; and
(ii) $A$ separates points.
Then show that $A$ is dense in $C_0(X)$. (Hint: Let $B = \{F + \alpha 1 : f \in A, \alpha \in \mathbb{C}\}$, where $1$ denotes the constant function on the one-point compactification $\hat{X}$, and $F$ denotes the unique continuous extension of $f$ to $\hat{X}$; appeal to the already established compact case of the Stone-Weierstrass theorem.)
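Exercise A.6.10(i) above can be made quantitative by a tool not used in the text: Fejér's theorem, by which the Cesàro means of the Fourier series of any continuous function on $\mathbb{T}$ are trigonometric polynomials converging to it uniformly. A sketch, where the function $f(e^{i\theta}) = |\sin\theta|$ is an illustrative choice and the Fourier coefficients are computed by a Riemann sum:

```python
import cmath, math

def fourier_coeff(f, k, m=2048):
    # c_k = (1/2pi) * integral of f(t) e^{-ikt} dt over [0, 2pi],
    # approximated by a Riemann sum (very accurate for periodic integrands)
    return sum(f(2 * math.pi * j / m) * cmath.exp(-2j * math.pi * k * j / m)
               for j in range(m)) / m

def fejer_mean(f, N):
    # sigma_N f(t) = sum_{|k|<N} (1 - |k|/N) c_k e^{ikt} is a
    # trigonometric polynomial, i.e. an element of span{z^k : |k| < N}
    c = {k: fourier_coeff(f, k) for k in range(-N + 1, N)}
    return lambda t: sum((1 - abs(k) / N) * c[k] * cmath.exp(1j * k * t)
                         for k in c).real

f = lambda t: abs(math.sin(t))   # continuous on the circle, not a trig polynomial

grid = [2 * math.pi * j / 240 for j in range(240)]
errs = {}
for N in (4, 16, 64):
    sigma_N = fejer_mean(f, N)
    errs[N] = max(abs(f(t) - sigma_N(t)) for t in grid)
    print(N, errs[N])

assert errs[64] < errs[16] < errs[4]   # uniform error decreases
```

Fejér means are only one explicit witness of density; the Stone-Weierstrass theorem itself gives density of the span of $\{z^n : n \in \mathbb{Z}\}$ abstractly.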
A.7
The Riesz representation theorem
This brief section is devoted to the statement of, and some comments on, the Riesz representation theorem, which is an identification of the Banach dual space of $C(X)$ when $X$ is a compact Hausdorff space. There are various formulations of the Riesz representation theorem; we start with one of them.

Theorem A.7.1 (Riesz representation theorem) Let $X$ be a compact Hausdorff space, and let $\mathcal{B} = \mathcal{B}_X$ denote the Borel $\sigma$-algebra (which is generated by the class of all open sets in $X$).
(a) Let $\mu : \mathcal{B} \to [0,\infty)$ be a finite measure; then the equation
$$\phi_\mu(f) \;=\; \int_X f \, d\mu \tag{A.7.34}$$
defines a bounded linear functional $\phi_\mu \in C(X)^*$ which is positive - meaning, of course, that $f \geq 0 \Rightarrow \phi_\mu(f) \geq 0$.
(b) Conversely, if $\phi \in C(X)^*$ is a bounded linear functional which is positive, then there exists a unique finite measure $\mu : \mathcal{B} \to [0,\infty)$ such that $\phi = \phi_\mu$.
Before we get to other results which are also referred to as the Riesz representation theorem, a few words concerning general "complex measures" will be appropriate. We gather some facts concerning such "finite complex measures" in the next proposition. We do not go into the proof of the proposition; it may be found in any standard book on analysis - see [Yos], for instance.

Proposition A.7.2 Let $(X, \mathcal{B})$ be any measurable space. Suppose $\mu : \mathcal{B} \to \mathbb{C}$ is a "finite complex measure" defined on $\mathcal{B}$ - i.e., assume that $\mu(\emptyset) = 0$ and that $\mu$ is countably additive in the sense that whenever $\{E_n : 1 \leq n < \infty\}$ is a sequence of pairwise disjoint members of $\mathcal{B}$, then it is the case that the series $\sum_{n=1}^{\infty} \mu(E_n)$ of complex numbers is convergent to the limit $\mu(\bigcup_{n=1}^{\infty} E_n)$. Then the assignment
$$|\mu|(E) \;=\; \sup \Big\{ \sum_{n=1}^{\infty} |\mu(E_n)| \;:\; E = \coprod_{n=1}^{\infty} E_n , \ E_n \in \mathcal{B} \ \forall n \Big\}$$
(the supremum being taken over all partitions of $E$ into countably many pairwise disjoint members of $\mathcal{B}$) defines a finite positive measure $|\mu|$ on $(X, \mathcal{B})$.

Given a finite complex measure $\mu$ on $(X, \mathcal{B})$ as above, the positive measure
$|\mu|$ is usually referred to as the total variation measure of $\mu$, and the number $\|\mu\| = |\mu|(X)$ is referred to as the total variation of the measure $\mu$. It is an easy exercise to verify that the assignment $\mu \mapsto \|\mu\|$ satisfies all the requirements of a norm, and consequently, the set $M(X)$ of all finite complex measures has a natural structure of a normed space.

Theorem A.7.3 Let $X$ be a compact Hausdorff space. Then the equation
$$\phi_\mu(f) \;=\; \int_X f \, d\mu$$
yields an isometric isomorphism $M(X) \ni \mu \mapsto \phi_\mu \in C(X)^*$ of Banach spaces.
We emphasise that a part of the above statement is the assertion that the norm of the bounded linear functional $\phi_\mu$ is the total variation norm $\|\mu\|$ of the measure $\mu$.
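For a measure with finitely many atoms, the total variation norm and the isometry asserted in Theorem A.7.3 can be verified directly. In the sketch below the particular atoms and weights are illustrative assumptions: $\mu = \sum_j c_j \delta_{x_j}$, so $|\mu| = \sum_j |c_j|\,\delta_{x_j}$ and $\|\mu\| = \sum_j |c_j|$, and $\phi_\mu$ attains this value at a function of sup-norm one:

```python
# An illustrative finite complex measure on X = [0, 3]:
# mu = sum of c_j * delta_{x_j} over three distinct atoms x_j
atoms = {0.0: 1 + 1j, 1.0: -2.0, 2.0: 0.5j}

def phi_mu(f):
    # phi_mu(f) = integral of f d(mu) = sum over atoms of c_j * f(x_j)
    return sum(c * f(x) for x, c in atoms.items())

# for an atomic measure, |mu| = sum |c_j| delta_{x_j}, so the total
# variation norm is ||mu|| = |mu|(X) = sum |c_j|
total_variation = sum(abs(c) for c in atoms.values())

# phi_mu attains its norm at any continuous f with |f| <= 1 taking the
# values conj(c_j)/|c_j| at the atoms (such an f exists by Urysohn/Tietze);
# here we only need the values of f at the atoms:
f_opt = lambda x: atoms[x].conjugate() / abs(atoms[x])
assert abs(phi_mu(f_opt) - total_variation) < 1e-12
print(total_variation)   # sqrt(2) + 2 + 1/2
```

This is the easy inequality $\|\phi_\mu\| \geq \|\mu\|$ made concrete; the content of the theorem is that equality holds for every $\mu \in M(X)$.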
Bibliography
[Ahl] Lars V. Ahlfors, Complex Analysis, McGraw-Hill, 1966.
[Art] M. Artin, Algebra, Prentice Hall of India, New Delhi, 1994.
[Arv] W. Arveson, An Invitation to C*-Algebras, Springer-Verlag, New York, 1976.
[Ban] S. Banach, Opérations linéaires, Chelsea, New York, 1932.
[Hal] Paul R. Halmos, Naive Set Theory, Van Nostrand, Princeton, 1960.
[Hal1] Paul R. Halmos, Measure Theory, Van Nostrand, Princeton, 1950.
[Hal2] Paul R. Halmos, Finite-dimensional Vector Spaces, Van Nostrand, London, 1958.
[MvN] F.J. Murray and John von Neumann, On Rings of Operators, Ann. Math. 37 (1936), 116-229.
[Rud] Walter Rudin, Functional Analysis, McGraw-Hill, 1973.
[Ser] J.-P. Serre, Linear Representations of Finite Groups, Springer-Verlag, New York, 1977.
[Sim] George F. Simmons, Topology and Modern Analysis, McGraw-Hill Book Company, 1963.
[Sto] M.H. Stone, Linear Transformations in Hilbert Space and their Applications to Analysis, A.M.S., New York, 1932.
[Yos] K. Yosida, Functional Analysis, Springer, Berlin, 1965.
Index
- finite, 187 - infinite, 187 cardinality, 185 Cauchy - -Schwarz inequality, 31 - criterion, 13 - sequence, 13 Cayley transform, 158 character, 79 - alternating, 177 - group, 79 characteristic polynomial, 182 closable operator, 152 closed - graph theorem, 19 - map, 197 - operator, 152 - set, 192 closure - of a set, 15, 192 - of an operator, 153 column, 2 commutant, 91 compact operator, 123 - singular values of a, 129 compact space, 198 complete, 13 completion, 15 complex homomorphism, 74 continuous function, 194 convex hull, 122 convex set, 25, 122 convolution, 62
absolutely continuous, 223 - mutually, 225 absorbing set, 25 adjoint - in a C*-algebra, 80 - of a bounded operator, 46 - of an unbounded operator, 150 affiliated to, 164 Alaoglu's theorem, 24 Alexander sub-base theorem, 202 almost everywhere, 219 Atkinson's theorem, 135 Baire category theorem, 16 balanced set, 25 Banach algebra, 61 - semi-simple, 77 Banach space, 13 - reflexive, 15 Banach-Steinhaus theorem, 21 basis - linear, 174 - standard, 8 Bessel's inequality, 36 bilateral shift, 52 Borel set, 209 bounded - below, 18 - operator, 9 C*-algebra, 80 Calkin algebra, 136 Cantor's intersection theorem, 15 cardinal numbers, 185
cyclic vector, 90 deficiency - indices, 160 - spaces, 160 dense set, 193 densely defined, 149 determinant - of a linear transformation, 182 - of a matrix, 177 diagonal argument, 200 dimension - of a vector space, 176 - of the Hilbert space, 40, 190 direct sum - of operators, 60 - of spaces, 43 directed set, 34 disc algebra, 71 domain, 149 dominated convergence theorem, 221 dual group, 79 dyadic rational numbers, 207 ε-net, 199 eigenvalue, 183 - approximate, 117 eigenvector, 183 elementary function, 214 essentially bounded, 220 extension of an operator, 151 extreme point, 122 Fatou's lemma, 221 field, 171 - characteristic of a, 172 - prime, 173 finite intersection property, 198 first category, 16 Fourier - coefficients, 40 - series, 39 - transform, 79 Fredholm operator, 136 - index of a, 138
Fubini's theorem, 222 functional calculus - continuous, 86 - measurable, 117 - unbounded, 166 Gelfand - -Mazur theorem, 74 - -Naimark theorem, 83 - transform, 76 GNS construction, 96 graph, 19 Haar measure, 64 Hahn-Banach theorem, 12, 26 Hahn-Hellinger theorem, 101 Hausdorff space, 204 Hilbert space, 30 - separable, 33 Hilbert-Schmidt - norm, 130 - operator, 130 homeomorphic, 197 homeomorphism, 197 ideal, 72 - maximal, 72 - proper, 72 indicator function, 211 inner product, 29, 30 - space, 30 integrable function, 218 isometric operator, 10, 48 isomorphic, 7 isomorphism, 7 Krein-Milman theorem, 122 Lebesgue decomposition, 225 limit-inferior (lim inf), 212 limit-superior (lim sup), 212 linear - combination, 174 - operator, 149 - transformation, 6 - invertible, 7
linearly - dependent, 174 - independent, 174 matrix, 2 max-min principle, 129 maximal element, 184 measurable - function, 210 - space, 209 measure, 215 - counting, 216 - finite, 215 - Lebesgue, 217 - n-dimensional, 217 - probability, 215 - product, 217 - σ-finite, 217 - singular, 225 - space, 215 metric, 4 - complete space, 13 - space, 3 - separable, 200 Minkowski functional, 25 monotone convergence theorem, 221 neighbourhood base, 22 net, 34 norm, 4 - of an operator, 10 normal element of a C*-algebra, 85 normed - algebra, 61 - space, 4 - dual, 14 - quotient, 14 nowhere dense set, 16 one-point compactification, 228 open - cover, 198 - map, 197 - mapping theorem, 17 - set, 191
orthogonal, 30 - complement, 41 - projection, 43 orthonormal set, 36 - basis, 38 π-stable, 94 parallelogram identity, 30 partial isometry, 118, 119 - final space of a, 120 - initial space of a, 120 path, 142 - -component, 142 - -connected, 142 polar decomposition, 118 - theorem, 120, 169 Pontrjagin duality, 79 positive - element of a C*-algebra, 87 - functional, 95 - square root, 87, 168 - unbounded operator, 167 projection, 87 quasi-nilpotent, 77 radical, 77 Radon-Nikodym - derivative, 224 - theorem, 224 representation, 90 - cyclic, 90 - equivalent, 90 resolvent - equation, 67 - function, 67 - set, 67 restriction of an operator, 151 Riesz - lemma, 44 - representation theorem, 235 row, 2 σ-algebra, 209 - Borel, 209 - product, 209
scalar multiplication, 1 Schroeder-Bernstein theorem, 186 second axiom of countability, 200 self-adjoint - bounded operator, 47 - element in a C*-algebra, 81 - unbounded operator, 155 seminorm, 22 sesquilinear form, 31, 45 - bounded, 45 similar, 144 simple function, 214 spanning set, 174 spectral - mapping theorem, 69 - measure, 111 - radius, 67 - formula, 70 - theorem - for bounded normal operators, 115 - for unbounded self-adjoint operators, 163 spectrum, 67 - essential, 137 - of a Banach algebra, 74 state, 95 Stone-Weierstrass theorem, 233 sub-representation, 94 subspace, 3 symmetric - neighbourhood, 22 - operator, 155 Tietze's extension theorem, 208 Tonelli's theorem, 222 topological space, 191 - locally compact, 227 - normal, 206 topology, 191 - base for the, 193 - discrete, 191 - indiscrete, 191 - product, 196
- strong, 23 - strong operator, 53 - sub-base for the, 194 - subspace, 196 - weak, 23, 195 - weak operator, 53 - weak*, 23 total - set, 53 - variation, 236 totally bounded, 199 triangle inequality, 4 Tychonoff's theorem, 203 unconditionally summable, 35 uniform boundedness principle, 20 uniform convergence, 13 unilateral shift, 51 unital algebra, 61 unitarily equivalent, 51 unitary - element of a C*-algebra, 87 - operator, 49 unitisation, 85 Urysohn's lemma, 207 vector addition, 1 vector space, 1 - complex, 2 - locally convex topological, 22, 25, 26 - over a field, 173 - topological, 21 von Neumann - 's density theorem, 92 - 's double commutant theorem, 94 - algebra, 94 weakly bounded set, 21 Weierstrass' theorem, 231 Zorn's lemma, 184
MMA • Monographs in Mathematics Managing Editors H. Amann / J.-P. Bourguignon / K. Grove / P.-L. Lions Editorial Board H. Araki / J. Ball / F. Brezzi / K.C. Chang / N. Hitchin / H. Hofer / H. Knörrer / K. Masuda / D. Zagier
The foundations of this outstanding book series were laid in 1944. Until the end of the 1970s, a total of 77 volumes appeared, including works of such distinguished mathematicians as Carathéodory, Nevanlinna and Shafarevich, to name a few. The series came to its name and present appearance in the 1980s. According to its well-established tradition, only monographs of excellent quality will be published in this collection. Comprehensive, in-depth treatments of areas of current interest are presented to a readership ranging from graduate students to professional mathematicians. Concrete examples and applications both within and beyond the immediate domain of mathematics illustrate the import and consequences of the theory under discussion. Volume 78
H. Triebel, Theory of Function Spaces I 1983, 284 pages, hardcover, ISBN 3-7643-1381-1.
Volume 79
G.M. Henkin /J. Leiterer, Theory of Functions on Complex Manifolds 1984, 228 pages, hardcover, ISBN 3-7643-1477-X.
Volume 80
E. Giusti, Minimal Surfaces and Functions of Bounded Variation 1984, 240 pages, hardcover, ISBN 3-7643-3153-4.
Volume 81
R.J. Zimmer, Ergodic Theory and Semisimple Groups 1984, 210 pages, hardcover, ISBN 3-7643-3184-4.
Volume 82
V.I. Arnold / S.M. Gusein-Zade / A.N. Varchenko, Singularities of Differentiable Maps - Vol. I 1985, 392 pages, hardcover, ISBN 3-7643-3187-9.
Volume 83
V.I. Arnold / S.M. Gusein-Zade / A.N. Varchenko, Singularities of Differentiable Maps - Vol. II 1988, 500 pages, hardcover, ISBN 3-7643-3185-2.
Volume 84
H. Triebel, Theory of Function Spaces II 1992, 380 pages, hardcover, ISBN 3-7643-2639-5.
Volume 85
K.R. Parthasarathy, An Introduction to Quantum Stochastic Calculus 1992, 300 pages, hardcover, ISBN 3-7643-2697-2.
Volume 86
M. Nagasawa, Schrödinger Equations and Diffusion Theory 1993, 332 pages, hardcover, ISBN 3-7643-2875-4.
Volume 87
J. Prüss, Evolutionary Integral Equations and Applications 1993, 392 pages, hardcover, ISBN 3-7643-2876-2.
Volume 88
R.W. Bruggeman, Families of Automorphic Forms 1994, 328 pages, hardcover, ISBN 3-7643-5046-6.
Volume 89
H. Amann, Linear and Quasilinear Parabolic Problems, Volume I, Abstract Linear Theory 1995, 372 pages, hardcover, ISBN 3-7643-5114-4.
Volume 90
Ph. Tondeur, Geometry of Foliations 1997, 318 pages, hardcover, ISBN 3-7643-5741-X.
Volume 91
H. Triebel, Fractals and Spectra related to Fourier analysis and function spaces 1997, Approx. 280 pages, hardcover, ISBN 3-7643-5776-2.
Volume 92
D. Spring, Convex Integration Theory. Solutions to the h-principle in geometry and topology 1998, 226 pages, hardcover, ISBN 3-7643-5805-X.
LM • Lectures in Mathematics, ETH Zürich Department of Mathematics Research Institute of Mathematics Each year the Eidgenössische Technische Hochschule (ETH) at Zürich invites a selected group of mathematicians to give postgraduate seminars in various areas of pure and applied mathematics. These seminars are directed to an audience of many levels and backgrounds. Now some of the most successful lectures are being published for a wider audience through the Lectures in Mathematics, ETH Zürich series. Lively and informal in style, moderate in size and price, these books will appeal to professionals and students alike, bringing a quick understanding of some important areas of current research.
R.J. LeVeque, Numerical Methods for Conservation Laws (2nd Edition, 3rd Printing 1994), 1992. ISBN 3-7643-2723-5
J.F. Carlson, Modules and Group Algebras (Notes by Ruedi Suter), 1996. ISBN 3-7643-5389-9
R. Narasimhan, Compact Riemann Surfaces, 1992. ISBN 3-7643-2742-1
L. Simon, Theorems on Regularity and Singularity of Energy Minimizing Maps (based on lecture notes by Norbert Hungerbühler), 1996. ISBN 3-7643-5397-X
A.J. Tromba, Teichmüller Theory in Riemannian Geometry, 1992. ISBN 3-7643-2735-9
M. Freidlin, Markov Processes and Differential Equations: Asymptotic Problems, 1996. ISBN 3-7643-5392-9
G. Baumslag, Topics in Combinatorial Group Theory, 1993. ISBN 3-7643-2921-1
J. Jost, Nonpositive Curvature: Geometric and Analytic Aspects, 1997. ISBN 3-7643-5736-3
M. Giaquinta, Introduction to Regularity Theory for Nonlinear Elliptic Systems, 1993. ISBN 3-7643-2879-7
Ch.M. Newman, Topics in Disordered Systems, 1997. ISBN 3-7643-5777-0
O. Nevanlinna, Convergence of Iterations for Linear Equations, 1993. ISBN 3-7643-2865-7
M. Yor, Some Aspects of Brownian Motion, Part I, 1992. ISBN 3-7643-2807-X
R.-P. Holzapfel, The Ball and Some Hilbert Problems, 1995. ISBN 3-7643-2835-5
M. Yor, Some Aspects of Brownian Motion, Part II, 1997. ISBN 3-7643-5717-7
Birkhäuser Advanced Texts Basler Lehrbücher Managing Editors:
H. Amann, Department of Mathematics, University of Zurich, Switzerland R. Brylinski, Pennsylvania State University, University Park, PA, USA This series presents, at an advanced level, introductions to some of the fields of current interest in mathematics. Starting with basic concepts, fundamental results and techniques are covered, and important applications and new developments discussed. The textbooks are suitable as an introduction for students and non-specialists, and they can also be used as background material for advanced courses and seminars. We encourage preparation of manuscripts in TeX for delivery in camera-ready copy which leads to rapid publication, or in electronic form for interfacing with laser printers or typesetters. Proposals should be sent directly to the editors or to: Birkhäuser Verlag, P.O. Box 133, CH-4010 Basel, Switzerland
M. Brodmann, Algebraische Geometrie, 1989. 296 pages. Hardcover. ISBN 3-7643-1779-5
E.B. Vinberg, Linear Representations of Groups, 1989. 152 pages. Hardcover. ISBN 3-7643-2288-8
K. Jacobs, Discrete Stochastics, 1991. 296 pages. Hardcover. ISBN 3-7643-2591-7
S.G. Krantz, H.R. Parks, A Primer of Real Analytic Functions, 1992. 194 pages. Hardcover. ISBN 3-7643-2768-5
L. Conlon, Differentiable Manifolds: A First Course, 1992. 369 pages. Hardcover. ISBN 3-7643-3626-9
H. Hofer / E. Zehnder, Symplectic Invariants and Hamiltonian Dynamics, 1994. 342 pages. Hardcover. ISBN 3-7643-5066-0
M. Rosenblum, J. Rovnyak, Topics in Hardy Classes and Univalent Functions, 1994. 264 pages. Hardcover. ISBN 3-7643-5111-X
P. Gabriel, Matrizen, Geometrie, Lineare Algebra, 1996. 648 pages. Hardcover. ISBN 3-7643-5376-7
M. Artin, Algebra, 1998. ca. 723 pages. Paperback. ISBN 3-7643-5938-2