Carlos S. Kubrusly
Elements of Operator Theory
Birkhäuser Boston · Basel · Berlin
Carlos S. Kubrusly
Catholic University
R. Marquês de S. Vicente 225
22453-900, Rio de Janeiro, Brazil
e-mail: carlos@ele.puc-rio.br
Library of Congress Cataloging-in-Publication Data

Kubrusly, Carlos S., 1947-
  Elements of operator theory / Carlos S. Kubrusly
    p. cm.
  Includes bibliographical references and index.
  ISBN 0-8176-4174-2 (acid-free paper) - ISBN 3-7643-4174-2 (acid-free paper)
  1. Operator theory. I. Title.
QA329.K79 2001
515'.724-dc21    2001018439
AMS Subject Classifications: 47-XX, 47-01, 47A05, 47A10, 47A12, 47A15, 47A75, 47B10, 47B15, 47B20, 47B37, 47C05, 47L05, 46-XX, 46-01, 46A22, 46A30, 46B10, 46B15, 46B20, 46B45, 46B50, 46C05, 46C07, 46C15, 54-XX, 54-01, 54A20, 54B05, 54B10, 54B15, 54C05, 54C20, 54E35, 54E45, 54E50, 54E52, 15-XX, 15-01, 15A03, 15A04, 03Exx, 03E10, 03E20
Printed on acid-free paper.
© 2001 Birkhäuser Boston
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Birkhäuser Boston, c/o Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
ISBN 0-8176-4174-2
SPIN 10754156
ISBN 3-7643-4174-2
Reformatted from author's files in LaTeX2e by TeXniques, Inc., Cambridge, MA
Printed and bound by Hamilton Printing, Rensselaer, NY
Printed in the United States of America
987654321
To the memory of my father
The truth, he thought, has never been of any real value to any human being - it is a symbol for mathematicians and philosophers to pursue. In human relations kindness and lies are worth a thousand truths. He involved himself in what he always knew was a vain struggle to retain the lies. Graham Greene
Preface
"Elements" in the title of this book has its standard meaning, namely, basic principles and elementary theory. The main focus is operator theory, and the topics range from
sets to the spectral theorem. Chapter 1 (Set-Theoretic Structures) introduces the reader to ordering, lattices and cardinality. Linear spaces are presented in Chapter 2 (Algebraic Structures). Metric (and topological) spaces in Chapter 3 (Topological Structures). The purpose of Chapter 4 (Banach Spaces) is to put algebra and topology to work together. Continuity plays a central role in the theory of topological spaces,
and linear transformation plays a central role in the theory of linear spaces. When algebraic and topological structures are compatibly laid on the same underlying set, leading to the notion of topological vector spaces, then we may consider the concept of continuous linear transformations. By an operator we mean a continuous linear transformation of a normed space into itself. Chapter 5 (Hilbert Spaces) is central. There a geometric structure is properly added to the algebraic and topological structures. The spectral theorem is a cornerstone in the theory of operators on Hilbert spaces. It gives a full statement on the nature and structure of normal operators, and is considered in Chapter 6 (The Spectral Theorem). The book is addressed to graduate students, both in mathematics and in one of the sciences, and also to working mathematicians getting into operator theory and scientists willing to apply operator theory to their own subject. In the former case it actually is a first course. In the latter case it may serve as a basic reference on the so-called elementary theory of a single operator. Its primary intention is to introduce operator theory to a new generation of students and provide the necessary background for it. Technically, the prerequisite for this book is some mathematical maturity that a first-year graduate student in mathematics, engineering or in one of
the formal sciences is supposed to have already acquired. The book is largely self-contained. Of course, a formal introduction to analysis will be helpful, as well as an introductory course on functions of a complex variable. Measure and integration are not required until the very last section of the last chapter. Each section of each chapter has a short and concise (sometimes a compound) title. They were selected in such a way that, when put together in the contents, they give a brief outline of the book to the right audience. The focus of this book is on concepts and ideas as an alternative to the computational approach. The proofs avoid computation whenever possible or convenient. Instead, I try to unfold the structural properties behind the statements of theorems, stressing mathematical ideas rather than long calculations. Tedious and ugly (all right, "ugly" is subjective) calculations were avoided where a more conceptual way to explain the stream of ideas was possible. Clearly, this is not new. In any event, every single proof in this book was specially tailored to meet this requirement but they (at least the majority of them) are standard proofs, perhaps with a touch of what may reflect some of the author's minor idiosyncrasies. In writing this book I kept my mind focused on the reader. Sometimes I am talking to my students and sometimes to my colleagues (they surely will identify in each case to whom I am talking). For my students the objective is to teach mathematics (ideas, structures and problems). There are 300 problems throughout the book, many of them with multiple parts. These problems, at the end of each chapter, comprise complements and extensions of the theory, further examples and counterexamples, or auxiliary results that may be useful in the sequel. They are an integral part of the main text, which makes them different from traditional classroom exercises.
Many of these problems are accompanied by hints, which may be a single word or a sketch, sometimes long, of a proof. The idea behind providing these long and detailed hints is that just talking to students is not enough. One has to motivate them too. In my view, motivation (in this context) is to reveal the beauty of pure mathematics, and to challenge students with a real chance to reconstruct a proof for a theorem that is "new" to them. Such a real chance can be offered by a suitable, sometimes rather detailed, hint. At the end of each chapter, just before the problems, the reader will find a list of suggested readings that contains only books. Some of them had a strong influence in preparing this book, and many of them are suggested as a second or third reading. The reference section comprises a list of all those books and just a few research papers (82 books and 11 papers), all of them quoted in the text. Research papers are only mentioned to complement occasional historical remarks so that the few articles cited there are, in fact, classical breakthroughs. For a glance at current research in operator theory the reader is referred to recent research monographs suggested in Chapters 5 and 6. I started writing this book after lecturing on its subject at Catholic University of Rio de Janeiro for over 20 years. In general, the material is covered in two one-semester beginning graduate courses, where the audience comprises mathematics, engineering, economics and physics students. Quite often senior undergraduate students joined the courses. The dividing line between these two one-semester courses
depends a bit on the pace of lectures but is usually somewhere at the beginning of Chapter 5. Questions asked by generations of students and colleagues have been collected. When the collection was big enough some former students, as well as current students, insisted upon a new book but urged that it should not be a mere collection of lecture notes and exercises bound together. I hope not to disappoint them so much. At this point, where a preface is coming to an end, one has the duty and pleasure to acknowledge the participation of those people who somehow effectively contributed in connection with writing the book. Certainly, the students in those
courses were a big help and a source of motivation. Some friends among students and colleagues have collaborated by discussing the subject of this book on many occasions over a long time. They are: Gilberto O. Correa, Oswaldo L. V. Costa, Giselle M. S. Ferreira, Marcelo D. Fragoso, Ricardo S. Kubrusly, Abilio P. Lucena, Helios Malebranche, Carlos E. Pedreira, Denise O. Pinto, Marcos A. da Silveira, and Paulo César M. Vieira. Special thanks are due to my friend and colleague Augusto C. Gadelha Vieira, who read part of the manuscript and made many valuable suggestions. I am also grateful to Ruth F. Curtain who, back in the early seventies, introduced me to functional analysis. I wish to thank Catholic University of Rio de Janeiro for providing the release time that made this project possible. Let me also thank the staff of Birkhäuser Boston and Elizabeth Loew of TeXniques for their ever efficient and friendly partnership. Finally, it is just fair to mention that this project was supported in part by CNPq (Brazilian National Research Council) and FAPERJ (Rio de Janeiro State Research Council).

Carlos S. Kubrusly
Rio de Janeiro
November, 2000
Contents
Preface    vii

1  Set-Theoretic Structures    1
   1.1  Background    1
   1.2  Sets and Relations    3
   1.3  Functions    5
   1.4  Equivalence Relations    7
   1.5  Ordering    8
   1.6  Lattices    10
   1.7  Indexing    12
   1.8  Cardinality    14
   1.9  Remarks    21
   Problems    24

2  Algebraic Structures    37
   2.1  Linear Spaces    37
   2.2  Linear Manifolds    43
   2.3  Linear Independence    46
   2.4  Hamel Basis    48
   2.5  Linear Transformations    55
   2.6  Isomorphisms    58
   2.7  Isomorphic Equivalence    64
   2.8  Direct Sum    66
   2.9  Projections    70
   Problems    75

3  Topological Structures    85
   3.1  Metric Spaces    85
   3.2  Convergence and Continuity    93
   3.3  Open Sets and Topology    100
   3.4  Equivalent Metrics and Homeomorphisms    106
   3.5  Closed Sets and Closure    113
   3.6  Dense Sets and Separable Spaces    119
   3.7  Complete Spaces    127
   3.8  Continuous Extension and Completion    134
   3.9  The Baire Category Theorem    142
   3.10 Compact Sets    148
   3.11 Sequential Compactness    155
   Problems    163

4  Banach Spaces    197
   4.1  Normed Spaces    197
   4.2  Examples    202
   4.3  Subspaces and Quotient Spaces    209
   4.4  Bounded Linear Transformations    215
   4.5  The Open Mapping Theorem and Continuous Inverses    223
   4.6  Equivalence and Finite-Dimensional Spaces    230
   4.7  Continuous Linear Extension and Completion    237
   4.8  The Banach-Steinhaus Theorem and Operator Convergence    242
   4.9  Compact Operators    250
   4.10 The Hahn-Banach Theorem and Dual Spaces    258
   Problems    269

5  Hilbert Spaces    311
   5.1  Inner Product Spaces    311
   5.2  Examples    318
   5.3  Orthogonality    323
   5.4  Orthogonal Complement    328
   5.5  Orthogonal Structure    335
   5.6  Unitary Equivalence    339
   5.7  Summability    343
   5.8  Orthonormal Basis    352
   5.9  The Fourier Series Theorem    359
   5.10 Orthogonal Projection    367
   5.11 The Riesz Representation Theorem and Weak Convergence    376
   5.12 The Adjoint Operator    387
   5.13 Self-Adjoint Operators    396
   5.14 Square Root and Polar Decomposition    401
   Problems    408

6  The Spectral Theorem    441
   6.1  Normal Operators    441
   6.2  The Spectrum of an Operator    449
   6.3  Spectral Radius    457
   6.4  Numerical Radius    462
   6.5  Examples of Spectra    466
   6.6  The Spectrum of a Compact Operator    474
   6.7  The Spectral Theorem for Compact Normal Operators    480
   6.8  A Glimpse at the Spectral Theorem for Normal Operators    489
   Problems    496

References    509

Index    517
1
Set-Theoretic Structures
The purpose of this chapter is to present a very brief review of some basic set-theoretic concepts that will be needed in the sequel. By basic concepts we mean standard notation and terminology, and a few essential results that will be required in later chapters. We assume the reader is familiar with the notion of set and elements (or members, or points) of a set, as well as with the basic set operations. It is convenient to reserve certain symbols for certain sets, especially for the basic number systems.
The set of all nonnegative integers will be denoted by N₀, the set of all positive integers (i.e., the set of all natural numbers) by N, and the set of all integers by Z. The set of all rational numbers will be denoted by Q, the set of all real numbers (or the real line) by R, and the set of all complex numbers by C.
1.1
Background
We shall also assume that the reader is familiar with the basic rules of elementary (classical) logic, but acquaintance with formal logic is not necessary. The foundations of mathematics will not be reviewed in this book. However, before starting our brief review of set-theoretic concepts, we shall introduce some preliminary notation, terminology and logical principles as a background for our discourse. If a predicate P( ) is meaningful for a subject x, then P(x) (or simply P) will denote a proposition. The terms statement and assertion will be used as synonyms for proposition. A statement on statements is sometimes called a formula (or a secondary proposition). Statements may be true or false (not true). A tautology is a
formula that is true regardless of the truth of the statements in it. A contradiction is a formula that is false regardless of the truth of the statements in it. The symbol ⇒ denotes implies, and the formula P ⇒ Q (whose logical definition is "either P is false or Q is true") means "the statement P implies the statement Q". That is, "if P is true, then Q is true", or "P is a sufficient condition for Q". We shall also use the symbol ⇏ for the denial of ⇒, so that ⇏ denotes does not imply and the formula P ⇏ Q means "the statement P does not imply the statement Q". Accordingly, let ¬P stand for the denial of P (read: not P). If P is a statement, then ¬P is its contradictory.

Let us first recall one of the basic rules of deduction called modus ponens: "if a statement P is true and if P implies Q, then the statement Q is true" - "anything implied by a true statement is true". Symbolically, (P true and P ⇒ Q) ⇒ (Q true). A direct proof is essentially a chain of modus ponens. For instance, if P is true, then the string of implications P ⇒ Q ⇒ R ensures that R is true. Indeed, if we can establish that P holds, and also that P implies Q, then (modus ponens) Q holds. Moreover, if we can also establish that Q implies R, then (modus ponens again) R holds. However, modus ponens alone is not enough to ensure that such a reasoning may be extended to an arbitrary (endless) string of implications. In certain cases the Principle of Mathematical Induction provides an alternative reasoning. Let N be the set of all natural numbers. A set S of natural numbers is called inductive if n + 1 is an element of S whenever n is. The Principle of Mathematical Induction states that "if 1 is an element of an inductive set S, then S = N". This leads to a second scheme of proof, called proof by induction. For instance, for each natural number n let Pₙ be a proposition. If P₁ holds true and if Pₙ ⇒ Pₙ₊₁ for every natural number n, then Pₙ holds true for every natural number n. The scheme of proof by induction works for N replaced with N₀. There is nothing magical about the number 1 as far as a proof by induction is concerned. All that is needed is a "beginning" and the notion of "induction". Example: Let i be an arbitrary integer and let Zᵢ be the set made up of all integers greater than or equal to i. For each integer k in Zᵢ let Pₖ be a proposition. If Pᵢ holds true and if Pₖ ⇒ Pₖ₊₁ for each k, then Pₖ holds true for every integer k in Zᵢ (particular cases: Z₀ = N₀ and Z₁ = N).

"If a statement leads to a contradiction, then this statement is false." This is the rule of a proof by contradiction - reductio ad absurdum. It relies on the Principle of Contradiction, which states that "P and ¬P are impossible". In other words, the Principle of Contradiction says that the formula "P and ¬P" is a contradiction. But this alone does not ensure that one of P or ¬P must hold. The Law of the Excluded Middle (or Law of the Excluded Third - tertium non datur) does: "either P or ¬P holds". That is, the Law of the Excluded Middle simply says that the formula "P or ¬P" is a tautology. Therefore, the formula P ⇒ Q also means "P holds only if Q holds", or "Q is a necessary condition for P". If P ⇒ Q and Q ⇒ P, then we shall write P ⇔ Q, which means "P if and only if Q", or "P is a necessary and sufficient condition for Q", or "P and Q are equivalent" (and vice versa). Indeed, the formulas P ⇒ Q and ¬Q ⇒ ¬P are equivalent: (P ⇒ Q) ⇔ (¬Q ⇒ ¬P). Such
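These elementary logical facts (the definition of implication, the Law of the Excluded Middle, and the contrapositive equivalence) can be verified mechanically by enumerating all truth assignments. Here is a minimal Python sketch; the helper names implies and is_tautology are ours, not the book's:

```python
from itertools import product

# implies(p, q) encodes the formula P => Q: "either P is false or Q is true".
def implies(p: bool, q: bool) -> bool:
    return (not p) or q

# A formula in two statements is a tautology if it is true for every
# assignment of truth values to P and Q.
def is_tautology(formula) -> bool:
    return all(formula(p, q) for p, q in product([False, True], repeat=2))

# Law of the Excluded Middle: "P or not P" is a tautology (q is ignored).
assert is_tautology(lambda p, q: p or not p)

# Principle of Contradiction: "P and not P" is never true.
assert not any(p and not p for p in [False, True])

# Contrapositive equivalence: (P => Q) <=> (not Q => not P) is a tautology.
assert is_tautology(lambda p, q: implies(p, q) == implies(not q, not p))
```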
an equivalence is the basic idea behind a contrapositive proof: "to verify that a proposition P implies a proposition Q prove, instead, that the denial of Q implies the denial of P". We conclude this introductory section by pointing out another usual but slightly different meaning for the term "proposition". We shall often say "prove the following proposition" instead of "prove that the following proposition holds true". Here the term proposition is being used as a synonym for theorem (a true statement for which we demand a proof of its truth), and not as a synonym for an assertion or statement (that may be either true or false). A conjecture is a statement that has not been proved yet - it may turn out to be either true or false once a proof of its truth or falsehood is supplied. If a conjecture is proved to be true, then it becomes a theorem. Note that
there is no "false theorem" - if it is false, it is not a theorem. Another synonym for theorem is lemma. There is no logical difference among the terms "theorem", "lemma" and "proposition" but it is usual to endow them with a psychological hierarchy. Generally, a theorem is supposed to bear a greater importance (which is subjective) and a lemma is often viewed as an intermediate theorem (which may be very important indeed) that will be applied to prove a further theorem. Propositions are sometimes placed a step below, either as an isolated theorem or as an auxiliary
result. A corollary is, of course, a theorem that comes out as a consequence of a previously proved theorem (i.e., whose proof is mainly based on an application of that previous theorem). Unlike "conjecture", "proposition", "lemma", "theorem" and "corollary", the term axiom (or postulate) is applied to a fundamental statement (or assumption, or hypothesis) upon which a theory (i.e., a set of theorems) is built. Clearly, a set of axioms (or, more appropriately, a system of axioms) should be consistent (i.e., they should not lead to a contradiction), and they are said to be independent if none of them is a theorem (i.e., if none of them can be proved by the remaining axioms).
1.2
Sets and Relations
If x is an element of a set X, then we shall write x ∈ X (meaning that x belongs to X, or x is contained in X). Otherwise (i.e., if x is not an element of X), x ∉ X. We also write A ⊆ B to mean that a set A is a subset of a set B (A ⊆ B ⇔ {x ∈ A ⇒ x ∈ B}). In such a case A is said to be included in B. The empty set, which is a subset of every set, will be denoted by ∅. Two sets A and B are equal (notation: A = B) if A ⊆ B and B ⊆ A. If A is a subset of B but not equal to B, then we say that A is a proper subset of B and write A ⊂ B. In such a case A is said to be properly included in B. A nontrivial subset of a set X is a nonempty proper subset of it. If P( ) is a predicate that is meaningful for every element x of a set X (so that P(x) is a proposition for each x in X), then {x ∈ X : P(x)} will denote the subset of X consisting of all those elements x of X for which the proposition P(x) is true. The complement of a subset A of a set X, denoted by X\A, is the
subset {x ∈ X : x ∉ A}. If A and B are sets, the difference between A and B, or the relative complement of B in A, is the set

A\B = {x ∈ A : x ∉ B}.

We shall also use the standard notations ∪ and ∩ for union and intersection, respectively (x ∈ A ∪ B ⇔ {x ∈ A or x ∈ B} and x ∈ A ∩ B ⇔ {x ∈ A and x ∈ B}). The sets A and B are disjoint if A ∩ B = ∅ (i.e., if they have an empty intersection). The symmetric difference (or Boolean sum) of two sets A and B is the set
A∆B = (A\B) ∪ (B\A) = (A ∪ B)\(A ∩ B).

The terms class, family and collection (as their related terms prefixed with "sub") will be used as synonyms for set (usually applied for sets of sets, but not necessarily) without imposing any hierarchy among them. If 𝒳 is a collection of subsets of a given set X, then ⋃𝒳 will denote the union of all sets in 𝒳. Similarly, ⋂𝒳 will denote the intersection of all sets in 𝒳 (alternative notation: ⋃_{A∈𝒳} A and ⋂_{A∈𝒳} A). An important statement about complements that exhibits the duality between union and intersection is the so-called De Morgan laws:
X\(⋃_{A∈𝒳} A) = ⋂_{A∈𝒳} (X\A)    and    X\(⋂_{A∈𝒳} A) = ⋃_{A∈𝒳} (X\A).
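These identities are easy to check mechanically on finite sets. A small Python sketch follows; the particular sets A, B, the ambient set X, and the collection of subsets are our own illustrative choices:

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}
X = set(range(10))          # the ambient set; every set below is a subset of X
family = [A, B, {2, 7}]     # a collection of subsets of X

# Symmetric difference: (A\B) ∪ (B\A) = (A ∪ B)\(A ∩ B).
assert A ^ B == (A - B) | (B - A) == (A | B) - (A & B)

# De Morgan laws: the complement of the union is the intersection of the
# complements, and the complement of the intersection is the union of them.
union = set().union(*family)
inter = set.intersection(*family)
assert X - union == set.intersection(*[X - S for S in family])
assert X - inter == set().union(*[X - S for S in family])
```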
The power set of any set X, denoted by P(X), is the collection of all subsets of X. Note that ⋃P(X) = X ∈ P(X) and ⋂P(X) = ∅ ∈ P(X). A singleton in a set X is a subset of X containing one and only one point of X (notation: {x} ⊆ X is the singleton on x ∈ X). A pair (or a doubleton) is a set containing just two points, say {x, y}, where x is an element of a set X and y is an element of a set Y. A pair of points x ∈ X and y ∈ Y is an ordered pair, denoted by (x, y), if x is regarded as the first member of the pair and y is regarded as the second. The Cartesian product of two sets X and Y, denoted by X × Y, is the set of all ordered pairs (x, y) where x ∈ X and y ∈ Y. A relation R between two sets X and Y is any subset of the Cartesian product X × Y. If R is a relation between X and Y, and if (x, y) is a pair in R ⊆ X × Y, then we say that x is related to y under R (or x and y are related by R), and write xRy (instead of (x, y) ∈ R). Tautologically, for any ordered pair (x, y) ∈ X × Y, either (x, y) ∈ R or (x, y) ∉ R (i.e., either x is related to y under R or it is not). A relation between a set X and itself is called a relation on X. If X and Y are sets and if R is a relation between X and Y, then the graph of the relation R is the subset of X × Y

G_R = {(x, y) ∈ X × Y : xRy}.

A relation R clearly coincides with its graph G_R.
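The definition of a relation as a subset of a Cartesian product translates directly into code. A brief Python illustration; the sets and pairs below are our own examples:

```python
from itertools import product

X = {1, 2, 3}
Y = {'a', 'b'}

# A relation between X and Y is just a subset of the Cartesian product X × Y.
cartesian = set(product(X, Y))
R = {(1, 'a'), (2, 'a'), (2, 'b')}
assert R <= cartesian          # R ⊆ X × Y

# "x is related to y under R" (xRy) simply means (x, y) ∈ R.
def related(x, y):
    return (x, y) in R

assert related(2, 'b') and not related(3, 'a')
```

Since a relation coincides with its graph, the set R above already is the graph G_R.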
1.3
Functions
Let x be an arbitrary element of a set X and let y and z be arbitrary elements of a set Y. A relation F between the sets X and Y is a function if xFy and xFz imply y = z. In other words, a relation F between a set X and a set Y is called a function from X to Y (or a mapping of X into Y) if for each x ∈ X there exists a unique y ∈ Y such that xFy. The terms map and transformation are often used as synonyms for function and mapping. (Sometimes the terms correspondence and operator are also used but we shall keep them for special kinds of functions.) It is usual to write F: X → Y to indicate that F is a mapping of X into Y, and y = F(x) (or y = Fx) instead of xFy. If y = F(x), we say that F maps x to y, so that F(x) ∈ Y is the value of the function F at x ∈ X. Equivalently, F(x), which is a point in Y, is the image of the point x in X under F. It is also customary to use the abbreviation "the function X → Y defined by x ↦ F(x)" for a function from X to Y that assigns to each x in X the value F(x) in Y. A Y-valued function on X is precisely a function from X to Y. If Y is a subset of the set C, R or Z, then complex-valued function, real-valued function or integer-valued function, respectively, are usual terminologies. An X-valued function on X (i.e., a function F: X → X from X to itself) is referred to as a function on X. The collection of all functions from a set X to a set Y will be denoted by Y^X. Indeed, Y^X ⊆ P(X × Y). Consider a function F: X → Y. The set X is called the domain of F and the set Y is called the codomain of F. If A is a subset of X, then the image of A under F, denoted by F(A), is the subset of Y consisting of all points y of Y such that y = F(x) for some x ∈ A:

F(A) = {y ∈ Y : y = F(x) for some x ∈ A ⊆ X}.

On the other hand, if B is a subset of Y, then the inverse image of B under F (or the pre-image of B under F), denoted by F⁻¹(B), is the subset of X made up of all points x in X such that F(x) lies in B:

F⁻¹(B) = {x ∈ X : F(x) ∈ B ⊆ Y}.

The range of F, denoted by R(F), is the image of X under F. Thus

R(F) = F(X) = {y ∈ Y : y = F(x) for some x ∈ X}.

If R(F) is a singleton, then F is said to be a constant function. If the range of F coincides with the codomain (i.e., if F(X) = Y), then F is a surjective function. In this case F is said to map X onto Y. The function F is injective (or F is a one-to-one mapping) if its domain X does not contain two elements with the same image. In other words, let x and x' be arbitrary elements of X. A function F: X → Y is injective if F(x) = F(x') implies x = x'. A one-to-one correspondence between a set X and a set Y is a one-to-one mapping of X onto Y. That is, a surjective and injective function (also called a bijective function).
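Images, inverse images, and the injective/surjective distinction can be made concrete for a function on a finite set. A Python sketch follows; the function F below, sending integers to parity labels, is our own example:

```python
# A function from X to Y, represented as a dict (each x has a unique value).
X = {0, 1, 2, 3}
Y = {'even', 'odd', 'unused'}
F = {x: ('even' if x % 2 == 0 else 'odd') for x in X}

def image(F, A):
    """F(A) = {y : y = F(x) for some x in A}."""
    return {F[x] for x in A}

def inverse_image(F, B):
    """F^{-1}(B) = {x : F(x) in B}."""
    return {x for x in F if F[x] in B}

assert image(F, {0, 1}) == {'even', 'odd'}
assert inverse_image(F, {'even'}) == {0, 2}
assert image(F, X) <= Y                     # range R(F) ⊆ codomain Y
surjective = image(F, X) == Y               # onto: range equals codomain?
injective = len(image(F, X)) == len(X)      # one-to-one: no two x collide?
assert not surjective and not injective     # 'unused' is missed; 0 and 2 collide
```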
If A is an arbitrary subset of X and F is a mapping of X into Y, then the function G: A → Y such that G(x) = F(x) for each x ∈ A is the restriction of F to A. Conversely, if G: A → Y is the restriction of F: X → Y to some subset A of X, then F is an extension of G over X. It is usual to write G = F|A. Note that R(F|A) = F(A). Let A be a subset of X and consider a function F: A → X. An element x of A is a fixed point of F (or F leaves x fixed) if F(x) = x. The function J: A → X defined by J(x) = x for every x ∈ A is the inclusion map (or the embedding, or the injection) of A into X. In other words, the inclusion map of A into X is the function J: A → X that leaves each point of A fixed. The inclusion map of X into X is called the identity map on X and denoted by I, or by I_X when necessary (i.e., the identity on X is the function I: X → X such that I(x) = x for every x ∈ X). Thus the inclusion map of a subset of X is the restriction to that subset of the identity map on X. Now consider a function on X; that is, a mapping F: X → X of X into itself. A subset A of X is invariant for F (or invariant under F, or F-invariant) if F(A) ⊆ A. In this case the restriction of F to A, F|A: A → X, has its range included in A: R(F|A) = F(A) ⊆ A ⊆ X. Therefore we shall often think of the restriction of F: X → X to an invariant subset A ⊆ X as a mapping of A into itself: F|A: A → A. It is in this sense that the inclusion map of a subset of X can be thought of as the identity map on that subset: they differ only in that one has a larger codomain than the other.

Let F: X → Y be a function from a set X to a set Y, and let G: Y → Z be a function from the set Y to a set Z. Since the range of F is included in the domain of G, R(F) ⊆ Y, consider the restriction of G to the range of F, G|R(F): R(F) → Z. The composition of G and F, denoted by G ∘ F (or simply by GF), is the function
from X to Z defined by (G ∘ F)(x) = G|R(F)(F(x)) = G(F(x)) for every x ∈ X. It is usual to say that the diagram

    X --F--> Y
     \       |
      \ H    | G
       \     v
        '--> Z
commutes if H = G ∘ F. Although the above diagram is said to be commutative whenever H is the composition of G and F, the composition itself is not a commutative operation even when such a commutation makes sense. For instance, if X = Y = Z and F is a constant function on X, say F(x) = a ∈ X for every x ∈ X, then G ∘ F and F ∘ G are constant functions on X as well: (G ∘ F)(x) = G(a) and (F ∘ G)(x) = a for every x ∈ X. However G ∘ F and F ∘ G need not be the same (unless a is a fixed point of G). Composition may not be commutative but it is always associative. If F maps X into Y, G maps Y into Z and K maps Z into W, then we can consider the compositions K ∘ (G ∘ F): X → W and (K ∘ G) ∘ F: X → W. It is readily verified that K ∘ (G ∘ F) = (K ∘ G) ∘ F. For this reason we may and
shall drop the parentheses. In other words, the diagram

    X --F--> Y --G--> Z --K--> W

with the diagonals H = G ∘ F: X → Z and L = K ∘ G: Y → W, commutes (i.e., H = G ∘ F, L = K ∘ G, and K ∘ H = L ∘ F). If F is a function on a set X, then the composition of F: X → X with itself, F ∘ F, is denoted by F². Likewise, for any positive integer n ∈ N, Fⁿ denotes the composition of F with itself n times, F ∘ ⋯ ∘ F: X → X, which is called the nth power of F. A function
F: X → X is idempotent if F² = F (and hence Fⁿ = F for every n ∈ N). It is easy to show that the range of an idempotent function is precisely the set of all its fixed points. In fact, F = F² if and only if R(F) = {x ∈ X : F(x) = x}. Suppose F: X → Y is an injective function. Thus, for an arbitrary element of R(F), say y, there exists a unique element of X, say x_y, such that y = F(x_y). This defines a function from R(F) to X, F⁻¹: R(F) → X, such that x_y = F⁻¹(y). Hence y = F(F⁻¹(y)). On the other hand, if x is an arbitrary element of X, then F(x) lies in R(F) so that F(x) = F(F⁻¹(F(x))). Since F is injective, x = F⁻¹(F(x)). Conclusion: For every injective function F: X → Y there exists a (unique) function F⁻¹: R(F) → X such that F⁻¹F: X → X is the identity on X (and FF⁻¹: R(F) → R(F) is the identity on R(F)). F⁻¹ is called the inverse of F on R(F): an injective function has an inverse on its range. If F is also surjective, then F⁻¹: Y → X is called the inverse of F. Thus an injective and surjective function is also called an invertible function (in addition to its other names).
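Composition, idempotence, and the inverse on the range can all be checked on small finite examples. A Python sketch; the particular functions F, G, K and P below are our own illustrative choices:

```python
# Composition: (G o F)(x) = G(F(x)). Associative, but not commutative.
F = lambda x: x + 1
G = lambda x: 2 * x
K = lambda x: x * x
compose = lambda g, f: (lambda x: g(f(x)))

assert compose(G, F)(3) == 8      # G(F(3)) = 2*(3+1)
assert compose(F, G)(3) == 7      # F(G(3)) = 2*3+1: not commutative
assert compose(K, compose(G, F))(5) == compose(compose(K, G), F)(5)  # associative

# An idempotent function (F o F = F): its range is its set of fixed points.
X = range(-3, 4)
P = lambda x: abs(x)              # idempotent on the integers shown
assert all(P(P(x)) == P(x) for x in X)
assert {P(x) for x in X} == {x for x in X if P(x) == x}

# An injective function has an inverse on its range: F_inv(F(x)) = x.
Finj = {x: x + 10 for x in X}                 # injective on X
Finv = {y: x for x, y in Finj.items()}        # inverse, defined on R(F)
assert all(Finv[Finj[x]] == x for x in X)
```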
1.4 Equivalence Relations
Let x, y and z be arbitrary elements of a set X. A relation R on X is reflexive if xRx for every x ∈ X, transitive if xRy and yRz imply xRz, and symmetric if xRy implies yRx. An equivalence relation on a set X is a relation ~ on X that is reflexive, transitive and symmetric. If ~ is an equivalence relation on a set X, then the equivalence class of an arbitrary element x of X (with respect to ~) is the set

    [x] = {x' ∈ X : x' ~ x}.

Given an equivalence relation ~ on a set X, the quotient space of X modulo ~, denoted by X/~, is the collection

    X/~ = {[x] ⊆ X : x ∈ X}
1. Set-Theoretic Structures
of the equivalence classes (with respect to ~) of every x ∈ X. For each x in X set π(x) = [x] in X/~. This defines a surjective map π: X → X/~ which is called the natural mapping of X onto X/~. Let 𝒳 be any collection of nonempty subsets of a set X. 𝒳 covers X (or 𝒳 is a covering of X) if X = ∪𝒳 (i.e., if every point in X belongs to some set in 𝒳). 𝒳 is disjoint if the sets in 𝒳 are pairwise disjoint (i.e., A ∩ B = ∅ for every pair of distinct sets A and B in 𝒳). A partition of a set X is a disjoint covering of X. Let ~ be an equivalence relation on a set X, and let X/~ be the quotient space of X modulo ~. It is clear that X/~ is a partition of X. Conversely, let 𝒳 be any partition of a set X and define a relation ~/𝒳 on X as follows: for every x, x' in X, x is related to x' under ~/𝒳 (i.e., x ~/𝒳 x') if x and x' belong to the same set in 𝒳. In fact, ~/𝒳 is an equivalence relation on X, which is called the equivalence relation induced by the partition 𝒳. It is readily verified that the quotient space of X modulo the equivalence relation induced by the partition 𝒳 coincides with 𝒳 itself, just as the equivalence relation induced by the quotient space of X modulo an equivalence relation ~ on X coincides with ~. Symbolically,
    X/(~/𝒳) = 𝒳    and    ~/(X/~) = ~.
Thus an equivalence relation ~ on X induces a partition X/~ of X, which in turn induces back an equivalence relation ~/(X/~) on X that coincides with ~. On the other hand, a partition 𝒳 of X induces an equivalence relation ~/𝒳 on X, which in turn induces back a partition X/(~/𝒳) of X that coincides with 𝒳. Conclusion: The collection of all equivalence relations on a set X is in a one-to-one correspondence with the collection of all partitions of X.
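This correspondence can be checked mechanically on a finite set. A sketch, using congruence modulo 3 on {0, …, 8} as a hypothetical equivalence relation:

```python
# Equivalence classes form a partition, and the induced relation recovers ~.
X = set(range(9))
equiv = lambda a, b: a % 3 == b % 3                 # an equivalence relation ~ on X

# The quotient space X/~ : the set of equivalence classes [x].
quotient = {frozenset(y for y in X if equiv(x, y)) for x in X}
assert set().union(*quotient) == X                  # X/~ covers X ...
assert sum(len(c) for c in quotient) == len(X)      # ... disjointly: a partition

# The equivalence relation induced by the partition X/~ coincides with ~.
induced = lambda a, b: any(a in c and b in c for c in quotient)
assert all(induced(a, b) == equiv(a, b) for a in X for b in X)
```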
1.5 Ordering
Let x and y be arbitrary elements of a set X. A relation R on X is antisymmetric if xRy and yRx imply x = y. A relation ≤ on a nonempty set X is a partial ordering of X if it is reflexive, transitive and antisymmetric. If ≤ is a partial ordering on a set X, the notation x < y means x ≤ y and x ≠ y. Moreover, y ≥ x and y > x are just another way to write x ≤ y and x < y, respectively. Thus a partially ordered set is a pair (X, ≤) where X is a nonempty set and ≤ is a partial ordering of X (i.e., a nonempty set equipped with a partial ordering on it). Warning: It may happen that x ≰ y and y ≰ x for some
(x, y) ∈ X×X. Let (X, ≤) be a partially ordered set, and let A be a subset of X. Note that (A, ≤) is a partially ordered set as well. An element x ∈ X is an upper bound for A if y ≤ x for every y ∈ A. Similarly, an element x ∈ X is a lower bound for A if x ≤ y for every y ∈ A. A subset A of X is bounded above in X if it has an upper bound in X, and bounded below in X if it has a lower bound in X. A bounded subset of X is one that is bounded both above and below. If a subset A of a partially ordered set X is bounded above in X and if some upper bound of A belongs to A, then this (unique) element of A is the maximum of A (or the greatest or biggest element of A), denoted by max A. Similarly, if A is bounded below in X and if some lower bound of A belongs to A, then this (unique) element of A is the minimum of A (or the least or smallest element of A), denoted by min A.
An element x ∈ A is maximal in A if there is no element y ∈ A such that x < y (equivalently, if x ≮ y for every y ∈ A). Similarly, an element x ∈ A is minimal in A if there is no element y ∈ A such that y < x (equivalently, if y ≮ x for every y ∈ A). Note that x ≮ y (or y ≮ x) does not mean that y ≤ x (or x ≤ y), so that the concepts of a maximal (or a minimal) element in A and that of the maximum (or the minimum) element of A do not coincide. Example 1A. A collection of many (e.g., two) pairwise disjoint nonempty subsets of a nonempty set, equipped with the partial ordering defined by the inclusion relation ⊆, has no maximum, no minimum, and every element in it is both maximal and minimal. On the other hand, the collection of all infinite subsets of an infinite set, whose complements are also infinite, has no maximal element in the inclusion
ordering ⊆. (The notion of infinite sets will be introduced later in Section 1.8; for instance, the set of all even natural numbers is an infinite subset of N which has an infinite complement.) Let A be a subset of a partially ordered set X. Let U_A ⊆ X be the set of all upper bounds of A, and let V_A ⊆ X be the set of all lower bounds of A. If U_A is nonempty and has a minimum element, say u = min U_A, then u ∈ U_A is called the supremum (or the least upper bound) of A (notation: u = sup A). Similarly, if V_A is nonempty and has a maximum, say v = max V_A, then v ∈ V_A is called the infimum (or the greatest lower bound) of A (notation: v = inf A). A bounded set may not have a supremum or an infimum. However, if a set A has a maximum (or a minimum), then
sup A = max A (or inf A = min A). Moreover, if a set A has a supremum (or an infimum) in A, then sup A = max A (or inf A = min A). If a pair {x, y} of elements of a partially ordered set X has a supremum or an infimum in X, then we shall use the following notation: x ∨ y = sup{x, y} and x ∧ y = inf{x, y}. Let F: X → Y be a function from a set X to a partially ordered set Y. Thus the range of F, F(X) ⊆ Y, is a partially ordered set. An upper bound for F is an upper bound for F(X), and F is bounded above if it has an upper bound. Similarly, a lower bound for F is a lower bound for F(X), and F is bounded below if it has a lower bound. If a function F is bounded both above and below, then it is said to be bounded.
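These notions are easy to experiment with on a small partially ordered set. A sketch, taking divisibility on {1, 2, 3, 4, 6, 12} as a hypothetical partial ordering; it also illustrates the maximal-versus-maximum distinction of Example 1A.

```python
# sup A computed as the minimum of the set U_A of upper bounds of A.
X = {1, 2, 3, 4, 6, 12}
leq = lambda a, b: b % a == 0                    # a <= b  means  "a divides b"

def sup(A):
    U = [u for u in X if all(leq(a, u) for a in A)]    # upper bounds of A
    m = [u for u in U if all(leq(u, v) for v in U)]    # min U, if it exists
    return m[0] if m else None

assert sup({4, 6}) == 12                         # 4 v 6 = 12
assert sup({2, 3}) == 6                          # 2 v 3 = 6

# In A = {2, 3} both elements are maximal (neither divides the other),
# yet A has no maximum.
A = {2, 3}
maximal = {x for x in A if not any(x != y and leq(x, y) for y in A)}
assert maximal == {2, 3}
```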
The supremum of F, sup_{x∈X} F(x), and the infimum of F, inf_{x∈X} F(x), are defined by sup_{x∈X} F(x) = sup F(X) and inf_{x∈X} F(x) = inf F(X). Now suppose X also is
partially ordered and take an arbitrary pair of points x₁, x₂ in X. F is an increasing function if x₁ ≤ x₂ in X implies F(x₁) ≤ F(x₂) in Y, and strictly increasing if x₁ < x₂ in X implies F(x₁) < F(x₂) in Y. (For notational simplicity we are using the same symbol ≤ to denote both the partial ordering of X and the partial ordering of Y.) Note that F is strictly increasing if and only if it is increasing and injective. In a similar way we can define decreasing and strictly decreasing functions between partially ordered sets. If a function is either decreasing or increasing, then it is said to be monotone.
1.6 Lattices
Let X be a partially ordered set. If every pair {x, y} of elements of X is bounded above, then X is a directed set (or the set X is said to be directed upward). If every pair {x, y} is bounded below, then X is said to be directed downward. X is a lattice if every pair of elements of X has a supremum and an infimum in X (i.e., if there exists a unique u ∈ X and a unique v ∈ X such that u = x ∨ y and v = x ∧ y for every pair x ∈ X and y ∈ X). A nonempty subset A of a lattice X that contains x ∨ y and x ∧ y for every x and y in A is a sublattice of X (and hence a lattice itself). Every lattice is directed both upward and downward. If every bounded subset of X has a supremum and an infimum, then X is a boundedly complete lattice. If every subset of X has a supremum and an infimum, then X is a complete lattice. The following chain of implications

    complete lattice ⇒ boundedly complete lattice ⇒ lattice ⇒ directed set
is clear enough, and none of them can be reversed. If X is a complete lattice, then X has a supremum and an infimum in X, which actually are the maximum and the minimum of X, respectively. Since min X ∈ X and max X ∈ X, this shows that a complete lattice in fact is nonempty (even if this had not been assumed when we defined a partially ordered set). Likewise, the empty subset ∅ of a complete lattice X has a supremum and an infimum. Since every element of X is both an upper and a lower bound for ∅, it follows that U_∅ = V_∅ = X. Hence sup ∅ = min X and inf ∅ = max X.

Example 1B. The power set P(X) of a set X is a complete lattice in the inclusion ordering ⊆, where A ∨ B = A ∪ B and A ∧ B = A ∩ B for every pair (A, B) of subsets of X. In this case, sup ∅ = min P(X) = ∅ and inf ∅ = max P(X) = X.

Example 1C. The real line R with its natural ordering ≤ is a boundedly complete lattice but not a complete lattice (and so is its sublattice Z of all integers). The set A = {x ∈ R : 0 ≤ x ≤ 1}, as a sublattice of (R, ≤), is a complete lattice where sup ∅ = min A = 0 and inf ∅ = max A = 1. The set of all rational numbers Q is a sublattice of R (in the natural ordering ≤) but not a boundedly complete lattice —
e.g., the set {x ∈ Q : x² < 2} is bounded in Q but has no infimum and no supremum in Q.
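Example 1B can be exercised directly on a finite power set. The code below is a sketch under the convention sup ∅ = min P(X) and inf ∅ = max P(X) stated above.

```python
# P(X) as a complete lattice: suprema are unions, infima are intersections.
X = frozenset({0, 1, 2})

def sup(family):
    out = frozenset()                  # sup of the empty family is min P(X) = {}
    for a in family:
        out = out | frozenset(a)
    return out

def inf(family):
    out = X                            # inf of the empty family is max P(X) = X
    for a in family:
        out = out & frozenset(a)
    return out

assert sup([]) == frozenset() and inf([]) == X
assert sup([{0}, {1}]) == {0, 1}       # A v B = A U B
assert inf([{0, 1}, {1, 2}]) == {1}    # A ^ B = A n B
```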
Example 1D. The notion of connectedness needs topology and we shall define it in due course. However, if the reader is already familiar with it, then he can appreciate now a rather simple example of a directed set that is not a lattice (connectedness will be defined in Chapter 3). The subcollection of P(R²) made up of all connected subsets of the Euclidean plane R² is a directed set in the inclusion ordering ⊆ (both upward and downward) but not a lattice.
Lemma 1.1. (Banach-Tarski). An increasing function on a complete lattice has a fixed point.
Proof. Let (X, ≤) be a partially ordered set, consider a function F: X → X, and set A = {x ∈ X : F(x) ≤ x}. Suppose X is a complete lattice. Then X has a supremum in X (sup X = max X). Since max X ∈ X, it follows that F(max X) ∈ X so that F(max X) ≤ max X. Conclusion: A is nonempty. Take x ∈ A arbitrary and let a be the infimum of A (a = inf A ∈ X). If F is increasing, then F(a) ≤ F(x) ≤ x because a ≤ x and x ∈ A. Therefore F(a) is a lower bound for A, and hence F(a) ≤ a. Thus a ∈ A. On the other hand, since F(x) ≤ x and since F is increasing, F(F(x)) ≤ F(x). Thus F(x) ∈ A so that F(A) ⊆ A, and hence F(a) ∈ A (for a ∈ A), which implies that a = inf A ≤ F(a). Therefore a ≤ F(a) ≤ a. Thus (antisymmetry) F(a) = a. □

The next theorem is an extremely important result that plays a central role in Section 1.8. Its proof is based on the previous lemma.
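On a finite complete lattice the proof above is effective: the fixed point is the infimum of A = {x : F(x) ≤ x}. A sketch on the power set of a four-point set, with the hypothetical increasing map F(A) = A ∪ {0}:

```python
# Fixed point of an increasing function on the complete lattice P({0,1,2,3}).
from itertools import chain, combinations

X = frozenset(range(4))
PX = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(X), r) for r in range(len(X) + 1))]

F = lambda A: A | {0}                  # increasing: A <= B implies F(A) <= F(B)

pre = [A for A in PX if F(A) <= A]     # the set A of the proof: F(A) below A
a = X
for A in pre:
    a = a & A                          # a = inf of that set

assert F(a) == a                       # a is a fixed point of F
assert a == frozenset({0})
```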
Theorem 1.2. (Cantor-Bernstein). If there exist an injective mapping of X into Y and an injective mapping of Y into X, then there exists a one-to-one correspondence between the sets X and Y.

Proof. First note that the theorem statement can be translated into the following problem. Given an injective function from X to Y and also an injective function from Y to X, construct a bijective function from X to Y. Thus consider two functions F: X → Y and G: Y → X. Let P(X) be the power set of X. For each A ∈ P(X) set

    Φ(A) = X\G(Y\F(A)).

It is readily verified that Φ: P(X) → P(X) is an increasing function with respect to the inclusion ordering of P(X). Therefore, by the Banach-Tarski Lemma, it has a fixed point in the complete lattice P(X). That is, there exists A₀ ∈ P(X) such that Φ(A₀) = A₀. Hence A₀ = X\G(Y\F(A₀)) so that

    X\A₀ = G(Y\F(A₀)).
Thus X\A₀ is included in the range of G. If F: X → Y and G: Y → X are both injective, then it is easy to show that the function H: X → Y, defined by

    H(x) = F(x) if x ∈ A₀,    H(x) = G⁻¹(x) if x ∈ X\A₀,

is injective and surjective. □
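The construction is concrete enough to run. In the sketch below (the particular injections are hypothetical choices, not from the text) X is the set of nonnegative integers, Y the set of even nonnegative integers, F(n) = 4n and G(m) = m; membership in the fixed point A₀ is decided by chasing preimages backward, and H is checked to be a bijection on an initial sample.

```python
# Cantor-Bernstein glueing: H = F on A0 and G^{-1} off A0, with
# F(n) = 4n : X -> Y and G(m) = m : Y -> X (X = N0, Y = even numbers).
F = lambda n: 4 * n

def in_A0(x):
    # x lies in A0 iff the backward chain x <-G- m <-F- n <-G- ... stops in X.
    while x != 0:                # the chain of 0 never stops, so 0 is not in A0
        if x % 2 == 1:           # x is not of the form G(m): chain stops in X
            return True
        if x % 4 != 0:           # x = G(m) but m is not of the form F(n)
            return False
        x //= 4                  # x = G(F(n)): continue the chase at n
    return False

H = lambda x: F(x) if in_A0(x) else x     # G is the inclusion, so G^{-1}(x) = x

vals = [H(x) for x in range(200)]
assert len(set(vals)) == len(vals)         # H is injective on the sample
assert all(v % 2 == 0 for v in vals)       # H maps into Y
assert set(range(0, 60, 2)) <= set(vals)   # small elements of Y are attained
```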
If X is a partially ordered set such that for every pair x, y of elements of X either x ≤ y or y ≤ x, then X is simply ordered (synonyms: linearly ordered, totally ordered). A simply ordered set is also called a chain. Note that, in this particular case, the concepts of maximal element and maximum element (as well as minimal element and minimum element) coincide. It is clear that every simply ordered set is a lattice. For instance, any subset of the real line R (e.g., R itself or Z) is simply ordered.
Example 1E. Let ≤ be a simple ordering on a set X and recall that x < y means x ≤ y and x ≠ y. This defines a transitive relation < on X that satisfies the trichotomy law: for every pair {x, y} in X exactly one of the three statements x < y, x = y, or y < x is true. Conversely, if < is a transitive relation on a set X that satisfies the trichotomy law, and if a relation ≤ on X is defined by setting x ≤ y whenever either x < y or x = y, then ≤ is a simple ordering on X. Thus, according to the above notation, ≤ is a simple ordering on a set X if and only if < is a transitive relation on X that satisfies the trichotomy law. If X is a partially ordered set such that every nonempty subset of it has a minimum, then X is said to be well-ordered. Every well-ordered set is simply ordered. Example: Any subset of the set N₀ of all nonnegative integers is well-ordered.
1.7 Indexing
Let F be a function from a set X to a set Y. Another way to look at the range of F is: for each x ∈ X set y_x = F(x) ∈ Y and note that F(X) = {y_x ∈ Y : x ∈ X}, which can also be written as {y_x}_{x∈X}. Thus the domain X can be thought of as an index set, the range {y_x}_{x∈X} as a family of elements of Y indexed by an index set X (an indexed family), and the function F: X → Y as an indexing. An indexed family {y_x}_{x∈X} may contain elements y_a and y_b, for a and b in X, such that y_a = y_b. If {y_x}_{x∈X} has the property that y_a ≠ y_b whenever a ≠ b, then it is said to be an indexed family of distinct elements. Observe that {y_x}_{x∈X} is a family of distinct elements if and only if the function F: X → Y (i.e., the indexing process) is injective. The identity mapping on an arbitrary set X can be viewed as an indexing of X, the self-indexing of X. Thus any set X can be thought of as an indexed family (the range of the self-indexing of itself).
A mapping of the set N (or N₀, but not Z) into a set Y is called a sequence (or an infinite sequence). Notations: {yₙ}ₙ∈N, {yₙ}ₙ≥₁, {yₙ}ₙ₌₁^∞, or simply {yₙ}. Thus a Y-valued sequence (or a sequence of elements in Y, or even a sequence in Y) is precisely a function from N to Y, which is commonly thought of as an indexed family (indexed by N) where the indexing process (i.e., the function itself) is often omitted. The elements yₙ of {yₙ} are sometimes referred to as the entries of the sequence {yₙ}. If Y is a subset of the set C, R or Z, then complex-valued sequence, real-valued sequence or integer-valued sequence, respectively, are usual terminologies. Let {X_γ}_{γ∈Γ} be an indexed family of sets. The Cartesian product of {X_γ}_{γ∈Γ}, denoted by ∏_{γ∈Γ} X_γ, is the set consisting of all indexed families {x_γ}_{γ∈Γ} such that x_γ ∈ X_γ for every γ ∈ Γ. In particular, if X_γ = X for all γ ∈ Γ, where X is a fixed set, then ∏_{γ∈Γ} X_γ is precisely the collection of all functions from Γ to X. That is,
    ∏_{γ∈Γ} X = X^Γ.

Recall: X^Γ denotes the collection of all functions from a set Γ to a set X. Suppose Γ = 𝕀ₙ, where 𝕀ₙ = {i ∈ N : i ≤ n} for some n ∈ N (𝕀ₙ is called an initial segment of N). The Cartesian product of {Xᵢ}_{i∈𝕀ₙ} (or {Xᵢ}ⁿᵢ₌₁), denoted by ∏_{i∈𝕀ₙ} Xᵢ or ∏ⁿᵢ₌₁ Xᵢ, is the set X₁ ×⋯× Xₙ of all ordered n-tuples (x₁, …, xₙ) with xᵢ ∈ Xᵢ for every i ∈ 𝕀ₙ. Moreover, if Xᵢ = X for all i ∈ 𝕀ₙ, then ∏_{i∈𝕀ₙ} X is the Cartesian product of n copies of X, which is denoted by Xⁿ (instead of X^𝕀ₙ). The n-tuples
(x₁, …, xₙ) in Xⁿ are also called finite sequences (as functions from an initial segment of N into X). Accordingly, ∏_{n∈N} X is referred to as the Cartesian product of countably infinite copies of X, which coincides with X^N: the set of all X-valued (infinite) sequences. A remarkably useful way of defining an infinite sequence is given by the Principle of Recursive Definition which says that, if F is a function from a nonempty set X into itself and if x is an arbitrary element of X, then there exists a unique X-valued sequence {xₙ}ₙ∈N such that x₁ = x and xₙ₊₁ = F(xₙ) for every n ∈ N. The existence of such a unique sequence is intuitively clear, and it can be easily proved by induction (i.e., by using the Principle of Mathematical Induction). A slight generalization reads as follows. For each n ∈ N let Gₙ be a mapping of Xⁿ into X, and let x be an arbitrary element of X. Then there exists a unique X-valued sequence {xₙ}ₙ∈N such that x₁ = x and xₙ₊₁ = Gₙ(x₁, …, xₙ) for every n ∈ N. Since sequences are functions of N (or of N₀) to a set X, the terms associated with the notion of boundedness clearly apply to sequences in a partially ordered set X. In particular, if X is a partially ordered set, and if {xₙ} is an X-valued sequence, then supₙxₙ and infₙxₙ are defined as the supremum and infimum, respectively, of the partially ordered indexed family {xₙ}. Since N and N₀ (with their natural ordering) are partially ordered sets (well-ordered, really), the terms associated with the property of being monotone also apply to sequences in a partially ordered set X. Let {zₙ}ₙ∈N be a sequence in a set Z, and let {nₖ}ₖ∈N be a strictly increasing sequence of positive integers (i.e., a strictly increasing sequence in N). If we think of
{nₖ} and {zₙ} as functions, then the range of the former is a subset of the domain of the latter (i.e., the indexed family {nₖ}ₖ∈N is a subset of N). Thus we may consider the composition of {zₙ} with {nₖ}, say {z_{nₖ}}, which is again a function from N to Z; that is, {z_{nₖ}} is a sequence in Z. Moreover, since {nₖ} is strictly increasing, to each element of the indexed family {z_{nₖ}}ₖ∈N there corresponds a unique element of the indexed family {zₙ}ₙ∈N. In this case the Z-valued sequence {z_{nₖ}} is called a subsequence of {zₙ}.
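The Principle of Recursive Definition and the notion of subsequence both translate directly into code. A sketch with the hypothetical iteration F(t) = t/2 + 1/t (whose iterates approach √2):

```python
# x_1 = x and x_{n+1} = F(x_n) determine a unique sequence (here a list prefix).
def recursive_sequence(F, x, n):
    seq = [x]
    for _ in range(n - 1):
        seq.append(F(seq[-1]))
    return seq

F = lambda t: t / 2 + 1 / t
xs = recursive_sequence(F, 1.0, 30)
assert abs(xs[-1] - 2 ** 0.5) < 1e-9        # the iterates settle near sqrt(2)

# A subsequence {x_{n_k}}: compose {x_n} with a strictly increasing {n_k}.
nk = [2 * k for k in range(15)]             # n_k = 2k (0-based indices here)
sub = [xs[k] for k in nk]
assert len(sub) == 15 and sub[0] == xs[0]
```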
A sequence is a function whose domain is either N or N₀, but a similar concept could be likewise defined for a function on a well-ordered domain. Even in this case, a function with domain Z (equipped with its natural ordering) would not be a sequence. Now recall the string of (nonreversible) implications

    well-ordered ⇒ simply ordered ⇒ lattice ⇒ directed set.
This might suggest an extension of the concept of sequence by allowing functions whose domains are directed sets. A net in a set X is a family of elements of X indexed by a directed set Γ. In other words, if Γ is a directed set and X is an arbitrary set, then an indexed family {x_γ}_{γ∈Γ} of elements of X indexed by Γ is called a net in X indexed by Γ. Examples: Every X-valued sequence {xₙ} is a net in X. In fact, sequences are prototypes of nets. Every X-valued function on Z (notations: {xₖ}ₖ∈Z, {xₖ}ₖ₌₋∞^∞ or {xₖ; k = 0, ±1, ±2, …}) is a net (sometimes called a double sequence or bisequence, although these nets are not sequences themselves).
1.8 Cardinality
Two sets, say X and Y, are said to be equivalent (denoted by X ↔ Y) if there exists a one-to-one correspondence between them. Clearly (see Problems 1.8 and 1.9), X ↔ X (reflexivity), X ↔ Y if and only if Y ↔ X (symmetry), and X ↔ Z whenever X ↔ Y and Y ↔ Z for some set Y (transitivity). Thus, if there exists a set upon which ↔ is a relation, then it is an equivalence relation. For instance, if the notion of equivalent sets is restricted to subsets of a given set X, then ↔ is an equivalence relation on the power set P(X). If C = {x_γ}_{γ∈Γ} is an indexed family of distinct elements of a set X indexed by a set Γ (so that x_α ≠ x_β for every α ≠ β in Γ), then C ↔ Γ (the very indexing process sets a one-to-one correspondence between Γ and C). Let N be the set of all natural numbers and, for each n ∈ N, consider the initial segment 𝕀ₙ = {i ∈ N : i ≤ n}. A set X is finite if it is either empty or equivalent to 𝕀ₙ for some n ∈ N. A set is infinite if it is not finite. If X is finite and Y is equivalent to X, then Y is finite. Therefore, if X is infinite and Y is equivalent to X, then Y is infinite. It is easy to show by induction that, for each n ∈ N, 𝕀ₙ has no proper subset equivalent to it. Thus (see Problem 1.12), every finite set has no proper subset equivalent to it. That
is, if a set has a proper equivalent subset, then it is infinite. Moreover, such a subset must be infinite too (since it is equivalent to an infinite set).
Example 1F. N is infinite. Indeed, it is easy to show that N₀ is equivalent to N (the function F: N₀ → N such that F(n) = n + 1 for every n ∈ N₀ will do the job). Thus N₀ is infinite, because N is a proper subset of N₀ which is equivalent to it, and so is N. To verify the converse (i.e., to show that every infinite set has a proper equivalent subset) we apply the Axiom of Choice.
Axiom of Choice. If {X_γ}_{γ∈Γ} is an indexed family of nonempty sets indexed by a nonempty index set Γ, then there exists an indexed family {x_γ}_{γ∈Γ} such that x_γ ∈ X_γ for each γ ∈ Γ.

Theorem 1.3. A set is infinite if and only if it has a proper equivalent subset.
Proof. We have already seen that every set with a proper equivalent subset is infinite. To prove the converse take an arbitrary element x₀ from an infinite set X₀, and an arbitrary k from N₀. The Principle of Mathematical Induction allows us to construct, for each k ∈ N₀, a finite family {Xₙ}ₙ₌₁ᵏ⁺¹ of infinite sets as follows. Set X₁ = X₀\{x₀} and, for every nonnegative integer n ≤ k, let Xₙ₊₁ be recursively defined by the formula Xₙ₊₁ = Xₙ\{xₙ}, where {xₙ}ₙ₌₀ᵏ is a finite set of pairwise distinct elements, each xₙ being an arbitrary element taken from each Xₙ. However, if we consider the (infinite) indexed family {Xₙ}ₙ∈N₀ = ⋃ₖ∈N₀ {Xₙ}ₙ₌₀ᵏ, then we need the Axiom of Choice to ensure the existence of the indexed family {xₙ}ₙ∈N₀ = ⋃ₖ∈N₀ {xₙ}ₙ₌₀ᵏ, where each xₙ is arbitrarily taken from each Xₙ. Now set A₀ = {xₙ}ₙ∈N₀ ⊆ X₀, A = {xₙ}ₙ∈N ⊆ A₀, and

    X = A ∪ (X₀\A₀) ⊆ A₀ ∪ (X₀\A₀) = X₀.

Note that A₀ ↔ N₀ and A ↔ N (since the elements of A₀ are distinct). Thus A₀ ↔ A (because N₀ ↔ N), and hence X₀ ↔ X (see Problem 1.20). Conclusion: Any infinite set X₀ has a proper equivalent subset (i.e., there exists a proper subset X of X₀ such that X₀ ↔ X). □

If X is a finite set, so that it is equivalent to an initial segment 𝕀ₙ for some natural number n, then we say that its cardinality (or its cardinal number) is n. Thus the cardinality of a finite set X is just the number of elements of X (where, in this case, "numbering" means "indexing", as a finite set may be naturally indexed by an index set 𝕀ₙ). We shall use the symbol # for cardinality. Thus #𝕀ₙ = n, and so #X = n whenever X ↔ 𝕀ₙ. For infinite sets the concept of cardinal number is a bit more complicated. We shall not define a cardinal number for an infinite set as we did for finite sets (which "number" should it be?) but define the following concept instead.
Two sets X and Y are said to have the same cardinality if they are equivalent. Thus, to each set X we shall assign a symbol #X, called the cardinal number of X (or the cardinality of X), according to the following rule: #X = #Y if and only if X ↔ Y — two sets have the same cardinality if and only if they are equivalent; otherwise (i.e., if they are not equivalent) we shall write #X ≠ #Y. We say that the cardinality of a set X is less than or equal to the cardinality of a set Y (notation: #X ≤ #Y) if there exists an injective mapping of X into Y (i.e., if there exists a subset Y' of Y such that #X = #Y'). Equivalently, #X ≤ #Y if there exists a surjective mapping of Y onto X (see Problem 1.6). If #X ≤ #Y and #X ≠ #Y, then we shall write #X < #Y.
Theorem 1.4. (Cantor). #X < #P(X) for every set X.
Proof. Consider the function F: X → P(X) defined by F(x) = {x} for every x ∈ X, which is clearly injective. Thus #X ≤ #P(X). Hence #X < #P(X) if and only if #X ≠ #P(X). Suppose #X = #P(X) so that there exists a surjective function G: X → P(X). Consider the set A = {x ∈ X : x ∉ G(x)} in P(X) and take a ∈ X such that G(a) = A (recall: G is surjective). If a ∈ A, then a ∉ G(a) and hence a ∉ A, which is a contradiction. Conclusion 1: a ∉ A. On the other hand, if a ∉ A, then a ∈ G(a) and hence a ∈ A, which is another contradiction. Conclusion 2: a ∈ A. Therefore (a ∉ A and a ∈ A), which is impossible (i.e., which also is a contradiction). Final conclusion: #X ≠ #P(X). □

Let A be a subset of a set X. The characteristic function of the set A is the map χ_A: X → {0, 1} such that

    χ_A(x) = 1 if x ∈ A,    χ_A(x) = 0 if x ∈ X\A.
It is clear that the correspondence between the subsets of X and their characteristic functions is one-to-one. Hence #P(X) coincides with the cardinality of the collection of all characteristic functions of subsets of X. More generally, let 2^X denote the collection of all maps of a set X into the set {0, 1} (i.e., set 2^X = {0, 1}^X).
Theorem 1.5. #P(X) = #2^X for every set X.

Proof. Let F be a function from the collection 2^X to the power set P(X) that assigns to each map φ: X → {0, 1} the inverse image of the singleton {1} under φ. That is, consider the function F: 2^X → P(X) defined by F(φ) = φ⁻¹({1}) for every φ in 2^X.

Claim 1. F is surjective.

Proof. If A is a subset of X, then the characteristic function of A, χ_A in 2^X, is such that χ_A⁻¹({1}) = A. Thus F(χ_A) = A for every A ∈ P(X). Therefore, P(X) = ⋃_{A∈P(X)} F(χ_A) ⊆ ⋃_{φ∈2^X} F(φ) = R(F). □

Claim 2. F is injective.
Proof. Take φ, ψ ∈ 2^X. If φ ≠ ψ, then φ(x) ≠ ψ(x) for some x ∈ X. Thus φ(x) = 0 and ψ(x) = 1 (or vice versa), so that x ∉ φ⁻¹({1}) and x ∈ ψ⁻¹({1}). Hence φ⁻¹({1}) ≠ ψ⁻¹({1}), so that F(φ) ≠ F(ψ). □

Conclusion: F is a one-to-one correspondence between 2^X and P(X). □
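For a finite set the one-to-one correspondence of Theorem 1.5 can be listed in full. A sketch on a three-point set:

```python
# phi |-> phi^{-1}({1}) maps 2^X bijectively onto P(X); the inverse is A |-> chi_A.
from itertools import product

X = [0, 1, 2]
maps = [dict(zip(X, bits)) for bits in product([0, 1], repeat=len(X))]   # 2^X

F = lambda phi: frozenset(x for x in X if phi[x] == 1)       # phi^{-1}({1})
images = [F(phi) for phi in maps]
assert len(set(images)) == len(maps) == 2 ** len(X)          # F is injective

chi = lambda A: {x: (1 if x in A else 0) for x in X}         # characteristic function
assert all(F(chi(A)) == A for A in images)                   # F is surjective
```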
Although the next theorem may come as no surprise, it is all but trivial. It actually is a rewriting, in terms of the concept of cardinality, of the rather important Theorem 1.2.

Theorem 1.6. (Cantor-Bernstein). Let X and Y be any sets. If #X ≤ #Y and #Y ≤ #X, then #X = #Y.

The Cantor-Bernstein Theorem exhibits an antisymmetry property. Note that reflexivity and transitivity are readily verified (see Problem 1.22) which, together with Theorem 1.6, leads to a partial ordering property. Behind the antisymmetry property of Theorem 1.6 there is in fact a simple ordering property. To establish it (Theorem 1.7 below) we shall rely on the following axiom.

Zorn's Lemma. Let X be a partially ordered set. If every simply ordered subset of X (i.e., each chain in X) has an upper bound in X, then X has a maximal element.
The label "Zorn's Lemma" is inappropriate but has already been consecrated in the literature. It should read "Zorn's Axiom" instead, for it really is an axiom equivalent to the Axiom of Choice.
Theorem 1.7. For any two sets X and Y, either #X ≤ #Y or #Y ≤ #X.

Proof. Consider two sets X and Y. Let ℐ be the collection of all injective functions from subsets of X to Y. That is,

    ℐ = {F ∈ Y^A : A ⊆ X and F is injective}.

Recall that (see Problem 1.17), as a subset of ℱ = ⋃_{A∈P(X)} Y^A, ℐ is a partially ordered set in the extension ordering, and every simply ordered subset of it has a supremum in ℐ. Thus, according to Zorn's Lemma, ℐ contains a maximal function. Let F₀: A₀ → Y be a maximal function of ℐ, where A₀ ⊆ X and #F₀(A₀) = #A₀ (since F₀ is injective). Suppose A₀ ≠ X and F₀(A₀) ≠ Y. Take x₀ ∈ X\A₀ and y₀ ∈ Y\F₀(A₀), and consider the function F₁: A₀ ∪ {x₀} → Y defined by

    F₁(x) = F₀(x) ∈ F₀(A₀) if x ∈ A₀,    F₁(x) = y₀ ∈ Y\F₀(A₀) if x = x₀ ∈ X\A₀,

which is injective (because F₀ is injective and y₀ ∉ F₀(A₀)). Since F₁ ∈ ℐ and F₀ = F₁|A₀, it follows that F₀ ≤ F₁, which contradicts the fact that F₀ is a maximal element of ℐ (for F₀ ≠ F₁). Hence, either A₀ = X or F₀(A₀) = Y. If A₀ = X, then
F₀: X → Y is injective and so #X ≤ #Y. If F₀(A₀) = Y, then #Y = #F₀(A₀) = #A₀ ≤ #X (for A₀ ⊆ X — see Problem 1.21(a)). □

We have already seen that N ↔ N₀. Thus N and N₀ have the same cardinality. It is usual to assign a special symbol ℵ₀ (aleph naught) to such a cardinal number: #N = #N₀ = ℵ₀. We have also seen (cf. proof of Theorem 1.3) that, if X is an infinite set, then there exists a subset of it, say A, which is equivalent to N. Thus #N = #A ≤ #X, and hence ℵ₀ is the smallest infinite cardinal number in the sense that ℵ₀ ≤ #X for every infinite set X (see Problems 1.21(a) and 1.22). A set X such that #X = ℵ₀ is said to be countably infinite (or denumerable). Therefore, every infinite set has a countably infinite subset. A set that is either finite or countably infinite (i.e., a set X such that #X ≤ ℵ₀) is said to be countable; otherwise it is said to be uncountable (or uncountably infinite, or nondenumerable).

Proposition 1.8. #(X × X) = #X for every countably infinite set X.
Proof. Suppose #X = #N. According to Problems 1.26, 1.23(b) and 1.25(a) we get #X ≤ #(X × X) ≤ #(N × N) = #N = #X. Hence the identity #X = #(X × X) follows by the Cantor-Bernstein Theorem. □
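A concrete witness for Proposition 1.8 (stated here for N₀ rather than N) is the classical Cantor pairing function, which enumerates N₀ × N₀ along the finite diagonals m + n = constant:

```python
# pair : N0 x N0 -> N0 is a bijection (checked here on a finite sample).
def pair(m, n):
    return (m + n) * (m + n + 1) // 2 + n

values = {pair(m, n) for m in range(50) for n in range(50)}
assert len(values) == 50 * 50                # injective on the sample
assert set(range(50 * 51 // 2)) <= values    # an initial segment of N0 is covered
```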
Note that #X ≤ #(X × X) for any set X (see Problem 1.26). Moreover, it is easy to show that #X < #(X × X) whenever X is a finite nonempty set. Thus, if a nonempty set X is such that #X = #(X × X), then it is an infinite set. The previous proposition ensured the converse for countably infinite sets. The next theorem (which is another application of Zorn's Lemma) ensures the converse for every infinite set. Therefore, the identity #X = #(X × X) actually characterizes the infinite sets (of any cardinality) among the nonempty sets.

Theorem 1.9. If X is an infinite set, then #X = #(X × X).

Proof. First we verify the following auxiliary result.
Claim 0. Let C, D and E be nonempty sets. If #(E × E) = #E, then #(C ∪ D) ≤ #E whenever #C ≤ #E and #D ≤ #E.

Proof. The claimed result is a straightforward application of Problems 1.26, 1.23(b) and 1.22: #(C ∪ D) ≤ #(C × D) ≤ #(E × E) = #E. □

Now, back to the theorem statement. Let X be a set and let 𝒥 be the collection of all injective functions from subsets of X to X × X such that the range of each function in 𝒥 coincides with the Cartesian product of its domain with itself. That is,
    𝒥 = {F ∈ (X × X)^A : A ⊆ X, F is injective and F(A) = A × A}.

Note that 𝒥 is nonempty (at least the empty function is there). From now on suppose X is infinite. Thus X has a countably infinite subset, and so Proposition 1.8 ensures the existence of a function in 𝒥 with infinite (at least countably infinite) domain.
Recall that 𝒥 is a partially ordered set in the extension ordering, and that every chain in 𝒥 (i.e., every simply ordered subset of 𝒥) has an injective supremum (see Problem 1.17).

Claim 1. Such a supremum in fact lies in 𝒥.

Proof. Let {F_γ} be an arbitrary chain in 𝒥, and let D(F_γ) and R(F_γ) denote the domain and range of F_γ, respectively. Thus each F_γ is an injective function from A_γ to X × X, with A_γ = D(F_γ) ⊆ X and F_γ(A_γ) = A_γ × A_γ. Now let ⋁_γF_γ: ⋃_γA_γ → X × X be the supremum of {F_γ}, and note that (see Problem 1.17)

    (⋁_γF_γ)(⋃_γA_γ) = R(⋁_γF_γ) = ⋃_γR(F_γ) = ⋃_γ(A_γ × A_γ).

Clearly, ⋃_γ(A_γ × A_γ) ⊆ (⋃_γA_γ) × (⋃_γA_γ). On the other hand, if {F_λ, F_μ} is an arbitrary pair from {F_γ}, then F_λ ≤ F_μ (or vice versa), so that A_λ × A_μ ⊆ A_μ × A_μ ⊆ ⋃_γ(A_γ × A_γ) (for A_λ ⊆ A_μ). Thus (⋃_γA_γ) × (⋃_γA_γ) ⊆ ⋃_γ(A_γ × A_γ). Therefore (⋁_γF_γ)(⋃_γA_γ) = (⋃_γA_γ) × (⋃_γA_γ), and hence ⋁_γF_γ ∈ 𝒥. □

Conclusion: Every chain in 𝒥 has an upper bound (a supremum, actually) in 𝒥. Thus, according to Zorn's Lemma, 𝒥 contains a maximal element. Let F₀: A₀ → X × X be a maximal function of 𝒥, where ∅ ≠ A₀ ⊆ X and F₀(A₀) = A₀ × A₀. Since F₀ is injective, #(A₀ × A₀) = #A₀, which implies that the nonempty set A₀ is an infinite set.
Claim 2. If #A₀ < #X, then #A₀ ≤ #(X\A₀).

Proof. If #(X\A₀) ≤ #A₀, then (cf. Problems 1.26 and 1.23(b)) #X = #(A₀ ∪ (X\A₀)) ≤ #(A₀ × (X\A₀)) ≤ #(A₀ × A₀) = #A₀. Conclusion: #(X\A₀) ≤ #A₀ implies #X ≤ #A₀ (see Problem 1.22). Equivalently, #A₀ < #X implies #A₀ ≤ #(X\A₀) by Theorem 1.7. □
Note that #A₀ ≤ #X (for A₀ ⊆ X). Suppose #A₀ < #X. In this case #A₀ ≤ #(X\A₀) by Claim 2. Thus there exists a proper subset of X\A₀, say A₁, such that #A₀ = #A₁. Therefore (Problem 1.23(b)) #(Aᵢ × Aⱼ) ≤ #(A₀ × A₀) = #A₀ for all combinations of i, j in {0, 1}, and hence #[(A₀ × A₁) ∪ (A₁ × A₀) ∪ (A₁ × A₁)] ≤ #A₀ according to Claim 0. Since the reverse inequality is trivially verified, it follows by Theorem 1.6 that #[(A₀ × A₁) ∪ (A₁ × A₀) ∪ (A₁ × A₁)] = #A₀. Set A = A₀ ∪ A₁ and observe that (A × A)\(A₀ × A₀) = (A₀ × A₁) ∪ (A₁ × A₀) ∪ (A₁ × A₁) because A₀ and A₁ are disjoint. Thus

    #[(A × A)\(A₀ × A₀)] = #A₀ = #A₁,
1. Set-Theoretic Structures
which ensures the existence of an injective function F₁: A₁ → X×X such that F₁(A₁) = (A×A)\(A₀×A₀). Now consider the function F from A to X×X defined as follows.

F(x) = F₀(x) ∈ A₀×A₀ if x ∈ A₀,
F(x) = F₁(x) ∈ (A×A)\(A₀×A₀) if x ∈ A₁ = A\A₀.

F: A → X×X is injective (because F₀ and F₁ are injective functions with disjoint ranges) and F(A) = A×A (for F₀(A₀) ∪ F₁(A₁) = A×A). Since F ∈ J and F₀ = F|A₀, it follows that F₀ ≤ F, which contradicts the fact that F₀ is a maximal element of J (for F₀ ≠ F). Therefore #A₀ = #X, and hence (cf. Problem 1.23(b)) #(A₀×A₀) = #(X×X). Conclusion:

#X = #A₀ = #(A₀×A₀) = #(X×X). □

Theorem 1.9 is a natural extension of Proposition 1.8, which in turn generalizes Problem 1.25(a). Another important and particularly useful result along this line is given by the next theorem and its corollary.
Theorem 1.10. Let X be a set and consider an indexed family of sets {X_γ}_{γ∈Γ}. If #X_γ ≤ #X for all γ ∈ Γ, then

#(⋃_{γ∈Γ} X_γ) ≤ #(Γ×X).

Proof. Take an indexed family of sets {X_γ}_{γ∈Γ} and suppose there exists a set X such that #X_γ ≤ #X for all γ ∈ Γ. Thus for each γ ∈ Γ there exists a surjective mapping of X onto X_γ, say F_γ: X → X_γ. Now consider the function G: Γ×X → ⋃_{γ∈Γ} X_γ defined by G(γ, x) = F_γ(x) for every γ ∈ Γ and every x ∈ X. Take an arbitrary y ∈ ⋃_{γ∈Γ} X_γ, so that y ∈ X_γ for some γ ∈ Γ. Since F_γ: X → X_γ is surjective, there exists x ∈ X such that y = F_γ(x). Thus y = G(γ, x); that is, y ∈ R(G). Hence G: Γ×X → ⋃_{γ∈Γ} X_γ is surjective, and so #(⋃_{γ∈Γ} X_γ) ≤ #(Γ×X). □
Corollary 1.11. A countable union of countable sets is countable.

Proof. Consider a countable family {X_γ}_{γ∈Γ} of countable sets, so that #Γ ≤ #N and #X_γ ≤ #N for all γ ∈ Γ. Theorem 1.10 ensures that #(⋃_{γ∈Γ} X_γ) ≤ #(Γ×N). However, #(Γ×N) ≤ #(N×N) = #N. Thus

#(⋃_{γ∈Γ} X_γ) ≤ #N.

(See Problems 1.23(b), 1.25(a) and 1.22.) □
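The surjection G(γ, x) = F_γ(x) of Theorem 1.10 can be made concrete when the index set is countable: visiting the pairs (n, k) along finite diagonals n + k = s enumerates the union. The following sketch is ours, not the book's (the function names and the diagonal traversal are illustrative assumptions):

```python
# Illustrative sketch of the surjection G(n, k) = F_n(k) behind Theorem 1.10,
# specialized to countably many countable sets; pairs (n, k) are visited
# along the finite diagonals n + k = s (a "dovetailing" traversal).

def enumerate_union(enumerations, steps):
    """Enumerate the union of the sets given by enumerations F_n: k -> F_n(k),
    skipping elements that have already appeared."""
    seen, out = set(), []
    for s in range(steps):                       # diagonal n + k = s
        for n in range(min(s + 1, len(enumerations))):
            y = enumerations[n](s - n)           # y = G(n, k) with k = s - n
            if y not in seen:
                seen.add(y)
                out.append(y)
    return out

# Three overlapping countable sets: even numbers, odd numbers, perfect squares.
F = [lambda k: 2 * k, lambda k: 2 * k + 1, lambda k: k * k]
union = enumerate_union(F, 20)
print(len(union) == len(set(union)))   # no repetitions: True
```

Each element of the union receives exactly one index, which is the content of the corollary: the union is (at most) countable even though the sets overlap.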
1.9 Remarks
We assume the reader is familiar with the definition of an interval of the real line R. An interval of R is nondegenerate if it does not collapse to a singleton. It is easy to show that the cardinality of the real line R is the same as the cardinality of any nondegenerate interval of R. A typical example: the function F: [0,1] → [−1,1] given by F(x) = 2x − 1 for every x ∈ [0,1] is injective and surjective, and so is the function G: (−1,1) → R defined by G(x) = (1 − |x|)⁻¹x for every x ∈ (−1,1). Thus #[0,1] = #[−1,1] and #(−1,1) = #R. Since (−1,1) ⊆ [−1,1] ⊆ R, it follows that #R = #(−1,1) ≤ #[−1,1] ≤ #R, and hence (Cantor–Bernstein Theorem) #[−1,1] = #R. Therefore,

#[0,1] = #(−1,1) = #[−1,1] = #R.

Moreover, we can also prove that #R = #2^N. Indeed, consider the function F: 2^N → [0,1] that assigns to each sequence {aₙ} in 2^N (i.e., to each sequence with values either 0 or 1) a real number in [0,1], in ternary expansion, as follows.
F({aₙ}) = 0.(2a₁)(2a₂)...

for every {aₙ} ∈ 2^N. It can be shown that F is injective.
Reason: Every real number x ∈ [0,1] can be written as Σ_{n=1}^∞ βₙ p⁻ⁿ for a given positive integer p greater than one (i.e., for a given base p). In this case 0.β₁β₂... is a p-ary expansion of x, where {βₙ}_{n∈N} is a sequence of nonnegative integers ranging from 0 to p − 1. That is, {βₙ}_{n∈N} is a sequence of digits with respect to the base p; e.g., if p = 2, 3, or 10, then 0.β₁β₂... is a binary, ternary, or denary (i.e., decimal) expansion of x, respectively. A p-ary expansion (with respect to a base p) is not unique; e.g., 0.499... and 0.500... are decimal expansions for x = 1/2. However, if two p-ary expansions of x differ, then the absolute difference between the first digits in which they differ is equal to 1. Thus, if we take a p-ary expansion whose digits are either 0 or 2 (as we did), then it is unique. This can only be done for a p-ary expansion with respect to a base p ≥ 3, so that a ternary expansion is enough.
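The uniqueness of ternary expansions with digits 0 and 2 can be checked mechanically on finite prefixes. In the sketch below (an editorial illustration, not part of the book's argument) the finite sums Σ 2aₙ3⁻ⁿ are computed with exact rational arithmetic, and distinct 0-1 prefixes always produce distinct values:

```python
# Finite-prefix sketch of the injection F({a_n}) = 0.(2a_1)(2a_2)... (base 3):
# a 0-1 sequence becomes a ternary expansion whose digits are 0 or 2, and such
# expansions are unique, so distinct sequences yield distinct numbers in [0, 1].

from fractions import Fraction
from itertools import product

def F(bits):
    """Exact value of the finite ternary expansion with digits 2*a_n."""
    return sum(Fraction(2 * a, 3 ** n) for n, a in enumerate(bits, start=1))

values = {F(bits) for bits in product((0, 1), repeat=6)}
print(len(values))                       # 64 = 2**6: all prefixes distinct
print(all(0 <= v <= 1 for v in values))  # every value lies in [0, 1]
```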
Thus #2^N ≤ #[0,1]. On the other hand, let G: 2^N → [0,1] be the function that assigns to each sequence {aₙ} in 2^N a real number in [0,1], in binary expansion, as follows.

G({aₙ}) = 0.a₁a₂...

for every {aₙ} ∈ 2^N. It can also be shown that G is surjective.
Reason: Every real number x ∈ [0,1] can be written as Σ_{n=1}^∞ aₙ2⁻ⁿ, so that 0.a₁a₂... is a binary expansion of it for some sequence {aₙ} ∈ 2^N.

Thus #[0,1] ≤ #2^N. Therefore #[0,1] = #2^N by the Cantor–Bernstein Theorem. Hence

#R = #2^N
(for #[0,1] = #R). Using Theorems 1.4 and 1.5 we may conclude that
#N < #R. Such a fundamental result can also be derived by the celebrated Cantor diagonal procedure as follows. Clearly, #N ≤ #R (since N ⊆ R). Suppose #N = #R. This implies that #N = #[0,1] (for #[0,1] = #R). Thus the interval [0,1] can be indexed by N, so that [0,1] = {xₙ}_{n∈N}. Write each xₙ in decimal expansion:

xₙ = 0.aₙ₁aₙ₂aₙ₃...,

where each aₙₖ (k ∈ N) is a nonnegative integer between 0 and 9. Now consider the point x ∈ [0,1] with the following decimal expansion:

x = 0.a₁a₂...,

where, again, each aₙ (n ∈ N) is a nonnegative integer between 0 and 9, but a₁ ≠ a₁₁, a₂ ≠ a₂₂, and so on. That is, aₙ ≠ aₙₙ for each n ∈ N (e.g., take aₙ diametrically opposite to aₙₙ, so that, for each n ∈ N, aₙ = aₙₙ + 5 if 0 ≤ aₙₙ ≤ 4 or aₙ = aₙₙ − 5 if 5 ≤ aₙₙ ≤ 9). Thus x ≠ xₙ for every n ∈ N. Hence x ∉ {xₙ}_{n∈N} = [0,1], which is a contradiction. Therefore #N ≠ #R. Equivalently (since #N ≤ #R), #N < #R. We have denoted #N by ℵ₀. Let us now denote #2^N by 2^ℵ₀, so that

#N = ℵ₀ < 2^ℵ₀ = #R.

Cantor conjectured in 1878 that there is no cardinal number between ℵ₀ and 2^ℵ₀. This conjecture is called the Continuum Hypothesis.
Continuum Hypothesis. There is no set whose cardinality is greater than #N and smaller than #R.

The Generalized Continuum Hypothesis is the conjecture that naturally generalizes the Continuum Hypothesis.

Generalized Continuum Hypothesis. For any infinite set X, there is no cardinal number between #X and #2^X.

There are several different axiomatic set theories, each based on a somewhat different axiom system. The most popular is probably the axiom system ZFC. It comprises the axiom system ZF ("Z" for Zermelo and "F" for Fraenkel) plus the Axiom of Choice. The Axiom of Choice actually is a genuine axiom to be added to ZF. Indeed, Gödel proved in 1939 that the Axiom of Choice is consistent with ZF, and Cohen proved in 1963 that the Axiom of Choice is independent of ZF. The situation of the Continuum Hypothesis with respect to ZFC is somewhat similar to that of the Axiom of Choice with respect to ZF, although the Continuum Hypothesis itself is not as primitive as the Axiom of Choice (even if the Axiom of Choice might be regarded
as not primitive enough). Gödel proved in 1939 that the Generalized Continuum Hypothesis is consistent with ZFC, and Cohen proved in 1963 that the denial of the Continuum Hypothesis also is consistent with ZFC. Thus both the Continuum Hypothesis and the Generalized Continuum Hypothesis are consistent with ZFC and also independent of ZFC: neither of them can be proved or disproved on the basis of ZFC alone (i.e., they are undecidable statements in ZFC). The Generalized Continuum Hypothesis in fact is stronger than the Axiom of Choice: Sierpiński showed in 1947 that the Generalized Continuum Hypothesis implies the Axiom of Choice. We have already observed that the Axiom of Choice and Zorn's Lemma are equivalent. There is a myriad of axioms equivalent to the Axiom of Choice. Let us mention just two of them.
Hausdorff Maximal Principle. Every partially ordered set contains a maximal chain (i.e., a maximal simply ordered subset).
Zermelo Well-Ordering Principle. Every set may be well-ordered.

In particular, the set R of all reals may be well-ordered. This is a pure existence result, not exhibiting (or constructing, or even defining) a well-ordering of R. In fact, Feferman showed in 1965 that no defined partial ordering can be proved in ZFC to well-order the set R. If X and Y are any sets, properly well-ordered, and if there exists a one-to-one order-preserving correspondence between them (i.e., an injective and surjective mapping Φ: X → Y such that x₁ ≤ x₂ in (X, ≤) if and only if Φ(x₁) ≤ Φ(x₂) in (Y, ≤)), then X and Y are said to have the same ordinal number. If two well-ordered sets have the same ordinal number, then they have the same cardinal number. However, unlike the notion of cardinal number, the notion of ordinal number depends on the well-orderings that well-order the sets.
Proposition 1.12. There is an uncountable set X, well-ordered by a relation ≤ on it, with the following properties: X has a greatest element Ω, and the set {x ∈ X : x < z} is countable for every z in X\{Ω}.

Proof. Let Y be an uncountable set. By the Well-Ordering Principle, there exists a well-ordering of Y. Take ζ not in Y, set Z = Y ∪ {ζ}, and extend the well-ordering of Y to Z by setting y < ζ for every y ∈ Y. Consider the set A = {α ∈ Z : {z ∈ Z : z < α} is an uncountable set}. A is nonempty (ζ ∈ A because Y = {z ∈ Z : z < ζ} is uncountable), and hence it has a minimum (for ∅ ≠ A ⊆ Z and Z is well-ordered). Set Ω = min A, so that X = {z ∈ Z : z ≤ Ω} is the required set. □

Moreover, it can be shown that such a well-ordered set X is unique in the sense that, if Y is any well-ordered set with the same properties of X, then there exists a one-to-one order-preserving correspondence between X and Y (i.e., then X and Y have the same ordinal number). The greatest (or last) element Ω of X is called the least or first uncountable ordinal, and the elements x of X such that x < Ω
are called countable ordinals. The greatest elements of the finite subsets of X are called finite ordinals. If ω is the first infinite ordinal (i.e., the least nonfinite ordinal), then the set {x ∈ X : x < ω} of all finite ordinals and the set of all natural numbers N (equipped with its natural ordering) have the same ordinal number. It is usual to assign the symbol ω as the ordinal number of any well-ordered set that is in a one-to-one order-preserving correspondence with N.
Suggested Reading

Binmore [1], Brown and Pearcy [2], Crossley et al. [1], Dugundji [1], Fraenkel, Bar-Hillel and Levy [1], Halmos [3], Kelley [1], Kolmogorov and Fomin [1], Moore [1], Royden [1], Simmons [1], Suppes [1], Vaught [1], Wilder [1].
Problems

Problem 1.1. Let A, B and C be arbitrary sets. Prove that

(a) (A\B) ∪ (B\A) = (A ∪ B)\(A ∩ B)  (symmetric difference);

(b) (A∆B) ∪ (B∆C) = (A ∪ B ∪ C)\(A ∩ B ∩ C);

(c) X\(A ∪ B) = (X\A) ∩ (X\B) and X\(A ∩ B) = (X\A) ∪ (X\B) whenever A and B are subsets of X  (De Morgan laws).
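These identities are easy to confirm on concrete finite sets. The following check is an editorial illustration (not a proof, and not from the book); the sample sets are arbitrary:

```python
# Quick finite check of the identities in Problem 1.1 on small concrete sets.

A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}
X = A | B | C                           # ambient set for the De Morgan laws

sym = lambda P, Q: (P - Q) | (Q - P)    # symmetric difference P ∆ Q

# (a) (A\B) ∪ (B\A) = (A ∪ B) \ (A ∩ B)
print(sym(A, B) == (A | B) - (A & B))
# (b) (A ∆ B) ∪ (B ∆ C) = (A ∪ B ∪ C) \ (A ∩ B ∩ C)
print(sym(A, B) | sym(B, C) == (A | B | C) - (A & B & C))
# (c) De Morgan: X\(A ∪ B) = (X\A) ∩ (X\B) and X\(A ∩ B) = (X\A) ∪ (X\B)
print(X - (A | B) == (X - A) & (X - B) and X - (A & B) == (X - A) | (X - B))
```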
Problem 1.2. Consider a function F: X → Y from a set X to a set Y. Let A, A₁ and A₂ be arbitrary subsets of X, and let B, B₁ and B₂ be arbitrary subsets of Y. Verify the following propositions.

(a) F(X)\F(A) ⊆ F(X\A).

(b) F⁻¹(Y\B) = X\F⁻¹(B).

(c) A₁ ⊆ A₂ implies F(A₁) ⊆ F(A₂).

(d) B₁ ⊆ B₂ implies F⁻¹(B₁) ⊆ F⁻¹(B₂).

(e) F(A₁ ∪ A₂) = F(A₁) ∪ F(A₂).

(f) F(A₁ ∩ A₂) ⊆ F(A₁) ∩ F(A₂).

(g) F⁻¹(B₁ ∪ B₂) = F⁻¹(B₁) ∪ F⁻¹(B₂).

(h) F⁻¹(B₁ ∩ B₂) = F⁻¹(B₁) ∩ F⁻¹(B₂).

(i) A ⊆ F⁻¹(F(A)).

(j) F(F⁻¹(B)) ⊆ B.
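For a finite function the image and inverse-image operations can be computed directly, and a non-injective F shows why some of the propositions are inclusions rather than equalities. This sketch is an editorial illustration with arbitrarily chosen sets:

```python
# Finite sanity checks for some of the image/preimage propositions of
# Problem 1.2, using a non-injective F.

X, Y = {0, 1, 2, 3}, {'a', 'b', 'c'}
F = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}              # F: X -> Y, not injective

image = lambda S: {F[x] for x in S}               # F(S)
preimage = lambda T: {x for x in X if F[x] in T}  # F^{-1}(T)

A1, A2 = {0, 2}, {1, 2}
B = {'a', 'c'}

print(image(A1 | A2) == image(A1) | image(A2))    # (e): True
print(image(A1 & A2) < image(A1) & image(A2))     # (f): inclusion is proper here
print(A1 <= preimage(image(A1)))                  # (i): True
print(image(preimage(B)) <= B)                    # (j): True
```

The proper inclusion in (f) occurs precisely because F identifies a point of A₁\A₂ with a point of A₂\A₁, which is the mechanism behind Problem 1.3(b).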
Problem 1.3. Consider the setup of Problem 1.2. Show that

(a) F is injective if and only if the inverse image under F of each singleton in R(F) is a singleton in X;

(b) F is injective if and only if F(A₁ ∩ A₂) = F(A₁) ∩ F(A₂) for every A₁, A₂ ⊆ X;

(c) F is injective if and only if the images of disjoint sets in X are disjoint in Y;

(d) F is injective if and only if A = F⁻¹(F(A)) for every A ⊆ X;

(e) F is surjective if and only if the inverse image under F of each nonempty subset of Y is a nonempty subset of X;

(f) F is surjective if and only if F(F⁻¹(B)) = B for every B ⊆ Y.
Problem 1.4. Verify that a function F: X → X is idempotent if and only if the range of F coincides with the set of all fixed points of F.
Problem 1.5. A function L: Y → X is said to be a left inverse of a function F: X → Y if LF = I_X, the identity on X. F is injective if and only if it has a left inverse. The restriction of a left inverse of F to the range of F is unique and coincides with the inverse of F on R(F). (Recall: an injective function has an inverse on its range.) Prove.
Problem 1.6. A function R: Y → X is said to be a right inverse of a function F: X → Y if FR = I_Y, the identity on Y. Show that F is surjective if and only if it has a right inverse. Note that any right inverse of F is injective (for it has a left inverse). Similarly, any left inverse of F is surjective (for it has a right inverse). Conclusion: There exists an injective mapping of X into Y if and only if there exists a surjective mapping of Y onto X.

Problem 1.7. A function F: X → Y is injective and surjective if and only if there exists a function G: Y → X such that GF = I_X (the identity on X) and FG = I_Y (the identity on Y). Prove the above proposition and show that, in this case, such a function G is unique and coincides with the inverse of F.
Problem 1.8. If F: X → Y is an invertible function, then so is its inverse F⁻¹: Y → X, and (F⁻¹)⁻¹ = F. Moreover, if A ⊆ X, then F|A: A → F(A) is invertible and (F|A)⁻¹ = F⁻¹|F(A). Prove.

Problem 1.9. Verify the following propositions.

(a) The composition of two injective functions is an injective function.

(b) The composition of two surjective functions is a surjective function.

(c) The composition of two invertible functions is an invertible function.
Note: When we speak of the composition G o F of two functions F and G it is assumed that the domain of G includes the range of F.
Problem 1.10. Let F: X → Y and G: Y → Z be invertible mappings. Show that

(G ∘ F)⁻¹ = F⁻¹ ∘ G⁻¹.

Problem 1.11. A function F: X → X is an involution if F² = I, the identity on X. Verify that an involution is precisely an invertible function on X that coincides with its inverse: F = F⁻¹. Show that the composition of two involutions is again an involution if and only if they commute.
Problem 1.12. Let F: X → Y be a one-to-one mapping of a set X onto a set Y. Let G: X → A be a one-to-one mapping of X onto a subset A of X. Prove: If A is a proper subset of X, then F(A) is a proper subset of Y. Hint: Consider the commutative diagram

    X ---F--→ Y
    |         ↑
    G         |
    ↓         |
    A --F|A-→ F(A)
Problem 1.13. Let X be a set with more than one element, and consider the following relations R₁, R₂, R₃, R₄ and R₅ on the power set P(X) of X. For every pair (A, B) of subsets of X,

A R₁ B if A∆B = ∅  (i.e., if A = B),

A R₂ B if A∆B is finite,

A R₃ B if A ⊆ B or B ⊆ A  (i.e., if A\B = ∅ or B\A = ∅),

A R₄ B if A ⊆ B  (i.e., if A\B = ∅).
Show that the table below properly classifies these relations according to reflexivity, transitivity, symmetry and antisymmetry.

         Reflexive   Transitive   Symmetric   Antisymmetric
    R₁       ✓           ✓            ✓             ✓
    R₂       ✓           ✓            ✓
    R₃       ✓                        ✓
    R₄       ✓           ✓                          ✓
    R₅

Hint: To verify that R₂ is transitive use Problem 1.1(b) and recall that the union of two sets is finite if and only if each of them is finite.
Problem 1.14. Consider the functions Φ: P(X) → P(X) and H: X → Y defined in the proof of Theorem 1.2. Show that Φ is an increasing function with respect to the inclusion ordering of P(X), and that H is injective and surjective.

Problem 1.15. Let Y^X denote the collection of all functions from a set X to a set Y. Suppose Y is partially ordered and let ≤ be a partial ordering of Y. Now consider a relation on Y^X, also denoted by ≤ and defined as follows. For any pair {F, G} of functions in Y^X,

F ≤ G  if  F(x) ≤ G(x) for every x ∈ X.

(a) Show that ≤ is a partial ordering of Y^X.

(b) Prove: If (Y, ≤) is a lattice, then (Y^X, ≤) is a lattice.

Hint: Suppose (Y, ≤) is a lattice. Take F and G from Y^X and let U and V be functions in Y^X defined as follows: U(x) = F(x) ∨ G(x) and V(x) = F(x) ∧ G(x) for every x ∈ X. Show that F ∨ G = U and F ∧ G = V.
Hint: Suppose (Y, <) is a lattice. Take F and G from Yx and let U and V be functions in Yx defined as follows. U(x) = F(x) v G(x) and V(x) _ F(x) A G(x) for every X E X. Show that FVG = Uand FAG = V. Problem 1.16. Set Y = (0, 1) and let XA : X -+ (0, 1) be the characteristic function of an arbitrary subset A of a set X. Thus, for every A E 9(X), XA E Yx = 2x Let
A and B be subsets of X, and consider the partial ordering of 2x introduced in Problem 1.15. Prove the following propositions. XA < XB
ACB.
XA V XB = XAUB-
XA A XB = XAnB = XAXB -
(d) A ∩ B = ∅ if and only if χ_{A∪B} = χ_A + χ_B, and if and only if χ_{A∩B} = 0.
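Characteristic functions turn set algebra into 0/1 arithmetic, which makes the propositions above directly checkable on a finite set. The following is an illustrative sketch of ours (the choice of X, A, B is arbitrary):

```python
# Finite verification sketch of Problem 1.16: lattice operations on
# characteristic functions mirror union, intersection and disjointness.

X = set(range(8))
A, B = {0, 1, 2, 3}, {2, 3, 4, 5}

chi = lambda S: {x: (1 if x in S else 0) for x in X}
cA, cB = chi(A), chi(B)

# (b) χ_A ∨ χ_B = χ_{A∪B} (pointwise max); (c) χ_A ∧ χ_B = χ_{A∩B} = χ_A·χ_B
print(all(max(cA[x], cB[x]) == chi(A | B)[x] for x in X))   # True
print(all(cA[x] * cB[x] == chi(A & B)[x] for x in X))       # True

# (d) for disjoint sets the sum of characteristic functions is χ of the union
C, D = {0, 1}, {6, 7}
print(all(chi(C)[x] + chi(D)[x] == chi(C | D)[x] for x in X))   # True
```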
Problem 1.17. Let ℱ be the collection of all functions from subsets of a set X to a set Y. That is,

ℱ = {F ∈ Y^A : A ⊆ X} = ⋃_{A∈P(X)} Y^A.

The unique function whose domain is the empty set ∅ is called the empty function in ℱ. Consider the following relation ≤ on ℱ. For any pair {F, G} of functions in ℱ, F ≤ G if F is a restriction of G (equivalently, if G is an extension of F). That is, F ≤ G if and only if F: A → Y, G: B → Y, A ⊆ B ⊆ X, and F = G|A (i.e., F(x) = G(x) for every x ∈ A).

(a) Show that the relation ≤ on ℱ is a partial ordering (called the extension ordering).

A function V: C → Y in ℱ is a lower bound for a pair of functions F: A → Y and G: B → Y in ℱ if C ⊆ A, C ⊆ B, F|C = V and G|C = V; a function U: D → Y in ℱ is an upper bound for the pair {F, G} if A ⊆ D, B ⊆ D, U|A = F and U|B = G.

(b) Show that every pair of functions F and G in ℱ has an infimum F ∧ G in ℱ. (In particular, if the domain A of F and the domain B of G are disjoint, then F ∧ G is the empty function. Which function is F ∧ G if A and B are not disjoint but F(A) and G(B) are disjoint?)

Let {F_γ} be an indexed family of functions in ℱ. For each F_γ let D(F_γ) and R(F_γ) denote the domain and range of F_γ, respectively. Prove the following propositions.

(c) If {F_γ} is simply ordered (i.e., if {F_γ} is a chain in ℱ), then it has a supremum ⋁_γ F_γ in ℱ, whose domain and range are D(⋁_γ F_γ) = ⋃_γ D(F_γ) and R(⋁_γ F_γ) = ⋃_γ R(F_γ). Moreover, if each F_γ is injective, then so is ⋁_γ F_γ.

(d) If the domains {D(F_γ)} are pairwise disjoint, then the family {F_γ} has a supremum ⋁_γ F_γ: ⋃_γ D(F_γ) → ⋃_γ R(F_γ) in ℱ. Moreover, if the ranges {R(F_γ)} are also pairwise disjoint, and if each F_γ is injective, then so is ⋁_γ F_γ.
Problem 1.18. Let {Xₙ}_{n∈N} be a sequence of sets. Set Y₁ = X₁ and

Y_{n+1} = (⋃_{k=1}^{n+1} X_k) \ (⋃_{k=1}^{n} X_k)

for each n ∈ N. Show by induction that

⋃_{k=1}^{n} Y_k = ⋃_{k=1}^{n} X_k

for every n ∈ N. (Hint: A ∪ (B\A) = A ∪ B.) Verify that Yₙ ⊆ Xₙ for each n ∈ N, and Y_m ∩ Yₙ = ∅ for every pair of distinct natural numbers m and n. Moreover, show that

⋃_{n=1}^{∞} Yₙ = ⋃_{n=1}^{∞} Xₙ.

Such a disjoint sequence {Yₙ}_{n∈N} is referred to as the disjointification of {Xₙ}_{n∈N}.
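The recursion above is easily carried out on a concrete sequence of finite sets. The following sketch is an editorial illustration; it uses the equivalent form Y_{n+1} = X_{n+1} \ ⋃_{k=1}^{n} X_k:

```python
# The disjointification of Problem 1.18, computed for a concrete sequence
# of finite sets.

def disjointify(X):
    """Y_1 = X_1 and Y_{n+1} = X_{n+1} minus everything seen so far."""
    Y, seen = [], set()
    for Xn in X:
        Y.append(set(Xn) - seen)    # the new points contributed by X_n
        seen |= set(Xn)
    return Y

X = [{1, 2}, {2, 3}, {1, 4}, {4}]
Y = disjointify(X)
print(Y)  # [{1, 2}, {3}, {4}, set()]

# Y_n ⊆ X_n, the Y_n are pairwise disjoint, and the unions agree.
assert all(Yn <= set(Xn) for Yn, Xn in zip(Y, X))
assert all(Y[m].isdisjoint(Y[n])
           for m in range(len(Y)) for n in range(m + 1, len(Y)))
assert set().union(*Y) == set().union(*X)
```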
Problem 1.19. Let P(X) be the power set of a set X and consider the inclusion ordering of P(X). Let {Xₙ} be an arbitrary P(X)-valued sequence (i.e., a sequence of subsets of a given set X). Recall that, for each n ∈ N, {X_k : k ≥ n} is an indexed family of subsets of X, and hence a subset of the complete lattice P(X). Let inf_{k≥n} and sup_{k≥n} denote its infimum and supremum, respectively, and set

Yₙ = inf_{k≥n} X_k = ⋂_{k=n}^∞ X_k   and   Zₙ = sup_{k≥n} X_k = ⋃_{k=n}^∞ X_k,

so that {Yₙ} and {Zₙ} are P(X)-valued sequences as well.

(a) Verify that {Yₙ} is an increasing sequence and {Zₙ} is a decreasing sequence.

The union ⋃_{n=1}^∞ Yₙ and the intersection ⋂_{n=1}^∞ Zₙ, which are elements of P(X), are usually denoted by

⋃_{n=1}^∞ Yₙ = lim infₙ Xₙ   and   ⋂_{n=1}^∞ Zₙ = lim supₙ Xₙ,

and called the limit inferior and limit superior of {Xₙ}, respectively.

(b) Show that lim infₙ Xₙ ⊆ lim supₙ Xₙ.

If lim infₙ Xₙ = lim supₙ Xₙ, then we say that the sequence {Xₙ} converges to the limit

limₙ Xₙ = lim infₙ Xₙ = lim supₙ Xₙ.

Prove the following propositions.

(c) If {Xₙ} is an increasing sequence, then Yₙ = Xₙ for each n ∈ N and Zₙ = ⋃_{m=1}^∞ X_m = sup_m X_m for every n ∈ N, so that

lim infₙ Xₙ = lim supₙ Xₙ = supₙ Xₙ.
(d) If {Xₙ} is itself decreasing, then Yₙ = ⋂_{m=1}^∞ X_m = inf_m X_m for every n ∈ N and Zₙ = Xₙ for each n ∈ N, so that

lim infₙ Xₙ = lim supₙ Xₙ = infₙ Xₙ.
Therefore, an increasing sequence of sets converges to its union (supₙ Xₙ = ⋃ₙ Xₙ) and, dually, a decreasing sequence of sets converges to its intersection (infₙ Xₙ = ⋂ₙ Xₙ), so that every monotone sequence of sets converges. Thus, since {Yₙ} and {Zₙ} are always increasing and decreasing, respectively, they do converge: {Yₙ} converges to its union ⋃_{n=1}^∞ Yₙ = ⋃_{n=1}^∞ ⋂_{k=n}^∞ X_k, and {Zₙ} converges to its intersection ⋂_{n=1}^∞ Zₙ = ⋂_{n=1}^∞ ⋃_{k=n}^∞ X_k.
(e) Verify the following identities:

lim infₙ Xₙ = limₙ (inf_{k≥n} X_k)   and   lim supₙ Xₙ = limₙ (sup_{k≥n} X_k).
(f) Now show that

X \ lim supₙ Xₙ = lim infₙ (X\Xₙ)   and   X \ lim infₙ Xₙ = lim supₙ (X\Xₙ).

Thus a sequence {Xₙ} converges if and only if the sequence of its complements {X\Xₙ} converges and, in this case,

limₙ (X\Xₙ) = X \ limₙ Xₙ.
Problem 1.20. Let A and B be subsets of the sets X and Y, respectively. Show that

A ↔ B and X\A ↔ Y\B   imply   X ↔ Y.

(Warning: X ↔ Y and A ↔ B do not imply X\A ↔ Y\B.)

Problem 1.21. Let A and B be any sets. Prove that

(a) A ⊆ B implies #A ≤ #B (hint: inclusion map);

(b) A ⊆ B, B is finite, and #A = #B imply A = B;

and show that assertion (b) does not hold if B is infinite.
Problem 1.22. For any sets A, B and C, verify that

#A ≤ #B ≤ #C   implies   #A ≤ #C.

Moreover, if #A < #B ≤ #C or #A ≤ #B < #C, then #A < #C. (Hint: Cantor–Bernstein Theorem.)
Problem 1.23. Suppose the sets A, B, C and D are such that

#A ≤ #C   and   #B ≤ #D.

Prove the following propositions.

(a) #(A ∪ B) ≤ #(C ∪ D) whenever C ∩ D = ∅.

(b) #(A × B) ≤ #(C × D).

Moreover, if #A = #C and #B = #D, then the cardinalities in (b) coincide too.
Problem 1.24. Let Y^X denote the collection of all mappings of a set X into a set Y. Show that, if Y has more than one element, then

#P(X) ≤ #Y^X

for every set X. (In fact, #P(X) = #2^X = #Y^X if X is infinite and 2 ≤ #Y ≤ #X.)

Problem 1.25. Let N, Z and Q have their standard meanings and set ℵ₀ = #N as usual. Verify the following identities.
(a) #(N × N) = ℵ₀.

Hint: Consider the function F: N×N → N defined by the formula

F(m, n) = (m+n−1)(m+n−2)/2 + m

for every m, n ∈ N. F is injective and surjective. The array

    n
    ↑
    5 | 11
    4 |  7  12
    3 |  4   8  13
    2 |  2   5   9  14
    1 |  1   3   6  10  15
      +--------------------→ m
         1   2   3   4   5

may be suggestive.
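The diagonal enumeration in the hint is a one-line computation, and its bijectivity can be checked on an initial segment. This is an editorial illustration, not a proof:

```python
# The pairing function of Problem 1.25(a): F(m, n) = (m+n-1)(m+n-2)/2 + m.
# On a finite grid it hits an initial segment of N with no collisions.

def F(m, n):
    return (m + n - 1) * (m + n - 2) // 2 + m

values = {F(m, n) for m in range(1, 30) for n in range(1, 30)}
print(F(1, 1), F(1, 2), F(2, 1), F(3, 1), F(1, 3))   # 1 2 3 6 4, as in the array
assert set(range(1, 101)) <= values   # every k in {1,...,100} is attained
assert len(values) == 29 * 29         # no collisions: F is injective here
```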
(b) #Z = ℵ₀.

Hint: Let Nₑ denote the set of all nonnegative even integers (including zero), and let Nₒ denote the set of all positive odd integers. Recall that #N₀ = #N, and note that #N₀ = #Nₑ and #N = #Nₒ. (Reason: the functions F: N₀ → Nₑ and G: N → Nₒ, given by F(n) = 2n for every n ∈ N₀ and G(n) = 2n − 1 for every n ∈ N, are injective and surjective.) Set N₋ = {k ∈ Z : −k ∈ N}, so that #N₋ = #N = #Nₒ. Use Problem 1.23(a) to show that #Z ≤ #N.
(c) #Q = ℵ₀.

Hint: The function F: Z×N → Q defined by F(k, n) = k/n for every k ∈ Z and every n ∈ N is surjective. Use Problem 1.23(b).

Problem 1.26. The purpose of this problem is to prove that, if X and Y are nonempty sets, then
#(X ∪ Y) ≤ #(X × Y).

(a) First verify that the above assertion holds whenever X and Y are both (nonempty) finite sets.

Consider the relations ~x and ~y on the Cartesian product X × Y defined as follows. For every pair of pairs (x₁, y₁) and (x₂, y₂) in X × Y,

(x₁, y₁) ~x (x₂, y₂) if x₁ = x₂,

(x₁, y₁) ~y (x₂, y₂) if y₁ = y₂.

(b) Show that ~x and ~y are both equivalence relations on X × Y.

Now take x ∈ X and y ∈ Y arbitrary and consider the equivalence classes [x] ⊆ X × Y and [y] ⊆ X × Y of the point (x, y) ∈ X × Y with respect to ~x and ~y, respectively.

(c) Show that [x] ↔ Y and [y] ↔ X.

Next suppose one of the sets X or Y, say X, is infinite, and consider the singleton {(x, y)} of (x, y) ∈ X × Y.

(d) Verify that #X = #([y]\{(x, y)}) and #Y = #[x].

(e) Finally, apply the results of Problems 1.21(a), 1.22 and 1.23(a) to conclude that #(X ∪ Y) ≤ #(X × Y).

Problem 1.27. Prove the following propositions.
(a) The union of two sets is countable if and only if each of them is countable. Hint: Problems 1.22, 1.23, 1.25 and 1.26.

(b) Let X be an infinite set, and let A and B be arbitrary subsets of X. The relation ~ on P(X) defined by

A ~ B   if   A∆B is countable

is an equivalence relation (compare with Problem 1.13).
Problem 1.28. (a) Let E be an infinite set. Suppose A and B are sets such that

#A ≤ #E   and   #B ≤ #E.

The following propositions, which are naturally linked to Problems 1.23 and 1.26, are in fact corollaries of Theorem 1.9. Prove them.

#(A ∪ B) ≤ #E   and   #(A × B) ≤ #E.

(b) Now use Problem 1.23(b) and Theorem 1.9 to show by induction that, if X is an infinite set, then

#Xⁿ = #X for every n ∈ N.

Problem 1.29. Let I be the collection of all injective functions from subsets of a set X to itself. That is, set
I = {F ∈ X^A : A ⊆ X and F is injective}.

We know from Problem 1.17 that, as a subset of ⋃_{A∈P(X)} X^A, I is partially ordered in the extension ordering. Let J be the collection of all those functions F in I for which the range R(F) is disjoint from the domain D(F):

J = {F ∈ I : R(F) ⊆ X\D(F)}.

Problem 1.17 also tells us that every chain {F_γ} in J has a supremum ⋁_γ F_γ in I, and also that D(⋁_γ F_γ) = ⋃_γ D(F_γ) and R(⋁_γ F_γ) = ⋃_γ R(F_γ).

(a) Show that ⋁_γ F_γ in fact lies in J.

Hint: Take F_λ and F_μ arbitrary from {F_γ} ⊆ J, so that F_λ ≤ F_μ (or vice versa). Note that R(F_λ) ∩ D(F_μ) ⊆ R(F_μ) ∩ D(F_μ) = ∅, and conclude that ⋃_γ R(F_γ) ∩ ⋃_γ D(F_γ) is empty too.

Therefore, every chain in J has an upper bound (a supremum, actually) in J. Thus, according to Zorn's Lemma, J contains a maximal function. Let F₀ be a maximal function of J and let A₀ be the domain of F₀, so that F₀(A₀) ⊆ X\A₀. Suppose

#A₀ < #(X\A₀)

and show that, if X is an infinite set, then
(b) there exist two distinct points, say x₀ and x₁, in (X\A₀)\F₀(A₀).

Hint: Under the above assumption #F₀(A₀) < #(X\A₀) (why?) and X\A₀ is infinite. Recall: the union of finite sets is finite.

Now set A₁ = A₀ ∪ {x₀} and consider the function F₁: A₁ → X defined by the formula

F₁(x) = F₀(x) ∈ F₀(A₀) if x ∈ A₀,
F₁(x) = x₁ ∈ X\F₀(A₀) if x = x₀ ∈ X\A₀.

(c) Show that F₁ ∈ J.

Since F₀ = F₁|A₀, it follows that F₀ ≤ F₁, which contradicts the fact that F₀ is a maximal function of J (for F₀ ≠ F₁). Therefore (by Theorem 1.7),

#(X\A₀) ≤ #A₀.

Next verify that #A₀ = #F₀(A₀) ≤ #(X\A₀) and conclude (by the Cantor–Bernstein Theorem) that

#A₀ = #(X\A₀).

Finally, by using Problem 1.28(a), show that

#A₀ = #X.

Outcome: If X is an infinite set, then there exists a subset of it, say A₀, such that

#A₀ = #X = #(X\A₀).
#Ao = #X = #(X\Ao). Problem 1.30. Let A be an arbitrary set and write A' = A x { A).
(a) Show that #A'= #A. Hint: The function that assigns to each a E A the ordered pair (a, A) in the Cartesian product A x (A) is a one-to-one mapping of A onto A'.
Clearly, A n A' = 0. Moreover, if B is any set such that B i6 A. then A' n B' = 0 where B' = B x { B }. Conclusion 1: If A and B are any sets, then there exist sets C and D such that
#A=#C, #B=#D
and
CnD=O.
Now suppose C1, C2, DI and D2 are sets with the following properties. Ci n Dt = 0,
C2 n D2 = 0, #Ct = #C2 and #Dt = #D2. (b) Show that #(Ci U Di) = #(C2 U D2).
Conclusion 2: #(C ∪ D) is independent of the particular pair of sets {C, D} employed in Conclusion 1.

We are now prepared to define the sum of cardinal numbers. If A and B are sets, then

#A + #B = #(C ∪ D)

for any pair of sets {C, D} such that

#A = #C,  #B = #D  and  C ∩ D = ∅.

In particular, if A ∩ B = ∅, then #A + #B = #(A ∪ B).

(c) Use Problem 1.29 to show that #X + #X = #X for every infinite set X.

The definition of the product of cardinal numbers is much simpler:

#A · #B = #(A × B)

for any sets A and B. According to Theorem 1.9, #X · #X = #X for every infinite set X.

(d) Prove: If X and Y are two sets, at least one of which is infinite, then

#X + #Y = #X · #Y = max{#X, #Y}.

Hint: If #Y ≤ #X, then #X ≤ #X + #Y ≤ #X + #X and #X ≤ #X · #Y ≤ #X · #X.
2 Algebraic Structures
The main algebraic structure that will be involved with the subject of this book is that of a "linear space" (or "vector space"). A linear space is a set endowed with an extra structure in addition to its set-theoretic structure (i.e., an extra structure that goes beyond the notions of functions and ordering, for instance). Roughly speaking, linear spaces are sets where two operations, called "addition" and "scalar multiplication", are properly defined so that we can refer to the "sum" of two points in a linear space, as well as to the "product" of a point in it by a "scalar". Although the reader is supposed to have already had some contact with linear algebra and, in particular, with "finite-dimensional vector spaces", we shall proceed from the very beginning. Our approach avoids the parochially "finite-dimensional" constructions (whenever this is possible), and focuses either on general results that do not depend on the "dimensionality" of the linear space, or on abstract "infinite-dimensional" linear spaces.
2.1 Linear Spaces
A binary operation on a set X is a mapping of X × X into X. If F is a function from X × X to X, then we generally write z = F(x, y) to indicate that z in X is the value of F at the point (x, y) in X × X. However, to emphasize the role of the binary operation (the outcome of a binary operation on two points of X is again a point of X), it is convenient (and usual) to adopt a different notation. Moreover, in order to emphasize the abstract character of a binary operation, it is also common to use
a noncommittal symbol to denote it. Thus, if * is a binary operation on X (so that *: X × X → X), then we shall write z = x * y instead of z = *(x, y) to indicate that z in X is the value of * at the point (x, y) in X × X. If a binary operation * on X has the property that
x*(y*z) = (x*y)*z
for every x, y and z in X, then it is said to be associative. In this case we shall drop the parentheses and write x * y * z. If there exists an element e in X such that
x*e = e*x = x

for every x ∈ X, then e is said to be the neutral element (or the identity element) with respect to the binary operation * on X. It is easy to show that, if a binary operation * has a neutral element e, then e is unique. If an associative binary operation * on X
has a neutral element e in X, and if for some x ∈ X there exists x⁻¹ ∈ X such that

x * x⁻¹ = x⁻¹ * x = e,

then x⁻¹ is called the inverse of x with respect to *. It is also easy to show that, if the inverse of x exists with respect to an associative binary operation *, then it is unique. A group is a nonempty set X on which is defined a binary operation * such that

(a) * is associative,
(b) * has a neutral element in X, and (c) every x in X has an inverse in X with respect to *.
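On a finite set the group axioms (a)-(c) can be verified exhaustively. The following brute-force check is our illustration (the book does not present it); the example chosen is the set {0, 1, 2, 3, 4} under addition modulo 5:

```python
# Brute-force check of the group axioms for Z_5 under addition mod 5.

Z5 = range(5)
op = lambda x, y: (x + y) % 5

# (a) associativity
associative = all(op(x, op(y, z)) == op(op(x, y), z)
                  for x in Z5 for y in Z5 for z in Z5)
# (b) a neutral element
e = next(e for e in Z5 if all(op(x, e) == op(e, x) == x for x in Z5))
# (c) every element has an inverse with respect to op
has_inverses = all(any(op(x, y) == op(y, x) == e for y in Z5) for x in Z5)

print(associative, e, has_inverses)   # True 0 True
```

Since op is also commutative, this group is Abelian; replacing op by function composition on the permutations of a three-point set would give the non-Abelian group of Example 2A below.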
If a binary operation * on X has the property that
x*y = y*x for every x and y in X, then it is said to be commutative. If X is a group with respect to a binary operation *, and if (d) * is commutative,
then X is said to be an Abelian (or commutative) group.
Example 2A. The collection of all injective mappings of a nonempty set X onto itself (i.e., the collection of all invertible mappings on X) is a non-Abelian group with respect to the composition operation ∘. The neutral element (or the identity element) of such a group is, of course, the identity map on X. An additive Abelian group is an Abelian group X for which the underlying binary operation is interpreted as an addition and denoted by + (instead of *). In this case
the element x + y (which lies in X for every x and y in X) is called the sum of
2.1 Linear Spaces
39
x and y. The (unique) neutral element with respect to addition is denoted by 0 (instead of e) and called zero. The (unique) inverse of x under addition is denoted by −x (instead of x⁻¹) and called the negative of x. Thus x + 0 = 0 + x = x and x + (−x) = (−x) + x = 0 for every x ∈ X. Moreover, the operation of subtraction is defined by x − y = x + (−y), and x − y is called the difference between x and y. If ∘: X × X → X is another binary operation on X, and if
x ∘ (y * z) = (x ∘ y) * (x ∘ z)   and   (y * z) ∘ x = (y ∘ x) * (z ∘ x)
for every x, y and z in X, then o is said to be distributive with respect to *. The above properties are the so-called distributive laws. A ring is an additive Abelian group X with a second binary operation on X, called multiplication and denoted by , such that
(e) the multiplication operation is associative and (f) distributive with respect to the addition operation.
In this case the element x · y (which lies in X for every x and y in X) is called the product of x and y (alternative notation: xy instead of x · y). A commutative ring is a ring for which (g) the multiplication operation is commutative. A ring with identity is a ring X such that
(h) the multiplication operation has a neutral element in X. In this case such a (unique) neutral element in X with respect to the multiplication operation is denoted by 1 (so that x · 1 = 1 · x = x for every x ∈ X) and called the identity.
Example 2B. The power set P(X) of a nonempty set X is a commutative ring with identity if addition is interpreted as symmetric difference (or Boolean sum) and multiplication as intersection (i.e., A + B = A∆B and A · B = A ∩ B for all subsets A and B of X). Here the neutral element under addition (i.e., the zero) is the empty set ∅, and the neutral element under multiplication (i.e., the identity) is X itself.
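The ring of Example 2B is small enough to check exhaustively. The following Python sketch is our illustration (the names add, mul and the choice X = {1, 2, 3} are ours, not the book's); it verifies the ring axioms for P(X):

```python
from itertools import product

# Model the ring P(X) of Example 2B for a small set X, with
# addition = symmetric difference and multiplication = intersection.
elems = (1, 2, 3)
X = frozenset(elems)
subsets = [frozenset(e for e, keep in zip(elems, bits) if keep)
           for bits in product((0, 1), repeat=len(elems))]

add = lambda A, B: A ^ B    # symmetric difference A Delta B
mul = lambda A, B: A & B    # intersection
ZERO, ONE = frozenset(), X  # neutral elements for + and for the product

# Check the ring axioms exhaustively over all pairs and triples of subsets.
for A, B, C in product(subsets, repeat=3):
    assert add(A, B) == add(B, A)                          # + commutative
    assert add(add(A, B), C) == add(A, add(B, C))          # + associative
    assert mul(mul(A, B), C) == mul(A, mul(B, C))          # . associative
    assert mul(A, add(B, C)) == add(mul(A, B), mul(A, C))  # distributive
for A in subsets:
    assert add(A, ZERO) == A and mul(A, ONE) == A          # neutral elements
    assert add(A, A) == ZERO       # each A is its own additive inverse
```

Note the last assertion: in this ring every element is its own negative, since A∆A = ∅ for every subset A.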
A ring with identity is nontrivial if it has another element besides the identity. (The set {0} with the operations 0 + 0 = 0 and 0 · 0 = 0 is the trivial ring whose only element is the identity.) If a ring with identity is nontrivial, then the neutral element
under addition and the neutral element under multiplication never coincide (i.e.,
0 ≠ 1). In fact, x · 0 = 0 · x = 0 for every x in X whenever X is a ring (with or without identity). Incidentally (or not) this also shows that, in a nontrivial ring with identity, zero has no inverse with respect to the multiplication operation (i.e., there
is no x in X such that 0 · x = x · 0 = 1). A ring X with identity is called a division ring if
2. Algebraic Structures
(i) each nonzero x in X has an inverse in X with respect to the multiplication operation.
That is, if x ≠ 0 in X, then there exists a (unique) x^{-1} ∈ X such that x · x^{-1} = x^{-1} · x = 1.
Example 2C. Let the addition and multiplication operations have their ordinary ("numerical") meanings. The set of all natural numbers N is not a group under addition; neither is the set of all nonnegative integers N0. However, the set of all integers Z is a commutative ring with identity, but not a division ring. The sets Q, R and C (of all rational, real and complex numbers, respectively), when equipped with their respective operations of addition and multiplication, are all commutative division rings. These are infinite commutative division rings but there
are finite commutative division rings (e.g., if we declare that 1 + 1 = 0 + 0 = 0, 0 + 1 = 1 + 0 = 1, 0 · 0 = 0 · 1 = 1 · 0 = 0, and 1 · 1 = 1, then {0, 1} is a commutative division ring).
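This two-element division ring can be realized as arithmetic modulo 2. A quick Python check (our illustration, not the book's) confirms the tables just declared:

```python
# The two-element commutative division ring of Example 2C, realized as
# arithmetic mod 2 on {0, 1}.
F2 = (0, 1)
add = lambda a, b: (a + b) % 2   # 1 + 1 = 0 + 0 = 0, 0 + 1 = 1 + 0 = 1
mul = lambda a, b: (a * b) % 2   # 1 . 1 = 1, all other products are 0

# Beyond the ring axioms, a division ring needs an inverse for every
# nonzero element; here the only nonzero element is 1, and 1 . 1 = 1.
for a in F2:
    for b in F2:
        assert mul(a, b) == mul(b, a)        # multiplication commutative
for a in F2:
    if a != 0:
        assert any(mul(a, b) == 1 for b in F2)  # a has an inverse

assert add(1, 1) == 0 and mul(1, 1) == 1     # the declared tables hold
```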
Roughly speaking, commutative division rings are the number systems of mathematics, and so they deserve a name of their own. A field is a nontrivial commutative division ring. The elements of a field are usually called scalars. We shall be particularly concerned with the fields R and C (the real field and the complex field). An arbitrary field will be denoted by F. Summing up: A field F is a set with more than one element (at least 0 and 1 are distinct elements of it) equipped with two binary operations, called addition and multiplication, that satisfy all the properties (a) to (i) - clearly, with * replaced by + in properties (a) to (d).
Definition 2.1. A linear space (or vector space) over a field F is a nonempty set X (whose elements are called vectors) satisfying the following axioms.

Vector Addition. X is an additive Abelian group under a binary operation + called vector addition.

Scalar Multiplication. There is given a mapping of F × X into X that assigns to each scalar α in F and each vector x in X a vector αx in X. Such a mapping defines an operation, called scalar multiplication, with the following properties. For all scalars α and β in F, and all vectors x and y in X,

1x = x,
α(βx) = (αβ)x,
α(x + y) = αx + αy,
(α + β)x = αx + βx.

Some remarks on notation and terminology. The underlying set of a linear space is the nonempty set upon which the linear space is built. We shall use the same notation
X for both the linear space and its underlying set, even though the underlying set
alone has no algebraic structure of its own. A set X needs a binary operation on it, a field, and another operation involving such a field with X to acquire the necessary algebraic structure that will grant it the status of a linear space. The scalar 1 in the above definition stands, of course, for the identity in the field F with respect to the multiplication in F, and + denotes the addition in F. Observe that + (addition
in the field F) and + (addition in the group X) are different binary operations. However, once the difference has been pointed out, we shall from now on use the same symbol + to denote both of them. Moreover, we shall drop the dot also from the multiplication notation in F, and write αβ instead of α · β. The neutral element under the vector addition in X (i.e., the vector zero) is referred to as the origin of the linear space X. Again, we shall use the same symbol 0 to denote both the origin in X and the scalar zero in F. A linear space over R is called a real linear space and a linear space over C is called a complex linear space.

Example 2D. R itself is a linear space over R. That is, the plain set R when equipped with the ordinary binary operations of addition and multiplication becomes a field, also denoted by R. If vector addition is identified with scalar addition, then it becomes a real linear space, denoted again by R. More generally, for each n ∈ N, let F^n denote the Cartesian product of n copies of a field F (i.e., the set of all ordered n-tuples of scalars in F). Now define vector addition and scalar multiplication coordinatewise, as usual:

x + y = (ξ_1 + υ_1, ..., ξ_n + υ_n)   and   αx = (αξ_1, ..., αξ_n)

for every x = (ξ_1, ..., ξ_n) and y = (υ_1, ..., υ_n) in F^n and every α in F. This makes F^n into a linear space over F. In particular, R^n (the Cartesian product of n copies of R) and C^n (the Cartesian product of n copies of C) become real and complex linear spaces, respectively, whenever vector addition and scalar multiplication are defined coordinatewise. However, if we restrict scalar multiplication to real multiplication only, then C^n can also be made into a real linear space.
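The coordinatewise operations on F^n take only a few lines of code. In the sketch below (our illustration; the helper names vec_add and scal_mul are ours) the field F is Q, modeled exactly with Python's Fraction type:

```python
from fractions import Fraction

# Coordinatewise operations on F^n (Example 2D), sketched over the
# rational field Q using exact Fraction arithmetic.
def vec_add(x, y):
    """Vector addition in F^n: the i-th coordinate of x + y is x_i + y_i."""
    return tuple(a + b for a, b in zip(x, y))

def scal_mul(alpha, x):
    """Scalar multiplication in F^n: the i-th coordinate of alpha x is alpha x_i."""
    return tuple(alpha * a for a in x)

x = (Fraction(1), Fraction(2), Fraction(3))
y = (Fraction(4), Fraction(5), Fraction(6))

assert vec_add(x, y) == (5, 7, 9)
assert scal_mul(Fraction(1, 2), y) == (2, Fraction(5, 2), 3)

# One instance of the axiom alpha(x + y) = alpha x + alpha y:
a = Fraction(3)
assert scal_mul(a, vec_add(x, y)) == vec_add(scal_mul(a, x), scal_mul(a, y))
```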
Example 2E. Let S be a nonempty set and let F be an arbitrary field. Consider the set X = F^S
of all functions from S to F (i.e., the set of all scalar-valued functions on S, where "scalar-valued" stands for "F-valued"). Let vector addition and scalar multiplication be defined pointwise. That is, if x and y are functions in X and α is a scalar in F, then x + y and αx are functions in X defined by
(x + y)(s) = x(s) + y(s)
and
(αx)(s) = α(x(s))
for every s ∈ S. Now it is easy to show that X, when equipped with these two operations, in fact is a linear space over F. Particular cases: F^N (the set of all scalar-valued sequences) and F^[0,1] (the set of all scalar-valued functions on the interval
[0, 1]) are linear spaces over F, whenever vector addition and scalar multiplication are defined pointwise. Note that the linear space F^n in the previous example also is a particular case of the present example, where the coordinatewise operations are identified with the pointwise operations (recall: F^n = F^n̄ and n̄ = {i ∈ N : i ≤ n}).
Example 2F. What was the role played by the field F in the previous example? Answer: Vector addition and scalar multiplication in F^S were defined pointwise by using addition and multiplication in F. This suggests the following generalization of Example 2E. Let S be a nonempty set, and let Y be an arbitrary linear space (over a field F). Consider the set X = Y^S
of all functions from S to Y (i.e., the set of all Y-valued functions on S). Let vector addition and scalar multiplication in X be defined pointwise by using vector addition and scalar multiplication in Y. That is, if f and g are functions in X (so that f(s) and g(s) are elements of Y for each s ∈ S) and α is a scalar in F, then f + g and αf are functions in X defined by
(f + g)(s) = f(s) + g(s)
and
(αf)(s) = α(f(s))
for every s ∈ S. As before, it is easily verified that X, when equipped with these operations, becomes a linear space over the same field F. The origin of X is the null function 0: S → Y (which is defined by 0(s) = 0 for all s ∈ S). Examples 2D and 2E can be thought of as particular cases of this one.
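The pointwise operations of Examples 2E and 2F translate directly into code. The sketch below is our illustration (the names ptw_add and ptw_mul are ours), with S = {0, 1, 2} and scalar-valued functions represented as Python callables:

```python
# Pointwise operations on the function space X = F^S of Example 2E,
# sketched with S = {0, 1, 2} and F modeled by Python numbers.
def ptw_add(x, y):
    """(x + y)(s) = x(s) + y(s)."""
    return lambda s: x(s) + y(s)

def ptw_mul(alpha, x):
    """(alpha x)(s) = alpha * x(s)."""
    return lambda s: alpha * x(s)

f = lambda s: s + 1     # the function s -> s + 1
g = lambda s: 2 * s     # the function s -> 2s
zero = lambda s: 0      # the origin of X: the null function

h = ptw_add(f, g)       # (f + g)(s) = 3s + 1
assert [h(s) for s in range(3)] == [1, 4, 7]
k = ptw_mul(3, f)       # (3f)(s) = 3(s + 1)
assert [k(s) for s in range(3)] == [3, 6, 9]

# Adding the null function changes nothing, pointwise:
assert all(ptw_add(f, zero)(s) == f(s) for s in range(3))
```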
Example 2G. Let X be a linear space over F, and let x, x', y and y' be arbitrary vectors in X. An equivalence relation ~ on X is compatible with vector addition if
x' ~ x and y' ~ y
imply
x' + y' ~ x + y.
Similarly, it is said to be compatible with scalar multiplication if, for x and x' in X and α in F,
x' ~ x
implies
αx' ~ αx.
If an equivalence relation ~ on a linear space X is compatible with both vector addition and scalar multiplication, then we shall say that ~ is a linear equivalence relation. Now consider X/~, the quotient space of X modulo ~ (i.e., the collection of all equivalence classes [x] with respect to ~), and suppose the equivalence relation ~ on X is linear. In this case a binary operation + on X/~ can be defined by setting
[x] + [y] = [x + y] for every [x] and [y] in X/~. Indeed, since ~ is compatible with vector addition, it follows that [x + y] does not depend on which particular members x and y of the equivalence classes [x] and [y] were taken. Thus the operation + actually is
a function from X/~ × X/~ to X/~. This defines vector addition in X/~. Scalar multiplication in X/~ can be similarly defined by setting
α[x] = [αx] for every [x] in X/~ and every α in F. Therefore, if ~ is a linear equivalence relation on a linear space X over a field F, then X/~ becomes a linear space over F whenever vector addition and scalar multiplication in X/~ are defined this way. It is clear by the definition of a linear space X that x + y + z is a well-defined vector in X whenever x, y and z are vectors in X. Similarly, if {x_i}_{i=1}^n is a finite set of vectors in X, then the sum x_1 + ... + x_n, denoted by ∑_{i=1}^n x_i, is again a vector in X. (The notion of infinite sums needs topology and we shall consider them in Chapters 4 and 5.)
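The quotient construction of Example 2G can be tried out on a small concrete case. The sketch below is ours, not the book's: it takes X = (GF(2))^3 with the congruence-modulo-a-manifold relation of Example 2H (x' ~ x when x' - x lies in M, which over GF(2) means x' + x ∈ M), and checks that [x] + [y] = [x + y] does not depend on the representatives chosen:

```python
from itertools import product

# Quotient space sketch: X = (GF(2))^3 and M the linear manifold
# spanned by (1, 1, 0). Over GF(2), subtraction equals addition.
def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

X = list(product((0, 1), repeat=3))
M = {(0, 0, 0), (1, 1, 0)}

def coset(x):
    """The coset [x] = x + M, i.e. the equivalence class of x."""
    return frozenset(add(x, m) for m in M)

cosets = {coset(x) for x in X}
assert len(cosets) == 4                   # 8 vectors, 2 per class
assert coset((0, 0, 0)) == frozenset(M)   # the origin of X/M is [0] = M

# Vector addition in X/M is well defined: [x] + [y] = [x + y] gives the
# same coset no matter which members of [x] and [y] are taken.
for x, y in product(X, repeat=2):
    for x2 in coset(x):
        for y2 in coset(y):
            assert coset(add(x2, y2)) == coset(add(x, y))
```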
2.2
Linear Manifolds
A linear manifold of a linear space X over F is a nonempty subset M of X with the following properties.
x + y ∈ M and αx ∈ M for every pair of vectors x, y in M and every scalar α in F. It is readily verified that a linear manifold M of a linear space X over a field F is itself a linear space over the same field F. The origin 0 of X is the origin of every linear manifold M of X. The zero linear manifold is {0}, consisting of the single vector 0. If a linear manifold M is a proper subset of X, then it is said to be a proper linear manifold. A nontrivial
linear manifold M of a linear space X is a nonzero proper linear manifold of it ({0} ≠ M ≠ X).

Example 2H. Let M be a linear manifold of a linear space X and consider a relation
~ on X defined as follows. If x and x' are vectors in X, then

x' ~ x

if

x' - x ∈ M.
That is, x' ~ x if x' is congruent to x modulo M - notation: x' = x (mod M). Since M is a linear manifold of X, the relation ~ in fact is an equivalence relation on X (reason: 0 ∈ M - reflexivity; x' - x'' = (x' - x) + (x - x'') ∈ M whenever x' - x and x - x'' lie in M - transitivity; and x - x' ∈ M whenever x' - x ∈ M - symmetry). The equivalence class (with respect to ~)
[x] = {x' ∈ X : x' ~ x} = {x' ∈ X : x' = x + z for some z ∈ M} of a vector x in X is called the coset of x modulo M - notation: [x] = x + M. The set of all cosets [x] modulo M for every x ∈ X (i.e., the collection of all equivalence
classes [x] with respect to the equivalence relation ~ for every x in X) is precisely the quotient space X/~ of X modulo ~. Following the terminology introduced in Example 2G, ~ is a linear equivalence relation on the linear space X. Indeed, if x' - x ∈ M and y' - y ∈ M, then (x' + y') - (x + y) = (x' - x) + (y' - y) ∈ M and αx' - αx = α(x' - x) ∈ M for every scalar α, so that x' ~ x and y' ~ y imply x' + y' ~ x + y and αx' ~ αx. Therefore, with vector addition and scalar multiplication defined by
[x] + [y] = [x + y]
and
α[x] = [αx],
X/~ is made into a linear space. This is usually denoted by X/M (instead of X/~) and called the quotient space of X modulo M. The origin of X/M is, of course, [0] = M. If M and N are linear manifolds of a linear space X, then the sum of M and N, denoted by M + N, is the subset of X made up of all sums x + y where x is a vector in M and y is a vector in N:
M + N = {z ∈ X : z = x + y, x ∈ M and y ∈ N}.

It is trivially verified that M + N is a linear manifold of X. If {M_i}_{i=1}^n is a finite family of linear manifolds of a linear space X, then the sum ∑_{i=1}^n M_i is the linear manifold M_1 + ... + M_n of X consisting of all sums ∑_{i=1}^n x_i where each vector x_i lies in M_i. More generally, if {M_γ}_{γ∈Γ} is an arbitrary indexed family of linear manifolds of a linear space X, then the sum ∑_{γ∈Γ} M_γ is defined as the set of all sums ∑_γ x_γ with x_γ ∈ M_γ for each index γ and x_γ = 0 except for some finite set of indices (i.e., ∑_{γ∈Γ} M_γ is the set made up of all finite sums with each summand being a vector in one of the linear manifolds M_γ). Clearly, ∑_{γ∈Γ} M_γ is itself a linear manifold of X, and M_α ⊆ ∑_{γ∈Γ} M_γ for every M_α ∈ {M_γ}_{γ∈Γ}. A linear manifold of a linear space X is never empty: the origin of X is always there. Note that the intersection M ∩ N of two linear manifolds M and N of a linear space X is itself a linear manifold of X. In fact, if {M_γ}_{γ∈Γ} is an arbitrary collection
of linear manifolds of a linear space X, then the intersection ∩_{γ∈Γ} M_γ is again a linear manifold of X. Moreover, ∩_{γ∈Γ} M_γ ⊆ M_α for every M_α ∈ {M_γ}_{γ∈Γ}. Now consider the collection Lat(X) of all linear manifolds of a linear space X.
Since Lat(X) is a subcollection of the power set P(X), it follows that Lat(X) is partially ordered in the inclusion ordering. If {M_γ}_{γ∈Γ} is any subcollection of Lat(X), then ∑_{γ∈Γ} M_γ in Lat(X) is an upper bound for {M_γ}_{γ∈Γ} and ∩_{γ∈Γ} M_γ in Lat(X) is a lower bound for {M_γ}_{γ∈Γ}. If U in Lat(X) is an upper bound for {M_γ}_{γ∈Γ} (i.e., if M_γ ⊆ U for all γ ∈ Γ), then ∑_{γ∈Γ} M_γ ⊆ U. Therefore

∑_{γ∈Γ} M_γ = sup{M_γ}_{γ∈Γ}.
Similarly, if V in Lat(X) is a lower bound for {M_γ}_{γ∈Γ} (i.e., if V ⊆ M_γ for all γ ∈ Γ), then V ⊆ ∩_{γ∈Γ} M_γ. Thus

∩_{γ∈Γ} M_γ = inf{M_γ}_{γ∈Γ}.
Conclusion: Lat(X) is a complete lattice. The collection of all linear manifolds of a linear space is a complete lattice in the inclusion ordering. If {M, N} is a pair of elements of Lat(X), then M ∨ N = M + N and M ∧ N = M ∩ N. Let A be an arbitrary subset of a linear space X, and consider the subcollection (a sublattice, actually) L_A of the complete lattice Lat(X),
L_A = {M ∈ Lat(X) : A ⊆ M},

consisting of all linear manifolds of X that include A. Set

span A = inf L_A = ∩ L_A,

which is called the (linear) span of A. Since A ⊆ ∩ L_A (for A ⊆ M for every M ∈ L_A), it follows that inf L_A = min L_A so that span A ∈ L_A. Thus span A is the smallest linear manifold of X that includes A, which coincides with the intersection of all linear manifolds of X that include A. It is readily verified that span ∅ = {0}, span M = M for every M ∈ Lat(X), and A ⊆ span A = span(span A) for every A ∈ P(X). Moreover, if A and B are subsets of X, then
A ⊆ B

implies

span A ⊆ span B.
If M and N are linear manifolds of a linear space X, then it is clear that M ∪ N ⊆ M + N. Moreover, if K is a linear manifold of X such that M ∪ N ⊆ K, then x + y ∈ K for every x ∈ M and every y ∈ N, and hence M + N ⊆ K. Thus M + N is the smallest linear manifold of X that includes M ∪ N, which means that
M + N = span(M ∪ N). More generally, let {M_γ}_{γ∈Γ} be an arbitrary subcollection of Lat(X), and suppose K ∈ Lat(X) is such that ∪_{γ∈Γ} M_γ ⊆ K. Then every (finite) sum ∑_γ x_γ with each x_γ in M_γ is a vector in K. Thus ∑_{γ∈Γ} M_γ ⊆ K. Since ∪_{γ∈Γ} M_γ ⊆ ∑_{γ∈Γ} M_γ, it follows that ∑_{γ∈Γ} M_γ is the smallest element of Lat(X) that includes ∪_{γ∈Γ} M_γ.
Equivalently,
∑_{γ∈Γ} M_γ = span( ∪_{γ∈Γ} M_γ ).
2.3
Linear Independence
Let A be a nonempty subset of a linear space X. A vector x ∈ X is a linear combination of vectors in A if there exist a finite set {x_i}_{i=1}^n of vectors in A and a finite family of scalars {α_i}_{i=1}^n such that

x = ∑_{i=1}^n α_i x_i.
Warning: A linear combination is, by definition, finite. That is, a linear combination of vectors in a set A is a weighted sum of a finite subset of vectors in A, weighted by a finite family of scalars, no matter whether A is a finite or an infinite set. Since X is a linear space, any linear combination of vectors in A is a vector in X.
Proposition 2.2. The set of all linear combinations of vectors in a nonempty subset A of a linear space X is a linear manifold of X that coincides with span A.
Proof. Let A be an arbitrary subset of a linear space X, consider the collection L_A of all linear manifolds of X that include A, and recall that

span A = min L_A.

Suppose A is nonempty and let (A) denote the set of all linear combinations of vectors in A. It is clear that A ⊆ (A) (every vector in A is a trivial linear combination
of vectors in A), and that (A) is a linear manifold of X (if x, y ∈ (A), then x + y and αx lie in (A)). Thus (A) ∈ L_A.
Moreover, if M is an arbitrary linear manifold of X, and if x ∈ X is a linear combination of vectors in M, then x ∈ M (because M is itself a linear space).
Thus (M) ⊆ M. Since M ⊆ (M), it follows that (M) = M for every linear manifold M of X. Furthermore, if M ∈ L_A, then A ⊆ M and hence (A) ⊆ (M) (reason: (A) ⊆ (B) whenever A and B are nonempty subsets of X such that A ⊆ B). Therefore, M ∈ L_A implies (A) ⊆ M.
Conclusion: (A) is the smallest element of L_A. That is,
(A) = span A.
□
Following the notation introduced in the proof of Proposition 2.2, (A) = span A
whenever A ≠ ∅. Set (∅) = span ∅ so that (∅) = {0}, and hence (A) is well-defined for every subset A of X. We shall use one and the same notation, viz. span A, for both of them: the set of all linear combinations of vectors in A and the (linear) span of A. For this reason span A is also referred to as the linear manifold
generated (or spanned) by A. If a linear manifold M of X (which may be X itself) is such that span A = M for some subset A of X, then we say that A spans M. A subset A of a linear space X is said to be linearly independent if each vector x in A is not a linear combination of vectors in A\{x}. Equivalently, A is linearly independent if x ∉ span(A\{x}) for every x ∈ A. If a set A is not linearly independent, then it is said to be linearly dependent. Note that the empty set ∅ of a linear space X is linearly independent (there is no vector in ∅ that is a linear combination of vectors in ∅). Any singleton {x} of X such that x ≠ 0 is linearly independent. Indeed,
span({x}\{x}) = span ∅ = {0} so that x ∉ span({x}\{x}) if x ≠ 0. However, 0 ∈ span({0}\{0}) = {0}, and hence the singleton {0} is not linearly independent. In fact, every subset of X that contains the origin of X is not linearly independent (reason: if 0 ∈ A ⊆ X and A has another vector besides the origin, say x ≠ 0, then 0 = 0x). Thus, if a vector x is an element of a linearly independent subset of a linear space X, then x ≠ 0.
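For finite sets of vectors in F^n, linear independence and span membership reduce to finite computations. The following sketch is our illustration only (the functions reduce, is_independent and in_span, and the use of exact rational Gaussian elimination, do not appear in the text):

```python
from fractions import Fraction

# Testing linear independence and span membership in Q^n by exact
# Gaussian elimination (Fraction avoids floating-point error).
def reduce(rows):
    """Row-reduce a list of vectors; return the surviving nonzero rows."""
    rows = [list(map(Fraction, r)) for r in rows]
    basis = []
    for r in rows:
        for b in basis:
            p = next(i for i, v in enumerate(b) if v)   # pivot column of b
            if r[p]:
                f = r[p] / b[p]
                r = [ri - f * bi for ri, bi in zip(r, b)]
        if any(r):
            basis.append(r)
    return basis

def is_independent(vectors):
    """A finite set is independent iff elimination kills no vector."""
    return len(reduce(vectors)) == len(vectors)

def in_span(x, vectors):
    """x lies in span A iff adjoining x does not enlarge the reduced set."""
    return len(reduce(list(vectors) + [x])) == len(reduce(vectors))

assert is_independent([(1, 0, 0), (1, 1, 0)])
assert not is_independent([(1, 0, 0), (2, 0, 0)])
assert not is_independent([(0, 0, 0)])   # any set containing the origin
assert in_span((3, 2, 0), [(1, 0, 0), (1, 1, 0)])
assert not in_span((0, 0, 1), [(1, 0, 0), (1, 1, 0)])
```

This also illustrates the remark above: a set containing the origin is never linearly independent, since the zero row is eliminated immediately.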
Proposition 2.3. Let A be a nonempty subset of a linear space X. The following assertions are pairwise equivalent.
(a) A is linearly independent. (b) Each nonzero vector in span A has a unique representation as a linear combination of vectors in A. (c) Every finite subset of A is linearly independent. (d) There is no proper subset of A whose span coincides with span A. Proof. The statement (b) can be rewritten as follows.
(b') For every nonzero vector x ∈ span A there exist a unique finite family of scalars {α_i}_{i=1}^n and a unique finite subset {a_i}_{i=1}^n of A such that x = ∑_{i=1}^n α_i a_i.
Proof of (a)⇒(b). Suppose A ≠ ∅ is linearly independent and take an arbitrary nonzero x ∈ span A. Consider two representations of x as a linear combination of vectors in A:
x = ∑_{i=1}^n β_i b_i = ∑_{i=1}^m γ_i c_i,
where each b_i and each c_i are vectors in A (and hence nonzero because A is linearly independent). Since x ≠ 0 we may assume that the scalars β_i and γ_i are all nonzero. Set B = {b_i}_{i=1}^n and C = {c_i}_{i=1}^m, both finite nonempty subsets of A. Take an arbitrary b ∈ B and note that b is a linear combination of vectors in (B\{b}) ∪ C. However, since b ∈ A and A is linearly independent, it follows that b is not a linear combination of vectors in any subset of A\{b}. Thus b ∈ C. Similarly, take an arbitrary c ∈ C and conclude that c ∈ B by using the same argument. Hence
48
2. Algebraic Structures
B ⊆ C ⊆ B. That is, B = C. Therefore x = ∑_{i=1}^n β_i b_i = ∑_{i=1}^n γ_i b_i, which implies that ∑_{i=1}^n (β_i - γ_i) b_i = 0. Since each b_i is not a linear combination of vectors in B\{b_i}, it follows that β_i = γ_i for every i. Summing up: The two representations of x coincide.

Proof of (b)⇒(a). If A is nonempty and every nonzero vector x in span A has a unique representation as a linear combination of vectors in A, then the unique representation of an arbitrary a in A as a linear combination of vectors in A is a itself (recall: A ⊆ span A). Therefore, every a ∈ A is not a linear combination of vectors in A\{a}, which means that A is linearly independent.
Proof of (a)⇔(c). If A is linearly independent, then every subset of it clearly is linearly independent. If A is not linearly independent, then either A = {0} or there exists x ∈ A that is a linear combination of vectors, say {x_i}_{i=1}^n for some n ∈ N, in A\{x} ≠ ∅. In the former case A is itself a finite subset of A that is not linearly independent. In the latter case {x_i}_{i=1}^n ∪ {x} is a finite subset of A that is not linearly independent. Conclusion: If every finite subset of A is linearly independent, then A is itself linearly independent.
Proof of (a)⇔(d). Recalling that B ⊆ A implies span B ⊆ span A, the statement (d) can be rewritten as follows.

(d) B ⊂ A implies

span B ⊂ span A.
Suppose A is nonempty and linearly independent. Let B be an arbitrary proper subset
of A. If B = ∅, then (d) holds trivially (∅ ≠ A ≠ {0} so that span ∅ ⊂ span A). Thus suppose B ≠ ∅ and take any x ∈ A\B. If x ∈ span B, then x is a linear combination of vectors in B. This implies that B ∪ {x} is a subset of A that is not linearly independent, and hence A itself is not linearly independent, which is a contradiction. Therefore, x ∉ span B for every x ∈ A\B whenever ∅ ≠ B ⊂ A. Since x ∈ span A (because x ∈ A), and since span B ⊆ span A (for B ⊆ A), it follows that span B ⊂ span A so that (d) holds true.
Proof of (d)⇒(a). If A is not linearly independent, then either A = {0} or there exists x ∈ A that is a linear combination of vectors in A\{x}. In the former case the only proper subset of A = {0} is B = ∅, and span B = {0} = span A. In the latter case B = A\{x} is a proper subset of A such that span B = span A (reason: span B ⊆ span A because B ⊆ A, and span A ⊆ span B because every vector in A is a linear combination of vectors in A\{x}). Therefore, (d) implies (a).
□

2.4
Hamel Basis
A linearly independent subset of a linear space X that spans X is called a Hamel basis (or a linear basis) for X. In other words, a subset B of a linear space X is a
Hamel basis for X if (i)
B is linearly independent, and
(ii)
span B = X.
Let B = {x_γ}_{γ∈Γ} be an indexed Hamel basis for a linear space X. If x is a nonzero vector in X, then Proposition 2.3 ensures the existence of a unique (similarly indexed) family of scalars {α_γ}_{γ∈Γ} (which may depend on x) such that α_γ = 0 for all but a finite set of indices γ, and x = ∑_{γ∈Γ} α_γ x_γ. The weighted sum ∑_{γ∈Γ} α_γ x_γ (i.e., the unique representation of x as a linear combination of vectors in B, or the unique (linear) representation of x in terms of B) is called the expansion of x on B, and the coefficients of it (i.e., the unique indexed family of scalars {α_γ}_{γ∈Γ}) are called the coordinates of x with respect to the indexed basis B. If x = 0, then its unique expansion on B is the trivial one whose coefficients are all null. Since ∅ is linearly independent, and since span ∅ = {0}, it follows that the empty set ∅ is a Hamel basis for the zero linear space {0}. Now suppose X is a nonzero linear space. Every singleton {x} in X such that x ≠ 0 is linearly independent. Thus every nonzero linear space has many linearly independent subsets. If a linearly independent subset A of X is not already a Hamel basis for X, then we can construct a larger linearly independent subset of X.
Proposition 2.4. If A is a linearly independent subset of a linear space X, and if there exists x ∈ X\span A, then A ∪ {x} is a linearly independent subset of X.
Proof. Suppose there exists a vector x in X\span A. Note that x ≠ 0, and hence X ≠ {0}. If A = ∅, then the result is trivially verified ({x} = ∅ ∪ {x} is linearly independent). Thus suppose A is nonempty and set C = A ∪ {x} ⊆ X. Since x ∉ span A, it follows that x ∉ span(C\{x}). Take an arbitrary a ∈ A. Suppose a ∈ A is a linear combination of vectors in C\{a}. Clearly, a ≠ αx for every scalar α (for x ∉ span A and a ≠ 0 because A is linearly independent) so that
a = α_0 x + ∑_{i=1}^n α_i a_i,
where each a_i is a vector in A\{a} and each α_i is a nonzero scalar (recall: 0 ≠ a ≠ ∑_{i=1}^n α_i a_i for A is linearly independent, so that α_0 ≠ 0). Thus x is a linear combination of vectors in A, which contradicts the assumption that x ∉ span A. Therefore, every a ∈ A is not a linear combination of vectors in C\{a}. Conclusion: Every c ∈ C is not a linear combination of vectors in C\{c}, which means that C is linearly independent.
□

Can we proceed this way, enlarging linearly independent subsets of X in order to form a chain of linearly independent subsets, so that an "ultimate" linearly independent subset becomes a Hamel basis for X? Yes, we can; and it seems reasonable that the Axiom of Choice (or any statement equivalent to it as, for instance, Zorn's
Lemma) might be called into play. In fact, every linearly independent subset of any
linear space X is included in some Hamel basis for X, so that every linear space has a large supply of Hamel bases.

Theorem 2.5. If A is a linearly independent subset of a linear space X, then there exists a Hamel basis B for X such that A ⊆ B.
Proof. Let X be a linear space and suppose A is a linearly independent subset of X. Set
I_A = {B ∈ P(X) : B is linearly independent and A ⊆ B},

the collection of all linearly independent subsets of X that include A. Recall that, as a nonempty subcollection (A ∈ I_A) of the power set P(X), I_A is partially ordered in the inclusion ordering.
Claim 1. I_A has a maximal element.
Proof. If X = {0}, then A = ∅ and I_A = {∅} ≠ ∅, so that the claimed result is trivially verified. Thus suppose X ≠ {0}. In this case, the nonempty collection I_A contains a nonempty set (e.g., if A = ∅, then every nonzero singleton in X belongs to I_A; if A ≠ ∅, then A ∈ I_A). Now consider an arbitrary chain C in I_A containing a nonempty set. Recall that ∪C denotes the union of all sets in C. Take an arbitrary
finite nonempty subset of ∪C, say, a set D ⊆ ∪C such that #D = n for some n ∈ N. Each element of D belongs to a set in C (for D ⊆ ∪C). Since C is a chain, we can arrange the elements of D as follows: D = {x_i}_{i=1}^n such that x_i ∈ C_i ∈ C for each index i, where C_1 ⊆ ... ⊆ C_n. Thus D ⊆ C_n. Since C_n is linearly independent (because C_n ∈ C ⊆ I_A), it follows that D is linearly independent. Conclusion: Every finite subset of ∪C is linearly independent. Therefore ∪C is linearly independent by Proposition 2.3. Moreover, since A ⊆ C for all C ∈ C (for C ∈ I_A), it also follows that A ⊆ ∪C. Hence ∪C ∈ I_A. Since ∪C clearly is an upper bound for C, we may conclude: Every chain in I_A has an upper bound in I_A. Thus I_A has a maximal element by Zorn's Lemma. □

Claim 2. Take B ∈ I_A. B is maximal in I_A if and only if B is a Hamel basis for X.
Proof. Again, if X = {0}, then B = A = ∅ is the only (and so a maximal) element in I_A and span B = X, so that the claimed result holds trivially. Thus suppose X ≠ {0}, which implies that I_A contains nonempty sets, and take an arbitrary B in I_A. If span B ≠ X (i.e., if span B ⊂ X), then take x ∈ X\span B so that B ∪ {x} ∈ I_A (i.e., B ∪ {x} is linearly independent by Proposition 2.4, and A ⊆ B ∪ {x} because A ⊆ B). Hence B is not maximal in I_A. Therefore, if B is maximal in I_A, then span B = X. On the other hand, if span B = X, then B ≠ ∅ (for X ≠ {0}) and every vector in X is a linear combination of vectors in B. Thus B ∪ {x} is not linearly independent for every x ∈ X\B. This implies that there is no B' ∈ I_A such that B ⊂ B', which means that B is maximal in I_A. Conclusion: If B ∈ I_A, then B
is maximal in I_A if and only if span B = X. According to the definition of Hamel basis, B in I_A is such that span B = X if and only if B is a Hamel basis for X. □

Claims 1 and 2 ensure that, for each linearly independent subset A of X, there exists a Hamel basis B for X such that A ⊆ B. □

Since the empty set is a subset of every set (in particular, of every linear space), and since the empty set is linearly independent, it follows that the preceding theorem holds for A = ∅. In this case I_∅ is simply the collection of all linearly independent subsets of the linear space X, and the theorem statement just says that every linear space has a Hamel basis. Moreover, Claim 2 says that a Hamel basis for a linear space is precisely a maximal linearly independent subset of it (i.e., a Hamel basis is a maximal element of I_∅). The idea behind the previous theorem was that of enlarging a linearly independent
subset of X to get a Hamel basis for X. Another way of facing the same problem (i.e., another way to obtain a Hamel basis for linear space X) is to begin with a set that spans X and then to weed out from it a linearly independent subset that also spans X. Theorem 2.6. if a subset A of a linear space X spans X. then there exists a Hamel basis B for X such that B C A.
Proof. Let A be a subset of a linear space X such that span A = X, and consider the collection I^A of all linearly independent subsets of A:

I^A = {B ∈ P(X) : B is linearly independent and B ⊆ A}.

If X = {0}, then either A = ∅ or A = {0}. In any case I^A = {∅} trivially has a maximal element. If X ≠ {0}, then A has a nonzero vector (for span A = X) and every nonzero singleton {x} ⊆ A is an element of I^A. Thus, proceeding exactly as in the proof of Theorem 2.5 (Claim 1), we can show that I^A has a maximal element. Let A_0 be a maximal element of I^A. If A is linearly independent, then we
are done (i.e., A is itself a Hamel basis for X since span A = X). Thus suppose A is not linearly independent so that A_0 is a proper subset of A. Take an arbitrary a ∈ A\A_0 and consider the set A_0 ∪ {a} ⊆ A, which is not linearly independent because A_0 is maximal in I^A. Since A_0 is linearly independent, it follows that a is a linear combination of vectors in A_0. Thus A\A_0 ⊆ span A_0, and hence A = A_0 ∪ (A\A_0) ⊆ span A_0. Therefore span A ⊆ span(span A_0) = span A_0 ⊆ span A, which implies that span A_0 = span A = X. Conclusion: A_0 is a Hamel basis for X. □

Since X trivially spans X, the above theorem holds for A = X. In this case I^X is precisely the collection of all linearly independent subsets of X (i.e., I^X = I_∅), and the theorem statement again says that every linear space has a Hamel basis. An ever-present purpose in mathematics is a quest for hidden invariants. The concept of Hamel basis supplies a fundamental invariant for a linear space, namely, the cardinality of all Hamel bases for X.
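In the finite-dimensional case the weeding-out argument of Theorem 2.6 is effective. The sketch below (ours, not the book's) extracts a basis A_0 from a spanning subset A of Q^3 by keeping each vector only if it is not a linear combination of the vectors already kept:

```python
from fractions import Fraction

# Finite-dimensional sketch of Theorem 2.6: greedily weed a linearly
# independent spanning subset (a Hamel basis) out of a spanning set A.
def extract_basis(vectors):
    basis = []   # reduced copies, used for elimination
    kept = []    # the chosen original vectors: A_0 as a subset of A
    for v in vectors:
        r = list(map(Fraction, v))
        for b in basis:
            p = next(i for i, x in enumerate(b) if x)   # pivot of b
            if r[p]:
                f = r[p] / b[p]
                r = [ri - f * bi for ri, bi in zip(r, b)]
        if any(r):               # v is independent of the kept vectors
            basis.append(r)
            kept.append(v)
    return kept

A = [(1, 0, 0), (2, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1)]  # spans Q^3
A0 = extract_basis(A)
assert A0 == [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # a Hamel basis, A0 in A
```

The infinite-dimensional theorem, of course, needs Zorn's Lemma rather than a finite loop; the code only illustrates the idea of discarding each vector that the previously kept ones already span.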
Theorem 2.7. Every Hamel basis for a given linear space X has the same cardinality.
Proof. If X = {0}, then the result holds trivially. Suppose X ≠ {0} and let B and C be arbitrary Hamel bases for X (so that they are nonempty and do not contain the origin). Proposition 2.3 ensures that for every nonzero vector x ∈ X there exists a unique finite subset of the Hamel basis C, say C_x, such that x is a linear combination of all vectors in C_x ⊆ C. Now take an arbitrary c ∈ C and consider the unique representation of it as a linear combination of vectors in the Hamel basis B. Thus c is a linear combination of all vectors in {b} ∪ B' for some (nonzero) b ∈ B and some finite subset B' of B. Hence c = βb + d, where β is a nonzero scalar and d is a vector in X different from c (for βb ≠ 0). If d = 0, then c = βb so that C_b = {c}, and hence c ∈ C_b trivially. Suppose d ≠ 0. Recalling again that C also is a Hamel basis for X, consider the unique representation of the nonzero vector d as a linear
combination of vectors in C, so that fib = c - d 96 0 is a linear combination of vectors in C. Thus b is itself a linear combination of all vectors in (c) U C' for some subset C' of C. Since such a representation is unique, it follows that (c) U C' = Cb. Thereforec E Cb. Summing up: For every c E C there exists b E B such that c E Cb. Thus
C ⊆ ⋃_{b∈B} C_b.
Now we shall split the proof into two parts: one dealing with the case of finite Hamel bases, and the other with infinite Hamel bases.
Claim 0. Let X be a linear space. If a subset E of X with exactly n elements spans X, then every subset of X with more than n elements is not linearly independent.
Proof. Assume the linear space X is nonzero (i.e., X ≠ {0}) to avoid trivialities. Take an integer n ∈ N and let E = {e_i}_{i=1}^n be a subset (with n distinct elements) of X such that span E = X. Now take an arbitrary subset of X with n + 1 elements, say D = {d_i}_{i=1}^{n+1}. Suppose D is linearly independent. Next consider the set S₁ = {d₁} ∪ E, which clearly spans X (because E already does it). Since span E = X, it follows that d₁ is a linear combination of vectors in E. Moreover, d₁ ≠ 0 because D is linearly independent. Thus d₁ = Σ_{i=1}^n α_i e_i, where at least one, say α_k, of the scalars {α_i}_{i=1}^n is nonzero. Therefore, if we delete e_k from S₁, then the resulting set

S₁′ = S₁\{e_k} = {d₁} ∪ E\{e_k}

still spans X. That is, in forming this new set S₁′ that spans X we have traded off one vector in D for one vector in E. Rename the elements of S₁′ by setting s_i = e_i for each i ≠ k and s_k = d₁, so that S₁′ = {s_i}_{i=1}^n. Since D has at least two elements, set

S₂ = {d₂} ∪ S₁′ = {d₁, d₂} ∪ E\{e_k}
which again spans X (for S₁′ spans X). Since span S₁′ = X, it follows that d₂ is a linear combination of vectors in S₁′, say d₂ = Σ_{i=1}^n β_i s_i for some family of scalars {β_i}_{i=1}^n. Moreover, 0 ≠ d₂ ≠ β_k s_k = β_k d₁ because D is linearly independent. Thus there exists at least one nonzero scalar in {β_i}_{i=1}^n with index different from k, say β_j. Therefore, if we delete s_j from S₂ (recall: s_j = e_j ≠ e_k), then the resulting set

S₂′ = S₂\{e_j} = {d₁, d₂} ∪ E\{e_k, e_j}

still spans X. Continuing this way we eventually get down to the set

S_n′ = {d_i}_{i=1}^n ∪ E\{e_i}_{i=1}^n = D\{d_{n+1}},

which once again spans X. Thus d_{n+1} is a linear combination of vectors in D\{d_{n+1}}, which contradicts the assumption that D is linearly independent. Conclusion: Every subset of X with n + 1 elements is not linearly independent. Recalling that every subset of a linearly independent set is again linearly independent, it follows that every subset of X with more than n elements is not linearly independent. □
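The exchange argument of Claim 0 has a concrete computational shadow: in a space spanned by n vectors, any family of more than n vectors is detected as dependent by a rank computation. A minimal Python sketch (illustrative only; the helper `rank` is ours, not the book's, and uses exact rational arithmetic):

```python
# A numerical illustration (not the book's proof) of Claim 0 for n = 3: any
# 4 vectors in R^3 = span{e_1, e_2, e_3} are linearly dependent, detected here
# by Gaussian elimination over exact rationals.

from fractions import Fraction

def rank(rows):
    """Row rank of a matrix given as a list of lists of Fractions."""
    rows = [r[:] for r in rows]
    r = 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

D = [[Fraction(v) for v in row] for row in
     [[1, 0, 2], [0, 1, 1], [1, 1, 0], [2, 3, 1]]]   # four vectors in R^3
assert rank(D) <= 3          # rank cannot exceed the number of spanning vectors
assert rank(D) < len(D)      # hence D is not linearly independent
```

Here the rank bound plays the role of Claim 0: four vectors in a space spanned by three vectors cannot be linearly independent.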
Claim 1. If B is finite, then #C = #B.
Proof. Recall that C_b is finite for every b in B. If B is finite, then ⋃_{b∈B}C_b is a finite union of finite sets, and hence any subset of it is finite. In particular, C is finite. Since C is linearly independent, it follows by Claim 0 that #C ≤ #B. Dually (swap the Hamel bases B and C), #B ≤ #C. Hence #C = #B. □

Claim 2. If B is infinite, then #C = #B.
Proof. Since B is infinite, and since C_b is finite for every b in B, it follows that #C_b ≤ #B for all b in B. Thus, according to Theorems 1.10 and 1.9,

#(⋃_{b∈B} C_b) ≤ #(B×B) = #B

because B is infinite. Therefore #C ≤ #B (recall that C ⊆ ⋃_{b∈B}C_b and use Problems 1.21(a) and 1.22). Moreover, Claim 1 says that B is finite whenever C is finite. Thus C must be infinite because B is infinite. Since C is infinite we may reverse the argument (swapping again the Hamel bases B and C) and get #B ≤ #C. Hence #C = #B by the Cantor-Bernstein Theorem (Theorem 1.6). □

Claims 1 and 2 ensure that, if B and C are Hamel bases for a linear space X, then B and C have the same cardinal number. Such an invariant (i.e., the cardinality of any Hamel basis) is called the dimension (or the linear dimension) of the linear space X, denoted by dim X. Thus

dim X = #B
for any Hamel basis B for X. If the dimension of X is finite (equivalently, if any Hamel basis for X is a finite set) then we say that X is a finite-dimensional linear space. Otherwise (i.e., if any Hamel basis for X is an infinite set) we say that X is an infinite-dimensional linear space.
Example 2I. The Kronecker delta (or Kronecker function) is the mapping δ in {0, 1}^{Z×Z} (i.e., the function from Z×Z to {0, 1}) defined by

δ_{ij} = 1 if i = j  and  δ_{ij} = 0 if i ≠ j,

for all integers i, j. Now consider the linear space F^n (for an arbitrary positive integer n, over an arbitrary field F - see Example 2D). The subset B of F^n consisting of the n-tuples e_i = (δ_{i1}, ..., δ_{in}), with 1 at the ith position and zeros elsewhere, constitutes a Hamel basis for F^n. This is called the canonical basis (or the natural basis) for F^n. Thus dim F^n = n. As we shall see later, F^n in fact is a prototype for every finite-dimensional linear space (of dimension n) over a field F.

Example 2J. Let F^N be the linear space (over a field F) of all scalar-valued sequences
(see Example 2E), and let X be the subset of F^N defined as follows: x = {ξ_k}_{k∈N} belongs to X if and only if ξ_k = 0 except for some finite set of indices k in N. That is, X is the set consisting of all F-valued sequences with a finite number of nonzero entries, which clearly is a linear manifold of F^N, and hence a linear space itself over F. For each integer i ∈ N let e_i be the F-valued sequence with just one nonzero entry (equal to 1) at the ith position; that is, e_i = {δ_{ik}}_{k∈N} ∈ X for every i ∈ N. Now set B = {e_i}_{i∈N} ⊆ X. It is readily verified that B is linearly independent and that span B = X (every vector in X is a linear combination of vectors in B). Thus B is a Hamel basis for X. Since B is countably infinite, X is an infinite-dimensional linear space with dim X = ℵ₀. Therefore (see Problem 2.6(b)), F^N is an infinite-dimensional linear space. Note that B is not a Hamel basis for F^N (reason: span B = X and X is properly included in F^N). The next example shows that ℵ₀ < dim F^N whenever F = Q, F = R, or F = C.
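Examples 2I and 2J are easy to mimic computationally: the canonical basis vectors are built from the Kronecker delta, and every vector is recovered as a finite linear combination of them. A minimal Python sketch (all names are ours; vectors in F^n are modeled as tuples):

```python
# A sketch of Examples 2I/2J (illustrative code, not from the book).

def delta(i, j):
    """Kronecker delta: 1 if i == j, 0 otherwise."""
    return 1 if i == j else 0

def canonical_basis(n):
    """The canonical (natural) basis e_1, ..., e_n for F^n."""
    return [tuple(delta(i, j) for j in range(1, n + 1)) for i in range(1, n + 1)]

def expand(x):
    """Nonzero coordinates of x with respect to the canonical basis."""
    return {i: x[i - 1] for i in range(1, len(x) + 1) if x[i - 1] != 0}

B = canonical_basis(4)
x = (2.0, 0.0, -1.0, 3.0)
coords = expand(x)

# x is the finite linear combination sum_i coords[i] * e_i:
recombined = [0.0] * 4
for i, c in coords.items():
    e = B[i - 1]
    recombined = [r + c * ei for r, ei in zip(recombined, e)]
recombined = tuple(recombined)
```

The dictionary `coords` records only the finitely many nonzero coordinates, which is exactly the finite-support condition defining the space X of Example 2J.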
Example 2K. Let C^N be the complex linear space of all complex-valued sequences. For each real number t ∈ (0, 1) consider the real sequence

x_t = {t^{k−1}}_{k∈N} = {t^k}_{k∈N₀} = (1, t, t², ...) ∈ C^N

whose entries are the nonnegative powers of t. Set A = {x_t : t ∈ (0, 1)} ⊆ C^N. We claim that A is linearly independent. A bit of elementary real analysis (rather than pure algebra) supplies a very simple proof as follows. Suppose A is not linearly independent. Then there exists s ∈ (0, 1) such that x_s is a linear combination of vectors in A\{x_s}. That is, x_s = Σ_{i=1}^n α_i x_{t_i} for some n ∈ N, where {α_i}_{i=1}^n is a family of nonzero complex numbers and {x_{t_i}}_{i=1}^n is a (finite) subset of A such that x_{t_i} ≠ x_s for every i = 1, ..., n. Hence n > 1 (reason: if n = 1, then x_s = α₁x_{t₁}, so that s^k = α₁t₁^k for every k ∈ N₀, which implies that x_s = x_{t₁}). As the set {t_i}_{i=1}^n consists of distinct points from (0, 1), suppose it is decreasingly ordered (reorder it if necessary) so that t_i < t₁ for each i = 2, ..., n. Since s^k = Σ_{i=1}^n α_i t_i^k, it follows that (s/t₁)^k = α₁ + Σ_{i=2}^n α_i (t_i/t₁)^k for every k ∈ N₀. However, lim_k Σ_{i=2}^n α_i (t_i/t₁)^k = 0, because each t_i/t₁ lies in (0, 1), and hence lim_k (s/t₁)^k = α₁. Thus α₁ = 0 (recall: x_s ≠ x_{t₁}, so that s ≠ t₁), which is a contradiction. Conclusion: A is linearly independent. Therefore, according to
Theorem 2.5, there exists a Hamel basis B for C^N including A. Since A ⊆ B and #A = #(0, 1) = 2^ℵ₀, it follows that 2^ℵ₀ ≤ #B. However, #C = #R = 2^ℵ₀ ≤ #B = dim C^N, so that #C^N = dim C^N (see Problem 2.8). Conclusion: C^N is an infinite-dimensional linear space such that

2^ℵ₀ ≤ dim C^N = #C^N.

Note that the whole argument applies for C replaced by R, so that

2^ℵ₀ ≤ dim R^N = #R^N;

but it does not apply to the rational field Q (the interval (0, 1) is not a subset of Q, and hence the set A is not included in Q^N). However, the final conclusion does hold for the linear space Q^N. Indeed, if F is an arbitrary infinite field, then 2^ℵ₀ = #2^N ≤ #F^N = max{#F, dim F^N} according to Problems 1.24 and 2.8. Therefore, since #Q = ℵ₀ < 2^ℵ₀ (Problem 1.25(c)), it follows that

2^ℵ₀ ≤ dim Q^N = #Q^N.
2.5 Linear Transformations
A mapping L : X → Y of a linear space X over a field F into a linear space Y over the same field F is homogeneous if

L(αx) = αLx

for every vector x ∈ X and every scalar α ∈ F. The scalar multiplication on the left-hand side is an operation on X and that on the right-hand side is an operation on Y (so that the linear spaces X and Y must indeed be defined over the same field F). L is additive if

L(x₁ + x₂) = L(x₁) + L(x₂)

for all vectors x₁, x₂ in X. Again, the vector addition on the left-hand side is an operation on X while the one on the right-hand side is an operation on Y. If X and Y are linear spaces over the same scalar field, and if L is a homogeneous and additive mapping of X into Y, then L is a linear transformation: a linear transformation is a homogeneous and additive mapping between linear spaces over the same scalar field. When we say that L : X → Y is a linear transformation, it is implicitly assumed that X and Y are linear spaces over the same field F. If X = Y and L : X → X is a linear transformation, then we refer to L as a linear transformation on X. Trivial example: The identity I : X → X (such that I(x) = x for every x ∈ X) is a linear transformation on X. Recall that a field F can be made into a linear space over F itself (see Example 2D). If X is a linear space over F, then a linear transformation f : X → F is called a linear functional: a linear functional is a scalar-valued linear transformation (i.e., a linear transformation of a linear space X into its scalar field).
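The two defining identities, homogeneity and additivity, are straightforward to spot-check numerically for a concrete map. A minimal Python sketch (the sample map and the helper names are ours):

```python
# A numerical check (illustrative, not from the book) that a given map
# on R^2 is homogeneous and additive, hence a linear transformation.

def L(x):
    """A candidate map R^2 -> R^2: reflection through the line y = x."""
    return (x[1], x[0])

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def scale(a, x):
    return tuple(a * c for c in x)

def looks_linear(T, samples, scalars):
    """Spot-check homogeneity T(ax) = aT(x) and additivity T(x+y) = T(x)+T(y)."""
    for x in samples:
        for y in samples:
            if T(add(x, y)) != add(T(x), T(y)):
                return False
        for a in scalars:
            if T(scale(a, x)) != scale(a, T(x)):
                return False
    return True

samples = [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0)]
assert looks_linear(L, samples, [2.0, -1.0, 0.0])
# A non-linear map fails the check, e.g. a translation:
assert not looks_linear(lambda x: (x[0] + 1.0, x[1]), samples, [2.0])
```

Of course such sampling can only refute linearity, never prove it; in finite dimensions linearity on a basis suffices, as Proposition 2.11 below makes precise.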
If y ∈ Y is the value of a linear transformation L : X → Y at x ∈ X, then we shall often write y = Lx (instead of y = L(x)). Since Y is a linear space, it has an origin. The null space (or kernel) of a linear transformation L : X → Y is the subset

N(L) = {x ∈ X : Lx = 0} = L⁻¹({0})

of X consisting of all vectors in X mapped into the origin of Y by L. Since X also is a linear space, it has an origin too. The origin of X is always in N(L) (i.e., L0 = 0 for every linear transformation L). The null transformation (denoted by O) is the mapping O : X → Y such that Ox = 0 for every x ∈ X, which certainly is a linear transformation. In fact, if L : X → Y is a linear transformation, then L = O if and only if N(L) = X. Equivalently, L = O if and only if R(L) = {0}. The null space, N(L) = L⁻¹({0}), of any linear transformation L : X → Y is a linear manifold of X, and the range of L, R(L) = L(X), is a linear manifold of Y (see Problem 2.10). These are indeed particular cases of Problem 2.11: The linear image of a linear manifold is a linear manifold, and the inverse image of a linear manifold under a linear transformation is again a linear manifold. The theorem below supplies an elegant and useful, although very simple, necessary and sufficient condition that a linear transformation be injective.
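For a matrix-induced map in finite dimensions, a nontrivial null space can be exhibited concretely, and it forces a failure of injectivity. A small Python illustration (restricted to the 2×2 case; all names are ours):

```python
# A finite-dimensional sketch (illustrative code, not the book's): a matrix
# map on R^2 has nontrivial null space exactly when its determinant is zero,
# and a nonzero kernel vector produces two distinct points with equal images.

def apply(M, x):
    """Matrix-vector product for a 2x2 matrix M and a pair x."""
    return (M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1])

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

singular = ((1.0, 2.0), (2.0, 4.0))       # rows proportional: det = 0
assert det2(singular) == 0.0
# (2, -1) is a nonzero vector of the null space N(L) ...
assert apply(singular, (2.0, -1.0)) == (0.0, 0.0)
# ... so the map cannot be injective: two distinct vectors, one image.
assert apply(singular, (1.0, 1.0)) == apply(singular, (3.0, 0.0))

invertible = ((1.0, 1.0), (0.0, 1.0))     # det = 1, so N(L) = {0}
assert det2(invertible) == 1.0
```

This is precisely the dichotomy of Theorem 2.8 below, specialized to matrices.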
Theorem 2.8. A linear transformation L is injective if and only if N(L) = {0}.

Proof. Let X and Y be linear spaces over the same scalar field, and consider a linear transformation L : X → Y. If L is injective, then L⁻¹(L({0})) = {0} (see Problem 1.3(d)). But L({0}) = {0} (for L0 = 0), so that L⁻¹({0}) = {0}, which means N(L) = {0}. On the other hand, suppose N(L) = {0}. Take x₁ and x₂ arbitrary in X, and note that Lx₁ − Lx₂ = L(x₁ − x₂) since L is linear. Thus, if Lx₁ = Lx₂, then L(x₁ − x₂) = 0 and hence x₁ = x₂ (i.e., x₁ − x₂ = 0 because N(L) = {0}). Therefore L is injective. □

The collection Y^X of all mappings of a set X into a linear space Y over a field F is itself a linear space over F (see Example 2F). Now suppose X is a linear space (over the same field F), and let L[X, Y] denote the collection of all linear transformations of X into Y. Since L[X, Y] is a linear manifold of Y^X (Problem 2.13), it follows that L[X, Y] is a linear space over the same field F. Set L[X] = L[X, X] for short, so that L[X] ⊆ X^X is the linear space of all linear transformations on X. The linear space L[X, F] of all linear functionals defined on a linear space X, which is a linear manifold of the linear space F^X (see Example 2E), is called the algebraic dual (or algebraic conjugate) of X and denoted by X′. (Dual spaces will be considered in Chapter 4.) Let X and Y be linear spaces over the same scalar field, and let L|_M : M → Y be the restriction of a linear transformation L : X → Y to a linear manifold M of X. Since M is a linear space, it is readily verified that L|_M is a linear transformation. Briefly: The restriction of a linear transformation to a linear manifold is again a linear transformation (Problem 2.14). The next result ensures the converse: If L ∈
L[M, Y] and M is a linear manifold of X, then there exists T ∈ L[X, Y] such that L = T|_M, which is called a linear extension of L over X.

Theorem 2.9. Let X and Y be linear spaces over the same field F, and let M be a linear manifold of X. If L : M → Y is a linear transformation, then there exists a linear extension T : X → Y of L defined on the whole space X.

Proof. Set

K = {K ∈ L[N, Y] : N ∈ Lat(X), M ⊆ N and L = K|_M},

the collection of all linear transformations from linear manifolds of X to Y that are extensions of L. Note that K is nonempty (at least L is there). Moreover, as a subcollection of F = ⋃_{A∈℘(X)} Y^A, K is partially ordered in the extension ordering (see Problem 1.17). Problem 1.17 also tells us that every chain {K_γ} in K has a supremum ⋁_γK_γ in F with domain D(⋁_γK_γ) = ⋃_γD(K_γ) and range R(⋁_γK_γ) = ⋃_γR(K_γ). Since D(K_γ) ∈ Lat(X) (each K_γ is a linear transformation defined on a linear manifold of X), and since Lat(X) is a complete lattice, it follows that D(⋁_γK_γ) is a linear manifold of X (i.e., ⋃_γD(K_γ) ∈ Lat(X)). Similarly, R(⋁_γK_γ) is a linear manifold of Y.

Claim. The supremum ⋁_γK_γ lies in K.

Proof. Take u and v arbitrary in D(⋁_γK_γ), so that u ∈ D(K_λ) for some K_λ in {K_γ} and v ∈ D(K_μ) for some K_μ in {K_γ}. Since {K_γ} is a chain, it follows that K_λ ≤ K_μ (or vice versa), so that D(K_λ) ⊆ D(K_μ). Thus αu + βv ∈ D(K_μ), and hence K_μ(αu + βv) = αK_μu + βK_μv for every α, β ∈ F (recall: each K_γ is linear). However, (⋁_γK_γ)|_{D(K_μ)} = K_μ, which implies that (⋁_γK_γ)(αu + βv) = α(⋁_γK_γ)u + β(⋁_γK_γ)v. That is, ⋁_γK_γ : D(⋁_γK_γ) → Y is linear. Moreover, since each K_γ is such that K_γ|_M = L, and since {K_γ} is a chain, it follows that (⋁_γK_γ)|_M = L. Conclusion: ⋁_γK_γ ∈ K. □

Therefore, every chain in K has a supremum (and so an upper bound) in K. Thus, according to Zorn's Lemma, K contains a maximal element, say K₀ : N₀ → Y. We shall show that N₀ = X, and hence K₀ is a linear extension of L over X. The proof goes by contradiction. Suppose N₀ ≠ X. Take x₁ ∈ X\N₀ (so that x₁ ≠ 0 because N₀ is a linear manifold of X) and consider the sum of N₀ and the one-dimensional linear manifold of X spanned by {x₁},

N₁ = N₀ + span{x₁},

which is a linear manifold of X properly including M (because M ⊆ N₀ ⊂ N₁). Since N₀ ∩ span{x₁} = {0}, it follows that every x in N₁ has a unique representation as a sum of a vector in N₀ and a vector in span{x₁}. That is, for each x ∈ N₁ there exists a unique pair (x₀, α) in N₀×F such that x = x₀ + αx₁. (Indeed, if

x = x₀ + αx₁ = x₀′ + α′x₁, then x₀ − x₀′ = (α′ − α)x₁ ∈ N₀ ∩ span{x₁} = {0}
so that x₀′ = x₀ and α′ = α - recall: x₁ ≠ 0.) Take an arbitrary y in Y (for instance, y = 0) and consider the mapping K₁ : N₁ → Y defined by

K₁x = K₀x₀ + αy

for every x ∈ N₁. Observe that K₁ is linear (it inherits the linearity of K₀) and K₀ = K₁|_{N₀} (so that K₀ ≤ K₁). Since M ⊆ N₀ ⊂ N₁, it follows that L = K₀|_M = K₁|_M. Thus K₁ ∈ K, which contradicts the fact that K₀ is maximal in K (for K₀ ≠ K₁). Therefore N₀ = X. □
Let X and Y be nonzero linear spaces over the same field. Take x ≠ 0 in X and y ≠ 0 in Y, set M = span{x} in Lat(X), and let L : M → Y be defined by Lu = αy for every u = αx ∈ M. Clearly, L is linear and L ≠ O. Thus Theorem 2.9 ensures that, if X and Y are nonzero linear spaces over the same field, then there exist many T ≠ O in L[X, Y] (at least as many as one-dimensional linear manifolds in Lat(X)).
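The one-step extension used in the proof of Theorem 2.9 can be imitated in a two-dimensional toy case, where Zorn's Lemma is unnecessary: split x uniquely as x₀ + αx₁ and send x₁ to an arbitrarily prescribed value. A Python sketch (illustrative; the choice of M, x₁ and the prescribed value are ours):

```python
# A finite-dimensional sketch (illustrative, not the book's construction) of
# the one-step extension in the proof of Theorem 2.9: L is given on
# M = span{(1,1)} in R^2 and extended to all of R^2 by prescribing a value y1
# on a vector x1 outside M.

def L_on_M(u):
    """L(a*(1,1)) = 3a -- a linear functional defined only on M."""
    a = u[0]
    assert u == (a, a), "u must lie in M = span{(1,1)}"
    return 3.0 * a

x1 = (1.0, 0.0)       # a vector outside M
y1 = 0.0              # arbitrary prescribed value T(x1) (y = 0, as in the proof)

def T(x):
    """Every x in R^2 is uniquely x0 + a*x1 with x0 = (x[1], x[1]) in M."""
    a = x[0] - x[1]
    x0 = (x[1], x[1])
    return L_on_M(x0) + a * y1

# T extends L: they agree on M ...
assert T((2.0, 2.0)) == L_on_M((2.0, 2.0)) == 6.0
# ... and T is defined on the whole space.
assert T((5.0, 2.0)) == 6.0
```

The uniqueness of the pair (x₀, α) is exactly what makes `T` well defined, mirroring the argument N₀ ∩ span{x₁} = {0} in the proof.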
2.6 Isomorphisms
Two exemplars of a mathematical structure are indistinguishable, in the context of the theory in which that structure is embedded, if there exists a one-to-one correspondence between them that preserves such a structure. This is a central concept in mathematics. From the point of view of linear space theory, two linear spaces are essentially the same if there exists a one-to-one correspondence between them that preserves all the linear relations - they may differ in the set-theoretic nature of their elements but, as far as the linear space (algebraic) structure is concerned, they are indistinguishable. In other words, two linear spaces X and Y over the same scalar field are regarded as essentially the same linear space if there exists a one-to-one correspondence between them that preserves vector addition and scalar multiplication. That is, if there exists at least one invertible linear transformation from X to Y whose inverse from Y to X also is linear. The theorem below shows that the inverse of an invertible linear transformation is always linear.
Theorem 2.10. Let X and Y be linear spaces over F. If L : X → Y is an invertible linear transformation, then its inverse L⁻¹ : Y → X is a linear transformation.

Proof. Recall that, by definition, a function is invertible if it is injective and surjective. Take y₁ and y₂ arbitrary in Y so that there exist x₁ and x₂ in X such that y₁ = Lx₁ and y₂ = Lx₂ (for Y = R(L) - i.e., L is surjective). Since L is injective (i.e., L⁻¹L is the identity on X - see Problems 1.5 and 1.7) and additive, it follows that

L⁻¹(y₁ + y₂) = L⁻¹(Lx₁ + Lx₂) = L⁻¹L(x₁ + x₂) = x₁ + x₂ = L⁻¹Lx₁ + L⁻¹Lx₂ = L⁻¹y₁ + L⁻¹y₂,
and hence L⁻¹ is additive. Similarly, since L is injective and homogeneous,

L⁻¹(αy) = L⁻¹(αLx) = L⁻¹L(αx) = αx = αL⁻¹Lx = αL⁻¹y

for every y ∈ Y = R(L) and every α ∈ F, which implies that L⁻¹ is homogeneous. Thus L⁻¹ is a linear transformation. □

An isomorphism between linear spaces (over the same scalar field) is an injective and surjective linear transformation; equivalently, an invertible linear transformation. Two linear spaces X and Y over the same field F are isomorphic if there exists an isomorphism (i.e., a linear one-to-one correspondence) of X onto Y. Thus, according to Theorem 2.8, a linear transformation L : X → Y of a linear space X into a linear space Y is an isomorphism if and only if N(L) = {0} and R(L) = Y. In particular, if N(L) = {0}, then X and the range of L (R(L) = L(X)) are isomorphic linear spaces. We noticed in Example 2I that F^n is a "prototype" for every n-dimensional linear space over F. What this really means is that every n-dimensional linear space over a field F is isomorphic to F^n, and hence two n-dimensional linear spaces over the same scalar field are isomorphic. In fact, such an isomorphism between linear spaces with the same dimension holds in general, for both finite and infinite-dimensional linear spaces. We shall prove this below (Theorem 2.12), but first we need the following auxiliary result.
Proposition 2.11. Let X and Y be linear spaces over the same field, and let B be a Hamel basis for X. For each mapping F : B → Y there exists a unique linear transformation T : X → Y such that T|_B = F.
Proof. If B = {x_γ}_{γ∈Γ} is a Hamel basis for X, indexed by an index set Γ (recall: any set can be thought of as an indexed set), then every vector x in X has a unique expansion on B, viz.,

x = Σ_{γ∈Γ} α_γ x_γ,

where {α_γ}_{γ∈Γ} is a similarly indexed family of scalars with α_γ = 0 for all but a finite set of indices γ (the coordinates of x with respect to the indexed basis B). Now set

Tx = Σ_{γ∈Γ} α_γ F(x_γ)

for every x ∈ X. This defines a mapping T : X → Y of X into Y which is homogeneous, additive, and equals F when restricted to B. That is, T is a linear transformation such that T|_B = F. Moreover, if L : X → Y is a linear transformation of X into Y such that L|_B = F, then L = T. Indeed, for every x ∈ X,

Lx = L(Σ_{γ∈Γ} α_γ x_γ) = Σ_{γ∈Γ} α_γ F(x_γ) = T(Σ_{γ∈Γ} α_γ x_γ) = Tx. □
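Proposition 2.11 is constructive in finite dimensions: prescribing F on a basis determines T everywhere by linearity. A minimal Python sketch (the basis and the prescribed values are our choices, not the book's):

```python
# A sketch of Proposition 2.11 in R^2: a map F on the Hamel basis
# B = {(1,0), (0,1)} extends uniquely by linearity to all of R^2.

B = [(1.0, 0.0), (0.0, 1.0)]
F = {B[0]: (1.0, 2.0), B[1]: (3.0, -1.0)}   # arbitrary values in Y = R^2

def T(x):
    """Tx = sum_g a_g F(x_g), with a_g the coordinates of x on B."""
    coords = [x[0], x[1]]        # coordinates w.r.t. the canonical basis B
    out = (0.0, 0.0)
    for a, b in zip(coords, B):
        fb = F[b]
        out = (out[0] + a * fb[0], out[1] + a * fb[1])
    return out

# T restricted to B equals F ...
assert T(B[0]) == F[B[0]] and T(B[1]) == F[B[1]]
# ... and T((2,1)) = 2*F(e1) + 1*F(e2) = (5, 3).
assert T((2.0, 1.0)) == (5.0, 3.0)
```

Uniqueness is visible here too: any linear map agreeing with F on B must produce the same value on every coordinate expansion, which is exactly the chain of equalities closing the proof.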
Theorem 2.12. Let X and Y be linear spaces over the same scalar field. X and Y are isomorphic if and only if dim X = dim Y.

Proof. (a) Let L : X → Y be an isomorphism of X onto Y, and let B_X be a Hamel basis for X. Set B_Y = L(B_X), a subset of Y.

Claim 1. B_Y is linearly independent.

Proof. Recall that L is an injective and surjective linear transformation. If B_Y is not linearly independent, then there exists y ∈ B_Y which is a linear combination of vectors in B_Y\{y}, say y = Σ_{i=1}^n α_i y_i where each y_i is a vector in B_Y\{y}. Thus x = L⁻¹y in B_X = L⁻¹(B_Y) is a linear combination of vectors in B_X\{x}. (Indeed, x = Σ_{i=1}^n α_i x_i, where each x_i = L⁻¹y_i is a vector in B_X = L⁻¹(B_Y) different from x = L⁻¹y - recall: each y_i is a vector in B_Y different from y, and L is injective.) But this contradicts the fact that B_X is linearly independent. Conclusion: B_Y is linearly independent. □
Claim 2. B_Y spans Y.

Proof. Take y ∈ Y arbitrary so that y = Lx for some x ∈ X (because L is surjective). Since span B_X = X, it follows that x is a linear combination of vectors in B_X. Hence y = Lx is a linear combination of vectors in B_Y = L(B_X) (since L is linear), so that span B_Y = Y. □

Therefore, B_Y is a Hamel basis for Y. Moreover, #B_Y = #B_X because L sets a one-to-one correspondence between B_X and B_Y. (In fact, the restriction L|_{B_X} : B_X → B_Y is injective and surjective, since L is injective and B_Y = L(B_X) by definition.) Thus dim X = dim Y.

(b) Let B_X and B_Y be Hamel bases for X and Y, respectively. If dim X = dim Y, then #B_Y = #B_X, which means that there exists a one-to-one mapping F : B_X → B_Y of B_X onto B_Y. Let T : X → Y be the unique linear transformation such that T|_{B_X} = F (see Proposition 2.11), and hence T(B_X) = F(B_X) = B_Y.

Claim 3. T is injective.
Proof. If X = {0}, then the result holds trivially. Thus suppose X ≠ {0}. Take any nonzero vector x in X and consider its (unique) representation as a linear combination of vectors in B_X. Therefore, Tx has a representation as a linear combination of vectors in B_Y = T(B_X) because T is linear. Since B_Y is linearly independent, it follows that Tx ≠ 0. That is, N(T) = {0}, which means, by Theorem 2.8, that T is injective. □
Claim 4. T is surjective.

Proof. Take any vector y ∈ Y and consider its expansion on B_Y, say y = Σ_{i=1}^m α_i y_i with each y_i in B_Y. Thus y = Σ_{i=1}^m α_i T(x_i) with each x_i in B_X because B_Y = T(B_X). Since T is linear, it follows that y = T(Σ_{i=1}^m α_i x_i), where Σ_{i=1}^m α_i x_i is a vector in X (since X is a linear space). Hence y ∈ R(T). □
Therefore, T : X → Y is an isomorphism of X onto Y. □
Example 2L. Let X and Y be finite-dimensional linear spaces over the same field, with dim X = n and dim Y = m. Let B_X = {x_j}_{j=1}^n and B_Y = {y_i}_{i=1}^m be Hamel bases for X and Y, respectively. Take an arbitrary vector x in X and consider its unique expansion on B_X:

x = Σ_{j=1}^n ξ_j x_j,

where the family of scalars {ξ_j}_{j=1}^n consists of the coordinates of x with respect to B_X. Now let A : X → Y be any linear transformation, so that

Ax = Σ_{j=1}^n ξ_j Ax_j.

Each Ax_j is a vector in Y. Consider its unique expansion on B_Y:

Ax_j = Σ_{i=1}^m α_{ij} y_i,

where, for each j, {α_{ij}}_{i=1}^m is a family of scalars - the coordinates of each Ax_j with respect to B_Y. Set y = Ax in Y and consider the unique expansion of y on B_Y:

y = Σ_{i=1}^m υ_i y_i.

Again, {υ_i}_{i=1}^m is a family of scalars consisting of the coordinates of y with respect to B_Y. Thus the identity y = Ax can be written as

Σ_{i=1}^m υ_i y_i = Σ_{i=1}^m (Σ_{j=1}^n α_{ij} ξ_j) y_i.

Since the expansion of y on B_Y is unique, it follows that

υ_i = Σ_{j=1}^n α_{ij} ξ_j

for every i = 1, ..., m. This gives an expression for each coordinate of Ax as a function of the coordinates of x. In terms of standard matrix notation, and according to the ordinary matrix operations, the matrix equation

[υ_1]   [α_11 ... α_1n] [ξ_1]
[ :  ] = [  :        :  ] [ :  ]
[υ_m]   [α_m1 ... α_mn] [ξ_n]
represents the identity y = Ax (the vector y is the value of the linear transformation A at the point x), and the m×n array of scalars

        [α_11 ... α_1n]
[A]  =  [  :        :  ]  =  [α_{ij}]
        [α_m1 ... α_mn]

is the matrix that represents the linear transformation A : X → Y with respect to the bases B_X and B_Y. The matrix [A] of a linear transformation A depends on the bases B_X and B_Y. If we change the bases, then the matrix that represents the linear transformation may change as well. Different matrices representing the same linear transformation are simply different representations of it with respect to different bases. However, if we fix the bases B_X and B_Y, then the representation [A] of A is unique. But uniqueness is not all. It is easy to show that

(a) the set F^{m×n} of all m×n matrices with entries in F is a linear space over F when equipped with the ordinary (entrywise) operations of matrix addition and scalar multiplication.

Moreover, for fixed bases B_X and B_Y,

(b) F^{m×n} is isomorphic to L[X, Y].

If we fix the bases B_X and B_Y, then the relation between L[X, Y] and F^{m×n} defined by "[A] represents A with respect to B_X and B_Y" in fact is a function from L[X, Y] to F^{m×n}. It is readily verified that such a function, say Φ : L[X, Y] → F^{m×n}, is homogeneous, additive, injective and surjective. In other words, Φ is an isomorphism. For this reason we may and shall identify a linear transformation A ∈ L[F^n, F^m] with its matrix [A] ∈ F^{m×n} relative to the canonical bases for F^n and F^m (which were introduced in Example 2I).
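The recipe υ_i = Σ_j α_{ij} ξ_j can be exercised directly: the jth column of [A] holds the coordinates of Ax_j. A Python sketch for X = Y = R² with canonical bases (the particular map A is ours, chosen only for illustration):

```python
# Building the matrix [A] of a linear transformation from its action on basis
# vectors (a sketch of Example 2L for X = Y = R^2 with canonical bases).

def A(x):
    """A linear map on R^2, given directly as a formula."""
    return (2.0 * x[0] + x[1], x[0] - x[1])

# Column j of [A] holds the coordinates of A(e_j) with respect to B_Y.
e = [(1.0, 0.0), (0.0, 1.0)]
cols = [A(ej) for ej in e]
matA = [[cols[j][i] for j in range(2)] for i in range(2)]  # [A] = [alpha_ij]

def mat_apply(M, xi):
    """v_i = sum_j alpha_ij * xi_j: coordinates of Ax from coordinates of x."""
    return tuple(sum(M[i][j] * xi[j] for j in range(2)) for i in range(2))

x = (3.0, -1.0)
assert mat_apply(matA, x) == A(x)   # the matrix equation reproduces y = Ax
```

With the canonical bases the coordinates of a vector are just its entries, which is why the comparison `mat_apply(matA, x) == A(x)` makes sense without any explicit coordinate maps.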
Example 2M. Let F denote either the real field or the complex field. Take an arbitrary nonnegative integer n, and let P_n[0, 1] be the collection of all polynomials in the variable t ∈ [0, 1] with coefficients in F of degree no greater than n. That is,

P_n[0, 1] = {p ∈ F^{[0,1]} : p(t) = Σ_{i=0}^n α_i t^i, t ∈ [0, 1], with each α_i in F}.

Recall: The degree of a nonzero polynomial p is m if p(t) = Σ_{i=0}^m α_i t^i with α_m ≠ 0 (e.g., the degree of a nonzero constant polynomial is zero), and the degree of the zero polynomial is undefined (thus not greater than any n ∈ N₀). It is readily verified that P_n[0, 1] is a linear manifold of the linear space F^{[0,1]} (see Example 2E), and hence a linear space over F. Now consider the mapping L : F^{n+1} → P_n[0, 1] defined as follows. For each x = (ξ₀, ..., ξ_n) ∈ F^{n+1} let p = Lx in P_n[0, 1] be given by

p(t) = Σ_{i=0}^n ξ_i t^i
for every t ∈ [0, 1]. It is easy to show that L is a linear transformation. Moreover, N(L) = {0} (i.e., if p(t) = Σ_{i=0}^n ξ_i t^i = 0 for every t ∈ [0, 1], then x = (ξ₀, ..., ξ_n) = 0 - a nonzero polynomial has only a finite number of zeros), so that L is injective (see Theorem 2.8). Furthermore, every polynomial p in P_n[0, 1] is of the form p(t) = Σ_{i=0}^n ξ_i t^i for some x = (ξ₀, ..., ξ_n) in F^{n+1}, which means that P_n[0, 1] ⊆ R(L). Hence P_n[0, 1] = R(L); that is, L is also surjective. Therefore, the linear transformation L is an isomorphism between the linear spaces F^{n+1} and P_n[0, 1]. Thus, since dim F^{n+1} = n + 1 (see Example 2I), it follows by Theorem 2.12 that

dim P_n[0, 1] = n + 1.

Next consider the collection P[0, 1] of all polynomials in the variable t ∈ [0, 1] with coefficients in F of any degree:

P[0, 1] = ⋃_{n∈N₀} P_n[0, 1].

Note that P[0, 1] contains the zero polynomial together with every polynomial of finite degree. It is again readily verified that, as a linear manifold of F^{[0,1]}, P[0, 1] is itself a linear space over F. The functions p_i : [0, 1] → F, defined by p_i(t) = t^i for every t ∈ [0, 1], clearly belong to P[0, 1] for each i ∈ N₀. Consider the set B = {p_i}_{i∈N₀} ⊆ P[0, 1]. Since any polynomial in P[0, 1] is, by definition, a (finite) linear combination of vectors in B, it follows that P[0, 1] ⊆ span B. Hence B spans P[0, 1] (i.e., span B = P[0, 1]). We claim that B is also linearly independent. Indeed, suppose B is not linearly independent. Then there exists in B a vector p_k that is a linear combination of vectors in B\{p_k}. That is, p_k = Σ_{j=1}^m α_j p_{i_j} for some integer m ∈ N, where {α_j}_{j=1}^m is a family of nonzero scalars and {p_{i_j}}_{j=1}^m is a finite subset of B such that p_{i_j} ≠ p_k (i.e., i_j ≠ k) for every j = 1, ..., m. Thus p = p_k − Σ_{j=1}^m α_j p_{i_j} is the origin of P[0, 1], which means that

p(t) = t^k − Σ_{j=1}^m α_j t^{i_j} = 0

for all t ∈ [0, 1]. But this is a contradiction because p is a nonzero polynomial of degree equal to max({k} ∪ {i_j}_{j=1}^m) ≥ 1. Conclusion: B is linearly independent. Therefore, the set B = {p_i}_{i∈N₀} is a Hamel basis for P[0, 1], and hence

dim P[0, 1] = ℵ₀

(for #B = #N₀ = ℵ₀). Thus P[0, 1] is isomorphic to the linear space X of all F-valued sequences with a finite number of nonzero entries (which was introduced in Example 2J).
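The isomorphism L of Example 2M is easy to model: a coefficient tuple determines an evaluation function, linearly and injectively. A Python sketch (representing a polynomial by its value function; the names are ours):

```python
# A sketch of the isomorphism L : F^{n+1} -> P_n[0,1] of Example 2M
# (illustrative code, not from the book).

def L(x):
    """Send coefficients (xi_0, ..., xi_n) to the polynomial p(t) = sum xi_i t^i."""
    return lambda t: sum(c * t**i for i, c in enumerate(x))

p = L((1.0, 0.0, -2.0))        # p(t) = 1 - 2t^2
assert p(0.0) == 1.0
assert p(1.0) == -1.0

# Only the zero coefficient tuple yields a polynomial vanishing identically on
# [0, 1] (Theorem 2.8: N(L) = {0}, so L is injective); spot-check for zero:
q = L((0.0, 0.0, 0.0))
assert all(q(t / 10.0) == 0.0 for t in range(11))
```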
2.7 Isomorphic Equivalence
Two linear spaces over the same scalar field are regarded as essentially the same linear space if they are isomorphic. Let X, Y and Z be linear spaces over the same field F. It is clear that X is isomorphic to itself (reflexivity), and Y is isomorphic to X whenever X is isomorphic to Y (symmetry). Moreover, since the composition of two isomorphisms is again an isomorphism (see Problems 1.9(c) and 2.15), it follows that, if X is isomorphic to Y and Y is isomorphic to Z, then X is isomorphic to Z (transitivity). Thus, if the notion of isomorphic linear spaces is restricted to a given set (for instance, to the collection of all linear manifolds Lat(X) of a linear space X), then it is an equivalence relation on that set. We shall now define an equivalence between linear transformations. Recall that GF : X → Z denotes the composition
of a mapping G : Y → Z and a mapping F : X → Y.

Definition 2.13. Let X, X̂, Y and Ŷ be linear spaces over the same scalar field, where X̂ is isomorphic to X and Ŷ is isomorphic to Y. Two linear transformations T : X → Y and L : X̂ → Ŷ are isomorphically equivalent if there exist isomorphisms X : X → X̂ and Y : Y → Ŷ such that

YT = LX.

That is, T = Y⁻¹LX (or, equivalently, L = YTX⁻¹), which means that the diagram

    X --T--> Y
   X|        |Y
    v        v
    X̂ --L--> Ŷ

commutes. Warning: If X̂ is isomorphic to X and Ŷ is isomorphic to Y, then there exists an uncountable supply of isomorphisms between X and X̂ and between Y and Ŷ. If we take arbitrary linear transformations T : X → Y and L : X̂ → Ŷ, it may happen that the above diagram does not commute (i.e., it may happen that YT ≠ LX) for all isomorphisms of X onto X̂ and all isomorphisms of Y onto Ŷ. In this case T and L are not isomorphically equivalent. However, if there exists at least one pair of isomorphisms X and Y for which YT = LX, then T and L are isomorphically equivalent. It is readily verified that isomorphic equivalence deserves its name. In fact, every T ∈ L[X, Y] is isomorphically equivalent to itself (reflexivity), and L ∈ L[X̂, Ŷ] is isomorphically equivalent to T ∈ L[X, Y] whenever T is isomorphically equivalent to L (symmetry). Moreover, if T ∈ L[X, Y] is isomorphically equivalent to L ∈ L[X̂, Ŷ] and L is isomorphically equivalent to K ∈ L[X̃, Ỹ] (so that X, X̂ and X̃ are isomorphic linear spaces, as well as Y, Ŷ and Ỹ), then it is easy to show that T is isomorphically equivalent to K (transitivity). Indeed, if X̂ = X and Ŷ = Y, and if we restrict the concept of isomorphic equivalence to the set L[X, Y] of all linear
transformations of X into Y, then isomorphic equivalence actually is an equivalence relation on L[X, Y].
An important particular case is obtained when X = Y and X̂ = Ŷ, so that T lies in L[X] and L lies in L[X̂]. Let X and X̂ be isomorphic linear spaces. Two linear transformations T : X → X and L : X̂ → X̂ are similar if there exists an isomorphism W : X → X̂ such that

WT = LW.

Equivalently, if there exists an isomorphism W such that the diagram

    X --T--> X
   W|        |W
    v        v
    X̂ --L--> X̂
commutes. It should be noticed now that the concept of similarity will be redefined later in Chapter 4 where the linear spaces are endowed with an additional (topological) structure. Such a redefinition will assume that all linear transformations involved in the definition of similarity are "continuous" (including the inverse of W).
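In the matrix picture, similarity is the familiar relation L = WTW⁻¹ for an invertible W. A Python spot-check for 2×2 matrices (the particular T and W below are our choices, for illustration only):

```python
# Similarity WT = LW checked numerically for 2x2 matrices: W is an invertible
# change of coordinates and L = W T W^{-1}.

def mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

T = ((2.0, 1.0), (0.0, 3.0))
W = ((1.0, 1.0), (0.0, 1.0))          # invertible: det = 1
W_inv = ((1.0, -1.0), (0.0, 1.0))
assert mul(W, W_inv) == ((1.0, 0.0), (0.0, 1.0))

L = mul(mul(W, T), W_inv)             # L = W T W^{-1}
assert mul(W, T) == mul(L, W)         # the diagram commutes: WT = LW
```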
Example 2N. Consider the setup of Example 2L, where X and Y are finite-dimensional linear spaces over the same field F. Let X : X → F^n and Y : Y → F^m be two mappings defined by

Xx = (ξ₁, ..., ξ_n)    and    Yy = (υ₁, ..., υ_m)

for every x ∈ X and every y ∈ Y, where {ξ_j}_{j=1}^n and {υ_i}_{i=1}^m consist of the coordinates of x and y with respect to the bases B_X and B_Y, respectively. It is readily verified that X and Y are both isomorphisms (for fixed bases B_X and B_Y). Let F^{n×1} denote the linear space (over the field F) of all n×1 matrices (or, if you like, the linear space of all "column n-vectors" with entries in F - Example 2L). Now consider the map W_n : F^n → F^{n×1} that assigns to each n-tuple (ξ₁, ..., ξ_n) in F^n the n×1 matrix

W_n(ξ₁, ..., ξ_n) = [ξ₁ ... ξ_n]^T

in F^{n×1} whose entries are the (similarly ordered) coordinates of the ordered n-tuple with respect to the canonical basis for F^n. It is easy to show that W_n is an isomorphism between F^n and F^{n×1}. This is called the natural isomorphism of F^n onto F^{n×1}. Note that any m×n matrix (with entries in F) can be viewed as a linear transformation from F^{n×1} to F^{m×1}: the action of an m×n matrix [α_{ij}] ∈ F^{m×n} on an n×1 matrix
[ξ₁]
[ ⋮ ]
[ξₙ]

in F^{n×1} is simply the matrix product

[α₁₁ … α₁ₙ] [ξ₁]
[ ⋮      ⋮ ] [ ⋮ ]
[αₘ₁ … αₘₙ] [ξₙ],

which is an m×1 matrix in F^{m×1}. According to Example 2L, let [A] ∈ F^{m×n} be the unique matrix representing the linear transformation A ∈ L[X, Y] with respect to the bases B_X and B_Y. Now, if this matrix is viewed as a linear transformation of F^{n×1} into F^{m×1}, then the diagram
     X ----- A -----> Y
     |                |
   W_n X            W_m Y
     |                |
     v                v
  F^{n×1} -- [A] --> F^{m×1}
commutes. This shows that the linear transformation A: X → Y is isomorphically equivalent to its matrix [A] with respect to the bases B_X and B_Y when this matrix is viewed as a linear transformation [A]: F^{n×1} → F^{m×1}. That is,
(W_m Y)A = [A](W_n X).
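The intertwining relation (W_m Y)A = [A](W_n X) is easy to check numerically. The sketch below is our own illustration (the spaces, bases and the transformation are not taken from the book): it takes X = Y = R², fixes bases B_X and B_Y, and verifies that the coordinate isomorphisms carry A onto its representing matrix [A].

```python
import numpy as np

# Illustrative choices (ours): X = Y = R^2, A given by the standard matrix S.
BX = np.array([[1.0, 1.0],
               [0.0, 1.0]])           # columns are the basis vectors of B_X
BY = np.array([[2.0, 0.0],
               [1.0, 1.0]])           # columns are the basis vectors of B_Y
S  = np.array([[0.0, -1.0],
               [1.0,  0.0]])          # A in the canonical bases (a rotation)

# Coordinate isomorphisms: solve B c = x for the coordinate tuple c.
coords_X = lambda x: np.linalg.solve(BX, x)
coords_Y = lambda y: np.linalg.solve(BY, y)

# Matrix of A with respect to B_X and B_Y: [A] = BY^{-1} S BX.
A_mat = np.linalg.solve(BY, S @ BX)

x = np.array([3.0, -2.0])
lhs = coords_Y(S @ x)        # (W_m Y) A x: coordinates of Ax in B_Y
rhs = A_mat @ coords_X(x)    # [A] (W_n X) x
assert np.allclose(lhs, rhs)
```

Any other choice of bases changes [A] but never breaks the identity, which is exactly the point of isomorphic equivalence.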
2.8 Direct Sum

Let {X_i}_{i=1}^n be a finite indexed family of linear spaces over the same field F (but not necessarily linear manifolds of the same linear space). The direct sum of {X_i}_{i=1}^n, denoted by ⊕_{i=1}^n X_i, is the set of all ordered n-tuples (x₁, … , xₙ), with each x_i in X_i, where vector addition and scalar multiplication are defined as follows:

(x₁, … , xₙ) ⊕ (y₁, … , yₙ) = (x₁ + y₁, … , xₙ + yₙ),
α(x₁, … , xₙ) = (αx₁, … , αxₙ),

for every (x₁, … , xₙ) and (y₁, … , yₙ) in ⊕_{i=1}^n X_i and every α in F. It is easy to verify that the direct sum ⊕_{i=1}^n X_i of the linear spaces {X_i}_{i=1}^n is a linear space over F when vector addition (denoted by ⊕) and scalar multiplication are defined as above. The underlying set of the linear space ⊕_{i=1}^n X_i is the Cartesian product ∏_{i=1}^n X_i of the underlying sets of each linear space X_i. The origin of ⊕_{i=1}^n X_i is the ordered n-tuple (0₁, … , 0ₙ) consisting of the origins of each X_i.
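The tuple description above can be sketched directly in code. The class below is a minimal illustration of ours (not from the book): vectors of ⊕_{i=1}^n X_i are modelled as n-tuples with slotwise operations, and each summand X_i is taken to be R for simplicity.

```python
from dataclasses import dataclass

# Minimal sketch (ours): each summand X_i is modelled as R, a vector of the
# direct sum is an n-tuple, and the operations act slot by slot.
@dataclass(frozen=True)
class DirectSumVector:
    parts: tuple  # (x_1, ..., x_n) with x_i in X_i

    def __add__(self, other):
        assert len(self.parts) == len(other.parts)
        return DirectSumVector(tuple(a + b for a, b in zip(self.parts, other.parts)))

    def scale(self, alpha):
        return DirectSumVector(tuple(alpha * a for a in self.parts))

x = DirectSumVector((1.0, 2.0, 3.0))
y = DirectSumVector((0.5, -2.0, 1.0))
origin = DirectSumVector((0.0, 0.0, 0.0))   # (0_1, ..., 0_n) is the origin
assert x + origin == x
assert (x + y).parts == (1.5, 0.0, 4.0)
assert x.scale(2.0).parts == (2.0, 4.0, 6.0)
```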
If M and N are linear manifolds of a linear space X, then we may consider both their ordinary sum M + N (defined as in Section 2.2) and their direct sum M ® N. These are different linear spaces over the same field. There is however a natural
mapping Φ: M ⊕ N → M + N, defined by

Φ((x₁, x₂)) = x₁ + x₂,

which assigns to each pair (x₁, x₂) in M ⊕ N their sum x₁ + x₂ in M + N ⊆ X. It is readily verified that Φ is a surjective linear transformation of the linear space M ⊕ N onto the linear space M + N, but Φ is not always injective. We shall establish below a necessary and sufficient condition that Φ be injective, viz., M ∩ N = {0}. In such a case the mapping Φ is an isomorphism (called the natural isomorphism) of M ⊕ N onto M + N, so that the direct sum M ⊕ N and the ordinary sum M + N become isomorphic linear spaces.
Theorem 2.14. Let M and N be linear manifolds of a linear space X. The following assertions are pairwise equivalent.

(a) M ∩ N = {0}.

(b) For each x in M + N there exists a unique u in M and a unique v in N such that x = u + v.

(c) The natural mapping Φ: M ⊕ N → M + N is an isomorphism.

Proof. Take an arbitrary x in M + N. If x = u₁ + v₁ = u₂ + v₂, with u₁, u₂ in M and v₁, v₂ in N, then u₁ − u₂ = v₂ − v₁ ∈ M ∩ N (for u₁ − u₂ ∈ M and v₂ − v₁ ∈ N). Thus M ∩ N = {0} implies that u₁ = u₂ and v₁ = v₂, and hence (a)⇒(b). On the other hand, if M ∩ N ≠ {0}, then there exists a nonzero vector w in M ∩ N. Take any nonzero vector x in M + N so that x = u + v with u in M and v in N. Thus x = (u + w) + (v − w), where u + w is in M and v − w is in N. Since w ≠ 0, it follows that u + w ≠ u, and hence the representation of x as a sum u + v with u in M and v in N is not unique. Therefore, if (a) does not hold, then (b) does not hold. Equivalently, (b)⇒(a). Finally, recall that the natural mapping Φ is linear and surjective. Since Φ is injective if and only if (b) holds (by its very definition), it follows that (b)⇔(c). □
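For a concrete instance of the equivalence (a)⇔(b), the sketch below (with M, N and the vector x being our own illustrative choices in R²) recovers the unique summands u and v by solving a linear system; the system is invertible precisely because the spanning vectors are linearly independent, i.e., M ∩ N = {0}.

```python
import numpy as np

# Illustrative choices (ours): two lines in R^2 meeting only at the origin.
m = np.array([1.0, 0.0])    # M = span{m}
n = np.array([1.0, 1.0])    # N = span{n}, so M ∩ N = {0}

x = np.array([3.0, 2.0])
# Solve a*m + b*n = x for the unique coefficients a, b.
a, b = np.linalg.solve(np.column_stack([m, n]), x)
u, v = a * m, b * n         # the unique decomposition x = u + v
assert np.allclose(u + v, x)
assert np.allclose(u, [1.0, 0.0]) and np.allclose(v, [2.0, 2.0])
```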
Two linear manifolds M and N of a linear space X are said to be disjoint (or algebraically disjoint) if M ∩ N = {0}. (Note that, as linear manifolds of a linear space X, M and N can never be "disjoint" in the set-theoretical sense: the origin of X always belongs to both of them.) Therefore, if M and N are disjoint linear manifolds of a linear space X, then we may and shall identify their ordinary sum M + N with their direct sum M ⊕ N. Such an identification is carried out by the natural isomorphism Φ: M ⊕ N → M + N (Theorem 2.14). When we identify M ⊕ N with M + N, which is a linear manifold of X, we are automatically
identifying the pairs (u, 0) and (0, v) in M ⊕ N with u in M and with v in N, respectively. More generally, we shall be identifying the direct sums M ⊕ {0} and {0} ⊕ N with M and N, respectively. For instance, if x ∈ M ⊕ N and M ∩ N = {0}, then Theorem 2.14 ensures that there exists a unique u in M and a unique v in N such that x = (u, v). Hence x = (u, 0) ⊕ (0, v), where (u, 0) ∈ M ⊕ {0} and (0, v) ∈ {0} ⊕ N (recall: M ⊕ {0} and {0} ⊕ N are both linear manifolds of M ⊕ N). Now identify (u, 0) with Φ((u, 0)) = u and (0, v) with Φ((0, v)) = v, and write x = u ⊕ v, where u ∈ M and v ∈ N (instead of x = (u, 0) ⊕ (0, v) = Φ⁻¹(u) ⊕ Φ⁻¹(v)). Outcome: If M and N are disjoint linear manifolds of a linear space X, then every x in M ⊕ N has a unique decomposition with respect to M and N, denoted by x = u ⊕ v, which is referred to as the direct sum of u in M and v in N. It should be noticed that u ⊕ v is just another notation for (u, v) that reminds us of the algebraic structure of the linear space M ⊕ N. What really is being added in M ⊕ N is (u, 0) ⊕ (0, v). If M and N are disjoint linear manifolds of a linear space X, and if their (ordinary) sum is X, then we say that M and N are algebraic complements of each other. In other words, two linear manifolds M and N of a linear space X form a pair of algebraic complements in X if
X = M + N and M ∩ N = {0}.

Accordingly, this can be written as

X = M ⊕ N and M ∩ N = {0}

once we have identified the direct sum M ⊕ N with its isomorphic image Φ(M ⊕ N) = M + N = X through the natural isomorphism Φ.

Proposition 2.15. Let M and N be linear manifolds of a linear space X, and let B_M and B_N be Hamel bases for M and N, respectively.

(a) M ∩ N = {0} if and only if B_M ∩ B_N = ∅ and B_M ∪ B_N is linearly independent.

(b) M + N = X and B_M ∪ B_N is linearly independent if and only if B_M ∪ B_N is a Hamel basis for X.

In particular, if B_M ∪ B_N ⊆ B, where B is a Hamel basis for X, then

(α) M ∩ N = {0} if and only if B_M ∩ B_N = ∅,

(β) M + N = X if and only if B_M ∪ B_N = B.

Proof. (a) Recall that
{0} ⊆ span(B_M ∩ B_N) ⊆ span(M ∩ N) = M ∩ N.

Thus M ∩ N = {0} implies span(B_M ∩ B_N) = {0}, which implies B_M ∩ B_N = ∅ (for 0 ∉ B_M ∪ B_N). Moreover, if M ∩ N = {0}, then the union of the linearly
independent sets B_M and B_N is again linearly independent (see Problem 2.3). On the other hand, recall that

{0} ⊆ M ∩ N = span B_M ∩ span B_N = span(B_M ∩ B_N)

when B_M ∪ B_N is linearly independent (see Problem 2.4). Thus B_M ∩ B_N = ∅ implies span(B_M ∩ B_N) = {0}, and hence M ∩ N = {0}.
(b) Next recall that

span(B_M ∪ B_N) = span(M ∪ N) = M + N ⊆ X

whenever B_M and B_N are Hamel bases for M and N, respectively. If B_M ∪ B_N is a Hamel basis for X, then B_M ∪ B_N is linearly independent and X = span(B_M ∪ B_N), so that M + N = X. On the other hand, if M + N = X, then span(B_M ∪ B_N) = X. Thus, according to Theorem 2.6, there exists a Hamel basis B′ for X such that B′ ⊆ B_M ∪ B_N. If B_M ∪ B_N is linearly independent, then Theorem 2.5 ensures that there exists a Hamel basis B for X such that B_M ∪ B_N ⊆ B. Therefore B′ ⊆ B. But a Hamel basis is maximal (see Claim 2 in the proof of Theorem 2.5), so that B′ = B. Hence B_M ∪ B_N = B. □
Theorem 2.16. Every linear manifold has an algebraic complement.
Proof. Let M be a linear manifold of a linear space X, let B_M be a Hamel basis for M, and let B be a Hamel basis for X such that B_M ⊆ B (see Theorem 2.5). Set B_N = B∖B_M (which, as a subset of a linearly independent set B, is linearly independent itself) and N = span B_N (a linear manifold of X). Thus B_M and B_N are Hamel bases for M and N, respectively, both included in the Hamel basis B for X. Since B_M ∩ B_N = ∅ and B_M ∪ B_N = B, it follows by Proposition 2.15 that N is an algebraic complement of M. □
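The basis-extension argument in this proof can be imitated numerically in the finite-dimensional case. The sketch below is our own illustration (the ambient space R⁴ and the manifold M are our choices): it greedily extends a basis B_M of M to a basis B of R⁴, in the spirit of Theorem 2.5, and takes N = span(B∖B_M).

```python
import numpy as np

# Illustrative choices (ours): M is a 2-dimensional manifold of X = R^4.
BM = [np.array([1.0, 1.0, 0.0, 0.0]),
      np.array([0.0, 0.0, 1.0, 1.0])]          # Hamel basis for M

basis = list(BM)
for e in np.eye(4):                            # try each canonical vector
    candidate = basis + [e]
    if np.linalg.matrix_rank(np.array(candidate)) == len(candidate):
        basis.append(e)                        # keep it if still independent

BN = basis[len(BM):]                           # B \ B_M, a basis for N
assert len(basis) == 4                         # B is a Hamel basis for R^4
assert np.linalg.matrix_rank(np.array(basis)) == 4
assert len(BN) == 2                            # codim M = 2 here
```

By Proposition 2.15, B_M ∪ B_N being a basis of the whole space gives M ∩ N = {0} and M + N = R⁴, so N is an algebraic complement of M.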
Lemma 2.17. Let M be a linear manifold of a linear space X. Every algebraic complement of M is isomorphic to the quotient space X/M.

Proof. Let M be a linear manifold of a linear space X over a field F, and let X/M be the quotient space of X modulo M, which is again a linear space over F (see Example 2H). Let π be the natural mapping of X onto X/M defined in Section 1.4. That is, for each x ∈ X set π(x) = [x] = x + M ∈ X/M. Example 2H shows that π: X → X/M in fact is a linear transformation. Let K be a linear manifold of X and consider the restriction of π to K, π|_K: K → X/M, which is again a linear transformation.

Claim. If K is an algebraic complement of M, then π|_K is invertible.

Proof. Take an arbitrary [x] in X/M so that [x] = x + M for some x in X. Since X = M + K, it follows that x = u + v with u in M and v in K. Thus [x] = [v] (reason: [x] = [u + v] = [u] + [v] = [v] because [u] = u + M = M = [0] for
every u in M). Hence [x] = π|_K(v) and so [x] ∈ R(π|_K). Conclusion: π|_K is surjective. Moreover, if π|_K(v) = [0] for some v ∈ K, then v + M = [v] = [0] = M and hence v ∈ M. Since M ∩ K = {0}, it follows that v = 0. That is, the null space of π|_K is the singleton {0}, N(π|_K) = {v ∈ K: π|_K(v) = [0]} = {0}, which means that π|_K is injective (see Theorem 2.8). □
Therefore, π|_K is an isomorphism of K onto X/M whenever K is an algebraic complement of M. □
Theorem 2.18. Let M be a linear manifold of a linear space X. Every algebraic complement of M has the same dimension.

Proof. According to Theorem 2.12 the above statement can be rewritten as follows. If N and K are algebraic complements of M, then K and N are isomorphic. But this is a straightforward consequence of the previous lemma: N and K are both isomorphic to X/M, and hence isomorphic to each other. □

The dimension of an algebraic complement of M is therefore a property of M (i.e., it is an invariant for M). We refer to this invariant as the codimension of M: the codimension of a linear manifold M, denoted by codim M, is the (constant) dimension of any algebraic complement of M.
2.9 Projections

A projection is an idempotent linear transformation of a linear space into itself. Thus, if X is a linear space, then P ∈ L[X] is a projection if and only if P = P². Briefly, projections are the idempotent elements of L[X]. Clearly, the null transformation 0 and the identity I, both in L[X], are projections. A nontrivial projection in L[X] is a projection P such that 0 ≠ P ≠ I. It is easy to verify that, if P is a projection, then so is I − P. Moreover, the null spaces and ranges of P and I − P are related as follows (see Problem 1.4):
R(P) = N(I − P) and N(P) = R(I − P).
Projections are singularly useful linear transformations. One of their main properties is that the range and the null space of a projection form a pair of algebraic complements.
Theorem 2.19. If P ∈ L[X] is a projection, then R(P) and N(P) are algebraic complements of each other.
Proof. Let X be a linear space, and let P: X → X be a projection. Recall that both the range R(P) and the null space N(P) are linear manifolds of X (since P is linear). Since P is idempotent, it follows that

R(P) = {x ∈ X: Px = x}
(the range of an idempotent mapping is the set of all its fixed points; see Problem 1.4). If x ∈ R(P) ∩ N(P), then x = Px = 0, and hence

R(P) ∩ N(P) = {0}.

Moreover, for an arbitrary vector x in X, write x = Px + (x − Px). Since P(x − Px) = Px − P²x = 0 (recall: P is linear and idempotent), it follows that (x − Px) ∈ N(P). Hence x = u + v with u = Px in R(P) and v = (x − Px) in N(P). Therefore

X = R(P) + N(P). □
On the other hand, for any pair of algebraic complements there exists a unique projection whose range and null space coincide with them.
Theorem 2.20. Let M and N be linear manifolds of a linear space X. If M and N are algebraic complements of each other, then there exists a unique projection P: X → X such that R(P) = M and N(P) = N.

Proof. Suppose M and N are algebraic complements in a linear space X so that

M + N = X and M ∩ N = {0}.

According to Theorem 2.14, for each x ∈ X there exists a unique u ∈ M and a unique v ∈ N such that x = u + v. Let P: X → X be the function that assigns to each x in X its unique summand u in M (i.e., Px = u). It is easy to verify that P is linear. Moreover, for each vector x in X, P²x = P(Px) = Pu = u = Px (reason: u is itself its unique summand in M), so that P is idempotent. By the very definition of P we get R(P) = M and N(P) = N. Conclusion: P: X → X is a projection with R(P) = M and N(P) = N. Now let P′: X → X be any projection with R(P′) = M and N(P′) = N. Take an arbitrary x ∈ X and consider again its unique representation as x = u + v with u ∈ M = R(P′) and v ∈ N = N(P′). Since P′ is linear and idempotent, it follows that P′x = P′u + P′v = u = Px. Therefore P′ = P. □

Remark: An immediate corollary of Theorems 2.16 and 2.20 says that any linear manifold of a linear space is the range of some projection. That is, if M is a linear manifold of a linear space X, then there exists a projection P: X → X such that
R(P) = M. If M and N are algebraic complements in a linear space X, then the unique projection P in L[X] with range R(P) = M and null space N(P) = N is called the projection on M along N. If P is the projection on M along N, then the projection on N along M is precisely the projection E = I − P in L[X]. Note that
EP = PE = 0.
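These identities are easy to verify in coordinates. The sketch below is our own illustration (M, N and the vectors are our choices, not the book's): it builds the projection P on M along N in R² and checks P² = P, R(P) = M, N(P) = N and EP = PE = 0.

```python
import numpy as np

# Illustrative choices (ours): algebraic complements M and N in R^2.
m = np.array([1.0, 0.0])               # M = span{m}
n = np.array([1.0, 1.0])               # N = span{n}
B = np.column_stack([m, n])            # basis of X = M + N

# P sends x = a*m + b*n to a*m: in the (m, n) coordinates it is diag(1, 0).
P = B @ np.diag([1.0, 0.0]) @ np.linalg.inv(B)
E = np.eye(2) - P                      # the projection on N along M

assert np.allclose(P @ P, P)                              # P is idempotent
assert np.allclose(P @ m, m) and np.allclose(P @ n, 0)    # R(P)=M, N(P)=N
assert np.allclose(E @ P, 0) and np.allclose(P @ E, 0)    # EP = PE = 0
```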
Proposition 2.21. Let M and N be linear manifolds of a linear space X. If M and N are algebraic complements of each other, then the unique decomposition of each x in X = M ⊕ N as a direct sum

x = u ⊕ v

of u in M and v in N is such that

u = Px and v = (I − P)x,

where P: X → X is the unique projection on M along N.

Proof. Take an arbitrary x in X and consider its unique decomposition x = u ⊕ v in X = M ⊕ N. Note that the identification of M ⊕ N with M + N = X is implicitly assumed in the proposition statement. Now write x = (u, v) and set Px = (u, 0). The very same argument used in the proof of Theorem 2.20 can be applied here to verify that this actually defines a unique projection P: M ⊕ N → M ⊕ N such that R(P) = M ⊕ {0} and N(P) = {0} ⊕ N. Finally, identify M ⊕ {0} and {0} ⊕ N with M and N (and hence (u, 0) and (0, v) with u and v), respectively. □
According to Theorem 2.16, every linear space X can be represented as the sum X = M + N of a pair (M, N) of algebraic complements in X. If M ⊕ N is identified with M + N, then this means that every linear space X has a decomposition X = M ⊕ N as a direct sum of disjoint linear manifolds of X.

Proposition 2.22. Let X be a linear space and consider its decomposition

X = M ⊕ N

as a direct sum of disjoint linear manifolds M and N of X. Let P: X → X be the projection on M along N, and let E = I − P be the projection on N along M. Every linear transformation L: X → X can be written as a 2×2 matrix with linear transformation entries
L = [A B]
    [C D],

where A = PL|_M: M → M, B = PL|_N: N → M, C = EL|_M: M → N and D = EL|_N: N → N.

Proof. Let M and N be linear manifolds of a linear space X. Suppose M and N are algebraic complements of each other and consider the decomposition X = M ⊕ N. Let L be a linear transformation on M ⊕ N so that L ∈ L[X]. Take an arbitrary x ∈ X and consider its unique decomposition x = u ⊕ v in X = M ⊕ N with u in M and v in N. Now write x = (u, v) so that Lx = L(u, v) = L((u, 0) ⊕ (0, v)) = L(u, 0) ⊕ L(0, v) = L|_{M⊕{0}}(u, 0) ⊕ L|_{{0}⊕N}(0, v). Identifying M ⊕ {0} and
{0} ⊕ N with M and N (and hence (u, 0) and (0, v) with u and v), respectively, it follows that

Lx = L|_M u ⊕ L|_N v,

where L|_M u and L|_N v lie in X = M ⊕ N. Proposition 2.21 says that we may write

L|_M u = PL|_M u ⊕ EL|_M u,  L|_N v = PL|_N v ⊕ EL|_N v,

where P is the unique projection on M along N and E = I − P. Therefore

Lx = (PL|_M u + PL|_N v) ⊕ (EL|_M u + EL|_N v),

where PL|_M u + PL|_N v is in M and EL|_M u + EL|_N v is in N. Since the ranges of PL|_M and PL|_N are included in R(P) = M, we may think of them as linear transformations into M. Similarly, EL|_M and EL|_N can be thought of as linear transformations into N. Thus set A = PL|_M ∈ L[M], B = PL|_N ∈ L[N, M], C = EL|_M ∈ L[M, N] and D = EL|_N ∈ L[N], so that

Lx = (Au + Bv, Cu + Dv) ∈ M ⊕ N

for every x = (u, v) ∈ M ⊕ N. In terms of standard matrix notation, the vector Lx in M ⊕ N can be viewed as a 2×1 matrix with the first entry in M and the other in N, namely (Au + Bv; Cu + Dv). This is precisely the action of the 2×2 matrix with linear transformation entries, (A B; C D), on the 2×1 matrix with entries in M and N representing x, namely (u; v). Thus Lx = (A B; C D)(u; v), and hence we write L = (A B; C D). □
Example 2O. Consider the setup of Proposition 2.22. Note that the projection on M along N can be written as

P = [1 0]
    [0 0]

with respect to the decomposition X = M ⊕ N, where 1 denotes the identity on M.
Thus LP = (A 0; C 0) and PLP = (A 0; 0 0), so that LP = PLP if and only if C = 0. Now note that M is L-invariant (i.e., L(M) ⊆ M) if and only if PL|_M = L|_M or, equivalently, if and only if EL|_M = 0 (recall: E = I − P). Therefore,

L(M) ⊆ M  ⟺  A = L|_M and C = 0  ⟺  LP = PLP.

Conclusion 1: The following assertions are pairwise equivalent.

(a) M is L-invariant.

(b) L = [L|_M B]
        [0    D].
(c) LP = PLP.

Similarly, if we apply the same argument to N, then

L(N) ⊆ N  ⟺  D = L|_N and B = 0  ⟺  PL = PLP.

Conclusion 2: The following assertions are pairwise equivalent as well.

(a) M and N are both L-invariant.

(b) L = [L|_M 0   ]
        [0    L|_N].

(c) L and P commute (i.e., PL = LP).

Let M and N be algebraic complements in a linear space X. If a linear transformation L in L[X] is represented as L = (A 0; 0 D) in terms of the decomposition X = M ⊕ N (as in (b) above), where A ∈ L[M] and D ∈ L[N], then it is usual to write L = A ⊕ D. For instance, the projection on M along N, which is represented
with respect to the same decomposition X = M ⊕ N, is usually written as P = 1 ⊕ 0. These are examples of the following concept. Let {X_i}_{i=1}^n be a finite indexed family of linear spaces over the same scalar field and consider their direct sum ⊕_{i=1}^n X_i. Now let {L_i}_{i=1}^n be a (similarly indexed) family of linear transformations such that L_i ∈ L[X_i] for every index i. The direct sum of {L_i}_{i=1}^n, denoted by ⊕_{i=1}^n L_i, is the mapping of ⊕_{i=1}^n X_i into itself defined by

(⊕_{i=1}^n L_i)(x₁, … , xₙ) = (L₁x₁, … , Lₙxₙ)
for every (x₁, … , xₙ) in ⊕_{i=1}^n X_i. It is readily verified that ⊕_{i=1}^n L_i is linear (i.e., ⊕_{i=1}^n L_i ∈ L[⊕_{i=1}^n X_i]) and also that, for every index i,

(⊕_{j=1}^n L_j)|_{X_i} = L_i.

Observe that the above identity actually is a short notation for the following assertion: "if 0_i is the origin of each X_i and O_i is the unique (linear) transformation of {0_i} onto itself, then each linear manifold {0₁} ⊕ ⋯ ⊕ {0_{i−1}} ⊕ X_i ⊕ {0_{i+1}} ⊕ ⋯ ⊕ {0ₙ} of ⊕_{i=1}^n X_i is invariant for ⊕_{i=1}^n L_i, and the restriction of ⊕_{i=1}^n L_i to that invariant linear manifold is the direct sum O₁ ⊕ ⋯ ⊕ O_{i−1} ⊕ L_i ⊕ O_{i+1} ⊕ ⋯ ⊕ Oₙ". Of course, we shall always use the short notation. Conversely, if L ∈ L[⊕_{i=1}^n X_i] is such that L|_{X_i} ∈ L[X_i] for every index i, then L is the direct sum of {L|_{X_i}}_{i=1}^n. That is, if each X_i in ⊕_{i=1}^n X_i is invariant for L ∈ L[⊕_{i=1}^n X_i], then

L = ⊕_{i=1}^n L|_{X_i}.
Summing up: Set X = ⊕_{i=1}^n X_i and consider linear transformations L_i in L[X_i] for each i and L in L[X]. Then

L = ⊕_{i=1}^n L_i  if and only if  L_i = L|_{X_i}

for every index i (so that each X_i, viewed as a linear manifold of the linear space ⊕_{i=1}^n X_i, is invariant for L). The linear transformations {L_i}_{i=1}^n are referred to as the direct summands of L. In particular, if we consider the decomposition X = M ⊕ N of a linear space X into the direct sum of a pair of algebraic complements M and N in X, and if we take linear transformations L ∈ L[X], A ∈ L[M] and D ∈ L[N], then

L = (A 0; 0 D) = A ⊕ D  if and only if  A = L|_M and D = L|_N

(so that M and N are both L-invariant), where A and D are the direct summands of L with respect to the decomposition X = M ⊕ N.
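The block-matrix statements of Proposition 2.22 and Example 2O, and the direct sum A ⊕ D, can be illustrated with concrete matrices. In the sketch below (all blocks are our own choices), X = R² ⊕ R² with coordinates chosen so that the projection on M along N is P = diag(1, 1, 0, 0); we check that C = 0 gives LP = PLP and that B = C = 0 gives a transformation commuting with P.

```python
import numpy as np

# Illustrative blocks (ours) for X = R^2 ⊕ R^2, M = R^2 ⊕ {0}, N = {0} ⊕ R^2.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
D = np.array([[3.0, 0.0], [0.0, 4.0]])
Z = np.zeros((2, 2))

P = np.block([[np.eye(2), Z], [Z, Z]])     # projection on M along N: 1 ⊕ 0

L_upper = np.block([[A, B], [Z, D]])       # C = 0: M is L-invariant
assert np.allclose(L_upper @ P, P @ L_upper @ P)   # LP = PLP (Conclusion 1)

L_diag = np.block([[A, Z], [Z, D]])        # B = C = 0: L = A ⊕ D
assert np.allclose(L_diag @ P, P @ L_diag)         # L and P commute (Concl. 2)
# The direct summands are the restrictions L|_M = A and L|_N = D.
assert np.allclose(L_diag[:2, :2], A) and np.allclose(L_diag[2:, 2:], D)
```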
Suggested Reading

Brown and Pearcy [2]
Halmos [2]
Herstein [1]
MacLane and Birkhoff [1]
Naylor and Sell [1]
Roman [1]
Simmons [1]
Taylor and Lay [1]
Problems

Problem 2.1. Let X be a linear space over a field F. Take arbitrary α and β in F and arbitrary x, y and z in X. Verify the following propositions.

(a) (−α)x = −(αx).

(b) 0x = 0 = α0.

(c) αx = 0 ⟹ α = 0 or x = 0.

(d) x + y = x + z ⟹ y = z.

(e) αx = αy ⟹ x = y if α ≠ 0.

(f) αx = βx ⟹ α = β if x ≠ 0.
Problem 2.2. Let X be a real or complex linear space. A subset C of X is convex if αx + (1 − α)y ∈ C whenever x, y ∈ C and 0 ≤ α ≤ 1. A vector x ∈ X is a convex linear combination of vectors in X if there exists a finite set {x_i}_{i=1}^n of vectors in X and a finite family of nonnegative scalars {α_i}_{i=1}^n such that x = Σ_{i=1}^n α_i x_i and Σ_{i=1}^n α_i = 1. If A is a subset of X, then the intersection of all convex sets containing A is called the convex hull of A, denoted by co(A).

(a) Show that the intersection of an arbitrary nonempty collection of convex sets is convex.
(b) Show that co(A) is the smallest (in the inclusion ordering) convex set that contains A.
(c) Show that C is convex if and only if every convex linear combination of vectors in C belongs to C.
Hint: To verify that every convex linear combination of vectors in a convex set C belongs to C, proceed as follows. Note that the italicized result holds for any convex linear combination of two vectors in C (by the definition of a convex set). Suppose it holds for every convex linear combination of n vectors in C, for some n ∈ N. This implies that Σ_{i=1}^{n+1} α_i x_i lies in C whenever {x_i}_{i=1}^{n+1} ⊆ C and Σ_{i=1}^{n+1} α_i = 1 with 0 ≤ α_i (reason: set α = Σ_{i=1}^n α_i, so that Σ_{i=1}^{n+1} α_i x_i = α Σ_{i=1}^n α⁻¹α_i x_i + α_{n+1} x_{n+1} with Σ_{i=1}^n α⁻¹α_i x_i ∈ C). Now conclude the proof by induction.
(d) Show that co(A) coincides with the set of all convex linear combinations of vectors in A.

Hint: Let clc(A) denote the set of all convex linear combinations of vectors in A. Verify that clc(A) is a convex set. Now use (b) and (c) to show that co(A) ⊆ clc(A) ⊆ clc(co(A)) = co(A).
Problem 2.3. Let M and N be linear manifolds of a linear space X, and let A and B be linearly independent subsets of M and N, respectively. If M ∩ N = {0}, then A ∪ B is linearly independent. (Hint: If a ∈ A is a linear combination of vectors in A ∪ B, then a = b + a′ for some a′ ∈ M and some b ∈ N.)

Problem 2.4. Let A be a linearly independent subset of a linear space X. If B and C are subsets of A, then
span(B ∩ C) = span B ∩ span C.

Hint: Show that span B ∩ span C ⊆ span(B ∩ C) by Proposition 2.3.
Problem 2.5. If a subset A of a linear space X spans X, then the cardinality of every linearly independent subset of X is less than or equal to the cardinality of A.
Hint: Suppose A is a subset of a linear space X that spans X. Let B be an arbitrary Hamel basis for X, and let C be an arbitrary linearly independent subset of X. Show that #C ≤ #B ≤ #A (apply Theorems 2.5, 2.6 and 2.7; see Problems 1.21(a) and 1.22). Note that this generalizes Claim 0 in the proof of Theorem 2.7 for subsets of arbitrary cardinality.
Problem 2.6. Let X be a linear space, and let M be a linear manifold of X. Verify the following propositions.

(a) dim M = 0 if and only if M = {0}.

(b) dim M ≤ dim X. (Hint: Problem 2.5.)
Problem 2.7. If M is a proper linear manifold of a finite-dimensional linear space X, then dim M < dim X. Prove the above statement and show that it does not hold for infinite-dimensional linear spaces (e.g., show that dim X = dim X₀, where X is the linear space of Example 2J and X₀ = {x = (ξ₁, ξ₂, ξ₃, …) ∈ X: ξ₁ = 0}).

Problem 2.8. Let X be a nonzero linear space over an infinite field F, and let B be a Hamel basis for X. Recall that every nonzero vector x in X has a unique representation in terms of B. That is, for each x ≠ 0 in X there exists a unique nonempty finite subset B_x of B and a unique finite family of nonzero scalars {α_b}_{b∈B_x} ⊆ F such that

x = Σ_{b∈B_x} α_b b.

For each positive integer n ∈ N, let X_n be the set of all nonzero vectors in X whose representations as a (finite) linear combination of vectors in B have exactly n (nonzero) summands. That is, for each n ∈ N, set

X_n = {x ∈ X: #B_x = n}.

(a) Prove that #X_n = #(F × B) for all n ∈ N.

Hint: Show that #X_n = #(Fⁿ × B) and recall: if F is an infinite set, then #Fⁿ = #F (Problems 1.23 and 1.28).

(b) Apply Theorem 1.10 to show that #(⋃_{n∈N} X_n) ≤ #(F × B).

(c) Verify that {X_n}_{n∈N} is a partition of X∖{0}.

Thus conclude from (b) and (c) (see Problem 1.28(a)) that

#X = #(F × B) = max{#F, dim X}.
Problem 2.9. Prove the following proposition, which is known as the Principle of Superposition. A mapping L: X → Y, where X and Y are linear spaces over the same scalar field, is a linear transformation if and only if

L(Σ_{i=1}^n α_i x_i) = Σ_{i=1}^n α_i L x_i

for all finite sets {x_i}_{i=1}^n of vectors in X and all finite families of scalars {α_i}_{i=1}^n.
Problem 2.10. Let L: X → Y be a linear transformation. Show that the null space N(L) and the range R(L) of L are linear manifolds (of the linear spaces X and Y, respectively).
Problem 2.11. Let L: X → Y be a linear transformation of a linear space X into a linear space Y. Prove the following propositions.

(a) If M is a linear manifold of X, then L(M) is a linear manifold of Y (i.e., the linear image of a linear manifold is a linear manifold).

(b) If N is a linear manifold of Y, then L⁻¹(N) is a linear manifold of X (i.e., the inverse image of a linear manifold under a linear transformation is again a linear manifold).
Problem 2.12. Let X and Y be linear spaces, and let L: X → Y be a linear transformation. Show that the following assertions are equivalent.

(a) A ⊆ X is a linear manifold whenever L(A) ⊆ Y is a linear manifold.

(b) N(L) = {0}.

Hint: Give a direct proof for (b)⇒(a) by using Problems 1.3(d) and 2.11(b). Give a contrapositive proof for (a)⇒(b); recall: if x is a nonzero vector in X, then {x} is not a linear manifold of X.
Problem 2.13. Show that the set L[X, Y] of all linear transformations of a linear space X into a linear space Y is itself a linear space (over the same common field of X and Y) when vector addition and scalar multiplication in L[X, Y] are defined pointwise as in Example 2F.
Problem 2.14. The restriction L|_M: M → Y of a linear transformation L: X → Y to a linear manifold M of X is itself a linear transformation.

Problem 2.15. Prove that a composition of two linear transformations is again a linear transformation.
Problem 2.16. Let L: X → Y be a linear transformation. It is trivially verified that, if L is surjective, then dim R(L) = dim Y. Now verify that, if L is injective, then dim R(L) = dim X.

Problem 2.17. Let L: X → Y be a linear transformation of a linear space X into a linear space Y. The dimension of the range of L is the rank of L, and the dimension of the null space of L is the nullity of L. Show that rank and nullity are related as follows:

dim N(L) + dim R(L) = dim X.

(Hint: Addition of cardinal numbers was defined in Problem 1.30.)
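The identity of Problem 2.17 is easy to check numerically in the finite-dimensional case. The matrix below is our own sample L: R⁴ → R³; the nullity is computed from a singular value decomposition rather than from the identity itself, so the final assertion is a genuine check.

```python
import numpy as np

# Our own sample L: R^4 -> R^3 (row 3 = row 1 + row 2, so the rank is 2).
L = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(L)                  # dim R(L)

# dim N(L): count (numerically) zero singular values among dim X = 4.
_, s, _ = np.linalg.svd(L)
s_full = np.concatenate([s, np.zeros(L.shape[1] - len(s))])
nullity = int((s_full < 1e-10).sum())            # dim N(L)

assert rank == 2 and nullity == 2
assert nullity + rank == L.shape[1]              # dim N(L) + dim R(L) = dim X
```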
Problem 2.18. If dim R(L) is finite, then L is called a finite-dimensional (or a finite-rank) linear transformation. Clearly, if Y is a finite-dimensional linear space, then every L ∈ L[X, Y] is finite-dimensional. Verify that, if X is a finite-dimensional linear space, then every L ∈ L[X, Y] is finite-dimensional. Moreover, if L: X → Y is a finite-dimensional linear transformation (so that R(L) is a finite-dimensional linear manifold of Y), then show that

(a) L is injective if and only if dim R(L) = dim X,

(b) L is surjective if and only if dim R(L) = dim Y.

Problem 2.19. Let X be a linear space over a field F, and let X^{N₀} be the linear space (over the same field F) of all X-valued sequences {x_n}_{n∈N₀}. Suppose A is a linear transformation of X into itself. Take an arbitrary sequence u = {u_n}_{n∈N₀} in X^{N₀} and consider the (unique) sequence x = {x_n}_{n∈N₀} in X^{N₀} which is recursively defined as follows. Set x₀ = u₀ and, for each n ∈ N₀, let

x_{n+1} = A x_n + u_{n+1}.

Prove by induction that

x_n = Σ_{i=0}^n A^{n−i} u_i

for every n ∈ N₀, where A⁰ is (by definition) the identity I in L[X]. Let L: X^{N₀} → X^{N₀} be the map that assigns to each sequence u in X^{N₀} this unique sequence x in X^{N₀}, so that

x = Lu.

Show that L is a linear transformation of X^{N₀} into itself. The recursive equation (or the difference equation) x_{n+1} = A x_n + u_{n+1} is called a discrete linear dynamical system because L is linear. Its unique solution is given by x = Lu (i.e., x_n = Σ_{i=0}^n A^{n−i} u_i for every n ∈ N₀).
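The closed-form solution of this difference equation can be confirmed by direct simulation. The sketch below (A and the input sequence u are our own illustrative choices) runs the recursion x_{n+1} = A x_n + u_{n+1} and compares every term with x_n = Σ_{i=0}^n A^{n−i} u_i.

```python
import numpy as np

# Illustrative choices (ours): a 2x2 matrix A and a short input sequence u.
A = np.array([[0.5, 1.0], [0.0, 0.5]])
u = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
     np.array([1.0, 1.0]), np.array([2.0, 0.0])]

# Recursion: x_0 = u_0 and x_{n+1} = A x_n + u_{n+1}.
x = [u[0]]
for k in range(1, len(u)):
    x.append(A @ x[-1] + u[k])

# Closed form: x_n = sum_{i=0}^{n} A^{n-i} u_i, with A^0 = I.
for n in range(len(u)):
    closed = sum(np.linalg.matrix_power(A, n - i) @ u[i]
                 for i in range(n + 1))
    assert np.allclose(closed, x[n])
```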
Problem 2.20. Let F denote either the real or complex field, and let X and Y be linear spaces over F. For any polynomial p (in one variable in F, with coefficients in F, and of any finite order n) set

p(L) = Σ_{i=0}^n α_i Lⁱ

in L[X] for every L ∈ L[X], where the coefficients {α_i}_{i=0}^n lie in F (note: L⁰ = I). Take L ∈ L[X], K ∈ L[Y] and M ∈ L[X, Y] arbitrary, and prove the following implication.
(a) If ML = KM, then M p(L) = p(K)M for any polynomial p. Thus conclude: p(L) is similar to p(K) whenever L is similar to K.

A linear transformation L in L[X] is called nilpotent if Lⁿ = 0 for some integer n ∈ N, and algebraic if p(L) = 0 for some polynomial p. It is clear that every nilpotent linear transformation is algebraic. Prove the following propositions.

(b) A linear transformation is similar to an algebraic (nilpotent) linear transformation if and only if it is itself algebraic (nilpotent).

(c) Sum and composition of nilpotent linear transformations are not necessarily nilpotent.

Hint: The matrices T = (0 1; 0 0) and L = (0 0; 1 0) in L[C²] are both nilpotent. L + T is an involution; LT and TL are idempotent.

Problem 2.21. Let F denote either the real or complex field, and let X be a linear space over F. A subset K of X is a cone (with vertex at the origin) if αx ∈ K whenever x ∈ K and α > 0. Recall the definition of a convex set in Problem 2.2 and verify the following assertions.

(a) Every linear manifold is a convex cone.

(b) The union of nonzero disjoint linear manifolds is a nonconvex cone.

Let S be a nonempty set and consider the linear space F^S. Show that

(c) {x ∈ F^S: x(s) > 0 for all s ∈ S} is a convex cone in F^S.
Problem 2.22. Show that the implication (a)⇒(b) in Theorem 2.14 does not generalize to three linear manifolds, say M, N and R, if we simply assume that they are pairwise disjoint. (Hint: R³.)
Problem 2.23. Let {M_i}_{i=1}^n be a finite collection of linear manifolds of a linear space X. Show that the following assertions are equivalent.
(a) M_i ∩ Σ_{j=1, j≠i}^n M_j = {0} for every i = 1, … , n.

(b) For each x in Σ_{i=1}^n M_i there exists a unique n-tuple (x₁, … , xₙ) in ∏_{i=1}^n M_i such that x = Σ_{i=1}^n x_i.

Hint: (a)⇒(b) for n = 2 by Theorem 2.14. Let n > 2 and suppose (a)⇒(b) holds for every 2 ≤ m < n. Show that, if (a) holds true for m + 1, then (b) holds true for m + 1. Now conclude the proof of (a)⇒(b) by induction on n. Next show that (b)⇒(a) by Theorem 2.14.
Problem 2.24. Let {M_i}_{i=1}^n be a finite collection of linear manifolds of a linear space X, and let B_i be a Hamel basis for each M_i. If M_i ∩ Σ_{j=1, j≠i}^n M_j = {0} for every i = 1, … , n, then ⋃_{i=1}^n B_i is a Hamel basis for Σ_{i=1}^n M_i. Prove.

Hint: The result holds for n = 2 by Proposition 2.15. Use the hint of the previous problem.
Problem 2.25. Let M and N be linear manifolds of a linear space.

(a) If M and N are disjoint, then

dim(M ⊕ N) = dim(M + N) = dim M + dim N.

Hint: Problem 1.30, Theorem 2.14 and Proposition 2.15.

(b) If M and N are finite-dimensional, then

dim(M + N) = dim M + dim N − dim(M ∩ N).
M is maximal in Lat(X)\{X}   if and only if   codim M = 1.
Problem 2.27. Let φ be a nonzero linear functional on a linear space X (i.e., let φ be a nonzero element of X', the algebraic dual of X). Prove that
(a) N(φ) is maximal in Lat(X)\{X}.
That is, the null space of any nonzero linear functional in X' is a maximal proper linear manifold of X. Conversely, if M is a maximal linear manifold in Lat(X)\{X}, then there exists a nonzero φ in X' such that M = N(φ). In other words, prove the following assertion.
(b) Every maximal element of Lat(X)\{X} is the null space of some nonzero φ in X'.
Problem 2.28. Let X be a linear space over a field F. The set
$H_{\varphi,\alpha} = \{x \in X : \varphi(x) = \alpha\}$,
determined by a nonzero φ in X' and a scalar α in F, is called a hyperplane in X. It is clear that $H_{\varphi,0}$ coincides with N(φ), but $H_{\varphi,\alpha}$ is not a linear manifold of X if α is a nonzero scalar. A linear variety is a translation of a proper linear manifold. That is, a linear variety V is a subset of X that coincides with the coset of x modulo M,
$V = M + x = \{y \in X : y = z + x \text{ for some } z \in M\}$,
for some x ∈ X and some M ∈ Lat(X)\{X}. If M is maximal in Lat(X)\{X}, then M + x is called a maximal linear variety. Show that a hyperplane is precisely a maximal linear variety.
Problem 2.29. Let X be a linear space over a field F, and let P and E be projections
in 𝓛[X]. Suppose E ≠ 0, and let α be an arbitrary nonzero scalar in F. Prove the following proposition.
(a) P + αE is a projection if and only if PE + EP = (1 − α)E.
Moreover, if P + αE is a projection, then show that
(b) E and P commute (i.e., EP = PE) and EP is a projection,
(c) EP = 0 if and only if α = 1, and EP = E if and only if α = −1. Therefore,
(d) if P + αE is a projection, then α = 1 or α = −1.
Finally conclude that
P + E is a projection if and only if EP = PE = 0,
P − E is a projection if and only if EP = PE = E.
Problem 2.30. An algebra (or a linear algebra) is a linear space A that is also a ring with respect to a second binary operation on A called product (notation: xy ∈ A is the product of x ∈ A and y ∈ A). The product is related to scalar multiplication by the property
α(xy) = (αx)y = x(αy)
for every x, y ∈ A and every scalar α. We shall refer to a real or complex algebra if A is a real or complex linear space. Recall that this new binary operation on A (i.e., the product in the ring A) is associative,
x(yz) = (xy)z,
and distributive with respect to vector addition, x(y + z) = xy + xz
and
(y + z)x = yx + zx,
for every x, y and z in A. If A possesses a neutral element 1 under the product operation (i.e., if there exists 1 ∈ A such that x1 = 1x = x for every x ∈ A), then A is said to be an algebra with identity (or a unital algebra). Such a neutral element 1 is called the identity (or unit) of A. If A is an algebra with identity, and if x ∈ A has an inverse (denoted by x⁻¹) with respect to the product operation (i.e., if there exists x⁻¹ ∈ A such that xx⁻¹ = x⁻¹x = 1), then x is an invertible element of A. Recall that the identity is unique if it exists, and so is the inverse of an invertible element of A. If the product operation is commutative, then A is said to be a commutative algebra.
(a) Let X be a linear space of dimension greater than one. Show that 𝓛[X] is a noncommutative algebra with identity when the product in 𝓛[X] is interpreted as composition (i.e., LT = L ∘ T for every L, T ∈ 𝓛[X]). The identity I in 𝓛[X] is precisely the neutral element under the product operation. L is an invertible element of 𝓛[X] if and only if L is injective and surjective.
A subalgebra of A is a linear manifold M of A (when A is viewed as a linear space) which is an algebra in its own right with respect to the product operation of A (i.e., uv ∈ M whenever u ∈ M and v ∈ M). A subalgebra M of A is a left ideal of A if ux ∈ M whenever u ∈ M and x ∈ A. A right ideal of A is a subalgebra M of A such that xu ∈ M whenever x ∈ A and u ∈ M. An ideal (or a two-sided ideal, or a bilateral ideal) of A is a subalgebra I of A that is both a left ideal and a right ideal.
(b) Let X be an infinite-dimensional linear space. Show that the set of all finite-dimensional linear transformations in 𝓛[X] is a proper left ideal of 𝓛[X] with no identity. (Hint: Problem 2.25(b).)
(c) Show that, if A is an algebra and I is a proper ideal of A, then the quotient space A/I of A modulo I is an algebra. This is called the quotient algebra of A with respect to I. If A has an identity 1, then the coset 1 + I is the identity of A/I.
Hint: Recall that vector addition and scalar multiplication in the linear space A/I are defined by
(x + I) + (y + I) = (x + y) + I,   α(x + I) = αx + I,
for every x, y ∈ A and every scalar α (see Example 2H). Now show that the product of cosets in A/I can be likewise defined by
(x+I)(y+I) = xy+I
for every x, y ∈ A (i.e., if x' = x + u and y' = y + v, with x, y ∈ A and u, v ∈ I, then for any w ∈ I there exists z ∈ I such that x'y' + w = xy + z, whenever I is a two-sided ideal of A).
3 Topological Structures
The basic concept behind the subject of point-set topology is the notion of "closeness" between two points in a set X. In order to get a numerical gauge of how close together two points in X may be, we shall provide an extra structure to X, viz., a topological structure, that again goes beyond its purely set-theoretic structure. For most of our purposes the notion of closeness associated with a metric will be sufficient, and this leads to the concept of "metric space": a set upon which a "metric" is defined. The metric-space structure that a set acquires when a metric is defined on it is a special kind of topological structure. Metric spaces comprise the kernel of this chapter but general topological spaces are also introduced.
3.1
Metric Spaces
A metric (or metric function, or distance function) is a real-valued function on the Cartesian product of an arbitrary set with itself that has the following four properties, called the metric axioms.
Definition 3.1. Let X be an arbitrary set. A real-valued function d on the Cartesian product X x X,
$d : X \times X \to \mathbb{R}$,
is a metric on X if the following conditions are satisfied for all x, y and z in X:
(i) d(x, y) ≥ 0 and d(x, x) = 0   (nonnegativeness),
(ii) d(x, y) = 0 only if x = y   (positiveness),
(iii) d(x, y) = d(y, x)   (symmetry),
(iv) d(x, y) ≤ d(x, z) + d(z, y)   (triangle inequality).
A set X equipped with a metric on it is a metric space.
A word on notation and terminology. The value of the metric d on a pair of points of X is called the distance between those points. According to the above definition a metric space actually is an ordered pair (X, d), where X is an arbitrary set, called the underlying set of the metric space (X, d), and d is a metric function defined on it. We shall often refer to a metric space in several ways. Sometimes we shall speak of X itself as a metric space when the metric d is either clear in the context or is immaterial. In this case we shall simply say "X is a metric space". On the other hand, in order to avoid confusion among different metric spaces, we may occasionally insert a subscript on the metrics. For instance, (X, d_X) and (Y, d_Y) will stand for metric spaces where X and Y are the respective underlying sets, d_X denotes the metric on X, and d_Y the metric on Y. Moreover, if a set X can be equipped with more than one metric, say d₁ and d₂, then (X, d₁) and (X, d₂) will represent different metric spaces with the same underlying set X. In brief, a metric space is an arbitrary set with an additional structure defined by means of a metric d. Such an additional structure is the topological structure induced by the metric d.
If (X, d) is a metric space, and if A is a subset of X, then it is easy to show that the restriction $d|_{A\times A} : A \times A \to \mathbb{R}$ of the metric d to A × A is a metric on A (the so-called relative metric). Equipped with the relative metric, A is a subspace of X. We shall drop the subscript A × A from $d|_{A\times A}$ and say that (A, d) is a subspace of (X, d). Thus a subspace of a metric space (X, d) is a subset A of the underlying set X equipped with the relative metric, which is itself a metric space. Roughly speaking, A inherits the metric of (X, d). If (A, d) is a subspace of (X, d) and A is a proper subset of X, then (A, d) is said to be a proper subspace of the metric space (X, d).
Example 3A. The function $d : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ defined by
$d(\alpha, \beta) = |\alpha - \beta|$ for every $\alpha, \beta \in \mathbb{R}$
is a metric on ℝ. That is, it satisfies all the metric axioms in Definition 3.1, where |α| stands for the absolute value of α ∈ ℝ: $|\alpha| = (\alpha^2)^{1/2}$. This is the usual metric on ℝ. The real line ℝ equipped with its usual metric is the most important concrete metric space. If we refer to ℝ as a metric space without specifying a metric on it, then it is understood that ℝ has been equipped with its usual metric. Similarly, the function $d : \mathbb{C} \times \mathbb{C} \to \mathbb{R}$ given by $d(\xi, \upsilon) = |\xi - \upsilon|$ for every $\xi, \upsilon \in \mathbb{C}$ is a metric on ℂ. (Again, |ξ| stands for the absolute value (or modulus) of a complex number ξ: $|\xi| = (\bar{\xi}\xi)^{1/2}$, with the upper bar denoting complex conjugate.) This is the usual metric on ℂ. More generally, let F denote either the real field ℝ or the complex field ℂ, and let Fⁿ be the set of all ordered n-tuples of scalars in F. For each real number p ≥ 1, consider the function $d_p : F^n \times F^n \to \mathbb{R}$ defined by
$d_p(x, y) = \Bigl(\sum_{i=1}^{n} |\xi_i - \upsilon_i|^p\Bigr)^{1/p}$,
and also the function $d_\infty : F^n \times F^n \to \mathbb{R}$ given by
$d_\infty(x, y) = \max_{1 \le i \le n} |\xi_i - \upsilon_i|$,
for every $x = (\xi_1, \dots, \xi_n)$ and $y = (\upsilon_1, \dots, \upsilon_n)$ in Fⁿ. These are metrics on Fⁿ. Indeed, all the metric axioms up to the triangle inequality are trivially verified. The triangle inequality follows from the Minkowski inequality (see Problem 3.4(a)). Note that (ℚⁿ, d_p) is a subspace of (ℝⁿ, d_p) and (ℚⁿ, d_∞) is a subspace of (ℝⁿ, d_∞).
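The metrics $d_p$ and $d_\infty$ on Fⁿ lend themselves to a quick numerical spot-check. The sketch below (Python, with F = ℝ; the helper names `d_p` and `d_inf` are ours, not the book's) verifies nonnegativeness, symmetry, and the triangle inequality on random points. It is an illustration of the axioms, not a proof of the Minkowski inequality.

```python
import itertools
import random

def d_p(x, y, p):
    """The metric d_p on R^n from Example 3A."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_inf(x, y):
    """The metric d_infinity on R^n: the largest coordinate-wise distance."""
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(0)
points = [tuple(random.uniform(-1, 1) for _ in range(4)) for _ in range(5)]

# Spot-check the metric axioms, including the triangle inequality
# (which rests on the Minkowski inequality), for a few values of p.
for p in (1, 2, 3):
    for x, y, z in itertools.product(points, repeat=3):
        assert d_p(x, y, p) >= 0
        assert abs(d_p(x, y, p) - d_p(y, x, p)) < 1e-12
        assert d_p(x, y, p) <= d_p(x, z, p) + d_p(z, y, p) + 1e-12
for x, y, z in itertools.product(points, repeat=3):
    assert d_inf(x, y) <= d_inf(x, z) + d_inf(z, y) + 1e-12
```

For p = 2 this is the Euclidean distance; e.g. `d_p((0, 0), (3, 4), 2)` evaluates to 5.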
The special (very special, really) metric space (ℝⁿ, d₂) is called n-dimensional Euclidean space, and d₂ is the Euclidean metric on ℝⁿ. The metric space (ℂⁿ, d₂) is called n-dimensional unitary space. The singular role played by the metric d₂ will become clear in due course.
Recall that the notion of a bounded subset was defined for partially ordered sets in Section 1.5. In particular, boundedness is well-defined for subsets of the simply ordered set (ℝ, ≤): the set of all real numbers ℝ equipped with its natural ordering ≤ (see Section 1.6). Let us introduce a suitable and common notation for a subset of ℝ that is bounded above. Since the simply ordered set ℝ is a boundedly complete lattice (Example 1C), it follows that a subset R of ℝ is bounded above if and only if it has a supremum, sup R, in ℝ. In such a case we shall write sup R < ∞. Thus the notation sup R < ∞ simply means that R is a subset of ℝ which is bounded above. Otherwise (i.e., if R ⊆ ℝ is not bounded above) we write sup R = ∞. With this in mind we shall extend the notion of boundedness from (ℝ, ≤) to a metric space (X, d) as follows. A nonempty subset A of X is a bounded set in the metric space (X, d) if
$\sup_{x,y\in A} d(x, y) < \infty$.
That is, A is bounded in (X, d) if $\{d(x, y) \in \mathbb{R} : x, y \in A\}$ is a bounded subset of (ℝ, ≤). Equivalently, if $\{d(x, y) \in \mathbb{R} : x, y \in A\}$ is bounded above in ℝ, since 0 ≤ d(x, y) for every x, y ∈ X. An unbounded set is, of course, a set A that is not bounded in (X, d). The diameter of a nonempty bounded subset A of X (notation: diam(A)) is defined by
$\operatorname{diam}(A) = \sup_{x,y\in A} d(x, y)$,
so that diam(A) < ∞ whenever a nonempty set A is bounded in (X, d). By convention the empty set ∅ is bounded and diam(∅) = 0. If A is unbounded we write
diam(A) = ∞.
Let F be a function of a set S to a metric space (Y, d). F is a bounded function if its range, R(F) = F(S), is a bounded subset of (Y, d); that is, if
$\sup_{s,t\in S} d(F(s), F(t)) < \infty$.
Note that R is bounded as a subset of the metric space ℝ equipped with its usual metric if and only if R is bounded as a subset of the simply ordered set ℝ equipped with its natural ordering. Thus the notion of a bounded subset of ℝ and the notion of a bounded real-valued function on an arbitrary set S are both unambiguously defined.
Proposition 3.2. Let S be a set and let F denote either the real field ℝ or the complex field ℂ. Equip F with its usual metric. A function φ ∈ F^S (i.e., φ : S → F) is bounded if and only if
$\sup_{s\in S} |\varphi(s)| < \infty$.
Proof. Consider a function φ from a set S to the field F. Take s and t arbitrary in S, and let d be the usual metric on F (see Example 3A). Since φ(s), φ(t) ∈ F, it follows by Problem 3.1(a) that
$\bigl|\, |\varphi(s)| - |\varphi(t)| \,\bigr| = \bigl| d(\varphi(s), 0) - d(0, \varphi(t)) \bigr| \le d(\varphi(s), \varphi(t)) = |\varphi(s) - \varphi(t)| \le |\varphi(s)| + |\varphi(t)|$.
If $\sup_{s\in S} |\varphi(s)| < \infty$ (i.e., if $\{|\varphi(s)| \in \mathbb{R} : s \in S\}$ is bounded above), then
$d(\varphi(s), \varphi(t)) \le 2 \sup_{s\in S} |\varphi(s)|$,
and hence $\sup_{s,t\in S} d(\varphi(s), \varphi(t)) \le 2 \sup_{s\in S} |\varphi(s)| < \infty$, so that the function φ is bounded. On the other hand, if $\sup_{s,t\in S} d(\varphi(s), \varphi(t)) < \infty$, then
$|\varphi(s)| \le \sup_{s,t\in S} d(\varphi(s), \varphi(t)) + |\varphi(t)|$,
and the real number $\sup_{s,t\in S} d(\varphi(s), \varphi(t)) + |\varphi(t)|$ is an upper bound for $\{|\varphi(s)| \in \mathbb{R} : s \in S\}$ for each t ∈ S. Thus $\sup_{s\in S} |\varphi(s)| < \infty$. □
Example 3B. For each real number p ≥ 1, let ℓ₊ᵖ denote the set of all scalar-valued (real or complex) infinite sequences $\{\xi_k\}_{k\in\mathbb{N}}$ in $\mathbb{C}^{\mathbb{N}}$ (or in $\mathbb{C}^{\mathbb{N}_0}$) such that $\sum_{k=1}^{\infty} |\xi_k|^p < \infty$. We shall refer to this condition by saying that the elements of ℓ₊ᵖ are p-summable sequences. Notation: $\sum_{k=1}^{\infty} |\xi_k|^p = \sup_{n\in\mathbb{N}} \sum_{k=1}^{n} |\xi_k|^p$. Thus, according to Proposition 3.2, $\sum_{k=1}^{\infty} |\xi_k|^p < \infty$ means that the nonnegative sequence $\{\sum_{k=1}^{n} |\xi_k|^p\}_{n\in\mathbb{N}}$ is bounded as a real-valued function on ℕ. Note that, if $\{\xi_k\}_{k\in\mathbb{N}}$ and $\{\upsilon_k\}_{k\in\mathbb{N}}$ are arbitrary sequences in ℓ₊ᵖ, then the Minkowski inequality (Problem 3.4(b)) ensures that $\sum_{k=1}^{\infty} |\xi_k - \upsilon_k|^p < \infty$. Hence we may consider the function $d_p : \ell_+^p \times \ell_+^p \to \mathbb{R}$ given by
$d_p(x, y) = \Bigl(\sum_{k=1}^{\infty} |\xi_k - \upsilon_k|^p\Bigr)^{1/p}$
for every $x = \{\xi_k\}_{k\in\mathbb{N}}$ and $y = \{\upsilon_k\}_{k\in\mathbb{N}}$ in ℓ₊ᵖ. We claim that d_p is a metric on ℓ₊ᵖ. Indeed, as it happened in Example 3A, all the metric axioms up to the triangle inequality are readily verified; and the triangle inequality follows from the Minkowski inequality. Therefore (ℓ₊ᵖ, d_p) is a metric space for each p ≥ 1, and the metric d_p is referred to as the usual metric on ℓ₊ᵖ. Now let ℓ₊^∞ denote the set of all scalar-valued bounded sequences; that is, the set of all real or complex-valued sequences $\{\xi_k\}_{k\in\mathbb{N}}$ such that $\sup_{k\in\mathbb{N}} |\xi_k| < \infty$. Again, the Minkowski inequality (Problem 3.4(b)) ensures that $\sup_{k\in\mathbb{N}} |\xi_k - \upsilon_k| < \infty$ whenever $\{\xi_k\}_{k\in\mathbb{N}}$ and $\{\upsilon_k\}_{k\in\mathbb{N}}$ lie in ℓ₊^∞, and hence we may consider the function $d_\infty : \ell_+^\infty \times \ell_+^\infty \to \mathbb{R}$ defined by
$d_\infty(x, y) = \sup_{k\in\mathbb{N}} |\xi_k - \upsilon_k|$
for every $x = \{\xi_k\}_{k\in\mathbb{N}}$ and $y = \{\upsilon_k\}_{k\in\mathbb{N}}$ in ℓ₊^∞. Proceeding as before (using the Minkowski inequality to verify the triangle inequality) it follows that (ℓ₊^∞, d_∞) is a metric space, and the metric d_∞ is referred to as the usual metric on ℓ₊^∞. These metric spaces are the natural generalizations (for infinite sequences) of the metric spaces considered in Example 3A, and again the metric space (ℓ₊², d₂) will play a central role in the forthcoming chapters. There are counterparts of ℓ₊ᵖ and ℓ₊^∞ for nets in $\mathbb{C}^{\mathbb{Z}}$. In fact, for each p ≥ 1 let ℓᵖ denote the set of all scalar-valued (real or complex) nets $\{\xi_k\}_{k\in\mathbb{Z}}$ such that $\sum_{k=-\infty}^{\infty} |\xi_k|^p < \infty$ (i.e., such that the nonnegative sequence $\{\sum_{k=-n}^{n} |\xi_k|^p\}_{n\in\mathbb{N}}$ is bounded), and let ℓ^∞ denote the set of all bounded nets in $\mathbb{C}^{\mathbb{Z}}$ (i.e., the set of all scalar-valued nets $\{\xi_k\}_{k\in\mathbb{Z}}$ such that $\sup_{k\in\mathbb{Z}} |\xi_k| < \infty$). The functions $d_p : \ell^p \times \ell^p \to \mathbb{R}$ and $d_\infty : \ell^\infty \times \ell^\infty \to \mathbb{R}$, given by
$d_p(x, y) = \Bigl(\sum_{k=-\infty}^{\infty} |\xi_k - \upsilon_k|^p\Bigr)^{1/p}$
for every $x = \{\xi_k\}_{k\in\mathbb{Z}}$ and $y = \{\upsilon_k\}_{k\in\mathbb{Z}}$ in ℓᵖ, and
$d_\infty(x, y) = \sup_{k\in\mathbb{Z}} |\xi_k - \upsilon_k|$
for every $x = \{\xi_k\}_{k\in\mathbb{Z}}$ and $y = \{\upsilon_k\}_{k\in\mathbb{Z}}$ in ℓ^∞, are metrics on ℓᵖ (for each p ≥ 1) and on ℓ^∞, respectively.
Let (X, d) be a metric space. If x is an arbitrary point in X, and A is an arbitrary nonempty subset of X, then the distance from x to A is the real number
$d(x, A) = \inf_{a\in A} d(x, a)$.
If A and B are nonempty subsets of X, then the distance between A and B is the real number
$d(A, B) = \inf_{a\in A,\, b\in B} d(a, b)$.
Example 3C. Let S be a nonempty set and let (Y, d) be a metric space. Let B[S, Y] denote the subset of Y^S consisting of all bounded mappings of S into (Y, d). According to Problem 3.6,
$\sup_{s\in S} d(f(s), g(s)) \le \operatorname{diam}(\mathcal{R}(f)) + \operatorname{diam}(\mathcal{R}(g)) + d(\mathcal{R}(f), \mathcal{R}(g))$,
so that $\sup_{s\in S} d(f(s), g(s)) \in \mathbb{R}$ for every f, g ∈ B[S, Y]. Thus we may consider the function $d_\infty : B[S, Y] \times B[S, Y] \to \mathbb{R}$ defined by
$d_\infty(f, g) = \sup_{s\in S} d(f(s), g(s))$
for each pair of mappings f, g ∈ B[S, Y]. This is a metric on B[S, Y]. Indeed, d_∞ clearly satisfies conditions (i), (ii), and (iii) in Definition 3.1. To verify the triangle inequality (condition (iv)) proceed as follows. Take an arbitrary s ∈ S and note that, if f, g, and h are mappings in B[S, Y], then (by the triangle inequality in (Y, d))
$d(f(s), g(s)) \le d(f(s), h(s)) + d(h(s), g(s)) \le d_\infty(f, h) + d_\infty(h, g)$.
Hence $d_\infty(f, g) \le d_\infty(f, h) + d_\infty(h, g)$, and therefore (B[S, Y], d_∞) is a metric space. The metric d_∞ is referred to as the sup-metric on B[S, Y]. Note that the metric spaces (ℓ₊^∞, d_∞) and (ℓ^∞, d_∞) of the previous example are particular cases of (B[S, Y], d_∞). Indeed, ℓ₊^∞ = B[ℕ, ℂ] and ℓ^∞ = B[ℤ, ℂ].
Example 3D. The general concept of a continuous mapping between metric spaces will be defined in the next section. However, assuming that the reader is familiar with the particular notion of a real-valued continuous function of a real variable, we shall consider now the following example. Let C[0, 1] denote the set of all scalar-valued (real or complex) continuous functions defined on the interval [0, 1]. For every x, y ∈ C[0, 1] set
$d_p(x, y) = \Bigl(\int_0^1 |x(t) - y(t)|^p \, dt\Bigr)^{1/p}$,
where p is a real number such that p ≥ 1, and
$d_\infty(x, y) = \sup_{t\in[0,1]} |x(t) - y(t)|$.
These are metrics on the set C[0, 1]. That is, $d_p : C[0, 1] \times C[0, 1] \to \mathbb{R}$ and $d_\infty : C[0, 1] \times C[0, 1] \to \mathbb{R}$ are well-defined functions that satisfy all the conditions in Definition 3.1. Indeed, nonnegativeness and symmetry are trivially verified,
positiveness for d_p is ensured by the continuity of the elements in C[0, 1], and the triangle inequality comes by the Minkowski inequality (Problem 3.4(c)): for every x, y, z ∈ C[0, 1],
$d_p(x, y) = \Bigl(\int_0^1 |x(t) - z(t) + z(t) - y(t)|^p \, dt\Bigr)^{1/p} \le \Bigl(\int_0^1 |x(t) - z(t)|^p \, dt\Bigr)^{1/p} + \Bigl(\int_0^1 |z(t) - y(t)|^p \, dt\Bigr)^{1/p} = d_p(x, z) + d_p(z, y)$,
$d_\infty(x, y) = \sup_{t\in[0,1]} |x(t) - z(t) + z(t) - y(t)| \le \sup_{t\in[0,1]} |x(t) - z(t)| + \sup_{t\in[0,1]} |z(t) - y(t)| = d_\infty(x, z) + d_\infty(z, y)$.
Let B[0, 1] denote the set B[S, Y] of Example 3C when S = [0, 1] and Y = F (with F standing either for the real field ℝ or for the complex field ℂ). Since C[0, 1] is a subset of B[0, 1] (reason: every scalar-valued continuous function defined on the interval [0, 1] is bounded), it follows that (C[0, 1], d_∞) is a subspace of the metric space (B[0, 1], d_∞). The metric d_∞ is called the sup-metric on C[0, 1] and, as we shall see later, the "sup" in its definition in fact is a "max".
Let X be an arbitrary set. A real-valued function d on X × X, $d : X \times X \to \mathbb{R}$, is a pseudometric on X if it satisfies the axioms (i), (iii) and (iv) of Definition 3.1. A pseudometric space (X, d) is a set X equipped with a pseudometric d. The difference between a metric space and a pseudometric space is that a pseudometric does not necessarily satisfy the axiom (ii) in Definition 3.1 (i.e., it is possible for a pseudometric to vanish at a pair (x, y) even though x ≠ y). However, given a pseudometric space (X, d), there exists a natural way to obtain a metric space (X̃, d̃) associated with (X, d), where d̃ is a metric on X̃ associated with the pseudometric d on X. Indeed, as we shall see next, a pseudometric d induces an equivalence relation ~ on X, and X̃ is precisely the quotient space X/~ (i.e., the collection of all equivalence classes [x] with respect to ~ for every x in X).
Proposition 3.3. Let d be a pseudometric on a set X and consider the relation ~ on X defined as follows. If x and x' are elements of X, then
x' ~ x   if   d(x', x) = 0.
The relation ~ is an equivalence relation on X with the following property. For every x, x', y and y' in X,
x' ~ x and y' ~ y   imply   d(x', y') = d(x, y).
Let X/~ be the quotient space of X modulo ~. For each pair ([x], [y]) in X/~ × X/~ set
d̃([x], [y]) = d(x, y)
for an arbitrary pair (x, y) in [x] × [y]. This defines a function
$\tilde{d} : X/{\sim} \times X/{\sim} \to \mathbb{R}$
which is a metric on the quotient space X/~.
Proof. It is clear that the relation ~ on X is reflexive and symmetric, because a pseudometric is nonnegative and symmetric. Transitivity comes from the triangle inequality: 0 ≤ d(x, x″) ≤ d(x, x′) + d(x′, x″) for every x, x′, x″ ∈ X. Thus ~ is an equivalence relation on X. Moreover, if x′ ~ x and y′ ~ y (i.e., if x′ ∈ [x] and y′ ∈ [y]), then the triangle inequality in the pseudometric space (X, d) ensures that
d(x, y) ≤ d(x, x′) + d(x′, y′) + d(y′, y) = d(x′, y′)
and, similarly, d(x′, y′) ≤ d(x, y). Therefore
d(x′, y′) = d(x, y)   whenever   x′ ~ x and y′ ~ y.
That is, given a pair of equivalence classes [x] ⊆ X and [y] ⊆ X, the restriction of d to [x] × [y] ⊆ X × X, $d|_{[x]\times[y]} : [x] \times [y] \to \mathbb{R}$, is a constant function. Thus, for each pair ([x], [y]) in X/~ × X/~, set d̃([x], [y]) = $d|_{[x]\times[y]}$(x, y) = d(x, y) for any x ∈ [x] and y ∈ [y]. This defines a function $\tilde{d} : X/{\sim} \times X/{\sim} \to \mathbb{R}$ which is nonnegative, symmetric, and satisfies the triangle inequality (along with d). The reason for defining equivalence classes is to ensure positiveness for d̃ from the nonnegativeness of the pseudometric d: if d̃([x], [y]) = 0, then d(x, y) = 0 so that x ~ y, and hence [x] = [y]. □
Example 3E. The previous example exhibited different metric spaces with the same underlying set of all scalar-valued continuous functions on the interval [0, 1]. Here we shall allow discontinuous functions as well. Let S be a nondegenerate interval of the real line ℝ (typical examples: S = [0, 1] or S = ℝ). For each real number p ≥ 1, let r^p(S) denote the set of all scalar-valued (real or complex) p-integrable functions on S. In this context, "p-integrable" means that a scalar-valued function x on S is
Riemann integrable and $\int_S |x(s)|^p \, ds < \infty$ (i.e., the Riemann integral $\int_S |x(s)|^p \, ds$ exists as a number in ℝ). Consider the function $\delta_p : r^p(S) \times r^p(S) \to \mathbb{R}$ given by
$\delta_p(x, y) = \Bigl(\int_S |x(s) - y(s)|^p \, ds\Bigr)^{1/p}$
for every x, y ∈ r^p(S). The Minkowski inequality (see Problem 3.4(c)) ensures that the function δ_p is well-defined, and also that it satisfies the triangle inequality. Moreover, nonnegativeness and symmetry are readily verified, but positiveness fails. For instance, if 0 denotes the null function on S = [0, 1] (i.e., 0(s) = 0 for all s ∈ S), and if x(s) = 1 for s = 1/2 and zero elsewhere, then δ_p(x, 0) = 0 although x ≠ 0 (for x(1/2) ≠ 0(1/2)). Thus δ_p actually is a pseudometric on r^p(S) rather than a metric, so that (r^p(S), δ_p) is a pseudometric space. However, if we "redefine" r^p(S) by endowing it with a new notion of equality, different from the usual pointwise equality for functions, then perhaps we might make δ_p a metric on such a "redefinition" of r^p(S). This in fact is the idea behind Proposition 3.3. Consider the equivalence relation ~ on r^p(S) defined as in Proposition 3.3: if x and x′ are functions in r^p(S), then x′ ~ x if δ_p(x′, x) = 0. Now set R^p(S) = r^p(S)/~, the collection of all equivalence classes [x] = {x′ ∈ r^p(S) : δ_p(x′, x) = 0} for every x ∈ r^p(S). Thus, according to Proposition 3.3, (R^p(S), d_p) is a metric space, where the metric $d_p : R^p(S) \times R^p(S) \to \mathbb{R}$ is defined by d_p([x], [y]) = δ_p(x, y) for arbitrary x ∈ [x] and y ∈ [y], for every [x], [y] ∈ R^p(S). Note that equality in R^p(S) is interpreted in the following way: if [x] and [y] are equivalence classes in R^p(S), and if x and y are arbitrary functions in [x] and [y], respectively, then [x] = [y] if and only if δ_p(x, y) = 0. If x is any element of [x] then, in this context, it is usual to write x for [x] and hence d_p(x, y) for d_p([x], [y]). Thus, following the common usage, we shall write x ∈ R^p(S) instead of [x] ∈ R^p(S), and also
$d_p(x, y) = \Bigl(\int_S |x(s) - y(s)|^p \, ds\Bigr)^{1/p}$
for every x, y ∈ R^p(S) to represent the metric d_p on R^p(S). This is referred to as the usual metric on R^p(S). Note that, according to this convention, x = y in R^p(S) if and only if d_p(x, y) = 0.
3.2
Convergence and Continuity
The notion of convergence, together with the notion of continuity, plays a central role in the theory of metric spaces.
Definition 3.4. Let (X, d) be a metric space. An X-valued sequence {x_n} (or a sequence in X indexed by ℕ or by ℕ₀) converges to a point x in X if for each real number ε > 0 there exists a positive integer n_ε such that
n ≥ n_ε   implies   d(x_n, x) < ε.
If {x_n} converges to x ∈ X, then {x_n} is said to be a convergent sequence and x is said to be the limit of {x_n} (notations: lim x_n = x, lim_n x_n = x, x_n → x, or x_n → x as n → ∞).
As defined above, convergence depends on the metric d that equips the metric space (X, d). To emphasize the role played by the metric d, it is usual to refer to an X-valued convergent sequence {x_n} by saying that {x_n} converges in (X, d). If an X-valued sequence {x_n} does not converge in (X, d) to the point x ∈ X, then we shall
write x_n ↛ x. Clearly, if x_n ↛ x, then the sequence {x_n} either converges in (X, d) to another point different from x or does not converge in (X, d) to any x in X.
The notion of convergence in a metric space (X, d) is a natural extension of the ordinary notion of convergence in the real line ℝ (equipped with its usual metric). Indeed, let (X, d) be a metric space, and consider an X-valued sequence {x_n}. Let x be an arbitrary point in X and consider the real-valued sequence {d(x_n, x)}. According to Definition 3.4,
x_n → x   if and only if   d(x_n, x) → 0.
This shows at once that a convergent sequence in a metric space has a unique limit (as we had anticipated in Definition 3.4 by referring to the limit of a convergent sequence). In fact, if a and b are points in X, then 0 ≤ d(a, b) ≤ d(a, x_n) + d(x_n, b) for every n. Thus, if x_n → a and x_n → b (i.e., d(a, x_n) → 0 and d(x_n, b) → 0), then d(a, b) = 0 (see Problem 3.10(c), and show that the sum of two convergent real-valued sequences {α_n} and {β_n} is a convergent real-valued sequence with lim(α_n + β_n) = lim α_n + lim β_n). Hence a = b.
Example 3F. Let C[0, 1] denote the set of all scalar-valued continuous functions on the interval [0, 1], and let {x_n} be a C[0, 1]-valued sequence such that, for each integer n ≥ 1, x_n : [0, 1] → ℝ is defined by
$x_n(t) = \begin{cases} 1 - nt, & t \in [0, 1/n], \\ 0, & t \in (1/n, 1]. \end{cases}$
Consider the metric spaces (C[0, 1], d_p) for p ≥ 1 and (C[0, 1], d_∞) which were introduced in Example 3D. It is readily verified that the sequence {x_n} converges in (C[0, 1], d_p) to the null function 0 ∈ C[0, 1] for every p ≥ 1. Indeed, take an arbitrary p ≥ 1 and note that
$d_p(x_n, 0) = \Bigl(\int_0^1 |x_n(t)|^p \, dt\Bigr)^{1/p} \le \bigl(\tfrac{1}{n}\bigr)^{1/p}$
for each n ≥ 1. Since the sequence of real numbers $\{(\tfrac{1}{n})^{1/p}\}$ converges to zero (when the real line ℝ is equipped with its usual metric; apply Definition 3.4), it follows that d_p(x_n, 0) → 0 as n → ∞ (Problem 3.10(c)). That is,
x_n → 0   in   (C[0, 1], d_p).
However, {x_n} does not converge in the metric space (C[0, 1], d_∞). Indeed, if there exists x ∈ C[0, 1] such that d_∞(x_n, x) → 0, then it is easy to show that x(0) = 1 and x(s) = 0 for all s ∈ (0, 1]. Hence x ∉ C[0, 1], which is a contradiction. Conclusion: There is no x ∈ C[0, 1] such that x_n → x in (C[0, 1], d_∞). Equivalently, {x_n} does not converge in (C[0, 1], d_∞).
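The contrast between the two modes of convergence can be observed numerically. The sketch below (Python; the Riemann-sum approximation, step counts, and helper names are ours) evaluates d_p(x_n, 0) and d_∞(x_n, 0) for the tent functions x_n above: the integral metric shrinks like (1/n)^{1/p}, while the sup-metric stays at 1 because x_n(0) = 1 for every n.

```python
# Example 3F numerically: x_n(t) = 1 - n t on [0, 1/n], zero afterwards.
# d_p(x_n, 0) tends to 0 while d_inf(x_n, 0) stays equal to 1.

def x(n, t):
    return 1.0 - n * t if t <= 1.0 / n else 0.0

def d_p(n, p, steps=100000):
    """Midpoint Riemann-sum approximation of (integral_0^1 |x_n(t)|^p dt)^(1/p)."""
    h = 1.0 / steps
    total = sum(x(n, (k + 0.5) * h) ** p for k in range(steps)) * h
    return total ** (1.0 / p)

def d_inf(n, steps=100000):
    """Sup-metric approximated on a grid (the sup is attained at t = 0)."""
    return max(x(n, k / steps) for k in range(steps + 1))

for n in (1, 10, 100):
    assert d_p(n, 2) <= (1.0 / n) ** 0.5 + 1e-6   # the bound (1/n)^(1/p) from the text
    assert abs(d_inf(n) - 1.0) < 1e-9             # the sup norm never drops below 1
```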
Example 3G. Consider the metric space (B[S, Y], d_∞) introduced in Example 3C, where B[S, Y] denotes the set of all bounded functions of a set S into a metric space (Y, d), and d_∞ is the sup-metric. Let {f_n} be a B[S, Y]-valued sequence (i.e., a sequence of functions in B[S, Y]), and let f be an arbitrary function in B[S, Y]. Since
$0 \le d(f_n(s), f(s)) \le \sup_{s\in S} d(f_n(s), f(s)) = d_\infty(f_n, f)$
for each index n and all s ∈ S, it follows by Problem 3.10(c) that
f_n → f in (B[S, Y], d_∞)   implies   f_n(s) → f(s) in (Y, d)
for every s ∈ S. If f_n → f in (B[S, Y], d_∞), then we say that the sequence {f_n} of functions in B[S, Y] converges uniformly to the function f in B[S, Y]. If f_n(s) → f(s) in (Y, d) for every s ∈ S, then we say that {f_n} converges pointwise to f. Thus uniform convergence implies pointwise convergence (to the same limit), but the converse fails. For instance, set S = [0, 1], Y = F (either the real field ℝ or the complex field ℂ equipped with their usual metric d), and set B[0, 1] = B[[0, 1], F]. Recall that the metric space (C[0, 1], d_∞) of Example 3D is a subspace of (B[0, 1], d_∞). (Indeed, every scalar-valued continuous function defined on a bounded closed interval is a bounded function; we shall consider a generalized version of this well-known result later in this chapter.) If {g_n} is a sequence of functions in C[0, 1] given by
$g_n(s) = \frac{s^2}{s^2 + (1 - ns)^2}$
for each integer n ≥ 1 and every s ∈ [0, 1], then it is easy to show (Definition 3.4) that
g_n(s) → 0 in (ℝ, d)
for every s ∈ [0, 1], so that the sequence {g_n} of functions in C[0, 1] converges pointwise to the null function 0 ∈ C[0, 1]. On the other hand, note that 0 ≤ g_n(s) ≤ 1 for all s ∈ [0, 1], and g_n(1/n) = 1 for each n ≥ 1. Therefore
$d_\infty(g_n, 0) = \sup_{s\in[0,1]} |g_n(s)| = 1$
for every integer n ≥ 1. Thus {g_n} does not converge uniformly to the null function, and hence it does not converge uniformly to any limit (for, if it converges uniformly, then it converges pointwise to the same limit). Conclusion: The C[0, 1]-valued sequence {g_n} does not converge in the metric space (C[0, 1], d_∞). Briefly, {g_n} does not converge in (C[0, 1], d_∞).
However, it does converge to the null function 0 ∈ C[0, 1] in the metric spaces (C[0, 1], d_p) of Example 3D. That is,
g_n → 0   in   (C[0, 1], d_p)
for every p ≥ 1. Indeed, since s² ≤ 1 for every s ∈ [0, 1] and $s^2 + (1 - ns)^2 = s^2 + n^2(s - \tfrac{1}{n})^2$, it follows that
$0 \le g_n(s) \le \frac{1}{1 + n^2 (s - \frac{1}{n})^2}$
for each n ≥ 1 and every s ∈ [0, 1]. Since 0 ≤ g_n(s) ≤ 1 implies g_n(s)^p ≤ g_n(s) for every p ≥ 1, we get
$0 \le d_p(g_n, 0)^p = \int_0^1 g_n(s)^p \, ds \le \int_0^1 \frac{ds}{1 + n^2 (s - \frac{1}{n})^2} \le \int_{-\infty}^{\infty} \frac{ds}{1 + n^2 s^2} = \frac{\pi}{n}$
for each n ≥ 1 and all p ≥ 1. Since the sequence of positive numbers $\{(\tfrac{\pi}{n})^{1/p}\}$ converges to zero in (ℝ, d), it follows by Problem 3.10(c) that
d_p(g_n, 0) → 0 as n → ∞   for every p ≥ 1.
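The three behaviors of {g_n} (pointwise convergence to 0, a sup-metric stuck at 1, and a shrinking integral metric) can be checked numerically. The sketch below (Python; the sample points, step counts, and the midpoint Riemann sum are our own illustrative choices) evaluates g_n at a fixed s, at its peak s = 1/n, and approximates the integral of g_n over [0, 1].

```python
# Example 3G numerically: g_n(s) = s^2 / (s^2 + (1 - n s)^2) converges to 0
# pointwise and in every d_p, yet d_inf(g_n, 0) = 1 for all n (peak at s = 1/n).

def g(n, s):
    return s * s / (s * s + (1.0 - n * s) ** 2) if s > 0 else 0.0

# Pointwise convergence at a fixed s:
s0 = 0.3
values = [g(n, s0) for n in (1, 10, 100, 1000)]
assert all(values[i + 1] < values[i] for i in range(len(values) - 1))
assert values[-1] < 1e-4

# The sup-metric does not budge: g_n(1/n) = 1.
for n in (1, 5, 50):
    assert abs(g(n, 1.0 / n) - 1.0) < 1e-9

# But the integral metric does: midpoint Riemann sum of g_n over [0, 1].
def int_g(n, steps=20000):
    h = 1.0 / steps
    return sum(g(n, (k + 0.5) * h) for k in range(steps)) * h

assert int_g(100) < int_g(10) < int_g(1)
assert int_g(100) < 3.1416 / 100 + 1e-3   # consistent with the bound pi/n
```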
Proposition 3.5. An X-valued sequence {x_n} converges in a metric space (X, d) to a limit x ∈ X if and only if every subsequence of it converges in (X, d) to x.
Proof. If every subsequence converges to a fixed limit, then, in particular, the sequence itself converges to the same limit. On the other hand, suppose x_n → x in (X, d). That is, for every ε > 0 there exists a positive integer n_ε such that n ≥ n_ε implies d(x_n, x) < ε. Take an arbitrary subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ of {x_n}. Since k ≤ n_k (reason: $\{n_k\}_{k\in\mathbb{N}}$ is a strictly increasing subsequence of the sequence $\{n\}_{n\in\mathbb{N}}$; see Section 1.7), it follows that k ≥ n_ε implies n_k ≥ n_ε, which in turn implies $d(x_{n_k}, x) < \varepsilon$. Therefore $x_{n_k} \to x$ in (X, d) as k → ∞. □
As we saw in Section 1.7, nets constitute a natural generalization of (infinite) sequences. Thus it comes as no surprise that the concept of convergence can be generalized from sequences to nets in a metric space (X, d). Indeed, an X-valued net $\{x_\gamma\}_{\gamma\in\Gamma}$ (or a net in X) indexed by a directed set Γ converges to a point x in X if for each real number ε > 0 there exists an index γ_ε in Γ such that
γ ≥ γ_ε   implies   d(x_γ, x) < ε.
If $\{x_\gamma\}_{\gamma\in\Gamma}$ converges to x, then it is said to be a convergent net and x is said to be the limit of $\{x_\gamma\}_{\gamma\in\Gamma}$ (notations: lim x_γ = x, lim_γ x_γ = x, or x_γ → x). Just as in the particular case of sequences, a convergent net in a metric space has a unique limit.
The notion of a real-valued continuous function on R is essential in classical analysis. One of the main reasons for investigating metric spaces is the generalization
of the idea of continuity for maps between abstract metric spaces: a map between metric spaces is continuous if it preserves closeness.
Definition 3.6. Let F: X → Y be a function from a set X to a set Y. Equip X and Y with metrics d_X and d_Y, respectively, so that (X, d_X) and (Y, d_Y) are metric spaces. F: (X, d_X) → (Y, d_Y) is continuous at the point x₀ in X if for each real number ε > 0 there exists a real number δ > 0 (which certainly depends on ε and may depend on x₀ as well) such that

d_X(x, x₀) < δ implies d_Y(F(x), F(x₀)) < ε.
3.2 Convergence and Continuity
F is continuous (or continuous on X) if it is continuous at every point of X; and uniformly continuous (on X) if for each real number ε > 0 there exists a real number δ > 0 such that

d_X(x, x') < δ implies d_Y(F(x), F(x')) < ε

for all x and x' in X.
It is clear that a uniformly continuous mapping is continuous, but the converse fails. The difference between continuity and uniform continuity is that if F is uniformly continuous, then for each ε > 0 it is possible to take δ > 0 (which depends only on ε) so as to ensure that the implication

d_X(x, x₀) < δ implies d_Y(F(x), F(x₀)) < ε

holds for all points x₀ of X. We say that a mapping F: (X, d_X) → (Y, d_Y) is Lipschitzian if there exists a real number γ > 0 (called a Lipschitz constant) such that

d_Y(F(x), F(x')) ≤ γ d_X(x, x') for all x, x' ∈ X

(which is referred to as the Lipschitz condition). It is readily verified that every Lipschitzian mapping is uniformly continuous but, again, the converse fails (see Problem 3.16). A contraction is a Lipschitzian mapping F: (X, d_X) → (Y, d_Y) with a Lipschitz constant γ ≤ 1. That is, F is a contraction if d_Y(F(x), F(x')) ≤ d_X(x, x') for all x, x' ∈ X or, equivalently, if

sup_{x ≠ x'} d_Y(F(x), F(x')) / d_X(x, x') ≤ 1.

F is said to be a strict contraction if it is Lipschitzian with a Lipschitz constant γ < 1, which means that

sup_{x ≠ x'} d_Y(F(x), F(x')) / d_X(x, x') < 1.

Note that, if d_Y(F(x), F(x')) < d_X(x, x') for all x ≠ x' in X, then F is a contraction but not necessarily a strict contraction.

Consider a function F from a metric space (X, d_X) to a metric space (Y, d_Y). If F is continuous at a point x₀ ∈ X, then x₀ is said to be a point of continuity of F. Otherwise, if F is not continuous at a point x₀ ∈ X, then x₀ is said to be a point of discontinuity of F, and F is said to be discontinuous at x₀. F is not continuous if there exists at least one point x₀ ∈ X such that F is discontinuous at x₀. According to Definition 3.6 a function F is discontinuous at x₀ ∈ X if and only if the following assertion holds true: there exists ε > 0 such that for every δ > 0 there exists x_δ ∈ X with the property that

d_X(x_δ, x₀) < δ and d_Y(F(x_δ), F(x₀)) ≥ ε.
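The ladder of conditions just defined (Lipschitzian, contraction, strict contraction) can be probed numerically. The sketch below is an illustration only, not part of the text; the three sample maps and the grid are my own choices. It estimates sup_{x ≠ x'} d_Y(F(x), F(x')) / d_X(x, x') over a finite grid for maps of (ℝ, usual metric) into itself.

```python
# Estimate the Lipschitz ratio sup |F(x) - F(x')| / |x - x'| over a finite
# sample grid, for maps of (R, usual metric) into itself. Illustration only:
# the maps and the grid are chosen for the example.

def lipschitz_ratio(F, points):
    """Largest ratio |F(x) - F(y)| / |x - y| over distinct sample points."""
    return max(
        abs(F(x) - F(y)) / abs(x - y)
        for x in points for y in points if x != y
    )

grid = [i / 10 for i in range(-20, 21)]  # sample points in [-2, 2]

halving = lipschitz_ratio(lambda x: x / 2, grid)   # strict contraction (ratio 1/2)
identity = lipschitz_ratio(lambda x: x, grid)      # contraction, not strict (ratio 1)
doubling = lipschitz_ratio(lambda x: 2 * x, grid)  # Lipschitzian, not a contraction

print(halving, identity, doubling)
```

A finite grid can only bound the supremum from below, so such an experiment suggests, but never proves, a Lipschitz constant.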
3. Topological Structures
Example 3H. (a) Consider the set R2(ℝ) defined in Example 3E. Put Y = R2(ℝ) and let X be the subset of Y made up of all functions x in R2(ℝ) for which the formula

y(t) = ∫_{−∞}^{t} x(s) ds for each t ∈ ℝ

defines a function in R2(ℝ). Briefly,

X = {x ∈ Y: ∫_{−∞}^{∞} |∫_{−∞}^{t} x(s) ds|² dt < ∞}.

Recall that a "function" in Y is, in fact, an equivalence class of functions as discussed in Example 3E. Thus consider the mapping F: X → Y that assigns to each function x in X the function y = F(x) in Y defined by the above formula. Now equip R2(ℝ) with its usual metric d₂ (cf. Example 3E) so that (X, d₂) is a subspace of the metric space (Y, d₂). We claim that F: (X, d₂) → (Y, d₂) is nowhere continuous; that is, the mapping F is discontinuous at every x₀ ∈ X (see Problem 3.17(a)).

(b) Now let S be a (nondegenerate) closed and bounded interval of the real line ℝ (typical example: S = [0, 1]), and consider the set R2(S) defined in Example 3E. If x is a function in R2(S) (so that it is Riemann integrable), then set

y(t) = ∫_{min S}^{t} x(s) ds for each t ∈ S.

According to the Hölder inequality in Problem 3.3(c) it follows that ∫_S |x(s)| ds ≤ (∫_S ds)^{1/2} (∫_S |x(s)|² ds)^{1/2} for every x ∈ R2(S). Therefore |y(t)|² = |∫_{min S}^{t} x(s) ds|² ≤ (∫_S |x(s)| ds)² ≤ diam(S) ∫_S |x(s)|² ds for each t ∈ S, and hence

∫_S |y(t)|² dt ≤ diam(S)² ∫_S |x(s)|² ds < ∞
for every x ∈ R2(S). Thus the above formula defines a function y in R2(S). Let F be a mapping of R2(S) into itself that assigns to each function x in R2(S) this function y in R2(S), so that y = F(x). Equip R2(S) with its usual metric d₂ (Example 3E). It is easy to show that F: (R2(S), d₂) → (R2(S), d₂) is uniformly continuous. As a matter of fact, the mapping F is Lipschitzian (see Problem 3.17(b)). Comparing the example in item (a) with the present one we observe how different the metric spaces R2(ℝ) and R2(S), both equipped with the usual metric d₂, can be: the "same" integral transformation F that is nowhere continuous when defined on an appropriate subspace of (R2(ℝ), d₂) becomes Lipschitzian when defined on (R2(S), d₂). The concepts of convergence and continuity are tightly intertwined. A particularly important result on the connection of these central concepts says that a function is continuous if and only if it preserves convergence. This supplies a useful necessary and sufficient condition for continuity in terms of convergence.
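A discretized version of the estimate in item (b) can be checked numerically. The sketch below is an illustration under my own choices (grid size and the two sample functions are arbitrary), with S = [0, 1], so diam(S) = 1: Riemann sums stand in for the integrals, and the Lipschitz bound d₂(F(x), F(x')) ≤ diam(S) d₂(x, x') is verified for one pair of functions.

```python
# Discretized check of the Lipschitz bound d2(F(x), F(x')) <= diam(S) d2(x, x')
# for F(x)(t) = integral from 0 to t of x(s) ds on S = [0, 1].
# Illustration only: grid size and sample functions are arbitrary choices.
import math

N = 1000
ts = [i / N for i in range(N + 1)]

def F(x):
    """Riemann-sum approximation of t -> integral_0^t x(s) ds on the grid."""
    ys, acc = [0.0], 0.0
    for i in range(N):
        acc += x(ts[i]) / N
        ys.append(acc)
    return ys

def d2(u, v):
    """Approximate L2 distance between two grid-sampled functions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / (N + 1))

x1 = lambda t: math.sin(10 * t)
x2 = lambda t: t ** 2
lhs = d2(F(x1), F(x2))                               # d2(F(x1), F(x2))
rhs = d2([x1(t) for t in ts], [x2(t) for t in ts])   # diam(S) * d2(x1, x2), diam = 1
print(lhs <= rhs, lhs, rhs)
```

The integration map smooths its input, so the left-hand side comes out much smaller than the right: the discretized experiment is consistent with F being Lipschitzian on (R2(S), d₂).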
Theorem 3.7. Consider a mapping F: (X, d_X) → (Y, d_Y) of a metric space (X, d_X) into a metric space (Y, d_Y) and let x₀ be a point in X. The following assertions are equivalent.

(a) F is continuous at x₀.

(b) The Y-valued sequence {F(xₙ)} converges in (Y, d_Y) to F(x₀) ∈ Y whenever {xₙ} is an X-valued sequence that converges in (X, d_X) to x₀ ∈ X.
Proof. If {xₙ} is an X-valued sequence such that xₙ → x₀ in (X, d_X) for some x₀ in X, then (Definition 3.4) for every δ > 0 there exists a positive integer n_δ such that

n ≥ n_δ implies d_X(xₙ, x₀) < δ.

If F: (X, d_X) → (Y, d_Y) is continuous at x₀, then (Definition 3.6) for each ε > 0 there exists δ > 0 such that

d_X(xₙ, x₀) < δ implies d_Y(F(xₙ), F(x₀)) < ε.

Therefore, if xₙ → x₀ and F is continuous at x₀, then for each ε > 0 there exists a positive integer n_ε (e.g., n_ε = n_δ) such that

n ≥ n_ε implies d_Y(F(xₙ), F(x₀)) < ε,
which means that (a)⇒(b). On the other hand, if F is not continuous at x₀, then there exists ε > 0 such that for every δ > 0 there exists x_δ ∈ X with the property that

d_X(x_δ, x₀) < δ and d_Y(F(x_δ), F(x₀)) ≥ ε.

In particular, for each positive integer n there exists xₙ ∈ X such that

d_X(xₙ, x₀) < 1/n and d_Y(F(xₙ), F(x₀)) ≥ ε.

Thus xₙ → x₀ in (X, d_X) (for d_X(xₙ, x₀) → 0) and F(xₙ) ↛ F(x₀) in (Y, d_Y) (for d_Y(F(xₙ), F(x₀)) ↛ 0). That is, the Y-valued sequence {F(xₙ)} does not converge to F(x₀) (it may not converge in (Y, d_Y) or, if it converges in (Y, d_Y), then it does not converge to F(x₀)). Therefore, the denial of (a) implies the denial of (b). Equivalently, (b)⇒(a). □

Note that the proof of (a)⇒(b) can be rewritten in terms of nets so that, if the mapping F: (X, d_X) → (Y, d_Y) is continuous at x₀ ∈ X, and if {x_γ}_{γ∈Γ} is an X-valued net that converges to x₀, then {F(x_γ)}_{γ∈Γ} is a Y-valued net that converges to F(x₀). That is, if (b') is the statement obtained from (b) by changing "sequence" to "net" in (b), then (a)⇒(b'). Since (b) is a particular case of (b'), it also follows that (b')⇒(a) (because (b)⇒(a)).
We shall say that a mapping F: (X, d_X) → (Y, d_Y) of a metric space (X, d_X) into a metric space (Y, d_Y) preserves convergence if the Y-valued sequence {F(xₙ)} converges in (Y, d_Y) whenever the X-valued sequence {xₙ} converges in (X, d_X), and lim F(xₙ) = F(lim xₙ).
Corollary 3.8. A map between metric spaces is continuous if and only if it preserves convergence.
Proof. Combine the above definition and the definition of a continuous function with Theorem 3.7. □
Example 3I. Let C[0, 1] denote the set of all scalar-valued (real or complex) continuous functions defined on the interval [0, 1]. Consider the map φ: C[0, 1] → C defined by

φ(x) = ∫₀¹ x(t) dt

for every x in C[0, 1]. Equip C[0, 1] with the sup-metric d_∞ and C with its usual metric d. Take an arbitrary convergent sequence {xₙ} in (C[0, 1], d_∞) (i.e., an arbitrary C[0, 1]-valued sequence that converges in the metric space (C[0, 1], d_∞)) and set x₀ = lim xₙ ∈ C[0, 1]. Note that for each positive integer n

0 ≤ d(φ(xₙ), φ(x₀)) = |∫₀¹ xₙ(t) dt − ∫₀¹ x₀(t) dt| = |∫₀¹ (xₙ(t) − x₀(t)) dt| ≤ ∫₀¹ |xₙ(t) − x₀(t)| dt ≤ sup_{t∈[0,1]} |xₙ(t) − x₀(t)| ∫₀¹ dt = d_∞(xₙ, x₀)

(by the Hölder inequality: Problem 3.3(c)). Since d_∞(xₙ, x₀) → 0, it follows that d(φ(xₙ), φ(x₀)) → 0. Therefore φ(xₙ) → φ(x₀) in (C, d) whenever xₙ → x₀ in (C[0, 1], d_∞), so that φ: (C[0, 1], d_∞) → (C, d) is continuous by Corollary 3.8. (In fact, φ is a contraction, thus Lipschitzian, and hence uniformly continuous.)
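Example 3I can be mirrored numerically. In the sketch below, an illustration with my own choice of sample sequence xₙ(t) = t + sin(nt)/n → x₀(t) = t and with Riemann sums standing in for the integrals, the contraction bound d(φ(xₙ), φ(x₀)) ≤ d_∞(xₙ, x₀) holds at every n, and both sides tend to 0.

```python
# Numeric illustration of Example 3I: phi(x) = integral_0^1 x(t) dt satisfies
# |phi(x) - phi(y)| <= d_inf(x, y), so it preserves convergence in the sup metric.
# Sample sequence x_n(t) = t + sin(n t)/n -> x_0(t) = t; grids are arbitrary.
import math

ts = [i / 1000 for i in range(1001)]

def phi(x):
    """Riemann-sum approximation of integral_0^1 x(t) dt."""
    return sum(x(t) for t in ts[:-1]) / 1000

def d_sup(x, y):
    """Sup metric d_inf(x, y) approximated on the sample grid."""
    return max(abs(x(t) - y(t)) for t in ts)

x0 = lambda t: t
pairs = []
for n in (1, 10, 100):
    xn = lambda t, n=n: t + math.sin(n * t) / n
    pairs.append((abs(phi(xn) - phi(x0)), d_sup(xn, x0)))

print(pairs)
```

Note that the average of grid values can never exceed their maximum, so the discretized bound holds exactly, not just approximately, for each n.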
3.3 Open Sets and Topology
Let (X, d) be a metric space. For each point x₀ in X and each nonnegative real number ρ the set

B_ρ(x₀) = {x ∈ X : d(x, x₀) < ρ}

is the open ball with center x₀ and radius ρ (or the open ball centered at x₀ with radius ρ). If ρ = 0, then B_ρ(x₀) is empty; otherwise B_ρ(x₀) always contains at least its center. The set

B_ρ[x₀] = {x ∈ X : d(x, x₀) ≤ ρ}
is the closed ball with center x₀ and radius ρ. It is clear that B_ρ(x₀) ⊆ B_ρ[x₀]. If ρ = 0, then B_ρ[x₀] = {x₀} for every x₀ ∈ X.

Definition 3.9. A subset U of a metric space X is an open set in X if it includes a nonempty open ball centered at each one of its points. That is, U is an open set in X if and only if for every u in U there exists a positive number ρ such that B_ρ(u) ⊆ U. Equivalently, U ⊆ X is open in the metric space (X, d) if and only if for every u ∈ U there exists ρ > 0 such that

d(x, u) < ρ implies x ∈ U.
Thus, according to Definition 3.9, a subset A of a metric space (X, d) is not open if and only if there exists at least one point a in A such that every open ball with positive radius ρ centered at a contains a point of X not in A. In other words, A ⊆ X is not open in the metric space (X, d) if and only if there exists at least one point a ∈ A with the following property: for every ρ > 0 there exists x ∈ X such that

d(x, a) < ρ and x ∈ X\A.
This shows at once that the empty set ∅ is open in X (reason: if a set is not open then it has at least one point); and also that the underlying set X is always open in the metric space (X, d) (reason: there is no point in X\X).

Proposition 3.10. An open ball is an open set.

Proof. Let B_ρ(x₀) be an open ball in a metric space (X, d) with center at an arbitrary x₀ ∈ X and with an arbitrary radius ρ ≥ 0. Suppose ρ > 0 so that B_ρ(x₀) ≠ ∅ (otherwise B_ρ(x₀) is empty and hence trivially open). Take an arbitrary u ∈ B_ρ(x₀), which means that u ∈ X and d(u, x₀) < ρ. Set β = ρ − d(u, x₀) so that 0 < β ≤ ρ, and let x be a point in X. If d(x, u) < β, then the triangle inequality ensures that

d(x, x₀) ≤ d(x, u) + d(u, x₀) < β + d(u, x₀) = ρ,

and hence x ∈ B_ρ(x₀). Conclusion: For every u ∈ B_ρ(x₀) there exists β > 0 such that

d(x, u) < β implies x ∈ B_ρ(x₀).

That is, B_ρ(x₀) is an open set. □
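The construction in the proof (the radius β = ρ − d(u, x₀)) can be exercised numerically. The sketch below is an illustration in (ℝ², Euclidean metric); the sampling ranges and counts are arbitrary choices. It draws random points u inside B_ρ(x₀) and random points x with d(x, u) < β, and records whether every such x lands in B_ρ(x₀).

```python
# Sketch of the construction in Proposition 3.10 for (R^2, Euclidean metric):
# if u is in B_rho(x0) and beta = rho - d(u, x0), then B_beta(u) lies inside
# B_rho(x0). Sampling parameters are arbitrary illustration choices.
import math
import random

def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

x0, rho = (0.0, 0.0), 1.0
random.seed(0)

all_inside = True
for _ in range(1000):
    u = (random.uniform(-0.7, 0.7), random.uniform(-0.7, 0.7))
    if d(u, x0) >= rho:
        continue  # keep only u inside the ball B_rho(x0)
    beta = rho - d(u, x0)
    # a random x with d(x, u) < beta; by the triangle inequality d(x, x0) < rho
    ang = random.uniform(0.0, 2.0 * math.pi)
    r = random.uniform(0.0, 0.999 * beta)
    x = (u[0] + r * math.cos(ang), u[1] + r * math.sin(ang))
    all_inside = all_inside and d(x, u) < beta and d(x, x0) < rho

print(all_inside)
```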
An open neighborhood of a point x in a metric space is an open set containing x. In particular (see Proposition 3.10), every open ball with positive radius centered at a point x in a metric space is an open neighborhood of x. A neighborhood of a point x in a metric space X is any subset of X that includes an open neighborhood of x. Clearly, every open neighborhood of x is a neighborhood of x. Open sets can be used to give an alternative definition of continuity and convergence.
Lemma 3.11. Consider a mapping F : X -> Y of a metric space X into a metric space Y and let xo be a point in X. The following assertions are equivalent.
(a) F is continuous at xo. (b) The inverse image of every neighborhood of F(xo) is a neighborhood of xo.
Proof. Consider the image F(x₀) ∈ Y of x₀ ∈ X. Take an arbitrary neighborhood N ⊆ Y of F(x₀). Since N includes an open neighborhood U of F(x₀), it follows that there exists an open ball B_ε(F(x₀)) ⊆ U ⊆ N with center at F(x₀) and radius ε > 0. If the mapping F: X → Y is continuous at x₀ (cf. Definition 3.6), then there exists δ > 0 such that

d_Y(F(x), F(x₀)) < ε whenever d_X(x, x₀) < δ,

where d_X and d_Y are the metrics on X and Y, respectively. In other words, there exists δ > 0 such that

x ∈ B_δ(x₀) implies F(x) ∈ B_ε(F(x₀)).

Thus B_δ(x₀) ⊆ F⁻¹(B_ε(F(x₀))) ⊆ F⁻¹(U) ⊆ F⁻¹(N). Since the open ball B_δ(x₀) is an open neighborhood of x₀, and since B_δ(x₀) ⊆ F⁻¹(N), it follows that F⁻¹(N) is a neighborhood of x₀. Hence (a) implies (b). Now suppose (b) holds true. Then, in particular, the inverse image F⁻¹(B_ε(F(x₀))) of each open ball B_ε(F(x₀)) with center F(x₀) and radius ε > 0 includes a neighborhood N ⊆ X of x₀. This neighborhood N includes an open neighborhood U of x₀, which in turn includes an open ball B_δ(x₀) with center at x₀ and with a positive radius δ (cf. Definition 3.9). Therefore, for each ε > 0 there exists δ > 0 such that

B_δ(x₀) ⊆ U ⊆ N ⊆ F⁻¹(B_ε(F(x₀))).

Hence (see Problem 1.2(c,j))

F(B_δ(x₀)) ⊆ B_ε(F(x₀)).
Equivalently, if x ∈ B_δ(x₀), then F(x) ∈ B_ε(F(x₀)). Thus for each ε > 0 there exists δ > 0 such that

d_X(x, x₀) < δ implies d_Y(F(x), F(x₀)) < ε,

where d_X and d_Y denote the metrics on X and Y, respectively. That is, (a) holds true (Definition 3.6). □

Theorem 3.12. A map between metric spaces is continuous if and only if the inverse image of each open set is an open set.
Proof. Let F: X → Y be a mapping of a metric space X into a metric space Y.

(a) Take any neighborhood N ⊆ Y of F(x) ∈ Y (for an arbitrary x ∈ X). Since N includes an open neighborhood of F(x), say U, it follows that F(x) ∈ U ⊆ N, which implies

x ∈ F⁻¹(U) ⊆ F⁻¹(N).
If the inverse image (under F) of each open set in Y is an open set in X, then F⁻¹(U) is open in X, and hence F⁻¹(U) is an open neighborhood of x. Therefore, the inverse image F⁻¹(N) is a neighborhood of x. Conclusion: The inverse image of every neighborhood of F(x) (for any x ∈ X) is a neighborhood of x. Thus F is continuous by Lemma 3.11.

(b) Take an arbitrary open subset U of Y. Suppose R(F) ∩ U ≠ ∅ and take an arbitrary x ∈ F⁻¹(U) ⊆ X. Thus F(x) ∈ U, so that U is an open neighborhood of F(x). If F is continuous, then it is continuous at x. Therefore, according to Lemma 3.11, F⁻¹(U) is a neighborhood of x, and hence it includes a nonempty open ball B_δ(x) centered at x. Thus B_δ(x) ⊆ F⁻¹(U), so that F⁻¹(U) is open in X (reason: it includes a nonempty open ball centered at an arbitrary point of it). If R(F) ∩ U = ∅, then F⁻¹(U) = ∅, which is open. Conclusion: F⁻¹(U) is open in X for every open subset U of Y. □

Corollary 3.13. The composition of two continuous functions is again a continuous function.

Proof. Let X, Y and Z be metric spaces, and let F: X → Y and G: Y → Z be continuous functions. Take an arbitrary open set U in Z. According to Theorem 3.12 the set G⁻¹(U) is open in Y, so that (GF)⁻¹(U) = F⁻¹(G⁻¹(U)) is open in X. Thus, using Theorem 3.12 again, we conclude that GF: X → Z is continuous. □
An X-valued sequence {xₙ} is said to be eventually in a subset A of X if there exists a positive integer n₀ such that

n ≥ n₀ implies xₙ ∈ A.
Theorem 3.14. Let {xₙ} be a sequence in a metric space X, and let x be a point in X. The following assertions are equivalent.

(a) xₙ → x in X.

(b) {xₙ} is eventually in every neighborhood of x.

Proof. If xₙ → x, then (definition of convergence) {xₙ} is eventually in every nonempty open ball centered at x. Hence it is eventually in every neighborhood of x (cf. definitions of neighborhood and of open set). Conversely, if {xₙ} is eventually in every neighborhood of x then, in particular, it is eventually in every nonempty open ball centered at x, which means that xₙ → x. □
Observe that the above theorem is naturally extended from sequences to nets. A net {x_γ}_{γ∈Γ} in a metric space X converges to x ∈ X if and only if for every neighborhood N of x there exists an index γ₀ ∈ Γ such that x_γ ∈ N for every γ ≥ γ₀.
Given a metric space X, the collection of all open sets in X is of paramount importance. Its fundamental properties are stated in the next theorem.
Theorem 3.15. If X is a metric space, then

(a) the whole set X and the empty set ∅ are open,
(b) the intersection of a finite collection of open sets is open,
(c) the union of an arbitrary collection of open sets is open.
Proof. We have already verified that assertion (a) holds true. Let {Uₙ} be a finite collection of open subsets of X. Suppose ∩ₙUₙ ≠ ∅ (otherwise ∩ₙUₙ is an open set). Take an arbitrary u ∈ ∩ₙUₙ so that u ∈ Uₙ for every index n. As each Uₙ is an open subset of X, there exist open balls B_{αₙ}(u) ⊆ Uₙ (with center at u and radius αₙ > 0) for each index n. Consider the set {αₙ} consisting of the radius of each B_{αₙ}(u). Since {αₙ} is a finite set of positive numbers, it has a positive minimum. Set α = minₙ αₙ > 0 so that B_α(u) ⊆ ∩ₙUₙ. Thus ∩ₙUₙ is open in X (i.e., for each u ∈ ∩ₙUₙ there exists an open ball B_α(u) ⊆ ∩ₙUₙ), which concludes the proof of (b). The proof of (c) goes as follows. Let 𝒰 be an arbitrary collection of open subsets of X. Suppose ⋃𝒰 is nonempty (otherwise it is open by (a)) and take an arbitrary u ∈ ⋃𝒰 so that u ∈ U for some U ∈ 𝒰. As U is an open subset of X, there exists a nonempty open ball B_ρ(u) ⊆ U ⊆ ⋃𝒰, which means that ⋃𝒰 is open in X. □
Corollary 3.16. A subset of a metric space is open if and only if it is a union of open balls.

Proof. The union of open balls in a metric space X is an open set in X because open balls are open sets (cf. Proposition 3.10 and Theorem 3.15). On the other hand, let U be an open set in a metric space X. If U is empty, then it coincides with the empty open ball. If U is a nonempty open subset of X, then each u ∈ U is the center of an open ball, say B_{ρ(u)}(u), included in U. Thus U = ⋃_{u∈U}{u} ⊆ ⋃_{u∈U} B_{ρ(u)}(u) ⊆ U, and hence U = ⋃_{u∈U} B_{ρ(u)}(u). □
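On a finite metric space the collection of all open sets can be computed exhaustively. The sketch below is an illustration (the four-point subset of ℝ is an arbitrary choice): it lists all open sets by testing the open-ball criterion of Definition 3.9 on every subset, after which the axioms of Theorem 3.15 can be checked over the resulting collection. In a finite metric space every singleton is itself an open ball, so the metric topology turns out to be the whole power set.

```python
# Compute the metric topology of a small finite metric space by testing the
# open-ball criterion of Definition 3.9 on every subset. The four-point
# subset of R below is an arbitrary illustration choice.
from itertools import chain, combinations

X = (0.0, 0.1, 0.2, 1.0)
d = lambda x, y: abs(x - y)

def ball(x0, r):
    """Open ball B_r(x0) in the finite space X."""
    return frozenset(x for x in X if d(x, x0) < r)

# all open balls for the finitely many radii that matter, plus one big radius
radii = sorted({d(x, y) for x in X for y in X if x != y}) + [2.0]
balls = {ball(x0, r) for x0 in X for r in radii} - {frozenset()}

def is_open(U):
    """U is open iff it includes a nonempty ball centered at each of its points."""
    return all(any(x in B and B <= U for B in balls) for x in U)

subsets = [frozenset(c) for c in
           chain.from_iterable(combinations(X, k) for k in range(len(X) + 1))]
topology = {U for U in subsets if is_open(U)}
print(len(topology))  # every subset is open here: the topology is P(X)
```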
The collection T of all open sets in a metric space X (which is a subcollection of the power set P(X)) is called the topology (or the metric topology) on X. As the elements of T are the open sets in the metric space (X, d), and since the definition of an open set in X depends on the particular metric d that equips the metric space (X, d), the collection T is also referred to as the topology induced (or generated, or determined) by the metric d. Our starting point in this chapter was the definition of a metric space. A metric has been defined on a set X as a real-valued function on X x X that satisfies the metric axioms of Definition 3.1. A possible and different approach is to define axiomatically an abstract notion of open sets (instead of an abstract notion of distance as we did in
Definition 3.1), and then to build up a theory based on it. Such a "new" beginning goes as follows.
Definition 3.17. A subcollection T of the power set P(X) of a set X is a topology on X if it satisfies three axioms, viz.,

(i) The whole set X and the empty set ∅ belong to T.
(ii) The intersection of a finite collection of sets in T belongs to T.
(iii) The union of an arbitrary collection of sets in T belongs to T.
A set X equipped with a topology T is referred to as a topological space (denoted by (X, T) or simply by X), and the elements of T are called the open subsets of X with respect to T. Thus a topology T on an underlying set X is always identified with the collection of all open subsets of X: U is open in X with respect to T if and only if U ∈ T. It is clear (see Theorem 3.15) that every metric space (X, d) is a topological space, where the topology T (the metric topology, that is) is that induced by the metric. This topology T induced by the metric d, and the topological space (X, T) obtained by equipping X with T, are said to be metrized by d. If (X, T) is a topological space, and if there exists a metric d on X that metrizes T, the topological space (X, T) and the topology T are called metrizable. The notion of topological space is broader than the notion of metric space. Although every metric space is a topological space, the converse fails. There are topological spaces that are not metrizable.
Example 3J. Let X be an arbitrary set and define a function d: X × X → ℝ by

d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y,
for every x and y in X. It is readily verified that d is a metric on X, the so-called discrete metric on X. A set X equipped with the discrete metric is called a discrete space. In a discrete space every open ball with radius ρ ∈ (0, 1) is a singleton in X: B_ρ(x₀) = {x₀} for every x₀ ∈ X and every ρ ∈ (0, 1). Thus, according to Definition 3.9, every subset of X is an open set in the metric space (X, d) equipped with the discrete metric d. That is, the metric topology coincides with the power set of X. Conversely, if X is an arbitrary set, then the collection T = P(X) is a topology on X (since T trivially satisfies the above three axioms), called the discrete topology, which is the largest topology on X (any other topology on X is a subcollection of the discrete topology). Summing up: The discrete topology T = P(X) is metrizable by the discrete metric. On the other extreme lies the topology T = {∅, X}, called the indiscrete topology, which is the smallest topology on X (it is a subcollection of any other topology on X). If X has more than one point, then the indiscrete topology T = {∅, X} is not metrizable.
Indeed, suppose there exists a metric d on X that induces the indiscrete topology. Take u in X arbitrary and consider the set X\{u}. Since ∅ ≠ X\{u} ≠ X, it follows that this set is not open (with respect to the indiscrete topology). Thus there exists v ∈ X\{u} with the following property: for every ρ > 0 there exists x ∈ X such that

d(x, v) < ρ and x ∈ X\(X\{u}) = {u}.

Hence x = u, so that d(u, v) < ρ for every ρ > 0. Therefore u = v (i.e., d(u, v) = 0), which is a contradiction (for v ∈ X\{u}). Conclusion: There is no metric on X that induces the indiscrete topology.

Continuity and convergence in a topological space can be defined as follows. A mapping F: X → Y of a topological space (X, T_X) into a topological space (Y, T_Y) is continuous if F⁻¹(U) ∈ T_X for every U ∈ T_Y. An X-valued sequence converges in a topological space (X, T) to a limit x ∈ X if it is eventually in every U ∈ T that contains x. Carefully note that, for the particular case of metric spaces (or of metrizable topological spaces), the above definitions of continuity and convergence agree with Definitions 3.6 and 3.4, respectively, when the topological spaces are equipped with their metric topology. Indeed, these definitions are the topological-space versions of Theorems 3.12 and 3.14. Many (but not all) of the theorems in the next sections hold for general topological spaces (metrizable or not), and we shall prove them in a topological-space style (based on open sets rather than on open balls) whenever this is possible and convenient. However, as we anticipated in the introduction to this chapter, our attention will focus mainly on metric spaces.
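The discrete-metric half of Example 3J is easy to check concretely: every open ball of radius 1/2 is the singleton of its center, so any subset is a union of open balls and hence open (Corollary 3.16). A small sketch follows; the underlying set and the chosen subset are arbitrary illustration choices.

```python
# Discrete metric of Example 3J: B_{1/2}(x) = {x}, so every subset of X is a
# union of open balls and is therefore open. Set and subset are arbitrary.
X = {"a", "b", "c", "d"}
d = lambda x, y: 0 if x == y else 1

ball = lambda x0, r: {x for x in X if d(x, x0) < r}

singletons = [ball(x, 0.5) for x in X]       # each ball of radius 1/2 is {x}
A = {"a", "c"}                               # an arbitrary subset of X
covered = set().union(*(ball(a, 0.5) for a in A))
print(singletons, covered == A)
```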
3.4 Equivalent Metrics and Homeomorphisms
Let (X, d₁) and (X, d₂) be two metric spaces with the same underlying set X. The metrics d₁ and d₂ are said to be equivalent (or d₁ and d₂ are equivalent metrics on X; notation: d₁ ~ d₂) if they induce the same topology (i.e., a subset of X is open in (X, d₁) if and only if it is open in (X, d₂)). This notion of equivalence is in fact an equivalence relation on the collection of all metrics defined on a given set X. If T₁ and T₂ are the metric topologies on X induced by the metrics d₁ and d₂, respectively, then

d₁ ~ d₂ if and only if T₁ = T₂.

If T₁ ⊆ T₂ (i.e., if every open set in (X, d₁) is open in (X, d₂)), then T₂ is said to be stronger than T₁. In this case we also say that T₁ is weaker than T₂. The terms finer and coarser are also used as synonyms for "stronger" and "weaker", respectively. If either T₁ ⊆ T₂ or T₂ ⊆ T₁, then T₁ and T₂ are said to be commensurable. Otherwise (i.e., if neither T₁ ⊆ T₂ nor T₂ ⊆ T₁), the topologies are said to be incommensurable. As we shall see below, if T₂ is stronger than T₁, then continuity with respect to T₁ implies continuity with respect to T₂. On the other hand, if T₂ is stronger than T₁, then convergence with respect to T₂ implies convergence with respect to T₁. Briefly and roughly: "strong convergence" implies "weak convergence" but "weak continuity" implies "strong continuity".
Theorem 3.18. Let d₁ and d₂ be metrics on a set X, and consider the topologies T₁ and T₂ induced by d₁ and d₂, respectively. The following assertions are pairwise equivalent.

(a) T₂ is stronger than T₁ (i.e., T₁ ⊆ T₂).

(b) Every mapping F: X → Y that is continuous at x₀ ∈ X as a mapping of (X, d₁) into the metric space (Y, d) is continuous at x₀ as a mapping of (X, d₂) into (Y, d).

(c) Every X-valued sequence that converges in (X, d₂) to a limit x ∈ X converges in (X, d₁) to the same limit x.

(d) The identity map of (X, d₂) onto (X, d₁) is continuous.
Proof. Consider the topologies T₁ and T₂ on X induced by the metrics d₁ and d₂ on X, respectively. Let T denote the topology on a set Y induced by a metric d on Y.

Proof of (a)⇒(b). If F: (X, d₁) → (Y, d) is continuous at x₀ ∈ X, then (Lemma 3.11) for every U ∈ T that contains F(x₀) there exists U' ∈ T₁ containing x₀ such that U' ⊆ F⁻¹(U). If T₁ ⊆ T₂, then U' ∈ T₂: the inverse image (under F) of every open neighborhood of F(x₀) in T includes an open neighborhood of x₀ in T₂, which clearly implies that the inverse image (under F) of every neighborhood of F(x₀) in T is a neighborhood of x₀ in T₂. Thus, applying Lemma 3.11 again, F: (X, d₂) → (Y, d) is continuous at x₀.

Proof of (a)⇒(c). Let {xₙ} be an X-valued sequence. If xₙ → x ∈ X in (X, d₂), then (Theorem 3.14) {xₙ} is eventually in every open neighborhood of x in T₂. If T₁ ⊆ T₂ then, in particular, {xₙ} is eventually in every neighborhood of x in T₁. Therefore, applying Theorem 3.14 again, xₙ → x in (X, d₁).

Proof of (b)⇒(d). The identity map I: (X, d₁) → (X, d₁) of a metric space onto itself is trivially continuous. Thus, by setting (Y, d) = (X, d₁) in (b), it follows that (b) implies (d).

Proof of (c)⇒(d). Corollary 3.8 ensures that (c) implies (d).

Proof of (d)⇒(a). If the identity I: (X, d₂) → (X, d₁) is continuous, then U = I⁻¹(U) is open in T₂ whenever U is open in T₁, and hence T₁ ⊆ T₂. □

As the discrete topology is the strongest topology on X, the above theorem ensures that any function F: X → Y that is continuous in some topology on X is continuous in the discrete topology. Actually, since every subset of X is open in the discrete
topology, it follows that the inverse image of every subset of Y (no matter which topology equips the set Y) is an open subset of X when X is equipped with the discrete topology. Therefore, every function defined on a discrete topological space is continuous. On the other hand, if an X-valued (infinite) sequence converges in the discrete topology, then it is eventually constant (i.e., it has only a finite number of entries not equal to its limit), and hence it converges in any topology on X.

Corollary 3.19. Let (X, d₁) and (X, d₂) be metric spaces with the same underlying set X. The following assertions are pairwise equivalent.

(a) d₁ and d₂ are equivalent metrics on X.
(b) A mapping of X into a set Y is continuous at x₀ ∈ X as a mapping of (X, d₁) into the metric space (Y, d) if and only if it is continuous at x₀ as a mapping of (X, d₂) into (Y, d).

(c) An X-valued sequence converges in (X, d₁) to x ∈ X if and only if it converges in (X, d₂) to x.

(d) The identity map of (X, d₁) onto (X, d₂) and its inverse (i.e., the identity map of (X, d₂) onto (X, d₁)) are both continuous.
Proof. Recall that, by definition, two metrics d₁ and d₂ on a set X are equivalent if the topologies T₁ and T₂ on X, induced by d₁ and d₂ respectively, coincide (i.e., if T₁ = T₂). Now apply Theorem 3.18. □
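As a concrete instance of Corollary 3.19 (a standard example, not taken from the text): d₁(x, y) = |x − y| and d₂ = d₁/(1 + d₁) are equivalent metrics on ℝ, so a sequence converges in one if and only if it converges in the other. The sketch below checks this numerically for xₙ = 1/n → 0, together with the two-sided bound d₂ ≤ d₁ ≤ 2d₂, valid whenever d₁ ≤ 1.

```python
# Equivalent metrics on R (an assumed standard example): d1(x, y) = |x - y|
# and d2 = d1/(1 + d1) induce the same topology; x_n = 1/n -> 0 in both.
d1 = lambda x, y: abs(x - y)
d2 = lambda x, y: d1(x, y) / (1 + d1(x, y))

xs = [1 / n for n in range(1, 10001)]

tail_d1 = max(d1(x, 0) for x in xs[-100:])  # distances to the limit, late tail
tail_d2 = max(d2(x, 0) for x in xs[-100:])

# d2 <= d1 always, and d1 <= 2*d2 whenever d1 <= 1 (true for every x_n here):
# the identity map is Lipschitzian in both directions on this range.
bounds_ok = all(d2(x, 0) <= d1(x, 0) <= 2 * d2(x, 0) for x in xs)
print(tail_d1, tail_d2, bounds_ok)
```

The pair (d₁, d₂) is in fact uniformly equivalent in the sense discussed at the end of this section, since the two-sided Lipschitz bound holds on the whole range sampled here.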
A one-to-one mapping G of a metric space X onto a metric space Y is a homeomorphism if both G: X → Y and G⁻¹: Y → X are continuous. Equivalently, a homeomorphism between metric spaces is an invertible (i.e., injective and surjective) mapping that is continuous and has a continuous inverse. Obviously, G is a homeomorphism from X to Y if and only if G⁻¹ is a homeomorphism from Y to X. Two metric spaces are homeomorphic if there exists a homeomorphism between them. A function F: X → Y of a metric space X into a metric space Y is an open map (or an open mapping) if the image of each open set in X is open in Y (i.e., F(U) is open in Y whenever U is open in X).

Theorem 3.20. Let X and Y be metric spaces. If G: X → Y is invertible, then
(a) G is open if and only if G⁻¹ is continuous,

(b) G is continuous if and only if G⁻¹ is open,

(c) G is a homeomorphism if and only if G and G⁻¹ are both open.
Proof. If G is invertible, then it is trivially verified that the inverse image of B (B ⊆ Y) under G coincides with the image of B under the inverse of G (tautologically:
G⁻¹(B) = G⁻¹(B)). Applying the same argument to the inverse G⁻¹ of G (which is clearly invertible), (G⁻¹)⁻¹(A) = G(A) for each A ⊆ X. Thus the theorem is a straightforward combination of the definitions of open map and homeomorphism by using the alternative definition of continuity in Theorem 3.12. □

Thus a homeomorphism provides simultaneously a one-to-one correspondence between the underlying sets X and Y (for a homeomorphism is injective and surjective) and between their topologies (for a homeomorphism puts the open sets of T_X into a one-to-one correspondence with the open sets of T_Y). Indeed, if T_X and T_Y are the topologies on X and Y, respectively, then a homeomorphism G: X → Y induces a map Ĝ: T_X → T_Y, defined by Ĝ(U) = G(U) for every U ∈ T_X, which is injective and surjective according to Theorem 3.20. Therefore, any property of a metric space X expressed entirely in terms of set operations and open sets is also possessed by each metric space homeomorphic to X. We call a property of a metric space a topological property or a topological invariant if, whenever it is true for one metric space, say X, it is true for every metric space homeomorphic to X (trivial examples: the cardinality of the underlying set and the cardinality of the topology). A map F: X → Y of a metric space X into a metric space Y is a topological embedding of X into Y if it establishes a homeomorphism of X onto its range R(F) (i.e., F: X → Y is a topological embedding of X into Y if F: X → F(X) is a homeomorphism of X onto the subspace F(X) of Y).
Example 3K. Suppose G: X → Y is a homeomorphism of a metric space X onto a metric space Y. Let A be a subspace of X and consider the subspace G(A) of Y. According to Problem 3.30 the restriction G|_A: A → G(A) of G to A onto G(A) is continuous. Similarly, the restriction G⁻¹|_{G(A)}: G(A) → A of the inverse of G to G(A) onto G⁻¹(G(A)) = A is continuous as well. Since G⁻¹|_{G(A)} = (G|_A)⁻¹ (Problem 1.8), it follows that G|_A: A → G(A) is a homeomorphism, and hence A and G(A) are homeomorphic metric spaces (as subspaces of X and Y, respectively). Thus, as we might expect, the restriction G|_A: A → Y of a homeomorphism G: X → Y to any subset A of X is a topological embedding of A into Y.
The notions of homeomorphism, open map, topological invariant and topological embedding are germane to topological spaces in general (and to metric spaces in particular). For instance, both Theorem 3.20 and Example 3K can be likewise stated (and proved) in a topological-space setting. In other words, the metric has played no role in the above paragraph, and "metric space" can be replaced with "topological space" there. Next we shall consider a couple of concepts that only make sense in a metric space. A homeomorphism G of a metric space (X, d_X) onto a metric space (Y, d_Y) is a uniform homeomorphism if both G and G⁻¹ are uniformly continuous.
Two metric spaces are uniformly homeomorphic if there exists a uniform homeomorphism mapping one of them onto the other. An isometry between metric spaces is a map that preserves distance. Precisely, a mapping J : (X, dx) -* (Y, dy) of a metric space (X, dx) into a metric space (Y, dy) is an isometry if
d_Y(J(x), J(x')) = d_X(x, x')
for every pair of points x, x' in X. It is clear that every isometry is an injective contraction, and hence an injective and uniformly continuous mapping. Thus every surjective isometry is a uniform homeomorphism (the inverse of a surjective isometry is again a surjective isometry; trivial example: the identity mapping of a metric space onto itself is a surjective isometry on that space). Two metric spaces are isometric (or isometrically equivalent) if there exists a surjective isometry between them, so that two isometrically equivalent metric spaces are uniformly homeomorphic. It is trivially verified that a composition of surjective isometries is a surjective
isometry (transitivity), and this shows that the notion of isometrically equivalent metric spaces deserves its name: it is indeed an equivalence relation on any collection of metric spaces. If two metric spaces are isometrically equivalent, then they can be thought of as being essentially the same metric space: they may differ in the set-theoretic nature of their points but, as far as the metric-space (topological) structure is concerned, they are indistinguishable. A surjective isometry not only preserves open sets (for it is a homeomorphism) but also distance. Now consider two metric spaces (X, d1) and (X, d2) with the same underlying set X. According to Corollary 3.19 the metrics d1 and d2 are equivalent if and only if the identity map of (X, d1) onto (X, d2) is a homeomorphism (i.e., if and only if I: (X, d1) → (X, d2) and its inverse I^{-1}: (X, d2) → (X, d1) are both continuous). We say that the metrics d1 and d2 are uniformly equivalent if the identity map of (X, d1) onto (X, d2) is a uniform homeomorphism (i.e., I: (X, d1) → (X, d2) and its inverse I^{-1}: (X, d2) → (X, d1) are both uniformly continuous). For instance, if I and I^{-1} are both Lipschitzian, which means that there exist real numbers α > 0 and β > 0 such that
α d1(x, x') ≤ d2(x, x') ≤ β d1(x, x')
for every x, x' in X, then the metrics d1 and d2 are uniformly equivalent, and hence equivalent. Thus, if d1 and d2 are equivalent metrics on X, then (X, d1) and (X, d2) are homeomorphic metric spaces. However, the converse fails: there exist uniformly homeomorphic metric spaces with the same underlying set for which the identity is not a homeomorphism.
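The Lipschitz condition above can be checked numerically for a standard pair of uniformly equivalent metrics on R²: the Euclidean and taxicab metrics satisfy d_eucl ≤ d_taxi ≤ √2 d_eucl, so α = 1 and β = √2 work. The Python sketch below verifies the double inequality on randomly sampled points (the sampling illustrates, but of course does not prove, the inequality):

```python
import math
import random

def d_taxi(x, y):
    # taxicab (l1) metric on R^2
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d_eucl(x, y):
    # Euclidean (l2) metric on R^2
    return math.hypot(x[0] - y[0], x[1] - y[1])

random.seed(0)
pts = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(100)]
alpha, beta = 1.0, math.sqrt(2)
# check alpha*d_eucl <= d_taxi <= beta*d_eucl on every sampled pair
ok = all(alpha * d_eucl(p, q) <= d_taxi(p, q) <= beta * d_eucl(p, q) + 1e-12
         for p in pts for q in pts)
print(ok)
```

Since the identity map and its inverse are Lipschitzian with these constants, the two metrics are uniformly equivalent, and hence equivalent.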
Example 3L. Take two metric spaces (X, d1) and (X, d2) with the same underlying set X and consider the product spaces (X × X, d) and (X × X, d'), where
d((x, y), (u, v)) = d1(x, u) + d2(y, v),
d'((x, y), (u, v)) = d2(x, u) + d1(y, v),
for all ordered pairs (x, y), (u, v) in X × X. That is, (X × X, d) = (X, d1) × (X, d2) and (X × X, d') = (X, d2) × (X, d1) (see Problem 3.9). Suppose the metrics d1
and d2 on X are not equivalent, so that either the identity map of (X, d1) onto (X, d2) or the identity map of (X, d2) onto (X, d1) (or both) is not continuous. Let I: (X, d1) → (X, d2) be the one that is not continuous. The identity map
I: (X × X, d) → (X × X, d') is not continuous. Indeed, if it were continuous, then its restriction to any subspace of (X × X, d) would be continuous (Problem 3.30). In particular, its restriction to (X, d1), viewed as a subspace of (X × X, d) = (X, d1) × (X, d2), would be continuous. But such a restriction is clearly identified with the identity map of (X, d1) onto (X, d2), which is not continuous. Thus I: (X × X, d) → (X × X, d') is not continuous, and hence the metrics d and d' on X × X are not equivalent. Now let J: X × X → X × X be the involution on X × X defined by J((x, y)) = (y, x) for every (x, y) ∈ X × X (Problem 1.11). It is readily verified that J: (X × X, d) → (X × X, d') is a surjective isometry, so that
J: (X × X, d) → (X × X, d') is a uniform homeomorphism. Summing up: The metric spaces (X × X, d) and (X × X, d'), with the same underlying set X × X, are uniformly homeomorphic (more than that, they are isometrically equivalent), but the metrics d and d' on X × X are not equivalent.
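The isometry claim for the swap involution J can be verified mechanically. In the Python sketch below, d1 is taken to be the usual metric on R and d2 the discrete metric (two metrics that are indeed not equivalent); the sample points are an arbitrary illustrative choice:

```python
import random

def d1(s, t):          # usual metric on R
    return abs(s - t)

def d2(s, t):          # discrete metric on R (not equivalent to d1)
    return 0.0 if s == t else 1.0

def d(p, q):           # product metric on (X, d1) x (X, d2)
    return d1(p[0], q[0]) + d2(p[1], q[1])

def dprime(p, q):      # product metric on (X, d2) x (X, d1)
    return d2(p[0], q[0]) + d1(p[1], q[1])

def J(p):              # the swap involution J((x, y)) = (y, x)
    return (p[1], p[0])

random.seed(1)
pts = [(random.choice([0, 1, 2, 3.5]), random.choice([0, 1, 2, 3.5]))
       for _ in range(50)]
# d'(J(p), J(q)) = d2(p1, q1) + d1(p0, q0) = d(p, q), so J is an isometry
is_isometry = all(dprime(J(p), J(q)) == d(p, q) for p in pts for q in pts)
print(is_isometry)
```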
Since two metric spaces with the same underlying set may be homeomorphic even if the identity between them is not a homeomorphism, it follows that a weaker version of Corollary 3.19 is obtained if we replace the homeomorphic identity with an arbitrary homeomorphism. This in fact can be formulated for arbitrary metric spaces (not necessarily with the same underlying set).
Theorem 3.21. Let X and Y be metric spaces and let G be an invertible mapping of X onto Y. The following assertions are pairwise equivalent.
(a) G is a homeomorphism.
(b) A mapping F of X into a metric space Z is continuous if and only if the composition FG^{-1}: Y → Z is continuous.
(c) An X-valued sequence {x_n} converges in X to a limit x ∈ X if and only if the Y-valued sequence {G(x_n)} converges in Y to G(x).
Proof. Let G: X → Y be an invertible mapping of a metric space X onto a metric space Y.
Proof of (a)⇒(b). Let F: X → Z be a mapping of X into a metric space Z, and consider the commutative diagram formed by F: X → Z, the homeomorphism G: X → Y, and the map H: Y → Z completing the triangle (F = HG), so that H = FG^{-1}: Y → Z. Suppose (a) holds true, and consider the following assertions.
(b1) F: X → Z is continuous.
(b2) F^{-1}(U) is an open set in X whenever U is an open set in Z.
(b3) G(F^{-1}(U)) is an open set in Y whenever U is an open set in Z.
(b4) (FG^{-1})^{-1}(U) is an open set in Y whenever U is an open set in Z.
(b5) H = FG^{-1}: Y → Z is continuous.
Theorem 3.12 says that (b1) and (b2) are equivalent. But (b2) holds true if and only if (b3) holds true by Theorem 3.20 (the homeomorphism G: X → Y puts the open sets of X into a one-to-one correspondence with the open sets of Y). Now note that, as G is invertible,

G(F^{-1}(A)) = {G(x) ∈ Y: F(x) ∈ A} = {y ∈ Y: F(G^{-1}(y)) ∈ A} = (FG^{-1})^{-1}(A)

for every subset A of Z. Thus (b3) is equivalent to (b4), which in turn is equivalent to (b5) (cf. Theorem 3.12 again). Conclusion: (b1)⇔(b5) whenever (a) holds true.
Proof of (b)⇒(a). If (b) holds, then it holds in particular for Z = X and for Z = Y. Thus (b) ensures that the following assertions hold true.
(b') If a mapping F of X into itself is continuous, then the mapping H = FG^{-1}: Y → X is continuous.
(b'') A mapping F of X into Y is continuous whenever the mapping H = FG^{-1}: Y → Y is continuous.
Since the identity of X onto itself is a continuous mapping, (b') implies that G^{-1}: Y → X is continuous. By setting F = G in (b'') it follows that G: X → Y is continuous (because the identity I = GG^{-1}: Y → Y is continuous). Summing up: (b) implies that both G and G^{-1} are continuous, which means that (a) holds true.
Proof of (a)⇔(c). According to Corollary 3.8 an invertible mapping G between metric spaces is continuous and has a continuous inverse if and only if both G and G^{-1} preserve convergence. □
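The set identity G(F^{-1}(A)) = (FG^{-1})^{-1}(A) used in the proof of (a)⇒(b) is purely set-theoretic, so it can be checked exhaustively on small finite sets. In the Python sketch below, the sets X, Y, Z and the maps G and F are arbitrary illustrative choices:

```python
from itertools import combinations

X = {1, 2, 3}
Y = {'a', 'b', 'c'}
Z = {0, 1}

G = {1: 'a', 2: 'b', 3: 'c'}          # an invertible map X -> Y
Ginv = {v: k for k, v in G.items()}   # its inverse Y -> X
F = {1: 0, 2: 1, 3: 0}                # an arbitrary map X -> Z

def preimage(f, A):
    return {x for x in f if f[x] in A}

def image(f, S):
    return {f[x] for x in S}

def subsets(S):
    S = list(S)
    for r in range(len(S) + 1):
        yield from (set(c) for c in combinations(S, r))

H = {y: F[Ginv[y]] for y in Y}        # H = F o G^{-1} : Y -> Z
# verify G(F^{-1}(A)) = (F o G^{-1})^{-1}(A) for every subset A of Z
ok = all(image(G, preimage(F, A)) == preimage(H, A) for A in subsets(Z))
print(ok)
```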
3.5 Closed Sets and Closure
A subset V of a metric space X is closed in X if its complement X\V is an open set in X.

Theorem 3.22. If X is a metric space, then
(a) the whole set X and the empty set ∅ are closed,
(b) the union of a finite collection of closed sets is closed,
(c) the intersection of an arbitrary collection of closed sets is closed.

Proof. Apply the De Morgan laws to each item of Theorem 3.15. □
Thus the concepts "closed" and "open" are dual to each other (U is open in X if and only if its complement X\U is closed in X, and V is closed in X if and only if its complement X\V is open in X); but they are neither exclusive (a set in a metric space may be both open and closed) nor exhaustive (a set in a metric space may be neither open nor closed).

Theorem 3.23. A map between metric spaces is continuous if and only if the inverse image of each closed set is a closed set.
Proof. Let F: X → Y be a mapping of a metric space X into a metric space Y. Recall that F^{-1}(Y\B) = X\F^{-1}(B) for every subset B of Y (Problem 1.2(b)). Suppose F is continuous and take an arbitrary closed set V in Y. Since Y\V is open in Y, it follows by Theorem 3.12 that F^{-1}(Y\V) is open in X. Thus F^{-1}(V) = X\F^{-1}(Y\V) is closed in X. Therefore, the inverse image under F of an arbitrary closed set V in Y is closed in X. Conversely, suppose the inverse image under F of each closed set in Y is a closed set in X and take an arbitrary open set U in Y. Thus F^{-1}(Y\U) is closed in X (since Y\U is closed in Y), so that F^{-1}(U) = X\F^{-1}(Y\U) is open in X. Conclusion: The inverse image under F of an arbitrary open set U in Y is open in X. Therefore F is continuous by Theorem 3.12. □

A function F: X → Y of a metric space X into a metric space Y is a closed map (or a closed mapping) if the image of each closed set in X is closed in Y (i.e., F(V) is closed in Y whenever V is closed in X). In general, a map F: X → Y may possess any combination of the attributes "continuous", "open" and "closed" (i.e., they are independent concepts). However, if F: X → Y is invertible (i.e., injective and surjective), then it is a closed map if and only if it is an open map.
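The identity F^{-1}(Y\B) = X\F^{-1}(B) (Problem 1.2(b)), on which the proof of Theorem 3.23 rests, is likewise set-theoretic and can be checked exhaustively on finite sets; the particular map F below is an arbitrary (non-injective) example:

```python
from itertools import combinations

X = {0, 1, 2, 3}
Y = {'p', 'q', 'r'}
F = {0: 'p', 1: 'p', 2: 'q', 3: 'r'}   # an arbitrary map X -> Y

def preimage(f, B):
    return {x for x in f if f[x] in B}

def subsets(S):
    S = list(S)
    for r in range(len(S) + 1):
        yield from (set(c) for c in combinations(S, r))

# verify F^{-1}(Y\B) = X\F^{-1}(B) for every subset B of Y
ok = all(preimage(F, Y - B) == X - preimage(F, B) for B in subsets(Y))
print(ok)
```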
Theorem 3.24. Let X and Y be metric spaces. If a map G: X → Y is invertible, then
(a) G is closed if and only if G^{-1} is continuous,
(b) G is continuous if and only if G^{-1} is closed,
(c) G is a homeomorphism if and only if G and G^{-1} are both closed.

Proof. Replace "open map" with "closed map" in the proof of Theorem 3.20 and use Theorem 3.23 instead of Theorem 3.12. □

Let A be a set in a metric space X and let V_A be the collection of all closed subsets of X that include A:
V_A = {V ∈ P(X): V is closed in X and A ⊆ V}.
The whole set X always belongs to V_A, so that V_A is never empty. The intersection of all sets in V_A is called the closure of A in X, denoted by A⁻ (i.e., A⁻ = ⋂V_A). According to Theorem 3.22(c) it follows that A⁻ is closed in X
and
A ⊆ A⁻.
If V ∈ V_A, then A⁻ = ⋂V_A ⊆ V. Thus, with respect to the inclusion ordering of P(X),
A⁻ is the smallest closed subset of X that includes A,
and hence (since A⁻ is closed in X)
A is closed in X if and only if A = A⁻.
From the above displayed results it is readily verified that
∅⁻ = ∅, X⁻ = X, (A⁻)⁻ = A⁻
and, if B also is a set in X,
A ⊆ B implies A⁻ ⊆ B⁻.
Moreover, since both A and B are subsets of A ∪ B, it follows that A⁻ ⊆ (A ∪ B)⁻ and B⁻ ⊆ (A ∪ B)⁻, so that A⁻ ∪ B⁻ ⊆ (A ∪ B)⁻. On the other hand, since (A ∪ B)⁻ is the smallest closed subset of X that includes A ∪ B, and since A⁻ ∪ B⁻ is closed (Theorem 3.22(b)) and includes A ∪ B (for A ⊆ A⁻ and B ⊆ B⁻, so that A ∪ B ⊆ A⁻ ∪ B⁻), it follows that (A ∪ B)⁻ ⊆ A⁻ ∪ B⁻. Therefore, if A and B are subsets of X, then
(A ∪ B)⁻ = A⁻ ∪ B⁻.
It is easy to show by induction that the above identity holds for any finite collection of subsets of X. That is, the closure of the union of a finite collection of subsets of X coincides with the union of their closures. In general (i.e., by allowing infinite collections as well) one has inclusion rather than equality. Indeed, if (A_γ)_{γ∈Γ} is an arbitrary indexed family of subsets of X, then
⋃_γ A_γ⁻ ⊆ (⋃_γ A_γ)⁻
because A_α ⊆ ⋃_γ A_γ, and hence A_α⁻ ⊆ (⋃_γ A_γ)⁻, for each index α ∈ Γ. Similarly,
(⋂_γ A_γ)⁻ ⊆ ⋂_γ A_γ⁻
because ⋂_γ A_γ ⊆ ⋂_γ A_γ⁻ and ⋂_γ A_γ⁻ is closed in X by Theorem 3.22(c). However, the above two inclusions are not reversible in general, so that equality does not hold.

Example 3M. Set X = R with its usual metric and consider the following subsets of R: A_n = [0, 1 − 1/n], which is closed in R for each positive integer n, and A = [0, 1), which is not closed in R. Since
⋃_{n=1}^∞ A_n = A,
it follows that the union of an infinite collection of closed sets is not necessarily closed (see Theorem 3.22(b)). In particular, as A_n⁻ = A_n for each n and A⁻ = [0, 1],
[0, 1) = ⋃_{n=1}^∞ A_n⁻ ⊂ (⋃_{n=1}^∞ A_n)⁻ = [0, 1],
which is a proper inclusion. If B = [1, 2] (so that B⁻ = B), then
∅ = (A ∩ B)⁻ ⊂ A⁻ ∩ B⁻ = {1},
so that the closure of any (even finite) intersection of sets may be a proper subset of the intersection of their closures.

A point x in X is adherent to A (or an adherent point of A, or a point of adherence
of A) if it belongs to the closure A⁻ of A. It is clear that every point of A is an adherent point of A (i.e., A ⊆ A⁻).

Proposition 3.25. Let A be a subset of a metric space X and let x be a point in X. The following assertions are pairwise equivalent.
(a) x is a point of adherence of A.
(b) Every open set U in X that contains x meets A (i.e., if U is open in X and x ∈ U, then A ∩ U ≠ ∅).
(c) Every neighborhood N of x contains at least one point of A (which may be x itself).

Proof. Suppose there exists an open set U in X containing x for which A ∩ U = ∅.
Then A ⊆ X\U, the set X\U is closed in X, and x ∉ X\U. Since A⁻ is the
smallest closed subset of X that includes A, it follows that A⁻ ⊆ X\U, so that x ∉ A⁻. Thus the denial of (b) implies the denial of (a), which means that (a) implies (b). Conversely, if x ∉ A⁻, then x lies in the open set X\A⁻, which does not meet A⁻ (A⁻ ∩ (X\A⁻) = ∅), and hence does not meet A. Therefore, the denial of (a) implies the denial of (b); that is, (b) implies (a). Finally note that (b) is equivalent to (c) as an obvious consequence of the definition of neighborhood. □
A point x in X is a point of accumulation (or an accumulation point, or a cluster point) of A if it is a point of adherence of A\{x}. The set of all accumulation points of A is the derived set of A, denoted by A*. Thus x ∈ A* if and only if x ∈ (A\{x})⁻. It is clear that every point of accumulation of A is also a point of adherence of A; that is, A* ⊆ A⁻ (for A\{x} ⊆ A implies (A\{x})⁻ ⊆ A⁻). Actually,
A⁻ = A ∪ A*.
Indeed, since A* ⊆ A⁻ and A ⊆ A⁻, it follows that A ∪ A* ⊆ A⁻. On the other hand, if x ∉ A ∪ A*, then (A\{x})⁻ = A⁻ (because A\{x} = A whenever x ∉ A), and hence x ∉ A⁻ (for x ∉ A*, so that x ∉ (A\{x})⁻). Therefore, x ∈ A⁻ implies x ∈ A ∪ A*, which means A⁻ ⊆ A ∪ A*. Hence A⁻ = A ∪ A*. Thus
A = A⁻ if and only if A* ⊆ A.
That is, A is closed in X if and only if it contains all its accumulation points. It is trivially verified that A ⊆ B implies A* ⊆ B* whenever A and B are subsets of X. Also note that A⁻ = ∅ if and only if A = ∅ (for ∅⁻ = ∅ and A ⊆ A⁻), and A* = ∅ whenever A = ∅ (because A* ⊆ A⁻), but the converse fails (e.g., the derived set of a singleton is empty).
Proposition 3.26. Let A be a subset of a metric space X and let x be a point in X. The following assertions are pairwise equivalent.
(a) x is a point of accumulation of A.
(b) Every open set U in X that contains x also contains at least one point of A other than x.
(c) Every neighborhood N of x contains at least one point of A distinct from x.

Proof. Since x ∈ X is a point of accumulation of A if and only if it is a point of adherence of A\{x}, it follows by Proposition 3.25 that assertions (a), (b) and (c) are pairwise equivalent (replace A with A\{x} in Proposition 3.25). □

Everything that has been written so far in this section pertains to the realm of topological spaces (metrizable or not). However, the next results are typical of metric spaces.
Proposition 3.27. Let A be a subset of a metric space (X, d) and let x be a point in X. The following assertions are pairwise equivalent.
(a) x is a point of adherence of A.
(b) Every nonempty open ball centered at x meets A.
(c) A ≠ ∅ and d(x, A) = 0.
(d) There exists an A-valued sequence that converges to x in (X, d).
Proof. The equivalence (a)⇔(b) follows by Proposition 3.25 (recall: a nonempty open ball centered at x is a neighborhood of x and, conversely, every neighborhood of x includes a nonempty open ball centered at x, so that every nonempty open ball centered at x meets A if and only if every neighborhood of x meets A). Clearly (b)⇔(c) (i.e., for each ε > 0 there exists a ∈ A such that d(x, a) < ε if and only if A ≠ ∅ and inf_{a∈A} d(x, a) = 0). Theorem 3.14 ensures that (d)⇒(b). On the other hand, if (b) holds true, then for each positive integer n the open ball B_{1/n}(x) meets A (i.e., B_{1/n}(x) ∩ A ≠ ∅). Take x_n ∈ B_{1/n}(x) ∩ A, so that x_n ∈ A and 0 ≤ d(x_n, x) < 1/n for each n. Thus {x_n} is an A-valued sequence such that d(x_n, x) → 0. Therefore (b)⇒(d). □
Proposition 3.28. Let A be a subset of a metric space (X, d) and let x be a point in X. The following assertions are pairwise equivalent. (a) x is a point of accumulation of A.
(b) Every nonempty open ball centered at x contains a point of A distinct from x. (c) Every nonempty open ball centered at x contains infinitely many points of A.
(d) There exists an A\{x}-valued sequence of pairwise distinct points that converges to x in (X, d).
Proof. Note that (d)⇒(c) by Theorem 3.14, (c)⇒(b) trivially, and (a)⇔(b) by the previous proposition. To complete the proof it remains to show that (b)⇒(d). Let B_ε(x) be the open ball centered at x ∈ X with radius ε > 0. We shall say that an A-valued sequence {x_k}_{k∈N} has Property P_n, for some integer n ∈ N, if x_k ∈ B_{1/k}(x)\{x} for each k = 1, ..., n+1 and d(x_{k+1}, x) < d(x_k, x) for every k = 1, ..., n.

Claim. If assertion (b) holds true, then there exists an A-valued sequence that has Property P_n for every n ∈ N.

Proof. Suppose (b) holds true, so that (B_ε(x)\{x}) ∩ A ≠ ∅ for every ε > 0. Take an arbitrary x_1 ∈ (B_1(x)\{x}) ∩ A and an arbitrary x_2 ∈ (B_{ε_2}(x)\{x}) ∩ A, where ε_2 = min{1/2, d(x_1, x)}. Every A-valued sequence whose first two entries coincide with x_1 and x_2 has Property P_1. Suppose there exists an A-valued sequence that has Property P_n for some integer n ∈ N. Take any point from (B_{ε_{n+2}}(x)\{x}) ∩ A, where
ε_{n+2} = min{1/(n+2), d(x_{n+1}, x)}, and replace the (n+2)th entry of that sequence with this point. The resulting sequence has Property P_{n+1}. Thus there exists an A-valued sequence that has Property P_{n+1} whenever there exists one that has Property P_n, and this concludes the proof by induction. □

However, an A-valued sequence {x_k}_{k∈N} that has Property P_n for every n ∈ N in fact is an A\{x}-valued sequence of pairwise distinct points such that 0 < d(x_k, x) < 1/k for every k ∈ N. Therefore (b)⇒(d). □

Recall that "point of adherence" and "point of accumulation" are concepts defined for sets, while "limit of a convergent sequence" is, of course, a concept defined for sequences. But the range of a sequence is a set, and it can have (many) accumulation points. Let (X, d) be a metric space and let {x_n} be an X-valued sequence. A point x in X is a cluster point of the sequence {x_n} if some subsequence of {x_n} converges to x. The cluster points of a sequence are precisely the accumulation points of its range (Proposition 3.28). If a sequence is convergent, then (Proposition 3.5) its range has only one point of accumulation, which coincides with the unique limit of the sequence.
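A non-convergent sequence can have several cluster points. For instance, x_n = (−1)^n (1 − 1/n) does not converge, but its even-indexed subsequence converges to 1 and its odd-indexed one to −1, so 1 and −1 are cluster points (and accumulation points of its range). A minimal numerical sketch:

```python
# x_n = (-1)^n (1 - 1/n): two convergent subsequences, hence two cluster points
x = [(-1) ** n * (1 - 1 / n) for n in range(1, 20001)]
even = x[1::2]   # x_2, x_4, ...  -> 1
odd = x[0::2]    # x_1, x_3, ...  -> -1
print(abs(even[-1] - 1) < 1e-3, abs(odd[-1] + 1) < 1e-3)
```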
Corollary 3.29. The derived set A* of every subset A of a metric space (X, d) is closed in (X, d).
Proof. Let A be an arbitrary subset of a metric space (X, d). We want to show that (A*)⁻ = A* (i.e., A* is closed) or, equivalently, (A*)⁻ ⊆ A* (recall: every set is included in its closure). If A is empty, then the result is trivially verified (∅* = ∅ = ∅⁻). Thus suppose A is nonempty. Take an arbitrary x⁻ ∈ (A*)⁻ and an arbitrary ε > 0. Proposition 3.27 ensures that B_ε(x⁻) ∩ A* ≠ ∅. Take x* ∈ B_ε(x⁻) ∩ A* and set δ = ε − d(x*, x⁻). Note that 0 < δ ≤ ε (for 0 ≤ d(x*, x⁻) < ε). Since x* ∈ A*, it follows by Proposition 3.28 that B_δ(x*) ∩ A contains infinitely many points. Take x ∈ B_δ(x*) ∩ A distinct from x⁻ and from x*. Thus 0 < d(x, x⁻) ≤ d(x, x*) + d(x*, x⁻) < δ + d(x*, x⁻) = ε by the triangle inequality. Therefore x ∈ B_ε(x⁻) and x ≠ x⁻. Conclusion: Every nonempty ball B_ε(x⁻) centered at x⁻ contains a point x of A other than x⁻. Thus x⁻ ∈ A* by Proposition 3.28, and therefore (A*)⁻ ⊆ A*. □

The above corollary does not hold in a general topological space. Indeed, if a set X containing more than one point is equipped with the indiscrete topology (where the only open sets are ∅ and X), then the derived set {x}* of a singleton {x} is X\{x}, which is not closed in that topology.
Theorem 3.30. (The Closed Set Theorem). A subset A of a set X is closed in the metric space (X, d) if and only if every A-valued sequence that converges in (X, d) has its limit in A.

Proof. (a) Take an arbitrary A-valued sequence {x_n} that converges to x ∈ X in (X, d). By Theorem 3.14 {x_n} is eventually in every neighborhood of x, and hence every neighborhood of x contains a point of A. Thus x is a point of adherence of
A (Proposition 3.25); that is, x ∈ A⁻. If A = A⁻ (equivalently, if A is closed in (X, d)), then x ∈ A.
(b) Take an arbitrary point x ∈ A⁻ (i.e., an arbitrary point of adherence of A). According to Proposition 3.27, there exists an A-valued sequence that converges to x in (X, d). If every A-valued sequence that converges in (X, d) has its limit in A, then x ∈ A. Thus A⁻ ⊆ A, and hence A = A⁻ (for A ⊆ A⁻ for every set A). That is, A is closed in (X, d). □

This is a particularly useful result that will often be applied throughout this book. Note that part (a) of the proof holds for general topological spaces but not part (b). The counterpart of the above theorem for general (not necessarily metrizable) topological spaces is stated in terms of nets (instead of sequences).
Example 3N. Consider the set B[X, Y] of all bounded mappings of a metric space (X, d_X) into a metric space (Y, d_Y), and let BC[X, Y] denote the subset of B[X, Y] consisting of all bounded continuous mappings of (X, d_X) into (Y, d_Y). Equip B[X, Y] with the sup-metric d∞ as in Example 3C. We shall use the Closed Set Theorem to show that
BC[X, Y] is closed in (B[X, Y], d∞).
Take an arbitrary BC[X, Y]-valued sequence {f_n} that converges in (B[X, Y], d∞) to a mapping f ∈ B[X, Y]. The triangle inequality in (Y, d_Y) ensures that
d_Y(f(u), f(v)) ≤ d_Y(f(u), f_n(u)) + d_Y(f_n(u), f_n(v)) + d_Y(f_n(v), f(v))
for each integer n and every u, v ∈ X. Take an arbitrary real number ε > 0. Since f_n → f in (B[X, Y], d∞), it follows that there exists a positive integer n_ε such that
d∞(f_n, f) = sup_{x∈X} d_Y(f_n(x), f(x)) < ε/3, and hence d_Y(f_n(x), f(x)) < ε/3
for all x ∈ X, whenever n ≥ n_ε (uniform convergence; see Example 3G). Since each f_n is continuous, it follows that there exists a real number δ_ε > 0 (which may depend on the point u) such that d_Y(f_{n_ε}(u), f_{n_ε}(v)) < ε/3 whenever d_X(u, v) < δ_ε. Therefore d_Y(f(u), f(v)) < ε whenever d_X(u, v) < δ_ε, so that f is continuous. That is, f ∈ BC[X, Y]. Thus, according to Theorem 3.30, BC[X, Y] is a closed subset of the metric space (B[X, Y], d∞). Particular case (see Examples 3D and 3G): C[0, 1] is closed in (B[0, 1], d∞).
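The distinction between uniform and merely pointwise convergence in Example 3N can be seen numerically. On a grid of [0, 1], g_n(x) = x/n converges uniformly to the continuous zero function (its sup-distance goes to 0), while f_n(x) = x^n converges only pointwise to a discontinuous limit and its sup-distance stays near 1. A sketch (the grid approximation of the sup is ours, for illustration only):

```python
# sup-metric distances approximated on a grid of [0, 1]
grid = [k / 1000 for k in range(1001)]

def d_sup(f, g):
    return max(abs(f(x) - g(x)) for x in grid)

zero = lambda x: 0.0
step = lambda x: 1.0 if x == 1 else 0.0   # pointwise limit of x^n

g_dist = [d_sup(lambda x, n=n: x / n, zero) for n in (1, 10, 100)]
f_dist = [d_sup(lambda x, n=n: x ** n, step) for n in (1, 10, 100)]
print(g_dist)   # decreases to 0: uniform convergence, continuous limit
print(f_dist)   # stays near 1: no uniform convergence
```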
3.6 Dense Sets and Separable Spaces
Let A be a set in a metric space X and let U_A be the collection of all open subsets of X included in A:
U_A = {U ∈ P(X): U is open in X and U ⊆ A}.
The empty set ∅ of X always belongs to U_A, so that U_A is never empty. The union of all sets in U_A is called the interior of A in X, denoted by A° (i.e., A° = ⋃U_A). According to Theorem 3.15(c) it follows that A° is open in X and
A° ⊆ A.
If U ∈ U_A, then U ⊆ ⋃U_A = A°. Thus, with respect to the inclusion ordering of P(X),
A° is the largest open subset of X that is included in A,
and hence (since A° is open in X)
A is open in X if and only if A° = A.
From the above displayed results it is readily verified that
∅° = ∅, X° = X, (A°)° = A°
and, if B also is a set in X,
A ⊆ B implies A° ⊆ B°.
Moreover, since A ∩ B is a subset of both A and B, it follows that (A ∩ B)° ⊆ A° ∩ B°. On the other hand, since (A ∩ B)° is the largest open subset of X that is included in A ∩ B, and since A° ∩ B° is open (Theorem 3.15(b)) and is included in A ∩ B (for A° ⊆ A and B° ⊆ B, so that A° ∩ B° ⊆ A ∩ B), it follows that A° ∩ B° ⊆ (A ∩ B)°. Therefore, if A and B are subsets of X, then
(A ∩ B)° = A° ∩ B°.
It is easy to show by induction that the above identity holds for any finite collection of subsets of X. That is, the interior of the intersection of a finite collection of subsets of X coincides with the intersection of their interiors. In general (i.e., by allowing infinite collections as well) one has inclusion rather than equality. Indeed, if (A_γ)_{γ∈Γ} is an arbitrary indexed family of subsets of X, then
(⋂_γ A_γ)° ⊆ ⋂_γ A_γ°
because ⋂_γ A_γ ⊆ A_α, and hence (⋂_γ A_γ)° ⊆ A_α°, for each index α ∈ Γ. Similarly,
⋃_γ A_γ° ⊆ (⋃_γ A_γ)°
because ⋃_γ A_γ° ⊆ ⋃_γ A_γ and ⋃_γ A_γ° is open in X by Theorem 3.15(c). However, the above two inclusions are not reversible in general, so that equality does not hold.
Example 3O. This is the dual of Example 3M. Consider the setup of Example 3M and set C_n = X\A_n, which is open in R for each positive integer n, and C = X\A, which is not open in R. Since
⋂_{n=1}^∞ C_n = ⋂_{n=1}^∞ (X\A_n) = X\⋃_{n=1}^∞ A_n = X\A = C,
it follows that the intersection of an infinite collection of open sets is not necessarily open (see Theorem 3.15(b)). In particular, as C_n° = C_n,
(X\A)° = C° = (⋂_{n=1}^∞ C_n)° ⊂ ⋂_{n=1}^∞ C_n° = C = X\A,
which is a proper inclusion. Now set D = X\B = (−∞, 1) ∪ (2, ∞) (so that D° = D). Thus C° ∪ D° is a proper subset of (C ∪ D)°:
R\{1} = C° ∪ D° ⊂ (C ∪ D)° = R.

Remark: The duality between "interior" and "closure" is clear:
(X\A)⁻ = X\A° and (X\A)° = X\A⁻
for every subset A of X. Indeed, U ∈ U_A if and only if X\U ∈ V_{X\A} (i.e., U is open in X and U ⊆ A if and only if X\U is closed in X and X\A ⊆ X\U) and, dually, V ∈ V_{X\A} if and only if X\V ∈ U_A. Thus A° = X\⋂_{U∈U_A}(X\U) = X\⋂_{V∈V_{X\A}}V = X\(X\A)⁻, and so X\A° = (X\A)⁻; which implies (swap A and X\A) that X\(X\A)° = A⁻, and hence (X\A)° = X\A⁻. This confirms the above identities and also their equivalent forms:
A° = X\(X\A)⁻ and A⁻ = X\(X\A)°.
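Since the duality A° = X\(X\A)⁻ is purely topological, it can be checked exhaustively in a small finite topological space, where closure and interior can be computed directly from their definitions as intersection of closed supersets and union of open subsets. The four-point topology below is an arbitrary illustrative choice:

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})
# a small topology on X: closed under arbitrary unions and intersections
T = [frozenset(), frozenset({1}), frozenset({1, 2}), X]

def subsets(S):
    S = list(S)
    for r in range(len(S) + 1):
        yield from (frozenset(c) for c in combinations(S, r))

def interior(A):
    # largest open set included in A (union of all open subsets of A)
    return frozenset().union(*[U for U in T if U <= A])

def closure(A):
    # smallest closed set including A (intersection of closed supersets)
    out = X
    for V in (X - U for U in T):
        if A <= V:
            out &= V
    return out

# verify A° = X \ (X\A)⁻ for every subset A of X
ok = all(interior(A) == X - closure(X - A) for A in subsets(X))
print(ok)
```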
A point x ∈ X is an interior point of A if it belongs to the interior A° of A. It is clear that every interior point of A is a point of A (i.e., A° ⊆ A), and it is readily verified that x ∈ A is an interior point of A if and only if there exists a neighborhood of x included in A (reason: A° is the largest open subset of X included in A, and it is a neighborhood of each of its points). A subset A of a metric space X is called dense in X (or dense everywhere) if its closure A⁻ coincides with X (i.e., if A⁻ = X). More generally, suppose A and B are subsets of a metric space X such that A ⊆ B. A is dense in B if B ⊆ A⁻ or, equivalently, if A⁻ = B⁻. (Why?) Clearly, if A ⊆ B and A⁻ = X, then B⁻ = X. Note that the only closed set dense in X is X itself.

Proposition 3.31. Let A be a subset of a metric space X. The following assertions are pairwise equivalent.
(a) A⁻ = X (i.e., A is dense in X).
(b) Every nonempty open subset of X meets A.
(c) V_A = {X}.
(d) (X\A)° = ∅ (i.e., the complement of A has empty interior).
Proof. Take any nonempty open subset U of X, and take an arbitrary u ∈ U ⊆ X. If (a) holds true, then every point of X is adherent to A. In particular, u is adherent to A. Thus Proposition 3.25 ensures that U meets A. Conclusion: (a)⇒(b). Now take an arbitrary proper closed subset V of X, so that ∅ ≠ X\V is open in X. If (b) holds true, then (X\V) ∩ A ≠ ∅. Thus V does not include A, and hence V ∉ V_A. Therefore (b)⇒(c). Since A⁻ surely belongs to V_A, it follows that (c)⇒(a). The equivalence (a)⇔(d) is obvious from the identity A⁻ = X\(X\A)°. □
The reader has probably observed that the concepts and results so far in this section apply to topological spaces in general. From now on the metric will play its role. Note that a point in a subset A of a metric space X is an interior point of A if and only if it is the center of a nonempty open ball included in A (reason: every nonempty open ball is a neighborhood and every neighborhood includes a nonempty open ball). We shall say that (A, d) is a dense subspace of a metric space (X, d) if the subset A of X is dense in (X, d).

Proposition 3.32. Let (X, d) be a metric space and let A and B be subsets of X such that ∅ ≠ A ⊆ B ⊆ X. The following assertions are pairwise equivalent.
(a) A⁻ = B⁻ (i.e., A is dense in B).
(b) Every nonempty open ball centered at any point b of B meets A.
(c) inf_{a∈A} d(b, a) = 0 for every b ∈ B.
(d) For every point b in B there exists an A-valued sequence {a_n} that converges in (X, d) to b.
Proof. Recall that A⁻ = B⁻ if and only if B ⊆ A⁻. Let b be an arbitrary point in B. Thus assertion (a) can be rewritten as
(a') every point b in B is a point of adherence of A.
Now notice that assertions (a'), (b), (c) and (d) are pairwise equivalent by Proposition 3.27. □
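Criterion (c) of Proposition 3.32 is how one usually verifies that Q is dense in R: for any real b, the decimal truncations a_k = ⌊10^k b⌋/10^k are rational and satisfy d(b, a_k) ≤ 10^{-k}, so inf_{a∈Q} |b − a| = 0 and {a_k} is a Q-valued sequence converging to b. A sketch for b = π (exact rational truncations via the standard library):

```python
import math
from fractions import Fraction

b = math.pi
# rational truncations a_k = floor(10^k * b) / 10^k, exact as Fractions
approx = [Fraction(math.floor(b * 10 ** k), 10 ** k) for k in range(8)]
dists = [abs(b - float(a)) for a in approx]   # d(b, a_k) <= 10^{-k}
print(all(dists[k] <= 10.0 ** -k for k in range(8)))
```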
Corollary 3.33. Let F and G be continuous mappings of a metric space X into a metric space Y. If F and G coincide on a dense subset of X, then they coincide on the whole space X.
Proof. Suppose X is nonempty, to avoid trivialities, and let A be a nonempty dense subset of X. Take an arbitrary x ∈ X and let {a_n} be an A-valued sequence that converges in X to x (whose existence is ensured by Proposition 3.32). If F: X → Y and G: X → Y are continuous mappings such that F|A = G|A, then
F(x) = F(lim a_n) = lim F(a_n) = lim G(a_n) = G(lim a_n) = G(x)
(Corollary 3.8). Thus F(x) = G(x) for every x ∈ X; that is, F = G. □
A metric space (X, d) is separable if there exists a countable dense set in X. The density criteria in Proposition 3.32 (with B = X) are particularly useful to check separability.
Example 3P. Take an arbitrary integer n ≥ 1, an arbitrary real p ≥ 1, and consider the metric space (R^n, d_p) of Example 3A. Since the set of all rational numbers Q is dense in the real line R equipped with its usual metric, it follows that Q^n (the set of all rational n-tuples) is dense in (R^n, d_p). Indeed, Q⁻ = R implies that
inf_{υ∈Q} |ξ − υ| = 0 for every ξ ∈ R,
which in turn implies that
inf_{y∈Q^n} d_p(x, y) = inf_{y=(υ_1,...,υ_n)∈Q^n} (Σ_{i=1}^n |ξ_i − υ_i|^p)^{1/p} = 0
for every x = (ξ_1, ..., ξ_n) ∈ R^n. Hence (Q^n)⁻ = R^n, according to Proposition 3.32. Moreover, since #Q^n = #Q = ℵ₀ (Problems 1.25(c) and 2.8), it follows that Q^n is countably infinite. Therefore Q^n is a countable dense subset of (R^n, d_p), and hence (R^n, d_p) is a separable metric space. Now consider the metric space (ℓ_+^p, d_p) for any p ≥ 1 as in Example 3B, where ℓ_+^p is the set of all real-valued p-summable infinite sequences. Let ℓ_0 be the subset of ℓ_+^p made up of all real-valued infinite sequences with a finite number of nonzero entries, and let X be the subset of ℓ_0 consisting of all rational-valued infinite sequences with a finite number of nonzero entries. The set ℓ_0 is dense in (ℓ_+^p, d_p) (Problem 3.44(c)). Since Q⁻ = R, it follows that X is dense in (ℓ_0, d_p); the proof is essentially the same as the proof that Q^n is dense in (R^n, d_p). Thus X⁻ = (ℓ_0)⁻ = ℓ_+^p, and hence X is dense in (ℓ_+^p, d_p). Next we show that X is countably infinite. In fact, X is a linear space over the rational field Q for which dim X = ℵ₀ (see Example 2J). Thus, according to Problem 2.8, #X = max{#Q, dim X} = ℵ₀. Conclusion: X is a countable dense subset of (ℓ_+^p, d_p), and so (ℓ_+^p, d_p) is a separable metric space.
The same argument is readily extended to complex spaces, so that (C^n, d_p) also is separable, as well as (ℓ_+^p, d_p) when ℓ_+^p is made up of all complex-valued p-summable infinite sequences. Finally we show that
(C[0, 1], d∞) is a separable metric space
(see Example 3D). Actually, the set P[0, 1] of all polynomials on [0, 1] is dense in (C[0, 1], d∞). This is the well-known Weierstrass Theorem, which says that every continuous function in C[0, 1] is the uniform limit of a sequence of polynomials in P[0, 1] (i.e., for every x ∈ C[0, 1] there exists a P[0, 1]-valued sequence {p_n} such that d∞(p_n, x) → 0). Moreover, it is easy to show that the set X of all polynomials on [0, 1] with rational coefficients is dense in (P[0, 1], d∞), and hence X is dense in (C[0, 1], d∞). Since X is a linear space over the rational field Q, and since dim X = ℵ₀ (essentially the same proof as that in Example 2M), it follows by Problem 2.8 that X is countable. Thus X is a countable dense subset of (C[0, 1], d∞).
A collection B of open subsets of a metric space X is a base (or a topological base) for X if every open set in X is the union of some subcollection of B. For instance, the collection of all open balls in a metric space (including the empty ball) is a base for X (cf. Corollary 3.16). Note that the above definition of base forces the empty set ∅ of X to be a member of any base for X.
Proposition 3.34. Let X be a metric space and let B be a collection of open subsets of X that contains the empty set. The following assertions are pairwise equivalent. (a) B is a base for X.
(b) For every nonempty open subset U of X and every point x in U, there exists a set B in B such that x ∈ B ⊆ U.
(c) For every x in X and every neighborhood N of x, there exists a set B in B such that x ∈ B ⊆ N.

Proof. Take an arbitrary open subset U of the metric space X and set B_U = {B ∈ B : B ⊆ U}. If B is a base for X, then U = ∪B_U by definition of base. Thus, if x ∈ U, then x ∈ B for some B ∈ B_U so that x ∈ B ⊆ U. That is, (a) implies (b). On the other hand, if (b) holds, then any open subset U of X clearly coincides with ∪B_U, which shows that (a) holds true. Finally note that (b) and (c) are trivially equivalent:
every neighborhood of x includes an open set containing x, and every open set containing x is a neighborhood of x. □

Theorem 3.35. A metric space (X, d) is separable if and only if it has a countable base.

Proof. Suppose B is a countable base for X. For each nonempty set B in B take an arbitrary point b_B in B. Now consider the set {b_B} consisting of one point of each nonempty set B in B. The set {b_B} is countable and every nonempty open subset U of X meets {b_B} (since U is the union of some subcollection of B, so that U ∩ {b_B} ≠ ∅). Therefore (cf. Proposition 3.31) {b_B} is a countable dense subset of X, and hence X is separable. On the other hand, suppose X is separable. Then there exists a countable subset A of X that is dense in X. Consider the collection
3.6 Dense Sets and Separable Spaces
of nonempty open balls B = {B_{1/n}(a) : n ∈ N and a ∈ A}. Observe that B = ∪_{a∈A} {B_{1/n}(a)}_{n∈N}, a countable union of countable collections, so that B is itself a countable collection (Corollary 1.11).
Claim. For every x ∈ X and every neighborhood N of x there exists a ball in B containing x and included in N.

Proof. Take an arbitrary x ∈ X and an arbitrary neighborhood N of x. Let B_ε(x) be an open ball of radius ε > 0, centered at x, and included in N. Now take a positive integer n such that 1/n < ε/2, and a point a ∈ A such that a ∈ B_{1/n}(x) (recall: since A⁻ = X, it follows by Proposition 3.32 that there exists a ∈ A such that a ∈ B_ρ(x) for every x ∈ X and every ρ > 0). Obviously, x ∈ B_{1/n}(a). Moreover, B_{1/n}(a) ⊆ B_ε(x). Indeed, if y ∈ B_{1/n}(a), then d(x, y) ≤ d(x, a) + d(a, y) < 2/n < ε so that y ∈ B_ε(x). Thus x ∈ B_{1/n}(a) ⊆ B_ε(x) ⊆ N. □
Therefore the countable collection B ∪ {∅} of open balls is a base for X by Proposition 3.34. □
Corollary 3.36. Every subspace of a separable metric space is itself separable.

Proof. Let S be a subspace of a separable metric space X and, according to Theorem 3.35, let B be a countable base for X. Set B_S = {S ∩ B : B ∈ B}, which is a countable collection of subsets of S. Since the sets in B are open subsets of X, it follows that the sets in B_S are open relative to S (see Problem 3.38(c)). Take an arbitrary nonempty relatively open subset A of S so that A = S ∩ U for some open subset U of X (Problem 3.38(c)). Since U = ∪_{B∈B'} B for some subcollection B' of B, it follows that A = S ∩ ∪_{B∈B'} B = ∪_{B∈B'} (S ∩ B) = ∪B'_S, where B'_S = {S ∩ B : B ∈ B'} is a subcollection of B_S. Thus B_S is a base for S. Therefore the subspace S has a countable base, which means by the previous theorem that S is separable. □
Let A be a subset of a metric space. An isolated point of A is a point in A that is not an accumulation point of A. That is, a point x is an isolated point of A if x ∈ A \ A'.

Proposition 3.37. Let A be a subset of a metric space X and let x be a point in A. The following assertions are pairwise equivalent.

(a) x is an isolated point of A.

(b) There exists an open set U in X such that A ∩ U = {x}.

(c) There exists a neighborhood N of x such that A ∩ N = {x}.

(d) There exists a nonempty open ball B_ρ(x) centered at x such that A ∩ B_ρ(x) = {x}.
Proof. Assertion (a) is equivalent to assertion (b) by Proposition 3.26. Assertions (b), (c) and (d) are trivially pairwise equivalent. □
A subset A of X consisting entirely of isolated points is a discrete subset of X. This means that in the subspace A every set is open, and hence the subspace A is homeomorphic to a discrete space (i.e., to a metric space equipped with the discrete metric). According to Theorem 3.35 and Corollary 3.36, a discrete subset of a separable metric space is countable. Thus, if a metric space has an uncountable discrete subset, then it is not separable.
Example 3Q. Let S be a set, let (Y, d) be a metric space, and consider the metric space (B[S, Y], d∞) of all bounded mappings of S into (Y, d) equipped with the sup-metric d∞ (see Example 3C). Suppose Y has more than one point, and let y0 and y1 be two distinct points in Y. As usual, let 2^S denote the set of all mappings on S with values either y0 or y1 (i.e., the set of all mappings of S into {y0, y1}, so that 2^S = {y0, y1}^S ⊆ B[S, Y]). If f, g ∈ 2^S and f ≠ g (i.e., if f and g are two distinct mappings on S with values either y0 or y1), then

d∞(f, g) = sup_{s∈S} d(f(s), g(s)) = d(y0, y1) ≠ 0.

Therefore, any open ball B_ρ(g) = {f ∈ 2^S : d∞(f, g) < ρ} centered at an arbitrary point g of 2^S with radius ρ = d(y0, y1)/2 is such that 2^S ∩ B_ρ(g) = {g}. This means that every point of 2^S is an isolated point of it, and hence
2^S is a discrete set in (B[S, Y], d∞).

If S is an infinite set, then 2^S is an uncountable subset of B[S, Y] (recall: if S is infinite, then ℵ0 ≤ #S < #2^S according to Theorems 1.4 and 1.5). Thus (B[S, Y], d∞) is not separable whenever 2 ≤ #Y and ℵ0 ≤ #S. Concrete example:

(ℓ+^∞, d∞) is not a separable metric space.

Indeed, set S = N and Y = C (or Y = R) with its usual metric d, so that (B[S, Y], d∞) = (ℓ+^∞, d∞): the set of all scalar-valued bounded sequences equipped with the sup-metric, as introduced in Example 3B. The set 2^N, consisting of all sequences with values either 0 or 1, is an uncountable discrete subset of (ℓ+^∞, d∞).
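The distance computation behind this example is elementary and can be sketched in Python. Truncating the 0-1 sequences to finitely many coordinates is an assumption of the illustration only: any two distinct 0-1 valued sequences differ in some coordinate, so their sup-distance is exactly 1, which is why the balls of radius 1/2 around them are pairwise disjoint.

```python
def d_sup(f, g):
    # sup-metric between two finitely represented sequences (tuples)
    return max(abs(a - b) for a, b in zip(f, g))

# three distinct 0-1 "sequences", truncated to 8 coordinates for illustration
seqs = [(0, 1, 0, 1, 0, 1, 0, 1),
        (1, 1, 0, 0, 1, 1, 0, 0),
        (0, 0, 0, 0, 0, 0, 0, 1)]

dists = [d_sup(f, g) for i, f in enumerate(seqs) for g in seqs[i + 1:]]
print(dists)  # every pair of distinct 0-1 sequences is at distance exactly 1
```

Since uncountably many such sequences exist when S is infinite, no countable set can meet every ball of radius 1/2, which is the non-separability argument in the text.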
In a discrete subset every point is isolated. The opposite notion is that of a set where every point is not isolated. A subset A of a metric space X is dense in itself if A has no isolated point or, equivalently, if every point in A is an accumulation point of A. That is, if A ⊆ A'. Since A⁻ = A ∪ A' for every subset A of X, it follows that a set A is dense in itself if and only if A' = A⁻. A subset A of X that is both closed in X and dense in itself (i.e., such that A' = A) is a perfect set: a closed set without isolated points. For instance, Q ∩ [0, 1] is a countable perfect subset of the metric space Q, but it is not perfect in the metric space R (since it is not closed in R). As a matter of fact, every nonempty perfect subset of R is uncountable because R is a "complete" metric space, a concept that we shall define next.
3.7 Complete Spaces
Consider the metric space (R, d), where d denotes the usual metric on the real line R, and let (A, d) be the subspace of (R, d) with A = (0, 1]. Let {a_n} be the A-valued sequence such that a_n = 1/n for each n ∈ N. Does {a_n} converge in the metric space (A, d)? It is clear that {a_n} converges to 0 in (R, d), and hence we might at a first glance think that it also converges in (A, d). But the point 0 simply does not exist in A so that it is nonsense to say that "a_n → 0 in (A, d)". In fact, {a_n} does not converge in the metric space (A, d). However, the sequence {a_n} seems to possess a "special property" that makes it apparently convergent in spite of the particular underlying set A, and the metric space (A, d) in turn seems to bear a "peculiar characteristic" that makes such a sequence fail to converge in it. The "special property" of the sequence {a_n} is that it is a Cauchy sequence in (A, d) and the "peculiar characteristic" of the metric space (A, d) is that it is not complete.

Definition 3.38. Let (X, d) be a metric space. An X-valued sequence {x_n} (indexed either by N or N0) is a Cauchy sequence in (X, d) (or satisfies the Cauchy criterion) if for each real number ε > 0 there exists a positive integer n_ε such that

m, n ≥ n_ε implies d(x_m, x_n) < ε.
A usual notation for the Cauchy criterion is lim_{m,n} d(x_m, x_n) = 0. Equivalently, an X-valued sequence {x_n} is a Cauchy sequence if diam({x_k}_{k≥n}) → 0 as n → ∞.
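The tail-diameter formulation can be checked numerically for the Cauchy sequence x_n = 1/n. The Python sketch below truncates each tail {x_k : k ≥ n} at a finite horizon, which is an assumption of the illustration only.

```python
def tail_diam(x, n, horizon=2000):
    # diameter of the tail {x(k) : k >= n}, truncated at a finite horizon
    tail = [x(k) for k in range(n, horizon)]
    return max(tail) - min(tail)

x = lambda n: 1.0 / n
diams = [tail_diam(x, n) for n in (1, 10, 100, 1000)]
print(diams)  # the tail diameters shrink toward 0, as the criterion requires
```

For a sequence that is not Cauchy (say x_n = (−1)^n), the same tail diameters would stay bounded away from 0.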
Proposition 3.39. Let (X, d) be a metric space.

(a) Every convergent sequence in (X, d) is a Cauchy sequence.

(b) Every Cauchy sequence in (X, d) is bounded.

(c) If a Cauchy sequence in (X, d) has a subsequence that converges in (X, d), then it converges itself in (X, d) and its limit coincides with the limit of that convergent subsequence.

Proof. (a) Take an arbitrary ε > 0. If an X-valued sequence {x_n} converges to a point x ∈ X, then there exists an integer n_ε ≥ 1 such that d(x_n, x) < ε/2 whenever n ≥ n_ε. Since d(x_m, x_n) ≤ d(x_m, x) + d(x, x_n) for every pair of indices m, n (triangle inequality), it follows that d(x_m, x_n) < ε whenever m, n ≥ n_ε.

(b) If {x_n} is a Cauchy sequence, then there exists an integer n_1 ≥ 1 such that d(x_m, x_n) < 1 whenever m, n ≥ n_1. Note that the set {d(x_m, x_n) ∈ R : m, n ≤ n_1}
has a maximum in R, say β, because it is finite. Thus d(x_m, x_n) ≤ d(x_m, x_{n_1}) + d(x_{n_1}, x_n) ≤ 2 max{1, β} for every pair of indices m, n.
(c) Suppose {x_{n_k}} is a subsequence of an X-valued Cauchy sequence {x_n} that converges to a point x ∈ X (i.e., x_{n_k} → x as k → ∞). Take an arbitrary ε > 0. Since {x_n} is a Cauchy sequence, it follows that there exists a positive integer n_ε such that d(x_m, x_n) < ε/2 whenever m, n ≥ n_ε. Since {x_{n_k}} converges to x, it follows that there exists a positive integer k_ε such that d(x_{n_k}, x) < ε/2 whenever k ≥ k_ε. Thus, if j is any integer with the property that j ≥ k_ε and n_j ≥ n_ε (for instance, if j = max{n_ε, k_ε}), then d(x_n, x) ≤ d(x_n, x_{n_j}) + d(x_{n_j}, x) < ε for every n ≥ n_ε, and therefore {x_n} converges to x. □
Although a convergent sequence always is a Cauchy sequence, the converse may fail. For instance, the (0, 1]-valued sequence {1/n} is a Cauchy sequence in the metric space ((0, 1], d), where d is the usual metric on R, which does not converge in ((0, 1], d). There are however many metric spaces with the notable property that every Cauchy sequence in them is convergent. Metric spaces possessing this property are so important that we give them a name. A metric space X is complete if every Cauchy sequence in X is a convergent sequence in X.
Theorem 3.40. Let A be a subset of a metric space X.

(a) If the subspace A is complete, then A is closed in X.

(b) If X is complete and if A is closed in X, then the subspace A is complete.

Proof. (a) Take an arbitrary A-valued sequence {a_n} that converges in X. Since every convergent sequence is a Cauchy sequence, it follows that {a_n} is a Cauchy sequence in X, and therefore a Cauchy sequence in the subspace A. If the subspace A is complete, then {a_n} converges in A. Conclusion: If A is complete as a subspace of X, then every A-valued sequence that converges in X has its limit in A. Thus, according to the Closed Set Theorem (Theorem 3.30), A is closed in X.

(b) Take an arbitrary A-valued Cauchy sequence {a_n}. If X is complete, then {a_n} converges in X to a point a ∈ X. If A is closed in X, then Theorem 3.30 (the Closed Set Theorem again) ensures that a ∈ A, and hence {a_n} converges in the subspace A. Conclusion: If X is complete and A is closed in X, then every Cauchy sequence in the subspace A converges in A. That is, A is complete as a subspace of X. □

An important, although immediate, corollary of the above theorem says that "inside" a complete metric space the properties of being closed and complete coincide.
Corollary 3.41. Let X be a complete metric space. A subset A of X is closed in X if and only if the subspace A is complete. Example 3R. (a) A basic property of the real number system is that every bounded sequence of real numbers has a convergent subsequence. This and Proposition 3.39
ensure that the metric space R (equipped with its usual metric) is complete; and so is the metric space C of all complex numbers equipped with its usual metric (reason: if {a_k} is a Cauchy sequence in C, then {Re a_k} and {Im a_k} are both Cauchy sequences in R so that they converge in R, and hence {a_k} converges in C). Since the set Q of all rational numbers is not closed in R (recall: Q⁻ = R), it follows by Corollary 3.41 that the metric space Q is not complete. More generally (but similarly), for every positive integer n,

R^n and C^n are complete metric spaces

when equipped with any of their metrics d_p for p ≥ 1 or d_∞ (as in Example 3A); while

Q^n is not a complete metric space.
(b) Now let F denote either the real field R or the complex field C equipped with their usual metrics. As we have just seen, F is a complete metric space. For each real number p ≥ 1 let (ℓ+^p, d_p) be the metric space of all F-valued p-summable sequences equipped with its usual metric d_p as in Example 3B. Take an arbitrary Cauchy sequence in (ℓ+^p, d_p), say {x_n}_{n∈N}. Recall that this is a sequence of sequences; that is, x_n = {ξ_n(k)}_{k∈N} is a sequence in ℓ+^p for each integer n ∈ N. The Cauchy criterion says: for every ε > 0 there exists an integer n_ε ≥ 1 such that d_p(x_m, x_n) < ε whenever m, n ≥ n_ε. Thus

(Σ_{i=1}^k |ξ_m(i) − ξ_n(i)|^p)^{1/p} ≤ (Σ_{i=1}^∞ |ξ_m(i) − ξ_n(i)|^p)^{1/p} = d_p(x_m, x_n) < ε

for every k ∈ N whenever m, n ≥ n_ε. Therefore, for each k ∈ N the scalar-valued sequence {ξ_n(k)}_{n∈N} is a Cauchy sequence in F, and hence it converges in F (since F is complete) to, say, ξ(k) ∈ F. Consider the scalar-valued sequence x = {ξ(k)}_{k∈N} consisting of those limits ξ(k) ∈ F for every k ∈ N. First we show that x ∈ ℓ+^p. Since {x_n}_{n∈N} is a Cauchy sequence in (ℓ+^p, d_p), it follows by Proposition 3.39 that it is bounded (i.e., sup_{m,n} d_p(x_m, x_n) < ∞), and hence sup_m d_p(x_m, 0) < ∞, where 0 denotes the null sequence in ℓ+^p. (Indeed, for every m ∈ N the triangle inequality ensures that d_p(x_m, 0) ≤ sup_{m,n} d_p(x_m, x_n) + d_p(x_n, 0) for an arbitrary n ∈ N.) Therefore,

(Σ_{k=1}^j |ξ_n(k)|^p)^{1/p} ≤ d_p(x_n, 0) ≤ sup_m d_p(x_m, 0)
for every n ∈ N and each integer j ≥ 1. Since ξ_n(k) → ξ(k) in F as n → ∞ for each k ∈ N, it follows that

(Σ_{k=1}^j |ξ(k)|^p)^{1/p} = lim_n (Σ_{k=1}^j |ξ_n(k)|^p)^{1/p} ≤ sup_m d_p(x_m, 0)
for every j ∈ N. Thus

(Σ_{k=1}^∞ |ξ(k)|^p)^{1/p} = sup_j (Σ_{k=1}^j |ξ(k)|^p)^{1/p} ≤ sup_m d_p(x_m, 0),
which means that x = {ξ(k)}_{k∈N} ∈ ℓ+^p. Next we show that x_n → x in (ℓ+^p, d_p). Again, as {x_n}_{n∈N} is a Cauchy sequence in (ℓ+^p, d_p), for any ε > 0 there exists an integer n_ε ≥ 1 such that d_p(x_m, x_n) < ε whenever m, n ≥ n_ε. Thus

Σ_{k=1}^j |ξ_n(k) − ξ_m(k)|^p ≤ Σ_{k=1}^∞ |ξ_n(k) − ξ_m(k)|^p < ε^p
for every integer j ≥ 1 whenever m, n ≥ n_ε. Since lim_m ξ_m(k) = ξ(k) for each k ∈ N, it follows that Σ_{k=1}^j |ξ_n(k) − ξ(k)|^p ≤ ε^p, and hence

d_p(x_n, x) = (Σ_{k=1}^∞ |ξ_n(k) − ξ(k)|^p)^{1/p} = sup_j (Σ_{k=1}^j |ξ_n(k) − ξ(k)|^p)^{1/p} ≤ ε

whenever n ≥ n_ε; which means that x_n → x in (ℓ+^p, d_p). Therefore
(ℓ+^p, d_p) is a complete metric space for every p ≥ 1.

Similarly (see Example 3B), for each p ≥ 1,

(ℓ^p, d_p) is a complete metric space.

Example 3S. Let S be a nonempty set, let (Y, d) be a metric space, and consider the metric space (B[S, Y], d∞) of all bounded mappings of S into (Y, d) equipped with the sup-metric d∞ (Example 3C). We claim that
(B[S, Y], d∞) is complete if and only if (Y, d) is complete.

(a) Indeed, if {f_n} is a Cauchy sequence in (B[S, Y], d∞), then {f_n(s)} is a Cauchy sequence in (Y, d) for every s ∈ S (for d(f_m(s), f_n(s)) ≤ d∞(f_m, f_n) for each pair of integers m, n and every s ∈ S), and hence {f_n(s)} converges in (Y, d) for every s ∈ S whenever (Y, d) is a complete metric space. For each s ∈ S set f(s) = lim_n f_n(s) (i.e., f_n(s) → f(s) in (Y, d)), which defines a function f of S into Y. We shall show that f ∈ B[S, Y] and that f_n → f in (B[S, Y], d∞), thus proving that (B[S, Y], d∞) is complete whenever (Y, d) is complete. First note that, by the triangle inequality,

d(f(s), f(t)) ≤ d(f(s), f_n(s)) + d(f_n(s), f_n(t)) + d(f_n(t), f(t))
for each positive integer n and every pair of points s, t in S. Now take an arbitrary ε > 0. Since {f_n} is a Cauchy sequence in (B[S, Y], d∞), it follows that there exists a positive integer n_ε such that d∞(f_m, f_n) = sup_{s∈S} d(f_m(s), f_n(s)) < ε, and hence d(f_m(s), f_n(s)) < ε for all s ∈ S, whenever m, n ≥ n_ε. Moreover, since f_m(s) → f(s) in (Y, d) for every s ∈ S, and since the metric is continuous (that is, d(·, y): Y → R is a continuous function from the metric space Y to the metric space R for every y ∈ Y), it also follows that d(f(s), f_n(s)) = d(lim_m f_m(s), f_n(s)) = lim_m d(f_m(s), f_n(s)) for each positive integer n and every s ∈ S (see Problem 3.14 or 3.34 and Corollary 3.8). Thus

d(f(s), f_n(s)) ≤ ε for all s ∈ S

whenever n ≥ n_ε. Furthermore, as each f_n lies in B[S, Y], there exists a real number γ_n such that

sup_{s,t∈S} d(f_n(s), f_n(t)) ≤ γ_n.

Summing up: For an arbitrary real number ε > 0 there exists a positive integer n_ε such that

d(f(s), f(t)) ≤ 2ε + γ_{n_ε} for all s, t ∈ S,

so that f ∈ B[S, Y]; and

d∞(f, f_n) = sup_{s∈S} d(f(s), f_n(s)) ≤ ε
whenever n ≥ n_ε, so that f_n → f in (B[S, Y], d∞).

(b) Conversely, take an arbitrary Y-valued sequence {y_n}. Suppose S is nonempty and set f_n(s) = y_n for each integer n and all s ∈ S. This defines a sequence {f_n} of constant mappings of S into Y which clearly lie in B[S, Y] (a constant mapping is obviously bounded). Note that d∞(f_m, f_n) = sup_{s∈S} d(f_m(s), f_n(s)) = d(y_m, y_n) for every pair of integers m, n. Thus {f_n} is a Cauchy sequence in (B[S, Y], d∞) if and only if {y_n} is a Cauchy sequence in (Y, d). Moreover, {f_n} converges in (B[S, Y], d∞) if and only if {y_n} converges in (Y, d) (reason: if d(y_n, y) → 0 for some y ∈ Y, then d∞(f_n, f) → 0 where f ∈ B[S, Y] is the constant mapping f(s) = y for all s ∈ S; and if d∞(f_n, f) → 0 for some f ∈ B[S, Y], then d(y_n, f(s)) = d(f_n(s), f(s)) for each n and every s, so that d(y_n, f(s)) → 0 for all s ∈ S - and hence f must be a constant mapping). Now suppose (Y, d) is not complete, which implies that there exists a Cauchy sequence in (Y, d), say {y_n}, that fails to converge in (Y, d). Thus the sequence {f_n} of constant mappings f_n(s) = y_n for each integer n and all s ∈ S is a Cauchy sequence in (B[S, Y], d∞) that fails to converge in (B[S, Y], d∞), and hence (B[S, Y], d∞) is not complete. Conclusion: If (B[S, Y], d∞) is complete, then (Y, d) is complete.
(c) Concrete example: Set S = N or S = Z and Y = F (either the real field R or the complex field C equipped with their usual metric). Then (ℓ+^∞, d∞) and (ℓ^∞, d∞) are complete metric spaces.
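The coordinatewise-limit construction of part (a) can be sketched numerically in (ℓ+^∞, d∞). Below, x_n is the bounded sequence with entries ξ_n(k) = 1/k + 1/n, the candidate limit collects the coordinatewise limits 1/k, and d∞(x_n, limit) = 1/n → 0. The truncation to finitely many coordinates, like the particular sequence chosen, is an assumption of the illustration only.

```python
K = 50  # truncation of the infinite sequences, for illustration only

def d_sup(x, y):
    # sup-metric between two truncated sequences
    return max(abs(a - b) for a, b in zip(x, y))

def x(n):
    # n-th term of a Cauchy sequence in l^infinity_+: entries 1/k + 1/n
    return [1.0 / k + 1.0 / n for k in range(1, K + 1)]

limit = [1.0 / k for k in range(1, K + 1)]  # the coordinatewise limits

print(d_sup(x(10), limit))    # 1/10
print(d_sup(x(1000), limit))  # 1/1000: x_n converges to limit in the sup-metric
```

The point of the completeness proof is precisely that the coordinatewise limit is again bounded and is the d∞-limit of the Cauchy sequence.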
Example 3T. Consider the set B[X, Y] of all bounded mappings of a nonempty metric space (X, dX) into a metric space (Y, dY) and equip it with the sup-metric d∞ as in the previous example. Let BC[X, Y] be the set of all continuous mappings from B[X, Y] (Example 3N), so that (BC[X, Y], d∞) is the subspace of (B[X, Y], d∞) made up of all bounded continuous mappings of (X, dX) into (Y, dY). If (Y, dY) is complete, then (B[X, Y], d∞) is complete according to Example 3S. Since BC[X, Y] is closed in (B[X, Y], d∞) - Example 3N - it follows by Theorem 3.40 that (BC[X, Y], d∞) is complete. On the other hand, the very same construction used in item (b) of the previous example shows that (BC[X, Y], d∞) is not complete unless (Y, dY) is. Conclusion:

(BC[X, Y], d∞) is complete if and only if (Y, dY) is complete.

In particular (see Examples 3D, 3G and 3N),

(C[0, 1], d∞) is a complete metric space

because the real line R or the complex plane C (equipped with their usual metrics, as always) are complete metric spaces (Example 3R). However, for any p ≥ 1 (cf. Problem 3.58),

(C[0, 1], d_p) is not a complete metric space.
The concept of completeness allows us to state and prove a useful result on contractions.
Theorem 3.42. (Contraction Mapping Theorem or Method of Successive Approximations). A strict contraction F of a nonempty complete metric space (X, d) into itself has a unique fixed point x ∈ X, which is the limit in (X, d) of every X-valued sequence of the form {F^n(x0)}_{n∈N0} for any x0 ∈ X.

Proof. Take an arbitrary point x0 in X and consider the X-valued sequence {x_n}_{n∈N0} such that x_n = F^n(x0) for each n ∈ N0. Recall that F^n denotes the composition of F: X → X with itself n times (and that F^0 is by convention the identity map on X). It is clear that the sequence {x_n}_{n∈N0} satisfies the difference equation

x_{n+1} = F(x_n)
for every n ∈ N0. Conversely, if an X-valued sequence {x_n}_{n∈N0} is recursively defined from any point x0 ∈ X onwards as x_{n+1} = F(x_n) for every n ∈ N0, then it is of the form x_n = F^n(x0) for each n ∈ N0 (proof: induction). Now suppose F: (X, d) → (X, d) is a strict contraction and let γ ∈ (0, 1) be a Lipschitz constant for F so that

d(F(x), F(y)) ≤ γ d(x, y) for every x, y in X.

A trivial induction shows that

d(F^n(x), F^n(y)) ≤ γ^n d(x, y)

for every nonnegative integer n and every x, y ∈ X. Next take an arbitrary pair of nonnegative distinct integers, say m < n. Note that

x_n = F^n(x0) = F^m(F^{n−m}(x0)) = F^m(x_{n−m}),

and hence

d(x_m, x_n) = d(F^m(x0), F^m(x_{n−m})) ≤ γ^m d(x0, x_{n−m}).

By using the triangle inequality we get

d(x0, x_{n−m}) ≤ Σ_{i=0}^{n−m−1} d(x_i, x_{i+1}),

and therefore

d(x_m, x_n) ≤ γ^m Σ_{i=0}^{n−m−1} d(x_i, x_{i+1}) ≤ γ^m (Σ_{i=0}^{n−m−1} γ^i) d(x0, x1).
Another trivial induction shows that Σ_{i=0}^{k−1} α^i = (1 − α^k)/(1 − α) for every real number α ≠ 1 and every integer k ≥ 1. Thus, for any γ ∈ (0, 1), Σ_{i=0}^{n−m−1} γ^i ≤ 1/(1 − γ), so that

d(x_m, x_n) ≤ (γ^m/(1 − γ)) d(x0, x1) and γ^m → 0,

and hence {x_n} is a Cauchy sequence in (X, d) (reason: for any ε > 0 there exists an integer n_ε such that (γ^m/(1 − γ)) d(x0, x1) < ε, which implies d(x_m, x_n) < ε, whenever
n ≥ m ≥ n_ε). Thus {x_n} converges in the complete metric space (X, d). Set x = lim x_n ∈ X. Since a contraction is continuous, it follows by Corollary 3.8 that {F(x_n)} converges in (X, d) and F(lim x_n) = lim F(x_n). Therefore

x = lim x_n = lim x_{n+1} = lim F(x_n) = F(lim x_n) = F(x),

so that the limit of {x_n} is a fixed point of F. Moreover, if y is any fixed point of F, then d(x, y) = d(F(x), F(y)) ≤ γ d(x, y), which implies that d(x, y) = 0 (because γ ∈ (0, 1)), and hence x = y. Conclusion: For any x0 ∈ X the sequence {F^n(x0)} converges in (X, d) and its limit is the unique fixed point of F. □
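The successive-approximation scheme in the proof is directly computable. The Python sketch below iterates the strict contraction F(x) = x/2 + 1 on the complete space R (Lipschitz constant γ = 1/2, unique fixed point x = 2); the tolerance-based stopping rule is an implementation choice of the illustration, not part of the theorem.

```python
def fixed_point(F, x0, tol=1e-12, max_iter=10_000):
    # successive approximations: x_{n+1} = F(x_n), started from x0
    x = x0
    for _ in range(max_iter):
        x_next = F(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter")

# F is a strict contraction of R into itself with Lipschitz constant 1/2;
# by the theorem, every starting point leads to the unique fixed point x = 2
F = lambda x: x / 2 + 1
print(fixed_point(F, 100.0))  # approximately 2.0
print(fixed_point(F, -7.0))   # approximately 2.0, independent of x0
```

The geometric error bound d(x_m, x) ≤ (γ^m/(1 − γ)) d(x0, x1) from the proof is what guarantees the loop terminates.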
3.8 Continuous Extension and Completion
Recall that continuity preserves convergence (Corollary 3.8). Uniform continuity, as one might expect, goes beyond that. In fact, uniform continuity preserves Cauchy sequences too.
Lemma 3.43. Let F: X → Y be a uniformly continuous mapping of a metric space X into a metric space Y. If {x_n} is a Cauchy sequence in X, then {F(x_n)} is a Cauchy sequence in Y.
Proof. The proof is straightforward by the definitions of Cauchy sequence and uniform continuity. Indeed, let dX and dY denote the metrics on X and Y, respectively, and take an arbitrary X-valued sequence {x_n}. If F: X → Y is uniformly continuous, then for every ε > 0 there exists δ_ε > 0 such that

dX(x_m, x_n) < δ_ε implies dY(F(x_m), F(x_n)) < ε.

However, associated with δ_ε there exists a positive integer n_ε such that

m, n ≥ n_ε implies dX(x_m, x_n) < δ_ε

whenever {x_n} is a Cauchy sequence in X. Hence, for every real number ε > 0 there exists a positive integer n_ε such that

m, n ≥ n_ε implies dY(F(x_m), F(x_n)) < ε,

which means that {F(x_n)} is a Cauchy sequence in Y. □
Thus, if G: X → Y is a uniform homeomorphism between two metric spaces X and Y, then {x_n} is a Cauchy sequence in X if and only if {G(x_n)} is a Cauchy sequence in Y, and therefore a uniform homeomorphism takes a complete metric space onto a complete metric space.

Theorem 3.44. Take two uniformly homeomorphic metric spaces. One of them is complete if and only if the other is.

Proof. Let X and Y be metric spaces and let G: X → Y be a uniform homeomorphism. Take an arbitrary Cauchy sequence {y_n} in Y and consider the sequence {x_n} in X such that x_n = G⁻¹(y_n) for each n. Lemma 3.43 ensures that {x_n} is a Cauchy sequence in X. If X is complete, then {x_n} converges in X to, say, x ∈ X. Since G is continuous, it follows by Corollary 3.8 that the sequence {y_n}, which is such that y_n = G(x_n) for each n, converges in Y to y = G(x). Thus Y is complete. □

Carefully note that the above theorem does not hold if uniform homeomorphism is replaced by plain homeomorphism: if X and Y are homeomorphic metric spaces, then it is not necessarily true that X is complete if and only if Y is complete. In other words, completeness is not a topological invariant (continuity preserves convergence but not Cauchy sequences).
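A numeric sketch of this failure: the homeomorphism a ↦ 1/a of (0, 1] onto [1, ∞) maps the Cauchy sequence {1/n} to {n}, whose tails spread out instead of shrinking. The truncation of the tails to finitely many terms is an assumption of the illustration only.

```python
def tail_spread(seq):
    # maximal pairwise distance within a finite tail of a sequence
    return max(seq) - min(seq)

a = [1.0 / n for n in range(1, 2001)]  # the Cauchy sequence {1/n} in (0, 1]
G = lambda t: 1.0 / t                  # a homeomorphism of (0, 1] onto [1, oo)

tail_a = a[999:]                       # {a_n : n >= 1000}
tail_Ga = [G(t) for t in tail_a]       # {G(a_n) : n >= 1000} = {n : n >= 1000}

print(tail_spread(tail_a))   # tiny: {1/n} is a Cauchy sequence
print(tail_spread(tail_Ga))  # large: {n} is not even bounded
```

So a plain homeomorphism can destroy the Cauchy property; by Lemma 3.43 this forces G to fail uniform continuity.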
Example 3U. Let R be the real line with its usual metric. Set A = (0, 1] and B = [1, ∞), both subsets of R. Consider the function G: A → B such that G(a) = 1/a for every a ∈ A. As it is readily verified, G is a homeomorphism of A onto B, so that A and B are homeomorphic subspaces of R. Now consider the A-valued sequence {a_n} with a_n = 1/n for each n ∈ N, which is a Cauchy sequence in A. However, G(a_n) = n for every n ∈ N, and hence {G(a_n)} is certainly not a Cauchy sequence in B (since it is not even bounded in B). Thus, according to Lemma 3.43, G: A → B (which is continuous) is not uniformly continuous. Actually, B is a complete subspace of R (since B is a closed subset of the complete metric space R - Corollary 3.41) and, as we have just seen, A is not a complete subspace of R (the Cauchy sequence {a_n} does not converge in A because its continuous image {G(a_n)} does not converge in B - Corollary 3.8). Lemma 3.43 also leads to an extremely useful result on extensions of uniformly continuous mappings of a dense subspace of a metric space into a complete metric space.
Theorem 3.45. Every uniformly continuous mapping F : A -+ Y of a dense subspace A of a metric space X into a complete metric space Y has a unique continuous extension over X, which in fact is uniformly continuous.
Proof. Suppose X is a nonempty metric space (otherwise the theorem is a triviality) and let A be a dense subset of X. Take an arbitrary point x in X. Since A⁻ = X, it follows by Proposition 3.32 that there exists an A-valued sequence {a_n} that converges in X to x, and hence {a_n} is a Cauchy sequence in the metric space X (Proposition 3.39) so that {a_n} is a Cauchy sequence in the subspace A of X. Now suppose F: A → Y is a uniformly continuous mapping of A into a metric space Y. Thus, according to Lemma 3.43, {F(a_n)} is a Cauchy sequence in Y. If Y is a complete metric space, then the Y-valued sequence {F(a_n)} converges in it. Let y ∈ Y be the (unique) limit of {F(a_n)} in Y:

y = lim F(a_n).

We shall show now that y, which obviously depends on x ∈ X, does not depend on the A-valued sequence {a_n} that converges in X to x. Indeed, let {a'_n} be an A-valued sequence converging in X to x, and set

y' = lim F(a'_n).

Since both sequences {a_n} and {a'_n} converge in X to the same limit x, it follows that dX(a_n, a'_n) → 0 (see Problem 3.14(a)), where dX denotes the metric on X. Therefore, for every real number δ > 0 there exists an index n_δ such that

n ≥ n_δ implies dX(a_n, a'_n) < δ.
Moreover, since the mapping F: A → Y is uniformly continuous, for every real number ε > 0 there exists a real number δ_ε > 0 such that

dX(a, a') < δ_ε implies dY(F(a), F(a')) < ε
for all a and a' in A, where dY denotes the metric on Y. Conclusion: Given an arbitrary ε > 0 there exists δ_ε > 0, associated with which there exists an index n_{δ_ε}, such that n ≥ n_{δ_ε} implies dY(F(a_n), F(a'_n)) < ε. Thus (cf. Problem 3.14(c)), 0 ≤ dY(y, y') ≤ ε for all ε > 0, and hence dY(y, y') = 0. That is, y = y'. Therefore, for each x ∈ X set

F̂(x) = lim F(a_n) in Y,

where {a_n} is any A-valued sequence that converges in X to x. This defines a mapping F̂: X → Y of X into Y.
Claim 1. F̂ is an extension of F over X.

Proof. Take an arbitrary a in A and consider the A-valued constant sequence {a_n} such that a_n = a for every index n. As the Y-valued sequence {F(a_n)} is constant, it trivially converges in Y to F(a). Thus F̂(a) = F(a) for every a ∈ A. That is, F̂|A = F, which means that F: A → Y is a restriction of F̂: X → Y to A ⊆ X or, equivalently, F̂ is an extension of F over X. □
Claim 2. F̂ is uniformly continuous.

Proof. Take a pair of arbitrary points x and x' in X. Let {a_n} and {a'_n} be any pair of A-valued sequences converging in X to x and x', respectively (recall: the existence of these sequences is ensured by Proposition 3.32 because A is dense in X). Note that dX(a_n, a'_n) ≤ dX(a_n, x) + dX(x, x') + dX(x', a'_n) for every index n, by the triangle inequality in X. Thus, as a_n → x and a'_n → x' in X, for any δ > 0 there exists an index n_δ such that (Definition 3.4)

dX(x, x') < δ implies dX(a_n, a'_n) < 3δ for every n ≥ n_δ.

Since F: A → Y is uniformly continuous, it follows by Definition 3.6 that for every ε > 0 there exists δ_ε > 0, which depends only on ε, such that

dX(a_n, a'_n) < 3δ_ε implies dY(F(a_n), F(a'_n)) < ε.

Thus, associated with each ε > 0 there exists δ_ε > 0 (that depends only on ε), which in turn ensures the existence of an index n_{δ_ε}, such that

dX(x, x') < δ_ε implies dY(F(a_n), F(a'_n)) < ε for every n ≥ n_{δ_ε}.
Moreover, since F(a_n) → F̂(x) and F(a'_n) → F̂(x') in Y by the very definition of F̂: X → Y, it follows by Problem 3.14(c) that

dY(F(a_n), F(a'_n)) < ε for every n ≥ n_{δ_ε} implies dY(F̂(x), F̂(x')) ≤ ε.
Therefore, given an arbitrary ε > 0 there exists δ_ε > 0 such that

dX(x, x') < δ_ε implies dY(F̂(x), F̂(x')) ≤ ε

for all x, x' ∈ X. That is, F̂: X → Y is uniformly continuous (according to Definition 3.6). □

Finally, since F̂: X → Y is continuous, it follows by Corollary 3.33 that if G: X → Y is a continuous extension of F: A → Y over X, then G = F̂ (because A is dense in X and G|A = F̂|A = F). Thus F̂ is the unique continuous extension of F over X. □
Corollary 3.46. Let X and Y be complete metric spaces, and let A and B be dense subspaces of X and Y, respectively. If G: A → B is a uniform homeomorphism of A onto B, then there exists a unique uniform homeomorphism Ĝ: X → Y of X onto Y that extends G over X (i.e., Ĝ|A = G).
Proof. Since A is dense in X, Y is complete, and G: A → B ⊆ Y is uniformly continuous, it follows by the previous theorem that G has a unique uniformly continuous extension Ĝ: X → Y. Similarly, the inverse G⁻¹: B → A of G: A → B has a unique uniformly continuous extension (G⁻¹)^: Y → X. Note that ((G⁻¹)^Ĝ)|A = G⁻¹G = I_A, where I_A: A → A is the identity on A (reason: Ĝ|A = G: A → B and (G⁻¹)^|B = G⁻¹: B → A). The identity I_A is uniformly continuous (because its domain and range are subspaces of the same metric space X), and hence it has a unique continuous extension on X (by the previous theorem) which clearly is I_X: X → X, the identity on X (recall: I_X in fact is uniformly continuous because its domain and range are equipped with the same metric). Thus (G⁻¹)^Ĝ = I_X, for (G⁻¹)^Ĝ is continuous (composition of continuous mappings) and is an extension of the uniformly continuous mapping G⁻¹G = I_A over X. Similarly, Ĝ(G⁻¹)^ = I_Y, where I_Y: Y → Y is the identity on Y. Therefore Ĝ⁻¹ = (G⁻¹)^. Summing up: Ĝ: X → Y is an invertible uniformly continuous mapping with a uniformly continuous inverse (i.e., a uniform homeomorphism) which is the unique uniformly continuous extension of G: A → B over X. □
Recall that every surjective isometry is a uniform homeomorphism. Suppose the uniform homeomorphism G of the above corollary is a surjective isometry. Take an arbitrary pair of points x and x' in X so that Ĝ(x) = lim G(a_n) and Ĝ(x') = lim G(a'_n) in Y, where {a_n} and {a'_n} are A-valued sequences converging in X to x and x', respectively (cf. proof of Theorem 3.45). Since G is an isometry, it follows by Problem 3.14(b) that

dY(Ĝ(x), Ĝ(x')) = lim dX(a_n, a'_n) = dX(x, x').

Thus Ĝ is an isometry as well, and hence a surjective isometry (since Ĝ is a homeomorphism). This proves the following further corollary of Theorem 3.45.
Corollary 3.47. Let A and B be dense subspaces of complete metric spaces X and Y, respectively. If J: A → B is a surjective isometry of A onto B, then there exists a unique surjective isometry Ĵ: X → Y of X onto Y that extends J over X (i.e., Ĵ|A = J).

If a metric space X is a subspace of a complete metric space Z, then its closure X⁻ in Z is a complete metric space by Theorem 3.40. In this case X can be thought of as being "completed" by joining to it all its accumulation points from Z (recall: X⁻ = X ∪ X'), and X⁻ can be viewed as a "completion" of X. However, if a metric space X is not specified as being a subspace of a complete metric space Z, then the above approach of simply taking the closure of X in Z obviously collapses; but the idea of "completion" behind such an approach survives. To begin with, recall that two metric spaces, say X and X̂, are isometrically equivalent if there exists a surjective isometry of one of them onto the other (notation: X ≅ X̂). Isometrically equivalent metric spaces are regarded (as far as purely metric-space structure is concerned) as being essentially the same metric space. If X is a subspace of a complete metric space, then its closure X⁻ in that complete metric space is itself a complete metric space. With this in mind consider the following definition.

Definition 3.48. If the image of an isometry on a metric space X is a dense subspace of a metric space X̂, then X is said to be densely embedded in X̂. If a metric space X is densely embedded in a complete metric space X̂, then X̂ is a completion of X.
Even if a metric space fails to be complete it can always be densely embedded in a complete metric space. Lemma 3.43 plays a central role in the proof of this statement.
Theorem 3.49. Every metric space has a completion.

Proof. Let (X, d_X) be an arbitrary metric space and let CS(X) denote the collection of all Cauchy sequences in (X, d_X). Recall that, if x = {xₙ} and y = {yₙ} are sequences in CS(X), then the real-valued sequence {d_X(xₙ, yₙ)} converges in ℝ (see Problem 3.53(a)). Thus, for each pair (x, y) in CS(X)×CS(X) set

d(x, y) = lim d_X(xₙ, yₙ).

This defines a function d : CS(X)×CS(X) → ℝ which is a pseudometric on CS(X). Indeed, nonnegativeness and symmetry are trivially verified, and the triangle inequality in (CS(X), d) follows at once by the triangle inequality in (X, d_X). Consider a relation ~ on CS(X) defined as follows. If x = {xₙ} and x′ = {xₙ′} are Cauchy sequences in (X, d_X), then

x′ ~ x    if    d(x′, x) = 0.
Proposition 3.3 asserts that ~ is an equivalence relation on CS(X). Let X̂ be the collection of all equivalence classes [x] ⊆ CS(X) with respect to ~ for every sequence x = {xₙ} in CS(X). In other words, set X̂ = CS(X)/~, the quotient space of CS(X) modulo ~. For each pair ([x], [y]) in X̂ × X̂ set

d_X̂([x], [y]) = d(x, y)

for an arbitrary pair (x, y) in [x]×[y] (i.e., d_X̂([x], [y]) = lim d_X(xₙ, yₙ), where {xₙ} and {yₙ} are any Cauchy sequences from the equivalence classes [x] and [y], respectively). Proposition 3.3 also asserts that this actually defines a function d_X̂ : X̂ × X̂ → ℝ and, moreover, that such a function d_X̂ in fact is a metric on X̂. Thus (X̂, d_X̂) is a metric space. Now consider the mapping K : X → X̂ defined as follows. For each x ∈ X take the constant sequence x = {xₙ} ∈ CS(X) such that xₙ = x for all indices n, and set K(x) = [x] ∈ X̂. That is, for each x in X, K(x) is the equivalence class in X̂ containing the constant sequence with entries equal to x. It is readily verified that K : X → X̂ is an isometry. Indeed, take x, y ∈ X arbitrary, let x = {xₙ} and y = {yₙ} be constant sequences such that xₙ = x and yₙ = y for all indices n, and note that

d_X̂(K(x), K(y)) = lim d_X(xₙ, yₙ) = d_X(x, y).

Claim 1. K(X)⁻ = X̂.

Proof. Take an arbitrary [x] in X̂ and an arbitrary x = {xₙ} ∈ [x], so that {xₙ} is a Cauchy sequence in (X, d_X). Thus for each ε > 0 there exists an index n_ε such that d_X(xₙ, x_{n_ε}) ≤ ε for every n ≥ n_ε. Put [x_ε] = K(x_{n_ε}) ∈ K(X): the equivalence class in X̂ containing the constant sequence with entries equal to x_{n_ε}. Therefore, for each [x] ∈ X̂ and each ε > 0 there exists [x_ε] ∈ K(X) such that d_X̂([x_ε], [x]) = lim d_X(xₙ, x_{n_ε}) ≤ ε, and so K(X) is dense in X̂ (Proposition 3.32). □
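The pseudometric d and the equivalence relation ~ can be probed numerically by truncating the limit at a large index. The sketch below is illustrative only (X = ℝ with d_X(a, b) = |a − b|; the sequences and the cutoff are ad hoc choices, not from the text):

```python
def d(x, y, n=10**6):
    # d(x, y) = lim_n d_X(x_n, y_n), approximated at a single large index;
    # here X = R and d_X(a, b) = |a - b|.
    return abs(x(n) - y(n))

# Two distinct Cauchy sequences converging to 0: d(x, x') = 0 in the limit,
# so x ~ x' and both represent the same point of the completion.
x  = lambda n: 1.0 / n
xp = lambda n: 1.0 / n**2
print(d(x, xp) < 1e-5)      # True (the truncated value is about 1e-6)

# Constant sequences realize the embedding K, which preserves distances.
K = lambda c: (lambda n: c)
print(d(K(1.5), K(0.25)))   # 1.25 = d_X(1.5, 0.25)
```

The truncation at a single large index stands in for the limit, which exists for Cauchy sequences by Problem 3.53(a).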
Claim 2. The metric space (X̂, d_X̂) is complete.

Proof. Take an arbitrary Cauchy sequence {[x]ₖ}_{k≥1} in (X̂, d_X̂). Recall that K(X) is dense in (X̂, d_X̂), and hence for each positive integer k there exists [y]ₖ ∈ K(X) such that

d_X̂([x]ₖ, [y]ₖ) < 1/k

(cf. Proposition 3.32). Then, since {[x]ₖ}_{k≥1} is a Cauchy sequence in (X̂, d_X̂), and since

d_X̂([y]ⱼ, [y]ₖ) ≤ d_X̂([y]ⱼ, [x]ⱼ) + d_X̂([x]ⱼ, [x]ₖ) + d_X̂([x]ₖ, [y]ₖ)

for every j, k ≥ 1, it follows that the K(X)-valued sequence {[y]ₖ}_{k≥1} is a Cauchy sequence in (X̂, d_X̂).
Now take an arbitrary k ≥ 1 and notice that, as [y]ₖ lies in K(X), there exists yₖ ∈ X such that the constant sequence yᵏ = {yₖ}_{n≥1} belongs to the equivalence class [y]ₖ = K(yₖ), and hence yₖ = K⁻¹([y]ₖ). Indeed, K : X → K(X) ⊆ X̂ is a surjective isometry of X onto the subspace K(X) of X̂, so that K⁻¹ : K(X) → X is again a surjective isometry, thus uniformly continuous. Therefore, since {[y]ₖ}_{k≥1} is a Cauchy sequence in (K(X), d_X̂), it follows by Lemma 3.43 that

{yₖ}_{k≥1} is a Cauchy sequence in (X, d_X);

that is, y = {yₖ}_{k≥1} ∈ CS(X), so that the equivalence class [y] lies in X̂. Next we show that {[x]ₖ}_{k≥1} converges in (X̂, d_X̂) to [y] ∈ X̂. In fact, for every k ≥ 1,

0 ≤ d_X̂([x]ₖ, [y]) ≤ d_X̂([x]ₖ, [y]ₖ) + d_X̂([y]ₖ, [y]) < 1/k + d_X̂([y]ₖ, [y]).

Take y = {yₙ}_{n≥1} ∈ [y] and, for each k ≥ 1, take the constant sequence yᵏ = {yₖ}_{n≥1} ∈ [y]ₖ such that yₖ = K⁻¹([y]ₖ). By the definition of the metric d_X̂ on X̂, d_X̂([y]ₖ, [y]) = limₙ d_X(yₙ, yₖ) for every k ≥ 1, and hence limₖ d_X̂([y]ₖ, [y]) = 0 because {yₖ}_{k≥1} is a Cauchy sequence in (X, d_X). Therefore d_X̂([x]ₖ, [y]) → 0 as k → ∞.

Conclusion: Every Cauchy sequence in (X̂, d_X̂) converges in (X̂, d_X̂), so that (X̂, d_X̂) is a complete metric space. □
Summing up: X ≅ K(X), K(X)⁻ = X̂, and X̂ is complete. That is, X is densely embedded in a complete metric space X̂, which means that X̂ is a completion of X. □
Corollary 3.47 leads to the proof that a completion of a metric space is essentially unique; that is, the completion of a metric space is unique up to a surjective isometry.
Theorem 3.50. Any two completions of a metric space are isometrically equivalent.
Proof. Let X be a metric space and, according to Theorem 3.49, let X̂ and X̂′ be two completions of X. This means that there exist surjective isometries

J : X̃ → X    and    J′ : X̃′ → X,

where X̃ is a dense subspace of X̂ and X̃′ is a dense subspace of X̂′. Recall that a surjective isometry is invertible, and that its inverse is again a surjective isometry. Set

J̃ = J⁻¹J′ : X̃′ → X̃,

which, as a composition of surjective isometries, is a surjective isometry itself. Since X̃′ and X̃ are dense subspaces of the complete metric spaces X̂′ and X̂, respectively, it follows by Corollary 3.47 that there exists a unique surjective isometry

Ĵ : X̂′ → X̂

that extends J̃ over X̂′. Thus X̂ and X̂′ are isometrically equivalent. □
According to Definition 3.48 a complete metric space X̂ is a completion of a metric space X if there exists a dense subspace of X̂, say X̃, which is isometrically equivalent to X. That is, if there exists a surjective isometry

J : X̃ → X

of X̃ onto X for some dense subspace X̃ of X̂. Now let Y be another metric space, consider a mapping

F : X → Y

of X into Y, and let

F̂ : X̂ → Y

be a mapping of a completion X̂ of X into Y such that F̂(x̃) = F(J(x̃)) for every x̃ ∈ X̃. That is, F̂ is an extension of the composition FJ over X̂ or, equivalently, FJ : X̃ → Y is the restriction of F̂ to X̃. It is usual to refer to F̂ as an extension of F over the completion X̂ of X (which in fact is a slight abuse of terminology). The situation so far is illustrated by the following commutative diagram (recall: F̂|X̃ = FJ):

[commutative diagram: X ≅ X̃ ⊆ X̂ (dense inclusion), with J : X̃ → X, F : X → Y, FJ : X̃ → Y, and F̂ : X̂ → Y]
The next theorem says that, if F is uniformly continuous and Y is complete, then there exists an essentially unique continuous extension F̂ of F over a completion X̂ of X.

Theorem 3.51. Let X̂ be a completion of a metric space X and let Y be a complete metric space. Every uniformly continuous mapping F : X → Y has a uniformly continuous extension F̂ : X̂ → Y over the completion X̂ of X. Moreover, F̂ is unique up to a surjective isometry.

Proof. Existence. Let X̂ be a completion of a metric space X. Thus there exist a dense subspace X̃ of the metric space X̂ and a surjective isometry J : X̃ → X of X̃ onto X. Suppose F : X → Y is a uniformly continuous mapping of X into a metric space Y. Consider the composition FJ : X̃ → Y, which is uniformly continuous as well (reason: J is uniformly continuous). Since X̃ is dense in X̂, Y is complete, and FJ : X̃ → Y is uniformly continuous, it follows by Theorem 3.45 that there exists a unique continuous extension F̂ : X̂ → Y of FJ : X̃ → Y over X̂, which in fact is uniformly continuous. Thus F̂ : X̂ → Y is a uniformly continuous extension of F : X → Y over the completion X̂ of the metric space X.
Uniqueness. Suppose F̂′ : X̂′ → Y is another continuous extension of F : X → Y over some completion X̂′ of X, so that F̂′|X̃′ = FJ′, where X̃′ is a dense subspace of X̂′ and J′ : X̃′ → X is a surjective isometry of X̃′ onto X. Set J̃ = J⁻¹J′ : X̃′ → X̃ as in the proof of Theorem 3.50, and let Ĵ : X̂′ → X̂ be the surjective isometry of X̂′ onto X̂ such that Ĵ|X̃′ = J̃. Thus, recalling that F̂|X̃ = FJ, we get

F̂Ĵ|X̃′ = F̂J̃ = FJJ⁻¹J′ = FJ′ = F̂′|X̃′.

Therefore, the continuous mappings F̂Ĵ : X̂′ → Y (composition of two continuous mappings) and F̂′ : X̂′ → Y coincide on a dense subset X̃′ of X̂′, and hence F̂′ = F̂Ĵ by Corollary 3.33. Conclusion: If F̂′ : X̂′ → Y is another continuous extension of F over some completion X̂′ of X, then there exists a surjective isometry Ĵ : X̂′ → X̂ such that F̂′ = F̂Ĵ. In other words, a continuous extension of F over a completion of X is unique up to a surjective isometry. □

The commutative diagram, where ⊆ denotes dense inclusion,

[commutative diagram: X̃′ ⊆ X̂′ and X̃ ⊆ X̂, with J̃ = J⁻¹J′ : X̃′ → X̃, its extension Ĵ : X̂′ → X̂, and F̂ : X̂ → Y, F̂′ = F̂Ĵ : X̂′ → Y]

illustrates the uniqueness proofs of Theorems 3.50 and 3.51.
3.9 The Baire Category Theorem

We close our discussion on complete metric spaces with an important classification of subsets of a metric space into two categories. The basic notion behind such a classification is the following one. A subset A of a metric space X is nowhere dense (or rare) in X if (A⁻)° = ∅ (i.e., if the interior of its closure is empty). Clearly, a closed subset of X is nowhere dense in X if and only if it has an empty interior. Note that (A\A°)° = ∅ for every subset A of a metric space X. Indeed, A° is the largest open subset of X that is included in A, so that the only open subset of X that is included in A\A° is the empty set ∅ of X. Therefore, if V is a closed subset of X, then V\V° is nowhere dense in X. (Reason: V\V° = V⁻\V° = ∂V, and ∂V is closed in X - see Problem 3.41.) Dually, if U is an open subset of X, then U⁻\U is nowhere dense in X (recall: with V = X\U, U⁻\U = (X\V)⁻\(X\V) = (X\V°)\(X\V) = V\V°). Carefully note that ∂Q = Q⁻\Q° = Q⁻ = ℝ in ℝ.

Proposition 3.52. A singleton {x} on a point x in a metric space X is nowhere dense in X if and only if x is not an isolated point of X.
Proof. Recall that every singleton in a metric space X is a closed subset of X (Problem 3.37), and hence {x} = {x}⁻ for every x in X. According to Proposition 3.37 a point x in X is an isolated point of X if and only if the singleton {x} is an open set in X; that is, if and only if {x}° = {x}. Thus a point x in X is not an isolated point of X if and only if {x}° ⊂ {x} (i.e., {x}° ≠ {x}) or, equivalently, {x}° = ∅ (since the empty set is the only proper subset of any singleton). But {x}° = ∅ if and only if ({x}⁻)° = ∅ (because {x} = {x}⁻ for every singleton {x}), which means that {x} is nowhere dense in X. □

The proposition below gives alternative characterizations of nowhere dense sets that will be required in the sequel.

Proposition 3.53. Let X be a metric space and let A be a subset of X. The following assertions are pairwise equivalent.

(a) (A⁻)° = ∅ (i.e., A is nowhere dense in X).

(b) For every nonempty open subset U of X there exists a nonempty open subset U′ of X such that U′ ⊆ U and U′ ∩ A = ∅.

(c) For every nonempty open subset U of X and every real number ρ > 0 there exists an open ball B_ε with radius ε ∈ (0, ρ) such that B_ε⁻ ⊆ U and B_ε⁻ ∩ A = ∅.

Proof. Suppose A is nonempty (otherwise the proposition is trivial).

Proof of (a)⇔(b). Take an arbitrary nonempty open subset U of X. If (A⁻)° = ∅, then U\A⁻ ≠ ∅ (i.e., A⁻ includes no nonempty open set), U\A⁻ is open in X (since U\A⁻ = (X\A⁻) ∩ U), U\A⁻ ⊆ U, and (U\A⁻) ∩ A = ∅. Therefore (a) implies (b). Conversely, suppose (A⁻)° ≠ ∅, so that there exists an open subset U₀ of X such that ∅ ≠ U₀ ⊆ A⁻. Then every point of U₀ is a point of adherence of A, and hence every nonempty open subset of U₀ meets A (cf. Proposition 3.25). Conclusion: The denial of (a) implies the denial of (b). That is, (b) implies (a).

Proof of (b)⇔(c). If (b) holds true, then (c) holds true for every open ball whose closure is included in U′. Precisely, if (b) holds true and if u is any point of the open set U′, then there exists a radius ρ₀ > 0 such that B_{ρ₀}(u) ⊆ U′. Take an open ball B_ε(u) with center at u and radius ε ∈ (0, min{ρ, ρ₀}). Since B_ε(u)⁻ ⊆ B_{ρ₀}(u), it follows that B_ε(u)⁻ ⊆ U and B_ε(u)⁻ ∩ A = ∅. Therefore (b) implies (c). On the other hand, suppose (c) holds true and set U′ = B_ε, so that (c) implies (b). □
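When A ⊆ ℝ is a finite union of closed intervals, the test (A⁻)° = ∅ can be mechanized. The sketch below is an illustration under that assumption (the helper name and the examples are ad hoc, not from the text): such an A equals its own closure, and the interior of the merged union is empty exactly when every merged interval degenerates to a point. In particular it exhibits Proposition 3.52 for X = ℝ, where no point is isolated.

```python
def is_nowhere_dense(intervals):
    # A subset of R given as a finite union of closed intervals [a, b]
    # (a == b gives a singleton).  Its closure is the same union, and the
    # interior of that closure is empty iff every merged interval is
    # degenerate, i.e. iff (A-)° = ∅.
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], b)  # overlapping: extend
        else:
            merged.append([a, b])                  # disjoint: start anew
    return all(a == b for a, b in merged)

# A singleton in R is nowhere dense (no point of R is isolated):
print(is_nowhere_dense([(2.0, 2.0)]))                # True
# A finite union of nowhere dense sets is still nowhere dense:
print(is_nowhere_dense([(n, n) for n in range(5)]))  # True
# A closed interval of positive length is not:
print(is_nowhere_dense([(0.0, 1.0)]))                # False
```

The merging step is what computes the closure of the union before taking the interior, mirroring the order of operations in (A⁻)°.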
By using Proposition 3.53 it is easy to show that A ∪ B is nowhere dense in X whenever the sets A and B are both nowhere dense in X. Thus a trivial induction ensures that any finite union of nowhere dense subsets of a metric space X is again nowhere dense in X. However, a countable union of nowhere dense subsets of X does not need to be nowhere dense in X. A subset A of a metric space X is of first category (or meagre) in X if it is a countable union of nowhere dense subsets of X.
That is, A is of first category in X if A = ⋃_{n∈ℕ} Aₙ, where each Aₙ is nowhere dense in X. The complement of a set of first category in X is a residual (or comeagre) in X. A subset B of X is of second category (or nonmeagre) in X if it is not of first category in X.
Example 3V. Let X be a metric space. Recall that a subset of X is dense in itself if and only if it has no isolated point. Thus, according to Proposition 3.52, if X is nonempty and dense in itself, then every singleton in X is nowhere dense in X. Moreover, a nonempty countable subset of X is a countable union of singletons in X. Therefore, if A is a nonempty countable subset of X, and if X is dense in itself, then A is a countable union of nowhere dense subsets of X. Summing up: If a metric space X is dense in itself, then every countable subset of it is of first category in X. For instance, Q is a (dense) subset of first category in ℝ. Equivalently, if a metric space X has no isolated point, then every subset of second category in X is uncountable.

The following basic properties of sets of first category are (almost) immediate consequences of the definition. Note that assertions (a) and (b), but not assertion (c), in the proposition below still hold if we replace "sets of first category" by "nowhere dense sets".
Proposition 3.54. Let X be a metric space. (a) A subset of a set of first category in X is of first category in X.
(b) The intersection of an arbitrary collection of subsets of X is of first category in X if at least one of the subsets is of first category in X.
(c) The union of a countable collection of sets of first category in X is of first category in X.
Proof. If B = ⋃ₙ Bₙ and A ⊆ B ⊆ X, then A = ⋃ₙ Aₙ with Aₙ = Bₙ ∩ A ⊆ Bₙ, so that (Aₙ⁻)° ⊆ (Bₙ⁻)° for each n; and hence (a) holds true by the definitions of nowhere dense set and set of first category. Let {A_γ}_{γ∈Γ} be an arbitrary collection of subsets of X. Since ⋂_{γ∈Γ} A_γ ⊆ A_α for every α ∈ Γ, it follows by item (a) that ⋂_{γ∈Γ} A_γ is of first category in X whenever at least one of the sets A_γ in {A_γ}_{γ∈Γ} is of first category in X. This proves assertion (b). If {Aₙ} is a countable collection of subsets of X, and if each Aₙ is a countable union of nowhere dense subsets of X, then ⋃ₙ Aₙ is itself a countable union of nowhere dense subsets of X (recall: a countable union of a countable collection is again a countable collection - Corollary 1.11). Therefore ⋃ₙ Aₙ is a set of first category in X, which concludes the proof of assertion (c). □

Example 3V may suggest that sets of second category are particularly important. The next theorem, which plays a fundamental role in the theory of metric spaces, shows that they really are very important.
Theorem 3.55. (Baire Category Theorem). Every nonempty open subset of a complete metric space X is of second category in X.

Proof. Let {Aₙ}_{n∈ℕ} be an arbitrary countable collection of nowhere dense subsets of a metric space X and set A = ⋃_{n∈ℕ} Aₙ ⊆ X. Let U be an arbitrary nonempty open subset of X.

Claim. For each integer k ≥ 1 there exists a collection {Bₙ}ₙ₌₁ᵏ⁺¹ of open balls Bₙ with radius εₙ ∈ (0, 1/n) such that Bₙ⁻ ⊆ U and Bₙ⁻ ∩ Aₙ = ∅ for each n = 1, ..., k+1, and Bₙ₊₁⁻ ⊆ Bₙ for every n = 1, ..., k.

Proof. Since each Aₙ is nowhere dense in X, it follows by Proposition 3.53 that there exist open balls B₁ and B₂ with centers x₁ and x₂ and positive radii ε₁ < 1 and ε₂ < 1/2, respectively, such that Bᵢ⁻ ⊆ U and Bᵢ⁻ ∩ Aᵢ = ∅ for i = 1, 2, and B₂⁻ ⊆ B₁. Thus the claimed result holds for k = 1. Suppose it holds for some k ≥ 1 and take a positive radius ε_{k+2} < min{ε_{k+1}, 1/(k+2)}. Proposition 3.53 ensures again the existence of an open ball B_{k+2} with center x_{k+2} and radius ε_{k+2} such that B_{k+2}⁻ ⊆ B_{k+1} ⊆ U and B_{k+2}⁻ ∩ A_{k+2} = ∅, so that the claimed result holds for k+1 whenever it holds for some k ≥ 1, which concludes the proof by induction. □

Consider the collection {Bₙ}_{n∈ℕ} = ⋃_{k∈ℕ} {Bₙ}ₙ₌₁ᵏ⁺¹. Since each open ball Bₙ = B_{εₙ}(xₙ) ⊆ U is such that 0 < εₙ < 1/n, it follows that the sequence of centers xₙ ∈ U is a Cauchy sequence in X (reason: for each ε > 0 take a positive integer n_ε > 1/ε so that, if n ≥ m ≥ n_ε, then xₙ ∈ Bₙ ⊆ B_m = B_{ε_m}(x_m) with ε_m < 1/m ≤ 1/n_ε < ε, and hence d(x_m, xₙ) < ε). Now suppose the metric space X is complete. This ensures that the Cauchy sequence {xₙ}_{n∈ℕ} converges in X to, say, x ∈ X. Take an arbitrary integer i ≥ 1. Since x is the limit of the sequence {xₙ}_{n≥i} (it is a subsequence of {xₙ}_{n∈ℕ}), and since {xₙ}_{n≥i} ⊆ Bᵢ (Bₙ₊₁ ⊆ Bₙ for every n ∈ ℕ), it follows that x ∈ Bᵢ⁻ (i.e., x is an adherent point of Bᵢ). Thus x ∉ A, because Bᵢ⁻ ∩ Aᵢ = ∅ for every i ∈ ℕ and A = ⋃_{i∈ℕ} Aᵢ; and x ∈ U, because Bᵢ⁻ ⊆ U for all i ∈ ℕ. Hence x ∈ U\A, and therefore U ≠ A. Summing up: If U is a nonempty open subset of a complete metric space X, and if A is a set of first category in X (i.e., if A is a countable union of nowhere dense sets in X), then U ≠ A. Conclusion: Every nonempty open subset of a complete metric space X is not a set of first category in X. □
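For X = ℝ the nested-ball construction in the proof can be carried out explicitly. The sketch below is illustrative (the sets Aₙ are taken to be singletons {qₙ} of finitely many rationals, and a one-third shrinking rule replaces the balls B_{εₙ}; all names are ad hoc): each step shrinks the current closed interval to a closed subinterval that misses qₙ, so every point of the final interval lies outside ⋃ₙ Aₙ.

```python
from fractions import Fraction

def nested_intervals(points, lo=Fraction(0), hi=Fraction(1)):
    # At step n, replace [lo, hi] by a closed subinterval of one third its
    # length that misses p_n (the analogue of B_{n+1}^- ⊆ B_n ⊆ U with
    # B_n^- ∩ A_n = ∅ in the proof above).
    for p in points:
        third = (hi - lo) / 3
        if p <= lo + 3 * third / 2:   # p lies in (or left of) the left half:
            lo = hi - third           #   keep the rightmost third
        else:                         # otherwise:
            hi = lo + third           #   keep the leftmost third
        assert not (lo <= p <= hi)    # the new interval misses p
    return lo, hi

# Singletons of rationals play the nowhere dense sets A_n:
qs = [Fraction(k, 7) for k in range(8)] + [Fraction(k, 5) for k in range(6)]
lo, hi = nested_intervals(qs)
print(lo < hi)                               # True: the intersection survives
print(all(not (lo <= q <= hi) for q in qs))  # True: it misses every A_n
```

Because the intervals are nested, a point excluded at step n stays excluded forever, which is exactly how the proof produces a point of U\A.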
In particular, as a metric space is always open in itself, we get at once the following corollary of Theorem 3.55.
Corollary 3.56. A nonempty complete metric space is of second category in itself.
Corollary 3.57. The complement of a set of first category in a nonempty complete metric space is a dense subset of second category in that space.

Proof. Let X be a nonempty complete metric space. The union of two sets of first category in X is again a set of first category in X (Proposition 3.54). Since the union of a subset A of X and its complement X\A is the whole space X, it follows by Corollary 3.56 that A and X\A cannot both be of first category in X. Thus X\A is of second category in X whenever A is of first category in X. Moreover, if (X\A)⁻ ≠ X, then A° is a nonempty open subset of X (Proposition 3.31), and hence A° is a set of second category in X (Theorem 3.55), which implies that A is a set of second category in X (reason: if A is of first category in X, then A° is of first category in X because A° ⊆ A - Proposition 3.54). Therefore, if A is of first category in X, then X\A is dense in X. □

In other words, if X ≠ ∅ is a complete metric space, then every residual in X is both dense in X and of second category in X.

Theorem 3.58. If a nonempty complete metric space is a countable union of closed sets, then at least one of them has nonempty interior.
Proof. According to Corollary 3.56 a nonempty complete metric space is not a countable union of nowhere dense subsets of it. Thus, if a nonempty complete metric space X is a countable union of subsets of X, then at least one of them is not nowhere dense in X (i.e., the closure of at least one of them has nonempty interior). □
This is a particularly useful version of the Baire Category Theorem. A further version of it, which is the dual statement of Theorem 3.58, reads as follows. (This in fact is the classical Theorem of Baire.)

Theorem 3.59. Every countable intersection of open and dense subsets of a complete metric space X is dense in X.
Proof. Let {Uₙ} be a countable collection of open and dense subsets of a nonempty complete metric space X (if X is empty the result is trivially verified). Set Vₙ = X\Uₙ for each n, so that {Vₙ} is a countable collection of closed subsets of X with empty interior (recall: Uₙ⁻ = X means Vₙ° = ∅ by Proposition 3.31). If (⋂ₙ Uₙ)⁻ ≠ X, then X\(⋂ₙ Uₙ)⁻ ≠ ∅, and hence X\⋂ₙ Uₙ ≠ ∅. However, X\⋂ₙ Uₙ = ⋃ₙ Vₙ (De Morgan laws), so that ⋃ₙ Vₙ ≠ ∅, which implies (⋃ₙ Vₙ)⁻ ≠ ∅. Thus, according to Theorem 3.58, the nonempty subspace (⋃ₙ Vₙ)⁻ of X is not complete (reason: each Vₙ is a closed subset of (⋃ₙ Vₙ)⁻ - see Problem 3.38 - and Vₙ° = ∅ for every n). On the other hand, Corollary 3.41 ensures that (⋃ₙ Vₙ)⁻ is a complete subspace of the complete metric space X (since (⋃ₙ Vₙ)⁻ is a closed subset of X); which leads to a contradiction. Conclusion: (⋂ₙ Uₙ)⁻ = X. □
The Baire Category Theorem is a nonconstructive existence theorem. For instance, for an arbitrary countable collection {Aₙ} of nowhere dense sets in a nonempty complete metric space X, Corollary 3.57 asserts the existence of a dense set of points in X with the property that none of them lies in Aₙ for any n, but it does not tell us how to find those points. However, the unusual (and remarkable) fact about the Baire Category Theorem is that, while its hypothesis (completeness) has been defined in a metric space and is not a topological invariant (completeness is preserved by uniform homeomorphism but not by plain homeomorphism - see Example 3U), its conclusion is of purely topological nature and is a topological invariant. For instance, the conclusion in Theorem 3.55 (being of second category) is a topological invariant in a general topological space. Indeed, if G : X → Y is a homeomorphism between topological spaces X and Y and A is an arbitrary subset of X, then it is easy to show that G(A)⁻ = G(A⁻) and G(A)° = G(A°). Thus the property of being nowhere dense is a topological invariant (i.e., (A⁻)° = ∅ if and only if (G(A)⁻)° = ∅), and so is the property of being of first or second category (for G(A) = G(⋃ₙ Aₙ) = ⋃ₙ G(Aₙ) whenever A = ⋃ₙ Aₙ). Such a purely topological conclusion suggests the following definition. A topological space is a Baire space if the conclusion of the classical Theorem of Baire holds on it. Precisely, a Baire space is a topological space X on which every countable intersection of open and dense subsets of X is dense in X. Thus Theorem 3.59 simply says that every complete metric space is a Baire space.

Example 3W. We shall now unfold three further consequences of the Baire Category Theorem, each resulting from one of the above three versions of it.
(a) Suppose A is a set of first category in a complete metric space X. According to Corollary 3.57, (X\A)⁻ = X or, equivalently, A° = ∅. Conclusion: A set of first category in a complete metric space has empty interior. Corollary: A closed set of first category in a complete metric space is nowhere dense in that space (i.e., if A is a set of first category in a complete metric space, and if A = A⁻, then (A⁻)° = ∅).

(b) Recall that a set without isolated points (i.e., dense in itself) in a complete metric space may be countable (example: Q in ℝ). Suppose A is a nonempty perfect subset of a complete metric space X, which means that A is closed in X and dense in itself. If A is a countable set, then it is the countable union of all singletons in it. Since every point in A is not an isolated point of A (for A is dense in itself), it follows by Proposition 3.52 that every singleton in A is nowhere dense in the subspace A, so that every singleton in A has empty interior in A. (Recall: a singleton in a metric space A is closed in A - Problem 3.37.) Then A is the countable union of closed sets in A, all of them with empty interior in A, and therefore the subspace A is not complete according to Theorem 3.58. However, since A is a closed subset of a complete metric space X, it follows by Corollary 3.41 that the subspace A is complete. Thus the assumption that A is countable leads to a contradiction. Conclusion: A nonempty perfect set in a complete metric space is uncountable.
(c) A subset of a metric space X is a G_δ if it is a countable intersection of open subsets of X, and an F_σ if it is a countable union of closed subsets of X. First observe that, if the complement X\A of a subset A of X is a countable union of subsets Cₙ of X, then A includes a G_δ. In fact, if X\A = ⋃ₙ Cₙ, then ⋂ₙ(X\Cₙ⁻) ⊆ X\⋃ₙ Cₙ = X\(X\A) = A, and hence A includes a G_δ; viz., ⋂ₙ(X\Cₙ⁻). Moreover, if each Cₙ is nowhere dense in X (i.e., (Cₙ⁻)° = ∅), then (X\Cₙ⁻)⁻ = X (see Proposition 3.31) so that X\Cₙ⁻ is open and dense in X for every index n. Therefore, according to Theorem 3.59, ⋂ₙ(X\Cₙ⁻) is a dense G_δ in X whenever X is a complete metric space. Summing up: If X\A is of first category (i.e., a countable union of nowhere dense subsets of X) in a complete metric space X, then A includes a dense G_δ. Conversely, if a subset A of a metric space X includes a G_δ, say ⋂ₙ Uₙ ⊆ A where each Uₙ is open in X, then X\A ⊆ X\⋂ₙ Uₙ = ⋃ₙ(X\Uₙ). If the G_δ is dense in X (i.e., (⋂ₙ Uₙ)⁻ = X), then (X\⋂ₙ Uₙ)° = ∅ (see Proposition 3.31 again) so that [(X\Uₙ)⁻]° = (X\Uₙ)° ⊆ (⋃ₘ(X\Uₘ))° = ∅, and hence each X\Uₙ is nowhere dense in X. Thus ⋃ₙ(X\Uₙ) is a set of first category in X, which implies that X\A is of first category in X as well (since a subset of a set of first category is itself of first category - Proposition 3.54). Summing up: If A includes a dense G_δ in a metric space X, then X\A is of first category in X. Conclusion: A subset of a complete metric space is a residual (i.e., its complement is of first category) if and only if it includes a dense G_δ. (This generalizes Corollary 3.57.) Dually, a subset of a complete metric space is of first category if and only if it is included in an F_σ with empty interior. (This generalizes the results of item (a).)
3.10 Compact Sets
Recall that a collection 𝒜 of nonempty subsets of a set X is a covering of A ⊆ X (or 𝒜 covers A) if A ⊆ ⋃𝒜. If 𝒜 is a covering of A, then any subcollection of 𝒜 that also covers A is a subcovering of A. A covering of A comprised only of open subsets of X is called an open covering.

Definition 3.60. A metric space X is compact if every open covering of X includes a finite subcovering. A subset A of a metric space X is compact if it is compact as a subspace of X.

The notion of compactness plays an extremely important role in general topology. Note that any topology T on a metric space X clearly is an open covering of X which trivially has a finite subcovering; namely, the collection {X} consisting of X alone. However, the definition of a compact space demands that every open covering of it includes a finite subcovering. The idea behind the definition of compact spaces is that even open coverings made up of "very small" open sets have a finite subcovering.
Note that the definition of a compact subspace A of a metric space X is given in terms of the relative topology on A: an open covering of the subspace A consists of relatively open subsets of A. The next elementary result says that this can be equally defined in terms of the topology on X.

Proposition 3.61. A subset A of a metric space X is compact if and only if every covering of A made up of open subsets of X has a finite subcovering.

Proof. If 𝒰 is a covering of A (i.e., A ⊆ ⋃𝒰) consisting of open subsets of X, then {U ∩ A: U ∈ 𝒰} is an open covering of the subspace A (see Problem 3.38). Conversely, every open covering 𝒰_A of the subspace A consisting of relatively open subsets of A is of the form {U ∩ A: U ∈ 𝒰} for some covering 𝒰 of A consisting of open subsets of X. (Reason: U_A ∈ 𝒰_A if and only if U_A = A ∩ U for some open subset U of X - see Problem 3.38 again.) □
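For the compact subset [0, 1] of ℝ, a finite subcovering can be extracted from a covering by open intervals with a greedy sweep. The sketch below is an illustration (the function name, the covering, and the sweep rule are ad hoc, not from the text); it exhibits the finite subcovering whose existence Definition 3.60 and Proposition 3.61 demand.

```python
def finite_subcover(cover, lo=0.0, hi=1.0):
    # Sweep [lo, hi] left to right: among the open intervals containing the
    # current point, pick one reaching farthest right.  For a genuine
    # covering of [lo, hi] the sweep stops after finitely many picks,
    # exhibiting a finite subcovering (cf. Proposition 3.61).
    chosen, x = [], lo
    while True:
        best = max((I for I in cover if I[0] < x < I[1]), key=lambda I: I[1])
        chosen.append(best)
        if best[1] > hi:
            return chosen
        x = best[1]

# A covering of [0, 1] by eleven small open intervals:
cover = [(n / 10 - 0.15, n / 10 + 0.15) for n in range(11)]
chosen = finite_subcover(cover)
grid = [k / 1000 for k in range(1001)]
print(all(any(a < t < b for a, b in chosen) for t in grid))  # True
print(len(chosen) < len(cover))                              # True
```

Consecutive chosen intervals overlap (each new pick contains the endpoint of the previous one), so their union is an open set including all of [0, 1].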
The properties of being a closed subset and of being a compact subset of a metric space are certainly different from each other (trivial example: every metric space is closed in itself). However, "inside" a compact metric space these properties coincide.
Theorem 3.62. Let A be a subset of a metric space X.
(a) If A is compact, then A is closed in X.

(b) If X is compact and if A is closed in X, then A is compact.

Proof. (a) Let A be a compact subset of a metric space X. If either A = ∅ or A = X, then A is trivially closed in X. Thus suppose ∅ ≠ X\A ≠ X and take an arbitrary point x in X\A. Since x is distinct from every point in A, it follows that for each a ∈ A there exists an open neighborhood A_a of a and an open neighborhood X_a of x such that A_a ∩ X_a = ∅ (reason: every metric space is a Hausdorff space - see Problem 3.37). But A ⊆ ⋃_{a∈A} A_a, so that {A_a}_{a∈A} is a covering of A consisting of nonempty open subsets of X. If A is compact, then there exists a finite subset of A, say {aᵢ}ᵢ₌₁ⁿ, such that A ⊆ ⋃ᵢ₌₁ⁿ A_{aᵢ} (Proposition 3.61). Set U_x = ⋂ᵢ₌₁ⁿ X_{aᵢ}, which is an open neighborhood of x (recall: each X_{aᵢ} is an open neighborhood of x). Since A_{aᵢ} ∩ U_x = ∅ for each i, it follows that (⋃ᵢ₌₁ⁿ A_{aᵢ}) ∩ U_x = ∅, and hence A ∩ U_x = ∅. Therefore U_x ⊆ X\A. Conclusion: X\A is open in X (it includes an open neighborhood of each one of its points).

(b) Let A be a closed subset of a compact metric space X. Take an arbitrary covering of A, say 𝒰_A, consisting of open subsets of X. Thus 𝒰_A ∪ {X\A} is an open covering of X. As X is compact, this covering includes a finite subcovering, say 𝒰, so that 𝒰\{X\A} ⊆ 𝒰_A is a finite subcovering of 𝒰_A. Therefore, every covering of A consisting of open subsets of X has a finite subcovering, and hence (Proposition 3.61) A is compact. □

Corollary 3.63. Let X be a compact metric space. A subset A of X is closed in X if and only if it is compact.

We say that a subset of a metric space X is relatively compact (or conditionally compact) if it has a compact closure. It is clear by Corollary 3.63 that every subset of a compact metric space is relatively compact. Another important property of a compact set is that the continuous image of a compact set is compact.
Theorem 3.64. Let F : X → Y be a continuous mapping of a metric space X into a metric space Y.

(a) If A is a compact subset of X, then F(A) is compact in Y.
(b) If X is compact, then F is a closed mapping.

(c) If X is compact and F is injective, then F is a homeomorphism of X onto F(X).

Proof. (a) Let 𝒰 be a covering of F(A) (i.e., F(A) ⊆ ⋃_{U∈𝒰} U) consisting of open subsets U of Y. If F is continuous, then F⁻¹(U) is an open subset of X for every U ∈ 𝒰 according to Theorem 3.12. Set F⁻¹(𝒰) = {F⁻¹(U): U ∈ 𝒰}, a collection of open subsets of X. Clearly (see Problem 1.2), A ⊆ F⁻¹(F(A)) ⊆ F⁻¹(⋃_{U∈𝒰} U) = ⋃_{U∈𝒰} F⁻¹(U), so that F⁻¹(𝒰) is a covering of A made up of open subsets of X. If A is compact, then (cf. Proposition 3.61) there exists a finite subcollection of F⁻¹(𝒰) covering A; that is, there exists {Uᵢ}ᵢ₌₁ⁿ ⊆ 𝒰 such that A ⊆ ⋃ᵢ₌₁ⁿ F⁻¹(Uᵢ) ⊆ X. Thus F(A) ⊆ F(⋃ᵢ₌₁ⁿ F⁻¹(Uᵢ)) ⊆ ⋃ᵢ₌₁ⁿ Uᵢ ⊆ Y (have another look at Problem 1.2), and hence F(A) is compact by Proposition 3.61.

(b) If X is compact and if A is a closed subset of X, then A is compact by Theorem 3.62(b). Hence F(A) is a compact subset of Y by item (a), so that F(A) is closed in Y according to Theorem 3.62(a).

(c) If X is compact and F is injective, then F is a continuous invertible closed mapping of X onto F(X) by item (b), and therefore a homeomorphism of X onto F(X) (Theorem 3.24). □
As compactness is preserved under continuous mappings, it is obviously preserved by homeomorphisms, and so compactness is a topological invariant. Moreover, a one-to-one continuous correspondence between compact metric spaces is a homeomorphism. These are straightforward corollaries of Theorem 3.64.

Corollary 3.65. If X and Y are homeomorphic metric spaces, then one is compact if and only if the other is.
Corollary 3.66. If X and Y are compact metric spaces, then every injective continuous mapping of X onto Y is a homeomorphism.

Probably the reader has already noticed two important features: the metric has not played its role yet in this section, and the concepts of completeness and compactness share some common properties (e.g., compare Theorems 3.40 and 3.62). Indeed, the compactness proofs so far apply to topological spaces that are not necessarily metrizable. Actually they all apply to Hausdorff spaces (metrizable or not), and Theorems 3.62(b) and 3.64(a) do hold for general topological spaces (not necessarily Hausdorff). As for the connection between completeness and compactness, first note that the notion of completeness introduced in Section 3.7 needs a metric. Moreover, as we have just seen, compactness is a topological invariant, while completeness is not preserved by plain homeomorphism (only by uniform homeomorphism; see Theorem 3.44 and Example 3U). However, as we shall see in the next section, in a metric space compactness implies completeness.
Some of the most important results of mathematical analysis deal with continuous
mappings on compact metric spaces. Theorem 3.64 is a special instance of such results that leads to many relevant corollaries (e.g., Corollary 3.85 in the next section is an extremely useful corollary of Theorem 3.64). Another important result in this line reads as follows.
Theorem 3.67. Every continuous mapping of a compact metric space into an arbitrary metric space is uniformly continuous.
Proof. Let (X, d_X) and (Y, d_Y) be metric spaces and take an arbitrary real number ε > 0. If F: X → Y is a continuous mapping, then for each x ∈ X there exists a real number δ_ε(x) > 0 such that

d_X(x', x) < 2δ_ε(x)   implies   d_Y(F(x'), F(x)) < ε.

Let B_{δ_ε(x)}(x) be the open ball with center at the point x and radius δ_ε(x). Consider the collection {B_{δ_ε(x)}(x)}_{x∈X}, which surely covers X (i.e., X = ∪_{x∈X} B_{δ_ε(x)}(x)). If X is compact (Definition 3.60), then this covering of X includes a finite subcovering, say ∪_{i=1}^n B_{δ_ε(x_i)}(x_i) with x_i ∈ X for each i = 1, ..., n. Take any x' ∈ X so that

d_X(x_j, x') < δ_ε(x_j)

for some j = 1, ..., n (i.e., X = ∪_{i=1}^n B_{δ_ε(x_i)}(x_i) implies that every point of X belongs to some ball B_{δ_ε(x_j)}(x_j)). Therefore

d_Y(F(x_j), F(x')) < ε.

Set δ_ε = min{δ_ε(x_i)}_{i=1}^n, which is a positive number. If d_X(x', x) < δ_ε, then d_X(x, x_j) ≤ d_X(x, x') + d_X(x', x_j) < δ_ε + δ_ε(x_j) ≤ 2δ_ε(x_j) by the triangle inequality, and hence d_Y(F(x), F(x_j)) < ε. Thus, since d_Y(F(x), F(x')) ≤ d_Y(F(x), F(x_j)) + d_Y(F(x_j), F(x')), it follows that

d_Y(F(x'), F(x)) < 2ε.

Conclusion: Given an arbitrary ε > 0 there exists δ_ε > 0 such that

d_X(x', x) < δ_ε   implies   d_Y(F(x'), F(x)) < 2ε

for all x, x' ∈ X. That is, F: X → Y is uniformly continuous. □
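The proof above is constructive: a finite subcover produces a single δ that works everywhere. The sketch below (ours, not the book's; the helper name `uniform_delta` and the grid size are illustrative assumptions) estimates such a uniform δ numerically for a continuous function on the compact interval [0, 1].

```python
import math

def uniform_delta(f, a, b, eps, samples=1001):
    # Estimate a uniform delta for f on the compact interval [a, b]:
    # the smallest horizontal gap between sampled points whose images
    # differ by eps or more. A finite-grid heuristic, not a proof.
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    delta = b - a
    for i, x in enumerate(xs):
        for j in range(i + 1, samples):
            if abs(f(x) - f(xs[j])) >= eps:
                delta = min(delta, xs[j] - x)
                break  # larger j only widens the gap for this x
    return delta

# sqrt is continuous on the compact [0, 1], hence uniformly continuous;
# its modulus of continuity is worst near 0
d = uniform_delta(math.sqrt, 0.0, 1.0, 0.1)
```

On [0, 1] the estimate is governed by the behavior of √ near 0, where moving from 0 to 0.01 already changes the value by 0.1.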
We shall now investigate alternative characterizations of compact sets that, unlike the fundamental concept posed in Definition 3.60, will be restricted to metrizable spaces.
Definition 3.68. Let A be a subset of a metric space (X, d). A subset A_ε of A is an ε-net for A if for every point x of A there exists a point y in A_ε such that
d(x, y) < ε. A subset A of X is totally bounded in (X, d) if for every real number ε > 0 there exists a finite ε-net for A.

Proposition 3.69. Let A be a subset of a metric space X. The following assertions are equivalent.

(a) A is totally bounded.

(b) For every real number ε > 0 there exists a finite partition of A into sets of diameter less than ε.

Proof. Take an arbitrary ε > 0 and set ρ = ε/3. If there exists a finite ρ-net A_ρ for A, then the finite collection of open balls {B_ρ(y)}_{y∈A_ρ} covers A. That is, A ⊆ ∪_{y∈A_ρ} B_ρ(y), because every x ∈ A belongs to B_ρ(y) for some y ∈ A_ρ whenever A_ρ is a ρ-net for A. Set A_y = B_ρ(y) ∩ A for each y ∈ A_ρ, so that A = ∪_{y∈A_ρ} A_y. A disjointification (Problem 1.18) of the finite collection {A_y}_{y∈A_ρ} is a finite partition of A into sets of diameter not greater than max_{y∈A_ρ} diam(B_ρ(y) ∩ A) ≤ 2ρ < ε. Thus (a) implies (b) according to Definition 3.68. On the other hand, if {A_i}_{i=1}^n is a finite partition of A into (nonempty) sets of diameter less than ε, then by taking one point a_i of each set A_i we get a finite set {a_i}_{i=1}^n which is an ε-net for A. Therefore (b) implies (a). □

Note that every finite subset of a metric space X is totally bounded: it is a finite ε-net for itself for every positive ε. In particular, the empty subset of X is totally bounded: for every positive ε the empty set is (vacuously) an ε-net for itself. It is also readily verified that every subset of a totally bounded set is totally bounded (indeed, if A_{ε/2} is a finite (ε/2)-net for A and B ⊆ A, then picking one point of B in each ball B_{ε/2}(y), y ∈ A_{ε/2}, that meets B yields a finite ε-net for B). Moreover, the closure of a totally bounded set is again totally bounded (reason: A_ε is a 2ε-net for A⁻ whenever A_ε is an ε-net for A).

Proposition 3.70. Let A be a subset of a metric space X. If A has a finite ε-net for some ε > 0, then A is bounded in X.

Proof. Suppose a nonempty subset A of a metric space (X, d) has a finite ε-net A_ε for some real number ε > 0. Take x, y ∈ A arbitrary, so that there exist a, b ∈ A_ε for which d(x, a) < ε and d(y, b) < ε. Thus d(x, y) ≤ d(x, a) + d(a, b) + d(b, y) < 2ε + diam(A_ε). Since A_ε is finite, diam(A_ε) is finite, and hence diam(A) ≤ 2ε + diam(A_ε) < ∞; that is, A is bounded in X. □

Corollary 3.71. Every totally bounded subset of a metric space is bounded in it.
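Definition 3.68 is algorithmic in flavor: for a finite point set an ε-net can be built greedily. The sketch below is ours, not the book's (the helper name `greedy_eps_net` is an assumption); it keeps a point whenever it is at distance at least ε from all points kept so far, so the kept points form a finite ε-net that is also pairwise ε-separated.

```python
import math

def greedy_eps_net(points, eps):
    # Keep a point if it is not within eps of any already-kept point.
    # The kept points are pairwise at distance >= eps, and every input
    # point lies within eps of some kept point: a finite eps-net.
    net = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in net):
            net.append(p)
    return net

# a grid in the unit square (a bounded, hence totally bounded, subset of R^2)
pts = [(i / 10, j / 10) for i in range(11) for j in range(11)]
net = greedy_eps_net(pts, 0.25)
```

The ε-separation of the net points is exactly the mechanism used in the proof of Lemma 3.73 below: a set with no finite ε-net supports an infinite ε-separated sequence.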
Example 3X. The converse fails. That is, a bounded set is not necessarily totally bounded. For instance, the closed unit ball B₁[0] centered at the null sequence 0 in the metric space (ℓ₊², d₂) of Example 3B is obviously bounded in (ℓ₊², d₂); actually, diam(B₁[0]) = 2. We shall show that there is no finite ε-net for B₁[0]
with ε ≤ √2/2, and hence B₁[0] is not totally bounded in (ℓ₊², d₂). Indeed, consider the countable subset E = {e_i}_{i∈N} of B₁[0] made up of all scalar-valued sequences e_i = {δ_{ik}}_{k∈N}; that is, each sequence e_i has just one nonzero entry (equal to one) at the ith position. If A_ε is an ε-net for B₁[0], then A_ε contains a point within a distance less than ε of each e_i in E, and hence A_ε must have a point in each open ball B_ε(e_i). Since d₂(e_i, e_j) = √2 whenever i ≠ j, it follows that B_ε(e_i) ∩ B_ε(e_j) = ∅ (i.e., B_ε(e_i)\B_ε(e_j) = B_ε(e_i)) whenever i ≠ j and ε ≤ √2/2. Thus, if ε ≤ √2/2, then for each e_i ∈ E there exists b_i ∈ A_ε ∩ B_ε(e_i)\B_ε(e_j) for every j ≠ i. This establishes an injective function from E to A_ε, so that #E ≤ #A_ε. Therefore, A_ε is at least countably infinite (i.e., ℵ₀ ≤ #A_ε). Conclusion: B₁[0] is a closed and bounded set in the complete metric space (ℓ₊², d₂) that is not totally bounded in (ℓ₊², d₂).
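Example 3X can be probed numerically with truncated sequences. The snippet below (an illustration, not part of the text) checks that the "unit sequences" e_i are pairwise at d₂-distance √2, which is exactly what makes the balls B_ε(e_i) disjoint for ε ≤ √2/2.

```python
import math

def d2(x, y):
    # the l2 metric on finitely supported (truncated) sequences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

n = 50
# e_i has a single nonzero entry (equal to one) at position i
es = [[1.0 if k == i else 0.0 for k in range(n)] for i in range(n)]
dists = {round(d2(es[i], es[j]), 12) for i in range(n) for j in range(i + 1, n)}
```

All pairwise distances collapse to the single value √2, so no ball of radius ≤ √2/2 can contain two of the e_i.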
Proposition 3.72. A totally bounded metric space is separable.
Proof. Suppose X is a totally bounded metric space. For each positive integer n let X_n be a finite (1/n)-net for X. Set A = ∪_{n∈N} X_n, which is a countable subset of X (Corollary 1.11) and dense in X. Thus X is separable. To verify that A is dense in X, proceed as follows. Take x ∈ X arbitrary. For each positive integer n there exists x_n ∈ X_n such that d(x, x_n) < 1/n (since X_n is a (1/n)-net for X), and the A-valued sequence {x_n}_{n∈N} converges in X to x. Thus A⁻ = X by Proposition 3.32. □

Lemma 3.73. A set A in a metric space X is totally bounded if and only if every A-valued sequence has a Cauchy subsequence.

Proof. If A is a finite set, then the result holds trivially (in this case every A-valued sequence has a constant subsequence). Thus suppose A is an infinite set and let d denote the metric on X.
(a) We shall say that an A-valued sequence {x_k}_{k∈N₀} has Property P_n(ε), for some integer n ∈ N, if there exists an ε > 0 such that d(x_j, x_k) ≥ ε for every pair (j, k) of distinct integers j, k = 0, 1, ..., n.

Claim. If A is not totally bounded, then there exist an ε > 0 and an A-valued sequence that has Property P_n(ε) for every n ∈ N.

Proof. Suppose the infinite set A is not totally bounded and let ε be any positive real number for which there is no finite ε-net for A. In particular (and trivially), no singleton in A is an ε-net for A, and hence there exists a pair of points in A, say x₀ and x₁, for which d(x₀, x₁) ≥ ε. Thus every A-valued sequence whose first two entries coincide with x₀ and x₁ has Property P₁(ε). Suppose there exists an A-valued sequence {x_k}_{k∈N₀} that has Property P_n(ε) for some integer n ∈ N, so that d(x_j, x_k) ≥ ε for every j, k = 0, 1, ..., n such that j ≠ k. Since the set {x_k}_{k=0}^n is not an ε-net for A (recall: there is no finite ε-net for A), it follows that there exists a point in A, say x_{n+1}, for which d(x_k, x_{n+1}) ≥ ε for every k = 0, 1, ..., n. Replace the (n+2)th entry of the sequence with x_{n+1}, so that the resulting sequence has Property P_{n+1}(ε). This concludes the proof by induction. □
If an A-valued sequence {x_k}_{k∈N₀} has Property P_n(ε) for every n ∈ N, then d(x_j, x_k) ≥ ε for every pair of distinct nonnegative integers j and k, and hence it has no Cauchy subsequence. Conclusion: If A is not totally bounded, then there exists an A-valued sequence that has no Cauchy subsequence. Equivalently, if every A-valued sequence has a Cauchy subsequence, then A is totally bounded.
(b) Conversely, suppose A is totally bounded and let {x_k}_{k∈N} be an arbitrary A-valued sequence. According to Proposition 3.69 there exists a finite partition A of A into sets of diameter less than 1. Since A is a finite partition of A, it follows that at least one of its sets, say A₁ ⊆ A, has the property that the (infinite) A-valued sequence x = {x_k}_{k∈N} has an (infinite) subsequence, say x₁ = {x(1)_k}_{k∈N}, whose entries lie in A₁. Note that A₁ is totally bounded (because A is). Thus there exists a finite partition A₁ of A₁, consisting of subsets of A₁ with diameter less than 1/2, such that at least one of its sets, say A₂ ⊆ A₁ ⊆ A, has the property that the A₁-valued sequence x₁ = {x(1)_k}_{k∈N} has a subsequence, say x₂ = {x(2)_k}_{k∈N}, whose entries lie in A₂. This leads to the inductive construction of a decreasing sequence {A_n}_{n∈N} of subsets of A with diam(A_n) < 1/n, each including a subsequence x_n = {x(n)_k}_{k∈N} of the A-valued sequence {x_k}_{k∈N}, for every n ∈ N. Moreover, the sequence of sequences {x_n}_{n∈N} (i.e., the A^N-valued sequence whose entries are the A_n-valued sequences x_n for each n ∈ N) has the property that x_{n+1} is a subsequence of x_n for each n ∈ N. The kernel of the proof relies on the diagonal procedure (see Section 1.9). Consider the sequence {x(n)_n}_{n∈N} where, for each n ∈ N, x(n)_n is the nth entry of x_n. This sequence has the following properties: (1) it is an A-valued sequence (each x(n)_n lies in A_n ⊆ A); (2) it is in fact a subsequence of {x_k}_{k∈N} (since x_{n+1} is a subsequence of x_n for each n ∈ N and x₁ is a subsequence of x = {x_k}_{k∈N}); and (3) d(x(m)_m, x(n)_n) < 1/m whenever m < n (since x(m)_m ∈ A_m, x(n)_n ∈ A_n, A_n ⊆ A_m, and diam(A_m) < 1/m for every m ∈ N). Therefore, the "diagonal" sequence {x(n)_n}_{n∈N} is a Cauchy subsequence of {x_k}_{k∈N}. □
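The diagonal procedure of part (b) can be seen concretely with index sets. In the toy sketch below (ours, not the book's), row k is a tail of row k−1, mimicking nested subsequences; the diagonal takes the nth entry of the nth row and is eventually inside every row.

```python
# row k lists the indices of the k-th nested "subsequence":
# each row is a tail of the previous one, so rows[0] ⊇ rows[1] ⊇ ...
rows = [list(range(k, 100)) for k in range(10)]

# the diagonal procedure: take the n-th entry of the n-th row
diagonal = [rows[n][n] for n in range(10)]
```

From stage k on, every diagonal entry belongs to row k, which is what makes the diagonal a subsequence of each x_k in the proof.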
Total boundedness is not a topological invariant, but it is preserved by uniform homeomorphism. In fact, the same example that shows that completeness is not preserved by plain homeomorphism (Example 3U) also shows that total boundedness is not preserved by plain homeomorphism: the sets (0, 1] and [1, ∞) are homeomorphic, but (0, 1] is totally bounded (see Example 3Y below) while [1, ∞) is not even bounded.
Theorem 3.74. Let F : X -> Y be a uniformly continuous mapping of a metric space X into a metric space Y. If A is a totally bounded subset of X, then F(A) is totally bounded in Y.
Proof. Let A be a nonempty subset of X and consider its image F(A) in Y under a mapping F: X → Y (if A is empty the result is trivially verified). Take an arbitrary F(A)-valued sequence {y_n} and consider any A-valued sequence {x_n} such that y_n = F(x_n) for every index n. If A is totally bounded, then Lemma 3.73 ensures that {x_n} has a Cauchy subsequence, say {x_{n_k}}. If F: X → Y is uniformly continuous,
then {F(x_{n_k})} is a Cauchy sequence in Y (Lemma 3.43) which is a subsequence of {y_n}. Therefore, every F(A)-valued sequence has a Cauchy subsequence; that is, F(A) is totally bounded (Lemma 3.73). □
In particular, if F: X → Y is a surjective uniformly continuous mapping of a totally bounded metric space X onto a metric space Y, then Y is totally bounded.
Corollary 3.75. If X and Y are uniformly homeomorphic metric spaces, then one is totally bounded if and only if the other is. Total boundedness is sometimes referred to as precompactness. Lemma 3.73 links completeness to compactness in a metric space. It actually leads to the proof that a metric space is compact if and only if it is complete and totally bounded. We shall prove this important assertion in the next section.
3.11 Sequential Compactness
The notions of compactness and total boundedness can be thought of as topological counterparts of the set-theoretic notion of "finiteness", in the sense that they may suggest "approximate finiteness" (see Propositions 3.61 and 3.69).
Definition 3.76. A metric space X is sequentially compact if every X-valued sequence has a subsequence that converges in X. A subset A of a metric space X is sequentially compact if it is sequentially compact as a subspace of X. Proposition 3.77. Let A be a subset of a metric space X. The following assertions are equivalent.
(a) A is sequentially compact. (b) Every infinite subset of A has at least one accumulation point in A. Proof. If A is empty, then the result holds trivially (it is sequentially compact because there is no A-valued sequence, and it satisfies condition (b) because it includes no infinite subset). Thus let A be a nonempty set. Recall that the limits of the convergent subsequences of a given sequence are precisely the accumulation points of its range.
Proof of (a)⇒(b). If B is an infinite subset of A, then B includes a countably infinite set, and hence there exists a B-valued sequence {b_n}_{n∈N} of distinct points. If A is sequentially compact, then this A-valued sequence has a subsequence that converges in X to a point a ∈ A (Definition 3.76). If b_n ≠ a for all n ∈ N, then this convergent subsequence is a B\{a}-valued sequence of distinct points that converges in X to a, and therefore a ∈ A is an accumulation point of B (Proposition 3.28). If b_m = a for some m ∈ N, then remove this (unique) point from {b_n}_{n∈N} and get a B\{a}-valued sequence of
156
3. Topological Structures
distinct points that converges in X to a, so that a E A is again an accumulation point of B. Conclusion: (a) implies (b).
Proof of (b)⇒(a). Let A be a subset of X for which assertion (b) holds true. In particular, every countably infinite subset of A has an accumulation point in A. Then every (infinite) A-valued sequence has a convergent subsequence, and hence A is sequentially compact. That is, (b) implies (a). □
A subset A of a metric space X (which may be X itself) is said to have the Bolzano-Weierstrass property if it satisfies condition (b) of Proposition 3.77. Thus a metric space X is sequentially compact if and only if it has the Bolzano-Weierstrass property. Note that every finite subset of a metric space X is sequentially compact (because it includes no infinite subset, and hence it has the Bolzano-Weierstrass property). In particular, the empty set is sequentially compact.
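The Bolzano-Weierstrass property for [0, 1]-valued sequences is usually proved by bisection: halve the interval, keep a half that still contains later terms, and select one term per stage. The sketch below (our illustration on a finite list; the function name is hypothetical) records indices whose values land in nested intervals of halving width, so consecutive selected values differ by at most 2^-(k+1).

```python
def bisection_subsequence(xs):
    # Pick indices i_0 < i_1 < ... with xs[i_k] inside a nested interval
    # of width 2**-(k+1); gaps between selected values then shrink
    # geometrically, mimicking the extraction of a Cauchy subsequence.
    lo, hi = 0.0, 1.0
    indices, last = [], -1
    while True:
        mid = (lo + hi) / 2
        nxt = next((i for i in range(last + 1, len(xs))
                    if lo <= xs[i] <= mid), None)
        if nxt is not None:
            hi = mid
        else:
            lo = mid
            nxt = next((i for i in range(last + 1, len(xs))
                        if mid < xs[i] <= hi), None)
        if nxt is None:
            return indices
        last = nxt
        indices.append(nxt)

# an equidistributed [0, 1]-valued sequence (golden-ratio rotation)
xs = [(i * 0.6180339887498949) % 1.0 for i in range(2000)]
idx = bisection_subsequence(xs)
vals = [xs[i] for i in idx]
```

Since the kth selected value and all later ones lie in an interval of width 2^-(k+1), the selected subsequence is Cauchy (hence convergent once the space is complete, as in Theorem 3.78).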
Theorem 3.78. A metric space is sequentially compact if and only if it is totally bounded and complete.

Proof. Let {x_n}_{n∈N} be an arbitrary X-valued sequence. If the metric space X is both totally bounded and complete, then {x_n}_{n∈N} has a Cauchy subsequence (because X is totally bounded; Lemma 3.73), which converges in X (because X is complete). Thus X is sequentially compact (Definition 3.76). On the other hand, sequential compactness clearly implies total boundedness (see Definition 3.76, Proposition 3.39(a) and Lemma 3.73). Moreover, if X is sequentially compact and if {x_n}_{n∈N} is an X-valued Cauchy sequence, then this sequence has a convergent subsequence (Definition 3.76), and hence {x_n}_{n∈N} converges in X (Proposition 3.39(c)). Therefore, in a sequentially compact metric space X every Cauchy sequence converges; that is, X is complete. □

The next lemma establishes another necessary and sufficient condition for sequential compactness. We shall apply this lemma to prove the equivalence between the concepts of compactness and sequential compactness in a metric space.

Lemma 3.79. (Cantor). A metric space X is sequentially compact if and only if every decreasing sequence {V_n}_{n∈N} of nonempty closed subsets of X has a nonempty intersection (i.e., is such that ∩_{n∈N} V_n ≠ ∅).

Proof. (a) Let {V_n}_{n∈N} be a decreasing sequence (i.e., V_{n+1} ⊆ V_n for every n ∈ N) of nonempty closed subsets of a metric space X. For each n ∈ N let v_n be a point of V_n, and consider the X-valued sequence {v_n}_{n∈N}. If X is sequentially compact, then {v_n}_{n∈N} has a subsequence that converges in X to, say, v ∈ X (Definition 3.76). Now take an arbitrary m ∈ N and note that this convergent subsequence is eventually in V_m (because {V_n}_{n∈N} is decreasing), and hence it has a V_m-valued subsequence that converges in X to v (Proposition 3.5). Since V_m is closed in X, it follows by the Closed Set Theorem that v ∈ V_m (Theorem 3.20). Therefore, as m is arbitrary, v ∈ ∩_{m∈N} V_m.
(b) Conversely, let {x_k}_{k∈N} be an arbitrary X-valued sequence and set X_n = {x_k ∈ X: k ≥ n} for each n ∈ N. Observe that {X_n}_{n∈N} is a decreasing sequence of nonempty subsets of X, and so is the sequence {X_n⁻}_{n∈N} of closed subsets of X consisting of the closures of each X_n. If ∩_{n∈N} X_n⁻ ≠ ∅, then there exists x ∈ X_n⁻ for all n ∈ N. Take an arbitrary real number ε > 0 and consider the open ball B_ε(x). Since B_ε(x) ∩ X_n ≠ ∅ for every n ∈ N (reason: put A = X and B = X_n in Proposition 3.32(b)), it follows that for every n ∈ N there exist integers k ≥ n for which x_k ∈ B_ε(x). Thus every nonempty open ball centered at x meets the range of the sequence {x_k}_{k∈N} infinitely often, and hence x is the limit of some convergent subsequence of {x_k}_{k∈N} (Proposition 3.28). Conclusion: Every X-valued sequence has a convergent subsequence, which means that X is sequentially compact. □
Theorem 3.80. A metric space is compact if and only if it is sequentially compact.

Proof. (a) Suppose X is a compact metric space (Definition 3.60). Let {V_n}_{n=1}^∞ be an arbitrary decreasing sequence of nonempty closed subsets of X. Set U_n = X\V_n for each n ∈ N, so that {U_n}_{n=1}^∞ is an increasing sequence of proper open subsets of X. If {U_n}_{n=1}^∞ covers X (i.e., if ∪_{n=1}^∞ U_n = X), then U_m = ∪_{n=1}^m U_n = X for some m ∈ N (because {U_n}_{n=1}^∞ is increasing and X is compact), which contradicts the fact that U_n ≠ X for every n ∈ N. Outcome: ∪_{n=1}^∞ U_n ≠ X, and hence (De Morgan laws) ∩_{n=1}^∞ V_n ≠ ∅. Therefore X is sequentially compact by Lemma 3.79.
(b) On the other hand, suppose X is a sequentially compact metric space. Since X is separable (Proposition 3.72 and Theorem 3.78), it follows by Theorem 3.35 that X has a countable base B of open subsets of X. Let U be an arbitrary open covering of X.

Claim. There exists a countable subcollection U' of U that covers X.

Proof. For each U ∈ U set B_U = {B ∈ B: B ⊆ U}. Since B is a base for X, and since U is an open subset of X, it follows by the very definition of base that U = ∪B_U. The collection B' = ∪_{U∈U} B_U of open subsets of X has the properties

#B' ≤ #B   and   ∪U ⊆ ∪B'.

Indeed, since B_U ⊆ B for every U ∈ U, it follows that ∪_{U∈U} B_U ⊆ B. Thus B' ⊆ B, so that #B' ≤ #B. Moreover, if U is an arbitrary set in U, then U = ∪B_U = ∪_{B∈B_U} B ⊆ ∪_{B∈B'} B = ∪B', and hence ∪U ⊆ ∪B'. Another property of the collection B' is that every set in B' is included in some set in U (reason: if B' ∈ B' = ∪_{U∈U} B_U, then B' ∈ B_{U'}, and so B' ⊆ U', for some U' ∈ U). For each set B' in B' take one set U' in U that includes B', and consider the subcollection U' of U consisting of all those sets U'. The very construction of U' establishes a surjective map of B' onto U' that embeds ∪B' in ∪U'. Thus

#U' ≤ #B'   and   ∪B' ⊆ ∪U'.
158
3. Topological Structures
Therefore, by transitivity,

#U' ≤ #B   and   ∪U ⊆ ∪U'.

Conclusion: U' is a countable subcollection of U (because B is a countable base for X) which covers X (because U covers X). □
If U' is finite, then it is itself a finite subcovering of U, so that X is compact. If U' is countably infinite, then it can be indexed by N so that U' = {U_n}_{n=1}^∞, where each U_n belongs to U. For each n ∈ N set V_n = X\∪_{i=1}^n U_i, so that {V_n}_{n=1}^∞ is a decreasing sequence of closed subsets of X. Since ∪U' = ∪_{n=1}^∞ U_n = X (recall: U' covers X), it follows that ∩_{n=1}^∞ V_n = ∅. Therefore, according to Lemma 3.79, at least one of the sets in {V_n}_{n=1}^∞, say V_m, must be empty (because X is sequentially compact). Thus ∩_{n=1}^m V_n = ∅, and hence ∪_{n=1}^m U_n = X. Conclusion: U includes a finite subcovering, viz. {U_n}_{n=1}^m, so that X is compact. □

The theorems have been proved. Let us now harvest the corollaries.
Corollary 3.81. If X is a metric space, then the following assertions are pairwise equivalent.
(a) X is compact. (b) X is sequentially compact. (c) X is complete and totally bounded.
Proof. Theorems 3.78 and 3.80. □
As we have already observed, completeness and total boundedness are preserved by uniform homeomorphisms but not by plain homeomorphisms, whereas compactness is preserved by plain homeomorphisms. Thus completeness and total boundedness are not topological invariants. However, when taken together they amount to compactness, which is a topological invariant.

Corollary 3.82. Every compact subset of any metric space is closed and bounded.

Proof. Theorem 3.62(a) and Corollaries 3.71 and 3.81. □
Recall that the converse fails. Indeed, we exhibited in Example 3X a closed and bounded subset of a (complete) metric space that is not totally bounded, and hence not compact.
Theorem 3.83. (Heine-Borel). A subset of Rⁿ is compact if and only if it is closed and bounded.
Proof. The condition is clearly necessary by Corollary 3.82. We shall prove that it is also sufficient. Consider the real line R equipped with its usual metric. Let
V_ρ be any nondegenerate closed and bounded interval, say V_ρ = [a, a + ρ] for some real number a and some ρ > 0. Take an arbitrary real number ε > 0 and let n_ε be a positive integer large enough so that ρ < (n_ε + 1)ε/2. For each integer k = 0, 1, ..., n_ε consider the interval A_k = [a + kε/2, a + (k + 1)ε/2) of diameter ε/2. Since A_j ∩ A_k = ∅ whenever j ≠ k, and V_ρ ⊆ [a, a + (n_ε + 1)ε/2) = ∪_{k=0}^{n_ε} A_k, it follows that {A_k ∩ V_ρ}_{k=0}^{n_ε} is a finite partition of V_ρ into sets of diameter less than ε. Thus every closed and bounded interval of the real line is totally bounded (Proposition 3.69). Now equip Rⁿ with any of the metrics d_∞ or d_p for some p ≥ 1 as in Example 3A (recall: these are uniformly equivalent metrics on Rⁿ; see Problem 3.33). Take an arbitrary bounded subset B of Rⁿ and consider the closed interval V_ρ of diameter ρ = diam(B) such that B ⊆ V_ρⁿ. Since V_ρ is totally bounded in R, it follows by Problem 3.64(a) that the Cartesian product V_ρⁿ is totally bounded in Rⁿ. Hence, as a subset of a totally bounded set, B is totally bounded. Conclusion: Every bounded subset of Rⁿ is totally bounded. Moreover, since Rⁿ is a complete metric space (when equipped with any of these metrics; see Example 3R(a)), it follows by Theorem 3.40(b) that every closed subset of Rⁿ is a complete subspace of Rⁿ. Therefore every closed and bounded subset of Rⁿ is compact (Corollary 3.81). □
The Heine-Borel Theorem is readily extended to Cⁿ (again equipped with any of the uniformly equivalent metrics d_∞ or d_p for some p ≥ 1 as in Example 3A): A subset of Cⁿ is compact if and only if it is closed and bounded in Cⁿ.
Corollary 3.84. Let X be a complete metric space, and let A be a subset of X.
(a) A is compact if and only if it is closed and totally bounded in X.
(b) A is relatively compact if and only if it is totally bounded in X.

Proof. By Theorem 3.40(b) and Corollaries 3.81 and 3.82 we get the result in (a), which in turn leads to the result in (b) by recalling that A⁻ is totally bounded if and only if A is totally bounded. □

Corollary 3.85. A continuous image of any compact set is closed and bounded.

Proof. Theorem 3.64(a) and Corollary 3.82. □
Theorem 3.86. (Weierstrass). If φ: X → R is a continuous real-valued function on a metric space X, then φ assumes both a maximum and a minimum value on each nonempty compact subset of X.

Proof. If φ is a continuous real-valued function defined on a (nonempty) compact metric space, then its (nonempty) range R(φ) is both closed and bounded in the real line R equipped with its usual metric (Corollary 3.85). Thus the bounded subset R(φ) of R has an infimum and a supremum in R, which actually lie in R(φ) because R(φ) is closed in R (recall: a closed set contains all its adherent points). □
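Numerically, the Weierstrass theorem is what justifies approximating the extrema of a continuous function on [0, 1] by a fine grid: the true maximum and minimum exist and are attained. A small sketch (ours, not the book's; the grid size is an arbitrary choice):

```python
def extrema_on_grid(f, n=100001):
    # sample f on a uniform grid over the compact interval [0, 1];
    # for continuous f the grid values approximate the attained
    # minimum and maximum guaranteed by the Weierstrass theorem
    vals = [f(i / (n - 1)) for i in range(n)]
    return min(vals), max(vals)

# t(1 - t) attains its minimum 0 at the endpoints and its maximum 1/4 at t = 1/2
lo, hi = extrema_on_grid(lambda t: t * (1 - t))
```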
Example 3Y. Consider the set C[X, Y] of all continuous mappings of a metric space X into a metric space Y, and let B[X, Y] be the set of all bounded mappings of X into Y. According to Corollary 3.85,

C[X, Y] ⊆ B[X, Y] whenever X is compact.

Thus the sup-metric d_∞ on B[X, Y] (see Example 3C) is inherited by C[X, Y] if X is compact, and hence (C[X, Y], d_∞) is a subspace of (B[X, Y], d_∞). In other words, if X is compact, then the sup-metric d_∞ is well-defined on C[X, Y] so that, in this case, (C[X, Y], d_∞) is a metric space. In particular, C[0, 1] ⊆ B[0, 1] because [0, 1] is compact in R by the Heine-Borel Theorem (Theorem 3.83), and so (C[0, 1], d_∞) is a subspace of (B[0, 1], d_∞), as we had anticipated in Examples 3D and 3G and used in Examples 3N and 3T. Moreover, since (1) the absolute-value function | · |: R → R is continuous, (2) (x − y) ∈ C[0, 1] for every x, y ∈ C[0, 1], (3) the interval [0, 1] is compact, and since (4) the composition of continuous functions is continuous, it follows by the Weierstrass Theorem (Theorem 3.86) that

d_∞(x, y) = max_{t∈[0,1]} |x(t) − y(t)|   for every   x, y ∈ C[0, 1].
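In practice the sup-metric on C[0, 1] is approximated by maximizing over a grid; the maximum is attained precisely because |x − y| is continuous on the compact [0, 1]. A sketch (ours; the grid size is an arbitrary assumption):

```python
def d_inf(x, y, n=100001):
    # grid approximation of the sup-metric on C[0, 1]:
    # d_inf(x, y) = max over t in [0, 1] of |x(t) - y(t)|
    return max(abs(x(i / (n - 1)) - y(i / (n - 1))) for i in range(n))

# |t - t^2| is maximal at t = 1/2, where it equals 1/4
d = d_inf(lambda t: t, lambda t: t * t)
```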
Now let BC[X, Y] be the set of all bounded continuous mappings of X into Y and equip it with the sup-metric d_∞ as in Example 3T. If X is compact, then C[X, Y] = BC[X, Y] and (C[X, Y], d_∞) is a metric space that coincides with the metric space (BC[X, Y], d_∞). Since (BC[X, Y], d_∞) is complete if and only if Y is complete (Example 3T), it follows that
(C[X, Y], d_∞) is complete if X is compact and Y is complete.

Example 3Z. Suppose (X, d_X) is a compact metric space, let (Y, d_Y) be any metric space, and consider the metric space (C[X, Y], d_∞) of Example 3Y. Let Φ be a subset of C[X, Y]. We shall investigate a necessary and sufficient condition for it to be totally bounded. To begin with let us pose the following definitions.

(i) A subset Φ of C[X, Y] is pointwise totally bounded if for each x in X the set Φ(x) = {f(x) ∈ Y: f ∈ Φ} is totally bounded in Y. Similarly, Φ is pointwise bounded if Φ(x) is a bounded subset of Y for each x ∈ X (i.e., if sup_{f,g∈Φ} d_Y(f(x), g(x)) < ∞ for each x ∈ X).

(ii) A subset Φ of C[X, Y] is equicontinuous at a point x₀ ∈ X if for each real number ε > 0 there exists a real number δ > 0 such that d_Y(f(x), f(x₀)) < ε whenever d_X(x, x₀) < δ, for every f ∈ Φ (note: δ depends on ε and may depend on x₀, but it does not depend on f; hence the term "equicontinuity"). Φ is equicontinuous on X if it is equicontinuous at every point of X. Remark: If the subset Φ is equicontinuous on X, and if for each ε > 0 there exists a δ > 0 (which depends only on ε) such that d_X(x, x') < δ implies
d_Y(f(x), f(x')) < ε for all x, x' ∈ X and every f ∈ Φ, then Φ is uniformly equicontinuous on X. Uniform equicontinuity coincides with equicontinuity on a compact space (cf. Theorem 3.67).

(a) Take ε > 0, x₀ ∈ X, and f ∈ Φ arbitrary. Let Φ_ε be an ε-net for Φ. Thus there exists g ∈ Φ_ε such that d_∞(f, g) < ε, and hence

d_Y(f(x₀), g(x₀)) ≤ sup_{x∈X} d_Y(f(x), g(x)) = d_∞(f, g) < ε.

If Φ_ε is a finite ε-net for Φ, then the set Φ_ε(x₀) = {g(x₀) ∈ Y: g ∈ Φ_ε} is a finite ε-net for Φ(x₀). Therefore, if Φ is totally bounded, then Φ(x₀) is totally bounded for an arbitrary x₀ ∈ X. Moreover,

d_Y(f(x), f(x₀)) ≤ d_Y(f(x), g(x)) + d_Y(g(x), g(x₀)) + d_Y(g(x₀), f(x₀)) < 2ε + d_Y(g(x), g(x₀))

for every x ∈ X and some g ∈ Φ_ε (namely, any g with d_∞(f, g) < ε). Since each g ∈ Φ_ε is continuous, it follows that for each g ∈ Φ_ε there exists a δ_g = δ_g(ε, x₀) > 0 such that d_X(x, x₀) < δ_g implies d_Y(g(x), g(x₀)) < ε. If Φ_ε is a finite ε-net for Φ, then set δ = δ(ε, x₀) = min{δ_g}_{g∈Φ_ε}, so that d_Y(g(x), g(x₀)) < ε for every g ∈ Φ_ε whenever d_X(x, x₀) < δ. Thus there exists a δ > 0 (that does not depend on f) such that

d_X(x, x₀) < δ   implies   d_Y(f(x), f(x₀)) < 3ε.

Therefore, if Φ is totally bounded, then Φ is equicontinuous at an arbitrary x₀ ∈ X.
Summing up: If Φ is totally bounded, then Φ is pointwise totally bounded and equicontinuous on X.

(b) Conversely, recall that X is separable (because it is compact; Proposition 3.72 and Corollary 3.81) and take a countable dense subset A of X. Consider the (infinite) A-valued sequence {a_i}_{i≥1} consisting of an enumeration of all points of A (followed by an arbitrary repetition of points of A if A is finite). Let f = {f_n}_{n≥1} be an arbitrary Φ-valued sequence, and suppose Φ is pointwise totally bounded so that Φ(x) is totally bounded in Y for every x ∈ X. Thus, according to Lemma 3.73, for each x ∈ X the Φ(x)-valued sequence {f_n(x)}_{n≥1} has a Cauchy subsequence. In particular, {f_n(a₁)}_{n≥1} has a Cauchy subsequence, say {f(1)_n(a₁)}_{n≥1}. Set f₁ = {f(1)_n}_{n≥1}, which is a Φ-valued subsequence of f such that {f(1)_n(a₁)}_{n≥1} is a Cauchy sequence in Y. Now consider for each x ∈ X the Φ(x)-valued sequence {f(1)_n(x)}_{n≥1}. Since Φ(x) is totally bounded for every x ∈ X, it follows by Lemma 3.73 that {f(1)_n(x)}_{n≥1} has a Cauchy subsequence for every x ∈ X. In particular, {f(1)_n(a₂)}_{n≥1} has a Cauchy subsequence, say {f(2)_n(a₂)}_{n≥1}. Set f₂ = {f(2)_n}_{n≥1}, which is a Φ-valued subsequence of f₁ such that {f(2)_n(a₂)}_{n≥1} is a Cauchy sequence in Y, and {f(2)_n(a₁)}_{n≥1} is a Cauchy sequence in Y as well (reason: f₂
is a subsequence of f₁, and hence {f(2)_n(a₁)}_{n≥1} is a subsequence of the Cauchy sequence {f(1)_n(a₁)}_{n≥1}). This leads to the inductive construction of a sequence {f_k}_{k≥1} of Φ-valued subsequences of f with the following properties.

Property (1). f_{k+1} = {f(k+1)_n}_{n≥1} is a subsequence of f_k = {f(k)_n}_{n≥1} for every k ≥ 1.

Property (2). For each pair of integers i ≥ 1 and k ≥ 1, {f(k)_n(a_i)}_{n≥1} is a Cauchy sequence in Y whenever i ≤ k.

As it happened in part (b) of the proof of Lemma 3.73, the diagonal procedure plays a central role in this proof too. Take an arbitrary integer i ≥ 1. By Property (1), the Y-valued sequence {f(n)_n(a_i)}_{n≥i} is a subsequence of {f(i)_n(a_i)}_{n≥1}, which in turn is a Cauchy sequence in Y by Property (2). Thus {f(n)_n(a_i)}_{n≥i} is a Cauchy sequence in Y, and so is {f(n)_n(a_i)}_{n≥1}. Therefore, the "diagonal" sequence {f(n)_n}_{n≥1} is a subsequence of the Φ-valued sequence f = {f_n}_{n≥1} (cf. Property (1)) such that
{f(n)_n(a)}_{n≥1} is a Cauchy sequence in Y for every a ∈ A.

Now take ε > 0 and x ∈ X arbitrary. Suppose Φ is equicontinuous on X. Thus there exists a δ_ε = δ_ε(x) > 0 such that

d_X(x', x) < δ_ε   implies   d_Y(f(n)_n(x'), f(n)_n(x)) < ε

for all n. Since A is dense in X, it follows that there exists a ∈ A such that d_X(a, x) < δ_ε, and hence

d_Y(f(n)_n(a), f(n)_n(x)) < ε for all n.

But {f(n)_n(a)}_{n≥1} is a Cauchy sequence, which means that there exists a positive integer n_ε = n_ε(a) such that

d_Y(f(m)_m(a), f(n)_n(a)) < ε whenever m, n ≥ n_ε.

Hence, by the triangle inequality,

d_Y(f(m)_m(x), f(n)_n(x)) ≤ d_Y(f(m)_m(x), f(m)_m(a)) + d_Y(f(m)_m(a), f(n)_n(a)) + d_Y(f(n)_n(a), f(n)_n(x)) < 3ε

whenever m, n ≥ n_ε. Therefore, as n_ε does not depend on x,

d_∞(f(m)_m, f(n)_n) = sup_{x∈X} d_Y(f(m)_m(x), f(n)_n(x)) ≤ 3ε

whenever m, n ≥ n_ε, which means that the subsequence {f(n)_n}_{n≥1} of f = {f_n}_{n≥1} is a Cauchy sequence in Φ. Thus Φ is totally bounded by Lemma 3.73. Summing up: If Φ is pointwise totally bounded and equicontinuous, then Φ is totally bounded.
(c) This is the Arzelà–Ascoli Theorem: If X is compact, then a subset of the metric space (C[X, Y], d_∞) is totally bounded if and only if it is pointwise totally bounded and equicontinuous. The corollary below follows at once (cf. Example 3Y and Corollary 3.84): If X is compact and Y is complete, then a subset of the metric space (C[X, Y], d_∞) is compact if and only if it is pointwise totally bounded, equicontinuous, and closed in (C[X, Y], d_∞). Recall that total boundedness coincides with plain boundedness in the real line (see proof of Theorem 3.83). Thus we get the following particular case: A subset Φ of the metric space (C[0, 1], d_∞) is compact if and only if it is pointwise bounded, equicontinuous, and closed in (C[0, 1], d_∞). Note: in this case pointwise boundedness means sup_{x∈Φ} |x(t)| < ∞ for each t ∈ [0, 1].
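The two hypotheses of this compactness criterion can be probed numerically. The sketch below is our own illustration (the two families, the function names, and the grid estimate are ours, not the text's): it crudely estimates the modulus of continuity w(δ) = sup over a family Φ of sup_{|t−s|≤δ} |x(t) − x(s)| for the powers t ↦ tⁿ, which are pointwise bounded but not equicontinuous near t = 1, and for the damped powers t ↦ tⁿ/n, which are equicontinuous since their derivatives are bounded by 1.

```python
# Illustrative numerical probe (ours, not the text's) of equicontinuity.
# For a family Phi in C[0,1], estimate
#   w(delta) = sup_{x in Phi} sup_{|t - s| <= delta} |x(t) - x(s)|;
# equicontinuity of Phi means w(delta) -> 0 as delta -> 0.

def modulus(family, delta, grid=400):
    """Crude grid estimate of the modulus of continuity of a family."""
    ts = [i / grid for i in range(grid + 1)]
    w = 0.0
    for f in family:
        for i, t in enumerate(ts):
            for s in ts[i + 1:]:
                if s - t > delta:
                    break
                w = max(w, abs(f(t) - f(s)))
    return w

powers = [lambda t, n=n: t ** n for n in range(1, 40)]        # t -> t^n
damped = [lambda t, n=n: (t ** n) / n for n in range(1, 40)]  # t -> t^n / n

# {t^n} is pointwise bounded but not equicontinuous near t = 1: its modulus
# stays large.  {t^n / n} has derivatives bounded by 1, so its modulus is
# at most delta (up to the grid resolution).
print(modulus(powers, 0.01), modulus(damped, 0.01))
```

On this grid the first value stays well above 0.2 (and approaches 1 as larger exponents are included), while the second is bounded by δ, in line with the dichotomy the theorem exploits.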
Suggested Reading

Bachman and Narici [1], Brown and Pearcy [2], Dieudonné [1], Dugundji [1], Goffman and Pedrick [1], Kantorovich and Akilov [1], Kelley [1], Kolmogorov and Fomin [1], Naylor and Sell [1], Royden [1], Schwartz [1], Simmons [1], Smart [1], Sutherland [1]
Problems

Problem 3.1. If (X, d) is a metric space, then

(a) |d(x, y) − d(y, z)| ≤ d(x, z)

for every x, y, z in X. (Hint: Use symmetry and the triangle inequality only.) Incidentally, the above inequality shows that the metric axioms (i) to (iv) in Definition 3.1 are not independent. For instance, the property d(x, y) ≥ 0 in axiom (i) follows from symmetry and the triangle inequality. That is, "d(x, y) ≥ 0 for every x, y ∈ X" in fact is a theorem derived from axioms (iii) and (iv). Moreover, show that

(b) |d(x, u) − d(v, y)| ≤ d(x, v) + d(u, y)

for every u, v, x, y in X. (Hint: d(x, u) ≤ d(x, v) + d(v, y) + d(y, u) and, similarly, d(v, y) ≤ d(v, x) + d(x, u) + d(u, y); use symmetry.)

Problem 3.2. Suppose d1: X×X → R and d2: X×X → R are metrics on a set X. Define the functions d: X×X → R and d′: X×X → R by

d(x, y) = d1(x, y) + d2(x, y) and d′(x, y) = max{d1(x, y), d2(x, y)}

for every x, y ∈ X. Show that both d and d′ are metrics on X.
Problem 3.3. Let p and q be real numbers. If p > 1 and if q = p/(p−1) > 1 is the unique solution to the equation 1/p + 1/q = 1 (or, equivalently, the unique solution to the equation p + q = pq), then p and q are said to be Hölder conjugates of each other. Prove the following inequalities.

(a) If p > 1 and q > 1 are Hölder conjugates, and if x = (ξ1, ..., ξn) and y = (υ1, ..., υn) are arbitrary n-tuples in Cⁿ, then

Σ_{i=1}^{n} |ξi υi| ≤ (Σ_{i=1}^{n} |ξi|^p)^{1/p} (Σ_{i=1}^{n} |υi|^q)^{1/q}.

(Hint: Show that αβ ≤ α^p/p + β^q/q for every pair of positive real numbers α and β whenever p and q are Hölder conjugates. Now set ‖x‖_p = (Σ_{i=1}^{n} |ξi|^p)^{1/p}.) Moreover,

Σ_{i=1}^{n} |ξi υi| ≤ max_{1≤i≤n} |ξi| Σ_{i=1}^{n} |υi|.

These are the Hölder inequalities for finite sums.
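As a quick numerical spot-check (a sketch of ours, not part of the problem set), the two finite-sum inequalities can be verified on random tuples:

```python
# Illustrative check (ours) of the Hölder inequalities for finite sums:
#   sum |xi_i v_i| <= ||x||_p ||y||_q   with 1/p + 1/q = 1,
#   sum |xi_i v_i| <= max |xi_i| * sum |v_i|.
import random

def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(1)
p = 3.0
q = p / (p - 1)                      # the Hölder conjugate of p
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]

lhs = sum(abs(a * b) for a, b in zip(x, y))
rhs = p_norm(x, p) * p_norm(y, q)
assert lhs <= rhs + 1e-12            # Hölder inequality

# Limiting case (p -> infinity, q -> 1):
assert lhs <= max(abs(a) for a in x) * sum(abs(b) for b in y) + 1e-12
```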
(b) Let x = (ξk) and y = (υk) be arbitrary complex-valued sequences (i.e., sequences in C^N). If p > 1 and q > 1 are Hölder conjugates, then

Σ_{k=1}^{∞} |ξk υk| ≤ (Σ_{k=1}^{∞} |ξk|^p)^{1/p} (Σ_{k=1}^{∞} |υk|^q)^{1/q}

whenever Σ_{k=1}^{∞} |ξk|^p < ∞ and Σ_{k=1}^{∞} |υk|^q < ∞; and

Σ_{k=1}^{∞} |ξk υk| ≤ sup_{k∈N} |ξk| Σ_{k=1}^{∞} |υk|

whenever sup_{k∈N} |ξk| < ∞ and Σ_{k=1}^{∞} |υk| < ∞. These are called the Hölder inequalities for infinite sums.
(c) Finally, let Ω be a nonempty subset of C, and let x and y be arbitrary complex-valued functions on Ω (i.e., functions in C^Ω). If p > 1 and q > 1 are Hölder conjugates, then

∫_Ω |x y| dω ≤ (∫_Ω |x|^p dω)^{1/p} (∫_Ω |y|^q dω)^{1/q}

for all integrable functions x, y in C^Ω such that ∫_Ω |x|^p dω < ∞ and ∫_Ω |y|^q dω < ∞. Moreover, if x, y ∈ C^Ω are integrable functions such that sup_{ω∈Ω} |x(ω)| < ∞ and ∫_Ω |y| dω < ∞, then

∫_Ω |x y| dω ≤ sup_{ω∈Ω} |x(ω)| ∫_Ω |y| dω.

These are the Hölder inequalities for integrals.

Problem 3.4. Take any real number p such that p ≥ 1. Use the preceding problem to verify the following results.
(a) Minkowski inequalities for finite sums. If x = (ξ1, ..., ξn) and y = (υ1, ..., υn) are arbitrary n-tuples in Cⁿ, then

(Σ_{i=1}^{n} |ξi + υi|^p)^{1/p} ≤ (Σ_{i=1}^{n} |ξi|^p)^{1/p} + (Σ_{i=1}^{n} |υi|^p)^{1/p}

and

max_{1≤i≤n} |ξi + υi| ≤ max_{1≤i≤n} |ξi| + max_{1≤i≤n} |υi|.

Hint: Show that |ξ + υ| ≤ |ξ| + |υ| for every pair {ξ, υ} of complex numbers, and also that (α + β)^p = (α + β)^{p−1} α + (α + β)^{p−1} β for every pair {α, β} of nonnegative real numbers.
(b) Minkowski inequalities for infinite sums. If x = (ξk) and y = (υk) are sequences in C^N (i.e., complex-valued sequences) such that Σ_{k=1}^{∞} |ξk|^p < ∞ and Σ_{k=1}^{∞} |υk|^p < ∞, then

(Σ_{k=1}^{∞} |ξk + υk|^p)^{1/p} ≤ (Σ_{k=1}^{∞} |ξk|^p)^{1/p} + (Σ_{k=1}^{∞} |υk|^p)^{1/p}.

Moreover,

sup_{k∈N} |ξk + υk| ≤ sup_{k∈N} |ξk| + sup_{k∈N} |υk|

whenever sup_{k∈N} |ξk| < ∞ and sup_{k∈N} |υk| < ∞.
(c) Minkowski inequalities for integrals. If x and y are integrable functions in C^Ω (i.e., integrable complex-valued functions on Ω) such that ∫_Ω |x|^p dω < ∞ and ∫_Ω |y|^p dω < ∞, where Ω is a nonempty subset of C, then

(∫_Ω |x + y|^p dω)^{1/p} ≤ (∫_Ω |x|^p dω)^{1/p} + (∫_Ω |y|^p dω)^{1/p}.

If sup_{ω∈Ω} |x(ω)| < ∞ and sup_{ω∈Ω} |y(ω)| < ∞, then

sup_{ω∈Ω} |x(ω) + y(ω)| ≤ sup_{ω∈Ω} |x(ω)| + sup_{ω∈Ω} |y(ω)|.
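The finite-sum Minkowski inequalities, i.e., the subadditivity of the p-norms, can likewise be spot-checked numerically (our sketch, not part of the problem set):

```python
# Illustrative check (ours) of the Minkowski inequality for finite sums:
# the p-norm is subadditive for every p >= 1.
import random

def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(100)]
y = [random.uniform(-1, 1) for _ in range(100)]
s = [a + b for a, b in zip(x, y)]

for p in (1.0, 1.5, 2.0, 7.0):
    assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12

# The sup version: max |xi_i + v_i| <= max |xi_i| + max |v_i|.
assert max(abs(t) for t in s) <= \
       max(abs(a) for a in x) + max(abs(b) for b in y)
```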
Problem 3.5. Prove the Jensen inequality: If p and q are real numbers such that 0 < p ≤ q, then

(Σ_{k=1}^{∞} |ξk|^q)^{1/q} ≤ (Σ_{k=1}^{∞} |ξk|^p)^{1/p}

for every x = {ξk} in C^N such that Σ_{k=1}^{∞} |ξk|^p < ∞ (i.e., for every p-summable complex-valued sequence). Hint: Show that Σ_{k=1}^{∞} a_k^r ≤ (Σ_{k=1}^{∞} a_k)^r for every r ≥ 1 whenever the sequence of nonnegative real numbers {a_k} is such that Σ_{k=1}^{∞} a_k < ∞.

Now let ℓ+^p and ℓ+^∞ be the sets of all p-summable and bounded sequences from F^N, respectively, where either F = R or F = C (i.e., the sets of all p-summable and bounded scalar-valued sequences, respectively; see Example 3B). Verify that

1 ≤ p < q implies ℓ+^p ⊂ ℓ+^q ⊂ ℓ+^∞

(where q is any real number greater than p), and show that these are in fact proper inclusions.
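Truncations of the harmonic sequence ξ_k = 1/k make both the Jensen inequality and the properness of the first inclusion concrete; the check below is our own sketch, not part of the problem:

```python
# Numerical sketch (ours) of the Jensen inequality ||x||_q <= ||x||_p for
# p <= q, and of why {1/k} lies in l_+^2 but not in l_+^1.
def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

x = [1.0 / k for k in range(1, 10001)]        # truncation of {1/k}

assert p_norm(x, 3.0) <= p_norm(x, 2.0)       # q = 3 >= p = 2
assert max(abs(t) for t in x) <= p_norm(x, 3.0)   # the sup-norm is smallest

sum_p1 = sum(x)                   # partial sums of 1/k grow like log n
sum_p2 = sum(t * t for t in x)    # partial sums of 1/k^2 stay below pi^2/6
assert sum_p2 < 1.645 and sum_p1 > 9.0
```

The partial sums of Σ 1/k exceed 9 already at ten thousand terms and keep growing, while Σ 1/k² remains bounded, so {1/k} witnesses ℓ+^2 ⊄ ℓ+^1.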
Problem 3.6. If A and B are nonempty and bounded subsets of a metric space (X, d), then

sup_{x∈A, y∈B} d(x, y) ≤ diam(A) + diam(B) + d(A, B).

(Hint: d(x, y) ≤ d(x, a) + d(a, b) + d(b, y) for every x, a ∈ A and every y, b ∈ B.) Now conclude that A ∪ B is bounded. Show that the union of a finite collection of bounded subsets of X is a bounded subset of X.
Problem 3.7. Let (X, d) be a metric space and define d_i: X×X → R for i = 1, 2 as follows.

d1(x, y) = d(x, y)/(1 + d(x, y)) and d2(x, y) = min{1, d(x, y)}

for every x, y ∈ X. Show that d1 and d2 are metrics on X. Moreover, verify that every set in the metric spaces (X, d1) and (X, d2) is bounded.

Problem 3.8. Let p, q and r be positive numbers such that

1/p + 1/q = 1/r.
(a) Prove the following extension of the Hölder inequality for integrals.

(∫_Ω |x y|^r dω)^{1/r} ≤ (∫_Ω |x|^p dω)^{1/p} (∫_Ω |y|^q dω)^{1/q}
for all integrable functions x, y in C^Ω such that ∫_Ω |x|^p dω < ∞ and ∫_Ω |y|^q dω < ∞. (Note: Similar inequalities hold for finite and infinite sums.)

Hint: Verify that p/r and q/r are Hölder conjugates.

(b) Let S be a bounded interval of the real line and let d_p be the usual metric on R^p(S) (see Example 3E). Show that

1 ≤ r ≤ p implies R^p(S) ⊂ R^r(S).

Hint: Verify that d_r(x, y) ≤ diam(S)^{1/r − 1/p} d_p(x, y) for every x, y in R^p(S) whenever 1 ≤ r ≤ p.

Problem 3.9. Let {(X_i, d_i)}_{i=1}^{n} be a finite collection of metric spaces and set Z = Π_{i=1}^{n} X_i, the Cartesian product of their underlying sets. Consider the functions d_p: Z×Z → R (for each real number p ≥ 1) and d_∞: Z×Z → R defined as follows.

d_p(x, y) = (Σ_{i=1}^{n} d_i(x_i, y_i)^p)^{1/p} and d_∞(x, y) = max_{1≤i≤n} d_i(x_i, y_i)

for every x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in Z = Π_{i=1}^{n} X_i. Show that these are metrics on Z.
Remark: Let d: Z×Z → R be any of the above metrics. The Cartesian product Π_{i=1}^{n} X_i equipped with d is referred to as a product space of the metric spaces {(X_i, d_i)}_{i=1}^{n}, and the metric d as a product metric. Sometimes, when the metric d has been previously specified, it is convenient to denote the product space (Π_{i=1}^{n} X_i, d) by Π_{i=1}^{n} (X_i, d_i). This notation is particularly suitable for a product space of metric spaces with the same underlying set but different metrics. For instance, (X, d1)×(X, d2) and (X, d2)×(X, d1) are different metric spaces with the same underlying set X×X.
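A concrete instance of these product metrics (our own sketch, taking each factor (X_i, d_i) to be the real line with the usual metric) is easy to spot-check:

```python
# Sketch (ours) of the product metrics of Problem 3.9 on Z = X1 x ... x Xn,
# with each factor the real line and d_i(a, b) = |a - b|.
def d_p(x, y, p):
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

x, y, z = (1.0, 2.0, -3.0), (0.5, 4.0, 1.0), (2.0, 2.0, 2.0)

# Spot-check a metric axiom, e.g. the triangle inequality through z:
for p in (1.0, 2.0, 5.0):
    assert d_p(x, y, p) <= d_p(x, z, p) + d_p(z, y, p) + 1e-12
assert d_inf(x, y) <= d_inf(x, z) + d_inf(z, y)

# The comparisons of Problem 3.33 also show up: d_inf <= d_2 <= d_1 <= n*d_inf.
assert d_inf(x, y) <= d_p(x, y, 2.0) <= d_p(x, y, 1.0) <= 3 * d_inf(x, y)
```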
Problem 3.10. Consider the real line R with its usual metric. Let {a_n} be a real-valued sequence (indexed by N or N0) and recall the following definitions.

(i) {a_n} is bounded if sup_n |a_n| < ∞.
(ii) {a_n} is increasing if a_n ≤ a_{n+1} for every index n.
(iii) {a_n} is decreasing if a_{n+1} ≤ a_n for every index n.
(iv) {a_n} is monotone if it is either increasing or decreasing.

Prove the next three statements.

(a) If {a_n} converges, then it is bounded.
(b) If {a_n} is monotone and bounded, then it converges.

Therefore, for a real-valued monotone sequence, boundedness becomes equivalent to convergence. Now suppose {a_n} is a nonnegative sequence (i.e., 0 ≤ a_n for every index n) and let {β_n} be another real-valued sequence.

(c) If 0 ≤ a_n ≤ β_n for every n and β_n → 0, then a_n → 0.

Problem 3.11. Consider again the real line R with its usual metric and let {α_i} be a sequence of nonnegative real numbers. For each integer n ≥ 1 set

a_n = Σ_{i=1}^{n} α_i.

{a_n} is called the sequence of partial sums of {α_i}, which clearly is nonnegative and increasing. Let us introduce the following usual notation and terminology.

(i) If {a_n} converges, then we say that Σ_{i=1}^{∞} α_i converges.
(ii) If {a_n} is bounded, then we write Σ_{i=1}^{∞} α_i < ∞.
(iii) For each index n write Σ_{i=n+1}^{∞} α_i = sup_m a_m − a_n.

The purpose of this problem is to show that the assertions

Σ_{i=1}^{∞} α_i converges, Σ_{i=1}^{∞} α_i < ∞, and Σ_{i=n+1}^{∞} α_i → 0 as n → ∞

are pairwise equivalent. Prove the following propositions.

(a) If sup_m a_m < ∞, then a_n → sup_m a_m as n → ∞.

(b) If {a_n} converges, then lim_n a_n = sup_m a_m.

Obviously, the above convergences are all in R with its usual metric.
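For a concrete instance (our sketch, not the text's), take α_i = 1/i²: the partial sums a_n increase to sup_m a_m, and the tails sup_m a_m − a_n decrease to 0, exactly as the equivalences assert.

```python
# Numerical illustration (ours) of Problem 3.11 with alpha_i = 1/i^2:
# partial sums a_n increase to sup_m a_m, and the tails
# sum_{i > n} alpha_i = sup_m a_m - a_n decrease to 0.
alphas = [1.0 / (i * i) for i in range(1, 100001)]

partials = []
total = 0.0
for a in alphas:
    total += a
    partials.append(total)

sup_a = partials[-1]                      # finite-horizon proxy for sup_m a_m
tails = [sup_a - partials[n] for n in (10, 100, 1000)]

assert all(p1 <= p2 for p1, p2 in zip(partials, partials[1:]))  # increasing
assert all(t1 >= t2 for t1, t2 in zip(tails, tails[1:]))        # tails shrink
assert tails[-1] < 2e-3                                         # toward 0
```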
Problem 3.12. Let {α_n} be a sequence of nonnegative real numbers and equip the real line with its usual metric. Prove:

(a) If Σ_{n=1}^{∞} α_n < ∞, then lim_n α_n = 0.
(b) If {α_n} is decreasing and Σ_{n=1}^{∞} α_n < ∞, then lim_n n α_n = 0.

Hint: Show that if lim_n α_{2n} = lim_n α_{2n+1} = α, then lim_n α_n = α.
Now exhibit a pair of nonnegative sequences {β_n} and {γ_n} with the following properties.

(c) Σ_{n=1}^{∞} β_n < ∞ and sup_n n β_n = ∞.
(d) {γ_n} is decreasing, Σ_{n=1}^{∞} γ_n² < ∞, and sup_n n γ_n = ∞.
Problem 3.13. Let {β_n} be a real-valued bounded sequence. Since the real line is a boundedly complete lattice in the natural ordering ≤ of R (Example 1C), it follows that both sup_{n≤k} β_k and inf_{n≤k} β_k exist in R for each n. Thus set

α_n = inf_{n≤k} β_k and γ_n = sup_{n≤k} β_k

for each n, consider the real-valued sequences {α_n} and {γ_n}, and equip the real line R with its usual metric.

(a) Show that both sequences {α_n} and {γ_n} converge in R.

Hint: These are bounded monotone sequences. As they converge in R, set

α = lim_n α_n = lim_n inf_{n≤k} β_k and γ = lim_n γ_n = lim_n sup_{n≤k} β_k.

The limits α ∈ R and γ ∈ R are usually denoted by

α = lim inf_n β_n and γ = lim sup_n β_n,

and called limit inferior (or lower limit) and limit superior (or upper limit) of the sequence {β_n}, respectively (see Problem 1.19).

(b) Show that lim inf_n β_n ≤ lim sup_n β_n, and prove the following propositions.

(c) If {β_n} converges in R, then lim inf_n β_n = lim sup_n β_n = lim_n β_n.

Hint: Show that |α_n − β| ≤ sup_{n≤k} |β_k − β| for each n and every β ∈ R. Similarly, show that |γ_n − β| ≤ sup_{n≤k} |β_k − β| for each n and every β ∈ R.

(d) If lim inf_n β_n = lim sup_n β_n = β, then {β_n} converges in R to β. Hint: Show that α_n ≤ β_n ≤ γ_n for each n and then apply the result of Problem 3.10(c).
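A finite-horizon picture of these definitions (our sketch, with our own choice of sequence) computes the suffix infima and suprema directly for β_n = (−1)ⁿ(1 + 1/n), whose limit inferior is −1 and limit superior is +1:

```python
# Finite-horizon illustration (ours) of Problem 3.13 for the bounded
# sequence beta_n = (-1)^n (1 + 1/n): alpha_n = inf_{n<=k} beta_k increases
# toward liminf = -1, and gamma_n = sup_{n<=k} beta_k decreases toward
# limsup = +1.
N = 10000
beta = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]

alpha, gamma = [0.0] * N, [0.0] * N     # suffix inf/sup, right to left
alpha[-1] = gamma[-1] = beta[-1]
for i in range(N - 2, -1, -1):
    alpha[i] = min(beta[i], alpha[i + 1])
    gamma[i] = max(beta[i], gamma[i + 1])

assert all(a1 <= a2 for a1, a2 in zip(alpha, alpha[1:]))   # increasing
assert all(g1 >= g2 for g1, g2 in zip(gamma, gamma[1:]))   # decreasing
assert abs(alpha[N // 2] + 1) < 1e-2 and abs(gamma[N // 2] - 1) < 1e-2
```

Here the sequence itself diverges, while {α_n} and {γ_n} are monotone and converge to distinct limits, matching parts (a) and (b).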
Remark: If the real-valued sequence {β_n} is not bounded above, then {β_k}_{n≤k} is not bounded above for every index n. In this case we write sup_{n≤k} β_k = ∞. Also consider the subsets B_n = {β_k}_{n≤k} of R, for which

lim_n B_n = lim inf_n B_n = lim sup_n B_n = ∩_n B_n,

which always exists as a set in the power set P(R) (e.g., if β_n = n for each n ∈ N, then ∩_{n∈N} B_n = ∅). Carefully note that the present problem is concerned with the natural ordering ≤ of R, and not with the inclusion ordering ⊆ of P(R).

Problem 3.14. Let {x_n} and {y_n} be two convergent sequences (both indexed by N or N0) in a metric space (X, d). Set x = lim x_n and y = lim y_n in X. Prove the following propositions.

(a) d(x_n, u) → d(x, u) and d(v, y_n) → d(v, y) for each u, v ∈ X.

(b) d(x_n, y_n) → d(x, y). (Hint: Problems 3.1(b) and 3.10(c).)

(c) If there exists α > 0 and an integer n0 such that d(x_n, y_n) ≤ α for every n ≥ n0, then d(x, y) ≤ α. Hint: Use the triangle inequality to show that d(x, y) ≤ 2ε + α for every ε > 0, so that d(x, y) ≤ inf_{ε>0}(2ε + α).

Problem 3.15. Two (similarly indexed) sequences {x_n} and {y_n} in a metric space (X, d) are equiconvergent if lim d(x_n, y_n) = 0. Prove the following propositions.

(a) {x_n} converges to x in (X, d) if and only if it is equiconvergent with the constant sequence {x_n′} where x_n′ = x for all n.
(b) Two sequences that converge to the same limit are equiconvergent.
(c) If one of two equiconvergent sequences is convergent, then so is the other, and both have the same limit.
Problem 3.16. Obviously, uniform continuity implies continuity.
(a) Show that the function f: R → R defined by f(x) = x² for every x ∈ R is continuous but not uniformly continuous.

(b) Prove that every Lipschitzian mapping is uniformly continuous.

(c) Show that the function g: [0, ∞) → [0, ∞) defined by g(x) = x^{1/2} for every x ∈ [0, ∞) is uniformly continuous but not Lipschitzian.
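The phenomena behind (a) and (c) are easy to observe numerically (our sketch, not part of the problem set): for f(x) = x² the increment over a fixed gap h grows without bound as x grows, so no single δ serves everywhere; for g(x) = x^{1/2} the difference quotients blow up near 0, so no Lipschitz constant exists, yet |g(x) − g(y)| ≤ |x − y|^{1/2} still forces uniform continuity.

```python
# Numerical sketch (ours) contrasting x^2 and sqrt(x).
import math

h = 1e-3
incs = [abs((x + h) ** 2 - x ** 2) for x in (1.0, 1e3, 1e6)]
assert incs[0] < incs[1] < incs[2] and incs[2] > 1.0   # increments blow up,
# so x^2 is not uniformly continuous on the whole line.

ratios = [math.sqrt(x) / x for x in (1e-2, 1e-6, 1e-12)]
assert ratios[0] < ratios[1] < ratios[2]   # difference quotients at 0 blow
# up, so sqrt is not Lipschitzian on [0, infinity).

# Yet |sqrt(x) - sqrt(y)| <= sqrt(|x - y|) holds, giving uniform continuity:
for x, y in ((0.0, 1e-8), (10.0, 10.0 + 1e-8), (2.0, 3.0)):
    assert abs(math.sqrt(x) - math.sqrt(y)) <= math.sqrt(abs(x - y)) + 1e-15
```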
Problem 3.17. Consider the setup of Example 3H.

(a) Show that the mapping F: (X, d2) → (Y, d2) of Example 3H(a) is nowhere continuous.

Hint: Take x0 ∈ X and δ > 0 arbitrary. Let α be a small enough positive real number, chosen in terms of δ, and consider the function e_α: R → R that takes the value α on a suitable interval and the value 0 otherwise. Verify that e_α lies in X, so that x0 + e_α ∈ X. Set x_δ = x0 + e_α in X, y_δ = F(x_δ) and y0 = F(x0) in Y. Compute y_δ(t) − y0(t) for t ∈ R and conclude that d2(x_δ, x0) < δ and d2(F(x_δ), F(x0)) ≥ 1.

(b) Show that the mapping F: (R²(S), d2) → (R²(S), d2) of Example 3H(b) is Lipschitzian (and hence uniformly continuous; hint: Hölder inequality).
Problem 3.18. Let C′[0, 1] be the subset of C[0, 1] consisting of all differentiable functions from C[0, 1] whose derivatives lie in C[0, 1]. Consider the subspace (C′[0, 1], d_∞) of the metric space (C[0, 1], d_∞); see Example 3D. Let D: C′[0, 1] → C[0, 1] be the mapping that assigns to each x ∈ C′[0, 1] its derivative in C[0, 1].

(a) Show that D: (C′[0, 1], d_∞) → (C[0, 1], d_∞) is nowhere continuous.

Now consider the function d: C′[0, 1] × C′[0, 1] → R defined by

d(x, y) = d_∞(x, y) + d_∞(D(x), D(y))

for every x, y ∈ C′[0, 1], which is a metric on C′[0, 1] (cf. Problem 3.2).

(b) Show that D: (C′[0, 1], d) → (C[0, 1], d_∞) is a contraction (thus Lipschitzian, and hence uniformly continuous).
Problem 3.19. Consider the real line R equipped with its usual metric. Set B[0, ∞) = B[[0, ∞), R] and R¹[0, ∞) = R¹([0, ∞)) as in Examples 3C and 3E, respectively. Let d_∞ be the sup-metric on B[0, ∞) and let d1 be the usual metric on R¹[0, ∞). Consider the set of all real-valued functions x on [0, ∞) that are 1-integrable (i.e., x is Riemann integrable on [0, ∞) and ∫_0^∞ |x(s)| ds < ∞) and bounded (i.e., sup_{s≥0} |x(s)| < ∞). Allowing a slight abuse of notation we write

X[0, ∞) = B[0, ∞) ∩ R¹[0, ∞)

and set, for any real number α > 0,

X[0, α] = {x ∈ X[0, ∞): x(s) = 0 for all s > α}.

For each x ∈ X[0, ∞) consider the function y: [0, ∞) → R defined by the formula

y(t) = ∫_0^t x(s) ds

for every t ≥ 0. Let BC[0, ∞) ⊂ B[0, ∞) denote the set of all real-valued bounded and continuous functions on [0, ∞).

(a) Show that y ∈ BC[0, ∞).

Now consider the mapping F: X[0, ∞) → BC[0, ∞) that assigns to each function x in X[0, ∞) the function y = F(x) in BC[0, ∞) defined by the above formula. For simplicity we shall use the same notation F for the restriction F|_{X[0,α]} of F to X[0, α]. By using the appropriate definitions, show that

(b) F: (X[0, ∞), d1) → (BC[0, ∞), d_∞) is a contraction,
(c) F: (X[0, ∞), d_∞) → (BC[0, ∞), d_∞) is nowhere continuous,
(d) F: (X[0, α], d_∞) → (BC[0, ∞), d_∞) is Lipschitzian.
Problem 3.20. Let I be the identity map of C[0, 1] onto itself and consider the metrics d_∞ and d_p (for any p ≥ 1) on C[0, 1] as in Example 3D. Verify that

(a) I: (C[0, 1], d_∞) → (C[0, 1], d_p) is a contraction,
(b) I: (C[0, 1], d_p) → (C[0, 1], d_∞) is nowhere continuous.

Hint: Consider the C[0, 1]-valued sequence {x_n} of Example 3F and apply Theorem 3.7 to show that I: (C[0, 1], d_p) → (C[0, 1], d_∞) is not continuous at 0 ∈ C[0, 1].

Problem 3.21. Recall that ℓ+^p ⊂ ℓ+^∞ for every p ≥ 1 (Problem 3.5) and consider the subspace (ℓ+^p, d_∞) of the metric space (ℓ+^∞, d_∞). Let I be the identity map of ℓ+^p onto itself. Show that, for each p ≥ 1,

(a) I: (ℓ+^p, d_p) → (ℓ+^p, d_∞) is a contraction,
(b) I: (ℓ+^p, d_∞) → (ℓ+^p, d_p) is nowhere continuous.
Hint: Use Theorem 3.7 to show that I: (ℓ+^p, d_∞) → (ℓ+^p, d_p) is not continuous at 0 ∈ ℓ+^p.

Problem 3.22. Let F denote either the real field R or the complex field C and let F^N be the collection of all scalar-valued sequences indexed by N. Consider the metric space (ℓ+^p, d_p) of Example 3B for an arbitrary p ≥ 1 and, for every a = {α_k} ∈ F^N, consider the following subset of ℓ+^p:

X_a^p = {x = {ξ_k} ∈ ℓ+^p: Σ_{k=1}^{∞} |α_k ξ_k|^p < ∞}.

Let D_a: (X_a^p, d_p) → (ℓ+^p, d_p) be the diagonal mapping that assigns to each x = {ξ_k} in X_a^p the sequence {α_k ξ_k} in ℓ+^p; that is,

D_a(x) = {α_k ξ_k}

for every x = {ξ_k} ∈ X_a^p.

(a) Show that X_a^p = ℓ+^p and D_a is Lipschitzian whenever a ∈ ℓ+^∞ (i.e., whenever sup_k |α_k| < ∞).

(b) If α_k = k for each k ∈ N, then D_a is not continuous. Prove this statement by using Theorem 3.7.
Problem 3.23. Take an arbitrary real number α ≥ 0 and set

β = β(α) = 1 − α if α ≤ 1, and β = β(α) = (α − 1)/α if α ≥ 1.

Consider the real-valued sequence {ξ_n} recursively defined as follows:

ξ_0 = 0 and ξ_{n+1} = (ξ_n² + β)/2

for every n ≥ 0. Verify that

(a) {ξ_n} is an increasing sequence,
(b) 0 ≤ ξ_n ≤ β for all n ≥ 0,
(c) {ξ_n} converges in R, and
(d) lim_n ξ_n = 1 − (1 − β)^{1/2}.

Hint: According to Problem 3.16(a) the function f: R → R such that f(x) = x² for every x ∈ R is continuous. Use Theorem 3.7.
Thus conclude the square root algorithm: For every nonnegative real number α,

α^{1/2} = 1 − lim_n ξ_n if α ≤ 1, and α^{1/2} = (1 − lim_n ξ_n)^{−1} if α ≥ 1,

where {ξ_n} is recursively defined as above.
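The algorithm runs exactly as stated; the sketch below (our code, with our own function name) iterates ξ_{n+1} = (ξ_n² + β)/2 from ξ_0 = 0 and then unwinds β = β(α). Since ξ_{n+1} − L = (ξ_n + L)(ξ_n − L)/2 at the limit L = 1 − (1 − β)^{1/2}, convergence is linear with rate roughly L, so it slows down as α approaches 0 or grows large.

```python
def sqrt_via_xi(alpha, n_iter=200):
    """Square root via the recursion xi_{n+1} = (xi_n**2 + beta)/2, xi_0 = 0."""
    if alpha < 0:
        raise ValueError("alpha must be nonnegative")
    # beta(alpha) as defined in Problem 3.23; note 0 <= beta <= 1.
    beta = 1 - alpha if alpha <= 1 else (alpha - 1) / alpha
    xi = 0.0
    for _ in range(n_iter):          # increasing and bounded above by beta
        xi = 0.5 * (xi * xi + beta)
    # lim xi_n = 1 - (1 - beta)**0.5, hence:
    return 1 - xi if alpha <= 1 else 1 / (1 - xi)

print(sqrt_via_xi(2.0))   # close to 1.41421356...
```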
Problem 3.24. Take an arbitrary C[0, 1]-valued sequence {x_n} that converges in (C[0, 1], d_∞) to x ∈ C[0, 1]. Take an arbitrary [0, 1]-valued sequence {t_n} that converges in R to t ∈ [0, 1]. Show that

x_n(t_n) → x(t) in R.

Hint: Recall that α_i + β_i → α + β whenever α_i → α and β_i → β in R. Use Problem 3.10(c) and Theorem 3.7.
Problem 3.25. Consider the standard notion of curve length in the plane and let D[0, 1] denote the subset of C[0, 1] consisting of all real-valued functions on [0, 1] whose graph has a finite length. Let φ: D[0, 1] → R be the mapping that assigns to each function x ∈ D[0, 1] the length of its graph (e.g., if x ∈ C[0, 1] is given by x(t) = (t − t²)^{1/2} for every t ∈ [0, 1], then x ∈ D[0, 1] and φ(x) = π/2). Now consider the D[0, 1]-valued sequence {x_n} defined as follows. For each n ≥ 1 the graph of x_n forms n equilateral triangles of the same height when intercepted with the horizontal axis.

Use this sequence to show that the mapping φ is not continuous when D[0, 1] is equipped with the sup-metric d_∞ and R is equipped with its usual metric. (Hint: Theorem 3.7.)

Problem 3.26. Let C[0, ∞) denote the set of all real-valued continuous functions on [0, ∞), set

XC[0, ∞) = X[0, ∞) ∩ C[0, ∞) = B[0, ∞) ∩ R¹[0, ∞) ∩ C[0, ∞)

(with X[0, ∞) defined as in Problem 3.19), and consider the mapping φ: XC[0, ∞) → R given by

φ(x) = ∫_0^∞ |x(t)| dt
for every x ∈ XC[0, ∞). Let {x_n} be the XC[0, ∞)-valued sequence such that, for each n ≥ 1,

x_n(t) = (1/n²) · n for t ∈ [0, n], x_n(t) = (1/n²)(2n − t) for t ∈ [n, 2n], and x_n(t) = 0 otherwise.

Use this sequence to show that φ: XC[0, ∞) → R is not continuous when XC[0, ∞) is equipped with the sup-metric d_∞ from B[0, ∞) and R is equipped with its usual metric. (Hint: Theorem 3.7; compare with Example 3I.)
Problem 3.27. Let (X, d) be a metric space. A real-valued function φ: X → R is upper semicontinuous at a point x0 of X if for each real number β such that φ(x0) < β there exists a positive number δ such that

d(x, x0) < δ implies φ(x) < β.

It is upper semicontinuous on X if it is upper semicontinuous at every point of X. Similarly, a real-valued function ψ: X → R is lower semicontinuous at a point x0 of X if for each real number α such that α < ψ(x0) there exists a positive number δ such that d(x, x0) < δ implies α < ψ(x). It is lower semicontinuous on X if it is lower semicontinuous at every point of X. Now equip the real line with its usual metric and prove the following proposition (which can be thought of as a real-valued semicontinuous version of Theorem 3.7).

(a) φ: X → R is upper semicontinuous at x0 if and only if

lim sup_n φ(x_n) ≤ φ(x0)

for every X-valued sequence {x_n} that converges in (X, d) to x0. Similarly, ψ: X → R is lower semicontinuous at x0 if and only if

lim inf_n ψ(x_n) ≥ ψ(x0)

for every X-valued sequence {x_n} that converges in (X, d) to x0.

Hint: Take ε > 0 and x0 arbitrary and set β = φ(x0) + ε. Suppose φ is upper semicontinuous at x0 and show that there exists δ > 0 such that

d(x, x0) < δ implies φ(x) < ε + φ(x0).

If x_n → x0 in (X, d), then show that there exists a positive integer n_δ such that

n ≥ n_δ implies φ(x_n) < ε + φ(x0).
Now conclude: lim sup_n φ(x_n) ≤ φ(x0). Conversely, if φ: X → R is not upper semicontinuous at x0, then verify that there exists β > φ(x0) such that for every δ > 0 there exists x_δ ∈ X such that

d(x_δ, x0) < δ and φ(x_δ) ≥ β.

Set δ_n = 1/n and x_n = x_{δ_n} for each integer n ≥ 1. Thus x_n → x0 in (X, d) while φ(x_n) ≥ β for all n, and hence lim sup_n φ(x_n) > φ(x0).

(b) Show that φ: X → R is continuous if and only if it is both upper and lower semicontinuous on X. Hint: Problem 3.13 and Theorem 3.7.
Problem 3.28. Show that the composition of uniformly continuous mappings is again uniformly continuous.
Problem 3.29. Let X, Y and Z be metric spaces. If F: X → Y is continuous at x0, and if G: Y → Z is continuous at F(x0), then the composition G∘F: X → Z is continuous at x0. (Hint: Lemma 3.11.)

Problem 3.30. The restriction of a continuous mapping to a subspace is continuous. That is, if F: X → Y is a mapping of a metric space X into a metric space Y, and if A is a subspace of X, then the restriction F|_A: A → Y of F to A is continuous. (Hint: F|_A = F∘J where J: A → X is the inclusion map; use Corollary 3.13.)
Problem 3.31. Let X be an arbitrary set. The largest (or strongest) topology on X is the discrete topology P(X) (where every subset of X is open), and the smallest (or weakest) topology on X is the indiscrete topology (where the only open subsets of X are the empty set and the whole space). The collection of all topologies on X is partially ordered by the inclusion ordering ⊆. Recall that T1 ⊆ T2 (which means T1 is weaker than T2 or, equivalently, T2 is stronger than T1) if every element of T1 also is an element of T2. Prove the following propositions.

(a) If U ⊆ P(X) is an arbitrary collection of subsets of X, then there exists a smallest (weakest) topology T on X such that U ⊆ T.

Hint: The power set P(X) is a topology on X. The intersection of a nonempty collection of topologies on X is again a topology on X.

(b) Show that the collection of all topologies on X is a complete lattice in the inclusion ordering.
Problem 3.32. Let d1 and d2 be two metrics on a set X. Show that d1 and d2 are equivalent if and only if for each x0 ∈ X and each ε > 0 the following two conditions hold.

(i) There exists δ1 > 0 such that d1(x, x0) < δ1 implies d2(x, x0) < ε.

(ii) There exists δ2 > 0 such that d2(x, x0) < δ2 implies d1(x, x0) < ε.

Hint: d1 and d2 are equivalent if and only if the identity I: (X, d1) → (X, d2) is a homeomorphism.
Problem 3.33. Let {(X_i, d_i)}_{i=1}^{n} be a finite collection of metric spaces and let Z = Π_{i=1}^{n} X_i be the Cartesian product of their underlying sets. Consider the metrics d_p for each p ≥ 1 and d_∞ on Z that were defined in Problem 3.9. Show that, for an arbitrary p ≥ 1,

d_∞(x, y) ≤ d_p(x, y) ≤ d1(x, y) ≤ n d_∞(x, y)

for every x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in Z = Π_{i=1}^{n} X_i. Hint: d_p(x, y) ≤ d1(x, y) by the Jensen inequality (Problem 3.5). Thus conclude that the metrics d_∞ and d_p for every p ≥ 1 are all uniformly equivalent on Z, so that the product spaces (Π_{i=1}^{n} X_i, d_∞) and (Π_{i=1}^{n} X_i, d_p) for every p ≥ 1 are all uniformly homeomorphic.

Problem 3.34. Let (X, d) be a metric space and equip the real line R with its usual metric. Take u, v ∈ X arbitrary and note that both functions

d(·, u): (X, d) → R and d(v, ·): (X, d) → R

preserve convergence (Problem 3.14(a)), and hence they are continuous by Corollary 3.8. Now consider the Cartesian product X×X equipped with the metric d1 (see Problem 3.9: d1(z, w) = d(x, u) + d(y, v) for every z = (x, y) and w = (u, v) in X×X).

(a) Show that d(·, ·): (X×X, d1) → R is continuous.

Hint: If (x_n, y_n) → (x, y) in (X×X, d1), then x_n → x and y_n → y in (X, d). Verify. Now use Problem 3.14(b) and Corollary 3.8. Next let d′ denote any of the metrics d_p (for an arbitrary p ≥ 1) or d_∞ on X×X as in Problem 3.9.

(b) Show that d(·, ·): (X×X, d′) → R is continuous.

Hint: See Problem 3.33 and Corollary 3.19. They ensure that the identity map I: (X×X, d′) → (X×X, d1) is a (uniform) homeomorphism. Now use item (a) and Corollary 3.13.
Problem 3.35. If d and d′ are equivalent metrics on a set X, then an X-valued sequence {x_n} converges in (X, d) to x ∈ X if and only if it converges in (X, d′) to the same limit x (Corollary 3.19). If d and d′ are not equivalent, then it may happen that an X-valued sequence converges to x ∈ X in (X, d) but does not converge (to any point) in (X, d′) (e.g., see Examples 3F and 3G). Can a sequence converge in (X, d) to a point x ∈ X and also converge in (X, d′) to a different point x′ ∈ X? Yes, it can. We shall equip a set X with two metrics d and d′, and exhibit an X-valued sequence {x_n} and a pair of distinct points x and x′ in X such that

x_n → x in (X, d) and x_n → x′ in (X, d′).

Consider the set R² and let d denote the Euclidean metric on it (or any of the metrics on R² introduced in Example 3A, which are uniformly equivalent according to Problem 3.33). Set v = (0, 1) ∈ R² and let V be the vertical axis joining v to the point 0 = (0, 0) ∈ R² (in the jargon of Chapter 2, set V = span{v}). Now consider a function d′: R²×R² → R defined as follows. If x and y are both in R²\V or both in V, then set d′(x, y) = d(x, y). If one of them is in V but not the other, then d′(x, y) = d(x + v, y) if x ∈ V and y ∈ R²\V, or d′(x, y) = d(x, y + v) if x ∈ R²\V and y ∈ V.

(a) Show that d′ is a metric on R².

Hint: If x, y ∈ V, then d′(x, y) = d(x + v, y + v).

Next consider the R²-valued sequence {x_n} where x_n = (1/n, 1) for each n ≥ 1. Show that

(b) x_n → v in (R², d) and x_n → 0 in (R², d′).

(This construction was communicated by Ivo Fernandez Lopez.)
Problem 3.36. Upper and lower semicontinuity were defined in Problem 3.27. Equip the real line with its usual metric, let X be a metric space, and consider the following statement.

(i) Suppose φ: X → R is an upper semicontinuous function on X and suppose ψ: X → R is a lower semicontinuous function on X. If φ(x) ≤ ψ(x) for every x ∈ X, then there exists a continuous function η: X → R such that φ(x) ≤ η(x) ≤ ψ(x) for every x ∈ X.

This is the Hahn Interpolation Theorem. Use it to prove the Tietze Extension Theorem, which is stated below.

(ii) Let A be a nonempty and closed subset of a metric space X. If f: A → R is a bounded and continuous function on A, then it has a continuous extension g: X → R over the whole space X. Moreover, inf_{x∈X} g(x) = inf_{a∈A} f(a) and sup_{x∈X} g(x) = sup_{a∈A} f(a).
Hint: Verify that the functions φ: X → R and ψ: X → R defined by

φ(x) = f(x) for x ∈ A and φ(x) = inf_{a∈A} f(a) for x ∈ X\A,

and

ψ(x) = f(x) for x ∈ A and ψ(x) = sup_{a∈A} f(a) for x ∈ X\A,

extend f over X and satisfy the hypothesis of the Hahn Interpolation Theorem. To show that φ is upper semicontinuous at an arbitrary point a0 ∈ A, use the fact that f is continuous on A (take any β > f(a0) and set ε = β − f(a0)). To show that φ is upper semicontinuous at an arbitrary point x0 ∈ X\A, use the fact that X\A is open in X and hence it includes an open ball B_δ(x0) centered at x0 for some radius δ > 0 (d(x, x0) < δ implies φ(x) < β for every β > φ(x0)).

Problem 3.37. We can define neighborhoods in a general topological space as we did in a metric space. Precisely, a neighborhood of a point x in a topological space X is any subset of X that includes an open subset which contains x.
(a) Show that a subset of a topological space X is open in X if and only if it is a neighborhood of each one of its points.
A topological space X is a Hausdorff space if for every pair of distinct points x and y in X there exist neighborhoods N_x and N_y of x and y, respectively, such that N_x ∩ N_y = ∅. Prove the following propositions.

(b) Each singleton in a Hausdorff space is closed (i.e., X\{x} is open in X for every x ∈ X).
(c) Every metric space is a Hausdorff space (with respect to the metric topology).
(d) For every pair of distinct points x and y in a metric space X there exist nonempty open balls B_ε(x) and B_ρ(y) centered at x and y, respectively, such that B_ε(x) ∩ B_ρ(y) = ∅.

Problem 3.38. Let (X, T_X) be a topological space and let A be a subset of X. A set
U′ ⊆ A is said to be open relative to A if U′ = A ∩ U for some U ∈ T_X.

(a) Show that the collection T_A of all relatively open subsets of A is a topology on A.

T_A is called the relative topology on A. When a subset A of X is equipped with this relative topology it is called a subspace of X; that is, (A, T_A) is a subspace of (X, T_X). If a subspace A of X is an open subset of X, then it is called an open subspace of X. Similarly, if it is a closed subset of X, then it is called a closed subspace of X. If (Y, T_Y) is a topological space, and if F: X → Y is a mapping of a set X into Y, then the collection F⁻¹(T_Y) = {F⁻¹(U): U ∈ T_Y} of all inverse images F⁻¹(U) of open sets U in Y forms a topology on X. This is the topology inversely induced on X by F, which is the weakest topology on X that makes F continuous.
(b) Verify that the relative topology on A is the topology inversely induced on A by the inclusion map of A into X. (Recall: the inclusion map of A into X is the function J: A → X defined by J(x) = x for every x ∈ A.)

Now let (X, d) be a metric space and let T_X be the metric topology on X. Suppose A is a subset of X and consider the (metric) subspace (A, d) of the metric space (X, d). Let T_A′ be the metric topology induced on A by the relative metric (i.e., let T_A′ be the collection of all open sets in the metric space (A, d)).

(c) Show that U′ ⊆ A is open in (A, d) if and only if U′ = A ∩ U for some U ⊆ X open in (X, d). Thus the metric topology T_A′ induced on A by the relative metric coincides with the relative topology T_A induced on A by the metric topology T_X on X; that is, T_A′ = T_A, and hence the notion of subspace is unambiguously defined in a metric space.

Let A be a subspace of a metric space X.

(d) Show that V′ ⊆ A is closed in A (or closed relative to A) if and only if V′ = A ∩ V for some closed subset V of X.

Hint: A\(A ∩ B) = A ∩ (X\B) for arbitrary subsets A and B of X.

(e) Open subsets of an open subspace of X are open sets in X. Dually, closed subsets of a closed subspace of X are closed sets in X. Prove.

Let A be a subset of B (A ⊆ B ⊆ X) and let A⁻ and B⁻ be the closures of A and B in X, respectively. Prove the following propositions.

(f) B ∩ A⁻ coincides with the closure of A in the subspace B.

(g) A is dense in the subspace B if and only if A⁻ = B⁻.

Problem 3.39. A point x in a metric space X is a condensation point of a subset A of X if the intersection of A with every nonempty open ball centered at x is uncountable. Let A′ denote the set of all condensation points of an arbitrary set A in X. Show that A′ is closed in X and A′ ⊆ A*, where A* is the derived set of A.

Problem 3.40. Take an arbitrary real number p ≥ 1 and consider the metric space (R^p[0, 1], d_p) of Example 3E. Let C[0, 1] be the set of all scalar-valued continuous functions on the interval [0, 1] as in Example 3D. Recall that C[0, 1] ⊂ R^p[0, 1]. This inclusion is interpreted in the following sense: if x ∈ C[0, 1], then [x] ∈ R^p[0, 1]. Therefore, we are identifying C[0, 1] with the collection of all equivalence classes [x] = {x′ ∈ r^p[0, 1]: δ_p(x′, x) = 0} that contain a continuous function x ∈ C[0, 1] (see Example 3E). Use the Closed Set Theorem to show that C[0, 1] is neither closed nor open in (R^p[0, 1], d_p).
Hint: As usual, write x for [x]. Consider the C[0,1]-valued sequence {x_n}_{n=1}^∞ and the (R^p[0,1]\C[0,1])-valued sequence {y_n}_{n=1}^∞ defined by
x_n(t) = 1 for t ∈ [0, 1/2], x_n(t) = n + 1 - 2nt for t ∈ (1/2, (n+1)/2n], x_n(t) = 0 for t ∈ ((n+1)/2n, 1],
and
y_n(t) = 1 for t ∈ [0, 1/n) and y_n(t) = 0 for t ∈ [1/n, 1].
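Though the text itself contains no programs, the hint can be checked numerically. The sketch below (illustrative; function names and the Riemann-sum discretization are not from the book) estimates d_p(x_n, y) between the piecewise-linear x_n and the discontinuous step function with jump at t = 1/2 that is their limit in d_p.

```python
# Illustrative numeric check for Problem 3.40 (not from the book): the
# continuous functions x_n converge in d_p to the discontinuous step
# function with jump at t = 1/2, so C[0,1] cannot be closed in R^p[0,1].

def x(n, t):
    # x_n: equals 1 on [0, 1/2], drops linearly to 0 at (n+1)/(2n), then 0
    if t <= 0.5:
        return 1.0
    if t <= (n + 1) / (2 * n):
        return n + 1 - 2 * n * t
    return 0.0

def step(t):
    # the candidate limit: 1 on [0, 1/2], 0 afterwards (not continuous)
    return 1.0 if t <= 0.5 else 0.0

def dp(f, g, p=2.0, m=20000):
    # midpoint Riemann-sum estimate of the metric d_p(f, g) on [0, 1]
    h = 1.0 / m
    total = sum(abs(f((i + 0.5) * h) - g((i + 0.5) * h)) ** p for i in range(m))
    return (total * h) ** (1.0 / p)

dists = [dp(lambda t, n=n: x(n, t), step) for n in (1, 10, 100)]
print(dists)  # decreasing toward 0
```

For p = 2 the exact value is d_2(x_n, step) = (1/(6n))^{1/2}, which the Riemann sums approximate closely.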
Problem 3.41. Let A be a subset of a metric space X. The boundary of A is the set ∂A = A⁻\A°. A point x ∈ X is a boundary point of A if it belongs to ∂A. Prove the following propositions.
(a) ∂A = A⁻ ∩ (X\A)⁻ = X\(A° ∪ (X\A)°) = ∂(X\A).
(b) A⁻ = A° ∪ ∂A, so that A = A⁻ if and only if ∂A ⊆ A.
(c) ∂A is a closed subset of X (i.e., (∂A)⁻ = ∂A).
(d) A° ∩ ∂A = ∅, so that A = A° if and only if A ∩ ∂A = ∅.
(e) The collection {A°, ∂A, (X\A)°} is a partition of X.
(f) ∂A ∩ ∂B = ∅ implies (A ∪ B)° = A° ∪ B° (for B ⊆ X).
Problem 3.42. Let (X, d) be a metric space.
(a) Show that a closed ball is a closed set.
Let B_ρ(x) and B_ρ[x] be arbitrary nonempty open and closed balls, respectively, both centered at the same point x ∈ X and with the same radius ρ > 0. Prove the following propositions.
(b) B_ρ(x) ⊆ B_ρ[x]° and B_ρ(x)⁻ ⊆ B_ρ[x].
(c) ∂B_ρ(x) ⊆ {y ∈ X: d(y, x) = ρ} and ∂B_ρ[x] ⊆ {y ∈ X: d(y, x) = ρ}.
(d) Show that the above inclusions may be proper.
Hint: X = [0,1] ∪ [2,3], x = 1 and ρ = 1.
Problem 3.43. Let A be an arbitrary subset of a metric space (X, d). Show that
(a) diam(A°) ≤ diam(A) = diam(A⁻) and diam(∂A) ≤ diam(A),
(b) d(x, A) = d(x, A⁻) and d(x, A) = 0 if and only if x ∈ A⁻.
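Proposition (b) can be sanity-checked numerically on the real line. In the sketch below (ad hoc names, not from the text) A = (0, 1), so A⁻ = [0, 1], and the distance to A is approximated by sampling points of A.

```python
# Illustrative check of Problem 3.43(b) in R with A = (0, 1): the distance
# from a point x to the open interval equals its distance to the closure [0, 1].

def dist_to_closure(x):
    # exact d(x, A⁻) for A⁻ = [0, 1]
    return max(-x, x - 1.0, 0.0)

def dist_to_A(x, m=100000):
    # approximate d(x, A) by sampling points of A = (0, 1)
    return min(abs(x - k / m) for k in range(1, m))

for x in (-2.0, 0.0, 0.3, 1.0, 2.5):
    assert abs(dist_to_A(x) - dist_to_closure(x)) < 1e-3

# here diam(A°) = diam(A) = diam(A⁻) = 1 as well
print("distance checks passed")
```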
Problem 3.44. For an arbitrary p ≥ 1 let ℓ_+^p be the set of all scalar-valued p-summable sequences as in Example 3B. Let e_+ be the set of all scalar-valued sequences {ξ_k}_{k∈N} for which there exist ρ > 1 and α ∈ (0,1) such that |ξ_k| ≤ ρα^k for every k ∈ N; and let ℓ_+^0 be the set of all scalar-valued sequences with a finite number of nonzero entries.
(a) Prove: If p and q are real numbers such that 1 ≤ p < q, then
ℓ_+^0 ⊂ e_+ ⊂ ℓ_+^p ⊂ ℓ_+^q,
where the above inclusions are all proper. (Hint: Problem 3.5.)
(b) Moreover, show that the sets ℓ_+^0, e_+ and ℓ_+^p are all dense in the metric space (ℓ_+^q, d_q) of Example 3B. (Hint: Example 3P.)
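The density claim in (b) for ℓ_+^0 can be illustrated numerically: truncating a square-summable sequence yields finitely nonzero approximants whose d_q distance to it is a tail of a convergent series (sketch with ad hoc names, not from the text).

```python
# Illustrative sketch for Problem 3.44(b): truncations of an ℓ² sequence
# (elements of ℓ⁰, finitely many nonzero entries) approximate it in d_q.

def dq(x, y, q=2.0):
    # d_q distance between two finitely supported sequences (zero-padded)
    n = max(len(x), len(y))
    xs = list(x) + [0.0] * (n - len(x))
    ys = list(y) + [0.0] * (n - len(y))
    return sum(abs(a - b) ** q for a, b in zip(xs, ys)) ** (1.0 / q)

x = [1.0 / (k + 1) for k in range(5000)]   # finite section of {1/k}, square-summable
tails = [dq(x, x[:n]) for n in (10, 100, 1000)]
print(tails)  # the tail distances shrink: the truncations converge in d_q
```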
Problem 3.45. Let A and B be subsets of a metric space X. Recall: (A ∩ B)⁻ ⊆ A⁻ ∩ B⁻ and this inclusion may be proper. For instance, the sets in Example 3M are disjoint while their closures are not. We shall now exhibit a pair of sets A and B with the property
A ∩ B ≠ ∅ and (A ∩ B)⁻ ≠ A⁻ ∩ B⁻.
Consider the metric space (ℓ_+^2, d_2) as in Example 3B. Set
A = ℓ_+^0 and B = {x_β ∈ ℓ_+^2: x_β = {β^k}_{k≥1} for some β ∈ C}.
Recall that ℓ_+^0 ⊂ ℓ_+^2 (Problem 3.5) and (ℓ_+^0)⁻ = ℓ_+^2 in (ℓ_+^2, d_2) (i.e., the set ℓ_+^0 is dense in the metric space (ℓ_+^2, d_2) - Problem 3.44(b)). Show that B = B⁻; that is, B is closed in (ℓ_+^2, d_2). (Hint: Theorem 3.30.) Thus conclude that A ∩ B = {0}, (A ∩ B)⁻ = {0} and A⁻ ∩ B⁻ = B.
Problem 3.46. Suppose F: X → Y is a continuous mapping of a metric space X into a metric space Y. Show that
(a) F(A⁻) ⊆ F(A)⁻
for every A ⊆ X. (Hint: Problem 1.2 and Theorem 3.23.) Now use the above result to conclude that, if A ⊆ X and C ⊆ Y, then
(b) F(A) ⊆ C implies F(A⁻) ⊆ C⁻.
Finally, prove that
(c) A⁻ = B⁻ implies F(A)⁻ = F(B)⁻
whenever A ⊆ B ⊆ X. (Hint: Proposition 3.32 and Corollary 3.8.) Thus, if A is dense in X and if F is continuous and has a dense range (in particular, if F is surjective), then F(A) is dense in Y.
Problem 3.47. Consider the metric space (ℓ_+^p, d_p) for any p ≥ 1, take an arbitrary bounded scalar-valued sequence a = {α_k}_{k≥1} from ℓ_+^∞, and let D_a: (ℓ_+^p, d_p) → (ℓ_+^p, d_p) be the diagonal mapping defined in Problem 3.22. Suppose the bounded sequence a is such that α_k ≠ 0 for every k ≥ 1. Let ℓ_+^0 denote the set of all scalar-valued sequences with a finite number of nonzero entries.
(a) Show that R(D_a)⁻ = ℓ_+^p (i.e., the range of D_a is dense in (ℓ_+^p, d_p)).
Hint: Verify that ℓ_+^0 ⊆ R(D_a) ⊆ ℓ_+^p (see Problem 3.44).
(b) Show that D_a(ℓ_+^0)⁻ = ℓ_+^p. Hint: Problems 3.22(a), 3.44(b) and 3.46(c).
Problem 3.48. Prove the following results.
(a) If X is a separable metric space and F: X → Y is a continuous and surjective mapping of X onto a metric space Y, then Y is separable (i.e., a continuous mapping preserves separability).
Hint: Recall that, if there exists a surjective function of a set A onto a set B, then #B ≤ #A. Use Problem 3.46.
(b) Separability is a topological invariant.
Problem 3.49. Verify the following propositions.
(a) A metric space X is separable if and only if there exists a countable subset A of X such that every nonempty open ball centered at each x ∈ X meets A (i.e., if and only if there exists a countable subset A of X such that every x ∈ X is a point of adherence of A). (Hint: Propositions 3.27 and 3.32.)
(b) The product space (∏_{i=1}^n X_i, d) of a finite collection {(X_i, d_i)}_{i=1}^n of separable metric spaces is a separable metric space. (Note: d is any of the metrics on ∏_{i=1}^n X_i that were defined in Problem 3.9.)
(c) If c_+^0 is the set of all scalar-valued sequences that converge to zero, then (c_+^0, d_∞) is a separable metric space.
Hint: According to Proposition 3.39(b), (c_+^0, d_∞) is a subspace of the nonseparable metric space (ℓ_+^∞, d_∞) of Example 3Q. Show that the set of all rational-valued sequences with a finite number of nonzero entries is countable and dense in (c_+^0, d_∞) (but not dense in (ℓ_+^∞, d_∞)). See Example 3P. Note: We say that a complex number is "rational" if its real and imaginary parts are rational numbers.
(d) The metric space (c_+^0, d_∞) is not homeomorphic to (ℓ_+^∞, d_∞).
Problem 3.50. A subset A of a topological space X that is both closed and open is called clopen (or closed-open). A partition {A, B} of X into the union of two nonempty disjoint clopen sets A and B is a disconnection of X (i.e., {A, B} is a disconnection of X = A ∪ B if A ∩ B = ∅ and A and B are both clopen subsets of X). If there exists a disconnection of X, then X is called disconnected. Otherwise X is called connected. In other words, a topological space X is connected if and only if the only clopen subsets of X are the whole space X and the empty set ∅. A subset A of X is a connected set if, as a subspace of X, A is a connected topological space. A topological space is totally disconnected if there is no connected subset of it containing two distinct points. Prove the following propositions.
(a) X is disconnected if and only if it is the union of two disjoint nonempty open sets. (b) X is disconnected if and only if it is the union of two disjoint nonempty closed sets.
(c) If A is a connected set in a topological space X, and if A ⊆ B ⊆ A⁻, then B is connected. In particular, A⁻ is connected whenever A is connected. Note: The closure A⁻ of a set A is defined in a topological space X in the same way we have defined it in a metric space: the smallest closed subset of X including A.
(d) The continuous image of a connected set is a connected set.
(e) Connectedness is a topological invariant.
Recall that a subset A of a topological space X is discrete if it consists entirely of isolated points (i.e., if every point x in A does not belong to (A\{x})⁻, which means that in the subspace A every set is open in A). Suppose A is a discrete subset of X. If B is an arbitrary subset of A that contains more than one point, and if x ∈ B, then B\{x} and {x} are both nonempty and open in A. Thus B is disconnected, and hence A is totally disconnected. Conclusion: A discrete subset of a topological space is totally disconnected.
(f) Show that the converse of the above italicized proposition fails.
Hint: Verify that Q is a totally disconnected subset of R that is dense in itself (i.e., it has no isolated point).
Problem 3.51. Prove the following propositions.
(a) {x_n} is a Cauchy sequence in a metric space (X, d) if and only if
lim_n sup_{k≥1} d(x_{n+k}, x_n) = 0
(i.e., if and only if {d(x_{n+k}, x_n)} converges to zero uniformly in k).
(b) Show that the real-valued sequence {x_n} such that x_n = log n for each n ≥ 1 has the property
lim_n d(x_{n+k}, x_n) = 0
for every k ≥ 1, where d is the usual metric on R. (Hint: log: (0, ∞) → R is continuous - use Corollary 3.8.) However, {x_n} is not a Cauchy sequence (it is not even bounded).
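The phenomenon in (b) is easy to see numerically; the snippet below (illustrative, not from the book) contrasts the fixed-k increments of log n, which vanish, with the doubling increments that keep the sequence from being Cauchy.

```python
# Illustrative check of Problem 3.51(b): x_n = log n satisfies
# d(x_{n+k}, x_n) → 0 for each FIXED k, yet {x_n} is not Cauchy,
# since sup over k of d(x_{n+k}, x_n) does not go to 0 (take k = n).

from math import log

def d(a, b):
    return abs(a - b)

# fixed k: log(n + k) - log(n) = log(1 + k/n) → 0 as n grows
fixed_k = [d(log(n + 5), log(n)) for n in (10, 100, 1000)]
assert fixed_k[0] > fixed_k[1] > fixed_k[2]

# but with k = n the increments stay at log 2, so the Cauchy
# condition lim_n sup_k d(x_{n+k}, x_n) = 0 fails
doubling = [d(log(2 * n), log(n)) for n in (10, 100, 1000)]
print(doubling)  # constant, roughly 0.693 = log 2
```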
Problem 3.52. Let (X, d) be a metric space and let {x_n} be an X-valued sequence. We say that {x_n} is of bounded variation if
Σ_{n=1}^∞ d(x_{n+1}, x_n) < ∞.
If there exist real constants ρ > 1 and α ∈ (0,1) such that
d(x_{n+1}, x_n) ≤ ρα^n
for every n, then we say that {x_n} has exponentially decreasing increments. Prove the following propositions.
(a) If a sequence in a metric space has exponentially decreasing increments, then it is of bounded variation.
(b) If a sequence in a metric space is of bounded variation, then it is a Cauchy sequence.
Thus, if (X, d) is a complete metric space, then every sequence of bounded variation converges in (X, d), which implies that every sequence with exponentially decreasing increments converges in (X, d). (c) Every Cauchy sequence in a metric space has a subsequence with exponentially decreasing increments (and therefore every Cauchy sequence in a metric space has a subsequence of bounded variation).
Now prove the converse of the above italicized statement:
(d) If every sequence with exponentially decreasing increments in a metric space (X, d) converges in (X, d), then (X, d) is complete.
Hints: (a) Verify that Σ_{n=1}^m α^n → α/(1-α) as m → ∞ for any α ∈ (0,1). (b) Use the triangle inequality and Problems 3.10 and 3.11. (c) Show that for each integer k ≥ 1 there exists an integer n_k ≥ 1 such that d(x_n, x_{n_k}) < (1/2)^k for every n ≥ n_k whenever {x_n} is a Cauchy sequence. (d) Proposition 3.39(c).
Problem 3.53. If {x_n} and {y_n} are (similarly indexed) Cauchy sequences in a metric space (X, d), then
(a) the real sequence {d(x_n, y_n)} converges in R.
Hint: Use Problems 3.1(b) and 3.10(c) to show that {d(x_n, y_n)} is a Cauchy sequence in R.
Moreover, if {x_n'} and {y_n'} are Cauchy sequences in a metric space (X, d) equiconvergent with {x_n} and {y_n}, respectively (i.e., if {x_n'} and {y_n'} are Cauchy sequences in (X, d) such that lim d(x_n, x_n') = 0 and lim d(y_n, y_n') = 0 - see Problem 3.15), then
(b) lim d(x_n, y_n) = lim d(x_n', y_n').
Hint: Set a_n = d(x_n, y_n), a_n' = d(x_n', y_n'), a = lim a_n, and a' = lim a_n'. Use Problems 3.1(b) and 3.10(c) to show that |a_n - a_n'| → 0. Now note that 0 ≤ |a - a'| ≤ |a - a_n| + |a_n - a_n'| + |a_n' - a'| for each n.
Problem 3.54. Suppose {x_n} and {x_n'} are two (similarly indexed) equiconvergent sequences in a metric space X - see Problem 3.15.
(a) Show that if one of them is a Cauchy sequence, then so is the other.
(b) A metric space X is complete whenever there exists a dense subset A of X such that every Cauchy sequence in A converges in X. Prove.
Problem 3.55. Let X be an arbitrary set. A function d: X × X → R is an ultrametric on X if it satisfies conditions (i), (ii) and (iii) in Definition 3.1 and also the ultrametric inequality,
d(x, y) ≤ max{d(x, z), d(z, y)}
for every x, y and z in X. Clearly, the ultrametric inequality implies the triangle inequality, so that an ultrametric is a metric. Example: the discrete metric is an ultrametric. Let d be an ultrametric on X and let x, y and z be arbitrary points in X. Prove the following propositions.
(a) If d(x, z) ≠ d(z, y), then d(x, y) = max{d(x, z), d(z, y)}.
(b) Every point in a nonempty open ball is a center of that ball. That is, if ε > 0 and z ∈ B_ε(y), then B_ε(y) = B_ε(z).
Hint: Suppose z ∈ B_ε(y) and take any x ∈ B_ε(y). First note that if d(x, z) = d(z, y), then x ∈ B_ε(z). Next use item (a) to show that if d(x, z) ≠ d(z, y), then x ∈ B_ε(z). Thus conclude that B_ε(y) ⊆ B_ε(z) whenever z ∈ B_ε(y).
(c) If two nonempty open balls meet, then one of them is included in the other. In particular, if two nonempty open balls of the same radius meet, then they coincide with each other.
(d) Any nonempty open ball is a closed set. (Hint: Theorem 3.30.) Thus conclude that the metric space (X, d) is totally disconnected. (Hint: Proposition 3.10 and Problem 3.50.)
(e) A sequence {x_n} in (X, d) is a Cauchy sequence if and only if
lim_n d(x_{n+1}, x_n) = 0.
Thus conclude that {d(x_{n+k}, x_n)} converges to zero uniformly in k if and only if it converges to zero for some integer k ≥ 1; that is, lim_n sup_{k≥1} d(x_{n+k}, x_n) = 0 if and only if lim_n d(x_{n+k}, x_n) = 0 for some k ≥ 1. (Compare with Problem 3.51.)
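A standard concrete ultrametric, not taken from the book, is the 2-adic metric on the integers; the sketch below verifies the ultrametric inequality and property (b) on a finite sample of points.

```python
# Illustrative ultrametric for Problem 3.55: the 2-adic metric on Z,
# d(m, n) = 2^(-v) where 2^v exactly divides m - n, and d(m, m) = 0.

def d2(m, n):
    if m == n:
        return 0.0
    diff = abs(m - n)
    v = 0
    while diff % 2 == 0:
        diff //= 2
        v += 1
    return 2.0 ** (-v)

pts = range(-20, 21)

# ultrametric inequality d(x, y) ≤ max(d(x, z), d(z, y)) on the sample
assert all(d2(x, y) <= max(d2(x, z), d2(z, y))
           for x in pts for y in pts for z in pts)

# Problem 3.55(b): every point of a ball is a center of it
eps, y = 0.3, 0
ball = [p for p in pts if d2(p, y) < eps]
for z in ball:
    assert [p for p in pts if d2(p, z) < eps] == ball
print("ultrametric checks passed")
```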
Problem 3.56. Let S be a nonempty set and consider the collection S^N of all S-valued sequences. For any two distinct sequences x = {x_k} and y = {y_k} in S^N set k(x, y) = min{k ∈ N: x_k ≠ y_k}, and define the function d: S^N × S^N → R by
d(x, y) = 0 if x = y and d(x, y) = 1/k(x, y) if x ≠ y.
(a) Show that d is an ultrametric on S^N and that (S^N, d) is a complete metric space.
This metric d is called the Baire metric on S^N. Now set S = F (where F denotes either the real field R or the complex field C) and let ℓ_+^∞ be the set of all bounded scalar-valued sequences (i.e., x = {ξ_k} ∈ ℓ_+^∞ if and only if sup_k |ξ_k| < ∞). Let d be the Baire metric on F^N and consider the subspace (ℓ_+^∞, d) of the complete metric space (F^N, d). Take the following ℓ_+^∞-valued sequence {x_n}_{n∈N}: for each n ∈ N, x_n = {ξ_n(k)}_{k∈N} where
ξ_n(k) = n if k = n and ξ_n(k) = 0 if k ≠ n.
Let d_∞ be the sup-metric on ℓ_+^∞ and recall that (ℓ_+^∞, d_∞) is a complete metric space (Example 3S(c)).
(b) Show that {x_n}_{n∈N} converges to 0 (the null sequence) in (ℓ_+^∞, d) but is unbounded in (ℓ_+^∞, d_∞).
(c) Show that the metric space (ℓ_+^∞, d) is not complete.
Hint: Consider the following ℓ_+^∞-valued sequence {y_n}_{n∈N}: for each n ∈ N, y_n = {υ_n(k)}_{k∈N} where
υ_n(k) = k if k ≤ n and υ_n(k) = 0 if k > n.
Verify that {y_n}_{n∈N} converges in (F^N, d) to y = {k}_{k∈N} ∈ F^N\ℓ_+^∞. Use item (a) and Theorems 3.30 and 3.40(a).
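The Baire metric and the sequence {x_n} of item (b) can be sketched directly, representing sequences as finite lists padded with zeros (illustrative only; names are ad hoc).

```python
# Sketch of the Baire metric of Problem 3.56 on eventually-zero sequences:
# d(x, y) = 1/k where k is the first (1-based) index at which x and y differ.

def baire(x, y):
    n = max(len(x), len(y))
    xs = list(x) + [0] * (n - len(x))
    ys = list(y) + [0] * (n - len(y))
    for k, (a, b) in enumerate(zip(xs, ys), start=1):
        if a != b:
            return 1.0 / k
    return 0.0

def x_seq(n, length=50):
    # x_n of Problem 3.56: the value n at position n, zero elsewhere
    s = [0] * length
    s[n - 1] = n
    return s

zero = [0] * 50
print([baire(x_seq(n), zero) for n in (1, 5, 25)])   # 1, 1/5, 1/25 → 0 in d
print([max(x_seq(n)) for n in (1, 5, 25)])           # sup-norms 1, 5, 25: unbounded
```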
Problem 3.57. Recall: (ℓ_+^p, d_p) is a complete metric space for every p ≥ 1 (Example 3R(b)) and ℓ_+^0 ⊂ ℓ_+^p ⊂ ℓ_+^q whenever 1 ≤ p < q (Problem 3.44(a)). Consider the subspace (ℓ_+^p, d_q) of (ℓ_+^q, d_q) and show that
(ℓ_+^p, d_q) is not a complete metric space.
Now consider the subspace (ℓ_+^0, d_p) of (ℓ_+^p, d_p) and show that
(ℓ_+^0, d_p) is not a complete metric space.
Hint: Problem 3.44(b) and Theorem 3.40(a).
Problem 3.58. Take an arbitrary real number p ≥ 1 and consider the metric space (C[0,1], d_p) of Example 3D. Prove that
(C[0,1], d_p) is not a complete metric space.
Hint: Consider the C[0,1]-valued sequence {x_n} that was defined in Problem 3.40. First take an arbitrary pair of integers m and n such that 1 ≤ m < n and show that d_p(x_m, x_n)^p ≤ 1/(2m). Then conclude that {x_n} is a Cauchy sequence in (C[0,1], d_p).
Next suppose there exists a function x in C[0,1] such that d_p(x_n, x) → 0. Show that
(i) ∫_0^{1/2} |1 - x(t)|^p dt = 0 and (ii) ∫_{(n+1)/2n}^1 |x(t)|^p dt → 0 as n → ∞.
From (i) conclude that x(t) = 1 for all t ∈ [0, 1/2]; in particular, x(1/2) = 1. From (ii) conclude that x(t) = 0 for all t ∈ [(n+1)/2n, 1] and every n ≥ 1; in particular, x((n+1)/2n) = 0 for every n ≥ 1, so that x(1/2) = x(lim (n+1)/2n) = lim x((n+1)/2n) = 0 by Corollary 3.8. This leads to a contradiction (viz., 0 = 1), and hence there is no function x in C[0,1] such that d_p(x_n, x) → 0. Thus the C[0,1]-valued Cauchy sequence {x_n} does not converge in (C[0,1], d_p).
Problem 3.59. Recall that (ℓ_+^∞, d_∞) is a complete metric space (Example 3S). Let c_+ denote the set of all scalar-valued convergent sequences (i.e., x = {ξ_k} ∈ c_+ if and only if |ξ_k - ξ| → 0 for some scalar ξ) and let c_+^0 denote the subset of c_+ consisting of all sequences that converge to zero. Since every convergent sequence is bounded (Proposition 3.39), it follows that
ℓ_+^0 ⊂ ℓ_+^p ⊂ c_+^0 ⊂ c_+ ⊂ ℓ_+^∞,
with the sets ℓ_+^p (p ≥ 1) and ℓ_+^0 defined as before (Problems 3.44 and 3.57). Use the Closed Set Theorem to verify the following propositions.
(a) (c_+, d_∞) and (c_+^0, d_∞) are complete metric spaces.
(b) (ℓ_+^p, d_∞) and (ℓ_+^0, d_∞) are not complete metric spaces.
Hint: (a) To show that c_+ is closed in (ℓ_+^∞, d_∞) proceed as follows. Take an arbitrary ε > 0. Let {x_n}_{n≥1} be a c_+-valued sequence so that, for each n ≥ 1, x_n = {ξ_n(k)}_{k≥1} converges in F. Thus for each n ≥ 1 there exists an integer k_{ε,n} ≥ 1 such that
|ξ_n(j) - ξ_n(k)| < ε
whenever j, k ≥ k_{ε,n}. Suppose {x_n}_{n≥1} converges in (ℓ_+^∞, d_∞) to x = {ξ(k)}_{k≥1} ∈ ℓ_+^∞, so that sup_k |ξ_n(k) - ξ(k)| → 0 as n → ∞. Thus there exists an integer n_ε ≥ 1 such that
|ξ_n(k) - ξ(k)| < ε
for every k ≥ 1 whenever n ≥ n_ε. Therefore,
|ξ(k) - ξ(j)| ≤ |ξ(k) - ξ_{n_ε}(k)| + |ξ_{n_ε}(k) - ξ_{n_ε}(j)| + |ξ_{n_ε}(j) - ξ(j)| < 3ε
whenever j, k ≥ k_{ε,n_ε}. Now conclude that x lies in c_+. (b) To show that both sets ℓ_+^0 and ℓ_+^p are not closed in the metric space (ℓ_+^∞, d_∞) set x_n = (1, (1/2)^{1/p}, ..., (1/n)^{1/p}, 0, 0, 0, ...) ∈ ℓ_+^0 for each n ≥ 1, so that the sequence {x_n}_{n≥1} converges in (ℓ_+^∞, d_∞) to x = {(1/k)^{1/p}}_{k≥1} ∈ c_+^0\ℓ_+^p.
Remark: Note that ℓ_+^p is not dense in (ℓ_+^∞, d_∞): if y = {υ_k}_{k≥1} is the constant sequence in ℓ_+^∞ with υ_k = 1 for all k ≥ 1, then d_∞(x, y) ≥ 1 for every x ∈ ℓ_+^p ⊂ c_+^0. Hence ℓ_+^0 is not dense in (ℓ_+^∞, d_∞) either.
Problem 3.60. Let {(X_i, d_i)}_{i=1}^n be a finite collection of metric spaces and let Z = ∏_{i=1}^n X_i be the Cartesian product of their underlying sets. Let d denote any of the metrics d_p (for an arbitrary p ≥ 1) or d_∞ on Z = ∏_{i=1}^n X_i as in Problem 3.9. Show that the product space (∏_{i=1}^n X_i, d) is complete if and only if (X_i, d_i) is a complete metric space for every i = 1, ..., n. Hint: Consider the metric d_1 on Z (see Problem 3.9) and show that (∏_{i=1}^n X_i, d_1) is complete if and only if each (X_i, d_i) is complete. Now use Problem 3.33 and Lemma 3.43.
Problem 3.61. Take an arbitrary nondegenerate closed interval of the real line, say I = [α, β] ⊂ R of positive length λ = β - α. Consider the pair of closed intervals {[α, α + λ/3], [β - λ/3, β]} consisting of the first and third closed subintervals of I = [α, β] of length λ/3, which will be referred to as the closed intervals derived from I by removal of the central open third of I. If T = {I_i}_{i=1}^m is a finite disjoint collection of nondegenerate closed intervals in R, then let T' = {I_j'}_{j=1}^{2m} be the corresponding finite collection obtained by replacing each closed interval I_i in T with the pair of closed subintervals derived from I_i by removal of the central open third of I_i. Now consider the unit interval [0,1] and set
T_0 = {[0,1]}.
The derived intervals from [0,1] by removal of the central open third are [0, 1/3] and [2/3, 1]. Set
T_1 = T_0' = {[0, 1/3], [2/3, 1]}.
Similarly, replacing each closed interval in T_1 by the pair of closed subintervals derived from it by removal of its central open third, set
T_2 = T_1' = {[0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1]}.
Take an arbitrary positive integer n. Suppose the collection of intervals T_k has already been defined for each k = 0, 1, ..., n, and set T_{n+1} = T_n'. This leads to an inductive construction of a disjoint collection T_n of 2^n nondegenerate closed subintervals of the unit interval [0,1], each having length 1/3^n, for every n ∈ N_0. Next set C_n = ∪T_n (the union of all the intervals in T_n) for each n ∈ N_0 and note that {C_n}_{n∈N_0} is a decreasing sequence of subsets of the unit interval (i.e., C_{n+1} ⊂ C_n ⊂ C_0 = [0,1]) such that each set C_n is the union of 2^n disjoint nondegenerate closed subintervals of length 1/3^n.
For instance, C_0 = [0,1], C_1 = [0, 1/3] ∪ [2/3, 1] and C_2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1]. The Cantor set is the set
C = ∩_{n∈N_0} C_n.
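The inductive construction of the collections T_n can be mirrored in a few lines (an illustrative sketch with ad hoc names); exact rational arithmetic confirms the count 2^n and the total length (2/3)^n at each level.

```python
# Illustrative construction of the interval collections T_n of Problem 3.61.
# Each level doubles the interval count and divides each length by 3, so
# the total length of C_n = union of T_n is (2/3)^n.

from fractions import Fraction

def derive(intervals):
    # replace each [a, b] by its first and third closed thirds
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

T = [(Fraction(0), Fraction(1))]
for n in range(1, 6):
    T = derive(T)
    assert len(T) == 2 ** n
    assert sum(b - a for a, b in T) == Fraction(2, 3) ** n

print(T[:2])  # at level 5: the first two of 32 intervals, each of length 1/243
```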
(a) Show that the Cantor set is a nonempty, closed and bounded subset of the real line R.
(b) Show that the Cantor set has an empty interior and hence it is nowhere dense.
Hint: Recall that each set C_n consists of 2^n intervals of length 1/3^n. Take an arbitrary point y ∈ C and an arbitrary ε > 0. Verify that there exists a positive integer n_ε such that the open ball B_ε(y) is not included in C_{n_ε}. Now conclude that the nonempty open ball B_ε(y) is not included in C, which means that y is not an interior point of C.
(c) Show that the Cantor set has no isolated point and hence it is a perfect subset of R. Moreover, show that the Cantor set is uncountable.
Hint: Consider the hint of item (b). Verify that there exists a positive integer n_ε such that the open ball B_ε(y) includes an endpoint of some of the closed intervals of C_{n_ε}. Also see Example 3W(b), and recall that R is a complete metric space.
(d) Show that the Cantor set is totally disconnected.
Hint: If α and γ are two points of an arbitrary subset A of C such that α < γ, and if n is a positive integer such that 1/3^n < γ - α, then α and γ cannot both belong to a single interval of length 1/3^n. Thus α and γ must belong to different intervals in C_n, so that there exists a real number β such that α < β < γ and β ∉ C_n. Verify that {A ∩ (-∞, β), A ∩ (β, ∞)} is a disconnection of the set A. See Problem 3.50.
Remark: Let μ(C_n) denote the length of each set C_n, which consists of 2^n disjoint intervals of length 1/3^n. Thus μ(C_n) = (2/3)^n. If we agree that the length μ(A) of a subset A of the real line can be defined somehow as to bear the property that 0 ≤ μ(A) ≤ μ(B) whenever A ⊆ B ⊆ R (provided the lengths μ(A) and μ(B) are "well-defined"), then the Cantor set C is such that 0 ≤ μ(C) ≤ μ(C_n) = (2/3)^n for every n ∈ N_0, and hence μ(C) = 0.
Problem 3.62. Note that each set C_n (n ≥ 1) is obtained from C_{n-1} by removing 2^{n-1} central open subintervals, each of length 1/3^n. Now, instead of removing at each iteration 2^{n-1} central open subintervals of length 1/3^n, remove at each iteration 2^{n-1} central open subintervals of length 1/4^n. Let {S_n}_{n∈N_0} be the resulting sequence of closed subsets of the unit interval S_0 = [0,1], and note that the length of S_n for each n ∈ N is
μ(S_n) = 1 - Σ_{i=0}^{n-1} 2^i/4^{i+1} = 1/2 + 1/2^{n+1}.
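The length formula for μ(S_n) can be confirmed with exact rational arithmetic (an illustrative sketch, not from the book):

```python
# Numerical check of the length formula in Problem 3.62: at step n one
# removes 2^(n-1) central open intervals of length 1/4^n, so
# μ(S_n) = 1 - Σ_{i=0}^{n-1} 2^i / 4^(i+1) = 1/2 + 1/2^(n+1).

from fractions import Fraction

def mu_S(n):
    return 1 - sum(Fraction(2) ** i / Fraction(4) ** (i + 1) for i in range(n))

for n in range(1, 10):
    assert mu_S(n) == Fraction(1, 2) + Fraction(1, 2) ** (n + 1)

print(mu_S(3))  # 9/16
```

In particular the lengths decrease to 1/2, not to 0: the limiting set is a "fat" Cantor-like set of positive total length.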
Consider the sequence {x_n} consisting of the characteristic functions of the subsets S_n of S_0 for each n ∈ N; that is,
x_n(t) = 1 if t ∈ S_n and x_n(t) = 0 if t ∈ S_0\S_n
for every n ≥ 1. Note that each x_n belongs to R^p[0,1] for every p ≥ 1 (see Example 3E), so that {x_n} is an R^p[0,1]-valued sequence.
(a) Equip R^p[0,1] with its usual metric d_p and show that {x_n} is a Cauchy sequence in (R^p[0,1], d_p).
(b) Show that {x_n} does not converge in (R^p[0,1], d_p).
Hint: Suppose there exists x in R^p[0,1] such that d_p(x_n, x) → 0. Consider the null function 0 ∈ R^p[0,1] and show that d_p(x_n, 0)^p ≥ 1/2 for all n. Use the triangle inequality to conclude that d_p(x, 0)^p ≥ 1/2. On the other hand, set S = ∩_n S_n, so that S_0\S = ∪_n(S_0\S_n). Take an arbitrary positive integer m and show that
∫_{∪_{n=1}^m (S_0\S_n)} |x(t)|^p dt ≤ ∫_0^1 |x(t) - x_k(t)|^p dt
for every k ≥ m. Finally, conclude that ∫_{S_0\S} |x(t)|^p dt = 0, which implies that the lower Riemann integral of |x|^p is zero. Since |x|^p is Riemann integrable, it follows that ∫_{S_0} |x(t)|^p dt = 0. This contradicts the fact that ∫_{S_0} |x(t)|^p dt ≥ 1/2.
From (a) and (b) we conclude that, for any p ≥ 1,
(R^p[0,1], d_p) is not a complete metric space.
Remark: The failure of R^p[0,1] to be complete when equipped with its usual metric d_p is regarded as one of the defects in the definition of the Riemann integral. A more general concept of integral, viz., the Lebesgue integral, corrects this and other drawbacks of the Riemann integral. Let L^p[0,1] be the collection of all equivalence classes (as in Example 3E) of scalar-valued functions x on [0,1] such that ∫_0^1 |x(t)|^p dt < ∞, where the integral now is the Lebesgue integral. Since any Riemann integrable function is Lebesgue integrable, it follows that R^p[0,1] ⊆ L^p[0,1]. Moreover, R^p[0,1] is dense in the metric space (L^p[0,1], d_p), so that (L^p[0,1], d_p) is a completion of (R^p[0,1], d_p).
Problem 3.63. A metric space X is complete if and only if every decreasing sequence {V_n}_{n∈N} of nonempty closed subsets of X for which diam(V_n) → 0 is such that
∩_{n∈N} V_n ≠ ∅.
Hint: This result, likewise Lemma 3.79, is also attributed to Cantor. Its proof follows closely the proof of Lemma 3.79. Consider the same X-valued sequence {v_n}_{n∈N} that was defined in part (a) of the proof of Lemma 3.79. Show that {v_n}_{n∈N} is a Cauchy sequence if diam(V_n) → 0. Suppose X is complete, set v = lim v_n, and verify that v ∈ V_m for an arbitrary m ∈ N so that ∩_{m∈N} V_m ≠ ∅. On the other hand,
let {x_n}_{n∈N} be an arbitrary X-valued Cauchy sequence and consider the decreasing sequence {V_m}_{m∈N} of nonempty closed subsets of X that was defined in part (b) of the proof of Lemma 3.79. Show that diam(V_m) → 0. If ∩_{m∈N} V_m ≠ ∅, then there exists v ∈ V_m for all m ∈ N. Verify that x_n → v and conclude that X is complete.
Problem 3.64. Let {(X_i, d_i)}_{i=1}^n be a finite collection of metric spaces and let Z = ∏_{i=1}^n X_i be the Cartesian product of their underlying sets. Let d denote any of the metrics d_p (for an arbitrary p ≥ 1) or d_∞ on Z = ∏_{i=1}^n X_i as in Problem 3.9.
(a) Show that the product space (∏_{i=1}^n X_i, d) is totally bounded if and only if (X_i, d_i) is totally bounded for every i = 1, ..., n.
Hint: First use Lemma 3.73 to show that (∏_{i=1}^n X_i, d_1) is totally bounded if and only if each (X_i, d_i) is totally bounded. Then apply Problem 3.33 and Corollary 3.81.
(b) Show that (∏_{i=1}^n X_i, d) is compact if and only if (X_i, d_i) is compact for every i = 1, ..., n.
Hint: Item (a), Problem 3.60, and Corollary 3.81.
Remark: Let {X_γ}_{γ∈Γ} be an indexed family of nonempty topological spaces and let Z = ∏_{γ∈Γ} X_γ be the Cartesian product of the underlying sets {X_γ}_{γ∈Γ}. The product topology on Z is the topology inversely induced on Z by the family {π_γ}_{γ∈Γ} of projections of Z onto each X_γ (i.e., the weakest topology on Z that makes each projection π_γ: Z → X_γ continuous). Compactness in a topological space is defined as in Definition 3.60. An important result, the Tikhonov Theorem, says that ∏_{γ∈Γ} X_γ is compact if and only if X_γ is compact for every γ ∈ Γ.
Problem 3.65. Every closed ball of positive radius in (ℓ_+^p, d_p) is not totally bounded (and hence not compact).
Hint: Take an arbitrary p ≥ 1 and consider the metric space (ℓ_+^p, d_p) of Example 3B. Let B_ρ[x_0] be a closed ball of radius ρ > 0 centered at an arbitrary x_0 ∈ ℓ_+^p. Consider the ℓ_+^p-valued sequence {e_i}_{i∈N} of Example 3X and set x_i = ρe_i + x_0 for each i ∈ N. Instead of following the approach of Example 3X, show that {x_i}_{i∈N} is a B_ρ[x_0]-valued sequence that has no Cauchy subsequence. Then apply Lemma 3.73.
Problem 3.66. Prove the following propositions.
(a) Every closed ball of positive radius in (C[0,1], d_∞) is not compact.
Hint: Consider the metric space (C[0,1], d_∞) of Example 3D. Let B_ρ[x_0] be the closed ball of radius ρ > 0 centered at an arbitrary x_0 ∈ C[0,1]. Consider the mapping φ: B_ρ[x_0] → R defined by
φ(x) = ∫_0^1 |x(t) - x_0(t)| dt - |x(0) - x_0(0)|
for every x ∈ B_ρ[x_0]. Equip B_ρ[x_0] with the sup-metric d_∞ and the real line with its usual metric. Show that φ is continuous, φ(x) < ρ for all x ∈ B_ρ[x_0], and sup_{x∈B_ρ[x_0]} φ(x) = ρ. Now use the Weierstrass Theorem (Theorem 3.86) to verify that B_ρ[x_0] is not compact.
(b) Every closed ball of positive radius in (C[0,1], d_∞) is not totally bounded.
Hint: Problem 3.42(a), Theorems 3.40 and 3.81, and item (a).
Problem 3.67. A topological space X is locally compact if every point of X has a compact neighborhood. Prove the following assertions.
(a) A metric space X is locally compact if and only if there exists a compact closed ball of positive radius centered at each point of X.
(b) R^n and C^n (equipped with any of their uniformly equivalent metrics of Example 3A) are locally compact.
(c) (ℓ_+^p, d_p) and (C[0,1], d_∞) are not locally compact.
(d) Every open subspace and every closed subspace of a locally compact metric space is locally compact.
Problem 3.68. Consider the metric space (ℓ_+^p, d_p) for some p ≥ 1.
(a) Prove that a subset A of ℓ_+^p is totally bounded if and only if
sup_{{ξ_k}∈A} Σ_{k=1}^∞ |ξ_k|^p < ∞ and lim_n sup_{{ξ_k}∈A} Σ_{k=n}^∞ |ξ_k|^p = 0
(i.e., A is bounded and Σ_{k=n}^∞ |ξ_k|^p → 0 as n → ∞ uniformly on A).
(b) Show that a subset A of ℓ_+^p is compact if and only if it is closed and satisfies the above conditions.
Problem 3.69. Let x_0 = {ξ_k(0)}_{k≥1} be an arbitrary point in ℓ_+^p and set
S_0 = {{ξ_k}_{k≥1} ∈ ℓ_+^p : |ξ_k| ≤ |ξ_k(0)| for every k}.
Use the preceding problem to show that S_0 is a compact subset of the metric space (ℓ_+^p, d_p). In particular, the set
S = {{ξ_k}_{k≥1} ∈ ℓ_+^2 : |ξ_k| ≤ 1/k for every k ≥ 1},
which is known as the Hilbert cube, is compact in (ℓ_+^2, d_2). Show that the Hilbert cube has an empty interior (hint: verify that (ℓ_+^2\S)⁻ = ℓ_+^2) and then conclude that it is nowhere dense.
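For the Hilbert cube the uniform tail condition of Problem 3.68 is explicit, since the extreme sequence {1/k} dominates every element of S. A numeric sketch (illustrative, with a finite cutoff standing in for the infinite sums):

```python
# Illustrative check of the total-boundedness criterion of Problem 3.68 for
# the Hilbert cube S = {x ∈ ℓ²: |ξ_k| ≤ 1/k}: the worst-case tail sums
# sup_{x∈S} Σ_{k≥n} |ξ_k|² = Σ_{k≥n} 1/k² decay to 0 uniformly on S.

def worst_tail(n, cutoff=100000):
    # the extreme point ξ_k = 1/k dominates every tail sum over S
    return sum(1.0 / k ** 2 for k in range(n, cutoff))

tails = [worst_tail(n) for n in (1, 10, 100, 1000)]
print(tails)  # starts near π²/6, then decays roughly like 1/n

assert all(a > b for a, b in zip(tails, tails[1:]))
assert tails[-1] < 2.0 / 1000
```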
Problem 3.70. Suppose X is a compact metric space, let Y be any metric space, and consider the metric space (C[X,Y], d_∞) of Example 3Y. Take an arbitrary real number γ > 0 and let C_γ[X,Y] denote the subset of C[X,Y] consisting of all Lipschitzian mappings of X into Y that have a Lipschitz constant less than or equal to γ.
(a) Show that C_γ[X,Y] is equicontinuous and closed in (C[X,Y], d_∞).
Hint: Set δ = ε/γ for equicontinuity. Use the Closed Set Theorem: if {f_n} is a C_γ[X,Y]-valued sequence and if f_n → f ∈ C[X,Y], then
d_Y(f(x), f(y)) ≤ d_Y(f(x), f_n(x)) + d_Y(f_n(x), f_n(y)) + d_Y(f_n(y), f(y)) ≤ 2 d_∞(f_n, f) + γ d_X(x, y).
From now on suppose the space Y is compact. Thus Y is complete (Corollary 3.81), and hence (C[X,Y], d_∞) is complete (Example 3Y).
(b) Show that C_γ[X,Y] is pointwise totally bounded and conclude that C_γ[X,Y] is a compact subset of the metric space (C[X,Y], d_∞).
Particular case (γ = 1): The set C_1[X,Y] of all contractions of a compact metric space X into a compact metric space Y is a compact subset of (C[X,Y], d_∞). Let I[X,Y] denote the set of all isometries of a compact metric space X into a compact metric space Y, so that I[X,Y] ⊆ C_1[X,Y].
(c) Show that I[X,Y] is closed in (C[X,Y], d_∞) and conclude that I[X,Y] is compact in (C[X,Y], d_∞).
Hint: Apply the Closed Set Theorem: if {f_n} is an I[X,Y]-valued sequence that converges to f ∈ C[X,Y], then (Problem 3.1(b))
|d_X(x, y) - d_Y(f(x), f(y))| = |d_Y(f_n(x), f_n(y)) - d_Y(f(x), f(y))| ≤ d_Y(f_n(x), f(x)) + d_Y(f_n(y), f(y)) ≤ 2 d_∞(f_n, f).
4 Banach Spaces
Our purpose now is to put algebra and topology to work together. For instance, from algebra we get the notion of finite sums (either ordinary or direct sums of vectors, linear manifolds, or linear transformations), and from topology the notion of convergent sequences. If algebraic and topological structures are suitably laid on the same underlying set, then we may consider the concepts of infinite sums and convergent series. More importantly, as continuity plays a central role in the theory of topological spaces, and linear transformations play a central role in the theory of linear spaces, when algebra and topology are properly combined they yield the concept of a continuous linear transformation: the very central theme of this book.
4.1
Normed Spaces
To begin with let us point out, once and for all, that throughout this chapter F will denote either the real field R or the complex field C, both equipped with their usual topologies induced by their usual metrics. If we intend to combine algebra and topology so that a given set is endowed with both algebraic and topological structures, then we might simply equip a linear space with some metric, and hence it would become a linear space that is also a metric space. However, an arbitrary metric on a linear space may induce a topological structure that has nothing to do with the algebraic structure (i.e., these structures may live apart on the same underlying set). A richer and more useful structure is obtained when the metric recognizes the operations of vector addition and scalar
multiplication that come with the linear space, and incorporates these operations in its own definition. With this in mind, let us first define a couple of concepts. A metric (or a pseudometric) d on a linear space X over F is said to be additively invariant if
d(x, y) = d(x + z, y + z)
for every x, y and z in X (which means that the translation mapping X → X defined by x ↦ x + z for any z ∈ X is an isometry). If d is such that
d(αx, αy) = |α| d(x, y)
for every x and y in X and every α in F, then the metric d is called absolutely homogeneous. A program for equipping a linear space with a metric that has the above "linear-like" properties goes as follows. Let p: X → R be a real-valued functional on a linear space X over F (recall: F is either R or C, so that R is always embedded in F). It is nonnegative homogeneous if
p(αx) = α p(x)
for every x in X and every nonnegative (real) scalar α in F, and subadditive if
p(x + y) ≤ p(x) + p(y)
for every x and y in X. If p is both nonnegative homogeneous and subadditive, then it is called a sublinear functional. If
P(ax) = IaIP(x) for every x in X and a in IF, then p is absolutely homogeneous. A subadditive absolutely homogeneous functional is a convex functional. (Note that this includes the classical definition of a convex functional: if p : X -- R is convex, then p(ax +
fix) < a p(x) + fl p(x) for every x, y E X and every a E [0, 1] with fi = I - a.) If
p(x)>0 for all x in X, then p is nonnegative. A nonnegative convex functional is a seminorm (or a pseudonorm - i.e., a nonnegative absolutely homogeneous subadditive functional). If p(x) > 0 whenever x 34 0, then p is called positive. A positive seminorm is a norm (i.e., a positive absolutely homogeneous subadditive functional). Summing up: A norm is a real-valued functional on a linear space with the following four properties, called the norm axioms. Definition 4.1. Let X be a linear space over F. A real-valued function
‖·‖: X → R
is a norm on X if the following conditions are satisfied for all vectors x and y in X and all scalars α in F:

(i) ‖x‖ ≥ 0 (nonnegativeness),
(ii) ‖x‖ > 0 if x ≠ 0 (positiveness),
(iii) ‖αx‖ = |α| ‖x‖ (absolute homogeneity),
(iv) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (subadditivity - triangle inequality).
A linear space X equipped with a norm on it is a normed space (synonyms: normed linear space or normed vector space). If X is a real or complex linear space (so that
F = R or F = C) equipped with a norm on it, then it is referred to as a real or complex normed space, respectively.
Note that these are not independent axioms. For instance, axiom (i) can be derived from axioms (ii) and (iii): an absolutely homogeneous positive functional is necessarily nonnegative. Indeed, setting α = 0 in (iii) we get ‖0‖ = 0 and, conversely, x = 0 whenever ‖x‖ = 0 by positiveness in (ii). Therefore, if ‖·‖: X → R is a norm, then ‖x‖ = 0 if and only if x = 0.
Proposition 4.2. If ‖·‖: X → R is a norm on a linear space X, then the function d: X × X → R, defined by

d(x, y) = ‖x − y‖

for every x, y ∈ X, is a metric on X.

Proof. From (i) and (ii) in Definition 4.1 we get the metric axiom (i) of Definition 3.1. Positiveness (ii) and absolute homogeneity (iii) of Definition 4.1 imply positiveness (ii) and symmetry of Definition 3.1, respectively. Finally, the triangle inequality (iv) of Definition 3.1 follows from the triangle inequality (iv) and absolute homogeneity (iii) of Definition 4.1. □
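As a quick numerical check of Proposition 4.2, one can build the metric d(x, y) = ‖x − y‖ from a concrete norm and test the metric axioms together with the two "linear-like" properties singled out above. This is a Python sketch of ours, not part of the text; all helper names are our own.

```python
import math

# A norm on R^2 (Definition 4.1) and the metric it generates (Proposition 4.2).
def norm_2(v):
    return math.hypot(v[0], v[1])

def metric_from_norm(norm):
    """d(x, y) = ||x - y||, as in Proposition 4.2."""
    def d(x, y):
        return norm(tuple(a - b for a, b in zip(x, y)))
    return d

d = metric_from_norm(norm_2)
x, y, z = (1.0, 2.0), (4.0, 6.0), (-3.0, 0.5)

# Metric axioms:
assert d(x, y) == 5.0 and d(x, x) == 0.0
assert d(x, y) == d(y, x)                    # symmetry
assert d(x, z) <= d(x, y) + d(y, z)          # triangle inequality

# The two "linear-like" properties that Proposition 4.3 characterizes:
shift = lambda v: (v[0] + z[0], v[1] + z[1])
assert math.isclose(d(shift(x), shift(y)), d(x, y))          # additive invariance
a = -2.5
scale = lambda v: (a * v[0], a * v[1])
assert math.isclose(d(scale(x), scale(y)), abs(a) * d(x, y)) # absolute homogeneity
```

Any norm could replace `norm_2` here; only the norm axioms are used.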
A word on notation and terminology. According to Definition 4.1 a normed space actually is an ordered pair (X, ‖·‖), where X is a linear space and ‖·‖ is a norm on X. As in the case of a metric space, we shall refer to a normed space in several ways. We may speak of X itself as a normed space when the norm ‖·‖ is either clear in the context or immaterial and, in this case, we shall simply say "X is a normed space". However, in order to avoid confusion among different normed spaces, we may occasionally insert a subscript on the norms (e.g., (X, ‖·‖_X) and (Y, ‖·‖_Y)). If a linear space X can be equipped with more than one norm, say ‖·‖_1 and ‖·‖_2, then (X, ‖·‖_1) and (X, ‖·‖_2) will represent different normed spaces with the same linear space X. The metric d of Proposition 4.2 is the metric generated by the norm ‖·‖, and so a normed space is a special kind of linear metric space. Whenever we refer to the topological (metric) structure of a normed space (X, ‖·‖) it will always be understood that such a topology on X is that induced by the metric d generated by the norm ‖·‖. This is the so-called norm topology. Note that the norm ‖x‖ of every vector x in a normed space (X, ‖·‖) is precisely the distance (in the norm topology)
between x and the origin 0 of the linear space X (i.e., ‖x‖ = d(x, 0), where d is the metric generated by the norm ‖·‖). Proposition 4.2 says that every norm on a linear space generates a metric, but an arbitrary metric on a linear space may not be generated by any norm on it. The next proposition tells us when a metric on a linear space is generated by a norm.

Proposition 4.3. Let X be a linear space. A metric on X is generated by a norm on X if and only if it is additively invariant and absolutely homogeneous. Moreover, for each additively invariant and absolutely homogeneous metric on X there exists a unique norm on X that generates it.

Proof. If d_‖·‖ is a metric on a normed space X generated by a norm ‖·‖ on X, then it is additively invariant (d_‖·‖(x, y) = ‖x − y‖ = ‖x + z − (y + z)‖ = d_‖·‖(x + z, y + z) for every x, y and z in X) and absolutely homogeneous (d_‖·‖(αx, αy) = ‖αx − αy‖ = ‖α(x − y)‖ = |α| ‖x − y‖ = |α| d_‖·‖(x, y) for every x and y in X and every scalar α). Conversely, if d is an additively invariant and absolutely homogeneous metric on a linear space X, then the function ‖·‖_d: X → R defined by ‖x‖_d = d(x, 0) for every x in X is a norm on X. Indeed, properties (i) and (ii) of Definition 4.1 are trivially verified by the first two metric axioms in Definition 3.1. Properties (iii) and (iv) of Definition 4.1 follow from absolute homogeneity (‖αx‖_d = d(αx, 0) = |α| d(x, 0) = |α| ‖x‖_d for every x in X and every scalar α) and additive invariance (‖x + y‖_d = d(x + y, 0) = d(x, −y) ≤ d(x, 0) + d(0, −y) = d(x, 0) + d(0, y) = ‖x‖_d + ‖y‖_d for every x and y in X). This norm ‖·‖_d on X clearly generates the metric d (for ‖x − y‖_d = d(x − y, 0) = d(x, y) for every x and y in X). Uniqueness is straightforward: if ‖·‖_1 and ‖·‖_2 both generate d, then ‖x‖_1 = d(x, 0) = ‖x‖_2 for all x in X. □
Let (X, ‖·‖) be a normed space. By Proposition 4.2 and Problem 3.1 it follows at once that

| ‖x‖ − ‖y‖ | ≤ ‖x − y‖

for every x, y ∈ X. Thus the norm ‖·‖: X → R is a continuous mapping with respect to the norm topology of X (see Problem 3.34). In fact, the above inequality says that every norm is a contraction (thus Lipschitzian and hence uniformly continuous). Therefore (cf. Corollary 3.8 and Lemma 3.43), a norm preserves convergence: if x_n → x in the norm topology of X, then ‖x_n‖ → ‖x‖ in R; and it also preserves Cauchy sequences: if {x_n} is a Cauchy sequence in X with respect to the metric generated by the norm on X, then {‖x_n‖} is a Cauchy sequence in R. A Banach space is a complete normed space. Obviously, completeness refers to the norm topology: a Banach space is a normed space that is complete as a metric space with respect to the metric generated by the norm. A real or complex Banach space is a complete real or complex normed space, respectively.
Let X be a linear space and let {x_n} be an X-valued sequence indexed by N (or by N_0). For each n ≥ 1 set

y_n = Σ_{i=1}^{n} x_i

in X, so that {y_n}_{n=1}^∞ is again an X-valued sequence. This is called the sequence of partial sums of {x_n}_{n=1}^∞. Now equip X with a norm ‖·‖. If the sequence of partial sums {y_n}_{n=1}^∞ converges in the normed space X to a point y in X (i.e., if ‖y_n − y‖ → 0), then we say that {x_n}_{n=1}^∞ is a summable sequence (or that the infinite series Σ_{i=1}^∞ x_i converges in X to y - notation: y = Σ_{i=1}^∞ x_i). If the real-valued sequence {‖x_n‖}_{n=1}^∞ is summable (i.e., if the infinite series Σ_{i=1}^∞ ‖x_i‖ converges in R or, equivalently, if Σ_{i=1}^∞ ‖x_i‖ < ∞ - see Problem 3.11), then we say that {x_n}_{n=1}^∞ is an absolutely summable sequence (or that the infinite series Σ_{i=1}^∞ x_i is absolutely convergent).
Proposition 4.4. A normed space is a Banach space if and only if every absolutely summable sequence is summable.

Proof. Let (X, ‖·‖) be a normed space and let {x_n}_{n=0}^∞ be an arbitrary X-valued sequence.

(a) Consider the sequence {y_n}_{n=0}^∞ of partial sums of {x_n}_{n=0}^∞,

y_n = Σ_{i=0}^{n} x_i

in X for each n ≥ 0. It is readily verified by induction (with a little help from the triangle inequality) that

‖y_{n+k} − y_n‖ = ‖Σ_{i=n+1}^{n+k} x_i‖ ≤ Σ_{i=n+1}^{n+k} ‖x_i‖

for every pair of integers n ≥ 0 and k ≥ 1. Suppose {x_n}_{n=0}^∞ is an absolutely summable sequence (i.e., Σ_{i=0}^∞ ‖x_i‖ < ∞), so that

0 ≤ sup_{k≥1} ‖y_{n+k} − y_n‖ ≤ Σ_{i=n+1}^∞ ‖x_i‖ → 0 as n → ∞,

and hence lim_n sup_{k≥1} ‖y_{n+k} − y_n‖ = 0 (Problems 3.10(c) and 3.11). Equivalently, {y_n}_{n=0}^∞ is a Cauchy sequence in X (Problem 3.51). Therefore, if X is a Banach space, then {y_n} converges in X, which means that {x_n} is a summable sequence. Conclusion: An arbitrary absolutely summable sequence in a Banach space is summable.
(b) Conversely, suppose {x_n}_{n=0}^∞ is a Cauchy sequence. According to Problem 3.52(c), {x_n} has a subsequence {x_{n_k}}_{k=0}^∞ of bounded variation. Set z_0 = x_{n_0} and z_{k+1} = x_{n_{k+1}} − x_{n_k} in X, so that x_{n_{k+1}} = x_{n_k} + z_{k+1}, for every k ≥ 0. Thus

x_{n_k} = Σ_{i=0}^{k} z_i

for every k ≥ 0 (see Problem 2.19). Since {x_{n_k}}_{k=0}^∞ is of bounded variation (i.e., Σ_{k=0}^∞ ‖x_{n_{k+1}} − x_{n_k}‖ < ∞), it follows that {z_k}_{k=0}^∞ is an absolutely summable sequence in X. If every absolutely summable sequence in X is summable, then {z_k}_{k=0}^∞ is a summable sequence, which implies that the subsequence {x_{n_k}}_{k=0}^∞ converges in X. Thus (see Proposition 3.39(c)) the Cauchy sequence {x_n}_{n=0}^∞ converges in X. Conclusion: Every Cauchy sequence in X converges in X, which means that the normed space X is complete (i.e., X is a Banach space). □
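Proposition 4.4 can be watched in action numerically. The following Python sketch uses a sequence of our own choosing in the Banach space R² with the Euclidean norm; nothing in it comes from the text.

```python
import math

# In the Banach space R^2 (Euclidean norm) take x_n = ((-1/2)^n, (1/3)^n).
# The sequence is absolutely summable, since sum_n ||x_n|| <= sum_n (2^-n + 3^-n) < oo,
# so Proposition 4.4 predicts that its partial sums converge in R^2.

def norm(v):
    return math.hypot(v[0], v[1])

def x(n):
    return ((-0.5) ** n, (1.0 / 3.0) ** n)

def y(n):
    """Partial sum y_n = x_0 + x_1 + ... + x_n."""
    return (sum(x(i)[0] for i in range(n + 1)),
            sum(x(i)[1] for i in range(n + 1)))

# Absolute summability: the tails of sum ||x_i|| are small.
tail = sum(norm(x(i)) for i in range(10, 200))
assert tail < 3e-3

# Summability: the partial sums converge to the geometric-series limit (2/3, 3/2).
limit = (2.0 / 3.0, 1.5)
assert norm((y(60)[0] - limit[0], y(60)[1] - limit[1])) < 1e-12
```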
4.2 Examples
Many of the examples of metric spaces exhibited in Chapter 3 are in fact examples of normed spaces: linear spaces equipped with an additively invariant and absolutely homogeneous metric.
Example 4A. Let F^n be the linear space over F of Example 2D (with either F = R or F = C). Consider the functions ‖·‖_p: F^n → R (for each real number p ≥ 1) and ‖·‖_∞: F^n → R defined by

‖x‖_p = (Σ_{i=1}^{n} |ξ_i|^p)^{1/p}    and    ‖x‖_∞ = max_{1≤i≤n} |ξ_i|

for every x = (ξ_1, ..., ξ_n) in F^n. It is easy to verify that these are norms on F^n (the triangle inequality follows from the Minkowski inequality of Problem 3.4(a)), and also that the metrics generated by each of them are precisely the metrics d_p (for p ≥ 1) and d_∞ of Example 3A. Since F^n, when equipped with any of these metrics, is a complete metric space (Example 3R(a)), it follows that F^n is a Banach space when equipped with any of the norms ‖·‖_p or ‖·‖_∞. In particular, for n = 1 all of these norms reduce to the absolute value function |·|: F → R, which is the usual norm on F. The norm ‖·‖_2 plays a special role. On R^n it is the Euclidean norm, and the real Banach space (R^n, ‖·‖_2) is the n-dimensional Euclidean space. The complex Banach space (C^n, ‖·‖_2) is the n-dimensional unitary space.
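The norms of Example 4A are easy to experiment with. Here is a Python sketch of ours (the function names are not the book's) computing ‖·‖_p and ‖·‖_∞ on F^n and checking the triangle inequality:

```python
# The p-norms and the sup-norm of Example 4A on F^n (names are ours).

def p_norm(x, p):
    """||x||_p = (sum |x_i|^p)^(1/p), for real p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sup_norm(x):
    """||x||_oo = max |x_i|."""
    return max(abs(t) for t in x)

x = (3.0, -4.0, 12.0)
y = (1.0, 2.0, -2.0)

assert p_norm(x, 2) == 13.0          # sqrt(9 + 16 + 144)
assert p_norm(x, 1) == 19.0
assert sup_norm(x) == 12.0

# Triangle inequality (Minkowski) for several values of p:
s = tuple(a + b for a, b in zip(x, y))
for p in (1, 1.5, 2, 3, 10):
    assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p)

# ||x||_oo is the limit of ||x||_p as p -> oo:
assert abs(p_norm(x, 200) - sup_norm(x)) < 0.1
```

The last assertion hints at why the max-norm is written with the subscript ∞.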
Example 4B. According to Example 2E the set F^N (or F^N_0) of all scalar-valued sequences is a linear space over F. Now consider the subsets ℓ^p_+ and ℓ^∞_+ of F^N defined as in Example 3B. These are linear manifolds of F^N (vector addition and scalar multiplication - pointwise defined - of p-summable or bounded sequences are again p-summable or bounded sequences, respectively), and hence ℓ^p_+ and ℓ^∞_+ are linear spaces over F. For each p ≥ 1 consider the function ‖·‖_p: ℓ^p_+ → R defined by

‖x‖_p = (Σ_{k=1}^∞ |ξ_k|^p)^{1/p}

for every x = {ξ_k}_{k∈N} in ℓ^p_+, and the function ‖·‖_∞: ℓ^∞_+ → R given by

‖x‖_∞ = sup_{k∈N} |ξ_k|

for every x = {ξ_k}_{k∈N} in ℓ^∞_+. It is readily verified that ‖·‖_p is a norm on ℓ^p_+ and ‖·‖_∞ is a norm on ℓ^∞_+ (as before, the Minkowski inequality leads to the triangle inequality). Moreover, the norm ‖·‖_p generates the metric d_p and the norm ‖·‖_∞ generates the metric d_∞ of Example 3B. These are the usual norms on ℓ^p_+ and ℓ^∞_+. Since (ℓ^p_+, d_p) is a complete metric space, and since (ℓ^∞_+, d_∞) also is a complete metric space (see Examples 3R(b) and 3S), it follows that (ℓ^p_+, ‖·‖_p) and (ℓ^∞_+, ‖·‖_∞) are Banach spaces. Similarly (see Examples 2E, 3B, 3R(b) and 3S again), (ℓ^p, ‖·‖_p) and (ℓ^∞, ‖·‖_∞) are Banach spaces, where the functions ‖·‖_p: ℓ^p → R and ‖·‖_∞: ℓ^∞ → R, defined by

‖x‖_p = (Σ_{k=−∞}^∞ |ξ_k|^p)^{1/p}    and    ‖x‖_∞ = sup_{k∈Z} |ξ_k|

for every x = {ξ_k}_{k∈Z} in ℓ^p or in ℓ^∞, respectively, are the usual norms on the linear manifolds ℓ^p and ℓ^∞ of the linear space F^Z.
Let X be a linear space. A real-valued function ‖·‖: X → R is a seminorm (or a pseudonorm) on X if it satisfies the three axioms (i), (iii) and (iv) of Definition 4.1. It is worth noticing that the inequality | ‖x‖ − ‖y‖ | ≤ ‖x − y‖ for every x, y in X still holds for a seminorm. The difference between a norm and a seminorm is that a seminorm does not necessarily satisfy axiom (ii) of Definition 4.1 (i.e., a seminorm surely vanishes at the origin but it may also vanish at a nonzero vector). A seminorm generates a pseudometric as in Proposition 4.2: if ‖·‖ is a seminorm on X, then d(x, y) = ‖x − y‖ for every x, y ∈ X defines an additively invariant and absolutely homogeneous pseudometric on X. Moreover, if a pseudometric is additively invariant and absolutely homogeneous, then it is generated by a seminorm as in Proposition 4.3: if d is an additively invariant and absolutely homogeneous pseudometric on X, then ‖x‖ = d(x, 0) for every x ∈ X defines a seminorm on X such that d(x, y) = ‖x − y‖ for every x, y ∈ X.
Proposition 4.5. Let ‖·‖ be a seminorm on a linear space X. The set N of all vectors x in X for which ‖x‖ = 0,

N = {x ∈ X: ‖x‖ = 0},

is a linear manifold of X. Consider the quotient space X/N and set

‖[x]‖~ = ‖x‖

for every coset [x] in X/N, where x is an arbitrary vector in [x]. This defines a norm on the linear space X/N, so that (X/N, ‖·‖~) is a normed space.

Proof. Indeed, N is a linear manifold of X (if u, v ∈ N, then 0 ≤ ‖u + v‖ ≤ ‖u‖ + ‖v‖ = 0 and 0 ≤ ‖αu‖ = |α| ‖u‖ = 0, so that u + v ∈ N and αu ∈ N for every scalar α). Now consider the quotient space X/N of X modulo N as in Example 2H, which is a linear space over the same scalar field of X. Take an arbitrary coset

[x] = x + N = {x' ∈ X: x' = x + z for some z ∈ N}

in X/N and note that ‖u‖ = ‖v‖ for every u and v in [x] (if u, v ∈ [x], then u − v ∈ N and 0 ≤ | ‖u‖ − ‖v‖ | ≤ ‖u − v‖ = 0). Then set

‖[x]‖~ = ‖x‖

for an arbitrary x ∈ [x] (i.e., for an arbitrary representative of the equivalence class [x]), which defines a function from X/N to R. It is clear that ‖[x]‖~ ≥ 0. If ‖[x]‖~ = 0, then [x] = N = [0], the origin of the linear space X/N (reason: ‖[x]‖~ = 0 implies that every x' in [x] belongs to N and also that every u in N belongs to [x]). Moreover,

‖α[x]‖~ = ‖[αx]‖~ = ‖αx‖ = |α| ‖x‖ = |α| ‖[x]‖~,

‖[x] + [y]‖~ = ‖[x + y]‖~ = ‖x + y‖ ≤ ‖x‖ + ‖y‖ = ‖[x]‖~ + ‖[y]‖~,

for every [x], [y] ∈ X/N and every scalar α (Example 2H). Therefore (Definition 4.1), ‖·‖~ is a norm on the linear space X/N. □
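A finite-dimensional sketch of Proposition 4.5 in Python (all names here are ours, and the particular seminorm is our choice): on X = R² take the seminorm p(x) = |x_1|, which vanishes on the linear manifold N = {(0, t): t ∈ R}; cosets are determined by the first coordinate alone, and the induced quotient norm is positive.

```python
# Proposition 4.5 in miniature: X = R^2, seminorm p(v) = |v_1|,
# N = {(0, t)}, and the quotient norm on X/N (identified with R).

def seminorm(v):
    return abs(v[0])

def coset(v):
    """Represent the coset v + N by the first coordinate (a complete invariant)."""
    return v[0]

def quotient_norm(c):
    return abs(c)

u, v = (0.0, 5.0), (0.0, -1.0)
assert seminorm(u) == 0.0 and seminorm(v) == 0.0        # nonzero vectors in N
assert seminorm((u[0] + v[0], u[1] + v[1])) == 0.0      # N is a linear manifold

w, wp = (2.0, 7.0), (2.0, -3.0)
assert coset(w) == coset(wp)           # w' - w in N  =>  same coset ...
assert seminorm(w) == seminorm(wp)     # ... so ||[w]|| is well defined

assert quotient_norm(coset(w)) == seminorm(w)           # ||[w]||~ = ||w||
assert quotient_norm(coset(u)) == 0.0 and coset(u) == coset((0.0, 0.0))  # [u] = [0]
```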
Remark: Note that the relation ~ on X defined by

x' ~ x  if  ‖x' − x‖ = 0,    or, equivalently,    x' ~ x  if  x' − x ∈ N,

is a linear equivalence relation on the linear space X in the sense of Example 2G. Consider the quotient space of X modulo ~, X/~. If vector addition and scalar multiplication are defined in X/~ as in Example 2H, then X/~ is a linear space that coincides with X/N. In this case (i.e., if X is a linear space and ‖·‖ is a seminorm on X), the metric d~ on X/N generated by the norm ‖·‖~ is precisely the metric d' on X/~ of Proposition 3.3 obtained from the pseudometric d on X generated by the seminorm ‖·‖.
Example 4C. Take an arbitrary real number p ≥ 1 and consider the setup of Example 3E: r^p(S) is the set of all F-valued Riemann p-integrable functions on a nondegenerate interval S of the real line R. This is a linear manifold of the linear space F^S (see Example 2E), and hence r^p(S) is a linear space over F. Indeed, addition and scalar multiplication of Riemann p-integrable functions on S are again Riemann p-integrable functions on S (Minkowski inequality). It is clear that the pseudometric δ_p on r^p(S) defined in Example 3E is additively invariant and absolutely homogeneous. Thus δ_p is generated by a seminorm |·|_p on r^p(S),

|x|_p = δ_p(x, 0) = (∫_S |x(s)|^p ds)^{1/p}

for every x ∈ r^p(S), so that δ_p(x, y) = |x − y|_p for every x, y ∈ r^p(S). Now consider the linear manifold N = {x ∈ r^p(S): |x|_p = 0} and set R^p(S) = r^p(S)/N, the quotient space of r^p(S) modulo N (i.e., the collection of all equivalence classes [x] = {x' ∈ r^p(S): |x' − x|_p = 0} for every x ∈ r^p(S)). By Proposition 4.5 the function ‖·‖_p: R^p(S) → R, defined by ‖[x]‖_p = |x|_p for every [x] in R^p(S), is a norm on R^p(S) (where x is an arbitrary representative of the equivalence class [x]). This is the usual norm on R^p(S). Note that R^p(S) is precisely the quotient space r^p(S)/~ of Example 3E (see the remark that follows Proposition 4.5). Moreover, ‖·‖_p is the norm on R^p(S) that generates the usual metric d_p of Example 3E:

d_p([x], [y]) = ‖[x] − [y]‖_p = ‖[x − y]‖_p = |x − y|_p = δ_p(x, y)

for every [x], [y] ∈ R^p(S), where x and y are arbitrary vectors in [x] and [y], respectively. According to common usage we shall write x ∈ R^p(S) instead of [x] ∈ R^p(S), and also

‖x‖_p = d_p(x, 0) = (∫_S |x(s)|^p ds)^{1/p}

for every x ∈ R^p(S) to represent the norm ‖·‖_p on R^p(S). Therefore, (R^p(S), ‖·‖_p) is a normed space but not a Banach space (Problem 3.62). Its completion, of course, is: (L^p(S), ‖·‖_p) is a Banach space. (We shall discuss the completion of a normed space in Section 4.7.)
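Incompleteness of these integral norms can be seen numerically with the standard witness, a sequence of ever-steeper continuous ramps that is Cauchy in ‖·‖_1 but whose only candidate limit is a step function (this construction also underlies Example 4D below; the Python code and the discretization are ours).

```python
# Continuous ramps x_n: 0 on [0, 1/2], slope n on [1/2, 1/2 + 1/n], then 1.
# They are Cauchy in ||.||_1, yet their pointwise limit is discontinuous.

def x(n, t):
    if t <= 0.5:
        return 0.0
    if t >= 0.5 + 1.0 / n:
        return 1.0
    return n * (t - 0.5)

def one_norm_diff(n, m, steps=20_000):
    """Midpoint Riemann sum approximating integral_0^1 |x_n - x_m| dt."""
    h = 1.0 / steps
    return sum(abs(x(n, (i + 0.5) * h) - x(m, (i + 0.5) * h)) for i in range(steps)) * h

# ||x_n - x_m||_1 <= 1/(2 min(n, m)), so the differences shrink down the sequence:
assert one_norm_diff(10, 20) > one_norm_diff(100, 200)
assert one_norm_diff(100, 200) < 0.01

# ... but the only candidate limit is the discontinuous indicator of (1/2, 1]:
assert x(1000, 0.4999) == 0.0 and x(1000, 0.502) == 1.0
```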
Example 4D. Let C[0, 1] be the set of all F-valued continuous functions on the interval [0, 1] as in Example 3D. Again, this is a linear manifold of the linear space F^[0,1] of Example 2E (addition and scalar multiplication of continuous functions are continuous functions), and hence C[0, 1] is a linear space over F. In fact, C[0, 1] is a linear manifold of the linear space r^p[0, 1] of the previous example (every continuous function on [0, 1] is Riemann p-integrable). For each p ≥ 1 consider the function ‖·‖_p: C[0, 1] → R defined by

‖x‖_p = (∫_0^1 |x(t)|^p dt)^{1/p}

for every x ∈ C[0, 1]. This is the norm on C[0, 1] that generates the metric d_p of Example 3D, so that (C[0, 1], ‖·‖_p) is a normed space. According to Problem 3.58, (C[0, 1], ‖·‖_p) is not a Banach space. Recall that C[0, 1]
can be viewed as a subset of R^p[0, 1] (in the sense of Problem 3.40) and, as such, it can be shown that C[0, 1] is dense in (R^p[0, 1], ‖·‖_p). Therefore (see Problems 3.38(g) and 3.62), the Banach space (L^p[0, 1], ‖·‖_p) is a completion of (C[0, 1], ‖·‖_p).

Let {X_γ}_{γ∈Γ} be an indexed family of linear spaces over the same field F. The set ⊕_{γ∈Γ} X_γ of all indexed families {x_γ}_{γ∈Γ}, where x_γ ∈ X_γ for each γ ∈ Γ, becomes a linear space over F if vector addition and scalar multiplication are defined on ⊕_{γ∈Γ} X_γ as

{x_γ}_{γ∈Γ} ⊕ {y_γ}_{γ∈Γ} = {x_γ + y_γ}_{γ∈Γ}    and    α{x_γ}_{γ∈Γ} = {αx_γ}_{γ∈Γ}

for every {x_γ}_{γ∈Γ} and {y_γ}_{γ∈Γ} in ⊕_{γ∈Γ} X_γ and every α in F. This is the direct sum (or the full direct sum) of the family {X_γ}_{γ∈Γ}. The underlying set of the linear space
⊕_{γ∈Γ} X_γ is the Cartesian product Π_{γ∈Γ} X_γ of the underlying sets of each linear space X_γ (cf. Section 2.8).

Example 4E. Let {(X_i, ‖·‖_i)}_{i=1}^n be a finite collection of normed spaces, where the linear spaces X_i are all over the same field F, and let ⊕_{i=1}^n X_i be the direct sum of the family {X_i}_{i=1}^n. Consider the functions ‖·‖_p: ⊕_{i=1}^n X_i → R (for each real number p ≥ 1) and ‖·‖_∞: ⊕_{i=1}^n X_i → R defined by

‖x‖_p = (Σ_{i=1}^{n} ‖x_i‖_i^p)^{1/p}    and    ‖x‖_∞ = max_{1≤i≤n} ‖x_i‖_i

for every x = (x_1, ..., x_n) in ⊕_{i=1}^n X_i. It is easy to verify that these are norms on the direct sum ⊕_{i=1}^n X_i. For instance, the triangle inequality for the norm ‖·‖_p comes from the Minkowski inequality (Problem 3.4): for every x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in ⊕_{i=1}^n X_i,

‖x ⊕ y‖_p = (Σ_{i=1}^{n} ‖x_i + y_i‖_i^p)^{1/p} ≤ (Σ_{i=1}^{n} (‖x_i‖_i + ‖y_i‖_i)^p)^{1/p} ≤ (Σ_{i=1}^{n} ‖x_i‖_i^p)^{1/p} + (Σ_{i=1}^{n} ‖y_i‖_i^p)^{1/p} = ‖x‖_p + ‖y‖_p.

Moreover, these norms generate the metrics d_p and d_∞ of Problem 3.9 (recall: the underlying set of the linear space ⊕_{i=1}^n X_i is the Cartesian product Π_{i=1}^n X_i of the underlying sets of each linear space X_i), and therefore (Problem 3.60)

(⊕_{i=1}^n X_i, ‖·‖_p) and (⊕_{i=1}^n X_i, ‖·‖_∞) are Banach spaces if and only if each (X_i, ‖·‖_i) is a Banach space.

If the normed spaces (X_i, ‖·‖_i) coincide with a fixed normed space (X, ‖·‖), then ⊕_{i=1}^n X is the direct sum of n copies of X (a linear space whose underlying set is the Cartesian product Π_{i=1}^n X = X^n of n copies of the underlying set of the linear space X - Section 1.7). We can identify ⊕_{i=1}^n X with X^n (where the linear operations on X^n are defined coordinatewise) so that

(X^n, ‖·‖_p) and (X^n, ‖·‖_∞) are Banach spaces

whenever (X, ‖·‖) is a Banach space. Note that this generalizes the Banach spaces of Example 4A.
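The direct-sum norms of Example 4E are easy to compute once each coordinate norm is given. The Python sketch below pairs R² with the Euclidean norm and R³ with the sup-norm; the pairing and all names are our own choices.

```python
import math

# Norms on a finite direct sum X1 (+) X2 (Example 4E), with X1 = R^2
# under the Euclidean norm and X2 = R^3 under the sup-norm.

def norm1(v):                       # norm on X1
    return math.hypot(*v)

def norm2(v):                       # norm on X2
    return max(abs(t) for t in v)

def direct_sum_norm_p(x, p):
    """||x||_p = (sum_i ||x_i||_i^p)^(1/p)."""
    coords = (norm1(x[0]), norm2(x[1]))
    return sum(c ** p for c in coords) ** (1.0 / p)

def direct_sum_norm_sup(x):
    """||x||_oo = max_i ||x_i||_i."""
    return max(norm1(x[0]), norm2(x[1]))

x = ((3.0, 4.0), (1.0, -2.0, 0.5))          # ||x_1||_1 = 5, ||x_2||_2 = 2
assert direct_sum_norm_p(x, 1) == 7.0
assert math.isclose(direct_sum_norm_p(x, 2), math.sqrt(29.0))
assert direct_sum_norm_sup(x) == 5.0

# Triangle inequality on the direct sum (Minkowski, as in the text):
y = ((0.0, 1.0), (2.0, 2.0, 2.0))
s = (tuple(a + b for a, b in zip(x[0], y[0])),
     tuple(a + b for a, b in zip(x[1], y[1])))
for p in (1, 2, 3):
    assert direct_sum_norm_p(s, p) <= direct_sum_norm_p(x, p) + direct_sum_norm_p(y, p)
```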
Example 4F. Let {(X_k, ‖·‖_k)} be a countably infinite collection of normed spaces, where the linear spaces X_k are all over the same field F, and let ⊕_k X_k be the direct sum of {X_k}. For each p ≥ 1 consider the subset [⊕_k X_k]_p of ⊕_k X_k consisting of all p-summable families {x_k} in ⊕_k X_k. That is, {x_k} ∈ [⊕_k X_k]_p if and only if Σ_k ‖x_k‖_k^p < ∞, where Σ_k ‖x_k‖_k^p is the supremum of the set of all finite sums of positive numbers from {‖x_k‖_k^p}. This is a linear manifold of the linear space ⊕_k X_k, and so [⊕_k X_k]_p is a linear space over F. As in Example 4E, it is easy to show that the function ‖·‖_p: [⊕_k X_k]_p → R, defined by

‖x‖_p = (Σ_k ‖x_k‖_k^p)^{1/p}

for every x = {x_k} ∈ [⊕_k X_k]_p, is a norm on [⊕_k X_k]_p. Now consider the subset [⊕_k X_k]_∞ of ⊕_k X_k consisting of all bounded families {x_k} in ⊕_k X_k (i.e., {x_k} ∈ [⊕_k X_k]_∞ if and only if sup_k ‖x_k‖_k < ∞). This again is a linear manifold of the linear space ⊕_k X_k, so that [⊕_k X_k]_∞ is itself a linear space over F. It is readily verified that the function ‖·‖_∞: [⊕_k X_k]_∞ → R, defined by

‖x‖_∞ = sup_k ‖x_k‖_k

for every x = {x_k} ∈ [⊕_k X_k]_∞, is a norm on [⊕_k X_k]_∞. Moreover, it can also be shown that

([⊕_k X_k]_p, ‖·‖_p) and ([⊕_k X_k]_∞, ‖·‖_∞) are Banach spaces if and only if each (X_k, ‖·‖_k) is a Banach space

(hint: Example 3R(b)). Again, if the normed spaces (X_k, ‖·‖_k) coincide with a fixed normed space (X, ‖·‖), then ⊕_k X is the direct sum of countably infinite copies of X. (Recall that ⊕_k X is a linear space whose underlying set is the Cartesian product Π_k X of countably infinite copies of the underlying set of the linear space X, which coincides with X^N, X^N_0 or X^Z if the indices k run over N, N_0 or Z, respectively - Section 1.7.) As before, we shall adopt the identifications ⊕_{k∈N} X = X^N, ⊕_{k∈N_0} X = X^N_0 and ⊕_{k∈Z} X = X^Z (the linear operations on X^N, X^N_0 and X^Z are defined as in Example 2F). It is usual to denote [⊕_k X]_p in X^N (or in X^N_0) by ℓ^p_+(X): the linear manifold of X^N consisting of all p-summable X-valued sequences; and [⊕_k X]_∞ in X^N (or in X^N_0) by ℓ^∞_+(X): the linear manifold of X^N consisting of all bounded X-valued sequences. That is,

ℓ^p_+(X) = {{x_k} ∈ X^N: Σ_{k=1}^∞ ‖x_k‖^p < ∞},    ℓ^∞_+(X) = {{x_k} ∈ X^N: sup_{k∈N} ‖x_k‖ < ∞}.

The norms ‖x‖_p = (Σ_{k=1}^∞ ‖x_k‖^p)^{1/p} and ‖x‖_∞ = sup_{k∈N} ‖x_k‖, for every x = {x_k} either in ℓ^p_+(X) or in ℓ^∞_+(X), are the usual norms on ℓ^p_+(X) and ℓ^∞_+(X), respectively, and

(ℓ^p_+(X), ‖·‖_p) and (ℓ^∞_+(X), ‖·‖_∞) are Banach spaces

whenever (X, ‖·‖) is a Banach space. Similarly, [⊕_k X]_p in X^Z is denoted by ℓ^p(X) and [⊕_k X]_∞ in X^Z is denoted by ℓ^∞(X) and, when equipped with their usual norms ‖·‖_p and ‖·‖_∞, (ℓ^p(X), ‖·‖_p) and (ℓ^∞(X), ‖·‖_∞) are Banach spaces whenever (X, ‖·‖) is a Banach space. This example generalizes the Banach spaces of Example 4B.
4.3 Subspaces and Quotient Spaces
If (X, ‖·‖) is a normed space, and if M is a linear manifold of the linear space X, then it is easy to show that the restriction ‖·‖_M: M → R of the norm ‖·‖: X → R to M is a norm on M, so that (M, ‖·‖_M) is a normed space. Moreover, the metric d_M: M × M → R generated by the norm ‖·‖_M on M coincides with the restriction to M × M of the metric d: X × X → R generated by the norm ‖·‖ on X. Thus (M, d_M) is a subspace of the metric space (X, d). If a linear manifold of a normed space is regarded as a normed space, then it will be understood that the norm on it is the restricted norm ‖·‖_M. We shall drop the subscripts and write (M, ‖·‖) and (M, d) instead of (M, ‖·‖_M) and (M, d_M), respectively, and often refer to the normed space (M, ‖·‖) by simply saying that "M is a linear manifold of X".
Proposition 4.6. Let M be a linear manifold of a normed space X. If M is open in X, then M = X.

Proof. Since M is a linear manifold of the linear space X, the origin 0 of X lies in M. If M is open in X (in the norm topology, of course), then M includes a nonempty open ball with center at the origin. That is, {y ∈ X: ‖y‖ < ε} ⊆ M for some ε > 0. Take an arbitrary nonzero vector x in X and set z = ε(2‖x‖)^{−1}x ∈ X, so that ‖z‖ = ε/2, and hence z ∈ B_ε(0) ⊆ M. Thus x = (2‖x‖)ε^{−1}z lies in M (since M is a linear space). Therefore, every nonzero vector in X also lies in M. Conclusion: X ⊆ M, so that M = X (because M ⊆ X). □

This shows that a normed space X is itself the only open linear manifold of X. On the other hand, the closed linear manifolds of a normed space are far more interesting. They in fact are so important that we give them a name. A closed linear manifold of a normed space X is called a subspace of X. (Warning: A subspace of a metric space (X, d) is simply a subset of X equipped with the "same" metric d, while a subspace of a normed space (X, ‖·‖) is a linear manifold of X equipped with the "same" norm ‖·‖ that is closed in (X, ‖·‖).)

Proposition 4.7. A linear manifold of a Banach space X is itself a Banach space if and only if it is a subspace of X.

Proof. Corollary 3.41. □
Example 4G. As usual, let Y^S denote the collection of all functions of a nonempty set S into a set Y. If Y is a linear space over F, then Y^S is a linear space over F (Example 2F). Suppose Y is a normed space and consider the subset B[S, Y] of Y^S consisting of all bounded mappings of S into Y. This is a linear manifold of Y^S (vector addition and scalar multiplication of bounded mappings are bounded mappings), and hence B[S, Y] is a linear space over F. Now consider the function ‖·‖_∞: B[S, Y] → R defined by

‖f‖_∞ = sup_{s∈S} ‖f(s)‖

for every f ∈ B[S, Y], where ‖·‖: Y → R is the norm on Y. It is easy to verify that (B[S, Y], ‖·‖_∞) is a normed space. The norm ‖·‖_∞ on B[S, Y], which is referred to as the sup-norm, generates the sup-metric d_∞ of Example 3C. Thus, according to Example 3S, (B[S, Y], ‖·‖_∞) is a Banach space if and only if Y is a Banach space. Moreover, if X is a nonempty metric space and if BC[X, Y] is the subset of B[X, Y] made up of all continuous mappings from B[X, Y], then BC[X, Y] is a linear manifold of the linear space B[X, Y] (addition and scalar multiplication of continuous functions are continuous functions), and hence a linear space over F. According to Example 3N the linear manifold BC[X, Y] is closed in (B[X, Y], ‖·‖_∞), and so (BC[X, Y], ‖·‖_∞) is a subspace of the normed space (B[X, Y], ‖·‖_∞). Thus

(BC[X, Y], ‖·‖_∞) is a Banach space if and only if Y is a Banach space

(Example 3T - also see Proposition 4.7). If the metric space X is compact, then C[X, Y] = BC[X, Y], where C[X, Y] stands for the set of all continuous mappings of the compact metric space X into the normed space Y (Example 3Y). Therefore, (C[X, Y], ‖·‖_∞) is a Banach space if X is compact and Y is Banach. By setting X = [0, 1] equipped with the usual metric on R (which is compact - Heine-Borel Theorem), it follows that (C[0, 1], ‖·‖_∞) is a Banach space, where C[0, 1] = C([0, 1], F) is the linear space over F of all F-valued continuous functions defined on the closed interval [0, 1] (recall: (F, |·|) is a Banach space - Example 4A), and

‖x‖_∞ = sup_{0≤t≤1} |x(t)| = max_{0≤t≤1} |x(t)|    for every x ∈ C[0, 1]

(cf. Examples 3T and 3Y) is the sup-norm on C[0, 1].
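The sup-norm of Example 4G can be approximated on a fine grid; the Python sketch below (grid evaluation and names are our shortcuts, adequate for these smooth functions) also shows that convergence in ‖·‖_∞ is uniform convergence, which pointwise convergence alone does not give.

```python
import math

# The sup-norm on C[0,1] (Example 4G), approximated on a uniform grid.

def sup_norm(f, steps=100_000):
    return max(abs(f(i / steps)) for i in range(steps + 1))

f = lambda t: t * (1.0 - t)           # attains its max 1/4 at t = 1/2
g = lambda t: math.sin(math.pi * t)   # attains its max 1 at t = 1/2

assert abs(sup_norm(f) - 0.25) < 1e-9
assert abs(sup_norm(g) - 1.0) < 1e-9

# Convergence in the sup-norm is uniform convergence: f_n(t) = t^n tends to 0
# pointwise on [0, 1) but ||f_n||_oo = 1 for every n (the value at t = 1),
# so {f_n} does not converge to 0 in (C[0,1], ||.||_oo).
for n in (1, 5, 50):
    fn = lambda t, n=n: t ** n
    assert sup_norm(fn) == 1.0
```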
In any normed space X the zero linear manifold {0} and the whole space X are subspaces of X. If a subspace M is a proper subset of X, then it is said to be a proper subspace. A nontrivial subspace M of a normed space X is a nonzero proper subspace of it ({0} ≠ M ≠ X). The next proposition shows that, if the dimension of the linear space X is greater than one, then there are many nontrivial subspaces of the normed space X.
Proposition 4.8. Let X be a normed space.

(a) The closure M⁻ of a linear manifold M of X is a subspace of X.
(b) The intersection of an arbitrary nonempty collection of subspaces of X is again a subspace of X.

Proof. (a) The closure M⁻ of a linear manifold M of a normed space X is clearly a closed subset of X. What is left to be shown is that this closed subset of X is also a linear manifold of X. Take two arbitrary points x and y in M⁻. According to Proposition 3.27 there exist M-valued sequences {x_n} and {y_n} that converge to x and y, respectively, in the norm topology of X. Therefore (cf. Problem 4.1), the M-valued sequence {x_n + y_n} converges in X to x + y, and hence x + y ∈ M⁻ (see Proposition 3.27 again); similarly, {αx_n} converges in X to αx, so that αx ∈ M⁻ for every scalar α.

(b) The intersection of an arbitrary nonempty collection of linear manifolds of a linear space is a linear manifold (cf. Section 2.2), and the intersection of an arbitrary collection of closed subsets of a metric space is a closed set (cf. Theorem 3.22). Thus the intersection of an arbitrary nonempty collection of closed linear manifolds of a normed space is a closed linear manifold. □
Let A be a subset of a normed space X. The (linear) span of A, span A, was defined in Section 2.2 as the intersection of all linear manifolds of X that include A, which coincides with the smallest linear manifold of X that includes A. Recall that the smallest closed subset of X that includes span A is precisely its closure (span A)⁻ in X (by the very definition of closure). According to Proposition 4.8(a), (span A)⁻ is a subspace of X. This is the smallest closed linear manifold of X that includes A. Set

⋁A = (span A)⁻,

which is called the subspace spanned by A. If a subspace M of X (which may be X itself) is such that M = ⋁A for some subset A of X, then we say that A spans M or that A is a spanning set for M (warning: the same terminology of Section 2.2 but now with a different meaning); or still that A is a total set for M. Also note that the intersection of all subspaces of X that include A is the smallest subspace of X (see Proposition 4.8(b)) that includes A. Summing up: ⋁A is the smallest subspace of X that includes A, which coincides with the intersection of all subspaces of X that include A. It is readily verified that ⋁∅ = {0}, ⋁M = M⁻ for every linear manifold M of X, and A ⊆ ⋁A = ⋁(⋁A) for every subset A of X. Moreover, if A and B are subsets of X, then

A ⊆ B    implies    ⋁A ⊆ ⋁B.
Proposition 4.9. Let X be a normed space.

(a) A set A spans X if and only if every linear manifold of X that includes A is dense in X.

(b) X is separable if and only if it is spanned by a countable set.
Proof. (a) Let A be a subset of a normed space X. Take an arbitrary linear manifold M of X such that A ⊆ M. Recall: ⋁A ⊆ ⋁M = M⁻. Thus ⋁A = X implies M⁻ = X. Conversely, span A is a linear manifold of X that includes A. If every linear manifold of X that includes A is dense in X, then ⋁A = (span A)⁻ = X.

(b) Let X be a normed space. If X is separable as a metric space, then (by definition) there exists a countable set, say A ⊆ X, such that A⁻ = X. Since A ⊆ ⋁A and ⋁A = (⋁A)⁻ is closed, it follows that ⋁A = X. Conversely, suppose (span A)⁻ = X for some countable subset A of X. Recall that span A is the set of all (finite) linear combinations of vectors in A (Proposition 2.2). Let M denote the set of all (finite) linear combinations of vectors in A with rational coefficients (note: we say that a complex number is "rational" if its real and imaginary parts are rational numbers). Since Q⁻ = R, it follows that M⁻ = (span A)⁻ (see Example 3P), and hence M⁻ = X. Moreover, it is readily verified that M is a linear space over the rational field Q, and so M = span A (over Q). Thus M has a Hamel basis included in A (Theorem 2.6), so that dim M ≤ #A ≤ ℵ₀. Therefore, #M = max(#Q, dim M) = ℵ₀ (cf. Problem 2.8). Conclusion: M is a countable dense subset of X. Outcome: X is separable. □
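The density of rational-coefficient combinations used in part (b) can be sketched numerically. In the Python fragment below (the helper is ours), X = R² is spanned by the countable set A = {(1, 0), (0, 1)}, and any vector is approximated arbitrarily well by a combination with rational coefficients.

```python
from fractions import Fraction
import math

# Proposition 4.9(b) in miniature: rational combinations of A = {(1,0), (0,1)}
# form a countable set that is dense in R^2.

def rational_combination(x, max_den=10**6):
    """Coefficients q1, q2 in Q with q1*(1,0) + q2*(0,1) close to x."""
    return tuple(Fraction(t).limit_denominator(max_den) for t in x)

x = (math.pi, math.sqrt(2.0))
q = rational_combination(x)

# Euclidean distance from x to the rational combination is small:
err = math.hypot(float(q[0]) - x[0], float(q[1]) - x[1])
assert err < 2e-6
assert all(c.denominator <= 10**6 for c in q)
```

Shrinking the target tolerance only requires raising `max_den`, which mirrors the countable union of finer and finer rational grids in the proof.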
In Section 2.2 we considered the subcollection ℒat(X) of the power set P(X) of a linear space X consisting of all linear manifolds of X. Recall that ℒat(X) was shown to be a complete lattice (in the inclusion ordering of P(X)): if {M_γ}_{γ∈Γ} is an arbitrary nonempty subcollection of ℒat(X) (i.e., an arbitrary nonempty indexed family of linear manifolds of X), then inf{M_γ}_{γ∈Γ} = ⋂_{γ∈Γ} M_γ ∈ ℒat(X) and sup{M_γ}_{γ∈Γ} = span(⋃_{γ∈Γ} M_γ) = Σ_{γ∈Γ} M_γ ∈ ℒat(X). In particular, if {M, N} is a pair of linear manifolds of X, then M ∧ N = M ∩ N and M ∨ N = span(M ∪ N) = M + N are both linear manifolds of X. Now shift from linear manifolds to subspaces. Let Lat(X) denote the subcollection of the power set P(X) of a normed space X made up of all subspaces of X. Clearly, Lat(X) ⊆ ℒat(X). If {M_γ}_{γ∈Γ} is an arbitrary nonempty subcollection of Lat(X) (i.e., an arbitrary nonempty indexed family of subspaces of the normed space X), then ⋂_{γ∈Γ} M_γ ∈ Lat(X) (by Proposition 4.8(b)) and ⋂_{γ∈Γ} M_γ ⊆ M_α for every M_α ∈ {M_γ}_{γ∈Γ}, so that ⋂_{γ∈Γ} M_γ is a lower bound for {M_γ}_{γ∈Γ}. Moreover, if V in Lat(X) is a lower bound for {M_γ}_{γ∈Γ} (i.e., if V ⊆ M_γ for all γ ∈ Γ), then V ⊆ ⋂_{γ∈Γ} M_γ. Thus ⋂_{γ∈Γ} M_γ = inf{M_γ}_{γ∈Γ}. If we adopt the usual notation ⋀_{γ∈Γ} M_γ = inf{M_γ}_{γ∈Γ}, then

⋀_{γ∈Γ} M_γ = ⋂_{γ∈Γ} M_γ.

Similarly, M_α ⊆ ∨(⋃_{γ∈Γ} M_γ) ∈ Lat(X) for every M_α ∈ {M_γ}_{γ∈Γ} and, if U in Lat(X) is an upper bound for {M_γ}_{γ∈Γ} (i.e., if M_γ ⊆ U for all γ ∈ Γ, so that ⋃_{γ∈Γ} M_γ ⊆ U), then ∨(⋃_{γ∈Γ} M_γ) ⊆ ∨U = U. Thus ∨(⋃_{γ∈Γ} M_γ) = sup{M_γ}_{γ∈Γ}.
Again, if we take up the usual notation ∨_{γ∈Γ} M_γ = sup{M_γ}_{γ∈Γ}, then

∨_{γ∈Γ} M_γ = ∨(⋃_{γ∈Γ} M_γ) = (Σ_{γ∈Γ} M_γ)⁻.

This is the topological sum of {M_γ}_{γ∈Γ}. Conclusion: Lat(X) is a complete lattice. The collection of all subspaces of a normed space is a complete lattice in the inclusion ordering. If M and N are subspaces of X, then M ∧ N = M ∩ N and M ∨ N = ∨(M ∪ N) = (M + N)⁻ lie in Lat(X). However (and this is rather important) it may happen that M + N ≠ (M + N)⁻: the (algebraic) sum of subspaces is not necessarily a subspace (it is a linear manifold but not necessarily a closed linear manifold). We shall see later (next chapter) an example of a couple of subspaces (of a Banach space) whose sum is not closed.
Next we consider the quotient space of a normed space modulo a subspace. Suppose M is a subspace of a normed space (X, ‖·‖_X). Let

[x] = x + M = {x′ ∈ X : x′ = x + u for some u ∈ M}

be the coset (of x modulo M) in the quotient space X/M (of X modulo M), which is a linear space over the same field of the linear space X (see Example 2H). Consider the function ‖·‖ : X/M → ℝ given by

‖[x]‖ = inf_{x′∈[x]} ‖x′‖_X = inf_{u∈M} ‖x + u‖_X

for every [x] in X/M, where x is any vector of X in [x]. This defines a function from X/M to ℝ, for ‖[x]‖ depends only on the coset [x] and not on a particular representative vector x in [x]. Note that inf_{u∈M} ‖x + u‖_X = inf_{u∈M} ‖x − u‖_X because M is a linear manifold. Thus, with d_X denoting the metric on X generated by the norm ‖·‖_X,

‖[x]‖ = inf_{u∈M} d_X(x, u) = d_X(x, M)

for every [x] ∈ X/M, where x is any vector of X in [x]. We claim that ‖·‖ is a norm on the quotient space X/M. Indeed, nonnegativeness is trivially verified: ‖[x]‖ ≥ 0 for every [x] ∈ X/M. To verify positiveness proceed as follows. Take an arbitrary x ∈ [x]. If ‖[x]‖ = 0, then d_X(x, M) = 0, so that x ∈ M⁻ = M (cf. Problem 3.43(b); recall: M is closed in X). This means that x ∈ [0] (reason: the origin [0] of X/M is M; see Example 2H), and hence [x] = [0]. Equivalently, ‖[x]‖ > 0 whenever [x] ≠ [0].
Absolute homogeneity and subadditivity (i.e., the triangle inequality) are also easily verified. Recall that (Example 2H)

α[x] = [αx]   and   [x] + [y] = [x + y]
for every [x], [y] in X/M and every scalar α. Since M is a linear manifold, it follows by absolute homogeneity and subadditivity of the norm ‖·‖_X that

‖α[x]‖ = ‖[αx]‖ = inf_{u∈M} ‖αx + u‖_X = inf_{u∈M} ‖αx + αu‖_X = |α| inf_{u∈M} ‖x + u‖_X = |α| ‖[x]‖

whenever α ≠ 0 (the case α = 0 is trivial), and

‖[x] + [y]‖ = ‖[x + y]‖ = inf_{u,v∈M} ‖(x + u) + (y + v)‖_X ≤ inf_{u∈M} ‖x + u‖_X + inf_{v∈M} ‖y + v‖_X = ‖[x]‖ + ‖[y]‖,

for every [x], [y] in X/M and every scalar α. Such a norm is referred to as the quotient norm and, whenever a quotient space X/M (of a normed space X modulo a subspace M) is regarded as a normed space, it is this quotient norm that will be assumed to equip it. Note that ‖[x]‖ = inf_{u∈M} ‖x + u‖_X ≤ ‖x + 0‖_X = ‖x‖_X, so that

‖[x]‖ ≤ ‖x‖_X
for every [x] ∈ X/M and every x ∈ [x]. Thus

‖[x] − [y]‖ = ‖[x − y]‖ ≤ ‖x − y‖_X

for every x, y ∈ X. Therefore the natural mapping π of X onto X/M, π(x) = [x] for every x ∈ X, is uniformly continuous (a contraction, actually), and hence it preserves convergence and Cauchy sequences (Corollary 3.8 and Lemma 3.43). If {x_n} converges in X to x ∈ X, then {[x_n]} converges in X/M to [x] ∈ X/M; if {x_n} is a Cauchy sequence in X, then {[x_n]} is a Cauchy sequence in X/M.
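In a concrete finite-dimensional case the quotient norm is literally a distance to a subspace. A minimal numerical sketch (assuming numpy; the choices X = ℝ² with the Euclidean norm and M = span{(1, 1)} are illustrative, not from the text):

```python
import numpy as np

# Quotient norm on X/M for X = R^2 (Euclidean norm), M = span{(1, 1)}:
# ||[x]|| = inf_{u in M} ||x + u|| = d(x, M), the distance from x to M.
m = np.array([1.0, 1.0]) / np.sqrt(2.0)       # unit vector spanning M

def quotient_norm(x):
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x - (x @ m) * m)    # distance to the line M

x = np.array([3.0, 1.0])
assert np.isclose(quotient_norm(x), np.sqrt(2.0))  # d(x, M) = |3 - 1|/sqrt(2)
assert quotient_norm(x) <= np.linalg.norm(x)       # ||[x]|| <= ||x||_X
assert np.isclose(quotient_norm(x + 5.0 * np.array([1.0, 1.0])),
                  quotient_norm(x))                # depends only on the coset
```

The last assertion reflects that ‖[x]‖ depends only on the coset: translating x by a vector of M leaves the quotient norm unchanged.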
Proposition 4.10. If M is a subspace of a Banach space X, then the quotient space X/M is a Banach space.

Proof. Let M be a subspace of a normed space (X, ‖·‖_X). Consider the quotient space X/M equipped with the quotient norm ‖·‖, and let {[x_n]}_{n=0}^∞ be an arbitrary Cauchy sequence in X/M. According to Problem 3.52(c), {[x_n]}_{n=0}^∞ has a subsequence {[x_{n_k}]}_{k=0}^∞ of bounded variation, so that

Σ_{k=0}^∞ ‖[x_{n_{k+1}}] − [x_{n_k}]‖ = Σ_{k=0}^∞ ‖[x_{n_{k+1}} − x_{n_k}]‖ < ∞.

Note: if x ∈ X, then for each ε > 0 there exists a vector u_{ε,x} ∈ M such that

‖x + u_{ε,x}‖_X ≤ inf_{u∈M} ‖x + u‖_X + ε = ‖[x]‖ + ε.

In particular, for each integer k ≥ 0 there exists a vector u_k ∈ M such that

‖(x_{n_{k+1}} − x_{n_k}) + u_k‖_X < ‖[x_{n_{k+1}} − x_{n_k}]‖ + (1/2)^k.
Therefore, as Σ_{k=0}^∞ (1/2)^k < ∞,

Σ_{k=0}^∞ ‖(x_{n_{k+1}} − x_{n_k}) + u_k‖_X < ∞.
In words, the X-valued sequence {(x_{n_{k+1}} − x_{n_k}) + u_k}_{k=0}^∞ is absolutely summable. Now consider the sequence {y_k}_{k=0}^∞ in X of partial sums of {(x_{n_{k+1}} − x_{n_k}) + u_k}_{k=0}^∞. A trivial induction shows that

y_k = Σ_{i=0}^k [(x_{n_{i+1}} − x_{n_i}) + u_i] = (x_{n_{k+1}} − x_{n_0}) + Σ_{i=0}^k u_i

for each k ≥ 0. If X is a Banach space, then the absolutely summable sequence {(x_{n_{k+1}} − x_{n_k}) + u_k}_{k=0}^∞ is summable (Proposition 4.4), which means that {y_k}_{k=0}^∞ converges in X. Hence {[y_k]}_{k=0}^∞ converges in X/M (for the natural mapping of X onto X/M, x ↦ [x], is continuous). But [x_{n_{k+1}}] − [x_{n_0}] = [x_{n_{k+1}} − x_{n_0}] = [y_k] for each k ≥ 0 (because Σ_{i=0}^k u_i lies in M for every k ≥ 0). Thus the subsequence {[x_{n_k}]}_{k=0}^∞ converges in X/M, and therefore the Cauchy sequence {[x_n]}_{n=0}^∞ converges in X/M (Proposition 3.39). Conclusion: Every Cauchy sequence in X/M converges in X/M; that is, X/M is a Banach space. □
4.4
Bounded Linear Transformations
Let X and Y be normed spaces. A continuous linear transformation of X into Y is a linear transformation of the linear space X into the linear space Y that is continuous with respect to the norm topologies of X and Y. (Note that X and Y are necessarily linear spaces over the same scalar field.) This is the most important concept that results from the combination of algebraic and topological structures.
Definition 4.11. A linear transformation T of a normed space X into a normed space Y is bounded if there exists a constant β > 0 such that

‖Tx‖ ≤ β‖x‖

for every x ∈ X. (The norm on the left-hand side is the norm on Y and that on the right-hand side is the norm on X.)

Proposition 4.12. A linear transformation of a normed space X into a normed space Y is bounded if and only if it maps bounded subsets of X into bounded subsets of Y.

Proof. Let A be a bounded subset of X so that sup_{a∈A} ‖a‖ < ∞ (Problem 4.5). If T is bounded, then there exists a real number β > 0 such that ‖Tx‖ ≤ β‖x‖ for
every x ∈ X. Therefore

sup_{y∈T(A)} ‖y‖ = sup_{a∈A} ‖Ta‖ ≤ β sup_{a∈A} ‖a‖ < ∞,

and hence T(A) is bounded in Y. Conversely, suppose T(A) is bounded in Y for every bounded set A in X. In particular, T(B₁[0]) is bounded in Y: the closed unit ball centered at the origin of X, B₁[0], is certainly bounded in X. Thus

sup_{b∈B₁[0]} ‖Tb‖ = sup_{‖b‖≤1} ‖Tb‖ = sup_{y∈T(B₁[0])} ‖y‖ < ∞.

Take an arbitrary nonzero x in X. Since ‖(1/‖x‖)x‖ = 1, it follows that

‖Tx‖ = ‖x‖ ‖T((1/‖x‖)x)‖ ≤ (sup_{‖b‖≤1} ‖Tb‖) ‖x‖

for every 0 ≠ x ∈ X, and hence T is bounded (since the inequality ‖Tx‖ ≤ (sup_{‖b‖≤1} ‖Tb‖)‖x‖ holds trivially at x = 0). □
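In finite dimensions every linear transformation is bounded, so Proposition 4.12 can be spot-checked numerically. A sketch (assuming numpy; the matrix T and the random sample of unit vectors are illustrative choices): the image of the unit sphere is bounded, and the supremum of the image norms approximates the induced norm of T.

```python
import numpy as np

rng = np.random.default_rng(0)
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # a linear (hence bounded) map on R^2

# Sample the unit sphere of (R^2, Euclidean norm): a bounded set.
v = rng.normal(size=(2000, 2))
sphere = v / np.linalg.norm(v, axis=1, keepdims=True)

# Its image under T is bounded, and the supremum of the image norms
# approximates the induced norm of T (the largest singular value).
sup_estimate = np.linalg.norm(sphere @ T.T, axis=1).max()
op_norm = np.linalg.svd(T, compute_uv=False)[0]

assert sup_estimate <= op_norm + 1e-9
assert op_norm - sup_estimate < 0.05
```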
The next elementary result is extremely important.
Proposition 4.13. Let T be a bounded linear transformation of a normed space X into a normed space Y. The null space N(T) of T is a subspace of X.

Proof. The null space (or kernel) of T,

N(T) = {x ∈ X : Tx = 0} = T⁻¹({0}),

is a linear manifold of X (Problem 2.10). The Closed Set Theorem (Theorem 3.30) ensures that it is closed in X. Indeed, if {x_n} is an N(T)-valued sequence (i.e., Tx_n = 0 for every n) that converges in X to x ∈ X, then

0 ≤ ‖Tx‖ = ‖Tx_n − Tx‖ = ‖T(x_n − x)‖ ≤ β‖x_n − x‖ → 0

as n → ∞ (because T is linear and bounded), and hence x ∈ N(T). □
Theorem 4.14. Let T be a linear transformation of a normed space X into a normed space Y. The following assertions are pairwise equivalent.
(a) T is bounded. (b) T is Lipschitzian. (c) T is uniformly continuous.
(d) T is continuous. (e) T is continuous at some point x0 of X.
(f) T is continuous at the origin 0 ∈ X.

Proof. Let T : X → Y be a linear transformation. If T is bounded, then there exists β > 0 such that, for every x₁, x₂ ∈ X,

‖Tx₁ − Tx₂‖ = ‖T(x₁ − x₂)‖ ≤ β‖x₁ − x₂‖,

and hence T is Lipschitzian, so that (a)⇒(b). Recall that (b)⇒(c)⇒(d)⇒(e) trivially. Now suppose T is continuous at x₀ ∈ X. Then for every ε > 0 there exists δ > 0 such that ‖Tx − T0‖ = ‖Tx‖ = ‖T(x + x₀) − Tx₀‖ < ε whenever ‖x − 0‖ = ‖x‖ = ‖(x + x₀) − x₀‖ < δ, and so T is continuous at 0 ∈ X. Therefore (e)⇒(f). Next suppose T is continuous at 0 ∈ X. Thus (for ε = 1) there exists δ > 0 such that ‖Tx‖ = ‖Tx − T0‖ < 1 whenever ‖x‖ = ‖x − 0‖ < δ. Since ‖(δ/2‖x‖)x‖ = δ/2 < δ for every nonzero x in X,

‖Tx‖ = (2‖x‖/δ) ‖T((δ/2‖x‖)x)‖ ≤ (2/δ)‖x‖

for every 0 ≠ x ∈ X, and hence T is bounded. Thus (f)⇒(a). □
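The equivalences of Theorem 4.14 have real content only in infinite dimensions, where unbounded linear transformations exist. A hedged illustration (not from the text): differentiation on polynomials over [0, 1] with the sup-norm is linear, yet ‖tⁿ‖ = 1 while ‖(tⁿ)′‖ = n, so no bound ‖Tx‖ ≤ β‖x‖ can hold.

```python
import numpy as np

def sup_norm(coeffs, samples=2001):
    # Sup-norm on [0, 1] of the polynomial sum_k coeffs[k] * t**k,
    # estimated on a uniform grid that includes both endpoints.
    t = np.linspace(0.0, 1.0, samples)
    return np.abs(np.polynomial.polynomial.polyval(t, coeffs)).max()

def T(coeffs):
    # The linear transformation T = d/dt acting on coefficient vectors.
    return np.polynomial.polynomial.polyder(coeffs)

# For x_n(t) = t**n: ||x_n|| = 1 but ||T x_n|| = n.  The ratios below
# grow without bound, so no beta with ||Tx|| <= beta ||x|| exists.
ratios = [sup_norm(T(np.eye(n + 1)[n])) / sup_norm(np.eye(n + 1)[n])
          for n in (1, 5, 25, 125)]
assert np.allclose(ratios, [1.0, 5.0, 25.0, 125.0])
```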
Observe from Theorem 4.14 that the terms "bounded linear transformation" and "continuous linear transformation" are synonyms, and as such we shall use them interchangeably. If X and Y are normed spaces (over the same field F), then the collection of all bounded linear transformations of X into Y will be denoted by B[X, Y]. It is easy to verify (triangle inequality and homogeneity for the norm on Y) that B[X, Y] is a linear manifold of the linear space L[X, Y] of all linear transformations of X into Y (see Section 2.5), and hence B[X, Y] is a linear space over F. The origin of B[X, Y] is the null transformation, denoted by O (Ox = 0 for every x ∈ X). For each T ∈ B[X, Y] set

‖T‖ = inf{β ≥ 0 : ‖Tx‖ ≤ β‖x‖ for every x ∈ X}.

(Recall: if T ∈ B[X, Y], then there exists β > 0 such that ‖Tx‖ ≤ β‖x‖ for every x ∈ X, so that the nonnegative number ‖T‖ exists for every T ∈ B[X, Y].) If x is any vector in X, then ‖Tx‖ ≤ (‖T‖ + ε)‖x‖ for every ε > 0, so that ‖Tx‖ ≤ inf_{ε>0}(‖T‖ + ε)‖x‖. Thus, for every x ∈ X,

‖Tx‖ ≤ ‖T‖‖x‖,

and hence ‖T‖ = min{β ≥ 0 : ‖Tx‖ ≤ β‖x‖ for every x ∈ X}. It is also easy to show that

‖T‖ = sup_{‖x‖<1} ‖Tx‖ = sup_{‖x‖≤1} ‖Tx‖ = sup_{‖x‖=1} ‖Tx‖ = sup_{x≠0} ‖Tx‖/‖x‖,

where the last two expressions make sense only if X ≠ {0}. Hence

‖T‖ ≥ 0   and   ‖T‖ = 0 if and only if T = O.
Moreover, for any α in F and any S in B[X, Y], ‖(αT)x‖ = ‖α(Tx)‖ = |α|‖Tx‖ ≤ |α|‖T‖‖x‖ and ‖(T + S)x‖ = ‖Tx + Sx‖ ≤ ‖Tx‖ + ‖Sx‖ ≤ (‖T‖ + ‖S‖)‖x‖ for every x ∈ X. Therefore,

‖αT‖ ≤ |α|‖T‖   and   ‖T + S‖ ≤ ‖T‖ + ‖S‖

for every T, S ∈ B[X, Y] and every scalar α ∈ F (in fact ‖αT‖ = |α|‖T‖, since T = α⁻¹(αT) whenever α ≠ 0). Conclusion: The function B[X, Y] → ℝ defined by T ↦ ‖T‖ is a norm on the linear space B[X, Y]. Thus B[X, Y] is a normed space, and ‖T‖ is referred to as the norm of T ∈ B[X, Y] (or the usual norm on B[X, Y], or still as the induced uniform norm on B[X, Y]). This is the norm that will be assumed to equip B[X, Y] whenever B[X, Y] is regarded as a normed space. A contraction in B[X, Y] is a bounded linear transformation T ∈ B[X, Y] such that ‖T‖ ≤ 1. Clearly,

‖T‖ ≤ 1   ⟺   ‖Tx‖ ≤ ‖x‖ for every x ∈ X.

If X ≠ {0}, then a transformation T ∈ B[X, Y] is a contraction if and only if sup_{x≠0}(‖Tx‖/‖x‖) ≤ 1. A strict contraction in B[X, Y] is a bounded linear transformation T ∈ B[X, Y] such that ‖T‖ < 1. It is obvious that every strict contraction is a contraction. If X ≠ {0}, then T ∈ B[X, Y] is a strict contraction if and only if sup_{x≠0}(‖Tx‖/‖x‖) < 1. Carefully note that, if T ∈ B[X, Y] and X ≠ {0}, then

‖T‖ < 1   ⟹   ‖Tx‖ < ‖x‖ for every nonzero x ∈ X   ⟹   ‖T‖ ≤ 1.
Example 4H. The converses of the above implications fail. Indeed, let ℓ₊^p be the Banach space of Example 4B for some p ≥ 1. We shall drop the subscript "p" from the norm ‖·‖_p on ℓ₊^p and write simply ‖·‖. Now consider the diagonal mapping D_a : ℓ₊^p → ℓ₊^p of Problem 3.22 for some bounded sequence a = {a_k}_{k=0}^∞ ∈ ℓ₊^∞:

D_a x = {a_k ξ_k}_{k=0}^∞   for every   x = {ξ_k}_{k=0}^∞ ∈ ℓ₊^p

(i.e., D_a(ξ₀, ξ₁, ξ₂, …) = (a₀ξ₀, a₁ξ₁, a₂ξ₂, …)). Common notations for a diagonal mapping of ℓ₊^p into itself include the usual representation as an infinite diagonal matrix: D_a = diag({a_k}_{k=0}^∞) = diag(a₀, a₁, a₂, …).
This bounded linear transformation is called a diagonal operator. Linearity is trivially verified and boundedness (i.e., continuity) follows from Problem 3.22(a). Indeed, if a = {a_k}_{k=0}^∞ ∈ ℓ₊^∞, then

‖D_a x‖ = (Σ_{k=0}^∞ |a_k ξ_k|^p)^{1/p} ≤ sup_k |a_k| (Σ_{k=0}^∞ |ξ_k|^p)^{1/p} = sup_k |a_k| ‖x‖
for every x = {ξ_k}_{k=0}^∞ ∈ ℓ₊^p, so that ‖D_a‖ ≤ sup_k |a_k| = ‖a‖_∞. On the other hand, consider the ℓ₊^p-valued sequence {e_i}_{i=0}^∞ where, for each i ≥ 0, e_i is a scalar-valued sequence with just one nonzero entry (equal to one) at the ith position (i.e., e_i = {δ_{ik}}_{k=0}^∞ for every i ≥ 0, with δ_{ik} = 1 if k = i and δ_{ik} = 0 if k ≠ i). Since D_a e_i = a_i e_i and ‖e_i‖ = 1, it follows that ‖D_a e_i‖ = |a_i|, and hence ‖D_a‖ = sup_{‖x‖=1} ‖D_a x‖ ≥ ‖D_a e_i‖ = |a_i| for every i ≥ 0. Thus ‖D_a‖ ≥ sup_i |a_i| = ‖a‖_∞. Therefore, ‖D_a‖ = ‖a‖_∞.
If a = {a_k} is a constant sequence, say a_k = α for all k ≥ 0, then D_a = diag(α, α, α, …) = αI, a multiple of the identity I on ℓ₊^p. In this case D_a is a scalar operator (see Problem 4.19). It is clear that ‖αIx‖ = |α|‖x‖ for every x ∈ ℓ₊^p and ‖αI‖ = |α|. If a = {a_k}_{k=0}^∞ is the increasing sequence with a_k = k/(k+1) for every k ≥ 0, then D_a = diag(0, 1/2, 2/3, …) is such that ‖D_a x‖ < ‖x‖ for every 0 ≠ x ∈ ℓ₊^p and yet ‖D_a‖ = 1.

Proposition 4.15. B[X, Y] is a Banach space if Y is a Banach space.

Proof. Let {T_n} be an arbitrary Cauchy sequence in B[X, Y]. For each x ∈ X the sequence {T_n x} is a Cauchy sequence in Y (reason: ‖T_m x − T_n x‖ ≤ ‖T_m − T_n‖‖x‖ for every x ∈ X and every pair of indices m and n). Since Y is complete, {T_n x} converges in Y. Thus for every x ∈ X there exists a vector y_x ∈ Y such that T_n x → y_x in Y: the (unique) limit of {T_n x}. Let T be the mapping that assigns to each x in X this vector y_x in Y; that is, Tx = lim_n T_n x for every x ∈ X.

Claim 1. T : X → Y is linear.

Proof. Since T_n is linear for each n, and since the linear operations are continuous (Problem 4.1), it follows that

T(α₁x₁ + α₂x₂) = lim_n T_n(α₁x₁ + α₂x₂) = lim_n (α₁T_n x₁ + α₂T_n x₂) = α₁ lim_n T_n x₁ + α₂ lim_n T_n x₂ = α₁Tx₁ + α₂Tx₂
for every x₁, x₂ ∈ X and every pair of scalars α₁ and α₂. □

Claim 2. T : X → Y is bounded.

Proof. Take an arbitrary real number ε > 0. Since {T_n} is a Cauchy sequence in B[X, Y], it follows that there exists an integer n_ε ≥ 1 such that ‖T_m − T_n‖ < ε whenever m, n ≥ n_ε. Thus ‖T_m x − T_n x‖ ≤ ‖T_m − T_n‖‖x‖ < ε‖x‖ for every x ∈ X and every m, n ≥ n_ε. Therefore,

‖(T_m − T)x‖ = ‖T_m x − Tx‖ = ‖lim_n(T_m x − T_n x)‖ = lim_n ‖T_m x − T_n x‖ ≤ ε‖x‖
for every x ∈ X and every m ≥ n_ε (reason: every norm is continuous; see Corollary 3.8). Hence the linear transformation T_m − T is bounded for each m ≥ n_ε, and so is T = T_m − (T_m − T). □

Claim 3. T_m → T in B[X, Y].

Proof. For every ε > 0 there exists an integer n_ε ≥ 1 such that m ≥ n_ε implies

‖T_m − T‖ = sup_{x≠0} ‖(T_m − T)x‖/‖x‖ ≤ ε

according to the above displayed inequality. □
Conclusion: Every Cauchy sequence {T_n} in B[X, Y] converges in B[X, Y] to T ∈ B[X, Y], which means that the normed space B[X, Y] is complete; that is, B[X, Y] is a Banach space. □

Comparing the above proposition with Example 4G we might expect that, if X ≠ {0}, then B[X, Y] is a Banach space if and only if Y is a Banach space. This indeed is the case, and we shall prove the converse of Proposition 4.15 in Section 4.10. Recall that bounded linear transformations can be multiplied (where multiplication means composition). The resulting transformation is linear as well as bounded.
Proposition 4.16. Let X, Y and Z be normed spaces over the same scalar field. If T ∈ B[X, Y] and S ∈ B[Y, Z], then ST ∈ B[X, Z] and ‖ST‖ ≤ ‖S‖‖T‖.

Proof. The composition ST : X → Z is linear (Problem 2.15) and ‖STx‖ ≤ ‖S‖‖Tx‖ ≤ ‖S‖‖T‖‖x‖ for every x in X. □
The above inequality is a rather important additional property shared by the (induced uniform) norm of bounded linear transformations. We shall refer to it as the operator norm property. Let X be a normed space and set B[X] = B[X, X] for short. The elements of
B[X] are called operators. In other words, by an operator (or a bounded linear operator) we simply mean a bounded linear transformation of a normed space X into itself, so that B[X] is the normed space of all operators on X.

Example 4I. Let {(X_k, ‖·‖_k)}_{k∈I} be a countable (either finite or infinite) indexed family of normed spaces (over the same scalar field), and consider the direct sum ⊕_{k∈I} X_k of the linear spaces {X_k}_{k∈I}. (To avoid trivialities we assume that each X_k is nonzero.) Equip the linear space ⊕_{k∈I} X_k with any of the norms ‖·‖_∞ or ‖·‖_p (for any p ≥ 1) as in Examples 4E and 4F. If the index set I is countably infinite, then write (⊕_{k∈I} X_k, ‖·‖_∞) for the normed space ([⊕_{k∈I} X_k]_∞, ‖·‖_∞)
and (⊕_{k∈I} X_k, ‖·‖_p) for the normed space ([⊕_{k∈I} X_k]_p, ‖·‖_p). For each i ∈ I define a mapping P_i : ⊕_{k∈I} X_k → ⊕_{k∈I} X_k by

P_i x = x_i   for every   x = {x_k}_{k∈I} ∈ ⊕_{k∈I} X_k,
where we are identifying each vector x_i ∈ X_i with the indexed family {x_i(k)}_{k∈I} ∈ ⊕_{k∈I} X_k that has just one nonzero entry (equal to x_i) at the ith position (i.e., set x_i(k) = δ_{ik} x_k so that x_i(k) = 0_k ∈ X_k, the origin of X_k, if k ≠ i and x_i(i) = x_i). Briefly, we are writing x_i for (0₁, …, 0_{i−1}, x_i, 0_{i+1}, …). Each P_i is a linear transformation. Indeed,

P_i(αu ⊕ βv) = P_i(α{u_k}_{k∈I} ⊕ β{v_k}_{k∈I}) = P_i({αu_k + βv_k}_{k∈I}) = αu_i + βv_i = αP_i u ⊕ βP_i v

for every u = {u_k}_{k∈I} and v = {v_k}_{k∈I} in ⊕_{k∈I} X_k and every pair of scalars {α, β}. Moreover,
R(P_i) = X_i

if we identify each normed space X_i with the linear manifold ⊕_{k∈I} X_i(k) of ⊕_{k∈I} X_k, such that X_i(k) = {0_k} for k ≠ i and X_i(i) = X_i, equipped with any of the norms ‖·‖_∞ or ‖·‖_p. (To identify X_i with ⊕_{k∈I} X_i(k) simply means that these normed spaces are isometrically isomorphic, a concept that will be introduced in Section 4.7.) Note that {x_i(k)}_{k∈I} lies in ⊕_{k∈I} X_i(k) for every x_i in X_i and ‖x_i‖_i = ‖{x_i(k)}_{k∈I}‖_∞ = ‖{x_i(k)}_{k∈I}‖_p for any p ≥ 1. Also note that each P_i is a projection (i.e., an idempotent
linear transformation):

P_i² x = P_i {x_i(k)}_{k∈I} = {x_i(k)}_{k∈I} = P_i x

for every x = {x_k}_{k∈I} in ⊕_{k∈I} X_k, so that P_i = P_i² for each i ∈ I. In fact, each P_i is a continuous projection. To verify that the mapping P_i : ⊕_{k∈I} X_k → ⊕_{k∈I} X_k is continuous, when ⊕_{k∈I} X_k is equipped with any of the norms ‖·‖_∞ or ‖·‖_p, observe that

‖P_i x‖_∞ = ‖x_i‖_i ≤ sup_{k∈I} ‖x_k‖_k = ‖x‖_∞

for every x = {x_k}_{k∈I} in (⊕_{k∈I} X_k, ‖·‖_∞), and

‖P_i x‖_p = ‖{x_i(k)}_{k∈I}‖_p = ‖x_i‖_i ≤ (Σ_{k∈I} ‖x_k‖_k^p)^{1/p} = ‖x‖_p

for every x = {x_k}_{k∈I} in (⊕_{k∈I} X_k, ‖·‖_p). Thus P_i is a bounded linear transformation, and hence continuous (Theorem 4.14). In other words, each P_i is a projection operator in B[⊕_{k∈I} X_k]. Actually, if we use the same symbols ‖·‖_∞ and ‖·‖_p to denote the induced uniform norms on B[⊕_{k∈I} X_k] whenever ⊕_{k∈I} X_k is equipped with either ‖·‖_∞ or ‖·‖_p, respectively, then the above inequalities ensure that ‖P_i‖_∞ ≤ 1 and ‖P_i‖_p ≤ 1. Thus each P_i is a contraction with respect
to any of the norms ‖·‖_∞ or ‖·‖_p. On the other hand, since P_i = P_i², it follows that ‖P_i‖_∞ = ‖P_i²‖_∞ ≤ ‖P_i‖_∞² and ‖P_i‖_p = ‖P_i²‖_p ≤ ‖P_i‖_p² (operator norm property). Hence 1 ≤ ‖P_i‖_∞ and 1 ≤ ‖P_i‖_p (for P_i ≠ 0), so that

‖P_i‖_∞ = ‖P_i‖_p = 1.

Summing up: If ⊕_{k∈I} X_k is equipped with either ‖·‖_∞ or ‖·‖_p, then {P_i}_{i∈I} is a family of projections with unit norm in B[⊕_{k∈I} X_k]. When we identify ⊕_{k∈I} X_i(k) = R(P_i) with X_i, as we actually did, then each map P_i : ⊕_{k∈I} X_k → ⊕_{k∈I} X_i(k) ⊆ ⊕_{k∈I} X_k can be viewed as a function from ⊕_{k∈I} X_k onto X_i, and hence we write P_i : ⊕_{k∈I} X_k → X_i. This is called the natural projection of ⊕_{k∈I} X_k onto X_i.

Definition 4.17. A normed space A that is simultaneously an algebra (as in Problem 2.30) with respect to a product operation A × A → A, say (A, B) ↦ AB, such that ‖AB‖ ≤ ‖A‖‖B‖ is called a normed algebra. A Banach algebra is a normed algebra that is complete as a normed space. If a normed algebra possesses an identity I such that ‖I‖ = 1, then it is a unital normed algebra (or a normed algebra with identity). If a unital normed algebra is complete as a normed space, then it is a unital Banach algebra (or a Banach algebra with identity).
Let L[X] be the linear space of all linear transformations on X (i.e., L[X] = L[X, X]: the linear space of all linear transformations of a linear space X into itself; see Problem 2.13). We have already noticed that B[X] is a linear manifold of L[X]. If X ≠ {0}, then L[X] is much more than a mere linear space; it actually is an algebra with identity, where the product in L[X] is interpreted as composition (Problem 2.30). The operator norm property ensures that B[X] is a subalgebra (in the algebraic sense) of L[X], and hence an algebra in its own right. Moreover, the operator norm property also ensures that B[X] in fact is a normed algebra. Let I denote the identity operator in B[X] (i.e., Ix = x for every x ∈ X), which is the identity of the algebra B[X]. Of course, ‖I‖ = 1 by the very definition of the norm (the induced uniform norm, that is) on B[X]. Thus B[X] is a normed algebra with identity and, according to Proposition 4.15, B[X] is a Banach algebra with identity if X ≠ {0} is a Banach space.
We know from Problem 4.1 that addition and scalar multiplication in B[X] are continuous (as they are in any normed space). The operator norm property allows us to conclude that multiplication is continuous too. Indeed, if {S_n} and {T_n} are B[X]-valued sequences that converge in B[X] to S ∈ B[X] and T ∈ B[X], respectively, then

‖S_n T_n − ST‖ ≤ ‖S_n‖‖T_n − T‖ + ‖S_n − S‖‖T‖

for every n. Since sup_n ‖S_n‖ < ∞ (reason: every convergent sequence is bounded), it follows that {S_n T_n} converges in B[X] to ST. Therefore the product operation B[X] × B[X] → B[X] given by (S, T) ↦ ST is continuous (Corollary 3.8), where the topology on B[X] × B[X] is that induced by any of the equivalent metrics of Problems 3.9 and 3.33.
4.5
The Open Mapping Theorem and Continuous Inverses
Recall that a function F : X → Y from a set X to a set Y has an inverse on its range R(F) if there exists a (unique) function F⁻¹ : R(F) → X (called the inverse of F on R(F)) such that F⁻¹F = I_X, where I_X stands for the identity on X. Moreover, F has an inverse on its range if and only if it is injective. If F is injective and surjective, then it is called invertible (in the set-theoretic sense) and F⁻¹ : Y → X is the inverse of F. Furthermore, F : X → Y is invertible if and only if there exists a unique function F⁻¹ : Y → X (the inverse of F) such that F⁻¹F = I_X and FF⁻¹ = I_Y, where I_Y stands for the identity on Y. (See Section 1.3 and Problems 1.5 to 1.8.) Now let T be a linear transformation of a linear space X into a linear space Y (i.e., T ∈ L[X, Y]). According to Theorems 2.8 and 2.10, N(T) = {0} if and only if T has a linear inverse on its range R(T). That is,

N(T) = {0}   ⟺   T is injective   ⟺   there exists T⁻¹ ∈ L[R(T), X].

Clearly, N(T) = {0} if and only if 0 < ‖Tx‖ for every nonzero x ∈ X (whenever Y is a normed space). This leads to the proposition below.

Proposition 4.18. Take T ∈ L[X, Y], where X is a linear space and Y is a normed space. The following assertions are equivalent.

(a) There exists T⁻¹ ∈ L[R(T), X] (i.e., T has an inverse on its range).

(b) 0 < ‖Tx‖ for every nonzero x in X.
(a) There exists T-t E C[IZ(T), X] (i.e., T has an inverse on its range). (b) 0 < IITxII for every nonzero x in X. Definition 4.19. A linear transformation T of a normed space X into a normed space Y is bounded below if there exists a constant a > 0 such that aIIxII
IITxII
for every x E X (where the norm on the left-hand side is the norm on X and that on the right-hand side is the norm on Y).
Note: T E B[X, y] actually means that T E L[X, y] is bounded above. The next result is a continuous version of the previous proposition.
Proposition 4.20. Take T ∈ L[X, Y], where X and Y are normed spaces. The following assertions are equivalent.

(a) There exists T⁻¹ ∈ B[R(T), X] (i.e., T has a continuous inverse on its range).

(b) T is bounded below.
Moreover, if X is a Banach space, then each of the above equivalent assertions implies that R(T)⁻ = R(T) (i.e., R(T) is closed in Y).

Proof. (i) If (a) holds true, then there exists a constant β > 0 such that ‖T⁻¹y‖ ≤ β‖y‖ for every y ∈ R(T) (Definition 4.11). Take an arbitrary x ∈ X, so that Tx ∈ R(T). Thus ‖x‖ = ‖T⁻¹Tx‖ ≤ β‖Tx‖, and hence β⁻¹‖x‖ ≤ ‖Tx‖. Therefore (a)⇒(b).

(ii) Conversely, if (b) holds true, then there exists α > 0 such that α‖x‖ ≤ ‖Tx‖, so that 0 < ‖Tx‖ for every nonzero x in X, which implies that there exists T⁻¹ ∈ L[R(T), X] by Proposition 4.18. Take an arbitrary y ∈ R(T), so that y = Tx for some x ∈ X. Thus ‖T⁻¹y‖ = ‖T⁻¹Tx‖ = ‖x‖ ≤ α⁻¹‖Tx‖ = α⁻¹‖y‖, which means that T⁻¹ is bounded (Definition 4.11). That is, (b)⇒(a).

(iii) Now take an arbitrary R(T)-valued convergent sequence {y_n}. Since y_n lies in R(T) for every n, it follows that there exists an X-valued sequence {x_n} such that y_n = Tx_n for each n. Thus {Tx_n} converges in Y, and hence {Tx_n} is a Cauchy sequence in Y. If T is bounded below, then there exists α > 0 such that

0 ≤ α‖x_m − x_n‖ ≤ ‖T(x_m − x_n)‖ = ‖Tx_m − Tx_n‖

for every pair of indices m and n (recall: T is linear), so that {x_n} also is a Cauchy sequence (in X), and therefore it converges in X to, say, x ∈ X (whenever X is a Banach space). But T is continuous, and so y_n = Tx_n → Tx (Corollary 3.8), which implies that the (unique) limit of {y_n} lies in R(T). Conclusion: R(T) is closed in Y by the Closed Set Theorem (Theorem 3.30). □
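For a matrix acting on Euclidean space, being bounded below is the statement that the smallest singular value is positive, and then the norm of the inverse is its reciprocal. A sketch (assuming numpy; the matrix is an illustrative choice):

```python
import numpy as np

T = np.array([[3.0, 1.0],
              [0.0, 2.0]])   # injective: N(T) = {0}

alpha = np.linalg.svd(T, compute_uv=False)[-1]   # smallest singular value
assert alpha > 0                                 # T is bounded below

# alpha ||x|| <= ||T x|| on a random sample of vectors ...
rng = np.random.default_rng(3)
x = rng.normal(size=(200, 2))
assert np.all(alpha * np.linalg.norm(x, axis=1)
              <= np.linalg.norm(x @ T.T, axis=1) + 1e-9)

# ... and the inverse is bounded, with ||T^{-1}|| = 1/alpha.
assert np.isclose(np.linalg.norm(np.linalg.inv(T), 2), 1.0 / alpha)
```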
Example 4J. Take a sequence a = {a_k}_{k=0}^∞ in ℓ₊^∞ and consider the diagonal operator D_a in B[ℓ₊^p] (for some p ≥ 1) of Example 4H. If a_k ≠ 0 for every k ≥ 0 and D_a x = 0 for some x = {ξ_k}_{k=0}^∞ ∈ ℓ₊^p, then a_k ξ_k = 0, and hence ξ_k = 0, for every k ≥ 0, so that x = 0 (i.e., a_k ≠ 0 for every k ≥ 0 implies N(D_a) = {0}). Conversely, if N(D_a) = {0}, then D_a x ≠ 0 for every nonzero x in ℓ₊^p. In particular, D_a e_i = a_i e_i ≠ 0, so that ‖D_a e_i‖ = |a_i| ≠ 0, and hence a_i ≠ 0, for every i ≥ 0 (see Example 4H). Conclusion:

N(D_a) = {0} if and only if a_k ≠ 0 for every k ≥ 0.

Therefore, since a linear transformation has a (linear) inverse on its range if and only if its null space is zero (i.e., if and only if it is injective), we may conclude: There exists D_a⁻¹ ∈ L[R(D_a), ℓ₊^p] if and only if a_k ≠ 0 for every k ≥ 0. In this case, the linear (but not necessarily bounded) transformation D_a⁻¹ is again a diagonal mapping,

D_a⁻¹ y = {a_k⁻¹ υ_k}_{k=0}^∞   for every   y = {υ_k}_{k=0}^∞ ∈ R(D_a) ⊆ ℓ₊^p,

whose domain D(D_a⁻¹) is precisely the range R(D_a) of D_a (reason: such a mapping is a diagonal, as diagonals were defined in Problem 3.22, and D_a⁻¹D_a is the identity on ℓ₊^p). Recall that D(D_a⁻¹) = ℓ₊^p whenever sup_k |a_k⁻¹| < ∞ (Problem 3.22(a)), and hence R(D_a) = ℓ₊^p whenever inf_k |a_k| > 0 (since inf_k |a_k| > 0 implies sup_k |a_k⁻¹| < ∞). Moreover,

‖D_a x‖ = (Σ_{k=0}^∞ |a_k ξ_k|^p)^{1/p} ≥ inf_k |a_k| (Σ_{k=0}^∞ |ξ_k|^p)^{1/p} = inf_k |a_k| ‖x‖

for every x = {ξ_k}_{k=0}^∞ ∈ ℓ₊^p, so that D_a is bounded below whenever inf_k |a_k| > 0. Conversely, if D_a is bounded below, then there exists α > 0 such that α ≤ ‖D_a e_i‖ = |a_i| for every i ≥ 0, so that inf_i |a_i| ≥ α > 0. Outcome (cf. Proposition 4.20):

There exists D_a⁻¹ ∈ B[ℓ₊^p] if and only if inf_k |a_k| > 0.

That is, the inverse of a diagonal operator D_a in B[ℓ₊^p] exists and is itself a diagonal operator in B[ℓ₊^p] if and only if the bounded sequence a = {a_k}_{k=0}^∞ is bounded away from zero (see Problem 4.5).
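Finite sections of D_a show how injectivity alone fails to give a bounded inverse. A sketch (assuming numpy) with a_k = 1/(k+1): every a_k ≠ 0, yet inf_k |a_k| = 0, and the inverses of the n × n sections have norms growing without bound:

```python
import numpy as np

# Finite n x n sections of D_a with a_k = 1/(k+1): every a_k != 0, so
# each section is invertible, but inf_k |a_k| = 0 and the inverse
# norms 1/min|a_k| = n grow without bound.
inverse_norms = []
for n in (4, 16, 64):
    a = 1.0 / np.arange(1, n + 1)
    inverse_norms.append(np.linalg.norm(np.diag(1.0 / a), 2))

assert np.allclose(inverse_norms, [4.0, 16.0, 64.0])
```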
If T ∈ B[X, Y], X is a Banach space, and there exists T⁻¹ ∈ B[Y, X], then Y is a Banach space (proof: Theorem 4.14, Lemma 3.43 and Corollary 3.8; note that this supplies another proof for part of Proposition 4.20). Question: Does the converse hold? That is, if X and Y are Banach spaces, and if T ∈ B[X, Y] is invertible in the set-theoretic sense (i.e., injective and surjective), does it follow that T⁻¹ ∈ B[Y, X]? Yes, it does. This is the Inverse Mapping Theorem, but to prove it we need a fundamental result in the theory of bounded linear transformations between Banach spaces, namely, the Open Mapping Theorem. Recall that a function F : X → Y of a metric space X into a metric space Y is an open mapping if F(U) is open in Y whenever U is open in X (Section 3.4). The Open Mapping Theorem says that every continuous linear transformation of a Banach space onto a Banach space is an open mapping. In other words, a surjective bounded linear transformation between Banach spaces maps open sets into open sets.

Theorem 4.21. (The Open Mapping Theorem). If X and Y are Banach spaces and T ∈ B[X, Y] is surjective, then T is an open mapping.
Proof. Let T be a surjective bounded linear transformation of a Banach space X onto a Banach space Y. The nonempty open ball with center at the origin of X and radius ε will be denoted by X_ε, and the nonempty open ball with center at the origin of Y and radius δ will be denoted by Y_δ.

Claim 1. For every ε > 0 there exists δ > 0 such that Y_δ ⊆ T(X_ε)⁻ (i.e., T(X_ε)⁻ is a neighborhood of 0 in Y).

Proof. Since each x in X belongs to X_n for every integer n > ‖x‖, it follows that the sequence of open balls {X_n} covers X. That is, X = ⋃_{n=1}^∞ X_n, and hence
T(X) = ⋃_{n=1}^∞ T(X_n) (see Problem 1.2(e)). Thus the countable family {T(X_n)} of subsets of Y covers T(X). If T(X) = Y and Y is complete, then the Baire Category Theorem (Theorem 3.58) ensures the existence of a positive integer m such that T(X_m)⁻ has nonempty interior. Since T is linear, T(X_ε) = (ε/m) T(X_m), and so T(X_ε)⁻ = (ε/m) T(X_m)⁻ has nonempty interior for every ε > 0 (because multiplication by a nonzero scalar is a homeomorphism). Take an arbitrary ε > 0 and an arbitrary y₀ ∈ [T(X_{ε/2})⁻]°, so that there exists a nonempty open ball with center at y₀, say B_δ(y₀), such that B_δ(y₀) ⊆ T(X_{ε/2})⁻. If y is an arbitrary point of Y_δ, then ‖y + y₀ − y₀‖ = ‖y‖ < δ, and hence y + y₀ ∈ B_δ(y₀). Thus both y + y₀ and y₀ lie in T(X_{ε/2})⁻, which means that

inf_{u∈X_{ε/2}} ‖Tu − y − y₀‖ = 0   and   inf_{v∈X_{ε/2}} ‖Tv − y₀‖ = 0.

Therefore (recall: u, v ∈ X_{ε/2} implies u − v ∈ X_ε),

inf_{x∈X_ε} ‖Tx − y‖ ≤ inf_{u,v∈X_{ε/2}} ‖T(u − v) − y‖ = inf_{u,v∈X_{ε/2}} ‖Tu − y − y₀ + y₀ − Tv‖ ≤ inf_{u∈X_{ε/2}} ‖Tu − y − y₀‖ + inf_{v∈X_{ε/2}} ‖Tv − y₀‖ = 0,

and hence y ∈ T(X_ε)⁻. Conclusion: Y_δ ⊆ T(X_ε)⁻. □
Claim 2. For every ε > 0 there exists δ > 0 such that Y_δ ⊆ T(X_ε) (i.e., we may even erase the closure symbol from Claim 1).

Proof. Take an arbitrary ε > 0, and set ε_n = ε/2ⁿ for every n ∈ ℕ. According to Claim 1, for each n ∈ ℕ there exists δ_n ∈ (0, ε_n) such that Y_{δ_n} ⊆ T(X_{ε_n})⁻. If y_n ∈ Y_{δ_n} ⊆ T(X_{ε_n})⁻ for some n ∈ ℕ, then inf_{x∈X_{ε_n}} ‖y_n − Tx‖ = 0, and so there exists x_n ∈ X_{ε_n} such that ‖y_n − Tx_n‖ < δ_{n+1}. Set δ = δ₁ and take an arbitrary y ∈ Y_δ. We claim that there exists an X-valued sequence {x_n} with the properties

x_n ∈ X_{ε_n}   and   y − Σ_{k=1}^n Tx_k ∈ Y_{δ_{n+1}}

for every n ∈ ℕ. Indeed, y ∈ Y_{δ₁} implies that there exists x₁ ∈ X_{ε₁} such that ‖y − Tx₁‖ < δ₂, and hence the above properties hold for x₁ (i.e., they hold for n = 1). Now suppose they hold for some n ≥ 1, so that there exists x_k ∈ X_{ε_k} for each k = 1, …, n such that y − Σ_{k=1}^n Tx_k ∈ Y_{δ_{n+1}}. This implies that there exists x_{n+1} ∈ X_{ε_{n+1}} such that ‖y − Σ_{k=1}^n Tx_k − Tx_{n+1}‖ < δ_{n+2}, and hence there exists x_k ∈ X_{ε_k} for each k = 1, …, n+1 such that y − Σ_{k=1}^{n+1} Tx_k ∈ Y_{δ_{n+2}}. Therefore,
4.5 The Open Mapping Theorem and Continuous Inverses
assuming that the above properties hold for some n ≥ 1, we conclude that they hold for n + 1, which completes the induction argument. Now we shall use this sequence {x_n} to show that y = Tx for some x ∈ X.
Since ‖x_n‖ < ε/2ⁿ for each n ∈ ℕ, Σ_{k=1}^∞ ‖x_k‖ < Σ_{k=1}^∞ ε/2ᵏ = ε (for Σ_{k=1}^∞ 1/2ᵏ = 1), and so {x_n} is absolutely summable. Since X is a Banach space, this implies that {x_n} is summable (Proposition 4.4), which means that (Σ_{k=1}^n x_k) converges in X. That is, there exists x ∈ X such that

\[
\sum_{k=1}^{n} x_k \to x \ \text{ in } X \ \text{ as } n \to \infty.
\]

Moreover, ‖Σ_{k=1}^n x_k‖ ≤ Σ_{k=1}^n ‖x_k‖ < ε for all n, so that

\[
\|x\| = \lim_n \Big\|\sum_{k=1}^{n} x_k\Big\| \le \sum_{k=1}^{\infty} \|x_k\| < \varepsilon
\]

(recall: every norm is continuous), and hence x ∈ X_ε. Since T is continuous and linear, and since

\[
\sum_{k=1}^{n} Tx_k \to y \ \text{ in } Y \ \text{ as } n \to \infty
\]

(for ‖y − Σ_{k=1}^n Tx_k‖ < δ_{n+1} < ε_{n+1} = ε/2^{n+1} for every n), it follows that

\[
Tx = T\Big(\lim_n \sum_{k=1}^{n} x_k\Big) = \lim_n \sum_{k=1}^{n} Tx_k = y,
\]

and so y ∈ T(X_ε). That is, an arbitrary point of Y_δ lies in T(X_ε). Therefore Y_δ ⊆ T(X_ε). □
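The summability step above can be sketched numerically. The following is an illustrative check, not part of the proof: the setting X = ℝ² with the Euclidean norm and the randomly drawn vectors are assumptions for the sketch. If ‖x_n‖ < ε/2ⁿ for every n, then every partial sum stays in the open ε-ball and Σ‖x_k‖ < ε.

```python
import math
import random

# Illustrative check (assumed setting: X = R^2 with the Euclidean norm):
# if ||x_n|| < eps/2^n, every partial sum stays in the open eps-ball,
# and the series of norms is bounded by eps (absolute summability).
random.seed(0)
eps = 1.0

def norm(v):
    return math.hypot(v[0], v[1])

xs = []
for n in range(1, 40):
    r = random.uniform(0.0, eps / 2**n)        # radius below eps/2^n
    t = random.uniform(0.0, 2 * math.pi)
    xs.append((r * math.cos(t), r * math.sin(t)))

partial = (0.0, 0.0)
for x in xs:
    partial = (partial[0] + x[0], partial[1] + x[1])
    assert norm(partial) < eps                 # triangle inequality: stays in X_eps

total = sum(norm(x) for x in xs)
assert total < eps                             # sum of norms below eps
```

The geometric weights ε/2ⁿ are exactly what makes the inductive construction in Claim 2 land inside X_ε.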
Let U be any open subset of X and take an arbitrary z E T (U). Thus there exists
u ∈ U such that z = Tu. But u is an interior point of U (because U is open in X), which means that there exists a nonempty open ball included in U with center at u, say B_ε(u) ⊆ U. Translate this ball to the origin of X and note that the translated ball coincides with the open ball of radius ε about the origin of X:

\[
X_\varepsilon = B_\varepsilon(u) - u = \{x \in X : x = v - u \ \text{for some}\ v \in B_\varepsilon(u)\}.
\]

According to Claim 2, there exists a nonempty open ball Y_δ about the origin of Y such that Y_δ ⊆ T(X_ε). Translate this ball to the point z and get the open ball with center at z and radius δ:

\[
B_\delta(z) = Y_\delta + z = \{w \in Y : w = y + z \ \text{for some}\ y \in Y_\delta\}.
\]
4. Banach Spaces
Then, since T is linear,
B_δ(z) ⊆ T(X_ε) + Tu = T(X_ε + u) = T(B_ε(u)) ⊆ T(U), and hence z is an interior point of T(U). Conclusion: Every point of T(U) is interior, which means that T(U) is open in Y. □

A straightforward application of the Open Mapping Theorem says that an injective and surjective bounded linear transformation between Banach spaces has a bounded inverse.
Theorem 4.22. (The Inverse Mapping Theorem or The Banach Continuous Inverse Theorem). If X and Y are Banach spaces and T ∈ B[X, Y] is injective and surjective, then T⁻¹ ∈ B[Y, X].

Proof. Theorems 2.10, 3.20 and 4.21. □
If X and Y are normed spaces, then an element T of B[X, Y] is called invertible if it is an invertible mapping in the set-theoretic sense (i.e., injective and surjective) and its inverse T⁻¹ lies in B[Y, X]. According to Theorem 2.10 this can be briefly stated as: T ∈ B[X, Y] is invertible if it has a bounded inverse. We shall denote the set of all invertible elements of B[X, Y] by G[X, Y]. The Inverse Mapping Theorem says that, if X and Y are Banach spaces, then both meanings of the term "invertible" coincide, and T⁻¹ ∈ G[Y, X] whenever T ∈ G[X, Y] (for (T⁻¹)⁻¹ = T; see Problem 1.8). Note that G[X, Y] is not a linear manifold of B[X, Y]: addition is not a binary operation on G[X, Y] (e.g., diag({k(k+1)⁻¹}_{k=1}^∞) − I = −diag({(k+1)⁻¹}_{k=1}^∞); see Example 4J). On the other hand, the composition (product) of invertible bounded linear transformations is again an invertible bounded linear transformation.
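The failure of invertibility under addition can be seen on finite truncations; the specific diagonal entries below are an assumption for the sketch, chosen in the spirit of Example 4J. Both diag({k/(k+1)}) and −I have inverses of uniformly bounded norm, but the sum diag({−1/(k+1)}) has truncated inverses whose norms grow without bound.

```python
# Finite-truncation sketch (assumed diagonal entries, in the spirit of
# Example 4J): A = diag(k/(k+1)) and -I are both invertible, yet their
# sum diag(-1/(k+1)) has inverse entries -(k+1), unbounded as k grows.
def inverse_sup(entries):
    # sup-norm of the inverse of a diagonal operator with the given entries
    return max(1.0 / abs(e) for e in entries)

for n in (10, 100, 1000):
    a = [k / (k + 1) for k in range(1, n + 1)]   # entries of A (invertible)
    s = [ak - 1.0 for ak in a]                   # entries of A - I: -1/(k+1)
    assert inverse_sup(a) == 2.0                 # ||A^{-1}|| = 2, independent of n
    assert abs(inverse_sup(s) - (n + 1)) < 1e-6  # grows like n + 1
```

The uniform bound 2 on the truncated inverses of A is what bounded invertibility means; no such bound exists for A − I, so the sum of two invertible operators need not be invertible.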
Corollary 4.23. If T ∈ G[X, Y] and S ∈ G[Y, Z], then ST ∈ G[X, Z] and (ST)⁻¹ = T⁻¹S⁻¹ whenever X, Y and Z are Banach spaces.

Proof. Problem 1.10, Proposition 4.16 and Theorem 4.22. □
Corollary 4.24. Let X and Y be Banach spaces and take T in B[X, Y]. The following assertions are pairwise equivalent.

(a) There exists T⁻¹ ∈ B[R(T), X] (i.e., T has a continuous inverse on its range).

(b) T is bounded below.

(c) N(T) = {0} and R(T)⁻ = R(T) (i.e., T is injective and has a closed range).
Proof. According to Proposition 4.20, assertions (a) and (b) are equivalent and each of them implies R(T)⁻ = R(T) whenever X is a Banach space. Moreover,
(b) trivially implies N(T) = {0}. Thus any of the equivalent assertions (a) or (b) implies (c). On the other hand, if N(T) = {0}, then T is injective. If, in addition, the linear manifold R(T) is closed in the Banach space Y, then it is itself a Banach space (Proposition 4.7), so that T : X → R(T) is an injective and surjective bounded linear transformation of the Banach space X onto the Banach space R(T). Therefore, its inverse T⁻¹ lies in B[R(T), X] by the Inverse Mapping Theorem. That is, (c) implies (a). □
Let X ≠ {0} be a normed space. Consider the algebra L[X] (of all linear transformations on X), and also the normed algebra B[X] (of all operators on X), which is a subalgebra (in the purely algebraic sense) of L[X]. Let G[X] denote the set of all invertible operators in B[X] (i.e., set G[X] = G[X, X]). Suppose X is a Banach space, and let T ∈ B[X] be an invertible element of the algebra L[X] (i.e., there exists T⁻¹ ∈ L[X] such that T⁻¹T = TT⁻¹ = I, where I stands for the identity of L[X]). Thus, as we have already observed, the Inverse Mapping Theorem ensures that the concept of an invertible element of the Banach algebra B[X] is unambiguously defined (for T⁻¹ ∈ B[X]). In other words, the set-theoretic inverse of an operator on a Banach space is again an operator on the same Banach space. Moreover (since the inverse of an invertible operator is itself invertible), the set G[X] of all invertible operators in B[X] forms a group under multiplication (every operator in G[X] ⊆ B[X] has an inverse in G[X]) whenever X is a Banach space.

We close this section with another important application of the Open Mapping Theorem. Let X and Y be normed spaces (over the same scalar field) and let X ⊕ Y be their direct sum, which is a linear space whose underlying set is the Cartesian product X × Y of the underlying sets of the linear spaces X and Y. Recall that the graph of a transformation T : X → Y is the set
\[
G_T = \{(x, y) \in X \times Y : y = Tx\} = \{(x, Tx) \in X \oplus Y : x \in X\}
\]

(see Section 1.2). If T is linear, then G_T is a linear manifold of the linear space X ⊕ Y (for α(u, Tu) + β(v, Tv) = (αu + βv, T(αu + βv)) for every u, v ∈ X and every pair of scalars {α, β}). Equip the linear space X ⊕ Y with any of the norms of Example 4E and consider the normed space X ⊕ Y.
Theorem 4.25. (The Closed Graph Theorem). If X and Y are Banach spaces and T ∈ L[X, Y], then T is continuous (i.e., T ∈ B[X, Y]) if and only if G_T is closed in X ⊕ Y.

Proof.
Let P_X : X ⊕ Y → X and P_Y : X ⊕ Y → Y be defined by

\[
P_X(x, y) = x
\quad\text{and}\quad
P_Y(x, y) = y
\]

for every (x, y) ∈ X ⊕ Y. Consider the restriction P_X|G_T : G_T → X of P_X to the linear manifold G_T of the normed space X ⊕ Y. Observe that P_X and P_Y are the natural projections of X ⊕ Y onto X and Y, respectively, which are both linear and bounded. Thus P_Y ∈ B[X ⊕ Y, Y] and P_X|G_T ∈ B[G_T, X]
(Problems 2.14 and 3.30). Moreover, P_X|G_T is clearly surjective and injective (if P_X(x, Tx) = 0, then x = 0 and hence (x, Tx) = (0, 0) ∈ G_T; that is, N(P_X|G_T) = {(0, 0)}).
(a) Recall that X ⊕ Y is a Banach space whenever X and Y are Banach spaces (Example 4E). If G_T is closed in X ⊕ Y, then G_T is itself a Banach space (Proposition 4.7), and the Inverse Mapping Theorem (Theorem 4.22) ensures that the inverse (P_X|G_T)⁻¹ of P_X|G_T lies in B[X, G_T]. Since T = P_Y(P_X|G_T)⁻¹ (for T P_X|G_T = P_Y|G_T), it follows by Proposition 4.16 that T is bounded.
(b) Conversely, take an arbitrary sequence {(x_n, Tx_n)} in G_T that converges in X ⊕ Y to, say, (x, y) ∈ X ⊕ Y. Since P_X and P_Y are continuous, it follows by Corollary 3.8 that

\[
\lim x_n = \lim P_X(x_n, Tx_n) = P_X\big(\lim (x_n, Tx_n)\big) = P_X(x, y) = x,
\]
\[
\lim Tx_n = \lim P_Y(x_n, Tx_n) = P_Y\big(\lim (x_n, Tx_n)\big) = P_Y(x, y) = y.
\]

If T is continuous, then (Corollary 3.8 again)

\[
y = \lim Tx_n = T\big(\lim x_n\big) = Tx,
\]

and hence (x, y) = (x, Tx) ∈ G_T. Therefore G_T is closed in X ⊕ Y by the Closed Set Theorem (Theorem 3.30). □
4.6 Equivalence and Finite-Dimensional Spaces
Recall that two sets are said to be equivalent if there exists a one-to-one correspondence (i.e., an injective and surjective mapping or, equivalently, an invertible mapping) between them (Chapter 1). Two linear spaces are isomorphic if there exists an isomorphism (i.e., an invertible linear transformation) of one of them onto the other (recall: the inverse of an invertible linear transformation is again a linear transformation). An isomorphism (or a linear-space isomorphism) is then a one-to-one correspondence that preserves the linear operations between the linear spaces, and hence it preserves the algebraic structure (Chapter 2). Two topological spaces are homeomorphic if there exists a homeomorphism (i.e., an invertible continuous mapping whose inverse also is continuous) of one of them onto the other. A homeomorphism provides a one-to-one correspondence between the topologies on the respective spaces, thus preserving the topological structure. In particular, a homeomorphism preserves convergence. A uniform homeomorphism between metric spaces is a homeomorphism where continuity (in both senses) is strengthened to uniform continuity, so that a uniform homeomorphism also preserves Cauchy sequences. Two metric spaces are uniformly homeomorphic if there exists a uniform homeomorphism of one of them onto the other (Chapter 3).
Now, as one would expect, we shall be interested in preserving both algebraic and topological structures between two normed spaces. A mapping of a normed space X onto a normed space Y that is (simultaneously) a homeomorphism and an isomorphism is called a topological isomorphism (or an equivalence), and X and Y are said to be topologically isomorphic (or equivalent) if there exists a topological isomorphism between them. Clearly, continuity refers to the norm topologies: a topological isomorphism is a mapping of a normed space X onto a normed space Y which is a homeomorphism when X and Y are viewed as metric spaces (equipped with the metrics generated by their respective norms), and also is an isomorphism between the linear spaces X and Y. Since an isomorphism is just an injective and surjective linear transformation between linear spaces, it follows that a topological isomorphism is simply a linear homeomorphism between normed spaces. Thus the topological isomorphisms between X and Y are precisely the elements of G[X, Y]: if X and Y are normed spaces, then W : X → Y is a topological isomorphism if and only if W is an invertible element of B[X, Y]. Therefore, X and Y are topologically isomorphic if and only if there exists a linear homeomorphism between them or, equivalently, if and only if G[X, Y] ≠ ∅. Moreover, the elements of G[X, Y] are also characterized as those linear-space isomorphisms that are bounded above and below (Theorem 4.14 and Proposition 4.20). Thus X and Y are topologically isomorphic if and only if there exists an isomorphism W ∈ L[X, Y] and a pair of positive constants α and β such that

\[
\alpha\|x\| \le \|Wx\| \le \beta\|x\|
\quad\text{for every}\quad x \in X.
\]
The Inverse Mapping Theorem says that an injective and surjective bounded linear transformation of a Banach space onto a Banach space is a homeomorphism, and hence a topological isomorphism: if X and Y are Banach spaces, then they are topologically isomorphic if and only if there exists an isomorphism in B[X, Y]. Two norms on the same linear space are said to be equivalent if the metrics generated by them are equivalent. In other words (see Section 3.4), ‖·‖₁ and ‖·‖₂ are equivalent norms on a linear space X if and only if they induce the same norm topology on X.

Proposition 4.26. Let ‖·‖₁ and ‖·‖₂ be two norms on the same linear space X. The following assertions are pairwise equivalent.

(a) ‖·‖₁ and ‖·‖₂ are equivalent norms.

(b) The identity map between the normed spaces (X, ‖·‖₁) and (X, ‖·‖₂) is a topological isomorphism.

(c) There exist real constants α > 0 and β > 0 such that

\[
\alpha\|x\|_1 \le \|x\|_2 \le \beta\|x\|_1
\quad\text{for every}\quad x \in X.
\]
Proof. Recall that the identity obviously is an isomorphism of a linear space onto itself, and hence it is a topological isomorphism between the normed spaces (X, ‖·‖₁) and (X, ‖·‖₂) if and only if it is a homeomorphism. Thus assertions (a) and (b) are equivalent by Corollary 3.19. Assertion (c) simply says that the identity I : (X, ‖·‖₁) → (X, ‖·‖₂) is bounded above and below or, equivalently, that it is continuous and its inverse I⁻¹ : (X, ‖·‖₂) → (X, ‖·‖₁) also is continuous (Theorem 4.14 and Proposition 4.20), which means that it is a homeomorphism. That is, assertions (b) and (c) are equivalent. □
It is worth noticing that metrics generated by norms are equivalent if and only if they are uniformly equivalent. Indeed, continuity coincides with uniform continuity for linear transformations between normed spaces (Theorem 4.14), and hence the identity I : (X, ‖·‖₁) → (X, ‖·‖₂) is a homeomorphism if and only if it is a uniform homeomorphism. Observe that, according to Problem 3.33, all norms of Example 4E are equivalent. In particular, all norms of Example 4A are equivalent.
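A concrete sketch of Proposition 4.26(c) may help; the choice of space (ℝ³) and of the two norms below is an assumption for illustration, not from the text. On ℝⁿ the norms ‖·‖₁ and ‖·‖∞ are equivalent with explicit constants α = 1 and β = n, that is, ‖x‖∞ ≤ ‖x‖₁ ≤ n‖x‖∞.

```python
import random

# Sketch of Proposition 4.26(c) on the assumed example R^3:
# ||x||_inf <= ||x||_1 <= n * ||x||_inf, i.e. alpha = 1 and beta = n.
random.seed(1)
n = 3

def norm1(x):
    return sum(abs(c) for c in x)

def norminf(x):
    return max(abs(c) for c in x)

for _ in range(1000):
    x = [random.uniform(-10.0, 10.0) for _ in range(n)]
    assert norminf(x) <= norm1(x) <= n * norminf(x)
```

The constants are sharp: x = (1, 0, 0) attains the left inequality and x = (1, 1, 1) the right one.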
Theorem 4.27. If X is a finite-dimensional linear space, then any two norms on X are equivalent.

Proof. Let B = {e_i}_{i=1}^n be a Hamel basis for an n-dimensional linear space X over F, so that every vector in X has a unique expansion on B,

\[
x = \sum_{i=1}^{n} \xi_i e_i,
\]

where {ξ_i}_{i=1}^n is a family of scalars: the coordinates of x with respect to B. It is easy to show that the function ‖·‖₀ : X → ℝ, defined by

\[
\|x\|_0 = \max_{1 \le i \le n} |\xi_i|
\]

for every x ∈ X, is a norm on X. Now take an arbitrary norm ‖·‖ on X. We shall verify that ‖·‖₀ and ‖·‖ are equivalent. First observe that

\[
\|x\| = \Big\|\sum_{i=1}^{n} \xi_i e_i\Big\| \le \sum_{i=1}^{n} |\xi_i| \|e_i\| \le \Big(\sum_{i=1}^{n} \|e_i\|\Big) \|x\|_0
\]
for every x ∈ X. To show the other inequality, consider the linear space Fⁿ equipped with the norm ‖·‖∞ of Example 4A; that is,

\[
\|a\|_\infty = \max_{1 \le i \le n} |\alpha_i|
\]

for every vector a = (α₁, …, α_n) ∈ Fⁿ; and consider the transformation L : (Fⁿ, ‖·‖∞) → (X, ‖·‖) defined by

\[
La = \sum_{i=1}^{n} \alpha_i e_i
\]

for every a = (α₁, …, α_n) ∈ Fⁿ. It is readily verified that L is linear and bounded.
Indeed,

\[
\|La\| = \Big\|\sum_{i=1}^{n} \alpha_i e_i\Big\| \le \Big(\sum_{i=1}^{n} \|e_i\|\Big) \|a\|_\infty,
\]

so that the transformation L is continuous (Theorem 4.14), and so is the real-valued function φ : (Fⁿ, ‖·‖∞) → ℝ such that

\[
\varphi(a) = \|La\|
\]

for every a = (α₁, …, α_n) ∈ Fⁿ (recall: ‖·‖ : X → ℝ is continuous, and the composition of continuous functions is a continuous function). The unit sphere S = ∂B₁[0] = {a ∈ Fⁿ : ‖a‖∞ = 1} is compact in (Fⁿ, ‖·‖∞) by Theorem 3.83, and hence the continuous function φ assumes a minimum value on S (Theorem 3.86). Moreover, since {e_i}_{i=1}^n is linearly independent, it follows that the linear transformation L is injective (La = 0 if and only if a = 0), so that this minimum value of φ on S is positive (for φ(a) = ‖La‖ > 0 for every a ∈ S). Summing up: there exists
a_m ∈ S such that

\[
0 < \varphi(a_m) \le \varphi(a)
\]

for all a ∈ S. Let x̂ = (ξ₁, …, ξ_n) ∈ Fⁿ be the n-tuple whose entries are the coordinates {ξ_i}_{i=1}^n of an arbitrary x = Σ_{i=1}^n ξ_i e_i in X with respect to the basis B. Note that ‖x‖₀ = ‖x̂‖∞ ≠ 0 whenever x ≠ 0, and hence

\[
\varphi(a_m)\|x\|_0 \le \varphi\big(\|\hat{x}\|_\infty^{-1}\hat{x}\big)\,\|\hat{x}\|_\infty = \Big\|\sum_{i=1}^{n} \xi_i e_i\Big\| = \|x\|
\]
for every nonzero x in X. Therefore, by setting α = φ(a_m) > 0 and β = Σ_{i=1}^n ‖e_i‖ > 0 (which do not depend on x), it follows that

\[
\alpha\|x\|_0 \le \|x\| \le \beta\|x\|_0
\]

for every x ∈ X. Conclusion: Every norm on X is equivalent to the norm ‖·‖₀, and hence every two norms on X are equivalent. □
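The compactness argument in the proof can be mimicked numerically. Everything concrete below (the space ℝ², the basis e₁ = (1, 0) and e₂ = (1, 1), the Euclidean norm ‖·‖, and the sampling grid) is an assumed illustration, not from the text: sampling the ‖·‖∞ unit sphere approximates α = min_{a∈S} φ(a), β = Σ‖e_i‖ is computed directly, and the two constants then sandwich ‖x‖ between multiples of ‖x‖₀.

```python
import math
import random

# Assumed illustration of the proof in R^2: basis e1 = (1,0), e2 = (1,1),
# ||.|| the Euclidean norm, ||x||_0 = max |coordinate|.
e = [(1.0, 0.0), (1.0, 1.0)]

def L(a):  # La = a1*e1 + a2*e2
    return (a[0] * e[0][0] + a[1] * e[1][0], a[0] * e[0][1] + a[1] * e[1][1])

def euc(v):
    return math.hypot(v[0], v[1])

# sample the ||.||_inf unit sphere S = {a : max(|a1|,|a2|) = 1} edge by edge
ts = [i / 1000 for i in range(-1000, 1001)]
sphere = ([(1.0, t) for t in ts] + [(-1.0, t) for t in ts]
          + [(t, 1.0) for t in ts] + [(t, -1.0) for t in ts])

alpha = min(euc(L(a)) for a in sphere)   # approximates min of phi on S (> 0)
beta = sum(euc(v) for v in e)            # = 1 + sqrt(2)
assert abs(alpha - math.sqrt(0.5)) < 1e-9  # exact minimum, at a = (1, -1/2)

random.seed(2)
for _ in range(500):
    xi = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    x0 = max(abs(xi[0]), abs(xi[1]))     # ||x||_0 (coordinates w.r.t. basis)
    nx = euc(L(xi))                      # ||x||
    assert alpha * x0 <= nx + 1e-9       # alpha ||x||_0 <= ||x||
    assert nx <= beta * x0 + 1e-9        # ||x|| <= beta ||x||_0
```

The sketch also shows why injectivity of L matters: a skewed basis keeps the minimum of φ on the sphere strictly positive, here 1/√2.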
Corollary 4.28. A finite-dimensional normed space is a Banach space.
Proof. Let X be an n-dimensional linear space over F and let B = {e_i}_{i=1}^n be a Hamel basis for X. Take an arbitrary X-valued sequence {x_k}_{k≥1}, so that

\[
x_k = \sum_{i=1}^{n} \xi_k(i)\, e_i,
\]
where {ξ_k(i)}_{i=1}^n is a family of scalars (the coordinates of x_k with respect to the Hamel basis B) for each integer k ≥ 1. Equip X with a norm ‖·‖ and suppose {x_k}_{k≥1} is a Cauchy sequence in (X, ‖·‖). Consequently, for every ε > 0 there exists an integer k_ε ≥ 1 such that ‖x_j − x_k‖ < ε whenever j, k ≥ k_ε. Now consider the equivalent norm ‖·‖₀ on X (which was defined in the proof of Theorem 4.27) so that, for every j, k ≥ k_ε,

\[
\max_{1 \le i \le n} |\xi_j(i) - \xi_k(i)| = \|x_j - x_k\|_0 \le \alpha^{-1}\|x_j - x_k\| < \alpha^{-1}\varepsilon
\]

for some positive constant α. Hence {ξ_k(i)}_{k≥1} is a Cauchy sequence in (F, |·|) for each i = 1, …, n, where |·| stands for the usual norm on F. As (F, |·|) is a Banach space (Example 4A), each sequence {ξ_k(i)}_{k≥1} converges in (F, |·|) to, say, ξ(i) ∈ F.
That is, for every ε > 0 there exists an integer k_{i,ε} ≥ 1 for each i = 1, …, n such that |ξ_k(i) − ξ(i)| < ε whenever k ≥ k_{i,ε}. By setting x = Σ_{i=1}^n ξ(i) e_i in X we get

\[
\|x_k - x\| \le \sum_{i=1}^{n} |\xi_k(i) - \xi(i)|\, \|e_i\| < \Big(\sum_{i=1}^{n} \|e_i\|\Big)\varepsilon
\]

whenever k ≥ k_ε = max{k_{i,ε}}_{i=1}^n, and hence x_k → x in (X, ‖·‖) as k → ∞. Conclusion: If X is a finite-dimensional linear space and ‖·‖ is any norm on X, then every Cauchy sequence in (X, ‖·‖) converges in (X, ‖·‖), and therefore (X, ‖·‖) is a Banach space. □

Corollary 4.29. Every finite-dimensional linear manifold of any normed space X is a subspace of X.
Proof. Corollary 4.28 and Theorem 3.40(a). □
Corollary 4.30. Let X and Y ≠ {0} be normed spaces. Every linear transformation of X into Y is continuous if and only if X is finite-dimensional.
Proof. The statement can be rewritten as follows. If Y ≠ {0}, then

dim X < ∞  if and only if  L[X, Y] = B[X, Y].
(a) If dim X = 0, then X = {0} and hence L[X, Y] = {0}, so that L[X, Y] trivially coincides with B[X, Y]. Suppose dim X = n for some positive integer n and let B = {e_i}_{i=1}^n be a Hamel basis for X. Take an arbitrary x ∈ X and consider its (unique) expansion on B,

\[
x = \sum_{i=1}^{n} \xi_i e_i,
\]

where {ξ_i}_{i=1}^n is a family of scalars consisting of the coordinates of x with respect to the basis B. Let ‖·‖_X and ‖·‖_Y denote the norms on X and Y, respectively. If T ∈ L[X, Y], then
\[
\|Tx\|_Y = \Big\|\sum_{i=1}^{n} \xi_i\, Te_i\Big\|_Y \le \sum_{i=1}^{n} |\xi_i| \|Te_i\|_Y \le \Big(\sum_{i=1}^{n} \|Te_i\|_Y\Big) \max_{1 \le i \le n} |\xi_i|.
\]
The norm ‖·‖₀ on X that was defined in the proof of Theorem 4.27 is equivalent to ‖·‖_X. Hence there exists α > 0 such that

\[
\|Tx\|_Y \le \Big(\sum_{i=1}^{n} \|Te_i\|_Y\Big) \|x\|_0 \le \alpha^{-1}\Big(\sum_{i=1}^{n} \|Te_i\|_Y\Big) \|x\|_X,
\]

and so T ∈ B[X, Y]. Thus L[X, Y] ⊆ B[X, Y] (i.e., L[X, Y] = B[X, Y]).

(b) Conversely, suppose X is infinite-dimensional and let {e_γ}_{γ∈Γ} be an indexed Hamel basis for X. Since {e_γ}_{γ∈Γ} is an infinite set, it has a countably infinite subset, say {e_k}_{k∈ℕ}. Set M = span{e_k}_{k∈ℕ}, a linear manifold of X for which {e_k}_{k∈ℕ} is
a Hamel basis. Thus every vector x in M ⊆ X has a unique representation as a (finite) linear combination of vectors in {e_k}_{k∈ℕ}. This means that there exists a unique (similarly indexed) family of scalars {ξ_k}_{k∈ℕ} such that ξ_k = 0 for all but a finite set of indices k, and

\[
x = \sum_{k \in \mathbb{N}} \xi_k e_k
\]

(see Section 2.4). Take an arbitrary nonzero vector y in Y and consider the mapping L_M : M → Y defined by

\[
L_M x = \sum_{k \in \mathbb{N}} k\, \xi_k \|e_k\|\, y
\]
for every x ∈ M. It is readily verified that L_M is linear (recall: the above sums are finite and unique). Extend L_M to X, according to Theorem 2.9, and get a linear transformation L : X → Y such that L|_M = L_M. Since Le_k = L_M e_k = k‖e_k‖y and e_k ≠ 0 for each k ∈ ℕ, it follows that

\[
k\|y\| = \frac{\|Le_k\|}{\|e_k\|} \le \sup_{x \ne 0} \frac{\|Lx\|}{\|x\|}
\]

for every k ∈ ℕ. Outcome: There is no constant β ≥ 0 for which ‖Lx‖ ≤ β‖x‖ for all x ∈ X, and therefore the linear transformation L : X → Y is not bounded; that is, L ∈ L[X, Y]\B[X, Y]. Conclusion: If L[X, Y] = B[X, Y], then dim X < ∞. □
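A concrete finite sketch of part (b) may clarify the construction. The space chosen below, finitely supported sequences with the sup norm (so that ‖e_k‖ = 1) and y = 1, is an assumed stand-in for the M of the proof, not from the text; with these choices the functional f(x) = Σ_k k ξ_k mirrors L_M, and f(e_k)/‖e_k‖ = k grows without bound.

```python
# Assumed concrete stand-in for the proof of (b): M = finitely supported
# sequences with the sup norm (so ||e_k|| = 1), y = 1, and f mirrors L_M:
# f(x) = sum_k k * xi_k.  Then |f(e_k)| / ||e_k|| = k is unbounded in k.
def sup_norm(x):
    # x: dict mapping index k -> coordinate xi_k (finitely supported)
    return max(abs(v) for v in x.values())

def f(x):
    return sum(k * v for k, v in x.items())

ratios = [abs(f({k: 1.0})) / sup_norm({k: 1.0}) for k in range(1, 11)]
assert ratios == [float(k) for k in range(1, 11)]   # no bound beta exists
```

Since the ratios ‖f(e_k)‖/‖e_k‖ take every value k, no β with |f(x)| ≤ β‖x‖ can exist, which is exactly the unboundedness argued in the proof.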
Corollary 4.31. Two finite-dimensional normed spaces are topologically isomorphic if and only if they have the same dimension.

Proof. Let X and Y be finite-dimensional normed spaces. Corollaries 4.28 and 4.30 ensure that X and Y are Banach spaces and B[X, Y] = L[X, Y]. Thus X and Y are topologically isomorphic if and only if they are isomorphic linear spaces (Theorem 4.22), which means that dim X = dim Y (Theorem 2.12). □
Recall that every compact subset of any metric space is closed and bounded. The Heine-Borel Theorem asserts that every closed and bounded subset of Fⁿ is compact. This equivalence (i.e., in Fⁿ a set is compact if and only if it is closed and bounded) actually is a property that characterizes finite-dimensional normed spaces.

Corollary 4.32. If X is a finite-dimensional normed space, then every closed and bounded subset of X is compact.

Proof. Let X be a finite-dimensional normed space over F and let A be a subset of X that is closed and bounded (in the norm topology of X). According to Corollary 4.31, X is topologically isomorphic to Fⁿ, where n = dim X and Fⁿ is equipped with any norm (which are all equivalent by Theorem 4.27). Take a topological isomorphism W ∈ G[X, Fⁿ] and note that W(A) is closed (because W is a closed map; Theorem 3.24) and bounded (for sup_{a∈A} ‖Wa‖ ≤ ‖W‖ sup_{a∈A} ‖a‖; see Problem 4.5) in Fⁿ. Thus W(A) is compact in Fⁿ (Theorem 3.83). Therefore, since W⁻¹ ∈ G[Fⁿ, X] is a homeomorphism, and since compactness is a topological invariant (Theorem 3.64), it follows that A = W⁻¹(W(A)) is compact in X. □

To establish the converse (i.e., if every closed and bounded subset of a normed space X is compact, then X is finite-dimensional) we need the following result.

Lemma 4.33. (Riesz). Let M be a proper subspace of a normed space X. For each real number α ∈ (0, 1) there exists a vector x_α in X such that ‖x_α‖ = 1 and

\[
\alpha < d(x_\alpha, M).
\]

Proof. Since M is a subspace of X, it is closed and contains the origin of X (i.e., d(x, M) = inf_{u∈M} ‖x − u‖ ≤ ‖x − 0‖ = ‖x‖ for every x ∈ X). Since M is a proper closed subset of X, it follows that X\M is nonempty and open in X. Take any vector z ∈ X\M such that 0 < d(z, M) (e.g., take the center of an arbitrary nonempty open ball included in the open set X\M). Thus for each ε > 0 there exists a vector v_ε ∈ M such that

\[
0 < d(z, M) = \inf_{u \in M} \|z - u\| \le \|z - v_\varepsilon\| < (1 + \varepsilon)\, d(z, M).
\]

Set x_ε = ‖z − v_ε‖⁻¹(z − v_ε) in X, so that ‖x_ε‖ = 1 and

\[
\|z - v_\varepsilon\|\, d(x_\varepsilon, M) = \inf_{u \in M} \big\| z - v_\varepsilon - \|z - v_\varepsilon\| u \big\| = \inf_{v \in M} \|z - v\| = d(z, M).
\]

Hence 1/(1 + ε) < d(x_ε, M), which concludes the proof by setting α = 1/(1 + ε). □
This result is sometimes referred to as the "Almost Orthogonality Lemma".
(Why? Draw a picture.) It is worth remarking that the lemma does not ensure the existence of a vector x₁ in X with ‖x₁‖ = 1 and d(x₁, M) = 1. Such a vector may not exist in a Banach space, but it certainly exists in a Hilbert space (next chapter).
Corollary 4.34. Let X be a normed space. If the closed unit ball B₁[0] = {x ∈ X : ‖x‖ ≤ 1} is compact, then X is finite-dimensional.

Proof. If B₁[0] is compact, then it is totally bounded (Corollary 3.81), so that there exists a finite ½-net for B₁[0] (Definition 3.68), say {u_i}_{i=1}^n ⊆ B₁[0]. Set M = span{u_i}_{i=1}^n, which is a finite-dimensional subspace of X (reason: span{u_i}_{i=1}^n is a finite-dimensional linear manifold of X because there exists a Hamel basis for it included in {u_i}_{i=1}^n, by Theorem 2.6, and hence it is a subspace of X according to Corollary 4.29). If X is infinite-dimensional, then M ≠ X and, in this case, Lemma 4.33 ensures the existence of a vector x ∈ B₁[0] such that ½ < d(x, M). Thus ½ < ‖x − u‖ for every u ∈ M and, in particular, ½ < ‖x − u_i‖ for every i = 1, …, n, which contradicts the fact that {u_i}_{i=1}^n is a ½-net for B₁[0]. Conclusion: If B₁[0] is compact in X, then X is finite-dimensional. □
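The failure of compactness in infinite dimensions can be seen concretely. The space below (ℓ², truncated here to finitely many coordinates) is an assumed example, not from the text: the unit vectors e_k lie in the closed unit ball yet remain at mutual distance √2, so the sequence {e_k} admits no Cauchy, hence no convergent, subsequence.

```python
import math

# Assumed example (l^2, truncated to 10 coordinates): the unit vectors e_k
# lie in B_1[0], but ||e_j - e_k|| = sqrt(2) for j != k, so {e_k} has no
# convergent subsequence and B_1[0] cannot be compact.
def e(k, n=10):
    return [1.0 if i == k else 0.0 for i in range(n)]

def l2(x):
    return math.sqrt(sum(c * c for c in x))

def dist(x, y):
    return l2([a - b for a, b in zip(x, y)])

for k in range(10):
    assert l2(e(k)) == 1.0                      # each e_k is in the unit ball
for j in range(10):
    for k in range(j + 1, 10):
        assert abs(dist(e(j), e(k)) - math.sqrt(2.0)) < 1e-12
```

By Corollary 4.34 read contrapositively, this is exactly the behavior one expects: ℓ² is infinite-dimensional, so its closed unit ball is not compact.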
4.7 Continuous Linear Extension and Completion
We shall now transfer the extension results of Section 3.8 on uniformly continuous mappings between metric spaces to bounded linear transformations between normed spaces. The normed-space versions of Theorem 3.45 and Corollary 3.46 read as follows.
Theorem 4.35. Let M be a dense linear manifold of a normed space X and let Y be a Banach space. Every T ∈ B[M, Y] has a unique extension T̂ in B[X, Y]. Moreover, ‖T̂‖ = ‖T‖.
Proof. Let M and Y be normed spaces. Recall that every T ∈ B[M, Y] is uniformly continuous (Theorem 4.14). If M is a dense linear manifold of a normed space X, and Y is a Banach space (i.e., a complete normed space), then Theorem 3.45 ensures that there exists a unique continuous extension T̂ : X → Y of T over X. It remains to verify that T̂ is linear and ‖T̂‖ = ‖T‖.

Let u and v be arbitrary vectors in X. Since M⁻ = X, it follows by Proposition 3.32 that there exist M-valued sequences {u_n} and {v_n} converging in X to u and v, respectively. Then, by Corollary 3.8,

\[
\hat{T}(u + v) = \hat{T}\big(\lim u_n + \lim v_n\big) = \hat{T}\big(\lim (u_n + v_n)\big) = \lim \hat{T}(u_n + v_n)
\]
because addition is a continuous operation (Problem 4.1) and T̂ is a continuous mapping. But T = T̂|_M, M is a linear space, and T is a linear transformation. Hence

\[
\hat{T}(u_n + v_n) = T(u_n + v_n) = Tu_n + Tv_n = \hat{T}u_n + \hat{T}v_n
\]

for every n. The same continuity argument (Problem 4.1 and Corollary 3.8) makes the backward path:

\[
\lim \hat{T}(u_n + v_n) = \lim\big(\hat{T}u_n + \hat{T}v_n\big) = \lim \hat{T}u_n + \lim \hat{T}v_n = \hat{T}u + \hat{T}v.
\]

This shows that T̂ is additive. Similarly, for every scalar α,

\[
\hat{T}(\alpha u) = \hat{T}\big(\alpha \lim u_n\big) = \hat{T}\big(\lim \alpha u_n\big) = \lim \hat{T}(\alpha u_n) = \lim T(\alpha u_n) = \lim \alpha\, Tu_n = \alpha \lim \hat{T}u_n = \alpha \hat{T}u,
\]

so that T̂ is also homogeneous. That is, T̂ is linear. Thus T̂ ∈ B[X, Y]. Moreover,

\[
\|\hat{T}u_n\| = \|Tu_n\| \le \|T\| \|u_n\|
\]

for every n. Applying once again the same continuity argument, and recalling that the norm is continuous, it follows that lim ‖u_n‖ = ‖u‖ and

\[
\|\hat{T}u\| = \big\|\hat{T}\lim u_n\big\| = \lim \|\hat{T}u_n\| \le \|T\| \lim \|u_n\| = \|T\| \|u\|,
\]

and hence ‖T̂‖ = sup_{‖u‖=1} ‖T̂u‖ ≤ ‖T‖. On the other hand,

\[
\|T\| = \sup_{0 \ne w \in M} \frac{\|Tw\|}{\|w\|} = \sup_{0 \ne w \in M} \frac{\|\hat{T}w\|}{\|w\|} \le \sup_{0 \ne x \in X} \frac{\|\hat{T}x\|}{\|x\|} = \|\hat{T}\|. \qquad \Box
\]
Corollary 4.36. Let X and Y be Banach spaces. Let M and N be dense linear manifolds of X and Y, respectively. If W ∈ G[M, N], then there exists a unique Ŵ ∈ G[X, Y] that extends W over X.

Proof. Recall that every W ∈ G[M, N] is a linear homeomorphism, and hence a uniform homeomorphism (Theorem 4.14). Thus Corollary 3.46 ensures that there exists a unique homeomorphism Ŵ : X → Y that extends W over X. On the other hand, the previous theorem says that there exists a unique continuous linear extension of W : M → Y over X, say W′ ∈ B[X, Y]. Thus Ŵ and W′ are both continuous mappings of X into Y that coincide on a dense subset M of X: Ŵ|_M = W = W′|_M. Therefore Ŵ = W′ by Corollary 3.33, and hence the homeomorphism Ŵ of X onto Y lies in B[X, Y]. Conclusion: Ŵ is a linear homeomorphism, which means a topological isomorphism; that is, Ŵ ∈ G[X, Y]. □
An even stronger form of homeomorphism is that of a surjective isometry (a surjective mapping that preserves distance; recall: every isometry is an injective contraction, and the inverse of a surjective isometry is again a surjective isometry).
If there exists a surjective isometry between two metric spaces, then they are said to be isometric or isometrically equivalent (Chapter 3). Thus a linear isometry of a normed space X into a normed space Y (i.e., a linear transformation of X into Y that is also an isometry) is necessarily an element of B[X, Y], since an isometry is continuous. A surjective isometry in B[X, Y] (i.e., a linear surjective isometry or, equivalently, a linear-space isomorphism that is also an isometry) is called an isometric isomorphism. Two normed spaces are isometrically isomorphic if there exists an isometric isomorphism between them. The next proposition places linear isometries in a normed-space setting.
Proposition 4.37. Take V ∈ L[X, Y], where X and Y are normed spaces. The following assertions are equivalent.

(a) V is an isometry (i.e., ‖Vu − Vv‖ = ‖u − v‖ for every u, v ∈ X).

(b) ‖Vx‖ = ‖x‖ for every x ∈ X.

If Y = X, then each of the above assertions also is equivalent to

(c) ‖Vⁿx‖ = ‖x‖ for every x ∈ X and every integer n ≥ 1.
Proof. Assertions (a) and (b) are clearly equivalent because V is linear. Indeed, set v = 0 in (a) to get (b) and, conversely, set x = u − v in (b) to get (a). Moreover, take V ∈ L[X] and suppose (b) holds true. Then (c) holds trivially for n = 1. If (c) holds for some n ≥ 1, then ‖V^{n+1}x‖ = ‖Vⁿ(Vx)‖ = ‖Vx‖ = ‖x‖ for every x ∈ X. Conclusion: (b) implies (c) by induction. Finally, note that (c) trivially implies (b). □
It is clear that every isometric isomorphism is a topological isomorphism, and that the identity I of B[X] is an isometric isomorphism. It is also clear that the inverse of a topological (isometric) isomorphism is itself a topological (isometric) isomorphism,
and that the composition (product) of two topological (isometric) isomorphisms is again a topological (isometric) isomorphism. Therefore, these concepts (topological isomorphism and isometric isomorphism, that is) have the defining properties of an equivalence relation (viz., reflexivity, symmetry and transitivity).
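Proposition 4.37 can be illustrated concretely; the rotation V of ℝ² with the Euclidean norm used below is an assumed example, not from the text. V is linear and norm-preserving, hence an isometry, and so is every power Vⁿ.

```python
import math
import random

# Assumed example for Proposition 4.37: a rotation V of R^2 (Euclidean norm)
# satisfies (b) ||Vx|| = ||x||, hence (a), and by induction (c) for V^n.
theta = 0.7
c, s = math.cos(theta), math.sin(theta)

def V(x):
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def norm(x):
    return math.hypot(x[0], x[1])

random.seed(3)
for _ in range(200):
    u = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    v = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    assert abs(norm(V(u)) - norm(u)) < 1e-9              # (b)
    du = (u[0] - v[0], u[1] - v[1])
    dV = (V(u)[0] - V(v)[0], V(u)[1] - V(v)[1])
    assert abs(norm(dV) - norm(du)) < 1e-9               # (a)
    w = u
    for _ in range(5):                                   # (c): powers V^n
        w = V(w)
        assert abs(norm(w) - norm(u)) < 1e-9
```

A rotation is moreover surjective, so it is an isometric isomorphism of ℝ² onto itself in the sense of the discussion above.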
Corollary 4.38. Let M and N be dense linear manifolds of Banach spaces X and Y, respectively. If U : M → N is an isometric isomorphism of M onto N, then there exists a unique isometric isomorphism Û : X → Y of X onto Y that extends U over X.
Proof. According to Corollary 4.36 there exists a unique topological isomorphism Û : X → Y that extends U : M → N over X. Note that the mappings ‖·‖ : X → ℝ and ‖Û(·)‖ : X → ℝ are continuous (composition of continuous functions is again a continuous function), and also that they coincide on a dense subset M of X (‖Ûu‖ = ‖Uu‖ = ‖u‖ for every u ∈ M because Û|_M = U and U
is an isometry; Proposition 4.37). Then, according to Corollary 3.33, they coincide on the whole space X; that is, ‖Ûx‖ = ‖x‖ for every x ∈ X. Therefore (Proposition 4.37 again), the topological isomorphism Û is an isometry, and hence an isometric isomorphism. □

Since every normed space is a metric space, every normed space has a completion in the sense of Definition 3.48. The question that naturally arises is whether a normed
space has a completion that is a Banach space. In other words, is a completion (in the norm topology, of course) of a normed space a complete normed space? We shall first redefine the concept of completion in a normed-space setting.
Definition 4.39. If the image of a linear isometry on a normed space X is a dense linear manifold of a Banach space X̂, then X̂ is a completion of X. Equivalently, if a normed space X is isometrically isomorphic to a dense linear manifold of a Banach space X̂, then X̂ is a completion of X. (In particular, any Banach space is a completion of every dense linear manifold of it, for the inclusion map is a linear isometry.)
Theorem 4.40. Every normed space has a completion.

Proof. Let (X, ‖·‖_X) be a normed space. Consider the linear space X^ℕ of all X-valued sequences. It is readily verified that the collection CS(X) of all Cauchy sequences in (X, ‖·‖_X) is a linear manifold of X^ℕ, and hence a linear space (over the same field of the linear space X). The program is to rewrite the proof of Theorem 3.49 in terms of the metric d_X generated by the norm ‖·‖_X. Thus

\[
\|x\| = \lim_n \|x_n\|_X
\]

defines a seminorm on the linear space CS(X), where x = {x_n} is an arbitrary sequence in CS(X). Set

\[
N = \{x \in CS(X) : \|x\| = 0\}.
\]

Proposition 4.5 ensures that N is a linear manifold of CS(X), and also that

\[
\|[x]\|_{\hat{X}} = \|x\|
\]

defines a norm on the quotient space X̂ = CS(X)/N, where x is an arbitrary element of an arbitrary coset [x] = x + N in X̂, so that (X̂, ‖·‖_X̂) is a normed space. Now consider the mapping K : X → X̂ that assigns to each vector x ∈ X the coset [x] ∈ X̂ containing the constant sequence x = {x_n} ∈ CS(X) such that x_n = x for all n. It is easy to show that

K : X → X̂ is a linear isometry.
Moreover, according to Claims 1 and 2 in the proof of Theorem 3.49, the range of K (which is clearly a linear manifold of X̂) is dense in (X̂, ‖·‖_X̂), and (X̂, ‖·‖_X̂) is a complete normed space. That is,

K(X)⁻ = X̂ and X̂ is a Banach space.

Summing up: X is isometrically isomorphic to K(X), which is a dense linear manifold of X̂, which in turn is a Banach space. Therefore X̂ is a completion of X. □

The completion of a normed space is essentially unique; that is, the completion of a normed space is unique up to an isometric isomorphism (i.e., up to a linear surjective isometry).

Theorem 4.41. Any two completions of a normed space are isometrically isomorphic.

Proof. Replace "metric space", "surjective isometry" and "subspace", respectively, with "normed space", "isometric isomorphism" and "linear manifold" in the proof of Theorem 3.50. □
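The CS(X)/N construction of Theorem 4.40 can be sketched on an assumed concrete case, not taken from the text: X = ℚ with the usual absolute value, whose completion is ℝ. A rational Cauchy sequence approximating √2 has seminorm lim |x_n| = √2, and perturbing it by a null sequence leaves it in the same coset of N.

```python
from fractions import Fraction

# Assumed concrete case of Theorem 4.40: X = Q with |.|; completion = R.
def sqrt2_iter(n):
    # Newton iterates for sqrt(2): a Cauchy sequence of rationals
    a = Fraction(3, 2)
    for _ in range(n):
        a = (a + 2 / a) / 2
    return a

x = [sqrt2_iter(n) for n in range(1, 7)]                  # Cauchy in Q
y = [a + Fraction(1, 10**k) for k, a in enumerate(x, 1)]  # x - y is null

seminorm_tail = float(abs(x[-1]))        # approximates ||x|| = lim |x_n|
assert abs(seminorm_tail - 2**0.5) < 1e-9
assert float(abs(x[-1] - y[-1])) < 1e-5  # null difference: same coset [x]
```

The two sequences x and y are distinct elements of CS(ℚ) but represent the same coset of N, i.e. the same point √2 of the completion.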
According to Definition 4.39, a Banach space $\hat{\mathcal{X}}$ is a completion of a normed space $\mathcal{X}$ if there exists a dense linear manifold of $\hat{\mathcal{X}}$, say $\tilde{\mathcal{X}}$, which is isometrically isomorphic to $\mathcal{X}$; that is, if there exists a surjective isometry $U_X \in \mathcal{G}[\mathcal{X}, \tilde{\mathcal{X}}]$ for some dense linear manifold $\tilde{\mathcal{X}}$ of $\hat{\mathcal{X}}$. Let $\hat{\mathcal{Y}}$ be a completion of a normed space $\mathcal{Y}$, so that $\mathcal{Y}$ is isometrically isomorphic to a dense linear manifold $\tilde{\mathcal{Y}}$ of the Banach space $\hat{\mathcal{Y}}$, and let $U_Y \in \mathcal{G}[\mathcal{Y}, \tilde{\mathcal{Y}}]$ be the surjective isometry between $\mathcal{Y}$ and $\tilde{\mathcal{Y}}$. Take $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ and set $\tilde{T} = U_Y T U_X^{-1} \in \mathcal{B}[\tilde{\mathcal{X}}, \tilde{\mathcal{Y}}]$ as in the following commutative diagram.

[commutative diagram: $T$ maps $\mathcal{X}$ to $\mathcal{Y}$, $U_X$ maps $\mathcal{X}$ to $\tilde{\mathcal{X}}$, $U_Y$ maps $\mathcal{Y}$ to $\tilde{\mathcal{Y}}$, and $\tilde{T}$ maps $\tilde{\mathcal{X}}$ to $\tilde{\mathcal{Y}}$]
Define $\hat{T} \in \mathcal{B}[\hat{\mathcal{X}}, \hat{\mathcal{Y}}]$ as the extension of $\tilde{T} \in \mathcal{B}[\tilde{\mathcal{X}}, \hat{\mathcal{Y}}]$ over $\hat{\mathcal{X}}$ (Theorem 4.35) so that $\|\hat{T}\| = \|U_Y T U_X^{-1}\| = \|T\|$ (see Problem 4.41). It is again usual to refer to $\hat{T}$ as the extension of $T$ over the completion $\hat{\mathcal{X}}$ of $\mathcal{X}$ into the completion $\hat{\mathcal{Y}}$ of $\mathcal{Y}$. Moreover, since any pair of completions of $\mathcal{X}$ and of $\mathcal{Y}$, say $\{\hat{\mathcal{X}}, \hat{\mathcal{X}}'\}$ and $\{\hat{\mathcal{Y}}, \hat{\mathcal{Y}}'\}$, are isometrically isomorphic (i.e., since there exist surjective isometries $\hat{U}_X \in \mathcal{G}[\hat{\mathcal{X}}, \hat{\mathcal{X}}']$ and $\hat{U}_Y \in \mathcal{G}[\hat{\mathcal{Y}}, \hat{\mathcal{Y}}']$), it follows (see proof of Theorem 3.51) that any two extensions $\hat{T} \in \mathcal{B}[\hat{\mathcal{X}}, \hat{\mathcal{Y}}]$ and $\hat{T}' \in \mathcal{B}[\hat{\mathcal{X}}', \hat{\mathcal{Y}}']$ of $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ over the completions $\hat{\mathcal{X}}$ and $\hat{\mathcal{X}}'$ into the completions $\hat{\mathcal{Y}}$ and $\hat{\mathcal{Y}}'$, respectively, are unique up to isometric isomorphisms in the sense that $\hat{T}' = \hat{U}_Y \hat{T} \hat{U}_X^{-1}$ (and hence $\|\hat{T}\| = \|\hat{T}'\| = \|T\|$). The commutative
diagrams

[commutative diagrams relating $T$, $\tilde{T}$, $\hat{T}$ and $\hat{T}'$ through the isometries $U_X$, $U_Y$, $\hat{U}_X$, $\hat{U}_Y$ omitted]
illustrate such an extension program, which is finally stated as follows.
Theorem 4.42. Let the Banach spaces $\hat{\mathcal{X}}$ and $\hat{\mathcal{Y}}$ be completions of the normed spaces $\mathcal{X}$ and $\mathcal{Y}$, respectively. Every $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has an extension $\hat{T} \in \mathcal{B}[\hat{\mathcal{X}}, \hat{\mathcal{Y}}]$ over the completion $\hat{\mathcal{X}}$ of $\mathcal{X}$ into the completion $\hat{\mathcal{Y}}$ of $\mathcal{Y}$. Moreover, $\hat{T}$ is unique up to isometric isomorphisms and $\|\hat{T}\| = \|T\|$.
4.8 The Banach-Steinhaus Theorem and Operator Convergence
Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces and let $\Theta$ be a subset of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$. We shall say that $\Theta$ is strongly bounded (or pointwise bounded) if for each $x \in \mathcal{X}$ the set $\Theta_x = \{Tx \in \mathcal{Y} : T \in \Theta\}$ is bounded in $\mathcal{Y}$ (with respect to the norm topology of $\mathcal{Y}$), and uniformly bounded if it is itself bounded in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ (with respect to the uniform norm topology of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$). It is clear that uniform boundedness implies strong boundedness; that is,

$$\sup_{T \in \Theta} \|T\| < \infty \quad\text{implies}\quad \sup_{T \in \Theta} \|Tx\| < \infty \ \text{ for every } x \in \mathcal{X},$$

since $\|Tx\| \le \|T\|\,\|x\|$ for every $x \in \mathcal{X}$. The Banach-Steinhaus Theorem ensures the converse whenever $\mathcal{X}$ is a Banach space:

$$\sup_{T \in \Theta} \|Tx\| < \infty \ \text{ for every } x \text{ in a Banach space } \mathcal{X} \quad\text{implies}\quad \sup_{T \in \Theta} \|T\| < \infty.$$
That is, a collection $\Theta$ of bounded linear transformations of a Banach space $\mathcal{X}$ into a normed space $\mathcal{Y}$ is uniformly bounded if and only if it is strongly bounded. This will be our second important application of the Baire Category Theorem (the first was the Open Mapping Theorem).
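The completeness assumption cannot be dropped. The following Python sketch (illustrative only; the helper names `f` and `e` are ours, not the book's) realizes the classical counterexample on the dense linear manifold of finitely supported sequences in $\ell^1$, which is not complete: the functionals $f_n(x) = n\,\xi_n$ satisfy $\sup_n |f_n(x)| < \infty$ at each finitely supported $x$, while $\|f_n\| = n$ is unbounded.

```python
def f(n, x):
    """f_n(x) = n * x_n: a bounded linear functional on l^1 with norm ||f_n|| = n."""
    return n * x[n] if n < len(x) else 0.0

def e(n, length):
    """Canonical unit vector e_n (finitely supported)."""
    v = [0.0] * length
    v[n] = 1.0
    return v

# A finitely supported vector: an element of the dense (incomplete) manifold of l^1.
x = [1.0, -2.0, 0.5]  # xi_k = 0 for all k >= 3

# Pointwise (strong) boundedness at x: only finitely many f_n(x) are nonzero.
pointwise_bound = max(abs(f(n, x)) for n in range(100))  # = 2.0 here

# No uniform bound: ||f_n|| = |f_n(e_n)| = n grows without bound.
norms = [abs(f(n, e(n, n + 1))) for n in range(1, 6)]  # = [1.0, 2.0, 3.0, 4.0, 5.0]
```

Banach-Steinhaus is not contradicted here: the manifold of finitely supported sequences is exactly what fails the completeness hypothesis of the theorem.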
Theorem 4.43. (The Banach-Steinhaus Theorem or The Uniform Boundedness Principle). Let $\{T_\gamma\}_{\gamma \in \Gamma}$ be an arbitrary indexed family of bounded linear transformations of a Banach space $\mathcal{X}$ into a normed space $\mathcal{Y}$. If $\sup_{\gamma \in \Gamma} \|T_\gamma x\| < \infty$ for every $x \in \mathcal{X}$, then $\sup_{\gamma \in \Gamma} \|T_\gamma\| < \infty$.

Proof. For each $n \in \mathbb{N}$ set $A_n = \{x \in \mathcal{X} : \sup_{\gamma \in \Gamma} \|T_\gamma x\| \le n\}$.

Claim 1. $A_n$ is closed in $\mathcal{X}$.
Proof. Let $\{x_k\}$ be an $A_n$-valued sequence that converges in $\mathcal{X}$ to, say, $x \in \mathcal{X}$. Take an arbitrary $\gamma \in \Gamma$ and note that

$$\|T_\gamma x\| = \|T_\gamma(x - x_k) + T_\gamma x_k\| \le \|T_\gamma(x - x_k)\| + \|T_\gamma x_k\| \le \|T_\gamma\|\,\|x - x_k\| + n$$

for every integer $k$. Since $\|x - x_k\| \to 0$ as $k \to \infty$, it follows that $\|T_\gamma x\| \le n$. Thus, as $\gamma$ is an arbitrary index in $\Gamma$, $\|T_\gamma x\| \le n$ for all $\gamma \in \Gamma$, and hence $x \in A_n$. Conclusion: $A_n$ is closed in $\mathcal{X}$ by the Closed Set Theorem (Theorem 3.30). $\square$

Claim 2. If $\sup_{\gamma \in \Gamma} \|T_\gamma x\| < \infty$ for every $x \in \mathcal{X}$, then $\{A_n\}$ covers $\mathcal{X}$.

Proof. Take an arbitrary $x \in \mathcal{X}$. If $\sup_{\gamma \in \Gamma} \|T_\gamma x\| < \infty$, then there exists an integer $n_x \in \mathbb{N}$ such that $\sup_{\gamma \in \Gamma} \|T_\gamma x\| \le n_x$, and hence $x \in A_{n_x}$. Therefore $\mathcal{X} \subseteq \bigcup_{n \in \mathbb{N}} A_n$ (i.e., $\mathcal{X} = \bigcup_{n \in \mathbb{N}} A_n$). $\square$
Then, by the Baire Category Theorem (Theorem 3.58), at least one of the sets $A_n$, say $A_{n_0}$, has a nonempty interior. This means that there exists a vector $x_0$ in $\mathcal{X}$ and a radius $\rho > 0$ such that the open ball $B_\rho(x_0)$ is included in $A_{n_0}$ (and hence $\|T_\gamma x_0\| \le n_0$ for every $\gamma \in \Gamma$). Now take any nonzero vector $x$ in $\mathcal{X}$ and set $x' = \rho\|x\|^{-1}x \in \mathcal{X}$. Since $x' + x_0$ lies in the closure of $B_\rho(x_0)$, and since the closed set $A_{n_0}$ includes that closure, it follows that $x' + x_0 \in A_{n_0}$. Thus, for every $\gamma \in \Gamma$, $\|T_\gamma(x' + x_0)\| \le n_0$ and

$$\|T_\gamma x'\| = \|T_\gamma(x' + x_0) - T_\gamma x_0\| \le \|T_\gamma(x' + x_0)\| + \|T_\gamma x_0\| \le 2n_0.$$

Therefore $\|T_\gamma\| = \sup_{x \ne 0} \frac{\|T_\gamma x\|}{\|x\|} = \sup_{x \ne 0} \frac{\|T_\gamma x'\|}{\rho} \le \frac{2n_0}{\rho}$ for all $\gamma \in \Gamma$. $\square$
Let $\{L_n\}$ be a sequence of linear transformations of a normed space $\mathcal{X}$ into a normed space $\mathcal{Y}$ (i.e., an $\mathcal{L}[\mathcal{X}, \mathcal{Y}]$-valued sequence). $\{L_n\}$ is pointwise convergent if for every $x \in \mathcal{X}$ the $\mathcal{Y}$-valued sequence $\{L_n x\}$ converges in $\mathcal{Y}$; that is, if for each $x \in \mathcal{X}$ there exists $y_x \in \mathcal{Y}$ such that $L_n x \to y_x$ in $\mathcal{Y}$ as $n \to \infty$. As the limit is unique, this actually defines a mapping $L: \mathcal{X} \to \mathcal{Y}$ given by $L(x) = y_x$ for every $x \in \mathcal{X}$. Moreover, according to Problem 4.1, it is readily verified that $L$ is linear. Therefore, an $\mathcal{L}[\mathcal{X}, \mathcal{Y}]$-valued sequence $\{L_n\}$ is pointwise convergent if and only if there exists $L \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ such that $\|(L_n - L)x\| \to 0$ as $n \to \infty$ for every $x \in \mathcal{X}$. Now consider a sequence $\{T_n\}$ of bounded linear transformations of a normed space $\mathcal{X}$ into a normed space $\mathcal{Y}$ (i.e., a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence).

Proposition 4.44. If $\mathcal{X}$ is a Banach space and $\mathcal{Y}$ is a normed space, then a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence $\{T_n\}$ is pointwise convergent if and only if there exists $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ such that $\|(T_n - T)x\| \to 0$ for every $x \in \mathcal{X}$.

Proof. According to the above italicized result (about pointwise convergence of $\mathcal{L}[\mathcal{X}, \mathcal{Y}]$-valued sequences) all that remains to be verified is that $T$ is bounded
whenever $\|(T_n - T)x\| \to 0$ for every $x \in \mathcal{X}$. Since the sequence $\{T_n x\}$ converges in $\mathcal{Y}$, it follows that it is bounded in $\mathcal{Y}$ (Proposition 3.39), which means that $\sup_n \|T_n x\| < \infty$ for every $x \in \mathcal{X}$ (Problem 4.5). Hence $\sup_n \|T_n\| < \infty$ by the Banach-Steinhaus Theorem (Theorem 4.43) because $\mathcal{X}$ is a Banach space. Therefore,

$$\|Tx\| = \lim_n \|T_n x\| \le \big(\limsup_n \|T_n\|\big)\|x\| \le \big(\sup_n \|T_n\|\big)\|x\|$$

for every $x \in \mathcal{X}$, so that $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ and $\|T\| \le \limsup_n \|T_n\|$. $\square$
Definition 4.45. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces and let $\{T_n\}$ be a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence. If $\{T_n\}$ converges in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, that is, if there exists $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ such that

$$\|T_n - T\| \to 0,$$

then we say that $\{T_n\}$ converges uniformly (or converges in the uniform topology, or in the operator norm topology) to $T$. In this case, $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is called the uniform limit of $\{T_n\}$. Notation: $T_n \xrightarrow{u} T$ (or $T_n - T \xrightarrow{u} O$). If $\{T_n\}$ does not converge uniformly to $T$, then we write $T_n \not\xrightarrow{u} T$. If there exists $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ such that

$$\|(T_n - T)x\| \to 0 \quad\text{for every}\quad x \in \mathcal{X},$$

then we say that $\{T_n\}$ converges strongly (or converges in the strong (operator) topology) to $T$, which is called the strong limit of $\{T_n\}$. Notation: $T_n \xrightarrow{s} T$ (or $T_n - T \xrightarrow{s} O$). Again, if $\{T_n\}$ does not converge strongly to $T$, then we write $T_n \not\xrightarrow{s} T$.

Uniform convergence implies strong convergence (to the same limit). Indeed, for each $n$, $0 \le \|(T_n - T)x\| \le \|T_n - T\|\,\|x\|$ for every $x \in \mathcal{X}$, and hence

$$T_n \xrightarrow{u} T \quad\text{implies}\quad T_n \xrightarrow{s} T.$$
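The converse fails in infinite dimensions. For intuition, here is a small Python sketch (ours, not the book's; finite lists of coordinates stand in for vectors in $\ell^2$) of the standard separating example: the truncation projections $P_n$ converge strongly to the identity, yet $\|P_n - I\| = 1$ for every $n$, witnessed by the unit vector $e_n$.

```python
import math

def truncate(x, n):
    """P_n x: keep the first n coordinates and zero out the rest."""
    return [xi if k < n else 0.0 for k, xi in enumerate(x)]

def norm(x):
    """Euclidean (l^2) norm of a finitely supported vector."""
    return math.sqrt(sum(xi * xi for xi in x))

x = [1.0 / (k + 1) for k in range(50)]  # one fixed vector

# Strong convergence: ||P_n x - x|| -> 0 for each fixed x.
errors = [norm([a - b for a, b in zip(truncate(x, n), x)]) for n in (5, 20, 50)]
assert errors[0] > errors[1] > errors[2] == 0.0

# No uniform convergence: (P_n - I)e_n = -e_n, so ||P_n - I|| >= 1 for every n.
e10 = [0.0] * 11
e10[10] = 1.0
witness = norm([a - b for a, b in zip(truncate(e10, 10), e10)])
assert witness == 1.0
```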
The next proposition says that on a finite-dimensional normed space $\mathcal{X}$ the concepts of strong and uniform convergence coincide.
Proposition 4.46. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces. Consider a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence $\{T_k\}$ and let $T$ be a transformation in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$. If $\mathcal{X}$ is finite-dimensional, then $T_k \xrightarrow{u} T$ if and only if $T_k \xrightarrow{s} T$.
Proof. Suppose $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ is a finite-dimensional normed space and let $B = \{e_i\}_{i=1}^n$ be a Hamel basis for the linear space $\mathcal{X}$. Take an arbitrary $x$ in $\mathcal{X}$ and consider its unique expansion on $B$,

$$x = \sum_{i=1}^n \xi_i e_i,$$
}"_1 are the coordinates of x with respect to the basis B. Set
where the scalars
Ilxllo = max 1
which defines a norm on X as we saw in the proof of Theorem 4.27. If (Tk) converges
strongly to T E B[X, Y], then 11(Tk - T)ei Ily -- 0 as k -> oo for each ei E B. Thus for each i = i, ... , n and for every E > 0 there exists a positive integer ki,e such that II (Tk - T )ei Ily < E whenever k > ki,E. Therefore, n
II(Tk - T)xIIy
=
IL.
:5
It;III(Tk-T)e;Ily
n max Iti I max II(Tk - T)ei IIy < n Ilxlloe I
1 _
whenever k > ke = max (ki,e )'_1. However, 11 110 and II IIX are equivalent norms on X (Theorem 4.27) so that there exists a constant a > 0 such that a IIx IIo < IIx IIX.
Since n and a do not depend on x, it follows that II(Tk - T)xIly < n,EllxlHX
for every x E X whenever k > ke. Hence, for every r > 0 there exists a positive integer ke such that
k > ke
implies
II(Tk - T) 11 = sup II(Tk - T)xIly < "s E, lxll
T implies and therefore {Tk} converges uniformly to T. Summing up: Tk T whenever X is a finite-dimensional normed space. This concludes the Tk proof once uniform convergence always implies strong convergence.
o
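Numerically, Proposition 4.46 says that in finite dimensions convergence on a Hamel basis already controls the operator norm. A Python sketch with hypothetical $2 \times 2$ matrices $T_k \to T$ (our example, mirroring the bound $\|(T_k - T)x\| \le \frac{n}{\alpha}\varepsilon\|x\|$ from the proof above):

```python
def mat_vec(A, x):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def max_norm(x):
    return max(abs(xi) for xi in x)

T = [[1.0, 2.0], [0.0, 1.0]]

def T_k(k):
    """T_k differs from T by 1/k in one column: T_k e_1 -> T e_1 and T_k e_2 = T e_2."""
    return [[1.0 + 1.0 / k, 2.0], [1.0 / k, 1.0]]

basis = [[1.0, 0.0], [0.0, 1.0]]
for k in (1, 10, 100):
    # max_i ||(T_k - T)e_i||: strong convergence checked on the basis alone.
    col_err = max(
        max_norm([a - b for a, b in zip(mat_vec(T_k(k), e), mat_vec(T, e))])
        for e in basis
    )
    # The proof's estimate gives ||T_k - T|| <= n * col_err (n = 2 coordinates here),
    # so basis-wise convergence forces norm convergence.
    assert 2 * col_err <= 2.0 / k + 1e-9
```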
Proposition 4.47. If a Cauchy sequence in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is strongly convergent, then it is uniformly convergent.
Proof. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces, let $T$ be an operator in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, and let $\{T_n\}$ be a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence. Thus, for each pair of indices $m$ and $n$,

$$\|(T_n - T)x\| = \|(T_n - T_m)x + (T_m - T)x\| \le \|T_n - T_m\|\,\|x\| + \|(T_m - T)x\|$$

for every $x \in \mathcal{X}$. Now take an arbitrary $\varepsilon > 0$. If $\{T_n\}$ is a Cauchy sequence in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, then there exists a positive integer $n_\varepsilon$ such that $\|T_n - T_m\| < \frac{\varepsilon}{2}$, and hence

$$\|(T_n - T)x\| < \tfrac{\varepsilon}{2}\|x\| + \|(T_m - T)x\|$$

for every $x \in \mathcal{X}$, whenever $m, n \ge n_\varepsilon$. If $\{T_n\}$ converges strongly to $T$, then for each $x \in \mathcal{X}$ there exists an integer $m_{\varepsilon,x} \ge n_\varepsilon$ such that $\|(T_m - T)x\| < \frac{\varepsilon}{2}$ whenever $m \ge m_{\varepsilon,x}$. Hence $\|(T_n - T)x\| < \frac{\varepsilon}{2}\|x\| + \frac{\varepsilon}{2}$
for every $x \in \mathcal{X}$, and so

$$\|T_n - T\| = \sup_{\|x\| \le 1} \|(T_n - T)x\| \le \varepsilon$$

whenever $n \ge n_\varepsilon$. Conclusion: For every $\varepsilon > 0$ there exists a positive integer $n_\varepsilon$ such that

$$n \ge n_\varepsilon \quad\text{implies}\quad \|T_n - T\| \le \varepsilon,$$

which means that $\{T_n\}$ converges in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ to $T$. $\square$
Outcome: If $\mathcal{X}$ and $\mathcal{Y}$ are normed spaces and $\{T_n\}$ is a $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$-valued sequence, then $\{T_n\}$ is uniformly convergent if and only if it is a strongly convergent Cauchy sequence in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ (recall that every convergent sequence is a Cauchy sequence). If $\{T_n\}$ is strongly convergent, then it is strongly bounded (Proposition 3.39). Now suppose $\mathcal{X}$ is a Banach space. In this case Proposition 4.44 asserts that pointwise convergence coincides with strong convergence, and strong boundedness coincides with uniform boundedness (Theorem 4.43). Hence

$$T_n \xrightarrow{s} T \quad\text{implies}\quad \sup_n \|T_n\| < \infty \quad\text{whenever } \mathcal{X} \text{ is Banach}.$$
Take $T \in \mathcal{B}[\mathcal{X}]$ and consider the power sequence $\{T^n\}$ in the normed algebra $\mathcal{B}[\mathcal{X}]$. The operator $T$ is called uniformly stable if the power sequence $\{T^n\}$ converges uniformly to the null operator; that is, if $T^n \xrightarrow{u} O$ (equivalently, $\|T^n\| \to 0$). $T$ is strongly stable if the power sequence $\{T^n\}$ converges strongly to the null operator; that is, if $T^n \xrightarrow{s} O$ (equivalently, $\|T^n x\| \to 0$ for every $x \in \mathcal{X}$). It is called power bounded if $\{T^n\}$ is a bounded sequence in $\mathcal{B}[\mathcal{X}]$; that is, if $\sup_n \|T^n\| < \infty$. If $\mathcal{X}$ is a Banach space, then the Banach-Steinhaus Theorem ensures that $T$ is power bounded if and only if $\sup_n \|T^n x\| < \infty$ for every $x \in \mathcal{X}$. Clearly, uniform stability implies strong stability,

$$T^n \xrightarrow{u} O \quad\text{implies}\quad T^n \xrightarrow{s} O,$$

which in turn implies power boundedness whenever $\mathcal{X}$ is a Banach space:

$$T^n \xrightarrow{s} O \quad\text{implies}\quad \sup_n \|T^n\| < \infty \quad\text{if } \mathcal{X} \text{ is Banach}.$$
However, the converses fail.
Example 4K. Consider the diagonal operator $D_a \in \mathcal{B}[\ell_+^p]$ of Example 4H (for some $p \ge 1$, where $a = \{\alpha_k\}$ lies in $\ell_+^\infty$), and recall that $\|D_a\| = \sup_k |\alpha_k| = \|a\|_\infty$. It is readily verified by induction that the $n$th power of $D_a$, $D_a^n$, is again a diagonal operator in $\mathcal{B}[\ell_+^p]$. Indeed,

$$D_a^n x = \{\alpha_k^n \xi_k\} \quad\text{for every}\quad x = \{\xi_k\}_{k=0}^\infty \in \ell_+^p,$$

so that $\|D_a^n\| = \sup_k |\alpha_k|^n = \|a\|_\infty^n$ for every $n \ge 0$. Hence $D_a \in \mathcal{B}[\ell_+^p]$ is uniformly stable if and only if $\|a\|_\infty < 1$; that is,

$$D_a^n \xrightarrow{u} O \quad\text{if and only if}\quad \sup_k |\alpha_k| < 1.$$
Next we shall investigate strong stability. If $\|D_a^n x\| \to 0$ for every $x \in \ell_+^p$, then, in particular (see Example 4H), $\|D_a^n e_i\| = |\alpha_i|^n \to 0$ as $n \to \infty$, and hence $|\alpha_i| < 1$, for every $i \ge 0$. On the other hand, let $x = \{\xi_k\}_{k=0}^\infty$ be an arbitrary point in $\ell_+^p$ and note that

$$0 \le \|D_a^n x\|^p = \sum_{k=0}^\infty |\alpha_k|^{np}|\xi_k|^p = \sum_{k=0}^m |\alpha_k|^{np}|\xi_k|^p + \sum_{k=m+1}^\infty |\alpha_k|^{np}|\xi_k|^p \le \max_{0\le k\le m} |\alpha_k|^{np} \sum_{k=0}^m |\xi_k|^p + \sup_{k>m} |\alpha_k|^{np} \sum_{k=m+1}^\infty |\xi_k|^p$$

for every pair of integers $m, n \ge 0$. If $|\alpha_k| < 1$ for every $k \ge 0$, then $\sup_k |\alpha_k|^{np} \le 1$ for all $n \ge 0$. Thus

$$0 \le \|D_a^n x\|^p \le \max_{0\le k\le m} |\alpha_k|^{np}\,\|x\|^p + \sum_{k=m+1}^\infty |\xi_k|^p$$

for every $m, n \ge 0$. Now take an arbitrary $\varepsilon > 0$ and suppose $x \ne 0$ (the case $x = 0$ is trivial). Since $\sum_{k=m+1}^\infty |\xi_k|^p \to 0$ as $m \to \infty$ (for $\sum_{k=0}^\infty |\xi_k|^p = \|x\|^p < \infty$; Problem 3.11), it follows that there exists a positive integer $m_\varepsilon$ such that

$$m \ge m_\varepsilon \quad\text{implies}\quad \sum_{k=m+1}^\infty |\xi_k|^p < \tfrac{\varepsilon^p}{2}.$$

If $|\alpha_k| < 1$ for every $k \ge 0$, then $\max_{0\le k\le m} |\alpha_k|^p < 1$, so that $\lim_n (\max_{0\le k\le m} |\alpha_k|^p)^n = 0$ for every nonnegative integer $m$. In particular, $\lim_n \max_{0\le k\le m_\varepsilon} |\alpha_k|^{np} = 0$, and hence there exists a positive integer $n_\varepsilon$ such that

$$n \ge n_\varepsilon \quad\text{implies}\quad \max_{0\le k\le m_\varepsilon} |\alpha_k|^{np} < \tfrac{\varepsilon^p}{2\|x\|^p}.$$

But $0 \le \|D_a^n x\|^p \le \max_{0\le k\le m_\varepsilon} |\alpha_k|^{np}\,\|x\|^p + \sum_{k=m_\varepsilon+1}^\infty |\xi_k|^p$, and hence

$$n \ge n_\varepsilon \quad\text{implies}\quad 0 \le \|D_a^n x\| \le \varepsilon.$$

Therefore, if $|\alpha_k| < 1$ for every $k \ge 0$, then $\|D_a^n x\| \to 0$ for every $x \in \ell_+^p$. Conclusion: $D_a \in \mathcal{B}[\ell_+^p]$ is strongly stable if and only if $|\alpha_k| < 1$ for every $k \ge 0$; that is,

$$D_a^n \xrightarrow{s} O \quad\text{if and only if}\quad |\alpha_k| < 1 \ \text{ for every } k \ge 0.$$
For instance, the diagonal operator $D_a = \mathrm{diag}(\{\frac{k}{k+1}\}_{k=0}^\infty) \in \mathcal{B}[\ell_+^p]$ of Example 4H is strongly stable but not uniformly stable.
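A quick numerical sketch of this example (ours; finite truncations of the sequences stand in for elements of $\ell_+^p$, so the computed "operator norm" only approximates $\sup_k \alpha_k^n$ from below):

```python
alpha = [k / (k + 1) for k in range(200)]  # every alpha_k < 1, sup_k alpha_k = 1

def D_power(n, x):
    """D_a^n x = {alpha_k^n * xi_k}."""
    return [a ** n * xi for a, xi in zip(alpha, x)]

def norm_p(x, p=2):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [1.0 / (k + 1) for k in range(200)]  # one fixed vector

# Strong stability: ||D_a^n x|| shrinks toward 0 as n grows, for each fixed x.
vals = [norm_p(D_power(n, x)) for n in (1, 10, 100)]
assert vals[0] > vals[1] > vals[2]

# Not uniformly stable: ||D_a^n|| = (sup_k alpha_k)^n = 1 for every n; on this
# finite truncation the max is only close to 1, and it grows with the cutoff.
op_norm_est = max(a ** 50 for a in alpha)
assert 0.7 < op_norm_est < 1.0
```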
Example 4L. Consider the mapping $S_-: \ell_+^p \to \ell_+^p$ defined by

$$S_- x = \{\xi_{k+1}\}_{k=0}^\infty \quad\text{for every}\quad x = \{\xi_k\}_{k=0}^\infty \in \ell_+^p$$

(i.e., $S_-(\xi_0, \xi_1, \xi_2, \ldots) = (\xi_1, \xi_2, \xi_3, \ldots)$), which is also represented as an infinite matrix

$$S_- = \begin{pmatrix} 0 & 1 & & \\ & 0 & 1 & \\ & & 0 & 1 \\ & & & \ddots \end{pmatrix},$$

where every entry above the main diagonal is equal to one and the remaining entries are all zero. This is the backward unilateral shift on $\ell_+^p$. It is readily verified that $S_-$ is linear and bounded, so that $S_- \in \mathcal{B}[\ell_+^p]$. Actually, $\|S_- x\|^p = \sum_{k=1}^\infty |\xi_k|^p \le \sum_{k=0}^\infty |\xi_k|^p = \|x\|^p$ for every $x = \{\xi_k\}_{k=0}^\infty \in \ell_+^p$, so that $S_-$ is, in fact, a contraction (i.e., $\|S_-\| \le 1$). Consider the power sequence $\{S_-^n\}$ in $\mathcal{B}[\ell_+^p]$. A trivial induction shows that, for each $n \in \mathbb{N}_0$,
S-"x = {k+" }
for every
x = (rk}
E t+
Hence IIS"-x ll P = r_°O_ Ilk IP --o- 0 as n -- oo for every x = Problem 3.11) so that S- is strongly stable; that is, s"
E t+ (cf.
0.
However, $S_-$ is not uniformly stable, as we shall see next. Indeed, $\|S_-^n\| \le \|S_-\|^n \le 1$ for every $n \ge 0$ (see Problem 4.47(a)). On the other hand, consider the $\ell_+^p$-valued sequence $\{e_i\}_{i=0}^\infty$ where, for each integer $i \ge 0$, $e_i$ is a scalar-valued sequence with just one nonzero entry (equal to one) at the $i$th position (i.e., $e_i = \{\delta_{ik}\}_{k=0}^\infty \in \ell_+^p$ for every $i \ge 0$). Note that $\|e_i\| = 1$ for every $i \ge 0$ and $S_-^n e_{n+1} = e_1$ for every $n \ge 0$. Thus $\|S_-^n\| = \sup_{\|x\|=1} \|S_-^n x\| \ge \|S_-^n e_{n+1}\| = \|e_1\| = 1$ for every nonnegative integer $n$. Therefore, for every $n \ge 0$,

$$\|S_-^n\| = 1,$$

so that $S_-^n \not\xrightarrow{u} O$. Conclusion: The power sequence $\{S_-^n\}$ does not converge uniformly. Reason: Since $S_-^n \xrightarrow{s} O$, and since uniform convergence implies strong convergence to the same limit, the only candidate for a uniform limit of $\{S_-^n\}$ is $O$; as $S_-^n \not\xrightarrow{u} O$, it follows that $\{S_-^n\}$ does not converge uniformly.
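A Python sketch of Example 4L (ours; Python lists stand in for $\ell_+^p$ vectors, and dropping leading entries implements $S_-^n$):

```python
def shift_power(n, x):
    """S_-^n x = {xi_{k+n}}: drop the first n coordinates."""
    return x[n:]

def norm_p(x, p=2):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [1.0 / (k + 1) for k in range(500)]

# Strong stability: ||S_-^n x|| is the tail norm of x, which vanishes as n grows.
tails = [norm_p(shift_power(n, x)) for n in (1, 50, 400)]
assert tails[0] > tails[1] > tails[2]

# Not uniformly stable: e_{n+1} is a unit vector with S_-^n e_{n+1} = e_1.
n = 7
e = [0.0] * 20
e[n + 1] = 1.0
assert norm_p(e) == 1.0 and norm_p(shift_power(n, e)) == 1.0  # so ||S_-^n|| = 1
```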
Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces and let $\Theta$ be a subset of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$. According to the Closed Set Theorem, $\Theta$ is closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ if and only if every $\Theta$-valued sequence that converges in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has its limit in $\Theta$; equivalently, every uniformly convergent sequence $\{T_n\}$ of bounded linear transformations in $\Theta \subseteq \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has its (uniform) limit $T$ in $\Theta$. In this case the set $\Theta \subseteq \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is also called uniformly closed (or closed in the uniform topology of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$). We say that a subset $\Theta$ of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is strongly closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ if every $\Theta$-valued strongly convergent sequence $\{T_n\}$ has its (strong) limit $T$ in $\Theta$.
Proposition 4.48. If $\Theta \subseteq \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is strongly closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, then it is (uniformly) closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$.
Proof. Take an arbitrary $\Theta$-valued uniformly convergent sequence, say $\{T_n\}$, and let $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ be its (uniform) limit. Since uniform convergence implies strong convergence to the same limit, it follows that $\{T_n\}$ converges strongly to $T$. If every $\Theta$-valued strongly convergent sequence has its (strong) limit in $\Theta$, then $T \in \Theta$. Conclusion: Every $\Theta$-valued uniformly convergent sequence has its (uniform) limit in $\Theta$. $\square$

Remark: If $\mathcal{X}$ is finite-dimensional, then strong convergence coincides with uniform convergence (Proposition 4.46), and hence the concepts of strongly closed and uniformly closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ coincide as well whenever $\mathcal{X}$ is a finite-dimensional normed space.

Example 4M. Take an arbitrary $p \ge 1$ and let $\mathcal{D}$ be the collection of all diagonal operators in $\mathcal{B}[\ell_+^p]$. That is,

$$\mathcal{D} = \{D_a \in \mathcal{B}[\ell_+^p] : D_a = \mathrm{diag}(\{\alpha_k\}_{k=0}^\infty) \ \text{ and } \ a = \{\alpha_k\}_{k=0}^\infty \in \ell_+^\infty\}$$
(see Example 4H). Set

$$\mathcal{D}_0 = \{D_a \in \mathcal{D} : \alpha_k \to 0 \ \text{ as } k \to \infty\},$$

the collection of all diagonal operators $D_a = \mathrm{diag}(\{\alpha_k\}_{k=0}^\infty)$ in $\mathcal{B}[\ell_+^p]$ such that the scalar-valued sequence $a = \{\alpha_k\}_{k=0}^\infty$ converges to zero. As a matter of fact, both $\mathcal{D}$ and $\mathcal{D}_0$ are subalgebras of the Banach algebra $\mathcal{B}[\ell_+^p]$. Let $\{D_{a_n}\}$ be an arbitrary $\mathcal{D}_0$-valued uniformly convergent sequence. Hence $D_{a_n} \xrightarrow{u} D \in \mathcal{B}[\ell_+^p]$ with each $D_{a_n} = \mathrm{diag}(\{\alpha_k(n)\}_{k=0}^\infty)$ in $\mathcal{D}_0$, so that each $a_n = \{\alpha_k(n)\}_{k=0}^\infty$ converges to zero. According to Problem 4.51, $D$ is a diagonal operator; that is, $D = D_a = \mathrm{diag}(\{\alpha_k\}_{k=0}^\infty) \in \mathcal{D}$ for some $a = \{\alpha_k\}_{k=0}^\infty \in \ell_+^\infty$. Thus (see Example 4H)

$$\|D_{a_n} - D_a\| = \|a_n - a\|_\infty = \sup_k |\alpha_k(n) - \alpha_k|.$$

Now take an arbitrary $\varepsilon > 0$. Since $\sup_k |\alpha_k(n) - \alpha_k| \to 0$ as $n \to \infty$ (because $\|D_{a_n} - D_a\| \to 0$), there exists a positive integer $n_\varepsilon$ such that

$$\sup_k |\alpha_k(n_\varepsilon) - \alpha_k| < \varepsilon.$$
Moreover, since $\alpha_k(n_\varepsilon) \to 0$ as $k \to \infty$ (for the sequence $a_{n_\varepsilon}$ converges to zero), there exists a positive integer $k_\varepsilon$ such that $|\alpha_k(n_\varepsilon)| < \varepsilon$ whenever $k \ge k_\varepsilon$. Finally recall that $\big||\alpha_k| - |\alpha_k(n_\varepsilon)|\big| \le |\alpha_k - \alpha_k(n_\varepsilon)|$, and hence

$$|\alpha_k| \le \sup_k |\alpha_k - \alpha_k(n_\varepsilon)| + |\alpha_k(n_\varepsilon)|$$

for every $k$. Therefore,

$$k \ge k_\varepsilon \quad\text{implies}\quad |\alpha_k| < 2\varepsilon,$$

so that $\alpha_k \to 0$. Thus $a = \{\alpha_k\}_{k=0}^\infty$ converges to zero, and so $D_a \in \mathcal{D}_0$. Conclusion:

$\mathcal{D}_0$ is closed in $\mathcal{B}[\ell_+^p]$.

Next consider a $\mathcal{D}_0$-valued sequence $\{D_n\}_{n\ge1}$, where each $D_n$ is a diagonal operator whose only nonzero entries are the first $n$ entries in the main diagonal, which are all equal to one: $D_n = \mathrm{diag}(1, \ldots, 1, 0, 0, 0, \ldots) \in \mathcal{D}_0$. Observe that $\|D_n x - x\|^p = \sum_{k=n}^\infty |\xi_k|^p \to 0$ as $n \to \infty$ for every $x = \{\xi_k\}_{k=0}^\infty \in \ell_+^p$ (because $\sum_{k=0}^\infty |\xi_k|^p = \|x\|^p < \infty$; Problem 3.11). Hence

$$D_n \xrightarrow{s} I,$$

but the identity is a diagonal operator that does not lie in $\mathcal{D}_0$ (i.e., $I \in \mathcal{D}\backslash\mathcal{D}_0$). Outcome:

$\mathcal{D}_0$ is not strongly closed in $\mathcal{B}[\ell_+^p]$.
4.9 Compact Operators
Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces. A linear transformation $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is compact (or completely continuous) if it maps bounded subsets of $\mathcal{X}$ into relatively compact subsets of $\mathcal{Y}$; that is, if $T(A)^-$ is compact in $\mathcal{Y}$ whenever $A$ is bounded in $\mathcal{X}$. Equivalently, $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is compact if $T(A)$ lies in a compact subset of $\mathcal{Y}$ whenever $A$ is bounded in $\mathcal{X}$ (see Theorem 3.62).
Theorem 4.49. If $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is compact, then $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$.

Proof. Take an arbitrary bounded subset $A$ of $\mathcal{X}$. If $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is compact, then $T(A)^-$ is a compact subset of $\mathcal{Y}$. Thus $T(A)^-$ is totally bounded in $\mathcal{Y}$ (Corollary 3.81), and so $T(A)^-$ is bounded in $\mathcal{Y}$ (Corollary 3.71). Hence $T(A)$ is a bounded subset of $\mathcal{Y}$, and therefore the linear transformation $T$ is bounded by Proposition 4.12. $\square$
In other words, every compact linear transformation is continuous. The converse is clearly false, for the identity $I$ of an infinite-dimensional normed space $\mathcal{X}$ into itself is not compact by Corollary 4.34. Recall that $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is of finite rank (or a finite-dimensional linear transformation) if it has a finite-dimensional range (see Problem 2.18).
Proposition 4.50. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces. If $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is of finite rank, then it is compact.
Proof. Take any bounded subset $A$ of $\mathcal{X}$. If $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$, then $T(A)$ is a bounded subset of $\mathcal{Y}$ (Proposition 4.12). Thus $T(A)^-$ is closed and bounded in $\mathcal{Y}$. If $T$ is of finite rank, then the range $\mathcal{R}(T)$ of $T$ is a finite-dimensional subspace of $\mathcal{Y}$ (Corollary 4.29), and hence $\mathcal{R}(T)$ is a closed subset of $\mathcal{Y}$. Thus $T(A)^-$ is closed and bounded in $\mathcal{R}(T)$, according to Problem 3.38(d), and therefore a compact subset of $\mathcal{R}(T)$ by Corollary 4.30. Since the metric space $\mathcal{R}(T)$ is a subspace of the metric space $\mathcal{Y}$, it follows that $T(A)^-$ is compact in $\mathcal{Y}$. $\square$

The assumption "$T$ is bounded" cannot be removed from the statement of Proposition 4.50. Actually, we have already exhibited an unbounded linear transformation $L$ of an arbitrary infinite-dimensional normed space $\mathcal{X}$ into an arbitrary nonzero normed space $\mathcal{Y}$. If $\dim \mathcal{Y} = 1$, then $L \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is a rank-one linear transformation that is not even bounded (see proof of Corollary 4.30, part (b)).
Corollary 4.51. If $\mathcal{X}$ is a finite-dimensional normed space and $\mathcal{Y}$ is any normed space, then every linear transformation $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is of finite rank and compact.

Proof. If $\mathcal{X}$ is finite-dimensional, then $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ (Corollary 4.30) and $\dim \mathcal{R}(T) < \infty$ (Problems 2.17 and 2.18). Thus $T$ is bounded and of finite rank, and hence compact by the preceding proposition. $\square$

Theorem 4.52. Let $T$ be a linear transformation of a normed space $\mathcal{X}$ into a normed space $\mathcal{Y}$. The following four assertions are pairwise equivalent.
(a) $T$ is compact (i.e., $T$ maps bounded sets into relatively compact sets).

(b) $T$ maps the unit ball $B_1[0]$ into a relatively compact set.

(c) $T$ maps every $B_1[0]$-valued sequence into a sequence that has a convergent subsequence.

(d) $T$ maps bounded sequences into sequences that have a convergent subsequence.

Moreover, each of the above equivalent assertions implies that

(e) $T$ maps bounded sets into totally bounded sets,

which in turn implies that
(f) $T$ maps the unit ball $B_1[0]$ into a totally bounded set.

Furthermore, if $\mathcal{Y}$ is a Banach space, then these six assertions are all pairwise equivalent.
Proof. First note that (a)$\Rightarrow$(b) trivially. Hence, in order to verify that (a), (b), (c) and (d) are pairwise equivalent, it is enough to show that (b)$\Rightarrow$(c), (c)$\Rightarrow$(d) and (d)$\Rightarrow$(a).

Proof of (b)$\Rightarrow$(c). Take an arbitrary $B_1[0]$-valued sequence $\{x_n\}$. If (b) holds, then $\{Tx_n\}$ lies in a compact subset of $\mathcal{Y}$ or, equivalently, in a sequentially compact subset of $\mathcal{Y}$ (Corollary 3.81). Thus, according to Definition 3.76, the $\mathcal{Y}$-valued sequence $\{Tx_n\}$ has a convergent subsequence. Therefore (b) implies (c).

Proof of (c)$\Rightarrow$(d). Take an arbitrary $\mathcal{X}$-valued bounded sequence $\{x_n\}$, so that there exists a nonnegative real number $\beta \ge \sup_n \|x_n\|$ for which $\{\beta^{-1}x_n\}$ is a $B_1[0]$-valued sequence. If assertion (c) holds, then $\{T(\beta^{-1}x_n)\}$ has a convergent subsequence, and so $\{Tx_n\}$ has a convergent subsequence. Thus (c) implies (d).

Proof of (d)$\Rightarrow$(a). If (d) holds, then the image of every bounded sequence in $\mathcal{X}$ has a subsequence that converges in $\mathcal{Y}$. Then every sequence in the closure $T(A)^-$ of the image $T(A)$ of any bounded subset $A$ of $\mathcal{X}$ has a convergent subsequence by Theorem 3.30, which means that $T(A)^-$ is sequentially compact (Definition 3.76). Therefore (d) implies (a) by Corollary 3.81.

Moreover, observe that (a) implies (e). Indeed, if $T(A)^-$ is a compact subset of $\mathcal{Y}$ whenever $A$ is a bounded subset of $\mathcal{X}$, then $T(A)^-$ (and hence $T(A)$) is totally bounded in $\mathcal{Y}$ whenever $A$ is bounded in $\mathcal{X}$, according to Corollary 3.81. Also note that (e) trivially implies (f). Conversely, suppose (f) holds true, so that $T(B_1[0])$ is a totally bounded subset of $\mathcal{Y}$. If $\mathcal{Y}$ is a Banach space, then Corollary 3.84(b) ensures that $T(B_1[0])$ is relatively compact in $\mathcal{Y}$. Therefore (f) implies (b) whenever $\mathcal{Y}$ is a Banach space. $\square$
We shall denote the collection of all compact linear transformations of a normed space $\mathcal{X}$ into a normed space $\mathcal{Y}$ by $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. Recall that $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}] \subseteq \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ by Theorem 4.49 (and $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}] = \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ whenever $\mathcal{X}$ is finite-dimensional by Corollary 4.51). Accordingly, we shall write $\mathcal{B}_\infty[\mathcal{X}]$ for $\mathcal{B}_\infty[\mathcal{X}, \mathcal{X}]$: the collection of all compact operators on a normed space $\mathcal{X}$.
Theorem 4.53. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces.

(a) $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is a linear manifold of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$.

(b) If $\mathcal{Y}$ is a Banach space, then $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is a subspace of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$.

Proof. It is trivially verified that $\alpha T \in \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ for every scalar $\alpha$ whenever $T \in \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ (Theorem 4.52(d)). In order to verify that $S + T \in \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ for every $S, T \in \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$, proceed as follows. Take $S$ and $T$ in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}] \subseteq$
$\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, and let $\{x_n\}_{n\ge1}$ be an arbitrary $\mathcal{X}$-valued bounded sequence. Theorem 4.52(d) ensures that there exists a subsequence of $\{Tx_n\}_{n\ge1}$, say $\{Tx_{n_k}\}_{k\ge1}$, that converges in $\mathcal{Y}$. Now consider the subsequence $\{x_{n_k}\}_{k\ge1}$ of $\{x_n\}_{n\ge1}$, which is clearly bounded. Then (Theorem 4.52(d) again) the sequence $\{Sx_{n_k}\}_{k\ge1}$ has a subsequence, say $\{Sx_{n_{k_j}}\}_{j\ge1}$, that converges in $\mathcal{Y}$. Since $\{Tx_{n_{k_j}}\}_{j\ge1}$ is a subsequence of the convergent sequence $\{Tx_{n_k}\}_{k\ge1}$, it follows that $\{Tx_{n_{k_j}}\}_{j\ge1}$ also converges in $\mathcal{Y}$ (Proposition 3.5). Therefore $\{(S+T)x_{n_{k_j}}\}_{j\ge1} = \{Sx_{n_{k_j}}\}_{j\ge1} + \{Tx_{n_{k_j}}\}_{j\ge1}$ is a convergent subsequence of $\{(S+T)x_n\}_{n\ge1}$, and hence $S + T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$ is compact by Theorem 4.52(d). Thus $S + T \in \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. Conclusion: $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is a linear manifold of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$.

Claim. If $\mathcal{Y}$ is a Banach space, then $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$.

Proof.
Take an arbitrary $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$-valued sequence $\{T_n\}$ that converges (uniformly) in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ to, say, $T \in \mathcal{B}[\mathcal{X}, \mathcal{Y}]$. Thus for each $\varepsilon > 0$ there exists a positive integer $n_\varepsilon$ such that

$$\|(T - T_{n_\varepsilon})x\| \le \|T - T_{n_\varepsilon}\|\,\|x\| < \tfrac{\varepsilon}{2}\|x\|$$

for every $x \in \mathcal{X}$. Since $T_{n_\varepsilon}$ is compact, it follows by Theorem 4.52(f) that the image $T_{n_\varepsilon}(B_1[0])$ of the unit ball $B_1[0]$ is totally bounded in $\mathcal{Y}$, and hence $T_{n_\varepsilon}(B_1[0])$ has a finite $\frac{\varepsilon}{2}$-net, say $Y_\varepsilon$ (Definition 3.68). Therefore, for each $x \in B_1[0]$ there exists $y \in Y_\varepsilon$ such that $\|T_{n_\varepsilon}x - y\| < \frac{\varepsilon}{2}$, and so

$$\|Tx - y\| = \|Tx - T_{n_\varepsilon}x + T_{n_\varepsilon}x - y\| \le \|(T - T_{n_\varepsilon})x\| + \|T_{n_\varepsilon}x - y\| < \varepsilon.$$

That is, $Y_\varepsilon$ is a finite $\varepsilon$-net for $T(B_1[0])$, which means that $T(B_1[0])$ is totally bounded in $\mathcal{Y}$. Thus, if $\mathcal{Y}$ is a Banach space, then $T$ is compact by Theorem 4.52. Conclusion: Every $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$-valued sequence that converges in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has its limit in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. Hence $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ by the Closed Set Theorem (Theorem 3.30). $\square$
Outcome: If $\mathcal{Y}$ is a Banach space, then $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is a subspace of $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ (which are Banach spaces by Propositions 4.15 and 4.7). $\square$

Recall that a two-sided ideal $\mathcal{I}$ of an algebra $\mathcal{A}$ is a subalgebra of $\mathcal{A}$ such that the product (both left and right products) of every element of $\mathcal{I}$ with any element of $\mathcal{A}$ is again an element of $\mathcal{I}$ (see Problem 2.30).
Proposition 4.54. If $\mathcal{X}$ is a normed space, then $\mathcal{B}_\infty[\mathcal{X}]$ is a two-sided ideal of the normed algebra $\mathcal{B}[\mathcal{X}]$.

Proof. $\mathcal{B}_\infty[\mathcal{X}]$ is a linear manifold of $\mathcal{B}[\mathcal{X}]$ by Theorem 4.53. Take $S \in \mathcal{B}_\infty[\mathcal{X}]$ and $T \in \mathcal{B}[\mathcal{X}]$ arbitrary.

Claim. Both $ST$ and $TS$ lie in $\mathcal{B}_\infty[\mathcal{X}]$.
Proof. Let $A$ be any bounded subset of $\mathcal{X}$, so that $T(A)$ is bounded by Proposition 4.12. Since $S$ is compact, it follows (by definition) that $S(T(A))^-$ is compact. Thus the composition $ST$ maps bounded sets into relatively compact sets, which means that $ST$ is compact. Moreover, $S(A)^-$ is compact as well. Since $T$ is continuous, it follows that $T(S(A)^-)$ is compact by Theorem 3.64 (and hence $T(S(A)^-)$ is closed; Theorem 3.62), and also that $T(S(A))^- = T(S(A)^-)^- = T(S(A)^-)$ according to Problem 3.46. Therefore $T(S(A))^-$ is compact. Thus the composition $TS$ maps bounded sets into relatively compact sets, which means that $TS$ is compact. $\square$

Conclusion: $\mathcal{B}_\infty[\mathcal{X}]$ is a two-sided ideal of $\mathcal{B}[\mathcal{X}]$. $\square$
Let $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}]$ denote the collection of all finite-rank bounded linear transformations of a normed space $\mathcal{X}$ into a normed space $\mathcal{Y}$. By Proposition 4.50 it follows that $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}] \subseteq \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ (and $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}] = \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}] = \mathcal{B}[\mathcal{X}, \mathcal{Y}] = \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ whenever $\mathcal{X}$ is finite-dimensional; Corollary 4.51). We shall write $\mathcal{B}_0[\mathcal{X}]$ for $\mathcal{B}_0[\mathcal{X}, \mathcal{X}]$, the collection of all finite-rank operators on $\mathcal{X}$. It is readily verified that both $ST$ and $TS$ lie in $\mathcal{B}_0[\mathcal{X}]$ for every $S \in \mathcal{B}_0[\mathcal{X}]$ and every $T \in \mathcal{B}[\mathcal{X}]$. Indeed, it is clear that $ST$ is of finite rank (for the range of $ST$ is trivially included in the range of $S$). Moreover, $TS$ is of finite rank because $TS = T|_{\mathcal{R}(S)}S$ and the domain of $T|_{\mathcal{R}(S)}$ is the finite-dimensional range of $S$ (and so its own range is finite-dimensional as well; see Problem 2.17). Therefore $\mathcal{B}_0[\mathcal{X}]$ is also a two-sided ideal of $\mathcal{B}[\mathcal{X}]$.

Corollary 4.55. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces. If $\mathcal{Y}$ is a Banach space, then every $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}]$-valued sequence that converges (uniformly) in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has its limit in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$.
Proof. This is a straightforward application of Theorem 4.53(b), for $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}] \subseteq \mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. Indeed, since $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$, the Closed Set Theorem ensures that every $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$-valued sequence (and hence every $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}]$-valued sequence) that converges in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ has its limit in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. $\square$

Example 4N. Consider the diagonal operator $D_a \in \mathcal{B}[\ell_+^p]$ of Example 4K (for some $p \ge 1$, where $a = \{\alpha_k\}_{k=0}^\infty$ lies in $\ell_+^\infty$). We shall show that
$$D_a \ \text{ is compact} \quad\text{if and only if}\quad \alpha_k \to 0 \ \text{ as } k \to \infty.$$
Consider the $\ell_+^p$-valued sequence $\{e_i\}_{i=0}^\infty$ with $e_i = \{\delta_{ik}\}_{k=0}^\infty \in \ell_+^p$ for each $i \ge 0$ (just one nonzero entry, equal to one, at the $i$th position) as in Example 4H.

(a) For each nonnegative integer $n$ set $D_{a_n} = \mathrm{diag}(\alpha_0, \ldots, \alpha_n, 0, 0, 0, \ldots)$ in $\mathcal{B}[\ell_+^p]$. It is readily verified that each $D_{a_n}$ is of finite rank (proof: $y \in \mathcal{R}(D_{a_n})$ if and only if $y = \sum_{i=0}^n \alpha_i \xi_i e_i$ for some $x = \{\xi_k\}_{k=0}^\infty \in \ell_+^p$, so that $\mathcal{R}(D_{a_n}) \subseteq \mathrm{span}\{e_i\}_{i=0}^n$, and hence $\dim \mathcal{R}(D_{a_n}) \le n + 1$ by Theorem 2.6). Moreover (Example 4H again), $\|D_{a_n} - D_a\| = \sup_{k>n} |\alpha_k|$ for every $n \ge 0$. If $\lim_k |\alpha_k| = 0$, then $\lim_n \sup_{k>n} |\alpha_k| = \limsup_k |\alpha_k| = 0$ (Problem 3.13), and hence $\lim_n \|D_{a_n} - D_a\| = 0$. Thus $\alpha_k \to 0$ implies $D_{a_n} \xrightarrow{u} D_a$, which in turn implies that $D_a$ is compact by Corollary 4.55.
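Part (a) can be checked numerically. In this sketch (ours; a long finite truncation stands in for the full diagonal sequence) we take $\alpha_k = \frac{1}{k+1} \to 0$ and watch $\|D_{a_n} - D_a\| = \sup_{k>n}|\alpha_k|$ go to zero, which is the uniform convergence of finite-rank truncations that drives the compactness of $D_a$:

```python
alpha = [1.0 / (k + 1) for k in range(1000)]  # alpha_k -> 0

def trunc_error(n):
    """||D_{a_n} - D_a|| = sup of the diagonal tail discarded by the rank-(n+1) truncation."""
    return max(alpha[n + 1:])

errors = [trunc_error(n) for n in (0, 9, 99)]
assert errors == [alpha[1], alpha[10], alpha[100]]  # sup of a decreasing tail is its first term
assert errors[0] > errors[1] > errors[2]  # -> 0 as the truncation grows
```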
(b) Conversely, if $\alpha_i \not\to 0$, then $\{\alpha_i\}_{i=0}^\infty$ has a subsequence, say $\{\alpha_{i_m}\}_{m=0}^\infty$, such that $\inf_m |\alpha_{i_m}| > 0$. Set $\varepsilon = \inf_m |\alpha_{i_m}| > 0$ and note that $\|D_a e_{i_m} - D_a e_{i_n}\|^p = \|\alpha_{i_m}e_{i_m} - \alpha_{i_n}e_{i_n}\|^p = |\alpha_{i_m}|^p + |\alpha_{i_n}|^p \ge 2\varepsilon^p$ whenever $m \ne n$. Then no subsequence of $\{D_a e_{i_m}\}_{m=0}^\infty$ is a Cauchy sequence, and hence none converges in $\ell_+^p$. Therefore the bounded sequence $\{e_{i_m}\}_{m=0}^\infty$ is such that its image under $D_a$, $\{D_a e_{i_m}\}_{m=0}^\infty$, has no convergent subsequence. Hence $D_a$ is not compact by Theorem 4.52(c). Conclusion: If $D_a$ is compact, then $\alpha_i \to 0$ as $i \to \infty$.

Example 4O. According to Theorem 4.53 and Corollary 4.55, it follows that
$\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is closed in $\mathcal{B}[\mathcal{X}, \mathcal{Y}]$ whenever $\mathcal{Y}$ is a Banach space, and hence every uniform limit of a $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}]$-valued sequence lies in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$. Now consider the setup of Examples 4M and 4N, and note that $\mathcal{D}_0$, the collection of all diagonal operators $D_a$ with $\alpha_k \to 0$, is precisely the collection of all compact diagonal operators on the Banach space $\ell_+^p$. It was shown in Example 4M that $\mathcal{D}_0$ is not strongly closed in $\mathcal{B}[\ell_+^p]$ by exhibiting a sequence of finite-rank diagonal operators (and hence a sequence of compact diagonal operators) that converges strongly to the identity $I$ on $\ell_+^p$, which is not even compact. Conclusion: $\mathcal{B}_0[\ell_+^p]$ is not strongly closed in $\mathcal{B}_\infty[\ell_+^p]$, which implies that $\mathcal{B}_0[\ell_+^p]$ and $\mathcal{B}_\infty[\ell_+^p]$ are not strongly closed in $\mathcal{B}[\ell_+^p]$.

It may be tempting to think that the converse of Corollary 4.55 holds. It in fact does hold whenever the Banach space $\mathcal{Y}$ has a Schauder basis (Problem 4.11), and it also holds whenever $\mathcal{Y}$ is a Hilbert space (next chapter). In these cases, every $T$ in $\mathcal{B}_\infty[\mathcal{X}, \mathcal{Y}]$ is the uniform limit of a $\mathcal{B}_0[\mathcal{X}, \mathcal{Y}]$-valued sequence. But such an italicized result fails in general (see Problem 4.58). However, every compact linear transformation comes close to having a finite-dimensional range in the following sense.
Proposition 4.56. Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces and take $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$. If $T$ is compact, then for each $\varepsilon > 0$ there exists a finite-dimensional subspace $\mathcal{R}_\varepsilon$ of the range $\mathcal{R}(T)$ of $T$ such that

$$d(Tx, \mathcal{R}_\varepsilon) < \varepsilon\|x\| \quad\text{for every}\quad x \in \mathcal{X}.$$
Proof. Take an arbitrary $\varepsilon > 0$. If $T \in \mathcal{L}[\mathcal{X}, \mathcal{Y}]$ is compact, then the image $T(B_1[0])$ of the closed unit ball $B_1[0]$ is totally bounded in the normed space $\mathcal{R}(T)$ by Theorem
4.52(b). Thus there exists a finite $\varepsilon$-net for $T(B_1[0])$, say $\{v_i\}_{i=1}^{n_\varepsilon} \subseteq \mathcal{R}(T)$. That is, for every $y \in T(B_1[0])$ there exists $v_y \in \{v_i\}_{i=1}^{n_\varepsilon}$ such that $\|y - v_y\| < \varepsilon$. Set $\mathcal{R}_\varepsilon = \mathrm{span}\{v_i\}_{i=1}^{n_\varepsilon} \subseteq \mathcal{R}(T)$. $\mathcal{R}_\varepsilon$ is a finite-dimensional subspace of $\mathcal{R}(T)$ (Theorem 2.6, Proposition 4.7 and Corollary 4.28), and

$$d(Tx, \mathcal{R}_\varepsilon) = \inf_{u \in \mathcal{R}_\varepsilon} \|Tx - u\| = \|x\| \inf_{u \in \mathcal{R}_\varepsilon} \Big\|T\Big(\tfrac{x}{\|x\|}\Big) - \tfrac{u}{\|x\|}\Big\| \le \|x\| \min_{1 \le i \le n_\varepsilon} \Big\|T\Big(\tfrac{x}{\|x\|}\Big) - v_i\Big\| < \varepsilon\|x\|$$

for every nonzero $x$ in $\mathcal{X}$, which concludes the proof (for $0 \in \mathcal{R}_\varepsilon$). $\square$
Proposition 4.57. The range of a compact linear transformation is separable.

Proof. Let X be a normed space. Consider the collection {B_n(0)}_{n∈N} of all open balls with center at the origin of X and radius n, which clearly covers X. That is,

X = ⋃_n B_n(0).

Let Y be a normed space and let T : X → Y be any mapping of X into Y, so that (see Problem 1.2(e))

R(T) = T(X) = T(⋃_n B_n(0)) = ⋃_n T(B_n(0)).
If T ∈ B∞[X, Y], then each T(B_n(0)) is separable (Theorem 4.52 and Proposition 3.72), and hence each T(B_n(0)) has a countable dense subset. That is, for each n ∈ N there exists a countable set A_n ⊆ T(B_n(0)) such that A_n⁻ = T(B_n(0))⁻ (cf. Problem 3.38(g)). Therefore,

⋃_n A_n ⊆ ⋃_n T(B_n(0)) ⊆ ⋃_n A_n⁻ ⊆ (⋃_n A_n)⁻

(see Section 3.5). Thus (⋃_n A_n)⁻ = R(T)⁻, so that ⋃_n A_n is dense in R(T) (cf. Problem 3.38(g) again). Moreover, since each A_n is countable, it follows by Corollary 1.11 that the countable union ⋃_n A_n is countable as well. Outcome: ⋃_n A_n is a countable dense subset of R(T), which means that R(T) is separable. □
If T : X → Y is a compact linear transformation of a normed space X into a normed space Y, then the restriction T|M : M → Y of T to a linear manifold M of X is a linear transformation (Problem 2.14). Moreover, it is clear by the very definition of compact linear transformations that T|M is compact as well: the restriction of a compact linear transformation T : X → Y to a linear manifold of X is again a compact linear transformation (i.e., T|M lies in B∞[M, Y] whenever T lies in B∞[X, Y]). On the other hand, if M is a linear manifold of an infinite-dimensional Banach space X, and T : M → Y is a compact linear transformation
of M into a Banach space Y, then it is easy to show that an arbitrary bounded linear extension of T over X may not be compact (see Problem 4.60). However, if M is dense in X, then the extension of T over X must be compact. In fact, the extension of a compact linear transformation T : X → Y over a completion of X into a completion of Y is again a compact linear transformation. Recall that every bounded linear transformation T of a normed space X into a normed space Y has a unique (up to an isometric isomorphism) bounded linear extension T̂ over the (essentially unique) completion X̂ of X into the (essentially unique) completion Ŷ of Y (Theorems 4.40 to 4.42). The next theorem says that T̂ ∈ B[X̂, Ŷ] is compact whenever T ∈ B[X, Y] is compact.
Theorem 4.58. Let the Banach spaces X̂ and Ŷ be completions of the normed spaces X and Y, respectively. If T lies in B∞[X, Y], then its bounded linear extension T̂ : X̂ → Ŷ lies in B∞[X̂, Ŷ].

Proof. Let X̂ and Ŷ be completions of X and Y. Thus there exist dense linear manifolds X̃ and Ỹ of X̂ and Ŷ that are isometrically isomorphic to X and Y, respectively (Definition 4.39 and Theorem 4.40). Let U_X ∈ G[X, X̃] and U_Y ∈ G[Y, Ỹ] denote such isometric isomorphisms. Take T ∈ B[X, Y] and set T̃ = U_Y T U_X⁻¹ ∈ B[X̃, Ỹ], so that the diagram

    X ---T---> Y
    |          |
   U_X        U_Y
    |          |
    v          v
    X̃ ---T̃---> Ỹ

commutes. Now take an arbitrary bounded X̂-valued sequence {x̂_n}. Since X̃⁻ = X̂ (i.e., since inf{‖x̂ − x̃‖ : x̃ ∈ X̃} = 0 for every x̂ ∈ X̂; see Proposition 3.32), it follows that there exists an X̃-valued sequence {x̃_n} equiconvergent with {x̂_n} (i.e., such that ‖x̂_n − x̃_n‖ → 0; for instance, for each integer n take x̃_n in X̃ such that ‖x̂_n − x̃_n‖ < 1/n), which is bounded (for ‖x̃_n‖ ≤ ‖x̂_n − x̃_n‖ + ‖x̂_n‖ for every n). Consider the X-valued sequence {x_n} such that x̃_n = U_X x_n for each n, which is bounded too: ‖x_n‖ = ‖U_X x_n‖ = ‖x̃_n‖ for every n. If T is compact, then the Y-valued sequence {T x_n} has a convergent subsequence (Theorem 4.52), say {T x_{n_k}}. Thus {U_Y T x_{n_k}} converges in Ỹ (because U_Y is a homeomorphism). Therefore {T̃ x̃_{n_k}} converges in Ŷ (for T̃ x̃_{n_k} = U_Y T U_X⁻¹ x̃_{n_k} = U_Y T x_{n_k} for each k) to, say, ỹ ∈ Ỹ ⊆ Ŷ. Hence {T̂ x̂_{n_k}} converges in Ŷ. Indeed,

‖T̂ x̂_{n_k} − ỹ‖ = ‖T̂(x̂_{n_k} − x̃_{n_k}) + T̃ x̃_{n_k} − ỹ‖ ≤ ‖T̂‖ ‖x̂_{n_k} − x̃_{n_k}‖ + ‖T̃ x̃_{n_k} − ỹ‖

for every k because T̃ = T̂|X̃, so that T̂ x̂_{n_k} → ỹ (reason: ‖T̃ x̃_{n_k} − ỹ‖ → 0 and ‖x̂_{n_k} − x̃_{n_k}‖ → 0; see Proposition 3.5) as k → ∞. Conclusion: T̂ takes bounded sequences into sequences that have a convergent subsequence; that is, T̂ is compact (Theorem 4.52). □
4.10 The Hahn-Banach Theorem and Dual Spaces

Three extremely important results on continuous (i.e., bounded) linear transformations, which yield a solid foundation for a large portion of modern analysis, are the Open Mapping Theorem, the Banach-Steinhaus Theorem and the Hahn-Banach Theorem. The Hahn-Banach Theorem is concerned with the existence of bounded linear extensions of bounded linear functionals (i.e., of scalar-valued bounded linear transformations), and it is the basis for several existence results that are often applied in functional analysis. In particular, the Hahn-Banach Theorem ensures the existence of a large supply of continuous linear functionals on a normed space X, and hence it is of fundamental importance for introducing the dual space of X (the collection of all continuous linear functionals on X).

Let M be any linear manifold of a linear space X and consider a linear transformation L : M → Y of M into a linear space Y. From a purely algebraic point of view, a plain linear extension L̂ : X → Y of L over X has already been investigated in Theorem 2.9. On the other hand, if M is a dense linear manifold of the normed space X and T : M → Y is a bounded linear transformation of M into a Banach space Y, then T has a unique bounded linear extension T̂ : X → Y over X (Theorem 4.35). In particular, every bounded linear functional on a dense linear manifold M of a normed space X has a bounded linear extension over X. The results of Section 3.8 (and also of Section 4.7), which ensure the existence of a uniformly continuous extension over a metric space X of a uniformly continuous mapping on a dense subset of X, are called extension by continuity. What the Hahn-Banach Theorem does is to ensure the existence of a bounded linear extension f̂ : X → F over X for every bounded linear functional f : M → F on any linear manifold M of the normed space X. (Here M is not necessarily dense in X, so that extension by continuity collapses.) We shall approach the Hahn-Banach Theorem step by step. The first steps are purely algebraic and, as such, could have been introduced in Chapter 2. To begin with, we consider the following lemma on linear functionals, acting on a linear manifold of a real linear space, which are dominated by a sublinear functional (i.e., by a nonnegative homogeneous and subadditive functional).
Lemma 4.59. Let M0 be a proper linear manifold of a real linear space X. Take x1 ∈ X\M0 and consider the linear manifold M1 of X generated by M0 and x1,

M1 = M0 + span{x1}.

Let p : X → R be a sublinear functional on X. If f0 : M0 → R is a linear functional on M0 such that

f0(x) ≤ p(x)   for every   x ∈ M0,

then there exists a linear extension f1 : M1 → R of f0 over M1 such that

f1(x) ≤ p(x)   for every   x ∈ M1.
Proof. Take an arbitrary vector x1 in X\M0.

Claim. There exists a real number c1 such that

−p(−w − x1) − f0(w) ≤ c1 ≤ p(w + x1) − f0(w)   for every   w ∈ M0.

Proof. Since the linear functional f0 : M0 → R is dominated by a subadditive functional p : X → R, it follows that

f0(v) − f0(u) = f0(v − u) ≤ p(v − u) = p(v + x1 − u − x1) ≤ p(v + x1) + p(−u − x1)

for every u, v ∈ M0. Therefore,

−p(−u − x1) − f0(u) ≤ p(v + x1) − f0(v)

for every u ∈ M0 and every v ∈ M0. Set

a1 = sup_{u∈M0} (−p(−u − x1) − f0(u))   and   b1 = inf_{v∈M0} (p(v + x1) − f0(v)).

The above inequality ensures that a1 and b1 are real numbers, and also that a1 ≤ b1. Thus the claimed result holds for any c1 ∈ [a1, b1]. □
Recall that every x in M1 = M0 + span{x1} can be uniquely written as x = x0 + αx1 with x0 in M0 and α in R. Now consider the functional f1 : M1 → R defined by the formula

f1(x) = f0(x0) + αc1

for every x ∈ M1, where the pair (x0, α) in M0 × R stands for the unique representation of x in M0 + span{x1}. It is readily verified that f1 is, in fact, a linear extension of f0 over M1 (i.e., f1 inherits the linearity of f0 and f1|M0 = f0). Finally, we show that p also dominates f1. Take an arbitrary x = x0 + αx1 in M1 and consider the three possibilities, viz., α = 0, α > 0, or α < 0. If α = 0, then f1(x) ≤ p(x) trivially (in this case, f1(x) = f0(x0) ≤ p(x0) = p(x)). Next recall that p is nonnegative homogeneous (i.e., p(γz) = γp(z) for every z ∈ X and every γ ≥ 0), and consider the above claimed inequalities. If α > 0, then (applying the claim with w = x0/α)

f1(x) = f0(x0) + αc1 ≤ f0(x0) + αp(x0/α + x1) − αf0(x0/α) = p(x0 + αx1) = p(x).

On the other hand, if α < 0, then (applying the claim with w = −x0/|α|)

f1(x) = f0(x0) + αc1 = f0(x0) − |α|c1 ≤ f0(x0) + |α|p(x0/|α| − x1) + |α|f0(−x0/|α|) = p(x0 − |α|x1) = p(x0 + αx1) = p(x),
which concludes the proof. □
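The interval [a1, b1] of admissible values of c1 is easy to compute in a toy example. In the sketch below (all choices are illustrative and not from the text: X = R², p the 1-norm, M0 the first coordinate axis, f0(t, 0) = t, and x1 = (0, 1)), the supremum and infimum are estimated over a finite grid, and the dominance f1 ≤ p of the extended functional is then checked on sampled points of M1.

```python
# All choices here are illustrative: X = R^2, p(x, y) = |x| + |y| (a sublinear
# functional), M0 = {(t, 0)}, f0(t, 0) = t (so f0 <= p on M0), and x1 = (0, 1).
def p(x, y):
    return abs(x) + abs(y)

def f0(t):
    return t

grid = [k / 10.0 for k in range(-500, 501)]  # sample of the coordinates t of M0

# a1 = sup_u (-p(-u - x1) - f0(u))   and   b1 = inf_v (p(v + x1) - f0(v))
a1 = max(-p(-t, -1.0) - f0(t) for t in grid)
b1 = min(p(t, 1.0) - f0(t) for t in grid)
assert a1 <= b1  # the interval [a1, b1] of admissible values of c1 is nonempty

c1 = (a1 + b1) / 2.0  # any c1 in [a1, b1] works

def f1(t, alpha):
    # the one-step extension of the lemma on M1 = M0 + span{x1}
    return f0(t) + alpha * c1

# dominance f1 <= p on M1, checked on sampled points
for t in grid[::7]:
    for alpha in grid[::11]:
        assert f1(t, alpha) <= p(t, alpha) + 1e-9
```

In this example a1 = −1 and b1 = 1, so every c1 in [−1, 1] yields a dominated extension; the lemma guarantees in general that this interval is never empty.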
Theorem 4.60. (Real Hahn-Banach Theorem). Let M be a linear manifold of a real linear space X and let p : X → R be a sublinear functional on X. If f : M → R is a linear functional on M such that

f(x) ≤ p(x)   for every   x ∈ M,

then there exists a linear extension f̂ : X → R of f over X such that

f̂(x) ≤ p(x)   for every   x ∈ X.
Proof. First note that, except for the dominance condition that travels from f to f̂, this could be viewed as a particular case of Theorem 2.9, the kernel of whose proof is Zorn's Lemma. Let

K = {φ ∈ L[N, R] : N is a linear manifold of X, M ⊆ N, and f = φ|M}

be the collection of all linear functionals on linear manifolds of the real linear space X which extend the linear functional f : M → R, and set

K′ = {φ ∈ K : φ(x) ≤ p(x) for every x ∈ N = D(φ)}.

Note that K′ is not empty (for f ∈ K′). Following the proof of Theorem 2.9, K is partially ordered (in the extension ordering, and so is its subcollection K′) and every chain in K has a supremum in K. Then every chain {φ_γ} in K′ has a supremum ⋁_γ φ_γ in K, which actually lies in K′. Indeed, since each φ_γ is such that φ_γ(x) ≤ p(x) for every x in the domain D(φ_γ) of φ_γ, and since {φ_γ} is a chain, it follows that (⋁_γ φ_γ)(x) ≤ p(x) for every x in the domain ⋃_γ D(φ_γ) of ⋁_γ φ_γ. (In fact, if x ∈ ⋃_γ D(φ_γ), then x ∈ D(φ_μ) for some φ_μ ∈ {φ_γ}, and hence (⋁_γ φ_γ)(x) = φ_μ(x) ≤ p(x) because (⋁_γ φ_γ)|D(φ_μ) = φ_μ.) Therefore, every chain in K′ has a supremum (and so an upper bound) in K′. Thus, according to Zorn's Lemma, K′ has a maximal element, say φ0 : N0 → R. Now we shall apply Lemma 4.59 to show that N0 = X. Suppose N0 ≠ X. Take x1 ∈ X\N0 and consider the linear manifold N1 of X generated by N0 and x1,

N1 = N0 + span{x1},

which properly includes N0. Since φ0 ∈ K′, it follows that φ0(x) ≤ p(x) for every x ∈ N0. Thus, according to Lemma 4.59, there is a linear extension φ1 : N1 → R of φ0 over N1 such that φ1(x) ≤ p(x) for every x ∈ N1. Therefore φ0 ≠ φ1 ∈ K′, which contradicts the fact that φ0 is maximal in K′ (for φ0 ≤ φ1). Conclusion: N0 = X. Outcome: φ0 is a linear extension of f over X which is dominated by p. □
Theorem 4.61. (Hahn-Banach Theorem). Let M be a linear manifold of a linear space X over F and let p : X → R be a seminorm on X. If f : M → F is a linear functional on M such that

|f(x)| ≤ p(x)   for every   x ∈ M,

then there exists a linear extension f̂ : X → F of f over X such that

|f̂(x)| ≤ p(x)   for every   x ∈ X.
Proof. As we have agreed in the introduction to this chapter, F denotes either the complex field C or the real field R. Recall that a seminorm is a nonnegative convex functional (i.e., a nonnegative absolutely homogeneous subadditive functional).

(a) If F = R, then this is an easy corollary of the previous theorem. Indeed, if F = R, then the condition |f| ≤ p trivially implies f ≤ p on M. As a seminorm is a sublinear functional, Theorem 4.60 ensures the existence of a linear extension f̂ : X → R of f over X such that f̂ ≤ p on X. Since f̂ is linear and p is absolutely homogeneous, it follows that −f̂(x) = f̂(−x) ≤ p(−x) = |−1| p(x) = p(x) for every x ∈ X. Hence −p ≤ f̂, and therefore |f̂| ≤ |p| = p on X (for p is nonnegative).
(b) Suppose F = C and note that the complex linear space X can also be viewed as a real linear space (where scalar multiplication now means multiplication by real scalars only). Moreover, if M is a linear manifold of the (complex) linear space X, then it is also a (real) linear manifold of X when X is regarded as a real linear space. Furthermore, if f : M → C is a complex-valued functional on M, and if g : M → R and h : M → R are defined by g(x) = Re f(x) and h(x) = Im f(x) for every x ∈ M, then

f = g + ih.

Now recall that f : M → C is linear. Thus, for an arbitrary α ∈ R,

g(αx) + ih(αx) = f(αx) = αf(x) = αg(x) + iαh(x),

and hence g(αx) = αg(x) and h(αx) = αh(x) for every x ∈ M (because g and h are real-valued). Similarly, for every x, y ∈ M,

g(x+y) + ih(x+y) = f(x+y) = f(x) + f(y) = g(x) + ih(x) + g(y) + ih(y),

and so g(x+y) = g(x) + g(y) and h(x+y) = h(x) + h(y). Conclusion: g : M → R and h : M → R are linear functionals on M when M is regarded as a real linear space. Observe that f(ix) = if(x), and hence g(ix) + ih(ix) = ig(x) − h(x), for every x ∈ M. Since g(x), g(ix), h(x) and h(ix) are real numbers, it follows that h(x) = −g(ix), and therefore

f(x) = g(x) − ig(ix)
for every x ∈ M. If |f| ≤ p on M, then g = Re f ≤ p on M. Since g is a linear functional on the (real) linear manifold M, and since p is a sublinear functional on the (real) linear space X (because it is a seminorm on the complex linear space X), it follows by Theorem 4.60 that there exists a real-valued linear extension ĝ of g over the (real) linear space X such that ĝ ≤ p on X. Consider the functional f̂ : X → C (on the complex linear space X) defined by

f̂(x) = ĝ(x) − iĝ(ix)

for every x ∈ X. It is clear that f̂ extends f over X (reason: if x ∈ M, so that ix ∈ M, then f̂(x) = g(x) − ig(ix) = f(x)). It is also readily verified that f̂ is a linear functional on the complex space X. Indeed, additivity and (real) homogeneity are trivially verified (because ĝ is additive and homogeneous on the real linear space X). Thus it suffices to verify that f̂(ix) = if̂(x) for every x ∈ X. In fact,

f̂(ix) = ĝ(ix) − iĝ(−x) = ĝ(ix) + iĝ(x) = if̂(x)

for every x ∈ X. Therefore, f̂ is a linear extension of f over X. Finally we show that

|f̂(x)| ≤ p(x)

for every x ∈ X. Take an arbitrary x in X and write the complex number f̂(x) in polar form: f̂(x) = ρe^{iθ} (if f̂(x) = 0, then ρ = 0 and θ is any number between 0 and 2π). Since f̂ is a linear functional on the complex space X, it follows that f̂(e^{−iθ}x) = ρ = |f̂(x)|, which is a real number. Then f̂(e^{−iθ}x) = ĝ(e^{−iθ}x), and hence

|f̂(x)| = ĝ(e^{−iθ}x) ≤ p(e^{−iθ}x) = |e^{−iθ}| p(x) = p(x),

since p : X → R is absolutely homogeneous on the complex space X. □
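The decomposition f(x) = g(x) − ig(ix) used in part (b) can be verified numerically for a concrete complex-linear functional (the coefficients below are illustrative choices, not from the text):

```python
import random
random.seed(1)

# a complex-linear functional on C^3, f(x) = sum(c_k x_k); coefficients illustrative
c = [2 - 1j, 0.5 + 3j, -1 + 1j]

def f(x):
    return sum(ck * xk for ck, xk in zip(c, x))

def g(x):
    # the real part of f, viewed as a real-linear functional
    return f(x).real

for _ in range(100):
    x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(3)]
    ix = [1j * xk for xk in x]
    # the reconstruction formula from the proof: f(x) = g(x) - i g(ix)
    assert abs(f(x) - (g(x) - 1j * g(ix))) < 1e-12
```

The point of the formula is that the real part alone determines a complex-linear functional, which is what lets the real Hahn-Banach theorem do the work in the complex case.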
Theorems 4.60 and 4.61 are called Dominated Extension Theorems (in which no topology is involved). The next one is the Continuous Extension Theorem.
Theorem 4.62. (Hahn-Banach Theorem in Normed Space). Let M be a linear manifold of a normed space X. Every bounded linear functional f : M → F defined on M has a bounded linear extension f̂ : X → F over the whole space X such that ‖f̂‖ = ‖f‖.
Proof. Take an arbitrary f ∈ B[M, F], so that |f(x)| ≤ ‖f‖ ‖x‖ for each x ∈ M. Set p(x) = ‖f‖ ‖x‖ for every x ∈ X, which defines a seminorm on X (since p : X → R is a multiple of a norm ‖·‖ : X → R on X; in fact, p is a norm on X whenever f ≠ 0). Since |f(x)| ≤ p(x) for every x ∈ M, it follows by the previous theorem that there exists a linear extension f̂ : X → F of f over X such that

|f̂(x)| ≤ p(x) = ‖f‖ ‖x‖

for every x ∈ X. Thus f̂ is bounded (i.e., f̂ ∈ B[X, F]) and ‖f̂‖ ≤ ‖f‖. On the other hand, f(x) = f̂(x) for every x ∈ M (because f = f̂|M), and hence

‖f‖ = sup_{x∈M, ‖x‖≤1} |f(x)| ≤ sup_{x∈X, ‖x‖≤1} |f̂(x)| = ‖f̂‖.
Therefore ‖f̂‖ = ‖f‖. □
Here are some consequences of the Hahn-Banach Theorem that are particularly useful.
Corollary 4.63. Let M be a proper subspace of a normed space X. If x0 ∈ X\M, then there exists a bounded linear functional f : X → F such that f(x0) = 1, f(M) = {0}, and ‖f‖ = d(x0, M)⁻¹.

Proof. First note that d(x0, M) ≠ 0: the distance from x0 to M is strictly positive because x0 ∈ X\M and M = M⁻ (see Problem 3.43(b)). Hence d(x0, M)⁻¹ is well-defined. Now consider the linear manifold M0 of X generated by M and x0,

M0 = M + span{x0},

so that every x in M0 can be uniquely written as x = u + αx0 with u in M and α in F. Let f0 : M0 → F be the functional on M0 defined by

f0(u + αx0) = α

for every x = u + αx0 ∈ M0. It is easy to verify that f0 is linear, f0(M) = {0}, and f0(x0) = 1. Next we show that f0 is bounded (so that f0 ∈ B[M0, F]) and ‖f0‖ = d(x0, M)⁻¹. Consider the set
S = {(u, α) ∈ M×F : (u, α) ≠ (0, 0)}

and its partition {S1, S2}, where

S1 = {(u, α) ∈ M×F : α ≠ 0}   and   S2 = {(u, α) ∈ M×F : u ≠ 0 and α = 0}.

Observe that x = u + αx0 = 0 (in M0) if and only if u = 0 (in M) and α = 0 (in F); reason: M is a linear manifold of X and x0 ∈ X\M, so that span{x0} ∩ M = {0}. Hence x = u + αx0 ≠ 0 in M0 if and only if (u, α) ∈ S, which implies that

‖f0‖ = sup_{0≠x∈M0} |f0(x)| / ‖x‖ = sup_{(u,α)∈S} |α| / ‖u + αx0‖ = sup_{(u,α)∈S1} |α| / ‖u + αx0‖,

since sup_{(u,α)∈S2} |α| / ‖u + αx0‖ = 0 and S = S1 ∪ S2. However, inf_{v∈M} ‖v + x0‖ = inf_{v∈M} ‖x0 − v‖ = d(x0, M) ≠ 0, and so (see Problem 4.5)

sup_{(u,α)∈S1} |α| / ‖u + αx0‖ = sup_{u∈M} 1 / ‖u + x0‖ = (inf_{u∈M} ‖u + x0‖)⁻¹ = d(x0, M)⁻¹

(the first equality because u/α runs over M as u does). Summing up: f0 is a bounded linear functional on the linear manifold M0 of X such that f0(x0) = 1, f0(M) = {0} and ‖f0‖ = d(x0, M)⁻¹. Therefore, according to Theorem 4.62, f0 : M0 → F has a bounded linear extension f : X → F over
X such that ‖f‖ = ‖f0‖ = d(x0, M)⁻¹. Moreover, since f|M0 = f0, it also follows that f(x0) = f0(x0) = 1 and f(M) = f0(M) = {0}, which concludes the proof. □
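Corollary 4.63 can be illustrated in the Euclidean plane. In the sketch below (illustrative choices added here: M is the first coordinate axis and x0 = (1, 2), so that d(x0, M) = 2), the functional f(x, y) = y/2 satisfies f(x0) = 1, f(M) = {0}, and ‖f‖ = d(x0, M)⁻¹ = 1/2:

```python
import random
from math import sqrt

# Illustrative setup: X = R^2 (Euclidean norm), M = {(t, 0)}, x0 = (1, 2).
x0 = (1.0, 2.0)

def f(x, y):
    # f(u + a*x0) = a; since the second coordinate of u + a*x0 is 2a, f(x, y) = y/2
    return y / 2.0

# f(x0) = 1 and f vanishes on M
assert f(*x0) == 1.0
assert all(f(t, 0.0) == 0.0 for t in (-3.0, 0.0, 7.5))

# d(x0, M) = inf_t ||x0 - (t, 0)|| = 2 (attained at t = 1)
d = min(sqrt((x0[0] - t) ** 2 + x0[1] ** 2)
        for t in [k / 100.0 for k in range(-500, 501)])
assert abs(d - 2.0) < 1e-9

# ||f|| = sup |f(x)| / ||x|| equals d(x0, M)^(-1) = 1/2
random.seed(2)
best = 0.0
for _ in range(2000):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    n = sqrt(x * x + y * y)
    if n > 1e-9:
        best = max(best, abs(f(x, y)) / n)
assert 0.49 < best <= 0.5 + 1e-12
```

The supremum defining ‖f‖ is approached along the direction orthogonal to M, which is exactly where the distance d(x0, M) is realized.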
Recall that {0} is a proper subspace of any nonzero normed space X. Thus, according to Corollary 4.63, for each x0 ≠ 0 in X ≠ {0} there exists a bounded linear functional f : X → F such that f(x0) = 1 and ‖f‖ = d(x0, {0})⁻¹ = ‖x0‖⁻¹. Now set f0 = ‖x0‖f, which is again a bounded linear functional on X, so that f0(x0) = ‖x0‖ and ‖f0‖ = 1. Moreover, if x0 = 0, then take x1 ≠ 0 in X and a bounded linear functional f1 on X such that f1(x1) = ‖x1‖ and ‖f1‖ = 1. Since f1 is linear, f1(x0) = f1(0) = 0 = ‖x0‖. This proves the following corollary.
Corollary 4.64. For each vector x0 in a normed space X ≠ {0} there exists a bounded linear functional f : X → F such that ‖f‖ = 1 and f(x0) = ‖x0‖. Consequently, there exist nonzero bounded linear functionals defined on every nonzero normed space.
Let X and Y be normed spaces over the same field, and consider the normed space B[X, Y] of all bounded linear transformations of X into Y. If X ≠ {0}, then Corollary 4.64 ensures the existence of f ≠ 0 in B[X, F]. Suppose Y ≠ {0}, take any y ≠ 0 in Y, and set

T(x) = f(x)y

for every x ∈ X. This defines a nonzero mapping T : X → Y, which certainly is linear and bounded. Conclusion: there exists T ≠ 0 in B[X, Y] whenever X and Y are nonzero normed spaces.
Example 4V. Proposition 4.15 says that B[X, Y] is a Banach space whenever Y is a Banach space. If X = {0}, then B[X, Y] = {0}, which is a trivial Banach space regardless of whether Y is Banach or not. Thus the converse of Proposition 4.15 should read as follows: if X ≠ {0} and B[X, Y] is a Banach space, then Y is a Banach space. Corollary 4.64 asserts that there exists f ≠ 0 in B[X, F] whenever X ≠ {0}. Take an arbitrary Cauchy sequence {y_n} in Y and consider the B[X, Y]-valued sequence {T_n} such that, for each n,

T_n x = f(x) y_n

for every x ∈ X. Each T_n in fact lies in B[X, Y] because f lies in B[X, F]. Indeed, for each integer n,

‖T_n‖ = sup_{‖x‖≤1} ‖T_n x‖ = sup_{‖x‖≤1} |f(x)| ‖y_n‖ = ‖f‖ ‖y_n‖,

and, for any pair of integers m and n,

‖T_m − T_n‖ = sup_{‖x‖≤1} ‖(T_m − T_n)x‖ = ‖f‖ ‖y_m − y_n‖.

Hence {T_n} is a Cauchy sequence in B[X, Y]. If B[X, Y] is complete, then {T_n} converges in B[X, Y] to, say, T ∈ B[X, Y]. Since f ≠ 0, there exists x0 ∈ X such that f(x0) ≠ 0. Therefore,

y_n = f(x0)⁻¹ T_n x0 → f(x0)⁻¹ T x0   in Y

(uniform convergence implies strong convergence to the same limit), and so {y_n} converges in Y. Conclusion: if X ≠ {0} and B[X, Y] is complete, then Y is complete.

The dual space (or conjugate space) of a normed space X, denoted by X*, is the normed space of all continuous linear functionals on X (i.e., X* = B[X, F], where F stands for the real field R or the complex field C according as X is a real or complex normed space). Obviously, X* = {0} whenever X = {0}. Corollary 4.64 ensures the converse: X* ≠ {0} whenever X ≠ {0}. Indeed, if f(x) = 0 for all f ∈ X*, then x = 0. As a matter of fact, if x ≠ y in X, then there
exists f ∈ X* such that f(x) − f(y) = f(x − y) = ‖x − y‖ ≠ 0 (Corollary 4.64 again), and hence f(x) ≠ f(y). This is usually expressed by saying that X* separates the points of X. Still from Corollary 4.64, for each nonzero x ∈ X there exists f0 ∈ X* such that ‖f0‖ = 1 and ‖x‖ = |f0(x)|. Therefore,

‖x‖ = |f0(x)| ≤ sup_{0≠f∈X*} |f(x)| / ‖f‖ ≤ ‖x‖

(recall: |f(x)| ≤ ‖f‖ ‖x‖ for every x in X and every f in X*), which shows a symmetry in the definitions of the norms in X and X*:

‖x‖ = sup_{‖f‖≤1} |f(x)| = sup_{0≠f∈X*} |f(x)| / ‖f‖

for every x ∈ X. Observe that, according to Proposition 4.15, X* is a Banach space for every normed space X (reason: X* = B[X, F] and (F, |·|) is a Banach space).
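The symmetry ‖x‖ = sup_{‖f‖≤1} |f(x)| can be tested in a finite-dimensional model (an illustrative choice added here: X = R³ with the 1-norm, whose dual norm on coefficient vectors is the sup-norm):

```python
import random
random.seed(3)

# Illustrative model: X = R^3 with the 1-norm; a functional f is represented by
# its coefficient vector, f(x) = sum(f_k x_k), with dual norm ||f|| = max|f_k|.
def norm1(x):
    return sum(abs(t) for t in x)

def apply(f, x):
    return sum(a * b for a, b in zip(f, x))

x = [1.5, -2.0, 0.25]

# |f(x)| <= ||f|| ||x|| for randomly sampled f with ||f|| <= 1
for _ in range(500):
    f = [random.uniform(-1, 1) for _ in range(3)]
    assert abs(apply(f, x)) <= norm1(x) + 1e-12

# the norming functional f0 = sign(x) has ||f0|| = 1 and f0(x) = ||x||, so the
# supremum in ||x|| = sup_{||f|| <= 1} |f(x)| is attained
f0 = [1.0 if t >= 0 else -1.0 for t in x]
assert max(abs(t) for t in f0) == 1.0
assert abs(apply(f0, x) - norm1(x)) < 1e-12
```

In infinite dimensions the existence of such a norming functional is exactly what Corollary 4.64 provides; here it can simply be written down.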
Proposition 4.65. If the dual space X* of a normed space X is separable, then X itself is separable.

Proof. If X* = {0}, then X = {0} and the result holds trivially. Thus suppose X* ≠ {0} is separable and consider the unit sphere about the origin of X*, viz., S1 = {f ∈ X* : ‖f‖ = 1}. Since every subset of a separable metric space is separable (Corollary 3.36), it follows that S1 includes a countable dense subset, say {f_n}. For each f_n there exists x_n ∈ X such that ‖x_n‖ = 1 and

1/2 < |f_n(x_n)|.

Indeed, sup_{‖x‖=1} |f_n(x)| = ‖f_n‖ = 1 because f_n ∈ S1. Consider the countable subset {x_n} of X and put M = span{x_n}.
If M⁻ ≠ X, then Corollary 4.63 ensures that there exists 0 ≠ f ∈ X* such that f(M) = {0}. Set f0 = ‖f‖⁻¹f in X*, so that f0 ∈ S1 and f0(x_n) = 0 for every n. Hence

1/2 < |f_n(x_n)| = |(f_n − f0)(x_n)| ≤ ‖f_n − f0‖ ‖x_n‖ = ‖f_n − f0‖

for every n. But this implies that the set {f_n} is not dense in S1 (see Proposition 3.32), which contradicts the very definition of {f_n}. Outcome: M⁻ = X, and so X is separable by Proposition 4.9(b). □

If X ≠ {0}, then X* ≠ {0} and hence (X*)*, the dual of X*, is again a nonzero Banach space. We shall write X** instead of (X*)*, which is called the second dual (or bidual) of X. It is clear that X, X* and X** are normed spaces over the same scalar field. The next result shows that X can be identified with a linear manifold of X**, so that X is naturally embedded in its second dual X**.
Theorem 4.66. Every normed space X is isometrically isomorphic to a linear manifold of X**.

Proof. Suppose X ≠ {0} (otherwise the result is trivially verified), take an arbitrary x in X, and consider the functional φ_x : X* → F defined on the dual X* of X by

φ_x(f) = f(x)

for every f in X*. Since X* is a linear space,

φ_x(αf + βg) = (αf + βg)(x) = αf(x) + βg(x) = αφ_x(f) + βφ_x(g)

for every f, g ∈ X* and every α, β ∈ F, so that φ_x is linear. Moreover, since the elements of X* are bounded and linear, it also follows that

|φ_x(f)| = |f(x)| ≤ ‖f‖ ‖x‖

for every f ∈ X*, and hence φ_x is bounded. Thus φ_x ∈ X**. Indeed,

‖φ_x‖ = ‖x‖,

since Corollary 4.64 ensures the existence of f0 ∈ X* such that ‖f0‖ = 1 and |f0(x)| = ‖x‖, and therefore

‖φ_x‖ = sup_{‖f‖≤1} |f(x)| ≤ ‖x‖ = |f0(x)| = |φ_x(f0)| ≤ ‖φ_x‖ ‖f0‖ = ‖φ_x‖.

Let Φ : X → X** be the mapping that assigns to each vector x in X the functional φ_x in X**; that is,

Φ(x) = φ_x
for every x ∈ X. It is easy to verify that Φ is linear. Since ‖Φ(x)‖ = ‖x‖ for every x ∈ X, it follows that Φ is a linear isometry of X into X** (see Proposition 4.37). Hence Φ : X → R(Φ) ⊆ X** is an isometric isomorphism of X onto R(Φ) = Φ(X), the range of Φ. Thus the range of Φ is a linear manifold of X** isometrically isomorphic to X. □
This linear isometry Φ : X → X** is known as the natural embedding of the normed space X into its second dual X**. If Φ is surjective (i.e., if Φ(X) = X**), then we say that X is reflexive. Equivalently, X is reflexive if and only if the natural embedding Φ : X → X** is an isometric isomorphism of X onto X**. Thus, if X is reflexive, then X and X** are isometrically isomorphic (notation: X ≅ X**). The converse, however, fails: X ≅ X** clearly implies Φ(X) ≅ X** (for the composition of isometric isomorphisms is again an isometric isomorphism) but does not imply Φ(X) = X**. Since X** (the dual of X*) always is a Banach space, it follows by Problem 4.37 that every reflexive normed space is a Banach space. Again, the converse fails (i.e., there exist nonreflexive Banach spaces, as we shall see in Example 4S below). Recall that separability is a topological invariant (see Problem 3.48), so that the converse of Proposition 4.65 holds for reflexive Banach spaces. Indeed, if X ≅ X**, then X is separable if and only if X** is separable, which implies that X* is separable by Proposition 4.65. Therefore, if X is separable and X* is not separable, then X ≇ X** and hence X is not reflexive: a separable Banach space with a nonseparable dual is not reflexive. This provides a necessary condition for reflexivity. Here is an equivalent condition.

Proposition 4.67. A Banach space X is reflexive if and only if for each φ ∈ X** there exists x ∈ X such that

φ(f) = f(x)   for every   f ∈ X*.

Proof. Let Φ : X → X** be the natural embedding of X into X** and take an arbitrary φ ∈ X**. There exists x ∈ X such that φ(f) = f(x) for every f ∈ X* if and only if φ = φ_x for some x ∈ X, which means that φ ∈ R(Φ). Thus the stated condition holds for every φ ∈ X** if and only if Φ is surjective. □
Example 4Q. If X is a finite-dimensional normed space, then dim X = dim X* by Problem 4.64. Thus dim X* = dim X** because X* is finite-dimensional (Problem 4.64 again), and hence dim X = dim X**. Let Φ : X → X** be the natural embedding of X into X**. Since Φ(X) is a linear manifold of the finite-dimensional linear space X**, it follows by Problem 2.7 that Φ(X) also is a finite-dimensional linear space. Therefore, as X and Φ(X) are topologically isomorphic finite-dimensional normed spaces, Corollary 4.31 ensures that dim Φ(X) = dim X = dim X**. Then Φ(X) is a linear manifold of the finite-dimensional space X** with dim Φ(X) = dim X**, which implies that Φ(X) = X** (Problem 2.7 again). Conclusion: every finite-dimensional normed space is reflexive.
Example 4R. Take an arbitrary pair {p, q} of Hölder conjugates (i.e., take real numbers p > 1 and q > 1 such that 1/p + 1/q = 1), and consider the Banach spaces ℓ^p_+ and ℓ^q_+ of Example 4B. It can be shown that there exists a natural isometric isomorphism J_p : ℓ^q_+ → (ℓ^p_+)* of ℓ^q_+ onto the dual of ℓ^p_+. Thus, symmetrically, there exists a natural isometric isomorphism J_q : ℓ^p_+ → (ℓ^q_+)* of ℓ^p_+ onto the dual of ℓ^q_+. It then follows by Problem 4.65 that there exists an isometric isomorphism of the dual of ℓ^q_+ onto the second dual of ℓ^p_+. Therefore, the composition of these isometric isomorphisms is an isometric isomorphism of ℓ^p_+ onto its second dual, so that ℓ^p_+ ≅ (ℓ^p_+)**. Moreover, it can also be shown that this isometric isomorphism actually coincides with the natural embedding Φ : ℓ^p_+ → (ℓ^p_+)**. Conclusion:

ℓ^p_+ is a reflexive Banach space for every p > 1.

In particular, the very special space ℓ^2_+, besides being reflexive, is also isometrically equivalent to its own dual (actually, as we shall see in Section 5.11, the real space ℓ^2_+ is isometrically isomorphic to (ℓ^2_+)*).
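The identification of (ℓ^p_+)* with ℓ^q_+ rests on Hölder's inequality, |Σ x_k y_k| ≤ ‖x‖_p ‖y‖_q, together with the fact that the bound is attained. A numerical sketch (illustrative assumptions: finitely supported sequences, with p = 3 and q = 3/2):

```python
import random
from math import fsum
random.seed(4)

p, q = 3.0, 1.5  # Hölder conjugates: 1/p + 1/q = 1
assert abs(1 / p + 1 / q - 1.0) < 1e-12

def norm(x, r):
    return fsum(abs(t) ** r for t in x) ** (1 / r)

y = [random.uniform(-1, 1) for _ in range(20)]  # y in ℓ^q induces f_y in (ℓ^p)*

def f_y(x):
    return fsum(a * b for a, b in zip(y, x))

# Hölder's inequality: |f_y(x)| <= ||y||_q ||x||_p on sampled x
for _ in range(200):
    x = [random.uniform(-1, 1) for _ in range(20)]
    assert abs(f_y(x)) <= norm(y, q) * norm(x, p) + 1e-9

# equality (hence ||f_y|| = ||y||_q) is attained at x_k = sign(y_k) |y_k|^(q-1)
x_star = [(1.0 if t >= 0 else -1.0) * abs(t) ** (q - 1) for t in y]
assert abs(f_y(x_star) - norm(y, q) * norm(x_star, p)) < 1e-9
```

That the bound is attained is what makes J_p an isometry and not merely a contraction.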
Example 4S. There are, however, nonreflexive Banach spaces. For instance, consider the linear spaces ℓ^∞_+ and ℓ^1_+ equipped with their usual norms (‖·‖_∞ and ‖·‖_1, respectively). Since the space c0 of all scalar-valued sequences that converge to zero is a linear manifold of the linear space ℓ^∞_+, equip it with the sup-norm as well. Recall that c0 and ℓ^1_+ are separable Banach spaces but the Banach space ℓ^∞_+ is not separable (see Examples 3P to 3S and Problems 3.49 and 3.59). It is not very difficult to check that (c0)* ≅ ℓ^1_+ and (ℓ^1_+)* ≅ ℓ^∞_+, and so (c0)** ≅ ℓ^∞_+ (see Problem 4.65 again). Thus ℓ^1_+ is a separable Banach space with a nonseparable dual (reason: (ℓ^1_+)* is not separable because (ℓ^1_+)* ≅ ℓ^∞_+ and separability is a topological invariant). Hence,

ℓ^1_+ is a nonreflexive Banach space.

Since c0 is not even homeomorphic to ℓ^∞_+ (Problem 3.49) and (c0)** ≅ ℓ^∞_+, it follows that c0 ≇ (c0)**. Therefore,

c0 is a nonreflexive Banach space.
Suggested Reading

Bachman and Narici [1]; Banach [1]; Beauzamy [1]; Berberian [2]; Brown and Pearcy [1]; Conway [1]; Douglas [1]; Dunford and Schwartz [1]; Goffman and Pedrick [1]; Goldberg [1]; Hille and Phillips [1]; Istrăţescu [1]; Kantorovich and Akilov [1]; Kolmogorov and Fomin [1]; Kreyszig [1]; Maddox [1]; Naylor and Sell [1]; Reed and Simon [1]; Robertson and Robertson [1]; Rudin [1]; Schwartz [1]; Simmons [1]; Taylor and Lay [1]; Yosida [1]
Problems

Problem 4.1. We shall say that a topology T on a linear space X over a field F is compatible with the linear structure of X if vector addition and scalar multiplication are continuous mappings of X×X into X and of F×X into X, respectively. In this case T is said to be a compatible topology (or a linear topology) on X. When we refer to continuity of the mappings X×X → X and F×X → X defined by (x, y) ↦ x + y and (α, x) ↦ αx, respectively, it is understood that X×X and F×X are equipped with their product topology. If X is a metric space, then these are the topologies induced by any of the uniformly equivalent metrics of Problems 3.9 and 3.33. If X is a general topological space, then these are the product topologies (cf. remark in Problem 3.64). A topological vector space (or topological linear space) is a linear space X equipped with a compatible topology.
(a) Show that, for each y in a topological vector space X and each nonzero α in F, the translation mapping x ↦ x + y and the scaling mapping x ↦ αx are homeomorphisms of X onto itself.
(b) Show that every normed space is a topological vector space (a metrizable topological vector space, that is).
In other words, show that vector addition and scalar multiplication are continuous mappings of X x X into X and of FxX into X, respectively, with respect to the norm topology on X.
Problem 4.2. Consider the definitions of convex set and convex hull in a linear space (Problem 2.2). Recall that the closure of a subset of a topological space is the intersection of all closed subsets that include it.
(a) Show that in a topological vector space the closure of a convex set is convex. The intersection of all closed and convex subsets that include a subset A of a topological vector space is called the closed convex hull of A.
(b) Show that in a topological vector space the closed convex hull of a set A coincides with co(A)- (i.e., it coincides with the closure of the convex hull of A).
A subset A of a linear space X is balanced if αx ∈ A for every vector x ∈ A and every scalar α such that |α| ≤ 1 (i.e., if αA ⊆ A whenever |α| ≤ 1). A subset of a linear space is absolutely convex if it is both convex and balanced.

(c) Show that a subset A of a linear space is absolutely convex if and only if αx + βy ∈ A for every x, y ∈ A and all scalars α, β such that |α| + |β| ≤ 1 (hence the term "absolutely convex").
(d) Show that the interior A° of an absolutely convex set A contains the origin whenever A° is nonempty. (e) Show that in a topological vector space the closure of a balanced set is balanced, and therefore the closure of an absolutely convex set is absolutely convex.
A subset A of a linear space is absorbing (or absorbent) if for each vector x E X there exists e > 0 such that ax E A for every scalar a with lal < s. Equivalently,
if for each x E X there exists A > 0 such that x E sA for every scalar µ with JAI>A. (f) Show that in a topological vector space every neighborhood of the origin is absorbing. A subset A of a linear space X absorbs a subset B of X (or B is absorbed by A) if there exists fl > 0 such that x E B implies x E tiA for every scalar µ with JAI > fi (i.e., if B C j.tA whenever LuI > fl). In particular, A is absorbing if and only if it absorbs every singleton (x) in X. A subset B of a topological vector space is said to be bounded if it is absorbed by every neighborhood of the origin.
Problem 4.3. Let X be a linear space over a field F (either F = ℂ or F = ℝ). A quasinorm on X is a real-valued positive subadditive functional ‖·‖: X → ℝ that satisfies axioms (i), (ii) and (iv) of Definition 4.1 but, instead of axiom (iii), satisfies the following ones.

(iii′)  ‖αx‖ ≤ ‖x‖ whenever |α| ≤ 1,
(iii″)  ‖αₙx‖ → 0 whenever αₙ → 0,

for every x ∈ X (as usual, α stands for a scalar in F and {αₙ} for a scalar-valued sequence). A linear space X equipped with a quasinorm is called a quasinormed space. Consider the mapping d: X×X → ℝ defined by d(x, y) = ‖x − y‖ for every x, y ∈ X, where ‖·‖: X → ℝ is a quasinorm on X.

(a) Show that d is an additively invariant metric on X that also satisfies the condition

    d(αx, αy) ≤ d(x, y)  for every x, y ∈ X and every α ∈ F such that |α| ≤ 1.
This is called the metric generated by the quasinorm ‖·‖.
(b) Show that a norm on X is a quasinorm on X, so that every normed space is a quasinormed space. A quasinormed space that is complete as a metric space (with respect to the metric generated by the quasinorm) is called an F-space.
(c) Verify that every Banach space is an F-space.
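A concrete quasinorm that is not a norm helps fix the definition. The Python sketch below is our own example (not from the text): on X = ℝ, the functional |x|^(1/2) is positive, subadditive, and satisfies (iii′) and (iii″), but absolute homogeneity fails, so it is a quasinorm and not a norm. Since ℝ is complete under the generated metric, (ℝ, |·|^(1/2)) is an F-space that is not a Banach space under this functional.

```python
import random

def qnorm(x):
    # |x|^(1/2) on R: a quasinorm that is not a norm
    return abs(x) ** 0.5

random.seed(1)
xs = [random.uniform(-10, 10) for _ in range(100)]

# subadditivity: q(x + y) <= q(x) + q(y)
assert all(qnorm(x + y) <= qnorm(x) + qnorm(y) + 1e-12 for x in xs for y in xs)

# (iii'): q(a x) <= q(x) whenever |a| <= 1
assert all(qnorm(a * x) <= qnorm(x) + 1e-12
           for x in xs for a in (-1.0, -0.3, 0.5, 1.0))

# (iii''): q(a_n x) -> 0 whenever a_n -> 0
assert qnorm(1e-12 * 7.0) < 1e-5

# homogeneity fails: q(4x) = 2 q(x), not 4 q(x)
assert abs(qnorm(4.0) - 2 * qnorm(1.0)) < 1e-12

# the generated metric d(x, y) = q(x - y) is additively invariant
d = lambda x, y: qnorm(x - y)
assert all(abs(d(x + 1, y + 1) - d(x, y)) < 1e-12
           for x in xs[:10] for y in xs[:10])
print("sqrt(|x|) is a quasinorm on R but not a norm")
```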
Problem 4.4. A neighborhood base at a point x in a topological vector space is a collection N of neighborhoods of x with the property that every neighborhood of x includes some neighborhood in N. A locally convex space (or simply a convex space) is a topological vector space that has a neighborhood base at the origin consisting of convex sets. (a) Show that every normed space is locally convex.
A barrel is a subset of a locally convex space that is convex, balanced, absorbing and closed. It can be shown that every locally convex space has a neighborhood base at the origin consisting of barrels. A locally convex space is called barreled if every barrel is a neighborhood of the origin. Barreled spaces can be thought of as a generalization of Banach spaces. Indeed, a sequence {xₙ} in a locally convex space is a Cauchy sequence if for each neighborhood N of the origin there exists an integer n_N such that xₘ − xₙ ∈ N for all m, n ≥ n_N. It can be verified that every convergent sequence in a metrizable locally convex space X (i.e., every sequence that is eventually in every open neighborhood of a point x in X) is a Cauchy sequence. A set A in a metrizable locally convex space is complete if every Cauchy sequence in A converges to a point of A. A complete metrizable locally convex space is called a Fréchet space. Recall the definition of F-space (Problem 4.3) and also that every Banach space is an F-space.

(b) Show that every F-space is a Fréchet space. (c) Show that every Fréchet space is a barreled space. Hint: Note that a Fréchet space X is a complete metric space. Take an arbitrary barrel B in X and show that the countable collection {nB}ₙ≥₁ of closed sets covers X. Now apply the Baire Category Theorem (Theorem 3.58); see Problem 4.2(d). We shall return to barreled spaces in Problem 4.44.
Problem 4.5. Consider the definition of a bounded subset of a metric space: A is bounded if and only if diam(A) < ∞ (Section 3.1).
(a) Show that a subset A of a normed space X is bounded if and only if sup_{x∈A}‖x‖ < ∞. (By convention, sup_{x∈∅}‖x‖ = 0.)

Now consider the definition of a bounded subset of a topological vector space as given in Problem 4.2.

(b) Show that a set A is bounded as a subset of a normed space X if and only if it is bounded as a subset of the topological vector space X. That is, sup_{x∈A}‖x‖ < ∞ if and only if A is absorbed by every neighborhood of the origin of X. In other words, the notion of a bounded subset of a normed space is unambiguously defined.

Let A be a subset of a normed space. Suppose A\{0} ≠ ∅ and prove the following propositions.

(c) sup_{x∈A}‖x‖ < ∞ implies inf_{x∈A\{0}}‖x‖⁻¹ = (sup_{x∈A}‖x‖)⁻¹.

(d) inf_{x∈A}‖x‖ ≠ 0 implies sup_{x∈A\{0}}‖x‖⁻¹ = (inf_{x∈A}‖x‖)⁻¹.

Clearly, inf_{x∈A}‖x‖ ≠ 0 if and only if inf_{x∈A}‖x‖ > 0 (since ‖x‖ ≥ 0 for all x in any normed space). It is also clear that inf_{x∈A}‖x‖ ≤ sup_{x∈A}‖x‖, and inf_{x∈A}‖x‖ < ∞ even if A is unbounded. Show that

(e) inf_{x∈A}‖x‖ ≠ 0 if and only if sup_{x∈A\{0}}‖x‖⁻¹ < ∞.

A nonempty subset A of a normed space is bounded away from zero if inf_{x∈A}‖x‖ ≠ 0. Accordingly, a mapping F of a nonempty set S into a normed space X is bounded if and only if sup_{s∈S}‖F(s)‖ < ∞, and bounded away from zero if and only if inf_{s∈S}‖F(s)‖ ≠ 0. In particular, an X-valued sequence {xₙ} is bounded if and only if supₙ‖xₙ‖ < ∞, and bounded away from zero if and only if infₙ‖xₙ‖ ≠ 0.
Problem 4.6. This problem is entirely based on the triangle inequality. Consider the spaces (ℓ₊¹, ‖·‖₁) and (ℓ₊∞, ‖·‖∞) of Example 4B. Take x = {ξₖ}ₖ₌₁^∞ ∈ ℓ₊∞ and let {xₙ}ₙ₌₁^∞ be an ℓ₊¹-valued sequence (i.e., each xₙ = {ξₙ(k)}ₖ₌₁^∞ lies in ℓ₊¹). Recall: ℓ₊¹ ⊆ ℓ₊∞. Show that

(a) if ‖xₙ − x‖∞ → 0 and supₙ‖xₙ‖₁ < ∞, then x ∈ ℓ₊¹.

Hint: Σ_{k=1}^m |ξₖ| ≤ m‖xₙ − x‖∞ + supₙ‖xₙ‖₁ for each m ≥ 1.

Now suppose x ∈ ℓ₊¹ and show that

(b) if ‖xₙ − x‖∞ → 0 and ‖xₙ‖₁ → ‖x‖₁, then ‖xₙ − x‖₁ → 0.

Hint: If x = {ξₖ}ₖ₌₁^∞ and z = {ζₖ}ₖ₌₁^∞ are in ℓ₊¹ then, for each m ≥ 1,

    ‖z − x‖₁ ≤ |‖z‖₁ − ‖x‖₁| + 2m‖z − x‖∞ + 2Σ_{k=m+1}^∞ |ξₖ|.

(Note that ‖z‖₁ − ‖x‖₁ = Σ_{k=1}^∞ (|ζₖ| − |ξₖ|) and ||ζₖ| − |ξₖ|| ≤ |ζₖ − ξₖ|.) Prove the above auxiliary inequality and conclude: if x, y ∈ ℓ₊¹, then

    ‖y − x‖₁ ≤ |‖y‖₁ − ‖x‖₁| + 2m‖y − x‖∞ + 2Σ_{k=m+1}^∞ |ξₖ|

for each m ≥ 1. Show that, under the assumptions of (b),

    lim supₙ ‖xₙ − x‖₁ ≤ 2Σ_{k=m+1}^∞ |ξₖ|  for all  m ≥ 1.

Next suppose the sequence {xₙ}ₙ₌₁^∞ is dominated by a vector y in ℓ₊¹. That is, |ξₙ(k)| ≤ υₖ for each k ≥ 1 and all n ≥ 1, for some y = {υₖ}ₖ₌₁^∞ in ℓ₊¹. Equivalently, Σ_{k=1}^∞ supₙ|ξₙ(k)| < ∞. Show that

(c) if ‖xₙ − x‖∞ → 0 and Σ_{k=1}^∞ supₙ|ξₙ(k)| < ∞, then ‖xₙ − x‖₁ → 0.

Hint: Item (a) and the dominance condition ensure that x ∈ ℓ₊¹ (supₙ‖xₙ‖₁ ≤ ‖y‖₁ for some y ∈ ℓ₊¹). Moreover, for each m ≥ 1,

    ‖xₙ − x‖₁ ≤ m‖xₙ − x‖∞ + Σ_{k=m+1}^∞ supₙ|ξₙ(k)| + Σ_{k=m+1}^∞ |ξₖ|.

Extend these results to ℓ₊¹(X) and ℓ₊∞(X) as in Example 4F.
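The extra hypotheses in (b) and (c) are genuinely needed: sup-norm convergence alone does not give ℓ¹ convergence. The Python sketch below (our own numerical illustration, with hypothetical sequences) shows the classical counterexample, where xₙ has n entries equal to 1/n, and then a dominated case where ℓ¹ convergence does follow.

```python
def norm_inf(x):
    return max((abs(t) for t in x), default=0.0)

def norm_1(x):
    return sum(abs(t) for t in x)

# counterexample: x_n = (1/n, ..., 1/n) with n entries (zero elsewhere);
# ||x_n||_inf -> 0 yet ||x_n||_1 = 1 for every n, so x_n does not
# converge to 0 in l^1
for n in (1, 10, 100, 1000):
    xn = [1.0 / n] * n
    assert abs(norm_inf(xn) - 1.0 / n) < 1e-15   # -> 0
    assert abs(norm_1(xn) - 1.0) < 1e-9          # stays 1

# dominated case as in item (c): x_n(k) = (1 + 1/n) * 2^(-k) is dominated
# by v_k = 2^(-k+1), which is summable, and ||x_n - x||_1 -> 0 follows
x = [2.0 ** (-k) for k in range(1, 30)]
for n in (1, 10, 100):
    xn = [(1 + 1.0 / n) * t for t in x]
    diff = norm_1([a - b for a, b in zip(xn, x)])
    assert abs(diff - (1.0 / n) * norm_1(x)) < 1e-12
print("sup-norm convergence alone is not enough; dominance restores it")
```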
Problem 4.7. Let {xᵢ}ᵢ₌₁^∞ be a sequence in a normed space X and consider the sequence {yₙ}ₙ₌₁^∞ of partial sums of {xᵢ}ᵢ₌₁^∞; that is, yₙ = Σ_{i=1}^n xᵢ for each n ≥ 1. Prove that the following assertions are pairwise equivalent.

(a) {yₙ}ₙ₌₁^∞ is a Cauchy sequence in X.

(b) The real sequence {‖Σ_{i=n}^{n+k} xᵢ‖}ₙ₌₁^∞ converges to zero uniformly in k; that is,

    limₙ sup_{k≥0} ‖Σ_{i=n}^{n+k} xᵢ‖ = 0.

(c) For each real number ε > 0 there exists an integer n_ε ≥ 1 such that

    ‖Σ_{i=m}^n xᵢ‖ < ε  whenever  n_ε ≤ m ≤ n.
(Hint: Problem 3.51.) Observe that, according to item (b), xₙ → 0 in X whenever {Σ_{i=1}^n xᵢ}ₙ₌₁^∞ is a Cauchy sequence in X and, in particular, whenever the infinite series Σ_{i=1}^∞ xᵢ converges in X. That is, xₙ → 0 in X for every summable sequence {xᵢ}ᵢ₌₁^∞ in X. Now suppose X is a Banach space. As X is complete, then (by the very definition of completeness) each of the above equivalent assertions also is equivalent to the following one.

(d) The infinite series Σ_{i=1}^∞ xᵢ converges in X (i.e., the sequence {xᵢ}ᵢ₌₁^∞ is summable).

If X is a Banach space, then condition (c) (or condition (b)) is referred to as the Cauchy Criterion for convergent infinite series.

Problem 4.8. Proposition 4.4 says that every absolutely summable sequence in a Banach space is summable.

(a) Consider the ℓ₊²-valued sequence {xᵢ}ᵢ₌₁^∞ where, for each positive integer i, xᵢ = (1/i)eᵢ with eᵢ = {δᵢₖ}ₖ₌₁^∞ ∈ ℓ₊² (just one nonzero entry, equal to one, at the ith position). Show that {xᵢ}ᵢ₌₁^∞ is a summable sequence in the Banach space ℓ₊² but is not absolutely summable.

Problem 4.7 says that {xᵢ}ᵢ₌₁^∞ is a summable sequence in a Banach space if and only if limₙ sup_{k≥0}‖Σ_{i=n}^{n+k} xᵢ‖ = 0, which clearly implies that limₙ ‖Σ_{i=n}^{n+k} xᵢ‖ = 0 for every integer k ≥ 0.
(b) Give an example of a nonsummable sequence (or, equivalently, an example of a nonconvergent series) in a Banach space such that limₙ ‖Σ_{i=n}^{n+k} xᵢ‖ = 0 for every integer k ≥ 0. (Hint: ξᵢ = 1/i in ℝ.)

Problem 4.9. Prove the following propositions.
(a) If {xᵢ}ᵢ₌₀^∞ and {yᵢ}ᵢ₌₀^∞ are summable sequences in a normed space X, then {αxᵢ + βyᵢ}ᵢ₌₀^∞ = α{xᵢ}ᵢ₌₀^∞ + β{yᵢ}ᵢ₌₀^∞ is again a summable sequence in X, and

    Σ_{i=0}^∞ (αxᵢ + βyᵢ) = α Σ_{i=0}^∞ xᵢ + β Σ_{i=0}^∞ yᵢ,

for every pair of scalars α, β ∈ F. This shows that the collection of all X-valued summable sequences is a linear manifold of the linear space X^ℕ (see Example 2F), and hence is a linear space itself.

(b) If {xᵢ}ᵢ₌₀^∞ is a summable sequence in a normed space X, then

    Σ_{i=n}^∞ xᵢ → 0  in X as n → ∞.
Hint: Σ_{i=0}^m xᵢ = Σ_{i=0}^{n−1} xᵢ + Σ_{i=n}^m xᵢ for every 1 ≤ n ≤ m. Now use item (a) to verify that the series Σ_{i=0}^∞ x_{n+i} converges in X, and Σ_{i=0}^∞ xᵢ = Σ_{i=0}^{n−1} xᵢ + Σ_{i=n}^∞ xᵢ for every n ≥ 1. Thus the result follows by uniqueness of the limit.
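The gap between Problem 4.7's uniform-in-k Cauchy criterion and the fixed-k condition of Problem 4.8(b) can be seen numerically. The Python sketch below (our own illustration) uses the hinted example ξᵢ = 1/i in ℝ: every block of fixed length has small sum for large n, yet the partial sums are unbounded.

```python
import math

def block(n, k):
    # sum_{i=n}^{n+k} 1/i, a block of k+1 consecutive terms
    return sum(1.0 / i for i in range(n, n + k + 1))

# fixed k: block sums vanish as n grows
for k in (0, 1, 5, 50):
    assert block(10**6, k) < 1e-4

# but blocks whose length grows with n do not vanish:
# sum_{i=n}^{2n} 1/i tends to log 2, so the criterion fails uniformly in k
assert abs(block(10**6, 10**6) - math.log(2)) < 1e-3

# and the partial sums themselves are unbounded (the series diverges)
partial = sum(1.0 / i for i in range(1, 10**6))
assert partial > 13.0
print("blockwise smallness does not imply the Cauchy criterion")
```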
Problem 4.10. Let x = {xᵢ}ᵢ₌₀^∞ be a sequence of linearly independent vectors in a normed space X, and let A_x be the collection of all scalar-valued sequences a = {αᵢ}ᵢ₌₀^∞ for which the series Σ_{i=0}^∞ αᵢxᵢ converges in X (i.e., such that {αᵢxᵢ}ᵢ₌₀^∞ is a summable sequence in X). Show that

(a) A_x is a linear manifold of the linear space F^ℕ₀.

Recall: F = ℂ or F = ℝ, whether X is a complex or real linear space, respectively. Verify that the function ‖·‖: A_x → ℝ, defined by

(b) ‖a‖ = supₙ ‖Σ_{i=0}^n αᵢxᵢ‖_X

for every a = {αᵢ}ᵢ₌₀^∞ in A_x, is a norm on A_x. Take an arbitrary integer i ≥ 0. Show that

    |αᵢ| ‖xᵢ‖_X ≤ 2‖a‖

for every a = {αᵢ}ᵢ₌₀^∞ in A_x (hint: αᵢxᵢ = Σ_{j=0}^i αⱼxⱼ − Σ_{j=0}^{i−1} αⱼxⱼ for each i ≥ 1), and hence

(c) |αᵢ − βᵢ| ‖xᵢ‖_X ≤ 2‖a − b‖

for every a = {αᵢ}ᵢ₌₀^∞ and b = {βᵢ}ᵢ₌₀^∞ in A_x. (Reason: A_x is a linear space by item (a).)

Now let {aₖ}ₖ₌₀^∞ be a Cauchy sequence in the normed space A_x. That is, each aₖ = {aₖ(i)}ᵢ₌₀^∞ lies in A_x and the A_x-valued sequence {aₖ}ₖ₌₀^∞ is Cauchy in (A_x, ‖·‖), where ‖·‖ is the norm in (b). According to (c), |aₖ(i) − aₗ(i)| ≤ 2‖xᵢ‖_X⁻¹ ‖aₖ − aₗ‖ for each i ≥ 0 and every k, ℓ ≥ 0 (recall: {xᵢ}ᵢ₌₀^∞ is linearly independent, so that ‖xᵢ‖_X ≠ 0 for every i ≥ 0). Thus the scalar-valued sequence {aₖ(i)}ₖ₌₀^∞ is Cauchy in F, and so convergent in F, for each i ≥ 0. Set

    αᵢ = limₖ aₖ(i)

in F for each integer i ≥ 0 and consider the sequence a = {αᵢ}ᵢ₌₀^∞ in F^ℕ₀. Take an arbitrary ε > 0. Show that there exists an integer k_ε ≥ 0 such that

    ‖Σ_{i=m}^n (aₖ(i) − aₗ(i))xᵢ‖_X < 2ε
for every 0 ≤ m ≤ n, whenever k, ℓ ≥ k_ε (hint: Σ_{i=m}^n (aₖ(i) − aₗ(i))xᵢ = Σ_{i=0}^n (aₖ(i) − aₗ(i))xᵢ − Σ_{i=0}^{m−1} (aₖ(i) − aₗ(i))xᵢ and {aₖ}ₖ₌₀^∞ is a Cauchy sequence in A_x), and hence

(d) ‖Σ_{i=m}^n (aₖ(i) − αᵢ)xᵢ‖_X ≤ 2ε

for every pair of nonnegative integers m ≤ n, whenever k ≥ k_ε. (Hint: Note that ‖Σ_{i=m}^n (aₖ(i) − limₗ aₗ(i))xᵢ‖_X = limₗ ‖Σ_{i=m}^n (aₖ(i) − aₗ(i))xᵢ‖_X. Why?) Next prove the following claims.

(e) If X is a Banach space, then a ∈ A_x.

(Hint: (d) implies that the infinite series Σ_{i=0}^∞ (aₖ(i) − αᵢ)xᵢ converges in the Banach space X for every k ≥ k_ε (see Problem 4.7: Cauchy Criterion), and hence aₖ − a ∈ A_x for each k ≥ k_ε.)

(f) If a ∈ A_x, then aₖ → a in A_x.

(Hint: a ∈ A_x implies ‖aₖ − a‖ ≤ 2ε whenever k ≥ k_ε by setting m = 0 in (d).) Finally conclude from (e) and (f):

If (X, ‖·‖_X) is a Banach space, then (A_x, ‖·‖) is a Banach space.
Problem 4.11. A sequence {xᵢ}ᵢ₌₀^∞ of vectors in a normed space X is a Schauder basis for X if for each x in X there exists a unique (similarly indexed) sequence of scalars {αᵢ}ᵢ₌₀^∞ such that

    x = Σ_{i=0}^∞ αᵢxᵢ

(i.e., x = limₙ Σ_{i=0}^n αᵢxᵢ). The entries of the scalar-valued sequence {αᵢ}ᵢ₌₀^∞ are called the coefficients of x with respect to the Schauder basis {xᵢ}ᵢ₌₀^∞, and the (convergent) series Σ_{i=0}^∞ αᵢxᵢ is the expansion of x with respect to {xᵢ}ᵢ₌₀^∞. Prove the following assertions.

(a) Every Schauder basis for a normed space X is a sequence of linearly independent vectors in X that spans X. That is, if {xᵢ}ᵢ₌₀^∞ is a Schauder basis for X, then

(i) {xᵢ}ᵢ₌₀^∞ is linearly independent, and

(ii) ⋁{xᵢ}ᵢ₌₀^∞ = X.

(b) If a normed space has a Schauder basis, then it is separable.
Hint: Proposition 4.9(b).
Remark: An infinite sequence of linearly independent vectors exists only in an infinite-dimensional linear space. It is readily verified that finite-dimensional normed spaces are separable Banach spaces (see Problem 4.37 below) but, in this case, the purely algebraic notion of a (finite) Hamel basis is enough. Thus, when the concept of a Schauder basis is under discussion, only infinite-dimensional spaces are considered. Does every separable Banach space have a Schauder basis? This is a famous question, raised by Banach himself in the early thirties, that remained open for a long period. Each separable Banach space that ever came up in analysis during that period (and this includes all classical examples) had a Schauder basis. The surprising negative answer to that question was given by Enflo, who constructed in the early seventies a separable Banach space that has no Schauder basis. See also the remark in Problem 4.58 below.
Problem 4.12. As usual, for each integer i ≥ 0 let eᵢ be the scalar-valued sequence with just one nonzero entry (equal to one) at the ith position (i.e., eᵢ = {δᵢₖ}ₖ₌₀^∞ for every i ≥ 0). Consider the Banach spaces ℓ₊∞ and ℓ₊ᵖ for every p ≥ 1 as in Example 4B. Show that the sequence {eᵢ}ᵢ₌₀^∞ is a Schauder basis for each ℓ₊ᵖ, and verify that ℓ₊∞ has no Schauder basis. (Hint: Example 3Q.)

Problem 4.13. Let M be a subspace of a normed space X. If X is a Banach space, then M and X/M are both Banach spaces (Propositions 4.7 and 4.10). Conversely, if M and X/M are Banach spaces, then X is a Banach space. Indeed, suppose M and X/M are Banach spaces, and let {xₙ} be an arbitrary Cauchy sequence in X.
(a) Show that {[xₙ]} is a Cauchy sequence in X/M and conclude that {[xₙ]} converges in X/M to, say, [x] ∈ X/M.

(b) Take any x in [x]. For each n there exists zₙ in [xₙ − x] such that

    0 ≤ ‖zₙ‖_X ≤ ‖[xₙ − x]‖ + 1/n = ‖[xₙ] − [x]‖ + 1/n.

Prove the above assertion and conclude: zₙ → 0 in X.

Observe that zₙ − xₙ + x lies in M for each n. In fact, [zₙ] = [xₙ − x] (for zₙ ∈ [xₙ − x]), and so [zₙ − xₙ + x] = [zₙ] − [xₙ − x] = [0] = M.

(c) Set uₙ = zₙ − xₙ + x in M and show that {uₙ} is a Cauchy sequence in M. Thus conclude that {uₙ} converges in M (and hence in X) to, say, u ∈ M. Since xₙ = zₙ + x − uₙ for each integer n, it follows that {xₙ} converges in X to x − u ∈ X (for vector addition and scalar multiplication are continuous mappings; see Problem 4.1). Outcome: Every Cauchy sequence in X converges in X, which means that X is a Banach space.
Problem 4.14. Consider the linear space c₀ of all scalar-valued sequences that converge to zero, and let {aₖ}ₖ₌₁^∞ be a c₀-valued sequence. That is, each aₖ = {aₖ(n)}ₙ₌₁^∞ is a scalar-valued sequence that converges to zero:

    limₙ |aₖ(n)| = 0  for every k ≥ 1.

Suppose further that there exists a real number α > 0 such that |aₖ(n)| ≤ α for all k, n ≥ 1 or, equivalently, suppose

    supₖ supₙ |aₖ(n)| < ∞.

Take an arbitrary x = {ξₖ}ₖ₌₁^∞ in ℓ₊¹, so that

    Σ_{k=1}^∞ |ξₖ| < ∞.

Consider the above assumptions and prove the following proposition.

(a) limₙ supₖ |aₖ(n)||ξₖ| = 0  and  Σ_{k=1}^∞ supₙ |aₖ(n)||ξₖ| < ∞.

Hint: supₖ |aₖ(n)||ξₖ| ≤ max{max_{1≤k≤m} |aₖ(n)||ξₖ|, α sup_{k≥m+1} |ξₖ|} for every m, n ≥ 1, which implies

    lim supₙ supₖ |aₖ(n)||ξₖ| ≤ α Σ_{k=m+1}^∞ |ξₖ|  for every  m ≥ 1.

Next use the dominated convergence of Problem 4.6(c) to show that

(b) limₙ Σ_{k=1}^∞ |aₖ(n)||ξₖ| = 0.

Now conclude: For each integer n ≥ 1 the infinite series Σ_{k=1}^∞ aₖ(n)ξₖ converges (in the Banach space F) and the scalar-valued sequence {Σ_{k=1}^∞ aₖ(n)ξₖ}ₙ₌₁^∞ is an element of c₀. Therefore, every infinite matrix A = [aₖ(n)]ₖ,ₙ≥₁ whose rows aₖ satisfy the above assumptions represents a mapping of ℓ₊¹ into c₀. Equip ℓ₊¹ and c₀ with their usual norms (‖·‖₁ on ℓ₊¹ and ‖·‖∞ on c₀). Observe that the assumptions on A simply say that {aₖ}ₖ₌₁^∞ is a bounded (i.e., supₖ ‖aₖ‖∞ < ∞) c₀-valued sequence. Show that, under these assumptions, such a mapping in fact is a bounded linear transformation of ℓ₊¹ into c₀:

    A ∈ B[ℓ₊¹, c₀]  and  ‖A‖ = supₖ ‖aₖ‖∞.
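The norm identity ‖A‖ = supₖ ‖aₖ‖∞ reflects the fact that on the ℓ¹ unit ball the supremum of ‖Ax‖∞ is attained at coordinate vectors. The Python sketch below is our own finite-dimensional illustration (a hypothetical 4×5 matrix with random entries, truncating the infinite matrix of the problem): it samples the bound ‖Ax‖∞ ≤ (supₖ‖aₖ‖∞)‖x‖₁ and checks that a coordinate vector attains it.

```python
import random

random.seed(2)
K, N = 4, 5
# rows a_k = {a_k(n)} of a finite matrix with |a_k(n)| <= 1
a = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(K)]

def apply_A(x):
    # (Ax)(n) = sum_k a_k(n) xi_k  for x = {xi_k} with K entries
    return [sum(a[k][n] * x[k] for k in range(K)) for n in range(N)]

def norm_inf(v):
    return max(abs(t) for t in v)

entry_sup = max(norm_inf(row) for row in a)      # sup_k ||a_k||_inf

# ||Ax||_inf <= (sup_k ||a_k||_inf) ||x||_1 on random samples ...
for _ in range(200):
    x = [random.uniform(-1, 1) for _ in range(K)]
    assert norm_inf(apply_A(x)) <= entry_sup * sum(map(abs, x)) + 1e-12

# ... and the bound is attained at a coordinate vector e_k
best = max(norm_inf(apply_A([float(i == k) for i in range(K)]))
           for k in range(K))
assert abs(best - entry_sup) < 1e-12
print("||A|| = sup_k ||a_k||_inf =", round(entry_sup, 4))
```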
Problem 4.15. If {xₖ}ₖ₌₀^∞ is a summable sequence in a normed space X and T is a continuous linear transformation of X into a normed space Y (i.e., if T ∈ B[X, Y]), then show that {Txₖ}ₖ₌₀^∞ is a summable sequence in Y and

    Σ_{k=0}^∞ Txₖ = T(Σ_{k=0}^∞ xₖ).
Problem 4.16. Let (X₁, ‖·‖₁) and (X₂, ‖·‖₂) be normed spaces and consider the normed space X₁ ⊕ X₂ equipped with any of the norms of Example 4E, which we shall denote simply by ‖·‖. That is, for every (x₁, x₂) in X₁ ⊕ X₂, either ‖(x₁, x₂)‖ᵖ = ‖x₁‖₁ᵖ + ‖x₂‖₂ᵖ for some p ≥ 1 or ‖(x₁, x₂)‖ = max{‖x₁‖₁, ‖x₂‖₂}. Now let T₁ and T₂ be operators on X₁ and X₂, respectively (i.e., T₁ ∈ B[X₁] and T₂ ∈ B[X₂]). Consider the direct sum T ∈ L[X₁ ⊕ X₂] of T₁ and T₂ (defined in Section 2.9):

    T = T₁ ⊕ T₂,

where T₁ = T|X₁ and T₂ = T|X₂ are the direct summands of T.

(a) Show that T ∈ B[X₁ ⊕ X₂] and ‖T‖ = max{‖T₁‖, ‖T₂‖}.

Hint: Whatever is the norm ‖·‖ that equips X₁ ⊕ X₂ (among those of Example 4E),

    ‖T(x₁, x₂)‖ ≤ max{‖T₁‖, ‖T₂‖} ‖(x₁, x₂)‖  for  (x₁, x₂) ∈ X₁ ⊕ X₂.

Generalize to a countable direct sum. That is, let {Xₖ} be an indexed countable family of normed spaces and consider the normed space ⊕ₖXₖ of Examples 4E or 4F (equipped with any of those norms, so that either ⊕ₖXₖ = [⊕ₖXₖ]ₚ or ⊕ₖXₖ = [⊕ₖXₖ]∞ in case of a countably infinite family as in Example 4F). Let {Tₖ} be a similarly indexed countable family of operators on Xₖ (each Tₖ lying in B[Xₖ]) such that supₖ ‖Tₖ‖ < ∞. Set

    T{xₖ} = {Tₖxₖ}  for each  {xₖ} ∈ ⊕ₖXₖ.

(b) Show that this actually defines a bounded linear transformation T of ⊕ₖXₖ into itself. Such an operator is usually denoted by

    T = ⊕ₖTₖ  in  B[⊕ₖXₖ],

and referred to as the direct sum of {Tₖ}. Moreover, verify that Tₖ = T|Xₖ (in the sense of Section 2.9) for each k. These are the direct summands of T. Finally, show that

    ‖T‖ = supₖ ‖Tₖ‖.
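The identity ‖T₁ ⊕ T₂‖ = max{‖T₁‖, ‖T₂‖} can be checked concretely. In the Python sketch below (our own example) each Xᵢ is ℝ² with the sup norm, so the induced operator norm of a matrix is its maximum absolute row sum; the direct sum is the block-diagonal matrix acting on ℝ⁴ = ℝ² ⊕ ℝ² with the max norm.

```python
def op_norm_inf(M):
    # operator norm induced by the sup norm: maximum absolute row sum
    return max(sum(abs(t) for t in row) for row in M)

T1 = [[1.0, 2.0], [0.0, 3.0]]       # ||T1|| = 3
T2 = [[0.5, 0.5], [4.0, 1.0]]       # ||T2|| = 5

# block-diagonal direct sum T = T1 (+) T2 on R^2 (+) R^2
T = [[1.0, 2.0, 0.0, 0.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 4.0, 1.0]]

assert op_norm_inf(T1) == 3.0 and op_norm_inf(T2) == 5.0
assert op_norm_inf(T) == max(op_norm_inf(T1), op_norm_inf(T2)) == 5.0
print("||T1 (+) T2|| =", op_norm_inf(T))
```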
If Xₖ = X for a single normed space X, then T = ⊕ₖTₖ is sometimes referred to as a block diagonal operator acting on ℓ₊ᵖ(X) or on ℓ₊∞(X).

Problem 4.17. Let Xᵢ and Yᵢ be normed spaces for i = 1, 2 and consider the normed spaces X₁ ⊕ X₂ and Y₁ ⊕ Y₂ equipped with any of the norms of Example 4E.
(a) Take Tᵢⱼ in B[Xⱼ, Yᵢ] for i, j = 1, 2 so that T₁₁x₁ + T₁₂x₂ lies in Y₁ and T₂₁x₁ + T₂₂x₂ lies in Y₂ for every (x₁, x₂) in X₁ ⊕ X₂. Set

    T(x₁, x₂) = (T₁₁x₁ + T₁₂x₂, T₂₁x₁ + T₂₂x₂)  in  Y₁ ⊕ Y₂.

Show that this defines a mapping T: X₁ ⊕ X₂ → Y₁ ⊕ Y₂ which in fact lies in B[X₁ ⊕ X₂, Y₁ ⊕ Y₂], and

    max_{i,j=1,2} ‖Tᵢⱼ‖ ≤ ‖T‖ ≤ 4 max_{i,j=1,2} ‖Tᵢⱼ‖.

(b) Conversely, suppose T ∈ B[X₁ ⊕ X₂, Y₁ ⊕ Y₂]. If x₁ is an arbitrary vector in X₁, then

    T(x₁, 0) = (T₁₁x₁, T₂₁x₁)  in  Y₁ ⊕ Y₂,

where T₁₁ is a mapping of X₁ into Y₁ and T₂₁ is a mapping of X₁ into Y₂. Similarly, if x₂ is any vector in X₂, then

    T(0, x₂) = (T₁₂x₂, T₂₂x₂)  in  Y₁ ⊕ Y₂,

where T₁₂ is a mapping of X₂ into Y₁ and T₂₂ is a mapping of X₂ into Y₂. Show that Tᵢⱼ ∈ B[Xⱼ, Yᵢ] and ‖Tᵢⱼ‖ ≤ ‖T‖ for every i = 1, 2 and j = 1, 2.

Consider the bounded linear transformation T ∈ B[X₁ ⊕ X₂, Y₁ ⊕ Y₂] of item (b). Since T(x₁, x₂) = T(x₁, 0) ⊕ T(0, x₂) in Y₁ ⊕ Y₂, it follows that T(x₁, x₂) = (T₁₁x₁ + T₁₂x₂, T₂₁x₁ + T₂₂x₂) for every (x₁, x₂) in X₁ ⊕ X₂ as in item (a). This establishes a one-to-one correspondence between each T in B[X₁ ⊕ X₂, Y₁ ⊕ Y₂] and the 2×2 matrix of bounded linear transformations [Tᵢⱼ], called the operator matrix for T, which we shall represent by the same symbol T (instead of, for instance, [T]) and write

    T = [T₁₁  T₁₂]
        [T₂₁  T₂₂].

Note that, if Yᵢ = Xᵢ for i = 1, 2, then T is the direct sum T₁₁ ⊕ T₂₂ in B[X₁ ⊕ X₂] of the previous problem if and only if T₁₂ = 0 in B[X₂, Y₁] and T₂₁ = 0 in B[X₁, Y₂].
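The two-sided norm bound for operator matrices can be sampled numerically. The Python sketch below (our own illustration) takes the scalar case X₁ = X₂ = Y₁ = Y₂ = ℝ with the max norm on ℝ ⊕ ℝ, where the induced norm of a 2×2 matrix is its maximum absolute row sum; each entry Tᵢⱼ is a scalar with ‖Tᵢⱼ‖ = |Tᵢⱼ|.

```python
import random

def op_norm(M):
    # norm induced by the max norm: maximum absolute row sum
    return max(sum(abs(t) for t in row) for row in M)

random.seed(3)
for _ in range(100):
    T = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(2)]
    m = max(abs(T[i][j]) for i in range(2) for j in range(2))  # max ||T_ij||
    assert m <= op_norm(T) <= 4 * m + 1e-12
print("max ||T_ij|| <= ||T|| <= 4 max ||T_ij|| verified on samples")
```

In this particular norm the upper bound 2·max already holds; the constant 4 in the problem covers all the norms of Example 4E at once.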
Problem 4.18. Let T be an operator on a normed space X (i.e., T ∈ B[X]). Recall that a subset A of X is T-invariant (or A is an invariant subset for T) if T(A) ⊆ A (i.e., Tx ∈ A whenever x ∈ A). If M is a linear manifold (or a subspace) of X and, as a subset of X, is T-invariant, then we say that M is an invariant linear manifold (or an invariant subspace) for T. Prove the following assertions.
(a) If A is T-invariant, then A⁻ is T-invariant.

(b) If M is an invariant linear manifold for T, then M⁻ is an invariant subspace for T.

(c) {0} and X are invariant subspaces for every T in B[X].

Problem 4.19. Let X be a normed space and let Lat(X) be the lattice of all subspaces of X. Recall that {0} and X are subspaces of X (so that they are elements of Lat(X)). These are the trivial elements of Lat(X): a subspace in Lat(X) is nontrivial if it is a proper nonzero subspace of X (i.e., M ∈ Lat(X) is nontrivial if {0} ≠ M ≠ X).

(a) Check that there exist nontrivial subspaces in Lat(X) if and only if the dimension of X is greater than 1 (i.e., Lat(X) ≠ {{0}, X} if and only if dim X > 1).

Let B[X] be the unital algebra of all operators on a normed space X and let T be an operator in B[X]. A nontrivial invariant subspace for T is a nontrivial element of Lat(X) which is invariant for T (i.e., a subspace M ∈ Lat(X) such that {0} ≠ M ≠ X and T(M) ⊆ M). An element of B[X] is a scalar operator if it is a multiple of the identity, say, αI for some scalar α.
(b) Verify that every subspace in Lat(X) is invariant for any scalar operator in B[X], and hence every scalar operator has a nontrivial invariant subspace whenever dim X > 1.

Problem 4.20. Let X be a normed space and take T ∈ B[X]. Prove the following propositions.

(a) N(T) and R(T)⁻ are invariant subspaces for T.

(b) If T has no nontrivial invariant subspace, then N(T) = {0} and R(T)⁻ = X.

Take S and T in B[X]. We say that S and T commute if ST = TS.

(c) Show that if S and T commute, then N(S), N(T), R(S)⁻ and R(T)⁻ are invariant subspaces for both S and T.
Problem 4.21. Let S ∈ B[X] and T ∈ B[X] be nonzero operators on a normed space X. Suppose ST = 0 and show that

(a) T(N(S)) ⊆ T(X) = R(T) ⊆ N(S),

(b) {0} ≠ N(S) ≠ X  and  {0} ≠ R(T)⁻ ≠ X,

(c) S(R(T)⁻) ⊆ S(R(T))⁻ ⊆ R(T)⁻.
Conclusion: If S ≠ 0, T ≠ 0 and ST = 0, then N(S) and R(T)⁻ are nontrivial invariant subspaces for both S and T.

Problem 4.22. Let X be a normed space and take T ∈ B[X]. Every nonzero polynomial p(T) of T (defined as in Problem 2.20) lies in B[X]. (Reason: B[X] is an algebra.) Show that

(a) N(p(T)) and R(p(T))⁻ are invariant subspaces for T.

Recall that an operator T in B[X] is nilpotent if Tⁿ = 0 for some positive integer n, and algebraic if p(T) = 0 for some nonzero polynomial p (see Problem 2.20 again).

(b) Show that every nilpotent operator in B[X] (with dim X > 1) has a nontrivial invariant subspace.

(c) Suppose X is a complex normed space and dim X > 1. Show that every algebraic operator in B[X] has a nontrivial invariant subspace. Hint: A polynomial (in one complex variable and with complex coefficients) of degree n > 1 is the product of a polynomial of degree n − 1 and a polynomial of degree 1.
Problem 4.23. Let Lat(T) denote the subcollection of Lat(X) made up of all invariant subspaces for T ∈ B[X], where X is a normed space. Clearly (see Problems 4.18 and 4.19), T has no nontrivial invariant subspace if and only if Lat(T) = {{0}, X}.

(a) Show that Lat(T) is a complete lattice in the inclusion ordering. Hint: Intersection and closure of sum of invariant subspaces are again invariant subspaces. See Section 4.3.

Take any operator T in B[X] and an arbitrary vector x in X. Consider the X-valued power sequence {Tⁿx}ₙ≥₀. The range of {Tⁿx}ₙ≥₀ is called the orbit of x under T.

(b) Show that the (linear) span of the orbit of x under T is the set of the images of all nonzero polynomials of T at x; that is,

    span{Tⁿx}ₙ≥₀ = {p(T)x ∈ X : p is a nonzero polynomial}.

Since span{Tⁿx}ₙ≥₀ is a linear manifold of X, it follows that its closure, (span{Tⁿx}ₙ≥₀)⁻ = ⋁{Tⁿx}ₙ≥₀, is a subspace of X (Proposition 4.8(b)). That is, ⋁{Tⁿx}ₙ≥₀ ∈ Lat(X).

(c) Show that ⋁{Tⁿx}ₙ≥₀ ∈ Lat(T).
These are the cyclic subspaces in Lat(T): M ∈ Lat(T) is cyclic for T if M = ⋁{Tⁿx}ₙ≥₀ for some x ∈ X. If ⋁{Tⁿx}ₙ≥₀ = X, then x is said to be a cyclic vector for T. We say that a linear manifold M of X is totally cyclic for T if every nonzero vector in M is cyclic for T.

(d) Verify that T has no nontrivial invariant subspace if and only if X is totally cyclic for T.
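In finite dimensions cyclicity is decidable: x is cyclic for T on ℝⁿ exactly when the Krylov matrix [x  Tx  ...  Tⁿ⁻¹x] is nonsingular. The Python sketch below is our own 3×3 illustration with a hypothetical companion-type matrix; it also shows that a scalar operator has no cyclic vector when dim X > 1, in line with Problem 4.19(b).

```python
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def det3(c0, c1, c2):
    # determinant of the 3x3 matrix with columns c0, c1, c2
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
          - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
          + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))

# a companion-type matrix: the orbit of e1 reaches every coordinate direction
T = [[0.0, 0.0, 2.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 3.0]]
x = [1.0, 0.0, 0.0]
orbit = [x, matvec(T, x), matvec(T, matvec(T, x))]
assert abs(det3(*orbit)) > 1e-12   # Krylov matrix nonsingular: x is cyclic

# for the identity the orbit of any x spans a single line, so no vector
# is cyclic for a scalar operator when dim X > 1
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [1.0, 2.0, 3.0]
assert abs(det3(y, matvec(I3, y), matvec(I3, matvec(I3, y)))) < 1e-12
print("e1 is cyclic for the companion matrix; nothing is cyclic for I")
```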
Problem 4.24. Let X and Y be normed spaces. A bounded linear transformation X ∈ B[X, Y] intertwines an operator T ∈ B[X] to an operator S ∈ B[Y] if

    XT = SX.

If there exists an X intertwining T to S, then we say T is intertwined to S. Suppose XT = SX. Show by induction that

(a) XTⁿ = SⁿX

for every positive integer n. Hence verify that

(b) Xp(T) = p(S)X

for every polynomial p. Now use Problem 4.23(b) to prove that

(c) X(span{Tⁿx}ₙ≥₀) = span{SⁿXx}ₙ≥₀

for each x ∈ X, and therefore (see Problem 3.46(a))

(d) X(⋁{Tⁿx}ₙ≥₀) ⊆ ⋁{SⁿXx}ₙ≥₀

for every x ∈ X. An operator T in B[X] is densely intertwined to an operator S in B[Y] if there exists a bounded linear transformation X in B[X, Y] with dense range intertwining T to S. If XT = SX and R(X)⁻ = Y, then show that

(e) ⋁{Tⁿx}ₙ≥₀ = X  implies  Y = ⋁{SⁿXx}ₙ≥₀.

Conclusion: Suppose T in B[X] is densely intertwined to S in B[Y] and let X in B[X, Y] be a transformation with dense range intertwining T to S. If x ∈ X is a cyclic vector for T, then Xx ∈ Y is a cyclic vector for S. Therefore, if a linear manifold M of X is totally cyclic for T, then the linear manifold X(M) of Y is totally cyclic for S.

Problem 4.25. Here is a sufficient condition for transferring nontrivial invariant subspaces from S to T whenever T is densely intertwined to S. Let X and Y be normed spaces and take T ∈ B[X], S ∈ B[Y] and X ∈ B[X, Y] such that

    XT = SX.

Prove the following assertions.
(a) If M ⊆ Y is an invariant subspace for S, then the inverse image of M under X, X⁻¹(M) ⊆ X, is an invariant subspace for T.

(b) If, in addition, M ≠ Y, R(X) ∩ M ≠ {0} and R(X)⁻ = Y, then {0} ≠ X⁻¹(M) ≠ X. Hint: Problems 1.2 and 2.11, and Theorem 3.23.

Conclusion: If T is densely intertwined to S, then the inverse image under the intertwining transformation X of a nontrivial invariant subspace M for S is a nontrivial invariant subspace for T, provided that the range of X is not (algebraically) disjoint with M. Show that the condition R(X) ∩ M ≠ {0} in (b) is not redundant. That is, if M is a subspace of Y, then show that

(c) {0} ≠ M ≠ Y and R(X)⁻ = Y does not imply R(X) ∩ M ≠ {0}.

However, if X is surjective, then the condition R(X) ∩ M ≠ {0} in (b) is trivially satisfied whenever M ≠ {0}. Actually, with the assumption XT = SX still in force, check the proposition below.

(d) If S has a nontrivial invariant subspace, and if R(X) = Y, then T has a nontrivial invariant subspace.
Problem 4.26. Let X be a normed space. The commutant of an operator T in B[X] is the set {T}′ consisting of all operators in B[X] that commute with T. That is,

    {T}′ = {C ∈ B[X] : CT = TC}.

In other words, the commutant of an operator is the set of all operators intertwining it to itself.

(a) Show that {T}′ is an operator algebra that contains the identity (i.e., {T}′ is a unital subalgebra of the normed algebra B[X]).

A linear manifold (or a subspace) of X is hyperinvariant for T ∈ B[X] if it is invariant for every C ∈ {T}′; that is, if it is an invariant linear manifold (or an invariant subspace) for every operator in B[X] that commutes with T. As T ∈ {T}′, every hyperinvariant linear manifold (subspace) for T obviously is an invariant linear manifold (subspace) for T. Take an arbitrary T ∈ B[X] and, for each x ∈ X, set

    T_x = {Cx ∈ X : C ∈ {T}′} = {y ∈ X : y = Cx for some C ∈ {T}′}.

It is clear that T_x is never empty (for instance, x ∈ T_x because I ∈ {T}′). In fact, 0 ∈ T_x for every x ∈ X, and T_x = {0} if and only if x = 0. Prove the following proposition.
(b) For each x ∈ X, T_x⁻ is a hyperinvariant subspace for T. Hint: As an algebra, {T}′ is a linear space. This implies that T_x is a linear manifold of X. If y = C₀x for some C₀ ∈ {T}′, then Cy = CC₀x ∈ T_x for every C ∈ {T}′ (i.e., T_x is hyperinvariant for T because {T}′ is an algebra). See Problem 4.18(b).
Problem 4.27. Let X and Y be any normed spaces. Take T ∈ B[X], S ∈ B[Y], X ∈ B[X, Y], and Y ∈ B[Y, X] such that

    XT = SX  and  YS = TY.

Show that if C ∈ B[X] commutes with T, then XCY commutes with S. That is (see Problem 4.26), show that

(a) XCY ∈ {S}′ for every C ∈ {T}′.

Now consider the subspace T_x⁻ of X that, according to Problem 4.26, is nonzero and hyperinvariant for T for every nonzero x in X. Under the above assumptions on T and S, prove the following propositions.

(b) Suppose M is a nontrivial hyperinvariant subspace for S. If R(X)⁻ = Y and N(Y) ∩ M = {0}, then Y(M) ≠ {0} and T_x⁻ ≠ X for every nonzero x in Y(M). Consequently, T_x⁻ is a nontrivial hyperinvariant subspace for T whenever x is a nonzero vector in Y(M).

Hint: Since M is hyperinvariant for S, it follows from (a) that M is invariant for XCY whenever C ∈ {T}′. Use this fact to show that X(T_x) ⊆ M for every x ∈ Y(M), and hence X(T_x⁻) ⊆ M⁻ = M (see Problem 3.46(a)). Now verify that T_x⁻ = X implies R(X)⁻ = X(X)⁻ = X(T_x⁻)⁻ ⊆ M. Therefore, if M ≠ Y and R(X)⁻ = Y, then T_x⁻ ≠ X for every vector x in Y(M). Next observe that if Y(M) = {0} (i.e., if M ⊆ N(Y)), then N(Y) ∩ M = M. Thus conclude: if M ≠ {0} and N(Y) ∩ M = {0}, then Y(M) ≠ {0}. Finally recall that T_x⁻ ≠ {0} for every x ≠ 0 in X, and hence {0} ≠ T_x⁻ ≠ X for every nonzero vector x in Y(M).

(c) If S has a nontrivial hyperinvariant subspace, and if R(X)⁻ = Y and N(Y) = {0}, then T has a nontrivial hyperinvariant subspace.
Problem 4.28. A bounded linear transformation X of a normed space X into a normed space Y is quasiinvertible (or a quasiaffinity) if it is injective and has a dense range (i.e., N(X) = {0} and R(X)⁻ = Y). An operator T ∈ B[X] is a quasiaffine transform of an operator S ∈ B[Y] if there exists a quasiinvertible transformation X ∈ B[X, Y] intertwining T to S. Two operators are quasisimilar if they are quasiaffine transforms of each other. In other words, T ∈ B[X] and S ∈ B[Y] are quasisimilar (notation: T ∼ S) if there exist X ∈ B[X, Y] and Y ∈ B[Y, X] such that

    N(X) = {0},  R(X)⁻ = Y,  XT = SX,
    N(Y) = {0},  R(Y)⁻ = X,  YS = TY.

(a) Show that quasisimilarity has the defining properties of an equivalence relation.

(b) If two operators are quasisimilar and if one of them has a nontrivial hyperinvariant subspace, then so does the other. Prove.
Problem 4.29. Let X and Y be normed spaces. Two operators T ∈ B[X] and S ∈ B[Y] are similar (notation: T ≈ S) if there exists an injective and surjective bounded linear transformation X of X onto Y, with a bounded inverse X⁻¹ of Y onto X, that intertwines T to S. That is, T ∈ B[X] and S ∈ B[Y] are similar if there exists X ∈ B[X, Y] such that N(X) = {0}, R(X) = Y, X⁻¹ ∈ B[Y, X] and XT = SX.

(a) Let T be an operator on X and let S be an operator on Y. If X is a bounded linear transformation of X onto Y with a bounded inverse X⁻¹ of Y onto X (which is always linear), then check that

    XT = SX ⟺ T = X⁻¹SX ⟺ S = XTX⁻¹ ⟺ X⁻¹S = TX⁻¹.

Now prove the following assertions.

(b) If T and S are similar, then they are quasisimilar.

(c) Similarity has the defining properties of an equivalence relation.

(d) If two operators are similar, and if one of them has a nontrivial invariant subspace, then so does the other. (Hint: Problem 4.25.)

Note that we are using the same terminology of Section 2.7, namely "similar", but now with a different meaning. The linear transformation X: X → Y in fact is a (linear) isomorphism, so that X and Y are isomorphic linear spaces, and hence the concept of similarity defined above implies the purely algebraic homonymous concept defined in Section 2.7. However, we are now imposing that all linear transformations involved be continuous (or, equivalently, bounded), viz., T, S, X and also the inverse X⁻¹ of X.
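The transfer of invariant subspaces along a similarity (item (d)) can be seen in a 2×2 example. The Python sketch below is our own illustration with hypothetical matrices: S is upper triangular, so M = span{e₁} is S-invariant, and T = X⁻¹SX then leaves X⁻¹(M) = span{X⁻¹e₁} invariant.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

S = [[2.0, 1.0], [0.0, 3.0]]          # upper triangular: span{e1} is S-invariant
X = [[1.0, 1.0], [1.0, 2.0]]          # det = 1, so X is invertible
Xinv = [[2.0, -1.0], [-1.0, 1.0]]
T = matmul(Xinv, matmul(S, X))        # T = X^(-1) S X

# intertwining: XT = SX
XT, SX = matmul(X, T), matmul(S, X)
assert all(abs(XT[i][j] - SX[i][j]) < 1e-12 for i in range(2) for j in range(2))

# v spans X^(-1)(span{e1}); Tv must stay a multiple of v
v = matvec(Xinv, [1.0, 0.0])
Tv = matvec(T, v)
ratio = Tv[0] / v[0]
assert abs(Tv[1] - ratio * v[1]) < 1e-12
print("T leaves X^(-1)(M) invariant; Tv =", ratio, "* v")
```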
Problem 4.30. Let {xᵢ}ᵢ₌₀^∞ be a Schauder basis for a (separable) Banach space X (see Problem 4.11), so that every x ∈ X has a unique expansion

    x = Σ_{i=0}^∞ αᵢ(x)xᵢ

with respect to the basis {xᵢ}ᵢ₌₀^∞. For each integer i ≥ 0 consider the functional φᵢ: X → F that assigns to each x ∈ X its unique coefficient αᵢ(x) in the above expansion: φᵢ(x) = αᵢ(x) for every x ∈ X. Show that φᵢ is a bounded linear functional (i.e., φᵢ ∈ B[X, F] for each i ≥ 0). In other words, each coefficient in a Schauder basis expansion for a vector x in a Banach space X is a bounded linear functional on X.

Hint: Let A_x be the Banach space defined in Problem 4.10. Consider the mapping Φ: A_x → X given by

    Φ(a) = Σ_{i=0}^∞ αᵢxᵢ

for every a = {αᵢ}ᵢ₌₀^∞ in A_x. Verify that Φ is linear, injective, surjective and bounded (actually, Φ is a contraction: ‖Φ(a)‖ ≤ ‖a‖ for every a ∈ A_x). Now apply Theorem 4.22 to conclude that Φ ∈ G[A_x, X] (i.e., Φ⁻¹ ∈ B[X, A_x]). For each integer i ≥ 0 consider the functional ψᵢ: A_x → F given by

    ψᵢ(a) = αᵢ

for every a = {αᵢ}ᵢ≥₀ ∈ A_x. Show that each ψᵢ is linear and bounded. Finally, observe that φᵢ = ψᵢ ∘ Φ⁻¹, so that the diagram linking Φ⁻¹, ψᵢ and φᵢ commutes.
Problem 4.31. Let X and Y be normed spaces (over the same scalar field) and let M be a linear manifold of X. Equip the direct sum of M and Y with any of the norms of Example 4E and consider the normed space M ⊕ Y. A linear transformation L : M → Y is called closed if its graph is closed in M ⊕ Y. Since a subspace simply means a closed linear manifold, and recalling that the graph of any linear transformation of M into Y is a linear manifold of the linear space M ⊕ Y, such a definition can be rewritten as follows. A linear transformation L : M → Y is
288
4. Banach Spaces
closed if its graph is a subspace of the normed space M ⊕ Y. Take an arbitrary L ∈ L[M, Y] and prove that the assertions below are equivalent.

(i) L is closed.

(ii) If {u_n} is an M-valued sequence that converges in X, and if its image {Lu_n} under L converges in Y, then lim u_n ∈ M and lim Lu_n = L lim u_n.

Symbolically, L is closed if and only if

u_n → u ∈ X and Lu_n → y ∈ Y, with each u_n ∈ M, imply u ∈ M and y = Lu.

Hint: Apply the Closed Set Theorem. Use the one-norm on M ⊕ Y (i.e., ‖(u, y)‖ = ‖u‖_X + ‖y‖_Y for every (u, y) ∈ M ⊕ Y).
Problem 4.32. Consider the setup of the previous problem and prove the following propositions.

(a) If L ∈ B[M, Y] and M is closed in X, then L is closed. Every bounded linear transformation defined on a subspace of a normed space is closed. In particular (set M = X), if L ∈ B[X, Y] then L is closed.

(b) If M and Y are Banach spaces and L ∈ L[M, Y] is closed, then L ∈ B[M, Y]. Every closed linear transformation between Banach spaces is bounded.

(c) If Y is a Banach space and L ∈ B[M, Y] is closed, then M is closed in X. Every closed and bounded linear transformation into a Banach space has a closed domain. Hint: Closed Graph Theorem and Closed Set Theorem.

Recall that continuity means convergence preservation in the sense of Theorem 3.7, and also that the notions of "bounded" and "continuous" coincide for a linear transformation between normed spaces (Theorem 4.14). Compare Corollary 3.8 with Problem 4.31 and prove the next proposition.

(d) If M and Y are Banach spaces, then L ∈ L[M, Y] is continuous if and only if it is closed.
Problem 4.33. Let X and Y be Banach spaces and let M be a linear manifold of X. Take L ∈ L[M, Y] and consider the following assertions.

(i) M is closed in X (so that M is a Banach space).

(ii) L is a closed linear transformation.

(iii) L is bounded (i.e., L is continuous).

According to Problem 4.32 these three assertions are related as follows: each pair of them implies the other.

(a) Exhibit a bounded linear transformation that is not closed.
Hint: ℓ+^1 is a dense linear manifold of (ℓ+^2, ‖·‖_2). Take the inclusion map of (ℓ+^1, ‖·‖_2) into (ℓ+^2, ‖·‖_2).
The classical example of a closed linear transformation that is not bounded is the differential mapping D : C¹[0, 1] → C[0, 1] defined in Problem 3.18. It is easy to show that C¹[0, 1], the set of all differentiable functions in C[0, 1] whose derivatives lie in C[0, 1], is a linear manifold of the Banach space C[0, 1] equipped with the sup-norm. It is also readily verified that D is linear. Moreover, according to Problem 3.18(a), D is not continuous (and hence unbounded). However, if {u_n} is a uniformly convergent sequence of continuously differentiable functions whose derivative sequence {Du_n} also converges uniformly, then lim Du_n = D(lim u_n). This is a standard result from advanced calculus. Thus D is closed by Problem 4.31.

(b) Give another example of a closed linear transformation that is not bounded.
Hint: X = Y = ℓ+^1, M = {x = {ξ_k}_{k=1}^∞ ∈ ℓ+^1 : Σ_{k=1}^∞ k|ξ_k| < ∞}, and D = diag({k}_{k=1}^∞) = diag(1, 2, 3, …) : M → ℓ+^1. Verify that M is a linear manifold of ℓ+^1. Use x_n = (1/n)(1, …, 1, 0, 0, 0, …) ∈ M (the first n entries are all equal to 1/n; the rest are zero) to show that D is not continuous (Corollary 3.8). Suppose u_n → u ∈ ℓ+^1, with each u_n in M, and Du_n → y = {υ(k)}_{k=1}^∞ ∈ ℓ+^1. Set ξ(k) = υ(k)/k, so that x = {ξ(k)}_{k=1}^∞ lies in M. Now show that ‖u_n − x‖_1 ≤ ‖Du_n − y‖_1, and hence u_n → x. Therefore, u = x ∈ M (uniqueness of the limit) and y = Du (for y = Dx). Apply Problem 4.31 to conclude that D is closed. Generalize to injective diagonals with unbounded entries.
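The first step of the hint can be seen numerically (a sketch; finite truncations stand in for sequences in ℓ+^1): each x_n = (1/n)(1, …, 1, 0, …) has ‖x_n‖_1 = 1, while ‖Dx_n‖_1 = (n + 1)/2 grows without bound, so no constant c with ‖Dx‖ ≤ c‖x‖ exists.

```python
import numpy as np

# Problem 4.33(b): D = diag(1, 2, 3, ...) is unbounded on its domain in l^1_+.
# Truncate to finite sections: x_n has first n entries 1/n, so ||x_n||_1 = 1,
# while ||D x_n||_1 = (1 + 2 + ... + n)/n = (n + 1)/2 grows without bound.
def ratio(n):
    x = np.full(n, 1.0 / n)          # ||x||_1 == 1
    Dx = np.arange(1, n + 1) * x     # apply diag(1, ..., n)
    return np.abs(Dx).sum() / np.abs(x).sum()

ratios = [ratio(n) for n in (1, 10, 100, 1000)]
# ratios grow like (n + 1)/2: 1.0, 5.5, 50.5, 500.5
```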
Problem 4.34. Let M and N be subspaces of a normed space X. If M and N are algebraic complements of each other (i.e., if M + N = X and M ∩ N = {0}), then we say that M and N are complementary subspaces in X. According to Theorem 2.14 the natural mapping Φ : M ⊕ N → M + N, defined by Φ((u, v)) = u + v for every (u, v) ∈ M ⊕ N, is an isomorphism between the linear spaces M ⊕ N and M + N whenever M ∩ N = {0}. Consider the direct sum M ⊕ N equipped with any of the norms of Example 4E and prove the following claim.

If M and N are complementary subspaces in a Banach space X, then the natural mapping Φ : M ⊕ N → M + N is a topological isomorphism.

Hint: Show that the isomorphism Φ is a contraction when M ⊕ N is equipped with the norm ‖·‖_1. Recall that M and N are Banach spaces (Proposition 4.7) and conclude that M ⊕ N is again a Banach space (Example 4E). Apply the Inverse Mapping Theorem to prove that Φ is a topological isomorphism when M ⊕ N is equipped with the norm ‖·‖_1. Also recall that the norms of Example 4E are equivalent (see the remarks that follow Proposition 4.26).
Problem 4.35. Prove the following propositions.

(a) If P : X → X is a continuous projection on a normed space X, then R(P) and N(P) are complementary subspaces in X.

Hint: Recall that R(P) = N(I − P). Apply Theorem 2.12 and Proposition 4.13.

(b) Conversely, if M and N are complementary subspaces in a Banach space X, then the unique projection P : X → X with R(P) = M and N(P) = N of Theorem 2.20 is continuous and ‖P‖ ≥ 1.

Hint: Consider the natural mapping Φ : M ⊕ N → M + N of the direct sum M ⊕ N (equipped with any of the norms of Example 4E) onto X = M + N. Let P_M : M ⊕ N → M ⊆ X be the map defined by P_M(u, v) = u for every (u, v) ∈ M ⊕ N. Recall from Example 4I that P_M is a contraction (indeed, ‖P_M‖ = 1). Apply the previous problem to verify that the diagram

X --Φ^{-1}--> M ⊕ N --P_M--> M ⊆ X, with P = P_M Φ^{-1},

commutes. Thus show that P is continuous (note: Pu = u for every u ∈ M = R(P)).

Remarks: P_M is, in fact, a continuous projection of M ⊕ N into itself whose range is R(P_M) = M ⊕ {0}. If we identify M ⊕ {0} with M (as we did in Example 4I), then P_M : M ⊕ N → M ⊕ {0} ⊆ M ⊕ N can be viewed as a map from M ⊕ N onto M, and hence we wrote P_M : M ⊕ N → M ⊆ X, the continuous natural projection of M ⊕ N onto M. It is also worth noticing that the above propositions hold for the complementary projection E = (I − P) : X → X as well, for N(E) = R(P) and R(E) = N(P).
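A small numerical sketch of (a)-(b) in the plane (the oblique projection below is an arbitrary choice, not from the text): an idempotent P splits every x as x = Px + (I − P)x with Px ∈ R(P) and (I − P)x ∈ N(P), and a nonorthogonal projection has norm strictly greater than one.

```python
import numpy as np

# Problem 4.35 sketch: for an idempotent P, every x decomposes uniquely as
# x = Px + (I - P)x with Px in R(P) and (I - P)x in N(P); moreover the
# (oblique) projection below has spectral norm sqrt(5) > 1.
P = np.array([[1.0, 2.0],
              [0.0, 0.0]])           # P^2 = P; R(P) = span{e1}, N(P) = span{(2, -1)}
I = np.eye(2)

idempotent = np.allclose(P @ P, P)
x = np.array([3.0, 5.0])
u, v = P @ x, (I - P) @ x            # u + v = x, Pu = u, Pv = 0
split_ok = (np.allclose(u + v, x) and np.allclose(P @ u, u)
            and np.allclose(P @ v, 0.0))
norm_P = np.linalg.norm(P, 2)        # induced 2-norm; here sqrt(5) > 1
```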
Problem 4.36. Consider a bounded linear transformation T ∈ B[X, Y] of a Banach space X into a Banach space Y. Let M be a complementary subspace of N(T) in X. That is, M is a subspace of X that is also an algebraic complement of the null space N(T) of T:

M = M⁻, X = M + N(T) and M ∩ N(T) = {0}.

Set T_M = T|_M : M → Y, the restriction of T to M, and verify the following propositions.

(a) T_M ∈ B[M, Y], R(T_M) = R(T) and N(T_M) = {0}.

Hint: Problems 2.14 and 3.30.

(b) R(T_M) = R(T_M)⁻ if and only if there exists T_M^{-1} ∈ B[R(T_M), M].

Hint: Proposition 4.7 and Corollary 4.24.

(c) If A ⊆ R(T) and T_M^{-1}(A)⁻ = M, then T^{-1}(A)⁻ = X.

Hint: Take an arbitrary x = u + v ∈ X = M + N(T), with u in M and v in N(T). Verify that there exists a T_M^{-1}(A)-valued sequence {u_n} that converges to u. Set x_n = u_n + v in X and show that {x_n} is a T^{-1}(A)-valued sequence that converges to x. Apply Proposition 3.32.

Now use the above results to prove the following assertion.

(d) If A ⊆ R(T) and A⁻ = R(T) = R(T)⁻, then T^{-1}(A)⁻ = X.

That is, the inverse image under T of a dense subset of the range of T is dense in X whenever X and Y are Banach spaces and T ∈ B[X, Y] has a closed range and a null space with a complementary subspace in X. This can be viewed as a converse to Problem 3.46(c).
Problem 4.37. Prove the following propositions.

(a) Every finite-dimensional normed space is a separable Banach space. Hint: Example 3P, Problem 3.48, and Corollaries 4.28 and 4.31.

(b) If X and Y are topologically isomorphic normed spaces, and if one of them is a (separable) Banach space, then so is the other. Hint: Theorems 3.44 and 4.14.

Problem 4.38. Let X and Y be normed spaces and take T ∈ L[X, Y]. If either X or Y is finite-dimensional, then T is of finite rank (Problems 2.6 and 2.17). R(T) is a subspace of Y whenever T is of finite rank (Corollary 4.29). If T is injective and of finite rank, then X is finite-dimensional (Theorem 2.8 and Problems 2.6 and 2.17). Use Problem 2.7 and Corollaries 4.24 and 4.28 to prove the following assertions.
(a) If Y is a Banach space and T ∈ B[X, Y] is of finite rank and injective, then T has a bounded inverse on its range.

(b) If X is finite-dimensional, then every injective operator in B[X] is invertible.

(c) If X is finite-dimensional and T ∈ L[X], then N(T) = {0} if and only if T ∈ G[X]. In other words, a linear transformation of a finite-dimensional normed space into itself is a topological isomorphism if and only if it is injective.

(d) If X is finite-dimensional, then every linear isometry of X into itself is an isometric isomorphism.
That is, every linear isometry of a finite-dimensional normed space into itself is surjective.

Problem 4.39. The previous problem says that nonsurjective isometries in B[X] may exist only if the normed space X is infinite-dimensional. Here is an example. Let (ℓ+, ‖·‖) denote either the normed space (ℓ+^p, ‖·‖_p) for some p ≥ 1 or (ℓ+^∞, ‖·‖_∞). Consider the mapping S+ : ℓ+ → ℓ+ defined as follows. For every x = {ξ_k}_{k=0}^∞ ∈ ℓ+,

S+x = {υ_k}_{k=0}^∞ with υ_0 = 0 and υ_k = ξ_{k−1} for k ≥ 1

(i.e., S+(ξ_0, ξ_1, ξ_2, …) = (0, ξ_0, ξ_1, ξ_2, …)), which is also represented by the infinite matrix

S+ =
[ 0          ]
[ 1  0       ]
[    1  0    ]
[       ⋱  ⋱ ]

where every entry below the main diagonal is equal to one and the remaining entries are all zero. This is the unilateral shift on ℓ+.
(a) Show that S+ is a linear nonsurjective isometry.

Since S+ is a linear isometry, it follows by Proposition 4.37 that ‖S+ⁿx‖ = ‖x‖ for every x ∈ ℓ+ and all n ≥ 1, and hence

S+ⁿ ∈ B[ℓ+] with ‖S+ⁿ‖ = 1 for all n ≥ 0.

Consider the backward unilateral shift S− of Example 4L, now acting either on ℓ+^p or on ℓ+^∞. Recall that S−ⁿ ∈ B[ℓ+] and ‖S−ⁿ‖ = 1 for all n ≥ 0 (this has been verified in Example 4L for (ℓ+, ‖·‖) = (ℓ+^p, ‖·‖_p), but the same argument ensures that it holds for (ℓ+, ‖·‖) = (ℓ+^∞, ‖·‖_∞) as well).
(b) Show that S−S+ = I : ℓ+ → ℓ+, the identity on ℓ+. Therefore, S− ∈ B[ℓ+] is a left inverse of S+ ∈ B[ℓ+].

(c) Conclude that S− is surjective but not injective.
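Finite N × N sections of the two shifts make (a)-(b) concrete (a sketch: the truncation stands in for the operators on ℓ+, so it is exact only on vectors supported away from the last slot).

```python
import numpy as np

# Finite sections of Problem 4.39: S+ has ones below the main diagonal,
# and the backward shift S- is its transpose.
N = 6
Sp = np.eye(N, k=-1)                  # S+ : (x0, x1, ...) -> (0, x0, x1, ...)
Sm = Sp.T                             # backward shift S-

x = np.array([1.0, -2.0, 3.0, 0.0, 0.0, 0.0])   # supported away from the last slot
isometry = np.isclose(np.linalg.norm(Sp @ x), np.linalg.norm(x))

# S- S+ = I on l^p_+; the finite section only loses the last coordinate:
SmSp = Sm @ Sp
left_inverse = np.allclose(SmSp[:N-1, :N-1], np.eye(N - 1)) and SmSp[N-1, N-1] == 0.0

first_coord_zero = np.allclose(Sp[0], 0.0)   # (S+ y)_0 = 0 always: S+ is not surjective
```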
Problem 4.40. Let (ℓ, ‖·‖) denote either the normed space (ℓ^p, ‖·‖_p) for some p ≥ 1 or (ℓ^∞, ‖·‖_∞). Consider the mapping S : ℓ → ℓ defined by

Sx = {ξ_{k−1}}_{k=−∞}^∞ for every x = {ξ_k}_{k=−∞}^∞ ∈ ℓ

(i.e., S(…, ξ_{−2}, ξ_{−1}, (ξ_0), ξ_1, ξ_2, …) = (…, ξ_{−3}, ξ_{−2}, (ξ_{−1}), ξ_0, ξ_1, …)), which is also represented by the following (doubly) infinite matrix (with the inner parenthesis indicating the zero-zero position):

S =
[ ⋱          ]
[ 1  0        ]
[    1 (0)    ]
[       1  0  ]
[          ⋱ ]

where every entry below the main diagonal is equal to one and the remaining entries are all zero. This is the bilateral shift on ℓ.

(a) Show that S is a linear surjective isometry. That is, S is an isometric isomorphism, and hence

Sⁿ ∈ G[ℓ] with ‖Sⁿ‖ = 1 for all n ≥ 0.

Its inverse S^{-1} is then again an isometric isomorphism, so that

S^{-1} ∈ G[ℓ] with ‖(S^{-1})ⁿ‖ = 1 for all n ≥ 0.

(b) Verify that the inverse S^{-1} of S is given by the formula

S^{-1}x = {ξ_{k+1}}_{k=−∞}^∞ for every x = {ξ_k}_{k=−∞}^∞ ∈ ℓ

(i.e., S^{-1}(…, ξ_{−2}, ξ_{−1}, (ξ_0), ξ_1, ξ_2, …) = (…, ξ_{−1}, ξ_0, (ξ_1), ξ_2, ξ_3, …)), which is also represented by a (doubly) infinite matrix

S^{-1} =
[ ⋱  1        ]
[    0  1      ]
[      (0) 1   ]
[          0 1 ]
[            ⋱]

where every entry above the main diagonal is equal to one and the remaining entries are all zero. This is called the backward bilateral shift on ℓ.
Problem 4.41. Prove the following propositions.

(a) Let W, X, Y and Z be normed spaces (over the same scalar field). If V ∈ B[X, Y] is an isometry, then

‖TV‖ = ‖T‖ and ‖VS‖ = ‖S‖

for every T ∈ B[Y, Z] and every S ∈ B[W, X]. Remark: This extends the following result: ‖Vⁿ‖ = 1 for every n ≥ 1 whenever V is an isometry in B[X] (Proposition 4.37(c)).

(b) Every linear isometry of a Banach space into a normed space has a closed range. Hint: Propositions 4.20 and 4.37.

Problem 4.42. Let X and Y be normed spaces. Verify that T in B[X] and S in B[Y] are similar (in the sense of Problem 4.29) if and only if there exists a topological isomorphism intertwining T to S; that is, if and only if there exists W in G[X, Y] such that

WT = SW.

Clearly, X and Y must be topologically isomorphic normed spaces if there are similar elements in B[X] and B[Y]. A stronger form of similarity is obtained when there exists an isometric isomorphism, say U in G[X, Y], intertwining T to S; that is,

UT = SU.

If this happens, then we say that T and S are isometrically equivalent (notation: T ≅ S). Again, X and Y must be isometrically isomorphic normed spaces if there are isometrically equivalent elements in B[X] and B[Y]. As in the case of similarity, show that isometric equivalence has the defining properties of an equivalence relation. An important difference between similarity and isometric equivalence is that isometric equivalence is norm-preserving: if T and S are isometrically equivalent, then ‖T‖ = ‖S‖. Prove this identity and show that it may fail if T and S are simply similar. Now let X and Y be Banach spaces. Show that, in this case, T in B[X] and S in B[Y] are similar if and only if there exists an injective and surjective bounded linear transformation in B[X, Y] intertwining T to S.
(a) X is separable (as a metric space). (b) There exists a countable subset of X that spans X.
(c) There exists a dense linear manifold M of X such that dim M ≤ ℵ₀.

Hint: Proposition 4.9.

(d) Moreover, show also that a completion X̂ of a separable normed space X is itself separable.
Problem 4.44. In many senses barreled spaces in a locally-convex-space setting play a role similar to Banach spaces in a normed-space setting. In fact, as we saw in Problem 4.4, a Banach space is barreled. Barreled spaces actually are the spaces where the Banach-Steinhaus Theorem holds in a locally-convex-space setting: Every pointwise bounded collection of continuous linear transformations of a barreled space into a locally convex space is equicontinuous. To see that this is exactly the locally-convex-space version of Theorem 4.43, we need the notion of equicontinuity in a locally convex space. Let X and Y be topological vector spaces. A subset Θ of L[X, Y] is equicontinuous if for each neighborhood N_Y of the origin of Y there exists a neighborhood N_X of the origin of X such that T(N_X) ⊆ N_Y for all T ∈ Θ.

(a) Show that if X and Y are normed spaces, then Θ ⊆ L[X, Y] is equicontinuous if and only if Θ ⊆ B[X, Y] and sup_{T∈Θ} ‖T‖ < ∞.

The notion of a bounded set in a topological vector space (and, in particular, in a locally convex space) was defined in Problem 4.2. Moreover, it was shown in Problem 4.5(b) that this in fact is the natural extension to topological vector spaces of the usual notion of a bounded set in a normed space.

(b) Show that the Banach-Steinhaus Theorem can be stated as follows: Every pointwise bounded collection of continuous linear transformations of a Banach space into a normed space is equicontinuous.
Problem 4.45. Let {T_n} be a sequence in B[X, Y], where X and Y are normed spaces. Prove the following results.

(a) If T_n →ˢ T for some T ∈ B[X, Y], then ‖T_n x‖ → ‖Tx‖ for every x ∈ X and ‖T‖ ≤ lim inf_n ‖T_n‖.

(b) If sup_n ‖T_n‖ < ∞ and {T_n a} is a Cauchy sequence in Y for every a in a dense subset A of X, then {T_n x} is a Cauchy sequence in Y for every x in X.

Hint: T_n x − T_m x = T_n x − T_n a_i + T_n a_i − T_m a_i + T_m a_i − T_m x.

(c) If there exists T ∈ B[X, Y] such that T_n a → Ta for every a in a dense subset A of X, and if sup_n ‖T_n‖ < ∞, then T_n →ˢ T.

Hint: (T_n − T)x = (T_n − T)(x − a_i) + (T_n − T)a_i.
(d) If X is a Banach space and {T_n x} is a Cauchy sequence for every x ∈ X, then sup_n ‖T_n‖ < ∞.

(e) If X and Y are Banach spaces and {T_n x} is a Cauchy sequence for every x ∈ X, then T_n →ˢ T for some T ∈ B[X, Y].
Problem 4.46. Let {T_n} be a sequence in B[X, Y] and let {S_n} be a sequence in B[Y, Z], where X, Y and Z are normed spaces. Suppose

T_n →ˢ T and S_n →ˢ S

for T ∈ B[X, Y] and S ∈ B[Y, Z]. Prove the following propositions.

(a) If sup_n ‖S_n‖ < ∞, then S_n T_n →ˢ ST.

(b) If Y is a Banach space, then S_n T_n →ˢ ST.

(c) If S_n →ᵘ S, then S_n T_n →ˢ ST.

(d) If S_n →ᵘ S and T_n →ᵘ T, then S_n T_n →ᵘ ST.

Finally, show that the addition of strongly (uniformly) convergent sequences of bounded linear transformations is again a strongly (uniformly) convergent sequence of bounded linear transformations whose strong (uniform) limit is the sum of the strong (uniform) limits of each summand.
Problem 4.47. Let X be a Banach space and let T be an operator in B[X]. If λ is any scalar such that ‖T‖ < |λ|, then (λI − T) is an invertible element of B[X] (i.e., (λI − T) ∈ G[X]) and the series Σ_{i=0}^∞ T^i/λ^{i+1} converges in B[X] to (λI − T)^{-1}. That is,

‖T‖ < |λ| implies (λI − T)^{-1} = Σ_{i=0}^∞ T^i/λ^{i+1} ∈ B[X].

This is a rather important result, known as the von Neumann expansion. The purpose of this problem is to prove it. Take T ∈ B[X] and 0 ≠ λ ∈ F arbitrary. Show by induction that, for each integer n ≥ 0,

(a) ‖(T/λ)ⁿ‖ ≤ (‖T‖/|λ|)ⁿ,

(b) (λI − T)(1/λ) Σ_{i=0}^n (T/λ)^i = (1/λ) Σ_{i=0}^n (T/λ)^i (λI − T) = I − (T/λ)^{n+1}.

From now on suppose ‖T‖ < |λ| and consider the power sequence {(T/λ)ⁿ}_{n=0}^∞ in B[X]. Use the result in (a) to show that

(c) {(T/λ)ⁿ}_{n=0}^∞ is absolutely summable.

Thus conclude that (cf. Problem 3.12)

(d) (T/λ)ⁿ →ᵘ O (i.e., T/λ is uniformly stable),

and also that (see Proposition 4.4)

(e) {(T/λ)ⁿ}_{n=0}^∞ is summable.

This means that the series Σ_{i=0}^∞ (T/λ)^i converges in B[X] or, equivalently, that there exists an operator in B[X], say Σ_{i=0}^∞ (T/λ)^i, such that

Σ_{i=0}^n (T/λ)^i →ᵘ Σ_{i=0}^∞ (T/λ)^i.

Apply the results in (b) and (d) to check the following convergences.

(f) (λI − T)(1/λ) Σ_{i=0}^n (T/λ)^i →ᵘ I and (1/λ) Σ_{i=0}^n (T/λ)^i (λI − T) →ᵘ I.

Now use (e) and (f) to show that

(g) (λI − T)(1/λ) Σ_{i=0}^∞ (T/λ)^i = (1/λ) Σ_{i=0}^∞ (T/λ)^i (λI − T) = I.

Hence (1/λ) Σ_{i=0}^∞ (T/λ)^i ∈ B[X] is the inverse of (λI − T) ∈ B[X] (Problem 1.7). Therefore, (λI − T) ∈ B[X] is an invertible element of B[X] (i.e., (λI − T) ∈ G[X]) whose inverse (λI − T)^{-1} is the (uniform) limit of the sequence {(1/λ) Σ_{i=0}^n (T/λ)^i}_{n=0}^∞; that is,

(λI − T)^{-1} = Σ_{i=0}^∞ T^i/λ^{i+1} ∈ B[X].

Finally, verify that

(h) ‖(λI − T)^{-1}‖ ≤ (1/|λ|) Σ_{i=0}^∞ (‖T‖/|λ|)^i = (|λ| − ‖T‖)^{-1}.

Remark: Exactly the same proof applies if, instead of B[X], we were working in an abstract unital Banach algebra A.
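The expansion and the bound in (h) can be checked numerically in finite dimensions (a sketch with an arbitrary 2 × 2 matrix satisfying ‖T‖ < |λ|):

```python
import numpy as np

# Problem 4.47 in finite dimensions: if ||T|| < |lambda| then
# (lambda I - T)^{-1} = sum_{i>=0} T^i / lambda^{i+1}, and
# ||(lambda I - T)^{-1}|| <= (|lambda| - ||T||)^{-1}.
T = np.array([[0.2, 0.3],
              [0.0, 0.1]])            # arbitrary example with small norm
lam = 1.0
norm_T = np.linalg.norm(T, 2)
assert norm_T < abs(lam)              # hypothesis of the von Neumann expansion

inv_direct = np.linalg.inv(lam * np.eye(2) - T)
partial = np.zeros((2, 2))
Ti = np.eye(2)
for i in range(60):                   # partial sums of sum_i T^i / lam^{i+1}
    partial += Ti / lam ** (i + 1)
    Ti = Ti @ T
series_matches = np.allclose(partial, inv_direct)
bound_holds = np.linalg.norm(inv_direct, 2) <= 1.0 / (abs(lam) - norm_T) + 1e-12
```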
Problem 4.48. If T is a strict contraction on a Banach space X, then

(I − T)^{-1} = Σ_{i=0}^∞ T^i ∈ B[X].

This is the special case of Problem 4.47 for λ = 1. Use it to prove the following assertions.

(a) Every operator in the open unit ball centered at the identity I of B[X] is invertible (i.e., if ‖I − S‖ < 1, then S ∈ G[X]).

(b) Let X and Y be Banach spaces. Centered at each T ∈ G[X, Y] there exists a nonempty open ball B_ε(T) ⊆ G[X, Y] such that sup_{S∈B_ε(T)} ‖S^{-1}‖ < ∞. In particular, G[X, Y] is open in B[X, Y].

Hint: Suppose ‖T − S‖ < ε < ‖T^{-1}‖^{-1}, so that

‖I_X − T^{-1}S‖ = ‖T^{-1}(T − S)‖ ≤ ‖T^{-1}‖ε < 1.

Thus T^{-1}S = I_X − (I_X − T^{-1}S) lies in G[X] by (a). Hence S = T T^{-1}S lies in G[X, Y] (Corollary 4.23). Moreover, ‖S^{-1}‖ = ‖S^{-1}T T^{-1}‖ ≤ ‖T^{-1}‖‖S^{-1}T‖. But

‖S^{-1}T‖ = ‖(T^{-1}S)^{-1}‖ = ‖[I_X − (I_X − T^{-1}S)]^{-1}‖ ≤ (1 − ‖I_X − T^{-1}S‖)^{-1}

(cf. Problem 4.47(h)). Conclude: ‖S^{-1}‖ ≤ ‖T^{-1}‖(1 − ‖T^{-1}‖ε)^{-1}.

(c) Inversion is a continuous mapping. That is, if X and Y are Banach spaces, then the map T ↦ T^{-1} of G[X, Y] into G[Y, X] is continuous.

Hint: T^{-1} − S^{-1} = T^{-1}(S − T)S^{-1}. If {T_n} in G[X, Y] converges in B[X, Y] to S ∈ G[X, Y], then sup_n ‖T_n^{-1}‖ < ∞ by item (b); use this and the identity above to conclude that T_n^{-1} → S^{-1}.
On the other hand, show that the collection of all contractions in B[X] is strongly closed in B[X], and hence (uniformly) closed in B[X]. Hint: |‖T_n x‖ − ‖Tx‖| ≤ ‖(T_n − T)x‖.
Problem 4.50. Show that the strong limit of a sequence of linear isometries is again a linear isometry. In other words, the collection of all isometries from B[X, Y]
is strongly closed, and hence uniformly closed (X and Y are, of course, normed spaces). Hint: Proposition 4.37 and Problem 4.45(a).

Problem 4.51. Take an arbitrary p ≥ 1 and consider the normed space ℓ+^p of Example 4B. Let {D_n} be a sequence of diagonal operators in B[ℓ+^p]. If {D_n} converges strongly to D ∈ B[ℓ+^p], then D is a diagonal operator. Prove.
Problem 4.52. Let {P_k}_{k=1}^∞ be a sequence of diagonal operators in B[ℓ+^p] such that, for each k ≥ 1,

P_k = diag(e_k) = diag(0, …, 0, 1, 0, 0, 0, …)

(the only nonzero entry is equal to one and lies at the kth position). Set

E_n = Σ_{k=1}^n P_k = diag(1, …, 1, 0, 0, 0, …) ∈ B[ℓ+^p]

for every integer n ≥ 1. Show that E_n →ˢ I, the identity in B[ℓ+^p], but {E_n}_{n=1}^∞ does not converge uniformly, because ‖E_n − I‖ = 1 for all n ≥ 1.
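The gap between strong and uniform convergence here can be seen numerically (a sketch; a long finite section of ℓ+^1 stands in for the infinite sequences): for a fixed x the truncation error ‖E_n x − x‖_1 tends to zero, while the basis vector e_{n+1} witnesses ‖E_n − I‖ = 1 for every n.

```python
import numpy as np

# Problem 4.52: E_n keeps the first n coordinates and kills the rest.
x = 1.0 / (np.arange(1, 2001) ** 2)      # x_k = 1/k^2, a (truncated) l^1 vector
tail = lambda n: np.abs(x[n:]).sum()     # ||E_n x - x||_1 on this section

strong = tail(10) > tail(100) > tail(1000) and tail(1000) < 1e-2

def witness(n, N=50):
    # (E_n - I) e_{n+1} = -e_{n+1}, so ||E_n - I|| >= 1 for every n.
    e = np.zeros(N); e[n] = 1.0          # e_{n+1} sits in slot n (0-indexed)
    En_e = np.where(np.arange(N) < n, e, 0.0)
    return np.abs(En_e - e).sum()        # = 1 for every n < N

no_uniform = all(witness(n) == 1.0 for n in (1, 5, 20))
```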
Problem 4.53. Let a = {α_k}_{k=0}^∞ be a scalar-valued sequence and consider a sequence {D_n}_{n=0}^∞ of diagonal mappings of the Banach space ℓ+^p into itself such that

D_n = diag(α_0, …, α_n, 0, 0, 0, …)

for each integer n ≥ 0, where the entries in the main diagonal are all null except perhaps for the first n + 1. It is clear that D_n ∈ B[ℓ+^p] for every n ≥ 0. (Reason: B_0[ℓ+^p] ⊆ B[ℓ+^p].) If a ∈ ℓ+^∞, then consider the diagonal operator D_a = diag({α_k}_{k=0}^∞) ∈ B[ℓ+^p] of Examples 4H and 4K. Now prove the following assertions.

(a) If sup_k |α_k| < ∞, then D_n →ˢ D_a.

Hint: ‖D_n x − D_a x‖_p^p ≤ (sup_k |α_k|)^p Σ_{k=n+1}^∞ |ξ_k|^p for x = {ξ_k}_{k=0}^∞ ∈ ℓ+^p.

(b) Conversely, if {D_n x}_{n=0}^∞ converges in ℓ+^p for every x ∈ ℓ+^p, then sup_k |α_k| < ∞, and hence D_n →ˢ D_a.

Hint: Proposition 3.39, Theorem 4.43 and Example 4H.

(c) If lim_k |α_k| = 0, then D_n →ᵘ D_a.

Hint: Verify that ‖(D_n − D_a)x‖_p^p ≤ (sup_{k>n} |α_k|)^p ‖x‖_p^p.

(d) Conversely, if D_n →ᵘ D_a, then lim_k |α_k| = 0.
Hint: Uniform convergence implies strong convergence. Apply (c). Compute (D_n − D_a)e_k for every k, n.
Problem 4.54. Take any α ∈ C such that α ≠ 1. Consider the operators A and P in B[C²] identified with the matrices

A = [  0    1  ]        P = (α − 1)^{-1} [ α  −1 ]
    [ −α  1+α ]                          [ α  −1 ]

(i.e., these matrices are the representations of A and P with respect to the canonical basis for C²).

(a) Show that PA = AP = P = P².

(b) Prove by induction that

A^{n+1} = αAⁿ + (1 − α)P,

and hence (see Problem 2.19 or supply another induction)

Aⁿ = αⁿ(I − P) + P,

for every integer n ≥ 0.

(c) Finally, show that

|α| < 1 implies Aⁿ →ᵘ P and ‖P‖ > 1,

where ‖·‖ denotes the norm on B[C²] induced by any of the norms ‖·‖_p (for p ≥ 1) or ‖·‖_∞ on the linear space C² as in Example 4A.

Hint: 1 < ‖Pe_2‖ ≤ ‖P‖ (cf. Problem 3.33).
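Since the matrices are partly damaged in this reproduction, the sketch below takes P as the rank-one idempotent above and A = αI + (1 − α)P, which is consistent with the identities in (a) and (b), and checks them numerically for the illustrative choice α = 1/2.

```python
import numpy as np

# Problem 4.54 numerically, with alpha = 1/2 (an illustrative choice).
# P = (alpha - 1)^{-1} [[alpha, -1], [alpha, -1]] is idempotent and
# A = alpha*I + (1 - alpha)*P, matching the identities in (a)-(b).
alpha = 0.5
P = (1.0 / (alpha - 1.0)) * np.array([[alpha, -1.0],
                                      [alpha, -1.0]])
A = alpha * np.eye(2) + (1.0 - alpha) * P

identities = (np.allclose(P @ A, P) and np.allclose(A @ P, P)
              and np.allclose(P @ P, P))                 # PA = AP = P = P^2
An = np.linalg.matrix_power(A, 50)
closed_form = np.allclose(An, alpha**50 * (np.eye(2) - P) + P)  # A^n = a^n(I-P)+P
converges = np.allclose(An, P)                           # |alpha| < 1 => A^n -> P
norm_P_big = np.linalg.norm(P, 2) > 1.0                  # here ||P||_2 = sqrt(10)
```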
Problem 4.55. Take T ∈ L[X], a linear transformation on a normed space X. Suppose the power sequence {Tⁿ} is pointwise convergent, which means that there exists P ∈ L[X] such that Tⁿx → Px in X for every x ∈ X. Show that

(a) PT^k = T^k P = P = P^k for every integer k ≥ 1,

(b) (T − P)ⁿ = Tⁿ − P for every integer n ≥ 1.

Now suppose T lies in B[X] and prove the following propositions.

(c) If Tⁿ →ᵘ P ∈ B[X], then P is a projection and (T − P)ⁿ →ᵘ O.

(d) If Tⁿx → Px for every x ∈ X, then P ∈ L[X] is a projection. If, in addition, X is a Banach space, then P is a continuous projection and (T − P) ∈ B[X] is strongly stable.
Problem 4.56. Let F : X → X be a mapping of a set X into itself. Recall that F is injective if and only if it has a left inverse F^{-1} : R(F) → X on its range. Thus, if F is injective and idempotent (i.e., F = F²), then F = F^{-1}FF = F^{-1}F = I, and hence

(a) the unique idempotent injective mapping is the identity.

This is a purely set-theoretic result (no algebra or topology is involved). Now let X be a metric space and recall that every isometry is injective. Therefore,

(b) the unique idempotent isometry is the identity.

In particular, if X is a normed space and F : X → X is a projection (i.e., an idempotent linear transformation) and an isometry, then F = I: the identity is the unique isometry that also is a projection. Show that

(c) the unique linear isometry that has a strongly convergent power sequence is the identity.

Hint: Problems 4.50 and 4.55.
Problem 4.57. Let {T_n} be a sequence of bounded linear transformations in B[Y, Z], where Y is a Banach space and Z is a normed space, and take T ∈ B[Y, Z]. Use Proposition 4.46 to show that, if M is a finite-dimensional subspace of Y, then

(a) T_n →ˢ T implies (T_n − T)|_M →ᵘ O.

Now let K be a compact linear transformation of a normed space X into Y (i.e., K ∈ B∞[X, Y]). Prove that

(b) T_n →ˢ T implies T_n K →ᵘ TK.

Hint: Take an arbitrary x ∈ X. Use Proposition 4.56 to show that for each ε > 0 there exists a finite-dimensional subspace R_ε of Y and a vector r_{ε,x} in R_ε such that

‖Kx − r_{ε,x}‖ ≤ 2ε‖x‖ and ‖r_{ε,x}‖ ≤ (2ε + ‖K‖)‖x‖.

Then verify that

‖(T_n − T)Kx‖ ≤ ‖T_n − T‖‖Kx − r_{ε,x}‖ + ‖(T_n − T)|_{R_ε}‖‖r_{ε,x}‖ ≤ (2ε‖T_n − T‖ + (2ε + ‖K‖)‖(T_n − T)|_{R_ε}‖)‖x‖.

Now apply the Banach-Steinhaus Theorem (Theorem 4.43) to ensure that sup_n ‖T_n − T‖ < ∞, and conclude from item (a): for every ε > 0,

lim sup_n ‖T_n K − TK‖ ≤ (2 sup_n ‖T_n − T‖)ε.
Problem 4.58. Prove the converse of Corollary 4.55 under the assumption that the Banach space Y has a Schauder basis. In other words, prove the following proposition.

If Y is a Banach space with a Schauder basis and X is a normed space, then every compact linear transformation in B∞[X, Y] is the uniform limit of a sequence of finite-rank linear transformations in B_0[X, Y].

Hint: Suppose Y is infinite-dimensional (otherwise the result is trivially verified) and has a Schauder basis. Take an arbitrary K in B∞[X, Y]. R(K)⁻ is a Banach space possessing a Schauder basis, say {y_i}_{i=0}^∞, so that every y ∈ R(K)⁻ has a unique expansion y = Σ_{i=0}^∞ α_i(y) y_i (Problem 4.11). For each nonnegative integer n consider the mapping E_n : R(K)⁻ → R(K)⁻ defined by E_n y = Σ_{i=0}^n α_i(y) y_i. Show that each E_n is bounded and linear (Problem 4.30), and also of finite rank (for R(E_n) ⊆ span{y_i}_{i=0}^n). Moreover, show that {E_n}_{n=0}^∞ converges strongly to the identity operator I on R(K)⁻ (Problem 4.9(b)). Use the previous problem to conclude that E_n K →ᵘ K. Finally, check that E_n K lies in B_0[X, Y] for each n.

Remark: Consider the remark in Problem 4.11. There we commented on the construction of a separable Banach space that has no Schauder basis. Actually, such a breakthrough was achieved by Enflo in 1973 when he exhibited a separable (and reflexive) Banach space X for which B_0[X] is not dense in B∞[X], so that there exist compact operators on X that are not the (uniform) limit of finite-rank operators (and hence the converse of Corollary 4.55 fails in general). Thus, according to the above proposition, such an X is a separable Banach space without a Schauder basis.

Problem 4.59. Recall that the concepts of strong and uniform convergence coincide in a finite-dimensional space (Proposition 4.46). Consider the Banach space ℓ+^p for any p ≥ 1 (which has a Schauder basis; see Problem 4.12). Exhibit a sequence of finite-rank (compact) operators on ℓ+^p that converges strongly to a finite-rank (compact) operator but is not uniformly convergent. Hint: Let P_k be the diagonal operator defined in Problem 4.52.
Problem 4.60. Let M be a subspace of an infinite-dimensional Banach space X. Show that an extension over X of a compact operator on M may not be compact. Hint: Let M and N be Banach spaces over the same scalar field. Suppose N is infinite-dimensional. Set X = M ⊕ N and consider the direct sum T = K ⊕ I in B[X], where K is a compact operator in B∞[M] and I is the identity operator in B[N], as in Problem 4.16.

Problem 4.61. Let X ≠ {0} and Y ≠ {0} be normed spaces over the same scalar field and let M be a proper subspace of X. Show that there exists O ≠ T in B[X, Y] such that M ⊆ N(T) (i.e., such that T(M) = {0}). Hint: Corollary 4.63.

Problem 4.62. Let X be a normed space. Since |‖x‖ − ‖y‖| ≤ ‖x − y‖ for every x, y ∈ X, it follows that the norm on X is a real-valued contraction that takes each
vector of X to its norm. Show that for each vector in X there exists a real-valued linear contraction on X that takes that vector to its norm.

Problem 4.63. The annihilator of a subset A of a normed space X is the following subset of the dual space X*:

A^⊥ = {f ∈ X* : A ⊆ N(f)}.

(a) Verify that ∅^⊥ = {0}^⊥ = X*, X^⊥ = {0}, B^⊥ ⊆ A^⊥ whenever A ⊆ B and, if A ≠ ∅, then

A^⊥ = {f ∈ X* : f(A) = {0}}.

(b) Show that A^⊥ is a subspace of X* for every subset A of X.

(c) Take an arbitrary subset A of X and show that

A⁻ ⊆ ∩_{f∈A^⊥} N(f).

Hint: If f ∈ A^⊥, then A ⊆ N(f). Thus conclude that A⁻ ⊆ N(f) for every f ∈ A^⊥ (Proposition 4.13).

Now let M be a linear manifold of X and prove the following assertions.

(d) ∩_{f∈M^⊥} N(f) ⊆ M⁻.

Hint: If x_0 ∈ X\M⁻, then there exists f ∈ M^⊥ such that f(x_0) = 1 (Corollary 4.63), and hence x_0 ∉ ∩_{f∈M^⊥} N(f).

(e) M⁻ = ∩_{f∈M^⊥} N(f).

(f) M⁻ = X if and only if M^⊥ = {0}.

Problem 4.64. Let {e_i}_{i=1}^n be a Hamel basis for a finite-dimensional normed space X. Verify the following propositions.

(a) For each i = 1, …, n there exists f_i ∈ X* such that f_i(e_j) = δ_ij for every j = 1, …, n.

Hint: Set f_i(x) = ξ_i for every x = Σ_{j=1}^n ξ_j e_j ∈ X.

(b) {f_i}_{i=1}^n is a Hamel basis for X*.

Hint: If f ∈ X*, then f = Σ_{i=1}^n f(e_i) f_i.
Now conclude that dim X = dim X* whenever dim X < oo.
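In coordinates the dual basis of Problem 4.64 is just a matrix inverse (a sketch with an arbitrary basis of R³, not taken from the text): if the basis vectors e_i are the columns of an invertible matrix B, then the rows of B^{-1} represent the functionals f_i, and any row functional f decomposes as f = Σ_i f(e_i) f_i.

```python
import numpy as np

# Problem 4.64 in coordinates: dual basis via matrix inversion.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])      # columns e_1, e_2, e_3: a basis of R^3
F = np.linalg.inv(B)                 # row i of F represents the functional f_i

biorthogonal = np.allclose(F @ B, np.eye(3))     # f_i(e_j) = delta_ij

f = np.array([2.0, -1.0, 3.0])       # an arbitrary functional, as a row vector
coeffs = f @ B                       # coeffs[i] = f(e_i)
decomposes = np.allclose(coeffs @ F, f)          # f = sum_i f(e_i) f_i
```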
Problem 4.65. Let J : Y → X be an isometric isomorphism of a normed space Y onto a normed space X, and consider the mapping J* : X* → Y* defined by J*f = f ∘ J for every f ∈ X*. Show that

(a) J*(X*) = Y*, so that J* : X* → Y* is surjective,

(b) J* : X* → Y* is linear, and

(c) ‖J*f‖ = ‖f‖ for every f ∈ X*. (Hint: Problem 4.41.)

Conclude: If X and Y are isometrically isomorphic normed spaces, then their duals X* and Y* are isometrically isomorphic too. That is,

X ≅ Y implies X* ≅ Y*.
Problem 4.66. Consider the normed space ℓ+^∞ equipped with its usual sup-norm and recall that c ⊆ ℓ+^∞ (Problem 3.59), where c denotes the convergent scalar-valued sequences. Let S− ∈ B[ℓ+^∞] be the backward unilateral shift on ℓ+^∞ (Example 4L and Problem 4.39). A bounded linear functional f : ℓ+^∞ → F is called a Banach limit if it satisfies the following conditions.

(i) ‖f‖ = 1,

(ii) f(x) = f(S−x) for every x ∈ ℓ+^∞,

(iii) If x = {ξ_k} lies in c, then f(x) = lim_k ξ_k,

(iv) If x = {ξ_k} in ℓ+^∞ is such that ξ_k ≥ 0 for all k, then f(x) ≥ 0.

Condition (iii) says that Banach limits extend to ℓ+^∞ the limit function on c (i.e., f is defined on ℓ+^∞ and its restriction to c, f|_c, assigns to each convergent sequence its own limit). The remaining conditions represent fundamental properties of a limit function. Conditions (i) and (ii) say that lim_k |ξ_k| ≤ sup_k |ξ_k| and lim_k ξ_k = lim_k ξ_{k+n} for every positive integer n, respectively, whenever {ξ_k} ∈ c. Condition (iv) says that f is order-preserving for real-valued sequences in ℓ+^∞ (i.e., if x = {ξ_k} and y = {υ_k} are real-valued sequences in ℓ+^∞, then f(x), f(y) ∈ R (why?) and f(x) ≤ f(y) whenever ξ_k ≤ υ_k for every k). The purpose of this problem is to show how the Hahn-Banach Theorem ensures the existence of Banach limits on ℓ+^∞.

(a) Suppose F = R (so that the sequences in ℓ+^∞ are all real-valued). Let e be the constant sequence in ℓ+^∞ whose entries are all ones (i.e., e = (1, 1, 1, …)) and set M = R(I − S−). Show that d(e, M) = 1.
Hint: Verify that d(e, M) ≤ 1 (for ||e||∞ = 1 and 0 ∈ M). Now take an arbitrary u = {υk} in M. If υk0 ≤ 0 for some integer k0, then show that 1 ≤ ||e − u||∞. But u ∈ R(I − S₋), and hence υk = ξk − ξk+1 for some x = {ξk} in ℓ+^∞. If υk > 0 for all k, then {ξk} is decreasing and bounded. Check that {ξk} converges in R (Problem 3.10), show that υk → 0 (Problem 3.51), and conclude that 1 ≤ ||e − u||∞ whenever υk > 0 for all k. Hence 1 ≤ ||e − u||∞ for every u ∈ M so that d(e, M) ≥ 1. Therefore (it does not matter whether M is closed or not), M⁻ is a subspace of ℓ+^∞ (Proposition 4.9(a)) and d(e, M⁻) = 1 (Problem 3.43(b)). Then, according to Corollary 4.63, there exists a bounded linear functional φ : ℓ+^∞ → R such that

φ(e) = 1,  φ(M⁻) = {0}  and  ||φ|| = 1.
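The lower bound d(e, M) ≥ 1 in the hint can be probed numerically. The sketch below is an illustration added here, not part of the problem: finitely supported sequences stand in for elements of ℓ+^∞, and since u = (I − S₋)x then vanishes beyond the support of x, some entry of e − u equals 1, forcing ||e − u||∞ ≥ 1.

```python
import random

# S_ is the backward shift: (S_ x)_k = x_{k+1}. For finitely supported x,
# u = (I - S_)x has u_k = x_k - x_{k+1}, which is 0 beyond the support of x,
# so the sup-norm of e - u is at least 1.

def dist_to_e(x, tail=5):
    n = len(x)
    xs = x + [0.0] * tail                  # pad: x is finitely supported
    u = [xs[k] - xs[k + 1] for k in range(n + tail - 1)]
    return max(abs(1.0 - uk) for uk in u)  # ||e - (I - S_)x||_inf (truncated)

random.seed(0)
for _ in range(100):
    x = [random.uniform(-10, 10) for _ in range(random.randint(1, 20))]
    assert dist_to_e(x) >= 1.0
print("||e - u||_inf >= 1 on all sampled u in R(I - S_)")
```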
(b) Show that φ(x) = φ(S₋ⁿx) for every x ∈ ℓ+^∞ and all n ≥ 1.

Hint: φ((I − S₋)x) = 0 because φ(M) = {0}. This leads to φ(x) = φ(S₋x) for every x in ℓ+^∞. Conclude the proof by induction.
(c) Show that φ satisfies condition (iii).

Hint: Take an arbitrary x = {ξk} in c+ so that ξk → ξ in R for some ξ ∈ R. Observe that |ξk+n − ξ| ≤ |ξk+n − ξn| + |ξn − ξ| for every pair of positive integers n, k. Now use Problem 3.51(a) to show that ||S₋ⁿx − ξe||∞ → 0. That is, S₋ⁿx → ξe in ℓ+^∞. Next verify that φ(x) = ξ φ(e).
(d) Show that φ satisfies condition (iv).

Hint: Take any nonzero x = {ξk} in ℓ+^∞ such that ξk ≥ 0 for all k and set x' = (||x||∞⁻¹)x = {ξ'k}. Verify that 0 ≤ ξ'k ≤ 1 for all k, and hence ||e − x'||∞ ≤ 1. Finally, show that 1 − φ(x') = φ(e − x') ≤ 1, and conclude: φ(x) ≥ 0. Thus, in the real case, φ : ℓ+^∞ → R is a Banach limit on ℓ+^∞.
(e) Now suppose F = C (so that complex-valued sequences are allowed in ℓ+^∞). For each x = x1 + i x2 in ℓ+^∞, where x1 and x2 are real-valued sequences in ℓ+^∞, set

f(x) = φ(x1) + i φ(x2).

Show that this defines a bounded linear functional f : ℓ+^∞ → C.

Hint: ||f|| ≤ 2.
(f) Verify that f satisfies conditions (ii), (iii) and (iv).

(g) Prove that ||f|| = 1.

Hint: Let ℓ0^∞ denote the set of all scalar-valued sequences that take on only a finite number of values (i.e., that have a finite range). Clearly, ℓ0^∞ ⊂ ℓ+^∞. If y ∈ ℓ0^∞ with ||y||∞ ≤ 1, then there exist a finite partition of N, say {Nj}j=1^m, and a finite set of scalars {αj}j=1^m with |αj| ≤ 1 for all j, such that y = Σj=1^m αj χNj. Here χNj is the characteristic function of Nj which, in fact, is an element of ℓ+^∞ (i.e., χNj = {χNj(k)}k∈N, where χNj(k) = 1 if k ∈ Nj and χNj(k) = 0 if k ∈ N\Nj). Verify that f(y) = Σj=1^m αj f(χNj) = Σj=1^m αj φ(χNj) and that Σj=1^m φ(χNj) = φ(e). Recall that φ satisfies condition (iv) and show that |f(y)| ≤ (supj |αj|) φ(e) ≤ 1.

Conclusion 1. If y ∈ ℓ0^∞ and ||y||∞ ≤ 1, then |f(y)| ≤ 1.

The closed unit ball B with center at the origin of C is compact. For each positive integer n take a finite (1/n)-net for B, say Bn ⊂ B. If x = {ξk} ∈ ℓ+^∞ is such that ||x||∞ ≤ 1, then ξk ∈ B for all k. Thus for each k there exists υn(k) ∈ Bn such that |υn(k) − ξk| < 1/n. This defines, for each n, a Bn-valued sequence yn = {υn(k)}k∈N such that ||yn − x||∞ ≤ 1/n, which in turn defines an ℓ0^∞-valued sequence {yn} with ||yn||∞ ≤ 1 for all n that converges in ℓ+^∞ to x.

Conclusion 2. Every x ∈ ℓ+^∞ with ||x||∞ ≤ 1 is the limit in ℓ+^∞ of an ℓ0^∞-valued sequence {yn} with supn ||yn||∞ ≤ 1.

Recall that f : ℓ+^∞ → C is continuous. Apply Conclusion 2 to show that f(yn) → f(x), and hence |f(yn)| → |f(x)|. Since |f(yn)| ≤ 1 for every n (by Conclusion 1), it follows that |f(x)| ≤ 1 for every x ∈ ℓ+^∞ with ||x||∞ ≤ 1. Therefore ||f|| ≤ 1. Finally, verify that ||f|| ≥ 1 (for f(e) = φ(e) = 1 and ||e||∞ = 1).
Thus, in the complex case, f : ℓ+^∞ → C is a Banach limit on ℓ+^∞.
Problem 4.67. Let X be a normed space. An X-valued sequence {xn} is said to be weakly convergent if there exists x ∈ X such that {f(xn)} converges in F to f(x) for every f ∈ X*. In this case we say that {xn} converges weakly to x ∈ X (notation: xn →w x) and x is said to be the weak limit of {xn}. Prove the following assertions.

(a) The weak limit of a weakly convergent sequence is unique.

Hint: f(x) = 0 for all f ∈ X* implies x = 0.

(b) {xn} converges weakly to x if and only if every subsequence of {xn} converges weakly to x.

Hint: Proposition 3.5.

(c) If {xn} converges in the norm topology to x, then it converges weakly to x (i.e., xn → x implies xn →w x).

Hint: |f(xn − x)| ≤ ||f|| ||xn − x||.
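The converse of (c) fails. A standard example, added here as an illustration and not part of the problem: in ℓ+^2 every bounded linear functional is induced by some y ∈ ℓ+^2 via f(x) = Σk ξk ῡk, so its values on the standard basis {en} form the sequence {ῡn}, which tends to 0; hence en →w 0, yet ||en||2 = 1 for all n. A truncated numerical sketch:

```python
import math

# e_n has a single 1 in slot n; pair it against a fixed real y in l^2_+
# that represents a bounded linear functional f(x) = sum_k x_k * y_k

N = 5000
y = [1.0 / k for k in range(1, N + 1)]        # y in l^2_+ (sum 1/k^2 < oo)

def e(n):
    v = [0.0] * N
    v[n - 1] = 1.0
    return v

def f(x):                                      # the functional induced by y
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

print([f(e(n)) for n in (1, 10, 1000)])        # f(e_n) = 1/n -> 0
print([norm2(e(n)) for n in (1, 10, 1000)])    # but ||e_n||_2 = 1 for all n
```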
(d) If {xn} converges weakly, then it is bounded in the norm topology (i.e., xn →w x implies supn ||xn|| < ∞).

Hint: For each x ∈ X there exists φx ∈ X** such that φx(f) = f(x) for every f ∈ X* and ||φx|| = ||x||. This is the natural embedding of X into X** (Theorem 4.66). Verify that supn |f(xn)| < ∞ for every f ∈ X* whenever xn →w x, and show that supn |φxn(f)| < ∞ for every f ∈ X*. Now use the Banach-Steinhaus Theorem (recall: X* is a Banach space).

Problem 4.68. Let X and Y be normed spaces. A B[X, Y]-valued sequence {Tn} converges weakly in B[X, Y] if {Tn x} converges weakly in Y for every x ∈ X; equivalently, if {f(Tn x)} converges in F for every f ∈ Y* and every x ∈ X.

(a) If {Tn} converges weakly in B[X, Y], then there exists a unique T ∈ L[X, Y] (called the weak limit of {Tn}) such that Tn x →w Tx in Y for every x ∈ X. Prove.

Hint: Problem 4.67(a) ensures the existence of a unique mapping T : X → Y, which is linear because every f in Y* is linear.

Notation: Tn →w T (or Tn − T →w O). If {Tn} does not converge weakly to T, then we write Tn ↛w T.

(b) If X is a Banach space and Tn →w T, then supn ||Tn|| < ∞ and T ∈ B[X, Y]. Prove.

Hint: Apply the Banach-Steinhaus Theorem and Problem 4.67(d) to prove boundedness for {Tn}. Show that |f(Tx)| ≤ ||f|| (supn ||Tn||) ||x|| for every x ∈ X and f ∈ Y*. Conclude: ||Tx|| = sup{f ∈ Y* : ||f|| = 1} |f(Tx)| ≤ supn ||Tn|| ||x|| for every x ∈ X.

(c) Show that Tn →s T implies Tn →w T.

Hint: |f((Tn − T)x)| ≤ ||f|| ||(Tn − T)x||.

Take T ∈ B[X] and consider the power sequence {Tⁿ} in the normed algebra B[X]. The operator T is weakly stable if Tⁿ →w O.
(d) Verify that strong stability implies weak stability,

Tⁿ →s O implies Tⁿ →w O,

which in turn implies power boundedness if X is a Banach space:

Tⁿ →w O implies supn ||Tⁿ|| < ∞ if X is a Banach space.
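A concrete instance of (d), added here as an illustration in the Hilbert-space setting of ℓ+^2 rather than taken from the text: the backward shift S₋ satisfies ||S₋ⁿx||2 → 0 for each x (strong, hence weak, stability), while ||S₋ⁿ|| = 1 for every n (power boundedness without uniform stability). Numerically, ||S₋ⁿx||2 is just the tail norm of x:

```python
import math

# for x in l^2_+, S_^n x drops the first n entries, so ||S_^n x||_2 is the
# tail norm sqrt(sum_{k>n} |x_k|^2); it tends to 0 although ||S_^n|| = 1

x = [1.0 / k for k in range(1, 5001)]           # a (truncated) element of l^2_+

def tail_norm(x, n):                             # ||S_^n x||_2
    return math.sqrt(sum(t * t for t in x[n:]))

print([round(tail_norm(x, n), 4) for n in (0, 10, 100, 1000)])
```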
Problem 4.69. Let X and Y be normed spaces. Prove the following propositions.

(a) If T ∈ B[X, Y], then Txn →w Tx in Y whenever xn →w x in X. That is, a continuous linear transformation of X into Y takes weakly convergent sequences in X into weakly convergent sequences in Y.

Hint: If f ∈ Y*, then f ∘ T ∈ X*.

(b) If T ∈ B∞[X, Y], then ||Txn − Tx|| → 0 whenever xn →w x in X. That is, a compact linear transformation of X into Y takes weakly convergent sequences in X into convergent sequences in Y.

Hint: Let xn →w x in X and take T ∈ B∞[X, Y]. Use Theorem 4.49, part (a) of this problem, and Problem 4.67(d) to show that

(b1) Txn →w Tx in Y  and  supn ||xn|| < ∞.

Suppose {Txn} does not converge (in the norm topology of Y) to Tx. Use Proposition 3.5 to show that {Txn} has a subsequence, say {Txnk}, that does not converge to Tx. Thus conclude: there exist ε0 > 0 and a positive integer kε0 such that

(b2) ||T(xnk − x)|| ≥ ε0 for every k ≥ kε0.

Verify from (b1) that supk ||xnk|| < ∞. Apply Theorem 4.52 to show that {Txnk} has a subsequence, say {Txnkj}, that converges in the norm topology of Y. Now use the weak convergence in (b1) and Problem 4.67(b) to show that {Txnkj} in fact converges to Tx (i.e., Txnkj → Tx in Y). Therefore, for each ε > 0 there exists a positive integer jε such that

(b3) ||T(xnkj − x)|| < ε for every j ≥ jε.

Finally, verify that (b3) contradicts (b2) and conclude that {Txn} must converge in Y to Tx.
Problem 4.70. Let X be a normed space. An X*-valued sequence {fn} is weakly convergent if there exists f ∈ X* such that {φ(fn)} converges in F to φ(f) for every φ ∈ X** (cf. Problem 4.67). In this case we write fn →w f in X*. An X*-valued sequence {fn} is weakly* convergent if there exists f ∈ X* such that {fn(x)} converges in F to f(x) for every x ∈ X (notation: fn →w* f). Thus weak* convergence in X* means pointwise convergence of B[X, F]-valued sequences to an element of B[X, F].

(a) Show that weak convergence in X* implies weak* convergence in X* (i.e., fn →w f implies fn →w* f).
Hint: According to the natural embedding of X into X** (Theorem 4.66), for each x ∈ X there exists φx ∈ X** such that φx(f) = f(x) for every f ∈ X*. Now verify that, for each x ∈ X, fn(x) → f(x) whenever φx(fn) → φx(f).

(b) If X is reflexive, then the concepts of weak convergence in X* and weak* convergence in X* coincide. Prove.
5
Hilbert Spaces
What is missing? The algebraic structure of a normed space allowed us to operate with vectors (addition and scalar multiplication), and its topological structure (the one endowed by the norm) gave us a notion of closeness (by means of the metric generated by the norm), which interacts harmoniously with the algebraic operations. In particular, it provided the notion of the length of a vector. So what is missing if algebra and topology have already been properly laid on the same underlying set? A full geometric structure is still missing. Just algebra and topology are not enough to extend to abstract spaces the geometric concept of relative direction (or angle) between vectors that is familiar in Euclidean geometry. The keyword here is orthogonality, a concept that emerges when we equip a linear space with an inner product. This supplies a tremendously rich structure that leads to remarkable simplifications.
5.1
Inner Product Spaces
We shall assume throughout this chapter (as we did in Chapter 4) that F denotes either the real field R or the complex field C, both equipped with their usual topologies induced by their usual metrics. Recall that an upper bar stands for complex conjugate in C. That is, for each complex number λ = α + iβ in standard form the real numbers α = Re λ and β = Im λ are the real and imaginary parts of λ, respectively, and the complex number λ̄ = α − iβ is the conjugate of λ. The following are
basic properties of conjugates: for every λ, μ ∈ C, (λ̄)‾ = λ, (λ + μ)‾ = λ̄ + μ̄, (λμ)‾ = λ̄μ̄, λ + λ̄ = 2 Re λ, λ − λ̄ = 2i Im λ, λλ̄ = |λ|² = (Re λ)² + (Im λ)², and λ̄ = λ if and only if λ ∈ R.

Let X be a linear space over F. A functional a : X × X → F, defined on the Cartesian product of a linear space X with itself, is symmetric if a(x, y) = a(y, x) and Hermitian symmetric if
a(x, y) = a(y, x)‾,

for every x, y ∈ X. If a(·, v) : X → F and a(u, ·) : X → F are linear functionals on X for every u, v ∈ X (i.e., if a is linear in both arguments), then it is called a bilinear form (or a bilinear functional) on X. If a(·, v) : X → F is a linear functional on X for each v ∈ X, and if a(u, ·) : X → F is conjugate-linear for each u ∈ X, then a is said to be a sesquilinear form (or a sesquilinear functional) on X. Equivalently, a is a sesquilinear form on X if

a(αu + βx, v) = α a(u, v) + β a(x, v),
a(u, αv + βy) = ᾱ a(u, v) + β̄ a(u, y),

for every u, v, x, y ∈ X and every α, β ∈ F ("sesqui" means "one-and-a-half"). If F = R, then it is clear that a is symmetric if and only if it is Hermitian symmetric, and the notions of bilinear and sesquilinear forms coincide as well. It is readily verified that a functional a : X × X → F that is linear in the first argument (i.e., a(·, v) : X → F is linear for every v ∈ X) and Hermitian symmetric is a sesquilinear form. Therefore, a Hermitian symmetric sesquilinear form is precisely a Hermitian symmetric functional on X × X that is linear in the first argument. If a is a sesquilinear form on X, then the functional φ : X → F defined by

φ(x) = a(x, x)
for every x ∈ X is called a quadratic form on X induced (or generated) by a. If a is Hermitian symmetric, then the induced quadratic form φ is real-valued (i.e., a(x, x) ∈ R for every x ∈ X whenever a is Hermitian symmetric). Also note that if a is a sesquilinear form, then a(0, v) = a(u, 0) = 0 for every u, v ∈ X so that a(0, 0) = 0. The quadratic form φ induced by a Hermitian symmetric sesquilinear form a is nonnegative or positive if

a(x, x) ≥ 0 for every x ∈ X

or

a(x, x) > 0 for every nonzero x ∈ X,

respectively. Equivalently, φ is positive if it is nonnegative and a(x, x) = 0 only if x = 0. An inner product (or a scalar product) on a linear space X is a Hermitian symmetric sesquilinear form that induces a positive quadratic form. In other words, an inner product on a linear space X is a functional on the Cartesian product X × X that satisfies the following properties, called the inner product axioms.

Definition 5.1. Let X be a linear space over F. A functional

( ; ) : X × X → F
is an inner product on X if the following conditions are satisfied for all vectors x, y and z in X and all scalars α in F.

(i) (x + y ; z) = (x ; z) + (y ; z)  (additivity),
(ii) (αx ; y) = α(x ; y)  (homogeneity),
(iii) (x ; y) = (y ; x)‾  (Hermitian symmetry),
(iv) (x ; x) ≥ 0  (nonnegativeness),
(v) (x ; x) = 0 only if x = 0  (positiveness).

(Observe that homogeneity in (ii) really means homogeneity in the first argument.) A linear space X equipped with an inner product on it is an inner product space (or a pre-Hilbert space). If X is a real or complex linear space (so that F = R or F = C) equipped with an inner product on it, then it is referred to as a real or complex inner product space, respectively. Axioms (i), (ii) and (iii) are enough by themselves to ensure that, for all vectors x, y, z in X and all scalars α in F,
(x ; αy) = ᾱ(x ; y),  (x ; y + z) = (x ; y) + (x ; z),  and  (x ; 0) = (0 ; x) = (0 ; 0) = 0.

As a matter of fact, just these three axioms imply (by induction) that

(Σi=1^n αi xi ; β0 y0) = Σi=1^n αi β̄0 (xi ; y0),

(α0 x0 ; Σj=1^n βj yj) = Σj=1^n α0 β̄j (x0 ; yj),

and hence

(Σi=0^n αi xi ; Σj=0^n βj yj) = Σi,j=0^n αi β̄j (xi ; yj)

for each integer n ≥ 1 whenever αi and βj lie in F and xi and yj lie in X, for every i, j = 0, ..., n. In particular, for every x, y ∈ X,

(x + y ; x + y) = (x ; x) + 2 Re (x ; y) + (y ; y)

(because (x ; y) + (y ; x) = (x ; y) + (x ; y)‾ = 2 Re (x ; y)). Moreover, by using axioms (ii) and (v) we get

(x ; y) = 0 for all y ∈ X  if and only if  x = 0.
Let || ||² : X → R be the quadratic form induced by an inner product ( ; ) on X (i.e., ||x||² = (x ; x) ≥ 0 for every x ∈ X - the notation || ||² for the quadratic form induced by an inner product is certainly not a mere coincidence, as we shall see shortly). Note that

||x + y||² = ||x||² + 2 Re (x ; y) + ||y||²

for every x, y ∈ X (by axioms (i) and (iii)). The next result is of fundamental importance. It is referred to as the Schwarz (or Cauchy-Schwarz, or even Cauchy-Bunyakowski-Schwarz) inequality.
Lemma 5.2. Let ( ; ) : X × X → F be an inner product on a linear space X. Set ||x|| = (x ; x)^(1/2) for each x ∈ X. If x, y ∈ X, then

|(x ; y)| ≤ ||x|| ||y||.
Proof. Take an arbitrary pair of vectors x and y in X. According to axioms (i), (ii), (iii) and (iv) in Definition 5.1,

0 ≤ (x − αy ; x − αy) = (x ; x) − ᾱ(x ; y) − α(y ; x) + |α|²(y ; y)

for every α ∈ F. In particular, set α = β(x ; y) for any β > 0, so that

0 ≤ ||x||² − β(2 − β||y||²)|(x ; y)|².

If ||y|| ≠ 0, then put β = 1/||y||² to get the Schwarz inequality. If ||y|| = 0, then 2β|(x ; y)|² ≤ ||x||² for all β > 0, and hence |(x ; y)| = 0 (which trivially satisfies the Schwarz inequality). □
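A numerical spot-check of the Schwarz inequality for the usual inner product (x ; y) = Σk ξk ῡk on Cⁿ (an illustration added here, not part of the proof):

```python
import random, math

# |(x ; y)| <= ||x|| ||y|| for the inner product (x ; y) = sum_k x_k * conj(y_k)

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

random.seed(1)
for _ in range(1000):
    n = random.randint(1, 8)
    x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
print("Schwarz inequality holds on all random samples")
```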
Proposition 5.3. If ( ; ) : X × X → F is an inner product on a linear space X, then the function || || : X → R, defined by

||x|| = (x ; x)^(1/2)

for each x ∈ X, is a norm on X.

Proof. Axioms (ii), (iii), (iv) and (v) in Definition 5.1 imply the norm axioms (i), (ii) and (iii) of Definition 4.1. The triangle inequality (axiom (iv) of Definition 4.1) is a consequence of the Schwarz inequality:

0 ≤ ||x + y||² = ||x||² + 2 Re (x ; y) + ||y||² ≤ (||x|| + ||y||)²

for every x and y in X. (Reason: Re (x ; y) ≤ |(x ; y)| ≤ ||x|| ||y||.) □
A word on notation and terminology. An inner product space is in fact an ordered pair (X, ( ; )), where X is a linear space and ( ; ) is an inner product on X. We shall often refer to an inner product space by simply saying that "X is an inner product space" without explicitly mentioning the inner product ( ; ) that equips the linear space X.
However, there may be occasions when the role played by different inner products should be emphasized and, in these cases, we shall insert a subscript on the inner products (e.g., (X, ( ; )X) and (Y, ( ; )Y)). If a linear space X can be equipped with more than one inner product, say ( ; )1 and ( ; )2, then (X, ( ; )1) and (X, ( ; )2) will represent different inner product spaces with the same linear space X. The norm || || of Proposition 5.3 is the norm induced (or defined, or generated) by the inner product ( ; ), so that every inner product space is a special kind of normed space (and hence a very special kind of linear metric space). Whenever we refer to the topological structure of an inner product space (X, ( ; )) it will always be understood that such a topology on X is that defined by the metric d generated by the norm || ||, which in turn is the one induced by the inner product ( ; ). That is,

d(x, y) = ||x − y|| = (x − y ; x − y)^(1/2)

for every x, y ∈ X (see Propositions 4.2 and 5.3). This is the norm topology on X induced by the inner product. Since every inner product on a linear space induces a norm, it follows that every inner product space is a normed space (equipped with the induced norm). However, an arbitrary norm on a linear space may not be induced by any inner product on it (so that an arbitrary normed space may not be an inner product space). The next proposition leads to a necessary and sufficient condition that a norm be induced by an inner product.
Proposition 5.4. Let ( ; ) be an inner product on a linear space X and let || || be the induced norm on X. Then

||x + y||² + ||x − y||² = 2(||x||² + ||y||²)

for every x, y ∈ X. This is called the parallelogram law. If (X, ( ; )) is a complex inner product space, then

(x ; y) = (1/4)(||x + y||² − ||x − y||² + i||x + iy||² − i||x − iy||²)

for every x, y ∈ X. If (X, ( ; )) is a real inner product space, then

(x ; y) = (1/4)(||x + y||² − ||x − y||²)

for every x, y ∈ X. The above two expressions are referred to as the complex and real polarization identities, respectively.

Proof. According to axioms (i), (ii) and (iii) in Definition 5.1 (also see the displayed identity that precedes Lemma 5.2) we get

||x + αy||² = ||x||² + 2 Re (ᾱ(x ; y)) + |α|²||y||²
           = ||x||² + 2[Re α Re (x ; y) + Im α Im (x ; y)] + |α|²||y||²

for every x, y ∈ X and every α ∈ F (recall that Re (λμ) = Re λ Re μ − Im λ Im μ for every λ, μ ∈ C). The parallelogram law and the (real) polarization identity
in a real inner product space follow by setting α = 1 and α = −1. To get the (complex) polarization identity in a complex inner product space also set α = i and α = −i. □
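The two identities of Proposition 5.4 can be verified numerically for the usual inner product on C⁴ (an added sketch, not part of the proof):

```python
import math, random

# check that ||.||_2 on C^4 satisfies the parallelogram law and that the
# complex polarization identity recovers the inner product ( ; )

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def nsq(x):                  # squared norm ||x||^2 = (x ; x)
    return inner(x, x).real

def comb(x, y, s):           # the vector x + s*y
    return [a + s * b for a, b in zip(x, y)]

def polar(x, y):             # (1/4)(||x+y||^2 - ||x-y||^2 + i||x+iy||^2 - i||x-iy||^2)
    return 0.25 * (nsq(comb(x, y, 1)) - nsq(comb(x, y, -1))
                   + 1j * nsq(comb(x, y, 1j)) - 1j * nsq(comb(x, y, -1j)))

random.seed(0)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]

par = nsq(comb(x, y, 1)) + nsq(comb(x, y, -1)) - 2 * (nsq(x) + nsq(y))
print(abs(par) < 1e-9, abs(polar(x, y) - inner(x, y)) < 1e-9)
```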
Theorem 5.5. (von Neumann). Let X be a linear space. A norm on X is induced by an inner product on X if and only if it satisfies the parallelogram law. Moreover, if a norm on X satisfies the parallelogram law, then the unique inner product that induces it is given by the polarization identity.
Proof. Proposition 5.4 ensures that if a norm on X is induced by an inner product, then it satisfies the parallelogram law and the inner product on X can be written in terms of this norm according to the polarization identity. Conversely, suppose a norm || || on X satisfies the parallelogram law and consider the mapping ( ; ) : X × X → F defined by the polarization identity. Take x, y and z arbitrary in X. Note that

x + z = ((x + y)/2 + z) + (x − y)/2  and  y + z = ((x + y)/2 + z) − (x − y)/2.

Thus, by the parallelogram law,

||x + z||² + ||y + z||² = 2(||(x + y)/2 + z||² + ||(x − y)/2||²).

Suppose F = R so that ( ; ) : X × X → R is the mapping defined by the real polarization identity (on the real normed space X). Hence

(x ; z) + (y ; z) = (1/4)(||x + z||² − ||x − z||² + ||y + z||² − ||y − z||²)
 = (1/4)[(||x + z||² + ||y + z||²) − (||x − z||² + ||y − z||²)]
 = (1/4)[2(||(x + y)/2 + z||² + ||(x − y)/2||²) − 2(||(x + y)/2 − z||² + ||(x − y)/2||²)]
 = (1/2)(||(x + y)/2 + z||² − ||(x + y)/2 − z||²)
 = 2((x + y)/2 ; z).

The above identity holds for arbitrary x, y, z ∈ X, and so it holds for y = 0. Moreover, the polarization identity ensures that (0 ; z) = 0 for every z ∈ X. Therefore, by setting y = 0 above we get (x ; z) = 2(x/2 ; z) for every x, z ∈ X. Then

(i) (x ; z) + (y ; z) = (x + y ; z)

for arbitrary x, y and z in X. It is readily verified (exactly the same argument) that such an identity still holds if F = C, where the mapping ( ; ) : X × X → C now satisfies the complex polarization identity (on the complex normed space X). This is axiom (i) of Definition 5.1 (additivity). To verify axiom (ii) of Definition 5.1 (homogeneity in the first argument) proceed as follows. Take x and y arbitrary in X. The polarization identity ensures that

(−x ; y) = −(x ; y).
Since (i) holds true, it follows by a trivial induction that

(nx ; y) = n(x ; y),

and hence (x ; y) = (n(x/n) ; y) = n(x/n ; y) so that

((1/n)x ; y) = (1/n)(x ; y),

for every positive integer n. The above three expressions imply that

(qx ; y) = q(x ; y)

for every rational number q (for (0 ; y) = 0 by the polarization identity). Take an arbitrary α ∈ R and recall that Q is dense in R. Thus there exists a rational-valued sequence {qn} that converges in R to α. Moreover, according to (i) and recalling that −(αx ; y) = (−αx ; y),

|(qn x ; y) − (αx ; y)| = |((qn − α)x ; y)|.

The polarization identity ensures that |(αn x ; y)| → 0 whenever αn → 0 in R (because the norm is continuous). Hence |((qn − α)x ; y)| → 0 and so |(qn x ; y) − (αx ; y)| → 0, which means that (qn x ; y) → (αx ; y). Therefore, (αx ; y) = limn (qn x ; y) = limn qn (x ; y) = α(x ; y). Outcome:

(ii(a)) (αx ; y) = α(x ; y)

for every α ∈ R. If F = C, then the complex polarization identity (on the complex space X) ensures that

(ix ; y) = i(x ; y).

Take an arbitrary λ = α + iβ in C and observe by (i) and (ii(a)) that (λx ; y) = ((α + iβ)x ; y) = (αx ; y) + (iβx ; y) = (α + iβ)(x ; y) = λ(x ; y). Conclusion:

(ii(b)) (λx ; y) = λ(x ; y)

for every λ ∈ C. Axioms (iii), (iv) and (v) of Definition 5.1 (Hermitian symmetry and positiveness) emerge as immediate consequences of the polarization identity. Thus the mapping ( ; ) : X × X → F defined by the polarization identity is, in fact, an inner product on X. Moreover, this inner product induces the norm || ||; that is, (x ; x) = ||x||² for every x ∈ X (polarization identity again). Finally, if ( ; )0 : X × X → F is an inner product on X that induces the same norm || || on X, then it must coincide with ( ; ). That is, (x ; y)0 = (x ; y) for every x, y ∈ X (polarization identity once again). □

A Hilbert space is a complete inner product space. In other words, a Hilbert space is an inner product space that is complete as a metric space with respect to the metric generated by the norm induced by the inner product. In fact, every Hilbert space is a special kind of Banach space: a Hilbert space is a Banach space whose norm is induced by an inner product. According to Theorem 5.5, a Hilbert space is a Banach space whose norm satisfies the parallelogram law.
5.2
Examples
Theorem 5.5 may suggest that just a few of the classical examples of Section 4.2 survive as inner product spaces. This indeed is the case.
Example 5A. Consider the linear space Fⁿ over F (with either F = R or F = C) and set

(x ; y) = Σi=1^n ξi ῡi

for every x = (ξ1, ..., ξn) and y = (υ1, ..., υn) in Fⁿ. It is readily verified that this defines an inner product on Fⁿ (check the axioms in Definition 5.1), which induces the norm || ||2 on Fⁿ. In particular, it induces the Euclidean norm on Rⁿ so that (Rⁿ, ( ; )) is the n-dimensional Euclidean space (see Example 4A). Since (Fⁿ, || ||2) is a Banach space,

(Fⁿ, ( ; )) is a Hilbert space.
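The parallelogram-law test of Theorem 5.5 can be run numerically on F² with x = (1, 0) and y = (0, 1) (a sketch added here, not part of the example): the law holds for || ||2 and fails for the other p-norms and the sup-norm.

```python
import math

# x = (1,0), y = (0,1): ||x+y||^2 + ||x-y||^2 versus 2(||x||^2 + ||y||^2)

def pnorm(v, p):
    if p == math.inf:
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

x, y = (1.0, 0.0), (0.0, 1.0)
s, d = (1.0, 1.0), (1.0, -1.0)        # x + y and x - y

results = {}
for p in (1, 2, 3, math.inf):
    lhs = pnorm(s, p) ** 2 + pnorm(d, p) ** 2
    rhs = 2 * (pnorm(x, p) ** 2 + pnorm(y, p) ** 2)
    results[p] = abs(lhs - rhs) < 1e-12   # does the parallelogram law hold?
print(results)                             # True only at p = 2
```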
Now consider the norms || ||p (for p ≥ 1) and || ||∞ on Fⁿ defined in Example 4A. If n > 1, then none of them, except the norm || ||2, is induced by any inner product on Fⁿ. Indeed, set x = (1, 0, ..., 0) and y = (0, 1, 0, ..., 0) in Fⁿ and verify that the parallelogram law fails for every norm || ||p with p ≠ 2, as it also fails for the sup-norm || ||∞. Therefore, if n > 1, then (Fⁿ, || ||2) is the only Hilbert space among the Banach spaces of Example 4A.

Example 5B. Consider the Banach spaces (ℓ+^p, || ||p) for each p ≥ 1 and (ℓ+^∞, || ||∞) of Example 4B. It is easy to show that, except for (ℓ+^2, || ||2), these are not Hilbert spaces: the norms || ||p for every p ≠ 2 and || ||∞ do not pass the parallelogram-law test of Theorem 5.5, and hence are not induced by any possible inner product on ℓ+^p (p ≠ 2) or on ℓ+^∞ (e.g., take x = e1 = (1, 0, 0, 0, ...) and y = e2 = (0, 1, 0, 0, ...) in ℓ+^p ∩ ℓ+^∞). On the other hand, the function ( ; ) : ℓ+^2 × ℓ+^2 → F given by

(x ; y) = Σk=1^∞ ξk ῡk

for every x = {ξk}k∈N and y = {υk}k∈N in ℓ+^2 is well-defined (i.e., the above infinite series converges in F for every x, y ∈ ℓ+^2 by the Hölder inequality for p = q = 2 and Proposition 4.4). Moreover, it defines an inner product on ℓ+^2 (i.e., it satisfies the axioms of Definition 5.1), which induces the norm || ||2 on ℓ+^2. Thus, as (ℓ+^2, || ||2) is a Banach space,

(ℓ+^2, ( ; )) is a Hilbert space.
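The Hölder-inequality argument for well-definedness can be watched numerically on model sequences (an added sketch; the choices ξk = 1/k and υk = 1/(k+1) are hypothetical, picked only because both are square-summable):

```python
import math

# the partial sums of sum_k |x_k y_k| stay below the Cauchy-Schwarz
# (Holder, p = q = 2) bound ||x||_2 ||y||_2, so the series converges

N = 100000
x = [1.0 / k for k in range(1, N + 1)]
y = [1.0 / (k + 1) for k in range(1, N + 1)]

abs_sum = sum(a * b for a, b in zip(x, y))          # all terms positive here
bound = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
print(abs_sum, bound)                                # abs_sum <= bound
```

(For these particular sequences the series telescopes: Σ 1/(k(k+1)) = 1 − 1/(N+1).)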
Similarly, the Banach spaces (ℓ^p, || ||p) for any 1 ≤ p ≠ 2 and (ℓ^∞, || ||∞) are not Hilbert spaces. However, the function ( ; ) : ℓ² × ℓ² → F given by

(x ; y) = Σk=−∞^∞ ξk ῡk

for every x = {ξk}k∈Z and y = {υk}k∈Z in ℓ² defines an inner product on ℓ², which induces the norm || ||2 on ℓ². Indeed, the sequence {Σk=−n^n |ξk ῡk|}n∈N0 of nonnegative numbers converges in R whenever the sequences {Σk=−n^n |ξk|²}n∈N0 and {Σk=−n^n |υk|²}n∈N0 of nonnegative numbers converge in R (cf. Hölder inequality for p = q = 2), and hence {Σk=−n^n ξk ῡk}n∈N0 converges in F (by Proposition 4.4). Therefore, the function ( ; ) is well-defined and, as is easy to check, it satisfies all the axioms of Definition 5.1. Since (ℓ², || ||2) is a Banach space,

(ℓ², ( ; )) is a Hilbert space.
Example 5C. Consider the linear space C[0, 1] equipped with any of the norms || ||p (p ≥ 1) of Example 4D or with the sup-norm || ||∞ of Example 4G. Among these norms on C[0, 1], the only one that is induced by an inner product on C[0, 1] is the norm || ||2. Indeed, take x and y in C[0, 1] such that xy = 0 and ||x|| = ||y|| ≠ 0, where || || denotes either || ||p for some p ≥ 1 or || ||∞. That is, suppose x and y are nonzero continuous functions on [0, 1] of equal norms such that their nonzero values are attained on disjoint subsets of [0, 1]. (The text displays a sketch of such a pair x, y on [0, 1].) Observe that ||x + y||p^p = ||x − y||p^p = 2||x||p^p for every p ≥ 1 and ||x + y||∞ = ||x − y||∞ = ||x||∞. Thus || ||p for p ≠ 2 and || ||∞ do not satisfy the parallelogram law, and hence these norms are not induced by any inner product on C[0, 1] (Theorem 5.5). Now consider the function ( ; ) : C[0, 1] × C[0, 1] → F given by

(x ; y) = ∫0^1 x(t) y(t)‾ dt

for every x, y ∈ C[0, 1]. It is readily verified that ( ; ) is an inner product on C[0, 1] that induces the norm || ||2. Hence

(C[0, 1], ( ; )) is an inner product space

but not a Hilbert space. (Reason: (C[0, 1], || ||2) is not a Banach space - Example 3D.) As a matter of fact, among the normed spaces (C[0, 1], || ||p) for each p ≥ 1
and (C[0, 1], || ||∞), the only Banach space is (C[0, 1], || ||∞). This leads to a dichotomy: either equip C[0, 1] with || ||2 to get an inner product space that is not a Banach space, or equip it with || ||∞ to get a Banach space whose norm is not induced by an inner product. In any case, C[0, 1] cannot be made into a Hilbert space. Roughly speaking, the set of continuous functions on [0, 1] is not large enough to be a Hilbert space.

Let X be a linear space over a field F. A functional ( ; ) : X × X → F is a semi-inner product on X if it satisfies the first four axioms of Definition 5.1. The difference between an inner product and a semi-inner product is that a semi-inner product is a
Hermitian symmetric sesquilinear form that induces a nonnegative quadratic form which is not necessarily positive (i.e., axiom (v) of Definition 5.1 may not be satisfied by a semi-inner product). A semi-inner product ( ; ) on X induces a seminorm || ||, which in turn generates a pseudometric d, viz., d(x, y) = ||x − y|| = (x − y ; x − y)^(1/2) for every x, y ∈ X. A semi-inner product space is a linear space equipped with a semi-inner product. Remark: The identity ||x + y||² = ||x||² + 2 Re (x ; y) + ||y||² for every x, y ∈ X still holds for a semi-inner product and its induced seminorm. Indeed, the Schwarz inequality, the parallelogram law and the polarization identities remain valid in a semi-inner product space (i.e., they still hold if we replace "inner product" and "norm" with "semi-inner product" and "seminorm", respectively - cf. proofs of Lemma 5.2 and Proposition 5.4). The same happens with respect to Theorem 5.5.
Proposition 5.6. Let || || be the seminorm induced by a semi-inner product ( ; ) on a linear space X. Consider the quotient space X/N, where N = {x ∈ X : ||x|| = 0} is a linear manifold of X. Set

([x] ; [y])~ = (x ; y)

for every [x] and [y] in X/N, where x and y are arbitrary vectors in [x] and [y], respectively. This defines an inner product on X/N so that (X/N, ( ; )~) is an inner product space.
Proof. The seminorm || || is induced by a semi-inner product so that it satisfies the parallelogram law of Proposition 5.4. Consider the norm || ||~ on X/N of Proposition 4.5 and note that

||[x] + [y]||~² + ||[x] − [y]||~² = ||[x + y]||~² + ||[x − y]||~²
 = ||x + y||² + ||x − y||²
 = 2(||x||² + ||y||²)
 = 2(||[x]||~² + ||[y]||~²)

for every [x], [y] ∈ X/N. Thus || ||~ satisfies the parallelogram law. This means that it is induced by a (unique) inner product ( ; )~ on X/N, which is given in terms of the norm || ||~ by the polarization identity (Theorem 5.5). On the other
hand, the semi-inner product ( ; ) on X also is given in terms of the seminorm || || through the polarization identity as in Proposition 5.4. Since ||[x] + α[y]||~ = ||x + αy|| for every [x], [y] ∈ X/N and every α ∈ F (with x and y being arbitrary elements of [x] and [y], respectively), it is readily verified via the polarization identity that ([x] ; [y])~ = (x ; y). □
Example 5D. For each p ≥ 1 let r^p(S) be the linear space of all scalar-valued Riemann p-integrable functions, on a nondegenerate interval S of the real line, equipped with the seminorm | |p of Example 4C. Again (see Example 5C) it is easy to show that, except for the seminorm | |2, these seminorms do not satisfy the parallelogram law. Moreover,

(x ; y) = ∫S x(s) y(s)‾ ds

for every x, y ∈ r²(S) defines a semi-inner product that induces the seminorm | |2, viz., |x|2 = (∫S |x(s)|² ds)^(1/2) for each x ∈ r²(S). Consider the linear manifold N = {x ∈ r²(S) : |x|2 = 0} and let R²(S) be the quotient space r²(S)/N as in Example 4C. Set

([x] ; [y]) = (x ; y)

for every [x], [y] ∈ R²(S), where x and y are arbitrary vectors in [x] and [y], respectively. According to Proposition 5.6 this defines an inner product on R²(S), which is the one that induces the norm || ||2 of Example 4C. Since the normed space (R²(S), || ||2) is not a Banach space, it follows that

(R²(S), ( ; )) is an inner product space

but not a Hilbert space. The completion (L²(S), || ||2) of (R²(S), || ||2) is a Banach space whose norm is induced by the inner product ( ; ) so that

(L²(S), ( ; )) is a Hilbert space.

This, in fact, is the completion of the inner product space (C[0, 1], ( ; )) of Example 5C (if S = [0, 1] - see Examples 4C and 4D). We shall discuss the completion of an inner product space in Section 5.6.
Example 5E. Let {(Xi, ( ; )i)}i=1^n be a finite collection of inner product spaces, where the linear spaces Xi are all over the same field F, and let ⊕i=1^n Xi be the direct sum of the family {Xi}i=1^n. For each x = (x1, ..., xn) and y = (y1, ..., yn) in ⊕i=1^n Xi put

(x ; y) = Σi=1^n (xi ; yi)i.

It is easy to check that this defines an inner product on ⊕i=1^n Xi that induces the norm || ||2 of Example 4E. Indeed, if || ||i is the norm on each Xi induced by the inner product ( ; )i, then (x ; x) = Σi=1^n (xi ; xi)i = Σi=1^n ||xi||i² = ||x||2² for every x = (x1, ..., xn) in ⊕i=1^n Xi. Since (⊕i=1^n Xi, || ||2) is a Banach space if and only if each (Xi, || ||i) is a Banach space, it follows that

(⊕i=1^n Xi, ( ; )) is a Hilbert space

whenever each (Xi, ( ; )i) is a Hilbert space. If the inner product spaces (Xi, ( ; )i) coincide with a fixed inner product space (X, ( ; )X), then

(x ; y) = Σi=1^n (xi ; yi)X

defines an inner product on Xⁿ = ⊕i=1^n X and

(Xⁿ, ( ; )) is a Hilbert space

whenever (X, ( ; )X) is a Hilbert space. This generalizes Example 5A.
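A minimal numerical sketch of the direct-sum construction (added here; the two component spaces R² and R³ with their Euclidean inner products are hypothetical choices, not from the text):

```python
# on X1 (+) X2 with X1 = R^2, X2 = R^3, set (x ; y) = (x1 ; y1)_1 + (x2 ; y2)_2,
# so that ||x||^2 = ||x1||_1^2 + ||x2||_2^2 (the ||.||_2 norm of Example 4E)

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

x1, x2 = [1.0, 2.0], [3.0, 4.0, 5.0]
y1, y2 = [0.5, -1.0], [2.0, 0.0, 1.0]

ip = inner(x1, y1) + inner(x2, y2)          # the direct-sum inner product
nrm_sq = inner(x1, x1) + inner(x2, x2)      # induced squared norm
print(ip, nrm_sq)                            # nrm_sq = 1 + 4 + 9 + 16 + 25 = 55
```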
Example 5F. Let {(X_k, ⟨ ; ⟩_k)}_{k=1}^∞ be a countably infinite collection of inner product spaces indexed by ℕ (or by ℕ₀), where the linear spaces X_k are all over the same field F. Consider the full direct sum ⊕_{k=1}^∞ X_k of {X_k}_{k=1}^∞, which is a linear space over F. Let [⊕_{k=1}^∞ X_k]₂ be the linear manifold of ⊕_{k=1}^∞ X_k made up of all square-summable sequences {x_k}_{k=1}^∞ in ⊕_{k=1}^∞ X_k. That is (see Example 4F),

[⊕_{k=1}^∞ X_k]₂ = {{x_k}_{k=1}^∞ ∈ ⊕_{k=1}^∞ X_k : Σ_{k=1}^∞ ‖x_k‖_k² < ∞},

where each ‖ ‖_k is the norm on X_k induced by the inner product ⟨ ; ⟩_k. Take arbitrary sequences {x_k}_{k=1}^∞ and {y_k}_{k=1}^∞ in [⊕_{k=1}^∞ X_k]₂, so that the real-valued sequences {‖x_k‖_k}_{k=1}^∞ and {‖y_k‖_k}_{k=1}^∞ lie in ℓ₊². Write ⟨ ; ⟩_{ℓ₊²} and ‖ ‖_{ℓ₊²} for the inner product and norm on ℓ₊² (as in Example 5B). Use the Schwarz inequality in each inner product space X_k, and also in the Hilbert space ℓ₊², and get

Σ_{k=1}^∞ |⟨x_k ; y_k⟩_k| ≤ Σ_{k=1}^∞ ‖x_k‖_k ‖y_k‖_k = ⟨{‖x_k‖_k}_{k=1}^∞ ; {‖y_k‖_k}_{k=1}^∞⟩_{ℓ₊²} ≤ ‖{‖x_k‖_k}_{k=1}^∞‖_{ℓ₊²} ‖{‖y_k‖_k}_{k=1}^∞‖_{ℓ₊²}.

Therefore Σ_{k=1}^∞ |⟨x_k ; y_k⟩_k| < ∞; that is, the infinite series Σ_{k=1}^∞ ⟨x_k ; y_k⟩_k is absolutely convergent in the Banach space (F, | |), and hence it converges in (F, | |) by Proposition 4.4. Set

⟨x ; y⟩ = Σ_{k=1}^∞ ⟨x_k ; y_k⟩_k

for each x = {x_k}_{k=1}^∞ and y = {y_k}_{k=1}^∞ in [⊕_{k=1}^∞ X_k]₂. It is easy to show that this defines an inner product on [⊕_{k=1}^∞ X_k]₂ that induces the norm ‖ ‖₂ of Example 4F. Moreover, since ([⊕_{k=1}^∞ X_k]₂, ‖ ‖₂) is a Banach space if and only if each (X_k, ‖ ‖_k) is a Banach space, it follows that

([⊕_{k=1}^∞ X_k]₂, ⟨ ; ⟩) is a Hilbert space
whenever each (X_k, ⟨ ; ⟩_k) is a Hilbert space. A similar argument holds if the collection {(X_k, ⟨ ; ⟩_k)} is indexed by ℤ. Indeed, if we set

[⊕_{k=−∞}^∞ X_k]₂ = {{x_k}_{k=−∞}^∞ ∈ ⊕_{k=−∞}^∞ X_k : Σ_{k=−∞}^∞ ‖x_k‖_k² < ∞},

the linear manifold of the full direct sum ⊕_{k=−∞}^∞ X_k of {X_k}_{k=−∞}^∞ made up of all square-summable nets {x_k}_{k=−∞}^∞ in ⊕_{k=−∞}^∞ X_k, then

⟨x ; y⟩ = Σ_{k=−∞}^∞ ⟨x_k ; y_k⟩_k

for each x = {x_k}_{k=−∞}^∞ and y = {y_k}_{k=−∞}^∞ in [⊕_{k=−∞}^∞ X_k]₂ defines the inner product on [⊕_{k=−∞}^∞ X_k]₂ that induces the norm ‖ ‖₂ of Example 4F. Again, if each (X_k, ⟨ ; ⟩_k) is a Hilbert space, then

([⊕_{k=−∞}^∞ X_k]₂, ⟨ ; ⟩) is a Hilbert space.

If the inner product spaces (X_k, ⟨ ; ⟩_k) coincide with a fixed inner product space (X, ⟨ ; ⟩_X), then set

ℓ₊²(X) = [⊕_{k=1}^∞ X]₂ and ℓ²(X) = [⊕_{k=−∞}^∞ X]₂

as in Example 4F. If (X, ⟨ ; ⟩_X) is a Hilbert space, then (ℓ₊²(X), ⟨ ; ⟩) and (ℓ²(X), ⟨ ; ⟩) are Hilbert spaces.

5.3 Orthogonality
Let a and b be nonzero vectors in the Euclidean plane ℝ², and let θ_ab be the angle between the line segments joining these points to the origin (this is usually called the angle between a and b). Set a′ = ‖a‖⁻¹a = (α₁, α₂) and b′ = ‖b‖⁻¹b = (β₁, β₂) in the unit circle about the origin. It is a simple exercise of elementary plane geometry to verify that cos θ_ab = α₁β₁ + α₂β₂ = ⟨a′ ; b′⟩ = ‖a‖⁻¹‖b‖⁻¹⟨a ; b⟩. We shall be particularly concerned with the notion of orthogonal (or perpendicular) vectors a and b. The line segments joining a and b to the origin are perpendicular if θ_ab = π/2 (equivalently, if cos θ_ab = 0), which means that ⟨a ; b⟩ = 0. These notions (angle and orthogonality, that is) can be extended from the Euclidean plane to a real inner product space (X, ⟨ ; ⟩) by setting

cos θ_xy = ⟨x ; y⟩ / (‖x‖ ‖y‖)

whenever x and y are nonzero vectors in X ≠ {0}. Observe that −1 ≤ cos θ_xy ≤ 1 by the Schwarz inequality, and also that cos θ_xy = 0 if and only if ⟨x ; y⟩ = 0.
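The angle formula can be exercised numerically in ℝ². The Python sketch below (illustrative vectors and names, not from the text) computes cos θ_xy = ⟨x ; y⟩/(‖x‖‖y‖) and confirms the two observations above: the Schwarz bound −1 ≤ cos θ_xy ≤ 1, and cos θ_xy = 0 exactly when ⟨x ; y⟩ = 0.

```python
import math

# cos(theta_xy) = <x;y> / (||x|| ||y||) in the Euclidean plane R^2.
# Vectors and function names are illustrative choices.

def dot(u, v):
    # Euclidean inner product on R^n
    return sum(a * b for a, b in zip(u, v))

def cos_angle(x, y):
    # defined only for nonzero x and y
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

a, b = (3.0, 4.0), (4.0, -3.0)
# the Schwarz inequality forces -1 <= cos(theta) <= 1
assert -1.0 <= cos_angle(a, b) <= 1.0
# <a;b> = 12 - 12 = 0, so a and b are orthogonal and cos(theta_ab) = 0
assert cos_angle(a, b) == 0.0
```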
Definition 5.7. Two vectors x and y in any (real or complex) inner product space (X, ⟨ ; ⟩) are said to be orthogonal (notation: x ⊥ y) if ⟨x ; y⟩ = 0. A vector x in X is orthogonal to a subset A of X (notation: x ⊥ A) if it is orthogonal to every vector in A (i.e., if ⟨x ; y⟩ = 0 for every y ∈ A). Two subsets A and B of X are orthogonal (notation: A ⊥ B) if every vector in A is orthogonal to every vector in B (i.e., if ⟨x ; y⟩ = 0 for every x ∈ A and every y ∈ B).
Thus A and B are orthogonal if there is no x in A and no y in B such that ⟨x ; y⟩ ≠ 0. In this sense the empty set ∅ is orthogonal to every subset of X. Clearly, x ⊥ y if and only if y ⊥ x, and hence A ⊥ B if and only if B ⊥ A, so that ⊥ is a symmetric relation both on X and on the power set ℘(X). We write x ⊥̸ y if x ∈ X and y ∈ X are not orthogonal. Similarly, A ⊥̸ B means that A ⊆ X and B ⊆ X are not orthogonal. Note that if there exists a nonzero vector x in A ∩ B, then ⟨x ; x⟩ = ‖x‖² ≠ 0, and hence A ⊥̸ B. Therefore,

A ⊥ B implies A ∩ B ⊆ {0}.
We shall say that a subset A of an inner product space X is an orthogonal set (or a set of pairwise orthogonal vectors) if x ⊥ y for every pair {x, y} of distinct vectors in A. Similarly, an X-valued sequence {x_k} is an orthogonal sequence (or a sequence of pairwise orthogonal vectors) if x_k ⊥ x_j whenever k ≠ j. Since ‖x + y‖² = ‖x‖² + 2Re⟨x ; y⟩ + ‖y‖² for every x and y in X, it follows as an immediate consequence of the definition of orthogonality that

x ⊥ y implies ‖x + y‖² = ‖x‖² + ‖y‖².
This is the Pythagorean Theorem. The next result is a generalization of it for a finite orthogonal set.
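Before the generalization, here is a quick numerical confirmation of the identity in the real inner product space ℝ³ (a sketch with illustrative vectors; the helper names are not from the text):

```python
# Pythagorean identity ||x + y||^2 = ||x||^2 + ||y||^2 for x | y in R^3,
# plus the general expansion it follows from.  Illustrative vectors.

def dot(u, v):
    # Euclidean inner product
    return sum(a * b for a, b in zip(u, v))

x = (1.0, 2.0, 2.0)
y = (2.0, -1.0, 0.0)
assert dot(x, y) == 0.0                      # x and y are orthogonal

s = tuple(a + b for a, b in zip(x, y))       # x + y
# for orthogonal x, y the cross term vanishes
assert dot(s, s) == dot(x, x) + dot(y, y)

# in general ||x + y||^2 = ||x||^2 + 2 Re<x;y> + ||y||^2 (real case)
w = (1.0, 1.0, 1.0)
sw = tuple(a + b for a, b in zip(x, w))
assert dot(sw, sw) == dot(x, x) + 2 * dot(x, w) + dot(w, w)
```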
Proposition 5.8. If {x_i}_{i=0}^n is a finite set of pairwise orthogonal vectors in an inner product space, then ‖Σ_{i=0}^n x_i‖² = Σ_{i=0}^n ‖x_i‖².
Proof. We have already seen that the result holds for n = 1 (i.e., it holds for every pair of distinct orthogonal vectors). Suppose it holds for some n ≥ 1 (i.e., suppose ‖Σ_{i=0}^n x_i‖² = Σ_{i=0}^n ‖x_i‖² for every orthogonal set {x_i}_{i=0}^n containing n + 1 elements). Let {x_i}_{i=0}^{n+1} be an arbitrary orthogonal set with n + 2 elements. Since x_{n+1} ⊥ {x_i}_{i=0}^n, it follows that x_{n+1} ⊥ Σ_{i=0}^n x_i (since ⟨x_{n+1} ; Σ_{i=0}^n x_i⟩ = Σ_{i=0}^n ⟨x_{n+1} ; x_i⟩). Hence

‖Σ_{i=0}^{n+1} x_i‖² = ‖Σ_{i=0}^n x_i + x_{n+1}‖² = ‖Σ_{i=0}^n x_i‖² + ‖x_{n+1}‖² = Σ_{i=0}^{n+1} ‖x_i‖²,

so that the result holds for n + 1 (i.e., it holds for every orthogonal set with n + 2 elements whenever it holds for every orthogonal set with n + 1 elements), which completes the proof by induction. □
Recall that an X-valued sequence {x_k}_{k=1}^∞ (where X is any normed space) is square-summable if Σ_{k=1}^∞ ‖x_k‖² < ∞. Here is a countably infinite version of the Pythagorean Theorem.

Corollary 5.9. Let {x_k}_{k=1}^∞ be a sequence of pairwise orthogonal vectors in an inner product space X.

(a) If the infinite series Σ_{k=1}^∞ x_k converges in X, then {x_k}_{k=1}^∞ is a square-summable sequence and ‖Σ_{k=1}^∞ x_k‖² = Σ_{k=1}^∞ ‖x_k‖².

(b) If X is a Hilbert space and {x_k}_{k=1}^∞ is a square-summable sequence, then the infinite series Σ_{k=1}^∞ x_k converges in X.

Proof. Let {x_k}_{k=1}^∞ be an orthogonal sequence in X.

(a) If the series Σ_{k=1}^∞ x_k converges in X, that is, if Σ_{k=1}^n x_k → Σ_{k=1}^∞ x_k ∈ X as n → ∞, then ‖Σ_{k=1}^n x_k‖² → ‖Σ_{k=1}^∞ x_k‖² as n → ∞. (Reason: norm and squaring are continuous mappings.) But Proposition 5.8 says that ‖Σ_{k=1}^n x_k‖² = Σ_{k=1}^n ‖x_k‖² for every n ≥ 1, and hence Σ_{k=1}^n ‖x_k‖² → ‖Σ_{k=1}^∞ x_k‖² as n → ∞.

(b) Consider the X-valued sequence {y_n}_{n=1}^∞ of partial sums of {x_k}_{k=1}^∞; that is, set y_n = Σ_{k=1}^n x_k for each integer n ≥ 1. According to Proposition 5.8 we know that

‖y_{n+m} − y_n‖² = Σ_{k=n+1}^{n+m} ‖x_k‖²

for every m, n ≥ 1. If Σ_{k=1}^∞ ‖x_k‖² < ∞, then sup_{m≥1} ‖y_{n+m} − y_n‖² = Σ_{k=n+1}^∞ ‖x_k‖² → 0 as n → ∞ (Problem 3.11), and hence {y_n}_{n=1}^∞ is a Cauchy sequence in X (Problem 3.51). If X is Hilbert, then {y_n}_{n=1}^∞ converges in X, which means that the series Σ_{k=1}^∞ x_k converges in X. □

Therefore, if {x_k}_{k=1}^∞ is an orthogonal sequence in a Hilbert space H, then Σ_{k=1}^∞ ‖x_k‖² < ∞ if and only if the infinite series Σ_{k=1}^∞ x_k converges in H and, in this case, ‖Σ_{k=1}^∞ x_k‖² = Σ_{k=1}^∞ ‖x_k‖².
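Corollary 5.9(b) can be illustrated in the Hilbert space ℓ₊² with the orthogonal vectors x_k = (1/k)e_k, where e_k denotes the k-th standard basis sequence. The Python sketch below (finite truncations and all names are my own illustrative choices) checks square-summability and the Cauchy estimate sup_{m≥1} ‖y_{n+m} − y_n‖² = Σ_{k=n+1}^∞ ‖x_k‖² → 0 used in the proof.

```python
# For x_k = (1/k) e_k in l2+, Proposition 5.8 gives
# ||y_{n+m} - y_n||^2 = sum_{k=n+1}^{n+m} 1/k^2 for the partial sums
# y_n = x_1 + ... + x_n.  Finite truncations stand in for the
# infinite sums; names are illustrative.

def tail_norm_sq(n, m):
    # ||y_{n+m} - y_n||^2 = sum_{k=n+1}^{n+m} ||x_k||^2
    return sum(1.0 / k**2 for k in range(n + 1, n + m + 1))

# square-summability: the partial sums of sum 1/k^2 stay below pi^2/6
total = sum(1.0 / k**2 for k in range(1, 100001))
assert total < 1.6449340668482264  # pi^2 / 6

# the tails shrink: sum_{k>n} 1/k^2 < 1/n, so {y_n} is Cauchy
assert tail_norm_sq(1000, 100000) < 1.0 / 1000
```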
Example 5G. Let {(X_k, ⟨ ; ⟩_k)}_{k=1}^∞ be a sequence of Hilbert spaces. Consider the Hilbert space ([⊕_{k=1}^∞ X_k]₂, ⟨ ; ⟩), where [⊕_{k=1}^∞ X_k]₂ is the linear space of all square-summable sequences in the full direct sum ⊕_{k=1}^∞ X_k and ⟨ ; ⟩ is the inner product of Example 5F; that is,

⟨x ; y⟩ = Σ_{k=1}^∞ ⟨x_k ; y_k⟩_k

for every x = {x_k}_{k=1}^∞ and y = {y_k}_{k=1}^∞ in [⊕_{k=1}^∞ X_k]₂. This is referred to as an (external) orthogonal direct sum. An orthogonal direct sum actually deserves its name. Indeed, if we identify each linear space X_i with the linear manifold ⊕_{k=1}^∞ X_i(k) of [⊕_{k=1}^∞ X_k]₂ such that X_i(k) = {0_k} ⊆ X_k for k ≠ i and X_i(i) = X_i (as in Example 4I), then it is clear that

X_i ⊥ X_j whenever i ≠ j,

where such an orthogonality is interpreted as ⊕_{k=1}^∞ X_i(k) ⊥ ⊕_{k=1}^∞ X_j(k) with respect to the inner product ⟨ ; ⟩ on [⊕_{k=1}^∞ X_k]₂. Observe that the norm ‖ ‖₂ induced on [⊕_{k=1}^∞ X_k]₂ by this inner product is given by

‖x‖₂² = ⟨x ; x⟩ = Σ_{k=1}^∞ ⟨x_k ; x_k⟩_k = Σ_{k=1}^∞ ‖x_k‖_k²

for every x = {x_k}_{k=1}^∞ in [⊕_{k=1}^∞ X_k]₂. This can also be verified via Corollary 5.9 as follows. Take an arbitrary vector x = {x_k}_{k=1}^∞ in [⊕_{k=1}^∞ X_k]₂ (i.e., take an arbitrary square-summable sequence from ⊕_{k=1}^∞ X_k). Set x_i(k) = δ_{ik} x_k in X_k for every k, i ≥ 1 (i.e., x_i(k) = 0_k if k ≠ i and x_i(i) = x_i). For each i ≥ 1 consider the vector x_i = {x_i(k)}_{k=1}^∞, so that x₁ = (x₁, 0₂, 0₃, ...) and, for i ≥ 2,

x_i = {x_i(k)}_{k=1}^∞ = (0₁, ..., 0_{i−1}, x_i, 0_{i+1}, ...)

in [⊕_{k=1}^∞ X_k]₂.

Claim: {x_i}_{i=1}^∞ is an orthogonal square-summable sequence. (Proof: Just note that x_i ∈ ⊕_{k=1}^∞ X_i(k), ‖x_i‖ = ‖x_i‖_i, and {x_i}_{i=1}^∞ is square-summable.) Thus Corollary 5.9 ensures that (i) the infinite series ⊕_{i=1}^∞ x_i converges in the Hilbert space ([⊕_{k=1}^∞ X_k]₂, ⟨ ; ⟩), and (ii) ‖⊕_{i=1}^∞ x_i‖² = Σ_{i=1}^∞ ‖x_i‖². Notational warning: We are denoting vector addition in the linear space ⊕_{k=1}^∞ X_k by ⊕ and vector subtraction by ⊖, as usual. From (i) we get x = ⊕_{i=1}^∞ x_i (reason: x ⊖ (⊕_{i=1}^n x_i) = (0₁, ..., 0_n, x_{n+1}, x_{n+2}, ...) for each n ≥ 1), and from (ii) we get ‖x‖² = Σ_{i=1}^∞ ‖x_i‖². Therefore,

‖x‖² = Σ_{i=1}^∞ ‖x_i‖² = Σ_{i=1}^∞ ‖x_i‖_i².

If (X, ⟨ ; ⟩) is an inner product space, and if M is a linear manifold of the linear space X, then it is easy to show that the restriction ⟨ ; ⟩_M : M × M → F of the inner product ⟨ ; ⟩ : X × X → F to M × M is an inner product on M, so that (M, ⟨ ; ⟩_M) is an inner product space. Moreover, the norm ‖ ‖_M : M → ℝ induced by the inner product ⟨ ; ⟩_M on M coincides with the restriction to M of the norm ‖ ‖ : X → ℝ induced by the inner product ⟨ ; ⟩ on X. Thus (M, ‖ ‖_M) is a linear manifold of the normed space (X, ‖ ‖). Whenever a linear manifold of an inner product space is regarded as an inner product space, it will always be understood that the inner product on it is the restricted inner product ⟨ ; ⟩_M. We shall drop the subscript and write (M, ⟨ ; ⟩) instead of (M, ⟨ ; ⟩_M), and refer to the inner product space (M, ⟨ ; ⟩) by simply saying that "M is a linear manifold of X". Recall that a subspace of a normed space is a closed linear manifold of it. Hence a subspace of
an inner product space X is a linear manifold of the linear space X that is closed in the inner product topology. That is, a subspace M of an inner product space X is a linear manifold of X that is closed as a subset of X when X is regarded as a metric space whose metric is that generated by the norm that is induced by the inner product. According to Proposition 4.4 a linear manifold of a Hilbert space H is a Hilbert space if and only if it is a subspace of H. We observed in Section 4.3 that the sum of subspaces is not necessarily a subspace (it is a linear manifold but may not be closed). An extremely important consequence of the Pythagorean Theorem is that the sum of orthogonal subspaces of a Hilbert space is again a subspace.
Theorem 5.10. (a) If M and N are complete orthogonal linear manifolds of an inner product space X, then the sum M + N is a complete linear manifold of X.

(b) If M and N are orthogonal subspaces of a Hilbert space H, then the sum M + N is a subspace of H.

Proof. Let M and N be orthogonal linear manifolds of an inner product space X. Take an arbitrary Cauchy sequence {x_n} in M + N, so that x_n = u_n + v_n with u_n in M and v_n in N for each n. Since M and N are linear manifolds of X, it follows that u_m − u_n lies in M and v_m − v_n lies in N, and hence u_m − u_n ⊥ v_m − v_n for every pair of integers m and n (because M ⊥ N). Writing x_m − x_n = (u_m − u_n) + (v_m − v_n), the Pythagorean Theorem ensures that

‖x_m − x_n‖² = ‖u_m − u_n‖² + ‖v_m − v_n‖²

for every m and n. This implies that {u_n} and {v_n} are Cauchy sequences in M and N, respectively (since {x_n} is a Cauchy sequence).

(a) If M and N are complete, then {u_n} converges in M and {v_n} converges in N. Recalling that addition is a continuous operation (Problem 4.1) we get from Corollary 3.8 that {x_n} converges in M + N. Conclusion: Every Cauchy sequence in M + N converges in M + N. Thus M + N is complete (and hence closed in X by Theorem 3.40(a)).

(b) If M and N are closed linear manifolds of a complete inner product space H, then they are complete (by Theorem 3.40(b)). Thus the linear manifold M + N of H is complete (and therefore closed in H) according to item (a). □

Theorem 5.10(a) fails if M and N are either (1) orthogonal but not complete or (2) complete but not orthogonal. Equivalently, Theorem 5.10(b) fails if M and N are either (1) orthogonal subspaces of an incomplete inner product space or (2) not orthogonal subspaces of a Hilbert space. That is, completeness and orthogonality are both crucial assumptions in the statement of Theorem 5.10. This will be verified in Problems 5.12 and 5.13. Let {M_γ}_{γ∈Γ} be an arbitrary nonempty indexed family of subspaces of a Hilbert space H (i.e., an arbitrary nonempty subcollection of Lat(H)). Recall that the sum of {M_γ}_{γ∈Γ} is the linear manifold of H consisting of all finite sums
of vectors in H with each summand being a vector in one of the subspaces M_γ. That is,

Σ_{γ∈Γ} M_γ = span (∪_{γ∈Γ} M_γ).
Corollary 5.11. Every finite sum of pairwise orthogonal subspaces of a Hilbert space H is itself a subspace of H.

Proof. Let H be a Hilbert space. Theorem 5.10(b) says that the sum of every pair of orthogonal subspaces of H is again a subspace of H. Take an arbitrary integer n ≥ 2. Suppose the sum of every set of n pairwise orthogonal subspaces of H is a subspace of H. Now take an arbitrary collection of n + 1 pairwise orthogonal subspaces of H, say {M_i}_{i=1}^{n+1}. If x ∈ Σ_{i=1}^n M_i, then x = Σ_{i=1}^n x_i with each x_i in M_i, and hence ⟨x ; x_{n+1}⟩ = Σ_{i=1}^n ⟨x_i ; x_{n+1}⟩ = 0 whenever x_{n+1} lies in M_{n+1} (because M_i ⊥ M_{n+1} for every i = 1, ..., n). Thus Σ_{i=1}^n M_i ⊥ M_{n+1}. Since Σ_{i=1}^n M_i was assumed to be a subspace of H, and since

Σ_{i=1}^{n+1} M_i = span (∪_{i=1}^{n+1} M_i) = span (∪_{i=1}^n M_i ∪ M_{n+1}) = Σ_{i=1}^n M_i + M_{n+1},

it follows by Theorem 5.10(b) that Σ_{i=1}^{n+1} M_i is a subspace of H. This completes the proof by induction. □
5.4 Orthogonal Complement
If A is a subset of an inner product space X, then the orthogonal complement of A is the set

A⊥ = {x ∈ X : x ⊥ A} = {x ∈ X : ⟨x ; y⟩ = 0 for every y ∈ A}

consisting of all vectors in X that are orthogonal to every vector in A. If A is the empty set ∅, then for every x in X there is no vector y in A for which ⟨x ; y⟩ ≠ 0, and hence ∅⊥ = X. Clearly, x ⊥ {0} for every x ∈ X, and x ⊥ X if and only if x = 0. Hence

{0}⊥ = X and X⊥ = {0}.

Let A and B be nonempty subsets of X. The next results are immediate consequences of the definition of orthogonal complement. A ⊥ A⊥,

A ∩ A⊥ ⊆ {0} and A ∩ A⊥ = {0} whenever 0 ∈ A

(reason: if there exists x ∈ A ∩ A⊥, then ⟨x ; x⟩ = 0), and

A ⊥ B if and only if A ⊆ B⊥.
Since ⊥ is a symmetric relation (i.e., A ⊥ B if and only if B ⊥ A), the above equivalent assertions also are equivalent to B ⊆ A⊥. Moreover,

A ⊥ B implies A ∩ B ⊆ {0}

(if A ⊆ B⊥, then A ∩ B ⊆ B⊥ ∩ B ⊆ {0}). It is readily verified that

A ⊆ B implies B⊥ ⊆ A⊥, and so A⊥⊥ ⊆ B⊥⊥,

where A⊥⊥ = (A⊥)⊥. Since A ⊥ A⊥ and A⊥ ⊥ A⊥⊥, we get A ⊆ A⊥⊥ (so that A⊥⊥⊥ ⊆ A⊥) and A⊥ ⊆ A⊥⊥⊥, where A⊥⊥⊥ = (A⊥⊥)⊥. Therefore,

A ⊆ A⊥⊥ and A⊥ = A⊥⊥⊥.
Proposition 5.12. The orthogonal complement A⊥ of every subset A of any inner product space X is a subspace of X. Moreover,

A⊥ = (A⊥)⁻ = (A⁻)⊥ = (span A)⊥ = (⋁A)⊥.

The orthogonal complement of every dense subset of X is the zero space:

A⊥ = {0} whenever A⁻ = X.

Proof. Suppose A ≠ ∅ (otherwise the results are trivially verified). Since the inner product is linear in the first argument, it follows at once that A⊥ is a linear manifold of the linear space X. If x ⊥ A, then x ⊥ Σ_{i=1}^n α_i y_i for every integer n ≥ 1 whenever y_i ∈ A and α_i ∈ F for each i = 1, ..., n, and hence A⊥ ⊆ (span A)⊥. On the other hand, A ⊆ span A, so that (span A)⊥ ⊆ A⊥. Then

A⊥ = (span A)⊥.

That A⊥ is closed in X is a consequence of the continuity of the inner product (Problem 5.6). Actually, if {x_n} is an A⊥-valued sequence that converges in X to x ∈ X, then (cf. Corollary 3.8) ⟨x ; y⟩ = ⟨lim x_n ; y⟩ = lim ⟨x_n ; y⟩ = 0 for every y ∈ A, which implies that x ∈ A⊥. Therefore, A⊥ is closed in X by the Closed Set Theorem (Theorem 3.30); that is,

A⊥ = (A⊥)⁻,

and so A⊥ is a subspace (i.e., a closed linear manifold) of X. Now take an arbitrary x in A⊥ and an arbitrary y in A⁻. By Proposition 3.27 there exists an A-valued sequence {y_n} that converges in X to y. Using Corollary 3.8 again, and recalling that the inner product is continuous, we get ⟨y ; x⟩ = ⟨lim y_n ; x⟩ = lim ⟨y_n ; x⟩ = 0. Thus x ⊥ A⁻, so that A⊥ ⊆ (A⁻)⊥. But (A⁻)⊥ ⊆ A⊥ because A ⊆ A⁻. Hence

A⊥ = (A⁻)⊥.
Since A⊥ = (span A)⊥ and A⊥ = (A⁻)⊥ for every subset A of X,

A⊥ = (span A)⊥ = [(span A)⁻]⊥ = (⋁A)⊥.

Finally, if A⁻ = X, then A⊥ = (A⁻)⊥ = X⊥ = {0}. □
Remark: If L ∈ L[X, Y], where X is an inner product space, then the linear transformation L|_{N(L)⊥} is injective (i.e., N(L|_{N(L)⊥}) = {0}). In fact, if v ∈ N(L)⊥ lies in N(L|_{N(L)⊥}), then v ∈ N(L) ∩ N(L)⊥ = {0}.

The next theorem is of critical importance; it may be thought of as a pivotal result in the theory of Hilbert spaces. Recall that the distance d(x, M) of a point x in a normed space X to a nonempty subset M of X is the real (nonnegative) number

d(x, M) = inf_{u∈M} ‖x − u‖.
Theorem 5.13. Let x be an arbitrary vector in a Hilbert space H.

(a) If M is a closed convex nonempty subset of H, then there exists a unique vector u_x in M such that

‖x − u_x‖ = d(x, M).

(b) Moreover, if M is a subspace of H, then the unique vector u_x in M for which ‖x − u_x‖ = d(x, M) is the unique vector in M such that the difference x − u_x is orthogonal to M; that is, such that

x − u_x ∈ M⊥.

Proof. (a) Let x be an arbitrary vector in H and let M be a nonempty subset of H, so that d(x, M) = inf_{u∈M} ‖x − u‖ in ℝ. Therefore, for each integer n ≥ 1 there exists u_n ∈ M such that

d(x, M) ≤ ‖x − u_n‖ < d(x, M) + 1/n.

Consider the M-valued sequence {u_n}. H is an inner product space and so the parallelogram law ensures that

‖2x − u_m − u_n‖² + ‖u_n − u_m‖² = 2(‖x − u_m‖² + ‖x − u_n‖²)

for each m, n ≥ 1. Since M is convex, it follows that ½(u_m + u_n) ∈ M, and hence 2d(x, M) ≤ 2‖½(u_m + u_n) − x‖ = ‖2x − u_m − u_n‖, so that

0 ≤ ‖u_m − u_n‖² ≤ 2(‖x − u_m‖² + ‖x − u_n‖² − 2d(x, M)²)
for every m, n ≥ 1. This inequality and the fact that ‖x − u_n‖ → d(x, M) as n → ∞ are enough to ensure that {u_n} is a Cauchy sequence in H, and therefore it converges in the Hilbert space H to, say, u_x ∈ H. But the norm is a continuous function, so that (Corollary 3.8)

‖x − u_x‖ = lim_n ‖x − u_n‖ = d(x, M).

Moreover, since M is closed in H and {u_n} is an M-valued sequence that converges to u_x in H, it follows by the Closed Set Theorem (Theorem 3.30) that u_x ∈ M. Conclusion: There exists u_x in M such that

‖x − u_x‖ = d(x, M).

To prove uniqueness, take any u in M such that ‖x − u‖ = d(x, M). Observe that ½(u_x + u) lies in M because M is convex, and hence d(x, M) ≤ ‖½(u_x + u) − x‖. Thus 4d(x, M)² ≤ ‖u_x + u − 2x‖². This inequality and the parallelogram law imply that

4d(x, M)² + ‖u_x − u‖² ≤ ‖u_x + u − 2x‖² + ‖u_x − u‖² = 2(‖u_x − x‖² + ‖u − x‖²) = 4d(x, M)².

Outcome: ‖u_x − u‖² = 0; that is, u = u_x.

(b) Now let x be an arbitrary vector in H and suppose M is a subspace of H, which obviously implies that M is a closed convex nonempty subset of H. According to item (a) there exists a unique u_x ∈ M such that ‖x − u_x‖ = d(x, M). Take an arbitrary nonzero u ∈ M. Since (u_x + αu) ∈ M for every scalar α, it follows that

d(x, M)² ≤ ‖x − u_x − αu‖² = ‖x − u_x‖² + |α|²‖u‖² − 2Re(ᾱ⟨x − u_x ; u⟩).

Setting α = ‖u‖⁻²⟨x − u_x ; u⟩ in the above inequality and recalling that ‖x − u_x‖² = d(x, M)², we get 2|⟨x − u_x ; u⟩|² ≤ |⟨x − u_x ; u⟩|², and hence |⟨x − u_x ; u⟩| = 0. Conclusion: x − u_x ⊥ u for every nonzero u in M, which implies

x − u_x ⊥ M.

Finally, we show that this u_x is the unique vector in M with the above property. Indeed, if v ∈ M is such that x − v ⊥ M, then ⟨x − v ; v − u⟩ = ⟨x − v ; v⟩ − ⟨x − v ; u⟩ = 0 whenever u ∈ M, so that x − v ⊥ v − u for every u ∈ M. Thus, by the Pythagorean Theorem,

‖x − v‖² ≤ ‖x − v‖² + ‖v − u‖² = ‖x − v + v − u‖² = ‖x − u‖²

for all u ∈ M. In particular, for u = u_x,

d(x, M) ≤ ‖x − v‖ ≤ ‖x − u_x‖ = d(x, M)
so that d(x, M) = ‖x − v‖, and hence v = u_x because u_x is the unique vector in M for which d(x, M) = ‖x − u_x‖. □

Let A be any nonempty subset of a Hilbert space H. Recall that the subspace ⋁A = (span A)⁻ of H is the closure in H of the set of all (finite) linear combinations of vectors in A. Take an arbitrary vector x in H. The vector u_x in ⋁A that minimizes the distance of x to ⋁A is called the best linear approximation of x in terms of A, and the difference x − u_x in H is called the error of the approximation of x in H by u_x in ⋁A. The next result is a straightforward consequence of Theorem 5.13, which is illustrated in the figure below.

Corollary 5.14. The best linear approximation of any vector x in a Hilbert space H in terms of a nonempty subset A of H is the vector u_x in ⋁A for which the error x − u_x is orthogonal to ⋁A.
[Figure: the best linear approximation u_x of x in ⋁A, with the error x − u_x orthogonal to ⋁A]

Proposition 5.15. Let M be a linear manifold of a Hilbert space H. Then

M⊥⊥ = M⁻, and M⊥ = {0} if and only if M⁻ = H.

In particular, if A is any subset of a Hilbert space H, then

A⊥⊥ = ⋁A, and A⊥ = {0} if and only if ⋁A = H.
Proof. Recall that M ⊆ M⊥⊥ and M⊥⊥ is closed in H according to Proposition 5.12. Then

M⁻ ⊆ M⊥⊥.

Since M⊥⊥ is a subspace (i.e., a closed linear manifold) of a Hilbert space, it follows that it is itself a Hilbert space, and hence M⁻ is a subspace of the Hilbert space M⊥⊥. Take an arbitrary x in M⊥⊥. Theorem 5.13 ensures that there exists u_x ∈ M⁻ such that x − u_x ∈ (M⁻)⊥ = M⊥ (see Proposition 5.12). But x − u_x ∈ M⊥⊥ because u_x ∈ M⁻ ⊆ M⊥⊥. Thus x − u_x ∈ M⊥ ∩ M⊥⊥, and hence x = u_x ∈ M⁻. Conclusion:

M⊥⊥ ⊆ M⁻.

Therefore M⁻ = M⊥⊥. If M⊥ = {0}, then M⊥⊥ = {0}⊥ = H, and so

M⁻ = H whenever M⊥ = {0}.

On the other hand, Proposition 5.12 says that M⊥ = {0} whenever M⁻ = H. Finally, if A is any subset of H, then A⊥ = (⋁A)⊥, so that A⊥⊥ = (⋁A)⊥⊥
(Proposition 5.12 again). Since ⋁A is a subspace of H, it follows that A⊥⊥ = ⋁A, and A⊥ = {0} if and only if ⋁A = H. □

It is worth noticing that the inner product space H in Theorem 5.13 was supposed to be complete only to ensure that the closed (convex and nonempty) subset M and the closed linear manifold M are complete (see Theorem 3.40(b)). In fact, Theorem 5.13 can be formulated in an inner product space setting by assuming that M is complete (as a metric space in the inner product topology), instead of assuming that M is closed in an inner product space H that is itself complete. We shall see next that Theorem 5.13 does not hold without the completeness assumption.
Example 5H. Let X be a proper dense linear manifold of a Hilbert space H, so that X is an inner product space that is not complete (Proposition 4.7). Take z ∈ H\X and set

M = {z}⊥ ∩ X,

where {z}⊥ is the orthogonal complement of {z} in H. Since {z}⊥ is a closed linear manifold of H (Proposition 5.12), it follows by Problem 3.38(d) that {z}⊥ ∩ X is closed in X. Thus the intersection M of the linear manifolds {z}⊥ and X of H is a linear manifold of X that is closed in X (i.e., M is a subspace of X). Moreover, if M = X, then X ⊆ {z}⊥, which implies {z}⊥⊥ ⊆ X⊥. But X⊥ = {0} because X⁻ = H (Proposition 5.12 again), and hence {z} ⊆ {z}⊥⊥ = {0}, so that z = 0, which contradicts the fact that z ∉ X. Therefore,

M is a proper subspace of X.

Take x ∈ X\M. Since x ∈ H, and since {z}⊥ is a subspace of H, it follows by Theorem 5.13 that there exists a unique u_x ∈ {z}⊥ such that x − u_x ∈ {z}⊥⊥ = ⋁{z} (see Proposition 5.15), and so u_x = x + αz for some scalar α (recall: ⋁{z} = span{z}). Hence

u_x ∉ X

because z ∉ X and x ∈ X. Theorem 5.13 also says that u_x is the unique vector in {z}⊥ such that

‖x − u_x‖ = d(x, {z}⊥).

If M is dense in {z}⊥ (i.e., if M⁻ = {z}⊥, where M⁻ is the closure of M in H), then d(x, M) = d(x, {z}⊥) according to Problem 3.43(b). In this case we may conclude that there is no vector u in M = {z}⊥ ∩ X such that ‖x − u‖ = d(x, M), which shows that Theorem 5.13 does not hold in the incomplete inner product space X.

We shall now exhibit a Hilbert space H, a dense linear manifold X of H, and a vector z in H\X for which M = {z}⊥ ∩ X is dense in {z}⊥. Set

H = ℓ₊² and X = ℓ₊⁰,

where ℓ₊⁰ is the linear manifold of ℓ₊² made up of all scalar-valued sequences with a finite number of nonzero entries.
Recall that ℓ₊⁰ is dense in ℓ₊² (Problem 3.44). Set z = {(1/2)^k}_{k=1}^∞ ∈ ℓ₊²\ℓ₊⁰ and take an arbitrary y ∈ {z}⊥. That is, y = {v_k}_{k=1}^∞ ∈ ℓ₊² is such that

Σ_{k=1}^∞ (1/2)^k v_k = 0.

For each integer n ≥ 1 consider the sequence

y_n = (v₁, ..., v_n, −2^{n+1} Σ_{k=1}^n (1/2)^k v_k, 0, 0, 0, ...)

in ℓ₊⁰. It is clear that ⟨y_n ; z⟩ = 0, and hence y_n ∈ M = {z}⊥ ∩ ℓ₊⁰, for every n ≥ 1. Moreover,

‖y_n − y‖² = |v_{n+1} + 2^{n+1} Σ_{k=1}^n (1/2)^k v_k|² + Σ_{k=n+2}^∞ |v_k|²

for each n ≥ 1. Since y ∈ {z}⊥ ⊂ ℓ₊², it follows that Σ_{k=n+2}^∞ |v_k|² → 0 as n → ∞ (Problem 3.11) and Σ_{k=1}^n (1/2)^k v_k = −Σ_{k=n+1}^∞ (1/2)^k v_k for every n ≥ 1. Recalling that Σ_{k=n+1}^∞ (1/2)^k = (1/2)^n for each n ≥ 0, we get

2^{n+1} |Σ_{k=1}^n (1/2)^k v_k| = 2^{n+1} |Σ_{k=n+1}^∞ (1/2)^k v_k| ≤ 2^{n+1} (sup_{k≥n+1} |v_k|) Σ_{k=n+1}^∞ (1/2)^k = 2 sup_{k≥n+1} |v_k| → 0

as n → ∞ (for lim_n sup_{k≥n+1} |v_k| = lim sup_n |v_n| = lim_n |v_n| = 0). Thus

y_n → y in ℓ₊².

Conclusion: For every y ∈ {z}⊥ there exists an M-valued sequence {y_n} that converges in ℓ₊² to y, which means that M is dense in {z}⊥ (Proposition 3.32(d)). That is,

M⁻ = {z}⊥.

Note that M ≠ {z}⊥ (e.g., the sequence (ζ₁ − ‖z‖²(ζ̄₁)⁻¹, ζ₂, ζ₃, ...) lies in {z}⊥\ℓ₊⁰ for every z = {ζ_k}_{k=1}^∞ in ℓ₊²\ℓ₊⁰ with ζ₁ ≠ 0), which implies that M is not closed in the Hilbert space H = ℓ₊², and hence M is not complete (Corollary 3.41).
The above example shows that completeness of H was crucial in the proof of Theorem 5.13. But it is not enough. Indeed, Theorem 5.13 does not necessarily hold in a Banach space that is not a Hilbert space (i.e., in a Banach space whose norm does not satisfy the parallelogram law). This is closely related to Lemma 4.33. In fact, what is behind that lemma is that there may exist a proper subspace M of a Banach space X for which d(x, M) < 1 for all x ∈ X with ‖x‖ = 1 or, equivalently, d(x, M) < ‖x‖ for every nonzero x in X. Suppose this is the case, and take an arbitrary x ∈ X\M. Since d(x, M) = d(x − u, M) for every x ∈ X and every u ∈ M, it follows that d(x, M) < ‖x − u‖ whenever u lies in M and x lies in X\M. Therefore, if x is a vector in X\M, then there is no vector u in M for which

‖x − u‖ = d(x, M).
5.5 Orthogonal Structure

Let {M_γ}_{γ∈Γ} be an arbitrary nonempty indexed family of subspaces of a Hilbert space H and consider their topological sum

(Σ_{γ∈Γ} M_γ)⁻ = [span (∪_{γ∈Γ} M_γ)]⁻ = ⋁(∪_{γ∈Γ} M_γ) = ⋁_{γ∈Γ} M_γ.
If {M_i}_{i=1}^n is a finite (nonempty) family of pairwise orthogonal (that is, M_i ⊥ M_j whenever i ≠ j) subspaces of a Hilbert space, then Corollary 5.11 says that

Σ_{i=1}^n M_i = (Σ_{i=1}^n M_i)⁻ = ⋁_{i=1}^n M_i.

That is, the topological sum and the ordinary (algebraic) sum of a finite family of pairwise orthogonal subspaces of a Hilbert space coincide. The next theorem is a countably infinite counterpart of the above italicized result, which emerges as an important consequence of Theorem 5.13.

Theorem 5.16. (The Orthogonal Structure Theorem). Let H be a Hilbert space and suppose {M_k}_{k∈ℕ} is a countably infinite family of pairwise orthogonal subspaces of H. If x ∈ (Σ_{k∈ℕ} M_k)⁻, then there exists a unique H-valued sequence {u_k}_{k=1}^∞ with u_k ∈ M_k for each k such that

x = Σ_{k=1}^∞ u_k.

Moreover, ‖x‖² = Σ_{k=1}^∞ ‖u_k‖². Conversely, if {u_k}_{k=1}^∞ is an H-valued sequence such that u_k ∈ M_k for each k and Σ_{k=1}^∞ ‖u_k‖² < ∞, then the infinite series Σ_{k=1}^∞ u_k converges in H and Σ_{k=1}^∞ u_k ∈ (Σ_{k∈ℕ} M_k)⁻.
Proof. If x ∈ (Σ_{k∈ℕ} M_k)⁻, then there exists a sequence {x(n)}_{n=1}^∞ of vectors in Σ_{k∈ℕ} M_k that converges to x in H (Proposition 3.27). Take an arbitrary integer n ≥ 1. Since x(n) ∈ Σ_{k∈ℕ} M_k, it follows that x(n) is a finite sum where each summand lies in one of the subspaces M_k. Thus x(n) can be written as

x(n) = Σ_{k=1}^{m_n} x(n)_k ∈ Σ_{k=1}^{m_n} M_k
with x(n)_k ∈ M_k for each k = 1, ..., m_n, where m_n is an integer that depends on n. Clearly, we may take m_{n+1} ≥ m_n ≥ n. Observe that the above finite sum may contain finitely many zero summands, and {M_k}_{k=1}^{m_n} ⊆ {M_k}_{k=1}^{m_{n+1}}.

Existence. Note that Σ_{k=1}^{m_n} M_k is a subspace of the Hilbert space H (Corollary 5.11). According to Theorem 5.13 there exists a unique vector

u_x(n) = Σ_{k=1}^{m_n} u_k ∈ Σ_{k=1}^{m_n} M_k,

with each u_k in M_k, such that ‖x − u_x(n)‖ ≤ ‖x − u‖ for all vectors u in Σ_{k=1}^{m_n} M_k and x − u_x(n) ⊥ Σ_{k=1}^{m_n} M_k. In particular,

‖x − Σ_{k=1}^{m_n} u_k‖ ≤ ‖x − x(n)‖.

Claim. The vectors u_k in the expansion of u_x(n) do not depend on n.

Proof. Since Σ_{k=1}^{m_{n+1}} M_k = Σ_{k=1}^{m_n} M_k + Σ_{k=m_n+1}^{m_{n+1}} M_k is a subspace of H, it follows by Theorem 5.13 that there exists a unique

u_x(n+1) = v + w ∈ Σ_{k=1}^{m_{n+1}} M_k,

with v ∈ Σ_{k=1}^{m_n} M_k and w ∈ Σ_{k=m_n+1}^{m_{n+1}} M_k, such that ‖x − u_x(n+1)‖ ≤ ‖x − u‖ for all vectors u in Σ_{k=1}^{m_{n+1}} M_k and x − u_x(n+1) ⊥ Σ_{k=1}^{m_{n+1}} M_k. Take an arbitrary z ∈ Σ_{k=1}^{m_n} M_k ⊆ Σ_{k=1}^{m_{n+1}} M_k and note that

0 = ⟨x − u_x(n+1) ; z⟩ = ⟨x − v − w ; z⟩ = ⟨x − v ; z⟩ − ⟨w ; z⟩ = ⟨x − v ; z⟩

(for M_j ⊥ M_k whenever j ≠ k, so that Σ_{k=m_n+1}^{m_{n+1}} M_k ⊥ Σ_{k=1}^{m_n} M_k, and hence ⟨w ; z⟩ = 0). Thus x − v ⊥ Σ_{k=1}^{m_n} M_k. But u_x(n) is the only vector in Σ_{k=1}^{m_n} M_k for which x − u_x(n) ⊥ Σ_{k=1}^{m_n} M_k. Therefore v = u_x(n). Outcome: u_x(n+1) = u_x(n) + w = Σ_{k=1}^{m_{n+1}} u_k with u_k ∈ M_k. □
Set {u_k}_{k=1}^∞ = ∪_{n≥1} {u_k}_{k=1}^{m_n}, which is a sequence of pairwise orthogonal vectors in H because {M_k}_{k∈ℕ} is an orthogonal family. Since n ≤ m_n, it follows by Proposition 5.8 that

Σ_{k=1}^n ‖u_k‖² ≤ Σ_{k=1}^{m_n} ‖u_k‖² = ‖Σ_{k=1}^{m_n} u_k‖² = ‖u_x(n)‖² ≤ (‖x − x(n)‖ + ‖x‖)².

But x(n) → x in H as n → ∞, so that Σ_{k=1}^∞ ‖u_k‖² ≤ ‖x‖². Hence (cf. Corollary 5.9) the infinite series Σ_{k=1}^∞ u_k converges in the Hilbert space H and ‖Σ_{k=1}^∞ u_k‖² =
Σ_{k=1}^∞ ‖u_k‖². Moreover, it in fact converges to x because the sequence {Σ_{k=1}^n u_k}_{n=1}^∞ of partial sums has a subsequence, namely {Σ_{k=1}^{m_n} u_k}_{n=1}^∞, that converges to x (see Proposition 3.5), and so ‖x‖² = Σ_{k=1}^∞ ‖u_k‖².

Uniqueness. Suppose x = Σ_{k=1}^∞ u_k = Σ_{k=1}^∞ v_k, where u_k and v_k lie in M_k for each k. Thus Σ_{k=1}^∞ (u_k − v_k) = 0 (see Problem 4.9(a)). But {u_k − v_k}_{k=1}^∞ is a sequence of pairwise orthogonal vectors in the inner product space H (because u_k − v_k ∈ M_k for each k and {M_k}_{k∈ℕ} is an orthogonal family), so that Σ_{k=1}^∞ ‖u_k − v_k‖² = ‖Σ_{k=1}^∞ (u_k − v_k)‖² = 0 by Corollary 5.9(a). Hence u_k = v_k for every k ≥ 1.

Converse. If {u_k}_{k=1}^∞ is an H-valued sequence with u_k ∈ M_k for each k, then it is a sequence of pairwise orthogonal vectors in H (since M_j ⊥ M_k whenever j ≠ k). If Σ_{k=1}^∞ ‖u_k‖² < ∞, then the infinite series Σ_{k=1}^∞ u_k converges in H according to Corollary 5.9(b); that is, Σ_{k=1}^n u_k → Σ_{k=1}^∞ u_k in H as n → ∞. Since Σ_{k=1}^n u_k lies in Σ_{k∈ℕ} M_k for each integer n ≥ 1, it follows by Proposition 3.27 that the limit Σ_{k=1}^∞ u_k lies in (Σ_{k∈ℕ} M_k)⁻. □

Two immediate consequences of the Orthogonal Structure Theorem:
Corollary 5.17. If {M_k}_{k∈ℕ} is a countably infinite family of pairwise orthogonal subspaces of a Hilbert space H, then

(Σ_{k∈ℕ} M_k)⁻ = {Σ_{k=1}^∞ u_k ∈ H : u_k ∈ M_k and Σ_{k=1}^∞ ‖u_k‖² < ∞}.

Corollary 5.18. Let {M_k}_{k∈ℕ} be a countably infinite orthogonal family of subspaces of a Hilbert space H. If it spans H; that is, if

⋁_{k∈ℕ} M_k = H,

then every vector x in H is uniquely expressed as

x = Σ_{k=1}^∞ u_k

in terms of an orthogonal sequence {u_k}_{k=1}^∞ with each u_k in each M_k.
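A finite-dimensional analogue of Corollary 5.18 is easy to test: take H = ℝ³ and let the M_k be the three coordinate axes, a pairwise orthogonal family that spans H. The Python sketch below (illustrative names, not from the text) recovers the unique decomposition x = Σ u_k and the norm identity ‖x‖² = Σ ‖u_k‖².

```python
# With H = R^3 and M_k the k-th coordinate axis, every x decomposes
# uniquely as x = u_1 + u_2 + u_3 with u_k in M_k, and
# ||x||^2 = sum_k ||u_k||^2.  Names are illustrative.

x = (3.0, -4.0, 12.0)
components = [tuple(xi if i == k else 0.0 for i, xi in enumerate(x))
              for k in range(3)]          # u_k lies in M_k

recombined = tuple(sum(u[i] for u in components) for i in range(3))
assert recombined == x                    # x = u_1 + u_2 + u_3
norm_sq = sum(xi**2 for xi in x)
assert norm_sq == sum(sum(ui**2 for ui in u) for u in components)
```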
Example 5I. Let {M_k} be a countable collection of subspaces of a Hilbert space (H, ⟨ ; ⟩), so that each (M_k, ⟨ ; ⟩) is itself a Hilbert space. If {M_k} is an infinite collection, then let it be indexed by ℕ. Consider the direct sum ⊕_k M_k. Recall that ⊕_k M_k is itself a linear space (but it is not a subset of H). Let [⊕_k M_k]₂ denote the linear manifold of the full direct sum ⊕_k M_k made up of all square-summable sequences {x_k} in ⊕_k M_k. That is,

[⊕_k M_k]₂ = {{x_k} ∈ ⊕_k M_k : Σ_k ‖x_k‖² < ∞},

where ‖ ‖
338
5. Hilbert Spaces
where ‖ ‖ is the norm induced on H by the inner product ⟨ ; ⟩. (It is clear that [⊕_k M_k]₂ coincides with ⊕_k M_k if the collection {M_k} is finite.) The function ⟨ ; ⟩_⊕ : [⊕_k M_k]₂ × [⊕_k M_k]₂ → F, given by

⟨x ; y⟩_⊕ = ∑_k ⟨x_k ; y_k⟩

for every x = {x_k} ∈ [⊕_k M_k]₂ and y = {y_k} ∈ [⊕_k M_k]₂, is an inner product on [⊕_k M_k]₂ that makes it into a Hilbert space (Examples 5E and 5F). If the collection {M_k} consists of pairwise orthogonal subspaces of H, then the Hilbert space ([⊕_k M_k]₂, ⟨ ; ⟩_⊕) is referred to as an (internal) orthogonal direct sum. In this case (i.e., if {M_k} is a collection of orthogonal subspaces), Corollary 5.17 (or Corollary 5.11, if the collection is finite) says that
(∑_k M_k)⁻ = {∑_k x_k ∈ H : {x_k} ∈ [⊕_k M_k]₂} and

[⊕_k M_k]₂ = {{x_k} ∈ ⊕_k M_k : ∑_k x_k ∈ (∑_k M_k)⁻}.

This establishes a natural mapping Φ : [⊕_k M_k]₂ → (∑_k M_k)⁻,

Φ({x_k}) = ∑_k x_k
for every {x_k} ∈ [⊕_k M_k]₂, which is clearly injective and surjective (by the Orthogonal Structure Theorem) and linear as well (a sum or scalar multiple of square-summable sequences is again a square-summable sequence; see Problem 4.9). Conclusion: Φ is an isomorphism of the linear space [⊕_k M_k]₂ onto the linear manifold (∑_k M_k)⁻ of the linear space H. Therefore, the orthogonal direct sum [⊕_k M_k]₂ and the topological sum (∑_k M_k)⁻ of pairwise orthogonal subspaces of a Hilbert space are isomorphic linear spaces. They are, in fact, isometrically isomorphic Hilbert spaces: the natural mapping Φ actually is an isometric isomorphism (next section), which provides a natural identification between them. It is customary to use the notation of the full direct sum to denote [⊕_k M_k]₂ when it is equipped with the above inner product ⟨ ; ⟩_⊕. Notation:

⊕_k M_k = ([⊕_k M_k]₂, ⟨ ; ⟩_⊕).

We shall follow the common usage. From now on ⊕_k M_k will denote the Hilbert space whose linear space is the set of all square-summable sequences in the full direct sum of a collection {M_k} of pairwise orthogonal subspaces of a Hilbert space H.
According to Problem 4.34 a pair of subspaces M and N of an inner product space X are complementary in X if they are algebraic complements of each other (i.e., M + N = X and M ∩ N = {0}). Recall that M ⊥ N implies M ∩ N = {0}.
Proposition 5.19. Orthogonal complementary subspaces in an inner product space are orthogonal complements of each other.
Proof. Let M and N be orthogonal complementary subspaces in an inner product
space X. Take an arbitrary x in M⊥ ⊆ X. Since M + N = X, it follows that x = u + v with u in M and v in N, and so ⟨x ; u⟩ = ⟨u ; u⟩ + ⟨v ; u⟩. But ⟨x ; u⟩ = ⟨v ; u⟩ = 0 (for x ⊥ M and N ⊥ M). Hence ‖u‖² = 0, which means that u = 0. Therefore x = v ∈ N. Conclusion: M⊥ ⊆ N. But N ⊆ M⊥ because M ⊥ N. Outcome: M⊥ = N (and, swapping the roles of M and N, N⊥ = M). □
The next result is the central theorem of Hilbert space geometry.
Theorem 5.20. (Projection Theorem - First version). Every Hilbert space H can be decomposed as

H = M + M⊥,

where M is any subspace of H.
Proof. Let M be an arbitrary subspace of a Hilbert space H. Since M⊥ is a subspace of H (Proposition 5.12) which is orthogonal to M (by its very definition), it follows by Theorem 5.10 that M + M⊥ is a subspace of H. Moreover, M ⊆ M + M⊥ and M⊥ ⊆ M + M⊥. Thus (M + M⊥)⊥ ⊆ M⊥ ∩ M⊥⊥ = {0} (a vector lying in both M⊥ and M⊥⊥ is orthogonal to itself), and hence M + M⊥ = (M + M⊥)⁻ = H (Proposition 5.15). □

Let M be an arbitrary subspace of a Hilbert space H. Since M⊥ is again a subspace of H, and since M ∩ M⊥ = {0}, what Theorem 5.20 says is: If M is any subspace of a Hilbert space H, then M and M⊥ are orthogonal complementary subspaces of H. This in fact is the converse of Proposition 5.19 in a Hilbert space setting. Moreover, according to Theorem 2.14, for each x ∈ H = M + M⊥ there exists a unique u in M and a unique v in M⊥ such that x = u + v and, by the Pythagorean Theorem, ‖x‖² = ‖u‖² + ‖v‖².
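The decomposition x = u + v promised by the Projection Theorem can be computed numerically. The following sketch (toy subspace and vector, assumed for illustration) projects x onto M, the span of two given vectors, via Gram–Schmidt, and checks the Pythagorean identity:

```python
import math

# A numerical sketch of H = M + M^perp in H = R^3 (toy data: M is the span
# of two assumed vectors; x is an assumed vector to be decomposed).

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def sub(a, b):
    return [s - t for s, t in zip(a, b)]

def orthonormalize(vectors):
    """Gram-Schmidt: an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = v
        for e in basis:
            c = dot(w, e)
            w = sub(w, [c * t for t in e])
        n = math.sqrt(dot(w, w))
        if n > 1e-12:
            basis.append([t / n for t in w])
    return basis

M_span = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
basis = orthonormalize(M_span)

x = [2.0, -1.0, 3.0]
u = [0.0, 0.0, 0.0]
for e in basis:                      # u = component of x in M
    c = dot(x, e)
    u = [ui + c * ei for ui, ei in zip(u, e)]
v = sub(x, u)                        # v = x - u lies in M^perp

assert all(abs(dot(v, m)) < 1e-9 for m in M_span)      # v is orthogonal to M
assert math.isclose(dot(x, x), dot(u, u) + dot(v, v))  # ||x||^2 = ||u||^2 + ||v||^2
```

The uniqueness of the pair (u, v) is exactly Theorem 2.14 applied to the algebraic complements M and M⊥.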
5.6 Unitary Equivalence
Recall that an isometry between metric spaces is a map that preserves distance, and hence is continuous. Proposition 4.37 placed linear isometries in a normed-space setting. The next one places them in an inner-product-space setting.
Proposition 5.21. Let X and Y be inner product spaces. A linear transformation V ∈ L[X, Y] is an isometry if and only if

⟨Vx₁ ; Vx₂⟩ = ⟨x₁ ; x₂⟩
for every x₁, x₂ ∈ X.

Proof. First note that the inner product on the left-hand side is the inner product on Y, and that on the right-hand side is the inner product on X. If the above identity holds, then ‖Vx‖² = ⟨Vx ; Vx⟩ = ⟨x ; x⟩ = ‖x‖² for every x ∈ X. Conversely, if ‖Vx‖ = ‖x‖ for every x ∈ X, then ⟨Vx₁ ; Vx₂⟩ = ⟨x₁ ; x₂⟩ for every x₁, x₂ ∈ X by the polarization identity of Proposition 5.4. Therefore, ⟨Vx₁ ; Vx₂⟩ = ⟨x₁ ; x₂⟩ for every x₁, x₂ ∈ X if and only if ‖Vx‖ = ‖x‖ for every x ∈ X; equivalently, if and only if V ∈ L[X, Y] is an isometry (Proposition 4.37). □

In other words, a linear isometry between inner product spaces is a linear transformation that preserves the inner product. Now recall that an isomorphism is an invertible linear transformation between linear spaces. These concepts (isometry and isomorphism) were combined in Section 4.7 to yield the notion of an isometric isomorphism between normed spaces, which is precisely a linear surjective isometry. Between inner product spaces it has a name of its own: an isometric isomorphism between inner product spaces is called a unitary transformation. That is, a unitary transformation of an inner product space onto an inner product space is a linear surjective isometry or, equivalently, an invertible linear isometry. According to Proposition 5.21, a unitary transformation between inner product spaces is a linear-space isomorphism that preserves the inner product. Thus a unitary transformation preserves the algebraic structure, the topological structure, and also the geometric structure between inner product spaces. In particular, it preserves convergence and Cauchy sequences (see Section 4.6), and so it also preserves separability and completeness (cf. Problem 3.48(b) and Theorem 3.44). Two inner product spaces, say X₁ and X₂, are called unitarily equivalent if there exists a unitary transformation between them (equivalently, if they are isometrically isomorphic; notation: X₁ ≅ X₂). Unitarily equivalent inner product spaces are regarded as essentially the same inner product space (i.e., they are indistinguishable, except perhaps by the nature of their points). The continuous linear extension results of Section 4.7 (viz., Theorem 4.35 and Corollaries 4.36 and 4.38) are trivially extended to inner product spaces (simply replace "normed space", "Banach space" and "isometric isomorphism" with "inner product space", "Hilbert space" and "unitary transformation", respectively). The completion results of Section 4.7 are (almost) immediately translated into the inner product space language by just recalling that in an inner-product-space setting two spaces are unitarily equivalent if and only if they are isometrically isomorphic.

Definition 5.22. If the image of a linear isometry on an inner product space X is a dense linear manifold of a Hilbert space H, then H is a completion of X. Equivalently, if an inner product space X is unitarily equivalent to a dense linear manifold of a Hilbert space H, then H is a completion of X.

Theorem 5.23. Every inner product space has a completion. Any two completions of an inner product space are unitarily equivalent. If H and K are completions of
two inner product spaces X and Y, respectively, then every operator T ∈ B[X, Y] has an extension T̃ ∈ B[H, K] over the completion H of X into the completion K of Y. Moreover, T̃ is unique up to unitary transformations and ‖T̃‖ = ‖T‖.
Proof. According to Theorem 4.40, every inner product space has a completion (as a normed space). The only question that has not been answered in Theorems 4.40 to 4.42 is whether this completion (which certainly is a Banach space) is a Hilbert space; in other words, whether the norm ‖ ‖_X̂ in the proof of Theorem 4.40 satisfies the parallelogram law whenever the norm ‖ ‖_X does. Consider the setup of the proof of Theorem 4.40 and suppose X is an inner product space. Recall that

‖[x]‖_X̂ = lim ‖x_n‖_X,

where x = {x_n} is an arbitrary element of an arbitrary coset [x] in the quotient space X̂. Take any pair of cosets [x] and [y] in X̂. Since the norm ‖ ‖_X on the inner product space X satisfies the parallelogram law,

‖[x] + [y]‖²_X̂ + ‖[x] − [y]‖²_X̂ = ‖[x + y]‖²_X̂ + ‖[x − y]‖²_X̂
= lim ‖x_n + y_n‖²_X + lim ‖x_n − y_n‖²_X
= lim (‖x_n + y_n‖²_X + ‖x_n − y_n‖²_X)
= lim 2(‖x_n‖²_X + ‖y_n‖²_X)
= 2(lim ‖x_n‖²_X + lim ‖y_n‖²_X) = 2(‖[x]‖²_X̂ + ‖[y]‖²_X̂)

by continuity (apply Corollary 3.8, recalling that squaring, addition and scalar multiplication are continuous). Therefore, the norm ‖ ‖_X̂ on X̂ also satisfies the parallelogram law, so that this Banach space is, in fact, a Hilbert space. □

We shall now return to orthogonal complements. Let M and N be linear manifolds of an inner product space X. If M and N are algebraic complements of each other
(i.e., if M + N = X and M ∩ N = {0}), then they are said to be complementary linear manifolds in X. Consider the natural mapping Φ of the linear space M ⊕ N into the linear space M + N, which was defined in Section 2.8 by the formula

Φ((u, v)) = u + v

for each (u, v) ∈ M ⊕ N. Let ⟨ ; ⟩ be the inner product on X and equip the direct sum M ⊕ N with the inner product ⟨ ; ⟩_⊕, viz.,

⟨(u₁, v₁) ; (u₂, v₂)⟩_⊕ = ⟨u₁ ; u₂⟩ + ⟨v₁ ; v₂⟩

for every (u₁, v₁) and (u₂, v₂) in M ⊕ N, as in Example 5E.
Proposition 5.24. If M and N are orthogonal complementary linear manifolds in an inner product space X, then the natural mapping Φ : M ⊕ N → M + N = X
is a unitary transformation, so that M ⊕ N and X are unitarily equivalent (i.e., X ≅ M ⊕ N).
Proof. Since M and N are complementary linear manifolds in X, it follows by Theorem 2.14 that Φ is an isomorphism of the linear space M ⊕ N onto the linear space M + N = X. If M ⊥ N, then

⟨Φ((u₁, v₁)) ; Φ((u₂, v₂))⟩ = ⟨u₁ + v₁ ; u₂ + v₂⟩ = ⟨u₁ ; u₂⟩ + ⟨v₁ ; v₂⟩ = ⟨(u₁, v₁) ; (u₂, v₂)⟩_⊕

for every (u₁, v₁) and (u₂, v₂) in M ⊕ N. Thus the natural mapping Φ is a linear-space isomorphism that preserves the inner product; that is, Φ is a unitary transformation. □
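A quick numeric check of Proposition 5.24 (with assumed toy subspaces M = span{e₁, e₂} and N = span{e₃, e₄} of R⁴): the natural mapping Φ((u, v)) = u + v preserves the direct-sum inner product precisely because the cross terms ⟨u ; v⟩ vanish when M ⊥ N.

```python
# Toy data (an assumption for illustration): vectors of M are supported on
# coordinates 0-1, vectors of N on coordinates 2-3, so M and N are orthogonal.

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def phi(u, v):
    """The natural mapping Phi((u, v)) = u + v."""
    return [s + t for s, t in zip(u, v)]

u1, v1 = [1.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, -1.0]
u2, v2 = [-2.0, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 4.0]

lhs = dot(phi(u1, v1), phi(u2, v2))   # <Phi((u1,v1)) ; Phi((u2,v2))> in X
rhs = dot(u1, u2) + dot(v1, v2)       # <(u1,v1) ; (u2,v2)> in the direct sum
assert abs(lhs - rhs) < 1e-12
```

If M and N were merely algebraic complements but not orthogonal, the cross terms ⟨u₁ ; v₂⟩ + ⟨v₁ ; u₂⟩ would survive and Φ would fail to be unitary.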
In light of Proposition 5.24 we may identify the inner product space X with the orthogonal direct sum M ⊕ N (equipped with the above inner product ⟨ ; ⟩_⊕) through the natural mapping Φ, whenever M and N are orthogonal complementary linear manifolds in X. Next suppose M and N are orthogonal complementary subspaces in a Hilbert space H. That is, suppose M and N are closed linear manifolds of a Hilbert space H such that M + N = H and M ⊥ N (and so M ∩ N = {0}). According to Proposition 5.19, N = M⊥ so that M + M⊥ = H, and hence M ⊕ M⊥ ≅ H by Proposition 5.24. In this case, it is usual to identify the orthogonal direct sum M ⊕ M⊥ with its unitarily equivalent image, Φ(M ⊕ M⊥) = M + M⊥ = H, and write M ⊕ M⊥ = H. Therefore, under such an identification, the central Theorem 5.20 can be restated as follows.

Theorem 5.25. (Projection Theorem - Second version). Every Hilbert space H has an orthogonal direct sum decomposition

H = M ⊕ M⊥,

where M is any subspace of H.
This leads to a useful notation for the orthogonal complement of a subspace M of a Hilbert space H, namely,
M⊥ = H ⊖ M.

Observe that both M and M⊥ are themselves Hilbert spaces, so that each of them can be further decomposed as direct sums of orthogonal complementary subspaces.
Example 5J. Let {M_k} be a countable collection of orthogonal subspaces of a Hilbert space (H, ⟨ ; ⟩). If {M_k} is countably infinite, then assume that it is indexed by N. Consider the natural isomorphism Φ of ⊕_k M_k onto (∑_k M_k)⁻ that was defined in Example 5I:

Φ({x_k}) = ∑_k x_k

for every {x_k} ∈ ⊕_k M_k.
Recall that ⊕_k M_k denotes the Hilbert space of all square-summable sequences in the full direct sum of {M_k}, and that the inner product on ⊕_k M_k is given by

⟨{x_k} ; {y_k}⟩ = ∑_k ⟨x_k ; y_k⟩

for every {x_k} and {y_k} in ⊕_k M_k (see Example 5I). Since M_j ⊥ M_k whenever j ≠ k, and since the inner product is continuous (in both arguments; Problem 5.6), it follows that

⟨Φ({x_k}) ; Φ({y_k})⟩ = ⟨∑_j x_j ; ∑_k y_k⟩ = ∑_j ∑_k ⟨x_j ; y_k⟩ = ∑_k ⟨x_k ; y_k⟩ = ⟨{x_k} ; {y_k}⟩

for every pair of square-summable sequences {x_k} and {y_k} in ⊕_k M_k. Thus the linear-space isomorphism Φ preserves the inner product, which means that it is a unitary transformation. Conclusion: The orthogonal direct sum ⊕_k M_k and the topological sum (∑_k M_k)⁻ of pairwise orthogonal subspaces of a Hilbert space are unitarily equivalent Hilbert spaces. That is,

⊕_k M_k ≅ (∑_k M_k)⁻
whenever {M_k} is a collection of pairwise orthogonal subspaces of a Hilbert space. This, in fact, is a restatement of the Orthogonal Structure Theorem (Theorem 5.16). (Recall: for a finite collection, (∑_k M_k)⁻ = ∑_k M_k by Corollary 5.11, so that ⊕_k M_k ≅ ∑_k M_k.) If, in addition, the orthogonal collection {M_k} spans H (i.e., if ⋁_k M_k = H), then it is usual to identify the orthogonal direct sum ⊕_k M_k with its unitarily equivalent image Φ(⊕_k M_k) = (∑_k M_k)⁻ = ⋁_k M_k = H. Therefore, under such an identification, Corollary 5.18 is restated as follows. If the orthogonal collection {M_k} spans H, then H = ⊕_k M_k.
5.7 Summability
The first half of this section pertains to Banach spaces and, as such, could have been introduced in Chapter 4. The final part, however, is a genuine Hilbert space subject.
Definition 5.26. Let Γ be any index set and let {x_γ}_{γ∈Γ} be an indexed family of vectors in a normed space X. {x_γ}_{γ∈Γ} is a summable family with sum x ∈ X (notation: x = ∑_{γ∈Γ} x_γ) if for each ε > 0 there exists a finite set of indices N_ε ⊆ Γ such that

‖x − ∑_{k∈N} x_k‖ < ε
for every finite subset N of Γ that includes N_ε (i.e., for every finite N such that N_ε ⊆ N ⊆ Γ). It is called a p-summable family for some p ≥ 1 if {‖x_γ‖^p}_{γ∈Γ} is a summable family of nonnegative numbers. In particular, {x_γ}_{γ∈Γ} is an absolutely summable family if {‖x_γ‖}_{γ∈Γ} is a summable family, and a square-summable family if {‖x_γ‖²}_{γ∈Γ} is a summable family.
It is readily verified from Definition 5.26 that if {x_γ}_{γ∈Γ} and {y_γ}_{γ∈Γ} are (similarly indexed) summable families of vectors in a normed space X with sums ∑_{γ∈Γ} x_γ and ∑_{γ∈Γ} y_γ in X, respectively, then {αx_γ + βy_γ}_{γ∈Γ} = α{x_γ}_{γ∈Γ} + β{y_γ}_{γ∈Γ} is again a summable family of vectors in X with sum

∑_{γ∈Γ} (αx_γ + βy_γ) = α ∑_{γ∈Γ} x_γ + β ∑_{γ∈Γ} y_γ

for every pair of scalars α and β. This shows that the collection of all summable families of vectors in X is a linear manifold of the linear space X^Γ (see Example 2F), and hence is a linear space itself. It is also easy to verify that if {x_γ}_{γ∈Γ} is a summable family of vectors in X with sum ∑_{γ∈Γ} x_γ in X, and if T ∈ B[X, Y] for some normed space Y, then {Tx_γ}_{γ∈Γ} is a summable family of vectors in Y with sum

∑_{γ∈Γ} Tx_γ = T ∑_{γ∈Γ} x_γ.
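Both identities above can be illustrated with finite sums (a hypothetical matrix T and a handful of vectors stand in for the bounded operator and the summable family):

```python
import math

# Finite sketch: a bounded linear map T on R^2 pulls through a sum of
# vectors, T(sum x_k) = sum T(x_k). The matrix and vectors are toy data.

def mat_vec(T, x):
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

T = [[1.0, 2.0],
     [0.0, -1.0]]
xs = [[1.0, 0.5], [-2.0, 3.0], [0.25, 0.25]]

sum_x = [sum(x[i] for x in xs) for i in range(2)]
lhs = mat_vec(T, sum_x)                                      # T(sum x_k)
rhs = [sum(mat_vec(T, x)[i] for x in xs) for i in range(2)]  # sum T(x_k)
assert all(math.isclose(a, b) for a, b in zip(lhs, rhs))
```

For genuinely infinite families the same identity holds in the limit, by the continuity of T.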
Theorem 5.27. (Cauchy Criterion). Let {x_γ}_{γ∈Γ} be an indexed family of vectors in a normed space X and consider the following assertions.

(a) {x_γ}_{γ∈Γ} is a summable family.

(b) For every ε > 0 there exists a finite set of indices N_ε ⊆ Γ such that

‖∑_{k∈N} x_k‖ < ε

for every finite subset N of Γ that is disjoint from N_ε (i.e., whenever N ⊆ Γ is finite and N ∩ N_ε = ∅).

Claim: (a) implies (b), and (b) implies (a) if X is a Banach space.

Proof. Suppose (a) holds true and take an arbitrary ε > 0. According to Definition 5.26 there exists a finite N_ε ⊆ Γ such that
‖∑_{k∈N'} x_k − ∑_{γ∈Γ} x_γ‖ < ε/2

whenever N' is finite and N_ε ⊆ N' ⊆ Γ. If N is a finite subset of Γ and N ∩ N_ε = ∅, then ∑_{k∈N} x_k = ∑_{k∈N∪N_ε} x_k − ∑_{k∈N_ε} x_k. Since the set N ∪ N_ε is finite and
N_ε ⊆ N ∪ N_ε ⊆ Γ, it follows that

‖∑_{k∈N} x_k‖ ≤ ‖∑_{k∈N∪N_ε} x_k − ∑_{γ∈Γ} x_γ‖ + ‖∑_{k∈N_ε} x_k − ∑_{γ∈Γ} x_γ‖ < ε.

Thus (a)⇒(b). Conversely, if (b) holds true, then there exists a sequence {N_i'}_{i=1}^∞ of finite subsets of Γ such that, for each i ≥ 1, ‖∑_{k∈N'} x_k‖ < 1/i whenever N' ⊆ Γ is finite and N' ∩ N_i' = ∅. Set N_i = ∪_{j=1}^i N_j', which is a finite subset of Γ such that N_i ⊆ N_{i+1} for each i ≥ 1. Take any i ≥ 1 and an arbitrary finite N ⊆ Γ such that N ∩ N_i = ∅. Since N ∩ N_i' = ∅, it follows that ‖∑_{k∈N} x_k‖ < 1/i. Conclusion: {N_i}_{i=1}^∞ is an increasing sequence of finite subsets of Γ such that, for each i ≥ 1,
‖∑_{k∈N} x_k‖ < 1/i whenever N ⊆ Γ is finite and N ∩ N_i = ∅. Set y_i = ∑_{k∈N_i} x_k for each i ≥ 1. Since {N_i}_{i=1}^∞ is an increasing sequence we get

y_{i+j} − y_i = ∑_{k∈N_{i+j}} x_k − ∑_{k∈N_i} x_k = ∑_{k∈N_{i+j}\N_i} x_k

for every i, j ≥ 1, which implies

‖y_{i+j} − y_i‖ ≤ ‖∑_{k∈N_{i+j}\N_i} x_k‖ < 1/i

for every i ≥ 1 and all j ≥ 1 (reason: for every i, j ≥ 1, N_{i+j}\N_i is a finite subset of Γ and (N_{i+j}\N_i) ∩ N_i = ∅). Hence (see Problem 3.51) {y_i}_{i=1}^∞ is a Cauchy sequence in X. If X is a Banach space, then {y_i}_{i=1}^∞ converges in X to, say, x ∈ X. Therefore, for each ε > 0 there exists an integer i_ε ≥ 1 such that both 1/i_ε < ε/2 and

‖∑_{k∈N_i} x_k − x‖ < ε/2 whenever i ≥ i_ε.

If N is finite and N_{i_ε} ⊆ N ⊆ Γ, then N\N_{i_ε} ⊆ Γ is finite and (N\N_{i_ε}) ∩ N_{i_ε} = ∅, so that ‖∑_{k∈N\N_{i_ε}} x_k‖ < 1/i_ε. Hence

‖∑_{k∈N} x_k − x‖ = ‖∑_{k∈N\N_{i_ε}} x_k + ∑_{k∈N_{i_ε}} x_k − x‖ ≤ ‖∑_{k∈N\N_{i_ε}} x_k‖ + ‖∑_{k∈N_{i_ε}} x_k − x‖ < 1/i_ε + ε/2 < ε,
and therefore (b)⇒(a). □

Corollary 5.28. If {x_γ}_{γ∈Γ} is a summable family of vectors in a normed space X, then the set {γ ∈ Γ : x_γ ≠ 0} is countable.

Proof. Let {x_γ}_{γ∈Γ} be a summable family of vectors in a normed space X. According to the previous theorem, for every integer n ≥ 1 there exists a finite subset N_n of Γ such that ‖∑_{k∈N} x_k‖ < 1/n whenever N ⊆ Γ is finite and N ∩ N_n = ∅. Put S = ∪_{n=1}^∞ N_n ⊆ Γ and recall from Corollary 1.11 that S is a countable set. If γ ∈ Γ\S, then {γ} ∩ N_n = ∅, and hence ‖x_γ‖ < 1/n, for every n ≥ 1, which implies that x_γ = 0. Thus x_γ is a nonzero vector in X only if γ lies in the countable set S. □
What Corollary 5.28 says is that an uncountable indexed family of vectors in a normed space may be summable but, in this case, it has only a countable number of nonzero vectors.

Corollary 5.29. Every absolutely summable family of vectors in a Banach space is a summable family.

Proof. Let {x_γ}_{γ∈Γ} be an absolutely summable family of vectors in a normed space X so that {‖x_γ‖}_{γ∈Γ} is a summable family of nonnegative numbers, and hence (cf. Theorem 5.27) for every ε > 0 there exists a finite N_ε ⊆ Γ such that

‖∑_{k∈N} x_k‖ ≤ ∑_{k∈N} ‖x_k‖ < ε

whenever N ⊆ Γ is finite and N ∩ N_ε = ∅. Another application of Theorem 5.27 ensures that {x_γ}_{γ∈Γ} is a summable family if X is a Banach space. □

The converse of Corollary 5.29 holds for finite-dimensional Banach spaces. That is, if X is a finite-dimensional normed space, then every summable family of vectors in X is absolutely summable. But it fails in general. In fact, Dvoretzky and Rogers proved in 1950 that there exist summable families of vectors in infinite-dimensional Banach spaces that are not absolutely summable.
Proposition 5.30. If X is a finite-dimensional normed space, then {x_γ}_{γ∈Γ} is a summable family of vectors in X if and only if it is absolutely summable.
Proof. Recall that a finite-dimensional normed space is a Banach space (Corollary 4.28). Then, according to Corollary 5.29, it remains to show that every summable family of vectors in a finite-dimensional normed space is absolutely summable. Consider a normed space (X, ‖ ‖) with dim X = n for some positive integer n and let B = {e_j}_{j=1}^n be a Hamel basis for X. Take an arbitrary x ∈ X and consider its unique expansion on B,

x = ∑_{j=1}^n ξ_j e_j
where {ξ_j}_{j=1}^n is a family of scalars consisting of the coordinates of x with respect to the basis B. It is readily verified that the function ‖ ‖₁ : X → R, defined by

‖x‖₁ = ∑_{j=1}^n |ξ_j|

for every x ∈ X, is a norm on X. Since any two norms on X are equivalent, it follows that there exist real constants α > 0 and β > 0 such that

‖x‖₁ ≤ β‖x‖ and ‖x‖ ≤ α‖x‖₁
for every x E X (Proposition 4.26 and Theorem 4.27). Now suppose (xy)yEr is a summable family of vectors in X and take an arbitrary s > 0. Theorem 5.27 ensures the existence of a finite N£ c r such that
EIDj(k)I j=I kEN
=
PUEXkI
111: Xk1I kEN
and hence
for every
j=I,...n,
kEN
whenever N c t is finite and N n NE = 0, where (1(k)}..1 are the coordinates of each xk with respect to B. That is, Xk = r1=1tj((k))ej with IIxkIII = and therefore F-kENXk = L-kEN_j=I tj(k)ej = E'=I LkEN J(k)ej is the unique j(k)I. Take any expansion of >kENXk on B so that IIEkENXkIII = .j=I IJkEN finite subset N of r for which N n NE = 0 and observe that n IIxk11
keN
< a1: 1Ixk11l *EN n
a1: 1: Iij(k)I keN j=1 n
a1: EIRe4j(k)I+a1: 1: IImij(k)l. j=IkEN
j=IkEN
Set N+ = (k E N : Re j (k) > 0) and N _ {k E N : Re 4 j (k) < 0} for each j = t, ... , n, which (as subsets of N) are finite subsets of r such that N n NE =
0 and NJ-. n N. = 0 for every j = I,... , n. Thus I>kEN+tJ(k)l < fis and
|∑_{k∈N_j⁻} ξ_j(k)| < βε for all j = 1, ..., n. Hence,

∑_{j=1}^n ∑_{k∈N} |Re ξ_j(k)| = ∑_{j=1}^n ∑_{k∈N_j⁺} Re ξ_j(k) − ∑_{j=1}^n ∑_{k∈N_j⁻} Re ξ_j(k)
= ∑_{j=1}^n |Re (∑_{k∈N_j⁺} ξ_j(k))| + ∑_{j=1}^n |Re (∑_{k∈N_j⁻} ξ_j(k))|
≤ ∑_{j=1}^n |∑_{k∈N_j⁺} ξ_j(k)| + ∑_{j=1}^n |∑_{k∈N_j⁻} ξ_j(k)| < 2nβε.

Similarly, ∑_{j=1}^n ∑_{k∈N} |Im ξ_j(k)| < 2nβε, and so

∑_{k∈N} ‖x_k‖ < 4nαβε.

Conclusion: {x_γ}_{γ∈Γ} is an absolutely summable family. □
Remark: If a family of vectors in a normed space X is indexed by N (or by N₀), then it can be viewed as an X-valued sequence (the very indexing process establishes a function from N to X). If {x_k}_{k∈N} is a summable family, then {x_k}_{k=1}^∞ is a summable sequence (or, equivalently, the infinite series ∑_{k=1}^∞ x_k converges). Indeed, if for each ε > 0 there exists a finite N_ε ⊆ N such that ‖∑_{k∈N} x_k − x‖ < ε for some x ∈ X whenever N is finite and N_ε ⊆ N ⊆ N, then by setting n_ε = max N_ε it follows that ‖∑_{k=1}^n x_k − x‖ < ε whenever n ≥ n_ε. However, the converse fails even for scalar-valued sequences. For instance, consider the sequence {x_k}_{k=1}^∞ with x_{2k−1} = −x_{2k} = 1/k. It is clear that the infinite series ∑_{k=1}^∞ x_k converges (since |∑_{k=1}^n x_k| ≤ 2/n for every n ≥ 1) but is not absolutely convergent (since ∑_{k=1}^{2n} |x_k| = 2 ∑_{k=1}^n 1/k for every n ≥ 1). Thus {x_k}_{k=1}^∞ is a summable sequence but not an absolutely summable sequence, and hence {x_k}_{k∈N} is not an absolutely summable family of vectors in the one-dimensional normed space R, so that {x_k}_{k∈N} is not a summable family by Proposition 5.30. An X-valued sequence {x_k}_{k=1}^∞ for which {x_k}_{k∈N} is a summable family of vectors in X is referred to as an unconditionally summable sequence. In this case it is also common to say that the infinite series ∑_{k=1}^∞ x_k is unconditionally convergent.
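The failure of unconditional summability in the example above can be observed numerically: summed in the given order, the partial sums of x_{2k−1} = −x_{2k} = 1/k vanish, while a rearrangement that takes two positive terms per negative term drives the partial sums toward ln 2 ≈ 0.693 instead (the truncation level below is an arbitrary choice for illustration):

```python
# Conditional vs. unconditional convergence: the sequence x_{2k-1} = 1/k,
# x_{2k} = -1/k summed in order vs. summed in a rearranged order.

def x(j):                        # j = 1, 2, 3, ... in the original order
    k = (j + 1) // 2
    return 1.0 / k if j % 2 == 1 else -1.0 / k

n = 10_000                       # arbitrary truncation for the demo
ordered = sum(x(j) for j in range(1, n + 1))
assert abs(ordered) < 1e-9       # in the given order the pairs cancel

# Rearrangement: two positive terms for every negative one.
rearranged, pos, neg = 0.0, 0, 0
for _ in range(n // 3):
    pos += 1; rearranged += 1.0 / pos
    pos += 1; rearranged += 1.0 / pos
    neg += 1; rearranged -= 1.0 / neg
# The rearranged partial sums approach ln 2 ~ 0.693, not 0.
assert 0.6 < rearranged < 0.8
```

Since the limit depends on the order of summation, no single vector can serve as the sum over the unordered index set, which is exactly why the family fails to be summable in the sense of Definition 5.26.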
Proposition 5.31. Let {x_γ}_{γ∈Γ} be a family of vectors in a normed space X and take any p ≥ 1. The following assertions are equivalent.

(a) {x_γ}_{γ∈Γ} is a p-summable family.

(b) sup_N ∑_{k∈N} ‖x_k‖^p < ∞, where the supremum is taken over all finite subsets of Γ, which is expressed by writing ∑_{γ∈Γ} ‖x_γ‖^p < ∞.
Moreover, if any of the above equivalent assertions holds, then

∑_{γ∈Γ} ‖x_γ‖^p = sup_N ∑_{k∈N} ‖x_k‖^p.
Proof. Suppose (a) holds true. Theorem 5.27 ensures that for each ε > 0 there exists a finite subset N_ε of Γ such that ∑_{k∈N} ‖x_k‖^p < ε whenever N is a finite subset of Γ such that N ∩ N_ε = ∅. If (b) fails, then for every integer n ≥ 1 there exists a finite subset N_n of Γ such that ∑_{k∈N_n} ‖x_k‖^p > n. Therefore, since (N_n\N_ε) ∩ N_ε = ∅, it follows that

n < ∑_{k∈N_n} ‖x_k‖^p = ∑_{k∈N_n\N_ε} ‖x_k‖^p + ∑_{k∈N_n∩N_ε} ‖x_k‖^p < ε + ∑_{k∈N_ε} ‖x_k‖^p

for every integer n ≥ 1, which is a contradiction. Hence (a) implies (b). Conversely, suppose (b) holds and set α = sup_N ∑_{k∈N} ‖x_k‖^p. Thus for every ε > 0 there exists a finite subset N_ε of Γ for which α − ε < ∑_{k∈N_ε} ‖x_k‖^p. If N is any finite subset of Γ such that N ∩ N_ε = ∅, then

∑_{k∈N} ‖x_k‖^p = ∑_{k∈N∪N_ε} ‖x_k‖^p − ∑_{k∈N_ε} ‖x_k‖^p ≤ α − (α − ε) = ε.

Hence (b) implies (a) by Theorem 5.27 (in the Banach space R). Finally, if any of the above equivalent assertions holds, then for every ε > 0 there exists a finite subset N_ε of Γ such that

|∑_{k∈N} ‖x_k‖^p − ∑_{γ∈Γ} ‖x_γ‖^p| < ε

or, equivalently, such that

∑_{k∈N} ‖x_k‖^p < ∑_{γ∈Γ} ‖x_γ‖^p + ε and ∑_{γ∈Γ} ‖x_γ‖^p < ∑_{k∈N} ‖x_k‖^p + ε,

whenever N is finite and N_ε ⊆ N ⊆ Γ (Definition 5.26). Hence

sup_N ∑_{k∈N} ‖x_k‖^p ≤ ∑_{γ∈Γ} ‖x_γ‖^p + ε ≤ sup_N ∑_{k∈N} ‖x_k‖^p + 2ε

for every ε > 0, where the supremum is taken over all finite subsets of Γ (recall: ∑_{k∈N₁} ‖x_k‖^p ≤ ∑_{k∈N₂} ‖x_k‖^p whenever N₁ and N₂ are finite subsets of Γ such that N₁ ⊆ N₂). By taking the infimum over all ε > 0 in the above inequalities,

∑_{γ∈Γ} ‖x_γ‖^p = sup_N ∑_{k∈N} ‖x_k‖^p. □
Let {x_γ}_{γ∈Γ} be a family of vectors in a normed space X. It is obvious that ∑_{k∈N} ‖x_k‖^p < ∞ for every finite subset N of Γ. By convention, the empty sum
is null (i.e., ∑_{k∈∅} ‖x_k‖^p = 0; in general, ∑_{k∈∅} x_k = 0 ∈ X). If {x_γ}_{γ∈Γ} is a summable family, then it has only a countable set of nonzero vectors (Corollary 5.28). If this set is infinite (countably infinite, that is), then the family of all nonzero vectors from {x_γ}_{γ∈Γ} can be indexed by N, say {x_k}_{k∈N} (or by any other countably infinite index set). Clearly, {x_γ}_{γ∈Γ} is a summable (a p-summable) family if and only if {x_k}_{k∈N} is. If {x_k}_{k=1}^∞ is any sequence containing all vectors from {x_k}_{k∈N}, then it follows by the remark that precedes Proposition 5.31 that {x_γ}_{γ∈Γ} is a summable family if and only if {x_k}_{k=1}^∞ is an unconditionally summable sequence. In particular, {x_γ}_{γ∈Γ} is a p-summable family if and only if {‖x_k‖^p}_{k=1}^∞ is an unconditionally summable sequence of positive numbers.
Example 5K. Let Γ be any index set and let ℓ²_Γ denote the collection of all square-summable families {ξ_γ}_{γ∈Γ} of scalars in F (as usual, F stands either for the real field R or for the complex field C). It is easy to check that ℓ²_Γ is a linear space over F. Observe that {ξ_γ ῡ_γ}_{γ∈Γ} is a summable family whenever x = {ξ_γ}_{γ∈Γ} and y = {υ_γ}_{γ∈Γ} are square-summable families. Indeed, by the Hölder inequality for finite sums,

∑_{k∈N} |ξ_k ῡ_k| ≤ (∑_{k∈N} |ξ_k|²)^{1/2} (∑_{k∈N} |υ_k|²)^{1/2}

for every finite subset N of Γ. Therefore,

∑_{γ∈Γ} |ξ_γ ῡ_γ| ≤ (∑_{γ∈Γ} |ξ_γ|²)^{1/2} (∑_{γ∈Γ} |υ_γ|²)^{1/2} < ∞

by Proposition 5.31, which implies that {ξ_γ ῡ_γ}_{γ∈Γ} is an absolutely summable family of scalars, and hence a summable family in the Banach space F (Corollary 5.29). Thus the function ⟨ ; ⟩ : ℓ²_Γ × ℓ²_Γ → F, given by

⟨x ; y⟩ = ∑_{γ∈Γ} ξ_γ ῡ_γ

for every x = {ξ_γ}_{γ∈Γ} and y = {υ_γ}_{γ∈Γ} in ℓ²_Γ, is well-defined. Moreover, it is readily verified that it defines an inner product on ℓ²_Γ. In fact, it is not difficult to show that (ℓ²_Γ, ⟨ ; ⟩) is a Hilbert space.

In particular, if Γ = N, then (ℓ²_Γ, ⟨ ; ⟩) is reduced to the Hilbert space (ℓ²₊, ⟨ ; ⟩) of Example 5B. The natural generalization is easy. Let ℓ²_Γ(H) denote the linear space of all square-summable families {x_γ}_{γ∈Γ} of vectors in a Hilbert space H. Proceeding as above, it can be shown that the function ⟨ ; ⟩ : ℓ²_Γ(H) × ℓ²_Γ(H) → F given by

⟨{x_γ}_{γ∈Γ} ; {y_γ}_{γ∈Γ}⟩ = ∑_{γ∈Γ} ⟨x_γ ; y_γ⟩_H
for every {x_γ}_{γ∈Γ} and {y_γ}_{γ∈Γ} in ℓ²_Γ(H), defines an inner product on ℓ²_Γ(H), and (ℓ²_Γ(H), ⟨ ; ⟩) is a Hilbert space. Again, if Γ = N, then (ℓ²_Γ(H), ⟨ ; ⟩) is reduced to the Hilbert space (ℓ²₊(H), ⟨ ; ⟩) of Example 5F.

From now on suppose X is an inner product space. It is easy to show from Definition 5.26 that if {x_γ}_{γ∈Γ} and {y_γ}_{γ∈Γ} are (similarly indexed) summable families of vectors in X with sums ∑_{γ∈Γ} x_γ and ∑_{γ∈Γ} y_γ in X, respectively, and if x and y are arbitrary vectors in X, then {⟨x_γ ; y⟩}_{γ∈Γ} and {⟨x ; y_γ⟩}_{γ∈Γ} are summable families of scalars with sums
∑_{γ∈Γ} ⟨x_γ ; y⟩ = ⟨∑_{γ∈Γ} x_γ ; y⟩ and ∑_{γ∈Γ} ⟨x ; y_γ⟩ = ⟨x ; ∑_{γ∈Γ} y_γ⟩.
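The Hölder estimate underlying the ℓ²_Γ inner product of Example 5K can be probed numerically (the truncated toy families ξ_k = 1/k and υ_k = 1/(k+1) below are assumptions for illustration):

```python
import math

# For square-summable scalar families, the series defining <x;y> is
# absolutely convergent, dominated by the Holder (Cauchy-Schwarz) bound.
# Truncation at K stands in for the sum over the index set.

K = 100_000
xi = [1.0 / k for k in range(1, K + 1)]
up = [1.0 / (k + 1) for k in range(1, K + 1)]

inner_abs = sum(abs(a * b) for a, b in zip(xi, up))
bound = math.sqrt(sum(a * a for a in xi)) * math.sqrt(sum(b * b for b in up))

assert inner_abs <= bound   # Holder inequality for finite sums
# Here sum |xi_k * up_k| telescopes to 1 - 1/(K+1), so it stays below 1.
assert inner_abs < 1.0
```

Since the partial absolute sums are monotone and bounded, Proposition 5.31 applies and the inner product is well-defined.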
The next result is the general version of the Pythagorean Theorem. It extends Corollary 5.9 to any infinite orthogonal family of vectors in an inner product space.

Theorem 5.32. Let {x_γ}_{γ∈Γ} be a family of pairwise orthogonal vectors in an inner product space X.

(a) If {x_γ}_{γ∈Γ} is a summable family, then it is a square-summable family and ‖∑_{γ∈Γ} x_γ‖² = ∑_{γ∈Γ} ‖x_γ‖².

(b) If X is a Hilbert space and {x_γ}_{γ∈Γ} is a square-summable family, then {x_γ}_{γ∈Γ} is a summable family.
Proof. Let {x_γ}_{γ∈Γ} be an orthogonal family of vectors in X.

(a) If {x_γ}_{γ∈Γ} is a summable family, then for every ε > 0 there exists a finite N_ε ⊆ Γ such that

∑_{k∈N} ‖x_k‖² = ‖∑_{k∈N} x_k‖² < ε²

whenever N ⊆ Γ is finite and N ∩ N_ε = ∅ (by Proposition 5.8 and Theorem 5.27). Another application of Theorem 5.27 ensures that {‖x_γ‖²}_{γ∈Γ} is a summable family in the Banach space R. Moreover, since x_α ⊥ x_β whenever x_α and x_β are distinct vectors from {x_γ}_{γ∈Γ} (i.e., ⟨x_α ; x_β⟩ = 0 for every α, β ∈ Γ such that α ≠ β), it follows that {⟨x_γ ; x_γ⟩}_{γ∈Γ} includes the family of all nonzero scalars from the family {⟨x_α ; x_β⟩}_{(α,β)∈Γ×Γ}. Therefore,

‖∑_{γ∈Γ} x_γ‖² = ⟨∑_{γ∈Γ} x_γ ; ∑_{γ∈Γ} x_γ⟩ = ∑_{α∈Γ} ⟨x_α ; ∑_{β∈Γ} x_β⟩ = ∑_{α∈Γ} ∑_{β∈Γ} ⟨x_α ; x_β⟩ = ∑_{γ∈Γ} ⟨x_γ ; x_γ⟩ = ∑_{γ∈Γ} ‖x_γ‖².
(b) If {‖x_γ‖²}_{γ∈Γ} is a summable family of nonnegative numbers, then for every ε > 0 there exists a finite N_ε ⊆ Γ such that

‖∑_{k∈N} x_k‖ = (∑_{k∈N} ‖x_k‖²)^{1/2} < ε

whenever N ⊆ Γ is finite and N ∩ N_ε = ∅ (by Proposition 5.8 and Theorem 5.27 again). Another application of Theorem 5.27 ensures that {x_γ}_{γ∈Γ} is a summable family in the Hilbert space X. □
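Theorem 5.32(a) can be checked on a small orthogonal family (toy vectors in R⁴, an assumed example):

```python
import math

# A finite orthogonal family in R^4: the squared norm of the sum equals
# the sum of the squared norms (generalized Pythagorean Theorem).

xs = [[2.0, 0.0, 0.0, 0.0],
      [0.0, -3.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 1.0],
      [0.0, 0.0, 1.0, -1.0]]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

# Verify pairwise orthogonality first.
assert all(dot(xs[i], xs[j]) == 0 for i in range(4) for j in range(i + 1, 4))

total = [sum(x[i] for x in xs) for i in range(4)]
assert math.isclose(dot(total, total), sum(dot(x, x) for x in xs))
```

The infinite case of the theorem is the limit of such finite identities along the finite subsets of Γ.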
5.8 Orthonormal Basis
If A is an orthogonal set in an inner product space X, then A may contain the origin of X, which is orthogonal to every vector in X. The next proposition presents the main algebraic property of orthogonal sets that do not contain the origin.
Proposition 5.33. If A is an orthogonal set consisting of nonzero vectors in an inner product space, then A is linearly independent.

Proof. Let A be a set of pairwise orthogonal nonzero vectors in an inner product space X. Suppose there exists a finite subset of A containing more than one vector that is not linearly independent; for instance, {x_i}_{i=0}^n ⊆ A such that x₀ = ∑_{i=1}^n α_i x_i for some integer n ≥ 1 and some set of scalars {α_i}_{i=1}^n. Since x₀ ⊥ x_i for every i = 1, ..., n we get ‖x₀‖² = ⟨x₀ ; x₀⟩ = ∑_{i=1}^n α_i ⟨x_i ; x₀⟩ = 0, which is a contradiction (because A does not contain the origin). Conclusion: Every finite subset of A containing more than one vector is linearly independent. Recall that every singleton {x} in a linear space X such that x ≠ 0 is linearly independent. Therefore, every finite subset of A is linearly independent, which means that A is itself linearly independent (Proposition 2.3). □
is, a subset A of X is an orthonormal set if x 1 y for every pair {x, y) of distinct vectors in A and IIx II = 1 for every x E A. Equivalently, (ey )yEr is an orthonormal
family of vectors in X if (en ; ep) = 3,,p for every a, f E r, where Sap is the Kronecker delta (i.e., (ea, ; ep) = 0 for every a, P E r such that a 96 P and II ey 112 = (ey ; ey) = I for every y E fl. Each orthogonal set consisting of nonzero vectors can be normalized. In fact, if A is an orthogonal subset of X such that x A 0 for every x E A, then the set f IIx II -1 x E X : x E A) is an orthonormal subset of X. The next two results are immediate consequences of the definition of orthonormal sets (Proposition 5.34 below is just a particular case of Proposition 5.33).
Proposition 5.34. Every orthonormal set in any inner product space is linearly independent.
Proposition 5.35. If A is an orthonormal set in an inner product space X, and if there exists x ∈ X such that x ⊥ A and ‖x‖ = 1, then A ∪ {x} is an orthonormal set in X (that properly includes A).
Let O be the collection of all orthonormal sets in an inner product space X. Note that any singleton {x} with ‖x‖ = 1 is an orthonormal set in X (although the expression "pairwise orthogonal" is vacuous in this case). As a subcollection of the power set P(X), O is partially ordered in the inclusion ordering of P(X). Recall from Section 1.5 that a set A ∈ O (i.e., an orthonormal set A in an inner product space X) is maximal in O if there is no set A' ∈ O such that A ⊂ A' (i.e., if there is no orthonormal set A' in X that properly includes A). If A is maximal in O, then we say that A is a maximal orthonormal set in the inner product space X.
Proposition 5.36. Let X be an inner product space and let A be an orthonormal set in X. The following assertions are pairwise equivalent.

(a) A is a maximal orthonormal set in X.

(b) There is no unit vector x for which A ∪ {x} is an orthonormal set.

(c) If x ⊥ A, then x = 0 (i.e., A⊥ = {0}).

Proof. Let A be an orthonormal set in an inner product space X.
Proof of (a)⇒(b). If there exists a unit vector x in X for which A ∪ {x} is an orthonormal set, then A ∪ {x} is an orthonormal set that properly includes A (since x ⊥ A). Thus A is not a maximal orthonormal set in X.

Proof of (b)⇒(c). If there exists a nonzero vector x in X such that x ⊥ A, then there exists a unit vector x′ = ‖x‖⁻¹x in X such that A ∪ {x′} is an orthonormal set.

Proof of (c)⇒(a). If (a) fails, then there exists an orthonormal set A′ in X that properly includes A, so that A′\A ≠ ∅. Take any x in A′\A, which is a nonzero vector (actually, x is a unit vector) orthogonal to A (for x ∈ A′, A ⊂ A′, and A′ is an orthonormal set). Thus (c) fails. □
Proposition 5.37. If A is an orthonormal set in an inner product space X, then there exists a maximal orthonormal set B in X such that A ⊆ B.

Proof. Let A be an orthonormal set in an inner product space X. Put

𝒪_A = {S ∈ 𝒫(X): S is an orthonormal set in X and A ⊆ S},

the collection of all orthonormal sets in X that include A. Recall that, as a nonempty subcollection (A ∈ 𝒪_A) of the power set 𝒫(X), 𝒪_A is partially ordered in the inclusion ordering. Take an arbitrary chain 𝒞 in 𝒪_A and consider the union ⋃𝒞 of
all sets in 𝒞. If x and y are distinct vectors in ⋃𝒞, then x ∈ C and y ∈ D, where C, D ∈ 𝒞 ⊆ 𝒪_A. Since 𝒞 is a chain, it follows that either C ⊆ D or D ⊆ C. Suppose (without loss of generality) that C ⊆ D. Thus x, y ∈ D, and so ⋃𝒞 is an orthonormal set (for D ∈ 𝒪_A). Moreover, A ⊆ ⋃𝒞 (reason: every set in 𝒞 includes A). Outcome: ⋃𝒞 ∈ 𝒪_A. Since ⋃𝒞 is an upper bound for 𝒞, we may conclude: Every chain in 𝒪_A has an upper bound in 𝒪_A. Hence 𝒪_A has a maximal element by Zorn's Lemma. Let B be a maximal element of 𝒪_A, which clearly is an orthonormal set in X that includes A. If there exists a unit vector x in X such that B ∪ {x} is an orthonormal set, then B ∪ {x} lies in 𝒪_A and properly includes B, which contradicts the fact that B is a maximal element of 𝒪_A. Therefore, there is no unit vector x in X such that B ∪ {x} is an orthonormal set, and hence B is a maximal orthonormal set in X (Proposition 5.36). □
Proposition 5.37 says that there are plenty of maximal orthonormal sets in any inner product space of dimension greater than 1. The next proposition says that the maximal orthonormal sets in a Hilbert space ℋ are precisely those orthonormal sets that span ℋ.

Proposition 5.38. Let A be an orthonormal set in an inner product space X.
(a) If ⋁A = X, then A is a maximal orthonormal set.

(b) If X is a Hilbert space and A is a maximal orthonormal set, then ⋁A = X.

Proof. Take any orthonormal set A in X. According to Proposition 5.36, A is a maximal orthonormal set if and only if A⊥ = {0}. If ⋁A = X, then (⋁A)⊥ = X⊥ = {0}. But Proposition 5.12 says that A⊥ = (⋁A)⊥, and hence A⊥ = {0}. The converse is an immediate consequence of Proposition 5.15. If X is a Hilbert space and A⊥ = {0}, then ⋁A = X. □
An orthonormal set in an inner product space X that spans X is called an orthonormal basis for X. In other words, a subset B of an inner product space X is an orthonormal basis if

(i) B is an orthonormal set, and

(ii) ⋁B = X.

This is a combined topological and algebraic concept, while the Hamel basis of Section 2.4 is a purely algebraic concept. However, every orthonormal basis for a given inner product space X is included in a Hamel basis for the linear space X (Proposition 5.34 and Theorem 2.5). Recall from Proposition 5.37 that every nonzero inner product space has a maximal orthonormal set. Using the above terminology, Proposition 5.38(a) says that if B is an orthonormal basis for an inner product space X, then B is a maximal orthonormal set in X. Note that this does not ensure the existence of an orthonormal basis in incomplete inner product spaces, but in
nonzero Hilbert spaces they do exist. In fact, in a Hilbert-space setting the concepts of maximal orthonormal set and orthonormal basis coincide (Proposition 5.38). That is, B is an orthonormal basis for a Hilbert space ℋ if and only if B is a maximal orthonormal set in ℋ. Therefore (cf. Proposition 5.37 again), every nonzero Hilbert space has an orthonormal basis. As we might expect (suggested perhaps by Section 2.4), the cardinality of all maximal orthonormal sets in an inner product space X is an invariant for X. First we prove this statement for finite-dimensional spaces.
Proposition 5.39. Let X be an inner product space.
(a) If X is finite-dimensional, then every orthonormal basis for X is a Hamel basis for X.

(b) If there exists an orthonormal basis for X with a finite number of vectors, then every orthonormal basis for X is a Hamel basis for X.

Consequently, in both cases, every orthonormal basis for X has the same finite cardinality, which coincides with the linear dimension of X.

Proof. Let X be an inner product space.

(a) If X is finite-dimensional, then M = M⁻ for every linear manifold M of X. In particular, span B = ⋁B for every orthonormal basis B for X. Recall that every orthonormal basis for X is linearly independent (Proposition 5.34). Therefore, if B is an orthonormal basis for X, then B is a linearly independent subset of X such that span B = X; that is, B is a Hamel basis for X.

(b) If B = {e_i}_{i=1}^n is an orthonormal basis for X, then span{e_i}_{i=1}^n is an n-dimensional linear manifold of X, which in fact is a subspace of X (Corollary 4.29). Hence span B = ⋁B. But ⋁B = X, so that B is a linearly independent subset of X (Proposition 5.34) such that span B = X. In other words, B is a Hamel basis for X. Finally, recall that the cardinality of any Hamel basis for X is an invariant for X: the linear dimension of X (Theorem 2.7). Thus, in both cases, #B = dim X ∈ ℕ for every orthonormal basis B for X. □
To verify such an invariance for infinite-dimensional spaces we shall use the following fundamental inequality.
Lemma 5.40. (Bessel Inequality). Let {e_γ}_{γ∈Γ} be a family of vectors in an inner product space X and let x be any vector in X. If {e_γ}_{γ∈Γ} is an orthonormal family, then {⟨x ; e_γ⟩}_{γ∈Γ} is a square-summable family of scalars and

Σ_{γ∈Γ} |⟨x ; e_γ⟩|² ≤ ‖x‖².
Proof. Take an arbitrary x ∈ X and an arbitrary finite set of indices N ⊆ Γ. Since {e_γ}_{γ∈Γ} is an orthonormal family, it follows by the Pythagorean Theorem (Proposition 5.8) that

0 ≤ ‖Σ_{k∈N} ⟨x ; e_k⟩e_k − x‖² = ‖Σ_{k∈N} ⟨x ; e_k⟩e_k‖² − 2Re Σ_{k∈N} ⟨x ; e_k⟩⟨e_k ; x⟩ + ‖x‖² = Σ_{k∈N} |⟨x ; e_k⟩|² − 2 Σ_{k∈N} |⟨x ; e_k⟩|² + ‖x‖²,

and hence

Σ_{k∈N} |⟨x ; e_k⟩|² ≤ ‖x‖².

Therefore, as N is an arbitrary finite subset of Γ,

Σ_{γ∈Γ} |⟨x ; e_γ⟩|² = sup_N Σ_{k∈N} |⟨x ; e_k⟩|² ≤ ‖x‖²,

where the supremum is taken over all finite subsets N of Γ. The family of scalars {⟨x ; e_γ⟩}_{γ∈Γ} is then square-summable by Proposition 5.31. □
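The Bessel inequality can be checked numerically in a finite-dimensional space, where any matrix with orthonormal columns supplies a finite orthonormal family. The sketch below (assuming NumPy; the names and the QR-based construction are illustrative, not from the text) verifies the inequality for a random orthonormal family.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3                      # ambient dimension; size of the orthonormal family

# Columns of Q form an orthonormal family {e_1, ..., e_m} in R^n
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
x = rng.standard_normal(n)       # an arbitrary vector x

coeffs = Q.T @ x                             # the coefficients <x ; e_k>
bessel_sum = float(np.sum(coeffs ** 2))      # sum_k |<x ; e_k>|^2
norm_sq = float(x @ x)                       # ||x||^2

# Bessel inequality: the coefficient sum never exceeds ||x||^2
assert bessel_sum <= norm_sq + 1e-12
```

Equality would hold exactly when m = n, i.e., when the family is an orthonormal basis; with m < n the inequality is in general strict.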
Corollary 5.41. Let {e_γ}_{γ∈Γ} be any orthonormal family of vectors in an inner product space X. For each x ∈ X the set {γ ∈ Γ : ⟨x ; e_γ⟩ ≠ 0} is countable.

Proof. According to Lemma 5.40, {|⟨x ; e_γ⟩|²}_{γ∈Γ} is a summable family of nonnegative numbers. Thus Corollary 5.28 ensures that {γ ∈ Γ : |⟨x ; e_γ⟩|² ≠ 0} is a countable set. That is, {γ ∈ Γ : ⟨x ; e_γ⟩ ≠ 0} is a countable set (since ⟨x ; e_γ⟩ ≠ 0 if and only if |⟨x ; e_γ⟩|² ≠ 0). □
Theorem 5.42. Every orthonormal basis for a given Hilbert space ℋ has the same cardinality.

Proof. If ℋ is finite-dimensional, then the result follows by Proposition 5.39. Suppose ℋ is infinite-dimensional and let B and C be arbitrary orthonormal bases for ℋ. Proposition 5.39(b) ensures that B and C are infinite, and so #ℕ ≤ #B. For each b ∈ B set C_b = {c ∈ C : ⟨c ; b⟩ ≠ 0}. According to Corollary 5.41, #C_b ≤ #ℕ, and hence #C_b ≤ #B for all b ∈ B. Then (cf. Theorems 1.10 and 1.9)

#(⋃_{b∈B} C_b) ≤ #(B × B) = #B

because B is infinite. Now observe that if c ∈ C, then c ∈ C_b for some b ∈ B. (Reason: since B is a maximal orthonormal set, it follows that if c ∈ C and c ⊥ B, then c = 0, which contradicts the fact that ‖c‖ = 1; hence ⟨c ; b⟩ ≠ 0 for some b ∈ B.) Therefore C ⊆ ⋃_{b∈B} C_b. But ⋃_{b∈B} C_b ⊆ C (each C_b is a subset of C). Thus C = ⋃_{b∈B} C_b and, consequently, #C ≤ #B. Since C also is infinite, we may swap B and C, apply the same argument, and get #B ≤ #C. Hence #C = #B by the Cantor-Bernstein Theorem (Theorem 1.6). □
It is worth noticing that the above theorem might be restated as "every maximal orthonormal set in a given inner product space X has the same cardinality"; the proof remains essentially the same. Such an invariant (i.e., the cardinality of every orthonormal basis for a given Hilbert space ℋ) is called the orthogonal dimension of the Hilbert space ℋ. According to Proposition 5.39, the orthogonal dimension of ℋ is finite if and only if its linear dimension is finite and, in this case, these two dimensions coincide. Therefore, the concepts of "infinite-dimensional" and "finite-dimensional" Hilbert spaces are unambiguously defined.
Proposition 5.43. If X is a separable inner product space, then every orthonormal set in X is countable.

Proof. Let B = {e_γ}_{γ∈Γ} be any family of orthonormal vectors in the inner product space X. If X is separable (as a metric space), then there exists a countable dense subset A = {a_k}_{k∈ℕ} of X. Since A is dense in X, it follows by Proposition 3.32 that every nonempty open ball centered at any point of X meets A. In particular, every open ball of radius, say, ⅓ centered at any point of B meets A. Then for each γ ∈ Γ there exists an integer k_γ ∈ ℕ such that ‖e_γ − a_{k_γ}‖ < ⅓. This establishes a function F : Γ → ℕ that assigns to each γ ∈ Γ the integer k_γ ∈ ℕ. We shall show that this function is injective. Indeed, consider the family {a_{k_γ}}_{γ∈Γ} and recall that B is an orthonormal set. Thus

√2 = ‖e_α − e_β‖ = ‖e_α − a_{k_α} − e_β + a_{k_β} + a_{k_α} − a_{k_β}‖ ≤ ‖e_α − a_{k_α}‖ + ‖e_β − a_{k_β}‖ + ‖a_{k_α} − a_{k_β}‖ < ⅔ + ‖a_{k_α} − a_{k_β}‖,

and hence ‖a_{k_α} − a_{k_β}‖ > √2 − ⅔ > 0 for every pair of distinct indices α, β ∈ Γ. This implies that k_α ≠ k_β whenever α ≠ β, which means that F : Γ → ℕ is injective. Therefore #Γ ≤ #ℕ, and so B is countable. □
Recall that every Hilbert space has an orthonormal basis. The next theorem characterizes the separable Hilbert spaces in terms of their orthogonal dimension.
Theorem 5.44. A Hilbert space is separable if and only if it has a countable orthonormal basis.

Proof. Propositions 4.9(b) and 5.43. □
We close this section with a useful result for constructing countable orthonormal sets, which is known as the Gram-Schmidt orthogonalization process.
Proposition 5.45. Let X be an inner product space. If A = {a_k} is a countable linearly independent nonempty subset of X, then there exists a countable orthonormal
subset B = {e_k} of X with the following property: span{e_k}_{k=1}^n = span{a_k}_{k=1}^n for every integer n such that 1 ≤ n ≤ #A, and hence span B = span A.

Proof. Let A = {a_k} be a countable (either finite and nonempty or countably infinite) linearly independent subset of an inner product space X.

Claim. For every integer n ≥ 1 (n ≤ #A) there exists an orthonormal subset {e_k}_{k=1}^n of X such that span{e_k}_{k=1}^n = span{a_k}_{k=1}^n.

Proof. a₁ ≠ 0 (for A = {a_k} is a linearly independent subset of X). Set e₁ = ‖a₁‖⁻¹a₁ in X so that the result holds for n = 1. (Recall that any singleton {x} such that ‖x‖ = 1 is an orthonormal set in X.) If #A = 1 the proposition is proved. Thus assume that 1 < #A ≤ ℵ₀. Suppose the result holds for some integer n such that 1 ≤ n < #A. Observe that

a_{n+1} − Σ_{k=1}^n ⟨a_{n+1} ; e_k⟩e_k ≠ 0

(otherwise a_{n+1} ∈ span{e_k}_{k=1}^n = span{a_k}_{k=1}^n, which contradicts the fact that A is linearly independent). Set

e_{n+1} = β_{n+1}(a_{n+1} − Σ_{k=1}^n ⟨a_{n+1} ; e_k⟩e_k),

where β_{n+1} = ‖a_{n+1} − Σ_{k=1}^n ⟨a_{n+1} ; e_k⟩e_k‖⁻¹, so that ‖e_{n+1}‖ = 1. Take any integer j = 1, …, n and note that

⟨e_{n+1} ; e_j⟩ = β_{n+1}(⟨a_{n+1} ; e_j⟩ − Σ_{k=1}^n ⟨a_{n+1} ; e_k⟩⟨e_k ; e_j⟩) = β_{n+1}(⟨a_{n+1} ; e_j⟩ − ⟨a_{n+1} ; e_j⟩) = 0

because {e_k}_{k=1}^n is an orthonormal set. Then e_{n+1} ⊥ {e_k}_{k=1}^n, and hence {e_k}_{k=1}^{n+1} is an orthonormal set. But e_{n+1} ∈ span({e_k}_{k=1}^n ∪ {a_{n+1}}) = span{a_k}_{k=1}^{n+1}. Therefore,

span{e_k}_{k=1}^{n+1} = span({e_k}_{k=1}^n ∪ {e_{n+1}}) = span({a_k}_{k=1}^n ∪ {e_{n+1}}) = span{a_k}_{k=1}^{n+1},

which completes the proof by induction. □

Finally, put B = ⋃_{n=1}^{#A} {e_k}_{k=1}^n. Since {e_k}_{k=1}^n ⊆ {e_k}_{k=1}^{n+1} for each integer n such that 1 ≤ n < #A, every pair of distinct vectors in B, say e and e′, lies in {e_k}_{k=1}^m for some integer m such that 1 ≤ m ≤ #A. Since {e_k}_{k=1}^m is an orthonormal set, e ⊥ e′ and ‖e‖ = ‖e′‖ = 1. Thus B is an orthonormal set. Moreover, since span{e_k}_{k=1}^n = span{a_k}_{k=1}^n for every integer n such that 1 ≤ n ≤ #A, it follows that span B = span A. □

Corollary 5.46. There is no Hilbert space with a countably infinite Hamel basis. In other words, a Hamel basis for a Hilbert space is either finite or uncountable.
Proof. Let ℋ be a Hilbert space and suppose there exists a countably infinite Hamel basis for the linear space ℋ, say {f_k}_{k=1}^∞. The Gram-Schmidt orthogonalization process ensures the existence of a countably infinite orthonormal set, say {e_k}_{k=1}^∞, such that span{e_k}_{k=1}^n = span{f_k}_{k=1}^n for every n ≥ 1. If {α_k}_{k=1}^∞ is any square-summable sequence of scalars, then {α_k e_k}_{k=1}^∞ is a square-summable sequence of pairwise orthogonal vectors in ℋ, and hence the infinite series Σ_{k=1}^∞ α_k e_k converges in the Hilbert space ℋ by Corollary 5.9(b). Take an arbitrary square-summable sequence {α_k}_{k=1}^∞ of nonzero scalars (e.g., α_k = 1/k for each k ≥ 1) and set x = Σ_{k=1}^∞ α_k e_k in ℋ. Since x ∉ span{e_k}_{k=1}^n and span{e_k}_{k=1}^n = span{f_k}_{k=1}^n for every n ≥ 1, it follows that x ∉ span{f_k}_{k=1}^∞ (i.e., x is not a finite linear combination of vectors from {f_k}_{k=1}^∞), which contradicts the assumption that {f_k}_{k=1}^∞ is a Hamel basis for ℋ. Conclusion: There is no countably infinite Hamel basis for ℋ. □
This result is the main reason why the concept of linear dimension is of no interest in Hilbert space theory. Here the useful notion is orthogonal dimension and, from now on, dim ℋ will always mean the orthogonal dimension of the Hilbert space ℋ. Observe that the notions of finite-dimensional and infinite-dimensional Hilbert spaces remain unchanged.
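The recursion in the proof of Proposition 5.45 translates directly into numerical linear algebra. Here is a minimal sketch (assuming NumPy; `gram_schmidt` is an illustrative name, not from the text) that orthonormalizes the columns of a matrix exactly as in the proof: subtract the coefficients against the vectors already produced, then normalize.

```python
import numpy as np

def gram_schmidt(A, tol=1e-12):
    """Orthonormalize the linearly independent columns a_1, ..., a_m of A:
    e_1 = a_1/||a_1||, and e_{n+1} is a_{n+1} minus its projection onto
    span{e_1, ..., e_n}, normalized (Proposition 5.45)."""
    basis = []
    for a in A.T:
        v = a - sum(np.vdot(e, a) * e for e in basis)  # remove <a ; e_k> e_k
        norm = np.linalg.norm(v)
        if norm < tol:
            raise ValueError("columns are not linearly independent")
        basis.append(v / norm)
    return np.column_stack(basis)

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
E = gram_schmidt(A)
assert np.allclose(E.T @ E, np.eye(3))   # {e_k} is an orthonormal set
```

Since span{e_k}_{k=1}^n = span{a_k}_{k=1}^n at every step, the coefficient matrix EᵀA is upper triangular; this is the finite-dimensional content of the proposition.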
5.9 The Fourier Series Theorem
The Fourier Series Theorem states the fundamental properties of an orthonormal basis in a Hilbert space. Precisely, it exhibits a collection of necessary and sufficient conditions for an orthonormal set to be an orthonormal basis in a Hilbert space. Before stating and proving it, we need the following auxiliary result.

Proposition 5.47. Let {e_k}_{k∈N} be a finite orthonormal set in a Hilbert space ℋ and let x be an arbitrary vector in ℋ. Then

u_x = Σ_{k∈N} ⟨x ; e_k⟩e_k

is the unique vector in span{e_k}_{k∈N} such that

‖x − u_x‖ ≤ ‖x − u‖ for every u ∈ span{e_k}_{k∈N}.
Proof. Since span{e_k}_{k∈N} is a finite-dimensional linear manifold of ℋ (Theorem 2.6), it follows by Corollary 4.29 that it is a subspace of ℋ, so that span{e_k}_{k∈N} = (span{e_k}_{k∈N})⁻ = ⋁{e_k}_{k∈N}. Theorem 5.13 says that there exists a unique vector u_x in ⋁{e_k}_{k∈N} such that

‖x − u_x‖ ≤ ‖x − u‖ for every u ∈ ⋁{e_k}_{k∈N}.

Moreover, Theorem 5.13 also says that this u_x is the unique vector in ⋁{e_k}_{k∈N} such that

x − u_x ⊥ ⋁{e_k}_{k∈N}.

Since u_x ∈ span{e_k}_{k∈N}, u_x = Σ_{k∈N} α_k e_k for some finite family of scalars {α_k}_{k∈N}. Since x − u_x ⊥ ⋁{e_k}_{k∈N}, x − u_x ⊥ e_j for every j ∈ N. Thus, recalling that {e_k}_{k∈N} is an orthonormal set,

0 = ⟨x − u_x ; e_j⟩ = ⟨x ; e_j⟩ − Σ_{k∈N} α_k ⟨e_k ; e_j⟩ = ⟨x ; e_j⟩ − α_j

for every j ∈ N, and hence u_x = Σ_{k∈N} ⟨x ; e_k⟩e_k. □
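In coordinates, Proposition 5.47 says that the coefficient vector of the best approximation from the span of an orthonormal family is obtained simply by taking inner products. A small numerical sanity check (NumPy assumed; the random competitors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal set {e_k}
x = rng.standard_normal(n)

u_x = Q @ (Q.T @ x)                  # u_x = sum_k <x ; e_k> e_k
best = np.linalg.norm(x - u_x)

# no competitor u in span{e_k} does better
for _ in range(200):
    u = Q @ rng.standard_normal(m)   # a random vector of span{e_k}
    assert best <= np.linalg.norm(x - u) + 1e-12

# and the error is orthogonal to the span, as in the proof
assert np.allclose(Q.T @ (x - u_x), 0.0)
```

The second assertion is the characterization x − u_x ⊥ ⋁{e_k} used in the proof via Theorem 5.13.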
Theorem 5.48. (The Fourier Series Theorem). Let B = {e_γ}_{γ∈Γ} be an orthonormal set in a Hilbert space ℋ. The following assertions are pairwise equivalent.

(a) B is an orthonormal basis for ℋ.

(b) Every x ∈ ℋ has a unique expansion on B, namely

x = Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ.

This is referred to as the Fourier series expansion of x. The elements of the family of scalars {⟨x ; e_γ⟩}_{γ∈Γ} are called the Fourier coefficients of x with respect to B.

(c) For every pair of vectors x, y in ℋ,

⟨x ; y⟩ = Σ_{γ∈Γ} ⟨x ; e_γ⟩⟨e_γ ; y⟩.

(d) For every x ∈ ℋ,

‖x‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|².

This is the Parseval identity.

(e) Every linear manifold of ℋ that includes B is dense in ℋ.
Proof. We shall verify that (e)⇔(a)⇔(d) and (b)⇒(c)⇒(d)⇒(b).

Proof of (a)⇒(d). Take any x in ℋ. If (span B)⁻ = ℋ, then every nonempty open ball centered at x meets span B (Proposition 3.32). That is, for every ε > 0 there exist a finite set N_ε ⊆ Γ and a vector u ∈ span{e_k}_{k∈N_ε} such that ‖x − u‖ < ε. Set u_x = Σ_{k∈N_ε} ⟨x ; e_k⟩e_k in span{e_k}_{k∈N_ε}. Proposition 5.47 ensures that ‖x − u_x‖ ≤ ‖x − u‖, and hence

‖x − Σ_{k∈N_ε} ⟨x ; e_k⟩e_k‖² < ε².

Since {e_k}_{k∈N_ε} is an orthonormal set, it follows by the Pythagorean Theorem (Proposition 5.8) that

‖Σ_{k∈N_ε} ⟨x ; e_k⟩e_k − x‖² = ‖Σ_{k∈N_ε} ⟨x ; e_k⟩e_k‖² − 2Re Σ_{k∈N_ε} ⟨x ; e_k⟩⟨e_k ; x⟩ + ‖x‖² = ‖x‖² − Σ_{k∈N_ε} |⟨x ; e_k⟩|².

Therefore, by the Bessel inequality (Lemma 5.40) and since Σ_{k∈N_ε} |⟨x ; e_k⟩|² ≤ Σ_{γ∈Γ} |⟨x ; e_γ⟩|² (Proposition 5.31), we get

0 ≤ ‖x‖² − Σ_{γ∈Γ} |⟨x ; e_γ⟩|² ≤ ‖x‖² − Σ_{k∈N_ε} |⟨x ; e_k⟩|² < ε².

Since ε > 0 is arbitrary, ‖x‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|². Outcome: (a) implies (d).

Conversely, if the orthonormal set B is not an orthonormal basis for ℋ, then B is not a maximal orthonormal set in the Hilbert space ℋ (Proposition 5.38(b)). Thus there exists a unit vector e in ℋ such that B ∪ {e} is an orthonormal set (Proposition 5.36). Therefore ⟨e ; e_γ⟩ = 0 for all γ ∈ Γ, and hence 1 = ‖e‖² ≠ Σ_{γ∈Γ} |⟨e ; e_γ⟩|² = 0. Conclusion: If (a) fails then (d) fails. Equivalently, (d) implies (a).
Proposition 4.9(a) ensures that (a)⇔(e). Moreover, (c) implies (d) trivially (set y = x). It is readily verified that (b)⇒(c). Indeed, if (b) holds, then

⟨x ; y⟩ = ⟨Σ_{α∈Γ} ⟨x ; e_α⟩e_α ; Σ_{β∈Γ} ⟨y ; e_β⟩e_β⟩ = Σ_{α∈Γ} ⟨x ; e_α⟩⟨e_α ; Σ_{β∈Γ} ⟨y ; e_β⟩e_β⟩ = Σ_{α∈Γ} ⟨x ; e_α⟩ Σ_{β∈Γ} ⟨e_β ; y⟩⟨e_α ; e_β⟩ = Σ_{α∈Γ} ⟨x ; e_α⟩⟨e_α ; y⟩

for every x, y ∈ ℋ (because {e_γ}_{γ∈Γ} is an orthonormal set). Hence (b) implies (c).
Proof of (d)⇒(b). Take any x in ℋ. Assertion (d) implies that {⟨x ; e_γ⟩e_γ}_{γ∈Γ} is a square-summable family of orthogonal vectors in ℋ. Thus {⟨x ; e_γ⟩e_γ}_{γ∈Γ} is a summable family of orthogonal vectors in the Hilbert space ℋ by Theorem 5.32(b). Let x′ be the sum of {⟨x ; e_γ⟩e_γ}_{γ∈Γ}; that is, x′ = Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ ∈ ℋ. Since {e_γ}_{γ∈Γ} is an orthonormal set, it follows by Theorem 5.32(a) that

‖x′ − x‖² = ‖Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ − x‖² = ‖Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ‖² − 2Re Σ_{γ∈Γ} ⟨x ; e_γ⟩⟨e_γ ; x⟩ + ‖x‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|² − 2 Σ_{γ∈Γ} |⟨x ; e_γ⟩|² + ‖x‖² = 0

because assertion (d) holds true (i.e., because ‖x‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|²). Therefore x′ = x, so that

x = Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ.

Finally, if x = Σ_{γ∈Γ} α_γ(x)e_γ for some family of scalars {α_γ(x)}_{γ∈Γ}, then Σ_{γ∈Γ} (α_γ(x) − ⟨x ; e_γ⟩)e_γ = 0, so that Σ_{γ∈Γ} |α_γ(x) − ⟨x ; e_γ⟩|² = 0 by Theorem 5.32(a). Thus α_γ(x) = ⟨x ; e_γ⟩ for every γ ∈ Γ, which proves that the Fourier series expansion of x is unique. □
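For a finite-dimensional ℋ the equivalences of Theorem 5.48 can be observed directly: a unitary matrix lists an orthonormal basis in its columns, and assertions (b), (c) and (d) become matrix identities. A sketch (NumPy assumed; the random construction is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
# columns of Q: an orthonormal basis {e_k} of C^n
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

cx = Q.conj().T @ x        # Fourier coefficients <x ; e_k>
cy = Q.conj().T @ y

# (b) Fourier expansion: x = sum_k <x ; e_k> e_k
assert np.allclose(Q @ cx, x)
# (c) <x ; y> = sum_k <x ; e_k><e_k ; y>
assert np.isclose(np.vdot(y, x), np.sum(cx * np.conj(cy)))
# (d) Parseval: ||x||^2 = sum_k |<x ; e_k>|^2
assert np.isclose(np.sum(np.abs(cx) ** 2), np.linalg.norm(x) ** 2)
```

Here `np.vdot(y, x)` computes Σ ȳ_i x_i, i.e., the inner product ⟨x ; y⟩ linear in the first argument, matching the book's convention.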
Remark: Consider the sums in (b), (c) and (d) of Theorem 5.48. If ℋ is a finite-dimensional Hilbert space, then any orthonormal basis for ℋ is finite (Proposition 5.39), and hence these are finite sums. If ℋ is infinite-dimensional and separable, then any orthonormal basis B for ℋ is countably infinite (Proposition 5.43), so that B can be indexed by ℕ (or by ℕ₀, or by ℤ). For instance, suppose B = {e_k}_{k∈ℕ} is an orthonormal basis for an infinite-dimensional separable Hilbert space ℋ. In this case we have a countable summable family of vectors {⟨x ; e_k⟩e_k}_{k∈ℕ} in (b), a countable summable family of scalars {⟨x ; e_k⟩⟨e_k ; y⟩}_{k∈ℕ} in (c), and a countable summable family of nonnegative numbers {|⟨x ; e_k⟩|²}_{k∈ℕ} in (d). If {e_k}_{k=1}^∞ is any ℋ-valued orthonormal sequence containing all vectors from the orthonormal basis {e_k}_{k∈ℕ} for ℋ (which is itself an orthonormal basis for ℋ), then the Fourier series expansion for any x ∈ ℋ can be written as

x = Σ_{k=1}^∞ ⟨x ; e_k⟩e_k.

Similarly,

⟨x ; y⟩ = Σ_{k=1}^∞ ⟨x ; e_k⟩⟨e_k ; y⟩

for every x, y in ℋ, and

‖x‖² = Σ_{k=1}^∞ |⟨x ; e_k⟩|²
for every x ∈ ℋ, where all the above infinite series are unconditionally convergent. If ℋ is a nonseparable Hilbert space, then any orthonormal basis for ℋ is uncountable (Theorem 5.44). However, Corollary 5.41 says that, even in this case, the sums in (b), (c) and (d) have only a countable number of nonzero summands for each x, y ∈ ℋ.
Example 5L. In this example we shall exhibit an orthonormal basis for some classical separable Hilbert spaces.
(a) First recall that, for any integer n ≥ 1, 𝔽ⁿ is a Hilbert space (see Example 5A). Moreover, the finite set {e_k}_{k=1}^n, where each e_k = (δ_{k1}, …, δ_{kn}) ∈ 𝔽ⁿ has 1 at the kth position and zeros elsewhere, clearly is an orthonormal set in 𝔽ⁿ and also a Hamel basis for the linear space 𝔽ⁿ (Example 2I). Then {e_k}_{k=1}^n is an orthonormal basis for the finite-dimensional Hilbert space 𝔽ⁿ, which is called the canonical orthonormal basis for 𝔽ⁿ.

(b) Now let ℓ²₊ be the Hilbert space of Example 5B. Consider the ℓ²₊-valued sequence {e_k}_{k∈ℕ}, where each e_k is a scalar-valued sequence in ℓ²₊ with just one nonzero entry (equal to 1) at the kth position. That is, e_k = {δ_{kj}}_{j∈ℕ} ∈ ℓ²₊ for every k ∈ ℕ, with δ_{kj} = 1 if j = k and δ_{kj} = 0 if j ≠ k. Again, it is clear that {e_k}_{k∈ℕ} is an orthonormal sequence of vectors in ℓ²₊. Moreover, recall that x = {ξ_j}_{j∈ℕ} lies in ℓ²₊ if and only if Σ_{j=1}^∞ |ξ_j|² < ∞. Therefore, if x = {ξ_j}_{j∈ℕ} ∈ ℓ²₊, then x = lim x_n, where x_n = (ξ₁, …, ξ_n, 0, 0, 0, …) ∈ ℓ²₊ for every n ∈ ℕ. Since each x_n ∈ span{e_k}_{k∈ℕ} (for x_n = Σ_{k=1}^n ξ_k e_k), it follows by the Closed Set Theorem that x ∈ (span{e_k}_{k∈ℕ})⁻ = ⋁{e_k}_{k∈ℕ}. Hence ℓ²₊ ⊆ ⋁{e_k}_{k∈ℕ} ⊆ ℓ²₊, which means that ⋁{e_k}_{k∈ℕ} = ℓ²₊. Conclusion: The orthonormal sequence {e_k}_{k∈ℕ} is an orthonormal basis for ℓ²₊. It can be similarly shown that {e_k}_{k∈ℤ}, where each e_k is a scalar-valued net in ℓ² with just one nonzero entry (equal to 1) at the kth position (i.e., e_k = {δ_{kj}}_{j∈ℤ} ∈ ℓ² for every k ∈ ℤ with δ_{kj} = 1 if j = k and δ_{kj} = 0 if j ≠ k), is an orthonormal basis for the Hilbert space ℓ² of Example 5B. These are referred to as the canonical orthonormal bases for ℓ²₊ and ℓ², respectively.

Next consider the complex Hilbert space L²(S) for some nondegenerate interval S of the real line (see Example 5D). Recall that L²(S) is the completion of R²(S), which
in turn is the quotient space r²(S)/∼ of Example 3E. Also recall that, according to the usual convention, we write x ∈ L²(S) instead of [x] ∈ L²(S), where x is any representative of the equivalence class [x].

(c) Suppose S = [a, b] for some pair of real numbers a < b, so that L²(S) = L²[a, b]. It is not difficult to verify that the countable set {e_k}_{k∈ℤ}, with each e_k in L²[a, b] given by

e_k(t) = (b − a)^{−1/2} exp(2πik(t − a)/(b − a)) for every t ∈ [a, b],

is a maximal orthonormal set in L²[a, b], and hence an orthonormal basis for the Hilbert space L²[a, b]. In particular, for S = [0, 2π], the countable set {e_k}_{k∈ℤ},
with each e_k in L²[0, 2π] given by

e_k(t) = (2π)^{−1/2} e^{ikt} = (2π)^{−1/2}(cos kt + i sin kt) for every t ∈ [0, 2π],

is an orthonormal basis for the Hilbert space L²[0, 2π]. Similarly, let 𝔻 be the open unit disk (about the origin) in the complex plane ℂ, and let 𝕋 = ∂𝔻 denote the unit circle in ℂ. Suppose the length of 𝕋 is normalized (i.e., suppose L²(𝕋) is the Hilbert space of all equivalence classes of square-integrable complex functions defined on 𝕋 with respect to normalized Lebesgue measure μ on the unit circle, so that μ(𝕋) = 1). The countable set {e_k}_{k∈ℤ}, with each e_k in L²(𝕋) given by

e_k(z) = z^k for every z = e^{iθ} ∈ 𝕋 (0 ≤ θ < 2π),

is an orthonormal basis for the Hilbert space L²(𝕋).
(d) If S = [0, ∞), then it can be shown that the sequence {e_n}_{n=0}^∞, with each e_n in L²[0, ∞) given by

e_n(t) = (1/n!) e^{−t/2} L_n(t) for every t ≥ 0,

where each polynomial L_n is defined by

L_n(t) = e^t (dⁿ/dtⁿ)(tⁿe^{−t}) = n! Σ_{k=0}^n (−1)^k (n choose k) t^k/k! for every t ≥ 0,

is an orthonormal basis for the Hilbert space L²[0, ∞). (Note: the elements of {e_n}_{n=0}^∞ and {L_n}_{n=0}^∞ are referred to as Chebyshev-Laguerre functions and Chebyshev-Laguerre polynomials, respectively.) If S = ℝ, then it can also be shown that the sequence {e_n}_{n=0}^∞, with each e_n in L²(−∞, ∞) given by

e_n(t) = (2ⁿn!√π)^{−1/2} e^{−t²/2} H_n(t) for every t ∈ ℝ,

where each polynomial H_n is defined by

H_n(t) = (−1)ⁿ e^{t²} (dⁿ/dtⁿ)(e^{−t²}) for every t ∈ ℝ,

is an orthonormal basis for the Hilbert space L²(−∞, ∞). (Note: the elements of {e_n}_{n=0}^∞ and {H_n}_{n=0}^∞ are called Chebyshev-Hermite functions and Chebyshev-Hermite polynomials, respectively.)
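The orthonormality of the exponential family in part (c) can be checked by numerical integration. The sketch below (NumPy assumed; the node count is an arbitrary choice) approximates ⟨e_j ; e_k⟩ on [0, 2π] with a midpoint rule, which reproduces the Kronecker delta to rounding error for these trigonometric integrands.

```python
import numpy as np

N = 2048                                  # quadrature nodes on [0, 2*pi]
t = (np.arange(N) + 0.5) * (2 * np.pi / N)
w = 2 * np.pi / N                         # midpoint-rule weight

def e(k):
    """e_k(t) = (2*pi)^(-1/2) * exp(i*k*t), the basis of L^2[0, 2*pi]."""
    return np.exp(1j * k * t) / np.sqrt(2 * np.pi)

for j in range(-3, 4):
    for k in range(-3, 4):
        ip = np.sum(e(j) * np.conj(e(k))) * w        # <e_j ; e_k>
        assert np.isclose(ip, 1.0 if j == k else 0.0, atol=1e-9)
```

The midpoint rule is exact (up to floating-point error) here because the integrands are trigonometric polynomials of degree well below the number of nodes.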
Example 5M. All the orthonormal bases in the previous example are countable, and so those Hilbert spaces are all separable Hilbert spaces. However, there exist nonseparable Hilbert spaces. Actually, there exist Hilbert spaces of arbitrary orthogonal dimension. Indeed, let Γ be any index set, and let ℓ²_Γ be the Hilbert space of Example 5K. Consider the family {e_γ}_{γ∈Γ} of vectors in ℓ²_Γ, where each e_γ = {δ_{γα}}_{α∈Γ} is a family of scalars such that δ_{γα} = 1 if α = γ and δ_{γα} = 0 if α ≠ γ. Note that each e_γ is a square-summable family of scalars (with just one nonzero element, equal to 1), so that each e_γ in fact lies in ℓ²_Γ. It is also clear that ⟨e_α ; e_β⟩ = δ_{αβ} for every α, β ∈ Γ, and hence {e_γ}_{γ∈Γ} is an orthonormal family of vectors in ℓ²_Γ. Let x be any unit vector in ℓ²_Γ. That is, x = {ξ_γ}_{γ∈Γ} is a square-summable family of scalars such that ‖x‖² = Σ_{γ∈Γ} |ξ_γ|² = 1. If x ⊥ e_γ for all γ ∈ Γ, then 0 = ⟨x ; e_γ⟩ = Σ_{α∈Γ} ξ_α δ_{γα} = ξ_γ for all γ ∈ Γ, and hence x = 0. But this contradicts the fact that ‖x‖ = 1. Therefore, there is no unit vector x in ℓ²_Γ for which {e_γ}_{γ∈Γ} ∪ {x} is an orthonormal set. That is (Proposition 5.36), {e_γ}_{γ∈Γ} is a maximal orthonormal set in the Hilbert space ℓ²_Γ, so that {e_γ}_{γ∈Γ} is an orthonormal basis for ℓ²_Γ (Proposition 5.38). Hence

dim ℓ²_Γ = #Γ.

If the index set Γ is uncountable, then {e_γ}_{γ∈Γ} is an uncountable orthonormal basis for ℓ²_Γ, so that, in this case, the Hilbert space ℓ²_Γ is not separable (Theorem 5.44).
Theorem 5.49. Two Hilbert spaces are unitarily equivalent if and only if they have the same orthogonal dimension.

Proof. Let ℋ and 𝒦 be Hilbert spaces (over the same field 𝔽) and let B = {e_γ}_{γ∈Γ} be an orthonormal basis for ℋ.

(a) If ℋ and 𝒦 are unitarily equivalent, then there exists a unitary transformation U ∈ ℬ[ℋ, 𝒦], so that U(B) is an orthonormal basis for 𝒦. Indeed, since B is an orthonormal set in ℋ, and since U preserves the inner product, it follows that U(B) is an orthonormal set in 𝒦. Moreover, as U is surjective, for every y ∈ 𝒦 there exists x ∈ ℋ such that y = Ux, and hence (Theorem 5.48)

y = U Σ_{γ∈Γ} ⟨x ; e_γ⟩e_γ = Σ_{γ∈Γ} ⟨Ux ; Ue_γ⟩Ue_γ = Σ_{γ∈Γ} ⟨y ; Ue_γ⟩Ue_γ

because U is a bounded linear transformation that preserves the inner product. Thus the orthonormal set U(B) = {Ue_γ}_{γ∈Γ} is an orthonormal basis for 𝒦 according to Theorem 5.48. Since U is invertible, it establishes a one-to-one correspondence between B and U(B), and so #B = #U(B). Therefore, dim ℋ = dim 𝒦.

(b) Conversely, suppose dim ℋ = dim 𝒦. Then #B = #C, where C is an arbitrary orthonormal basis for 𝒦, so that C and B can be similarly indexed, say C = {f_γ}_{γ∈Γ}. Now consider the Hilbert space ℓ²_Γ of Example 5K (over the field 𝔽) and take an arbitrary x ∈ ℋ. The Parseval identity (Theorem 5.48) says that ‖x‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|², and hence {⟨x ; e_γ⟩}_{γ∈Γ} lies in ℓ²_Γ. Consider the mapping U : ℋ → ℓ²_Γ defined by Ux = {⟨x ; e_γ⟩}_{γ∈Γ} for every x ∈ ℋ. It is readily verified that U is a linear transformation (by the linearity of the inner product in the first argument), and

‖Ux‖² = Σ_{γ∈Γ} |⟨x ; e_γ⟩|² = ‖x‖²

for every x ∈ ℋ (Parseval identity again). Thus U is a linear isometry. Next we verify that U is surjective. If {α_γ}_{γ∈Γ} is any family of scalars in ℓ²_Γ (i.e., if Σ_{γ∈Γ} |α_γ|² < ∞ — see Proposition 5.31), then {α_γ e_γ}_{γ∈Γ} is a summable family of vectors in ℋ. Indeed, as {e_γ}_{γ∈Γ} is an orthonormal set, Σ_{γ∈Γ} ‖α_γ e_γ‖² = Σ_{γ∈Γ} |α_γ|² < ∞, so that {α_γ e_γ}_{γ∈Γ} is a square-summable family, and hence a summable family of vectors in the Hilbert space ℋ (Proposition 5.31 and Theorem 5.32). Therefore, for every {α_γ}_{γ∈Γ} ∈ ℓ²_Γ there exists an x ∈ ℋ such that x = Σ_{γ∈Γ} α_γ e_γ. But the Fourier series expansion of x is unique, which implies that α_γ = ⟨x ; e_γ⟩ for every γ ∈ Γ. Then Ux = {⟨x ; e_γ⟩}_{γ∈Γ} = {α_γ}_{γ∈Γ}, so that {α_γ}_{γ∈Γ} ∈ ℛ(U). That is, ℓ²_Γ ⊆ ℛ(U). Since ℛ(U) ⊆ ℓ²_Γ trivially, it follows that ℛ(U) = ℓ²_Γ. Conclusion: U is a surjective linear isometry, which means that U is a unitary transformation. Thus ℋ and ℓ²_Γ are unitarily equivalent. As B and C are indexed by a common index set Γ, the same argument shows that 𝒦 and ℓ²_Γ also are unitarily equivalent. Therefore, ℋ and 𝒦 are unitarily equivalent (a composition of isometric isomorphisms is again an isometric isomorphism). □
According to Theorem 5.44 and Examples 5L and 5M, the next result is an immediate consequence of Theorem 5.49.

Corollary 5.50. Let Γ be an arbitrary index set. Every Hilbert space ℋ with dim ℋ = #Γ is unitarily equivalent to ℓ²_Γ. In particular, every n-dimensional Hilbert space (for any integer n ∈ ℕ) is unitarily equivalent to 𝔽ⁿ, and every infinite-dimensional separable Hilbert space is unitarily equivalent to ℓ²₊.
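Corollary 5.50 is visible in coordinates: the map Ux = {⟨x ; e_γ⟩}_{γ∈Γ} of the proof of Theorem 5.49 sends ℋ isometrically onto the sequence space. In ℝⁿ, with an orthonormal basis packed into the columns of a matrix Q, this map is x ↦ Qᵀx, and it preserves norms and inner products. A finite-dimensional sketch only (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal basis of R^n

def U(x):
    """Coordinate map x |-> {<x ; e_k>}_k from the proof of Theorem 5.49."""
    return Q.T @ x

x, y = rng.standard_normal(n), rng.standard_normal(n)
# U is a linear isometry that preserves inner products (hence unitary)
assert np.isclose(np.linalg.norm(U(x)), np.linalg.norm(x))
assert np.isclose(U(x) @ U(y), x @ y)
```

Surjectivity is automatic here since Q is square and invertible; in infinite dimensions it is supplied by the summability argument of the proof.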
Our first example of an unbounded linear transformation defined on a Banach space appeared in the proof of Corollary 4.30 part (b). It is easy to exhibit unbounded linear transformations defined on linear manifolds of a Hilbert space. (Hint: see Problem 4.33(b).) Now we shall apply the Fourier Series Theorem to exhibit an unbounded linear transformation defined on a whole Hilbert space; precisely, defined on an infinite-dimensional separable Hilbert space.
Example 5N. Let ℋ be an infinite-dimensional separable Hilbert space and let {e_k}_{k=1}^∞ be an orthonormal basis for ℋ. Let B = {f_γ}_{γ∈Γ} be a Hamel basis for ℋ that properly includes {e_k}_{k=1}^∞ (see Theorem 2.5 and Corollary 5.46). Take any f₀ ∈ B\{e_k}_{k=1}^∞ (obviously, f₀ ≠ 0) and consider the mapping F : B → ℋ such that Ff₀ = f₀ and Ff = 0 for all f ≠ f₀ in B. Now consider the transformation L : ℋ → ℋ defined as follows. For each x ∈ ℋ take its unique representation as a (finite) linear combination of vectors in the Hamel basis B, say

x = Σ_{k=1}^{n(x)} α_k(x) f_k.

Here n(x) is a positive integer and {α_k(x)}_{k=1}^{n(x)} is a finite family of scalars containing all nonzero coordinates of x with respect to the Hamel basis B. Set

Lx = Σ_{k=1}^{n(x)} α_k(x) Ff_k.

It is not difficult to verify that L is linear (i.e., L ∈ ℒ[ℋ]). Moreover, Lf = 0 for every f ∈ B such that f ≠ f₀, and Lf₀ = f₀ (so that L ≠ 0). In particular, Le_k = 0 for all k ≥ 1. Take an arbitrary x ∈ ℋ and consider its Fourier series expansion, viz., x = Σ_{k=1}^∞ ⟨x ; e_k⟩e_k. If L is bounded (i.e., if L is continuous), then it follows by Corollary 3.8 that Lx = Σ_{k=1}^∞ ⟨x ; e_k⟩Le_k = 0. But this implies that Lx = 0 for all x ∈ ℋ (i.e., L = 0), which is a contradiction. Thus L is unbounded; that is, L ∈ ℒ[ℋ]\ℬ[ℋ].
5.10 Orthogonal Projection
A projection is an idempotent linear transformation P : X → X of a linear space X into itself (Section 2.9). An orthogonal projection on an inner product space X is a projection P : X → X such that ℛ(P) ⊥ 𝒩(P). Since orthogonal projections are projections, all the results of Section 2.9 apply to orthogonal projections in particular. If P is an orthogonal projection on X, then so is the complementary projection E = (I − P) : X → X. (Reason: E = (I − P) is a projection with ℛ(E) = 𝒩(P) and 𝒩(E) = ℛ(P).)

Proposition 5.51. If P is an orthogonal projection on an inner product space X, then

(a) P ∈ ℬ[X] and ‖P‖ = 1 whenever P ≠ 0,

(b) 𝒩(P) and ℛ(P) are subspaces of X,

(c) 𝒩(P) = ℛ(P)⊥ and ℛ(P) = 𝒩(P)⊥,

(d) ℛ(P) and 𝒩(P) are orthogonal complementary subspaces in X. That is, besides being orthogonal subspaces of X, ℛ(P) and 𝒩(P) are such that

X = ℛ(P) + 𝒩(P).

Thus X can be decomposed as an orthogonal direct sum

X = ℛ(P) ⊕ 𝒩(P).

Proof. Let P : X → X be an orthogonal projection on an inner product space X.
(a) Take an arbitrary x E X. Since R(P) and N(P) are algebraic complements of each other (Theorem 2.19), we can write x = u + v with u in R(P) and v in N(P). Moreover, u 1 v because R(P) 1 N(P)). Then IIXI12 = IIuI12 + IIvI12
by the Pythagorean Theorem. Recalling that R(P) = {u ∈ X: Pu = u}, we get Px = Pu + Pv = u so that ||Px||² = ||u||² ≤ ||x||². Hence ||P|| ≤ 1. That is, P is a contraction. If P ≠ O, then R(P) ≠ {0}, and so ||Pu|| = ||u|| ≠ 0 (because Pu = u) for all nonzero u in R(P). Therefore ||P|| ≥ 1. Outcome: ||P|| = 1.
(b) According to item (a), P ∈ B[X]. Thus Proposition 4.13 ensures that N(P) is a subspace of X. Since E = I − P is an orthogonal projection on X, it follows that E ∈ B[X] and, by Proposition 4.13 again, that R(P) = N(E) is a subspace of X.

(c) Recall from the definition of orthogonal complement that if A and B are subsets of X such that A ⊥ B, then A ⊆ B^⊥. Therefore, as N(P) ⊥ R(P), we get N(P) ⊆ R(P)^⊥. Now take an arbitrary x ∈ R(P)^⊥ so that x ⊥ R(P). Theorem 2.19 says that x = u + v with u ∈ R(P) and v ∈ N(P). Hence 0 = (x; u) = (u; u) + (v; u) = ||u||² (for R(P) ⊥ N(P)) and so u = 0, which implies that x = v ∈ N(P). Therefore R(P)^⊥ ⊆ N(P). Outcome: N(P) = R(P)^⊥. Considering the complementary orthogonal projection E = I − P we conclude that R(P) = N(E) = R(E)^⊥ = N(P)^⊥.

(d) Theorem 2.19 says that N(P) and R(P) are algebraic complements of each other so that X = R(P) + N(P). Since R(P) ⊥ N(P), it follows by Proposition 5.24 that X = R(P) ⊕ N(P). As usual, we identify the orthogonal direct sum R(P) ⊕ N(P) with its unitarily equivalent image R(P) + N(P) = X, and write X = R(P) ⊕ N(P). □

Theorem 5.52. (Projection Theorem - Third version). For every subspace M of a Hilbert space H there exists a unique orthogonal projection P ∈ B[H] with R(P) = M.

Proof. Existence. Theorem 5.20 says that H can be decomposed as

H = M + M^⊥.

Since M and M^⊥ are algebraic complements of each other, for every x ∈ H there exists a unique u ∈ M and a unique v ∈ M^⊥ such that x = u + v (Theorem 2.14). For each x ∈ H set Px = P(u + v) = u in M ⊆ H. It is easy to check that this defines a linear transformation P: H → H with R(P) = M. Moreover, P²x = Pu = u = Px (because u = u + 0 is the unique decomposition of u ∈ M in M + M^⊥) so that P is idempotent. Furthermore, Px = 0 if and only if x = 0 + v for some v ∈ M^⊥, and hence N(P) = M^⊥. Conclusion: P is an orthogonal projection on H with R(P) = M.

Uniqueness. Take an arbitrary vector x ∈ H and consider its unique decomposition x = u + v ∈ M + M^⊥ = H with u ∈ M and v ∈ M^⊥. Suppose P′ is an orthogonal projection on H such that R(P′) = M. Thus P′u = u = Pu for every u ∈ M (since the range of any projection is the set of all its fixed points - see proof of Theorem 2.19), and P′v = 0 = Pv for every v ∈ M^⊥. (Reason:
if M = R(P′) = R(P), then M^⊥ = R(P′)^⊥ = R(P)^⊥ = N(P′) = N(P).) Therefore, P′x = P′(u + v) = P(u + v) = Px, and hence P′ = P. □
Let M be any subspace of a Hilbert space H. The unique orthogonal projection P: H → H for which R(P) = M is called the orthogonal projection onto M. The above proof shows that Theorem 5.20 implies Theorem 5.52. In fact, all the versions of the Projection Theorem, viz., Theorems 5.20, 5.25 and 5.52, are pairwise equivalent. That Theorems 5.20 and 5.25 are equivalent is an immediate consequence of Proposition 5.24. We shall verify that Theorem 5.52 implies Theorem 5.20. Indeed, if P: H → H is any orthogonal projection on a Hilbert space H, then H = R(P) + N(P) by Theorem 2.19, and hence H = R(P) + R(P)^⊥, where R(P) is a subspace of H (Proposition 5.51). But Theorem 5.52 says that for every subspace M of H there exists an orthogonal projection P: H → H such that R(P) = M. Therefore, for every subspace M of H we get H = M + M^⊥. Outcome: Theorem 5.52 implies Theorem 5.20. Such an equivalence justifies the terminology "Projection Theorem" for Theorems 5.20 and 5.25. The pivotal result of Theorem 5.13 can also be translated into the orthogonal projection language. Actually, the unique vector u_x ∈ M of Theorem 5.13 is given by u_x = Px, where P: H → H is the orthogonal projection onto M.
Theorem 5.53. Let M be a subspace of a Hilbert space H and let P ∈ B[H] be the orthogonal projection onto M. Take any x in H. Px is the unique vector in M such that

||x − Px|| = d(x, M).

Moreover, Px also is the unique vector in M such that

x − Px ⊥ M.

Proof. Let P be the orthogonal projection onto M. P(x − Px) = Px − P²x = 0 so that x − Px ∈ N(P) = R(P)^⊥ = M^⊥. Thus Px is the unique vector in M such that x − Px ∈ M^⊥, which in turn is the unique vector in M such that ||x − Px|| = d(x, M) (Theorem 5.13). □
The next two results connect the notion of orthogonal projection with the Fourier Series Theorem.
Theorem 5.54. Let {e_γ}_{γ∈Γ} be an orthonormal family of vectors in a Hilbert space H and set M = ⋁{e_γ}_{γ∈Γ}. For every x ∈ H, {(x; e_γ) e_γ}_{γ∈Γ} is a summable family of vectors in M and the mapping P: H → H, defined by

Px = Σ_{γ∈Γ} (x; e_γ) e_γ,

is the orthogonal projection onto M.
Proof. Take an arbitrary x ∈ H. The Bessel inequality (Lemma 5.40) says that Σ_{γ∈Γ} |(x; e_γ)|² ≤ ||x||², and hence {(x; e_γ) e_γ}_{γ∈Γ} is a square-summable family (cf. Proposition 5.31) of vectors in the Hilbert space M (see Proposition 4.7). Then {(x; e_γ) e_γ}_{γ∈Γ} is a summable family of vectors in M by Theorem 5.32. Set Px = Σ_{γ∈Γ} (x; e_γ) e_γ ∈ M. It is readily verified that this defines a linear transformation P: H → H such that R(P) ⊆ M. Moreover, ||Px||² = Σ_{γ∈Γ} |(x; e_γ)|² (Theorem 5.32 again), and hence ||Px||² ≤ ||x||². Thus P ∈ B[H] (i.e., P is continuous - a contraction, actually), which implies that

P²x = P(Σ_{γ∈Γ} (x; e_γ) e_γ) = Σ_{γ∈Γ} (x; e_γ) P e_γ.

But the very definition of P ensures that P e_α = Σ_{γ∈Γ} (e_α; e_γ) e_γ = e_α for every α ∈ Γ. Therefore,

P²x = Σ_{γ∈Γ} (x; e_γ) e_γ = Px,

and so P is a projection. Since {e_γ}_{γ∈Γ} is an orthonormal basis for the Hilbert space M, the Fourier Series Theorem ensures that every u ∈ M has a unique Fourier series expansion

u = Σ_{γ∈Γ} (u; e_γ) e_γ.

Hence Pu = Σ_{γ∈Γ} (u; e_γ) P e_γ = Σ_{γ∈Γ} (u; e_γ) e_γ = u so that u ∈ R(P). Therefore, M ⊆ R(P) and so R(P) = M. Note that v ∈ N(P) (i.e., Pv = 0) if and only if ||Pv||² = Σ_{γ∈Γ} |(v; e_γ)|² = 0; equivalently, if and only if (v; e_γ) = 0 for all γ ∈ Γ, which means that v ∈ ({e_γ}_{γ∈Γ})^⊥ = (⋁{e_γ}_{γ∈Γ})^⊥ = M^⊥ = R(P)^⊥. Thus N(P) = R(P)^⊥, which implies that R(P) ⊥ N(P). Conclusion: P: H → H is an orthogonal projection with R(P) = M; that is, the orthogonal projection onto M (Theorem 5.52). □
Corollary 5.55. Let {e_γ}_{γ∈Γ} be any orthonormal basis for a subspace M of a Hilbert space H. The orthogonal projection P ∈ B[H] onto M is given by

Px = Σ_{γ∈Γ} (x; e_γ) e_γ for every x ∈ H.
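The formula of Corollary 5.55 can be checked directly in finite dimensions. The sketch below (an illustration only, not part of the text; it assumes NumPy and uses an inner product linear in the first argument, matching the book's convention) builds an orthonormal pair by Gram-Schmidt and verifies that Px = Σ_k (x; e_k) e_k is idempotent and that x − Px is orthogonal to M, as in Theorem 5.53.

```python
import numpy as np

def inner(u, v):
    # inner product linear in the first argument: (u; v) = sum u_k * conj(v_k)
    return np.sum(u * np.conj(v))

rng = np.random.default_rng(0)

# two orthonormal vectors in C^4 spanning a subspace M (Gram-Schmidt)
a = rng.standard_normal(4) + 1j * rng.standard_normal(4)
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)
e1 = a / np.linalg.norm(a)
b = b - inner(b, e1) * e1
e2 = b / np.linalg.norm(b)

def P(x):
    # Px = sum_k (x; e_k) e_k : the orthogonal projection onto M = span{e1, e2}
    return inner(x, e1) * e1 + inner(x, e2) * e2

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
Px = P(x)

assert np.allclose(P(Px), Px)               # P is idempotent
assert abs(inner(x - Px, e1)) < 1e-12       # x - Px is orthogonal to M
assert abs(inner(x - Px, e2)) < 1e-12
```

The same code also illustrates that ||Px|| ≤ ||x||, the contraction property of Proposition 5.51(a).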
Now we consider "orthogonal families of orthogonal projections". This is a rather important notion upon which the Spectral Theorem of the next chapter will be built.
Proposition 5.56. Let P₁ ∈ B[X] and P₂ ∈ B[X] be orthogonal projections on an inner product space X. The following assertions are pairwise equivalent.

(a) R(P₁) ⊥ R(P₂).
(b) P₁P₂ = O.

(c) P₂P₁ = O.

(d) R(P₂) ⊆ N(P₁).

(e) R(P₁) ⊆ N(P₂).

Proof. Clearly, R(P₁) ⊥ R(P₂) implies R(P₂) ⊆ R(P₁)^⊥ = N(P₁). Conversely, if R(P₂) ⊆ R(P₁)^⊥, then R(P₂) ⊥ R(P₁). Hence (a)⇔(d). Similarly (orthogonality is symmetric so that R(P₁) ⊥ R(P₂) if and only if R(P₂) ⊥ R(P₁)) we get (a)⇔(e). Since R(P₂) ⊆ N(P₁) if and only if P₁P₂ = O, it follows that (d)⇔(b). Swap P₁ and P₂ to get (e)⇔(c). □

If two orthogonal projections P₁ and P₂ on an inner product space X satisfy any of the equivalent assertions of Proposition 5.56, then they are said to be orthogonal to each other (or mutually orthogonal). If {P_γ}_{γ∈Γ} is a family of mutually orthogonal projections on an inner product space X (i.e., R(P_α) ⊥ R(P_β) whenever α ≠ β), then we say that {P_γ}_{γ∈Γ} is an orthogonal family of orthogonal projections on X. An orthogonal sequence of orthogonal projections {P_k} is similarly defined. If {P_γ}_{γ∈Γ} is an orthogonal family of orthogonal projections and

Σ_{γ∈Γ} P_γ x = x for every x ∈ X,

then {P_γ}_{γ∈Γ} is called a resolution of the identity on X. (For each x ∈ X the sum x = Σ_{γ∈Γ} P_γ x has only a countable number of nonzero vectors - why?) If {P_k}_{k=0}^∞ is an infinite sequence, then the above identity in X means convergence in the strong operator topology:

Σ_{k=0}^n P_k →ˢ I.

If {P_k}_{k=0}^n is a finite family, then the above identity in X obviously coincides with the identity Σ_{k=0}^n P_k = I in B[X]. For instance, if P is an orthogonal projection on X and E = I − P is the complementary projection on X, then P and E are orthogonal projections orthogonal to each other (for PE = P − P² = O). Moreover, {P, E} clearly is a resolution of the identity on X for P + E = I.

Proposition 5.57. Let {e_γ}_{γ∈Γ} be an orthonormal basis for a Hilbert space H. For each γ ∈ Γ define the mapping P_γ: H → H by

P_γ x = (x; e_γ) e_γ for every x ∈ H.

Claim: {P_γ}_{γ∈Γ} is a resolution of the identity on H.
Proof. Each P_γ is the orthogonal projection onto the one-dimensional space M_γ = span{e_γ}: this is a particular case of Theorem 5.54. It is clear that P_α P_β x = (x; e_β)(e_β; e_α) e_α, and so P_α P_β x = 0 whenever α ≠ β, for every x ∈ H. Thus P_α P_β = O for every α, β ∈ Γ such that α ≠ β. Hence {P_γ}_{γ∈Γ} is an orthogonal family of orthogonal projections on H. The Fourier Series Theorem ensures that {(x; e_γ) e_γ}_{γ∈Γ} is a summable family of vectors in H and, for every x ∈ H,

x = Σ_{γ∈Γ} (x; e_γ) e_γ = Σ_{γ∈Γ} P_γ x.

Conclusion: {P_γ}_{γ∈Γ} is a resolution of the identity on H. □
Example 5O. Let {e_k}_{k=1}^∞ be an orthonormal basis for an infinite-dimensional separable Hilbert space H. According to Theorem 5.54 and Proposition 5.57, for each k ≥ 1 the mapping P_k: H → H defined by

P_k x = (x; e_k) e_k for every x ∈ H

is an orthogonal projection, and {P_k}_{k=1}^∞ is a resolution of the identity on H. Therefore, the sequence of operators {Σ_{k=1}^n P_k}_{n=1}^∞ converges strongly to the identity I on H. In fact, take an arbitrary x ∈ H. By the Fourier Series Theorem,

Σ_{k=1}^n P_k x − x = Σ_{k=1}^n (x; e_k) e_k − Σ_{k=1}^∞ (x; e_k) e_k = −Σ_{k=n+1}^∞ (x; e_k) e_k → 0

as n → ∞ because the infinite series Σ_{k=1}^∞ (x; e_k) e_k converges in H (see Problem 4.9(b)). This confirms that

Σ_{k=1}^n P_k →ˢ I.
We shall see now that the sequence of operators {Σ_{k=1}^n P_k}_{n=1}^∞, which converges strongly to the identity I on H, does not converge uniformly. Indeed, if {Σ_{k=1}^n P_k}_{n=1}^∞ converges uniformly, then the identity I must be its uniform limit. (Reason: Σ_{k=1}^n P_k →ˢ I, and uniform convergence implies strong convergence to the same limit.) However, for each n ≥ 1,

||(I − Σ_{k=1}^n P_k) e_{n+1}|| = ||Σ_{k=n+1}^∞ (e_{n+1}; e_k) e_k|| = ||e_{n+1}|| = 1.

Thus ||I − Σ_{k=1}^n P_k|| = sup_{||x||=1} ||(I − Σ_{k=1}^n P_k) x|| ≥ 1 for every n ≥ 1, and hence Σ_{k=1}^n P_k does not converge uniformly to I. Conclusion: {Σ_{k=1}^n P_k}_{n=1}^∞ does not converge uniformly.
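The strong-but-not-uniform behavior of this example can be imitated numerically. The sketch below (illustrative only; it assumes NumPy and truncates the separable space to C^200, so its statements hold only for indices well inside that window) shows the residual ||x − Σ_{k≤n} P_k x|| shrinking for a fixed vector x, while the operator I − Σ_{k≤n} P_k still sends the basis vector e_{n+1} to itself, so its norm never drops below 1.

```python
import numpy as np

N = 200                           # truncation of the separable space
x = 1.0 / np.arange(1, N + 1)     # a fixed vector with square-summable-type decay

def residual_norm(n):
    # ||x - sum_{k<=n} P_k x|| : the tail of the Fourier expansion of x
    return np.linalg.norm(x[n:])

def Q(n, v):
    # (I - sum_{k<=n} P_k) v : zero out the first n coordinates
    w = v.copy(); w[:n] = 0.0
    return w

# strong convergence on the fixed vector x: the residual shrinks with n
assert residual_norm(100) < residual_norm(10) < residual_norm(1)

# but ||I - sum_{k<=n} P_k|| >= 1 for every n < N: it fixes e_{n+1}
for n in (1, 10, 100):
    e = np.zeros(N); e[n] = 1.0   # the basis vector e_{n+1} (0-based index n)
    assert np.isclose(np.linalg.norm(Q(n, e)), 1.0)
```

No finite truncation can exhibit the full infinite-dimensional phenomenon, of course; the point is only that the gap between strong and uniform convergence is already visible on the partial sums.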
Proposition 5.58. If {P_k}_{k∈N} is an orthogonal sequence of orthogonal projections on a Hilbert space H, then {Σ_{k=1}^n P_k}_{n∈N} converges strongly to the orthogonal projection P: H → H onto (Σ_{k∈N} R(P_k))⁻. In other words,

Σ_{k=1}^n P_k →ˢ P,

where P ∈ B[H] is the orthogonal projection with R(P) = (Σ_{k∈N} R(P_k))⁻.

Proof. Set M = (Σ_{k∈N} R(P_k))⁻, which is a subspace of the Hilbert space H, and consider the decomposition

H = M + M^⊥

of Theorem 5.20. Take any x ∈ H so that x = u + v with u ∈ M and v ∈ M^⊥. Observe that v ∈ (Σ_{k∈N} R(P_k))^⊥ by Proposition 5.12 so that v ⊥ R(P_k) for every k ∈ N. Thus v ∈ R(P_k)^⊥ = N(P_k), and hence P_k v = 0, for every k ∈ N, which implies that Σ_{k=1}^n P_k v = 0. Since u ∈ (Σ_{k∈N} R(P_k))⁻, where {R(P_k)}_{k∈N} is a countably infinite family of pairwise orthogonal subspaces of H, it follows by the Orthogonal Structure Theorem (Theorem 5.16) that u = Σ_{k=1}^∞ u_k with u_k ∈ R(P_k) for each k. Since P_j is continuous, we get P_j u = Σ_{k=1}^∞ P_j u_k = u_j (reason: u_k ∈ R(P_k), P_j P_k = O whenever j ≠ k, and P_j u_j = u_j) for each j ∈ N. Hence u = Σ_{k=1}^∞ P_k u. Therefore, for every x ∈ H,

Σ_{k=1}^∞ P_k x = Σ_{k=1}^∞ P_k u + Σ_{k=1}^∞ P_k v = u,

so that the B[H]-valued sequence {Σ_{k=1}^n P_k}_{n∈N} is strongly convergent (Proposition 4.44). Let P ∈ B[H] be the strong limit of {Σ_{k=1}^n P_k}_{n∈N} so that

Px = Σ_{k=1}^∞ P_k x for every x ∈ H.

Then P²x = Σ_{k=1}^∞ P P_k x = Σ_{k=1}^∞ Σ_{j=1}^∞ P_j P_k x = Σ_{k=1}^∞ P_k x = Px for every x in H (because P is continuous and P_j P_k = O whenever j ≠ k), and hence P is idempotent. Moreover, R(P) = M and N(P) = M^⊥. Indeed, Px = P(u + v) = u for every x ∈ H = M + M^⊥, where u is the unique vector in M and v is the unique vector in M^⊥ such that x = u + v. Therefore, R(P) ⊥ N(P). Outcome: P ∈ B[H] is the unique orthogonal projection onto M (cf. Theorem 5.52). □
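A minimal finite-dimensional sketch of this mechanism (not from the text; it assumes NumPy and uses the standard basis of R^4 as an example) checks that the sum of two mutually orthogonal projections is again an orthogonal projection, onto the sum of their ranges.

```python
import numpy as np

# M1 = span{e1, e2} and M2 = span{e3} are mutually orthogonal subspaces of R^4
P1 = np.diag([1.0, 1.0, 0.0, 0.0])   # orthogonal projection onto M1
P2 = np.diag([0.0, 0.0, 1.0, 0.0])   # orthogonal projection onto M2

# mutual orthogonality in the sense of Proposition 5.56: P1 P2 = O
assert np.allclose(P1 @ P2, 0)

P = P1 + P2                           # candidate projection onto M1 + M2
assert np.allclose(P @ P, P)          # idempotent
assert np.allclose(P, P.T)            # symmetric, hence R(P) ⊥ N(P)
assert np.allclose(P, np.diag([1.0, 1.0, 1.0, 0.0]))   # projection onto span{e1,e2,e3}
```

The infinite sum of Proposition 5.58 behaves the same way coordinatewise; the strong-convergence subtlety only appears when infinitely many projections are summed.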
The above proposition is a consequence of the Projection Theorem (i.e., Theorems 5.20 and 5.52) and the Orthogonal Structure Theorem (Theorem 5.16). Here is the full version of the Orthogonal Structure Theorem in terms of orthogonal projections.
Theorem 5.59. Let H be a Hilbert space. If {P_k}_{k∈N} is a resolution of the identity on H, then

H = (Σ_{k∈N} R(P_k))⁻.

Conversely, if {M_k}_{k∈N} is a sequence of pairwise orthogonal subspaces of H such that H = (Σ_{k∈N} M_k)⁻, then the sequence {P_k}_{k∈N} consisting of the orthogonal projections P_k ∈ B[H] onto each M_k is a resolution of the identity on H.
Proof. Set M = (Σ_{k∈N} R(P_k))⁻, which is a subspace of the Hilbert space H. If x ∈ M^⊥ = (Σ_{k∈N} R(P_k))^⊥, then x ⊥ R(P_k) for every k ∈ N. Hence x lies in R(P_k)^⊥ = N(P_k), so that P_k x = 0, for every k ∈ N. Then Σ_{k∈N} P_k x = 0. But Σ_{k∈N} P_k x = x for every x ∈ H because {P_k}_{k∈N} is a resolution of the identity on H. Therefore, x ∈ M^⊥ implies x = 0 so that M^⊥ = {0}, and hence

M = H

by Proposition 5.15. Conversely, let {M_k}_{k∈N} be a sequence of pairwise orthogonal subspaces of H such that H = (Σ_{k∈N} M_k)⁻. For each k ∈ N let P_k ∈ B[H] be the orthogonal projection onto M_k (Theorem 5.52), and note that R(P_j) = M_j ⊥ M_k = R(P_k) whenever j ≠ k. Thus {P_k}_{k∈N} is an orthogonal sequence of orthogonal projections on H. Therefore, according to Proposition 5.58,

Σ_{k=1}^n P_k →ˢ I,

where the identity I ∈ B[H] is the unique orthogonal projection on H with R(I) = H = (Σ_{k∈N} M_k)⁻ = (Σ_{k∈N} R(P_k))⁻. Outcome: {P_k}_{k∈N} is a resolution of the identity on H. □
It is worth noticing that, as {R(P_k)}_{k∈N} is a sequence of pairwise orthogonal subspaces of a Hilbert space H, then ⊕_{k∈N} R(P_k) = (Σ_{k∈N} R(P_k))⁻ (see Example 5J). Therefore, under the usual identification, Proposition 5.58 says that Σ_{k=1}^n P_k →ˢ P, where P in B[H] is the orthogonal projection with R(P) = ⊕_{k∈N} R(P_k), and Theorem 5.59 says that H = ⊕_{k∈N} R(P_k) whenever {P_k}_{k∈N} is a resolution of the identity on H.
Definition 5.60. Let {P_γ}_{γ∈Γ} be a resolution of the identity on a Hilbert space H ≠ {0}, where P_γ ≠ O for every γ ∈ Γ. Let {λ_γ}_{γ∈Γ} be a similarly indexed family of scalars. Set

D(T) = {x ∈ H: {λ_γ P_γ x}_{γ∈Γ} is a summable family of vectors in H}.

The mapping T: D(T) → H, defined by

Tx = Σ_{γ∈Γ} λ_γ P_γ x for every x ∈ D(T),

is said to be a weighted sum of projections.
Proposition 5.61. Every weighted sum of projections is a linear transformation. Moreover, if T ∈ L[D(T), H] is a weighted sum of projections, then the following assertions are pairwise equivalent.

(a) D(T) = H.
(b) {λ_γ}_{γ∈Γ} is a bounded family of scalars.

(c) T is bounded.

If any of the above equivalent assertions holds true, then T ∈ B[H] is such that ||T|| = sup_{γ∈Γ} |λ_γ|.

Proof. It is readily verified that the domain D(T) of a weighted sum of projections is a linear manifold of H, and also that T: D(T) → H is linear (see the remark that follows Definition 5.26).

Proof of (a)⇒(b). If {λ_γ}_{γ∈Γ} is not bounded, then for each integer n ≥ 1 there exists a γ_n ∈ Γ such that |λ_{γ_n}| > n. Consider the scalar-valued sequence {λ_{γ_n}}_{n=1}^∞, which is clearly unbounded. Consider the sequence {P_{γ_n}}_{n=1}^∞ from {P_γ}_{γ∈Γ} and, for each n ≥ 1, take e_{γ_n} in R(P_{γ_n}) such that ||e_{γ_n}|| = 1. (Recall: R(P_γ) ≠ {0} because P_γ ≠ O for every γ ∈ Γ.) Observe that the orthogonal sequence {λ_{γ_n}^{-1} e_{γ_n}}_{n=1}^∞ is square-summable (for Σ_{n=1}^∞ |λ_{γ_n}|^{-2} < Σ_{n=1}^∞ n^{-2} < ∞), and hence it is a summable sequence in the Hilbert space H (Corollary 5.9(b)). Set x = Σ_{n=1}^∞ λ_{γ_n}^{-1} e_{γ_n} in H and note that (since each P_γ is continuous)

λ_{γ_n} P_{γ_n} x = Σ_{k=1}^∞ λ_{γ_n} λ_{γ_k}^{-1} P_{γ_n} e_{γ_k} = e_{γ_n} for each n ≥ 1

(because P_{γ_n} e_{γ_k} = 0 if k ≠ n and P_{γ_n} e_{γ_n} = e_{γ_n}, once e_{γ_k} ∈ R(P_{γ_k}) for each k ≥ 1 and P_{γ_n} P_{γ_k} = O whenever k ≠ n). But {e_{γ_n}}_{n=1}^∞ is not a square-summable sequence (for it is an orthogonal sequence of unit vectors). Thus {λ_{γ_n} P_{γ_n} x}_{n=1}^∞ is not a square-summable sequence of vectors in H, so that sup_{n≥1} Σ_{k=1}^n ||λ_{γ_k} P_{γ_k} x||² = ∞, and hence the orthogonal family {λ_γ P_γ x}_{γ∈Γ} of vectors in H is not square-summable (Proposition 5.31). Therefore, {λ_γ P_γ x}_{γ∈Γ} is not a summable family of vectors in H by Theorem 5.32, which means that x ∉ D(T). Conclusion: If {λ_γ}_{γ∈Γ} is not bounded, then D(T) ≠ H. Equivalently, (a) implies (b).

Proof of (b)⇒(a). Let x be an arbitrary vector in H. Since {P_γ}_{γ∈Γ} is a resolution of the identity on H, it follows that {P_γ x}_{γ∈Γ} is a summable family (for Σ_{γ∈Γ} P_γ x = x) of orthogonal vectors in H, and hence a square-summable family by Theorem 5.32(a). Suppose {λ_γ}_{γ∈Γ} is bounded and set β = sup_{γ∈Γ} |λ_γ| (which is a nonnegative real number). Then, for any finite N ⊆ Γ,

Σ_{k∈N} ||λ_k P_k x||² ≤ β² Σ_{k∈N} ||P_k x||² ≤ β² Σ_{γ∈Γ} ||P_γ x||² < ∞,

so that {λ_γ P_γ x}_{γ∈Γ} is a square-summable family of orthogonal vectors in H (Proposition 5.31), and hence a summable family of vectors in the Hilbert space H by Theorem 5.32(b). Thus x ∈ D(T), and therefore (b) implies (a). Moreover, since {λ_γ P_γ x}_{γ∈Γ} is an orthogonal family of vectors in H, and {P_γ}_{γ∈Γ} is a resolution
of the identity on H, we get

||Tx||² = ||Σ_{γ∈Γ} λ_γ P_γ x||² = Σ_{γ∈Γ} |λ_γ|² ||P_γ x||² ≤ β² Σ_{γ∈Γ} ||P_γ x||² = β² ||Σ_{γ∈Γ} P_γ x||² = β² ||x||²

(see Theorem 5.32 again). Hence (b) implies (c).
Proof of (c)⇒(b). For each γ ∈ Γ take a unit vector e_γ in R(P_γ). Thus

T e_γ = Σ_{α∈Γ} λ_α P_α e_γ = λ_γ e_γ.

If T is bounded, then

|λ_γ| = ||λ_γ e_γ|| = ||T e_γ|| ≤ ||T|| ||e_γ|| = ||T||

for all γ ∈ Γ so that (c) implies (b). Finally, the above inequality says that sup_{γ∈Γ} |λ_γ| ≤ ||T||. On the other hand, we have already seen that ||Tx|| ≤ β ||x|| for every x ∈ H, where β = sup_{γ∈Γ} |λ_γ|, which implies that ||T|| ≤ sup_{γ∈Γ} |λ_γ|. Hence ||T|| = sup_{γ∈Γ} |λ_γ|. □

Let the infinite sequence {P_k}_{k=1}^∞ be a resolution of the identity on H, where P_k ≠ O for every k ≥ 1, and let {λ_k}_{k=1}^∞ be a bounded sequence of scalars. Observe that the identity in H

Tx = Σ_{k=1}^∞ λ_k P_k x for every x ∈ H

that defines the weighted sum of projections T ∈ B[H] actually means convergence in the strong topology; that is,

Σ_{k=1}^n λ_k P_k →ˢ T.
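In a finite-dimensional model the preceding construction is simply a diagonal operator: taking the coordinate projections P_k of Proposition 5.57, the weighted sum T = Σ λ_k P_k acts as multiplication by λ_k on the k-th coordinate, and its norm is sup_k |λ_k| exactly as Proposition 5.61 asserts. The sketch below is illustrative only (it assumes NumPy; the weights are arbitrary sample values).

```python
import numpy as np

N = 6
lam = np.array([3.0, -1.0, 0.5, 2.0, -2.5, 1.0])   # a bounded family of weights

# T = sum_k lam_k P_k with P_k x = (x; e_k) e_k is the diagonal matrix diag(lam)
T = np.diag(lam)

rng = np.random.default_rng(1)
x = rng.standard_normal(N)

# Tx = sum_k lam_k (x; e_k) e_k : coordinatewise multiplication by the weights
assert np.allclose(T @ x, lam * x)

# ||T|| = sup_k |lam_k| (Proposition 5.61); ord=2 gives the operator (spectral) norm
assert np.isclose(np.linalg.norm(T, 2), np.max(np.abs(lam)))
```

This diagonal picture is exactly the form in which weighted sums of projections reappear in the Spectral Theorem of the next chapter.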
5.11 The Riesz Representation Theorem and Weak Convergence
Let y be an arbitrary vector in an inner product space X and consider the functional f: X → F defined by

f(x) = (x; y) for every x ∈ X.
This is a linear and bounded functional. Indeed, f is linear because the inner product is linear in the first argument; and bounded by the Schwarz inequality: |f(x)| = |(x; y)| ≤ ||y|| ||x|| for every x ∈ X. Then f ∈ B[X, F] and ||f|| = sup_{||x||=1} |f(x)| so that ||f|| ≤ ||y||. On the other hand, ||y|| ||y|| = |(y; y)| = |f(y)| ≤ ||f|| ||y||, so that ||y|| ≤ ||f||. Therefore, ||f|| = ||y||.

Outcome: A bounded (i.e., continuous) linear functional f is naturally associated with each vector y in an inner product space X. The remarkable fact is that the converse holds true in a Hilbert space.
Theorem 5.62. (The Riesz Representation Theorem). For every bounded linear functional f on a Hilbert space H there exists a unique vector y ∈ H such that

f(x) = (x; y) for every x ∈ H.

Moreover, ||f|| = ||y||. Such a unique vector y in H is called the Riesz representation of the functional f in B[H, F].
Proof. If f = O, then it is clear that y = 0 is the unique vector in H for which f(x) = (x; y) for every x ∈ H. Thus suppose f ≠ O.

Existence. Consider the null space N(f) of f ∈ B[H, F], which is a proper subspace of H (i.e., N(f) is a subspace of H by Proposition 4.13 and N(f) ≠ H because f ≠ O). Hence, as H is a Hilbert space, N(f)^⊥ ≠ {0} according to Proposition 5.15. Let z be any nonzero vector in N(f)^⊥. Since z ∉ N(f) (for N(f) ∩ N(f)^⊥ = {0}), it follows that f(z) ≠ 0. Now take an arbitrary x in H and note that

f(x − (f(x)/f(z)) z) = f(x) − f(x) = 0.

Thus x − (f(x)/f(z)) z ∈ N(f). Since z ∈ N(f)^⊥ we get

0 = (x − (f(x)/f(z)) z; z) = (x; z) − (f(x)/f(z)) ||z||²,

and hence f(x) = (x; \overline{f(z)} ||z||^{-2} z). Then there exists a vector y = \overline{f(z)} ||z||^{-2} z in H, which does not depend on x, such that f(x) = (x; y).

Uniqueness. If y′ ∈ H is such that f(x) = (x; y′) for every x ∈ H, then (x; y) = (x; y′) so that (x; y − y′) = 0 for every x ∈ H. Therefore, y − y′ ∈ H^⊥ = {0}. Finally, as we have already seen in the introduction of this section, if y ∈ H is such that f(x) = (x; y) for every x ∈ H, then ||f|| = ||y||. □
Corollary 5.63. For every Hilbert space H there exists a surjective isometry Ψ: H′ → H of the dual H′ of H onto H, which is additive and conjugate homogeneous (i.e., Ψ(αf) = ᾱ Ψ(f) for every f ∈ H′ and every α ∈ F).
Proof. Let H be a Hilbert space and let H′ = B[H, F] be the dual of H. According to the Riesz Representation Theorem, for each f ∈ H′ there exists a unique y ∈ H such that f(x) = (x; y) for every x ∈ H and ||f|| = ||y||. Conversely, for each y ∈ H the functional f: H → F given by f(x) = (x; y) for every x ∈ H is linear and bounded, which means that f ∈ H′. This establishes a surjective isometry (i.e., an invertible isometry) Ψ: H′ → H of the dual H′ of H onto H:

Ψ(f) = y for every f ∈ H′,

where y ∈ H is the (unique) Riesz representation of f ∈ H′. Therefore, every f in H′ is such that

f(x) = (x; Ψ(f)) for every x ∈ H.

Observe that Ψ is additive. Indeed, if f, g ∈ H′, then

(x; Ψ(f + g)) = (f + g)(x) = f(x) + g(x) = (x; Ψ(f)) + (x; Ψ(g)) = (x; Ψ(f) + Ψ(g))

for every x ∈ H, so that Ψ(f + g) = Ψ(f) + Ψ(g). Moreover, if f ∈ H′ and α ∈ F, then

(x; Ψ(αf)) = αf(x) = f(αx) = (αx; Ψ(f)) = (x; ᾱ Ψ(f))

for every x ∈ H, and hence Ψ(αf) = ᾱ Ψ(f). □
From the above corollary we may conclude: Every Hilbert space is isometrically equivalent to its dual. In particular, every real Hilbert space is isometrically isomorphic to its dual.

Corollary 5.64. Every Hilbert space is reflexive.
Proof. Let Ψ: H′ → H be the surjective isometry of Corollary 5.63, which is additive and conjugate homogeneous. Consider the function (·; ·)_*: H′ × H′ → F given by

(f; g)_* = (Ψ(g); Ψ(f)) for every f, g ∈ H′,

where (·; ·) stands for the inner product on H. This defines an inner product on H′. Indeed, (·; ·)_* is additive because Ψ is additive. Since Ψ is conjugate homogeneous,

(αf; g)_* = (Ψ(g); Ψ(αf)) = (Ψ(g); ᾱ Ψ(f)) = α (Ψ(g); Ψ(f)) = α (f; g)_*

for every f, g ∈ H′ and every α ∈ F, and hence (·; ·)_* is homogeneous in the first argument. It is clear that (·; ·)_* is Hermitian symmetric and positive. Actually,

||f||_* = ||Ψ(f)|| = ||f||
for every f ∈ H′ so that the norm || ||_* induced on H′ by the inner product (·; ·)_* coincides with the usual (induced uniform) norm on H′ = B[H, F]. Since the dual space of every normed space is a Banach space, it follows that (H′, || ||) is a Banach space, and hence (H′, || ||_*) is a Hilbert space. We shall now apply the Riesz Representation Theorem to the Hilbert space H′. Take an arbitrary φ ∈ H′′. Theorem 5.62 ensures that there exists a unique g ∈ H′ such that

φ(f) = (f; g)_* = (Ψ(g); Ψ(f)) for every f ∈ H′.

According to Theorem 5.62 every f ∈ H′ is given by f(x) = (x; y) for every x ∈ H, where y = Ψ(f) ∈ H. Set z = Ψ(g) ∈ H so that

f(z) = (z; y) = (Ψ(g); Ψ(f)).

Therefore, there exists z ∈ H such that

φ(f) = f(z) for every f ∈ H′.

Hence H is reflexive by Proposition 4.67. □
Let T ∈ B[H, Y] be a bounded linear transformation of a Hilbert space H to an inner product space Y. Take an arbitrary y ∈ Y and consider the functional f_y: H → F defined by

f_y(x) = (Tx; y)

for every x ∈ H, where the above inner product is that on Y. It is easy to show that f_y: H → F is a bounded linear functional (i.e., f_y ∈ H′). Indeed, f_y is linear because T is linear and the inner product is linear in the first argument; and bounded by the Schwarz inequality: |f_y(x)| ≤ ||Tx|| ||y|| ≤ ||T|| ||y|| ||x|| for every x ∈ H. The Riesz Representation Theorem says that there exists a unique z_y in H such that

(Tx; y) = f_y(x) = (x; z_y)

for every x ∈ H, where the right-hand side inner product is that on H. This establishes a mapping T*: Y → H that assigns to each y in Y this unique z_y in H (i.e., T*y = z_y for every y ∈ Y), and therefore satisfies the following identity for every x in H and every y in Y:

(Tx; y) = (x; T*y).

The mapping T*: Y → H is referred to as the adjoint of T ∈ B[H, Y]. In fact, as we shall see below, the adjoint T* of T is unique.
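In the finite-dimensional case the adjoint is the conjugate transpose of the representing matrix, and the defining identity can be checked directly. The sketch below is illustrative only (it assumes NumPy; the matrix and vectors are arbitrary random samples, and the inner product is linear in the first argument, as in the text).

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 4, 3

T = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))   # T in B[C^n, C^m]
T_adj = T.conj().T                                                   # its adjoint T*

inner = lambda u, v: np.sum(u * np.conj(v))   # (u; v), linear in the first argument

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# defining identity of the adjoint: (Tx; y) = (x; T*y)
assert np.isclose(inner(T @ x, y), inner(x, T_adj @ y))

# the norm identity of Proposition 5.65(d): ||T*T|| = ||T||^2 (operator norms)
assert np.isclose(np.linalg.norm(T_adj @ T, 2), np.linalg.norm(T, 2) ** 2)
```

The second assertion previews Proposition 5.65(d) below; in matrix terms it is the statement that the largest singular value of T*T is the square of that of T.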
Proposition 5.65. Take any T ∈ B[H, Y], where H is a Hilbert space and Y is an inner product space.

(a) The adjoint T* of T is the unique mapping of Y into H such that (Tx; y) = (x; T*y) for every x ∈ H and every y ∈ Y.
(b) T* is a bounded linear transformation (i.e., T* ∈ B[Y, H]).

Moreover, if Y also is a Hilbert space, then

(c) T** = T, and

(d) ||T*||² = ||T*T|| = ||TT*|| = ||T||².

Proof. Take T ∈ B[H, Y] and let T*: Y → H be a mapping such that (Tx; y) = (x; T*y) for every x ∈ H and every y ∈ Y.

Proof of (a). If T#: Y → H satisfies the identity (Tx; y) = (x; T#y) for every x ∈ H and every y ∈ Y, then for each y in Y

(x; T#y) = (x; T*y)

for every x ∈ H, and hence T#y = T*y for every y ∈ Y.

Proof of (b). Take y₁, y₂ ∈ Y and α₁, α₂ ∈ F arbitrary. Note that

(x; T*(α₁y₁ + α₂y₂)) = (Tx; α₁y₁ + α₂y₂) = ᾱ₁(Tx; y₁) + ᾱ₂(Tx; y₂) = ᾱ₁(x; T*y₁) + ᾱ₂(x; T*y₂) = (x; α₁T*y₁ + α₂T*y₂)

for every x ∈ H, and hence T*(α₁y₁ + α₂y₂) = α₁T*y₁ + α₂T*y₂ so that T* is linear. Moreover, by the Schwarz inequality,

||T*y||² = (T*y; T*y) = (TT*y; y) ≤ ||TT*y|| ||y|| ≤ ||T|| ||T*y|| ||y||

for every y ∈ Y. This implies that ||T*y|| ≤ ||T|| ||y|| for every y ∈ Y. Thus T* is bounded and ||T*|| ≤ ||T||.

Now suppose Y is a Hilbert space so that T* ∈ B[Y, H] has a unique adjoint (T*)* ∈ B[H, Y] (notation: T** = (T*)*) such that

(T*y; x) = (y; T**x)

for every x ∈ H and every y ∈ Y.

Proof of (c). Since (y; Tx) is the conjugate of (Tx; y) = (x; T*y), whose conjugate is (T*y; x), we get (y; Tx) = (T*y; x) for every x ∈ H and every y ∈ Y. It follows by the above identity that

(y; Tx) = (y; T**x)

for every x ∈ H and every y ∈ Y, and hence Tx = T**x for every x ∈ H.

Proof of (d). We have already seen that ||T*|| ≤ ||T||. Then, as T = T**, we get ||T|| = ||T**|| ≤ ||T*||. Hence ||T*|| = ||T||. Therefore, ||T*T|| ≤ ||T*|| ||T|| = ||T||². However, the Schwarz inequality ensures that ||Tx||² = (Tx; Tx) =
(T*Tx; x) ≤ ||T*Tx|| ||x|| ≤ ||T*T|| ||x||² for every x ∈ H so that ||T||² ≤ ||T*T||. Thus ||T||² = ||T*T||. Again, as T** = T, we get ||T*||² = ||T**T*|| = ||TT*||. □
Let X be an inner product space and consider the definition of weak convergence (cf. Problem 4.67): An X-valued sequence {x_n} converges weakly to x ∈ X (notation: x_n →ʷ x) if the scalar-valued sequence {f(x_n)} converges in F to f(x) for every f ∈ X* (i.e., if f(x_n − x) → 0 for every f ∈ X*). Recall from Problem 4.67 that convergence in the norm topology implies weak convergence (to the same and unique limit): x_n → x implies x_n →ʷ x. Since (·; y): X → F lies in X* for every y ∈ X, it follows that

x_n →ʷ x implies (x_n; y) → (x; y) for every y ∈ X.

The Riesz Representation Theorem ensures the converse in a Hilbert space: If X is a Hilbert space, then

x_n →ʷ x if and only if (x_n; y) → (x; y) for every y ∈ X.

In particular, x_n →ʷ x implies (x_n; x) → (x; x) = ||x||². By recalling that ||x_n − x||² = ||x_n||² − 2 Re(x_n; x) + ||x||², we may conclude the nontrivial part of the next equivalence, which holds in any inner product space:

x_n → x if and only if x_n →ʷ x and ||x_n|| → ||x||.
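The standard example separating the two notions is an orthonormal sequence: by the Bessel inequality its inner products against any fixed vector tend to 0, so e_n →ʷ 0, while ||e_n|| = 1 for all n. The sketch below is illustrative only (it assumes NumPy and truncates l² to R^1000, so it only exhibits the decay over the indices inside that window).

```python
import numpy as np

N = 1000
y = 1.0 / np.arange(1, N + 1)      # a fixed vector with square-summable-type decay

def e(n):
    # the n-th standard basis vector (0-based), an orthonormal sequence
    v = np.zeros(N); v[n] = 1.0
    return v

# (e_n; y) = y_n tends to 0, the weak-convergence behavior e_n ->w 0
inners = [abs(np.dot(e(n), y)) for n in (0, 10, 100, 999)]
assert inners == sorted(inners, reverse=True)     # |(e_n; y)| strictly decreases

# yet ||e_n|| = 1 for every n, so {e_n} has no norm-convergent behavior toward 0
assert all(np.isclose(np.linalg.norm(e(n)), 1.0) for n in (0, 10, 100, 999))
```

This matches the equivalence just stated: the norms ||e_n|| do not converge to ||0|| = 0, which is exactly what blocks norm convergence despite weak convergence.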
Now suppose X and Y are inner product spaces and let {T_n} be a B[X, Y]-valued sequence. If the Y-valued sequence {T_n x} converges weakly for every x ∈ X, then we say that {T_n} converges weakly (or converges in the weak (operator) topology). Equivalently, {T_n} converges weakly if there exists a unique T ∈ B[X, Y] such that T_n x →ʷ Tx in Y for every x ∈ X, which is called the weak limit of {T_n} (see Problem 4.68). Notation: T_n →ʷ T. Since (·; y): Y → F lies in Y* for every y ∈ Y, it follows that

T_n →ʷ T implies ((T_n − T)x; y) → 0 for every x ∈ X and every y ∈ Y.

We shall see below (Proposition 5.67) that the converse holds if Y is a Hilbert space. Recall that uniform convergence implies strong convergence, which in turn implies weak convergence (to the same limit - see Definition 4.45 and Problem 4.68):

T_n →ᵘ T implies T_n →ˢ T, which implies T_n →ʷ T.

Since y_n → y if and only if y_n →ʷ y and ||y_n|| → ||y||, it follows that

T_n →ˢ T if and only if T_n →ʷ T and ||T_n x|| → ||Tx|| for every x ∈ X.
We have already seen that strong and uniform convergence coincide if X is finite-dimensional. In fact, uniform, strong and weak convergence coincide if X and Y are finite-dimensional inner product spaces.
Proposition 5.66. (a) In a finite-dimensional inner product space, weak convergence and convergence in the norm topology coincide.

(b) Let X and Y be inner product spaces. Consider a B[X, Y]-valued sequence {T_k} and let T be a transformation in B[X, Y]. If Y is finite-dimensional, then T_k →ʷ T if and only if T_k →ˢ T.

Proof. (a) Suppose X is a finite-dimensional inner product space so that it is a Hilbert space (Corollary 4.28) and let B = {e_i}_{i=1}^n be an orthonormal basis for X. Take an arbitrary weakly convergent sequence {x_k} in X so that x_k →ʷ x for some x ∈ X, and hence (x_k − x; e_i) → 0 as k → ∞ for each e_i ∈ B. Therefore, for each i = 1, ..., n and every ε > 0, there exists a positive integer k_{i,ε} such that |(x_k − x; e_i)| < ε whenever k ≥ k_{i,ε}. Then the Fourier Series Theorem (Theorem 5.48) ensures that

||x_k − x||² = Σ_{i=1}^n |(x_k − x; e_i)|² < n ε²

whenever k ≥ k_ε = max{k_{i,ε}}_{i=1}^n. Hence ||x_k − x|| → 0 as k → ∞. That is, x_k → x. Summing up: x_k →ʷ x implies x_k → x whenever X is a finite-dimensional inner product space. This concludes the proof of (a), for norm convergence always implies weak convergence.

(b) T_k →ʷ T means T_k x →ʷ Tx in Y for every x ∈ X. But item (a) says that this is equivalent to T_k x → Tx in Y for every x ∈ X, whenever Y is finite-dimensional, which means T_k →ˢ T. □
Here are equivalent conditions for weak convergence of bounded linear transformations between Hilbert spaces.

Proposition 5.67. Let {T_n} be a sequence of bounded linear transformations of a Hilbert space H into a Hilbert space K (i.e., {T_n} is a B[H, K]-valued sequence). The following three assertions are pairwise equivalent.

(a) There exists T ∈ B[H, K] such that {T_n} converges weakly to T (i.e., T_n →ʷ T or, equivalently, T_n − T →ʷ O).

(b) There exists T ∈ B[H, K] such that (T_n x; y) → (Tx; y) as n → ∞ for every x in H and every y in K.

(c) The scalar-valued sequence {(T_n x; y)} converges in F for every x in H and every y in K.
Now set K = H and consider the following further assertions.

(d) There exists T ∈ B[H] such that (T_n x; x) → (Tx; x) as n → ∞ for every x ∈ H.

(e) The scalar-valued sequence {(T_n x; x)} converges in F for every x ∈ H.

Clearly, (b) implies (d), which implies (e). If K = H is a complex Hilbert space, then these five assertions are all pairwise equivalent.
Proof. If there exists T ∈ B[H, K] such that f(T_n x) → f(Tx) in F as n → ∞ for every f ∈ K′ and every x ∈ H, then (T_n x; y) → (Tx; y) as n → ∞ for every y ∈ K and every x ∈ H (because (·; y): K → F lies in K′ for every y ∈ K). Hence (a)⇒(b). Conversely, suppose (b) holds true and take any f ∈ K′. Since K is a Hilbert space, the Riesz Representation Theorem (Theorem 5.62) ensures that f(T_n x) → f(Tx) in F as n → ∞ for every x ∈ H. Thus (b)⇒(a). It is clear that (b)⇒(c). Now suppose assertion (c) holds true. Take an arbitrary y in K and consider the functional f_y: H → F defined by

f_y(x) = lim_n (T_n x; y)

for every x ∈ H. Observe that f_y is linear (because T_n is linear for each n, the inner product is linear in the first argument, and the linear operations in F are continuous). Since {(T_n x; y)} converges in F for every x ∈ H, it follows that {(T_n x; y)} is a bounded sequence for every x ∈ H (Proposition 3.39). Then sup_n |(T_n x; y)| < ∞ for every x in H and every y in K, which implies that sup_n ||T_n|| < ∞ (a consequence of the Banach-Steinhaus Theorem - see Problem 5.5). Hence

|f_y(x)| = |lim_n (T_n x; y)| = lim_n |(T_n x; y)| ≤ sup_n |(T_n x; y)| ≤ sup_n ||T_n x|| ||y|| ≤ sup_n ||T_n|| ||x|| ||y||

for every x ∈ H so that ||f_y|| ≤ sup_n ||T_n|| ||y||. That is, f_y is bounded, and therefore f_y ∈ H′ (f_y is a bounded linear functional on H). Since H is a Hilbert space, the Riesz Representation Theorem says that there exists a unique z_y ∈ H such that

f_y(x) = (x; z_y)

for every x ∈ H. Consider the mapping S: K → H that assigns to each y in K this unique z_y in H, Sy = z_y for every y ∈ K, so that

lim_n (T_n x; y) = f_y(x) = (x; z_y) = (x; Sy)
5. Hilbert Spaces
for every $x \in H$ and every $y \in K$. Note that $S$ is linear and bounded (i.e., $S$ lies in $B[K,H]$). Indeed, if $y_1, y_2 \in K$ and $\alpha_1, \alpha_2 \in F$, then

$$(x ; S(\alpha_1 y_1 + \alpha_2 y_2)) = \lim_n (T_nx ; \alpha_1 y_1 + \alpha_2 y_2) = \bar\alpha_1 \lim_n (T_nx ; y_1) + \bar\alpha_2 \lim_n (T_nx ; y_2) = \bar\alpha_1 (x ; Sy_1) + \bar\alpha_2 (x ; Sy_2) = (x ; \alpha_1 Sy_1 + \alpha_2 Sy_2)$$

for every $x \in H$. Hence $S(\alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 Sy_1 + \alpha_2 Sy_2$, which means that $S$ is linear. Moreover,

$$\|Sy\| = \|z_y\| = \|f_y\| \le \sup_n \|T_n\|\,\|y\|$$

for every $y \in K$, so that $S$ is bounded with $\|S\| \le \sup_n \|T_n\|$. Setting $T = S^* \in B[H,K]$, the adjoint of $S$, we get (cf. Proposition 5.65)

$$\lim_n (T_nx ; y) = (x ; Sy) = (x ; S^{**}y) = (S^*x ; y) = (Tx ; y)$$

for every $x \in H$ and every $y \in K$. Therefore, (c)⇒(b). Finally, set $K = H$, so that (b)⇒(d) and (d)⇒(e) trivially. According to Problem 5.3(b) (with $L = I$), it follows that (d)⇒(b) and (e)⇒(c) whenever $H$ is a complex Hilbert space. □

Observe from Proposition 5.67 and Problem 5.5 that

$$T_n \xrightarrow{w} T \implies \sup_n \|T_n\| < \infty \qquad\text{whenever $H$ and $K$ are Hilbert.}$$
Take $T \in B[X]$, where $X$ is an inner product space, and consider the power sequence $\{T^n\}$, so that each $T^n$ lies in $B[X]$. The operator $T$ is weakly stable if the power sequence $\{T^n\}$ converges weakly to the null operator; that is, $T^n \xrightarrow{w} O$. If $X$ is a Hilbert space, then Proposition 5.67 says that this is equivalent to any of those five assertions with $T_n$ replaced with $T^n$ and $T$ replaced with $O$. In particular, if $X$ is a Hilbert space, then $T^n \xrightarrow{w} O$ if and only if $(T^nx ; x) \to 0$ as $n \to \infty$ for every $x \in X$. Clearly, uniform stability implies strong stability, which implies weak stability:

$$T^n \xrightarrow{u} O \implies T^n \xrightarrow{s} O \implies T^n \xrightarrow{w} O,$$

which in turn implies power boundedness whenever $X$ is a Hilbert space:

$$T^n \xrightarrow{w} O \implies \sup_n \|T^n\| < \infty \qquad\text{if $X$ is Hilbert.}$$

The converses, however, fail (see Example 4K and Problem 5.29(c)).
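In finite dimensions (where, by Proposition 5.66, weak and strong convergence of the power sequence coincide) the implications above can be illustrated numerically. The following Python sketch is not from the text; the matrices are arbitrary illustrative choices: a matrix with spectral radius below 1 is uniformly (hence weakly) stable, while a plane rotation is power bounded but not weakly stable.

```python
import numpy as np

rng = np.random.default_rng(0)

T = np.array([[0.5, 1.0],
              [0.0, 0.5]])          # spectral radius 0.5 < 1
# ||T^n|| -> 0: uniform stability
norms = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) for n in range(1, 60)]
uniformly_stable = norms[-1] < 1e-6

# hence (T^n x ; x) -> 0 for every x: weak stability
x = rng.standard_normal(2)
weak_values = [x @ np.linalg.matrix_power(T, n) @ x for n in range(1, 60)]
weakly_stable = abs(weak_values[-1]) < 1e-6

U = np.array([[0.0, -1.0],
              [1.0,  0.0]])         # rotation by 90 degrees (unitary)
power_bounded = max(np.linalg.norm(np.linalg.matrix_power(U, n), 2)
                    for n in range(1, 60)) <= 1.0 + 1e-12
# U^4 = I, so (U^n e1 ; e1) cycles through 1, 0, -1, 0, ...:
# U is power bounded but not weakly stable
e1 = np.array([1.0, 0.0])
not_weakly_stable = abs(e1 @ np.linalg.matrix_power(U, 4) @ e1 - 1.0) < 1e-12
```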
Let $X$ and $Y$ be inner product spaces. We say that a subset $\Theta$ of $B[X,Y]$ is weakly closed in $B[X,Y]$ if every $\Theta$-valued weakly convergent sequence $\{T_n\}$ has
its (weak) limit $T$ in $\Theta$. Recall from Proposition 4.48: if $\Theta$ is strongly closed in $B[X,Y]$, then it is (uniformly) closed in $B[X,Y]$.

Proposition 5.68. If $\Theta \subseteq B[X,Y]$ is weakly closed in $B[X,Y]$, then it is strongly closed in $B[X,Y]$.

Proof. Take an arbitrary $\Theta$-valued strongly convergent sequence, say $\{T_n\}$, and let $T \in B[X,Y]$ be its (strong) limit. Since strong convergence implies weak convergence to the same limit, it follows that $\{T_n\}$ converges weakly to $T$. If every $\Theta$-valued weakly convergent sequence has its (weak) limit in $\Theta$, then $T \in \Theta$. Conclusion: every $\Theta$-valued strongly convergent sequence has its (strong) limit in $\Theta$. □

Remark: If $Y$ is finite-dimensional, then weak convergence coincides with strong convergence (Proposition 5.66(b)), and hence the concepts of weakly closed and strongly closed in $B[X,Y]$ coincide whenever $Y$ is a finite-dimensional inner product space. Moreover (cf. the Remark after Proposition 4.48), if $X$ and $Y$ are finite-dimensional inner product spaces, then all three concepts of weakly, strongly, and uniformly closed in $B[X,Y]$ coincide.
Recall that a subset $A$ of a metric space is compact if and only if it is sequentially compact (Theorem 3.80), which means by Definition 3.76 that every $A$-valued sequence has a convergent subsequence (that converges to a point in $A$). The Heine–Borel Theorem (Theorem 3.83) ensures that every bounded subset $B$ of a finite-dimensional inner product space has a compact closure. Thus every $B$-valued sequence has a convergent subsequence (that converges to a point in $B^-$). Therefore, every bounded nonempty subset of a finite-dimensional inner product space has a convergent sequence (whose limit lies in its closure). This no longer holds in an infinite-dimensional inner product space (e.g., if $\{e_n\}$ is an infinite orthonormal sequence, then $\|e_m - e_n\|^2 = 2$ for every $m \neq n$, so that $\{e_n\}$ has no convergent subsequence). Now recall that a finite-dimensional inner product space is a Hilbert space where convergence in the norm topology coincides with weak convergence (Proposition 5.66(a)), so that the next lemma actually is an extension to infinite-dimensional Hilbert spaces of the above italicized result.
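The orthonormal-sequence example can be simulated in a finite truncation of $\ell^2$ (a sketch; the vector $y$ below is an arbitrary square-summable choice): $\|e_m - e_n\|^2 = 2$ rules out norm-convergent subsequences, yet $(e_n ; y) \to 0$ for every fixed $y$, which is exactly weak convergence of $\{e_n\}$ to $0$.

```python
import numpy as np

N = 2000                                      # truncation dimension
y = np.array([1.0 / (k + 1) for k in range(N)])   # square-summable vector

def e(n, dim=N):
    """n-th standard orthonormal basis vector."""
    v = np.zeros(dim)
    v[n] = 1.0
    return v

dist_sq = np.linalg.norm(e(3) - e(7)) ** 2    # equals 2 for any m != n
inner_products = [float(e(n) @ y) for n in range(N)]  # (e_n ; y) = 1/(n+1) -> 0
```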
Lemma 5.69. Every bounded nonempty subset of a Hilbert space has a weakly convergent sequence. In particular, every bounded sequence in a Hilbert space has a weakly convergent subsequence.

Proof. Let $B$ be a bounded nonempty subset of an inner product space $H$. The result is trivial if $B = \{0\}$. Suppose $B \neq \{0\}$ and let $A \neq \{0\}$ be a nonempty countable subset of $B$. Put $X = \bigvee A$, which is a subspace of $H$. Thus $0 < \sup_{x \in A}\|x\| < \infty$, and $X$ is a separable inner product space by Proposition 4.9(b). Consider the following subset of the dual of $X$:

$$\Phi = \{(\,\cdot\, ; x): X \to F \,:\, x \in A\} \subseteq X^*.$$

Recall the definitions of "pointwise total boundedness" and "equicontinuity" from Example 3Z.
Claim 1. $\Phi$ is pointwise totally bounded.

Proof. Since total boundedness coincides with plain boundedness in $F$ (cf. proof of Theorem 3.83), it follows that $\Phi$ is pointwise totally bounded if and only if it is pointwise bounded. But $|(w ; x)| \le \sup_{x \in A}\|x\|\,\|w\|$ for every $w \in X$ and every $x \in A$, and so $\Phi$ is pointwise bounded. □

Claim 2. $\Phi$ is (uniformly) equicontinuous.

Proof. Indeed, $|(u ; x) - (v ; x)| = |(u - v ; x)| \le \sup_{x \in A}\|x\|\,\|u - v\|$ for every $u, v \in X$ and every $x \in A$. This implies that for every $\varepsilon > 0$ there exists $\delta = (\sup_{x \in A}\|x\|)^{-1}\varepsilon > 0$ such that $|(u ; x) - (v ; x)| < \varepsilon$ whenever $\|u - v\| < \delta$, for all $u, v \in X$ and every $x \in A$. □

Then, according to the proof of Example 3Z, we may conclude: $\Phi$ is totally bounded because $X$ is separable, which means that every $\Phi$-valued sequence has a Cauchy subsequence (Lemma 3.73). Take an arbitrary $A$-valued sequence $\{x_n\}$, so that the $\Phi$-valued sequence $\{(\,\cdot\, ; x_n)\}$ has a subsequence, say $\{(\,\cdot\, ; x_{n_j})\}$, that is Cauchy in $X^*$. But $X^*$ is always complete, so that $\{(\,\cdot\, ; x_{n_j})\}$ converges in $X^*$. That is, there exists $f \in X^*$ such that $(\,\cdot\, ; x_{n_j}) \to f$ in $X^*$. If $H$ is a Hilbert space, then the subspace $X$ is itself a Hilbert space, and the Riesz Representation Theorem says that there exists $x \in X$ for which $f = (\,\cdot\, ; x)$. Therefore, $(x_{n_j} ; w) \to (x ; w)$ for every $w \in X$. Using the Riesz Representation Theorem again we get $x_{n_j} \xrightarrow{w} x$ in $H$ (cf. Problem 5.20). □
Theorem 5.70. Let $\{T_n\}$ be a $B[H,K]$-valued sequence, where $H$ and $K$ are Hilbert spaces. If $H$ is a separable Hilbert space and $\sup_n \|T_n\| < \infty$, then $\{T_n\}$ has a weakly convergent subsequence.
Proof. Suppose $H \neq \{0\}$ to avoid trivialities. If $H$ is separable, then there exists a countably infinite dense subset $A$ of the nonzero linear space $H$. Let $\{a_i\}_{i\ge1}$ be an $A$-valued sequence consisting of an enumeration of all points of $A$. Since $A^- = H$, it follows by Proposition 3.32 that

$$\inf_i \|x - a_i\| = 0 \qquad\text{for every}\qquad x \in H.$$

If $\sup_n \|T_n\| < \infty$, then $\{T_na_1\}_{n\ge1}$ is a bounded sequence in $K$. Thus Lemma 5.69 ensures the existence of a subsequence of $\{T_n\}_{n\ge1}$, say $\{T_n^{(1)}\}_{n\ge1}$, such that $\{T_n^{(1)}a_1\}_{n\ge1}$ converges weakly in $K$. Again, since $\sup_n \|T_n^{(1)}\| \le \sup_n \|T_n\| < \infty$, it follows that $\{T_n^{(1)}a_2\}_{n\ge1}$ is bounded in $K$, and hence another application of Lemma 5.69 ensures that there exists a subsequence of $\{T_n^{(1)}\}_{n\ge1}$, say $\{T_n^{(2)}\}_{n\ge1}$, such that $\{T_n^{(2)}a_2\}_{n\ge1}$ converges weakly in $K$. Note that $\{T_n^{(2)}a_1\}_{n\ge1}$ also converges weakly in $K$. (Reason: $\{T_n^{(2)}a_1\}_{n\ge1}$ is a subsequence of $\{T_n^{(1)}a_1\}_{n\ge1}$, and $\{T_n^{(1)}a_1\}_{n\ge1}$ converges weakly in $K$; see Problem 4.67(b).) This leads to an inductive construction of a sequence of $B[H,K]$-valued sequences, $\{\{T_n^{(k)}\}_{n\ge1}\}_{k\ge1}$, with the following properties.
Property (1). $\{T_n^{(k+1)}\}_{n\ge1}$ is a subsequence of $\{T_n^{(k)}\}_{n\ge1}$, which is a subsequence of $\{T_n\}_{n\ge1}$, for every $k \ge 1$.

Property (2). $\{T_n^{(k)}a_i\}_{n\ge1}$ converges weakly in $K$ whenever $k \ge i$.

Consider the "diagonal" sequence $\{T_n^{(n)}\}_{n\ge1}$, which is a subsequence of $\{T_n\}_{n\ge1}$. If $\{T_n^{(n)}\}_{n\ge1}$ is weakly convergent, then the theorem is proved.

Claim. $\{T_n^{(n)}\}_{n\ge1}$ is weakly convergent.
Proof. Take $x \in H$, $y \in K$, and $\varepsilon > 0$ arbitrary. Since $\inf_i \|x - a_i\| = 0$, there exists an integer $i_\varepsilon \ge 1$ such that

$$\|x - a_{i_\varepsilon}\| < \varepsilon.$$

According to Property (1), $\{T_n^{(n)}\}_{n\ge i_\varepsilon}$ is a subsequence of $\{T_n^{(i_\varepsilon)}\}_{n\ge1}$, and hence $\{T_n^{(n)}a_{i_\varepsilon}\}_{n\ge i_\varepsilon}$ is a subsequence of $\{T_n^{(i_\varepsilon)}a_{i_\varepsilon}\}_{n\ge1}$. Since $\{T_n^{(i_\varepsilon)}a_{i_\varepsilon}\}_{n\ge1}$ converges weakly in $K$ by Property (2), it follows that its subsequence $\{T_n^{(n)}a_{i_\varepsilon}\}_{n\ge i_\varepsilon}$ also converges weakly in $K$, so that $\{T_n^{(n)}a_{i_\varepsilon}\}_{n\ge1}$ converges weakly in $K$. Then $\{(T_n^{(n)}a_{i_\varepsilon} ; y)\}_{n\ge1}$ converges in $F$, and therefore is a Cauchy sequence in $F$. That is, there exists an integer $n_\varepsilon \ge 1$ such that $m, n \ge n_\varepsilon$ implies

$$\big|\big((T_n^{(n)} - T_m^{(m)})a_{i_\varepsilon} ; y\big)\big| < \varepsilon.$$

Note that $\|T_n^{(n)} - T_m^{(m)}\| \le 2\sup_k \|T_k\|$ for all $m, n \ge 1$ because $\{T_n^{(n)}\}_{n\ge1}$ is a subsequence of $\{T_n\}_{n\ge1}$. Hence

$$\big|\big((T_n^{(n)} - T_m^{(m)})x ; y\big)\big| = \big|\big((T_n^{(n)} - T_m^{(m)})(a_{i_\varepsilon} + x - a_{i_\varepsilon}) ; y\big)\big| \le \big|\big((T_n^{(n)} - T_m^{(m)})a_{i_\varepsilon} ; y\big)\big| + 2\sup_k \|T_k\|\,\|x - a_{i_\varepsilon}\|\,\|y\| < \big(1 + 2\sup_k \|T_k\|\,\|y\|\big)\,\varepsilon$$

whenever $m, n \ge n_\varepsilon$. Conclusion: $\{(T_n^{(n)}x ; y)\}_{n\ge1}$ is a Cauchy sequence in $F$, so that it converges in $F$ (since $F$ is complete). As $x$ and $y$ are arbitrary vectors in $H$ and $K$, respectively, this implies that the scalar-valued sequence $\{(T_n^{(n)}x ; y)\}_{n\ge1}$ converges in $F$ for every $x \in H$ and every $y \in K$, which means that $T_n^{(n)} \xrightarrow{w} T$ for some $T \in B[H,K]$ by Proposition 5.67. □
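The diagonal construction in the proof can be sketched with scalar sequences standing in for the operator sequence. The sequences $a_k$ below are artificial, chosen so that the nested index sets are explicit: sequence $k$ "converges" (is eventually $0$) exactly along multiples of $2^{k+1}$, and the diagonal index $d_k$ works for every sequence at once.

```python
# Toy Cantor diagonalization: a_k(n) = 0 when 2^(k+1) divides n, else 1.
K, N = 6, 512

def a(k, n):
    return 0 if n % 2 ** (k + 1) == 0 else 1

levels = []
idx = list(range(N))
for k in range(K):
    # refine the previous index set to one along which a_k converges
    idx = [n for n in idx if n % 2 ** (k + 1) == 0]
    levels.append(idx)

# the diagonal picks the k-th index of the k-th level: d_k = k * 2^(k+1)
diagonal = [levels[k][k] for k in range(K)]

# for each fixed j, a_j(d_k) = 0 for every k >= j: a single subsequence
# works for all countably many sequences simultaneously
tails_vanish = all(a(j, d) == 0 for j in range(K) for d in diagonal[j:])
```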
5.12 The Adjoint Operator
Let $T \in B[H,Y]$ be a bounded linear transformation of a Hilbert space $H$ into an inner product space $Y$. The adjoint of $T$ was defined in the previous section as the unique mapping $T^*: Y \to H$ such that

$$(Tx ; y) = (x ; T^*y)$$

for every $x \in H$ and every $y \in Y$, whose existence was established in Section 5.11 as a consequence of the Riesz Representation Theorem. The basic facts about the
adjoint $T^*$ were stated in Proposition 5.65. In particular, it is linear and bounded (i.e., $T^* \in B[Y,H]$). Here is a useful corollary of Proposition 5.65.

Corollary 5.71. If $H$ and $K$ are Hilbert spaces and $T \in B[H,K]$, then

$$\|T\| = \sup_{\|x\|=\|y\|=1} |(Tx ; y)|.$$

Proof. Since $\|z\| = \sup_{\|y\|=1} |(z ; y)|$ for every $z \in K$ (see Problem 5.1), it follows that $\|Tx\| = \sup_{\|y\|=1} |(Tx ; y)|$ for every $x \in H$. Hence

$$\|T\| = \sup_{\|x\|=1}\sup_{\|y\|=1} |(Tx ; y)|.$$

Recalling that $T^{**} = T$ (see Proposition 5.65), it also follows that $\|T^*y\| = \sup_{\|x\|=1} |(T^*y ; x)| = \sup_{\|x\|=1} |(y ; Tx)| = \sup_{\|x\|=1} |(Tx ; y)|$ for every $y \in K$. Therefore, as $\|T^*\| = \|T\|$ (cf. Proposition 5.65 again),

$$\|T\| = \|T^*\| = \sup_{\|y\|=1}\sup_{\|x\|=1} |(Tx ; y)|. \qquad\square$$
Let us see some further elementary properties of the adjoint. Consider the linear space $B[H,Y]$, where $H$ is a Hilbert space and $Y$ is an inner product space. First observe that

$$O^* = O.$$

This means that the adjoint $O^* \in B[Y,H]$ of the null transformation $O \in B[H,Y]$ coincides with the null transformation $O \in B[Y,H]$. In fact, $0 = (Ox ; y) = (x ; O^*y)$ for every $x \in H$ and every $y \in Y$, and hence $O^*y = 0$ for every $y \in Y$. Now take $S$ and $T$ in $B[H,Y]$, so that $S + T$ and $\alpha T$ lie in $B[H,Y]$, where $\alpha$ is any scalar. Consider their adjoints, which are the unique transformations in $B[Y,H]$ such that $((S + T)x ; y) = (x ; (S + T)^*y)$ and $(\alpha Tx ; y) = (x ; (\alpha T)^*y)$ for every $x \in H$ and $y \in Y$, respectively. These two identities imply that $(x ; (S + T)^*y) = (x ; (S^* + T^*)y)$ and $(x ; (\alpha T)^*y) = (x ; \bar\alpha T^*y)$ for every $x \in H$ and every $y \in Y$. Therefore,

$$(S + T)^* = S^* + T^* \qquad\text{and}\qquad (\alpha T)^* = \bar\alpha T^*.$$

Next take $T$ in $B[H,K]$ and $S$ in $B[K,Y]$, where $H$ and $K$ are Hilbert spaces and $Y$ is an inner product space, so that $ST$ lies in $B[H,Y]$ by Proposition 4.16. Consider its adjoint, which is the unique transformation in $B[Y,H]$ such that $(STx ; y) = (x ; (ST)^*y)$ for every $x \in H$ and $y \in Y$. This implies that $(x ; (ST)^*y) = (x ; T^*S^*y)$ for every $x \in H$ and every $y \in Y$. Then

$$(ST)^* = T^*S^*.$$

Finally, consider the algebra $B[H]$, where $H$ is a Hilbert space. It is clear by the very definition of adjoint that

$$I = I^*,$$
where $I$ is the identity operator in $B[H]$. If $T \in G[H,K]$, where $H$ and $K$ are Hilbert spaces (i.e., if $T$ is invertible in $B[H,K]$, so that $T^{-1} \in B[K,H]$ by the Inverse Mapping Theorem), then the above identities ensure that $I = I^* = (T^{-1}T)^* = T^*(T^{-1})^*$, the identity operator in $B[H]$, and $I = I^* = (TT^{-1})^* = (T^{-1})^*T^*$, the identity operator in $B[K]$. Hence $T^* \in G[K,H]$ and

$$(T^*)^{-1} = (T^{-1})^*.$$

Example 5P. Consider the Hilbert spaces $F^n$ and $F^m$ (for arbitrary positive integers $m$ and $n$) equipped with their usual inner products as in Example 5A. Take any $A \in B[F^n, F^m]$ and recall that $B[F^n, F^m] = L[F^n, F^m]$ by Corollary 4.30. As usual (cf. Example 2L), we shall identify the linear transformation $A \in L[F^n, F^m]$ with its matrix

$$[A] = [a_{ij}] = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \in F^{m \times n}$$

relative to the canonical bases for $F^n$ and $F^m$. Thus for every $x = (\xi_1, \dots, \xi_n) \in F^n$ the vector $y = Ax = (\upsilon_1, \dots, \upsilon_m) \in F^m$ is such that

$$\upsilon_i = \sum_{j=1}^{n} a_{ij}\xi_j$$

for every $i = 1, \dots, m$. In terms of common matrix notation, and according to ordinary matrix operations, the matrix equation

$$[y] = [A][x]$$

represents the identity $y = Ax$. Here

$$[y] = \begin{pmatrix} \upsilon_1 \\ \vdots \\ \upsilon_m \end{pmatrix} \in F^{m \times 1} \qquad\text{and}\qquad [x] = \begin{pmatrix} \xi_1 \\ \vdots \\ \xi_n \end{pmatrix} \in F^{n \times 1}$$

are the matrices of $y$ and $x$ with respect to the canonical bases for $F^m$ and $F^n$, respectively. Recall: $\overline{[A]} = [\bar a_{ij}] \in F^{m \times n}$, $[A]^T = [a_{ji}] \in F^{n \times m}$, and $\overline{[A]}^{\,T} = [\bar a_{ji}] \in F^{n \times m}$ denote the conjugate, the transpose, and the conjugate transpose of $[A] = [a_{ij}] \in F^{m \times n}$, respectively. Now observe that the inner product on $F^m$ can be written as

$$(y ; z) = \sum_{i=1}^{m} \upsilon_i \bar\zeta_i = [y]^T\,\overline{[z]}$$

for every $y = (\upsilon_1, \dots, \upsilon_m) \in F^m$ and every $z = (\zeta_1, \dots, \zeta_m) \in F^m$. Using standard matrix algebra we get

$$(Ax ; y) = ([A][x])^T\,\overline{[y]} = [x]^T[A]^T\,\overline{[y]} = [x]^T\,\overline{\big(\overline{[A]}^{\,T}[y]\big)}$$
for every $x \in F^n$ and every $y \in F^m$. Next consider the adjoint $A^* \in B[F^m, F^n]$ of $A \in B[F^n, F^m]$, and let $[A^*] \in F^{n \times m}$ denote the matrix of $A^*$ relative to the canonical bases for $F^m$ and $F^n$. Then

$$(Ax ; y) = (x ; A^*y) = [x]^T\,\overline{\big([A^*][y]\big)}$$

for every $x \in F^n$ and every $y \in F^m$. Therefore,

$$[A^*] = \overline{[A]}^{\,T}.$$

That is, the matrix of the adjoint $A^*$ of $A$ is the conjugate transpose of the matrix of $A$.
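Example 5P can be checked numerically; the sketch below uses a random complex matrix and NumPy's conjugate transpose. Note that `np.vdot(u, v)` conjugates its first argument, so the book's inner product $(u ; v) = \sum_i u_i\bar v_i$ is `np.vdot(v, u)`.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
A_star = A.conj().T                  # candidate matrix of the adjoint

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

lhs = np.vdot(y, A @ x)              # (Ax ; y)
rhs = np.vdot(A_star @ y, x)         # (x ; A* y)
match = abs(lhs - rhs) < 1e-10       # defining identity of the adjoint
```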
Example 5Q. Let $S$ be a nondegenerate interval of the real line $\mathbb{R}$. Consider the Hilbert space $L^2(S)$ equipped with its usual inner product (see Example 5D). As always, we shall write $x \in L^2(S)$ instead of $[x] \in L^2(S)$, where $x$ is any representative of the equivalence class $[x]$. Take an arbitrary $a \in L^2(S \times S)$, so that $a(s, \cdot) \in L^2(S)$ for each $s \in S$, $a(\cdot, t) \in L^2(S)$ for each $t \in S$, and

$$\|a\|^2 = \int_S \|a(s, \cdot)\|^2\,ds = \int_S\!\!\int_S |a(s,t)|^2\,ds\,dt = \int_S\!\!\int_S |a(s,t)|^2\,dt\,ds < \infty.$$

This is due to a well-known result in integration theory called Fubini's Theorem. Now define the integral mapping $A: L^2(S) \to L^2(S)$ as follows. For each $x$ in $L^2(S)$ let $z = Ax \in L^2(S)$ be given by

$$z(s) = \int_S a(s,t)\,x(t)\,dt = (x ; \overline{a(s, \cdot)})$$

for every $s \in S$. Note that $z = Ax$ actually lies in $L^2(S)$ because $a$ lies in $L^2(S \times S)$. Indeed, by the Schwarz inequality,

$$\|z\|^2 = \int_S |(x ; \overline{a(s, \cdot)})|^2\,ds \le \int_S \|x\|^2\,\|a(s, \cdot)\|^2\,ds = \|x\|^2\,\|a\|^2,$$

and hence $\|Ax\| \le \|a\|\,\|x\|$ for every $x \in L^2(S)$, so that $A$ is bounded. Since $A$ is certainly linear, it follows that $A \in B[L^2(S)]$. Also note that

$$(Ax ; y) = \int_S \Big(\int_S a(s,t)\,x(t)\,dt\Big)\,\overline{y(s)}\,ds = \int_S\!\!\int_S a(s,t)\,x(t)\,\overline{y(s)}\,dt\,ds = \int_S x(t)\,\overline{\Big(\int_S \overline{a(s,t)}\,y(s)\,ds\Big)}\,dt$$

for every $x, y \in L^2(S)$. Now consider the adjoint $A^* \in B[L^2(S)]$ of $A \in B[L^2(S)]$. For each $y$ in $L^2(S)$ set $w = A^*y \in L^2(S)$, so that

$$(Ax ; y) = (x ; A^*y) = (x ; w) = \int_S x(t)\,\overline{w(t)}\,dt$$
for every $x, y \in L^2(S)$. Then $w = A^*y \in L^2(S)$ is given by

$$w(s) = \int_S \overline{a(t,s)}\,y(t)\,dt = \int_S a^*(s,t)\,y(t)\,dt$$

for every $s \in S$. Therefore, the adjoint $A^*$ of the integral operator $A$ is again an integral operator, whose kernel $a^* \in L^2(S \times S)$ is related to the kernel $a \in L^2(S \times S)$ of $A$ as follows. For every $(s,t) \in S \times S$,

$$a^*(s,t) = \overline{a(t,s)}.$$

An isometry between metric spaces is a map that preserves distance, and hence every isometry is an injective contraction. A linear isometry between normed spaces is a linear transformation that preserves norm and, between inner product spaces, a linear isometry is a linear transformation that preserves inner products. Propositions 4.37 and 5.21 gave some necessary and sufficient conditions for a linear transformation to be an isometry. Here is another one, for linear transformations between Hilbert spaces, stated in terms of the adjoint.
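The kernel identity $a^*(s,t) = \overline{a(t,s)}$ of Example 5Q can be checked on a Riemann-sum discretization of $S = [0,1]$. This is a sketch only: the sample kernel below is an arbitrary choice, and midpoint quadrature stands in for the integrals.

```python
import numpy as np

N = 400
h = 1.0 / N
s = (np.arange(N) + 0.5) * h                   # midpoints of [0, 1]
# an arbitrary L2 kernel a(s, t), sampled on the grid
a = np.exp(1j * np.outer(s, s)) + np.subtract.outer(s, s)

A = a * h                                       # (Ax)(s_i) ~ sum_j a(s_i, t_j) x(t_j) h
A_star = a.conj().T * h                         # kernel a*(s, t) = conj(a(t, s))

rng = np.random.default_rng(2)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# (Ax ; y) and (x ; A*y) under the discretized L2 inner product
lhs = h * np.vdot(y, A @ x)
rhs = h * np.vdot(A_star @ y, x)
match = abs(lhs - rhs) < 1e-8
```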
Proposition 5.72. An operator $V \in B[H,K]$ of a Hilbert space $H$ into a Hilbert space $K$ is an isometry if and only if $V^*V = I$.

Proof. According to Proposition 5.21, $V$ is an isometry if and only if it preserves inner products; that is, $(Vx ; Vy) = (x ; y)$ for every $x, y \in H$. Equivalently, $((V^*V - I)x ; y) = 0$ for every $x, y \in H$, which means that $(V^*V - I)x = 0$ for every $x \in H$, where $I$ is the identity on $H$. □

A coisometry is a transformation $T \in B[H,K]$ such that its adjoint $T^* \in B[K,H]$ is an isometry. Thus the previous proposition says that $T$ is a coisometry if and only if $TT^* = I$ (identity on $K$). Recall that a unitary transformation $U \in B[H,K]$ is an isometric isomorphism between $H$ and $K$; equivalently, a linear surjective isometry, which means an invertible isometry (i.e., an isometry in $G[H,K]$).
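A finite-dimensional sketch (matrices chosen for illustration, not from the text): a tall matrix with orthonormal columns is an isometry that is not a coisometry, while a square rotation matrix is unitary, hence both.

```python
import numpy as np

# V embeds C^2 into C^3 isometrically
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
is_isometry = np.allclose(V.conj().T @ V, np.eye(2))    # V*V = I on C^2
is_coisometry = np.allclose(V @ V.conj().T, np.eye(3))  # fails: rank 2 < 3

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])         # rotation: unitary on R^2
is_unitary = (np.allclose(U.conj().T @ U, np.eye(2))
              and np.allclose(U @ U.conj().T, np.eye(2)))
```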
Proposition 5.73. Take $U \in B[H,K]$, where $H$ and $K$ are Hilbert spaces. The following assertions are pairwise equivalent.

(a) $U$ is unitary (i.e., $U$ is a surjective isometry).

(b) $U$ lies in $G[H,K]$ and $U^{-1} = U^*$.

(c) $U^*U = I$ (identity on $H$) and $UU^* = I$ (identity on $K$).

(d) $U$ is an isometry and a coisometry.

Proof. Let $U$ be a transformation in $B[H,K]$. It is readily verified that (b)⇔(c) by the Open Mapping Theorem (Theorem 4.22), and (c)⇔(d) by Proposition 5.72.
Proof of (a)⇔(b). If $U$ is unitary, then it is an isometry that lies in $G[H,K]$, so that there exists $U^{-1} \in G[K,H]$ such that $UU^{-1} = I$ (identity on $K$). Proposition 5.21 ensures that $(Ux_1 ; Ux_2) = (x_1 ; x_2)$ for every $x_1, x_2 \in H$. Hence

$$(x ; U^{-1}y) = (Ux ; UU^{-1}y) = (Ux ; y) = (x ; U^*y)$$

for every $x \in H$ and every $y \in K$. Therefore, as the adjoint is unique (Proposition 5.65(a)), $U^{-1} = U^*$. Conversely, if $U$ lies in $G[H,K]$ and $U^{-1} = U^*$, then $U^*U = I$ (identity on $H$), and hence (cf. Proposition 5.72) $U$ is a surjective isometry. □
Let $T$ be an operator in $B[X]$, where $X$ is an inner product space, and let $M$ be a subspace of $X$. Recall: $M$ is an invariant subspace for $T$ (or invariant under $T$, or $T$-invariant) if $T(M) \subseteq M$ (i.e., $Tx \in M$ whenever $x \in M$). A nontrivial invariant subspace for $T$ is an invariant subspace $M$ for $T$ such that $\{0\} \neq M \neq X$ (see Problems 4.18 to 4.20). If $M$ and its orthogonal complement $M^\perp$ are both invariant for $T$ (i.e., if $T(M) \subseteq M$ and $T(M^\perp) \subseteq M^\perp$), then we say that $M$ reduces $T$ (or $M$ is a reducing subspace for $T$). Accordingly, a nontrivial reducing subspace for $T$ is a reducing subspace $M$ for $T$ such that $\{0\} \neq M \neq X$. An operator is reducible if it has a nontrivial reducing subspace. Now let $X = H$ be a Hilbert space and consider the orthogonal direct sum $H = M \oplus M^\perp$ of Theorem 5.25. Observe from Example 2O (also see Problem 4.16) that the following assertions are pairwise equivalent.

(a) $M$ reduces $T$.

(b) $T = T|_M \oplus T|_{M^\perp} = \begin{pmatrix} T|_M & O \\ O & T|_{M^\perp} \end{pmatrix}: H = M \oplus M^\perp \to H = M \oplus M^\perp$.

(c) $PT = TP$, where $P = \begin{pmatrix} I & O \\ O & O \end{pmatrix}: H = M \oplus M^\perp \to H = M \oplus M^\perp$ is the orthogonal projection onto $M$.

This suggests that, if $M$ reduces $T$, then the investigation of $T$ is reduced to the investigation of smaller operators (viz., $T|_M$ and $T|_{M^\perp}$), which justifies the terminology "reducing subspace".

Proposition 5.74. Let $T$ be any operator on a Hilbert space $H$. A subspace $M$ of $H$ is invariant for $T$ if and only if $M^\perp$ is invariant for $T^*$. Thus $T$ has a nontrivial invariant subspace if and only if $T^*$ has.

Proof. Let $M$ be a subspace of a Hilbert space $H$, let $T$ be any operator in $B[H]$, and take an arbitrary $y \in M^\perp$. If $Tx \in M$ whenever $x \in M$, then $(x ; T^*y) = (Tx ; y) = 0$ for every $x \in M$, so that $T^*y \perp M$. Then $T^*y \in M^\perp$ for every $y \in M^\perp$. Conclusion: $T(M) \subseteq M$ implies $T^*(M^\perp) \subseteq M^\perp$. Conversely, since this happens for every $T \in B[H]$, it follows that $T^*(M^\perp) \subseteq M^\perp$ implies $T^{**}(M^{\perp\perp}) \subseteq M^{\perp\perp}$. But $T^{**} = T$ and $M^{\perp\perp} = M^- = M$ (cf. Propositions 5.15
and 5.65(c)). Therefore, $T^*(M^\perp) \subseteq M^\perp$ implies $T(M) \subseteq M$. Finally, note that $\{0\} \neq M \neq H$ if and only if $\{0\} \neq M^\perp \neq H$ (cf. Proposition 5.15). □

Corollary 5.75. A subspace $M$ of a Hilbert space $H$ reduces $T \in B[H]$ if and only if it is invariant for both $T$ and $T^*$. In this case $(T|_M)^* = T^*|_M$.

Proof. Since $M^{\perp\perp} = M$, the previous proposition says that $T(M^\perp) \subseteq M^\perp$ if and only if $T^*(M) \subseteq M$. Therefore, $T(M) \subseteq M$ and $T(M^\perp) \subseteq M^\perp$ if and only if $T(M) \subseteq M$ and $T^*(M) \subseteq M$. Moreover, in this case we get $((T|_M)x ; y) = (Tx ; y) = (x ; T^*y) = (x ; (T^*|_M)y)$ for every $x$ and $y$ in the Hilbert space $M$. □

Recall that $N(T)$ and $R(T)^-$ are invariant subspaces for $T \in B[H]$ (see Problems 4.20 to 4.22). In fact, null spaces of Hilbert space operators constitute an important source of invariant subspaces. The next result shows that $N(T^*)^\perp$, $R(T^*)^\perp$, $N(T^*T)$, and $R(TT^*)^-$ also are invariant subspaces for $T \in B[H]$.
Proposition 5.76. If $T$ is a bounded linear transformation of a Hilbert space $H$ into a Hilbert space $K$, then

(a) $N(T) = R(T^*)^\perp = N(T^*T)$,

(b) $R(T)^- = N(T^*)^\perp = R(TT^*)^-$,

(a*) $N(T^*) = R(T)^\perp = N(TT^*)$,

(b*) $R(T^*)^- = N(T)^\perp = R(T^*T)^-$.

Proof. Note that $x \in R(T^*)^\perp$ if and only if $(x ; T^*y) = 0$ for every $y \in K$. By the definition of adjoint, this is equivalent to $(Tx ; y) = 0$ for every $y \in K$, which means that $Tx = 0$; that is, $x \in N(T)$. Hence

$$R(T^*)^\perp = N(T).$$

Moreover, since $\|Tx\|^2 = (Tx ; Tx) = (T^*Tx ; x)$ for every $x \in H$, it follows that $N(T^*T) \subseteq N(T)$. But $N(T) \subseteq N(T^*T)$ trivially, and so

$$N(T) = N(T^*T),$$

which completes the proof of (a). Since (a) holds true for every $T \in B[H,K]$, it also holds for $T^* \in B[K,H]$ and $TT^* \in B[K]$. Therefore (cf. Propositions 5.15 and 5.65(c)),

$$R(T)^- = R(T)^{\perp\perp} = N(T^*)^\perp = N(TT^*)^\perp = R\big((TT^*)^*\big)^{\perp\perp} = R(TT^*)^{\perp\perp} = R(TT^*)^-,$$
which proves assertion (b). Since $T^{**} = T$, we get the dual expressions (a*) and (b*). □
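Proposition 5.76(a) can be sanity-checked numerically via the singular value decomposition. The rank-deficient matrix below is randomly generated, and `null_basis` is a helper defined here (not from the text).

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5)) + 1j * rng.standard_normal((2, 5))
T = B @ C                                   # a 4x5 matrix of rank 2

def null_basis(M, tol=1e-10):
    """Orthonormal basis of N(M), read off the SVD."""
    _, sv, Vh = np.linalg.svd(M)
    rank = int(np.sum(sv > tol))
    return Vh[rank:].conj().T               # columns span the null space

NT = null_basis(T)                          # basis of N(T), dimension 3
NTT = null_basis(T.conj().T @ T)            # basis of N(T*T)

# N(T) = N(T*T): same dimension, and T*T annihilates N(T)
same_null = (NT.shape == NTT.shape
             and np.allclose(T.conj().T @ T @ NT, 0))
# N(T) = R(T*)^perp: the columns of T* are orthogonal to N(T)
perp_range = np.allclose(NT.conj().T @ T.conj().T, 0)
```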
Here is a useful result concerning closed ranges and adjoints.
Proposition 5.77. Let $H$ and $K$ be Hilbert spaces and take any $T \in B[H,K]$. The following assertions are pairwise equivalent.

(a) $R(T) = R(T)^-$.

(b) $R(T^*) = R(T^*)^-$.

(c) $R(T^*T) = R(T^*T)^-$.

(d) $R(TT^*) = R(TT^*)^-$.

Proof. Let $T$ be an arbitrary bounded linear transformation of a Hilbert space $H$ into a Hilbert space $K$.
Proof of (a)⇒(b). Set $T_0 = T|_{N(T)^\perp} \in B[N(T)^\perp, K]$, the restriction of $T$ to $N(T)^\perp$. Recall that $H = N(T) + N(T)^\perp$ (Proposition 4.13 and Theorem 5.20), and hence every $x \in H$ can be written as $x = u + v$ with $u \in N(T)$ and $v \in N(T)^\perp$. If $y \in R(T)$, then $y = Tx = Tu + Tv = Tv = T|_{N(T)^\perp}v$ for some $x \in H$, so that $y \in R(T|_{N(T)^\perp})$. Therefore, $R(T) \subseteq R(T|_{N(T)^\perp})$. Since $R(T|_{N(T)^\perp}) \subseteq R(T)$, it follows that

$$R(T_0) = R(T) \qquad\text{and}\qquad N(T_0) = \{0\}$$

because $N(T|_{N(T)^\perp}) = \{0\}$. If $R(T) = R(T)^-$, then Corollary 4.24 ensures the existence of $T_0^{-1} \in B[R(T), N(T)^\perp]$. Now take an arbitrary $w \in N(T)^\perp$ and consider the functional $f_w: R(T) \to F$ defined by

$$f_w(y) = (T_0^{-1}y ; w)$$

for every $y \in R(T)$, which is linear (reason: $T_0^{-1}$ is linear and the inner product is linear in the first argument) and bounded (in fact, $|f_w(y)| \le \|T_0^{-1}\|\,\|w\|\,\|y\|$ for every $y \in R(T)$). The Riesz Representation Theorem (Theorem 5.62) says that there exists $z_w$ in the Hilbert space $R(T)$ (recall: $R(T)$ is a subspace of the Hilbert space $K$, and so a Hilbert space itself, whenever $R(T) = R(T)^-$) such that

$$f_w(y) = (y ; z_w)$$

for every $y \in R(T)$. Consider the decomposition $H = N(T) + N(T)^\perp$ (again) and take any $x \in H$, so that $x = u + v$ with $u \in N(T)$ and $v \in N(T)^\perp$. Then

$$(x ; T^*z_w) = (Tx ; z_w) = (Tu ; z_w) + (Tv ; z_w) = (Tv ; z_w) = f_w(Tv) = (T_0^{-1}Tv ; w) = (T_0^{-1}T_0v ; w) = (v ; w) = (u ; w) + (v ; w) = (x ; w).$$
Hence $(x ; T^*z_w - w) = 0$ for every $x \in H$, which means that $T^*z_w = w$. Therefore, $w \in R(T^*)$. This shows that $N(T)^\perp \subseteq R(T^*)$. On the other hand, $R(T^*) \subseteq R(T^*)^- = N(T)^\perp$ by Proposition 5.76(b*), so that

$$R(T^*) = N(T)^\perp.$$

Thus (a) implies (b) by Proposition 5.12 (or by Proposition 5.76(b*)). Since (a) implies (b), it follows that (b) implies (a) because $T^{**} = T$.

Proof of (a)⇒(c) and (b)⇒(d). Let $T_1 \in B[H, R(T)]$ be defined by $T_1x = Tx$ for every $x \in H$ (i.e., $T_1$ is surjective and coincides with $T$ except for its codomain). It is clear that

$$R(T) = R(T_1).$$

Let $T_1^* \in B[R(T), H]$ be the adjoint of $T_1$. Now consider the restriction $T^*|_{R(T)} \in B[R(T), H]$ of the adjoint $T^* \in B[K,H]$ of $T$ to $R(T)$, and note that

$$(x ; T_1^*y) = (T_1x ; y) = (Tx ; y) = (x ; T^*y) = (x ; T^*|_{R(T)}y)$$

for every $x \in H$ and every $y \in R(T)$. Then $T_1^*y = T^*|_{R(T)}y$ for every $y \in R(T)$ (i.e., $T_1^* = T^*|_{R(T)}$), and hence

$$R(T_1^*) = R(T^*|_{R(T)}).$$

Observe that $x$ lies in $R(T^*|_{R(T)})$ if and only if $x = T^*|_{R(T)}y = T^*y$ for some $y \in R(T)$. But this is equivalent to $x = T^*Tu$ for some $u \in H$, which means $x \in R(T^*T)$. That is, $R(T^*|_{R(T)}) = R(T^*T)$, so that

$$R(T_1^*) = R(T^*T).$$

If $R(T) = R(T)^-$, then $R(T_1) = R(T_1)^-$. Since (a) implies (b), it follows that $R(T_1^*) = R(T_1^*)^-$. Therefore,

$$R(T^*T) = R(T^*T)^-.$$

Conclusion: (a) implies (c), and hence (b) implies (d) (for $T^{**} = T$).

Proof of (d)⇒(a) and (c)⇒(b). According to Proposition 5.76(b),

$$R(TT^*) \subseteq R(T) \subseteq R(T)^- = R(TT^*)^-.$$

Then (d) implies (a), and so (c) implies (b) (because $T^{**} = T$). □
We close this section by introducing an important notion. Let $\mathcal{A}$ be a unital (complex) Banach algebra (cf. Definition 4.17) and suppose there exists a mapping
$\mathcal{A} \to \mathcal{A}$, denoted by $A \mapsto A^*$, that satisfies the following conditions for all $A, B \in \mathcal{A}$ and all $\alpha \in \mathbb{C}$:

(i) $(A^*)^* = A$,

(ii) $(AB)^* = B^*A^*$,

(iii) $(A + B)^* = A^* + B^*$,

(iv) $(\alpha A)^* = \bar\alpha A^*$.

Such a mapping is called an involution on $\mathcal{A}$. A C*-algebra is a unital Banach algebra with an involution $*: \mathcal{A} \to \mathcal{A}$ such that

(v) $\|A^*A\| = \|A\|^2$

for every $A \in \mathcal{A}$. It is clear that if $H$ is a complex Hilbert space, then $B[H]$ is a C*-algebra, where $*$ is the adjoint operation. Every $T \in B[H]$ determines a C*-subalgebra $C^*(T)$ of $B[H]$, which is the smallest C*-algebra of operators from $B[H]$ containing both $T$ and the identity $I$. It can be shown that $C^*(T) = \mathcal{P}(T, T^*)^-$, the closure in $B[H]$ of the set of all polynomials in $T$ and $T^*$ with complex coefficients. We mention the Gelfand–Naimark Theorem, which asserts the converse: every C*-algebra is isometrically *-isomorphic to a C*-subalgebra of $B[H]$. That is, for every C*-algebra $\mathcal{A}$ there exists an isometric isomorphism of $\mathcal{A}$ onto a C*-subalgebra of $B[H]$ that preserves the involution $*$. A great deal of the rest of this book can be posed in an abstract C*-algebra. However, we shall stick to $B[H]$.
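The C*-identity (v) for $B[\mathbb{C}^n]$, with the operator (spectral) norm and the conjugate transpose as involution, admits a quick numerical check (random matrix; a sketch, not a proof).

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

op_norm = lambda M: np.linalg.norm(M, 2)   # largest singular value
lhs = op_norm(A.conj().T @ A)              # ||A*A||
rhs = op_norm(A) ** 2                      # ||A||^2
c_star_identity = abs(lhs - rhs) < 1e-10 * max(1.0, rhs)
```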
5.13 Self-Adjoint Operators
Throughout this section $H$ and $K$ will stand for Hilbert spaces. An operator $T$ in $B[H]$ is self-adjoint (or Hermitian) if $T^* = T$. By the definition of the adjoint operator, $T \in B[H]$ is self-adjoint if and only if

$$(Tx ; y) = (x ; Ty) \qquad\text{for every}\qquad x, y \in H.$$
Proposition 5.78. If $T \in B[H]$ is self-adjoint, then

$$\|T\| = \sup_{\|x\|=1} |(Tx ; x)|.$$

Proof. Let $T$ be any operator in $B[H]$. The Schwarz inequality says that $|(Tx ; x)| \le \|Tx\|\,\|x\| \le \|T\|\,\|x\|^2$ for every $x \in H$. Then

$$\sup_{\|u\|=1} |(Tu ; u)| \le \|T\|.$$
On the other hand, note that $(T(x \pm y) ; x \pm y) = (Tx ; x) \pm (Tx ; y) \pm (Ty ; x) + (Ty ; y)$ for every $x, y \in H$. If $T = T^*$, then $(Ty ; x) = (y ; Tx) = \overline{(Tx ; y)}$, and hence $(Tx ; y) + (Ty ; x) = 2\,\mathrm{Re}\,(Tx ; y)$. Thus

$$(T(x \pm y) ; x \pm y) = (Tx ; x) \pm 2\,\mathrm{Re}\,(Tx ; y) + (Ty ; y),$$

so that

$$(T(x + y) ; x + y) - (T(x - y) ; x - y) = 4\,\mathrm{Re}\,(Tx ; y)$$

for every $x, y \in H$. But $|(Tz ; z)| = |(T(\|z\|^{-1}z) ; \|z\|^{-1}z)|\,\|z\|^2$ for every nonzero vector $z$ in $H$, and so

$$|(Tz ; z)| \le \sup_{\|u\|=1} |(Tu ; u)|\,\|z\|^2$$

for every $z \in H$. By the above two relations and the parallelogram law,

$$4\,\mathrm{Re}\,(Tx ; y) \le |(T(x+y) ; x+y)| + |(T(x-y) ; x-y)| \le \sup_{\|u\|=1} |(Tu ; u)|\,\big(\|x+y\|^2 + \|x-y\|^2\big) = 2\sup_{\|u\|=1} |(Tu ; u)|\,\big(\|x\|^2 + \|y\|^2\big) = 4\sup_{\|u\|=1} |(Tu ; u)|$$

when $\|x\| = \|y\| = 1$. Consider the polar representation $(Tx ; y) = |(Tx ; y)|e^{i\theta}$ and set $x' = e^{-i\theta}x$, so that $|(Tx ; y)| = e^{-i\theta}(Tx ; y) = (Tx' ; y)$. Since $\|x'\| = \|x\| = 1$ and $|(Tx ; y)| = (Tx' ; y) = \mathrm{Re}\,(Tx' ; y)$,

$$|(Tx ; y)| \le \sup_{\|u\|=1} |(Tu ; u)| \qquad\text{whenever}\qquad \|x\| = \|y\| = 1.$$

Therefore, according to Corollary 5.71,

$$\|T\| = \sup_{\|x\|=\|y\|=1} |(Tx ; y)| \le \sup_{\|u\|=1} |(Tu ; u)|. \qquad\square$$
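For a Hermitian matrix, Proposition 5.78 says that the operator norm equals $\sup_{\|x\|=1}|(Tx ; x)|$, which in finite dimensions is the largest absolute eigenvalue. The sketch below (arbitrary random Hermitian matrix) verifies the eigenvalue identity and uses a crude random search over the unit sphere to approximate the supremum from below.

```python
import numpy as np

rng = np.random.default_rng(5)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = (G + G.conj().T) / 2                      # self-adjoint: T* = T

norm_T = np.linalg.norm(T, 2)                 # operator norm
max_abs_eig = np.max(np.abs(np.linalg.eigvalsh(T)))
norm_equals_eig = abs(norm_T - max_abs_eig) < 1e-10

best = 0.0
for _ in range(20_000):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    x /= np.linalg.norm(x)
    best = max(best, abs(np.vdot(x, T @ x)))  # |(Tx ; x)| on the unit sphere
# best never exceeds ||T||, and approaches it as the search is refined
```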
It is worth noticing now that there are non-self-adjoint operators $T$ in $B[H]$ such that $\|T\| = \sup_{\|x\|=1} |(Tx ; x)|$. The class of all operators for which this norm identity holds will be characterized in Chapter 6 (these are the normaloid operators, a class which includes the normal operators of Section 6.1). The next proposition gives a necessary and sufficient condition for an operator on a complex Hilbert space to be self-adjoint.

Proposition 5.79. If $H$ is a complex Hilbert space, then $T \in B[H]$ is self-adjoint if and only if $(Tx ; x) \in \mathbb{R}$ for every $x \in H$.

Proof. If $T = T^*$, then $\overline{(Tx ; x)} = (x ; Tx) = (Tx ; x)$, and so $(Tx ; x)$ is a real number, for every $x \in H$. Conversely, if $(Tx ; x) \in \mathbb{R}$ for every $x \in H$, then $(Tx ; x) = \overline{(Tx ; x)} = (x ; Tx)$ for every $x \in H$, and hence (cf. Problem 5.3(b))

$$(Tx ; y) = (x ; Ty)$$
for every $x, y \in H$; that is, $T$ is self-adjoint. □

Remark: If $H$ is a real Hilbert space, then $(Tx ; x) \in \mathbb{R}$ for every $x \in H$ and every $T$ in $B[H]$. Since there exist non-self-adjoint operators on real Hilbert spaces, the above proposition fails when $H$ is a real Hilbert space. Sample: if $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in B[\mathbb{R}^2]$, then $O \neq A \neq A^* = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $(Ax ; x) = 0$ for all $x \in \mathbb{R}^2$.

Take an arbitrary $z$ in $K$, where $K$ is any (real or complex) Hilbert space. Recall that $(z ; y) = 0$ for every $y \in K$ if and only if $z = 0$. Now take $T \in B[H,K]$. Since $T = O$ if and only if $Tx = 0$ in $K$ for all $x \in H$, it follows that $T = O$ if and only if $(Tx ; y) = 0$ for every $x \in H$ and every $y \in K$. Next set $K = H$ and take $T \in B[H]$. If $H$ is a complex Hilbert space, then $T = O$ if and only if $(Tx ; x) = 0$ for all $x \in H$ (cf. Problem 5.4). This is in general false for a real Hilbert space (see the above sample). However, it certainly holds for every operator (on any Hilbert space) that satisfies the norm identity of Proposition 5.78. In particular, it holds for self-adjoint operators on a real Hilbert space.
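The sample in the remark is easy to verify directly (assuming the reconstructed matrix $A$ above):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
# A is not self-adjoint on the real space R^2
not_self_adjoint = not np.allclose(A, A.T)

# yet (Ax ; x) = x1*x2 - x2*x1 = 0 for every x in R^2
rng = np.random.default_rng(6)
vanishing = all(abs(x @ A @ x) < 1e-12
                for x in rng.standard_normal((100, 2)))
```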
Corollary 5.80. Let $H$ be any Hilbert space. If $T \in B[H]$ is self-adjoint, then $T = O$ if and only if $(Tx ; x) = 0$ for all $x \in H$.

Proposition 5.79 leads to a partial ordering of the set of all self-adjoint operators. Let $Q \in B[H]$ be a self-adjoint operator. We say that $Q$ is nonnegative (notation: $Q \ge O$ or $O \le Q$) if $0 \le (Qx ; x)$ for every $x \in H$. If $0 < (Qx ; x)$ for every nonzero $x$ in $H$, then $Q$ is called positive (notation: $Q > O$ or $O < Q$). If there exists a real number $\alpha > 0$ such that $\alpha\|x\|^2 \le (Qx ; x)$ for every $x \in H$, then $Q$ is called strictly positive (notation: $Q \succ O$ or $O \prec Q$). Trivially,

$$Q \succ O \implies Q > O \implies Q \ge O.$$

Observe that $T^*T \in B[H]$ and $TT^* \in B[K]$ are self-adjoint operators for every $T \in B[H,K]$. In fact, since $0 \le \|Tx\|^2 = (Tx ; Tx) = (T^*Tx ; x)$ for every $x$ in $H$, it follows that $T^*T \ge O$. Dually, since $T^{**} = T$, it also follows that $TT^* \ge O$. The next proposition uses this fact to give necessary and sufficient conditions for a projection to be an orthogonal projection.
Proposition 5.81. If $P \in B[H]$ is a nonzero projection, then the following assertions are pairwise equivalent.

(a) $P$ is an orthogonal projection.

(b) $P$ is self-adjoint.

(c) $P$ is nonnegative.

(d) $\|P\| = 1$.
Proof. If P is an orthogonal projection, then Proposition 5.51 says that R(P) = R(P)⁻ (so that R(P*) = R(P*)⁻ by Proposition 5.77) and R(P) = N(P)⊥. But N(P)⊥ = R(P*)⁻ by Proposition 5.76. Then

R(P) = R(P*).

Now take an arbitrary x ∈ H. The above identity ensures that Px = P*z for some z ∈ H, and hence P*Px = (P*)²z = (P²)*z = P*z = Px. That is, P*P = P, which implies that P is self-adjoint. Therefore (a)⟹(b). If P = P* (i.e., if P is self-adjoint), then (Px ; x) = (P²x ; x) = (Px ; Px) = ‖Px‖² ≥ 0 for every x ∈ H. This shows that (b)⟹(c). If P ≥ O (i.e., if P is nonnegative), then P = P* (by definition), so that ‖P‖² = ‖P*P‖ = ‖P²‖ = ‖P‖ (cf. Proposition 5.65), and hence ‖P‖ = 1 (for P ≠ O). Thus (c)⟹(d). Finally, suppose ‖P‖ = 1 and take an arbitrary v ∈ N(P)⊥. Since R(I − P) = N(P), it follows that (I − P)v ∈ N(P). Then (I − P)v ⊥ v, so that 0 = ((I − P)v ; v) = ‖v‖² − (Pv ; v), and hence ‖v‖² = (Pv ; v) ≤ ‖Pv‖‖v‖ ≤ ‖P‖‖v‖² = ‖v‖². This implies that ‖Pv‖ = ‖v‖ and (Pv ; v) = ‖v‖². Therefore,

‖(I − P)v‖² = ‖Pv − v‖² = ‖Pv‖² − 2 Re (Pv ; v) + ‖v‖² = 0,

and so v ∈ N(I − P) = R(P). Hence N(P)⊥ ⊆ R(P). On the other hand, if y ∈ R(P), then y = u + v, where u ∈ N(P) and v ∈ N(P)⊥ (reason: R(P) ⊆ H = N(P) + N(P)⊥). Since N(P)⊥ ⊆ R(P) = {x ∈ H : Px = x}, it follows that y = Py = Pu + Pv = Pv = v. Thus y ∈ N(P)⊥, and hence R(P) ⊆ N(P)⊥. Then R(P) = N(P)⊥, so that R(P) ⊥ N(P). Outcome: (d)⟹(a). ∎

Take S, T ∈ B[H]. If T − S is self-adjoint (in particular, if T and S are self-adjoint) and O ≤ T − S, then we write S ≤ T, so that S ≤ O means O ≤ −S. It is easy to show that ≤ defines a reflexive, transitive and antisymmetric relation on the set of all self-adjoint operators on H, and hence a partial ordering of it. Similarly,
we write S < T or S ≺ T whenever O < T − S or O ≺ T − S. Moreover, ‖T‖ ≤ 1 if and only if T*T ≤ I, and ‖T‖ < 1 if and only if T*T ≺ I (i.e., T*T ≤ βI for some β ∈ (0, 1)). Why?

Proposition 5.82. If Q ∈ B[H] is nonnegative, then

|(Qx ; y)|² ≤ (Qx ; x)(Qy ; y)  for every x, y ∈ H,

and hence

‖Qx‖² ≤ ‖Q‖(Qx ; x)  for every x ∈ H.

Proof. Take a nonnegative operator Q ∈ B[H] and consider the function ( ; )_Q : H×H → F given by

(x ; y)_Q = (Qx ; y)
5. Hilbert Spaces
for every x, y ∈ H. It is readily verified that ( ; )_Q is a semi-inner product on H. Let ‖ ‖_Q be the seminorm induced on H by ( ; )_Q, so that ‖x‖_Q² = (Qx ; x) for every x ∈ H. Since the Schwarz inequality holds in a semi-inner product space, it follows that

|(Qx ; y)|² = |(x ; y)_Q|² ≤ ‖x‖_Q² ‖y‖_Q² = (Qx ; x)(Qy ; y)

for every x, y ∈ H. In particular, by setting y = Qx we get

‖Qx‖⁴ = (Qx ; Qx)² ≤ (Qx ; x)(Q²x ; Qx) ≤ (Qx ; x)‖Q²x‖‖Qx‖ ≤ (Qx ; x)‖Q‖‖Qx‖²,

which implies that ‖Qx‖² ≤ ‖Q‖(Qx ; x) for every x ∈ H. ∎
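Both inequalities of Proposition 5.82 are easy to test numerically in finite dimensions (a sketch with matrices of my own choosing, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# A nonnegative operator on a 5-dimensional real "H": Q = A^T A >= O.
A = rng.standard_normal((5, 5))
Q = A.T @ A

x = rng.standard_normal(5)
y = rng.standard_normal(5)

# Generalized Schwarz inequality: |(Qx ; y)|^2 <= (Qx ; x)(Qy ; y).
lhs = np.dot(Q @ x, y) ** 2
rhs = np.dot(Q @ x, x) * np.dot(Q @ y, y)
assert lhs <= rhs * (1 + 1e-12)

# Consequence: ||Qx||^2 <= ||Q|| (Qx ; x), with ||Q|| the operator norm.
norm_Q = np.linalg.norm(Q, 2)
assert np.dot(Q @ x, Q @ x) <= norm_Q * np.dot(Q @ x, x) * (1 + 1e-12)
```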
Positive operators are not necessarily invertible. Indeed,

Q > O  ⟺  Q ≥ O and N(Q) = {0}  ⟺  Q ≥ O and R(Q)⁻ = H.

The nontrivial part of the first equivalence follows from Proposition 5.82, and the second one is a straightforward consequence of Propositions 5.15 and 5.76(b'): R(Q)⊥ = N(Q*) = N(Q), so that R(Q)⁻ = H if and only if N(Q) = {0}. Therefore, Q > O has an inverse on its range, which is not necessarily bounded. However, strictly positive operators are invertible. In fact, if Q ≻ O, then Q is bounded below (Schwarz inequality). Conversely, if Q ≥ O is bounded below, then Q ≻ O (cf. Proposition 5.82). Thus, by the above displayed equivalence and Corollary 4.24,

Q ≻ O  ⟺  O ≤ Q ∈ G[H].
Note: If H is finite-dimensional, then Q ≻ O if and only if Q > O. Why?

Example 5R. Take a sequence a = {α_k}_{k≥0} in ℓ₊∞ and consider the diagonal operator D_a = diag({α_k}_{k≥0}) in B[ℓ₊²] of Examples 4H and 4J, now acting on the Hilbert space ℓ₊². It is readily verified that

D_a = D_a*  if and only if  α_k ∈ ℝ for all k ≥ 0,
D_a ≥ O   if and only if  α_k ≥ 0 for all k ≥ 0,
D_a > O   if and only if  α_k > 0 for all k ≥ 0,
D_a ≻ O   if and only if  D_a = D_a* and inf_k α_k > 0.
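A finite truncation of Example 5R illustrates the gap between positive and strictly positive (my own numerical illustration, not from the text): with α_k = 1/(k+1) every α_k is strictly greater than 0, yet inf_k α_k = 0, so no α > 0 bounds the quadratic form from below.

```python
import numpy as np

n = 1000
alpha = 1.0 / (np.arange(n) + 1.0)   # a_k = 1/(k+1) > 0, with inf_k a_k -> 0

# D_a > O: the quadratic form (D_a x ; x) is positive for nonzero x ...
x = np.zeros(n)
x[-1] = 1.0                          # the unit vector e_{n-1}
form = np.sum(alpha * x * x)         # (D_a x ; x)
assert form > 0.0

# ... but D_a is not strictly positive: (D_a e_k ; e_k) = a_k gets
# arbitrarily small as k grows, so a ||x||^2 <= (D_a x ; x) fails
# for every fixed a > 0 once n is large enough.
assert form == alpha[-1]
assert alpha.min() < 1e-2
```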
Let B⁺[H] denote the weakly closed convex cone of all nonnegative operators in B[H] (see Problem 5.49) and set G⁺[H] = B⁺[H] ∩ G[H]: the class of all strictly positive operators on H. Here is a rather useful corollary of Proposition 5.82.
Corollary 5.83. If {Q_n} is a sequence of nonnegative operators on H (i.e., if Q_n ∈ B⁺[H] for every integer n), then

Q_n →ʷ O  if and only if  Q_n →ˢ O.

Proof. Recall that strong convergence always implies weak convergence (to the same limit). On the other hand, since H is a Hilbert space, sup_n ‖Q_n‖ < ∞ whenever {Q_n} converges weakly. Therefore, according to Proposition 5.82, Q_n →ʷ O implies Q_n →ˢ O, for

‖Q_n x‖² ≤ sup_k ‖Q_k‖ (Q_n x ; x)

for every x ∈ H and every integer n (see Proposition 5.67). ∎
An immediate consequence of the above corollary reads as follows. If T ∈ B[H] is such that, for each n ≥ 1, either O ≤ T_n − T or O ≤ T − T_n, then T_n →ʷ T if and only if T_n →ˢ T. This is what is behind the next proposition. Let {T_n} be a B[H]-valued sequence. We say that {T_n} is increasing or decreasing if O ≤ T_{n+1} − T_n or O ≤ T_n − T_{n+1}, respectively, for every integer n. If it is increasing or decreasing, then we say that {T_n} is monotone.
Proposition 5.84. A bounded monotone sequence of self-adjoint operators converges strongly.
Proof. Suppose {T_n} is a bounded decreasing sequence of self-adjoint operators in B[H] and take an arbitrary x ∈ H. Thus {(T_n x ; x)} is a real-valued bounded decreasing sequence (see Proposition 5.79 and recall that sup_n |(T_n x ; x)| ≤ sup_n ‖T_n‖‖x‖²). Therefore, {(T_n x ; x)} converges in ℝ (Problem 3.10), and so in F, which implies that T_n →ʷ T for some T ∈ B[H] (Problem 5.48). Moreover, as {T_n} is decreasing, (T_n x ; x) ≥ (T_{n+k} x ; x) → (Tx ; x) as k → ∞, so that T_n − T ∈ B⁺[H] for every integer n, and hence Corollary 5.83 ensures that T_n − T →ˢ O. If {T_n} is a bounded increasing sequence of self-adjoint operators, then it is clear that {−T_n} is a bounded decreasing sequence of self-adjoint operators. The above argument ensures that {−T_n} also converges strongly, and so does {T_n}. ∎
5.14 Square Root and Polar Decomposition
We shall also assume throughout this section that H and K are Hilbert spaces. If T is an operator in B[H], and if there exists an operator S in B[H] such that S² = T, then S is referred to as a square root of T. Nonnegative operators have a unique nonnegative square root.
Theorem 5.85. Every operator Q in B⁺[H] has a unique square root Q^{1/2} in B⁺[H], which commutes with every operator in B[H] that commutes with Q.

Proof. (a) First we show that there is no loss of generality in assuming O ≤ Q ≤ I. In fact, if O ≤ Q and Q ≠ O, then O ≤ ‖Q‖⁻¹Q ≤ I (since (‖Q‖⁻¹Qx ; x) = ‖Q‖⁻¹(Qx ; x) ≤ ‖Q‖⁻¹‖Q‖‖x‖² = (x ; x) for every x ∈ H). Hence, if there exists an S ≥ O such that S² = ‖Q‖⁻¹Q, then set Q^{1/2} = ‖Q‖^{1/2}S, so that Q^{1/2} ≥ O and (Q^{1/2})² = ‖Q‖S² = Q. Moreover, if S is the unique nonnegative operator such that S² = ‖Q‖⁻¹Q, then Q^{1/2} is the unique nonnegative operator such that (Q^{1/2})² = Q. Indeed, if A ≥ O is such that A² = Q, then ‖Q‖^{-1/2}A ≥ O and (‖Q‖^{-1/2}A)² = ‖Q‖⁻¹Q. If S is the unique nonnegative operator such that S² = ‖Q‖⁻¹Q, then ‖Q‖^{-1/2}A = S, so that A = ‖Q‖^{1/2}S = Q^{1/2}. Finally, it is clear that if ‖Q‖⁻¹QT = T‖Q‖⁻¹Q implies ST = TS, then (since S = ‖Q‖^{-1/2}Q^{1/2}) QT = TQ implies Q^{1/2}T = TQ^{1/2}.
(b) Thus suppose O ≤ Q ≤ I and set R = I − Q, so that O ≤ R ≤ I. Consider the sequence {B_n}_{n≥0} in B[H] recursively defined as follows:

B_{n+1} = ½(R + B_n²)  with  B_0 = O.

Claim 1. O ≤ B_n for every integer n ≥ 0.

Proof. The proof goes by induction. First observe that B_0 is trivially self-adjoint. If B_n is self-adjoint for some n ≥ 0, then B_{n+1} is self-adjoint because (B_n²)* = (B_n*)² = B_n² and R* = R. Hence B_n is self-adjoint for every n ≥ 0. Now note that B_0 ≥ O trivially. If B_n ≥ O for some n ≥ 0, then (B_{n+1} x ; x) = ½((Rx ; x) + ‖B_n x‖²) ≥ 0 for every x ∈ H.

Claim 2. ‖B_n‖ ≤ 1, so that B_n ≤ I, for all n ≥ 0.

Proof. ‖R‖ = sup_{‖x‖=1}(Rx ; x) ≤ sup_{‖x‖=1}(x ; x) = 1 by Proposition 5.78. Trivially, ‖B_0‖ ≤ 1. If ‖B_n‖ ≤ 1 for some n ≥ 0, then ‖B_{n+1}‖ ≤ ½(‖R‖ + ‖B_n‖²) ≤ 1,
which proves by induction that ‖B_n‖ ≤ 1 for all n ≥ 0. Then O ≤ B_n ≤ I for all n ≥ 0 (cf. Claim 1 and Problem 5.55).

Claim 3. {B_n} is an increasing sequence.

Proof. It is readily verified by induction that each B_n is a polynomial in R with positive coefficients. This implies that B_n B_m = B_m B_n for every m, n ≥ 0. Hence

B_{n+2} − B_{n+1} = ½(B_{n+1}² − B_n²) = ½(B_{n+1} − B_n)(B_{n+1} + B_n)

for each n ≥ 0. Observe that B_1 − B_0 = ½R and B_2 − B_1 = ⅛R² are polynomials in R with positive coefficients. Since B_{n+1} + B_n is a polynomial in R with positive coefficients (because each B_n is), it follows that B_{n+2} − B_{n+1} is a polynomial in R with positive coefficients whenever B_{n+1} − B_n is. This proves by induction that B_{n+1} − B_n is a polynomial in R with positive coefficients for every n ≥ 0, and so B_{n+1} − B_n ≥ O for every n ≥ 0 because R ≥ O (cf. Problem 5.52(e)). Since {B_n} is a bounded monotone sequence of self-adjoint operators, it converges strongly by Proposition 5.84. That is,
B_n →ˢ B  and hence  I − B_n →ˢ I − B

for some B ∈ B[H]. Since O ≤ B_n ≤ I (i.e., both B_n and I − B_n lie in B⁺[H]) for each n, and since B⁺[H] is weakly (thus strongly) closed in B[H] (cf. Problem 5.49), it follows that O ≤ B ≤ I. Moreover, as B_n →ˢ B, we get B_n² →ˢ B² (Problem 4.46(a)). Therefore,

O ≤ B = ½(R + B²) ≤ I,

so that R = 2B − B², and hence Q = I − R = B² − 2B + I = (I − B)². Recalling that O ≤ I − B ≤ I, set Q^{1/2} = I − B, which is then a nonnegative square root of Q.
(c) Suppose TQ = QT for some T ∈ B[H]. Thus T(I − R) = (I − R)T, so that TR = RT, and hence Tp(R) = p(R)T for every polynomial p in R. Then TB_n = B_n T (cf. proof of Claim 3) for each n. Since TB_n →ˢ TB and B_n T →ˢ BT, it follows that TB = BT (the strong limit is unique). Therefore, T(I − B) = (I − B)T, so that TQ^{1/2} = Q^{1/2}T. That is, Q^{1/2} commutes with every operator that commutes with Q.
(d) Finally, we show that the nonnegative square root Q^{1/2} of Q is unique. Take any A ≥ O such that A² = Q. Since AQ = A³ = QA, it follows by (c) that AQ^{1/2} = Q^{1/2}A. Then

(Q^{1/2} − A)Q^{1/2}(Q^{1/2} − A) + (Q^{1/2} − A)A(Q^{1/2} − A) = (Q^{1/2} − A)(Q^{1/2} + A)(Q^{1/2} − A) = (Q^{1/2} − A)(Q − A²) = O.

But (Q^{1/2} − A)Q^{1/2}(Q^{1/2} − A) ≥ O and (Q^{1/2} − A)A(Q^{1/2} − A) ≥ O (for Q^{1/2} and A are nonnegative; cf. Problem 5.51(a)), which implies that these two operators are null, and so is their difference. That is, (Q^{1/2} − A)³ = (Q^{1/2} − A)Q^{1/2}(Q^{1/2} − A) − (Q^{1/2} − A)A(Q^{1/2} − A) = O, and hence (Q^{1/2} − A)⁴ = O. Since Q^{1/2} − A is self-adjoint, ‖Q^{1/2} − A‖⁴ = ‖(Q^{1/2} − A)⁴‖ = 0 (see Problem 5.45(e)), so that A = Q^{1/2}. ∎

Proposition 5.86. If Q ∈ B⁺[H], then
(a) ‖Q^{1/2}‖² = ‖Q‖ = ‖Q²‖^{1/2},

(b) N(Q^{1/2}) = N(Q) = N(Q²) and R(Q^{1/2})⁻ = R(Q)⁻ = R(Q²)⁻.
Proof. If Q is nonnegative, then Q² is nonnegative (Problem 5.52). Hence Q is the unique nonnegative square root of Q² (Theorem 5.85). Therefore, the identities involving Q and Q² follow at once if the identities involving Q^{1/2} and Q are established.

(a) Q = (Q^{1/2})*Q^{1/2} implies ‖Q^{1/2}‖² = ‖(Q^{1/2})*Q^{1/2}‖ = ‖Q‖ (Proposition 5.65).

(b) Since Q = Q^{1/2}Q^{1/2}, it follows that N(Q^{1/2}) ⊆ N(Q). On the other hand, ‖Q^{1/2}x‖² = (Q^{1/2}x ; Q^{1/2}x) = (Qx ; x) for every x ∈ H, and hence N(Q) ⊆ N(Q^{1/2}). Then N(Q) = N(Q^{1/2}). Therefore, by Proposition 5.76, R(Q)⁻ = N(Q)⊥ = N(Q^{1/2})⊥ = R(Q^{1/2})⁻. ∎
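The recursion B_{n+1} = ½(R + B_n²) in the proof of Theorem 5.85 is effectively an algorithm for Q^{1/2}. A sketch (with a 4×4 nonnegative Q of my own choosing, arranged so that O ≤ Q ≤ I; not from the text):

```python
import numpy as np

def sqrt_via_iteration(Q, steps=200):
    # B_{n+1} = (R + B_n^2)/2 with R = I - Q and B_0 = O;
    # the B_n increase to B, and Q^{1/2} = I - B.
    n = Q.shape[0]
    R = np.eye(n) - Q
    B = np.zeros_like(Q)
    for _ in range(steps):
        B = 0.5 * (R + B @ B)
    return np.eye(n) - B

# A nonnegative Q with known spectrum in [0, 1] (an assumption for the
# demo; the theorem itself needs no spectral information).
rng = np.random.default_rng(7)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Q = U @ np.diag([1.0, 0.5, 0.1, 0.04]) @ U.T

S = sqrt_via_iteration(Q)
assert np.allclose(S @ S, Q, atol=1e-8)          # S^2 = Q
assert np.allclose(S, S.T)                       # S is self-adjoint
assert np.min(np.linalg.eigvalsh(S)) >= -1e-10   # and nonnegative
```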
A partial isometry is a bounded linear transformation that acts isometrically on the orthogonal complement of its null space. That is, W ∈ B[H, K] is a partial isometry if W|_{N(W)⊥} : N(W)⊥ → K, the restriction of W to N(W)⊥, is an isometry. Note: In this context, "isometry" means "linear isometry".
Proposition 5.87. If W : H → K is a partial isometry, then

W = VP,

where V : N(W)⊥ → K is an isometry and P : H → H is the orthogonal projection onto N(W)⊥. Conversely, let M be any subspace of H. If V : M → K is an isometry and P : H → H is the orthogonal projection onto M, then W = VP : H → K is a partial isometry.

Proof. If W : H → K is a partial isometry, then let P : H → H be the orthogonal projection onto N(W)⊥ (so that R(P) = N(W)⊥, and hence R(I − P) = N(P) = R(P)⊥ = N(W); cf. Propositions 4.13, 5.15 and 5.51). Set V = W|_{N(W)⊥} : N(W)⊥ → K, which is an isometry. Thus, for every x ∈ H,

Wx = W(Px + (I − P)x) = WPx + W(I − P)x = WPx = W|_{N(W)⊥}Px = VPx.

Conversely, let V : M → K be an isometry, and let P : H → H be the orthogonal projection onto M, where M is an arbitrary subspace of H. If W = VP, then Wv = VPv = Vv for every v ∈ R(P) = M, and hence W acts isometrically on M. Moreover, if Wu = 0, then VPu = 0, so that Pu ∈ N(V) = {0} (recall: V is an isometry), which implies that u ∈ N(P). Then N(W) ⊆ N(P), and so N(W) = N(P) (reason: N(P) ⊆ N(W) because W = VP). Therefore, M = R(P) = N(P)⊥ = N(W)⊥. Conclusion: W = VP : H → K is a bounded linear transformation (composition of bounded linear transformations) that acts isometrically on N(W)⊥; that is, W is a partial isometry. ∎
Since W = VP and V = W|_{N(W)⊥}, ‖V‖ ≤ ‖W‖ = ‖VP‖ ≤ ‖V‖‖P‖ and R(W) ⊆ R(V) = R(W|_{N(W)⊥}) ⊆ R(W). Then ‖W‖ = ‖V‖ = 1 and R(W) = R(V). Recall: The range of a linear isometry on a Hilbert space is a subspace (cf. Propositions 4.20 and 4.37). That is, R(V) = R(V)⁻, and hence R(W) = R(W)⁻. Outcome: The range of a partial isometry also is a subspace. N(W)⊥ and R(W) are called the initial and final spaces of the partial isometry W, respectively.
Proposition 5.88. Take W ∈ B[H, K]. The following assertions are pairwise equivalent.

(a) W is a partial isometry with initial space M and final space R.
(b) W*W is the orthogonal projection onto M and R(W) = R.
(c) WW*W = W.
(d) W*WW* = W*.
(e) WW* is the orthogonal projection onto R and N(W)⊥ = M.
(f) W* is a partial isometry with initial space R and final space M.
Proof. Let W be a bounded linear transformation of H into K.

Proof of (a)⟹(c). If (a) holds, then W = VP, where V : M → K is an isometry, P : H → H is the orthogonal projection onto M, and M = N(W)⊥ (Proposition 5.87). Consider the adjoint V* : K → M of V, and recall from Proposition 5.72 that V*V = I, the identity on the Hilbert space M (for M = N(W)⊥ is a subspace of the Hilbert space H by Proposition 5.12, and hence a Hilbert space itself). Since P is an orthogonal projection, it follows by Proposition 5.81 that P = P*, and so W* = P*V* = PV*. Therefore, WW*W = VPPV*VP = VP = W.

Proof of (c)⟹(b). If WW*W = W, then (W*W)² = W*WW*W = W*W, so that W*W is a (continuous) projection, and hence it has a closed range. (Reason: if E = E², then R(E) = N(I − E), and N(I − E) is closed whenever the linear transformation E is bounded; see Proposition 4.13.) Thus R(W*W) = R(W*W)⁻ = N((W*W)*)⊥ = N(W*W)⊥ = N(W)⊥ (Proposition 5.76). Hence R(W*W) = N(W)⊥ and R(W*W) ⊥ N(W*W), which implies that W*W is the orthogonal projection onto M = N(W)⊥.

Proof of (b)⟹(a). If (b) holds, then M = R(W*W) = R(W*W)⁻ = N(W)⊥ (cf. Propositions 5.51 and 5.76). If v ∈ M = R(W*W), then W*Wv = v, and hence ‖Wv‖² = (Wv ; Wv) = (W*Wv ; v) = (v ; v) = ‖v‖², so that W|_M is an isometry. Since M = N(W)⊥, it follows that (b) implies (a).

Conclusion: Assertions (a), (b) and (c) are pairwise equivalent. If W*W is an orthogonal projection, then R(W*W) is closed, which implies that R(W*) and R(W) are closed as well. Similarly, if W* is a partial isometry, then R(W*) is closed, and so is R(W). Therefore, in both cases, R(W*) = N(W)⊥ and N(W*)⊥ = R(W). Since assertions (a), (b) and (c) are pairwise equivalent, it follows that the dual assertions (f), (e) and (d) are pairwise equivalent too. Finally, observe that (c) and (d) are trivially equivalent. ∎

If a transformation T ∈ B[H, K] is the product of a partial isometry W ∈ B[H, K]
and a nonnegative operator Q ∈ B[H], and if N(W) = N(Q), then the representation T = WQ is called the polar decomposition of T. The next theorem says that every bounded linear transformation in B[H, K] has a unique polar decomposition.
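In finite dimensions a partial isometry is exactly a matrix whose singular values are all 0 or 1, which makes the equivalences of Proposition 5.88 easy to check (a sketch; the SVD-based construction is my own, not the text's):

```python
import numpy as np

rng = np.random.default_rng(2)

# Build a partial isometry W : R^4 -> R^5 of rank 2 from an SVD
# with singular values 1, 1, 0, 0.
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
S = np.zeros((5, 4))
S[0, 0] = S[1, 1] = 1.0
W = U @ S @ V.T

# (c): WW*W = W, and its dual (d): W*WW* = W*.
assert np.allclose(W @ W.T @ W, W)
assert np.allclose(W.T @ W @ W.T, W.T)

# (b): W*W is the orthogonal projection onto the initial space M.
P = W.T @ W
assert np.allclose(P @ P, P) and np.allclose(P, P.T)

# W acts isometrically on R(P) = N(W)^perp.
v = P @ rng.standard_normal(4)
assert np.isclose(np.linalg.norm(W @ v), np.linalg.norm(v))
```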
Theorem 5.89. If T ∈ B[H, K], then there exists a partial isometry W ∈ B[H, K] with initial space N(T)⊥ and final space R(T)⁻ such that

T = W(T*T)^{1/2}  and  N(W) = N((T*T)^{1/2}).

Moreover, if T = ZQ, where Q is a nonnegative operator in B[H] and Z ∈ B[H, K] is a partial isometry with N(Z) = N(Q), then

Q = (T*T)^{1/2}  and  Z = W.

Proof. Take any T ∈ B[H, K]. Recall that T*T ∈ B⁺[H] and R((T*T)^{1/2})⁻ = R(T*T)⁻ = N(T*T)⊥ = N((T*T)^{1/2})⊥ = N(T)⊥ (cf. Corollary 5.75 and Proposition 5.86).
Existence. Consider the mapping V₀ : R((T*T)^{1/2}) → K defined by V₀(T*T)^{1/2}x = Tx for every x ∈ H, so that R(V₀) ⊆ R(T). First note that V₀ is linear. Indeed, if y and z lie in R((T*T)^{1/2}) ⊆ H, then y = (T*T)^{1/2}u and z = (T*T)^{1/2}v for some u and v in H, and hence

V₀(αy + βz) = V₀(α(T*T)^{1/2}u + β(T*T)^{1/2}v) = V₀(T*T)^{1/2}(αu + βv) = T(αu + βv) = αTu + βTv = αV₀(T*T)^{1/2}u + βV₀(T*T)^{1/2}v = αV₀y + βV₀z

for every α, β ∈ F. Moreover, since

‖Tx‖² = (Tx ; Tx) = (T*Tx ; x) = ((T*T)^{1/2}x ; (T*T)^{1/2}x) = ‖(T*T)^{1/2}x‖²,

it follows that ‖V₀(T*T)^{1/2}x‖ = ‖Tx‖ = ‖(T*T)^{1/2}x‖ for every x ∈ H. Hence ‖V₀y‖ = ‖y‖ for every y ∈ R((T*T)^{1/2}), so that V₀ : R((T*T)^{1/2}) → K is a linear isometry. Extend it to R(T*T)⁻ = R((T*T)^{1/2})⁻ and get the mapping V : R(T*T)⁻ → K. This is a linear isometry with R(V) ⊆ R(V₀)⁻ ⊆ R(T)⁻. Indeed, since the mapping V₀ : R((T*T)^{1/2}) → R(V₀) is an isometric isomorphism, it follows from Corollary 4.30 that the mapping V : R((T*T)^{1/2})⁻ → R(V₀)⁻ is again an isometric isomorphism. Since V(T*T)^{1/2}x = V₀(T*T)^{1/2}x = Tx for every x ∈ H,

T = V(T*T)^{1/2}, and hence R(T) ⊆ R(V),

where V : N(T)⊥ → K is a linear isometry with R(V) = R(T)⁻. (Recall: N(T)⊥ is a Hilbert space, so that R(V) = R(V)⁻.) Let P : H → H be the orthogonal projection onto N(T)⊥. Then R(P) = N(T)⊥, and so VPy = Vy for every y ∈ N(T)⊥ = R((T*T)^{1/2})⁻, which implies VP(T*T)^{1/2}x = V(T*T)^{1/2}x for every x ∈ H. That is, VP(T*T)^{1/2} = V(T*T)^{1/2}. Setting W = VP : H → K we get

T = W(T*T)^{1/2}.

Since V = W|_{R(P)} = W|_{N(T)⊥} is an isometry and P is an orthogonal projection, it follows that N(W) = N(VP) = N(P) = R(P)⊥ = N(T) = N((T*T)^{1/2}) and R(W) = R(V) = R(T)⁻. Therefore, W : H → K is a partial isometry with initial space N(T)⊥, final space R(T)⁻, and N(W) = N((T*T)^{1/2}).
Uniqueness. Let Z ∈ B[H, K] be a partial isometry with N(Z) = N(Q), where Q is a nonnegative operator in B[H]. Suppose T = ZQ. Since Z*Z is the orthogonal projection onto N(Z)⊥ = N(Q)⊥ = R(Q)⁻ (Propositions 5.76(b') and 5.88(b)), it follows that T*T = QZ*ZQ = Q². Thus Q = (T*T)^{1/2} by uniqueness of the nonnegative square root. Hence Z(T*T)^{1/2} = T = W(T*T)^{1/2}, so that Z|_{R((T*T)^{1/2})} = W|_{R((T*T)^{1/2})}, and therefore

N(Z) = N(W)  and  Z|_{N(Z)⊥} = W|_{N(W)⊥},

because R((T*T)^{1/2})⁻ = R(T*T)⁻ = N(T)⊥ and N(Z) = N(Q) = N((T*T)^{1/2}) = N(T) = N(W). Conclusion: The partial isometries Z : H → K and W : H → K have the same initial space and they coincide there. That is, Z = W. ∎
If T = WQ is the polar decomposition of T, then W*W is the orthogonal projection onto N(W)⊥ = N(Q)⊥ = R(Q)⁻, so that W*WQ = Q. Thus

T = WQ  implies  W*T = Q.
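In finite dimensions the polar decomposition can be computed from the spectral theorem. The sketch below uses a matrix of my own choosing; since T*T is invertible here, N(T) = {0} and W is an isometry. It verifies T = WQ and the identity W*T = Q just displayed:

```python
import numpy as np

T = np.array([[2., 0., 1.],
              [0., 1., 0.],
              [1., 1., 0.],
              [0., 0., 3.],
              [1., 0., 0.]])       # T in B[H, K] with dim H = 3, dim K = 5

# Q = (T*T)^{1/2} via the spectral theorem.
w, U = np.linalg.eigh(T.T @ T)
Q = U @ np.diag(np.sqrt(w)) @ U.T

# W = T Q^{-1}; legitimate here because Q is invertible (N(T) = {0}).
W = T @ np.linalg.inv(Q)

assert np.allclose(W @ Q, T)              # T = WQ
assert np.allclose(W.T @ W, np.eye(3))    # W*W = I: W is an isometry
assert np.allclose(W.T @ T, Q)            # W*T = Q
```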
Here is another corollary of Theorem 5.89.
Corollary 5.90. Let T = WQ be the polar decomposition of a bounded linear transformation T ∈ B[H, K].

(a) W ∈ B[H, K] is an isometry if and only if N(T) = {0}.

(b) W ∈ B[H, K] is a coisometry if and only if R(T)⁻ = K.

Proof. Recall that N(T) = N(W) and R(W) = R(T)⁻ whenever T = WQ is the polar decomposition of T.

(a) A partial isometry W is an isometry if and only if N(W)⊥ = H, which means that N(W) = {0}.

(b) W is a coisometry if and only if W* is an isometry. But W* is a partial isometry with N(W*)⊥ = R(W) (Proposition 5.88). Hence W* is an isometry if and only if R(W) = K. ∎

Therefore, if T = WQ is the polar decomposition of T, then W is unitary (i.e., a surjective isometry or, equivalently, an isometry and a coisometry) if and only if T is injective and has dense range (i.e., if and only if T is quasiinvertible). In particular, if T ∈ G[H, K] (i.e., if T is invertible), then T = U(T*T)^{1/2}, where U ∈ G[H, K] is unitary.
Suggested Reading

Akhiezer and Glazman [1], Arveson [1], Bachman and Narici [1], Balakrishnan [1], Beals [1], Berberian [3], Berezansky, Sheftel and Us [1], Conway [1], [2], Davidson [1], Dunford and Schwartz [2], Fillmore [1], [2], Gohberg and Kreĭn [1], Halmos [1], [4], Kato [1], Kubrusly [1], Murphy [1], Naylor and Sell [1], Pearcy [1], [2], Putnam [1], Radjavi and Rosenthal [1], Reed and Simon [1], Riesz and Sz.-Nagy [1], Schatten [1], Stone [1], Sz.-Nagy and Foiaş [1], Weidmann [1].
Problems

Problem 5.1. Let ‖ ‖ be the norm induced by an inner product ( ; ) on a linear space X ≠ {0}. Show that, for every x ∈ X,

‖x‖ = sup_{‖y‖=1} |(x ; y)| = sup_{‖y‖≤1} |(x ; y)| = sup_{y≠0} |(x ; y)|/‖y‖.

Hint: If ‖x‖ ≠ 0, then ‖x‖ = (x ; ‖x‖⁻¹x). Use the Schwarz inequality.
Problem 5.2. Let (X, ( ; )) be an inner product space and let ‖ ‖ be the norm on X induced by ( ; ). Take arbitrary vectors x and y in X and consider the following assertions.

(a) ‖x + y‖ = ‖x‖ + ‖y‖.
(a') ‖x − y‖ = | ‖x‖ − ‖y‖ |.
(b) |(x ; y)| = ‖x‖‖y‖.
(b') (x ; y) = ‖x‖‖y‖.
(c) x and y are collinear (i.e., either one of them is null or y = αx for some nonzero α in F).
(c') x and y are proportional (i.e., either one of them is null or y = αx for some real α > 0).

Show that the diagram below exhibits all possible implications among the above assertions:

(a) ⟺ (a') ⟺ (b') ⟺ (c')  ⟹  (b) ⟺ (c).

Hint: Consider the following auxiliary assertion.

(d) Re (x ; y) = ‖x‖‖y‖.

Recall that ‖x ± y‖² = ‖x‖² ± 2 Re (x ; y) + ‖y‖² for every x, y ∈ X. Use this identity to show that (a)⟺(d) and (a')⟺(d), and that (d)⟺(b') (use the Schwarz inequality). By setting z = (x ; y)‖y‖⁻²y and z = |(x ; y)|‖y‖⁻²y, show that (b)⟹(c) and (b')⟹(c'). Now use the Schwarz inequality to show that (c)⟹(b) and (c')⟹(b'), and conclude that there may be no other implication in the above diagram. The remaining implications are trivially verified.
Problem 5.3. Let X and Y be inner product spaces and take arbitrary T and L in L[X, Y]. Prove the following identities.

(a) (Tx ; Ly) + (Ty ; Lx) = ½((T(x + y) ; L(x + y)) − (T(x − y) ; L(x − y))) for every x, y ∈ X.

If X is a complex inner product space, then

(b) (Tx ; Ly) = ¼((T(x + y) ; L(x + y)) − (T(x − y) ; L(x − y)) + i(T(x + iy) ; L(x + iy)) − i(T(x − iy) ; L(x − iy)))

for every x, y ∈ X. Note that these yield the polarization identities.

Hint: (T(x + αy) ; L(x + αy)) = (Tx ; Lx) + ᾱ(Tx ; Ly) + α(Ty ; Lx) + |α|²(Ty ; Ly). Set α = ±1 to get (a), and also α = ±i to get (b).

Problem 5.4. Let T ∈ B[X] be an operator on an inner product space X. Show that the following assertions are pairwise equivalent.
(a) T = O.
(b) Tx = 0 for all x ∈ X.
(c) (Tx ; y) = 0 for all x, y ∈ X.

Now consider the following further assertion.

(d) (Tx ; x) = 0 for all x ∈ X.

Clearly, (c) implies (d). If X is a complex inner product space, then show that these four assertions are all pairwise equivalent.

Hint: Use Problem 5.3(b) to show that (d) implies (c) if X is complex.

Problem 5.5. Let {T_n} be a sequence of bounded linear transformations of a Hilbert space H into a Hilbert space K (i.e., T_n ∈ B[H, K] for each n). Show that uniform, strong and weak boundedness coincide. In other words, show that the following assertions are pairwise equivalent.

(a) sup_n ‖T_n‖ < ∞.
(b) sup_n ‖T_n x‖ < ∞ for every x ∈ H.

(c) sup_n |(T_n x ; y)| < ∞ for every x ∈ H and every y ∈ K.

Now set K = H and consider the following further assertion.

(d) sup_n |(T_n x ; x)| < ∞ for every x ∈ H.

Clearly, (c) implies (d). If K = H is a complex Hilbert space, then show that these four assertions are all pairwise equivalent.

Hint: Check that (a)⟹(b)⟹(c). Now take an arbitrary x ∈ H. For each n consider the functional x_n : K → F given by

x_n(y) = (y ; T_n x)  for every y ∈ K.

Show that each x_n is linear and bounded. If (c) holds (so that sup_n |x_n(y)| < ∞ for every y ∈ K), then the Banach–Steinhaus Theorem ensures that sup_n ‖x_n‖ < ∞ because K is a Banach space. But

‖T_n x‖ = sup_{‖y‖=1} |x_n(y)| = ‖x_n‖

for each n (cf. Problem 5.1). Thus conclude that (c)⟹(b). A straightforward application of the Banach–Steinhaus Theorem ensures that (b)⟹(a) because H is a Banach space. Finally, if K = H is complex, then use Problem 5.3(b) to show that (d)⟹(c).
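The polarization identity of Problem 5.3(b) with T = L = I, which both of the last two hints rely on, can be checked numerically (my own illustration, in the convention where the inner product is linear in the first slot):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y = rng.standard_normal(6) + 1j * rng.standard_normal(6)

ip = lambda u, v: np.sum(u * np.conj(v))   # (u ; v), linear in the first slot

# Polarization: 4(x ; y) = ||x+y||^2 - ||x-y||^2 + i||x+iy||^2 - i||x-iy||^2.
rhs = (ip(x + y, x + y) - ip(x - y, x - y)
       + 1j * ip(x + 1j * y, x + 1j * y)
       - 1j * ip(x - 1j * y, x - 1j * y)) / 4
assert np.isclose(ip(x, y), rhs)
```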
Problem 5.6. Let (X, ( ; )) be an inner product space and equip the field F with its usual metric. Use the Schwarz inequality and Corollary 3.8 to prove the following assertions.

(a) ( · ; y) : X → F is a continuous function for every y ∈ X.

(b) (x ; · ) : X → F is a continuous function for every x ∈ X.

Equip the direct sum X ⊕ X with the inner product ( ; )_⊕ of Example 5E. That is, (x ; y)_⊕ = (x₁ ; y₁) + (x₂ ; y₂) for every x = (x₁, x₂) and y = (y₁, y₂) in X ⊕ X. Show that

(c) ( ; )_⊕ : X ⊕ X → F is a continuous function.

Hint: Take an arbitrary convergent sequence {x(n) = (x₁(n), x₂(n))} in X ⊕ X and let x = (x₁, x₂) ∈ X ⊕ X be its limit. Verify that

(x₁(n) ; x₂(n)) = (x₁(n) − x₁ + x₁ ; x₂(n) − x₂ + x₂) = (x₁(n) − x₁ ; x₂(n) − x₂) + (x₁(n) − x₁ ; x₂) + (x₁ ; x₂(n) − x₂) + (x₁ ; x₂),

and use the Schwarz inequality to show that

|(x₁(n) ; x₂(n)) − (x₁ ; x₂)| ≤ ‖x₁(n) − x₁‖‖x₂(n) − x₂‖ + ‖x₂‖‖x₁(n) − x₁‖ + ‖x₁‖‖x₂(n) − x₂‖.

Since ‖x(n) − x‖² = ‖x₁(n) − x₁‖² + ‖x₂(n) − x₂‖², x(n) → x in X ⊕ X if and only if x₁(n) → x₁ and x₂(n) → x₂ in X, which implies (x₁(n) ; x₂(n)) → (x₁ ; x₂) in F.
Problem 5.7. Let A and B be subsets of a Hilbert space H. Prove:

(a) A ⊆ B and A⊥ ⊆ B implies ⋁B = H.

In particular, if M is a subspace of H, then

(b) A ⊆ M and A⊥ ⊆ M implies M = H.

Hint: B⊥⊥ = H because B⊥ ⊆ A⊥ ∩ A⊥⊥ = {0}. Use Proposition 5.15.

Problem 5.8. Let X be an inner product space and let H be a Hilbert space. Prove the following propositions. If {A_γ}_{γ∈Γ} is a nonempty family of subsets of X, then
(a) ⋂_{γ∈Γ} A_γ⊥ = (⋃_{γ∈Γ} A_γ)⊥.

If {M_γ}_{γ∈Γ} is a nonempty family of linear manifolds of X, then

(b) ⋂_{γ∈Γ} M_γ⊥ = (Σ_{γ∈Γ} M_γ)⊥.

Hint: Proposition 5.12. Recall: ⋁(⋃_{γ∈Γ} M_γ) = (Σ_{γ∈Γ} M_γ)⁻.

If {M_γ}_{γ∈Γ} is a nonempty family of subspaces of H, then

(c) ⋂_{γ∈Γ} M_γ = {0} if and only if (Σ_{γ∈Γ} M_γ⊥)⁻ = H,

(d) (Σ_{γ∈Γ} M_γ)⁻ = H if and only if ⋂_{γ∈Γ} M_γ⊥ = {0}.

Hint: Propositions 5.12 and 5.15 and item (b).

If {M_γ}_{γ∈Γ} is a nonempty family of orthogonal subspaces of H, then

(e) H = (Σ_{γ∈Γ} M_γ)⁻ implies M_α = ⋂_{α≠γ∈Γ} M_γ⊥ for every α ∈ Γ.

Hint: Take an arbitrary α ∈ Γ. Show: M_α ⊥ Σ_{α≠γ∈Γ} M_γ, and hence M_α ⊥ (Σ_{α≠γ∈Γ} M_γ)⁻ (cf. Proposition 5.12). Verify each of the following steps:

H = (Σ_{γ∈Γ} M_γ)⁻ ⊆ (M_α + (Σ_{α≠γ∈Γ} M_γ)⁻)⁻ = M_α + (Σ_{α≠γ∈Γ} M_γ)⁻ ⊆ H

(see Theorem 5.10(b)). Then H = M_α + (Σ_{α≠γ∈Γ} M_γ)⁻. Conclude by Proposition 5.19 that M_α = ((Σ_{α≠γ∈Γ} M_γ)⁻)⊥, and therefore M_α = (Σ_{α≠γ∈Γ} M_γ)⊥ (Propositions 5.12 and 5.15). Apply (b).

Remark: Recall from Section 4.3 that the collection Lat(H) of all subspaces of H is a complete lattice in the inclusion ordering, where M ∧ N = M ∩ N and M ∨ N = (M + N)⁻. Therefore, according to item (b) and Proposition 5.15,
M ∨ N = (M⊥ ∩ N⊥)⊥.

Problem 5.9. Let X and Y be inner product spaces and consider the setup of Problem 4.42. If there exists a unitary transformation U ∈ G[X, Y] intertwining T ∈ B[X] and S ∈ B[Y], that is, if

UT = SU,

then we say that T and S are unitarily equivalent (notation: T ≅ S). If this happens, then it is clear that X and Y are unitarily equivalent inner product spaces. As in the case of similarity (on linear spaces) or isometric equivalence (on normed spaces), show that unitary equivalence has the defining properties of an equivalence relation. Now take an arbitrary polynomial p in one variable (cf. Problem 2.20) and show that Up(T) = p(S)U whenever UT = SU for every unitary U ∈ G[X, Y]. In particular, T ≅ S implies Tⁿ ≅ Sⁿ for every positive integer n.
Problem 5.10. This is the uncountable version of the Orthogonal Structure Theorem. Prove it.

(a) Let H be a Hilbert space. Suppose {M_γ}_{γ∈Γ} is an uncountable family of pairwise orthogonal subspaces of H. If x ∈ (Σ_{γ∈Γ} M_γ)⁻, then there exists a unique summable family {u_γ}_{γ∈Γ} of vectors in H with u_γ ∈ M_γ for each γ such that x = Σ_{γ∈Γ} u_γ. Moreover, ‖x‖² = Σ_{γ∈Γ} ‖u_γ‖². Conversely, if {u_γ}_{γ∈Γ} is a square-summable family of vectors in H with u_γ ∈ M_γ for each γ ∈ Γ, then {u_γ}_{γ∈Γ} is summable and Σ_{γ∈Γ} u_γ lies in (Σ_{γ∈Γ} M_γ)⁻.

Hint: Set x(n) = Σ_{γ∈N_n} u_γ, where each N_n is a finite subset of Γ such that N_n ⊆ N_{n+1} and #N_n ≥ n. Consider the set N_∞ = ⋃_{n≥1} N_n and construct a countable family of orthogonal vectors in H, say {u_k}_{k∈N_∞}, as in the proof of Theorem 5.16.

(b) Show that H must be nonseparable if {M_γ}_{γ∈Γ} has uncountably many nonzero subspaces. (Hint: Proposition 5.43 and the Axiom of Choice.) Even in this case, the sum x = Σ_{γ∈Γ} u_γ has only a countable number of nonzero vectors. (Why?)

(c) Restate Corollaries 5.17 and 5.18 in light of item (a). Now rewrite Examples 5I and 5J for an uncountable family {M_γ}_{γ∈Γ} of pairwise orthogonal subspaces of a Hilbert space (H, ( ; )) and conclude that ⊕_{γ∈Γ} M_γ ≅ (Σ_{γ∈Γ} M_γ)⁻.

Problem 5.11. Let {M_γ} be a collection of orthogonal subspaces of a Hilbert space H that spans H (i.e., ⋁_γ M_γ = (Σ_γ M_γ)⁻ = H) and let B_γ be an orthonormal basis for each M_γ. Show that ⋃_γ B_γ is an orthonormal basis for H.

Hint: ⋃_γ B_γ is an orthonormal set in H and ⋃_γ(⋁B_γ) ⊆ ⋁(⋃_γ B_γ) (cf. Sections 2.3, 3.5 and 4.3). Also verify each of the following steps:

(Σ_γ M_γ)⁻ = ⋁(⋃_γ M_γ) = ⋁(⋃_γ ⋁B_γ) ⊆ ⋁(⋁(⋃_γ B_γ)) = ⋁(⋃_γ B_γ) ⊆ H.

Problem 5.12. Completeness is necessary in Theorem 5.10(b): If M and N are orthogonal subspaces of an incomplete inner product space X, then it may happen that M + N is not closed in X. Indeed, set X = span{e_k}_{k≥0}, where e_0 = {1/k}_{k=1}^{∞} ∈ ℓ₊² and {e_k}_{k=1}^{∞} is the canonical orthonormal basis for the Hilbert space ℓ₊² (cf. Example 5L(b)). Recall that X is the linear manifold of ℓ₊²
consisting of all (finite) linear combinations of vectors from {e_k}_{k≥0}. Now consider the following linear manifolds of X:

M = {v = {ν_k}_{k=1}^{∞} ∈ X : ν_{2k−1} = 0 for all k ≥ 1},
N = {v = {ν_k}_{k=1}^{∞} ∈ X : ν_{2k} = 0 for all k ≥ 1}.

It is clear that M ⊥ N, isn't it? Moreover, they are subspaces of X.

(a) Show that M and N are both closed in the inner product space X.

Hint: Take a sequence {u_n}, with each u_n = {ν_k(n)}_{k=1}^{∞} in M, such that u_n → x = {ξ_k}_{k=1}^{∞} in X. Split the series ‖u_n − x‖² = Σ_{k=1}^{∞} |ν_k(n) − ξ_k|² into the sum of a series running over the odd integers and a series running over the even integers, and conclude that ‖u_n − x‖² = Σ_{k=1}^{∞} |ξ_{2k−1}|² + Σ_{k=1}^{∞} |ν_{2k}(n) − ξ_{2k}|². This implies that x ∈ M.

(b) Exhibit an M+N-valued sequence that converges in X to e_0 ∈ X.

Hint: For each n ≥ 1 define u_n = {ν_k(n)}_{k=1}^{∞} and v_n = {ν'_k(n)}_{k=1}^{∞} as follows: ν_k(n) = 1/k if 1 ≤ k ≤ n is an even integer, and ν_k(n) = 0 otherwise; ν'_k(n) = 1/k if 1 ≤ k ≤ n is an odd integer, and ν'_k(n) = 0 otherwise. Verify that u_n, v_n ∈ span{e_k}_{k=1}^{n} ⊆ X, u_n ∈ M, v_n ∈ N, and u_n + v_n = (1, ½, ..., 1/n, 0, 0, 0, ...) for every n ≥ 1.

(c) Show (by contradiction) that e_0 ∉ M + N.

Conclude from (b) and (c) that M + N is not closed in X.
Problem 5.13. Orthogonality is necessary in Theorem 5.10(b): If M and N are nonorthogonal subspaces of a Hilbert space H, then it may happen that M + N is not closed in H even if M ∩ N = {0}. Set

v_k = (1 − 1/k²)^{1/2} e_{2k−1} + (1/k) e_{2k}  in ℓ₊²

for every k ≥ 1, where {e_k}_{k=1}^{∞} is the canonical orthonormal basis for the Hilbert space ℓ₊², and consider the following subspaces of ℓ₊²:

M = ⋁{e_{2k−1}}_{k=1}^{∞}  and  N = ⋁{v_k}_{k=1}^{∞}.

(a) Show that {v_k}_{k=1}^{∞} is an orthonormal sequence, and hence an orthonormal basis for the Hilbert space N.

(b) Apply the Fourier Series Theorem to show that M ∩ N = {0}.

Hint: Take x ∈ M ∩ N and consider its Fourier series expansions: x = Σ_{k=1}^{∞} (x ; e_{2k−1})e_{2k−1} = Σ_{k=1}^{∞} (x ; v_k)v_k. Now verify: 0 = (x ; e_{2j}) = Σ_{k=1}^{∞} (x ; v_k)(v_k ; e_{2j}) = (1/j)(x ; v_j) for all j ≥ 1. Conclude that x = 0.
Show that the series Σ_{k=1}^{∞} (1/k)e_{2k} converges in ℓ₊² and set x = Σ_{k=1}^{∞} (1/k)e_{2k} in ℓ₊². Note that

x_n = Σ_{k=1}^{n} (1/k)e_{2k} = Σ_{k=1}^{n} v_k − Σ_{k=1}^{n} (1 − 1/k²)^{1/2} e_{2k−1}

lies in M + N for each n ≥ 1. Moreover, check that x_n → x in ℓ₊² as n → ∞.

(c) Apply the Fourier Series Theorem to show that x ∉ M + N.

Hint: Suppose x = u + v with u ∈ M and v ∈ N. Verify that (x ; e_{2n}) = 1/n, (u ; e_{2n}) = 0, and (v ; e_{2n}) = (v ; v_n)(1/n) for each n ≥ 1. Now conclude that (v ; e_{2n}) = 1/n, and hence (v ; v_n) = 1 for every n ≥ 1, which is a contradiction (since ‖v‖² = Σ_{n=1}^{∞} |(v ; v_n)|²).
Therefore, M + N is not closed in ℓ₊² by the Closed Set Theorem.

Problem 5.14. Let {e_γ}_{γ∈Γ} be any orthonormal basis for a complex Hilbert space H. For each x ∈ H consider its Fourier series expansion with respect to {e_γ}_{γ∈Γ}; that is, x = Σ_{γ∈Γ} (x ; e_γ)e_γ. Verify that {Re (x ; e_γ)e_γ}_{γ∈Γ} and {Im (x ; e_γ)e_γ}_{γ∈Γ} are both summable families of vectors in H. Set

Re x = Σ_{γ∈Γ} Re (x ; e_γ)e_γ  and  Im x = Σ_{γ∈Γ} Im (x ; e_γ)e_γ

in H. Show that x = Re x + i Im x and prove the following identities.

(a) (Re x ; e_γ) = Re (x ; e_γ) and (Im x ; e_γ) = Im (x ; e_γ) for every γ ∈ Γ.

(b) (Re x ; Im x) = (Im x ; Re x) and ‖x‖² = ‖Re x‖² + ‖Im x‖².

(c) Re (αx) = Re α Re x − Im α Im x and Im (αx) = Re α Im x + Im α Re x.

(d) (Re (αx) ; Im (αx)) = Re α Im α (‖Re x‖² − ‖Im x‖²) + (Re x ; Im x)((Re α)² − (Im α)²).

Use the above results (and the polar representation of α ∈ ℂ) to prove the Orthogonal Normalization Lemma: For every nonzero vector x in a complex Hilbert space there exists an α ∈ ℂ such that ‖αx‖ = 1 and Re (αx) ⊥ Im (αx).

Problem 5.15. Let {e_γ}_{γ∈Γ} be an orthonormal family in an infinite-dimensional Hilbert space H. Let {e_k}_{k=1}^{∞} be a countably infinite subset of {e_γ}_{γ∈Γ}, set M = ⋁{e_k}_{k=1}^{∞} and, for each positive integer n, set M_n = span{e_k}_{k=1}^{n}. Let P_n : H → H be the orthogonal projection onto the finite-dimensional subspace M_n. Show that P_n →ˢ P, where P : H → H is the orthogonal projection onto M. (Hint: Corollary 5.55.)
Problem 5.16. Let {P_k}_{k=0}^∞ be a resolution of the identity on a Hilbert space H so that Σ_{k=0}^∞ P_k x = x for every x ∈ H; that is, Σ_{k=0}^n P_k → I strongly as n → ∞. Suppose P_k ≠ O for every k ≥ 0 and let {λ_k}_{k=0}^∞ be a bounded sequence of scalars. According
416
5. Hilbert Spaces
to Proposition 5.61 the weighted sum of projections Σ_{k=0}^∞ λ_k P_k is the operator on H for which Σ_{k=0}^n λ_k P_k → Σ_{k=0}^∞ λ_k P_k strongly as n → ∞. Theorem 5.59 says that H = (Σ_{k=0}^∞ R(P_k))⁻, and (Σ_{k=0}^∞ R(P_k))⁻ is unitarily equivalent to the orthogonal direct sum ⊕_{k=0}^∞ R(P_k) equipped with its usual inner product (see Examples 5I and 5J). Moreover, the natural unitary transformation Φ of the Hilbert space ⊕_{k=0}^∞ R(P_k) onto the Hilbert space (Σ_{k=0}^∞ R(P_k))⁻ is given by Φ({u_k}_{k=0}^∞) = Σ_{k=0}^∞ u_k for every {u_k}_{k=0}^∞ in ⊕_{k=0}^∞ R(P_k). Set I_k = P_k|_{R(P_k)} in B[R(P_k)] for each k and consider the operator ⊕_{k=0}^∞ λ_k I_k in B[⊕_{k=0}^∞ R(P_k)] (cf. Problem 4.16): (⊕_{k=0}^∞ λ_k I_k){u_k}_{k=0}^∞ = {λ_k u_k}_{k=0}^∞ for every {u_k}_{k=0}^∞ in ⊕_{k=0}^∞ R(P_k). Show that
Φ(⊕_{k=0}^∞ λ_k I_k)Φ^{−1} x = (Σ_{k=0}^∞ λ_k P_k) x
for all x in H. (Hint: Φ^{−1}x = {P_k x}_{k=0}^∞.) In other words, as the orthogonal projections {P_k}_{k=0}^∞ are orthogonal to each other, a weighted sum of projections is identified with an orthogonal direct sum of scalar operators. In fact, these are unitarily equivalent operators:
⊕_{k=0}^∞ λ_k I_k ≅ Σ_{k=0}^∞ λ_k P_k with I_k = P_k|_{R(P_k)}.
Problem 5.17. Let {e_k}_{k=1}^∞ be an orthonormal basis for a (separable infinite-dimensional) Hilbert space H and let {λ_k}_{k=1}^∞ be a sequence of scalars. For each n ≥ 1 consider the mapping T_n : H → H defined by
T_n x = Σ_{k=1}^n λ_k (x ; e_k) e_k for every x ∈ H.
Verify that T_n lies in B[H] for each positive integer n and show that
sup_k |λ_k| < ∞ if and only if T_n → T strongly,
where T ∈ B[H] is the weighted sum of projections
T x = Σ_{k=1}^∞ λ_k (x ; e_k) e_k for every x = Σ_{k=1}^∞ (x ; e_k) e_k ∈ H
(cf. Theorem 5.48, Proposition 5.57, Definition 5.60, Proposition 5.61). Also check that ‖T‖ = sup_k |λ_k|. In this case, T is called a diagonal operator with respect to the basis {e_k}_{k=1}^∞. Conversely, take any T ∈ B[H]. If there exists an orthonormal basis for H and a bounded sequence of scalars such that T is the strong limit of {T_n}_{n=1}^∞, then T is a diagonalizable operator. Suppose sup_k |λ_k| < ∞. Show that
lim_k λ_k = 0 if and only if T_n → T uniformly
(cf. Problem 4.53) and prove the following assertion (see Example 4J).
There exists T^{−1} ∈ C[R(T), H] if and only if λ_k ≠ 0 for every k ≥ 1
and, in this case, R(T)⁻ = H. (Hint: span{e_k}_{k=1}^∞ ⊆ R(T) if λ_k ≠ 0 for all k; see Problem 3.47.) Finally, prove the following proposition.
There exists T^{−1} ∈ B[H] if and only if inf_k |λ_k| > 0 (see Example 4J again) and, in this case,
T^{−1} x = Σ_{k=1}^∞ λ_k^{−1} (x ; e_k) e_k for every x ∈ H.
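The norm identity ‖T‖ = sup_k |λ_k| and the inversion formula can be experimented with on finite truncations. A minimal sketch (hypothetical names; coefficient sequences as plain Python lists, with the assumed choice of weights λ_k = 1/k; the power computation anticipates Problem 5.18 below):

```python
def diag_apply(weights, x):
    # T x = sum_k lambda_k (x ; e_k) e_k acts coordinatewise on a
    # truncated Fourier coefficient sequence x.
    return [lam * c for lam, c in zip(weights, x)]

lam = [1.0 / k for k in range(1, 6)]      # lambda_k = 1/k, all nonzero
x = [1.0, -2.0, 0.5, 3.0, 1.0]
Tx = diag_apply(lam, x)

# The norm sup_k |lambda_k| is attained here at e_1.
e1 = [1.0, 0.0, 0.0, 0.0, 0.0]
assert diag_apply(lam, e1)[0] == max(abs(l) for l in lam) == 1.0

# Since every lambda_k != 0, applying the weights 1/lambda_k inverts T.
assert all(abs(a - b) < 1e-12
           for a, b in zip(diag_apply([1.0 / l for l in lam], Tx), x))

# Powers: applying T three times multiplies the k-th coefficient by
# lambda_k**3 (cf. Problem 5.18 below).
y = x
for _ in range(3):
    y = diag_apply(lam, y)
assert all(abs(c - (l ** 3) * xi) < 1e-12 for c, l, xi in zip(y, lam, x))
```

Because inf_k |1/k| = 0, these weights give an example where T^{−1} exists on R(T) but is unbounded, matching the proposition above.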
Problem 5.18. Consider the setup of the previous problem. Use the Fourier series expansion of x ∈ H to show by induction that
T^n x = Σ_{k=1}^∞ λ_k^n (x ; e_k) e_k for every x ∈ H
and every positive integer n. Now prove the following propositions.
(a) T^n → O uniformly if and only if sup_k |λ_k| < 1.
(b) T^n → O strongly if and only if |λ_k| < 1 for every k ≥ 1.
(c) T^n → O weakly if and only if T^n → O strongly.
Hint: For (a) and (b) see Example 4H. For (c) note that T^n e_j = λ_j^n e_j, and hence |(T^n e_j ; e_j)| = |λ_j|^n. If |λ_j| ≥ 1 for some j, then T^n does not converge weakly to O.
Problem 5.19. Let {e_k}_{k=1}^∞ be an orthonormal basis for a Hilbert space H. Show that M (defined below) is a dense linear manifold of H.
M = {x ∈ H : Σ_{k=1}^∞ |(x ; e_k)| < ∞}.
Hint: Let T be a diagonal operator as in Problem 5.17 with λ_k ≠ 0 for all k (so that R(T)⁻ = H) and Σ_{k=1}^∞ |λ_k|² < ∞. Use the Schwarz inequality in ℓ² to verify that
Σ_{k=1}^∞ |(Tx ; e_k)| ≤ (Σ_{k=1}^∞ |λ_k|²)^{1/2} ‖x‖ < ∞
for all x ∈ H. Hence R(T) ⊆ M.
Problem 5.20. Let {x_n} be an M-valued sequence and let x be a vector in M, where M is a subspace of a Hilbert space H. Show that
x_n → x weakly in M if and only if x_n → x weakly in H.
Hint: x_n → x weakly in M if and only if (x_n ; u) → (x ; u) for every u ∈ M because M is a Hilbert space (Theorem 5.62). Recall: H = M + M⊥ (Theorem 5.20), and show that x_n → x weakly in M implies x_n → x weakly in H.
Problem 5.21. Let {T_n} be a sequence in B[X, Y], where X and Y are inner product spaces. Prove the following propositions.
(a) If T_n → T weakly for some T ∈ B[X, Y], then ‖T‖ ≤ lim inf_n ‖T_n‖.
Hint: Let {T_{n_k}} be a subsequence of {T_n} such that lim_k ‖T_{n_k}‖ = lim inf_n ‖T_n‖. If T_n → T weakly, then (Tx ; y) = lim_k (T_{n_k} x ; y) (why?) and hence |(Tx ; y)| ≤ lim inf_n ‖T_n‖ ‖x‖ ‖y‖ for every x ∈ X and y ∈ Y. Now apply Corollary 5.71.
(b) If sup_n ‖T_n‖ < ∞ and {(T_n a ; b)} converges in F for every a in a dense subset A of X and every b in a dense subset B of Y, then {(T_n x ; y)} converges in F for every x ∈ X and every y ∈ Y. Hint: See the hint in Problem 4.45(b) and recall that F is complete.
(c) Take T ∈ B[X, Y]. If sup_n ‖T_n‖ < ∞ and (T_n a ; b) → (Ta ; b) for every a in a dense subset A of X and every b in a dense subset B of Y, then (T_n x ; y) → (Tx ; y) for every x ∈ X and every y ∈ Y.
Hint: ((T_n − T)x ; y) = ((T_n − T)(x − a_ε) + (T_n − T)a_ε ; y − b_ε + b_ε).
(d) If X and Y are Hilbert spaces, and if the hypothesis of (b) or (c) holds true, then T_n → T weakly.
Problem 5.22. Let {T_n} be a sequence in B[X, Y] and let {S_n} be a sequence in B[Y, Z], where X, Y and Z are inner product spaces. Take T ∈ B[X, Y], S ∈ B[Y, Z], and prove the following propositions.
(a) If sup_n ‖S_n‖ < ∞, S_n → S weakly and T_n → T uniformly, then S_n T_n → ST weakly.
Hint: S_n T_n − ST = S_n(T_n − T) + (S_n − S)T.
(b) If Y and Z are Hilbert spaces, S_n → S weakly and T_n → T uniformly, then S_n T_n → ST weakly.
(c) If S_n → S uniformly, sup_n ‖T_n‖ < ∞, and T_n → T strongly, then S_n T_n → ST strongly. Hint: S_n T_n − ST = (S_n − S)T_n + S(T_n − T).
(d) If Y and Z are Hilbert spaces, S_n → S weakly and T_n → T strongly, then S_n T_n → ST weakly.
Show that the sum of weakly convergent sequences of bounded linear transformations is again a weakly convergent sequence of bounded linear transformations, whose limit is the sum of the limits of the summands.
Remarks: It is easy to show, even if X = Y = Z is a Hilbert space, that
S_n → S weakly and T_n → T uniformly does not imply S_n T_n → ST strongly.
For instance, if T_n = I for all n and {S_n} is any operator sequence that converges weakly to zero but does not converge strongly, then S_n T_n = S_n → O = O I weakly but not strongly. It is also easy to show that
S_n → S strongly and T_n → T weakly does not imply S_n T_n → ST weakly.
Sample: There exists an isometry S_+ (thus S_+^{*n} S_+^n = I and ‖S_+^n x‖ = ‖x‖ for all n and x, so that S_+^{*n} S_+^n → I strongly and weakly) for which S_+^{*n} → O strongly (and hence S_+^{*n} → O weakly
and S_+^n → O weakly); see Problem 5.29(b,c) below.
Problem 5.23. Let {e_k}_{k=1}^∞ be any orthonormal basis for a separable Hilbert space H. Prove the following proposition. A sequence {T_n} of operators in B[H] converges weakly to T ∈ B[H] if and only if
sup_n ‖T_n‖ < ∞ and (T_n e_j ; e_k) → (T e_j ; e_k) as n → ∞ for every j, k ≥ 1.
Hint: The "only if" part is easy (cf. Problem 5.5 and Proposition 5.67). Conversely, suppose sup_n ‖T_n‖ < ∞ and lim_n ((T_n − T)e_j ; e_k) = 0 for every j, k ≥ 1. Use Theorem 5.48 and Problem 4.14(b) to show that
lim sup_n |((T_n − T)e_j ; y)| ≤ lim sup_n Σ_{k=1}^∞ |((T_n − T)e_j ; e_k)| |(e_k ; y)| = 0
for each j ≥ 1 and every y ∈ M, where M is the linear manifold of Problem 5.19. Repeat the argument to show:
lim sup_n |((T_n − T)x ; y)| = lim sup_n |(x ; (T_n − T)* y)| ≤ lim sup_n Σ_{k=1}^∞ |(x ; e_k)| |((T_n − T)e_k ; y)| = 0,
so that lim_n |((T_n − T)x ; y)| = 0, for every x, y ∈ M. Now use Problems 5.19 and 5.21(d) to conclude that T_n → T weakly.
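The entrywise criterion is what makes shift-type examples tractable: a uniformly bounded sequence is tested only against basis vectors. A minimal sketch (hypothetical names; the forward shift S₊ⁿ of Problem 5.29 below modeled by prepending zeros to a truncated coefficient sequence) shows every fixed entry (S₊ⁿ e_j ; e_k) eventually vanishing while all columns keep norm one, so the powers converge weakly, but not strongly, to O:

```python
def shift_n(x, n):
    # S_+^n on a truncated coefficient sequence: prepend n zeros.
    return [0.0] * n + list(x)

def entry(n, j, k):
    # Matrix entry (S_+^n e_j ; e_k) with 0-based indices: it is 1 exactly
    # when k = j + n, and 0 otherwise.
    e_j = [0.0] * j + [1.0]
    image = shift_n(e_j, n)
    return image[k] if k < len(image) else 0.0

# For every fixed (j, k) the entries vanish once n > k ...
assert all(entry(n, 0, 3) == 0.0 for n in range(4, 20))
# ... while the columns of S_+^n stay normalized: no strong convergence to O.
assert all(sum(c * c for c in shift_n([1.0], n)) == 1.0 for n in range(20))
```

Together with the norm bound sup_n ‖S₊ⁿ‖ = 1, the entry decay is exactly the hypothesis of the proposition above.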
Problem 5.24. Let X be a linear space and take any L ∈ L[X]. Recall that the nth power L^n of L is the composition of L with itself n times. By setting L^0 = I, the power sequence {L^n}_{n≥0} can be recursively defined by L^{n+1} = L L^n for every n ≥ 0. Show by induction:
(a) L^{n+k} = L^n L^k = L^k L^n and (L^n)^k = (L^k)^n = L^{nk} for every k, n ≥ 0.
Now let X be an inner product space and take any power bounded operator T in B[X] (i.e., sup_n ‖T^n‖ < ∞). If T^n → P weakly for some operator P ∈ B[X], then use Problem 5.22 to show that
(b)
P T^k = T^k P = P = P^k for every k ≥ 1, so that P is a projection
(not necessarily an orthogonal projection). Show by induction that
(c) (T − P)^n = T^n − P for every n ≥ 1, so that (T − P)^n → O weakly.
If X is a Hilbert space and T ∈ B[X], then show that
(d) (T^n)* = (T*)^n and ‖(T*)^n‖ = ‖T^n‖ for every n ≥ 0.
Problem 5.25. Let X and Y be two normed spaces. Recall that T ∈ B[X] and S ∈ B[Y] are similar if there exists W ∈ G[X, Y] such that WT = SW (i.e., T = W^{−1}SW; see Problem 4.42). Consider a sequence {T_n} of operators in B[X] and a sequence {S_n} of operators in B[Y]. Suppose there exists W ∈ G[X, Y] such that W T_n = S_n W for every integer n. Show that
S_n → S uniformly implies T_n → W^{−1}SW uniformly,
S_n → S strongly implies T_n → W^{−1}SW strongly.
In particular, if WT = SW, and if S^n → P uniformly (or S^n → P strongly) for some P ∈ B[Y], then show that T^n → W^{−1}PW uniformly (or T^n → W^{−1}PW strongly) and W^{−1}PW is a projection in B[X] (cf. Problem 4.55(c)). Hint: T_n − W^{−1}SW = W^{−1}(S_n − S)W.
Now let X and Y be two Hilbert spaces. Recall that T ∈ B[X] and S ∈ B[Y] are unitarily equivalent if there exists a unitary U ∈ G[X, Y] such that UT = SU (i.e., T = U^{−1}SU; see Problem 5.9). Suppose there exists a unitary U ∈ G[X, Y] such that U T_n = S_n U for every integer n. Show that
S_n → S weakly implies T_n → U^{−1}SU weakly.
In particular, if UT = SU, and if S^n → P weakly for some P ∈ B[Y], then show that T^n → U^{−1}PU weakly and U^{−1}PU is a projection in B[X] (not necessarily orthogonal; cf. Problem 5.24(b)). Hint: ((T_n − U^{−1}SU)x ; y) = (U^{−1}(S_n − S)Ux ; y) = ((S_n − S)Ux ; Uy).
- cf. Problem 5.24(b)). Hint: ((T,, - U-1S"U)x; y) = (U-1(Sn - S)Ux; y) = ((S,, - S)Ux; Uy). Problem 5.26. Take a B11-1,1C]-valued sequence {T"}, where 1{ and K are Hilben spaces, and take T E 5[h, 1C]. Show that
T" -+ T if and only if T
T
if an d onl y if
.
That is, the adjoint operation preserves uniform and weak convergence. But it does not preserve strong convergence. In fact, as we shall see shortly (Problem 5.29(b,c) below),
T" -S T
does not imply T - T*.
Problems
T and T
However, if T,,
421
S, then S = T*. Why?
Problem 5.27. Let 1-L and K be Hilbert spaces. Show that
(a) V E B[H, K] is an isometry if and only if V*" V" = I (where I is the identity on H) for every n > 1,
(b) U E 13[?{, K] is a unitary transformation if and only if U*" U" = I (identity on H) and U" U*" = I (identity on K) for every n > 1. Let T E B[H] be a diagonalizable operator (Problem 5.17). Show that
(c) T*x =
I Xk (x ; ek) ek for every x E
(d) T is unitary if and only if IAk I = 1 for every k > 1. Problem 5.28. Let Hi and Ki be Hilbert spaces for i = 1, 2 and consider the Hilbert spaces Hi ® H2 and K1 ® K2 of Example 5E. Take Tip E B[HJ, Hi] for i, j =1,2 and consider transformation
T=
Tie
1
E M W ®H2, K1 ®1C2]
T22
T2 2i
of Problem 4.17. Show that T1**
T*
T **i
E B[K1 ® K2, 7.11 ® ?idC T12
T22
In particular, if Ki = Hi for i =1,2 so that T, T* E B[H1 ® H2], then
T= T11 ®T22 = C 0
l
2
I
*
implies T*= T111 ® T22
=Or
Verify that T = ®kTk E B[®kfk] implies T* = ®kTi. E B[®k7-lk, where {Ilk} is a sequence of Hilbert spaces, ®kHk is the Hilbert space ([®k-i7Lk]2, )) of Examples SF and 5G, and T is the direct sum of (Tk) as in Problem 4.16. ( Consider the usual identification Ilk = ®ik {0}, so that we may interpret each '1lk as a subspace of ®kfk. Show that Ilk reduces T = ®kTk, T Ink = Tk, and (T lnk)* = T * Ink for every k (cf. Corollary 5.75).
Problem 5.29. An operator S_+ acting on a Hilbert space H is a unilateral shift if there exists an infinite sequence {H_k}_{k=0}^∞ of nonzero pairwise orthogonal subspaces of H such that H = ⊕_{k=0}^∞ H_k and S_+ maps each H_k isometrically onto H_{k+1}. Observe that S_+|_{H_k} : H_k → H_{k+1} is a unitary transformation (i.e., a surjective isometry), and hence dim H_{k+1} = dim H_k, for every k ≥ 0 (Theorem 5.49). Such a common dimension is the multiplicity of S_+. The adjoint S_+* ∈ B[H] of S_+ ∈ B[H] is referred to as a backward unilateral shift. We shall write ⊕_{k=0}^∞ x_k for {x_k}_{k=0}^∞ in ⊕_{k=0}^∞ H_k. Prove the following assertions.
(a) S_+ : H → H and S_+* : H → H are given by the formulas
S_+ x = 0 ⊕ ⊕_{k=1}^∞ U_k x_{k−1} and S_+* x = ⊕_{k=0}^∞ U_{k+1}* x_{k+1}
for every x = ⊕_{k=0}^∞ x_k in H = ⊕_{k=0}^∞ H_k, with 0 denoting the origin of H_0, where {U_{k+1}}_{k=0}^∞ is an arbitrary sequence of unitary transformations U_{k+1} : H_k → H_{k+1}, so that S_+|_{H_k} = U_{k+1}, for each k ≥ 0. S_+ and S_+* are identified with infinite matrices of operators: S_+ carries the entries U_1, U_2, U_3, … immediately below the main diagonal and O elsewhere, and S_+* carries the entries U_1*, U_2*, U_3*, … immediately above the main diagonal and O elsewhere.
(b) S_+* is a strongly stable coisometry. That is, S_+ is an isometry and S_+^{*n} → O strongly
(so that S_+ is an isometry that is not a coisometry).
Hint: ‖S_+x‖² = ‖x‖², and S_+^{*n} x = ⊕_{k=0}^∞ U_{k+1}* ⋯ U_{k+n}* x_{k+n} (by induction), so that ‖S_+^{*n} x‖² = Σ_{k=n}^∞ ‖x_k‖².
(c) S_+^n → O weakly but {S_+^n} does not converge strongly. Hint: S_+^{*n} → O strongly and ‖S_+^n x‖ = ‖x‖ (i.e., S_+^{*n}S_+^n = I).
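On truncated sequences the assertions of parts (b) and (c) reduce to elementary bookkeeping. A minimal sketch (hypothetical names; the canonical multiplicity-one shift on ℓ²₊ is assumed, so every U_k is the identity):

```python
def backward(x):
    # S_+* on a truncated coefficient sequence: drop the zeroth coordinate.
    return x[1:]

def norm2(x):
    return sum(c * c for c in x)

x = [3.0, 1.0, 4.0, 1.0, 5.0]

# S_+ prepends a zero; it is an isometry and S_+* S_+ = I ...
assert norm2([0.0] + x) == norm2(x)
assert backward([0.0] + x) == x
# ... but S_+ S_+* != I: the zeroth coordinate of x is lost.
assert [0.0] + backward(x) != x

# Strong stability of S_+* (part (b)): n backward shifts wipe out any
# fixed x, since ||S_+*^n x||^2 = sum_{k >= n} ||x_k||^2 -> 0.
y = x
for _ in range(len(x)):
    y = backward(y)
assert y == [] and norm2(y) == 0.0
```

The same model makes part (c) visible: the forward images [0, …, 0, x_0, x_1, …] never lose norm, so they cannot converge strongly to O.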
Let K_0 be an arbitrary Hilbert space unitarily equivalent to H_0. Let U_0 : K_0 → H_0 be a unitary transformation, so that dim H_k = dim K_0 for all k. Consider the Hilbert space ℓ²₊(K_0) = ⊕_{k=0}^∞ K_0 of Example 5F.
(d) U = ⊕_{k=0}^∞ U_k ⋯ U_1 U_0 : ℓ²₊(K_0) → H is a unitary transformation, and U^{−1}S_+U is the operator in B[ℓ²₊(K_0)] whose infinite matrix carries the identity I of K_0 immediately below the main diagonal and O elsewhere.
Thus S_+ is unitarily equivalent to U^{−1}S_+U, which is a unilateral shift of multiplicity dim K_0, called the canonical unilateral shift on ℓ²₊(K_0).
Problem 5.30. An operator S acting on a Hilbert space H is a bilateral shift if there exists an infinite family {H_k}_{k=−∞}^∞ of nonzero pairwise orthogonal subspaces of H such that H = ⊕_{k=−∞}^∞ H_k and S maps each H_k isometrically onto H_{k+1}.
As it happened in the case of a unilateral shift, the above definition ensures that S|_{H_k} : H_k → H_{k+1} is a surjective isometry. Then the subspaces H_k of H are all unitarily equivalent, and their common dimension is the multiplicity of S. The adjoint S* ∈ B[H] of S ∈ B[H] is referred to as a backward bilateral shift. Prove the following assertions.
(a) S : H → H and S* : H → H are given by the formulas
S x = ⊕_{k=−∞}^∞ U_k x_{k−1} and S* x = ⊕_{k=−∞}^∞ U_{k+1}* x_{k+1}
for every x = ⊕_{k=−∞}^∞ x_k in H = ⊕_{k=−∞}^∞ H_k, where {U_k}_{k=−∞}^∞ is an arbitrary family of unitary transformations U_{k+1} : H_k → H_{k+1}, so that S|_{H_k} = U_{k+1}, for each integer k.
S and S* are identified with (doubly) infinite matrices of operators, a parenthesis marking the zero-zero entry: S carries the entries U_k immediately below the main diagonal and O elsewhere, and S* carries the entries U_k* immediately above the main diagonal and O elsewhere.
(b) S is unitarily equivalent to the canonical bilateral shift acting on ℓ²(K_0) = ⊕_{k=−∞}^∞ K_0, namely the operator in B[ℓ²(K_0)] whose (doubly) infinite matrix carries the identity I of K_0 immediately below the main diagonal and O elsewhere, where K_0 is any Hilbert space with the property that dim K_0 is the multiplicity of S. (Hint: Problem 5.29(d).)
(c) S is a weakly stable unitary operator. In other words, S is an isometry and a coisometry (i.e., S^{*n}S^n = S^n S^{*n} = I) and S^n → O weakly.
Hint: Let S_0 denote the above canonical bilateral shift on ℓ²(K_0), which is unitarily equivalent to S. Take any x = ⊕_{k=−∞}^∞ x_k in ℓ²(K_0) = ⊕_{k=−∞}^∞ K_0. Show by induction that S_0^n x = ⊕_{k=−∞}^∞ x_{k−n}, and hence
(S_0^n x ; x) = Σ_{k=−∞}^∞ (x_{k−n} ; x_k) = Σ_{k=−∞}^{−1} (x_k ; x_{k+n}) + Σ_{k=0}^∞ (x_k ; x_{k+n}).
Now apply the Schwarz inequality in K_0 and in ℓ² to show that, for each integer m ≥ 1 and every n,
Σ_{k=−∞}^{−1} |(x_k ; x_{k+n})| ≤ (Σ_{k=−∞}^{−m} ‖x_k‖²)^{1/2} ‖x‖ + (Σ_{k=−m+1}^{−1} ‖x_{k+n}‖²)^{1/2} ‖x‖.
Similarly, Σ_{k=0}^∞ |(x_k ; x_{k+n})| ≤ (Σ_{k=0}^∞ ‖x_{k+n}‖²)^{1/2} ‖x‖ = (Σ_{k=n}^∞ ‖x_k‖²)^{1/2} ‖x‖. Next verify:
lim_n Σ_{k=−m+1}^{−1} ‖x_{k+n}‖² = lim_n Σ_{k=n}^∞ ‖x_k‖² = 0
for each fixed m, while Σ_{k=−∞}^{−m} ‖x_k‖² can be made arbitrarily small by taking m large enough. Hence (S_0^n x ; x) → 0, so that S_0^n → O weakly (cf. Proposition 5.67). Use Problem 5.25 to conclude that S^n → O weakly.
(d) Both {S^n} and {S^{*n}} do not converge strongly. Hint: S^n → O weakly and S^{*n} → O weakly, while ‖S^n x‖ = ‖S^{*n} x‖ = ‖x‖.
Problem 5.31. According to Problem 5.29 unilateral shifts exist only on infinite-dimensional spaces. Let H be a separable Hilbert space.
(a) S_+ ∈ B[H] is a unilateral shift of multiplicity one if and only if
S_+ e_k = e_{k+1} for every k ∈ N_0
for some orthonormal basis {e_k}_{k=0}^∞ for H. That is, if and only if it shifts some orthonormal basis for H indexed by N_0 (or by any set that is in a one-to-one order-preserving correspondence with N_0). Moreover, in this case, show also that
S_+* e_0 = 0 and S_+* e_k = e_{k−1} for every k ≥ 1.
Hint: Set H_k = span{e_k}, so that dim H_k = 1, for each k ∈ N_0.
(b) Verify that the operator S_+ ∈ B[ℓ²₊] of Problem 4.39 is, in fact, the canonical unilateral shift of multiplicity one on ℓ²₊, which shifts the canonical orthonormal basis for ℓ²₊. (Hint: ℓ²₊ = ⊕_{k=0}^∞ C.)
(c) Let {e_k}_{k=−∞}^∞ be the canonical orthonormal basis for ℓ². Re-index this basis as follows. Set, for every n ∈ N,
f_n = e_{(1−n)/2} if n is odd and f_n = e_{n/2} if n is even.
Check that {f_n}_{n=1}^∞ is an orthonormal basis for ℓ² and that the operator S_+ in B[ℓ²] defined by S_+ f_n = f_{n+1} for every n ≥ 1, whose (doubly) infinite matrix with respect to {e_k}_{k=−∞}^∞ carries exactly one entry 1 in each column, is a unilateral shift on ℓ² that shifts the orthonormal basis {f_n}_{n=1}^∞.
Problem 5.32. According to Problem 5.30 bilateral shifts exist only on infinite-dimensional spaces. Let H be a separable Hilbert space.
(a) S ∈ B[H] is a bilateral shift of multiplicity one if and only if
S e_k = e_{k+1} for every k ∈ Z
for some orthonormal basis {e_k}_{k=−∞}^∞ for H. That is, if and only if it shifts some orthonormal basis for H indexed by Z (or by any set that is in a one-to-one order-preserving correspondence with Z). Moreover, in this case, show also that S* e_k = e_{k−1} for every k ∈ Z. Hint: Set H_k = span{e_k}, so that dim H_k = 1, for each k ∈ Z.
(b) Verify that the operator S ∈ B[ℓ²] of Problem 4.40 is, in fact, the canonical bilateral shift of multiplicity one on ℓ², which shifts the canonical orthonormal basis for ℓ². (Hint: ℓ² = ⊕_{k=−∞}^∞ C.)
(c) Let {e_n}_{n=1}^∞ be the canonical orthonormal basis for ℓ²₊. Re-index this basis as follows. Set, for every k ∈ Z,
f_k = e_{−2k} if k < 0 and f_k = e_{2k+1} if k ≥ 0.
Check that {f_k}_{k=−∞}^∞ is an orthonormal basis for ℓ²₊ and that the operator S in B[ℓ²₊] defined by S f_k = f_{k+1} for every k ∈ Z, given with respect to {e_n}_{n=1}^∞ by an infinite matrix built from 2×2 blocks, is a bilateral shift on ℓ²₊ that shifts the orthonormal basis {f_k}_{k=−∞}^∞.
Problem 5.33. Consider the orthonormal basis {e_k}_{k=−∞}^∞ for the Hilbert space L²(Γ) of Example 3L(c), where Γ denotes the unit circle about the origin of the complex plane and, for each k ∈ Z, e_k(z) = z^k for every z ∈ Γ. Define a mapping U : L²(Γ) → L²(Γ) as follows. If f ∈ L²(Γ), then Uf is given by
(Uf)(z) = z f(z) for every z ∈ Γ.
(a) Verify that Uf, in fact, lies in L²(Γ) for every f in L²(Γ). Moreover, show that U ∈ B[L²(Γ)].
(b) Show that U is a bilateral shift of multiplicity one on L²(Γ) that shifts the orthonormal basis {e_k}_{k=−∞}^∞.
(c) Prove the Riemann–Lebesgue Lemma: If f ∈ L²(Γ), then ∫_Γ z^k f(z) dz → 0 as k → ±∞.
Hint: (U^k f)(z) = z^k f(z) so that (U^k f ; 1) = ∫_Γ z^k f(z) dz, where 1(z) = 1 for all z ∈ Γ. Recall that U^k → O weakly (cf. Problem 5.30(c)).
Problem 5.34. Let H be a Hilbert space and take T, S ∈ B[H]. Use Problem 4.20 and Corollary 5.75 to prove the following assertion. If S commutes with both T and T*, then N(S) and R(S)⁻ reduce T.
Problem 5.35. Take T ∈ B[H, K], where H and K are Hilbert spaces.
(a) T is injective if and only if T*T is injective, if and only if R(T*)⁻ = H.
(a*) T* is injective if and only if TT* is injective, if and only if R(T)⁻ = K.
Prove (a) and (a*). Hint: Propositions 5.15 and 5.76.
(b) T is surjective if and only if T* has a bounded inverse on R(T*).
(b*) T* is surjective if and only if T has a bounded inverse on R(T).
Prove (b) and (b*). Hint: Corollary 4.24 and Proposition 5.77.
Problem 5.36. Consider the following assertions under the setup of the previous problem.
(a) T is injective.
(a*) T* is injective.
(b) dim R(T) = n.
(b*) dim R(T*) = m.
(c) R(T*) = H.
(c*) R(T) = K.
(d) T*T ∈ G[H].
(d*) TT* ∈ G[K].
If dim H = n, then (a), (b), (c) and (d) are pairwise equivalent. If dim K = m, then (a*), (b*), (c*) and (d*) are pairwise equivalent. Prove.
Problem 5.37. Let H and K be Hilbert spaces and take T ∈ B[H, K]. If y ∈ R(T), then there exists a solution x ∈ H to the equation y = Tx. It is clear that this solution is unique whenever T is injective. If, in addition, R(T) is closed in K, then this unique solution is given by x = (T*T)^{−1}T*y.
In other words, suppose N(T) = {0} and R(T) = R(T)⁻. According to Corollary 4.24 there exists T^{−1} ∈ B[R(T), H]. Use Propositions 5.76 and 5.77 to show that there exists (T*T)^{−1} ∈ B[R(T*), H] and
T^{−1} = (T*T)^{−1}T* on R(T).
Problem 5.38. (Least-Squares). Take T ∈ B[H, K], where H and K are Hilbert spaces. If y ∈ K\R(T), then there is no solution x ∈ H to the equation y = Tx. Question: Is there a vector x in H that minimizes ‖y − Tx‖? Use Theorem 5.13, Proposition 5.76 and Problem 5.37 to prove the following proposition. If R(T) = R(T)⁻, then for each y ∈ K there exists x_y ∈ H such that
‖y − Tx_y‖ = inf_{x∈H} ‖y − Tx‖ and T*T x_y = T*y.
Moreover, if T is injective, then x_y is unique and given by
x_y = (T*T)^{−1}T*y.
Problem 5.39. Let H and K be Hilbert spaces and take T ∈ B[H, K]. If y ∈ R(T) and R(T) = R(T)⁻, then show that there exists x_0 ∈ H such that y = Tx_0 and ‖x_0‖ ≤ ‖x‖ for all x ∈ H such that y = Tx. That is, if R(T) = R(T)⁻, then for each y ∈ R(T) there exists a solution x_0 ∈ H to the equation y = Tx with minimum norm. Moreover, if T* is injective, then show that x_0 is unique and given by
x_0 = T*(TT*)^{−1}y.
Hint: If R(T) = R(T)⁻, then R(TT*) = R(T) (Propositions 5.76 and 5.77). Take y in R(T). Thus y = TT*z for some z in K. Set x_0 = T*z in H so that y = Tx_0. If x ∈ H is such that y = Tx, then verify: ‖x_0‖² = (T*z ; x_0) = (z ; Tx_0) = (z ; Tx) = (T*z ; x) = (x_0 ; x) ≤ ‖x_0‖ ‖x‖. If N(T*) = {0}, then N(TT*) = {0} (Proposition 5.76). Since R(TT*) = R(T) = R(T)⁻, it follows by Corollary 4.24 that TT* has a bounded inverse on R(T). Hence z = (TT*)^{−1}y is unique, and so is x_0 = T*z.
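In finite dimensions Problems 5.37 and 5.38 are the classical normal equations. A minimal sketch (hypothetical names; T is an assumed 3×2 real matrix with linearly independent columns, so T*T is an invertible 2×2 matrix solved here by the adjugate formula):

```python
def mat_vec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def solve2(M, b):
    # Solve the 2x2 system M z = b via the adjugate (cofactor) formula.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * b[0] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

T = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # injective, closed range
y = [0.0, 1.0, 1.0]                        # y need not lie in R(T)

Tt = transpose(T)
x_ls = solve2(mat_mul(Tt, T), mat_vec(Tt, y))   # x_y = (T*T)^{-1} T* y

# T*T x_y = T* y forces the residual y - T x_y to be orthogonal to R(T).
resid = [yi - ti for yi, ti in zip(y, mat_vec(T, x_ls))]
for col in Tt:
    assert abs(sum(c * r for c, r in zip(col, resid))) < 1e-12
```

The minimum-norm problem of 5.39 is the mirror image: for surjective T one would instead form x_0 = T*(TT*)^{−1}y.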
Problem 5.40. Show that T ∈ B_0[H, K] if and only if T* ∈ B_0[K, H], where H and K are Hilbert spaces. Moreover, dim R(T) = dim R(T*).
Hint: B_0[H, K] denotes the set of all finite-rank bounded linear transformations of H into K. If T ∈ B_0[H, K], then R(T) = R(T)⁻. (Why?) Use Propositions 5.76 and 5.77 to show that R(T*) = T*(R(T)). Thus conclude: dim R(T*) = dim R(T) (cf. Problems 2.17 and 2.18).
Problem 5.41. Let T ∈ B[H, Y] be a bounded linear transformation of a Hilbert space H into a normed space Y. Show that the following assertions are pairwise equivalent.
(a) T is compact (i.e., T ∈ B∞[H, Y]).
(b) Tx_n → Tx in Y whenever x_n → x weakly in H.
(c) Tx_n → 0 in Y whenever x_n → 0 weakly in H.
Hint: Problem 4.69 for (a)⇒(b). Conversely, let {x_n} be a bounded sequence in H. Apply Lemma 5.69 to ensure the existence of a weakly convergent subsequence {x_{n_k}} of {x_n}, such that {Tx_{n_k}} converges in Y whenever (b) holds true. Now conclude that T is compact (Theorem 4.52(d)). Hence (b)⇒(a). Trivially, (b)⇒(c). On the other hand, if x_n → x weakly in H, then verify that T(x_n − x) → 0 in Y whenever (c) holds; that is, (c)⇒(b).
Problem 5.42. If T ∈ B[H, K], where H and K are Hilbert spaces, then show that the following assertions are pairwise equivalent.
(a) T is compact (i.e., T ∈ B∞[H, K]).
(b) T is the (uniform) limit in B[H, K] of a sequence of finite-rank bounded linear transformations of H into K. That is, there exists a B_0[H, K]-valued sequence {T_n} such that ‖T_n − T‖ → 0.
(c) T* is compact (i.e., T* ∈ B∞[K, H]).
Hint: Take any T ∈ B∞[H, K] and let {e_k}_{k=1}^∞ be an orthonormal basis for R(T)⁻. Indeed, R(T)⁻ is separable (Proposition 4.57) and the existence of such a basis is ensured by Theorem 5.52. If P_n : K → K is the orthogonal projection onto ⋁{e_k}_{k=1}^n, then P_n → P strongly, where P : K → K is the orthogonal projection onto R(T)⁻ (cf. Problem 5.15). Now use Problem 4.57(b) to establish that P_n T → PT = T uniformly. Set T_n = P_n T and verify that each T_n lies in B_0[H, K]. Hence (a)⇒(b). For the converse, see Corollary 4.55. Thus (a)⇔(b), which implies (a)⇔(c) (Proposition 5.65(d) and Problem 5.40).
(d) Verify that B_0[H, K] is dense in B∞[H, K].
Problem 5.43. An operator T ∈ B[H] on a Hilbert space H is an involution if T² = I (cf. Problem 1.11). A symmetry is a unitary involution. Show that the following assertions are pairwise equivalent.
(a) T is a unitary involution.
(b) T is a self-adjoint involution.
(c) T is self-adjoint and unitary.
Problem 5.44. Let H be a Hilbert space. Show that the set of all self-adjoint operators from B[H] is weakly closed in B[H].
Hint: |(Tx ; y) − (x ; Ty)| ≤ |((T − T_n)x ; y)| + |((T_n − T)y ; x)| whenever T_n* = T_n.
Problem 5.45. Let S and T be self-adjoint operators in B[H], where H is a Hilbert space. Prove the following results.
(a) T + S is self-adjoint.
(b) αT is self-adjoint if and only if α ∈ R. Therefore, if H is a real Hilbert space, then the set of all self-adjoint operators from B[H] is a subspace of B[H].
(c) TS is self-adjoint if and only if TS = ST.
(d) p(T) = p(T)* for every polynomial p with real coefficients.
(e) T^{2^n} ≥ O and ‖T^{2^n}‖ = ‖T‖^{2^n} for each n ≥ 1. (Hint: Proposition 5.78.)
Problem 5.46. If an operator T ∈ B[H] acting on a complex Hilbert space H is such that T = A + iB, where A and B are self-adjoint operators in B[H], then the representation T = A + iB is called the Cartesian decomposition of T. Prove the following propositions.
(a) Every operator T ∈ B[H] acting on a complex Hilbert space H has a unique Cartesian decomposition.
Hint: Set A = ½(T* + T) and B = (i/2)(T* − T).
(b) T*T = TT* if and only if AB = BA. In this case, T*T = A² + B² and max{‖A‖², ‖B‖²} ≤ ‖T‖² ≤ ‖A²‖ + ‖B²‖.
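The hint's formulas are immediate to check on a concrete matrix. A minimal sketch (hypothetical names; an assumed 2×2 complex matrix stands in for T, with A = ½(T* + T) and B = (i/2)(T* − T)):

```python
def adjoint(A):
    # Conjugate transpose of a complex matrix.
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def combine(A, B, alpha, beta):
    # Entrywise alpha*A + beta*B.
    return [[alpha * a + beta * b for a, b in zip(ra, rb)]
            for ra, rb in zip(A, B)]

T = [[1 + 2j, 3j], [0j, 4 - 1j]]
Ts = adjoint(T)

A = combine(Ts, T, 0.5, 0.5)         # A = (T* + T)/2
B = combine(Ts, T, 0.5j, -0.5j)      # B = i(T* - T)/2

# A and B are self-adjoint, and T = A + iB is recovered exactly.
assert adjoint(A) == A and adjoint(B) == B
assert combine(A, B, 1, 1j) == T
```

Uniqueness also follows from these two formulas: any self-adjoint pair with T = A + iB must satisfy them.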
Problem 5.47. If T ∈ B[H] is a self-adjoint operator acting on a real Hilbert space H, then show that
(Tx ; y) = ¼((T(x + y) ; x + y) − (T(x − y) ; x − y))
for every x, y ∈ H. (Hint: Problem 5.3(a).)
Problem 5.48. Let H be any (real or complex) Hilbert space.
(a) If {T_n} is a sequence of self-adjoint operators in B[H], then the five assertions of Proposition 5.67 are all pairwise equivalent even in a real Hilbert space.
Hint: If T_n* = T_n and the real sequence {(T_n x ; x)} converges in R for every x ∈ H, and if H is real, then use Problem 5.47 to show that the real sequence {(T_n x ; y)} converges in R for every x, y ∈ H. Now apply Proposition 5.67.
(b) If {T_n} is a sequence of self-adjoint operators in B[H], then the four assertions of Problem 5.5 are all pairwise equivalent even in a real Hilbert space. (Hint: Problems 5.5 and 5.47.)
Problem 5.49. The set B⁺[H] of all nonnegative operators on a Hilbert space H is a weakly closed convex cone in B[H].
Hint: If Q_n ≥ O for every positive integer n and Q_n → Q weakly, then Q ≥ O (0 ≤ (Q_n x ; x) ≤ |((Q_n − Q)x ; x)| + (Qx ; x)). See Problems 2.2 and 2.21.
(a) T*T > 0 if and only if T is injective, (b) T*T E 9+[7-l] if and only if T E 9[H, 1C], (a*)
T T * > 0 if and only if T* is injective,
(b*) TT* E 9+[1C] if and only if T* E 9[1C, H]. Problem 5.51. Let H be a Hilbert space and take Q, R and T in B[N]. Verify the following implications.
(a) Q > O implies T*QT > O. (b) Q > O and R > O imply Q + R > O. (c) Q > O and R > O imply Q + R > O.
(d) Q> -O and R> O imply Q + R r O. Problem 5.52. Let Q bean operator acting on a Hilbert space 7-L. Prove the following propositions.
(a) Q ≥ O implies Q^n ≥ O for every integer n ≥ 0.
(b) Q > O implies Q^n > O for every integer n ≥ 0.
(c) Q ≫ O implies Q^n ≫ O for every integer n ≥ 0.
(d) Q ≫ O implies Q^{−1} ≫ O.
(e) If p is an arbitrary polynomial with positive coefficients, then p(Q) ≥ O, p(Q) > O or p(Q) ≫ O whenever Q ≥ O, Q > O or Q ≫ O, respectively.
Hints: (a), (b) and (c) are trivially verified for n = 0, 1. Suppose n ≥ 2.
(a) Show that (Q^n x ; x) = ‖Q^{n/2}x‖² for every x ∈ H if n is even, and (Q^n x ; x) = (Q Q^{(n−1)/2}x ; Q^{(n−1)/2}x) for every x ∈ H if n is odd.
(b,c) Q > O if and only if Q ≥ O and N(Q) = {0}; and Q ≫ O if and only if Q ≥ O and Q is bounded below. In both cases, Q ≠ O. Note that (i) (Q^{2n}x ; x) = ‖Q^n x‖² and (ii) (Q^{2n−1}x ; x) ≥ ‖Q‖^{−1}‖Q^n x‖², for every x in H and every n ≥ 1. The inequality in (ii) is a consequence of Proposition 5.82: ‖Q^n x‖² = ‖Q Q^{n−1}x‖² ≤ ‖Q‖ (Q Q^{n−1}x ; Q^{n−1}x). Apply (i) to show that (b) and (c) hold for n = 2, and hence they hold for n = 3 by (ii). Conclude the proofs by induction.
(d) ‖x‖² = ‖Q Q^{−1}x‖² ≤ ‖Q‖ (Q Q^{−1}x ; Q^{−1}x) = ‖Q‖ (Q^{−1}x ; x). Why?
Problem 5.53. Let H be a Hilbert space and take Q, R ∈ B[H]. Prove:
(a) Q ≫ O and R − Q ≫ O imply Q^{−1} − R^{−1} ≫ O.
(b) Q ≫ O and R − Q ≥ O imply Q^{−1} − R^{−1} ≥ O.
(c) Q ≫ O and R − Q > O imply Q^{−1} − R^{−1} > O.
Hints: Consider the result in Problem 5.52(d).
(a) If Q ≫ O and R − Q ≫ O, then Q^{−1} ≫ O, R^{−1} ≫ O and (R − Q)^{−1} ≫ O. But
Q^{−1} − R^{−1} = Q^{−1}(R − Q)R^{−1} = ((R − Q + Q)(R − Q)^{−1}Q)^{−1} = (Q + Q(R − Q)^{−1}Q)^{−1}
and Q + Q(R − Q)^{−1}Q ≫ O. This is enough to ensure that Q^{−1} − R^{−1} ≫ O.
(b) If Q ≫ O and R − Q ≥ O, then Q^{−1} ≫ O, R^{−1} ≫ O, and (n+1)/n R − Q = (R − Q) + (1/n)R ≫ O for every n ≥ 1. Hence Q^{−1} − ((n+1)/n R)^{−1} ≫ O by item (a). Therefore,
((Q^{−1} − R^{−1})x ; x) = ((Q^{−1} − ((n+1)/n R)^{−1})x ; x) − (1/(n+1))(R^{−1}x ; x) ≥ −(1/(n+1))‖R^{−1}‖ ‖x‖²
for every x ∈ H and all n ≥ 1, which implies that ((Q^{−1} − R^{−1})x ; x) ≥ 0 for all x ∈ H.
(c) If Q ≫ O and R − Q > O, then Q^{−1} ≫ O, R^{−1} ≫ O and N(R − Q) = {0}.
Hence Q^{−1} − R^{−1} ≥ O by item (b); moreover Q^{−1} − R^{−1} = Q^{−1}(R − Q)R^{−1} is injective, and so Q^{−1} − R^{−1} > O.
Problem 5.54. Show that the following equivalences hold for every T ∈ B[H, K], where H and K are Hilbert spaces (apply Corollary 5.83).
T^{*n}T^n → O uniformly if and only if T^n → O uniformly, and T^{*n}T^n → O weakly if and only if T^{*n}T^n → O strongly, if and only if T^n → O strongly.
Now conclude that for a self-adjoint operator the concepts of strong and weak stabilities coincide (i.e., if T* = T, then T^n → O weakly if and only if T^n → O strongly).
Problem 5.55. Take Q, T ∈ B[H], where H is a Hilbert space. Prove:
(a) −I ≤ T ≤ I if and only if T* = T and ‖T‖ ≤ 1.
(b) O ≤ Q ≤ I if and only if Q* = Q and Q² ≤ Q.
Problem 5.56. Take P, Q, T ∈ B[H] on a Hilbert space H. Prove:
(a) If T* = T and T^n → P weakly, then P is an orthogonal projection.
Hint: Problems 5.24 and 5.44 and Proposition 5.81.
(b) If O ≤ Q ≤ I, then Q^{n+1} ≤ Q^n for every integer n ≥ 0.
Hint: Let n be a positive integer and take any x ∈ H. If n is even, then use Problem 5.55(b) and Proposition 5.82 to verify: (Q^n x ; x) = ‖Q^{n/2}x‖² ≤ ‖Q‖ (Q Q^{(n−2)/2}x ; Q^{(n−2)/2}x) ≤ (Q^{n−1}x ; x). If n is odd, then (Q^n x ; x) = (Q Q^{(n−1)/2}x ; Q^{(n−1)/2}x) ≤ (Q^{(n−1)/2}x ; Q^{(n−1)/2}x) = (Q^{n−1}x ; x).
(c) If O ≤ Q ≤ I, then Q^n → P strongly and P is an orthogonal projection.
Hint: Problems 5.55(b), 4.47(a), 5.24, items (a,b), Proposition 5.84.
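Parts (b) and (c) can be watched on a diagonal Q with weights in [0, 1]: the quadratic forms (Q^n x ; x) decrease in n, and Q^n converges strongly to the orthogonal projection onto the coordinates whose weight equals 1. A minimal sketch (hypothetical names; an assumed choice of weights):

```python
def apply_power(weights, x, n):
    # Q^n x for a diagonal Q with weights q_k in [0, 1].
    return [(q ** n) * c for q, c in zip(weights, x)]

q = [0.0, 0.5, 0.9, 1.0]
x = [1.0, 1.0, 1.0, 1.0]

# Monotonicity: (Q^{n+1} x ; x) <= (Q^n x ; x) whenever O <= Q <= I.
forms = [sum(c * xi for c, xi in zip(apply_power(q, x, n), x))
         for n in range(12)]
assert all(b <= a for a, b in zip(forms, forms[1:]))

# Strong limit: Q^n x -> P x, with P the orthogonal projection onto the
# coordinates where q_k = 1.
limit = apply_power(q, x, 200)
assert all(abs(l - p) < 1e-9 for l, p in zip(limit, [0.0, 0.0, 0.0, 1.0]))
```

The diagonal picture is exactly what the spectral machinery of later chapters makes rigorous for general O ≤ Q ≤ I.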
Problem 5.57. This is our first problem that uses the square root of a nonnegative operator (Theorem 5.85). Take T ∈ B[H] acting on a complex Hilbert space H and prove the following propositions.
(a) If T ≠ O is self-adjoint, then U±(T) = ‖T‖^{−1}(T ± i(‖T‖²I − T²)^{1/2}) are unitary operators in B[H].
Hint: ‖T‖^{−2}T² ≤ I.
(b) Every operator on a complex Hilbert space is a linear combination of four unitary operators.
Hint: If O ≠ T = T*, then show that T = (‖T‖/2)(U₊(T) + U₋(T)). Apply the Cartesian decomposition (Problem 5.46) if O ≠ T ≠ T*.
Problem 5.58. If Q ∈ B⁺[H], where H is a Hilbert space, then show that (cf. Theorem 5.85 and Proposition 5.86)
(a) (Qx ; x) = ‖Q^{1/2}x‖² ≤ ‖Q‖^{1/2} (Q^{1/2}x ; x) for every x ∈ H,
(b) (Q^{1/2}x ; x) ≤ (Qx ; x)^{1/2} ‖x‖ for every x ∈ H,
(c) Q^{1/2} > O if and only if Q > O,
(d) Q^{1/2} ≫ O if and only if Q ≫ O.
Problem 5.59. Take Q, R ∈ B⁺[H] on a Hilbert space H. Prove:
(a)
If Q ≤ R and QR = RQ, then Q² ≤ R².
(b) Q ≤ R does not imply Q² ≤ R².
Hints: (a) Q ≤ R implies Q^{1/2}QQ^{1/2} ≤ Q^{1/2}RQ^{1/2} and R^{1/2}QR^{1/2} ≤ R^{1/2}RR^{1/2}, while (RQ^{1/2}x ; Q^{1/2}x) = (QR^{1/2}x ; R^{1/2}x) since QR = RQ. (b) Consider, for instance, Q = [1 1 ; 1 1] and R = [2 1 ; 1 1] acting on a two-dimensional space.
Problem 5.60. Let Q and R be nonnegative operators acting on a Hilbert space. Use Problem 5.52 and Theorem 5.85 to prove that
QR = RQ implies Q^m R^n ≥ O for every m, n ≥ 1.
Show that p(Q)q(R) ≥ O for every pair of polynomials p and q with positive coefficients whenever Q ≥ O and R ≥ O commute.
Show that p(Q)q(R) > 0 for every pair of polynomials p and q with positive coefficients whenever Q > 0 and R > 0 commute. Problem 5.61. Let H and IC be Hilbert spaces. Take any T E L3[f, IC) and recall
that T'T lies in 5'[h). Set
ITI = (T-T)1 in 8+[l] so that IT12 = (T'T). Prove the following assertions. (a)
11TII = IIITI2111 = IIITIII = 111T1 21
112.
(b) (ITIx;x) = IIITIIx112 < IIITIxIIIIx11 for every x E If.
(c) ‖Tx‖² = ‖|T|x‖² ≤ ‖T‖ (|T|x ; x) for every x ∈ H.
Moreover, if H = K (i.e., if T ∈ B[H]), then show that
(d) T^n → O strongly if and only if |T^n| → O strongly, if and only if |T^n| → O weakly,
(e) B⁺[H] = {T ∈ B[H] : T = |T|}
(i.e., T ≥ O if and only if T = |T|).
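For a diagonal T (Problem 5.17) everything here is explicit: T*T has weights λ̄_k λ_k = |λ_k|², so |T| is the diagonal operator with weights |λ_k|. A minimal sketch (hypothetical names; an assumed choice of complex weights):

```python
import math

lam = [1 + 1j, -2.0 + 0j, 0.5j]          # diagonal weights of T
x = [1.0 + 0j, 2.0 - 1j, 3.0 + 0j]

Tx = [l * c for l, c in zip(lam, x)]
absT_x = [abs(l) * c for l, c in zip(lam, x)]   # |T| has weights |lambda_k|

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

# ||T x|| = || |T| x ||, the identity behind part (c).
assert abs(norm(Tx) - norm(absT_x)) < 1e-12

# |T|^2 = T*T: the weights of T*T are conj(lambda_k)*lambda_k = |lambda_k|^2.
assert all(abs((l.conjugate() * l) - abs(l) ** 2) < 1e-12 for l in lam)

# T = |T| exactly when every weight is nonnegative (part (e)).
assert lam != [abs(l) for l in lam]
assert [abs(l) for l in [2.0, 0.0, 3.5]] == [2.0, 0.0, 3.5]
```

The strong/weak stability equivalence in (d) is likewise visible here: |T^n| has weights |λ_k|^n, so it dies out coordinatewise exactly when |T^n x| does.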
Problem 5.62. Let Q be a nonnegative operator on a Hilbert space.
Q is compact if and only if Q^{1/2} is compact.
Hint: If Q^{1/2} is compact, then Q is compact by Proposition 4.54. On the other hand, ‖Q^{1/2}x_n‖² = (Qx_n ; x_n) ≤ sup_k ‖x_k‖ ‖Qx_n‖ (Problem 5.41).
Take T ∈ B[H, K], where H and K are Hilbert spaces. Also prove that
T ∈ B∞[H, K] if and only if T*T ∈ B∞[H], if and only if |T| ∈ B∞[H], if and only if |T|^{1/2} ∈ B∞[H].
Problem 5.63. Consider a sequence {Qₙ} of nonnegative operators on a Hilbert space H (i.e., Qₙ ≥ O for all n). Show that
(a) Qₙ →ˢ Q implies Qₙ^{1/2} →ˢ Q^{1/2};
(b) if Qₙ is compact for every n and Qₙ →ᵘ Q, then Qₙ^{1/2} →ᵘ Q^{1/2}.
Hints: (a) Q ≥ O by Problem 5.49. Recall: Q^{1/2} is the strong limit of a sequence {p_k(Q)} of polynomials in Q, where the polynomials {p_k} themselves do not depend on Q; that is, p_k(Q) →ˢ Q^{1/2} for every Q ≥ O (cf. proof of Theorem 5.85). Verify that
‖(Qₙ^{1/2} − Q^{1/2})x‖ ≤ ‖(Qₙ^{1/2} − p_k(Qₙ))x‖ + ‖(p_k(Qₙ) − p_k(Q))x‖ + ‖(p_k(Q) − Q^{1/2})x‖.
Take any ε > 0 and any x ∈ H. Show that there exist positive integers n_ε and k_ε such that ‖(p_{k_ε}(Q) − Q^{1/2})x‖ < ε, ‖(p_{k_ε}(Qₙ) − p_{k_ε}(Q))x‖ < ε for every n ≥ n_ε (because Qₙʲ →ˢ Qʲ for every positive integer j by Problem 4.46), and ‖(Qₙ^{1/2} − p_{k_ε}(Qₙ))x‖ < ε.
(b) Q ∈ B∞[H] (Theorem 4.53). Since Qₙ^{1/2} →ˢ Q^{1/2} by part (a), Qₙ^{1/2}Q^{1/2} →ᵘ Q (Problems 5.62 and 4.57). So (Qₙ^{1/2} − Q^{1/2})² = Qₙ + Q − Qₙ^{1/2}Q^{1/2} − (Qₙ^{1/2}Q^{1/2})* →ᵘ O (Problem 5.26). Qₙ^{1/2} − Q^{1/2} is self-adjoint, so that ‖Qₙ^{1/2} − Q^{1/2}‖² = ‖(Qₙ^{1/2} − Q^{1/2})²‖ (Problem 5.45).
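The continuity of the square root asserted in Problem 5.63 can be watched numerically in finite dimensions. The sketch below certifies convergence with the known operator inequality ‖A^{1/2} − B^{1/2}‖ ≤ ‖A − B‖^{1/2} for nonnegative A, B (an outside fact used only for this check, not one of the problem's hints):

```python
import numpy as np

def sqrt_psd(A):
    # Square root of a nonnegative self-adjoint matrix via the spectral theorem.
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.conj().T

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
Q = B @ B.T                          # a nonnegative operator on R^5
for n in (1, 10, 100, 1000):
    Qn = Q + np.eye(5) / n           # Q_n -> Q uniformly, each Q_n >= O
    gap = np.linalg.norm(sqrt_psd(Qn) - sqrt_psd(Q), 2)
    # the square-root gap shrinks at least like ||Q_n - Q||^{1/2} = n^{-1/2}
    assert gap <= np.sqrt(np.linalg.norm(Qn - Q, 2)) + 1e-12
```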
Problem 5.64. Let {e_γ}_{γ∈Γ} and {f_γ}_{γ∈Γ} be orthonormal bases for a Hilbert space H and take an arbitrary T ∈ B[H]. Use the Parseval identity to show that
Σ_{γ∈Γ} ‖Te_γ‖² = Σ_{α∈Γ} Σ_{β∈Γ} |(Te_α; f_β)|² = Σ_{γ∈Γ} ‖T*f_γ‖²
whenever the family of nonnegative numbers {‖Te_γ‖²}_{γ∈Γ} is summable; that is, whenever Σ_{γ∈Γ} ‖Te_γ‖² < ∞ (cf. Proposition 5.31). Apply the above result to |T|^{1/2} and show that
Σ_{γ∈Γ} (|T|e_γ; e_γ) = Σ_{γ∈Γ} (|T|f_γ; f_γ)
whenever Σ_{γ∈Γ} (|T|e_γ; e_γ) < ∞. Outcome: If the sum Σ_{γ∈Γ} (|T|e_γ; e_γ) exists in ℝ, then it is independent of the choice of the orthonormal basis {e_γ}_{γ∈Γ} for H. An operator T ∈ B[H] is trace-class (or nuclear) if Σ_{γ∈Γ} (|T|e_γ; e_γ) < ∞ (equivalently, if Σ_{γ∈Γ} ‖|T|^{1/2}e_γ‖² < ∞) for some orthonormal basis {e_γ}_{γ∈Γ} for H. Let B₁[H] denote the subset of B[H] consisting of all trace-class operators on H. If T ∈ B₁[H], then set
‖T‖₁ = Σ_{γ∈Γ} (|T|e_γ; e_γ) = Σ_{γ∈Γ} ‖|T|^{1/2}e_γ‖².
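The basis independence asserted in Problem 5.64 can be sanity-checked in finite dimensions, where every sum over an orthonormal basis is finite. A sketch, with a random unitary supplying the second basis:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Columns of a random unitary U form a second orthonormal basis {f_k = U e_k}.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

s_std = sum(np.linalg.norm(T @ e) ** 2 for e in np.eye(n))        # sum over {e_k}
s_new = sum(np.linalg.norm(T @ U[:, k]) ** 2 for k in range(n))   # sum over {f_k}
s_adj = sum(np.linalg.norm(T.conj().T @ U[:, k]) ** 2 for k in range(n))

# same value over either basis, and with T replaced by T*
assert np.isclose(s_std, s_new) and np.isclose(s_std, s_adj)
```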
Problem 5.65. Let T ∈ B[H] be an operator on a Hilbert space H, and let {e_γ}_{γ∈Γ} be an orthonormal basis for H. If |T|² is trace-class (i.e., if Σ_{γ∈Γ} ‖|T|e_γ‖² < ∞ or, equivalently, if Σ_{γ∈Γ} ‖Te_γ‖² < ∞ - Problem 5.61(c)), then T is called a Hilbert–Schmidt operator. Let B₂[H] denote the subset of B[H] made up of all Hilbert–Schmidt operators on H. Take T ∈ B₂[H]. According to Problems 5.61 and 5.64 set
‖T‖₂ = ‖T*T‖₁^{1/2} = ‖|T|²‖₁^{1/2} = (Σ_{γ∈Γ} ‖|T|e_γ‖²)^{1/2} = (Σ_{γ∈Γ} ‖Te_γ‖²)^{1/2}
for any orthonormal basis {e_γ}_{γ∈Γ} for H. Prove the following results.
(a) T ∈ B₂[H] if and only if |T| ∈ B₂[H], if and only if |T|² ∈ B₁[H], and ‖T‖₂ = ‖|T|‖₂ = ‖|T|²‖₁^{1/2}.
(b) T ∈ B₁[H] if and only if |T| ∈ B₁[H], if and only if |T|^{1/2} ∈ B₂[H], and ‖T‖₁ = ‖|T|‖₁ = ‖|T|^{1/2}‖₂².
(c) T* ∈ B₂[H] and ‖T*‖₂ = ‖T‖₂ if T ∈ B₂[H]. (Hint: Problem 5.64.)
(d) ‖T‖ ≤ ‖T‖₂ for every T ∈ B₂[H]. (Hint: ‖Te‖ ≤ ‖T‖₂ if ‖e‖ = 1.)
(e) T + S ∈ B₂[H] and ‖T + S‖₂ ≤ ‖T‖₂ + ‖S‖₂ whenever T, S ∈ B₂[H].
Hint: Since Σ_{γ∈Γ} ‖Te_γ‖‖Se_γ‖ ≤ (Σ_{γ∈Γ} ‖Te_γ‖²)^{1/2}(Σ_{γ∈Γ} ‖Se_γ‖²)^{1/2} = ‖T‖₂‖S‖₂ (Schwarz inequality in ℓ²), ‖T + S‖₂² ≤ (‖T‖₂ + ‖S‖₂)².
(f) B₂[H] is a linear space and ‖·‖₂ is a norm on B₂[H].
(g) ST and TS lie in B₂[H], and max{‖ST‖₂, ‖TS‖₂} ≤ ‖S‖‖T‖₂, for every S in B[H] and every T in B₂[H]. Hint: ‖STe_γ‖² ≤ ‖S‖²‖Te_γ‖² and ‖(TS)*e_γ‖² ≤ ‖S‖²‖T*e_γ‖².
(h) B₂[H] is a two-sided ideal of B[H].
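In finite dimensions ‖·‖₂ is the Frobenius norm, so items (d), (e) and (g) of Problem 5.65 can be verified numerically (the matrices are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
T = rng.standard_normal((n, n))
S = rng.standard_normal((n, n))

hs = lambda A: np.linalg.norm(A, 'fro')   # ||.||_2 is the Frobenius norm here
op = lambda A: np.linalg.norm(A, 2)       # operator norm (largest singular value)

assert op(T) <= hs(T) + 1e-12                               # item (d)
assert hs(T + S) <= hs(T) + hs(S) + 1e-12                   # item (e)
assert max(hs(S @ T), hs(T @ S)) <= op(S) * hs(T) + 1e-12   # item (g)
```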
Problem 5.66. Consider the setup of the previous problem and prove:
(a) T + S ∈ B₁[H] and ‖T + S‖₁ ≤ ‖T‖₁ + ‖S‖₁ whenever T, S ∈ B₁[H].
Hint: Let T + S = W|T + S|, T = W₁|T| and S = W₂|S| be the polar decompositions of T + S, T and S, respectively, so that |T + S| = W*(T + S), |T| = W₁*T and |S| = W₂*S. Verify that
Σ_{γ∈Γ} (|T + S|e_γ; e_γ) ≤ Σ_{γ∈Γ} |(Te_γ; We_γ)| + Σ_{γ∈Γ} |(Se_γ; We_γ)|
= Σ_{γ∈Γ} |(|T|^{1/2}e_γ; |T|^{1/2}W₁*We_γ)| + Σ_{γ∈Γ} |(|S|^{1/2}e_γ; |S|^{1/2}W₂*We_γ)|
≤ (Σ_{γ∈Γ} ‖|T|^{1/2}e_γ‖²)^{1/2}(Σ_{γ∈Γ} ‖|T|^{1/2}W₁*We_γ‖²)^{1/2} + (Σ_{γ∈Γ} ‖|S|^{1/2}e_γ‖²)^{1/2}(Σ_{γ∈Γ} ‖|S|^{1/2}W₂*We_γ‖²)^{1/2}
≤ ‖|T|^{1/2}‖₂²‖W₁*W‖ + ‖|S|^{1/2}‖₂²‖W₂*W‖, with ‖W‖ = ‖W₁‖ = ‖W₂‖ = 1.
(b) B₁[H] is a linear space and ‖·‖₁ is a norm on B₁[H].
(c) B₁[H] ⊆ B₂[H] (i.e., every trace-class operator is Hilbert–Schmidt). If T lies in B₁[H], then ‖T‖₂ ≤ ‖T‖₁.
Hint: Problem 5.65(a,b,g) to prove the inclusion, and Problems 5.61(c) and 5.65(b) to prove the inequality.
(d) B₂[H] ⊆ B∞[H] (i.e., every Hilbert–Schmidt operator is compact).
Hint: Take T ∈ B₂[H], so that T* ∈ B₂[H] (Problem 5.65(c)), and hence Σ_{γ∈Γ} ‖T*e_γ‖² < ∞. Take an arbitrary integer n ≥ 1. There exists a finite N_n ⊆ Γ such that Σ_{γ∈N} ‖T*e_γ‖² < 1/n for all finite N ⊆ Γ\N_n (cf. Theorem 5.27). Thus Σ_{γ∈Γ\N_n} ‖T*e_γ‖² ≤ 1/n. Recall that Tx = Σ_{γ∈Γ} (Tx; e_γ)e_γ (cf. Theorem 5.48) and define Tₙ: H → H by Tₙx = Σ_{γ∈N_n} (Tx; e_γ)e_γ. Show that Tₙ lies in B₀[H] and ‖(T − Tₙ)x‖² = Σ_{γ∈Γ\N_n} |(x; T*e_γ)|² ≤ ‖x‖²/n, so that ‖Tₙ − T‖ → 0. Conclude: T ∈ B∞[H] (Problem 5.42).
(e) T ∈ B₁[H] if and only if T = AB for some A, B ∈ B₂[H].
Hint: Let T = W|T| = W|T|^{1/2}|T|^{1/2} be the polar decomposition of T. If T ∈ B₁[H], then use Problem 5.65(b,g). Conversely, suppose T = AB, where A, B ∈ B₂[H]. Since |T| = W*T, |T| = W*AB with A*W ∈ B₂[H] (Problem 5.65(c,g)). Verify that Σ_{γ∈Γ} (|T|e_γ; e_γ) ≤ Σ_{γ∈Γ} ‖Be_γ‖‖A*We_γ‖ ≤ (Σ_{γ∈Γ} ‖Be_γ‖²)^{1/2}(Σ_{γ∈Γ} ‖A*We_γ‖²)^{1/2}, and hence ‖T‖₁ ≤ ‖B‖₂‖A*W‖₂.
(f) ST and TS lie in B₁[H] for every T in B₁[H] and every S in B[H]. Hint: Apply (e). T = AB for some A, B ∈ B₂[H]; SA and BS lie in B₂[H], and so ST = (SA)B and TS = A(BS) lie in B₁[H].
(g) B₁[H] is a two-sided ideal of B[H].
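In finite dimensions ‖T‖₁ is the sum of the singular values of T, which makes items (a) and (c) of Problem 5.66 easy to check numerically (the matrices are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
T = rng.standard_normal((n, n))
S = rng.standard_normal((n, n))

tr_norm = lambda A: np.linalg.svd(A, compute_uv=False).sum()  # ||A||_1 = sum of singular values
hs_norm = lambda A: np.linalg.norm(A, 'fro')                  # ||A||_2

assert tr_norm(T + S) <= tr_norm(T) + tr_norm(S) + 1e-10   # item (a): triangle inequality
assert hs_norm(T) <= tr_norm(T) + 1e-12                    # item (c): ||T||_2 <= ||T||_1
```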
Problem 5.67. Let {e_γ}_{γ∈Γ} be an arbitrary orthonormal basis for a Hilbert space H and take any T ∈ B₁[H]. Show that
Σ_{γ∈Γ} |(Te_γ; e_γ)| < ∞.
Hint: 2|(Te_γ; e_γ)| = 2|(ABe_γ; e_γ)| ≤ 2‖Be_γ‖‖A*e_γ‖ for some A, B ∈ B₂[H] (Problem 5.66(e)), and hence 2|(Te_γ; e_γ)| ≤ ‖Be_γ‖² + ‖A*e_γ‖². Then
Σ_{γ∈Γ} |(Te_γ; e_γ)| ≤ ½(‖A‖₂² + ‖B‖₂²)
(cf. Problem 5.65(c)). Therefore, according to Corollary 5.29, {(Te_γ; e_γ)} is a summable family of scalars (since 𝔽 is a Banach space). Let Σ_{γ∈Γ} (Te_γ; e_γ) in 𝔽 be its sum and show that
Σ_{γ∈Γ} (Te_γ; e_γ) does not depend on {e_γ}_{γ∈Γ}.
Hint: Σ_{α∈Γ} (Te_α; e_α) = Σ_{α∈Γ} Σ_{β∈Γ} (Te_α; f_β)(f_β; e_α), where {e_γ}_{γ∈Γ} and {f_γ}_{γ∈Γ} are any orthonormal bases for H (cf. Theorem 5.48(c)). Now observe that Σ_{β∈Γ} Σ_{α∈Γ} (Te_α; f_β)(f_β; e_α) = Σ_{β∈Γ} (f_β; T*f_β) = Σ_{β∈Γ} (Tf_β; f_β).
If T ∈ B₁[H] and {e_γ}_{γ∈Γ} is any orthonormal basis for H, then set
tr(T) = Σ_{γ∈Γ} (Te_γ; e_γ), so that ‖T‖₁ = tr(|T|).
Hence B₁[H] = {T ∈ B[H]: tr(|T|) < ∞}. The number tr(T) is called the trace of T ∈ B₁[H] (thus the terminology "trace-class"). Warning: T ∈ B[H] and Σ_{γ∈Γ} (Te_γ; e_γ) < ∞ for some orthonormal basis {e_γ}_{γ∈Γ} for H does not imply T ∈ B₁[H]. However, if Σ_{γ∈Γ} (|T|e_γ; e_γ) < ∞ for some orthonormal basis {e_γ}_{γ∈Γ} for H, then T ∈ B₁[H] (Problem 5.64).
Problem 5.68. Consider the setup of the previous problem and prove:
(a) tr: B₁[H] → 𝔽 is a linear functional.
(b) |tr(T)| ≤ ‖T‖₁ for every T ∈ B₁[H] (i.e., tr: (B₁[H], ‖·‖₁) → 𝔽 is a contraction, and hence a bounded linear functional). Hint: Let T = W|T| be the polar decomposition of T. Recall that ‖W‖ = 1 and verify:
|tr(T)| ≤ Σ_{γ∈Γ} |(|T|^{1/2}e_γ; |T|^{1/2}W*e_γ)| ≤ (Σ_{γ∈Γ} ‖|T|^{1/2}e_γ‖²)^{1/2}(Σ_{γ∈Γ} ‖|T|^{1/2}W*e_γ‖²)^{1/2} ≤ ‖T‖₁
(Problem 5.65).
(c) tr(T*) is the complex conjugate of tr(T) for every T ∈ B₁[H].
(d) tr(TS) = tr(ST) whenever T ∈ B₁[H] and S ∈ B[H]. Hint: tr(TS) = Σ_{α∈Γ} (Se_α; T*e_α) = Σ_{α∈Γ} Σ_{β∈Γ} (Se_α; f_β)(f_β; T*e_α) and tr(ST) = Σ_{β∈Γ} (Tf_β; S*f_β) = Σ_{β∈Γ} Σ_{α∈Γ} (Tf_β; e_α)(e_α; S*f_β) (cf. Problem 5.66(f), item (c), and Theorem 5.48(c)); since (f_β; T*e_α) = (Tf_β; e_α) and (Se_α; f_β) = (e_α; S*f_β), the two double sums coincide.
(e) |tr(S|T|)| = |tr(|T|S)| ≤ ‖S‖‖T‖₁ if T ∈ B₁[H] and S ∈ B[H].
Hint: Problems 5.65(b,g) and 5.66(f). Verify: |Σ_{γ∈Γ} (S|T|e_γ; e_γ)| ≤ Σ_{γ∈Γ} |(|T|^{1/2}e_γ; |T|^{1/2}S*e_γ)| ≤ ‖S‖‖T‖₁, and apply item (d).
(f) T* ∈ B₁[H] and ‖T*‖₁ = ‖T‖₁ for every T ∈ B₁[H].
Hint: Let T = W₁|T| and T* = W₂|T*| be the polar decompositions of T and T*. Since |T*| = W₂*T* = W₂*|T|W₁*, it follows by Problems 5.65(b) and 5.66(f) that T* ∈ B₁[H]. Show that ‖T*‖₁ = tr(|T*|) = tr(W₂*|T|W₁*) ≤ ‖W₁*W₂‖‖T‖₁ (Problem 5.65(b) and items (d) and (e)). But ‖W₁*W₂‖ ≤ ‖W₁‖‖W₂‖ = 1. Therefore, ‖T*‖₁ ≤ ‖T‖₁. Dually, ‖T‖₁ ≤ ‖T*‖₁.
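The trace identities of Problems 5.67 and 5.68 reduce to familiar matrix facts in finite dimensions; a quick NumPy check (the matrices are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

tr = np.trace
tr_norm = lambda A: np.linalg.svd(A, compute_uv=False).sum()

assert np.isclose(tr(T @ S), tr(S @ T))               # Problem 5.68(d)
assert np.isclose(tr(T.conj().T), np.conj(tr(T)))     # Problem 5.68(c)
assert abs(tr(T)) <= tr_norm(T) + 1e-10               # Problem 5.68(b)
assert abs(tr(S @ T)) <= np.linalg.norm(S, 2) * tr_norm(T) + 1e-10  # a 5.68(e)-type bound
```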
(b) (B₂[H], (·;·)) is a Hilbert space.
Recall that B₀[H] ⊆ B₁[H] ⊆ B₂[H] ⊆ B∞[H] and B₀[H] is dense in the Banach space (B∞[H], ‖·‖). Now show that
(c) B₀[H] is dense in (B₁[H], ‖·‖₁) and in (B₂[H], ‖·‖₂).
Problem 5.70. Two normed spaces X and Y are topologically isomorphic if there exists a topological isomorphism between them (i.e., if there exists W in G[X,Y] - see Section 4.6). Two inner product spaces X and Y are unitarily equivalent if there exists a unitary transformation between them (i.e., if there exists U in G[X,Y] unitary - see Section 5.6). Two Hilbert spaces are topologically isomorphic if and only if they are unitarily equivalent. That is, if H and K are Hilbert spaces, then
G[H,K] ≠ ∅ if and only if {U ∈ G[H,K]: U is unitary} ≠ ∅.
Hint: If W ∈ G[H,K], then |W| = (W*W)^{1/2} ∈ G⁺[H] (see Problems 5.50(b) and 5.58(d)). Show that U = W|W|⁻¹ ∈ G[H,K] is unitary (Proposition 5.73) and that U|W| is the polar decomposition of W (Corollary 5.90).
6 The Spectral Theorem
The Spectral Theorem is a landmark in the theory of operators on Hilbert space, providing a full statement about the nature and structure of normal operators. Normal operators play a central role in operator theory; they will be defined in Section 6.1 below. It is customary to say that the Spectral Theorem can be applied to answer essentially all questions on normal operators. This indeed is the case as far as "essentially all" means "almost all" or "all the principal": there exist open questions on normal operators. First we consider the class of normal operators and its relatives (predecessors and successors). Next, the notion of spectrum of an operator acting on a complex Banach space is introduced. The Spectral Theorem for compact normal operators is fully investigated, yielding the concept of diagonalization. The Spectral Theorem for plain normal operators needs measure theory. We would not dare to relegate measure theory to an appendix just to support a proper proof of the Spectral Theorem for plain normal operators. Instead we assume just once, at the very last section of this book, that the reader has some familiarity with measure theory, just enough to grasp the statement of the Spectral Theorem for plain normal operators after having proved it for compact normal operators.
6.1
Normal Operators
Throughout this section H stands for a Hilbert space. An operator T ∈ B[H] is normal if it commutes with its adjoint (i.e., T is normal if T*T = TT*). Here is another characterization of normal operators.
Proposition 6.1. The following assertions are pairwise equivalent.
(a) T is normal.
(b) ‖T*x‖ = ‖Tx‖ for every x ∈ H.
(c) Tⁿ is normal for every positive integer n.
(d) ‖T*ⁿx‖ = ‖Tⁿx‖ for every x ∈ H and every n ≥ 1.
Proof. If T ∈ B[H], then ‖T*x‖² − ‖Tx‖² = ((TT* − T*T)x; x) for every x ∈ H. Since TT* − T*T is self-adjoint, it follows by Corollary 5.80 that TT* = T*T if and only if ‖T*x‖ = ‖Tx‖ for every x ∈ H. This shows that (a)⇔(b). Therefore, as T*ⁿ = Tⁿ* for every n ≥ 1 (cf. Problem 5.24), (c)⇔(d). If T* commutes with T, then it commutes with Tⁿ and, dually, Tⁿ commutes with T*ⁿ. Hence (a)⇒(c). Since (d)⇒(b) trivially, the proposition is proved. □
Clearly, every self-adjoint operator is normal (T* = T implies T*T = TT* = T²), and so are the nonnegative operators and, in particular, the orthogonal projections (cf. Proposition 5.81). It is also clear that every unitary operator is normal (U ∈ B[H] is unitary if and only if U*U = UU* = I - cf. Proposition 5.73). In fact, normality distinguishes the orthogonal projections among the projections, and the unitaries among the isometries.
Proposition 6.2. P ∈ B[H] is an orthogonal projection if and only if it is a normal projection.
Proof. If P is an orthogonal projection, then it is a self-adjoint projection (Proposition 5.81), and hence a normal projection. On the other hand, if P is normal, then ‖P*x‖ = ‖Px‖ for every x ∈ H (by the previous proposition), so that N(P*) = N(P). If P is a projection, then R(P) = N(I − P), so that R(P) = R(P)⁻ by Proposition 4.13. Therefore, if P is a normal projection, then N(P)⊥ = N(P*)⊥ = R(P)⁻ = R(P) (Proposition 5.76), so that R(P) ⊥ N(P), and hence P is an orthogonal projection. □
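Proposition 6.1(b) can be illustrated numerically: a normal matrix satisfies ‖N*x‖ = ‖Nx‖ for every x, while a Jordan block does not. (The matrices below are arbitrary stand-ins.)

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
# A normal matrix: unitarily diagonalizable with complex eigenvalues.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
N = U @ np.diag(rng.standard_normal(n) + 1j * rng.standard_normal(n)) @ U.conj().T
J = np.diag(np.ones(n - 1), 1)   # nilpotent Jordan block: not normal

for x in rng.standard_normal((10, n)):
    assert np.isclose(np.linalg.norm(N @ x), np.linalg.norm(N.conj().T @ x))

# For J the two norms differ at e_0: ||J e_0|| = 0 while ||J* e_0|| = 1.
e0 = np.eye(n)[0]
assert not np.isclose(np.linalg.norm(J @ e0), np.linalg.norm(J.conj().T @ e0))
```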
Proposition 6.3. Let U be an operator in B[H]. The following assertions are pairwise equivalent.
(a) U is unitary (i.e., U*U = UU* = I).
(b) ‖U*ⁿx‖ = ‖Uⁿx‖ = ‖x‖ for every x ∈ H and every n ≥ 1.
(c) ‖U*x‖ = ‖Ux‖ = ‖x‖ for every x ∈ H.
(d) U is a normal isometry.
Proof. (a)⇒(b) by Propositions 4.37 and 5.73, (b)⇒(c) trivially, and (c)⇒(d)⇒(a) by Propositions 4.37 and 5.61. □
Given any operator T ∈ B[H], set
D = T*T − TT* in B[H].
Recall that D = D* (i.e., D is always self-adjoint). Moreover,
T is normal if and only if D = O.
An operator T ∈ B[H] is quasinormal if it commutes with T*T; that is, if T*TT = TT*T or, equivalently, if (T*T − TT*)T = O. Therefore,
T is quasinormal if and only if DT = O.
It is clear that every normal operator is quasinormal. Observe that every isometry is quasinormal. (Proof: If V ∈ B[H] is an isometry, then V*V = I, so that V*VV − VV*V = O.)
VV*V = 0.) Proposition 6.4. If T = W Q is the polar decomposition of an operator T E B[7{], then (a)
W Q = QW if and only if T is quasinormal. In this case we have WT = T W and QT = T Q. Moreover, (b) if T is normal, then W I N(w) . is unitary. That is, the partial isometry of the polar decomposition of
any normal operator is, in fact, a "partial unitary transformation " in the following sense. W = U P, where P is the orthogonal projection onto Nl, with N = N(T) _
N(W) = N(Q), and U : Nl - 7l is unitary. Proof. (a) Let T = W Q be the polar decomposition of T so that Q2 = T *T
(Theorem5.89). If WQ = QW,then Q2W = QWQ = WQ2, and hence TT*T = WQQ2 = Q2WQ = T*TT (i.e., T is quasinormal). Conversely, if TT*T = TT*T, then TQ2 = Q2T. Thus TQ = QT by Theorem 5.85 (for Q = (Q2)71)
so that WQQ = QWQ; that is, (WQ - QW)Q = 0. Therefore, 7Z(Q)- c N(WQ - QW) and so
N(Q)1 c N(WQ - QW) by Proposition 5.76 (for Q = Q). Recall that N(Q) = N(W) (Theorem 5.89). If U E N(Q), then u E N(W) so that (WQ - QW)u = 0. Hence
N(Q) c N(WQ - QW). The above displayed inclusions imply N(WQ - QW) = 7-l (Problem 5.7(b)); that is, W Q = QW. Since T = W Q, it follows at once that W and Q commute with T whenever they commute with each other.
(b) Recall from Theorem 5.89 that the null spaces of T, W and Q coincide. Thus (cf. Proposition 5.86) set
N = N(T) = N(W) = N(Q) = N(Q²).
According to Proposition 5.87, W = VP, where V: N⊥ → H is an isometry and P: H → H is the orthogonal projection onto N⊥. Since R(Q)⁻ = N(Q)⊥ = N⊥ = R(P), it follows that PQ = Q. Taking the adjoint and recalling that P = P* (Proposition 5.81), we get
PQ = QP = Q.
Moreover, since V ∈ B[N⊥, H], its adjoint V* lies in B[H, N⊥]. Then R(V*) ⊆ N⊥ = R(P), which implies that PV* = V*. Hence
VPV* = VV*.
These identities hold for the polar decomposition of every T ∈ B[H]. Now suppose T is normal, so that T is quasinormal. By part (a) we get
Q² = T*T = TT* = WQQW* = Q²WW* = Q²VPV* = Q²VV*.
Therefore Q²(I − VV*) = O, and hence
R(I − VV*) ⊆ N(Q²) = N.
However, since VV* is the orthogonal projection onto R(V), Proposition 5.76 and the identity TT* = T*T yield
R(V) = R(W) = R(T)⁻ = R(TT*)⁻ = R(T*T)⁻ = R(Q²)⁻ = N(Q²)⊥ = N⊥.
Thus I − VV* is the orthogonal projection onto N, so that VV*x = x for every x ∈ N⊥. That is, V maps N⊥ isometrically onto R(V) = N⊥, and hence V: N⊥ → N⊥ is a surjective isometry. Outcome: V is unitary (Proposition 5.73). □
A part of an operator is a restriction of it to an invariant subspace. For instance, every unilateral shift is a part of some bilateral shift (of the same multiplicity). This takes a little proving. In this sense, every unilateral shift has an extension that is a bilateral shift. Recall that unilateral shifts are isometries, and bilateral shifts are unitary operators (see Problems 5.29 and 5.30). The above italicized result can be extended as follows: every isometry is a part of a unitary operator. This takes a little proving too. Since every isometry is quasinormal, and since every unitary operator is normal, we might expect that every quasinormal operator is a part of a normal operator. This actually is the case. We shall call an operator subnormal if it is a part of a normal operator; equivalently, if it has a normal extension. Precisely, an operator T on a Hilbert space H is subnormal if there exists a Hilbert space K including H and a normal operator N on K such that H is N-invariant (i.e., N(H) ⊆ H) and T
is the restriction of N to H (i.e., T = N|_H). In other words, T ∈ B[H] is subnormal if H is a subspace of a larger Hilbert space K, so that K = H ⊕ H⊥ by Theorem 5.25, and the operator
N = ( T  X ): H ⊕ H⊥ → H ⊕ H⊥
    ( O  Y )
in B[K] is normal for some X ∈ B[H⊥, H] and Y ∈ B[H⊥] (see Example 2O). Recall that, writing the orthogonal direct sum decomposition K = H ⊕ H⊥, we are identifying H ⊆ K with H ⊕ {0} (which is a subspace of H ⊕ H⊥) and H⊥ ⊆ K with {0} ⊕ H⊥ (which also is a subspace of H ⊕ H⊥).
Proposition 6.5. Every quasinormal operator is subnormal.
Proof. Suppose T ∈ B[H] is quasinormal.
Claim. N(T) reduces T.
Proof. If T is quasinormal, then T*T commutes with both T and T*, so that N(T*T) reduces T (cf. Problem 5.34). But N(T*T) = N(T) by Proposition 5.76. □
Thus T = O ⊕ S on H = N(T) ⊕ N(T)⊥, with O = T|_{N(T)}: N(T) → N(T) and S = T|_{N(T)⊥}: N(T)⊥ → N(T)⊥. Note that T*T = O ⊕ S*S, and so
(O ⊕ S*S)(O ⊕ S) = T*TT = TT*T = (O ⊕ S)(O ⊕ S*S).
Then O ⊕ S*SS = O ⊕ SS*S, and hence S*SS = SS*S. That is, S is quasinormal. Since N(S) = N(T|_{N(T)⊥}) = {0}, it follows by Corollary 5.90 that the partial isometry of the polar decomposition of S ∈ B[N(T)⊥] is an isometry. Therefore S = VQ, where V ∈ B[N(T)⊥] is an isometry (so that V*V = I) and Q ∈ B[N(T)⊥] is nonnegative. But S = VQ = QV by Proposition 6.4, and hence S* = QV* = V*Q. Set
U = ( V  I − VV* )      and      R = ( Q  O )
    ( O     V*   )                   ( O  Q )
in B[N(T)⊥ ⊕ N(T)⊥]. Observe that U is unitary. In fact,
U*U = ( V*V            V*(I − VV*)    ) = ( I  O )
      ( (I − VV*)V   (I − VV*)² + VV* )   ( O  I )
and similarly UU* = I, using V*V = I (so that V*(I − VV*) = O and (I − VV*)V = O) and (I − VV*)² = I − VV*. Also note that the nonnegative operator R commutes with U: since VQ = QV and V*Q = QV* imply VV*Q = QVV*, we get
UR = ( VQ  (I − VV*)Q ) = ( S  Q(I − VV*) ) = RU.
     ( O       V*Q    )   ( O      S*     )
Now set N = UR in B[N(T)⊥ ⊕ N(T)⊥]. The middle operator matrix above says that S is a part of N (i.e., N(T)⊥ is N-invariant and S = N|_{N(T)⊥}). Finally, note that
N*N = RU*UR = R² = R²U*U = UR²U* = NN*
and, therefore, N is normal. Conclusion: S is subnormal, and so is T = O ⊕ S, because T is a part of the normal operator O ⊕ N on N(T) ⊕ N(T)⊥ ⊕ N(T)⊥. □
An operator T ∈ B[H] is hyponormal if TT* ≤ T*T. In other words,
T is hyponormal if and only if D ≥ O.
Recall that T*T and TT* are nonnegative, and D = T*T − TT* is self-adjoint, for every T ∈ B[H].
Proposition 6.6. An operator T ∈ B[H] is hyponormal if and only if ‖T*x‖ ≤ ‖Tx‖ for every x ∈ H.
Proof. TT* ≤ T*T if and only if (TT*x; x) ≤ (T*Tx; x) or, equivalently, ‖T*x‖ ≤ ‖Tx‖ for every x ∈ H. □
An operator T ∈ B[H] is cohyponormal if its adjoint T* ∈ B[H] is hyponormal (i.e., if T*T ≤ TT* or, equivalently, if D ≤ O, which means by the above proposition that ‖Tx‖ ≤ ‖T*x‖ for every x ∈ H). Hence T is normal if and only if it is both hyponormal and cohyponormal (cf. Propositions 6.1 and 6.6). If an operator is either hyponormal or cohyponormal, then it is called seminormal. Every normal operator is trivially hyponormal. The next proposition goes beyond that.
Proposition 6.7. Every subnormal operator is hyponormal.
Proof. If T ∈ B[H] is subnormal, then H is a subspace of a larger Hilbert space K, so that K = H ⊕ H⊥, and the operator
N = ( T  X ): H ⊕ H⊥ → H ⊕ H⊥
    ( O  Y )
in B[K] is normal for some X ∈ B[H⊥, H] and Y ∈ B[H⊥]. Then
N*N = ( T*  O  )( T  X ) = ( T*T      T*X     )
      ( X*  Y* )( O  Y )   ( X*T   X*X + Y*Y )
and
NN* = ( T  X )( T*  O  ) = ( TT* + XX*   XY* )
      ( O  Y )( X*  Y* )   (    YX*      YY* ).
Since N*N = NN*, comparing the upper-left entries gives T*T = TT* + XX*, and hence T*T − TT* = XX* ≥ O. □
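The block computation in the proof of Proposition 6.7 is plain matrix algebra, and the two product formulas can be checked with random blocks (normality of N is not needed for the formulas themselves):

```python
import numpy as np

rng = np.random.default_rng(7)
k, m = 3, 2
T = rng.standard_normal((k, k))
X = rng.standard_normal((k, m))
Y = rng.standard_normal((m, m))

N = np.block([[T, X], [np.zeros((m, k)), Y]])
NstarN = N.T @ N
NNstar = N @ N.T

# Blocks of the two products, as in the displayed computation:
assert np.allclose(NstarN[:k, :k], T.T @ T)
assert np.allclose(NNstar[:k, :k], T @ T.T + X @ X.T)
assert np.allclose(NstarN[k:, k:], X.T @ X + Y.T @ Y)
```

If N happened to be normal, equating the two upper-left blocks would give exactly T*T − TT* = XX* ≥ O, the hyponormality of T.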
Let X be a normed space and take any T ∈ B[X]. A trivial induction (Problem 4.47(a)) shows that ‖Tⁿ‖ ≤ ‖T‖ⁿ for every n ≥ 0.
Lemma 6.8. If X is a normed space and T ∈ B[X], then the real-valued sequence {‖Tⁿ‖^{1/n}} converges in ℝ.
Proof. The proof uses the following bit of elementary number theory. Take an arbitrary m ∈ ℕ. Every n ∈ ℕ can be written as n = m pₙ + qₙ with pₙ, qₙ nonnegative integers and qₙ < m. Hence
‖Tⁿ‖ = ‖T^{m pₙ + qₙ}‖ ≤ ‖T^{m pₙ}‖ ‖T^{qₙ}‖ ≤ ‖Tᵐ‖^{pₙ} ‖T^{qₙ}‖.
Set μ = max_{0≤q<m} ‖T^q‖, so that
‖Tⁿ‖^{1/n} ≤ ‖Tᵐ‖^{pₙ/n} μ^{1/n}.
Since μ^{1/n} → 1 and ‖Tᵐ‖^{pₙ/n} → ‖Tᵐ‖^{1/m} as n → ∞ (because pₙ/n → 1/m), it follows that
lim supₙ ‖Tⁿ‖^{1/n} ≤ ‖Tᵐ‖^{1/m}
for every m ∈ ℕ. Therefore, lim supₙ ‖Tⁿ‖^{1/n} ≤ lim infₘ ‖Tᵐ‖^{1/m}, and so (cf. Problem 3.13) {‖Tⁿ‖^{1/n}} converges in ℝ. □
We shall denote the limit of {‖Tⁿ‖^{1/n}} by r(T):
r(T) = limₙ ‖Tⁿ‖^{1/n}.
According to the above proof we get r(T) ≤ ‖Tⁿ‖^{1/n} for every n ≥ 1. In particular,
r(T) ≤ ‖T‖.
Also note that r(T^k)^{1/k} = limₙ ‖T^{kn}‖^{1/kn} = r(T) for each k ≥ 1, because {‖T^{kn}‖^{1/kn}} is a subsequence of the convergent sequence {‖Tⁿ‖^{1/n}}. Thus r(T^k) = r(T)^k for every positive integer k. Therefore, if T ∈ B[X] is an operator on a normed space X, then
r(T)ⁿ = r(Tⁿ) ≤ ‖Tⁿ‖ ≤ ‖T‖ⁿ for each integer n ≥ 0.
Definition: If r(T) = ‖T‖, then we say that T ∈ B[X] is normaloid.
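The limit r(T) = limₙ ‖Tⁿ‖^{1/n} can be watched numerically. For matrices, r(T) is the largest modulus of an eigenvalue; a nonnormal block below has ‖T‖ > r(T) (so T is not normaloid), while ‖Tⁿ‖^{1/n} still descends toward r(T):

```python
import numpy as np

# A Jordan-type block: both eigenvalues equal 0.5, but the matrix is not normal.
T = np.array([[0.5, 1.0], [0.0, 0.5]])
r = max(abs(np.linalg.eigvals(T)))       # spectral radius r(T) = 0.5

powers = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) ** (1.0 / n)
          for n in (1, 10, 100)]

assert powers[0] > r                         # ||T|| > r(T): T is not normaloid
assert abs(powers[-1] - r) < 0.05            # ||T^n||^{1/n} approaches r(T)
assert all(p >= r - 1e-12 for p in powers)   # r(T) <= ||T^n||^{1/n} for every n
```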
Proposition 6.9. An operator T ∈ B[X] on a normed space X is normaloid if and only if ‖Tⁿ‖ = ‖T‖ⁿ for every integer n ≥ 0.
Proof. If r(T) = ‖T‖, then ‖Tⁿ‖ = ‖T‖ⁿ for every n ≥ 0 by the above displayed inequalities. Conversely, if ‖Tⁿ‖ = ‖T‖ⁿ for every n ≥ 0, then r(T) = limₙ ‖Tⁿ‖^{1/n} = ‖T‖. □
Proposition 6.10. Every hyponormal operator is normaloid.
Proof. Let T ∈ B[H] be a hyponormal operator.
Claim 1. ‖Tⁿ‖² ≤ ‖T^{n+1}‖‖T^{n−1}‖ for every positive integer n.
Proof. Take any T ∈ B[H] and note that
‖Tⁿx‖² = (Tⁿx; Tⁿx) = (T*Tⁿx; T^{n−1}x) ≤ ‖T*Tⁿx‖‖T^{n−1}x‖
for each integer n ≥ 1 and every x ∈ H. If T is hyponormal, then ‖T*Tⁿx‖ ≤ ‖T^{n+1}x‖ by Proposition 6.6 (applied to the vector Tⁿx), and hence
‖Tⁿx‖² ≤ ‖T^{n+1}x‖‖T^{n−1}x‖ ≤ ‖T^{n+1}‖‖T^{n−1}‖‖x‖²
for each n ≥ 1 and every x ∈ H, which ensures the claimed result. □
Claim 2. ‖T^k‖ = ‖T‖^k for every positive integer k ≤ n, for all n ≥ 1.
Proof. The above result holds trivially if T = O, and it also holds trivially for n = 1 (for all T ∈ B[H]). Let T ≠ O and suppose the above result holds for some integer n ≥ 1. By Claim 1 we get
‖T‖^{2n} = (‖T‖ⁿ)² = ‖Tⁿ‖² ≤ ‖T^{n+1}‖‖T^{n−1}‖ = ‖T^{n+1}‖‖T‖^{n−1}.
Therefore, as ‖Tⁿ‖ ≤ ‖T‖ⁿ for every n ≥ 0, and since T ≠ O,
‖T‖^{n+1} = ‖T‖^{2n}(‖T‖^{n−1})⁻¹ ≤ ‖T^{n+1}‖ ≤ ‖T‖^{n+1}.
Hence ‖T^{n+1}‖ = ‖T‖^{n+1}. Then the claimed result holds for n + 1 whenever it holds for n, which concludes the proof by induction. □
Outcome: ‖Tⁿ‖ = ‖T‖ⁿ for every integer n ≥ 0, so that T is normaloid by Proposition 6.9. □
Since ‖T*ⁿ‖ = ‖Tⁿ‖ for each n ≥ 0 (cf. Problem 5.24), it follows that r(T*) = r(T). Thus T is normaloid if and only if T* is normaloid, and hence every seminormal operator is normaloid. Summing up: an operator T is normal if it commutes with its adjoint, quasinormal if it commutes with T*T, subnormal if it is a restriction of a normal operator to an invariant subspace, hyponormal if TT* ≤ T*T, and normaloid if r(T) = ‖T‖. These classes are related by proper inclusion as follows.
Normal ⊂ Quasinormal ⊂ Subnormal ⊂ Hyponormal ⊂ Normaloid.
Example 6A. We shall verify that the above inclusions are, in fact, proper. The unilateral shift will do the whole job. First recall that a unilateral shift S₊ is an isometry but not a coisometry, and hence S₊ is a nonnormal quasinormal operator. Since S₊ is subnormal, A = I + S₊ is subnormal (i.e., if N is a normal extension of S₊, then I + N is a normal extension of A). However, since S₊ is a nonnormal isometry,
A*AA − AA*A = A*AS₊ − S₊A*A = S₊*S₊ − S₊S₊* ≠ O,
and therefore A is not quasinormal. Check that B = S₊* + 2S₊ is hyponormal but B² is not hyponormal. Since the square of every subnormal operator is again a subnormal operator, it follows that B is not subnormal. Finally, S₊* is normaloid (cf. Proposition 6.9) but not hyponormal.
6.2 The Spectrum of an Operator
Let T ∈ L[D(T), X] be a linear transformation, where X ≠ {0} is a normed space and D(T), the domain of T, is a linear manifold of X. Let I be the identity on X. The resolvent set ρ(T) of T is the set of all scalars λ ∈ 𝔽 for which (λI − T) ∈ L[D(T), X] has a densely defined continuous inverse. That is,
ρ(T) = {λ ∈ 𝔽: (λI − T)⁻¹ ∈ B[R(λI − T), D(T)] and R(λI − T)⁻ = X}.
Henceforward, all linear transformations are operators on a complex Banach space. In other words, T ∈ B[X], where D(T) = X ≠ {0} is a complex Banach space, and the linear T: X → X is bounded. In such a case (i.e., in the Banach algebra B[X]), Corollary 4.24 ensures that the resolvent set ρ(T), defined as above, is precisely the set of all those complex numbers λ for which (λI − T) ∈ B[X] is invertible (i.e., has a bounded inverse on X). Equivalently (cf. Theorem 4.22),
ρ(T) = {λ ∈ ℂ: (λI − T) ∈ G[X]} = {λ ∈ ℂ: N(λI − T) = {0} and R(λI − T) = X}.
The complement of ρ(T), denoted by σ(T), is the spectrum of T. Thus
σ(T) = ℂ\ρ(T) = {λ ∈ ℂ: N(λI − T) ≠ {0} or R(λI − T) ≠ X}.
Proposition 6.11. If λ ∈ ρ(T), then δ = ‖(λI − T)⁻¹‖⁻¹ is a positive number, the open ball B_δ(λ) with center at λ and radius δ is included in ρ(T), and hence δ ≤ d(λ, σ(T)).
Proof. If λ ∈ ρ(T), then (λI − T)⁻¹ is a nonzero operator in G[X], so that δ = ‖(λI − T)⁻¹‖⁻¹ > 0. Let B_δ(0) be the nonempty open ball of radius δ about the origin of the complex plane ℂ and take an arbitrary μ in B_δ(0). Since |μ| < ‖(λI − T)⁻¹‖⁻¹, it follows that ‖μ(λI − T)⁻¹‖ < 1. Then I − μ(λI − T)⁻¹ ∈ G[X] (see Problem 4.48(a)), and so (λ − μ)I − T = (λI − T)[I − μ(λI − T)⁻¹] also lies in G[X] by Corollary 4.23. Outcome: λ − μ ∈ ρ(T), so that
B_δ(λ) = B_δ(0) + λ = {ν ∈ ℂ: ν = μ + λ for some μ ∈ B_δ(0)} ⊆ ρ(T).
Therefore, |λ − ζ| ≥ δ for every ζ ∈ σ(T) = ℂ\ρ(T), and hence d(λ, σ(T)) = inf_{ζ∈σ(T)} |λ − ζ| ≥ δ. □
Corollary 6.12. The resolvent set ρ(T) is nonempty and open, and the spectrum σ(T) is compact.
Proof. If T ∈ B[X] is an operator on a Banach space X, then the von Neumann expansion (Problem 4.47) ensures that λ ∈ ρ(T) whenever ‖T‖ < |λ|. Since σ(T) = ℂ\ρ(T), this is equivalent to
|λ| ≤ ‖T‖ for every λ ∈ σ(T).
Thus σ(T) is bounded, and therefore ρ(T) ≠ ∅. Proposition 6.11 says that ρ(T) includes a nonempty open ball centered at each one of its points. That is, ρ(T) is open and, consequently, σ(T) is closed. In ℂ, closed and bounded means compact (Theorem 3.83). □
The mapping R: ρ(T) → G[X] such that R(λ) = (λI − T)⁻¹ for every λ in ρ(T) is called the resolvent function of T. Since
R(λ) − R(μ) = R(λ)(R(μ)⁻¹ − R(λ)⁻¹)R(μ),
we get
R(λ) − R(μ) = (μ − λ)R(λ)R(μ) for every λ, μ ∈ ρ(T).
This is referred to as the resolvent identity. Swapping λ and μ in the resolvent identity, it follows that R(λ) and R(μ) commute for every λ, μ ∈ ρ(T). Also note that TR(λ) = R(λ)T for every λ ∈ ρ(T) (trivially, R(λ)⁻¹R(λ) = R(λ)R(λ)⁻¹). To prove the next proposition we need a piece of elementary complex analysis. Let Λ be a nonempty and open subset of the complex plane ℂ. Take a function f: Λ → ℂ and a point μ ∈ Λ. Suppose f′(μ) is a complex number with the following property: for every ε > 0 there exists δ > 0 such that |(f(λ) − f(μ))/(λ − μ) − f′(μ)| < ε for all λ in Λ for which 0 < |λ − μ| < δ. If there exists such an f′(μ) ∈ ℂ, then it is called the derivative of f at μ. If f′(μ) exists for every μ in Λ, then f: Λ → ℂ is analytic on Λ. A function f: ℂ → ℂ is entire if it is analytic on the whole complex plane ℂ. The Liouville Theorem is the result we need. It says that every bounded entire function is constant.
Proposition 6.13. The spectrum σ(T) is nonempty.
Proof. Let T ∈ B[X] be an operator on a complex Banach space X. Take an arbitrary nonzero element φ in the dual B[X]* of B[X] (i.e., an arbitrary nonzero bounded linear functional φ: B[X] → ℂ; note that B[X] ≠ {0} because X ≠ {0}, and so B[X]* ≠ {0} by Corollary 4.64). Recall that ρ(T) is nonempty and open in ℂ.
Claim 1. φ∘R: ρ(T) → ℂ is bounded.
Proof. The resolvent function R: ρ(T) → G[X] is continuous (reason: scalar multiplication and addition are continuous mappings, and so is inversion, by Problem 4.48(c)), so that ‖R(·)‖: ρ(T) → ℝ is continuous. Then sup_{|λ|≤2‖T‖} ‖R(λ)‖ < ∞ by Theorem 3.86. On the other hand, if ‖T‖ < |λ|, then ‖R(λ)‖ = ‖(λI − T)⁻¹‖ ≤ (|λ| − ‖T‖)⁻¹ (Problem 4.47(h)), so that ‖R(λ)‖ → 0 as |λ| → ∞. Thus, as ‖R(·)‖: ρ(T) → ℝ is continuous, sup_{|λ|≥2‖T‖} ‖R(λ)‖ < ∞. Hence sup_{λ∈ρ(T)} ‖R(λ)‖ < ∞, and therefore
sup_{λ∈ρ(T)} |(φ∘R)(λ)| ≤ ‖φ‖ sup_{λ∈ρ(T)} ‖R(λ)‖ < ∞. □
Claim 2. φ∘R: ρ(T) → ℂ is analytic.
Proof. If λ and μ are distinct points in ρ(T), then
(R(λ) − R(μ))/(λ − μ) + R(μ)² = (R(μ) − R(λ))R(μ)
by the resolvent identity. Set f = φ∘R: ρ(T) → ℂ, and let f′: ρ(T) → ℂ be defined by f′(λ) = −φ(R(λ)²) for each λ ∈ ρ(T). Therefore,
|(f(λ) − f(μ))/(λ − μ) − f′(μ)| = |φ((R(μ) − R(λ))R(μ))| ≤ ‖φ‖‖R(μ)‖‖R(μ) − R(λ)‖,
so that f: ρ(T) → ℂ is analytic because R: ρ(T) → G[X] is continuous. □
If σ(T) = ∅ (i.e., if ρ(T) = ℂ), then φ∘R: ℂ → ℂ is a bounded entire function, and so a constant function by the Liouville Theorem. But we have just seen (proof of Claim 1) that ‖R(λ)‖ → 0 as |λ| → ∞. Hence φ(R(λ)) → 0 as |λ| → ∞ (since φ is continuous). Then φ∘R = 0 for all φ ∈ B[X]*, so that R = 0 (Corollary 4.64). That is, (λI − T)⁻¹ = O for every λ ∈ ℂ, which is a contradiction (O ∉ G[X]). Thus σ(T) ≠ ∅. □
Remark: σ(T) is compact and nonempty, and so is its boundary ∂σ(T); in particular, ∂σ(T) ≠ ∅ (see Problem 3.41).
The spectrum σ(T) is the set of all λ in ℂ such that (λI − T) fails to be invertible (i.e., fails to have a bounded inverse on R(λI − T) = X). According to the origin of such a failure, σ(T) can be split into many disjoint parts. A classical partition comprises three parts. The set of those λ such that (λI − T) has no inverse is the point spectrum:
A scalar λ ∈ C is called an eigenvalue of T if there exists a nonzero vector x in X such that Tx = λx; equivalently, if N(λI − T) ≠ {0}. If λ ∈ C is an eigenvalue of T, then the nonzero vectors in N(λI − T) are the eigenvectors of T, and N(λI − T) is the eigenspace (which, in fact, is a subspace of X; cf. Proposition 4.13) associated with λ. The multiplicity of an eigenvalue is the dimension of the respective eigenspace. After this quick digression on eigenvalues and eigenvectors, note that the point spectrum of T is precisely the set of all eigenvalues of T. The set of those λ for which (λI − T) has a densely defined but unbounded inverse on its range is the continuous spectrum:
  σ_C(T) = {λ ∈ C: N(λI − T) = {0}, R(λI − T)⁻ = X and R(λI − T) ≠ X}

(see Corollary 4.24 again). If (λI − T) has an inverse on its range that is not densely defined, then λ belongs to the residual spectrum:

  σ_R(T) = {λ ∈ C: N(λI − T) = {0} and R(λI − T)⁻ ≠ X}.
6. The Spectral Theorem
The collection {σ_P(T), σ_C(T), σ_R(T)} of subsets of σ(T) is a disjoint covering of it. That is, they are pairwise disjoint and

  σ(T) = σ_P(T) ∪ σ_C(T) ∪ σ_R(T).

The diagram below summarizes such a partition of the spectrum, where σ_R(T) = σ_R1(T) ∪ σ_R2(T) and σ_P(T) = ⋃_{i=1}^{4} σ_Pi(T). Here we adopt the following notation: T_λ = (λI − T), N_λ = N(T_λ), and R_λ = R(T_λ); cells marked ∅ are empty.

                                    R_λ = X    R_λ⁻ = X ≠ R_λ    R_λ = R_λ⁻ ≠ X    R_λ⁻ ≠ X, R_λ ≠ R_λ⁻
  N_λ = {0}, T_λ⁻¹ ∈ B[R_λ, X]      ρ(T)           ∅               σ_R1(T)               ∅
  N_λ = {0}, T_λ⁻¹ ∉ B[R_λ, X]       ∅           σ_C(T)              ∅                 σ_R2(T)
  N_λ ≠ {0}                        σ_P1(T)       σ_P2(T)           σ_P4(T)             σ_P3(T)

The columns in which R_λ⁻ ≠ X make up the compression spectrum σ_CP(T) = σ_P3(T) ∪ σ_P4(T) ∪ σ_R(T), defined below.
Recall that σ(T) ≠ ∅, but any of the above disjoint parts of the spectrum may be empty (as we shall see in Section 6.5). However, if σ_P(T) is nonempty, then every set of eigenvectors associated with distinct eigenvalues is linearly independent.

Proposition 6.14. Let {λ_γ}_{γ∈Γ} be any family of distinct eigenvalues of T. For each γ ∈ Γ let x_γ be an eigenvector associated with λ_γ (i.e., 0 ≠ x_γ ∈ N(λ_γI − T) ≠ {0}). The set {x_γ}_{γ∈Γ} is linearly independent.

Proof. Consider the set {x_γ}_{γ∈Γ} (whose existence is ensured by the Axiom of Choice). Let {x_i}_{i=1}^{n} be an arbitrary finite subset of {x_γ}_{γ∈Γ}.

Claim. {x_i}_{i=1}^{n} is linearly independent.

Proof. If n = 1, then linear independence is trivial. Suppose {x_i}_{i=1}^{n} is linearly independent for some n ≥ 1. If {x_i}_{i=1}^{n+1} is not linearly independent, then x_{n+1} = Σ_{i=1}^{n} α_i x_i, where {α_i}_{i=1}^{n} is a family of complex numbers with at least one nonzero number among them. Thus

  λ_{n+1} x_{n+1} = T x_{n+1} = Σ_{i=1}^{n} α_i T x_i = Σ_{i=1}^{n} α_i λ_i x_i.

If λ_{n+1} = 0, then λ_i ≠ 0 for every i ≠ n+1 and Σ_{i=1}^{n} α_i λ_i x_i = 0 with some coefficient α_i λ_i nonzero, so that {x_i}_{i=1}^{n} is not linearly independent, which is a contradiction. If λ_{n+1} ≠ 0, then x_{n+1} = Σ_{i=1}^{n} α_i (λ_i/λ_{n+1}) x_i, and therefore Σ_{i=1}^{n} α_i (1 − λ_i/λ_{n+1}) x_i = 0, so that {x_i}_{i=1}^{n} is not linearly independent (because λ_i ≠ λ_{n+1} for every i ≠ n+1, and hence some coefficient α_i(1 − λ_i/λ_{n+1}) is nonzero), which is again a contradiction. This completes the proof by induction. □

Outcome: {x_γ}_{γ∈Γ} is linearly independent by Proposition 2.3. □
There are some overlapping parts of the spectrum which are commonly used too. For instance, the compression spectrum σ_CP(T) and the approximate point spectrum (or approximation spectrum) σ_AP(T) are defined by

  σ_CP(T) = {λ ∈ C: R(λI − T) is not dense in X} = σ_P3(T) ∪ σ_P4(T) ∪ σ_R(T),

  σ_AP(T) = {λ ∈ C: (λI − T) is not bounded below} = σ_P(T) ∪ σ_C(T) ∪ σ_R2(T) = σ(T)\σ_R1(T).

Next we give an alternative definition of σ_AP(T) which may serve as a motivation for the term "approximate point spectrum". Its elements are sometimes referred to as the approximate eigenvalues of T.
Proposition 6.15. The following assertions are pairwise equivalent.

(a) λ ∈ σ_AP(T).
(b) There exists an X-valued sequence {x_n} of unit vectors such that ‖(λI − T)x_n‖ → 0.
(c) For every ε > 0 there exists a unit vector x_ε ∈ X such that ‖(λI − T)x_ε‖ < ε.

Proof. It is clear that (c) implies (b). If (b) holds true, then there is no constant α > 0 such that α = α‖x_n‖ ≤ ‖(λI − T)x_n‖ for all n, and hence (λI − T) is not bounded below. Hence (b) implies (a). If (λI − T) is not bounded below, then there is no constant α > 0 such that α‖x‖ ≤ ‖(λI − T)x‖ for all x ∈ X or, equivalently, for every ε > 0 there exists a nonzero y_ε in X such that ‖(λI − T)y_ε‖ < ε‖y_ε‖. By setting x_ε = ‖y_ε‖⁻¹ y_ε it follows that (a) implies (c). □
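The gap between eigenvalues and approximate eigenvalues can be seen numerically. The sketch below uses a finite section of a diagonal operator with the (hypothetical, purely illustrative) diagonal coefficients 1/k: the point λ = 0 is not an eigenvalue, yet unit vectors make the residual ‖(λI − T)x‖ as small as desired.

```python
import numpy as np

# Illustration of Proposition 6.15 on a finite section of the diagonal
# operator T e_k = (1/k) e_k on l^2 (the sequence {1/k} is a hypothetical
# choice). lambda = 0 is not an eigenvalue (Tx = 0 forces x = 0), but the
# unit basis vectors e_n satisfy ||(0 I - T) e_n|| = 1/n -> 0, so 0 is an
# approximate eigenvalue: it lies in the approximate point spectrum.
n = 1000
T = np.diag(1.0 / np.arange(1, n + 1))

e_n = np.zeros(n)
e_n[-1] = 1.0                              # the unit vector e_n for n = 1000
residual = np.linalg.norm((0.0 * np.eye(n) - T) @ e_n)
print(residual)                            # 0.001 = 1/n: small residual
```

Taking larger and larger n drives the residual to 0, which is exactly condition (c) above for λ = 0.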
Proposition 6.16. The approximate point spectrum is nonempty, closed in C, and includes the boundary ∂σ(T) of the spectrum.

Proof. Take an arbitrary λ ∈ ∂σ(T). Recall that ρ(T) ≠ ∅ (Corollary 6.12) and ∂σ(T) = ∂ρ(T) ⊆ ρ(T)⁻ (Problem 3.41). Hence there exists a sequence {λ_n} in ρ(T) such that λ_n → λ (Proposition 3.32). Since

  (λ_nI − T) − (λI − T) = (λ_n − λ)I

for every n, it follows that (λ_nI − T) → (λI − T) in B[X]; that is, {(λ_nI − T)} is a sequence in G[X] that converges in B[X] to (λI − T) ∈ B[X]\G[X] (each λ_n lies in ρ(T), and λ ∈ ∂σ(T) ⊆ σ(T) because σ(T) is closed). If sup_n ‖(λ_nI − T)⁻¹‖ < ∞, then (λI − T) ∈ G[X] (cf. hint to Problem 4.48(c)), which is a contradiction. Therefore,

  sup_n ‖(λ_nI − T)⁻¹‖ = ∞.

For each integer n take y_n in X with ‖y_n‖ = 1 such that

  ‖(λ_nI − T)⁻¹‖ − 1/n ≤ ‖(λ_nI − T)⁻¹ y_n‖ ≤ ‖(λ_nI − T)⁻¹‖.

Then sup_n ‖(λ_nI − T)⁻¹ y_n‖ = ∞, and hence inf_n ‖(λ_nI − T)⁻¹ y_n‖⁻¹ = 0, so that there exist subsequences of {λ_n} and {y_n}, say {λ_k} and {y_k}, for which

  ‖(λ_kI − T)⁻¹ y_k‖⁻¹ → 0.

Set x_k = ‖(λ_kI − T)⁻¹ y_k‖⁻¹ (λ_kI − T)⁻¹ y_k and get a sequence {x_k} of unit vectors in X such that ‖(λ_kI − T)x_k‖ = ‖(λ_kI − T)⁻¹ y_k‖⁻¹. Since

  ‖(λI − T)x_k‖ = ‖(λ_kI − T)x_k − (λ_k − λ)x_k‖ ≤ ‖(λ_kI − T)⁻¹ y_k‖⁻¹ + |λ_k − λ|

and λ_k → λ, it follows that ‖(λI − T)x_k‖ → 0. Hence λ ∈ σ_AP(T) according to Proposition 6.15. Therefore,

  ∂σ(T) ⊆ σ_AP(T).

This inclusion clearly implies that σ_AP(T) ≠ ∅ (for σ(T) is closed and nonempty, so that ∂σ(T) ≠ ∅). Moreover, as σ_AP(T) ⊆ σ(T), we also get ∂σ_AP(T) ⊆ ∂σ(T) ⊆ σ_AP(T), so that σ_AP(T) is closed in C (see Problem 3.41). □
Remark: σ_R1(T) is open in C. Proof: σ_AP(T) is closed in C and includes ∂σ(T). Then C\σ_R1(T) = ρ(T) ∪ σ_AP(T) = ρ(T) ∪ ∂σ(T) ∪ σ_AP(T) = ρ(T)⁻ ∪ σ_AP(T) (recall that ∂σ(T) = ∂ρ(T)), which is closed in C.

For the next proposition we assume that T ∈ B[ℋ], where ℋ ≠ {0} is a complex Hilbert space. If Λ is any subset of C, then set

  Λ* = {λ̄ ∈ C: λ ∈ Λ},

so that Λ** = Λ, (C\Λ)* = C\Λ*, and (Λ₁ ∪ Λ₂)* = Λ₁* ∪ Λ₂*.

Proposition 6.17. If T* ∈ B[ℋ] is the adjoint of T ∈ B[ℋ], then
  ρ(T) = ρ(T*)*,   σ(T) = σ(T*)*,   σ_C(T) = σ_C(T*)*,

and the residual spectrum of T is given by the formula

  σ_R(T) = σ_P(T*)* \ σ_P(T).

As for the subparts of the point and residual spectrum,

  σ_P1(T) = σ_R1(T*)*,   σ_P2(T) = σ_R2(T*)*,   σ_P3(T) = σ_P3(T*)*,   σ_P4(T) = σ_P4(T*)*.

For the compression and approximate point spectrum we get

  σ_CP(T) = σ_P(T*)*,
  ∂σ(T) ⊆ σ_AP(T) ∩ σ_AP(T*)* = σ(T) \ (σ_P1(T) ∪ σ_R1(T)).

Proof. Since S ∈ G[ℋ] if and only if S* ∈ G[ℋ], it follows that ρ(T) = ρ(T*)*. Hence σ(T)* = (C\ρ(T))* = C\ρ(T*) = σ(T*). Recall that R(S)⁻ = R(S) if and only if R(S*)⁻ = R(S*), and N(S) = {0} if and only if R(S*)⁻ = ℋ (cf. Proposition 5.77 and Problem 5.35). Thus σ_P1(T) = σ_R1(T*)*, σ_P2(T) = σ_R2(T*)*, σ_P3(T) = σ_P3(T*)*, and also σ_P4(T) = σ_P4(T*)*. Applying the same argument, σ_C(T) = σ_C(T*)* and σ_CP(T) = σ_P(T*)*. Therefore

  σ_R(T) = σ_CP(T) \ σ_P(T)   implies   σ_R(T) = σ_P(T*)* \ σ_P(T).

Moreover, by using the above properties and the definition of σ_AP,

  σ_AP(T*) = σ_P(T*) ∪ σ_C(T*) ∪ σ_R2(T*) = σ_CP(T)* ∪ σ_C(T)* ∪ σ_P2(T)*,

we get

  σ_AP(T*)* = σ_CP(T) ∪ σ_C(T) ∪ σ_P2(T).

So σ_AP(T*)* ∩ σ_AP(T) = σ(T) \ (σ_P1(T) ∪ σ_R1(T)). But σ(T) is closed and σ_R1(T) is open (and so is σ_P1(T) = σ_R1(T*)*) in C. This implies that σ_P1(T) ∪ σ_R1(T) ⊆ σ(T)° and ∂σ(T) ⊆ σ(T) \ (σ_P1(T) ∪ σ_R1(T)) (cf. Problem 3.41(b,d)). □

Remark: We have just seen that σ_P1(T) is open in C.
Corollary 6.18. Let ℋ ≠ {0} be a complex Hilbert space, and let Γ denote the unit circle about the origin of the complex plane.

(a) If H ∈ B[ℋ] is hyponormal, then σ_P(H)* ⊆ σ_P(H*) and σ_R(H*) = ∅.
(b) If N ∈ B[ℋ] is normal, then σ_P(N*) = σ_P(N)* and σ_R(N) = ∅.
(c) If U ∈ B[ℋ] is unitary, then σ(U) ⊆ Γ.
(d) If A ∈ B[ℋ] is self-adjoint, then σ(A) ⊆ R.
(e) If Q ∈ B[ℋ] is nonnegative, then σ(Q) ⊆ [0, ∞).
(f) If R ∈ B[ℋ] is strictly positive, then σ(R) ⊆ [α, ∞) for some α > 0.
(g) If P ∈ B[ℋ] is a nontrivial projection, then σ(P) = σ_P(P) = {0, 1}.
(h) If J ∈ B[ℋ] is a nontrivial involution, then σ(J) = σ_P(J) = {−1, 1}.

Proof. Take any T ∈ B[ℋ] and any λ ∈ C. It is readily verified that

  (λI − T)*(λI − T) − (λI − T)(λI − T)* = T*T − TT*.
Hence (λI − T) is hyponormal if and only if T is hyponormal. If H is hyponormal, then (λI − H) is hyponormal, and so (cf. Proposition 6.6)

  ‖(λ̄I − H*)x‖ ≤ ‖(λI − H)x‖ for every x ∈ ℋ and every λ ∈ C.

Suppose λ ∈ σ_P(H). Then N(λI − H) ≠ {0}, and the above inequality ensures that N(λ̄I − H*) ≠ {0}. Thus λ̄ ∈ σ_P(H*). Conclusion: if H is hyponormal, then σ_P(H)* ⊆ σ_P(H*), and therefore (Proposition 6.17) σ_R(H*) = σ_P(H)*\σ_P(H*) = ∅. This proves (a). Since N is normal if and only if both N and N* are hyponormal, this also proves (b).

Now let U be unitary (i.e., a normal isometry). If |λ| > 1, then λ ∈ ρ(U) because ‖U‖ = 1 (cf. Problem 4.47). If |λ| < 1, then ‖λU*‖ < 1 because ‖U*‖ = ‖U‖ = 1, and hence (I − λU*) ∈ G[ℋ] (cf. Problem 4.48). Since U* = U⁻¹ ∈ G[ℋ], this implies by Corollary 4.23 that (λI − U) = −U(I − λU*) ∈ G[ℋ], and therefore λ ∈ ρ(U). Outcome: if λ ∈ σ(U), then |λ| = 1, which proves (c).

Next let A be self-adjoint (i.e., A* = A). For every α, β ∈ R and every x ∈ ℋ,

  (iβx ; (αI − A)x) = −((αI − A)x ; iβx),

so that 2 Re(iβx ; (αI − A)x) = 0. Hence, writing λ = Re λ + i Im λ,

  ‖(λI − A)x‖² = ‖i(Im λ)x + ((Re λ)I − A)x‖² = |Im λ|²‖x‖² + ‖((Re λ)I − A)x‖² ≥ |Im λ|²‖x‖²

for every x ∈ ℋ and every λ ∈ C. If λ is not real, then (λI − A) is bounded below, which means that λ ∈ ρ(A) ∪ σ_R(A) = ρ(A), because σ_R(A) = ∅ according to (b). This shows that (d) holds true.

If Q is nonnegative (i.e., (Qx ; x) ≥ 0 for every x ∈ ℋ) and λ ∈ σ(Q), then λ ∈ R by item (d) and

  ‖(λI − Q)x‖² = |λ|²‖x‖² − 2λ(Qx ; x) + ‖Qx‖²

for every x ∈ ℋ. If λ < 0, then ‖(λI − Q)x‖² ≥ |λ|²‖x‖² for every x ∈ ℋ, and hence (λI − Q) is bounded below. Using the same argument of the previous item we get the result in (e).

If R is strictly positive, then R is nonnegative and lies in G[ℋ], so that σ(R) ⊆ [0, ∞) by item (e) and 0 ∈ ρ(R). Since ρ(R) is open, σ(R) must be bounded away from zero, which proves (f).

Now let P be a projection, so that I − P is a projection as well, R(I − P) = N(P), and R(P) = N(I − P) (see Section 2.9). If 0 ∉ σ_P(P), then N(P) = {0}, so that R(I − P) = {0}, and hence P = I. Similarly, if 1 ∉ σ_P(P), then N(I − P) = {0}, so that R(P) = {0}, and hence P = 0. Therefore, if P is a nontrivial projection (i.e., 0 ≠ P = P² ≠ I), then 0, 1 ∈ σ_P(P). Moreover, if λ is any complex number such that 0 ≠ λ ≠ 1, then

  (λI − P)(λ(λ − 1))⁻¹((λ − 1)I + P) = I,

which means that (λI − P) ∈ G[ℋ] (Theorem 4.22), and so λ ∈ ρ(P). This concludes the proof of (g).

Finally, let J be an involution; that is, J² = I. In this case,

  (I − J)(−I − J) = 0 = (−I − J)(I − J),

so that R(−I − J) ⊆ N(I − J) and R(I − J) ⊆ N(−I − J). If 1 ∉ σ_P(J) or −1 ∉ σ_P(J), then N(I − J) = {0} or N(−I − J) = {0}, which implies R(I + J) = {0} or R(I − J) = {0}, and hence J = −I or J = I. Thus, if J is a nontrivial involution (i.e., J ≠ ±I), then ±1 ∈ σ_P(J). Moreover, if λ in C is such that λ² ≠ 1 (i.e., λ ≠ ±1), then λ ∈ ρ(J). Indeed,

  (λI − J)(λ² − 1)⁻¹(λI + J) = I,

so that (λI − J) ∈ G[ℋ], which concludes the proof of (h). □
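Items (g) and (h) are easy to check numerically in finite dimensions. The matrices below are hypothetical illustrative choices of a nontrivial projection and a nontrivial involution.

```python
import numpy as np

# Corollary 6.18(g),(h) in finite dimensions: a nontrivial projection
# (P = P^2, P != 0, P != I) has spectrum {0, 1}; a nontrivial involution
# (J^2 = I, J != +-I) has spectrum {-1, 1}. Both matrices are hypothetical
# examples.
P = np.diag([1.0, 1.0, 0.0])               # projection onto the first two coordinates
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])                 # coordinate flip; J @ J = I

spec_P = set(np.round(np.linalg.eigvals(P).real, 10))
spec_J = set(np.round(np.linalg.eigvals(J).real, 10))
print(spec_P, spec_J)                      # {0.0, 1.0} and {-1.0, 1.0}
```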
6.3 Spectral Radius
We open this section with the Spectral Mapping Theorem for polynomials. Let us just mention that there are versions of it that hold for functions other than polynomials. If Λ is any subset of C, and p: C → C is any polynomial with complex coefficients, then set

  p(Λ) = {p(λ) ∈ C: λ ∈ Λ}.

Theorem 6.19. (The Spectral Mapping Theorem). If T ∈ B[X], where X is a complex Banach space, then

  σ(p(T)) = p(σ(T))

for every polynomial p with complex coefficients.
Proof. If p is a constant polynomial (i.e., if p(T) = αI for some α ∈ C), then the result is trivially verified (and has nothing to do with T): σ(αI) = {α} = p(σ(T)), since ρ(αI) = C\{α} for every α ∈ C. Let p: C → C be an arbitrary nonconstant polynomial with complex coefficients,

  p(z) = Σ_{i=0}^{n} α_i z^i,   with n ≥ 1 and α_n ≠ 0,

for every z ∈ C. Take an arbitrary μ ∈ C and consider the factorization

  μ − p(z) = −α_n ∏_{i=1}^{n} (z − z_i),

where {z_i}_{i=1}^{n} are the roots of μ − p(z) (counted according to multiplicity), so that

  μI − p(T) = −α_n ∏_{i=1}^{n} (T − z_iI).
If μ ∈ σ(p(T)), then z_j ∈ σ(T) for some j = 1, …, n. Indeed, if z_i ∈ ρ(T) for every i = 1, …, n, then −α_n ∏_{i=1}^{n} (T − z_iI) ∈ G[X], so that μ ∈ ρ(p(T)). However,

  μ − p(z_j) = −α_n ∏_{i=1}^{n} (z_j − z_i) = 0,

and so p(z_j) = μ. Hence μ = p(z_j) ∈ {p(λ) ∈ C: λ ∈ σ(T)} = p(σ(T)) because z_j ∈ σ(T). Therefore,

  σ(p(T)) ⊆ p(σ(T)).

Conversely, if μ ∈ p(σ(T)) = {p(λ) ∈ C: λ ∈ σ(T)}, then μ = p(λ) for some λ ∈ σ(T). Thus μ − p(λ) = 0, so that λ = z_j for some j = 1, …, n, and hence

  μI − p(T) = −α_n ∏_{i=1}^{n} (T − z_iI) = (T − z_jI)(−α_n) ∏_{j≠i=1}^{n} (T − z_iI) = (−α_n) ∏_{j≠i=1}^{n} (T − z_iI)(T − z_jI),
for (T − z_jI) commutes with (T − z_iI) for every i. If μ ∈ ρ(p(T)), then (μI − p(T)) ∈ G[X], so that

  (T − z_jI) ((−α_n) ∏_{j≠i=1}^{n} (T − z_iI)(μI − p(T))⁻¹) = (μI − p(T))(μI − p(T))⁻¹ = I

and

  ((μI − p(T))⁻¹ (−α_n) ∏_{j≠i=1}^{n} (T − z_iI)) (T − z_jI) = (μI − p(T))⁻¹(μI − p(T)) = I.

This means that (T − z_jI) has a right and a left inverse, and so it is injective and surjective (cf. Problems 1.5 and 1.6). The Inverse Mapping Theorem (Theorem 4.22) ensures that (z_jI − T) ∈ G[X], and therefore λ = z_j ∈ ρ(T). But this contradicts the fact that λ ∈ σ(T). Conclusion: μ ∉ ρ(p(T)); that is, μ ∈ σ(p(T)). Hence
  p(σ(T)) ⊆ σ(p(T)). □

In particular, μ ∈ σ(T)ⁿ = {λⁿ ∈ C: λ ∈ σ(T)} if and only if μ ∈ σ(Tⁿ):

  σ(Tⁿ) = σ(T)ⁿ   for every   n ≥ 0;

and μ ∈ ασ(T) = {αλ ∈ C: λ ∈ σ(T)} if and only if μ ∈ σ(αT):

  σ(αT) = ασ(T)   for every   α ∈ C.
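In finite dimensions the Spectral Mapping Theorem can be verified directly: the eigenvalues of p(T) are exactly the values of p at the eigenvalues of T. A sketch with a hypothetical random matrix and the polynomial p(z) = z² − 3z + 2:

```python
import numpy as np

# Finite-dimensional check of Theorem 6.19 for p(z) = z^2 - 3z + 2 on a
# hypothetical random complex matrix: sigma(p(T)) = p(sigma(T)) as sets.
rng = np.random.default_rng(0)
T = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

spec_T = np.linalg.eigvals(T)
spec_pT = np.linalg.eigvals(T @ T - 3 * T + 2 * np.eye(5))
p_spec_T = spec_T**2 - 3 * spec_T + 2

# every point of sigma(p(T)) is numerically a point of p(sigma(T)), and conversely
match = all(np.min(np.abs(p_spec_T - mu)) < 1e-8 for mu in spec_pT) \
    and all(np.min(np.abs(spec_pT - mu)) < 1e-8 for mu in p_spec_T)
print(match)                               # True
```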
It is also worth noticing (even though this is not a particular case of the Spectral Mapping Theorem for polynomials) that if T ∈ G[X], then

  σ(T⁻¹) = σ(T)⁻¹.

That is, μ ∈ σ(T)⁻¹ = {λ⁻¹ ∈ C: 0 ≠ λ ∈ σ(T)} if and only if μ ∈ σ(T⁻¹). Indeed, if T ∈ G[X] (so that 0 ∈ ρ(T)) and μ ≠ 0, then −μT⁻¹(μ⁻¹I − T) = μI − T⁻¹, and hence μ⁻¹ ∈ ρ(T) if and only if μ ∈ ρ(T⁻¹). Also note that if T ∈ B[ℋ], where ℋ is a complex Hilbert space, then

  σ(T*) = σ(T)*

(cf. Proposition 6.17).
Let T ∈ B[X] be an operator on a complex Banach space X. The spectral radius of T is the number

  r_σ(T) = sup_{λ∈σ(T)} |λ| = max_{λ∈σ(T)} |λ|.

The first identity defines the spectral radius r_σ(T); the second one is a consequence of Theorem 3.86. (Reason: σ(T) ≠ ∅ is compact in C and the function |·|: C → R is continuous.)
Corollary 6.20. r_σ(Tⁿ) = r_σ(T)ⁿ for every n ≥ 0.

Proof. Take an arbitrary integer n ≥ 0. Since σ(Tⁿ) = σ(T)ⁿ, it follows that μ ∈ σ(Tⁿ) if and only if μ = λⁿ for some λ ∈ σ(T). Hence

  sup_{μ∈σ(Tⁿ)} |μ| = sup_{λ∈σ(T)} |λⁿ| = sup_{λ∈σ(T)} |λ|ⁿ = (sup_{λ∈σ(T)} |λ|)ⁿ. □
Remarks: Recall that λ ∈ σ(T) only if |λ| ≤ ‖T‖ (cf. proof of Corollary 6.12), and so r_σ(T) ≤ ‖T‖. Therefore, according to Corollary 6.20,

  r_σ(T)ⁿ = r_σ(Tⁿ) ≤ ‖Tⁿ‖ ≤ ‖T‖ⁿ   for every   n ≥ 0.

Thus r_σ(T) ≤ 1 whenever T is power bounded. Indeed, if sup_n ‖Tⁿ‖ < ∞, then

  r_σ(T)ⁿ = r_σ(Tⁿ) ≤ ‖Tⁿ‖ ≤ sup_k ‖T^k‖   and   lim_n (sup_k ‖T^k‖)^{1/n} = 1,

so that

  sup_n ‖Tⁿ‖ < ∞   implies   r_σ(T) ≤ 1.

Also note that the spectral radius of a nonzero operator may be null. Sample: r_σ(T) = 0 for every nilpotent operator T (i.e., whenever Tⁿ = 0 for some positive integer n). An operator T ∈ B[X] is quasinilpotent if r_σ(T) = 0, so that every nilpotent operator is quasinilpotent. Observe that σ(T) = σ_P(T) = {0} if T is nilpotent. Indeed, if T^{n−1} ≠ 0 and Tⁿ = 0, then T(T^{n−1}x) = 0 for every x ∈ X, so that {0} ≠ R(T^{n−1}) ⊆ N(T), and hence λ = 0 is an eigenvalue of T. Since σ_P(T) may be empty for a quasinilpotent operator T (as we shall see in Examples 6F and 6G of Section 6.5), it follows that the inclusion below is proper:

  Nilpotent ⊂ Quasinilpotent.
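The nilpotent case also illustrates how far apart the spectral radius and the norm can be. The matrix below is a hypothetical example, with a large entry chosen to exaggerate the gap.

```python
import numpy as np

# A nilpotent (hence quasinilpotent) example: this strictly upper
# triangular matrix satisfies T^3 = 0, so sigma(T) = sigma_P(T) = {0} and
# r(T) = 0, while the operator norm ||T|| = 100 is large. The entry 100 is
# a hypothetical choice stressing that r(T) = 0 says nothing about ||T||.
T = np.array([[0.0, 100.0, 0.0],
              [0.0, 0.0, 100.0],
              [0.0, 0.0, 0.0]])

T3 = np.linalg.matrix_power(T, 3)
spectral_radius = np.max(np.abs(np.linalg.eigvals(T)))
norm = np.linalg.norm(T, 2)
print(np.allclose(T3, 0), spectral_radius, norm)   # T^3 = 0, r(T) = 0, ||T|| = 100
```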
The next proposition is the so-called Gelfand-Beurling formula for the spectral radius. The proof of it requires another piece of elementary complex analysis, namely, every analytic function has a power series representation. Precisely, if f: Λ → C is analytic, and if the annulus B_{α,β}(μ) = {λ ∈ C: 0 ≤ α < |λ − μ| < β} lies in the open set Λ ⊆ C, then f has a unique Laurent expansion about the point μ, viz.,

  f(λ) = Σ_{k=−∞}^{∞} γ_k (λ − μ)^k   for every λ ∈ B_{α,β}(μ).

Proposition 6.21. r_σ(T) = lim_n ‖Tⁿ‖^{1/n}.
Proof. Since r_σ(T)ⁿ ≤ ‖Tⁿ‖ for every positive integer n,

  r_σ(T) ≤ lim_n ‖Tⁿ‖^{1/n}.

(Reason: the limit of the sequence {‖Tⁿ‖^{1/n}} exists for every T ∈ B[X], according to Lemma 6.8.) Now recall the von Neumann expansion for the resolvent function R: ρ(T) → B[X]:

  R(λ) = (λI − T)⁻¹ = λ⁻¹ Σ_{i=0}^{∞} Tⁱλ⁻ⁱ

for every λ ∈ ρ(T) such that |λ| > ‖T‖, where the above series converges in the (uniform) topology of B[X] (cf. Problem 4.47). Take an arbitrary bounded linear functional φ: B[X] → C in B[X]*. Since φ is continuous,

  φ(R(λ)) = λ⁻¹ Σ_{i=0}^{∞} φ(Tⁱ)λ⁻ⁱ

for every λ ∈ ρ(T) such that |λ| > ‖T‖.

Claim. The displayed identity holds whenever |λ| > r_σ(T).

Proof. λ⁻¹ Σ_{i=0}^{∞} φ(Tⁱ)λ⁻ⁱ is a Laurent expansion of φ(R(λ)) about the origin for every λ ∈ ρ(T) such that |λ| > ‖T‖. But φ∘R is analytic on ρ(T) (cf. Claim 2 in the proof of Proposition 6.13), so that φ(R(λ)) has a unique Laurent expansion about the origin for every λ ∈ C such that |λ| > r_σ(T). Then φ(R(λ)) = λ⁻¹ Σ_{i=0}^{∞} φ(Tⁱ)λ⁻ⁱ, which holds for every |λ| > ‖T‖ ≥ r_σ(T), must be the Laurent expansion about the origin for every λ ∈ C such that |λ| > r_σ(T). □

Therefore, if |λ| > r_σ(T), then φ((λ⁻¹T)ⁱ) = φ(Tⁱ)λ⁻ⁱ → 0 (cf. Problem 4.7(c)) for every φ ∈ B[X]*. But this implies that {(λ⁻¹T)ⁱ} is bounded in the (uniform) topology of B[X] (cf. Problem 4.67(d)). That is, λ⁻¹T is power bounded. Hence |λ|⁻ⁿ‖Tⁿ‖ ≤ sup_i ‖(λ⁻¹T)ⁱ‖ < ∞, so that

  |λ|⁻¹ ‖Tⁿ‖^{1/n} ≤ (sup_i ‖(λ⁻¹T)ⁱ‖)^{1/n}

for every positive integer n whenever |λ| > r_σ(T). Then |λ|⁻¹ lim_n ‖Tⁿ‖^{1/n} ≤ 1, so that lim_n ‖Tⁿ‖^{1/n} ≤ |λ| whenever |λ| > r_σ(T). In other words, lim_n ‖Tⁿ‖^{1/n} ≤ r_σ(T) + ε for every ε > 0. Outcome:

  lim_n ‖Tⁿ‖^{1/n} ≤ r_σ(T). □
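The Gelfand-Beurling formula is easy to watch numerically. The sketch below uses a hypothetical non-normal 2×2 matrix whose norm (about 10) is far above its spectral radius (0.5); the root-norms ‖Tⁿ‖^{1/n} nonetheless settle down to 0.5.

```python
import numpy as np

# Numerical illustration of Proposition 6.21 on a hypothetical non-normal
# matrix: ||T^n||^(1/n) approaches r(T) = 0.5 from above, even though ||T||
# itself is about 10.
T = np.array([[0.5, 10.0],
              [0.0, 0.5]])                 # sigma(T) = {0.5}

r = np.max(np.abs(np.linalg.eigvals(T)))
estimates = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) ** (1.0 / n)
             for n in (1, 10, 100, 1000)]
print(r, estimates)                        # estimates decrease toward 0.5
```

The slow convergence here is typical of highly non-normal matrices: the transient growth of ‖Tⁿ‖ delays the asymptotic decay rate dictated by the spectrum.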
Observe the following immediate consequences of Proposition 6.21:

  r_σ(αT) = |α| r_σ(T)   for every   α ∈ C

and, if ℋ is a complex Hilbert space and T ∈ B[ℋ], then

  r_σ(T*) = r_σ(T).

An important application of the Gelfand-Beurling formula reads as follows: T is uniformly stable (i.e., ‖Tⁿ‖ → 0) if and only if r_σ(T) < 1. In fact, there exists in the current literature a large collection of equivalent conditions for uniform stability. We shall consider below just a few of them.
Proposition 6.22. Let T ∈ B[X] be an operator on a complex Banach space X. The following assertions are pairwise equivalent.

(a) Tⁿ → 0 (i.e., ‖Tⁿ‖ → 0).
(b) r_σ(T) < 1.
(c) ‖Tⁿ‖ ≤ βαⁿ for every n ≥ 0, for some β ≥ 1 and some α ∈ (0, 1).
(d) Σ_{n=0}^{∞} ‖Tⁿ‖^p < ∞ for an arbitrary p > 0.
(e) Σ_{n=0}^{∞} ‖Tⁿx‖^p < ∞ for all x ∈ X, for an arbitrary p > 0.

Proof. Since r_σ(T)ⁿ = r_σ(Tⁿ) ≤ ‖Tⁿ‖ for every n ≥ 0, it follows that (a) implies (b). Suppose r_σ(T) < 1 and take any α ∈ (r_σ(T), 1). The Gelfand-Beurling formula says that lim_n ‖Tⁿ‖^{1/n} = r_σ(T). Therefore, there exists an integer n_α ≥ 1 such that ‖Tⁿ‖ ≤ αⁿ for every n ≥ n_α, and hence (b) implies (c) with β = max_{0≤n≤n_α} ‖Tⁿ‖α⁻ⁿ. Since 0 < α < 1, (c) implies (d). Since ‖Tⁿx‖ ≤ ‖Tⁿ‖‖x‖ for every x ∈ X, (d) implies (e). Finally, suppose (e) holds true. Then ‖Tⁿx‖ → 0 for every x ∈ X, and so sup_n ‖Tⁿ‖ < ∞ by the Banach-Steinhaus Theorem (Theorem 4.43). Moreover, for m ≥ 1 and p > 0 arbitrary,

  ‖m^{1/p} T^m x‖^p = Σ_{n=0}^{m−1} ‖T^{m−n} Tⁿ x‖^p ≤ (sup_n ‖Tⁿ‖^p) Σ_{n=0}^{m−1} ‖Tⁿx‖^p ≤ (sup_n ‖Tⁿ‖^p) Σ_{n=0}^{∞} ‖Tⁿx‖^p < ∞.

Thus sup_m ‖m^{1/p} T^m x‖ < ∞ for every x ∈ X whenever (e) holds true. Since m^{1/p} T^m lies in B[X] for each m ≥ 1, it follows that sup_m ‖m^{1/p} T^m‖ < ∞ by using the Banach-Steinhaus Theorem again. Hence

  0 ≤ ‖T^m‖ ≤ m^{−1/p} sup_k ‖k^{1/p} T^k‖

for every m ≥ 1, so that ‖T^m‖ → 0 as m → ∞. Therefore, (e) implies (a). □
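Uniform stability depends on r(T) < 1, not on ‖T‖ < 1. A hypothetical matrix with spectral radius 0.9 but norm above 5 shows the typical behavior: the norms of the powers first grow (a transient hump), then decay to zero.

```python
import numpy as np

# Proposition 6.22 in action on a hypothetical matrix with r(T) = 0.9 < 1
# but ||T|| > 1: the powers still satisfy ||T^n|| -> 0 (uniform stability),
# after an initial transient hump caused by the large off-diagonal entry.
T = np.array([[0.9, 5.0],
              [0.0, 0.9]])

norms = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) for n in range(1, 201)]
print(norms[0], max(norms), norms[-1])     # ||T|| > 1, then a hump, then decay
```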
The next result extends the von Neumann expansion of Problem 4.47.

Corollary 6.23. Let X be a complex Banach space. Take any operator T in B[X] and any nonzero complex number λ.

(a) r_σ(T) < |λ| if and only if {Σ_{i=0}^{n} (λ⁻¹T)ⁱ} converges uniformly. In this case, λ lies in ρ(T),

  (λI − T)⁻¹ = λ⁻¹ Σ_{i=0}^{∞} (λ⁻¹T)ⁱ,

where Σ_{i=0}^{∞} (λ⁻¹T)ⁱ denotes the uniform limit of {Σ_{i=0}^{n} (λ⁻¹T)ⁱ}, and ‖(λI − T)⁻¹‖ ≤ (|λ| − ‖T‖)⁻¹ whenever |λ| > ‖T‖.

(b) If r_σ(T) = |λ| and {Σ_{i=0}^{n} (λ⁻¹T)ⁱ} converges strongly, then λ lies in ρ(T) and (λI − T)⁻¹ = λ⁻¹ Σ_{i=0}^{∞} (λ⁻¹T)ⁱ, where Σ_{i=0}^{∞} (λ⁻¹T)ⁱ denotes the strong limit of {Σ_{i=0}^{n} (λ⁻¹T)ⁱ}.

(c) If |λ| < r_σ(T), then {Σ_{i=0}^{n} (λ⁻¹T)ⁱ} does not converge strongly.

Proof. If {Σ_{i=0}^{n} (λ⁻¹T)ⁱ} converges uniformly, then (λ⁻¹T)ⁿ → 0 (cf. Problem 4.7), and hence |λ|⁻¹ r_σ(T) = r_σ(λ⁻¹T) < 1 by Proposition 6.22. Conversely, if r_σ(T) < |λ|, then λ ∈ ρ(T), so that (λI − T) ∈ G[X], and r_σ(λ⁻¹T) = |λ|⁻¹ r_σ(T) < 1. Hence {(λ⁻¹T)ⁿ} is an absolutely summable sequence in B[X] by Proposition 6.22. Now follow the steps of Problem 4.47 to conclude all the properties of item (a). If {Σ_{i=0}^{n} (λ⁻¹T)ⁱ} converges strongly, then (λ⁻¹T)ⁿx → 0 in X for every x ∈ X (cf. Problem 4.7 again), so that sup_n ‖(λ⁻¹T)ⁿx‖ < ∞ for every x ∈ X. Then sup_n ‖(λ⁻¹T)ⁿ‖ < ∞ by the Banach-Steinhaus Theorem (i.e., λ⁻¹T is power bounded), and hence |λ|⁻¹ r_σ(T) = r_σ(λ⁻¹T) ≤ 1. This proves assertion (c). Moreover,

  (λI − T) λ⁻¹ Σ_{i=0}^{n} (λ⁻¹T)ⁱ = I − (λ⁻¹T)^{n+1} → I strongly.

Therefore, (λI − T)⁻¹ = λ⁻¹ Σ_{i=0}^{∞} (λ⁻¹T)ⁱ, where Σ_{i=0}^{∞} (λ⁻¹T)ⁱ ∈ B[X] is the strong limit of {Σ_{i=0}^{n} (λ⁻¹T)ⁱ}, which concludes the proof of (b). □
6.4 Numerical Radius
What Proposition 6.21 says is that r_σ(T) = r(T), where r(T) is the limit of the numerical sequence {‖Tⁿ‖^{1/n}} (whose existence was proved in Lemma 6.8). We shall then adopt one and the same notation (the simplest, of course) for both of them: the limit of {‖Tⁿ‖^{1/n}} and the spectral radius. Thus, from now on, we write

  r(T) = sup_{λ∈σ(T)} |λ| = max_{λ∈σ(T)} |λ| = lim_n ‖Tⁿ‖^{1/n}.
Therefore, a normaloid operator acting on a complex Banach space is precisely an operator whose norm coincides with the spectral radius. Recall that, in a complex Hilbert space ℋ, every normal operator is normaloid, and so is every nonnegative operator. Since T*T is always nonnegative, it follows that (cf. Proposition 5.65)

  r(T*T) = r(TT*) = ‖T*T‖ = ‖TT*‖ = ‖T‖² = ‖T*‖²

for every T ∈ B[ℋ]. Also note that T is normaloid if and only if there exists λ in σ(T) such that |λ| = ‖T‖. However, such a λ can never be in the residual spectrum. In fact, for every T ∈ B[ℋ],

  σ_R(T) ⊆ {λ ∈ C: |λ| < ‖T‖}.

(If λ ∈ σ_R(T) = σ_P(T*)*\σ_P(T), then there exists 0 ≠ x ∈ ℋ with T*x = λ̄x but Tx ≠ λx, and hence 0 < ‖Tx − λx‖² = ‖Tx‖² − 2Re(Tx ; λx) + |λ|²‖x‖² = ‖Tx‖² − |λ|²‖x‖², so that |λ|‖x‖ < ‖Tx‖ ≤ ‖T‖‖x‖, and hence |λ| < ‖T‖.)

The numerical range of an operator T acting on a complex Hilbert space ℋ ≠ {0} is the (nonempty) set

  W(T) = {λ ∈ C: λ = (Tx ; x) for some x ∈ ℋ with ‖x‖ = 1}.

It can be shown that W(T) is always convex in C and, clearly,

  W(T*) = W(T)*.
Proposition 6.24. σ_P(T) ∪ σ_R(T) ⊆ W(T) and σ(T) ⊆ W(T)⁻.

Proof. Take T ∈ B[ℋ], where ℋ ≠ {0} is a complex Hilbert space.

(a) If λ ∈ σ_P(T), then there exists a unit vector x ∈ ℋ such that Tx = λx. Hence (Tx ; x) = λ‖x‖² = λ; that is, λ ∈ W(T). If λ ∈ σ_R(T), then λ̄ ∈ σ_P(T*) (Proposition 6.17). Thus λ̄ ∈ W(T*), so that λ ∈ W(T).

(b) If λ ∈ σ_AP(T), then there exists a sequence {x_n} of unit vectors in ℋ such that ‖(λI − T)x_n‖ → 0 (Proposition 6.15). Therefore,

  0 ≤ |λ − (Tx_n ; x_n)| = |((λI − T)x_n ; x_n)| ≤ ‖(λI − T)x_n‖ → 0,

so that (Tx_n ; x_n) → λ. Since each (Tx_n ; x_n) lies in W(T), it follows by the Closed Set Theorem that λ ∈ W(T)⁻. Hence

  σ_AP(T) ⊆ W(T)⁻,

and so σ(T) = σ_R(T) ∪ σ_AP(T) ⊆ W(T)⁻ according to item (a). □
The numerical radius of T ∈ B[ℋ] is the number

  w(T) = sup_{λ∈W(T)} |λ| = sup_{‖x‖=1} |(Tx ; x)|.

It is readily verified that

  w(T*) = w(T)   and   w(T*T) = ‖T‖².

Unlike the spectral radius, the numerical radius is a norm on B[ℋ]. That is, 0 ≤ w(T) for every T ∈ B[ℋ] and 0 < w(T) whenever T ≠ 0, w(αT) = |α|w(T), and w(T + S) ≤ w(T) + w(S) for every α ∈ C and every S, T ∈ B[ℋ]. Warning: the numerical radius does not have the "operator norm property", in the sense that the inequality w(ST) ≤ w(S)w(T) is not true for all operators S, T ∈ B[ℋ]; but the power inequality holds (i.e., w(Tⁿ) ≤ w(T)ⁿ for all T ∈ B[ℋ] and every positive integer n; the proof is tricky). Nevertheless, the numerical radius is a norm equivalent to the (induced uniform) operator norm of B[ℋ], and it dominates the spectral radius, as in the following proposition.
Proposition 6.25. 0 ≤ r(T) ≤ w(T) ≤ ‖T‖ ≤ 2w(T).

Proof. Since σ(T) ⊆ W(T)⁻, we get r(T) ≤ w(T). Moreover,

  w(T) = sup_{‖x‖=1} |(Tx ; x)| ≤ sup_{‖x‖=1} ‖Tx‖ = ‖T‖.

From Problem 5.3, recalling that |(Tz ; z)| ≤ sup_{‖u‖=1} |(Tu ; u)| ‖z‖² = w(T)‖z‖² for every z ∈ ℋ (cf. proof of Proposition 5.78), and by the parallelogram law,

  |(Tx ; y)| ≤ (1/4)(|(T(x+y) ; x+y)| + |(T(x−y) ; x−y)| + |(T(x+iy) ; x+iy)| + |(T(x−iy) ; x−iy)|)
            ≤ (1/4) w(T)(‖x + y‖² + ‖x − y‖² + ‖x + iy‖² + ‖x − iy‖²)
            = w(T)(‖x‖² + ‖y‖²) ≤ 2w(T)

whenever ‖x‖ = ‖y‖ = 1. Therefore, according to Corollary 5.71,

  ‖T‖ = sup_{‖x‖=‖y‖=1} |(Tx ; y)| ≤ 2w(T). □
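The chain of inequalities in Proposition 6.25 is sharp, and the 2×2 nilpotent Jordan block shows every step: r = 0, w = 1/2, ‖T‖ = 1 = 2w. The numerical radius is only estimated below, by sampling the numerical range over random unit vectors; this is a crude sketch rather than an exact algorithm.

```python
import numpy as np

# Proposition 6.25 checked on the 2x2 nilpotent Jordan block, for which
# r(T) = 0, w(T) = 1/2 and ||T|| = 1, so ||T|| = 2 w(T) is attained. The
# numerical radius is approximated by sampling |(Tx ; x)| over random unit
# vectors (a Monte Carlo sketch, not an exact computation).
rng = np.random.default_rng(1)
T = np.array([[0.0, 1.0],
              [0.0, 0.0]])

r = np.max(np.abs(np.linalg.eigvals(T)))
norm = np.linalg.norm(T, 2)
w_est = 0.0
for _ in range(20000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    w_est = max(w_est, abs(np.vdot(x, T @ x)))   # |(Tx ; x)|, at most 1/2 here
print(r, w_est, norm)                      # 0.0, about 0.5, 1.0
```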
An operator T ∈ B[ℋ] is spectraloid if r(T) = w(T). The next result is a straightforward application of the previous proposition.

Corollary 6.26. Every normaloid operator is spectraloid.

Indeed, r(T) = ‖T‖ implies r(T) = w(T) by Proposition 6.25. However, Proposition 6.25 also ensures that r(T) = ‖T‖ implies w(T) = ‖T‖, so that w(T) = ‖T‖ is a property of every normaloid operator on ℋ. What comes out as a nice surprise is that this property can be viewed as a third definition of a normaloid operator on a complex Hilbert space.
Proposition 6.27. T ∈ B[ℋ] is normaloid if and only if w(T) = ‖T‖.

Proof. The easy half of the proof was presented above. Now suppose w(T) = ‖T‖ (and T ≠ 0; otherwise the result is trivially verified). Recall that W(T)⁻ is compact in C (for W(T) is clearly bounded). Thus max_{λ∈W(T)⁻} |λ| = sup_{λ∈W(T)⁻} |λ| = sup_{λ∈W(T)} |λ| = w(T) = ‖T‖, and hence there exists λ ∈ W(T)⁻ such that |λ| = ‖T‖. Since W(T) is always nonempty, it follows by Proposition 3.32 that there exists a sequence {λ_n} in W(T) that converges to λ. In other words, there exists a sequence {x_n} of unit vectors in ℋ (‖x_n‖ = 1 for each n) such that λ_n = (Tx_n ; x_n) → λ, where |λ| = ‖T‖ ≠ 0. If S = λ⁻¹T ∈ B[ℋ], then ‖S‖ = ‖T‖/|λ| = 1 and

  (Sx_n ; x_n) → 1.

Claim. ‖Sx_n‖ → 1 and Re(Sx_n ; x_n) → 1.

Proof. |(Sx_n ; x_n)| ≤ ‖Sx_n‖ ≤ ‖S‖ = 1 for each n. But (Sx_n ; x_n) → 1 implies that |(Sx_n ; x_n)| → 1 (and hence ‖Sx_n‖ → 1) and also that Re(Sx_n ; x_n) → 1. Both arguments follow by continuity. □

Then ‖(I − S)x_n‖² = ‖Sx_n − x_n‖² = ‖Sx_n‖² − 2Re(Sx_n ; x_n) + ‖x_n‖² → 0, so that 1 ∈ σ_AP(S) ⊆ σ(S) (cf. Proposition 6.15). Hence r(S) ≥ 1 and r(T) = r(λS) = |λ| r(S) ≥ |λ| = ‖T‖, which implies r(T) = ‖T‖ (since r(T) ≤ ‖T‖ for every operator T). □

Therefore, the class of all normaloid operators on ℋ coincides with the class of all operators T ∈ B[ℋ] for which

  ‖T‖ = sup_{‖x‖=1} |(Tx ; x)|.

This includes the normal operators and, in particular, the self-adjoint operators (see Proposition 5.78). This includes the isometries too. In fact, every isometry is quasinormal, and hence normaloid. Thus

  r(V) = w(V) = ‖V‖ = 1   whenever   V ∈ B[ℋ] is an isometry.

(The above identity can be directly verified by Propositions 6.21 and 6.25, since ‖Vⁿ‖ = 1 for every positive integer n; cf. Proposition 4.37.)

Remark: If T ∈ B[ℋ] is spectraloid and quasinilpotent, then T = 0. Proof: if w(T) = r(T) = 0, then T = 0 by Proposition 6.25. Particular cases: the unique normal (or hyponormal, or normaloid) quasinilpotent operator is the null operator.
In other words, if T ∈ B[ℋ] is normal (or hyponormal, or normaloid) and r(T) = 0 (i.e., σ(T) = {0}), then T = 0.

Corollary 6.28. If there exists λ ∈ W(T) such that |λ| = ‖T‖, then T is normaloid and λ ∈ σ_P(T). In other words, if there exists a unit vector x such that |(Tx ; x)| = ‖T‖, then r(T) = w(T) = ‖T‖ and (Tx ; x) ∈ σ_P(T).

Proof. If λ ∈ W(T) is such that |λ| = ‖T‖, then w(T) = ‖T‖ (see Proposition 6.25), so that T is normaloid by Proposition 6.27. Moreover, since λ = (Tx ; x) for some unit vector x, it follows that ‖T‖ = |λ| = |(Tx ; x)| ≤ ‖Tx‖‖x‖ ≤ ‖T‖, and hence |(Tx ; x)| = ‖Tx‖‖x‖. Then Tx = αx for some α ∈ C (cf. Problem 5.2), so that α ∈ σ_P(T). But α = α‖x‖² = (αx ; x) = (Tx ; x) = λ. □

Remark: Using the inequality ‖Tⁿ‖ ≤ ‖T‖ⁿ, which holds for every operator T, we have shown in Proposition 6.9 that T is normaloid if and only if ‖Tⁿ‖ = ‖T‖ⁿ for every n ≥ 0. Now, using the inequality w(Tⁿ) ≤ w(T)ⁿ, which also holds for every operator T, we can show that T is spectraloid if and only if w(Tⁿ) = w(T)ⁿ for every n ≥ 0. Indeed, according to Corollary 6.20 and Proposition 6.25,

  r(T)ⁿ = r(Tⁿ) ≤ w(Tⁿ) ≤ w(T)ⁿ   for every   n ≥ 0.

Hence r(T) = w(T) implies w(Tⁿ) = w(T)ⁿ. Conversely, since

  w(Tⁿ)^{1/n} ≤ ‖Tⁿ‖^{1/n} → r(T) ≤ w(T),

it follows that w(Tⁿ) = w(T)ⁿ for every n implies w(T) ≤ r(T), and hence r(T) = w(T).
6.5 Examples of Spectra

Every closed and bounded subset of the complex plane (i.e., every compact subset of C) is the spectrum of some operator.
Example 6B. Take any T ∈ B[X], where X is a finite-dimensional complex normed space. Then X and its linear manifolds are all Banach spaces (Corollaries 4.28 and 4.29). Moreover, N(λI − T) = {0} if and only if (λI − T) ∈ G[X] (cf. Problem 4.38(c)). That is, N(λI − T) = {0} if and only if λ ∈ ρ(T), and hence σ_C(T) = σ_R(T) = ∅. Furthermore, since R(λI − T) is a subspace of X for every λ ∈ C, it also follows that σ_P2(T) = σ_P3(T) = ∅ (see the diagram of Section 6.2). Finally, if N(λI − T) ≠ {0}, then R(λI − T) ≠ X whenever X is finite-dimensional (cf. Problems 2.6 and 2.17), and so σ_P1(T) = ∅. Therefore,

  σ(T) = σ_P(T) = σ_P4(T).
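The finite-dimensional situation can be seen concretely: for a matrix, the spectrum is precisely the set of eigenvalues, and λI − T is invertible everywhere else. The 3×3 matrix below is a hypothetical illustration.

```python
import numpy as np

# Example 6B numerically: for a matrix T, sigma(T) = sigma_P(T), i.e. the
# spectrum is exactly the set of eigenvalues, and (lam I - T) is invertible
# at every other point. The matrix is a hypothetical example.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

spectrum = set(np.round(np.linalg.eigvals(T).real, 10))
print(spectrum)                            # {2.0, 5.0}
for lam in (0.0, 1.0, 3.0):                # sample points off the spectrum
    assert np.linalg.det(lam * np.eye(3) - T) != 0   # (lam I - T) invertible
```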
Example 6C. Let T ∈ B[ℋ] be a diagonalizable operator on a complex (separable infinite-dimensional) Hilbert space ℋ. That is, according to Problem 5.17 there exists an orthonormal basis {e_k}_{k=1}^{∞} for ℋ and a sequence {λ_k}_{k=1}^{∞} in ℓ∞ such that, for every x ∈ ℋ,

  Tx = Σ_{k=1}^{∞} λ_k (x ; e_k) e_k.

Take an arbitrary λ ∈ C and note that (λI − T) ∈ B[ℋ] is again a diagonalizable operator. Indeed, (λI − T)x = Σ_{k=1}^{∞} (λ − λ_k)(x ; e_k)e_k for every x ∈ ℋ. Since N(λI − T) = {0} if and only if λ ≠ λ_k for every k ≥ 1 (that is, the inverse (λI − T)⁻¹: R(λI − T) → ℋ exists if and only if λ − λ_k ≠ 0 for every k ≥ 1; cf. Problem 5.17), it follows that

  σ_P(T) = {λ ∈ C: λ = λ_k for some k ≥ 1}.

Similarly, since T* ∈ B[ℋ] also is a diagonalizable operator, given by T*x = Σ_{k=1}^{∞} λ̄_k (x ; e_k)e_k for every x ∈ ℋ (e.g., Problem 5.27(c)), we get

  σ_P(T*) = {λ ∈ C: λ = λ̄_k for some k ≥ 1}.

Therefore, σ_R(T) = σ_P(T*)*\σ_P(T) = ∅. Moreover, λ lies in ρ(T) if and only if (λI − T) ∈ G[ℋ] or, equivalently, if and only if inf_k |λ − λ_k| > 0 (cf. Problem 5.17). Then

  σ(T) = σ_P(T) ∪ σ_C(T) = {λ ∈ C: inf_k |λ − λ_k| = 0},

and hence σ(T)\σ_P(T) is the set of all cluster points of the sequence {λ_k}_{k=1}^{∞} that do not belong to the set {λ_k}_{k=1}^{∞} (i.e., the accumulation points of the set {λ_k}_{k=1}^{∞} outside it):

  σ_C(T) = {λ ∈ C: inf_k |λ − λ_k| = 0 and λ ≠ λ_k for every k ≥ 1}.

Note that σ_P1(T) = σ_P2(T) = ∅ (reason: T* is a diagonalizable operator, so that σ_R(T*) = ∅; see Proposition 6.17). If λ_i ∈ σ_P(T) also is an accumulation point of σ_P(T), then it lies in σ_P3(T); otherwise (i.e., if it is an isolated point of σ_P(T)), it lies in σ_P4(T). Indeed, consider the set {λ_k}_{k=1}^{∞} without this point λ_i, and the associated diagonalizable operator T′, so that λ_i ∈ σ_C(T′), and hence R(λ_iI − T′) is not closed, which means that R(λ_iI − T) is not closed. If {λ_k} is a constant sequence, say λ_k = μ for all k, then T = μI is a scalar operator and, in this case,

  σ(μI) = σ_P(μI) = σ_P4(μI) = {μ}.

Now recall that C (equipped with its usual metric) is a separable metric space (Example 3P), so that it includes a countable dense subset, and so does every compact subset E of C. Let A be any countable dense subset of E, and let {λ_k}_{k=1}^{∞} be an enumeration of it (if E is finite, say #E = n, then set λ_k = λ_n for all k > n). Observe that sup_k |λ_k| < ∞ because E is bounded. Thus consider a diagonalizable operator T ∈ B[ℋ] such that Tx = Σ_{k=1}^{∞} λ_k (x ; e_k)e_k for every x ∈ ℋ. As we have just seen,

  σ(T) = A⁻ = E;

that is, σ(T) is the set of all points of adherence of A = {λ_k}_{k=1}^{∞}, which means the closure of A. This confirms the statement that introduced this section. Precisely, every closed and bounded subset of the complex plane is the spectrum of some diagonalizable operator on ℋ.

Example 6D. Let 𝔻 and Γ denote the open unit disc and the unit circle in the complex
plane centered at the origin, respectively. In this example we shall characterize each part of the spectrum of a unilateral shift of arbitrary multiplicity. Let S₊ be a unilateral shift acting on a (complex) Hilbert space H, and let {H_k}_{k=0}^∞ be the underlying sequence of orthogonal subspaces of H = ⊕_{k=0}^∞ H_k (Problem 5.29). Recall that

S₊x = 0 ⊕ ⊕_{k=1}^∞ U_k x_{k−1}   and   S₊*x = ⊕_{k=0}^∞ U_{k+1}* x_{k+1}

for every x = ⊕_{k=0}^∞ x_k in H = ⊕_{k=0}^∞ H_k, with 0 denoting the origin of H_0, where {U_{k+1}}_{k=0}^∞ is an arbitrary sequence of unitary transformations U_{k+1}: H_k → H_{k+1}. Since a unilateral shift is an isometry, we get r(S₊) = 1.

Take x = ⊕_{k=0}^∞ x_k ∈ H and λ ∈ C. If x ∈ N(λI − S₊), then λx_0 ⊕ ⊕_{k=1}^∞ λx_k = 0 ⊕ ⊕_{k=1}^∞ U_k x_{k−1}. Hence λx_0 = 0 and, for every k ≥ 0, λx_{k+1} = U_{k+1}x_k. If λ = 0, then x = 0. If λ ≠ 0, then x_0 = 0 and x_{k+1} = λ^{-1}U_{k+1}x_k, so that ‖x_0‖ = 0 and ‖x_{k+1}‖ = |λ|^{-1}‖x_k‖ for each k ≥ 0. Thus ‖x_k‖ = |λ|^{-k}‖x_0‖ = 0 for every k ≥ 0, and therefore x = 0. Conclusion: N(λI − S₊) = {0} for all λ ∈ C. Equivalently,

σ_P(S₊) = ∅.

Now take any x_0 ≠ 0 in H_0 and any λ ∈ Δ. Consider the sequence {x_k}_{k=0}^∞, with each x_k in H_k, recursively defined by x_{k+1} = λU_{k+1}x_k, so that ‖x_{k+1}‖ = |λ|‖x_k‖ for every k ≥ 0. Then ‖x_k‖ = |λ|^k‖x_0‖ for every k ≥ 1, and hence Σ_{k=0}^∞‖x_k‖² = ‖x_0‖²(1 + Σ_{k=1}^∞|λ|^{2k}) < ∞, which implies that the nonzero x = ⊕_{k=0}^∞ x_k lies in ⊕_{k=0}^∞ H_k = H. Moreover, since λx_k = U_{k+1}*x_{k+1} for each k ≥ 0, λx = S₊*x. Therefore, 0 ≠ x ∈ N(λI − S₊*). Conclusion: N(λI − S₊*) ≠ {0} for all λ ∈ Δ. Equivalently, Δ ⊆ σ_P(S₊*). On the other hand, if λ ∈ σ_P(S₊*), then there exists 0 ≠ x = ⊕_{k=0}^∞ x_k ∈ ⊕_{k=0}^∞ H_k = H such that S₊*x = λx. Thus U_{k+1}*x_{k+1} = λx_k, so that ‖x_{k+1}‖ = |λ|‖x_k‖ for each k ≥ 0, and hence ‖x_k‖ = |λ|^k‖x_0‖ for every k ≥ 1. Therefore, x_0 ≠ 0 (because x ≠ 0) and (1 + Σ_{k=1}^∞|λ|^{2k})‖x_0‖² = Σ_{k=0}^∞‖x_k‖² = ‖x‖² < ∞, which implies that |λ| < 1 (i.e., λ ∈ Δ). Conclusion: σ_P(S₊*) ⊆ Δ. Then σ_P(S₊*) = Δ.
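The eigenvector construction above is easy to test numerically. The following NumPy sketch (an added illustration, not from the text; the 200-dimensional truncation and the particular λ are arbitrary choices) checks that, for scalar multiplicity (each H_k = C and each U_k the identity), the vector x = (1, λ, λ², …) is, up to truncation error, an eigenvector of the backward shift S₊* for a point λ of the open unit disc:

```python
import numpy as np

# Truncate ell^2_+ to 200 coordinates; the backward shift S+* maps
# (x_0, x_1, x_2, ...) to (x_1, x_2, x_3, ...).
N = 200
backward_shift = np.diag(np.ones(N - 1), k=1)

lam = 0.5 + 0.3j                   # any point of the open unit disc
x = lam ** np.arange(N)            # x_k = lam^k, square-summable since |lam| < 1

# S+* x = lam * x except in the last coordinate, whose error is |lam|^N (negligible).
residual = np.linalg.norm(backward_shift @ x - lam * x)
print(residual)
```

The residual is only the truncation artifact |λ|^N; for the genuine operator on ℓ²₊ the eigenvalue equation holds exactly, in line with σ_P(S₊*) = Δ.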
6.5 Examples of Spectra
469
But the spectrum of any operator T on H is a closed set included in the closed disc {λ ∈ C: |λ| ≤ r(T)}, which is the disjoint union of σ_P(T), σ_R(T) and σ_C(T), where σ_R(T) = σ_P(T*)*\σ_P(T) (Proposition 6.17). Hence

σ_P(S₊) = σ_R(S₊*) = ∅,   σ_R(S₊) = σ_P(S₊*) = Δ,   σ_C(S₊) = σ_C(S₊*) = Γ.

Example 6E. The spectrum of a bilateral shift is simpler than that of a unilateral shift, for bilateral shifts are unitary (i.e., besides being isometries they are normal too). Let S be a bilateral shift of arbitrary multiplicity acting on a (complex) Hilbert space H, and let {H_k}_{k=−∞}^∞ be the underlying family of orthogonal subspaces of H = ⊕_{k=−∞}^∞ H_k (Problem 5.30). Recall that

Sx = ⊕_{k=−∞}^∞ U_k x_{k−1}   and   S*x = ⊕_{k=−∞}^∞ U_{k+1}* x_{k+1}

for every x = ⊕_{k=−∞}^∞ x_k in H = ⊕_{k=−∞}^∞ H_k, where {U_k}_{k=−∞}^∞ is an arbitrary family of unitary transformations U_{k+1}: H_k → H_{k+1}. Suppose there exists λ ∈ Γ ∩ ρ(S), so that R(λI − S) = H and |λ| = 1. Take any y_0 ≠ 0 in H_0 and set y_k = 0 ∈ H_k for each k ≠ 0. Consider the vector y = ⊕_{k=−∞}^∞ y_k ∈ H = R(λI − S) and let x = ⊕_{k=−∞}^∞ x_k ∈ H be any inverse image of y under (λI − S); that is, (λI − S)x = y. Since y_0 ≠ 0 it follows that y ≠ 0, and hence x ≠ 0. On the other hand, since y_k = 0 for every k ≠ 0, it also follows that λx_k = U_k x_{k−1} + y_k = U_k x_{k−1}, so that ‖x_k‖ = ‖x_{k−1}‖ for every k ≠ 0. Therefore, ‖x_j‖ = ‖x_{−1}‖ for every j ≤ −1 and ‖x_j‖ = ‖x_0‖ for every j ≥ 0, and hence x = 0. (Reason: ‖x‖² = Σ_{k=−∞}^∞‖x_k‖² = Σ_{j=−∞}^{−1}‖x_j‖² + Σ_{j=0}^∞‖x_j‖² < ∞.) Thus the existence of a complex number λ in Γ ∩ ρ(S) leads to a contradiction. Conclusion: Γ ∩ ρ(S) = ∅. Equivalently, Γ ⊆ σ(S). Finally, recall that S is unitary and so σ(S) ⊆ Γ (Corollary 6.18(c)). Outcome:

σ(S) = Γ.

Now take an arbitrary pair {λ, x} with λ in σ(S) and x = ⊕_{k=−∞}^∞ x_k in H. If x ∈ N(λI − S), then ⊕_{k=−∞}^∞ λx_k = ⊕_{k=−∞}^∞ U_k x_{k−1} and so λx_k = U_k x_{k−1} for every k. Since |λ| = 1 (because σ(S) = Γ), ‖x_k‖ = ‖x_{k−1}‖ for every k. Hence x = 0 (for ‖x‖² = Σ_{k=−∞}^∞‖x_k‖² is finite). Conclusion: N(λI − S) = {0} for all λ ∈ σ(S). Equivalently, σ_P(S) = ∅.
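A finite-dimensional analogue (an illustrative check, not from the book): the n×n cyclic permutation matrix is the natural periodic truncation of a bilateral shift, and, being unitary, it has all its eigenvalues on Γ (the n-th roots of unity). The contrast with the infinite-dimensional result is instructive: the truncation has n eigenvalues on the circle, while the genuinely bilateral shift has none at all.

```python
import numpy as np

# Periodic (circulant) truncation of the bilateral shift: a cyclic permutation.
n = 16
S = np.roll(np.eye(n), 1, axis=0)   # maps e_k to e_{k+1 mod n}; unitary

eigenvalues = np.linalg.eigvals(S)

# Every eigenvalue has modulus 1, i.e. it lies on the unit circle Gamma.
print(np.max(np.abs(np.abs(eigenvalues) - 1.0)))
```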
But S is normal, so that σ_R(S) = ∅ (Corollary 6.18(b)). Recalling that σ(S*) = σ(S)* and σ_C(S*) = σ_C(S)* (Proposition 6.17), we get

σ(S) = σ(S*) = σ_C(S*) = σ_C(S) = Γ.

Consider a weighted sum of projections D = Σ_k α_k P_k on ℓ²₊(H) or on ℓ²(H), where {α_k} is a bounded family of scalars and R(P_k) ≅ H for all k. This is identified with an orthogonal direct sum of scalar operators D = ⊕_k α_k I (Problem 5.16), and is referred to as a diagonal operator on ℓ²₊(H) or on ℓ²(H), respectively. A weighted shift is the product of a shift and a diagonal operator. Such a definition implicitly assumes that the shift (unilateral or bilateral, of any multiplicity) acts on the direct sum of countably infinite copies of a single Hilbert space H. Explicitly, a weighted unilateral shift on ℓ²₊(H) is the product of a unilateral shift on ℓ²₊(H) and a diagonal operator on ℓ²₊(H). Similarly, a weighted bilateral shift on ℓ²(H) is the product of a bilateral shift on ℓ²(H) and a diagonal operator on ℓ²(H). Diagonal operators acting on ℓ²₊(H) and on ℓ²(H), D₊ = ⊕_{k=0}^∞ α_k I and D = ⊕_{k=−∞}^∞ α_k I, with I standing for the identity on H, are denoted by D₊ = diag({α_k}_{k=0}^∞) and D = diag({α_k}_{k=−∞}^∞), respectively. Likewise, weighted shifts acting on ℓ²₊(H) and on ℓ²(H), T₊ = S₊D₊ and T = SD, will be denoted by T₊ = shift({α_k}_{k=0}^∞) and T = shift({α_k}_{k=−∞}^∞), respectively, whenever S₊ is the canonical unilateral shift on ℓ²₊(H) and S is the canonical bilateral shift on ℓ²(H) (see Problems 5.29 and 5.30).
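For scalar weights (multiplicity one) these definitions are easy to realize as matrices. The NumPy sketch below (an added illustration, not from the text; the weights are a sample choice) builds a finite truncation of T₊ = S₊D₊ and confirms that its only nonzero entries are the weights α_k on the first subdiagonal; note that the order of the factors matters, since D₊S₊ is instead the weighted shift with weight sequence {α_{k+1}}:

```python
import numpy as np

n = 6
alpha = 1.0 / (np.arange(n) + 1.0)      # sample bounded weights a_0, a_1, ...

S_plus = np.diag(np.ones(n - 1), k=-1)  # canonical unilateral shift: e_k -> e_{k+1}
D_plus = np.diag(alpha)                 # diagonal operator diag({a_k})

T_plus = S_plus @ D_plus                # weighted unilateral shift shift({a_k})

# (T_plus)_{k+1,k} = a_k, and all other entries vanish.
print(np.allclose(np.diag(T_plus, k=-1), alpha[:-1]))

# Reversing the factors shifts the weight sequence by one index.
print(np.allclose(np.diag(D_plus @ S_plus, k=-1), alpha[1:]))
```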
Example 6F. Let {α_k}_{k=0}^∞ be a bounded sequence in C such that α_k ≠ 0 for every k ≥ 0 and

α_k → 0 as k → ∞,

and consider the weighted unilateral shift T₊ = shift({α_k}_{k=0}^∞) on ℓ²₊(H), where H ≠ {0} is a complex Hilbert space. T₊ and T₊* are given by

T₊x = S₊D₊x = 0 ⊕ ⊕_{k=1}^∞ α_{k−1}x_{k−1}   and   T₊*x = D₊*S₊*x = ⊕_{k=0}^∞ ᾱ_k x_{k+1}

for every x = ⊕_{k=0}^∞ x_k in ℓ²₊(H) = ⊕_{k=0}^∞ H, with 0 denoting the origin of H. Applying the same argument used in Example 6D to show that σ_P(S₊) = ∅, we get N(λI − T₊) = {0} for all λ ∈ C. Indeed, if x = ⊕_{k=0}^∞ x_k ∈ N(λI − T₊), then λx_0 ⊕ ⊕_{k=1}^∞ λx_k = 0 ⊕ ⊕_{k=1}^∞ α_{k−1}x_{k−1}, so that λx_0 = 0 and λx_{k+1} = α_k x_k for every k ≥ 0. Thus x = 0 if λ = 0 (for α_k ≠ 0) and, if λ ≠ 0, then x_0 = 0 and x_{k+1} = λ^{-1}α_k x_k for every k ≥ 0, which also implies that x = 0. Outcome:

σ_P(T₊) = ∅.

Now note that the vector x = ⊕_{k=0}^∞ x_k with 0 ≠ x_0 ∈ H and x_k = 0 ∈ H for every k ≥ 1 lies in ℓ²₊(H) but not in R(T₊)⁻ ⊆ {0} ⊕ ⊕_{k=1}^∞ H. Hence R(T₊)⁻ ≠ ℓ²₊(H), and so 0 ∈ σ_P(T₊) ∪ σ_R(T₊). Since σ_P(T₊) = ∅, 0 ∈ σ_R(T₊).

However, if λ ≠ 0, then R(λI − T₊) = ℓ²₊(H). In fact, suppose λ ≠ 0 and take any y = ⊕_{k=0}^∞ y_k in ℓ²₊(H). Set x_0 = λ^{-1}y_0 and, for each k ≥ 0, x_{k+1} = λ^{-1}(α_k x_k + y_{k+1}). Since α_k → 0, there exists a positive integer k_λ such that α = |λ|^{-1}sup_{k≥k_λ}|α_k| ≤ ½. Then ‖α_{k+1}x_{k+1}‖ ≤ α(‖α_k x_k‖ + ‖y_{k+1}‖) for every k ≥ k_λ, so that ‖α_{k+1}x_{k+1}‖² ≤ 2α²(‖α_k x_k‖² + ‖y_{k+1}‖²) ≤ ½(‖α_k x_k‖² + ‖y_{k+1}‖²), which implies that Σ_{k=0}^∞‖α_k x_k‖² < ∞, and therefore

|λ|²(Σ_{k=0}^∞‖x_{k+1}‖²) ≤ Σ_{k=0}^∞(‖α_k x_k‖ + ‖y_{k+1}‖)² ≤ 2(Σ_{k=0}^∞‖α_k x_k‖² + ‖y‖²) < ∞,

so that x = ⊕_{k=0}^∞ x_k lies in ℓ²₊(H). But (λI − T₊)x = λx_0 ⊕ ⊕_{k=1}^∞(λx_k − α_{k−1}x_{k−1}) = y, and so y ∈ R(λI − T₊). Outcome: R(λI − T₊) = ℓ²₊(H). Since N(λI − T₊) = {0} for all λ ∈ C, λ ∈ ρ(T₊) for every nonzero λ ∈ C. Conclusion: σ(T₊) = σ_R(T₊) = {0}. Moreover, as σ_R1(T) is an open set for every operator T, we get

σ(T₊) = σ_R(T₊) = σ_R2(T₊) = {0},

and hence

σ(T₊*) = σ_P(T₊*) = σ_P2(T₊*) = {0}.

This was our first sample of a quasinilpotent operator (r(T₊) = 0) that is not nilpotent (T₊ⁿ ≠ O for every n ≥ 1). The next example exhibits another one. It is worth noticing that σ(μI − T₊) = {μ − λ ∈ C: λ ∈ σ(T₊)} = {μ} by the Spectral Mapping Theorem, and therefore σ(μ̄I − T₊*) = {μ̄}. Moreover, if x is an eigenvector of T₊*, then T₊*x = 0, so that (μ̄I − T₊*)x = μ̄x; that is, μ̄ ∈ σ_P(μ̄I − T₊*). Thus

σ(μI − T₊) = σ_R(μI − T₊) = {μ}   and   σ(μ̄I − T₊*) = σ_P(μ̄I − T₊*) = {μ̄}.

Example 6G. Let {α_k}_{k=−∞}^∞ be a bounded family in C such that
α_k ≠ 0 for every k ∈ Z   and   α_k → 0 as |k| → ∞,

and consider the weighted bilateral shift T = shift({α_k}_{k=−∞}^∞) on ℓ²(H), where H ≠ {0} is a complex Hilbert space. T and T* are given by

Tx = SDx = ⊕_{k=−∞}^∞ α_{k−1}x_{k−1}   and   T*x = D*S*x = ⊕_{k=−∞}^∞ ᾱ_k x_{k+1}

for x = ⊕_{k=−∞}^∞ x_k in ℓ²(H) = ⊕_{k=−∞}^∞ H. Take any λ ∈ C. If x = ⊕_{k=−∞}^∞ x_k ∈ N(λI − T), then ⊕_{k=−∞}^∞(λx_k − α_{k−1}x_{k−1}) = 0, so that λx_{k+1} = α_k x_k for every k ∈ Z. If λ = 0, then x = 0. If λ ≠ 0, then ‖x_{k+1}‖ = |λ|^{-1}|α_k|‖x_k‖ for every k ∈ Z. But lim_{k→−∞}‖x_k‖ = 0 (for ‖x‖² = Σ_{k=−∞}^∞‖x_k‖² < ∞) and, since α_k → 0 as k → −∞, there is an integer k_0 such that |λ|^{-1}|α_k| ≤ ½ for every k < k_0; hence ‖x_{k_0}‖ ≤ (½)^j‖x_{k_0−j}‖ for every j ≥ 0, so that x_{k_0} = 0. As α_k ≠ 0 for every k, the recursion λx_{k+1} = α_k x_k then yields x_k = 0 for every k ∈ Z; that is, x = 0. Hence N(λI − T) = {0} for all λ ∈ C, and so

σ_P(T) = ∅.
Take any vector y = ⊕_{k=−∞}^∞ y_k in ℓ²(H) and any λ ≠ 0 in C. Since α_k → 0 as |k| → ∞, there exists a positive integer k_λ and a finite set K_λ = {k ∈ Z: −k_λ ≤ k ≤ k_λ} such that α = |λ|^{-1}sup_{k∈Z\K_λ}|α_k| ≤ ½. Then, with M = max{1, |λ|^{-1}sup_k|α_k|},

Σ_{j=−∞}^{k−1}|λ|^{−(k−j)}|α_{k−1}| ⋯ |α_j| ‖y_j‖ ≤ (M/α)^{#K_λ}(Σ_{i=1}^∞ α^i) sup_j‖y_j‖ < ∞

for all k ∈ Z (at most #K_λ of the factors |λ|^{-1}|α_i| exceed α, and none exceeds M), which implies that the infinite series Σ_{j=−∞}^{k−1} λ^{−(k−j)}α_{k−1} ⋯ α_j y_j is absolutely convergent (thus convergent in H) for every k ∈ Z. Set

x_k = λ^{-1}(Σ_{j=−∞}^{k−1} λ^{−(k−j)}α_{k−1} ⋯ α_j y_j + y_k)

in H, so that x_{k+1} = λ^{-1}(α_k x_k + y_{k+1}) for each k ∈ Z. If k ∈ Z\K_λ, then ‖α_k x_k‖ ≤ α(‖α_{k−1}x_{k−1}‖ + ‖y_k‖) and so ‖α_k x_k‖² ≤ 2α²(‖α_{k−1}x_{k−1}‖² + ‖y_k‖²) ≤ ½(‖α_{k−1}x_{k−1}‖² + ‖y_k‖²). Hence Σ_{k∈Z\K_λ}‖α_k x_k‖² ≤ ½(Σ_{k∈Z\K_λ}‖α_{k−1}x_{k−1}‖² + ‖y‖²). Then Σ_{k=−∞}^∞‖α_k x_k‖² < ∞, and therefore

|λ|²(Σ_{k=−∞}^∞‖x_{k+1}‖²) ≤ Σ_{k=−∞}^∞(‖α_k x_k‖ + ‖y_{k+1}‖)² ≤ 2(Σ_{k=−∞}^∞‖α_k x_k‖² + ‖y‖²) < ∞.

Thus x = ⊕_{k=−∞}^∞ x_k lies in ℓ²(H). But (λI − T)x = ⊕_{k=−∞}^∞(λx_k − α_{k−1}x_{k−1}) = y, and so y ∈ R(λI − T). Outcome: R(λI − T) = ℓ²(H). Since N(λI − T) = {0} for all λ ∈ C, it follows that every nonzero λ ∈ C lies in ρ(T). Conclusion:

σ(T) = {0}.

However, if x ∈ N(T*), then ᾱ_k x_{k+1} = 0, so that x_{k+1} = 0 (since α_k ≠ 0) for every k ∈ Z, and hence x = 0. That is, N(T*) = {0} or, equivalently (cf. Problem 5.35),

R(T)⁻ = ℓ²(H).

This implies that 0 ∉ σ_R(T) and, as σ_P(T) = ∅, we finally get

σ(T) = σ_C(T) = σ_C(T*) = σ(T*) = {0}.

Note: Using the Spectral Mapping Theorem (as we did in the previous example) it can be shown that

σ(μI − T) = σ_C(μI − T) = {μ}   and   σ(μ̄I − T*) = σ_C(μ̄I − T*) = {μ̄}.
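The quasinilpotency in Examples 6F and 6G can be watched numerically through the Gelfand–Beurling formula r(T) = lim_n ‖Tⁿ‖^{1/n}. The sketch below (an added illustration, not from the book) uses a truncated unilateral weighted shift with the sample weights α_k = 1/(k+1), for which ‖Tⁿ‖ = 1/n! and hence ‖Tⁿ‖^{1/n} = (n!)^{−1/n} → 0:

```python
import numpy as np

N = 60                                 # truncation size (ample for n <= 8)
alpha = 1.0 / (np.arange(N) + 1.0)     # weights a_k = 1/(k+1) -> 0
T = np.diag(alpha[:-1], k=-1)          # truncated weighted shift shift({a_k})

# Gelfand-Beurling: r(T) = lim ||T^n||^(1/n); here ||T^n|| = 1/n!.
M = np.eye(N)
roots = []
for n in range(1, 9):
    M = M @ T
    roots.append(np.linalg.norm(M, 2) ** (1.0 / n))

print(roots)   # strictly decreasing toward 0, consistent with r(T) = 0
```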
Example 6H. Let F ∈ B[H] be an operator on a complex Hilbert space H ≠ {0}. Consider the operator T ∈ B[ℓ²₊(H)] defined by

Tx = 0 ⊕ ⊕_{k=1}^∞ F x_{k−1}   so that   T*x = ⊕_{k=0}^∞ F* x_{k+1}

for every x = ⊕_{k=0}^∞ x_k in ℓ²₊(H) = ⊕_{k=0}^∞ H, where 0 is the origin of H. These can be identified with the following infinite matrices of operators:

T =
⎡ O            ⎤
⎢ F  O         ⎥
⎢    F  O      ⎥
⎣       ⋱  ⋱  ⎦

and

T* =
⎡ O  F*           ⎤
⎢    O  F*        ⎥
⎢       O  F*     ⎥
⎣          ⋱  ⋱  ⎦

It is readily verified by induction that Tⁿx = ⊕_{k=0}^{n−1} 0 ⊕ ⊕_{k=n}^∞ Fⁿ x_{k−n}, and hence ‖Tⁿx‖² = Σ_{k=0}^∞‖Fⁿx_k‖² ≤ ‖Fⁿ‖²‖x‖² for all x ∈ ℓ²₊(H), which implies that ‖Tⁿ‖ ≤ ‖Fⁿ‖ for each n ≥ 1. On the other hand, take any nonzero vector y_0 in H, set y_k = 0 ∈ H for all k ≥ 1, and consider the vector y = ⊕_{k=0}^∞ y_k in ℓ²₊(H), so that ‖y‖ = ‖y_0‖ ≠ 0. Thus ‖Tⁿ‖ = sup_{‖x‖=1}‖Tⁿx‖ ≥ sup_{‖y_0‖=1}‖Tⁿy‖ = sup_{‖y_0‖=1}‖Fⁿy_0‖ = ‖Fⁿ‖ for each n ≥ 1. Therefore, ‖Tⁿ‖ = ‖Fⁿ‖ for every n ≥ 1. Then the Gelfand–Beurling formula for the spectral radius ensures that

r(T) = r(F).

Moreover, y ≠ 0 and T*y = 0, so that 0 ∈ σ_P(T*), and hence 0 ∈ σ(T). Take an arbitrary λ ∈ ρ(T), so that λ ≠ 0 and R(λI − T) = ℓ²₊(H). Since y = y_0 ⊕ ⊕_{k=1}^∞ 0 lies in ℓ²₊(H) for every y_0 ∈ H, it follows that y ∈ R(λI − T). That is, y = (λI − T)x for some x = ⊕_{k=0}^∞ x_k in ℓ²₊(H), and so y_0 ⊕ ⊕_{k=1}^∞ 0 = λx_0 ⊕ ⊕_{k=1}^∞(λx_k − Fx_{k−1}). Therefore, x_0 = λ^{-1}y_0 and x_{k+1} = λ^{-1}Fx_k for every k ≥ 0, and hence x_k = (λ^{-1}F)^k x_0 = λ^{-1}(λ^{-1}F)^k y_0. Since x ∈ ℓ²₊(H) for every y_0 ∈ H, ‖x‖² = Σ_{k=0}^∞‖x_k‖² = |λ|^{-2}Σ_{k=0}^∞‖(λ^{-1}F)^k y_0‖² < ∞ for every y_0 ∈ H, so that r(λ^{-1}F) < 1 by Proposition 6.22. Conclusion: if λ ∈ ρ(T), then r(F) < |λ|. Equivalently, if |λ| ≤ r(F), then λ ∈ σ(T); that is, {λ ∈ C: |λ| ≤ r(F)} ⊆ σ(T). But σ(T) ⊆ {λ ∈ C: |λ| ≤ r(F)} because r(F) = r(T). Therefore (since σ(T*) = σ(T)* for every operator T),

σ(T) = {λ ∈ C: |λ| ≤ r(F)} = σ(T*).

Now recall that λ ∈ σ_P(T) if and only if Tx = λx (i.e., λx_0 = 0 and λx_{k+1} = Fx_k for every k ≥ 0) for some nonzero x = ⊕_{k=0}^∞ x_k in ℓ²₊(H). If 0 ∈ σ_P(T), then Fx_k = 0 for all k ≥ 0 for some nonzero x = ⊕_{k=0}^∞ x_k in ℓ²₊(H), so that 0 ∈ σ_P(F). Conversely, if 0 ∈ σ_P(F), then there exists x_0 ≠ 0 in H such that Fx_0 = 0. Set x = ⊕_{k=0}^∞(k+1)^{-1}x_0, a nonzero vector in ℓ²₊(H), so that Tx = 0 ⊕ ⊕_{k=1}^∞ k^{-1}Fx_0 = 0. Hence 0 ∈ σ_P(T). Outcome: 0 ∈ σ_P(T) if and only if 0 ∈ σ_P(F). If x ≠ 0 lies in N(λI − T) for some λ ≠ 0, then x_0 = 0 and x_{k+1} = λ^{-1}Fx_k for every k ≥ 0, so that x = 0, which is a contradiction. Therefore, if λ ≠ 0, then λ ∉ σ_P(T). Conclusion:

σ_P(T) = {0} if 0 ∈ σ_P(F),   and   σ_P(T) = ∅ if 0 ∉ σ_P(F).
Since σ_R(T*) = σ_P(T)*\σ_P(T*), σ_P(T)* ⊆ {0} and 0 ∈ σ_P(T*), it follows that σ_R(T*) = ∅, and hence

σ_C(T*) = {λ ∈ C: |λ| ≤ r(F)}\σ_P(T*).

If σ_P(T*) ≠ {0}, then there exists 0 ≠ λ ∈ σ_P(T*), which means that T*x = λx for some nonzero x = ⊕_{k=0}^∞ x_k in ℓ²₊(H). Hence there exists 0 ≠ x_j ∈ H such that F*x_{k+1} = λx_k for every k ≥ 0. A trivial induction shows that F*^k x_{j+k} = λ^k x_j for every k ≥ 0. Then x_j ∈ ∩_{k=0}^∞ R(F*^k) because λ ≠ 0, so that ∩_{k=0}^∞ R(F*^k) ≠ {0}. Conclusion:

∩_{k=0}^∞ R(F*^k) = {0}   implies   σ_P(T*) = {0},

and, in this case,

σ_C(T*) = {λ ∈ C: |λ| ≤ r(F)}\{0}.

Sample: Set F = S₊* on H = ℓ²₊(K) for any complex Hilbert space K ≠ {0}. Therefore, r(F) = r(S₊*) = 1 (cf. Example 6D) and R(F*^k) = R(S₊^k) = ⊕_{j=0}^{k−1}{0} ⊕ ⊕_{j=k}^∞ K ⊂ ℓ²₊(K), so that ∩_{k=0}^∞ R(F*^k) = {0}. With Δ⁻ denoting the closed unit disc about the origin we get

σ_P(T) = σ_P(T*) = {0},   σ_R(T) = σ_R(T*) = ∅,   σ_C(T) = σ_C(T*) = Δ⁻\{0}.

Summing up: A backward unilateral shift of unilateral shifts (i.e., T* with F* = S₊) and a unilateral shift of backward unilateral shifts (i.e., T with F = S₊*) have a continuous spectrum equal to the punctured disc Δ⁻\{0}. This was our first example of operators for which the continuous spectrum is not included in the boundary of the spectrum.
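The identity ‖Tⁿ‖ = ‖Fⁿ‖ behind r(T) = r(F) can be checked on finite truncations. Below is a NumPy sketch (an added illustration; the 2×2 block F and the six-block truncation are arbitrary choices) that builds the block operator matrix of this example and compares norms; equality holds exactly as long as n is smaller than the number of block rows:

```python
import numpy as np

F = np.array([[0.0, 2.0],
              [1.0, 0.0]])             # any fixed operator block on H = C^2
d, m = F.shape[0], 6                   # block size and number of block rows

# Block operator matrix of T: F on the first block subdiagonal, O elsewhere.
T = np.zeros((d * m, d * m))
for i in range(m - 1):
    T[(i + 1) * d:(i + 2) * d, i * d:(i + 1) * d] = F

# ||T^n|| = ||F^n|| for n < m.
for n in range(1, 4):
    lhs = np.linalg.norm(np.linalg.matrix_power(T, n), 2)
    rhs = np.linalg.norm(np.linalg.matrix_power(F, n), 2)
    print(n, np.isclose(lhs, rhs))
```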
6.6
The Spectrum of a Compact Operator
The spectral theory of compact operators is an essential feature of the Spectral Theorem for compact normal operators of the next section. Normal operators were defined on a Hilbert space, and therefore we assume throughout this section that the compact operators act on a complex Hilbert space H ≠ {0}, although the spectral theory for compact operators can be developed on a Banach space as well. Recall that B∞[H] denotes the class of all compact operators on H.
Proposition 6.29. If T ∈ B∞[H] and λ is any nonzero complex number, then R(λI − T) is a subspace of H.

Proof. Take any compact operator K ∈ B∞[M, X], where X ≠ {0} is a complex Banach space and M is a subspace of X. Let I be the identity on M, let λ be any nonzero complex number, and consider the operator (λI − K) ∈ B[M, X].

Claim. If N(λI − K) = {0}, then R(λI − K) is closed in X.
Proof. If N(λI − K) = {0} and R(λI − K) is not closed in X ≠ {0}, then λI − K is not bounded below (Corollary 4.24). This means that for every ε > 0 there exists 0 ≠ x_ε ∈ M such that ‖(λI − K)x_ε‖ < ε‖x_ε‖. Thus inf_{‖x‖=1}‖(λI − K)x‖ = 0 and there exists a sequence {x_n} of unit vectors in M for which ‖(λI − K)x_n‖ → 0. Since K is compact and {x_n} is bounded, it follows by Theorem 4.52 that {Kx_n} has a convergent subsequence, say {Kx_{n_k}}, so that Kx_{n_k} → y ∈ X. However,

‖λx_{n_k} − y‖ = ‖λx_{n_k} − Kx_{n_k} + Kx_{n_k} − y‖ ≤ ‖(λI − K)x_{n_k}‖ + ‖Kx_{n_k} − y‖ → 0.

Then {λx_{n_k}} also converges in X to y, and hence y ∈ M (for M is closed in X; Theorem 3.30). Moreover, y ≠ 0 (since 0 ≠ |λ| = ‖λx_{n_k}‖ → ‖y‖) and, as K is continuous, Ky = K lim_k λx_{n_k} = λ lim_k Kx_{n_k} = λy, so that y ∈ N(λI − K). Therefore we get N(λI − K) ≠ {0}, which is a contradiction. □

Take any T ∈ B[H]. Recall that (λI − T)|_{N(λI−T)^⊥} ∈ B[N(λI − T)^⊥, H] is injective (i.e., N((λI − T)|_{N(λI−T)^⊥}) = {0}; cf. the remark that follows Proposition 5.12) and coincides with λI − T|_{N(λI−T)^⊥} on N(λI − T)^⊥. If T is compact, then so is T|_{N(λI−T)^⊥} ∈ B[N(λI − T)^⊥, H]. (Reason: N(λI − T)^⊥ is a subspace of H, and the restriction of a compact linear transformation to a linear manifold is a compact linear transformation; see Section 4.9.) Since λ ≠ 0, it follows by the above claim that (λI − T)|_{N(λI−T)^⊥} = λI − T|_{N(λI−T)^⊥} has a closed range. But it is readily verified that R((λI − T)|_{N(λI−T)^⊥}) = R(λI − T). □
Proposition 6.30. If T ∈ B∞[H] and λ is any nonzero complex number, then R(λI − T) = H whenever N(λI − T) = {0}.

Proof. Take any λ ≠ 0 in C and any T ∈ B∞[H]. Suppose N(λI − T) = {0} and R(λI − T) ≠ H (recall: H ≠ {0}), and consider the sequence {M_n}_{n=0}^∞ of linear manifolds of H recursively defined by

M_{n+1} = (λI − T)(M_n)   for every n ≥ 0,   with   M_0 = H.

It can be verified by induction that

M_{n+1} ⊆ M_n   for every n ≥ 0.

Indeed, M_1 = R(λI − T) ⊆ H = M_0 and, if the above inclusion holds for some n ≥ 0, then (λI − T)(M_{n+1}) ⊆ (λI − T)(M_n), which concludes the induction. The previous proposition ensures that R(λI − T) is a subspace of H, and so (λI − T) ∈ G[H, R(λI − T)] by Corollary 4.24. Hence (another induction plus Theorem 3.24),

{M_n}_{n=0}^∞ is a decreasing sequence of subspaces of H.

Moreover, if M_{n+1} = M_n for some n, then there exists an integer k ≥ 1 such that M_{k+1} = M_k ≠ M_{k−1} (for M_0 = H ≠ R(λI − T) = M_1). But this leads to a contradiction: if M_{k+1} = M_k, then (λI − T)(M_k) = M_k, so that, by injectivity of λI − T,
M_k = (λI − T)^{-1}(M_k) = M_{k−1}, which contradicts M_k ≠ M_{k−1}. Outcome: M_{n+1} is properly included in M_n for each n; that is,

M_{n+1} ⊂ M_n   for every n ≥ 0.

Hence M_{n+1} is a proper subspace of M_n (see Problem 3.38). By Lemma 4.33, for each n ≥ 0 there exists x_n ∈ M_n with ‖x_n‖ = 1 such that ½ ≤ d(x_n, M_{n+1}). Recall that λ ≠ 0, take any pair of integers 0 ≤ m < n, and set

x = x_n − λ^{-1}(λI − T)x_n + λ^{-1}(λI − T)x_m,

so that Tx_n − Tx_m = λ(x − x_m). Since x lies in M_{m+1},

‖Tx_n − Tx_m‖ = |λ|‖x − x_m‖ ≥ ½|λ|,

which implies that the sequence {Tx_n} has no convergent subsequence (no subsequence of {Tx_n} is a Cauchy sequence). Since {x_n} is bounded, this ensures that T is not compact (cf. Theorem 4.52). Conclusion: If λ ≠ 0 and T ∈ B[H] is such that N(λI − T) = {0} and R(λI − T) ≠ H, then T ∉ B∞[H]. Equivalently, if T ∈ B∞[H] and N(λI − T) = {0} for λ ≠ 0, then R(λI − T) = H. □

Corollary 6.31. If T ∈ B∞[H], then 0 ≠ λ ∈ ρ(T) ∪ σ_P4(T), so that

σ(T)\{0} = σ_P(T)\{0} ⊆ σ_P4(T).

Proof. Take 0 ≠ λ ∈ C. Since H ≠ {0}, Propositions 6.29 and 6.30 say that λ ∈ ρ(T) ∪ σ_P1(T) ∪ σ_P4(T) ∪ σ_R1(T) and λ ∈ ρ(T) ∪ σ_P(T) (see the diagram of Section 6.2). Then λ ∈ ρ(T) ∪ σ_P1(T) ∪ σ_P4(T), and hence λ̄ ∈ ρ(T)* ∪ σ_P1(T)* ∪ σ_P4(T)* = ρ(T*) ∪ σ_R1(T*) ∪ σ_P4(T*) (by Proposition 6.17). But T* ∈ B∞[H] whenever T ∈ B∞[H] (cf. Problem 5.42), so that λ̄ ∈ ρ(T*) ∪ σ_P1(T*) ∪ σ_P4(T*), and therefore λ̄ ∈ ρ(T*) ∪ σ_P4(T*). That is, λ ∈ ρ(T) ∪ σ_P4(T) whenever λ ≠ 0. □

Example 6I. If T ∈ B0[H] (i.e., T is a finite-rank operator on H), then

σ(T) = σ_P(T) = σ_P4(T) is finite.

Indeed, if dim H < ∞, then σ(T) = σ_P(T) = σ_P4(T) (Example 6B). Suppose dim H = ∞. Since B0[H] ⊆ B∞[H], it follows by Corollary 6.31 that 0 ≠ λ ∈ ρ(T) ∪ σ_P4(T). Moreover, since dim R(T) < ∞ and dim H = ∞, it also follows that R(T)⁻ = R(T) ≠ H and N(T) ≠ {0} (recall from Problem 2.17 that dim N(T) + dim R(T) = dim H). Then 0 ∈ σ_P4(T) (cf. diagram of Section 6.2). Hence σ(T) = σ_P(T) = σ_P4(T). If σ_P(T) is infinite, then there exists an infinite set of linearly independent eigenvectors of T (Proposition 6.14). Since every eigenvector of T associated with a nonzero eigenvalue lies in R(T), this implies that dim R(T) = ∞ (see Theorem 2.5), which is a contradiction. Conclusion: σ_P(T) must be finite. In particular, this shows that the spectrum in Example 6B is, clearly, finite.
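The finite-rank case is easy to probe numerically: a rank-r matrix has at most r nonzero eigenvalues, so its spectrum is a finite set. A NumPy sketch (an added illustration; the sizes and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 12, 3
U = rng.standard_normal((n, r))
V = rng.standard_normal((r, n))
T = U @ V                              # a finite-rank (rank <= 3) operator on R^12

eigenvalues = np.linalg.eigvals(T)
nonzero = np.sum(np.abs(eigenvalues) > 1e-8)

print(nonzero)                         # at most 3 nonzero points in sigma(T)
```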
Example 6J. A glance at the spectra of some compact operators:

(a) The operator A on C² from Example 6B is obviously compact (every operator on a finite-dimensional space is). Its spectrum is given by (cf. Examples 6B and 6I)

σ(A) = σ_P(A) = σ_P4(A) = {0, 1}.

(b) The diagonal operator D = diag({λ_k}_{k=0}^∞) ∈ B[ℓ²₊] with λ_k → 0 is compact (Example 4N). By Example 6C, σ_P4(D) = {λ_k}\{0} and

σ(D) = σ_P4(D) ∪ σ_C(D) (with σ_C(D) = {0}) if λ_k ≠ 0 for all k ≥ 0,
σ(D) = σ_P4(D) ∪ σ_P3(D) (with σ_P3(D) = {0}) if λ_k = 0 for some k ≥ 0.

(c) The weighted unilateral shift T₊ = shift({α_k}_{k=0}^∞) acting on ℓ²₊, as introduced in Example 6F, is compact (reason: T₊ = S₊D₊ and D₊ is compact). We saw there (Example 6F) that

σ(T₊) = σ_R(T₊) = σ_R2(T₊) = {0}.

Moreover, T₊* also is compact (Problem 5.42) and (Example 6F)

σ(T₊*) = σ_P(T₊*) = σ_P2(T₊*) = {0}.

(d) Finally, consider the bilateral weighted shift T = shift({α_k}_{k=−∞}^∞) of Example 6G acting on ℓ². The same argument as above shows that T is compact and (cf. Example 6G)

σ(T) = σ_C(T) = {0}.

Corollary 6.32. If an operator T on H is compact and normaloid, then σ_P(T) ≠ ∅ and there exists λ ∈ σ_P(T) such that |λ| = ‖T‖.

Proof. Recall that H ≠ {0}. If T is normaloid (i.e., r(T) = ‖T‖), then σ(T) = {0} only if T = O. If T = O and H ≠ {0}, then 0 ∈ σ_P(T) and ‖T‖ = 0. If T ≠ O, then σ(T) ≠ {0} and ‖T‖ = r(T) = max_{λ∈σ(T)}|λ|, so that there exists λ in σ(T) such that |λ| = ‖T‖. Moreover, if T is compact and σ(T) ≠ {0}, then ∅ ≠ σ(T)\{0} ⊆ σ_P(T) by Corollary 6.31, and hence r(T) = max_{λ∈σ(T)}|λ| = max_{λ∈σ_P(T)}|λ| = ‖T‖. Thus there exists λ ∈ σ_P(T) such that |λ| = ‖T‖. □

Proposition 6.33. If T ∈ B∞[H] and {λ_n} is an infinite sequence of distinct elements in σ(T), then λ_n → 0.

Proof. Take any T ∈ B[H] and let {λ_n} be an infinite sequence of distinct elements in σ(T). If λ_{n'} = 0 for some n', then the subsequence of {λ_n} consisting of all points of {λ_n} except λ_{n'} is a sequence of distinct nonzero elements in σ(T), and it converges to 0 only if {λ_n} does. Thus there is no loss of generality in assuming that {λ_n} is a sequence of distinct nonzero elements in σ(T) indexed by N. Moreover, if
T is compact and 0 ≠ λ_n ∈ σ(T), then Corollary 6.31 says that λ_n ∈ σ_P(T) for every n ≥ 1. Let {x_n}_{n=1}^∞ be a sequence of eigenvectors associated with {λ_n}_{n=1}^∞ (i.e., Tx_n = λ_n x_n with x_n ≠ 0 for every n ≥ 1), which is a sequence of linearly independent vectors by Proposition 6.14. Set

M_n = span{x_i}_{i=1}^n   for each n ≥ 1,

so that each M_n is a subspace of H with dim M_n = n, and M_n ⊆ M_{n+1} for every n ≥ 1. Actually, each M_n is properly included in M_{n+1} because {x_i}_{i=1}^{n+1} is linearly independent, and hence x_{n+1} ∈ M_{n+1}\M_n. From now on the proof is similar to that of Proposition 6.30. Since each M_n is a proper subspace of M_{n+1}, it follows by Lemma 4.33 that for every n ≥ 1 there exists y_{n+1} ∈ M_{n+1} with ‖y_{n+1}‖ = 1 such that ½ ≤ d(y_{n+1}, M_n). Write y_{n+1} = Σ_{i=1}^{n+1} a_i x_i in M_{n+1}, so that

(λ_{n+1}I − T)y_{n+1} = Σ_{i=1}^{n+1} a_i(λ_{n+1} − λ_i)x_i = Σ_{i=1}^{n} a_i(λ_{n+1} − λ_i)x_i ∈ M_n.

Recall that λ_n ≠ 0 for all n, take any pair of integers 1 < m < n, and set

y = y_m − λ_m^{-1}(λ_m I − T)y_m + λ_n^{-1}(λ_n I − T)y_n,

so that T(λ_m^{-1}y_m) − T(λ_n^{-1}y_n) = y − y_n. Since y lies in M_{n−1},

‖T(λ_m^{-1}y_m) − T(λ_n^{-1}y_n)‖ = ‖y − y_n‖ ≥ ½,

which implies that the sequence {T(λ_n^{-1}y_n)} has no convergent subsequence. If T is compact, then Theorem 4.52 ensures that {λ_n^{-1}y_n} is an unbounded sequence. That is, sup_n|λ_n|^{-1} = sup_n‖λ_n^{-1}y_n‖ = ∞, and hence inf_n|λ_n| = 0. Finally, if {λ_n} did not converge to 0, then some infinite subsequence of it would be bounded away from zero, and applying the same argument to that subsequence would yield a contradiction. As λ_n ≠ 0 for all n, this means that λ_n → 0. □
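Proposition 6.33 is visible numerically in discretizations of compact integral operators. The sketch below (an added illustration, not from the book) discretizes the operator on L²[0,1] with kernel k(s,t) = min(s,t), whose eigenvalues are known to be 1/((n − ½)²π²), and watches the computed eigenvalues tail off toward 0:

```python
import numpy as np

# Midpoint-rule discretization of (Kf)(s) = integral_0^1 min(s,t) f(t) dt.
n = 400
t = (np.arange(n) + 0.5) / n
K = np.minimum.outer(t, t) / n        # symmetric matrix approximating a compact operator

w = np.sort(np.abs(np.linalg.eigvalsh(K)))[::-1]

print(w[:5])   # ~ 1/((k - 1/2)^2 pi^2): about 0.405, 0.045, 0.016, ...
```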
Corollary 6.34. Take any compact operator T ∈ B∞[H].

(a) 0 is the only possible accumulation point of σ(T).
(b) If λ ∈ σ(T)\{0}, then λ is an isolated point of σ(T).
(c) σ(T)\{0} is a discrete subset of C.
(d) σ(T) is countable.

Proof. If λ ≠ 0, then the previous proposition says that there is no sequence of distinct points in σ(T) that converges to λ. Thus λ ≠ 0 is not an accumulation point of σ(T) by Proposition 3.28. Therefore, if λ ∈ σ(T)\{0}, then it is not an
accumulation point of σ(T), which means (by definition) that it is an isolated point of σ(T). Hence σ(T)\{0} consists entirely of isolated points, which means (by definition again) that it is a discrete subset of C. But C is separable, and every discrete subset of a separable metric space is countable (this is a consequence of Theorem 3.35 and Corollary 3.36; see the observations that follow Proposition 3.37). Then σ(T)\{0} is countable and so is σ(T). □

The point λ = 0 may be anywhere. That is, if T ∈ B∞[H], then λ = 0 may lie in σ_P(T), σ_R(T), σ_C(T) or ρ(T) (see Example 6J). However, if 0 ∈ ρ(T), then H must be finite-dimensional. Indeed, if 0 ∈ ρ(T), then T^{-1} ∈ B[H], so that I = T^{-1}T is compact by Proposition 4.54, which implies that H is finite-dimensional (see Corollary 4.34). Moreover, the eigenspaces associated with nonzero eigenvalues of a compact operator also are finite-dimensional, as in the next proposition.
Proposition 6.35. If T ∈ B∞[H] and λ is a nonzero complex number, then

dim N(λI − T) = dim N(λ̄I − T*) < ∞.

Proof. Take any λ ≠ 0 in C and any T ∈ B∞[H]. If dim N(λI − T) = 0, then N(λI − T) = {0}, so that λ ∈ ρ(T) by Corollary 6.31 and hence λ̄ ∈ ρ(T*) by Proposition 6.17. Therefore N(λ̄I − T*) = {0}, which means that dim N(λ̄I − T*) = 0. Dually, since T ∈ B∞[H] if and only if T* ∈ B∞[H] (cf. Problem 5.42), dim N(λ̄I − T*) = 0 implies dim N(λI − T) = 0. That is,

dim N(λI − T) = 0   if and only if   dim N(λ̄I − T*) = 0.

Suppose dim N(λI − T) ≠ 0, and so dim N(λ̄I − T*) ≠ 0. Note that N(λI − T) ≠ {0} is an invariant subspace for T (if Tx = λx, then T(Tx) = λ(Tx)), and also that T|_{N(λI−T)} = λI maps N(λI − T) into itself. If T is compact, then T|_{N(λI−T)} is compact (Section 4.9) and so is λI ≠ O on N(λI − T) ≠ {0}. But λI ≠ O is not compact on an infinite-dimensional normed space (by Corollary 4.34), so that dim N(λI − T) < ∞. Dually, as T* is compact, dim N(λ̄I − T*) < ∞. Therefore, there exist positive integers m and n such that

dim N(λI − T) = m   and   dim N(λ̄I − T*) = n.

Let {e_i}_{i=1}^m and {f_i}_{i=1}^n be orthonormal bases for the Hilbert spaces N(λI − T) and N(λ̄I − T*), respectively. Set k = min{m, n} ≥ 1 and consider the mappings S: H → H and S*: H → H defined by

Sx = Σ_{i=1}^k ⟨x; e_i⟩f_i   and   S*x = Σ_{i=1}^k ⟨x; f_i⟩e_i

for every x ∈ H. It is clear that S and S* lie in B[H], and also that S* is the adjoint of S: ⟨Sx; y⟩ = ⟨x; S*y⟩ for every x, y ∈ H. Actually,

R(S) ⊆ span{f_i}_{i=1}^k ⊆ N(λ̄I − T*)   and   R(S*) ⊆ span{e_i}_{i=1}^k ⊆ N(λI − T),
so that S, S* ∈ B0[H], and hence T + S and T* + S* lie in B∞[H] by Theorem 4.53 (for B0[H] ⊆ B∞[H]). First suppose that m ≤ n (and so k = m). If x is a vector in N(λI − (T + S)), then (λI − T)x = Sx. But R(S) ⊆ N(λ̄I − T*) = R(λI − T)^⊥ (Proposition 5.76), and hence (λI − T)x = Sx = 0. Then x ∈ N(λI − T) = span{e_i}_{i=1}^m, so that x = Σ_{i=1}^m a_i e_i (for some family of scalars {a_i}_{i=1}^m), and therefore 0 = Sx = Σ_{j=1}^m a_j Se_j = Σ_{j=1}^m a_j Σ_{i=1}^m⟨e_j; e_i⟩f_i = Σ_{i=1}^m a_i f_i, which implies that a_i = 0 for every i = 1, …, m (reason: {f_i}_{i=1}^m is an orthonormal set, thus linearly independent; Proposition 5.34). That is, x = 0. Outcome: N(λI − (T + S)) = {0}. Hence λ ∈ ρ(T + S) according to Corollary 6.31 (once T + S ∈ B∞[H] and λ ≠ 0). Conclusion:

m ≤ n   implies   R(λI − (T + S)) = H.

Dually, using exactly the same argument,

n ≤ m   implies   R(λ̄I − (T* + S*)) = H.

If m < n, then k = m < m + 1 ≤ n, and f_{m+1} ∈ R(λI − (T + S)) = H, so that there exists v ∈ H for which (λI − (T + S))v = f_{m+1}. Hence

1 = ⟨f_{m+1}; f_{m+1}⟩ = ⟨(λI − (T + S))v; f_{m+1}⟩ = ⟨(λI − T)v; f_{m+1}⟩ − ⟨Sv; f_{m+1}⟩ = 0,

which is a contradiction. Indeed, ⟨(λI − T)v; f_{m+1}⟩ = ⟨Sv; f_{m+1}⟩ = 0 for f_{m+1} ∈ N(λ̄I − T*) = R(λI − T)^⊥ and Sv ∈ R(S) ⊆ span{f_i}_{i=1}^m. If n < m, then k = n < n + 1 ≤ m, and e_{n+1} ∈ R(λ̄I − (T* + S*)) = H, so that there exists u ∈ H for which (λ̄I − (T* + S*))u = e_{n+1}. Hence

1 = ⟨e_{n+1}; e_{n+1}⟩ = ⟨(λ̄I − (T* + S*))u; e_{n+1}⟩ = ⟨(λ̄I − T*)u; e_{n+1}⟩ − ⟨S*u; e_{n+1}⟩ = 0,

which is a contradiction too (for e_{n+1} ∈ N(λI − T) = R(λ̄I − T*)^⊥ and S*u ∈ R(S*) ⊆ span{e_i}_{i=1}^n). Therefore, m = n. □

Together, the statements of Propositions 6.29 and 6.35 are sometimes referred to as the Fredholm Alternative.
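In finite dimensions the equality of the two kernel dimensions in Proposition 6.35 can be checked directly with an SVD-based nullity count. A NumPy sketch (an added illustration; the matrix is a random complex one with nullity forced to equal 2):

```python
import numpy as np

def nullity(M, tol=1e-10):
    """dim N(M), counted as the number of negligible singular values."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s < tol))

rng = np.random.default_rng(1)
n = 8
X = rng.standard_normal((n, n - 2)) + 1j * rng.standard_normal((n, n - 2))
Y = rng.standard_normal((n - 2, n)) + 1j * rng.standard_normal((n - 2, n))
A = X @ Y                              # square matrix of rank n - 2

# dim N(A) and dim N(A*) agree (both equal 2 here).
print(nullity(A), nullity(A.conj().T))
```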
6.7
The Spectral Theorem for Compact Normal Operators
Throughout this section H ≠ {0} is a complex Hilbert space. Let {λ_γ}_{γ∈Γ} be a bounded family of complex numbers, let {P_γ}_{γ∈Γ} be a resolution of the identity on
H, and let T ∈ B[H] be a (bounded) weighted sum of projections (cf. Definition 5.60 and Proposition 5.61):

Tx = Σ_{γ∈Γ} λ_γ P_γ x   for every   x ∈ H.
Proposition 6.36. Every weighted sum of projections is normal.

Proof. Note that {λ̄_γ}_{γ∈Γ} is a bounded family of complex numbers, and consider the weighted sum of projections T* ∈ B[H] given by

T*x = Σ_{γ∈Γ} λ̄_γ P_γ x   for every   x ∈ H.

Since each P_γ is self-adjoint (Proposition 5.81), it is readily verified that T* is, in fact, the adjoint of T ∈ B[H]. Indeed, take x = Σ_{γ∈Γ}P_γ x and y = Σ_{γ∈Γ}P_γ y arbitrary in H (recall: {P_γ}_{γ∈Γ} is a resolution of the identity on H). Therefore, as R(P_α) ⊥ R(P_β) whenever α ≠ β,

⟨Tx; y⟩ = ⟨Σ_{α∈Γ}λ_α P_α x; Σ_{β∈Γ}P_β y⟩ = Σ_{α∈Γ}Σ_{β∈Γ}λ_α⟨P_α x; P_β y⟩ = Σ_{γ∈Γ}λ_γ⟨P_γ x; P_γ y⟩ = Σ_{β∈Γ}Σ_{α∈Γ}⟨P_β x; λ̄_α P_α y⟩ = ⟨Σ_{β∈Γ}P_β x; Σ_{α∈Γ}λ̄_α P_α y⟩ = ⟨x; T*y⟩.

Moreover, since P_γ² = P_γ for all γ and P_α P_β = P_β P_α = O if α ≠ β,

T*Tx = Σ_{α∈Γ}λ̄_α P_α Σ_{β∈Γ}λ_β P_β x = Σ_{α∈Γ}Σ_{β∈Γ}λ̄_α λ_β P_α P_β x = Σ_{γ∈Γ}|λ_γ|²P_γ x = Σ_{α∈Γ}Σ_{β∈Γ}λ_α λ̄_β P_α P_β x = Σ_{α∈Γ}λ_α P_α Σ_{β∈Γ}λ̄_β P_β x = TT*x

for every x ∈ H. That is, T is normal.
□
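Proposition 6.36 can be sanity-checked in NumPy: build a resolution of the identity {P_γ} from the column blocks of a random unitary matrix, form T = Σ_γ λ_γ P_γ, and verify T*T = TT*. (An added illustration; the dimension, block sizes, and weights are arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 9
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Resolution of the identity: orthogonal projections onto three orthogonal subspaces.
blocks = [Q[:, 0:2], Q[:, 2:5], Q[:, 5:9]]
P = [B @ B.conj().T for B in blocks]
weights = [1.5, -2.0 + 1.0j, 0.25j]

T = sum(w * Pg for w, Pg in zip(weights, P))
T_star = T.conj().T

print(np.allclose(sum(P), np.eye(n)))        # the P_gamma sum to the identity
print(np.allclose(T @ T_star, T_star @ T))   # T is normal
```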
Particular cases: diagonal operators and, more generally, diagonalizable operators on a separable Hilbert space (defined in Problem 5.17) are normal operators. In fact, the concept of a weighted sum of projections on any Hilbert space can be thought of as a generalization of the concept of a diagonalizable operator on a separable Hilbert space. The next proposition shows that such a generalization preserves the spectral properties (compare with Example 6C).

Proposition 6.37. If T ∈ B[H] is a weighted sum of projections, then

σ_P(T) = {λ ∈ C: λ = λ_γ for some γ ∈ Γ},   σ_R(T) = ∅,

and

σ_C(T) = {λ ∈ C: λ ≠ λ_γ for all γ ∈ Γ and inf_{γ∈Γ}|λ − λ_γ| = 0}.
Proof. Take any x = Σ_{γ∈Γ}P_γ x in H (recall that {P_γ}_{γ∈Γ} is a resolution of the identity on H), so that ‖x‖² = Σ_{γ∈Γ}‖P_γ x‖² by Theorem 5.32. Thus, for any λ ∈ C, (λI − T)x = Σ_{γ∈Γ}(λ − λ_γ)P_γ x, so that ‖(λI − T)x‖² = Σ_{γ∈Γ}|λ − λ_γ|²‖P_γ x‖² (cf. Theorem 5.32 again). If N(λI − T) ≠ {0}, then there exists x ≠ 0 in H such that (λI − T)x = 0. Hence Σ_{γ∈Γ}‖P_γ x‖² ≠ 0 and Σ_{γ∈Γ}|λ − λ_γ|²‖P_γ x‖² = 0, which implies that ‖P_α x‖ ≠ 0 for some α ∈ Γ and |λ − λ_α|‖P_α x‖ = 0. Therefore, λ = λ_α. Conversely, take any α ∈ Γ and an arbitrary nonzero vector x in R(P_α) (recall: P_γ ≠ O, and so R(P_γ) ≠ {0}, for every γ ∈ Γ). But R(P_α) ⊥ R(P_γ) for α ≠ γ, so that R(P_α) ⊥ ∪_{α≠γ∈Γ}R(P_γ). Hence R(P_α) ⊆ (∪_{α≠γ∈Γ}R(P_γ))^⊥ = ∩_{α≠γ∈Γ}R(P_γ)^⊥ = ∩_{α≠γ∈Γ}N(P_γ) (cf. Problem 5.8(a) and Propositions 5.76(a) and 5.81(b)). Therefore x ∈ N(P_γ) for every α ≠ γ ∈ Γ, which implies that ‖(λ_α I − T)x‖² = Σ_{γ∈Γ}|λ_α − λ_γ|²‖P_γ x‖² = 0, and so N(λ_α I − T) ≠ {0}. Conclusion: N(λI − T) ≠ {0} if and only if λ = λ_α for some α ∈ Γ. In other words,

σ_P(T) = {λ ∈ C: λ = λ_γ for some γ ∈ Γ}.

We have just seen that N(λI − T) = {0} if and only if λ ≠ λ_γ for all γ ∈ Γ. In this case (i.e., if (λI − T) is injective), there exists (λI − T)^{-1} in L[R(λI − T), H], a weighted sum of projections on R(λI − T):

(λI − T)^{-1}x = Σ_{γ∈Γ}(λ − λ_γ)^{-1}P_γ x   for every   x ∈ R(λI − T).

Indeed, if λ ≠ λ_γ for all γ ∈ Γ, then Σ_{α∈Γ}(λ − λ_α)^{-1}P_α Σ_{β∈Γ}(λ − λ_β)P_β x = Σ_{α∈Γ}Σ_{β∈Γ}(λ − λ_α)^{-1}(λ − λ_β)P_α P_β x = Σ_{γ∈Γ}P_γ x = x for every x in H. According to Proposition 5.61, (λI − T)^{-1} ∈ B[H] if and only if λ ≠ λ_γ for all γ ∈ Γ and sup_{γ∈Γ}|λ − λ_γ|^{-1} < ∞. Equivalently, if and only if inf_{γ∈Γ}|λ − λ_γ| > 0. In other words,

ρ(T) = {λ ∈ C: inf_{γ∈Γ}|λ − λ_γ| > 0}.

But T is normal by Proposition 6.36, so that σ_R(T) = ∅ (Corollary 6.18), and hence

σ_C(T) = σ(T)\σ_P(T). □
Proposition 6.38. A weighted sum of projections T ∈ B[H] is compact if and only if the following triple condition holds: σ(T) is countable, 0 is the only possible accumulation point of σ(T), and dim R(P_γ) < ∞ for every γ such that λ_γ ≠ 0.

Proof. Let T ∈ B[H] be a weighted sum of projections.

Claim. R(P_γ) ⊆ N(λ_γ I − T) for every γ.

Proof. Take an arbitrary γ. If x ∈ R(P_γ), then x = P_γ x (Problem 1.4), so that Tx = TP_γ x = Σ_α λ_α P_α P_γ x = λ_γ P_γ x = λ_γ x (for P_α ⊥ P_γ whenever γ ≠ α), and hence x ∈ N(λ_γ I − T). □
If T is compact, then σ(T) is countable and 0 is the only possible accumulation point of σ(T) (Corollary 6.34), and dim N(λI − T) < ∞ whenever λ ≠ 0 (Proposition 6.35), so that dim R(P_γ) < ∞ for every γ such that λ_γ ≠ 0 by the above claim. Conversely, if T = 0, then T is trivially compact. Thus suppose T ≠ 0. Since T is normal (Proposition 6.36), r(T) > 0 (reason: the unique normal operator with a null spectral radius is the null operator — see the remark that precedes Corollary 6.28) so that there exists λ ≠ 0 in σ_P(T) by Corollary 6.31. If σ(T) is countable, then let {λ_k} be any enumeration of the countable set σ_P(T)\{0} = σ(T)\{0}. Hence

Tx = Σ_k λ_k P_k x   for every   x ∈ H

(Proposition 6.37), where {P_k} is included in a resolution of the identity on H (which is itself a resolution of the identity on H if 0 ∉ σ_P(T)). If {λ_k} is finite, say {λ_k} = {λ_k}_{k=1}^m, then R(T) = Σ_{k=1}^m R(P_k). If dim R(P_k) < ∞ for every k, then dim(Σ_{k=1}^m R(P_k)) < ∞ (according to Problem 5.11), and so T ∈ B₀[H] ⊆ B∞[H]. Now suppose {λ_k} is countably infinite. Since σ(T) is compact (Corollary 6.12), it follows by Theorem 3.80 and Proposition 3.77 that {λ_k} has an accumulation point in σ(T). If 0 is the only possible accumulation point of σ(T), then 0 is the unique accumulation point of {λ_k}. Therefore, for each integer n ≥ 1 consider the partition {λ_k} = {λ_k′} ∪ {λ_k″}, where |λ_k′| ≥ 1/n and |λ_k″| < 1/n. Note that {λ_k′} is a finite subset of σ(T) (it has no accumulation point), and hence {λ_k″} is an infinite subset of σ(T). Set

T_n = Σ_{k′} λ_k′ P_k′ ∈ B[H]   for each   n ≥ 1.

We have just seen that dim R(P_k′) < ∞ for every k′, and hence dim R(T_n) < ∞. That is, T_n ∈ B₀[H] for every n ≥ 1. Moreover, as P_j ⊥ P_k whenever j ≠ k, we get (cf. Corollary 5.9)

‖(T − T_n)x‖² = ‖Σ_{k″} λ_k″ P_k″ x‖² ≤ sup_{k″}|λ_k″|² Σ_{k″} ‖P_k″ x‖² ≤ (1/n²)‖x‖²

for all x ∈ H, so that T_n → T uniformly. Hence T ∈ B∞[H] by Definition 4.45. □
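The finite-rank approximation in the converse half of the proof can be illustrated numerically. The model below is an assumption of this sketch, not from the text: the compact diagonal operator diag(1, 1/2, 1/3, ...) truncated to its first 200 coordinates, with T_n keeping the finitely many weights of modulus at least 1/n, so that ‖T − T_n‖ < 1/n in the operator norm.

```python
import numpy as np

# Truncated model of the compact diagonal operator diag(1, 1/2, 1/3, ...).
n_dims = 200
lams = 1.0 / np.arange(1, n_dims + 1)
T = np.diag(lams)

def T_n(n):
    # Finite-rank piece: keep only the weights with |lambda_k| >= 1/n.
    kept = np.where(np.abs(lams) >= 1.0 / n, lams, 0.0)
    return np.diag(kept)

# The discarded weights all have modulus < 1/n, so ||T - T_n|| < 1/n.
for n in [2, 5, 10, 50]:
    assert np.linalg.norm(T - T_n(n), ord=2) < 1.0 / n
```

For this diagonal model the error norm is exactly the largest discarded weight (e.g. 1/(n+1) when n ≤ 200), which is the sup_{k″}|λ_k″| bound of the proof.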
Before considering the Spectral Theorem for compact normal operators we need a few spectral properties of normal operators.
Proposition 6.39. If T ∈ B[H] is normal, then

N(λI − T) = N(λ̄I − T*)   for every   λ ∈ ℂ.

Proof. Take an arbitrary λ ∈ ℂ. If T is normal, then (λI − T) is normal (cf. proof of Corollary 6.18) so that ‖(λ̄I − T*)x‖ = ‖(λI − T)x‖ for every x ∈ H by Proposition 6.1(b). □
6. The Spectral Theorem
Proposition 6.40. Take λ, μ ∈ ℂ. If T ∈ B[H] is normal, then

N(λI − T) ⊥ N(μI − T)   whenever   λ ≠ μ.

Proof. Suppose x ∈ N(λI − T) and y ∈ N(μI − T) so that λx = Tx and μy = Ty. Since N(λI − T) = N(λ̄I − T*) by the previous proposition, it follows that λ̄x = T*x. Thus ⟨μy ; x⟩ = ⟨Ty ; x⟩ = ⟨y ; T*x⟩ = ⟨y ; λ̄x⟩ = ⟨λy ; x⟩, and hence (μ − λ)⟨y ; x⟩ = 0, which implies that ⟨y ; x⟩ = 0 whenever μ ≠ λ. □
Proposition 6.41. If T ∈ B[H] is normal, then N(λI − T) reduces T for every λ ∈ ℂ.

Proof. Take an arbitrary λ ∈ ℂ and an arbitrary T ∈ B[H]. Recall that N(λI − T) is a subspace of H (Proposition 4.13). Moreover, it is clear that N(λI − T) is T-invariant (if Tx = λx, then T(Tx) = λ(Tx)). Similarly, N(λ̄I − T*) is T*-invariant. Now suppose T ∈ B[H] is a normal operator. Proposition 6.39 says that N(λI − T) = N(λ̄I − T*), and hence N(λI − T) also is T*-invariant. Then N(λI − T) reduces T according to Corollary 5.75. □

Corollary 6.42. If {λ_γ}_{γ∈Γ} is any (nonempty) family of distinct scalars, and if T ∈ B[H] is a normal operator, then the (topological) sum (Σ_{γ∈Γ} N(λ_γ I − T))⁻ reduces T.

Proof. For each γ ∈ Γ ≠ ∅ write N_γ = N(λ_γ I − T), which is a subspace of H (Proposition 4.13). According to Proposition 6.40, {N_γ}_{γ∈Γ} is a family of pairwise orthogonal subspaces of H. Take an arbitrary x ∈ (Σ_{γ∈Γ} N_γ)⁻. If Γ is finite, then (Σ_{γ∈Γ} N_γ)⁻ = Σ_{γ∈Γ} N_γ (Corollary 5.11); otherwise apply the Orthogonal Structure Theorem (i.e., Theorem 5.16 if Γ is countably infinite or Problem 5.10 if Γ is uncountable). In any case (finite, countably infinite or uncountable Γ), x = Σ_{γ∈Γ} u_γ with each u_γ in N_γ. Moreover, for each γ ∈ Γ, Tu_γ and T*u_γ lie in N_γ because N_γ reduces T by Proposition 6.41 (cf. Corollary 5.75). Thus, since T and T* are linear and continuous, it follows that Tx = Σ_{γ∈Γ} Tu_γ ∈ (Σ_{γ∈Γ} N_γ)⁻ and T*x = Σ_{γ∈Γ} T*u_γ ∈ (Σ_{γ∈Γ} N_γ)⁻. Therefore, (Σ_{γ∈Γ} N_γ)⁻ reduces T (cf. Corollary 5.75 again). □
Every (bounded) weighted sum of projections is normal (Proposition 6.36), and every compact weighted sum of projections has a countable set of distinct eigenvalues (Propositions 6.37 and 6.38). The Spectral Theorem for compact normal operators ensures the converse.
Theorem 6.43. (The Spectral Theorem). If T ∈ B[H] is compact and normal, then there exists a countable resolution of the identity {P_k} on H and a (similarly indexed) bounded set of scalars {λ_k} such that

T = Σ_k λ_k P_k,
where {λ_k} = σ_P(T), the set of all (distinct) eigenvalues of T, and each P_k is the orthogonal projection onto the eigenspace N(λ_k I − T). Moreover, if the above countable weighted sum of projections is infinite, then it converges in the (uniform) topology of B[H].

Proof. If T is compact and normal, then it has a nonempty point spectrum (Corollary 6.32) and its eigenspaces span H. In other words,

Claim. (Σ_{λ∈σ_P(T)} N(λI − T))⁻ = H.

Proof. Set M = (Σ_{λ∈σ_P(T)} N(λI − T))⁻, which is a subspace of H. Suppose M ≠ H so that M⊥ ≠ {0} (Proposition 5.15). Consider the restriction T|_M⊥ of T to M⊥. If T is normal, then M reduces T (Corollary 6.42) so that M⊥ is T-invariant, and hence T|_M⊥ ∈ B[M⊥] is normal (cf. Problem 6.17). If T is compact, then T|_M⊥ is compact (see Section 4.9). Thus T|_M⊥ is a compact normal operator on the Hilbert space M⊥ ≠ {0}, and therefore σ_P(T|_M⊥) ≠ ∅ by Corollary 6.32. That is, there exist λ ∈ ℂ and 0 ≠ x ∈ M⊥ such that T|_M⊥ x = λx and so Tx = λx. Hence λ ∈ σ_P(T) and x ∈ N(λI − T) ⊆ M. But this leads to a contradiction, viz., 0 ≠ x ∈ M ∩ M⊥ = {0}. Outcome: M = H. □

Since T is compact, the nonempty set σ_P(T) is countable (Corollary 6.34) and bounded (for T ∈ B[H]). Then write σ_P(T) = {λ_k}_{k∈N}, where {λ_k}_{k∈N} is a finite or infinite sequence of distinct elements of ℂ consisting of all eigenvalues of T. Here, either N = {1, ..., m} for some m ∈ ℕ if σ_P(T) is finite, or N = ℕ if σ_P(T) is (countably) infinite. Recall that each N(λ_k I − T) is a subspace of H (Proposition 4.13). Moreover, since T is normal, Proposition 6.40 says that N(λ_k I − T) ⊥ N(λ_j I − T) whenever k ≠ j. Therefore, {N(λ_k I − T)}_{k∈N} is a sequence of orthogonal subspaces of H such that H = (Σ_{k∈N} N(λ_k I − T))⁻ by the above claim. Then the sequence {P_k}_{k∈N} consisting of the orthogonal projections onto each N(λ_k I − T) is a resolution of the identity on H (see Theorem 5.59). This implies that x = Σ_{k∈N} P_k x and, since T is linear and continuous, Tx = Σ_{k∈N} T P_k x for every x ∈ H. But P_k x ∈ R(P_k) = N(λ_k I − T), and so T P_k x = λ_k P_k x, for each k ∈ N and every x ∈ H. Hence

Tx = Σ_{k∈N} λ_k P_k x   for every   x ∈ H.

Conclusion: T is a countable weighted sum of projections. If N is finite, then the theorem is proved. Now suppose N is infinite (i.e., N = ℕ). In this case, the above identity says that Σ_{k=1}^n λ_k P_k → T strongly (see the observation that follows the proof of Proposition 5.61). We show next that the above convergence actually is uniform.
Indeed, for any n ∈ ℕ,

‖(T − Σ_{k=1}^n λ_k P_k)x‖² = ‖Σ_{k=n+1}^∞ λ_k P_k x‖² = Σ_{k=n+1}^∞ |λ_k|² ‖P_k x‖² ≤ sup_{k>n} |λ_k|² Σ_{k=n+1}^∞ ‖P_k x‖² ≤ sup_{k>n} |λ_k|² ‖x‖².

(Reason: R(P_j) ⊥ R(P_k) whenever j ≠ k and x = Σ_{k=1}^∞ P_k x so that ‖x‖² = Σ_{k=1}^∞ ‖P_k x‖² — see Corollary 5.9.) Hence

0 ≤ ‖T − Σ_{k=1}^n λ_k P_k‖ = sup_{‖x‖=1} ‖(T − Σ_{k=1}^n λ_k P_k)x‖ ≤ sup_{k>n} |λ_k|

for all n ∈ ℕ. Since T is compact and {λ_n}_{n=1}^∞ is a sequence of distinct elements in σ(T), it follows by Proposition 6.33 that λ_n → 0. Therefore lim_n sup_{k>n} |λ_k| = lim sup_n |λ_n| = 0, and so Σ_{k=1}^n λ_k P_k → T uniformly. □

In other words, if T ∈ B[H] is compact and normal, then the family of orthogonal projections {P_λ}_{λ∈σ_P(T)} onto each eigenspace N(λI − T) is a resolution of the identity on H, and T is a weighted sum of projections:

T = Σ_{λ∈σ_P(T)} λ P_λ.
This was naturally identified in Problem 5.16 with an orthogonal sum of scalar operators ⊕_{λ∈σ_P(T)} λI_λ, where I_λ = P_λ|_{R(P_λ)}. Here R(P_λ) = N(λI − T). Under such a natural identification we also write

T = ⊕_{λ∈σ_P(T)} λI_λ.
These representations are referred to as the spectral decomposition of a compact normal operator T. The next result states the Spectral Theorem for compact normal operators in terms of an orthonormal basis for N(T)⊥ consisting of eigenvectors of T.

Corollary 6.44. Let T ∈ B[H] be compact and normal.

(a) For each λ ∈ σ_P(T)\{0} there exists a finite orthonormal basis {e_k(λ)}_{k=1}^{n_λ} for N(λI − T) consisting entirely of eigenvectors of T,

(b) {e_k} = ∪_{λ∈σ_P(T)\{0}} {e_k(λ)}_{k=1}^{n_λ} is a countable orthonormal basis for N(T)⊥ made up of eigenvectors of T, and

(c) Tx = Σ_{λ∈σ_P(T)\{0}} λ Σ_{k=1}^{n_λ} ⟨x ; e_k(λ)⟩ e_k(λ) for every x ∈ H, so that
(d) Tx = Σ_k μ_k ⟨x ; e_k⟩ e_k for every x ∈ H, where {μ_k} is a sequence containing all nonzero eigenvalues of T, finitely repeated according to the multiplicity of the respective eigenspace.

Proof. We have already seen that σ_P(T) is nonempty and countable (cf. proof of the previous theorem). Recall that σ_P(T) = {0} if and only if T = 0 (Corollary 6.32) or, equivalently, if and only if N(T)⊥ = {0} (i.e., N(T) = H). If T = 0 (i.e., T = 0I), then the above assertions hold trivially (σ_P(T)\{0} = ∅, {e_k} = ∅, N(T)⊥ = {0} and Tx = 0x = 0 for every x ∈ H, because the empty sum is null). Thus suppose T ≠ 0 (so that N(T)⊥ ≠ {0}), and take an arbitrary λ ≠ 0 in σ_P(T). According to Proposition 6.35, dim N(λI − T) is finite, say, dim N(λI − T) = n_λ for some positive integer n_λ. This implies the existence of a finite orthonormal basis {e_k(λ)}_{k=1}^{n_λ} for the Hilbert space N(λI − T) ≠ {0} (cf. Proposition 5.39). Observe that e_k(λ) is an eigenvector of T for each k = 1, ..., n_λ (because 0 ≠ e_k(λ) ∈ N(λI − T)).

Claim. ∪_{λ∈σ_P(T)\{0}} {e_k(λ)}_{k=1}^{n_λ} is an orthonormal basis for N(T)⊥.

Proof. We know (cf. Claim in the proof of Theorem 6.43) that

(Σ_{λ∈σ_P(T)} N(λI − T))⁻ = H.

Therefore, according to Problem 5.8(b,d,e),

N(T) = ∩_{λ∈σ_P(T)\{0}} N(λI − T)⊥ = (Σ_{λ∈σ_P(T)\{0}} N(λI − T))⊥

(because {N(λI − T)}_{λ∈σ_P(T)} is a nonempty family of orthogonal subspaces of H — Proposition 6.40). Hence

N(T)⊥ = (Σ_{λ∈σ_P(T)\{0}} N(λI − T))⁻

(Proposition 5.15), and the claimed result follows by part (a), Proposition 6.40, and Problem 5.11. □

Note that {e_k} = ∪_{λ∈σ_P(T)\{0}} {e_k(λ)}_{k=1}^{n_λ} is countable by Corollary 1.11. Finally, consider the decomposition H = N(T) + N(T)⊥ of Theorem 5.20, and take an arbitrary x ∈ H so that x = u + v with u ∈ N(T) and v ∈ N(T)⊥. Consider the Fourier series expansion

v = Σ_k ⟨v ; e_k⟩ e_k = Σ_{λ∈σ_P(T)\{0}} Σ_{k=1}^{n_λ} ⟨v ; e_k(λ)⟩ e_k(λ)
(cf. Theorem 5.48) of v in terms of the orthonormal basis

{e_k} = ∪_{λ∈σ_P(T)\{0}} {e_k(λ)}_{k=1}^{n_λ}

for the Hilbert space N(T)⊥ ≠ {0}. Since T is linear and continuous, and since Te_k(λ) = λe_k(λ) for each k = 1, ..., n_λ and each λ ∈ σ_P(T)\{0}, it follows that

Tx = Tu + Tv = Tv = Σ_{λ∈σ_P(T)\{0}} Σ_{k=1}^{n_λ} ⟨v ; e_k(λ)⟩ Te_k(λ) = Σ_{λ∈σ_P(T)\{0}} λ Σ_{k=1}^{n_λ} ⟨v ; e_k(λ)⟩ e_k(λ).

However, ⟨x ; e_k(λ)⟩ = ⟨u ; e_k(λ)⟩ + ⟨v ; e_k(λ)⟩ = ⟨v ; e_k(λ)⟩ because u ∈ N(T) and e_k(λ) ∈ N(T)⊥. □
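Corollary 6.44(d) can be tested numerically in finite dimensions, where every operator is compact. The sketch below uses hypothetical data (a random 5×5 normal matrix built as Q diag(d) Q* with Q unitary, so the columns of Q are orthonormal eigenvectors) and expands Tx along that eigenvector basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # hypothetical dimension; on C^n every operator is compact

# A random normal matrix T = Q diag(d) Q* with Q unitary: the columns of Q
# form an orthonormal basis of eigenvectors e_k with eigenvalues mu_k = d_k.
d = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
T = Q @ np.diag(d) @ Q.conj().T
assert np.allclose(T @ T.conj().T, T.conj().T @ T)  # T is normal

# Corollary 6.44(d): T x = sum_k mu_k <x ; e_k> e_k.
# np.vdot(e, x) conjugates its first argument, i.e. it computes <x ; e>
# in the book's convention (linear in x).
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
expansion = sum(mu * np.vdot(e, x) * e for mu, e in zip(d, Q.T))
assert np.allclose(expansion, T @ x)
```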
Remark: If T ∈ B[H] is compact and normal, and if H is nonseparable, then 0 ∈ σ_P(T) and N(T) is nonseparable. Indeed, for T = 0 the italicized result is trivial (T = 0 implies 0 ∈ σ_P(T) and N(T) = H). On the other hand, if T ≠ 0, then N(T)⊥ ≠ {0} is separable because it has a countable orthonormal basis {e_k} (Theorem 5.44 and Corollary 6.44). If N(T) is separable, then it also has a countable orthonormal basis, say {f_k}, and hence {e_k} ∪ {f_k} is a countable orthonormal basis for H = N(T) + N(T)⊥ (Problem 5.11) so that H is separable. Moreover, if 0 ∉ σ_P(T), then N(T) = {0}, and therefore H = N(T)⊥ is separable.

N(T) reduces T (Proposition 6.41), and hence T = T|_{N(T)⊥} ⊕ O. By Problem 5.17 and Corollary 6.44(d), if T ∈ B[H] is compact and normal, then T|_{N(T)⊥} ∈ B[N(T)⊥] is diagonalizable. Precisely, T|_{N(T)⊥} is a diagonal operator with respect to the orthonormal basis {e_k} for the separable Hilbert space N(T)⊥. Generalizing: An operator T ∈ B[H] (not necessarily compact) acting on any Hilbert space H (not necessarily separable) is diagonalizable if there exist a resolution of the identity {P_γ}_{γ∈Γ} on H and a bounded family of scalars {λ_γ}_{γ∈Γ} such that

Tu = λ_γ u   whenever   u ∈ R(P_γ).
Take an arbitrary x = Σ_{γ∈Γ} P_γ x in H. Since T is linear and continuous, Tx = Σ_{γ∈Γ} T P_γ x = Σ_{γ∈Γ} λ_γ P_γ x so that T is a weighted sum of projections (which is normal by Proposition 6.36). Thus we write (cf. Problem 5.16)

T = Σ_{γ∈Γ} λ_γ P_γ   or   T = ⊕_{γ∈Γ} λ_γ P_γ.
Conversely, if T is a weighted sum of projections (Tx = Σ_{γ∈Γ} λ_γ P_γ x for every x ∈ H), then Tu = Σ_{γ∈Γ} λ_γ P_γ P_α u = λ_α u for every u ∈ R(P_α) (since P_γ P_α = O whenever γ ≠ α and u = P_α u whenever u ∈ R(P_α)), and hence T is diagonalizable. Outcome: An operator T on H is diagonalizable if and only if it is a weighted sum of projections for some bounded family of scalars {λ_γ}_{γ∈Γ} and some resolution of the identity {P_γ}_{γ∈Γ} on H. In this case, {P_γ}_{γ∈Γ} is said to diagonalize T.
Corollary 6.45. If T ∈ B[H] is compact, then T is normal if and only if T is diagonalizable. Let {P_k} be a resolution of the identity on H that diagonalizes a compact and normal operator T ∈ B[H] into its spectral decomposition, and take any operator S ∈ B[H]. The following assertions are pairwise equivalent.

(a) S commutes with T and with T*.

(b) R(P_k) reduces S for every k.

(c) S commutes with every P_k.
Proof. Let T ∈ B[H] be any compact operator. If T is normal, then the Spectral Theorem ensures that it is diagonalizable. The converse is trivial since every diagonalizable operator is normal. Now suppose T is compact and normal so that

T = Σ_k λ_k P_k,

where {P_k} is a resolution of the identity on H and {λ_k} = σ_P(T) is the set of all (distinct) eigenvalues of T (Theorem 6.43). Recall from the proof of Proposition 6.36 that

T* = Σ_k λ̄_k P_k.

Take any λ ∈ ℂ. If S commutes with T and with T*, then (λI − T) commutes with S and with S*, so that N(λI − T) is an invariant subspace for both S and S* (Problem 4.20(c)). Hence N(λI − T) reduces S (Corollary 5.75), which means that S commutes with the orthogonal projection onto N(λI − T) (cf. observation that precedes Proposition 5.74). In particular, since R(P_k) = N(λ_k I − T) for each k (Theorem 6.43), R(P_k) reduces S for every k, which means that S commutes with every P_k. Then (a)⇒(b)⇔(c). It is readily verified that (c)⇒(a). Indeed, if SP_k = P_k S for every k, then ST = Σ_k λ_k SP_k = Σ_k λ_k P_k S = TS and ST* = Σ_k λ̄_k SP_k = Σ_k λ̄_k P_k S = T*S (recall that S is linear and continuous). □
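The equivalences (a)–(c) of Corollary 6.45 can be probed on a small example. The sketch below is a hypothetical 3-dimensional case (T = 2P₁ + 3P₂, with a two-dimensional and a one-dimensional eigenspace): an operator commuting with each spectral projection commutes with T and T*, while an operator mixing the eigenspaces does not.

```python
import numpy as np

# Hypothetical model: T = 2*P1 + 3*P2 on C^3, eigenspaces R(P1), R(P2).
P1 = np.diag([1.0, 1.0, 0.0])
P2 = np.diag([0.0, 0.0, 1.0])
T = 2.0 * P1 + 3.0 * P2

# (c) => (a): any S commuting with P1 and P2 (i.e. block-diagonal with
# respect to the eigenspaces) commutes with T and with T*.
S = np.array([[1.0, 5.0, 0.0],
              [-2.0, 0.5, 0.0],
              [0.0, 0.0, 4.0]])
assert np.allclose(S @ P1, P1 @ S) and np.allclose(S @ P2, P2 @ S)
assert np.allclose(S @ T, T @ S)

# An operator mixing R(P1) and R(P2) commutes with neither P2 nor T.
R = np.zeros((3, 3))
R[0, 2] = 1.0
assert not np.allclose(R @ P2, P2 @ R)
assert not np.allclose(R @ T, T @ R)
```

Since T here is real and self-adjoint, commuting with T is the same as commuting with T*, which is why only one commutation check is needed for S.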
6.8 A Glimpse at the Spectral Theorem for Normal Operators

What is the role played by compact operators in the Spectral Theorem? First note that, if T is compact, then its spectrum (and so its point spectrum) is countable. But this is
not crucial once we know how to deal with uncountable sums. In particular, we know how to deal with an uncountable weighted sum of projections Tx = Σ_{γ∈Γ} λ_γ P_γ x (recall that, even in this case, the above sum has only a countable number of nonzero vectors for each x). What really brings a compact operator into play is that a compact normal operator has a nonempty point spectrum and, more than that, it has enough eigenspaces to span H (see the claim in the proof of Theorem 6.43). That makes the difference, for a normal (noncompact) operator may have an empty point spectrum (witness: a bilateral shift) or it may have eigenspaces but not enough to span the whole space H (sample: an orthogonal direct sum of a bilateral shift with an identity).
However, the Spectral Theorem survives the lack of compactness if the point spectrum is replaced with the full spectrum (which is never empty). But this has a price: a suitable statement of the Spectral Theorem for plain normal operators requires some knowledge of measure theory, and a proper proof requires a sound knowledge of it. We shall not prove the two fundamental theorems of this final section. Instead, we just state them, and verify some of their basic consequences. Thus we assume here (and only here) that the reader has, at least, some familiarity with measure theory in order to grasp the definition of spectral measure and, therefore, the statement of the Spectral Theorem. Operators will be acting on complex Hilbert spaces H ≠ {0} or K ≠ {0}.

Definition 6.46. Let Ω be a set in the complex plane ℂ and let Σ_Ω be the σ-algebra of Borel sets in Ω. A (complex) spectral measure in a (complex) Hilbert space H is a mapping P: Σ_Ω → B[H] such that

(a) P(Δ) is an orthogonal projection for every Δ ∈ Σ_Ω,

(b) P(∅) = O and P(Ω) = I,

(c) P(Δ₁ ∩ Δ₂) = P(Δ₁)P(Δ₂) for every Δ₁, Δ₂ ∈ Σ_Ω,

(d) if {Δ_k} is a countable collection of pairwise disjoint sets in Σ_Ω, then

P(∪_k Δ_k) = Σ_k P(Δ_k).
If {Δ_k}_{k∈ℕ} is a countably infinite collection of pairwise disjoint sets in Σ_Ω, then the above identity means convergence in the strong topology:

Σ_{k=1}^n P(Δ_k) → P(∪_{k∈ℕ} Δ_k) strongly.

Indeed, since Δ_j ∩ Δ_k = ∅ whenever j ≠ k, it follows by properties (b) and (c) that P(Δ_j)P(Δ_k) = P(Δ_j ∩ Δ_k) = P(∅) = O for j ≠ k, so that {P(Δ_k)}_{k∈ℕ} is an orthogonal sequence of orthogonal projections in B[H]. Then, according to Proposition 5.58, {Σ_{k=1}^n P(Δ_k)} converges strongly to the orthogonal projection in
B[H] onto (Σ_{k∈ℕ} R(P(Δ_k)))⁻ = ⋁(∪_{k∈ℕ} R(P(Δ_k))). Therefore, what property (d) says (in the case of a countably infinite collection of pairwise disjoint Borel sets {Δ_k}_{k∈ℕ}) is that P(∪_{k∈ℕ} Δ_k) coincides with the orthogonal projection in B[H] onto ⋁(∪_{k∈ℕ} R(P(Δ_k))). This generalizes the concept of a resolution of the identity on H. In fact, if {Δ_k}_{k∈ℕ} is a partition of Ω, then the orthogonal sequence of orthogonal projections {P(Δ_k)}_{k∈ℕ} is such that

Σ_{k=1}^n P(Δ_k) → P(∪_{k∈ℕ} Δ_k) = P(Ω) = I strongly.
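In finite dimensions a spectral measure reduces to the assignment Δ ↦ sum of the eigenprojections whose eigenvalue lies in Δ. The sketch below is a hypothetical model (N = diag(1, i, −2), with Borel sets stood in for by predicates on ℂ) checking properties (b) and (c) of Definition 6.46 and the finite analogue of N = ∫λ dP_λ.

```python
import numpy as np

# Hypothetical finite model: normal matrix N = diag(1, i, -2) with
# one-dimensional eigenprojections E[0], E[1], E[2].
lams = np.array([1.0 + 0j, 1j, -2.0 + 0j])
N = np.diag(lams)
E = [np.diag([1.0 + 0j if j == k else 0j for j in range(3)]) for k in range(3)]

def P(delta):
    # delta: a predicate on C standing in for a Borel set.
    return sum((E[k] for k in range(3) if delta(lams[k])),
               np.zeros((3, 3), complex))

Id = np.eye(3)
# (b) P(empty set) = O and P(sigma(N)) = I.
assert np.allclose(P(lambda z: False), 0) and np.allclose(P(lambda z: True), Id)
# (c) P(D1 intersect D2) = P(D1) P(D2).
d1 = lambda z: z.real > 0           # right half-plane
d2 = lambda z: abs(z.imag) < 0.5    # horizontal strip
assert np.allclose(P(lambda z: d1(z) and d2(z)), P(d1) @ P(d2))
# Finite analogue of N = "integral" of lambda dP_lambda.
assert np.allclose(N, sum(lams[k] * E[k] for k in range(3)))
```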
Now take x, y ∈ H and consider the mapping p_{x,y}: Σ_Ω → ℂ defined by

p_{x,y}(Δ) = ⟨P(Δ)x ; y⟩   for every   Δ ∈ Σ_Ω.

p_{x,y} is an ordinary complex-valued countably additive measure on Σ_Ω. Let φ: Ω → ℂ be any bounded Σ_Ω-measurable function. The integral of φ with respect to p_{x,y}, ∫φ(λ) dp_{x,y}, will be denoted by ∫φ(λ) d⟨P_λ x ; y⟩. Moreover, there exists a unique F ∈ B[H] such that

⟨Fx ; y⟩ = ∫φ(λ) d⟨P_λ x ; y⟩   for every   x, y ∈ H.

Indeed, let f: H×H → ℂ be defined by f(x, y) = ∫φ(λ) d⟨P_λ x ; y⟩, and note that |f(x, y)| ≤ ∫|φ(λ)| d⟨P_λ x ; y⟩ ≤ ‖φ‖_∞ ∫d⟨P_λ x ; y⟩ = ‖φ‖_∞ ⟨P(Ω)x ; y⟩ = ‖φ‖_∞ ⟨x ; y⟩ ≤ ‖φ‖_∞ ‖x‖ ‖y‖ for every x, y ∈ H. Outcome: for each y ∈ H the functional f(· , y): H → ℂ, which is clearly linear, is bounded too. Then, by the Riesz Representation Theorem, for each y ∈ H there exists a unique z_y ∈ H such that f(x, y) = ⟨x ; z_y⟩ for every x ∈ H. This establishes a mapping Φ: H → H that assigns to each y ∈ H this unique z_y ∈ H (i.e., Φy = z_y), and f(x, y) = ⟨x ; Φy⟩ for every x, y ∈ H. It is easy to show that Φ is unique and lies in B[H] (cf. proof of Proposition 5.65(a,b)). Thus there exists a unique F ∈ B[H], viz., F = Φ*, such that ⟨Fx ; y⟩ = f(x, y) for every x, y ∈ H.
The notation

F = ∫φ(λ) dP_λ

is just short for the identity ⟨Fx ; y⟩ = ∫φ(λ) d⟨P_λ x ; y⟩ for every x, y ∈ H. Note that ⟨F*x ; y⟩ = ⟨Φx ; y⟩ = (⟨y ; Φx⟩)‾ = (f(y, x))‾ = (∫φ(λ) d⟨P_λ y ; x⟩)‾ = ∫φ̄(λ) d⟨P_λ x ; y⟩ for every x, y ∈ H, and hence

F* = ∫φ̄(λ) dP_λ.

If ψ: Ω → ℂ is a bounded Σ_Ω-measurable function and G = ∫ψ(λ) dP_λ, then it can be shown (by using the Radon–Nikodým Theorem) that FG = ∫φ(λ)ψ(λ) dP_λ.
In particular, F*F = ∫|φ(λ)|² dP_λ = FF*, so that F is normal. The Spectral Theorem states the converse.

Theorem 6.47. (The Spectral Theorem). If N ∈ B[H] is normal, then there exists a unique spectral measure P on Σ_{σ(N)} such that

N = ∫λ dP_λ.

If Δ is a nonempty relatively open subset of σ(N), then P(Δ) ≠ O. The representation N = ∫λ dP_λ is usually referred to as the spectral decomposition of N. Note that N*N = ∫|λ|² dP_λ = NN*.
Theorem 6.48. (Fuglede). Let N = ∫λ dP_λ be the spectral decomposition of a normal operator N ∈ B[H]. If S ∈ B[H] commutes with N, then S commutes with P(Δ) for every Δ ∈ Σ_{σ(N)}.

In other words, if SN = NS, then SP(Δ) = P(Δ)S, and hence each subspace R(P(Δ)) reduces S, which means that {R(P(Δ))}_{Δ∈Σ_{σ(N)}} is a family of reducing subspaces for every operator that commutes with the normal operator N = ∫λ dP_λ.
If σ(N) has a single point, say σ(N) = {λ}, then N = λI (by uniqueness of the spectral measure); that is, N is a scalar operator so that every subspace of H reduces N. Hence, if N is nonscalar, then σ(N) has more than one point (and dim H > 1). If λ, μ ∈ σ(N) and λ ≠ μ, then let Δ_λ denote the open disc of radius ½|λ − μ| centered at λ. Put Λ_λ = σ(N) ∩ Δ_λ and Λ_λ′ = σ(N)\Δ_λ in Σ_{σ(N)}, so that σ(N) is the disjoint union of Λ_λ and Λ_λ′. Note that P(Λ_λ) ≠ O and P(Λ_λ′) ≠ O (for Λ_λ and σ(N)\Δ_λ⁻ are nonempty relatively open subsets of σ(N), and σ(N)\Δ_λ⁻ ⊆ Λ_λ′). Then I = P(σ(N)) = P(Λ_λ ∪ Λ_λ′) = P(Λ_λ) + P(Λ_λ′), and therefore P(Λ_λ) = I − P(Λ_λ′) ≠ I. Thus {0} ≠ R(P(Λ_λ)) ≠ H. Conclusions: If dim H > 1, then every normal operator has a nontrivial reducing subspace. Actually, every nonscalar normal operator has a nontrivial hyperinvariant subspace which reduces every operator that commutes with it. In fact, every operator that commutes with a nonscalar normal operator is reducible.
Corollary 6.49. (Fuglede–Putnam). If N₁ ∈ B[H] and N₂ ∈ B[K] are normal operators, and if X ∈ B[H, K] intertwines N₁ to N₂, then X intertwines N₁* to N₂* (i.e., if XN₁ = N₂X, then XN₁* = N₂*X).

Proof. Let N = ∫λ dP_λ be a normal operator in B[H], let Δ be an arbitrary set in Σ_{σ(N)}, and take S ∈ B[H].

Claim. SN = NS ⟺ SP(Δ) = P(Δ)S ⟺ SN* = N*S.
Proof. If SN = NS, then SP(Δ) = P(Δ)S for every Δ ∈ Σ_{σ(N)} by Theorem 6.48. Thus ⟨SN*x ; y⟩ = ⟨N*x ; S*y⟩ = ∫λ̄ d⟨P_λ x ; S*y⟩ = ∫λ̄ d⟨SP_λ x ; y⟩ = ∫λ̄ d⟨P_λ Sx ; y⟩ = ⟨N*Sx ; y⟩ for every x, y ∈ H, and hence SN* = N*S so that NS* = S*N. This implies that P(Δ)S* = S*P(Δ), so that SP(Δ) = P(Δ)S, for every Δ ∈ Σ_{σ(N)} (cf. Theorem 6.48 again). Therefore, ⟨SNx ; y⟩ = ⟨Nx ; S*y⟩ = ∫λ d⟨P_λ x ; S*y⟩ = ∫λ d⟨SP_λ x ; y⟩ = ∫λ d⟨P_λ Sx ; y⟩ = ⟨NSx ; y⟩ for every x, y ∈ H, and hence SN = NS. □

Take N₁ ∈ B[H], N₂ ∈ B[K], and X ∈ B[H, K]. Set N = N₁ ⊕ N₂ = [N₁ O; O N₂] and S = [O O; X O] in B[H ⊕ K]. If N₁ and N₂ are normal operators, then N is clearly normal. If XN₁ = N₂X, then SN = NS and so SN* = N*S by the above claim. Hence XN₁* = N₂*X. □
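The Fuglede–Putnam corollary can be verified numerically. The construction below is an assumption of this sketch: diagonal models N₁, N₂ sharing eigenvalues, a matrix X₀ supported exactly where the eigenvalues match (so X₀N₁ = N₂X₀ entrywise), and random unitaries to make all the matrices dense.

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n):
    # QR of a random complex Gaussian matrix yields a unitary Q.
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return Q

# Diagonal models with matching eigenvalues; X0[i, j] is nonzero only
# where b[i] = a[j], so X0 N1 = N2 X0 by inspection of the entries.
a = np.array([1 + 1j, 2 - 1j])
b = np.array([2 - 1j, 1 + 1j, 5 + 0j])
X0 = np.array([[0, 3], [4, 0], [0, 0]], dtype=complex)

# Conjugate by random unitaries to hide the diagonal structure.
U, V = haar_unitary(2), haar_unitary(3)
N1 = U @ np.diag(a) @ U.conj().T          # normal on C^2
N2 = V @ np.diag(b) @ V.conj().T          # normal on C^3
X = V @ X0 @ U.conj().T
assert np.allclose(X @ N1, N2 @ X)        # X intertwines N1 to N2

# Fuglede-Putnam: X then intertwines the adjoints the same way.
assert np.allclose(X @ N1.conj().T, N2.conj().T @ X)
```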
The claim in the above proof ensures that S ∈ B[H] commutes with N and with N* if and only if S commutes with P(Δ) or, equivalently, R(P(Δ)) reduces S, for every Δ ∈ Σ_{σ(N)} (compare with Corollary 6.45).
Corollary 6.50. Take N₁ ∈ B[H], N₂ ∈ B[K], and X ∈ B[H, K]. If N₁ and N₂ are normal operators and XN₁ = N₂X, then

(a) N(X) reduces N₁ and R(X)⁻ reduces N₂, so that N₁|_{N(X)⊥} ∈ B[N(X)⊥] and N₂|_{R(X)⁻} ∈ B[R(X)⁻]. Moreover,

(b) N₁|_{N(X)⊥} and N₂|_{R(X)⁻} are unitarily equivalent.

Proof. (a) Since XN₁ = N₂X, it follows that N(X) is N₁-invariant and R(X) is N₂-invariant (and so R(X)⁻ is N₂-invariant — cf. Problem 4.18). Indeed, if Xx = 0, then X(N₁x) = N₂(Xx) = 0; and N₂Xx = XN₁x ∈ R(X) for every x ∈ H. Corollary 6.49 ensures that XN₁* = N₂*X, and hence N(X) is N₁*-invariant and R(X)⁻ is N₂*-invariant. Therefore (Corollary 5.75), N(X) reduces N₁ and R(X)⁻ reduces N₂.

(b) Let X = WQ be the polar decomposition of X, where Q = (X*X)^{1/2} (Theorem 5.89). Observe that XN₁ = N₂X implies N₁*X* = X*N₂*, which in turn implies X*N₂ = N₁X* by Corollary 6.49. Then

Q²N₁ = X*XN₁ = X*N₂X = N₁X*X = N₁Q²

so that QN₁ = N₁Q (Theorem 5.85). Therefore, WN₁Q = WQN₁ = XN₁ = N₂X = N₂WQ. That is,

(WN₁ − N₂W)Q = O.

Thus R(Q)⁻ ⊆ N(WN₁ − N₂W), and so

N(Q)⊥ ⊆ N(WN₁ − N₂W)
(for Q = Q* so that R(Q)⁻ = N(Q)⊥ by Proposition 5.76). Recall that

N(W) = N(Q) = N(Q²) = N(X*X) = N(X)

(cf. Propositions 5.76 and 5.86, and Theorem 5.89). If u ∈ N(Q), then N₂Wu = 0, and N₁u = N₁|_{N(X)}u ∈ N(X) = N(W) (because N(X) is N₁-invariant) so that WN₁u = 0. Hence

N(Q) ⊆ N(WN₁ − N₂W).

The above displayed inclusions imply that N(WN₁ − N₂W) = H (cf. Problem 5.7(b)). Equivalently,

WN₁ = N₂W.

Now W = VP, where V: N(W)⊥ → K is an isometry and P: H → H is the orthogonal projection onto N(W)⊥ (Proposition 5.87). Then

VPN₁ = N₂VP   so that   VPN₁|_{N(X)⊥} = N₂VP|_{N(X)⊥}.

Since R(P) = N(W)⊥ = N(X)⊥ is N₁-invariant (recall: N(X) reduces N₁), it follows that N₁(N(X)⊥) ⊆ N(X)⊥ = R(P), and hence VPN₁|_{N(X)⊥} = VN₁|_{N(X)⊥}. Since R(V) = R(W) = R(X)⁻ (cf. Theorem 5.89 and the observation that precedes Proposition 5.88), it also follows that

N₂VP|_{N(X)⊥} = N₂VP|_{R(P)} = N₂V = N₂|_{R(X)⁻}V.

But V: N(W)⊥ → R(V) is a unitary transformation (i.e., a surjective isometry) of the Hilbert space N(X)⊥ = N(W)⊥ ⊆ H onto the Hilbert space R(X)⁻ = R(V) ⊆ K. Conclusion:

VN₁|_{N(X)⊥} = N₂|_{R(X)⁻}V
so that the operators N₁|_{N(X)⊥} ∈ B[N(X)⊥] and N₂|_{R(X)⁻} ∈ B[R(X)⁻] are unitarily equivalent. □

An immediate consequence of Corollary 6.50: If a quasiinvertible linear transformation intertwines two normal operators, then these normal operators are unitarily equivalent. That is, if N₁ ∈ B[H] and N₂ ∈ B[K] are normal operators, and if XN₁ = N₂X, where X ∈ B[H, K] is such that N(X) = {0} (equivalently, N(X)⊥ = H) and R(X)⁻ = K, then UN₁ = N₂U for a unitary transformation U ∈ B[H, K]. This happens, in particular, when X is invertible (i.e., if X ∈ G[H, K]). Outcome: Two similar normal operators are unitarily equivalent.

Applying Theorems 6.47 and 6.48, we saw that normal operators (on a complex Hilbert space of dimension greater than one) have a nontrivial invariant subspace. This also is the case for compact operators (on a complex Banach space of dimension
greater than one). The definitive result in this line was presented by Lomonosov in 1973: An operator has a nontrivial invariant subspace if it commutes with a nonscalar operator that commutes with a nonzero compact operator. In fact, every nonscalar operator that commutes with a nonscalar compact operator (itself, in particular) has a nontrivial hyperinvariant subspace. Recall that, on an infinite-dimensional normed space, the only scalar compact operator is the null operator. On a finite-dimensional normed space every operator is compact, and hence every operator on a complex finite-dimensional normed space of dimension greater than one has a nontrivial invariant subspace and, if it is nonscalar, a nontrivial hyperinvariant subspace as well.

This prompts the most celebrated open question in operator theory, namely, the invariant subspace problem: Does every operator (on an infinite-dimensional complex separable Hilbert space) have a nontrivial invariant subspace? All the qualifications are crucial here. Note that the rotation operator [0 1; −1 0] on ℝ² has no nontrivial invariant subspace (when acting on the Euclidean real space ℝ², but, of course, it has a nontrivial invariant subspace when acting on the unitary complex space ℂ²). Thus the above question actually refers to complex spaces and, henceforward, we assume that all spaces are complex. The problem has a negative answer if we replace Hilbert space with Banach space. This (the invariant subspace problem in a Banach space) remained an open question for a long period up to the mid-1980s, when it was solved by Read (1984) and Enflo (1987), who constructed a Banach-space operator without a nontrivial invariant subspace. As we have just seen, the problem has an affirmative answer in a finite-dimensional space (of dimension greater than one).

It has an affirmative answer in a nonseparable Hilbert space too. Indeed, let T be any operator on a nonseparable Hilbert space H, and let x be any nonzero vector in H. Consider the orbit of x under T, {Tⁿx}_{n≥0}, so that ⋁{Tⁿx}_{n≥0} ≠ {0} is an invariant subspace for T (cf. Problem 4.23). Since {Tⁿx}_{n≥0} is a countable set, it follows by Proposition 4.9(b) that ⋁{Tⁿx}_{n≥0} ≠ H. Hence ⋁{Tⁿx}_{n≥0} is a nontrivial invariant subspace for T. Completeness and boundedness are also crucial here. In fact, it can be shown that (1) there exists an operator on an infinite-dimensional complex separable (incomplete) inner product space which has no nontrivial invariant subspace, and that (2) there exists a (not necessarily bounded) linear transformation of a complex separable Hilbert space into itself without nontrivial invariant subspaces. However, for (bounded linear) operators on an infinite-dimensional complex separable Hilbert space, the invariant subspace problem remains a recalcitrant open question.
Suggested Reading

Akhiezer and Glazman [1], [2]
Bachman and Narici [1]
Beals [1]
Beauzamy [1]
Berberian [1], [3]
Berezansky, Sheftel and Us [1], [2]
Clancey [1]
Colojoară and Foiaş [1]
Conway [1], [2], [3]
Douglas [1]
Dowson [1]
Dunford and Schwartz [3]
Fillmore [1], [2]
Halmos [1], [4]
Helmberg [1]
Kubrusly [1]
Martin and Putinar [1]
Naylor and Sell [1]
Pearcy [1], [2]
Putnam [1]
Radjavi and Rosenthal [1]
Riesz and Sz.-Nagy [1]
Sunder [1]
Sz.-Nagy and Foiaş [1]
Taylor and Lay [1]
Weidmann [1]
Xia [1]
Yoshino [1]
Problems

Problem 6.1. Let H be a Hilbert space. Show that the set of all normal operators from B[H] is closed in B[H].

Hint: (T* − S*)(T − S) + (T* − S*)S + S*(T − S) = T*T − S*S, and hence

‖T*T − S*S‖ ≤ ‖T − S‖² + 2‖S‖ ‖T − S‖

for every T, S ∈ B[H]. Verify the above inequality. Now let {N_n}_{n=1}^∞ be a sequence of normal operators in B[H] that converges in B[H] to N ∈ B[H]. Check that

‖N*N − NN*‖ = ‖N*N − N_n*N_n + N_n N_n* − NN*‖ ≤ ‖N_n*N_n − N*N‖ + ‖N_n N_n* − NN*‖ ≤ 2(‖N_n − N‖² + 2‖N‖ ‖N_n − N‖).

Conclude: The (uniform) limit of a uniformly convergent sequence of normal operators is normal. Finally, apply the Closed Set Theorem.
Problem 6.2. Let S and T be normal operators acting on the same Hilbert space. Prove the following assertions.

(a) αT is normal for every scalar α.

(b) If S*T = TS*, then S + T, TS and ST are normal operators.

(c) T*ⁿTⁿ = TⁿT*ⁿ = (T*T)ⁿ = (TT*)ⁿ for every integer n ≥ 0.

Hint: Problem 5.24 and Proposition 6.1.
Problem 6.3. Let T be a contraction on a Hilbert space H (i.e., T ∈ B[H] and ‖T‖ ≤ 1). Show that

(a) T*ⁿTⁿ → A strongly,

(b) O ≤ A ≤ I (i.e., A ∈ B[H] and ‖A‖ ≤ 1),
(c) T*ⁿATⁿ = A for every integer n ≥ 0.

Hint: Proposition 5.84, Problem 5.49 and Proposition 5.68; Problems 4.45(a), 5.55 and 5.24(a).

Note that, according to Problem 5.54, a contraction T is strongly stable if and only if A = O. Since A ≥ O, it follows by Proposition 5.81 that A is an orthogonal projection if and only if it is idempotent (i.e., if and only if A = A²). In general, A is not a projection.

(d) Set T = shift(α, 1, 1, 1, ...) in B[ℓ₊²], a weighted shift on H = ℓ₊² with |α| ∈ (0, 1), and show that this is a contraction for which A = diag(|α|², 1, 1, 1, ...) in B[ℓ₊²] is not a projection.

(e) Show that A = A² if and only if AT = TA.

Hint: Use part (c) to verify that ⟨ATx ; TAx⟩ = ‖Ax‖². Since ‖T‖ ≤ 1, check that ‖ATx − TAx‖² ≤ ‖ATx‖² − ‖Ax‖². Thus, recalling that ‖T‖ ≤ 1 and ‖A‖ ≤ 1, and applying part (c), verify that ‖Ax‖² ≤ ‖ATx‖² ≤ ‖A^{1/2}Tx‖² = ⟨T*ATx ; x⟩ = ⟨Ax ; x⟩ = ‖A^{1/2}x‖². Conclude: A = A² implies AT = TA. For the converse, use parts (a) and (c).

(f) Show that A = A² whenever T is a normal contraction.

Hint: Problems 6.2(c) and 5.24.

It can be shown that A = A² whenever T is a cohyponormal contraction.
It can be shown that A = A2 whenever T is a cohyponormal contraction. Problem 6.4. Consider the Hilbert space L2(F) of Example 5L(c), where I' denotes the unit circle about the origin of complex plane. Recall that, in this context, the
terms "bounded function", "equality", "inequality", "belongs" and "for all", are interpreted in the sense of equivalence classes. Let rp : I' --> C be a bounded function.
Show that
(a) of lies in L2(I') for every f E L2(r). Thus consider the mapping M.: L2(r) -- L2(I') defined by
M,, f = (pf
for every
f E L2(r).
That is, (M,y f)(z) = c(z)f (z) for all z E r. This mapping is called the multiplication operator on L2(F). It is easy to show that MM, is linear and bounded (i.e., M,p E B[L2(1')]). Prove the following propositions. (b)
llMpll = Ilwll,..
Hint: First verify that II Mwf II : 11(p [I,, IIf II for every f E L2(f ). Take e > 0
and consider the set rE = (z E r: IIVIIoo- E < Itp(z)I). Let ff be the characteristic function of rE. Check that fE E L2(I'), and also that II Mwff II (IltvlIoo- e)Ilf - eli. (c) M(p' g = ;pg for every g E L2(I').
(d) M_φ is a normal operator.
(e) M_φ is unitary if and only if φ(z) ∈ Γ for all z ∈ Γ.
(f) M_φ is self-adjoint if and only if φ(z) ∈ R for all z ∈ Γ.
(g) M_φ is nonnegative if and only if φ(z) ≥ 0 for all z ∈ Γ.
(h) M_φ is positive if and only if φ(z) > 0 for all z ∈ Γ.
(i) M_φ is strictly positive if and only if φ(z) ≥ α > 0 for all z ∈ Γ.

Problem 6.5. If T is a quasinormal operator, then
(a) (T*T)ⁿT = T(T*T)ⁿ for every n ≥ 0,
(b) |Tⁿ| = |T|ⁿ for every n ≥ 0,
(c) Tⁿ → O strongly if and only if |T|ⁿ → O strongly, and Tⁿ → O weakly if and only if |T|ⁿ → O weakly.
Hint: Prove (a) by induction. Assertion (b) holds trivially for n = 0, 1, for every operator T. Let T be a quasinormal operator (so that (a) holds true) and suppose |Tⁿ| = |T|ⁿ for some n ≥ 1. Then verify that |Tⁿ⁺¹|² = T*⁽ⁿ⁺¹⁾Tⁿ⁺¹ = T*|Tⁿ|²T = T*|T|²ⁿT = T*(T*T)ⁿT = T*T(T*T)ⁿ = (T*T)ⁿ⁺¹ = |T|²⁽ⁿ⁺¹⁾ = (|T|ⁿ⁺¹)². Now conclude the induction by recalling that the positive square root is unique. Use Problem 5.61(d) and part (b) to prove (c).
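On a finite-dimensional space every quasinormal operator is in fact normal (cf. the remark in Problem 6.23), but part (b) can still be illustrated there. The following sketch is an aside not from the text (Python with NumPy is an assumption, and the matrix T is an arbitrary normal example); it checks |Tⁿ| = |T|ⁿ for small n:

```python
import numpy as np

def absval(X):
    """|X| = (X*X)^(1/2), via the Hermitian eigendecomposition of X*X."""
    w, V = np.linalg.eigh(X.conj().T @ X)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

# An arbitrary normal (here real symmetric) matrix: a real diagonal
# conjugated by an orthogonal matrix.
Q = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))[0]
T = Q @ np.diag([2.0, -1.0, 0.5]) @ Q.T + 0j

# Problem 6.5(b): |T^n| = |T|^n for every n >= 0.
for n in range(4):
    lhs = absval(np.linalg.matrix_power(T, n))
    rhs = np.linalg.matrix_power(absval(T), n)
    assert np.allclose(lhs, rhs)
```

The helper `absval` is a hypothetical name for the modulus |X|; the uniqueness of the positive square root invoked in the hint is what makes the eigendecomposition formula well defined.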
Problem 6.6. Every quasinormal operator is hyponormal. Give a direct proof.
Hint: Let T be an operator on a Hilbert space H. Take an arbitrary x = u + v ∈ H = N(T*) + N(T*)⊥ = N(T*) + R(T)⁻, with u ∈ N(T*) and v ∈ R(T)⁻, so that v = limₙ vₙ where {vₙ} is an R(T)-valued sequence (cf. Proposition 4.13, Theorem 5.20, Propositions 5.76 and 3.27). Set D = T*T − TT* and check that (Du ; u) = ||Tu||². If T is quasinormal (i.e., if DT = O), then verify that (Du ; v) = limₙ(u ; Dvₙ) = 0, (Dv ; u) = limₙ(Dvₙ ; u) = 0, and (Dv ; v) = limₙ(Dvₙ ; v) = 0. Finally, show that (Dx ; x) ≥ 0.

Problem 6.7. If T ∈ G[H] is hyponormal, then T⁻¹ is hyponormal.
Hint: 0 ≤ D = T*T − TT*. Then (Problem 5.51(a)) 0 ≤ T⁻¹DT*⁻¹. Now show that I ≤ T⁻¹T*TT*⁻¹, and hence T*T⁻¹T*⁻¹T ≤ I (see Problems 1.10 and
5.53(b)). Verify: 0 ≤ T*⁻¹(I − T*T⁻¹T*⁻¹T)T⁻¹, and conclude that T⁻¹ is hyponormal.

Problem 6.8. If T ∈ G[H] is hyponormal and both T and T⁻¹ are contractions, then T is normal.
Hint: ||Tx|| = ||TT*⁻¹T*x|| ≤ ||T*x||. Thus T is cohyponormal.

Problem 6.9. Let H be a Hilbert space. Take any T ∈ B[H] and any x ∈ H. Show that
(a) T*Tx = ||T||²x if and only if ||Tx|| = ||T|| ||x||.
Hint: If ||Tx|| = ||T|| ||x||, then (T*Tx ; ||T||²x) = ||T||⁴||x||². Therefore, by using the above identity, verify that ||T*Tx − ||T||²x||² = ||T*Tx||² − ||T||⁴||x||² ≤ (||T*T||² − ||T||⁴)||x||² = 0.
Set M = {x ∈ H : ||Tx|| = ||T|| ||x||}. Prove the following assertions.
(b) M is a subspace of H. Hint: M = N(||T||²I − T*T).
(c) If T is hyponormal, then M is T-invariant.
Hint: If x ∈ M and if T is a hyponormal operator, then show that ||T|| ||Tx|| = || ||T||²x || = ||T*Tx|| = ||T*(Tx)|| ≤ ||T(Tx)|| ≤ ||T|| ||Tx||, so that Tx ∈ M.
(d) If T is normal, then M reduces T. Hint: M is invariant for both T and T* whenever T is normal.
Note: M may be trivial (samples: T = I and T = diag({1/k}k≥1)).
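Part (b) of the problem above says that M is the kernel of ||T||²I − T*T. In finite dimensions this is the eigenspace of T*T for its largest eigenvalue, which a short numerical aside can exhibit (NumPy is an assumption; the matrix T is an arbitrary illustrative choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.normal(size=(4, 4))
G = T.T @ T                                   # T*T for a real matrix

w, V = np.linalg.eigh(G)                      # ascending eigenvalues
norm2 = np.linalg.norm(T, 2)                  # operator (spectral) norm of T
assert np.isclose(w[-1], norm2**2)            # ||T||^2 is the top eigenvalue of T*T

x = V[:, -1]                                  # a unit vector in M
assert np.isclose(np.linalg.norm(T @ x), norm2 * np.linalg.norm(x))  # ||Tx|| = ||T|| ||x||
assert np.allclose((norm2**2 * np.eye(4) - G) @ x, 0)  # x in N(||T||^2 I - T*T)
```

Here the norm is attained, so M is nontrivial; the diagonal example in the Note shows that on ℓ²₊ the supremum need not be attained at all.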
Problem 6.10. Let H ≠ {0} and K ≠ {0} be complex Hilbert spaces. Take T ∈ B[H] and W ∈ G[H, K] arbitrary. Recall that H and K are unitarily equivalent, according to Problem 5.70. Show that

σP(T) = σP(WTW⁻¹) and ρ(T) = ρ(WTW⁻¹).

Thus conclude that (see Proposition 6.17)

σR(T) = σR(WTW⁻¹) and σ(T) = σ(WTW⁻¹).

Finally, verify that

σC(T) = σC(WTW⁻¹).

Outcome: Similarity preserves each part of the spectrum, and hence similarity preserves the spectral radius: r(T) = r(WTW⁻¹). In other words, if T ∈ B[H]
and S ∈ B[K] are similar (i.e., if T ≈ S), then σP(T) = σP(S), σR(T) = σR(S), σC(T) = σC(S), and so r(T) = r(S). Use Problem 4.41 to show that unitary equivalence also preserves the norm (i.e., if T ≅ S, then ||T|| = ||S||).
Note: Similarity preserves nontrivial invariant subspaces (Problem 4.29).
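The outcome of Problem 6.10 can be checked on matrices, where the spectrum is just the set of eigenvalues. The sketch below is an illustrative aside, not from the text (NumPy assumed; the matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.normal(size=(4, 4))
W = rng.normal(size=(4, 4))          # generically invertible
S = W @ T @ np.linalg.inv(W)         # S is similar to T

# Similar matrices have the same spectrum ...
eig_T = np.sort_complex(np.linalg.eigvals(T))
eig_S = np.sort_complex(np.linalg.eigvals(S))
assert np.allclose(eig_T, eig_S, atol=1e-6)

# ... hence the same spectral radius r(T) = max |lambda|.
assert np.isclose(np.max(np.abs(eig_T)), np.max(np.abs(eig_S)))
# Note: plain similarity does NOT preserve the norm in general;
# unitary equivalence does (Problem 4.41).
```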
Problem 6.11. Let A ∈ B[H] be a self-adjoint operator on a complex Hilbert space H ≠ {0}. Use Corollary 6.18(d) to check that ±i ∈ ρ(A), so that (A + iI) and (A − iI) both lie in G[H]. Consider the operator

U = (A − iI)(A + iI)⁻¹ = (A + iI)⁻¹(A − iI)

in G[H], where (A − iI)(A + iI)⁻¹ = (A + iI)⁻¹(A − iI) (cf. Corollary 4.23). It is clear that A commutes with (A + iI)⁻¹ and with (A − iI)⁻¹ because every operator trivially commutes with its resolvent function. Show that
(a) U is unitary,
(b) U = I − 2i(A + iI)⁻¹,
(c) 1 ∈ ρ(U) and

A = i(I + U)(I − U)⁻¹ = i(I − U)⁻¹(I + U).
Hint: (a) Verify that ||(A ± iI)x||² = ||Ax||² + ||x||² (reason: A = A*, and hence 2 Re(Ax ; ix) = 0) for every x ∈ H. Take any y ∈ H, so that y = (A + iI)x for some x ∈ H (recall: R(A + iI) = H). Then Uy = (A − iI)x and ||Uy||² = ||(A − iI)x||² = ||(A + iI)x||² = ||y||², so that U is an (invertible) isometry. (b) (A − iI) = −2iI + (A + iI). (c) (I − U)⁻¹ = (2i)⁻¹(A + iI) and (I + U) = I + (A − iI)(A + iI)⁻¹.
Conversely, let U ∈ G[H] be a unitary operator with 1 ∈ ρ(U) (so that (I − U) ∈ G[H]) and consider the operator

A = i(I + U)(I − U)⁻¹ = i(I − U)⁻¹(I + U)

in B[H]. Recall again: U commutes with (I − U)⁻¹. Show that
(d) A = iI + 2iU(I − U)⁻¹ = −iI + 2i(I − U)⁻¹,
(e) A is self-adjoint,
(f) ±i ∈ ρ(A) and U = (A − iI)(A + iI)⁻¹ = (A + iI)⁻¹(A − iI).
Hint: (d) Verify that i(I − U) + 2iU = i(I + U) = −i(I − U) + 2iI. (e) ((I − U)⁻¹)* = (I − U⁻¹)⁻¹ = −(I − U)⁻¹U, so that A* = iI + 2iU(I − U)⁻¹ = A by (d). (f) Using assertion (d) we get (A − iI) = 2iU(I − U)⁻¹ and (A + iI) = 2i(I − U)⁻¹, so that (A + iI)⁻¹ = (2i)⁻¹(I − U).
Summing up: Set U = (A − iI)(A + iI)⁻¹ for an arbitrary self-adjoint operator A. Then U is unitary with 1 ∈ ρ(U), and i(I + U)(I − U)⁻¹ = A. Conversely, set A = i(I + U)(I − U)⁻¹ for any unitary operator U with 1 ∈ ρ(U). Then A is self-adjoint and (A − iI)(A + iI)⁻¹ = U. Outcome: There exists a one-to-one correspondence between the class of all self-adjoint operators and the class of all unitary operators for which 1 belongs to the resolvent set:

A ↦ (A − iI)(A + iI)⁻¹ with inverse U ↦ i(I + U)(I − U)⁻¹.

If A is self-adjoint, then the unitary operator U = (A − iI)(A + iI)⁻¹ is called the Cayley transform of A. What is behind such a one-to-one correspondence is the Möbius transformation z ↦ (z − i)/(z + i), which maps the open upper half-plane onto the open unit disc, and the extended real line onto the unit circle.
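The whole correspondence of Problem 6.11 can be exercised numerically in finite dimensions. The sketch below is an aside not from the text (NumPy assumed; A is an arbitrary Hermitian matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (B + B.conj().T) / 2                      # a self-adjoint (Hermitian) matrix
I = np.eye(4)

# Cayley transform U = (A - iI)(A + iI)^{-1}
U = (A - 1j * I) @ np.linalg.inv(A + 1j * I)

assert np.allclose(U @ U.conj().T, I)                        # (a) U is unitary
assert np.allclose(U, I - 2j * np.linalg.inv(A + 1j * I))    # (b)
assert np.min(np.abs(np.linalg.eigvals(U) - 1)) > 1e-8       # (c) 1 in rho(U)

# inverse transform A = i(I + U)(I - U)^{-1} recovers A
A_back = 1j * (I + U) @ np.linalg.inv(I - U)
assert np.allclose(A_back, A)
```

The eigenvalue check reflects the Möbius picture: real eigenvalues λ of A go to (λ − i)/(λ + i) on the unit circle, and the value 1 is reached only "at infinity", so it stays in the resolvent set of U.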
Problem 6.12. Let T ∈ B[H] be any operator on a complex Hilbert space H ≠ {0}. Prove the following assertions.
(a) N(λI − T) is a subspace of H, which is T-invariant for every λ ∈ C. Moreover, T|N(λI−T) = λI : N(λI − T) → N(λI − T). That is, if restricted to the invariant subspace N(λI − T), then T acts as a scalar operator on N(λI − T), and hence T|N(λI−T) is normal. Remark: Obviously, if N(λI − T) = {0} (i.e., if λ ∉ σP(T)), then T|N(λI−T) = λI : {0} → {0} coincides with the null operator: on the null space every operator is null or, equivalently, the only operator on the null space is the null operator.
(b) Every eigenspace of a nonscalar operator T is a nontrivial invariant subspace for T (i.e., if λ ∈ σP(T), then {0} ≠ N(λI − T) ≠ H and N(λI − T) is T-invariant).
(c) If σP(T) ∪ σP(T*) ≠ ∅, then T has a nontrivial invariant subspace.
(d) If T has no nontrivial invariant subspace, then σ(T) = σC(T).
Hint: For parts (c) and (d) use Propositions 5.74 and 6.17, respectively.
Problem 6.13. We have already seen in Section 6.3 that σ(T⁻¹) = σ(T)⁻¹ = {λ⁻¹ ∈ C : λ ∈ σ(T)} for every T ∈ G[X], where X ≠ {0} is a complex Banach space. Exhibit a diagonal operator T in G[C²] for which r(T⁻¹) ≠ r(T)⁻¹.

Problem 6.14. Let T be an arbitrary operator on a complex Banach space X ≠ {0}, take any λ ∈ ρ(T) (so that (λI − T) ∈ G[X]), and set d = d(λ, σ(T)), the distance of λ to σ(T). Since σ(T) is nonempty (bounded) and closed, it follows that d is a positive real number (cf. Problem 3.43(b)). Show that the spectral radius
of the inverse of (λI − T) coincides with the inverse of the distance of λ ∈ ρ(T) to the spectrum of T. That is,

(a) r((λI − T)⁻¹) = d⁻¹.

Hint: d = inf{|λ − μ| : μ ∈ σ(T)}, so that d⁻¹ = sup{|λ − μ|⁻¹ : μ ∈ σ(T)} and, from the Spectral Mapping Theorem (Theorem 6.19), σ(λI − T) = {λ − μ ∈ C : μ ∈ σ(T)}. However, since σ((λI − T)⁻¹) = σ(λI − T)⁻¹ = {ν⁻¹ ∈ C : ν ∈ σ(λI − T)}, show that σ((λI − T)⁻¹) = {(λ − μ)⁻¹ ∈ C : μ ∈ σ(T)}. Now let X be a Hilbert space and prove the following implication.
(b) If T is hyponormal, then ||(λI − T)⁻¹|| = d⁻¹.

Hint: If T is hyponormal, then (λI − T) is hyponormal (cf. proof of Corollary 6.18) and so is (λI − T)⁻¹ (Problem 6.7). Hence (λI − T)⁻¹ is normaloid by Proposition 6.10. Apply (a).
Problem 6.15. Let M be a subspace of a Hilbert space H and take T ∈ B[H]. If M is T-invariant, then (T|M)* = PT*|M in B[M], where P : H → H is the orthogonal projection onto M.
Hint: Use Proposition 5.81 to verify that ((T|M)*u ; v) = (u ; T|M v) = (u ; Tv) = (u ; TPv) = (PT*u ; v) = (PT*|M u ; v) for every u, v ∈ M.
In other words, if M is T-invariant, then T(M) ⊆ M (so that T|M lies in B[M]) but T*(M) may not be included in M; it has to be projected there: PT*(M) ⊆ M (so that PT*|M lies in B[M] and coincides with (T|M)*). If M reduces T (i.e., if M also is T*-invariant), then T*(M) does not need to be projected onto M; it is already there (i.e., if M reduces T, then T*(M) ⊆ M and (T|M)* = T*|M; see Corollary 5.75).
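In matrix terms the identity (T|M)* = PT*|M is a statement about block triangular matrices, and can be verified directly. The following aside is not from the text (NumPy assumed; block sizes and entries are illustrative): if M is spanned by the first k coordinates, a T leaving M invariant is block upper triangular, T|M is its (1,1) block A, and (T|M)* is A* rather than a restriction of T*.

```python
import numpy as np

rng = np.random.default_rng(3)
k, m = 2, 3
A = rng.normal(size=(k, k))
B = rng.normal(size=(k, m))
C = rng.normal(size=(m, m))

# M = span of first k coordinates is T-invariant for this block form
T = np.block([[A, B], [np.zeros((m, k)), C]])
P = np.diag([1.0] * k + [0.0] * m)     # orthogonal projection onto M

# (T|M)* acts on M as A* (= A.T here, real case); compare with P T*|M
for _ in range(5):
    u = np.concatenate([rng.normal(size=k), np.zeros(m)])   # u in M
    lhs = np.concatenate([A.T @ u[:k], np.zeros(m)])        # (T|M)* u, embedded in H
    rhs = P @ T.T @ u                                       # P T* u
    assert np.allclose(lhs, rhs)
```

The projection is what discards the block B*, which is exactly the part of T* that leaks out of M when M does not reduce T.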
Problem 6.16. Let M be an invariant subspace for T ∈ B[H].
(a) If T is hyponormal, then T|M is hyponormal.
(b) If T is hyponormal and T|M is normal, then M reduces T.
Hint: Since M is T-invariant, T|M ∈ B[M]. Use Problem 6.15 (and Propositions 5.81 and 6.6) to show that ||(T|M)*u|| ≤ ||T*u|| ≤ ||Tu|| = ||T|M u|| for every u ∈ M. Moreover, if T|M is normal, say T|M = N, then T = (N X ; O Y) in B[M ⊕ M⊥] (cf. Example 2O and Proposition 5.51). Since N is normal and T is hyponormal, verify that

0 ≤ D = T*T − TT* = [ −XX*          N*X − XY*
                      X*N − YX*     X*X + Y*Y − YY* ]  ∈  B[M ⊕ M⊥].

Take any u in M, set x = (u, 0) in M ⊕ M⊥, and show that (Dx ; x) = −||X*u||². Since D ≥ O, conclude that X = O, so that T = (N O ; O Y) = N ⊕ Y, and hence M reduces T.
Problem 6.17. This is a rather important result. Let M be an invariant subspace for a normal operator T ∈ B[H]. Show that T|M is normal if and only if M reduces T.
Hint: If T is normal, then it is hyponormal. Apply Problem 6.16 to verify that M reduces T whenever T|M is normal. Conversely, if M reduces T, then write T = N₁ ⊕ N₂ on M ⊕ M⊥, where N₁ = T|M in B[M] and N₂ = T|M⊥ in B[M⊥]. Now verify that both N₁ and N₂ are normal operators whenever T is normal.
Problem 6.18. Let T be a compact operator on a complex Hilbert space H and let D denote the open unit disc about the origin of the complex plane.
(a) Show that σP(T) ⊆ D implies Tⁿ → O uniformly.
Hint: Corollary 6.31 and Proposition 6.22.
(b) Show that Tⁿ → O weakly implies σP(T) ⊆ D.
Hint: If λ ∈ σP(T), then verify that there exists a unit vector x in H such that Tⁿx = λⁿx for every positive integer n. Thus λⁿ → 0, and hence |λ| < 1, whenever Tⁿ → O weakly (cf. Proposition 5.67).
Conclude: The concepts of weak, strong, and uniform stabilities coincide for a compact operator on a complex Hilbert space.

Problem 6.19. If T ∈ B[H] is hyponormal, then
N(λI − T) ⊆ N(λ̄I − T*) for every λ ∈ C.

Hint: Adapt the proof of Proposition 6.39.
Problem 6.20. Take λ, μ ∈ C. If T ∈ B[H] is hyponormal, then

N(λI − T) ⊥ N(μI − T) whenever λ ≠ μ.
Hint: Adapt the proof of Proposition 6.40 by using Problem 6.19.
Problem 6.21. If T ∈ B[H] is hyponormal, then N(λI − T) reduces T for every λ ∈ C.
Hint: Adapt the proof of Proposition 6.41. First observe that, if λx = Tx, then T*x = λ̄x (by Problem 6.19). Next verify that λT*x = λλ̄x = λ̄λx = λ̄Tx = T(λ̄x) = TT*x. Then conclude: N(λI − T) is T*-invariant.
Note: T|N(λI−T) is a scalar operator on N(λI − T), and hence a normal operator (cf. Problem 6.12(a)). A pure hyponormal operator is a hyponormal operator that has no normal direct summand (i.e., it has no reducing subspace on which it acts as a normal operator). Use Problem 6.17 to show that a pure hyponormal operator has an empty point spectrum.
Problem 6.22. Let T ∈ B[H] be a hyponormal operator and set

M = ( Σ_{λ∈σP(T)} N(λI − T) )⁻.

Show that M reduces T and T|M is normal.
Hint: If σP(T) = ∅, then the result is trivial (for the empty sum is null). Thus suppose σP(T) ≠ ∅. First note that {N(λI − T)}λ∈σP(T) is an orthogonal family of nonzero subspaces of the Hilbert space H (Problem 6.20). Now choose one of the following ways.
(1) Adapt the proof of Corollary 6.42, with the help of Problems 6.20 and 6.21, to verify that M reduces T. Use Theorem 5.59 and Problem 5.10 to check that the family {Pλ}λ∈σP(T) consisting of the nonzero orthogonal projections Pλ ∈ B[M] onto each N(λI − T) is a resolution of the identity on M. Take any u ∈ M. Verify that u = Σλ Pλu, and hence T|M u = Tu = Σλ TPλu = Σλ λPλu (reason: Pλu ∈ N(λI − T)), where the sums run over σP(T). Conclude that T|M ∈ B[M] is a weighted sum of projections. Apply Proposition 6.36.
(2) Use Example 5J and Problem 5.10 to identify the topological sum M with the orthogonal direct sum ⊕λ∈σP(T) N(λI − T). Since each N(λI − T) reduces T (Problem 6.21), it follows that M reduces T, and also that each N(λI − T) reduces T|M ∈ B[M]. Therefore, T|M = ⊕λ∈σP(T) T|N(λI−T). But each T|N(λI−T) is normal (in fact, a scalar operator; cf. Problem 6.12), which implies that T|M is normal (actually, a weighted sum of projections).
Problem 6.23. Every compact hyponormal operator is normal.
Hint: Let T ∈ B[H] be a compact hyponormal operator on a Hilbert space H and consider the subspace M of Problem 6.22. Show that σP(T|M⊥) = ∅: if λ ∈ σP(T|M⊥), then there exists a nonzero vector v ∈ M⊥ such that λv = Tv, and hence v ∈ N(λI − T) ⊆ M, which is a contradiction. Now recall that T|M⊥ is compact (Section 4.9) and hyponormal (Problem 6.16). Use Corollary 6.32 to conclude that M⊥ = {0}. Apply Problem 6.22 to show that T is normal.
Remark: According to the above result, on a finite-dimensional Hilbert space, quasinormality, subnormality and hyponormality all collapse to normality (and so isometries become unitaries; see Problem 4.38(d)).
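The finite-dimensional collapse in the Remark has a one-line numerical explanation: the self-commutator D = T*T − TT* always has trace zero, so if T is hyponormal (D ≥ O) then D = O and T is normal. The sketch below is an illustrative aside, not from the text (NumPy assumed; the matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

# trace(T*T) = trace(TT*), so the self-commutator always has trace zero;
# a positive semidefinite matrix with zero trace must be the zero matrix.
D = T.conj().T @ T - T @ T.conj().T
assert abs(np.trace(D)) < 1e-10

# For a normal T (here, a diagonal matrix) the self-commutator vanishes identically.
N = np.diag([1.0 + 1j, 2.0, -3j])
DN = N.conj().T @ N - N @ N.conj().T
assert np.allclose(DN, 0)
```

This trace argument is unavailable in infinite dimensions, which is why the Problem's proof goes through compactness instead.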
Problem 6.24. Let T ∈ B[H] be a weighted sum of projections on a complex Hilbert space H ≠ {0}. That is,

Tx = Σ_γ λ_γ P_γ x for every x ∈ H,
where {P_γ} is a resolution of the identity on H (with P_γ ≠ O for all γ) and {λ_γ} is a (similarly indexed) bounded family of scalars. Recall from Proposition 6.36 that T is normal. Now prove the following equivalences.
(a) T is unitary ⟺ λ_γ ∈ Γ for all γ ⟺ σ(T) ⊆ Γ.
(b) T is self-adjoint ⟺ λ_γ ∈ R for all γ ⟺ σ(T) ⊆ R.
(c) T is nonnegative ⟺ λ_γ ∈ [0, ∞) for all γ ⟺ σ(T) ⊆ [0, ∞).
(d) T is positive ⟺ λ_γ ∈ (0, ∞) for all γ.
(e) T is strictly positive ⟺ λ_γ ∈ [α, ∞) for all γ ⟺ σ(T) ⊆ [α, ∞).
(f) T is a projection ⟺ λ_γ ∈ {0, 1} for all γ ⟺ σ(T) = σP(T) ⊆ {0, 1}.
Note: In part (a), Γ denotes the unit circle about the origin of the complex plane. In part (e), α is some positive real number. In part (f), projection means orthogonal projection (Proposition 6.2).

Problem 6.25. Let T be an operator on a complex Hilbert space H ≠ {0}. Show that
(a) T is diagonalizable if and only if H has an orthonormal basis made up of eigenvectors of T.
Hint: If {e_γ} is an orthonormal basis for H, where each e_γ is an eigenvector of T, then use the Fourier Series Theorem to show that the resolution of the identity on H of Proposition 5.57 diagonalizes T. Conversely, if T is diagonalizable, then every nonzero vector in each R(P_γ) is an eigenvector of T. Let B_γ be an orthonormal basis for the Hilbert space R(P_γ). Since (Σ_γ R(P_γ))⁻ = H (see Theorem 5.59 and Problem 5.10), use Problem 5.11 to verify that ⋃_γ B_γ is an orthonormal basis for H consisting of eigenvectors of T.
If there exists an orthonormal basis {e_γ} for H and a (similarly indexed) bounded family of scalars {λ_γ} such that Tx = Σ_γ λ_γ (x ; e_γ) e_γ for every x ∈ H, then we say that T is a diagonal operator with respect to the basis {e_γ} (cf. Problem 5.17). Use part (a) to show that
(b) T is diagonalizable if and only if it is a diagonal operator with respect to some orthonormal basis for H.
Now let {e_γ}γ∈Γ be an orthonormal basis for H and consider the Hilbert space ℓ²Γ of Example 5K. Let {λ_γ}γ∈Γ be a bounded family of scalars and consider the mapping D : ℓ²Γ → ℓ²Γ defined by

Dx = {λ_γ ξ_γ}γ∈Γ for every x = {ξ_γ}γ∈Γ ∈ ℓ²Γ.

In fact, Dx ∈ ℓ²Γ for all x ∈ ℓ²Γ, D ∈ B[ℓ²Γ], and ||D|| = supγ∈Γ |λ_γ| (hint: Example 4H). This is called a diagonal operator on ℓ²Γ. Show that
(c) T is diagonalizable if and only if it is unitarily equivalent to a diagonal operator.
Hint: Let {e_γ}γ∈Γ be an orthonormal basis for H and consider the natural mapping (cf. Theorem 5.48) U : H → ℓ²Γ given by

Ux = {(x ; e_γ)}γ∈Γ for every x = Σγ∈Γ (x ; e_γ) e_γ.

Verify that U is unitary (i.e., a linear surjective isometry; see the proof of Theorem 5.49), and use part (b) to show that the diagram

   H  --T-->  H
   |U         |U
   v          v
  ℓ²Γ --D--> ℓ²Γ

commutes if and only if T is diagonalizable, where D is a diagonal operator on ℓ²Γ.

Problem 6.26. If T is a normal operator, then
(a) T is unitary if and only if σ(T) ⊆ Γ,
(b) T is self-adjoint if and only if σ(T) ⊆ R,
(c) T is nonnegative if and only if σ(T) ⊆ [0, ∞),
(d) T is strictly positive if and only if σ(T) ⊆ [α, ∞) for some α > 0,
(e) T is an orthogonal projection if and only if σ(T) ⊆ {0, 1}.
Hint: Recall that Γ denotes the unit circle about the origin of the complex plane. Half of this problem was proved in Corollary 6.18. To prove the other half use the Spectral Theorem:

T = ∫σ(T) λ dPλ,   T* = ∫σ(T) λ̄ dPλ,   T*T = ∫σ(T) |λ|² dPλ = TT*,
(Tx ; x) = ∫σ(T) λ d(Pλx ; x) for all x ∈ H.
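In finite dimensions a diagonal operator is literally a diagonal matrix, and the norm formula of Problem 6.25 together with the spectral characterizations of Problems 6.24 and 6.26 can be checked directly. The sketch below is an illustrative aside, not from the text (NumPy assumed; the weights are arbitrary):

```python
import numpy as np

# An arbitrary bounded family of nonzero weights
lam = np.array([0.3, -1.5, 2.0 + 1.0j, 0.5j])
D = np.diag(lam)
x = np.array([1.0, 2.0, -1.0, 3.0]) + 0j

assert np.allclose(D @ x, lam * x)            # coordinatewise action Dx = {lam_k x_k}
assert np.isclose(np.linalg.norm(D, 2),
                  np.max(np.abs(lam)))        # ||D|| = sup_k |lam_k|
assert np.allclose(D @ D.conj().T,
                   D.conj().T @ D)            # D is normal (Proposition 6.36)

# D is unitary iff every weight lies on the unit circle
# (Problems 6.24(a) and 6.26(a)); here we force unimodular weights.
U = np.diag(lam / np.abs(lam))
assert np.allclose(U @ U.conj().T, np.eye(4))
```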
Problem 6.27. Let T ∈ B[H] be a hyponormal operator. Prove the following implications.
(a) If σ(T) ⊆ Γ, then T is unitary.
Hint: If the spectrum of T is included in the unit circle Γ about the origin, then 0 ∈ ρ(T), so that T ∈ G[H]. Moreover, since T is hyponormal, it follows that ||T|| = r(T) = 1. Now use Problem 6.7 to check that T⁻¹ is hyponormal. Verify that ||T⁻¹|| = 1 (recall: σ(T⁻¹) = σ(T)⁻¹ ⊆ Γ) and conclude from Problem 6.8 that T is normal. Finally, apply Problem 6.26(a) to show that T is unitary.
(b) If σ(T) ⊆ R, then T is self-adjoint.
Hint: Show that scaling and translation by the identity of any hyponormal operator are again hyponormal (i.e., if T is hyponormal, then αT and (I − T) are hyponormal for every α ∈ C, and hence (λI − T) is hyponormal for every λ ∈ C, as we saw in the proof of Corollary 6.18). Let T be a hyponormal operator such that σ(T) ⊆ R. Suppose T ≠ O (otherwise the result is trivial) and set T′ = (2||T||)⁻¹T, which is hyponormal. Verify that σ(T′) = (2||T||)⁻¹σ(T) ⊆ R and ||T′|| = ½. Also verify that (½iI − T′) is a hyponormal operator in G[H] and use Problem 6.14 to show that ||(½iI − T′)⁻¹|| = d(½i, σ(T′))⁻¹. Moreover, ||½iI − T′|| ≤ ½ + ||T′|| = 1. Outcome: Both (½iI − T′) and (½iI − T′)⁻¹ are contractions. Use Problem 6.8 to conclude that (½iI − T′) is normal. Now undo scaling and translation: for instance, apply Problem 6.2 (with S = ½iI) to verify that −T′ is normal, and so is T. Finally, show that T is self-adjoint by using Problem 6.26(b).
(c) If σ(T) ⊆ [0, ∞), then T is nonnegative.
(d) If σ(T) ⊆ [α, ∞) for some α > 0, then T is strictly positive.
(e) If σ(T) ⊆ {0, 1}, then T is an orthogonal projection.
Hint: Use part (b) and Problem 6.26 to prove (c), (d) and (e).
Problem 6.28. (a) An isolated point of the spectrum of a normal operator is an eigenvalue.
Hint: Consider the spectral representation N = ∫ λ dPλ of a normal operator on a Hilbert space H and let λ₀ be an isolated point of σ(N). Apply Theorems 6.47 and 6.48 to show that:
(1) P({λ₀}) ≠ O,
(2) R(P({λ₀})) ≠ {0} reduces N,
(3) N|R(P({λ₀})) = NP({λ₀}) = ∫ λ χ{λ₀} dPλ = λ₀P({λ₀}), where χ{λ₀} is the characteristic function of {λ₀}, and
(4) (λ₀I − N)u = λ₀P({λ₀})u − N|R(P({λ₀}))u = 0 for every u ∈ R(P({λ₀})).
An important result in operator theory is the Riesz Decomposition Theorem, which reads as follows. If T is an operator on a complex Hilbert space, and if σ(T) = σ₁ ∪ σ₂, where σ₁ and σ₂ are disjoint nonempty closed sets in C, then T has a complementary (not necessarily orthogonal) pair of nontrivial invariant subspaces {M₁, M₂} such that σ(T|M₁) = σ₁ and σ(T|M₂) = σ₂. Now prove the following assertion.
(b) An isolated point of the spectrum of a hyponormal operator is an eigenvalue.
Hint: Let λ₁ be an isolated point of the spectrum σ(T) of a hyponormal operator T ∈ B[H]. Verify that σ(T) = {λ₁} ∪ σ₂ for some nonempty closed set σ₂ that does not contain λ₁. Apply the Riesz Decomposition Theorem to ensure that T has a nontrivial invariant subspace M such that σ(T|M) = {λ₁}. Set H₁ = T|M on M ≠ {0}. Show that (λ₁I − H₁) is a hyponormal (thus normaloid) operator for which σ(λ₁I − H₁) = {0}, and conclude that T|M = H₁ = λ₁I in B[M].
(c) A pure hyponormal operator has no isolated point in its spectrum.
Hint: Problem 6.21.
Problem 6.29. Let S and T be normal operators acting on the same Hilbert space. Prove the following assertion.
If ST = TS, then S + T, TS and ST are normal operators.
Hint: Corollary 6.49 and Problem 6.2.
Problem 6.30. The operators in this problem act on a complex Hilbert space of dimension greater than one. Recall from Problem 4.22:
(a) Every nilpotent operator has a nontrivial invariant subspace.
It is an open question whether every quasinilpotent operator has a nontrivial invariant subspace. Shift from nilpotent to normal operators, and recall from the Spectral Theorem:
(b) Every normal operator has a nontrivial invariant subspace.
Now prove the following propositions.
(c) Every quasinormal operator has a nontrivial invariant subspace.
Hint: (T*T − TT*)T = O. Use Problem 4.21.
(d) Every isometry has a nontrivial invariant subspace.
Every subnormal operator has a nontrivial invariant subspace. This is a deep result proved by S. Brown in 1978. However, it is still unknown whether every hyponormal operator has a nontrivial invariant subspace.
References
N.I. AKHIEZER AND I.M. GLAZMAN
[1] Theory of Linear Operators in Hilbert Space - Volume I (Pitman, London, 1981).
[2] Theory of Linear Operators in Hilbert Space - Volume II (Pitman, London, 1981).
W. ARVESON
[1] An Invitation to C*-Algebras (Springer, New York, 1976).
G. BACHMAN AND L. NARICI
[1] Functional Analysis (Academic Press, New York, 1966).
A.V. BALAKRISHNAN
[1] Applied Functional Analysis 2nd edn. (Springer, New York, 1980).
S. BANACH
[1] Theory of Linear Operations (North-Holland, Amsterdam, 1987).
R. BEALS
[1] Topics in Operator Theory (The University of Chicago Press, Chicago, 1971).
B. BEAUZAMY
[1] Introduction to Operator Theory and Invariant Subspaces (North-Holland, Amsterdam, 1988).
S.K. BERBERIAN
[1] Notes on Spectral Theory (Van Nostrand, New York, 1966).
[2] Lectures in Functional Analysis and Operator Theory (Springer, New York, 1974).
[3] Introduction to Hilbert Space 2nd edn. (Chelsea, New York, 1976).
Y.M. BEREZANSKY, Z.G. SHEFTEL AND G.F. US
[1] Functional Analysis - Volume I (Birkhäuser, Basel, 1996).
[2] Functional Analysis - Volume II (Birkhäuser, Basel, 1996).
K.G. BINMORE
[1] The Foundations of Analysis - A Straightforward Introduction - Book 1: Logic, Sets and Numbers (Cambridge University Press, Cambridge, 1980).
A. BROWN AND C. PEARCY
[1] Introduction to Operator Theory I - Elements of Functional Analysis (Springer, New York, 1977).
[2] An Introduction to Analysis (Springer, New York, 1995).
S.W. BROWN
[1] Some invariant subspaces for subnormal operators, Integral Equations Operator Theory 1 (1978) 310-333.
G. CANTOR
[1] Ein Beitrag zur Mannigfaltigkeitslehre, J. für Math. 84 (1878) 242-258.
K. CLANCEY
[1] Seminormal Operators (Springer, Berlin, 1979).
I. COLOJOARĂ AND C. FOIAŞ
[1] Theory of Generalized Spectral Operators (Gordon and Breach, New York, 1968).
J.B. CONWAY
[1] A Course in Functional Analysis 2nd edn. (Springer, New York, 1990).
[2] The Theory of Subnormal Operators (Mathematical Surveys and Monographs Vol. 36, Amer. Math. Soc., Providence, 1991).
[3] A Course in Operator Theory (Graduate Studies in Mathematics Vol. 21, Amer. Math. Soc., Providence, 2000).
J.N. CROSSLEY et al.
[1] What is Mathematical Logic? (Oxford University Press, Oxford, 1972).
P. COHEN
[1] The independence of the continuum hypothesis, Proc. Nat. Acad. Sci. 50 (1963) 1143-1148.
K.R. DAVIDSON
[1] C*-Algebras by Example (Fields Institute Monographs Vol. 6, Amer. Math. Soc., Providence, 1996).
J. DIEUDONNÉ
[1] Foundations of Modern Analysis (Academic Press, New York, 1969).
R.G. DOUGLAS
[1] Banach Algebra Techniques in Operator Theory (Academic Press, New York, 1972; 2nd edn. Springer, New York, 1998).
H.R. DOWSON
[1] Spectral Theory of Linear Operators (Academic Press, New York, 1978).
J. DUGUNDJI
[1] Topology (Allyn & Bacon, Boston, 1966).
N. DUNFORD AND J.T. SCHWARTZ
[1] Linear Operators - Part I: General Theory (Interscience, New York, 1958).
[2] Linear Operators - Part II: Spectral Theory - Self Adjoint Operators in Hilbert Space (Interscience, New York, 1963).
[3] Linear Operators - Part III: Spectral Operators (Interscience, New York, 1971).
A. DVORETZKY AND C.A. ROGERS
[1] Absolute and unconditional convergence in normed linear spaces, Proc. Nat. Acad. Sci. 36 (1950) 192-197.
P. ENFLO
[1] A counterexample to the approximation problem in Banach spaces, Acta Math. 130 (1973) 309-317.
[2] On the invariant subspace problem for Banach spaces, Acta Math. 158 (1987) 213-313.
S. FEFERMAN
[1] Some applications of the notion of forcing and generic sets, Fund. Math. 56 (1965) 325-345.
P.A. FILLMORE
[1] Notes on Operator Theory (Van Nostrand, New York, 1970).
[2] A User's Guide to Operator Algebras (Wiley, New York, 1996).
A.A. FRAENKEL, Y. BAR-HILLEL AND A. LEVY
[1] Foundations of Set Theory 2nd edn. (North-Holland, Amsterdam, 1973).
K. GÖDEL
[1] Consistency-proof for the generalized continuum-hypothesis, Proc. Nat. Acad. Sci. 25 (1939) 220-224.
C. GOFFMAN AND G. PEDRICK
[1] A First Course in Functional Analysis 2nd edn. (Chelsea, New York, 1983).
I.C. GOHBERG AND M.G. KREĬN
[1] Introduction to Nonselfadjoint Operators (Translations of Mathematical Monographs Vol. 18, Amer. Math. Soc., Providence, 1969).
S. GOLDBERG
[1] Unbounded Linear Operators (Dover, New York, 1985).
P.R. HALMOS
[1] Introduction to Hilbert Space and the Theory of Spectral Multiplicity 2nd edn. (Chelsea, New York, 1957; reprinted: AMS Chelsea, Providence, 1998).
[2] Finite-Dimensional Vector Spaces (Van Nostrand, New York, 1958; reprinted: Springer, New York, 1974).
[3] Naive Set Theory (Van Nostrand, New York, 1960; reprinted: Springer, New York, 1974).
[4] A Hilbert Space Problem Book (Van Nostrand, New York, 1967; 2nd edn. Springer, New York, 1982).
G. HELMBERG
[1] Introduction to Spectral Theory in Hilbert Space (North-Holland, Amsterdam, 1969).
I.N. HERSTEIN
[1] Topics in Algebra (Xerox, Lexington, 1964).
E. HILLE AND R.S. PHILLIPS
[1] Functional Analysis and Semi-Groups (Colloquium Publications Vol. 31, Amer. Math. Soc., Providence, 1957).
V.I. ISTRĂŢESCU
[1] Introduction to Linear Operator Theory (Marcel Dekker, New York, 1981).
T. KATO
[1] Perturbation Theory for Linear Operators 2nd edn. (Springer, Berlin, 1980).
L.V. KANTOROVICH AND G.P. AKILOV
[1] Functional Analysis 2nd edn. (Pergamon Press, Oxford, 1982).
J.L. KELLEY
[1] General Topology (Van Nostrand, New York, 1955; reprinted: Springer, New York, 1975).
A.N. KOLMOGOROV AND S.V. FOMIN
[1] Introductory Real Analysis (Prentice-Hall, Englewood Cliffs, 1970).
E. KREYSZIG
[1] Introductory Functional Analysis with Applications (Wiley, New York, 1978).
C.S. KUBRUSLY
[1] An Introduction to Models and Decompositions in Operator Theory (Birkhäuser, Boston, 1997).
V.I. LOMONOSOV
[1] Invariant subspaces for the family of operators which commute with a completely continuous operator, Functional Anal. Appl. 7 (1973) 213-214.
S. MACLANE AND G. BIRKHOFF
[1] Algebra (Macmillan, New York, 1967).
M. MARTIN AND M. PUTINAR
[1] Lectures on Hyponormal Operators (Birkhäuser, Basel, 1989).
I.J. MADDOX
[1] Elements of Functional Analysis 2nd edn. (Cambridge University Press, Cambridge, 1988).
G.H. MOORE
[1] Zermelo's Axiom of Choice (Springer, New York, 1982).
G. MURPHY
[1] C*-Algebras and Operator Theory (Academic Press, San Diego, 1990).
A.W. NAYLOR AND G.R. SELL
[1] Linear Operator Theory in Engineering and Science (Holt, Rinehart & Winston, New York, 1971; reprinted: Springer, New York, 1982).
C.M. PEARCY
[1] Some Recent Developments in Operator Theory (CBMS Regional Conference Series in Mathematics No. 36, Amer. Math. Soc., Providence, 1978).
[2] Topics in Operator Theory (Mathematical Surveys No. 13, Amer. Math. Soc., Providence, 2nd pr. 1979).
C.R. PUTNAM
[1] Commutation Properties of Hilbert Space Operators and Related Topics (Springer, Berlin, 1967).
H. RADJAVI AND P. ROSENTHAL
[1] Invariant Subspaces (Springer, New York, 1973).
C.J. READ
[1] A solution to the invariant subspace problem, Bull. London Math. Soc. 16 (1984) 337-401.
M. REED AND B. SIMON
[1] Methods of Modern Mathematical Physics I: Functional Analysis 2nd edn. (Academic Press, New York, 1980).
F. RIESZ AND B. SZ.-NAGY
[1] Functional Analysis (Frederick Ungar, New York, 1955).
A.P. ROBERTSON AND W.J. ROBERTSON
[1] Topological Vector Spaces 2nd edn. (Cambridge University Press, Cambridge, 1973).
S. ROMAN
[1] Advanced Linear Algebra (Springer, New York, 1992).
H.L. ROYDEN
[1] Real Analysis 3rd edn. (Macmillan, New York, 1988).
W. RUDIN
[1] Functional Analysis 2nd edn. (McGraw-Hill, New York, 1991).
R. SCHATTEN
[1] Norm Ideals of Completely Continuous Operators (Springer, Berlin, 1970).
L. SCHWARTZ
[1] Analyse - Topologie Générale et Analyse Fonctionnelle 2ème édn. (Hermann, Paris, 1970).
W. SIERPIŃSKI
[1] L'hypothèse généralisée du continu et l'axiome du choix, Fund. Math. 34 (1947) 1-5.
G.F. SIMMONS
[1] Introduction to Topology and Modern Analysis (McGraw-Hill, New York, 1963).
D.R. SMART
[1] Fixed Point Theorems (Cambridge University Press, Cambridge, 1974).
M.H. STONE
[1] Linear Transformations in Hilbert Space (Colloquium Publications Vol. 15, Amer. Math. Soc., Providence, 1932).
V.S. SUNDER
[1] Functional Analysis - Spectral Theory (Birkhäuser, Basel, 1998).
P. SUPPES
[1] Axiomatic Set Theory (Dover, New York, 1972).
W.A. SUTHERLAND
[1] Introduction to Metric and Topological Spaces (Oxford University Press, Oxford, 1975).
B. SZ.-NAGY AND C. FOIAŞ
[1] Harmonic Analysis of Operators on Hilbert Space (North-Holland, Amsterdam, 1970).
A.E. TAYLOR AND D.C. LAY
[1] Introduction to Functional Analysis 2nd edn. (Wiley, New York, 1980; enlarged edn. of A.E. TAYLOR, 1958).
R.L. VAUGHT
[1] Set Theory - An Introduction 2nd edn. (Birkhäuser, Boston, 1995).
J. WEIDMANN
[1] Linear Operators in Hilbert Spaces (Springer, New York, 1980).
R.L. WILDER
[1] Introduction to the Foundations of Mathematics 2nd edn. (Wiley, New York, 1965; reprinted: Krieger, Malabar, 1983).
D. XIA
[1] Spectral Theory of Hyponormal Operators (Birkhäuser, Basel, 1983).
T. YOSHINO
[1] Introduction to Operator Theory (Longman, Harlow, 1993).
K. YOSIDA
[1] Functional Analysis 6th edn. (Springer, Berlin, 1980).
Index
Abelian group, 38
absolute homogeneity, 128
absolutely convergent series, 201
absolutely convex set, 210
absolutely homogeneous functional, 128
absolutely homogeneous metric, 128
absolutely summable family, 344, 346
absolutely summable sequence, 201
absorbing set, 270
accumulation point, 116-118
additive Abelian group, 38
additive mapping, 55
additively invariant metric, 198
additivity, 313
adherent point, 115-117
adjoint, 379, 387-390, 454
algebra, 82
algebra with identity, 83
algebraic complement, 68, 69, 71, 72, 74, 289
algebraic conjugate, 56
algebraic dual, 56
algebraic linear transformation, 84
algebraic operator, 281
algebraically disjoint, 67
annihilator, 343
antisymmetric relation, 8
approximate eigenvalue, 453
approximate point spectrum, 452
approximation spectrum, 453
Arzelà-Ascoli Theorem, 163
associative binary operation, 38
Axiom of Choice, 15
backward bilateral shift, 293, 423
backward unilateral shift, 248, 292, 421, 474
Baire Category Theorem, 145-147
Baire metric, 187
Baire space, 142
balanced set, 220
Banach algebra, 222
Banach limit, 304
Banach space, 200, 209, 214, 219, 233, 271, 295
Banach-Steinhaus Theorem, 242, 295
Banach-Tarski Lemma, 11
barrel, 271
barreled space, 271, 295
Bessel inequality, 355
best linear approximation, 332
bidual, 266
bijective function, 5
bilateral ideal, 83
bilateral shift, 293, 422-425
bilinear form, 312
bilinear functional, 312
binary operation, 38
block diagonal operator, 284
Bolzano-Weierstrass property, 156
Boolean sum, 4
boundary, 181
boundary point, 181
bounded above, 9, 223, 231
bounded away from zero, 225, 272
bounded below, 9, 223, 228, 231
bounded family, 208
bounded function, 9, 88, 216, 272
bounded inverse, 223, 228, 229
bounded linear operator, 220
bounded linear transformation, 215, 216
bounded sequence, 89, 122, 272
bounded set, 9, 87, 152, 270, 272
bounded variation, 185
boundedly complete lattice, 10
C*-algebra, 396
canonical basis for ℓ₊², 363
canonical basis for Fⁿ, 54
canonical bilateral shift, 423, 470
canonical unilateral shift, 422, 470
Cantor set, 190
Cantor-Bernstein Theorem, 17
cardinal number, 15
cardinality, 14-16, 18, 21, 22, 31-35, 77
Cartesian decomposition, 429
Cartesian product, 4, 13, 66, 167, 177, 189, 193, 206
Cauchy criterion, 127, 224, 344
Cauchy sequence, 127, 134, 185-187, 271
Cayley transform, 501
chain, 12
characteristic function, 16, 27
clopen set, 184
closed ball, 101
closed convex hull, 269
Closed Graph Theorem, 229
closed linear transformation, 287-289
closed map, 113
closed set, 113, 116, 128, 149, 150
Closed Set Theorem, 118
closed subspace, 179
closure, 113-115, 180, 184
cluster point, 116-118
codimension, 70, 81
codomain, 5
coefficients, 216
cohyponormal operator, 446
coisometry, 391, 422
collinear vectors, 409
comeagre set, 144
commensurable topologies, 106
commutant, 284
commutative algebra, 83
commutative binary operation, 38
commutative diagram, 6
commutative group, 38
commutative ring, 39
commuting operators, 281, 489, 492, 506
compact extension, 252
compact operator, 250-254, 256, 257, 301, 302, 308, 428, 434, 436, 474-480, 482-484, 486, 489, 495, 503, 504
compact restriction, 256
compact set, 148, 149, 157-160, 236, 237
compact space, 148, 193-195
compatible topology, 269
complementary linear manifolds, 341
complementary projection, 290
complementary subspaces, 289, 290, 339, 342
complete lattice, 10, 11, 45, 176, 212, 282
complete set, 271
complete space, 128, 130, 131, 133-135, 137-141, 145-147, 156, 158-160, 185-189, 191, 192
completely continuous, 250
completion, 138-141, 240-242, 345
complex field, 451
complex linear space, 41
composition of functions, 6
compression spectrum, 452
condensation point, 180
cone, 82
conjugate space, 255
connected set, 184
connected space, 265
connectedness, 184
constant function, 5
continuity, 97, 106
continuity of inner product, 411
continuity of inversion, 298
continuity of metric, 172
continuity of norm, 200
continuity of scalar multiplication, 264
continuity of vector addition, 269
continuous composition, 103, 176
continuous extension, 135-138
Continuous Extension Theorem, 262
continuous function, 96-100, 102, 113, 122, 149-151, 159, 182, 216
continuous inverse, 223
Continuous Inverse Theorem, 228
continuous linear extension, 237-239
continuous linear transformation, 215
continuous projection, 221, 290, 300
continuous restriction, 126
continuous spectrum, 451, 474
Continuum Hypothesis, 22
contraction, 97, 218, 298, 400, 409
Contraction Mapping Theorem, 132
contrapositive proof, 3
convergence, 93, 156
convergence-preserving map, 100
convergent nets, 96
convergent sequence, 93-96, 103, 106, 127
convergent series, 201, 274
convex functional, 198
convex hull, 76, 269
convex linear combination, 76
convex set, 269
convex space, 271
coordinates, 49
coset, 43
countable set, 18
countably infinite set, 18
covering, 8, 148
cyclic subspace, 283
cyclic vector, 283
De Morgan laws, 4, 24
decomposition, 72-74, 342, 406, 429, 486, 507
decreasing function, 10
decreasing increments, 115
decreasing sequence, 13
dense in itself, 126
dense linear manifold, 211, 237-239, 333
dense set, 121, 145, 146, 179, 329
dense subspace, 122, 135-138
densely embedded, 138
densely intertwined, 283, 284
denumerable set, 18
derived set, 116, 118
diagonal mapping, 173, 183, 218, 289
diagonal operator, 218, 225, 246, 254, 255, 299, 302, 400, 416, 470, 505
diagonal procedure, 22, 154, 162
diagonalizable operator, 465, 467, 489, 504, 505
diameter, 87
dimension, 53, 77, 81, 359
direct proof, 2
direct sum, 66-68, 74, 75, 206-208, 220, 279, 280, 290, 321-323, 325, 337, 338
direct sum decomposition, 68, 72, 74, 75, 342
direct summand, 75, 229
directed downward, 10
directed set, 10
directed upward, 10
disconnected set, 184
disconnected space, 184
disconnection, 184
discrete dynamical system, 79
discrete metric, 105
discrete set, 126, 184
discrete space, 105
discrete topology, 105
disjoint linear manifolds, 67
disjoint sets, 4
disjointification, 22
distance, 86, 87
distributive laws, 39
division ring, 39
domain, 5
Dominated Extension Theorem, 262
doubleton, 4
dual space, 265
ε-net, 151
eigenspace, 451, 485, 490, 501
eigenvalue, 451, 479, 484, 507
eigenvector, 454, 476, 505
embedding, 6
empty function, 28
empty sum, 349
equicontinuous, 160, 295
equiconvergent sequences, 170, 186
equivalence, 230
equivalence class, 7
equivalence relation, 7
equivalent metrics, 106, 108, 116
equivalent norms, 231, 232
equivalent sets, 14
equivalent spaces, 231
Euclidean metric, 87
Euclidean norm, 203, 318
Euclidean space, 87, 203, 318
eventually constant, 105
eventually in, 103
expansion, 49, 276
extension by continuity, 258
extension of a function, 6
extension ordering, 28
extension over completion, 141, 142, 241, 257, 341
F-space, 271
Fσ, 141
field, 40
final space, 405
finite sequence, 13
finite set, 14
finite-dimensional space, 53, 61, 65, 77, 232-237, 244, 251, 267, 291, 292, 301, 303, 346, 355, 382, 424, 466, 504
finite-dimensional transformation, 79, 251
finite-rank transformation, 79, 251, 254, 292, 302
first category set, 143, 145, 147
fixed point, 6, 11, 132, 148
Fourier coefficients, 360
Fourier series expansion, 360
Fourier Series Theorem, 360
Fréchet space, 271
Fredholm Alternative, 480
Fubini's Theorem, 390
Fuglede-Putnam Theorem, 492
full direct sum, 246
function, 5
Gδ, 147
Gelfand-Beurling formula, 460
Gelfand-Naimark Theorem, 396
Gram-Schmidt process, 357
graph, 4
greatest lower bound, 9
group, 38, 222
Hahn Interpolation Theorem, 128
Hahn-Banach Theorem, 260-262
Hamel basis, 48, 49, 51-53, 355, 358
Hausdorff Maximal Principle, 23
Hausdorff space, 179
Heine-Borel Theorem, 158
Hermitian operator, 396
Hermitian symmetric functional, 312
Hermitian symmetry, 313
Hilbert cube, 195
Hilbert space, 311
Hilbert-Schmidt operator, 435, 436
Hölder conjugates, 164
Hölder inequalities, 164-166
homeomorphic spaces, 108, 150
homeomorphism, 108, 111, 114, 147, 150
homogeneity, 313
homogeneous mapping, 55
hyperinvariant linear manifold, 284
hyperinvariant subspace, 284
hyperplane, 82
hyponormal operator, 446-448, 450, 499, 506-508
hyponormal restriction, 501
ideal, 83
idempotent, 7, 25, 70, 301
identity element, 38, 39, 83
identity map, 6
identity operator, 222
image of a point, 5
image of a set, 5
inclusion map, 6
increasing function, 10, 11
increasing sequence, 13
index set, 12
indexed family, 12
indexing, 12
indiscrete topology, 105
induced equivalence relation, 8
induced topology, 104, 199, 315
induced uniform norm, 218
inductive set, 2
infimum, 9, 13
infinite diagonal matrix, 218
infinite sequence, 13
infinite set, 14
infinite-dimensional space, 53, 356, 359
initial segment, 13
initial space, 405
injection, 6
injective function, 5, 25, 26
injective linear transformation, 56
inner product, 312
inner product axioms, 312
inner product space, 313, 323
inner product space ℓ₊²(X), 323
interior, 120
interior point, 121
intertwined operators, 283
intertwining transformation, 283
invariant linear manifold, 73-75, 280
invariant set, 6
invariant subspace, 280, 282-284, 392, 393, 500-503, 507
invariant subspace problem, 495, 507
inverse element, 38, 83
inverse image, 5
Inverse Mapping Theorem, 228
inverse of a function, 7, 25, 223
inversely induced topology, 129
invertible element of B[X, Y], 228
invertible function, 7, 26
invertible linear transformation, 58
invertible operator in B[X], 229
involution, 26
isolated point, 125, 126, 142
isometric isomorphism, 239-242, 267, 268, 292-294
isometrically equivalent operators, 294
isometrically equivalent spaces, 110, 138, 144, 328
isometrically isomorphic spaces, 239, 241, 266-268, 304, 338
isometry, 110, 195, 239, 292, 294, 298, 301, 339, 391, 442, 465, 508
isomorphic equivalence,
isomorphic linear spaces, 59-63, 65, 67
isomorphism, 59-62, 64, 65, 67
Jensen inequalities, 166
kernel, 56
Kronecker delta, 54
lattice, 10, 27, 45, 212, 281, 282
Laurent expansion, 460
Law of the Excluded Middle, 2
least-squares, 427
least upper bound, 9
left ideal, 83
left inverse, 25
limit, 29, 93, 96
limit inferior, 29, 169
limit superior, 29, 169
linear algebra, 82
linear basis, 48
linear combination, 46
linear composition, 78
linear dimension, 53, 357, 359
linear equivalence relation, 42
linear extension, 57, 258-262
linear functional, 55
linear manifold, 43, 209, 325
linear restriction, 56, 78
linear space, 40
linear space L[X, Y], 56, 78
linear span, 45
linear topology, 262
linear transformation, 55, 57, 58, 62, 65
linear variety, 82
linearly independent set, 47, 352
linearly ordered set, 12
Liouville Theorem, 455
Lipschitz condition, 97
Lipschitz constant, 92
Lipschitzian mapping, 97, 125
locally compact, 194
locally convex space, 271
Lomonosov Theorem, 495
lower bound, 9
lower limit, 169
lower semicontinuity, 175
map, 5
mapping, 5
Mathematical Induction, 2, 13
matrix, 65
maximal element, 9
maximal linear variety, 82
maximal orthonormal set, 353-355, 357
maximum, 9
meagre set, 143
metric, 85
metric axioms, 85
metric generated by a norm, 199, 244
metric generated by a quasinorm, 271
metric space, 86
metrizable, 105
minimal element, 9
minimum, 9
Minkowski inequalities, 165
Möbius transformation, 501
modus ponens, 2
monotone function, 10
monotone sequence, 13
multiplication operator, 492
multiplicity, 421-426, 451
mutually orthogonal projections, 321
natural embedding, 267
natural isomorphism, 65, 67, 341, 342
natural projection, 222, 290
neighborhood, 101, 172
neighborhood base, 271
net, 14
neutral element, 38
nilpotent linear transformation, 80
nilpotent operator, 282
nondegenerate interval, 21
nondenumerable set, 18
nonmeagre set, 144
nonnegative contraction, 399, 433
nonnegative functional, 198
nonnegative homogeneity, 198
nonnegative operator, 399-402, 406, 407, 430, 432-434, 442, 445, 463, 499, 505-507
nonnegative quadratic form, 312
nonnegativeness, 86, 199, 313
nontrivial hyperinvariant subspace, 285
nontrivial invariant subspace, 281-284, 286
nontrivial linear manifold, 43
nontrivial projection, 70
nontrivial reducing subspace, 392, 492
nontrivial ring, 39
nontrivial subset, 3
nontrivial subspace, 210
norm, 198, 218
norm axioms, 198
norm induced by an inner product, 315, 316
norm topology, 200, 315
normal operator, 441-448, 455, 480-499, 503-508
normal restriction, 501
normaloid operator, 397, 447, 463-466, 477
normed algebra, 222
normed linear space, 199
normed space, 199, 222
normed space B[X, Y], 217, 218
normed spaces ℓᵖ(X) and ℓ∞(X), 208
normed vector space, 199
nowhere continuous, 98
nowhere dense, 142, 143, 147
nuclear operator, 435
null function, 42
null space, 56, 216
null transformation, 56, 212
nullity, 79
numerical radius, 462-466
numerical range, 463
one-to-one correspondence, 5, 11
one-to-one mapping, 5
open ball, 100
open map, 108
Open Mapping Theorem, 225
open set, 101, 104, 105
open subspace, 172
operator, 220
operator algebra B[X], 220, 222, 229, 246, 253, 307
operator convergence, 244-246, 249, 250, 296-300, 306-308, 371, 372, 381-385, 401, 403, 418-420, 422, 424, 429, 430, 462, 496, 502
operator matrix, 280
operator norm property, 220
orbit, 282
order-preserving correspondence, 23
ordered n-tuples, 13
ordered pair, 4
ordering, 8
ordinal number, 23
origin of a linear space, 41
orthogonal complement, 328, 339, 342, 367
orthogonal dimension, 357, 359
orthogonal direct sum, 325, 338, 341, 342, 367, 392, 416, 421, 469, 490
orthogonal family, 351, 352, 355, 356
Orthogonal Normalization Lemma, 415
orthogonal projection, 367-376, 398, 442, 502, 505-507
orthogonal projection onto M, 369, 392, 415, 485, 502
orthogonal sequence, 324
orthogonal set, 324, 352
Orthogonal Structure Theorem, 335
orthogonal subspaces, 327, 328, 335-339, 342, 468, 469, 481, 484, 485
orthogonality, 323
orthonormal basis, 354, 356, 360, 363-370
orthonormal set, 352-354
p-integrable functions, 92
p-summable family, 207, 344, 348
p-summable sequence, 88
pair, 4
parallelogram law, 315
Parseval identity, 360
part of an operator, 444
partial isometry, 404-408
partial ordering, 8
partially ordered set, 8
partition, 8
perfect set, 126, 147
point of accumulation, 116-118
point of adherence, 115, 116
point of continuity, 97
pointwise bounded, 160, 242
pointwise convergence, 95, 243
pointwise totally bounded, 160
polar decomposition, 405, 406, 443
polarization identities, 315, 316
polynomial, 62, 80, 282
positive functional, 198
positive operator, 399, 400, 430, 431, 433, 505
positive quadratic form, 312
positiveness, 86, 199, 313
power bounded operator, 246
power of a function, 7
power sequence, 246, 282, 300
power set, 4
pre-Hilbert space, 313
pre-image, 5
precompactness, 155
Principle of Contradiction, 2
Principle of Recursive Definition, 13
Principle of Superposition, 78
product metric, 167
product of cardinal numbers, 35
product space, 167, 177, 183, 189, 193
product topology, 193
projection, 70-74, 82, 221, 300
projection on M, 71, 73, 74
projection operator, 221
Projection Theorem, 339, 342
proof by contradiction, 2
proof by induction, 2
proper subset, 3
proportional vectors, 442
pseudometric, 91
pseudometric space, 91
pseudonorm, 198, 203
Pythagorean Theorem, 324, 351
quadratic form, 312
quasiaffine transform, 285
quasiaffinity, 285
quasiinvertible transformation, 285
quasinilpotent operator, 459, 465, 471, 508
quasinorm, 220
quasinormal operator, 443, 444, 448, 498, 508
quasinormed space, 220
quasisimilar operators, 285
quasisimilarity, 286
quotient algebra, 83
quotient norm, 214
quotient space, 7, 42, 44, 69, 83, 91, 139, 204, 205, 213, 214, 240, 320, 321
Radon-Nikodým Theorem, 491
range, 5
rank, 79
rare set, 142
real field, 40
real linear space, 41
reducible operator, 392, 492
reducing subspace, 392, 484, 488, 492, 499, 502
reflexive relation, 7
reflexive spaces, 267, 302
relation, 4
relative complement, 4
relative metric, 86
relative topology, 172
relatively closed, 179
relatively compact, 149, 159
relatively open, 179
residual set, 144, 146, 148
residual spectrum, 451
resolution of the identity, 371-376
resolvent function, 450
resolvent identity, 450
resolvent set, 449, 450
restriction of a function, 5, 6
Riemann-Lebesgue Lemma, 426
Riesz Decomposition Theorem, 507
Riesz Lemma, 236
Riesz Representation Theorem, 376
right ideal, 83
right inverse, 25
ring, 39
ring with identity, 39
scalar, 40
scalar multiplication, 40
scalar operator, 219, 281
scalar product, 312
Schauder basis, 276, 287, 302
Schwarz inequality, 314
second category set, 144, 145
second dual, 266
self-adjoint operator, 396-401, 403, 429, 430, 432, 442, 456, 465, 500-508
self-indexing, 12
semi-inner product, 320
semi-inner product space, 320
semicontinuity, 175
seminorm, 198, 203
seminormal operator, 446, 448
separable space, 123-126, 153, 183, 211, 256, 265, 267, 268, 276, 291, 294, 357
sequence, 13
sequence of partial sums, 168, 201
sequentially compact set, 155
sequentially compact space, 155-158
sesquilinear form, 312
sesquilinear functional, 312
set, 3
shift, 248, 292, 293, 421-424, 448, 468-471, 474
similar linear transformations, 65, 80
similar operators, 286, 294
similarity, 65, 80, 286, 294
simply ordered set, 12
singleton, 4
span, 45, 46, 211
spanned linear manifold, 47
spanned subspace, 211
spanning set, 211
spectral decomposition, 486, 492
Spectral Mapping Theorem, 457
spectral measure, 490
spectral radius, 457-464, 473, 483, 499, 501
Spectral Theorem, 485, 486, 490, 492
spectraloid operator, 464, 465
spectrum, 449-457, 466-469, 474-480, 499, 502, 506-508
spectrum diagram, 452
square root, 401
square root algorithm, 174
square-summable family, 344, 350, 351
square-summable net, 323
square-summable sequence, 322, 325
stability, 246-248, 384, 422, 432, 461, 497, 499, 503
strict contraction, 97, 218, 298
strictly decreasing function, 10
strictly decreasing sequence, 13
strictly increasing function, 10
strictly increasing sequence, 13
strictly positive operator, 398, 400, 430, 431, 433, 455, 498, 505, 507
strong convergence, 243, 245-248, 250, 296, 299-301, 371, 372, 376, 382, 420, 422, 424, 432, 434
strong limit, 244
stronger topology, 106
strongly bounded, 242
strongly closed, 249, 299
strongly stable operator, 246-248, 384, 422, 432, 434, 497, 503
subadditive functional, 128
subadditivity, 199
subalgebra, 83
subcovering, 148
sublattice, 14
sublinear functional, 198
subnormal operator, 444-446, 448, 508
subsequence, 14
subset, 3
subspace of a metric space, 86, 184
subspace of a normed space, 209, 210, 234, 326, 329-331, 335
subspace of a topological space, 172
Successive Approximation, 132
sum of cardinal numbers, 35
sum of linear manifolds, 44, 45, 67, 68
summable family, 343, 344, 346, 348, 350, 351
summable sequence, 201, 274, 275, 277
sup-metric, 90, 91
sup-norm, 214
supremum, 9, 13
surjective function, 5, 25, 26
surjective isometry, 110, 138-142
symmetric difference, 4, 24
symmetric functional, 312
symmetric relation, 7
symmetry, 86
Tietze Extension Theorem, 178
Tikhonov Theorem, 193
topological base, 124
topological embedding, 109
topological invariant, 109, 150, 183, 184
topological isomorphism, 231, 239, 290, 291, 294
topological linear space, 269
topological space, 105
topological sum, 213, 335
topological vector space, 269
topologically isomorphic spaces, 231, 235
topology, 104, 105
total set, 211
totally bounded, 152-156, 158, 161-163
totally cyclic linear manifold, 283
totally disconnected, 184, 187
totally ordered set, 12
trace, 438
trace-class operator, 435, 436, 438, 439
transformation, 5
transitive relation, 7
triangle inequality, 86, 199
trichotomy law, 12
two-sided ideal, 83, 253
ultrametric, 186
ultrametric inequality, 186
unbounded linear transformation, 235, 289, 366
unbounded set, 87
unconditionally convergent series, 348
unconditionally summable, 348, 350
uncountable set, 18
uncountably infinite set, 18
undecidable statement, 23
underlying set, 40, 86
Uniform Boundedness Principle, 242
uniform convergence, 244-246, 248, 273, 296-300, 381, 420, 422, 436
uniform homeomorphism, 109, 134, 137
uniform limit, 244
uniformly bounded, 242
uniformly closed, 249
uniformly continuous composition, 176
uniformly continuous function, 97, 134-137, 151, 154, 216
uniformly equicontinuous, 161
uniformly equivalent metrics, 110, 111, 232
uniformly homeomorphic spaces, 110, 134, 155, 177
uniformly stable operator, 246, 248
unilateral shift, 292, 421-425, 448, 468
unit vector, 352
unital algebra, 83
unital algebra L[X], 56, 79, 83, 222
unital Banach algebra, 222
unital normed algebra, 222, 284
unitarily equivalent operators, 493, 494, 506
unitarily equivalent spaces, 338, 343
unitary operator, 423, 433, 455, 492, 498, 500, 505, 506
unitary space, 87, 203
unitary transformation, 340, 343
upper bound, 9
upper limit, 169
upper semicontinuity, 175
usual metrics, 86, 89, 93
usual norms, 202, 203, 205, 208, 218
value of a function, 5
vector, 40
vector addition, 40
vector space, 40
von Neumann expansion, 226
weak convergence, 306, 308, 309, 376-387, 401, 418, 420, 432
weak* convergence, 308
weak limit, 306
weaker topology, 106
weakly bounded, 410
weakly closed, 384
weakly closed convex cone B⁺[H], 400, 430
weakly stable operator, 307, 423, 432
Weierstrass Theorems, 124, 159
weighted bilateral shift, 470, 471
weighted sum of projections, 374, 416, 481, 482, 488, 504
weighted unilateral shift, 470
well-ordered set, 12
Zermelo Well-Ordering Principle, 23
Zorn's Lemma, 17
Carlos S. Kubrusly
Elements of Operator Theory

Elements of Operator Theory is aimed at graduate students as well as a new generation of mathematicians and scientists who need to apply operator theory to their field. Written in a user-friendly, motivating style, it presents fundamental topics in a systematic fashion: set theory, algebraic structures, topological structures, Banach spaces, and Hilbert spaces, culminating with the Spectral Theorem, one of the landmarks in the theory of operators on Hilbert spaces. The exposition is concept-driven and as much as possible avoids the formula-computational approach. Key features of this largely self-contained work include:
- required background material to each chapter
- fully rigorous proofs, over 300 of them, specially tailored to the presentation, some of them new
- more than 100 examples and, in several cases, interesting counterexamples that demonstrate the frontiers of an important theorem
- over 300 problems, many with hints
- both problems and examples underscore further auxiliary results and extensions of the main theory; in this nontraditional framework, the reader is challenged and has a chance to prove the principal theorems anew
This work is an excellent text for the classroom as well as a self-study
resource for researchers. Prerequisites include an introduction to analysis and to functions of a complex variable, which most first-year
graduate students in mathematics, engineering, or another formal science have already acquired. Measure theory and integration theory are required only for the last section of the final chapter.
Birkhäuser
ISBN 0-8176-4174-2
www.birkhauser.com