Lectures and Exercises on Functional Analysis
Translations of
MATHEMATICAL MONOGRAPHS
Volume 233
Lectures and Exercises on Functional Analysis A. Ya. Helemskii Translated by S. Akbarov
AMS Subcommittee: Robert D. MacPherson, Grigorii A. Margulis, James D. Stasheff (Chair)
ASL Subcommittee: Steffen Lempp (Chair)
IMS Subcommittee: Mark I. Freidlin (Chair)
А. Я. Хелемский
ЛЕКЦИИ ПО ФУНКЦИОНАЛЬНОМУ АНАЛИЗУ
МЦНМО, Москва, 2004
The present translation was created under license for the American Mathematical Society and is published by permission.
Translated from the Russian by S. Akbarov
2000 Mathematics Subject Classification. Primary 46-01, 47-01.
For additional information and updates on this book, visit www.ams.org/bookpages/mmono-233
Library of Congress Cataloging-in-Publication Data
Khelemskii, A. IA. (Aleksandr IAkovlevich)
[Lektsii po funktsional'nomu analizu. English]
Lectures and exercises on functional analysis / A. Ya. Helemskii.
p. cm. - (Translations of mathematical monographs, ISSN 0065-9282; v. 233)
Includes bibliographical references and index.
ISBN-10 0-8218-4098-3 (acid-free paper)
ISBN-13 978-0-8218-4098-6
1. Functional analysis. 2. Operator theory. I. Title. II. Series.
QA321.K54 2005
515'.7-dc22    2005053605
Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can also be made by e-mail to reprint-permission
© 2006 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1    11 10 09 08 07 06
To two Marias: mother and wife
Contents
Preface

Chapter 0. Foundations: Categories and the Like
  §1. On sets and linear and metric spaces
  §2. Topological spaces
  §3. Categories; first examples
  §4. Isomorphisms. The problem of classification of objects and morphisms
  §5. Other classes of morphisms
  §6. A sample of a category-theoretic construction: The (co)product
  §7. Functors

Chapter 1. Normed Spaces and Bounded Operators ("Waiting for Completeness")
  §1. Prenormed and normed spaces. Examples
  §2. Inner products and near-Hilbert spaces
  §3. Bounded operators: First acquaintance, examples
  §4. Topological and categorical properties of bounded operators
  §5. Some types of operators and operator constructions. Projections
  §6. Functionals and the Hahn-Banach theorem
  §7. Invitation to quantum functional analysis

Chapter 2. Banach Spaces and Their Advantages
  §1. What lies on the surface
  §2. Categories of Banach and Hilbert spaces. Classification and the Riesz-Fischer theorem
  §3. Theorem on the orthogonal complement and around it
  §4. Open mapping principle and uniform boundedness principle
  §5. Banach adjointness functor and other categorical questions
  §6. Completion
  §7. Algebraic and Banach tensor products
  §8. Hilbert tensor product

Chapter 3. From Compact Spaces to Fredholm Operators
  §1. Compact spaces and relevant functional spaces
  §2. Compact metric spaces and total boundedness
  §3. Compact operators: General properties and examples
  §4. Compact operators between Hilbert spaces
  §5. Fredholm operators and the index

Chapter 4. Polynormed Spaces, Weak Topologies, and Generalized Functions
  §1. Polynormed spaces
  §2. Weak topologies
  §3. Spaces of test functions and generalized functions
  §4. Generalized derivatives and the structure of generalized functions

Chapter 5. At the Gates of Spectral Theory
  §1. Spectra of operators and their classification. Examples
  §2. Something from algebra: Algebras
  §3. Banach algebras and spectra of their elements. Fredholm operators revisited

Chapter 6. Hilbert Adjoint Operators and the Spectral Theorem
  §1. Hilbert adjointness: First information
  §2. Selfadjoint operators and their spectra. Hilbert-Schmidt theorem
  §3. An overview: Involutive algebras, C*-algebras, and von Neumann algebras
  §4. Continuous functional calculus and positive operators
  §5. The spectral theorem as an operator-valued Riemann-Stieltjes integral
  §6. Borel calculus and the spectral theorem as an operator-valued Lebesgue integral
  §7. Geometric form of the spectral theorem: Models and classification
  §8. Proof of the final form of the spectral theorem

Chapter 7. Fourier Transform
  §1. Classical Fourier transform
  §2. Convolution. Fourier transform as a homomorphism
  §3. Fourier transform of test functions and of generalized functions
  §4. Fourier transform of square-integrable functions
  §5. A little about harmonic analysis on groups

Bibliography

Index
Preface
About functional analysis. There is a common opinion that algebra studies sets endowed with operations, such as, say, addition, multiplication, taking the inverse or symmetric element. On the other hand, topology studies sets where continuous passing from some elements to others makes sense, and in particular, where convergent sequences are defined. With the same oversimplification, we can say that functional analysis is the study of sets with synthetic (i.e., composite) structure, partly algebraic and partly topological, and the two parts are related to each other by certain natural rules.

In the original form of classical functional analysis of the 1920s the algebraic structure is the linear space structure, while the convergence of vectors is defined (and made compatible with algebraic operations) via a given norm. However, it was immediately discovered that, for the most part, the really deep results require the additional assumption of completeness of the normed space under consideration. This led to the notion of Banach space. But the greatest progress in classical functional analysis was achieved in the theory of Hilbert spaces, which are defined as Banach spaces with norm given by an inner product. Here "complete success" was achieved in a number of fundamental directions. Notably, we now know everything about the nature of Hilbert spaces (the Riesz-Fischer theorem) and about the most important classes of maps of these spaces (the Hilbert spectral theorem, the Schmidt theorem).

(The classical functional analysis of normed, Banach, and Hilbert spaces constitutes the main part of our book.)

As time passed, new problems required enrichment of the initial structures. First of all, it was realized that to study many important function spaces we cannot restrict ourselves to just one norm. The natural convergences in such spaces can be described only with the help of several norms, or, more precisely, several more general "prenorms". Thus polynormed spaces appeared, and we use them now in the theory of generalized functions, in complex analysis, and in differential geometry. They turned out to be useful in classical functional analysis as well, supplying it with new types of convergence.

(A special chapter in our book is devoted to polynormed spaces and to some of their applications.)

Another important observation was made. A deep and beautiful theory can be built if, instead of enriching the topological structure of our spaces, we enrich their algebraic structure by adding a new operation, namely, multiplication of vectors. The theory of Banach algebras was constructed in this way. Its most significant results are two fundamental "realization theorems". The first one (the Gelfand theorem) asserts that, up to some explicitly measured imprecision, every commutative Banach algebra is the algebra of continuous functions on some topological space, with pointwise operations. The second theorem (the Gelfand-Naimark theorem, which later became indispensable in modern quantum physics) states that every Banach algebra endowed with an involution (i.e., a kind of natural symmetry) which is consistent with the norm, is just the algebra of operators in a Hilbert space.

(In our book we present only the elements of the theory of Banach algebras needed for a discussion of spectra. Both theorems are given without proof, but we believe that every student should know their statements.)

Finally, the last 20 years of the 20th century were the time of birth and rapid development of a new branch of our science: quantum functional analysis, or the theory of operator spaces. In this theory linear spaces are endowed not with a norm, but with a more complicated structure, sometimes called "quantum norm".
Namely, every space of matrices (of arbitrary size) with entries in a given space has its own norm, and all these norms agree with each other in a certain reasonable way. It was realized that many difficult questions in analysis become clear and transparent if we pass from the given space to its "quantization". Some important problems, which for many years resisted solution by classical methods, were solved in this way (see [1]). At the same time, quantum functional analysis, in addition to clarifying many questions of classical analysis, has its own adornments in the form of deep and brilliant theorems that have no classical analogues.

(In this book we did not dare to plunge into quantum functional analysis. We only tried to satisfy the possible curiosity of the reader and give (in small print) two main definitions: of a quantum space and of a completely bounded operator. One of the sections of our book is devoted to these definitions, with preliminaries and discussion. The corresponding text plays the role of an advertisement and is not necessary even for advanced students.)

We have mentioned only some parts of functional analysis. A more complete discussion would have touched upon its other branches. For instance, throughout its history, functional analysis developed in close connection with harmonic analysis. The latter, roughly speaking, studies "shifts in function spaces", and can be regarded as an integral part of functional analysis. (Elements of harmonic analysis are presented in our book in the chapter devoted to the Fourier transform.) The theory of ordered spaces and the theory of topological algebras took shape and became separate areas of functional analysis. Further, we emphasize that classical functional analysis, the geometry of Banach spaces, has by no means come to a standstill. So many times "Banach geometry" was predicted to retire, but it continues to astonish us with new, deep, and unexpected results. (Some of the latest achievements, such as Gowers' theorem, are mentioned in the book.)

* * *
Of course, we would like the reader of the book to get the impression that functional analysis is a beautiful science, rich in content. But, apparently, intrinsic value is not sufficient for the full development and long life of a mathematical discipline. Every mathematical science that loses its connections with the rest of mathematics runs the danger of becoming "art for art's sake", as von Neumann wrote in [2]. As a result, we can see that in some areas, smaller and smaller questions are studied under stronger and stronger magnification, whereas in other areas (or, sometimes, in the same ones) feeble "general theories" arise with an evidently insufficient number of substantial examples.

But we can reassure the reader: these problems do not threaten functional analysis now. There are many connections with the surrounding areas of mathematics and with other sciences, and these connections are being renewed all the time. Indeed, what is mathematics? The accepted answer is now: the science of patterns. But physics is teeming with patterns from functional analysis: from the very early ones, which appeared in the calculus of variations, to the ultramodern, based on recent achievements in the theory of operator algebras (see for example [3]). (The most famous pattern belongs to von Neumann, who dared to attack quantum mechanics as a whole.) And within mathematics, functional analysis allows us to consider, from a common point of view, such seemingly different things as, say, integral equations, systems of linear equations, and certain variational problems. This again means that functional analysis suggests common patterns for a certain circle of problems, and, as a result, common methods for their study.
An instructive example of this connection of functional analysis with other areas of mathematics is the small book of the Fields Medal laureate V. F. R. Jones, Subfactors and knots [4]. Try to separate there functional analysis (to be more specific, operator algebras), group theory, low-dimensional topology (knots and links), and quantum statistical mechanics!

(Unfortunately, lack of time and space prevented us from paying much attention to external connections and applications of functional analysis. Only now and then, if the temptation to speak about the "physical meaning" or "physical context" was too strong, we allowed ourselves to include some notes of an informal or semi-literary nature. This concerns, for instance, the Riesz-Fischer theorem or the theorem on non-emptiness of a spectrum.)

On some principles for the selection of material. So you have a new textbook on functional analysis in your hands, a book intended for a first acquaintance with the subject. Of course, the author's duty is to say a few words about the specific features of the book, the selection of material, and the style of the exposition. Indeed, if the book does not differ from the other ones, why do we need it?

In this text we "grant civil rights" to some notions, methods, and results of modern functional analysis that are absent, or regarded as marginal, in other textbooks. There is no need to list them: every specialist will see this from the contents. We shall only mention some main aspects.

Perhaps the main idea is that our book is written from the categorical point of view. Everywhere we stress and comment on the categorical nature of the fundamental constructions and results (like the constructions of adjoint operators and completion, the Riesz-Fischer and the Schmidt theorems, and closer to the end of the book, the great Hilbert spectral theorem). This, as we believe, provides a new level of understanding of the topics discussed. We are sure that students (and even professors!) are ready for the perception of the very basic categorical notions (and only those are used) and, what is more important, for the unifying mathematical language of category theory. Functional analysis, with its synthetic algebraic and topological content, works very well for a first acquaintance with categories, the same way as "Analysis III" did for the exposition of the foundations of set theory 50 years ago. (Of course, the exposition must be accompanied by a sufficient supply of examples and exercises; but we shall discuss this later.)

As for specific elements of the modern techniques of analysis, we would like to distinguish tensor products of Banach spaces. We discuss two of them in detail, the "projective" and "Hilbert" tensor products, because one cannot work without them either in the geometry of Banach spaces, or in quantum statistical mechanics, or in the theory of elementary particles.
On some principles of the exposition.
1°. What we expect to be known. We assume that the reader has mastered the material usually given in the first two years at mathematics departments of Russian universities (we used Moscow State University as an example). In particular, and this indeed is very important, the reader should know linear algebra, the foundations of real analysis (namely, Lebesgue measure and integral), and the elements of the theory of metric spaces, usually presented in advanced analysis courses.

As for complex analysis, in my time, the corresponding lectures were given in the third year, then in the second, then in the third again. (The creative initiatives of the administration in this area have not settled down yet.) Fortunately, we will need complex analysis only in the middle of the book, when dealing with spectra (the Liouville theorem appears first). By that time the necessary lectures on complex analysis at Moscow State University will have been delivered in any case.

The situation with topology and also with some topics in algebra (like tensor products of linear spaces) is more delicate. Formally, this material is a part of the second-year syllabus. But, as our experience shows, it is dangerous to rely upon the corresponding obligatory courses where these questions are considered: usually the lecturers have in mind goals that are too far from functional analysis. This is why in the book we give an independent exposition of the necessary topological and algebraic results.

2°. Standard and small print. The text in ordinary print roughly corresponds, in our opinion, to the material of the course of functional analysis for third-year students of mathematics departments at Russian universities. We mean the full one-year course of functional analysis (four hours a week). The text in ordinary print is addressed to all students, no matter what their future mathematical specialty will be.

But for advanced students, who chose pure mathematics (algebra, geometry, or analysis) as their area of research, this large print is not sufficient, in our opinion. So we ask them to make themselves familiar with the small print ("noblesse oblige"). (Thus, you need not necessarily know how to decipher exact sequences of Banach spaces, but you must if you are to become a professional mathematician.)

Moreover, with the aim of satisfying possible curiosity, we also give in small print some material which is not necessary even for advanced students. Here we mean the information about quantum functional analysis, the structure of (co)products in some categories, etc. But in such cases we always let the reader know that the material is optional.

Needless to say, the text in ordinary print does not depend on the text in small print.
3°. About examples. Possibly, we give more consideration to examples than is usually done. When we introduce a new notion, we immediately collect a list of examples for the reader. For instance, we have a list of functors, a list of polynormed spaces, or a list of operators (the largest and the most important one). The reader should keep these lists ready at all times: whenever a new construction, property, or invariant appears, the reader should take examples from the lists and see which concrete form this construction acquires in these examples. (You can see how we do this with spectra of operators in Chapter 5.) We are sure this "playing with examples" is the only way to an informal understanding.

4°. About exercises. This book will not teach you much if you do not do the exercises. Of course, a reader would usually like to postpone working on exercises more or less indefinitely. This is why exercises are included directly in the text. When you encounter an exercise, stop and do it (prove the assertion) before continuing to read.

As a rule, our exercises are elementary (taking into account the hints). They illustrate the "main" text and make understanding less formal. Often they give a useful supplement to the proven proposition. There are, however, some more difficult exercises marked by the asterisk (*); here do what your conscience demands. At the other extreme, the simplest exercises are marked by a zero (0); they are absolutely necessary.

Incidentally, our exercises are never used in the proofs of theorems, or in examples, but this does not mean, of course, that you can ignore them. On the other hand, exercises have their own hierarchy: the results of some of them can be used in others.

5°. Theorems presented without proofs. The number of such theorems in this book is greater than usual. As a rule, these are results of great importance: "named theorems" that have a simple and spectacular formulation, but whose proofs are relatively complicated and/or are based on facts exceeding the knowledge of our expected reader. So it is very desirable, moreover, in our opinion, necessary, to know these facts, but at the moment it is better not to waste time on analysing the proofs. Typical examples are the Enflo-Read theorem, the Milyutin theorem, and (above all) the Gelfand-Naimark theorem. For all these theorems, a reference where the proof can be found is provided.

6°. Technical details. The book is divided into chapters and sections; a reference such as "see Section 0.6" means Section 6 in Chapter 0.

We distinguish and number independently the following types of mathematical statements: definitions, theorems, propositions, corollaries, examples, and exercises. (There are also "remarks" and "warnings", which are not numbered.) When we refer to, say, Proposition 1.2.3 (correspondingly, Proposition 2.3, Proposition 3), we have in mind Proposition 3 of Section 2 in Chapter 1 (correspondingly, Proposition 3 of Section 2 in the current chapter, or Proposition 3 in the current section).

The end of a proof is marked by the sign ∎. Some of the theorems and propositions in the book are given without proof. The "end of proof" sign placed immediately after a statement means that the proof of the statement is clear or can be easily verified. If the sign is absent, then the proof of the result can be found in the supplied reference(s). Usually, these are important theorems for which, we believe, the reader should know the statements (see the previous remark). Corollaries are assertions that immediately follow from what was proved earlier.

The symbol $\Longleftrightarrow$ means "if and only if". The combination $:=$ means "by definition".

* * *
So these are our good intentions. But, of course, the readers themselves will judge whether we carried them out successfully.

Acknowledgements. While writing this book I frequently asked my colleagues, experts in different fields of functional analysis, various questions. In particular, A. M. Stepin and O. G. Smolyanov expressed their enlightened opinions on many questions. I appreciate very much the help of A. Yu. Pirkovskii, my friend and former student, who practically played the role of the editor of this book. If we managed to make the set of errors and misprints become nowhere dense instead of everywhere dense, as was initially the case, this is entirely to his credit. In addition, Pirkovskii suggested some useful amendments to the text, and in particular, enriched it with a series of useful exercises. I would like to thank my friend S. S. Akbarov, who made a series of critical remarks about the book. Finally, I would like to acknowledge my gratitude to the Moscow Center for Continuous Mathematical Education for their suggestion to write this book, and to the Russian Foundation for Basic Research for their financial support.
Chapter 0
Foundations: Categories and the Like
A course of functional analysis at the Mechanics and Mathematics Department of Moscow State University is usually delivered to students of the third year of undergraduate study. Together with the theory of measure and integral this material was for many years the content of the course named "Analysis III". It was Andrei Nikolaevich Kolmogorov who first delivered this course in the late 1940s. He usually started this course with the basics of set theory.

Not many people contested at that time the exceptional role of set theory as the ABC for advanced sections of analysis, and for modern mathematics as a whole. Nevertheless, it was a revolutionary act to include set theory into the program. And it was not incidental that set theory was discussed only during the third year. Half-way to graduation, the students were assumed to be sufficiently mature. They had accumulated mathematical knowledge and culture, and, paraphrasing Hilbert's well-known expression, it became possible to open for them the gates of "the paradise that Cantor created for us".

Many things have changed since that time, of course. Long ago set theory became a standard topic. Psychologically it is now perceived as a subject no more complex than, say, analytic geometry, and it is often taught during the first two undergraduate years.

But nowadays set theory is not sufficient for the role of ideological and linguistic foundation of modern mathematics. This role has passed to a younger discipline at the next level of abstraction. We mean category theory, which also appeared in the 1940s. (Some aficionados of category theory prefer the name "abstract nonsense".) It is category theory that plays the unifying role for a large and increasing part of mathematics, the role that initially belonged to set theory. In particular, this "abstract nonsense" gives a universal, short, and convenient language for expressing a vast number of mathematical notions and facts. In a great many areas of algebra, geometry, and analysis, including modern functional analysis, attempts to manage without this language would be as absurd as attempts to manage without Viète's letter notation.

That is why we, paying homage to the good old tradition, start the exposition with the elements of the "all-mathematical basic science". The only difference is that now this role belongs to category theory.

But before we proceed to categories, we check our knowledge of set theory, and also of linear and metric spaces. Besides, for the reasons mentioned in the Introduction, we include here some elementary facts from topology.

1. On sets and linear and metric spaces
We assume that the reader is familiar with basic notions, facts, and notation from set theory. In particular, we use the standard notation $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{R}_+$, and $\mathbb{C}$ for the sets of integers, rational numbers, real numbers, non-negative real numbers, and complex numbers, respectively. Let us specify that natural numbers are elements of the set $\{1, 2, \dots\}$, and we denote this set by $\mathbb{N}$. The set of non-negative integers, i.e., $\mathbb{N} \cup \{0\}$, is denoted by $\mathbb{Z}_+$. The unit circle in $\mathbb{C}$ ("one-dimensional torus") is denoted by $\mathbb{T}$, the closed unit disk by $\mathbb{D}$, and the open unit disk by $\mathbb{D}^0$. The set of $n \times n$ matrices with complex entries is denoted by $M_n$.

Suppose $X_\nu$, $\nu \in \Lambda$, is an arbitrary family of sets. (By using the neutral notation $\nu$ for indices, instead of, say, $n$, we want to emphasize that we speak about a set of indices of arbitrary cardinality, not necessarily countable.) For every $\nu \in \Lambda$ consider the set $X'_\nu$ of pairs $(x, \nu)$, $x \in X_\nu$. We call the set $\bigcup \{X'_\nu : \nu \in \Lambda\}$ the disjoint union of the family $X_\nu$, $\nu \in \Lambda$. (This construction allows us to speak about the union of disjoint copies of sets in an arbitrary family.)
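To see what tagging each element with its index achieves, here is a small illustration of the disjoint union. It is not taken from the book: the two-member family and the letters $a$, $b$, $c$ (assumed to be pairwise distinct) are chosen only for this sketch.

```latex
% Illustrative example (an assumption of this sketch, not the book's):
% a family of two sets indexed by \Lambda = \{1, 2\}, sharing the element b.
\[
X_1 = \{a, b\}, \qquad X_2 = \{b, c\}, \qquad \Lambda = \{1, 2\}.
\]
% Each element is paired with the index of the set it comes from.
\[
X'_1 = \{(a,1),\,(b,1)\}, \qquad X'_2 = \{(b,2),\,(c,2)\}.
\]
% The ordinary union identifies the two copies of b; the disjoint union keeps both.
\[
X_1 \cup X_2 = \{a, b, c\}, \qquad
X'_1 \cup X'_2 = \{(a,1),\,(b,1),\,(b,2),\,(c,2)\}.
\]
```

The ordinary union has three elements, whereas the disjoint union has four, because the sets $X'_1$ and $X'_2$ no longer intersect.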
Warning. We use the terms injective, surjective and bijective for mappings of sets only, and they have the usual meaning, namely "mapping that does not glue points together", "mapping onto", "one-to-one correspondence", respectively. (In contrast to this, the terms "monomorphism", "epimorphism", and "isomorphism" will appear later with another, categorical, meaning, which we explain below.) The image of a mapping $\varphi$ is also understood in the sense of set theory and is denoted by $\operatorname{Im}(\varphi)$.
Let us recall some terminology and notation related to a given mapping of sets $\varphi : X \to Y$. If $M$ is a subset in $X$ and $\varphi_0 : M \to Y$ a mapping such that $\varphi_0(x) = \varphi(x)$ for all $x \in M$, then $\varphi_0$ is called the restriction of $\varphi$ to $M$, and $\varphi$ is an extension of $\varphi_0$ to $X$. Let $N$ be a subset in $Y$ containing $\operatorname{Im}(\varphi)$, and $\varphi^0 : X \to N$ a mapping such that $\varphi^0(x) = \varphi(x)$ for all $x \in X$. Then we say that $\varphi^0$ is a corestriction of $\varphi$ to $N$ and $\varphi$ is a coextension of $\varphi^0$ to $Y$. Finally, if for the same $M$ and $N$ a mapping $\varphi_0^0 : M \to N$ is such that $\varphi_0^0(x) = \varphi(x)$ for all $x \in M$ (i.e., "corestriction of a restriction" or, equivalently, "restriction of a corestriction"), then we say that $\varphi_0^0$ is a birestriction of $\varphi$ to the pair $(M, N)$, and $\varphi$ is a biextension of $\varphi_0^0$ to the pair $(X, Y)$. In this situation the restriction, corestriction, and birestriction are denoted by $\varphi|_M$, $\varphi|^N$, and $\varphi|_M^N$, respectively.

The set of all mappings from a set $X$ to a set $Y$ will be denoted by $Y^X$ (informally, in this notation $X$ plays the role of the "exponent of the Cartesian power"). As usual, we say that a subset $M$ in $X$ is proper if $M$ does not coincide with $X$.
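The restriction, corestriction, and birestriction defined above can change injectivity and surjectivity even though the rule $x \mapsto \varphi(x)$ stays the same. The following sketch illustrates this with the squaring map on the real line; the choice of $\varphi$, $M$, and $N$ is an assumption made for illustration and does not come from the book.

```latex
% Illustrative choice (not from the book): squaring on the real line,
% with M = N = \mathbb{R}_+; note that Im(\varphi) = \mathbb{R}_+ is contained in N.
\[
\varphi : \mathbb{R} \to \mathbb{R}, \qquad \varphi(x) = x^2, \qquad
M = N = \mathbb{R}_+ .
\]
% Restriction: smaller domain, same target. Injective, but not surjective.
\[
\varphi|_M : \mathbb{R}_+ \to \mathbb{R}, \qquad x \mapsto x^2 .
\]
% Corestriction: same domain, smaller target. Surjective, but not injective.
\[
\varphi|^N : \mathbb{R} \to \mathbb{R}_+, \qquad x \mapsto x^2 .
\]
% Birestriction: both changes at once. Bijective, with inverse y \mapsto \sqrt{y}.
\[
\varphi|_M^N : \mathbb{R}_+ \to \mathbb{R}_+, \qquad x \mapsto x^2 .
\]
```

All four mappings share one formula; they differ only in their declared domain and target, which is exactly the information these notations record.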
* * *
The term "linear space" will usually mean a vector space over the field of complex numbers $\mathbb{C}$. Sometimes, however, we have to consider linear spaces over the field of real numbers $\mathbb{R}$. These cases will be specially mentioned.

If $M$ and $N$ are subsets in a linear space $E$, then their algebraic sum (or, in short, the sum) is the set $M + N := \{y + z : y \in M,\ z \in N\}$. If $N$ consists of only one vector $x$, then we write $M + x$ instead of $M + \{x\}$ and we call this set the $x$-shift of $M$. For $M \subset E$ and $\lambda \in \mathbb{C}$ the set $\lambda M := \{\lambda y : y \in M\}$ is called the $\lambda$-dilation of $M$.

We say that a linear space $E$ is decomposed into a direct sum of subspaces $F$ and $G$ and write $E = F \oplus G$, if every $x \in E$ is uniquely represented in the form $y + z$, where $y \in F$ and $z \in G$. Geometrically this means that $F + G = E$ and at the same time $F \cap G = \{0\}$. In this situation $G$ is called a linear complement of $F$ in $E$.

Let $X$ be a subset in a linear space $E$. Its linear span is the subspace $\operatorname{span}(X)$ in $E$ consisting of all linear combinations of vectors in $X$. A set $X$ is called convex if for all points $x, y \in X$ the interval $\{tx + (1 - t)y : 0 \le t \le 1\}$ lies in $X$. Finally, $X$ is called balanced if for every point $x \in X$ the closed disk $\{\lambda x : \lambda \in \mathbb{D}\}$ lies in $X$.

The term "operator" always means "linear operator". The kernel of an operator $T$ is regarded in the sense of linear algebra; it is denoted by $\operatorname{Ker}(T)$. If $E$ and $F$ are linear spaces, then the linear space of all operators from $E$ to $F$ is denoted by $\mathcal{L}(E, F)$. We write $\mathcal{L}(E)$ instead of $\mathcal{L}(E, E)$. A subspace $F$ in $E$ is called an invariant subspace for $T \in \mathcal{L}(E)$ (or invariant with respect to $T$) if $x \in F$ implies $T(x) \in F$. In this situation the birestriction of the operator $T$ to the pair $(F, F)$ is called the birestriction of $T$ to $F$.

By a "functional" we mean an operator with values in the scalar field. The space $\mathcal{L}(E, \mathbb{C})$, i.e., the linear space of all functionals on $E$, is called the linearly dual space for $E$ and is denoted by $E^{\sharp}$. If $x \in E$ and $f \in E^{\sharp}$, then the number $f(x)$ will sometimes be denoted by $\langle f, x \rangle$.

On every linear space $E$ we can define a new multiplication by scalars by putting the new $\lambda x$ to be equal to the "initial" $\bar{\lambda} x$ ($\lambda \in \mathbb{C}$, $x \in E$). Obviously, the underlying set of the space $E$ endowed with this new multiplication by scalars and with the original addition is again a linear space. It is called the complex-conjugate space for the initial space $E$ and is denoted by $E^i$ (sometimes the notation $\overline{E}$ is also used, but we are afraid of mistaking this for the future notation for completion). A linear operator from $E^i$ to $F$ (or, equivalently, from $E$ to $F^i$), viewed as a mapping $T : E \to F$ between the initial linear spaces, is called a conjugate linear operator. In terms of the initial linear spaces this obviously means that $T$ is additive and satisfies the identity $T(\lambda x) = \bar{\lambda}\, T(x)$. If a conjugate linear operator from $E$ to $F$ is an isomorphism between $E^i$ and $F$, it is called a conjugate linear isomorphism.

Many linear spaces that are important in functional analysis consist of sequences of complex numbers. For these sequences we use the notation $\xi$, $\eta$, etc., and for their terms the notation $\xi_n$, etc. The notation $\xi = (\xi_1, \xi_2, \dots)$ will often be used as well. A sequence with 1 at the $n$th place and 0 at all other places will be called the $n$th unit vector and denoted by $p^n$. (A similar name, the $k$th unit vector, and the notation $p^k$ will be used for the vector $(0, \dots, 0, 1, 0, \dots, 0)$ (with 1 at the $k$th place) in the coordinate $n$-dimensional space $\mathbb{C}^n$.) The space of all sequences, i.e., $\mathbb{C}^{\mathbb{N}}$ in the notation of set theory, will be denoted by $c_\infty$, and its subspace consisting of finitary sequences (in other words, the linear span of unit vectors) by $c_{00}$.

Sometimes we do not know in advance whether a given ordered system of numbers $\xi_1, \xi_2, \dots$ is finite or infinite, i.e., whether we consider an element of $\mathbb{C}^n$ for some natural $n$ or an element of $\mathbb{C}^{\mathbb{N}}$. In these cases we use the expression "finite or infinite sequence".

Linear operators with one-dimensional (respectively, finite-dimensional, or $n$-dimensional) image will be called one-dimensional (respectively, finite-dimensional, $n$-dimensional).
* * *
Let us go back to sets. The question we now have to discuss belongs to the so-called rigorous, or axiomatic, set theory. This discipline exceeds the scope of our book, but some notions and facts are necessary because the major part of modern mathematics uses them explicitly or implicitly. First of all, we have in mind the well-known

Axiom of Choice (or the Zermelo axiom). Let $\Lambda$ be an arbitrary set, and $X_\nu$; $\nu \in \Lambda$ an arbitrary family of non-empty and mutually disjoint sets (with $\Lambda$ as the set of indices). Then there exists a set containing exactly one element of every set in this family.

When $\Lambda$ is infinite (even if it is countable) this statement cannot be deduced from the other axioms of set theory (which we did not list, however), provided these axioms are consistent. Perhaps the readers' intuition, "cultivated on finite sets", tells them that there is no other possibility: how could it be otherwise? But let us trust two classics of science (cited from [5, p. 3]):

"When you meet the Zermelo axiom for the first time, it looks indisputable and obvious, but as you ponder on it, it turns out to be more and more mysterious, and its corollaries more and more amazing; you end up losing its meaning, and then you ask, what does it really mean?" (Bertrand Russell)

"I reflect days and nights on Zermelo's axiom. If one would only know what an amazing thing it is!" (N. N. Luzin)

The axiom of choice resembles the famous fifth postulate of Euclid's geometry, because it looks quite unlike the other axioms of set theory. This caused heated discussions some time ago. Mathematicians tried to get rid of it, but over time it became clear that too many major mathematical facts lean on this axiom explicitly or implicitly. (For instance, if you analyze the classical fact that Cauchy's definition of the limit of a function is equivalent to Heine's, you will find that you cannot prove it without the axiom of choice; see, e.g., [6, p. 95].)^1 That is why the majority of working mathematicians have to accept the axiom of choice, or, to be more precise, come to believe in it. Otherwise the main theorems of this book would have to be thrown out.
^1 I clearly remember the following curious event: one well-known mathematician was giving a lecture in low spirits, and for some strange reason he began to vent his anger on the axiom of choice. He splashed it with sarcasm, and after that he continued the lecture by proving a theorem where he used the fact that every ideal of a ring lies in a maximal ideal. He apparently forgot that this fact can be proved only with the help of the axiom of choice (and, moreover, these two facts are equivalent).

The axiom of choice admits many equivalent formulations, and some of them look quite different from the formulation we have just given. One of these formulations, called Zorn's lemma, will be especially useful for us. Its statement does not look too transparent and requires some preparation: we need the notion of an order on a set, which is quite important by itself.

Definition 1. We say that an order is defined on a set $X$ if we have distinguished a family of ordered pairs in $X$ and the following conditions are fulfilled (here the notation $x \prec y$ means that the pair $(x, y)$ belongs to the distinguished family):
(i) if $x \prec y$ and $y \prec z$, then $x \prec z$;
(ii) for each $x$ we have $x \prec x$;
(iii) if $x \prec y$ and $y \prec x$ at the same time, then $x = y$.
A set with an order on it is called an ordered set.
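On a finite set the three conditions of Definition 1 can be checked mechanically. The following is only a minimal sketch: the function names `is_order` and `all_comparable` and the sample relations are illustrative choices, not notation from the text.

```python
from itertools import product

def is_order(X, precedes):
    """Check conditions (i)-(iii) of Definition 1 on a finite set X."""
    X = list(X)
    transitive = all(
        precedes(x, z)
        for x, y, z in product(X, repeat=3)
        if precedes(x, y) and precedes(y, z)
    )
    reflexive = all(precedes(x, x) for x in X)
    antisymmetric = all(
        x == y
        for x, y in product(X, repeat=2)
        if precedes(x, y) and precedes(y, x)
    )
    return transitive and reflexive and antisymmetric

def all_comparable(X, precedes):
    """True if for any two elements one of them precedes the other."""
    X = list(X)
    return all(precedes(x, y) or precedes(y, x) for x, y in product(X, repeat=2))

numbers = range(1, 13)
usual = lambda x, y: x <= y            # the natural order
divides = lambda x, y: y % x == 0      # x precedes y if x divides y
strict = lambda x, y: x < y            # violates (ii): x < x never holds

print(is_order(numbers, usual))          # True
print(is_order(numbers, divides))        # True
print(is_order(numbers, strict))         # False
print(all_comparable(numbers, divides))  # False: 2 and 3 are incomparable
```

Divisibility gives an order in which some pairs of elements are incomparable, while the strict inequality is not an order in the sense of Definition 1 at all, because it fails condition (ii).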
(The relation $x \prec y$ is usually expressed by saying "$x$ precedes $y$".) An order is said to be linear if for every $x, y \in X$ either $x \prec y$ or $y \prec x$. A set with a linear order is called a linearly ordered set. A given order on a set obviously generates an order on every subset of this set. Of course, every subset of a linearly ordered set is linearly ordered as well.

Clearly, every set can be made ordered if we declare that every element precedes itself, and itself only. Such an order is called discrete. Of course, it is not linear (unless the set has at most one element).

Remark. Although this example looks stupid, it has a deep meaning and, in particular, saves us from some needless illusions. The reader will see below that in practice every substantial mathematical definition admits such "stupid" but actually very useful examples.

Needless to say, each of the sets $\mathbb{N}$, $\mathbb{Z}_+$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ has a natural order: $x \prec y$ means $x \le y$. The set of all words in the Russian language has the so-called lexicographic order used in dictionaries. At every moment of time the set of all members of the English royal family is ordered by the line of succession to the throne. In particular, as of February 1, 2000, Her Majesty Elizabeth II preceded all the others, followed by His Royal Highness Charles, the Prince of Wales. All these orders are, of course, linear. But it is not difficult to construct a non-linear order, say, on $\mathbb{C}$: $a + bi \prec c + di$ if $a \le c$ and $b$
X, or if the order is reverse, the empty set. The majority of other ordered sets we mentioned in our examples have no maximal elements.)
Zorn's Lemma (see [7, VII, § 8]). Let $X$ be an ordered set with the following property: every linearly ordered subset of $X$ (in the sense of the order generated by the initial order of $X$) is bounded. Then there is at least one maximal element in $X$.
We accept without proof that Zorn's lemma and the axiom of choice are equivalent (i.e., each of them can be deduced from the other).

* * *
Let us give a typical example of an application of Zorn's lemma in linear algebra. By the way, it justifies a proposition we will need in the future.

Suppose $E$ is a linear space. We recall that a system of vectors in $E$ is called linearly independent if every finite subsystem of this system is linearly independent in the usual sense of linear algebra. By definition, a linear basis (sometimes called a Hamel basis) in $E$ is a family of elements (i.e., vectors) $e_\nu$; $\nu \in \Lambda$ possessing the following property: every $x \in E$; $x \ne 0$ can be uniquely represented as a linear combination of vectors $e_\nu$ (i.e., in the form $\sum_{k=1}^{n} \lambda_{\nu_k} e_{\nu_k}$; $\lambda_{\nu_k} \in \mathbb{C}$) with non-zero coefficients. (Of course, the linear combination here is finite, as everywhere in "pure" linear algebra.) It is not difficult to verify (do this!) that every two linear bases in a linear space have the same cardinality.

Exercise 1. Every (generally speaking, infinite-dimensional) linear space has a linear basis.

Hint. Order the set of all linearly independent systems in the given space by inclusion, and consider its maximal element (which exists by Zorn's lemma).

By definition, the linear dimension $\dim(E)$ of a linear space $E$ is the cardinality of an arbitrary linear basis in $E$: as we have said before, this cardinality does not depend on the choice of a basis. (For instance, it is not difficult to see that the space of finitary sequences and the space of all polynomials in several variables with complex coefficients have countable linear dimension; on the other hand, the linear dimension of the space of all sequences is continuum.)

If $F$ is a subspace in $E$, then the codimension $\operatorname{codim}_E F$ of $F$ in $E$ is defined as the dimension of the quotient space $E/F$. In particular, $\operatorname{codim}_E F = n$, $n \in \mathbb{N}$, means that there exists a sequence of vectors $x_1, \dots, x_n \in E$ (where $n$ cannot be reduced) such that $E = \operatorname{span}\{F, x_1, \dots, x_n\}$.

A small generalization of Exercise 1 is the following.
Exercise 2. Every linearly independent system in a linear space $E$ can be extended to a linear basis in $E$. As a consequence, for every proper subspace $F$ in $E$ there exist (i) a non-zero functional on $E$ vanishing on $F$, and (ii) a linear complement of $F$ in $E$.
This implies that every functional on $F$ can be extended to a functional on the whole $E$. In particular, if $E \ne 0$, then there exist non-zero functionals on $E$.

* * *
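In finite dimensions the extension of a linearly independent system to a basis needs no Zorn's lemma: one can simply add unit vectors one at a time. The sketch below illustrates only this finite-dimensional analogue of Exercise 2. It assumes rational coordinates (the space $\mathbb{Q}^n$ rather than $\mathbb{C}^n$) so that the arithmetic is exact, and the names `rank` and `extend_to_basis` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over the rationals (Gaussian elimination)."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, n):
    """Greedily add unit vectors of Q^n while the system stays independent."""
    basis = [list(v) for v in independent]
    assert rank(basis) == len(basis), "the given system must be independent"
    for k in range(n):
        unit = [1 if j == k else 0 for j in range(n)]
        if rank(basis + [unit]) == len(basis) + 1:
            basis.append(unit)
    return basis

print(extend_to_basis([[1, 1, 0], [0, 1, 1]], 3))
```

Every unit vector that is skipped already lies in the span of the current system, so the final system spans the whole space. In the infinite-dimensional case this step-by-step procedure is exactly what Zorn's lemma replaces.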
There is one more equivalent formulation of the axiom of choice that deserves our attention, and we also take it for granted. An element of an ordered set $M$ is said to be the least if it precedes all elements of this set. (By analogy, the greatest element is defined. Note that the greatest element is always maximal, but not vice versa.)

Zermelo's Theorem (see [7, VII, § 8]). Every set $X$ can be ordered in such a way that every non-empty subset of $X$ has a least (in this subset) element.
The proof of the equivalence of the axiom of choice, Zorn's lemma, and Zermelo's theorem can be found, for example, in [7].

Remark. In Zermelo's theorem some hypothetical order is discussed. Its explicit construction is not indicated (actually it cannot even be indicated), and of course it does not necessarily coincide with the natural order in some standard examples of sets. For instance, the natural order in $\mathbb{R}$, though linear, has nothing in common with the one discussed in this theorem.

* * *
Now let us present some notation from real analysis. (Most probably the reader learned real analysis from the books [8], [9], and/or [10].^2)

^2 An English-speaking reader probably used the book [106].

A measure space (or a space with measure) is a triple $(X, M, \mu)$ consisting of a set $X$, a $\sigma$-ring $M$ of subsets of $X$, and a $\sigma$-additive measure $\mu$ on $M$. The sets in $M$ are called measurable. If there exists a countable family $N \subset M$ such that for every $Y \in M$ and $\varepsilon > 0$ there is $Z \in N$ such that $\mu(Y \triangle Z) < \varepsilon$, we say that our measure space has a countable basis. Usually we denote measure spaces by $(X, \mu)$, keeping in mind that if the measure $\mu$ is defined, we already have the $\sigma$-ring $M$ defined as well.

A mapping of measure spaces is said to be measurable if the inverse image of every measurable set is measurable as well. A measurable mapping is said to be proper if the inverse image of every set of zero measure has zero measure as well. Two mappings of measure spaces are said to be equivalent if they coincide almost everywhere (i.e., everywhere except on a set of measure zero).

The case where $X$ is the real line $\mathbb{R}$ or an interval $[a, b]$ is the most important for us. In this case $M$ is always (unless the contrary is explicitly stipulated) defined as the $\sigma$-ring of Borel subsets. We denote this $\sigma$-ring by $\mathrm{BOR}$ or by $\mathrm{BOR}_a^b$ for $\mathbb{R}$ or $[a, b]$, respectively.

* * *
We assume that the reader is already familiar with the notion of a metric space and knows what open and closed sets in it are. We add (or recall) only the following. Metrics defined on sets generate several classes of mappings that treat these metrics in different ways.

Definition 2. A mapping $\varphi : M_1 \to M_2$ of metric spaces is called
- isometric if for all $x, y \in M_1$ we have $d(\varphi(x), \varphi(y)) = d(x, y)$;
- an isometry if it is isometric and bijective;
- a contraction if for all $x, y \in M_1$ we have $d(\varphi(x), \varphi(y)) \le d(x, y)$;^3
- uniformly continuous if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for all $x, y \in M_1$ the inequality $d(x, y) < \delta$ implies $d(\varphi(x), \varphi(y)) < \varepsilon$;
- continuous at a point $x_0 \in M_1$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for all $x \in M_1$ the inequality $d(x, x_0) < \delta$ implies $d(\varphi(x), \varphi(x_0)) < \varepsilon$;
- continuous if it is continuous at every point of $M_1$.
^3 We have to warn the reader that contractions are often defined as mappings such that d( 1.

(Give an example of a continuous but not uniformly continuous mapping!)

If $M$ is a metric space and $N$ a subset of $M$, then the restriction of the metric from $M$ to $N$ is a metric on $N$, called the metric inherited from $M$. A subset of $M$ with the inherited metric is called a metric subspace of the metric space $M$. If $x \in M$ and $r > 0$, then the open ball with center at $x$ of radius $r$, i.e., the set of those $y \in M$ for which $d(x, y) < r$, is denoted by $U(x, r)$. The distance from a point $x \in M$ to a subset $N \subset M$ is defined as $d(x, N) := \inf\{d(x, y) ;\ y \in N\}$.

2. Topological spaces

The reader remembers, of course, the definition of a converging sequence in a metric space. As a matter of fact, the notion of metric space was introduced a century ago as the structure where it is convenient to define convergence of sequences. But in time people found that not every natural convergence used in analysis can be defined in terms of a metric.

Exercise 1*. The pointwise (i.e., ordinary) convergence in $C[0, 1]$ cannot be defined by a metric. (That means, of course, that there is no metric $d$ on $C[0, 1]$ such that the pointwise convergence of $x_n$ to $x$ is equivalent to the convergence in the metric space $(C[0, 1], d)$.)

Hint. If the pointwise convergence implies the metric convergence in some metric $d$, then one can show that for every $t \in [0, 1]$ and $\varepsilon > 0$ we have $d(y, 0) < \varepsilon$ whenever $y \in C[0, 1]$ vanishes outside an interval $[t, t + h]$ with sufficiently small $h > 0$. But then there exists a sequence (for example, consisting of functions with trapezoidal graphs) converging to zero in the sense of the metric, but not pointwise.

Another structure, called a topology, provides many more possibilities.

Definition 1. Let $\tau$ be a family of subsets of a given set $\Omega$. This family is
called a topology (on $\Omega$) if it possesses the following properties:
(i) $\varnothing$ (i.e., the empty set) and $\Omega$ itself belong to $\tau$;
(ii) the union of an arbitrary subfamily (of arbitrary cardinality) of $\tau$ belongs to $\tau$;
(iii) the intersection of every finite subfamily of $\tau$ belongs to $\tau$.

A set with a given topology on it (i.e., speaking precisely, a pair consisting of a set and a topology on it) is called a topological space. The area of mathematics that studies topological spaces (and even to a greater extent their continuous mappings, which will be discussed below) is called topology. Thus, topology is one of the words used to denote not only a concrete mathematical notion, but a mathematical discipline as a whole (other such words are algebra, homology, etc.).

The sets belonging to $\tau$ are called open. Their complements in $\Omega$ are called closed. An open set containing a given point is called a neighborhood of this point. A point of a given (arbitrary) set $\Delta \subset \Omega$ is called an interior point of this set if it has a neighborhood that is contained in $\Delta$.

Suppose $\Delta$ is a subset of a topological space $\Omega$. A point $x \in \Omega$ is called an adherent point of this set if every neighborhood of $x$ contains at least one point in $\Delta$. Further, a point $x \in \Omega$ is called a limit point of a set $\Delta$ if every neighborhood of $x$ contains at least one point in $\Delta$ other than $x$, and a strict limit point of $\Delta$ if every neighborhood of $x$ contains infinitely many points in $\Delta$. (Soon we will see that these are different things.) Obviously, a subset $\Delta$ is closed in $\Omega$ $\Longleftrightarrow$ $\Delta$ contains all its adherent points $\Longleftrightarrow$ $\Delta$ contains all its limit points.
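For a finite set these notions can be tested by direct enumeration. The sketch below is only an illustration: the function names and the three-point example are assumptions made here, and the check of conditions (ii) and (iii) through pairwise unions and intersections is legitimate only because the family is finite.

```python
from itertools import combinations

def powerset(points):
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def is_topology(points, family):
    """Conditions (i)-(iii) for a family of subsets of a finite set."""
    points, family = frozenset(points), {frozenset(u) for u in family}
    if frozenset() not in family or points not in family:
        return False
    # For a finite family, pairwise unions and intersections are enough.
    return all(u | v in family and u & v in family for u in family for v in family)

def adherent_points(points, family, subset):
    subset = frozenset(subset)
    return {x for x in points
            if all(u & subset for u in map(frozenset, family) if x in u)}

def limit_points(points, family, subset):
    subset = frozenset(subset)
    return {x for x in points
            if all(u & (subset - {x}) for u in map(frozenset, family) if x in u)}

def is_closed(points, family, subset):
    return frozenset(points) - frozenset(subset) in {frozenset(u) for u in family}

points = {0, 1, 2}
tau = [set(), {0}, {0, 1}, {0, 1, 2}]
print(is_topology(points, tau))               # True
print(adherent_points(points, tau, {0}))      # every point is adherent to {0}
for s in powerset(points):
    closed = is_closed(points, tau, s)
    assert closed == (adherent_points(points, tau, s) <= s)
    assert closed == (limit_points(points, tau, s) <= s)
```

The final loop confirms, for this particular example, the two equivalent descriptions of closed sets stated above.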
To indicate a topology we do not necessarily need to indicate all its open sets. Let $(\Omega, \tau)$ be a topological space. A subfamily $\tau_0 \subset \tau$ is called a base of this space if every set from $\tau$ is a union of a subfamily (of arbitrary cardinality, generally speaking) in $\tau_0$. A family $\tau_{00}$ is called a subbase of this space if all finite intersections of sets from $\tau_{00}$ form a base of this space.

Proposition 1. Suppose $\Omega$ is a set, and $\tau_{00}$ is a family of subsets in $\Omega$ such that $\varnothing$ and $\Omega$ belong to $\tau_{00}$. Then there exists a unique topology on $\Omega$ for which $\tau_{00}$ is a subbase.

Proof. Denote by $\tau_0$ the family of all finite intersections of the sets from $\tau_{00}$, and by $\tau$ the family of all unions of the sets from $\tau_0$. Obviously, $\tau$ is a topology with $\tau_{00}$ as a subbase. Let $\tau^1$ be another topology with $\tau_{00}$ as a subbase. Then from the definition of the subbase it clearly follows that $\tau^1 \subset \tau$. On the other hand, since $\tau_{00} \subset \tau^1$, the definition of topology implies that $\tau \subset \tau^1$. ∎

The following almost evident proposition gives an alternative approach to the definition of topological space. It is based on distinguishing not the open, but the closed subsets.
Proposition 2. Suppose $\Omega$ is a topological space, and $\sigma$ is the family of all its closed subsets. Then
(i') $\varnothing, \Omega \in \sigma$;
(ii') the intersection of an arbitrary family of sets from $\sigma$ belongs to $\sigma$;
(iii') the union of an arbitrary finite family of sets from $\sigma$ belongs to $\sigma$.
Further, if $\Omega$ is an arbitrary set and $\sigma$ is a family of subsets in $\Omega$ satisfying these conditions, then the family consisting of all complements in $\Omega$ of the sets from $\sigma$ is a topology in $\Omega$. ∎
In particular, for every subset $S$ in $\Omega$ the intersection of all closed sets containing $S$ is closed as well, and this is the smallest closed set containing $S$. This set is denoted by $S^-$ and is called the closure of $S$. It is easy to see that the closure $S^-$ is precisely the set of adherent points for $S$.

Let $\Omega$ be a topological space and $S$ an arbitrary subset in $\Omega$. Then the intersections of $S$ with open subsets in $\Omega$ obviously form a topology on $S$. We call it the topology inherited from $\Omega$. The set $S$ itself with this topology is called a topological subspace of $\Omega$.

Furthermore, let $\pi$ be a surjective mapping of a topological space $\Omega$ onto a set $\Delta$. If we take the subsets in $\Delta$ with open inverse images in $\Omega$, we again obtain a topology. The set $\Delta$ with this topology is called the topological quotient space of $\Omega$ generated by the mapping $\pi$.

* * *
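The general constructions above (a topology generated by a subbase, the closure, the inherited topology, and the quotient topology) can all be carried out explicitly on a small finite set. The following sketch is an illustration under that finiteness assumption; the function names are hypothetical, and the empty set and the whole set are added automatically as the empty union and the empty intersection.

```python
from itertools import combinations

def topology_from_subbase(points, subbase):
    """Finite intersections of subbase sets first, then all unions of those."""
    points = frozenset(points)
    base = {points}                              # the empty intersection
    for u in map(frozenset, subbase):
        base |= {u & b for b in base}
    topology = {frozenset()}                     # the empty union
    for b in base:
        topology |= {b | t for t in topology}
    return topology

def closure(points, topology, subset):
    """Intersection of all closed sets containing the subset."""
    points, subset = frozenset(points), frozenset(subset)
    result = points
    for u in topology:
        closed = points - u
        if subset <= closed:
            result &= closed
    return result

def inherited_topology(topology, subset):
    return {u & frozenset(subset) for u in topology}

def quotient_topology(points, topology, pi, target):
    """Subsets of the target whose inverse images under pi are open."""
    target = list(target)
    subsets = [frozenset(c) for r in range(len(target) + 1)
               for c in combinations(target, r)]
    return {v for v in subsets
            if frozenset(x for x in points if pi(x) in v) in topology}

points = {1, 2, 3, 4}
tau = topology_from_subbase(points, [{1, 2}, {2, 3}])
print(sorted(map(sorted, tau)))
print(sorted(closure(points, tau, {2})))         # [1, 2, 3, 4]
print(sorted(closure(points, tau, {4})))         # [4]
```

In this example the point 2 is dense in the whole space, while the singleton {4} is closed. Mapping each point to its parity, `quotient_topology(points, tau, lambda x: x % 2, {0, 1})` yields only the two obligatory open sets.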
We now pass from general constructions to examples. Suppose someone gives you a set $\Omega$ and asks you to equip it with a topology. Then it is likely that you will suggest one of the following two topologies. (And your choice, probably, will depend on your temperament: apparently, a choleric person will suggest the first example, while a melancholic one the second.)

Example 1. All subsets in $\Omega$ "are declared open". Such a topology is called discrete, and $\Omega$ equipped with this topology is called a discrete topological space.

Example 2. Only two subsets in $\Omega$ "are declared open", namely the "obligatory" subsets $\varnothing$ and $\Omega$ itself. Such a topology is called antidiscrete, and $\Omega$ with this topology is called an antidiscrete topological space.

* * *
Let $S$ and $T$ be subsets in a topological space $\Omega$. We say that $S$ is dense (or everywhere dense) in $T$ if every neighborhood of every point from $T$ contains a point in $S$. (Verify that this is equivalent to the fact that $T$ is contained in the closure of $S$.) A topological space is called separable (an extremely important notion!) if it contains an at most countable dense subset. (For instance, a discrete space is separable $\Longleftrightarrow$ it is at most countable; on the other hand, antidiscrete spaces are all separable.) The following obvious proposition is useful for proving separability of a given space.

Proposition 3. Suppose $E$, $F$, $G$ are subsets in a topological space $\Omega$, $E$ is dense in $F$, and $F$ is dense in $G$. Then $E$ is dense in $G$. ∎

On the other hand, the following fact allows one to discern non-separable spaces.

Proposition 4. Suppose $\Delta$ is a subset in a separable topological space $\Omega$, and $U_x$; $x \in \Delta$ is a system of disjoint open subsets in $\Omega$ such that for each $x \in \Delta$ the set $U_x$ is a neighborhood of $x$. Then $\Delta$ is at most countable.

Proof. If $\Omega_0$ is a dense at most countable subset in $\Omega$, then we can assign to every $x \in \Delta$ an arbitrary point from $\Omega_0$ lying in $U_x$. We obtain an injective mapping from $\Delta$ to $\Omega_0$. The rest is clear. ∎

Corollary 1. Let $N$ be a subset of a separable metric space $M$ such that $d(x, y) > \theta$ for some $\theta > 0$ and for all $x, y \in N$; $x \ne y$. Then $N$ is at most countable.
Exercise 2. In a non-separable metric space there is an uncountable subset $N$ such that $d(x, y) > \theta$ for some $\theta > 0$ and for all $x, y \in N$; $x \ne y$.

Let $\tau_1$ and $\tau_2$ be two topologies on a set $\Omega$. Then we say that $\tau_1$ is not stronger (or not finer) than $\tau_2$ if every set open in the first topology is open in the second, i.e., if $\tau_1 \subset \tau_2$. In the same situation we say that $\tau_2$ is not weaker (or not coarser) than $\tau_1$. "Strictly stronger", or just "stronger" (or "finer") means "not weaker and different", and "strictly weaker" or "weaker" (or "coarser") means "not stronger and different". Let us mention the following obvious

Proposition 5. Suppose we have two topologies on a set $\Omega$. Then the first topology is not weaker than the second $\Longleftrightarrow$ for any subset in $\Omega$ every point of this subset that is an interior point with respect to the second topology is interior with respect to the first. As a corollary, two topologies coincide $\Longleftrightarrow$ every subset in $\Omega$ has the same stock of interior points with respect to these two topologies. ∎
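The relation between comparing topologies and comparing interior points can be confirmed by brute force on a three-point set, where all topologies can be listed. This is only a sanity check under that assumption, not a proof; the helper names are illustrative. It verifies that $\tau_2 \subset \tau_1$ holds exactly when, for every subset, each interior point with respect to $\tau_2$ is also an interior point with respect to $\tau_1$.

```python
from itertools import combinations

def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_topology(points, family):
    points = frozenset(points)
    return (frozenset() in family and points in family and
            all(u | v in family and u & v in family for u in family for v in family))

def all_topologies(points):
    """Test every family of subsets; feasible only for very small sets."""
    return [f for f in powerset(powerset(points)) if is_topology(points, f)]

def interior(topology, subset):
    """Points of the subset having a neighborhood contained in the subset."""
    return frozenset(x for u in topology if u <= subset for x in u)

def interiors_grow(points, coarse, fine):
    return all(interior(coarse, s) <= interior(fine, s) for s in powerset(points))

points = {0, 1, 2}
topologies = all_topologies(points)
print(len(topologies))   # 29
print(all((t2 <= t1) == interiors_grow(points, t2, t1)
          for t1 in topologies for t2 in topologies))   # True
```

The count 29 is the known number of topologies on a three-element set, which serves as an independent check of the enumeration.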
Now we answer the question that the reader may have: why do we use the terms "open" and "closed" sets? We came across them in the context of metric spaces, and they apparently have another meaning there! But there is no confusion.

Proposition 6. Let $(M, d)$ be a metric space. Then the family of open (in the sense of the given metric) subsets of $M$ is a topology.

Proof. An elementary verification shows that the conditions listed in Definition 1 are fulfilled. ∎

Thus, every metric space is automatically a topological space, and the two meanings of the term "open set" are compatible. The corresponding topology of the metric space will be called the topology generated (or induced) by the given metric. Of course, the two meanings of the other notions we have introduced here are compatible as well: "closed set", "interior point", "closure", "dense subset", and "separable space". (Explain why!)

If for a given topology there exists a metric generating it, then this topology and the corresponding topological space are called metrizable. As the simplest example, every discrete topological space $\Omega$ is metrizable: its topology is generated by the so-called discrete metric defined by the formula $d(x, y) = 1$ for $x \ne y$ and $d(x, y) = 0$ for $x = y$. (By the way, this example shows that different metrics can generate the same topology: it is not difficult to show that the topology generated by a metric $d$ is discrete $\Longleftrightarrow$ for every $x \in \Omega$ we have $\inf\{d(x, y) : y \in \Omega,\ y \ne x\} > 0$.) Another classical example of a metrizable topological space is the extended complex plane (called the Riemann sphere) $\overline{\mathbb{C}}$ (see, e.g., [12] or [106]).
The question of which topologies are metrizable and which are not is one of the most typical problems in topology (see, e.g., [13]). We give here only a very "rough" necessary condition of metrizability, which, however, is of great independent importance.

Definition 2. A topological space $\Omega$ is called Hausdorff^4 if every two distinct points of $\Omega$ have disjoint neighborhoods.

Note one useful property of such spaces: every singleton (i.e., a set consisting of only one point) in a Hausdorff space is closed (since all the points of its complement are interior). However, some other spaces can have this property as well; cf. Example 4 below.

Proposition 7. Every metrizable topological space is Hausdorff.

Proof. If $x$ and $y$ are different points of a topological space whose topology is generated by a metric $d$, then the open balls $U(x, r)$ and $U(y, r)$, $r = \frac{1}{2} d(x, y)$, are the desired neighborhoods. ∎
This implies immediately that any antidiscrete space consisting of more than one point is not metrizable.

Remark. As you know, in metric spaces every limit point is automatically a strict limit point. You can easily see that the same is true for Hausdorff spaces. But for general topological spaces this is not the case. The simplest example is the antidiscrete space consisting of two points.

Of course, every topological subspace of a Hausdorff space is a Hausdorff space as well. At the same time the passage to topological quotient spaces does not necessarily preserve the property of being a Hausdorff space (construct an example!). Examples of non-metrizable Hausdorff spaces will appear in our lectures later (e.g., spaces with weak topologies, or the space of test functions $\mathcal{D}$ in Chapter 4).
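The Hausdorff property and the two-point example from the Remark can be examined directly on finite spaces. The sketch below uses illustrative function names and assumes the topology is given as an explicit list of open sets.

```python
def is_hausdorff(points, topology):
    """Every two distinct points have disjoint neighborhoods."""
    opens = [frozenset(u) for u in topology]
    return all(
        any(x in u and y in v and not (u & v) for u in opens for v in opens)
        for x in points for y in points if x != y
    )

def limit_points(points, topology, subset):
    """Points x whose every neighborhood meets the subset outside x."""
    subset = frozenset(subset)
    return {x for x in points
            if all(u & (subset - {x}) for u in map(frozenset, topology) if x in u)}

two = {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]
antidiscrete = [set(), {0, 1}]

print(is_hausdorff(two, discrete))            # True
print(is_hausdorff(two, antidiscrete))        # False
print(limit_points(two, antidiscrete, {0}))   # {1}
```

In the antidiscrete two-point space the point 1 is a limit point of the set {0}, yet it cannot be a strict limit point, since no set in this space has infinitely many points.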
^4 In honor of the outstanding German mathematician Felix Hausdorff (1868-1942), one of the founders of topology.

Many non-Hausdorff topological spaces can be constructed with the help of a more general structure than a metric. Namely, let $M$ be a set. A function $d : M \times M \to \mathbb{R}_+$ is called a premetric or a predistance (sometimes one says "quasi" instead of "pre") if for every $x, y, z \in M$ we have $d(y, x) = d(x, y)$ and $d(x, z) \le d(x, y) + d(y, z)$. (In other words, a premetric has all the properties of a metric, except the property that $d(x, y) = 0$ implies $x = y$.) You have already implicitly encountered an important example of a premetric in measure theory:
Example 3. Suppose $(X, M, \mu)$ is a measure space. Then $d(Y, Z) := \mu(Y \triangle Z)$ is a premetric on the set $M$. (Check the required properties and the fact that in general this is not a metric.)

A set with a given premetric is called a premetric space. In a premetric space open balls, interior points of sets and open sets are defined in the same way as in metric spaces. The usage of the last term does not lead to confusion because of the following obvious

Proposition 8 (cf. Proposition 7). Let $(M, d)$ be a premetric space. Then the family of open (with respect to the given premetric) subsets is a topology. Furthermore, the corresponding topological space is Hausdorff $\Longleftrightarrow$ our premetric is a metric. ∎

By analogy with the "metric" case, we can speak of the topology generated by a premetric and of premetrizable topological spaces; the meaning of these terms is clear. Obviously, the antidiscrete topology is always generated by a premetric, namely $d(x, y) \equiv 0$. The simplest example of a non-premetrizable topological space is, apparently, the two-point set $\{0, 1\}$, where, in addition to the two obligatory open sets $\varnothing$ and $\{0, 1\}$, the set $\{0\}$ is declared open (check this!).

A subset $N$ in a premetric space is called bounded if the set of numbers $\{d(x, y) : x, y \in N\}$ is bounded. The least upper bound of the latter is called the diameter of the set $N$.

In this exposition of the foundations of topology we do not want to create the impression that topology is needed only for functional analysis. Actually, its ideas and methods run through the whole of mathematics. But the difference is that in functional analysis a typical topological space is metrizable, or "close to metrizable", while in algebra and in algebraic geometry the situation is quite different. As an illustration we consider the following example, which is rather exotic for functional analysis but typical and substantial for algebra. We mean the so-called Zariski topology in the coordinate complex space $\mathbb{C}^n$. Having in mind Proposition 2, we define the topology via closed sets rather than open sets. Everything is clear for $n = 1$:
Example 4. Let us define closed subsets in $\mathbb{C}$ as finite sets (in addition to the obligatory $\varnothing$ and $\mathbb{C}$). Conditions (i')-(iii') from Proposition 2 are verified trivially, and we obtain a topology on $\mathbb{C}$, which is quite different from the standard one used in complex analysis. It is called the Zariski topology in $\mathbb{C}$.
For an arbitrary $n \in \mathbb{N}$ the definition is as follows. For every finite set of polynomials $p_1, \dots, p_m$ in $n$ complex variables we put

$$V_{p_1, \dots, p_m} := \{ z = (z_1, \dots, z_n) \in \mathbb{C}^n : p_1(z) = \dots = p_m(z) = 0 \}$$

(this is the set of common roots of these polynomials). Denote by $\sigma$ the family of such sets ("algebraic surfaces"), together with $\mathbb{C}^n$ itself. Of course, $\sigma$ satisfies condition (i') of Proposition 2. The following observation is a little more complicated.
Exercise 3. $\sigma$ satisfies condition (iii') of Proposition 2.

But we do not dare to suggest that the reader verify condition (ii'): this would amount to repeating a remarkable achievement of the young Hilbert. The fact is that

Exercise 4°. The following statements are equivalent:
(i) the family $\sigma$ satisfies condition (ii') of Proposition 2;
(ii) for every family of polynomials $p_\nu$; $\nu \in \Lambda$ (of arbitrary cardinality), there exists a finite family of polynomials $q_1, \dots, q_k$ such that

$$\{ z \in \mathbb{C}^n : p_\nu(z) = 0 \text{ for all } \nu \in \Lambda \} = V_{q_1, \dots, q_k}.$$

(The second statement is one of the equivalent formulations, up to some inessential simplifications, of the celebrated Hilbert basis theorem; see, e.g., [15].)

Thus we see that $\sigma$ has the properties of the system of closed sets, and hence defines a topology on $\mathbb{C}^n$. This is the Zariski topology for an arbitrary $n$; of course for $n = 1$ we obtain the topology described in Example 4. It is easy to see that for every $n \in \mathbb{N}$ the space $\mathbb{C}^n$ with the Zariski topology is not Hausdorff and, as a corollary, not metrizable. Moreover, it is not premetrizable (check this!).
Now we give a definition that distinguishes the most important class of mappings between topological spaces.

Definition 3. Let $\Omega$ and $\Delta$ be topological spaces. A mapping $\varphi : \Omega \to \Delta$ is said to be continuous at a point $x \in \Omega$ if for every neighborhood $U$ of the point $\varphi(x)$ in $\Delta$ there exists a neighborhood $V$ of the point $x$ in $\Omega$ such that $\varphi(V) \subset U$. A mapping from $\Omega$ to $\Delta$ is said to be continuous if it is continuous at every point of the space $\Omega$.

This notion is compatible with the notion of continuity in the context of metric spaces (see the preceding section). The following proposition is easily verified. It provides an equivalent definition of continuity, which is often used as the initial definition in textbooks.

Proposition 9. A mapping between topological spaces $\Omega$ and $\Delta$ is continuous $\Longleftrightarrow$ the inverse image of every open set in $\Delta$ is open in $\Omega$ $\Longleftrightarrow$ the inverse image of every closed set in $\Delta$ is closed in $\Omega$. Moreover, for a mapping to be continuous it is sufficient that for some subbase of the topology in $\Delta$ the inverse image of every set from this subbase is open in $\Omega$. ∎

The part of this proposition concerning the closed sets immediately implies the following useful

Proposition 10. Suppose $\varphi : \Omega \to \Delta$ is a continuous mapping between topological spaces, and $\Delta$ is Hausdorff. Then the inverse image of every singleton in $\Delta$ is closed in $\Omega$. ∎
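For mappings between finite topological spaces, the pointwise definition of continuity and the criterion through inverse images of open sets can be compared directly. The following is a small sketch with hypothetical helper names; the two spaces and the two mappings are chosen here purely for illustration.

```python
def preimage(phi, domain, subset):
    return frozenset(x for x in domain if phi(x) in subset)

def is_continuous(phi, domain, tau_domain, tau_target):
    """Criterion: the inverse image of every open set is open."""
    opens = {frozenset(u) for u in tau_domain}
    return all(preimage(phi, domain, v) in opens for v in tau_target)

def is_continuous_at(phi, x, tau_domain, tau_target):
    """Pointwise definition: every neighborhood U of phi(x) contains phi(V)
    for some neighborhood V of x."""
    return all(
        any(x in v and all(phi(p) in u for p in v) for v in tau_domain)
        for u in tau_target if phi(x) in u
    )

domain = {0, 1, 2}
tau_domain = [set(), {0}, {0, 1}, {0, 1, 2}]
tau_target = [set(), {"a"}, {"a", "b"}]      # a two-point space with {"a"} open

phi = {0: "a", 1: "a", 2: "b"}.get
psi = {0: "b", 1: "a", 2: "a"}.get

print(is_continuous(phi, domain, tau_domain, tau_target))   # True
print(is_continuous(psi, domain, tau_domain, tau_target))   # False
print([is_continuous_at(psi, x, tau_domain, tau_target) for x in sorted(domain)])
```

The mapping `psi` is continuous at the point 0 but not at the points 1 and 2, in agreement with the fact that the inverse image of the open set {"a"} under `psi` is not open.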
2.
17
Topological spaces Exercise 5 ° .
(i) A topological space 0 is discrete <====> for every topological space � any mapping from n to � is continuous. (ii) A topological space 0 is antidiscrete <====> for every topological space � any mapping from � to n is continuous. Remark. In contrast to the notion of continuous mapping we have just
considered, the notion of uniformly continuous mapping of metric spaces mentioned in the previous section has no reasonable analogue for topological spaces. The role of general structure which admits such an analogue belongs to so-called uniform spaces; see, e.g. , [13]. The set of all continuous complex-valued functions (i.e. , mappings to CC) defined on a topological space 0 is denoted by C(O) . Clearly, this is a linear space with respect to the pointwise operations. (Explain what C(O) is for a discrete and for an antidiscrete n.) We distinguish several important classes of continuous mappings of topo logical spaces. Suppose rp n � is such a mapping. It is said to be - a homeomorphism if it has continuous inverse mapping; - topologically injective if it provides a homeomorphism between n and Im( rp ), where the latter is regarded as a topological subspace in �; - open if the image of every open set in 0 is open in �; - topologically surjective if it is surjective and the topology in � coincides with the topology of the corresponding topological quotient space. Thus, we see that a mapping between topological spaces is a homeomor phism <====> it is surjective and topologically injective <====> it is injective and topologically surjective. :
Definition 4. A path in a topological space $\Omega$ is a continuous mapping $\gamma : [0, 1] \to \Omega$. We say that a given path connects the points $\gamma(0)$ and $\gamma(1)$.
A topological space where every two points can be connected by a path is called path-connected.
As we mentioned before, a topology provides many more possibilities for investigating different types of convergence than a metric. Now we can explain why. The standard notion of convergent sequences in metric spaces prompts the following
Definition 5. Let $x_n$; $n \in \mathbb{N}$ be a sequence of elements of a topological space $\Omega$. We say that this sequence converges to an element $x \in \Omega$, which is called the limit, if for every neighborhood $U$ of $x$ there exists a positive integer $N$ such that $x_n \in U$ for $n > N$.
Exercise 6. There is a topology on $C[0, 1]$ such that the convergence of a sequence in this topology coincides with the pointwise convergence (although there is no metric with this property; see Exercise 1).
Hint. For every triple ($f \in C[0, 1]$, $t \in [0, 1]$, $\varepsilon > 0$) consider the set $U_{f,t,\varepsilon} := \{ g \in C[0, 1] : |f(t) - g(t)| < \varepsilon \}$, and take such sets as a subbase of this topology.
The following obvious proposition is a topological analogue of the theorem on the uniqueness of the limit.
Proposition 11. In a Hausdorff space every sequence of elements has at most one limit. ■
Some non-Hausdorff spaces also have this property. For instance, every uncountable space $\Omega$ where (apart from the obligatory $\Omega$ and $\emptyset$) finite or countable subsets are declared closed may serve as an example; the reader can easily verify that in this space every convergent sequence is constant for sufficiently large indices. But, of course, we cannot just omit the Hausdorffness condition. For instance, in an antidiscrete space every point is a limit of every sequence.
In metrizable and, more generally, premetrizable spaces the topology is defined uniquely in terms of convergent sequences. This is the content of
Proposition 12. Let $U$ be a subset of a premetrizable topological space $M$. Then $U$ is open $\iff$ for every sequence $x_n \in M$ converging to some $x \in U$, all its elements with sufficiently large indices belong to $U$.
Proof. Let $d$ be a premetric generating this topology. If $U$ is not open, then there is a point $x \in U$ which is not interior for $U$. Then, for any $n \in \mathbb{N}$, the open ball with center at $x$ and radius $1/n$ does not lie entirely in $U$, hence there exists $x_n \in M$ such that $d(x, x_n) < 1/n$ and $x_n \notin U$. This means that $x_n$ converges to $x$ and at the same time has no intersections with $U$. The rest is clear. ■
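The contrast between Hausdorff and antidiscrete spaces drawn above can be tried out on a three-point set. The sketch below is only an illustration under a simplifying assumption: a sequence is described by the set `tail_values` of values it takes for all sufficiently large indices, each of them infinitely often, which suits eventually periodic sequences in a finite space. All names are hypothetical.

```python
from itertools import chain, combinations

def powerset(points):
    """All subsets of `points`, i.e. the discrete topology."""
    pts = list(points)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(pts, r) for r in range(len(pts) + 1))}

def limits(tail_values, points, opens):
    """Points x such that every open set containing x contains the whole tail.

    Under the assumption on `tail_values` stated above, this is exactly
    the set of limits of the sequence in the sense of Definition 5.
    """
    tail = frozenset(tail_values)
    return {x for x in points if all(tail <= u for u in opens if x in u)}

points = {0, 1, 2}
discrete = powerset(points)                        # a Hausdorff topology
antidiscrete = {frozenset(), frozenset(points)}    # only the obligatory sets

constant = {0}         # the sequence 0, 0, 0, ...
alternating = {0, 1}   # the sequence 0, 1, 0, 1, ...

print(limits(constant, points, discrete))
print(limits(alternating, points, discrete))
print(limits(alternating, points, antidiscrete))
```

In the discrete topology each sequence has at most one limit, while in the antidiscrete topology every point is a limit of the alternating sequence.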
This proposition implies the equivalence of the notions of limit in the sense of Cauchy and in the sense of Heine for mappings between spaces of this class:
Proposition 13. Let $\varphi : \Omega \to \Delta$ be a mapping between two topological spaces. If $\varphi$ is continuous, then, for every sequence $x_n \in \Omega$ converging to some $x \in \Omega$, the sequence $\varphi(x_n)$ converges to $\varphi(x)$. If, in addition, $\Omega$ is premetrizable, then the converse is also true.
Proof. The first statement is obvious. Suppose $\Omega$ is premetrizable and $\varphi$ "preserves convergent sequences". Take an open set $U$ in $\Delta$ and consider its inverse image $V$ in $\Omega$. If a sequence $x_n$ in $\Omega$ converges to a point $x \in V$, then, according to our assumption, the sequence $\varphi(x_n)$ converges to $\varphi(x)$. As $\varphi(x) \in U$ and $U$ is open, the sequence $\varphi(x_n)$ belongs to $U$ for sufficiently large $n$. Therefore $x_n$ belongs to $V$ for the same (sufficiently large) $n$. But according to the previous proposition, this means that $V$ is open. We have verified the equivalent definition of continuity provided by Proposition 9. ■
Finally, we state the following proposition, which is important in applications.
Proposition 14. Suppose $\Omega$ is an arbitrary topological space, $\Delta$ is a Hausdorff topological space, $M$ is a dense subset in $\Omega$, and $\varphi, \psi : \Omega \to \Delta$ are continuous mappings. If $\varphi$ and $\psi$ coincide on $M$, then $\varphi = \psi$.
Proof. Assume that, on the contrary, $\varphi(x) =: y \ne z := \psi(x)$ for some $x \in \Omega$. Take disjoint neighborhoods $V_y$ and $V_z$ of these points. By the hypothesis, the neighborhood $U := \varphi^{-1}(V_y) \cap \psi^{-1}(V_z)$ of $x$ contains at least one point $x' \in M$. As $\varphi(x') \in V_y$ and $\psi(x') \in V_z$, these points are different. But this contradicts the fact that $\varphi|_M = \psi|_M$. ■
In a general topological space, even a Hausdorff one, neither the topology nor the continuity of mappings can be characterized in terms of converging sequences. There are such examples "inside" functional analysis as well; see Exercise 4.2.4 below. A topological counterexample is as follows. Take a non-discrete space where every converging sequence is constant for large enough indices. We have already mentioned (after Proposition 11) one such space in connection with other questions. (A more complicated example, extremely important for general topology, is the space $\beta\mathbb{N}$; see, e.g., [14, Corollary 3.6.15].) Of course, neither Proposition 12 nor Proposition 13 is true for such spaces (explain why).
Such nuisances, however, disappear if we consider the so-called nets, a generalization of the notion of sequence.
Definition 6. An ordered set $\Lambda$ is said to be directed if for every $\lambda, \mu \in \Lambda$ there exists $\nu \in \Lambda$ such that $\lambda \prec \nu$ and $\mu \prec \nu$ (in other words, every two-point set in $\Lambda$ is bounded). A mapping from a directed set to an arbitrary set $X$ is called a net in $X$.
The element in $X$ corresponding to an element $\nu \in \Lambda$ is usually denoted $x_\nu$; it is called the element of the net with index $\nu$. Of course, sequences are special cases of nets corresponding to $\Lambda = \mathbb{N}$.
Definition 7. Let $x_\nu$; $\nu \in \Lambda$ be a net in a topological space $\Omega$. A point $x \in \Omega$ is called the limit of this net if for every neighborhood $U$ of this point there exists $\lambda \in \Lambda$ such that for all $\nu \in \Lambda$ the inequality $\lambda \prec \nu$ implies $x_\nu \in U$.
Remark. Certainly, the readers have encountered at least once in their mathematical lives the notion of a "true" net (i.e., a net which is not a sequence). Recall the definition of the Riemann integral: it is precisely the limit of a net of integral sums, although this construction possibly was not called a net. (Describe the corresponding directed set.)
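The condition of Definition 6 is easy to test on finite ordered sets. The sketch below assumes that the order relation is reflexive (so that an element bounds itself); the function name and the sample sets are hypothetical illustrations. The last sample orders the neighborhoods of a point by reverse inclusion.

```python
def is_directed(elements, precedes):
    """Definition 6: every two elements have a common upper bound."""
    elements = list(elements)
    return all(any(precedes(a, c) and precedes(b, c) for c in elements)
               for a in elements for b in elements)

def divides(a, b):
    """The order 'a precedes b' meaning that a divides b."""
    return b % a == 0

def finer(u, v):
    """Reverse inclusion: u precedes v when v is contained in u."""
    return v <= u

divisors_of_12 = [1, 2, 3, 4, 6, 12]
# Neighborhoods of the point 1 in the topology {{}, {1}, {1, 2}, {1, 2, 3}}.
neighborhoods = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]

print(is_directed(divisors_of_12, divides))   # 12 bounds every pair
print(is_directed([2, 3], divides))           # 2 and 3 have no common bound here
print(is_directed(neighborhoods, finer))      # intersections give upper bounds
```

Directed sets of the third kind, made of neighborhoods ordered by reverse inclusion, are the typical index sets for nets that are not sequences.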
Exercise 7 (cf. Proposition 12). Let $U$ be a subset of a topological space $\Omega$. Then $U$ is open $\iff$ for every net $x_\nu \in \Omega$; $\nu \in \Lambda$ converging to a point $x \in U$ there exists $\lambda \in \Lambda$ such that for all $\nu \in \Lambda$; $\lambda \prec \nu$ we have $x_\nu \in U$.
Hint. If $x$ is not interior for $U$, then we can define $\Lambda$ as the set of all neighborhoods of $x$ with the order "$U \prec V \iff V \subset U$". After that to every neighborhood $V$ we assign an arbitrary point of $V$ not lying in $U$.
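Exercise 8. Let $\varphi : \Omega \to \Delta$ be a mapping between topological spaces. Then $\varphi$ is continuous $\iff$ for every net $x_\nu \in \Omega$; $\nu \in \Lambda$, converging to some $x$, the net $\varphi(x_\nu)$ converges to $\varphi(x)$.
Hint. This follows from Exercise 7.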
The Hausdorff property can be characterized in terms of nets (but not in terms of sequences; see above).
Exercise 9. A topological space is Hausdorff $\iff$ every net has at most one limit.
Hint. Assume that $x$ and $y$ have no disjoint neighborhoods. Define $\Lambda$ as the set of all pairs $(U_x, U_y)$ of neighborhoods of $x$ and $y$ with the order "$(U'_x, U'_y) \prec (U''_x, U''_y) \iff (U''_x \subset U'_x)$ and $(U''_y \subset U'_y)$". Then to every such pair of neighborhoods we assign a point in their intersection.
In conclusion of this introduction to topology let us note the following. Of course, Exercise 6 is not the only example showing that a topology is more efficient than a metric. We will return to this many times in the book when studying weak topologies, generalized functions and other questions. Nevertheless, there are some important types of convergence in analysis that cannot be described even by a topology:
Exercise 10 (O. G. Smolyanov). On the set of measurable functions defined on the closed interval $[0, 1]$, the convergence (of a sequence) almost everywhere cannot be described by any topology.
Hint. Use the following two facts: (i) a sequence converging in measure does not necessarily converge almost everywhere, but it contains a subsequence converging almost everywhere; (ii) if a sequence does not converge to a given point in a topological space, then it contains a subsequence lying outside of some neighborhood of this point.
3. Categories; first examples
Terminological remark. When speaking about totalities or families of objects, we use two terms: "set" or "class". The fact is that we cannot use the same term for all conceivable totalities. For example, if we allow the notion "set of all sets" to make sense, we immediately obtain a series of well-known paradoxes, which overshadowed the last years of Georg Cantor. So in the cases where we are not sure that we can use the term "set" (to speak informally, this concerns "too huge totalities"), we use the term "class". Thus, every set is a class, but not vice versa; say, the class of all linear spaces is not a set. This "naive" approach is absolutely sufficient for our needs. However, in formal set theory everything is much more complicated. The explanation of what sets and classes are, and why this "playing with words" saves us from contradictions, exceeds the scope of this book; see,
e.g., [13]. (We only hint that it is expedient to define sets as those classes that are allowed to be elements of other classes.)
Definition 1. We say that a category $\mathcal{K}$ is defined if
I. A class $\mathrm{Ob}(\mathcal{K})$ is indicated with elements called objects of the category $\mathcal{K}$ (usually they are denoted by letters $X, Y, \dots$, and we write $X \in \mathcal{K}$, having in mind $X \in \mathrm{Ob}(\mathcal{K})$).
II. For every ordered pair $X, Y \in \mathcal{K}$ a set (now indeed a set!) $h_{\mathcal{K}}(X, Y)$ is indicated; elements of this set are called morphisms from $X$ to $Y$, or morphisms between $X$ and $Y$. (The condition $\varphi \in h_{\mathcal{K}}(X, Y)$ can be written as $\varphi : X \to Y$ or $X \xrightarrow{\varphi} Y$, as if we were talking about mappings of sets; below we will see why this notation is useful.) The object $X$ is called the domain, and $Y$ the range of the morphism $\varphi$.
III. For every triple $X, Y, Z \in \mathcal{K}$ and for every pair $\varphi : X \to Y$, $\psi : Y \to Z$ (here the range of the first morphism coincides with the domain of the second morphism) a morphism from $X$ to $Z$ is defined; it is called the composition of morphisms $\varphi$ and $\psi$. (This morphism is denoted by $\psi \circ \varphi$ or $\psi\varphi$; please pay attention to the order of the symbols.)
Moreover, the following two conditions are supposed to be fulfilled:
(i) (associativity of composition) For every $X, Y, Z, U \in \mathcal{K}$, $\varphi : X \to Y$, $\psi : Y \to Z$, and $\chi : Z \to U$ we have $(\chi \circ \psi) \circ \varphi = \chi \circ (\psi \circ \varphi)$. (In other words, if a morphism $(\chi \circ \psi) \circ \varphi$ or, which is equivalent, a morphism $\chi \circ (\psi \circ \varphi)$ makes sense, then these two morphisms coincide.)
(ii) For every $X \in \mathcal{K}$ there is a morphism $1_X : X \to X$, called the local identity for $X$, such that for every $Y \in \mathcal{K}$ and $\varphi : X \to Y$ (respectively, $\psi : Y \to X$), we have $\varphi \circ 1_X = \varphi$ (respectively, $1_X \circ \psi = \psi$). (In other words, if $\varphi \circ 1_X$ makes sense, then it is $\varphi$, and if $1_X \circ \psi$ makes sense, then it is $\psi$.)
Thus, a category consists of three ingredients: objects, morphisms, and the law of composition, and they satisfy two axioms: associativity of composition and the existence of local identities.
Of course, as it happens in algebra, associativity allows us to use long expressions like $\varphi_1 \circ \varphi_2 \circ \cdots \circ \varphi_n$, where the parentheses can be placed arbitrarily. But the difference is that not every such expression makes sense.
If a category is chosen, we often write $h(X, Y)$ instead of $h_{\mathcal{K}}(X, Y)$. The class of all morphisms of the category $\mathcal{K}$, i.e., the union $\bigcup h_{\mathcal{K}}(X, Y)$ over all $X, Y \in \mathcal{K}$, is denoted by $h_{\mathcal{K}}$. Sometimes, if there is no danger of misunderstanding, we write $\varphi \in \mathcal{K}$ instead of $\varphi \in h_{\mathcal{K}}$.
A morphism such that its domain and range coincide (like, say, a local identity) is called an endomorphism.
Proposition 1. For every object there is only one local identity.
Proof. If $1'_X$ is another candidate for local identity in $X$, then by property (ii), $1_X 1'_X = 1'_X$ and at the same time $1_X 1'_X = 1_X$. ■
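The three ingredients and two axioms can be spelled out for a category with finitely many morphisms given by explicit tables. The sketch below is an illustration only: the class `FiniteCategory`, its methods, and the sample category (two objects, an idempotent endomorphism `a` of `X`, and an arrow `f` from `X` to `Y`) are hypothetical and not taken from the text.

```python
class FiniteCategory:
    """A finite category given by explicit tables (a sketch, not a library)."""

    def __init__(self, morphisms, compose, identity):
        self.morphisms = morphisms   # name -> (domain, range)
        self.compose = compose       # (psi, phi) -> name of psi o phi
        self.identity = identity     # object -> name of its local identity

    def composable(self, psi, phi):
        # psi o phi makes sense when the range of phi is the domain of psi
        return self.morphisms[phi][1] == self.morphisms[psi][0]

    def check_axioms(self):
        names = list(self.morphisms)
        for phi in names:
            dom, rng = self.morphisms[phi]
            # axiom (ii): local identities do not change a morphism
            if self.compose[(phi, self.identity[dom])] != phi:
                return False
            if self.compose[(self.identity[rng], phi)] != phi:
                return False
        for phi in names:
            for psi in names:
                if not self.composable(psi, phi):
                    continue
                for chi in names:
                    if not self.composable(chi, psi):
                        continue
                    # axiom (i): (chi o psi) o phi == chi o (psi o phi)
                    left = self.compose[(self.compose[(chi, psi)], phi)]
                    right = self.compose[(chi, self.compose[(psi, phi)])]
                    if left != right:
                        return False
        return True

    def local_identities(self, obj):
        """All endomorphisms of `obj` that behave as in axiom (ii)."""
        found = []
        for e, (dom, rng) in self.morphisms.items():
            if dom != obj or rng != obj:
                continue
            right_ok = all(self.compose[(phi, e)] == phi
                           for phi, (d, _) in self.morphisms.items() if d == obj)
            left_ok = all(self.compose[(e, psi)] == psi
                          for psi, (_, r) in self.morphisms.items() if r == obj)
            if right_ok and left_ok:
                found.append(e)
        return found

morphisms = {"1_X": ("X", "X"), "a": ("X", "X"), "f": ("X", "Y"), "1_Y": ("Y", "Y")}
compose = {
    ("1_X", "1_X"): "1_X", ("a", "1_X"): "a", ("f", "1_X"): "f",
    ("1_X", "a"): "a", ("a", "a"): "a", ("f", "a"): "f",
    ("1_Y", "f"): "f", ("1_Y", "1_Y"): "1_Y",
}
category = FiniteCategory(morphisms, compose, {"X": "1_X", "Y": "1_Y"})

print(category.check_axioms())
print(category.local_identities("X"))
```

The second call searches all endomorphisms of `X` for candidates for the local identity; in agreement with Proposition 1 it can find only one.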
A category $\mathcal{K}$ is called a subcategory of a category $\mathcal{L}$ if, first, every object and every morphism in $\mathcal{K}$ are respectively an object and a morphism in $\mathcal{L}$, second, composition of morphisms in $\mathcal{K}$ is the same as in $\mathcal{L}$, and, third, every local identity in $\mathcal{K}$ is a local identity in $\mathcal{L}$. A subcategory is said to be full if for every $X, Y \in \mathcal{K}$ we have $h_{\mathcal{K}}(X, Y) = h_{\mathcal{L}}(X, Y)$.
Now we invite the reader to the zoo of examples of categories. As a rule, we restrict ourselves to defining objects, morphisms, and the law of composition; in all cases the axioms of category are verified trivially.
As usual (cf. the discussion in Subsection 0.1), the first example produces a delusive impression of being "silly".
Example 1. Every given class is turned into a category if we declare its elements to be objects and at the same time local identities, and say that there are no other morphisms. Such a category is called discrete.
Another example is already "taken from real life".
Example 2. The category of sets Set. Its objects are arbitrary sets, morphisms are mappings of sets, the composition of morphisms is the usual composition of mappings, and local identities are identity mappings.
Actually, in the historically first examples of categories their objects were sets with some additional structure, and morphisms were mappings of the sets compatible, in some sense, with this structure. It was the consideration of these examples that led to the realization of the following fact: in a substantial mathematical theory the definition of objects should be accompanied by the definition of morphisms, and "morphisms are more important than objects". The influence of these ideas on modern mathematics is difficult to overestimate.
In all examples of this type the composition of morphisms is their composition as mappings, and local identities are identity mappings. This will always be assumed in the sequel.
Example 3. The category Lin of linear spaces (we recall: over $\mathbb{C}$). Its objects are linear spaces, and morphisms are operators. This category has an important full subcategory FLin consisting of finite-dimensional spaces.
Note that Lin is not a subcategory of Set since we can define many different structures of linear space on the same set. (Formally, a linear space is not a set, but a pair consisting of a set and a linear structure on it.)
Among purely algebraic categories, we mention, in addition to Lin, the category Ab of abelian groups, the category Gr of (all) groups (of course, Ab is a full subcategory in Gr), and the category Rin of all (not necessarily unital) rings. Morphisms in these categories are what is usually called homomorphisms of groups and rings, respectively. Some algebraic categories that are important for functional analysis will be mentioned later (see Sections 5.2 and 6.3).
Example 4. The category Ord of ordered sets. Its objects are ordered sets and morphisms are the so-called monotone mappings, i.e., order-preserving (the meaning of this must be clear) mappings.
Example 5. The category Met of metric spaces. Its objects are metric spaces, and morphisms are continuous mappings. This category is apparently the most important one in the theory of metric spaces. However, sometimes it is reasonable to consider some other categories, in particular, $\mathrm{Met}_u$ and $\mathrm{Met}_1$. As in Met, their objects are metric spaces, but morphisms in $\mathrm{Met}_u$ are uniformly continuous mappings, and in $\mathrm{Met}_1$ contractions. Of course, $\mathrm{Met}_u$ is a subcategory in Met, and $\mathrm{Met}_1$ is a subcategory in $\mathrm{Met}_u$, but these subcategories are not full.
Example 6. The category Top of topological spaces. Its objects are topological spaces and morphisms are continuous mappings. This category has an important full subcategory HTop consisting of Hausdorff spaces. The latter in turn has a full subcategory consisting of metrizable topological spaces. (Note that here we say metrizable, but not metric. Indeed, Met is not a subcategory of Top for the same reason that Lin is not a subcategory of Set (explain this reason!).)
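To see concretely why the subcategories of Example 5 are not full, it is enough to exhibit a morphism of the larger category that is missing from the smaller one. The worked example below assumes that a contraction means a mapping that does not increase distances, and it takes the real line with its usual metric; the functions $f$ and $g$ are illustrative choices, not taken from the text.

```latex
% f is a morphism of Met but not of Met_u; g is a morphism of Met_u but not of Met_1.
\[
f,\, g : \mathbb{R} \to \mathbb{R}, \qquad f(x) := x^{2}, \qquad g(x) := 2x .
\]
% f is continuous, but for every \delta > 0 and x \ge 0 the points x and x + \delta/2 satisfy
\[
\left| f\!\left(x + \tfrac{\delta}{2}\right) - f(x) \right|
  = \delta x + \tfrac{\delta^{2}}{4} \longrightarrow \infty \quad (x \to +\infty),
\]
% so f is not uniformly continuous.
% g is uniformly continuous, yet
\[
|g(x) - g(y)| = 2\,|x - y| > |x - y| \quad (x \ne y),
\]
% so g is not a contraction.
```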
As in the theory of metric spaces, in real analysis (and in the related topics of ergodic theory) there are several reasonable approaches to the question of which mappings of measure spaces should be viewed as compatible with the structure of measure space. Accordingly, in these areas we can speak about several substantial categories. We mention here only one of them.
Example 7. The category Meas. Its objects are measure spaces. As for morphisms, they are classes of equivalent proper measurable mappings (see Section 0.1) between the corresponding spaces (here we have in mind the general principle of real analysis saying that we should not distinguish between equivalent mappings). The composition of morphisms is defined as the equivalence class of compositions of representatives of the initial classes (check that this definition does not lead to misunderstanding). The axioms of category are easily verified, and obviously, local identities are precisely the equivalence classes of identity mappings.
These examples suggest that different "modern" mathematical sciences study their own categories, or to be more precise, classes of categories. Generally speaking, this is indeed the case, although with some degree of oversimplification. (In particular, the same, or maybe even greater, consideration is given to the so-called functors, which will be defined in Section 7.)
The number of such examples will essentially increase, mostly at the expense of the categories serving functional analysis (see below Sections 1.4, 2.2, 4.1, 5.2-5.3, 6.3).
Quite a few categories allow us to formulate some mathematical results briefly and elegantly. Here is one such example.
Example 8. The standard simplicial category $\Delta$. Its objects are intervals $\Delta_n := \{0, \dots, n\}$ of the set of non-negative integers, and morphisms are non-decreasing mappings. Thus, $\Delta$ can be regarded as a full subcategory in Ord.
On the outstanding role of this category in many areas of mathematics you can read, for example, in [16].
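Since the objects of the simplicial category are finite, its sets of morphisms can be listed by brute force. The sketch below is only an illustration: the function names are hypothetical, and the closed formula used for comparison is the standard count of multisets, stated here as a known combinatorial fact rather than something taken from the text.

```python
from itertools import product
from math import comb

def morphisms(m, n):
    """All non-decreasing mappings {0,...,m} -> {0,...,n}, as tuples of values."""
    return [f for f in product(range(n + 1), repeat=m + 1)
            if all(f[i] <= f[i + 1] for i in range(m))]

def compose(g, f):
    """Composition g o f of mappings stored as tuples of values."""
    return tuple(g[i] for i in f)

# The number of morphisms agrees with the count of multisets of size m + 1
# drawn from n + 1 elements.
for m in range(4):
    for n in range(4):
        assert len(morphisms(m, n)) == comb(m + n + 1, m + 1)

# Composing non-decreasing mappings gives a non-decreasing mapping again.
closed = all(compose(g, f) in morphisms(1, 1)
             for f in morphisms(1, 2) for g in morphisms(2, 1))

print(len(morphisms(1, 2)), closed)
print([len(morphisms(n, 0)) for n in range(4)])   # arrows into {0}
print(len(morphisms(0, 1)))                       # arrows from {0} to {0, 1}
```

The last two lines count the arrows into and out of the one-point object, which is a convenient way to experiment with special objects of this category.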
The following example is of a rather general nature.
Example 9. Let $\mathcal{K}$ be an arbitrary category. Its dual category is the category $\mathcal{K}^0$ that has the same objects as $\mathcal{K}$, but the set of morphisms $h_{\mathcal{K}^0}(X, Y)$ is, by definition, $h_{\mathcal{K}}(Y, X)$, and the composition of morphisms $\psi \circ \varphi$ in $\mathcal{K}^0$ (if it exists) is defined as the morphism $\varphi \circ \psi$ in $\mathcal{K}$.
This is an extremely useful example. As we will see many times, it allows us to cut by half the number of definitions, theorems, and other mathematical statements. (As a consequence, we save paper. In this way we can claim that categories have applications in the national economy.)
An object $X$ of a category $\mathcal{K}$ is called initial (respectively, final) if for every $Y \in \mathcal{K}$ the set $h(X, Y)$ (respectively, $h(Y, X)$) consists of exactly one element. ("There is exactly one arrow from $X$ to $Y$, respectively, from $Y$ to $X$.") Note that $X$ is an initial object in $\mathcal{K}$ $\iff$ $X$ is a final object in $\mathcal{K}^0$, and vice versa. (This is the first hint of the practical use of the notion of dual category.) An object $0$ is called a zero object if it is both initial and final.
Of course, in Lin there is a zero object, namely the zero linear space. On the other hand, in Set there are initial and final objects, but they are different. Final objects here are singletons, and the initial object is the empty set. (The fact that there is a unique mapping from the empty set to any other set follows from the formal definition of the notion of mapping in rigorous set theory; see, e.g., [17].) In $\Delta$ there is only one final object, namely $\Delta_0$, but, as can be easily verified, there are no initial objects.
4. Isomorphisms. The problem of classification of objects and morphisms
Among all morphisms of a given category, some classes are especially interesting. We start with the "best" ones. Everywhere in what follows, $\mathcal{K}$ is an arbitrary category.
Definition 1. Let $\varphi : X \to Y$ be a morphism in $\mathcal{K}$. A morphism $\psi : Y \to X$ in the same category is called inverse to $\varphi$ if $\psi\varphi = 1_X$ and $\varphi\psi = 1_Y$. A morphism in $\mathcal{K}$ is called an isomorphism if it has an inverse morphism. Objects $X$ and $Y$ in $\mathcal{K}$ are called isomorphic if there exists an isomorphism from $X$ to $Y$ (or, equivalently, from $Y$ to $X$; verify this!).
The inverse morphism to $\varphi$ is usually denoted by $\varphi^{-1}$.
The following result is almost evident, but as we will soon see, it has important corollaries.
Theorem 1. Every two initial and every two final objects in $\mathcal{K}$ are isomorphic.
Proof. Suppose $X$ and $Y$ are initial objects in $\mathcal{K}$. The existence of the necessary arrow in the definition of an initial object implies the existence of morphisms $\varphi : X \to Y$ and $\psi : Y \to X$. Consider $\psi\varphi : X \to X$; the uniqueness of the arrow implies that $\psi\varphi = 1_X$. Similarly, $\varphi\psi = 1_Y$. Thus, initial objects are isomorphic. Passing to the dual category (again it is useful!) we immediately see that final objects are isomorphic. ■
Let us see what the abstract definition of isomorphisms turns out to be in various examples of categories.
In discrete categories where the objects "do not want to communicate with each other by means of arrows", there are no different isomorphic objects. An isomorphism in Set is a bijection (i.e., a one-to-one correspondence). An isomorphism in Lin is what one usually calls a linear isomorphism (i.e., an operator which is also a bijection). As the reader has probably guessed, isomorphisms in Met and in Top have a special name: homeomorphisms. An isomorphism in $\mathrm{Met}_u$, i.e., a uniformly continuous mapping with a uniformly continuous inverse mapping, is called a uniform homeomorphism. Finally, an isomorphism in $\mathrm{Met}_1$ is a contraction having an inverse contraction; of course, this is just an isometry. An isomorphism in Meas between measure spaces $(X, \mu)$ and $(Y, \nu)$ is, as you can easily verify, an equivalence class of mappings $f : X \to Y$ satisfying the following conditions: (1) there exist sets of full measure $A \subset X$ and $B \subset Y$ such that $f$ maps $A$ bijectively to $B$; (2) if $C \subset X$, then (a) $C$ is measurable in $X$ $\iff$ $f(C)$ is measurable in $Y$; (b) $C$ has zero measure in $X$ $\iff$ $f(C)$ has zero measure in $Y$.
* * *
Informally, the meaning of the notion of "isomorphism" is that isomorphic objects are "actually the same"; in some sense they represent the same object, but "in different clothes". All that we can say in categorical terms (i.e., in the language of arrows) about an object, we can also say about each isomorphic object.
In every area of mathematics, when studying a particular category, the following natural question arises: is it possible to classify (or, in other words, to describe) objects of that category up to an isomorphism? As we will see later, in some categories this problem is trivial, in others a solution is given by a fundamental theorem (such as, say, the Riesz-Fischer theorem in Section 2.2). And there are some categories where this problem is hopeless because it is impossible to cover all the multitude of the emerging cases.
But what do we mean by speaking about the solution (or a big advance) of this "classification problem"?
Example 1. As a simple instructive example, let us consider the category FLin we mentioned before. You know, of course, that two finite-dimensional spaces are isomorphic $\iff$ they have the same linear dimension. Moreover, every such space is linearly isomorphic to $\mathbb{C}^n$ for some $n$. We can express these facts as follows: the set of natural numbers can be chosen as a complete system of invariants of isomorphism classes for FLin, and for every such invariant $n \in \mathbb{N}$ we can take $\mathbb{C}^n$ as a model for the corresponding object in FLin.
In the general case, for a given category $\mathcal{K}$ we can sometimes find somewhere a class $M$ consisting of sufficiently "intelligible" elements, and associate to every object of $\mathcal{K}$ an element in $M$ in such a way that isomorphic objects correspond to the same element. Such a class $M$ is usually called a system of invariants of isomorphisms in $\mathcal{K}$. We emphasize that the choice of a class should be viewed as appropriate if the class consists of "things well perceived by our intuition", like natural numbers in the last example, or some "good" sets; otherwise what are the benefits of the construction? (One of the most important systems of invariants in analysis is provided by the spectrum of an operator. We discuss this in Chapter 5.)
If we are lucky, a system of invariants we find is complete. This means that two non-isomorphic objects always have distinct invariants. Thus, in this case, objects have the same invariant $\iff$ they are isomorphic.
In addition, it is desirable to point out for every invariant a "sufficiently simple" object with this invariant. Such an object is often called a model object, or just a model for this invariant.
The system of invariants gives us an idea of how rich our category is with "really different objects" and what their nature is. Thus, it is accepted (and apparently mathematicians silently agree with this) that the problem of classification of objects is solved if at least two things are done. First, you must indicate a system of invariants that is as transparent as possible (and persuade the mathematical community of this). Second, for every invariant you must indicate its model (again, sufficiently transparent). It would be good if the construction of the invariant was simple as well, but this depends on the situation, of course.
(In functional analysis the exemplary solution for the problem of classification is the relevant result on the category of Hilbert spaces; see Theorem 2.2.2 below.)
Example 2. Obviously, two objects (sets) in Set are isomorphic $\iff$ these sets have the same cardinality. Thus, a complete system of invariants in Set is precisely the class of all cardinalities (or cardinal numbers). Of course, this declaration is, in fact, a tautology. Recall that cardinality was defined precisely as "the property that is common to equivalent sets" (see [8, p. 25]). However, the same is true for finite cardinalities: we merely overcame the corresponding psychological barrier in childhood, and have now forgotten that this was not easy. (When saying this we stay, of course, on the "naive" point of view (see [8]) and do not go deep into the "serious" set theory; cf., e.g., [7].) As a model of a set of a given cardinality, an interval of the transfinite line with this cardinality is usually suggested; the explanation can be found in [6].
A little more substantial is the classification of objects in Lin. It is a direct generalization of what we have said about FLin.
Exercise 1. (i) Two linear spaces are isomorphic $\iff$ they have the same linear dimension.
(ii) Every cardinality is a linear dimension of some linear space.
Hint. Suppose $A$ is a set of cardinality $m$. Consider the set $\mathbb{C}_0^A$ of complex-valued functions on $A$ taking non-zero values only on finite subsets in $A$. This is a linear space with respect to pointwise operations that has $m$ as its linear dimension.
It is clear from this exercise that we can again take the class of all cardinalities as the complete system of invariants in Lin. Now, however, the invariant of the object is its linear dimension. If we agree to choose for a given cardinality $m$ a model, say $A$, in the category of sets, then the model of the space of dimension $m$ can be defined as the space $\mathbb{C}_0^A$.
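The model space from the hint to Exercise 1 has a convenient computer representation: a function with finite support is stored as a dictionary holding only its non-zero values. The sketch below is an illustration under that assumption; the function names are hypothetical. It shows that every such function is a finite linear combination of the functions `delta(a)`, which is the reason why the dimension equals the cardinality of the index set.

```python
def add(f, g):
    """Pointwise sum; zero values are dropped so that the support stays explicit."""
    h = dict(f)
    for a, v in g.items():
        h[a] = h.get(a, 0) + v
    return {a: v for a, v in h.items() if v != 0}

def scale(c, f):
    """Pointwise product of the function f with the complex number c."""
    return {a: c * v for a, v in f.items() if c * v != 0}

def delta(a):
    """The function equal to 1 at the point a and to 0 elsewhere."""
    return {a: 1}

def expand(f):
    """Rebuild f as the finite sum of f(a) * delta(a) over its support."""
    total = {}
    for a, v in f.items():
        total = add(total, scale(v, delta(a)))
    return total

# A hypothetical element of the space: the index set A may be infinite,
# but only finitely many points carry non-zero values.
f = {"p": 2 + 1j, "q": -3}

print(expand(f) == f)
print(add(f, scale(-1, f)))   # the zero function has empty support
```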
As to the majority of the other examples described above (with the obvious exception of $\Delta$), the problem of classification of objects there seems to be hopeless: "there are too many non-isomorphic objects". At the same time it can be easily solved in some important subcategories of those categories. For example, the full subcategory FAb in Ab consisting of finite abelian groups is one of them. (The classical theorem from algebra provides a key for the classification of objects in FAb. Recall this theorem and draw the corresponding conclusions.)
In the category Meas there is an important full subcategory consisting of the so-called Lebesgue spaces introduced by V. A. Rohlin in 1949 [18]. These are measure spaces with countable base satisfying some supplementary (not too complicated) conditions. We do not want to divert the reader with the exact formulations; instead we only note that these spaces often arise in applications. It turns out (and it was proved by Rohlin) that every Lebesgue space is isomorphic in Meas to one of the spaces of the form $[0, 1] \sqcup X$, or $X$, where the interval is endowed with the standard Lebesgue measure, and $X$ is an at most countable set with "counting measure" (assigning measure 1 to every singleton). (Suggest the corresponding system of invariants, using combinations of finite, countable, and continual cardinality.) In particular, and this, of course, is the main result, from the point of view of real analysis there is a unique Lebesgue space without points of non-zero measure, namely the interval of the real line.
The Rohlin theorem and similar results (see, e.g., [19], [20]) show that the structure of a measure space is the coarsest among all the substantial structures on a set (see the discussion in [21, pp. 46-47]).
* * *
Digression. Before going further, we introduce some elements of the categorical language. Let $\mathcal{K}$ be a category. A diagram in $\mathcal{K}$ is an arbitrary family of objects in $\mathcal{K}$ and morphisms between them. If $X$ and $Y$ are objects in a diagram, then a path from $X$ to $Y$ is a finite family $\varphi_1 : X \to Z_1$, $\varphi_2 : Z_1 \to Z_2$, $\dots$, $\varphi_n : Z_{n-1} \to Y$ of morphisms in this diagram. (Of course, there can be many different paths in the diagram from one object to another, or no paths at all.) A diagram is called commutative if for every two objects $X$ and $Y$ and for every two paths $\varphi_1, \dots, \varphi_n$ and $\psi_1, \dots, \psi_m$ from $X$ to $Y$ we have $\varphi_n \circ \cdots \circ \varphi_1 = \psi_m \circ \cdots \circ \psi_1$. For example, the commutativity of the simplest diagrams
[Diagrams: a triangle formed by the arrows $\rho : X \to Y$, $\tau : Y \to Z$ and $\sigma : X \to Z$, and a square whose two routes are $\rho$ followed by $\sigma$ and $\pi$ followed by $\tau$.]
means that $\sigma = \tau\rho$ in the first case, and $\tau\pi = \sigma\rho$ in the second case. (Obviously, our psychology perceives the picture much better than the formulas, especially if there are many formulas.)
* * *
We were talking about the problem of classification of objects in a given category. Equally important typical problems are those of classification of morphisms. There are several types of such problems depending on what classes of morphisms we wish to consider, say, all morphisms, all endomorphisms, or all automorphisms (i.e., endomorphisms that are at the same time isomorphisms). We pay special attention to the problem which is apparently the most important: the classification of endomorphisms up to similarity.
Definition 2. Two endomorphisms $\varphi : X \to X$ and $\psi : Y \to Y$ in a category $\mathcal{K}$ are said to be similar if there exists an isomorphism $\iota : X \to Y$ such that the diagram
$$\begin{array}{ccc} X & \xrightarrow{\ \varphi\ } & X \\ {\scriptstyle \iota}\downarrow & & \downarrow{\scriptstyle \iota} \\ Y & \xrightarrow{\ \psi\ } & Y \end{array}$$
is commutative (i.e., $\psi\iota = \iota\varphi$). We say that an isomorphism $\iota$ implements a similarity between endomorphisms $\varphi$ and $\psi$.
We distinguish the following useful
Proposition 1. If $\varphi$ and $\psi$ are two similar endomorphisms in $\mathcal{K}$, then $\varphi$ is an isomorphism $\iff$ $\psi$ is an isomorphism.
Proof. Suppose $\iota$ implements this similarity. Then the existence of $\varphi^{-1}$ implies the existence of $\psi^{-1}$, namely, $\psi^{-1} = \iota\varphi^{-1}\iota^{-1}$. The rest is clear. ■
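For operators on a two-dimensional space, the commutativity of the square in Definition 2 and the formula from the proof of Proposition 1 can be verified by exact arithmetic. The sketch below is an illustration only: the matrices `phi` and `iota` are arbitrary sample data, the helper names are hypothetical, and 2x2 matrices are stored as nested tuples.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inverse(a):
    """Inverse of an invertible 2x2 matrix, computed exactly."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    d = Fraction(1) / det
    return ((a[1][1] * d, -a[0][1] * d), (-a[1][0] * d, a[0][0] * d))

phi = ((2, 1), (0, 3))    # an endomorphism of the first space (sample data)
iota = ((1, 1), (0, 1))   # an isomorphism implementing the similarity

# Define psi so that the square commutes: psi = iota phi iota^{-1}.
psi = matmul(matmul(iota, phi), inverse(iota))

# Commutativity of the diagram: psi iota == iota phi.
square_commutes = matmul(psi, iota) == matmul(iota, phi)

# The formula from the proof: psi^{-1} == iota phi^{-1} iota^{-1}.
formula_holds = inverse(psi) == matmul(matmul(iota, inverse(phi)), inverse(iota))

print(psi, square_commutes, formula_holds)
```

Exact fractions are used instead of floating-point numbers so that the two equalities can be tested with `==` without rounding issues.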
Actually, similarity is a special case of isomorphism of objects, but in a more complicated (compared to $\mathcal{K}$) category. Indeed, consider the category $\mathrm{End}(\mathcal{K})$ defined as follows. Its objects are arbitrary endomorphisms in $\mathcal{K}$. If $\varphi : X \to X$ and $\psi : Y \to Y$ are such objects, then morphisms from $\varphi$ to $\psi$ are defined as morphisms $\rho : X \to Y$ such that the following diagram is commutative:

$$\begin{array}{ccc} X & \xrightarrow{\ \varphi\ } & X \\ {\scriptstyle\rho}\downarrow & & \downarrow{\scriptstyle\rho} \\ Y & \xrightarrow{\ \psi\ } & Y \end{array}$$
The following proposition is verified immediately.

Proposition 2. Suppose $\varphi$ and $\psi$ are two endomorphisms in $\mathcal{K}$. Then a morphism $\iota \in h_{\mathcal{K}}$ implements similarity between $\varphi$ and $\psi$ $\Longleftrightarrow$ it is an isomorphism in the category $\mathrm{End}(\mathcal{K})$. ∎
Thus, similarity is a special case of isomorphism, and we can apply what was said about invariants and models to the problem of classification of endomorphisms.

Exercise 2. Find a complete system of invariants for the similarity of endomorphisms in FLin (i.e., for operators acting on finite-dimensional spaces).

Hint. Recall the Jordan theorem from the course of linear algebra. It shows that for invariants we can take unordered families of pairs consisting of an integer and a complex number. Such families characterize the Jordan blocks of an operator.

Sometimes we have to make the definition of similarity of endomorphisms more rigid. Namely, if $\mathcal{L}$ is a subcategory in $\mathcal{K}$, then endomorphisms $\varphi : X \to X$ and $\psi : Y \to Y$ in $\mathcal{K}$ are said to be similar with respect to $\mathcal{L}$ if there exists a morphism $\iota : X \to Y$ that makes the diagram in Definition 2 commutative and is an isomorphism in $\mathcal{L}$. Of course, such a morphism must be an isomorphism in $\mathcal{K}$ as well. Therefore, if endomorphisms in $\mathcal{K}$ are similar with respect to $\mathcal{L}$, then they are similar in the sense we used above. But the converse is not always true. It is easy to see that "relative similarity" is again a special case of isomorphism. (In which category?)

(Running ahead, we note that a typical example of such "relative similarity" is the unitary equivalence of operators on Hilbert spaces; see Section 1.4.)
A position-finding remark for the future. As for the classification of endomorphisms in the category Lin, which includes FLin, it seems to be an absolutely hopeless task. The same is true for the analogues of this task, more important in functional analysis, for the categories of Banach and Hilbert spaces (to be defined later). This corresponds to the fact that up to now there are no satisfactory analogues of the Jordan theorem for arbitrary operators acting in these spaces. (Moreover, for the case of Banach spaces it is already known that these analogues cannot exist: see the Enflo-Read Theorem 2.2.5.) Thus, the problem of classification in the category of endomorphisms in Hilbert spaces is far from being solved. However, this problem is completely solved for the full subcategory consisting of the so-called selfadjoint operators, a notion of exceptional importance in analysis and quantum physics. The corresponding classification is one of the many guises of the Hilbert spectral theorem, which ranks among the major achievements of functional analysis. (Some mathematicians even say that it is the most important result.) We prove it, in several equivalent formulations, towards the end of the book, in Sections 6.6-6.8.
Another typical problem of classification of morphisms is worth mentioning. If we want to compare arbitrary morphisms in a given category, then it is preferable to take a different (compared to Definition 2) approach to the question of which morphisms should be considered equivalent.
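Definition 3. Two morphisms $\varphi : X_1 \to Y_1$ and $\psi : X_2 \to Y_2$ in a category $\mathcal{K}$ are said to be weakly similar if there exist isomorphisms $\iota_1 : X_1 \to X_2$ and $\iota_2 : Y_1 \to Y_2$ such that the diagram

$$\begin{array}{ccc} X_1 & \xrightarrow{\ \varphi\ } & Y_1 \\ {\scriptstyle\iota_1}\downarrow & & \downarrow{\scriptstyle\iota_2} \\ X_2 & \xrightarrow{\ \psi\ } & Y_2 \end{array}$$

is commutative (i.e., $\psi\iota_1 = \iota_2\varphi$). We say that the pair $(\iota_1, \iota_2)$ implements a weak similarity between morphisms $\varphi$ and $\psi$.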
By analogy with the notion of similarity, the notion of weak similarity can be strengthened if we introduce a subcategory $\mathcal{L}$ in $\mathcal{K}$. Namely, two morphisms $\varphi$ and $\psi$ in $\mathcal{K}$ are said to be weakly similar with respect to $\mathcal{L}$ if there exist two isomorphisms $\iota_1$ and $\iota_2$ in $\mathcal{L}$ such that the diagram in Definition 3 is commutative. The major example in functional analysis is the weak unitary equivalence of operators in Hilbert spaces; see Section 1.4.
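Again, as in the analysis of endomorphisms, the problem of classification of all morphisms in $\mathcal{K}$ up to weak similarity is a special case of the problem of classification of objects in some new category. This time this is the category $\mathrm{Mor}(\mathcal{K})$, where the objects are the elements of $h_{\mathcal{K}}$, and the morphisms from $\varphi : X_1 \to Y_1$ to $\psi : X_2 \to Y_2$ are the pairs $(\rho_1, \rho_2)$ such that the diagram

$$\begin{array}{ccc} X_1 & \xrightarrow{\ \varphi\ } & Y_1 \\ {\scriptstyle\rho_1}\downarrow & & \downarrow{\scriptstyle\rho_2} \\ X_2 & \xrightarrow{\ \psi\ } & Y_2 \end{array}$$

is commutative.

Proposition 3. Let $\varphi$ and $\psi$ be morphisms in $\mathcal{K}$. Then a pair of morphisms $(\iota_1, \iota_2)$ realizes a weak similarity between $\varphi$ and $\psi$ $\Longleftrightarrow$ it is an isomorphism in $\mathrm{Mor}(\mathcal{K})$. ∎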
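As a concrete check of Definition 3 in FLin, the sketch below takes two operators on a two-dimensional space, written as 2x2 matrices, and verifies the commutativity relation for one explicit pair of isomorphisms. The particular matrices and all names (`matmul`, `phi`, `psi`, `iota1`, `iota2`) are chosen here for illustration only.

```python
# Hypothetical illustration: operators on a 2-dimensional space as 2x2 matrices.

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

phi = [[1, 0], [0, 0]]      # a projection: phi * phi == phi
psi = [[0, 1], [0, 0]]      # a nilpotent operator: psi * psi == 0

iota1 = [[0, 1], [1, 0]]    # swaps the two basis vectors (it is its own inverse)
iota2 = IDENTITY

# Commutativity of the square in Definition 3: psi * iota1 == iota2 * phi.
weakly_similar = matmul(psi, iota1) == matmul(iota2, phi)

# Similarity would give psi = iota * phi * iota^{-1}, hence psi * psi = psi,
# because phi * phi = phi; but psi * psi is zero while psi is not.
phi_is_idempotent = matmul(phi, phi) == phi
psi_squares_to_zero = matmul(psi, psi) == ZERO

print(weakly_similar, phi_is_idempotent, psi_squares_to_zero)
```

The pair of isomorphisms is not unique; any choice satisfying the relation would do. The last two flags record why no single isomorphism can implement a similarity between these two operators.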
We see that for endomorphisms the relation of weak similarity is much more tolerant than the relation of similarity. Of course, if two endomorphisms are similar, then they are weakly similar; but at the same time, as you can easily see, the endomorphisms in FLin defined by the matrices $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ are weakly similar but not similar. Here is the general observation.

Exercise 3. Find the complete system of invariants of weak similarity and the corresponding models for the morphisms in FLin (i.e., for operators acting between finite-dimensional linear spaces).

Hint. Every operator is weakly similar to one that can be written as a diagonal matrix with zeroes and ones on the diagonal.

Remark. Contrary to the Jordan theorem, the latter fact has a substantial generalization, which you can find with the help of bases, to operators on infinite-dimensional spaces. More interesting is that a similar proposition holds for some traditional classes of operators in functional analysis. In our book the problem of classification up to weak unitary equivalence will be solved for compact operators in Hilbert spaces. This is one of the forms of the Schmidt theorem in Section 3.4.

5. Other classes of morphisms
As we have seen, the definition of isomorphism consists of two relations. If we consider each of them separately, we come to the following notions:
Definition 1. Let $\varphi : X \to Y$ be a morphism in a category $\mathcal{K}$. A morphism $\psi : Y \to X$ in the same category is called a left, respectively, right inverse for $\varphi$ if $\psi\varphi = 1_X$, respectively, $\varphi\psi = 1_Y$. A morphism in $\mathcal{K}$ is called a coretraction, respectively, retraction if it has a left, respectively, a right inverse morphism.
Of course, retractions turn into coretractions, and vice versa, if we consider our morphism in the dual category.

Proposition 1. A morphism is an isomorphism $\Longleftrightarrow$ it is both a coretraction and a retraction.

Proof. The implication $\Longrightarrow$ is obvious. To establish the converse implication $\Longleftarrow$, we arrange the brackets in the expression $\psi_l\varphi\psi_r$, where $\psi_l$ is the left, and $\psi_r$ the right inverse for $\varphi$, in two different ways. ∎
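Written out, the two ways of arranging the brackets give the following chain (a sketch of the computation the proof refers to, for a morphism from $X$ to $Y$ with a left inverse and a right inverse):

```latex
\[
\psi_l \;=\; \psi_l 1_Y \;=\; \psi_l(\varphi\psi_r)
       \;=\; (\psi_l\varphi)\psi_r \;=\; 1_X\psi_r \;=\; \psi_r .
\]
```

Thus the left and the right inverse coincide, and their common value is a two-sided inverse.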
Isomorphisms, retractions, and coretractions are in some sense the "best" morphisms. The following definition distinguishes just the "good" ones.

Definition 2. A morphism $\varphi : X \to Y$ in $\mathcal{K}$ is called a monomorphism if $\varphi\psi_1 = \varphi\psi_2$ always implies $\psi_1 = \psi_2$. A morphism is called an epimorphism if it is a monomorphism in the dual category $\mathcal{K}^0$. If a morphism is both a monomorphism and an epimorphism, then it is called a bimorphism.

If we "unwrap" the definition of epimorphisms, we will see that they are precisely those morphisms for which $\psi_1\varphi = \psi_2\varphi$ always implies $\psi_1 = \psi_2$. Thus, a monomorphism is a $\varphi$ that can be cancelled as the left factor in a composition, whereas an epimorphism is a $\varphi$ that can be cancelled as the right factor in a composition.

Proposition 2. Every coretraction is a monomorphism, every retraction is an epimorphism, and every isomorphism is a bimorphism.

Proof. If $\varphi$ is a coretraction with the left inverse morphism $\psi$, then $\varphi\psi_1 = \varphi\psi_2$ implies $\psi\varphi\psi_1 = \psi\varphi\psi_2$, which implies $\psi_1 = \psi_2$. So we have proved the first statement; the second one is the first one in the dual category. The rest is clear. ∎

Proposition 3. If a composition $\varphi\psi$ is a monomorphism, then $\psi$ is a monomorphism as well; if $\varphi\psi$ is an epimorphism, then $\varphi$ is an epimorphism as well. ∎

Let us give a stronger version of Proposition 1.

Proposition 4. A morphism is an isomorphism $\Longleftrightarrow$ it is a coretraction and an epimorphism $\Longleftrightarrow$ it is a retraction and a monomorphism.

Proof. If $\varphi$ is a coretraction and $\psi$ is its left inverse, then $\varphi\psi\varphi = \varphi = 1\varphi$, where $1$ is the corresponding local identity. So if, in addition, $\varphi$ is an epimorphism, then $\varphi\psi = 1$, and thus $\psi$ is the right inverse for $\varphi$. The rest is clear. ∎
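Proposition 3 is stated without proof. One possible argument, given here only as a sketch with auxiliary morphisms $\alpha$ and $\beta$ (names introduced for this illustration), is a pair of cancellations of the same kind as in the proofs of Propositions 2 and 4.

```latex
% Assume \varphi\psi is a monomorphism and \psi\alpha = \psi\beta.
\[
\psi\alpha = \psi\beta
\;\Longrightarrow\;
(\varphi\psi)\alpha = (\varphi\psi)\beta
\;\Longrightarrow\;
\alpha = \beta ,
\]
% so \psi is a monomorphism. Dually, if \varphi\psi is an epimorphism and
% \alpha\varphi = \beta\varphi, then
\[
\alpha(\varphi\psi) = \beta(\varphi\psi)
\;\Longrightarrow\;
\alpha = \beta ,
\]
% so \varphi is an epimorphism.
```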
Let us see what monomorphisms and epimorphisms are in different examples. It turns out that in a series of cases they can be characterized in terms of mappings of sets.

Proposition 5. In all the categories figuring in the examples from Set to HTop, each injective (as a mapping) morphism is a monomorphism, and each surjective (as a mapping) morphism is an epimorphism.

Proof. Let $\mathcal{K}$ be one of these categories. Suppose first that a morphism $\varphi : X \to Y$ in $\mathcal{K}$ is an injective mapping of the corresponding underlying sets, and for some $Z \in \mathcal{K}$ and $\psi_1, \psi_2 : Z \to X$ we have $\varphi\psi_1 = \varphi\psi_2$. Then for every element $x \in Z$ we have $\varphi\psi_1(x) = \varphi\psi_2(x)$, and from the injectivity of $\varphi$ it follows that $\psi_1(x) = \psi_2(x)$. This means that $\psi_1 = \psi_2$.

Now let $\varphi : X \to Y$ be a surjective mapping of the underlying sets, and suppose that for some $Z \in \mathcal{K}$ and $\psi_1, \psi_2 : Y \to Z$ we have $\psi_1\varphi = \psi_2\varphi$. Then for every $y \in Y$ we can find $x \in X$ such that $y = \varphi(x)$, and after that we obtain $\psi_1(y) = \psi_1\varphi(x) = \psi_2\varphi(x) = \psi_2(y)$. Thus, $\psi_1 = \psi_2$. ∎

Remark. Our advanced reader probably feels that there must be a general proposition covering all these cases. This is indeed the case; see Exercise 7.2 below.
Often, the converse is true.

Proposition 6. In the categories Set and Lin monomorphisms are precisely injective mappings, and epimorphisms are precisely surjective mappings.

Proof. Let $\varphi : X \to Y$ be a morphism in Set or in Lin. Assume first that it is not injective, i.e., $\varphi(x_1) = \varphi(x_2)$ for two different points (or vectors) $x_k \in X$; $k = 1, 2$. Then in the case of Set let us denote by $Z$ a singleton and consider mappings (i.e., morphisms in Set) $\psi_k : Z \to X$; $k = 1, 2$, taking this singleton to $x_k$. In the case of Lin we set $Z := \mathbb{C}$ and define $\psi_k : Z \to X$; $k = 1, 2$ as operators (i.e., morphisms in Lin) taking $1 \in \mathbb{C}$ to $x_k$. Obviously, in both cases $\varphi\psi_1 = \varphi\psi_2$, and at the same time $\psi_1 \ne \psi_2$. This means that $\varphi$ is not a monomorphism.

Now let us assume that $\varphi$ is not surjective. Then in the case of Set we take a two-point set $\{0, 1\}$ as $Z$ and consider $\psi_k : Y \to Z$; $k = 1, 2$ such that $\psi_1 \equiv 0$, and $\psi_2$ takes $\mathrm{Im}(\varphi)$ to 0 and $Y \setminus \mathrm{Im}(\varphi)$ to 1. In the case of Lin we take the quotient space $Y/\mathrm{Im}(\varphi)$ as $Z$ and consider $\psi_k : Y \to Z$; $k = 1, 2$, such that $\psi_1 \equiv 0$, and $\psi_2 : Y \to Y/\mathrm{Im}(\varphi)$ is the quotient mapping (taking every $y \in Y$ to its coset modulo $\mathrm{Im}(\varphi)$). Obviously, in both cases $\psi_1\varphi = \psi_2\varphi$ and at the same time $\psi_1 \ne \psi_2$. We see that $\varphi$ is not an epimorphism.

Thus, monomorphisms in these categories are injective and epimorphisms are surjective. The converse is true by Proposition 5. ∎

Remark. We recommend that the reader give another proof for the part of the proposition concerning epimorphisms. Namely, for Set you can take for $Z$ the same $Y$ but with the subset $\mathrm{Im}(\varphi)$ "contracted to a point", and for Lin take $Z$ to be $\mathbb{C}$.

Exercise 1.
(i) The characterization of monomorphisms we gave in the preceding proposition is valid for Ord, Ab, Gr, Ll, and for all our examples of categories of metric and topological spaces.
(ii) The characterization of epimorphisms is valid for Ab, Top, and Ll.
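The test mappings used in the proof of Proposition 6 for the category Set can be tried out on small finite sets. The sketch below models mappings as Python dictionaries; the sets, the mappings, and the helper `compose` are hypothetical and serve only to illustrate the two constructions (a singleton as the test object for non-injective mappings, a two-point set for non-surjective ones).

```python
# Morphisms in Set modelled as Python dicts (finite mappings).
# The sets and names below are hypothetical and chosen only for illustration.

def compose(outer, inner):
    """Return the mapping x -> outer(inner(x))."""
    return {x: outer[y] for x, y in inner.items()}

# A non-injective mapping phi : X -> Y with phi(a) = phi(b).
phi = {"a": 0, "b": 0, "c": 1}

# Z is a singleton {"*"}; psi_k sends its only point to x_k.
psi1 = {"*": "a"}
psi2 = {"*": "b"}

# phi psi1 = phi psi2 although psi1 != psi2: phi cannot be cancelled on the left.
same_composites = compose(phi, psi1) == compose(phi, psi2)
different_maps = psi1 != psi2

# A non-surjective mapping chi : X -> Y2; the point 2 of Y2 is not in its image.
Y2 = {0, 1, 2}
chi = {"a": 0, "b": 0, "c": 1}
image = set(chi.values())
theta1 = {y: 0 for y in Y2}                       # identically 0
theta2 = {y: 0 if y in image else 1 for y in Y2}  # 0 on Im(chi), 1 elsewhere

# theta1 chi = theta2 chi although theta1 != theta2: chi cannot be cancelled on the right.
same_on_image = compose(theta1, chi) == compose(theta2, chi)
different_tests = theta1 != theta2

print(same_composites, different_maps, same_on_image, different_tests)
```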
You have noticed that a series of important categories of metric and topological spaces are not mentioned in the second part of Exercise 1. This is not by accident.

Proposition 7. In the category Met the morphism $\varphi : X \to Y$ is an epimorphism $\Longleftrightarrow$ its image is dense in $Y$.

Proof. The "if" part. Suppose $Z \in$ Met and $\psi_k : Y \to Z$; $k = 1, 2$ are morphisms (i.e., continuous mappings) such that $\psi_1\varphi = \psi_2\varphi$. Then $\psi_1$ and $\psi_2$ coincide on the image of $\varphi$, and the assertion follows from Proposition 2.14.

The "only if" part. Suppose there exists $y \in Y$ that does not belong to the closure of $\mathrm{Im}(\varphi)$. Then for some $r > 0$ we have $U(y, r) \cap \mathrm{Im}(\varphi) = \emptyset$. Consider the function $f : Y \to \mathbb{R} : x \mapsto \max\{r - d(y, x);\ 0\}$. Obviously, it is a contraction, and thus a morphism in Met. It remains to take $\psi_k : Y \to Z$; $k = 1, 2$ (with $Z := \mathbb{R}$) such that $\psi_1 \equiv 0$ and $\psi_2 = f$. The rest is clear. ∎
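The properties of the function $f$ used in the "only if" part can be written out as follows. This is only a sketch; it assumes that $U(y, r)$ denotes the open ball of radius $r$ centered at $y$, so that every point of the image of the morphism lies at distance at least $r$ from $y$.

```latex
% With y \notin \overline{\operatorname{Im}(\varphi)} and r > 0 as in the proof:
\begin{align*}
|f(x) - f(x')| &\le |d(y,x) - d(y,x')| \le d(x,x')
   && \text{(so $f$ is a contraction)},\\
f(\varphi(t)) &= \max\{r - d(y,\varphi(t));\,0\} = 0
   && \text{since } d(y,\varphi(t)) \ge r \text{ for all } t \in X,\\
f(y) &= \max\{r - 0;\,0\} = r > 0 .
\end{align*}
% Hence \psi_1\varphi = 0 = \psi_2\varphi, while \psi_1(y) = 0 \ne r = \psi_2(y).
```

The first inequality uses that $t \mapsto \max\{t;\ 0\}$ does not increase distances on the real line, and the second is the triangle inequality.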
Here is supplementary information for advanced readers. First, the characterization of monomorphisms as injective mappings is valid, apparently, for most categories of "sets with the additional structure" that serve algebra, topology, and functional analysis. In particular, it is valid for the category of rings Rin we mentioned above, but in this situation the "test" object $Z$ is a little more complicated than the one used in Proposition 6: it is the ring of polynomials with integer coefficients and without the free term (verify this!). However, there are categories where the class of monomorphisms is larger than the class of injective mappings. We recall that an abelian group $X$ is called divisible if for every $x \in X$ and $n \in \mathbb{N}$ there exists $y \in X$ such that $x = ny$.

Exercise 2. In the full subcategory of Ab consisting of divisible groups, the morphism $\mathbb{R} \to \mathbb{T} : t \mapsto e^{it}$ (which is certainly not injective) is a monomorphism.

The part of Proposition 6 dealing with epimorphisms can be extended to the categories Ord and Gr, but the corresponding arguments require more complicated constructions (try to prove this anyway!). On the other hand, Proposition 7 obviously extends to the categories Met_u and Met_1 (the above proof works almost without changes).

Exercise 3*. Proposition 7 is also true for HTop.
Hint. If the closure of the image of
Now we return (up until Exercise 7) to the obligatory material.

In Proposition 2 we learned that coretractions are monomorphisms and that retractions are epimorphisms. It is natural to ask whether the converse propositions are true. Of course, the answer is positive for discrete categories described in Example 3.1. The following exercise is more substantial.

Exercise 5 (a generalization of Proposition 6). Let $\varphi$ be a morphism in Set or in Lin. Then
(i) $\varphi$ is an injective mapping $\Longleftrightarrow$ $\varphi$ is a monomorphism $\Longleftrightarrow$ $\varphi$ is a coretraction;
(ii) $\varphi$ is a surjective mapping $\Longleftrightarrow$ $\varphi$ is an epimorphism $\Longleftrightarrow$ $\varphi$ is a retraction.

Hint. It is sufficient to verify the following implications: injective $\Longrightarrow$ coretraction and surjective $\Longrightarrow$ retraction. In the case of Set the second one requires the axiom of choice. In the case of Lin both implications lean on the existence of linear complements (and this, in turn, requires the axiom of choice; cf. Exercise 1.2).

But for the majority of categories in algebra, topology, and functional analysis, the class of coretractions (respectively, retractions) is much smaller than the class of monomorphisms (respectively, epimorphisms).

Exercise 6°. In the category Ab the "duplication" $\mathbb{Z} \to \mathbb{Z} : n \mapsto 2n$ is a monomorphism but not a coretraction. The natural projection $\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}$ is an epimorphism but not a retraction.

Exercise 7. The categories Ord and HTop have mono- and epimorphisms with similar properties.

Hint. Every bijection that is not an isomorphism fits well. For the construction you can use objects which were called discrete in the corresponding categories.

However, in topology the following classical example of a monomorphism which is not a coretraction is, apparently, more important: the natural embedding $\mathbb{T} \to \mathbb{D}$ of the unit circle into the closed unit disk in two-dimensional Euclidean space. (This example also holds for the categories Met, Top, HTop, for the future category CHTop of compact Hausdorff topological spaces in Section 3.1, and for some other categories.) The proof exceeds the scope of this book; see, e.g., [22, p. 11].
It is a little easier to verify that another classical mapping, which "winds a line on a circle" (mentioned in Exercise 2 in another connection), gives an example of an epimorphism which is not a retraction. (We recommend that the reader restore the proof.) Such geometrically transparent examples were among the reasons why algebraic topology appeared, with category theory born from it.

The following classes of morphisms, which are intermediate between coretractions and monomorphisms (respectively, between retractions and epimorphisms), deserve consideration. They are especially interesting in topology, and, as we will see later, in functional analysis.

Definition 3. A morphism
bimorphism, implies that x is an isomorphism. A morphism
is an extreme epimorphism. Proof. Suppose
it is a monomorphism and at the same time an extreme epimorphism $\Longleftrightarrow$ it is an epimorphism and at the same time an extreme monomorphism. Proof. If
Here is the first illustration showing that these notions are substantial:

Exercise 8*. A morphism in Top is an extreme monomorphism $\Longleftrightarrow$ it is topologically injective. A morphism in Top is an extreme epimorphism $\Longleftrightarrow$ it is topologically surjective.
Hint. Suppose
=
1/Jx is the composition from
where Im(
6. A sample of a category-theoretic construction: The (co)product
Products and coproducts (sometimes called direct products and direct sums) are used in many areas of algebra, topology, and functional analysis. In different areas their construction was different, but gradually the feeling appeared that these constructions have some intrinsic unity. Category theory allows us to realize what makes these constructions related. This, in turn, brings us to a new level of their understanding, resulting in practical benefits in the study of these constructions.

Let $\mathcal{K}$ be a category, and $X_\nu$; $\nu \in A$ a family of objects of $\mathcal{K}$. (Some of these objects may coincide.)
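Definition 1. A product of the family $X_\nu$; $\nu \in A$ is a pair $(X, \{\pi_\nu\})$ consisting of an object $X \in \mathcal{K}$ (denoted by $\prod\{X_\nu;\ \nu \in A\}$) and a family of morphisms $\pi_\nu : X \to X_\nu$; $\nu \in A$ from $\mathcal{K}$, having the following property: for every $Y \in \mathcal{K}$ and every family of morphisms $\varphi_\nu : Y \to X_\nu$; $\nu \in A$ there is a unique morphism $\psi : Y \to X$ such that for every $\nu \in A$ the diagram

$$\begin{array}{ccc} Y & \xrightarrow{\ \psi\ } & X \\ & {\scriptstyle\varphi_\nu}\searrow & \downarrow{\scriptstyle\pi_\nu} \\ & & X_\nu \end{array}$$

is commutative. The morphisms $\pi_\nu$ in this definition are called projections of $X$ onto $X_\nu$.

In particular, the product of two objects $X_k$; $k = 1, 2$ is an object $X_1 \mathbin{\Pi} X_2$ together with morphisms $\pi_k : X_1 \mathbin{\Pi} X_2 \to X_k$ such that for every $Y$ and $\varphi_k : Y \to X_k$ there exists a unique morphism $\psi : Y \to X_1 \mathbin{\Pi} X_2$ making the diagram

$$\begin{array}{ccccc} X_1 & \xleftarrow{\ \pi_1\ } & X_1 \mathbin{\Pi} X_2 & \xrightarrow{\ \pi_2\ } & X_2 \\ & {\scriptstyle\varphi_1}\nwarrow & \uparrow{\scriptstyle\psi} & \nearrow{\scriptstyle\varphi_2} & \\ & & Y & & \end{array}$$

commutative.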
Definition 2. A coproduct of a family $X_\nu$; $\nu \in A$ is a pair $(X, \{i_\nu\})$ consisting of an object $X$ (denoted by $\coprod\{X_\nu;\ \nu \in A\}$) and a family of morphisms $i_\nu : X_\nu \to X$; $\nu \in A$, satisfying the definition of product of this family in the dual category $\mathcal{K}^0$. The morphisms $i_\nu$ in this definition are called the embeddings of $X_\nu$ into $X$.

In what follows, we will encounter coproducts as often as products. Therefore, to feel secure, we give an independent definition of coproducts in terms of the initial category $\mathcal{K}$. However, we recommend that the reader do this independently before moving further.

Definition 2'. A coproduct of a family $X_\nu$; $\nu \in A$ is a pair $(X, \{i_\nu\})$ consisting of an object $X = \coprod\{X_\nu;\ \nu \in A\} \in \mathcal{K}$ and morphisms $i_\nu : X_\nu \to X$; $\nu \in A$ from $\mathcal{K}$ satisfying the following condition: for every $Y \in \mathcal{K}$ and for every $\varphi_\nu : X_\nu \to Y$; $\nu \in A$ there exists a unique morphism $\psi : X \to Y$ such that for every $\nu \in A$ the diagram

$$\begin{array}{ccc} X & \xrightarrow{\ \psi\ } & Y \\ {\scriptstyle i_\nu}\uparrow & \nearrow{\scriptstyle\varphi_\nu} & \\ X_\nu & & \end{array}$$

is commutative.

In particular, a coproduct of two objects $X_k$; $k = 1, 2$ is an object $X_1 \amalg X_2$ together with morphisms $i_k : X_k \to X_1 \amalg X_2$ such that for every $Y$ and $\varphi_k : X_k \to Y$ there exists a unique morphism $\psi : X_1 \amalg X_2 \to Y$ which makes the diagram

$$\begin{array}{ccccc} X_1 & \xrightarrow{\ i_1\ } & X_1 \amalg X_2 & \xleftarrow{\ i_2\ } & X_2 \\ & {\scriptstyle\varphi_1}\searrow & \downarrow{\scriptstyle\psi} & \swarrow{\scriptstyle\varphi_2} & \\ & & Y & & \end{array}$$

commutative.

Sometimes, if there is no danger of misunderstanding, we speak about the (co)product, meaning the object $X$ itself.

Exercise 1. Let $(X, \{\pi_\nu\})$ be a product of a family $X_\nu$; $\nu \in A$, and suppose that for some $\mu \in A$ the set $h_{\mathcal{K}}(X_\mu, X_\nu)$ is non-empty for all $\nu \in A$. Then $\pi_\mu$ is a retraction. State and prove a similar result for coproducts.

Now we prove a theorem that contains, as special cases, many well-known results from algebra, topology, and analysis.
Theorem 1 (Uniqueness theorem). Suppose $X_\nu$; $\nu \in A$ is a family of objects in $\mathcal{K}$, and $(X, \{\pi_\nu\})$, $(X', \{\pi'_\nu\})$ are two products of this family. Then there exists a unique isomorphism $\iota : X \to X'$ such that for every $\nu \in A$ the diagram

$$\begin{array}{ccc} X & \xrightarrow{\ \iota\ } & X' \\ {\scriptstyle\pi_\nu}\searrow & & \swarrow{\scriptstyle\pi'_\nu} \\ & X_\nu & \end{array}$$

is commutative.

(Thus, not only the objects themselves are isomorphic, but isomorphisms between the objects can be chosen to be compatible with projections.)

Proof. Let us construct a new category $\mathcal{K}_A$. Its objects are arbitrary pairs $(Y, \{\varphi_\nu\})$ consisting of an object $Y \in \mathcal{K}$ and a family of morphisms $\varphi_\nu : Y \to X_\nu$; $\nu \in A$ from $\mathcal{K}$. We define the morphisms between objects $(Y^1, \{\varphi^1_\nu\})$ and $(Y^2, \{\varphi^2_\nu\})$ in the new category as morphisms $\psi : Y^1 \to Y^2$ in $\mathcal{K}$ such that for every $\nu \in A$ the diagram

$$\begin{array}{ccc} Y^1 & \xrightarrow{\ \psi\ } & Y^2 \\ {\scriptstyle\varphi^1_\nu}\searrow & & \swarrow{\scriptstyle\varphi^2_\nu} \\ & X_\nu & \end{array}$$

is commutative. The composition of morphisms in $\mathcal{K}_A$ is their composition in $\mathcal{K}$. An elementary verification shows that the axioms of category are fulfilled, the local identities in $\mathcal{K}_A$ are local identities in $\mathcal{K}$, and a morphism in $\mathcal{K}_A$ is an isomorphism $\Longleftrightarrow$ it is an isomorphism in $\mathcal{K}$.

And now we use the main trick: we note that a pair $(Y, \{\varphi_\nu\})$ is a product of our family in $\mathcal{K}$ $\Longleftrightarrow$ it is a final object in $\mathcal{K}_A$. Therefore, by Theorem 4.1, the pairs $(X, \{\pi_\nu\})$ and $(X', \{\pi'_\nu\})$ are isomorphic as objects in $\mathcal{K}_A$. The rest is clear. ∎

When passing to the dual category we immediately obtain the uniqueness theorem for coproducts. We strongly recommend that the reader give its exact formulation (in terms of the initial category) and draw the corresponding commutative diagram.

What will the inhabitants of our categorical Zoo say about these constructions? We leave the case of "simplest organisms" (i.e., discrete categories) to the reader, and turn our attention to the category of sets. In this category, there are no problems with coproducts:
Exercise 2°. Every family $X_\nu$; $\nu \in A$ in Set has a coproduct $(X, \{i_\nu\})$, where $X$ is the disjoint union of the sets $X_\nu$ (see Section 0.1), and every embedding $i_\nu : X_\nu \to X$ acts by the rule $x \mapsto (x, \nu)$ (in other words, it is the natural embedding of $X_\nu$ into $X$, if $X'_\nu$ is identified with $X_\nu$).
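For finite sets the construction in Exercise 2° can be tried out directly. The sketch below is only an illustration under stated assumptions: sets are Python sets, mappings are dictionaries, and the family, the target mappings, and all names (`family`, `embed`, `phi`, `psi`) are hypothetical.

```python
# Finite sets as Python sets, mappings as dicts; all names are illustrative.

family = {"nu1": {1, 2}, "nu2": {2, 3}}      # X_nu for nu in A = {"nu1", "nu2"}

# Disjoint union: the point 2 occurs twice, once for each index.
X = {(x, nu) for nu, X_nu in family.items() for x in X_nu}

# Embeddings i_nu : X_nu -> X, x |-> (x, nu).
embed = {nu: {x: (x, nu) for x in X_nu} for nu, X_nu in family.items()}

# An arbitrary family of mappings phi_nu : X_nu -> Y.
phi = {"nu1": {1: "p", 2: "q"}, "nu2": {2: "r", 3: "p"}}

# The mediating mapping psi : X -> Y is forced to be (x, nu) |-> phi_nu(x).
psi = {(x, nu): phi[nu][x] for (x, nu) in X}

# Commutativity: psi(i_nu(x)) == phi_nu(x) for all nu and all x in X_nu.
commutes = all(psi[embed[nu][x]] == phi[nu][x]
               for nu, X_nu in family.items() for x in X_nu)
print(len(X), commutes)   # prints: 4 True
```

Tagging each point with its index is what keeps the two copies of the common point apart, so that the two given mappings may send them to different values.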
If we pass from coproducts in Set to products, then at first sight the situation becomes even simpler. Namely, we recall that the Cartesian product of a family of sets $X_\nu$; $\nu \in A$ (notation: $\times\{X_\nu;\ \nu \in A\}$) is the set of all mappings $f : A \to \bigcup\{X_\nu;\ \nu \in A\}$ such that $f(\nu) \in X_\nu$. To speak in a less formal but, perhaps, more transparent way, we can say that the Cartesian product consists of "rows" $(\dots, x_\nu, \dots)$, where the places are indexed by elements of $A$, and at the $\nu$th place there is an element from $X_\nu$.

Proposition 1. Every family $X_\nu$; $\nu \in A$ of non-empty sets has a product in Set, namely, the pair $(X, \{\pi_\nu\})$, where $X = \times\{X_\nu;\ \nu \in A\}$ and $\pi_\nu : f \mapsto f(\nu)$.

Proof. For every $Y$ and $\varphi_\nu$; $\nu \in A$ in the definition of product, the required morphism $\psi : Y \to X$ is defined uniquely and acts by the rule $y \mapsto g$, where $g : \nu \mapsto \varphi_\nu(y)$. ∎
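The rule from the proof of Proposition 1 can be checked on a small example with two factors. The following sketch assumes finite sets and uses hypothetical names (`X1`, `X2`, `phi1`, `phi2`, `psi`); it verifies that the mediating mapping commutes with the projections and that no other row has the prescribed coordinates.

```python
from itertools import product as cartesian

# Finite sets and mappings; every name here is illustrative.
X1, X2 = {0, 1}, {"a", "b", "c"}
X = set(cartesian(X1, X2))                 # "rows" (x_1, x_2)

def pi1(row):
    """Projection of the Cartesian product onto X1."""
    return row[0]

def pi2(row):
    """Projection of the Cartesian product onto X2."""
    return row[1]

Y = {10, 11, 12}
phi1 = {10: 0, 11: 1, 12: 1}               # phi_1 : Y -> X1
phi2 = {10: "c", 11: "a", 12: "a"}         # phi_2 : Y -> X2

# The mediating mapping psi : Y -> X, y |-> (phi_1(y), phi_2(y)).
psi = {y: (phi1[y], phi2[y]) for y in Y}

lands_in_product = all(psi[y] in X for y in Y)
commutes = all(pi1(psi[y]) == phi1[y] and pi2(psi[y]) == phi2[y] for y in Y)

# Uniqueness: the only row with the prescribed coordinates is psi(y).
unique = all([row for row in X
              if pi1(row) == phi1[y] and pi2(row) == phi2[y]] == [psi[y]]
             for y in Y)
print(lands_in_product, commutes, unique)
```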
All this sounds nice, but why are we sure that the Cartesian product $\times\{X_\nu;\ \nu \in A\}$ is a non-empty set? Let us look at this more carefully, and let us start with the case when the sets $X_\nu$ are mutually disjoint. In this case, this is nothing else but the axiom of choice! (The mapping which is an element of the Cartesian product evidently defines the set occurring in the axiom of choice, and vice versa.) Actually, the non-emptiness of the Cartesian product is one of the equivalent forms of the axiom of choice. (It is not difficult to prove, but we will not do this; see, e.g., [6, p. 101].) As for the latter, we have already said that we have to accept it if we do not want to ruin half of the building of modern mathematics. Therefore, we have to accept the "dogma" of the non-emptiness of products. This will be very helpful, as we will see later, for a series of important and useful theorems.

Exercise 3. Construct the product and the coproduct of a non-empty family of objects in Ord.

Hint. These are the same Cartesian product and disjoint union, but with the order chosen in a special way.
Now we pass to the category Lin. Suppose $X_\nu$; $\nu \in A$ is a family of linear spaces. First note that the Cartesian product of the underlying sets (we already "know" that it exists!) is a linear space with respect to the coordinatewise operations. We denote this space by $\times\{X_\nu;\ \nu \in A\}$. The projections $\pi_\nu$ from Proposition 1 are, evidently, linear operators. Consider the subspace in $\times\{X_\nu;\ \nu \in A\}$ consisting of the mappings $f$ such that $f(\nu) \ne 0$ only for a finite set of indices $\nu$ (depending on $f$, of course). This subspace is called the direct sum of our family of linear spaces and is denoted by $\bigoplus\{X_\nu;\ \nu \in A\}$. (Sometimes, to avoid confusion, one says "outer direct sum".) For every $\nu \in A$ and $x \in X_\nu$ we put

$$[i_\nu(x)](\mu) := \begin{cases} x, & \mu = \nu, \\ 0, & \mu \ne \nu. \end{cases}$$

(We identify the "direct summand" $X_\nu$; $\nu \in A$ with its image under $i_\nu$ in the direct sum.) Now everything is prepared for the formulation of the following evident result.

Proposition 2. Every family $X_\nu$; $\nu \in A$ of linear spaces (i.e., objects in Lin) has the product and the coproduct in Lin. These are the Cartesian product $(\times\{X_\nu;\ \nu \in A\}, \{\pi_\nu\})$ and the direct sum $(\bigoplus\{X_\nu;\ \nu \in A\}, \{i_\nu\})$, respectively. ∎
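For the coproduct half of Proposition 2, the mediating operator can be written down explicitly. The lines below are a sketch of this standard computation; they also show where the finiteness condition in the definition of the direct sum is used, since the sum has only finitely many non-zero terms.

```latex
% \varphi_\nu : X_\nu \to Y are given linear operators; f \in \bigoplus\{X_\nu;\ \nu \in A\}.
\[
\psi(f) \;:=\; \sum_{\nu \in A,\ f(\nu) \neq 0} \varphi_\nu\bigl(f(\nu)\bigr),
\qquad\text{so that}\qquad
\psi\bigl(i_\nu(x)\bigr) = \varphi_\nu(x) \quad (x \in X_\nu).
\]
% Uniqueness: every f is the finite sum of the elements i_\nu(f(\nu)), so a
% linear \psi with \psi i_\nu = \varphi_\nu is determined on the whole direct sum.
```

On the full Cartesian product of an infinite family the same formula would in general involve infinitely many non-zero terms, so it does not define an operator there.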
Certainly, in the case of a finite family of spaces, say, $X_1, \dots, X_n$, the product and the coproduct coincide (as objects), and we have the right to use either notation. Usually we will write $X_1 \oplus \dots \oplus X_n$ or $\bigoplus_{k=1}^{n} X_k$. The easiest way is to interpret elements of this space as $n$-tuples $x = (x_1, \dots, x_n)$; $x_k \in X_k$ and to identify every $X_k$; $k = 1, \dots, n$ with the image of the corresponding embedding $i_k$.

The reader remembers, of course, that we have used the symbol $\oplus$ for the decomposition of a linear space into the direct sum of two subspaces. This similarity is not incidental.

Proposition 3. Let $X$ be a linear space, $X_1$ and $X_2$ subspaces of $X$, and $i_1$ and $i_2$ the natural embeddings of these subspaces into $X$. The following conditions are equivalent:
(i) $X$ is decomposed into the direct sum of $X_1$ and $X_2$;
(ii) $X$, together with the embeddings $i_1$ and $i_2$, is the coproduct of $X_1$ and $X_2$ in Lin;
(iii) there is a linear isomorphism of the outer direct sum $X_1 \oplus X_2$ onto $X$ taking $(y, z)$ to $y + z$.

Proof. (i) $\Longrightarrow$ (ii). If $Y$ is a "new" space, and $\varphi_k : X_k \to Y$; $k = 1, 2$ are linear operators, then the operator $\psi : X \to Y$ taking $y + z$; $y \in X_1$, $z \in X_2$ to $\varphi_1(y) + \varphi_2(z)$ is well defined. This, obviously, is the unique operator that makes the diagram from the definition of coproduct commutative.
(ii) $\Longrightarrow$ (iii) follows from Proposition 2 and Theorem 1 (on the uniqueness) if we apply them to the coproducts in Lin.
(iii) $\Longrightarrow$ (i) is clear. ∎
In the categories Ab and Gr products and coproducts always exist as well. In both cases products are Cartesian products of the underlying sets with the pointwise group operation and the projections defined in Proposition 1. As for coproducts, in Ab it is the direct sum of abelian groups defined by analogy with the direct sum of linear spaces. In Gr it is a more complicated thing, the so-called free product of groups; see, e.g., [15].

* * *

Already at the dawn of topology there was a need for the "right" definition of what we now call the product in Top. At that time, in the absence of categorical thinking, it was a difficult problem.

Let $X_\nu$; $\nu \in A$ be a family of topological spaces with topologies $\tau_\nu$. Again consider the Cartesian product $\times\{X_\nu;\ \nu \in A\}$. Define a topology on it by using the prebase consisting of the subsets indexed by pairs $(\nu \in A,\ U \in \tau_\nu)$ and having the form $\{f \in \times\{X_\nu;\ \nu \in A\} : f(\nu) \in U\}$. (Thus, every set in this prebase consists of "rows" which have a certain coordinate belonging to some open set, while other coordinates are arbitrary.) The topology in $\times\{X_\nu;\ \nu \in A\}$ with this prebase is called the Tychonoff topology.
Definition 3. The Cartesian product of topological spaces $X_\nu$; $\nu \in A$ endowed with the Tychonoff topology is called the topological, or Tychonoff product of these topological spaces.

The following almost obvious proposition will be useful later.

Proposition 4. Let $M$ be a subset of the topological product $\times\{X_\nu;\ \nu \in A\}$. Then a point $f \in M$ is interior $\Longleftrightarrow$ $M$ contains a neighborhood of the form $U_{f, \nu_1, \dots, \nu_n} = \{g \in \times\{X_\nu;\ \nu \in A\} : g(\nu_k) \in U_{\nu_k};\ k = 1, \dots, n\}$ for some $\nu_k \in A$ and for some open sets $U_{\nu_k}$ in $X_{\nu_k}$. ∎
Of course, the topological product of every family of Hausdorff spaces is a Hausdorff space as well. It is clear also that for every $\nu \in A$ the (set-theoretic) projection $\pi_\nu : X \to X_\nu$ defined above is a continuous mapping.

Theorem 2. Every family of non-empty topological spaces has a product in Top (and in HTop if all the spaces are Hausdorff). This is the topological product of these spaces with the projections defined above.

Proof. Suppose $X_\nu$; $\nu \in A$ is our family, and $Y$ is a "new" topological space together with the family of continuous mappings $\varphi_\nu : Y \to X_\nu$; $\nu \in A$. Then obviously there exists a unique mapping of sets $\psi : Y \to \times\{X_\nu;\ \nu \in A\}$ such that for every $\nu \in A$ the diagram

$$\begin{array}{ccc} \times\{X_\nu;\ \nu \in A\} & \xleftarrow{\ \psi\ } & Y \\ {\scriptstyle\pi_\nu}\downarrow & \swarrow{\scriptstyle\varphi_\nu} & \\ X_\nu & & \end{array}$$

is commutative. So all we need to verify the category-theoretic definition of product is to show that $\psi$ is continuous.

Suppose that $V := \{f \in \times\{X_\nu;\ \nu \in A\} : f(\nu) \in U\}$ is a set in the standard prebase of the topology in $\times\{X_\nu;\ \nu \in A\}$ (see above). Clearly, if $y \in Y$, then $\psi(y) \in V$ $\Longleftrightarrow$ $\varphi_\nu(y) \in U$, and this means that the inverse image of the set $V$ under $\psi$ coincides with the inverse image of the subset $U$ in $X_\nu$ under $\varphi_\nu$, and thus is open due to the continuity of the latter mapping. It remains to use the second statement of Proposition 2.6. ∎
We give two illustrations. First of all note that, as a special case of the general definition of the Cartesian product, the Cartesian product of a family of copies of $\mathbb{C}$ indexed with the points of some set $A$ is the same as the set of all complex-valued functions on $A$. So, by analogy with the product of several identical numbers, it is denoted by $\mathbb{C}^A$.
Exercise 4°. The convergence of a sequence (for advanced readers, a net) in $\mathbb{C}^A$ with respect to the Tychonoff topology is precisely the simple (i.e., pointwise, or coordinatewise) convergence. Thus, the convergence discussed in Exercise 2.5 is the convergence in $C[0, 1]$ regarded as a topological subspace in $\mathbb{C}^{[0,1]}$.

Exercise 5. The topological product of a countable family of two-point discrete spaces is isomorphic to the Cantor perfect set.$^5$

The coproducts in the same categories Top and HTop do not pose problems.

Exercise 6. Every family of topological spaces $\{X_\nu\}$ has a coproduct in Top (and in HTop if these spaces are Hausdorff). This is their set-theoretic disjoint union endowed with the topology where open sets are those having open intersection with each $X_\nu$.

However, even in the most "popular" categories the (co)products do not always exist. To advanced readers, we recommend the following exercise.
Exercise 7*. A family of metric spaces, each containing more than one point, has a product in Met $\Longleftrightarrow$ the family is at most countable.

Hint. For the implication $\Longleftarrow$, construct $\prod\{X_n;\ n \in \mathbb{N}\}$ as the Cartesian product of metric spaces with the metric

$$d(f, g) := \sum_{n=1}^{\infty} \frac{d(f(n), g(n))}{2^n \bigl(d(f(n), g(n)) + 1\bigr)}.$$

The implication $\Longrightarrow$ is proved by contradiction: take the metric space $\{0, 1\}^{\mathbb{N}}$ with the metric described above as the "obnoxious" space $Y$.

The following exercises, despite being relatively easy, are not obligatory.

Exercise 8. Every family of metric spaces has a coproduct in Met and in Met_u.

Hint. Endow the disjoint union with a suitable metric.

Exercise 9*. A family of metric spaces each consisting of more than one point has a product in Met_u $\Longleftrightarrow$ the family is at most countable.

Exercise 10*.
(i) A family of non-empty metric spaces has a product in $\mathrm{Met}_1$ $\iff$ all of these spaces, except, possibly, finitely many, have finite diameter, and these diameters are bounded by a constant.
(ii) A family of metric spaces has a coproduct in $\mathrm{Met}_1$ $\iff$ all of them, except possibly one, are empty.
(iii) However, in the full subcategory Met 11 of the category $\mathrm{Met}_1$ consisting of the spaces with the diameter $\le 1$, every family of non-empty spaces has the product and the coproduct.
Hint. (i) The $\Longrightarrow$ part is proved by contradiction. You should take a singleton as $Y$ and construct two elements $x, y \in X$ for which $d(x, y) > n$ for all $n$. To prove the $\Longleftarrow$ part consider the following metric in $X$: $d(f, g) := \sup_\nu d(f(\nu), g(\nu))$. (ii) If we consider a two-point space with sufficiently large diameter as $Y$, we get the necessary contradiction.
⁵ The Cantor perfect set, or just the Cantor set, is the subset of the closed interval $[0, 1]$ consisting of all numbers of the form $\sum_{n=1}^{\infty} t_n / 3^n$, where $t_n$ is equal to 0 or 2. Geometrically it can be described as follows. We divide the interval $[0, 1]$ into three equal parts and delete the middle part, i.e., the open interval $(1/3, 2/3)$, called "the interval of rank 1". Next, from the two remaining closed intervals $[0, 1/3]$ and $[2/3, 1]$ we delete their middle parts, the open intervals $(1/9, 2/9)$ and $(7/9, 8/9)$, called "intervals of rank 2". Next, from the remaining four closed intervals we delete their middle thirds, "the intervals of rank 3", and so on. The subset of $[0, 1]$ that remains after all deletions of all these open intervals (of rank 1, 2, ...) is the Cantor set.
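The metric from the hint to Exercise 7 can be tried out numerically. The following is only a minimal sketch: it truncates the infinite sum to the finitely many coordinates supplied, and the helper names (`bounded`, `product_metric`) and the choice of the real line with $|x - y|$ as every factor are assumptions made for illustration, not part of the text.

```python
def bounded(d):
    """Map a distance d >= 0 to d / (d + 1), a value in [0, 1)."""
    return d / (d + 1.0)

def product_metric(f, g, metrics):
    """Truncated version of the sum over n of d_n(f(n), g(n)) / (2^n (d_n(f(n), g(n)) + 1)).

    f, g    : finite lists of coordinates (n starts at 1)
    metrics : list of distance functions, one per coordinate
    """
    total = 0.0
    for n, (x, y, d) in enumerate(zip(f, g, metrics), start=1):
        total += bounded(d(x, y)) / 2 ** n
    return total

# Illustrative data: 20 copies of the real line with the usual distance.
abs_metric = lambda x, y: abs(x - y)
metrics = [abs_metric] * 20
f = [0.0] * 20
g = [1.0] * 20
h = [float(n) for n in range(20)]

print(product_metric(f, g, metrics))  # stays below 1 however large the factors are
```

Each summand is at most $2^{-n}$, so the series converges even when the individual metrics are unbounded; this is the reason for passing from $d$ to $d/(d+1)$.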
7. Functors
With some oversimplification, we can say that in category theory the notion of functor plays the same role with respect to the notion of category as in set theory the notion of mapping plays with respect to the notion of set. This notion has also a "world outlook" aspect. Namely, the notion of functor provides us with understanding and formalization of what the "right" mathematical construction connecting two, generally speaking, different areas of mathematics should be.
Definition 1. Let $\mathcal{K}$ and $\mathcal{L}$ be two categories. We say that a covariant functor from $\mathcal{K}$ to $\mathcal{L}$, or between $\mathcal{K}$ and $\mathcal{L}$ (again in the arrow notation, $F : \mathcal{K} \to \mathcal{L}$) is defined if
(I) to every object $X \in \mathrm{Ob}(\mathcal{K})$ an object $F(X) \in \mathrm{Ob}(\mathcal{L})$ is assigned;
(II) to every morphism $\varphi : X \to Y$ in $\mathcal{K}$ a morphism $F(\varphi) : F(X) \to F(Y)$ in $\mathcal{L}$ is assigned;
and the following two conditions are satisfied:
(i) if the composition $\psi\varphi$ makes sense, then $F(\psi\varphi) = F(\psi)F(\varphi)$;
(ii) for every $X \in \mathrm{Ob}(\mathcal{K})$ we have $F(1_X) = 1_{F(X)}$.
A contravariant functor from $\mathcal{K}$ to $\mathcal{L}$ is defined as a covariant functor from the dual category $\mathcal{K}^0$ to $\mathcal{L}$. In terms of the initial categories the latter notion, clearly, means the following. The definition of contravariant functor repeats the definition of the covariant one everywhere except (II) and (i). Namely, (II) must be replaced with
(II') to every morphism $\varphi : X \to Y$ in $\mathcal{K}$ a morphism $F(\varphi) : F(Y) \to F(X)$ in $\mathcal{L}$ is assigned,
whereas (i) must be replaced with
(i') if the composition $\psi\varphi$ makes sense, then $F(\psi\varphi) = F(\varphi)F(\psi)$.
(By the way, it is clear from this description that a contravariant functor from $\mathcal{K}$ to $\mathcal{L}$ can also be defined as a covariant functor from $\mathcal{K}$ to $\mathcal{L}^0$.)
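Conditions (i), (ii), and (i') of Definition 1 can be checked on a toy example. The sketch below is an illustration only: the "list" construction (sets go to lists over them, mappings act entrywise) and the "predicate" construction are assumptions chosen here and do not appear in the text, and all function names are hypothetical.

```python
def compose(g, f):
    """Composition g∘f of two mappings."""
    return lambda x: g(f(x))

def identity(x):
    return x

def fmap(f):
    """Covariant action on morphisms: f : X -> Y becomes lists over X -> lists over Y."""
    return lambda xs: [f(x) for x in xs]

def contramap(f):
    """Contravariant action: f : X -> Y sends a predicate on Y to a predicate on X."""
    return lambda predicate: compose(predicate, f)

double = lambda n: 2 * n          # a mapping of integers
show = lambda n: f"<{n}>"         # a mapping from integers to strings
sample = [1, 2, 3]

# Condition (i): F(ψφ) = F(ψ)F(φ), checked on one sample list.
lhs = fmap(compose(show, double))(sample)
rhs = compose(fmap(show), fmap(double))(sample)

# Condition (i'): the order of composition is reversed for the contravariant action.
is_long = lambda s: len(s) > 3    # a predicate on strings
pred_a = contramap(compose(show, double))(is_long)
pred_b = compose(contramap(double), contramap(show))(is_long)
```

Here `lhs` and `rhs` agree, and `pred_a` and `pred_b` agree on every integer; a finite check of this kind is of course evidence, not a proof.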
We recall that categories appeared in mathematics as a result of realizing that, informally, "morphisms are more important than objects". Respectively, functors appeared as a result of realizing that it is important to define the mathematical constructions not only on the objects, but on the morphisms as well. It is difficult to express this better than Eilenberg and MacLane have done in the pathbreaking paper ([23, p. 236]): "... whenever new abstract objects are constructed in a specified way out of given ones, it is advisable to regard the construction of the corresponding induced mappings on these new objects as an integral part of their definition."
The simplest example of a covariant functor on an abstract category $\mathcal{K}$ is, of course, the identity functor $1_{\mathcal{K}} : \mathcal{K} \to \mathcal{K}$. The simplest contravariant functor is, of course, the duality functor $1^{0}_{\mathcal{K}} : \mathcal{K} \to \mathcal{K}^0$. They both preserve objects and morphisms, but in the second case, after applying the functor, a morphism from $X$ to $Y$ is regarded as a morphism from $Y$ to $X$. Another example is the so-called constant functor from $\mathcal{K}$ to $\mathcal{L}$; it takes each object from $\mathcal{K}$ to a chosen object in $\mathcal{L}$, and each morphism in $\mathcal{K}$ to the local identity of this object. (This functor can be considered both as a covariant and as a contravariant functor; this also happens!)
The reader can easily define the composition of functors (by analogy with the composition of mappings). In addition, by analogy with the isomorphisms of objects in a category, one can define an isomorphism of categories as a functor having the inverse functor. An isomorphism can be defined equivalently as follows: it is a covariant functor $I : \mathcal{K} \to \mathcal{L}$ establishing a bijection between the classes $\mathrm{Ob}(\mathcal{K})$ and $\mathrm{Ob}(\mathcal{L})$, and such that for every pair $X, Y \in \mathrm{Ob}(\mathcal{K})$ the mapping
A covariant functor $F : \mathcal{K} \to \mathcal{L}$ is called faithful, respectively, full, if for every $X, Y \in \mathcal{K}$ the mapping of the set $h_{\mathcal{K}}(X, Y)$ to the set $h_{\mathcal{L}}(F(X), F(Y))$, taking $\varphi$ to $F(\varphi)$, is injective, respectively, surjective. For example, the functor of natural embedding of a subcategory into the "larger" category is always faithful, and this functor is full $\iff$ our subcategory is full. Contravariant faithful and full functors are defined in the same way, but with $h_{\mathcal{L}}(F(Y), F(X))$ instead of $h_{\mathcal{L}}(F(X), F(Y))$.
Let us formulate an easy but very useful property of functors.
Theorem 1. A covariant functor takes a retraction to a retraction, a coretraction to a coretraction, and an isomorphism to an isomorphism. A contravariant functor takes a retraction to a coretraction, a coretraction to a retraction, and (again) an isomorphism to an isomorphism.
Proof. Having in mind duality, we can restrict ourselves to the case of a covariant functor $F : \mathcal{K} \to \mathcal{L}$. Let $\varphi : X \to Y$ be a retraction in $\mathcal{K}$ with the right inverse $\psi : Y \to X$, so that $\varphi\psi = 1_Y$. Then, by the definition of functor, $F(\varphi)F(\psi) = F(\varphi\psi) = F(1_Y) = 1_{F(Y)}$, i.e., $F(\psi)$ is the right inverse for $F(\varphi)$ in $\mathcal{L}$. The rest is clear. ∎
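The computation in the proof can be replayed on a small instance. This is a hedged sketch only: the retraction $n \mapsto n \bmod 3$ from the integers onto $\{0, 1, 2\}$ and the entrywise "list" functor are illustrative assumptions, not examples from the text.

```python
def fmap(f):
    """Entrywise action of a mapping on lists (an illustrative covariant functor on Set)."""
    return lambda xs: [f(x) for x in xs]

def phi(n):
    """A retraction from the integers onto Y = {0, 1, 2}."""
    return n % 3

def psi(r):
    """A right inverse of phi: phi(psi(r)) == r for every r in Y."""
    return r

Y = [0, 1, 2]
lists_over_Y = [[], [0], [2, 1, 1], [0, 1, 2, 2]]

# F(phi)F(psi) should act as the identity on F(Y), here on lists over Y.
round_trips = [fmap(phi)(fmap(psi)(ys)) for ys in lists_over_Y]
```

Note that `psi(phi(5))` is `2`, not `5`: the composition in the other order need not be an identity, which is exactly the difference between a retraction and an isomorphism.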
Modern mathematics is swarming with functors, so the examples we give below are a trifling part of the known ones.
Example 1. The very definition of category implicitly contains two natural series of functors. One of these series consists of covariant, and the other of contravariant functors. Perhaps these are the most important among all functors.
Let us choose an object $X$ of an (arbitrary) category $\mathcal{K}$. To every $Y \in \mathrm{Ob}(\mathcal{K})$ we associate the set $h_{\mathcal{K}}(X, Y)$, and to every morphism $\varphi : Y_1 \to Y_2$ in $\mathcal{K}$ we associate a mapping, denoted by $h_{\mathcal{K}}(X, \varphi)$, from the set $h_{\mathcal{K}}(X, Y_1)$ to the set $h_{\mathcal{K}}(X, Y_2)$ taking $\psi \in h_{\mathcal{K}}(X, Y_1)$ to $\varphi\psi$. (In other words, $h_{\mathcal{K}}(X, \varphi)$ takes $\psi$ to the unique morphism $\chi$ such that the diagram
[Diagram: triangle with arrows $\psi : X \to Y_1$, $\varphi : Y_1 \to Y_2$, and $\chi : X \to Y_2$.]
is commutative.) It is easy to see that the mappings $Y \mapsto h_{\mathcal{K}}(X, Y)$ (on objects) and $(\varphi : Y_1 \to Y_2) \mapsto (h_{\mathcal{K}}(X, \varphi) : h_{\mathcal{K}}(X, Y_1) \to h_{\mathcal{K}}(X, Y_2))$ (on morphisms) determine a covariant functor from $\mathcal{K}$ to Set. It is called the covariant functor of morphisms or, in short, the covariant mor-functor (sometimes one says covariant principal functor) generated by the object $X$. It is denoted by $h_{\mathcal{K}}(X, ?) : \mathcal{K} \to \mathrm{Set}$.
In addition to the covariant mor-functor, the object $X$ generates another functor of morphisms, or mor-functor, denoted by $h_{\mathcal{K}}(?, X)$. It is defined as $h_{\mathcal{K}^0}(X, ?) : \mathcal{K}^0 \to \mathrm{Set}$, i.e., as the functor we constructed above, but for the dual category. We strongly suggest that the reader give the detailed definition in terms of the initial category $\mathcal{K}$.
For a series of categories it is more natural to define the mor-functor with the range not in Set but in one of the categories of sets with additional structure.
Example 2. Let $\mathcal{K} = \mathrm{Lin}$. We know from linear algebra that in this case $h_{\mathcal{K}}(X, Y)$, i.e., the set of linear operators from $X$ to $Y$, has itself the structure of a linear space. And it is easy to verify that the mappings $h_{\mathcal{K}}(X, \varphi)$;
Lin.
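For finite sets, the covariant mor-functor of Example 1 can be written out explicitly. The sketch below is an illustration under assumptions: mappings are encoded as Python dictionaries, and the sets `X`, `Y1`, `Y2`, `Y3` and the mappings `phi`, `rho` are invented sample data.

```python
from itertools import product

def hom(X, Y):
    """All mappings X -> Y between finite sets, each encoded as a dict."""
    X = list(X)
    return [dict(zip(X, values)) for values in product(list(Y), repeat=len(X))]

def post_compose(phi):
    """h(X, phi): takes psi : X -> Y1 to the composition phi∘psi : X -> Y2."""
    return lambda psi: {x: phi[y] for x, y in psi.items()}

X = ["a", "b"]
Y1 = [0, 1, 2]
Y2 = ["u", "v"]
Y3 = [True, False]

phi = {0: "u", 1: "v", 2: "u"}                # a mapping Y1 -> Y2
rho = {"u": True, "v": False}                 # a mapping Y2 -> Y3
rho_phi = {y: rho[phi[y]] for y in Y1}        # the composition Y1 -> Y3
id_Y1 = {y: y for y in Y1}

# Functoriality: h(X, rho∘phi) and h(X, rho)∘h(X, phi) agree on every psi.
functorial = all(
    post_compose(rho_phi)(psi) == post_compose(rho)(post_compose(phi)(psi))
    for psi in hom(X, Y1)
)
```

The set `hom(X, Y1)` has $3^2 = 9$ elements here, and `post_compose(id_Y1)` leaves each of them unchanged, in agreement with condition (ii) of Definition 1.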
Remark. We should warn the reader in advance that the main examples of such "specialized" mor-functors will be considered later. The mor-functors acting on the category of Banach spaces and having the range in the same category, and, in particular, the most famous of all of them, the contravariant "asterisk"-functor (of passage to the dual spaces and adjoint operators), will be some of them. Similarly, it is more convenient to consider mor-functors defined on Ab as the functors with the range in the same category Ab. On the contrary, we have to consider the mor-functors on Gr as taking values in Set as in Example 1 because the sets arising in this situation have neither group, nor other supplementary structure.
Now we continue with examples of functors.
Example 3. Here we consider a family of functors all having a similar definition. Suppose $\mathcal{K}$ is a category such that its objects are sets with some supplementary structure, and morphisms are mappings compatible in some sense with this structure (like in the majority of examples of Section 0.3). In this case the so-called forgetful functor $\square : \mathcal{K} \to \mathrm{Set}$ arises (the notation says: "something was in the square, but we forgot about it"). This functor assigns to every object $X \in \mathcal{K}$ the underlying set, denoted by $\square X$, and to every morphism from
in an obvious way (this functor "forgets" the multiplication by scalars, but still "remembers" the addition). Another example is the functor $\square : \mathrm{Met} \to \mathrm{Top}$ "forgetting" the metric, but "remembering" the topology generated by this metric. Such functors will also be called forgetful.
Example 4. A freedom functor. The forgetful functors take a set with a given structure to the "naked" set. The functor we describe here acts in the opposite direction. Let $S$ be a set. Denote by $\mathcal{F}(S)$ the set of formal linear combinations of elements of $S$ with complex coefficients. This set is naturally endowed with the structure of linear space with a linear basis $\{1x : x \in S\}$; if we identify $x$ and $1x$, we can view $S$ as lying in $\mathcal{F}(S)$ and forming a linear basis there. Let us associate to every set $S$ the linear space $\mathcal{F}(S)$ and to every mapping of sets $\varphi : S \to T$ the linear operator $\mathcal{F}(\varphi) : \mathcal{F}(S) \to \mathcal{F}(T)$ uniquely defined by the rule that it takes $x \in S \subset \mathcal{F}(S)$ to $\varphi(x) \in T \subset \mathcal{F}(T)$. Clearly, we obtain a covariant functor $\mathcal{F} : \mathrm{Set} \to \mathrm{Lin}$. It is called a freedom functor. (We will explain the origin of this term later.)
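The freedom functor of Example 4 admits a direct model for finite data. The following is a sketch under assumptions: a formal linear combination is encoded as a dictionary from elements of $S$ to nonzero complex coefficients, and the helper names (`free_map`, `add`, `scale`) and the sample mapping `phi` are hypothetical.

```python
from collections import defaultdict

def add(v, w):
    """Sum of two formal linear combinations; zero coefficients are dropped."""
    out = dict(v)
    for x, c in w.items():
        out[x] = out.get(x, 0) + c
    return {x: c for x, c in out.items() if c != 0}

def scale(lam, v):
    """Product of the scalar lam and the formal linear combination v."""
    return {x: lam * c for x, c in v.items() if lam * c != 0}

def free_map(phi):
    """F(phi): the linear extension of phi : S -> T to formal combinations."""
    def linear(v):
        out = defaultdict(complex)
        for x, coeff in v.items():
            out[phi(x)] += coeff          # coefficients of elements with equal images add up
        return {t: c for t, c in out.items() if c != 0}
    return linear

# Illustrative mapping from S = {"a", "b", "c"} to T = {0, 1}.
phi = {"a": 0, "b": 0, "c": 1}.get

v = {"a": 1, "b": 2j}
w = {"b": -2j, "c": 3}
image_of_v = free_map(phi)(v)             # the combination (1 + 2j) * 0-basis-vector
```

The operator `free_map(phi)` sends the basis vector `{"a": 1}` to the basis vector `{0: 1}`, as the defining rule in Example 4 requires, and it is linear by construction.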
Exercise 1°. For given $S \in \mathrm{Set}$ and $E \in \mathrm{Lin}$, construct a bijection between the sets $h_{\mathrm{Set}}(S, \square(E))$ and $h_{\mathrm{Lin}}(\mathcal{F}(S), E)$.
Remark. This exercise, if analyzed in a proper way, leads to one of the central notions in category theory, the so-called adjoint functor. (In this case the freedom functor $\mathcal{F}$ is adjoint to the forgetful functor $\square$.) Curious readers can find the details in, say, [24].
Recall that the majority of categories (though not all) considered by now had one common feature. Their objects are sets (in most cases, with some supplementary structure), and morphisms are mappings of these sets (compatible in some way with this structure). Actually there is a formally defined notion describing all such cases.
Definition 2. A concrete category is a pair $(\mathcal{K}, \square)$ consisting of a category $\mathcal{K}$ and a faithful covariant functor $\square$ from $\mathcal{K}$ to the category of sets.
Of course, most of the categories we described above can be viewed as concrete if we take for $\square$ the forgetful functor to Set. In some cases, however, it is more reasonable to consider some other functors, for instance those associating with an object $X$ only a part of the underlying set. (This will be discussed, in particular, in Exercise 2.5.19, where the category $\mathrm{Ban}_1$ is constructed.) Now the hidden idea of Proposition 5.5 becomes clear.
Exercise 2°. Let $(\mathcal{K}, \square)$ be a concrete category, and
Here is another important property of these categories.
Proposition 1. If $(\mathcal{K}, \square)$ is a concrete category and $\varphi$ is an isomorphism in $\mathcal{K}$, then $\square(\varphi)$ is a bijection.
Proof. This is a special case of Theorem 1. ∎
In many concrete categories, for instance, in our examples of categories of pure algebra, the converse is also true: if a morphism is a bijection as a mapping of the underlying sets, then it is an isomorphism. We distinguish the class of such categories.
Definition 3. A concrete category $(\mathcal{K}, \square)$ is said to be balanced if for every morphism
Thus, the categories we have just mentioned are balanced. At the same time, it is easy to see that the categories $(\mathrm{Met}, \square)$ and $(\mathrm{Top}, \square)$ are not balanced (indicate the bijective morphisms that are not isomorphisms). Running ahead, we note that for some categories of topology and functional analysis the fact that they are balanced is the content of fundamental theorems in these areas; in this book these are the Banach theorem on the inverse operator and the Alexandroff theorem.
The functors of "partial loss of memory" can also be formalized, this time as functors between two concrete categories. (You can try to give the exact definition.)
The remaining part of this fragment of small print gives some kind of "general education", which will not be used in the rest of the book.
Of course, you have already encountered the adjective "free" in mathematics. This happened when the notion of a free group and the (different) notion of a free abelian group were discussed. The fact is that the functors $\mathcal{F} : \mathrm{Set} \to \mathrm{Gr}$ and $\mathcal{F} : \mathrm{Set} \to \mathrm{Ab}$, defined similarly to the functor $\mathcal{F} : \mathrm{Set} \to \mathrm{Lin}$ from the previous example, assign to a given set $S$ the free group, respectively, the free abelian group with $S$ as the system of generators. (The details of the definition and the verification of the properties of such a functor are left to the reader.) Actually, the functor $\mathcal{F} : \mathrm{Set} \to \mathrm{Top}$ that assigns to every set $S$ the discrete topological space (with $S$ as the underlying set) has the same nature (though this may not be obvious). And in all these cases the analogue of Exercise 1 is valid if we replace Lin with Gr, Ab, or Top. (Formulate the precise statements and prove them!)
Now we are ready to perceive the general definition of free object and freedom functor, as you have guessed, within the framework of an (arbitrary) concrete category.
Definition 4. Let $(\mathcal{K}, \square)$ be a concrete category, and $X$ an object in $\mathcal{K}$.
A subset $S \subset \square X$ is called a basis of this object if for every $Y \in \mathcal{K}$ and for a mapping (i.e., a morphism in Set)
[Diagram: triangle with vertices $\square X$, $\square Y$, and $S$, with $j : S \to \square X$.]
where $j$ is the natural embedding, is commutative. An object in $\mathcal{K}$ that has a basis is called free.
Exercise 3. Every two objects in a concrete category that have bases of the same cardinality are isomorphic.
Hint. If $S_k$ is a basis for $X_k$, $k = 1, 2$, and $in_k$ is a natural embedding of $S_k$ into $\square X_k$, and if
It is easy to see that every object in Lin is free, and its basis is an arbitrary linear basis of the corresponding linear space (cf. Exercise 1). Free objects in Gr and in Ab are precisely what are called free groups and free abelian groups (indicate the basis!). Finally, free objects in Top are discrete topological spaces, and in Met, discrete metric spaces; in these cases the basis coincides with the corresponding underlying set.
The existence of a basis is connected with the existence of coproducts.
Exercise 4*. Suppose that in a concrete category $(\mathcal{K}, \square)$ there is an object $I$ with the basis consisting of one element $s \in \square(I)$. Then an object $X$ of this category has a basis $\iff$ there is a family of morphisms $i_\nu : I_\nu \to X$; $\nu \in A$, where $I_\nu = I$ for all $\nu \in A$, such that the pair $(X, \{i_\nu ; \nu \in A\})$ is a coproduct of the family $I_\nu$; $\nu \in A$. In addition, the subset $\{(\square i_\nu)(s) ; \nu \in A\}$ in $\square(X)$ is a basis for $X$.
(Thus, if in a category there is an object $I$ with a singleton as a basis, then all free objects in this category are coproducts of the families consisting of copies of $I$.)
Hint. If $A$ is a basis for $X$, then $(X, \{i_\nu ; \nu \in A\})$ is a coproduct of the family $I_\nu$; $\nu \in A$, where $i_\nu : I_\nu = I \to X$ are uniquely defined by the rule that $\square i_\nu(s) = \nu \in A \subset \square(X)$. If $(X, \{i_\nu ; \nu \in A\})$ is a coproduct of the last family, then the basis for $X$ is the subfamily $\{\square i_\nu(s) ; \nu \in A\}$ in $\square(X)$.
Definition 5. A functor $\mathcal{F} : \mathrm{Set} \to \mathcal{K}$, where $(\mathcal{K}, \square)$ is a concrete category, is called a freedom functor if to each $S \in \mathrm{Set}$ it assigns an object $\mathcal{F}(S)$ having $S$ as the basis, and to every mapping (i.e., morphism in Set)
[Diagram: square with vertices $\square\mathcal{F}(S)$, $\square\mathcal{F}(T)$, $S$, and $T$, with vertical arrows $j_S$ and $j_T$.]
in the category Set, where $j_S$ and $j_T$ are natural embeddings, is commutative. Certainly, the freedom functor is well defined if for every set $S$ there exists at least one object in $\mathcal{K}$ with the basis $S$. (Explain why in this case the morphism $\mathcal{F}$(
By analogy with the "partially forgetful" functors mentioned above, there are (and they are useful) so-called functors of relative freedom, acting in the opposite direction. The latter functors are related to the former through the "adjointness relations", like the bijection considered in Exercise 1. Despite their importance in some chapters of functional analysis (and algebra) they exceed the framework of this book. An example of such a functor useful in the theory of Banach algebras can be found in [25, p. 412].
A covariant functor from the standard simplicial category $\Delta$ to an arbitrary category $\mathcal{K}$ is called a cosimplicial $\mathcal{K}$-object. By analogy, a contravariant functor between these categories is called a simplicial $\mathcal{K}$-object. (If, say, $\mathcal{K}$ is Lin, then these objects are usually called (co)simplicial linear spaces, etc.) These functors are of extreme importance for studying the categories by homological methods (see, e.g., [26]). (The terminology comes from the historically first simplicial object, which was the contravariant functor $\Delta \to \mathrm{Top}$ taking the object $\{1, \ldots, n\}$ to the $(n-1)$-dimensional standard simplex in $\mathbb{R}^n$; $n \in \mathbb{N}$. Try to define the action of this functor on the morphisms in $\Delta$. Our hint: some of these morphisms are called faces, and some others, degenerations.)
The following category of technical nature plays a major role in topology, complex analysis, differential geometry, and recently, in functional analysis. Let $\Omega$ be a topological space. The category $\mathrm{Open}(\Omega)$ is defined as follows. The objects of $\mathrm{Open}(\Omega)$ are open sets in $\Omega$, and the set of morphisms between two objects $U$ and $V$ consists of the unique morphism of the natural embedding $U \subset V$ if $U$ is contained in $V$, and is empty otherwise. A contravariant functor from $\mathrm{Open}(\Omega)$ to a given category $\mathcal{K}$ is called a presheaf of objects from $\mathcal{K}$; in particular we can speak about presheaves of sets, abelian groups, rings, etc. (Try to give a detailed definition of a presheaf, say, of linear spaces without using the word "functor".) A presheaf satisfying some natural requirements is called a sheaf; to get an idea of the significance of sheaves, read the book [28].
Now we go back to the main material. As you know, in set theory there are two fundamental notions, namely sets and mappings. In contrast, category theory is based on three "pillars". Two of them are already familiar to you: these are category and functor. The third "pillar" is called a natural transformation, and here we give its definition.⁶ A detailed discussion of this notion exceeds the scope of our book, but we are sure that at least its definition must be familiar to every self-respecting "pure" mathematician of the beginning of the 21st century.
Definition 6. Suppose $F, G : \mathcal{K} \to \mathcal{L}$ are two covariant functors between some categories. We say that a natural transformation $\alpha = \{\alpha_X\}$ between the functors $F$ and $G$ is defined (the notation is again an arrow, $\alpha : F \to G$) if for every object $X \in \mathcal{K}$ a morphism $\alpha_X : F(X) \to G(X)$ is defined in $\mathcal{L}$ such that for every morphism $\varphi : X \to Y$ in $\mathcal{K}$ the diagram
$$\begin{array}{ccc} F(X) & \xrightarrow{\;F(\varphi)\;} & F(Y) \\ {\scriptstyle \alpha_X}\downarrow & & \downarrow{\scriptstyle \alpha_Y} \\ G(X) & \xrightarrow{\;G(\varphi)\;} & G(Y) \end{array}$$
(in $\mathcal{L}$) is commutative. A natural transformation between two contravariant functors from $\mathcal{K}$ to $\mathcal{L}$ is defined as the natural transformation between the corresponding covariant functors from $\mathcal{K}^0$ to $\mathcal{L}$ (decipher this!).
A natural transformation $\alpha = \{\alpha_X\}$ between two covariant functors is called a natural equivalence if $\alpha_X$ is an isomorphism for every $X \in \mathcal{K}$. A natural transformation $\alpha = \{\alpha_X\}$ between two contravariant functors is called a natural equivalence (sometimes one says natural antiequivalence) if it is a natural equivalence after the passage to the corresponding covariant functors.
The following two exercises must clarify the nature of these notions.
⁶ Originally this notion appeared in algebraic geometry, then it penetrated into pure algebra, and now it became an integral part of the language of a wide part of mathematics, including modern functional analysis. It is interesting that, as the founding fathers admit, historically category theory began from the study of natural transformations. "Category" was introduced in order to define "functor", and "functor" was introduced in order to define "natural transformation" [16, p. 18].
Exercise 6. Every morphism $\psi : X \to Y$ in an (arbitrary) category $\mathcal{K}$ gives a natural transformation $\alpha$ between the corresponding mor-functors from $\mathcal{K}$ to Set with $\alpha_Z := h_{\mathcal{K}}(\psi, Z) : h_{\mathcal{K}}(Y, Z) \to h_{\mathcal{K}}(X, Z)$, $\chi \mapsto \chi\psi$, as the components. Formulate and prove a similar (dual) proposition for the contravariant mor-functors.
Exercise 7. The functor $h_{\mathcal{K}}(X, ?)$ is naturally equivalent to the identity functor in $\mathcal{K}$ in each of the following cases: (i) $\mathcal{K} = \mathrm{Set}$, where $X$ is a singleton; (ii) $\mathcal{K} = \mathrm{Lin}$, $X = \mathbb{C}$.
Important examples of natural transformations of functors in functional analysis will be discussed later; see the canonical embedding into the second dual space in Section 2.5 and the Gelfand transform in Section 5.3.
* * *
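The commutative square of Definition 6 can be tested on concrete data. The sketch below is an illustration only: both functors are taken to be the entrywise "list" construction on Set, and the family of components is list reversal; neither choice comes from the text, and the function names are hypothetical.

```python
def fmap(f):
    """Entrywise action of a mapping on lists (used here for both F and G)."""
    return lambda xs: [f(x) for x in xs]

def reverse(xs):
    """The component alpha_X: reverse a list over X."""
    return list(reversed(xs))

def naturality_holds(f, xs):
    """Check G(f)∘alpha_X == alpha_Y∘F(f) on the single input xs."""
    return fmap(f)(reverse(xs)) == reverse(fmap(f)(xs))

words = ["norm", "functor", "set"]
checks = [naturality_holds(len, words), naturality_holds(str.upper, words)]
```

Since reversing twice returns the original list, every component here is a bijection, so this particular transformation is a natural equivalence in the sense defined above.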
When mathematicians start playing with their favorite toys, nothing can stop them. The notion of natural transformation of functors allows us to introduce a very useful class of categories. Namely, let us take two categories $\mathcal{K}$ and $\mathcal{L}$, and consider the category $\mathcal{L}^{\mathcal{K}}$ whose objects are arbitrary functors from $\mathcal{K}$ to $\mathcal{L}$ (covariant, to be definite), and morphisms are natural transformations between them; the composition of natural transformations is defined in an obvious way. Of course, this construction requires some care when we choose $\mathcal{K}$ and $\mathcal{L}$; otherwise we come to paradoxes. But, fortunately, we can do this without loss of generality (again, let us trust MacLane [16]).
Exercise 8. Isomorphisms of functors as of objects in $\mathcal{L}^{\mathcal{K}}$ are precisely their natural equivalences.
We have mentioned the notion of isomorphism of categories before and said that, being too rigid, it is not very useful in practice. Now we at last can give a correct explanation of which categories should be considered "essentially the same".
Definition 7. A covariant functor $F : \mathcal{K} \to \mathcal{L}$ (between two arbitrary categories) is called an equivalence (of these categories) if there exists a covariant functor $G : \mathcal{L} \to \mathcal{K}$ such that the composition $G \circ F$ is a functor naturally equivalent to $1_{\mathcal{K}}$, and the composition $F \circ G$ is a functor naturally equivalent to $1_{\mathcal{L}}$. A contravariant functor $F : \mathcal{K} \to \mathcal{L}$ is called a dual equivalence (sometimes, antiequivalence) if the corresponding covariant functor from $\mathcal{K}^0$ to $\mathcal{L}$ is an equivalence. Two categories are called equivalent, or dually equivalent (or antiequivalent) if there is an equivalence, respectively, dual equivalence, between them.
Exercise 9*. An equivalence of two categories can be characterized as an isomorphism of objects in some properly defined category.
Hint. The objects of this category are the same as in the "category of categories" Cat. The morphisms between two categories are not just functors like in Cat, but classes of natural equivalence of these functors.
Exercise 10. The category of finite-dimensional linear spaces FLin is
(i) dually equivalent to itself;
(ii) equivalent to its full subcategory with objects $\mathbb{C}^n$; $n = 0, 1, \ldots$.
Hint. The functor $h(?, \mathbb{C})$ of passage to the dual space is the dual equivalence.
The second part of this exercise shows that equivalent categories can contain different numbers of objects (and there is nothing frightening in it).
Later we will tell the reader about extremely important examples of dual equivalence between some categories in functional analysis, which, at first sight, have nothing in common at all. They are the Gelfand dual equivalence between the category of commutative $C^*$-algebras and the category of locally compact topological spaces (see Section 6.3), and the Pontryagin dual equivalence of the category Ab and the category of compact abelian groups (see Section 7.5).
And here is our last, but not least set-theoretic notion.
Definition 8. A covariant, respectively, contravariant functor from a category $\mathcal{K}$ to Set is said to be representable if it is naturally equivalent to the functor $h(X, ?)$, respectively, $h(?, X)$ for some object $X \in \mathcal{K}$. (The latter is called the representing object for our functor.)
Exercise 11. All the forgetful functors on the categories in examples of Section 0.3 are representable.
Hint. For Lin the representing object is $\mathbb{C}$, for Ab and Gr it is $\mathbb{Z}$, for Rin it is the ring of polynomials with integer coefficients and without the free term. For the other categories these are singletons.
Of course, representable functors were introduced not for examples of this kind. Actually, over time it became clear that a series of very different problems on the "existence of the right construction", answering some mathematical question, can be reduced to the problem of representability of some functor. Running ahead, we note that the problems of the existence of completions for metric and for normed spaces, of algebraic and "Banach" tensor product, and a series of other problems are among them.
Chapter 1
Normed Spaces and Bounded Operators ("Waiting for Completeness")
1. Prenormed and normed spaces. Examples
We can say that we enter the realm of functional analysis only after the following definition is given.
Definition 1. Suppose $E$ is a linear space. A function $\|\cdot\| : E \to \mathbb{R}_+$ (mapping a vector $x \in E$ to a number denoted by $\|x\|$) is called a prenorm if for every $x, y \in E$ and $\lambda \in \mathbb{C}$ we have
(i) $\|\lambda x\| = |\lambda| \, \|x\|$;
(ii) (triangle inequality) $\|x + y\| \le \|x\| + \|y\|$.
A prenorm is called a norm if, in addition,
(iii) $\|x\| = 0$ implies $x = 0$.
A linear space endowed with a prenorm (respectively, a norm) is called prenormed (respectively, normed).
One often says "seminorm" instead of "prenorm". If we want to specify which (pre)norm we are talking about, we say something like "the (pre)normed space $(E, \|\cdot\|)$".
Note that (i) immediately implies $\|0\| = 0$. So we can replace the condition (iii) with the condition $\|x\| = 0 \iff x = 0$.
Remark. As always, when speaking about linear spaces we had in mind complex spaces. In some cases, however, we shall need (pre)norms in real linear spaces, respectively, real (pre)normed spaces. They are defined literally in the same way, but $\lambda \in \mathbb{C}$ should be replaced with $\lambda \in \mathbb{R}$ in (i). However, we shall specify this when speaking about it.
Of course, the reader is familiar with the notion of norm. We consider here the more general notion of prenorm. This will be useful later, when we shall study polynormed spaces (Chapter 4). The first examples of prenorms which are not norms are as follows.
Example 1. Of course, the simplest prenorm in an arbitrary linear space is the zero prenorm: $\|x\| = 0$ for every vector $x \in E$.
Example 2. Every linear functional $f : E \to \mathbb{C}$ generates a prenorm $\|\cdot\|_f$ in $E$ by the formula $\|x\|_f = |f(x)|$. Unless $E$ is one-dimensional, this prenorm is not a norm.
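Example 2 can be made concrete in $\mathbb{C}^2$. The following sketch assumes the particular functional $f(x) = x_1 - x_2$, chosen here purely for illustration; the sample vectors and the helper names are likewise hypothetical. It spot-checks conditions (i) and (ii) of Definition 1 and exhibits a nonzero vector of prenorm zero.

```python
import math

def f(x):
    """An illustrative linear functional on C^2: f(x) = x_1 - x_2."""
    return x[0] - x[1]

def prenorm(x):
    """The prenorm generated by f: ||x||_f = |f(x)|."""
    return abs(f(x))

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def scale(lam, x):
    return (lam * x[0], lam * x[1])

x = (1 + 2j, 3 - 1j)
y = (2j, 1 + 1j)
lam = 2 - 3j

homogeneous = math.isclose(prenorm(scale(lam, x)), abs(lam) * prenorm(x))
triangle = prenorm(add(x, y)) <= prenorm(x) + prenorm(y) + 1e-12
kernel_vector = (1 + 0j, 1 + 0j)      # nonzero, but f vanishes on it
```

The vector `kernel_vector` is nonzero while `prenorm(kernel_vector)` equals `0`, so condition (iii) fails: the prenorm vanishes on the whole kernel of $f$, which is nonzero as soon as the dimension of $E$ exceeds one.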
Certainly, every normed space $(E, \|\cdot\|)$ is automatically a metric space with the distance $d(x, y) = \|x - y\|$. Of course, the same formula turns a prenormed space into a premetric space. As a corollary, every prenormed space is automatically topological. This topological space is Hausdorff $\iff$ the initial prenorm is a norm. We say that these (pre)metric and topology are generated by the given (pre)norm and sometimes we call them the (pre)norm (pre)metric and the (pre)norm topology. Thus, in the context of (pre)normed spaces we can use with ease all metric and topological notions, such as closed or open ball of a given radius, closed or open subset, continuous map, etc. The following proposition can be verified immediately.
Proposition 1 (continuity of addition and of multiplication by scalars).
The mappings of topological spaces $E \times E \to E$, $(x, y) \mapsto x + y$, and $\mathbb{C} \times E \to E$, $(\lambda, x) \mapsto \lambda x$, where $E \times E$ and $\mathbb{C} \times E$ are endowed with the Tychonoff topology, are continuous. In particular, if sequences $x_n$ and $y_n$ tend in $E$ to $x$ and $y$ respectively, then $x_n + y_n$ tends to $x + y$. If $x_n$ tends to $x$ in $E$ and $\lambda_n$ tends to $\lambda$ in $\mathbb{C}$, then $\lambda_n x_n$ tends to $\lambda x$. ∎
Furthermore, an extremely important fact is that the topology in our linear spaces allows us to speak about the converging series of elements. As usual, we say that a series $\sum_{n=1}^{\infty} x_n$ converges if the sequence of partial sums $\sum_{k=1}^{n} x_k$; $n \in \mathbb{N}$ converges, and the sum of the series is defined as the limit of this sequence. In all that, obviously, the normed spaces stand out against the prenormed ones because of their property that series can have only one sum.
Every subspace $F$ of a prenormed space $E$ is a prenormed space with respect to the corresponding restriction of the prenorm. Evidently, a subspace of a normed space is normed as well. The corresponding (pre)norm in $F$ is said to be inherited from $E$.
Now let $E$ be a prenormed space and $F$ a (linear) subspace. Then we can define a prenorm in the quotient space $E/F$ by putting $\|\tilde{x}\| := \inf\{\|x\| : x \in \tilde{x}\}$ for every coset $\tilde{x} \in E/F$ (check the necessary properties!). This prenorm is called the quotient prenorm (or the quotient norm if it is a norm) of the initial prenorm.
Proposition 2. A quotient prenorm in $E/F$ is a norm $\iff$ $F$ is closed in $E$.
Proof. Take $\tilde{x} \in E/F$ and choose $x \in \tilde{x}$. Then, obviously, the condition $\|\tilde{x}\| = 0$ means that $\inf\{\|x + y\| : y \in F\} = 0$, i.e., $x$ belongs to the closure of $F$. The rest is clear. ∎
Let us mention an important special case of this proposition, which allows us to make normed spaces from the prenormed ones.
Proposition 3. Suppose that $E$ is a prenormed space and $E_0 := \{x \in E : \|x\| = 0\}$. Then $E_0$ is a closed (linear) subspace in $E$ coinciding with the closure of zero. As a corollary, the quotient norm in $E/E_0$ is a norm. This norm is well defined by the equality $\|\tilde{x}\| = \|x\|$, where $x$ is an arbitrary representative of the coset $\tilde{x}$. ∎
Evidently, every prenorm in a linear space E is a prenorm in its complex conjugate space Ei (see Section 0. 1 ) . We shall have in mind this prenorm when talking about complex-conjugate spaces of prenormed spaces. The following observation will be useful in the future.
Proposition 4. Let $E$ be a prenormed space, $F$ a closed subspace, and $y$ a vector in $E$ that does not belong to $F$. Then there exists $C > 0$ such that for every $x \in E$ of the form $\lambda y + z$; $\lambda \in \mathbb{C}$, $z \in F$ we have $|\lambda| \le C\|x\|$.

Proof. Without loss of generality we can assume that $\lambda \ne 0$. Since $F$ is closed, there exists $\theta > 0$ such that $\|y - u\| \ge \theta$ for all $u \in F$. In particular, $\|y - (-\lambda^{-1} z)\| \ge \theta$. Hence, $\|x\| = |\lambda| \, \|y - (-\lambda^{-1} z)\| \ge |\lambda| \theta$, and we can set $C := \theta^{-1}$. ∎
And now, as is supposed to be done when a new key notion appears, we gather examples (of varying degrees of generality).
Example 3. Let us start with the coordinate complex space $\mathbb{C}^n$. The reader knows of course the classical "Euclid" norm in this space, namely $\|\xi\|_2 := \sqrt{\sum_{k=1}^{n} |\xi_k|^2}$; $\xi = (\xi_1, \dots, \xi_n)$. But there are some other interesting norms in $\mathbb{C}^n$, two of which we distinguish: $\|\xi\|_1 := \sum_{k=1}^{n} |\xi_k|$ and $\|\xi\|_\infty := \max\{|\xi_k| ; k = 1, \dots, n\}$. (The verification of the properties of a norm is obvious.) More generally, $\mathbb{C}^n$ possesses a family of norms depending on a parameter $p$; $1 \le p < \infty$: $\|\xi\|_p := \left(\sum_{k=1}^{n} |\xi_k|^p\right)^{1/p}$. (We do not prove that these are norms; cf. the more general Proposition 5 below.) The space $(\mathbb{C}^n, \|\cdot\|_p)$; $1 \le p \le \infty$ will also be denoted by $\mathbb{C}^n_p$.
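The norms $\|\cdot\|_p$ on $\mathbb{C}^n$ are easy to experiment with numerically. The following is a minimal sketch in plain Python; the function name `p_norm` and the sample vector are illustrative assumptions, not notation from the text. It evaluates $\|\xi\|_p$ for several values of $p$ and suggests that the values decrease towards $\|\xi\|_\infty$ as $p$ grows.

```python
import math

def p_norm(xi, p):
    """Return ||xi||_p for a finite complex vector; p may be math.inf."""
    moduli = [abs(c) for c in xi]
    if p == math.inf:
        return max(moduli)
    return sum(m ** p for m in moduli) ** (1.0 / p)

# Sample vector (an arbitrary choice); its coordinates have moduli 5, 1, 2.
xi = [3 + 4j, -1j, 2]

for p in (1, 2, 4, 16, 64, math.inf):
    print(p, p_norm(xi, p))
```

The case $p = \infty$ is handled separately because the formula with the $p$-th root has no direct meaning there; the maximum of the moduli is used instead, as in the definition of $\|\cdot\|_\infty$.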
Many important normed spaces consist of sequences of complex numbers with coordinatewise operations.

Example 4. The spaces $c_0$, $c$, and $l_\infty$. The first of them consists of sequences tending to zero, the second, of all converging sequences, and the third, of all bounded sequences. The norm is defined by the same formula $\|\xi\|_\infty := \sup\{|\xi_k| : k \in \mathbb{N}\}$ (of course, for the first space we can write "max" instead of "sup"). It is easy to see that $c_0$ is a closed subspace of codimension 1 in $c$, and $c$ is a closed subspace of infinite codimension in $l_\infty$.

Example 5. Take $p$, $1 \le p < \infty$, and consider the set $l_p$ of sequences $\xi$ for which $\sum_{k=1}^{\infty} |\xi_k|^p < \infty$. This is a linear space with the norm $\|\xi\|_p := \left(\sum_{k=1}^{\infty} |\xi_k|^p\right)^{1/p}$. This is evident for $p = 1$. For all other $p$, except $p = 2$ (this case will be considered separately in the next chapter), we do not prove this; see again Proposition 5 below.

It is useful to note that all these spaces, as well as $c_0$ from the previous example, contain the same dense subspace $c_{00}$ (see Section 0.1).

It is natural to view the space $l_p$; $1 \le p < \infty$ as an immediate infinite-dimensional analogue of the spaces $\mathbb{C}^n_p$ from Example 3. Therefore, when presenting arguments common for finite- and infinite-dimensional spaces, we denote $l_p$ by $l_p^n$, where $n$ is the countable cardinality, and $\mathbb{C}^n_p$ by $l_p^n$, where, certainly, $n$ is a finite cardinality. (As for $p = \infty$, in some cases it is more convenient to regard the space $c_0$ as the infinite analogue of $\mathbb{C}^n_\infty$, and in other cases the space $l_\infty$.)
We now move from spaces of sequences to spaces of complex-valued functions with pointwise operations.

Example 6. Let $X$ be an arbitrary set, and $l_\infty(X)$ the linear space (a subspace in $\mathbb{C}^X$) of all bounded functions. Endow this space with the norm $\|x\|_\infty := \sup\{|x(t)| : t \in X\}$. Such a norm, as well as the inherited norm of any subspace in $l_\infty(X)$, is called the uniform norm or sup-norm. Certainly, $l_\infty(\mathbb{N})$ is precisely $l_\infty$.
(If you read the non-obligatory material, you will see later that the spaces $l_\infty(X)$ are somewhat more than just examples: every normed space "belongs" to one of them, up to some reasonable identification; see Proposition 7.1 below.)
Example 6'. If $\Omega$ is a topological space, then in the space $l_\infty(\Omega)$ from the previous example we can distinguish the subspace $C_b(\Omega)$ of bounded continuous functions. It is also endowed with the uniform norm. (Show that this subspace is closed.)
If we knew in advance that all the continuous functions on $\Omega$ are bounded, then we could write $C(\Omega)$ instead of $C_b(\Omega)$ (see the notation of Section 0.1). For the future, we note that this is certainly the case if $\Omega$ is compact in the sense of Definition 3.1.1 below.

Example 6''. What we have just said applies also to the space $C[a, b]$, where $[a, b]$ is a closed interval of the real line. This is one of the most important and "popular" normed spaces, which we would like to mention specially.

Example 6'''. Let us pay attention also to the spaces $C_0(\mathbb{R})$ and $C_{00}(\mathbb{R})$. The first consists of all continuous functions on $\mathbb{R}$ that tend to zero as $|t| \to \infty$, and the second, the smaller one, consists of all continuous functions with compact support (i.e., vanishing outside of some interval, which depends on the function). (In both cases the norm is, of course, uniform.)

Example 7. Take an integer $n$ and consider the space $C^n[a, b]$ of $n$-times smooth (i.e., $n$-times continuously differentiable) functions on the interval $[a, b]$. It is a normed space with the norm $\|x\| := \max\{|x^{(k)}(t)| : t \in [a, b],\ k = 0, \dots, n\}$ (where, as always, we put $x^{(0)} := x$).
In the next three examples $(X, \mu)$ is an arbitrary measure space (cf. Section 0.1).

Example 8. The linear space $L_1^0(X, \mu)$ of all (complex-valued) functions on $X$ that are Lebesgue integrable with respect to $\mu$ is endowed with the prenorm

$$\|x\|_1 := \int_X |x(t)| \, d\mu(t).$$
Later we will need the following notion from real analysis. A measurable function $x$ on $X$ is called essentially bounded if for some $C > 0$ we have $\mu\{t \in X : |x(t)| > C\} = 0$.

Example 9. The linear space $L_\infty^0(X, \mu)$ of all essentially bounded functions is endowed with a prenorm denoted by $\|\cdot\|_\infty$ which associates to every such function the infimum of the constants $C$ mentioned above. (Check that this is indeed a prenorm on this linear space.)
The following proposition provides a series of "intermediate" spaces between those in the two previous examples (see future Exercise 1). Its proof is based on the classical Hölder inequality, and thus is technically rather complicated. Therefore, we think it is reasonable to omit it here; see, e.g., [33].

Proposition 5 (see [33, Theorem III.3.5]). Suppose $p \in [1, \infty)$ and $L_p^0(X, \mu)$ is the set of measurable functions $x$ on $X$ such that $\int_X |x(t)|^p \, d\mu(t) < \infty$. This is a prenormed space with prenorm $\|x\|_p := \left(\int_X |x(t)|^p \, d\mu(t)\right)^{1/p}$.

Remark. The case of $p = 2$ is perhaps more important than $p = 1$ and $\infty$ (to say nothing about other $p$'s). It will be considered in the next section.
It is easy to verify that for every $p$; $1 \le p \le \infty$, the prenormed space $L_p^0(X, \mu)$ is normed $\iff$ every singleton in $X$ has a positive measure. In particular, all the spaces $L_p^0(X, \bullet)$; $1 \le p \le \infty$ are normed, where $X$ is an arbitrary set and "$\bullet$" is the "counting measure", which is equal to 1 at every point. We denote such spaces by $l_p(X)$. Clearly, for $p < \infty$ such spaces consist of functions $f : X \to \mathbb{C}$ for which $\sum_{t \in X} |f(t)|^p$ makes sense. In particular, this means that $f$ vanishes outside of an at most countable subset in $X$. The norm is defined by the formula $\|f\| = \left(\sum_{t \in X} |f(t)|^p\right)^{1/p}$. Note that $L_\infty^0(X, \bullet)$ is nothing but $l_\infty(X)$ from Example 6. Finally, for $X = \mathbb{N}$ the spaces $l_p(X)$; $1 \le p \le \infty$ coincide with the spaces of sequences $l_p$ considered above (Examples 4 and 5), and for $X = \{1, \dots, n\}$, with $\mathbb{C}^n_p$ from Example 3.

Remark. For an advanced reader we note that the spaces $l_1(X)$ can be characterized as free objects in Ban$_1$, one of the major categories of functional analysis (see Exercise 2.5.20 in the sequel).
Example 10. If we apply the construction described in Proposition 3 to the space $L_p^0(X, \mu)$; $1 \le p \le \infty$, we obtain a normed space denoted by $L_p(X, \mu)$. It is known from real analysis that the Lebesgue integral of a non-negative function vanishes $\iff$ this function vanishes almost everywhere. From this we see that, for $p < \infty$, elements of the space $L_p(X, \mu)$ are cosets of equivalent (i.e., coinciding almost everywhere) $\mu$-measurable functions $f(x)$ such that $|f(x)|^p$ is Lebesgue-integrable. If $p = \infty$, then these elements are the classes of equivalent essentially bounded functions.

Using the generally accepted language for such situations, we say "a function from $L_p(X, \mu)$" having in mind not an individual function, but the corresponding coset. This will not lead to confusion if you remember the meaning of this phrase.

The convergence in $L_1(X, \mu)$, respectively, in $L_2(X, \mu)$, is often called mean convergence, respectively, mean-square convergence.
As we noted before, $l_p(X) = L_p^0(X, \bullet)$; $1 \le p \le \infty$ is a normed space, and hence in this case the corresponding subspace $E_0$ in Proposition 3 is zero. Thus, $l_p(X) = L_p(X, \bullet)$. So we can view $l_p(X)$ as a special case of the normed spaces $L_p(X, \mu)$.

Let us distinguish a few spaces of the class $L_p(X, \mu)$ with similar structure, where the effects connected with "continuous" measures are clearly seen. The measure spaces in these examples are an interval, a line, or a circle with standard Lebesgue measure. Having in mind this measure, we denote the corresponding spaces by $L_p[a, b]$, $L_p(\mathbb{R})$, and $L_p(\mathbb{T})$. It is useful to note that they contain $C[a, b]$, $C_{00}(\mathbb{R})$, and $C(\mathbb{T})$, respectively, as dense subspaces. (Of course we regard these three spaces here only as sets, "forgetting" that they have their own natural (uniform) norms, and we identify every continuous function with the corresponding class of equivalent functions (cf. the previous remark).)
Exercise 1°. Suppose $1 \le p < q \le \infty$. Then $l_p \subset l_q$, but $L_p[a, b] \supset L_q[a, b]$ and $L_p(\mathbb{T}) \supset L_q(\mathbb{T})$. At the same time neither of the spaces $L_p(\mathbb{R})$ and $L_q(\mathbb{R})$ lies in the other.

Passing to the last example in this series, we choose an interval $[a, b] \subset \mathbb{R}$. Recall that a complex Borel measure (or complex Borel charge) on $[a, b]$ is a $\sigma$-additive set function $\mu : \mathrm{BOR}_a^b \to \mathbb{C}$. (We often omit the word "Borel" because we do not need other measures.) From real analysis (see, e.g., [9] or [106]) it is known that every complex measure can be represented (not uniquely, of course) in the form $\mu = \nu_1 - \nu_2 + i\nu_3 - i\nu_4$, where $\nu_1, \dots, \nu_4$ are usual (i.e., non-negative) Lebesgue-Stieltjes measures. The variation of a complex measure $\mu$ is the number $\mathrm{var}(\mu) := \sup \sum_{k=1}^{n} |\mu(A_k)|$, where the supremum is taken over all partitions of the interval $[a, b]$ into Borel subsets (or subintervals, which gives the same result) $A_k$.

Example 11. It is easy to see that the set $M[a, b]$ of complex measures on $[a, b]$ is a normed space, where operations are the linear operations on functions of Borel sets and the norm is defined by $\|\mu\| := \mathrm{var}(\mu)$.
A very useful property of many spaces described above is their separability. It is convenient to formulate here the following criterion, which can be easily verified in practice.

Proposition 6. A normed space $E$ is separable $\iff$ it has a dense linear subspace $E_0$ of at most countable dimension.

Proof. If $E$ is separable, then for $E_0$ we can take the linear span of a countable dense subset in $E$. Conversely, if $E_0$ is a subspace with the indicated property, and $\{e_n\}$ is an at most countable linear basis in $E_0$, then the set
of all linear combinations of vectors of this basis with complex-rational$^1$ coefficients is dense in $E$. ∎

In the spaces listed before the following subspaces are dense and have a countable basis:

a) for $c_0$ and $l_p$; $1 \le p < \infty$, this is the subspace of finitary sequences $c_{00}$;
b) for $C[a, b]$ and $C^n[a, b]$, this is the subspace of polynomials (this follows from the first Weierstrass approximation theorem);
c) for $L_p[a, b]$, $L_p(\mathbb{T})$, and $L_p(\mathbb{R})$; $1 \le p < \infty$, and for $L_p^0[a, b]$, $L_p^0(\mathbb{T})$, and $L_p^0(\mathbb{R})$ with the same $p$, this is the subspace of step-functions with breakpoints in $\mathbb{Q}$ (or, depending on the context, in the "rational circle" $\{e^{it} ; t \in \mathbb{Q}\}$). In $L_p[a, b]$ and $L_p^0[a, b]$ the same role belongs to the set of polynomials, and in $L_p(\mathbb{T})$ and $L_p^0(\mathbb{T})$, to the set of trigonometric polynomials (i.e., linear combinations of functions $z^n$; $n \in \mathbb{Z}$). (These facts, at least for $p = 1, 2$, are usually proved in the theory of measure and integral; cf. [9], and the proof extends almost without change to the case of an arbitrary $p$.)

The following more general fact is again based on the standard course of measure and integral.

Exercise 2. Suppose $(X, \mu)$ is a measure space with a countable basis (see Section 0.1). Then the spaces $L_p(X, \mu)$ and $L_p^0(X, \mu)$; $1 \le p < \infty$ are separable.

Certainly, the (pre)normed spaces in Examples 1-3 are separable as well. At the same time, the space $l_\infty$ is not separable. To understand this it is sufficient to take its subset of continuum cardinality consisting of all sequences of zeroes and ones, and apply Corollary 0.2.1. As a generalization of this fact we can suggest the following

Exercise 3. Suppose a measure space $(X, \mu)$ contains an infinite system of mutually disjoint subsets of positive measure. Then $L_\infty(X, \mu)$ and $L_\infty^0(X, \mu)$, in particular, $L_\infty[a, b]$ and $L_\infty^0[a, b]$, $L_\infty(\mathbb{T})$ and $L_\infty^0(\mathbb{T})$, $L_\infty(\mathbb{R})$ and $L_\infty^0(\mathbb{R})$ are not separable.

In addition to these spaces, the spaces $l_p(X)$ with uncountable $X$ are also non-separable for each $p$. The space $M[a, b]$ is also non-separable (explain why).

All the discussed examples of (pre)normed spaces are somewhat like reserve funds.
They form a small part of the multitude of spaces used in analysis. (A series of spaces serving complex analysis is described in [30], those used in harmonic analysis, in [31].) However, in our exposition other spaces, namely those consisting of functionals and operators, will appear very soon.

$^1$ A complex-rational number is a complex number with rational real and imaginary parts.

Now we suggest several constructions allowing us, informally speaking, to "add" normed spaces. In linear algebra, there are two such constructions: Cartesian product and direct sum (cf. Section 0.6). In functional analysis the number of these constructions is much greater.

Suppose $E_\nu$; $\nu \in \Lambda$ is an arbitrary family of normed spaces, and $E^0 := \times\{E_\nu : \nu \in \Lambda\}$ the Cartesian product of the underlying linear spaces. First of all, consider the subset in $E^0$ consisting of all elements (i.e., mappings) $f$ for which the set of numbers $\{\|f(\nu)\| : \nu \in \Lambda\}$ is bounded. Denote it by $\bigoplus_\infty\{E_\nu : \nu \in \Lambda\}$. Certainly, it is a subspace in $E^0$, and we can define the norm $\|\cdot\|_\infty$ on it (sometimes denoted by $\|\cdot\|_\Pi$) by putting $\|f\|_\infty := \sup\{\|f(\nu)\| : \nu \in \Lambda\}$.

Now take $p$; $1 \le p < \infty$ and consider the subset in $E^0$ consisting of all elements $f$ for which $\sum_{\nu \in \Lambda} \|f(\nu)\|^p < \infty$ (this implies, of course, that the number of all $\nu$ for which $f(\nu) \ne 0$ is at most countable). Denote this set by $\bigoplus_p\{E_\nu : \nu \in \Lambda\}$. For an element $f$ we put $\|f\|_p := \left(\sum_{\nu \in \Lambda} \|f(\nu)\|^p\right)^{1/p}$ (we also use the notation $\|f\|_\amalg$ instead of $\|f\|_1$). The following proposition is trivially verified.
Proposition 7. For $p = 1, \infty$ the set $\bigoplus_p\{E_\nu : \nu \in \Lambda\}$ is a subspace in $E^0$, and $\|\cdot\|_p$ is a norm in it. ∎
The same is true for all other $p$. However, we do not need this fact except in the case $p = 2$ considered in Section 2.1.

The normed space $\bigoplus_p\{E_\nu : \nu \in \Lambda\}$; $1 \le p \le \infty$ is called the $l_p$-sum of the family $E_\nu$; $\nu \in \Lambda$. Finally, the $c_0$-sum of the family $E_\nu$; $\nu \in \Lambda$ is defined as the (evidently closed) subspace in the corresponding $l_\infty$-sum; it is denoted by $\bigoplus_0\{E_\nu : \nu \in \Lambda\}$ and consists of those $f$ which, as one says, vanish at infinity. This means that for every $\varepsilon > 0$ there is a finite subset $\Phi \subset \Lambda$ such that $\|f(\nu)\| < \varepsilon$ for all $\nu \notin \Phi$.

Note that the direct sum (in the sense of linear algebra) of a family $E_\nu$; $\nu \in \Lambda$ is a dense subspace in $\bigoplus_p\{E_\nu : \nu \in \Lambda\}$ for $1 \le p < \infty$, and also in $\bigoplus_0\{E_\nu : \nu \in \Lambda\}$. We denote this subspace by $\bigoplus_{00}\{E_\nu : \nu \in \Lambda\}$.

If $\Lambda = \mathbb{N}$ and $E_\nu = \mathbb{C}$ for all $\nu$, then the $l_p$-sum of such a family is just $l_p$; $1 \le p \le \infty$, and the $c_0$-sum is $c_0$.
Remark. The supplementary notation $\|\cdot\|_\Pi$ and $\|\cdot\|_\amalg$ for the norms in $l_\infty$- and $l_1$-sums is related to the general category-theoretic character of the corresponding constructions: the first is a norm in the product, and the second in the coproduct of objects in one of our future categories, namely Ban$_1$ (see Exercise 2.5.16 in the sequel).

To conclude this section we discuss some general notions and facts.
Proposition 8. Every prenorm $\|\cdot\| : E \to \mathbb{R}_+$ on a linear space is a continuous function with respect to the topology generated by this prenorm.

Proof. This follows from the inequality $\bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\|$, which in turn follows from the triangle inequality. ∎

Let $E$ be a prenormed (hence a premetric) space. An important related geometric notion is the closed ball of radius 1 centered at zero. It is called the closed unit ball, or just the unit ball in $E$, and is denoted by $B_E$ (or just $B$ if there is no danger of confusion). By analogy the open unit ball in $E$ is defined; it is denoted by $B_E^0$. The unit sphere in $E$ is the set of vectors in $E$ with norm 1. Certainly, such "balls" can sometimes contain subspaces in $E$, and the "sphere" can be somewhat like a cylinder. This happens precisely when the given prenorm is not a norm.

It turns out that defining a prenorm in $E$ is equivalent to defining a set in $E$ (with certain properties) which plays the role of the unit ball. Let $M$ be an arbitrary subset in a linear space $E$. Put $M_x := \{t \in \mathbb{R}_+ : t^{-1} x \in M\}$ and denote by $p_M(x)$ the number $\inf M_x$ if $M_x$ is non-empty, or the symbol $\infty$ otherwise.

Definition 2. The function $p_M : E \to \mathbb{R}_+ \cup \{\infty\} : x \mapsto p_M(x)$ is called the Minkowski functional$^2$ (or a gauge functional) of the set $M$.
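To make the definition of the Minkowski functional concrete, here is a minimal numerical sketch in Python. It rests on assumptions that are ours and not part of the definition: the set $M$ is given by a membership test and is star-shaped about zero, so that $M_x$ is a half-line and $\inf M_x$ can be located by bisection. The names `minkowski`, `member`, `hi`, `tol`, `diamond` and `strip` are hypothetical. For the unit ball of $\|\cdot\|_1$ in $\mathbb{R}^2$ the result should agree with the norm, while for a strip (the "ball" of a prenorm that is not a norm) a non-zero vector receives a value close to 0.

```python
import math

def minkowski(member, x, hi=1e6, tol=1e-12):
    """Approximate p_M(x) = inf{t > 0 : x/t in M} by bisection.

    Assumption: M is star-shaped about 0, so {t : x/t in M} is a half-line.
    If x/hi is not in M, the set M_x is treated as empty and inf is returned.
    """
    def scaled(t):
        return [c / t for c in x]

    if not member(scaled(hi)):
        return math.inf
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if member(scaled(mid)):
            hi = mid      # x/mid is in M, so the infimum is at most mid
        else:
            lo = mid      # x/mid is outside M, so the infimum is at least mid
    return hi

def diamond(v):           # unit ball of the norm ||.||_1 in R^2
    return abs(v[0]) + abs(v[1]) <= 1

def strip(v):             # unit "ball" of the prenorm v -> |v[0]|
    return abs(v[0]) <= 1

print(minkowski(diamond, [3.0, -4.0]))   # close to ||(3, -4)||_1 = 7
print(minkowski(strip, [0.0, 5.0]))      # close to 0 although the vector is non-zero
```

Bisection is used only because it needs nothing beyond the membership test; for a concrete prenorm one would of course evaluate the prenorm directly.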
Exercise 4. A set $B$ is a unit ball for some prenorm in $E$ $\iff$ it is convex, balanced, and for every $x \in E$ the set $\{\lambda \in \mathbb{C} : \lambda x \in B\}$ is closed and contains a neighborhood of zero in $\mathbb{C}$.

Hint. If these conditions are satisfied, then the Minkowski functional of the set $B$ is a prenorm.

Remark. It is useful to take a look at the unit balls for the norms $\|\cdot\|_p$ in $\mathbb{C}^n$ mentioned in Example 3. For geometric visuality we restrict ourselves to the case of $n = 2$ and consider the real arithmetic space $\mathbb{R}^2$ instead of $\mathbb{C}^2$. If $p = 1$, our unit ball is a rhombus. Then, as $p$ increases, the "ball" extends, its angles smoothen, and for $p = 2$ it turns into a usual disk. Then, as $p$ continues to increase, our "ball", extending further, becomes something like an oval, and tends to a square with sides parallel to the coordinate axes. Finally, for $p = \infty$, our "ball" turns into this square.

$^2$ Hermann Minkowski (1864-1909), outstanding German mathematician and mathematical physicist, a close friend of Hilbert.

Now we consider a natural question of when a premetric or a topology on a linear space is a premetric (a topology) of some prenormed space.

Exercise 5°. A premetric $d(\cdot, \cdot)$ on a linear space $E$ is generated by a prenorm $\iff$ it satisfies the following conditions:

(i) "translation invariance": for each $x, y, z \in E$ we have $d(x + z, y + z) = d(x, y)$;
(ii) for each $x \in E$ and $\lambda \in \mathbb{C}$ we have $d(\lambda x, 0) = |\lambda| \, d(x, 0)$.

Exercise 6. A topology on a linear space $E$ is generated by a prenorm $\iff$ there is a set $U$ in $E$ satisfying the following conditions:

(i) $U$ is convex, balanced, and every one-dimensional subspace in $E$ contains at least one non-zero vector from $U$;
(ii) the family of sets $U_{x,t} := \{x + ty : y \in U\}$, where $x \in E$, $t > 0$, is a basis of this topology.
Hint. The prenorm we are looking for is the Minkowski functional of the set $U$. The coincidence of the topology generated by this prenorm with the initial one is guaranteed by Proposition 0.2.5.

Let us make another observation. Since every normed space is automatically a metric space, every subset of such a space is a metric space with respect to the inherited metric. It turns out that there are no other metric spaces up to the natural identification. Here is the precise meaning of these words.

Exercise 7*. Every metric space $M$ is isometric (in other words, is isomorphic in Met$_1$; see Section 0.4) to a subset of some normed space, namely of $C_b(M)$ (see Example 6').

Hint. Take $x \in M$ and consider the mapping $M \to \mathbb{C}^M : y \mapsto f$; $f : z \mapsto d(x, z) - d(y, z)$.

Exercise 8*. Every separable metric space $M$ is isometric to a subset in $l_\infty$.

Hint. Take a dense subset $\{x_n ; n \in \mathbb{N}\}$ in $M$ and consider the mapping $M \to \mathbb{C}^{\mathbb{N}} : y \mapsto \xi$; $\xi_n := d(x_1, x_n) - d(y, x_n)$.
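The mapping $y \mapsto \xi$, $\xi_n := d(x_1, x_n) - d(y, x_n)$, can be tried out on a toy example. The sketch below is only an illustration under our own assumptions: the metric space is a set of four points in the plane with the Euclidean metric, the "dense subset" is the whole space, and the names `points`, `d`, `embed`, `sup_dist` are hypothetical. It prints, for each pair of points, the original distance next to the sup-distance between the images; the two columns are expected to agree up to rounding.

```python
import math
from itertools import combinations

# A finite metric space M: four points of the plane with the Euclidean metric.
points = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.0), (5.0, -2.0)]

def d(a, b):
    """Euclidean distance between two points of the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def embed(y):
    """Image of y: the finite sequence xi_n = d(x_1, x_n) - d(y, x_n)."""
    x1 = points[0]
    return [d(x1, xn) - d(y, xn) for xn in points]

def sup_dist(xi, eta):
    """Distance between two sequences in the uniform norm."""
    return max(abs(s - t) for s, t in zip(xi, eta))

for y, z in combinations(points, 2):
    print(d(y, z), sup_dist(embed(y), embed(z)))
```

The reason behind the agreement is visible in the code: the terms $d(x_1, x_n)$ cancel in the difference of two images, the remaining differences $|d(z, x_n) - d(y, x_n)|$ never exceed $d(y, z)$ by the triangle inequality, and the value $d(y, z)$ is attained at $x_n = y$.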
2. Inner products and near-Hilbert spaces
Among various (pre)norms on a linear space, there is a class of "geometrically best" (pre)norms, which best resemble the norm of the space we are living in. They are defined not directly, but with the help of the following preparatory structure. Before getting familiar with it we introduce here a more general notion. (Anyhow, it will be useful later.)

Definition 1. Let $H$ be a linear space. A function $S : H \times H \to \mathbb{C}$ is called a conjugate bilinear functional (or sesquilinear functional) if for every $x, y, z \in H$, $\lambda, \nu \in \mathbb{C}$ the following conditions hold:

(i) ("linearity with respect to the first argument") $S(\lambda x + \nu y, z) = \lambda S(x, z) + \nu S(y, z)$;
(ii) ("conjugate linearity with respect to the second argument")$^3$ $S(x, \lambda y + \nu z) = \bar{\lambda} S(x, y) + \bar{\nu} S(x, z)$ (here, as usual, the bar means complex conjugation).

(Clearly, a conjugate bilinear functional is a mapping which, being defined on $H \times H^i$, is just a bilinear functional.)

Given a conjugate bilinear functional $S$, the function $Q : H \to \mathbb{C} : x \mapsto S(x, x)$ is called the quadratic form of this "functional". It is interesting that the latter can be recovered from its quadratic form.

Proposition 1 ("Polar identity"). In our notation, for each $x, y \in H$ we have

$$S(x, y) = \sum_{k=0}^{3} \frac{i^k}{4} \, Q(x + i^k y).$$

Proof. Direct verification. ∎
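The polar identity $S(x, y) = \sum_{k=0}^{3} \frac{i^k}{4} Q(x + i^k y)$ can also be checked numerically. The sketch below assumes a particular way of producing a conjugate bilinear functional on $\mathbb{C}^2$, namely $S(x, y) = \sum_{j,k} a_{jk} x_j \overline{y_k}$ with an arbitrary complex matrix $a$; the matrix, the vectors and the function names are illustrative choices, not taken from the text. Note that $a$ is deliberately not Hermitian, since the identity needs only sesquilinearity.

```python
def S(x, y, a):
    """Sesquilinear form S(x, y) = sum_{j,k} a[j][k] * x[j] * conj(y[k])."""
    return sum(a[j][k] * x[j] * y[k].conjugate()
               for j in range(len(x)) for k in range(len(y)))

def Q(x, a):
    """Quadratic form of S."""
    return S(x, x, a)

def polar(x, y, a):
    """Right-hand side of the polar identity: (1/4) * sum_k i^k Q(x + i^k y)."""
    total = 0
    for ik in (1, 1j, -1, -1j):              # the four values of i^k, k = 0..3
        shifted = [xj + ik * yj for xj, yj in zip(x, y)]
        total += ik * Q(shifted, a)
    return total / 4

a = [[1, 2j], [3 - 1j, -4]]   # arbitrary complex matrix (not Hermitian)
x = [1 + 2j, -1j]
y = [0.5 + 0j, 2 - 1j]
print(S(x, y, a), polar(x, y, a))   # the two values should agree up to rounding
```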
Definition 2. A conjugate bilinear functional on $H$ is called a pre-inner product if, denoted by $\langle x, y \rangle$ instead of $S(x, y)$, it satisfies for each $x, y \in H$ the following two supplementary conditions:

(i) ("conjugate symmetry") $\langle y, x \rangle = \overline{\langle x, y \rangle}$;
(ii) ("positive definiteness") $\langle x, x \rangle \ge 0$.

A pre-inner product is called an inner product (or "scalar product") if $\langle x, x \rangle = 0$ implies $x = 0$.

$^3$ Mathematical physicists sometimes use the same term for the mappings that are conjugate linear with respect to the first argument, and linear with respect to the second one.
(Note that the conjugate linearity of a pre-inner product with respect to the second argument follows from the linearity with respect to the first argument and the conjugate symmetry.)

Definition 3. A linear space $H$ equipped with a pre-inner product is called a pre-Hilbert space. If $H$ is equipped with an inner product, then it is called a near-Hilbert space. (The "true" Hilbert spaces will be defined later in the next chapter.)$^4$

Sometimes we use expressions like "pre-Hilbert space $(H, \langle \cdot, \cdot \rangle)$" with the obvious meaning.

The restriction of a pre-inner product to a subspace in $H$ is obviously a pre-inner product on this subspace; we say that it is inherited from $H$. Of course, a subspace of a near-Hilbert space is itself near-Hilbert.

The simplest example of a pre-inner product on a linear space is obtained by putting $\langle x, y \rangle = 0$ for all $x, y$. We hope that the reader is familiar with another example, the standard inner product in $\mathbb{C}^n$, defined by the equality $\langle \xi, \eta \rangle := \sum_{k=1}^{n} \xi_k \overline{\eta_k}$. (If we replace here $n$ with a smaller integer, for instance, if we put $\langle \xi, \eta \rangle := \xi_1 \overline{\eta_1}$, we obtain a pre-inner product which is not inner.)

Now we pass to our main examples.

Example 1. Consider the set $l_2$ of the so-called square-summable sequences, i.e., sequences $\xi$ such that $\sum_{k=1}^{\infty} |\xi_k|^2 < \infty$ (cf. Example 1.5). Obviously, it is a linear space with the well-defined inner product $\langle \xi, \eta \rangle := \sum_{k=1}^{\infty} \xi_k \overline{\eta_k}$ (the convergence of the series follows from the estimate $|\xi_k \overline{\eta_k}| \le \frac{1}{2}\left(|\xi_k|^2 + |\eta_k|^2\right)$).
Example 2. Let $(X, \mu)$ be a measure space. Consider the set $L_2^0(X, \mu)$ consisting of the so-called square-integrable functions on $X$, i.e., $\mu$-measurable functions $x : X \to \mathbb{C}$ such that $\int_X |x(t)|^2 \, d\mu(t) < \infty$ (cf. Proposition 1.5). The estimate

$$|x(t)\overline{y(t)}| \le \frac{1}{2}\left(|x(t)|^2 + |y(t)|^2\right)$$

implies the following two facts: a) $L_2^0(X, \mu)$ is a linear space with respect to the pointwise operations, and b) for each $x, y \in L_2^0(X, \mu)$ there exists $\int_X x(t)\overline{y(t)} \, d\mu(t)$. If we put $\langle x, y \rangle$ to be equal to the last integral, we obtain a pre-inner product on $L_2^0(X, \mu)$. It is easy to see that this is an inner product $\iff$ every singleton in $X$ has a positive measure.

$^4$ The notation $H$ is traditional. As for the terms "pre-Hilbert", "near-Hilbert", and just "Hilbert" space, they are used in honour of the great German mathematician David Hilbert (1862-1943), who set a stamp of his genius on almost all fields of the mathematics of his time.
Remark. Every linear space can be made near-Hilbert. It is sufficient to take a linear basis $e_\nu$; $\nu \in \Lambda$ and for $x = \sum_{k=1}^{n} \lambda_k e_{\nu_k}$ and $y = \sum_{k=1}^{n} \mu_k e_{\nu_k}$, put $\langle x, y \rangle := \sum_{k=1}^{n} \lambda_k \overline{\mu_k}$.
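In the next two statements $(H, \langle \cdot, \cdot \rangle)$ is a pre-Hilbert space. The following theorem provides the foundation for all the work with pre-inner products.

Theorem 1 (the (abstract) Cauchy-Bunyakovskii inequality). For all $x, y \in H$ we have

$$|\langle x, y \rangle|^2 \le \langle x, x \rangle \langle y, y \rangle.$$

Proof. By condition (ii) of Definition 2, for every $\lambda \in \mathbb{C}$ we have

$$0 \le \langle \lambda x + y, \lambda x + y \rangle = \lambda\bar{\lambda} \langle x, x \rangle + \lambda \langle x, y \rangle + \bar{\lambda} \langle y, x \rangle + \langle y, y \rangle.$$

Clearly, without loss of generality we can assume that $\langle x, y \rangle \ne 0$, and put

$$\lambda := t \, \frac{\langle y, x \rangle}{|\langle x, y \rangle|}, \quad \text{where } t \in \mathbb{R}.$$

From this we see that

$$\langle x, x \rangle t^2 + 2 |\langle x, y \rangle| t + \langle y, y \rangle \ge 0 \quad \text{for all } t \in \mathbb{R}.$$

Since $\langle x, y \rangle \ne 0$, we have $\langle x, x \rangle > 0$ (otherwise the left-hand side would be a non-constant linear function of $t$, which takes negative values). We obtain a quadratic polynomial that is non-negative everywhere on $\mathbb{R}$. Thus it must have non-positive discriminant. The rest is clear. ∎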
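A quick numerical experiment can make the Cauchy-Bunyakovskii inequality $|\langle x, y \rangle|^2 \le \langle x, x \rangle \langle y, y \rangle$ tangible. The sketch below is an illustration under our own assumptions: it works in $\mathbb{C}^4$ with the standard inner product and, for comparison, with the degenerate pre-inner product that uses only the first two coordinates; the names `inner`, `cb_gap`, `random_vector` and the fixed random seed are hypothetical choices. It reports the smallest value of $\langle x, x \rangle \langle y, y \rangle - |\langle x, y \rangle|^2$ found, which should be non-negative up to rounding, and shows that the gap vanishes for linearly dependent vectors.

```python
import random

def inner(x, y, m=None):
    """Pre-inner product <x, y> = sum over k < m of x[k] * conj(y[k]).

    With m = len(x) this is the standard inner product on C^n; with a
    smaller m it is a pre-inner product that is not an inner product.
    """
    m = len(x) if m is None else m
    return sum(x[k] * y[k].conjugate() for k in range(m))

def cb_gap(x, y, m=None):
    """Return <x,x><y,y> - |<x,y>|^2; the inequality says this is >= 0."""
    return (inner(x, x, m) * inner(y, y, m)).real - abs(inner(x, y, m)) ** 2

def random_vector(n, rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so that the run is reproducible
worst = min(cb_gap(random_vector(4, rng), random_vector(4, rng), m)
            for m in (4, 2) for _ in range(1000))
print("smallest gap found:", worst)

# For linearly dependent vectors the gap vanishes (up to rounding).
x = random_vector(4, rng)
y = [(2 - 3j) * c for c in x]
print("gap for dependent vectors:", cb_gap(x, y))
```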
Here is the first application. Just as normed spaces can be produced from prenormed ones by a standard procedure, near-Hilbert spaces can be produced from pre-Hilbert ones.
Proposition 2 (cf. Proposition 1.3). Put $H_0 := \{x \in H : \langle x, x \rangle = 0\}$. Then $H_0$ is a (linear) subspace in $H$. Further, in the quotient space $H/H_0$ there exists an inner product well-defined by the equality $\langle \tilde{x}, \tilde{y} \rangle := \langle x, y \rangle$, where $x$ and $y$ are arbitrary representatives of the cosets $\tilde{x}$ and $\tilde{y}$.

Proof. The first statement evidently follows from the Cauchy-Bunyakovskii inequality and from the properties of a pre-inner product. Further, if $x, x' \in \tilde{x}$ and $y, y' \in \tilde{y}$, then $x' - x, y' - y \in H_0$, and, by the same inequality, $\langle x' - x, y \rangle = 0 = \langle x, y' - y \rangle$, and thus $\langle x', y' \rangle = \langle x, y \rangle$. The rest is clear. ∎

Applying a similar construction to the space $L_2^0(X, \mu)$, we obtain a near-Hilbert space denoted by $L_2(X, \mu)$. Its elements are classes of equivalent square-integrable functions.

Some near-Hilbert spaces we introduced above are of special interest. The first is the space $L_2(X, \bullet) = L_2^0(X, \bullet)$, where "$\bullet$" is the "counting measure" (see Section 1); we denote this space by $l_2(X)$. If $X = \mathbb{N}$, the space $l_2(X)$ is just $l_2$ from Example 1, and if $X = \{1, \dots, n\}$, it is $\mathbb{C}^n$ with the standard inner product (see above). In addition, let us mention the spaces $L_2[a, b]$, $L_2(\mathbb{T})$, and $L_2(\mathbb{R})$, where the interval, the circle, and the line with the standard Lebesgue measure are considered.

The reader remembers, of course, that in the previous section the notation $L_p(X, \mu)$ was used for a normed space, and this is the case in particular for $p = 2$. This is not a mere coincidence.
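Proposition 3. A function $\|x\| := \sqrt{\langle x, x \rangle}$ defined on $H$ is a prenorm. This prenorm is a norm $\iff$ $\langle \cdot, \cdot \rangle$ is an inner product.

Proof. The triangle inequality we have to prove follows from the sequence of relations

$$\|x + y\|^2 = \langle x + y, x + y \rangle \le \langle x, x \rangle + |\langle y, x \rangle| + |\langle x, y \rangle| + \langle y, y \rangle \le \langle x, x \rangle + 2\sqrt{\langle x, x \rangle}\sqrt{\langle y, y \rangle} + \langle y, y \rangle = (\|x\| + \|y\|)^2,$$

where the second inequality is the Cauchy-Bunyakovskii inequality. The rest is clear. ∎

Thus every pre-Hilbert space is prenormed, hence premetric, hence topological, and for near-Hilbert spaces we have the chain of implications

near-Hilbert $\Longrightarrow$ normed $\Longrightarrow$ metric $\Longrightarrow$ Hausdorff topological (space).

Hence, in the context of pre-Hilbert spaces we can freely use all the notions formulated in terms of prenorm, premetric, and topology. Certainly, the Cauchy-Bunyakovskii inequality can be rewritten now as follows: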
$$|\langle x, y \rangle| \le \|x\| \, \|y\|.$$

Proposition 4. Suppose the sequence $x_n$ tends to $x$, and $y_n$ to $y$ in $H$. Then the number sequence $\langle x_n, y_n \rangle$ tends to $\langle x, y \rangle$.

Proof. From the Cauchy-Bunyakovskii inequality we have

$$|\langle x, y \rangle - \langle x_n, y_n \rangle| \le \|x - x_n\| \, \|y\| + \|x_n\| \, \|y - y_n\| \le \|x - x_n\| \, \|y\| + (\|x\| + \|x - x_n\|) \, \|y - y_n\|.$$

The rest is clear. ∎

We leave the following more general fact to the reader.
Exercise 1°. The pre-inner product $\langle \cdot, \cdot \rangle : H \times H \to \mathbb{C}$ is continuous, where $H \times H$ is viewed as the topological product of two copies of $H$.

We say that the prenorm constructed in Proposition 3 is generated by the corresponding (pre-)inner product. A (pre)norm generated by some pre-inner product is said to be a Hilbert (pre)norm.

Example 3. If $\langle \cdot, \cdot \rangle$ is a pre-inner product on $H$, then the equality $\langle\langle x, y \rangle\rangle := \langle y, x \rangle$, obviously, generates a pre-inner product in the complex-conjugate space $H^i$. So the prenorm in the complex-conjugate space of a space with a Hilbert prenorm is a Hilbert prenorm as well.

Might it be that every prenorm is actually a Hilbert prenorm? No, this is not the case. Here is a rather rough necessary condition for normed spaces.
Exercise 2. In a near-Hilbert space the equality $\|x + y\| = \|x\| + \|y\|$ implies that $x$ and $y$ are linearly dependent. As a consequence, norms in such spaces as $C[a, b]$, $l_1$, $L_1(X, \mu)$ are not Hilbert norms. (Continue the list of such examples.)

Hint. The pairs $(x, y)$ for which the Cauchy-Bunyakovskii inequality turns into an equality are linearly dependent.

But what about spaces like, say, $l_3$, where, as is easy to see, the geometric condition indicated above is fulfilled? We now see that this question can be solved in terms going back to Euclid.
Proposition 5 (Parallelogram identity). In every space with a pre-inner product, for every two vectors $x, y$ the following equality holds:

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

Proof. This is verified directly. ∎
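The parallelogram identity $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$ gives a practical test of whether a norm can be a Hilbert norm. As a hedged illustration (the function names and the choice of test vectors are ours), the following Python sketch computes the "defect" of the identity for the norms $\|\cdot\|_p$ on $\mathbb{R}^2$, evaluated on the two unit coordinate vectors. The defect is expected to vanish, up to rounding, for $p = 2$ and to be non-zero for $p = 1, 3, \infty$; in particular this suggests an answer to the question about $l_3$ raised above.

```python
import math

def p_norm(v, p):
    """Return ||v||_p for a finite vector; p may be math.inf."""
    if p == math.inf:
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 computed in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)

e1, e2 = [1.0, 0.0], [0.0, 1.0]   # unit coordinate vectors
for p in (1, 2, 3, math.inf):
    print(p, parallelogram_defect(e1, e2, p))
```

For these two vectors the defect can be computed by hand as $2^{1 + 2/p} - 4$ for finite $p$, which is zero exactly when $p = 2$.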
It turns out that the converse is also true: if a prenorm satisfies the parallelogram identity, then it is a Hilbert prenorm. Note first that Proposition 1 implies

Corollary 1 (Polar identity for a prenorm). In every pre-Hilbert space for every two vectors $x, y$ the following identity holds:

$$\langle x, y \rangle = \sum_{k=0}^{3} \frac{i^k}{4} \, \|x + i^k y\|^2.$$
As for the following proposition, it is sufficient for the majority of students just to know it as a fact, but the advanced students (noblesse oblige) must be able to prove it.
Theorem 2 (von Neumann-Jordan; see [50, I, Theorem 2.1.8]). If a prenorm satisfies the parallelogram law, then it is a Hilbert prenorm generated by the pre-inner product well-defined by the polar identity.

Exercise 3. Prove this theorem.

Hint. From the parallelogram identity with $x$ and $y$ replaced by $x + i^k z$ and $y + i^k z$ for $k = 0, 1, 2, 3$, we have

$$\langle x, z \rangle + \langle y, z \rangle = \frac{1}{2} \langle x + y, 2z \rangle.$$

Together with the equality $\langle 0, x \rangle = 0$ this gives $\langle x, 2z \rangle = 2 \langle x, z \rangle$ and $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$. This implies that $\langle \lambda x, y \rangle = \lambda \langle x, y \rangle$ for all positive integers $\lambda$, for all $\lambda$ of the form $\frac{1}{n}$, for all rational $\lambda$ and, finally, due to Proposition 1.1, for all real $\lambda$. Together with the identity $\langle ix, y \rangle = i \langle x, y \rangle$, which is verified directly, this gives the equality $\langle \lambda x, y \rangle = \lambda \langle x, y \rangle$ for all complex $\lambda$. The rest is simple.
From now on we concentrate on geometric properties of near-Hilbert spaces, and till the end of this section H will denote such a space. The main advantage of near-Hilbert spaces as compared to arbitrary normed spaces is that we can speak about "right angles" between vectors in these spaces.
Definition 4. Vectors $x$ and $y$ in $H$ are said to be orthogonal or perpendicular (notation: $x \perp y$) if $(x, y) = 0$. A system of vectors $e_\nu$; $\nu \in \Lambda$ is said to be orthogonal if all vectors are non-zero and mutually orthogonal. It is said to be orthonormal if in addition all the vectors have norm 1.
Proposition 6. Every orthogonal system is linearly independent.
Proof. If $e_1, \dots, e_n$ are some vectors in our system, and $\sum_{k=1}^{n} \lambda_k e_k = 0$; $\lambda_k \in \mathbb{C}$, then the "inner product of this identity with $e_k$" gives $\lambda_k (e_k, e_k) = 0$, hence $\lambda_k = 0$. The rest is clear. •
Proposition 7 ("Pythagorean equality"). If vectors $e_1, \dots, e_n$ are mutually orthogonal, then
$$\|e_1 + \dots + e_n\|^2 = \|e_1\|^2 + \dots + \|e_n\|^2.$$
Proof. This follows immediately from the expression of the norm in terms of the inner product. •
Proposition 8. Every orthogonal system in a separable near-Hilbert space is at most countable.
Proof. If $e_\nu$; $\nu \in \Lambda$ is our system, then, by the Pythagorean equality, the set of elements $e_\nu / \|e_\nu\|$; $\nu \in \Lambda$ satisfies the hypothesis of Corollary 0.2.1 with $\theta = \sqrt{2}$. The rest is clear. •
Here are several classical examples of orthonormal systems.
Example 4. In the "coordinate" space $l_2$ the system consisting of unit vectors (see Section 0.1) is, of course, orthonormal.
Example 5. You can easily verify that in the function space $L_2[-\pi, \pi]$ the system consisting of functions of the form $\frac{1}{\sqrt{2\pi}} e^{int}$; $n \in \mathbb{Z}$ is orthonormal. This system is called trigonometric. (The same name is used for its forefather, the orthonormal system consisting of the constant $\frac{1}{\sqrt{2\pi}}$ and functions $\frac{1}{\sqrt{\pi}} \cos nx$, $\frac{1}{\sqrt{\pi}} \sin nx$; $n \in \mathbb{N}$. The latter system fits the corresponding "real" near-Hilbert spaces.)
Example 6. In the space $L_2(\mathbb{T})$ the system $\frac{1}{\sqrt{2\pi}} z^n$; $n \in \mathbb{Z}$ is orthonormal. (This system arises from the trigonometric system in the previous example after rolling up the interval $[-\pi, \pi]$ into the circle $\mathbb{T}$.)
Example 7. In the space $L_2[0, 1]$ there is an orthogonal system of the so-called Rademacher functions. These functions depend on $n = 0, 1, 2, \dots$ and their construction is as follows: the interval $[0, 1]$ is divided into $2^n$ equal parts, and on the resulting small intervals the function is alternately equal to $+1$ and $-1$. The Rademacher system can be extended to a larger system consisting of all possible products of the Rademacher functions. This extended system is called the Walsh system.
If a finite or countable system of vectors in a near-Hilbert space is given, we can make an orthonormal system from it. This procedure is described in the proof of the following theorem.
Theorem 3. Let $x_n$; $n = 1, 2, \dots$ be a finite or infinite system of vectors in $H$ containing at least one non-zero vector. Then there exists a (finite or infinite) orthonormal system $e_n$; $n = 1, 2, \dots$ such that $\operatorname{span}\{e_n;\ n = 1, 2, \dots\} = \operatorname{span}\{x_n;\ n = 1, 2, \dots\}$.
Proof. Let us construct the required system by induction. We go through the vectors $x_n$ until we find a non-zero one. Normalize it and denote the result by $e_1$. Now assume that $e_1, \dots, e_k$ are already defined. Look at the initial vectors. If it turns out that they all lie in the linear span of the vectors $e_1, \dots, e_k$, then the theorem is proved. If, on the contrary, among the $x_n$ we find some $z$ not lying in this linear span, then we put
$$e'_{k+1} := z - \sum_{l=1}^{k} (z, e_l) e_l.$$
Clearly, this is a non-zero vector orthogonal to all $e_1, \dots, e_k$. Hence, $e_1, \dots, e_{k+1}$, where $e_{k+1} := e'_{k+1} / \|e'_{k+1}\|$, is an orthonormal system. Obvious arguments based on induction on the number of vectors found in this process show that every $x_n$ belongs to $\operatorname{span}\{e_n;\ n = 1, 2, \dots\}$, and every $e_n$ belongs to $\operatorname{span}\{x_n;\ n = 1, 2, \dots\}$. The rest is clear. •
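The inductive construction in this proof can be sketched for finitely many vectors in $\mathbb{C}^n$. This is only an illustration under stated assumptions: the names `inner` and `orthogonalize` are hypothetical, the inner product is taken linear in the first argument, and the tolerance `tol`, which decides in floating point whether a vector "lies in the linear span" of the previous ones, is an artifact of numerical computation that has no counterpart in the proof.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def orthogonalize(vectors, tol=1e-10):
    """Orthogonalization process for a finite list of vectors in C^n.

    A vector whose component outside span(basis) has norm below `tol`
    is treated as lying in that span and is skipped.
    """
    basis = []
    for z in vectors:
        z = [complex(t) for t in z]
        coeffs = [inner(z, e) for e in basis]      # the numbers (z, e_l)
        residual = list(z)
        for c, e in zip(coeffs, basis):            # z - sum_l (z, e_l) e_l
            residual = [r - c * t for r, t in zip(residual, e)]
        length = abs(inner(residual, residual)) ** 0.5
        if length > tol:
            basis.append([r / length for r in residual])
    return basis


# The third vector is the sum of the first two, so it adds nothing new.
vectors = [[1, 1, 0], [1, 0, 1], [2, 1, 1], [0, 1j, 1]]
basis = orthogonalize(vectors)
```

For long or nearly dependent lists this direct form of the process is known to lose orthogonality in floating point; the sketch is meant only to mirror the proof.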
The process of constructing the orthonormal system $e_1, e_2, \dots$ we used in this proof is called the orthogonalization process. In particular, if $H = \mathbb{C}^n$, then, applying the orthogonalization process to an arbitrary linear basis in $\mathbb{C}^n$, we obviously obtain an orthonormal system, which is also a linear basis.
Example 8. Take $L_2(\mathbb{R})$ and consider the sequence of functions $t^n e^{-t^2/2}$; $n = 0, 1, \dots$ in this space. If we apply the orthogonalization process to it, we get an orthonormal system consisting of the so-called Hermite functions. Obviously, the $n$th Hermite function has the form $p_n(t) e^{-t^2/2}$, where $p_n(t)$ is a polynomial of $n$th degree, called the Hermite polynomial. (In the last chapter of the book we will see how important the Hermite functions are.)
Definition 5. Suppose $e_n$; $n \in \mathbb{N}$ is an orthonormal system in $H$, and $x \in H$. A formal series $\sum_{n=1}^{\infty} (x, e_n) e_n$ in $H$ is called the Fourier series of the vector $x$ with respect to this system. The numbers $(x, e_n)$ are called the Fourier coefficients of the vector $x$ with respect to this system.
If $H := L_2[-\pi, \pi]$, then for the case of the trigonometric orthonormal system from Example 5 we obviously get the classical Fourier series and Fourier coefficients. That is where the terminology comes from.
The following several propositions show attractive geometric properties of near-Hilbert spaces. We recall that in a metric space $M$ the distance between an element $x$ and a subset $N \subset M$ is defined as the number $d(x, N) := \inf\{d(x, z) : z \in N\}$. An element $y \in N$ is said to be nearest to $x \in M$ if $d(x, y) = d(x, N)$. For instance, if we take $\mathbb{C}^2_\infty$ as $M$, the first unit vector $p^1$ (defined in Section 0.1) as $x$, and $\operatorname{span}\{p^2\}$ as $N$, then the set of nearest elements in $N$ to $x$ contains the whole interval $\{t p^2 : -1 \le t \le 1\}$. (Give similar examples, say, in $l_1$ and $C[a, b]$.) On the other hand, the following proposition holds.
Proposition 9. Suppose $x \in H$, $H_0$ is a finite-dimensional subspace in $H$, and $e_1, \dots, e_n$ is an orthonormal linear basis in $H_0$. Then $y := \sum_{k=1}^{n} (x, e_k) e_k$ is the only nearest vector in $H_0$ to $x$, and
$$\|x - y\|^2 = \|x\|^2 - \sum_{k=1}^{n} |(x, e_k)|^2.$$
Proof. It is easy to see that $(x, e_k) = (y, e_k)$ for every $k$. Hence, $(x - y) \perp e_k$ for the same $k$, and, applying the Pythagorean equality to the system $x - y, (x, e_1) e_1, \dots, (x, e_n) e_n$, we obtain the required equality. Then, by the properties of a linear basis, $(x - y) \perp z$ for an arbitrary $z \in H_0$, and, as a consequence, $(x - y) \perp (y - z)$ for the same $z$. Therefore,
$$d(x, z)^2 = \|x - z\|^2 = \|y - z\|^2 + \|x - y\|^2.$$
This means that $d(x, z) \ge d(x, y)$, and $y$ is the only vector $z \in H_0$ for which the equality is achieved. •
This proposition has the following corollary.
Corollary 2 ("Bessel inequality"). Suppose that for the same $H$ and $x$, $e_\nu$; $\nu \in \Lambda$ is an orthonormal system in $H$. Then
$$\sum_{\nu \in \Lambda} |(x, e_\nu)|^2 \le \|x\|^2.$$
In particular, if the given orthonormal system is a sequence $e_n$; $n = 1, 2, \dots$, then the number sequence $\xi$ with $\xi_n = (x, e_n)$ belongs to the space $l_2$, and the $l_2$-norm of this sequence does not exceed $\|x\|$.
The following definition is rather general, although in this book it will be considered only in the context of near-Hilbert spaces.
Definition 6. Suppose $E$ is a prenormed space. A subset $M$ of its elements is called a total subset (or a total system) in $E$ if its linear span is dense in $E$.
Example 9. The system of unit vectors in $l_2$ is total, because its linear span is $c_{00}$, which, as we said in the previous section, is dense in $l_2$.
Example 10. The trigonometric system in $L_2[-\pi, \pi]$ is also total (see Example 1.11(6)).
Remark. The Hermite system in $L_2(\mathbb{R})$ is also total. (This is important not only for mathematics, but for quantum mechanics as well.) But we still cannot give the proof of this fact: we need the "true" Hilbert spaces and the Fourier transform (see Exercise 7.1.6).
Proposition 10. In every separable near-Hilbert space there is a finite or a countable total orthonormal system.
Proof. Take a countable dense subset, and enumerate it. Then the orthonormal system constructed using Theorem 3 obviously has the required properties. •
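Proposition 9 and the Bessel inequality can be checked on a small example in $\mathbb{C}^3$. The following is a sketch only: the helper names `inner`, `norm_sq` and `fourier_projection`, the orthonormal pair `basis`, the vector `x` and the competitor `z` are all assumptions chosen for illustration.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared norm of a vector in C^n."""
    return inner(x, x).real


def fourier_projection(x, basis):
    """The vector y = sum_k (x, e_k) e_k for an orthonormal list `basis`."""
    y = [0j] * len(x)
    for e in basis:
        c = inner(x, e)                      # Fourier coefficient (x, e_k)
        y = [a + c * b for a, b in zip(y, e)]
    return y


s = 2 ** -0.5
basis = [[s, s, 0], [s, -s, 0]]              # orthonormal pair in C^3
x = [1 + 2j, 3, 4 - 1j]

y = fourier_projection(x, basis)
residual = norm_sq([a - b for a, b in zip(x, y)])          # ||x - y||^2
bessel_sum = sum(abs(inner(x, e)) ** 2 for e in basis)     # sum |(x, e_k)|^2

# Some other vector of the same subspace, to compare distances.
z = [1, 1, 0]
competitor = norm_sq([a - b for a, b in zip(x, z)])        # ||x - z||^2
```

Here `residual` should agree with `norm_sq(x) - bessel_sum` up to rounding, and `competitor` should not be smaller than `residual`, in line with the uniqueness of the nearest vector.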
Now we pass to the fundamental property of separable infinite-dimensional near-Hilbert spaces.
Theorem 4. Let $H$ be an infinite-dimensional near-Hilbert space, $e_n$; $n \in \mathbb{N}$ a total orthonormal system in $H$, and $x \in H$. Then
(i) $x$ is the sum of the Fourier series of this vector with respect to the system $\{e_n\}$ (i.e., $x = \sum_{n=1}^{\infty} (x, e_n) e_n$);
(ii) (Parseval equality) $\|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2$;
(iii) if $x = \sum_{n=1}^{\infty} \lambda_n e_n$, then $\lambda_n = (x, e_n)$.
Proof. For $n \in \mathbb{N}$ we set $H_n := \operatorname{span}\{e_1, \dots, e_n\}$. Then from the totality of the system $\{e_n\}$ it evidently follows that $d(x, H_n) \to 0$ as $n \to \infty$. Together with Proposition 9 this implies (i) and (ii). If $x = \sum_{k=1}^{\infty} \lambda_k e_k$, then, by Proposition 4, for every $n \in \mathbb{N}$ we have
$$(x, e_n) = \sum_{k=1}^{\infty} \lambda_k (e_k, e_n),$$
and the sum of the latter series is, obviously, $\lambda_n$. •
Let us discuss this theorem from the general point of view. We see that a total orthonormal system in a near-Hilbert space plays a role similar to that of a (linear) basis in a linear space, but this time infinite sums are allowed in the decompositions of vectors. Here is the general context for these situations.
Definition 7. Suppose $E$ is a separable normed space. A sequence $e_n$; $n \in \mathbb{N}$ of vectors in $E$ is called a Schauder⁵ basis, or a topological basis of this space if every $x \in E$ can be uniquely represented as the sum of a series $x = \sum_{n=1}^{\infty} \lambda_n e_n$ (this means that the coefficients $\lambda_n$; $n \in \mathbb{N}$ are uniquely determined).
Example 11. It is easy to see that for every $p \in [1, \infty)$ (but not for $p = \infty$!) the sequence of unit vectors is a Schauder basis in the space $l_p$.
Remark. Certainly, the word "separable" in Definition 7 can be omitted: it is clear that a space having such a system is automatically separable. There are versions of the definition of a topological basis for more general spaces, but we do not need them.
Remark. To be on the safe side, let us emphasize that the notion of Schauder basis formally bears no relation to the categorical notion of basis we discussed before (Definition 0.7.4). However, some systems of elements can be bases in both meanings. (This is true, for instance, for the system of unit vectors in $l_1$ if we consider this space as an object of our future concrete category $(\mathrm{Ban}_1, \bigcirc)$; see Section 2.5.)
Thus, statements (i) and (iii) of the previous theorem can be reformulated in the following way.
Theorem 4'. Every separable near-Hilbert space has a Schauder basis, which is an arbitrary total orthonormal system of vectors in this space. •
(Below, when speaking about orthonormal bases in near-Hilbert spaces, we have in mind precisely Schauder bases.) However, general separable normed spaces may have no Schauder bases even if they are Banach (see Definition 2.1.1 below). Some details will be given in Section 3.3.
⁵ Juliusz Schauder (1899-1943), prominent Polish mathematician, a student of Banach. Tragically perished during the fascist occupation of Poland.
3. Bounded operators: First acquaintance, examples
We have met the fundamental structure of functional analysis, the structure of a normed (and, more generally, prenormed) space. Now it is time to consider the mappings that properly interact with this structure.
Proposition 1. Let $T : E \to F$ be a linear operator between two prenormed spaces. The following conditions are equivalent:
(i) there exists $C > 0$ such that $\|T(x)\| \le C\|x\|$ for all $x \in E$;
(ii) $\sup\{\|T(x)\| : x \in B_E\} < \infty$.
Moreover, if $K^0$ is the infimum of the constants $C$ for which (i) holds, and $K_0$ is the supremum in (ii), then $K^0 = K_0$.
Proof. (i) $\Longrightarrow$ (ii). Obviously, for every $C$ satisfying (i), and for every $x \in B_E$ we have $\|T(x)\| \le C$. This implies (ii), and moreover $K_0 \le K^0$.
(ii) $\Longrightarrow$ (i). Take $x \in E$. If $\|x\| > 0$, then $\left\| \frac{x}{\|x\|} \right\| = 1$, hence $\left\| T\!\left( \frac{x}{\|x\|} \right) \right\| \le K_0$ and $\|T(x)\| \le K_0 \|x\|$. If $\|x\| = 0$, then for all $t > 0$ we have $tx \in B_E$. Therefore, $t\|T(x)\| = \|T(tx)\| \le K_0$, hence $\|T(x)\| = 0$. Therefore, $\|T(x)\| \le K_0 \|x\|$ for all $x \in E$. Thus (i) holds, where $C = K_0$. The rest is clear. •
Exercise 1°. The conditions in Proposition 1 are equivalent to the following one: $T$ takes every bounded (in the sense of the prenorm) subset in $E$ to a bounded subset in $F$.
Definition 1. An operator between prenormed spaces that has the properties described in Proposition 1 is called bounded. The number $K_0$ (or, what is the same, $K^0$) from this proposition is called the operator (sometimes, uniform) prenorm of the operator $T$ and is denoted by $\|T\|$.
This term and this notation are not accidental. Suppose $E$ and $F$ are two prenormed spaces. Denote by $\mathcal{B}(E, F)$ the set of all bounded operators from $E$ to $F$. Certainly, it is a subset of the linear space $\mathcal{L}(E, F)$ of all (linear) operators from $E$ to $F$.
Proposition 2. (i) $\mathcal{B}(E, F)$ is a linear subspace in $\mathcal{L}(E, F)$;
(ii) the function on $\mathcal{B}(E, F)$ that assigns to each operator $T$ the number $\|T\|$ is a prenorm in $\mathcal{B}(E, F)$;
(iii) if $F$ is a normed space, then $\mathcal{B}(E, F)$ is a normed space as well.
Proof. Take $S, T \in \mathcal{B}(E, F)$. Then for each $x \in E$ we have
$$\|(S + T)(x)\| = \|S(x) + T(x)\| \le \|S(x)\| + \|T(x)\| \le C\|x\|,$$
where $C := \|S\| + \|T\|$. This means that $S + T \in \mathcal{B}(E, F)$ and $\|S + T\| \le \|S\| + \|T\|$. It is even easier to verify that for a bounded $T$ and $\lambda \in \mathbb{C}$ we have $\lambda T \in \mathcal{B}(E, F)$ and $\|\lambda T\| = |\lambda| \|T\|$. Thus, we see that (i) and (ii) hold. Further, if $F$ is a normed space and $T \ne 0$ in $\mathcal{B}(E, F)$, then $T(x) \ne 0$ for some $x \in B_E$. Hence, $\|T\| \ge \|T(x)\| > 0$. The rest is clear. •
Exercise 2. If $E \ne 0$, then the prenorm in $\mathcal{B}(E, F)$ is a norm $\Longleftrightarrow$ the same is true for the prenorm in $F$.
Hint. Use the fact that there are non-zero functionals on $E$ (Exercise 0.1.2).
If condition (iii) of Proposition 2 is fulfilled, we will, of course, say "operator norm" (instead of operator prenorm). Later we will need the following result.
Proposition 3. Let $T : H \to K$ be a bounded operator between pre-Hilbert spaces. Then
$$\|T\| = \sup\{|(T(x), y)| : x \in B_H,\ y \in B_K\}.$$
Proof. For every $x \in B_H$ such that $\|T(x)\| \ne 0$ the vector $y := T(x)/\|T(x)\|$ lies in $B_K$, and $\|T(x)\| = (T(x), y)$. Taking the upper bound over all such $x$, we obtain the estimate $\|T\| \le \sup\{|(T(x), y)| : x \in B_H,\ y \in B_K\}$. The converse inequality follows from the Cauchy-Bunyakovskii inequality. •
Thus our stock of examples of (pre)normed spaces is now supplemented with the spaces $\mathcal{B}(E, F)$ consisting of bounded operators. Perhaps, the most important of them belong to the following two special cases:
(i) $F = \mathbb{C}$. Here the space $\mathcal{B}(E, F)$ consists of bounded functionals on $E$. This space is called the dual space for $E$, and has a short notation $E^*$. From Proposition 2(iii) we see that for every prenormed $E$ the space $E^*$ is normed.
(ii) $E = F$. In this case we speak about "operators acting on $E$", and write $\mathcal{B}(E)$ instead of $\mathcal{B}(E, E)$. From the same proposition it follows that if $E$ is a normed space, then the same is true for $\mathcal{B}(E)$.
Another important example of a (pre)normed space is the subspace of finite-dimensional bounded operators in $\mathcal{B}(E, F)$. We denote it by $\mathcal{F}(E, F)$, and if $E = F$, we write $\mathcal{F}(E)$ instead of $\mathcal{F}(E, E)$. The following obvious relation connects the operator prenorm and the composition of operators.
Proposition 4 ("multiplicative inequality for operator prenorms"). Suppose $S : E \to F$ and $T : F \to G$ are bounded operators between prenormed spaces. Then the operator $TS$ is bounded, and $\|TS\| \le \|T\| \|S\|$. •
Definition 2. Let $T : E \to F$ be an operator between prenormed spaces. It is called a contraction if $\|T\| \le 1$ (i.e., $\|T(x)\| \le \|x\|$ for all $x \in E$). An operator is called an isometry if it preserves norms, i.e., $\|T(x)\| = \|x\|$ for all $x \in E$, and a coisometry if it maps the unit open ball in $E$ onto the unit open ball in $F$.
(Clearly, $T$ is a contraction or an isometry $\Longleftrightarrow$ $T$ has the same properties as a mapping between (pre)metric spaces.) Note that an isometric operator is injective if $E$ is a normed space, and a coisometric operator is always surjective. The simplest examples of bounded operators are of course the zero operator $0 : E \to F$; $x \mapsto 0$ and the identity operator $1 : E \to E$; $x \mapsto x$; clearly, $\|0\| = 0$ and (if $E \ne 0$) $\|1\| = 1$. Let us also distinguish the following
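Example 1. Every linear operator $T$ from $\mathbb{C}^n_p$; $1 \le p \le \infty$ to a prenormed space $E$ is bounded. Indeed, every $\xi \in \mathbb{C}^n_p$ can be written in the form $\sum_{k=1}^{n} \xi_k p^k$, hence
$$\|T(\xi)\| = \Big\| \sum_{k=1}^{n} \xi_k T(p^k) \Big\| \le C\|\xi\|_\infty \le C\|\xi\|_p,$$
where one can take $C := \sum_{k=1}^{n} \|T(p^k)\|$.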
(Later we will see that the space $\mathbb{C}^n_p$ in this example can be replaced by an arbitrary finite-dimensional space.)
Some other classes of operators defined in general terms will be introduced in the following sections. Here we go over the most popular examples of operators connecting various concrete normed spaces. As we have mentioned before, the majority of operators one encounters in mathematics are either functionals, or operators acting on some space (see above). We postpone the discussion of functionals till the next section. As to the second class of operators, we say immediately that all the spaces where our future operators will act will be (as we will see in the next chapter) Banach spaces. Moreover, a significant part of these spaces will be Hilbert spaces.
A remark for the future. We now start a list of examples and suggest that the readers keep it readily available. In this book, we introduce a series of fundamental notions concerning operators, and every time we investigate what they mean for our concrete operators. Now we can only ask a few of the simplest questions about an operator, and the first question is: what is its norm? But later we will be able to answer other questions for the same operators (gradually complemented by other examples). For instance, does it belong to the class of compact operators? Or Fredholm operators? What is its spectrum? What is the adjoint operator? We believe that the
new notions can be fully understood only after considering them for many different examples. We begin with the spaces of sequences $l_p$; $1 \le p \le \infty$. (Those who do not trust Proposition 1.1.5 can restrict themselves to the cases of $p = 1, 2, \infty$.)
Example 2. Let $\lambda = (\lambda_1, \dots, \lambda_n, \dots)$ be a bounded sequence (i.e., a sequence belonging to $l_\infty$). The diagonal operator $T_\lambda : l_p \to l_p$ assigns to each $\xi = (\xi_1, \dots, \xi_n, \dots) \in l_p$ the sequence $T_\lambda(\xi) = (\lambda_1 \xi_1, \dots, \lambda_n \xi_n, \dots)$.
If $p < \infty$, then $\sum_{n=1}^{\infty} |\lambda_n \xi_n|^p \le \sum_{n=1}^{\infty} \|\lambda\|_\infty^p |\xi_n|^p$, hence $T_\lambda(\xi)$ belongs to $l_p$, and $\|T_\lambda(\xi)\|_p \le \|\lambda\|_\infty \|\xi\|_p$. At the same time
$$\sup\{\|T_\lambda(x)\| : \|x\| \le 1\} \ge \sup\{\|T_\lambda(p^n)\| : n \in \mathbb{N}\} = \|\lambda\|_\infty.$$
Hence, $T_\lambda$ is bounded and $\|T_\lambda\| = \|\lambda\|_\infty$. Obviously, the same is true for $p = \infty$. As we will see later, under some conditions on $\lambda$ the diagonal operators acting on $l_2$ are models of important classes of operators acting on abstract Hilbert spaces (see Section 6.2).
Exercise 3. If a sequence $\lambda$ is not bounded, then the operator $T_\lambda$ "leads out" of $l_p$.
Certainly, we can similarly define the operator $T_\lambda$; $\lambda = (\lambda_1, \dots, \lambda_n)$ acting on $l_p^n$; $n < \infty$, the finite-dimensional "embryo" of the space $l_p$. We call such operators diagonal as well; obviously, again $\|T_\lambda\| = \|\lambda\|_\infty$.
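The two estimates of Example 2 can be observed in the finite-dimensional case. The following is a minimal sketch with hypothetical names (`p_norm`, `diagonal`) and arbitrarily chosen data `lam` and `xi`; it illustrates the argument on a few vectors and proves nothing.

```python
def p_norm(xi, p):
    """The l_p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in xi)
    return sum(abs(t) ** p for t in xi) ** (1.0 / p)


def diagonal(lam):
    """Return the diagonal operator xi -> (lam_1 xi_1, ..., lam_n xi_n)."""
    def apply(xi):
        return [l * t for l, t in zip(lam, xi)]
    return apply


lam = [0.5, -3.0, 2.0 + 1.0j, 1.0]
T = diagonal(lam)
sup_lam = p_norm(lam, float("inf"))            # the uniform norm of lam

# Lower estimate: the supremum over unit vectors already reaches sup_lam.
unit_vectors = [[1.0 if i == k else 0.0 for i in range(4)] for k in range(4)]
attained = max(p_norm(T(e), 3) for e in unit_vectors)

# Upper estimate on one sample vector, here for p = 3.
xi = [1.0, 2.0, -1.0, 0.5j]
lhs = p_norm(T(xi), 3)
rhs = sup_lam * p_norm(xi, 3)
```

In finite dimensions the supremum over unit vectors is a maximum; for an infinite bounded sequence it need not be attained, which is why the text works with a supremum.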
Example 3. The shift operator. The mappings $T_l : (\xi_1, \dots, \xi_n, \dots) \mapsto (\xi_2, \dots, \xi_{n+1}, \dots)$ and $T_r : (\xi_1, \dots, \xi_n, \dots) \mapsto (0, \xi_1, \dots, \xi_{n-1}, \dots)$ are, obviously, bounded operators on $l_p = l_p(\mathbb{N})$; the first is called the operator of left shift, and the second, the operator of right shift. Clearly the norm of each operator is 1, and $T_r$ is an isometric operator, whereas $T_l$ is coisometric. The following operator $T_b$ has a resemblance (deceptive in many aspects, as we will see later) to them: it acts in $l_p(\mathbb{Z})$ (the space of sequences infinite on both sides) and takes $\xi = (\dots, \xi_1, \dots, \xi_n, \dots)$; $n \in \mathbb{Z}$ to $\eta = (\dots, \eta_1, \dots, \eta_n, \dots)$, where $\eta_n := \xi_{n-1}$. This is the so-called operator of bilateral shift in $l_p(\mathbb{Z})$; it is isometric, like $T_r$, but the difference is that $T_b$ has an isometric inverse operator (indicate the latter).
Obviously, in both examples we could consider $c_0$ instead of $l_p$, and the obvious bilateral analogue of $c_0$ instead of $l_p(\mathbb{Z})$. It is easy to verify that the norms of the respective operators do not change.
Now we pass to function spaces, taking as a key example the space $L_p(X, \mu)$; $1 \le p \le \infty$, where $(X, \mu)$ is a measure space.
Example 4. The operator of multiplication by an essentially bounded measurable function. Let $f$ be such a function, i.e., an element of the space $L_\infty(X, \mu)$. Then for almost all $t \in X$, we obviously have $|f(t)| \le \|f\|_\infty$. Therefore, if $p < \infty$, then for each $x \in L_p(X, \mu)$ we have
$$\int_X |f(t) x(t)|^p \, d\mu(t) \le \|f\|_\infty^p \int_X |x(t)|^p \, d\mu(t) = \|f\|_\infty^p \|x\|_p^p.$$
This means that for the indicated values of $p$ the mapping $T_f : x(t) \mapsto f(t) x(t)$ is a well-defined bounded operator on $L_p(X, \mu)$, and $\|T_f\| \le \|f\|_\infty$. Further, take an arbitrary $K < \|f\|_\infty$. Then there exists a subset $Y$ in $X$ of positive measure such that $|f(t)| > K$ for $t \in Y$. If $\chi$ is its characteristic function, then
$$\|T_f(\chi)\|_p^p = \int_Y |f(t)|^p \, d\mu(t) \ge K^p \|\chi\|_p^p.$$
Hence, $\|T_f\| \ge K$ for each $K < \|f\|_\infty$. Together with what was said above, this implies that for $p < \infty$ we have $\|T_f\| = \|f\|_\infty$. It is even simpler to see that the same equality is true for $p = \infty$.
The operators of this class acting on $L_2(X, \mu)$ play an important role
in the general theory of operator algebras, and especially in the theory of the so-called von Neumann algebras (see Section 6.3 and the corresponding references). In our exposition the most useful is the case where $X$ is an interval in $\mathbb{R}$, and $f(t) := t$ (the "independent variable"). As we will see in Sections 6.7 and 6.8, the corresponding operators play a very important role in spectral theory.
Certainly, the operator of multiplication by a function is a generalization of the operator of multiplication by a bounded sequence in $l_p$, which, as we recall, corresponds to the case of $(X, \mu) = (\mathbb{N}, \bullet)$. In particular, Example 2 is contained in Example 4. Further, in $C[a, b]$ there is an obviously defined operator of multiplication by a continuous function, and in $C^n[a, b]$; $n \in \mathbb{N}$ an operator of multiplication by an $n$-times-smooth function (check the boundedness!).
Example 5. The operator of indefinite integration. Consider an interval on the number line, say $[0, 1]$, and the space $L_2[0, 1]$. Let us assign to every integrable function $x$ its indefinite integral $y(t) := \int_0^t x(s) \, ds$. Keeping in mind the Cauchy-Bunyakovskii inequality, we have
$$|y(t)| \le \int_0^1 |x(s)| \, ds = (1, |x|) \le \|1\|_2 \|x\|_2 = \|x\|_2.$$
Hence $y$ (to be more precise, its coset) belongs to $L_2[0, 1]$, and $\|y\|_2 \le \|x\|_2$. Thus, we obtain a bounded operator on $L_2[0, 1]$, which, moreover, is a contraction.
Similarly, we can define the operator of indefinite integration on the spaces $C[0, 1]$ and $L_1[0, 1]$.
Exercise 4°. Both these operators have norm 1.
Hint. In the case of $C[0, 1]$ consider $x(t) \equiv 1$, and in the case of $L_1[0, 1]$, the sequence of functions $x_n(t) := n\chi_n$, where $\chi_n$ is the characteristic function of the interval $[0, 1/n]$.
Remark. Why did we not say a word about the norm of the operator of indefinite integration in $L_2[0, 1]$? The fact is that we are not ready to compute it now. If you try to do this "directly" (cf. the previous hint), it is doubtful whether you obtain anything reasonable. The success will come later when we learn much more about operators (see Exercise 6.2.5). But now we would like to surprise the reader: the answer is $2/\pi$.
Now we introduce one of the oldest and most respectable classes of operators. Long ago it played an important role in the development of the notion of operator in functional analysis.
Example 6.
Integral operators. Take an interval $[a, b]$ and denote by $\square$ its Cartesian square endowed with the standard plane Lebesgue measure. Choose a function $K(s, t)$ in the space $L_2(\square)$ of square-integrable functions. By Fubini's theorem, for almost all $s \in [a, b]$ the function $K_s(t) := K(s, t)$ belongs to $L_2[a, b]$. So for each of these $s$ and for every $x \in L_2[a, b]$ there is a number
$$y(s) := \int_a^b K(s, t) x(t) \, dt$$
(the inner product of the vectors $K_s(t)$ and $x(t)$ in $L_2[a, b]$). The Cauchy-Bunyakovskii inequality gives $|y(s)| \le \|K_s\|_2 \|x\|_2$. Thus, again by Fubini's theorem,
$$\int_a^b |y(s)|^2 \, ds \le \left( \int_a^b \|K_s\|_2^2 \, ds \right) \|x\|_2^2 = \left( \int_a^b \!\! \int_a^b |K(s, t)|^2 \, ds \, dt \right) \|x\|_2^2.$$
The last double integral is nothing but $\|K\|_2^2$, the square of the norm of $K(s, t)$ in $L_2(\square)$. Hence $y \in L_2[a, b]$ and $\|y\|_2 \le \|K\|_2 \|x\|_2$. This means that the mapping
$$x \mapsto y, \quad \text{where } y(s) := \int_a^b K(s, t) x(t) \, dt,$$
is a bounded operator on $L_2[a, b]$ with the norm not greater than $\|K\|_2$. We call the function $K(s, t)$ the kernel⁶ of the constructed operator.
⁶ Thus the term "kernel" has two quite different meanings: in addition to the kernel of a linear operator defined in Chapter 0, there is the kernel of an integral operator we define here. So please be on the alert!
A special class of integral operators is formed by the so-called Volterra operators. These are the ones whose kernel vanishes for $s < t$. In other words, a Volterra operator takes $x \in L_2[a, b]$ to
$$y(s) := \int_a^s K(s, t) x(t) \, dt$$
(the integral with variable upper limit), where $K$ is now an arbitrary function from $L_2(\square)$. The operator of indefinite integration in $L_2[0, 1]$, in its turn, is, of course, one of the Volterra operators. Indeed, here $K(s, t)$ is equal to 1 for $s > t$, and 0 for $s < t$.
Remark. By the way, we see that the norm of the operator of indefinite integration in $L_2[0, 1]$ does not exceed $1/\sqrt{2}$ (and hence is less than 1; cf. Exercise 4).
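The value $2/\pi$ announced earlier and the bound $1/\sqrt{2}$ of the last remark can both be observed numerically. The following is a rough sketch, not the method of the book: the operator of indefinite integration is replaced by a Riemann-sum approximation on a uniform grid, and its norm is estimated by power iteration applied to the composition of the discretized operator with its transpose. The function names, the grid size `n` and the iteration count are assumptions made for illustration.

```python
import math


def integrate_forward(x, h):
    """Discretized indefinite integration: y_i = h * (x_0 + ... + x_i)."""
    y, running = [], 0.0
    for value in x:
        running += value
        y.append(h * running)
    return y


def integrate_backward(y, h):
    """Transpose of the matrix above: z_j = h * (y_j + ... + y_{n-1})."""
    z, running = [0.0] * len(y), 0.0
    for j in range(len(y) - 1, -1, -1):
        running += y[j]
        z[j] = h * running
    return z


def estimate_norm(n=1000, iterations=60):
    """Estimate the largest singular value of the discretized operator."""
    h = 1.0 / n
    x = [1.0] * n
    for _ in range(iterations):
        x = integrate_backward(integrate_forward(x, h), h)
        scale = math.sqrt(sum(v * v for v in x))
        x = [v / scale for v in x]           # keep x at Euclidean norm 1
    y = integrate_forward(x, h)
    return math.sqrt(sum(v * v for v in y))  # ||Tx|| / ||x|| with ||x|| = 1


estimate = estimate_norm()
hilbert_schmidt_bound = 1.0 / math.sqrt(2.0)  # the norm of the kernel in L_2
```

With these parameters the estimate should come out close to $2/\pi \approx 0.6366$ and below $1/\sqrt{2} \approx 0.7071$; refining the grid should move it closer to $2/\pi$.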
Example 7. In the spaces $L_p(\mathbb{R})$ and $C_0(\mathbb{R})$ we can assign to every $a \in \mathbb{R}$ the so-called operator of translation by $a$, denoted by $T_a$. It takes a function $x$ to $T_a(x) : t \mapsto x(t - a)$. Obviously, this is an isometric operator with an isometric inverse operator (what is it?). Similarly, every $z \in \mathbb{T}$ generates a shift operator acting on $L_p(\mathbb{T})$ or in $C(\mathbb{T})$ by the following rule: $T_z(x) : t \mapsto x(z^{-1} t)$.
Remark. One more concrete operator acting on $L_2(\mathbb{R})$, perhaps the most outstanding operator acting on this space, will be added to our list much later. This is the so-called Hilbert Fourier operator (see Definition 7.4.1).
Finally, we give a classical operator connecting different spaces.
Example 8. The differentiation operator $D^k : C^{n+k}[a, b] \to C^n[a, b]$ assigns to every function its $k$th derivative (as always, we set $C^0[a, b] := C[a, b]$). (Check that this operator is indeed bounded.)

4. Topological and categorical properties of bounded operators
From now on in our exposition the first categories of functional analysis appear. They are as follows.
(i) The category Nor. The objects are normed spaces and morphisms are (arbitrary) bounded operators.
(ii) The category Nor_1. It has the same objects as in Nor, but the class of morphisms is smaller: it consists of contraction operators only.
(iii) The category Pre. The objects are (arbitrary) prenormed spaces, and morphisms are (just as in Nor) bounded operators.
(iv) The category Pre_1. The objects are the same as in Pre, and morphisms are (just as in Nor_1) contraction operators.
The composition of morphisms in all these four categories is the usual composition of operators. The fact that it is a bounded or, depending on the case, a contraction operator follows immediately from Proposition 3.4. The axioms of category (see Definition 0.3.1) are easily verified; local identities are obviously the identity operators. The connections between these four categories can be illustrated by the following scheme:
$$\begin{array}{ccc} \mathrm{Nor} & \subset & \mathrm{Pre} \\ \cup & & \cup \\ \mathrm{Nor}_1 & \subset & \mathrm{Pre}_1 \end{array}$$
where the symbol $\subset$ means that the left category is a full subcategory of the right category, and the symbol $\cup$ means that the lower category is a non-full subcategory, but with the same objects, of the upper category.
Remark. Let us immediately say that all these categories, even Nor, are not the main categories of linear functional analysis. The future categories of Banach and Hilbert spaces (which will be considered in the next chapter) are more important. But it is expedient to consider some basic notions and constructions without requiring completeness in the context of the categories we have introduced here.
Every time we encounter a new category, the first question we have to think about is what its isomorphisms are. Directly from Definition 0.4.1 it follows that isomorphisms in Nor, as well as in Pre, are bounded linear operators having inverse bounded linear operators. At the same time isomorphisms in Nor_1, as well as in Pre_1, are contraction operators having inverse contraction operators.
Definition 1. Isomorphisms in Nor and in Pre have a special name: topological isomorphisms (or: linear homeomorphisms). The special name for isomorphisms in Nor_1 and in Pre_1 is isometric isomorphisms (or: linear isometries).
Let us distinguish the following obvious
Proposition 1. Let $T : E \to F$ be an operator between prenormed spaces. Then
(i) $T$ is a topological isomorphism $\Longleftrightarrow$ it is bijective and there are two constants $C, c > 0$ such that $c\|x\| \le \|T(x)\| \le C\|x\|$ for all $x \in E$.
(ii) $T$ is an isometric isomorphism $\Longleftrightarrow$ it is bijective and isometric $\Longleftrightarrow$ it is bijective and coisometric. •
Now let us note that the polar identity for prenorms immediately implies the following.
Proposition 2. Let $H$ and $K$ be pre-Hilbert spaces. Then the operator $T : H \to K$ is isometric $\Longleftrightarrow$ for all $x, y \in H$ we have $(T(x), T(y)) = (x, y)$ (i.e., $T$ preserves the pre-inner products). In particular, our operator is an isometric isomorphism $\Longleftrightarrow$ it is bijective and preserves pre-inner products. •
In the indicated case of pre-Hilbert spaces isometric isomorphisms have a special name: unitary isomorphisms or unitary operators. According to this, isomorphic objects in Nor or, more generally, in Pre are also called topologically isomorphic (pre)normed spaces, and isomorphic objects in Nor_1 or in Pre_1, isometrically isomorphic (pre)normed spaces. Isometrically isomorphic pre-Hilbert (in particular, near-Hilbert) spaces are also called unitarily isomorphic.
Example 1. For $n = 1, 2, \dots$ the space $l_p^n$; $1 \le p \le \infty$ (see Example 1.3) is isometrically (and thus, topologically) isomorphic to the subspace in $l_p$ that consists of sequences with zeroes after the $n$th term.
Since the condition of being an isometric isomorphism is obviously much more rigid than the condition of being a topological isomorphism, it is not surprising that there are spaces that are isomorphic, say, in Nor, but not in Nor_1. Here is a simple illustration.

Exercise 1. For a given n all the spaces $\mathbb{C}_p^n$, $p > 0$, are mutually topologically isomorphic. On the other hand, $\mathbb{C}_1^n$ is not isometrically isomorphic to $\mathbb{C}_2^n$ for $n > 1$.

In the following chapter we discuss the classification of objects in our categories up to isomorphisms; however, we restrict ourselves to the categories of Banach and Hilbert spaces.

Note that in the introduced categories not every morphism that is a bijective map is automatically an isomorphism (this makes these categories similar to Top, but different from Lin). In Nor_1 one does not need to look long for examples: just take a contraction operator proportional to the identity operator. The examples in Nor are more complicated.

Example 2. Consider two norms in $l_1$: the natural norm $\|\cdot\|_1$ of this space (see Example 1.1.5) and the uniform norm $\|\cdot\|_\infty$ (inherited from $c_0$). Then the operator $\mathbf{1} : (l_1, \|\cdot\|_1) \to (l_1, \|\cdot\|_\infty)$ is bounded but has no bounded inverse operator. This follows, in particular, from the fact that the sum of n unit vectors has uniform norm equal to 1, but its $\|\cdot\|_1$-norm is n.

Remark. The "philosophical" meaning of examples of this kind will be seen later, when we discuss one of the central theorems of these lectures, the Banach theorem on the inverse operator; see Section 2.4.
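The computation behind Example 2 can be spelled out as follows. This is only a sketch: the symbols $p^k$ (the k-th unit vector) and $x_n$ (the sum of the first n unit vectors) are notation assumed here for illustration.

```latex
% x_n: sum of the first n unit vectors p^1, ..., p^n (notation assumed here)
\[
  x_n := p^1 + \dots + p^n, \qquad \|x_n\|_\infty = 1, \qquad \|x_n\|_1 = n .
\]
% A bounded inverse would give a constant C with \|x\|_1 \le C \|x\|_\infty for all x, hence
\[
  n = \|x_n\|_1 \le C\,\|x_n\|_\infty = C \quad \text{for every } n,
\]
% which is impossible.
```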
4.
Properties of bounded operators
85
We now recall that every prenormed space is automatically topological. A fact of principal importance is that bounded operators can be completely characterized in topological terms.
Theorem 1. The following properties of an operator $T : E \to F$ between prenormed spaces are equivalent:
(i) T is bounded;
(ii) T is continuous at zero;
(iii) T is continuous;
(iv) T is uniformly continuous.

Proof. (i) $\Longrightarrow$ (iv). Take $\varepsilon > 0$ and put $\delta := \varepsilon / C$, where C is the constant from Proposition 3.1(i). Then, obviously, for all $x, y \in E$, from $d(x, y) = \|x - y\| < \delta$ we have $d(T(x), T(y)) = \|T(x) - T(y)\| < \varepsilon$.
(iv) $\Longrightarrow$ (iii) $\Longrightarrow$ (ii). Clear.
(ii) $\Longrightarrow$ (i). The mapping T of the respective premetric spaces is continuous at $0 \in E$, and $T(0) = 0 \in F$. Hence for $\varepsilon := 1$ there exists $\delta$ such that $\|x'\| < \delta$ implies $\|T(x')\| < \varepsilon$. Take an arbitrary $x \in E$. If $\|x\| > 0$, then we can put $x' := \frac{\delta}{2\|x\|}\, x$. We see that $\|x'\| < \delta$, and this immediately implies $\|T(x)\| < \frac{2}{\delta}\|x\|$. If $\|x\| = 0$, then for every $t > 0$ we have $\|tx\| < \delta$. Hence $t\|T(x)\| = \|T(tx)\| < 1$, and therefore $\|T(x)\| = 0$. Thus, for every $x \in E$ we have $\|T(x)\| \le C\|x\|$, where $C := \frac{2}{\delta}$. ∎
From this theorem we can easily obtain further information about possible properties of operators as mappings between topological spaces.

Corollary 1. An operator between prenormed spaces is a topological isomorphism $\Longleftrightarrow$ it is a homeomorphism.
This, in turn, implies the following

Corollary 2. Let $T : E \to F$ be a bounded operator between prenormed spaces. Then T is topologically injective $\Longleftrightarrow$ it is injective and in addition there is $c > 0$ such that $c\|x\| \le \|T(x)\|$ for all $x \in E$.

Note that if E is a normed space, then the estimate $c\|x\| \le \|T(x)\|$ for $x \in E$ automatically implies that T is injective.
Proposition 3. Suppose $T : E \to F$ is a bounded operator between normed spaces. Then the following properties are equivalent:
(i) T is topologically surjective;
(ii) T is open;
(iii) the set $T(B_E^0)$ (i.e., the image of the open unit ball in E) contains some open ball $\theta B_F^0$ (centered at 0 with radius $\theta > 0$);
(iv) there exists $C' > 0$ such that for every $y \in F$ there is $x \in E$ satisfying $T(x) = y$ and $\|x\| \le C'\|y\|$.

Proof. (i) $\Longrightarrow$ (ii). Let U be an open set in E. Then it is easy to see that the inverse image of the set $T(U)$ is the algebraic sum $U + \operatorname{Ker}(T) = \bigcup\{U + x;\ x \in \operatorname{Ker}(T)\}$, which is open, as is seen from its explicit expression.
(ii) $\Longrightarrow$ (i), (ii) $\Longrightarrow$ (iii). Clear.
(iii) $\Longrightarrow$ (iv). Take $y \in F$, $y \ne 0$. Since $y' := \frac{\theta}{2\|y\|}\, y \in \theta B_F^0$, we have $y' = T(x')$ for some $x' \in B_E^0$, and $y = T(x)$ for $x := 2\|y\|\theta^{-1}x'$. Hence $\|x\| < C'\|y\|$ for $C' := 2\theta^{-1}$.
(iv) $\Longrightarrow$ (ii). Suppose U is an open set in E, and $y = T(x)$; $x \in U$. Take $\theta > 0$ such that $x + \theta B_E^0 \subset U$. Then for each $z \in F$ satisfying $\|y - z\| < (C')^{-1}\theta$ there is $x' \in E$ such that $T(x') = y - z$ and $\|x'\| < \theta$; then, obviously, $z = T(x - x')$ and $x - x' \in U$. Thus, y is an interior point in $T(U)$. ∎

Exercise 2. If E and F are prenormed spaces, then the equivalences (i) $\Longleftrightarrow$ (ii) $\Longleftrightarrow$ (iii) and the implication (iv) $\Longrightarrow$ (ii) of Proposition 3 remain true. At the same time, generally speaking, (iv) does not follow from (i)-(iii).

Hint. Consider as a counterexample the projection $E \to E/F$, where F is a dense proper subspace of a normed space E.

Certainly, every injective isometric operator is topologically injective, and every coisometric one is topologically surjective. Let us look at the meaning of these notions for the concrete examples of operators considered in the previous section. Which of these operators are topological isomorphisms? Which are topologically injective? Isometric? We emphasize two observations.

Exercise 3. The diagonal operator $T_\lambda : l_p \to l_p$; $1 \le p \le \infty$ is a topological isomorphism $\Longleftrightarrow$ it is topologically injective $\Longleftrightarrow$ it is surjective $\Longleftrightarrow$ it is topologically surjective $\Longleftrightarrow$ the closure of the set of numbers $\lambda_n$ in $\mathbb{C}$ does not contain 0.

This fact is a special case of the following one.

Exercise 4. The operator $T_f : L_p(X, \mu) \to L_p(X, \mu)$; $1 \le p \le \infty$; $f \in L_\infty(X, \mu)$ (cf. Example 3.4) is a topological isomorphism $\Longleftrightarrow$ it is topologically injective $\Longleftrightarrow$ it is topologically surjective $\Longleftrightarrow$ it is surjective $\Longleftrightarrow$ there is $\theta > 0$ such that $|f(t)| \ge \theta$ for almost all $t \in X$.
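As an illustration of the criterion in Exercise 3, here is a minimal sketch for one concrete sequence. It assumes that the diagonal operator acts coordinatewise, $(T_\lambda \xi)_n = \lambda_n \xi_n$, and writes $p^n$ for the n-th unit vector; the choice $\lambda_n := 1/n$ is hypothetical and made only for this example.

```latex
% Hypothetical choice \lambda_n := 1/n; T_\lambda is assumed to act coordinatewise
\[
  (T_\lambda \xi)_n = \tfrac{1}{n}\,\xi_n , \qquad
  \|T_\lambda p^n\| = \tfrac{1}{n}\,\|p^n\| = \tfrac{1}{n} \longrightarrow 0 .
\]
% T_\lambda is bounded and injective, but no c > 0 satisfies
% c \|\xi\| \le \|T_\lambda \xi\| for all \xi, so by Corollary 2 it is not
% topologically injective; accordingly, 0 lies in the closure of \{1/n\}.
```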
Hint. If the last condition holds, then the operator $T_{1/f}$ is inverse to $T_f$. If it does not hold, consider the images and inverse images (if they exist) of the functions $\chi_n / \|\chi_n\|$, where $\chi_n$ is the characteristic function of the set $\{t \in X : |f(t)| < 1/n\}$.

Remark. As for integral operators from Example 3.6, you can verify right now that they do not have any of these properties. Later you will find that all integral operators are compact (see Definition 3.3.1 and Theorem 3.3.1 below), and this will allow you to obtain the same assertion as a simple corollary of this fact.

We now apply Theorem 1 to the following question of principal importance. Suppose we have two prenorms on a vector space. What can we say about the relations between the topologies generated by these prenorms?

Definition 2. Let $\|\cdot\|$ and $\|\cdot\|'$ be prenorms on a linear space E. One says that $\|\cdot\|'$ majorizes $\|\cdot\|$ if there exists $C > 0$ such that $\|\cdot\| \le C\|\cdot\|'$ (i.e., for all $x \in E$ we have $\|x\| \le C\|x\|'$). Two prenorms are called equivalent if each of them majorizes the other.

Example 3. Consider $C[a, b]$ with two norms: $\|\cdot\|_\infty$ (the uniform norm), and $\|\cdot\|_p$ for some $p \in [1, \infty)$ (the norm inherited from $L_p[a, b]$). Then the first majorizes the second: $\|\cdot\|_p \le (b - a)^{1/p}\|\cdot\|_\infty$. At the same time, these norms are not equivalent (check this!).

Example 4. Consider $L_2[a, b]$ with norms $\|\cdot\|_1$ and $\|\cdot\|_2$. Then the formula $\|x\|_1 = \langle |x|, I \rangle$, where $I(t) \equiv 1$, and the Cauchy-Bunyakovskii inequality imply that $\|x\|_1 \le \sqrt{b - a}\,\|x\|_2$. Thus the second norm majorizes the first.

Remark. Actually, a more general fact is true: for the very same interval the norm $\|\cdot\|_q$ majorizes the norm $\|\cdot\|_p$ with $1 \le p < q \le \infty$. But this assertion is more difficult to prove, and we will not need it in the sequel.

We note the following almost tautological

Proposition 4. A linear operator $T : E \to F$ between prenormed spaces is bounded $\Longleftrightarrow$ the initial prenorm in E majorizes the prenorm $|||x||| := \|T(x)\|$. ∎
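Returning to Example 3, one possible way to carry out the suggested check is sketched below. The test functions $f_n$ are a hypothetical choice made here, not taken from the text; any sequence with uniform norm 1 and $\|\cdot\|_p$-norm tending to 0 would serve equally well.

```latex
% Hypothetical test functions on [a, b]
\[
  f_n(t) := \Bigl(\frac{t - a}{b - a}\Bigr)^{n}, \qquad \|f_n\|_\infty = 1,
\]
\[
  \|f_n\|_p = \Bigl(\int_a^b \Bigl(\frac{t - a}{b - a}\Bigr)^{np} dt\Bigr)^{1/p}
            = \Bigl(\frac{b - a}{np + 1}\Bigr)^{1/p} \longrightarrow 0 .
\]
% Hence no constant C satisfies \|f\|_\infty \le C \|f\|_p for all f in C[a, b],
% i.e. \|\cdot\|_p does not majorize \|\cdot\|_\infty.
```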
The following proposition answers the above question.

Proposition 5. Let $\|\cdot\|$ and $\|\cdot\|'$ be two prenorms in a linear space E, and $\tau$ and $\tau'$ the corresponding topologies. Then $\tau'$ is not weaker than $\tau$ $\Longleftrightarrow$ $\|\cdot\|'$ majorizes $\|\cdot\|$.

Proof. Clearly, $\tau'$ is not weaker than $\tau$ $\Longleftrightarrow$ the identity operator $\mathbf{1} : (E, \|\cdot\|') \to (E, \|\cdot\|)$ is continuous. By the previous theorem, this precisely means that $\|\cdot\|'$ majorizes $\|\cdot\|$. ∎
Corollary 3. Two prenorms on a linear space generate the same topology $\Longleftrightarrow$ they are equivalent.
We immediately point out a fact of fundamental importance: every two norms on a finite-dimensional linear space are equivalent (which, in view of Proposition 4, is equivalent to the statement that every linear isomorphism between finite-dimensional normed spaces is a topological isomorphism). The proof, however, is not as simple as one might think, and requires some additional arguments. We will give it a little later; see Corollary 2.1.2 and Proposition 3.2.7. Combining Theorem 1 and Proposition 0.2.10, we immediately obtain

Proposition 6. If $T : E \to F$ is a bounded operator from a prenormed space to a normed space, then its kernel is a closed subspace in E. ∎
When discussing topological and isometric isomorphisms, we were in fact considering the conditions (more indulgent in the first case and more rigid in the second) under which we can identify different (pre)normed spaces (cf. the discussion of the corresponding category theory notions in Section 0.4). Now let us talk about the conditions under which bounded operators (generally speaking, defined on different spaces) can be identified.

Definition 3. Let $S : E_1 \to E_2$ and $T : F_1 \to F_2$ be bounded operators acting between prenormed spaces. They are called weakly topologically (respectively, weakly isometrically) equivalent if there are topological (respectively, isometric) isomorphisms I and J such that the diagram

[commutative square formed by S, T, I and J; diagram not reproduced]

is commutative.

Obviously, weak topological equivalence is a special case of the general category theory notion of weak similarity of morphisms, namely, the weak similarity of morphisms in Pre (or Nor in the context of normed spaces). As for weak isometric equivalence, it is the weak similarity of morphisms in Pre_1 (or in Nor_1) if the question is about contraction operators, and the weak similarity of morphisms in Pre (Nor) with respect to Pre_1 (or Nor_1) in the case of arbitrary operators.

The reader can easily verify that topological isomorphism, topological injectivity and topological surjectivity of an operator are invariants of weak topological equivalence, i.e., these properties do not change when an operator is replaced by a weakly topologically equivalent operator. (Later the so-called Fredholm property will be added to the list.) At the same time, the properties of being an isometric isomorphism, an isometric or a coisometric operator are invariants of weak isometric equivalence. Further, the dimension of the kernel and the codimension of the image of an operator, and the property that the image or the kernel is closed, are, clearly, invariants of weak topological equivalence. (We recall that the closedness of kernels is guaranteed in the case of Nor but not in the case of Pre.) Certainly, the operator (pre)norm is a most important numeric invariant of weak isometric equivalence.
The following "more precise" identification is apparently of more interest.

Definition 4. Let $S : E \to E$ and $T : F \to F$ be bounded operators acting on prenormed spaces. They are called topologically (respectively, isometrically) equivalent if there exists a topological (respectively, isometric) isomorphism I such that the diagram

[commutative square formed by S, T and two copies of I; diagram not reproduced]

is commutative.

The informal meaning of this as well as of the previous definition is that equivalent operators coincide (act on vectors in the same way) after some identification of the given spaces by means of an isomorphism. (For "just" equivalence only one isomorphism is allowed, while for a weak equivalence two different isomorphisms are allowed.) Every invariant of a weak equivalence (of either type) is automatically an invariant of the corresponding "just" equivalence. However, the latter has some new invariants, say, the existence (or absence) of invariant subspaces of a given dimension, or the dimension of the subspace of eigenvectors with a given eigenvalue. But, as we shall see later, the most respectable invariant of topological equivalence is the so-called spectrum of an operator (see Proposition 4.1.1).

Obviously, topological equivalence is a special case of the general categorical notion of similarity of morphisms, namely, the similarity of morphisms in Pre (or in Nor in the context of normed spaces). By the way, in the literature, the similarity of operators is often understood in the sense we use when speaking about topological equivalence. As for isometric equivalence, this is the similarity of morphisms in Pre_1 (or in Nor_1) if the question is about contraction operators, and the similarity of morphisms in Pre (Nor) with respect to Pre_1 (Nor_1) in the case of arbitrary bounded operators.

If the question is about operators acting on pre-Hilbert (in particular, on near-Hilbert) spaces, then the unitary and the weakly unitary equivalence are the isometric and the weakly isometric equivalence; cf. the definition of a unitary operator above. (Actually, these are the best studied types of equivalences; see Sections 3.4, 6.2 and 6.7.)

The operator I in Definition 4 is said to implement the topological (or, depending on the context, isometric, or unitary) equivalence between operators S and T. By analogy, the pair (I, J) of operators from Definition 3 is said to implement the weak topological (isometric, unitary) equivalence of the corresponding operators.

Later, as more material is gathered, we shall give many substantial examples of different types of equivalence in the context of operators acting on infinite-dimensional spaces. As for finite-dimensional spaces, the reader has already faced the corresponding examples, perhaps expressed in somewhat different words. Refresh your memory with the following exercise, which is actually already familiar to you.

Exercise 5.
(i) Two operators on $\mathbb{C}_p^n$; $1 \le p \le \infty$ are topologically equivalent $\Longleftrightarrow$ they can be written by means of the same matrices in some (generally speaking, different) linear bases.
(ii) Two operators on $\mathbb{C}_2^n$ are unitarily (i.e., isometrically) equivalent $\Longleftrightarrow$ they can be written by means of the same matrices in some (generally speaking, different) orthonormal bases.
Thus, we meet matrices. In the first algebra course we were taught that matrix notation is a powerful analytical tool for studying operators (that act on finite-dimensional spaces). In functional analysis matrix notation plays a more modest role, but nevertheless it is also very useful here. Operators acting on spaces with a Schauder basis can be written in a matrix form, but the matrices here are infinite to the right and down.

Definition 5. Let E be a normed space with a Schauder basis $e_n$, and T an operator acting on E. The matrix of this operator in this Schauder basis is the table $a_{mn} \in \mathbb{C}$; $m, n \in \mathbb{N}$ uniquely defined by the rule $Te_n = \sum_{m=1}^{\infty} a_{mn} e_m$. Similarly, if we are talking about two normed spaces E and F with Schauder bases $e'_n$ and $e''_n$ respectively, then the matrix of an operator $T : E \to F$ with respect to these bases is the table $a_{mn} \in \mathbb{C}$; $m, n \in \mathbb{N}$ uniquely determined by the rule $Te'_n = \sum_{m=1}^{\infty} a_{mn} e''_m$.
Proposition 7. If T is an operator acting on a near-Hilbert space with orthonormal Schauder basis $e_n$, then its matrix in this basis has the form $a_{mn} = \langle Te_n, e_m \rangle$. If T is an operator acting between two near-Hilbert spaces with orthonormal Schauder bases $e'_n$ and $e''_n$, then its matrix with respect to these bases is $a_{mn} = \langle Te'_n, e''_m \rangle$.

Proof. This evidently follows from Theorem 2.3. ∎
The matrix of an operator T acting on $l_2$ in the basis of unit vectors has the most visual form: its nth column is just the square-summable sequence $T(p^n)$. The following proposition is a corollary of Proposition 3.3.

Proposition 8. Let T be an operator on $l_2$, and $(a_{mn})$ its matrix. Then
$$\|T\| = \sup\Bigl\{\Bigl(\sum_{m=1}^{\infty}\Bigl|\sum_{n=1}^{\infty} a_{mn}\xi_n\Bigr|^2\Bigr)^{1/2} : \xi_n \in \mathbb{C};\ \sum_{n=1}^{\infty}|\xi_n|^2 \le 1\Bigr\}$$
$$= \sup\Bigl\{\Bigl|\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} a_{mn}\xi_n\eta_m\Bigr| : \xi_n, \eta_m \in \mathbb{C};\ \sum_{n=1}^{\infty}|\xi_n|^2 \le 1,\ \sum_{m=1}^{\infty}|\eta_m|^2 \le 1\Bigr\}. \qquad ∎$$

Exercise 6°. What are the matrices of a diagonal operator and of the operators of left and right shift in $l_2$ in the basis of unit vectors?

Proposition 9. Let S be an operator acting on a normed space E with a Schauder basis $e'_n$, and T an operator isometrically equivalent to S on a normed space F. Then F also has a Schauder basis $e''_n$ such that the matrices of operators S and T in these bases coincide.

Proof. Suppose $I : E \to F$ implements the equivalence between S and T. Put $e''_n := I(e'_n)$ and take $y \in F$. Then, according to our assumption, $x := I^{-1}(y) \in E$ can be uniquely represented as a sum $x = \sum_{n=1}^{\infty} \lambda_n e'_n$; $\lambda_n \in \mathbb{C}$. Hence y, being equal to $I(x)$, is uniquely represented as a sum $y = \sum_{n=1}^{\infty} \lambda_n e''_n$. Thus $e''_n$ is a Schauder basis in F. Further, if $\{a_{mn}\}$ is the matrix of S in the basis $e'_n$, then
$$Te''_n = ISI^{-1}e''_n = ISe'_n = I\Bigl(\sum_{m=1}^{\infty} a_{mn}e'_m\Bigr) = \sum_{m=1}^{\infty} a_{mn}I(e'_m) = \sum_{m=1}^{\infty} a_{mn}e''_m .$$
The rest is clear. ∎

Remark. We do not discuss here the following natural question: when is a given table a matrix of some bounded operator? Unfortunately, even for the space $l_2$, which is the best space in many respects, a transparent answer is not known, and apparently not foreseen. (Proposition 8 clarifies almost nothing.) More about this will be said in the context of Hilbert spaces in Section 2.2.
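To make Definition 5 concrete, here is a small worked case. It is a sketch under an assumption: the right shift on $l_2$ is taken to be the operator $T_r(\xi_1, \xi_2, \dots) = (0, \xi_1, \xi_2, \dots)$, and $p^n$ denotes the n-th unit vector; the symbol $T_r$ is introduced here for illustration only.

```latex
% Assumed definition of the right shift: T_r(\xi_1, \xi_2, ...) = (0, \xi_1, \xi_2, ...)
\[
  T_r\, p^n = p^{n+1} = \sum_{m=1}^{\infty} a_{mn}\, p^m
  \quad\Longrightarrow\quad
  a_{mn} = \delta_{m,\,n+1} ,
\]
% so the matrix has ones directly below the main diagonal and zeros elsewhere.
% The same entries come from the formula of Proposition 7:
\[
  a_{mn} = \langle T_r\, p^n,\ p^m \rangle = \langle p^{n+1},\ p^m \rangle = \delta_{m,\,n+1} .
\]
```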
To conclude this section we say a few words about bilinear operators in functional analysis. Unlike the linear ones, they admit at least two different interpretations of the notion of boundedness.

Definition 6. Let E, F, G be prenormed spaces. A bilinear operator $R : E \times F \to G$ is called jointly bounded if $\sup\{\|R(x, y)\| : x \in B_E,\ y \in B_F\} < \infty$, and separately bounded if for each $x \in E$ and $y \in F$ the operators $R_x : F \to G : y \mapsto R(x, y)$ and $R_y : E \to G : x \mapsto R(x, y)$ are bounded. The above supremum is called the prenorm of a (jointly bounded) bilinear operator R and is denoted by $\|R\|$. If $\|R\| \le 1$, then R is called a contraction.

Proposition 10. Every jointly bounded bilinear operator is separately bounded. ∎

Proposition 11. Suppose $R : E \times F \to G$ is a jointly bounded bilinear operator, $x_n$ tends to x in E, and $y_n$ tends to y in F. Then $R(x_n, y_n)$ tends to $R(x, y)$ in G.

Proof. For every n we have the estimate
$$\|R(x, y) - R(x_n, y_n)\| = \|R(x - x_n, y) + R(x_n, y - y_n)\| \le \|R\|\,\|y\|\,\|x - x_n\| + \|R\|\,(\|x\| + \|x_n - x\|)\,\|y - y_n\| .$$
The rest is clear. ∎

It is easy to prove a more general fact.

Exercise 7. A bilinear operator R is jointly bounded $\Longleftrightarrow$ it is continuous as a mapping from the topological product $E \times F$ to the topological space G.

We denote the set of all jointly bounded bilinear operators from $E \times F$ to G by $\mathrm{Bil}(E \times F, G)$.

Remark. Having as examples Propositions 3.1 and 3.2, we can show that this set is a linear space with the pointwise operations, and the function $R \mapsto \|R\|$ is a prenorm (and this justifies the term). Finally, $(\mathrm{Bil}(E \times F, G), \|\cdot\|)$ is a normed space $\Longleftrightarrow$ G has the same property. However, we shall not need these facts.

As one knows from the course of analysis, there are discontinuous functions of two variables that are continuous in each variable separately. Similarly, there are separately, but not jointly bounded bilinear operators.

Example 5. A bilinear functional $f : c_{00} \times c_{00} \to \mathbb{C} : (\xi, \eta) \mapsto \sum_{n=1}^{\infty} n\,\xi_n\eta_n$, where $c_{00}$ is considered with the uniform norm, is separately but not jointly bounded. (Check this!)
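The check requested in Example 5 can be sketched as follows; here $p^n$ denotes the n-th unit vector of $c_{00}$, a notation assumed for this illustration.

```latex
% Not jointly bounded: evaluate f on the n-th unit vector p^n (notation assumed)
\[
  \|p^n\|_\infty = 1, \qquad f(p^n, p^n) = n \longrightarrow \infty .
\]
% Separately bounded: a fixed \xi in c_{00} has only finitely many nonzero terms, so
\[
  |f(\xi, \eta)| \le \Bigl(\sum_{n=1}^{\infty} n\,|\xi_n|\Bigr)\,\|\eta\|_\infty ,
\]
% where the sum on the right is finite; the same bound holds with \xi and \eta exchanged.
```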
Remark. We will see later that in Banach spaces, the two types of continuity coincide (see Theorem 2.4.5).

5. Some types of operators and operator constructions. Projections
First, we note that to each pair consisting of a prenormed space E and a subspace $E_0$ in E, one can assign the following two operators: the natural injection $\mathrm{in} : E_0 \to E$, taking every vector $x \in E_0$ to the same vector considered in E, and the natural projection $\mathrm{pr} : E \to E/E_0$, taking every vector to its coset modulo $E_0$. Clearly, the norm of each operator is 1, and in (being isometric) is topologically injective, whereas pr (being, as is easily verified, coisometric) is topologically surjective.

Remark. We do not claim that the natural projection maps the closed unit ball in E onto the closed unit ball in $E/E_0$; see Exercise 3.2.2.
For future reference let us formulate the following obvious

Proposition 1. An operator is topologically injective (respectively, isometric and injective) $\Longleftrightarrow$ it is the result of a successive application of a topological (respectively, isometric) isomorphism and a natural injection. ∎
Two other typical operators appear in the situation where a bounded operator $T : E \to F$ is already defined, and it is known that it maps a subspace $E_0 \subset E$ to a subspace $F_0 \subset F$. The first operator is the corresponding birestriction $T_0^0 : E_0 \to F_0$ of our operator; certainly, $\|T_0^0\| \le \|T\|$. The following proposition describes the second operator.

Proposition 2. In the indicated situation there exists a unique linear operator $\widetilde{T}$ such that the diagram

[commutative square formed by $T : E \to F$, $\widetilde{T} : E/E_0 \to F/F_0$ and the natural projections $E \to E/E_0$, $F \to F/F_0$; diagram not reproduced]

is commutative, where the vertical arrows denote the natural projections. This operator is bounded and $\|\widetilde{T}\| \le \|T\|$.

Proof. Clearly, the required operator is uniquely determined by the fact that it must take the coset $\tilde{x}$ of a vector x modulo $E_0$ to the coset of the vector $T(x)$ modulo $F_0$. Moreover, obviously, there is an operator that is well defined by this rule. Further, from the commutativity of our diagram and from the fact that $\|\mathrm{pr}_2\| = 1$ it follows that for all $\tilde{x} \in E/E_0$ and $x \in \tilde{x}$ we have $\|\widetilde{T}(\tilde{x})\| \le \|T(x)\|$. Thus, $\|\widetilde{T}(\tilde{x})\| \le \|T\|\,\|x\|$; taking the infimum over all $x \in \tilde{x}$, we see that $\|\widetilde{T}\| \le \|T\|$. ∎

We distinguish a special case of the constructed operator.
Proposition 3. Let $T : E \to F$ be a bounded operator, and $E_0 \subset \operatorname{Ker} T$. Then there exists a unique linear operator $\widetilde{T}$ such that the diagram

[commutative triangle formed by $T : E \to F$, $\mathrm{pr} : E \to E/E_0$ and $\widetilde{T} : E/E_0 \to F$; diagram not reproduced]

is commutative. This operator is bounded and $\|\widetilde{T}\| = \|T\|$. If in addition $E_0 = \operatorname{Ker} T$, then $\widetilde{T}$ is injective.

Proof. The previous proposition, considered for the given $E_0$ and $F_0 := 0$, implies the statement about the diagram and the estimate $\|\widetilde{T}\| \le \|T\|$. On the other hand, $\|T\| \le \|\widetilde{T}\|\,\|\mathrm{pr}\| = \|\widetilde{T}\|$. Finally, if $E_0 = \operatorname{Ker} T$, then from the commutativity of the diagram it evidently follows that $\operatorname{Ker} \widetilde{T} = 0$. ∎
We say that the operator $\widetilde{T}$ in Proposition 3 (and in a more general situation, in Proposition 2) is generated by the operator T. The following proposition is similar to Proposition 1, and it is verified immediately.

Proposition 4. (i) An operator T is topologically surjective $\Longleftrightarrow$ the corresponding generated operator $\widetilde{T} : E/\operatorname{Ker} T \to F$ is a topological isomorphism $\Longleftrightarrow$ T is a composition of a natural projection and a topological isomorphism.
(ii) T is a coisometric operator $\Longleftrightarrow$ $\widetilde{T}$ is an isometric isomorphism $\Longleftrightarrow$ T is a composition of a natural projection and an isometric isomorphism. ∎
As an immediate corollary, do the following

Exercise 1°. (i) An operator is injective and isometric (respectively, coisometric) $\Longleftrightarrow$ it is weakly isometrically equivalent to some natural embedding (respectively, to some natural projection).
(ii) An operator is topologically injective (respectively, topologically surjective) $\Longleftrightarrow$ it is weakly topologically equivalent to some natural embedding (respectively, to some natural projection).
We now consider one-dimensional operators (i.e., operators with one-dimensional image; cf. Section 0.1), and give their useful characterization. Suppose E and F are two prenormed spaces. For each pair $f \in E^*$, $y \in F$ we denote by $y \bigcirc f : E \to F$ the mapping defined by the formula $x \mapsto f(x)\,y$.

Proposition 5. (i) $y \bigcirc f$ is a one-dimensional operator with the norm $\|y\|\,\|f\|$;
(ii) if F is a normed space, then every bounded one-dimensional operator from E to F has the form $y \bigcirc f$ for some $f \in E^*$ and $y \in F$;
(iii) one-dimensional operators are multiplied by the formula $(y \bigcirc f)(z \bigcirc g) = f(z)\,y \bigcirc g$.

Proof. If $T \in \mathcal{B}(E, F)$, and $\operatorname{Im}(T) = \operatorname{span}\{y\}$, then the mapping $f : E \to \mathbb{C}$, taking each vector x to the unique $\lambda$ such that $T(x) = \lambda y$, is obviously a linear functional on E. It immediately follows from $\|y\| \ne 0$ that this functional is bounded. Hence we have $T = y \bigcirc f$. The rest is clear. ∎
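The multiplication formula in Proposition 5(iii) can be verified by applying both sides to a vector. In the sketch below it is assumed that g is a bounded functional on some prenormed space D, that $z \in E$, and that x runs over D; the letter D is introduced here only for illustration.

```latex
% z \bigcirc g : D \to E sends x to g(x) z; then apply y \bigcirc f : E \to F
\[
  (y \bigcirc f)(z \bigcirc g)(x)
    = (y \bigcirc f)\bigl(g(x)\,z\bigr)
    = g(x)\,f(z)\,y
    = \bigl(f(z)\,y \bigcirc g\bigr)(x) .
\]
```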
We shall see later that every finite-dimensional bounded operator is a sum of several one-dimensional bounded operators; see Proposition 2. 1 .5.
Remark. In the literature the operator $y \bigcirc f$ is often denoted by $y \otimes f$; this is a hint that it can be identified with an elementary tensor in some tensor product. We shall discuss these things later in Section 2.7.
Here is another construction allowing us to build new operators. We recall the $l_p$-sums of prenormed spaces, discussed in Section 1.1.

Exercise 2. Suppose that for some set of indices $\Lambda$, $E_\nu$ and $F_\nu$; $\nu \in \Lambda$ are two families of prenormed spaces, and $T_\nu : E_\nu \to F_\nu$ is a family of bounded operators such that the numbers $\|T_\nu\|$ are jointly bounded. Then for each p; $1 \le p \le \infty$ there exists a bounded operator $T : \bigoplus_p\{E_\nu : \nu \in \Lambda\} \to \bigoplus_p\{F_\nu : \nu \in \Lambda\}$, well defined by the rule $(T(f))(\nu) = T_\nu(f(\nu))$, where $f \in \bigoplus_p\{E_\nu : \nu \in \Lambda\}$. Moreover, $\|T\| = \sup\{\|T_\nu\|;\ \nu \in \Lambda\}$. (This operator T is called the $l_p$-sum of the operators $T_\nu$; $\nu \in \Lambda$.)

Now let us meet a new important typical operator, a bounded projection, which is closely related with the categorical construction of a coproduct (see Section 0.6). First, we point out one special case of coproducts. Let $E_1$ and $E_2$ be prenormed spaces, $E_1 \oplus E_2$ their linear direct sum, and $i_1 : E_1 \to E_1 \oplus E_2 : x \mapsto (x, 0)$, $i_2 : E_2 \to E_1 \oplus E_2 : y \mapsto (0, y)$. Endow $E_1 \oplus E_2$ with a prenorm $\|\cdot\|_1$, by putting $\|(x, y)\|_1 := \|x\| + \|y\|$.

Proposition 6. The normed space $(E_1 \oplus E_2, \|\cdot\|_1)$ with the injections $i_1$ and $i_2$ is the coproduct of $E_1$ and $E_2$ in the category Nor. The same is true if we replace "normed" by "prenormed" and Nor by Pre.
Proof. Let F be an arbitrary normed space with the bounded operators $\varphi_k : E_k \to F$; $k = 1, 2$. Our aim is to show that there is a unique bounded operator $\psi$ making the following diagrams commutative ($k = 1, 2$):

[commutative triangle formed by $i_k$, $\varphi_k : E_k \to F$ and $\psi$; diagram not reproduced]

It is clear that there exists at least one linear operator with this property: $\psi : (x, y) \mapsto \varphi_1(x) + \varphi_2(y)$. It is bounded due to the inequalities
$$\|\psi(x, y)\| \le \|\varphi_1(x)\| + \|\varphi_2(y)\| \le C(\|x\| + \|y\|) = C\|(x, y)\|_1 .$$
The rest is clear. ∎

Remark. On reading this, you have guessed, of course, what the coproducts (and maybe products as well) of families of objects in Nor and in Pre are. But we think it is redundant to consider (co)products in these categories more carefully; we will do this in the context of the more important categories of Banach and Hilbert spaces.

Now we recall Proposition 0.6.3 describing, from different points of view, the situation where a linear space is decomposed into a direct sum of two other spaces. When going over from Lin to Nor, we obtain the following "enriched" version.

Proposition 7. Let $(E, \|\cdot\|)$ be a normed space, $E_1$ and $E_2$ subspaces of E, $i_1$ and $i_2$ the natural injections of these spaces into E. The following statements are equivalent:
(i) E as a linear space is decomposed into a direct sum of $E_1$ and $E_2$, and the initial norm $\|\cdot\|$ is equivalent to the norm $\|\cdot\|'$ (well-)defined by the equality $\|y + z\|' := \|y\| + \|z\|$ for $y \in E_1$, $z \in E_2$;
(ii) E, together with the injections $i_1$ and $i_2$, is a coproduct of $E_1$ and $E_2$ in Nor;
(iii) the mapping of the space $(E_1 \oplus E_2, \|\cdot\|_1)$ to E, taking a pair $(y, z)$ to $y + z$, is a topological isomorphism.
All this remains true after replacing "normed" by "prenormed" and Nor by Pre.

Proof. The proof of Proposition 0.6.3 holds in this situation of (pre)normed spaces; we should only make some amendments to reflect the richer structure of the latter ones.
(i) $\Longrightarrow$ (ii). If F is an "outside" space, and $\varphi_k : E_k \to F$; $k = 1, 2$ are bounded operators, then the linear operator $\psi : E \to F$ that takes $y + z$; $y \in E_1$, $z \in E_2$ to $\varphi_1(y) + \varphi_2(z)$ is well defined. If we take $C > 0$ such that $\|\cdot\|' \le C\|\cdot\|$ and put $K := \max\{\|\varphi_1\|, \|\varphi_2\|\}$, we see that
$$\|\psi(y + z)\| \le \|\varphi_1\|\,\|y\| + \|\varphi_2\|\,\|z\| \le K\|y + z\|' \le KC\|y + z\| .$$
(ii) $\Longrightarrow$ (iii) follows from Proposition 6 and the uniqueness Theorem 0.6.1 (applied to coproducts in Nor).
(iii) $\Longrightarrow$ (i) is clear. ∎

Definition 1. If the (equivalent) conditions of the previous proposition are fulfilled, we say that E is decomposed into the topological direct sum of the subspaces $E_1$ and $E_2$. In this case the subspace $E_2$ is called a topological direct complement of the space $E_1$ in E. A subspace that has a topological direct complement is said to be topologically complemented.

Advanced readers will certainly agree that, from the point of view of this definition, Proposition 7 can be interpreted in the following category-theoretic form.

Exercise 3. A subspace $E_1$ of a (pre)normed space E is topologically complemented $\Longleftrightarrow$ the natural injection $\mathrm{in} : E_1 \to E$ is a coretraction in Nor (or in Pre) $\Longleftrightarrow$ the natural projection $\mathrm{pr} : E \to E/E_1$ is a retraction in Nor (or in Pre).
Hint. If j is a left inverse for in, then its kernel is a topological direct complement of E1 in E. The image of the right inverse to pr plays the same role.
The reader who has done Exercise 0. 1.2 knows that every subspace of a linear space has a linear complement. But not every subspace of a normed space is topologically complemented. Here is a simple (and rather primitive) necessary condition.
Proposition 8. Each topologically complemented subspace (and, as a corollary, every topological direct complement of it) is closed.

Proof. This follows from Proposition 7(i) and from the obvious fact that, in the notation used there, $E_1$ is closed in E with respect to the norm $\|\cdot\|'$. ∎
It is much more interesting that, in general, being closed does not imply being topologically complemented; see Theorem 2.3.4.

Finally, we have come close to the bounded projections we promised before. As we show now, this notion is equivalent to the notion of decomposition of a space into the topological direct sum of subspaces. First, we recall the purely algebraic prototype.

Suppose a linear space E is decomposed into a direct sum of subspaces F and G. Consider the mapping $P : E \to E$ associating to each vector x; $x = y + z$; $y \in F$, $z \in G$ the first summand y. It is easy to verify that P is a linear operator that is equal to its square (i.e., $P^2 := P \circ P$ coincides with P), or, as one says, is an idempotent operator.

Definition 2. This operator is called the projection of E onto F along G. The vector $P(x)$ is called the projection of the vector x onto F along G.

Thus, a direct sum decomposition gives rise to a certain projection. The converse is also true.

Proposition 9. Suppose P is an idempotent operator acting on a linear space E, $F := \operatorname{Im}(P)$, and $G := \operatorname{Ker}(P)$. Then E is decomposed into a direct sum of F and G, and P is a projection onto F along G.

Proof. Every $x \in E$ can be represented as $P(x) + (x - P(x))$, where $P(x) \in F$. It follows from $P^2 = P$ that $P(x - P(x)) = 0$. This means that $E = F + G$. Further, if $y \in F$, then $y = P(u)$ for some $u \in E$, and $P(y) = P^2(u) = P(u) = y$. Since $P(z) = 0$ for $z \in G$, we have $F \cap G = \{0\}$. The rest is clear. ∎

Let us now move from "pure" linear spaces to the (pre)normed ones.
Let E be a prenormed space. Then (i) if E is decomposed into a topological direct sum of subspaces F and G, then the projection P : E � E onto F along G is bounded; (ii) if P is a bounded idempotent operator on E, then E is decomposed into a topological direct sum of F :== Im(P) and G :== Ker(P) .
Proposition 10.
Proof. (i) Due to Proposition 7(i) , there is a constant
each x E E; x == y + z ; y E F, z E G we have
C > 0 such that for
II P (x) ll == IIYII < IIYII + ll z ll < C ll x ll . (ii) The decomposition of E into a linear direct sum of F and G is guaranteed by the previous proposition. Further, for each x E E ; x == y + z ; y E F, z E G we have IIYII + II z ll == II P (x) II + II ( I - P ) (x) ll < II P(x) II + II I - P II II x ll < (1 + 2 II P II ) II x ll • and thus we have verified the same condition (i) in Proposition 7. Corollary 1. A subspace F of a prenormed space E is topologically com plemented <====> there exists a bounded projection P : E � E having F as its zmage. Here are two illustrations: Exercise 4° . A diagonal operator T>.. on lp; 1 < p < oo is a projection <====> A consists of zeros and identities. An operator of multiplication by a
function $T_f$ in $L_p(X, \mu)$; $1 \le p \le \infty$ is a projection $\iff$ $f$ coincides almost everywhere with the characteristic function of a measurable subset in $X$.
Exercise 5*. Let $T$ be an integral operator on $L_2[a, b]$ with the kernel of the form $K(s, t) = \sum_{k=1}^{n} f_k(s) g_k(t)$; $f_k, g_k \in L_2[a, b]$ (such kernels are called degenerate). Suppose each of the families $\{f_1, \dots, f_n\}$ and $\{g_1, \dots, g_n\}$ is linearly independent. Then $T$ is a projection $\iff$ for every $k, l = 1, \dots, n$ we have $\int_a^b f_k(t) g_l(t)\,dt = \delta_{kl}$.
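The "if" direction of Exercise 5 can be sketched by a direct computation. This is only one half of the exercise, and it assumes the usual convention $(Tx)(s) = \int_a^b K(s, t) x(t)\,dt$ for an integral operator with kernel $K$.

```latex
% Sketch of sufficiency only; assumes (Tx)(s) = \int_a^b K(s,t) x(t) dt.
\begin{align*}
(Tx)(s) &= \sum_{k=1}^{n} f_k(s) \int_a^b g_k(t)\, x(t)\, dt, \\
(T^2 x)(s) &= \sum_{k=1}^{n} f_k(s) \int_a^b g_k(t)\, (Tx)(t)\, dt \\
           &= \sum_{k=1}^{n} \sum_{l=1}^{n} f_k(s)
              \Bigl( \int_a^b f_l(t)\, g_k(t)\, dt \Bigr)
              \Bigl( \int_a^b g_l(r)\, x(r)\, dr \Bigr) \\
% Now insert the biorthogonality condition \int_a^b f_l g_k = \delta_{lk}.
           &= \sum_{k=1}^{n} f_k(s) \int_a^b g_k(r)\, x(r)\, dr = (Tx)(s).
\end{align*}
```

The converse direction is where the linear independence of the two families is needed; it is left as in the exercise.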
Remark. Actually there are no other projections among the integral operators. But we shall be able to prove this only after establishing that these operators belong to the class of compact operators; see Exercise 3.3.5.
To conclude this section, we say a few words about mono- and epimorphisms in our categories. The categories of prenormed and normed spaces behave differently in this respect.
Proposition 11. In all four categories the monomorphisms are precisely the injective operators. At the same time, in the categories $\mathrm{Pre}$ and $\mathrm{Pre}_1$ the epimorphisms are precisely the surjective operators, and in the categories $\mathrm{Nor}$ and $\mathrm{Nor}_1$ the epimorphisms are the operators with dense image.
Proof. Suppose $\mathcal{K}$ is one of these categories. The proof of the fact that every injective morphism in $\mathcal{K}$ is a monomorphism repeats word-for-word the arguments in Proposition 0.5.5. If the morphism $T : E \to F$ in $\mathcal{K}$ is not injective, in other words, $E_0 := \mathrm{Ker}(T) \ne 0$, then for $0 : E_0 \to E$ and for the natural injection $\mathrm{in} : E_0 \to E$ we have $T \circ 0 = T \circ \mathrm{in}$, and at the same time $0 \ne \mathrm{in}$. Taking into account that $E_0$ is an object in $\mathcal{K}$, and $\mathrm{in}$ (being a contraction operator) is always a morphism in $\mathcal{K}$, this means that $T$ is not a monomorphism.
For $\mathcal{K} = \mathrm{Nor}$ or $\mathrm{Nor}_1$, the proof that a morphism in $\mathcal{K}$ with dense image is an epimorphism repeats word-for-word the corresponding argument for the category of metric spaces (see Proposition 0.5.7). For $\mathcal{K} = \mathrm{Pre}$ or $\mathrm{Pre}_1$, the proof that a surjective morphism is an epimorphism repeats word-for-word the corresponding arguments for the categories in Proposition 0.5.5.
Now for a morphism $T : E \to F$ in $\mathcal{K}$, denote by $F_0$ the closure of the image for $\mathcal{K} = \mathrm{Nor}$ or $\mathrm{Nor}_1$ and the image itself for $\mathcal{K} = \mathrm{Pre}$ or $\mathrm{Pre}_1$. Then for all these cases the space $F/F_0$ endowed with the quotient (pre)norm is an object in $\mathcal{K}$, and the natural projection $\mathrm{pr} : F \to F/F_0$, being a contraction operator, is a morphism in $\mathcal{K}$. We see that $0 \circ T = \mathrm{pr} \circ T$, and at the same time $0 \ne \mathrm{pr}$ for $F_0 \ne F$. This means that in the last case $T$ is not an epimorphism. $\blacksquare$
If you are brave enough, try to restore the proof of the following beautiful result, which gives an interpretation of topologically injective and topologically surjective operators in the language of category theory.
Exercise 6*. Let $\mathcal{K}$ be one of our categories, and $T : E \to F$ a morphism in $\mathcal{K}$. Then
(i) If $\mathcal{K} = \mathrm{Nor}$, then $T$ is an extreme monomorphism $\iff$ $T$ is a topologically injective operator with closed image, and $T$ is an extreme epimorphism $\iff$ $T$ is a topologically surjective operator.
(ii) If $\mathcal{K} = \mathrm{Nor}_1$, then $T$ is an extreme monomorphism $\iff$ $T$ is an isometric operator with closed image, and $T$ is an extreme epimorphism $\iff$ $T$ is a coisometric operator.
(iii) If $\mathcal{K} = \mathrm{Pre}$, then $T$ is an extreme monomorphism $\iff$ $T$ is a topologically injective operator, and $T$ is an extreme epimorphism $\iff$ $T$ is a topologically surjective operator.
(iv) If $\mathcal{K} = \mathrm{Pre}_1$, then $T$ is an extreme monomorphism $\iff$ $T$ is an isometric operator with closed image, and $T$ is an extreme epimorphism $\iff$ $T$ is a coisometric operator.
Hint. We only discuss why the extreme monomorphisms have a slightly different description in $\mathrm{Nor}$ and in $\mathrm{Pre}$. Let us introduce the following notation: $F_0$ is the image, and $F_1$ is the closure of the image of $T$; $T^k := T|^{F_k}$ and $\mathrm{in}_k$ is the natural injection of $F_k$ into $F$; $k = 0, 1$. The topological injectivity of $T$ is equivalent to the fact that $T^0$ is a topological isomorphism, and the same property combined with the completeness of $F_1$ is equivalent to the fact that $T^1$ is a topological isomorphism. Further, $T^0$ is an epimorphism in $\mathrm{Nor}$, as well as in $\mathrm{Pre}$, and $T^1$, generally speaking, is an epimorphism only in $\mathrm{Nor}$. From this we see that if $T$ is an extreme monomorphism in $\mathrm{Nor}$ (respectively, in $\mathrm{Pre}$), then the desired structure of this operator is guaranteed by the equality $T = (\mathrm{in}_1) T^1$ (respectively, $T = (\mathrm{in}_0) T^0$).
Consider now the converse statements. Suppose $T = SR$, $S_0 := \mathrm{Im}(S)$ and $S^0 := S|^{S_0}$. We have to verify that $R$ is a topological (and thus, a categorical) isomorphism in each of the following two situations: 1) $T^1$ is a topological isomorphism, and $R$ is an epimorphism in $\mathrm{Nor}$, and 2) $T^0$ is a topological isomorphism, and $R$ is an epimorphism in $\mathrm{Pre}$. In the first case $R$ is an operator with dense image, and therefore $F_0 \subset S_0 \subset F_1$; together with the indicated property of $T$ this gives $F_0 = S_0 = F_1$ and, consequently, $T^0 = S^0 R$. In the second case $R$ is surjective, and thus $F_0 = S_0$, and again we have $T^0 = S^0 R = T^1$. We see that in both cases $R$ has a left inverse, namely $(T^0)^{-1} S^0$, and this is an isomorphism due to the category-theoretic Proposition 0.5.4.
6. Functionals and the Hahn-Banach theorem
The reader remembers of course that a functional is an operator with the simplest range, $\mathbb{C}$ (or $\mathbb{R}$, if we consider real linear spaces). As a special case of Definition 3.1, we say that a functional $f : E \to \mathbb{C}$ (where $E$ is a prenormed space) is bounded if for some $C > 0$ we have $|f(x)| \le C\|x\|$ for all $x \in E$. Moreover, there is a least such constant $C$, which coincides with the number $\sup\{|f(x)| : x \in B_E\}$ (Proposition 3.1); it is called the norm of the functional $f$ and denoted by $\|f\|$. (The fact that this is a norm follows from Proposition 3.2.) The boundedness of a functional is equivalent to its continuity (Theorem 4.1).
We start the study of functionals with some instructive examples. Actually we do more: we describe all bounded functionals on some simple spaces, namely on spaces of sequences introduced in Section 1.
First, we make a preliminary purely algebraic observation. Namely, we describe all functionals on the linear space $c_{00}$ (of finitary sequences). Obviously, every $\eta \in \mathbb{C}^\infty$ (i.e., an arbitrary sequence) defines a functional $f_\eta : c_{00} \to \mathbb{C} : \xi \mapsto \sum_{n=1}^{\infty} \xi_n \eta_n$ (the sum makes sense because $\xi$ is finitary). Conversely, if we have a functional $f$ on $c_{00}$, then we can put $\eta_n := f(p^n)$, and we get a unique $\eta \in \mathbb{C}^\infty$ such that $f$ is equal to $f_\eta$.
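The converse step for $c_{00}$ can be written out as a short computation. It is a sketch that assumes, as the notation above suggests, that $p^n$ denotes the sequence with $1$ in the $n$-th place and $0$ elsewhere.

```latex
% Assumption: p^n = (0, ..., 0, 1, 0, ...) with 1 in the n-th place.
% A finitary sequence \xi with \xi_n = 0 for n > N is a finite combination:
\begin{align*}
\xi &= \sum_{n=1}^{N} \xi_n\, p^n, \\
% Linearity of f and the definition \eta_n := f(p^n) then give
f(\xi) &= \sum_{n=1}^{N} \xi_n\, f(p^n)
        = \sum_{n=1}^{N} \xi_n \eta_n
        = \sum_{n=1}^{\infty} \xi_n \eta_n = f_\eta(\xi).
\end{align*}
```

Uniqueness of $\eta$ follows in the same way, since $f_\eta(p^n) = \eta_n$ for every $n$.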
Now let us take a step from algebra to analysis, and consider various normed spaces of sequences in which $c_{00}$ is dense.
Exercise 1. Every sequence $\eta \in l_1$ determines a bounded functional $f_\eta$ on the space $c_0$ by the formula $\xi \mapsto \sum_{n=1}^{\infty} \xi_n \eta_n$, and, conversely, every bounded functional $f$ on $c_0$ is $f_\eta$ for some uniquely determined $\eta \in l_1$. The resulting bijection $I_0 : l_1 \to (c_0)^*$ is an isometric isomorphism of normed spaces. (Thus, "bounded functionals on $c_0$ are determined by elements of $l_1$", and, up to an isometric isomorphism, $(c_0)^* = l_1$.)
Hint. If $\eta$ is defined by the formula $\eta_n := f(p^n)$, then $\|\eta\|_1 \le \|f\|$. This can be seen from the action of $f$ on the elements of the form $\xi = (\lambda_1, \dots, \lambda_n, 0, 0, \dots)$ with $\lambda_k \in \mathbb{T}$ such that $\lambda_k \eta_k = |\eta_k|$.
Exercise 2. The same is true if we replace $c_0$ with $l_1$, $l_1$ with $l_\infty$, and the bijection $I_0 : l_1 \to (c_0)^*$ with $I_1 : l_\infty \to (l_1)^*$. (Thus, "bounded functionals on $l_1$ are determined by elements of $l_\infty$", and, up to an isometric isomorphism, $(l_1)^* = l_\infty$.)
Exercise 3. The same is true if we replace $c_0$ by $l_2$, $l_1$ by $l_2$, and the bijection $I_0 : l_1 \to (c_0)^*$ by $I_2 : l_2 \to (l_2)^*$. (Thus, "bounded functionals on $l_2$ are determined by elements of the very same $l_2$", and, up to an isometric isomorphism, $(l_2)^* = l_2$.)
Hint. The boundedness of the functionals defined by the indicated rule follows from the Cauchy-Bunyakovskii inequality. To show that $\eta$ generated by $f$ is square-summable, you can take sequences $\xi = (\bar\eta_1, \dots, \bar\eta_n, 0, 0, \dots)$ for different $n$.
You should also know about a series of more complicated examples, which will be given without proofs.
Proposition 1 (see [33, Theorem IV.8.1]). Let $(X, \mu)$ be a measure space, and $p \in [1, \infty)$. Let $q \in (1, \infty]$ be uniquely defined by the equality $1/p + 1/q = 1$ for $p > 1$, and let $q := \infty$ for $p = 1$. Then every function $y \in L_q(X, \mu)$ determines a bounded functional $f_y$ on the space $L_p(X, \mu)$ by the formula $x \mapsto \int_X x(t) y(t)\,d\mu(t)$, and, conversely, every bounded functional $f$ on $L_p(X, \mu)$ is $f_y$ for some uniquely defined $y \in L_q(X, \mu)$. The resulting bijection $I_p : L_q(X, \mu) \to L_p(X, \mu)^*$ is an isometric isomorphism of normed spaces.
Remark. The proof of this fact for $p = 2$ will be given later as a corollary of the general Riesz theorem concerning the structure of functionals on Hilbert spaces (see Proposition 2.3.10).
Let us note that for $p > 1$ the numbers $p$ and $q$ can be replaced by each other: $L_q(X, \mu)^*$ coincides, up to an isometric isomorphism, with $L_p(X, \mu)$. Certainly, in the case of the measure space $\mathbb{N}$ with the counting measure, Proposition 1 turns into a statement allowing us to identify functionals on $l_p$ with elements of $l_q$, i.e., into the result of Exercise 1 with $c_0$ replaced by $l_p$, $l_1$ by $l_q$, and the bijection $I_0 : l_1 \to (c_0)^*$ by $I_p : l_q \to (l_p)^*$. Exercises 2 and 3 correspond in this general scheme to the cases of $p = 1$ and $p = 2$.
Now let us note an important characterization of bounded functionals.
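The easy half of Proposition 1, namely that $f_y$ is bounded with $\|f_y\| \le \|y\|_q$, can be sketched with Hölder's inequality. That inequality is a standard fact and is assumed here; it is not stated in this section.

```latex
% Sketch of boundedness only; Hoelder's inequality is assumed as known.
\begin{align*}
|f_y(x)| = \Bigl| \int_X x(t)\, y(t)\, d\mu(t) \Bigr|
  &\le \int_X |x(t)|\, |y(t)|\, d\mu(t)
   \le \|x\|_p\, \|y\|_q, \\
\text{hence} \quad \|f_y\| &\le \|y\|_q .
\end{align*}
```

The substantial parts of the proposition are the reverse inequality and the fact that every bounded functional has this form, and these are not covered by this sketch.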
Proposition 2. A functional on a prenormed space is bounded $\iff$ its kernel is closed.
Proof. $\Longrightarrow$. This is a special case of Proposition 0.2.10.
$\Longleftarrow$. Suppose $f : E \to \mathbb{C}$ is our functional. If $f = 0$, then everything is proven. Otherwise, let us choose $x \notin \mathrm{Ker}(f)$. Then every $y \in E$ can be uniquely represented in the form $\lambda x + z$; $\lambda \in \mathbb{C}$, $z \in \mathrm{Ker}(f)$. Taking into account Proposition 1.1.4, we obtain for some $C > 0$ the estimate $|f(y)| = |\lambda|\,|f(x)| \le C\,|f(x)|\,\|y\|$. The rest is clear. $\blacksquare$
We note that the $\Longleftarrow$ part of this proposition cannot be extended to arbitrary operators even if both spaces are normed. An operator can have closed, even zero, kernel, and nevertheless be non-bounded. (Give an example!)
* * *
We now turn to one of the fundamental theorems of functional analysis. To appreciate its depth, we suggest that you first do the following exercise in pure algebra.
Exercise 4. Let $E$ be a linear space, $E_0$ its subspace, and $T_0 : E_0 \to F$ a linear operator with values in a linear space. Then there exists a linear operator $T : E \to F$ which is an extension of $T_0$ to all of $E$ (i.e., such that $T|_{E_0} = T_0$ or, in other words, the diagram formed by the natural injection $\mathrm{in} : E_0 \to E$ and the operators $T : E \to F$ and $T_0 : E_0 \to F$ is commutative: $T \circ \mathrm{in} = T_0$).
Hint. Let $E_1$ be an arbitrary linear complement of $E_0$ in $E$ (see Exercise 0.1.2). Then $T$ can be defined as the operator taking the vector $y + z$; $y \in E_0$, $z \in E_1$ to $T_0(y)$.
Now let us assume, in addition, that $E$ and $F$ are prenormed spaces, and $T_0$ is a bounded operator. Is there a bounded operator among all linear operators extending $T_0$ to all of $E$? It turns out that this is not necessarily true, but we will discuss it later (see Proposition 2.3.11). However, as we will soon show, if $F = \mathbb{C}$, i.e., in the case of a functional, a bounded extension always exists. But this fact is far from being simple.
The following theorem is one of the few in these lectures for which we consider the real (not complex) numbers as the base field of scalars.
Theorem 1 (Hahn-Banach). Let $E$ be a real prenormed space, $E_0$ a subspace of $E$, and $f_0 : E_0 \to \mathbb{R}$ a bounded real linear (i.e., $\mathbb{R}$-linear) functional. Then there exists a bounded real linear functional $f : E \to \mathbb{R}$ extending $f_0$ and such that $\|f\| = \|f_0\|$.
Proof. The result is clear in the simple case where $\|f_0\| = 0$. If $\|f_0\| \ne 0$, we can replace $f_0$ with $f_0/\|f_0\|$ and assume that $\|f_0\| = 1$. The main role in the proof belongs to the following lemma.
Lemma. The Hahn-Banach theorem is true if $\mathrm{codim}_E E_0 = 1$.
Proof. Let us choose $x \in E \setminus E_0$. Obviously, for arbitrary $z_1, z_2 \in E_0$ the following chain of inequalities holds: $f_0(z_2 - z_1) \le \|z_2 - z_1\| = \|(x + z_2) - (x + z_1)\| \le \|x + z_2\| + \|x + z_1\|$. This implies $-\|x + z_1\| - f_0(z_1) \le \|x + z_2\| - f_0(z_2)$, i.e., for the numbers $C_1 := \sup\{-\|x + z_1\| - f_0(z_1) : z_1 \in E_0\}$ and $C_2 := \inf\{\|x + z_2\| - f_0(z_2) : z_2 \in E_0\}$ we have $C_1 \le C_2$. Take $c \in \mathbb{R}$ such that $C_1 \le c \le C_2$. (This number is defined uniquely if $C_1 = C_2$, and can be arbitrarily taken in the interval $[C_1, C_2]$ if $C_1 < C_2$; a particular choice of $c$ is not important.) Obviously, for all $z \in E_0$ we have $-\|x + z\| - f_0(z) \le c \le \|x + z\| - f_0(z)$, or, equivalently, $|c + f_0(z)| \le \|x + z\|$.
Now let us take an arbitrary $y \in E$. From the hypotheses of the lemma it follows that $y$ has the form $\lambda x + z$ with uniquely defined $\lambda \in \mathbb{R}$ and $z \in E_0$. Put $f : E \to \mathbb{R} : y \mapsto \lambda c + f_0(z)$; obviously, this gives a well-defined linear functional extending $f_0$. If $\lambda \ne 0$, then $|f(y)| = |\lambda|\,\bigl|c + f_0\bigl(\tfrac{1}{\lambda} z\bigr)\bigr|$, and with the previous inequality this yields $|f(y)| \le |\lambda|\,\bigl\|x + \tfrac{1}{\lambda} z\bigr\| = \|\lambda x + z\| = \|y\|$. If $\lambda = 0$, then the inequality $|f(y)| \le \|y\|$ immediately follows from the fact that $\|f_0\| = 1$. Combining both cases we see that $f$ is bounded and $\|f\| \le 1$.
Since, by extending an operator (here: a functional), we cannot reduce its norm, we obtain that $\|f\| = 1$. $\blacksquare$
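To see the numbers $C_1$ and $C_2$ from the proof of the lemma in a concrete case, here is a worked sketch. The space, the subspace and the vector $x$ are assumptions chosen for this illustration: the real plane with the norm $\|(s, t)\| = |s| + |t|$, the first coordinate axis as $E_0$, and $f_0(s, 0) := s$.

```latex
% Illustrative choice (not from the source): E = R^2 with ||(s,t)|| = |s| + |t|,
% E_0 = {(s,0)}, f_0(s,0) = s, so ||f_0|| = 1, and x = (0,1).
\begin{align*}
\|x + z\| &= |s| + 1 \quad \text{for } z = (s, 0), \\
C_1 &= \sup_{s \in \mathbb{R}} \bigl( -|s| - 1 - s \bigr) = -1
      \quad (\text{attained for every } s \le 0), \\
C_2 &= \inf_{s \in \mathbb{R}} \bigl( |s| + 1 - s \bigr) = 1
      \quad (\text{attained for every } s \ge 0), \\
% Every c in [-1, 1] gives a norm-preserving extension:
f(s, t) &= s + c\,t, \qquad |f(s, t)| \le |s| + |c|\,|t| \le |s| + |t| .
\end{align*}
```

Here $C_1 < C_2$, so the extension is not unique; each $c \in [-1, 1]$ yields a different extension of norm $1$.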
The general case. To deduce the general case from the lemma, we shall make a ritual dance around Zorn's lemma. Consider the set $M$ of pairs of the form $(E_1, f_1)$, where $E_1$ is a subspace in $E$, and $f_1 : E_1 \to \mathbb{R}$ a functional extending $f_0$ and having the same norm. Introduce an order in this set (see Definition 0.1.1): $(E_1, f_1) \prec (E_2, f_2)$ if $E_1 \subset E_2$ and $f_2|_{E_1} = f_1$; the properties of an ordered set are easily verified. Let $M_0 = \{(E_\nu, f_\nu);\ \nu \in \Lambda\}$ be a linearly ordered subset in $M$. Put $E_\infty := \bigcup\{E_\nu : \nu \in \Lambda\}$; clearly, it is a subspace in $E$. Further, for each $\nu \in \Lambda$ and $x \in E_\nu$ we put $f_\infty(x) := f_\nu(x)$; evidently, this gives a well-defined linear functional $f_\infty : E_\infty \to \mathbb{R}$. Then, as is easily seen, the pair $(E_\infty, f_\infty)$ belongs to $M$ and is an upper bound of the set $M_0$. Thus, $M$ satisfies the hypotheses of Zorn's lemma, and hence contains a maximal element; let us denote it by $(E', f')$.
It remains to show that $E'$ coincides with the whole $E$. Suppose this is not true; then we can take $x \in E \setminus E'$ and put $E'' := \{\lambda x + z : \lambda \in \mathbb{R},\ z \in E'\}$. Consider the above lemma for the case where $E$, $E_0$, and $f_0$ are respectively our $E''$, $E'$, and $f'$. Translating the lemma into the language of our ordered set $M$, we obtain that there exists a pair $(E'', f'')$ belonging to $M$ such that $(E', f') \prec (E'', f'')$. Since these two pairs are different, we obtain a contradiction with the fact that $(E', f')$ is a maximal element in $M$. The rest is clear. $\blacksquare$
Remark. If $E$ is separable, then we can prove the Hahn-Banach theorem without appealing to Zorn's lemma; see Exercise 2.1.3.
Thus, we have constructed a norm-preserving extension of a functional. But is it unique? This question has different answers in different situations.
Exercise 5. Let $E$ be the real space $l_p^2$ (i.e., the Cartesian plane). Let $E_0$ be a line (i.e., a one-dimensional subspace) in $E$, and $f_0$ a functional of norm $1$ on $E_0$. Then $f_0$ has many norm-preserving extensions in the following cases: (a) $p = 1$ and $E_0$ is one of the coordinate axes; (b) $p = \infty$, and $E_0$ is one of the two main diagonals. In all other cases (including $1 < p < \infty$ and an arbitrary $E_0$) $f_0$ has only one norm-preserving extension.
Hint. Suppose $C_1$ and $C_2$ are the numbers in the proof of the previous lemma. Then in cases (a) and (b) the first number is less than the second, and in all other cases these numbers are equal.
We shall inform the curious reader (without giving a proof) that for an arbitrary normed space $E$ everything depends on what the unit sphere in the adjoint space $E^*$ is. If it does not contain an interval (like, say, the sphere in $L_p(X, \mu)$ does not when $1 < p < \infty$), then each bounded functional on each subspace in $E$ has a unique norm-preserving extension. If such intervals exist (as in the case of $L_1(X, \mu)$ or $L_\infty(X, \mu)$), then some functionals have many norm-preserving extensions. For details see, e.g., [11].
In the Hahn-Banach theorem the question was about prenorms and extensions of functionals. But the same theorem can be formulated in the language that corresponds, to some extent, to the geometric point of view.
Let $E$ be a (real) linear space. As is often done in linear algebra, by a hyperplane in $E$ we mean an arbitrary shift of a subspace of codimension $1$, i.e., the set of the form $D := \{y + z \in E\}$, where $y$ is a vector from $E$, and $z$ runs over a subspace of codimension $1$. Hyperplanes can also be described as level sets of linear functionals. Namely, for each linear functional $f : E \to \mathbb{R}$, $f \ne 0$, and $c \in \mathbb{R}$ we put $D_{f,c} := \{x \in E : f(x) = c\}$. The following exercise is elementary.
Exercise 6°.
(i) $D_{f,c}$ is always a hyperplane; it is a subspace $\iff$ $c = 0$;
(ii) every hyperplane in $E$ has the form of $D_{f,c}$ for some linear functional $f \ne 0$ on $E$ and $c \in \mathbb{R}$;
(iii) if $f, g$ are linear functionals on $E$, $f \ne 0$, $g \ne 0$, and $c, d \in \mathbb{R}$, then $D_{f,c} = D_{g,d}$ $\iff$ for some $\lambda \in \mathbb{R}$, $\lambda \ne 0$, we have $f = \lambda g$ and $c = \lambda d$.
Now let $M$ be a subset in $E$, and $D = D_{f,c}$ a hyperplane. We call $D$ a hyperplane of support for $M$ if either $\sup\{f(x) : x \in M\} = c$, or $\inf\{f(x) : x \in M\} = c$. Obviously, this definition makes sense, i.e., it does not depend on the choice of the pair $(f, c)$ such that $D = D_{f,c}$. Thus, to speak informally, the hyperplane of support is a limit of the parallel hyperplanes lying on one side of $M$.
Now let $E$ be a prenormed space, and $f$ a non-zero bounded functional on $E$.
Exercise 7°. The norm of $f$ is equal to $1$ $\iff$ $D_{f,1}$ is a hyperplane of support for the unit ball in $E$.
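A concrete instance of Exercise 7 may help to fix the picture. The example below is an assumption made for illustration: the real plane with the Euclidean norm and the coordinate functional $f(x_1, x_2) := x_1$.

```latex
% Illustrative example (not from the source): E = R^2 with the Euclidean norm.
\begin{align*}
f(x_1, x_2) &:= x_1, \qquad
\|f\| = \sup\{\, |x_1| : x_1^2 + x_2^2 \le 1 \,\} = 1, \\
\sup\{\, f(x) : x \in B_E \,\} &= 1,
\quad \text{so } D_{f,1} = \{ x : x_1 = 1 \} \text{ is a hyperplane of support for } B_E . \\
% A functional of norm 2 for comparison:
g(x_1, x_2) &:= 2 x_1, \qquad
\sup\{\, g(x) : x \in B_E \,\} = 2, \quad \inf\{\, g(x) : x \in B_E \,\} = -2, \\
&\text{so } D_{g,1} = \{ x : x_1 = 1/2 \} \text{ is not a hyperplane of support for } B_E .
\end{align*}
```

Geometrically, $D_{f,1}$ is the tangent line to the unit disc at $(1, 0)$, while $D_{g,1}$ cuts through the disc.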
In the light of what has been said, the Hahn-Banach theorem becomes equivalent to the following "geometric" proposition.
Exercise 8°. Let $E$ be a prenormed space, $E_0$ a subspace in $E$, and $D^0$ a hyperplane of support of the unit ball in $E_0$. Then there exists a hyperplane of support for the unit ball in $E$ containing $D^0$.
Now armed with the knowledge concerning real spaces, we can go back to spaces over $\mathbb{C}$. Traditionally, the following theorem is again called Hahn-Banach,$^7$ and we shall do the same in further references; no confusion will occur because of this.
Theorem 2. The assertion of Theorem 1 remains true if we replace the word "real" with the word "complex" and the ground field $\mathbb{R}$ with $\mathbb{C}$.
Proof. Our $E$, now being a complex linear space, at the same time is a real linear space, and $E_0$ is its real subspace. Put $g_0 : E_0 \to \mathbb{R} : x \mapsto \mathrm{Re} f_0(x)$; obviously, this is a real linear functional. Since $|g_0(x)| \le |f_0(x)|$, $g_0$ is bounded, and $\|g_0\| \le \|f_0\|$. Applying the previous theorem, we obtain a bounded real-linear (i.e., $\mathbb{R}$-linear) functional $g : E \to \mathbb{R}$ extending $g_0$ such that $\|g\| = \|g_0\|$.
Now put $f : E \to \mathbb{C} : x \mapsto g(x) - i g(ix)$. Clearly, it is a real linear operator between $E$ and $\mathbb{C}$ as real linear spaces, and $f(ix) = i f(x)$ for all $x \in E$. From this we have that $f$ is a complex linear (i.e., $\mathbb{C}$-linear) functional. Take $x \in E_0$; then $f(x) = \mathrm{Re} f_0(x) - i\,\mathrm{Re} f_0(ix)$, and, as $f_0$ is a complex functional, $f(x) = \mathrm{Re} f_0(x) - i\,\mathrm{Re}(i f_0(x))$. Taking into account that $\mathrm{Re}(iz) = -\mathrm{Im}\, z$ for each $z \in \mathbb{C}$, we have $f(x) = \mathrm{Re} f_0(x) + i\,\mathrm{Im} f_0(x) = f_0(x)$. Hence, $f$ is an extension of $f_0$ to $E$.
It remains to look at the norms. If $x$ is such that $f(x) \ne 0$, and $y := \frac{|f(x)|}{f(x)}\, x$, then $\|y\| = \|x\|$ and $f(y) \in \mathbb{R}$, hence $f(y) = g(y)$. This means that all the numbers of the form $|f(x)|$; $x \in B_E$ have the form $|g(y)|$; $y \in B_E$. This implies that $f$ is bounded, and $\|f\| \le \|g\|$; at the same time $\|g\| = \|g_0\| \le \|f_0\| \le \|f\|$. Thus $\|f_0\| = \|f\|$. $\blacksquare$
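The identity $f(ix) = i f(x)$ used in the proof of Theorem 2 can be checked directly from the definition $f(x) := g(x) - i g(ix)$ and the real linearity of $g$. The computation is spelled out here as a sketch.

```latex
% Uses only f(x) := g(x) - i g(ix) and the R-linearity of g.
\begin{align*}
f(ix) &= g(ix) - i\, g(i \cdot ix) = g(ix) - i\, g(-x) = g(ix) + i\, g(x), \\
i f(x) &= i \bigl( g(x) - i\, g(ix) \bigr) = i\, g(x) + g(ix), \\
\text{hence} \quad f(ix) &= i f(x) \quad \text{for all } x \in E .
\end{align*}
```

Together with real linearity this gives $f((a + ib)x) = (a + ib) f(x)$ for all real $a, b$, which is complex linearity.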
The Hahn-Banach theorem is an old reliable weapon in functional analysis, and we will shoot this gun many times in the book.
$^7$ Actually, Hahn and Banach did not consider the "complex" case. Theorem 2 was deduced from Theorem 1 by G. A. Suhomlinov and independently by H. F. Bohnenblust and A. Sobczyk. But names of mathematical statements live their own strange lives.
Here is the first application. Suppose $E \ne 0$ is a normed space. Let us ask an innocent question: what guarantees that there are non-zero bounded functionals on $E$? In other words, does $E \ne 0$ always imply $E^* \ne 0$? The answer is positive, but it is apparently impossible to establish this fact without the Hahn-Banach theorem. (At least, nobody managed to do this up to now.)
Theorem 3. Let $E$ be a prenormed space, and $x$ a vector such that $\|x\| > 0$. Then there exists a bounded linear functional $f : E \to \mathbb{C}$ of norm $1$ such that $f(x) = \|x\|$.
Proof. Take $E_0 := \mathrm{span}\{x\}$ and choose the functional $f_0 : E_0 \to \mathbb{C} : \lambda x \mapsto \lambda\|x\|$; $\lambda \in \mathbb{C}$. Obviously, $\|f_0\| = 1$, and the Hahn-Banach theorem immediately gives a functional $f$ with the required properties. $\blacksquare$
Corollary 1. Let $E$ be a normed space, and $x$ and $y$ two different vectors in $E$. Then there is $f \in E^*$ such that $f(x) \ne f(y)$.
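The step from Theorem 3 to Corollary 1 is short. One way to write it down (a sketch, applying Theorem 3 to the difference $x - y$) is the following.

```latex
% E is normed and x \ne y, so \|x - y\| > 0 and Theorem 3 applies to x - y.
\begin{align*}
&\text{Choose } f \in E^* \text{ with } \|f\| = 1 \text{ and } f(x - y) = \|x - y\| . \\
&\text{Then } f(x) - f(y) = f(x - y) = \|x - y\| > 0,
 \quad \text{so } f(x) \ne f(y) .
\end{align*}
```

The normedness of $E$ is used exactly once, to guarantee $\|x - y\| > 0$; in a prenormed space two different vectors may be at zero distance from each other.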
This corollary is often expressed by saying that "there are sufficiently many bounded functionals on a normed space" (enough to distinguish the vectors). Here is a more general fact.
Exercise 9. Suppose $E$ is a prenormed space, $E_1$ is a closed subspace in $E$, and $x \in E \setminus E_1$. Then there exists a bounded functional $f : E \to \mathbb{C}$ such that $f|_{E_1} = 0$ and $f(x) \ne 0$.
Hint. Consider the bounded functional $\lambda x + y \mapsto \lambda$ on the subspace $\{\lambda x + y : y \in E_1,\ \lambda \in \mathbb{C}\}$. Then apply the Hahn-Banach theorem.
To realize why these results are substantial, consider for comparison one quite natural linear space in analysis endowed with a metric that is not generated by any norm.
Exercise 10*. Let $E$ be the linear space (of cosets) of functions on $[0, 1]$ measurable in the sense of Lebesgue. Endow it with the following metric: $d(x, y) := \int_0^1 \frac{|x(t) - y(t)|}{1 + |x(t) - y(t)|}\,dt$ (check that convergence with respect to this metric is the convergence in measure). Then there is no non-zero linear functional on $E$ continuous with respect to this metric.
We indicate another useful corollary of the Hahn-Banach theorem.
Proposition 3. Let $E$ be a normed space, and $E_0$ a finite-dimensional subspace of $E$. Then there exists a closed linear complement of $E_0$ in $E$.
Proof. We use induction on the dimension of $E_0$. If $\dim E_0 = 1$, and $E_0 = \mathrm{span}\{x\}$, then, by Theorem 3, there is $f \in E^*$ with $f(x) \ne 0$, and taking into account Proposition 2, we see that $\mathrm{Ker}(f)$ is the required complement. Now suppose this proposition is true for all subspaces (of all normed spaces) of dimensions up to $n$, and $\dim E_0 = n + 1$. Take arbitrary $x \in E_0 \setminus \{0\}$ and $f \in E^*$ with $f(x) \ne 0$. Put $E_1 := \mathrm{Ker}(f)$. Then $E_{01} := E_0 \cap E_1$ is a subspace in $E_1$ with $\dim E_{01} = n$. By the induction assumption, there is a linear complement, say, $F$, of $E_{01}$ in $E_1$, and $F$ is closed in $E_1$. Therefore, taking into account the closedness of $E_1$ in $E$, $F$ is closed in $E$. In addition $F + E_0 = F + E_{01} + \mathrm{span}\{x\} = E_1 + \mathrm{span}\{x\} = E$, and from $F \cap E_0 \subset E_1 \cap E_0$ and $F \cap E_0 \subset F$ it follows that $F \cap E_0 = 0$. The rest is clear. $\blacksquare$
Remark. Soon we shall prove (see Section 2.1) that every finite-dimensional subspace of a normed space is topologically complemented.
* * *
The Hahn-Banach theorem allows us to give a complete description of bounded functionals on one of the most important normed spaces, $C[a, b]$. It turns out that every such functional is an integral with respect to some complex measure, and its norm coincides with the variation of this measure.
To give the exact formulation and the proof of this result, we need some standard definitions and facts of real analysis (i.e., of the theory of measure and integral); cf. [9] or [106].
Let $\mu = \nu_1 - \nu_2 + i\nu_3 - i\nu_4$; $\nu_1, \dots, \nu_4 \ge 0$ be a complex measure on $[a, b]$ (see preparation to Example 1.11). Then for every piecewise continuous function $x : [a, b] \to \mathbb{C}$ (this class is sufficient) the number
$\int_a^b x(t)\,d\nu_1(t) - \int_a^b x(t)\,d\nu_2(t) + i\int_a^b x(t)\,d\nu_3(t) - i\int_a^b x(t)\,d\nu_4(t)$
is called the Lebesgue-Stieltjes integral of $x$ with respect to the complex measure $\mu$ and is denoted by $\int_a^b x(t)\,d\mu(t)$. This number, as can be easily shown, does not depend on the choice of the decomposition of $\mu$ into a linear combination of usual measures.
Further, suppose
Finally, we recall the space of measures $M[a, b]$ introduced in Example 1.11.
Theorem 4 (Riesz).$^8$ Every $\mu \in M[a, b]$ determines a bounded functional $f_\mu : C[a, b] \to \mathbb{C}$ by the rule $x \mapsto \int_a^b x(t)\,d\mu(t)$, and (the main point) every bounded functional on $C[a, b]$ is $f_\mu$ for some uniquely determined $\mu \in M[a, b]$. The resulting bijection $I : M[a, b] \to C[a, b]^*$ is an isometric isomorphism of normed spaces.
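Before the proof, a simple instance of the correspondence may be helpful. It is a sketch that assumes $M[a, b]$ contains the unit point mass $\delta_s$ at a point $s \in [a, b]$; the notation $\delta_s$ is introduced here for illustration only.

```latex
% Assumption: \delta_s denotes the unit mass at s, i.e. \delta_s(S) = 1 if
% s \in S and \delta_s(S) = 0 otherwise; its variation equals 1.
\begin{align*}
f_{\delta_s}(x) &= \int_a^b x(t)\, d\delta_s(t) = x(s), \\
|f_{\delta_s}(x)| &= |x(s)| \le \|x\|_\infty,
  \quad \text{with equality for } x \equiv 1, \\
\text{hence} \quad \|f_{\delta_s}\| &= 1 = \|\delta_s\| .
\end{align*}
```

Thus the functional of evaluation at a point corresponds to a point mass, and the two norms agree, as the theorem asserts in general.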
Proof. For a complex number $\lambda$, we denote by $\lambda^-$ the number $\bar\lambda/|\lambda| \in \mathbb{T}$ if $\lambda \ne 0$, and $0$ otherwise.
Take $\mu \in M[a, b]$; obviously, the mapping $f_\mu$ is well defined and is a linear functional. Further, for each $x \in C[a, b]$ we consider a sequence $x_m$; $m = 1, 2, \dots$ of step functions (i.e., linear combinations of characteristic functions of intervals) uniformly tending to $x$. From the definition of the variation of a complex measure (see preparation to Example 1.11) it evidently follows that $\bigl|\int_a^b x_m(t)\,d\mu(t)\bigr| \le \|x_m\|_\infty \|\mu\|$. Passing to the limit as $m \to \infty$, we obtain the estimate $|f_\mu(x)| \le \|x\|_\infty \|\mu\|$. Thus, the mapping $I$ in our theorem is well defined and is a contraction operator.
We now show that $I$ is injective. Suppose $I(\mu) = f_\mu = 0$. Take an interval $S \subset [a, b]$ and choose a uniformly bounded sequence $x_n$ of continuous functions that pointwise tend to the characteristic function $\chi_S$ of this interval. By the Lebesgue theorem on the passage to the limit under the integral (extended in an obvious way from usual measures to complex ones) the sequence $f_\mu(x_n)$ tends to $\int_a^b \chi_S(t)\,d\mu(t)$, i.e., to $\mu(S)$. Hence $\mu(S) = 0$. Since the interval $S$ was arbitrary, $\mu = 0$. Hence, $I$ is injective.
It remains to verify that every functional $f \in C[a, b]^*$ has the form of $f_\mu$ for some $\mu \in M[a, b]$, and $\|\mu\| \le \|f\|$.
Take a functional $f \in C[a, b]^*$ and, using the Hahn-Banach theorem (here is the culmination point of the proof), extend it to a bounded functional $\tilde f$ on $l_\infty([a, b])$ with the same norm. For each $t \in (a, b]$, denote by $\chi^t$ the characteristic function of the half-open interval $[a, t)$, and introduce the function $\varphi$ on $[a, b]$ by putting $\varphi(t) := \tilde f(\chi^t)$ for $t \in (a, b]$ and $\varphi(t) := 0$ for $t = a$.
Consider an arbitrary partition $T = \{a = t_0 < t_1 < \dots < t_n = b\}$ of our interval. The sum $\Sigma_T := \sum_{k=1}^{n} |\varphi(t_k) - \varphi(t_{k-1})|$ is obviously $\sum_{k=1}^{n} |\tilde f(\chi_k)|$, where $\chi_k$ denotes the characteristic function of the half-open interval $[t_{k-1}, t_k)$. Put $\lambda_k := \tilde f(\chi_k)^-$. Then $\Sigma_T$ coincides with $\tilde f\bigl(\sum_{k=1}^{n} \lambda_k \chi_k\bigr)$, i.e.,
$^8$ F. Riesz is an outstanding Hungarian mathematician, one of the founders of functional analysis. In this book we shall encounter Theorems 2.2.1, 2.3.2, and 3.2.2 belonging to Riesz.
with the value of the functional $\tilde f$ on some function with uniform norm $\le 1$. Hence, $\Sigma_T \le \|\tilde f\| = \|f\|$. Since the partition $T$ was arbitrary, $\varphi$ is a function of bounded variation, and its variation is not greater than $\|f\|$. Let $\mu$ be the complex measure generated by the function $\varphi$ in the way we described above. Then $\|\mu\|$ evidently coincides with the variation of $\varphi$. Hence $\|\mu\| \le \|f\|$.
It remains to show that $f = f_\mu$. Take $x \in C[a, b]$ and consider a sequence $x_n$ of step functions uniformly converging to $x$. Since $\tilde f$ is continuous, the sequence $\tilde f(x_n)$ tends to $\tilde f(x) = f(x)$. On the other hand, from the definition of the measure $\mu$ it easily follows that $\tilde f(x_n) = \int_a^b x_n(t)\,d\mu(t)$. Again using the Lebesgue theorem, we see that $\tilde f(x_n)$ tends to $\int_a^b x(t)\,d\mu(t) = f_\mu(x)$. Consequently, $f(x) = f_\mu(x)$. The rest is clear. $\blacksquare$
* * *
Now we discuss another application of the Hahn-Banach theorem. As we have already mentioned on a different occasion, when mathematicians begin to play with their toys, no one can stop them. Given a prenormed space $E$, we defined $E^*$. Now we can forget about the origin of $E^*$ and apply the same construction to $E^*$, i.e., consider the space $E^{**} := (E^*)^*$ of bounded functionals on $E^*$. This is a normed (see Section 1.3) space, and it is called the second dual to $E$.
The main observation is that every vector $x \in E$ defines a functional $\alpha_x^E$ on $E^*$ (i.e., a vector from $E^{**}$) by the rule $\langle \alpha_x^E, f \rangle := \langle f, x \rangle$ (recall that $\langle \alpha_x^E, f \rangle$ is another notation, more convenient here, for $[\alpha_x^E](f)$). A mapping $\alpha^E : E \to E^{**} : x \mapsto \alpha_x^E$ appears. We shall not write the upper index if it is clear which $E$ we are talking about. Functionals on $E^*$ of the form $\alpha_x$; $x \in E$ are called evaluating functionals in what follows.
Proposition 4. For every prenormed space $E$ the corresponding mapping $\alpha$ is an isometric operator.
Proof. Clearly, $\alpha$ is a linear operator, and the estimate $|\langle \alpha_x, f \rangle| \le \|x\|\,\|f\|$ shows that it is a contraction. Further, taking for each $x \in E$ the element $f \in B_{E^*}$ such that $|f(x)| = \|x\|$ (Theorem 3), we see that $\|\alpha_x\| \ge \|x\|$. The rest is clear. $\blacksquare$
The operator $\alpha : E \to E^{**}$ is called the canonical isometric operator (for $E$). If $E$ is a normed space, then, of course, $\alpha$ is injective, and hence implements an isometric isomorphism between $E$ and its image in $E^{**}$. In this case $\alpha$ is also called the canonical injection of $E$ into $E^{**}$: the identification of $E$ and $\mathrm{Im}(\alpha)$ with the help of $\alpha$ allows us to regard $E$ as a subspace in $E^{**}$.
Now it is natural to distinguish the following class of normed spaces.
Definition 1. A normed space $E$ is called reflexive if the operator $\alpha : E \to E^{**}$ is surjective (or, what is equivalent in this context, is an isometric isomorphism).
Thus, $E$ is reflexive if every bounded functional on $E^*$ is an evaluating functional. Let us immediately say that every finite-dimensional space is reflexive. This will be shown in the next section, and now we pass to more interesting examples and counterexamples.
Proposition 5. The space $l_2$ and, more generally (here we take for granted Proposition 1), $L_p(X, \mu)$, where $(X, \mu)$ is an arbitrary measure space and $p \in (1, \infty)$, is reflexive. In particular, $l_p$ is reflexive for every $p \in (1, \infty)$.
Proof. Take $g \in L_p(X, \mu)^{**}$. Our goal is to find $x \in L_p(X, \mu)$ such that $g = \alpha(x)$, i.e., $\langle g, f \rangle = \langle f, x \rangle$ for every $f \in L_p(X, \mu)^*$. Consider the mapping $\tilde g : L_q(X, \mu) \to \mathbb{C} : y \mapsto \langle g, f_y \rangle$, where $f_y$ is the functional indicated in Proposition 1. Obviously, $\tilde g$ is a bounded functional on $L_q(X, \mu)$. By the same proposition (but with $q$ in the role of $p$), there exists $x \in L_p(X, \mu)$ such that $\langle \tilde g, y \rangle = \int_X x(t) y(t)\,d\mu(t)$ for all $y \in L_q(X, \mu)$. But the last integral is also the number $\langle f_y, x \rangle$. Hence, for the same $y$ we have $\langle g, f_y \rangle = \langle f_y, x \rangle$. It remains to apply Proposition 1 again and to recall that every $f \in L_p(X, \mu)^*$ has the form of $f_y$ for some $y \in L_q(X, \mu)$. $\blacksquare$
Note that the reflexivity of $L_p(X, \mu)$ for $p = 2$ is also a direct corollary of the future observation that all Hilbert spaces are reflexive (see Proposition 2.3.9).
Warning. In some books you can read the following "proof" of this result: Since, up to an isometric isomorphism, $L_p(X, \mu)^* = L_q(X, \mu)$ and $L_q(X, \mu)^* = L_p(X, \mu)$, we have $L_p(X, \mu)^{**} = L_q(X, \mu)^* = L_p(X, \mu)$ (and here the author rests his case, thinking that everything is done). Do not believe this proof! There are spaces $E$ for which there exists (some) isometric isomorphism between $E$ and $E^{**}$, but the canonical operator is not of that kind: it is not surjective, and thus such spaces $E$ are not reflexive. (The first such space was found by R. James in 1951 (cf. [33, p. 102]); in his example the operator $\alpha$ maps $E$ onto a subspace of codimension 1 in $E^{**}$, imagine this!) We will suggest much simpler counterexamples.

Exercise 11. The normed spaces $c_0$, $l_1$, and $C[a, b]$ are not reflexive.

Hint. Let $\tilde g_1 \in c_0^{**}$ be the functional taking $f_\eta$; $\eta \in l_1$ (see Exercise 1) to $\sum_{n=1}^{\infty} \eta_n$. Suppose $g$ is an arbitrary functional on $l_\infty$ vanishing on $c_0$ and taking the value 1 on $(1, 1, 1, \dots)$ (its existence is guaranteed by Exercise 9). Let $\tilde g_2 \in l_1^{**}$ be the functional taking $f_\eta$; $\eta \in l_\infty$ (see Exercise 2) to $g(\eta)$. Finally, let $t$ be a point in $[a, b]$, $h$ a functional on $M[a, b]$ associating to each measure its value on the singleton $\{t\}$, and $\tilde g_3 \in C[a, b]^{**}$ a functional taking $f_\mu$; $\mu \in M[a, b]$ (see Theorem 4) to $h(\mu)$. None of the elements $\tilde g_k$; $k = 1, 2, 3$ belongs to the image of the corresponding canonical operator.
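To illustrate the first part of the hint, here is a sketch of why the functional on $c_0^*$ that sends $f_\eta$ to $\sum_n \eta_n$ (written $\tilde g_1$ below) cannot be an evaluating functional. The sketch assumes that the identification of Exercise 1 has the form $f_\eta(\xi) = \sum_n \xi_n \eta_n$ for $\xi \in c_0$, $\eta \in l_1$, and it writes $p^m$ for the $m$th unit vector and $\alpha$ for the canonical operator; these conventions are assumptions made here for the computation.

```latex
% Suppose, to the contrary, that \tilde g_1 = \alpha(\xi) for some \xi \in c_0.
\[
  \sum_{n=1}^{\infty} \eta_n
  \;=\; \tilde g_1(f_\eta)
  \;=\; f_\eta(\xi)
  \;=\; \sum_{n=1}^{\infty} \xi_n \eta_n
  \qquad \text{for all } \eta \in l_1 .
\]
% Testing this identity on the unit vectors \eta = p^m forces every coordinate of \xi to be 1.
\[
  \eta = p^m \;\Longrightarrow\; \xi_m = 1 \quad (m = 1, 2, \dots),
  \qquad \text{so } \xi = (1, 1, 1, \dots) \notin c_0 ,
\]
% which contradicts \xi \in c_0.
```

The other two cases of the exercise follow the same pattern: one assumes that the functional is an evaluating one and tests the resulting identity on a suitable family of functionals.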
Remark. The spaces $L_1(X, \mu)$ and $L_\infty(X, \mu)$ are also not reflexive (unless these spaces are finite-dimensional). Our future spaces $C(\Omega)$, where $\Omega$ are infinite compact sets, are not reflexive either (see Section 3.1). But these facts require more complicated proofs. A very rough necessary condition of reflexivity, namely completeness, will be considered later (see Corollary 2.1.4).

Proposition 6. If $E$ is reflexive, then $E^*$ is reflexive as well.

Proof. Our goal is to show that every bounded functional $\varphi : E^{**} \to \mathbb{C}$ is defined by some $f \in E^*$ and acts by the rule $\varphi(x) = x(f)$; $x \in E^{**}$. Put $f := \varphi\alpha$, where $\alpha : E \to E^{**}$ is the canonical injection. By the assumption, every $x \in E^{**}$ has the form $\alpha y$ for some $y \in E$. Therefore, $\varphi(x) = \varphi(\alpha y) = f(y) = \alpha y(f) = x(f)$. •
We shall finish the discussion of reflexivity with the following rather curious criterion of this property.

Theorem 5 (for the proof, see, e.g., [108, p. 50]). A normed space $E$ is reflexive $\iff$ for every $f \in E^*$ there exists $x \in B_E$ such that $f(x) = \|f\|$. ("The upper bound in the definition of norm for a functional can be reached".)
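A small worked example may help to see the criterion of Theorem 5 at work on a non-reflexive space. The sketch below takes $E = c_0$ and one particular functional, chosen here purely for illustration; it assumes the standard identification of $c_0^*$ with $l_1$ (cf. Exercise 1), under which the norm of the functional is the $l_1$-norm of its defining sequence.

```latex
% A functional on c_0 whose norm is not attained on the unit ball.
\[
  f(\xi) := \sum_{n=1}^{\infty} 2^{-n}\, \xi_n \quad (\xi \in c_0),
  \qquad
  \|f\| = \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
% If \|\xi\| \le 1 and \xi_n \to 0, then |\xi_m| < 1 for some m, hence the inequality is strict:
\[
  |f(\xi)| \;\le\; \sum_{n=1}^{\infty} 2^{-n}\, |\xi_n|
  \;<\; \sum_{n=1}^{\infty} 2^{-n} \;=\; \|f\|
  \qquad \text{for every } \xi \in B_{c_0} .
\]
```

Thus no point of the unit ball of $c_0$ gives the value $\|f\|$, which agrees with the non-reflexivity of $c_0$ stated in Exercise 11.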
Should the question be about real linear spaces, this property would certainly mean that every supporting plane of a unit ball has at least one common point with the ball (cf. Exercise 7). One more criterion of reflexivity, formulated in terms of the so-called weak topology, will be given later (cf. Proposition 4.2.16 and Exercise 4.2.7).

Up to the end of this section we shall be speaking about linear spaces over the field of real numbers. For some applications it is useful that the Hahn-Banach theorem can be generalized by replacing prenorms with a more general class of functions.

Exercise 12. Let $E$ be a linear space and $p : E \to \mathbb{R}_+$ a function with the following properties (cf. Definition 1.1):
(i) $p(\lambda x) = \lambda p(x)$ for all $\lambda > 0$, $x \in E$, and
(ii) $p(x + y) \le p(x) + p(y)$ for all $x, y \in E$.
Now let $E_0$ be a subspace in $E$ and $f_0 : E_0 \to \mathbb{R}$ a linear functional such that $f_0(x) \le p(x)$ for all $x \in E_0$. Then there exists a linear functional $f : E \to \mathbb{R}$ extending $f_0$ and such that $f(x) \le p(x)$ for all $x \in E$.

Hint. The proof of Theorem 1 can be repeated in this situation almost literally. However, you should be careful in one place: when you choose the constant $c$ to prove the inequality $f(y) \le p(y)$; $y = \lambda x + z$ (see the lemma), you should consider separately the two cases $\lambda > 0$ and $\lambda < 0$.

We recall the almost tautological "geometric" interpretation of Theorem 1 expressed in Exercise 8. The generalization of this theorem given above admits a more substantial interpretation. Let $E$ be a linear space, $M$ and $N$ disjoint subsets of $E$, and $D = D_{f,c}$ a hyperplane. We say that $D$ separates these subsets if for each pair $(x \in M, y \in N)$ either $f(x) \le c \le f(y)$, or $f(x) \ge c \ge f(y)$. ("Our sets lie on different sides of $D$.") We say that a point $x$ of a subset $M$ in $E$ is linearly interior for this set if for each $y \in E$ the vector $x + ty$ belongs to $M$ for sufficiently small $t > 0$.

Exercise 13*. Let $M$ and $N$ be disjoint convex subsets of a (real) linear space $E$, and let $M$ contain a linearly interior point. Then there is a hyperplane in $E$ separating $M$ and $N$.

Hint. Take $K := M - N$ (i.e., the sum of $M$ and $(-1)N$; see Section 0.1). This set is convex, contains a linearly interior point, say, $y$, and does not contain 0. Apply Exercise 12 in the situation where $p$ is the Minkowski functional of the set $K - y$, $E_0$ is the straight line containing $y$, and $f_0 : E_0 \to \mathbb{R} : y \mapsto -1$. Then the arising functional $f$ is non-positive on $K$; hence $f(x - z) \le 0$ for all pairs $(x \in M, z \in N)$, and there exists $c \in \mathbb{R}$ such that $f(x) \le c \le f(z)$ for all these pairs. Now we can take $D := D_{f,c}$ as the desired hyperplane.

Note that the conditions of this exercise are automatically fulfilled if $E$ is a prenormed space and $M$ contains an interior point (this time in the sense of the topology of $E$); then this point is automatically linearly interior. The condition concerning the linearly interior point cannot be thrown away:

Exercise 14. In the space of real finitary sequences ("real $c_{00}$") consider the set $M$ consisting of those sequences for which the leading non-zero term is positive. Then $M$ and $-M$ are convex, but they cannot be separated by a hyperplane.
7. Invitation to quantum functional analysis

This entire section is addressed to "advanced" and, above all, curious readers. We try to give them an idea (on a very elementary level) of a new structure of functional analysis which has come to the fore in the last 20 years. If we look at this structure from a general point of view, its appearance reflects a further step in the triumphant progress of a new mathematical ideology. This view arose in the mathematical apparatus of quantum mechanics, and then flooded modern algebra and geometry. In functional analysis it settled down in the theory of operator algebras. In recent years the wave of this fashionable, so-called "non-commutative" or "quantum" mathematics has rolled up to the very foundations of functional analysis: now the very notion of norm has to be quantized.

What is the essence of this new mathematical ideology? Using the "incomprehensible, but impressive" style, we can say that its main dogma is as follows: quantum mathematics emerges from the classical one after replacing functions by operators. Let us try to clarify this. The outstanding role that is played in "classical" mathematics by functions
with their commutative (pointwise) multiplication, in "quantum" mathematics passes to operators with their non-commutative multiplication (composition). But it seems that the most important thing is the nature of this passage. It turns out (and this is what we should come to believe in) that fundamental notions and results of classical mathematics do have substantial quantum analogues or versions. We can say that these classical notions represent a small and hardly visible ("classical") part of a huge "quantum" iceberg. To comprehend all of this iceberg we should realize (guess?) how to reasonably replace functions lying in the foundation of these notions (results, methods, problems) with operators.

But all this is too general and vague. The question is how to perform such "quantization" in practice for a concrete notion taken from some area of mathematics (even assuming that this happens in an implicit or mediated form; as always, life is more complicated than a scheme). Often it is not clear in advance what to do, and different people can give you different suggestions. Ask, for example, what is a quantum group, and two specialists will give you three answers. However, gradually some conformity of ideas has been established in many fields of science. By now, for instance, the subject has taken shape in non-commutative measure theory, non-commutative topology (we say a few words about them in Section 6.3) and (though with less clarity) in non-commutative differential geometry. Among the extensive literature on these topics the book [21] by Connes is especially impressive. It is the main source for modern "non-commutative" (i.e., "quantum") mathematicians.

Now we leave mathematics as a whole and concentrate on what we consider here, the theory of normed spaces. What do people quantize in it and how? The following two statements serve as a "guide to action":
1. Classical functional analysis deals exclusively (do not be surprised!) with spaces of functions, and its main structure is the norm, moreover (do not be surprised again!), the uniform norm.
2. Quantum (sometimes one says "quantized") functional analysis deals with the spaces of operators (we will soon specify where they act), and its main structure is the so-called quantum norm. (Formally, this notion is different from that of a norm itself, and we shall also define it soon.)

Let us finally finish the sermon and give the precise meaning to the words we said. First, why are there no other normed spaces but function spaces? This is the reason:

Proposition 1. For each normed space $E$ there is an isometric operator $J_0$ from $E$ to $l_\infty(X)$, where $X$ is a set.

Proof. Take the unit sphere in $E^*$ as $X$ and assign to each $x \in E$ the function $J_0 x : X \to \mathbb{C} : f \mapsto f(x)$. Then for every $f \in X$ we have $|J_0 x(f)| \le \|f\| \|x\| = \|x\|$, and, by Theorem 6.3, there is $f$ such that $|J_0 x(f)| = \|x\|$. The rest is clear. •

Remark. Actually every separable normed space can be isometrically embedded into $l_\infty$ (i.e., already $X := \mathbb{N}$ will do) and even into the separable space $C[0, 1]$ (cf. 5° at the end of Section 4.2).
Thus we see that every normed space coincides (up to the most precise identification, isometric isomorphism) with (some) space of bounded functions endowed with the uniform norm. But what will be more important for us is that our spaces, being spaces of functions, automatically become spaces of operators.

Proposition 2. For every normed space $E$ there exists an isometric operator $J$ from $E$ to $\mathcal{B}(l_2(X))$, where $X$ is a set.
Proof. First, let us construct for a given set $X$ an operator from $l_\infty(X)$ to $\mathcal{B}(l_2(X))$. Assign to each $x \in l_\infty(X)$ the operator $T_x : l_2(X) \to l_2(X)$ taking a square-integrable function $y$ to $T_x y : t \mapsto x(t) y(t)$. (We see that $T_x$ is an immediate generalization of the diagonal operator on $l_2$; it becomes the latter for $X = \mathbb{N}$.) Clearly, $T_x$ is a bounded operator on $l_2(X)$ and $\|T_x\| = \|x\|_\infty$. From this it is clear that the mapping $J_1 : l_\infty(X) \to \mathcal{B}(l_2(X)) : x \mapsto T_x$ is an isometric operator. It remains to consider the composition $J := J_1 J_0$, where $J_0$ is the isometric operator in the previous proposition. •
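The equality $\|T_x\| = \|x\|_\infty$ used in this proof can be checked in two lines. The following sketch writes $\delta_t$ for the function on $X$ equal to 1 at the point $t$ and 0 elsewhere; this symbol is introduced here only for the computation.

```latex
% Upper estimate: multiplication by a bounded function does not increase the l_2-norm
% by more than the factor \|x\|_\infty.
\[
  \|T_x y\|^2 \;=\; \sum_{t \in X} |x(t)|^2\, |y(t)|^2
  \;\le\; \|x\|_\infty^2 \sum_{t \in X} |y(t)|^2
  \;=\; \|x\|_\infty^2\, \|y\|^2 .
\]
% Lower estimate: test the operator on the unit vectors \delta_t.
\[
  \|T_x \delta_t\| = |x(t)|, \quad \|\delta_t\| = 1
  \;\Longrightarrow\;
  \|T_x\| \;\ge\; \sup_{t \in X} |x(t)| \;=\; \|x\|_\infty .
\]
```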
Now we pass to the main notion of the discussed topic, the quantum normed space. To understand better the logic of what shall happen now, look again at the usual ("classical") normed space. Proposition 1 asserts the equivalence of the two approaches to this notion: it can be defined either in an "abstract way", as in Definition 1.1, or in a "concrete way", simply as a subspace of $l_\infty(X)$ with the inherited norm. The notion of quantized normed space also has these two faces, "abstract" and "concrete". Let us start with the abstract definition.

Let $E$ be a linear space. For each $n = 1, 2, \dots$ denote by $M_n(E)$ the linear space of $n \times n$ matrices with entries from $E$. For $x \in M_m(E)$ and $y \in M_n(E)$, we call the matrix in $M_{m+n}(E)$ of the form $\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}$ the direct sum of these matrices, and denote it by $x \oplus y$. If $x = (x_{kl}) \in M_n(E)$ and $\alpha = (\alpha_{kl})$ is a usual matrix of the same size (i.e., an element of $M_n = M_n(\mathbb{C})$), then, similarly to the standard matrix multiplication, we put
$$\alpha x := \Bigl( \sum_{i=1}^{n} \alpha_{ki} x_{il} \Bigr), \qquad x\alpha := \Bigl( \sum_{i=1}^{n} \alpha_{il} x_{ki} \Bigr) \in M_n(E).$$
Finally, when speaking about the norm of a usual matrix $\alpha \in M_n$, we will always have in mind its norm as an "operator on $\mathbb{C}_2^n$". In other words, the norm of the $n \times n$ matrix $(\lambda_{kl})$ is
$$\sup\Bigl\{ \Bigl( \sum_{k=1}^{n} \Bigl| \sum_{l=1}^{n} \lambda_{kl} \xi_l \Bigr|^2 \Bigr)^{1/2} : \xi_l \in \mathbb{C};\ \sum_{l=1}^{n} |\xi_l|^2 \le 1 \Bigr\} = \sup\Bigl\{ \Bigl| \sum_{k,l=1}^{n} \lambda_{kl} \xi_l \eta_k \Bigr| : \xi_l, \eta_k \in \mathbb{C};\ \sum_{l=1}^{n} |\xi_l|^2 \le 1,\ \sum_{k=1}^{n} |\eta_k|^2 \le 1 \Bigr\}$$
(cf. Proposition 4.8).
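As a quick check of the second expression for the norm, consider the $2 \times 2$ matrix with all entries equal to 1. This matrix is chosen here only as an illustration; matrices of this shape reappear in the hint to Exercise 4 of this section.

```latex
% The matrix (\lambda_{kl}) with \lambda_{kl} = 1 for k, l = 1, 2.
% The bilinear form factorizes, and the Cauchy-Bunyakovskii inequality bounds each factor by \sqrt{2}:
\[
  \Bigl| \sum_{k,l=1}^{2} \xi_l \eta_k \Bigr|
  \;=\; |\xi_1 + \xi_2| \cdot |\eta_1 + \eta_2|
  \;\le\; \sqrt{2} \cdot \sqrt{2} \;=\; 2
  \qquad \text{whenever } \sum_l |\xi_l|^2 \le 1,\ \sum_k |\eta_k|^2 \le 1 .
\]
% The bound is attained at \xi = \eta = (1/\sqrt{2}, 1/\sqrt{2}), so the norm equals 2.
\[
  \xi = \eta = \bigl( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \bigr)
  \;\Longrightarrow\;
  \sum_{k,l=1}^{2} \xi_l \eta_k = 2 .
\]
```

The same argument with $n$ in place of 2 shows that the $n \times n$ matrix consisting of ones has norm $n$.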
Definition 1 (Effros-Ruan, 1988). Suppose that for each $n \in \mathbb{N}$, $\|\cdot\|_n$ is a norm in $M_n(E)$. A sequence of norms $\|\cdot\|_n$ is called a quantum norm (the authors call it matrix-norm) in $E$ if it has the following properties:
(i) For each $x \in M_m(E)$ and $y \in M_n(E)$ we have $\|x \oplus y\|_{m+n} = \max\{\|x\|_m, \|y\|_n\}$.
(ii) For each $x \in M_n(E)$ and $\alpha \in M_n$ we have $\|\alpha x\|_n \le \|\alpha\| \|x\|_n$, $\|x\alpha\|_n \le \|x\|_n \|\alpha\|$.
A linear space $E$ endowed with a quantum norm is called a quantized normed space or, briefly, a quantum space. (We note that the authors used the term "abstract operator space".)

Later we shall use expressions like "quantum norm $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$" or "quantum space $(E; \{\|\cdot\|_n;\ n \in \mathbb{N}\})$" (with the obvious meaning). Thus, a quantum norm is a sequence of norms, and moreover, in different spaces. For the norm $\|\cdot\|_n$ we shall sometimes use the name the norm of the nth floor.

Remark. We can define the same notion by considering, instead of a sequence of norms $\|\cdot\|_n$ in $M_n(E)$, a norm in the space $M_\infty(E)$ of infinite matrices $(x_{kl})$; $k, l \in \mathbb{N}$ (with matrix elements from $E$) that have only a finite set of non-zero entries. We suggest that the reader formulate the requirements for such a norm replacing properties (i) and (ii) in Definition 1.
Exercise 1. Let $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$ be a quantum norm in $E$. Then for a matrix $x \in M_n(E)$ the norm $\|x\|_n$ does not change if we transpose some rows or columns in this matrix. This norm does not increase if we replace some rows or columns by zeroes.
Obviously, every subspace $F$ of a quantum space $E$ is also a quantum space with respect to the sequence of norms in $M_n(F)$ inherited from the enveloping space $M_n(E)$; we call it a quantum subspace in $E$.

Now, here is an important moment: we introduce quantum spaces that play the same role in quantum functional analysis as the $l_\infty(X)$ do in classical functional analysis. These are $\mathcal{B}(l_2(X))$ (where $X$ is again an arbitrary set) with the quantum norm which will be defined right now. The essence of what will happen can be briefly and informally expressed as follows: "a matrix consisting of operators is itself an operator".

Our goal is to define the norm in $M_n(\mathcal{B}(l_2(X)))$ for each $n$. For this purpose, denote by $nX$ the union of $n$ disjoint copies of the set $X$ (if you wish, the coproduct of $n$ copies of this set in Set; see Section 0.6). Consider the normed space $l_2(nX)$. It consists of square-integrable functions which can be conceived as the ordered families $x = (x_1, \dots, x_n)$ of functions from $l_2(X)$. Now for a given "operator-matrix" $a = (a_{kl} \in \mathcal{B}(l_2(X)))$ consider the operator $T_a$ acting on $l_2(nX)$ and taking $x = (x_1, \dots, x_n)$ to $\bigl( \sum_{l=1}^{n} a_{1l}(x_l), \dots, \sum_{l=1}^{n} a_{nl}(x_l) \bigr)$. (Thus, if you depict $x$ and $T_a x$ in the form of columns (i.e., $n \times 1$-matrices), then the column $T_a x$ is a symbolic product of the matrices $a$ and $x$: $T_a x = ax$.) Obviously, the operation $a \mapsto T_a$ is a linear isomorphism between $M_n(\mathcal{B}(l_2(X)))$ and $\mathcal{B}(l_2(nX))$. Hence, we can correctly define the norm $\|\cdot\|_n$ in the first of these spaces by taking the norm of the operator $T_a$ as $\|a\|_n$. We leave it to the reader to verify that the sequence $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$ is indeed a quantum norm in $\mathcal{B}(l_2(X))$.

Definition 2. A quantum subspace of a quantum space of the form $\mathcal{B}(l_2(X))$ is called a concrete quantized normed space or, briefly, a concrete quantum space.⁹ The quantum norm in a concrete quantum space is called standard.
Note that a quantum norm in a linear space $E$ automatically defines an "ordinary" norm: this is the norm "on the first floor", i.e., in $M_1(E) = E$. The normed space $(E, \|\cdot\|_1)$ will be called the underlying normed space of the corresponding quantum space.

Example 1. The simplest non-zero quantum space $\mathbb{C}$. If we identify the complex plane with $\mathcal{B}(l_2(X))$ for a singleton $X$ (i.e., speaking informally, with $\mathcal{B}(\mathbb{C})$), we make it a concrete quantum space. According to the general prescription we formulated above, the norm $\|\cdot\|_n$ in $M_n(\mathbb{C}) = M_n$ is determined after we identify the latter space with $\mathcal{B}(l_2(nX)) = \mathcal{B}(\mathbb{C}_2^n)$. Clearly, we obtain the very same norm of a matrix "as an operator on $\mathbb{C}_2^n$" that appeared in the definition of the quantum norm.

Definition 3. Suppose $\|\cdot\|$ is a norm in a linear space $E$. Its quantization is an arbitrary quantum norm $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$ in $E$ such that $\|\cdot\|_1 = \|\cdot\|$. A quantization of the normed space $E$ is an arbitrary quantum space such that its quantum norm is a quantization of the initial norm in $E$.
(Thus, a quantized normed space is what we obtain from a normed space after a quantization.) It is easy to see that for a normed space $E$ every isometric operator $J$ from $E$ to the operator space $\mathcal{B}(l_2(X))$ automatically "quantizes" $E$. Indeed, for every $n$ the linear operator $J_n : M_n(E) \to M_n(\mathcal{B}(l_2(X))) : (x_{kl}) \mapsto (Jx_{kl})$ arises, which clearly is injective. Hence, if for $\|(x_{kl})\|_n$ we take the above norm of the operator-matrix $(Jx_{kl})$, we get a well-defined norm $\|\cdot\|_n$ in $M_n(E)$. It is easy to verify that $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$ is a quantum norm in $E$ which is a quantization of the initial norm. We say that this quantum norm is generated by the isometric embedding $J$.

Now we see that the idea of Proposition 2 is that for every normed space there exists at least one quantization. It is important, however, that the same space can be quantized in many completely different ways. Let us give two substantial examples. In both cases the "geometrically best space" $l_2$ plays the role of $E$.

⁹ In deference to the creators of the theory we should have used their term "concrete operator space", but we wished to avoid confusion with the omnipresent word "operator".
Example 2. (Column quantization.) Consider the operator $J_c : l_2 \to \mathcal{B}(l_2)$ taking a sequence $\xi = (\xi_1, \xi_2, \dots)$ to the one-dimensional operator $J_c\xi : l_2 \to l_2$ which takes the first unit vector $p^1$ to $\xi$ and is equal to zero on all other unit vectors. (Thus, $\xi$ is mapped to the operator which in the basis of unit vectors is given by the matrix with $\xi$ as the first column and zeroes as other columns; this justifies the name of the procedure.) Evidently, $\|J_c\xi\| = \|J_c\xi(p^1)\| = \|\xi\|$. Thus, $J_c$ is isometric, and hence it generates "on the base of $l_2$" a quantum space, which will be denoted by $l_2^c$.

Example 3. (Row quantization.) Now consider the operator $J_r : l_2 \to \mathcal{B}(l_2)$ taking a sequence $\xi$ to the one-dimensional operator $J_r\xi : l_2 \to l_2$ which takes $p^n$ to $\xi_n p^1$, and hence every $\eta \in l_2$ to $\bigl( \sum_{k=1}^{\infty} \xi_k \eta_k \bigr) p^1$. (Thus, the matrix of the operator $J_r\xi$ has $\xi$ as the first row, and zeroes as the other ones.) It is easy to see that $\|J_r\xi\| = \|J_r\xi(\bar\xi / \|\xi\|)\| = \|\xi\|$. Hence, $J_r$ is isometric, and therefore generates "on the base of $l_2$" a quantum space denoted by $l_2^r$.
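The difference between the two quantizations is already visible on the second floor. The following sketch computes the norm of one and the same $2 \times 2$ matrix over $l_2$ with respect to both quantum norms. It writes $l_2^c$ and $l_2^r$ for the column and row quantizations of Examples 2 and 3 and $p^1, p^2$ for the first two unit vectors (notational assumptions made here), and it uses the rule that the norm of $(x_{kl})$ is the norm of the operator-matrix $(Jx_{kl})$ acting on pairs $(\xi^1, \xi^2)$ of vectors from $l_2$, where $\xi^1_k$ denotes the $k$th coordinate of $\xi^1$.

```latex
% The test matrix: first column (p^1, p^2), second column zero.
\[
  x = \begin{pmatrix} p^1 & 0 \\ p^2 & 0 \end{pmatrix} \in M_2(l_2) .
\]
% Row quantization: J_r(p^k) sends \xi^1 to \xi^1_k p^1.
\[
  (J_r x_{kl}) \begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix}
  = \begin{pmatrix} \xi^1_1\, p^1 \\ \xi^1_2\, p^1 \end{pmatrix},
  \qquad
  |\xi^1_1|^2 + |\xi^1_2|^2 \le \|\xi^1\|^2
  \;\Longrightarrow\; \|x\|_2 = 1 \ \text{in } M_2(l_2^r) .
\]
% Column quantization: J_c(p^k) sends \xi^1 to \xi^1_1 p^k.
\[
  (J_c x_{kl}) \begin{pmatrix} \xi^1 \\ \xi^2 \end{pmatrix}
  = \begin{pmatrix} \xi^1_1\, p^1 \\ \xi^1_1\, p^2 \end{pmatrix},
  \qquad
  2\,|\xi^1_1|^2 \le 2\,\|\xi^1\|^2
  \;\Longrightarrow\; \|x\|_2 = \sqrt{2} \ \text{in } M_2(l_2^c) .
\]
% Both bounds are attained at \xi^1 = p^1, \xi^2 = 0.
```

With $n$ in place of 2 the same computation gives the norms 1 and $\sqrt{n}$, so the two sequences of norms are not uniformly comparable on the higher floors.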
Later we shall see that these two quantizations of the space $l_2$ are, in a sense, profoundly different. Besides, they have nothing in common with the quantization constructed in the proof of Proposition 2 (with $l_2$ as $E$).

Among all possible quantizations of a given norm on $E$ there are two "extreme ones", denoted by $\{\|\cdot\|_n^{\max};\ n \in \mathbb{N}\}$ and $\{\|\cdot\|_n^{\min};\ n \in \mathbb{N}\}$. They have the property that for every quantization $\{\|\cdot\|_n;\ n \in \mathbb{N}\}$ of the same space we have $\|\cdot\|_n^{\min} \le \|\cdot\|_n \le \|\cdot\|_n^{\max}$ for all $n$. The first of these quantizations is said to be maximal, and the second, minimal; the corresponding quantized spaces, called respectively maximal and minimal, are denoted by $E_{\max}$ and $E_{\min}$. It turns out that the minimal quantization is precisely the one obtained in Proposition 2. (We shall not prove this.) It is easier to construct the maximal quantization. Consider the norm $\|\cdot\|_n^{\max} := \sup\{\|\cdot\|_n\}$ in $M_n(E)$, where the supremum is taken over the $n$th floor norms of all existing quantizations of the space $E$.

Exercise 2. The sequence $\{\|\cdot\|_n^{\max};\ n \in \mathbb{N}\}$ is a quantum norm in $E$ which is a maximal quantization of the initial norm.

A fundamental fact of quantum functional analysis is that there are no quantum spaces other than the concrete ones: each abstract quantum space can be "identified" with some concrete space. But what does this "identification" mean? The reader understands that this is a question about isomorphisms in some reasonably chosen category. As usual, when we are working with categories, "morphisms are more important than objects". Consequently, the basic notion to which we now turn is apparently more important than the very notion of quantum space.

Let $E$ and $F$ be quantum spaces, and $T : E \to F$ a linear operator. Then on each "floor" $n$ we have a linear operator $T_n : M_n(E) \to M_n(F)$ taking a matrix $x = (x_{kl} \in E)$ to $T_n x := (Tx_{kl} \in F)$.
(For the reader who is familiar with tensor products we note that under the identification of $M_n(E)$ with $M_n \otimes E$ and $M_n(F)$ with $M_n \otimes F$ the operator $T_n$ is $1 \otimes T$, where $1$ is the identity operator on $M_n$.) It is easy to see that if $T$ is a bounded operator "on the first floor", then the same is true for $T_n$ "on all floors". The crux is whether the norms of these operators are uniformly bounded.

Definition 4 (Wittstock-Paulsen, 1981). An operator $T$ is called completely bounded if it is bounded and (the main thing!) $\sup\{\|T_n\|;\ n \in \mathbb{N}\} < \infty$. (This supremum is denoted by $\|T\|_{cb}$.) Further, $T$ is said to be completely contractive (respectively, completely isometric) if $T_n$ is a contraction (respectively, an isometric operator) for each $n$.
If $J : E \to \mathcal{B}(l_2(X))$ is an isometric operator, $\mathcal{B}(l_2(X))$ is endowed with the standard quantum norm, and $E$ with the quantum norm generated by this $J$, then, by the definition of the latter, $J$ is an evident example of a completely isometric operator. (You can guess that behind the notation $\|T\|_{cb}$ there is the fact that the corresponding function is indeed a norm on the linear space $\mathcal{CB}(E, F)$ of all completely bounded operators from $E$ to $F$. Verify this.)

Remark. Completely bounded operators between spaces which themselves consist of operators appeared before quantum spaces. As Vern Paulsen and some other mathematicians showed, a series of important problems in functional analysis, being interpreted in a proper way, are questions on whether certain operators are completely bounded with respect to some quantizations of the corresponding spaces. This approach has led to substantial progress in these problems, and sometimes even to their complete solution. Perhaps the most impressive success in this direction is the negative solution, obtained by Pisier in 1995, of the well-known and old Halmos similarity problem. About all this see, e.g., [34, 35, 37].
Now we can introduce the basic categories of quantum functional analysis. Similarly to the "classical" analysis, there are two of them, and the choice depends on which quantum spaces are viewed as "the same". These are the categories QNor and QNor$_1$. Both have (abstract) quantum spaces as objects. But morphisms in the first category are all completely bounded operators, and in the second, completely contractive operators only.

Definition 5. Isomorphisms in QNor have a special name, complete topological isomorphisms, and isomorphisms in QNor$_1$ are called complete isometric isomorphisms. Quantum spaces with a complete topological (respectively, complete isometric) isomorphism between them are called completely topologically (respectively, completely isometrically) isomorphic.
From the definition it obviously follows that a complete topological isomorphism is precisely a completely bounded operator having a completely bounded inverse operator. At the same time a complete isometric isomorphism is an operator $I$ such that $I_n$ is an isometric isomorphism for every $n$. ("$I$ generates an isometric isomorphism on every floor".) Now we are ready to formulate the fundamental theorem we have promised before.

Theorem 1 (Ruan, 1988). Every abstract quantum space coincides, up to a complete isometric isomorphism, with a concrete quantum space.

The proof of this theorem is more complicated than the proof of its "classical" prototype (Proposition 1); see, e.g., [36, Theorem 2.3.5].

The Ruan theorem allows us to handle abstract quantum spaces as if they were concrete ones, which helps us to foresee many of their properties. On the other hand, if we accept the "abstract" point of view, we can concentrate our attention on the main structure, the quantum norm. This releases us from the dependence on the concrete embedding $J : E \to \mathcal{B}(l_2(X))$ that implements this structure. Thus, in working with operator spaces we can use a series of constructions leading to objects that formally do
not consist of operators. The Ruan theorem gives us confidence that these objects can be identified with operator spaces: it is sufficient to verify the axioms of Definition 1, and this usually is not very difficult.

(To avoid unsubstantiated statements, we mention only two of many such constructions. If $E$ is a quantum space, then its quotient space $E/F$ modulo a closed (under the norm of the first floor) subspace $F$, and the dual space $E^*$ are again quantum spaces: the quantum norm in $E/F$ is introduced by identification of $M_n(E/F)$ with $M_n(E)/M_n(F)$, and the quantum norm in $E^*$, by identification of $M_n(E^*)$ with $\mathcal{B}(E, M_n)$. Try to restore the details of both constructions; if you do not succeed, see [36].)

If we have a quantum space $(E, \{\|\cdot\|_n;\ n \in \mathbb{N}\})$, then we can "forget" all norms $\|\cdot\|_n$ but the first one, i.e., pass to the corresponding underlying normed space. In so doing we forget the special properties of completely bounded (or contractive) operators and consider them as just bounded (contractive) operators between the underlying spaces. Hence, two natural forgetful functors arise, D : QNor $\to$ Nor and D : QNor$_1$ $\to$ Nor$_1$.

But let us go back to morphisms of our categories. Is the requirement of complete boundedness indeed substantially new as compared with the requirement of usual boundedness? (If not, what is the use of all this science?) Consider the following important example where the column and the row quantizations of $l_2$ are compared.

Proposition 3. No topological isomorphism $I : l_2 \to l_2$ is completely bounded if regarded as an operator between $l_2^r$ and $l_2^c$. As a consequence, the quantum spaces $l_2^r$ and $l_2^c$ are not completely topologically isomorphic.
Proof. Put $\zeta^m = (\zeta^m_1, \zeta^m_2, \dots) := I(p^m)$; $m = 1, 2, \dots$. By Corollary 4.2, for some $c > 0$ we have $\sum_{n=1}^{\infty} |\zeta^m_n|^2 \ge c^2$ for all $m$. Take $n \in \mathbb{N}$ and consider the operator $I_n : M_n(l_2^r) \to M_n(l_2^c)$. Choose a matrix $x = (x_{kl})$ in $M_n(l_2^r)$ that has only one non-zero column, namely the first one, with $x_{k1} := p^k$; $k = 1, \dots, n$. Then the matrix $y := I_n x$ with entries $y_{kl} = I(x_{kl})$ also has only the first column non-vanishing, and $y_{k1} = \zeta^k$; $k = 1, \dots, n$.

To find the norms of the matrices $x \in M_n(l_2^r)$ and $y \in M_n(l_2^c)$, we should, according to the discussed procedure, look at the matrices constructed from the operators, $a := (J_r x_{kl})$ and $b := (J_c y_{kl})$ in $M_n(\mathcal{B}(l_2))$. In both matrices only the first column is non-vanishing. Further, the matrix $a$ has on the $(k, 1)$th place the operator taking $p^k$ to $p^1$, and taking the other unit vectors to zero. In other words, we have the operator corresponding to the elementary matrix with 1 on the $(1, k)$th place and zeroes on all the other places:
$$a_{k1} = \begin{pmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots \\ 0 & \cdots & 0 & 0 & 0 & \cdots \\ \vdots & & \vdots & \vdots & \vdots & \end{pmatrix}.$$
It is clear after the identification with the corresponding operator acting on $l_2(n\mathbb{N})$ that the matrix $a$ takes $(\xi^1, \dots, \xi^n)$ to $(\xi^1_1 p^1, \xi^1_2 p^1, \dots, \xi^1_n p^1)$. Certainly, this means that $\|x\|_n$, defined as $\|a\|_n$, is equal to 1. At the same time, the matrix $b$ has on the $(k, 1)$th place the operator taking $p^1$ to $\zeta^k$ and the other unit vectors to zero; in other words, this operator has the matrix with the sequence $\zeta^k$ as the first column, and zeroes on the other places:
$$b_{k1} = \begin{pmatrix} \zeta^k_1 & 0 & \cdots \\ \zeta^k_2 & 0 & \cdots \\ \vdots & \vdots & \end{pmatrix}.$$
Therefore, if we consider the concrete system $(p^1, 0, \dots, 0)$, we see that the matrix $b$, after its identification with the corresponding operator on $l_2(n\mathbb{N})$, takes this system to the system $(\zeta^1, \dots, \zeta^n)$. But the norm of the latter system in $l_2(n\mathbb{N})$ is $\sqrt{\sum_{k=1}^{n} \|\zeta^k\|^2} \ge \sqrt{n}\, c$. This means that $\|y\|_n$, defined as $\|b\|_n$, is not less than $\sqrt{n}\, c$. Thus, we see that the operator $I_n$ takes the matrix $x$ of norm 1 to the matrix $y$ of norm $\ge \sqrt{n}\, c$. Hence, $I$ is not completely bounded. •

Remark. Analyzing this proof, we see that to be completely bounded, an operator $T : l_2^r \to l_2^c$ must satisfy the condition $\sum_{k=1}^{\infty} \|T(p^k)\|^2 < \infty$. This means, as we will see in Section 3.4, that it must belong to the class of so-called Schmidt operators. Actually the class of completely bounded operators between $l_2^r$ and $l_2^c$ is precisely the class of the Schmidt operators (see, e.g., [37]).

Exercise 3. Let $l_2^{\min}$ be the quantum space generated by the isometric embedding described in Proposition 2 (where $E := l_2$). Then no two of the spaces $l_2^{\min}$, $l_2^c$, and $l_2^r$ are completely topologically isomorphic.

Remark. Actually, as Pisier has shown, there is a continuum of quantum spaces with $l_2$ "at the first floor" such that no two of them are completely topologically isomorphic.
Here is another elegant example. This time the question is about an operator that seems to be quite good and acts on the "sample" space $\mathcal{B}(l_2)$ with the standard quantum norm.

Proposition 4 (Tomiyama, [36, Proposition 2.2.7]). Let $t$ be the transpose operator taking an operator with matrix $(\lambda_{kl})$ (in a basis of unit vectors) to the operator with matrix $(\mu_{kl} := \lambda_{lk})$ (in the same basis). Then $\|t_n\| = n$, and, as a corollary, the operator $t$ (being an isometric isomorphism on the "first floor") is not completely bounded.

Exercise 4*. Prove the following estimate for the transpose operator: $\|t_n\| \ge n$.

Hint. Consider the matrix $a \in M_n(\mathcal{B}(l_2))$ which at the $(k, l)$th place has the operator with the matrix having 1 at the $(l, k)$th place and zeroes at the others. This matrix corresponds to the operator in $\mathcal{B}(l_2(n\mathbb{N}))$ permuting some vectors of the natural orthonormal basis. Hence, $\|a\|_n = 1$. At the same time the matrix $t_n(a)$ corresponds to the operator with matrix in the same basis having a "submatrix" of size $n \times n$ which consists of elements 1. Consequently, $\|t_n(a)\|_n \ge n$.

However (and there would not be a substantial theory without this, of course), quite a few important classes of operators do consist of completely bounded ones. In the sequel, when speaking about (just) bounded operators between quantum spaces, we will have in mind the operators acting between the corresponding normed spaces (i.e., "bounded on the first floor").

Theorem 2. Let $E$ be a quantum space and $f : E \to \mathbb{C}$ a bounded functional. Then $f$ is ("automatically") completely bounded with respect to the standard quantum norm in $\mathbb{C}$ (see Example 1), and $\|f\|_{cb} = \|f\|$.

Proof. Take a matrix $a = (a_{kl}) \in M_n(E)$ and the element $f_n(a) \in M_n$. Note that for all $\xi = (\xi_1, \dots, \xi_n)$, $\eta = (\eta_1, \dots, \eta_n) \in \mathbb{C}^n$ we have $\sum_{k,l=1}^{n} f(a_{kl}) \xi_l \eta_k = f(u)$, where $u := \sum_{k,l=1}^{n} a_{kl} \xi_l \eta_k$. In $M_n$, consider the matrix $\tilde\xi$ having $\xi$ as the left column, and the matrix $\tilde\eta$ having $\eta$ as the upper row (and the other entries of both matrices are zeroes). Then, obviously, $\tilde u := \tilde\eta a \tilde\xi$ is the matrix with entry $u$ in the upper left corner and zeroes everywhere else. From the property (i) of the quantum norm it follows that $\|\tilde u\|_n = \|u\|$,
and from (ii) we have
$$\|\tilde u\|_n \le \|\tilde\xi\|\,\|a\|_n\,\|\tilde\eta\| = \|\xi\|\,\|a\|_n\,\|\eta\|$$
(see Definition 1). Thus,
$$\Bigl|\sum_{k,l=1}^n f(a_{kl})\xi_l\eta_k\Bigr| = |f(u)| \le \|f\|\,\|u\| \le \|f\|\,\|a\|_n\,\|\xi\|\,\|\eta\|.$$
If we recall the matrix-norms in the standard quantum norm for $\mathbb{C}$, and pass to the supremum over all $\xi, \eta$; $\|\xi\|, \|\eta\| \le 1$, we obtain that $\|f_n(a)\| \le \|f\|\,\|a\|_n$. Thus, $\|f_n\| \le \|f\|$. The converse inequality is obvious. •

Let us indicate another class of operators that are always completely bounded. Historically this class played a great role in the very formation of the notion of a completely bounded operator.

Theorem 3 (see [36, p. 26]). Let $E \subset \mathcal{B}(l_2(X))$ and $F \subset \mathcal{B}(l_2(Y))$ be concrete quantum spaces, and $T : E \to F$ an operator which is a birestriction of some involutive homomorphism between the corresponding algebras of operators (see Section 6.3). Then $T$ is a completely contractive operator.
Remark. Actually we have come close to one of the basic results of quantum functional analysis. It turns out that if we generalize a little the class of operators described in the previous theorem, we obtain a complete description of arbitrary completely bounded operators between quantum spaces. This fact has no analogue in traditional analysis, where no one expects that arbitrary bounded operators can be described, even if they act on such "geometrically best" spaces as l2 . But we do not want to overwhelm the reader with the precise formulation of this result; see, e.g. , [36] .
Finally, knowing about the existence of maximal quantizations, we can obtain one more series of "automatically" completely bounded operators.

Exercise 5. Let $E_{\max}$ be a maximal quantized space, and $F$ an arbitrary quantized space. Then every bounded operator $T : E_{\max} \to F$ is completely bounded, and $\|T\|_{cb} = \|T\|$.

Hint. If $T$ is a contraction operator, then the sequence of norms
$$\{\|a\|'_n := \max\{\|a\|^{\max}_n, \|T_n a\|_n\};\ a \in M_n(E_{\max})\},\quad n \in \mathbb{N},$$
is a quantization, and therefore coincides with the maximal quantization.

It is worth noting that now, using the categorical language, we can give precise meaning to statements like "classical functional analysis is part of quantum analysis", or "quantum functional analysis is richer than the classical one." Consider the full subcategories in QNor and QNor$_1$ consisting of maximal quantum spaces; denote them by MQNor and MQNor$_1$. Using the previous proposition, you can easily do the following exercise.

Exercise 6. The forgetful functor $\square$ : QNor $\to$ Nor, being restricted to MQNor, provides an isomorphism of this category with the category Nor (in other words, $\square$ establishes a bijection between the objects and the morphisms of the categories MQNor and Nor; cf. Section 0.7). The same is true if we replace QNor with QNor$_1$, Nor with Nor$_1$, and MQNor with MQNor$_1$.
Thus, we can say that the categories Nor and Nor$_1$, servicing classical functional analysis, are contained as full subcategories in QNor and, respectively, QNor$_1$, which service the quantum analysis. This is the origin of words like "richer", etc.

Remark. There is another full subcategory in QNor (respectively, in QNor$_1$) isomorphic to Nor (respectively, to Nor$_1$). It consists of minimal quantum spaces. This follows immediately from the fact that every bounded operator from an arbitrary quantum space
to a minimal quantum space is automatically completely bounded. The proof of this fact is a little more complicated than what is stated in Exercise 5, and we shall not give it.

The essentially new phenomena of "quantum" science, which do not exist in the "classical" case, appear when we move from linear operators to bilinear (and, more generally, multilinear) operators. There is one main approach to what should be regarded as a bounded bilinear operator, namely, the one given in Definition 4.6. At the same time there exist at least two substantial "quantum" versions of this definition, each having its merits and applications. Using available elementary tools, we can define them as follows.

Let $E$, $F$, and $G$ be three quantum spaces, and $R : E \times F \to G$ a bilinear operator. Then for every natural $n$ we can consider the bilinear operators $R'_n : M_n(E) \times M_n(F) \to M_{n^2}(G)$ and $R''_n : M_n(E) \times M_n(F) \to M_n(G)$, defined as follows. It is convenient now to represent the $n^2 \times n^2$-matrices with entries from $G$ (belonging to the range of the first operator) in the form of block-matrices $z = (z_{kl})$; $1 \le k, l \le n$, where every block is a matrix $z_{kl} = (z_{kl,ij} \in G)$; $1 \le i, j \le n$. In this notation $R'_n$ assigns to each pair of matrices $x = (x_{kl} \in E)$, $y = (y_{kl} \in F)$ the matrix $z$ with elements $z_{kl,ij} := R(x_{kl}, y_{ij})$. At the same time, $R''_n$ assigns to the same pair $x$, $y$ the matrix $u$ with elements $u_{kl} := \sum_{i=1}^n R(x_{ki}, y_{il})$.

(In particular, if $R$ is the usual multiplication of complex numbers $(\lambda, \mu) \mapsto \lambda\mu$, then $R'_n$ assigns to a pair of $n \times n$-matrices the so-called Kronecker product, and $R''_n$ assigns the usual matrix product.)

So we can require two (generally speaking, different) things from the initial bilinear operator $R$: either $\sup\{\|R'_n\| : n \in \mathbb{N}\} < \infty$, or $\sup\{\|R''_n\| : n \in \mathbb{N}\} < \infty$. There is still no general agreement on what to call such operators, but, following [36], we say that a bilinear operator $R$ is (just) completely bounded if it satisfies the first condition, and multiplicatively bounded in the second case. (Christensen and Sinclair introduced the second type of operators in 1987, and called them completely bounded. The operators of the first type were not discovered at that time; they were introduced by Effros and Ruan and, independently, by Blecher and Paulsen in the early 1990s.) One can prove (we shall not do this, however) that every multiplicatively bounded bilinear operator is completely bounded. But the converse statement is not always true; in particular, the bilinear functional $l_2 \times l_2 \to \mathbb{C}$; $(\xi, \eta) \mapsto \sum_{n=1}^\infty \xi_n\eta_n$ is completely bounded but not multiplicatively bounded.
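The two amplifications of a bilinear operator can be illustrated in the scalar case. The following is a minimal sketch, assuming that matrices are stored as nested Python lists and that the block entry with indices $(k,l),(i,j)$ is placed in row $kn+i$ and column $ln+j$ of the flattened matrix; the function names `amplify_kronecker` and `amplify_product` are hypothetical and chosen only for this illustration.

```python
def amplify_kronecker(R, x, y):
    """First amplification: entry ((k, i), (l, j)) equals R(x[k][l], y[i][j]).

    The n^2 x n^2 result is flattened so that row index is k*n + i
    and column index is l*n + j.
    """
    n = len(x)
    return [[R(x[k][l], y[i][j]) for l in range(n) for j in range(n)]
            for k in range(n) for i in range(n)]


def amplify_product(R, x, y):
    """Second amplification: entry (k, l) equals sum over i of R(x[k][i], y[i][l])."""
    n = len(x)
    return [[sum(R(x[k][i], y[i][l]) for i in range(n)) for l in range(n)]
            for k in range(n)]


def multiply(a, b):
    """The bilinear operator R: ordinary multiplication of numbers."""
    return a * b


x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]
kron = amplify_kronecker(multiply, x, y)   # Kronecker product of x and y
prod = amplify_product(multiply, x, y)     # usual matrix product of x and y
```

For ordinary multiplication the first function reproduces the Kronecker product and the second the matrix product, in line with the parenthetical remark above; any other bilinear map on numbers can be passed in place of `multiply`.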
Remark. A fact of fundamental importance is that multiplicatively bounded bilinear operators (as well as their immediate generalization, multiplicatively bounded multilinear operators) have explicit description, similar to the description of completely bounded operators mentioned above. In this respect the "multilinear quantized functional analysis" is much simpler than the "multilinear classical analysis" , where even bounded bilinear operators in general have much more complicated structure than the linear ones.
Our excursion to the area of quantum functional analysis approaches the end. Let us emphasize that what we have said is just an "appetizer" before the main dish. The reader can taste this dish, say, in the recent books by E. G. Effros and Z.-J. Ruan [36] and by G. Pisier [37]. We have mentioned here only things that can be exposed on the level of knowledge of the previous text of this book. (However, going deeper into classical functional analysis, when discussing tensor products, Banach algebras, etc., we will sometimes be able to make "quantum" disseminations into the main text, mentioning further results; cf. Exercise 2.5.11 and the end of Section 2.7.)

In principle, the relations between quantum and classical functional analysis are similar to those between quantum and classical physics. On one hand, the "things" in classical science (in mathematics they are notions, facts, and methods) have meaningful "quantum" analogues, which, in addition to other virtues, allow us to better understand their
"classical" prototypes. On the other hand, quantum science comes across essentially new phenomena not encountered in classical science. Some illustrations to both statements have already been given before, and you will find them in abundance in the books we cited and in journal articles.

As an epilogue, we cannot refuse the temptation to formulate a theorem that lies in the foundation of "quantum" theory and plays the same outstanding role there as the Hahn-Banach theorem does in "classical" theory. However, now "operator-valued operators" assume the role of functionals ($\mathbb{C}$-valued operators).

Theorem 4 (Arveson-Wittstock, [36, Theorem 4.1.5]). Let $E$ be a quantum space, $E_0$ a quantum subspace of $E$, and $T_0 : E_0 \to \mathcal{B}(l_2(X))$ (where $X$ is a set) a completely bounded operator. Then there exists a completely bounded operator $T : E \to \mathcal{B}(l_2(X))$ extending $T_0$ and such that $\|T\|_{cb} = \|T_0\|_{cb}$.
Chapter 2
Banach Spaces and Their Advantages

1. What lies on the surface

Let me have men about me that are fat;
Sleek-headed men and such as sleep o' nights:
Yond Cassius has a lean and hungry look;
He thinks too much: such men are dangerous.
(Shakespeare, Julius Caesar)
From the course of analysis the reader should know the meaning of the words fundamental sequence in a metric space (or, as one sometimes says, Cauchy sequence) and complete metric space, i.e., a metric space where every fundamental sequence has a limit. Both these notions, fundamental sequence and complete space, make sense not only in metric spaces but also in premetric ones. The only difference is that a fundamental sequence in a premetric space may have many limits.

Remark. Since you already know the basics of topology, you should have in mind the following. The notion of completeness cannot be carried over from metric to topological spaces. This cannot be done, for example, because a space homeomorphic to a complete metric space is not always complete: the line, which is complete, is homeomorphic to an open interval, which is obviously not complete. ("The property of being complete is not invariant with respect to isomorphisms in Met.") On the other hand, this property is invariant with respect to isomorphisms in Met$_u$, i.e., it is preserved under the passage to uniformly homeomorphic metric spaces (check this!).
In fact, there exists a general structure, the so-called uniform space, which allows us to consider the notion of completeness naturally. The class of uniform spaces includes all premetric spaces and at the same time all polynormed spaces that we will consider later (in particular, the non-premetrizable polynormed spaces; cf. Chapter 4). The consideration of these spaces exceeds the scope of this book; you can read about them, say, in [13] or [14].
One of the main advantages of complete metric spaces, which lies in the foundation of many of their applications, is the "principle of nested closed subsets", which can be formulated as follows: if $M$ is a complete metric space, and $N_1 \supseteq N_2 \supseteq \cdots$ is a sequence of nested non-empty closed subsets with diameter tending to zero, then these sets have a unique common point.
(By the way, look at what remains of this principle if we go over from metric spaces to premetric ones.) Many arguments leaning on the principle of nested closed subsets become short and elegant when we use an important corollary of this principle, the so-called Baire theorem. A subset $N$ of a topological (in particular, metric) space $M$ is called rarefied, or nowhere dense, if every non-empty open set in $M$ contains a non-empty open subset disjoint from $N$ (in other words, $N$ is not dense in any open set in $M$). Then, a subset in $M$ is called meager if it is a union of a countable family of rarefied sets, and fat if it is not meager.¹

Baire Theorem. Every complete metric space (and, as a consequence, every topological space with topology generated by a complete metric) is fat.

The proof can be found, say, in [8, p. 66].
Now we introduce two central notions of functional analysis.

Definition 1. A normed space is called a Banach space² if it is complete as a metric space. A near-Hilbert space is called a Hilbert space if it is a Banach normed space.

Let us note some simple facts.

Proposition 1. A normed space that is topologically isomorphic to a Banach space is a Banach space. ("The property of being a Banach space is invariant under isomorphisms in Nor.")

Proof. If $E$ is a Banach space, $I : E \to F$ is a topological isomorphism, and $y_n$ is a fundamental sequence in $F$, then $x_n := I^{-1}(y_n)$ is also a fundamental, and thus convergent, sequence in $E$. If we denote by $x$ its limit, then we see that $y_n = I(x_n)$ converges as well, namely to $I(x)$. •

¹ When using these "biblical" terms, we follow Bourbaki [38]. In the literature meager sets are often called sets of the first category, while fat ones are called sets of the second category. Certainly, it is extremely inconvenient, because, like many modern mathematicians, we use the term category for a very different thing.

² In honor of the outstanding Polish mathematician Stefan Banach (1892-1945). These spaces were independently and simultaneously introduced by N. Wiener, the future "father of cybernetics". However, the most profound results that determine the face of the theory belong to Banach and his school; see Section 4 of this chapter.
Proposition 2. If a subspace $F$ of a normed space $E$ is a Banach space with respect to the inherited norm, then it is closed.

Proof. Indeed, if $F$ is not closed, then there exists $x \in E$ such that $x$ does not belong to $F$ but is a limit of a sequence $x_n \in F$. Then $x_n$ is fundamental, and, by the uniqueness of the limit in $E$, has no limit in $F$. •
Proposition 3. A closed subspace $F$ of a Banach space $E$ is a Banach space with respect to the inherited norm.

Proof. The limit of an arbitrary fundamental sequence from $F$ exists in $E$. But it is a limit point for $F$, hence it belongs to $F$. •

Corollary 1. A closed subspace of a Hilbert space is a Hilbert space with respect to the inherited inner product.
The overwhelming majority of normed spaces given as examples in Section 1.1 (although not all of them) are Banach spaces. These are, in particular, $\mathbb{C}^n_p$ and $l_p$; $1 \le p \le \infty$, $c_0$, $C_b(\Omega)$ (including $C[a,b]$), $C^n[a,b]$; $n = 1, 2, \dots$, and $L_\infty(X, \mu)$ (including $l_\infty(X)$); the proof of their completeness does not pose a problem. (Try it anyway.) Similarly, it is easy to see that for every $p$; $1 \le p \le \infty$ the $l_p$-sum of an arbitrary family of Banach spaces is a Banach space. The $l_\infty$-sum of Banach spaces will also be called the Banach direct product, and the $l_1$-sum, the Banach direct sum.

The spaces $L_p(X, \mu)$; $1 \le p < \infty$ are Banach as well, but the proof of this fact is more complicated (see, e.g., [9, Section 23]). Apparently, in a standard course of measure and integral, this was explained, at least for the cases of $p = 1$ and 2, but perhaps only for $X = [a,b]$, $\mathbb{R}$, and $\mathbb{T}$. However, this will be quite sufficient for us.

In particular, we see that the spaces $L_2(X, \mu)$ and all their specializations, like $l_2$, $L_2(\mathbb{R})$, etc., are Hilbert spaces. If $H_\nu$; $\nu \in \Lambda$ is a family of Hilbert spaces, then the $l_2$-sum of this family is a Hilbert space with respect to the inner product well-defined by the equality $\langle f, g \rangle := \sum\{\langle f(\nu), g(\nu) \rangle : \nu \in \Lambda\}$. We call this Hilbert space the Hilbert direct sum or just Hilbert sum of a given family, and use the notation "$\oplus$" instead of "$\oplus_2$". For $H_\nu = \mathbb{C}$ we obviously obtain the Hilbert space $l_2(\Lambda)$ (up to a unitary isomorphism).

To construct examples of non-Banach (i.e., incomplete) normed spaces, it is sufficient, by Proposition 2, to take a dense proper subspace in a Banach space. It immediately follows from this observation that $l_p$ is not a Banach space with respect to the norm inherited from $l_q$; $q > p$ (these norms differ from the norm of $l_p$). The same is true for $L_p[a,b]$; $1 < p$, with respect to the norm inherited from $L_q[a,b]$; $q < p$, and for $C^\infty[a,b]$ with respect to the norm inherited from an arbitrary $C^n[a,b]$; $n = 0, 1, 2, \dots$. The space $C[a,b]$ with integral norm $\|x\| := \int_a^b |x(t)|\,dt$ is also non-Banach, since (up to identifying a function with its coset) it is a dense proper subspace in $L_1[a,b]$. (Give other examples.) Running ahead, we note that there are no other kinds of non-Banach spaces: all of them are dense subspaces in Banach spaces (see Corollary 6.1 below).

As an exercise, it is useful to ask students to prove directly (i.e., without using $L_1[0,1]$) that $C[0,1]$ is not a Banach space with respect to the integral norm, by producing a non-convergent fundamental sequence. Somebody will always suggest the sequence $x_n(t) := t^n$, assuming that since it "converges to a discontinuous function", it cannot converge in this space. This is a good occasion to recall that different types of convergence should not be mixed up.
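The warning about mixing up types of convergence can be checked numerically. The sketch below is only an illustration under stated assumptions: the integral norm on $[0,1]$ is approximated by a midpoint Riemann sum, the uniform norm by a maximum over a grid, and the "ramp" functions (zero up to $1/2$, then rising linearly with slope $n$ until they reach 1) are one possible choice of a fundamental sequence whose natural limit is a step function; all helper names are hypothetical.

```python
def integral_norm(x, steps=20_000):
    """Approximate the integral of |x(t)| over [0, 1] by the midpoint rule."""
    h = 1.0 / steps
    return h * sum(abs(x((i + 0.5) * h)) for i in range(steps))


def sup_norm(x, steps=20_000):
    """Approximate max |x(t)| on [0, 1] by sampling a uniform grid."""
    return max(abs(x(i / steps)) for i in range(steps + 1))


def power(n):
    """The function t -> t**n."""
    return lambda t: t ** n


def ramp(n):
    """Continuous function: 0 on [0, 1/2], slope n afterwards, capped at 1."""
    return lambda t: min(1.0, max(0.0, n * (t - 0.5)))


def difference(x, y):
    """Pointwise difference of two functions."""
    return lambda t: x(t) - y(t)


# t**n tends to the zero function in the integral norm (the norm is 1/(n+1)),
# although its distance to zero in the uniform norm stays equal to 1.
small = integral_norm(power(9))
large = sup_norm(power(9))

# The ramps are close to each other in the integral norm for large indices.
gap = integral_norm(difference(ramp(8), ramp(4)))
```

So $t^n$ does converge in this space (to zero), and a different sequence, such as the ramps, is needed to witness incompleteness.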
Now we show that all the finite-dimensional normed spaces are Banach. This will allow us to establish the equivalence between different norms (and many other things) , as we have promised before. The key step in the proof is the following result.
Proposition 4. Every normed space of a finite dimension $n$ is topologically isomorphic to $\mathbb{C}^n_1$, and moreover, every linear isomorphism between these spaces is a topological isomorphism.

Proof. Use induction on $n$. Everything is clear for $n = 1$. Suppose the statement is true for some $n = k$, $E$ is a given $(k+1)$-dimensional normed space, and $I : E \to \mathbb{C}^{k+1}_1$ a linear isomorphism. Then for some linear basis $e_1, \dots, e_{k+1}$ in $E$ the operator $I$ acts by the formula $x = \sum_{l=1}^{k+1} \lambda_l e_l \mapsto (\lambda_1, \dots, \lambda_{k+1}) \in \mathbb{C}^{k+1}_1$. For $m = 1, \dots, k+1$ put $F_m := \operatorname{span}\{e_1, \dots, e_{m-1}, e_{m+1}, \dots, e_{k+1}\}$. By the induction assumption, the subspace $F_m$ in $E$ is topologically isomorphic to $\mathbb{C}^k_1$. The last space is obviously Banach, hence by Proposition 1, $F_m$ is Banach as well. By Proposition 2, $F_m$ is closed in $E$. Thus Proposition 1.1.4 (with $F_m$ as $F$) provides us with a constant $C_m > 0$ such that for every $x = \sum_{l=1}^{k+1} \lambda_l e_l$ we have $|\lambda_m| \le C_m \|x\|$. If we put $C := \max\{C_1, \dots, C_{k+1}\}$, then $\|I(x)\|_1 = \sum_{l=1}^{k+1} |\lambda_l| \le (k+1)\,C\,\|x\|$. Therefore, $I$ is a bounded operator. At the same time, $I^{-1}$, as an operator from $\mathbb{C}^{k+1}_1$, is also bounded (see Example 1.3.1). The rest is clear. •
Corollary 2. Every two normed spaces of the same finite dimension are topologically isomorphic. Moreover, every linear isomorphism between them is a topological isomorphism.
Theorem 1. Suppose $E$ is a finite-dimensional normed space. Then
(i) $E$ is complete;
(ii) every norm on $E$ majorizes every prenorm;
(iii) every linear operator from $E$ into a prenormed space is bounded.

Proof. As we know (cf. Example 1.3.1), all these facts are true for the spaces $\mathbb{C}^n_1$; $n \in \mathbb{N}$. This means that they are true for every normed space topologically isomorphic to one of these spaces. It remains to apply the previous proposition. •

Taking Proposition 2 into account, we immediately obtain

Corollary 3. (i) Every finite-dimensional subspace of a normed space is closed; (ii) every two norms on a finite-dimensional space are equivalent.
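The equivalence of norms on a finite-dimensional space can be observed on the three most familiar norms of $\mathbb{C}^n$. The sketch below assumes vectors are given as Python lists of numbers and checks the standard chain $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le n\,\|x\|_\infty$; the function names and the floating-point tolerance are choices made for this illustration only.

```python
import random


def norm_1(x):
    """Sum of the moduli of the coordinates."""
    return sum(abs(t) for t in x)


def norm_2(x):
    """Euclidean norm."""
    return sum(abs(t) ** 2 for t in x) ** 0.5


def norm_inf(x):
    """Maximum of the moduli of the coordinates."""
    return max(abs(t) for t in x)


def chain_holds(x, tol=1e-12):
    """Check norm_inf <= norm_2 <= norm_1 <= n * norm_inf up to rounding."""
    n = len(x)
    return (norm_inf(x) <= norm_2(x) + tol
            and norm_2(x) <= norm_1(x) + tol
            and norm_1(x) <= n * norm_inf(x) + tol)


random.seed(0)
samples = [[complex(random.uniform(-1, 1), random.uniform(-1, 1))
            for _ in range(5)] for _ in range(200)]
all_ok = all(chain_holds(v) for v in samples)
```

The constant $n$ in the last inequality depends on the dimension, which is consistent with the fact that the analogous norms on infinite-dimensional sequence spaces are not equivalent.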
As a curious illustration let us note the following

Exercise 1. If a linear space has a countable linear basis (like $c_{00}$ or the space of polynomials do), then there is no norm turning it into a Banach space.

Hint. From Corollary 3 and from the observation that a closed proper subspace of a normed space is rarefied it follows that if we endow this space with a norm, then we obtain a meager set. After that the Baire theorem works.

Remark. It is known that all infinite-dimensional separable Banach spaces have linear dimension equal to the cardinality of the continuum. Thus, they all are linearly isomorphic (Löwig's theorem; see, e.g., [39]).
Now we can fulfil our promise given in Section 1.6.
Proposition 5. Suppose $E$ is a normed space and $E_0$ is a finite-dimensional subspace in $E$. Then $E_0$ is topologically complemented.

Proof. Suppose $E_1$ is a closed linear complement of $E_0$ in $E$ (it exists by Proposition 1.6.3), and let $\mathrm{pr} : E \to E/E_1$ be the natural projection. Clearly, the restriction $T = \mathrm{pr}|_{E_0} : E_0 \to E/E_1$ is a linear isomorphism, hence a topological isomorphism due to Corollary 2. It remains to note that the operator $P = T^{-1}\,\mathrm{pr}$ is a bounded projection onto $E_0$ along $E_1$, and to use Proposition 1.5.10. •
Here is another application.
Proposition 6. Every bounded finite-dimensional operator $T$ between normed spaces $E$ and $F$ is a sum of several one-dimensional bounded operators.

Proof. First assume that both spaces $E$ and $F$ are finite-dimensional. Let $e_1, \dots, e_n$ be a linear basis in $E$. Denote by $T_k$ the operator taking $e_k$ to $T(e_k)$ and taking other basis vectors to zero. Clearly, $T = \sum_k T_k$ and each $T_k$ is a one-dimensional operator, which is bounded by Theorem 1. In the general case consider the operator $\tilde T : E/\mathrm{Ker}(T) \to F$ generated by the operator $T$ (see Proposition 1.5.3). Its corestriction to $\mathrm{Im}(T)$ is an operator (moreover, a linear isomorphism) between finite-dimensional normed spaces. We already know that it is a sum of several bounded one-dimensional operators, say, $\tilde T_1, \dots, \tilde T_n$. Hence, the one-dimensional operators $T_k := (\mathrm{in})\,\tilde T_k\,(\mathrm{pr}) : E \to F$ (where in and pr act in an obvious way) are bounded, and their sum is certainly $T$. •
Many spaces consisting of operators are Banach as well.
Proposition 7. If $F$ is a Banach space, then for every prenormed space $E$ the normed (cf. Proposition 1.3.2) space $\mathcal{B}(E, F)$ is again a Banach space. In particular, for every prenormed space $E$ the dual space $E^*$ is always a Banach space, and if $E$ is Banach, then $\mathcal{B}(E)$ is a Banach space as well.

Proof. Let $T_n$ be a fundamental sequence in $\mathcal{B}(E, F)$. Then for every $x \in E$ the sequence $T_n(x)$ is, of course, fundamental in $F$, and thus has a (unique) limit, which we denote by $T(x)$. This defines the mapping $T : E \to F$. From the additivity of all $T_n$ and the continuity of addition in $F$ it obviously follows that $T$ is additive. Similarly, from the homogeneity of $T_n$ and the continuity of the multiplication by scalars in $F$ it follows that $T$ is homogeneous. Thus, $T$ is a linear operator.

Since $T_n$ is fundamental and hence bounded in $\mathcal{B}(E, F)$, for some $C > 0$ and every $x \in B_E$ we have $\|T_n(x)\| \le C$. As a consequence, for the same $x$ we have $\|T(x)\| \le C$. This means that $T$ is bounded (i.e., belongs to $\mathcal{B}(E, F)$).

It remains to show that $T_n$ tends to $T$ in the operator norm. Take $\varepsilon > 0$ and a natural $N$ such that for $m, n > N$ we have $\|T_m - T_n\| < \varepsilon/2$. Then for every $x \in B_E$ and for the same $m, n$ we have $\|T_m x - T_n x\| < \varepsilon/2$. Taking into account that $T_m(x)$ tends to $T(x)$ as $m \to \infty$, we obtain that $\|Tx - T_n x\| \le \varepsilon/2 < \varepsilon$. If we take the supremum over all $x \in B_E$, we see that $\|T - T_n\| \le \varepsilon/2 < \varepsilon$. The rest is clear. •

Corollary 4. A reflexive normed space is always a Banach space.

Exercise 2. If $E$ is a prenormed space with a non-zero prenorm, and $F$ is a normed space, then the completeness of $F$ is not only sufficient but also necessary for the completeness of $\mathcal{B}(E, F)$.

Hint. If $y_n$ is a fundamental but not convergent sequence in $F$ and $f \in E^* \setminus \{0\}$, then the one-dimensional operators $f \bigcirc y_n$ form a fundamental sequence which does not converge in $\mathcal{B}(E, F)$.
Why are Banach spaces better than other spaces? Let us postpone the acquaintance with deep results (mostly connected with the name of Banach; see Section 4), and look at what lies on the surface. Here is one of these things: in a Banach space "the series are well-summed".
Proposition 8. (i) ("Weierstrass test"). In a Banach space, every absolutely convergent series converges.
(ii) If, in a normed space $E$, every absolutely convergent series converges, then $E$ is a Banach space.

Proof. (i) If $\sum_{k=1}^\infty x_k$ is an absolutely convergent series, then the sequence of its partial sums $\sum_{k=1}^n x_k$; $n = 1, 2, \dots$ is fundamental. The rest is clear.

(ii) Let $x_n$ be a fundamental sequence in $E$. Clearly, it has a subsequence $x'_k := x_{n_k}$ such that $\|x'_{k+1} - x'_k\| < 1/2^k$. From this and from the hypothesis it follows that the series $\sum_{k=1}^\infty (x'_{k+1} - x'_k)$ converges in $E$. Since the $n$th partial sum of this series is $x'_{n+1} - x'_1$, we conclude that the sequence $x'_k$ converges. Thus, the sequence $x_n \in E$ is fundamental and contains a convergent subsequence; this means that the sequence itself converges. •
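The two computational steps in part (ii), choosing a rapidly converging subsequence and telescoping the series of differences, can be traced on a concrete example. This is a sketch under explicit assumptions: the fundamental sequence is taken to be $x_n = 1 - 1/n$ in the rationals, and the subsequence indices are chosen as $n_k = 2^k$, which happens to work for this particular sequence; exact arithmetic with `Fraction` avoids rounding.

```python
from fractions import Fraction


def x(n):
    """An assumed fundamental sequence of rationals: x_n = 1 - 1/n."""
    return 1 - Fraction(1, n)


def sub(k):
    """Subsequence x'_k = x_{n_k} with the assumed choice n_k = 2**k."""
    return x(2 ** k)


def step(k):
    """The k-th term x'_{k+1} - x'_k of the series of differences."""
    return sub(k + 1) - sub(k)


def partial_sum(n):
    """The n-th partial sum of the series of differences."""
    return sum(step(k) for k in range(1, n + 1))


# Each difference is smaller than 1/2**k, so the series converges absolutely.
fast = all(abs(step(k)) < Fraction(1, 2 ** k) for k in range(1, 30))

# Telescoping: the n-th partial sum equals x'_{n+1} - x'_1.
telescopes = all(partial_sum(n) == sub(n + 1) - sub(1) for n in range(1, 30))
```

In a general normed space the indices $n_k$ have to be chosen from the fundamental property itself; the closed form $2^k$ is specific to this example.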
We present one of the numerous consequences of this simple fact.
Proposition 9. Let $E$ be a Banach space and $F$ a closed subspace in $E$. Then the normed quotient space $E/F$ (cf. Proposition 1.1.2) is Banach.

Proof. Let $\sum_{n=1}^\infty \tilde x_n$ be an absolutely convergent series in $E/F$. For every $n$, choose $x_n \in \tilde x_n$ such that $\|x_n\| \le 2\|\tilde x_n\|$. Clearly, the series $\sum_{n=1}^\infty x_n$ is absolutely convergent. Hence, by Proposition 8 (i) it converges to a vector $x \in E$. Consequently, the series $\sum_{n=1}^\infty \tilde x_n$ converges to the coset of $x$ in $E/F$. It remains to use Proposition 8 (ii). •

Another advantage of Banach spaces concerns extensions of operators. Although the corresponding fact has a simple proof, it is so important for applications that we promote it to the rank of a theorem.

Theorem 2 ("extension-by-continuity principle"). Let $E$ be a prenormed space, $E_0$ a dense subspace in $E$, and $F$ a Banach space. Let $T_0$ be a bounded operator from $E_0$ to $F$. Then there exists a unique bounded operator $T$ from $E$ to $F$ extending $T_0$ (i.e., such that $T|_{E_0} = T_0$ or, equivalently, the following diagram is commutative:
$$\begin{array}{ccc} E_0 & \xrightarrow{\;T_0\;} & F \\ {\scriptstyle \mathrm{in}}\downarrow & \nearrow{\scriptstyle T} & \\ E & & \end{array}$$
where in is the natural embedding). Further, $\|T\| = \|T_0\|$, and if $T_0$ is isometric, then the same is true for $T$. Finally, if $E$ is a normed space, and $T_0$ is topologically injective, then $T$ is also topologically injective.

Proof. Take $x \in E$ and a sequence $x_n \in E_0$ converging to $x$. The latter is fundamental. Hence from the estimate $\|T_0(x_m - x_n)\| \le \|T_0\|\,\|x_m - x_n\|$ it follows that the sequence $T_0(x_n)$ is fundamental as well. Since $F$ is complete, $T_0(x_n)$ tends to some vector of this space, which we denote by $T(x)$. If $x'_n$ is another sequence in $E_0$ tending to $x$, then obviously $T_0(x_n) - T_0(x'_n)$ tends to zero in $F$. This means that $T(x)$ does not depend on the choice of $x_n$, and thus the mapping $T : E \to F$ is well defined. Moreover, if $x \in E_0$, then taking $x_n := x$ for all $n$, we see that for this $x$ the equality $T(x) = T_0(x)$ holds. This means that $T$, as a mapping, extends $T_0$.

If we take $x, y \in E$ and sequences $x_n$ and $y_n$ in $E_0$ tending to $x$ and $y$ respectively, we see that from $T_0(x_n + y_n) = T_0(x_n) + T_0(y_n)$ and from the continuity of the sum in $F$ it obviously follows that $T(x + y) = T(x) + T(y)$. Hence, $T$ is additive. The homogeneity is established in a similar way. Thus $T$ is a linear operator.

Finally, for the same $x$ and $x_n$ we have
$$\|T(x)\| = \lim_{n\to\infty} \|T_0(x_n)\| \le \lim_{n\to\infty} \|T_0\|\,\|x_n\| = \|T_0\|\,\|x\|.$$
This means that $T$ is bounded and $\|T\| \le \|T_0\|$. Since the opposite inequality is obvious, the norms of the two operators coincide. From the same equality we see that the estimate $\|T_0(y)\| \ge c\|y\|$, as well as the equality $\|T_0(y)\| = \|y\|$, for all $y \in E_0$ implies the same estimate and the same equality for $T$. This proves the results concerning topologically injective and isometric operators.

It remains to note that the uniqueness of a continuous operator extending $T_0$ follows immediately from the density of $E_0$ in $E$ (together, of course, with the fact that $F$ is Hausdorff). •

As a first, comparatively modest, application we propose the following exercise.

Exercise 3. Prove the Hahn-Banach Theorem 1.6.1 for the case of separable $E$ without using Zorn's lemma.
Hint. The separability allows us, using the main lemma in the proof of the Hahn-Banach theorem countably many times, to extend a given functional to a dense subspace in $E$. Then the extension-by-continuity principle works.

Let us note a special situation where the latter principle is often applied.

Proposition 10. Suppose $E$ and $F$ are Banach spaces with dense subspaces $E_0$ and $F_0$ respectively, and $T_0^0 : E_0 \to F_0$ is a topological isomorphism. Then there exists a unique topological isomorphism $T : E \to F$ such that $T_0^0$ is its birestriction. If in addition $T_0^0$ is an isometric isomorphism, then $T$ is also an isometric isomorphism. Finally, if $E$ and $F$ are Hilbert spaces and $T_0^0$ is a unitary operator, then $T$ is also a unitary operator.

Proof. Suppose $S_0^0 := (T_0^0)^{-1}$, $T_0 : E_0 \to F$ is a coextension of $T_0^0$, and $S_0 : F_0 \to E$ is a coextension of $S_0^0$. If we extend $T_0$ and $S_0$ by continuity, we obtain bounded operators $T : E \to F$ and $S : F \to E$ uniquely defined by their birestrictions. Then the operator $ST : E \to E$, being the identity operator on a dense subspace $E_0$, must be the identity operator; similarly, $TS : F \to F$ is also the identity operator. Thus, $T$ is a topological isomorphism. The rest is clear. •
In fact, the "roots" of the extension-by-continuity principle lie deeper, in metric spaces and uniformly continuous mappings (in other words, in the category Met$_u$).

Exercise 4. Let $M$ be a metric space, $M_0$ a dense subset in $M$ (with inherited metric), $N$ a complete metric space, and $\varphi_0 : M_0 \to N$ a uniformly continuous mapping. Then there exists a unique uniformly continuous mapping $\varphi : M \to N$ extending $\varphi_0$. If, in addition, $\varphi_0$ is isometric, then $\varphi$ is also isometric. Finally, if $M$ is also complete, the image of $\varphi_0$ is dense in $N$, and $\varphi_0$ is isometric, then $\varphi$ is an isometric isomorphism (i.e., an isomorphism in Met$_1$).

2. Categories of Banach and Hilbert spaces. Classification and the Riesz-Fischer theorem
It is time to add four new categories of functional analysis to the ones we already know. They seem to be even more important.
(i) The category Ban. The objects are Banach spaces and the morphisms are (arbitrary) bounded operators.
(ii) The category Ban$_1$. The objects are the same as in Ban, but the class of morphisms is smaller: only contraction operators are declared to be morphisms.
(iii) The category Hil. The objects are Hilbert spaces, and the morphisms are bounded operators, as in Ban.
(iv) The category Hil$_1$. The objects are the same as in Hil, and the morphisms are contraction operators as in Ban$_1$.

All we have said in the context of our previously introduced categories about the composition of morphisms and the verification of the axioms of a category (see the beginning of Section 1.4) can be automatically carried over to the new categories. The interconnection between all the eight categories can be illustrated by the following scheme:
$$\begin{array}{ccccccc}
\mathrm{Hil} & \subset & \mathrm{Ban} & \subset & \mathrm{Nor} & \subset & \mathrm{Pre} \\
\cup & & \cup & & \cup & & \cup \\
\mathrm{Hil}_1 & \subset & \mathrm{Ban}_1 & \subset & \mathrm{Nor}_1 & \subset & \mathrm{Pre}_1
\end{array}$$
where the notation $\subset$ and $\cup$ has the same meaning as in the scheme in Section 1.4.

The first question about the categories we have just introduced is, of course, the following: what are isomorphisms in these categories? Clearly, a morphism in Ban or in Hil is an isomorphism $\Leftrightarrow$ it is an isomorphism in Nor (or in Pre). And a morphism in Ban$_1$ or in Hil$_1$ is an isomorphism $\Leftrightarrow$ it is an isomorphism in Nor$_1$ (or in Pre$_1$). Hence, the isomorphisms in Ban and in Hil are topological isomorphisms, whereas the isomorphisms in Ban$_1$ and in Hil$_1$ are isometric isomorphisms (we recall that both types were characterized in Proposition 1.4.1). According to the agreement in Section 1.4, isomorphisms in Hil$_1$ are also called unitary isomorphisms or unitary operators.
Now we proceed to a typical question of classification of objects in a category up to an isomorphism, discussed in Section 0.4. Recall that, informally, this is the question of how many "really different" objects there are in the category under consideration. For all categories of functional analysis introduced up to now, the question certainly pertains to the classification of objects up to a topological or an isometric isomorphism.

It turns out that the categories of Banach and Hilbert spaces behave quite differently. Oversimplifying matters, we can say that, despite the multitude of seemingly different examples, there are surprisingly few Hilbert spaces. At the same time, there is a huge multitude of Banach spaces (to say nothing about prenormed spaces). The main result is that all infinite-dimensional separable Hilbert spaces are in essence the same space, but presented in different ways.
Theorem 1 (Riesz-Fischer, 1907). Let $H$ and $K$ be infinite-dimensional separable Hilbert spaces with orthonormal (Schauder) bases $e'_n$ and $e''_n$; $n \in \mathbb{N}$ (see Theorem 1.2.4'). Then there exists a unique unitary isomorphism $U : H \to K$ such that $U(e'_n) = e''_n$ for all $n$.

Proof. Let $H_0 := \operatorname{span}\{e'_n;\ n \in \mathbb{N}\}$ and $K_0 := \operatorname{span}\{e''_n;\ n \in \mathbb{N}\}$; by the definition of a Schauder basis, $H_0$ is dense in $H$, and $K_0$ in $K$. Obviously, there exists a linear operator $U^0 : H_0 \to K_0$ uniquely defined by the rule that for every $n$ it takes $e'_n$ to $e''_n$. Take arbitrary $x, y \in H_0$; they have the form $x = \sum_{l=1}^{m} \lambda_l e'_l$ and $y = \sum_{l=1}^{m} \mu_l e'_l$ for some $\lambda_l, \mu_l \in \mathbb{C}$ and $m$. Consequently, $U^0(x) = \sum_{l=1}^{m} \lambda_l e''_l$ and $U^0(y) = \sum_{l=1}^{m} \mu_l e''_l$. Hence

$$\langle U^0(x), U^0(y) \rangle = \sum_{k,l=1}^{m} \lambda_k \overline{\mu_l} \langle e''_k, e''_l \rangle = \sum_{k=1}^{m} \lambda_k \overline{\mu_k},$$

and similarly, $\langle x, y \rangle = \sum_{k=1}^{m} \lambda_k \overline{\mu_k}$. This shows that $U^0$ is a unitary isomorphism between $H_0$ and $K_0$. It remains to apply Proposition 1.9. ∎

Corollary 1. Every infinite-dimensional separable Hilbert space is unitarily isomorphic to $l_2$.
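The computation in the proof can be checked numerically in a finite-dimensional analogue. The following is only a minimal sketch, not part of the original text: it assumes $H = K = \mathbb{C}^2$, and the two orthonormal bases as well as the helper names `inner`, `combine` and `U` are hypothetical choices made for illustration. The operator `U` sends the $n$th vector of the first basis to the $n$th vector of the second one, and the script prints how far it is from preserving one particular inner product.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def combine(coeffs, basis):
    """Return the vector sum_k coeffs[k] * basis[k]."""
    dim = len(basis[0])
    return [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(dim)]

s = 1 / math.sqrt(2)
# Two orthonormal bases of C^2 (assumed for illustration only).
basis_h = [[1 + 0j, 0j], [0j, 1 + 0j]]            # plays the role of e'_n
basis_k = [[s + 0j, s + 0j], [s * 1j, -s * 1j]]   # plays the role of e''_n

def U(x):
    """Expand x in basis_h and rebuild the vector with the same coefficients in basis_k."""
    coeffs = [inner(x, e) for e in basis_h]
    return combine(coeffs, basis_k)

x = [1 + 2j, 3 - 1j]
y = [0.5j, 2 + 0j]
# Difference between <Ux, Uy> and <x, y>; it should vanish up to rounding.
print(abs(inner(U(x), U(y)) - inner(x, y)))
```

In the infinite-dimensional case the same coefficient-matching rule is first defined on the dense span of the basis and then extended by continuity, which is the step the proof delegates to Proposition 1.9.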
A half-fictional digression. The Riesz-Fischer theorem lies in the foundation of a vast area of mathematics studying and/or using operators in Hilbert spaces. These operators are an integral part of the apparatus of quantum mechanics. Long ago, in the mid-1920s, the words "quantum mechanics" were not used, but physicists already knew very well that there exist phenomena beyond the classical picture of the world by Newton and Laplace. Finding the laws of this new and strange world seemed to be an incredibly difficult task. But two men took the challenge. Two theories were suggested, seemingly having nothing in common with each other: "matrix mechanics" by Heisenberg and "wave mechanics" by Schrödinger. As usual, heated arguments began on which theory is better.

Now we know which of those outstanding physicists was right: both. The fact is that, from the present-day point of view, our physicists were working with operators on different Hilbert spaces: Heisenberg worked with $l_2$ (hence matrices; cf. the last part of Section 1.4), while Schrödinger (if we restrict ourselves to the simplified one-dimensional case) worked in $L_2(\mathbb{R})$. But they had no idea of Hilbert spaces, to say nothing about their unitary isomorphism. That is why they did not guess that their theories were essentially the same, but stated in two different languages. Finally, J. von Neumann³ crossed the t's (more about him will be said later in the book) by proving the so-called theorem on the uniqueness of canonical commutation relations (see, e.g., [63]). In its simplified form it shows that one of the isomorphisms between $l_2$ and $L_2(\mathbb{R})$ provided by the Riesz-Fischer theorem implements a unitary equivalence of the operators that are central in both theories (again, using present-day terminology). (As a matter of fact, this is the very same isomorphism that takes the unit vectors to the Hermite functions in Example 1.2.8.) To speak informally, $l_2$ can be superimposed on $L_2(\mathbb{R})$ in such a way that the Heisenberg theory turns into the Schrödinger one.

³ John von Neumann (1903-1957), a great mathematician of the 20th century. Many mathematicians think that he is one of the two greatest mathematicians of his time (the other being A. N. Kolmogorov, who was of the same age). Von Neumann was born in Hungary, spent part of his young years in Germany, and then moved to the United States. According to some of his biographers and historians of mathematics, he seemed to be one of the very few great mathematicians who had an easy temper.

After this theorem it became clear to von Neumann and other most advanced people, like M. Stone,⁴ that the "right" theory should not be tied to a concrete Hilbert space; instead, it must be based on an abstract Hilbert space. From matrix mechanics and wave mechanics a unified quantum mechanics was born.⁵
The Riesz-Fischer theorem can be generalized to arbitrary Hilbert spaces, not necessarily separable. To formulate the result, we recall that an orthonormal Schauder basis is another name for a countable total orthonormal system.

Theorem 2 (for the proof, see [50, §2.2]).
(i) Every Hilbert space has a total orthonormal system.
(ii) Every two such systems have the same cardinality (called the Hilbert dimension of this space).
(iii) Two Hilbert spaces are topologically isomorphic $\iff$ they are unitarily isomorphic $\iff$ they have the same Hilbert dimension. Moreover, if $e'_\nu$; $\nu \in \Lambda$ and $e''_\nu$; $\nu \in \Lambda$ are total orthonormal systems in $H_1$ and $H_2$ respectively, then there exists a unique unitary isomorphism between $H_1$ and $H_2$ taking $e'_\nu$ to $e''_\nu$; $\nu \in \Lambda$.
Certainly, every cardinality $m$ can be a Hilbert dimension. Namely, if we take $l_2(X)$, where $X$ is a set of cardinality $m$, we get a Hilbert space with the total orthonormal system $\delta_x$; $x \in X$, where

$$\delta_x(y) := \begin{cases} 1 & \text{if } y = x, \\ 0 & \text{otherwise.} \end{cases}$$

The theorem we have just formulated, together with the observation we made, gives the complete solution to the problem of classification of objects in Hil and Hil$_1$. In both categories (and this feature resembles Set and Lin; cf. Example 0.4.2 and Exercise 0.4.1) the complete system of invariants is the class of all cardinalities. But now the invariant of a given object is not its cardinality or linear dimension, but the Hilbert dimension. If $X$ is a model in Set of a set of cardinality $m$, then the space $l_2(X)$ can be regarded as a model of a Hilbert space with the invariant $m$. In particular (this follows from the Riesz-Fischer theorem), in the full subcategories of Hil and Hil$_1$ consisting of infinite-dimensional separable spaces, all objects are isomorphic (they have the same invariant, the countable cardinality), and we can take $l_2$ as a unique model here.

⁴ This is the first time Stone's name appears in this book.

⁵ It is quite another matter that the creators of both theories remained unconvinced: "My mechanics is better anyway, and let mathematicians say whatever they want" (see details in [40]). Here is another unfortunate example of the lack of mutual understanding between mathematicians and "practical" physicists.
However, when passing from Hilbert spaces to more general Banach spaces, not even a shadow remains of the idyllic picture we have just observed. It turns out that there are too many topologically (and the more so, isometrically) non-isomorphic Banach spaces: too many to have even a faint hope for a complete classification of objects even in Ban. (In practice, mathematicians were sure of this long ago, even before realizing the categorical meaning of the discussed problem.) Nevertheless, many results have been gathered on the existence of topological and sometimes even isometric isomorphisms between spaces that look quite different at first sight. From the categorical point of view these results give the classification of objects in some full subcategories of Ban or (with more luck) of Ban$_1$. We have already discussed the discovery of that kind which is the most striking and the most important for applications, the one concerning Hilbert spaces. Here is another result, showing that the case of the Banach spaces $L_1(\cdot)$ differs from the case of the Hilbert spaces $L_2(\cdot)$.
Theorem 3. Let $(X, \mu)$ be a measure space with a countable basis. Then the space $L_1(X, \mu)$ is isometrically isomorphic to one of the following spaces: $\mathbb{C}^n_1$; $n \in \mathbb{N}$; $l_1$; $L_1[0,1]$; the Banach direct sum of $L_1[0,1]$ and $\mathbb{C}^n_1$; $n \in \mathbb{N}$; the Banach direct sum of $L_1[0,1]$ and $l_1$. These spaces (and in particular, $L_1[0,1]$ and $l_1$) are not isomorphic (even topologically) to each other.
A similar assertion holds if we replace the subscript 1 by an arbitrary $p$; $1 < p < \infty$, not equal to 2, and instead of the Banach direct sum (in other words, the $l_1$-sum) consider the $l_p$-sum. For a proof of Theorem 3 see, e.g., [41].

Of course, Corollary 1.2 gives a complete classification of objects in the category of finite-dimensional normed (i.e., finite-dimensional Banach) spaces and all (i.e., all bounded) operators. The system of all invariants is again the set of integers (dimensions), as in the purely algebraic category FLin in Section 0.4. The advanced reader can easily verify that the category of finite-dimensional normed spaces and the category of finite-dimensional linear spaces are equivalent. (But an attempt to construct an isomorphism between these categories would apparently be a meaningless task.)
Remark. Later, in Section 3.1, we will indicate some results pertaining to Banach spaces of the form $C(\Omega)$. (We shall see, in particular, that all the spaces $C(I^n)$, where $I^n$ is the $n$-cube and $n$ runs over the natural numbers, are isomorphic to each other topologically but not isometrically.)
In general, the majority of facts on the existence of isomorphisms established by various mathematicians up to now concern classical Banach spaces. As for arbitrary spaces, a series of results obtained recently show
that they can behave in a very odd way. (And this, apparently, buries the hope for a reasonable classification of such spaces.) Here is an impressive example. In contrast with all examples of Banach spaces discussed before, the following theorem holds.
Theorem 4 (Gowers, 1994; see [51]). There is a Banach space that is not topologically isomorphic to any of its proper subspaces.
Now let us move from the problem of classification of objects to the problem of the next level of complexity, that of classification of morphisms. Obviously, the similarity of endomorphisms and the versions of this notion discussed for general categories in Section 0.4 are characterized for our new categories in the same terms as for the categories from the previous chapter. In particular, the similarity of endomorphisms in Ban and in Hil, i.e., of bounded operators acting on Banach and Hilbert spaces, is nothing but their topological equivalence. At the same time, similarity of endomorphisms in Ban with respect to Ban$_1$, and in Hil with respect to Hil$_1$ (see the same section), is precisely their isometric equivalence (called, as we remember, unitary equivalence in the context of Hilbert spaces). Finally, the similarity of endomorphisms in Ban$_1$ or in Hil$_1$ is the very same isometric or, respectively, unitary equivalence of operators, but now only for contraction operators rather than for all operators.

The classification problem for bounded operators, both up to topological and (as a more rigorous version) up to isometric equivalence, is one of the typical problems of functional analysis. We shall return time and again to this problem for various classes of operators, and to examples of isometrically and topologically equivalent operators. Here is one of the simplest examples.

Example 1. The operator of bilateral shift in $l_2(\mathbb{Z})$ is unitarily equivalent to the operator of multiplication by $z$ in $L_2(\mathbb{T})$. This equivalence is implemented by the unitary operator taking the sequence $(\dots, 0, 1, 0, \dots)$ with 1 at the $n$th place to $z^n$; $n \in \mathbb{Z}$.

The following example will be useful later in the study of compact self-adjoint operators in Section 6.2.

Example 2. Let $H$ be a Hilbert space. Consider an operator $T$ in $H$ such that there exists an orthonormal system $e_1, e_2, \dots$ of finite or countable cardinality $m$ in $H$ that consists of eigenvectors of this operator with eigenvalues $\lambda_1, \lambda_2, \dots$, and $Tx = 0$ for every $x \in H_0 := \{e_1, e_2, \dots\}^\perp$. Then $T$ is unitarily equivalent to the operator $R : l_2 \dot{\oplus} H_0 \to l_2 \dot{\oplus} H_0$, acting on $l_2$ as the diagonal operator $T_\lambda$ with $\lambda := (\lambda_1, \lambda_2, \dots)$, and sending $H_0$ to zero. This equivalence is established by the unitary operator $I : H \to l_2 \dot{\oplus} H_0$ that takes $e_n$ to $p^n \in l_2$ and is the identity operator on $H_0$.
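Example 1 has a well-known finite analogue that can be tested directly: the cyclic shift on $\mathbb{C}^N$ is carried by a discrete Fourier transform to the operator of multiplication by the $N$th roots of unity. The sketch below is an illustration under that assumption (a finite cyclic group replacing $\mathbb{Z}$ and the roots of unity replacing the circle $\mathbb{T}$); the names `shift`, `U` and `multiply_by_z` are hypothetical and do not come from the text.

```python
import cmath
import math

N = 8
omega = cmath.exp(2j * cmath.pi / N)   # primitive N-th root of unity

def shift(x):
    """Cyclic shift: the unit vector with 1 at place n goes to the one with 1 at place n + 1."""
    return [x[(k - 1) % N] for k in range(N)]

def U(x):
    """Unitary map sending the n-th unit vector to the function z^n sampled at omega^j."""
    return [sum(x[n] * omega ** (j * n) for n in range(N)) / math.sqrt(N)
            for j in range(N)]

def multiply_by_z(g):
    """Multiply a function on the roots of unity by z, i.e. by omega^j at the j-th point."""
    return [omega ** j * g[j] for j in range(N)]

x = [complex(k, -k * k) for k in range(N)]
lhs = U(shift(x))            # shift first, then transform
rhs = multiply_by_z(U(x))    # transform first, then multiply by z
print(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

The printed number measures the failure of the intertwining relation $U S = M_z U$ on one test vector; it is expected to be of the order of rounding errors.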
As we have seen in Section 1.4, the question of equivalence (first of all, isometric equivalence) of operators is related to their matrix representation. In the context of Hilbert spaces this relation is especially close.

Proposition 1. Operators $S : H_1 \to H_1$ and $T : H_2 \to H_2$ acting on infinite-dimensional Hilbert spaces are unitarily equivalent $\iff$ they have the same matrix representations in some orthonormal Schauder bases.

Proof. $\Longrightarrow$. This follows immediately from Proposition 1.4.9.

$\Longleftarrow$. If $(a_{mn})$; $m, n \in \mathbb{N}$ is the matrix of the operators $S$ and $T$ in orthonormal bases $e'_n$ and $e''_n$ respectively, then the Riesz-Fischer theorem provides a unitary operator $U : H_1 \to H_2$ taking $e'_n$ to $e''_n$, and for this operator we have

$$\langle TUe'_n, e''_m \rangle = \langle Te''_n, e''_m \rangle = a_{mn} = \langle Se'_n, e'_m \rangle = \langle USe'_n, Ue'_m \rangle = \langle USe'_n, e''_m \rangle$$

for all $m, n \in \mathbb{N}$. From this we have $TUe'_n = USe'_n$ for all $n \in \mathbb{N}$, hence $TU = US$. ∎

Thus, in the context of Hilbert spaces a matrix determines an operator uniquely up to unitary equivalence. Now we have the following natural question: is there an effective way to verify whether a given table of complex numbers is the matrix of a bounded operator in a Hilbert space? No such way has been found so far (and possibly none ever will be). Instead we know many necessary and many sufficient conditions. Here is an illustration.

Exercise 1. For a table $(a_{mn})$; $m, n \in \mathbb{N}$ to be the matrix of a bounded operator on a Hilbert space,
(i) the condition $\sup\{\sum_{m=1}^{\infty} |a_{mn}|^2;\ n \in \mathbb{N}\} + \sup\{\sum_{n=1}^{\infty} |a_{mn}|^2;\ m \in \mathbb{N}\} < \infty$ is necessary, but not sufficient;
(ii) the condition $\sum_{m,n=1}^{\infty} |a_{mn}|^2 < \infty$ is sufficient, but not necessary.

Let us suggest another, more interesting example of unitary equivalence. It is related to matrices that are now infinite in all directions. Such matrices appear when it is natural to number the elements of the orthonormal basis in a Hilbert space $H$ by all integers, as, for instance, in the case of the trigonometric basis in $L_2[-\pi, \pi]$, or the basis of integer powers of $z$ in $L_2(\mathbb{T})$. If we take such a basis $e_n$; $n \in \mathbb{Z}$ in $H$, we can consider the matrix of an operator $T$ in this basis by putting $a_{mn} := \langle Te_n, e_m \rangle$; $m, n \in \mathbb{Z}$. An important class of these matrices consists of the so-called Laurent matrices, i.e., the matrices with $a_{m+k,n+k} = a_{mn}$ for all integers $m, n, k$ ("matrices which are constant on all diagonals parallel to the main diagonal").

Exercise 2*. A bounded operator on a Hilbert space can be represented by a Laurent matrix in some basis $\iff$ it is unitarily equivalent to the operator of multiplication by a bounded measurable function in $L_2(\mathbb{T})$.

Hint.
Suppose an operator $S$ in $L_2(\mathbb{T})$ is represented by a Laurent matrix in the basis $z^n$; $n \in \mathbb{Z}$. Let $f := \sum_{n=-\infty}^{\infty} a_{n0} z^n$. From the Laurent property it follows that $S$ acts as the multiplication by $f$ on every function from the linear span of the basis. For every $x \in L_2(\mathbb{T})$ there exists a sequence $x_m$ of these functions tending to $x$, such that $Sx_m$ tends to $Sx$ in norm and at the same time almost everywhere. From this we have $Sx = fx$. By the way, this implies that $f$ is essentially bounded.
Remark. This result is the starting point of a large part of operator theory, the analysis of the so-called Toeplitz and Hankel operators. The former are represented by matrices $(a_{mn})$; $m, n \in \mathbb{N}$, where $a_{m+k,n+k} = a_{mn}$ ("the lower right quadrant of a Laurent matrix"), and the latter by matrices $(a_{mn})$; $m, n \in \mathbb{N}$ with $a_{m-k,n+k} = a_{mn}$ ("the turned-over upper right quadrant of a Laurent matrix"). Informally, "Toeplitz matrices" are constant on the diagonals parallel to the main diagonal, and "Hankel matrices" on the diagonals perpendicular to the main diagonal. Both classes of operators admit an important representation in terms of function theory, this time with the participation of not only real, but also complex analysis. On these operators and their numerous applications see, e.g., [42].
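The two index conditions in the Remark are easy to see on finite sections. The following sketch is an illustration only: it builds finite corner blocks of a Toeplitz and a Hankel matrix from generating functions chosen arbitrarily, with indices starting from 0 rather than from 1, and the names `toeplitz_section` and `hankel_section` are hypothetical.

```python
def toeplitz_section(c, size):
    """Entries depend only on m - n, so they are constant along diagonals
    parallel to the main diagonal: a[m+k][n+k] == a[m][n]."""
    return [[c(m - n) for n in range(size)] for m in range(size)]

def hankel_section(h, size):
    """Entries depend only on m + n, so they are constant along diagonals
    perpendicular to the main diagonal: a[m-k][n+k] == a[m][n]."""
    return [[h(m + n) for n in range(size)] for m in range(size)]

# Arbitrary generating functions, assumed for illustration.
T = toeplitz_section(lambda d: 10 * d, 4)
G = hankel_section(lambda s: s * s, 4)

for row in T:
    print(row)
for row in G:
    print(row)
```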
Let us go back to the classification of endomorphisms in our categories. Basically, the situation here is the same as in the question of the classification of objects we discussed before. For a series of important special subcategories complete solutions have been found. (The most important of these results is, of course, the classification of self-adjoint operators up to unitary equivalence, which will be discussed in the last two sections of Chapter 6.) At the same time, operators acting in Hilbert (the more so, in Banach) spaces are apparently again too numerous to be classified.

The reader who has done Exercise 0.4.2 knows that operators acting on finite-dimensional linear spaces admit a complete classification, and this fact relies on the existence of the Jordan form. In turn, the Jordan form is related to the position of invariant subspaces of the operator under consideration. Naturally, one of the first questions in the study of bounded operators on infinite-dimensional spaces was the following: what are the invariant subspaces of these operators? The reader might have noticed that all operators we listed as examples in the previous chapter had invariant subspaces, and moreover, rather many invariant subspaces. All the more amazing was a comparatively recent result that gave a solution to a forty-year-old problem.
Theorem 5 (P. Enflo and C. Read; see [62]). There exists a Banach space and a bounded operator acting on it which has no proper non-zero invariant closed subspaces. Moreover, one can take $l_1$ for such a Banach space.
(It is not known up to now whether it is possible to replace here $l_1$ by $l_2$, that is, by an infinite-dimensional separable Hilbert space.)

3. Theorem on the orthogonal complement and around it
First we recall Proposition 1.2.9 on nearest vectors in finite-dimensional subspaces of near-Hilbert spaces. The conclusion of this proposition becomes invalid for infinite-dimensional subspaces of near-Hilbert spaces, even if they are closed.

Exercise 1. Let $H_0$ be the linear span of all basis vectors in $l_2$ starting with the second, and $H$ the linear span of $H_0$ and the vector $x = (1, 1/2, 1/3, \dots)$. Then $H_0$ is closed in $H$, but there is no vector in $H_0$ nearest to $x$.

Such disgraceful things are impossible in Hilbert spaces.
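The phenomenon in Exercise 1 can be made tangible numerically. Every $y \in H_0$ has first coordinate $0$, so $\|x - y\| \ge 1$, and equality would force $y = (0, 1/2, 1/3, \dots)$, which has infinitely many non-zero coordinates and therefore does not lie in $H_0$. The sketch below is an illustration, not a solution of the exercise: it evaluates $\|x - y_N\|^2$ for the truncations $y_N = (0, 1/2, \dots, 1/N, 0, 0, \dots) \in H_0$, using the classical value $\sum_{k \ge 1} 1/k^2 = \pi^2/6$; the name `squared_distance` is hypothetical.

```python
import math

def squared_distance(N):
    """||x - y_N||^2 for x = (1, 1/2, 1/3, ...) and y_N = (0, 1/2, ..., 1/N, 0, 0, ...).

    The difference x - y_N has first coordinate 1 and tail 1/(N+1), 1/(N+2), ...
    """
    tail = math.pi ** 2 / 6 - sum(1 / k ** 2 for k in range(1, N + 1))
    return 1 + tail

distances = [squared_distance(N) for N in (1, 10, 100, 1000)]
# The values decrease towards 1 but stay strictly above it.
print(distances)
```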
Proposition 1. Let $H$ be a Hilbert space, $H_0$ an arbitrary closed subspace in $H$, and $x$ a vector in $H$. Then there is a unique vector in $H_0$ nearest to $x$.

Proof. Set $d := \inf\{d(x, z) : z \in H_0\}$ and take a sequence $y_n \in H_0$ such that $\|x - y_n\| \to d$ as $n \to \infty$. Due to the parallelogram identity, for all $m, n \in \mathbb{N}$ we have the equality

$$\|(x - y_m) + (x - y_n)\|^2 + \|(x - y_m) - (x - y_n)\|^2 = 2\|x - y_m\|^2 + 2\|x - y_n\|^2.$$

The first of these squares of norms is $4\left\|x - \frac{y_m + y_n}{2}\right\|^2 \ge 4d^2$, and the right-hand side of the equality tends to $4d^2$ as $m, n \to \infty$. Hence, the non-negative double sequence $\|y_m - y_n\|^2 = \|(x - y_m) - (x - y_n)\|^2$ cannot have a positive upper limit, so it tends to zero. We have shown that the sequence $y_n$ is fundamental. But $H$ is complete and $H_0$ is closed, so $y_n$ tends to some $y \in H_0$. Since $\|x - y\| = \lim_{n \to \infty} \|x - y_n\| = d$, we see that $y$ is a nearest vector to $x$ in $H_0$.

It remains to show that every nearest vector to $x$ in $H_0$, say $z$, coincides with $y$. The same parallelogram identity, considered for $x - y$ and $x - z$, gives

$$4\left\|x - \frac{y + z}{2}\right\|^2 + \|y - z\|^2 = 2\|x - y\|^2 + 2\|x - z\|^2 = 4d^2.$$

Again the first square of the norm is not smaller than $4d^2$, hence the second is not positive, and therefore must be equal to zero. Thus, $y = z$. ∎
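The whole proof rests on the parallelogram identity, which is specific to norms generated by an inner product. As a rough numerical illustration (an assumption-laden sketch in the real plane $\mathbb{R}^2$, with hypothetical helper names), the code below compares the Euclidean norm with the maximum norm: the identity holds for the former and fails for the latter, and in the maximum norm the vector $(1, 0)$ has a whole segment of nearest points on the second coordinate axis.

```python
import math

def norm_euclid(v):
    return math.sqrt(sum(a * a for a in v))

def norm_max(v):
    return max(abs(a) for a in v)

def parallelogram_defect(norm, u, v):
    """||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2; zero for a Hilbert norm."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

u, v = [1.0, 0.0], [0.0, 1.0]
defect_euclid = parallelogram_defect(norm_euclid, u, v)
defect_max = parallelogram_defect(norm_max, u, v)

# Distances in the max norm from (1, 0) to the points (0, t) of the second axis.
max_norm_distances = [norm_max([1.0, -t]) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

print(defect_euclid, defect_max, max_norm_distances)
```

All five sampled points of the axis are at the same minimal distance from $(1, 0)$ in the maximum norm, so uniqueness of the nearest vector is lost once the parallelogram identity is not available.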
Proposition 1 becomes invalid if we replace the word "Hilbert" with "Banach". As we saw in Section 1.2 (when considering the space $\mathbb{C}^2_\infty$, the vector $p^1$, and the subspace $\operatorname{span}\{p^2\}$), there can be many nearest vectors. But sometimes there are no nearest vectors at all.

Exercise 2. (i) Let $f : E \to \mathbb{C}$ be a functional on a Banach space, $E_0 := \operatorname{Ker}(f)$ and $x \in E \setminus E_0$. Then there exists a nearest vector to $x$ in $E_0$ $\iff$ the supremum in the definition of the norm of $f$ (cf. Proposition 1.3.1(ii)) is attained.

(ii) Functionals without the indicated property indeed exist, and an example of such a functional is

$$f : C[-1, 1] \to \mathbb{C} : z \mapsto \int_{-1}^{0} z(t)\,dt - \int_{0}^{1} z(t)\,dt.$$

Hint. From the definition of the norm of a functional it follows that $\|f\| = \sup_{y \in E_0} \frac{|f(x - y)|}{\|x - y\|}$. This coincides with $\frac{|f(x)|}{\inf_{y \in E_0} \|x - y\|}$.

Returning to spaces with inner product, we give the following important definition.
Definition 1. Let $H$ be a near-Hilbert space. The orthogonal complement of a vector $x \in H$ is the set $x^\perp = \{y \in H : y \perp x\}$. The orthogonal complement of a subset $M \subset H$ is the set $M^\perp = \{y \in H : y \perp x \text{ for all } x \in M\}$.

Proposition 2. If $M$ is a subset in a near-Hilbert space $H$, then
(i) $M^\perp$ is a closed subspace in $H$;
(ii) $M \subset (M^\perp)^\perp$.

Proof. (i) follows immediately from the algebraic properties and from the continuity of the inner product, and (ii) is clear. ∎
We write $x \perp M$ if $x \perp y$ for all $y \in M$.

Proposition 3. Let $H$ be a near-Hilbert space, $H_0$ a subspace of $H$, and $x \in H$. Then $x \perp H_0$ $\iff$ the distance between $x$ and $H_0$ is equal to $\|x\|$ (in other words, 0 is the nearest vector to $x$ in $H_0$).

Proof. $\Longrightarrow$. This follows from the Pythagorean equality $\|x - y\|^2 = \|x\|^2 + \|y\|^2$, which is true for all $y \in H_0$.

$\Longleftarrow$. For every $y \in H_0$ and $\lambda \in \mathbb{C}$ we have $\|x - \lambda y\| \ge \|x\|$, hence $\langle x - \lambda y, x - \lambda y \rangle \ge \langle x, x \rangle$. Thus,

$$\langle x, x \rangle - \lambda \langle y, x \rangle - \overline{\lambda} \langle x, y \rangle + \lambda \overline{\lambda} \langle y, y \rangle \ge \langle x, x \rangle.$$

Take $\lambda = t \langle x, y \rangle$. Then for all $t > 0$ we have

$$-2t |\langle x, y \rangle|^2 + t^2 |\langle x, y \rangle|^2 \langle y, y \rangle \ge 0.$$

For $y \ne 0$ and $t < \frac{2}{\langle y, y \rangle}$ this is possible only if $\langle x, y \rangle = 0$. The rest is clear. ∎
Now we formulate the main result.

Theorem 1 (on the orthogonal complement). Suppose $H$ is a Hilbert space, and $H_0$ is a closed subspace in $H$. Then $H = H_0 \oplus H_0^\perp$, i.e., $H$ can be decomposed into a direct sum of $H_0$ and its orthogonal complement.

Proof. Take $x \in H$. Let $y$ be the nearest vector to $x$ in $H_0$ (see Proposition 1). Put $z := x - y$; then the norm of $z$ is equal to the distance from $x$ to $H_0$. Since $y \in H_0$, this coincides with the distance from $z$ to $H_0$. By Proposition 3, $z \perp H_0$. Hence, $x$ can be represented as a sum $y + z$; $y \in H_0$, $z \in H_0^\perp$. Then, for all $y_1 \in H_0$, $z_1 \in H_0^\perp$ such that $x = y_1 + z_1$, we have $y - y_1 = z_1 - z$ and at the same time $y - y_1 \perp z_1 - z$. Since the vector $y - y_1$ is orthogonal to itself, it vanishes, hence $y = y_1$ and $z = z_1$. The rest is clear. ∎
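In finite dimensions the decomposition of Theorem 1 can be computed explicitly. The following is a minimal sketch under simplifying assumptions (real scalars, $H = \mathbb{R}^3$, and $H_0$ the span of two given vectors); it uses Gram-Schmidt orthonormalization, which is a standard technique swapped in here and not the argument of the proof above. The names `orthonormalize` and `decompose` are hypothetical.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(inner(w, w))
        if length > tol:               # skip vectors dependent on the earlier ones
            basis.append([wi / length for wi in w])
    return basis

def decompose(x, spanning):
    """Return (y, z) with x = y + z, y in the span of `spanning`, z orthogonal to it."""
    y = [0.0] * len(x)
    for e in orthonormalize(spanning):
        c = inner(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    z = [xi - yi for xi, yi in zip(x, y)]
    return y, z

spanning = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]   # H_0 is the span of these two vectors
x = [1.0, 2.0, 3.0]
y, z = decompose(x, spanning)
print(y, z)
```

Here `y` plays the role of the nearest vector from Proposition 1 and `z` the role of the component in $H_0^\perp$; the norm of `z` is the distance from `x` to $H_0$.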
Here are some geometrical corollaries.
Proposition 4. For every (not necessarily closed) subspace $H_0$ in a Hilbert space $H$, the subspace $(H_0^\perp)^\perp$ coincides with the closure of $H_0$. In particular, if $H_0$ is closed, then $(H_0^\perp)^\perp = H_0$.

Proof. From Proposition 2 it follows that $(H_0^\perp)^\perp$ is a Hilbert space containing $\overline{H_0}$ (:= the closure of $H_0$). Let $H_1$ be the orthogonal complement of $\overline{H_0}$ in $(H_0^\perp)^\perp$. Clearly, every $x \in H_1$ is orthogonal to $H_0$ and to $H_0^\perp$; hence, $x \perp x$, and $x = 0$. Therefore $H_1 = 0$. By Theorem 1, $(H_0^\perp)^\perp = \overline{H_0} \oplus H_1$. The rest is clear. ∎
Proposition 5. Let $\{x_\nu;\ \nu \in \Lambda\}$ be a system of vectors in a Hilbert space $H$ such that for every $y \in H$ the condition $y \perp x_\nu$ for all $\nu \in \Lambda$ implies $y = 0$. Then this system is total in $H$.

Proof. Put $H_0 := \operatorname{span}\{x_\nu;\ \nu \in \Lambda\}$; then, by the assumption, $H_0^\perp = \{0\}$. In view of the previous proposition, the closure of $H_0$ is $\{0\}^\perp$, which coincides with the whole $H$. ∎

Proposition 6. Let $H$ be a Hilbert space, $H_0$ a closed subspace in $H$, and $J : H_0^\perp \to H/H_0$ the restriction of the natural projection $\mathrm{pr} : H \to H/H_0$. Then $J$ is an isometric isomorphism.

Proof. From the decomposition into the direct sum (provided by Theorem 1) it follows that every element $\tilde{x}$ (a coset) in $H/H_0$ contains exactly one vector $x \in H_0^\perp$, and the mapping taking $\tilde{x}$ to $x$ is a linear operator inverse to $J$. Further, $\|\tilde{x}\| := \inf\{\|x + y\| : y \in H_0\}$ coincides with the distance from $x$ to $H_0$. Hence, by Proposition 3, $\|\tilde{x}\|$ coincides with $\|x\|$. The rest is clear. ∎

Corollary 1. Under the assumptions of the previous proposition, the norm in $H/H_0$ is a Hilbert norm.
We now recall operators called projections in Definition 1.5.2.
Definition 2. Under the assumptions of Theorem 1, the projection acting from $H$ to $H_0$ along $H_0^\perp$ is called an orthogonal projection or, in short, an orthoprojection to $H_0$. It is usually denoted by $P_{H_0}$, or just $P$ if there is no danger of confusion.
Orthoprojections are much more than just examples of operators. Their outstanding role in the general theory will become clear in Chapter 6. From the Pythagorean theorem we have
Proposition 7. An orthoprojection maps the open (respectively, closed) unit ball in $H$ onto the open (respectively, closed) unit ball in $H_0$. As a corollary, the norm of a non-zero orthoprojection is equal to 1. ∎
Note that $\mathbf{1} - P_{H_0}$ is also an orthoprojection in $H$; it has $H_0^\perp$ as its range.

One of the most important applications of the theorem on the orthogonal complement is that it allows one to give a complete description of all bounded functionals on Hilbert spaces.

A mapping between prenormed spaces $E$ and $F$ which is a conjugate linear isomorphism (see Section 0.1) and at the same time preserves the norms of vectors is called a conjugate linear isometric isomorphism. Clearly, a conjugate linear isometric isomorphism is precisely a mapping which, regarded as a mapping between $E^i$ and $F$, is an isometric isomorphism.
Theorem 2 (Riesz). Let $H$ be a Hilbert space. Then every vector $e \in H$ defines a bounded functional $f_e : H \to \mathbb{C}$ by the rule $x \mapsto \langle x, e \rangle$, and every bounded functional on $H$ is $f_e$ for some uniquely determined $e \in H$. The resulting bijection $I : H \to H^* : e \mapsto f_e$ is a conjugate linear isometric isomorphism of normed spaces. (In other words, the bijection $I$ is an isometric isomorphism between $H^i$ and $H^*$.)

Proof. Obviously, for each $e \in H$ the mapping $f_e$ is a linear functional, $|f_e(x)| \le \|x\| \|e\|$ and $f_e(e) = \|e\|^2$. Hence this functional is bounded and $\|f_e\| = \|e\|$. Thus, the mapping $I$ indicated in the statement of the theorem is well defined. From the conjugate linearity of the inner product in the second argument it follows that this is a conjugate linear operator. Since $I$ preserves the norms, it must be injective.

It remains to show that every $f \in H^*$ is $f_e$ for some $e \in H$. Certainly, it is sufficient to consider the case of $f \ne 0$. Put $H_0 := \operatorname{Ker}(f)$; this is a closed subspace in $H$ of codimension 1. Take (using Theorem 1) an arbitrary $e' \in H_0^\perp$ of norm 1 and put $e := \overline{f(e')}\, e'$. Then for every $y \in H_0$ we have $f(y) = f_e(y) = 0$ and, in addition,

$$f(e') = f(e') \langle e', e' \rangle = \langle e', \overline{f(e')}\, e' \rangle = f_e(e').$$

We see that the functionals $f$ and $f_e$ coincide on a subspace of codimension 1 and on one supplementary vector. The rest is clear. ∎
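For $H = \mathbb{C}^n$ with the standard inner product the Riesz theorem becomes a one-line computation: a functional with coefficients $c_k$ is represented by the vector with coordinates $\overline{c_k}$. The sketch below illustrates this special case together with the conjugate linearity of $e \mapsto f_e$; it assumes the finite-dimensional setting, and all names in it (`functional_from_coefficients`, `representing_vector`, `f_sub`) are hypothetical.

```python
def inner(x, y):
    """Inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def functional_from_coefficients(coeffs):
    """The linear functional x -> sum_k coeffs[k] * x[k]."""
    return lambda x: sum(c * a for c, a in zip(coeffs, x))

def representing_vector(coeffs):
    """Vector e with f(x) = <x, e>; note the complex conjugation."""
    return [c.conjugate() for c in coeffs]

def f_sub(e):
    """The functional f_e : x -> <x, e> from the statement of the theorem."""
    return lambda x: inner(x, e)

coeffs = [2 - 1j, 0.5j, -3 + 0j]        # arbitrary test data
f = functional_from_coefficients(coeffs)
e = representing_vector(coeffs)
x = [1 + 1j, 2 - 3j, 0.25 + 0j]

lam = 2 + 5j
scaled = [lam * a for a in e]

print(abs(f(x) - inner(x, e)))                                  # f coincides with f_e
print(abs(f_sub(scaled)(x) - lam.conjugate() * f_sub(e)(x)))    # f_{lam e} = conj(lam) f_e
```

The second printed quantity shows why the canonical bijection is only conjugate linear: multiplying $e$ by $\lambda$ multiplies $f_e$ by $\overline{\lambda}$.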
The bijection $I$ in the Riesz theorem is called the canonical bijection between the Hilbert space and its dual. We emphasize that this is not a linear, but a conjugate linear isomorphism.

Here are several applications. First, we formulate the following simple but useful result.

Proposition 8. If $H$ is a Hilbert space, then its conjugate Banach space $H^*$ is also a Hilbert space with respect to the inner product that is well defined by the equality $\langle Ix, Iy \rangle := \langle y, x \rangle$.

Proof. The fact that this inner product is well defined is immediately verified by using the fact that $I$ is a conjugate linear isomorphism. The assertion that the norm in $H^*$ is generated by this inner product follows from the fact that the norm in $H$ is generated by the inner product $\langle \cdot, \cdot \rangle$, and $I$ is an isometry. ∎

Proposition 9. Every Hilbert space $H$ is reflexive.

Proof. Every $\varphi \in H^{**}$ defines a functional $g \in H^*$ by the formula $g(x) := \overline{\varphi(f_x)}$ (where, as before, we put $f_x := I(x)$). By the Riesz theorem, $g = f_e$ for some $e \in H$. From this, for each $x \in H$ we have $\varphi(f_x) = \overline{f_e(x)} = \langle e, x \rangle = f_x(e)$. Again applying the Riesz theorem, we see that $\varphi(f) = f(e)$ for all $f \in H^*$, i.e., $\varphi = \alpha_e$ is the evaluating functional (see Section 1.6). The rest is clear. ∎
The following application of the Riesz theorem plays an important role in the questions connected with the future Spectral theorem (see Section 6.6).

Let $S : H \times H \to \mathbb{C}$ be a conjugate-bilinear functional on a Hilbert space. It is called bounded if $\sup\{|S(x, y)|;\ x, y \in B_H\} < \infty$. In other words, $S$, considered on $H \times H^i$, must be a jointly bounded bilinear functional; see Definition 1.4.6. (As a matter of fact, we could say "separately" instead of "jointly", but we will see this later, in Theorem 4.5.) The indicated supremum is called the norm of the functional $S$ and is denoted by $\|S\|$. The set of bounded conjugate-bilinear functionals will be denoted by $\mathrm{CBil}(H)$.

We now take $T \in \mathcal{B}(H)$ and assign to it the mapping $S_T : H \times H \to \mathbb{C} : (x, y) \mapsto \langle Tx, y \rangle$. From the properties of the inner product it is clear that $S_T \in \mathrm{CBil}(H)$, and $\|S_T\| = \|T\|$ (see Proposition 1.3.3). The conjugate-bilinear functional $S_T$ is said to be associated with the operator $T$.

Theorem 3. The mapping taking $T$ to $S_T$ is a bijection between the sets $\mathcal{B}(H)$ and $\mathrm{CBil}(H)$.

Proof. Obviously, the indicated mapping is injective. So to prove the theorem, for each $S \in \mathrm{CBil}(H)$ we must find $T \in \mathcal{B}(H)$ such that $S_T = S$.
Take $x \in H$ and consider $f : H \to \mathbb{C} : y \mapsto \overline{S(x, y)}$. Clearly, $f$ is a linear functional with norm $\|f\| \le \|S\| \|x\|$. By the Riesz theorem, there exists a unique $z \in H$ such that $\overline{S(x, y)} = \langle y, z \rangle$. In other words, $S(x, y) = \langle z, y \rangle$. Now let us "release" $x$, and put $T : H \to H : x \mapsto z$. Clearly, this mapping is well defined by the rule $\langle Tx, y \rangle = S(x, y)$; $x, y \in H$. It follows from the linearity of $\langle \cdot, \cdot \rangle$ and $S$ in the first argument that $T$ is a linear operator. Finally, $\|Tx\| = \sup\{|\langle Tx, y \rangle|;\ y \in B_H\} = \sup\{|S(x, y)|;\ y \in B_H\} \le \|S\| \|x\|$, hence $T$ is bounded. The rest is clear. ∎

Formally, the Riesz theorem deals with abstract Hilbert spaces. However, one can deduce from it numerous propositions about the concrete Hilbert spaces we encounter in various fields of analysis. As an illustration, we can now fulfil the promise given in Section 1.6.

Proposition 10. Let $(X, \mu)$ be a measure space. Then every function $y \in L_2(X, \mu)$ defines a bounded functional $f_y$ on the very same $L_2(X, \mu)$ by the formula $x \mapsto \int_X x(t) y(t)\, d\mu(t)$. Conversely, every bounded functional $f$ on $L_2(X, \mu)$ is $f_y$ for some uniquely determined $y \in L_2(X, \mu)$. This bijection $I_2 : L_2(X, \mu) \to L_2(X, \mu)^*$ is an isometric isomorphism of normed spaces.

Proof. Obviously, for $H = L_2(X, \mu)$ the canonical bijection $I$ assigns to every $y$ the functional $\hat{f}_y$ acting by the formula $\hat{f}_y(x) = \int_X x(t) \overline{y(t)}\, d\mu(t)$ (we write the hat in the notation of the functional to avoid possible confusion). Hence the functional $f_y$, being just $\hat{f}_{\bar{y}}$, where $\bar{y}$ is the complex conjugate function to $y$, is well defined. Thus, we obtain a mapping $I_2$, and $I_2$ is a linear operator (unlike $I$). Further, every $f \in L_2(X, \mu)^*$ is $\hat{f}_y$ for some $y \in L_2(X, \mu)$, and thus $f = f_{\bar{y}}$. From this we conclude that the mapping $I_2$ is surjective. Finally, for each $y \in L_2(X, \mu)$ we have $\|f_y\| = \|\hat{f}_{\bar{y}}\| = \|\bar{y}\| = \|y\|$. The rest is clear. ∎
Warning. The proof of the Riesz theorem works, with obvious changes, in the case of real Hilbert spaces as well. Moreover, the corresponding canonical bijection is a "true" isomorphism between the space and its dual space. So we can speak about identifying these spaces by means of this bijection. However, for our principal field $\mathbb{C}$ we must be very careful with such statements: here $I$, although it preserves norms, is not even a linear operator.

We note here that from the form of the inner product in $H^*$ it follows that the mapping $I$ takes every total orthonormal system in $H$ to a system with the same properties in $H^*$. Hence, the Riesz-Fischer theorem immediately implies that in the case of a separable Hilbert space $H$ there are "true" unitary isomorphisms between $H$ and $H^*$ (and there are many of them, of course). Moreover, from the more general Theorem 2.2 it follows that the same is true for an arbitrary Hilbert space. But all these isomorphisms (being linear operators) have nothing in common with the canonical bijection, and they depend on the choice of a total orthonormal system in $H$.

Later (in Section 6.1) we will see that the Riesz theorem lies in the foundation of one of the major notions in functional analysis: the Hilbert adjoint operator.

* * *
According to the theorem on the orthogonal complement and Proposition 7, every closed subspace of a Hilbert space is the image of a projection (i.e., an idempotent operator) with norm 1. This means that every closed subspace of a Hilbert space is always topologically complemented (see Definition 1.5.1 and Corollary 1.5.1). In addition, for a topological direct complement of this subspace we can take its orthogonal complement.

But if we go over to general Banach spaces, the picture gets out of control: closed subspaces without topological direct complements appear. This phenomenon (which was an unpleasant surprise at the time of its discovery) is usually mentioned among the so-called pathological properties of Banach spaces. Moreover, unlike other pathological properties that will be discussed later (see Theorem 3.3.3), there is a quite classical series of counterexamples.

Theorem 4. The following subspaces have no topological direct complements:
(i) $c_0$ in $l_\infty$ (the Phillips theorem);
(ii) the subspace in $C[-\pi, \pi]$ consisting of all functions such that all their Fourier coefficients with negative indices vanish.
If you are curious about how this can be possible, we give here the proof of the first assertion.

Proof of the Phillips theorem. First, for each $M \subseteq \mathbb{N}$ we put $l_\infty(M) := \{\xi \in l_\infty : \xi_n = 0,\ n \notin M\}$. If $f \in l_\infty^*$, then $f_M := f|_{l_\infty(M)}$. Note that, if the subsets $M_k \subseteq \mathbb{N}$; $k = 1, 2, \dots$ are mutually disjoint, then from the definition of the norms in $l_\infty$ and $l_\infty^*$ it evidently follows that $\sum_{k=1}^{\infty} \|f_{M_k}\| \le \|f\|$.

Now suppose that $c_0$ is topologically complemented. By Corollary 1.5.1, there exists a bounded projection $P : l_\infty \to l_\infty$ with $c_0$ as the image. Let $f^{(k)}$ be the functional on $l_\infty$ that assigns to every $\xi$ the $k$th term of the sequence $P(\xi) \in c_0$. Clearly, it is also bounded. Note that $f^{(k)}(p^l) = 1$ for $l = k$, and $f^{(k)}(p^l) = 0$ for $l \ne k$.
We construct by induction a sequence of natural numbers $n_0 < n_1 < \cdots$ and a sequence of embedded infinite number sets $M^{(0)} \supseteq M^{(1)} \supseteq \cdots$. Put $n_0 := 1$ and $M^{(0)} := \mathbb{N}$. Suppose $n_0, \dots, n_k$ and $M^{(0)}, \dots, M^{(k)}$ have already been constructed. Consider an arbitrary partition of $M^{(k)}$ into disjoint infinite sets $M_m^{(k+1)}$; $m \in \mathbb{N}$, and the corresponding functionals $f^{(n_k)}$ and $f^{(n_k)}_{M_m^{(k+1)}}$; $m \in \mathbb{N}$. Since the series of the norms of the last functionals converges, $\sum_{m} \|f^{(n_k)}_{M_m^{(k+1)}}\| < \infty$, there exists $m$ such that $\|f^{(n_k)}_{M_m^{(k+1)}}\| < \tfrac12$ and $n_0, \dots, n_k \notin M_m^{(k+1)}$. We put $M^{(k+1)}$ equal to $M_m^{(k+1)}$, then take an arbitrary $N \in M^{(k+1)}$ such that $N > n_k$, and put $n_{k+1} := N$.

Denote by $\zeta^{(k)}$ the sequence with ones at the places $n_k, n_{k+1}, \dots$ and zeros at the other places. Put $\zeta := \zeta^{(1)}$. Obviously, $\zeta = p^{n_1} + \cdots + p^{n_k} + \zeta^{(k+1)}$, $\zeta^{(k+1)} \in l_\infty(M^{(k+1)})$, and $\|\zeta^{(k+1)}\| = 1$ for all $k$. Then $f^{(n_k)}(p^{n_1}) = \cdots = f^{(n_k)}(p^{n_{k-1}}) = 0$ and $f^{(n_k)}(p^{n_k}) = 1$. From this, by the choice of $M^{(k+1)}$, for every $k$ we have $|f^{(n_k)}(\zeta)| = |1 + f^{(n_k)}_{M^{(k+1)}}(\zeta^{(k+1)})| \ge \tfrac12$. On the other hand, $f^{(n_k)}(\zeta)$ is the $n_k$th term of the sequence $P(\zeta) \in c_0$, so these numbers must tend to zero as $k \to \infty$. We obtain a contradiction. •
The proof of the second part of Theorem 4 (in fact, of a more general statement) can be found in [41, p. 191].

The discussed topics are closely related to the question concerning the extension of operators mentioned in Section 1.6.
Proposition 11. Let $E$ be a prenormed space, and $E_0$ a subspace of $E$. The following properties of $E_0$ are equivalent:
(i) $E_0$ is topologically complemented;
(ii) the identity operator on $E_0$ can be extended to a bounded operator $S : E \to E_0$;
(iii) every bounded operator $T_0 : E_0 \to F$, where $F$ is an arbitrary prenormed space, has an extension to a bounded operator $T : E \to F$.

Proof. (i) $\Longrightarrow$ (iii). By Corollary 1.5.1, there exists a bounded projection $P : E \to E$ with $E_0$ as the image. Evidently, $T := T_0 \circ P|^{E_0}$ will do as the required extension. (iii) $\Longrightarrow$ (ii) is clear. (ii) $\Longrightarrow$ (i). Obviously, the composition $(\mathrm{in})S : E \to E$ is a projection with the image $E_0$. So, Corollary 1.5.1 works again. •

Taking the above counterexamples into account, we see that already in the context of Banach spaces not every bounded operator has a bounded extension (to say nothing about a norm-preserving one). In other words, in the Hahn-Banach theorem, the space of scalars cannot be replaced by an arbitrary Banach space. But, of course, in Hilbert spaces complete harmony reigns:

Proposition 12. Let $H$ be a Hilbert space, $H_0$ a subspace in $H$, and $F$ an arbitrary Banach space. Then every bounded operator $\varphi_0 : H_0 \to F$ has an extension $\varphi : H \to F$ such that $\|\varphi_0\| = \|\varphi\|$.

Proof. Let $H_1$ be the closure of $H_0$. In view of the extension-by-continuity principle, $\varphi_0$ has an extension $\varphi_1 : H_1 \to F$ such that $\|\varphi_1\| = \|\varphi_0\|$. It remains to put $\varphi := \varphi_1 P$, where $P$ is the orthoprojection in $H$ onto $H_1$. •
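The norm equality claimed in Proposition 12 is left implicit in the proof. A short sketch of the two inequalities, assuming only that an orthoprojection has norm at most 1 and that $\varphi$ restricts to $\varphi_0$ on $H_0$:

```latex
% Sketch of the norm equality in Proposition 12.
% Uses: ||P|| <= 1 for an orthoprojection, and phi restricted to H_0 equals phi_0.
\begin{align*}
\|\varphi\| &= \|\varphi_1 P\| \le \|\varphi_1\|\,\|P\| \le \|\varphi_1\| = \|\varphi_0\|, \\
\|\varphi\| &\ge \|\varphi|_{H_0}\| = \|\varphi_0\| .
\end{align*}
```

Together the two lines give $\|\varphi\| = \|\varphi_0\|$.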
As a matter of fact, the examples in Theorem 4 hide the fact that the absence of the indicated pathology is a characteristic property of Hilbert spaces.

Theorem 5 (Lindenstrauss-Tzafriri, [109]). The following properties of a Banach space $E$ are equivalent:
(i) every closed subspace in $E$ has a topological direct complement;
(ii) $E$ is topologically isomorphic to a Hilbert space (in other words, the norm in $E$ is equivalent to some Hilbert norm).

For discussion and references see [43, p. 255].
4. Open mapping principle and uniform boundedness principle
Classical functional analysis rests upon three pillars: three fundamental theorems connected with the name of Banach. They are the Hahn-Banach theorem, the Banach theorem on the inverse operator, and the Banach-Steinhaus theorem. The reader already knows (we hope) and respects the first of them. In this section we speak about the second and the third theorems, i.e., those that gave Banach spaces their name.

Apparently, the Banach theorem on the inverse operator is the deepest among these three results. We start with it. The following proposition historically was the final part of the proof of this Banach theorem, but it has some independent applications.
Proposition 1. Let $T : E \to F$ be a bounded operator from a Banach space to a normed space. Suppose that the set $T(B_E^0)$ (i.e., the image of the open unit ball in $E$) is dense in an open ball $\theta B_F^0$ (with center at 0 and radius $\theta > 0$) of the space $F$. Then $T(B_E^0)$ contains all of $\theta B_F^0$.

Proof. Take $y \in \theta B_F^0$ and $\delta \in (0, 1)$ such that $y' := \delta^{-1} y$ lies in $\theta B_F^0$. By the assumption, there exists $x_1 \in B_E^0$ such that $\|y' - T(x_1)\| < (1 - \delta)\theta$. Since the set $T((1 - \delta) B_E^0)$ (i.e., the image of the $(1 - \delta)$-dilation of the ball $B_E^0$) is obviously dense in $\theta(1 - \delta) B_F^0$ (i.e., in the $(1 - \delta)$-dilation of the ball $\theta B_F^0$), there exists $x_2$; $\|x_2\| < 1 - \delta$ such that $\|y' - T(x_1) - T(x_2)\| < (1 - \delta)^2 \theta$. Since $T((1 - \delta)^2 B_E^0)$ is also dense in $\theta(1 - \delta)^2 B_F^0$, there is $x_3$; $\|x_3\| < (1 - \delta)^2$ such that $\|y' - T(x_1) - T(x_2) - T(x_3)\| < (1 - \delta)^3 \theta$. Continuing this process, we obtain a sequence of vectors $x_n$; $n \in \mathbb{N}$; $\|x_n\| < (1 - \delta)^{n-1}$ such that $\|y' - T(x_1) - \cdots - T(x_n)\| < (1 - \delta)^n \theta$. Since the series $\sum_{n=1}^{\infty} \|x_n\|$ converges and $E$ is complete, the Weierstrass test (Proposition 1.8) shows that the series $\sum_{n=1}^{\infty} x_n$ converges to a vector $x' \in E$. Since $T$ is continuous, $T(x')$ coincides with $\sum_{n=1}^{\infty} T(x_n)$, in other words, with $y'$.

Thus, $y' = T(x')$, and $\|x'\| \le \sum_{n=1}^{\infty} \|x_n\| < \sum_{n=0}^{\infty} (1 - \delta)^n = \delta^{-1}$. Put $x := \delta x'$; then, clearly, $x \in B_E^0$ and $T(x) = \delta y' = y$. The rest is clear. •

Taking Proposition 1.4.3 into account, we deduce
Corollary 1. If the hypothesis of the previous proposition is fulfilled, then the operator $T$ is topologically surjective.

Theorem 1 (Open mapping principle). Let $T : E \to F$ be a surjective bounded operator between Banach spaces. Then $T(B_E^0)$ contains an open ball $\theta B_F^0$ for some $\theta > 0$. As a corollary,$^6$ $T$ is open and topologically surjective.

Proof. Since $T$ is surjective, $F = \bigcup_{n=1}^{\infty} T(n B_E^0)$. But $F$ is a Banach space, and therefore, due to the Baire theorem, there exists $m$ such that $T(m B_E^0)$ is not rarefied. Hence, it is dense in some open ball $V = y + \varepsilon B_F^0 \subset F$. Put $\theta := \frac{\varepsilon}{2m}$ and take $z \in \theta B_F^0$. Since the vectors $y + 2mz$ and, of course, $y$ lie in $V$, there exist sequences $y_n, y_n' \in T(m B_E^0)$ such that the first tends to $y$, and the second to $y + 2mz$. Hence, the sequence $z_n := y_n' - y_n$, obviously belonging to $T(2m B_E^0)$, tends to $2mz$, and thus $\frac{1}{2m} z_n \in T(B_E^0)$ tends to $z$. We have shown that $T(B_E^0)$ is dense in $\theta B_F^0$, and it remains to use Proposition 1. •

Theorem 2 (Banach inverse operator theorem). Let $T : E \to F$ be a bijective bounded operator between Banach spaces. Then its inverse linear operator $T^{-1}$ is bounded. (In other words, every continuous linear isomorphism between Banach spaces is a topological isomorphism.)

Proof. The previous theorem applied to a bijective $T$ means precisely that $T^{-1}$ is continuous at zero. The rest is clear. •
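The two estimates that drive the proof of Proposition 1 can be written out explicitly. This is only a sketch in the notation of that proof; the partial sums $s_n$ are an auxiliary symbol introduced here for convenience.

```latex
% Sketch of the successive-approximation estimates in Proposition 1.
% s_n is an auxiliary notation (not in the text) for the partial sums.
\begin{align*}
s_n &:= x_1 + \dots + x_n, \qquad \|x_n\| < (1-\delta)^{n-1}, \\
\|y' - T(s_n)\| &< (1-\delta)^{n}\,\theta \longrightarrow 0 \quad (n \to \infty), \\
\|x'\| &\le \sum_{n=1}^{\infty} \|x_n\|
        < \sum_{n=1}^{\infty} (1-\delta)^{n-1}
        = \frac{1}{1-(1-\delta)} = \frac{1}{\delta}, \\
\|x\| &= \delta\,\|x'\| < 1, \qquad T(x) = \delta\,T(x') = \delta y' = y .
\end{align*}
```

The factor $\delta$ is what converts the bound $\|x'\| < \delta^{-1}$ into membership of $x$ in the open unit ball.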
The Banach theorem can be reformulated as follows: every morphism in Ban that is bijective as a mapping of sets is an isomorphism. Certainly, the same is true for Hil (but, of course, not for Ban$_1$ or Hil$_1$). From the proven theorem, taking Propositions 1.1 and 1.2 into account, we have

Corollary 2. A bounded operator between Banach spaces is topologically injective $\iff$ it is injective and has closed image.

Exercise 1.
(i) The open mapping principle follows from the Banach theorem.
(ii) Under the assumptions of Proposition 1, $F$ also is a Banach space.

$^6$ See Proposition 1.4.3.
Hint. In both cases, Propositions 1.5.3 and 1.5.4(i) hold and allow us to restrict ourselves to the case where $T$ is injective.

How important is the condition of completeness in the two Banach theorems? The fact that $F$ should be Banach is already shown in Example 1.4.2. The completeness of $E$ is also necessary, although the corresponding counterexamples look more complicated.

Exercise 2. Let $(F, \|\cdot\|)$ be an infinite-dimensional Banach space. Then there is another norm $\|\cdot\|'$ such that the identity operator $1 : (F, \|\cdot\|') \to (F, \|\cdot\|)$ is bounded, and even a contraction, but has no bounded inverse operator.

Hint. There is a linear basis $e_\nu$; $\nu \in \Lambda$ in $F$ such that $\|e_\nu\| = 1$ and $\inf\{\|e_\mu - e_\nu\| ;\ \mu, \nu \in \Lambda,\ \mu \ne \nu\} = 0$. Hence the following norm will fit: $\|\sum_{k=1}^{n} \lambda_k e_{\nu_k}\|' := \sum_{k=1}^{n} |\lambda_k|$.

There is another theorem which is also often convenient in applications. In fact, it is equivalent to the Banach theorem. In this theorem we use the notion of the graph of a mapping $\varphi : X \to Y$ between sets. By definition, this is the subset $\Gamma(\varphi) := \{(x, \varphi(x)) : x \in X\}$ in the Cartesian product $X \times Y$. Obviously, the graph of a linear operator $T : E \to F$ is a subspace in $E \oplus F$ (or, what is the same, in $E \times F$).
Theorem 3 (Closed graph theorem). Let $T : E \to F$ be a linear operator between Banach spaces. If the graph $\Gamma(T)$ of this operator is closed in $E \oplus F$ considered with the norm of the $l_1$-sum, then $T$ is continuous. (In other words, if the convergence of $x_n$ to $x$ in $E$ and the convergence of $T(x_n)$ to $y$ in $F$ imply that $y = T(x)$, then $T$ is continuous.)

Proof. By the assumption, $\Gamma(T)$ is a Banach space with respect to the norm inherited from $E \oplus F$. Consider $S : \Gamma(T) \to E : (x, T(x)) \mapsto x$. Clearly, this is a bijective bounded (and even a contraction) operator between Banach spaces. By the Banach theorem, $S^{-1}$ is also bounded, i.e., for some $C > 0$ and for all $x \in E$ we have $\|(x, T(x))\| := \|x\| + \|T(x)\| \le C \|x\|$. From this, for the same $x$ we have $\|T(x)\| \le (C - 1)\|x\|$. •

Exercise 3. The Banach theorem follows from the closed graph theorem.

The Banach theorem, as well as its versions, Theorems 1 and 3, has many applications. Here is, perhaps, the simplest one. It looks quite peculiar, asserting something like "an inequality implies an equality".
Proposition 2. Suppose we have two norms $\|\cdot\|$ and $\|\cdot\|'$ in a linear space $E$, and both spaces $(E, \|\cdot\|)$ and $(E, \|\cdot\|')$ are Banach. If the first norm majorizes the second, then they are equivalent.

Proof. The majorization condition evidently means that the operator $1 : (E, \|\cdot\|) \to (E, \|\cdot\|')$ is bounded. But then, by the Banach theorem, the inverse operator $1 : (E, \|\cdot\|') \to (E, \|\cdot\|)$ is bounded as well. The rest is clear. •
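In formulas, the "inequality implies equality" phenomenon of Proposition 2 reads as follows. This is a sketch only: the constants $c$ and $C$ are hypothetical names for the norms of the identity operator in the two directions, and "majorizes" is read here as the first inequality.

```latex
% Sketch: one-sided majorization plus completeness yields two-sided equivalence.
% c and C are hypothetical constants (operator norms of the two identity maps).
\begin{align*}
\|x\|' &\le c\,\|x\| && \text{(majorization: } 1 : (E,\|\cdot\|) \to (E,\|\cdot\|') \text{ is bounded)}, \\
\|x\| &\le C\,\|x\|' && \text{(Banach theorem: the inverse identity operator is bounded)}, \\
c^{-1}\,\|x\|' &\le \|x\| \le C\,\|x\|' && \text{for all } x \in E .
\end{align*}
```

The second line is exactly where the completeness of both spaces is used; Exercise 2 shows it can fail otherwise.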
Now we indicate an important application of the Banach theorem to the geometry of Banach spaces.

Proposition 3. Let $E$ be a Banach space, and $E_1$ a closed subspace of $E$. Then every closed linear complement $E_2$ of $E_1$ in $E$ (if it exists!) is a topological direct complement of $E_1$.

Proof. In our case the mapping $I : E_1 \oplus_1 E_2 \to E$ indicated in Proposition 1.5.7(iii) is a bounded bijective operator between Banach spaces. By the Banach theorem, it is a topological isomorphism. •

Combining this proposition with Corollary 1.5.1, we obtain

Corollary 3. A projection of a Banach space to its closed subspace along another closed subspace is (automatically) bounded.
Exercise 4. In Proposition 3 the word "Banach" cannot be replaced by the word "normed".

Hint. In $l_2$ consider the subspaces $H_1$, the closure of $\mathrm{span}\{p^{2n} ;\ n \in \mathbb{N}\}$, and $H_2$, the closure of $\mathrm{span}\{(p^{2n} + \tfrac{1}{n} p^{2n-1}) ;\ n \in \mathbb{N}\}$, and put $H := H_1 + H_2$ (without closure!).

* * *
Up to now, we have been discussing the concrete meanings of the general categorical notion of an isomorphism in various categories of functional analysis. What can we say about the morphisms of the next level of complexity: about retractions and coretractions? In the category of Hilbert spaces the situation is quite transparent.

Proposition 4. Let $S : H \to K$ be an operator between Hilbert spaces. Then
(i) $S$ is a coretraction in Hil $\iff$ $S$ is an injective operator with closed image;
(ii) $S$ is a coretraction in Hil$_1$ $\iff$ $S$ is an isometric operator.

Proof. (i), (ii): $\Longrightarrow$. Let $T$ be a bounded left inverse to $S$ (i.e., a morphism in Hil). Without loss of generality we can assume that $H \ne 0$, hence $T \ne 0$. Then for each $x \in H$ we have $TSx = x$. Therefore, $\|Sx\| \ge \frac{1}{\|T\|} \|x\|$. Thus, Corollary 1.4.2 implies that $S$ is topologically injective, and, in particular (Corollary 2), its image is closed. If, in addition, $S$ and $T$ are contraction operators (i.e., morphisms in Hil$_1$), then the last inequality means simply that $\|Sx\| = \|x\|$, i.e., $S$ is an isometric operator.

(i), (ii): $\Longleftarrow$. Put $K_0 := \mathrm{Im}(S)$. By the Banach theorem, the operator $S_0 := S|^{K_0}$ is a topological isomorphism onto $K_0$. Suppose $P$ is the orthoprojection onto $K_0$ in $K$, and $P_0 := P|^{K_0}$. Then, obviously, the operator $T := (S_0)^{-1} P_0$ is a left inverse to $S$, and thus $S$ is a coretraction in Hil. If, in addition, $S$ is an isometric operator, then $T$ is a contraction operator, and thus $S$ is a coretraction in Hil$_1$. •

Following the proof of Proposition 4, it is easy to do

Exercise 5. For the same $S$,
(i) $S$ is a retraction in Hil $\iff$ $S$ is a surjective operator;
(ii) $S$ is a retraction in Hil$_1$ $\iff$ $S$ is a coisometric operator.
Hint. Suppose $S$ is surjective, and hence topologically surjective, and $S_0$ is the restriction of $S$ to $\mathrm{Ker}(S)^\perp$. Propositions 1.5.4 and 3.6 imply that this is a topological isomorphism, and in addition, an isometry, provided $S$ is a coisometric operator. So the coextension of the operator $(S_0)^{-1}$ to $H$ is a right inverse operator for $S$.

In our remaining categories the description of (co)retractions is not so transparent: the geometric nature of these objects is too complicated.

Exercise 6. In the category Ban
(i) a coretraction is precisely an injective operator such that its image is a closed subspace having a closed linear complement;
(ii) a retraction is precisely a surjective operator such that its kernel has a closed linear complement.

In the case of the categories Nor and Pre we can make only a not very impressive declaration that coretractions in these categories are topologically injective operators with topologically complemented images, and retractions are topologically surjective operators with topologically complemented kernels (verify this!). Certainly, what we have said is true for Ban as well, but in this category (and that is the point) the conditions of Exercise 6 imply "all the rest" automatically.

Remark. In the category Ban$_1$ the description of (co)retractions can be given in geometric terms, as in Ban, but it is more complicated. We have already seen (in the proof of Proposition 4 (ii): $\Longrightarrow$) that to be a coretraction in Ban$_1$, a contraction operator $T : E \to F$ must be isometric, and its image must have a closed linear complement. The latter, as we now know, means that there exists a bounded projection of $F$ onto this image. But generally speaking, this is not sufficient: among such projections there should be a contraction (and as a corollary, it will have norm 1). Similarly, $T$ is a retraction in the same category $\iff$ it is coisometric and there is a projection of $E$ onto its kernel which at the same time is a contraction operator.

Actually, the question of how to describe (co)retractions in Ban$_1$ is among many questions stimulating interest in subspaces of Banach spaces that are "so well situated" that they are images of contraction operators. In particular, in the same geometric terms (this time the question was about the subspaces in $B(H)$) the class of the so-called amenable operator algebras was characterized in the 1980s. These algebras play an important role in mathematics and mathematical physics (cf. the end of Section 6.3).

* * *
Now we pass to the third fundamental result connected with the name of Banach.

Theorem 4 (Uniform boundedness principle; Banach-Steinhaus). Let $E$ be a Banach space and $F$ a prenormed space. Suppose $T_\nu : E \to F$; $\nu \in \Lambda$ is a family of bounded operators. Assume that for every $x \in E$ there exists $C_x > 0$ such that for all $\nu \in \Lambda$ we have $\|T_\nu(x)\| < C_x$. Then there exists a constant $C > 0$ such that $\|T_\nu\| < C$ for all $\nu \in \Lambda$.

The assumed and the stated properties of the operator family here are called, respectively, the pointwise boundedness and the uniform boundedness (of this family).

Proof. Assume the contrary. Then we can choose from our family of operators a sequence $T_n$; $n = 1, 2, \dots$ such that $\|T_n\| > n$ for all $n$. For each natural $k$ we put $M_k := \{x \in E : \|T_n(x)\| < k \text{ for all } n\}$. We shall show that every such set is rarefied. Take an open ball $U(x, r) = x + r B_E^0$ in $E$ and choose an arbitrary $n > \frac{1}{r}(k + C_x)$. In view of the inequality $\|T_n\| > n$ and the definition of the operator norm, there exists $y' \in B_E$ such that $\|T_n(y')\| > n$; without loss of generality we can assume that $y' \in B_E^0$. Put $z := x + r y'$. Then $\|x - z\| = \|r y'\| < r$, i.e., $z \in U(x, r)$, and $\|T_n(z)\| \ge \|T_n(r y')\| - \|T_n(x)\| > rn - C_x > k$. By continuity, the same inequality holds after replacing $z$ by an arbitrary point in a certain neighborhood of $z$, which we call $U$. Hence, $U(x, r)$ contains an open set, namely the intersection of $U(x, r)$ with $U$, which does not contain points of $M_k$. This means that $M_k$ is rarefied.

Now consider the set $M := \bigcup\{M_k : k = 1, 2, \dots\}$. Being a union of a countable system of rarefied sets, $M$ is meager. By the Baire theorem, it cannot coincide with the entire $E$. At the same time, for all $x \in E$ the estimate $\|T_n(x)\| < C_x$ implies that $x \in M_k$ for all $k > C_x$, a contradiction. •
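The key estimate of the proof, which pushes the point $z$ out of $M_k$, can be unpacked step by step. This is a sketch in the notation of the proof, using only the triangle inequality and the choice $n > \frac{1}{r}(k + C_x)$.

```latex
% Sketch of the estimate showing z = x + r y' lies outside M_k.
% Inputs: ||T_n(y')|| > n, ||T_n(x)|| < C_x, and n > (k + C_x)/r.
\begin{align*}
\|T_n(z)\| &= \|T_n(x) + r\,T_n(y')\| \\
           &\ge r\,\|T_n(y')\| - \|T_n(x)\| \\
           &> r n - C_x \\
           &> r \cdot \frac{k + C_x}{r} - C_x = k .
\end{align*}
```

So no matter how small the ball $U(x, r)$ is, a large enough index $n$ produces a point of the ball at which $T_n$ exceeds $k$ in norm.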
Certainly, everything that the Banach-Steinhaus theorem says about operators is also true for functionals. The following special case is useful.
Proposition 5. Let $E$ be a normed space, and $x_\nu$; $\nu \in \Lambda$ a family of vectors such that for all $f \in E^*$ the set of numbers $\{f(x_\nu) ;\ \nu \in \Lambda\}$ is bounded. Then the initial family is bounded with respect to the norm in $E$.

Proof. Consider the family $\alpha(x_\nu)$ of functionals on $E^*$, where $\alpha : E \to E^{**}$ is the canonical embedding (see Section 1.6). By the assumption, this family is pointwise bounded and thus, by the completeness of $E^*$, it is uniformly bounded. It remains to recall that $\alpha$ is an isometric operator. •

The assumption of the completeness of $E$ in the Banach-Steinhaus theorem cannot be omitted.

Exercise 7°. The sequence of functionals $f_n : (c_{00}, \|\cdot\|_\infty) \to \mathbb{C}$; $\xi \mapsto n \xi_n$ is pointwise bounded but not uniformly bounded.

We should emphasize that in the Banach-Steinhaus theorem the completeness of only one of the spaces is assumed, namely the domain $E$ of our operators; at the same time, in the Banach inverse operator theorem both spaces $E$ and $F$ should be complete. (As for the Hahn-Banach theorem, no completeness of any space is assumed there.)

We shall apply the indicated principle many times in this book. Right now we shall give a corollary concerning bilinear operators. Recall (Definition 1.4.6) that there are two approaches to the question of which bilinear operators between (pre)normed spaces must be viewed as bounded. Moreover, already in the class of normed spaces these two approaches lead to two different classes: separately bounded bilinear operators form a wider class than jointly bounded ones (Example 1.4.5). However, in the class of Banach spaces such examples are impossible.
Theorem 5. Let $E$ be a Banach space, and $F$ and $G$ arbitrary prenormed spaces. Then every separately bounded bilinear operator $\mathcal{R} : E \times F \to G$ is jointly bounded.

Proof. Take $y \in F$ and consider the mapping $\mathcal{R}_y : E \to G$ (see Definition 1.4.6). The separate continuity of $\mathcal{R}$ implies that the family $\{\mathcal{R}_y : \|y\| \le 1\}$ is a subset in $B(E, G)$ satisfying the assumptions of the Banach-Steinhaus theorem. By this theorem, $\sup\{\|\mathcal{R}_y\| : \|y\| \le 1\} < \infty$, and this number obviously coincides with $\|\mathcal{R}\|$ in Definition 1.4.6. •

5. Banach adjointness functor and other categorical questions
A series of important functors is defined on the categories of functional analysis. To begin with, on these categories, as on any category, act the co- and contravariant functors of morphisms (see Example 0.7.1). But a noticeable feature of functors in the categories Ban, Nor, and Pre is that they can (and should) be considered with the values not in Set as in the general case, but in the initial category. We shall restrict ourselves to the case of the first of the indicated categories, which is apparently the most important.

We have seen that for each $E, F \in$ Ban the set $B(E, F)$, i.e., in the general categorical notation, $h_{\mathrm{Ban}}(E, F)$, has the structure of a Banach space (Proposition 1.7). If in Section 0.7 we replace $h_{\mathrm{Ban}}(E, F)$ by $B(E, F)$ in every notation connected with mor-functors, we can make the following observation.

Proposition 1. For every $E \in$ Ban and for each morphism $\varphi : F \to G$ in Ban (i.e., for each bounded operator from $F$ to $G$) the mappings
$$B(E, \varphi) : B(E, F) \to B(E, G) : \psi \mapsto \varphi\psi$$
and
$$B(\varphi, E) : B(G, E) \to B(F, E) : \psi \mapsto \psi\varphi$$
are bounded operators.

Proof. This immediately follows from the estimate for the operator norm of the composition; see Proposition 1.3.4. •
Proposition 1 implies that every Banach space $E$ generates two functors, a covariant and a contravariant, from Ban to Ban itself, namely
$$B(E, ?) : F \mapsto B(E, F);\ \varphi \mapsto B(E, \varphi)$$
and
$$B(?, E) : F \mapsto B(F, E);\ \varphi \mapsto B(\varphi, E).$$

We leave it to the reader to give similar definitions of morphism functors acting from Nor to Nor, and from Pre to Pre.

Among all of these functors, the main role in analysis belongs to the contravariant functor $B(?, \mathbb{C})$ : Ban $\to$ Ban corresponding to the case $E = \mathbb{C}$.

Definition 1. This functor is called the Banach adjointness functor (or Banach star functor) and is denoted by $(*)$ : Ban $\to$ Ban.

Obviously, for an object $E$ in Ban the object $(*)(E)$ is just the dual space $E^*$. Respectively, for a morphism $T$ in Ban it is customary to write $T^*$ instead of $(*)(T)$. The morphisms we have built (i.e., operators) are so important that we shall give their direct definition.

Definition 2. Let $T : F \to G$ be an operator between Banach (or, more generally, normed) spaces. The operator $T^* : G^* \to F^*$ acting by the rule $f \mapsto fT$ is called the Banach adjoint to $T$.
Thus, for every $x \in F$ we have the equality $[T^* f](x) = f(Tx)$ or, in other notation,
$$\langle T^* f, x \rangle = \langle f, Tx \rangle. \qquad (1)$$
If you prefer formulas, you can accept this as an initial definition of the adjoint operator. At the same time the action of this operator is well illustrated by the following commutative diagram:

[Commutative diagram: $T : F \to G$, $f : G \to \mathbb{C}$, and $T^* f : F \to \mathbb{C}$]

which is easier to remember than equality (1).

Remark. We call this functor "Banach adjointness" instead of just "adjointness" because later in the book an important role will belong to the notion of the Hilbert adjoint operator (see Definition 6.1.1), and this is not quite the same.

Contrary to $(*) := B(?, \mathbb{C})$, the (covariant) functor $B(\mathbb{C}, ?)$ is of no special interest.

Exercise 1. The functor $B(\mathbb{C}, ?)$ is naturally equivalent to the identity functor in Ban.
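To make equality (1) concrete, here is a finite-dimensional sketch. It rests on assumptions not made in the text: $F = \mathbb{C}^n$, $G = \mathbb{C}^m$, the operator $T$ is given by a hypothetical matrix $A = (a_{ij})$, and a functional is identified with its coefficient vector through the bilinear pairing $f(y) = \sum_i f_i y_i$.

```latex
% Illustrative assumptions: F = C^n, G = C^m, T given by a matrix A = (a_ij),
% and a functional f on C^m identified with (f_1, ..., f_m) via f(y) = sum_i f_i y_i.
\begin{align*}
(Tx)_i &= \sum_{j=1}^{n} a_{ij} x_j, \\
[T^* f](x) = f(Tx) &= \sum_{i=1}^{m} f_i \sum_{j=1}^{n} a_{ij} x_j
  = \sum_{j=1}^{n} \Bigl( \sum_{i=1}^{m} a_{ij} f_i \Bigr) x_j, \\
(T^* f)_j &= \sum_{i=1}^{m} a_{ij} f_i, \qquad \text{i.e. } T^* \text{ is given by the transpose } A^{\mathsf T}.
\end{align*}
```

Under this identification no complex conjugation appears, in contrast with the conjugate transpose that describes the Hilbert adjoint of a matrix; this is one way to see why the two notions are "not quite the same".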
Here are several properties of the Banach adjoint operator.

Exercise 2. Let $S : E_1 \to E_2$ and $T : F_1 \to F_2$ be bounded operators between Banach spaces. Then
(i) if $E_1 = F_1$ and $E_2 = F_2$, then $(S + T)^* = S^* + T^*$;
(ii) $(\lambda S)^* = \lambda S^*$ for all $\lambda \in \mathbb{C}$;
(iii) $\|S^*\| = \|S\|$;
(iv) if $E_1 = F_2$, then $(ST)^* = T^* S^*$;
(v) if $1$ is the identity operator on $E$, then $1^*$ is the identity operator on $E^*$.

Hint. Properties (iv) and (v) are encoded in the word "functor", and (iii) follows from the Hahn-Banach theorem.

Note that (iii) implies that the Banach adjoint to a contraction operator is again a contraction. This allows us to introduce an analogous "Banach adjointness functor" acting on the category Ban$_1$.

Taking the star functor twice we obtain the functor $(**)$ : Ban $\to$ Ban, which is certainly covariant. This functor assigns to every $F \in$ Ban its second dual space $F^{**}$ (see Section 1.6) and to every operator $T : F \to G$ its so-called second adjoint operator $T^{**} : F^{**} \to G^{**}$. It is closely connected with the identity functor $1$ in Ban (cf. Section 0.7). We recall the canonical embedding $\alpha_F : F \to F^{**}$ defined in Section 1.6 for each normed (in particular, Banach) space $F$.

Proposition 2. For every operator $T : F \to G$ there is the following commutative diagram of Banach spaces:

[Commutative diagram: $T : F \to G$ on top, $T^{**} : F^{**} \to G^{**}$ on the bottom, with vertical arrows $\alpha_F : F \to F^{**}$ and $\alpha_G : G \to G^{**}$]

In other words, the family $\{\alpha_F : F \in \mathrm{Ban}\}$ is a natural transformation between the functors $1$ and $(**)$ acting on Ban (see Definition 0.7.6).

Proof. Since elements of $G^{**}$ are functionals on $G^*$, our goal is to show that for all $x \in F$ and $f \in G^*$ we have $\langle \alpha_G(Tx), f \rangle = \langle T^{**}(\alpha_F x), f \rangle$. The construction of the operators $\alpha_F$ and $\alpha_G$ together with equality (1) gives $\langle \alpha_G(Tx), f \rangle = \langle f, Tx \rangle = \langle T^* f, x \rangle = \langle \alpha_F x, T^* f \rangle$. Now (apparently, this is a psychologically difficult moment), we substitute $T^{**}$ for $T^*$, $T^*$ for $T$, $\alpha_F x$ for $f$, and $f$ for $x$ in the general formula (1). We see that $\langle \alpha_F x, T^* f \rangle$ is just $\langle T^{**}(\alpha_F x), f \rangle$. The rest is clear. •

The Banach adjoint operator is one of the key notions in functional analysis, and we will often discuss its properties. Now we give several illustrations of what it turns out to be for some concrete operators.

Exercise 3. Suppose $\lambda \in l_\infty$. Then the Banach adjoint operator $T_\lambda^*$ to the diagonal operator $T_\lambda$ (see Example 1.3.2) acting on $l_2$ (and, respectively, on $c_0$, $l_1$) coincides, up to isometric equivalence, with the diagonal operator $T_\lambda$ on $l_2$ (respectively, $l_1$, $l_\infty$). To be more specific, we have the following commutative diagrams:
[Three commutative diagrams relating $T_\lambda^*$ and $T_\lambda$ by means of $I_2$, $I_0$ and $I_1$]

where $I_2$, $I_0$ and $I_1$ are the isometric isomorphisms indicated in Exercises 1.6.1-1.6.3.

Exercise 4. The Banach adjoint operator $T_l^*$ to the operator of the left shift $T_l$ (see Example 1.3.3) acting on $l_2$ (respectively, on $c_0$, $l_1$) coincides, up to isometric equivalence, with the operator of the right shift $T_r$ on $l_2$ (respectively, $l_1$, $l_\infty$), and in this assertion we can interchange $T_l$ and $T_r$. (Formulate and prove a more detailed statement, writing the corresponding commutative diagrams.)

Exercise 5. Find the Banach adjoint operators for the shift operators on the spaces $L_p(X)$ for $X := \mathbb{R}, \mathbb{T}$ and $p = 1, 2$ (see Example 1.3.7).

Hint. This is the "shift in the opposite direction".

Certainly, the Banach adjoint operator to a topological or isometric isomorphism $I$ also belongs to the same class. You can either verify this directly, or use the fact that $I^*$ is the value of the functor $(*)$ on an isomorphism in Ban or in Ban$_1$.

Remark. The converse is also true: if $I^*$ is a topological or isometric isomorphism, then $I$ has the same properties. This fact will soon be suggested as Exercise 10 in the "advanced" part of the text.

In the following two examples, $E$ is a closed subspace of a Banach space $F$.

Example 1. The Banach adjoint operator $\mathrm{in}^* : F^* \to E^*$ to $\mathrm{in} : E \to F$ assigns to each functional on $F$ its restriction to $E$. From the Hahn-Banach theorem it evidently follows that this operator is a coisometry.

Example 2. The operator $\mathrm{pr}^* : (F/E)^* \to F^*$, adjoint to $\mathrm{pr} : F \to F/E$, acts by the formula $f \mapsto g$, where $g : x \mapsto f(x + E)$. Clearly, this is a contraction operator, and from the fact that every coset of norm $< 1$ contains a vector of norm $< 1$ it follows that $\|f\| = \sup\{|f(\tilde x)| : \tilde x \in B_{F/E}^0\} \le \sup\{|g(x)| : x \in B_F^0\} = \|g\|$. Thus, $\mathrm{pr}^*$ is an isometric operator.

Here is a more general observation.

Exercise 6.
(i) An operator between Banach spaces is isometric $\iff$ its Banach adjoint operator is coisometric;
(ii) an operator between Banach spaces that is Banach adjoint to a coisometric operator is isometric.

Hint. The $\Longrightarrow$-part in (i) and the entire statement (ii) are established by arguments similar to those in Examples 1 and 2. Further, from Proposition 1 it follows that if $I^{**}$ is coisometric, then $I$ has the same property. Taking this into account, we deduce the $\Longleftarrow$-part of (i) from (ii).

The following exercise is a "topological" analogue of the previous "metric" assertion.

Exercise 7.
(i) An operator between Banach spaces is topologically injective $\iff$ its Banach adjoint operator is surjective (or, equivalently, topologically surjective);
(ii) an operator between Banach spaces that is Banach adjoint to a surjective operator is topologically injective.

Hint. The representation of operators as compositions described in Proposition 1.5.3 allows us to reduce the case under consideration to (co)isometric operators in the previous exercise.
Remark. As a matter of fact, in both Exercises 6 and 7 part (ii) is true in "both directions", just as part (i). In other words, an operator is coisometric (respectively, surjective) $\Longleftrightarrow$ its Banach adjoint operator is isometric (respectively, topologically injective). But the proofs of the $\Longleftarrow$-part of this criterion known to the author require powerful tools exceeding the scope of the book. The required arguments are presented in [72, Theorem 4.15]. (Perhaps you will succeed in finding a simpler proof.)
Later we will find out which topological properties distinguish the adjoint operators among all operators acting from $F^*$ to $E^*$ (see Theorem 4.2.3).
Observations we have already made allow us to prove the theorem lying at the foundation of the so-called "Banach homology", a set of problems on the boundary between functional analysis and homological algebra.
Let $\mathcal{K}$ be a category. A sequence in $\mathcal{K}$ is a diagram of the form
$$\cdots \longleftarrow X_{n-1} \xleftarrow{\ \varphi_{n-1}\ } X_n \xleftarrow{\ \varphi_n\ } X_{n+1} \longleftarrow \cdots$$
For simplicity we restrict ourselves to sequences that are infinite both on the left and on the right. (Therefore, the indices in the notation of objects and morphisms are arbitrary integers.) When we write such a sequence, the arrows can be directed to the right as well. But, of course, all arrows in a sequence must have the same direction.
If our $\mathcal{K}$ is Ban, then, applying the functor $(*)$ to the objects and morphisms of the sequence
$$\cdots \longleftarrow E_{n-1} \xleftarrow{\ T_{n-1}\ } E_n \xleftarrow{\ T_n\ } E_{n+1} \longleftarrow \cdots \qquad (\mathcal{E})$$
we obtain the sequence $(\mathcal{E}^*)$ in the same category Ban; it is called dual to $(\mathcal{E})$.
Definition 3. A sequence $(\mathcal{E})$ in Ban is called exact if for every $n \in \mathbb{Z}$ we have $\operatorname{Ker}(T_{n-1}) = \operatorname{Im}(T_n)$ (in other words, for every $x \in E_n$ the statements "$T_{n-1}x = 0$" and "$x = T_n y$ for some $y \in E_{n+1}$" are equivalent).
Perhaps you have already encountered exact sequences in algebra or topology (most probably, in the more general context of abelian groups). This is one of the basic notions in homological algebra and in algebraic topology (see, e.g., [44]).
The declaration that a sequence is exact presents, in condensed (and, in many mathematicians' opinion, elegant) form, important information about the structure of objects and morphisms in this sequence, and about their relations. For instance, it is easy to verify that a sequence
$$\cdots \longleftarrow 0 \longleftarrow F \xleftarrow{\ I\ } E \longleftarrow 0 \longleftarrow \cdots$$
is exact $\Longleftrightarrow$ $I$ is a topological isomorphism. Here is the key example.
Exercise 8. Suppose a sequence in Ban of the form
$$\cdots \longleftarrow 0 \longleftarrow 0 \longleftarrow G \xleftarrow{\ T\ } F \xleftarrow{\ S\ } E \longleftarrow 0 \longleftarrow 0 \longleftarrow \cdots$$
is exact. (Such sequences are called short.) Then the operator $S$ is topologically injective, $T$ is (topologically) surjective, and the quotient space $F/\operatorname{Im}(S)$ is topologically isomorphic to $G$.
We now formulate the promised
Theorem 1 (for the proof, see [47, Theorem 0.5.2]). A sequence $(\mathcal{E})$ in Ban is exact $\Longleftrightarrow$ its dual sequence $(\mathcal{E}^*)$ is exact.
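In finite dimensions every subspace is closed, so exactness of a fragment $G \xleftarrow{T} F \xleftarrow{S} E$ at $F$ reduces to a rank count: $TS = 0$ and $\operatorname{rank} S + \operatorname{rank} T = \dim F$. The following Python sketch checks, under this finite-dimensional assumption and with the adjoint identified with the transposed matrix, that a fragment and its dual are exact or non-exact simultaneously, in the spirit of Theorem 1. The helper names `rank`, `matmul`, `transpose`, `exact_at_middle` and the sample matrices are ours and do not come from the text; this is an illustration, not a proof.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (a list of rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def exact_at_middle(t, s):
    """Exactness of  G <-T- F <-S- E  at F, i.e. Im(S) = Ker(T)."""
    dim_f = len(s)  # S maps into F, so its matrix has dim F rows
    ts_is_zero = all(v == 0 for row in matmul(t, s) for v in row)
    return ts_is_zero and rank(s) + rank(t) == dim_f


# Hypothetical sample data: E = R, F = R^2, G = R.
S = [[1], [0]]       # S: E -> F, x |-> (x, 0)
T = [[0, 1]]         # T: F -> G, (a, b) |-> b
S_bad = [[0], [0]]   # zero operator: Im(S_bad) is smaller than Ker(T)

# Dual fragment  G* -T*-> F* -S*-> E* : exactness at F* means Im(T*) = Ker(S*).
print(exact_at_middle(T, S), exact_at_middle(transpose(S), transpose(T)))
print(exact_at_middle(T, S_bad), exact_at_middle(transpose(S_bad), transpose(T)))
```

In each printed pair the two values coincide: the fragment and its dual are exact together or fail to be exact together.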
We suggest that you prove something like a local form of Theorem 1, from which Theorem 1 itself immediately follows.
Exercise 9. Let
$$G \xleftarrow{\ T\ } F \xleftarrow{\ S\ } E$$
be a diagram in Ban, and $TS = 0$. Then the following statements are equivalent:
(i) $\operatorname{Im}(S)$ is a dense subspace in $\operatorname{Ker}(T)$, and, in addition, $\operatorname{Im}(T)$ is closed in $G$.
(ii) $\operatorname{Ker}(S^*) = \operatorname{Im}(T^*)$.
Hint. Consider the commutative diagrams expressing the factorizations
$$T = R\,\mathrm{pr}\colon\ F \xrightarrow{\ \mathrm{pr}\ } D \xrightarrow{\ R\ } G, \qquad T^* = \mathrm{pr}^* R^*\colon\ G^* \xrightarrow{\ R^*\ } D^* \xrightarrow{\ \mathrm{pr}^*\ } F^*,$$
where $D := F/\operatorname{Ker}(T)$, and $R$ is uniquely defined by the commutativity condition.
(i) $\Longrightarrow$ (ii). Since $TS = 0$ implies $S^*T^* = 0$, it is sufficient to show that every $f \in \operatorname{Ker}(S^*)$ can be represented in the form $\mathrm{pr}^* R^* g$; $g \in G^*$. Clearly, $f$ vanishes on $\operatorname{Im}(S)$, hence, by density, on $\operatorname{Ker}(T)$ as well. Therefore, $f = \mathrm{pr}^* h$ for some $h \in D^*$. Since $\operatorname{Im}(T)$ is closed, $R$ is topologically injective. Therefore $R^*$ is surjective (Exercise 7(i)). The rest is clear.
(ii) $\Longrightarrow$ (i). If $\operatorname{Im}(S)$ is not dense in $\operatorname{Ker}(T)$, the Hahn-Banach theorem provides $f \in F^*$ vanishing on $\operatorname{Im}(S)$ and not vanishing on $\operatorname{Ker}(T)$. But the first condition means precisely that $f \in \operatorname{Ker}(S^*)$, and the second means that $f \notin \operatorname{Im}(T^*)$. If $\operatorname{Im}(T)$ is not closed, $R^*$ is not surjective (Exercise 7(i)). If we take $h \in D^* \setminus \operatorname{Im}(R^*)$, we see that $f := \mathrm{pr}^* h$ belongs to $\operatorname{Ker}(S^*)$, but not to $\operatorname{Im}(T^*)$.
Theorem 1 allows us to do quickly the following exercise.
Exercise 10. An operator $I\colon E \to F$ is a topological or isometric isomorphism $\Longleftrightarrow$ the same is true for $I^*\colon F^* \to E^*$.
Hint. Consider the sequence
$$\cdots \longleftarrow 0 \longleftarrow F \xleftarrow{\ I\ } E \longleftarrow 0 \longleftarrow \cdots$$
and, in the "isometric" part, apply Proposition 2.
In what follows, we shall generalize this observation, again with the help of Theorem 1; see Exercises 3.5.7-3.5.8. You can learn more about applications of this theorem from [47] and references therein.
The following several lines are addressed to the reader who has read Section 1.7 and wants to learn more about its subject.
One of the fundamental facts of quantum functional analysis is that the morphism functor defined on the category QNor of quantized normed spaces can be "reorganized" in such a way that it takes values in the same category (similarly to the morphism functor in Ban and in other categories of functional analysis mentioned at the beginning of this section). Certainly, the main thing here is that for any quantum spaces $E$ and $F$ the set $CB(E, F)$ (i.e., in the general categorical notation, $h_{\mathrm{QNor}}(E, F)$) has a canonical structure of a quantum space. This structure is by no means obvious, and for a long time even its existence was doubtful. But in the early 1990s Effros and Ruan, and independently Blecher and Paulsen, found the desired quantization. Here is their recipe.
Recall that for each $n \in \mathbb{N}$ the space $M_n(CB(E, F))$ must be endowed with a norm in such a way that the family of norms becomes a quantization, i.e., satisfies the conditions indicated in Definition 1.7.1. For that purpose, we consider for the same $n$ the normed space $CB(E, M_n(F))$, where the norm of the $m$th floor $M_m(M_n(F))$ is defined by the obvious identification of the latter space with $M_{mn}(F)$.
Exercise 11*. (i) For each $n$ there exists a linear isomorphism between the spaces $M_n(CB(E, F))$ and $CB(E, M_n(F))$ that takes every matrix $T = (T_{kl})$ to the operator $T\colon E \to M_n(F)$ acting by the formula $x \mapsto (T_{kl}(x))$.
(ii) The sequence of norms $\|\cdot\|_n$ in $M_n(CB(E, F))$; $n \in \mathbb{N}$, defined by the formula $\|T\|_n := \|T\|_{cb}$, is a quantization of the space $CB(E, F)$.
(iii) For each $E, F, G \in \mathrm{QNor}$ and every completely bounded operator $\varphi\colon F \to G$, the mappings
$$CB(E, \varphi)\colon CB(E, F) \to CB(E, G)\colon \psi \mapsto \varphi\psi, \qquad CB(\varphi, E)\colon CB(G, E) \to CB(F, E)\colon \psi \mapsto \psi\varphi$$
are completely bounded operators.
These facts imply that every quantum space $E$ generates two functors, a covariant one and a contravariant one, from QNor to QNor, namely
$$CB(E, ?)\colon F \mapsto CB(E, F); \qquad CB(?, E)\colon F \mapsto CB(F, E).$$
As in classical functional analysis, the most important of these functors is the contravariant functor $(*) := CB(?, \mathbb{C})$, the quantum version of the star functor. When studying this functor, and first of all, when studying the structure of quantum dual spaces, we come across some phenomena that do not have "classical" analogues. For instance, there is a symbolic equality $(l_2^{c})^* = l_2^{r}$, understood in the sense that the quantum space dual to $l_2$ with the column quantization coincides, up to a completely isometric (i.e., isometric on all floors) isomorphism, with the same $l_2$, but endowed with the row quantization. In the same sense $(l_2^{r})^* = l_2^{c}$, $(l_2^{\min})^* = l_2^{\max}$ and $(l_2^{\max})^* = l_2^{\min}$.
Actually, for all examples of quantum Hilbert spaces known up to 1996, their dual quantum spaces turned out to be different from the initial spaces. All the more amazing was Pisier's discovery: he showed that among all quantum Hilbert spaces there is exactly one behaving in complete correspondence with the "classical" Riesz theorem: for this space $H$ its canonical bijection onto $H^*$ (see Theorem 3.2) is a completely isometric conjugate-linear operator. Thus, this quantum space plays in quantum functional analysis the same role as the usual Hilbert space does in classical functional analysis. The construction of this "Pisier-Hilbert space" is rather complicated, so we shall not describe it; see, e.g., [36, 3.5].
* * *
In contrast with the situation in the category Ban, the set $h_{\mathcal{K}}(E, F)$ for $\mathcal{K} = \mathrm{Ban}_1$ (being just the unit ball in $\mathcal{B}(E, F)$) is not even a linear space. Therefore, the morphism functors defined on this category are considered with values in a category with "poor structure", mostly in Set. On the other hand, $\mathrm{Ban}_1$ also has some advantages. As we shall see a little later, they come to the forefront in problems connected with (co)products.
Among other functors defined on the categories Ban and $\mathrm{Ban}_1$, forgetful functors deserve special attention. They act to one of the categories Lin, HTop, Met, or Set (cf. Section 0.7). Note in addition the functor $\bigcirc\colon \mathrm{Ban}_1 \to \mathrm{Set}$, assigning to every Banach space its unit ball, and to every contraction operator $\varphi\colon E \to F$ its birestriction $\bigcirc(\varphi)\colon B_E \to B_F$. The use of this functor will be explained to advanced readers at the end of this section.
A series of interesting functors act in the opposite direction. They construct Banach spaces from sets, topological spaces, etc.
Example 3. Let us assign to each set $X$ the Banach space $l_\infty(X)$ (see Section 1.1) and to each mapping $\omega\colon X \to Y$ the operator $l_\infty(\omega)\colon l_\infty(Y) \to l_\infty(X)$ taking a function $x \in l_\infty(Y)$ to the function $y \in l_\infty(X)$: $y(t) := x(\omega t)$. Obviously, we obtain a contravariant functor $l_\infty(?)\colon \mathrm{Set} \to \mathrm{Ban}_1$ (check the necessary properties).
Similarly, there are contravariant functors defined on various subcategories of Top and taking values in $\mathrm{Ban}_1$. (Suggest one of them and fill in the details.) However, later (see Example 3.1.1) we will need to define accurately the most important of these functors. Further, in the theory of operator algebras it is useful to consider the functor from Meas to $\mathrm{Ban}_1$ assigning to a measure space $(X, \mu)$ the Banach space $L_\infty(X, \mu)$. (Give an accurate definition.)
We have already discussed the structure of the operators that are coretractions and retractions in the categories of Banach and Hilbert spaces in previous sections. We now say a few words about other types of morphisms in these categories.
Proposition 3. In the categories Ban, $\mathrm{Ban}_1$, Hil, and $\mathrm{Hil}_1$, the monomorphisms are precisely the injective operators, and the epimorphisms are the operators with dense image.
Proof. The proof of similar results in Nor and $\mathrm{Nor}_1$ (Proposition 1.5.11) works without changes in these four categories as well. One should only take into account that the passage to the quotient space does not take us out of the class of Banach (Proposition 1.9) and Hilbert (Corollary 3.1) spaces. ∎
* * *
In addition, we suggest the following exercise.
Exercise 12*.
(i) Let $T\colon E \to F$ be a morphism in Ban or in Hil. Then $T$ is an extreme monomorphism $\Longleftrightarrow$ $T$ is an injective operator with closed image; $T$ is an extreme epimorphism $\Longleftrightarrow$ $T$ is a surjective operator.
(ii) Let $T\colon E \to F$ be a morphism in $\mathrm{Ban}_1$ or in $\mathrm{Hil}_1$. Then $T$ is an extreme monomorphism $\Longleftrightarrow$ $T$ is an isometric operator; $T$ is an extreme epimorphism $\Longleftrightarrow$ $T$ is a coisometric operator.
As a corollary, we obtain that in Hil and in $\mathrm{Hil}_1$ the class of extreme monomorphisms coincides with the class of coretractions, and the class of extreme epimorphisms coincides with the class of retractions.
Hint. Combine Exercise 1.5.8 with the open mapping principle.
* * *
Now let us turn to (co)products. Consider first the case of a finite family of Banach spaces, say, $E_1, \dots, E_n$.
Certainly, the Banach direct product and the Banach direct sum (see Section 1) of this family have the same underlying linear space, which coincides with both their Cartesian product and their direct sum as linear spaces. Therefore, we can use any corresponding notation, $\times_{k=1}^n E_k$ or $\bigoplus_{k=1}^n E_k$, and also the short notation $E$. Evidently, elements of $E$ can be regarded as rows $x = (x_1, \dots, x_n)$; $x_k \in E_k$, where the projections $\pi_k\colon E \to E_k$ act as $x \mapsto x_k$, and the injections $i_k\colon E_k \to E$ act as $y \mapsto (0, \dots, 0, y, 0, \dots)$ ($y$ is in the $k$th place). As usual, we treat every $E_k$ as a subspace in $E$, identifying it with $\operatorname{Im}(i_k)$.
Here the norm $\|\cdot\|_{\Pi}$ of the Banach direct product and the norm $\|\cdot\|_{\amalg}$ of the Banach direct sum (defined for an arbitrary family of spaces in Section 1.1) satisfy the obvious estimate $\|\cdot\|_{\Pi} \le \|\cdot\|_{\amalg} \le n\,\|\cdot\|_{\Pi}$, and thus, they are equivalent. Each norm in $E$ equivalent to these norms is called admissible. Certainly, $E$ is a Banach space with respect to an admissible norm.
Exercise 13. Every norm in $E$ that turns this space into a Banach space and coincides on each $E_k$ with the initial norm in the latter space is admissible.
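The estimate between the norm of the Banach direct product (the largest component norm) and the norm of the Banach direct sum (the sum of the component norms) can be tried out numerically. The sketch below assumes, purely for illustration, that every $E_k$ is the real line with the absolute value as its norm; the function names and the sample rows are ours.

```python
def norm_product(x):
    """Norm of the Banach direct product: the largest of the component norms."""
    return max(abs(c) for c in x)


def norm_sum(x):
    """Norm of the Banach direct sum: the sum of the component norms."""
    return sum(abs(c) for c in x)


def estimate_holds(x):
    """Check  ||x||_product <= ||x||_sum <= n * ||x||_product  for a row x of length n."""
    n = len(x)
    return norm_product(x) <= norm_sum(x) <= n * norm_product(x)


# Hypothetical sample rows with n = 3 components.
samples = [(3, -4, 1), (0, 0, 0), (1, 0, 0), (1, 1, 1), (-2, 0, 7)]
print(all(estimate_holds(x) for x in samples))
```

The rows $(1, 0, 0)$ and $(1, 1, 1)$ show that neither constant in the estimate can be improved: the first gives equality on the left, the second gives equality on the right.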
Proposition 4. Every finite family $E_1, \dots, E_n$ of Banach spaces has both the product and the coproduct in the category Ban. Namely, if $\|\cdot\|$ is an admissible norm, then $(E, \|\cdot\|)$, together with the projections $\pi_k$; $k = 1, \dots, n$, is the product, and the same $(E, \|\cdot\|)$, together with the injections $i_k$; $k = 1, \dots, n$, is the coproduct.
Proof. We restrict ourselves to coproducts, leaving the parallel case of products to the reader. Let $F$ be an arbitrary Banach space together with bounded operators (i.e., morphisms in Ban) $\varphi_k\colon E_k \to F$; $k = 1, \dots, n$.
Our goal is to show that there exists a uniquely defined bounded operator $\psi\colon E \to F$ such that for each $k = 1, \dots, n$ the diagram formed by $i_k\colon E_k \to E$, $\psi\colon E \to F$ and $\varphi_k\colon E_k \to F$ is commutative, i.e., $\psi i_k = \varphi_k$. By the properties of the coproduct in Lin, there exists a unique linear operator $\psi$ with this property. Hence it remains to show that it is bounded. Take $x \in E$. Obviously, it can be uniquely represented in the form $\sum_{k=1}^n i_k(y_k)$; $y_k \in E_k$. Therefore, taking into account that the norm $\|\cdot\|$ in $E$ is admissible, for some $C > 0$ and $K := \max\{\|\varphi_k\|;\ k = 1, \dots, n\}$ we have
$$\|\psi(x)\| = \Big\|\sum_{k=1}^n \varphi_k(y_k)\Big\| \le \sum_{k=1}^n \|\varphi_k(y_k)\| \le K \sum_{k=1}^n \|y_k\| \le KC\,\|x\|.$$
The rest is clear. ∎
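The construction in the proof can be followed on a toy example. The sketch below assumes that each $E_k$ is the real line, that $F = \mathbb{R}^2$ carries the maximum norm, and that $E$ carries the norm of the Banach direct sum (so that one may take $C = 1$); the operators $\varphi_k$ are given by hypothetical sample vectors. All names are ours. The code builds $\psi$ from the $\varphi_k$ and checks the bound $\|\psi(x)\| \le K\|x\|$ on one row.

```python
def norm_f(v):
    """Norm in F = R^2 (assumed here to be the maximum norm)."""
    return max(abs(c) for c in v)


def make_phi(v):
    """Operator phi: R -> F, t |-> t * v; its norm equals norm_f(v)."""
    return lambda t: tuple(t * c for c in v)


def injection(k, t, n):
    """i_k: E_k -> E, placing t in the k-th place (counting from 0)."""
    return tuple(t if j == k else 0 for j in range(n))


def induced(phis):
    """The unique linear psi: E -> F with psi(i_k(t)) = phi_k(t) for all k."""
    def psi(x):
        images = [phi(y) for phi, y in zip(phis, x)]
        return tuple(sum(cs) for cs in zip(*images))
    return psi


vectors = [(3, -4), (1, 2), (0, 5)]      # hypothetical sample data
phis = [make_phi(v) for v in vectors]
psi = induced(phis)
K = max(norm_f(v) for v in vectors)      # K = max ||phi_k||

x = (1, -2, 3)
norm_x_sum = sum(abs(y) for y in x)      # norm of x in the Banach direct sum
print(psi(x), norm_f(psi(x)) <= K * norm_x_sum)
```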
Exercise 14. Every finite family of Hilbert spaces has the product and the coproduct in the category Hil.
Hint. The Hilbert direct sum (see Section 1) of the given spaces, considered with the corresponding projections, is their product, and considered with the corresponding embeddings, is their coproduct.
From Proposition 4.3 we immediately have
Corollary 1. Suppose a Banach space $E$ is decomposed into a direct sum of closed subspaces $E_1$ and $E_2$. Then, together with the corresponding natural embeddings, $E$ is the coproduct of $E_1$ and $E_2$ in Ban.
Trying to extend the results about finite (co)products to infinite families of objects, we come to a lamentable result.
Exercise 15* (cf. Exercises 0.6.7 and 0.6.8 about a somewhat different situation in Met). Any infinite family of non-zero objects in Ban has neither
product nor coproduct in this category. The same is true if we replace Ban by Hil.
Hint. If, say, the family $E_n$; $n \in \mathbb{N}$ has a hypothetical product in Ban, then the desired contradiction follows from the choice of $F = \mathbb{C}$ and operators $\varphi_n\colon \mathbb{C} \to E_n$ with sufficiently rapidly growing norms. In the case of a hypothetical coproduct we can put $F = \bigoplus_p\{E_\nu : \nu \in \Lambda\}$ for each $p \in [1, \infty]$ when considering Ban and $p = 2$ for Hil, and after that take the multiples of the corresponding embeddings, again with rapidly growing norms, as $\varphi_n\colon E_n \to F$.
But the situation sharply improves if we pass from Ban to $\mathrm{Ban}_1$ (i.e., forbid the operators to have large norms).
Exercise 16. Every family of Banach spaces $E_\nu$; $\nu \in \Lambda$ has both the product and the coproduct in the category $\mathrm{Ban}_1$. The product of this family is the Banach direct product $(\bigoplus_\infty\{E_\nu : \nu \in \Lambda\}, \|\cdot\|_{\Pi})$ with the projections $\pi_\mu\colon \bigoplus_\infty\{E_\nu : \nu \in \Lambda\} \to E_\mu\colon f \mapsto f(\mu)$, and the coproduct is the Banach direct sum $(\bigoplus_1\{E_\nu : \nu \in \Lambda\}, \|\cdot\|_{\amalg})$ with the embeddings $i_\mu\colon E_\mu \to \bigoplus_1\{E_\nu : \nu \in \Lambda\}$ taking $x \in E_\mu$ to the mapping that sends $\mu$ to $x$, and the other elements of the index set $\Lambda$ to zero.
Hint. The proofs of both results are very similar. We restrict ourselves to the first. Suppose we have $F$ and (contraction!) operators $\varphi_\nu\colon F \to E_\nu$; $\nu \in \Lambda$. The pair consisting of the Cartesian product $E_0$ of our spaces and the standard family of projections is the categorical product of our family in Lin (Proposition 0.6 ). Hence, there exists a unique linear operator $\psi_0\colon F \to E_0$ making the corresponding diagram commutative for each $\nu \in \Lambda$. It is easy to see that the corestriction $\psi$ of $\psi_0$ to the Banach direct product is a contraction operator (i.e., a morphism in $\mathrm{Ban}_1$).
The rest of the section is addressed to the reader who wants to know more about the categories under consideration. First, the following exercise contrasts with the previous one.
Exercise 17*. In the category $\mathrm{Hil}_1$ (even) a family of two non-zero spaces has neither product nor coproduct.
Hint. Suppose $H_k$ are our spaces and $x_k \in H_k$; $\|x_k\| = 1$ ($k = 1, 2$). To obtain a contradiction, take $Y := \mathbb{C}$ and $\varphi_k\colon 1 \mapsto \pm x_k$ for products, and $\varphi_k\colon y \mapsto (y, x_k)$ for coproducts (see Definitions 0.6.1 and 0.6.2′). It turns out that because of the specific features of "Hilbert" geometry (in particular, the parallelogram identity) the required operator $\psi$ cannot be a contraction.
We now recall the usual approach to the study of a category formed by sets with some additional structure. Usually, such a category is endowed with the forgetful functor and becomes a concrete category in the sense of Definition 0.7.2. Certainly, the categories of functional analysis are no exception. Here, in particular, one can speak about concrete categories $(\mathcal{K}, \square)$, where $\mathcal{K}$ is any of the categories Ban, $\mathrm{Ban}_1$, Hil, $\mathrm{Hil}_1$, and $\square$ is the forgetful functor to Set. As is easy to see, the concrete categories $\mathrm{Ban}_1$ and $\mathrm{Hil}_1$ are not
balanced, but the two remaining categories are: for Ban, if you look at it attentively, you see that this fact is simply an equivalent formulation of the Banach theorem. (Show that the wider concrete category $(\mathrm{Nor}, \square)$ is already not balanced; cf. Example 1.4.2.)
In conclusion, consider the question of categorical bases and freedom. First, we describe free objects in $(\mathrm{Ban}, \square)$ and $(\mathrm{Hil}, \square)$.
Exercise 18. In the indicated two concrete categories the free objects are precisely the finite-dimensional spaces, and a (categorical) basis of a finite-dimensional space is just a linear basis.
Hint. The last statement follows from Theorem 2.1(iii). Now let $E$ be an infinite-dimensional space, and $S$ a basis in $E$. Then $S$ cannot be finite, because otherwise, taking $F := E/\operatorname{span}(S)$, we obtain operators $\mathrm{pr}, 0\colon E \to F$ that are different morphisms taking $S$ to $0 \in F$. And if $S$ contains a sequence $x_n$; $n \in \mathbb{N}$, then there is no bounded operator on $E$ taking $x_n$ to $n x_n$.
Exercise 19. In the concrete categories $(\mathrm{Ban}_1, \square)$ and $(\mathrm{Hil}_1, \square)$ no space has a (categorical) basis.
Hint. If x is a point of a hypothetical basis S in E, then every mapping
However, the situation with bases sharply improves if we associate with $\mathrm{Ban}_1$ another concrete category, namely, $(\mathrm{Ban}_1, \bigcirc)$, where $\bigcirc\colon \mathrm{Ban}_1 \to \mathrm{Set}$ is the "unit ball" functor. This category, certainly, is balanced. But its main advantage is the following result.
Exercise 20. A Banach space is free in the sense of the concrete category $(\mathrm{Ban}_1, \bigcirc)$ $\Longleftrightarrow$ it is isometrically isomorphic (i.e., isomorphic in $\mathrm{Ban}_1$) to the space $l_1(X)$ for some $X$. A basis for $l_1(X)$ is the set of functions
$$\delta_s(t) = \begin{cases} 1, & t = s, \\ 0, & t \ne s; \end{cases} \qquad s \in X.$$
Hint. Since $\mathbb{C}$ has a singleton basis in the concrete category in question, the result follows from Exercise 0.7.4.
We leave it to the reader to recover the details of the construction of the corresponding freedom functor $\mathcal{F}\colon \mathrm{Set} \to \mathrm{Ban}_1\colon X \mapsto l_1(X)$.
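The freedom of $l_1(X)$ with respect to the "unit ball" functor means that every mapping of $X$ into the unit ball of a Banach space $F$ extends to a contraction operator on $l_1(X)$. The sketch below illustrates this for a finite set $X$, real scalars, and $F = \mathbb{R}^2$ with the maximum norm; these choices, the helper names and the sample data are our assumptions, made only for illustration.

```python
from fractions import Fraction


def norm_f(v):
    """Norm in F = R^2 (assumed here to be the maximum norm)."""
    return max(abs(c) for c in v)


def extend(f):
    """Extend f: X -> B_F to psi: l_1(X) -> F, psi(c) = sum over s of c(s) * f(s)."""
    def psi(c):
        dim = len(next(iter(f.values())))
        return tuple(sum(c.get(s, 0) * f[s][j] for s in f) for j in range(dim))
    return psi


def delta(s):
    """The basis function delta_s, stored as a finitely supported dict."""
    return {s: 1}


# Hypothetical sample data: X = {"a", "b", "c"}; all values of f lie in the unit ball of F.
f = {"a": (1, 0), "b": (Fraction(1, 2), -1), "c": (0, Fraction(-1, 3))}
psi = extend(f)

c = {"a": 2, "b": -3, "c": 6}
norm_c = sum(abs(v) for v in c.values())   # the l_1-norm of c
print(psi(c), norm_f(psi(c)) <= norm_c)
```

By construction `psi(delta(s))` returns `f[s]`, and the printed comparison reflects the contraction estimate $\|\psi(c)\| \le \sum_s |c(s)|\,\|f(s)\| \le \|c\|_1$.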
6. Completion
We have seen how nice and attractive are Banach spaces. But imagine that someone indicates a normed space and says that it is Banach, but it turns out that the space is not complete. Do not worry: this space can be made complete by adding new points, in the same way the real numbers are produced from the rational numbers following Cantor. We first describe what we would like to expect from our hypothetical construction. Until further notice, E is a normed space.
Proposition 1. Let $(\overline{E}, i)$ be a pair consisting of a Banach space $\overline{E}$ and a contraction operator $i\colon E \to \overline{E}$. The following properties are equivalent:
(i) (Universal property) For every pair $(F, \varphi)$ of a Banach space $F$ and a contraction operator $\varphi\colon E \to F$, there exists a unique contraction operator $\psi\colon \overline{E} \to F$ such that the diagram formed by $i\colon E \to \overline{E}$, $\varphi\colon E \to F$ and $\psi\colon \overline{E} \to F$ is commutative, i.e., $\psi i = \varphi$;
(ii) $i$ is an isometric operator, and its image is dense in $\overline{E}$.
Proof. (i) $\Longrightarrow$ (ii). Take $x \in E$. By Theorem 1.5.3, there exists a functional $f\colon E \to \mathbb{C}$ of norm 1 such that $f(x) = \|x\|$. Since $f$ is a contraction operator to a Banach space, by the assumption, there exists a contraction operator $\psi\colon \overline{E} \to \mathbb{C}$ making the above diagram with $\mathbb{C}$ as $F$ and $f$ as $\varphi$ commutative. In particular, $\psi i(x) = f(x)$, hence $\|i(x)\| \ge \|\psi i(x)\| = \|x\|$. Since $i$ is a contraction operator and $x$ is arbitrary, $i$ is an isometric operator.
Now let $E_1$ be the closure of the image of $i$. Put $F := \overline{E}/E_1$ and take the zero operator as $\varphi\colon E \to F$. Then the indicated diagram is commutative if we take for $\psi$ the natural projection of $\overline{E}$ onto $F$. But it is commutative, in particular, for $\psi := 0$. By assumption, such $\psi$ is unique. Hence the natural projection of $\overline{E}$ onto $F$ is the zero operator, and this means that $\overline{E} = E_1$.
(ii) $\Longrightarrow$ (i). Put $E_0 := \operatorname{Im}(i)$ and for a given $\varphi\colon E \to F$ consider the operator $\psi_0\colon E_0 \to F$ well defined (because $i$ is injective) by the rule $\psi_0(i(x)) := \varphi(x)$. Since $i$ preserves norms, $\psi_0$ is a contraction operator just like $\varphi$. Evidently, a contraction operator $\psi\colon \overline{E} \to F$ makes the above-indicated diagram commutative $\Longleftrightarrow$ $\psi$ is an extension of $\psi_0$. But by assumption, $E_0$ is dense in $\overline{E}$. Therefore, the existence and uniqueness of such an operator follow from the extension-by-continuity principle (Theorem 1.2). ∎
Definition 1. A pair $(\overline{E}, i)$ that has the equivalent properties indicated in Proposition 1 is called the completion of the normed space $E$.
Remark. Sometimes, we will briefly say "the completion of the space $E$", having in mind just the Banach space $\overline{E}$. Let us emphasize, however, that
this is just a liberty of speech: the isometric operator $i$ is an integral part of the definition of completion.
If the completion $(\overline{E}, i)$ is given, then, identifying every $x \in E$ with $i(x)$, we can view $E$ as a dense subspace in $\overline{E}$; this is often (although not always) useful. (Psychologically, there must be nothing new for us here: this is how we once learned to view rational numbers as a special case of reals, and still earlier, integers as a special case of rational numbers.)
Warning. In old textbooks the sentence "the completion of $E$ is a complete space containing $E$ as a dense subset" is used as a precise definition of completion. It follows from what we have said earlier that our point of view is somewhat different.
In the universal property of the completion the contraction operators were considered; however, there is an analogue involving arbitrary bounded operators.
Proposition 2. Let $(\overline{E}, i)$ be a completion of the space $E$. Then for each Banach space $F$ and each bounded operator $\varphi\colon E \to F$ there is a unique bounded operator $\psi\colon \overline{E} \to F$ such that the diagram in Proposition 1(i) is commutative. In addition, $\|\psi\| = \|\varphi\|$.
Proof. This evidently follows from the extension-by-continuity principle (Theorem 1.2). ∎
Here are several classical examples of completions. If $P[a, b]$ is the space of polynomials on an interval endowed with the uniform norm, then its completion is the pair $(C[a, b], \mathrm{in}\colon P[a, b] \to C[a, b])$. The completion of the space $C[a, b]$ considered with the (non-uniform) norm $\|x\|_1 := \int_a^b |x(t)|\,dt$ is $(L_1[a, b], i\colon C[a, b] \to L_1[a, b])$, where $i$ assigns to every continuous function its coset in $L_1[a, b]$. (As you see, it is not always convenient to require that $E$ is a part of $\overline{E}$.) The completion of $c_{00}$ considered with the norm $\|\cdot\|_p$ is $(l_p, \mathrm{in})$ for $1 \le p < \infty$, and $(c_0, \mathrm{in})$ for $p = \infty$. In all these examples we, of course, give just one, apparently the simplest, of the possible completions.
Note that if $E$ itself is a Banach space, then one of its completions is $(E, 1_E)$. At the same time, all completions of such $E$, obviously, are defined as pairs $(F, i\colon E \to F)$ where $i$ is an arbitrary isometric isomorphism between $E$ and some Banach space.
However, various completions of the same space differ only "in form, but not in substance".
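The first of the examples above can be made tangible: the Taylor polynomials $p_n(t) = \sum_{k \le n} t^k/k!$ form a fundamental sequence in $P[0, 1]$ with respect to the uniform norm, although their uniform limit, the exponential function, is not a polynomial. The sketch below measures the distance between two such polynomials on an equally spaced grid. A grid maximum is in general only a lower bound for the uniform norm; here, however, the difference $p_n - p_m$ has nonnegative coefficients, so its maximum on $[0, 1]$ is attained at the grid point $t = 1$. The function names and the choice of the interval $[0, 1]$ are ours.

```python
from fractions import Fraction
from math import factorial


def taylor_exp(n):
    """Coefficients of the polynomial p_n(t) = sum of t^k / k! for k = 0..n."""
    return [Fraction(1, factorial(k)) for k in range(n + 1)]


def evaluate(coeffs, t):
    """Horner evaluation of a polynomial with the given coefficients."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * t + c
    return result


def grid_distance(p, q, points=101):
    """Maximum of |p - q| over an equally spaced grid in [0, 1]."""
    grid = [Fraction(j, points - 1) for j in range(points)]
    return max(abs(evaluate(p, t) - evaluate(q, t)) for t in grid)


for m, n in [(2, 4), (5, 10), (10, 20)]:
    print(m, n, float(grid_distance(taylor_exp(m), taylor_exp(n))))
```

The printed distances decrease rapidly as the indices grow, which is what being fundamental means for this sequence.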
Theorem 1 (Uniqueness of completion). Suppose $(\overline{E}_1, i_1)$ and $(\overline{E}_2, i_2)$ are two completions of a normed space $E$. Then there exists a unique isometric isomorphism $I\colon \overline{E}_1 \to \overline{E}_2$ such that the diagram formed by $i_1$, $i_2$ and $I$ is commutative, i.e., $I i_1 = i_2$.
(Thus, $\overline{E}_1$ and $\overline{E}_2$ are not only isometrically isomorphic, but this isometric isomorphism can be chosen to agree with the operators $i_1$ and $i_2$.)
Proof. (Cf. the proof of Theorem 0.6.1.) We introduce the category $\mathrm{Ban}_E$ (some complication of Ban). Its objects are all pairs $(F, \varphi)$ consisting of a Banach space $F$ and a contraction operator $\varphi\colon E \to F$. The morphisms between objects $(F_1, \varphi_1)$ and $(F_2, \varphi_2)$ of this category are contraction operators $\psi\colon F_1 \to F_2$ such that the diagram formed by $\varphi_1$, $\varphi_2$ and $\psi$ is commutative, i.e., $\psi\varphi_1 = \varphi_2$. Composition of morphisms in $\mathrm{Ban}_E$ is their composition in Ban (i.e., the usual composition of operators). The axioms of a category are easily verified.
From the universal property of completions it evidently follows that the pair $(\overline{E}, i)$ is a completion of $E$ $\Longleftrightarrow$ it is an initial object in $\mathrm{Ban}_E$. Hence, from Theorem 0.4.1 we see that the pairs $(\overline{E}_1, i_1)$ and $(\overline{E}_2, i_2)$ are isomorphic as objects of $\mathrm{Ban}_E$. Certainly, the corresponding morphism $I$ is the required isometric isomorphism. ∎
Now we pass to the question of the existence of completions. Unlike the question of uniqueness, there is no general categorical scheme here, and the specific features of the considered construction come to the foreground.
Theorem 2 (On the existence of completion). Every normed space $E$ has a completion.
The first proof. Consider the linear space $\mathcal{E}$ of all fundamental sequences $x = (x_1, x_2, \dots)$ of vectors from $E$ with coordinatewise operations. Endow it with the prenorm $\|x\|_0 = \lim_{n \to \infty} \|x_n\|$, where $\|\cdot\|$ is the norm in $E$ (certainly, such a prenorm is well defined). Put $\mathcal{E}_0 := \{x \in \mathcal{E} : \|x\|_0 = 0\}$ and consider the normed space $\overline{E} := \mathcal{E}/\mathcal{E}_0$ (see Proposition 1.1.3). Further, consider the operator $j\colon E \to \mathcal{E}$ assigning to a vector $x$ the constant sequence $x' := (x, x, \dots)$ and put $i := (\mathrm{pr})j\colon E \to \overline{E}$, where $\mathrm{pr}\colon \mathcal{E} \to \overline{E}$ is the natural projection. Our goal is to show that the pair $(\overline{E}, i)$ is a completion of the space $E$.
First, since $j$ and $\mathrm{pr}$ are, obviously, isometric operators, the same is true for $i$. Further, take an arbitrary $y \in \overline{E}$; then $y = \mathrm{pr}(x)$ for some $x = (x_1, x_2, \dots) \in \mathcal{E}$. Since $x$ is fundamental, it easily follows that the
elements (constant sequences) $x'_n := j(x_n)$ tend to $x$ in $\mathcal{E}$ as $n \to \infty$, hence the elements $i(x_n) = \mathrm{pr}(x'_n)$ tend to $y$ in $\overline{E}$. Thus, $\operatorname{Im}(i)$ is dense in $\overline{E}$.
It remains to show that $\overline{E}$ is complete. Let $y_m$; $m = 1, 2, \dots$ be a fundamental sequence in $\overline{E}$, and $y_m = \mathrm{pr}(x^m)$; $x^m = (x^m_1, x^m_2, \dots) \in \mathcal{E}$. For each $m = 1, 2, \dots$ the sequence $x^m$ is fundamental. Therefore, there exists a term of this sequence, which we denote by $x_m \in E$, such that $\|x_m - x^m_n\| < \frac{1}{m}$ for all sufficiently large $n$, and thus $\|x'_m - x^m\|_0 \le \frac{1}{m}$ (in $\mathcal{E}$). In particular, we see that the sequences $x'_m$ and $x^m$ in $\mathcal{E}$ either converge or do not converge simultaneously, and if they converge, then they have the same limits.
Now take the sequence $x := (x_1, x_2, \dots)$ (formed by the chosen vectors of $E$). From the relations
$$\|x_l - x_m\| = \|x'_l - x'_m\|_0 \le \|x'_l - x^l\|_0 + \|x^l - x^m\|_0 + \|x^m - x'_m\|_0$$
for all $l, m = 1, 2, \dots$, we can easily see that this sequence is fundamental in $E$, i.e., $x \in \mathcal{E}$. This evidently implies that the sequence $x'_m$ (of elements of $\mathcal{E}$) tends to $x$, and thus, as we noted before, the sequence $x^m$ has the same limit. But then the given sequence $y_m = \mathrm{pr}(x^m) \in \overline{E}$ also converges, namely, to the element $\mathrm{pr}(x) \in \overline{E}$. ∎
The second proof. Consider the canonical embedding $\alpha\colon E \to E^{**}$ and denote by $\overline{E}$ the closure of the image of $\alpha$ in $E^{**}$. Since $\alpha$ is an isometric operator, Propositions 1.7 and 1.3 show that the pair $(\overline{E}, \alpha|^{\overline{E}})$, where $\alpha|^{\overline{E}}$ is the corestriction of $\alpha$ to $\overline{E}$, is the required completion. ∎
Remark. The second proof is perhaps not so instructive as the first. Of course, it is much shorter, but this is deceptive: it relies on a very powerful tool, the Hahn-Banach theorem; as we remember, the proof of the Hahn-Banach theorem requires serious work.
Corollary 1. Every non-complete normed space is a dense subspace of some Banach space.
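The first proof of Theorem 2 follows Cantor's construction of the reals from the rationals, and the analogy can be tried out directly. The sketch below works with the rational numbers and the absolute value (an analogy only: the rationals are not a normed space over $\mathbb{R}$ or $\mathbb{C}$) and builds two fundamental sequences of Newton iterates for $x^2 = 2$ with different starting points. Their difference tends to zero, so in the language of the proof they differ by an element of $\mathcal{E}_0$ and define the same point of the completion. The function name and the starting points are our choices.

```python
from fractions import Fraction


def newton_sqrt2(start, terms):
    """First `terms` Newton iterates for x^2 = 2, all of them rational."""
    x = Fraction(start)
    sequence = [x]
    for _ in range(terms - 1):
        x = (x + 2 / x) / 2
        sequence.append(x)
    return sequence


x = newton_sqrt2(1, 8)
y = newton_sqrt2(3, 8)

# The differences |x_n - y_n| shrink: the two fundamental sequences
# represent the same coset, although no rational number is their limit.
for n in range(8):
    print(n, float(abs(x[n] - y[n])))
```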
We now discuss with the advanced reader the category theory interpretation of Theorem 2. The fact is that the question of the existence of completions, like a vast variety of questions of algebra and analysis, is a question of representability of some functor (cf. Definition 0.7.8).
Let $E$ be a normed space. Consider the covariant functor $\mathcal{F}\colon \mathrm{Ban}_1 \to \mathrm{Set}$ assigning to every Banach space $F$ the unit ball in the space $\mathcal{B}(E, F)$ and to every morphism
The construction of a completion is compatible with an inner product if the latter exists.
Proposition 3. Let $H$ be a near-Hilbert space. Then there exists a pair $(\overline{H}, i)$, where $\overline{H}$ is a Hilbert space, that is a completion of $H$ as a normed space. If $(\overline{H}_1, i_1)$ and $(\overline{H}_2, i_2)$ are two such pairs, then there exists a unique unitary isomorphism $U\colon \overline{H}_1 \to \overline{H}_2$ such that the diagram formed by $i_1$, $i_2$ and $U$ is commutative, i.e., $U i_1 = i_2$.
Proof. Let $(\overline{H}, i)$ be the completion of $H$ as a normed space. Our aim is to show that the norm in $\overline{H}$ is defined by some inner product.
Take $x', y' \in \overline{H}$ and consider sequences $x_n, y_n \in H$ such that $i(x_n)$ tends to $x'$, and $i(y_n)$ to $y'$. Then, as is easy to see (cf. the estimate in the proof of Proposition 1.2.4), the sequence $(x_n, y_n)$ is fundamental, and therefore tends to a certain number, which we shall denote by $(x', y')$. Evidently, this number does not depend on the choice of the sequences $x_n, y_n$ and has the properties of an inner product; in addition, for all $x, y \in H$ we have $(i(x), i(y)) = (x, y)$.
It remains to show that $\|x'\| = \sqrt{(x', x')}$ for each $x' \in \overline{H}$. If we take a sequence $x_n \in H$ such that $i(x_n)$ tends to $x'$, we see that the sequence $\|x_n\| = \|i(x_n)\|$ tends to $\|x'\|$ and the sequence $(x_n, x_n)$ tends to $(x', x')$, by the construction of the inner product in $\overline{H}$. Since in $H$ we have $\|x_n\| = \sqrt{(x_n, x_n)}$ for all $x_n$, the required equality holds. ∎
Exercise 2. Give another proof of this proposition using the result of Exercise 1.2.3.
Remark. We shall not need the notion of completion of a general metric space (generally speaking, without a linear structure), so we do not discuss it in detail. We only note that one of the equivalent definitions of the completion of a metric space $M$ is a pair $(\overline{M}, i)$ consisting of a complete metric space $\overline{M}$ and an isometric mapping $i\colon M \to \overline{M}$ with dense image; the reader can easily restore the definition in terms of a universal property. The proofs of the corresponding uniqueness and existence theorems repeat, up to minor obvious changes, the proofs of Theorems 1 and 2. However, the reader who has done Exercise 1.1.7 earlier can give a brief proof of the existence by considering an isometric mapping, say $i'$, of the given $M$ to $C_b(M)$ and taking as $\overline{M}$ the closure of $\operatorname{Im}(i')$. After this, the corestriction of $i'$ to $\overline{M}$ can be taken as $i$.
7. Algebraic and Banach tensor products
When the first standard courses of functional analysis were delivered in the 1950s, there were few people who had heard about Banach tensor products, and only a few papers were devoted to them. But now tensor products have made the step from algebra to functional analysis, and have taken a very noticeable place there. They are widely used in most areas of functional analysis, and some "vanguard" directions, such as the theory of operator algebras (together with the adjacent questions of quantum physics) and quantum functional analysis, are just inconceivable without tensor products (cf. [45], [46]). We are convinced that it is time to include elementary information on tensor products in functional analysis into university textbooks.
The general meaning of the construction of the tensor product, both in algebra and in analysis, is as follows: instead of bilinear operators (from a given class) it allows us to consider (just) linear operators, but defined on more complicated spaces than the initial ones. We are now ready to give a precise meaning to this vague sentence.
First, we consider a purely algebraic preliminary notion, the tensor product of linear spaces. Perhaps you were told about it in an algebra course, but it is likely that the presentation there was different from what we need here. Thus, we have to start from the very beginning. So, let $E$ and $F$ be two linear spaces.
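Definition 1. A pair $(\Theta, \theta)$, where $\Theta$ is a linear space and $\theta : E \times F \to \Theta$ a bilinear operator, is called the (algebraic) tensor product of $E$ and $F$ if for every linear space $G$ and every bilinear operator $\mathcal{R} : E \times F \to G$ there exists a unique linear operator $R : \Theta \to G$ such that the diagram
$$\begin{array}{ccc} E \times F & & \\ {\scriptstyle\theta}\big\downarrow & \searrow{\scriptstyle\mathcal{R}} & \\ \Theta & \xrightarrow{\;R\;} & G \end{array}$$
is commutative. The linear operator $R$ is said to be associated with the bilinear operator $\mathcal{R}$.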
The property in this definition is called the universal property (of the algebraic tensor product) .
Example 1. Suppose $E = \mathbb{C}^m$, $F = \mathbb{C}^n$. Consider the space $\mathbb{C}^{mn}$, which we will view as the linear space of $m \times n$ matrices with complex entries. Denote by $p^{kl}$ the elementary matrix with 1 in the $(k,l)$th place and zeroes everywhere else. Consider the bilinear operator $\theta : \mathbb{C}^m \times \mathbb{C}^n \to \mathbb{C}^{mn}$ uniquely defined by the rule that it takes the pair of unit vectors $(p^k, p^l)$ to $p^{kl}$. Then the pair $(\mathbb{C}^{mn}, \theta)$ is the tensor product of the spaces $\mathbb{C}^m$ and $\mathbb{C}^n$. Indeed, for every bilinear operator $\mathcal{R} : \mathbb{C}^m \times \mathbb{C}^n \to G$ the operator $R : \mathbb{C}^{mn} \to G$ taking an $m \times n$ matrix $(\lambda_{kl})$ to the vector $\sum_{k=1}^{m} \sum_{l=1}^{n} \lambda_{kl} \mathcal{R}(p^k, p^l)$ is the unique operator making the required diagram commutative. A generalization of this example is the following exercise.
Exercise 1. Let $M$ and $N$ be arbitrary sets, and $E$ and $F$ linear spaces of functions on $M$ and $N$, respectively. Denote by $L$ the linear span of functions on $M \times N$ of the form $x(s)y(t)$; $x \in E$, $y \in F$, $s \in M$, $t \in N$, and let $\theta : E \times F \to L$ be the bilinear operator taking the pair $(x \in E, y \in F)$ to the function $x(s)y(t) \in L$. Then the pair $(L, \theta)$ is the algebraic tensor product of $E$ and $F$.
Hint. The following observation is helpful: if the functions $x_1, \dots, x_n \in E$ are linearly independent, and not all the functions $y_1, \dots, y_n \in F$ are zero, then $\sum_{k=1}^{n} x_k(s) y_k(t) \ne 0$.
Theorem 1 (Uniqueness of tensor product). Suppose $(\Theta_1, \theta_1)$ and $(\Theta_2, \theta_2)$ are two tensor products of linear spaces $E$ and $F$. Then there exists a linear isomorphism $I : \Theta_1 \to \Theta_2$ such that the diagram
$$\begin{array}{ccc} & E \times F & \\ {\scriptstyle\theta_1}\swarrow & & \searrow{\scriptstyle\theta_2} \\ \Theta_1 & \xrightarrow{\;I\;} & \Theta_2 \end{array}$$
is commutative.
Proof. After Theorems 0.6.1 and 6.1 the reader can guess what kind of arguments will be given. Let us introduce the category $\mathrm{Lin}_{E,F}$ whose objects are the pairs $(G, \mathcal{R})$ consisting of a linear space $G$ and a bilinear operator $\mathcal{R} : E \times F \to G$, and morphisms between objects $(G_1, \mathcal{R}_1)$ and $(G_2, \mathcal{R}_2)$ are linear operators $\varphi : G_1 \to G_2$ for which the diagram
$$\begin{array}{ccc} & E \times F & \\ {\scriptstyle\mathcal{R}_1}\swarrow & & \searrow{\scriptstyle\mathcal{R}_2} \\ G_1 & \xrightarrow{\;\varphi\;} & G_2 \end{array}$$
is commutative. The composition of morphisms is defined as the usual composition of operators. Axioms are easily verified. From the universal property of the tensor product it follows that the pair $(\Theta, \theta)$ is the tensor product of $E$ and $F$ $\Longleftrightarrow$ this pair is an initial object in $\mathrm{Lin}_{E,F}$. This implies that the pairs $(\Theta_1, \theta_1)$ and $(\Theta_2, \theta_2)$ are isomorphic as objects of $\mathrm{Lin}_{E,F}$. The rest is clear. •
The existence of the tensor product is established by giving an explicit construction. We shall restrict ourselves to one such construction, which is, perhaps, the most instructive. (On another construction see, e.g., [47].)
Let $E \circ F$ be the space of formal linear combinations of the elements of the Cartesian product $E \times F$ (cf. Example 0.7.4). We use the notation $x \circ y$ for the elements $1(x, y)$ of the natural linear basis in $E \circ F$ (here 1 is the scalar factor), and consider in $E \circ F$ the set $M$ of elements of one of the following types:
$$x \circ (y_1 + y_2) - x \circ y_1 - x \circ y_2, \qquad (x_1 + x_2) \circ y - x_1 \circ y - x_2 \circ y,$$
$$(\lambda x) \circ y - \lambda(x \circ y), \qquad x \circ (\lambda y) - \lambda(x \circ y),$$
$x_1, x_2, x \in E$, $y_1, y_2, y \in F$, $\lambda \in \mathbb{C}$. Let us introduce the notation $E \otimes F$ for the quotient space $(E \circ F)/\mathrm{span}(M)$ and $x \otimes y$ for the coset $x \circ y + \mathrm{span}(M)$. It is easy to verify that the mapping $\vartheta : E \times F \to E \otimes F : (x, y) \mapsto x \otimes y$ is a bilinear operator. The elements of the form $x \otimes y$ are called elementary tensors. Obviously, to every type of elements from $M$ there corresponds an identity in $E \otimes F$; for example, an element of the second type yields the identity $(x_1 + x_2) \otimes y = x_1 \otimes y + x_2 \otimes y$, etc.
Theorem 2 (Existence of tensor product). Every two linear spaces $E$ and $F$ have a tensor product, namely, the pair $(E \otimes F, \vartheta)$.
Proof. Let $\mathcal{R} : E \times F \to G$ be a bilinear operator. It uniquely defines a linear operator $R_0 : E \circ F \to G$ taking $x \circ y$ to $\mathcal{R}(x, y)$. By the choice of $M$, $R_0$ generates the operator $R : E \otimes F \to G$ uniquely defined by the formula $R(x \otimes y) = \mathcal{R}(x, y)$. Then the diagram in Definition 1 is obviously commutative if we take $\Theta := E \otimes F$ and $\theta := \vartheta$. Finally, since $E \otimes F = \mathrm{span}(\mathrm{Im}(\vartheta))$, the operator $R$ is uniquely defined by the condition $R(x \otimes y) = \mathcal{R}(x, y)$, i.e., by the requirement of commutativity of this diagram. •
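As a hedged illustration (not part of the original text), the universal property can be traced by hand in the matrix model of Example 1 with $m = n = 2$. The particular bilinear functional $\mathcal{R}$ chosen below is an assumption made only for this sketch.

```latex
% Model: \mathbb{C}^2 \otimes \mathbb{C}^2 realized as 2x2 matrices (Example 1);
% p^1, p^2 are the unit vectors, (\lambda_{kl}) a general 2x2 matrix.
% Assumption for illustration: G = \mathbb{C} and
%   \mathcal{R}(x, y) := x_1 y_1 + x_2 y_2 .
\[
\theta(x, y) =
\begin{pmatrix} x_1 y_1 & x_1 y_2 \\ x_2 y_1 & x_2 y_2 \end{pmatrix},
\qquad
\mathcal{R}(p^k, p^l) = \delta_{kl}.
\]
% The associated linear operator, computed by the formula of Example 1:
\[
R\bigl((\lambda_{kl})\bigr)
  = \sum_{k,l=1}^{2} \lambda_{kl}\, \mathcal{R}(p^k, p^l)
  = \lambda_{11} + \lambda_{22}.
\]
% Commutativity of the diagram in Definition 1, i.e. R \circ \theta = \mathcal{R}:
\[
R\bigl(\theta(x, y)\bigr) = x_1 y_1 + x_2 y_2 = \mathcal{R}(x, y).
\]
```

In this model the linear operator associated with the chosen bilinear functional is just the matrix trace.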
The universal property of the algebraic tensor product allows us to distinguish an important class of linear functionals.
Definition 2. The tensor product of functionals $f : E \to \mathbb{C}$ and $g : F \to \mathbb{C}$ is the functional $f \otimes g : E \otimes F \to \mathbb{C}$ associated with the bilinear functional $f \times g : E \times F \to \mathbb{C} : (x, y) \mapsto f(x)g(y)$.
Clearly, the introduced functional is uniquely defined by the equality $(f \otimes g)(x \otimes y) = f(x)g(y)$. We see that in general there are many ways to represent an element of the space $E \otimes F$ as a sum of elementary tensors. It is very useful to know when such a sum is non-zero.
Proposition 1. For an element $u \in E \otimes F$ the following conditions are equivalent:
(i) $u \ne 0$;
(ii) $u$ can be represented in the form $u = \sum_{k=1}^{n} x_k \otimes y_k$, where $x_1, \dots, x_n$ are linearly independent, and $y_1 \ne 0$;
(iii) $u$ can be represented in the form $u = \sum_{k=1}^{n} x_k \otimes y_k$, where $x_1, \dots, x_n$ and $y_1, \dots, y_n$ are linearly independent.
Proof. (i) $\Longrightarrow$ (iii). Among all representations of $u$ as a sum of elementary tensors there is a representation with the least number of summands; let it be $\sum_{k=1}^{n} x_k \otimes y_k$. If the vectors $x_1, \dots, x_n$ are linearly dependent, then one of them, say, $x_1$, has the form $\sum_{k=2}^{n} \lambda_k x_k$; $\lambda_k \in \mathbb{C}$. Hence $u = \sum_{k=2}^{n} x_k \otimes (\lambda_k y_1 + y_k)$. By the choice of the initial representation, this is impossible. Therefore, $x_1, \dots, x_n$ are linearly independent, and it remains to apply the same arguments to the vectors $y_1, \dots, y_n$.
(iii) $\Longrightarrow$ (ii) is clear.
(ii) $\Longrightarrow$ (i). Take $f : E \to \mathbb{C}$ such that $f(x_1) \ne 0$ and $f(x_2) = \dots = f(x_n) = 0$, and $g : F \to \mathbb{C}$ such that $g(y_1) \ne 0$. Then $(f \otimes g)(u) = f(x_1)g(y_1) \ne 0$. The rest is clear. •
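The reduction step used in (i) $\Longrightarrow$ (iii) can be seen on a small case. The following is only a sketch with $n = 2$; the specific vectors are assumptions chosen for illustration, and in the second part $x \otimes y$ is identified with the matrix $(x_k y_l)$ of Example 1.

```latex
% Part 1: shortening a representation when the first factors are dependent.
% Assume u = x_1 \otimes y_1 + x_2 \otimes y_2 with x_1 = \lambda x_2.
\[
u = (\lambda x_2) \otimes y_1 + x_2 \otimes y_2
  = x_2 \otimes (\lambda y_1) + x_2 \otimes y_2
  = x_2 \otimes (\lambda y_1 + y_2).
\]
% Part 2: a sum that cannot be shortened (model \mathbb{C}^2 \otimes \mathbb{C}^2).
% Both families p^1, p^2 are linearly independent, so condition (iii) holds
% and u \neq 0:
\[
u = p^1 \otimes p^1 + p^2 \otimes p^2
  \;\longleftrightarrow\;
  \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% An elementary tensor x \otimes y corresponds to the matrix (x_k y_l),
% which has rank at most 1; the identity matrix has rank 2,
% so this u is not an elementary tensor.
```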
The following two facts follow from Proposition 1.
Exercise 2°. If $x, y \in E$, then $x \otimes y = y \otimes x$ in $E \otimes E$ $\Longleftrightarrow$ $x$ and $y$ are collinear.
Exercise 3°. If $e'_\mu$; $\mu \in \Lambda'$ and $e''_\nu$; $\nu \in \Lambda''$ are linear bases in $E$ and $F$ respectively, then $e'_\mu \otimes e''_\nu$; $(\mu, \nu) \in \Lambda' \times \Lambda''$ is a linear basis in $E \otimes F$.
Now we can suggest another important example of the algebraic tensor product.
Proposition 2. Let $E$ and $F$ be normed spaces. Then there exists a linear isomorphism⁷ $\mathrm{Gr}_0^0 : E^* \otimes F \to \mathcal{F}(E, F)$ uniquely defined by the rule that it takes the elementary tensor $f \otimes y$ to the one-dimensional operator $f \circ y$.
Proof. Consider the bilinear operator $\mathcal{G} : E^* \times F \to \mathcal{F}(E, F)$ taking a pair $(f, y)$ to $f \circ y$. The universal property provides an operator $\mathrm{Gr}_0^0$ uniquely defined by the indicated rule. From Propositions 1.5.5 and 1.6 it evidently follows that $\mathrm{Gr}_0^0$ is surjective. We show that it is injective. Suppose $u \in E^* \otimes F$ does not vanish. By Proposition 1 (ii), it can be represented as $\sum_{k=1}^{n} x_k^* \otimes y_k$, where $y_1, \dots, y_n$ are linearly independent, and $x_1^* \ne 0$. Take $x_1 \in E$ such that $x_1^*(x_1) = 1$. If $T := \mathrm{Gr}_0^0(u)$, then $T x_1 = y_1 + \sum_{k=2}^{n} x_k^*(x_1) y_k \ne 0$, and thus $T \ne 0$. The rest is clear. •
⁷ The notation means that this mapping is a birestriction of the so-called Grothendieck operator $\mathrm{Gr}$, which will be defined at the end of this section.
Our next goal is to pass from the purely algebraic tensor product to its "Banach" version. As a matter of fact, there exist quite a few different versions of the notion of tensor product for general Banach spaces. However, we shall restrict ourselves to one of them, apparently the most important (cf. the remark after Exercise 5). (Another version, the most convenient for working with Hilbert spaces, will be presented in the next section.)
Definition 3 (cf. Definition 1). Let $E$ and $F$ be Banach spaces. A pair $(\Theta, \theta)$, where $\Theta$ is a Banach space and $\theta : E \times F \to \Theta$ a bilinear contraction operator, is called the projective tensor product of $E$ and $F$, or, as we shall more frequently say, the Banach tensor product of these spaces if for every Banach space $G$ and for every bilinear contraction operator $\mathcal{R} : E \times F \to G$ there exists a unique linear contraction operator $R : \Theta \to G$ such that the diagram
$$\begin{array}{ccc} E \times F & & \\ {\scriptstyle\theta}\big\downarrow & \searrow{\scriptstyle\mathcal{R}} & \\ \Theta & \xrightarrow{\;R\;} & G \end{array}$$
is commutative.
Theorem 3 (Uniqueness; cf. Theorem 1). Let $(\Theta_1, \theta_1)$ and $(\Theta_2, \theta_2)$ be two projective tensor products of Banach spaces $E$ and $F$. Then there exists a unique isometric isomorphism $I : \Theta_1 \to \Theta_2$ such that the diagram
$$\begin{array}{ccc} & E \times F & \\ {\scriptstyle\theta_1}\swarrow & & \searrow{\scriptstyle\theta_2} \\ \Theta_1 & \xrightarrow{\;I\;} & \Theta_2 \end{array}$$
is commutative.
Proof. The proof of Theorem 1 works with obvious "Banach" modifications. Now we use the category $\mathrm{Ban}_{E,F}$, whose objects are pairs $(G, \mathcal{R})$ consisting of a Banach space $G$ and a bilinear contraction operator $\mathcal{R} : E \times F \to G$. We leave the remaining details to the reader. •
Now we begin preparations for the existence theorem of the Banach tensor product. Suppose we have (arbitrary) prenorms on linear spaces $E$ and $F$. For each element $u$ of the algebraic tensor product $E \otimes F$ we put $\|u\|_p := \inf \sum_{k=1}^{n} \|x_k\| \|y_k\|$, where the infimum is taken over all possible representations $u = \sum_{k=1}^{n} x_k \otimes y_k$; $x_k \in E$, $y_k \in F$. Clearly, $\|\cdot\|_p$ is a prenorm.
Definition 4. This prenorm is called the projective tensor product of the given prenorms or simply the projective prenorm on $E \otimes F$.
We shall denote the prenormed space $(E \otimes F, \|\cdot\|_p)$ by $E \otimes_p F$.
Exercise 4. The open unit ball in $E \otimes_p F$ is the convex hull of the set $\{x \otimes y : x \in B_E^0, y \in B_F^0\}$.
Proposition 3. Let $\mathcal{R} : E \times F \to G$ be a jointly bounded bilinear operator to a prenormed space. Then the associated operator $R_0 : E \otimes_p F \to G$ is also bounded, and $\|R_0\| = \|\mathcal{R}\|$.
Proof. Take $u = \sum_{k=1}^{n} x_k \otimes y_k$; $x_k \in E$, $y_k \in F$; then from $R_0(u) = \sum_{k=1}^{n} \mathcal{R}(x_k, y_k)$ it follows that $\|R_0(u)\| \le \|\mathcal{R}\| \sum_{k=1}^{n} \|x_k\| \|y_k\|$. Therefore, by the definition of projective prenorm, $\|R_0\| \le \|\mathcal{R}\|$. Further, from $\mathcal{R}(x, y) = R_0(x \otimes y)$ and the obvious estimate $\|x \otimes y\| \le \|x\| \|y\|$ we have that $\|\mathcal{R}\| \le \|R_0\| \sup\{\|x \otimes y\| : x \in B_E, y \in B_F\} \le \|R_0\|$. •
Corollary 1. For $f \in E^*$ and $g \in F^*$ the norm of $f \otimes g$ as a functional on $E \otimes_p F$ is equal to $\|f\| \|g\|$.
Proposition 4. For all $x \in E$ and $y \in F$ we have $\|x \otimes y\|_p = \|x\| \|y\|$.
Proof. Obviously, it is sufficient to consider the case where $\|x\| > 0$ and $\|y\| > 0$. By Theorem 1.6.3, there exist functionals $f \in E^*$ and $g \in F^*$ such that $f(x) = \|x\|$, $g(y) = \|y\|$, and $\|f\| = \|g\| = 1$. Hence, taking Corollary 1 into account, $\|x\| \|y\| = (f \otimes g)(x \otimes y) \le \|x \otimes y\|_p$. The rest is clear. •
Before proceeding further, let us distinguish a useful application of the Hahn-Banach theorem. It is a special case of the statement presented in Chapter 1 as Exercise 1.6.9.
Proposition 5. Let $E$ be a normed space and $x_1, \dots, x_n \in E$ a linearly independent family of vectors. Then there exists $f \in E^*$ such that $f(x_1) \ne 0$, $f(x_2) = \dots = f(x_n) = 0$.
Proof. Since $x_1, \dots, x_n$ are linearly independent, a functional with the indicated property exists on the space $E_0 = \mathrm{span}\{x_1, \dots, x_n\}$ and is bounded by Theorem 1.1(iii). It remains to extend it to $E$ using the Hahn-Banach theorem. •
Proposition 6. If $E$ and $F$ are normed spaces, then $E \otimes_p F$ is also a normed space.
Proof. Suppose $u \in E \otimes_p F$ is non-zero. By Proposition 1, $u$ has the form $\sum_{k=1}^{n} x_k \otimes y_k$; $x_k \in E$, $y_k \in F$, where $x_1, \dots, x_n$ are linearly independent, and $y_1 \ne 0$. Therefore, by Proposition 5, there exist $f \in E^*$ and $g \in F^*$ such that $f(x_1) \ne 0$, $f(x_2) = \dots = f(x_n) = 0$ and $g(y_1) \ne 0$. Hence, taking Corollary 1 into account,
$$\|f\| \|g\| \|u\|_p \ge |(f \otimes g)(u)| = \Bigl| \sum_{k=1}^{n} f(x_k) g(y_k) \Bigr| = |f(x_1) g(y_1)| > 0.$$
The rest is clear. •
From now on we shall study the case where $E$ and $F$ are Banach spaces. Then, by the previous proposition, $E \otimes_p F$ is a normed space. Take its completion and denote it by $(E \widehat{\otimes} F, i)$. Then, as was discussed in the previous section, we can assume, without loss of generality, that $E \otimes_p F$ is contained as a dense subspace in $E \widehat{\otimes} F$, and $i$ is a natural embedding. Introduce also a bilinear operator $\widehat{\vartheta} : E \times F \to E \widehat{\otimes} F$ assigning to a pair $(x \in E, y \in F)$ the elementary tensor $x \otimes y$. From the definition of the prenorm $\|\cdot\|_p$ it immediately follows that $\widehat{\vartheta}$ is a bilinear contraction operator.
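Theorem 4 (Existence of projective tensor product). For every Banach space $G$ and for every bounded bilinear operator $\mathcal{R} : E \times F \to G$ there is a unique bounded linear operator $R : E \widehat{\otimes} F \to G$ such that the diagram
$$\begin{array}{ccc} E \times F & & \\ {\scriptstyle\widehat{\vartheta}}\big\downarrow & \searrow{\scriptstyle\mathcal{R}} & \\ E \widehat{\otimes} F & \xrightarrow{\;R\;} & G \end{array}$$
is commutative. Herewith, $\|R\| = \|\mathcal{R}\|$. In particular, the pair $(E \widehat{\otimes} F, \widehat{\vartheta})$ is the Banach tensor product of the spaces $E$ and $F$.
Proof. Suppose $R_0 : E \otimes F \to G$ is the operator associated (in the sense of Definition 1) with $\mathcal{R}$; then, by Proposition 3, $\|R_0\| = \|\mathcal{R}\|$. By Proposition 6.3, there exists a unique operator $R : E \widehat{\otimes} F \to G$ with the necessary properties. •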
The norm in $E \widehat{\otimes} F$ will also be denoted by $\|\cdot\|_p$ (or just $\|\cdot\|$). We call $R$ the operator associated with the (bounded) bilinear operator $\mathcal{R}$, and the indicated property of the pair $(E \widehat{\otimes} F, \widehat{\vartheta})$ the universal property (of the Banach tensor product). There should be no confusion with similar purely algebraic notions.
Further we shall need the following general fact, which is also of independent interest.
Proposition 7. Let $E$ be a normed space, and $E_0$ a dense subspace of $E$. Then for each $\varepsilon > 0$ every element $x \in E$ can be represented as the sum of an absolutely convergent series $\sum_{k=1}^{\infty} x_k$, where $x_k \in E_0$ and $\sum_{k=1}^{\infty} \|x_k\| < \|x\| + \varepsilon$.
Proof. Since $E_0$ is dense in $E$, there exists $x_1 \in E_0$ such that $\|x - x_1\| < \varepsilon/4$. Hence $\|x_1\| < \|x\| + \varepsilon/4$. Similarly, there exists $x_2 \in E_0$ such that $\|x - x_1 - x_2\| < \varepsilon/8$. Hence, $\|x_2\| < \|x - x_1\| + \varepsilon/8 < 3\varepsilon/8$. Further, there exists $x_3 \in E_0$ such that $\|x - x_1 - x_2 - x_3\| < \varepsilon/16$, hence $\|x_3\| < \|x - x_1 - x_2\| + \varepsilon/16 < 3\varepsilon/16$. Continuing this process, we obtain a sequence $x_k$ of vectors in $E_0$ such that
$$\Bigl\| x - \sum_{k=1}^{n} x_k \Bigr\| < \frac{\varepsilon}{2^{n+1}} \quad\text{and}\quad \|x_n\| < \frac{3\varepsilon}{2^{n+1}} \quad\text{for all } n \ge 2.$$
From this it follows that, first, the series $\sum_{k=1}^{\infty} x_k$ converges to $x$. Moreover,
$$\sum_{k=1}^{\infty} \|x_k\| < \|x\| + \varepsilon \Bigl( \frac{1}{4} + \sum_{k=2}^{\infty} \frac{3}{2^{k+1}} \Bigr) = \|x\| + \varepsilon,$$
and this is what we need. •
Proposition 8. Every $u \in E \widehat{\otimes} F$ is representable as the sum of an absolutely convergent series $\sum_{k=1}^{\infty} x_k \otimes y_k$; $x_k \in E$, $y_k \in F$, $k = 1, 2, \dots$. Moreover, $\|u\|_p = \inf \sum_{k=1}^{\infty} \|x_k\| \|y_k\|$, where the infimum is taken over all possible representations of $u$ in this form.
Proof. Take $\varepsilon > 0$ and, using the fact that $E \otimes F$ is dense in $E \widehat{\otimes} F$ and Proposition 7, represent $u$ in the form $\sum_{n=1}^{\infty} u_n$, where $u_n \in E \otimes F$ and $\sum_{n=1}^{\infty} \|u_n\|_p < \|u\|_p + \varepsilon/2$. By the definition of the norm $\|\cdot\|_p$, every $u_n$ has the form $\sum_{i=1}^{m_n} x_{ni} \otimes y_{ni}$, where
$$\sum_{i=1}^{m_n} \|x_{ni}\| \|y_{ni}\| < \|u_n\|_p + \frac{\varepsilon}{2 \cdot 2^{n}}.$$
Hence $\sum_{n=1}^{\infty} \sum_{i=1}^{m_n} \|x_{ni}\| \|y_{ni}\| < \|u\|_p + \varepsilon$. Since $\varepsilon$ was arbitrary, $\|u\|_p$ is not smaller than the infimum indicated in the formulation. The converse inequality is obvious. •
As an exercise, we suggest the following result, which establishes a close relation between taking the space of operators and taking the projective tensor product. In fact, this connection is the main advantage of the projective tensor product compared to other versions of this construction.
Exercise 5 (Adjoint associativity law). For all $E, F, G \in$ Ban there is an isometric isomorphism between the spaces $\mathcal{B}(E \widehat{\otimes} F, G)$ and $\mathcal{B}(E, \mathcal{B}(F, G))$,
uniquely defined by the rule that it takes an operator $\varphi$ to the operator $\psi$ such that $(\psi(x))(y) := \varphi(x \otimes y)$. In particular, up to an isometric isomorphism we have $(E \widehat{\otimes} F)^* = \mathcal{B}(E, F^*)$.
Hint. The norm of the bilinear operator from $E \times F$ to $G$ with which $\varphi$ is associated coincides with $\|\varphi\|$.
Remark. Explicit constructions of other versions of the notion of tensor
product of Banach spaces, which are not considered in this book, can be obtained by completing the linear space $E \otimes F$ in norms different from the projective one. There exists a very deep theorem due to Grothendieck,⁸ asserting that there are precisely 14, no more and no less, versions of such norms that are "natural" in some reasonable sense, and thus 14 versions of tensor product (see, e.g., [48, 11.27]).
Another use of tensor products of Banach spaces in analysis is that they help to describe the passage from functions of one variable to functions of two variables that have, to speak informally, "the same nature". This concerns practically all existing tensor products, in particular, those exceeding the scope of this book. For the projective tensor product we discuss here, the spaces $L_1(\cdot)$ are especially good.
Suppose $(X_1, \mu_1)$ and $(X_2, \mu_2)$ are two measure spaces, and let $(X_1 \times X_2, \mu_1 \times \mu_2)$ be their product. We assume that it is known that every element of the Banach space $L_1(X_k, \mu_k)$; $k = 1, 2$ can be approximated by linear combinations of the characteristic functions of measurable subsets in $X_k$, and every element in $L_1(X_1 \times X_2, \mu_1 \times \mu_2)$ can be approximated by linear combinations of the characteristic functions of sets of the form $M \times N$, where $M$ (respectively, $N$) is a measurable set in $X_1$ (respectively, in $X_2$).
Theorem 5 (Grothendieck).
Up to an isometric isomorphism,
$$L_1(X_1, \mu_1) \widehat{\otimes} L_1(X_2, \mu_2) = L_1(X_1 \times X_2, \mu_1 \times \mu_2).$$
Proof. Consider the bilinear operator $\mathcal{R} : L_1(X_1, \mu_1) \times L_1(X_2, \mu_2) \to L_1(X_1 \times X_2, \mu_1 \times \mu_2)$ taking a pair $(x, y)$ to the function $x(s)y(t)$; $s \in X_1$, $t \in X_2$. Clearly, $\|\mathcal{R}(x, y)\| = \|x\| \|y\|$, and hence $\|\mathcal{R}\| = 1$. Let $R : L_1(X_1, \mu_1) \widehat{\otimes} L_1(X_2, \mu_2) \to L_1(X_1 \times X_2, \mu_1 \times \mu_2)$ be the operator with the same norm, associated with $\mathcal{R}$. By Proposition 1.10, it is sufficient to establish that $R$ isometrically maps a dense subspace in $L_1(X_1, \mu_1) \widehat{\otimes} L_1(X_2, \mu_2)$ onto a dense subspace in $L_1(X_1 \times X_2, \mu_1 \times \mu_2)$.
Put $L := \mathrm{span}\{\chi_M \otimes \chi_N\}$, where $M$ and $N$ are various measurable subsets in $X_1$ and $X_2$ respectively, and $\chi$ is the characteristic function. From the definition of the prenorm $\|\cdot\|_p$ combined with what was said above about approximations, it follows that every elementary tensor in $L_1(X_1, \mu_1) \otimes L_1(X_2, \mu_2)$ is approximated by elements from $L$. Therefore, $L$ is dense in $L_1(X_1, \mu_1) \otimes_p L_1(X_2, \mu_2)$, hence in the entire $L_1(X_1, \mu_1) \widehat{\otimes} L_1(X_2, \mu_2)$.
Further, each $u \in L$ is evidently representable as a finite linear combination $\sum_{l,m} \lambda_{l,m} \chi_{M_l} \otimes \chi_{N_m}$, where the measurable sets $M_l$ and, respectively, $N_m$ are pairwise disjoint. But then, by the definition of the projective prenorm, $\|u\| \le \sum_{l,m} |\lambda_{l,m}| \mu_1(M_l) \mu_2(N_m)$, and at the same time $\|R(u)\| = \bigl\| \sum_{l,m} \lambda_{l,m} \chi_{M_l}(s) \chi_{N_m}(t) \bigr\| = \sum_{l,m} |\lambda_{l,m}| \mu_1(M_l) \mu_2(N_m)$. Hence $\|R(u)\| \ge \|u\|$, and together with $\|R\| = 1$, this gives $\|R(u)\| = \|u\|$. Thus $R$ isometrically maps $L$ to the linear span of functions of the form $\chi_M(s) \chi_N(t)$ with measurable $M$ and $N$, and as was noted before, this span is dense in $L_1(X_1 \times X_2, \mu_1 \times \mu_2)$. The rest is clear. •
⁸ A. Grothendieck (born in 1928), outstanding French mathematician, who made important discoveries in functional analysis, algebraic geometry, and other areas of mathematics.
* * *
The reader remembers that behind a reasonable construction in mathematics, some functor is usually hidden (cf. the quotation from Eilenberg-MacLane in Section 0.7). This is definitely true for the construction of the Banach tensor product.
Theorem 6. Let $S : E_1 \to E_2$ and $T : F_1 \to F_2$ be bounded operators between Banach spaces. Then there exists a unique bounded operator $S \widehat{\otimes} T : E_1 \widehat{\otimes} F_1 \to E_2 \widehat{\otimes} F_2$ such that $(S \widehat{\otimes} T)(x \otimes y) = S(x) \otimes T(y)$ for all $x \in E_1$, $y \in F_1$. Moreover, $\|S \widehat{\otimes} T\| = \|S\| \|T\|$.
Proof. Put $\mathcal{R} : E_1 \times F_1 \to E_2 \widehat{\otimes} F_2 : (x, y) \mapsto S(x) \otimes T(y)$. Obviously, this is a bilinear operator with norm $\|S\| \|T\|$. Denote by $S \widehat{\otimes} T$ the corresponding associated operator. The rest is clear. •
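The word "obviously" in the proof can be unpacked as follows. This is only a sketch of one possible argument, resting on Proposition 4 applied in $E_2 \widehat{\otimes} F_2$ and on Theorem 4; it is not taken from the original text.

```latex
% Norm of the bilinear operator (x, y) \mapsto S(x) \otimes T(y),
% here written \mathcal{R}. By Proposition 4 in E_2 \widehat{\otimes} F_2:
\[
\|\mathcal{R}(x, y)\|_p = \|S(x) \otimes T(y)\|_p = \|S(x)\|\,\|T(y)\| .
\]
% Taking the supremum over \|x\| \le 1 and \|y\| \le 1 separately:
\[
\|\mathcal{R}\|
  = \sup_{\|x\| \le 1} \|S(x)\| \cdot \sup_{\|y\| \le 1} \|T(y)\|
  = \|S\|\,\|T\| .
\]
% Theorem 4 then gives the associated operator with the same norm:
\[
\|S \widehat{\otimes} T\| = \|\mathcal{R}\| = \|S\|\,\|T\| .
\]
```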
The constructed operator $S \widehat{\otimes} T$ is called the projective or Banach tensor product of operators $S$ and $T$. In the special case of functionals, in other words, when $E_2 = F_2 = \mathbb{C}$, the operator $S \widehat{\otimes} T$ maps $E_1 \widehat{\otimes} F_1$ to $\mathbb{C} \widehat{\otimes} \mathbb{C}$, and hence is itself a functional, up to the identification of the latter space with $\mathbb{C}$ (by the isometric isomorphism $\lambda_1 \otimes \lambda_2 \mapsto \lambda_1 \lambda_2$). It is easy to see that this functional extends by continuity the functional $S \otimes T$ defined on the algebraic tensor product $E_1 \otimes F_1$ (see Definition 2).
Using the series representation of elements of Banach tensor products as described in Proposition 8, one can show that the tensor product of two surjective operators is a surjective operator (try to do this). However, the tensor product of two injective operators is not always injective, even if these operators are isometric. But it is not easy to give a corresponding example (cf. [48, I.5.8]).
Now we can introduce a new class of functors acting on the category Ban. It is the second most important class after the class of operator functors (see Section 5), and it is closely related to the latter.
Let us take $E \in$ Ban. To every $F \in$ Ban we assign the Banach space $E \widehat{\otimes} F$ and to every bounded operator $T : F_1 \to F_2$ the operator $\mathbf{1}_E \widehat{\otimes} T : E \widehat{\otimes} F_1 \to E \widehat{\otimes} F_2$. Obviously, we obtain a covariant functor from Ban to Ban denoted by $E \widehat{\otimes}\, ?$ and called the functor of Banach tensor product (by $E$ from the left). Similarly, if $F$ is chosen, a standard covariant functor $? \,\widehat{\otimes} F$ arises (the functor of "Banach tensor product by $F$ from the right").
The following material, till the end of this section, is for advanced readers.
First, we emphasize again that the Grothendieck theorem reflects special geometric properties of the $L_1(\cdot)$ spaces. In other classes of function spaces the projective tensor product does not have such a transparent description. In particular, the projective tensor product of two $L_p(\cdot)$; $p > 1$ spaces is not a space of this class. (The corresponding analogue of the Grothendieck theorem requires other types of tensor products; cf. Exercise 8.4 below.) We will not discuss this anymore, and invite the reader to take for granted a few words about what the projective tensor product does with the spaces $C[a, b]$.
Let $D$ denote the square $[a, b] \times [a, b]$ in the coordinate plane. Consider the contraction operator $V : C[a, b] \widehat{\otimes} C[a, b] \to C(D)$ associated with the bilinear operator $\mathcal{V} : C[a, b] \times C[a, b] \to C(D) : (x, y) \mapsto x(s)y(t)$. It can be shown (we will not do this, and do not ask you to do this; see, e.g., [47, Theorem II.5.9]) that this operator is injective, and, thus, allows us to identify the tensor product in question with the image of $V$. At the same time, although the image contains all smooth functions, and hence is dense in $C(D)$, this operator is not surjective. The space $C[a, b] \widehat{\otimes} C[a, b]$ is called the Varopoulos space. (Its isometrically isomorphic copy, the function space $\mathrm{Im}(V)$ with norm $\|V(u)\| := \|u\|$, often has the same name.)
This space plays a significant role in studying some important (and difficult) questions of the theory of Fourier series and harmonic analysis; see, e.g., [49] (where these spaces are called tensor algebras).
Now we give one of many examples showing how Banach tensor products work in the theory of operators. Here it is very useful that an important class of operators, the so-called nuclear operators, can be characterized in terms of tensor product.
Definition 5. An operator $T : E \to F$ is called nuclear if it is representable as the sum of a series $\sum_{k=1}^{\infty} S_k$ of one-dimensional operators that converges absolutely in $\mathcal{B}(E, F)$. The infimum $\inf \sum_{k=1}^{\infty} \|S_k\|$ over all such representations of the operator $T$ is called the nuclear norm of the (nuclear) operator $T$, and is denoted by $\|T\|_{\mathcal{N}}$.
The set of nuclear operators from $E$ to $F$ is denoted by $\mathcal{N}(E, F)$. If $E = F$, we write $\mathcal{N}(E)$ instead of $\mathcal{N}(E, E)$. The following proposition is easily verified.
Proposition 9. The set $\mathcal{N}(E, F)$ is a subspace in $\mathcal{B}(E, F)$, and the function $T \mapsto \|T\|_{\mathcal{N}}$ on $\mathcal{N}(E, F)$ is a norm on this space; moreover, this norm is not less than the operator norm. •
Note the resemblance between the definition of the nuclear norm and the expression for the norm of elements of Banach tensor products in Proposition 8. We now show that this is not a coincidence.
Consider the mapping $\mathcal{G} : E^* \times F \to \mathcal{B}(E, F) : (f, y) \mapsto f \circ y$. Obviously, this is a bilinear operator with norm 1. The bounded operator $\mathrm{Gr} : E^* \widehat{\otimes} F \to \mathcal{B}(E, F)$ associated with $\mathcal{G}$ is called the Grothendieck operator (for the pair $(E, F)$). We see that this operator is uniquely defined by the equality $\mathrm{Gr}(f \otimes y) = f \circ y$, and it has norm 1.
Theorem 7. The image of the Grothendieck operator is precisely $\mathcal{N}(E, F)$, and the corestriction $\mathrm{Gr}^0$ of this operator to $(\mathcal{N}(E, F), \|\cdot\|_{\mathcal{N}})$ is a coisometric operator.
Proof. Suppose $u \in E^* \widehat{\otimes} F$ is represented as the sum of an absolutely convergent series $\sum_{k=1}^{\infty} f_k \otimes y_k$; $f_k \in E^*$, $y_k \in F$ (see Proposition 8). Then $T := \mathrm{Gr}(u)$ can be represented as the sum of the series $\sum_{k=1}^{\infty} f_k \circ y_k$, absolutely convergent in the operator norm. Therefore, $T \in \mathcal{N}(E, F)$. Further, every operator $T$ of nuclear norm $< 1$ can be represented as the sum of a series $\sum_{k=1}^{\infty} S_k$, where the $S_k$ are one-dimensional operators, and $\sum_{k=1}^{\infty} \|S_k\| < 1$. By Proposition 1.5.5, every $S_k$ has the form $f_k \circ y_k$ for some $f_k \in E^*$ and $y_k \in F$, so that $\|S_k\| = \|f_k\| \|y_k\|$. Hence the series $\sum_{k=1}^{\infty} f_k \otimes y_k$ converges in $E^* \widehat{\otimes} F$ to some element $u$, $\|u\|_p < 1$, and $\mathrm{Gr}(u) = T$. The rest is clear. •
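From this theorem and Proposition 1.5.4(ii) we immediately get
Proposition 10. We have the following commutative diagram:
$$\begin{array}{ccc} E^* \widehat{\otimes} F & & \\ {\scriptstyle\mathrm{pr}}\big\downarrow & \searrow{\scriptstyle\mathrm{Gr}^0} & \\ E^* \widehat{\otimes} F / \mathrm{Ker}(\mathrm{Gr}) & \xrightarrow{\;I\;} & \mathcal{N}(E, F) \end{array}$$
where $I$ is an isometric isomorphism.
Taking Proposition 1.9 into account, we obtain
Corollary 2. The space $(\mathcal{N}(E, F), \|\cdot\|_{\mathcal{N}})$ is a Banach space.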
For the vast majority of the known examples of Banach spaces $E$ the operator $\mathrm{Gr} : E^* \widehat{\otimes} F \to \mathcal{B}(E, F)$ is injective for all $F \in$ Ban, so that its corestriction to $\mathcal{N}(E, F)$ is an isometric isomorphism between $E^* \widehat{\otimes} F$ and $(\mathcal{N}(E, F), \|\cdot\|_{\mathcal{N}})$. Thus, in this situation nuclear operators can be characterized as elements of the tensor product $E^* \widehat{\otimes} F$. For the case where both spaces are Hilbert, this fact will be proved later (see Theorem 3.4.5). But there are "bad" pairs of Banach spaces for which $\mathrm{Ker}(\mathrm{Gr}) \ne 0$. Some details will be reported later; see Theorem 3.3.3.
Let us concentrate on the case $E = F$. Among all bilinear functionals on $E^* \times E$ the following one attracts attention: it takes a pair $(f, x)$ to the number $f(x)$ and is called the natural duality between $E^*$ and $E$ or the pairing of $E^*$ and $E$. The bounded functional associated with the natural duality is called the tensor trace functional or just the tensor trace and is denoted by $\mathrm{ttr} : E^* \widehat{\otimes} E \to \mathbb{C}$. Clearly, it is uniquely defined by the equality $\mathrm{ttr}(f \otimes x) = f(x)$; besides, $\|\mathrm{ttr}\| = 1$.
If the operator $\mathrm{Gr} : E^* \widehat{\otimes} E \to \mathcal{B}(E)$ is injective, then, assigning to a nuclear operator $T$ the number $\mathrm{ttr}(u)$, where $u \in E^* \widehat{\otimes} E$ is such that $\mathrm{Gr}(u) = T$, we obtain a linear functional on $(\mathcal{N}(E), \|\cdot\|_{\mathcal{N}})$ of norm 1. This functional (which is well defined if $\mathrm{Ker}(\mathrm{Gr}) = 0$) is also called the trace or, if we want to be precise, the operator trace, and it is denoted by the symbol $\mathrm{tr}$.
The reader can guess that all this must be related to the classical notion of the trace of a matrix, the sum of its diagonal elements. Indeed, this is so:
Exercise 6*. Let $E$ be a finite-dimensional space. Then
(i) the Grothendieck operator is a linear isomorphism between $E^* \widehat{\otimes} E$ and $\mathcal{L}(E)$;
(ii) for every $T \in \mathcal{L}(E)$ its operator trace (well defined by (i)) coincides with the trace of the matrix of this operator in each linear basis.
Hint. Take a basis $e_1, \dots, e_n$ in $E$ and consider the basis $e_1^*, \dots, e_n^*$ in $E^*$ such that $e_k^*(e_l)$ is 1 if $k = l$, and 0 if $k \ne l$. Look at the trace of the element $\lambda_{lk}\, e_k^* \otimes e_l$ and the trace of a matrix of an operator $T \in \mathcal{L}(E)$ in the indicated basis.
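For orientation, the two-dimensional case of part (ii) can be written out explicitly. This is only a sketch: it assumes, as in Proposition 2, that $f \circ y$ denotes the one-dimensional operator $x \mapsto f(x)y$, and the coefficients $\lambda_{lk}$ are hypothetical matrix entries introduced for illustration.

```latex
% Assumptions: dim E = 2, basis e_1, e_2, dual basis e_1^*, e_2^*,
% and (f \circ y)(x) = f(x) y for the one-dimensional operator f \circ y.
\[
u = \sum_{k,l=1}^{2} \lambda_{lk}\, e_k^* \otimes e_l ,
\qquad
T := \mathrm{Gr}(u) = \sum_{k,l=1}^{2} \lambda_{lk}\, e_k^* \circ e_l .
\]
% Action on the basis: the j-th column of the matrix of T
% is (\lambda_{1j}, \lambda_{2j}).
\[
T e_j = \sum_{k,l=1}^{2} \lambda_{lk}\, e_k^*(e_j)\, e_l
      = \lambda_{1j}\, e_1 + \lambda_{2j}\, e_2 .
\]
% Tensor trace of u, using ttr(f \otimes x) = f(x):
\[
\mathrm{ttr}(u) = \sum_{k,l=1}^{2} \lambda_{lk}\, e_k^*(e_l)
               = \lambda_{11} + \lambda_{22} .
\]
```

So in this basis the tensor trace of $u$ equals the sum of the diagonal entries of the matrix $(\lambda_{lk})$ of $T = \mathrm{Gr}(u)$.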
For which Banach spaces $E$ is the trace of nuclear operators acting on $E$ well defined? In other words, when, for a given $T \in \mathcal{N}(E)$, is the number $\mathrm{ttr}(u)$ independent of the choice of $u \in E^* \widehat{\otimes} E$ such that $\mathrm{Gr}(u) = T$? Running ahead, we note that this happens precisely when $E$ has an outstandingly important geometric property, the so-called Grothendieck approximation property (see Definition 3.3.2 and Theorem 3.3.3 below). We emphasize one more time (cf. what we have said earlier about the injectivity of $\mathrm{Gr}$) that for the vast majority of operators, the operator trace is indeed well defined. For Hilbert spaces we shall prove this in the next section.
Finishing the discussion of the Banach tensor product, we again look at it through the "categorical glasses". We suggest that you do the following two simple exercises. The first of them reveals the intrinsic alliance between this construction and the construction of completion (cf. Exercise 6.1).
Exercise 7. Show that each of the existence theorems of this section (Theorems 2 and 4) is equivalent to representability of some functor.
Hint. Consider the functor $\mathcal{F}$ from Lin (respectively, Ban) to Set assigning to each $G$ the set of all bilinear (respectively, bilinear contraction) operators from $E \times F$ to $G$.
The second exercise sheds light on the result of Exercise 5 concerning the coincidence of the Banach spaces mentioned there: actually, not only individual spaces coincide, but the "entire" functors.
Exercise 8. For $E, F, G \in$ Ban,
(i) the composition of the functors $\mathcal{B}(?, G) \circ (? \,\widehat{\otimes} F)$ is naturally equivalent to the functor $\mathcal{B}(?, \mathcal{B}(F, G))$;
(ii) the functor $\mathcal{B}(E \widehat{\otimes} F, ?)$ is naturally equivalent to the composition $\mathcal{B}(E, ?) \circ \mathcal{B}(F, ?)$.
Remark. The indicated fact is a manifestation of a deep connection between the functors $? \,\widehat{\otimes} F$ and $\mathcal{B}(F, ?)$, which is usually expressed by the words "the first is adjoint to the second". About the general notion of adjoint functors see, e.g., [24].
* * *
Finally, a few words to the readers who are interested in quantum spaces discussed in Section 1 . 7. In quantum functional analysis tensor products play even more impor tant role than in classical functional analysis. Since there are two substantial "quantum" versions of the notion of a bounded bilinear operator, there exist two substantial "quan tum" analogues of the notion of Banach (i.e. , projective) tensor product. These are the "operator projective tensor product" of Effros-Ruan and Blecher-Paulsen and "Haagerup tensor product" (see [36] ) . The first is defined using the universal property of completely bounded operators, and the second is defined for multiplicatively bounded bilinear oper ators. Each tensor product is a pair consisting of a Banach (i.e. , complete in all floors) quantum space and a bilinear operator of the corresponding type. The operator tensor product resembles the classical Banach tensor product; in partic ular (and this is very important) it admits a natural analogue of the adjoint associativity law. On the contrary, the Haagerup tensor product "®h" is very unusual; it is sufficient to say that it depends on the order of the tensor factors. For instance, l2 ®h l2 = N (l 2 ), and at the same time l2 ®h l2 = K( l 2 ). (Here N is as always, the symbol for the space of nuclear operators, and K the symbol for the space of compact operators, which will be defined
in the next chapter). But such inconveniences are outweighed by the following advantage of this tensor product: the Haagerup tensor product of two surjective operators is surjective, and the product of two injective operators is injective. No other known tensor product of "classical" Banach spaces possesses this feature of Haagerup tensor products (see details, e.g., in [36]).
8. Hilbert tensor product
Now we leave general Banach spaces and concentrate on Hilbert spaces. First, note that in this context the projective tensor product, while remaining an important working instrument, does not always make us happy: the projective tensor product of Hilbert spaces is usually not a Hilbert space.
Exercise 1. Let $H$ and $K$ be Hilbert spaces, and $e_1$ and $e_2$ (respectively, $e'_1$ and $e'_2$) orthogonal vectors of norm 1 in $H$ (respectively, in $K$). Then the norm of the vector $e_1 \otimes e'_1 + e_2 \otimes e'_2$ in $H \mathbin{\widehat{\otimes}} K$ is 2 and, as a corollary, the norm in $H \mathbin{\widehat{\otimes}} K$ is not a Hilbert norm.
Hint. The norm of the bilinear functional on $H \times K$ taking $(x, y)$ to $\langle x, e_1\rangle\langle y, e'_1\rangle + \langle x, e_2\rangle\langle y, e'_2\rangle$ is 1.
Nevertheless, there exists a useful version of the construction of tensor product that preserves the "Hilbert nature" of the given spaces. This time, to achieve the goal quickly, we shall immediately give an explicit construction. It is based on the fact that the algebraic tensor product of two pre-Hilbert spaces has a natural pre-inner product.
Proposition 1. Let $H$ and $K$ be pre-Hilbert spaces. Then there exists a unique pre-inner product in the linear space $H \otimes K$ such that $\langle x_1 \otimes y_1, x_2 \otimes y_2\rangle = \langle x_1, x_2\rangle\langle y_1, y_2\rangle$. As a corollary, for the corresponding prenorm we have $\|x \otimes y\| = \|x\|\,\|y\|$. Moreover, if $H$ and $K$ are near-Hilbert spaces, then the same is true for $H \otimes K$.
Proof. If $\langle\cdot,\cdot\rangle$ is a pre-inner product on $H \otimes K$ with the indicated properties, then for $u, v \in H \otimes K$; $u = \sum_{k=1}^{n} x'_k \otimes y'_k$, $v = \sum_{l=1}^{n} x''_l \otimes y''_l$ we evidently have

(1) $\langle u, v\rangle = \sum_{k,l=1}^{n} \langle x'_k, x''_l\rangle\langle y'_k, y''_l\rangle.$

This means that there can exist at most one such pre-inner product. Let us show that one indeed exists.
First take a pair $x \in H$, $y \in K$ and denote by $g_{x,y} : H \otimes K \to \mathbb{C}$ the linear functional associated with the bilinear functional $(x_1 \in H, y_1 \in K) \mapsto \langle x_1, x\rangle\langle y_1, y\rangle$. Now take $u \in H \otimes K$ and consider the mapping $F : H \times K \to \mathbb{C}$; $(x, y) \mapsto \overline{g_{x,y}(u)}$. Then, taking an arbitrary representation of $u$ in the form $u = \sum_{k=1}^{n} x_k \otimes y_k$, we obtain that $F(x, y) = \sum_{k=1}^{n} \langle x, x_k\rangle\langle y, y_k\rangle$. This clearly implies that $F$ is a bilinear functional. Denote by $f_u : H \otimes K \to \mathbb{C}$ the corresponding associated functional. Finally, for $u, v \in H \otimes K$ we put $\langle u, v\rangle := \overline{f_u(v)}$.
Suppose $u = \sum_{k=1}^{n} x'_k \otimes y'_k$ and $v = \sum_{l=1}^{n} x''_l \otimes y''_l$. Then elementary arguments using the linearity of the functionals $f_u$ and $g_{x,y}$ provide equality (1). Its special case is the equality given in the statement. Further, it easily follows from (1) that $\langle u, v\rangle = \overline{\langle v, u\rangle}$ and $\langle u, u\rangle \ge 0$. Finally, from the obvious equality $\langle u, v\rangle = f_v(u)$ it follows that $\langle u, v\rangle$ is linear in the first argument. Thus, it is indeed a pre-inner product.
It remains to consider the case of inner products, i.e., near-Hilbert $H$ and $K$. Suppose $u \in H \otimes K$; $u = \sum_{k=1}^{n} x_k \otimes y_k$ is non-zero. Take an orthonormal basis $e_1, \dots, e_m$ in $\operatorname{span}\{x_1, \dots, x_n\}$. Each $x_k$ has the form $\sum_{l=1}^{m} \lambda_{kl} e_l$; $\lambda_{kl} \in \mathbb{C}$. Hence, obviously, $u = \sum_{l=1}^{m} e_l \otimes z_l$, where $z_l = \sum_{k=1}^{n} \lambda_{kl} y_k$, and since $u \ne 0$, not all $z_l$ vanish. We obtain that

$\langle u, u\rangle = \sum_{i,j=1}^{m} \langle e_i, e_j\rangle\langle z_i, z_j\rangle = \sum_{i=1}^{m} \|z_i\|^2 > 0.$

Thus, $\langle u, u\rangle \ne 0$, and $H \otimes K$ is a near-Hilbert space. ∎
Taking the continuity of the inner product (Proposition 1.2.4) into account, this obviously implies the following result.
Proposition 2. If a sequence $x_n$ tends to $x$ in $H$, and $y_n$ to $y$ in $K$, then $x_n \otimes y_n$ tends to $x \otimes y$ in $(H \otimes K, \langle\cdot,\cdot\rangle)$. ∎
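Returning to Proposition 1, the defining identity and formula (1) can be checked numerically in finite dimensions. The following is only a sketch under stated assumptions: it models $H = \mathbb{C}^2$ and $K = \mathbb{C}^3$, represents an elementary tensor by its coordinates in the product basis, and takes the inner product to be linear in the first argument, as in the text. The helper names `inner`, `tensor`, and `add` are hypothetical and chosen for this illustration.

```python
def inner(a, b):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(s * t.conjugate() for s, t in zip(a, b))

def tensor(x, y):
    """Coordinates of the elementary tensor x (x) y in the product basis."""
    return [xi * yj for xi in x for yj in y]

def add(a, b):
    """Coordinatewise sum of two vectors of equal length."""
    return [s + t for s, t in zip(a, b)]

# Sample vectors (assumed data): x1, x2 in H = C^2 and y1, y2 in K = C^3.
x1, y1 = [1 + 2j, -1j], [2, 1j, 1 - 1j]
x2, y2 = [3, 1 + 1j], [-1, 2j, 0.5]

# The identity <x1 (x) y1, x2 (x) y2> = <x1, x2> <y1, y2>.
lhs = inner(tensor(x1, y1), tensor(x2, y2))
rhs = inner(x1, x2) * inner(y1, y2)

# Formula (1) for u = x1 (x) y1 + x2 (x) y2 and v = x2 (x) y1 + x1 (x) y2.
u = add(tensor(x1, y1), tensor(x2, y2))
v = add(tensor(x2, y1), tensor(x1, y2))
xs_u, ys_u = [x1, x2], [y1, y2]
xs_v, ys_v = [x2, x1], [y1, y2]
formula_1 = sum(inner(xs_u[k], xs_v[l]) * inner(ys_u[k], ys_v[l])
                for k in range(2) for l in range(2))

print(abs(lhs - rhs) < 1e-9, abs(inner(u, v) - formula_1) < 1e-9)
```

In this coordinate model the pre-inner product of Proposition 1 is simply the standard inner product on $\mathbb{C}^6$, which is why both comparisons are expected to agree up to rounding.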
From now till the end of the section $H$ and $K$ are Hilbert spaces. Denote by $H \mathbin{\dot{\otimes}} K$ the Hilbert space which is a completion of the near-Hilbert space $(H \otimes K, \langle\cdot,\cdot\rangle)$ (see Proposition 6.3).
Definition 1. The pair $(H \mathbin{\dot{\otimes}} K, \vartheta)$, where $\vartheta : H \times K \to H \mathbin{\dot{\otimes}} K$ is the bilinear operator acting by the rule $(x, y) \mapsto x \otimes y$, is called the Hilbert tensor product of the Hilbert spaces $H$ and $K$.
Apparently, the "advanced" reader must feel some discomfort because of the discrepancy between the definitions of Banach and Hilbert tensor products: the first was given in terms of a universal property, while the second was given by an explicit construction. In fact, in an exposition of this circle of problems (exceeding the very preliminary information we present in this book) the Hilbert tensor product can (and should) be defined in the spirit of Definitions 1 and 3. If you are curious, here is the scheme (cf. [50]). Instead of the
class of all bounded bilinear operators, in Definition 3, we should consider the class of Schmidt bilinear operators. This is the name for bilinear operators $\mathcal{R} : H \times K \to L$, where $H$, $K$, $L$ are Hilbert spaces, with the following property: for each total orthonormal system $\{e_\mu; \mu \in \Lambda_1\}$ in $H$ and $\{e_\nu; \nu \in \Lambda_2\}$ in $K$ and each $z \in L$ we have $\sum\{|\langle \mathcal{R}(e_\mu, e_\nu), z\rangle|^2 : \mu \in \Lambda_1, \nu \in \Lambda_2\} < \infty$. (A remarkable fact is that the indicated number depends only on $z$ and not on the choice of orthonormal systems.) The supremum of the indicated numbers over all $z \in B_L$ is called the Schmidt norm of our bilinear operator.
Now we say that the pair $(\Theta, \theta)$, where $\Theta$ is a Hilbert space and $\theta : H \times K \to \Theta$ a bilinear Schmidt operator, is the Hilbert tensor product of the Hilbert spaces $H$ and $K$ if for each Hilbert space $L$ and for each bilinear Schmidt operator $\mathcal{R} : H \times K \to L$ with Schmidt norm $\le 1$, there exists a unique linear operator $R : \Theta \to L$ with operator norm $\le 1$ (i.e., a contraction operator) such that the diagram

$$\begin{array}{ccc} H \times K & & \\ {\scriptstyle\theta}\downarrow & \searrow{\scriptstyle\mathcal{R}} & \\ \Theta & \xrightarrow{\ R\ } & L \end{array}$$

is commutative. This definition immediately implies the uniqueness theorem in the spirit of Theorems 1 and 3 (formulate it!). After that, as a proof of the existence theorem we must explicitly construct the pair $(H \mathbin{\dot{\otimes}} K, \vartheta)$ and establish that it indeed has the universal property just described. All these arguments are sufficiently important and instructive, but exceed the scope of this book.
Now we suggest two exercises clarifying the nature of the Hilbert tensor product. Suppose (only for simplicity) that $H$ and $K$ are separable, $e'_n$; $n \in \mathbb{N}$ is an orthonormal Schauder basis in the first space, and $e_n$; $n \in \mathbb{N}$ in the second space.
Exercise 2. The system $e'_m \otimes e_n$; $m, n \in \mathbb{N}$, arranged as a sequence in an arbitrary order, is an orthonormal basis in $H \mathbin{\dot{\otimes}} K$.
Next, for each $n = 1, 2, \dots$ we put $H_n := \{x \otimes e_n; x \in H\}$. Clearly, this is a closed subspace in $H \mathbin{\dot{\otimes}} K$, and there is an isometric (and unitary) isomorphism between $H$ and $H_n$ taking $x$ to $x \otimes e_n$. (We can say that $H_n$; $n \in \mathbb{N}$ are isometrically isomorphic copies of the space $H$.) By analogy, in the same space $H \mathbin{\dot{\otimes}} K$ there are closed subspaces $K_n$ that are isometrically isomorphic copies of the space $K$.
Exercise 3*. Show that there exists a unique unitary isomorphism $U : H \mathbin{\dot{\otimes}} K \to \dot{\bigoplus}\{H_n; n \in \mathbb{N}\}$ (where $\dot{\bigoplus}$ denotes the Hilbert sum defined in Section 1) taking $x \otimes e_n$ to the sequence $(z_1, z_2, \dots)$; $z_m \in H_m$ such that $z_n = x \otimes e_n$ and $z_m = 0$ for $m \ne n$. Construct a similar unitary isomorphism between $H \mathbin{\dot{\otimes}} K$ and $\dot{\bigoplus}\{K_n; n \in \mathbb{N}\}$.
Hint. There exists a unique operator $R : H \otimes K \to \dot{\bigoplus}\{H_n; n \in \mathbb{N}\}$ taking $x \otimes y$ to the sequence $(z_1, z_2, \dots)$ such that $z_n := \langle y, e_n\rangle\, x \otimes e_n$. It defines an isometric isomorphism between the dense subspace in $H \mathbin{\dot{\otimes}} K$ consisting of sums of elementary tensors of the form $x \otimes e_n$; $x \in H$, $n \in \mathbb{N}$ and a dense subspace in $\dot{\bigoplus}\{H_n; n \in \mathbb{N}\}$.
Now we give an instructive example.
Exercise 4 (cf. Theorem 7.5). Let $(X_1, \mu_1)$ and $(X_2, \mu_2)$ be two measure spaces, and $(X_1 \times X_2, \mu_1 \times \mu_2)$ their product. Then, up to an isometric isomorphism,

$$L_2(X_1, \mu_1) \mathbin{\dot{\otimes}} L_2(X_2, \mu_2) = L_2(X_1 \times X_2, \mu_1 \times \mu_2).$$

Hint. Put $L := \operatorname{span}\{\chi_M \otimes \chi_N\}$, where $M$ and $N$ are arbitrary measurable subsets in $X_1$ and $X_2$ respectively, and $\chi$ is the characteristic function; this is a dense subspace in $L_2(X_1, \mu_1) \mathbin{\dot{\otimes}} L_2(X_2, \mu_2)$. The operator from $L_2(X_1, \mu_1) \otimes L_2(X_2, \mu_2)$ to $L_2(X_1 \times X_2, \mu_1 \times \mu_2)$, well defined by the rule $x \otimes y \mapsto x(s)y(t)$, establishes an isometric isomorphism of $L$ onto a dense subspace in $L_2(X_1 \times X_2, \mu_1 \times \mu_2)$.
One of the main advantages of the Hilbert tensor product, which makes it resemble the Banach tensor product, is that this construction is "functorial": it is defined not only for the spaces, but for the operators as well. The following theorem is fundamental here.
Theorem 1 (cf. Theorem 7.6). Let $S : H_1 \to H_2$ and $T : K_1 \to K_2$ be bounded operators between Hilbert spaces. Then there exists a unique bounded operator $S \mathbin{\dot{\otimes}} T : H_1 \mathbin{\dot{\otimes}} K_1 \to H_2 \mathbin{\dot{\otimes}} K_2$ such that $(S \mathbin{\dot{\otimes}} T)(x \otimes y) = S(x) \otimes T(y)$ for all $x \in H_1$, $y \in K_1$. Moreover, $\|S \mathbin{\dot{\otimes}} T\| = \|S\|\,\|T\|$.
Proof. Since the algebraic tensor product $H_1 \otimes K_1$, i.e., the linear span of elementary tensors in $H_1 \mathbin{\dot{\otimes}} K_1$, is dense in the latter space, there may exist at most one bounded operator satisfying the desired condition. Furthermore, by the identity $\|x \otimes y\| = \|x\|\,\|y\|$ for the norm in the Hilbert tensor product, for this hypothetical operator $S \mathbin{\dot{\otimes}} T$ we would have

$$\|S \mathbin{\dot{\otimes}} T\| \ge \sup\{\|(S \mathbin{\dot{\otimes}} T)(x \otimes y)\|; x \in B_{H_1}, y \in B_{K_1}\} = \sup\{\|S(x)\|\,\|T(y)\|; x \in B_{H_1}, y \in B_{K_1}\} = \|S\|\,\|T\|.$$

Hence, our goal is to show that this operator indeed exists, and $\|S \mathbin{\dot{\otimes}} T\| \le \|S\|\,\|T\|$. In fact, everything can be reduced to the following special case.
Lemma. There exists a bounded operator $S \mathbin{\dot{\otimes}} 1 : H_1 \mathbin{\dot{\otimes}} K_1 \to H_2 \mathbin{\dot{\otimes}} K_1$ such that $(S \mathbin{\dot{\otimes}} 1)(x \otimes y) = S(x) \otimes y$ for all $x \in H_1$, $y \in K_1$. Moreover, $\|S \mathbin{\dot{\otimes}} 1\| \le \|S\|$.
Proof. Consider the bilinear operator $\mathcal{R} : H_1 \times K_1 \to H_2 \mathbin{\dot{\otimes}} K_1 : (x, y) \mapsto S(x) \otimes y$. Suppose $R : H_1 \otimes K_1 \to H_2 \mathbin{\dot{\otimes}} K_1$ is the associated operator. Take $u \in H_1 \otimes K_1$, and a representation $u = \sum_{k=1}^{n} x_k \otimes y_k$. Without loss of generality, we can assume that the system $y_1, \dots, y_n \in K_1$ is orthonormal. (Otherwise, we choose an orthonormal basis in $\operatorname{span}\{y_1, \dots, y_n\}$, decompose the vectors $y_k$ with respect to this basis, and use the bilinearity of the symbol $\otimes$.) Then the system $x_1 \otimes y_1, \dots, x_n \otimes y_n$ is orthogonal in $H_1 \otimes K_1$, and $S(x_1) \otimes y_1, \dots, S(x_n) \otimes y_n$ is orthogonal in $H_2 \otimes K_1$. Therefore, using the Pythagorean equality (Proposition 1.2.7) we have

$$\|R(u)\|^2 = \Big\|\sum_{k=1}^{n} S(x_k) \otimes y_k\Big\|^2 = \sum_{k=1}^{n} \|S(x_k) \otimes y_k\|^2 = \sum_{k=1}^{n} \|S(x_k)\|^2 \le \|S\|^2 \sum_{k=1}^{n} \|x_k\|^2 = \|S\|^2 \sum_{k=1}^{n} \|x_k \otimes y_k\|^2 = \|S\|^2 \|u\|^2.$$

Thus, $R$ is a bounded operator from the near-Hilbert space $H_1 \otimes K_1$ to the Hilbert space $H_2 \mathbin{\dot{\otimes}} K_1$, and $\|R\| \le \|S\|$. Extending it by continuity to the whole $H_1 \mathbin{\dot{\otimes}} K_1$, we obtain the operator $S \mathbin{\dot{\otimes}} 1$ with the required properties. ∎
Now we complete the proof of Theorem 1. Similarly to the lemma, we obtain a bounded operator $1 \mathbin{\dot{\otimes}} T : H_2 \mathbin{\dot{\otimes}} K_1 \to H_2 \mathbin{\dot{\otimes}} K_2$ such that $(1 \mathbin{\dot{\otimes}} T)(x \otimes y) = x \otimes T(y)$ for all $x \in H_2$, $y \in K_1$, and $\|1 \mathbin{\dot{\otimes}} T\| \le \|T\|$. Put $S \mathbin{\dot{\otimes}} T := (1 \mathbin{\dot{\otimes}} T)(S \mathbin{\dot{\otimes}} 1) : H_1 \mathbin{\dot{\otimes}} K_1 \to H_2 \mathbin{\dot{\otimes}} K_2$. By the multiplicative inequality for the operator norm, this operator is bounded, and $\|S \mathbin{\dot{\otimes}} T\| \le \|S\|\,\|T\|$. The rest is clear. ∎
Here is a finite-dimensional illustration. Let $e'_1, \dots, e'_m$ and $e_1, \dots, e_n$ be orthonormal bases in $H$ and $K$, respectively, and $S : H \to H$ and $T : K \to K$ operators given in these bases by the matrices $a = (a_{kl})$ and $b = (b_{ij})$. For brevity, denote $e_{rs} := e'_r \otimes e_s$; $r = 1, \dots, m$, $s = 1, \dots, n$.
Exercise 5°. The operator $S \mathbin{\dot{\otimes}} T$ acting on $H \mathbin{\dot{\otimes}} K$ (or, what is the same, in $H \otimes K$) is given in the basis $e_{11}, e_{12}, \dots, e_{1n}, e_{21}, \dots, e_{2n}, \dots, e_{m1}, \dots, e_{mn}$ by the block matrix $(a_{kl} b)$, and in the basis $e_{11}, e_{21}, \dots, e_{m1}, e_{12}, \dots, e_{m2}, \dots, e_{1n}, \dots, e_{mn}$ by the block matrix $(b_{ij} a)$.
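The block matrices of Exercise 5 are what is usually called the Kronecker product of the matrices of $S$ and $T$. The sketch below, written in plain Python with hypothetical helper names (`kron`, `apply`, `tensor`) and assumed sample matrices, checks on one example that the block matrix built from $a$ and $b$ sends the coordinates of $x \otimes y$ to those of $S(x) \otimes T(y)$, and that reordering the basis as in the exercise exchanges the roles of $a$ and $b$. It is an illustration, not a proof.

```python
def kron(a, b):
    """Block matrix (a[k][l] * b) built from square matrices a (m x m) and b (n x n)."""
    m, n = len(a), len(b)
    return [[a[k][l] * b[i][j] for l in range(m) for j in range(n)]
            for k in range(m) for i in range(n)]

def apply(mat, vec):
    """Matrix-vector product."""
    return [sum(row[c] * vec[c] for c in range(len(vec))) for row in mat]

def tensor(x, y):
    """Coordinates of x (x) y in the basis e_11, e_12, ..., e_1n, e_21, ..., e_mn."""
    return [xi * yj for xi in x for yj in y]

# Assumed sample data: a is the matrix of S (m = 2), b the matrix of T (n = 3).
a = [[1, 2], [3, 4]]
b = [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
x, y = [1, -1], [2, 0, 5]

# (S (x) T)(x (x) y) computed in coordinates, against S(x) (x) T(y).
left = apply(kron(a, b), tensor(x, y))
right = tensor(apply(a, x), apply(b, y))

# Pass to the basis e_11, e_21, ..., e_m1, e_12, ..., e_mn:
# position s*m + r in the new order corresponds to index r*n + s in the old one.
m, n = len(a), len(b)
perm = [r * n + s for s in range(n) for r in range(m)]
ab = kron(a, b)
reordered = [[ab[p][q] for q in perm] for p in perm]

print(left == right, reordered == kron(b, a))
```

The list `perm` encodes only the change of ordering of the basis vectors $e_{rs}$; no vector of the basis is altered, which is why the two block matrices describe the same operator.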
In conclusion, we suggest that advanced readers define the functor of Hilbert tensor product $H \mathbin{\dot{\otimes}}\, ?$ (similarly to the functor of Banach tensor product), this time acting on the category Hil.
Chapter 3
From Compact Spaces to Fredholm Operators
1. Compact spaces and relevant functional spaces
A significant part of this and the next section is, most probably, known to you. However, our goal is to present this information in the form we will need later.
In the history of mathematics there have been many results that were originally viewed by their authors as auxiliary, and thus were modestly called lemmas. But the later development of the ideas behind them led (perhaps people of another generation) to the discovery of fundamental mathematical notions. One of the most striking examples of such a development is the story of Borel's lemma, which led to the notion of compactness.
Let $(\Omega, \tau)$ be a topological space, and $\Delta$ a subset in $\Omega$. A subfamily $\sigma \subseteq \tau$ is called an open covering of this subset if $\Delta \subseteq \bigcup\{U : U \in \sigma\}$. If $\sigma_0$ and $\sigma$ are two open coverings of the same $\Delta$, and $\sigma_0 \subseteq \sigma$, then $\sigma_0$ is called a subcovering of $\sigma$; we also say that $\sigma$ contains the subcovering $\sigma_0$. Very often when coverings are discussed, the role of $\Delta$ belongs to the whole space $\Omega$.
Definition 1. A topological space $\Omega$ is said to be compact if every open covering of $\Omega$ contains a finite (i.e., consisting of a finite number of sets) subcovering. A subset $\Delta$ of an (arbitrary) topological space $\Omega$ is said to be a compact subset if it is compact as a topological subspace in $\Omega$.
The term "compact" seems to be very appropriate and reflects our everyday impression of what is compact. The reader will see this gradually, as more material is accumulated.
We can also speak about compact subsets without involving the inherited topology. The following proposition is verified immediately.
Proposition 1. A subset $\Delta$ of a topological space $\Omega$ is compact $\iff$ every open covering of $\Delta$ contains a finite subcovering. ∎
It is not difficult to give an equivalent definition of compactness in terms of closed sets (since they are defined as complements of the open ones) and in terms of adherent points. With this goal in mind, we say that a family $\sigma$ of subsets in $\Omega$ is centered if every finite subfamily of $\sigma$ has a non-empty intersection.
Proposition 2. The following properties of a topological space $\Omega$ are equivalent:
(i) $\Omega$ is compact;
(ii) every centered family of closed subsets in $\Omega$ has a non-empty intersection;
(iii) every centered family of (arbitrary) subsets in $\Omega$ has a common adherent point.
Proof. The equivalence (i) $\iff$ (ii) immediately follows from the standard set-theoretic identities that connect complements, unions, and intersections (write these identities). Then, clearly, (iii) implies (ii). The converse implication follows from the fact that the family of closures of a centered family is again centered, and a common point of these closures is a common adherent point of the initial sets. ∎
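For the reader who wants to see the identities invoked in the proof written out, here is one possible way to record them. The notation is an assumption made for this sketch: $\Omega$ is the space, $\sigma$ is a family of open sets, and $F_U := \Omega \setminus U$ is the closed complement of $U \in \sigma$.

```latex
% De Morgan identities for a family \sigma of subsets of \Omega:
\[
  \Omega \setminus \bigcup_{U \in \sigma} U \;=\; \bigcap_{U \in \sigma} (\Omega \setminus U),
  \qquad
  \Omega \setminus \bigcap_{U \in \sigma} (\Omega \setminus U) \;=\; \bigcup_{U \in \sigma} U .
\]
% Consequently, with F_U := \Omega \setminus U,
\[
  \sigma \text{ covers } \Omega
  \iff \bigcap_{U \in \sigma} F_U = \varnothing,
  \qquad
  \{U_1, \dots, U_n\} \text{ covers } \Omega
  \iff F_{U_1} \cap \dots \cap F_{U_n} = \varnothing .
\]
```

Read contrapositively, "every open covering contains a finite subcovering" becomes "every family of closed sets in which all finite subfamilies have non-empty intersection has non-empty intersection", which is property (ii).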
Evidently, as a special case of (ii), we obtain that a sequence $V_1 \supseteq V_2 \supseteq V_3 \supseteq \cdots$ of nested closed non-empty subsets of a compact topological space always has at least one common point. This property is essentially stronger than the "nested closed subsets principle" in the context of complete metric spaces. The latter, as we recall, requires that the diameters of these sets tend to zero.
The advanced reader who has done Exercise 0.2.7 knows that in a topological space the family of open sets can be described in terms of convergent nets. Naturally, he will ask: how can we determine, using the same terms, whether our space is compact? The answer is given in terms of subnets, a reasonable although not so straightforward generalization of the notion of subsequence.
Definition 2. Let $x_\nu$; $\nu \in \Lambda$ and $y_\mu$; $\mu \in \Lambda'$ be two nets of elements of a set $X$. The second net is called a subnet of the first if there exists a mapping $r : \Lambda' \to \Lambda$ such that for all $\mu \in \Lambda'$ we have $x_{r(\mu)} = y_\mu$, and for each $\nu \in \Lambda$ there is $\mu \in \Lambda'$ such that $\nu \prec r(\mu)$.
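As a quick check that this notion really extends the familiar one, the following sketch verifies that a subsequence satisfies Definition 2 as stated above. The symbols $n_k$ and the choice of the index mapping are introduced here only for illustration.

```latex
% A subsequence viewed as a subnet.
% Take \Lambda = \Lambda' = \mathbb{N} with the usual order, a sequence x_n, and
% a strictly increasing sequence of indices n_1 < n_2 < \dots; put y_k := x_{n_k}.
\[
  r : \mathbb{N} \to \mathbb{N}, \qquad r(k) := n_k,
  \qquad\text{so that}\qquad x_{r(k)} = x_{n_k} = y_k \quad (k \in \mathbb{N}).
\]
% Since the n_k are strictly increasing positive integers, n_k \ge k for all k.
% Hence for each n \in \mathbb{N} the index k := n satisfies
\[
  n \le n_n = r(n), \qquad\text{i.e.}\qquad n \prec r(k) \ \text{ with } k = n .
\]
```

Both conditions of the definition hold, so every subsequence is a subnet; the converse fails in general, because the index set of a subnet need not be the natural numbers.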
Theorem 1 ([13, Chapter 5, Theorem 2]). A topological space $\Omega$ is compact $\iff$ every net of elements of $\Omega$ has a convergent (in $\Omega$) subnet.
Clearly, if two topological spaces are homeomorphic, then they are compact or not compact simultaneously; as one says, compactness is a topological property (i.e., a property that is invariant under isomorphisms in Top). This is the main difference between this property and the completeness of metric spaces, which is not invariant with respect to homeomorphisms (cf. the beginning of Section 2.1).
The first example of a compact space is, of course, a closed interval of the number line (and this is just the content of Borel's lemma). From a course of calculus you know that an arbitrary set in $\mathbb{R}^n$ or $\mathbb{C}^n$ with the Euclidean metric is compact $\iff$ it is closed and bounded. Proposition 2.1.4 evidently implies that the same is true for sets in arbitrary finite-dimensional normed spaces (we will return to this in the next section). Further, a discrete topological space is compact $\iff$ it is finite (look at its covering by singletons). By contrast, an antidiscrete space is always compact. You can easily verify that the complex plane with the Zariski topology is also compact, and this, of course, distinguishes the Zariski topology from the standard topology in $\mathbb{C}$ used in analysis.
Actually, $\mathbb{C}^n$ with the Zariski topology is compact for all $n$, and moreover a stronger condition is true: every decreasing sequence of its closed subsets stabilizes. (This shows how different $\mathbb{C}^n$ looks to an algebraist and an analyst.) But we will not digress to this; see, e.g., [52, 2.1].
Why do mathematicians like compact spaces? First, the following fact lies on the surface.
Theorem 2 (recall the Weierstrass theorem!). If $\Omega$ is a compact space, then every continuous function $\varphi : \Omega \to \mathbb{C}$ is bounded, and the supremum of its absolute value is attained.
Proof. For every $\lambda \in \mathbb{C}$ we put $U_\lambda := \{t \in \Omega : |\varphi(t) - \lambda| < 1\}$. Clearly, $U_\lambda$; $\lambda \in \mathbb{C}$ is an open covering of our compact space; let $U_{\lambda_1}, \dots, U_{\lambda_n}$ be a finite subcovering. Then for every $t \in \Omega$ we have $|\varphi(t)| < \sum_{k=1}^{n} |\lambda_k| + 1$. The proof of the statement about the supremum repeats word-for-word the proof of the classical Weierstrass theorem. ∎
Thus, if $\Omega$ is compact, then the function space $C(\Omega)$ coincides with $C_b(\Omega)$ (see Example 1.1.6'), and therefore it is a Banach space with respect to the uniform norm. We will see more than once how important these spaces are.
A function $\varphi : \Omega \to \mathbb{C}$ on an (arbitrary, for the present) topological space is called vanishing at infinity if for every $\varepsilon > 0$ there is a compact subset $\Delta$ in $\Omega$ such that $|\varphi(t)| < \varepsilon$ for $t \notin \Delta$. Theorem 2 implies the following.
Proposition 3. Every continuous function vanishing at infinity on a topological space is bounded, and the supremum of its absolute value is attained. ∎
The set of continuous functions on a topological space $\Omega$ that vanish at infinity is denoted by $C_0(\Omega)$. It is easy to verify that this is a closed subspace in $C_b(\Omega)$, and therefore a Banach space with respect to the uniform norm. If $\Omega$ is compact, then certainly $C_0(\Omega) = C(\Omega)$. We have already come across typical spaces of that kind among the examples of normed spaces of Section 1.1: these are $c_0(X)$ (i.e., $C_0(X)$ for the discrete $X$, in particular, $c_0 = C_0(\mathbb{N})$), and also $C_0(\mathbb{R})$.
Let us continue the discussion of nice properties of compact spaces.
Proposition 4. A closed subset of a compact space is a compact space.
Proof. This follows immediately from Proposition 2(ii). ∎
Proposition 5. A compact subset of a Hausdorff space is closed.
Proof. Suppose we speak about a subset $\Delta$ in $\Omega$. Take a point $x$ in $\Omega \setminus \Delta$. Our aim is to show that $x$ is an interior point for $\Omega \setminus \Delta$. For each $y \in \Delta$ denote by $U_x^{y}$ and $V_y$ disjoint neighborhoods of $x$ and $y$. Then the family $\{V_y : y \in \Delta\}$, being a covering of $\Delta$, contains a finite subcovering, say, $\{V_{y_1}, \dots, V_{y_n}\}$. Hence, the set $\bigcap_{k=1}^{n} U_x^{y_k}$ is a neighborhood of $x$ that does not intersect $\Delta$. ∎
Proposition 6. Let $\varphi : \Omega_1 \to \Omega_2$ be a continuous mapping of a compact space to an arbitrary topological space. Then $\operatorname{Im}(\varphi)$ is a compact subset in $\Omega_2$. (In short: a continuous image of a compact space is compact.)
Proof. Let $\sigma$ be an open covering of $\operatorname{Im}(\varphi)$ in $\Omega_2$. Then $\{\varphi^{-1}(U) : U \in \sigma\}$ is an open covering of $\Omega_1$. Take its finite subcovering, say, $\{V_1, \dots, V_n\}$, and consider the family $\{\varphi(V_1), \dots, \varphi(V_n)\}$. The rest is clear. ∎
Combining these facts we obtain a very substantial result.
Theorem 3 (Alexandroff¹). Let $\Omega_1$ be a compact space, $\Omega_2$ a Hausdorff topological space, and $\varphi : \Omega_1 \to \Omega_2$ a continuous bijective mapping. Then this is a homeomorphism (in other words, $\varphi^{-1}$ is automatically continuous).
Proof. Since the homeomorphisms are precisely the continuous open bijections (see Section 0.2), our goal is to show that $\varphi$ is open, or (what is the same, taking the bijectivity of $\varphi$ into account) that the image of every closed set $\Delta$ in $\Omega_1$ is closed in $\Omega_2$. Successively applying Propositions 4, 6, and 5, we see first that $\Delta$ is compact, second that $\varphi(\Delta)$ has the same property, and finally that the latter set is closed. ∎
¹ P. S. Alexandroff (1896-1982) was a prominent Russian mathematician, one of the creators of topology. Together with his friend P. S. Urysohn (the second P. S., as they used to joke) he introduced the class of topological spaces that they called "bicompact spaces". Now we call them "compact spaces".
Remark. Note the intrinsic similarity of the Alexandroff theorem and the Banach theorem on the inverse operator. Both describe situations (that seem to have nothing in common) where a continuous bijective mapping is automatically a homeomorphism.
Certainly, neither the compactness of the first space nor the Hausdorff property of the second can be omitted. (Construct the corresponding examples.)
The Alexandroff theorem can be generalized in two directions.
Proposition 7 (cf. Corollaries 2.4.2 and 2.4.1). Let $\varphi : \Omega_1 \to \Omega_2$ be a
continuous mapping of a compact space to a Hausdorff space. Then
(i) if $\varphi$ is injective, then it is topologically injective;
(ii) if $\varphi$ is surjective, then it is topologically surjective.
Proof. The first statement follows from Theorem 3 and the Hausdorff property of $\Omega_2$. To verify the second, denote by $\tau$ the topology on $\Omega_2$ and by $\tau_1$ the quotient topology modulo $\varphi$ (see Section 0.2). Then, by Proposition 6, $(\Omega_2, \tau_1)$ is compact, and thus the identity mapping $1 : (\Omega_2, \tau_1) \to (\Omega_2, \tau)$ satisfies the hypothesis of Theorem 3. The rest is clear. ∎

* * *
Now our categorical zoo is supplemented with a new animal of a valuable breed. In Top we consider the full subcategory whose objects are compact Hausdorff spaces and denote it by CHTop. (Topologists are also interested in the category CTop consisting of all compact spaces, but it is considerably less important for functional analysis.) Certainly, the isomorphisms in CHTop are homeomorphisms, but due to Theorem 3, they can be characterized in this category simply as bijective morphisms. (Thus, the Alexandroff theorem has the same meaning for the category CHTop as the Banach theorem does for the category Ban.) As for other types of morphisms, one can easily verify (cf. Propositions 0.5.5 and 0.5.6) that the monomorphisms in CHTop are precisely the injective morphisms, and all surjective morphisms are epimorphisms. Actually, the epimorphisms are completely characterized as surjective morphisms, and this distinguishes the category CHTop from the category HTop and makes CHTop similar to Top (cf. Section 0.5). But we leave such things for the advanced reader.
The structure of "good" morphisms distinguished in Section 0.5 is ideally simple in the categories Set and Lin. Compared to this, the category CHTop gives a picture of the next level of complexity. The following exercise summarizes our earlier observations.
Exercise 1. In the category CHTop
(i) every bimorphism is an isomorphism; (ii) a morphism
::
Hint. The main observation is that every epimorphism is surjective; everything else easily follows from this. The main assertion can be proved as suggested in the remark after Proposition 0.5.6.
The feature of the introduced category most important for applications is the existence of categorical products in a transparent explicit form. The following theorem is one of the most important in topology. At the same time it has various helpful implications in functional analysis (see, e.g., Theorem 4.2.5). Let us recall the construction of the topological (Tychonoff) product of topological spaces, which provides, together with the projections, a category-theoretical product of objects in Top (and in HTop) (see Section 0.6).
Theorem 4 (Tychonoff). The topological product of an arbitrary family of compact topological spaces is a compact space.
Proof. Let $\Omega_\nu$; $\nu \in \Lambda$ be our spaces, and $\Omega := \times\{\Omega_\nu; \nu \in \Lambda\}$ their topological product. Further, let $\sigma$ be an arbitrary centered family of subsets in $\Omega$. By Proposition 2(iii), it is sufficient to find a common adherent point for this family.
Lemma. The family $\sigma$ is contained in a maximal centered family $\beta$ of subsets in $\Omega$ (i.e., in a family $\beta$ that cannot be appended with one more set without violating the centering condition).
Proof. Consider the set $X$ of all centered families of subsets in $\Omega$ that contain the family $\sigma$. It is ordered by inclusion, and the union of all families in a linearly ordered subset in $X$ is evidently a centered family. From this we see that $X$ as an ordered set satisfies the assumptions of Zorn's lemma. Hence, $X$ has a maximal element. It is easy to see that this maximal element is the desired family. ∎
End of the proof of Theorem 4. For each "coordinate" $\nu \in \Lambda$ we take the projection $\pi_\nu : \Omega \to \Omega_\nu$ and consider the family $\beta_\nu := \{\pi_\nu(\Delta); \Delta \in \beta\}$ of subsets in $\Omega_\nu$. Since $\beta$ is centered, the same is true for $\beta_\nu$. By compactness of $\Omega_\nu$, all sets in $\beta_\nu$ have a common adherent point, say, $t_\nu$.
Now consider the point $t$ in $\Omega$ with coordinates $t_\nu$. From the definition of the topology in $\Omega$ it follows that every neighborhood of this point contains a neighborhood of the form $V := \bigcap_{k=1}^{n} V_{\nu_k}$, where $V_{\nu_k}$; $\nu_k \in \Lambda$ is the inverse image with respect to $\pi_{\nu_k}$ of some neighborhood $U_{\nu_k}$ of the point $t_{\nu_k} \in \Omega_{\nu_k}$. By the choice of coordinates of the point $t$, for each $k = 1, \dots, n$ the neighborhood $V_{\nu_k}$ has a non-empty intersection with all sets of the family $\beta$. Note also that $\beta$, being a maximal centered system, must contain the intersection of every finite family of its sets. All this means that adding to the family $\beta$ the set $V_{\nu_k}$ results in a centered family, and thus, by maximality, $V_{\nu_k} \in \beta$ for all $k$, hence $V \in \beta$. Thus, $V$ has a non-empty intersection with every subset in the family $\beta$, and thus in the initial family $\sigma$. The rest is clear. ∎
Corollary 1. Every family of non-empty spaces in CHTop (and also in CTop) has a product, which is the topological product of all these spaces together with the corresponding projections.
We expect a curious reader to ask: what about coproducts? Recall that in the wider categories Top and HTop coproducts always exist, and they are just disjoint unions (Exercise 0.6.6). Since the disjoint union of an infinite family of non-empty topological spaces cannot be compact (check this!), one can think that only finite families have coproducts in CHTop. But actually the coproduct of an arbitrary family of compact Hausdorff spaces exists in CHTop; it just has a more complicated nature: one must take not the disjoint union of these spaces, but the so-called maximal compactification of this union.
But what is this compactification? The fact is that there are many ways to make a given space compact by adding to it new points in such a way that the initial space is dense in this new compact space. Every such compact space is called a compactification of the initial space. In this book we will come across the "minimal", or the so-called Alexandroff compactification, when only one point is added to a given space (see Definition 4). As you will see below, the construction of this compactification is quite simple. But there is also a much deeper fact. For a very wide class of Hausdorff spaces there exists the "maximal" (in some reasonable sense) compactification, the so-called Stone-Čech compactification. In this maximal compactification there can be many additional points; for instance, the corresponding procedure applied to the discrete $\mathbb{N}$ results in $2^{\aleph}$ ($\aleph$ is the continuum cardinality) new points.
The Stone-Čech compactification of a given space can be defined in terms of some universal property similarly to the definition of completions (cf. Proposition 2.6.1(i)). It can also be regarded as an initial object in the category whose objects are all Hausdorff compactifications of a given space and whose morphisms are continuous mappings that are the identity on this space. The details on compactifications can be found, e.g., in [14].
For a finite number of compact spaces the compactness of their topological product can be proved without using Zorn's lemma. The corresponding arguments are based on one important characterization of compact spaces which we give without proof (see, e.g., [14]).
Theorem 5 (Kuratowski, [14, Theorem 3.1.16]). A topological space $\Omega$ is compact $\iff$ for every topological space $\Delta$ the projection of the topological product $\Omega \times \Delta$ onto $\Delta$ maps every closed set to a closed set.
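To see what the condition in Theorem 5 excludes, it may help to look at a non-compact space. The following standard example, which is not taken from the text, uses the real line in both roles: the first factor is the space being tested for compactness, and the second factor is the auxiliary space.

```latex
% The real line fails the Kuratowski condition.
% Consider the closed subset (a hyperbola) of \mathbb{R} \times \mathbb{R}:
\[
  F := \{(s, t) \in \mathbb{R} \times \mathbb{R} : st = 1\},
\]
% which is closed as the preimage of the closed set \{1\} under the continuous
% map (s, t) \mapsto st. Its projection onto the second factor is
\[
  \{\, t \in \mathbb{R} : \exists\, s \in \mathbb{R},\ st = 1 \,\} \;=\; \mathbb{R} \setminus \{0\},
\]
% which is not closed in \mathbb{R}.
```

So the projection of $\mathbb{R} \times \mathbb{R}$ onto the second factor does not map closed sets to closed sets, in agreement with the fact that $\mathbb{R}$ is not compact.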
Exercise 2°. Deduce from Theorem 5 that the topological product of every finite family of compact spaces is compact.
Hint. Use induction on the number of factors.
The category of compact Hausdorff topological spaces is related to other categories of analysis and topology by a series of functors. Here is one of the most important of them.
Example 1 (cf. Example 2.5.3). To every $\Omega \in \mathrm{Ob}$(CHTop) we assign the Banach space $C(\Omega)$, and to every $\omega \in h(\Omega_1, \Omega_2)$ (i.e., to every continuous mapping) the operator $C(\omega) : C(\Omega_2) \to C(\Omega_1)$ taking a function $\varphi : \Omega_2 \to \mathbb{C}$ to the function $C(\omega)(\varphi) : \Omega_1 \to \mathbb{C} : t \mapsto \varphi(\omega(t))$. Since $C(\omega)$ is a contraction, we obtain a contravariant functor $C$ : CHTop $\to$ Ban$_1$ called the functor of continuous functions. (Such a functor, of course, can be defined on other categories of topological spaces, say, on CTop, but some good properties will be lost in those cases; cf. Theorem 6 below.)
Now we keep our promise given in Section 2.2, and talk about some classical results concerning the classification of Banach spaces of continuous functions. Formally, these results are related to the problem of classification of objects in two categories: a full subcategory in Ban$_1$ and a full subcategory in Ban. Both have the same objects, namely, the spaces of the form $C(\Omega)$, where $\Omega$ is a compact Hausdorff space. Here (contrary to, say, the case of full subcategories consisting of Hilbert spaces; cf. Theorem 2.2.2) the "topological" and "metrical" classifications look quite different.
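To make the functor of Example 1 concrete, here is a small sketch on finite sets. It assumes that finite sets are given the discrete topology, so that they are compact Hausdorff spaces and every mapping between them is continuous; functions are modelled as Python callables. The names `C`, `sup_norm`, `omega`, `eta`, `phi`, and `psi` are hypothetical and chosen for this illustration only.

```python
def C(omega):
    """The operator C(omega): it sends a function phi to the composition phi o omega."""
    def pullback(phi):
        return lambda t: phi(omega(t))
    return pullback

def sup_norm(phi, points):
    """Uniform norm of phi on a finite set of points."""
    return max(abs(phi(t)) for t in points)

# Three finite discrete spaces (assumed sample data).
Omega1, Omega2, Omega3 = range(4), range(3), range(2)

omega = lambda t: t % 2           # a mapping Omega1 -> Omega2 (not surjective)
eta = lambda s: s % 2             # a mapping Omega2 -> Omega3
phi = lambda r: [2.0, -5.0][r]    # an element of C(Omega3)
psi = lambda s: [1.0, -3.0, 7.0][s]  # an element of C(Omega2)

# Contravariance: C(eta o omega) = C(omega) C(eta).
lhs = C(lambda t: eta(omega(t)))(phi)
rhs = C(omega)(C(eta)(phi))
same = all(lhs(t) == rhs(t) for t in Omega1)

# Contraction property: the norm of C(omega)(psi) does not exceed the norm of psi.
contracts = sup_norm(C(omega)(psi), Omega1) <= sup_norm(psi, Omega2)

print(same, contracts)
```

The order reversal in `C(omega)(C(eta)(phi))` is exactly what "contravariant" means here: the arrow $\Omega_1 \to \Omega_2 \to \Omega_3$ of spaces becomes the arrow $C(\Omega_3) \to C(\Omega_2) \to C(\Omega_1)$ of function spaces.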
Theorem 6 ([53, Theorem 7.8.4]). Let $\Omega_1$ and $\Omega_2$ be compact Hausdorff spaces. The following statements are equivalent:
(i) $C(\Omega_1)$ and $C(\Omega_2)$ are isometrically isomorphic as Banach spaces (i.e., they are isomorphic as objects in Ban$_1$);
(ii) $C(\Omega_1)$ and $C(\Omega_2)$ are isometric as metric spaces (i.e., they are isomorphic as objects in Met$_1$);
(iii) $\Omega_1$ and $\Omega_2$ are homeomorphic (i.e., they are isomorphic as objects in CHTop).
Furthermore, if the functor of continuous functions (see Example 1) takes a morphism $\varphi$ from CHTop to an isomorphism in Ban$_1$, then $\varphi$ is also an isomorphism. (Sometimes, one says that the functor $C$ respects isomorphisms.)
A proof of this theorem (as well as of the other classification results mentioned below) can be found, say, in [53].
Remark. The first two statements of the previous theorem immediately follow from the third in view of functorial properties of $C$ and Theorem 0.7.1. But the converse implication is not trivial; it was proved by Banach for metrizable compact spaces, and after that by Stone in the general case.
Thus, there are "as many" pairwise non-isomorphic (in $\mathrm{Ban}_1$) spaces of the form $C(\Omega)$ as pairwise non-homeomorphic compact spaces. In particular, cubes of different dimensions, spheres with different numbers of attached handles, the Cantor set: all these $\Omega$ give pairwise non-isometric spaces $C(\Omega)$. At the same time, we have the following result.
Theorem 7 (Milyutin, [53, Theorem 21.5.10]). Let $\Omega_1$ and $\Omega_2$ be (arbitrary) non-countable metrizable compact topological spaces. Then the Banach spaces $C(\Omega_1)$ and $C(\Omega_2)$ are topologically isomorphic (i.e., isomorphic as objects in $\mathrm{Ban}$). In particular, both of them are topologically isomorphic to $C[0, 1]$.
Together with compact spaces we consider the class of spaces which are their nearest generalizations. At first sight, this class looks much larger than the class of compact spaces, but we will see later that actually these spaces can be realized as parts of compact spaces.
We say that a subset of an arbitrary topological space is relatively compact if its closure is compact. (Thus, this property depends on the enveloping space that contains our set.)
Definition 3. A topological space $\Omega$ is called locally compact if every point of $\Omega$ has a relatively compact neighborhood.
Obviously, local compactness (as compactness; see above) is a topological property. Classical examples of locally compact but not compact spaces are the real line and the complex plane. Certainly, the reader can immediately conclude that all $\mathbb{R}^n$ and $\mathbb{C}^n$ are locally compact. By Proposition 2.1.4, this immediately implies that every finite-dimensional normed space, real or complex, has this property. (The same question for infinite-dimensional spaces will be discussed in the next section.) Every discrete space is obviously locally compact as well. Here are examples of spaces that are not locally compact.
Exercise 3. The topological product $\Omega$ of an infinite family $\Omega_\nu$; $\nu \in \Lambda$ of non-compact spaces (say, copies of $\mathbb{N}$ or $\mathbb{R}$) is not locally compact.
Hint. For every open set $U$ in $\Omega$ there exists $\nu \in \Lambda$ such that $\pi_\nu(U) = \Omega_\nu$.
Now we concentrate on Hausdorff locally compact spaces.
Exercise 4. Every open subset of a Hausdorff compact space is a Hausdorff locally compact space (in the inherited topology).
Hint. Every point of a given open set has a neighborhood $U$ such that its closure $\overline{U}$ is disjoint from the complement of this set.
It turns out that there are no other Hausdorff locally compact spaces; moreover, every such space is obtained from a compact space by removing one point. The construction of this compact space is a natural generalization of the construction of the extended complex plane, certainly known to the reader.
Let $(\Omega, \tau)$ be a topological space (so far arbitrary). Let us add to it one more point, called $\infty$, the point at infinity. Denote by $\Omega_+$ the obtained set, and consider the system $\tau_+$ of subsets in it consisting of a) all the sets from $\tau$, and b) all the sets of the form $\Omega_+ \setminus \Delta$, where $\Delta$ is a closed compact subset in $\Omega$. We see that the sets of the first type do not contain the point at infinity, whereas the sets of the second type do. It is trivially verified (word-for-word as for the extended complex plane) that $\tau_+$ is a topology in $\Omega_+$ and $(\Omega, \tau)$ is a topological subspace in $(\Omega_+, \tau_+)$. Also, it is easy to see that $\infty$ is an isolated point in $(\Omega_+, \tau_+)$ $\Longleftrightarrow$ the initial space $(\Omega, \tau)$ is compact.
Proposition 8. The topological space $(\Omega_+, \tau_+)$ is compact.
Proof. Let $\alpha$ be an open covering of our space, and $U_\infty \in \alpha$ a set containing $\infty$. Then $\Omega_+ \setminus U_\infty$ is compact in $\Omega$, and therefore the open covering $\{U \cap \Omega : U \in \alpha\}$ of this set in $(\Omega, \tau)$ contains a finite subcovering, say, $\{U_1 \cap \Omega, \dots, U_n \cap \Omega\}$. But then $\{U_1, \dots, U_n, U_\infty\}$ is a finite covering of the entire $\Omega_+$. ∎
Definition 4. The topological space $(\Omega_+, \tau_+)$ is called the one-point compactification, or the Alexandroff compactification, of the space $(\Omega, \tau)$.
Proposition 9. The one-point compactification of a Hausdorff locally compact space $\Omega$ is itself Hausdorff.
Proof. We need to show that different points $x, y \in \Omega_+$ have disjoint neighborhoods. Since $\Omega$ is Hausdorff, it is sufficient to consider the case where $y = \infty$. Take a neighborhood $U \subset \Omega$ of the point $x$ with compact closure $\Delta$ in $\Omega$; then $U$ and $\Omega_+ \setminus \Delta$ are the neighborhoods we are looking for. ∎
Exercise 5. For every topological space $\Omega$ the Banach space $C_0(\Omega)$ coincides, up to an isometric isomorphism, with the subspace (of codimension 1) in $C(\Omega_+)$ that consists of functions vanishing at infinity.
Hint. Assign to every function from $C_0(\Omega)$ its extension to $\Omega_+$ vanishing at $\infty$.
Here is a question (at first sight, harmless) for a strong student. How can we define morphisms in the category of Hausdorff locally compact spaces? One can say, there is no problem: all continuous mappings will do! But there is a much more useful approach. Namely, let us say that a morphism from $\Omega_1$ to $\Omega_2$ is a continuous mapping between the corresponding Alexandroff compactifications that preserves points at infinity, i.e., $\omega_+ : (\Omega_1)_+ \to (\Omega_2)_+$ such that $\omega_+(\infty) = \infty$. It is easy to show that a mapping $\omega : \Omega_1 \to \Omega_2$ can be extended to a continuous mapping $\omega_+$ with the indicated property if and only if for each compact set $K \subset \Omega_2$ its inverse image $\omega^{-1}(K)$ is compact. Such mappings are called proper. This definition is useful, for instance, because it is easy to see that for a function $x \in C_0(\Omega_2) \subset C((\Omega_2)_+)$ the corresponding function $x(\omega_+(t))$ belongs to $C_0(\Omega_1) \subset C((\Omega_1)_+)$. The constructed category plays an important role in the theory of commutative Banach algebras (cf. Section 5.3 in the sequel).
2. Compact metric spaces and total boundedness
Let us descend from general topology "one floor down" to the theory of metric spaces. How can we judge by the properties of the metric whether the underlying topological space is compact? If we try to express it informally, in "everyday words", the answer is as follows: a metric space is compact $\Longleftrightarrow$ it is "small" and does not have "holes". We already know the exact mathematical notion corresponding to our intuitive idea of "not having holes": it is completeness. Now we will get acquainted with metric spaces corresponding to the idea of "smallness". We start with an auxiliary notion.
Definition 1. Let $(M, d)$ be a metric space, $M_0$ a subspace of $M$, and $\varepsilon > 0$. A subset $N$ in $M$ is called a Hausdorff $\varepsilon$-net (or just an $\varepsilon$-net) for $M_0$ if for each element $x \in M_0$ there is $y \in N$ such that $d(x, y) < \varepsilon$ (or, in other words, if $M_0 \subset \bigcup \{U(y, \varepsilon);\ y \in N\}$).
Definition 2. A subset $M_0$ of a metric space $M$ is called totally bounded if for every $\varepsilon > 0$ in $M$ there is a finite (i.e., consisting of finitely many elements) $\varepsilon$-net for $M_0$. In particular, for $M_0 = M$ we obtain the definition of a totally bounded metric space.
The following proposition eliminates the possibility of different interpretations for the words "totally bounded subset".
Proposition 1. A subset $M_0$ of a metric space $M$ is totally bounded $\Longleftrightarrow$ $M_0$ is totally bounded as a metric subspace in $M$.
Proof. The second property means, of course, that for every $\varepsilon > 0$ the set $M_0$ has a finite $\varepsilon$-net consisting of the points of $M_0$. Hence, $\Longleftarrow$ is obvious. To establish $\Longrightarrow$, take $\varepsilon > 0$ and a finite $\varepsilon/2$-net $y_1, \dots, y_n \in M$ for $M_0$. Take those $k$; $1 \le k \le n$ for which there exists at least one point $z_k \in M_0$ such that $d(z_k, y_k) < \varepsilon/2$; we choose such a point $z_k$. Further, for every $x \in M_0$ there exists at least one $k$ such that $d(x, y_k) < \varepsilon/2$; so for such $k$ the point $z_k$ is determined, and from the triangle inequality it follows that $d(x, z_k) < \varepsilon$. Therefore, the points $z_k$ form an $\varepsilon$-net for $M_0$ in the very same $M_0$. ∎
Example 1. Take $\mathbb{C}^n_1$ and consider an arbitrary bounded subset $M$. Then for some $C > 0$ and for all $x = (x_1, \dots, x_n) \in M$ we have $|x_k| < C$ for all $k = 1, \dots, n$. For every $m \in \mathbb{N}$ we denote by $N_m$ the set of all points in $\mathbb{C}^n_1$ with coordinates of the form $C\,\frac{r + is}{m}$, where $r$ and $s$ are integers between $-m$ and $m$; certainly, this set is finite. One can easily prove that for every $\varepsilon > 0$ the set $N_m$ for $m > \frac{\sqrt{2}}{\varepsilon} C n$ is an $\varepsilon$-net for $M$. Thus, every bounded subset in $\mathbb{C}^n_1$ is totally bounded.
Proposition 2 (cf. Proposition 1.7). Let $\varphi : M_1 \to M_2$ be a uniformly continuous mapping of a totally bounded metric space to an arbitrary metric space. Then $\mathrm{Im}(\varphi)$ is a totally bounded subset in $M_2$. (Briefly: a uniformly continuous image of a totally bounded metric space is itself totally bounded.)
Proof. Take $\varepsilon > 0$, and then take $\delta > 0$ in the definition of uniform continuity. Let $y_1, \dots, y_n$ be a finite $\delta$-net in $M_1$. Then, obviously, $\varphi(y_1), \dots, \varphi(y_n)$ is a finite $\varepsilon$-net for $\mathrm{Im}(\varphi)$ in $M_2$. ∎
Corollary 1. Two uniformly homeomorphic (i.e., isomorphic in $\mathrm{Met}_u$; see Section 0.4) metric spaces are either simultaneously totally bounded or not.
Proposition 3. The closure of a totally bounded subset of a metric space is totally bounded.
Proof. Obviously, every $\varepsilon/2$-net in $M$ for $M_0$ is an $\varepsilon$-net for the closure of $M_0$. The rest is clear. ∎
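The grid construction of Example 1 can be tried numerically. The following is only a sketch under stated assumptions: the grid step $C/m$, the $l_1$-distance, the bound $m > \sqrt{2}Cn/\varepsilon$, and the helper names `grid_size`, `nearest_grid_point`, `l1_dist` are ours. Instead of listing the whole net $N_m$, the code rounds a sample point to its nearest grid point and measures the distance.

```python
import cmath
import math
import random

def grid_size(C, n, eps):
    """Smallest integer m with m > sqrt(2) * C * n / eps."""
    return math.floor(math.sqrt(2) * C * n / eps) + 1

def nearest_grid_point(x, C, m):
    """Round every coordinate of x to the grid C*(r + i*s)/m, |r|, |s| <= m."""
    h = C / m
    out = []
    for z in x:
        r = max(-m, min(m, round(z.real / h)))
        s = max(-m, min(m, round(z.imag / h)))
        out.append(complex(r * h, s * h))
    return out

def l1_dist(x, y):
    """Distance in C^n with the l_1 norm."""
    return sum(abs(a - b) for a, b in zip(x, y))

random.seed(0)
n, C, eps = 3, 2.0, 0.1
m = grid_size(C, n, eps)

# Sample points with all coordinates of modulus < C and record the
# largest distance to the nearest grid point.
worst = 0.0
for _ in range(2000):
    x = [cmath.rect(C * random.random(), 2 * math.pi * random.random())
         for _ in range(n)]
    worst = max(worst, l1_dist(x, nearest_grid_point(x, C, m)))
```

Rounding moves each coordinate by at most $C/(\sqrt{2}\,m)$, so the $l_1$-distance is at most $nC/(\sqrt{2}\,m) < \varepsilon/2$; the sampled value `worst` stays below `eps` accordingly.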
We can come to the notion of total boundedness in several ways:
Proposition 4. The following properties of a subset $M_0$ of a metric space $M$ are equivalent: (i) $M_0$ is totally bounded; (ii) every sequence of points in $M_0$ has a fundamental subsequence; (iii) every subset in $M_0$ consisting of points with pairwise distances exceeding some constant $\theta > 0$ is finite.
Proof. (i) $\Longrightarrow$ (ii). Let $x_1, x_2, \dots$ be our sequence. For each $m \in \mathbb{N}$, construct a subsequence $x_1^m, x_2^m, \dots$ in the following way. For $x_1^1, x_2^1, \dots$ we take the initial sequence. Now suppose for some $m$ we have already constructed the sequence $x_1^m, x_2^m, \dots$. Since for $M_0$ there is a finite $1/m$-net in $M$, $M_0$ is contained in the union of a finite family of open balls of radius $1/m$. Then one of these balls contains the points $x_n^m$ for an infinite set of indices $n$, i.e., contains all elements of some subsequence in $x_1^m, x_2^m, \dots$; we denote the latter by $x_1^{m+1}, x_2^{m+1}, \dots$. Note that all pairwise distances between the elements of the constructed subsequence are less than $2/m$. Now consider the sequence $x_1^1, x_2^2, \dots$. It is a subsequence in $x_1, x_2, \dots$, and, obviously, for $k > l \ge 2$ we have $d(x_k^k, x_l^l) < 2/(l-1)$; thus, it is fundamental.
(ii) $\Longrightarrow$ (iii). If in $M_0$ there is an infinite subset of the kind described in (iii), then a sequence consisting of different points of this subset has no fundamental subsequences.
(iii) $\Longrightarrow$ (i). Let us assume that for some $\varepsilon > 0$ there is no finite $\varepsilon$-net for $M_0$ in $M$. Take an arbitrary $x_1 \in M_0$. Since $\{x_1\}$ is not an $\varepsilon$-net for $M_0$, there exists $x_2 \in M_0$ with $d(x_1, x_2) \ge \varepsilon$. Since $\{x_1, x_2\}$ is not an $\varepsilon$-net for $M_0$, there exists $x_3 \in M_0$ with $d(x_1, x_3), d(x_2, x_3) \ge \varepsilon$. Continuing, we construct an infinite set of points $x_n \in M_0$; $n \in \mathbb{N}$ such that $d(x_m, x_n) \ge \varepsilon$ for $m \ne n$, which contradicts (iii) with $\theta := \varepsilon/2$. ∎
The following proposition justifies the term "totally bounded".
Proposition 5. A totally bounded subset $M_0$ of a metric space $M$ is (automatically) bounded.
Proof. If $y_1, \dots, y_n \in M$ is a 1-net for $M_0$, then pairwise distances between the points of $M_0$ do not exceed $\max\{d(y_k, y_l);\ 1 \le k, l \le n\} + 2$. ∎
The converse statement is, certainly, not true. An infinite discrete metric space is bounded, but it is not totally bounded: every $\varepsilon$-net in this space for $\varepsilon < 1$, obviously, coincides with the entire space.
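The step-by-step choice of points in the proof of (iii) $\Longrightarrow$ (i), as well as the failure of total boundedness for a discrete space, can be imitated on finite samples. This is only an illustrative sketch: the function `greedy_separated`, the integer sample of an interval and the 50-point discrete space are assumptions made for the example.

```python
def greedy_separated(points, eps, dist):
    """Keep a point only if it is at distance >= eps from all kept points.

    The kept points are pairwise eps-separated, and every point of the
    sample lies at distance < eps from some kept point, so the kept
    points form an eps-net for the sample.
    """
    centers = []
    for x in points:
        if all(dist(x, c) >= eps for c in centers):
            centers.append(x)
    return centers

# A "small" sample: the integers 0..1000 with the usual distance.
interval = list(range(1001))
usual = lambda x, y: abs(x - y)
net_interval = greedy_separated(interval, 100, usual)

# A discrete sample: distinct points are at distance exactly 1.
discrete_points = list(range(50))
discrete = lambda x, y: 0 if x == y else 1
net_discrete = greedy_separated(discrete_points, 0.5, discrete)

covered = all(min(usual(x, c) for c in net_interval) < 100 for x in interval)
```

For the interval sample eleven points suffice, whereas in the discrete sample every point has to be kept, however many points the sample contains.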
Proposition 6. A totally bounded metric space is separable.
Proof. It suffices to look at the union of all finite $\frac{1}{n}$-nets; $n = 1, 2, \dots$. ∎
Now we arrive at a characterization of compact metric spaces.
Theorem 1. The following properties of a metric space $M$ are equivalent: (i) $M$ is compact; (ii) $M$ is complete and totally bounded; (iii) every sequence of points in $M$ has a convergent subsequence; (iv) every infinite subset in $M$ has a limit point in $M$.
Proof. (i) $\Longrightarrow$ (iv). Suppose an infinite subset $N$ in $M$ does not have limit points. In particular, this means that $N$ is closed. Then every point $x \in N$ has a neighborhood $U_x$ not containing other points of $N$. But then the family consisting of $M \setminus N$ and all $U_x$; $x \in N$ is an open covering of $M$ without a finite subcovering.
(iv) $\Longrightarrow$ (iii). Let $x_n$; $n = 1, 2, \dots$ be a sequence in $M$; denote by $N$ the set of elements of this sequence. If $N$ is finite, then our sequence contains a subsequence with coinciding elements, which certainly converges. If $N$ is infinite, then it has a limit point, and thus (since we are in a metric space!), a strict limit point, say, $x$. But then there exist indices $n_1 < n_2 < \cdots$ such that $d(x, x_{n_k}) < 2^{-k}$ for all $k$. The rest is clear.
(iii) $\Longrightarrow$ (ii). If $M$ is not complete, then we can find a fundamental sequence of points in $M$ that does not have a limit, and after that we see that every subsequence of this sequence does not have a limit either. If $M$ is not totally bounded, then, by Proposition 4, there exists a sequence of points that does not even have a fundamental (to say nothing about convergent) subsequence.
So, it remains to prove the most laborious implication.
(ii) $\Longrightarrow$ (i). Suppose, on the contrary, that there exists an open covering $\alpha$ of the space $M$ that does not have a finite subcovering. Let us construct by induction a system of embedded closed subsets $V^1 \supseteq V^2 \supseteq \cdots$ in our space with the following properties: a) for every $n \in \mathbb{N}$ the family $\alpha$ regarded as a covering of the set $V^n$ has no finite subcovering of $V^n$; b) the diameter of the set $V^n$ does not exceed $2/n$.
At the first step of induction we, using the total boundedness of $M$, take a finite 1-net for $M$, say, $y_1^1, \dots, y_{m_1}^1$. For $k = 1, \dots, m_1$ denote by $V_k^1$ the closure of the open ball $U(y_k^1, 1)$. If for every $k$ the family $\alpha$, considered as an open covering of $V_k^1$, has a finite subcovering of this set, then, obviously, the union of these subcoverings over all $k$ gives a finite subcovering of the entire $M$. Hence there exists at least one $k$ for which this is not the case; we put $V^1 := V_k^1$.
Suppose we have already constructed the sets $V^1, \dots, V^{n-1}$ with both properties a) and b). Using the total boundedness of $M$, take a finite $1/n$-net, say, $y_1^n, \dots, y_{m_n}^n$. For $k = 1, \dots, m_n$ denote by $V_k^n$ the intersection of $V^{n-1}$ with the closure of the open ball $U(y_k^n, 1/n)$. The same arguments as at the first step of induction (but with $M$ replaced by $V^{n-1}$ and the sets $V_k^1$ by the sets $V_k^n$) show that there exists at least one $k$ such that we cannot choose from $\alpha$ a finite subcovering of the set $V_k^n$. We put $V^n := V_k^n$; obviously, this set also has properties a) and b).
Now recall that $M$ is not only totally bounded, but also complete. Hence, by the principle of nested closed sets, the intersection of all $V^n$ consists of exactly one point, say, $x$. But $x$ lies in some $U \in \alpha$, as does every other point of the space $M$. Since $U$ is open, it contains an open ball $U(x, \varepsilon)$ of some radius $\varepsilon > 0$. Therefore, due to the condition on the diameters (property b)), this ball, and thus $U$ as well, contains all sets $V^n$ with sufficiently large $n$. But then $\alpha$, as a covering of such $V^n$, contains a subcovering consisting of only one $U$. We have come to a contradiction with property a) of the constructed sets. ∎
Taking into account Proposition 3 and the fact that the completeness of a subset in a complete metric space is equivalent to the closedness of this subset, we obtain
Corollary 2. A subset of a complete metric space is compact $\Longleftrightarrow$ it is totally bounded and closed. The same set is relatively compact $\Longleftrightarrow$ it is totally bounded.
Certainly, if a metric space is not complete, it always has closed totally bounded sets that are not compact, or, what is the same here, not relatively compact: it is sufficient to take the set of elements of a non-convergent fundamental sequence.
* * *
We recall fundamental properties of finite-dimensional spaces mentioned in Theorem 2.1.1 and in Corollaries 2.1.2 and 2.1.3. (Their "geometrical meaning" is that on a finite-dimensional space all norms are equivalent, and the "categorical meaning" is that all spaces of the same finite dimension are isomorphic to each other as objects in $\mathrm{Ban}$.) Now we can suggest another approach to the proofs of these results, more precisely, of Proposition 2.1.4, from which all others easily follow.
Proposition 7 (= Proposition 2.1.4). Every normed space $E$ of finite dimension $n$ is topologically isomorphic to $\mathbb{C}^n_1$, and every linear isomorphism between these spaces is a topological isomorphism.
Proof. Let $J$ be an arbitrary linear isomorphism from $\mathbb{C}^n_1$ onto $E$. Since it is certainly bounded (Example 1.3.1), it is sufficient to find $c > 0$ such that $c\|x\| \le \|J(x)\|$ for all $x \in \mathbb{C}^n_1$ (see Proposition 1.4.1). Let $S$ be the unit sphere in $\mathbb{C}^n_1$. Since $J$ is an injective operator, the function $f : S \to \mathbb{C} : x \mapsto \|J(x)\|^{-1}$ is well defined, and it is continuous by the continuity of the norm. But $S$, being a closed bounded set in $\mathbb{C}^n_1$, is compact (Example 1 and Corollary 2). By Theorem 1.2 (the Weierstrass theorem), this implies that $f(x)$ is bounded from above by a constant $C$. It remains to take $c := 1/C$. ∎
* * *
The rest of the section is devoted to the following important class of problems: which subsets are totally bounded and which are compact in a particular normed space? First, we note the following simple result.
Proposition 8. In a normed space, the algebraic sum of two totally bounded sets is totally bounded and every dilation of a totally bounded set is totally bounded. ∎
This fact will be useful later, and now we consider the case of finite-dimensional spaces. Actually, everything here is ready for an answer.
Proposition 9. Let $M$ be a subset of a finite-dimensional normed space $E$. Then (i) $M$ is totally bounded $\Longleftrightarrow$ it is bounded; (ii) $M$ is compact $\Longleftrightarrow$ it is bounded and closed.
Proof. By Proposition 7 and Theorem 1.4.1(iv), $E$ is uniformly homeomorphic to $\mathbb{C}^n_1$, where $n = \dim(E)$. Hence $M$ is uniformly homeomorphic to a subset in $\mathbb{C}^n_1$, and this subset is bounded if and only if $M$ is. Since the latter is totally bounded if and only if it is bounded (see Example 1 and Proposition 5), (i) follows from Corollary 1. Taking into account the completeness of $E$ (Theorem 2.1.1(i)) and Corollary 2, we deduce (ii). ∎
The situation in the infinite-dimensional spaces is not so simple. First, we gather some "empirical" material.
Exercise 1. The unit balls in the spaces $C[a, b]$, $l_p$ and $L_p[a, b]$; $1 \le p \le \infty$ are not totally bounded.
Hint. Take functions (or sequences) of norm 1 with disjoint supports and use Proposition 4(iii).
Now let us find which general fact stands behind these observations. The following proposition is a key for the subsequent theorem.
Proposition 10 (known as the "lemma on the near-perpendicular"). Let $E$ be a normed space, and $E_0$ its closed proper subspace. Then for every $\varepsilon > 0$ there exists a vector $x \in E$ (called an $\varepsilon$-perpendicular) such that $\|x\| = 1$ and $d(x, E_0) > 1 - \varepsilon$.
Proof. Take an arbitrary $y \in E \setminus E_0$. Since $E_0$ is closed, $d(y, E_0) > 0$. Hence, taking into account the obvious equality $d(\lambda y, E_0) = |\lambda|\, d(y, E_0)$ for each $\lambda \in \mathbb{C}$, we can assume without loss of generality that $d(y, E_0) = 1$. Let $z \in E_0$ be such that $d(y, z) < 1 + \delta$, where $\delta > 0$ is so small that $1/(1 + \delta) > 1 - \varepsilon$. Then for $x_0 := y - z$ we have $\|x_0\| < 1 + \delta$. At the same time, since the sets $\{x_0 - u : u \in E_0\}$ and $\{y - u : u \in E_0\}$ coincide, $d(x_0, E_0) = d(y, E_0) = 1$. Therefore, it is easy to see that $x := x_0 / \|x_0\|$ is the required $\varepsilon$-perpendicular. ∎
Certainly, if $E$ is a Hilbert space, it has a vector $x$ such that $\|x\| = 1$ and at the same time $d(x, E_0) = 1$: this is the "true" perpendicular to $E_0$; its existence is stated in the theorem on orthogonal complement. But for general Banach spaces, such a luxury is not guaranteed.
Exercise 2. (i) There are pairs (a Banach space $E$, a closed subspace $E_0$) such that for each $x \in E \setminus E_0$ we have $\|x\| > d(x, E_0)$. (ii) There are coisometric operators taking the closed unit ball of the first space to a proper subset of the closed unit ball of the second space.
Hint. In proving (i) Exercise 2.3.2 is useful, and (ii) follows from (i).
Theorem 2 (Riesz). The following properties of a normed space $E$ are equivalent: (i) the (closed) unit ball in $E$ is compact; (ii) the unit ball in $E$ is totally bounded; (iii) $E$ is finite-dimensional.
Proof. The implication (i) $\Longrightarrow$ (ii) is clear.
(ii) $\Longrightarrow$ (iii). Suppose, on the contrary, that $E$ is infinite-dimensional. Choose an arbitrary $\varepsilon$; $0 < \varepsilon < 1$. Denote by $x_1$ an arbitrary vector in $E$ of norm 1. Since the space $E_1 := \mathrm{span}(x_1)$ is one-dimensional and hence closed by Corollary 2.1.3, the lemma on the near-perpendicular guarantees the existence of a vector $x_2$ of norm 1 such that $d(x_2, E_1) > 1 - \varepsilon$. Further, since $E_2 := \mathrm{span}(x_1, x_2)$ is two-dimensional, the same lemma gives a vector $x_3$ of norm 1 such that $d(x_3, E_2) > 1 - \varepsilon$. Now consider $E_3 := \mathrm{span}(x_1, x_2, x_3)$, and so on. Following these arguments, we obtain a sequence of vectors $x_n$; $n \in \mathbb{N}$ in $B_E$ for which the pairwise distances between elements are not smaller than $1 - \varepsilon$. By Proposition 4(iii), this is impossible.
(iii) $\Longrightarrow$ (i) immediately follows from Proposition 9(ii). ∎
Exercise 3. The properties (i)-(iii) of the space $E$ are equivalent to the local compactness of $E$.
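The mechanism behind Exercise 1 and the Riesz theorem is easy to observe for the coordinate unit vectors of $l_p$, which have norm 1 and disjoint supports. The sketch below works in a finite truncation of dimension `dim`, which is an assumption made purely for computation (each truncation is finite-dimensional, so its ball is totally bounded; the point is that the number of separated vectors grows with `dim`). The names `lp_norm`, `unit_vector` and `pairwise_distances` are ours.

```python
def lp_norm(x, p):
    """The l_p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def unit_vector(k, dim):
    """The k-th coordinate unit vector in dimension dim."""
    return [1.0 if i == k else 0.0 for i in range(dim)]

def pairwise_distances(dim, p):
    """All distances between distinct coordinate unit vectors."""
    vs = [unit_vector(k, dim) for k in range(dim)]
    return [lp_norm([a - b for a, b in zip(vs[i], vs[j])], p)
            for i in range(dim) for j in range(i + 1, dim)]

d1 = pairwise_distances(8, 1)
d2 = pairwise_distances(8, 2)
dinf = pairwise_distances(8, float("inf"))
```

The distances equal $2^{1/p}$ for $p < \infty$ and 1 for $p = \infty$, independently of `dim`, so in the infinite-dimensional space one obtains an infinite set of the kind excluded by Proposition 4(iii).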
For a series of concrete Banach spaces, their totally bounded ( and thus, by Corollary 2, relatively compact ) subsets can be described in sufficiently transparent terms. We present one of the most important facts of this kind, concerning the space C[a, b] . The following definition is useful here.
Definition 3. A subset $M \subset C[a, b]$ is called equicontinuous if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(t') - f(t'')| < \varepsilon$ for all $f \in M$ and $t', t'' \in [a, b]$ with $|t' - t''| < \delta$.
The classical Cantor theorem implies that every finite subset in $C[a, b]$ is equicontinuous. At the same time, it is easy to verify that the set of functions $\sin nt$; $n = 1, 2, \dots$ is not equicontinuous.
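The failure of equicontinuity for the functions $\sin nt$ can be made explicit. The sketch below assumes, for definiteness, that the interval is $[0, \pi]$; the function name `witness` is ours. For a given $\delta$ it produces an index $n$ and two points closer than $\delta$ at which $\sin nt$ differs by (numerically) 1.

```python
import math

def witness(delta):
    """Return (n, t1, t2, gap) with |t1 - t2| < delta and
    gap = |sin(n*t1) - sin(n*t2)| equal to 1 up to rounding.

    Assumes the interval [0, pi]; t2 = pi/(2n) always lies in it.
    """
    n = math.floor(math.pi / (2 * delta)) + 1   # guarantees pi/(2n) < delta
    t1, t2 = 0.0, math.pi / (2 * n)
    gap = abs(math.sin(n * t1) - math.sin(n * t2))
    return n, t1, t2, gap

results = [witness(delta) for delta in (0.5, 0.1, 0.001)]
```

Thus no $\delta$ works for $\varepsilon = 1$ simultaneously for all $n$, although each single function $\sin nt$ is uniformly continuous.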
Theorem 3 (Arzelà). A subset $M$ in $C[a, b]$ is totally bounded $\Longleftrightarrow$ it is bounded and equicontinuous.
Proof. $\Longleftarrow$. Take $\varepsilon > 0$; our goal is to find a finite $\varepsilon$-net for $M$ in $C[a, b]$. By assumption, there is $n \in \mathbb{N}$ such that $|f(t') - f(t'')| < \varepsilon/3$ whenever $|t' - t''| < (b - a)/n$ and $f \in M$. Put $t_k := a + \frac{(b - a)k}{n}$; $k = 0, \dots, n$ and consider the (linear contraction) operator $T : C[a, b] \to \mathbb{C}^{n+1}_\infty$, $T(f) := (f(t_0), \dots, f(t_n))$. By Proposition 9(i), the set $T(M) \subset \mathbb{C}^{n+1}_\infty$ is totally bounded. Hence, taking Proposition 1 into account, there is a finite subset $N$ in $M$ such that $T(N)$ is an $\varepsilon/3$-net in $T(M)$. We show that $N$ is an $\varepsilon$-net for $M$. To this end, take an arbitrary function $f \in M$. From the definition of the set $N$ it follows that there is $g \in N$ such that $|g(t_k) - f(t_k)| < \varepsilon/3$ for all $k$. Now take a point $t \in [a, b]$ and choose $k$ such that $|t - t_k| < (b - a)/n$. Then $|f(t) - g(t)| \le |f(t) - f(t_k)| + |f(t_k) - g(t_k)| + |g(t_k) - g(t)| < 3\varepsilon/3 = \varepsilon$. This means that $N$ is an $\varepsilon$-net for $M$.
$\Longrightarrow$. Taking into account Proposition 5, it is sufficient to verify that $M$ is equicontinuous. Take $\varepsilon > 0$, and let $N$ be a finite $\varepsilon/3$-net for $M$ in $C[a, b]$. As we mentioned above, the set $N$ of functions, being finite, is equicontinuous. Therefore, there exists $\delta > 0$ such that the condition $|t' - t''| < \delta$ implies $|g(t') - g(t'')| < \varepsilon/3$ for all $g \in N$. Take an arbitrary $f \in M$; then for some $g \in N$ we have $\|f - g\|_\infty < \varepsilon/3$, and therefore for $|t' - t''| < \delta$ we have the estimate $|f(t') - f(t'')| \le |f(t') - g(t')| + |g(t') - g(t'')| + |f(t'') - g(t'')| < 3\varepsilon/3 = \varepsilon$. The rest is clear. ∎
Exercise 4. (i) The unit ball in $C^1[a, b]$, regarded as a subset in $C[a, b]$, is totally bounded, but not compact. (ii) The subset in the unit ball in $C[a, b]$ consisting of functions $f$ such that $|f(t') - f(t'')| \le |t' - t''|$ for all $t', t'' \in [a, b]$, is compact.
Next we give a description of totally bounded sets in another important class of Banach spaces.
Exercise 5. Let $M$ be a subset in $l_p$; $1 \le p \le \infty$. Then $M$ is totally bounded $\Longleftrightarrow$ it is bounded, and
$$\lim_{m \to \infty} \sup\Big\{ \sum_{n=m+1}^{\infty} |\xi_n|^p : \xi = (\xi_1, \xi_2, \dots) \in M \Big\} = 0 \quad (\text{for } p < \infty);$$
$$\lim_{m \to \infty} \sup\Big\{ \sup_{n \ge m+1} |\xi_n| : \xi = (\xi_1, \xi_2, \dots) \in M \Big\} = 0 \quad (\text{for } p = \infty)$$
(i.e., the norms of the "tails" $(\xi_{m+1}, \xi_{m+2}, \dots)$ of the sequences $\xi$ from $M$ uniformly tend to zero).
3. Compact operators: General properties and examples
The previously considered notions, and first of all the notion of total boundedness, allow us to distinguish a very important class of operators. One can advance rather far in studying these operators, and in the case where such operators connect two Hilbert spaces, we may even say that one can completely perceive their nature.
Definition 1. An operator $T : E \to F$ between normed spaces is called compact$^2$ if it takes every bounded set from $E$ to a totally bounded set in $F$.
Certainly, the discussed property of an operator is determined by its behavior on only one bounded set.
Proposition 1. An operator $T : E \to F$ is compact $\Longleftrightarrow$ the image of the unit ball in $E$ is totally bounded. ∎
Remark. As we will see later, under certain conditions on $E$ and $F$, which are fulfilled in many cases, the image of the unit ball is not only totally bounded but compact; see Propositions 4.3 and 4.2.17. This partially justifies the term "compact operator".
Later, after becoming familiar with the so-called weak topologies, we will speak about another approach to the notion of compact operator (see Theorem 4.2.4). From Proposition 2.9 we immediately obtain the following result.
Proposition 2. Every finite-dimensional bounded operator is compact.
On the other hand, from Theorem 2.2 (the Riesz theorem) we have
Proposition 3. No projection onto an infinite-dimensional subspace (see Sections 1.5 and 2.3) and, in particular, no identity operator on an infinite-dimensional normed space, is compact. ∎
2 Half a century ago mathematicians used to say "completely continuous operator" instead of "compact operator" .
The set of compact operators from $E$ to $F$ is denoted by $\mathcal{K}(E, F)$, and one usually writes $\mathcal{K}(E)$ instead of $\mathcal{K}(E, E)$.
Proposition 4. $\mathcal{K}(E, F)$ is a closed subspace in $\mathcal{B}(E, F)$.
Proof. Take $S, T \in \mathcal{K}(E, F)$. Then $(S + T)(B_E)$, being a part of $S(B_E) + T(B_E)$, is totally bounded by Proposition 2.8. Hence, $S + T \in \mathcal{K}(E, F)$. It is even easier to verify that $\lambda S \in \mathcal{K}(E, F)$ for every $\lambda \in \mathbb{C}$. It remains to show that every adherent point of the set $\mathcal{K}(E, F)$, say $S$, belongs to $\mathcal{K}(E, F)$. Take $\varepsilon > 0$ and $T \in \mathcal{K}(E, F)$ such that $\|S - T\| < \varepsilon/2$. Then it is easy to see that every $\varepsilon/2$-net for $T(B_E)$ is an $\varepsilon$-net for $S(B_E)$. The rest is clear. ∎
Corollary 1. If $F$ is a Banach space, then $\mathcal{K}(E, F)$ is also a Banach space for every normed space $E$.
Propositions 2 and 4 together give a useful way of verifying compactness for a wide class of operators.
Corollary 2. If an operator between two normed spaces can be approximated in the operator norm by finite-dimensional operators, then it is compact.
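Corollary 2 can be illustrated with the diagonal operators of Example 1.3.2, for which the operator norm is the supremum of the moduli of the weights. The sketch below is only a finite model under stated assumptions: the weight sequences are truncated to 1000 terms, and the name `tail_sup` is ours. It computes the distance from $T_\lambda$ to its truncation acting on the first $n$ coordinates.

```python
def tail_sup(lam, n):
    """sup of |lam_k| over k > n for a finite list lam = [lam_1, lam_2, ...].

    For a diagonal operator this is the operator norm of the difference
    between T_lam and its truncation to the first n coordinates.
    """
    return max((abs(t) for t in lam[n:]), default=0.0)

# Finite model of the weights lam_k = 1/k, which tend to zero.
decaying = [1.0 / k for k in range(1, 1001)]
errors_decaying = [tail_sup(decaying, n) for n in (1, 10, 100)]

# Finite model of the constant weights lam_k = 1 (the identity operator).
constant = [1.0] * 1000
errors_constant = [tail_sup(constant, n) for n in (1, 10, 100)]
```

For the decaying weights the error after $n$ coordinates is $1/(n+1)$, so the finite-dimensional truncations approximate the operator in norm; for the constant weights the error stays equal to 1, in agreement with the fact that the identity operator on an infinite-dimensional space is not compact.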
The following proposition, in addition to its independent importance, is very useful for the verification of compactness of concrete operators.
Proposition 5. Suppose $E$, $F$, and $G$ are normed spaces, $S \in \mathcal{B}(E, F)$, and $T \in \mathcal{B}(F, G)$. If at least one of these operators is compact, then $TS : E \to G$ is also compact.
Proof. If $T$ is compact, then everything is clear. If $S$ is compact, then $TS(B_E)$ is totally bounded by Proposition 2.2 and Theorem 1.4.1(iv). ∎
This proposition implies, in particular, that a topological isomorphism between infinite-dimensional normed spaces is never compact (otherwise the identity operators on these spaces would be compact as well). To an advanced reader, we recommend the following
Exercise 1 (Schauder). An operator $T : E \to F$ between two Banach spaces is compact $\Longleftrightarrow$ the Banach adjoint operator $T^* : F^* \to E^*$ (see Definition 2.5.2) is compact.
Hint. $\Longrightarrow$. Let $y_1, \dots, y_n$ be an $\varepsilon/4$-net in $T(B_E)$. Consider an operator $S : F^* \to \mathbb{C}^n_\infty$, $S(f) = (f(y_1), \dots, f(y_n))$ and take a finite subset $N \subset B_{F^*}$ such that $S(N)$ is an $\varepsilon/4$-net in $S(B_{F^*})$. Then $T^*(N)$ is an $\varepsilon$-net in $T^*(B_{F^*})$. $\Longleftarrow$ follows from $\Longrightarrow$ and from the uniqueness of the canonical embedding of the space into its second dual space (Proposition 2.5.2).
Now it is time to recall various concrete examples of operators we have gathered in Section 1 .3. The new typical question is whether these operators are compact.
The diagonal operator T>.. : lp --+ lp where A E l 00 (see Example 1.3.2) is compact <====> A E co . Proof. � - Let A ( n ) : == (A I , . . . , A n , 0, 0, . . . ) . Then the operator T>.. ( n ) is finite-dimensional, and l i T>.. - T>.. ( n ) II == II A - A (n ) lloo --+ 0 as n --+ oo (Example Proposition 6.
,
1.3.2) . Now Corollary 2 works. ====> . If A t/:. co , then there are n 1 < n 2 < · · · such that I An k I > () for some () > 0 and for all k E N. But for all k , l E N; k =/=- l and p < oo we have II T( pnk ) - T( pn z ) ll == II A n k Pnk - Anl Pn 1 ll = I Ank i P + I .Anl i P > <120.
y/
Taking into account Proposition 2.4(iii) , for p < oo our operator takes the bounded set { pn k ; k E N} to a bounded but not totally bounded set. It is easy to see that the same is true for p == oo. II Remark. In fact, diagonal operators on l 2 are much more than just exam
ples of compact operators. In the following section we will see that many compact operators acting on the infinite-dimensional separable Hilbert space (in particular, those having trivial kernel and dense image) coincide with one of the diagonal operators on l 2 up to the weak unitary equivalence. As for the shift operators acting on lp , lp (Z) , or Lp (IR) (see Examples 1.3.3 and 1 .3.7) , they are never compact (explain why) . Exercise 2. The multiplication operator Tt : Lp [a, b] --+ Lp [a, b] ; 1 < p < oo, where f E L 00 [a, b] (cf. Example 1 .3.4) is not compact, unless f == 0 (as an element of L 00 [a, b] ) . Hint. If f =/=- 0 on a set of positive measure, then for some () > 0 the set M : == {t E [a, b] : l f( t ) l > 0} also has positive measure. Therefore the operator T9 Tf, where g ( t) is 1/ f(t) for t E M and zero otherwise, is a projection onto an infinite-dimensional subspace in Lp [a, b] . Although the following exercise is a special case of the subsequent the orem, it would be nevertheless useful for the reader to look at it first. Exercise 3. The operator of indefinite integration (see Example 1.4.5) , acting on one of the spaces E : == L 2 [0, 1] , C[O, 1] , or L1 [0, 1] , is compact. Hint. The case E == C[O, 1] follows from Exercise 2.4(ii) . If E == £ 2 [0, 1] , then from the Cauchy-Bunyakovskii inequality it follows that the set of functions T (BE) is equicontinuous; so we can use the Arzela theorem and the estimate II · II 2 < II · II 00 • Finally, if E == L 1 [0, 1] , then the set T (BE) (although not equicontinuous) consists of functions with total variation < 1.
To construct a finite $\varepsilon$-net, we divide the interval $[0,1]$ into many equal parts and find this net among the functions that are constant on the intervals of the partition.
Let us now recall a general class of integral operators on $L_2[a,b]$ that contains the operator of indefinite integration (see Example 1.3.6 and subsequent comments).
Theorem 1. Every integral operator on $L_2[a,b]$ is compact.
Proof. We need the following lemma.
Lemma. Suppose functions $e_n(t)$; $n \in \mathbb{N}$ form an orthonormal basis in $L_2[a,b]$. Then the functions $u_{m,n}(s,t) := e_m(s)e_n(t)$; $m, n \in \mathbb{N}$ (arbitrarily numbered) form an orthonormal basis in $L_2(\square)$ (on the latter space, see also Example 1.3.6).
Proof. Elementary computations using the Fubini theorem show that $u_{m,n}$ is an orthonormal system. To verify its totality, consider a function $y \in L_2(\square)$ that is orthogonal to all $u_{m,n}$, or, in other words, satisfies the condition $\int_\square u_{m,n}(s,t)y(s,t)\,ds\,dt = 0$ for all $m, n \in \mathbb{N}$. By Proposition 2.3.5, it is sufficient to establish that $y$ vanishes as a vector in $L_2(\square)$, i.e., $y(s,t) = 0$ almost everywhere on $\square$.
Again by the Fubini theorem, we have $\int_a^b e_m(s)x_n(s)\,ds = 0$, where $x_n(s) := \int_a^b e_n(t)y(s,t)\,dt$. This means that for every $n$ the function $x_n$ is orthogonal to all $e_m$; $m \in \mathbb{N}$ in $L_2[a,b]$. By the totality of $e_m$, we have $x_n = 0$ in $L_2[a,b]$, i.e., for a set $M_n$ of full measure on $[a,b]$ we have $\int_a^b e_n(t)y(s,t)\,dt = 0$ for all $s \in M_n$. But then for $s \in M := \bigcap\{M_n; n \in \mathbb{N}\}$, by the totality of $e_n$, we have $y(s,t) = 0$ for almost all $t \in [a,b]$. Since $M$ is also a set of full measure, this means that $y = 0$ almost everywhere on $\square$. ∎
End of the proof of Theorem 1. Let $T$ be an integral operator, and $K$ its kernel. For each $n = 1, 2, \dots$ consider the integral operator $T_n$ with the kernel $K_n(s,t) := \sum_{k,l=1}^n \lambda_{k,l}e_k(s)e_l(t)$, where $\lambda_{k,l} := \langle K, u_{k,l}\rangle$ are the corresponding Fourier coefficients. Obviously, every function in the image of this operator is a linear combination of the functions $e_k$; $k = 1, \dots, n$. Hence $T_n$ is a finite-dimensional operator. Further, since $T - T_n$ is an integral operator with the kernel $K - K_n$, the operator norm of $T - T_n$ is not greater than the norm of $K - K_n$ in $L_2(\square)$ (see Example 1.3.6). But from our lemma, it evidently follows that $K$ is the limit of the sequence $K_n$ in $L_2(\square)$. Thus, $T$ is approximated by finite-dimensional operators, and therefore (Corollary 2) is compact. ∎
Here is one of the numerous corollaries of this theorem.
Exercise 4. An integral operator on $L_2[a,b]$ with the kernel $K(s,t)$ is a projection $\iff$ $K(s,t)$ is degenerate and satisfies the equalities indicated in Exercise 1.5.5.
Hint. The kernel of a one-dimensional integral operator has the form $f(s)g(t)$ for some $f, g \in L_2[a,b]$.
Exercise 5. The differentiation operators (see Example 1.3.8) are never compact.
Now please do the following exercise of a general nature.
Exercise 6. Every two weakly topologically equivalent operators are simultaneously either compact or not.
* * *
We have seen how useful and simple Corollary 2 is. But the question of whether the converse is true is not simple at all. The analysis of this problem led to the following fundamental notion in the geometric theory of Banach spaces.
Definition 2 (Grothendieck, 1955). A Banach space $F$ has the approximation property if for every Banach space $E$ every compact operator from $E$ to $F$ can be approximated in the operator norm by finite-dimensional operators.
Here is an important example.
Proposition 7. Every Hilbert space has the approximation property.
Proof. Let $T : E \to H$ be a compact operator between a Banach space and a Hilbert space. Take $\varepsilon > 0$ and consider an $\varepsilon/2$-net $y_1, \dots, y_n$ in $H$ for $T(B_E)$. Put $H_0 := \operatorname{span}\{y_1, \dots, y_n\}$ and consider the orthoprojection $P : H \to H$ onto $H_0$. Certainly, it is a finite-dimensional operator, and for every $k = 1, \dots, n$ we have $Py_k = y_k$. Hence, taking into account our choice of the $\varepsilon/2$-net, we see that for each $x \in B_E$ there is $k$ such that
$$\|Tx - PTx\| \le \|Tx - y_k\| + \|P(y_k - Tx)\| \le 2\|Tx - y_k\| < \varepsilon.$$
The rest is clear. ∎
Gradually it became clear that the overwhelming majority of "classical" Banach spaces ($L_p(X,\mu)$, $C(\Omega)$, ...) have the approximation property. For many separable spaces it was a corollary of the following important result (we recommend that advanced readers prove it; see Exercise 7).
Theorem 2. If a Banach space has a Schauder basis, then it has the approximation property.
For some time it looked like the question asked by Grothendieck, namely whether every Banach space has the approximation property (the famous approximation problem) would have a positive answer in the near future, at least in the class of separable spaces. But the optimists were wrong.
Theorem 3 (P. Enflo, 1972, [54]). There exist separable Banach spaces, in particular, closed subspaces in $c_0$, that do not have the approximation property. As a corollary (see Theorem 2), there are separable Banach spaces without Schauder bases.
The question of whether every separable Banach space has a Schauder basis was posed long ago by Banach. For a long time it was, perhaps, the best known problem in the theory of Banach spaces. Thus, Enflo, by proving this theorem, also gave the negative answer to this old problem.
All separable Banach spaces without the approximation property known so far have been skillfully constructed examples. But outside the class of separable spaces, such a "black sheep" has been found among well-known objects. It turned out that $\mathcal{B}(H)$, where $H$ is an infinite-dimensional Hilbert space, does not have the approximation property (A. Szankowski, 1981). By the way, verify that $\mathcal{B}(H)$ is indeed non-separable.
Exercise 7*. Prove Theorem 2.
Hint. Let $e_1, e_2, \dots$ be a Schauder basis in a Banach space $(F, \|\cdot\|)$. For $n = 1, 2, \dots$ consider the projections $P_n : F \to F : \sum_{k=1}^\infty \lambda_k e_k \mapsto \sum_{k=1}^n \lambda_k e_k$ and the new norm $\|\cdot\|' : y \mapsto \sup\{\|P_n(y)\|; n \in \mathbb{N}\}$ in $F$. If $x^m$; $m = 1, 2, \dots$ is a fundamental sequence in $(F, \|\cdot\|')$, then for every $n$ the sequence $P_n x^m$; $m = 1, 2, \dots$ converges with respect to every norm in $F_n := \operatorname{span}\{e_1, \dots, e_n\}$ to some $x_n$, and for each $l = 1, 2, \dots$ we have $P_n x_{n+l} = x_n$. From the properties of a Schauder basis it follows that the sequence $x_n$ is fundamental in the initial space $(F, \|\cdot\|)$, and thus tends to some $x$ with respect to the norm $\|\cdot\|$. At the same time, $P_n x = x_n$; $n = 1, 2, \dots$. Taking into account that $P_n$ are contraction operators on $(F, \|\cdot\|')$, we can see that $x$ is the limit of $x^m$ in $(F, \|\cdot\|')$. Therefore, $(F, \|\cdot\|')$ is a Banach space, and by Proposition 2.4.2, all $P_n$ are bounded in $(F, \|\cdot\|)$, and their norms do not exceed some constant $C$.
Now suppose $T : E \to F$ is a compact operator, $\varepsilon > 0$, and $y_1, \dots, y_m$ is a finite $\frac{\varepsilon}{2(1+C)}$-net for $T(B_E)$ in $F$. If an integer $N$ is such that for all $n > N$ and $k = 1, \dots, m$ we have $\|y_k - P_n y_k\| < \frac{\varepsilon}{2(1+C)}$, then for the same $n$ and for each $x \in B_E$ we have $\|Tx - P_nTx\| < \varepsilon$. Thus $T$ can be approximated by the finite-dimensional bounded operators $P_nT$.
* * *
Advanced readers should also know that the approximation property is a very deep notion, which can appear in various forms that do not resemble each other. The following theorem is by no means simple, and we give it without proof. (The argument needed for its proof can be found, e.g., in the extensive book [48, Chap. I.5].)
Theorem 4 (Grothendieck). The following properties of a Banach space $F$ are equivalent:
(i) $F$ has the approximation property;
(ii) for every compact subset $K \subset F$ and for every $\varepsilon > 0$ there exists a finite-dimensional bounded operator $S : F \to F$ such that $\|y - Sy\| < \varepsilon$ for all $y \in K$;
(iii) the Grothendieck operator $\mathrm{Gr} : F^* \widehat{\otimes} F \to \mathcal{B}(F)$ (see Section 2.7) is injective (and thus is an isometric isomorphism between $F^* \widehat{\otimes} F$ and the space of nuclear operators $\mathcal{N}(F)$);
(iv) for every $u \in F^* \widehat{\otimes} F$ the condition $\mathrm{Gr}(u) = 0$ implies $\mathrm{tr}(u) = 0$ (i.e., the trace of a nuclear operator acting on $F$ is well defined; cf. Section 2.7).
Furthermore, if at least one of the spaces $E^*$ or $F$ has the approximation property, then the operator $\mathrm{Gr} : E^* \widehat{\otimes} F \to \mathcal{B}(E, F)$ is injective.
4. Compact operators between Hilbert spaces
Let $H$ and $K$ be Hilbert spaces. Suppose $e$ is a vector in $H$, and $f$ is the functional on $H$ defined by $e$ according to the rule described in Theorem 2.3.2 (Riesz). Let $y$ be a vector in $K$. For the corresponding one-dimensional operator we use, together with the notation $f \circ y$, also the notation $e \circ y$. Then the multiplication rule for one-dimensional operators indicated in Proposition 1.5.5 will look as follows:
$$(e_1 \circ y_1)(e_2 \circ y_2) = \langle y_2, e_1\rangle\,(e_2 \circ y_1).$$
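The multiplication rule can be tested numerically in finite-dimensional spaces. The following is a minimal Python sketch added for illustration, not part of the original text. It assumes that the inner product is linear in the first argument, so that the one-dimensional operator built from $e$ and $y$ acts as $x \mapsto \langle x, e\rangle y$; the helper names `inner`, `rank_one`, and `compose` and the sample vectors are hypothetical.

```python
def inner(x, y):
    """Inner product <x, y>: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def rank_one(e, y):
    """One-dimensional operator x -> <x, e> y, returned as a function."""
    return lambda x: [inner(x, e) * c for c in y]

def compose(A, B):
    """Product AB of two operators given as functions."""
    return lambda x: A(B(x))

# Sample vectors (arbitrary choices): e2 lies in C^2, y2 and e1 in C^3, y1 in C^2.
e1, y1 = [1 + 2j, 0, -1j], [2, 1j]
e2, y2 = [0.5, 1j], [1, 1 - 1j, 3]
x = [1, 2 - 1j]

# Left-hand side: the product of the operators built from (e1, y1) and (e2, y2).
lhs = compose(rank_one(e1, y1), rank_one(e2, y2))(x)

# Right-hand side: <y2, e1> times the operator built from (e2, y1).
scalar = inner(y2, e1)
rhs = [scalar * c for c in rank_one(e2, y1)(x)]

# Largest coordinate-wise discrepancy; it should be at rounding level.
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)
```

Representing operators as functions rather than matrices keeps the sketch close to the formula: the product of two one-dimensional operators is again one-dimensional, with the scalar factor $\langle y_2, e_1\rangle$.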
The structure of the compact operators between Hilbert spaces is completely described by the following theorem.
Theorem 1 (Schmidt³). Let $H$ and $K$ be Hilbert spaces and $T : H \to K$ a compact operator. Then there exist
(a) an orthonormal system $e'_1, e'_2, \dots$ in $H$ of finite or countable cardinality;
(b) an orthonormal system $e''_1, e''_2, \dots$ in $K$ of the same cardinality, and
(c) a (finite or infinite) sequence $s_1 \ge s_2 \ge \cdots$ of positive numbers with the index set of the same cardinality, tending to zero if it is infinite,
such that $T$ can be represented in the form
$$T = \sum_n s_n\, e'_n \circ e''_n,$$
where, depending on the case, $\sum_n$ is either a finite sum, or the sum of a convergent series in $\mathcal{B}(H, K)$. Thus, the operator acts by the formula
$$Tx = \sum_n s_n \langle x, e'_n\rangle e''_n,$$
where $\sum_n$ is either a finite sum, or the sum of a series in $K$. (In other words, $T(e'_n) = s_n e''_n$ for all $n$, and $T$ takes every vector orthogonal to all $e'_n$ to zero.) In addition, $\|T\| = s_1$.
³ Erhard Schmidt (1879-1969), prominent German mathematician, a student of Hilbert. It was his (and Frechet's) work where the language inherited from Euclid's geometry began to be widely used in the study of Hilbert spaces of sequences and functions.
Before starting the proof, let us emphasize that in the case where the sum $T = \sum_n s_n\, e'_n \circ e''_n$ has infinitely many summands, we do not claim that this series absolutely converges in the operator norm. The latter condition distinguishes a special class of operators, and we will discuss this later.
Proof. Certainly, without loss of generality we can assume that $T \ne 0$.
Lemma 1. There exists a vector $e \in H$; $\|e\| = 1$ such that $\|Te\| = \|T\|$ ("a vector where the operator norm is reached").
Proof. By the definition of operator norm, there is a sequence $x_n \in B_H$ such that $\|Tx_n\| \to \|T\|$ as $n \to \infty$. Since the set $T(B_H)$ is totally bounded and $K$ is complete, the sequence $Tx_n$ has a convergent subsequence (see Proposition 2.4). Changing the notation if needed, we can assume that $Tx_n$ tends to some $y \in K$. Obviously, $\|y\| = \|T\|$. By the parallelogram identity, for every natural $m$ and $n$ we have
$$\|x_m - x_n\|^2 = 2\|x_m\|^2 + 2\|x_n\|^2 - \|x_m + x_n\|^2 \le 4 - \|x_m + x_n\|^2.$$
Further, $\|T(x_m + x_n)\| \le \|T\|\,\|x_m + x_n\|$. For $m, n \to \infty$ the left-hand side of this inequality tends to $\|2y\| = 2\|T\|$. Since $\|T\| \ne 0$, it follows that $\|x_m + x_n\|^2$ tends to 4, and thus $\|x_m - x_n\|^2$ tends to zero. Therefore, the sequence $x_n$ is fundamental in $H$, and hence tends there to a vector $e$. The rest is clear. ∎
Lemma 2. Let $e \in H$; $\|e\| = 1$ be such that $\|Te\| = \|T\|$. Then for every $x \in H$ the condition $x \perp e$ implies $Tx \perp Te$. (In other words, $T$ takes $\{e\}^\perp$ to $\{Te\}^\perp$.)
Before starting formal arguments, we note that, according to our geometric experience in $\mathbb{R}^3$, this is automatically true. Let, for example, $\|T\| = 1$. If $x \perp e$ and $Tx$ has an acute angle with $Te$, then for small $t$ the length of the vector $T(e + tx) = Te + tTx$ must be greater than the length of $e + tx$ (Figure 1). But this is impossible since $T$ does not increase the norm of a vector.
[Figure 1: the vectors $e$, $tx$, and $Te$.]
Proof. Assume that the opposite is true: for some $x$; $x \perp e$ we have $\langle Tx, Te\rangle \ne 0$. Replacing, if necessary, $x$ by $\lambda x$ for some $\lambda \in \mathbb{C}$, we can assume that $\langle Tx, Te\rangle > 0$ ("$Tx$ and $Te$ form an acute angle"). Then for every $t > 0$ we have
$$\|T\|^2\|e + tx\|^2 \ge \|T(e + tx)\|^2 = \langle Te + tTx, Te + tTx\rangle = \|Te\|^2 + 2t\langle Tx, Te\rangle + t^2\|Tx\|^2.$$
Since $\|e + tx\|^2 = 1 + t^2\|x\|^2$ (the Pythagorean theorem) and $\|Te\| = \|T\|$, we have $2t\langle Tx, Te\rangle \le t^2(\|T\|^2\|x\|^2 - \|Tx\|^2)$. For sufficiently small $t > 0$ this is impossible, hence we obtain a contradiction. ∎
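The final step of this argument can be spelled out as follows. This is a worked version added for illustration, under the same assumptions as in the proof ($x \perp e$, $\|e\| = 1$, $\|Te\| = \|T\|$, and $\langle Tx, Te\rangle > 0$): we divide the last inequality by $t > 0$ and let $t$ tend to zero.

```latex
% Dividing 2t<Tx,Te> <= t^2 (||T||^2 ||x||^2 - ||Tx||^2) by t > 0:
\[
  2\langle Tx, Te\rangle \;\le\; t\,\bigl(\|T\|^2\|x\|^2 - \|Tx\|^2\bigr)
  \qquad (t > 0).
\]
% The left-hand side is a fixed positive number that does not depend on t,
% whereas the right-hand side tends to 0 as t -> 0+. Passing to the limit:
\[
  0 \;<\; 2\langle Tx, Te\rangle \;\le\;
  \lim_{t \to 0+} t\,\bigl(\|T\|^2\|x\|^2 - \|Tx\|^2\bigr) \;=\; 0 ,
\]
% which is the required contradiction.
```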
End of the proof of Theorem 1. Lemma 1 gives at least one vector $e'_1 \in H$; $\|e'_1\| = 1$ such that $\|Te'_1\| = \|T\|$. We take such a vector and put $e''_1 := \frac{1}{\|T\|}Te'_1$, $s_1 := \|T\|$. Then we put $H_1 := \{e'_1\}^\perp$ and $T_1 := T|_{H_1} : H_1 \to K$. If $T_1 = 0$, then we stop here. Otherwise, since $T_1$ is compact together with $T$, Lemma 1 gives at least one $e'_2 \in H_1$; $\|e'_2\| = 1$ such that $\|T_1e'_2\| = \|Te'_2\| = \|T_1\|$. Choose such a vector and put $e''_2 := \frac{1}{\|T_1\|}Te'_2$, $s_2 := \|T_1\|$. Note that $\{e'_1, e'_2\}$ is an orthonormal system in $H$, and by Lemma 2, the same is true for the system $\{e''_1, e''_2\}$ in $K$. In addition, we have $s_1 \ge s_2$.
Now we put $H_2 := \{e'_1, e'_2\}^\perp$ and $T_2 := T|_{H_2} : H_2 \to K$. If $T_2 = 0$, we stop here; otherwise, using the compactness of $T_2$, we take $e'_3 \in H_2$; $\|e'_3\| = 1$ such that $\|T_2e'_3\| = \|Te'_3\| = \|T_2\|$, and set $e''_3 := \frac{1}{\|T_2\|}Te'_3$, $s_3 := \|T_2\|$. Now we see that $\{e'_1, e'_2, e'_3\}$ and hence (by Lemma 2) $\{e''_1, e''_2, e''_3\}$ are orthonormal systems, and $s_1 \ge s_2 \ge s_3$. After that we go over to the space $H_3 := \{e'_1, e'_2, e'_3\}^\perp$, the operator $T_3 := T|_{H_3}$, etc.
Continuing this process, we obviously may face two possibilities:
1. For some $n$, after constructing orthonormal systems $\{e'_1, e'_2, \dots, e'_n\}$ in $H$ and $\{e''_1, e''_2, \dots, e''_n\}$ in $K$, and positive numbers $s_1 \ge s_2 \ge \cdots \ge s_n$, we see that $T$ vanishes on $\{e'_1, e'_2, \dots, e'_n\}^\perp$. Then, obviously, our operator can be represented as a finite sum $T = \sum_{k=1}^n s_k\, e'_k \circ e''_k$.
2. All $T_n$; $n \in \mathbb{N}$ are non-zero operators. In this case our process leads to countable orthonormal systems $\{e'_1, e'_2, \dots\}$ in $H$ and $\{e''_1, e''_2, \dots\}$ in $K$, and to a sequence of positive numbers $s_1 \ge s_2 \ge \cdots$. Put $\theta := \lim_{n\to\infty} s_n$. Then by the Pythagorean theorem, for every $m, n \in \mathbb{N}$; $m \ne n$ the number $d(Te'_m, Te'_n) = \|s_me''_m - s_ne''_n\|$ equals $\sqrt{s_m^2 + s_n^2} \ge \sqrt{2}\,\theta$. Hence, Proposition 2.4(iii) applied to $T(B_H)$ guarantees that $\theta = 0$. For every $n$ we put $S_n := \sum_{k=1}^n s_k\, e'_k \circ e''_k$. Then the operator $T - S_n$ vanishes on $\operatorname{span}\{e'_1, \dots, e'_n\}$ and is equal to $T_n$ on $(\operatorname{span}\{e'_1, \dots, e'_n\})^\perp$. Hence, $\|T - S_n\| = \|T_n\| = s_{n+1}$. Since $s_n \to 0$ as $n \to \infty$, the operator $T$ can be expanded in the series $\sum_{k=1}^\infty s_k\, e'_k \circ e''_k$. The rest is clear. ∎
Definition 1. The sum $\sum_n s_n\, e'_n \circ e''_n$ in the formulation of the Schmidt theorem is called the Schmidt series of a (compact) operator $T$, and the numbers $s_n$, the s-numbers of this operator.
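For an operator on the real plane, the construction used in the proof of the Schmidt theorem can be carried out explicitly. The sketch below is an illustration under stated assumptions, not the author's method: it treats a real $2 \times 2$ matrix, finds a unit vector on which the norm is reached by elementary calculus (the angle formula is an added ingredient), takes the orthogonal unit vector as the second one, and then reassembles the operator from its Schmidt series. The names `schmidt_2x2` and `apply_series` are hypothetical.

```python
import math

def schmidt_2x2(a, b, c, d):
    """Schmidt series of the operator on R^2 with matrix [[a, b], [c, d]].

    Returns a list of triples (s, e1, e2): an s-number, a unit vector e1 in
    the domain, and the unit vector e2 = T(e1)/s in the target space.
    """
    def T(v):
        return (a * v[0] + b * v[1], c * v[0] + d * v[1])

    # ||T(cos t, sin t)||^2 = p cos^2 t + 2 q cos t sin t + r sin^2 t,
    # which is largest when 2t = atan2(2q, p - r).
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    t = 0.5 * math.atan2(2 * q, p - r)
    first = (math.cos(t), math.sin(t))     # a vector where the norm is reached
    second = (-math.sin(t), math.cos(t))   # unit vector orthogonal to `first`

    terms = []
    for e in (first, second):
        Te = T(e)
        s = math.hypot(Te[0], Te[1])
        if s > 1e-12:                      # the process stops once T vanishes
            terms.append((s, e, (Te[0] / s, Te[1] / s)))
    return terms

def apply_series(terms, x):
    """Evaluate the sum of s * <x, e1> * e2 over all terms of the series."""
    out = [0.0, 0.0]
    for s, e1, e2 in terms:
        coef = s * (x[0] * e1[0] + x[1] * e1[1])
        out[0] += coef * e2[0]
        out[1] += coef * e2[1]
    return tuple(out)

terms = schmidt_2x2(3.0, 0.0, 4.0, 5.0)
s_values = [s for s, _, _ in terms]
image = apply_series(terms, (1.0, 2.0))
print(s_values, image)
```

In two dimensions the orthogonal complement of the first vector is a line, so the second step of the process involves no choice up to sign; in higher dimensions each step would require a new maximization.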
Certainly, a compact operator is finite-dimensional $\iff$ its Schmidt series contains only a finite number of terms.
In the proof of the Schmidt theorem we have seen that, generally speaking, in the construction of the system $e'_1, e'_2, \dots$ (and thus, of the Schmidt series), there is some arbitrariness: for instance, there can be many unit vectors $x$ satisfying the condition $\|Tx\| = \|T\|$, and we can choose any one of them as $e'_1$. However, as we will now show, this arbitrariness is not that big, and, in particular, the numbers $s_n$ do not depend on the choice of the system $e'_1, e'_2, \dots$.
Suppose that, for some orthonormal systems $e'_1, e'_2, \dots$ and $e''_1, e''_2, \dots$, the statement of the Schmidt theorem is true. Since some neighboring numbers $s_n$ can coincide, for some natural $n_1 < n_2 < \cdots$ we have
$$s_1 = \cdots = s_{n_1} > s_{n_1+1} = \cdots = s_{n_2} > s_{n_2+1} = \cdots > s_{n_{k-1}+1} = \cdots = s_{n_k} > \cdots.$$
For $k = 1, 2, \dots$, consider the spaces $H^k := \operatorname{span}\{e'_{n_{k-1}+1}, \dots, e'_{n_k}\}$ (we put $n_0 := 0$ here).
Proposition 1. The spaces $H^k$ do not depend on the choice of the systems $e'_1, e'_2, \dots$ and $e''_1, e''_2, \dots$ (hence, they depend only on the operator $T$). Further, for every $k$ the numbers $s_{n_k+1}, \dots, s_{n_{k+1}}$ coincide with the norm of the operator $T^{(k)} := T|_{H^{(k)}}$, where $H^{(k)} := (\operatorname{span}\{H^1, \dots, H^k\})^\perp$. As a corollary, the numbers $s_n$ are uniquely defined by the operator $T$.
Proof. If $T = 0$, then $s_k = 0$ for all $k$, and there is nothing to prove. In the case where $T \ne 0$ we use induction on $k$. Here is the start:
Lemma. The space $H^1$ consists of all $x \in H$ for which $\|Tx\| = \|T\|\,\|x\|$, and the numbers $s_1, s_2, \dots, s_{n_1}$ coincide with $\|T\|$.
Proof. Knowing how our operator acts, we see that for each $x \in H$ the following equality is true: $\|Tx\|^2 = \sum_n s_n^2|\langle x, e'_n\rangle|^2$. Hence, $x \in H^1$ implies $\|Tx\| = s_1\|x\|$. Further, taking into account the Bessel inequality, $\|Tx\|^2 \le \sum_n s_1^2|\langle x, e'_n\rangle|^2 \le s_1^2\|x\|^2$, and the previous sentence, we have $s_1 = \cdots = s_{n_1} = \|T\|$. Finally, if $\|Tx\| = s_1\|x\|$, then for every $n$ we have $s_n^2|\langle x, e'_n\rangle|^2 = s_1^2|\langle x, e'_n\rangle|^2$. Therefore $\langle x, e'_n\rangle = 0$ as $n > n_1$, so that $\|Tx\|^2 = \sum_{n=1}^{n_1} s_1^2|\langle x, e'_n\rangle|^2$. Hence $\|Tx\| = \|T\|\,\|x\|$ implies $\|x\|^2 = \sum_{n=1}^{n_1}|\langle x, e'_n\rangle|^2$, and thus $x \in H^1$ (see Proposition 1.2.9). The rest is clear. ∎
End of the proof of Proposition 1. Suppose we proved the proposition for $1, \dots, k$. Then, together with the spaces $H^1, \dots, H^k$, the space $H^{(k)}$, and as a corollary, the operator $T^{(k)}$ also depend only on the operator $T$. In addition, as you can easily see, the operator $T^{(k)} : H^{(k)} \to K$ acts by the formula
$$T^{(k)}x = \sum_{n > n_k} s_n\langle x, e'_n\rangle e''_n.$$
Hence, if we consider our lemma with $H^{(k)}$ as $H$ and $T^{(k)}$ as $T$, then the space $H^{k+1}$ will play the role of $H^1$. Consequently, applying the lemma to this situation, we obtain that $H^{k+1} = \{x \in H^{(k)} : \|T^{(k)}x\| = \|T^{(k)}\|\,\|x\|\}$. This means that $H^{k+1}$, as well as the operator $T^{(k)}$ (see above), depend only on the operator $T$. The same lemma gives the equalities $s_{n_k+1} = \cdots = s_{n_{k+1}} = \|T^{(k)}\|$. ∎
There are other beautiful characterizations of s-numbers, and we now present one of them without proof (see, e.g., [55]).
Proposition 2 (Allahverdiev). Under the assumptions of the Schmidt theorem we have the following result: For every $n = 1, 2, \dots$ the set of numbers $\|T - \Phi\|$, where $\Phi$ runs through all possible bounded operators with the image of dimension at most $n - 1$, has the smallest number, and this smallest number coincides with $s_n$.
Another characterization of s-numbers will be obtained later, after the introduction of Hilbert adjoint operators (see Proposition 6.2.13).
The following proposition partly justifies the term "compact operator".
Proposition 3. Suppose $T : H \to K$ is a compact operator between Hilbert spaces. Then $T(B_H)$ is compact.
Proof. Since $T(B_H)$ is totally bounded in $K$, and $K$ is complete, we only need to verify that $T(B_H)$ is closed in $K$. Suppose $\sum_n s_n\, e'_n \circ e''_n$ is the Schmidt series of our operator. First we show that the set $T(B_H)$ consists of all vectors of the form $\sum_n s_n\lambda_ne''_n$, where $\sum_n|\lambda_n|^2 \le 1$. Indeed, since the numbers $\langle x, e'_n\rangle$ in the decomposition of the vector $Tx$ in the Schmidt theorem satisfy the Bessel inequality, in the case of $\|x\| \le 1$ such a vector always has the indicated form. Conversely, every vector of the indicated form is, of course, $Tx$ for $x = \sum_n\lambda_ne'_n$, and the latter obviously belongs to $B_H$.
Now let $y$ be an adherent point for the set $T(B_H)$. Since $T(B_H)$ belongs to the closure of the linear span of the vectors $e''_n$; $n = 1, 2, \dots$, we have $y = \sum_n\xi_ne''_n$ for some $\xi_n \in \mathbb{C}$. Our goal is to verify that $\sum_n|\xi_n/s_n|^2 \le 1$.
Let $N$ be the number of vectors $e''_n$ if they form a finite system, and an arbitrary natural number otherwise. Take $\varepsilon > 0$; then there exists $z = \sum_n s_n\lambda_ne''_n$ such that $\sum_n|\lambda_n|^2 \le 1$ and $\|z - y\| < \varepsilon$. Obviously, $\|z - y\|^2 = \sum_n|\xi_n - s_n\lambda_n|^2$. Consequently, for all $n$ in the latter sum we have $|\xi_n/s_n| \le |\lambda_n| + |\xi_n - s_n\lambda_n|/s_n < |\lambda_n| + \varepsilon/s_n$. As a corollary,
$$\Bigl(\sum_{n=1}^N\bigl|\xi_n/s_n\bigr|^2\Bigr)^{1/2} \le \Bigl(\sum_{n=1}^N|\lambda_n|^2\Bigr)^{1/2} + \varepsilon\Bigl(\sum_{n=1}^N s_n^{-2}\Bigr)^{1/2} \le 1 + \varepsilon\Bigl(\sum_{n=1}^N s_n^{-2}\Bigr)^{1/2}.$$
Since $\varepsilon$ was chosen arbitrarily, this means that $\sum_{n=1}^N|\xi_n/s_n|^2 \le 1$. The rest is clear. ∎
(Later we will be able to obtain a much more general result of that sort; see Proposition 4.2.17.)
* * *
From the general categorical point of view the Schmidt theorem is a striking result about classification of morphisms.
We recall the notion of weak unitary equivalence of operators on pre-Hilbert spaces discussed in Section 1.4. The classification theorem we suggest as the next exercise is equivalent to the Schmidt theorem and can be considered as one of its formulations.
Exercise 1.
(i) Let $T : H \to K$ be a compact operator between Hilbert spaces. Then there exists an ordered family $s = (s_1, s_2, \dots)$ of non-increasing positive numbers of finite or countable cardinality $m$, and Hilbert spaces $H_0$ and $K_0$ such that $T$ is weakly unitarily equivalent to the operator $R : l_2^m \oplus H_0 \to l_2^m \oplus K_0$, acting on $l_2^m$ as the diagonal operator $T_s$ and taking $H_0$ to zero. Moreover, the numbers $s_1, s_2, \dots$ are s-numbers of the operator $T$, and thus this family of numbers is uniquely defined by the operator.
(ii) Two compact operators $T : H_1 \to K_1$ and $S : H_2 \to K_2$ between Hilbert spaces are weakly unitarily equivalent $\iff$ their families of s-numbers coincide, the space $\operatorname{Ker}(T)$ is unitarily isomorphic to $\operatorname{Ker}(S)$, and the space $\operatorname{Im}(T)^\perp$ is unitarily isomorphic to $\operatorname{Im}(S)^\perp$.
Hint. (i) Suppose $T = \sum_n s_n\, e'_n \circ e''_n$, $H_0 := \operatorname{Ker}(T)$ and $K_0 := \operatorname{Im}(T)^\perp$. Then there is a unique unitary operator $U_1 : H \to l_2^m \oplus H_0$ taking $e'_n$ to $\boldsymbol{p}^n$ and acting as the identity on $H_0$. Also there exists $U_2 : K \to l_2^m \oplus K_0$ with similar properties. The operators $U_1$ and $U_2$ make the required diagram commutative.
(ii) $\Longrightarrow$. If $U_1$ and $U_2$ are unitary operators establishing the weak unitary equivalence between $S$ and $T$, then they are the required isomorphisms between the kernels and the orthogonal complements of the images. Further, if $T = \sum_n s_n\, e'_n \circ e''_n$, then $U_1e'_n$, $U_2e''_n$, and the very same $s_n$ play the analogous role for $S$. After that Proposition 1 works.
(ii) $\Longleftarrow$. By (i), both operators are weakly unitarily equivalent to the same operator.
Thus, every compact operator between Hilbert spaces is uniquely defined up to the weak unitary equivalence by the following data: a) a (finite or infinite) sequence $s_1 \ge s_2 \ge \cdots$ of s-numbers; b) the Hilbert dimension of its kernel, and c) the Hilbert dimension of the orthogonal complement of its image (see Theorem 2.2.2). In other words (cf. general discussion in Section 0.4), the complete system of invariants of the weak unitary equivalence for this class of operators consists of triples $(s, \alpha, \beta)$, where $s$ is a non-increasing sequence of positive real numbers tending to zero (or finite), and $\alpha$ and $\beta$ are cardinal numbers. A simplest model (Section 0.4) of a compact operator with the invariant $(s, \alpha, \beta)$ is $R : l_2^m \oplus H_0 \to l_2^m \oplus K_0$. Here $m$ is the greatest natural number for which $s_m > 0$, or the countable cardinality if there is no such number. Further, $H_0$ and $K_0$ are Hilbert spaces with Hilbert dimensions $\alpha$ and $\beta$ respectively (say, $H_0 := l_2(X)$ and $K_0 := l_2(Y)$, where $X$ has the cardinality $\alpha$, and $Y$ has the cardinality $\beta$), and $R$ acts by the formula presented in the previous exercise.
* * *
Certainly, a compact operator between Hilbert spaces is finite-dimensional $\iff$ its set of s-numbers is finite. Other conditions on its s-numbers distinguish other classes of compact operators. Here is one of the most important cases.
Definition 2. A compact operator $T$ is called a Schmidt operator (or a Hilbert-Schmidt operator) if its s-numbers satisfy the condition $\sum_n s_n^2 < \infty$. The Schmidt norm of such an operator is $\|T\|_{\mathcal{S}} := \sqrt{\sum_n s_n^2}$. The set of Schmidt operators is denoted by $\mathcal{S}(H, K)$, and we write $\mathcal{S}(H)$ instead of $\mathcal{S}(H, H)$.
The class of operators we have just introduced can also be defined without appealing to s-numbers. For clarity, we will show this in the case where both spaces are separable and infinite-dimensional. Until further notice, we assume that our spaces have these properties.
If a given operator is finite-dimensional, so that its "Schmidt series" is a finite sum $\sum_{n=1}^N s_n\, e'_n \circ e''_n$, we can arbitrarily augment the vectors $e'_1, \dots, e'_N$ to a countable orthonormal system in $H$ by vectors $e'_{N+1}, e'_{N+2}, \dots$, and the vectors $e''_1, \dots, e''_N$ to a countable orthonormal system in $K$ by vectors $e''_{N+1}, e''_{N+2}, \dots$, and put $s_{N+1} = s_{N+2} = \cdots := 0$. Then every compact operator, no matter what its image is, is represented as a "true" series $\sum_{n=1}^\infty s_n\, e'_n \circ e''_n$ convergent in the operator norm, which we still call the Schmidt series of this operator. This agreement allows us to avoid tedious repetitions related to the case of finite-dimensional operators in the future.
Theorem 2. The following properties of a bounded operator $T : H \to K$ are equivalent:
(i) $T$ is a Schmidt operator;
(ii) for some orthonormal basis $d'_1, d'_2, \dots$ in $H$ we have $\sum_{k=1}^\infty\|Td'_k\|^2 < \infty$;
(iii) for an arbitrary orthonormal basis $d'_1, d'_2, \dots$ in $H$ we have $\sum_{k=1}^\infty\|Td'_k\|^2 < \infty$;
(iv) for the matrix $(a_{kl})$ of the operator $T$ with respect to some orthonormal bases in $H$ and $K$ we have $\sum_{k,l=1}^\infty|a_{kl}|^2 < \infty$ (see Definition 1.4.5);
(v) for the matrix $(a_{kl})$ of the operator $T$ with respect to arbitrary orthonormal bases in $H$ and $K$ we have $\sum_{k,l=1}^\infty|a_{kl}|^2 < \infty$.
Finally, if $T$ has these properties, the sums of all these series coincide and they are equal to $\|T\|_{\mathcal{S}}^2$ (that is, to $\sum_{n=1}^\infty s_n^2$).
Proof. The implications (iii) $\Longrightarrow$ (ii) and (v) $\Longrightarrow$ (iv) are obvious. Further, by the Parseval equality (Theorem 1.2.4(ii)), for any orthonormal bases $d'_1, d'_2, \dots$ in $H$ and $d''_1, d''_2, \dots$ in $K$ we have $\|Td'_k\|^2 = \sum_{l=1}^\infty|\langle Td'_k, d''_l\rangle|^2$, and therefore
$$\sum_{k=1}^\infty\|Td'_k\|^2 = \sum_{k,l=1}^\infty|\langle Td'_k, d''_l\rangle|^2 = \sum_{k,l=1}^\infty|a_{kl}|^2. \qquad (1)$$
From this we have the equivalences (ii) $\iff$ (iv) and (iii) $\iff$ (v). Now to prove the equivalence of all the five conditions we only need to justify the implications (i) $\Longrightarrow$ (iii) and (ii) $\Longrightarrow$ (i). We do this in several steps.
Lemma 1. Suppose that $T$ is a compact operator with Schmidt series $\sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$. Then for every orthonormal basis $d'_1, d'_2, \dots$ in $H$ the numbers $\sum_{n=1}^{\infty} s_n^2$ and $\sum_{k=1}^{\infty} \|Td'_k\|^2$ simultaneously either exist or do not exist, and if they exist, they coincide.

Proof. For vectors $Td'_k$ and $e''_n$ from the corresponding orthonormal systems we have
$$\langle Td'_k, e''_n \rangle = \Big\langle \sum_m s_m \langle d'_k, e'_m \rangle e''_m, e''_n \Big\rangle = \sum_m s_m \langle d'_k, e'_m \rangle \langle e''_m, e''_n \rangle = s_n \langle d'_k, e'_n \rangle.$$
Taking into account that $\|Td'_k\|^2 = \sum_{n=1}^{\infty} |\langle Td'_k, e''_n \rangle|^2$, we see that the sums $\sum_{k=1}^{\infty} \|Td'_k\|^2$ and $\sum_{k,n=1}^{\infty} s_n^2 |\langle d'_k, e'_n \rangle|^2$ are simultaneously finite or not, and if they are finite, they coincide. Let $N$ be an arbitrary natural number. Then, by the Parseval equality,
$$\sum_{n=1}^{N} \sum_{k=1}^{\infty} s_n^2 |\langle d'_k, e'_n \rangle|^2 = \sum_{n=1}^{N} s_n^2 \|e'_n\|^2 = \sum_{n=1}^{N} s_n^2.$$
This implies that $\sum_{k,n=1}^{\infty} s_n^2 |\langle d'_k, e'_n \rangle|^2 < \infty \iff \sum_{n=1}^{\infty} s_n^2 < \infty$, and these sums coincide. The rest is clear. ∎
Lemma 2. For a bounded operator $T : H \to K$ and an orthonormal basis $d'_1, d'_2, \dots$ in $H$ we have $\|T\| \le \sqrt{\sum_{k=1}^{\infty} \|Td'_k\|^2}$.

(Certainly, this lemma is interesting only if the latter sum is finite.)

Proof. For every $x \in H$ we have $Tx = \sum_{k=1}^{\infty} \langle x, d'_k \rangle Td'_k$. Hence, taking into account the Cauchy-Bunyakovskii inequality and the Parseval equality, we have
$$\|Tx\| \le \sum_{k=1}^{\infty} |\langle x, d'_k \rangle| \, \|Td'_k\| \le \sqrt{\sum_{k=1}^{\infty} |\langle x, d'_k \rangle|^2} \, \sqrt{\sum_{k=1}^{\infty} \|Td'_k\|^2} = \|x\| \sqrt{\sum_{k=1}^{\infty} \|Td'_k\|^2}.$$
The rest is clear. ∎
Lemma 3. If for some orthonormal basis $d'_1, d'_2, \dots$ in $H$ we have the inequality $\sum_{k=1}^{\infty} \|Td'_k\|^2 < \infty$, then $T$ is compact.

Proof. For every $k$, define the finite-dimensional operator $T_k$ by the formula $x \mapsto \sum_{l=1}^{k} \langle x, d'_l \rangle Td'_l$. (Thus, $T_k$ coincides with $T$ on the first $k$ vectors of our basis and takes the remaining vectors to zero.) Then $\sum_{l=1}^{\infty} \|(T - T_k)d'_l\|^2 = \sum_{l=k+1}^{\infty} \|Td'_l\|^2$. Hence, using the previous lemma for $T - T_k$ as $T$, we obtain that $T = \lim_{k \to \infty} T_k$ in the operator norm. Therefore, $T$ is approximated by finite-dimensional operators and thus (Corollary 3.2) is compact. ∎
End of the proof of Theorem 2. It remains to note that the implication (i) $\Rightarrow$ (iii) is contained in Lemma 1, and the implication (ii) $\Rightarrow$ (i) follows from this lemma together with Lemma 3. Thus, the proof of the equivalence of the conditions (i)-(v) is completed. As for the last statement of the theorem about the equality of the indicated sums, it evidently follows from the same Lemma 1, taking into account formula (1). ∎

Remark. We see (Theorem 2(v)) that the Schmidt operators are completely described in terms of matrices. In fact, this is the property characterizing this particular class of operators; nobody knows how to describe other classes of operators in terms of matrices.
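The basis independence asserted in Theorem 2 can be checked numerically in the simplest finite-dimensional situation. The following is only a sketch under stated assumptions: it works in $\mathbb{R}^2$ rather than in an infinite-dimensional space, the matrix `T` is an arbitrary example, and the helper names `schmidt_sum` and `rotated_basis` are hypothetical names introduced here for illustration.

```python
import math

def apply(T, x):
    """Apply a real matrix T (given as a list of rows) to a vector x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def schmidt_sum(T, basis):
    """Return sum_k ||T d_k||^2 over the vectors d_k of an orthonormal basis."""
    return sum(sum(c * c for c in apply(T, d)) for d in basis)

def rotated_basis(theta):
    """Orthonormal basis of R^2 obtained by rotating the standard basis by theta."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

# Example operator (an assumption of this sketch, not taken from the text).
T = [[1.0, 2.0], [0.0, 3.0]]

# In the standard basis the sum equals the sum of squared matrix entries.
standard = schmidt_sum(T, rotated_basis(0.0))

# The same value is obtained in every rotated orthonormal basis.
for theta in (0.3, 1.1, 2.5):
    assert abs(schmidt_sum(T, rotated_basis(theta)) - standard) < 1e-12

print(standard)
```

Here the common value is the sum of the squared matrix entries, which is the finite-dimensional counterpart of conditions (iv) and (v).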
Proposition 4. (i) $\mathcal{S}(H, K)$ is a subspace in $\mathcal{K}(H, K)$, and thus a linear space;
(ii) $\|\cdot\|_S$ is a Hilbert norm in $\mathcal{S}(H, K)$. The inner product $\langle \cdot, \cdot \rangle$ generating this norm (and unique by the polar identity) is as follows: if we take arbitrary orthonormal bases in $H$ and $K$, then for Schmidt operators $S$ and $T$ we have
$$\langle S, T \rangle = \sum_{k,l=1}^{\infty} a_{kl} \overline{b_{kl}},$$
where $(a_{kl})$ and $(b_{kl})$ are the matrices of these operators with respect to these bases;
(iii) $(\mathcal{S}(H, K), \|\cdot\|_S)$ is a Hilbert space.

Proof. Choose orthonormal bases in $H$ and $K$, and assign to every compact operator from $H$ to $K$ the matrix of this operator with respect to these bases. Since our matrices are functions of two natural arguments, we obtain a mapping from $\mathcal{K}(H, K)$ to the space of all double sequences, and this mapping is evidently an injective linear operator. By Theorem 2, this operator gives a bijection $\beta : \mathcal{S}(H, K) \to l_2(\mathbb{N} \times \mathbb{N})$, and this obviously implies (i). Further, from the same Theorem 2 it follows that for the standard norm $\|\cdot\|_2$ in $l_2(\mathbb{N} \times \mathbb{N})$ we have $\|\beta(T)\|_2 = \|T\|_S$. Certainly, this means that $\|\cdot\|_S$ is a norm in $\mathcal{S}(H, K)$, and $\beta$ is an isometric isomorphism between the normed spaces $(\mathcal{S}(H, K), \|\cdot\|_S)$ and $(l_2(\mathbb{N} \times \mathbb{N}), \|\cdot\|_2)$. But the norm in the latter space is generated by the inner product $\langle a, b \rangle := \sum_{k,l=1}^{\infty} a_{kl} \overline{b_{kl}}$; this implies (ii). Finally, from the fact that $l_2(\mathbb{N} \times \mathbb{N})$ is a Hilbert space and $\beta$ is an isometric isomorphism, we obtain (iii). ∎
Remark. Later, when Hilbert adjoint operators are introduced, we will obtain another characteristic property of Schmidt operators and a formula for the inner product in $\mathcal{S}(H, K)$ that is not connected with concrete orthonormal bases (see Corollary 6.2.2(i) in the sequel).
It turns out that the Schmidt operators acting on the space $L_2[a, b]$ are already known, but under a different name.

Proposition 5. An operator on $L_2[a, b]$ is a Schmidt operator $\iff$ it is an integral operator (see Example 1.3.6). The mapping $K \mapsto T_K$, where $T_K$ is the integral operator on $L_2[a, b]$ with kernel $K$, is a unitary isomorphism between the Hilbert spaces $L_2([a, b] \times [a, b])$ and $\mathcal{S}(L_2[a, b])$.

Proof. $\Rightarrow$. Suppose $T = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$ is our operator; now the vectors $e'_n$ and $e''_n$ are square-integrable functions. Put $u_n(r, t) := e''_n(r) \overline{e'_n(t)}$; $r, t \in [a, b]$. From the Fubini theorem it evidently follows that $u_n$ is an orthonormal system in $L_2(D)$, where $D := [a, b] \times [a, b]$. Since the series $\sum_{n=1}^{\infty} s_n^2$ converges, the series $\sum_{n=1}^{\infty} s_n u_n$ converges in $L_2(D)$ to a function $K = K(r, t)$. Consider the integral operator $T_K$ with kernel $K$, and for an arbitrary natural number $N$ consider the integral operator $T_{K_N}$ with kernel $K_N = \sum_{n=1}^{N} s_n u_n$. Then for every $x \in L_2[a, b]$ for almost all $r \in [a, b]$ we have
$$[T_{K_N}(x)](r) = \int_a^b \sum_{n=1}^{N} s_n u_n(r, t) x(t) \, dt = \sum_{n=1}^{N} s_n e''_n(r) \int_a^b \overline{e'_n(t)} x(t) \, dt = \sum_{n=1}^{N} s_n \langle x, e'_n \rangle e''_n(r).$$
This means that $T_{K_N}(x) = \sum_{n=1}^{N} s_n \langle x, e'_n \rangle e''_n$, i.e., $T_{K_N} = \sum_{n=1}^{N} s_n e'_n \bigcirc e''_n$. Now we see that by the Schmidt theorem, $T = \lim_{N \to \infty} T_{K_N}$, and at the same time, the estimate $\|T_K - T_{K_N}\| \le \|K - K_N\|$ (see again Example 1.3.6) shows that $T_K = \lim_{N \to \infty} T_{K_N}$. Thus, $T = T_K$.

$\Leftarrow$. Let $T_K$ be an integral operator with kernel $K$. Choose two arbitrary orthonormal bases $\{e'_1, e'_2, \dots\}$ and $\{e''_1, e''_2, \dots\}$ in $L_2[a, b]$. Then the matrix of the operator $T_K$ with respect to these bases is, obviously, $a_{kl} := \int_D K(r, t) \overline{u_{kl}(r, t)} \, dt \, dr$, where $u_{kl}(r, t) := e''_k(r) \overline{e'_l(t)}$. On the other hand, $u_{kl}$ is an orthonormal basis in $L_2(D)$ (the arguments we used in the proof of Theorem 3.1 work with obvious modifications), so that the numbers $a_{kl}$ are the Fourier coefficients of the element $K \in L_2(D)$ with respect to this basis. Therefore, by the Parseval equality, $\|K\|^2 = \sum_{n,m=1}^{\infty} |a_{nm}|^2$. Applying Theorem 2, we see that $T_K$ is a Schmidt operator, and $\|T_K\|_S = \|K\|$. The rest is clear. ∎

Corollary 1. (i) An operator between two separable Hilbert spaces is a Schmidt operator $\iff$ it is weakly unitarily equivalent to an integral operator on $L_2[a, b]$.
(ii) An operator acting on a separable Hilbert space is a Schmidt operator $\iff$ it is unitarily equivalent to an integral operator on $L_2[a, b]$.
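As a hedged illustration of Proposition 5, one can compute both norms for a concrete kernel. The choice $[a, b] = [0, 1]$ and $K(r, t) = rt$ is an assumption made here for the example and does not come from the text; the auxiliary functions $u$ and $e$ are likewise introduced only for this computation.

```latex
% Illustrative example (not from the source): K(r,t) = rt on [0,1] x [0,1].
\begin{align*}
[T_K x](r) &= \int_0^1 r\,t\,x(t)\,dt = \langle x, u\rangle\, u(r), \qquad u(t) := t,\\
\|u\|^2 &= \int_0^1 t^2\,dt = \tfrac13, \qquad e := \sqrt{3}\,u, \quad \|e\| = 1,\\
T_K &= u \bigcirc u = \tfrac13\, e \bigcirc e
  \quad\Longrightarrow\quad s_1 = \tfrac13,\ \ s_n = 0 \ (n \ge 2),\\
\|T_K\|_S^2 &= s_1^2 = \tfrac19 = \int_0^1\!\!\int_0^1 r^2 t^2 \,dt\,dr = \|K\|^2 .
\end{align*}
```

So in this rank-one case the Schmidt norm of $T_K$ and the $L_2$-norm of the kernel are both equal to $1/3$, as the unitary isomorphism $K \mapsto T_K$ requires.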
This corollary can be viewed as an analytical description of the Schmidt operators. We give another interpretation of the Schmidt operators, this time as elements of Hilbert tensor products. For a vector $x \in H$, denote by $x^* \in H^*$ the functional defined by this vector. Recall that for a Hilbert space $H$ the dual space $H^*$ is again a Hilbert space with respect to the inner product $\langle x^*, y^* \rangle := \langle y, x \rangle$; $x, y \in H$ (see Proposition 2.3.8).

Theorem 3. There is a unitary isomorphism between the Hilbert spaces $H^* \dot\otimes K$ and $\mathcal{S}(H, K)$ taking an elementary tensor $x^* \otimes y$ to the one-dimensional operator $x \bigcirc y$; $x \in H$, $y \in K$.

Proof. The space $H^* \dot\otimes K$ contains a dense subspace $H^* \otimes K$, i.e., the algebraic tensor product of $H^*$ and $K$. The space $\mathcal{S}(H, K)$ contains the subspace $\mathcal{F}(H, K)$, which is again dense because every Schmidt operator is a limit in the Schmidt norm of partial sums of its Schmidt series. Consider the linear isomorphism $\mathrm{Gr}_0^0 : H^* \otimes K \to \mathcal{F}(H, K)$ (see Proposition 2.7.2). Take $u \in H^* \otimes K$; $u = \sum_{k=1}^{n} x_k^* \otimes y_k$, and choose an orthonormal basis $e_1^*, \dots, e_m^*$ in $\mathrm{span}\{x_1^*, \dots, x_n^*\}$. Then $u = \sum_{k=1}^{m} e_k^* \otimes z_k$ for some $z_1, \dots, z_m \in K$. From the definition of the inner product in $H^* \otimes K$ (see Section 2.8) it follows that the system $e_k^* \otimes z_k$; $k = 1, \dots, m$ is orthogonal. Therefore, $\|u\|^2 = \sum_{k=1}^{m} \|e_k^* \otimes z_k\|^2 = \sum_{k=1}^{m} \|z_k\|^2$. At the same time $T := \mathrm{Gr}_0^0(u) = \sum_{k=1}^{m} e_k \bigcirc z_k$. Since $Tx = 0$ for every $x \perp \mathrm{span}\{e_1, \dots, e_m\}$, from Theorem 2 it follows that $\|T\|_S^2 = \sum_{k=1}^{m} \|Te_k\|^2 = \sum_{k=1}^{m} \|z_k\|^2$. Thus, $\mathrm{Gr}_0^0$ is an isometric, and hence a unitary isomorphism between dense subspaces in $H^* \dot\otimes K$ and $\mathcal{S}(H, K)$. It remains to use the extension-by-continuity principle (in the form described in Proposition 2.1.10). ∎

Now, consider another special class of compact operators. In the following definition $H$ and $K$ are arbitrary Hilbert spaces.
Definition 3. A compact operator $T : H \to K$ is called a nuclear operator (or a trace class operator) if for its s-numbers we have $\sum_n s_n < \infty$. The nuclear norm of such an operator is the number $\|T\|_N := \sum_n s_n$. The set of nuclear operators is denoted by $\mathcal{N}(H, K)$, and we write $\mathcal{N}(H)$ instead of $\mathcal{N}(H, H)$.

Here the advanced reader may notice that we have already used the same notation and the same name in another context (cf. Definition 2.7.5), where the s-numbers were not even mentioned. We will soon show that the two definitions of nuclearity agree. However, for now we pretend that we have forgotten the definition in Section 2.7 and, speaking about nuclearity, we have in mind the definition just given.
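To see how Definition 3 separates nuclear operators from Schmidt operators, here is a sketch with diagonal operators $T_\lambda$ on $l_2$. It assumes (this is not spelled out above) that for a decreasing sequence $\lambda_n > 0$ tending to zero the s-numbers of $T_\lambda$ are the numbers $\lambda_n$ themselves, since then $T_\lambda = \sum_n \lambda_n p^n \bigcirc p^n$ for the unit vectors $p^n$; the symbol $p^n$ is introduced here for illustration.

```latex
% Illustrative example: diagonal operators with s_n = \lambda_n.
\begin{align*}
\lambda_n = \tfrac1n:&\quad \sum_n s_n^2 = \sum_n \tfrac{1}{n^2} = \tfrac{\pi^2}{6} < \infty,
   \qquad \sum_n s_n = \sum_n \tfrac1n = \infty,\\
&\quad\text{so } T_\lambda \text{ is a Schmidt operator but not a nuclear one};\\
\lambda_n = \tfrac1{n^2}:&\quad \|T_\lambda\|_N = \sum_n \tfrac1{n^2} = \tfrac{\pi^2}{6} < \infty,
   \quad\text{so } T_\lambda \text{ is nuclear}.
\end{align*}
```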
Again, for clarity we restrict ourselves to the case where our Hilbert spaces are separable and infinite-dimensional.
Proposition 6. Let $T$ be a nuclear operator acting on a Hilbert space $H$, and $\sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$ its Schmidt series. Suppose $e_1, e_2, \dots$ is an orthonormal basis in $H$ and $(a_{kl})$ is the matrix of the operator $T$ in this basis. Then the series $\sum_{k=1}^{\infty} a_{kk}$ (composed of the diagonal elements of our matrix) converges absolutely, and its sum is equal to $\sum_{n=1}^{\infty} s_n \langle e''_n, e'_n \rangle$. In particular, this sum does not depend on the choice of orthonormal basis.

Proof. Consider the series $\sum_{n,k=1}^{\infty} s_n \langle e_k, e'_n \rangle \langle e''_n, e_k \rangle$. It is absolutely convergent since, for each $n$, $\sum_{k=1}^{\infty} |\langle e_k, e'_n \rangle| \, |\langle e''_n, e_k \rangle|$ is the inner product of two vectors in $l_2$ of norm 1 (the moduli of the corresponding Fourier coefficients) and $\sum_n s_n < \infty$. Hence, the corresponding repeated series are absolutely convergent, and their sums coincide. By the equality $Te_k = \sum_{n=1}^{\infty} s_n \langle e_k, e'_n \rangle e''_n$ we have
$$\sum_{k=1}^{\infty} \sum_{n=1}^{\infty} s_n \langle e_k, e'_n \rangle \langle e''_n, e_k \rangle = \sum_{k=1}^{\infty} \Big\langle \sum_{n=1}^{\infty} s_n \langle e_k, e'_n \rangle e''_n, e_k \Big\rangle = \sum_{k=1}^{\infty} \langle Te_k, e_k \rangle.$$
At the same time, decomposing $e'_n$ and $e''_n$ with respect to the basis $e_k$; $k = 1, 2, \dots$, we see that $\sum_{k=1}^{\infty} \langle e_k, e'_n \rangle \langle e''_n, e_k \rangle = \langle e''_n, e'_n \rangle$, and therefore summation in the other order gives the number $\sum_{n=1}^{\infty} s_n \langle e''_n, e'_n \rangle$, which does not depend on $e_k$. The rest is clear. ∎

Proposition 6 makes the following notion well defined.
Definition 4. The sum of the diagonal elements of the matrix of a nuclear operator $T$ in an arbitrary orthonormal basis of the space $H$ is called the trace of this operator and is denoted by $\mathrm{tr}(T)$.

Again, we will soon show that this definition agrees with the definition of trace in Section 2.7.
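As a worked instance of Definition 4 (a sketch that uses only the Parseval equality), take the one-dimensional operator $x \bigcirc y : z \mapsto \langle z, x \rangle y$ on $H$ and an arbitrary orthonormal basis $e_1, e_2, \dots$ in $H$:

```latex
% Trace of a one-dimensional operator, computed in an arbitrary orthonormal basis.
\begin{align*}
a_{kk} &= \langle (x \bigcirc y)e_k, e_k\rangle = \langle e_k, x\rangle \langle y, e_k\rangle,\\
\operatorname{tr}(x \bigcirc y) &= \sum_{k=1}^{\infty} \langle y, e_k\rangle\, \overline{\langle x, e_k\rangle}
   = \langle y, x\rangle .
\end{align*}
```

The result does not involve the basis, in agreement with Proposition 6.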
Proposition 7. Let $T$ be a compact operator and $S$ a bounded operator on $H$, and suppose $ST$ and $TS$ are nuclear operators. Then $\mathrm{tr}(ST) = \mathrm{tr}(TS)$ (the trace of a product does not depend on the order of factors).

Proof. Suppose $T = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$. Then for every $n$ we have
$$\langle STe'_n, e'_n \rangle = s_n \langle Se''_n, e'_n \rangle = \Big\langle \sum_{k=1}^{\infty} s_k \langle Se''_n, e'_k \rangle e''_k, e''_n \Big\rangle = \langle TSe''_n, e''_n \rangle.$$
Since $ST(x) = 0$ for every $x \perp \mathrm{span}\{e'_n : n \in \mathbb{N}\}$, we have $\mathrm{tr}(ST) = \sum_{n=1}^{\infty} \langle STe'_n, e'_n \rangle$. On the other hand, $\mathrm{Im}(TS)$ lies in the closure of $\mathrm{span}\{e''_n : n \in \mathbb{N}\}$, hence $\mathrm{tr}(TS) = \sum_{n=1}^{\infty} \langle TSe''_n, e''_n \rangle$. It remains to sum over $n$ the first and the last expressions in the above chain of equalities. ∎
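The identity of Proposition 7 is easy to test on matrices, which are the finite-dimensional case of the statement. The sketch below is only an illustration: the two integer matrices are arbitrary examples chosen here, and `matmul` and `trace` are small helper functions written for this check.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]

def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

# Two non-commuting example matrices (assumptions of this sketch).
S = [[1, 2, 0], [0, 1, 4], [3, 0, 1]]
T = [[2, 0, 1], [1, 1, 0], [0, 5, 2]]

ST = matmul(S, T)
TS = matmul(T, S)

assert ST != TS                 # the products themselves differ
assert trace(ST) == trace(TS)   # but their traces coincide
```

The products differ as operators, and only their traces agree, which is exactly what the proposition claims.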
The following facts about nuclear operators should be known to all readers.

Proposition 8 ([55, Chapter III, §§8, 9]). (i) $\mathcal{N}(H, K)$ is a subspace in $\mathcal{S}(H, K)$ (hence, a linear space), and $\|\cdot\|_N$ is a norm on $\mathcal{N}(H, K)$. Moreover, $\mathcal{N}(H, K)$ is a Banach space with respect to this norm.
(ii) The composition of two Schmidt operators, as well as the composition (in any order) of a nuclear and a bounded operator, is a nuclear operator.
(iii) The trace is the unique (up to a scalar factor) functional $f$ on $\mathcal{N}(H)$ that is continuous in the nuclear norm and satisfies the condition $f(ST) = f(TS)$ for all one-dimensional operators $S$ and $T$.
Some of these results will be proved later in "small print". But there are many things the reader can establish right now as a useful exercise (and without special effort). Here is some

Information to think over. If $T = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n \in \mathcal{N}(H, K)$, then for all orthonormal systems $d'_k$ in $H$ and $d''_k$ in $K$ the double series $\sum_{n=1}^{\infty} \sum_{k=1}^{\infty} s_n \langle d'_k, e'_n \rangle \langle e''_n, d''_k \rangle$ is absolutely convergent. Changing the order of summation, we see that $\sum_{k=1}^{\infty} |\langle Td'_k, d''_k \rangle| \le \|T\|_N$. Therefore, if $S, T \in \mathcal{N}(H, K)$ and $S + T = \sum_{k=1}^{\infty} s'_k d'_k \bigcirc d''_k$, then $\sum_{k=1}^{\infty} s'_k = \sum_{k=1}^{\infty} \langle (S + T)d'_k, d''_k \rangle \le \|S\|_N + \|T\|_N$.

If $T \in \mathcal{S}(H, K)$, $S \in \mathcal{S}(K, L)$, $ST = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$, and $e_k$ is an orthonormal basis in $K$, then $s_n = \sum_{k=1}^{\infty} a_{nk} b_{kn}$, where $a_{nk} := \langle Se_k, e''_n \rangle$ and $b_{kn} := \langle Te'_n, e_k \rangle$. Theorem 2(v) implies that the sum of the double series $\sum_{n,k=1}^{\infty} a_{nk} b_{kn}$ is the inner product of the double sequences in $l_2(\mathbb{N} \times \mathbb{N})$.

Suppose $T \in \mathcal{N}(H, K)$, $S \in \mathcal{B}(K, L)$, $ST = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e'''_n$, and $T = \sum_{k=1}^{\infty} t_k f'_k \bigcirc f''_k$, where $\sum_{k=1}^{\infty} t_k < \infty$. Then it is easy to see that $\sum_{n=1}^{\infty} s_n = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} t_k \langle e'_n, f'_k \rangle \langle Sf''_k, e'''_n \rangle$. Let us change the order of summation. For every $k$, from the Cauchy-Bunyakovskii inequality we obtain the estimate $\sum_{n=1}^{\infty} |\langle e'_n, f'_k \rangle| \, |\langle Sf''_k, e'''_n \rangle| \le \|S\|$. Hence, $\sum_{n=1}^{\infty} s_n \le \|S\| \sum_{k=1}^{\infty} t_k$, so that $ST \in \mathcal{N}(H, L)$ and $\|ST\|_N \le \|S\| \, \|T\|_N$.

Finally, for every $x, y \in H$ it is easy to see that $\mathrm{tr}(x \bigcirc y) = \langle y, x \rangle$. Let the functional $f : \mathcal{N}(H) \to \mathbb{C}$ have the properties from (iii). Then, multiplying some one-dimensional operators in a different order, as described in Proposition 1.5.5, we see that $f(x \bigcirc y) = f(z \bigcirc z) \langle y, x \rangle = f(z \bigcirc z) \, \mathrm{tr}(x \bigcirc y)$ for all $x, y, z \in H$, $\|z\| = 1$. Thus, if we take $z \in H$; $\|z\| = 1$ and put $\lambda := f(z \bigcirc z)$, then $f = \lambda \, \mathrm{tr}$ on $\mathcal{F}(H)$, which is a dense subspace in $(\mathcal{N}(H), \|\cdot\|_N)$, as follows from the form of the Schmidt series.
Now we state a fundamental theorem describing the action of functionals on the introduced spaces of compact operators.

Theorem 4 (Schatten-von Neumann, [68, Chapter II, §1]). Let $H$ and $K$ be (arbitrary) Hilbert spaces. Then
(i) Every nuclear operator $T : K \to H$ defines a bounded functional $f_T$ on the space $\mathcal{K}(H, K)$ (with the operator norm) by the rule $f_T(S) := \mathrm{tr}(ST)$, and every bounded functional on $\mathcal{K}(H, K)$ has the form $f_T$ for a unique $T \in \mathcal{N}(K, H)$. The resulting bijection $I_{\mathcal{K}} : T \mapsto f_T$ is an isometric isomorphism of the space $(\mathcal{N}(K, H), \|\cdot\|_N)$ onto $\mathcal{K}(H, K)^*$.
(ii) Every bounded operator $T : K \to H$ defines a bounded functional $f_T$ on the space $\mathcal{N}(H, K)$ (with the nuclear norm) by the rule $f_T(S) := \mathrm{tr}(ST)$, and every bounded functional on $(\mathcal{N}(H, K), \|\cdot\|_N)$ has the form $f_T$ for a unique $T \in \mathcal{B}(K, H)$. The resulting bijection $I_{\mathcal{N}} : T \mapsto f_T$ is an isometric isomorphism of the space $\mathcal{B}(K, H)$ (with the operator norm) onto $(\mathcal{N}(H, K), \|\cdot\|_N)^*$.
(iii) Every Schmidt operator $T : K \to H$ defines a bounded functional $f_T$ on the space $\mathcal{S}(H, K)$ (with the Schmidt norm) by the rule $f_T(S) := \mathrm{tr}(ST)$, and every bounded functional on $(\mathcal{S}(H, K), \|\cdot\|_S)$ has the form $f_T$ for a unique $T \in \mathcal{S}(K, H)$. The resulting bijection $I_{\mathcal{S}} : T \mapsto f_T$ is an isometric isomorphism of the space $(\mathcal{S}(K, H), \|\cdot\|_S)$ onto $(\mathcal{S}(H, K), \|\cdot\|_S)^*$.

Certainly, the fact that $(\mathcal{S}(K, H), \|\cdot\|_S)$ and $(\mathcal{S}(H, K), \|\cdot\|_S)^*$ are isometrically isomorphic immediately follows from the Riesz-Fischer theorem: both spaces are separable and Hilbert. It is essential that the operators in $\mathcal{S}(K, H)$ define functionals on $\mathcal{S}(H, K)$ precisely by the formula using the trace. As to the second of these results, later it will turn out to be a simple corollary of the interpretation of the space of nuclear operators in terms of Banach tensor products. We discuss this at the end of this section (see Exercise 2).
Note the inner similarity between these three assertions and the description of actions of functionals on the most important function spaces (Exercises 1.6.1-1.6.3). The space $\mathcal{K}(H, K)$ behaves like $c_0$, $\mathcal{N}(H, K)$ like $l_1$, and $\mathcal{S}(H, K)$ like $l_2$. Actually it is advisable to treat each of these spaces of operators as an operator version of the corresponding space of sequences; then many features in their behavior become predictable. By the way, we have already observed another example of the similarity: the composition of two Schmidt operators is a nuclear operator, and this is an analogue of the fact that the coordinatewise product of two sequences in $l_2$ belongs to $l_1$.

Remark. If we restrict ourselves to diagonal operators on $l_2$ (Example 1.3.2), then, informally speaking, the Schatten-von Neumann theorem turns into a unification of Exercises 1.6.1-1.6.3. Indeed, for an arbitrary (i.e., just bounded) diagonal operator $T_\lambda$, $\lambda$ can be an arbitrary sequence in $l_\infty$. At the same time, a diagonal operator is a Schmidt operator $\iff$ $\lambda \in l_2$, and it is a nuclear operator $\iff$ $\lambda \in l_1$. Moreover, $\mathrm{tr}(T_\lambda T_\mu)$, whenever this number is defined, coincides with $\sum_{n=1}^{\infty} \lambda_n \mu_n$, i.e., the same sum that was used for the description of functionals on the spaces of sequences.

The spaces $l_p$ for other $p \in [1, \infty)$ also have an operator version: they are the so-called Schatten-von Neumann classes of order $p$. Namely, by definition, a compact operator $T : H \to K$ belongs to this class if its s-numbers satisfy the condition $\sum_{n=1}^{\infty} s_n^p < \infty$. Knowing the spaces $l_p$ and their relationships, one can predict many things about their operator versions: for instance, the composition $ST$, where $S$ and $T$ belong to the Schatten-von Neumann classes of order $p$ and $q$, respectively, is a nuclear operator if $1/p + 1/q = 1$. In addition, operators of order $q$ act "as functionals generated by the trace" on the class of order $p$, etc. For details about these classes (which go beyond the scope of this book), see, e.g., [55] or [56].

Advanced readers should also know about an interpretation of nuclear operators between Hilbert spaces in terms of tensor products. This is an analogue of Theorem 3 proved earlier. Besides, such readers should know some corollaries of this fact. In particular, we will show that in the context of Hilbert spaces the definitions related to nuclearity are equivalent to the "general Banach" definitions in Section 2.7.

Warning. However, before this has been done, we give the nuclear operators and nuclear norms the meaning indicated in this section (see Definition 3). For simplicity we assume, as before, that the Hilbert spaces $H$ and $K$ are separable and infinite-dimensional (so that we can always speak about infinite Schmidt series).
In what follows, for each $x \in H$ we denote by $x^*$ the corresponding functional on $H$ acting by the formula $x^*(y) = \langle y, x \rangle$ (see Section 2.3).

Theorem 5. The set $\mathcal{N}(H, K)$ of nuclear operators is a subspace in $\mathcal{K}(H, K)$, and the nuclear norm $\|\cdot\|_N$ is a norm there. Moreover, there is an isometric isomorphism between the space $(H^* \widehat\otimes K, \|\cdot\|_p)$ (where $\|\cdot\|_p$ is the projective norm defined in Section 2.7) and the space $(\mathcal{N}(H, K), \|\cdot\|_N)$, taking an elementary tensor $x^* \otimes y$ to a one-dimensional operator $x \bigcirc y$; $x \in H$, $y \in K$.

Proof. Let $\mathrm{Gr} : H^* \widehat\otimes K \to \mathcal{B}(H, K)$ be the Grothendieck operator defined in Section 2.7. It acts on the elementary tensors in the way we indicated. As we will see, our desired isometric isomorphism will be the corestriction of this operator to its image.

Lemma 1. Suppose $u \in H^* \widehat\otimes K$ and $S := \mathrm{Gr}(u)$. Then for all orthonormal systems $e'_1, e'_2, \dots$ in $H$ and $e''_1, e''_2, \dots$ in $K$ we have
$$\sum_{n=1}^{\infty} |\langle Se'_n, e''_n \rangle| \le \|u\|_p.$$

Proof. Let $u$ be represented as the sum of an absolutely convergent series $\sum_{k=1}^{\infty} e_k^* \otimes y_k$ (see Proposition 2.7.5). Then $S = \sum_{k=1}^{\infty} e_k \bigcirc y_k$, and
$$\sum_{n=1}^{\infty} |\langle Se'_n, e''_n \rangle| = \sum_{n=1}^{\infty} \Big| \Big\langle \sum_{k=1}^{\infty} \langle e'_n, e_k \rangle y_k, e''_n \Big\rangle \Big| \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} |\langle e'_n, e_k \rangle| \, |\langle y_k, e''_n \rangle|,$$
provided the latter series converges. Changing the order of summation and using the Cauchy-Bunyakovskii inequality (for $l_2$) together with the Bessel inequality, we see that the right-hand side does not exceed $\sum_{k=1}^{\infty} \|e_k\| \, \|y_k\|$. It remains to take the infimum of such sums over all representations of $u$ as the sum of a series of elementary tensors. ∎

Lemma 2. Every operator $T$ from the image of $\mathrm{Gr}$ is nuclear, and $\|T\|_N \le \inf\{\|u\|_p : u \in H^* \widehat\otimes K, \ \mathrm{Gr}(u) = T\}$.

Proof. We already know from Theorem 2.7.7 that $T$ can be represented as an absolutely convergent series of one-dimensional operators. Hence, it is approximated by finite-dimensional operators, and therefore is compact. Let $\sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$ be its Schmidt series. Since $s_n = \langle Te'_n, e''_n \rangle$, the previous lemma implies that $\sum_{n=1}^{\infty} s_n \le \|u\|_p$ for each $u$ such that $\mathrm{Gr}(u) = T$. The rest is clear. ∎
Lemma 3. Every nuclear operator $T : H \to K$ has the form $T = \mathrm{Gr}(u)$ where $u \in H^* \widehat\otimes K$ satisfies the inequality $\|u\|_p \le \|T\|_N$. If, in addition, $T$ is finite-dimensional, then such $u$ can be chosen in (the algebraic tensor product) $H^* \otimes K$.

Proof. Suppose $\sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$ is a Schmidt series for $T$. From the nuclearity of this operator and the "Weierstrass test" (Proposition 2.1.8) it evidently follows that the series $\sum_{n=1}^{\infty} s_n e'^*_n \otimes e''_n$ converges in the Banach space $H^* \widehat\otimes K$ to some element $u$. Hence, $\mathrm{Gr}(u) = T$ and $\|u\|_p \le \sum_{n=1}^{\infty} \|s_n e'^*_n\| \, \|e''_n\| = \sum_{n=1}^{\infty} s_n$. It remains to recall that the Schmidt series of a finite-dimensional operator is a finite sum, and thus the corresponding $u$ belongs to $H^* \otimes K$. ∎

End of the proof of Theorem 5. Combining Lemmas 2 and 3, we see that the set of nuclear operators is precisely the image of the Grothendieck operator, and the nuclear norm of a nuclear operator $T$ is $\inf\{\|u\|_p : u \in H^* \widehat\otimes K, \ \mathrm{Gr}(u) = T\}$. According to Theorem 2.7.7, this means that in the context of Hilbert spaces the two definitions of nuclear operators (in Section 2.7 and in this section), and the two definitions of nuclear norm, are equivalent. Taking into account Theorem 2.7.7, all the statements of Theorem 5 immediately follow, except the last one.

Let $\mathrm{Gr}^0$ be the corestriction of the Grothendieck operator to its image. We know from Theorem 2.7.7 and Corollary 2.7.2 that $\mathrm{Gr}^0 : (H^* \widehat\otimes K, \|\cdot\|_p) \to (\mathcal{N}(H, K), \|\cdot\|_N)$ is a coisometric operator between Banach spaces. Evidently, it maps the algebraic tensor product $H^* \otimes K$ onto the space of finite-dimensional operators $\mathcal{F}(H, K)$, and the corresponding birestriction is just the operator $\mathrm{Gr}_0^0$ from Proposition 2.7.2.

However, now, contrary to the case of Banach spaces, we can go further. The last statement of Lemma 3 together with Proposition 2.7.2 guarantees that for every finite-dimensional $T$ there is a unique $u \in H^* \otimes K$ such that $\mathrm{Gr}(u) = T$ and $\|u\|_p \le \|T\|_N$. Since the converse inequality is already known, this means that $\mathrm{Gr}_0^0$ is an isometric isomorphism. Further, from each of the equivalent definitions of the nuclear norm it follows immediately that $\mathcal{F}(H, K)$ is dense in $(\mathcal{N}(H, K), \|\cdot\|_N)$. Thus, $\mathrm{Gr}^0$ restricts to an isometric isomorphism between dense subspaces in $H^* \widehat\otimes K$ and $(\mathcal{N}(H, K), \|\cdot\|_N)$. It remains to use the version of the extension-by-continuity principle given in Proposition 2.1.10. ∎

Theorem 5 implies, in particular, that by assigning to a nuclear operator $T$ the number $\mathrm{ttr}(u)$, where $u \in H^* \widehat\otimes H$ is such that $\mathrm{Gr}(u) = T$, we obtain a well-defined bounded functional on $(\mathcal{N}(H), \|\cdot\|_N)$, namely the one which was called the operator trace in Section 2.7. The following proposition shows that this definition of the operator trace is equivalent to Definition 8 in Section 2.7.
Proposition 9. Suppose $H$ is a separable Hilbert space with orthonormal basis $e_1, e_2, \dots$, $u$ is an element of $H^* \widehat\otimes H$, and $T := \mathrm{Gr}(u)$. Then $\mathrm{ttr}(u) = \sum_{k=1}^{\infty} \langle Te_k, e_k \rangle$.

Proof. If $T = \sum_{n=1}^{\infty} s_n e'_n \bigcirc e''_n$, then $T = \mathrm{Gr}(u)$, where $u = \sum_{n=1}^{\infty} s_n e'^*_n \otimes e''_n$. Hence, $\mathrm{ttr}(u) = \sum_{n=1}^{\infty} s_n \langle e''_n, e'_n \rangle$, and it remains to use Proposition 6. ∎
In conclusion we note that the reader who has mastered the "adjoint associativity law" (see Exercise 2.7.5) can do the following

Exercise 2. Prove statement (ii) of the Schatten-von Neumann theorem (Theorem 4).

Hint. The mentioned law provides an isometric isomorphism $\mathcal{B}(K, H^{**}) \to (K \widehat\otimes H^*)^*$ taking an operator $S$ to the functional $f_S : y \otimes e^* \mapsto [S(y)](e^*)$. Identifying $H^{**}$ and $H$, we see that the latter number is $\mathrm{tr}(e \bigcirc Sy) = \mathrm{tr}(SR)$, where $R := e \bigcirc y$. After interchanging the factors and applying Theorem 5 we obtain an isometric isomorphism $\mathcal{B}(K, H) \to \mathcal{N}(H, K)^*$ taking an operator $T$ to the functional $f_T$ such that $f_T(R) = \mathrm{tr}(TR)$ at least for one-dimensional operators $R$.
5. Fredholm operators and the index
From what we have already said about compact operators you may get the impression that they are "minor" operators, holding in functional analysis approximately the same position as finite-dimensional operators do in pure algebra. In mathematics it is often instructive, whenever you have something "minor", to take a quotient by this "minor" and to see what remains. To formalize this vague idea in our concrete situation, we introduce the category $\mathrm{Ban}/\mathcal{K}$, where the objects are (as in Ban) Banach spaces, but the morphisms between objects $E$ and $F$ are not the bounded operators between these Banach spaces, but their cosets modulo the subspace of compact operators. (Thus, $h_{\mathrm{Ban}/\mathcal{K}}(E, F) := \mathcal{B}(E, F)/\mathcal{K}(E, F)$.) The composition of morphisms (i.e., cosets) $S + \mathcal{K}(F, G)$ and $T + \mathcal{K}(E, F)$ is the morphism (coset) $ST + \mathcal{K}(E, G)$. (Clearly, the latter coset does not depend on the choice of representatives in the corresponding cosets: if $S'$ and $T'$ are other representatives of these cosets, then the operator $ST - S'T' = (S - S')T + S'(T - T')$ is compact by Propositions 3.4 and 3.5, and therefore $ST + \mathcal{K}(E, G) = S'T' + \mathcal{K}(E, G)$.) The axioms of a category are easily verified; in particular, the local identity of the object $E$ is the coset $\mathbf{1}_E + \mathcal{K}(E)$. Certainly, the zero morphisms in this new category are precisely the spaces of compact operators. But how can we characterize the cosets that are isomorphisms? Informally, they consist of operators that are "farthest away from compact ones", so to say "anti-compact" operators. The current section is devoted to these operators. This, however, will not become clear immediately, and we use a roundabout way.
Proposition 1. Suppose $S : E \to F$ is a bounded operator between Banach spaces, and $\mathrm{Im}(S)$ has finite codimension in $F$. Then $\mathrm{Im}(S)$ is closed.

Proof. Since $\mathrm{Im}(S) = \mathrm{Im}(\widetilde{S})$, where $\widetilde{S} : E/\mathrm{Ker}(S) \to F$ is the operator generated by the operator $S$ (see Section 1.5), we can assume without loss of generality that $S$ is injective. Denote by $F_S$ an arbitrary linear complement of $\mathrm{Im}(S)$ in $F$. By the assumption, it is finite-dimensional, and thus (Theorem 2.1.1(i)) is a Banach space. Let $E \oplus F_S$ be the Banach direct sum of the indicated spaces (see Section 2.1), and $R : E \oplus F_S \to F$ the operator that coincides with $S$ on $E$ and with the identity operator on $F_S$. Clearly, $R$ is bounded and bijective; therefore, by the Banach theorem, it is a topological isomorphism. Hence $R$ maps the closed subspace $E$ in $E \oplus F_S$ onto a closed subspace in $F$. The latter is just $\mathrm{Im}(S)$. ∎

Now we introduce the main notion of this section.
Definition 1. An operator $S : E \to F$ between Banach spaces is called an abstract Fredholm operator or just a Fredholm operator if its kernel has finite dimension, and its image has finite codimension. (Thus, not only the kernel, but also the image of a Fredholm operator is automatically closed.) The integer $\dim \mathrm{Ker}(S) - \mathrm{codim}_F \, \mathrm{Im}(S)$ is called the index of our Fredholm operator, and is denoted by $\mathrm{Ind}(S)$.
Forestalling the possible perplexity of the reader, we should immediately say that the index of a Fredholm operator is a more important and deeper characteristic than the numbers $\dim \mathrm{Ker}(S)$ and $\mathrm{codim}_F \, \mathrm{Im}(S)$ themselves, though they look "more geometrical". Contrary to the latter numbers, the index behaves more regularly under composition of operators and has stability properties. Of course, we will explain this later.

Definition 1 immediately implies that two topologically equivalent operators are either both Fredholm or both not. Here are the first examples. If the operator $S$ acts on a finite-dimensional space $E$ (so that it is Fredholm, of course), then, as one knows from a course of linear algebra, $\dim \mathrm{Ker}(S) + \dim \mathrm{Im}(S) = \dim E$, and therefore $\mathrm{Ind}(S) = 0$. Every topological isomorphism between Banach spaces is also Fredholm and has zero index; every bounded projection of finite codimension has the same property. The operators of left and right shift in $l_2$ (Example 1.3.3) are Fredholm, and the first has index 1, whereas the second has index $-1$.

Exercise 1 (leaning on the course of ordinary differential equations). The differential operator $x \mapsto x^{(n)} + f_1(t) x^{(n-1)} + \cdots + f_n(t) x$; $f_1, \dots, f_n \in C[a, b]$ from $C^n[a, b]$ to $C[a, b]$ has index $n$. At the same time the derivative $x \mapsto x'$ from $C^1(\mathbb{T})$ to $C(\mathbb{T})$ has index 0.

Remark. Apparently, the Fredholm operators that are most important in analysis and geometry arise in the study of the so-called elliptic pseudodifferential operators on compact manifolds. Such operators act between Banach spaces obtained from the spaces of sections of vector bundles over manifolds upon completing them in some special ("Sobolev") norms. The theory of such operators has now grown into a voluminous science, far beyond the scope of this book. Working with these operators requires, in addition to Sobolev spaces, knowledge of the theory of vector bundles and algebraic topology. Some of the results, in particular, the famous Atiyah-Singer index theorem, are presented in [57] and [105].
Speaking about counterexamples, let us emphasize the following result.
Proposition 2. In the case of infinite-dimensional $E$ or $F$ a compact operator $T : E \to F$ is never Fredholm.

Proof. Let $T$ be Fredholm. Suppose $F$ is infinite-dimensional. Then, by Proposition 1, the corestriction of $T$ to its image is a surjective operator onto an infinite-dimensional Banach space. Hence, by the open mapping principle and Theorem 2.2 (Riesz), $T(B_E)$ is not totally bounded. Hence, $T$ is not compact. Thus, $F$ is finite-dimensional. But then from the fact that $\mathrm{Ker}(T)$ is finite-dimensional it evidently follows that $E$ is finite-dimensional. ∎

Now let us take a pair of operators from our list of examples.

Exercise 2. A diagonal operator $T_\lambda : l_p \to l_p$; $1 \le p \le \infty$ (Example 1.3.2) is Fredholm $\iff$ zero is not a limit point of the sequence $\lambda = (\lambda_1, \lambda_2, \dots)$. If it is Fredholm, its index is always equal to zero.

Hint. If $T_\lambda$ is Fredholm, then the same is true for its birestriction to the subspace generated by an arbitrary family of unit vectors. Hence, none of these birestrictions can be a compact operator.
Exercise 3. The multiplication operator $T_f : L_p[a, b] \to L_p[a, b]$; $1 \le p \le \infty$, where $f \in L_\infty[a, b]$ (Example 1.3.4), is Fredholm $\iff$ for some $\theta > 0$ we have $|f| > \theta$ almost everywhere on $[a, b]$. (In other words, $T_f$ is Fredholm $\iff$ it is invertible.)
Hint. If the measure of the set $Z := \{t \in [a, b] : f(t) = 0\}$ is positive, then $\operatorname{Ker}(T_f)$ coincides with $\{g : g = 0 \text{ outside of } Z\}$ and is infinite-dimensional. If, on the contrary, $\mu(Z) = 0$, then $T_f$ is injective. If in addition it is Fredholm, then it is topologically injective and, as a corollary, invertible (Exercise 1.4.4).

Let us go back to the general theory. One of the most important properties of the index is as follows.
Theorem 1 (multiplicative property of index). Let $S : E \to F$ and $R : F \to G$ be Fredholm operators. Then $RS : E \to G$ is also Fredholm, and
$$\operatorname{Ind}(RS) = \operatorname{Ind}(R) + \operatorname{Ind}(S).$$

Proof. Consider the subspace $F_0 := \operatorname{Im}(S) \cap \operatorname{Ker}(R)$ in $F$. It is finite-dimensional. Clearly, $\operatorname{Ker}(RS) = \{x \in E : Sx \in \operatorname{Ker}(R)\}$, and $S$ maps $\operatorname{Ker}(RS)$ onto $F_0$. Moreover, the kernel of the corresponding birestriction $S_0^0 : \operatorname{Ker}(RS) \to F_0$ is certainly $\operatorname{Ker}(S)$. Hence $\dim \operatorname{Ker}(RS) = \dim \operatorname{Ker}(S) + \dim F_0 < \infty$.

Now denote by $F^0$ an arbitrary linear complement of the algebraic sum $\operatorname{Im}(S) + \operatorname{Ker}(R)$ in $F$; it is also finite-dimensional. Evidently, $\operatorname{Im}(RS)$ (or, which is the same, $R(\operatorname{Im}(S))$) coincides with $R(\operatorname{Im}(S) + \operatorname{Ker}(R))$, and, consequently, $\operatorname{Im}(R)$ is the algebraic sum $\operatorname{Im}(RS) + R(F^0)$. But if some $z \in G$ has the form $RSx$; $x \in E$, and at the same time the form $Ry$; $y \in F^0$, then $Sx - y \in \operatorname{Ker}(R)$, and therefore $y \in (\operatorname{Im}(S) + \operatorname{Ker}(R)) \cap F^0$. Hence $y = 0$, and thus $z = 0$. Therefore, $\operatorname{Im}(R)$ is in fact the direct sum $\operatorname{Im}(RS) \oplus R(F^0)$. Taking into account that $\operatorname{Ker}(R) \cap F^0 = \{0\}$ and, as a corollary, the operator $R|_{F^0}$ is injective, we obtain $\operatorname{codim}_G \operatorname{Im}(RS) = \operatorname{codim}_G \operatorname{Im}(R) + \dim F^0 < \infty$. We conclude that $RS$ is a Fredholm operator, and
$$\operatorname{Ind}(RS) = (\dim \operatorname{Ker}(S) + \dim F_0) - (\operatorname{codim}_G \operatorname{Im}(R) + \dim F^0).$$
Now denote by $F_1$ an arbitrary (necessarily finite-dimensional) linear complement of $F_0$ in $\operatorname{Ker}(R)$. Then $\dim \operatorname{Ker}(R) = \dim F_0 + \dim F_1$, and at the same time from the obvious coincidence of $\operatorname{Im}(S) + \operatorname{Ker}(R)$ and $\operatorname{Im}(S) \oplus F_1$ it follows that $\operatorname{codim}_F \operatorname{Im}(S) = \dim F^0 + \dim F_1$. The rest is clear. ∎

We now pass to a very important special class of Fredholm operators. It originates from the classical paper of 1903 by the Swedish mathematician E. Fredholm [58]. With considerable simplification and "modernization", we can say that this work established that an operator of the form $1 - T$, where $T$ is an integral operator, is a Fredholm operator, and its index is 0. Certainly, at that time Fredholm himself had heard nothing about any operators, and argued in terms of the so-called integral equations of the second kind (to be introduced later in this section).
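The final step of the proof of Theorem 1, left to the reader as "the rest is clear", can be written out as follows. This is a sketch in the notation of that proof: $F_0 = \operatorname{Im}(S) \cap \operatorname{Ker}(R)$, $F^0$ is a linear complement of $\operatorname{Im}(S) + \operatorname{Ker}(R)$ in $F$, and $F_1$ is a linear complement of $F_0$ in $\operatorname{Ker}(R)$. It uses only the three dimension counts established there.

```latex
\begin{align*}
\operatorname{Ind}(R) + \operatorname{Ind}(S)
 &= \bigl(\dim\operatorname{Ker}(R) - \operatorname{codim}_G\operatorname{Im}(R)\bigr)
  + \bigl(\dim\operatorname{Ker}(S) - \operatorname{codim}_F\operatorname{Im}(S)\bigr) \\
 &= (\dim F_0 + \dim F_1) - \operatorname{codim}_G\operatorname{Im}(R)
  + \dim\operatorname{Ker}(S) - (\dim F^0 + \dim F_1) \\
 &= (\dim\operatorname{Ker}(S) + \dim F_0)
  - (\operatorname{codim}_G\operatorname{Im}(R) + \dim F^0) \\
 &= \operatorname{Ind}(RS).
\end{align*}
```

The auxiliary dimension $\dim F_1$ cancels, which is why the choice of the complement $F_1$ does not matter.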
Later (and this is an achievement of Riesz and Schauder, who are already known to the reader) it was found that the real reason for this phenomenon is not the concrete form of the integral operator, but the fact that (as we know from Theorem 3.1) it is compact.

In the following two theorems, $S : E \to E$ is an operator on a Banach space of the form $S = 1 - T$, where $T$ is a compact operator. (As one says, $S$ is a compact perturbation of the identity operator.)

Theorem 2. $S$ is a Fredholm operator.

Proof. For every $x \in \operatorname{Ker}(S)$ we have $x = Tx$. Consequently, $\operatorname{Ker}(S)$ is an invariant subspace for $T$, and $T|_{\operatorname{Ker}(S)}$ is the identity operator and at the same time a compact operator (together with $T$). Hence, from Proposition 3.3 it follows that $\operatorname{Ker}(S)$ is finite-dimensional.

Now we show first that $\operatorname{Im}(S)$ is closed, and second that it has finite codimension. Suppose a sequence $y_n = Sx'_n$; $x'_n \in E$ tends to some $y \in E$. Take a closed linear complement $E_S$ of $\operatorname{Ker}(S)$ in $E$ (which exists by Proposition 1.6.3) and the projection $P : E \to E$ onto $E_S$ along $\operatorname{Ker}(S)$ (bounded, in view of Corollary 2.4.3). Then we see that for the vectors $x_n := Px'_n \in E_S$ the sequence $Sx_n$ also tends to $y$.

Let us show that the set $\{x_n;\ n \in \mathbb{N}\}$ is bounded. Suppose this is not the case. Then, taking a subsequence if necessary, we can assume that $\|x_n\| \to \infty$ as $n \to \infty$. Put $x''_n := x_n / \|x_n\|$. Then, due to the boundedness of the convergent sequence $Sx_n$, the sequence $Sx''_n = x''_n - Tx''_n$ tends to zero. But $T$ is compact, hence (Proposition 2.4(ii)) $Tx''_n$ contains a fundamental subsequence, which converges by the completeness of $E$. Taking, if necessary, another subsequence, we can assume that $Tx''_n$ tends to some $z \in E$. Then $x''_n$ tends to the same $z$. Hence $Sz = 0$, and at the same time $z \in E_S$. Therefore, $z \in \operatorname{Ker}(S) \cap E_S = \{0\}$. On the other hand, $\|z\| = \lim_n \|x''_n\| = 1$, a contradiction.

Using again the compactness of the operator $T$, we see that the sequence $Tx_n$ contains a convergent subsequence, and again we can assume that $Tx_n$ converges in $E$; let $z'$ be its limit. Then the sequence $x_n = y_n + Tx_n$ also converges, namely, to $y + z'$. Therefore, $Sx_n$ tends to $S(y + z')$, thus the latter vector is $y$. This shows that $\operatorname{Im}(S)$ is closed.

Finally, consider the normed (by Proposition 1.1.2) space $\tilde{E} := E / \operatorname{Im}(S)$. Since $\operatorname{Im}(S)$ is an invariant subspace for $S$, hence for $T$ as well, these operators generate the operators $\tilde{S}, \tilde{T} : \tilde{E} \to \tilde{E}$. From the commutative diagrams in the definition of these operators (see Proposition 1.5.2), it evidently follows that for every coset $\tilde{x} \in \tilde{E}$ we have $\tilde{S}\tilde{x} = \tilde{x} - \tilde{T}\tilde{x}$ and at the same time $\tilde{S}\tilde{x} = 0$; therefore $\tilde{T}$ is the identity operator on $\tilde{E}$. But the natural projection $\mathrm{pr} : E \to \tilde{E}$ figuring in these diagrams is (as was noted in Section 1.5) a coisometric operator; therefore
$$(\mathrm{pr} \circ T)(B_E^0) = (\tilde{T} \circ \mathrm{pr})(B_E^0) = \tilde{T}(B_{\tilde{E}}^0) = B_{\tilde{E}}^0.$$
Since the operator $\mathrm{pr}\, T$ is compact (together with $T$; see Proposition 3.5), this means that $B_{\tilde{E}}^0$, and consequently $B_{\tilde{E}}$, are totally bounded sets. Applying Theorem 2.2 (Riesz), we see that $\dim \tilde{E} < \infty$. In other words, $\operatorname{codim}_E \operatorname{Im}(S) < \infty$. The rest is clear. ∎

Theorem 3 (Fredholm alternative⁴). The operator $S$ is injective $\iff$ it is surjective.
(Thus, our operator faces the following alternative: either its image fills all of $E$, or its kernel is non-zero.)

Proof. $\Longrightarrow$. Suppose $\operatorname{Ker}(S) = 0$, but $\operatorname{Im}(S) \ne E$. Put $E_n := \operatorname{Im}(S^n)$; $n = 1, 2, \dots$. Then, obviously, for all $n$ we have $E_{n+1} = S(E_n) \subseteq E_n$, and every $E_n$ is an invariant subspace of the operator $S$, and consequently of $T$ as well. Take an arbitrary $x \in E \setminus E_1$. Since $S$ is injective, $Sx$ does not belong to the set of vectors of the form $Sy$; $y \in E_1$. This means that $E_2 \ne E_1$. The same argument applied to a vector in $E_1 \setminus E_2$ shows that $E_3 \ne E_2$. Similarly, $E_4 \ne E_3$, and eventually, $E_{n+1}$ is a proper subspace in $E_n$ for every $n$.

Let $S_n, T_n : E_n \to E_n$ be the birestrictions of the operators $S$ and $T$ to $E_n$. Evidently, $S_n = 1_{E_n} - T_n$, and $T_n$ is compact together with $T$. By Theorem 2, $E_1$ is closed, hence is a Banach space. Therefore, $S_1$ satisfies the assumptions of Theorem 2, and this guarantees that $E_2$ is closed. Continuing these arguments, we establish that all $E_n$ are closed.

Now, using the lemma about near-perpendicular, we take in each $E_n$ a vector $x_n$ such that $\|x_n\| = 1$ and $d(x_n, E_{n+1}) > 1/2$. Then, obviously, $Tx_n - Tx_{n+1} = x_n - z$, where $z = Sx_n + x_{n+1} - Sx_{n+1} \in E_{n+1}$. Therefore $d(Tx_n, Tx_{n+1}) > 1/2$ (and, by the same computation with $x_m$ in place of $x_{n+1}$, $d(Tx_n, Tx_m) > 1/2$ for all $m > n$), so, by Proposition 2.4(iii), the set $\{Tx_n;\ n \in \mathbb{N}\}$ is not totally bounded, and we obtain a contradiction with the fact that $T$ is compact.

$\Longleftarrow$. Now assume that $\operatorname{Im}(S) = E$, but $\operatorname{Ker}(S) \ne 0$. Put $E^n := \operatorname{Ker}(S^n)$; $n = 1, 2, \dots$. Then for all $n$ we have $E^{n+1} = \{x \in E : Sx \in E^n\}$ and $E^n \subseteq E^{n+1}$. Taking $x \in E^1 \setminus \{0\}$ and using the surjectivity of $S$, we see that $x = Sy$ for some $y$, which necessarily lies in $E^2 \setminus E^1$. Similarly, $y = Sz$ for $z \in E^3 \setminus E^2$, and so on. Continuing these arguments, we see that $E^n$ is a proper subspace in $E^{n+1}$ for every $n$. Since $S$ is continuous, the inverse image of every closed set is closed. Successively applying this observation, we establish that all $E^n$ are closed. Using the lemma about near-perpendicular, in every $E^n$; $n \ge 2$ we take a vector $x_n$ such that $\|x_n\| = 1$ and $d(x_n, E^{n-1}) > 1/2$. Then the same arguments as at the end of the $\Longrightarrow$-part give $d(Tx_n, Tx_{n-1}) > 1/2$. Thus, we obtain a contradiction with the compactness of $T$. ∎

⁴ An alternative in mathematics, as well as in real life, means an obligatory choice between two incompatible possibilities (for example, "To be, or not to be?").

Theorem 4. The index of the operator $S$ (which, as we already know, is Fredholm) is zero.

Proof. Let $E^S$ be an arbitrary linear complement of $\operatorname{Im}(S)$ in $E$. Like $\operatorname{Ker}(S)$, it is finite-dimensional, and thus closed. Further, let $E_S$ and $P$ be the same subspace and projection as in the proof of Theorem 2, and $Q := 1_E - P$. Obviously, $Q$ is a projection onto $\operatorname{Ker}(S)$ along $E_S$, which is bounded together with $P$.
Lemma. For every operator $R : \operatorname{Ker}(S) \to E$ the operator $\hat{S} : E \to E : x \mapsto SPx + RQx$ has the form $1_E - \hat{T}$, where $\hat{T}$ is a compact operator.

Proof. Put $\hat{T} := 1_E - \hat{S}$. This operator acts by the formula $x \mapsto x - SPx - RQx$. Thus, taking into account that $x = Px + Qx$, we have $\hat{T} = TP + Q - RQ$. By Proposition 3.5, the operator $TP$ is compact together with $T$, whereas $Q$ and, as can be easily seen, $RQ$ are bounded finite-dimensional operators; thus, they are compact as well. The rest is clear. ∎

End of the proof of Theorem 4. Combining the lemma with the Fredholm alternative, we see that for every $R : \operatorname{Ker}(S) \to E^S$ the corresponding operator $\hat{S}$ is injective $\iff$ it is surjective. But, as one can easily verify, $\operatorname{Ker}(\hat{S}) = \operatorname{Ker}(R)$, and $\operatorname{Im}(\hat{S}) = \operatorname{Im}(S) \oplus \operatorname{Im}(R)$. Hence, if $\operatorname{Ind}(S) > 0$, or, in other words, $\dim \operatorname{Ker}(S) > \dim E^S$, then we can take a surjective but not injective $R$, and we obtain that $\operatorname{Ker}(\hat{S}) \ne 0$, and at the same time $\operatorname{Im}(\hat{S}) = E$. On the other hand, if $\operatorname{Ind}(S) < 0$, or, in other words, $\dim \operatorname{Ker}(S) < \dim E^S$, then, taking an injective but not surjective $R$, we obtain that $\operatorname{Ker}(\hat{S}) = 0$, and at the same time $\operatorname{Im}(\hat{S}) \ne \operatorname{Im}(S) \oplus E^S = E$. In both cases we obtain a contradiction. So the only possibility is that $\operatorname{Ind}(S) = 0$. ∎
The following exercise generalizes the above theorem.

Exercise 4. Suppose an operator $S : E \to F$ between Banach spaces has the form $I - T$, where $I$ is a topological isomorphism, and $T$ is compact. Then $S$ is a Fredholm operator, and $\operatorname{Ind}(S) = 0$.
Hint. The operator $\tilde{S} := I^{-1}S$ is a compact perturbation of the identity operator. It has the same kernel as $S$, and $I$ is an isomorphism between the images of these operators.

One of the most intriguing problems of the geometry of Banach spaces, open as of now,⁵ is as follows: are there Banach spaces $E$ "so pathological" that $\mathcal{B}(E) = \operatorname{span}(1, \mathcal{K}(E))$? By the theorems proved above, every operator acting on such a hypothetical space is either compact or Fredholm of index 0.

⁵ May 24, 2001.

None of the concrete Banach spaces mentioned in this book has this property. To make sure of this, it is sufficient to find a Fredholm operator on a given space with a non-zero index, or an operator which is neither compact nor Fredholm. In the role of the latter we can take a projection for which both the kernel and the image are infinite-dimensional. In our concrete spaces we have plenty of such projections. The difficulty in this problem is that, as we already know, there are Banach spaces so complicated that they have no projections with that property. In other words (see Corollary 2.4.3), they do not have closed infinite-dimensional subspaces with an infinite-dimensional closed linear complement (!). About this group of problems, see, e.g., [59].
Now let us pay homage to those questions of analysis that eventually led to Theorems 2-4. The equation of the form

(1)    $x(s) - \int_a^b K(s, t)x(t)\,dt = y(s),$

where $K \in L_2([a, b] \times [a, b])$ and $y \in L_2[a, b]$ are given functions and $x \in L_2[a, b]$ is an unknown function, is called (since the 19th century) an integral equation of the second kind.⁶ (Such equations are also considered in function spaces other than $L_2[a, b]$, but we restrict ourselves to the latter spaces.) If $y = 0$, then the equation is called homogeneous; certainly, it has the form

(2)    $x(s) - \int_a^b K(s, t)x(t)\,dt = 0.$

⁶ Integral equations of the first kind are equations of the form $\int_a^b K(s, t)x(t)\,dt = y(s)$. They are more complicated, and we will not discuss them.

From the algebraic point of view the considered equation is a special case of an operator equation. This is the name for an equation of the form $Sx = y$, where $S : E \to F$ is an operator between two linear spaces, $y$ is a given vector in $F$, and $x$ is an unknown vector in $E$. For $y = 0$ such an equation is called a homogeneous operator equation. Certainly, in the case we consider here we have $E = F = L_2[a, b]$, and $S = 1 - T$, where $T$ is the integral operator on $L_2[a, b]$ with kernel $K(s, t)$.

When one considers operator equations in general, and integral equations of the second kind in particular, two typical questions arise:
1) For which $y$ in the right-hand side does our equation have a solution, and if it does, what is the description of the set of all solutions?
2) What is the set of solutions of the homogeneous equation; in particular, does the homogeneous equation have a non-zero solution?

Obviously, both questions have adequate translations into the geometrical language of the theory of operators:
1) What is the image of $S$, and what is the full inverse image of every vector $y$?
2) What is the kernel of $S$, and, in particular, is $S$ an injective operator?

These two questions are closely related to each other: obviously, if $y \in \operatorname{Im}(S)$ and $y = Sx$ for some $x \in E$, then the full inverse image of $y$ (i.e., the set of all solutions of the corresponding operator equation with $y$ as the right-hand side) is $\{x + z : z \in \operatorname{Ker}(S)\}$.

The following theorem in the theory of integral equations, which we deliberately formulate in an old-fashioned style, is actually a special case of Theorems 2-4.

Theorem 5 (sometimes called the triple Fredholm theorem). Let (1) be an integral equation of the second kind. Then
(i) There is a finite family $x_1(s), \dots, x_m(s)$ of linearly independent solutions of the homogeneous equation (2) such that every solution of (2) is a linear combination of the solutions $x_1(s), \dots, x_m(s)$.
(ii) There is a finite family $z_1(s), \dots, z_n(s)$ of linearly independent functions in $L_2[a, b]$ such that the right-hand sides $y(s)$ for which equation (1) has at least one solution are precisely those functions for which $\int_a^b y(t)z_k(t)\,dt = 0$ for $k = 1, \dots, n$.
(iii) $m = n$ (i.e., the number of solutions in (i) coincides with the number of linearly independent functions in (ii)).
Thus, integral equations of the second kind, being an evident object of infinite-dimensional analysis, behave surprisingly similarly to finite systems of linear equations in which the number of equations equals the number of variables. Major consequences for the whole of mathematics arose from the fact that Hilbert was among the people surprised by Fredholm's paper (cf., e.g., [38, p. 221]).
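A minimal worked instance may help to see the three statements of Theorem 5 at once. The example is ours, not the author's: we assume $[a, b] = [0, 1]$ and the constant kernel $K(s, t) \equiv \lambda$ with a real parameter $\lambda$, and we write $c := \int_0^1 x(t)\,dt$ (a name introduced only for this sketch). Integrating equation (1) over $[0, 1]$ gives a scalar condition on $c$.

```latex
\begin{align*}
&x(s) - \lambda \int_0^1 x(t)\,dt = y(s)
  \quad\Longrightarrow\quad
  (1 - \lambda)\,c = \int_0^1 y(t)\,dt,
  \qquad c := \int_0^1 x(t)\,dt. \\[4pt]
&\lambda \ne 1:\quad
  x(s) = y(s) + \frac{\lambda}{1 - \lambda}\int_0^1 y(t)\,dt
  \ \text{ is the unique solution, so } m = n = 0. \\[4pt]
&\lambda = 1:\quad
  \text{(1) is solvable} \iff \int_0^1 y(t)\,dt = 0,
  \ \text{ and then } x(s) = y(s) + c,\ c \in \mathbb{C}; \\
&\phantom{\lambda = 1:\quad}
  x_1(s) \equiv 1,\quad z_1(s) \equiv 1,\quad m = n = 1.
\end{align*}
```

In both cases the number of independent solutions of the homogeneous equation equals the number of solvability conditions, exactly as in a square system of linear equations.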
Proof. In the Hilbert space $H := L_2[a, b]$, consider the operator $S := 1 - T$, where $T$ is the integral operator with kernel $K(s, t)$. Clearly, part (i) is equivalent to the statement that the kernel of $S$ is finite-dimensional; therefore, this is a direct corollary of Theorems 2 and 3.1. Clearly, $m = \dim \operatorname{Ker}(S)$.

Let us now "modify" part (ii). We do this in several steps. The first equivalent formulation is obvious:
- There is a linearly independent system of vectors $z_1, \dots, z_n$ in $H$ such that $\operatorname{Im}(S) = (\operatorname{span}\{z_1, \dots, z_n\})^\perp$.
Since the subspace in the right-hand side of the latter equality is closed, by Proposition 2.3.4 we can express this as follows:
- The image of the operator $S$ is closed, and its orthogonal complement is finite-dimensional.
Finally, establishing the equality $\dim \operatorname{Im}(S)^\perp = \dim H / \operatorname{Im}(S)$, Proposition 2.3.6 shows that, if we take into account Proposition 1, the latter statement is equivalent to the fact that $\operatorname{codim}_H \operatorname{Im}(S)$ is finite. In this case the indicated codimension is precisely $n$. Hence part (ii) also follows from Theorems 2 and 3.1, and (iii) follows from these theorems together with Theorem 4. ∎

Remark. In fact, Fredholm made another important contribution, anticipating a valuable fact we present in Exercise 8. He connected the solution of integral equation (1) with the solution of the so-called adjoint integral equation, where the kernel $K(s, t)$ is replaced by $K^*(s, t) := K(t, s)$. Later we will speak about this in the general context of Hilbert adjoint operators; see Proposition 6.1.10.

We now recall the category $\mathrm{Ban}/\mathcal{K}$ introduced at the very beginning of this section: we will keep the promise we gave there.
Theorem 6 (S. M. Nikol'skii). Let $S : E \to F$ be an operator between Banach spaces. Then $S$ is a Fredholm operator $\iff$ there exists a bounded operator $R : F \to E$ such that $RS = 1_E - T_1$, where $T_1 \in \mathcal{K}(E)$, and $SR = 1_F - T_2$, where $T_2 \in \mathcal{K}(F)$. (In other words, $S$ is Fredholm $\iff$ its coset $S + \mathcal{K}(E, F)$ is an isomorphism in $\mathrm{Ban}/\mathcal{K}$.) Moreover, the operator $R$ can be chosen in such a way that $T_1$ and $T_2$ are finite-dimensional.

Proof. $\Longrightarrow$. Suppose $E_S$ is a closed linear complement of $\operatorname{Ker}(S)$ in $E$ (cf. the proof of Theorem 2), $P_1$ is a projection onto $E_S$ along $\operatorname{Ker}(S)$, $Q_1 := 1_E - P_1$, $F^S$ is a linear complement of $\operatorname{Im}(S)$ in $F$, $P_2$ is a projection onto $\operatorname{Im}(S)$ along $F^S$, and $Q_2 := 1_F - P_2$. All these projections (cf. the same proof) are bounded. Denote by $S_0^0 : E_S \to \operatorname{Im}(S)$ the corresponding birestriction of the operator $S$. It is a bijective bounded operator, and since $\operatorname{Im}(S)$ is closed, by the Banach theorem $S_0^0$ is a topological isomorphism. Put $R : F \to E : x \mapsto (S_0^0)^{-1}P_2x$. Clearly, $RS = P_1$ and $SR = P_2$; thus $RS = 1_E - Q_1$ and $SR = 1_F - Q_2$. Since $Q_1$ and $Q_2$ are finite-dimensional operators, we obtain the implication $\Longrightarrow$ together with the last statement of the theorem.

$\Longleftarrow$. According to Theorem 2, $RS$ and $SR$ are Fredholm operators. Consequently, $\operatorname{Ker}(RS)$ (containing $\operatorname{Ker}(S)$) is finite-dimensional. At the same time $\operatorname{Im}(SR)$ (contained in $\operatorname{Im}(S)$) has finite codimension in $F$. The rest is clear. ∎
Exercise 5*. Let $\mathrm{Ban}/\mathcal{F}$ be the category defined similarly to $\mathrm{Ban}/\mathcal{K}$, but taking the quotient modulo finite-dimensional operators instead of compact ones. Then for an operator $S$ between Banach spaces $E$ and $F$, its coset $S + \mathcal{K}(E, F)$ is an isomorphism in $\mathrm{Ban}/\mathcal{K}$ (i.e., $S$ is a Fredholm operator) $\iff$ the coset $S + \mathcal{F}(E, F)$ is an isomorphism in $\mathrm{Ban}/\mathcal{F}$.

(We see that although there are many more compact operators than finite-dimensional ones, the property of "being an isomorphism up to compact perturbations" is equivalent to the property of "being an isomorphism up to finite-dimensional perturbations".)
The Nikol'skii theorem allows us to expose one more (in addition to Theorem 1) advantage of the index compared to $\dim \operatorname{Ker}(S)$ and $\operatorname{codim}_F \operatorname{Im}(S)$.

Proposition 3 (Stability of the index under compact perturbations). If $S$ is a Fredholm operator and $T$ a compact operator between Banach spaces $E$ and $F$, then $S + T$ is also a Fredholm operator and $\operatorname{Ind}(S + T) = \operatorname{Ind}(S)$.

Proof. According to the $\Longrightarrow$-part of the previous theorem, there are operators $R$, $T_1$, and $T_2$ with the indicated properties. Then $R(S + T) = 1_E - T_1 + RT$ and $(S + T)R = 1_F - T_2 + TR$. From this, taking into account the $\Longleftarrow$-part of the previous theorem, it evidently follows that $S + T$ is a Fredholm operator. Further, from the very same theorem, where $R$ is used as the initial operator, it follows that $R$ is Fredholm as well. Combining Theorem 1 with the fact that the operators $RS$ and $R(S + T)$ satisfy the hypotheses of Theorem 4, we see that $\operatorname{Ind}(S) = -\operatorname{Ind}(R) = \operatorname{Ind}(S + T)$. ∎

Thus, the subset in $\mathcal{B}(E, F)$ consisting of Fredholm operators contains together with each $S$ the whole coset $S + \mathcal{K}(E, F)$, and the index is constant on this coset. In Section 5.3, where topological properties of the operator composition will be discussed, we will come across another stability property of Fredholm operators, this time with respect to perturbations that are small in the operator norm. There we will find some further facts on the structure of the set of Fredholm operators.
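The index computation at the end of the proof of Proposition 3 can be spelled out as the following chain. This is a sketch in the notation of that proof ($R$, $T_1$ as in Theorem 6); it uses only Theorem 1 and Theorem 4, and the observation that $T_1 - RT$ is compact as a combination of compact operators.

```latex
\begin{align*}
R(S + T) &= 1_E - (T_1 - RT), \qquad T_1 - RT \in \mathcal{K}(E), \\
0 = \operatorname{Ind}(RS) &= \operatorname{Ind}(R) + \operatorname{Ind}(S), \\
0 = \operatorname{Ind}\bigl(R(S + T)\bigr) &= \operatorname{Ind}(R) + \operatorname{Ind}(S + T), \\
\operatorname{Ind}(S + T) &= -\operatorname{Ind}(R) = \operatorname{Ind}(S).
\end{align*}
```

The operator $R$ serves only as an intermediary: its index cancels when the second and third lines are compared.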
Exercise 6. A Fredholm operator has index 0 $\iff$ it is represented as a sum of a topological isomorphism and a compact (and even a finite-dimensional) operator.
Hint. The required finite-dimensional operator $T$ isomorphically maps $\operatorname{Ker}(S)$ onto a linear complement of $\operatorname{Im}(S)$.
We suggest that the reader who has learned about exact sequences (see Section 2.5) translate the definition of a Fredholm operator into the language of "Banach homological algebra".

Exercise 7°. An operator $S : E \to F$ is Fredholm $\iff$ in the category Ban there is an exact sequence

(3)    $\cdots \leftarrow 0 \leftarrow C \leftarrow F \xleftarrow{\ S\ } E \leftarrow K \leftarrow 0 \leftarrow \cdots$

with finite-dimensional $C$ and $K$.

Here is the concrete use of this point of view.
Exercise 8. If $S : E \to F$ is a Fredholm operator, then $S^* : F^* \to E^*$ has the same property, and $\dim \operatorname{Ker}(S^*) = \operatorname{codim}_F \operatorname{Im}(S)$, $\operatorname{codim}_{E^*} \operatorname{Im}(S^*) = \dim \operatorname{Ker}(S)$. As a corollary, $\operatorname{Ind}(S^*) = -\operatorname{Ind}(S)$.
Hint. This is a simple corollary of Theorem 2.5.1.
Exercise 9*. The converse is also true: if $S^*$ is a Fredholm operator, then the same is true for $S$.

Hint. The difficulty is in establishing that $\operatorname{Im}(S)$ is closed: once this is done, we can apply Theorem 2.5.1 to the exact sequence of the form (3), where $K := \operatorname{Ker}(S)$ and $C := F / \operatorname{Im}(S)$. Since $S^{**}$ is a Fredholm operator (see above) and in view of the properties of the canonical embedding $\alpha$, it is sufficient to show that $S^{**}(E) = S^{**}(E_1)$ is closed, where $E_1$ is a closed linear complement of $E \cap K$ in $E$. Take $x_n \in E_1$ such that $S^{**}(x_n)$ tends to $y \in F^{**}$. Suppose $E_0$ is a closed linear complement of $K$ in $E^{**}$, $P$ is a projection onto $E_0$ along $K$, and $Q := 1 - P$. If $x_n$ is bounded, everything is fine: we can assume that $Qx_n$ tends to some $z$ in the (finite-dimensional!) space $K$. At the same time, since $S^{**}Px_n \to y$; $n \to \infty$, and $\operatorname{Im}(S^{**})$ is closed, the Banach theorem shows that $Px_n$ converges in $E_0$ to some $z'$. Therefore $x_n$ converges in $E_1$, and $y = S^{**}(\lim_{n \to \infty} x_n)$. If, on the contrary, the norms of $x_n$ increase without bound, and $x'_n := x_n / \|x_n\|$, then $S^{**}x'_n$ (i.e., $S^{**}Px'_n$) tends to zero, and the same is true for $Px'_n$. Taking into account that $\dim K < \infty$, we can assume that $Qx'_n$ tends to some $u \in K$, and hence $x'_n$ tends to the same $u$. We have a contradiction with the choice of $E_1$. (Perhaps you will be able to find a simpler proof.)
Chapter 4

Polynormed Spaces, Weak Topologies, and Generalized Functions

1. Polynormed spaces

Up to now it was sufficient to consider only one norm (or a prenorm) on a linear space. We used this structure to describe a wide class of various types of convergence in analysis, like uniform convergence, mean convergence, mean-square convergence, etc. But there are many other natural important types of convergence that cannot be described in this way. Here are some examples.

Example 1. Consider the linear space $C^\infty[a, b]$ of infinitely smooth (= infinitely differentiable) functions on an interval $[a, b]$. The so-called classical convergence in $C^\infty[a, b]$ is defined as follows: a sequence $x_n$ tends to $x$ "classically" if for every $k = 0, 1, \dots$ the sequence of $k$th derivatives $x_n^{(k)}$ converges to $x^{(k)}$ uniformly on $[a, b]$. (As usual, we define the 0th derivative of $x$ as $x$ itself: $x^{(0)} = x$.) Note that, informally, from this convergence one can derive several types of convergence in the theory of generalized functions (we shall see this soon) and in differential geometry.

Example 2. Let $\mathcal{O}(\mathbb{D}^0)$ be the linear space of holomorphic functions on the open unit disk $\mathbb{D}^0$ in the complex plane. Consider the so-called Weierstrass convergence in $\mathcal{O}(\mathbb{D}^0)$: a sequence $w_n$ tends to $w$ in the sense of Weierstrass if for any subset $K \subset \mathbb{D}^0$ closed in $\mathbb{C}$ the sequence of the restrictions $w_n|_K$ tends to $w|_K$ uniformly on $K$. This is the standard convergence used in complex analysis.

Example 3. In the linear space $c_\infty$ (of all sequences) consider the coordinatewise (or simple) convergence: $\xi^{(n)}$ tends to $\xi$ if for every $k$ the sequence $\xi_k^{(n)}$ tends to $\xi_k$.
Suppose that the class of convergent sequences in a linear space $E$ is given. Then we say that a prenorm $\|\cdot\|$ on $E$ generates this convergence if for any sequence $x_n$ in $E$ the following is true: $x_n$ converges to $x$ in $E$ in the announced sense $\iff$ $x_n$ converges to $x$ in the prenormed space $(E, \|\cdot\|)$.

Exercise 1.
(i) There is no prenorm in $C^\infty[a, b]$ generating the classical convergence.
(ii) There is no prenorm in $\mathcal{O}(\mathbb{D}^0)$ generating the Weierstrass convergence.
(iii) There is no prenorm in $c_\infty$ generating the coordinatewise convergence.
Hint. In all these situations the limit is unique. So it is sufficient to prove that the convergence cannot be generated by a norm (instead of a prenorm).
(i) If $x_n$ tends to zero classically, then the same is true for $x'_n$. So for our hypothetical norm, $\|x_n\| \to 0$ implies $\|x'_n\| \to 0$. But look at the sequence
$$x_n(t) := \frac{e^{nt}}{n\,\|e^{nt}\|}.$$
(ii) Similar arguments work in complex analysis: if $w_n$ converges in the sense of Weierstrass, then the same is true for $w'_n$.
(iii) Take $p^n / \|p^n\|$, where $p^n$ is the $n$th unit vector.

However, these three types of convergence, as well as many others that are not generated by one (pre)norm, can be adequately described using a structure of the "next level of complexity": polynormed spaces. What this actually means is as follows.

Definition 1. A linear space $E$ equipped with a family of prenorms $\|\cdot\|_\nu$; $\nu \in \Lambda$, or, to be more precise, a pair $(E, \|\cdot\|_\nu;\ \nu \in \Lambda)$ consisting of a linear space $E$ and a family of prenorms $\|\cdot\|_\nu$; $\nu \in \Lambda$ on it, is called a polynormed space.¹

(Of course, it would be more accurate to say "polyprenormed" instead of "polynormed", but let us have pity on our language.)

¹ This notion is related to the notion of locally convex space that can be found in many textbooks. This connection will be considered later, in one of the remarks.
If $(E, \|\cdot\|_\nu;\ \nu \in \Lambda)$ is a polynormed space, then any prenorm on $E$ of the form $\max\{\|\cdot\|_{\nu_1}, \dots, \|\cdot\|_{\nu_n}\}$, where $\nu_1, \dots, \nu_n$ is a finite set of indices from $\Lambda$, will be called an accompanying prenorm (for the family $\|\cdot\|_\nu$; $\nu \in \Lambda$). Each space $(E, \|\cdot\|)$, where $\|\cdot\|$ is an accompanying prenorm, is called an accompanying prenormed space (for the polynormed space $(E, \|\cdot\|_\nu;\ \nu \in \Lambda)$). Of course, (pre)normed spaces are special cases of polynormed spaces (they correspond to the family consisting of only one prenorm). If the family of prenorms is countable, i.e., $\Lambda = \mathbb{N}$, we say that the space is countably normed.
If $E$ is equipped with a family of prenorms, then the restrictions of these prenorms to a subspace $F$ in $E$ endow $F$ with the structure of a polynormed space; $F$ is then called a polynormed subspace in $E$. Now, following the reader's expectations, we collect a list of examples.

Example 4. Take an arbitrary linear space $E$ and equip it with all existing prenorms on $E$ (i.e., all the functions satisfying conditions (i) and (ii) of Definition 1.1.1). Such polynormed spaces are called the strongest. (As we shall see soon, their role in functional analysis resembles the role of discrete spaces in topology.)
Example 5. The space $C^\infty[a, b]$ in Example 1 becomes polynormed when equipped with the family of norms $\|\cdot\|_n$, where $\|x\|_n := \max\{|x^{(k)}(t)|;\ k = 0, \dots, n;\ a \le t \le b\}$. (Thus, the $n$th norm here is precisely the norm of $C^\infty[a, b]$ as a subspace in $C^n[a, b]$.) We should note that this example is extremely important for the future theory of generalized functions: it plays the role of an "embryo" of standard spaces of test functions.

Example 6. The space $\mathcal{O}(U)$ of holomorphic functions on a domain $U$ in the complex plane becomes polynormed if equipped with the family of prenorms $\|\cdot\|_K$; $K \in \Lambda$, where $\Lambda$ is the family of all closed bounded subsets in $U$, and $\|w\|_K := \max\{|w(z)|;\ z \in K\}$. (By the way, which of these prenorms are norms?) These spaces and their multidimensional analogues are the polynormed spaces considered in complex analysis.

Example 7. The space $c_\infty$ becomes polynormed if equipped with the family of prenorms $\|\cdot\|_n$, where $\|\xi\|_n := |\xi_n|$.
Example 8. Consider the linear space $\mathcal{B}(E, F)$, where $E$ and $F$ are prenormed spaces. In Proposition 1.3.2 we endowed it with the standard prenorm, but right now let us forget about it and make this space polynormed in the following way. Take $\Lambda := E$ and define a family of prenorms $\|\cdot\|_x$; $x \in E$ on $\mathcal{B}(E, F)$ by putting $\|T\|_x$ equal to $\|T(x)\|$, the prenorm of $T(x)$ in $F$. This family of prenorms in $\mathcal{B}(E, F)$ is called the strong-operator family of prenorms and is denoted by so.
Remark. The special case of this polynormed space when $F = \mathbb{C}$ defines a family of prenorms on the dual space $E^*$. It plays an important role in the theory of weak topologies and generalized functions; this will be discussed in detail in subsequent sections of this chapter.
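To see that the strong-operator family of Example 8 carries different information from the standard operator norm, consider the following sketch. The setting is our own choice for illustration and is not taken from the text: we assume $E = F = l_2$, and $P_n$ (a name introduced here) denotes the projection onto the first $n$ coordinates, while $e_{n+1}$ is the unit vector with 1 in position $n + 1$.

```latex
\begin{align*}
P_n(\xi_1, \xi_2, \dots) &:= (\xi_1, \dots, \xi_n, 0, 0, \dots), \\
\|1 - P_n\|_x = \|(1 - P_n)x\|
  &= \Bigl(\sum_{k > n} |\xi_k|^2\Bigr)^{1/2} \longrightarrow 0
  \quad (n \to \infty) \ \text{ for every } x = (\xi_1, \xi_2, \dots) \in l_2, \\
\|1 - P_n\| &= 1 \ \text{ for all } n,
  \ \text{ since } (1 - P_n)e_{n+1} = e_{n+1}.
\end{align*}
```

Thus every prenorm of the family so tends to zero along the sequence $1 - P_n$, whereas the standard operator norm of these operators stays equal to 1.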
Example 9. The same space $\mathcal{B}(E, F)$ can be made polynormed in a different way. Namely, take $\Lambda := E \times F^*$ and equip $\mathcal{B}(E, F)$ with the family of prenorms $\|\cdot\|_{x,f}$; $x \in E$, $f \in F^*$ by putting $\|T\|_{x,f}$ equal to $|f(T(x))|$. This family of prenorms in $\mathcal{B}(E, F)$ is called the weak-operator family of prenorms and is denoted by wo. The case where $E = F = H$ is a Hilbert space plays a special role here. In this case the canonical bijection between $H$ and $H^*$ allows us to consider the index set $\Lambda := H \times H$ and the family of prenorms $\|T\|_{x,y} := |\langle Tx, y \rangle|$.
Remark. Both families of prenorms in B(H) , so and w o , are indispensable in the theory of operator algebras. In particular, they participate in the definition of one of the most important classes of operator algebras, the so-called von Neumann algebras ( see Definition 6.3.5 below ) . In addition, these families will be useful in Sections 6.6-6.8 in questions related to the spectral theorem.
Other important examples of polynormed spaces will be considered later in this chapter.

As we know, every (pre)normed space is automatically (pre)metric, and thus a topological space. As for polynormed spaces, we shall describe a natural way to turn them into topological spaces. (As we shall see later, these topological spaces are not necessarily metrizable.) This procedure resembles the topologization of metric spaces by indicating open balls.

Let $(E, \|\cdot\|_\nu;\ \nu \in \Lambda)$ be a polynormed space. Take $r > 0$, $x \in E$ and choose a finite set of indices $\nu_1, \dots, \nu_n$. With every such system we associate the set
$$U_{x,\nu_1,\dots,\nu_n,r} := \{y \in E : \|y - x\|_{\nu_k} < r;\ k = 1, \dots, n\},$$
i.e., an open ball of radius $r$ centered at $x$ in the accompanying prenormed space $(E, \max\{\|\cdot\|_{\nu_1}, \dots, \|\cdot\|_{\nu_n}\})$. Every such set is called a standard open ball in $E$. Note that $U_{x,\nu_1,\dots,\nu_n,r} = \bigcap_{k=1}^n U_{x,\nu_k,r}$. Further, if $M$ is a subset in $E$, we call a point $x \in M$ an interior point of this set if it is contained in $M$ together with some standard open ball $U_{x,\nu_1,\dots,\nu_n,r}$. Finally, a set $U$ in $E$ is said to be open if every point of $U$ is interior. As you have already guessed, this terminology is justified by the following:

Proposition 1. The class of open subsets in $E$ defines a topology. This topology contains all the open sets of every accompanying space for $E$.
Proof. From the evident inclusion
$U_{x,\nu_1,\dots,\nu_n,r} \cap U_{x,\mu_1,\dots,\mu_m,s} \supseteq U_{x,\nu_1,\dots,\nu_n,\mu_1,\dots,\mu_m,t}$
(where all the indices belong to $\Lambda$ and $t := \min\{r, s\}$) we see that the intersection of every two (and thus, of every finite number) of open subsets is open. This establishes condition (iii) from Definition 0.2.1. The rest is clear. •
Remark. It is easy to see that the described topology can be given also by each of the following equivalent definitions:
1) It is the only topology for which the family of standard open sets $U_{x,\nu,r}$ is a subbase (see Proposition 0.2.1).
2) It is the weakest topology for which all the prenorms $\|\cdot\|_\nu$, $\nu \in \Lambda$ are continuous.
3) It is the weakest topology containing all open sets of all accompanying spaces.
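The inclusion used in the proof of Proposition 1 can be checked directly from the definition of a standard open ball. A short verification, written in the notation of that proof, might read as follows.

```latex
% Let y belong to the ball built on all the indices, with radius t = min{r, s}.
\begin{align*}
y \in U_{x,\nu_1,\dots,\nu_n,\mu_1,\dots,\mu_m,t}
  &\;\Longrightarrow\; \|y-x\|_{\nu_k} < t \le r \quad (k=1,\dots,n)
   \ \text{ and }\ \|y-x\|_{\mu_j} < t \le s \quad (j=1,\dots,m) \\
  &\;\Longrightarrow\; y \in U_{x,\nu_1,\dots,\nu_n,r} \cap U_{x,\mu_1,\dots,\mu_m,s}.
\end{align*}
```

Consequently, if a point is interior to two sets, the ball on the right-hand side of the inclusion witnesses that it is interior to their intersection.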
The topology defined in this way is called the topology generated by the family of prenorms $\|\cdot\|_\nu$; $\nu \in \Lambda$. In particular, the topology in $B(E, F)$ (and thus in $B(H)$) generated by the strong or weak family of operator prenorms is called, respectively, the strong- or weak-operator topology and is denoted by the same symbol so or wo. Let us formulate the following obvious result.
Proposition 2. In a polynormed space $E$ the following holds:
(i) the addition $E \times E \to E : (x, y) \mapsto x + y$ is a continuous map of topological spaces;
(ii) the multiplication by scalars $\mathbb{C} \times E \to E : (\lambda, x) \mapsto \lambda x$ is a continuous map of topological spaces;
(iii) if $U$ is an open set in $E$, then for every $M \subseteq E$ the algebraic sum $U + M$ is open in $E$; in particular, for any $x \in E$ the shift $U + x$ is open in $E$;
(iv) if $U$ is an open set in $E$, then for any $\lambda \in \mathbb{C}$, $\lambda \neq 0$, the dilation $\lambda U$ is open in $E$. •
(In fact, the last two properties follow from the first two, but we shall not need it.) From this already we can see that not every topology on a linear space is generated by a family of prenorms. So it would be timely to ask, how can these topologies be characterized? We have already considered a similar question for prenormed spaces, and the information obtained there will be useful now.
Exercise 2. A topology defined on a linear space $E$ is generated by a family of prenorms $\iff$ there exists a family of subsets in $E$ satisfying the following conditions:
(i) every set $U$ in this family is convex, balanced, and contains a nonzero vector from each one-dimensional subspace in $E$;
(ii) the corresponding system of sets $U_{x,t} := \{x + ty : y \in U\}$ (where $x \in E$, $t > 0$, and $U$ runs through our family) is a basis of our topology.
Hint. Consider Minkowski's functionals of these sets (see Exercise 1.1.6).
Remark. A linear space equipped with a topology satisfying these two conditions is called locally convex. This term is standard in the literature, but we shall not use it. The difference between the notions of polynormed and locally convex space is of the same sort as the difference between metric and metrizable spaces. Locally convex spaces are special cases of topological vector spaces, the most general class of linear spaces with reasonable topologies. This is the name for a space with a topology such that the algebraic operations are jointly continuous, i.e., satisfy the conditions (i) and (ii) of Proposition 2. The overwhelming majority of topological vector spaces in analysis are locally convex (i.e., "polynormed"). Nevertheless, there are some exceptions. For example, the convergence in measure in the space of measurable functions (see Exercise 1.6.10) is the convergence with respect to some metric, and the corresponding topology provides an example of a topological vector space that is not locally convex.
After we defined topology in polynormed spaces, we can discuss a series of standard questions about this topology. We shall see that every topological property we consider has an adequate and sufficiently transparent description in terms of prenorms. Let us start with the notion of convergence.
Proposition 3. A sequence $x_n$ tends to $x$ in a polynormed space $(E, \|\cdot\|_\nu; \nu \in \Lambda)$ $\iff$ for any $\nu \in \Lambda$ we have $\|x - x_n\|_\nu \to 0$ as $n \to \infty$ (in other words, $x_n$ tends to $x$ in every accompanying prenormed space).
Proof. $\Rightarrow$. For any $\nu \in \Lambda$ and $\varepsilon > 0$ we have $x_n \in U_{x,\nu,\varepsilon}$ for sufficiently large $n$, i.e., $\|x - x_n\|_\nu \to 0$ as $n \to \infty$.
$\Leftarrow$. Take a neighborhood $U_x$ of $x$. It contains a standard ball $U_{x,\nu_1,\dots,\nu_m,r}$. For every $k = 1, \dots, m$, $\|x - x_n\|_{\nu_k} \to 0$ as $n \to \infty$. Hence, for sufficiently large $n$ we have $\|x - x_n\|_{\nu_k} < r$ for every $k = 1, \dots, m$. This means that $x_n$ lies in $U_{x,\nu_1,\dots,\nu_m,r}$ and hence in $U_x$. •
In particular, we see that the convergence of a sequence in the polynormed space $C^\infty[a, b]$ is precisely the classical convergence. The convergence in $\mathcal{O}(U)$ is the uniform convergence on any compact set (this type of convergence is called the Weierstrass convergence, by analogy with the case of $U = \mathbb{D}^0$). The convergence in $c_\infty$ is the coordinatewise convergence.
Thus, we failed when trying to use a single prenorm (see Exercise 1), but succeeded with a family of prenorms.
Recall that the pointwise convergence in $C[a, b]$ cannot be defined by a metric, to say nothing about a norm (Exercise 0.2.1). But this type of convergence can be adequately described in terms of a polynormed space. Indeed, if $E$ is an arbitrary space of functions on a set $X$, then the pointwise convergence in $E$ is obviously the convergence in the polynormed space $(E, \|\cdot\|_t; t \in X)$, where $\|x\|_t := |x(t)|$.
In the polynormed spaces $(B(E, F), \mathrm{so})$ and $(B(E, F), \mathrm{wo})$ (Examples 8 and 9) the statement that $T_n$ tends to $T$ means precisely that for every $x \in E$ (respectively, for every $x \in E$, $f \in F^*$) $T_n(x)$ tends to $T(x)$ (respectively, $f(T_n x)$ tends to $f(Tx)$). In the first case we again obtain the pointwise (or, maybe we should say here "vectorwise") convergence. In addition, note that the convergence of $T_n$ to $T$ in $(B(H), \mathrm{wo})$, where $H$ is a Hilbert space with a chosen orthonormal basis, implies the convergence of every matrix entry of $T_n$ to the corresponding matrix entry of $T$.
As for the strongest spaces (Example 4), the convergence in these spaces can be described in purely algebraic terms:
Exercise 3*. Suppose $E$ is a strongest polynormed space, $x_n$ is a sequence in $E$, and $x$ is a vector. Then $x_n$ tends to $x$ $\iff$ the following two conditions are fulfilled:
(i) all vectors $x_n$ and $x$ belong to a finite-dimensional subspace $F$ of $E$;
(ii) for some (and hence, for each) basis $e_k$; $k = 1, \dots, m$ in $F$ in the corresponding expansions $x_n = \sum_{k=1}^m \lambda_k^{(n)} e_k$, $x = \sum_{k=1}^m \lambda_k e_k$ we have $\lambda_k^{(n)} \to \lambda_k$ ($n \to \infty$) for every $k$.
Hint. If the vectors $y_n := x - x_n$ are linearly independent, then we can extend this system to a linear basis in $E$ (Exercise 0.1.2). After that we can construct a prenorm (and even a norm) in $E$ such that $\|y_n\| = 1$ for all $n$.
The reader knowing what a convergent net is may notice that in Proposition 3 sequences (i.e., nets with the domain $\mathbb{N}$) can be replaced with arbitrary nets. Our proof remains true up to the obvious changes. (Nevertheless, give an accurate proof.) But the result of Exercise 3 essentially uses the fact that we deal with sequences, and it cannot be extended to arbitrary nets. (Give a counterexample.)
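To illustrate the earlier remark that pointwise convergence is convergence in the polynormed space $(E, \|\cdot\|_t; t \in X)$ but need not come from a norm, here is a small worked example. The functions $x_n$ are chosen for illustration only and do not come from the text; the space is taken to be $C[0,1]$.

```latex
\begin{align*}
x_n(t) &:= n t e^{-nt}, \qquad t \in [0,1], \\
\|x_n\|_t &= |x_n(t)| = n t e^{-nt} \longrightarrow 0 \quad (n \to \infty)
   \quad \text{for every fixed } t \in [0,1], \\
\max_{t \in [0,1]} |x_n(t)| &= x_n(1/n) = e^{-1} \quad \text{for all } n \ge 1 .
\end{align*}
```

Thus $x_n$ tends to $0$ with respect to every prenorm $\|\cdot\|_t$, while its uniform norm stays equal to $e^{-1}$, so the sequence does not converge to $0$ uniformly.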
Proposition 4. A polynormed space $(E, \|\cdot\|_\nu; \nu \in \Lambda)$ is Hausdorff as a topological space $\iff$ for every $x \in E$, $x \neq 0$ there exists $\nu \in \Lambda$ such that $\|x\|_\nu > 0$. (In other words, we can distinguish elements of $E$ by prenorms $\|x\|_\nu$.)
Proof. $\Leftarrow$. Let $y$ and $z$ be two different vectors in $E$. Take $x := y - z$ and choose the index $\nu$ indicated in the statement. Then clearly the neighborhoods $U_{y,\nu,r}$ and $U_{z,\nu,r}$ of $y$ and $z$ are disjoint for $r < \|x\|_\nu / 2$.
$\Rightarrow$. Take $x \in E \setminus \{0\}$. We use the fact that $0$ has a neighborhood that does not contain $x$. Then $x$ is not contained in some standard ball, say, $U_{0,\nu_1,\dots,\nu_n,r}$. But this means precisely that for some $k$; $1 \le k \le n$ we have $\|x\|_{\nu_k} \ge r$. •
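The word "clearly" in the first half of the proof of Proposition 4 hides one application of the triangle inequality. A sketch of that step, with $w$ a hypothetical common point of the two balls, could be written as:

```latex
% Suppose w lies in both U_{y,\nu,r} and U_{z,\nu,r}, where x = y - z and r < \|x\|_\nu / 2.
\[
\|x\|_\nu = \|y - z\|_\nu \le \|y - w\|_\nu + \|w - z\|_\nu < r + r < \|x\|_\nu ,
\]
% which is impossible; hence the two balls have no common point.
```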
Now we see that all the spaces in our examples are Hausdorff; the only possible exceptions are $(B(E, F), \mathrm{so})$ and $(B(E, F), \mathrm{wo})$: they are Hausdorff $\iff$ $F$ is normed.
Which operators are compatible with the structure of a polynormed space? The following theorem answers this question. You are already prepared to perceive it since Theorem 1.4.1 gives the answer in the special case where spaces are prenormed (see also Proposition 1.3.1).
Theorem 1. The following properties of an operator $T$ between polynormed spaces $(E, \|\cdot\|_\mu; \mu \in \Gamma)$ and $(F, \|\cdot\|_\nu; \nu \in \Lambda)$ are equivalent:
(i) for each $\nu \in \Lambda$ there exists an accompanying prenorm $\|\cdot\|'$ on $E$ and a constant $C > 0$ such that for every $x \in E$ we have $\|T(x)\|_\nu \le C\|x\|'$ (in other words, $T$ is bounded as an operator between prenormed spaces $(E, \|\cdot\|')$ and $(F, \|\cdot\|_\nu)$);
(ii) for each accompanying prenorm $\|\cdot\|$ in $F$ there exists an accompanying prenorm $\|\cdot\|'$ on $E$ and a constant $C > 0$ such that for every $x \in E$ we have $\|T(x)\| \le C\|x\|'$ (in other words, $T$ is bounded as an operator acting between prenormed spaces $(E, \|\cdot\|')$ and $(F, \|\cdot\|)$);
(iii) $T$ is continuous at zero (with respect to the topologies generated by the given families of prenorms);
(iv) $T$ is continuous (with respect to the same topologies).
Proof. (i) $\Rightarrow$ (ii). Put $\|\cdot\| = \max\{\|\cdot\|_{\nu_k}; k = 1, \dots, n\}$. Then for every $k$ there exists an accompanying prenorm $\|\cdot\|'_k$ in $E$ and $C_k > 0$ such that for every $x \in E$ we have $\|T(x)\|_{\nu_k} \le C_k\|x\|'_k$. Clearly, the assertion is true for the accompanying prenorm $\|\cdot\|' := \max\{\|\cdot\|'_k; k = 1, \dots, n\}$ and $C := \max\{C_k; k = 1, \dots, n\}$.
(ii) $\Rightarrow$ (iv). Take $x \in E$. We must show that $T$ is continuous at $x$. Let $U$ be a neighborhood of the point $T(x)$ in $F$. It contains a standard open ball $U_r^0 = \{y \in F : \|y - Tx\| < r\}$ for some accompanying prenorm $\|\cdot\|$ in $F$ and some $r > 0$. By hypothesis, there exists an accompanying prenorm $\|\cdot\|'$ in $E$ such that $T$ is bounded as an operator between $(E, \|\cdot\|')$ and $(F, \|\cdot\|)$. Consequently (Theorem 1.4.1), $T$ is continuous. But then there exists a neighborhood $V$ of $x$ in the space $(E, \|\cdot\|')$ such that $T(V)$ lies in $U_r^0$ and hence in $U$. It remains to note that $V$ is an open set in the polynormed space $(E, \|\cdot\|_\mu; \mu \in \Gamma)$.
The implication (iv) $\Rightarrow$ (iii) is obvious.
(iii) $\Rightarrow$ (i). Take $\nu \in \Lambda$ and a neighborhood $U_{0,\nu,1}$ of zero in $F$. By hypothesis, there exists a neighborhood $W$ of zero in $E$ such that $T(W) \subseteq U_{0,\nu,1}$. We know that $W$ contains a standard open ball $W_\delta^0 = \{x \in E : \|x\|' < \delta\}$, where $\|\cdot\|'$ is some accompanying prenorm in $E$. Thus $T$, regarded as an operator between prenormed spaces $(E, \|\cdot\|')$ and $(F, \|\cdot\|_\nu)$, maps some dilation of the open unit ball to a bounded set. Obviously, such an operator must be bounded. •
Now we can answer the question concerning the comparison of different topologies generated on the same linear space by two different systems of prenorms. We have already discussed this when studying prenormed spaces, and the information we have obtained there suggests the answer.
Definition 2 (cf. Definition 1.4.2). A family of prenorms $\|\cdot\|_\mu$; $\mu \in \Gamma$ in a linear space $E$ majorizes a prenorm $\|\cdot\|$ in $E$ if there is a finite set of indices $\mu_1, \dots, \mu_n$ such that the accompanying prenorm $\max\{\|\cdot\|_{\mu_1}, \dots, \|\cdot\|_{\mu_n}\}$ majorizes $\|\cdot\|$. Furthermore, we say that a family of prenorms majorizes another family of prenorms if the first family majorizes every prenorm in the second family. Finally, we say that two families of prenorms are equivalent if each of them majorizes the other.
Here is an instructive example.
Exercise 4. In the space $B(E, F)$, where $E$ and $F$ are prenormed spaces, the family consisting of one operator prenorm majorizes the strong-operator family, and the strong-operator family of prenorms majorizes the weak-operator family. If $E$ and $F$ are infinite-dimensional normed spaces, then no two of these three families of prenorms are equivalent.
From Definition 2 we see that if we add to a given family of prenorms the maxima of all its finite subfamilies, then the new family of prenorms is equivalent to the initial one. Note that any finite family of prenorms $\|\cdot\|_k$; $k = 1, \dots, n$ is equivalent to one prenorm, namely to their maximum.
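The two majorization claims in the first part of Exercise 4 reduce to one-line estimates. The sketch below assumes that the strong-operator prenorms are $\|T\|_x := \|T(x)\|$ (as in the examples referred to above) and that $F^*$ consists of bounded functionals, so that $\|f\|$ is finite.

```latex
\begin{align*}
\|T\|_x &= \|T(x)\| \le \|x\|\,\|T\| && (x \in E), \\
\|T\|_{x,f} &= |f(T(x))| \le \|f\|\,\|T(x)\| = \|f\|\,\|T\|_x && (x \in E,\ f \in F^*).
\end{align*}
```

In both lines a single prenorm of the stronger family, multiplied by a constant depending only on the index, dominates the given prenorm of the weaker family, which is what Definition 2 asks for.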
Proposition 5. Suppose we have two families of prenorms on a linear space. Then the topology generated by the first family is not weaker than the topology generated by the second family $\iff$ the first family majorizes the second. As a corollary, two families of prenorms generate the same topology $\iff$ they are equivalent.
Proof. The proof of Proposition 1.4.5 can be transferred with obvious changes from prenormed to polynormed spaces. •
Now we can easily answer the question of when the topology of a polynormed space can be generated by just one prenorm (in this case the space is called prenormable). Moreover, with some additional effort, we can answer the question of when the topology can be defined by a premetric that in general is not generated by one prenorm. (Of course, the prefix "pre-" can be omitted if the initial family of prenorms satisfies the hypothesis of Proposition 4.)
Proposition 6. A polynormed space is prenormable $\iff$ its family of prenorms is equivalent to a finite subfamily.
Proof. As we have said before, every finite family of prenorms is equivalent to one prenorm (namely, their maximum). This immediately implies the sufficiency. The necessity follows from the previous proposition and from the following obvious observation: if a family consisting of one prenorm is equivalent to some family of prenorms, then the first family is equivalent to a finite subfamily of the second family. •
Corollary 1. A countably normed space $(E, \|\cdot\|_n; n \in \mathbb{N})$ such that $\|\cdot\|_{n+1}$ majorizes $\|\cdot\|_n$, but these prenorms are not equivalent, cannot be prenormable.
Of course, $C^\infty[a, b]$ is an obvious example of a countably normed space with this property, and the same is true for $c_\infty$ after replacing $\|\cdot\|_n$ by $\max\{\|\cdot\|_1, \dots, \|\cdot\|_n\}$.
Exercise 5. The polynormed space $\mathcal{O}(U)$ is not prenormable.
Hint. The family of prenorms in $\mathcal{O}(U)$ is equivalent to the family $\{\|\cdot\|_{K_n}; n = 1, 2, \dots\}$, where $K_1 \subset K_2 \subset \cdots$ is a sequence of compact sets in $U$ such that $U = \bigcup_{n=1}^\infty K_n$.
The prenormability criterion can be reformulated in the following terms. We call a subset in a polynormed space bounded if it is bounded in every accompanying prenormed space. (In our exposition we shall not use this notion frequently, but it comes to the forefront if we go sufficiently deep into the theory of topological vector spaces; see [60] or [61].)
Exercise 6. A polynormed space is prenormable $\iff$ it contains a bounded open set.
Now let us discuss the case of finite-dimensional spaces. We have seen that every two norms in such a space are equivalent (Corollary 2.1.3(ii)). The following statement is a generalization of this fact to the polynormed spaces.
Exercise 7. Every two families of prenorms generating Hausdorff topologies on a finite-dimensional space are equivalent. In particular, every such family of prenorms is equivalent to some (and thus, any) norm.
Hint. The main step is to show that each "Hausdorff" family of prenorms $\|\cdot\|_\nu$; $\nu \in \Lambda$ majorizes any norm $\|\cdot\|$ in $E$. Consider the sphere $S := \{x \in E : \|x\| = 1\}$ with topology generated by the norm, and for any $x \in S$ take the index $\nu(x) \in \Lambda$ such that $\|x\|_{\nu(x)} > 0$. Since $E$ is finite-dimensional, every norm always majorizes every prenorm. Hence there exists an open covering $\{U_x : x \in S\}$ of our sphere such that $\|y\|_{\nu(x)} > 0$ for all $y \in U_x$. But the sphere is compact (we again use that $E$ is finite-dimensional). If we take a finite subcovering $\{U_{x_1}, \dots, U_{x_n}\}$, we obtain that the prenorm $\max\{\|\cdot\|_{\nu(x_k)}; k = 1, \dots, n\}$ is a norm.
Now we suggest that you work with the condition of premetrizability.
Exercise 8*. A polynormed space is premetrizable $\iff$ the family of prenorms is equivalent to a countable subfamily.
Hint. $\Leftarrow$. If $(E, \|\cdot\|_n; n \in \mathbb{N})$ is a countably polynormed space, then its topology can be defined by the premetric $d(x, y) := \sum_{n=1}^\infty \frac{\|x - y\|_n}{2^n (1 + \|x - y\|_n)}$. (Notice that although this premetric is invariant under translations, it is not generated by any prenorm. This follows, in particular, from the fact that the diameter of $E$ is not greater than 1.)
$\Rightarrow$. If the topology is generated by a premetric, then the neighborhood of zero $\{x : d(0, x) < \frac{1}{n}\}$ contains a standard open ball centered at zero; this means that to every $n$ we can associate a finite subfamily of our family of prenorms. The union of all these subfamilies over all $n \in \mathbb{N}$ is the countable subfamily we are looking for.
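The properties of the premetric in the hint to Exercise 8 rest on elementary facts about the function $t \mapsto t/(1+t)$. The sketch below assumes that the premetric is the standard one, $d(x,y) = \sum_{n} 2^{-n}\,\|x-y\|_n / (1 + \|x-y\|_n)$; the auxiliary function $\varphi$ is introduced here only for the computation.

```latex
\begin{align*}
\varphi(t) &:= \frac{t}{1+t} = 1 - \frac{1}{1+t}, \quad t \ge 0
  && \text{(increasing, } 0 \le \varphi(t) < 1\text{)}, \\
\varphi(s+t) &= \frac{s}{1+s+t} + \frac{t}{1+s+t} \le \varphi(s) + \varphi(t)
  && \text{(subadditive)}, \\
d(x,y) &= \sum_{n=1}^{\infty} 2^{-n}\,\varphi(\|x-y\|_n) \le \sum_{n=1}^{\infty} 2^{-n} = 1 .
\end{align*}
```

Monotonicity and subadditivity of $\varphi$, combined with the triangle inequality for each $\|\cdot\|_n$, give the triangle inequality for $d$. The bound $d \le 1$ shows that $d(\lambda x, 0) = |\lambda|\, d(x, 0)$ cannot hold for all $\lambda$ unless $d(x, 0) = 0$, which is why $d$ does not come from a prenorm in general.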
Here is an illustration.
Exercise 9°. Polynormed spaces $C^\infty[a, b]$, $\mathcal{O}(U)$, and $c_\infty$ are metrizable. At the same time, infinite-dimensional strongest polynormed spaces are not premetrizable.
Hint. Regarding $\mathcal{O}(U)$, see the hint to Exercise 5. Now suppose $E$ is a strongest space, $e_n$; $n \in \mathbb{N}$ is a linearly independent system of vectors in $E$, and $\|\cdot\|_n$; $n \in \mathbb{N}$ are some prenorms. Then we can extend $e_n$ to a basis of $E$ and construct a prenorm (and even a norm) $\|\cdot\|$ such that for any $n$ we have $\|e_n\| > n \max\{\|e_n\|_1, \dots, \|e_n\|_n\}$.
Now we would like to say a few words about a certain class of polynormed spaces playing an important role in the general theory and applications.
Definition 3. A polynormed space is called a Fréchet space if its topology can be defined by a complete metric that is invariant under shifts (i.e., a metric satisfying the condition $d(x, y) = d(x - y, 0)$ for all $x, y$).
(Verify that $C^\infty[a, b]$ and all other examples of metrizable polynormed spaces mentioned above are Fréchet spaces.)
Fréchet spaces possess many properties typical for Banach spaces. For example, some fundamental results like the Banach theorem on the inverse operator or the Banach-Steinhaus theorem are valid for Fréchet spaces. In general, it is a good idea to view Fréchet spaces as the next reasonable generalization of Banach spaces. For details, see, e.g., [63], [60], and [61].
In fact, the requirement that the metric is invariant with respect to the shifts in Definition 3 can be omitted. This follows from a very deep theorem by V. Klee [103].
* * *
We now return to the operators discussed in Theorem 1.
Remark on terminology. In the context of polynormed spaces, the operators that satisfy the hypotheses of Theorem 1 will be called continuous. As for the term "bounded", we reserve it for operators that preserve bounded sets. Actually, outside the special case of prenormed spaces, not every bounded operator is continuous (see, for example, [60, II.8]).
Here are several examples of continuous operators. Some of them are presented as exercises.
Example 10. Consider the differentiation operator $D$ on the polynormed space $(C^\infty[a, b]; \|\cdot\|_n; n = 1, 2, \dots)$ that assigns to every function its derivative. From the obvious inequality $\|D(x)\|_n \le \|x\|_{n+1}$; $n = 0, 1, \dots$ it follows that this is a continuous operator on $C^\infty[a, b]$. On the other hand, it is useful to note that this operator is not continuous on any of the accompanying spaces of $C^\infty[a, b]$.
Exercise 10. The similarly defined differentiation operator in $\mathcal{O}(U)$ is also continuous.
Hint. From the integral formula $w'(z) = \frac{1}{2\pi i} \int_\gamma \frac{w(\zeta)}{(\zeta - z)^2}\, d\zeta$ it follows that if $K$ and $L$ are closed sets in $U$ and a contour $\gamma$ goes around $K$ and lies in $L$, then for some $C > 0$ we have $\|w'\|_K \le C\|w\|_L$.
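The last claim of Example 10, that $D$ is not continuous on any accompanying space of $C^\infty[a,b]$, can be made concrete. The sketch below is an illustration under an assumption: it takes the prenorms to be $\|x\|_n := \max_{0 \le k \le n} \max_{t \in [a,b]} |x^{(k)}(t)|$, which may differ from the definition given earlier in the book, and the test functions $x_m$ are chosen here, not taken from the text.

```latex
% Fix n and take integers m \ge \max\{1,\ 2\pi/(b-a)\}, so that t \mapsto mt sweeps a full period on [a,b].
\begin{align*}
x_m(t) &:= m^{-(n+1)} \sin(mt), \\
\max_{t \in [a,b]} |x_m^{(k)}(t)| &\le m^{\,k-n-1} \le m^{-1} \quad (0 \le k \le n)
   \quad\Longrightarrow\quad \|x_m\|_n \le m^{-1}, \\
\|D(x_m)\|_n &\ge \max_{t \in [a,b]} |x_m^{(n+1)}(t)| = 1 \ \ge\ m\,\|x_m\|_n .
\end{align*}
```

Since $m$ can be taken arbitrarily large, no constant $C$ satisfies $\|D(x)\|_n \le C\|x\|_n$ for all $x$, so $D$ is unbounded on $(C^\infty[a,b], \|\cdot\|_n)$ under the stated assumption.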
Exercise 11. The operator of multiplication by an infinitely smooth function in $C^\infty[a, b]$ and the operator of multiplication by a holomorphic function in $\mathcal{O}(U)$ are continuous.
Example 11. Take the space $(B(E, F), \mathrm{so})$, choose $x \in E$, and consider the evaluation operator $\mathcal{T}_x : B(E, F) \to F : T \mapsto T(x)$. From the equality $\|\mathcal{T}_x(T)\| = \|T\|_x$ it follows that this operator is continuous. Note that the continuity is preserved after endowing $B(E, F)$ with the operator prenorm. At the same time the same operator from $(B(E, F), \mathrm{wo})$ to $F$ is, generally speaking, not continuous (explain why).
Here is another observation which is important in the spectral theory (see Sections 6.5-6.7) and in the theory of operator algebras. Consider again $B(E, F)$ and take operators $S \in B(E)$ and $R \in B(F)$. Since the composition of continuous operators is continuous, we obtain the maps $M_S : B(E, F) \to B(E, F) : T \mapsto TS$ and ${}_R M : B(E, F) \to B(E, F) : T \mapsto RT$. Clearly, they are linear operators; they are called operators of composition (with $S$ from the left and with $R$ from the right). Proposition 1.3.4 immediately shows that both these operators are continuous with respect to the operator prenorm in $B(E, F)$. Moreover, we have
Proposition 7. (i) Operators $M_S$ and ${}_R M$ are continuous as operators in $(B(E, F), \mathrm{so})$;
(ii) the same is true if we replace $(B(E, F), \mathrm{so})$ with $(B(E, F), \mathrm{wo})$.
Proof. Suppose $\|\cdot\|_x$; $x \in E$ is a prenorm belonging to the strong-operator family and $T \in B(E, F)$. Then $\|M_S(T)\|_x = \|TS(x)\| = \|T\|_{Sx}$ and $\|{}_R M(T)\|_x = \|RT(x)\| \le \|R\|\,\|T\|_x$. If $\|\cdot\|_{x,f}$; $x \in E$, $f \in F^*$ is a prenorm from the weak-operator family, then for the same $T$ we have $\|M_S(T)\|_{x,f} = |f(TSx)| = \|T\|_{Sx,f}$ and $\|{}_R M(T)\|_{x,f} = |f(RTx)| = |(R^* f)(Tx)| = \|T\|_{x,R^* f}$. Thus, in both cases we have the estimates needed in Theorem 1. (Note that instead of a finite family of prenorms now one prenorm fits; in both cases the required inequality turns out to be equality with the constant $C = 1$.) •
We can characterize the strongest polynormed spaces and also the polynormed spaces with zero prenorm (it is natural to call them the weakest polynormed spaces) in terms of operators as follows.
Exercise 12° (cf. Exercise 0.2.5 about discrete and antidiscrete topological spaces).
(i) A polynormed space $E$ is the strongest (up to the equivalent family of prenorms) $\iff$ for any polynormed space $F$ every linear operator from $E$ to $F$ is continuous.
(ii) A polynormed space $E$ is the weakest $\iff$ for any polynormed space $F$ every linear operator from $F$ to $E$ is continuous.
Other instructive examples of continuous operators between polynormed spaces will appear later in connection with weak topologies and generalized functions.
* * *
Now we have two new categories of functional analysis:
(i) The category Pol. Its objects are polynormed spaces and morphisms are continuous operators.
(ii) The category HPol. It is a full subcategory in Pol with Hausdorff polynormed spaces as objects. To avoid possible misunderstanding, we emphasize that a linear space with two different families of prenorms gives rise to two different objects in Pol, even if the two families are equivalent (and consequently, generate the same topology on E) . Obviously, the category Pre described before is a full subcategory in Pol, and Nor is a full subcategory in HPol (and thus, in Pol as well) . Exercise 13 ° . There is a full subcategory in HPol ( and thus, in Pol) which is isomorphic ( see Subsection 0. 7) to Lin: this is the category of strongest polynormed spaces . There is another subcategory in Pol with the same property: this is the category of weakest polynormed spaces.
Thus, "one may consider linear algebra as part of the theory of polynormed spaces."
Note in addition that in HPol there is another important subcategory, Fr, consisting of Fréchet spaces (see Definition 3). Its behavior resembles the behavior of the full subcategory Ban. But the category Fr is beyond the scope of this book.
Clearly, just as in Pre, the isomorphisms in these categories are continuous operators having continuous inverse operators. So we preserve for them the same name: topological isomorphisms. (Categories that generalize the category $\mathrm{Pre}_1$, and consequently the notion of isometric isomorphism, have not appeared in mathematics so far, and it is not likely that they will be needed in the future.)
The set of morphisms between objects $E$ and $F$ of our new categories, i.e., the set of continuous operators between polynormed spaces, will (again as in Pre) be denoted by $B(E, F)$. The set of continuous functionals on $E$ again is denoted by $E^*$.
Proposition 8 (cf. Proposition 1.3.2). The set $B(E, F)$ is a subspace in $\mathcal{L}(E, F)$, and thus a linear space.
Proof. Take $S, T \in B(E, F)$ and choose an accompanying prenorm $\|\cdot\|$ in $F$. By Theorem 1, there exist accompanying prenorms $\|\cdot\|'$ and $\|\cdot\|''$ in $E$ such that $S$ (respectively, $T$) is bounded as an operator from $(E, \|\cdot\|')$ (respectively, $(E, \|\cdot\|'')$) to $(F, \|\cdot\|)$. Hence, both operators are bounded as operators from $(E, \max\{\|\cdot\|', \|\cdot\|''\})$ to $(F, \|\cdot\|)$, and the same is true for their sum. Again by Theorem 1 we have $S + T \in B(E, F)$. We leave to the reader the verification of the fact that the elements in $B(E, F)$ can be multiplied by scalars. •
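The estimates behind the proof of Proposition 8, including the scalar case left to the reader, can be sketched as follows. The constants $C_S$ and $C_T$ are names introduced here for the bounds of $S$ and $T$ supplied by Theorem 1; they do not appear in the text.

```latex
\begin{align*}
\|(S+T)(x)\| &\le \|S(x)\| + \|T(x)\| \le C_S\|x\|' + C_T\|x\|''
   \le (C_S + C_T)\max\{\|x\|', \|x\|''\}, \\
\|(\lambda S)(x)\| &= |\lambda|\,\|S(x)\| \le |\lambda|\,C_S\,\|x\|' \qquad (\lambda \in \mathbb{C}).
\end{align*}
```

Since $\max\{\|\cdot\|', \|\cdot\|''\}$ is again an accompanying prenorm on $E$, both lines have the form required in condition (ii) of Theorem 1. For $\lambda = 0$ any positive constant serves in place of $|\lambda| C_S$.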
The linear space E* will be called dual to E (by analogy with the case when E is prenormed) .
As an optional material (even for the advanced reader), we describe some additional properties of the categories Pol and HPol.
First of all, Proposition 8 shows that we can define morphism functors on these categories with range in Lin. The reader can easily restore the details by analogy with the morphism functors $B(E, ?), B(?, E) : \mathrm{Ban} \to \mathrm{Ban}$ discussed in Section 2.5.
Here you may feel dissatisfied. Why is the set $B(E, F)$ endowed with the structure of linear space only? Can we make it an object of the same category Pol like we did when considering morphisms in Ban? It turns out that there is no "canonical" way to make $B(E, F)$ a polynormed space. Instead, there are many ways to do that, and each has its advantages and disadvantages. We describe just the simplest way (used in the theory of generalized functions); it is inspired by Example 8.
Let $E$ and $(F, \|\cdot\|_\nu; \nu \in \Lambda)$ be polynormed spaces. Take $\Gamma := E \times \Lambda$ and for any pair $(x \in E, \nu \in \Lambda)$ consider the function $\|\cdot\|_{x,\nu} : B(E, F) \to \mathbb{R}_+$, where $\|T\|_{x,\nu} := \|T(x)\|_\nu$. Obviously, this is a prenorm. This family of prenorms and the corresponding topology on $B(E, F)$ are said (following the same example) to be strong-operator. Now you can easily check the following.
For every polynormed space $E$, the same constructions as those used when defining the functors with range in Lin, but now with strong-operator families of prenorms in $B(E, F)$, give covariant and contravariant functors from Pol to Pol (and from HPol to HPol).
We should, however, keep in mind that these functors are not extensions of the functors defined for Ban in Subsection 2.5. This follows from the fact that even if $E$ and $F$ are infinite-dimensional Banach spaces, these functors give non-normed spaces (explain why!).
Many properties of Pre and Nor are preserved when passing to more general categories Pol and HPol. In particular, the notion of topological direct sum is extended from prenormed spaces to polynormed spaces (Definition 1.5.1), as well as related questions of describing (co)retractions (see Exercise 1.5.3). The same is true for the characterization of continuous projections in terms of a decomposition of the space into a topological direct sum (Proposition 1.5.10 and Corollary 1.5.1). Finally, the results on the characterization of mono- and epimorphisms (and their "extreme" generalizations) are carried on literally from Pre to Pol (Proposition 1.5.11), and from Nor to HPol (Exercise 1.5.6). You can easily restore the details.
Here is something more interesting. In some aspects the categories Pol and HPol behave much better than, for example, Ban (and they rather resemble $\mathrm{Ban}_1$). As we know, only finite families of objects in Ban have the (co)product (Exercise 2.5.15). It turns out that in our new categories any family of objects has the product and the coproduct.
Let $E_\nu$; $\nu \in \Lambda$ be a family of polynormed spaces, and $\|\cdot\|_\mu$; $\mu \in \Lambda_\nu$ the family of prenorms on $E_\nu$. Take the linear space $\times\{E_\nu; \nu \in \Lambda\}$ (i.e., the product of our family in Lin) and endow it with the family of prenorms $|\!|\!|\cdot|\!|\!|_\mu^\nu$; $\mu \in \Lambda_\nu$, $\nu \in \Lambda$ putting $|\!|\!|f|\!|\!|_\mu^\nu := \|f(\nu)\|_\mu$.
Exercise 14. The space $\times\{E_\nu; \nu \in \Lambda\}$ endowed with the family of prenorms $|\!|\!|\cdot|\!|\!|_\mu^\nu$; $\mu \in \Lambda_\nu$, $\nu \in \Lambda$ and with the projections $\pi_\nu$ (see Subsection 0.6) is a product of the family $E_\nu$; $\nu \in \Lambda$ in Pol. If, in addition, all our spaces are Hausdorff, then $\times\{E_\nu; \nu \in \Lambda\}$ is a product in HPol.
Now consider the linear space $E := \bigoplus\{E_\nu; \nu \in \Lambda\}$ (i.e., the coproduct of our family in Lin). For every $\nu \in \Lambda$ let us agree to identify $E_\nu$ with the subspace in $E$ consisting of $f$ such that $f(\mu) = 0$ for $\mu \neq \nu$ (see Section 0.6). Further, consider in $E$ the family of all prenorms $\|\cdot\|$ such that the restriction of $\|\cdot\|$ to each $E_\nu$ is majorized by the family of prenorms $\|\cdot\|_\mu$; $\mu \in \Lambda_\nu$ of the space $E_\nu$.
Exercise 15*. The space $\bigoplus\{E_\nu; \nu \in \Lambda\}$ endowed with the family of all prenorms whose restriction to each $E_\nu$ is majorized by the family of prenorms of $E_\nu$, and with the injections $i_\nu$ (see Subsection 0.6), is a coproduct of the family $E_\nu$; $\nu \in \Lambda$ in Pol. Moreover, if all our spaces are Hausdorff, then it is a coproduct in HPol as well.
Example 12. Consider a countable family of copies of $\mathbb{C}$. The product of this family is the polynormed space $c_\infty$ with the projections $\pi_n : c_\infty \to \mathbb{C} : \xi \mapsto \xi_n$, and its coproduct is $c_{00}$ with the strongest family of prenorms and the injections $i_n : \mathbb{C} \to c_{00} : \lambda \mapsto \lambda p^n$ (verify both statements!).
Concluding the discussion of the categories Pol and HPol, we note that they have sufficiently many free objects for the existence of the freedom functor. We can regard these categories as concrete categories (see Subsection 0.7) if for any $E \in$ Pol we take its underlying set as $\square E$.
Exercise 16*. Suppose $\mathcal{K}$ is Pol or HPol, and $E \in \mathrm{Ob}(\mathcal{K})$. Then
(i) $E$ is a free object in $\mathcal{K}$ $\iff$ the family of prenorms of $E$ majorizes any prenorm (in other words, $E$ is the strongest polynormed space up to a topological isomorphism);
(ii) there exists a freedom functor $\mathcal{F} : \mathrm{Set} \to \mathcal{K}$ that associates with every set $S$ the linear space $\mathcal{F}(S)$ (see Example 0.7.4) equipped with the strongest family of prenorms.
2. Weak topologies
In this section we shall concentrate on one special way to define a family of prenorms, namely using appropriate families of functionals. The topologies arising here play an important role in studying general polynormed spaces, and also allow us to better understand objects of classical functional analysis, namely normed spaces.
We formulate, for further references, a special case of Theorem 1.1 dealing with functionals. Until further notice, $(E, \|\cdot\|_\mu; \mu \in \Gamma)$ is a polynormed space.
Theorem 1. The following properties of a functional $f : (E, \|\cdot\|_\mu; \mu \in \Gamma) \to \mathbb{C}$ are equivalent:
(i) there is a finite set of indices $\mu_1, \dots, \mu_n$ and a constant $C > 0$ such that for each $x \in E$ we have $|f(x)| \le C \max\{\|x\|_{\mu_1}, \dots, \|x\|_{\mu_n}\}$ (in other words, $f$ is bounded as a functional on the accompanying prenormed space $(E, \max\{\|\cdot\|_{\mu_1}, \dots, \|\cdot\|_{\mu_n}\})$);
(ii) $f$ is continuous at zero;
(iii) $f$ is continuous. •
We emphasize that for the continuity of a functional on a polynormed space it is necessary and sufficient that this functional is continuous with respect to at least one accompanying prenorm, and not with respect to all of them. (A frequent mistake on exams!)
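A concrete instance of this warning may help. The sketch below is illustrative only: it takes $E = C^\infty[a,b]$ and assumes the prenorms $\|x\|_n := \max_{0 \le k \le n} \max_{t} |x^{(k)}(t)|$ (possibly different from the book's earlier definition), with a hypothetical fixed point $t_0 \in [a,b]$ and test functions $x_m$ chosen here.

```latex
\begin{align*}
f(x) &:= x'(t_0), \qquad |f(x)| \le \|x\|_1
   && \text{(bounded with respect to } \|\cdot\|_1\text{, hence continuous)}, \\
x_m(t) &:= \tfrac{1}{m}\sin\bigl(m(t - t_0)\bigr), \qquad \|x_m\|_0 \le \tfrac{1}{m}, \qquad f(x_m) = \cos 0 = 1
   && \text{(unbounded with respect to } \|\cdot\|_0\text{)}.
\end{align*}
```

So $f$ is a continuous functional on the polynormed space even though it is not continuous with respect to the accompanying prenorm $\|\cdot\|_0$.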
How many continuous functionals are there on polynormed spaces? The Hahn-Banach theorem gives a complete answer to this question.
Theorem 2 (cf. Corollary 1.6.1). A space $(E, \|\cdot\|_\mu;\ \mu \in \Gamma)$ is Hausdorff $\Leftrightarrow$ for each vector $x \in E$, $x \ne 0$, there exists $f \in E^*$ such that $f(x) \ne 0$, or, equivalently, for every pair of vectors $x, y \in E$, $x \ne y$, there is $f \in E^*$ such that $f(x) \ne f(y)$.
Proof. $\Rightarrow$. Proposition 1.4 gives $\mu$ with $\|x\|_\mu > 0$, and Theorem 1.6.3 gives a functional that does not vanish on $x$ and is bounded with respect to the prenorm $\|\cdot\|_\mu$. Then Theorem 1 works.
$\Leftarrow$. If $f(x) \ne 0$ for $f \in E^*$, then Theorem 1 gives at least one $\mu$ such that $\|x\|_\mu \ne 0$. Then Proposition 1.4 works. ∎
For our future study of weak topologies we need several standard facts from linear algebra, which we present here with complete proofs.
Proposition 1. Let $T : E \to F$ be an operator between linear spaces such that for some finite family of functionals $f_1, \dots, f_n$ on $E$ the condition $f_1(x) = \dots = f_n(x) = 0$ implies $Tx = 0$ (in other words, $\bigcap_{k=1}^n \operatorname{Ker}(f_k) \subseteq \operatorname{Ker}(T)$). Then the image of $T$ is finite-dimensional.
Proof. In $\mathbb{C}^n$ consider the subspace of the rows $(f_1(x), \dots, f_n(x))$; $x \in E$. Consider vectors $x_1, \dots, x_m$ such that the rows $(f_1(x_k), \dots, f_n(x_k))$; $k = 1, \dots, m$ form a basis of this subspace. Then for every $x \in E$ there are $\lambda_1, \dots, \lambda_m \in \mathbb{C}$ such that $f_l(x) = \sum_{k=1}^m \lambda_k f_l(x_k)$ for all $l = 1, \dots, n$. Hence $x - \sum_{k=1}^m \lambda_k x_k$ belongs to the kernel of every $f_l$, and therefore to the kernel of $T$. Thus $Tx \in \operatorname{span}(Tx_1, \dots, Tx_m)$ for each $x \in E$. The rest is clear. ∎
Corollary 1. If $f_1, \dots, f_n$ is a finite family of functionals on a linear space $E$, then the quotient space $E / \bigcap_{k=1}^n \operatorname{Ker}(f_k)$ is finite-dimensional.
As an application, try to obtain one of the many proofs of the following "intuitively clear" fact.
Exercise 1. Let $E$ be an infinite-dimensional Hausdorff polynormed space. Then the dual space $E^*$ is also infinite-dimensional.
Hint. For every $f_1, \dots, f_n \in E^*$ there exists $f \in E^*$ that does not vanish on $\bigcap_{k=1}^n \operatorname{Ker}(f_k)$.
Proposition 2. Let $E$ be a linear space, $f_1, \dots, f_n$ functionals on $E$, and $f$ another functional on $E$. Suppose that for every $x \in E$ the condition $f_1(x) = \dots = f_n(x) = 0$ implies $f(x) = 0$ (i.e., $\bigcap_{k=1}^n \operatorname{Ker}(f_k) \subseteq \operatorname{Ker}(f)$). Then $f \in \operatorname{span}(f_1, \dots, f_n)$.
Proof. Consider the operator $T : E \to \mathbb{C}^n : x \mapsto (f_1(x), \dots, f_n(x))$ and the corresponding injective operator $\tilde T : E/\operatorname{Ker}(T) \to \mathbb{C}^n : x + \operatorname{Ker}(T) \mapsto Tx$. (We came across such a construction in Section 1.5.) From our assumption, $\operatorname{Ker}(T) \subseteq \operatorname{Ker}(f)$. Hence $f$ generates a functional $\tilde f : E/\operatorname{Ker}(T) \to \mathbb{C} : x + \operatorname{Ker}(T) \mapsto f(x)$. By the injectivity of $\tilde T$, there exists a functional $g : \mathbb{C}^n \to \mathbb{C}$ such that the diagram formed by $\tilde T : E/\operatorname{Ker}(T) \to \mathbb{C}^n$, $\tilde f : E/\operatorname{Ker}(T) \to \mathbb{C}$ and $g : \mathbb{C}^n \to \mathbb{C}$ is commutative, i.e., $\tilde f = g\tilde T$. It remains to put $\lambda_k := g(p^k)$; $k = 1, \dots, n$ and to see that for all $x \in E$ we have
$$f(x) = \tilde f(x + \operatorname{Ker}(T)) = g\tilde T(x + \operatorname{Ker}(T)) = g\Big(\sum_{k=1}^n f_k(x)\, p^k\Big) = \sum_{k=1}^n \lambda_k f_k(x).$$ ∎
Suppose again that $E$ is a linear space, and $E_0$ a space of linear functionals on $E$ (i.e., a subspace in the algebraic dual space $E^\sharp$ to $E$). We say that $E_0$ is sufficient if for every non-zero vector $x \in E$ there exists $f \in E_0$ such that $f(x) \ne 0$. For instance, Theorem 2 means in these terms that a polynormed space $E$ is Hausdorff $\Leftrightarrow$ the continuous functionals on $E$ form a sufficient space.
Proposition 3. Suppose $E_0$ is sufficient, $x_1, \dots, x_n$ is a linearly independent system in $E$, and $\lambda_1, \dots, \lambda_n$ are arbitrary complex numbers. Then there exists $f \in E_0$ such that $f(x_k) = \lambda_k$; $k = 1, \dots, n$.
Proof. Consider the operator $S : E_0 \to \mathbb{C}^n : f \mapsto (f(x_1), \dots, f(x_n))$. We have to show that $S(E_0) = \mathbb{C}^n$. Suppose this is not the case. Then there exists a non-zero functional $g$ on $\mathbb{C}^n$ vanishing on $S(E_0)$. For $x := \sum_{k=1}^n g(p^k) x_k \in E$ and for every $f \in E_0$ we have $f(x) = \sum_{k=1}^n g(p^k) f(x_k) = g\big(\sum_{k=1}^n f(x_k) p^k\big)$. Since $\sum_{k=1}^n f(x_k) p^k$ is $(f(x_1), \dots, f(x_n)) \in S(E_0)$, the choice of $g$ guarantees that $f(x) = 0$. At the same time, from $g \ne 0$ we have that not all numbers $g(p^k)$ vanish, and thus, by the linear independence of $x_1, \dots, x_n$, $x \ne 0$. The functional $f \in E_0$ was chosen arbitrarily, hence we come to a contradiction with the fact that $E_0$ is sufficient. ∎
In the following definition, $E$ is (still just) a linear space, and $\Lambda$ is a set of linear functionals on $E$ (i.e., a subset in $E^\sharp$).
Definition 1. A family of prenorms $\{\|\cdot\|_f;\ f \in \Lambda\}$ on $E$, where $\|x\|_f := |f(x)|$, is said to be a $\Lambda$-weak family of prenorms, and the corresponding topology generated on $E$ is called the $\Lambda$-weak topology.
We immediately note the following
Proposition 4. The indicated family of prenorms is equivalent to the family $\{\|\cdot\|_f;\ f \text{ runs over } \operatorname{span}(\Lambda) \text{ in } E^\sharp\}$.
Proof. If $g = \sum_{k=1}^n \lambda_k f_k$, then for $x \in E$ we have $|g(x)| \le C \max\{|f_k(x)|;\ k = 1, \dots, n\}$, where $C := n \max\{|\lambda_k|;\ k = 1, \dots, n\}$. The rest is clear. ∎
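The estimate in the proof of Proposition 4 can be checked numerically in a finite-dimensional model. The following is only a sketch under stated assumptions: $E$ is replaced by $\mathbb{R}^3$, functionals are represented by coefficient lists, and the names `apply_functional` and `check_span_estimate` are hypothetical helpers introduced for this illustration.

```python
import random

def apply_functional(f, x):
    """Value of the linear functional with coefficient list f on the vector x."""
    return sum(a * b for a, b in zip(f, x))

def check_span_estimate(functionals, lambdas, x):
    """Return (|g(x)|, C * max_k |f_k(x)|) for g = sum_k lambda_k f_k,
    with C = n * max_k |lambda_k| as in the proof of Proposition 4."""
    n = len(functionals)
    g_x = sum(lam * apply_functional(f, x) for lam, f in zip(lambdas, functionals))
    c = n * max(abs(lam) for lam in lambdas)
    return abs(g_x), c * max(abs(apply_functional(f, x)) for f in functionals)

random.seed(1)
fs = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0]]   # two functionals on R^3 (illustrative choice)
lams = [3.0, -0.5]                          # coefficients of g in span{f_1, f_2}
samples = [[random.uniform(-10, 10) for _ in range(3)] for _ in range(5)]
results = [check_span_estimate(fs, lams, x) for x in samples]
```

Each pair in `results` has the form (left-hand side, right-hand side) of the inequality $|g(x)| \le C \max_k |f_k(x)|$.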
Thus, every time we speak about a $\Lambda$-weak topology, we can assume that $\Lambda$ is a subspace in $E^\sharp$. We have already encountered two spaces with a family of prenorms of the indicated type: these are $c_{00}$ in Example 1.7 and $(\mathcal{B}(E, F), wo)$ in Example 1.9. (Describe $\Lambda$ in both cases.) Here are two principal classes of examples.
Definition 2. Let $(E, \|\cdot\|_\nu;\ \nu \in \Lambda')$ be a polynormed space. A weak (just weak, without mentioning any set) family of prenorms in $E$ is the family $\{\|\cdot\|_f;\ f \in E^*\}$, where $\|x\|_f := |f(x)|$. This family, as well as the corresponding generated topology in $E$, which is also called weak, is briefly denoted by $w$.
Thus, the introduced family is a $\Lambda$-weak family of prenorms in $E$ if we take the subspace $E^*$ in $E^\sharp$ as $\Lambda$.
From this definition one can easily deduce
Proposition 5. For every polynormed space $E$ the weak topology in $E$ is not stronger (i.e., no finer) than the initial one.
Proof. Look at the condition of continuity of a functional in Theorem 1. It means precisely that every prenorm $\|\cdot\|_f$; $f \in E^*$ is majorized by the initial family of prenorms. ∎
Definition 3. Again let $(E, \|\cdot\|_\nu;\ \nu \in \Lambda')$ be a polynormed space. The weak* (read "weak-star") family of prenorms in $E^*$ (note that this time in the dual, not in the original space!) is the family $\{\|\cdot\|_x;\ x \in E\}$, where $\|f\|_x := |f(x)|$. This family, as well as the topology in $E^*$ generated by it (also called weak*), is denoted by $w^*$.
This is a special case of Definition 1. Namely, as in the case of normed spaces (see Section 1.6), every vector $x \in E$ defines a functional $\alpha_x$ on $E^*$ by the formula $f \mapsto f(x)$, called an evaluating functional. The evaluating functionals form a subspace in $(E^*)^\sharp$. If we denote it by $\Lambda$, then clearly the introduced family is precisely the $\Lambda$-weak family of prenorms in $E^*$.
When speaking about $\Lambda$-weak and, in particular, weak or weak* convergence, we have in mind the convergence in the corresponding topology.
Again, let $E$ and $\Lambda$ be the same as in Definition 1. Note the following property of $\Lambda$-weak topologies, which sharply distinguishes them from the normed ones.
Proposition 6. Let $E$ have infinite dimension. Then, for each $\Lambda$, every neighborhood of zero in the $\Lambda$-weak topology of $E$ contains a nontrivial and even an infinite-dimensional subspace.
Proof. Such a neighborhood contains a standard open ball centered at zero: for some functionals $f_1, \dots, f_n \in \Lambda$ and $r > 0$ it has the form $\{x \in E : |f_k(x)| < r;\ k = 1, \dots, n\}$. Therefore, it always contains the kernel of the operator $T : E \to \mathbb{C}^n : x \mapsto (f_1(x), \dots, f_n(x))$ acting from an infinite-dimensional space to a finite-dimensional one. The rest is clear. ∎
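The mechanism of this proof can be seen in a small numerical model. The sketch below is an illustration only and rests on assumptions not in the text: $E$ is replaced by $\mathbb{R}^4$ (finite-dimensional, but with more dimensions than functionals, so the common kernel is still nontrivial), functionals are coefficient lists, and `in_weak_ball` is a hypothetical helper. Every multiple of a vector from the common kernel stays in the standard ball, however small the radius and however large the multiple.

```python
def in_weak_ball(x, functionals, r):
    """True if |f_k(x)| < r for every functional f_k (given as coefficient lists)."""
    return all(abs(sum(a * b for a, b in zip(f, x))) < r for f in functionals)

# Two functionals on R^4: f_1(x) = x_0 + x_1 and f_2(x) = x_2.
fs = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

# A vector in Ker(f_1) ∩ Ker(f_2).
v = [1.0, -1.0, 0.0, 5.0]

# Scale v by larger and larger factors and test membership in a very small ball.
scaled = [[t * c for c in v] for t in (1.0, 1e3, 1e9)]
inside = [in_weak_ball(x, fs, 1e-6) for x in scaled]
sup_norms = [max(abs(c) for c in x) for x in scaled]
```

All entries of `inside` are true although the vectors in `scaled` are unbounded in norm, which is exactly why no such neighborhood can be a norm-bounded set.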
Proposition 1.4 immediately implies the following result.
Proposition 7. A space $E$ endowed with a $\Lambda$-weak topology is Hausdorff $\Leftrightarrow$ the space $\operatorname{span}(\Lambda)$ is sufficient. ∎
This fact, combined again with Proposition 1.4 and the tautology that a functional is non-zero if it is non-zero on some vector, yields
Corollary 2. (i) For an arbitrary polynormed space $E$, the space $(E^*, w^*)$ is Hausdorff.
(ii) If $E$ is a polynormed space, then the space $(E, w)$ is Hausdorff $\Leftrightarrow$ $E$ (with the initial topology) is Hausdorff.
Exercise 2. A linear space $E$ is premetrizable in the $\Lambda$-weak topology $\Leftrightarrow$ the linear dimension of the space $\operatorname{span}(\Lambda)$ is at most countable.
Hint. Use Exercise 1.8.
Remark. From Exercise 2 it follows in particular that the weak topology on an infinite-dimensional normed space and the weak* topology on the dual space to an infinite-dimensional Banach space are not metrizable (cf. Exercise 2.1.1).
Functionals that are continuous in the $\Lambda$-weak topology are called $\Lambda$-weakly continuous. It is not difficult to distinguish them among all linear functionals:
Proposition 8. A functional $f \in E^\sharp$ is $\Lambda$-weakly continuous $\Leftrightarrow$ it is a linear combination of functionals in $\Lambda$.
Proof. We can assume that $\Lambda$ is a subspace in $E^\sharp$ (see Proposition 4).
$\Leftarrow$. Using the tautology $|f(x)| \le \|x\|_f$, we can apply Theorem 1.
$\Rightarrow$. By the same theorem, there are $f_1, \dots, f_n \in \Lambda$ and $C > 0$ such that $|f(x)| \le C \max\{|f_1(x)|, \dots, |f_n(x)|\}$. But then the functionals $f_1, \dots, f_n$ and $f$ satisfy the hypothesis of Proposition 2. The rest is clear. ∎
Corollary 3. Let $E$ be a polynormed space. Then
(i) the functionals on $E$ continuous in the weak topology are precisely those continuous in the initial topology;
(ii) the functionals on $E^*$ continuous in the weak* topology are precisely the evaluating functionals.
In various concrete situations that we shall encounter later in the book it will be important to know whether a particular subspace $E_0$ in $E^*$ is dense with respect to the weak* topology. Here is a way to determine this.
Proposition 9. If $E_0$ is sufficient, then it is dense in $E^*$ with respect to the weak* topology.
Proof. Take $g \in E^*$ and an arbitrary neighborhood of $g$ in the weak* topology. This neighborhood contains a standard open ball $U_{g, x_1, \dots, x_n, r}$, and thus contains all $h \in E^*$ with $h(x_k) = g(x_k)$; $k = 1, \dots, n$. Our goal is to find among those $h$ a vector from $E_0$.
If the system $x_k$ consists of zeroes, everything is clear. Otherwise, it contains a maximal linearly independent subsystem; without loss of generality we can assume that this is $x_1, \dots, x_m$; $m \le n$. The sufficiency condition together with Proposition 3 provides $f \in E_0$ such that $f(x_k) = g(x_k)$; $k = 1, \dots, m$. Further, for every $l = 1, \dots, n$ the vector $x_l$ has the form $\sum_{k=1}^m \mu_k x_k$; $\mu_k \in \mathbb{C}$. Hence $f(x_l) = \sum_{k=1}^m \mu_k f(x_k) = \sum_{k=1}^m \mu_k g(x_k) = g(x_l)$. The rest is clear. ∎
Exercise 3. If $E$ is Hausdorff, then the converse is true.
Let $T : E \to F$ be a continuous operator between polynormed spaces. Then it is easy to see that a linear operator $T^* : F^* \to E^*$ acting by the formula $f \mapsto fT$ is well defined. In other words, $T^*$ is defined by the formula $[T^*f](x) = f(Tx)$ or, which is equivalent, by the commutative diagram
$$E \xrightarrow{\ T\ } F \xrightarrow{\ f\ } \mathbb{C}, \qquad T^*f = f \circ T : E \to \mathbb{C}$$
(cf. Definition 2.5.2).
Proposition 10. The operator $T^*$ is continuous with respect to the weak* topologies in $F^*$ and $E^*$.
Proof. Take a prenorm $\|\cdot\|_x$; $x \in E$ in $(E^*, w^*)$ and look at the prenorm $\|\cdot\|_{Tx}$ in $(F^*, w^*)$. From the construction of $T^*$ it immediately follows that $\|T^*f\|_x = \|f\|_{Tx}$. Replacing "$=$" by "$\le$", we can use Theorem 1.1. ∎
Definition 4. For the same $E$, $F$, and $T$, the operator $T^*$ regarded as acting from $(F^*, w^*)$ to $(E^*, w^*)$ is called the weak* adjoint operator to $T$.
(These operators will soon be considered in our study of generalized functions.)
Weak topologies are most important and productive in the special case where the initial space $E$ is a normed space. Until further notice, we assume that $E$ is normed.
Thus, in the indicated situation the space $E$ is endowed with two families of prenorms, and $E^*$ has even three. In $E$ we have the initial norm (i.e., the family consisting of one norm) and the weak family of prenorms. In $E^*$ we have the norm, then the weak family of prenorms corresponding to this norm (i.e., the $\Lambda$-weak system, where $\Lambda := E^{**}$), and finally the weak* family of prenorms. (Thus, the latter family is the $\Lambda$-weak family, where $\Lambda$ is the space of evaluating functionals. Anticipating further events, we can identify this space with $E$ by the canonical bijection.)
Note that the weak* family of prenorms (and thus the weak* topology in $E^*$) is just a special case of the strong-operator family of prenorms (and the strong-operator topology in $\mathcal{B}(E, F)$) for $F = \mathbb{C}$. (The explanation of this terminological mess is that two systems of definitions appeared independently, one from the theory of topological vector spaces, and the other from the theory of operator algebras. And now they collide in a rather ridiculous way.)
What are the interrelations between these topologies?
Proposition 11. (i) The weak topology in $E$ is not stronger than the norm topology, and if $E$ is infinite-dimensional, it is strictly weaker.
(ii) The weak* topology in $E^*$ is not stronger than the weak one, and these topologies coincide $\Leftrightarrow$ $E$ is reflexive.
Proof. (i) Obviously, every prenorm of the weak family, hence this entire family, is majorized by the initial norm. Therefore, "not stronger" follows from Proposition 1.5, and "strictly weaker" from Proposition 6.
(ii) Since the evaluating functionals on $E^*$ are continuous with respect to the norm, the weak* family of prenorms is part of the weak one, hence majorized by it. If, in addition, $E$ is reflexive, then these families coincide. If, on the contrary, we know that these topologies coincide, then every $\varphi \in E^{**}$, being continuous in the weak topology (Corollary 3(i)), is continuous in the weak* topology as well. Therefore, by part (ii) of the same corollary, it is one of the evaluating functionals. ∎
Thus, every sequence $x_n \in E$ tending to some $x$ with respect to the norm tends to $x$ weakly, and if the space in question is dual, then also weakly*. The sequence of unit vectors in $l_2$ is probably the simplest example of a sequence tending to zero weakly (and if we identify $l_2$ with $l_2^*$, then weakly*), but not with respect to the norm. The same unit vectors in the space $l_1$ identified with $c_0^*$ give an example of a sequence tending to zero weakly*, but not weakly (explain why).
However, not all the information on the weak topologies can be expressed in the language of sequences. Take the space $l_1$, where the weak topology is certainly weaker than the normed one (Proposition 11(i)). At the same time we have
Exercise 4*. Every sequence in $l_1$ that weakly tends to some vector tends to it with respect to the norm.
Hint. Use the fact that $l_1^* = l_\infty$. Since the weak convergence in $l_1$ implies the coordinatewise convergence, we can choose a subsequence such that its terms have "almost disjoint" supports. Selecting proper functionals in $l_\infty$, we can show that such a sequence tends to zero with respect to the norm.
Thus, we see that the weak topology in $l_1$ cannot be defined in terms of convergent sequences. (In Section 0.2 we promised to give such examples.)
By Corollary 2, both spaces $(E, w)$ and $(E^*, w^*)$ made from a normed space $E$ are Hausdorff.
The following important result characterizes adjoint operators in the context of normed spaces.
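As a numerical aside on the unit vectors discussed before Exercise 4, the sketch below evaluates functionals on the unit vectors $e_n$. It is an illustration under assumptions: sequences are modeled as Python functions of the index $n \ge 1$, only finitely many indices are sampled, and the names `pairing_with_unit_vector`, `y_decaying` and `y_constant` are hypothetical. A functional given by a sequence $y$ takes the value $y_n$ on $e_n$; for $y \in l_2$ (or $y \in c_0$) these values tend to zero although $\|e_n\| = 1$, while the constant sequence, which lies in $l_\infty = l_1^*$ but not in $c_0$, takes the value 1 on every $e_n$.

```python
def pairing_with_unit_vector(y, n):
    """Value on the n-th unit vector e_n of the functional given by the sequence y;
    this is just the n-th term y(n)."""
    return y(n)

def y_decaying(n):
    """The sequence (1/n): it belongs to l_2 and to c_0."""
    return 1.0 / n

def y_constant(n):
    """The sequence (1, 1, ...): it belongs to l_inf but not to c_0."""
    return 1.0

indices = [1, 10, 100, 1000, 10000]
decaying_values = [pairing_with_unit_vector(y_decaying, n) for n in indices]
constant_values = [pairing_with_unit_vector(y_constant, n) for n in indices]
```

The list `decaying_values` shrinks toward zero along the sampled indices, whereas `constant_values` does not move; a finite sample of course proves nothing by itself and only visualizes the argument.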
Theorem 3. Let $E$ and $F$ be normed spaces, and $S : F^* \to E^*$ a bounded (with respect to the norm) operator. Then $S$ has the form $T^*$ for some bounded operator $T : E \to F$ $\Leftrightarrow$ $S$ is continuous with respect to the weak* topologies in $F^*$ and $E^*$.
Proof. The implication $\Rightarrow$ is a special case of Proposition 10.
$\Leftarrow$. Take $x \in E$. The evaluating functional $\alpha_x$ is weakly* continuous; therefore, by hypothesis, the same is true for $\alpha_x \circ S : F^* \to \mathbb{C}$. Hence, Corollary 3(ii) provides $y \in F$ such that $\alpha_x \circ S(g) = g(y)$ for all $g \in F^*$, and this $y$ is unique by Corollary 1.6.1. If we assign to every $x$ such a $y$, we obtain a mapping $T : E \to F$ defined by the formula $g(Tx) = (Sg)(x)$. Clearly, $T$ is a linear operator. It remains to show that it is bounded.
For every $x \in B_E$, consider the functional $\beta_x : F^* \to \mathbb{C} : g \mapsto g(Tx)$. Since for $g \in F^*$ we have
$$|\beta_x(g)| = |g(Tx)| = |(Sg)(x)| \le \|Sg\|\,\|x\| \le \|Sg\|,$$
the family of functionals $\{\beta_x;\ x \in B_E\}$ is pointwise bounded. But $F^*$ is a Banach space, hence we can apply the Banach-Steinhaus theorem. It gives a constant $C > 0$ such that $\|\beta_x\| \le C$ for all $x \in B_E$. On the other hand, by Proposition 1.6.4, $\|\beta_x\| = \|\alpha_{Tx}\| = \|Tx\|$. The rest is clear. ∎
Exercise 5. Let $E$, $F$ be normed spaces. A linear operator $T : E \to F$ is bounded $\Leftrightarrow$ it is continuous with respect to the weak topologies in $E$ and $F$.
Now we show that the weak topologies provide an alternative approach to the notion of a compact operator, the result we promised in Section 3.3. We need one geometrical observation, which is of independent interest.
Proposition 12. Let $E$ be a finite-dimensional normed space. Then for every $\varepsilon > 0$ there exists a finite family of functionals $f_1, \dots, f_n$ of norm 1 such that for each $x \in E$ we have $\|x\| \le (1 + \varepsilon) \max\{|f_k(x)|;\ k = 1, \dots, n\}$.
Proof. The unit sphere $S$ in $E^*$, being totally bounded, has a finite $\delta$-net for $\delta := \min\{\tfrac{1}{2}, \tfrac{\varepsilon}{2}\}$. We denote it by $f_1, \dots, f_n \in S$. Take an arbitrary $x \in E$ and for brevity put $|||x||| := \max\{|f_k(x)|;\ k = 1, \dots, n\}$. By Theorem 1.6.3, there exists a functional $f \in S$ such that $f(x) = \|x\|$. Choose $k$ such that $\|f - f_k\| < \delta$. Then $\|x\| = |f(x)| \le |(f - f_k)(x)| + |f_k(x)| \le \delta\|x\| + |||x|||$. Hence, $(1 - \delta)\|x\| \le |||x|||$, and
$$\|x\| \le \frac{1}{1 - \delta}\,|||x||| = \Big(1 + \frac{\delta}{1 - \delta}\Big)|||x||| \le (1 + \varepsilon)\,|||x|||.$$ ∎
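A concrete instance of Proposition 12 can be computed by hand in the Euclidean plane. The sketch below is an illustration under assumptions that are not in the text: the real space $\mathbb{R}^2$ with the Euclidean norm replaces a general finite-dimensional space, the functionals are $f_k(x) = \langle x, u_k\rangle$ for unit vectors $u_k$ at the angles $k\pi/n$, and the names `norming_directions` and `max_functional` are hypothetical. Since every direction lies within the angle $\pi/(2n)$ of some $\pm u_k$, one gets $\|x\| \le \max_k |f_k(x)| / \cos(\pi/(2n))$, so it suffices to take $n$ with $1/\cos(\pi/(2n)) \le 1 + \varepsilon$.

```python
import math
import random

def norming_directions(eps):
    """Unit vectors u_k = (cos(k*pi/n), sin(k*pi/n)), k = 0..n-1, where n is the
    least integer >= 2 such that 1 / cos(pi / (2n)) <= 1 + eps."""
    n = 2
    while 1.0 / math.cos(math.pi / (2 * n)) > 1.0 + eps:
        n += 1
    return [(math.cos(k * math.pi / n), math.sin(k * math.pi / n)) for k in range(n)]

def max_functional(x, directions):
    """max_k |f_k(x)| with f_k(x) = <x, u_k>; each f_k has norm 1."""
    return max(abs(x[0] * u[0] + x[1] * u[1]) for u in directions)

eps = 0.1
dirs = norming_directions(eps)

# Compare the Euclidean norm with the maximum of the functionals on random points.
random.seed(0)
points = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
worst_ratio = max(math.hypot(p[0], p[1]) / max_functional(p, dirs) for p in points)
```

Here `worst_ratio` is the largest observed value of $\|x\| / \max_k |f_k(x)|$ over the sample; by the geometric argument above it cannot exceed $1 + \varepsilon$.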
Theorem 4. The following properties of a bounded operator $T : E \to F$ acting between normed spaces are equivalent:
(i) $T$ is compact;
(ii) the restriction $T|_B : B \to F$ of $T$ to the unit ball $B$ in $E$ is continuous with respect to the weak (i.e., inherited from $(E, w)$) topology in $B$ and the norm topology in $F$;
(iii) the same mapping is continuous at zero with respect to the indicated topologies.
Proof. (i) $\Rightarrow$ (ii). Take an arbitrary $x \in B$ and $\varepsilon > 0$. Clearly, our goal is to find a finite family of functionals $f_1, \dots, f_n \in E^*$ and $\delta > 0$ such that for every $x' \in U_{x, f_1, \dots, f_n, \delta} \cap B$, i.e., for every $x'$ satisfying the conditions
(*) $|f_k(x' - x)| < \delta$; $k = 1, \dots, n$ and $\|x'\| \le 1$
we have $\|Tx' - Tx\| < \varepsilon$.
Let $y_1, \dots, y_m$ be an $\tfrac{\varepsilon}{5}$-net for $T(B)$. Consider the finite-dimensional space $F_0 := \operatorname{span}(y_1, \dots, y_m, Tx) \subseteq F$. By Proposition 12, there are functionals $g_1^0, \dots, g_n^0 : F_0 \to \mathbb{C}$ such that $\|g_1^0\| = \dots = \|g_n^0\| = 1$ and $\|y\| \le 2\max\{|g_k^0(y)|;\ k = 1, \dots, n\}$ for all $y \in F_0$. For every $k = 1, \dots, n$ we extend $g_k^0$, preserving the norm, to a functional $g_k : F \to \mathbb{C}$, and put $f_k := T^*g_k$ and $\delta := \tfrac{\varepsilon}{5}$.
Now let $x'$ satisfy conditions (*). There is $l$; $1 \le l \le m$, such that $\|Tx' - y_l\| < \tfrac{\varepsilon}{5}$. Then $\|Tx' - Tx\| < \|y_l - Tx\| + \tfrac{\varepsilon}{5}$. But $y_l$ and $Tx$ lie in $F_0$, hence
$$\|y_l - Tx\| \le 2\max\{|g_k^0(y_l - Tx)|;\ k = 1, \dots, n\} = 2\max\{|g_k(y_l - Tx)|;\ k = 1, \dots, n\}.$$
Further, for each $k = 1, \dots, n$ we have
$$|g_k(y_l - Tx)| \le |g_k(y_l - Tx')| + |g_k(Tx' - Tx)| \le \|y_l - Tx'\| + |f_k(x' - x)| < \frac{\varepsilon}{5} + \frac{\varepsilon}{5} = \frac{2\varepsilon}{5}.$$
Therefore, $\|Tx' - Tx\| < 2 \cdot \tfrac{2\varepsilon}{5} + \tfrac{\varepsilon}{5} = \varepsilon$.
The implication (ii) $\Rightarrow$ (iii) is clear.
(iii) $\Rightarrow$ (i). We construct an $\varepsilon$-net in $T(B)$ for a given $\varepsilon > 0$. The indicated continuity evidently implies that there exist $f_1, \dots, f_n \in E^*$ and $\delta > 0$ such that for each $x \in 2B$ satisfying the condition $|f_k(x)| < \delta$, $k = 1, \dots, n$, we have $\|Tx\| < \varepsilon$. Consider the operator $S : E \to \mathbb{C}^n$, $S(x) = (f_1(x), \dots, f_n(x))$. Since the bounded set $S(B)$ in a finite-dimensional space is totally bounded, there are $x_1, \dots, x_m \in B$ such that $Sx_1, \dots, Sx_m$ is a $\delta$-net in $S(B)$. We claim that $Tx_1, \dots, Tx_m$ is the desired $\varepsilon$-net in $T(B)$.
Take an arbitrary $x \in B$. From our construction it follows that there exists $l$; $1 \le l \le m$, such that $|f_k(x) - f_k(x_l)| = |f_k(x - x_l)| < \delta$ for all $k = 1, \dots, n$. Since $x - x_l \in 2B$, we obtain $\|T(x - x_l)\| < \varepsilon$. This is what we need. ∎
It is natural to ask what happens if, speaking about the indicated continuity condition, we take the entire space $E$ instead of the unit ball. The following exercise supplements (and clarifies) the picture.
Exercise 6*. The following properties of a bounded operator $T : E \to F$ between two normed spaces are equivalent:
(i) $T$ is continuous with respect to the weak topology in $E$ and the norm topology in $F$;
(ii) $T$ is finite-dimensional.
Hint. (i) $\Rightarrow$ (ii). The given version of continuity provides a finite family of functionals satisfying the hypothesis of Proposition 1.
(ii) $\Rightarrow$ (i). Our operator generates a finite-dimensional injective operator from $E/\operatorname{Ker}(T)$ to $F$. Therefore, $E/\operatorname{Ker}(T)$ is finite-dimensional and there is a finite family of functionals $\tilde f_1, \dots, \tilde f_n$ on $E/\operatorname{Ker}(T)$ whose kernels have zero intersection. The estimate $\|x\| \le C\max\{|\tilde f_k(x)|;\ k = 1, \dots, n\}$; $x \in E/\operatorname{Ker}(T)$ implies, with a possibly larger constant $C$, that $\|Tx\| \le C\max\{|f_k(x)|;\ k = 1, \dots, n\}$ for $f_k := \tilde f_k\,\mathrm{pr}$ and all $x \in E$.
Weak topologies shed new light on the relations between a normed space and its second dual space. Recall that the isometric operator of canonical embedding $\alpha : E \to E^{**}$ allows us to identify $E$ with a subspace in $E^{**}$. Let us see how it is placed there.
If $E$ is a non-reflexive Banach space, then, clearly, it is a proper subspace in $E^{**}$ closed with respect to the norm. Thus, it is not dense (and is even nowhere dense) in $E^{**}$ with respect to the norm topology. However, if we pass to another natural topology, the picture changes.
Proposition 13. For every normed space $E$ the image of the canonical embedding is dense in $E^{**} := (E^*)^*$ with respect to the weak* topology.
Proof. If $f \in E^* \setminus \{0\}$, then we take $x \in E$ such that $f(x) \ne 0$, and for the functional $\alpha_x$ on $E^*$ we have $\alpha_x(f) \ne 0$. This means that the subspace $\alpha(E)$ in $E^{**}$ is sufficient as a set of functionals on $E^*$. It remains to apply Proposition 9 with $E^*$ in the role of $E$ and $\alpha(E)$ in the role of $E_0$. ∎
Actually we can go further:
Proposition 14. For a normed space $E$ the image of its unit ball under the canonical embedding is dense with respect to the weak* topology in the unit ball of the space $E^{**}$.
For the proof see, e.g., [33, p. 460].
We have come to the most important of all the results about weak topologies. It is a new powerful (and often used) tool, also associated with the name of Banach.
Theorem 5 (Banach-Alaoglu). Let $E$ be a normed space. Then the unit ball in $E^*$ (with respect to the norm) is weakly* compact. In other words, it is compact with respect to the topology inherited from $(E^*, w^*)$.
Proof. Denote $B := B_{E^*}$. For every $x \in E$ denote by $\mathbb{D}_x$ the closed disk $\{z : |z| \le \|x\|\}$ in $\mathbb{C}$, and consider the set $\Omega$ of all functions $\gamma$ on $E$ such that $\gamma(x) \in \mathbb{D}_x$ for every $x$. Obviously, $\Omega$ is just the Cartesian product $\times\{\mathbb{D}_x;\ x \in E\}$. We will regard it as the topological product (with the Tychonoff topology).
The estimate $|f(x)| \le \|x\|$ for all $f \in B$, $x \in E$ implies that $B = \Omega \cap E^\sharp = \Omega \cap E^*$ (for the corresponding sets of functions on $E$). In particular, $B \subseteq \Omega$. Thus, in addition to the weak* topology in $B$, we can consider another topology, inherited this time from the "Tychonoff" $\Omega$. We will also call it the Tychonoff topology.
Lemma 1. The two topologies in $B$, the weak* and the Tychonoff one, coincide.
Proof. Since we compare two inherited topologies, our goal is to show that a set in $B$ has the form $V \cap B$, where $V$ is open in $\Omega$ $\Leftrightarrow$ it has the form $U \cap B$, where $U$ is open in $(E^*, w^*)$.
$\Rightarrow$. Take $f \in V \cap B$. From Proposition 0.6.4 (and from the definition of the topology of the complex plane) it follows that $V$ contains a Tychonoff neighborhood of $f$ of the form $V_f = \{\gamma \in \Omega : |\gamma(x_k) - f(x_k)| < r;\ k = 1, \dots, n\}$ for some $x_1, \dots, x_n \in E$ and $r > 0$. Denote by $U_f$ the standard open ball $U_{f, x_1, \dots, x_n, r}$ in $E^*$ and put $U := \bigcup\{U_f : f \in V \cap B\}$. Then $U$ is open in $(E^*, w^*)$ and, clearly, $U \cap B = V \cap B$.
$\Leftarrow$. Take $f \in U \cap B$. From the definition of the weak* topology in $E^*$ it follows that $U$ contains a standard open ball $U_{f, x_1, \dots, x_n, r}$ in $E^*$. Put $V_f := \{\gamma \in \Omega : |\gamma(x_k) - f(x_k)| < r;\ k = 1, \dots, n\}$ and $V := \bigcup\{V_f : f \in U \cap B\}$. Then $V$ is open in $\Omega$ and, clearly, $V \cap B = U \cap B$. ∎
Lemma 2. The ball $B$ is a closed subset in $\Omega$.
Proof. Let $\gamma$ be an adherent point for $B$. Then for every $x, y \in E$ and $\varepsilon > 0$, in the (Tychonoff) neighborhood
$$\{\gamma' \in \Omega : |\gamma'(x) - \gamma(x)| < \tfrac{\varepsilon}{3},\ |\gamma'(y) - \gamma(y)| < \tfrac{\varepsilon}{3},\ |\gamma'(x + y) - \gamma(x + y)| < \tfrac{\varepsilon}{3}\}$$
there is a point (i.e., a functional) $f \in B$. By the linearity of $f$,
$$|\gamma(x + y) - \gamma(x) - \gamma(y)| = |\gamma(x + y) - f(x + y) - \gamma(x) + f(x) - \gamma(y) + f(y)| \le |\gamma(x + y) - f(x + y)| + |\gamma(x) - f(x)| + |\gamma(y) - f(y)| < \varepsilon.$$
Since $\varepsilon > 0$ was chosen arbitrarily, we have $\gamma(x + y) = \gamma(x) + \gamma(y)$. Similar arguments show that $\gamma(\lambda x) = \lambda\gamma(x)$ for all $x \in E$, $\lambda \in \mathbb{C}$. Thus $\gamma$ is a linear functional on $E$, and from $\Omega \cap E^\sharp = B$ (see above) it follows that $\gamma \in B$. ∎
End of the proof of Theorem 5. By Tychonoff's Theorem 3.1.3, the topological space $\Omega$, being a topological product of compact spaces, is compact itself. Hence, its subset $B$, closed by Lemma 2, is compact in the Tychonoff topology by Proposition 3.1.5, and thus (Lemma 1) in the weak* topology. ∎
Actually, the above formulation of Theorem 5 belongs to Alaoglu. Banach made a decisive step when he found the following sequential (i.e., expressed in the language of sequences) prototype:
Exercise 7*. Let $E$ be a separable normed space. Then every uniformly bounded sequence of functionals $f_1, f_2, \dots$ on $E$ contains a weakly* convergent subsequence.
Hint. Let $x_1, x_2, \dots$ be a dense subset in $E$. The first observation is that if a subsequence $f_{n_k}(x_m)$ converges for every $m$, then $f_{n_k}(x)$ converges for every $x \in E$ to some $g(x)$, where $g \in E^*$.
After that we can take the sequence $f_n(x_1)$, choose a convergent subsequence, say, $f_n^1(x_1)$, then in $f_n^1(x_2)$ choose a convergent subsequence $f_n^2(x_2)$, etc. After that the "diagonal" subsequence $f_n^n$ converges at every $x_m$; $m = 1, 2, \dots$.
Now we show one of the many applications of the Banach-Alaoglu theorem, the one concerning compact operators. Consider the canonical embedding $\alpha : E \to E^{**}$ of a normed space into its second dual (see Section 1.6). Endow $E$ with the weak, and $E^{**}$ with the weak* family of prenorms. The operator $\alpha$, as we remember, bijectively maps $E$ onto its image $\alpha(E)$. We endow the latter with the family of prenorms inherited from $E^{**}$ and again denoted by $w^*$.
Proposition 15. The operator $\alpha : (E, w) \to (E^{**}, w^*)$ is topologically injective. As a corollary, in the case of reflexive $E$, $\alpha$ is a topological isomorphism.
Proof. Both families of prenorms are indexed by the set $E^*$, where the elements play the role of functionals in the first case, and of the "original" vectors in the second case. From this it evidently follows that $\alpha$ bijectively maps every standard open ball in $(E, w)$ onto a standard open ball in $(\alpha(E), w^*)$, and the inverse image of every standard open ball in $(\alpha(E), w^*)$ is a standard open ball in $(E, w)$. The rest is clear. ∎
Thus, $E$ is identified with its image in $E^{**}$ not only as a normed space (cf. Section 1.6), but also as a polynormed space with respect to the reasonably chosen families of prenorms.
Proposition 16. Let $E$ be a reflexive Banach space. Then the unit ball in $E$ is compact in the weak (i.e., inherited from $(E, w)$) topology.
Proof. By Proposition 15, the canonical injection $\alpha : (E, w) \to (E^{**}, w^*)$ is a topological isomorphism. This means that, in particular, every set in $(E, w)$ that has a compact image in $(E^{**}, w^*)$ is compact itself. But $\alpha$, being an isometric isomorphism with respect to the corresponding norms (due to the reflexivity of $E$), maps $B_E$ onto $B_{E^{**}}$. It remains to use the Banach-Alaoglu theorem (with $E^*$ in the role of $E$ in its statement). ∎
Exercise 8. If we take Proposition 14 for granted, then the converse result is also true.
Hint. The weakly* dense subset $\alpha(B_E)$ in $B_{E^{**}}$ is also compact, and, as a corollary, closed.
Now we can prove a significantly stronger result than Proposition 3.4.3 about compact operators between Hilbert spaces.
Proposition 17. Let $T : E \to F$ be a compact operator from a reflexive space to an arbitrary normed space. Then $T(B_E)$ is compact in the norm topology of the space $F$.
Proof. Combining Proposition 16 with Theorem 4, we see that $T(B_E)$, endowed with the norm topology, is the image of a compact set under a continuous mapping of topological spaces. Hence, the required fact follows from Proposition 3.1.7. ∎
We believe that the advanced reader should know a few more things about polynormed spaces and weak topologies.
1°. We formulate a theorem of general character discovered much later than the Banach theorems. However, this theorem is also one of the most powerful tools of functional analysis.
Let $M$ be a set in a linear space. A point in $M$ is called an extreme point of this set if it does not lie inside an interval with endpoints in $M$.
Theorem 6 (Krein-Milman, [29, Chapter III, §2, Theorem 14]). Every compact convex set in a polynormed space is the closure of the convex hull of its extreme points.
About various applications of this theorem in different fields of analysis see, e.g., [65, Ch. 10], [25, Ch. 6.2], [66, Ch. 2.13]. Here we only note that the Krein-Milman theorem together with the Banach-Alaoglu theorem guarantees the existence of a sufficient (in some reasonable sense) set of extreme points of the unit balls in the dual spaces to normed spaces.
Exercise 9. If we take the Krein-Milman theorem for granted, then the space $L_1[a, b]$ is not isometrically isomorphic to any dual space of a normed space. The same is true for the space $c_0$.
Hint. The unit balls of these spaces have no extreme points.
2°. By Proposition 5 (see also Proposition 11), in a polynormed space there are in general considerably fewer closed sets in the weak topology than in the initial topology. However, the following assertion holds.
Exercise 10. A subspace $E_0$ in a polynormed space $E$ is closed in the weak topology $\Leftrightarrow$ it is closed in the initial topology.
Hint to $\Leftarrow$. For every $x \in E \setminus E_0$ there exists $f_x \in E^*$ such that $f_x|_{E_0} = 0$ and $f_x(x) \ne 0$. By Corollary 3, $\operatorname{Ker}(f_x)$ is weakly closed, and $E_0 = \bigcap\{\operatorname{Ker}(f_x) : x \in E \setminus E_0\}$.
Actually, this is true for all convex sets in $E$, and not only for subspaces. Try to prove this using Exercise 1.6.13 on the separation of convex sets by a hyperplane.
3°. Suppose $E$ is a linear space, and $\Lambda$ is a subspace in $E^\sharp$. We say that a family of prenorms in $E$ is compatible with the set $\Lambda$ if the functionals on $E$ continuous in the topology of the corresponding polynormed space are precisely the functionals from $\Lambda$.
Exercise 11. Among all possible families of prenorms in $E$ compatible with $\Lambda$ there is one defining the weakest (i.e., coarsest) topology in the corresponding family of topologies, and this is the $\Lambda$-weak family.
In particular, if $E$ is a polynormed space, and $E^*$ its dual space, then every other family of prenorms, say, $\nu$, such that $(E, \nu)^* = E^*$, majorizes the weak family.
Remark. It is interesting that among the topologies considered in this exercise, there is also the strongest (i.e., finest) one, the so-called Mackey topology. If $E$ is endowed with a norm and $\Lambda$ is its dual space, then the Mackey topology coincides with the norm topology. But these facts are not so simple; see, e.g., [60] or [61].
4°. Let $E$ be a normed space. The question of metrizability of the weak topologies connected with this norm has the following solution:
Exercise 12.
(i) $E$ is metrizable in the weak topology $\iff$ it is finite-dimensional.
(ii) $E^*$ is metrizable in the weak* topology $\iff$ $E$ has at most countable linear dimension. In particular, $E^*$ cannot be metrizable in the weak* topology if $E$ is an infinite-dimensional Banach space.
Hint. Use Exercises 2 and 2.1.1.
What sometimes makes our life easier is that the unit balls of our spaces are much more "inclined to be metrizable".
Exercise 13*.
(i) If $E^*$ is separable with respect to the norm, then $B_E := \{x \in E : \|x\| \le 1\}$ is metrizable in the weak topology.
(ii) If $E$ is separable with respect to the norm, then $B_{E^*} := \{f \in E^* : \|f\| \le 1\}$ is metrizable in the weak* topology.
Hint. If $f_n$; $n = 1, 2, \dots$ is a dense subset in $E^*$, then the distance in $B_E$ can be defined by the formula
$$d(x, y) := \sum_{n=1}^{\infty} \frac{|f_n(x - y)|}{2^n \,(1 + |f_n(x - y)|)}.$$
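The following short check is not part of the hint; it is a sketch of why the formula gives a well-defined function that separates points, assuming the sum runs over all $n \ge 1$ as reconstructed above. The remaining metric axioms and the comparison with the weak topology are left to the exercise.

```latex
% Each term is at most 2^{-n}, because s/(1+s) < 1 for every s >= 0:
\[
0 \le d(x,y) = \sum_{n=1}^{\infty} \frac{|f_n(x-y)|}{2^n\,(1+|f_n(x-y)|)}
  \le \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
% Separation of points: d(x,y) = 0 forces f_n(x-y) = 0 for all n;
% by density of {f_n} in E^* (norm topology) this extends to every f in E^*,
% and then x = y by the Hahn-Banach theorem.
\[
d(x,y) = 0 \;\Longrightarrow\; f_n(x-y) = 0 \ (n = 1,2,\dots)
  \;\Longrightarrow\; f(x-y) = 0 \ (f \in E^*)
  \;\Longrightarrow\; x = y .
\]
```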
5°. Here is another application of the Banach-Alaoglu theorem.
Exercise 14. Every normed space can be embedded by an isometric isomorphism into a space of the form $C(\Omega)$, where $\Omega$ is a compact space.
Hint. Take $\Omega := B_{E^*}$ and assign to every $x \in E$ the restriction to $\Omega$ of the corresponding evaluating functional.
2. Weak topologies
Exercise 15. Every separable normed space can be embedded into $l_\infty$ by an isometric isomorphism.
Hint. Combining Proposition 3.2.6 and Exercise 13(ii), we see that $B_{E^*}$ has a countable subset, say, $f_1, f_2, \dots$, dense in the weak* topology. The mapping $j : C(B_{E^*}) \to l_\infty : x \mapsto (x(f_1), x(f_2), \dots)$ is an isometric operator. Then the previous exercise works.
Remark. There is also a separable classical Banach space into which every separable Banach space can be embedded by means of an isometric isomorphism: this is $C[0, 1]$. For the proof see, e.g., [67].
6°. Using weak* topologies one can define an important functor from Pol to HPol. It is a version of the star functor from Section 2.5. To every $E \in \mathrm{Ob}(\mathrm{Pol})$ we can assign the polynormed space $(E^*, w^*)$. As we remember (Corollary 2(i)), it is Hausdorff, and thus is an object of HPol. Then to every morphism in Pol we assign its weak* adjoint operator. By Proposition 10, it is a morphism in HPol. Thus, as is easy to verify (do this), we obtain a contravariant functor from Pol to HPol, called the weak* star functor and denoted by $(^*)$ (or, if there is a danger of confusing it with other versions of the star functor, by $(^{w*})$).
Remark. If we restrict the functor $(^{w*})$ to the subcategory Ban in Pol, then, contrary to the Banach star functor, it takes us out of this subcategory. The reason is that for an infinite-dimensional Banach space $E$ the space $(E^*, w^*)$ is not normable (and, actually, is not even metrizable; see Exercise 2). Thus, the introduced functor is not a generalization (or an extension) of the Banach star functor.
In conclusion, we mention a special example of weak* topology, playing an important role in the theory of operator algebras and, therefore, in the mathematical methods of modern quantum physics. Again, as in Examples 1.8-1.9, we take a space of operators, but now a concrete space $\mathcal{B}(H)$, where $H$ is a Hilbert space. We identify it with the space dual to $\mathcal{N}(H)$ (the space of nuclear operators on $H$) using the isometric isomorphism in Theorem 3.4.4 (Schatten-von Neumann). This allows us to consider, in addition to the strong- and weak-operator topologies introduced in Section 1, the weak* topology in $\mathcal{B}(H)$. The latter, by the same Schatten-von Neumann theorem, is generated by the family of prenorms indexed by the elements of the set $\mathcal{N}(H)$ and defined by the rule $\|T\|_S := |\operatorname{tr}(TS)|$; $T \in \mathcal{B}(H)$, $S \in \mathcal{N}(H)$.
The weak* topology in $\mathcal{B}(H)$ has a special name: the ultraweak topology,² or the Dixmier topology. It was discovered much later than the strong- and the weak-operator topologies, but gradually it became clear that it is the most useful among the non-norm topologies of this space (of which there are at least seven, and all of them are useful). In particular, an important role belongs to continuous operators from $(\mathcal{B}(H), w^*)$ to various polynormed spaces, called normal operators for some historical reasons. Normal functionals are especially important, and they admit a number of substantial equivalent characterizations.
Some facts about topologies in $\mathcal{B}(H)$ generated by different families of prenorms can be found in [25], and much more detailed information is contained in [68], [50]. If you are interested, please do the following exercise.
² A rather unfortunate term: it is not a "superweak" topology, as you will see from Exercise 16(i). But such is the tradition.
Exercise 16*. Let $H$ be an infinite-dimensional Hilbert space.
(i) The strong-operator and the ultraweak topologies in $\mathcal{B}(H)$, which are incomparable, both are strictly stronger (i.e., finer) than the weak-operator topology.
(ii) On the unit ball in $\mathcal{B}(H)$, the weak-operator and the ultraweak topologies coincide, and both are weaker (i.e., coarser) than the strong-operator topology.
(iii) The classes of convergent sequences for the weak-operator and the ultraweak topologies in $\mathcal{B}(H)$ coincide.
Hint. (iii) follows from (ii) and from the Banach-Steinhaus theorem.
Exercise 17. Describe the dual space to $\mathcal{B}(H)$ with the weak-operator topology.
Hint. This is the space of finite-dimensional operators.
3. Spaces of test functions and generalized functions
The general notion of function requires that a function of x be a number given for each x, which changes as x changes. (Nikolai Lobachevskii)
I recoil with disgust and horror from this growing sore of functions with no derivative. (Charles Hermite)
In this and the next sections we will discuss continuous functionals on some particular polynormed spaces consisting of smooth functions. These functionals are called "generalized functions".³ They can indeed be regarded as objects including the majority of functions of a real variable as a special case.
Before passing to precise formulations we shall try to explain informally why the traditional notion of function going back to Lobachevskii (see the epigraph) and Dirichlet requires a generalization.
It is all fine when you can differentiate a function you are working with: both a mathematician and a physicist are happy. In the old times when all functions you could encounter were analytic, this sentence would sound strange: can it really be otherwise? But as new kinds of relations "acquired the rank of functions", it became clear that, alas, this happens. Even continuity did not help. Examples of continuous functions with larger and larger sets of "bad points" became more and more widespread, to the dismay of some outstanding mathematicians (again see the epigraph). Finally, Weierstrass hammered in the last nail by constructing a well-known example of a continuous function which is nowhere differentiable.
So, one would like to differentiate, but this seems to be impossible. Or should we declare that only smooth functions have the right to exist, and the others should be banned? Clearly, such "mathematical fundamentalism" is a bad idea. Nobody will object to viewing a function as an arbitrary dependence: it is too useful for a mathematician, and applications of analysis would sharply decrease if such a ban were put into effect.
³ In the English language literature they are usually called distributions.
But there appeared some lucid minds (first of all, S. L. Sobolev and Laurent Schwartz) who explained what should be done. It turned out that, strange as it may seem at first sight, if we want to differentiate everything, we should not narrow, but on the contrary, essentially broaden the notion of function. In particular, we have to include objects that cannot be called functions (i.e., correspondences between numbers), even functions of a very general nature.
But if it is not a function, then what is it? Let us look at what a function is for a practicing physicist. What is important is the "averaging law" of a given function. To be more precise, it is important not what $f(t)$ is, but what $\int f(t)\varphi(t)\,dt$ is for various "test" functions $\varphi(t)$, i.e., using the "scientific language", what is important is not the function itself, but the integral functional it generates on the space of test functions.
If so, why do we not make the next natural step? (Of course, it is easy to speak about what is natural now, standing "on the giants' shoulders".) Perhaps the functional itself is important, regardless of whether it is generated by a function $f(t)$ or not.
That was how physicists led by Paul Dirac, consciously or subconsciously, worked with functionals that cannot always be represented as integrals. On the other hand, the physicists did not care about mathematical rigor; they called the necessary functionals integrals, and wrote "$\varphi(0) = \int_{\mathbb{R}} \delta(t)\varphi(t)\,dt$", having no problems with the fact that such $\delta(t)$ cannot exist (see Theorem 2 in the sequel). This approach, mostly due to the faultless intuition of scientists of Dirac's scale, led to amazing success. But as in real life, in science one needs an adequate language to express the ideas. Sooner or later one has to establish mathematical rigor. Otherwise specialists gradually stop understanding what it is that they actually do, and their reports to colleagues become incomprehensible.
We are going to give a formal definition of what turned out to be a "right" generalization of functions, satisfying (we do not know for how long) the needs of both physicists and mathematicians. First, we have to specify what spaces of test functions (i.e., the domains of our future functionals) we shall be speaking about. In fact, in analysis there are many such spaces, and they vary depending on the problem under consideration (see, e.g., [69]). We shall restrict ourselves to the three most useful spaces.
Remark. For a series of applications, including the theory of partial differential equations, it is expedient to consider spaces of functions of several variables. However, we think that for the first acquaintance with generalized functions it is reasonable to stick to one variable, on the real line, and concentrate on the main ideas and constructions without the distraction of digesting multi-indices.
For every infinitely smooth (infinitely differentiable) function $\varphi(t)$; $t \in \mathbb{R}$, and for every $n \in \mathbb{Z}_+$ and $N \in \mathbb{N}$ we denote
$$\|\varphi\|_n^{(N)} := \max\{\max\{|\varphi^{(k)}(t)| : -N \le t \le N\} : k = 0, \dots, n\}.$$
We call this number the standard prenorm of the function $\varphi$ with parameters $N$ and $n$. Certainly, the correspondence $\varphi \mapsto \|\varphi\|_n^{(N)}$ is indeed a prenorm in every linear space consisting of infinitely smooth functions on the real line.
Informally, all three spaces of test functions we describe here originate from the polynormed space $C^\infty[a, b]$ (Example 1.5), but in different ways. We start with the most complicated and at the same time most important of them. As we will see, this will give the biggest supply of generalized functions.
Recall that a function on the real line is called finitary if it vanishes outside some interval. Our first space is
The space $\mathcal{D}$. It consists of all finitary infinitely smooth functions. Here is the simplest substantial example of such a function: the "crust" $\Gamma(t)$. It is equal to $e^{\frac{1}{t^2 - 1}}$ for $|t| < 1$ and vanishes for $|t| \ge 1$.
We note that in $\mathcal{D}$ there is an increasing chain of subspaces $\mathcal{D}_N := \{\varphi \in \mathcal{D} : \varphi(t) = 0 \text{ for } |t| > N\}$; $N = 1, 2, \dots$. Of course, $\mathcal{D} = \bigcup\{\mathcal{D}_N : N = 1, 2, \dots\}$.
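To make the crust and the standard prenorm more tangible, here is a minimal numeric sketch. It is an illustration only: the names `crust` and `sup_on_interval` are hypothetical, and the prenorm is approximated for $n = 0$ by sampling on a grid, which is an assumption of this sketch rather than anything used in the text.

```python
import math


def crust(t: float) -> float:
    """The "crust": exp(1/(t^2 - 1)) for |t| < 1 and 0 for |t| >= 1."""
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(1.0 / (t * t - 1.0))


def sup_on_interval(func, big_n: int, samples: int = 20001) -> float:
    """Grid approximation of max{|func(t)| : -N <= t <= N}.

    For a smooth func this approximates the standard prenorm
    with parameters N and n = 0 (no derivatives involved).
    """
    step = 2.0 * big_n / (samples - 1)
    return max(abs(func(-big_n + i * step)) for i in range(samples))


if __name__ == "__main__":
    # The crust vanishes for |t| >= 1, so it lies in every D_N, N >= 1,
    # and its n = 0 prenorm does not depend on N.
    print(crust(0.0), sup_on_interval(crust, 1), sup_on_interval(crust, 3))
```

The maximum of the crust is attained at $t = 0$, so the sampled value is close to $e^{-1}$ for every $N \ge 1$.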
We want to make $\mathcal{D}$ a polynormed space. A prenorm $\|\cdot\|$ in $\mathcal{D}$ is called admissible if for every $N \in \mathbb{N}$ there are $n \in \mathbb{Z}_+$ and $C > 0$ such that for every $\varphi \in \mathcal{D}_N$ we have $\|\varphi\| \le C\|\varphi\|_n^{(N)}$. In other words, a prenorm is admissible if on every space $\mathcal{D}_N$ it is majorized by some standard prenorm (depending on $N$).⁴ We make $\mathcal{D}$ a polynormed space equipped with the family of all admissible prenorms. This family will often be denoted by $d$.
Remark. For every $N$ the space $\mathcal{D}_N$ is polynormed with respect to the standard family of prenorms $\|\cdot\|_n^{(N)}$, $n \in \mathbb{Z}_+$. It is easy to see (cf. Theorem 1.1 and Proposition 1.5) that the family $d$ is the "strongest" (i.e., biggest) among all possible families of prenorms on $\mathcal{D}$ for which all the embeddings $\mathcal{D}_N \hookrightarrow \mathcal{D}$ are continuous.
Certainly, every standard prenorm in $\mathcal{D}$ is admissible. Here are more interesting examples.
Example 1. Let $\alpha(t)$; $t \in \mathbb{R}$ be an arbitrary continuous everywhere positive function. For $\varphi \in \mathcal{D}$ we put $\|\varphi\|_\alpha := \max\{\alpha(t)|\varphi(t)| : t \in \mathbb{R}\}$. Obviously, $\|\cdot\|_\alpha$ is an admissible prenorm (and even a norm) in $\mathcal{D}$, and in the above estimate for every $N$ we can take $n = 0$.
⁴ The same condition can be described, up to equivalence, by a frightening formula [29, p. 99], but, fortunately, we will not need it.
Example 2. For $\varphi \in \mathcal{D}$ let us put $\|\varphi\| := \sum_{k=1}^{\infty} |\varphi^{(k)}(k)|$. Obviously, $\|\cdot\|$ is also an admissible prenorm. However, this time in the above estimate the order $n$ of the derivative must increase unboundedly as $N$ grows.
Note a few obvious properties of the family of admissible prenorms:
Proposition 1.
(i) If $\|\cdot\|_k$; $k = 1, \dots, m$ are admissible prenorms, then $\|\cdot\| := \max\{\|\cdot\|_k ; k = 1, \dots, m\}$ is also admissible.
(ii) If a prenorm is majorized by an admissible one, then it is admissible as well.
(iii) If $\|\cdot\|$ is admissible, then the same is true for the prenorm $|||\cdot|||$ defined by the formula $|||\varphi||| := \|\varphi'\|$. ∎
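The admissibility claimed in Example 2 can be checked by a one-line estimate. The sketch below is an added illustration, not the author's argument; it uses only the fact that a function $\varphi \in \mathcal{D}_N$ vanishes for $|t| > N$, so by continuity all its derivatives vanish at every point with $|t| \ge N$.

```latex
% For \varphi in D_N the terms with k >= N vanish, so the sum is finite:
\[
\|\varphi\| = \sum_{k=1}^{\infty} |\varphi^{(k)}(k)|
  = \sum_{k=1}^{N-1} |\varphi^{(k)}(k)|
  \le N\,\|\varphi\|_{N-1}^{(N)} \qquad (\varphi \in \mathcal{D}_N),
\]
% because each remaining term has order k <= N-1 and is evaluated
% at a point of [-N, N].
```

So on $\mathcal{D}_N$ one may take $n = N - 1$ and $C = N$ in the definition of an admissible prenorm, in agreement with the remark that $n$ has to grow with $N$.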
Now we pass to the next space. We say that an infinitely smooth function $\varphi(t)$; $t \in \mathbb{R}$ is rapidly decreasing if for every $p, q \in \mathbb{Z}_+$ the function $t^p \varphi^{(q)}(t)$ is bounded, or, in other words, if $\varphi(t)$ and all its derivatives decrease faster than every negative power of every polynomial as $|t| \to \infty$. The most famous among rapidly decreasing functions is, perhaps, the Gauss function $e^{-t^2/2}$, so popular in probability theory. (In these lectures it will also play an important role later in the discussion of the Fourier transform.) Thus, we introduce
The space $\mathcal{S}$, also called the Schwartz space. It consists of all rapidly decreasing infinitely smooth functions. We make it polynormed by introducing the family of prenorms (actually, norms, as can be easily verified) $\|\cdot\|_{p,q}$; $p, q \in \mathbb{Z}_+$, where $\|\varphi\|_{p,q} := \max\{|t^p \varphi^{(q)}(t)| : t \in \mathbb{R}\}$. This family will often be denoted by $s$.
Finally, here is
The space $\mathcal{E}$. It consists of all infinitely smooth functions (no matter how they grow). The family of prenorms we define on this space is extremely simple: it consists of all the standard prenorms, and only them. We shall denote this family by $e$.
Being polynormed, all three spaces $\mathcal{D}$, $\mathcal{S}$, and $\mathcal{E}$ automatically become topological spaces. Let us discuss which sequences converge there. We shall say that a sequence of infinitely smooth functions $\varphi_m$ classically converges to $\varphi$ on a subset $M \subset \mathbb{R}$ if $\varphi_m^{(n)}$ uniformly converges to $\varphi^{(n)}$ on $M$ for all $n = 0, 1, 2, \dots$ (cf. similar usage of this term in Example 1.1). Then it follows from Proposition 1.3 that the convergence of a sequence $\varphi_m$ to $\varphi$ in $\mathcal{E}$ is precisely the classical convergence on each interval in $\mathbb{R}$. Further, from the very same proposition and from the choice of prenorms in $\mathcal{S}$ it easily follows that $\varphi_m$ tends to $\varphi$ in $\mathcal{S}$ $\iff$ for every $p = 0, 1, 2, \dots$, the sequence $t^p \varphi_m(t)$ classically converges to $t^p \varphi(t)$ on the real line. As for $\mathcal{D}$, the answer in this case will probably surprise you.
Theorem 1. A sequence $\varphi_m$ tends to $\varphi$ in $(\mathcal{D}, d)$ $\iff$ the following conditions are fulfilled:
(i) all $\varphi_m$ vanish outside some interval (in other words, belong to some $\mathcal{D}_N$);
(ii) on this interval (or, equivalently, on the entire real line) $\varphi_m$ classically converges to $\varphi$.
Proof. $\Longrightarrow$ (i). Suppose that, on the contrary, for every natural $N$ there are $\varphi_{m_N}$ and $t_N$; $|t_N| > N$ such that $\varphi_{m_N}(t_N) \ne 0$. Passing, if necessary, to a sufficiently rarefied subsequence, we can assume that $\varphi_N(t_N) \ne 0$, and the sequence $t_N$; $N = 1, 2, \dots$ either unboundedly increases or unboundedly decreases. Obviously, there is a continuous function $\alpha(t)$; $t \in \mathbb{R}$ such that $\alpha(t_N) = |\varphi_N(t_N)|^{-1}$; $N = 1, 2, \dots$ and $\alpha(t) > 0$ for all $t$. Since $\varphi$ is finitary, the function $\alpha(t)|\varphi(t) - \varphi_N(t)|$ is equal to 1 at the point $t_N$ for every $N$ except the first few. Hence for the same $N$ and for the norm $\|\cdot\|_\alpha$ from Example 1, $\|\varphi - \varphi_N\|_\alpha \ge 1$. Since $\|\cdot\|_\alpha$ belongs to the family $d$, Proposition 1.3 forbids the convergence of the sequence $\varphi_m$ to $\varphi$, a contradiction.
$\Longrightarrow$ (ii). Since all standard prenorms belong to $d$, $\varphi_m$ classically converges to $\varphi$ on every interval. Taking (i) into account, this provides the classical convergence of $\varphi_m$ to $\varphi$ on the entire real line.
$\Longleftarrow$. Our goal is to show that $\varphi_m$ tends to $\varphi$ with respect to every admissible prenorm $\|\cdot\|$. Take $N$ indicated in (i). By the definition of an admissible prenorm, for some $n \in \mathbb{Z}_+$ and $C > 0$ we have $\|\varphi_m - \varphi\| \le C\|\varphi_m - \varphi\|_n^{(N)}$. But by (ii), the numbers on the right-hand side of this inequality tend to zero. The rest is clear. ∎
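Condition (i) is what makes convergence in $(\mathcal{D}, d)$ so restrictive. The following worked example is an added illustration: the particular sequence and the weight $\alpha$ are choices made here, not taken from the text. It uses the crust, written $\Gamma$ below, for which $\Gamma(0) = e^{-1}$.

```latex
% Hypothetical sequence: shrinking copies of the crust shifted to infinity.
\[
\varphi_m(t) := \tfrac{1}{m}\,\Gamma(t-m), \qquad m = 1,2,\dots
\]
% All derivatives tend to zero uniformly on the whole real line:
\[
\sup_{t\in\mathbb{R}} |\varphi_m^{(n)}(t)|
  = \tfrac{1}{m}\,\sup_{t\in\mathbb{R}} |\Gamma^{(n)}(t)| \longrightarrow 0
  \qquad (n = 0,1,2,\dots).
\]
% But the supports [m-1, m+1] lie in no common D_N, so (i) fails.
% Explicitly, for the admissible norm of Example 1 with \alpha(t) := e(1+|t|):
\[
\|\varphi_m\|_\alpha \ \ge\ \alpha(m)\,|\varphi_m(m)|
  = e\,(1+m)\cdot\tfrac{1}{m}\,e^{-1} = \tfrac{1+m}{m} \ >\ 1 .
\]
```

Thus this sequence converges classically to zero on the entire line and yet does not tend to zero in $(\mathcal{D}, d)$.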
From Proposition 1.4 it evidently follows that all three spaces $\mathcal{D}$, $\mathcal{S}$, and $\mathcal{E}$ are Hausdorff. From Proposition 1.6 it is also easy to see that they are not normed. To advanced readers, we suggest the following
Exercise 1. $\mathcal{E}$ and $\mathcal{S}$ are metrizable, and even Fréchet spaces, whereas $\mathcal{D}$ is not metrizable.
Hint. Use Exercise 1.8. For $\mathcal{D}$, you can apply Theorem 1 instead.
Now we shall discuss the mutual positions of all three spaces of test functions and compare their topologies. Clearly, $\mathcal{D} \subset \mathcal{S} \subset \mathcal{E}$. Later we shall need the following technical result.
Proposition 2 ("on a hat"). For every interval $[a, b] \subset \mathbb{R}$ and for each $\varepsilon > 0$ there exists a function $\Gamma_{a,b,\varepsilon}(t)$ in $\mathcal{D}$ (the so-called "hat") such that $0 \le \Gamma_{a,b,\varepsilon}(t) \le 1$ and
$$\Gamma_{a,b,\varepsilon}(t) = \begin{cases} 1, & t \in [a, b], \\ 0, & t \notin [a - \varepsilon, b + \varepsilon]. \end{cases}$$
Proof. Take the crust $\Gamma(t)$. Clearly, there are $c, C > 0$ such that $\Gamma(ct) = 0$ for $|t| \ge \varepsilon/2$ and $C \int_{\mathbb{R}} \Gamma(ct)\,dt = 1$. Now it is easy to verify that the function
$$\Gamma_{a,b,\varepsilon}(t) := C \int_{a - \varepsilon/2}^{b + \varepsilon/2} \Gamma(c(s - t))\,ds$$
is what we need. ∎
Proposition 3. The space $\mathcal{D}$ (hence also $\mathcal{S}$) is dense in $(\mathcal{E}, e)$, and $\mathcal{D}$ is dense in $(\mathcal{S}, s)$.
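Proof. It is sufficient to find for a given $\varphi \in \mathcal{E}$ (respectively, $\varphi \in \mathcal{S}$) a sequence $\psi_m \in \mathcal{D}$ tending to $\varphi$ with respect to every prenorm from $e$ (respectively, from $s$). Let us show that we can take $\psi_m := \varphi\Gamma_m$, where $\Gamma_m(t) := \Gamma_{-1,1,1}(t/m)$ (see Proposition 2). We see that on every interval $[-N, N]$ the function $\varphi - \psi_m$ vanishes for $m > N$, and therefore for the same $m$ and for all $n = 0, 1, \dots$ we have $\|\varphi - \psi_m\|_n^{(N)} = 0$. Thus, we established the required convergence in the space $\mathcal{E}$.
If $\varphi \in \mathcal{S}$, then, clearly, $\|\varphi - \psi_m\|_{p,q} = \max\{|t^p(\varphi - \psi_m)^{(q)}(t)| : |t| \ge m\}$. From the Leibniz formula it follows that
$$(\varphi - \psi_m)^{(q)} = \varphi^{(q)} - \varphi^{(q)}\Gamma_m - \sum_{k=1}^{q} \binom{q}{k} \varphi^{(q-k)} \Gamma_m^{(k)}.$$
Since the derivatives of the functions $\Gamma_m$ are bounded uniformly in $m$, there is $C > 0$ such that
$$\|\varphi - \psi_m\|_{p,q} \le \max\Big\{2|t^p \varphi^{(q)}(t)| + C\sum_{l=0}^{q-1} |t^p \varphi^{(l)}(t)| : |t| \ge m\Big\}.$$
But $\varphi \in \mathcal{S}$, hence for all $l = 0, \dots, q$ the functions $t^p \varphi^{(l)}(t)$ tend to zero as $|t| \to \infty$. The rest is clear. ∎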
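The proof above relies on the derivatives of the rescaled hats being bounded independently of $m$. A short verification of that step, added here as a sketch (the constants $M_k$ are notation introduced only for this remark):

```latex
% Chain rule for \Gamma_m(t) := \Gamma_{-1,1,1}(t/m):
\[
\Gamma_m^{(k)}(t) = m^{-k}\,\Gamma_{-1,1,1}^{(k)}(t/m),
\qquad k = 0,1,2,\dots
\]
% Hence, with M_k := \sup_t |\Gamma_{-1,1,1}^{(k)}(t)| < \infty
% (a continuous function with compact support is bounded),
\[
\sup_{t\in\mathbb{R}} |\Gamma_m^{(k)}(t)| = m^{-k} M_k \le M_k
\qquad \text{for all } m \ge 1 .
\]
```

In particular, the terms of the Leibniz sum containing $\Gamma_m^{(k)}$ with $k \ge 1$ even carry the extra small factor $m^{-k}$.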
Proposition 4. The topology in $\mathcal{D}$ inherited from $(\mathcal{S}, s)$ is weaker (i.e., coarser) than the topology in $(\mathcal{D}, d)$, and the topology in $\mathcal{S}$ inherited from $(\mathcal{E}, e)$ is weaker than the topology in $(\mathcal{S}, s)$.
(Clearly, one can express this as follows: the natural embedding of $(\mathcal{D}, d)$ into $(\mathcal{S}, s)$ and the natural embedding of $(\mathcal{S}, s)$ into $(\mathcal{E}, e)$ are continuous but not topologically injective operators.)
Proof. It is easy to see that the prenorms in $s$ are admissible. Hence, this family is part of $d$. Further, every standard prenorm (a prenorm in $e$) is obviously majorized by the norm $\max\{\|\cdot\|_{0,q} : 0 \le q \le N\}$ for some sufficiently large $N$. This shows that this proposition is true if we replace the word "weaker" by "not stronger". Further, for every non-zero $\varphi \in \mathcal{D}$ we can put $\varphi_m(t) := C_m \varphi(t - m)$; $m = 1, 2, \dots$ and take the coefficients $C_m > 0$ tending to zero sufficiently fast. Obviously, we obtain a sequence tending to zero in $(\mathcal{S}, s)$, but not in $(\mathcal{D}, d)$, taking into account Theorem 1(i). Finally, if we take a similar $\varphi_m(t)$, but now with $C_m = 1$, we obtain a sequence tending to zero in $(\mathcal{E}, e)$, but not in $(\mathcal{S}, s)$. The rest is clear. ∎
On the spaces of test functions a series of important operators act. Perhaps the principal among them is the differentiation operator $\varphi \mapsto \varphi'$. We denote it by the same symbol $D$ for all three spaces; this will not result in confusion.
Proposition 5.
The operator $D$ considered in $(\mathcal{D}, d)$ is continuous.
Proof. For each $\|\cdot\|$ from $d$ and for every $\varphi \in \mathcal{D}$ take the "estimate" $\|D\varphi\| \le 1 \cdot |||\varphi|||$, where $|||\cdot|||$ is the prenorm from Proposition 1(iii). It remains to apply Theorem 1.1. ∎
Exercise 2. The same is true if we replace $(\mathcal{D}, d)$ by $(\mathcal{S}, s)$ or $(\mathcal{E}, e)$.
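For Exercise 2, the estimates one needs can be written out directly from the definitions of the families $s$ and $e$. This is only a sketch of one possible route, under the assumption that, as in the proof of Proposition 5, Theorem 1.1 reduces continuity to majorizing each prenorm of $D\varphi$ by a prenorm of $\varphi$.

```latex
% In (S, s): differentiation shifts the second index by one.
\[
\|D\varphi\|_{p,q} = \max\{|t^p \varphi^{(q+1)}(t)| : t \in \mathbb{R}\}
  = \|\varphi\|_{p,q+1} .
\]
% In (E, e): the derivatives of orders 1, ..., n+1 of \varphi on [-N, N]
% are among those counted in the standard prenorm with parameters N, n+1.
\[
\|D\varphi\|_n^{(N)}
  = \max_{0 \le k \le n}\ \max_{|t| \le N} |\varphi^{(k+1)}(t)|
  \le \|\varphi\|_{n+1}^{(N)} .
\]
```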
Consider another classical operator.
Exercise 3. Let $\psi(t)$; $t \in \mathbb{R}$ be an infinitely smooth function. Then the "operator of multiplication by $\psi$" $M_\psi : \varphi \mapsto \varphi\psi$ is continuous as an operator in $(\mathcal{D}, d)$ and in $(\mathcal{E}, e)$. If in addition every derivative of $\psi$ has polynomial growth, then this operator is continuous in $(\mathcal{S}, s)$.
But all that was only a preliminary, and now the real story begins.
Definition 1.
(i) Continuous functionals on the space $(\mathcal{D}, d)$ (i.e., elements of $\mathcal{D}^*$) are called generalized functions;
(ii) continuous functionals on the space $(\mathcal{S}, s)$ (i.e., elements of $\mathcal{S}^*$) are called tempered generalized functions;
(iii) continuous functionals on the space $(\mathcal{E}, e)$ (i.e., elements of $\mathcal{E}^*$) are called compactly supported generalized functions, or generalized functions with compact support.
In all three cases, instead of "generalized function", one often says "distribution".
Certainly, we can apply the general continuity criterion for a functional on a polynormed space from Theorem 2.1 to all objects we have just introduced. But the special form of the families of prenorms in these spaces allows us, at least in two cases, to make some simplifications.
Proposition 6.
(i) A functional $f : \mathcal{D} \to \mathbb{C}$ is continuous (i.e., it is a generalized function) $\iff$ for every $N \in \mathbb{N}$ there are $n \in \mathbb{Z}_+$ and $C > 0$ such that $|f(\varphi)| \le C\|\varphi\|_n^{(N)}$ whenever $\varphi \in \mathcal{D}_N$.
(ii) A functional $f : \mathcal{E} \to \mathbb{C}$ is continuous (i.e., it is a compactly supported generalized function) $\iff$ there are $N \in \mathbb{N}$, $n \in \mathbb{Z}_+$ and $C > 0$ such that $|f(\varphi)| \le C\|\varphi\|_n^{(N)}$ for all $\varphi \in \mathcal{E}$.
Proof. (i), $\Longrightarrow$. Combining Theorem 2.1 and Proposition 1, we see that there exists an admissible prenorm $\|\cdot\|$ in $\mathcal{D}$ such that $|f(\varphi)| \le \|\varphi\|$ for all $\varphi \in \mathcal{D}$. The rest is clear.
(i), $\Longleftarrow$. This condition means that the prenorm $\|\cdot\|_f$; $\|\varphi\|_f := |f(\varphi)|$ is admissible. Now Theorem 2.1 works.
(ii), $\Longrightarrow$. This follows from Theorem 2.1 and the estimate $\max\{\|\cdot\|_{n_k}^{(N_k)} ; k = 1, \dots, m\} \le \|\cdot\|_n^{(N)}$, where $n := \max\{n_k ; k = 1, \dots, m\}$ and $N := \max\{N_k ; k = 1, \dots, m\}$.
(ii), $\Longleftarrow$. This is clear. ∎
One says that a generalized function $f$ has finite order if in the estimates of Proposition 6(i) the "order of derivative" $n$ can be chosen independently of $N$. The minimum of all such $n$ is called the order of the generalized function $f$.
Now let us discuss what guises a generalized function might have.
Call a measurable function on the real line locally integrable if it is integrable on every interval with respect to standard Lebesgue measure. The set of locally integrable functions (more precisely, of their cosets modulo coincidence almost everywhere) is denoted by $L^{\mathrm{loc}}$. Clearly, $L^{\mathrm{loc}}$ is a linear space with respect to the pointwise operations (more precisely, pointwise on the sets of full measure in $\mathbb{R}$).
Let $f$ be a locally integrable function. Consider the mapping $\hat{f} : \mathcal{D} \to \mathbb{C} : \varphi \mapsto \int_{\mathbb{R}} f(t)\varphi(t)\,dt$. Clearly, $\hat{f}$ is a functional. Moreover, for $\varphi \in \mathcal{D}_N$ we have the estimate $|\hat{f}(\varphi)| \le C\|\varphi\|_0^{(N)}$, where $C := \int_{-N}^{N} |f(t)|\,dt$. Thus, by Proposition 6, $\hat{f}$ is a generalized function.
Definition 2. A generalized function is called regular if it has the form $\hat{f}$ for some locally integrable function $f$. Otherwise, it is called singular.
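The estimate behind the continuity of a regular generalized function is a one-step computation. It is written out below as a sketch in the hat notation $\hat{f}$ for the functional generated by $f$; nothing beyond the definitions of $\mathcal{D}_N$ and of the standard prenorm with $n = 0$ is used.

```latex
% For \varphi in D_N the integrand vanishes outside [-N, N]:
\[
|\hat{f}(\varphi)| = \Big|\int_{-N}^{N} f(t)\varphi(t)\,dt\Big|
  \le \max_{|t| \le N} |\varphi(t)| \int_{-N}^{N} |f(t)|\,dt
  = C\,\|\varphi\|_0^{(N)},
\qquad C := \int_{-N}^{N} |f(t)|\,dt .
\]
% Here n = 0 works for every N, only the constant C depends on N.
```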
Clearly, the functional $\hat{f}$ does not change if we replace $f$ by an equivalent function. Hence, we have the well-defined mapping $i : L^{\mathrm{loc}} \to \mathcal{D}^* : f \mapsto \hat{f}$, which, certainly, is a linear operator. (From this moment, following the tradition of real analysis, we use expressions like "a locally integrable function $f$ from $L^{\mathrm{loc}}$." As always, if you remember what stands behind them, this does not result in confusion.) Thus, a generalized function is regular $\iff$ it lies in $\operatorname{Im}(i)$. Certainly, regular generalized functions have order 0.
To move forward, we need the following fact from real analysis.
Proposition 7. Let $f$ be an integrable function on a closed interval $[a, b]$ such that $\int_a^b f(t)\varphi(t)\,dt = 0$ for every $\varphi \in \mathcal{D}$ vanishing outside $[a, b]$. Then $f = 0$ almost everywhere on $[a, b]$.
Proof. Without loss of generality we can assume that our function takes real values. Consider an arbitrary interval $[c, d] \subset [a, b]$ and its characteristic function $\chi_{c,d}$. Take the hats $\Gamma_n(t) := \Gamma_{c+1/n,\,d-1/n,\,1/n}(t)$. Then the sequence $\Gamma_n$ almost everywhere tends to $\chi_{c,d}$ on $[a, b]$, and everywhere we have $|f(t)\Gamma_n(t)| \le |f(t)|$. By the Lebesgue theorem,
$$\int_c^d f(t)\,dt = \int_a^b f(t)\chi_{c,d}(t)\,dt = \lim_{n \to \infty} \int_a^b f(t)\Gamma_n(t)\,dt = 0.$$
In particular, $\int_a^x f(t)\,dt = 0$ for all $x \in [a, b]$. Differentiating this identity with respect to $x$ we see that $f(x) = 0$ almost everywhere on $[a, b]$. ∎
Here is the first application.
Proposition 8. The operator $i$ is injective; in other words, two locally integrable functions generate the same generalized function $\iff$ they are equivalent.
Proof. Let a locally integrable function $f$ be such that $\hat{f} = 0$. From the construction of $\hat{f}$ and Proposition 7 it immediately follows that $f$ vanishes almost everywhere on each interval of the line. Hence, it vanishes almost everywhere on the whole line. ∎
Thus, $i$ is a bijection between $L^{\mathrm{loc}}$ and the subspace in $\mathcal{D}^*$ consisting of regular generalized functions. Identifying these two spaces by means of this bijection (a usual "psychological device" for a mathematician), we can assume that $L^{\mathrm{loc}}$ is a part of $\mathcal{D}^*$. Since $L^{\mathrm{loc}}$ contains, to speak informally, "the majority" of functions we encounter in analysis, this identification essentially justifies the term "generalized function". In what follows, speaking about a "generalized function $f(t)$; $f \in L^{\mathrm{loc}}$" (say, a polynomial, or an exponential), we will have in mind the regular generalized function $i(f)$ corresponding to $f$.
Here is the first, and historically most famous, example of a singular generalized function.
Definition 3. The functional $\delta : \mathcal{D} \to \mathbb{C} : \varphi \mapsto \varphi(0)$ is called the Dirac $\delta$-function, or just the $\delta$-function.
Theorem 2. The Dirac $\delta$-function is a singular generalized function of order 0.
Proof. The fact that this generalized function has order zero immediately follows from the estimate $|\delta(\varphi)| \le \|\varphi\|_0^{(N)}$ for all $\varphi \in \mathcal{D}$ and $N$. Suppose it is regular and corresponds to some $f \in L^{\mathrm{loc}}$. Then for every interval $[a, b]$ not containing 0, and for every $\varphi \in \mathcal{D}$ vanishing outside $[a, b]$ we have $\int_a^b f(t)\varphi(t)\,dt = \int_{\mathbb{R}} f(t)\varphi(t)\,dt = \delta(\varphi) = 0$. By Proposition 7, $f = 0$ almost everywhere on $[a, b]$. But the line is the union of a countable family of such intervals with the singleton $\{0\}$. Hence, $f = 0$ almost everywhere on $\mathbb{R}$, and therefore $\delta = \hat{f}$ is a zero functional. But, by definition, $\delta(\varphi)$ is non-zero for $\varphi(0) \ne 0$, a contradiction. ∎
Remark. Many physicists are convinced that the $\delta$-function is indeed a genuine function $\delta(t)$ such that $\delta(t) = 0$ for $t \ne 0$ and $\delta(0)$ is "so large" that $\int_{\mathbb{R}} \delta(t)\,dt = 1$ and, "as a corollary", $\int_{\mathbb{R}} \delta(t)\varphi(t)\,dt = \varphi(0)$ for each $\varphi \in \mathcal{D}$. But what is admissible for physicists is forbidden for mathematicians. Again:
The $\delta$-function is not a function, despite its name!
The same formula $\varphi \mapsto \varphi(0)$ defines a functional in $\mathcal{S}^*$ and in $\mathcal{E}^*$. We denote such functionals by $\delta$ and call them $\delta$-functions; this will not lead to confusion.
The following class of generalized functions contains both regular generalized functions and the $\delta$-function. Let $\mathrm{BOR}_N$ be the family of Borel subsets of the interval $[-N, N]$, $\mathrm{BOR}_\infty := \bigcup\{\mathrm{BOR}_N : N = 1, 2, \dots\}$, and $\mu : \mathrm{BOR}_\infty \to \mathbb{C}$ a set function such that for every $N$ the restriction $\mu_N$ of $\mu$ to $\mathrm{BOR}_N$ is a complex measure on the corresponding interval.
Exercise 4. The mapping $f : \mathcal{D} \to \mathbb{C} : \varphi \mapsto \int_{\mathbb{R}} \varphi(t)\,d\mu(t)$, where the integral is understood as $\int_{-N}^{N} \varphi(t)\,d\mu_N(t)$ for sufficiently large $N$, is a well-defined generalized function. It is regular $\iff$ for each $N$ the measure $\mu_N$ is absolutely continuous with respect to Lebesgue measure.
Exercise 5. The generalized function $f$ has order zero $\iff$ it is generated by some measure $\mu$ from the previous exercise.
Hint for $\Longrightarrow$. From the Riesz Theorem 1.6.4 it follows that the restriction of $f$ to $\mathcal{D}_N$ acts as an integral with respect to some complex measure $\mu_N$.
In fact, generalized functions may appear in very different constructions. Here is another example. Below, P.V. denotes the integral in the sense of principal value.
Exercise 6. The mapping $\varphi \in \mathcal{D} \mapsto \mathrm{P.V.} \int_{\mathbb{R}}$ [...] $\mathcal{D} \to \mathbb{C} : \varphi \mapsto \sum_{k=1}^{\infty} \varphi^{(k)}(k)$ is a generalized function; it does not have a finite order.
Now we make the first attempt to justify the words "tempered" and "compact support" in Definition 1. Recall that for an ordinary function the words "tempered growth" are often used to mean "polynomial growth", and by a support of a function one usually means an arbitrary set outside which the function vanishes.
Let $f$ be a locally integrable function of tempered growth such that for some $p \in \mathbb{Z}_+$ and for all $t \in \mathbb{R}$ we have $|f(t)| \le C(1 + |t|^p)$. Then the function $g(t) := \frac{f(t)}{|t|^{p+2} + 1}$ must be integrable on the real line and hence for every $\varphi \in \mathcal{S}$ we have $\int_{\mathbb{R}} f(t)\varphi(t)\,dt = \int_{\mathbb{R}} g(t)(|t|^{p+2} + 1)\varphi(t)\,dt$. Thus we have defined a mapping (and obviously a linear functional) $\hat{f} : \mathcal{S} \to \mathbb{C} : \varphi \mapsto \int_{\mathbb{R}} f(t)\varphi(t)\,dt$. From the latter equality it follows, in addition, that $|\hat{f}(\varphi)| \le 2C \max\{\|\varphi\|_{p+2,0}, \|\varphi\|_{0,0}\}$, where $C := \int_{\mathbb{R}} |g(t)|\,dt$. By Theorem 2.1, this shows that $\hat{f}$ is a tempered generalized function.
Similarly, let $f$ be a locally integrable function with compact support (i.e., vanishing outside some interval). Then, certainly, $\int_{\mathbb{R}} f(t)\varphi(t)\,dt$ exists for every $\varphi \in \mathcal{E}$, and the functional $\hat{f}$ on $\mathcal{E}$ arising by the same rule as above is continuous in the prenorm $\|\cdot\|_0^{(N)}$ for sufficiently large $N$ (write down the corresponding estimate). Thus, $\hat{f}$ is a compactly supported generalized function.
The functionals $\hat{f} \in \mathcal{S}^*$ or $\hat{f} \in \mathcal{E}^*$ produced by this method from locally integrable functions are called regular, by analogy with the functionals from $\mathcal{D}^*$; this will not lead to confusion. Certainly, in both constructions, two locally integrable functions define the same functional on the corresponding space of test functions $\iff$ they coincide almost everywhere (cf. Proposition 8).
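The tempered estimate quoted above can be unpacked in two steps. The sketch below is an added illustration in the hat notation; the symbol $C_0$ is introduced here only to keep the growth constant of $f$ apart from the constant $C := \int_{\mathbb{R}} |g(t)|\,dt$ of the final bound.

```latex
% Step 1: g is integrable, since |f(t)| <= C_0 (1 + |t|^p) gives a bound
% that is bounded near the origin and decays like |t|^{-2} at infinity.
\[
|g(t)| = \frac{|f(t)|}{|t|^{p+2}+1}
  \le C_0\,\frac{1+|t|^{p}}{1+|t|^{p+2}} .
\]
% Step 2: split the weight and use the norms of the family s.
\[
|\hat{f}(\varphi)|
  \le \int_{\mathbb{R}} |g(t)|\,\big(|t|^{p+2}|\varphi(t)| + |\varphi(t)|\big)\,dt
  \le \big(\|\varphi\|_{p+2,0} + \|\varphi\|_{0,0}\big)\int_{\mathbb{R}} |g(t)|\,dt
  \le 2C \max\{\|\varphi\|_{p+2,0},\ \|\varphi\|_{0,0}\} .
\]
```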
Exercise 8.
(i) A continuous (hence locally integrable) function generates a tempered generalized function $\iff$ it has tempered growth.
(ii) A continuous function generates a compactly supported generalized function $\iff$ it has a compact support.
Hint for (ii). By Proposition 6(ii), $\hat{f}(\varphi) = 0$ for all $\varphi$ vanishing on some interval.
Thus, we have three "spaces of generalized functions": $\mathcal{D}^*$, $\mathcal{S}^*$, and $\mathcal{E}^*$. Let us endow each of them with the weak* family of prenorms. This makes them polynormed (and thus topological) spaces and allows us, in particular, to speak about the convergence of sequences: by the general Proposition 1.3, in each of these spaces a sequence $f_n$ tends to $f$ $\iff$ for every $\varphi$ from the corresponding space of test functions the numeric sequence $f_n(\varphi)$ tends to $f(\varphi)$.
Exercise 9. Suppose a sequence $\varphi_n \in \mathcal{D}$ is such that $\varphi_n = 0$ outside the interval $[-\frac{1}{n}, \frac{1}{n}]$ and $\int_{\mathbb{R}} \varphi_n(t)\,dt = 1$ for all $n$. Then $\varphi_n$, as a sequence of regular generalized functions, tends to the $\delta$-function in $\mathcal{D}^*$.
(Such a sequence is called $\delta$-shaped. This example shows that regular generalized functions may tend to a singular one.)

Let $\mathrm{in}$ be the natural embedding of $\mathcal{D}$ into $\mathcal{S}$. Since the topology in $\mathcal{D}$ is stronger (i.e., finer) than the one inherited from $\mathcal{S}$ (Proposition 4), $\mathrm{in}$ is a continuous operator. Take its weak* adjoint operator $\mathrm{in}^*$, which assigns to every continuous functional on $\mathcal{S}$ its restriction to $\mathcal{D}$. From the density of $\mathcal{D}$ in $\mathcal{S}$ (see Proposition 3) it follows that $\mathrm{in}^*$ is injective. Thus, using the operator $\mathrm{in}^*$, we can identify every tempered generalized function with a generalized function, namely, with its restriction to $\mathcal{D}$. Similarly, we can view every compactly supported generalized function as a tempered generalized function. So the embeddings $\mathcal{E}^* \subset \mathcal{S}^* \subset \mathcal{D}^*$ arise. The corresponding operators of natural embeddings, being weak* adjoint to the natural embeddings of the corresponding spaces of test functions, are continuous. (However, they are not topologically injective; verify this!)
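The convergence claimed in Exercise 9 can be observed numerically. The following is only a sketch under stated assumptions: the bump profile, the midpoint quadrature, the helper names (`bump`, `integrate`, `delta_shaped_pairing`) and the choice of $\psi$ are all illustrative and not taken from the text; the bumps used here are moreover non-negative, which the exercise as stated does not require. The function $\psi(t) = \cos t + t$ is not finitary, but only its values near $0$ enter the computation, so it stands in for a test function that agrees with it on $[-1, 1]$.

```python
import math


def bump(t, n):
    """Unnormalised smooth bump vanishing outside (-1/n, 1/n) (illustrative choice)."""
    s = n * t
    if abs(s) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - s * s))


def integrate(func, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def delta_shaped_pairing(psi, n):
    """Value of the regular functional phi_n on psi, where phi_n is the bump
    rescaled so that its integral over the real line equals 1."""
    a, b = -1.0 / n, 1.0 / n
    mass = integrate(lambda t: bump(t, n), a, b)
    return integrate(lambda t: bump(t, n) * psi(t), a, b) / mass


def psi(t):
    """Stand-in for a test function; only its values on [-1, 1] are used."""
    return math.cos(t) + t


# Distance between phi_n(psi) and delta(psi) = psi(0) for growing n.
errors = [abs(delta_shaped_pairing(psi, n) - psi(0.0)) for n in (1, 10, 100)]
print(errors)
```

The printed distances shrink as $n$ grows, in line with $\varphi_n(\psi) \to \psi(0) = \delta(\psi)$.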
Proposition 9. A functional in $\mathcal{D}^*$ lies (using our identification) in $\mathcal{S}^*$ $\iff$ it is continuous with respect to the topology inherited from $\mathcal{S}$. The same is true if we replace the pair $(\mathcal{D}, \mathcal{S})$ with the pair $(\mathcal{S}, \mathcal{E})$, or with the pair $(\mathcal{D}, \mathcal{E})$.
Proof. The proof in all three cases is practically the same. (In addition, the third follows from the first two.) So we consider only the first case.

$\Longrightarrow$. This immediately follows from the construction of the operator $\mathrm{in}^*$.

$\Longleftarrow$. We need to show that every $f : \mathcal{D} \to \mathbb{C}$, continuous as a functional on a subspace of $(\mathcal{S}, s)$, is a restriction of a functional in $(\mathcal{S}, s)^*$. By Theorem 2.1, $f$ is bounded as a functional on at least one prenormed space $(\mathcal{D}, \|\cdot\|)$, where $\|\cdot\|$ is the maximum of some prenorms in the family $s$. Further, the density of $\mathcal{D}$ in $(\mathcal{S}, s)$ (Proposition 3) means, in particular, that $\mathcal{D}$ is dense in $(\mathcal{S}, \|\cdot\|)$. But then, by the continuity extension principle (and since $\mathbb{C}$ is complete), there exists a bounded functional on $(\mathcal{S}, \|\cdot\|)$ extending $f$. This functional, by the same Theorem 2.1, belongs to $(\mathcal{S}, s)^*$. The rest is clear. ∎
We now recall the identification of $L_1^{\mathrm{loc}}$ with a part of $\mathcal{D}^*$. Certainly, $L_1^{\mathrm{loc}}$ contains $\mathcal{E}$, and thus $\mathcal{S}$ and $\mathcal{D}$ as well. Furthermore, the space $\mathcal{S}$, consisting of functions that are certainly tempered, can be identified with a part of $\mathcal{S}^*$. Combining all this with the relations between the spaces of functionals considered before, we obtain a diagram

(diagram of the natural embeddings among $\mathcal{D}$, $\mathcal{S}$, $\mathcal{E}$, $L_1^{\mathrm{loc}}$, $\mathcal{E}^*$, $\mathcal{S}^*$, and $\mathcal{D}^*$)

where the arrows denote natural embeddings. Here we shall pay special attention to the resulting embedding of $\mathcal{D}$ into $\mathcal{D}^*$, and also (having in mind the future Fourier transform) of $\mathcal{S}$ into $\mathcal{S}^*$.

Remark. The fact that each of the spaces $\mathcal{D}$ and $\mathcal{S}$ is naturally embedded into its dual space plays an outstanding role in the theory of generalized functions. However, it is necessary to emphasize that this is a consequence of the very special nature of these spaces. A general polynormed space, even a "very good" one, need not be a part of its dual space; take, for example, $\mathcal{E}$.

Let us make an important observation.

Proposition 10. $\mathcal{D}$ is dense in $(\mathcal{D}^*, w^*)$, and $\mathcal{S}$ is dense in $(\mathcal{S}^*, w^*)$.

Proof. Take $\varphi \in \mathcal{D} \setminus \{0\}$. Certainly, $\int_{\mathbb{R}} \overline{\varphi(t)}\,\varphi(t)\,dt > 0$. This means that there exists a generalized function in $\mathcal{D} \subset \mathcal{D}^*$, namely $\overline{\varphi}$, such that $\overline{\varphi}(\varphi) \ne 0$. Thus, $\mathcal{D}$ is a sufficient subspace in $\mathcal{D}^*$, and the result about $\mathcal{D}$ follows from Proposition 2.9, with $\mathcal{D}^*$ in the role of $E^*$, and $\mathcal{D}$ in the role of $E_0$. The result about $\mathcal{S}$ can be proved similarly. ∎
4. Generalized derivatives and the structure of generalized functions
Since $\mathcal{D}$ is a part of $\mathcal{D}^*$, we can ask the following question. Do the most important existing operations in $\mathcal{D}$, such as differentiation, have extensions to $\mathcal{D}^*$? To be more precise, let $T$ be an operator acting on $\mathcal{D}$. Are there biextensions $\widetilde{T} : \mathcal{D}^* \to \mathcal{D}^*$ of $T$? In other words, are there operators $\widetilde{T} : \mathcal{D}^* \to \mathcal{D}^*$ such that for $\varphi \in \mathcal{D}$ we have $\widetilde{T}(\varphi) = T(\varphi)$ (see Section 0.1), i.e., the following diagram is commutative?
$$\begin{array}{ccc} \mathcal{D} & \xrightarrow{\;T\;} & \mathcal{D} \\ \downarrow & & \downarrow \\ \mathcal{D}^* & \xrightarrow{\;\widetilde{T}\;} & \mathcal{D}^* \end{array}$$
Here the vertical arrows mean the natural embedding. Of course, from the point of view of pure linear algebra, there is a huge number of such operators. But $\mathcal{D}^*$ is endowed with a topology. Hence, it is reasonable to be interested in those operators that are continuous in this topology. Putting the question in this form usually leads to full success.
Theorem 1. For the differentiation operator $D : \mathcal{D} \to \mathcal{D}$ there exists a unique biextension $\widetilde{D} : \mathcal{D}^* \to \mathcal{D}^*$, continuous in the weak* topology. This operator acts by the formula $[\widetilde{D}f](\varphi) := f(-D\varphi)$; $f \in \mathcal{D}^*$, $\varphi \in \mathcal{D}$ (in other words, it is the weak* adjoint operator for $-D$; see Definition 2.4).
Proof. Due to the continuity of $D$ (Proposition 3.5), the above formula defines an operator $\widetilde{D}$ in $\mathcal{D}^*$, which is weakly* continuous by Proposition 2.10. Take $\varphi \in \mathcal{D}$, consider its derivative $\varphi'$, and denote by the same symbols the corresponding regular generalized functions. Then, taking into account that $\varphi$ is finitary, for every $\psi \in \mathcal{D}$ and a sufficiently large $N$ we have
$$[\widetilde{D}\varphi](\psi) = \varphi(-D\psi) = -\int_{-N}^{N} \varphi(t)\psi'(t)\,dt = -\Big( \varphi(t)\psi(t)\Big|_{-N}^{N} - \int_{-N}^{N} \varphi'(t)\psi(t)\,dt \Big) = \varphi'(\psi).$$
This means that $\widetilde{D}$ is a biextension of $D$. Further, $\mathcal{D}^*$ is Hausdorff. Hence, by Proposition 0.�.14, every weak* continuous operator on $\mathcal{D}^*$ biextending $D$ coincides with $\widetilde{D}$ not only on $\mathcal{D}$, but also on its weak* closure. It remains to apply Proposition 3.10. ∎
Definition 1. The generalized function $\widetilde{D}(f)$ is called the generalized derivative of a generalized function $f$ and is denoted by $f'$.
Thus, every generalized function has a derivative: we need only suggest a reasonable definition of such a derivative. In particular, we can speak about the derivative of an arbitrary locally integrable function; if such a derivative is again a regular function, this object is usually called the generalized derivative in the sense of Sobolev. (Recall the epigraph from Hermite's letter; how unfortunate that he did not live till the discovery of generalized derivatives!) Of course, we cannot now speak about the derivative at an individual point, but it seems that we can live well without it.

Remark. An important role in modern analysis belongs to Banach spaces consisting of those $f \in L_p(\mathbb{R})$ that have $k$ generalized derivatives in the sense of Sobolev, where these derivatives belong to $L_p(\mathbb{R})$. These are the so-called Sobolev spaces. They are often denoted by $W_p^k(\mathbb{R})$ and endowed with the norm
$$\|f\| := \Big( \sum_{l=0}^{k} \int_{\mathbb{R}} |f^{(l)}(t)|^p\,dt \Big)^{1/p},$$
making them Banach spaces. (Often, instead of $\mathbb{R}$, various domains in $\mathbb{R}^n$ are considered in the definitions of such spaces.) About the role of the Sobolev spaces, see, e.g., [70]. The general theory of Sobolev spaces is presented in [71].
Proposition 1. For each natural number $m$, every regular generalized function is the $(m+1)$th generalized derivative of an $m$-times-smooth function.

Proof. For $f \in L_1^{\mathrm{loc}}$ we put $g(t) := \int_0^t f(s)\,ds$ and take $\varphi \in \mathcal{D}$. Since $g$ and $\varphi$ satisfy the assumptions of the theorem on integration by parts (see, e.g., [9, p. 125]), for sufficiently large $N$ we have
$$g'(\varphi) = -\int_{-N}^{N} g(t)\varphi'(t)\,dt = -g(t)\varphi(t)\Big|_{-N}^{N} + \int_{-N}^{N} f(t)\varphi(t)\,dt = f(\varphi).$$
Hence, every regular function is a generalized derivative of some continuous function, this continuous function is a derivative of a smooth function, and so on. ∎

Example 1. Let $\theta(t)$ be the so-called Heaviside function
$$\theta(t) := \begin{cases} 1, & t \ge 0, \\ 0, & t < 0. \end{cases}$$
For every $\varphi$ and sufficiently large $N$ we have
$$\theta'(\varphi) = -\int_{-N}^{N} \theta(t)\varphi'(t)\,dt = -\int_0^N \varphi'(t)\,dt = -\varphi(t)\Big|_0^N = \varphi(0).$$
Thus, the generalized derivative of the Heaviside function is the Dirac $\delta$-function.

Exercise 1*. Suppose we are given a function on the real line that has bounded variation on every interval. Find its generalized derivative.

Answer: It must be among the "measures" in Exercise 3.4.

Remark. For a specific example, take the Cantor scale.${}^5$
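The computation of Example 1 (the generalized derivative of the Heaviside function) can be checked numerically. This is a minimal sketch under stated assumptions: the test function `phi` (a shifted smooth bump), its hand-computed derivative `dphi`, the midpoint rule, and the cutoff `N = 2.0` are illustrative choices, not part of the text.

```python
import math


def phi(t):
    """Illustrative test function: a smooth bump vanishing outside (-0.7, 1.3)."""
    s = t - 0.3
    if abs(s) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - s * s))


def dphi(t):
    """Classical derivative of phi, computed by the chain rule."""
    s = t - 0.3
    if abs(s) >= 1.0:
        return 0.0
    return phi(t) * (-2.0 * s) / (1.0 - s * s) ** 2


def theta(t):
    """Heaviside function."""
    return 1.0 if t >= 0 else 0.0


def integrate(func, a, b, steps=100000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


N = 2.0  # phi vanishes outside [-N, N]
lhs = -integrate(lambda t: theta(t) * dphi(t), -N, N)  # theta'(phi) by definition
rhs = phi(0.0)                                         # delta(phi)
print(lhs, rhs)
```

The two printed numbers agree up to quadrature error, which is the numerical counterpart of $\theta' = \delta$ evaluated on one test function.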
${}^5$ The Cantor scale is a continuous non-decreasing function on the closed interval $[0, 1]$, closely related to the Cantor set (see footnote on p. 44). This function, being continuous, is uniquely determined by its values on the open intervals that were deleted in the process of constructing the Cantor set. It is constant on each of these intervals, and equals $1/2$ on the interval of rank 1 (cf. footnote on p. 44), equals respectively $1/4$ and $3/4$ on the intervals of rank 2, $1/8$, $3/8$, $5/8$, and $7/8$ on the intervals of rank 3, and so on. The "classical" derivative of the Cantor scale is the function that is equal to zero almost everywhere. In our "generalized" sense it is the Cantor measure. You should admit that such a "derivative" provides much more information about the initial function.

Exercise 2. Find the generalized derivative of the function $\ln|t|$.

Answer: This is the "principal value integral" from Exercise 3.5.

As you know, sometimes it happens that an ordinary function is differentiable almost everywhere, but cannot be recovered from its derivative. (The same Cantor scale is the classical example.) The theory of generalized functions saves us from this unpleasant situation.

Exercise 3. Let $f$ be a generalized function. Then $f' = 0$ $\iff$ $f$ is constant (the precise meaning of such a statement was discussed above). As a corollary, two generalized functions that have the same generalized derivative coincide up to a constant summand.

Hint. The functional $f$ vanishes on all $\varphi'$; $\varphi \in \mathcal{D}$, i.e., on all $\varphi$ such that $\int_{\mathbb{R}} \varphi(t)\,dt = 0$. Hence, $f$ is the constant $f(\varphi_0)$ for every $\varphi_0$ such that $\int_{\mathbb{R}} \varphi_0(t)\,dt = 1$.

If $f$ is a generalized function, then its generalized primitive is a generalized function $g$ such that $g' = f$. Here is another advantage of generalized functions as compared with ordinary functions.

Exercise 4. Every generalized function has a generalized primitive, and two such generalized primitives are equal up to a constant summand.

Hint. Take $\varphi_0 \in \mathcal{D}$ such that $\int_{\mathbb{R}} \varphi_0(t)\,dt = 1$. As a primitive for $f$ we can take $g : \varphi \mapsto -f(\Phi)$, where $\Phi(t)$ is the unique primitive for $\varphi(t) - \big(\int_{\mathbb{R}} \varphi(t)\,dt\big)\varphi_0(t)$ lying in $\mathcal{D}$.

Remark. This fact initiates the theory of differential equations for generalized functions, which serves as a powerful apparatus of modern analysis and physics. You can read about it, say, in the classical monographs [72], [73], [74], or in the excellent textbook [70].

* * *

We have been discussing differentiation in $\mathcal{D}^*$. But the main construction can be almost literally transferred to tempered generalized functions.

Exercise 5°. Theorem 1 remains true if we replace $\mathcal{D}$ with $\mathcal{S}$ and $\mathcal{D}^*$ with $\mathcal{S}^*$. The operator $\widetilde{D}$ acting on $\mathcal{S}^*$ turns out to be a birestriction of the same operator acting on the (larger) space $\mathcal{D}^*$.

Another classical operation in analysis, multiplication by a sufficiently "good" function, can also be transferred from ordinary functions to generalized ones.
Exercise 6. Let $\psi(t)$; $t \in \mathbb{R}$ be an infinitely smooth function. Then the operator $M_\psi : \mathcal{D} \to \mathcal{D}$ (see Exercise 3.3) admits a unique weakly* continuous biextension to an operator $\widetilde{M}_\psi : \mathcal{D}^* \to \mathcal{D}^*$ (called the operator of multiplication by $\psi$ in $\mathcal{D}^*$). If, in addition, $\psi$ and all its derivatives are tempered, then the statement remains true if we replace $\mathcal{D}$ with $\mathcal{S}$.

Hint. This time $\widetilde{M}_\psi$ is precisely the weak* adjoint operator for $M_\psi$.

We recall the topological characterization of compactly supported generalized functions (Proposition 3.9). They have another, perhaps more visual, description. We define a support of a generalized function $f$ to be an arbitrary closed set $M \subset \mathbb{R}$ such that $f(\varphi) = 0$ whenever $\varphi = 0$ in some neighborhood of $M$. In this case it is sometimes said that $f$ is concentrated on $M$, or that $f$ vanishes outside $M$. (Thus, the statement "$f$ vanishes at a point" has no meaning, but "$f$ vanishes on an open set" has a reasonable interpretation.)

Remark. Among all supports of a generalized function there is a minimal one, namely, the intersection of all supports. (Try to prove this!) People who speak about "the support" often have in mind this minimal support.
Proposition 2. A functional $f \in \mathcal{D}^*$ belongs to $\mathcal{E}^*$ $\iff$ it has a compact support. (In other words, and this is not a joke, a compactly supported generalized function is precisely a generalized function with compact support.)

Proof. $\Longrightarrow$. From Propositions 3.9 and 3.6(ii) it follows that for some $N$, $n$ and $C > 0$ we have $|f(\varphi)| \le C\|\varphi\|_N^{(n)}$. Hence, if $\varphi(t) = 0$ for $t \in [-N, N]$, then $f(\varphi) = 0$.

$\Longleftarrow$. By the assumption, there is $N$ such that $f = 0$ outside $[-N, N]$. Hence, if we put $\Delta := F_{-N,N,1}$ (see Proposition 3.2), then for each $\varphi \in \mathcal{D}$ we have $f(\varphi - \varphi\Delta) = 0$, and thus $f(\varphi) = f(\varphi\Delta)$. But $\varphi\Delta \in \mathcal{D}_{N+1}$. Therefore, by Proposition 3.6(i), there are $n$ and $C > 0$ such that $|f(\varphi\Delta)| \le C\|\varphi\Delta\|_{N+1}^{(n)}$. From the Leibniz formula for higher derivatives it easily follows that $\|\varphi\Delta\|_{N+1}^{(n)} \le C'\|\varphi\|_{N+1}^{(n)}$ for some $C' > 0$. Thus, $|f(\varphi)| \le CC'\|\varphi\|_{N+1}^{(n)}$, and therefore the functional $f$ is continuous with respect to the topology inherited from $\mathcal{E}$. It remains to apply Proposition 3.9 again. ∎
Theorem 3 below describes the structure of generalized functions with compact support. It shows that in some sense they "do not go far away from the regular ones" . But first we have to prove an important preliminary result (Theorem 2) .
Theorem 2. Suppose $f \in \mathcal{D}^*$ and $N \in \mathbb{N}$. Then there exist $n \in \mathbb{Z}_+$ and a square-integrable function $h(t)$ on $[-N, N]$ such that
$$f(\varphi) = \int_{-N}^{N} h(t)\varphi^{(n)}(t)\,dt$$
for every $\varphi \in \mathcal{D}_N$.

Proof. By Proposition 3.6(i), there exists $n - 1$ (it is now more convenient to speak about $n - 1$ rather than $n$) such that our functional, considered on $\mathcal{D}_N$, is bounded with respect to the norm $\|\cdot\|_N^{(n-1)}$. Take the corresponding $n$ and for arbitrary $\varphi, \psi \in \mathcal{D}_N$ put $\langle \varphi, \psi \rangle := \int_{\mathbb{R}} \varphi^{(n)}(t)\overline{\psi^{(n)}(t)}\,dt$.

Lemma. $\langle \cdot, \cdot \rangle$ is an inner product in $\mathcal{D}_N$. The norm $\|\cdot\|_{2,n}$ defined by the equality $\|\varphi\|_{2,n} := \sqrt{\int_{-N}^{N} |\varphi^{(n)}(t)|^2\,dt}$ (i.e., the norm of the corresponding near-Hilbert space) majorizes the standard norm $\|\cdot\|_N^{(n-1)}$ in $\mathcal{D}_N$.

Proof. Clearly, $\langle \cdot, \cdot \rangle$ is a pre-inner product, hence $\|\cdot\|_{2,n}$ is a prenorm in $\mathcal{D}_N$. For $k = 0, 1, \dots, n-1$ we put $\|\varphi\|_k^{\infty} := \max\{|\varphi^{(k)}(t)| : t \in [-N, N]\}$. Since $\|\cdot\|_N^{(n-1)} = \max\{\|\cdot\|_k^{\infty};\ k = 0, \dots, n-1\}$, it is sufficient to show that $\|\cdot\|_{2,n}$ majorizes $\|\cdot\|_{n-1}^{\infty}$, and that for every $k = 0, 1, \dots, n-2$, $\|\cdot\|_{k+1}^{\infty}$ majorizes $\|\cdot\|_k^{\infty}$. Take $\varphi \in \mathcal{D}_N$. For every $k$ we have $\varphi^{(k)}(-N) = 0$, so that $\varphi^{(k)}(t) = \int_{-N}^{t} \varphi^{(k+1)}(s)\,ds$; $t \in [-N, N]$. For $k = n - 1$ this equality together with the Cauchy-Bunyakovskii inequality for square-integrable functions implies
$$|\varphi^{(n-1)}(t)| \le \sqrt{\int_{-N}^{N} |\varphi^{(n)}(t)|^2\,dt}\ \sqrt{\int_{-N}^{N} dt} = \sqrt{2N}\,\|\varphi\|_{2,n}.$$
For the remaining values of $k$ the same equality gives the estimate $|\varphi^{(k)}(t)| \le 2N \max\{|\varphi^{(k+1)}(s)| : s \in [-N, N]\}$, and this implies $\|\cdot\|_k^{\infty} \le 2N\|\cdot\|_{k+1}^{\infty}$. The rest is clear. ∎

End of the proof of Theorem 2. Consider the Hilbert space $L_2[-N, N]$ and the subspace $\mathcal{D}_N^n$ in it consisting of the $n$th derivatives of functions from $\mathcal{D}_N$. Since every $\varphi \in \mathcal{D}_N$ vanishes at the point $-N$ together with all derivatives, the condition $\varphi^{(n)} = 0$ implies $\varphi = 0$. Therefore we can consider the functional $h_0 : \mathcal{D}_N^n \to \mathbb{C}$ defined by the rule $h_0(\varphi^{(n)}) := f(\varphi)$. Since our functional $f$ is bounded with respect to the norm $\|\cdot\|_N^{(n-1)}$ of the space $\mathcal{D}_N$, the lemma implies that $|h_0(\varphi^{(n)})| \le c\sqrt{\int_{-N}^{N} |\varphi^{(n)}(t)|^2\,dt}$ for some $c > 0$. This means that the functional $h_0$ is bounded with respect to the norm inherited from $L_2[-N, N]$. Let us extend $h_0$ to a bounded functional on the entire $L_2[-N, N]$. Since we know the general form of functionals on this space (Proposition 2.3.10), we obtain a function $h \in L_2[-N, N]$ such that $h_0(\varphi^{(n)}) = \int_{-N}^{N} h(t)\varphi^{(n)}(t)\,dt$ for every $\varphi^{(n)} \in \mathcal{D}_N^n$. The rest is clear. ∎

Theorem 3. Let $f$ be a compactly supported generalized function. Then
(i) for some $n \in \mathbb{Z}_+$, $f$ can be represented in the form $\sum_{k=0}^{n} h_k^{(k)}$, where all $h_k$ are finitary regular functions;
(ii) $f$ is a (high order) derivative of some regular generalized function $g$, which can be chosen as smooth as we wish.
(Certainly, the second representation of our generalized function is more elegant than the first, but, on the other hand, the function $g$ is not necessarily finitary there.)

Proof. By assumption, $f$ is concentrated on an interval $[-(N-1), N-1]$. Hence, for every $\varphi \in \mathcal{D}$, $f(\varphi) = f(\varphi\Delta)$, where $\Delta := F_{-(N-1),N-1,1}$. Further, $\varphi\Delta \in \mathcal{D}_N$. By the previous theorem, there exist $h \in L_2[-N, N]$ and $n$ such that $f(\varphi) = \int_{-N}^{N} h(t)(\varphi\Delta)^{(n)}(t)\,dt$ for all $\varphi \in \mathcal{D}$. Differentiating the product by the Leibniz formula, we obtain
$$f(\varphi) = \sum_{k=0}^{n} \int_{-N}^{N} g_k(t)\varphi^{(k)}(t)\,dt, \qquad g_k := \binom{n}{k}\, h\, \Delta^{(n-k)},$$
where the $g_k$ are finitary regular functions. Consequently $f = \sum_{k=0}^{n} (-1)^k g_k^{(k)}$, which gives (i) with $h_k := (-1)^k g_k$. Statement (ii) then follows from (i) combined with Proposition 1. ∎

From the construction of the regular function $g$ such that $f$ is its high order derivative we can see that $g$ must be tempered. The next result generalizes the second statement of Theorem 3. It is proved by similar arguments, but requires more cumbersome estimates, so we only state it here.

Theorem 4 (see [29, Chapter III, §4, Theorem 35]). Suppose a generalized function $f$ belongs to $\mathcal{S}^*$ (i.e., is a tempered generalized function). Then it is a high-order generalized derivative of a tempered regular function, and (again) we can choose this multiple primitive as smooth as we wish.

Theorem 3 allows us, in particular, to give a simple description of generalized functions concentrated at a point (hence, having minimal non-empty support). First do the following exercise of independent interest.

Exercise 7 (generalizing Exercise 3). Let $f$ be a generalized function. Then $f^{(n)}$ vanishes on an interval $(a, b)$ $\iff$ $f$ is a polynomial of degree $< n$ on this interval.

Hint. The same arguments as used in Exercise 3 show that $f^{(n-1)}$ is equal to a constant $c_1$ on $(a, b)$. Hence, $(f^{(n-2)} - c_1 t)' = 0$ on $(a, b)$. Therefore, $f^{(n-2)} - c_1 t$ is equal to another constant, say, $c_2$ on $(a, b)$. But then, $[f^{(n-3)} - (\tfrac{1}{2}c_1 t^2 + c_2 t)]' = 0$ on $(a, b)$, and so on.

Exercise 8. Every generalized function $f$ concentrated at a point $t \in \mathbb{R}$ is a linear combination of the generalized function $\delta_t : \varphi \mapsto \varphi(t)$ ("a shift of the Dirac $\delta$-function") and its generalized derivatives.

Hint. Suppose $t = 0$. Combining Theorem 3(ii) and the previous exercise, we see that the function $g$ such that $g^{(n)} = f$ coincides with a polynomial of degree $< n$ to the right of 0, and with another polynomial of degree $< n$ to the left of 0. Hence, $g$ is a linear combination of the functions $t^k\theta(t)$ and $t^k(1 - \theta(t))$; $k = 0, \dots, n-1$, where $\theta(t)$ is the Heaviside function. Differentiating each of these functions $n$ times, we obtain, up to a scalar factor, $\theta^{(n-k)} = \delta^{(n-k-1)}$ for some $k < n$.
An advanced reader may try to prove the following two results generalizing Theorem 3 in different directions. The next exercise provides useful information about the structure of generalized functions, now of arbitrary nature.

Exercise 9. For every $f \in \mathcal{D}^*$ there exist
(i) a sequence of finitary regular generalized functions $g_n$ satisfying the following condition: for every interval $(a, b)$ there is $N$ such that for all $n > N$ the functions $g_n$ vanish on $(a, b)$;
(ii) a sequence of non-negative integers $k_n$ such that for every $\varphi \in \mathcal{D}$ and for sufficiently large $N$ we have
$$f(\varphi) = \sum_{n=1}^{N} g_n^{(k_n)}(\varphi),$$
and, as a corollary, $f = \sum_{n=1}^{\infty} g_n^{(k_n)}$ in the weak* topology of $\mathcal{D}^*$.

Hint. For every $m \in \mathbb{Z}$ we put $\Delta_m := F_{m,m,1}$, $F(t) := \sum_{m=-\infty}^{\infty} \Delta_m(t)$ and $e_m(t) := \frac{\Delta_m(t)}{F(t)}$. Then $e_m(t) = 0$ outside $(m-1, m+1)$ and $\sum_{m=-\infty}^{\infty} e_m(t) = 1$.${}^6$ The generalized function $fe_m : \varphi \mapsto f(\varphi e_m)$ has the interval $[m-1, m+1]$ as a support, and thus can be represented in the form $fe_m = \sum_{k=0}^{n_m} h_{k,m}^{(k)}$, where $h_{k,m}$ are regular functions vanishing outside $[m-2, m+2]$. But $f = \sum_{m=-\infty}^{\infty} fe_m$.

${}^6$ In this book the family of functions with these two properties appears as an episode. Actually such families of functions, called partitions of unity subordinate to a given covering (in our case, the covering of the line by intervals $(m-1, m+1)$; $m \in \mathbb{Z}$), play a very important role in analysis and geometry. The details about applications of partitions of unity in the theory of generalized functions and in differential geometry can be found, respectively, in [73] and [75].

The second statement of Theorem 3 fails to be true for arbitrary generalized functions (explain why!). However, the following holds.

Exercise 10. The second statement of Theorem 3 is true for all generalized functions of finite order.

Hint. Now in the representation of $fe_m$ (see the previous hint) the numbers $n_m$ are bounded. From this we see that $f$ is a finite sum of generalized functions $g_k^{(k)}$, where $g_k := \sum_{m=-\infty}^{\infty} h_{k,m}$.
Chapter 5

At the Gates of Spectral Theory

I was developing my theory of infinitely many variables having in mind merely mathematical interests, and even called it "spectral analysis", without any idea that it would later find applications in the real spectra of physics.
(David Hilbert)

The Latin word "spectrum" means "phantom", "ghost", "spirit". In this chapter we shall see that operators also have "spirits". However, they live not in old dilapidated castles, but on the complex plane. As is known, spirits, having been asked properly, can tell many important things. To make sure of this the reader does not have to repeat the experience of Hamlet or Macbeth; instead, look at the Gelfand-Mazur theorem of this chapter, or, even better, at the Hilbert spectral theorem in the next chapter.

1. Spectra of operators and their classification. Examples
Let $T$ be a bounded operator acting in a Banach space $E$. Let us agree that once $E$ is chosen, we shall omit the index in the notation $\mathbf{1}_E$.

Definition 1 (going back to Hilbert). A complex number $\lambda$ is called a regular point of an operator $T$ if the operator $T - \lambda\mathbf{1}$ is a topological isomorphism. Otherwise it is called a singular point of this operator. The set of singular points of the operator $T$ is called the spectrum of this operator and is denoted by $\sigma(T)$.

By the Banach theorem, to say "$\lambda$ belongs to $\sigma(T)$" is equivalent to "$T - \lambda\mathbf{1}$ is not a bijective operator" or "the operator $(T - \lambda\mathbf{1})^{-1}$ does not exist." We immediately distinguish a simple, but very important property of the spectrum.

Proposition 1. Suppose operators $T : E \to E$ and $S : F \to F$ are topologically equivalent (i.e., they are similar as endomorphisms in Ban). Then their spectra coincide.

Proof. Obviously, for every $\lambda \in \mathbb{C}$ the operators $T - \lambda\mathbf{1}_E$ and $S - \lambda\mathbf{1}_F$ are topologically equivalent. Consequently, our proposition is a special case of Proposition 0.4.1 for $\mathcal{K} = $ Ban. ∎
Thus, the spectrum is an invariant of topological equivalence. Of course, assigning to every operator its spectrum, we do not obtain a complete system of invariants of similarity (in the sense discussed in Section 0.4). For instance, the zero operator on $\mathbb{C}^2$ is not topologically equivalent to the operator given by the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, but they have the same spectrum consisting of only zero (check this!). Thus, if we know the spectrum of an operator, we do not know everything about this operator, but (as we will see later) we know a great deal.

Let us think why a point $\lambda$ would belong to the spectrum of $T$. There can be several reasons for this.

1. The "most rough" possible reason is that $T - \lambda\mathbf{1}$ maps at least one non-zero vector, say, $x$ to zero. (Of course, $T - \lambda\mathbf{1}$ cannot be bijective in this case.) Then, certainly, $Tx = \lambda x$, i.e., $\lambda$ is an eigenvalue of the operator $T$. The subset in $\sigma(T)$ consisting of the eigenvalues of the operator $T$ is called the point spectrum and is denoted by $\sigma_p(T)$.

If our operator acts in a finite-dimensional space, then, as we know from linear algebra, it is bijective $\iff$ it has a zero kernel. Applied to $T - \lambda\mathbf{1}$, this means that in the case of such an operator, $\sigma(T) = \sigma_p(T)$. However, the operators of functional analysis, as a rule, act on infinite-dimensional spaces, and so they may have other numbers in their spectra.

2. Assume that $\lambda$ is not an eigenvalue of the operator $T$ (i.e., $T - \lambda\mathbf{1}$ is injective). Then it may happen that $\mathrm{Im}(T - \lambda\mathbf{1})$ does not fill all of $E$. Certainly, in this case, $\lambda \in \sigma(T)$ as well. It is useful to distinguish two possible subcases here.

2a. $\mathrm{Im}(T - \lambda\mathbf{1})$, being smaller than $E$, is "at least" dense in the latter. The subset in $\sigma(T)$ consisting of those $\lambda$ is called the continuous spectrum of our operator and is denoted by $\sigma_c(T)$.

2b. Finally, the "worst case" is that though $T - \lambda\mathbf{1}$ is injective, its image is not even dense in $E$. The subset in $\sigma(T)$ consisting of such $\lambda$ is called the residual spectrum of the operator $T$ and is denoted by $\sigma_r(T)$.

Let us emphasize that there are no other possibilities for a point $\lambda$ to be singular: if $\mathrm{Ker}(T - \lambda\mathbf{1}) = 0$ and at the same time $\mathrm{Im}(T - \lambda\mathbf{1}) = E$, then, by the Banach theorem, $T - \lambda\mathbf{1}$ is a topological isomorphism, and $\lambda$ is regular. It is easy to check that not only the entire spectrum, but also each of its indicated parts undergoes no change under the passage to a topologically equivalent operator.
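The finite-dimensional example mentioned above (the zero operator on $\mathbb{C}^2$ versus the nilpotent matrix with a single 1 above the diagonal) can be checked directly. The sketch below is illustrative only: the helper names `spectrum_2x2` and `rank_2x2` are assumptions introduced here, the eigenvalues are obtained from the characteristic polynomial of a $2 \times 2$ matrix, and the rank is used as a similarity invariant that tells the two operators apart.

```python
import cmath


def spectrum_2x2(m):
    """Set of eigenvalues of a 2x2 matrix, via the roots of its characteristic
    polynomial x^2 - tr(m) x + det(m)."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return {(tr + disc) / 2, (tr - disc) / 2}


def rank_2x2(m):
    """Rank of a 2x2 matrix (similar matrices always have equal rank)."""
    (a, b), (c, d) = m
    if a * d - b * c != 0:
        return 2
    return 1 if any(x != 0 for x in (a, b, c, d)) else 0


zero = ((0, 0), (0, 0))
nilpotent = ((0, 1), (0, 0))

print(spectrum_2x2(zero), spectrum_2x2(nilpotent))  # the two spectra
print(rank_2x2(zero), rank_2x2(nilpotent))          # the two ranks
```

Both spectra reduce to the single point 0, while the ranks differ, so the two operators share a spectrum without being similar.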
Remark. Clearly, the spectrum of an operator can be characterized "in a scientific way" as the set of those $\lambda$ for which the sequence
$$\cdots \longleftarrow 0 \longleftarrow E \xleftarrow{\;T - \lambda\mathbf{1}\;} E \longleftarrow 0 \longleftarrow \cdots$$
fails to be exact. This is the "embryo" of the approach to the definition of a multidimensional analogue of the notion of spectrum, the so-called joint spectrum of a system of $n$ commuting operators in a Banach space. Such a definition (crowning many years of search for the "right" definition of joint spectrum) was found in 1970 by J. Taylor in terms of the absence of the exactness of other, more complicated sequences (with $n + 1$ non-zero terms). Taylor's theory and its subsequent development are described in [76] and in the more recent book [77].
The following observations often help to find out to which part of the spectrum a given point belongs.

Proposition 2. Suppose $\lambda \in \sigma(T)$ and there is $c > 0$ such that $\|Tx - \lambda x\| \ge c\|x\|$ for all $x \in E$ (in other words, $T - \lambda\mathbf{1}$ is topologically injective). Then $\lambda \in \sigma_r(T)$.

Proof. This directly follows from the fact that the image of a topologically injective operator between Banach spaces is closed. ∎

From this we immediately have

Proposition 3. Suppose $\lambda \in \sigma_c(T)$. Then there exists a sequence $x_n \in E$; $\|x_n\| = 1$ such that $Tx_n - \lambda x_n \to 0$ as $n \to \infty$. ∎

The points of the spectrum can be classified in other ways, and the most important of these ways is apparently as follows. A point $\lambda$ is called an essentially singular point of an operator $T$ if $T - \lambda\mathbf{1}$ is a non-bijective operator that is not even a Fredholm operator. The subset in $\sigma(T)$ consisting of essentially singular points of the operator $T$ is called the essential spectrum of $T$ and is denoted by $\sigma_e(T)$. Thus, $\sigma(T)$ is divided into two parts, $\sigma_e(T)$ and its complement, according to the "remoteness of $T - \lambda\mathbf{1}$ from a bijective operator." Note that Proposition 3.5.1 gives the inclusion $\sigma_c(T) \subset \sigma_e(T)$.
It is easy to show (do this) that the essential spectrum is also an invariant of topological equivalence.

Exercise 1. The spectrum of an operator $T : E \to E$ coincides with the spectrum of the adjoint operator $T^* : E^* \to E^*$, i.e., $\sigma(T^*) = \sigma(T)$. In addition,
(i) if $\lambda \in \sigma_r(T)$, then $\lambda \in \sigma_p(T^*)$;
(ii) if $\lambda \in \sigma_p(T)$, then either $\lambda \in \sigma_p(T^*)$ or $\lambda \in \sigma_r(T^*)$;
(iii) if $\lambda \in \sigma_c(T)$, then either $\lambda \in \sigma_c(T^*)$ or $\lambda \in \sigma_r(T^*)$; moreover, the latter inclusion is not possible if $E$ is reflexive.

Hint. The operation $(^*)$, being a functor after all, preserves topological isomorphisms. Therefore, $\sigma(T) \supset \sigma(T^*)$. Put $S := T - \lambda\mathbf{1}$. (i) If $x$ does not belong to the closure of $\mathrm{Im}(S)$, then $S^*f = 0$ for each $f$ such that $f|_{\mathrm{Im}(S)} = 0$ and $f(x) \ne 0$. (ii) If $Sx = 0$ for $x \ne 0$, and $f(x) \ne 0$, then $f$ does not belong to the closure of $\mathrm{Im}(S^*)$. (iii) If $\mathrm{Im}(S)$ is dense in $E$, then $S^*$ is injective. Hence, if it is surjective, then $S^{**}$ (as well as $S^*$) is a topological isomorphism. Then Proposition 2.5.2 works. Combining all this we see that $\sigma(T) = \sigma(T^*)$.${}^1$

${}^1$ However, taking into account the previous remark in small print, advanced readers may note that the coincidence of $\sigma(T)$ and $\sigma(T^*)$ is a special case of Exercise 2.5.10.

The reader who has done Exercises 3.5.8 and 3.5.9 will conclude from them that the essential spectra of $T$ and $T^*$ also coincide: $\sigma_e(T^*) = \sigma_e(T)$.

Certainly, for $T = \lambda\mathbf{1}$; $\lambda \in \mathbb{C}$ we have $\sigma(T) = \sigma_p(T) = \{\lambda\}$, and the same is true for $\sigma_e(T)$, provided $E$ is infinite-dimensional. Somewhat more interesting is the following

Proposition 4. Let $E_0$ and $E_1$ be non-zero closed subspaces in $E$, and $P$ a projection onto $E_0$ along $E_1$. Then $\sigma(P) = \sigma_p(P) = \{0, 1\}$, $0 \notin \sigma_e(P) \iff \dim E_1 < \infty$, and $1 \notin \sigma_e(P) \iff \dim E_0 < \infty$.

Proof. If we look at the action of $P$ on vectors of our subspaces, we shall conclude that $0, 1 \in \sigma_p(P)$. If $\lambda \ne 0, 1$, then we put $Q := \mathbf{1} - P$, and after that $(1 - \lambda)^{-1}P - \lambda^{-1}Q$ will be the inverse operator for $P - \lambda\mathbf{1}$. The rest is clear. ∎

The next result is not so simple.

Theorem 1. Let $T : E \to E$ be a compact operator acting in an infinite-dimensional space. Then $\sigma(T)$ consists of 0 and an at most countable set of eigenvalues for which 0 is the only possible limit point (if this set is infinite). Moreover, for each non-zero eigenvalue the subspace of the corresponding eigenvectors is finite-dimensional. Finally, $\sigma_e(T) = \{0\}$.

Proof. From the fact that $T$ is not a Fredholm operator (Proposition 3.5.2) it follows that $0 \in \sigma_e(T)$, and hence $0 \in \sigma(T)$. Now take $\lambda \in \sigma(T) \setminus \{0\}$. Then $T - \lambda\mathbf{1}$, as well as $\mathbf{1} - \lambda^{-1}T = -\lambda^{-1}(T - \lambda\mathbf{1})$, does not have an inverse operator. Thus, one of the two possibilities is true: either $\mathrm{Ker}(\mathbf{1} - \lambda^{-1}T) \ne 0$, or $\mathrm{Im}(\mathbf{1} - \lambda^{-1}T) \ne E$. But $\lambda^{-1}T$ is compact; hence, it follows from the Fredholm alternative that the second possibility implies the first. From this we see that the common kernel of the operators $T - \lambda\mathbf{1}$ and $\mathbf{1} - \lambda^{-1}T$ cannot vanish, which means that $\lambda$ is an eigenvalue of $T$. Further, the space of the corresponding eigenvectors, clearly, coincides with this kernel. Therefore, by Theorem 3.5.2 this kernel is finite-dimensional. By the same theorem, $\lambda \notin \sigma_e(T)$.

It remains to prove that $\sigma(T)$ is at most countable, and 0 is its only possible limit point. Assume that at least one of these statements is not true. Then there exists a sequence $\lambda_n$ of points in the spectrum that tends to some $\theta \ne 0$; without loss of generality we can assume that $|\lambda_n| > |\theta|/2$ for all $n$, and all $\lambda_n$ are distinct. We already know that all the $\lambda_n$ are eigenvalues. Suppose $Tx_n = \lambda_n x_n$ for $x_n \ne 0$; $n = 1, 2, \dots$; then, as we remember from linear algebra, the system $x_n$ is linearly independent. Put $E_n := \mathrm{span}\{x_1, \dots, x_n\}$; clearly, it is an invariant subspace of the operator $T$. Using the lemma on the near-perpendicular, we choose, for every $n > 1$, a vector $y_n \in E_n$ such that $\|y_n\| = 1$ and $d(y_n, E_{n-1}) > 1/2$. Since $\mathrm{codim}_{E_n} E_{n-1} = 1$, we have $y_n = \mu_n x_n + z_n$ for some $\mu_n \in \mathbb{C} \setminus \{0\}$ and $z_n \in E_{n-1}$. Therefore, for every $k = 1, \dots, n-1$ we have $Ty_n - Ty_k = \mu_n\lambda_n x_n + Tz_n - Ty_k = \lambda_n y_n - u_n$, where $u_n := \lambda_n z_n - Tz_n + Ty_k \in E_{n-1}$. Hence, $\|Ty_n - Ty_k\| = |\lambda_n|\,\|y_n - \lambda_n^{-1}u_n\| > |\theta|/4$.

Thus, $T$ maps the bounded set $\{y_n;\ n \in \mathbb{N}\}$ to a set which is not totally bounded. This contradicts the compactness of $T$. ∎
* * *
We now look at our standard examples of operators and find their spectra. At this point we restrict ourselves to the questions lying on the surface. Some things that we could have proved right now, but without much elegance, we will obtain afterwards as immediate applications of some general results.

Exercise 2. Let $T_\mu$; $\mu = (\mu_1, \mu_2, \dots)$ be a diagonal operator on $l_p$; $1 \le p \le \infty$. Then $\sigma(T_\mu)$ is the closure of the set $\{\mu_n\}$, and $\sigma_p(T_\mu)$ is this set itself. Furthermore, $\sigma(T_\mu) \setminus \sigma_p(T_\mu)$ is equal to $\sigma_c(T_\mu)$ for $p < \infty$, and to $\sigma_r(T_\mu)$ for $p = \infty$. Finally (for all $p$), $\sigma_e(T_\mu)$ is the set of limit points of the sequence $\mu$.
Hint. Put $S := T_\mu - \lambda 1$. If $\lambda \ne \mu_n$ for every $n$, then the subspace $c_{00}$ lies in $\mathrm{Im}(S)$ (and is dense in $l_p$ for $p < \infty$). If $\lambda$ is a limit point of the sequence $\mu$, then $\|S(p^n)\|$ can be as small as we wish, and in the case of $l_\infty$ we have $d(\xi, \mathrm{Im}(S)) \ge 1$ for $\xi := (1, 1, 1, \dots)$. The rest is clear, taking Exercises 1.4.3 and 3.5.2 into account.

Exercise 3. Let $(X, \mu)$ be a measure space, and $T_f$ the operator of multiplication by $f \in L_\infty(X, \mu)$ in $L_p(X, \mu)$; $1 \le p \le \infty$. Then $\sigma(T_f)$ is the set $f(X)_{ess}$ of essential values of the function $f$ (i.e., such numbers $\lambda$ that for every $\varepsilon > 0$ the measure of the set $\{t \in X : |f(t) - \lambda| < \varepsilon\}$ is positive). Further, in the notation $Z_\lambda := \{t \in X : f(t) = \lambda\}$ we have $\sigma_p(T_f) = \{\lambda : \mu(Z_\lambda) > 0\}$, and $\sigma(T_f) \setminus \sigma_p(T_f)$ coincides with $\sigma_c(T_f)$ for $p < \infty$ and with $\sigma_r(T_f)$ for $p = \infty$. Finally, if $(X, \mu)$ is an interval $[a, b]$ with the usual Lebesgue measure, then $\sigma_e(T_f)$ coincides with all of $\sigma(T_f)$. In particular (and, apparently, this is the most instructive case), for the operator of multiplication by the independent variable in $L_2[a, b]$ we have $\sigma(T_f) = \sigma_c(T_f) = \sigma_e(T_f) = [a, b]$.

Hint. If $\mu(Z_\lambda) > 0$, then for every $g \in L_p(X, \mu) \setminus \{0\}$ vanishing outside $Z_\lambda$, we have $T_f g = \lambda g$. If $\mu(Z_\lambda) = 0$, then $S := T_f - \lambda 1$ is injective. However, if in addition $\lambda \in f(X)_{ess}$, then, taking the sets $Y_n := \{t \in X : |f(t) - \lambda| < 1/n\}$ and their characteristic functions $\chi_n$, we see that $S(\chi_n / \|\chi_n\|) \to 0$ for $n \to \infty$. Moreover, if $p < \infty$, then the subspace $\bigcup_{n=1}^{\infty} \{g \in L_p(X, \mu) : g|_{Y_n} = 0\}$ lies in $\mathrm{Im}(S)$ and is dense in $L_p(X, \mu)$. If, on the contrary, $p = \infty$, then $d(h, \mathrm{Im}(S)) \ge 1$ for $h(t) \equiv 1$. The rest is clear, taking Exercises 1.4.3 and 3.5.2 into account.

Exercise 4. The point spectrum of the operator of left shift in $l_p$; $1 \le p \le \infty$ coincides with $\mathbb{D}^0$ for $p < \infty$, and with $\mathbb{D}$ for $p = \infty$. The residual spectrum of the operator of right shift in $l_p$; $1 \le p \le \infty$ coincides with $\mathbb{D}^0$ for $1 < p < \infty$, and with $\mathbb{D}$ for $p = 1$. If $p = \infty$, the residual spectrum of the right shift contains $\mathbb{D}^0$. Finally, the point spectrum of the right shift is empty for all $p$.

Hint. If $\xi = (1, \lambda, \lambda^2, \dots)$ lies in $l_p$, then $T_l \xi = \lambda \xi$. If the same $\xi$ defines a functional $f_\xi : \eta \mapsto \sum_{n=1}^{\infty} \xi_n \eta_n$ on $l_p$, then every vector in $\mathrm{Im}(T_r - \lambda 1)$ belongs to the kernel of this functional. The rest follows from Exercise 1. (Certainly, all that we said in Exercises 2 and 4 about $l_p$; $1 < p < \infty$ is true for $c_0$ as well; check this!)

Remark. These exercises already show that each possibility indicated in conditions (ii) and (iii) of Exercise 1 can indeed occur (give the corresponding examples).
The following general observation often helps in studying spectra.
Exercise 5 (a generalization of Proposition 1). Suppose that for $S : E \to E$ and $T : F \to F$ the operators $S - \lambda 1$ and $T - \mu 1$; $\lambda, \mu \in \mathbb{C}$ are weakly topologically equivalent. Then $\lambda \in \sigma(S) \iff \mu \in \sigma(T)$, and the same is true for $\sigma_p$, $\sigma_c$, $\sigma_r$, or $\sigma_e$ in place of $\sigma$.

Hint. Look at the diagram in the definition of weak topological equivalence.
Here is a typical application.

Exercise 6. Let $T$ be any of the shift operators acting on $l_p$, $l_p(\mathbb{Z})$, or (for $a \ne 0$) in $L_p(\mathbb{R})$; here we assume that $1 \le p \le \infty$. Then for each $\lambda \in \mathbb{T}$ the operators $T - 1$ and $T - \lambda 1$ are weakly topologically equivalent. As a corollary, each of the sets $\sigma(T)$, $\sigma_p(T)$, $\sigma_c(T)$, $\sigma_r(T)$, and $\sigma_e(T)$ either contains the entire $\mathbb{T}$ or is disjoint from $\mathbb{T}$.

Hint. In the case of $T_l : l_p \to l_p$ the desired equivalence is implemented by the pair of diagonal operators defined by the sequences $(\lambda, \lambda^2, \dots)$ and $(\lambda^2, \lambda^3, \dots)$. Similar pairs work for $T_r$ and $T_b$. In the case of $T_a : L_p(\mathbb{R}) \to L_p(\mathbb{R})$ the desired pair consists of the operators of multiplication by the exponentials $e^{ict}$ with parameters $c$ depending on $\lambda$ and $a$.
(Which alternative actually occurs for each concrete shift will be clarified in Section 3. However, we already know something from Exercise 4, and will learn more in the next exercises.)

Exercise 7*. The residual spectrum of the right shift operator on $l_\infty$ also contains the entire $\mathbb{D}$.

Hint. For every $\eta \in \mathrm{Im}(T_r - 1)$ we have $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \eta_k = 0$. As a corollary, the distance from $(1, 1, 1, \dots) \in l_\infty$ to $\mathrm{Im}(T_r - 1)$ is 1.
Exercise 8. If $p = \infty$, then the point spectrum of the shift operators on $l_p(\mathbb{Z})$ and (for $a \ne 0$) in $L_p(\mathbb{R})$ contains $\mathbb{T}$. If $p = 1$, then the residual spectrum of the same operators contains $\mathbb{T}$.

Hint. If $p = 1$, then $\mathrm{Im}(T_b - 1)$ is contained in the kernel of the functional $\xi \mapsto \sum_{n=-\infty}^{\infty} \xi_n$, and $\mathrm{Im}(T_a - 1)$ in the kernel of $x \mapsto \int_{-\infty}^{\infty} x(t)\,dt$. The rest is clear.
Exercise 9°. The point spectrum of the shift operator by $a \in \mathbb{T}$ in $L_p(\mathbb{T})$ contains all numbers $a^n$; $n \in \mathbb{Z}$.
Later (see Exercises 3.3-3.6 and 3.9), we will give further information about the spectra of concrete operators. However, from Theorem 1 we already know a great deal about the integral operators, since they are compact.
2. Something from algebra: Algebras
To understand better the nature and the role of spectra, it is useful to find out which of their properties are related to analysis, and which have a purely algebraic character. (We have to "separate the structures", as I. M. Gelfand² used to say in his famous seminar.) Let us first look at spectra from the very general viewpoint of abstract algebra.

Definition 1. Let $A$ be a linear space. A bilinear operator $m : A \times A \to A$ is called a multiplication if it satisfies the associativity identity $(ab)c = a(bc)$, where the notation $ab$ is used instead of $m(a, b)$. A linear space endowed with a multiplication (formally speaking, a pair $(A, m)$ consisting of a linear space and a multiplication on it) is called a complex associative algebra, or, since we do not consider others, just an algebra.
Associativity implies that the product $a_1 a_2 \cdots a_n$ does not depend on the way we arrange parentheses in it.

The simplest example of an algebra is, of course, $\mathbb{C}$, and the main example in this book is the space $\mathcal{B}(E)$ of all bounded operators in a Banach space $E$, equipped with the operator composition as multiplication. But the algebraists have their own favorite toys. The most important of them (and the one of much use for us) is the algebra of polynomials $\mathbb{C}[t]$ in a formal variable $t$ (the "$\mathbb{C}$" means that we speak about polynomials with complex coefficients). The set $M_n$ of matrices of size $n \times n$ with the usual matrix multiplication is also an algebra, and it is very important both in algebra and analysis. Another instructive example is the algebra $\mathbb{C}^X$ of all complex-valued functions on an (arbitrary) set $X$ with pointwise multiplication.

We say that two elements $a, b$ of a given algebra commute if $ab = ba$. The algebra is called commutative if all its elements commute. (Show that $\mathcal{B}(E)$ is commutative $\iff \dim E = 1$.)

Definition 2. An element $1_A$ of an algebra $A$ (also denoted by $1$ if there is no danger of confusion) is called a unity of this algebra if $a1 = 1a = a$ for every $a \in A$. If an algebra has a unity, it is said to be unital.
All the above-mentioned examples are, of course, unital; in particular, the unity in $\mathcal{B}(E)$ is the identity operator. But there are algebras without unity; the algebra of matrices of the form $\begin{pmatrix} 0 & \lambda \\ 0 & 0 \end{pmatrix}$; $\lambda \in \mathbb{C}$ is one of them.

Definition 3. Let $A$ be a unital algebra and $a \in A$. An element $a_l^{-1}$ (respectively, $a_r^{-1}$) in $A$ is called a left (right) inverse to $a$ if $a_l^{-1} a = 1$
² I. M. Gelfand (born in 1913), one of the most prominent mathematicians of our time. His enormous contribution to science includes the creation of the theory of Banach algebras.
(respectively, $a a_r^{-1} = 1$). An element $a^{-1}$ is called (just) an inverse to $a$ if $a^{-1} a = a a^{-1} = 1$. An element that has an inverse is called invertible.

Proposition 1 (resembling Proposition 0.5.1). If an element $a \in A$ has a left inverse $a_l^{-1}$ and a right inverse $a_r^{-1}$, then $a$ is invertible and $a^{-1} = a_l^{-1} = a_r^{-1}$. As a corollary, an element of an algebra can have at most one inverse element.

Proof. It is sufficient to put parentheses in the expression $a_l^{-1} a a_r^{-1}$ in two different ways. ∎
However, sometimes an element may have many left (or right) inverse elements, but no "real" inverse element. Please give an example, using, say, shifts in $l_2$.

Proposition 2. Suppose $a_1, \dots, a_n \in A$. Then
(i) if all of them are invertible, then the same is true for $a_1 a_2 \cdots a_n$;
(ii) if all of them commute, then the converse statement is also true: the invertibility of $a_1 a_2 \cdots a_n$ implies the invertibility of all factors.
Proof. (i) Clearly, $a_n^{-1} a_{n-1}^{-1} \cdots a_1^{-1}$ is the inverse to $a_1 a_2 \cdots a_n$.
(ii) We put $b := (a_1 a_2 \cdots a_n)^{-1}$. Then, by commutativity, we have $(b a_1 \cdots a_{k-1} a_{k+1} \cdots a_n) a_k = 1$ and $a_k (a_1 \cdots a_{k-1} a_{k+1} \cdots a_n b) = 1$ for all $k = 1, \dots, n$. It remains to apply Proposition 1. ∎
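Part (i) of Proposition 2 is easy to test in the algebra $M_2$. The following is only a minimal sketch under our own assumptions: the matrices `a1`, `a2`, `a3` and the helper names `mul` and `inv` are chosen for illustration and do not come from the text. Exact rational arithmetic is used so that the comparison with the unity is not affected by rounding.

```python
from fractions import Fraction as F

def mul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(x):
    """Inverse of a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (p, q), (r, s) = x
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))

ONE = ((F(1), F(0)), (F(0), F(1)))  # the unity of M_2

# Three invertible (and non-commuting) elements, chosen arbitrarily.
a1 = ((F(1), F(2)), (F(0), F(1)))
a2 = ((F(3), F(0)), (F(1), F(1)))
a3 = ((F(2), F(1)), (F(1), F(1)))

product = mul(mul(a1, a2), a3)                        # a1 a2 a3
reversed_inverses = mul(mul(inv(a3), inv(a2)), inv(a1))  # a3^{-1} a2^{-1} a1^{-1}

# The reversed product of inverses is a two-sided inverse of the product.
print(mul(product, reversed_inverses) == ONE)
print(mul(reversed_inverses, product) == ONE)
```

Note that the order of the inverses is reversed; since the chosen matrices do not commute, the same-order product of inverses would not, in general, do the job.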
Here is the general notion of spectrum we promised before.

Definition 4. Let $A$ be a unital algebra. A complex number $\lambda$ is called a regular point of an element $a \in A$ if the element $a - \lambda 1$ is invertible; otherwise, it is called a singular point of $a$. The set of all singular points of an element $a$ is called the spectrum of this element and is denoted by $\sigma_A(a)$ or just $\sigma(a)$.
We see that the spectrum of a bounded operator on a Banach space E defined in the previous section is precisely the spectrum of it as an element of the algebra B(E) . Remark. One can also define the spectrum of an element of an arbitrary, not necessarily unital, algebra; this can be done using the procedure of the so-called unitization (adjoining a unity) ; see, e.g., [25] or [79] . But we will not need this. What subsets on the complex plane can be the spectra of elements of algebras?
Example 1. The spectrum of an element of the algebra $\mathbb{C}^X$, i.e., the spectrum of a given function on $X$, is, clearly, the set of values of this function. Thus, if $X$ has the continuum cardinality, then every non-empty subset in $\mathbb{C}$ is the spectrum of some element of this algebra.

Example 2. In the algebra $\mathbb{C}(t)$ of rational functions of a formal variable $t$ every element has empty spectrum unless it is a multiple of the identity. The same is certainly true for the algebra of meromorphic functions on the plane (recall complex analysis), and moreover, for every algebra where each non-zero element has an inverse.
These examples already show that every subset of the complex plane, including the empty one, can be the spectrum of an element of an algebra. In the next section we will see that this "algebraic permissiveness" disappears in functional analysis.

Exercise 1. If $a$ and $b$ are elements of a unital algebra $A$ and $a$ is invertible, then $\sigma(ab) = \sigma(ba)$.

Hint: $ba - \lambda 1 = a^{-1}(ab - \lambda 1)a$.

* * *
As we have already said many times, after introducing a new structure, this time the structure of an algebra, we have to introduce the class of mappings consistent with this structure.

Definition 5. A linear operator $\varphi : A \to B$ between two algebras is called a homomorphism if $\varphi(ab) = \varphi(a)\varphi(b)$ for all $a, b \in A$. If in addition both algebras are unital, then our homomorphism is called unital whenever $\varphi(1_A) = 1_B$.

We now formulate the following obvious result.

Proposition 3. If $\varphi : A \to B$ is a unital homomorphism and $a \in A$ is invertible, then $\varphi(a) \in B$ is also invertible. ∎
Algebras and their homomorphisms form a category denoted by Alg (with the obvious composition law and local identities). Clearly, isomorphisms in Alg (called isomorphisms of algebras) can be characterized as bijective homomorphisms. The classical example is the isomorphism between $\mathcal{B}(E)$ with $\dim E = n$ and $M_n$ assigning to every operator its matrix in a given basis.

The structure of the category Alg is much more complicated than that of Lin or even Gr. If you are curious, you can do the following

Exercise 2*. Monomorphisms in Alg are injective homomorphisms. On the other hand, not every epimorphism is surjective; at the same time the extreme epimorphisms
are precisely the surjective ones. There are fewer extreme monomorphisms than arbitrary monomorphisms. Every family of objects in Alg has a product and a coproduct.

Hint. The embedding $\mathrm{in} : \mathbb{C}[t] \to \mathbb{C}(t)$ is an epimorphism, and at the same time a (non-extreme) monomorphism. Products as sets are Cartesian products, while coproducts are the so-called free products, resembling free coproducts of groups.
We recall the standard fact that in "categorically designed" areas of mathematics valuable information on the structure of an object can be obtained by studying morphisms of this object into some simplest, or at least "well-understood", objects of the category. In the category Lin such a role belongs, of course, to functionals, and in Ban (due to the Hahn-Banach theorem) to bounded functionals. As for Alg, the following classes of morphisms are very useful.

Definition 6. A homomorphism from an algebra $A$ to $\mathbb{C}$ is called a character of $A$. A homomorphism from $A$ to $\mathcal{B}(E)$ is called a representation of $A$ in the Banach space $E$.

Since the algebra $\mathcal{B}(E)$ for a one-dimensional $E$ coincides (up to an isomorphism of algebras) with $\mathbb{C}$, it is natural to regard characters as a special case of representations.

Remark. Characters are particularly effective in the study of commutative algebras. But if an algebra is "far from being commutative", then, generally speaking, its characters are of little use. (For example, you can verify that the algebra $M_n$ with $n > 1$ has no non-zero characters at all.) The fundamental role in the study of algebras passes to representations; for details see, e.g., [78] or, in the context of functional analysis, [25].

Under the action of homomorphisms spectra change in the following way.

Proposition 4. Let $\varphi : A \to B$ be a unital homomorphism between unital algebras. Then for every $a \in A$ we have $\sigma_B(\varphi(a)) \subset \sigma_A(a)$.
Proof. If $\lambda \in \mathbb{C}$ is a regular point for $a$, then, by Proposition 3, $\varphi(a) - \lambda 1_B = \varphi(a - \lambda 1_A)$ is invertible, and thus $\lambda$ is a regular point for $\varphi(a)$. The rest is clear. ∎
Certainly, the spectrum of an element may be preserved under the action of a homomorphism, or it may become smaller (give examples).

* * *
We were talking about homomorphisms from arbitrary algebras to some standard algebras. Now let us consider an important class of homomorphisms from a standard algebra to arbitrary ones.
Definition 7. Let $p(t) = c_0 + c_1 t + \cdots + c_n t^n$ be a polynomial (i.e., an element of the algebra $\mathbb{C}[t]$), $A$ a unital algebra, and $a \in A$. Then the element $c_0 1 + c_1 a + \cdots + c_n a^n \in A$ is called the value of the polynomial $p$ on $a$, or just the polynomial $p$ of $a$, and is denoted by $p(a)$. The mapping $\gamma_p : \mathbb{C}[t] \to A : p \mapsto p(a)$ (which is clearly a unital homomorphism) is called the polynomial calculus of $a$ in $A$.
(The subscript in the notation $\gamma_p$ indicates that later we shall come across some other "calculi".)

The polynomial calculus interacts well with spectra. For a complex-valued function $f$ defined on a set $M \subset \mathbb{C}$, we denote the set $\{f(\lambda) : \lambda \in M\}$ briefly by $f(M)$. In particular, speaking about the set $p(M)$; $p \in \mathbb{C}[t]$, we regard $p$ as the corresponding function of a complex variable.

Theorem 1 (Spectral mapping law for polynomial calculus). Let $A$ be a unital algebra, $a \in A$ and $p \in \mathbb{C}[t]$. Then $\sigma(p(a)) = p(\sigma(a))$.
Proof. Take $\mu \in \mathbb{C}$. By the fundamental theorem of algebra, the polynomial $p(t) - \mu$ can be decomposed into a product $c(t - \lambda_1) \cdots (t - \lambda_n)$, where $\lambda_1, \dots, \lambda_n$ are the roots of this polynomial, $n$ its degree, and $c \ne 0$. Since the polynomial calculus is a homomorphism, the element $p(a) - \mu 1$ can be decomposed into the product $c(a - \lambda_1 1) \cdots (a - \lambda_n 1)$. Since all $a - \lambda_k 1$; $k = 1, \dots, n$ commute, Proposition 2 shows that invertibility of $p(a) - \mu 1$ is equivalent to invertibility of all $a - \lambda_k 1$. Therefore, $\mu \in \sigma(p(a)) \iff \lambda_k \in \sigma(a)$ for at least one $k$ $\iff \lambda \in \sigma(a)$ for at least one root $\lambda$ of the polynomial $p(t) - \mu$. The last condition, certainly, means that $\mu \in p(\sigma(a))$. ∎
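The spectral mapping law can be checked numerically in the algebra $M_2$, where the spectrum of an element is the set of eigenvalues of the matrix. The code below is only an illustrative sketch: the matrix `a`, the polynomial `p` and all helper names are our own assumptions, and the eigenvalues are found from the trace and the determinant by the quadratic formula.

```python
import cmath

def mul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def poly_at_matrix(coeffs, a):
    """Value c0*1 + c1*a + ... + cn*a^n, computed by the Horner scheme."""
    result = [[0, 0], [0, 0]]
    for c in reversed(coeffs):
        result = mul(result, a)
        result[0][0] += c   # adding c times the unity
        result[1][1] += c
    return result

def poly_at_number(coeffs, z):
    """Value of the same polynomial on a complex number z."""
    result = 0
    for c in reversed(coeffs):
        result = result * z + c
    return result

def spectrum(x):
    """Eigenvalues of a 2x2 matrix, i.e. its spectrum as an element of M_2."""
    trace = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace - root) / 2, (trace + root) / 2]

def same_set(zs, ws, tol=1e-9):
    """True if each point of zs is tol-close to a point of ws, and vice versa."""
    return (all(any(abs(z - w) < tol for w in ws) for z in zs)
            and all(any(abs(z - w) < tol for z in zs) for w in ws))

p = [0, -1, 1]          # p(t) = t^2 - t, listed as c0, c1, c2
a = [[2, 1], [1, 3]]    # an arbitrary illustrative matrix

lhs = spectrum(poly_at_matrix(p, a))               # sigma(p(a))
rhs = [poly_at_number(p, z) for z in spectrum(a)]  # p(sigma(a))
print(same_set(lhs, rhs))
```

The comparison is made up to a small tolerance, since the eigenvalues here are irrational and are computed in floating point.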
Here are some of the simplest examples showing how this theorem works. An element $a$ of an algebra is called idempotent if $a^2 = a$, and nilpotent if $a^n = 0$ for some $n$.

Proposition 5. Let $A$ be a unital algebra, and $a \in A$. Then
(i) if $a$ is idempotent and different from $0$ and $1$, then $\sigma(a) = \{0, 1\}$;
(ii) if $a$ is nilpotent, then $\sigma(a) = \{0\}$.

Proof. (i) Since $p(a) = 0$ for $p(t) := t^2 - t$, for such $p$ we have $\sigma(p(a)) = \{0\}$. Hence, by Theorem 1, $\lambda \in \sigma(a)$ implies $\lambda^2 - \lambda = 0$, and consequently, $\sigma(a)$ cannot contain numbers other than $0$ or $1$. Moreover, the equality $a(a - 1) = 0$ and the fact that both factors are non-zero evidently imply that they are both non-invertible. The rest is clear.
(ii) Now $p(a) = 0$ for another polynomial, namely, $p(t) := t^n$. Hence, by the same theorem, $\lambda \in \sigma(a)$ implies $\lambda^n = 0$. On the other hand, $0 \in \sigma(a)$, since an invertible $a$ would have an invertible power $a^n$. ∎
In particular, we obtained a new and almost instantaneous proof of Proposition 1.4 concerning the structure of the spectrum of a projection.

Up to now we were considering polynomials of elements of algebras. Now, to speak informally, we shall take the function $1/t$ of these elements.

Proposition 6. Let $a$ be an invertible element of a unital algebra $A$. Then $\sigma(a^{-1}) = \{\lambda^{-1} : \lambda \in \sigma(a)\}$. (In other words, $\sigma(a^{-1}) = f(\sigma(a))$, where $f(\lambda) = 1/\lambda$.)

Proof. Clearly, $\sigma(a^{-1})$, as well as $\sigma(a)$, does not contain $0$. Further, for $\lambda \ne 0$ we have $a^{-1} - \lambda 1 = -\lambda a^{-1}(a - \lambda^{-1} 1)$. Therefore, taking into account Proposition 2, we see that $a^{-1} - \lambda 1$ is invertible $\iff a - \lambda^{-1} 1$ is invertible. The rest is clear. ∎
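Proposition 6 can also be observed numerically in $M_2$. The following sketch is written under our own assumptions (the matrix `a` and the helper names are hypothetical and chosen for illustration): it compares the eigenvalues of $a^{-1}$ with the reciprocals of the eigenvalues of $a$.

```python
import cmath

def spectrum(x):
    """Eigenvalues of a 2x2 matrix, i.e. its spectrum as an element of M_2."""
    trace = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace - root) / 2, (trace + root) / 2]

def inverse(x):
    """Inverse of a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (p, q), (r, s) = x
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def same_set(zs, ws, tol=1e-9):
    """True if each point of zs is tol-close to a point of ws, and vice versa."""
    return (all(any(abs(z - w) < tol for w in ws) for z in zs)
            and all(any(abs(z - w) < tol for z in zs) for w in ws))

a = [[2.0, 1.0], [1.0, 3.0]]   # an arbitrary invertible matrix (determinant 5)

lhs = spectrum(inverse(a))            # sigma(a^{-1})
rhs = [1 / z for z in spectrum(a)]    # {1/lambda : lambda in sigma(a)}
print(same_set(lhs, rhs))
```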
One can combine Theorem 1 and Proposition 6 as follows.

Exercise 3. Let $r(t) = p(t)/q(t)$; $p, q \in \mathbb{C}[t]$ be a rational function of the formal variable $t$ (we assume that the polynomials $p(t)$ and $q(t)$ are relatively prime), and $a$ an element of a unital algebra $A$ such that the spectrum of $a$ does not contain the poles of the function $r(t)$ (i.e., the roots of the polynomial $q(t)$). Then for the element $r(a) := p(a)/q(a)$ (which exists by Theorem 1)³ we have
$$\sigma(r(a)) = r(\sigma(a)),$$
where $r$ on the right-hand side is viewed as a function of a complex variable.

Remark. Let $a$ be an element of a unital algebra $A$. Then it is easy to verify that the subset $\mathbb{C}_a(t)$ in $\mathbb{C}(t)$ consisting of all rational functions with all poles outside $\sigma(a)$ is an algebra with respect to the same operations as in $\mathbb{C}(t)$. Assigning to every $r \in \mathbb{C}_a(t)$ the element $r(a) \in A$ considered in the previous exercise, we obtain a homomorphism $\gamma_r : \mathbb{C}_a(t) \to A$ extending (as a map) the polynomial calculus. It is called the rational calculus of the element $a$. The meaning of Exercise 3 is that this more general calculus also satisfies the spectral mapping law.

The rational calculus is probably the most general "functional calculus" in pure algebra. In the next section we pass from algebra to analysis, and this will allow us to speak about a wider class of functions of elements of (this time not "pure") algebras.
We introduce several general notions.

Definition 8. Let $A$ be an algebra. A subspace $B$ in $A$ is called a subalgebra of $A$ if $a, b \in B$ implies $ab \in B$. A subspace $I$ in $A$ is called a two-sided ideal or just an ideal in $A$ if $a \in I$ implies $ab, ba \in I$ for every $b \in A$.

Certainly, every subalgebra $B$ in $A$ is an algebra with respect to the multiplication inherited in the obvious way from $A$. Further, every ideal is a subalgebra. Finally, if the algebra is unital, then from the definition of the ideal it follows that $1 \in I \iff I = A$. Hence, if $A$ is not one-dimensional, then $\mathrm{span}\{1\}$ is the simplest example of a subalgebra which is not an ideal.

³ The element $r(a)$ can be called "the value of the rational function $r$ on the element $a$".
In every commutative algebra $A$ the subset $\{a a_0 ; a \in A\}$, where $a_0$ is a chosen element in $A$, is an ideal. (Using the Euclidean algorithm, show that there are no other ideals in $\mathbb{C}[t]$.) If $X$ is an arbitrary set, and $Y$ a subset in $X$, then the set $\{f \in \mathbb{C}^X : f|_Y = 0\}$ is an ideal in $\mathbb{C}^X$. (Note that in the case when $X$ is infinite, there are other ideals; find some of them.)

In classical functional analysis the following example is one of the most important.

Example 3. The set $\mathcal{K}(E)$ consisting, as we remember, of compact operators on a Banach space $E$, is an ideal in the algebra $\mathcal{B}(E)$ of all bounded operators on $E$. This is just the "scientific form" of Proposition 3.3.5.

Let us also pay attention to the set $\mathcal{K}_+(E) := \{\lambda 1 + T : T \in \mathcal{K}(E)\}$. It is certainly a subalgebra in $\mathcal{B}(E)$ ("the subalgebra of compact perturbations of the scalar operators"). As we already discussed in Section 3.5, for all concrete Banach spaces mentioned in this book, $\mathcal{K}_+(E)$ is strictly smaller than $\mathcal{B}(E)$, but whether it is always true is a well-known open problem.
But we digressed from our main theme here. Let us go back to pure algebra and introduce one more notion we shall need later.

Let $A$ be an algebra, and $I$ an ideal in $A$; consider the quotient space $A/I$. Take $a, b \in A$ and choose arbitrary elements $a', b'$ in the cosets $a + I$ and $b + I$. Then $ab - a'b' = (a - a')b + a'(b - b')$ belongs to $I$. This means that the coset of the product $ab$ modulo $I$ does not change if we replace $a$ and $b$ with some other representatives of their respective cosets. Thus, by assigning to every pair $(a + I, b + I)$ the coset $ab + I$, we define a mapping from $A/I \times A/I$ to $A/I$. Clearly, it has all the properties of a multiplication in $A/I$.

Definition 9. The space $A/I$ endowed with the indicated multiplication is called the quotient algebra of the algebra $A$ by the ideal $I$.
Obviously, the natural projection $\mathrm{pr} : A \to A/I$ is a homomorphism of algebras. Furthermore, if $A$ is unital, then $A/I$ is also unital: $1_A + I$ is its identity.

The example of a quotient algebra most important for this book is $\mathcal{B}(E)/\mathcal{K}(E)$, where $E$ is a Banach space. Such an algebra is called a Calkin algebra and is denoted by $\mathcal{C}(E)$. You can guess that studying this algebra is equivalent to studying operators up to compact perturbations. In particular, the Nikol'skii theorem for the case $E = F$ can be formulated as follows.

Proposition 7. An operator $S \in \mathcal{B}(E)$ is Fredholm $\iff$ its coset $S + \mathcal{K}(E)$ is invertible in the algebra $\mathcal{C}(E)$. ∎
This in turn implies

Corollary 1. The essential spectrum of an operator $T \in \mathcal{B}(E)$ is precisely the spectrum of the coset $T + \mathcal{K}(E)$ as an element of $\mathcal{C}(E)$.
3. Banach algebras and spectra of their elements. Fredholm operators revisited
In the previous section, we had a gasp in the somewhat rarefied atmosphere of abstract algebra. Now let us see what happens on the "lower floor", namely, consider an operator algebra $\mathcal{B}(E)$ not from the purely algebraic point of view, but, as von Neumann used to say, from the point of view of "abstract analysis". The role of the foundation now goes to the following structure.

Definition 1. A Banach space $A$ with multiplication is called a Banach algebra if this multiplication is related to the norm by the following inequality: $\|ab\| \le \|a\| \|b\|$. This inequality is called the multiplicative inequality (for the norm).

In particular, we see that the bilinear operator of multiplication is bounded and its norm is not greater than 1. Hence, as a special case of Proposition 1.4.11, we obtain

Proposition 1. In a Banach algebra, if $a_n$ tends to $a$, and $b_n$ tends to $b$, then $a_n b_n$ tends to $ab$. ∎

Remark. Exercise 1.4.7 shows that the bilinear operator of multiplication is continuous with respect to the Tychonoff topology in $A \times A$. But in the future, when speaking about continuity of multiplication, we shall have in mind Proposition 1.
Actually the same structure of Banach algebra arises if we require an apparently much weaker relation between the multiplication and the norm.

Exercise 1. Let $A$ be a Banach space endowed with a multiplication which is separately continuous as a bilinear operator. Then one can endow $A$ with a norm $\|\cdot\|'$ equivalent to the initial norm and such that $(A, \|\cdot\|')$ is a Banach algebra. Moreover, if $A$ is unital, then this norm can be chosen in such a way that $\|1\|' = 1$.

Hint. Use Theorem 2.4.5.

There are three main sources of Banach algebras in analysis: the theory of operators, the theory of functions, and harmonic analysis. (You already know something about the first two areas, and if you have heard about Fourier series, then you know something about the most standard part of the third area as well.)
Example 1. The main Banach algebras in the theory of operators are, as you can guess, $\mathcal{B}(E)$ and $\mathcal{K}(E)$; they are most studied and most important for applications in the case of Hilbert spaces. We recall that both algebras are endowed with the operator norm and composition of operators as multiplication.

Example 2 (the main Banach function algebra). This is the Banach space $C(\Omega)$, where $\Omega$ is a compact space (see Section 3.1). This algebra is endowed with the pointwise multiplication. A slightly more general example is the Banach algebra $C_0(\Omega)$, where $\Omega$ is a locally compact space (again, cf. Section 3.1).
Remark. We should treat the algebras in these examples (first of all $C_0(\Omega)$ and $\mathcal{B}(H)$, where $H$ is a Hilbert space) with special respect. They are the ranges of two canonical homomorphisms, the "Gelfand transform" and the "universal representation", both playing a fundamental role in the general theory of Banach algebras (cf. [25, p. 179]). We shall say something about these constructions at the end of this chapter and in Section 6.3.
Example 3. The Banach space $l_\infty$ and, more generally, $L_\infty(X, \mu)$ are also Banach algebras with respect to the pointwise (in the case of $l_\infty$ it is better to say coordinatewise) multiplication. Speaking about $L_\infty(X, \mu)$, we have in mind pointwise multiplication of representatives of the corresponding cosets; certainly, the coset of the product does not depend on the choice of such representatives. (Actually, these algebras are special cases of algebras from Example 2. The meaning of this strange declaration will be clarified later in Section 6.3.)

Example 4. The space $C^n[a, b]$ (cf. Example 1.1.7) is a Banach algebra with respect to the norm $\|x\|' := \sum_{k=0}^{n} \frac{1}{k!} \max\{|x^{(k)}(t)| ; a \le t \le b\}$ (check that it is equivalent to the norm indicated in Example 1.1.7) and the pointwise multiplication.
In the last chapter of the book we shall discuss typical Banach algebras of harmonic analysis, namely, Banach spaces $L_1(G)$, where $G := \mathbb{Z}$, $\mathbb{R}$, or $\mathbb{T}$, endowed with the so-called convolution multiplication. Here we describe the simplest of them.

Example 5 (one of the guises of the Wiener algebra). The Banach space $l_1(\mathbb{Z})$ is a Banach algebra with respect to the multiplication $(\xi * \eta)_n := \sum_{k=-\infty}^{\infty} \xi_k \eta_{n-k}$. Here the product of $\xi$ and $\eta$ is denoted by $\xi * \eta$ and is called the convolution. Verify that the convolution indeed belongs to $l_1(\mathbb{Z})$. In Chapter 7 we shall explain to advanced readers another appearance of the Wiener algebra, this time as a certain algebra of continuous functions.
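For finitely supported sequences the convolution is a finite sum, and the multiplicative inequality in $l_1(\mathbb{Z})$ can be observed directly. The following is a minimal sketch under our own assumptions: sequences are stored as dictionaries mapping an index to a non-zero value, and the names `convolve`, `norm1`, `xi`, `eta` and `delta0` are hypothetical.

```python
def convolve(xi, eta):
    """Convolution of finitely supported sequences on Z, stored as {index: value}."""
    out = {}
    for k, x in xi.items():
        for m, y in eta.items():
            # the term xi_k * eta_m contributes to the coordinate n = k + m
            out[k + m] = out.get(k + m, 0) + x * y
    return out

def norm1(xi):
    """The l_1-norm: the sum of absolute values of the coordinates."""
    return sum(abs(v) for v in xi.values())

xi = {-1: 2, 0: -1, 3: 4}
eta = {0: 1, 1: 3, 2: 5}
delta0 = {0: 1}   # the sequence with 1 at index 0 and zeros elsewhere

product = convolve(xi, eta)
print(norm1(product) <= norm1(xi) * norm1(eta))   # multiplicative inequality
print(product == convolve(eta, xi))               # the convolution is commutative
print(convolve(delta0, xi) == xi)                 # delta0 acts as a unity
```

The inequality is strict whenever terms of opposite signs meet in the same coordinate of the product, as happens for the sequences chosen here.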
* * *
Like Banach spaces, Banach algebras are the objects of two natural categories. In the first of them the morphisms are all bounded (as operators) homomorphisms, and in the second, only contraction homomorphisms are allowed. We shall not go deep into the study of these categories. It is necessary to note only that isomorphisms between objects of the first category, which are called topological isomorphisms of Banach algebras, are, clearly, bounded homomorphisms with bounded inverse; by the Banach theorem, these isomorphisms can be characterized just as bijective bounded homomorphisms. Isomorphisms in the second category, called isometric isomorphisms of Banach algebras, are bijective isometric (as operators) homomorphisms.

We now present the first substantial result of the theory of Banach algebras, which will give you an idea of one distinctive feature of this theory. It turns out that in some situations the continuity (i.e., boundedness) of operators between Banach algebras can be deduced from purely algebraic properties. Here is the oldest example of such a "theorem on automatic continuity".

Proposition 2. All characters of Banach algebras are contraction operators.

Proof. Suppose, on the contrary, that we have a character $\chi : A \to \mathbb{C}$ such that $\chi(a) = 1$ for some $a$ with $\|a\| < 1$ (if $|\chi(c)| > \|c\|$ for some $c$, one can take $a := c/\chi(c)$). Then the series $\sum_{n=1}^{\infty} a^n$ converges to some $b \in A$, due to the estimate $\|a^n\| \le \|a\|^n$ and the completeness of $A$. From the continuity of the multiplication it follows that $a + ab = a + \lim_{n \to \infty} a \sum_{k=1}^{n} a^k = b$. Hence, $1 + \chi(b) = \chi(a + ab) = \chi(b)$, which, of course, is impossible. ∎
The entire book [79] is devoted to the phenomenon of automatic continuity, and there you can find much deeper results. In Section 6.3, we shall offer to our advanced readers another striking result of that kind.

Among subalgebras and ideals of Banach algebras, the closed ones are naturally of most interest. The ideal $\mathcal{K}(E)$ in $\mathcal{B}(E)$ is one of them (see Proposition 3.3.4).

Proposition 3. Let $I$ be a closed ideal of a Banach algebra $A$. Then the quotient algebra $A/I$ endowed with the quotient norm is a Banach algebra.

Proof. Our goal is to verify the multiplicative inequality. Take $\tilde{a}, \tilde{b} \in A/I$. Then
$$\|\tilde{a}\tilde{b}\| = \inf\{\|c\| : c \in \tilde{a}\tilde{b}\} \le \inf\{\|ab\| : a \in \tilde{a}, b \in \tilde{b}\} \le \inf\{\|a\| \|b\| : a \in \tilde{a}, b \in \tilde{b}\} = \|\tilde{a}\| \|\tilde{b}\|.$$ ∎
In particular, we see that the Calkin algebra (introduced at the end of the previous section) is a Banach algebra with respect to the quotient norm.

Let us look at the behavior of invertible elements of Banach algebras. If $A$ is a unital algebra, then the set of its invertible elements is denoted by $\mathrm{Inv}(A)$. By Proposition 2.2, this is a group with respect to the multiplication inherited from $A$. We shall denote by $\mathrm{inv} : \mathrm{Inv}(A) \to \mathrm{Inv}(A)$ the mapping $a \mapsto a^{-1}$.

Proposition 4. Let $A$ be a unital Banach algebra. Then
(i) for each $a \in A$; $\|a\| < 1$ the element $1 - a$ is invertible, and the inverse element can be represented as the sum of the series $(1 - a)^{-1} = 1 + \sum_{n=1}^{\infty} a^n$ ("the K. Neumann series");
(ii) the mapping $\mathrm{inv}$ is continuous at $1$.
Proof. (i) Consider the formal series $1 + \sum_{k=1}^{\infty} a^k$. Since $\|a^k\| \le \|a\|^k$, the "Weierstrass test" guarantees the convergence of this series to some $b \in A$. Let $b_n$ be the partial sums of this series. Then from the obvious equality $(1 - a) b_n = b_n (1 - a) = 1 - a^{n+1}$ and from the continuity of the multiplication it follows that $(1 - a) b = b (1 - a) = 1$.
(ii) From the form of the K. Neumann series it follows that for $\|a\| < 1$ we have
$$\|(1 - a)^{-1} - 1\| = \Big\| \sum_{n=1}^{\infty} a^n \Big\| \le \sum_{n=1}^{\infty} \|a\|^n = \frac{\|a\|}{1 - \|a\|}.$$
Hence, for $b := 1 - a$ we have $\lim_{b \to 1} \mathrm{inv}(b) = \lim_{a \to 0} (1 - a)^{-1} = 1 = \mathrm{inv}(1)$. ∎
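The K. Neumann series can be watched at work in the Banach algebra $M_2$ with the maximal row sum norm (an operator norm, hence one satisfying the multiplicative inequality). The code below is only a sketch under our own assumptions: the matrix `a`, the number of terms `n` and all helper names are chosen for illustration. Exact rational arithmetic makes it possible to test the identity $(1 - a) b_n = 1 - a^{n+1}$ literally.

```python
from fractions import Fraction as F

def mul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def sub(x, y):
    return tuple(tuple(x[i][j] - y[i][j] for j in range(2)) for i in range(2))

def norm(x):
    """Maximal absolute row sum; a submultiplicative norm with norm(ONE) == 1."""
    return max(sum(abs(v) for v in row) for row in x)

def inverse(x):
    """Inverse of a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (p, q), (r, s) = x
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))

ONE = ((F(1), F(0)), (F(0), F(1)))
a = ((F(1, 5), F(1, 10)), (F(3, 10), F(2, 5)))   # norm(a) = 7/10 < 1

n = 20
power, partial = ONE, ONE           # a^0 and b_0 = 1
for _ in range(n):
    power = mul(power, a)           # a^k
    partial = add(partial, power)   # b_k = 1 + a + ... + a^k

one_minus_a = sub(ONE, a)
identity_holds = mul(one_minus_a, partial) == sub(ONE, mul(power, a))

exact = inverse(one_minus_a)
tail_error = norm(sub(exact, partial))             # norm of the omitted tail
tail_bound = norm(a) ** (n + 1) / (1 - norm(a))    # geometric series estimate

print(identity_holds)
print(tail_error <= tail_bound)
print(norm(sub(exact, ONE)) <= norm(a) / (1 - norm(a)))   # the estimate in (ii)
```

The bound on the tail is just the sum of the geometric series $\sum_{k > n} \|a\|^k$, so the partial sums approach $(1 - a)^{-1}$ at least at a geometric rate.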
Proposition 5. Let $A$ be a unital Banach algebra. Then
(i) the group $\mathrm{Inv}(A)$ is an open subset of $A$;
(ii) the mapping $\mathrm{inv}$ is continuous everywhere on $\mathrm{Inv}(A)$.
(i) Take a E Inv(A) . Then for every b E A such that ll b ll < l a- 1 11 - 1 , by the multiplicative inequality we have ll a- 1 b ll < 1. Taking into account Proposition 4(i) , we see that a + b == a(l + a- 1 b) E Inv(A) . Thus the whole U(a, r) lies in Inv(A) for r == ll a- 1 11 - 1 . (ii) Take c > 0. By Proposition 4(ii) , there exists 8 > 0 such that for every c E A with l c ll < 8 we have 1 - c E Inv(A) and 11 ( 1 - c) - 1 - 1 11 < c/ ll a- 1 11 · If b is such that ll b ll < 8 /ll a- 1 11 , and, as a corollary, l a- 1 b l < 8, Proof.
3. Banach algebras and spectra
then
$$\|(a-b)^{-1} - a^{-1}\| = \|(a(1 - a^{-1}b))^{-1} - a^{-1}\| = \|(1 - a^{-1}b)^{-1}a^{-1} - a^{-1}\| = \|((1 - a^{-1}b)^{-1} - 1)a^{-1}\| \le \|(1 - a^{-1}b)^{-1} - 1\|\,\|a^{-1}\| < \|a^{-1}\|\,(\varepsilon / \|a^{-1}\|) = \varepsilon.$$
The rest is clear. •
Let us see how these facts affect the behavior of spectra. Recall that each subset in $\mathbb{C}$ can be the spectrum of an element of an abstract algebra. In the context of Banach algebras the situation is quite different.
Theorem 1. Let $A$ be a unital Banach algebra, and $a \in A$. Then $\sigma(a)$ is a compact (i.e., bounded and closed) set in the disk $\{z \in \mathbb{C} : |z| \le \|a\|\}$.
Proof. We must verify that the set of regular points of the element $a$ contains all $\lambda$ with $|\lambda| > \|a\|$, and that it is open. The first condition follows from the fact that for the indicated $\lambda$ the element $1 - \lambda^{-1}a$ is invertible by Proposition 4(i). Hence, the element $a - \lambda 1 = -\lambda(1 - \lambda^{-1}a)$ is invertible as well. The second condition follows from the fact that the invertibility of an element $a - \lambda 1$ implies the invertibility of elements sufficiently close to $a - \lambda 1$ (Proposition 5(i)). In particular, all elements $a - \mu 1$ for $\mu$ sufficiently close to $\lambda$ are invertible. •
Example 6. The spectrum of an element $f$ of the algebra $C(\Omega)$ is, obviously, $f(\Omega)$, i.e., the set of values of the function $f$. In particular, in the algebra $C[a, b]$; $a, b \in \mathbb{R}$ the spectrum of the function $t \mapsto t$ (independent variable) is the interval $[a, b]$. This special case plays an important role in the exposition of fragments of spectral theory in Chapter 6.
Proposition 6. Let $a$ be an invertible element of a unital Banach algebra $A$ such that $\|a\| \le 1$ and $\|a^{-1}\| \le 1$. Then the spectrum of $a$ is contained in the unit circle.
Proof. By the previous theorem, $\sigma(a), \sigma(a^{-1}) \subset \mathbb{D}$. At the same time, by Proposition 2.6, $\lambda \in \sigma(a^{-1}) \iff \lambda^{-1} \in \sigma(a)$. The rest is clear. •
Corollary 1. Let $T$ be an operator in a Banach space. If it is an isometric isomorphism (in particular, a unitary operator in a Hilbert space), then $\sigma(T) \subset \mathbb{T}$.
However, even more important conclusions can be derived from the fact that the spectrum of an operator cannot be empty. To prove this fact, we use complex analysis.
Unless stated otherwise, everywhere in the sequel, $A$ is a unital Banach algebra, $a$ is an element of $A$, and $\mathrm{Reg}(a)$ is the set of regular points of $a$ (i.e., $\mathbb{C} \setminus \sigma(a)$). Due to the compactness of $\sigma(a)$, $\mathrm{Reg}(a)$ is an open set in $\mathbb{C}$ and it contains a neighborhood of the point at infinity. For $\lambda \in \mathrm{Reg}(a)$ we put $R(\lambda) := (a - \lambda 1)^{-1}$. It evidently follows from Proposition 5(ii) that the mapping $R : \mathrm{Reg}(a) \to A : \lambda \mapsto R(\lambda)$ (the so-called resolvent function of the element $a$) is continuous.
Proposition 7 (Hilbert's identity). For every $\lambda, \mu \in \mathrm{Reg}(a)$ we have $R(\lambda) - R(\mu) = (\lambda - \mu)R(\mu)R(\lambda)$.
Proof. Multiplying both elements from the left by $a - \mu 1$ and from the right by $a - \lambda 1$, we obtain $(\lambda - \mu)1$ in both cases. •
Proposition 8. For every bounded functional $f : A \to \mathbb{C}$ the function $w_f : \mathrm{Reg}(a) \to \mathbb{C} : \lambda \mapsto f(R(\lambda))$ is holomorphic.
Proof. Take $\lambda_0 \in \mathrm{Reg}(a)$; then from the Hilbert identity it follows that for all $\lambda \in \mathrm{Reg}(a)$, $\lambda \ne \lambda_0$, we have
$$\frac{w_f(\lambda) - w_f(\lambda_0)}{\lambda - \lambda_0} = f(R(\lambda_0)R(\lambda)).$$
The mappings $R$ and $f$ are continuous, hence the right-hand side has a limit as $\lambda \to \lambda_0$, namely, $f(R(\lambda_0)^2)$. We have verified the Riemann definition of a holomorphic function. •
Theorem 2. The spectrum of every element $a$ of a unital Banach algebra is a non-empty set in $\mathbb{C}$.
Proof. On the contrary, suppose that $\sigma(a) = \emptyset$, in other words, $\mathrm{Reg}(a) = \mathbb{C}$. Then for every $f \in A^*$, $w_f(\lambda)$ is an entire analytic function. The continuity of $\mathrm{inv}$ at $1$ implies that
$$\lim_{\lambda \to \infty} R(\lambda) = \lim_{\lambda \to \infty} \left(-\lambda^{-1}(1 - \lambda^{-1}a)^{-1}\right) = 0.$$
In particular, $\lim_{\lambda \to \infty} w_f(\lambda) = 0$. Thus, $w_f$ is bounded, and by the Liouville theorem, it is constant. Therefore, $w_f(\lambda) \equiv 0$. Thus, for each $\lambda \in \mathbb{C}$ and $f \in A^*$ we have $f(R(\lambda)) = 0$. Since $f \in A^*$ is arbitrary, Theorem 1.6.3 shows that $R(\lambda) = 0$. This contradicts the fact that invertible elements in any algebra are different from zero. •
Putting $A = \mathcal{B}(E)$ in Theorems 1 and 2 and, with Corollary 2.1 taken into account, also $A = \mathcal{C}(E)$, we immediately obtain
Corollary 2. Both the spectrum and the essential spectrum of a bounded operator in a Banach space are non-empty sets.
Note that there are no restrictions on the spectrum of an element of a Banach algebra, other than the requirements of compactness and non-emptiness.
Exercise 2. Every non-empty compact set $K \subset \mathbb{C}$ is the spectrum of some element of a unital Banach algebra.
Hint. The spectrum of the function $w(z) = z$ as an element of $C(K)$ is $K$. The same is true for the spectrum of an appropriate diagonal operator on $l_p$ (see Exercise 1.2).
We have come to the theorem that underlies the majority of results of the whole theory of Banach algebras (see, e.g., [25, 79, 80, 101]).
Theorem 3 (Gelfand-Mazur). Let $A$ be a unital Banach algebra, and at the same time a division ring (in other words, every non-zero element of $A$ is invertible). Then $A$ coincides, up to an isomorphism of algebras, with the field $\mathbb{C}$.
Proof. By the previous theorem, for every $a \in A$ there is at least one $\lambda \in \mathbb{C}$ such that $a - \lambda 1$ is not invertible. Since $A$ is a division ring, this means that $a = \lambda 1$. Hence, $A = \{\lambda 1 ;\ \lambda \in \mathbb{C}\}$. The rest is clear. •
Now we turn our attention away from facts of general character, and instead use them to advance much further in our study of spectra of concrete operators.
Exercise 3 (cf. Exercises 1.4 and 1.7). The spectrum of the operator of left or right shift on $l_p$; $1 \le p \le \infty$, and on $c_0$, is $\mathbb{D}$. In more detail, (i) in the case of $l_p$; $1 < p < \infty$ and $c_0$ we have $\sigma_p(T_l) = \sigma_r(T_r) = \mathbb{D}^0$, $\sigma_c(T_l) = \sigma_c(T_r) = \mathbb{T}$, and $\sigma_r(T_l) = \sigma_p(T_r) = \emptyset$; (ii) in the case of $l_1$ we have $\sigma_p(T_l) = \mathbb{D}^0$, $\sigma_c(T_l) = \mathbb{T}$, $\sigma_r(T_r) = \mathbb{D}$, and $\sigma_r(T_l) = \sigma_p(T_r) = \sigma_c(T_r) = \emptyset$; (iii) in the case of $l_\infty$ we have $\sigma_p(T_l) = \sigma_r(T_r) = \mathbb{D}$, and $\sigma_c(T_l) = \sigma_c(T_r) = \sigma_r(T_l) = \sigma_p(T_r) = \emptyset$.
Hint. The result about the entire spectrum follows from Exercise 1.4 and Theorem 1. For the continuous spectrum, the points of $\mathbb{T}$ are neither eigenvalues of the operator under consideration, nor eigenvalues of the adjoint operator. Hence, Exercise 1.1(i) works.
Exercise 4 (cf. Exercise 1.8). For all $p$; $1 \le p \le \infty$ the spectrum of the operators $T_b : l_p(\mathbb{Z}) \to l_p(\mathbb{Z})$ and $T_a : L_p(\mathbb{R}) \to L_p(\mathbb{R})$; $a \ne 0$ is $\mathbb{T}$. In more detail, if $1 < p < \infty$, then the entire spectrum is continuous, if $p = 1$, then it coincides with the residual spectrum, and if $p = \infty$, then it coincides with the point spectrum.
Hint. The location of the spectrum follows from its non-emptiness, Corollary 1, and Exercise 1.6. The same Exercise 1.6 allows us to restrict our consideration to the point $1 \in \mathbb{T}$. We immediately see that $1$ is an eigenvalue of our operator only if $p = \infty$. Together with Exercises 1.1(i) and 1.8, this gives the classification of the spectrum in the case of $1 < p < \infty$.
Exercise 5. The essential spectrum of the operators $T_l, T_r : l_p \to l_p$, $T_b : l_p(\mathbb{Z}) \to l_p(\mathbb{Z})$, and $T_a : L_p(\mathbb{R}) \to L_p(\mathbb{R})$; $a \ne 0$ (everywhere $1 < p < \infty$) is $\mathbb{T}$.
Hint. As a clarifying example consider the operator $T_l$. Take $\lambda \in \mathbb{D}^0$ and in the Calkin algebra $\mathcal{C}(l_p)$ consider the equality $((T_l - \lambda 1) + \mathcal{K}(l_p))(T_r + \mathcal{K}(l_p)) = (1 + \mathcal{K}(l_p)) - \lambda(T_r + \mathcal{K}(l_p))$. By Proposition 4(i), the right-hand side is invertible. Hence, Proposition 2.7 and the Fredholm property for $T_r$ show that $T_l - \lambda 1$ is a Fredholm operator. Hence, $\sigma_e(T_l) \subset \mathbb{T}$, and it remains to use the non-emptiness of the essential spectrum and Exercise 1.6.
Exercise 6 (cf. Exercise 1.9). The spectrum of the shift operator by an element $a \in \mathbb{T}$ in $L_p(\mathbb{T})$ is as follows: (i) if $a = e^{2\pi i m/n}$, where $m$ and $n$ are relatively prime natural numbers, then $\sigma(T_a) = \sigma_p(T_a) = \{e^{2\pi i k/n} ;\ k = 0, 1, \dots, n-1\}$; (ii) in the other cases $\sigma(T_a) = \mathbb{T}$.
Hint. In the first case, due to the equality $T_a^n = 1$, Theorem 2.1 works, and in the second, Theorem 1 works.
Remark. Actually, in the "Hilbert" case $p = 2$ every shift operator figuring in the three previous exercises, except $T_l$ and $T_r$, is unitarily equivalent to some operator of multiplication by a function (which at first sight bears no resemblance to it). If we knew this by now, we could immediately obtain the corresponding facts about spectra as simple special cases of Exercise 1.3. These unitary equivalences are established by the Fourier operators, which we have not discussed yet. We shall consider them in the last chapter (see Proposition 7.4.5).
We suggest another curious application of the theorem on the properties of the spectrum.
Exercise 7 ("on the complexity of quantum mechanics"). No Banach algebra has a pair of elements $a, b$ such that for some $\lambda \in \mathbb{C}$; $\lambda \ne 0$ we have $ab - ba = \lambda 1$.
Hint. Replacing, if necessary, $a$ with $a + (\|a\| + 1)1$, we can assume that $a$ is invertible. Taking into account Exercise 2.1, we have $\sigma(ab) = \sigma(ab) + \lambda$. But if a set in $\mathbb{C}$ does not change under the shift by $\lambda \ne 0$, then it is either empty or unbounded.
How is quantum mechanics related to all this? As a special case of the obtained result, we see that in a Hilbert space there is no pair of bounded operators $S$ and $T$ satisfying the "canonical commutation relation" $ST - TS = i\hbar 1$, where $\hbar$ is the Planck constant. However, precisely such a pair must enter basic known mathematical models of quantum mechanics (giving, in particular, the mathematical picture of the Heisenberg indeterminacy principle). That is why we have to look for such pairs among unbounded operators, which are much more difficult to work with than bounded ones (cf. the epigraph to Chapter VIII in [63]).
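In finite dimensions the impossibility of $ab - ba = \lambda 1$ can also be seen from the trace, which is a different (and much more elementary) argument than the spectral one in the hint. The sketch below checks this for one pair of 2x2 integer matrices; the matrices and helper names are hypothetical and serve only as an illustration.

```python
# Finite-dimensional shadow of Exercise 7: trace(ab - ba) = 0 for matrices,
# while trace(lambda * 1) = n * lambda is non-zero for lambda != 0.
# Assumption: this trace argument replaces the spectral argument of the hint.

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

a = [[1, 2], [3, 4]]      # arbitrary illustrative matrices
b = [[0, 1], [5, -2]]

ab = mat_mul(a, b)
ba = mat_mul(b, a)
commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

commutator_trace = trace(commutator)   # always 0 for square matrices
scalar = [[7, 0], [0, 7]]              # lambda * 1 with lambda = 7
scalar_trace = trace(scalar)           # n * lambda, non-zero

print(commutator_trace, scalar_trace)
```

The trace is not available on $\mathcal{B}(H)$ for infinite-dimensional $H$, which is why the general statement needs the spectrum.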
Now we introduce an important numeric characteristic of the spectrum.
Definition 2. Let $a$ be an element of a unital Banach algebra $A$. The spectral radius of $a$ is the number $r(a) := \max\{|\lambda| : \lambda \in \sigma(a)\}$.
We see that the spectral radius is defined in purely algebraic terms, independently of the norm. Nevertheless, it has an explicit description in terms of the norm.
Theorem 4 (Spectral radius formula). For every $a$ in the previous definition we have
$$r(a) = \lim_{n \to \infty} \sqrt[n]{\|a^n\|}.$$
Proof. Put $U := \{\lambda : |\lambda| > r(a)\}$ and $V := \{\lambda : |\lambda| > \|a\|\}$. By Theorem 1 we have $V \subset U$. Take an arbitrary $f \in A^*$. By Proposition 8, the function $w_f$ is holomorphic in $U$ and, as a corollary, in $V$. But by Proposition 4(i), for every $\lambda \in V$ we have
$$R(\lambda) = -\lambda^{-1}(1 - \lambda^{-1}a)^{-1} = -\sum_{k=0}^{\infty} \lambda^{-(k+1)} a^k.$$
Hence $w_f$ can be expanded into the Laurent series $-\sum_{k=0}^{\infty} \lambda^{-(k+1)} f(a^k)$ in $V$. But then, by well-known properties of holomorphic functions, the same Laurent series represents our function in $U$ as well. Thus, in particular, for every $\lambda \in U$ we have $\lim_{n \to \infty} \lambda^{-(n+1)} f(a^n) = 0$.
Now take $\lambda \in U$ and "release" $f$. Taking into account Proposition 2.4.5, from the last equality we see that for some $C > 0$ and all $n$ we have $\|\lambda^{-(n+1)} a^n\| \le C$, so that $\sqrt[n]{\|a^n\|} \le |\lambda| \sqrt[n]{C|\lambda|}$. Passing to the upper limit, we see that $\limsup_{n \to \infty} \sqrt[n]{\|a^n\|} \le |\lambda|$. Since $\lambda$ is an arbitrary number satisfying the inequality $|\lambda| > r(a)$, we obtain
(1) $\limsup_{n \to \infty} \sqrt[n]{\|a^n\|} \le r(a)$.
Now take $\lambda \in \sigma(a)$. Combining Theorem 2.1 (for $p(t) = t^n$) with Theorem 1, we see that $|\lambda|^n \le \|a^n\|$. Consequently, $|\lambda| \le \sqrt[n]{\|a^n\|}$ for every $n$. Passing to the lower limit, we see that, by the definition of spectral radius,
(2) $\liminf_{n \to \infty} \sqrt[n]{\|a^n\|} \ge r(a)$.
It remains to combine inequalities (1) and (2). •
Remark. The spectral radius formula can be proved without using holomorphic functions; see [80, pp. 22-23].
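A small numerical experiment may help to see that $\sqrt[n]{\|a^n\|}$ can be far from $\|a\|$ and still approach $r(a)$. This is a minimal sketch, assuming the algebra of 2x2 matrices with the row-sum norm; the matrix below (eigenvalues $\pm 2$, norm $4$) and the helper names are hypothetical choices for illustration.

```python
# Illustration of r(a) = lim ||a^n||^(1/n) for a 2x2 integer matrix.
# Assumptions: row-sum norm; a has eigenvalues +2 and -2, so r(a) = 2.

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_sum_norm(x):
    return max(sum(abs(v) for v in row) for row in x)

a = [[0, 4], [1, 0]]          # a^2 = 4 * identity, ||a|| = 4, r(a) = 2
power = [[1, 0], [0, 1]]
roots = []                    # roots[n-1] = ||a^n||^(1/n)
for n in range(1, 62):
    power = mat_mul(power, a)             # exact integer arithmetic
    roots.append(float(row_sum_norm(power)) ** (1.0 / n))

for n in (1, 2, 3, 11, 61):
    print(n, roots[n - 1])
```

Every term of the sequence stays at or above $r(a)$, in agreement with inequality (2) of the proof, and the odd-numbered terms decrease toward $2$.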
The formula we discuss draws attention to the following class of elements in Banach algebras. The behavior of these elements resembles the behavior of nilpotent elements in abstract algebras.
Definition 3. We say that an element $a$ of a Banach algebra is topologically nilpotent or quasinilpotent if $\lim_{n \to \infty} \sqrt[n]{\|a^n\|} = 0$.
Certainly, this condition is equivalent to the fact that the sequence $\|a^n\|$ tends to zero faster than $e^{-\alpha n}$ for every $\alpha > 0$. From the spectral radius formula we have
Corollary 3. Quasinilpotent elements in a Banach algebra are precisely those for which the spectrum consists of zero only.
Exercise 8. Find the spectrum of a topologically nilpotent element without using the spectral radius formula.
Hint. If $\lambda \ne 0$, then the series $\sum_{n=0}^{\infty} \lambda^{-(n+1)} a^n$ converges to the inverse of $\lambda 1 - a$.
Now we can find the spectrum of an old friend.
Exercise 9. The operator $T$ of indefinite integration (in $L_1[0, 1]$, $L_2[0, 1]$, and $C[0, 1]$; see Example 1.3.5) is a topologically nilpotent operator, and, as a corollary, its spectrum consists of a single point, namely, zero. The same is true for the Volterra operator in $L_2[a, b]$ (see Example 1.3.6) with essentially bounded kernel $K(s, t)$. As a corollary, for such $K$ the integral equation
$$\int_a^s K(s, \tau)x(\tau)\,d\tau - \lambda x(s) = y(s)$$
has a unique solution in $L_2[a, b]$ for each $\lambda \in \mathbb{C} \setminus \{0\}$ and for every right-hand side $y \in L_2[a, b]$. (Integral equations of this type are called Volterra equations; cf. the Fredholm integral equations in Section 3.5.)
Hint. Let $T$ be the operator of indefinite integration. Then for every $x$ in the unit ball of the corresponding function space, $T^{n+1}x$ is a continuous function satisfying the estimate $|T^{n+1}x(t)| \le t^n/n!$; $t \in [0, 1]$.
Remark. This result is true for the Volterra operators with an arbitrary square integrable kernel. But this is much more difficult to prove (see, e.g., [81, exercise 147]).
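The estimate in the hint to Exercise 9 gives $\|T^{n+1}\| \le 1/n!$, so $\|T^{n+1}\|^{1/(n+1)} \le (1/n!)^{1/(n+1)}$. The following sketch only tabulates this upper bound to show how quickly it decays; it does not discretize the operator itself, and the variable names are illustrative.

```python
# Upper bounds for ||T^(n+1)||^(1/(n+1)) coming from |T^(n+1) x(t)| <= t^n / n!.
# Assumption: we only evaluate the bound (1/n!)^(1/(n+1)), not the operator.
import math

bounds = [(1.0 / math.factorial(n)) ** (1.0 / (n + 1)) for n in range(0, 51)]

for n in (0, 1, 5, 10, 50):
    print(n + 1, bounds[n])
```

Since the bound tends to zero, the spectral radius formula yields $r(T) = 0$, which is exactly topological nilpotency.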
* * *
Recall that one can "take polynomials" of every element of an abstract algebra. Our possibilities become wider if we pass from abstract algebras to Banach algebras: in this situation we can "take some holomorphic functions" of elements. In this book taking entire functions will be sufficient.
Consider the set $\mathcal{O}(U)$ of holomorphic functions on a domain $U$ of the complex plane, and write $\mathcal{O}$ instead of $\mathcal{O}(\mathbb{C})$. We already know that $\mathcal{O}(U)$ and, in particular, $\mathcal{O}$ is a polynormed, and as a corollary, a topological space (see Section 4.1). But this set, certainly, is also an algebra with respect to the pointwise operations. Clearly, these two structures agree.
Proposition 9. If a sequence $v_n$ tends to $v$ and $w_n$ to $w$ in $\mathcal{O}(U)$ (we mean the Weierstrass convergence; cf. Section 4.1), then $v_n w_n$ tends to $vw$. •
Take a unital Banach algebra $A$ and an element $a \in A$. Suppose $w : \mathbb{C} \to \mathbb{C}$ is an entire holomorphic function, and $\sum_{k=0}^{\infty} c_k z^k$ is its Taylor series. Then the numeric series $\sum_{k=0}^{\infty} |c_k|\,\|a\|^k$ converges. This, due to the estimate $\|a^n\| \le \|a\|^n$ and the Weierstrass test, guarantees the convergence of the series $\sum_{k=0}^{\infty} c_k a^k$ in the Banach space $A$. We denote the sum of this series by $w(a)$.
Definition 4. The element $w(a) \in A$ is called the value of the entire function $w$ at the point $a$. The mapping $\gamma_e : \mathcal{O} \to A : w \mapsto w(a)$ is called the entire holomorphic calculus, or just the entire calculus of $a$.
In particular, we can speak about the element $\exp(a)$ (denoted also by $e^a$) or, say, about the element $\sin(a)$. They are defined as $\gamma_e(w)$, where in the first case $\exp(z)$, and in the second one, $\sin(z)$ is taken as $w \in \mathcal{O}$. Note that the algebra of (formal) polynomials is, up to an isomorphism of algebras, a subalgebra of $\mathcal{O}$ consisting of "polynomial functions".
Proposition 10. The entire calculus (of every $a$) is a continuous unital homomorphism of algebras, extending the polynomial calculus.
Proof. First, it is clear that $\gamma_e$ is a linear operator. Now suppose $K \subset \mathbb{C}$ is the disk of radius $r > \|a\|$ centered at zero. By the classical Cauchy inequality (see, e.g., [12, p. 111]), for every $w \in \mathcal{O}$; $w(z) = \sum_{n=0}^{\infty} c_n z^n$ we have $|c_n| \le \|w\|_K / r^n$. Hence
$$\|w(a)\| \le \Big\|\sum_{n=0}^{\infty} c_n a^n\Big\| \le \sum_{n=0}^{\infty} |c_n|\,\|a\|^n \le C\|w\|_K,$$
where $C = \sum_{n=0}^{\infty} (\|a\|/r)^n$. By Theorem 4.1.1, $\gamma_e$ is a continuous operator.
Suppose now that $w_1, w_2 \in \mathcal{O}$ are represented by the Taylor series $\sum_{k=0}^{\infty} c_k z^k$ and $\sum_{k=0}^{\infty} d_k z^k$. Consider the partial sums $p_n := \sum_{k=0}^{n} c_k z^k$ and $q_n := \sum_{k=0}^{n} d_k z^k$; $n \in \mathbb{N}$. We know that the polynomial calculus is a homomorphism, hence $(p_n q_n)(a) = p_n(a)q_n(a)$. The sequence $p_n$ tends in $\mathcal{O}$ (i.e., in the sense of Weierstrass) to $w_1$, $q_n$ tends to $w_2$, and consequently, $p_n q_n$ tends to $w_1 w_2$ (by Proposition 9). Thus, due to the continuity of the mapping $w \mapsto w(a)$, we have $\lim_{n \to \infty} p_n(a) = w_1(a)$, $\lim_{n \to \infty} q_n(a) = w_2(a)$, and $\lim_{n \to \infty} (p_n q_n)(a) = (w_1 w_2)(a)$. But the multiplication in $A$ is also continuous (Proposition 1); hence, $\lim_{n \to \infty} p_n(a)q_n(a) = w_1(a)w_2(a)$. The rest is clear. •
Later (Section 6.2) a special case of this proposition where $w_1(z) := \exp(z)$ and $w_2(z) := \exp(-z)$ will make our life much easier.
Corollary 4. For every $a$ the element $\exp(a)$ is invertible with the inverse $\exp(-a)$.
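Corollary 4 can be checked numerically by summing the Taylor series of the exponential, exactly as in the definition of $w(a)$. This is a minimal sketch, assuming the algebra of 2x2 real matrices; the matrix `a` (a generator of rotations, so that $\exp(a)$ is the rotation by one radian) and all helper names are hypothetical.

```python
# exp(a) via partial sums of the Taylor series, and the check exp(a) exp(-a) = 1.
# Assumptions: 2x2 real matrices; 30 terms of the series; illustrative names.
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def exp_series(a, n_terms=30):
    # Sum of a^k / k! for k = 0, ..., n_terms.
    total = identity(len(a))
    term = identity(len(a))
    for k in range(1, n_terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        total = mat_add(total, term)
    return total

a = [[0.0, 1.0], [-1.0, 0.0]]            # a^2 = -1
minus_a = [[-v for v in row] for row in a]

exp_a = exp_series(a)
exp_minus_a = exp_series(minus_a)
product = mat_mul(exp_a, exp_minus_a)    # should be close to the identity

print(exp_a)
print(product)
```

For this particular `a` one has $\exp(a) = \cos(1)\,1 + \sin(1)\,a$, which gives an independent check of the computed entries.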
The entire calculus is as good in dealing with spectra as the polynomial calculus.
Proposition 11. For every $w \in \mathcal{O}$ we have $w(\sigma(a)) \subset \sigma(w(a))$.
Proof. Take $\lambda \in \sigma(a)$ and put $w_0(z) := w(z) - w(\lambda) \in \mathcal{O}$. Since $w_0(\lambda) = 0$, we have $w_0(z) = (z - \lambda)w_1(z)$ for some $w_1 \in \mathcal{O}$. Since our calculus is a homomorphism, we have
$$w(a) - w(\lambda)1 = \gamma_e(w_0) = \gamma_e(z - \lambda)\gamma_e(w_1) = (a - \lambda 1)w_1(a).$$
But $a - \lambda 1$ is not invertible, and by the commutativity of the algebra $\mathcal{O}$, $a - \lambda 1$ and $w_1(a)$ commute in $A$. Hence, by Proposition 2.2(ii), $w(a) - w(\lambda)1$ is not invertible. The rest is clear. •
Actually, the converse inclusion also holds. We will not prove it, but you should know this fact.
Theorem 5 (spectral mapping law for an entire calculus, [25, Chapter II, Theorem 2.23]). For every $w \in \mathcal{O}$ we have $w(\sigma(a)) = \sigma(w(a))$.
Remark. Thus, we know what the exponential and the sine of an element in a Banach algebra, in particular, of a bounded operator, are. But can we speak, for example, about the logarithm of such an element? This is not clear since the function $\log z$ can be defined only in domains that are smaller than $\mathbb{C}$. It turns out that the situation is as follows (Gelfand, 1939): for a holomorphic function $w$ in a domain $U \subset \mathbb{C}$, one can give a reasonable definition of the element $w(a)$ $\iff$ $U$ contains $\sigma(a)$. Moreover, the corresponding mapping $w \mapsto w(a)$ from $\mathcal{O}(U)$ to $A$ (the so-called holomorphic calculus of $a$ in $U$) is a homomorphism, for which in addition the corresponding version of the spectral mapping law is true. If $U$ is an open disk, or, more generally, an annulus, then the element $w(a)$ can be easily constructed by analogy with an entire function of $a$: we should put our element instead of the complex variable into the Taylor or Laurent expansion of a given function. For more complicated domains the element $w(a)$ can be constructed using the vector-valued contour integration technique. For details see, e.g., [25].
The holomorphic calculus of (one) element of a Banach algebra was defined and its study was almost completely finished by the early 1940s. But the question of the "right" definition of holomorphic functions of several commuting elements of a Banach algebra and, as a major special case, of several commuting operators in a Banach space remained open. This problem was solved in 1970 by J. Taylor using methods of homological algebra, functional analysis, and multidimensional complex analysis. These topics and the main result of this theory (the Taylor theorem on a holomorphic multioperator calculus) are well presented in the book by J. Eschmeier and M. Putinar [77].
Let us now make another digression, this time of general character. For the second time in this book the needs of applications compel us to consider the multiplication in polynormed algebras which are not Banach algebras. First, having in mind the future spectral theorem, we considered the multiplication in the algebra $\mathcal{B}(E)$ endowed with the weak-operator and also the strong-operator family of prenorms (see Examples 4.1.8-4.1.9). Now the algebras $\mathcal{O}(U)$ appear. All these algebras belong to the class of so-called polynormed algebras, i.e., polynormed spaces endowed with multiplication that is continuous in some reasonable sense. In the case of $A = (\mathcal{B}(E), so)$ and $A = (\mathcal{B}(E), wo)$ the multiplication is separately continuous, i.e., continuous with respect to each argument (Proposition 4.1.7). At the same time in the case of $A = \mathcal{O}(U)$ the multiplication, as you can easily verify, has a stronger property of so-called joint continuity, i.e., continuity with respect to the Tychonoff topology in $A \times A$. It is not difficult to see that a series of other examples in Chapter 4, first of all $C^\infty[a, b]$ and the spaces of test functions, also are polynormed algebras with respect to the pointwise multiplication, and some of them with respect to the convolution as well. (But this is important for the applications exceeding the scope of our book.)
As a matter of fact, the theory of polynormed (and more general topological) algebras is a part of functional analysis that is closely related to the theory of Banach algebras but at the same time has its own distinctive features. On this circle of problems see, e.g., [82], [83], [25]. One of the most important applications, namely to the multiparameter spectral theory, is presented in [77].
Now we return to the material for all readers and give an overview of the main results of the most traditional part of the theory of Banach algebras, the theory of commutative Banach algebras. The core of this theory is that these algebras are, roughly, algebras of functions.
A commutative Banach algebra is called semisimple if it has no topologically nilpotent elements except zero. (This is a special case of the fundamental notion of semisimple algebra; you can read about this notion in, say, [25].)
Theorem 6 (Gelfand, [25, Chapter IV, Theorem 2.14]). Let $A$ be a semisimple commutative Banach algebra. Then there exists an injective contraction homomorphism of this algebra to the algebra $C_0(\Omega)$, where $\Omega$ is a locally compact (and in the case of unital $A$, compact) topological space.
The Gelfand theorem shows that every semisimple commutative Banach algebra coincides, up to an isomorphism in Alg, with an algebra of continuous functions, namely the image of the mentioned homomorphism. Let us explain how this homomorphism acts. First, where does the mentioned locally compact space come from? The answer is as follows. As a set, it consists of all non-zero characters of our algebra (i.e., we recall, non-zero functionals on $A$ "respecting" the multiplication). By Proposition 2, it can be identified with a subset in $A^*$, and this allows us to endow it with the topology inherited from the weak* topology in the latter space. The introduced topological space is called the Gelfand spectrum of our commutative Banach algebra; we shall denote it by $\Omega(A)$ or just by $\Omega$.
Exercise 10. The Gelfand spectrum is a locally compact, and if $A$ is unital, a compact Hausdorff space.
Hint. If we add the zero character $0$ to $\Omega$, we obtain a closed, with respect to the weak* topology, subset in $B_{A^*}$. (Moreover, in the unital case, $0$ is its isolated point.) Then the Banach-Alaoglu theorem works.
Now we define one of the most important functors in functional analysis. Consider the category UCBA, whose objects are unital commutative Banach algebras and whose morphisms are unital continuous homomorphisms. Suppose $\varphi : A \to B$ is a morphism in UCBA. Then it is easy to see that the adjoint operator $\varphi^* : B^* \to A^*$ maps $\Omega(B)$ to $\Omega(A)$. Denote by $\Omega(\varphi) : \Omega(B) \to \Omega(A)$ the corresponding birestriction; from Proposition 4.2.10 it follows that this is a continuous mapping. Obviously (taking into account Exercise 10), the correspondence $A \mapsto \Omega(A)$; $\varphi \mapsto \Omega(\varphi)$ is a contravariant functor from the category UCBA to the category CHTop. It is called the Gelfand functor.
Remark. To simplify the exposition we have introduced only a "part" of the Gelfand functor. In fact, it is defined on the category of all commutative Banach algebras and all their continuous homomorphisms, and takes values in the category of locally compact spaces (its morphisms are discussed at the end of Section 3.1). You can easily restore all the missing details.
Going back to the promised construction of the homomorphism, consider a commutative Banach algebra $A$ and its Gelfand spectrum $\Omega$. Let us assign to every $a \in A$ the function $\hat a : \Omega \to \mathbb{C}$ acting by the rule $\hat a(\chi) := \chi(a)$; $\chi \in \Omega$. Now we can give the detailed formulation of the Gelfand theorem.
Theorem 6'. Let $A$ be a commutative Banach algebra, and $\Omega$ its Gelfand spectrum. Then
(i) for every $a \in A$ the function $\hat a(t)$; $t \in \Omega$ is continuous and vanishes at infinity;
(ii) the mapping $\Gamma_A : A \to C_0(\Omega) : a \mapsto \hat a$ is a contraction homomorphism; its kernel is the set of all topologically nilpotent elements of the algebra $A$;
(iii) for distinct $s, t \in \Omega$ there exists $a \in A$ such that $\hat a(s) \ne \hat a(t)$ (the image of $\Gamma_A$ separates the points of $\Omega$).
Exercise 11. Prove part (i) of Theorem 6'.
Hint. The functional $f \mapsto f(a)$ is continuous on $A^*$ with respect to the weak* topology. Its restriction to the compact set in $B_{A^*}$ consisting of all characters (including $0$) vanishes at the point $0$.
Definition 5. The homomorphism $\Gamma_A$ introduced in this theorem is called the Gelfand transform of the commutative Banach algebra $A$.
Example 7 ("let well alone"). Let $A$ be the algebra $C_0(\Omega)$, where $\Omega$ is a locally compact Hausdorff space. Then, using the Alexandroff theorem, it is easy to prove (try!) that the Gelfand spectrum of this algebra coincides up to a homeomorphism with $\Omega$, and $\Gamma_A : A \to C_0(\Omega)$ is just the identity homomorphism.
The Gelfand transforms of different algebras are compatible. In the following exercise we shall again restrict ourselves to the unital case. Recall the functor $C : \mathrm{CHTop} \to \mathrm{Ban}_1$ considered in Example 3.1.1. Obviously, the same construction gives a functor from CHTop to UCBA, which we again denote by $C$.
Exercise 12. For every unital continuous homomorphism $\varphi : A \to B$ between unital commutative Banach algebras the following diagram is commutative:
$$\begin{array}{ccc} A & \xrightarrow{\ \varphi\ } & B \\ \Gamma_A \downarrow & & \downarrow \Gamma_B \\ C(\Omega(A)) & \xrightarrow{\ C(\Omega(\varphi))\ } & C(\Omega(B)) \end{array}$$
In other words, the family $\{\Gamma_A : A \in \mathrm{UCBA}\}$ is a natural transformation between the identity functor in UCBA and the composition of functors $C \circ \Omega : \mathrm{UCBA} \to \mathrm{UCBA}$.
* * *
To conclude this section, we return to the Fredholm operators, armed with the knowledge of topological properties of the group of invertible operators. We must fulfil our promise given in Section 3.5: to continue the discussion of the stability of the index. Denote by $\Phi(E, F)$ the set of Fredholm operators between Banach spaces $E$ and $F$ and write $\Phi(E)$ instead of $\Phi(E, E)$.
Theorem 7 (Stability of index under small perturbations; cf. Proposition 3.5.3). The set $\Phi(E, F)$ is open in $\mathcal{B}(E, F)$, and the function $\mathrm{Ind} : \Phi(E, F) \to \mathbb{Z} : S \mapsto \mathrm{Ind}(S)$ is continuous.
Proof. Take $S \in \Phi(E, F)$. Our first goal is to show that if the norm of $T \in \mathcal{B}(E, F)$ is small, then the operator $S + T$ is also a Fredholm operator. By the Nikol'skii theorem the coset $S + \mathcal{K}(E, F)$ is an isomorphism in the category $\mathrm{Ban}/\mathcal{K}$; let $R + \mathcal{K}(F, E)$ be its inverse. Then, as you can easily see, for every $T \in \mathcal{B}(E, F)$ we have $R(S + T) + \mathcal{K}(E) = U + \mathcal{K}(E)$ and $(S + T)R + \mathcal{K}(F) = V + \mathcal{K}(F)$, where $U := 1 + RT$ and $V := 1 + TR$. Now we assume that $\|T\| < \|R\|^{-1}$, so that $\|RT\|, \|TR\| < 1$. Then, by Proposition 4(i) for $\mathcal{B}(E)$ and $\mathcal{B}(F)$ as $A$, the operators $U$ and $V$ are invertible. As a corollary, the morphism $(S + T) + \mathcal{K}(E, F)$ of the category $\mathrm{Ban}/\mathcal{K}$ has the left inverse $U^{-1}R + \mathcal{K}(F, E)$ and the right inverse $RV^{-1} + \mathcal{K}(F, E)$. Hence (by Proposition 0.5.1), it is an isomorphism. By the same Nikol'skii theorem, $S + T \in \Phi(E, F)$.
Now, look at the indices. We know that $RS \in 1 + \mathcal{K}(E)$ and $R(S + T) \in U + \mathcal{K}(E)$, where $U$ is invertible. Combining this with Theorem 3.5.1 and Proposition 3.5.3 we obtain the identities $\mathrm{Ind}(R) + \mathrm{Ind}(S) = \mathrm{Ind}(1) = 0$ and $\mathrm{Ind}(R) + \mathrm{Ind}(S + T) = \mathrm{Ind}(U) = 0$. Thus, $\mathrm{Ind}(S) = \mathrm{Ind}(S + T)$, i.e., $\mathrm{Ind}$ is a locally constant function. The rest is clear. •
Remark. However, the dimensions of the kernel and cokernel of a Fredholm operator, taken separately, are not stable under small perturbations. Consider, for example, an arbitrary Fredholm (i.e., having finite-dimensional kernel) projection $P$. Among all its "arbitrarily small" perturbations there are operators with zero kernel, namely, operators of the form $P + \lambda 1$, which are invertible for sufficiently small non-zero $\lambda \in \mathbb{C}$.
Now we say a few words about further properties of the topological space … is non-empty and path-connected.4
Exercise 13*. Prove the part of this theorem concerning the non-emptiness.
Hint. Consider the operators that are weakly similar, as morphisms in Ban, to the $n$-th power of the operators of left (for $n > 0$) and right (for $n < 0$) shift in $l_2$.
The second part of this theorem, dealing with the path-connectedness, is more difficult. It relies on the following fact: the group $\mathrm{Inv}(\mathcal{B}(H))$, where $H$ is a Hilbert space (with topology inherited from $\mathcal{B}(H)$), is path-connected (see, e.g., [84, pp. 76-77]). We suggest that you do the following exercise.
Exercise 14*. Complete the proof of Theorem 8, taking the path-connectedness of $\mathrm{Inv}(\mathcal{B}(H))$ for granted.
Hint. Take $S_1, S_2 \in \Phi_{-1}(l_2)$. If we succeed in connecting them by a path, then the general case also becomes clear. For $k = 1, 2$ the operator $T_l S_k$ has the form $U_k - T_k$, where $U_k$ is invertible (Exercise 3.5.6). Hence, it can be connected with $U_k$ by the path $t \mapsto U_k - tT_k$. Therefore the path-connectedness of the group $\mathrm{Inv}\,\mathcal{B}(l_2)$ allows us to connect $T_l S_1$ with $T_l S_2$ by a path in $\Phi_0(l_2)$. But the Fredholm projection $T_r T_l$ is connected by a path with $1$ in
Are the sets
4 The proof of this theorem is explained in the following exercises, where references are also given.
Chapter 6
Hilbert Adjoint Operators and the Spectral Theorem
1. Hilbert adjointness: First information
It is no secret that more than 90% of papers on the theory of operators are devoted to operators acting in Hilbert spaces. Why is it the case? It is because the algebra B(H) , where H is a Hilbert space, has an important algebraic operation that does not exist in B(E) for general Banach spaces E. This is the passage to the Hilbert adjoint operator. The reader is already familiar with the Banach adjoint operator: it acts between the spaces that are dual to the initial Banach spaces, and it "changes the direction of arrows" . Of course, this applies to operators between two Hilbert spaces as well. But in this case, using special features of Hilbert spaces, we can assign to the initial operator the operator connecting the same spaces (and not their dual spaces) . Everywhere in this section, unless specified otherwise, H and K are arbitrary Hilbert spaces and T H � K is an arbitrary bounded operator. To avoid misunderstanding, from now on we shall denote the Banach adjoint operator for T by Tb* . Let us recall the canonical bijections I H � H * and J K � K* provided by the Riesz theorem. :
Proposition 1. There exists a unique bounded operator $T^{h*}$ making the diagram
$$\begin{array}{ccc} K & \xrightarrow{\;T^{h*}\;} & H \\ {\scriptstyle J}\downarrow & & \downarrow{\scriptstyle I} \\ K^* & \xrightarrow{\;T^{b*}\;} & H^* \end{array}$$
commutative.
Proof. Clearly, there is only one mapping making this diagram commutative, namely, $I^{-1} T^{b*} J$. It is a composition of a linear operator (in the middle) and two conjugate-linear operators. Hence, it is easy to check that this is a "true" linear operator. Since $T^{b*}$ is bounded and $J$ and $I^{-1}$ preserve norms, our operator is also bounded. •
The following definition is the most important in the entire book.
Definition 1. The constructed operator $T^{h*}$ (i.e., $I^{-1} T^{b*} J$) is called the Hilbert adjoint operator for the operator $T$.
Warning. From now on everywhere in this book we will say, more and more often, just "adjoint" instead of "Hilbert adjoint" and use the notation $T^*$ instead of $T^{h*}$.
The introduced operator can be characterized by two equivalent fundamental identities. Each of them is often given in textbooks as the initial definition of the Hilbert adjoint operator.
Theorem 1. (i) $T^*$ is the unique operator from $K$ to $H$ such that for all $x \in H$ and $y \in K$ we have $\langle Tx, y \rangle = \langle x, T^*y \rangle$;
(ii) the same is true if we replace this equality with $\langle y, Tx \rangle = \langle T^*y, x \rangle$.
Proof. (i) Taking into account the commutativity of the diagram above, we have
$$\langle Tx, y \rangle = [Jy](Tx) = [T^{b*}Jy](x) = [IT^*y](x) = \langle x, T^*y \rangle.$$
It remains to prove the uniqueness. If for a linear operator $S : K \to H$ and for all $x \in H$, $y \in K$ we have $\langle Tx, y \rangle = \langle x, Sy \rangle$, then for the same $x, y$ we have $x \perp (T^* - S)y$. Therefore $S = T^*$.
(ii) This assertion follows from (i) and the conjugate symmetry of the inner product. •
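As a finite-dimensional illustration of the adjointness formulas, the following minimal sketch takes $H = \mathbb{C}^2$, $K = \mathbb{C}^3$ and assumes (this is not stated in the text) that $T$ is given by a matrix, so that $T^*$ is its conjugate transpose. The helper names `inner`, `apply`, `adjoint` and the sample data are hypothetical.

```python
# Sketch under stated assumptions: H = C^2, K = C^3, T is a 3x2 complex matrix,
# and the inner product is linear in its first argument, as in the text.

def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(m, x):
    """Matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def adjoint(m):
    """Conjugate transpose of the matrix m."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

T = [[1 + 2j, 0j], [3j, 1 - 1j], [2 + 0j, -1j]]   # T : C^2 -> C^3
x = [1 - 1j, 2 + 0.5j]                            # a vector in H
y = [0.5j, 1 + 0j, -2 + 1j]                       # a vector in K

# Theorem 1(i): <Tx, y> = <x, T*y>
gap_i = abs(inner(apply(T, x), y) - inner(x, apply(adjoint(T), y)))
# Theorem 1(ii): <y, Tx> = <T*y, x>
gap_ii = abs(inner(y, apply(T, x)) - inner(apply(adjoint(T), y), x))
print(gap_i, gap_ii)
```

Both printed gaps should vanish up to rounding; the check says nothing about infinite-dimensional spaces, where the existence of $T^*$ rests on the Riesz theorem.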
We call the identities in Theorem 1 the adjointness formulas.
Here is another approach to the discussed notion. Let $R : H \times K \to \mathbb{C}$ be a bounded conjugate-bilinear functional. Denote by $R^* : K \times H \to \mathbb{C}$ the mapping $(y, x) \mapsto \overline{R(x, y)}$. Clearly, it is also a bounded conjugate-bilinear functional. The adjointness formulas evidently imply the following result.
Proposition 2. Suppose $S_T$ is the conjugate-bilinear functional associated with $T$ (see Section 2.3). Then $T^*$ is the (unique) operator to which $(S_T)^*$ is associated. •
The case where $H$ and $K$ coincide is the most interesting one. Certainly, in this case $T^* := T^{h*}$ acts in the same $H$ (contrary to $T^{b*}$). In the algebra $B(H)$ an additional structure arises. It is the extremely important operation $T \mapsto T^*$ (the "Hilbert star"). Later in the book we will learn more about it.
When defining Hilbert spaces, we were, as usual, speaking about complex Hilbert spaces. Certainly, Definition 1 can be repeated verbatim for real Hilbert spaces. However, in that context the Hilbert adjointness is nothing but the Banach adjointness. The reason is that in the real case the canonical bijections are real (linear) isometric isomorphisms. Hence, the diagram in Proposition 1 shows that the operators $T^{h*}$ and $T^{b*}$ are weakly unitarily equivalent, and in the case $H = K$ even unitarily equivalent.
For complex Hilbert spaces, the Banach adjointness and the Hilbert adjointness are far from being the same. The following simple observation confirms this.
Exercise 1. Suppose $T := i\mathbf{1} : l_2 \to l_2$. Then, up to unitary equivalence, $T^{b*}$ coincides with the same $i\mathbf{1}$, while $T^* = -i\mathbf{1}$. As a corollary, $T^*$ is not unitarily equivalent, and not even topologically equivalent, to $T$.
Hint. The operator implementing the unitary equivalence acts by the formula $\eta \mapsto f_{\bar\eta}$ (Exercise 1.6.3), and the canonical bijection, by the formula $\eta \mapsto f_\eta$.
(However, to give a precise meaning to the statement that the Hilbert adjointness is the same thing as the Banach adjointness in the real case and a different thing in the complex case, we need the language of functors. See Exercise 2 below.)
Remark. At the same time, the two constructions have much in common. For instance, the Banach adjoint operator is topologically injective, topologically surjective, isometric, or coisometric $\iff$ the Hilbert adjoint operator has the respective property. This evidently follows from the fact that the mappings $I$ and $J$ are isometries of metric spaces.
The Hilbert star has the following properties.
Proposition 3. Suppose $T : H_1 \to K_1$ and $S : H_2 \to K_2$ are bounded operators between Hilbert spaces. Then
(i) if $H_1 = H_2$ and $K_1 = K_2$, then $(S + T)^* = S^* + T^*$ (additivity);
(ii) $(\lambda T)^* = \bar\lambda T^*$ for all $\lambda \in \mathbb{C}$ (conjugate homogeneity; compare this with "simple" homogeneity in Exercise 2.5.2(ii));
(iii) if $H_1 = K_2$, then $(TS)^* = S^*T^*$ (antihomomorphy);
(iv) $T^{**} = T$ (period 2);
(v) $\|T^*\| = \|T\|$.
In addition, $1_H^* = 1_H$ for every Hilbert space $H$.
Proof. All these properties easily follow from the adjointness formulas. We shall restrict ourselves to the verification of (iv). Theorem 1 applied to $T^*$ implies that for every $x \in H_1$ and $y \in K_1$ we have $\langle T^*y, x \rangle = \langle y, T^{**}x \rangle$, and at the same time
$$\langle T^*y, x \rangle = \overline{\langle x, T^*y \rangle} = \overline{\langle Tx, y \rangle} = \langle y, Tx \rangle.$$
From this, by the uniqueness mentioned in Theorem 1, it follows that $T^{**} = T$. •
Note that the formula $1^* = 1$ (which, of course, can be proved directly) follows from (iii) and (iv). Prove this as an easy exercise.
Corollary 1. The mapping $T \mapsto T^*$ from $B(H, K)$ to $B(K, H)$ is continuous in the operator norm; in particular, $T_n \to T$ as $n \to \infty$ implies $T_n^* \to T^*$ as $n \to \infty$.
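The algebraic properties (ii), (iii) and (iv) of the Hilbert star can be checked on small matrices. The sketch below assumes operators on $\mathbb{C}^2$ given by matrices with integer entries, so that floating-point arithmetic is exact and plain equality can be used; the helpers `adjoint`, `matmul`, `scale` and the sample matrices are hypothetical.

```python
# Sketch: operators on C^2 as 2x2 complex matrices; the adjoint is the
# conjugate transpose (a finite-dimensional assumption, not the general case).

def adjoint(m):
    """Conjugate transpose of m."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scale(c, m):
    """Scalar multiple c m."""
    return [[c * e for e in row] for row in m]

S = [[1 + 1j, 2 + 0j], [0j, 1 - 3j]]
T = [[2j, 1 + 0j], [1 - 1j, -1 + 0j]]
lam = 2 - 3j

conj_homogeneity = adjoint(scale(lam, T)) == scale(lam.conjugate(), adjoint(T))  # (ii)
antihomomorphy = adjoint(matmul(T, S)) == matmul(adjoint(S), adjoint(T))         # (iii)
period_two = adjoint(adjoint(T)) == T                                            # (iv)
print(conj_homogeneity, antihomomorphy, period_two)
```

Note the reversed order of the factors in (iii): replacing the right-hand side by `matmul(adjoint(T), adjoint(S))` generally breaks the equality.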
Applications of the following fact are so important that it deserves the rank of a theorem.
Theorem 2. For every $T \in B(H, K)$ we have $\|T^*T\| = \|T\|^2$.
Proof. The inequality $\le$ follows from the fact that the Hilbert star preserves the norm, and from the multiplicative inequality. The inequality $\ge$ is proved as follows: for each $x \in B_H$ we have
$$\|Tx\|^2 = \langle Tx, Tx \rangle = \langle T^*Tx, x \rangle \le \|T^*T\| \, \|x\|^2 \le \|T^*T\|.$$ •
The established identity is called the C*-identity. This name does not sound extremely pleasant, but over the years it has become standard in mathematics. (In Section 3 we shall say a few words about the origin of this name.)
As with almost every important construction, the operation $T \mapsto T^{h*}$ hides a functor. It is the contravariant Hilbert adjointness functor $(h*) : \mathrm{Hil} \to \mathrm{Hil}$. This functor does not change objects but takes every morphism (i.e., bounded operator) $T$ to $T^{h*}$. The axioms of a contravariant functor immediately follow from Proposition 3(iii). In addition to this
functor, we consider the functor $(b*) : \mathrm{Hil} \to \mathrm{Hil}$ defined as an obvious "birestriction" of the Banach adjointness functor from Section 2.5. A similar pair of functors acts in $\mathrm{Hil}_1$.
Exercise 2. Considered in either of the categories $\mathrm{Hil}$ or $\mathrm{Hil}_1$, the functors $(h*)$ and $(b*)$ are not naturally equivalent. However, they become naturally equivalent if we replace complex Hilbert spaces with real Hilbert spaces.
What are the (Hilbert) adjoint operators for our standard examples of operators acting in Hilbert spaces? The following several exercises can be done by elementary verification.
Exercise 3.
(i) The adjoint operator for the diagonal operator $T_\lambda : l_2 \to l_2$ is $T_{\bar\lambda}$, where $\bar\lambda$ is the complex conjugate sequence for $\lambda$.
(ii) The adjoint operator for the operator of left shift $T_l : l_2 \to l_2$ is the operator of right shift $T_r$, and vice versa.
(iii) The adjoint operator for the operator of multiplication by a function $f$, $T_f : L_2(X, \mu) \to L_2(X, \mu)$, is $T_{\bar f}$, where $\bar f$ is the complex conjugate function for $f$.
(iv) The adjoint operator for the operator of indefinite integration in $L_2[0, 1]$ acts by the formula $x \mapsto y$, where $y(t) := \int_t^1 x(s)\,ds$.
The following proposition, generalizing the last part of the previous exercise, deserves special attention.
Proposition 4. Let $T_K : L_2[a, b] \to L_2[a, b]$ be an integral operator with the kernel $K(s, t)$. Then the adjoint operator is the integral operator $T_{K^*}$ with the kernel $K^*(s, t) := \overline{K(t, s)}$.
Proof. For each $x, z \in L_2[a, b]$ we have
$$\langle T_K x, z \rangle = \int_a^b [T_K x](s)\,\overline{z(s)}\,ds = \int_a^b \left( \int_a^b K(s, t)\,x(t)\,dt \right) \overline{z(s)}\,ds.$$
Since $K(s, t)$ and $x(t)\overline{z(s)}$ belong to $L_2(\square)$, where $\square$ is the square $[a, b] \times [a, b]$, the product of these functions is integrable in the square $\square$. Hence, by the Fubini theorem, the indicated double integral is
$$\int_a^b \left( \int_a^b K(s, t)\,x(t)\,\overline{z(s)}\,ds \right) dt = \int_a^b x(t) \left( \int_a^b K(s, t)\,\overline{z(s)}\,ds \right) dt = \int_a^b x(t)\,\overline{\left( \int_a^b K^*(t, s)\,z(s)\,ds \right)}\,dt = \langle x, T_{K^*} z \rangle.$$ •
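Exercise 3(ii) can be explored on finitely supported sequences. The sketch below is a minimal illustration, assuming such a sequence is stored as a Python list of its leading coordinates (all later coordinates being zero); the function names are hypothetical.

```python
# Sketch: finitely supported elements of l_2 stored as lists of leading
# coordinates; coordinates beyond the end of a list are understood to be zero.

def inner(u, v):
    """<u, v>; zip stops at the shorter list, where the remaining products vanish."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def left_shift(x):
    """T_l : (x1, x2, x3, ...) -> (x2, x3, ...)."""
    return x[1:]

def right_shift(x):
    """T_r : (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0j] + x

x = [1 + 1j, 2j, 3 + 0j]
y = [1j, 1 - 1j, 2 + 0j, 5 + 0j]

lhs = inner(left_shift(x), y)      # <T_l x, y>
rhs = inner(x, right_shift(y))     # <x, T_r y>
print(lhs, rhs)

# T_l T_r is the identity, while T_r T_l erases the first coordinate.
print(left_shift(right_shift(x)) == x, right_shift(left_shift(x)))
```

The last line also previews why the shifts fail to be normal: the two products $T_l T_r$ and $T_r T_l$ differ.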
Now let us discuss relations between the main geometric objects related to a given operator and to its adjoint. The following theorem is a further
generalization of the observations made by Fredholm in the context of integral equations, and then reinterpreted in Hilbert's school.
Theorem 3. Suppose $T : H \to K$ is an operator between two Hilbert spaces. Then $(\mathrm{Im}(T))^\perp = \mathrm{Ker}(T^*)$ and $(\mathrm{Ker}(T))^\perp = \mathrm{Im}(T^*)^-$ (the small bar, as always, means the closure). In other words, $K = \mathrm{Im}(T)^- \oplus \mathrm{Ker}(T^*)$ and $H = \mathrm{Ker}(T) \oplus \mathrm{Im}(T^*)^-$.
Proof. The first equality follows from the chain of equivalences: $x \in (\mathrm{Im}(T))^\perp \iff \langle x, Ty \rangle = 0$ for all $y \in H \iff \langle T^*x, y \rangle = 0$ for all $y \in H \iff T^*x = 0 \iff x \in \mathrm{Ker}(T^*)$. Taking into account Propositions 2.3.4 and 3(iv), we see that this implies $\mathrm{Im}(T^*)^- = \mathrm{Im}(T^*)^{\perp\perp} = \mathrm{Ker}(T^{**})^\perp = \mathrm{Ker}(T)^\perp$. The rest is clear. •
Proposition 5. If $T$ acts on $H$, and $H_0$ is an invariant subspace for $T$, then $H_0^\perp$ is an invariant subspace for $T^*$.
Proof. Since $y \in H_0$ implies $Ty \in H_0$, from $\langle x, y \rangle = 0$ for each $y \in H_0$ we have $\langle T^*x, y \rangle = \langle x, Ty \rangle = 0$ for the same $y$. The rest is clear. •
Let us show that a number of possible properties of a given operator are preserved under the passage to the adjoint operator.
Proposition 6. If $T$ is a one-dimensional operator of the form $x \bigcirc y$ (cf. the beginning of Section 3.4), then $T^*$ is also one-dimensional, and $T^* = y \bigcirc x$.
Proof. This follows from the elementary verification of the adjointness relations. •
Proposition 7. If $T$ is finite-dimensional, then $T^*$ is finite-dimensional as well, and $\dim(\mathrm{Im}(T)) = \dim(\mathrm{Im}(T^*))$.
Proof. Since the image $\mathrm{Im}(T)$ is closed, Theorem 3 shows that $\mathrm{Im}(T^*) = T^*(\mathrm{Ker}(T^*) \oplus \mathrm{Im}(T)) = \mathrm{Im}(T^*|_{\mathrm{Im}(T)})$. Since $\mathrm{Ker}(T^*) \cap \mathrm{Im}(T) = \{0\}$, we see that $T^*|_{\mathrm{Im}(T)}$ must be injective. The rest is clear. •
Exercise 4. Deduce this fact from Proposition 6.
Hint. Take the representation $T = \sum_{k=1}^n x_k \bigcirc y_k$ with linearly independent systems $\{x_1, \dots, x_n\}$ and $\{y_1, \dots, y_n\}$.
Proposition 8. If $T$ is compact, then $T^*$ is compact as well. If in addition $T = \sum_n s_n e'_n \bigcirc e''_n$ is the decomposition of the operator as in the Schmidt theorem, then $T^* = \sum_n s_n e''_n \bigcirc e'_n$.
Proof. The compactness of $T^*$ does not require using the Schmidt theorem. Namely, by Proposition 3.3.7, $T$ is approximated in the operator norm by finite-dimensional operators. In view of Proposition 3(i), (v), $T^*$ is approximated by the adjoints to finite-dimensional operators, which, by the previous proposition, are finite-dimensional as well.
Now we recall the Schmidt theorem and consider the decomposition of $T$ indicated there. From Proposition 6, taking into account algebraic and topological properties of the Hilbert star, we immediately obtain the required decomposition for $T^*$. Of course, the latter also shows that $T^*$ is approximated by finite-dimensional operators. •
Proposition 9. If $S$ is a Fredholm operator, then $S^*$ is a Fredholm operator as well.
Proof. According to the $\Longrightarrow$ part of the Nikol'skii Theorem 3.5.6, there are operators $R, T_1, T_2$ with the properties indicated there. But by Propositions 3 and 8, the operators $R^*, T_1^*, T_2^*$ satisfy the hypothesis of the $\Longleftarrow$ part of the same theorem. The rest is clear. •
Now let us go back to the Fredholm integral equations of the second kind (see Section 3.5). Together with the initial equation
$$x(s) - \int_a^b K(s, t)\,x(t)\,dt = y(s), \tag{1}$$
we shall consider the so-called adjoint equation
$$x(s) - \int_a^b K^*(s, t)\,x(t)\,dt = y(s). \tag{1*}$$
We have already noted that equation (1) is the same as the operator equation $Sx = y$ in the space $L_2[a, b]$, where $S = 1 - T$, and $T$ is the integral operator with kernel $K(s, t)$. Also, it is clear, taking Proposition 4 into account, that equation (1*) is the same as the operator equation $S^*x = y$. Consider the corresponding homogeneous equations
$$x(s) - \int_a^b K(s, t)\,x(t)\,dt = 0, \tag{2}$$
$$x(s) - \int_a^b K^*(s, t)\,x(t)\,dt = 0. \tag{2*}$$
Preserving the style of Theorem 3.5.5, let us formulate the promised addition.
Proposition 10 (Fredholm). Let $x_1, \dots, x_m$ and $z_1, \dots, z_n$ be families of linearly independent functions, occurring in Theorem 3.5.5.$^1$ Then
(i) the functions $z_1, \dots, z_n$ are solutions of the homogeneous equation (2*) such that every solution of this equation is a linear combination of the indicated solutions;
(ii) the right-hand sides $y(s)$ for which equation (1*) has at least one solution are precisely those functions for which $\int_a^b y(t)\,\overline{x_k(t)}\,dt = 0$ for $k = 1, \dots, m$.
$^1$ We know that $m = n$, but now it is not important.
Proof. By Theorem 3.5.5(ii), $\mathrm{Im}(S) = \mathrm{span}\{z_1, \dots, z_n\}^\perp$. Taking the orthogonal complement and using Theorem 3, we have that $\mathrm{Ker}(S^*) = \mathrm{span}\{z_1, \dots, z_n\}$, and this is equivalent to (i).
Furthermore, Theorem 3.5.5(i) gives $\mathrm{Ker}(S) = \mathrm{span}\{x_1, \dots, x_m\}$, and Proposition 9 implies that $\mathrm{Im}(S^*)$ is closed. Hence, by Theorem 3, $\mathrm{Im}(S^*) = \mathrm{Ker}(S)^\perp = \{x_1, \dots, x_m\}^\perp$, and this is equivalent to (ii). •
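The solvability condition has a transparent finite-dimensional analogue: for a matrix $S$ the equation $Sx = y$ is solvable exactly when $y \perp \mathrm{Ker}(S^*)$ (Theorem 3, where the image is automatically closed). The following sketch, with a hypothetical singular matrix on $\mathbb{C}^2$ and hypothetical helper names, illustrates this on one solvable and one unsolvable right-hand side.

```python
# Sketch: a singular operator S on C^2 with Im(S) = span{(1, 0)} and
# Ker(S*) = span{(0, 1)}; the matrix and vectors are illustrative choices.

def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(m, x):
    """Matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def adjoint(m):
    """Conjugate transpose of m."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

S = [[1 + 0j, 1j], [0j, 0j]]
z = [0j, 1 + 0j]                       # candidate generator of Ker(S*)
S_star_z = apply(adjoint(S), z)        # should be the zero vector

y_good = [2 + 1j, 0j]                  # <y_good, z> = 0, so y_good lies in Im(S)
y_bad = [1 + 0j, 1 + 0j]               # <y_bad, z> != 0, so S x = y_bad has no solution
x = [2 + 1j, 0j]                       # an explicit solution of S x = y_good

print(S_star_z, inner(y_good, z), inner(y_bad, z), apply(S, x) == y_good)
```

Since every vector in $\mathrm{Im}(S)$ has vanishing second coordinate, `y_bad` indeed cannot be reached, in agreement with the failed orthogonality test.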
Here is another application of Theorem 3.
Exercise 5 (cf. Exercises 2.5.6 and 2.5.7).
(i) The adjoint to an isometric operator between Hilbert spaces is coisometric, and vice versa.
(ii) The adjoint operator to a topologically injective operator is topologically surjective, and vice versa.
Hint. If $V : H_1 \to H_2$ is coisometric, then the adjointness relations show that $V^*$ takes $y$ to the unique vector $x \perp \mathrm{Ker}(V)$ for which $Vx = y$.
Now we show that some classes of operators defined in terms of Hilbert geometry can be characterized in purely algebraic terms using the operation of Hilbert star. In a compressed form, what we do looks like the following small algebra-geometry dictionary:
    unitary operator                   $U^* = U^{-1}$
    orthogonal projection              $P^* = P = P^2$
    orthogonal reflection              $J^* = J = J^{-1}$
    isometric operator                 $V^*V = 1$
    coisometric operator               $VV^* = 1$
    partially isometric operator       $WW^*W = W$
Let us explain what all this means.
Proposition 11. An operator $U : H_1 \to H_2$ is unitary $\iff$ it is invertible, and its inverse operator coincides with its adjoint operator.
Proof. $\Longrightarrow$ Since $U$ is invertible and preserves the inner product, for all $x \in H_1$, $y \in H_2$ we have $\langle x, U^{-1}y \rangle = \langle Ux, UU^{-1}y \rangle = \langle Ux, y \rangle = \langle x, U^*y \rangle$.
$\Longleftarrow$ For all $x, y \in H_1$ we have $\langle Ux, Uy \rangle = \langle x, U^*Uy \rangle = \langle x, U^{-1}Uy \rangle = \langle x, y \rangle$. •
Proposition 12. An operator $P : H \to H$ is an orthogonal projection $\iff$ it is idempotent and coincides with its adjoint operator.
Proof. $\Longrightarrow$ Since $P$ is a projection, we have $P^2 = P$. By assumption, $\mathrm{Ker}(P) \perp \mathrm{Im}(P)$. So for all $x, y \in H$, taking into account that $Py - y \in \mathrm{Ker}(P)$, we have $\langle Px, y \rangle = \langle Px, y + (Py - y) \rangle = \langle Px, Py \rangle$, and similarly $\langle x, Py \rangle = \langle Px, Py \rangle$. This implies that $P^* = P$.
$\Longleftarrow$ Since $P^2 = P$, $P$ is a projection. Taking into account that $P^* = P$, Theorem 3 gives that $\mathrm{Ker}(P) \perp \mathrm{Im}(P)$. •
Warning. From this moment on, everywhere in this chapter "projection" means an orthogonal projection: we shall not need other projections.
In the next several exercises we ask you to describe the geometric action of an operator, starting from its algebraic properties.
Exercise 6. How does $J : H \to H$ act if we know that it coincides with its inverse and with its adjoint operator?
Answer. There is an orthogonal decomposition $H = H_+ \oplus H_-$ such that $J = 1$ on $H_+$ and $J = -1$ on $H_-$. (An operator acting in this way is called an orthogonal reflection.)
Hint. Look at the obvious identity $x = \frac{1}{2}(x + Jx) + \frac{1}{2}(x - Jx)$.
Exercise 7. How does $V : H_1 \to H_2$ act if we know that $V^*V = 1$? And what if we know that $VV^* = 1$?
Answer. The first equality describes isometric operators, and the second, coisometric ones.
The following class of operators contains all operators from our dictionary except orthogonal reflections.
Exercise 8. How does $W : H_1 \to H_2$ act if we know that $WW^*W = W$? How does its adjoint operator act?
Answer. There are closed subspaces $K_1 \subseteq H_1$ and $K_2 \subseteq H_2$ such that $W$ isometrically (i.e., unitarily) maps $K_1$ onto $K_2$ and takes $K_1^\perp$ to zero. At the same time $W^*$ maps $K_2$ onto $K_1$, acting as the inverse to $W$, and takes $K_2^\perp$ to zero.
Hint. The ugly looking identity is equivalent simply to the fact that $W^*W$ is an orthogonal projection, say $P$. Hence, for $K_1 := \mathrm{Im}(P)$ and $x, y \in K_1$ we have $\langle Wx, Wy \rangle = \langle x, y \rangle$.
The operators we have just talked about are called partially isometric. They play an important role in the theory of operators and operator algebras (classification of factors [46], operator K-theory [84], and a series of other questions; cf. Exercises 2.7 and 4.9 below).
Finally, let us recall spectra. Those readers who have done Exercise 5.1.1 (which is actually not very simple) know how the spectra behave under the operation of Banach star. It is much easier to understand how the spectra and various special subsets of spectra defined in Section 5.1 react to the Hilbert star.
Exercise 9. Let $T : H \to H$ be an operator on a Hilbert space. Then $\sigma(T^*) = \{\bar\lambda : \lambda \in \sigma(T)\}$, and
(i) if $\lambda \in \sigma_r(T)$, then $\bar\lambda \in \sigma_p(T^*)$;
(ii) if $\lambda \in \sigma_p(T)$, then $\bar\lambda \in \sigma_p(T^*)$ or $\bar\lambda \in \sigma_r(T^*)$;
(iii) $\sigma_c(T^*) = \{\bar\lambda : \lambda \in \sigma_c(T)\}$;
(iv) $\sigma_e(T^*) = \{\bar\lambda : \lambda \in \sigma_e(T)\}$.
Remark. Both situations in (ii) can actually happen. Give the corresponding examples.
2. Selfadjoint operators and their spectra. Hilbert-Schmidt theorem
Let us concentrate on operators acting in a chosen Hilbert space $H$. Their adjoint operators also act in $H$ (i.e., belong to the same $B(H)$). Naturally, we have an interesting class of operators, which appears in various problems of mathematics and physics.
Definition 1. An operator $T \in B(H)$ is called selfadjoint (or Hermitian) if it coincides with its Hilbert adjoint operator (i.e., $T^* = T$).
The adjointness formulas show that a selfadjoint operator is exactly an operator satisfying the identity
$$\langle Tx, y \rangle = \langle x, Ty \rangle; \quad x, y \in H. \tag{1}$$
Proposition 1.2 allows us to characterize selfadjoint operators as those $T$ for which $S_T = (S_T)^*$. We come close to another useful characterization, this time in terms of quadratic forms. For an operator $T$ (so far arbitrary), its quadratic form is defined as the function $Q_T : H \to \mathbb{C} : x \mapsto S_T(x, x)$ (or, what is the same, $x \mapsto \langle Tx, x \rangle$). Note that Proposition 1.2.1 immediately implies
Proposition 1. $T = 0 \iff Q_T = 0$ (i.e., $\langle Tx, x \rangle = 0$ for all $x \in H$). •
Remark. Here it is important that we speak about complex linear spaces. For a comparison, look at the operator of rotation through 90 degrees in $\mathbb{R}^2$.
Proposition 2. An operator $T$ is selfadjoint $\iff$ its quadratic form takes real values (i.e., $\langle Tx, x \rangle \in \mathbb{R}$ for all $x \in H$).
Proof. By the previous proposition, $T = T^* \iff Q_T = Q_{T^*}$. At the same time, for each $S \in B(H)$ we have $Q_{S^*} = \overline{Q_S}$. •
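The remark about the rotation through 90 degrees can be made concrete. In the sketch below (hypothetical helper names and sample vectors) the quadratic form of the rotation vanishes on real vectors, although the operator is not zero, while on the complexified space $\mathbb{C}^2$ it takes a non-zero, and even non-real, value, as Propositions 1 and 2 predict for a non-zero, non-selfadjoint operator.

```python
# Sketch: the rotation through 90 degrees, viewed first on R^2 and then on C^2.

def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k); reduces to the dot product on real vectors."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(m, x):
    """Matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

R = [[0, -1], [1, 0]]                  # (a, b) -> (-b, a)

real_vectors = [[1, 0], [0, 1], [2, -3], [0.5, 4]]
real_forms = [inner(apply(R, x), x) for x in real_vectors]   # <Rx, x> on R^2

z = [1 + 0j, 1j]                       # a genuinely complex vector
complex_form = inner(apply(R, z), z)   # <Rz, z> on C^2

print(real_forms, complex_form)
```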
In addition to the selfadjoint operators, the more general, so-called normal operators also deserve to be mentioned. These are operators that commute with their adjoint operators, i.e., operators $T$ such that $T^*T = TT^*$. Obviously, selfadjoint operators and unitary operators are normal.
Remark. Actually, much of what we will say about selfadjoint operators is true for normal operators as well. But in this book we pay much less attention to them just because selfadjoint operators are geometrically more visual and appear more often in applications.
The set of selfadjoint operators on $H$ is denoted by $B(H)_{sa}$. Note that, according to Proposition 1.3, this is a real (but not complex!) Banach space. Obviously, for every $T \in B(H)$ there exists a unique pair $T_{re}, T_{im} \in B(H)_{sa}$ such that
$$T = T_{re} + i\,T_{im}. \tag{2}$$
These operators are $T_{re} := \frac{1}{2}(T + T^*)$ and $T_{im} := \frac{1}{2i}(T - T^*)$. Two other selfadjoint operators that we can construct for every $T$ are $T^*T$ and $TT^*$.
Remark. A well-justified and fruitful way of interpreting operators on Hilbert spaces is to view them as a profound "non-commutative" generalization of complex numbers. From this point of view the passage to the adjoint operator corresponds to the passage to the complex conjugate number, selfadjoint operators play the role of real numbers, and equality (2) generalizes the algebraic form of complex numbers (and becomes precisely the latter when $H := \mathbb{C}$). You may ask what is an analogue of the polar decomposition (i.e., trigonometric form) of complex numbers? As we will see later, there are two possible candidates for the role of polar decomposition, and they differ because of the non-commutativity of operator multiplication. But we are not ready to discuss this yet (see Exercises 7 and 4.9 below). Here we only mention that in the algebra $B(H)$ the role of positive numbers
belongs to operators of the form $T^*T$, whereas the role of unimodular complex numbers belongs in some cases to unitary operators, and in other cases to more general partially isometric operators from Exercise 1.8.
Here are several standard examples. From Exercise 1.3 you can see that
- a diagonal operator $T_\lambda$ is always normal, and it is selfadjoint $\iff$ the sequence $\lambda$ consists of real numbers;
- (a more general fact) the operator $T_f$ of multiplication by a function $f$ is always normal, and it is selfadjoint $\iff$ $f$ takes real values almost everywhere;
- the operators of left and right shifts in $l_2$ are not selfadjoint, and not even normal: $T_l T_r = 1$, and at the same time $T_r T_l$ is a projection with the one-dimensional kernel $\mathrm{span}(p^1)$.
Proposition 1.4 immediately implies
Corollary 1. An integral operator on $L_2[a, b]$ is selfadjoint $\iff$ its kernel $K(s, t)$ satisfies the identity $K(s, t) = \overline{K(t, s)}$ almost everywhere in the square $\square$. (Such kernels are called symmetric.)
We formulate several simple properties of selfadjoint operators.
Proposition 3. Suppose $T$ is selfadjoint. Then
(i) the eigenvalues of $T$ are real numbers;
(ii) the eigenvectors of $T$ corresponding to different eigenvalues are orthogonal;
(iii) the kernel and the image are connected by the relations $\mathrm{Im}(T)^\perp = \mathrm{Ker}(T)$ and $\mathrm{Ker}(T)^\perp = \mathrm{Im}(T)^-$ (as always, the bar $^-$ means closure); in other words, $H = \mathrm{Ker}(T) \oplus \mathrm{Im}(T)^-$;
(iv) if $H_0$ is an invariant subspace under $T$, then $H_0^\perp$ is also an invariant subspace under $T$;
(v) a subspace $H_0$ in $H$ is invariant under $T$ $\iff$ $T$ commutes with the orthogonal projection $P : H \to H$ onto $H_0$;
(vi) $\|T^2\| = \|T\|^2$;
(vii) $r(T) = \|T\|$;
(viii) if $S$ is another selfadjoint operator commuting with $T$, then $ST$ is also selfadjoint; in particular, every power of $T$ is a selfadjoint operator.
Proof. (i) If $Tx = \lambda x$; $x \ne 0$, then $\lambda \langle x, x \rangle = \langle Tx, x \rangle = \langle x, Tx \rangle = \bar\lambda \langle x, x \rangle$. The rest is clear.
(ii) If $Tx = \lambda x$ and $Ty = \mu y$; $x, y \ne 0$, then, taking (i) into account, we have $\lambda \langle x, y \rangle = \langle Tx, y \rangle = \langle x, Ty \rangle = \mu \langle x, y \rangle$. Hence, $\lambda \ne \mu$ implies $\langle x, y \rangle = 0$.
(iii), (iv), (vi), (viii) immediately follow from Theorem 1.3, Proposition 1.5, Theorem 1.2, and Proposition 1.3(iii), respectively.
(v, $\Longrightarrow$) Since $H = H_0 \oplus H_0^\perp$, it is sufficient to show that $PTx = TPx$ for $x \in H_0$ and for $x \in H_0^\perp$. Clearly, in the first case both vectors we compare coincide with $Tx$, and in the second case, in view of (iv), they vanish.
(v, $\Longleftarrow$) If $x \in H_0$, then $PTx = TPx = Tx$, hence $Tx \in H_0$.
(vii) follows from (vi) and the spectral radius formula. •
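Parts (i) and (ii) can be seen on a small selfadjoint matrix. The sketch below uses a hypothetical $2 \times 2$ Hermitian matrix whose eigenpairs were worked out by hand; the code only verifies them and confirms that the eigenvectors belonging to the distinct eigenvalues 3 and 1 are orthogonal.

```python
# Sketch: a selfadjoint operator on C^2 with hand-computed eigenpairs.

def inner(u, v):
    """<u, v> = sum_k u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(m, x):
    """Matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def adjoint(m):
    """Conjugate transpose of m."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

T = [[2 + 0j, 1j], [-1j, 2 + 0j]]
selfadjoint = adjoint(T) == T

# (eigenvalue, eigenvector) pairs; both eigenvalues are real, as (i) requires.
pairs = [(3.0, [1j, 1 + 0j]), (1.0, [-1j, 1 + 0j])]

# Largest coordinate of T v - lambda v for each pair (zero for true eigenpairs).
residuals = [max(abs(a - lam * b) for a, b in zip(apply(T, v), v)) for lam, v in pairs]

# (ii): eigenvectors for different eigenvalues are orthogonal.
overlap = inner(pairs[0][1], pairs[1][1])
print(selfadjoint, residuals, overlap)
```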
Exercise 1*. Parts (ii), (iii), (vi), and (vii) in the previous proposition are valid for normal operators.
Hint. If $T$ is normal, then $\|T^*x\| = \|Tx\|$ for all $x \in H$; thus, we have (iii). Therefore, $\mathrm{Ker}(T^*) = \mathrm{Ker}(T)$ and $\mathrm{Ker}(T - \lambda 1) = \mathrm{Ker}(T^* - \bar\lambda 1)$, and this is (ii). Finally, the C*-identity gives $\|T^2\|^2 = \|(T^2)^*T^2\| = \|T^*T\|^2 = \|T\|^4$.
As for the classical Fredholm theorems on integral equations, they look completely transparent in this context.
Proposition 4 (Fredholm). Suppose
$$x(s) - \int_a^b K(s, t)\,x(t)\,dt = y(s) \tag{3}$$
is an integral equation of the second kind with symmetric kernel and
$$x(s) - \int_a^b K(s, t)\,x(t)\,dt = 0 \tag{4}$$
is the corresponding homogeneous equation. Then there exists a linearly independent system $x_1, \dots, x_m$ of solutions of equation (4) satisfying the following conditions:
(i) every solution of equation (4) is a linear combination of $x_1, \dots, x_m$;
(ii) the right-hand sides $y(s)$ for which equation (3) has at least one solution are precisely the functions satisfying the equation $\int_a^b y(t)\,\overline{x_k(t)}\,dt = 0$ for all $k = 1, \dots, m$. •
Let us now speak about spectra. Those who have done Exercise 1.9 know that the spectrum of a selfadjoint operator is symmetric with respect to the real axis. But actually a much stronger proposition is true: it lies on the real axis. The implications of this fact are so serious that we shall give two proofs for it: "algebraic" (or abstract) and "geometric" (or vector). Let us start the necessary preparations.
Proposition 5. Suppose $T$ is selfadjoint. Then the operator $U := \exp(iT)$ (see Definition 5.3.4) is unitary.
Proof. By the definition of the operator exponential, $U = 1 + \sum_{n=1}^\infty \frac{1}{n!}(iT)^n$. Hence, from the Hilbert star properties we have
$$U^* = 1 + \sum_{n=1}^\infty \frac{\bar{i}^{\,n}}{n!}\,T^n = 1 + \sum_{n=1}^\infty \frac{1}{n!}(-iT)^n = \exp(-iT).$$
Hence, by Corollary 5.3.4, $U^*$ and $U$ are mutually inverse operators, and our result follows from Proposition 1.11. •
Proposition 6. Suppose $T : H \to H$ is selfadjoint and topologically injective. Then it is a topological isomorphism.
Proof. By assumption, $\mathrm{Im}(T)$ is closed, hence, by Proposition 3(iii), it coincides with $H$. The rest is clear. •
Proposition 7. For each $S \in B(H)$ and $t > 0$ the operator $R := S^*S + t1$ is invertible.
Proof. By what we have already said, it is sufficient to prove that $R$ is topologically injective. For every $x \in H$ we have
$$\|Rx\| \, \|x\| \ge |\langle Rx, x \rangle| = |\langle S^*Sx, x \rangle + t \langle x, x \rangle| = \langle Sx, Sx \rangle + t \langle x, x \rangle \ge t \|x\|^2.$$
Therefore, $\|Rx\| \ge t \|x\|$, and after that Corollary 1.4.2 works. •
Theorem 1. The spectrum of a selfadjoint operator $T : H \to H$ belongs to the interval $[-\|T\|, \|T\|]$ and contains at least one of its ends.
Algebraic proof. Combining Corollary 5.3.1 with Proposition 5, we have $\sigma(\exp(iT)) \subseteq \mathbb{T}$. Hence, if $\lambda \in \sigma(T)$, then, applying Proposition 5.3.11 to $w(z) := \exp(iz)$, we see that $\exp(i\lambda) \in \mathbb{T}$. This obviously implies that $\lambda \in \mathbb{R}$, and it only remains to use Proposition 3(vii). •
Geometric proof. Suppose $\lambda \in \mathbb{C}$; $\lambda = s + it$ is such that $t \ne 0$. Put $S := T - s1$. Then, by Proposition 7 (applied with $t^2 > 0$ in place of $t$), the operator $S^*S + t^2 1 = S^2 + t^2 1 = (S + it1)(S - it1)$ is invertible, and thus, both factors, commuting with each other, are also invertible (see Proposition 5.2.2). Hence, $it \notin \sigma(S)$, and this is equivalent to the fact that $\lambda \notin \sigma(T)$. Thus, $\sigma(T) \subseteq \mathbb{R}$.
Similarly to how it was done in the algebraic proof, we can now complete the proof using Proposition 3(vii). But there are more elementary tools in our "vector" arguments. Take a sequence $x_n \in H$; $\|x_n\| = 1$ such that $\|Tx_n\| \to \|T\|$; $n \to \infty$. Then for every $n$ we have
$$\|(T^2 - \|T\|^2 1)x_n\|^2 = \langle (T^2 - \|T\|^2 1)x_n, (T^2 - \|T\|^2 1)x_n \rangle = \|T^2 x_n\|^2 - 2 \langle T^2 x_n, \|T\|^2 x_n \rangle + \|T\|^4 \le 2\|T\|^4 - 2\|T\|^2 \langle Tx_n, Tx_n \rangle = 2\|T\|^4 - 2\|T\|^2 \|Tx_n\|^2,$$
hence $(T^2 - \|T\|^2 1)x_n \to 0$ as $n \to \infty$. Of course, this means that the operator $T^2 - \|T\|^2 1$, or, what is the same, the operator $(T + \|T\|1)(T - \|T\|1)$, is not invertible. Therefore, either $\|T\|$ or $-\|T\|$ belongs to $\sigma(T)$, and it only remains to recall that $r(T) \le \|T\|$ (Theorem 5.3.1). •
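Proposition 5, on which the algebraic proof rests, can be tested numerically in finite dimensions. The sketch below sums a truncated power series for $\exp(iT)$ with a hypothetical selfadjoint $2 \times 2$ matrix $T$; the truncation length `terms=40` is an ad hoc choice that is more than enough for this small matrix, and the helper names are hypothetical.

```python
# Sketch: exp(iT) for a selfadjoint 2x2 matrix via a truncated power series,
# followed by a check that U*U is the identity up to rounding.

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(m):
    """Conjugate transpose of m."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

def expm(m, terms=40):
    """Partial sum 1 + m + m^2/2! + ... of the exponential series."""
    n = len(m)
    result = [[(1 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, m)]   # now m^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

T = [[2 + 0j, 1j], [-1j, 2 + 0j]]                # T* = T
U = expm([[1j * e for e in row] for row in T])   # U = exp(iT)
V = expm([[-1j * e for e in row] for row in T])  # exp(-iT)

UstarU = matmul(adjoint(U), U)
defect = max(abs(UstarU[i][j] - (1 if i == j else 0)) for i in range(2) for j in range(2))
star_gap = max(abs(adjoint(U)[i][j] - V[i][j]) for i in range(2) for j in range(2))
print(defect, star_gap)
```

Both numbers should be of the order of rounding errors: the first expresses unitarity of $U$, the second the identity $U^* = \exp(-iT)$ from the proof.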
What we have just said is the maximal information one can obtain about the spectra of a general selfadjoint operator.
Exercise 2.
(i) Every compact subset of $\mathbb{R}$ is the spectrum of a selfadjoint operator.
(ii) Every compact subset of $\mathbb{T}$ is the spectrum of a unitary operator.
(iii) Every compact subset of $\mathbb{C}$ is the spectrum of a normal operator.
In all these cases the operator can be chosen in such a way that its spectrum coincides with its essential spectrum.
Hint. Take the diagonal operator $T_\lambda : l_2 \to l_2$, where $\lambda$ is a sequence such that $\{\lambda_n\}^-$ coincides with the given set.
Theorem 1 and Proposition 3(iii) immediately imply the following result.
Proposition 8. Selfadjoint operators have no residual spectra. •
Exercise 3. The same is true for normal operators.
Now we pass to the preparation of a theorem allowing us to completely understand the nature of operators that are compact and at the same time selfadjoint.
Theorem 2 (Hilbert-Schmidt). Let $T : H \to H$ be a compact selfadjoint operator on a Hilbert space. Then there exist
a) an orthonormal system $e_1, e_2, \dots$ in $H$ of finite or infinite cardinality, and
b) a (finite or infinite) sequence $\lambda_1, \lambda_2, \dots$ of non-zero real numbers with the index set of the same cardinality, tending to zero if it is infinite,
such that our operator acts by the following rule:
$$Tx = \sum_n \lambda_n \langle x, e_n \rangle e_n, \tag{5}$$
where $\sum_n$ means either a finite sum, or a sum of a series in $H$. (In other words, the $e_n$ are eigenvectors for $T$ with eigenvalues $\lambda_n$, and $T$ takes every vector orthogonal to all $e_n$ to zero.)
Proof. By the Schmidt Theorem 3.4.1, $T$ can be represented as the sum $T = \sum_n s_n e'_n \bigcirc e''_n$. From the selfadjointness of $T$ and Proposition 1.8 it follows that the same operator can be represented as the sum $T = \sum_n s_n e''_n \bigcirc e'_n$. Putting $e_n^+ := e'_n + e''_n$ and $e_n^- := e'_n - e''_n$, we see that $Te_n^\pm = \pm s_n e_n^\pm$.
Since the $s$-numbers of $T$ do not increase, for some natural numbers $n_1 < n_2 < \cdots$ we have
$$s_1 = \cdots = s_{n_1} > s_{n_1+1} = \cdots = s_{n_2} > s_{n_2+1} = \cdots > s_{n_{k-1}+1} = \cdots = s_{n_k} > \cdots.$$
For $k = 1, 2, \dots$, consider the spaces $L_k^\pm := \mathrm{span}\{e^\pm_{n_{k-1}+1}, \dots, e^\pm_{n_k}\}$ (here we put $n_0 := 0$). Obviously, for each $e \in L_k^\pm$ we have $Te = \pm s_{n_k} e$. Hence, by Proposition 3(ii), all these spaces are mutually orthogonal in $H$.
Consider those subspaces $L_k^\pm$ that do not vanish and choose orthonormal bases in these subspaces. If we collect together all vectors from all these bases and number them arbitrarily, we obtain an orthonormal system, say $\{e_1, e_2, \dots\}$. Evidently, every $e_n$ is an eigenvector for $T$, and the corresponding eigenvalue, which we denote by $\lambda_n$, coincides with the number $\pm s_k$ for some $k$. Moreover, every $\lambda_n$ appears at most finitely many times.
Further, the linear spans of the systems $\{e_1, e_2, \dots\}$ and $\{e'_1, e'_2, \dots, e''_1, e''_2, \dots\}$ obviously coincide. Hence the sequences $\lambda_1, \lambda_2, \dots$ and $s_1, s_2, \dots$ are either both finite, or both infinite. If they are infinite, then from the fact that $s_n$ tends to zero it follows that $\lambda_n$ tends to zero as well.
Finally, take an arbitrary $x \in H$. It can be represented as
$$x = \sum_n \langle x, e_n \rangle e_n + x_0,$$
where $x_0$ is orthogonal to all $e_n$ and, thus, to all $e'_n$. Hence, by the Schmidt theorem, $Tx_0 = 0$, and equality (5) evidently follows from the properties of $T$ as a continuous operator. •
The Hilbert-Schmidt theorem admits the following equivalent formulation; compare it with the result of Exercise 3.4.1.
Proposition 9. Let $T : H \to H$ be a compact selfadjoint operator in a Hilbert space. Then there exists a finite or infinite sequence $\lambda = (\lambda_1, \lambda_2, \dots)$ of real numbers, tending to zero if it is infinite, and a Hilbert space $H_0$ such that $T$ is unitarily equivalent to the operator $R : l_2^m \oplus H_0 \to l_2^m \oplus H_0$ (where $m \le \infty$ is the number of the terms of the sequence $\lambda$) that acts on $l_2^m$ as the diagonal operator $T_\lambda$ and takes $H_0$ to zero.
Proof. Take the sequence that appears in the Hilbert-Schmidt theorem as $\lambda$, and put $H_0 := \mathrm{Ker}(T)$. Then, by the Hilbert-Schmidt theorem, $T$ belongs to the class of operators in Example 2.2.2. The rest is clear. •
Exercise 4. Show that the Hilbert-Schmidt theorem follows from Proposition 9.
Recall that for an operator acting in a linear space the multiplicity of an eigenvalue is the dimension of the corresponding subspace of eigenvectors.
Proposition 10. Let $T$ be a compact selfadjoint operator and $\lambda_n$ the sequence in the Hilbert-Schmidt theorem. Then
(i) the non-zero eigenvalues $\lambda$ of $T$ are elements of the sequence $\{\lambda_n\}$, and the multiplicity of every eigenvalue $\lambda$ coincides with the number of times $\lambda$ occurs in the sequence $\{\lambda_n\}$;
(ii) $\|T\| = \max\{|\lambda_n| ; n = 1, 2, \dots\}$.
Proof. Evidently, the norm, as well as the family of the non-zero eigenvalues and the multiplicity of every eigenvalue, do not change when we pass to a unitarily equivalent operator. Hence, these characteristics of $T$ are the same as those of the operator $R$ in Proposition 9. Thus, these characteristics coincide with those of the diagonal operator $T_\lambda$ in this proposition. But every eigenvalue of $T_\lambda$, say $\mu$, is obviously one of the numbers $\lambda_n$, $n = 1, 2, \dots$, and the space of the respective eigenvectors is $\mathrm{span}\{p^k : \lambda_k = \mu\}$. In addition, $\|T_\lambda\| = \max\{|\lambda_n| ; n = 1, 2, \dots\}$ (cf. Example 1.3.2). The rest is clear. •
Remark. The fact that the norm of a compact selfadjoint operator is the greatest absolute value of the eigenvalues follows also from the equality $\|T\| = r(T)$ and from the fact that the spectrum of a compact operator contains only eigenvalues (and possibly zero); see Theorem 5.1.1.
Finally, the statement in the Schmidt theorem that the operator in question can be decomposed into the sum of one-dimensional operators can be reformulated in the "selfadjoint" case as follows.
Proposition 11. Suppose that $T : H \to H$, $e_n$, and $\lambda_n$ are as in the Hilbert-Schmidt theorem. Then $T$ can be represented as $T = \sum_n \lambda_n e_n \circ e_n$, where $\sum_n$ means either a finite sum, or the sum of the series converging in the operator norm.
Proof. Equality (5) in the Hilbert-Schmidt theorem can be rewritten as $Tx = \sum_n \lambda_n [e_n \circ e_n](x)$. If the number of terms is finite, everything is clear. If it is infinite and $S_k$ is the $k$-th partial sum of the corresponding series, then, clearly, $T - S_k$ acts by the formula $(T - S_k)x = \sum_{n=k+1}^{\infty} \lambda_n \langle x, e_n \rangle e_n$. Now Proposition 10 applied to $T - S_k$ gives $\|T - S_k\| = \max\{|\lambda_n| ; n = k+1, k+2, \dots\}$. Hence, $\|T - S_k\|$ tends to zero as $k \to \infty$ together with $\lambda_k$. •
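To see the construction from the proof of the Hilbert-Schmidt theorem and the decomposition of Proposition 11 at work, here is a minimal worked sketch in the two-dimensional space $\mathbb{C}^2$. The matrix $T$ and the vectors below are our own illustrative choice and are not taken from the text; the inner product is assumed to be linear in the first argument.

```latex
% Illustrative example (hypothetical choice of T, not from the text).
\[
T=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad
e'_1=\binom{1}{0},\ e''_1=\binom{0}{1},\qquad
e'_2=\binom{0}{1},\ e''_2=\binom{1}{0},\qquad s_1=s_2=1 .
\]
% Schmidt form: Tx = s_1 <x,e'_1> e''_1 + s_2 <x,e'_2> e''_2 = (x_2, x_1).
\[
e^{+}_1=e'_1+e''_1=\binom{1}{1},\qquad
e^{-}_1=e'_1-e''_1=\binom{1}{-1},\qquad
Te^{\pm}_1=\pm e^{\pm}_1 .
\]
% The vectors e^{\pm}_2 span the same two lines, so after normalization:
\[
e_1=\tfrac{1}{\sqrt2}\binom{1}{1},\ \lambda_1=1;\qquad
e_2=\tfrac{1}{\sqrt2}\binom{1}{-1},\ \lambda_2=-1 .
\]
% Decomposition of Proposition 11 and the norm from Proposition 10(ii):
\[
Tx=\langle x,e_1\rangle e_1-\langle x,e_2\rangle e_2=\binom{x_2}{x_1},\qquad
\|T\|=\max\{|\lambda_1|,|\lambda_2|\}=1 .
\]
```

Here both s-numbers equal $1$, while the eigenvalues are $+1$ and $-1$; this is exactly the sign ambiguity $\lambda_n = \pm s_k$ that appears in the proof.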
While the Schmidt theorem can be regarded as a result on classification up to weak unitary equivalence, the Hilbert-Schmidt theorem is a result on classification up to unitary equivalence (i.e., up to a much more rigid identification). At the same time it is remarkable that for the class of operators considered, the two kinds of "non-weak" equivalence, unitary and topological, coincide.
Exercise 5. Let $T : H \to H$ and $S : K \to K$ be compact selfadjoint operators. Then they are unitarily equivalent $\iff$ they are topologically equivalent $\iff$ the sets of their non-zero eigenvalues, taken with multiplicities, coincide, and $\mathrm{Ker}(T)$ is unitarily isomorphic to $\mathrm{Ker}(S)$.
Hint. For arbitrary operators their topological equivalence implies the coincidence of the Hilbert dimensions of the spaces of eigenvectors corresponding to every chosen eigenvalue.
(In fact, topological and unitary equivalence coincide on the entire class of selfadjoint operators, but this is more difficult to prove; see Exercise 4.10 below.)
We see that every compact selfadjoint operator in a Hilbert space is uniquely defined up to unitary equivalence by the following data: a) the (non-ordered) family $\Lambda$ of the non-zero eigenvalues, where every eigenvalue occurs as many times as its multiplicity, and b) the Hilbert dimension of its kernel (see Theorem 2.2.2). Thus, a complete system of invariants of the unitary equivalence for this class of operators consists of pairs $(\Lambda, \alpha)$, where $\Lambda$ is an at most countable family of non-zero real numbers with possible finite repetitions and with zero as the only possible limit point, and $\alpha$ is a cardinality. A model of the compact operator with the invariant $(\Lambda, \alpha)$ is the operator $R : l_2^m \oplus K \to l_2^m \oplus K$ (see Proposition 9), where $\lambda$ is the given family $\Lambda$, arbitrarily ordered, and $K$ is a Hilbert space of Hilbert dimension $\alpha$ (say, $l_2(X)$, where $X$ is a set of cardinality $\alpha$). All that we have said remains true if we replace unitary equivalence by topological equivalence.
* * *
For a separable space the Hilbert-Schmidt theorem and the equivalent Proposition 9 can be reformulated more transparently as follows.
Proposition 12. Let $T : H \to H$ be a compact selfadjoint operator in a separable Hilbert space. Then
(i) there is an orthonormal basis in $H$ consisting of eigenvectors of $T$. The corresponding sequence of eigenvalues tends to zero and consists of real numbers;
(ii) $T$ is unitarily equivalent to a diagonal operator $T_\lambda : l_2^m \to l_2^m$, where $m$ is the Hilbert dimension of the space $H$, and the sequence $\lambda$ tends to zero and consists of real numbers.
Proof. Take the orthonormal system from the formulation of the "general" Hilbert-Schmidt theorem. Using that $\mathrm{Ker}(T)$ is separable, consider an orthonormal basis in this space. If we arbitrarily renumber the union of these two systems, we certainly obtain a required basis in $H$. The rest is clear. •
In particular, it is easy to see that in the separable case the system of invariants of the unitary (as well as topological) equivalence for compact selfadjoint operators also takes a more visual form. Namely, for invariants we can take an arbitrary sequence of real numbers tending to zero, considered up to permutations of its elements. The simplest model of an operator with the sequence $\lambda$ as its invariant is certainly the diagonal operator $T_\lambda : l_2 \to l_2$.
Now we can fulfill the old promise we gave in Section 1.3.
Exercise 6*. The norm of the operator of indefinite integration on $L_2[0, 1]$ is $\frac{2}{\pi}$.
Hint.² Let us represent our $T$ as $US$, where $S : x \mapsto \int_0^{1-s} x(t)\,dt$, and $U$ is the unitary operator $x(t) \mapsto y(t) := x(1 - t)$. Hence, $\|T\| = \|S\|$, and since $S$ is selfadjoint, Proposition 10 reduces the problem to the search of those non-zero $\lambda \in \mathbb{R}$ for which the integral equation $Sx = \lambda x$ has a non-zero solution. But these are the same $\lambda$ for which the equation $\lambda x'(t) + x(1 - t) = 0$ has non-zero solutions satisfying the condition $x(1) = 0$. They certainly are the solutions of the equation $\lambda^2 x''(t) + x(t) = 0$ with $x(1) = x'(0) = 0$. Knowing the general solution of the latter ordinary differential equation, we see that the non-zero solutions satisfying these boundary conditions exist only for $\lambda = (\frac{\pi}{2} + k\pi)^{-1}$; $k \in \mathbb{Z}$, and for every $k$ they are multiples of $\cos((\frac{\pi}{2} + k\pi)t)$. Thus, $\|T\| \le \frac{2}{\pi}$, and it remains to verify that $Sx = \frac{2}{\pi} x$ for $x(t) := \cos\frac{\pi}{2}t$.
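The chain of reductions in the hint can be written out step by step. The following is only a sketch under the assumption that an eigenfunction $x$ is smooth enough to differentiate twice (which is plausible, since $Sx$ is an integral with a variable limit); it is not the author's own computation.

```latex
\begin{align*}
(Sx)(s) &= \int_0^{1-s} x(t)\,dt = \lambda x(s)
  && \text{(eigenvalue equation, } \lambda \neq 0\text{)}\\
-x(1-s) &= \lambda x'(s)
  && \text{(differentiate in } s\text{)}\\
\lambda x(1) &= (Sx)(1) = 0 \ \Rightarrow\ x(1) = 0,
  \qquad \lambda x'(0) = -x(1) = 0 \ \Rightarrow\ x'(0) = 0\\
\lambda x''(s) &= x'(1-s) = -\tfrac{1}{\lambda}\,x(s)
  \ \Rightarrow\ \lambda^2 x''(s) + x(s) = 0\\
x(s) &= c\cos(s/\lambda), \qquad \cos(1/\lambda) = 0
  \ \Rightarrow\ \lambda = \bigl(\tfrac{\pi}{2} + k\pi\bigr)^{-1},\ k \in \mathbb{Z}\\
\max_k |\lambda| &= \tfrac{2}{\pi}, \qquad
\int_0^{1-s} \cos\tfrac{\pi t}{2}\,dt
  = \tfrac{2}{\pi}\sin\tfrac{\pi(1-s)}{2}
  = \tfrac{2}{\pi}\cos\tfrac{\pi s}{2}.
\end{align*}
```

The last line confirms that $\frac{2}{\pi}$ is actually attained as an eigenvalue of $S$, so by Proposition 10(ii) it equals $\|S\| = \|T\|$.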
In concluding this section, let us go back to the general theory. The Hilbert-Schmidt theorem allows for another important characterization of the s-numbers of compact operators, which also indicates that these numbers do not depend on the choice of the orthonormal systems involved in the Schmidt theorem (see Section 3.4).
² This observation simplifies the well-known proof from [81]. It was suggested in a class on functional analysis by Natasha Grinberg, a third-year student at that time.
Proposition 13. Let $T : H \to K$ be a compact operator between Hilbert spaces with s-numbers $s_1 \ge s_2 \ge \cdots$. Then the sequence $s_1^2, s_2^2, \dots$ is precisely the sequence of non-zero eigenvalues, with multiplicities and in decreasing order, of each of the operators $T^*T : H \to H$ and $TT^* : K \to K$.
Proof. Take $T = \sum_n s_n e'_n \circ e''_n$. By Proposition 1.8, $T^* = \sum_n s_n e''_n \circ e'_n$. Hence, $T^*Te'_n = s_n^2 e'_n$ and $TT^*e''_n = s_n^2 e''_n$ for each $n$. In addition, $T^*Tx = 0$ for $x \perp \{e'_1, e'_2, \dots\}$ and $TT^*y = 0$ for $y \perp \{e''_1, e''_2, \dots\}$. It remains to apply Proposition 10 to the two operators under consideration. •
Corollary 2. Let $T$ be a compact operator between two Hilbert spaces, and $\lambda_1, \lambda_2, \dots$ the sequence of eigenvalues of the operator $T^*T$ (or, equivalently, $TT^*$) taken with multiplicities. Then
(i) $T$ is a Schmidt operator $\iff$ $\sum_n \lambda_n < \infty$;
(ii) $T$ is a nuclear operator $\iff$ $\sum_n \sqrt{\lambda_n} < \infty$.
Now that we have the Hilbert adjoint operators at our disposal, we can define an inner product in the space $S(H, K)$ of Schmidt operators (see Proposition 3.4.4(ii)) using a formula that is not related to orthonormal bases. Namely, for each $S, T \in S(H, K)$ we have $\langle S, T \rangle = \mathrm{tr}(T^*S)$. (Try to prove this formula using Proposition 3.4.8(ii).)
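As a small sanity check of Proposition 13, one can compute everything by hand for a $2 \times 2$ matrix. The operator below is a hypothetical example chosen for illustration, not one discussed in the text.

```latex
% Illustrative example (hypothetical T on C^2).
\[
T=\begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad
T^*T=\begin{pmatrix}1&1\\1&2\end{pmatrix},\qquad
TT^*=\begin{pmatrix}2&1\\1&1\end{pmatrix}.
\]
% Both products have the same characteristic polynomial:
\[
\mu^2-3\mu+1=0\ \Longrightarrow\ \mu_{1,2}=\frac{3\pm\sqrt5}{2},
\qquad
s_{1,2}=\sqrt{\mu_{1,2}}=\frac{\sqrt5\pm1}{2}.
\]
% Consistency checks:
\[
s_1s_2=1=|\det T|,\qquad
s_1^2+s_2^2=3=\sum_{k,l}|t_{kl}|^2,\qquad
s_1+s_2=\sqrt5 .
\]
```

The sums $\sum_n s_n^2$ and $\sum_n s_n$ are the quantities whose finiteness Corollary 2 uses to single out Schmidt and nuclear operators; in finite dimensions both are, of course, always finite.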
Now we can show what the promised polar decompositions of the operators look like, though at this point we only do it for compact operators. We call a compact operator on a Hilbert space positive if it is selfadjoint and all its eigenvalues are non-negative. (This is a special case of Definition 4.2, which will appear later.)
Exercise 7° (polar decomposition of a compact operator). Suppose $T : H \to K$ is a compact operator between Hilbert spaces. Then there exists a partially isometric operator $W : H \to K$ and positive operators $S : H \to H$, $S' : K \to K$ such that $T = WS = S'W$. Moreover, if $T = \sum_n s_n e'_n \circ e''_n$, then our equalities are true for $S := \sum_n s_n e'_n \circ e'_n$, $S' := \sum_n s_n e''_n \circ e''_n$, and $W$ that maps $e'_n$ to $e''_n$ and sends all vectors orthogonal to all $e'_n$ to zero.
Under some natural conditions the indicated polar decompositions are unique.
Exercise 8. For the same $T$, $W$, $S$, $S'$,
(i) If $T = W_1S_1$, where $W_1$ is partially isometric, $S_1$ is positive, and $(\mathrm{Ker}(W_1))^\perp = \overline{\mathrm{Im}(S_1)}$, then $W_1 = W$ and $S_1 = S$.
(ii) If $T = S'_1W_1$, where $W_1$ is partially isometric, $S'_1$ is positive, and $(\mathrm{Ker}(S'_1))^\perp = \mathrm{Im}(W_1)$, then $W_1 = W$ and $S'_1 = S'$.
Hint for (i). Obviously, $W_1^*T = W_1^*W_1S_1 = S_1$, hence, $S_1^2 = T^*T = S^2$. For the same reason, $S_1$ is compact. Since $S_1$ is, in addition, positive, Proposition 10 shows the following. A vector $e$ is an eigenvector of $S_1$ with eigenvalue $\mu > 0$ $\iff$ the same $e$ is an eigenvector of $S_1^2$ with eigenvalue $\mu^2$. Together with the equality $T^*T = S^2$, this gives $S_1 = S$.
A polar decomposition of arbitrary operators will be given later; see Exercise 4.9.
3. An overview: Involutive algebras, C*-algebras, and von Neumann algebras
The set $B(H)$, already endowed with a rich structure of Banach algebra, has another operation: sending an operator to its Hilbert adjoint. This operation resembles taking the complex conjugate of a number in $\mathbb{C}$. Now it is time to look at what we have obtained, from a more general position.
Perhaps some of the readers do not like that there are many theorems without proofs in this textbook. Then this section will be especially irritating for them. Let us try to justify ourselves. The theorems we will speak about are among the most substantial achievements of modern mathematics. At the same time they have simple and transparent statements. We believe that these facts should certainly be mentioned in a textbook on functional analysis, and the reader should know about them. As for the proofs, they usually are not simple at all, and together with the necessary preparation would occupy too much space. Therefore, in our opinion, these proofs go beyond the scope of our lectures. You can find them in many other books, e.g., [86], [25], [72], [50], and [87].
To understand better the forthcoming general definitions, take some examples of algebras (both pure and Banach; see Sections 5.2 and 5.3) from our list: $\mathbb{C}$, $\mathbb{C}[t]$, $M_n$, $B(E)$, $C^1[a, b]$, $l_1(\mathbb{Z})$, $l_\infty$, $L_\infty(X, \mu)$, and put $C_0(\Omega)$ and $B(H)$ in the most prominent place. (Here $\Omega$ is a locally compact topological space, $(X, \mu)$ a measure space, $H$ a Hilbert space, and $E$ an arbitrary Banach space.) We will look at each of these algebras while introducing new notions.
First, let us give a purely algebraic definition.
Definition 1. Suppose $A$ is an algebra. A mapping $(*) : A \to A$ is called an involution in $A$ if (writing $a^*$ instead of $(*)(a)$) for all $a, b \in A$, $\lambda \in \mathbb{C}$ the following equalities hold:
(i) $(a + b)^* = a^* + b^*$;
(ii) $(\lambda a)^* = \bar{\lambda} a^*$;
(iii) $(ab)^* = b^*a^*$;
(iv) $a^{**} = a$.
An algebra endowed with an involution is called an involutive algebra or, in short, a *-algebra. The element $a^*$ is said to be the adjoint to $a$.
All the algebras we have exhibited, save one important exception, have natural involutions. Of course, in the simplest algebra $\mathbb{C}$ we have $\lambda^* := \bar{\lambda}$. Similarly, in $C^1[a, b]$, $L_\infty(X, \mu)$, and $C_0(\Omega)$, we can put $x^*(t) := \overline{x(t)}$, and in $l_\infty$ we take $\xi^*_n := \bar{\xi}_n$. In the algebra of polynomials $\mathbb{C}[t]$ the involution is given by $p(t) = c_0 + \cdots + c_n t^n \mapsto p^*(t) := \bar{c}_0 + \cdots + \bar{c}_n t^n$. (This is also the passage to the complex conjugate function if we view polynomials as functions on $\mathbb{R}$; if we consider them on $\mathbb{C}$, there is a slightly more complicated formula $p^*(z) := \overline{p(\bar{z})}$.) In the algebra of matrices $M_n$, for a given $a = (a_{kl})$ we put $a^*_{kl} := \overline{a_{lk}}$. In the Wiener algebra $l_1(\mathbb{Z})$ we put $a^*_n := \overline{a_{-n}}$. Finally (and this is the main case) the involution in $B(H)$, as you could have already guessed, is the passage to the adjoint operator.
As for the algebra $B(E)$ for a non-Hilbert space $E$, it has no natural involution; sometimes this also happens.
Everywhere below in this section $A$ is an involutive algebra. Properties of an involution evidently imply
Proposition 1. If $A$ is unital, then $1^* = 1$, and for an invertible $a \in A$ we have $(a^{-1})^* = (a^*)^{-1}$. •
An element $a \in A$ is called selfadjoint if $a^* = a$, and normal if $a^*a = aa^*$. If $A$ is unital, then an element $u$ in $A$ is called unitary if it is invertible and $u^{-1} = u^*$.
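For readers who want to see why Proposition 1 is "evident", here is one possible short derivation from the axioms (iii) and (iv) of Definition 1. It is a sketch of a standard argument, not necessarily the one the author has in mind.

```latex
% 1^* acts as a two-sided identity, hence equals 1:
\[
1^*a = 1^*a^{**} = (a^*1)^* = a^{**} = a,\qquad
a1^* = a^{**}1^* = (1a^*)^* = a^{**} = a
\qquad (a \in A),
\]
\[
\text{so } 1^* = 1^*\cdot 1 = 1 .
\]
% For invertible a, apply the involution to aa^{-1} = a^{-1}a = 1:
\[
(a^{-1})^*a^* = (aa^{-1})^* = 1^* = 1,\qquad
a^*(a^{-1})^* = (a^{-1}a)^* = 1^* = 1,
\]
\[
\text{hence } a^* \text{ is invertible and } (a^*)^{-1} = (a^{-1})^* .
\]
```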
Among all homomorphisms between two *-algebras $A$ and $B$, those compatible with the involution in a proper way deserve special attention.
Definition 2. A homomorphism $\varphi : A \to B$ is called involutive or, shortly, a *-homomorphism if $\varphi(a^*) = \varphi(a)^*$ for all $a \in A$.
As a very important special case, a *-homomorphism from $A$ to $B(H)$ is called an involutive representation or a *-representation of the algebra $A$ in the Hilbert space $H$ (cf. general Definition 5.2.6).
Bijective *-homomorphisms of *-algebras are called involutive isomorphisms or *-isomorphisms. Certainly, they are isomorphisms in the category of *-algebras and *-homomorphisms (give a definition of this category).
Example 1. Clearly, a polynomial calculus of an element $a$ of a unital *-algebra is a *-homomorphism $\iff$ $a$ is selfadjoint. In particular, assigning to every $p \in \mathbb{C}[t]$ the same polynomial viewed as a function on $[a, b]$, we obviously obtain an injective (as a mapping) *-homomorphism from $\mathbb{C}[t]$ to $C[a, b]$. If we replace $[a, b]$ in this example by an interval $[z_1, z_2]$ not lying on the real line, we obtain a non-involutive homomorphism from $\mathbb{C}[t]$ to $C[z_1, z_2]$.
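The first claim of Example 1 can be verified in two lines. In the sketch below, $\varphi_a$ is our own (hypothetical) name for the polynomial calculus $p \mapsto p(a)$, and $p(t) = \sum_k c_k t^k$.

```latex
\[
\varphi_a(p^*) = \sum_k \bar c_k\, a^k,
\qquad
\varphi_a(p)^* = \Bigl(\sum_k c_k a^k\Bigr)^{\!*} = \sum_k \bar c_k\,(a^*)^k .
\]
% If a = a^*, the two expressions agree for every p.
% Conversely, taking p(t) = t (so p^* = p) gives
\[
a = \varphi_a(t^*) = \varphi_a(t)^* = a^* .
\]
```

The same computation explains the last sentence of the example: the function $z \mapsto z$ on an interval $[z_1, z_2] \not\subset \mathbb{R}$ is not selfadjoint in $C[z_1, z_2]$, so the corresponding homomorphism cannot be involutive.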
Example 2. The mapping $\varphi : l_\infty \to B(l_2)$ assigning to every sequence $\lambda$ the diagonal operator $T_\lambda$ is obviously a *-representation of the *-algebra $l_\infty$ in the Hilbert space $l_2$. As a generalization of this example, the mapping $\varphi : L_\infty(X, \mu) \to B(L_2(X, \mu))$ assigning to each essentially bounded measurable function $f$ the operator $T_f$ of multiplication by $f$ is a *-representation of the *-algebra $L_\infty(X, \mu)$ in the Hilbert space $L_2(X, \mu)$.
Example 3. Let $e_1, \dots, e_n$ be an orthonormal basis in a finite-dimensional Hilbert space $H$. It is easy to see that assigning to every operator acting on $H$ its matrix in this basis, we obtain a *-isomorphism between the *-algebras $B(H)$ and $M_n$.
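The word "obviously" in Example 2 hides a one-line computation, sketched here under the assumption that the inner product in $l_2$ is $\langle \xi, \eta \rangle = \sum_n \xi_n \bar\eta_n$ (linear in the first argument).

```latex
% Multiplicativity of the map \lambda \mapsto T_\lambda:
\[
(T_\lambda T_\mu\,\xi)_n = \lambda_n\mu_n\xi_n = (T_{\lambda\mu}\,\xi)_n .
\]
% Compatibility with the involution \lambda^* = (\bar\lambda_n):
\[
\langle T_\lambda\xi,\eta\rangle
 = \sum_n \lambda_n\xi_n\bar\eta_n
 = \sum_n \xi_n\,\overline{\bar\lambda_n\eta_n}
 = \langle \xi, T_{\bar\lambda}\,\eta\rangle
\quad\Longrightarrow\quad
(T_\lambda)^* = T_{\bar\lambda} = T_{\lambda^*} .
\]
```

For multiplication operators $T_f$ on $L_2(X, \mu)$ the same argument works with the sum replaced by an integral.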
Certainly the following proposition is true.
Proposition 2. A *-homomorphism maps selfadjoint, normal, and (if we speak about unital homomorphisms) unitary elements to elements of the same type. •
A subset $M$ in $A$ is called selfadjoint, or, briefly, a *-subset if $a \in M$ implies $a^* \in M$. A similar meaning is given to such terms as *-subalgebra, *-ideal, etc. Here is an example.
Example 4. Proposition 1.8 shows that the ideal $K(H)$ in $B(H)$, where $H$ is a Hilbert space, is a *-ideal. At the same time the subalgebra $K(H)_+$ of compact perturbations of scalar operators is a *-subalgebra, but not a *-ideal, if $H$ is infinite-dimensional.
Exercise 1. The sets $S(H)$ and $N(H)$ of Schmidt operators and nuclear operators, respectively (see Section 3.4), are also *-ideals in $B(H)$.
The following result is evident.
Proposition 3. Suppose $I$ is a *-ideal in $A$. Then the quotient algebra $A/I$ is a *-algebra with respect to the involution well defined by the equality $(a + I)^* := a^* + I$. •
In particular, the Calkin algebra $\mathcal{C}(H) := B(H)/K(H)$ for a Hilbert space $H$ (see Section 5.2) also has a natural structure of involutive algebra.
Let us now proceed from abstract algebra to functional analysis.
Definition 3. A Banach involutive algebra (i.e., an algebra endowed with both a complete norm and an involution) $A$ is called a Banach star-algebra if $\|a^*\| = \|a\|$ for all $a \in A$.
The following result immediately follows from the definition.
Proposition 4. If $A$ is a Banach star-algebra, then the mapping $(*) : A \to A$ is continuous, and, in particular, $a_n \to a$; $n \to \infty$ implies $a^*_n \to a^*$; $n \to \infty$.
In fact, the converse proposition is true, up to replacing the norm by an equivalent one.
Exercise 2. Let $A$ be a Banach algebra with a continuous involution. Then it can be endowed with a norm $\|\cdot\|'$ equivalent to the initial one and such that $(A, \|\cdot\|')$ is a Banach star-algebra.
Hint. Put $\|a\|' := \max\{\|a\|, \|a^*\|\}$.
Certainly, all involutive algebras in our examples, except $M_n$ and $\mathbb{C}[t]$, are Banach star-algebras with respect to the norms introduced earlier and the involutions introduced in this section. However, $M_n$ can also be easily turned into a Banach star-algebra, and moreover there are many ways to do this. For instance, we can identify this algebra with $B(H)$ (see Example 3). Or, say, we can define the norm by the equality $\|a\| := \sqrt{\sum_{k,l=1}^{n} |a_{kl}|^2}$. (As for the algebra of polynomials, it cannot be turned even into a Banach space; the readers who have done Exercise 2.1.1 know this.)
Similarly to general Banach algebras (see Section 5.3), the Banach star-algebras are objects of two categories with either bounded, or contraction involutive homomorphisms as morphisms. Isomorphisms of these categories are respectively topological and isometric *-isomorphisms. The meaning of both terms is obvious, as well as, say, of the term contraction or isometric *-homomorphism. (Note that the mappings in Example 2 are isometric *-homomorphisms.)
We hope that the following non-trivial fact from the family of "automatic continuity results" will be interesting to our advanced readers (cf. comments to Proposition 5.3.2).
Proposition 5 (see [25, Chapter 4, §5.25]). Every *-representation of a Banach star-algebra in a Hilbert space is a bounded, and moreover, a contraction *-homomorphism.
Exercise 3. Prove that this is true in the case of unital algebras and unital representations.
Hint. If $\varphi : A \to B(H)$ is our representation, then for a selfadjoint $a \in A$ the estimate $\|\varphi(a)\| \le \|a\|$ can be obtained by consecutive applications of Proposition 2.3(vii) (where $T = \varphi(a)$), Proposition 5.2.4, and Theorem 5.3.1. In the general case we must apply the obtained estimate to the element $a^*a$ and use Theorem 1.2 (where $T = \varphi(a)$).
Now we pass to the "best algebras in functional analysis" . The following notion is, apparently, the most important in the entire theory of algebras with involution.
Definition 4. A Banach star-algebra $A$ is called an abstract C*-algebra or, if there is no danger of confusion, just a C*-algebra³ if for every $a \in A$ we have
$$\|a^*a\| = \|a\|^2.$$
This equality is called the (abstract) C*-identity.
³ Perhaps this term does not sound natural, but it became standard a long time ago. Apparently, it was I. Segal (1947) who first introduced it for the operator algebras. The letter "C" seems to be used for emphasizing the role of the algebras in question as non-commutative generalizations of $C(\Omega)$ (see Theorem 1 and the discussion of it), and the star indicates the outstanding role of the involution.
Exercise 4°. Every involutive Banach algebra where the C*-identity holds is (automatically) a Banach star-algebra, and thus a C*-algebra.
We give two main examples, one commutative and one non-commutative.
Example 5. The algebra $C_0(\Omega)$, where $\Omega$ is a locally compact space, and in particular the algebra $C(\Omega)$, where $\Omega$ is a compact space, are C*-algebras.
Example 6. The algebra $B(H)$, where $H$ is a Hilbert space, as well as each norm-closed selfadjoint subalgebra of this algebra, are C*-algebras. This immediately follows from the operator C*-identity (Theorem 1.2).
The algebras indicated in the last example are called concrete or operator C*-algebras. Certainly, they include $K(H)$ and, as is easy to verify, the algebra of the operators of multiplication by functions in $L_2(X, \mu)$. (The special cases of the latter are the algebra of diagonal operators in $l_2$, and the algebra of compact diagonal operators in $l_2$.)
Certainly, many facts concerning operators which follow from Theorem 1.2 are true for the elements of an arbitrary C*-algebra. In particular, for selfadjoint elements in C*-algebras we have $\|a^2\| = \|a\|^2$. (Actually, the related part of Exercise 2.1 establishes the same for normal elements of these algebras.)
And here are counterexamples.
Exercise 5. The Banach star-algebra $C^1[a, b]$ is not a C*-algebra; moreover, it is not topologically *-isomorphic to any C*-algebra. The same is true for $S(H)$ and $N(H)$ (see Exercise 1).
Hint. Find a sequence of selfadjoint elements $a_n$: $\|a_n\| = 1$ such that $a_n^2 \to 0$; $n \to \infty$.
Remark. The aforesaid is also true for the Wiener algebra $l_1(\mathbb{Z})$, but this is more difficult to establish.
Two results of great importance for the whole of mathematics claim that essentially there are no other examples of C*-algebras besides those given in Examples 5 and 6. The first of these results describes all commutative algebras of this class, and the second, arbitrary C*-algebras. Here are the exact statements.
Theorem 1 (First Gelfand-Naimark theorem; for the proof, see [25, Chapter IV, Theorem 7.13]).⁴ Every commutative C*-algebra $A$ is isometrically *-isomorphic to an algebra $C_0(\Omega)$, where $\Omega$ is a locally compact topological space; if in addition $A$ is unital, then $\Omega$ is a compact space.
The authors of the theorem give also the concrete form of this isometric *-isomorphism: this is the Gelfand transform $\Gamma : A \to C_0(\Omega_A)$, where $\Omega_A$ is the Gelfand spectrum of our Banach algebra.
Remark. The readers following the examples (and they should!) must have been surprised by the statement of this theorem. What about the algebras $L_\infty(X, \mu)$ and, in particular, $l_\infty$? Obviously, they are C*-algebras of functions, but at first sight they do not belong to the class $C_0(\Omega)$. However, they are $C_0(\Omega)$ algebras in a latent form, up to an isometric *-isomorphism. This "invisible" compact space $\Omega$ is an important characteristic of the algebras considered. In particular, up to the indicated equivalence, $l_\infty = C(\beta\mathbb{N})$, where $\beta\mathbb{N}$ (the Gelfand spectrum of the algebra $l_\infty$) is just the Stone-Čech compactification of the discrete space $\mathbb{N}$ (cf. discussion in Section 3.1).
Theorem 2 (Second Gelfand-Naimark theorem, 1943; for the proof, see [25, Chapter IV, Theorem 7.57]). Every abstract C*-algebra $A$ is isometrically *-isomorphic to some concrete (i.e., operator) C*-algebra. In other words, every C*-algebra has an isometric *-representation in some Hilbert space.
(What this Hilbert space is and how the required representation arises, we explain closer to the end of this section.)
An immediate useful consequence of these theorems is that they informally show that all we can do with continuous functions can also be done with the elements of commutative C*-algebras, and all we can do with operators on Hilbert spaces can be done with the elements of general C*-algebras. For instance, in every algebra of the form $C_0(\Omega)$ the equation $x^n = y$ has a solution for each non-negative right-hand side and for each natural $n$. Hence, the same is true for every commutative C*-algebra consisting, say, of operators.
Exercise 6*. The spectrum of a unitary element of a unital C*-algebra lies in $\mathbb{T}$, and the spectrum of a selfadjoint element is in $\mathbb{R}$.
Hint. In the first of the two proofs of Theorem 2.1 there is nothing related "specifically to operators".
⁴ Mark Aronovich Naimark (1909-1978), a prominent Russian mathematician. He obtained a series of first-class results in functional analysis.
Now we make a few general comments. After more than half a century since the appearance of these theorems, their place in the entire building of mathematics is perhaps more important than the creators themselves anticipated. Now these theorems are an integral part of a new important area called non-commutative geometry. One of the "philosophical aspects" of this area is as follows. Speaking informally, the first Gelfand-Naimark theorem says that the structure of a commutative C*-algebra is adequately described in purely topological terms of its spectrum. (One can give a precise meaning to this statement using the categorical language. We shall do this later in the part for advanced readers.) Because of this (and now we are even more vague) one can and should view an arbitrary C*-algebra as the algebra of functions on a "non-commutative locally compact space", or even as a "non-commutative locally compact space" itself.
Of course, these are strange words. However, this idea indeed turned out to be extremely fruitful. If you know topology, you can predict properties of non-commutative C*-algebras and build useful (in particular, for quantum physics) "non-commutative" versions of classical objects of topology, for instance, a "non-commutative sphere" or a "non-commutative torus" (cf. [88]). At the same time, a higher level of understanding of a series of topological (i.e., "commutative") notions and results is sometimes achieved after dealing with non-commutative C*-algebras. (Moreover, some purely topological facts can obtain more transparent proofs. For instance, the well-known Bott periodicity theorem is proved by J. Cuntz using compact operators; see, e.g., [84, Theorem 11.2.1].) That is why the theory of C*-algebras is also called non-commutative topology.
* * *
We now turn from the general discussion back to mathematics. Among the operator C*-algebras there is a special class, which was studied much earlier than general C*-algebras (both abstract and concrete). It was John von Neumann who introduced this class in 1930. One of his main motivations was the hope (partially justified later, although not completely) that these algebras, being the "right place for observables in quantum mechanical systems," may put quantum mechanics on a solid mathematical foundation.
The algebras we begin to discuss can be defined both in topological and in algebraic terms, and the result will be the same. Recall that in $B(H)$, where $H$ is a Hilbert space, in addition to the operator norm, there are several structures of a polynormed space. We had already encountered the weak-operator, strong-operator (and, in the case of advanced readers, also ultraweak) families of prenorms and the topologies with the same names, which are generated by these families (see Section 4.1 and the end of Section 4.2).
For every subset $M \subseteq B(H)$ its commutant is the set $M' := \{b \in B(H) : ba = ab \text{ for all } a \in M\}$, and the bicommutant is the set $M'' := (M')'$ (i.e., commutant of the commutant). Evidently, we always have $M \subseteq M''$.
Theorem 3 (von Neumann bicommutant theorem; see [25, Chapter III, Theorem 2.34]). Let $A$ be a selfadjoint subalgebra in $B(H)$ containing $1$.
Then the following properties are equivalent:
(i) $A$ is closed in the weak-operator topology;
(ii) $A$ is closed in the strong-operator topology;
(iii) $A'' = A$.
Exercise 7. Show that (iii) implies (i) and (ii).
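One possible route through Exercise 7 is to observe that every commutant is closed in the weak-operator topology. The following is a sketch of this standard argument, written for a net $b_\nu \in M'$ converging to $b$ in the weak-operator topology (the net notation is ours); readers who prefer to solve the exercise themselves may wish to skip it.

```latex
% Let a \in M and x, y \in H. Weak-operator convergence b_\nu \to b means
% <b_\nu u, v> \to <b u, v> for all u, v \in H.
\begin{align*}
\langle bax, y\rangle
 &= \lim_\nu \langle b_\nu a x, y\rangle
  = \lim_\nu \langle a b_\nu x, y\rangle
  && (b_\nu a = a b_\nu)\\
 &= \lim_\nu \langle b_\nu x, a^*y\rangle
  = \langle bx, a^*y\rangle
  = \langle abx, y\rangle .
\end{align*}
% Hence ba = ab, i.e. b \in M', so M' is weak-operator closed.
% If A = A'' = (A')', then A is itself a commutant, which gives (i).
% Strong-operator convergence implies weak-operator convergence, so every
% weak-operator closed set is strong-operator closed, which gives (ii).
```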
There are at least four other useful topologies that could have participated in this formulation instead of the two we have just mentioned. The ultraweak topology is one of them (see the end of Section 4.2); however, it was not known at the time when the first version of the von Neumann theorem appeared. The proof of the "full" bicommutant theorem, where all the six topologies take part, can be found, e.g., in [68]. We emphasize that the selfadjointness condition of the considered subalgebra cannot be omitted.
Definition 5. An operator algebra satisfying the assumptions of Theorem 3 is called a von Neumann algebra.
In addition to the algebra $B(H)$, among the operator algebras we have already met, the algebra of (all) diagonal operators on $l_2$ belongs to the class of von Neumann algebras. More generally, the algebra of operators of multiplication by functions in $L_2(X, \mu)$ is also a von Neumann algebra. (Try to prove this at least for the first algebra.)
Since the topologies participating in Theorem 3 are weaker (i.e., coarser) than the norm topology, every von Neumann algebra is automatically closed in the latter topology, and thus is an operator C*-algebra. It is easy to see that the von Neumann algebras form a much smaller class than the general operator C*-algebras. For instance, the algebra $K(H)$ and the algebra of compact diagonal operators on $l_2$ are not von Neumann algebras.
Exercise 8. (i) Both the weak- and the strong-operator closures of the algebra $F(H)$ (hence of $K(H)$) coincide with $B(H)$.
(ii) Both the weak- and the strong-operator closures of the algebra of finite-dimensional (hence of compact) diagonal operators on $l_2$ coincide with the algebra of all diagonal operators.
Is it possible to characterize the von Neumann algebras in abstract terms, similarly to the way the second Gelfand-Naimark theorem describes the operator C*-algebras? This outstanding problem remained open for many years. Here is an impressive solution in terms of geometry of Banach spaces.
Let $E$ be a Banach space. Another Banach space $E_*$ is called predual to $E$ if $E$ coincides with $(E_*)^*$, up to an isometric isomorphism. For instance, $l_\infty$ has a predual, namely $l_1$. At the same time, it can be shown that neither $c_0$ nor, say, $K(H)$ have predual spaces (cf. Exercise 4.2.8).
Theorem 4 (S. Sakai, 1957; see [68, Chapter III, Theorem 3.5]).
(i) Every von Neumann algebra viewed as a Banach space has a predual space, which is determined uniquely up to an isometric isomorphism.
(ii) If a C*-algebra viewed as a Banach space has a predual space, then it is isometrically *-isomorphic to a von Neumann algebra.
Sakai explicitly indicated the predual space of a von Neumann algebra. It is an (automatically closed) subspace in $A^*$ consisting of functionals that are continuous in the ultraweak topology. At the end of Section 4.2 we have already verified this for the case of $A = B(H)$.
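Returning to the definition of a predual space, the simplest instance mentioned in the text, $l_\infty = (l_1)^*$, can be checked by hand. The sketch below uses our own notation $f_\xi$ for the functional defined by $\xi \in l_\infty$ and $p^n$ for the $n$-th unit vector; it is meant only as an illustration of the definition.

```latex
% Each bounded sequence defines a bounded functional on l_1:
\[
f_\xi(\eta) := \sum_n \xi_n\eta_n,\qquad
|f_\xi(\eta)| \le \sup_n|\xi_n|\cdot\sum_n|\eta_n|
 = \|\xi\|_\infty\,\|\eta\|_1 .
\]
% Testing on the unit vectors p^n (with \|p^n\|_1 = 1) gives the reverse bound:
\[
|\xi_n| = |f_\xi(p^n)| \le \|f_\xi\|
\quad\Longrightarrow\quad
\|f_\xi\| = \|\xi\|_\infty .
\]
% Every f in (l_1)^* arises in this way, with \xi_n := f(p^n):
\[
\eta = \sum_n \eta_n p^n \ \text{ in } l_1
\quad\Longrightarrow\quad
f(\eta) = \sum_n \eta_n f(p^n) = f_\xi(\eta),\qquad |\xi_n| \le \|f\| .
\]
```

So $\xi \mapsto f_\xi$ is an isometric isomorphism of $l_\infty$ onto $(l_1)^*$, which is exactly what the definition of a predual requires.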
The Sakai theorem plays the same role in the theory of von Neumann algebras as the second Gelfand-Naimark theorem does in the theory of operator C*-algebras. The role of the first Gelfand-Naimark theorem in this analogy belongs to the following fact, which had been found much earlier:
Theorem 5 (von Neumann; for the proof, see [86, Theorem 4.4.4], [50, Theorem 9.4.1]). Every commutative von Neumann algebra is isometrically *-isomorphic to the algebra $L_\infty(X, \mu)$, where $(X, \mu)$ is a measure space.
Moreover (von Neumann-Halmos), if our algebra acts on a separable Hilbert space, then it is isometrically *-isomorphic either to $l_\infty^n$, where $n$ is a finite or countable cardinality, or to $L_\infty[0, 1]$, or to the $l_\infty$-sum of these two algebras.
Remark. This theorem shows, in particular, to what extent the class of commutative von Neumann algebras is narrower than the class of commutative operator C*-algebras (at least if we consider algebras acting on a separable space). We should keep in mind that, as can be easily shown, the second class cannot be narrower than the class of metrizable compact spaces.

The first Gelfand-Naimark theorem suggests treating C*-algebras as if they were "non-commutative topological spaces". Continuing this analogy, Theorem 5 suggests treating von Neumann algebras as if they were "non-commutative measure spaces". We have already discussed the theory of C*-algebras as "non-commutative topology". Similar reasons allow us to call the theory of von Neumann algebras "non-commutative measure theory" (or "non-commutative probability theory"). Again, this sentence sounds meaningless. However, it reveals a strategy of actions. If we know measure theory (i.e., probability theory), we can predict the results of the theory of von Neumann algebras: the "probabilistic" intuition works here. (For instance, the classical Radon-Nikodym theorem from measure theory has a beautiful and important non-commutative version.) In the opposite direction, the knowledge of the theory of von Neumann algebras helps to better interpret measure theory as a limiting, or "classical", case. Physicists also obtain something useful: a new view of things and new methods. Here is an example. From this point of view, it
is fruitful to interpret quantum mechanical events as orthogonal projections in von Neumann algebras, the reason being that in the commutative case these projections can be naturally identified with measurable subsets in $X$, i.e., with "classical" events. Both "non-commutative areas", their close mutual relations, and many other related things (in particular, the so-called "non-commutative differential geometry") are presented in A. Connes' book Non-commutative geometry [21].

Now we explain to advanced readers the following facts. Behind Theorem 1, which was formulated as a result about the identification of certain mathematical objects, a stronger result is hidden. It deals with the identification "as a whole" of two important categories.

Consider the category of C*-algebras denoted by C*alg. It is clear what the objects of this category are; as for the morphisms, we define them as arbitrary *-homomorphisms of algebras of this class. As a matter of fact, they only seem to be arbitrary: all these mappings have extremely good structure.

Theorem 6 (see [25, Chapter IV, Theorem 7.83]). Suppose $\varphi : A \to B$ is a C*-algebra *-homomorphism. Then there is a commutative diagram

$$
\begin{array}{ccc}
A & \xrightarrow{\;\varphi\;} & B \\
 & {\scriptstyle\sigma}\searrow \quad \nearrow{\scriptstyle\rho} & \\
 & C &
\end{array}
$$

where $C$ is another C*-algebra, $\sigma$ is a *-homomorphism that is a coisometric operator, and $\rho$ is a *-homomorphism that is an isometric operator. (Briefly, every *-homomorphism between C*-algebras is a composition of a coisometric and an isometric *-homomorphism.) As a corollary, every *-homomorphism between C*-algebras is a contraction operator, and moreover, every injective *-homomorphism between C*-algebras is an isometric operator.
Remark. It is not difficult to guess where $C$, $\sigma$, and $\rho$ come from: $C$ is the quotient algebra $A/\operatorname{Ker}(\varphi)$, $\sigma$ is the natural projection, and $\rho$ is the operator generated by $\varphi$ (see Proposition 1.5.3). The non-trivial part of the theorem is the verification of the C*-identity for $C$, and of the indicated properties of $\sigma$ and $\rho$.
Now let us denote by UCC* the subcategory in C*alg consisting of unital commutative algebras and unital *-homomorphisms. Recall the (contravariant) Gelfand functor $\Omega$ from the category of unital commutative Banach algebras to the category CHTop of compact Hausdorff spaces (see Section 5.3). Denote by GN : UCC* $\to$ CHTop the "Gelfand-Naimark functor", coinciding with $\Omega$ on the objects and morphisms in UCC*.

Theorem 7 (see [16, IV.4]). The category UCC* coincides with CHTop up to the dual equivalence (see Definition 0.7.7), and this dual equivalence is established by the functor GN.

Thus, GN plays the role of the contravariant functor $F : \mathcal{K} \to \mathcal{L}$ from the general Definition 0.7.7 of dual equivalence of categories. Note that the role of the "inverse" functor $G : \mathcal{L} \to \mathcal{K}$ in the same definition belongs to the functor $C$ : CHTop $\to$ UCC* defined as the corresponding functor from CHTop to $\mathbf{Ban}_1$ (see Example 3.1.1).

Theorem 7 gives a precise mathematical meaning to the statement that the study of compact (and locally compact; cf. Section 3.1) spaces (i.e., a huge part of topology) can be viewed as a part of the study of C*-algebras. Every result about spaces has its counterpart for C*-algebras, with the direction of the arrows reversed.

* * *
Now we give several comments regarding the second Gelfand-Naimark theorem (which "changed the face of modern analysis," as was said in [89, p. 21], the collection of papers dedicated to the 50th anniversary of the great theorem). After formulating the first Gelfand-Naimark theorem, we explicitly indicated the required locally compact space and the required *-isomorphism. Now the reader may ask: What is the Hilbert space and the *-representation of the given algebra in the second theorem?

A decisive role in answering this question belongs to the so-called GNS-construction (named so in honor of its creators, I. M. Gelfand and M. A. Naimark, and also I. Segal, who refined it and recognized its independent value). Omitting the details, we describe the main idea of this construction.

Let $A$ be a (so far arbitrary) unital *-algebra. A functional $f : A \to \mathbb{C}$ is called positive if $f(a^*a) \ge 0$ for every $a \in A$. Such a functional defines a pre-inner product on $A$ by the formula $(a, b) := f(b^*a)$. Put $I_f := \{a \in A : (a, a) = 0\}$. The procedure we have already learned (see Proposition 1.2.2) turns the pre-Hilbert space $A$ into a near-Hilbert space $H_f^0 := A/I_f$, and after the completion (see Proposition 2.6.3), into a Hilbert space $H_f$. We would like to define a representation of the algebra $A$ in $H_f$, i.e., a homomorphism from $A$ to $B(H_f)$. However, in the general case we can only define a homomorphism from $A$ to $\mathcal{L}(H_f^0)$ by the formula $a \mapsto T_a^0 : b + I_f \mapsto ab + I_f$. The obstacle is that the operators $T_a^0$ are not necessarily bounded. But (and this is already a non-trivial result) one can show the following: if $A$ is a C*-algebra, then all $T_a^0$ are automatically bounded and even contraction operators. Therefore, we can extend every $T_a^0$ by continuity to an operator $T_a^f$ on $H_f$ and obtain a mapping $T^f : A \to B(H_f) : a \mapsto T_a^f$. It is not difficult to verify that this mapping is a *-representation of the algebra $A$ in $H_f$. It is called the GNS-representation associated with the positive functional $f$.

The GNS-construction, i.e., the construction of the GNS-representation, works for some other classes of algebras; for instance, all that we have said is true if $A$ is an arbitrary unital Banach star-algebra. However, since we want to end up with an isometric representation, we must restrict ourselves to the C*-algebras.

Thus, we assume that $A$ is a C*-algebra and consider the family of GNS-representations of $A$ associated with all possible positive functionals. In practice, it often turns out that we can find an isometric one among these representations. Here are two examples.

Example 7. Suppose $A := C[0, 1]$ and $f : a \mapsto \int_0^1 a(t)\,dt$. Then obviously $H_f = L_2[0, 1]$, and $T_a^f$ is the operator of multiplication by the function $a(t)$ in $L_2[0, 1]$. We see that the GNS-representation $T^f : C[0, 1] \to B(L_2[0, 1])$ is isometric.

Exercise 9. Suppose $A := B(H)$, $x \in H$ is different from zero, and $f : B(H) \to \mathbb{C} : a \mapsto \langle a(x), x \rangle$. Then the space $H_f$ is unitarily isomorphic to $H$, and $T^f$ is an isometric isomorphism of $B(H)$ onto $B(H_f)$.

Remark. An isometric GNS-representation of a C*-algebra always exists if this algebra is separable. In this case among the functionals on $A$ there is a strictly positive one, i.e., a functional $f$ such that $f(a^*a) > 0$ for all $a \ne 0$. From this we can easily deduce that the representation $T^f$ is injective, hence isometric if we take Theorem 6 for granted. But to construct such a functional one must use rather complicated techniques.
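The first steps of the GNS-construction can be traced numerically in the simplest finite-dimensional situation. The following is only a sketch under stated assumptions: the algebra is the full matrix algebra of $n \times n$ complex matrices, the positive functional is a vector state as in Exercise 9, and the names `f`, `pre_inner`, `units`, and `gram` are illustrative, not notation from the text. The rank of the Gram matrix of the pre-inner product equals the dimension of the quotient by the null space $\{a : (a, a) = 0\}$, and the last lines test the inequality $(ab, ab) \le \|a\|^2 (b, b)$ that expresses the contraction property of the operators $b \mapsto ab$ on the quotient.

```python
import numpy as np

n = 3
rng = np.random.default_rng(1)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x /= np.linalg.norm(x)                    # unit vector, so f(1) = 1

def f(a):
    """Vector state f(a) = <a x, x> on the algebra of n x n matrices."""
    return np.vdot(x, a @ x)

def pre_inner(a, b):
    """Pre-inner product (a, b) := f(b* a)."""
    return f(b.conj().T @ a)

# Matrix units E_ij form a linear basis of the algebra.
units = []
for i in range(n):
    for j in range(n):
        e = np.zeros((n, n), dtype=complex)
        e[i, j] = 1.0
        units.append(e)

# Gram matrix of the pre-inner product; its rank is the dimension of the
# quotient of the algebra by the null space {a : (a, a) = 0}.
gram = np.array([[pre_inner(a, b) for a in units] for b in units])
dim_Hf = np.linalg.matrix_rank(gram)

# Contraction property of left multiplication: (ab, ab) <= ||a||^2 (b, b).
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lhs = pre_inner(a @ b, a @ b).real
rhs = np.linalg.norm(a, 2) ** 2 * pre_inner(b, b).real
```

In this toy case `dim_Hf` comes out equal to `n`, in agreement with the claim of Exercise 9 that the GNS space of a vector state on $B(H)$ is a copy of $H$.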
In the general case it can happen that every GNS-representation of our algebra has a non-zero kernel, and thus loses some information about the structure of the algebra. (Show how it happens, say, in the case of $A := c_0(X)$, where $X$ is an uncountable set.) So we proceed as follows.
Consider the set of all positive functionals $f$ such that $\|f\| = 1$. (Such functionals are called states⁵; if $A$ is unital, then they are characterized by the condition $f(e) = 1$.) Put $H := \bigoplus\{H_f : f \text{ is a state}\}$, i.e., take the Hilbert sum of (an enormous number!) of Hilbert spaces corresponding to all possible states. This is the Hilbert space promised in Theorem 2. As for the promised representation, it is the mapping $T : A \to B(H)$ assigning to every $a \in A$ the operator $T_a$ taking the "line" $(\dots, x_f, \dots)$; $x_f \in H_f$ to $(\dots, T_a^f(x_f), \dots)$. The verification that this "sum of GNS-representations" is also a *-representation is relatively easy. On the contrary, the fact that this representation is isometric requires non-trivial arguments and uses some special properties of C*-algebras.
Remark. Again, the creation turned out to be more perfect than the creators themselves expected. The approach that Gelfand and Naimark used to identify an abstract C*-algebra with a concrete one (an operator C*-algebra) had, as the authors initially thought, a shortcoming. The Hilbert space in which the operators act turned out to be very large; it is always non-separable (since there are too many terms in the Hilbert sum), and the image of the representation $T$ is only a "small part" of all of $B(H)$. This caused a discomfort the authors felt, and provoked a desire to construct a more "efficient" representation.⁶

It took more than twenty years to discover that this, at first sight "inefficient", representation of C*-algebras as operator algebras has great possibilities. It turned out that every possible representation of a given C*-algebra as an operator algebra can be obtained as a part of this special representation introduced by Gelfand and Naimark. Due to this, the latter is now called the universal representation. Moreover, this representation allows one to connect topics that initially had little in common: the theory of von Neumann algebras (non-commutative measure theory) and the theory of general C*-algebras (non-commutative topology). To do this we take the operator algebra constructed in the Gelfand-Naimark representation and pass to its bicommutant in $B(H)$. We obtain the so-called enveloping von Neumann algebra of the initial (abstract) C*-algebra, which is the desired link between the two theories. Note the remarkable fact that this algebra, as a Banach space, is isometrically isomorphic to the second dual space $A^{**}$. About all this, see, e.g., [68], [90], [87].
* * *
Our digression has already taken much time. However, it is tempting to tell you a dramatic story about an old problem of "non-commutative measure theory". Here is another diversion.

After introducing "his" algebras, von Neumann attempted to classify them up to a *-isomorphism. Realizing the independent importance of such a work, he believed that this result should be very useful for quantum mechanics. (Explanation of such connections exceeds the scope of this book. We only note that at that time (the early 1930s) von Neumann set a very ambitious goal. He wanted to put quantum mechanics on a solid mathematical foundation to make it as harmonious as classical mechanics, and ideally, to turn it into a mathematical theory with its own system of axioms, definitions, and theorems: something like a quantum-mechanical version of Euclid's "Elements".) At the same time he felt that the required mathematical techniques would allow him to "kill several more (purely mathematical) birds", which we will not describe here (we only mention that this is about group representations and infinite-dimensional analogues of the Wedderburn theorems).
⁵ In some mathematical models of quantum mechanics they represent (physical) states of quantum mechanical systems; see, e.g., [45, pp. 13-15].
⁶ It was my teacher Mark Aronovich Naimark who told me about this.
At the beginning, everything looked promising. First, Theorem 5 formulated above gave a full description of commutative von Neumann algebras. (As we said before, this theorem was the origin of the non-commutative measure theory.) As for the general von Neumann algebras, von Neumann realized that their study can be reduced to the consideration of "elementary cells", or "bricks", from which every von Neumann algebra can be built with the help of a natural procedure, the so-called direct integral of operator algebras (there is no sense in speaking about it in detail now). Here is the definition of these cells.

Definition 6. A von Neumann algebra $A$ is called a factor if its center, i.e., the set $\{a \in A : ab = ba \text{ for all } b \in A\}$, consists of scalar multiples of the identity operator.
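For a finite-dimensional Hilbert space the center of $B(H)$ can be computed directly, which gives a quick sanity check of the statement that $B(H)$ is a factor. The sketch below is an illustration only, assuming $H = \mathbb{C}^3$; the names `units`, `commutator_maps`, and `center_dim` are hypothetical. It writes the linear map $a \mapsto ab - ba$ as a matrix acting on the row-major vectorization of $a$, stacks these matrices over all matrix units $b$, and reads off the dimension of the common kernel.

```python
import numpy as np

n = 3
identity = np.eye(n)

# Matrix units E_ij span the whole algebra of n x n matrices.
units = []
for i in range(n):
    for j in range(n):
        e = np.zeros((n, n))
        e[i, j] = 1.0
        units.append(e)

# With row-major vectorization, vec(a b) = kron(I, b^T) vec(a)
# and vec(b a) = kron(b, I) vec(a).
commutator_maps = [np.kron(identity, b.T) - np.kron(b, identity) for b in units]
stacked = np.vstack(commutator_maps)

# The center is the common kernel of all maps a -> ab - ba.
center_dim = n * n - np.linalg.matrix_rank(stacked)
residual = np.linalg.norm(stacked @ identity.reshape(-1))
```

Here `center_dim` is the dimension of the center, and `residual` confirms that the identity matrix lies in it; a one-dimensional center means that it consists of the scalar multiples of the identity only.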
Note that the factors participating in the von Neumann decomposition of commutative von Neumann algebras (i.e., according to Theorem 5, algebras of the form $L_\infty(X, \mu)$) are simply $\mathbb{C}$, and there are as many of them as the cardinality of $X$ is.

But what are these factors? In the early 1930s only one type of factors was known: $B(H)$ for a Hilbert space $H$. (You can verify that this is indeed a factor.) It was natural to conjecture that there are no other factors at all. But in 1935 one of the major discoveries of the 20th century happened: von Neumann and his colleague F. Murray found a factor that is completely different from $B(H)$. (One of the realizations of this factor is described in [92].)

Continuing their investigations, Murray and von Neumann showed that factors are divided into three types depending on the behavior of projections in them. More precisely, to every space that is the image of such a projection, one can assign a non-negative real number or $\infty$, the so-called relative dimension. (Here we assume for simplicity that $H$ is separable.) If a factor is such that the relative dimension of the images of projections can take natural values or, possibly, $\infty$, then it belongs to type I; these are the algebras $B(H)$ and nothing more. The next level of complexity is represented by "factors of type II", for which these numbers take values everywhere in the interval $[0, 1]$ (the above mentioned Murray-von Neumann factor is one of those) or in the extended positive real axis $\mathbb{R}_+ \cup \{\infty\}$. Finally, the third and the last logically possible case is that the relative dimensions of the projection spaces take the values 0 (for the zero subspace) and $\infty$ (for all other subspaces). These are "factors of type III". They have the most complicated structure. (The fact that these factors really exist became evident only five years after the discovery of factors of type II.)

Well, there are more factors than was initially expected, but do they admit a full description? It turned out that "the more you work, the more you have to do." Soon it was realized that already among the factors of type II there are many that are mutually non-*-isomorphic, and still no one could figure out whether there are non-isomorphic factors of type III. The first example of such kind appeared only in 1957 (L. Pukanszky), and it was extremely complicated technically. By that time von Neumann was terminally ill, and it looks like he did not find out about this example. The next important step was made ten years later, when R. Powers constructed a continuous family of mutually non-*-isomorphic factors of type III indexed by a parameter $\lambda \in (0, 1)$.

But it was the French mathematician Alain Connes who made a revolutionary discovery in 1973. By that time it was realized that the problem of classification of all factors is hopeless: there are too many of them. And the first achievement of Connes was that he distinguished a substantial class within which it was reasonable to formulate the classification problem. As befits a fundamental notion, the class of factors and, more generally, the class of von Neumann algebras distinguished by Connes can be characterized in many different and at first sight dissimilar ways. These algebras are often called amenable (the term inherited from harmonic analysis and homological theory of Banach
algebras). Depending on the choice of the way leading to this notion, these algebras are also called hyperfinite, injective, semidiscrete, etc. The shortest way to give the definition is, following Connes, to use the notions from the geometry of Banach spaces: a von Neumann algebra $A \subset B(H)$ is called amenable if there is a projection of norm 1 in $B(H)$ that has $A$ as its image (in other words, the natural embedding of $A$ into $B(H)$ is a coretraction in $\mathbf{Ban}_1$; cf. the discussion in Section 2.4).

Amenable factors turned out to be the class where a complete classification was achieved. Everything was clear with the factors of type I (they are, of course, all amenable): they are completely determined by their dimension as a linear space. Connes proved that there are only two non-isomorphic amenable factors of type II: the one discovered by Murray and von Neumann, and another one. As for the amenable factors of type III, Connes divided them into subtypes $\mathrm{III}_\lambda$, where $\lambda$ runs through the interval $[0, 1]$. He showed that for each $\lambda \in (0, 1)$ there is one factor, namely, that found by Powers. There are many non-isomorphic factors of type $\mathrm{III}_0$, but all of them are described in sufficiently transparent terms of the so-called "Krieger factors". The only difficulty Connes faced was the factors of type $\mathrm{III}_1$ (by the way, not long ago they came into fashion in quantum field theory). Connes proved that the "Araki-Woods factor" known by that time is an amenable factor of this type, but he could not say whether there are other amenable factors of the same type.

This last gap in the Connes theory was filled in 1982 by the Danish mathematician Uffe Haagerup. He proved that there are no other factors, and thus brought the problem of classification of amenable factors to a conclusion.
4. Continuous functional calculus and positive operators
Theorem 2.1 on the location of spectra is the first serious step in studying selfadjoint operators. Now we can go much further. Until further notice, $T$ is a selfadjoint operator on a Hilbert space $H$ and $\sigma := \sigma(T)$ is its spectrum; as we recall, it is a non-empty compact set in $\mathbb{R}$.

We have already defined polynomials in arbitrary elements of an algebra. Moreover, if the algebra is a Banach algebra, we can take entire holomorphic functions of its elements. Our next goal is to show that we can take arbitrary continuous functions of a selfadjoint operator provided these functions are defined on intervals (and, more generally, arbitrary subsets) in $\mathbb{R}$ containing the spectrum.

As always, for each $p \in \mathbb{C}[t]$ we denote by $p|_\sigma$ the restriction of $p$ (as a function of a real variable) to $\sigma$. Let us distinguish the following preparatory

Proposition 1. For every $p \in \mathbb{C}[t]$ we have $\|p(T)\| = \|p|_\sigma\|_\infty$.

Proof. We first assume that $p$ is a selfadjoint element in $\mathbb{C}[t]$. Then from Proposition 3.2, taking into account Example 3.1, it follows that the operator $p(T)$ has the same property. Combining Theorems 2.1 and 5.2.1, we see that

$$\|p(T)\| = \max\{|\lambda| : \lambda \in \sigma(p(T))\} = \max\{|p(\lambda)| : \lambda \in \sigma\} = \|p|_\sigma\|_\infty.$$

Now we pass to the general case. Since the polynomial $q := p^*p$ is selfadjoint, from what we have just said it follows that $\|q(T)\| = \|q|_\sigma\|_\infty$. But by the C*-identity and Example 3.1, we have

$$\|q(T)\| = \|p^*(T)p(T)\| = \|p(T)^*p(T)\| = \|p(T)\|^2.$$

At the same time, it is obvious that

$$\|q|_\sigma\|_\infty = \max\{|p^*p(\lambda)| : \lambda \in \sigma\} = \max\{|\overline{p(\lambda)}\,p(\lambda)| : \lambda \in \sigma\} = \|p|_\sigma\|_\infty^2.$$

The rest is clear. ∎
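Proposition 1 is easy to test numerically in the finite-dimensional case, where a selfadjoint operator is a Hermitian matrix and its spectrum is the set of eigenvalues. The following is a minimal sketch, assuming a random $5 \times 5$ Hermitian matrix and an arbitrarily chosen polynomial with complex coefficients; the helper `poly_of_operator` is an illustrative name, not notation from the text.

```python
import numpy as np

def poly_of_operator(coeffs, T):
    """Evaluate p(T) = sum_k coeffs[k] T^k by Horner's scheme."""
    n = T.shape[0]
    result = np.zeros((n, n), dtype=complex)
    for c in reversed(coeffs):
        result = result @ T + c * np.eye(n)
    return result

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = (X + X.conj().T) / 2                 # selfadjoint operator on C^5

# p(t) = (1 + 2i) - 0.5 t + 3i t^2 + 0.25 t^3, coefficients in increasing degree
coeffs = [1 + 2j, -0.5, 3j, 0.25]

spectrum = np.linalg.eigvalsh(T)         # real eigenvalues of T
operator_norm = np.linalg.norm(poly_of_operator(coeffs, T), 2)
sup_norm_on_spectrum = np.max(
    np.abs(np.polynomial.polynomial.polyval(spectrum, coeffs))
)
```

Up to rounding errors, `operator_norm` and `sup_norm_on_spectrum` agree, even though the polynomial is not selfadjoint.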
In the following definition, $[a, b]$ is an interval in $\mathbb{R}$, and $t$ denotes the restriction to $[a, b]$ of the "independent variable" $t \mapsto t$.
Definition 1. Continuous functional calculus or just continuous calculus of $T$ on $[a, b]$ is a unital homomorphism $\gamma_c : C[a, b] \to B(H)$ such that $\gamma_c(t) = T$.

Theorem 1 (on continuous functional calculus). If $[a, b]$ contains $\sigma$, then the continuous calculus $\gamma_c$ of $T$ on $[a, b]$ exists and is unique. Furthermore,
(i) $\gamma_c$ is an involutive homomorphism of *-algebras;
(ii) $\|\gamma_c(f)\| = \|f|_\sigma\|_\infty$ for every $f \in C[a, b]$ and, as a corollary, $\gamma_c$ is a contraction operator;
(iii) $\operatorname{Ker}(\gamma_c) = \{f \in C[a, b] : f|_\sigma = 0\}$;
(iv) if a bounded operator on $H$ commutes with $T$, then it commutes with the operator $\gamma_c(f)$ for every $f \in C[a, b]$.
On the other hand, if $[a, b]$ does not contain $\sigma$, then there is no continuous calculus of $T$ on $[a, b]$.
Proof. Suppose $\sigma \subset [a, b]$. Denote by $P$ the subset in $C[a, b]$ consisting of the restrictions of polynomials. Certainly, it is a *-subalgebra. Denote the restriction of $p \in \mathbb{C}[t]$ to $[a, b]$ again by $p$; this will not lead to confusion.

Since the continuous calculus $\gamma_c$ of $T$ on $[a, b]$ is a unital homomorphism, we have $\gamma_c(p) = p(T)$ for all $p \in \mathbb{C}[t]$. From the continuity of $\gamma_c$ and the Weierstrass approximation theorem it follows that there is at most one such calculus.

Now, consider the mapping $\gamma_0 : P \to B(H) : p \mapsto p(T)$ (coinciding with the polynomial calculus if we identify $P$ with $\mathbb{C}[t]$). Certainly, it is a unital *-homomorphism taking $t$ to $T$. Proposition 1 guarantees that this is a contraction operator with respect to the norm in $P$ inherited from $C[a, b]$. Since $B(H)$ is a Banach space, the continuity extension principle gives a unique contraction operator $\gamma_c$ extending $\gamma_0$ to $C[a, b]$. Let us look at its properties.

Suppose $f, g \in C[a, b]$, a sequence $p_n \in P$ tends to $f$, and a sequence $q_n \in P$ tends to $g$. Taking into account the continuity of $\gamma_c$ and the continuity of the multiplication in $C[a, b]$, we see that $\gamma_c(p_n q_n)$ tends to $\gamma_c(fg)$. At the same time, since the polynomial calculus is a homomorphism and the multiplication in $B(H)$ is continuous, we have that $\gamma_c(p_n q_n) = [p_n q_n](T) = p_n(T) q_n(T) = \gamma_c(p_n)\gamma_c(q_n)$ tends to $\gamma_c(f)\gamma_c(g)$. This proves that $\gamma_c$ is a homomorphism (and, thus, it is a continuous calculus of $T$ on $[a, b]$).

Again by the continuity of $\gamma_c$ and the "star property" of $C[a, b]$ we see that $\gamma_c(p_n^*)$ tends to $\gamma_c(f^*)$, and at the same time, since the polynomial calculus of $T$ is involutive (Example 3.1) and $B(H)$ is a star-algebra, $\gamma_c(p_n^*) = p_n^*(T) = p_n(T)^* = \gamma_c(p_n)^*$ tends to $\gamma_c(f)^*$. Hence, $\gamma_c$ is an involutive homomorphism.

Finally, again from Proposition 1, we have

$$\|\gamma_c(f)\| = \lim_{n\to\infty} \|\gamma_c(p_n)\| = \lim_{n\to\infty} \|p_n(T)\| = \lim_{n\to\infty} \|p_n|_\sigma\|_\infty = \|f|_\sigma\|_\infty.$$

This proves (ii) and, as a corollary, (iii).

If some $S \in B(H)$ commutes with $T$, then, obviously, $Sp(T) = p(T)S$ for every polynomial $p$. Hence, if $f \in C[a, b]$ and $p_n \in P$ tends to $f$ in $C[a, b]$, then

$$S\gamma_c(f) = S\bigl(\lim_{n\to\infty} p_n(T)\bigr) = \lim_{n\to\infty} (Sp_n(T)) = \lim_{n\to\infty} (p_n(T)S) = \gamma_c(f)S.$$

It remains to prove the last statement. If $\gamma_c$ exists, then, by Proposition 5.2.4, $\sigma(T) \subset \sigma(t)$, and the latter set is $[a, b]$ (Example 5.3.6). The rest is clear. ∎

Continuous calculi on different intervals are compatible.
Proposition 2. Suppose $[c, d] \subset [a, b]$, $\gamma_c$ and $\gamma_c'$ are continuous calculi of $T$ on the intervals $[a, b]$ and $[c, d]$ respectively, and $\tau : C[a, b] \to C[c, d]$ is the restriction $f \mapsto f|_{[c, d]}$. Then $\gamma_c = \gamma_c'\tau$.

Proof. The mapping $\gamma_c'\tau$, obviously, satisfies the definition of continuous calculus of $T$ on $[a, b]$. Hence, by the uniqueness of the latter, it coincides with $\gamma_c$. ∎

From this moment let us agree that if $f$ is defined on an arbitrary subset in $\mathbb{R}$ containing the interval $[a, b] \supset \sigma$, then we denote by $f(T)$ the operator $\gamma_c(f|_{[a, b]})$. In view of Proposition 2, this operator does not depend on the choice of the interval.

Exercise 1. Let $T$ be a compact selfadjoint operator. If $f(0) = 0$, then $f(T)$ is compact as well. If, in addition, $H$ is infinite-dimensional, then the converse is true.

Speaking about the continuous calculus of operators, we restricted ourselves to functions defined on intervals; this simplifies the arguments and is completely sufficient for further applications. But we could have taken
an arbitrary compact set containing $\sigma$, including $\sigma$ itself, as the domain of our functions. In the following exercise $\Delta$ is such a compact set, $t$ is the restriction of the independent variable to $\Delta$, and by continuous calculus of $T$ on $\Delta$ we mean a bounded unital homomorphism $\gamma_c^\Delta : C(\Delta) \to B(H)$ such that $\gamma_c^\Delta(t) = T$.

Exercise 2. The assertion of Theorem 1 remains true after replacing $[a, b]$ by $\Delta$. As a corollary, $\gamma_c^\sigma$ (i.e., $\gamma_c^\Delta$ for $\Delta := \sigma$) is an isometric operator.
Hint. If $[a, b]$ is an interval containing $\Delta$, then every $f \in C(\Delta)$ is a restriction of some $\tilde f \in C[a, b]$. Taking into account the known form of $\operatorname{Ker}(\gamma_c)$, this allows us to define the mapping $f \mapsto \gamma_c(\tilde f)$. It is the required calculus.

Remark. The result of this exercise in the case of $\Delta := \sigma$ can be regarded as a special (very special) case of the first Gelfand-Naimark theorem. In this situation the considered C*-algebra is $\overline{\{p(T) : p \in \mathbb{C}[t]\}}$, i.e., the minimal operator-norm-closed algebra containing $T$.

Let us find out what the general notion of continuous calculus turns out to be for some concrete classes of operators. In the next examples and exercises, $M$ is a subset in $\mathbb{R}$ containing an interval where the spectrum of the given operator lies, and $f$ is a continuous function on $M$.

Example 0. Suppose $T := \lambda 1$ ($\lambda \in \mathbb{R}$). Then $\sigma = \{\lambda\}$, and $f(T) = f(\lambda)1$ for every $f$.

Example 1. Let $T := P$ be a projection in $H$ different from 0 and 1. As we remember, in this case $\sigma = \{0, 1\}$. Then $f(T) = g(T)$ for each $g \in C[0, 1]$ coinciding with $f$ at the points 0 and 1. If we take the linear function $f(0)(1 - t) + f(1)t$ as $g$, we see that $f(P) = f(0)Q + f(1)P$, where $Q := 1 - P$.

Example 2. Suppose $H := L_2([a, b], \mu)$, where $\mu$ is a Borel measure on $[a, b]$ and $T$ is the operator of multiplication by $t$ (the independent variable). Then the mapping $\gamma_c : C[a, b] \to B(L_2([a, b], \mu))$ taking every $f$ to the operator $T_f$ of multiplication by $f$ is a continuous calculus of $T$ on $[a, b]$. Thus, in the considered case, $f(T)$ coincides with $T_f$.

Exercise 3. Suppose $T := T_\lambda$; $\lambda = (\lambda_1, \lambda_2, \dots)$ is a selfadjoint diagonal operator on $l_2$. Then $M$ contains all $\lambda_n$, and $f(T)$ is the diagonal operator $T_\mu$ with $\mu_n := f(\lambda_n)$.

Exercise 4 (Generalization of Exercise 3). Suppose $(X, \mu)$ is a measure space and $T := T_\varphi$ is a selfadjoint operator of multiplication by $\varphi \in L_\infty(X, \mu)$ in $L_2(X, \mu)$. Then $\sigma$ contains $\varphi(t)$ for almost all $t \in X$, and $f(T)$ is the operator of multiplication by the function $t \mapsto f(\varphi(t))$.
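The formulas of Example 1 and Exercise 3 can be checked on matrices. The sketch below assumes the finite-dimensional picture, in which a selfadjoint operator is a Hermitian matrix that becomes diagonal in an orthonormal eigenbasis, so that $f(T)$ is obtained by applying $f$ to the eigenvalues; the function name `continuous_calculus` and the choice $f = \cos$ are illustrative assumptions, not taken from the text.

```python
import numpy as np

def continuous_calculus(f, T):
    """Return f(T) for a Hermitian matrix T, using an orthonormal eigenbasis."""
    eigenvalues, U = np.linalg.eigh(T)
    # In the eigenbasis T is diagonal, and f acts on the diagonal entries.
    return (U * f(eigenvalues)) @ U.conj().T

f = np.cos                                # any continuous function would do

# Example 1: an orthogonal projection P onto the line spanned by v.
v = np.array([1.0, 2.0, -1.0, 0.5])
P = np.outer(v, v) / np.dot(v, v)
Q = np.eye(4) - P
f_of_P = continuous_calculus(f, P)
expected_P = f(0.0) * Q + f(1.0) * P      # f(P) = f(0) Q + f(1) P

# Exercise 3, finite-dimensional analogue: a diagonal operator.
lam = np.array([-1.0, 0.3, 0.3, 2.0])
f_of_D = continuous_calculus(f, np.diag(lam))
expected_D = np.diag(f(lam))              # diagonal operator with entries f(lambda_n)
```

In both cases the computed matrix coincides with the predicted one up to rounding errors.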
Exercise 5. Suppose $T$ is a selfadjoint compact operator on $H$, and the vectors $e_n$ and numbers $\lambda_n$ are the same as in the Hilbert-Schmidt theorem. Then $M$ contains all $\lambda_n$, and, in the case of infinite-dimensional $H$, contains 0 as well; $f(T)$ takes $e_n$ to $f(\lambda_n)e_n$ and is equal to $f(0)x$ on every $x$ such that $x \perp e_n$; $n = 1, 2, \dots$.

Let us now compare the continuous calculus we have just constructed with the entire holomorphic calculus from Section 5.3. As we remember, entire holomorphic functions can be constructed for arbitrary elements of Banach algebras. Now the initial requirements are more rigid: we consider the algebra $B(H)$ and a selfadjoint operator in it. But in compensation we have many more functions of an element: for instance, we can speak about the "modulus of $T$", or the "cube root of $T$", etc.

Remark. In fact, what is essential in the construction of continuous calculus is that we consider a C*-algebra $A$ and a normal (not necessarily selfadjoint) element $a$ in it. In this situation, for a continuous function $f$ on a compact set in $\mathbb{C}$ containing $\sigma(a)$, one can define the element $f(a)$ (see, e.g., [33]). But this is the limit of possible generalizations.
The fact that we use similar symbols $w(T)$ for $w \in \mathcal{O}$ (see Section 5.3) and $f(T)$ for $f \in C[a, b]$ in this section will not cause confusion.

Exercise 6 (Compatibility of functional calculi). Suppose $\gamma_e$ is the entire calculus of $T$ as an element of $B(H)$, and $[a, b]$ is an interval containing $\sigma$. Then the following diagram:

$$
\begin{array}{ccc}
\mathcal{O} & \xrightarrow{\;\gamma_e\;} & B(H) \\
{\scriptstyle j}\downarrow & \nearrow{\scriptstyle\gamma_c} & \\
C[a, b] & &
\end{array}
$$

where $j : w \mapsto w|_{[a, b]}$, is commutative.
Hint. The set of polynomials is dense in $\mathcal{O}$, and the set of their restrictions to $[a, b]$ is dense in $C[a, b]$.

Note an attractive feature of the continuous calculus that brings it closer to the calculi we considered before.

Theorem 2 (spectral mapping law for continuous calculus). For each interval $[a, b]$; $\sigma \subset [a, b] \subset \mathbb{R}$ and for each $f \in C[a, b]$ we have $\sigma(f(T)) = f(\sigma(T))$.
Proof. First, let us show that $\sigma(f(T)) \subset f(\sigma(T))$. Suppose $\lambda \notin f(\sigma(T))$, i.e., $f(t) \ne \lambda$ for $t \in \sigma(T)$. Consider the function $(f(t) - \lambda)^{-1}$ on $\sigma(T)$. As any other continuous function on $\sigma(T)$, it can be extended to a continuous function, say $g$, on $[a, b]$ (this evidently follows from a representation of $\mathbb{R} \setminus \sigma(T)$ as a union of disjoint intervals). Then the function $g(t)(f(t) - \lambda) - 1$ vanishes on $\sigma(T)$, and thus (Theorem 1) belongs to $\operatorname{Ker}(\gamma_c)$. Hence, $g(T)(f(T) - \lambda 1) = 1$, and this means that $\lambda \notin \sigma(f(T))$.

Let us prove the reverse inclusion. Suppose $f(s) \notin \sigma(f(T))$ for some $s \in \sigma(T)$. In other words, the operator $f(T) - f(s)1$ has an inverse in $B(H)$, say, $S$. For every $n \in \mathbb{N}$ consider the function $g_n \in C[a, b]$ that is equal to 1 at $s$, 0 for $|t - s| \ge \frac{1}{n}$, and is linear on the intervals $[s - \frac{1}{n}, s]$ and $[s, s + \frac{1}{n}]$. Since $\gamma_c$ is a homomorphism and $\|g_n\|_\infty = \|g_n|_{\sigma(T)}\|_\infty = g_n(s) = 1$, Theorem 1 shows that

$$1 = \|g_n(T)\| = \|g_n(T)(f(T) - f(s)1)S\| \le \|g_n(T)(f(T) - f(s)1)\|\,\|S\|.$$

Since $\gamma_c$ is a contraction operator, we see that $\|g_n(f - f(s))\|_\infty \ge \frac{1}{\|S\|}$. But, by the choice of $g_n$, we have $\|g_n(f - f(s))\|_\infty \to 0$ as $n \to \infty$, a contradiction. ∎
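The spectral mapping law can be observed numerically without computing $f(T)$ through eigenvectors. The sketch below follows the idea of the proof of Theorem 1: it approximates $f$ uniformly on $[a, b] = [-1, 1]$ by a polynomial $p$, forms $p(T)$ by Horner's scheme, and compares the eigenvalues of $p(T)$ with the values of $f$ on the eigenvalues of $T$. The choices $f = \exp$, degree 15, and a random symmetric $6 \times 6$ matrix are assumptions made for illustration; `horner` is a hypothetical helper name.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def horner(coeffs, T):
    """Evaluate sum_k coeffs[k] T^k for a square matrix T."""
    n = T.shape[0]
    out = np.zeros((n, n))
    for c in reversed(coeffs):
        out = out @ T + c * np.eye(n)
    return out

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 6))
T = (X + X.T) / 2                        # real symmetric, hence selfadjoint
T /= np.linalg.norm(T, 2)                # now the spectrum lies in [-1, 1]

f = np.exp                               # a continuous function on [-1, 1]
cheb_coeffs = C.chebinterpolate(f, 15)   # degree-15 Chebyshev interpolant of f
power_coeffs = C.cheb2poly(cheb_coeffs)  # the same polynomial in powers of t
f_of_T = horner(power_coeffs, T)         # p(T), an approximation of f(T)

spectrum_of_f_of_T = np.sort(np.linalg.eigvalsh(f_of_T))
f_of_spectrum = np.sort(f(np.linalg.eigvalsh(T)))
```

The two sorted arrays agree up to the approximation and rounding errors, which for this smooth $f$ are close to machine precision.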
Here is another useful observation: the identification of operators by the topological (or, as a special case, unitary) equivalence implies the respective identification of the values of a continuous function on these operators.

Proposition 3. Suppose $I : H \to K$ is a topological Hilbert space isomorphism implementing a topological equivalence between selfadjoint operators $T : H \to H$ and $S : K \to K$. Then for every $f \in C[a, b]$, where $[a, b]$ is an interval containing the spectrum of these operators (by Proposition 5.1.1 they have the same spectrum), the same $I$ implements a topological equivalence between the operators $f(T)$ and $f(S)$.

Proof. Consider the mapping $\gamma_c' : C[a, b] \to B(K) : f \mapsto I f(T) I^{-1}$. Evidently, it has the properties of a continuous calculus of $S$ on $[a, b]$. Consequently, by the uniqueness of the latter, $I f(T) I^{-1}$ is nothing but $f(S)$. ∎
One of the numerous applications of continuous calculus is that it allows us to characterize, from different points of view, a very important class of operators. Whereas selfadjoint operators behave like real numbers, these operators behave like non-negative numbers.

Definition 2. An element $a$ of an (arbitrary) unital involutive algebra is called positive (notation: $a \ge 0$) if it is selfadjoint and $\sigma(a) \subset \mathbb{R}_+$. In particular, a bounded operator on a Hilbert space $H$ is called positive if it is positive as an element of the involutive algebra $B(H)$.

Certainly, every (orthogonal) projection is positive. The following proposition gives a huge set of examples of positive elements.
Proposition 4. Unital homomorphisms of unital $*$-algebras take positive elements to positive elements.

Proof. This immediately follows from Propositions 3.2 and 5.2.4. ∎

Corollary 1. If $f \in C[a, b]$, where $[a, b] \supseteq \sigma(T)$ and $f \ge 0$, then the operator $f(T)$ is positive.
In particular, the operator described in the following definition is positive.

Definition 3. Suppose $T \ge 0$ and the function $f(t) := \sqrt{t}$ is defined on some interval in $\mathbb{R}_+$ containing $\sigma(T)$. Then the operator $f(T)$ is called the arithmetic square root of $T$ and is denoted by $\sqrt{T}$.

(As we recall, this operator does not depend on the choice of the interval.) From the definition it evidently follows that $(\sqrt{T})^2 = T$ (which justifies the name "square root").
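For a concrete feeling of the arithmetic square root, here is a minimal numerical sketch for a positive symmetric $2 \times 2$ matrix. It does not use the continuous calculus itself: it swaps in the closed formula $\sqrt{T} = (T + \sqrt{\det T}\,\mathbf{1}) / \sqrt{\operatorname{tr} T + 2\sqrt{\det T}}$, which follows from the Cayley-Hamilton identity and is valid only in dimension two for $T \ne 0$. The function names and the sample matrix are illustrative assumptions.

```python
import math

def sqrt_psd_2x2(T):
    """Positive square root of a positive symmetric 2x2 matrix T != 0.

    Assumption: closed form from Cayley-Hamilton for R = sqrt(T),
    R^2 - tr(R) R + det(R) I = 0, hence R = (T + det(R) I) / tr(R).
    """
    (a, b), (c, d) = T
    s = math.sqrt(max(a * d - b * c, 0.0))   # det(R) = sqrt(det T)
    t = math.sqrt(a + d + 2.0 * s)           # tr(R), since tr(R)^2 = tr(T) + 2 det(R)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

T = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical sample; eigenvalues 1 and 3, so T >= 0
R = sqrt_psd_2x2(T)
R_squared = matmul(R, R)       # should reproduce T up to rounding
print(R)
print(R_squared)
```

The matrix `R` has positive trace and positive determinant, so it is positive, and its square reproduces `T` up to rounding, in agreement with $(\sqrt{T})^2 = T$.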
Exercise 7. $\sqrt{T}$ is a unique positive operator $R$ such that $R^2 = T$.

Hint. Suppose $p_n(t)$ tends to $\sqrt{t}$ in $C[0, \|T\|]$. If $S \ge 0$ is such that $S^2 = T$ and $q_n(t) := p_n(t^2)$, then $q_n(S) = p_n(T)$ tends to $S$, and at the same time to $\sqrt{T}$.

Theorem 3. The following properties of an operator $T \in B(H)$ are equivalent:
(i) $T$ is positive;
(ii) $T = S^2$ for some positive $S$;
(iii) $T = S^2$ for some selfadjoint $S$;
(iv) $T = S^*S$ for some $S \in B(H, K)$, where $K$ is (generally speaking) another Hilbert space;
(v) the quadratic form of the operator $T$ takes only non-negative values (i.e., $\langle Tx, x \rangle \ge 0$ for all $x \in H$).
Proof. (i) $\Rightarrow$ (ii). It is sufficient to take $S := \sqrt{T}$.

(ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv). Clear.

(iv) $\Rightarrow$ (v). $\langle Tx, x \rangle = \langle S^*Sx, x \rangle = \langle Sx, Sx \rangle$.

(v) $\Rightarrow$ (i). By Proposition 2.2, $T$ is selfadjoint, hence $\sigma(T) \subset \mathbb{R}$. Therefore, our goal is to show that for $t > 0$ the operator $T + t\mathbf{1}$ is invertible. For every $x \in H$ we have

$\|(T + t\mathbf{1})x\| \, \|x\| \ge |\langle (T + t\mathbf{1})x, x \rangle| = \langle Tx, x \rangle + t \langle x, x \rangle \ge t \|x\|^2$.

Hence, $T + t\mathbf{1}$ is topologically injective, and it remains to use Proposition 2.6. ∎
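The implication (iv) $\Rightarrow$ (v) can be checked by hand in finite dimensions. The sketch below, in which the matrix `S`, the sample vectors and all helper names are hypothetical choices, builds $T = S^*S$ for a real matrix acting from $\mathbb{R}^2$ to $\mathbb{R}^3$ and compares $\langle Tx, x \rangle$ with $\|Sx\|^2$ using exact integer arithmetic.

```python
def apply(A, x):
    """Apply a real matrix A (a list of rows) to a vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def gram(S):
    """Return T = S* S for a real matrix S, where S* is the transpose."""
    n = len(S[0])
    return [[sum(row[i] * row[j] for row in S) for j in range(n)]
            for i in range(n)]

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

S = [[1, 2], [0, 1], [3, -1]]   # hypothetical operator from R^2 to R^3
T = gram(S)                     # T = S* S acts on R^2

samples = [[1, -2], [0, 1], [-3, 5]]
quadratic_form = [inner(apply(T, x), x) for x in samples]
squared_norms = [inner(apply(S, x), apply(S, x)) for x in samples]
print(quadratic_form)
print(squared_norms)
```

Both lists coincide and consist of non-negative numbers, as the identity $\langle S^*Sx, x \rangle = \langle Sx, Sx \rangle$ predicts.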
Whereas selfadjoint operators form a real subspace in $B(H)$, positive operators form the so-called cone.

Proposition 5. If $S, T \ge 0$ and $t \ge 0$, then $S + T \ge 0$ and $tS \ge 0$. If $T, -T \ge 0$, then $T = 0$.

Proof. The first assertion immediately follows from Theorem 3(v). Taking into account Proposition 2.1, the second assertion is true as well. ∎
Let us give some applications of square roots.

Proposition 6. If $T \ge 0$, then $\|T\| = \sup\{\langle Tx, x \rangle ; x \in B_H\}$.

Proof. This follows from the equalities $\|T\| = \|\sqrt{T}\|^2$ and $\|\sqrt{T}x\|^2 = \langle \sqrt{T}x, \sqrt{T}x \rangle = \langle \sqrt{T}\sqrt{T}x, x \rangle = \langle Tx, x \rangle$; $x \in H$. ∎

Exercise 8. For every selfadjoint operator $T$ we have $\max(\sigma(T)) = \sup\{\langle Tx, x \rangle\}$ and $\min(\sigma(T)) = \inf\{\langle Tx, x \rangle\}$. As a corollary, we have the formula $\|T\| = \sup\{|\langle Tx, x \rangle| ; x \in B_H\}$.

Hint. For sufficiently large $t > 0$ we have $\max \sigma(T) + t = \|T + t\mathbf{1}\| = \sup\{\langle Tx, x \rangle ; x \in B_H\} + t$.

Positive operators allow us to fulfil the promise made earlier. Namely, we can now describe the polar decomposition for an arbitrary bounded operator (cf. Exercise 2.7).

Exercise 9* (polar decomposition). Suppose $T : H \to K$ is a bounded operator between Hilbert spaces. Then there exist a partially isometric operator $W : H \to K$ and positive operators $S : H \to H$ and $R : K \to K$ such that $T = WS = RW$. The pair of operators $W, S$ is uniquely determined by the condition $\operatorname{Ker}(W)^\perp = \operatorname{Im}(S)^-$, and the pair $W, R$, by the condition $\operatorname{Im}(W)^\perp = \operatorname{Ker}(R)$.

Hint. Put $S := \sqrt{T^*T}$. Then $\|Sx\| = \|Tx\|$ for all $x \in H$. Hence, there exists a partially isometric operator $W$, uniquely defined by the rule that it takes $Sx$ to $Tx$ and vanishes on $\operatorname{Im}(S)^\perp$. The role of $R$ belongs to $\sqrt{TT^*}$.

The polar decomposition helps to establish the important fact that for selfadjoint operators, the two kinds of identification, "soft" topological equivalence and "rigid" unitary (i.e., isometric) equivalence, coincide (cf. Exercise 2.5).

Exercise 10*. Suppose selfadjoint operators $T_1 : H \to H$ and $T_2 : K \to K$ are topologically equivalent. Then they are unitarily equivalent.

Hint (we follow A. M. Stepin). Suppose $I : H \to K$ implements this topological equivalence, and $I = WS$ is its polar decomposition. Then $W$ must be unitary, and $S$ must be a positive topological isomorphism.
Since $IT_1 = T_2 I$, the selfadjointness of the initial operators implies that $T_1$ commutes with $I^*I$. Thus (Theorem 1), it commutes with $S = \sqrt{I^*I}$ as well. Hence, $WT_1 = WST_1S^{-1} = T_2 I S^{-1} = T_2 W$.

Exercise 11*. Every selfadjoint operator is a linear combination of two unitary operators. As a corollary, every bounded operator is a linear combination of four unitary operators.

Hint. If $\|T\| \le 1$, then, imitating the representation of a number in the interval $[-1, 1]$ as a half-sum of two numbers in $\mathbb{T}$, consider the operators $U := T \pm i\sqrt{\mathbf{1} - T^2}$.

* * *
The notion of positive operator allows us to introduce another important structure in B(H) , an order.
Definition 4. Suppose $S, T \in B(H)$. We say that $S$ is less than $T$ and write $S \le T$ if $T - S \ge 0$ (in other words, $\langle Sx, x \rangle \le \langle Tx, x \rangle$ for all $x \in H$).
Let us list some properties of the relation "$\le$". They are rather interesting in themselves, but what matters most is that they will be very useful later, in the proof of one of the versions of the spectral theorem.
Proposition 7. The relation "$\le$" has all the properties of an order (see Definition 0.1.1) and the following additional properties:
(i) if $S_1 \le T_1$ and $S_2 \le T_2$, then $S_1 + S_2 \le T_1 + T_2$;
(ii) if $S \le T$ and $t \ge 0$, then $tS \le tT$;
(iii) if $S \le T$ and $T \le S$, then $S = T$;
(iv) if $S \le T$, then $R^*SR \le R^*TR$ for every $R \in B(H)$;
(v) if $S \le T$, and $P$ is a projection commuting with both operators, then $SP \le TP$;
(vi) for every selfadjoint $T$ we have $-\|T\|\mathbf{1} \le T \le \|T\|\mathbf{1}$;
(vii) if for some $C > 0$ we have $-C\mathbf{1} \le T \le C\mathbf{1}$, then $\|T\| \le C$;
(viii) (generalization of the previous property) if, for some selfadjoint $S, T$, we have $-S \le T \le S$, then $\|T\| \le \|S\|$;
(ix) for $S, T \in B(H)$ the relation $S^*S \le T^*T$ is equivalent to the fact that $\|Sx\| \le \|Tx\|$ for all $x \in H$.
Proof. The properties of an order and (i)-(iii) immediately follow from Theorem 3(v) and Proposition 5.

(iv) If $T \ge 0$, then, taking into account Theorem 3(iv), we have $R^*TR = (\sqrt{T}R)^*(\sqrt{T}R) \ge 0$. The rest is clear.

(v) follows from (iv) and Proposition 1.12.
(vi) Since $\sigma(T) \subset [-\|T\|, \|T\|]$ (Theorem 2.1), the spectra of the selfadjoint operators $\|T\|\mathbf{1} - T$ and $\|T\|\mathbf{1} + T$ lie in $\mathbb{R}_+$.

(vii) Using again Theorem 2.1 and replacing, if necessary, $T$ with $-T$, we can assume that $\|T\| \in \sigma(T)$, and this is equivalent to the inclusion $(C - \|T\|) \in \sigma(C\mathbf{1} - T)$. By assumption, the latter set lies in $\mathbb{R}_+$.

(viii) In view of (vi) and the properties of an order, $-\|S\|\mathbf{1} \le T \le \|S\|\mathbf{1}$, and the necessary estimate follows from (vii).

(ix) $\langle S^*Sx, x \rangle = \|Sx\|^2$ and the same is true after replacing $S$ with $T$. But $S^*S \le T^*T$ is equivalent to $\langle S^*Sx, x \rangle \le \langle T^*Tx, x \rangle$. ∎

The following exercise saves us from possible illusions.
Exercise 12*. The double inequality $0 \le S \le T$ in general does not imply that $S^2 \le T^2$, nor (as a corollary) that $\|Sx\| \le \|Tx\|$ for $x \in H$.

Hint. Consider a one-dimensional projection and add to it another one-dimensional operator which does not commute with it.

Proposition 8. Let $T$ be a selfadjoint operator, $[a, b]$ an interval containing its spectrum, and $f, g \in C[a, b]$. Then
(i) if $f \le g$, then $f(T) \le g(T)$ (continuous calculus preserves the order);
(ii) if $|f| \le |g|$, then $\|f(T)x\| \le \|g(T)x\|$ for all $x \in H$.

Proof. (i) immediately follows from Corollary 1.

(ii) From (i) we have that $f(T)^*f(T) = (\bar{f}f)(T)$ is less than $g(T)^*g(T) = (\bar{g}g)(T)$, and it remains to use Proposition 7(ix). ∎
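The contrast between Exercise 12 and Proposition 8 can be seen on $2 \times 2$ matrices. The following sketch follows the hint of Exercise 12 with one possible choice of data (an assumption, not the only one): $S$ is the projection onto the first coordinate axis and $T = S + Q$, where $Q$ is the projection onto the line spanned by $(1, 1)$. Positivity of a symmetric $2 \times 2$ matrix is tested through its diagonal entries and determinant; all names are illustrative.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(A, B):
    """Entrywise difference A - B of 2x2 matrices."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def is_positive(A):
    """A symmetric 2x2 matrix is positive iff its diagonal entries and determinant are >= 0."""
    (a, b), (_, d) = A
    return a >= 0 and d >= 0 and a * d - b * b >= 0

def apply(A, x):
    """Apply a 2x2 matrix to a vector."""
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

S = [[1.0, 0.0], [0.0, 0.0]]   # projection onto the first axis
Q = [[0.5, 0.5], [0.5, 0.5]]   # projection onto span{(1, 1)}; does not commute with S
T = [[S[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

order_holds = is_positive(S) and is_positive(sub(T, S))         # 0 <= S <= T
squares_ordered = is_positive(sub(matmul(T, T), matmul(S, S)))  # is S^2 <= T^2 ?

x = [1.0, -2.0]                # a vector witnessing the failure
norm_Sx_sq = sum(c * c for c in apply(S, x))
norm_Tx_sq = sum(c * c for c in apply(T, x))
print(order_holds, squares_ordered, norm_Sx_sq, norm_Tx_sq)
```

Here $T^2 - S^2$ has negative determinant, so $S^2 \le T^2$ fails although $0 \le S \le T$, and for the chosen $x$ we get $\|Sx\| > \|Tx\|$. This does not contradict Proposition 8, because $S$ and $T$ are not functions of one and the same selfadjoint operator.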
In the case of projections, an order can be expressed both in geometric and in algebraic terms.
Proposition 9. Suppose $P$ and $Q$ are projections onto subspaces $K$ and $L$, respectively, in $H$. The following statements are equivalent:
(i) $P \le Q$;
(ii) $K \subset L$;
(iii) $QP = PQ = P$;
(iv) $Q - P$ is a projection.

Proof. (i) $\Rightarrow$ (ii). Since $P = P^*P$ and $Q = Q^*Q$, Proposition 7(ix) shows that $\|Px\| \le \|Qx\|$ for all $x \in H$. But if $x \in K \setminus L$, then $\|Px\| = \|x\|$ and $\|Qx\| < \|x\|$, a contradiction.
(ii) $\Rightarrow$ (iii). For every $x \in H$ we have $Px \in K$. Hence, $Px \in L$ and $QPx = Px$. For the same $x$ we have $(x - Qx) \perp L$. Thus, $(x - Qx) \perp K$ and $Px - PQx = 0$.

(iii) $\Rightarrow$ (iv). Clearly, $Q - P$ is a selfadjoint idempotent operator.

(iv) $\Rightarrow$ (i). Being a projection, $Q - P$ is positive. ∎

Concluding this section, let us note another useful application of the continuous calculus. It allows us to decompose a selfadjoint operator into its positive and negative parts, as is done for functions taking real values.

Up to the end of the section, $f \in C[a, b]$ is such a function. As always, denote $f_+(t) := \max\{f(t), 0\}$ and $f_-(t) := \max\{-f(t), 0\}$. Let $T$ be a selfadjoint operator such that $\sigma(T) \subset [a, b]$. Consider the operators $f_+(T)$ and $f_-(T)$. By Corollary 1, both of them are positive. In particular, for $f := t$ (independent variable) we obtain two positive operators, $T_+ := t_+(T)$ and $T_- := t_-(T)$. They are called the positive and the negative part of the operator $T$. The operator $T_+ + T_-$ is called the absolute value of the operator $T$ and is denoted by $|T|$.

Exercise 13. Prove that $|T| = \sqrt{T^2}$. As a corollary (cf. Exercise 9), the polar decomposition of a selfadjoint operator $T$ has the form $T = W|T| = |T|W$.
Exercise 15. Prove that $\operatorname{Ker}(T) = \operatorname{Ker}(|T|) = \{x \in H : \langle |T|x, x \rangle = 0\}$ for every selfadjoint operator $T$.
Proposition 10. Suppose $f \in C[a, b]$ and $\operatorname{Im}(f) \subset \mathbb{R}$. Then
(i) $f(T) = f_+(T) - f_-(T)$;
(ii) $f_+(T) f_-(T) = f_-(T) f_+(T) = 0$;
(iii) $\|f(T)\| = \max\{\|f_+(T)\|, \|f_-(T)\|\}$.
In particular, $T = T_+ - T_-$, $T_+T_- = T_-T_+ = 0$ and $\|T\| = \max\{\|T_+\|, \|T_-\|\}$.

Proof. The equalities (i) and (ii) immediately follow from the corresponding relations for the functions $f_+$ and $f_-$ in $C[a, b]$ and from the algebraic properties of continuous calculus. The equality (iii) follows from the obvious relation $\|f|_{\sigma(T)}\|_\infty = \max\{\|f_+|_{\sigma(T)}\|_\infty, \|f_-|_{\sigma(T)}\|_\infty\}$ and Theorem 1(ii). ∎
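The relations $T = T_+ - T_-$ and $T_+T_- = 0$ can be verified numerically for a symmetric $2 \times 2$ matrix. The sketch below is only an illustration under stated assumptions: it takes the identity $|T| = \sqrt{T^2}$ of Exercise 13 for granted, computes the square root by the two-dimensional closed formula $\sqrt{A} = (A + \sqrt{\det A}\,\mathbf{1}) / \sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$ (valid for positive $A \ne 0$), and then puts $T_\pm = \frac{1}{2}(|T| \pm T)$. The sample matrix and all names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(A, B, alpha, beta):
    """Linear combination alpha * A + beta * B of 2x2 matrices."""
    return [[alpha * A[i][j] + beta * B[i][j] for j in range(2)] for i in range(2)]

def sqrt_psd_2x2(A):
    """Positive square root of a positive symmetric 2x2 matrix A != 0 (closed form)."""
    (a, b), (c, d) = A
    s = math.sqrt(max(a * d - b * c, 0.0))   # determinant of the root
    t = math.sqrt(a + d + 2.0 * s)           # trace of the root
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

T = [[1.0, 2.0], [2.0, -2.0]]                # hypothetical sample; eigenvalues 2 and -3
abs_T = sqrt_psd_2x2(matmul(T, T))           # |T| = sqrt(T^2), assumed from Exercise 13
T_plus = combine(abs_T, T, 0.5, 0.5)         # T_+ = (|T| + T) / 2
T_minus = combine(abs_T, T, 0.5, -0.5)       # T_- = (|T| - T) / 2

product = matmul(T_plus, T_minus)            # expected to vanish up to rounding
difference = combine(T_plus, T_minus, 1.0, -1.0)   # expected to reproduce T
print(T_plus, T_minus)
```

For this matrix $T_+$ has eigenvalues 2 and 0, and $T_-$ has eigenvalues 3 and 0, so $\|T\| = 3 = \max\{\|T_+\|, \|T_-\|\}$.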
Exercise 16. The pair $T_+, T_-$ is a unique pair of positive operators with the properties we have just indicated.
Hint. If $S_+, S_-$ is another such pair, then $(S_+ + S_-)^2 = T^2$, and taking into account Exercises 7 and 13, $S_+ + S_- = |T|$.

Exercise 17. Prove the following relations: $\operatorname{Im}(f_+(T)) \perp \operatorname{Im}(f_-(T))$, $\operatorname{Ker}(f_+(T)) + \operatorname{Ker}(f_-(T)) = H$, $\operatorname{Ker}(f(T)) = \operatorname{Ker}(f_+(T)) \cap \operatorname{Ker}(f_-(T))$.
5. The spectral theorem as an operator-valued Riemann-Stieltjes integral
Now we are going to discuss one of the most important mathematical achievements of all time, Hilbert's spectral theorem on the structure of selfadjoint operators. In fact, there are several closely related statements, which different people (and different textbook authors) call the spectral theorem. Nevertheless, with some simplification one can say that all these appearances of the great theorem fall into two groups, which we call the "analytic" and "geometric" forms of the spectral theorem.

The analytic form says that every selfadjoint operator generates a certain family of projections and can be recovered from these projections by a standard procedure. Thus, it turns out that the description of a selfadjoint operator is equivalent to the description of a family of projections with some special properties. The recovery procedure includes an integral with respect to not the usual (scalar) measure, but the so-called projection-valued measure. To define the corresponding "operator-valued" integral, we can either follow the classical construction of the Riemann-Stieltjes integral, or use the Lebesgue integral. (The first surpasses the other in clarity and simplicity of arguments; the second provides more powerful results and admits further generalizations.)

The geometric form of the spectral theorem gives a model of the operator under consideration, namely a unitarily equivalent operator of a concrete and "hands-on" kind. This is the operator of multiplication by a function in $L_2(\cdot)$ on an appropriate measure space.

Taken together, both approaches lead after some elaboration to a complete classification of selfadjoint operators up to unitary (or, what is the same in this context, topological) equivalence, i.e., from the categorical point of view, to complete understanding of their nature (Theorem 7.4 below). This classification generalizes the classification of compact selfadjoint operators (see Section 2), but it is formulated in more complicated terms.

We think that the best way to gain familiarity with the spectral theorem is to begin with a more visual form that uses a Riemann-Stieltjes type integral
with respect to the so-called resolution of the identity. In addition, the corresponding arguments are relatively elementary and rely on such tools as continuous calculus and the structure of order.

Thus, let $T$ be a selfadjoint operator in a Hilbert space $H$. The work done in the previous section allows us to construct various operators $f(T)$, where $f$ is an arbitrary continuous function on the real line (as was shown, we could consider the restriction of $f$ to an arbitrary interval containing $\sigma(T)$). For each $\lambda \in \mathbb{R}$, put

$H_\lambda := \bigcap \{\operatorname{Ker}(f(T)) : f(t) = 0 \text{ for } t \le \lambda;\ f \in C(\mathbb{R})\}$.
Note that for the definition of $H_\lambda$ we need in fact only one function.

Proposition 1. Let $f \in C(\mathbb{R})$ be a function such that $f(t) = 0$ for $t \le \lambda$ and $f(t) \ne 0$ for $t > \lambda$. Then $H_\lambda = \operatorname{Ker}(f(T))$. In particular, $H_\lambda = \operatorname{Ker}((t - \lambda)_+(T))$.

Proof. Evidently, it is sufficient to show that for every $g \in C(\mathbb{R})$ such that $g(t) = 0$ for $t \le \lambda$ and for each $x \in H$ the condition $f(T)x = 0$ implies $g(T)x = 0$. For every $n = 1, 2, \dots$ take the function $g_n(t) = g(t - \frac{1}{n})$. Take also an interval $[a, b]$ containing $\sigma(T)$. Clearly, the sequence $g_n$ tends to $g$ uniformly on $[a, b]$. Hence, $g_n(T)$ tends to $g(T)$ in $B(H)$. Let us (temporarily) freeze $n$ and take $c > 0$ so small that

$c \max\{|g_n(t)| : a \le t \le b\} < \min\{|f(t)| : \lambda + \frac{1}{n} \le t \le b\}$.

Then $|cg_n| \le |f|$ on $[a, b]$, and Proposition 4.8(ii) shows that $\|cg_n(T)x\| \le \|f(T)x\| = 0$. Thus $g_n(T)x = 0$ for every $n$, and $g(T)x = 0$. ∎
We have obtained a very important geometric object. As we will see later, specifying this object is equivalent to specifying the operator itself.

Definition 1. The family $H_\lambda$; $\lambda \in \mathbb{R}$ is called the family of subspaces associated with the (selfadjoint) operator $T$ (or the associated family of $T$).
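In finite dimensions the associated family is easy to write down. The sketch below assumes that $T$ is a diagonal matrix, so that $f(T)$ is the diagonal matrix of the values $f(t_i)$ and $\operatorname{Ker}(f(T))$ is spanned by the basis vectors $e_i$ with $f(t_i) = 0$. Each subspace is represented by the set of indices of the basis vectors spanning it; the sample operator and the names are illustrative assumptions.

```python
def shifted_positive_part(lam):
    """Return the function t -> (t - lam)_+, zero for t <= lam and positive for t > lam."""
    return lambda t: max(t - lam, 0.0)

def kernel_of(diagonal, f):
    """Indices i of the basis vectors e_i spanning Ker f(T) for T = diag(diagonal)."""
    return {i for i, t in enumerate(diagonal) if f(t) == 0.0}

T = [-1.0, 0.5, 0.5, 2.0]    # hypothetical diagonal operator with spectrum {-1, 0.5, 2}
levels = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0, 3.0]

# H_lambda computed from the single function (t - lambda)_+, as in Proposition 1.
H = {lam: kernel_of(T, shifted_positive_part(lam)) for lam in levels}

# Another function with the same zero set must give the same kernel.
squared = lambda lam: (lambda t: max(t - lam, 0.0) ** 2)
same_kernel = all(kernel_of(T, squared(lam)) == H[lam] for lam in levels)

for lam in levels:
    print(lam, sorted(H[lam]))
```

In this model $H_\lambda$ is spanned by the eigenvectors of $T$ whose eigenvalues do not exceed $\lambda$: it is zero below the spectrum, jumps exactly at the eigenvalues, and is the whole space from the largest eigenvalue on.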
Proposition 2. The family $H_\lambda$ has the following properties:
(i) for some $a, b \in \mathbb{R}$ we have $H_\lambda = 0$ for $\lambda < a$ and $H_\lambda = H$ for $\lambda \ge b$ (boundedness);
(ii) if $\lambda < \mu$, then $H_\lambda \subset H_\mu$ (monotonicity);
(iii) for every $\lambda$ we have $H_\lambda = \bigcap \{H_\mu : \mu > \lambda\}$ (continuity from the right);
(iv) all $H_\lambda$ are invariant with respect to $g(T)$ for each $g \in C(\mathbb{R})$ and, in particular, with respect to $T$.
Proof. (i) If $\lambda < \min\{\sigma(T)\}$, then there exists $f \in C(\mathbb{R})$ vanishing for $t \le \lambda$ and non-vanishing on $\sigma(T)$. Then by Theorem 4.2 (on the mapping of spectra), $f(T)$ is invertible, and thus, $H_\lambda \subset \operatorname{Ker}(f(T)) = 0$. On the other hand, if $\lambda \ge \max\{\sigma(T)\}$, then for every $f \in C(\mathbb{R})$ such that $f(t) = 0$ for $t \le \lambda$, we have $f(T) = 0$ (Theorem 4.1(iii)), and hence $H_\lambda = H$.
(ii) The assertion immediately follows from Definition 1 . (iii) Take an arbitrary f vanishing for t < A, and for every J-L > A put f11 ( t ) :== f(t - (J-L - A)) . Then f11 ( t ) tends to f(t) as J-L � A + 0 uniformly on each interval. Hence, f11 (T) tends to f(T) in B(H) as J-L � A + 0. Therefore, if x E H11 (and thus f11 (T) x == 0 for all J