McGRAW-HILL INTERNATIONAL BOOK COMPANY New York St. Louis San Francisco Auckland Bogotá Guatemala Hamburg Johannesburg Lisbon London Madrid Mexico Montreal New Delhi Panama Paris San Juan São Paulo Singapore Sydney Tokyo Toronto
PAUL A. FUHRMANN Department of Mathematics Ben Gurion University of the Negev Beer Sheva, Israel
Linear Systems and Operators in Hilbert Space
British Library Cataloguing in Publication Data Fuhrmann, Paul A. Linear systems and operators in Hilbert space. 1. Hilbert space 2. Linear operators I. Title 515'.73
QA322.4
78 40976
ISBN 0-07-022589-3
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE Copyright © 1981 by McGraw-Hill, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.
2345 MOC 8321
CONTENTS

Chapter I. Linear algebra and finite dimensional systems
Introduction
1. Rings and modules
2. Polynomial modules
3. The Smith canonical form
4. Structure of linear transformations
5. Linear systems
6. Reachability, observability, and realizations
7. Hankel matrices
8. Simulation and isomorphism
9. Transfer functions and their factorizations
10. Realization theory
11. Polynomial system matrices
12. Generalized resultant theorem
13. Feedback
Notes and references

Chapter II. Operators in Hilbert space
1. Geometry of Hilbert space
2. Bounded operators in Hilbert space
3. Unbounded operators
4. Representation theorems
5. The spectral theorem
6. Spectral representations
7. The Douglas factorization theorem and related results
8. Shifts, isometries, and the Wold decomposition
9. Contractions, dilations, and models
10. Semigroups of operators
11. The lifting theorem
12. Elements of H² theory
13. Models for contractions and their spectra
14. The functional calculus for contractions
15. Jordan models
Notes and references

Chapter III. Linear systems in Hilbert space
1. Fundamental concepts
2. Hankel operators and realization theory
3. Restricted shift systems
4. Spectral minimality of restricted shift systems
5. Degree theory for strictly noncyclic functions
6. Continuous time systems
7. Symmetric systems
Notes and references

References
Index
PREFACE
Great progress has been made in the last few years in the direction of establishing a system theory in the context of infinite dimensional spaces. Although this direction of research has by no means been exhausted, it seems that the available theory has reached a level of maturity where a more systematic description would be in order; this would be of help to other workers in the field. My aim in this book is to reach different sets of readers: the mathematically oriented researcher in system theory on the one hand and the pure mathematician working in operator theory on the other. I think that the power, beauty, and elegance of that part of operator theory touched upon in this book are such that the interested system scientist who is ready to invest some, maybe even considerable,
time and effort in its study will be rewarded with a significantly increased set of methods for tackling multivariable systems and a deeper understanding of the finite dimensional theory. The operator theorist might find that system theory provides the mathematician with a rich ground of interesting problems which might otherwise be overlooked. Mathematics has always benefited from the transplanting of ideas and motivations from other fields. It seems to me that system theory, besides being intellectually exciting, is today one of the richest sources of ideas for the mathematician as well as a major area of application of mathematical knowledge.
I have tried to present the fairly diverse material of the book in a unified way as far as possible, stressing the various analogies. In this sense the concept of module is fundamental and the key results deal with module homomorphisms, coprimeness, and spectral structure. The book is divided into three uneven chapters. The first one is devoted to algebraic system theory and serves also as a general introduction to the subject. The various possible descriptions of linear time invariant systems are described. Thus transfer functions, polynomial system matrices, state space equations, and modules are all touched upon.
In the second chapter the necessary operator and function theoretic background is established. The material includes a short survey of Hilbert space theory through the spectral theorem. We use here the classical approach based on integral representations of certain classes of analytic functions. This approach is taken to stress the close connection between representation theory and realization theory. We continue with a sketch of multiplicity theory for normal operators. Next we study contractions, their unitary dilations, and contractive semigroups. The Cayley transform is extensively used to facilitate the translation of results from the discrete to the continuous case. A special section is devoted to an outline of the theory of the Hardy spaces in the disc and in a half plane. Shift and translation invariant subspaces are characterized. Next we describe the main results concerning shift operators as models including the functional calculus, spectral analysis, and the theory of Jordan models. In the last chapter we study the mathematical theory of linear systems with a state space that is a Hilbert space. Emphasis is on modeling with shift operators and translation semigroups. The operator theoretic results developed in the second chapter are brought to bear on questions of reachability, observability, spectral
minimality, and realization theory, all in discrete and continuous time. Isomorphism results are derived and the limitations of the state space isomorphism theorem are delineated. A special section is devoted to symmetric systems. Many of the ideas and results, as well as the general structure of the book, were conceived during my two years' stay with Roger Brockett at Harvard. Without his help, influence, and encouragement this book would not have been written. It is a pleasure to acknowledge here my deep gratitude to him. I would also like to recall the many stimulating exchanges over the past few years with my colleagues J. S. Baras, P. Dewilde, A. Feintuch, J. W. Helton, R. Hermann, S. K. Mitter, and J. C. Willems. For her excellent typing of the entire manuscript I want to thank Mrs Y. Ahuvia. I gratefully acknowledge the support of the Israel Academy of Sciences, the Israel Commission for Basic Research, throughout the writing of this book. Most of all I want to thank my wife Nilly for her love, moral support, and encouragement.
CHAPTER ONE

LINEAR ALGEBRA AND FINITE DIMENSIONAL LINEAR SYSTEMS
INTRODUCTION

The finite dimensional linear systems discussed in this chapter are given by the dynamical equations

x_{i+1} = Ax_i + Bu_i
y_i = Cx_i

where A, B, and C are appropriate linear maps in linear spaces over an arbitrary field F. Thus the study of linear systems amounts to the study of triples of linear maps (A, B, C). Historically the study of linear transformations was based upon the study
of matrix representations and reduction to canonical forms through a proper choice of basis. The more modern approach studies a linear transformation A in a vector space X through the naturally induced polynomial module structure on X. This reduces the problem to that of a description of finitely generated torsion modules over F[λ], which is done through a cyclic decomposition. The use of polynomials greatly simplifies the operations involved and compactifies the notation. A case in point is the reduction of the problem of similarity of matrices to that of equivalence of corresponding polynomial matrices, for which there exists a simple arithmetic algorithm. In this chapter this point of view is adopted, but we go one step further by replacing the usual matrix representations by polynomial and rational models. These functional models present a natural setting for the study of linear systems and provide a common ground both for Kalman's emphasis on studying linear systems as polynomial modules and for Rosenbrock's extensive use of polynomial system matrices. Moreover this approach provides the natural link to the study of infinite dimensional systems.
The chapter begins with the introduction of the necessary algebraic concepts, focuses on polynomial modules, both free and torsion modules, represents the latter ones as quotient modules and describes all the module homomorphisms. This in turn is used to study the structure of linear transformations as well as linear systems. Coprime factorizations of transfer functions are used in realization theory and related isomorphism results. Associating a polynomial model with a factorization of a transfer function leads to a natural introduction of polynomial system matrices. We conclude with the study of feedback by the use of polynomial models.
1. RINGS AND MODULES

We review in this section the algebraic concepts needed for the understanding of the structure of linear transformations in finite dimensional vector spaces. A ring R is a set with two associative laws of composition called addition and multiplication such that: (a) with respect to addition R is a commutative (abelian) group; (b) the two distributive laws hold; (c) R has a multiplicative unit denoted by 1, that is, 1x = x for all x ∈ R. A ring R is called commutative if xy = yx for all x, y ∈ R. If x and y are nonzero elements in R for which xy = 0 we say that x and y are zero divisors. A commutative ring with no zero divisors is called an entire ring. Given two rings R and R₁ a ring homomorphism is a map φ: R → R₁ that satisfies

φ(x + y) = φ(x) + φ(y),   φ(xy) = φ(x)φ(y)

and

φ(0) = 0,   φ(1) = 1
An invertible element in a ring R is called a unit. A field is a commutative ring in which every nonzero element is a unit. In a ring R we have a natural division relation. We say that b is a left divisor of a, and denote it by b |_l a, if there exists an element c in R such that a = bc. In this case we say that a is a left multiple of c and a right multiple of b. A greatest common left divisor (g.c.l.d.) of elements a_i in a ring R is a common left divisor c of the a_i such that every other common left divisor c' of the a_i is a left divisor of c. The units of R are left divisors which are called trivial. Ring elements a and b are called left coprime, denoted by (a, b)_l = 1, if they do not have a nontrivial common left divisor, or equivalently if every g.c.l.d. of a and b is a unit. A least common left multiple (l.c.l.m.) of elements a_i is a common left multiple which is a right divisor of any other common left multiple. We say that two elements a and b in R are left associates, or left equivalent, if each one is a left divisor of the other. Thus, in an entire ring, two left associates differ by a unit factor on the right. The relation of left associateness is an equivalence relation in R. In an analogous way
we define right divisors and all related notions. If R is a commutative ring we drop the adjectives left and right and write simply divisor, g.c.d., etc.
A subset J of R is called a left ideal if it is an additive subgroup of R and RJ ⊂ J, and hence RJ = J as R contains an identity. A right ideal is defined analogously and thus satisfies JR = J. A (two sided) ideal is a subset of R which is simultaneously a left and a right ideal. A left ideal J in R is principal if J = Ra for some a in R. A commutative ring in which each ideal is principal is called a principal ideal ring. An entire ring which is also a principal ideal ring is called a principal ideal domain. There is a close connection between ring homomorphisms and two sided ideals. In fact if φ: R → R₁ is a ring homomorphism then Ker φ = {x ∈ R | φ(x) = 0} is an ideal in R. Conversely given an ideal J in R we can construct the factor ring R/J consisting of all cosets a + J with addition and multiplication defined by
(a + J) + (b + J) = (a + b) + J

and

(a + J)(b + J) = ab + J

These definitions make R/J a ring and the map φ: R → R/J given by φ(a) = a + J is a ring homomorphism whose kernel equals J. For the development of a structure theory for linear transformations and linear systems it will be important to study rings of polynomials and modules over such rings.
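As a minimal numerical sketch of the factor ring construction (our own illustration, not from the text): for R = Z and J = (6) the cosets a + J are represented by 0, ..., 5, the operations above become arithmetic modulo 6, and the cosets of 2 and 3 are zero divisors, so Z/(6) is not an entire ring while the coset of 5 is a unit.

```python
# Factor ring Z/nZ realized on coset representatives {0, ..., n-1}.
# add and mul implement (a+J)+(b+J) = (a+b)+J and (a+J)(b+J) = ab+J for J = nZ.
def coset(a, n):
    return a % n

def add(a, b, n):
    return coset(a + b, n)

def mul(a, b, n):
    return coset(a * b, n)

# In Z/6Z the cosets of 2 and 3 are nonzero but their product is zero,
# so Z/6Z has zero divisors; 5 is a unit since 5 * 5 = 25 = 1 (mod 6).
has_zero_divisors = mul(2, 3, 6) == 0 and coset(2, 6) != 0 and coset(3, 6) != 0
```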
Given a ring R we define R[λ], the ring of polynomials in the indeterminate λ with coefficients in R, as the set of all expressions of the form p(λ) = Σ_{i=0}^n a_i λ^i, n ≥ 0. The operations of addition and multiplication are defined as usual, that is, if p(λ) = Σ a_i λ^i and q(λ) = Σ b_i λ^i then

(p + q)(λ) = Σ (a_i + b_i) λ^i      (1-1)

and

(pq)(λ) = Σ c_n λ^n      (1-2)

where

c_n = Σ_{i+j=n} a_i b_j      (1-3)

Given a ring R let p ∈ R[λ]. If p(λ) = Σ_{i=0}^n a_i λ^i and a_n ≠ 0 we say a_n is the leading coefficient of p and call n the degree of p, denoted by deg p. If R is entire so is R[λ] and deg(pq) = deg p + deg q. The most important property of polynomial rings is the existence of a division process in R[λ]. Let q, p ∈ R[λ] with the leading coefficient of p a unit; then there exist unique h and r in R[λ] such that q = ph + r and deg r < deg p. A similar result holds for right division in R[λ]. We have also two evaluation maps from R[λ] into R, given by p → p_L(c) and p → p_R(c), respectively, where p_L(c) = Σ c^i a_i and p_R(c) = Σ a_i c^i for p(λ) = Σ a_i λ^i and c ∈ R.
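The division process just described is effective. A sketch in Python over the field Q (coefficient lists, lowest degree first; the function names are ours), computing q = ph + r with deg r < deg p, together with the evaluation map, which is single-valued here since Q is commutative:

```python
from fractions import Fraction

# Polynomials over Q as coefficient lists [a_0, a_1, ..., a_n], lowest degree first.
def trim(p):
    """Strip trailing zero coefficients; the zero polynomial becomes []."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_divmod(q, p):
    """Division process: q = p*h + r with deg r < deg p.
    Requires the leading coefficient of p to be invertible (true over a field)."""
    q = trim([Fraction(c) for c in q])
    p = trim([Fraction(c) for c in p])
    h = [Fraction(0)] * max(len(q) - len(p) + 1, 1)
    while len(q) >= len(p):
        k = len(q) - len(p)
        c = q[-1] / p[-1]
        h[k] = c
        for i, a in enumerate(p):
            q[i + k] -= c * a
        q = trim(q)
    return trim(h), q

def evaluate(p, c):
    """p_L(c) = p_R(c) = sum a_i c^i (the two maps agree over a commutative ring)."""
    return sum(Fraction(a) * c ** i for i, a in enumerate(p))
```

Dividing by λ − c leaves the constant remainder q(c), which is the scalar criterion (λ − c) | q if and only if q(c) = 0.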
As a result of the division process in R[λ] it is easily established [9] that (λ − c) |_l q if and only if q_L(c) = 0, and similarly for right division. If F is a field then every nonzero constant is a unit and it is an easy consequence of the division rule that the polynomial ring F[λ] is a principal ideal domain. We proceed with the introduction of modules. Let R be a ring. A left module M over R (or left R-module) is a commutative group together with an operation of R on M which satisfies
r(x + y) = rx + ry,      (r + s)x = rx + sx

and

r(sx) = (rs)x,      1x = x

for all r, s ∈ R and x, y ∈ M. Right modules are defined similarly. Let M be a left R-module. A subset N of M is a submodule of M if it is an additive subgroup of M which satisfies RN ⊂ N. Given a submodule N of a left R-module M we can define a module structure on the factor group M/N by letting
r(a + N) = ra + N

This makes M/N into a left R-module called the quotient module of M by N. Let M, M₁ be two left R-modules. A map φ: M → M₁ is an R-module homomorphism if for all x, y ∈ M and r ∈ R we have

φ(x + y) = φ(x) + φ(y)

and

φ(rx) = rφ(x)
Given two R-modules M and M₁ we denote by (M, M₁)_R the set of all R-module homomorphisms from M to M₁. As in the case of rings we have the canonical R-module homomorphism φ: M → M/N given by x → x + N, with Ker φ = N. Also, given an R-module homomorphism φ: M → M₁, Ker φ and Im φ are submodules of M and M₁, respectively. Given R-modules M₀, ..., M_n, a sequence of R-module homomorphisms

M₀ →^{φ₁} M₁ →^{φ₂} ... →^{φ_n} M_n

is called an exact sequence if Im φ_i = Ker φ_{i+1}. An exact sequence of the form

0 → M₁ → M₂ → M₃ → 0

is called a short exact sequence. If N is a submodule of a left R-module M then the sequence

0 → N →^j M →^π M/N → 0

is a short exact sequence. Here j is the injection of N into M and π is the canonical projection of M onto M/N. One way to get submodules of a given left R-module M is to consider for a given set A of elements in M the set N = {Σ r_i a_i | a_i ∈ A, r_i ∈ R}. This is clearly a
left submodule called the submodule generated by A. The set A is a set of generators for N. If M has a finite set of generators then we say that M is finitely generated.
A subset {b₁, ..., b_k} of an R-module M is called R-linearly independent, or just linearly independent, if Σ r_i b_i = 0 implies r_i = 0 for all i. A subset B of an R-module M is called a basis if it is linearly independent and generates M. If M has a set of generators consisting of one element we say M is cyclic. By a free module we mean a module which has a basis, or the zero module.

Theorem 1-1 Let R be a principal ideal domain and M a free left R-module with n basis elements. Then every R-submodule N of M is free and has at most n basis elements.
PROOF Let {e₁, ..., e_n} be a basis for M. Then every element x ∈ M has a unique representation in the form x = Σ r_i e_i. We prove the theorem by induction. For the zero module the theorem is trivial. Let us assume the theorem has been proved for modules with n − 1 basis elements. Let N be a submodule of M. If N contains only elements of the form Σ_{i=1}^{n−1} r_i e_i then the theorem holds by the induction hypothesis. Thus we may assume N contains an element g = Σ r_i e_i with r_n ≠ 0. Let I = {r_n | Σ r_i e_i ∈ N}. Clearly I is an ideal in R, and R being a principal ideal domain, I = (p_n), the ideal generated by some nonzero p_n. Thus N contains an element of the form f = s₁e₁ + ... + s_{n−1}e_{n−1} + p_n e_n. Hence for every element a in N there exists an r ∈ R for which a − rf belongs to a submodule of the free module generated by {e₁, ..., e_{n−1}}. Hence by the induction hypothesis there exists a basis {f₁, ..., f_{m−1}} of that submodule with m − 1 ≤ n − 1. Clearly {f₁, ..., f_{m−1}, f} generate N. To show that {f₁, ..., f_{m−1}, f} are R-linearly independent assume

r₁f₁ + ... + r_{m−1}f_{m−1} + rf = 0

The coefficient of e_n in this relation is rp_n, so rp_n = 0; as R is entire and p_n ≠ 0 this forces r = 0, and then, {f₁, ..., f_{m−1}} being R-linearly independent, it follows that r₁ = ... = r_{m−1} = 0.

Assume now R is a principal ideal domain. An element a of a left R-module M is called a torsion element if there exists a nonzero r ∈ R for which ra = 0. The set of all torsion elements of M is a submodule called the torsion submodule of M. M is a torsion module if all its elements are torsion elements. Given any
nonzero element a in M, then J_a = {r ∈ R | ra = 0} is a left ideal in R which is nontrivial if and only if a is a torsion element. Since R is principal, J_a = Rµ_a = (µ_a), the ideal generated by µ_a. µ_a is called a minimal annihilator of a. Two minimal annihilators differ by a unit factor. In particular if M is a cyclic module over R generated by a, then M is isomorphic to R/(µ_a). So if M is cyclic either M is isomorphic to R, which means M is free, or it is isomorphic to a proper quotient ring of R and in that case M is a torsion module. Since a finite direct sum of torsion R-modules is also a torsion module, it follows that given nonzero elements µ₁, ..., µ_k in R, then R/(µ₁) ⊕ ... ⊕ R/(µ_k) is a finitely generated torsion module over R. The converse is also true and is summarized in
the fundamental structure theorem for finitely generated torsion modules over principal ideal domains.

Theorem 1-2 Let M be a finitely generated torsion module over a principal ideal domain R. Then M is isomorphic to R/(µ₁) ⊕ ... ⊕ R/(µ_k) where the µ_i are nonzero elements in R and µ_{i+1} | µ_i. The sequence of ideals (µ_i) is uniquely determined.
We call the elements µ_i the invariant factors of the module M. The invariant factors are determined up to unit factors. We will give a proof for the special case that R = F[λ], the ring of polynomials over a field F. However, the proof holds for the general case with only minor modifications.
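For R = Z the conclusion of Theorem 1-2 can be made computational. A hedged sketch (our own helper, not from the text): any direct sum Z/(d₁) ⊕ ... ⊕ Z/(d_k) is brought to invariant-factor form by repeatedly replacing a pair (a, b) with (lcm(a, b), gcd(a, b)), which preserves the isomorphism class of Z/(a) ⊕ Z/(b).

```python
from math import gcd

def invariant_factors(ds):
    """Invariant factors mu_1, ..., mu_k (with mu_{i+1} | mu_i) of the torsion
    Z-module Z/(d_1) + ... + Z/(d_k), by pairwise lcm/gcd sweeps.
    Each sweep replaces (a, b) by (lcm, gcd); the process stops once every
    pair already satisfies the divisibility chain."""
    ds = [d for d in ds if d != 0]
    changed = True
    while changed:
        changed = False
        for i in range(len(ds)):
            for j in range(i + 1, len(ds)):
                g = gcd(ds[i], ds[j])
                l = ds[i] * ds[j] // g
                if (l, g) != (ds[i], ds[j]):
                    ds[i], ds[j] = l, g
                    changed = True
        ds.sort(reverse=True)
    return [d for d in ds if d != 1]  # drop trivial summands Z/(1)
```

For example Z/(12) ⊕ Z/(18) ≅ Z/(36) ⊕ Z/(6), with 6 | 36 as the theorem requires.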
2. POLYNOMIAL MODULES

In the previous section we introduced R[λ], the ring of polynomials over a ring R. In this section we study related objects obtained when starting with a module M. Thus let M be an R-module. By M((λ)) we denote the module of all truncated Laurent series, that is, the set of all formal series of the form Σ_{i≥k} m_i λ^i with k ∈ Z and m_i ∈ M; M[[λ]] is the submodule of all formal power series, that is, sums of the form Σ_{i=0}^∞ m_i λ^i. Finally M[λ] denotes the polynomial module with coefficients in M, that is, the submodule of M[[λ]] of all formal power series with only a finite number of nonzero coefficients. For a ring R the module R((λ)) is actually a ring with addition and multiplication defined by

Σ p_i λ^i + Σ q_i λ^i = Σ (p_i + q_i) λ^i      (2-1)

and

(Σ p_i λ^i)(Σ q_j λ^j) = Σ r_n λ^n      (2-2)

with

r_n = Σ_{i+j=n} p_i q_j      (2-3)
Multiplication is well defined as in each of the sums appearing in (2-3) there is only a finite number of nonzero terms. Given a left R-module M, then M((λ)) becomes a left R((λ))-module if the action of R((λ)) on M((λ)) is defined by (2-2) with Σ p_i λ^i ∈ R((λ)) and Σ q_j λ^j ∈ M((λ)). It will be convenient to consider also the module M((λ^{-1})), which then contains M[λ] as an R[λ]-submodule. Let j be the injection of M[λ] into M((λ^{-1})) and let π_− be the canonical projection of M((λ^{-1})) on the quotient module M((λ^{-1}))/M[λ]. Then we have the following short exact sequence of module homomorphisms

0 → M[λ] →^j M((λ^{-1})) →^{π_−} M((λ^{-1}))/M[λ] → 0      (2-4)

Moreover we can identify M((λ^{-1}))/M[λ] with λ^{-1}M[[λ^{-1}]], the set of all formal power series in λ^{-1} having zero constant term. An element f of M((λ^{-1})) is called rational if there exists a nonzero p ∈ R[λ] such that pf ∈ M[λ], and proper rational
if f is rational and f ∈ M[[λ^{-1}]]. M((λ^{-1})) is clearly an R[λ]-module and thus the quotient module λ^{-1}M[[λ^{-1}]] has also an induced R[λ]-module structure. The action of λ in λ^{-1}M[[λ^{-1}]] is called the left shift. We denote by π_+ the projection of M((λ^{-1})) onto M[λ], i.e., π_+ = I − π_−.
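In the scalar case M = R = Q a rational element p/q of M((λ^{-1})) can be expanded explicitly, which makes the decomposition f = π_+f + π_−f concrete. A sketch with our own helper names (coefficient lists, lowest degree first): the polynomial part π_+(p/q) is the quotient of p by q, and the first N coefficients of the strictly proper part π_−(p/q) = Σ_{k≥1} c_k λ^{-k} are read off by one more division after shifting the remainder by λ^N.

```python
from fractions import Fraction

def trim(p):
    """Strip trailing zero coefficients; the zero polynomial becomes []."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def pdivmod(q, p):
    """q = p*h + r with deg r < deg p, over Q."""
    q = trim([Fraction(c) for c in q])
    p = trim([Fraction(c) for c in p])
    h = [Fraction(0)] * max(len(q) - len(p) + 1, 1)
    while len(q) >= len(p):
        k = len(q) - len(p)
        c = q[-1] / p[-1]
        h[k] = c
        for i, a in enumerate(p):
            q[i + k] -= c * a
        q = trim(q)
    return trim(h), q

def laurent(p, q, N):
    """pi_+(p/q) and the first N coefficients c_1, ..., c_N of pi_-(p/q)."""
    h, r = pdivmod(p, q)                        # p = q*h + r, so pi_+(p/q) = h
    c, _ = pdivmod([Fraction(0)] * N + r, q)    # r/q = lambda^{-N}(c + strictly proper)
    tail = [c[N - k] if 0 <= N - k < len(c) else Fraction(0) for k in range(1, N + 1)]
    return h, tail
```

For instance λ²/(λ − 1) = (λ + 1) + (λ^{-1} + λ^{-2} + ...), splitting into its π_+ and π_− parts.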
For a ring R we will denote by R^{n×m} the R-module of all n × m matrices with elements in R. Thus R[λ]^{n×m} is the set of all n × m matrices with elements in R[λ]. We call these polynomial matrices. There is a standard isomorphism of R[λ]^{n×m} and R^{n×m}[λ], the set of all polynomials with n × m matrix coefficients. It will be convenient to have both interpretations at hand. Let R be a commutative ring. A matrix U ∈ R^{n×n} is called unimodular if det U is a unit in R. By Cramer's rule a matrix U in R^{n×n} is invertible if and only if U is unimodular.
There is a natural equivalence relation in R^{n×m}. We say that A and B in R^{n×m} are equivalent, or unimodularly equivalent, if there exist unimodular matrices U and V in R^{n×n} and R^{m×m}, respectively, such that B = UAV. It is trivial to check that this is indeed an equivalence relation. From now on we assume F is a field and V an n-dimensional vector space over F. By choice of basis in V it is clear that V is isomorphic to F^n, and as a consequence V[λ] and (V, V)_F[λ] are isomorphic to F^n[λ] and F^{n×n}[λ] (or F[λ]^{n×n}), respectively.
Theorem 2-1 A subset M of V[λ] is a submodule of V[λ] if and only if M = DV[λ] for some D ∈ (V, V)_F[λ].
PROOF That a set of the form DV[λ] is an F[λ]-submodule is clear. Conversely, assume M is a submodule of V[λ]. By Theorem 1-1, M is free with m ≤ n generators. Let e₁, ..., e_n be a basis for V as a vector space over F; then e₁, ..., e_n is also a set of free generators for V[λ] as an F[λ]-module. Let d₁, ..., d_m be a set of generators for M and let D ∈ (V, V)_F[λ] be defined by De_i = d_i for i = 1, ..., m and De_i = 0 for m < i ≤ n. Obviously M = DV[λ].

The partial order, by inclusion, of submodules of V[λ] can now be related to the division relation of the associated matrix polynomials.
Theorem 2-2 Let M = DV[λ] and N = EV[λ] be submodules of V[λ]. Then M ⊂ N if and only if E |_l D.

PROOF If E |_l D then D = EF for some F in (V, V)_F[λ], and hence M ⊂ N. Conversely assume DV[λ] ⊂ EV[λ]. Let e₁, ..., e_n be a basis for V and let d_i = De_i. By the submodule inclusion there exist f_i in V[λ] such that d_i = Ef_i, i = 1, ..., n. Define F ∈ (V, V)_F[λ] by Fe_i = f_i; then the factorization D = EF follows.

Corollary 2-3 Let M = DV[λ] be a submodule of V[λ]. If D is nonsingular and M = EV[λ] is any other representation of M then E = DU for some unimodular U ∈ (V, V)_F[λ].
PROOF We have E = DU and similarly D = EW, hence D = DUW. Since D is nonsingular, I = UW, which implies that U and W are unimodular and E is nonsingular.
The above corollary leads to the following definition.

Definition 2-4 A submodule M of V[λ] is called a full submodule if it has a representation of the form M = DV[λ] for some nonsingular D ∈ (V, V)_F[λ].

Clearly M is a full submodule of V[λ] if and only if it has a basis consisting of n elements. As D is nonsingular if and only if det D ≠ 0, we have the following trivial corollary.

Corollary 2-5 A submodule M = DV[λ] of V[λ] is a full submodule if and only if det D ≠ 0.

It should be noted that det D ≠ 0 means that det D ∈ F[λ] is not the zero polynomial; it might still be identically equal to zero as a function on F. Another easy and useful corollary of Theorem 2-1 is the following.
Corollary 2-6 Let D be a nonsingular element of (V, V)_F[λ]. Then (det D) V[λ] ⊂ DV[λ].

PROOF By a choice of basis in V we may assume we have a matrix representation of D. Then it follows from Cramer's rule that (det D)I = D adj D, where adj D is the classical adjoint of D, that is, the transpose of the cofactor matrix of D.

Given two submodules M₁ and M₂ of V[λ], then M₁ ∩ M₂ and M₁ + M₂ are also submodules of V[λ] and M₁ ∩ M₂ ⊂ M_i ⊂ M₁ + M₂.

Theorem 2-7 Let M_i = E_i V[λ], and let M₁ + M₂ = DV[λ] and M₁ ∩ M₂ = CV[λ]. Then C and D are a l.c.r.m. and a g.c.l.d., respectively, of E₁ and E₂.

PROOF The inclusion M₁ ⊂ M₁ + M₂ implies E₁V[λ] ⊂ DV[λ] and hence, by the previous theorem, E₁ = DG₁, or D |_l E₁, and similarly D |_l E₂. Thus D is a common left divisor of E₁ and E₂. Let D₁ be any common left divisor of E₁ and E₂, which means E_i = D₁F_i, or equivalently M_i ⊂ D₁V[λ]. Hence M₁ + M₂ ⊂ D₁V[λ] and therefore also DV[λ] = M₁ + M₂ ⊂ D₁V[λ]. Thus we get D₁ |_l D and hence D is a g.c.l.d. of E₁ and E₂. Next consider M₁ ∩ M₂. M₁ ∩ M₂ ⊂ M_i implies CV[λ] ⊂ E_iV[λ], or C = E_iG_i for some G_i ∈ (V, V)_F[λ]. So C is a common right multiple of E₁ and E₂. Let C₁ be any other common right multiple of E₁ and E₂; then C₁ = E_iH_i. So C₁V[λ] ⊂ E_iV[λ] and hence C₁V[λ] ⊂ E₁V[λ] ∩ E₂V[λ] = CV[λ]. This is equivalent to C |_l C₁, so C is a l.c.r.m. of E₁ and E₂.
As a corollary we obtain the following important result.
Theorem 2-8 Every two matrix polynomials E₁ and E₂ in (V, V)_F[λ] have a g.c.l.d. D which can be expressed as

D = E₁F₁ + E₂F₂      (2-5)

for some F₁ and F₂ in (V, V)_F[λ].

PROOF The existence of a g.c.l.d. has been proved in the previous theorem. Let D be a g.c.l.d. of E₁ and E₂. Then DV[λ] = E₁V[λ] + E₂V[λ]. Let d_i = De_i for a basis e₁, ..., e_n of V. Then d_i = E₁f_i^{(1)} + E₂f_i^{(2)}. Define F_j ∈ (V, V)_F[λ] by F_j e_i = f_i^{(j)}; then (2-5) holds.
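In the scalar case n = 1 a representation of the form (2-5) is produced by the extended Euclidean algorithm in F[λ]. A sketch over Q (coefficient lists, lowest degree first; names are ours, and no normalization is applied, so the returned d is a g.c.d. only up to a unit factor):

```python
from fractions import Fraction

def trim(p):
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def pdivmod(q, p):
    """q = p*h + r with deg r < deg p, over Q."""
    q = trim([Fraction(c) for c in q])
    p = trim([Fraction(c) for c in p])
    h = [Fraction(0)] * max(len(q) - len(p) + 1, 1)
    while len(q) >= len(p):
        k = len(q) - len(p)
        c = q[-1] / p[-1]
        h[k] = c
        for i, a in enumerate(p):
            q[i + k] -= c * a
        q = trim(q)
    return trim(h), q

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else Fraction(0)) +
                 (q[i] if i < len(q) else Fraction(0)) for i in range(n)])

def psub(p, q):
    return padd(p, [-Fraction(c) for c in q])

def pmul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += Fraction(a) * Fraction(b)
    return trim(r)

def bezout(e1, e2):
    """Extended Euclid: returns (d, f1, f2) with d = e1*f1 + e2*f2 a g.c.d."""
    r0, r1 = trim([Fraction(c) for c in e1]), trim([Fraction(c) for c in e2])
    f0, f1 = [Fraction(1)], []
    g0, g1 = [], [Fraction(1)]
    while r1:
        h, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        f0, f1 = f1, psub(f0, pmul(h, f1))
        g0, g1 = g1, psub(g0, pmul(h, g1))
    return r0, f0, g0
```

For E₁ = λ² − 1 and E₂ = λ² − 3λ + 2 this yields d = 3(λ − 1), an associate of the g.c.d. λ − 1.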
With only minor modifications we can prove the same result assuming E_i ∈ (W_i, V)_F[λ]. Also a completely analogous result holds for a g.c.r.d. A trivial generalization holds for a g.c.l.d. of a finite number of matrix polynomials.

Corollary 2-9 Let A_i ∈ (W_i, V)_F[λ], i = 1, ..., p. Then A₁, ..., A_p are left coprime if and only if there exist B_i ∈ (V, W_i)_F[λ] such that I = Σ_{i=1}^p A_i B_i.

Corollary 2-10 If two matrix polynomials have a nonsingular g.c.l.d. D then any other g.c.l.d. D₁ is given by D₁ = DU for some unimodular U.

The availability of g.c.l.d.'s allows us to determine the ideal structure in (V, V)_F[λ].
Theorem 2-11 A subset J ⊂ (V, V)_F[λ] is a right ideal if and only if J = D(V, V)_F[λ] for some D ∈ (V, V)_F[λ].

PROOF The if part is trivial. Suppose now that J is a right ideal in (V, V)_F[λ]; then J is finitely generated. Let A₁, ..., A_k be a set of generators and let D be a g.c.l.d. of A₁, ..., A_k. Then clearly J = D(V, V)_F[λ].

As in the case of submodules we have D(V, V)_F[λ] ⊂ E(V, V)_F[λ] if and only if E |_l D. If J = D(V, V)_F[λ] is a right ideal for which det D ≠ 0 then D is determined up to a right unimodular factor. A right ideal J = D(V, V)_F[λ] is called a full right ideal if D is nonsingular. Obviously this definition is independent of the representation of J. Of course analogous results hold also for left ideals. Next we pass to the study of quotient modules of V[λ], singling out those which are torsion modules.
Theorem 2-12 Let M be a submodule of V[λ]; then the quotient module V[λ]/M is a torsion module if and only if M is a full submodule of V[λ].

PROOF Assume M is a full submodule of V[λ]; then M = DV[λ] for a nonsingular D. As, by Corollary 2-6, (det D) V[λ] ⊂ DV[λ], it follows that det D annihilates the quotient module V[λ]/DV[λ]. Conversely, assume that V[λ]/M is a torsion module. Since it is finitely generated there exists a
polynomial p annihilating all of V[λ]/M. This implies pV[λ] ⊂ DV[λ], where DV[λ] is any representation of M. Thus pI = DE and hence det D is nontrivial and M is a full submodule.

The next lemma is in the same spirit and its proof is omitted.

Lemma 2-13 Let M be a submodule of V[λ]. Then V[λ]/M, considered as a vector space over F, is finite dimensional if and only if M is a full submodule of V[λ].
In case n = 1, V[λ] is isomorphic to F[λ] and hence has actually a ring structure. Any submodule M of F[λ] is an ideal, which is necessarily principal. The generator m, a nonzero polynomial of least degree p, is uniquely determined if we assume the highest order coefficient to be 1. By the division process in F[λ] we may identify F[λ]/M with F_{p−1}[λ], the set of all polynomials of degree ≤ p − 1. Since it is easier to work with representatives rather than with equivalence classes we would like to imitate the scalar construction in some way. The difficulty arises mostly out of the nonuniqueness of such a representation. One way to overcome this difficulty is through the use of canonical matrices, as was done by Eckberg in [32]. A related approach is to study finite dimensional vector spaces over the field of rational functions and special choices of bases, as was done by Forney [38]. We proceed differently and study the whole set of such representations.
Thus let π_− be the canonical projection of V((λ^{-1})) onto V((λ^{-1}))/V[λ]. We identify V((λ^{-1}))/V[λ] with λ^{-1}V[[λ^{-1}]]. Let now M = DV[λ] be a full submodule of V[λ]. D is therefore nonsingular and has an inverse in (V, V)_{F(λ)}. In matrix language D^{-1} would be a matrix over the field F(λ) of rational functions. Define now a map π_D: V[λ] → V[λ] by

π_D f = Dπ_− D^{-1}f      for f ∈ V[λ]      (2-6)
Lemma 2-14 Let D be a nonsingular element in (V, V)_F[λ]. Then π_D defined by (2-6) is a projection map in V[λ] and Ker π_D = DV[λ].

PROOF Let f ∈ V[λ]; then D^{-1}f ∈ V((λ^{-1})). Let D^{-1}f = g + h with g ∈ λ^{-1}V[[λ^{-1}]] and h ∈ V[λ]. This decomposition is unique. From this we get π_−D^{-1}f = g and π_D f = Dπ_−D^{-1}f = Dg = f − Dh. As f − Dh ∈ V[λ], the range of π_D is in V[λ]. That π_D is a projection follows from the equality

π_D²f = (Dπ_−D^{-1})(Dπ_−D^{-1}f) = Dπ_−²D^{-1}f = Dπ_−D^{-1}f = π_D f

that holds for all f ∈ V[λ]. Next we show that Ker π_D = DV[λ]. If f ∈ DV[λ] then f = Dg for some g ∈ V[λ]. Hence π_D f = Dπ_−D^{-1}Dg = Dπ_−g = 0 and DV[λ] ⊂ Ker π_D. Conversely if π_D f = 0 then, by the nonsingularity of D, π_−D^{-1}f = 0, or equivalently D^{-1}f = g ∈ V[λ]. So f = Dg and Ker π_D ⊂ DV[λ], which completes the proof.
Define now K_D as the range of the projection π_D, that is,

K_D = {π_D f | f ∈ V[λ]}      (2-7)

Clearly K_D is a vector space over F. From the preceding proof it is clear that the following lemma holds.

Lemma 2-15 An element f in V[λ] belongs to K_D if and only if D^{-1}f ∈ λ^{-1}V[[λ^{-1}]], that is, D^{-1}f is strictly proper rational.

The vector space K_D can be given an F[λ]-module structure by defining the action of polynomials on K_D through

p · f = π_D(pf)      (2-8)

With this definition π_D becomes a surjective F[λ]-module homomorphism of V[λ] onto K_D with kernel DV[λ]. Thus we have the following important result.

Theorem 2-16 Let M = DV[λ] be a full submodule of V[λ]. Then K_D defined by (2-7), with the module structure defined by (2-8), is an F[λ]-module isomorphic to V[λ]/M.
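In the scalar case V = F the content of Lemma 2-14 and Theorem 2-16 is familiar: for a nonzero d ∈ F[λ], π_d f = dπ_−(d^{-1}f) is exactly the remainder of f upon division by d, so K_d consists of the polynomials of degree less than deg d, with module action p · f = pf mod d. A sketch over Q (our own names; the division algorithm stands in for the formal-series computation):

```python
from fractions import Fraction

def trim(p):
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def pdivmod(q, p):
    """q = p*h + r with deg r < deg p, over Q."""
    q = trim([Fraction(c) for c in q])
    p = trim([Fraction(c) for c in p])
    h = [Fraction(0)] * max(len(q) - len(p) + 1, 1)
    while len(q) >= len(p):
        k = len(q) - len(p)
        c = q[-1] / p[-1]
        h[k] = c
        for i, a in enumerate(p):
            q[i + k] -= c * a
        q = trim(q)
    return trim(h), q

def pmul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += Fraction(a) * Fraction(b)
    return trim(r)

def pi_d(f, d):
    """Scalar case of (2-6): pi_d f is the remainder of f on division by d."""
    _, r = pdivmod(f, d)
    return r

def mod_action(p, f, d):
    """The module action (2-8): p . f = pi_d(p f)."""
    return pi_d(pmul(p, f), d)
```

For d(λ) = λ² + 1 this gives the familiar calculus of residues modulo d, e.g. λ² ≡ −1.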
We conclude this section with a digression on compound matrices. Let A be an n × n matrix over a commutative ring R, that is, an element of R^{n×n}. We denote by A^{(p)} the matrix of all p × p minors of A ordered lexicographically. The matrices A^{(p)}, p = 1, ..., n, are called the compound matrices of A. As there are C(n, p) = n!/(p!(n − p)!) ways of choosing p rows or columns out of the n available, A^{(p)} is a C(n, p) × C(n, p) matrix over R. The following theorem summarizes the known properties of A^{(p)}. We refer to [8] for additional material.
Theorem 2-17 Let A and B be n × n matrices over the commutative ring R. Then

(a) I^{(p)} = I
(b) (AB)^{(p)} = A^{(p)}B^{(p)}
(c) (A′)^{(p)} = (A^{(p)})′, where A′ denotes the transpose
(d) det A^{(p)} = (det A)^{(n−1 choose p−1)}

PROOF (a) and (c) are trivial, (b) is a consequence of the Binet-Cauchy formula for determinants, and (d) follows easily from triangulation of A by elementary row operations.
As further easy consequences we get that if A is invertible then (A⁻¹)^{(p)} = (A^{(p)})⁻¹. Hence if U is unimodular so is U^{(p)}.

Corollary 2-18 Let R be a commutative ring. If A and B in R^{n×n} are equivalent then so are A^{(p)} and B^{(p)}.

PROOF If A and B are equivalent then A = UBV for unimodular matrices U and V. This implies A^{(p)} = U^{(p)}B^{(p)}V^{(p)} and, since U^{(p)} and V^{(p)} are unimodular, A^{(p)} and B^{(p)} are equivalent.
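To make the definition concrete, here is a small numerical sketch of the compound matrix construction (sympy assumed; the helper compound is ours), checking parts (a), (b), and (d) of Theorem 2-17 for n = 3, p = 2:

```python
from itertools import combinations
from sympy import Matrix, eye, binomial

def compound(A, p):
    # A^(p): the matrix of all p x p minors of A, with row and column
    # index sets ordered lexicographically.
    n = A.rows
    idx = list(combinations(range(n), p))
    return Matrix(len(idx), len(idx),
                  lambda r, c: A[list(idx[r]), list(idx[c])].det())

A = Matrix([[1, 2, 0], [3, 1, 4], [0, 2, 1]])
B = Matrix([[2, 0, 1], [1, 1, 0], [0, 3, 5]])

assert compound(eye(3), 2) == eye(3)                               # (a)
assert compound(A * B, 2) == compound(A, 2) * compound(B, 2)       # (b), Binet-Cauchy
assert compound(A, 2).det() == A.det() ** binomial(3 - 1, 2 - 1)   # (d)
```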
We introduce now the determinant divisors of matrices over a ring and will subsequently relate them to the invariant factors of a matrix.
Definition 2-19 Let R be a commutative ring and A ∈ R^{n×n}. We define the determinant divisors D_i(A) of A by D_0(A) = 1 and D_i(A) as the g.c.d. of all i × i minors of A.
Lemma 2-20 If A is an n × n matrix over a commutative ring R then

(a) D_i(A) = D_1(A^{(i)})
(b) If A and B are equivalent in R^{n×n} then D_i(A) and D_i(B) are equivalent in R for i = 1, …, n.

PROOF (a) is obvious. In the light of (a) and Corollary 2-18 it suffices to prove that D_1(A) and D_1(B) are equivalent in R. Now if A and B are equivalent then each element a_{ij} of A is a linear function of all the b_{kl}. Thus D_1(B) | a_{ij} for all i and j. Hence D_1(B) | D_1(A). The equality follows by symmetry.

As we shall see later, at least in the case of matrices over a principal ideal domain the converse is also true, that is, the equivalence of all determinant divisors implies the equivalence of the matrices.
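Determinant divisors are directly computable from Definition 2-19. A small sketch (sympy assumed; det_divisor is our name) for a 2 × 2 polynomial matrix:

```python
from itertools import combinations
from functools import reduce
from sympy import Matrix, gcd, symbols

lam = symbols('lambda')

def det_divisor(A, i):
    # D_i(A): the g.c.d. of all i x i minors of A, with D_0(A) = 1.
    if i == 0:
        return 1
    minors = [A[list(r), list(c)].det()
              for r in combinations(range(A.rows), i)
              for c in combinations(range(A.cols), i)]
    return reduce(gcd, minors)

A = Matrix([[lam, 1], [0, lam]])
print([det_divisor(A, i) for i in range(3)])
# D_0 = 1; D_1 = 1 (the entry 1 is itself a 1 x 1 minor); D_2 = lambda**2
```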
3. THE SMITH CANONICAL FORM

In the previous section an equivalence relation was introduced in R^{n×m}, where R was a commutative ring. We now specialize the ring to F[λ] and study the invariants of the relation and the associated canonical forms. This has important implications for the structure theory developed in the next section. In order to give a satisfactory answer to the question of normal forms for equivalence we introduce the following class of square matrices over F[λ]. We do not specify their dimensions, as the dimension will be n when multiplying on the left and m when multiplying on the right.
The elementary matrices are: T_{ij}(p), the matrix obtained from the identity by placing the polynomial p ∈ F[λ] in the (i,j) position, i ≠ j; D_i(u), the matrix obtained from the identity by replacing the (i,i) entry by u, where u ∈ F[λ] is a unit, that is, a nonzero constant polynomial; and P_{ij}, the matrix obtained from the identity by interchanging the ith and jth rows.

The matrices T_{ij}(p), D_i(u), and P_{ij} are unimodular and are called elementary matrices.

It is easily checked that left (right) multiplication by T_{ij}(p) is equivalent to
adding the jth row (column) multiplied by p to the ith row (column); left (right) multiplication by D_i(u) is equivalent to multiplying the ith row (column) by u; and finally left (right) multiplication by P_{ij} is equivalent to interchanging the ith and jth rows (columns). These operations on rows and columns are called elementary operations. From this it is clear that any matrix A′ obtained from a given matrix A in F[λ]^{n×m} through a finite sequence of elementary operations is equivalent to A.
We denote by diag(p_1, …, p_k) the n × m matrix whose (i,j) element is p_i δ_{ij}, where k = min(n, m).
Theorem 3-1 Let D be a matrix in F[λ]^{n×m}. Then there exists a matrix Δ in F[λ]^{n×m} which is equivalent to D and for which

(a) Δ = diag(δ_1, …, δ_r, 0, …, 0), where the δ_i are nonzero elements of F[λ], and
(b) δ_{i+1} | δ_i for i = 1, …, r − 1

are satisfied.
The elements δ_1, …, δ_r are determined uniquely by the above conditions up to unit factors. Equivalently, the ideals (δ_i) are uniquely determined. The matrix Δ is called the Smith canonical form of D. The elements δ_1, …, δ_r are called the invariant factors of D.

PROOF Consider the matrix D. If D = 0 there is nothing to prove. Otherwise let d_{ij} be an element of least degree in D. By exchanging rows and columns we bring it to the (1,1) position. Let d_{1j} = d_{11}b_j + r_{1j} with deg r_{1j} < deg d_{11}. Subtract now the first column multiplied by b_j from the jth column. Now either r_{1j} = 0 and we repeat the process with the next element in the first row, or r_{1j} ≠ 0 and we move it, by elementary operations, to the (1,1) position and proceed as before. Next we repeat the same process with rows. Since the degree of the element in the (1,1) position is decreased with each exchange, it follows that after a finite number of operations we get a matrix of the form

[e_11  0    …  0   ]
[0     e_22 …  e_2m]
[⋮     ⋮        ⋮  ]
[0     e_n2 …  e_nm]

Proceeding inductively we obtain a matrix diag(g_1, …, g_r, 0, …, 0) equivalent to D, with g_i ≠ 0. If g_1 does not divide g_i we add the ith row to the first and repeat the process, obtaining a diagonal matrix diag(h_1, …, h_r, 0, …, 0) with deg h_1 < deg g_1. A finite number of repetitions of this process yields a
matrix diag(k_1, …, k_r, 0, …, 0) equivalent to the original and for which k_i | k_{i+1}, i = 1, …, r − 1. By row and column exchange we can reorder the diagonal elements so that the divisibility conditions of the theorem are satisfied.
It remains to prove uniqueness. Suppose Δ = diag(δ_1, …, δ_r, 0, …, 0) and Δ′ = diag(δ′_1, …, δ′_{r′}, 0, …, 0) are two diagonal matrices equivalent to D and satisfying the conditions of the theorem. By transitivity Δ and Δ′ are equivalent. From Lemma 2-20 it follows that the determinant divisors of Δ and Δ′ are equivalent in F[λ]. Now the determinant divisors of Δ are easily computed to be

D_0(Δ) = 1,  D_1(Δ) = δ_r,  D_2(Δ) = δ_r δ_{r−1},  …,  D_r(Δ) = δ_r ⋯ δ_1

and similarly for Δ′. This implies r = r′ and (δ_r) = (δ′_r), (δ_r δ_{r−1}) = (δ′_r δ′_{r−1}), …, and hence (δ_i) = (δ′_i).
This proof yields an effective way of computing the invariant factors.
Corollary 3-2 Let D be an element of F[λ]^{n×m}, let D_i(D) be its determinant divisors, and assume D_i(D) ≠ 0 for i = 1, …, r and D_i(D) = 0 for i > r. The invariant factors δ_1, …, δ_r of D are given by

δ_1 = D_r(D)/D_{r−1}(D),  …,  δ_r = D_1(D)

It should be noted that, given two polynomials p and q in F[λ], the Euclidean algorithm for calculating the g.c.d. of p and q is essentially the reduction of (p q) ∈ F[λ]^{1×2} to its Smith canonical form.
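Corollary 3-2 turns the Smith form into a computation: divide successive determinant divisors. A sketch (sympy assumed; both helper names are ours):

```python
from itertools import combinations
from functools import reduce
from sympy import Matrix, gcd, cancel, symbols

lam = symbols('lambda')

def det_divisors(A):
    # Determinant divisors D_0, ..., D_k, k = min(n, m).
    k = min(A.rows, A.cols)
    out = [1]
    for i in range(1, k + 1):
        minors = [A[list(r), list(c)].det()
                  for r in combinations(range(A.rows), i)
                  for c in combinations(range(A.cols), i)]
        out.append(reduce(gcd, minors))
    return out

def invariant_factors(A):
    # Corollary 3-2: the invariant factors are quotients of successive
    # determinant divisors, delta_1 = D_r/D_{r-1}, ..., delta_r = D_1.
    D = det_divisors(A)
    r = max(i for i in range(len(D)) if D[i] != 0)
    return [cancel(D[i] / D[i - 1]) for i in range(r, 0, -1)]

A = Matrix([[lam, 0], [0, lam * (lam - 1)]])
print(invariant_factors(A))
# delta_1 = lambda*(lambda - 1), delta_2 = lambda; note delta_2 | delta_1
```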
4. STRUCTURE OF LINEAR TRANSFORMATIONS

Probably the most elegant approach to the study of linear transformations on a finite dimensional vector space V over a field F is through the study of the F[λ]-module structure induced by them. The presentation in this section is along these general lines, but it stresses the notion of canonical models and the relation between them. The models we are after are either quotient modules of V[λ] or submodules of λ⁻¹V[[λ⁻¹]]. To study them we need to have some information about module homomorphisms. The following lemma provides a useful tool for the analysis of the module homomorphisms that follows.

Lemma 4-1
(a) Let X, X_1, and X_2 be modules over the ring R and let f_1: X → X_1 and f_2: X → X_2 be R-homomorphisms of which f_2 is assumed to be surjective. Then there exists a uniquely determined R-homomorphism φ: X_2 → X_1 which makes the diagram

        X
  f_1 ↙   ↘ f_2
  X_1 ←––– X_2    (4-1)
        φ

commutative if and only if

Ker f_2 ⊂ Ker f_1    (4-2)

Moreover, φ is injective if and only if

Ker f_2 = Ker f_1    (4-3)
(b) Let X, X_1, and X_2 be modules over the ring R and let f_1: X_1 → X and f_2: X_2 → X be R-homomorphisms of which f_2 is assumed injective. Then there exists a uniquely determined R-homomorphism φ: X_1 → X_2 which makes the diagram

  X_1 ––φ––→ X_2
  f_1 ↘    ↙ f_2    (4-4)
        X

commutative if and only if

Range f_1 ⊂ Range f_2    (4-5)

Moreover, φ is surjective if and only if

Range f_1 = Range f_2    (4-6)

Let us denote by χ the identity polynomial in F[λ], that is, χ(λ) = λ. Define the map S: V[λ] → V[λ] by Sf = χf, or (Sf)(λ) = λf(λ). S will be called the right shift in V[λ]. V[λ] has a variety of structures associated with it: it is an F-vector space, an F[λ]-module, and a left (V,V)_F[λ]-module. With each structure there is associated a class of homomorphisms.
Theorem 4-2

(a) A map φ: V[λ] → V[λ] is an F[λ]-homomorphism if and only if it is a linear map and the diagram

  V[λ] ––φ––→ V[λ]
   S↓            ↓S    (4-7)
  V[λ] ––φ––→ V[λ]

is commutative.
(b) A map φ: V[λ] → V[λ] is an F[λ]-homomorphism if and only if

(φf)(λ) = Φ(λ)f(λ)    (4-8)

for some Φ ∈ (V,V)_F[λ].
(c) A map φ: V[λ] → V[λ] is a (V,V)_F[λ]-homomorphism if and only if

(φf)(λ) = p(λ)f(λ)    (4-9)

for some p ∈ F[λ].

PROOF
(a) If φ: V[λ] → V[λ] is an F[λ]-homomorphism then (4-7) commutes. Conversely, if (4-7) commutes then also φSⁿ = Sⁿφ, and by linearity the result follows.
(b) The direct part is obvious. So let φ: V[λ] → V[λ] be an F[λ]-homomorphism. Let e_1, …, e_n be a basis of V. Define an element Φ of (V,V)_F[λ] by Φ(λ)e_i = φ(e_i)(λ), and (4-8) follows.
(c) Again the direct part is trivial. Let φ: V[λ] → V[λ] be a (V,V)_F[λ]-homomorphism. By (b), (φf)(λ) = Φ(λ)f(λ). Since φ is by assumption a (V,V)_F[λ]-homomorphism we have Φ(λ)D(λ) = D(λ)Φ(λ) for all D in (V,V)_F[λ]. In particular Φ(λ)D = DΦ(λ) for all D ∈ (V,V)_F, which implies that Φ(λ) is a scalar element, that is, Φ(λ) = p(λ)I for some p ∈ F[λ].
The simple structure of the F[λ]-homomorphisms of U[λ] lets us state the following simple lifting theorem.

Theorem 4-3 Let φ: U[λ] → U_1[λ] be an F[λ]-homomorphism and let j: U[λ] → U((λ⁻¹)) and j_1: U_1[λ] → U_1((λ⁻¹)) be the canonical injections. Then there exists an F[λ]-homomorphism φ̂: U((λ⁻¹)) → U_1((λ⁻¹)) which makes the diagram

  U((λ⁻¹)) ––φ̂––→ U_1((λ⁻¹))
    j↑                ↑j_1      (4-10)
  U[λ]    ––φ––→  U_1[λ]

commutative.
PROOF By Theorem 4-2 there exists Φ ∈ (U, U_1)_F[λ] such that φ(u) = Φu for u ∈ U[λ]. Define φ̂(u) = Φu for all u ∈ U((λ⁻¹)); then φ̂ is an F((λ⁻¹))-homomorphism which makes (4-10) commutative.
One set of the canonical models we are after for the representation of finite dimensional linear transformations will be the transformations defined by

S(D)f = π_D(χf)    for f ∈ K_D    (4-11)

where π_D and K_D are defined by (2-6) and (2-7), respectively. We recall that two linear transformations A and A_1, acting in F-vector spaces V and V_1 respectively, are similar if there exists an invertible F-linear map R: V → V_1
such that A_1R = RA. To show that our class of models is sufficient for the description, up to similarity, of all linear transformations in finite dimensional vector spaces we prove the following important theorem.

Theorem 4-4 Let A be a linear transformation in a finite dimensional vector space V over the field F. Then A is similar to S(χI − A) acting in K_{χI−A}.

PROOF Let v(λ) = Σ λⁱvᵢ be an element of the free module V[λ]. Define a map φ_A: V[λ] → V by φ_A v = Σ Aⁱvᵢ. Clearly φ_A is an F[λ]-homomorphism of V[λ] onto V. Thus we have the module isomorphism V ≅ V[λ]/Ker φ_A. Now, by a calculation analogous to the one in Sec. 1, v ∈ Ker φ_A if and only if (χI − A) | v, or v(λ) = (λI − A)w(λ) for some w ∈ V[λ]. So Ker φ_A = (χI − A)V[λ] and V ≅ K_{χI−A} as F[λ]-modules. This completes the proof.
The introduction of canonical models leads to an extremely simple proof of the Cayley-Hamilton theorem. As usual, given a linear transformation A in a vector space V over F, we define the characteristic polynomial d_A of A by d_A(λ) = det(λI − A), where the determinant is computed through any matrix representation of λI − A.
Theorem 4-5 (Cayley-Hamilton) Let A be a linear transformation in a finite dimensional vector space V over F and let d_A be its characteristic polynomial. Then d_A(A) = 0.

PROOF By similarity it suffices to show that d_A(S(χI − A)) = 0. This follows, by Cramer's rule, from the inclusion

d_A V[λ] ⊂ (χI − A)V[λ]
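The theorem is easy to check numerically. A hedged sketch (sympy assumed; the matrix A is an arbitrary illustration) evaluating d_A at A by Horner's scheme:

```python
from sympy import Matrix, symbols, zeros

lam = symbols('lambda')

A = Matrix([[0, 1, 0], [0, 0, 1], [2, -5, 4]])

# Characteristic polynomial d_A(lambda) = det(lambda*I - A)
d_A = (lam * Matrix.eye(3) - A).det()

# Evaluate d_A at the matrix A by Horner's scheme; exact arithmetic.
coeffs = d_A.as_poly(lam).all_coeffs()
val = zeros(3, 3)
for c in coeffs:
    val = val * A + c * Matrix.eye(3)

assert val == zeros(3, 3)   # Cayley-Hamilton: d_A(A) = 0
```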
We proceed now with a more detailed study of the transformation S(D).
Theorem 4-6 A number λ_0 ∈ F is an eigenvalue of S(D) if and only if Ker D(λ_0) ≠ {0}. In that case the eigenvectors of S(D) have the form

(χ − λ_0)⁻¹Dξ    for ξ ∈ Ker D(λ_0), ξ ≠ 0
PROOF Assume D(λ_0)ξ = 0 with ξ ≠ 0 and define f by f = (χ − λ_0)⁻¹Dξ. Then clearly π_D f = f, that is, f ∈ K_D, and

(S(D) − λ_0 I)f = π_D(χ − λ_0)f = π_D Dξ = 0

that is, f is an eigenvector of S(D) corresponding to the eigenvalue λ_0. Conversely, assume f is an eigenvector of S(D) which corresponds to the eigenvalue λ_0; then π_D(χ − λ_0)f = 0, or (χ − λ_0)f ∈ Ker π_D = DV[λ]. Therefore (χ − λ_0)f = Dg for some g in V[λ], or f = (χ − λ_0)⁻¹Dg. It remains to show that g is a constant vector. Since f ∈ K_D it follows from Lemma 2-15 that D⁻¹f = (χ − λ_0)⁻¹g is strictly proper rational and hence g is necessarily constant.
Corollary 4-7 A number λ_0 ∈ F is an eigenvalue of S(D) if and only if χ − λ_0 divides d = det D.

PROOF The polynomial d = det D is divisible by χ − λ_0 if and only if d(λ_0) = 0, which is equivalent to Ker D(λ_0) ≠ {0}.

The above corollary indicates the direction for generalizing Theorem 4-6 and is an instance of a spectral mapping theorem.
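In the scalar case D = d(λ) the model S(d) acts on K_d and, in the basis {1, λ, …, λ^{n−1}}, is represented by the companion matrix of d, so Corollary 4-7 says its eigenvalues are exactly the zeros of d. A sketch (sympy assumed; the basis choice anticipates the companion matrices later in this section):

```python
from sympy import Matrix, Poly, symbols, roots

lam = symbols('lambda')

d = Poly(lam**3 - 2*lam**2 - lam + 2, lam)   # det D; zeros 1, -1, 2
n = d.degree()
mu = d.all_coeffs()[::-1]    # mu_0, ..., mu_{n-1}, then the leading 1

# Matrix of S(d) on K_d in the basis {1, lambda, ..., lambda^{n-1}}:
# multiplication by lambda followed by reduction mod d.
S = Matrix(n, n, lambda i, j: 1 if i == j + 1 else 0)
for i in range(n):
    S[i, n - 1] = -mu[i]

eigs = set(S.eigenvals().keys())
assert eigs == set(roots(d).keys())   # eigenvalues of S(d) = zeros of d
```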
Theorem 4-8 Given a polynomial p in F[λ], p(S(D)) is invertible if and only if p and d = det D are coprime.
We omit the direct proof. This theorem follows also as a corollary of the more general result given by Theorem 4-11.

Since we are interested in the relationship between different canonical models, it is of importance to characterize the conditions guaranteeing the similarity of two transformations of the form S(D). For this we introduce the notion of intertwining operators. Let K and K_1 be vector spaces over F and let T and T_1 be two linear transformations acting in K and K_1, respectively. We say that a linear map X: K → K_1 intertwines T and T_1 if XT = T_1X. If X happens to be invertible then T and T_1 are similar. In the special case that the spaces are K_D and K_{D_1} and the maps are S(D) and S(D_1), respectively, a map X: K_D → K_{D_1} intertwines S(D) and S(D_1) if and only if it is an F[λ]-module homomorphism. Thus the set of all F[λ]-module homomorphisms from K_D into K_{D_1} is the one we wish to characterize, and in particular the subclass of isomorphisms.
Theorem 4-9 Let D and D_1 be nonsingular elements of (V,V)_F[λ] and (W,W)_F[λ], respectively. A map X: K_D → K_{D_1} is an F[λ]-homomorphism if and only if there exist Ξ and Ξ_1 in (V,W)_F[λ] satisfying

ΞD = D_1Ξ_1    (4-12)

and X is defined by

Xf = π_{D_1}Ξf    for f ∈ K_D    (4-13)
Before proving Theorem 4-9 we prove the following lemma.

Lemma 4-10 Let D_1 be a nonsingular element of (W,W)_F[λ]. A map X: V[λ] → K_{D_1} is an F[λ]-homomorphism if and only if, for some Ξ in (V,W)_F[λ], X is given by

Xf = π_{D_1}Ξf    for f ∈ V[λ]    (4-14)
PROOF Assume X: V[λ] → K_{D_1} is an F[λ]-homomorphism. Let e_1, …, e_n be a basis of V and let Xe_i = ξ_i ∈ K_{D_1}. Let Ξ be the element of (V,W)_F[λ] defined by Ξ(λ)e_i = ξ_i(λ). By linearity it follows that (Xv)(λ) = Ξ(λ)v for v ∈ V. Since X is an F[λ]-homomorphism we have, for any polynomial p in F[λ], X(pv) = π_{D_1}p(Ξv) = π_{D_1}Ξ(pv). As all elements of V[λ] are sums of elements of the form pv, (4-14) follows. The converse is trivial.

PROOF OF THEOREM 4-9 If X: K_D → K_{D_1} is defined through (4-13) and (4-12) then it is clearly an F[λ]-module homomorphism. Conversely, let X: K_D → K_{D_1} be an F[λ]-module homomorphism. Thus
XS(D) = S(D_1)X    (4-15)

Right multiplying (4-15) by π_D we obtain XS(D)π_D = S(D_1)Xπ_D, and this implies

(Xπ_D)S = S(D_1)(Xπ_D)    (4-16)

where S: V[λ] → V[λ] is defined by Sf = χf. Thus Xπ_D satisfies the conditions of Lemma 4-10 and hence

Xπ_D f = π_{D_1}Ξf    (4-17)

for some Ξ ∈ (V,W)_F[λ]. Now Xπ_D and X act equally on K_D, and hence (4-17) implies (4-13). Also Xπ_D Dg = 0 for any g ∈ V[λ]; hence π_{D_1}ΞDg = 0, or

ΞDV[λ] ⊂ D_1W[λ]    (4-18)

But (4-18) implies the existence of a Ξ_1 for which (4-12) holds.
The following theorem characterizes the invertibility properties of the transformations that intertwine two canonical models.
Theorem 4-11 Let D and D_1 be nonsingular elements of (V,V)_F[λ] and (W,W)_F[λ], respectively, and let X: K_D → K_{D_1} be defined by (4-13) with (4-12) holding. Then

(a) X is onto K_{D_1} if and only if Ξ and D_1 are left coprime, (Ξ, D_1)_L = I.
(b) X is one-to-one if and only if Ξ_1 and D are right coprime, (Ξ_1, D)_R = I.
PROOF

(a) Consider the range of X, {π_{D_1}Ξf | f ∈ K_D}, which is clearly a submodule of K_{D_1}. X is not onto K_{D_1} if and only if {π_{D_1}Ξf | f ∈ K_D} + D_1W[λ], which is equal to ΞV[λ] + D_1W[λ], differs from W[λ]. Now ΞV[λ] + D_1W[λ] = ΔW[λ], where Δ is a g.c.l.d. of Ξ and D_1, and Δ is not unimodular if and only if (Ξ, D_1)_L ≠ I.
(b) Let f ∈ K_D be in the kernel of X. Since f ∈ K_D we can write, by Lemma 2-15, f = Dg for some strictly proper rational function g. Now Xf = 0 implies π_{D_1}Ξf = 0, or ΞDg = D_1p for some p ∈ W[λ]. Using (4-12) we obtain Ξ_1g = p. Let us define J_g and I_g by

J_g = {A ∈ (V,V)_F[λ] | Ag ∈ V[λ]}

and

I_g = {B ∈ (V,W)_F[λ] | Bg ∈ W[λ]}

The representation theorems for ideals and modules proved in Sec. 2 imply that J_g = (V,V)_F[λ]A and I_g = (V,W)_F[λ]A_1 for some A and A_1 in (V,V)_F[λ]. Since

(V,W)_F[λ]J_g ⊂ I_g    and    (W,V)_F[λ]I_g ⊂ J_g

it follows that A and A_1 are right associates and without loss of generality can be identified. Now D ∈ J_g and Ξ_1 ∈ I_g, and hence they have a nontrivial common right divisor A.

Conversely, assume Ξ_1 and D are not right coprime. Let A be a g.c.r.d. of Ξ_1 and D. Let g be a strictly proper rational function for which Ag ∈ V[λ]. Such a g certainly exists. Let f = Dg; then f ∈ K_D and

Xf = π_{D_1}Ξf = π_{D_1}ΞDg = π_{D_1}D_1Ξ_1g = 0

for Ξ_1g ∈ W[λ], as A is a right divisor of Ξ_1. Thus X is not one-to-one.
Since a unimodular matrix U is left or right coprime with any other matrix we get as an easy corollary of the previous theorem the following classical result.
Corollary 4-12 Let A and A_1 be elements of (V,V)_F. Then A and A_1 are similar if and only if χI − A and χI − A_1 are equivalent.

PROOF That similarity implies equivalence is trivial. Assume equivalence of χI − A and χI − A_1; then by the previous theorem S(χI − A) and S(χI − A_1) are similar, and by transitivity A and A_1 are also similar.
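One direction of Corollary 4-12 can be checked mechanically: similar matrices yield the same monic determinant divisors for χI − A, hence equivalent characteristic matrices. A sketch (sympy assumed; det_divisors is our name, and the sample T is an arbitrary invertible matrix):

```python
from itertools import combinations
from functools import reduce
from sympy import Matrix, Poly, eye, gcd, symbols

lam = symbols('lambda')

def det_divisors(M):
    # Monic determinant divisors D_1, ..., D_n of a polynomial matrix M.
    n = M.rows
    out = []
    for i in range(1, n + 1):
        minors = [M[list(r), list(c)].det()
                  for r in combinations(range(n), i)
                  for c in combinations(range(n), i)]
        g = reduce(gcd, minors)
        out.append(Poly(g, lam).monic().as_expr())
    return out

A = Matrix([[1, 1, 0], [0, 1, 0], [0, 0, 2]])
T = Matrix([[1, 2, 0], [0, 1, 1], [1, 0, 1]])     # det T = 3, invertible
A1 = T * A * T.inv()                              # A1 similar to A

dd = lambda M: det_divisors(lam * eye(3) - M)
assert dd(A) == dd(A1)                            # equivalent char. matrices
assert dd(A)[-1] == lam**3 - 4*lam**2 + 5*lam - 2  # char poly (lam-1)^2 (lam-2)
```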
Theorem 4-11 has important implications inasmuch as, together with the Smith canonical form, it provides the key to the understanding of the structure of finitely generated torsion modules over F [A] and with that to the structure of linear transformations in finite dimensional vector spaces. Before tackling these
subjects we wish to indicate the usefulness of another set of canonical models.

For a finite dimensional vector space Y over a field F we note that Y((λ⁻¹)) is an F[λ]-module of which Y[λ] is a submodule. The quotient module Y((λ⁻¹))/Y[λ] inherits an induced module structure. We already made the identification of Y((λ⁻¹))/Y[λ] with λ⁻¹Y[[λ⁻¹]]. Thus λ⁻¹Y[[λ⁻¹]] is an F[λ]-module with the action of a polynomial p given by

p·y = π_-(py)    (4-19)

for y ∈ λ⁻¹Y[[λ⁻¹]], π_- being the canonical projection of Y((λ⁻¹)) onto λ⁻¹Y[[λ⁻¹]]. As before, let π_+ denote the canonical projection of Y((λ⁻¹)) onto Y[λ].
Let now D ∈ (Y,Y)_F[λ] be nonsingular, that is, det D is a nontrivial polynomial in F[λ]. We define a map π^D: λ⁻¹Y[[λ⁻¹]] → λ⁻¹Y[[λ⁻¹]] by

π^D y = π_-D⁻¹π_+Dy    (4-20)
Obviously π^D is a projection operator in λ⁻¹Y[[λ⁻¹]], but it is not an F[λ]-homomorphism. However, L_D defined by

L_D = Range π^D    (4-21)

is a submodule of λ⁻¹Y[[λ⁻¹]]. In fact we have the following counterpart of Theorem 2-12.
Theorem 4-13 A subset M of λ⁻¹Y[[λ⁻¹]] is a finitely generated torsion submodule if and only if

M = L_D = Range π^D    (4-22)

for some nonsingular D ∈ (Y,Y)_F[λ].
PROOF Let M = L_D for some nonsingular D. By Cramer's rule DE = (det D)I, where E is the cofactor matrix of D. This implies that det D annihilates all of M, that is, that M is a torsion submodule of λ⁻¹Y[[λ⁻¹]]. As M is a finite dimensional vector space over F it is clearly finitely generated over F[λ].

Conversely, assume M is a finitely generated torsion submodule of λ⁻¹Y[[λ⁻¹]]. There exists therefore a polynomial p ∈ F[λ] which annihilates all of M. Consider next the set J defined by

J = {A ∈ (Y,Y)_F[λ] | π_-(Ay) = 0 for all y ∈ M}

Then clearly J is a left ideal in (Y,Y)_F[λ] and so, by Theorem 2-11, has the form J = (Y,Y)_F[λ]D for some D ∈ (Y,Y)_F[λ]. Since pI ∈ J it follows that D is necessarily nonsingular. Define now a map ρ_D: Range π^D → Range π_D by

ρ_D y = Dy    (4-23)

Clearly its inverse is given by ρ_D⁻¹y = D⁻¹y for y ∈ K_D.
Since for every polynomial p ∈ F[λ]

ρ_D(p·y) = ρ_D π_-(py) = Dπ_-(py) = Dπ_-D⁻¹(pDy) = π_D p(Dy) = π_D p(ρ_D y) = p·ρ_D y

it follows that ρ_D is an F[λ]-homomorphism that maps M into a submodule of K_D. But submodules of K_D correspond in a bijective way to left factors of D; hence necessarily M = Range π^D.
Given any two finitely generated torsion submodules M and M_1 of λ⁻¹Y[[λ⁻¹]] and λ⁻¹Y_1[[λ⁻¹]], respectively, we are now able to characterize all F[λ]-homomorphisms from M into M_1. This is a result analogous to Theorem 4-9.
Theorem 4-14 Let D and D_1 be nonsingular elements of (Y,Y)_F[λ] and (Y_1,Y_1)_F[λ], respectively. A map ψ_0: L_D → L_{D_1} is an F[λ]-homomorphism if and only if there exist Ψ and Ψ_1 in (Y,Y_1)_F[λ] satisfying

ΨD = D_1Ψ_1    (4-24)

and for which

ψ_0(y) = π_-(Ψ_1y)    (4-25)
PROOF Assume there exist Ψ and Ψ_1 satisfying (4-24) and let ψ_0 be defined by (4-25). If y ∈ L_D then Dy is in Y[λ] and from (4-25) we obtain

D_1ψ_0(y) = D_1π_-(Ψ_1y) = D_1π_-D_1⁻¹D_1Ψ_1y = π_{D_1}(D_1Ψ_1y) = π_{D_1}(ΨDy)

which shows that π_{D_1}(ΨDy) ∈ Y_1[λ], or that ψ_0(y) ∈ L_{D_1}. To show that ψ_0 is an F[λ]-homomorphism we note that for p ∈ F[λ]

ψ_0(π_-(py)) = π_-(Ψ_1π_-(py)) = π_-(Ψ_1py) = π_-(pΨ_1y) = π_-(pπ_-(Ψ_1y)) = π_-(pψ_0(y))

Conversely, let ψ_0: L_D → L_{D_1} be an F[λ]-homomorphism. Since ρ_D: L_D → K_D and ρ_{D_1}: L_{D_1} → K_{D_1} defined by (4-23) are F[λ]-homomorphisms, the map ψ: K_D → K_{D_1} defined by

ψ = ρ_{D_1}ψ_0ρ_D⁻¹    (4-26)

is also an F[λ]-homomorphism. Applying Theorem 4-9, which characterizes these homomorphisms, we establish the existence of Ψ and Ψ_1 in (Y,Y_1)_F[λ] that satisfy (4-24) and for which ψ is given by

ψ(u) = π_{D_1}(Ψu)    (4-27)

As ψ_0 = ρ_{D_1}⁻¹ψρ_D we obtain for y ∈ L_D

ψ_0(y) = ρ_{D_1}⁻¹ψρ_D y = D_1⁻¹π_{D_1}ΨDy = D_1⁻¹D_1π_-D_1⁻¹ΨDy = π_-(D_1⁻¹ΨDy) = π_-(Ψ_1y)

by virtue of (4-24).
Since the map ψ̂_0: λ⁻¹Y[[λ⁻¹]] → λ⁻¹Y_1[[λ⁻¹]] defined by ψ̂_0(y) = π_-(Ψ_1y) is an F[λ]-homomorphism, we obtain as a corollary the counterpart of Theorem 4-3.
Theorem 4-15 Let M = L_D and M_1 = L_{D_1} be finitely generated torsion submodules of λ⁻¹Y[[λ⁻¹]] and λ⁻¹Y_1[[λ⁻¹]], respectively, and let ψ_0: M → M_1 be an F[λ]-homomorphism. Then there exists an F[λ]-homomorphism ψ̂_0: λ⁻¹Y[[λ⁻¹]] → λ⁻¹Y_1[[λ⁻¹]] which makes the diagram

  λ⁻¹Y[[λ⁻¹]] ––ψ̂_0––→ λ⁻¹Y_1[[λ⁻¹]]
       ↑                     ↑          (4-28)
      L_D    ––ψ_0––→    L_{D_1}

commutative.
Similarly we obtain the dual version of Lemma 4-10.
Corollary 4-16 Let M be a finitely generated torsion submodule of λ⁻¹Y[[λ⁻¹]] and let ψ: M → λ⁻¹Y_1[[λ⁻¹]] be an F[λ]-homomorphism. Then there exists an F[λ]-homomorphism ψ̂: λ⁻¹Y[[λ⁻¹]] → λ⁻¹Y_1[[λ⁻¹]] which makes the diagram

  λ⁻¹Y[[λ⁻¹]] ––ψ̂––→ λ⁻¹Y_1[[λ⁻¹]]
       ↑            ↗ ψ                (4-29)
       M

commutative.
The preceding discussion enables us now to introduce a second class of canonical models. For a nonsingular element D ∈ (Y,Y)_F[λ] we define L_D by (4-21) and a map S*(D) in L_D by

S*(D)y = π_-(χy)    (4-30)

S*(D) is well defined as L_D is a submodule of λ⁻¹Y[[λ⁻¹]].
Lemma 4-17 For a nonsingular D ∈ (Y,Y)_F[λ] the maps S(D) and S*(D) are similar, the similarity given by the following commutative diagram.

   L_D ––ρ_D––→ K_D
 S*(D)↓           ↓S(D)    (4-31)
   L_D ––ρ_D––→ K_D

PROOF This follows from the equality

ρ_D⁻¹S(D)ρ_D y = π_-D⁻¹π_D(χDy) = π_-D⁻¹Dπ_-D⁻¹(χDy) = π_-(χy) = S*(D)y
We return now to the study of the structure of finitely generated torsion modules over F[λ].
Theorem 4-18 Let M be a finitely generated torsion module over F[λ]. Then M is isomorphic to a direct sum

F[λ]/(δ_1) ⊕ … ⊕ F[λ]/(δ_r)    (4-32)

with δ_{i+1} | δ_i. The sequence of ideals (δ_i) is uniquely determined.
PROOF Let M be a finitely generated torsion module over F[λ]. Then M is isomorphic to a quotient module Fⁿ[λ]/DFⁿ[λ] with det D ≠ 0. Let Δ = diag(δ_1, …, δ_r, 1, …, 1) be the Smith canonical form of D; then D and Δ are equivalent. Theorem 4-11 implies that Fⁿ[λ]/DFⁿ[λ] and Fⁿ[λ]/ΔFⁿ[λ] are F[λ]-module isomorphic. However, Fⁿ[λ]/ΔFⁿ[λ] is clearly isomorphic to the direct sum F[λ]/(δ_1) ⊕ … ⊕ F[λ]/(δ_r). Uniqueness follows from the uniqueness of the Smith canonical form.
The previous theorem provides us with the background necessary for the understanding of the structure of linear transformations in finite dimensional vector spaces.

Let V be a finite dimensional vector space over F, and let A be a linear transformation in V. We induce an F[λ]-module structure in V by way of defining

p·x = p(A)x    for p ∈ F[λ] and x ∈ V    (4-33)

Moreover, as a consequence of the Cayley-Hamilton theorem, V as an F[λ]-module is a finitely generated F[λ]-torsion module. That this is the case follows also from Theorem 4-4, where the similarity of A and S(χI − A) is proved. This similarity is clearly an F[λ]-module isomorphism between K_{χI−A} and V, given the naturally induced module structure. Applying the structure theory developed in Theorem 4-18 we get the following.
Theorem 4-19 Let V be a finite dimensional vector space over F and let A be a linear transformation in V. Let δ_1, …, δ_r be the invariant factors of χI − A; then A is similar to S(δ_1) ⊕ … ⊕ S(δ_r) acting in K_{δ_1} ⊕ … ⊕ K_{δ_r}.

PROOF A is similar to S(χI − A) and, since χI − A and diag(δ_1, …, δ_r, 1, …, 1) are equivalent, the result follows from Theorem 4-11.
In order to obtain convenient canonical matrix representations we study further the individual summands in the above direct sum. In the representation (4-32) of the module M the components in the direct sum are cyclic. In fact we have the following.
Theorem 4-20 Let M be a nontrivial cyclic F[λ]-module. Then either M is isomorphic to F[λ] or M is isomorphic to F[λ]/(δ) for some nonzero δ ∈ F[λ].
PROOF Let g be a generator of M; then the map φ defined by φ(p) = pg is a surjective F[λ]-module homomorphism. Hence M ≅ F[λ]/Ker φ. But Ker φ is an ideal in F[λ], which is principal, so Ker φ = (δ). The result depends on whether δ is zero or not.

To relate the above result to linear transformations we define a linear transformation A in a finite dimensional vector space V to be cyclic if V, with the F[λ]-module structure induced by (4-33), is cyclic. A generator b ∈ V of this module will be called a cyclic vector for A. Clearly A is cyclic with cyclic vector b if and only if b, Ab, A²b, … span all of V.
If A is cyclic then, by Theorem 4-18, V is isomorphic to F[λ]/(m_A) for some m_A ∈ F[λ]. The polynomial m_A is the minimal polynomial of A. Clearly we must have dim V = deg m_A in the case of cyclic transformations. Thus the set B = {b, Ab, …, A^{n−1}b} is a basis for V as a vector space. Let the minimal polynomial m_A be given by

m_A(λ) = λⁿ + μ_{n−1}λ^{n−1} + … + μ_0    (4-34)
then with respect to the basis B the transformation A has the following matrix representation:

[0 0 … 0 −μ_0    ]
[1 0 … 0 −μ_1    ]
[0 1 … 0 −μ_2    ]    (4-35)
[⋮ ⋮ ⋱  ⋮  ⋮     ]
[0 0 … 1 −μ_{n−1}]

The matrix (4-35) is called the companion matrix of the polynomial m_A. Conversely, a matrix of the form (4-35) is clearly cyclic and has m_A as characteristic and minimal polynomial. Combining this matrix representation of cyclic transformations with Theorem 4-19 we have the following.
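A sketch of (4-35) (sympy assumed; the helper companion is ours) verifying that the companion matrix of m_A has m_A as its characteristic polynomial and e_1 as a cyclic vector:

```python
from sympy import Matrix, Poly, symbols

lam = symbols('lambda')

def companion(m):
    # Companion matrix (4-35) of a monic polynomial m: ones on the
    # subdiagonal, the negatives of mu_0, ..., mu_{n-1} in the last column.
    p = Poly(m, lam)
    n = p.degree()
    mu = p.all_coeffs()[::-1]          # mu_0, ..., mu_{n-1}, leading 1
    C = Matrix.zeros(n, n)
    for i in range(n):
        if i > 0:
            C[i, i - 1] = 1
        C[i, n - 1] = -mu[i]
    return C

m = lam**3 + 2*lam**2 - lam - 2
C = companion(m)

# The companion matrix has m as its characteristic polynomial ...
assert (lam * Matrix.eye(3) - C).det().expand() == m
# ... and is cyclic: e_1, C e_1, C^2 e_1 span F^3.
e1 = Matrix([1, 0, 0])
assert Matrix.hstack(e1, C * e1, C * C * e1).det() != 0
```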
Theorem 4-21 Let A be a linear transformation in a finite dimensional vector space V and let

δ_i(λ) = λ^{n_i} + d^{(i)}_{n_i−1}λ^{n_i−1} + … + d^{(i)}_0    (4-36)

be its invariant factors; then there exists a basis in V such that with respect to that basis A has the block diagonal form

[A_1       ]
[   ⋱      ]    (4-37)
[       A_r]
where A_i is the companion matrix of δ_i. The matrix (4-37) is called the first canonical form of A. Generally a further reduction is possible.
Lemma 4-22 Let p = p_1 ⋯ p_k with p_i ∈ F[λ]; then diag(p, 1, …, 1) and diag(p_1, …, p_k) are equivalent if and only if, for i ≠ j, p_i and p_j are relatively prime.

PROOF Denote by π_i the polynomial π_i = p/p_i. Assume equivalence of diag(p, 1, …, 1) and diag(p_1, …, p_k); then diag(p, 1, …, 1) is the Smith form of diag(p_1, …, p_k) and hence, by Corollary 3-2, the g.c.d. of π_1, …, π_k is 1, which implies the coprimeness of p_i and p_j for i ≠ j. Conversely, if the p_i are pairwise coprime then the g.c.d. of π_1, …, π_k is 1, which implies that p is the only nontrivial invariant factor of diag(p_1, …, p_k), and hence the equivalence.
Corollary 4-23 Let p = p_1 ⋯ p_k be a factorization of a polynomial p in F[λ] into relatively prime polynomials. Then S(p) is similar to S(p_1) ⊕ … ⊕ S(p_k).

Lemma 4-24 Let π ∈ F[λ] be irreducible; then there is a basis B in K_{π^r} such that with respect to it S(π^r) has the r × r block matrix representation

[P        ]
[N P      ]
[  N P    ]    (4-38)
[    ⋱ ⋱  ]
[      N P]

where P is the q × q companion matrix of π and N is the q × q matrix with a single 1 in the upper right corner:

[0 … 0 1]
[0 … 0 0]    (4-39)
[⋮     ⋮]
[0 … 0 0]

PROOF Let π(λ) = λ^q + p_{q−1}λ^{q−1} + … + p_0. Thus dim K_{π^r} = rq. Since the elements of K_{π^r} are all polynomials of degree ≤ qr − 1, to choose a basis it suffices to choose a set of polynomials of degrees ascending from zero to qr − 1.
The set of polynomials

B = {e_{jq+i} = χ^{i−1}π^j | 0 ≤ j ≤ r − 1, 1 ≤ i ≤ q}

satisfies this requirement and hence is a basis for K_{π^r}. With respect to this basis S(π^r) has the required form.
Corollary 4-25 If π(λ) = λ − a then S(π^r) has a matrix representation of the form

[a       ]
[1 a     ]    (4-40)
[  ⋱ ⋱   ]
[    1 a ]
Combining all previous results we have the Jordan canonical form theorem.
Theorem 4-26 Let F be an algebraically closed field and let A be a linear transformation in a finite dimensional vector space V over F. Then for a suitable choice of basis A has a matrix representation of the form

[J_1     0 ]
[    ⋱     ]    (4-41)
[0      J_k]

where each J_i is a Jordan block of the form

[a_i       0 ]
[1   a_i     ]    (4-42)
[    ⋱   ⋱   ]
[0      1 a_i]
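For concreteness, a hedged sketch (sympy assumed) computing the Jordan form of a rational matrix; note that sympy returns blocks with the 1's above the diagonal, the transpose of the convention in (4-42) — the two forms are similar:

```python
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])     # eigenvalue 2 with a 2 x 2 block, eigenvalue 3

P, J = A.jordan_form()      # A = P J P^{-1}, J block diagonal
assert P * J * P.inv() == A
assert sorted(J[i, i] for i in range(3)) == [2, 2, 3]
```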
5. LINEAR SYSTEMS

A discrete time constant (time invariant) linear system Σ consists of three vector spaces U, X, and Y over a field F and a triple (A, B, C) of linear maps A: X → X, B: U → X, and C: X → Y. The system (A, B, C) is taken to represent the pair of dynamical equations

x_{n+1} = Ax_n + Bu_n
y_n = Cx_n    (5-1)
The space U is referred to as the input space, Y as the output space, and X as the state space. In the rest of this chapter we assume all three spaces are finite
dimensional. The dimension of the system Σ, denoted by dim Σ, is defined by dim Σ = dim X. If m = dim U and p = dim Y then we have an m-input p-output system. Through a choice of bases in U and Y we may, without loss of generality, assume U = F^m and Y = F^p. Sometimes the second equation in (5-1) is replaced by y_n = Cx_n + Du_n. The introduction of the linear map D: U → Y does not affect the dynamical behaviour of the system and hence will be omitted whenever convenient.
The description of a linear system by way of equations (5-1) is an explicit dynamic description and is referred to as an internal description. Given an internal description, the external description of the system, that is, the input/output behaviour of Σ, is easily determined.
To do this right it is convenient to indicate each input choice by the time at which it has been applied. The time axis, which in our case is the set of integers Z, is mapped in a one-to-one way onto the powers of the indeterminate λ such that k → λ^k. Thus we will denote by u_{−k}λ^k an input u_{−k} that has been applied at time t = −k. An output y_j occurring at time t = j will be denoted by y_jλ^{−j}. This choice is a convention adopted for historical reasons, mainly to get compatibility with the theory of z-transforms.
Since we are interested in sequences of inputs we use Σ_j u_jλ^{−j} to denote a sequence of inputs where u_j is applied at time t = j. Assuming the system to be initially at rest, that is, x_0 = 0, the application of a single input u_0 at time t = 0 produces the state evolution x_n = A^{n−1}Bu_0 and hence a sequence of outputs y_n = CA^{n−1}Bu_0. Thus to the input Σ_{−k≤j} u_jλ^{−j} corresponds the output Σ_{−k+1≤l} y_lλ^{−l} where

y_l = Σ_{0≤j} CA^jBu_{l−j−1}    (5-2)
Problems of convergence, which have no meaning in this context, do not arise, as each sum has at most 1 + k + l nonzero terms. Thus the system Σ induces a map f: U((λ^{-1})) → Y((λ^{-1})) given by

f(Σ_{-k≤j≤l} u_jλ^{-j}) = Σ_{-k+1≤l} y_lλ^{-l}        (5-3)
where the y_l are given by Eq. (5-2). The map f is called the input/output map of Σ or the result of Σ.

Lemma 5-1 The result f of the system Σ = (A, B, C) is an F[λ]-module homomorphism of U((λ^{-1})) into Y((λ^{-1})) which satisfies

f(U[[λ^{-1}]]) ⊂ λ^{-1}Y[[λ^{-1}]]        (5-4)
Condition (5-4) is nothing but an expression of the causality of the system Σ. Actually, as we saw in Sec. 2, U((λ^{-1})) and Y((λ^{-1})) are also F((λ^{-1}))-modules and it is easily verified that f is actually an F((λ^{-1}))-module homomorphism. Let j: U[λ] → U((λ^{-1})) be the inclusion map of U[λ] into U((λ^{-1})) and π_- the canonical projection of Y((λ^{-1})) onto Y((λ^{-1}))/Y[λ], identified with λ^{-1}Y[[λ^{-1}]]. Then the map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] can be defined via the commutative diagram

U[λ]       --f̄-->   λ^{-1}Y[[λ^{-1}]]
  |j                       ↑π_-
U((λ^{-1})) --f-->   Y((λ^{-1}))        (5-5)
We call f̄ the restricted input/output map of Σ. Clearly f̄ is an F[λ]-module homomorphism and, together with f, it is uniquely determined by Σ.

As a consequence we define a strictly causal input/output map f to be an F[λ]-module homomorphism of U((λ^{-1})) into Y((λ^{-1})) which satisfies f(U[[λ^{-1}]]) ⊂ λ^{-1}Y[[λ^{-1}]]. Similarly a restricted input/output map f̄ is defined to be an F[λ]-module homomorphism of U[λ] into λ^{-1}Y[[λ^{-1}]]. Given a restricted input/output map f̄ there exists a unique strictly causal input/output map f which makes diagram (5-5) commutative. We simply define

f(Σ_{-k≤j} u_jλ^{-j}) = Σ_{-k≤j} λ^{-j}f̄(u_j)        (5-6)

and the right-hand side is easily seen to be a well-defined element of Y((λ^{-1})). The preceding discussion leads directly to the introduction of transfer functions.
Lemma 5-2 Let f: U((λ^{-1})) → Y((λ^{-1})) be an F((λ^{-1}))-homomorphism. Then f has a unique representation as multiplication by an element T of (U, Y)F((λ^{-1})), that is,

f(u) = Tu    for    u ∈ U((λ^{-1}))        (5-7)

If f is strictly causal then T ∈ λ^{-1}(U, Y)F[[λ^{-1}]].

If f is the input/output map of the system Σ then its representing multiplier T_Σ is called the transfer function of the system Σ. In terms of the transfer function T_Σ the restricted input/output map f̄ is given by

f̄(u) = π_-(T_Σu)    for    u ∈ U[λ]        (5-8)
For the system (A, B, C) the transfer function is easily computed to be

T_Σ(λ) = Σ_{i=0}^∞ CA^iBλ^{-i-1} = C(λI - A)^{-1}B        (5-9)

Finally if T(λ) = Σ_{i=0}^∞ T_iλ^{-i-1} is an element of λ^{-1}(U, Y)F[[λ^{-1}]] then we say a system Σ = (A, B, C) is a realization of T if T = T_Σ. This is equivalent to

T_i = CA^iB    for    i ≥ 0        (5-10)
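The identity (5-9)/(5-10) can be checked numerically: the Markov parameters CA^iB are the expansion coefficients of C(λI - A)^{-1}B at infinity. A small sketch with an illustrative system of our choosing:

```python
import numpy as np

# Sketch of (5-9)/(5-10): the coefficients T_i of the formal expansion
# T(l) = sum_i C A^i B l^{-i-1} are the Markov parameters C A^i B.
# The matrices are illustrative choices, not taken from the text.
A = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def markov(i):
    """T_i = C A^i B, the i-th coefficient of C(lI - A)^{-1} B."""
    return (C @ np.linalg.matrix_power(A, i) @ B).item()

# Cross-check against C(lI - A)^{-1} B evaluated at a point l of large modulus:
# the truncated series sum_{i<N} T_i l^{-i-1} should approximate it closely.
l = 10.0
exact = (C @ np.linalg.inv(l * np.eye(2) - A) @ B).item()
series = sum(markov(i) * l ** (-i - 1) for i in range(60))
```

The truncation converges here because the spectral radius of A (namely 2) is smaller than |l| = 10.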
Similarly, given an abstract restricted input/output map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]], we say Σ = (A, B, C) is a realization of f̄ if f̄ coincides with the input/output map of Σ.
6. REACHABILITY, OBSERVABILITY, AND REALIZATIONS

Let Σ = (A, B, C) be a finite dimensional constant linear system. We say that Σ is reachable if given any state x in X there is a sequence of inputs u_0, u_1, ..., u_k which drives the system from x = 0 to x, in other words x = Σ_{i=0}^k A^{k-i}Bu_i. We say that Σ is observable if given any nonzero state x in X at least one of the observations of the free motion y_k = CA^kx is nonzero. Alternately stated, observability of Σ is equivalent to

∩_{i≥0} Ker CA^i = {0}        (6-1)

whereas reachability of Σ is equivalent to

∩_{i≥0} Ker B*A*^i = {0}        (6-2)
Here A*: X* → X* and B*: X* → U* are the maps dual to A and B, respectively. Define now the reachability map R: U[λ] → X and the observability map O: X → λ^{-1}Y[[λ^{-1}]] by

R(Σ_{0≤i} λ^iu_{-i}) = Σ_{0≤i} A^iBu_{-i}        (6-3)

and

Ox = Σ_{i=0}^∞ λ^{-i-1}CA^ix        (6-4)

respectively.
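Over F = R the criteria (6-1) and (6-2) reduce to rank tests on finite matrices, since by the Cayley-Hamilton theorem only the powers A^0, ..., A^{n-1} matter. A sketch with an assumed example system:

```python
import numpy as np

# Matrix versions of the reachability and observability maps over F = R:
# reachability of (A, B) amounts to rank [B, AB, ..., A^{n-1}B] = n and
# observability of (C, A) to rank [C; CA; ...; CA^{n-1}] = n.
# (A, B, C) below is an illustrative example, not from the text.
A = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

# Finite reachability matrix [B, AB, ..., A^{n-1}B] and observability matrix.
reach = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
obs = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])

reachable = np.linalg.matrix_rank(reach) == n
observable = np.linalg.matrix_rank(obs) == n
```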
If we consider X as an F[λ]-module by way of definition (4-17) then R and O are F[λ]-module homomorphisms. Clearly the system Σ is reachable if and only if its reachability map R is surjective, and observable if and only if its observability map is injective. We say that a realization Σ of an input/output map f̄ is canonical if Σ is both reachable and observable. In terms of the reachability and observability maps the restricted input/output map of Σ can be factored as follows:

U[λ]    --f̄-->    λ^{-1}Y[[λ^{-1}]]
     R ↘         ↗ O
            X                 (6-5)

Conversely, given an F[λ]-module homomorphism f̄: U[λ] → λ^{-1}Y[[λ^{-1}]], any factorization of the form (6-5) with R and O being F[λ]-module homomorphisms is called canonical if R is surjective and O injective. Assume now that f̄ = OR is a canonical factorization of f̄; then Ker f̄ = Ker R. Thus we get the F[λ]-module isomorphism X ≅ U[λ]/Ker f̄. Similarly we get, by surjectivity of O, the isomorphism X ≅ Range f̄. Thus each of the F[λ]-modules U[λ]/Ker f̄ and Range f̄ can serve as the state space of the system.

The preceding discussion serves to define a realization of a restricted input/output map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] as a factorization (6-5), f̄ = OR, into a product of F[λ]-module homomorphisms. This definition of a realization is compatible with our previous definition of a realization in terms of triples (A, B, C). Thus, given a factorization f̄ = OR as above, we let A: X → X be the action of λ in X, B: U → X is defined to be the restriction of R to U as naturally embedded in U[λ]. Finally we let C: X → Y be
defined by Cx = (Ox)_1 where (Ox)(λ) = Σ_j λ^{-j}(Ox)_j. The triple (A, B, C) defined in this manner is a realization of f̄.

Of course the identification of realizations by triples and realizations by factorizations allows us to obtain abstract realizations of input/output maps. In fact let f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] be a restricted input/output map. Consider the quotient F[λ]-module X = U[λ]/Ker f̄ and let R be the canonical projection of U[λ] onto X; then obviously R is a surjective F[λ]-homomorphism. Similarly define a map O from X into λ^{-1}Y[[λ^{-1}]] by

O(Ru) = f̄(u)    for    u ∈ U[λ]        (6-6)

Since R is onto X, O is defined on all of X and is easily checked to be an injective F[λ]-homomorphism. It is well defined as Ker R = Ker f̄. Thus we obtained a canonical factorization of f̄ and hence a canonical realization of f̄.

The preceding discussion allows us to characterize those input/output maps arising out of finite dimensional realizations.

Theorem 6-1 Let U, Y be finite dimensional vector spaces over F. An element T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] is the transfer function of a finite dimensional constant linear system if and only if it is rational.
PROOF Assume T is the transfer function of the finite dimensional linear system (A, B, C). Let d_A be the characteristic polynomial of A. Then, by Cramer's rule, d_AT ∈ (U, Y)F[λ], that is, T is rational.

Conversely assume T is rational, that is, there exists a polynomial d such that dT ∈ (U, Y)F[λ]. It suffices to prove that U[λ]/Ker f̄ is a torsion module. Now for u ∈ U[λ] we have f̄(u) = π_-(Tu). This implies f̄(du) = π_-(Tdu) = π_-((dT)u) = 0, that is, dU[λ] ⊂ Ker f̄, and hence U[λ]/Ker f̄ is a finitely generated torsion module with d as annihilator.
7. HANKEL MATRICES

Analogous to the matrix representation of a linear transformation we have a matrix representation of input/output maps of constant linear systems.
Given the vector spaces U and Y we let U* denote the set of finitely nonzero sequences in U and Y^ω the set of infinite sequences in Y. U* and Y^ω are given F-linear space structure by the usual definitions of multiplication by scalars and coordinatewise addition. We induce an F[λ]-module structure in U* and Y^ω by defining the action of λ to be the right shift σ in U* and the left shift σ̃ in Y^ω, that is

σ(u_0, u_1, ..., u_n, 0, ...) = (0, u_0, ..., u_n, 0, ...)        (7-1)

and

σ̃(y_0, y_1, ...) = (y_1, y_2, ...)        (7-2)
We define now two maps ρ: U[λ] → U* and ρ': λ^{-1}Y[[λ^{-1}]] → Y^ω by

ρ(Σ_{i=0} u_iλ^i) = (u_0, u_1, ..., u_n, 0, ...)        (7-3)

and

ρ'(Σ_{i=0}^∞ y_iλ^{-i-1}) = (y_0, y_1, ...)        (7-4)

and it is easily checked that both ρ and ρ' are F[λ]-module isomorphisms. Now given a restricted input/output map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] with associated transfer function T(λ) = Σ_{i=0}^∞ T_iλ^{-i-1}, we define a map H_f: U* → Y^ω as the unique F[λ]-homomorphism which makes the diagram
U[λ]   --f̄-->   λ^{-1}Y[[λ^{-1}]]
 ρ|                    |ρ'
 U*    --H_f-->      Y^ω        (7-5)

commutative. The fact that H_f is an F[λ]-homomorphism is equivalent to H_f being linear and satisfying

σ̃H_f = H_fσ        (7-6)
If u ∈ U* and y ∈ Y^ω is given through y = H_fu then a simple computation shows that H_f has the block matrix representation, also denoted by H_f, given by

        [ T_0  T_1  T_2  ... ]
H_f  =  [ T_1  T_2  T_3  ... ]        (7-7)
        [ T_2  T_3  T_4  ... ]
        [  .    .    .       ]

with T_i ∈ (U, Y)F. We call H_f the Hankel matrix associated with the input/output map f̄. In general any block matrix of the form (7-7) is called a Hankel matrix. Clearly there is a bijective correspondence between Hankel matrices and causal input/output maps and hence also with transfer functions.
The isomorphisms ρ and ρ' induce the pair of F[λ]-isomorphisms U[λ]/Ker f̄ ≅ U*/Ker H_f and Range f̄ ≅ Range H_f. This indicates that the Hankel matrix associated with a given input/output map can be used for realization purposes. This approach will be used extensively in the infinite dimensional setting. In the finite dimensional case the Ho algorithm [10, 11] is one example of realization based on Hankel matrix data.

The characterization of Hankel matrices associated with finite dimensional realizations is a direct consequence of Theorem 6-1. We only remark that rank H is defined to be the dimension of its range space.

Theorem 7-1 Let H be the Hankel matrix associated with a transfer function T. Then H has finite rank if and only if T is rational.
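A finite section of the Hankel matrix (7-7) makes Theorem 7-1 concrete: for a rational T realized by a 2-dimensional system, every finite section has rank at most 2. The example data is ours, not from the text:

```python
import numpy as np

# Finite-section check of Theorem 7-1 for a rational (scalar) transfer
# function realized by an illustrative 2-state system.
A = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Markov parameters T_i = C A^i B (here scalars).
T = [(C @ np.linalg.matrix_power(A, i) @ B).item() for i in range(16)]

# N x N section of the Hankel matrix (7-7): H[i, j] = T_{i+j}.
N = 8
H = np.array([[T[i + j] for j in range(N)] for i in range(N)])
hankel_rank = np.linalg.matrix_rank(H)
```

The rank of the section equals the dimension of the (canonical) realization, anticipating Corollary 10-4.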
8. SIMULATION AND ISOMORPHISM

As we have seen, a given system (A, B, C) in state space form determines a unique input/output map. While the converse is not true, we can still come up with a great deal of information concerning the relation between different realizations provided extra assumptions are made. The central result of this section is Theorem 8-4, better known as the state space isomorphism theorem.
Let f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] and f̄_1: U_1[λ] → λ^{-1}Y_1[[λ^{-1}]] be two restricted input/output maps. We say that f̄ is simulated by f̄_1, and write f̄ | f̄_1, if there exist two F[λ]-module homomorphisms φ: U[λ] → U_1[λ] and ψ: λ^{-1}Y_1[[λ^{-1}]] → λ^{-1}Y[[λ^{-1}]] which make the following diagram commutative:

U[λ]     --f̄-->    λ^{-1}Y[[λ^{-1}]]
 φ|                      ↑ψ
U_1[λ]   --f̄_1-->  λ^{-1}Y_1[[λ^{-1}]]        (8-1)

It is clear that simulation is a transitive relation.
Next we introduce a division relation among transfer functions. Let T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] and T_1 ∈ λ^{-1}(U_1, Y_1)F[[λ^{-1}]] be two transfer functions. Then we say that T divides T_1, written T | T_1, if there exist polynomial functions Φ, Ψ, and Π in (U, U_1)F[λ], (Y_1, Y)F[λ], and (U, Y)F[λ], respectively, for which

T = ΨT_1Φ + Π        (8-2)

holds. Both relations, of simulation and of division, are reflexive and transitive.
The following theorem relates simulation to the division relation among transfer functions.
Theorem 8-1 Let f̄ and f̄_1 be two restricted input/output maps having finite dimensional realizations and let T and T_1 be their corresponding transfer functions. Then f̄ is simulated by f̄_1 if and only if T divides T_1.
PROOF Assume T | T_1. As a consequence there exist Φ, Ψ, and Π such that (8-2) holds. Define F[λ]-homomorphisms φ: U[λ] → U_1[λ] and ψ: λ^{-1}Y_1[[λ^{-1}]] → λ^{-1}Y[[λ^{-1}]] by φ(u) = Φu and ψ(y) = π_-(Ψy). Then for u ∈ U[λ]

f̄(u) = π_-(Tu) = π_-((ΨT_1Φ + Π)u) = π_-(Ψπ_-(T_1Φu)) = ψf̄_1φ(u)

or f̄ | f̄_1.

Conversely assume f̄ | f̄_1, that is, f̄ = ψf̄_1φ. Now every F[λ]-homomorphism φ: U[λ] → U_1[λ] is of the form φ(u) = Φu for some Φ ∈ (U, U_1)F[λ]. As for ψ, we restrict it to the range of f̄_1, which is a finitely generated torsion submodule of λ^{-1}Y_1[[λ^{-1}]]. By Corollary 4-16 there exists an extension ψ̃: λ^{-1}Y_1[[λ^{-1}]] → λ^{-1}Y[[λ^{-1}]] which has the form ψ̃(y) = π_-(Ψy) for some Ψ ∈ (Y_1, Y)F[λ]. Clearly f̄ = ψf̄_1φ = ψ̃f̄_1φ and so for u ∈ U[λ] we have

f̄(u) = π_-(Tu) = π_-(Ψπ_-(T_1Φu)) = π_-(ΨT_1Φu)

and this implies (8-2).
Theorem 8-2 Let f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] and f̄_1: U_1[λ] → λ^{-1}Y[[λ^{-1}]] be two restricted input/output maps having finite dimensional canonical factorizations f̄ = OR and f̄_1 = O_1R_1 through the F[λ]-modules X and X_1, respectively. Then there exists an injective F[λ]-homomorphism θ: X → X_1 for which the diagram

X     --θ-->     X_1
   O ↘         ↙ O_1
     λ^{-1}Y[[λ^{-1}]]        (8-3)

is commutative, that is O = O_1θ, if and only if

Range f̄ ⊂ Range f̄_1        (8-4)

PROOF Assume such a homomorphism θ exists. Since the factorizations of f̄ and f̄_1 are canonical we have Range O = Range f̄ as well as Range O_1 = Range f̄_1. Since O = O_1θ it follows that Range O ⊂ Range O_1 and so (8-4) follows.
Conversely assume (8-4) holds and consider the homomorphisms f̃ and f̃_1 induced by f̄ and f̄_1 in U[λ]/Ker f̄ and U_1[λ]/Ker f̄_1, respectively. Clearly f̃ and f̃_1 are injective and Range f̃ ⊂ Range f̃_1. By Lemma 4-1(b) there exists a, necessarily injective, homomorphism φ̃: U[λ]/Ker f̄ → U_1[λ]/Ker f̄_1 for which f̃ = f̃_1φ̃ and which, by Theorem 4-9, can be lifted to an F[λ]-homomorphism φ: U[λ] → U_1[λ] making the diagram (8-5) commutative.

By Lemma 4-1(a) there exists a uniquely determined F[λ]-homomorphism θ: X → X_1 which satisfies R_1φ = θR. This implies O_1θR = O_1R_1φ = f̄_1φ = f̄ = OR. As R is surjective we obtain O_1θ = O, and from the injectivity of O it follows that θ too is injective.

The next theorem contains the dual result.
Theorem 8-3 Let f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] and f̄_1: U[λ] → λ^{-1}Y_1[[λ^{-1}]] be two restricted input/output maps having finite dimensional canonical factorizations f̄ = OR and f̄_1 = O_1R_1 through the F[λ]-modules X and X_1, respectively. Then there exists a surjective F[λ]-homomorphism θ: X_1 → X for which the diagram

         U[λ]
   R_1 ↙      ↘ R
X_1     --θ-->     X        (8-6)

is commutative, that is θR_1 = R, if and only if

Ker f̄_1 ⊂ Ker f̄        (8-7)

PROOF Suppose such a homomorphism θ exists. Since θR_1 = R it follows that Ker R_1 ⊂ Ker R, which implies (8-7) by our assumption that the factorizations of f̄ and f̄_1 are canonical.
Conversely assume (8-7) holds. Range f̄ and Range f̄_1 are finitely generated torsion submodules of λ^{-1}Y[[λ^{-1}]] and λ^{-1}Y_1[[λ^{-1}]], respectively. By Lemma 4-1(a) there exists an F[λ]-homomorphism ψ: Range f̄_1 → Range f̄ which satisfies ψf̄_1 = f̄. By Theorem 4-15 ψ can be lifted to an F[λ]-homomorphism ψ̃: λ^{-1}Y_1[[λ^{-1}]] → λ^{-1}Y[[λ^{-1}]] which still satisfies ψ̃f̄_1 = f̄. From this we obtain Range ψ̃O_1 ⊂ Range O and, as O is injective, it follows from Lemma 4-1(b) that there exists an F[λ]-homomorphism θ: X_1 → X for which Oθ = ψ̃O_1. Finally OθR_1 = ψ̃O_1R_1 = ψ̃f̄_1 = f̄ = OR and by the injectivity of O the equality θR_1 = R follows. This proves the commutativity of diagram (8-6). Finally, since R is surjective, the equality θR_1 = R shows that θ must be surjective too.
As a corollary to the two preceding theorems we obtain the state space isomorphism theorem in two equivalent versions.

Theorem 8-4

(a) Let f̄ = OR and f̄ = O_1R_1 be two finite dimensional canonical realizations of the restricted input/output map f̄ with state modules X and X_1, respectively. Then there is an F[λ]-module isomorphism θ: X → X_1 which makes the diagram

U[λ]    --f̄-->    λ^{-1}Y[[λ^{-1}]]
  R ↘  R_1          O ↗  O_1
       X  --θ-->  X_1        (8-8)

commutative, that is, θR = R_1 and O = O_1θ.

(b) Let (A, B, C) and (A_1, B_1, C_1) be two canonical realizations of the restricted input/output map f̄. Then there exists an invertible linear transformation P: X → X_1 which makes the diagram

U  --B-->   X  --A-->   X  --C-->  Y
 |          |P          |P         |
U --B_1--> X_1 --A_1-> X_1 --C_1-> Y        (8-9)

commutative, that is, PA = A_1P, PB = B_1, and C = C_1P.
PROOF

(a) By Theorems 8-2 and 8-3 there exist an injective homomorphism θ: X → X_1 and a surjective homomorphism θ̄: X_1 → X which satisfy θR = R_1, O_1θ = O, θ̄R_1 = R, and Oθ̄ = O_1. It follows from these that θθ̄R_1 = θR = R_1 and, by the surjectivity of R_1, that θθ̄ = I_{X_1}. Similarly θ̄θR = θ̄R_1 = R and so θ̄θ = I_X. These two relations show that θ and θ̄ are actually isomorphisms.

(b) We induce in X and X_1 an F[λ]-module structure in the natural way. Let R and R_1 be the reachability maps of the two realizations and O and O_1 their observability maps. By (a) there exists an F[λ]-isomorphism θ: X → X_1 for which θR = R_1 and O = O_1θ. Let P = θ be considered as an F-linear map. The equality PA = A_1P follows from the fact that θ is an F[λ]-homomorphism. The equalities PB = B_1 and C = C_1P follow from θR = R_1 and O = O_1θ, respectively.
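Theorem 8-4(b) can be illustrated numerically. For single-input canonical systems the finite reachability matrices are square and invertible, so the intertwining map P can be recovered as P = R_1R^{-1}; the matrices below are illustrative assumptions of ours:

```python
import numpy as np

# Illustration of Theorem 8-4(b): two canonical realizations of one
# input/output map are similar, and here P is computed from the finite
# reachability matrices as P = R_1 R^{-1}.
A = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

S = np.array([[1.0, 1.0], [0.0, 1.0]])      # an arbitrary change of basis
A1, B1, C1 = S @ A @ np.linalg.inv(S), S @ B, C @ np.linalg.inv(S)

def reach_matrix(A, B):
    """Finite reachability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

P = reach_matrix(A1, B1) @ np.linalg.inv(reach_matrix(A, B))

# Verify the intertwining relations PA = A1 P, PB = B1, C = C1 P.
ok = (np.allclose(P @ A, A1 @ P) and np.allclose(P @ B, B1)
      and np.allclose(C, C1 @ P))
```

Since the second realization was obtained from the first by the basis change S, the recovered P coincides with S.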
9. TRANSFER FUNCTIONS AND THEIR FACTORIZATIONS

Let U and Y be finite dimensional vector spaces over F. We consider an input/output map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] for which U[λ]/Ker f̄ is a torsion module and hence the associated transfer function is rational. Now scalar rational functions have representations as quotients of polynomials, and such a representation is essentially unique if we assume the numerator and denominator to be coprime. A similar situation exists in the general case.

Theorem 9-1 Let T be a proper rational function in λ^{-1}(U, Y)F[[λ^{-1}]]. Then T has the representations

T = Θ/ψ        (9-1)

where Θ ∈ (U, Y)F[λ] and ψ ∈ F[λ],

T = D^{-1}N        (9-2)

where D ∈ (Y, Y)F[λ], det D ≠ 0, and N ∈ (U, Y)F[λ], and

T = N_1D_1^{-1}        (9-3)

where D_1 ∈ (U, U)F[λ], det D_1 ≠ 0, and N_1 ∈ (U, Y)F[λ]. If we assume ψ to be monic and coprime with the g.c.d. of the elements of any matrix representation of Θ then ψ is unique. Similarly, if D and N are left coprime then they are unique up to a common left unimodular factor. Analogously for D_1 and N_1 with right coprimeness assumed. If the coprimeness assumptions are satisfied then we refer to (9-2) and (9-3) as coprime factorizations.
PROOF We consider the following sets:

J = {ψ ∈ F[λ] | ψT ∈ (U, Y)F[λ]}
J_L = {P ∈ (Y, Y)F[λ] | PT ∈ (U, Y)F[λ]}
J_R = {Q ∈ (U, U)F[λ] | TQ ∈ (U, Y)F[λ]}

Obviously J is an ideal in F[λ], J_L a left ideal in (Y, Y)F[λ], and J_R a right ideal in (U, U)F[λ]. We claim all three ideals are nontrivial; this is essentially equivalent to our definition of rationality. Thus, by the principality of F[λ], J = ψF[λ] for some ψ which is unique up to a constant factor. Thus ψT = Θ for some Θ ∈ (U, Y)F[λ] and hence (9-1). Now ψI belongs to J_L and J_R, which are therefore full ideals. By Theorem 2-11 we have J_L = (Y, Y)F[λ]D and J_R = D_1(U, U)F[λ]. Since J_L and J_R are full we have det D ≠ 0 and det D_1 ≠ 0. Thus DT = N for some N ∈ (U, Y)F[λ] and (9-2) follows. The uniqueness follows essentially from Corollary 2-10. The statement about factorization (9-3) is proved analogously.

An alternative approach is to consider the restricted input/output map f̄: U[λ] → λ^{-1}Y[[λ^{-1}]] given by f̄(u) = π_-(Tu) for all u ∈ U[λ]. By the rationality of T, U[λ]/Ker f̄ is a finitely generated torsion module. Hence by Theorem 2-12 Ker f̄ = D_1U[λ] for some nonsingular D_1 ∈ (U, U)F[λ]. Therefore for each u ∈ U[λ]

f̄(D_1u) = π_-(TD_1u) = 0

This implies the existence of N_1 ∈ (U, Y)F[λ] such that TD_1 = N_1, which is equivalent to (9-3). Similarly we can consider Range f̄ = {π_-(Tu) | u ∈ U[λ]} as a submodule of λ^{-1}Y[[λ^{-1}]]. Now λ^{-1}Y[[λ^{-1}]] is also a left (Y, Y)F[λ]-module with the composition (A, y) ↦ π_-(Ay). The set {A ∈ (Y, Y)F[λ] | π_-(Ay) = 0 for all y ∈ Range f̄} is obviously a left ideal in (Y, Y)F[λ] and hence has a representation as (Y, Y)F[λ]D for some, necessarily nonsingular, D in (Y, Y)F[λ]. Thus π_-(DTu) = 0 for all u ∈ U[λ], which implies DT = N and hence the factorization (9-2).
10. REALIZATION THEORY

While the abstract question of realization has been trivially solved, the availability of the canonical models, the factorization of rational transfer functions, and the characterization of intertwining operators for our canonical models allow us to construct some explicit realizations and study their relations.

Thus let T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] be rational and let (9-2) and (9-3) be factorizations of T.
Let K_D and S(D) be defined by (2-7) and (4-11), respectively. Define B: U → K_D by

(Bξ)(λ) = N(λ)ξ        (10-1)

and let C: K_D → Y be defined by

Cf = (D^{-1}f)_1    for    f ∈ K_D        (10-2)

where (D^{-1}f)(λ) = Σ_n (D^{-1}f)_nλ^{-n} is the formal expansion of D^{-1}f which, by Lemma 2-15, belongs to λ^{-1}Y[[λ^{-1}]].

Theorem 10-1 The system (S(D), B, C) defined above is a realization of the transfer function T which is observable. It is reachable if and only if (D, N)_L = I.
PROOF To begin, the map B is actually a map into K_D as D^{-1}Nξ ∈ λ^{-1}Y[[λ^{-1}]] for all ξ ∈ U, by another application of Lemma 2-15. Let L denote the set of elements of the form Σ_i S(D)^iBξ_i. L is the set of reachable states in K_D; it is a submodule of K_D and L + DY[λ] is a submodule of Y[λ]. By Theorem 2-1 we have L + DY[λ] = EY[λ] for some E ∈ (Y, Y)F[λ]. Since DY[λ] ⊂ EY[λ] we have D = EG, and as Nξ ∈ EY[λ] for all ξ ∈ U we have N = EM. Thus reachability is equivalent to the left coprimeness of D and N, that is, to (D, N)_L = I.

To show observability assume f ∈ K_D and CS(D)^nf = 0 for all n ≥ 0. This means that

(D^{-1}π_Dλ^nf)_1 = (D^{-1}Dπ_-D^{-1}λ^nf)_1 = (π_-λ^nD^{-1}f)_1 = 0

But this implies (D^{-1}f)_n = 0 for all n and hence f = 0.

To show that we have actually a realization let T(λ) = Σ_{i=0}^∞ λ^{-i-1}T_i be the formal expansion of T. It suffices to show that CS(D)^iBξ = T_iξ. Let ξ ∈ U; then

CS(D)^iBξ = (D^{-1}π_Dλ^iNξ)_1 = (π_-λ^iD^{-1}Nξ)_1 = (π_-λ^iTξ)_1 = T_iξ

which proves the statement.
The second factorization of T gives rise to another realization. So assume T = N_1D_1^{-1}. The equality N_1D_1^{-1} = D^{-1}N is equivalent to

ND_1 = DN_1        (10-3)

Since (D, N)_L = I and (D_1, N_1)_R = I it follows from Theorem 4-11 that the map X: K_{D_1} → K_D given by

Xf = π_DNf    for    f ∈ K_{D_1}        (10-4)

is an F[λ]-module isomorphism. Define now maps B_1: U → K_{D_1} and
C_1: K_{D_1} → Y in such a way that the diagram

U --B_1--> K_{D_1} --S(D_1)--> K_{D_1} --C_1--> Y
     B ↘      |X                  |X        ↗ C
             K_D   ---S(D)--->   K_D        (10-5)

is commutative. The commutativity is equivalent to XB_1 = B and C_1 = CX. We check that B_1 is given by

B_1ξ = π_{D_1}ξ        (10-6)

for

XB_1ξ = π_DNB_1ξ = π_DNπ_{D_1}ξ = π_DNξ = Nξ = Bξ

Also for every f ∈ K_{D_1} we have

C_1f = CXf = (D^{-1}Xf)_1 = (D^{-1}π_DNf)_1 = (D^{-1}Dπ_-D^{-1}Nf)_1 = (π_-Tf)_1

or

C_1f = (π_-Tf)_1        (10-7)

That (S(D_1), B_1, C_1) is a canonical realization is clear from the invertibility of X and the commutativity of the diagram. This can be verified directly, as

C_1S(D_1)^nB_1ξ = (π_-Tπ_{D_1}λ^nπ_{D_1}ξ)_1 = (π_-λ^nTξ)_1 = T_nξ

Summarizing, we obtained the following.
Theorem 10-2 The system (S(D_1), B_1, C_1) defined above is a realization of the transfer function T which is reachable. It is observable if and only if (D_1, N_1)_R = I.

We consider two special cases. First, let T = D^{-1}N and

N(λ) = N_0 + N_1λ + ... + N_{k-1}λ^{k-1}

and

D(λ) = D_0 + D_1λ + ... + D_{k-1}λ^{k-1} + Iλ^k
It is easily checked that

K_D = {a_0 + a_1λ + ... + a_{k-1}λ^{k-1} | a_i ∈ Y}

If we make the identification a_0 + a_1λ + ... + a_{k-1}λ^{k-1} ↦ col(a_0, ..., a_{k-1}) then we have the representations

         [ 0  0  ...  0  -D_0     ]
         [ I  0  ...  0  -D_1     ]
S(D) ↦  [ .  .        .   .      ]
         [ 0  0  ...  I  -D_{k-1} ]

B ↦ col(N_0, N_1, ..., N_{k-1})    and    C ↦ (0 ... 0 I)
and for that reason we call the realization (S(D), B, C) the standard observable realization. In the same fashion if T = N_1D_1^{-1} and

N_1(λ) = N_0 + ... + N_{k-1}λ^{k-1}

and

D_1(λ) = D_0 + ... + D_{k-1}λ^{k-1} + Iλ^k

then with the same coordinatization of K_{D_1} we have

           [ 0  0  ...  0  -D_0     ]
           [ I  0  ...  0  -D_1     ]
S(D_1) ↦  [ .  .        .   .      ]
           [ 0  0  ...  I  -D_{k-1} ]

B_1 ↦ col(I, 0, ..., 0)    and    C_1 ↦ (T_0 T_1 ... T_{k-1})
We call the realization (S(D_1), B_1, C_1) the standard controllable realization. The above construction should be compared, for example, with [15].
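For a scalar transfer function T = N/D with D monic the standard observable realization can be written down directly. The coordinatization below is one concrete choice, consistent with C reading off the top coefficient, and the polynomial data is illustrative:

```python
import numpy as np

# Sketch of the standard observable realization for a scalar T = N/D with
# D monic of degree k: the state space is polynomials of degree < k, S(D) is
# a companion matrix for D, B carries the coefficients of N, and C reads off
# the top coefficient. The layout of S(D) is our coordinatization.
D = [2.0, -3.0]          # D(l) = l^2 - 3l + 2, coefficients D_0, D_1
N = [0.0, 1.0]           # N(l) = l,            coefficients N_0, N_1
k = len(D)

S = np.zeros((k, k))
for i in range(1, k):
    S[i, i - 1] = 1.0            # multiplication by l shifts coefficients up...
S[:, k - 1] -= np.array(D)       # ...and reduces modulo D
B = np.array(N).reshape(k, 1)
C = np.zeros((1, k)); C[0, k - 1] = 1.0

# The Markov parameters T_i = C S(D)^i B are the expansion coefficients of N/D.
T = [(C @ np.linalg.matrix_power(S, i) @ B).item() for i in range(5)]
```

For this data the expansion of λ/(λ² - 3λ + 2) begins λ^{-1} + 3λ^{-2} + 7λ^{-3} + ..., which the Markov parameters reproduce.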
Given a rational transfer function T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] we define the McMillan degree of T, denoted by δ(T), as the dimension of the state space of any canonical realization of T. The realization results of this section can be used to link the McMillan degree of T with its coprime factorizations.
Theorem 10-3 Let T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] be rational and let it have the coprime factorizations (9-2) and (9-3). Then the McMillan degree of T is given by

δ(T) = deg(det D) = deg(det D_1)        (10-8)

PROOF The realizations constructed in Theorem 10-1 and Theorem 10-2 are canonical and use K_D and K_{D_1} as state spaces, respectively. The dimensions of K_D and K_{D_1} are given by deg(det D) and deg(det D_1), respectively.

Since the range of the input/output map induced by T has the same dimension as the range of the Hankel matrix induced by T we immediately obtain a characterization of the McMillan degree in terms of Hankel matrices.

Corollary 10-4 Let T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] be rational; then we have the equality

δ(T) = rank H_T        (10-9)
We note here two important properties of the McMillan degree.

Theorem 10-5

(a) Let T_1 and T_2 be rational elements of λ^{-1}(U, Y)F[[λ^{-1}]]. Then

δ(T_1 + T_2) ≤ δ(T_1) + δ(T_2)        (10-10)

(b) Let T_1 and T_2 be rational elements of λ^{-1}(Y, Z)F[[λ^{-1}]] and λ^{-1}(U, Y)F[[λ^{-1}]], respectively. Then

δ(T_1T_2) ≤ δ(T_1) + δ(T_2)        (10-11)

PROOF If X_1 and X_2 denote the state spaces of canonical realizations of T_1 and T_2, respectively, then X_1 ⊕ X_2 can be taken as the state space of a not necessarily canonical realization of T_1 + T_2 as well as of T_1T_2, simply by joining the two canonical realizations of T_1 and T_2 in parallel or in series. Hence the inequalities follow.
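The parallel and series constructions in the proof of Theorem 10-5 can be sketched as block matrix operations on the direct sum X_1 ⊕ X_2; the two one-dimensional example systems are illustrative:

```python
import numpy as np

# Joining canonical realizations of T1 and T2 on X1 (+) X2 gives a
# realization of T1 + T2 (parallel) resp. T1*T2 (series), so the McMillan
# degree is at most d(T1) + d(T2).
def parallel(sys1, sys2):
    (A1, B1, C1), (A2, B2, C2) = sys1, sys2
    A = np.block([[A1, np.zeros((A1.shape[0], A2.shape[1]))],
                  [np.zeros((A2.shape[0], A1.shape[1])), A2]])
    return A, np.vstack([B1, B2]), np.hstack([C1, C2])

def series(sys1, sys2):          # output of sys2 feeds sys1: realizes T1*T2
    (A1, B1, C1), (A2, B2, C2) = sys1, sys2
    A = np.block([[A1, B1 @ C2],
                  [np.zeros((A2.shape[0], A1.shape[1])), A2]])
    B = np.vstack([np.zeros((A1.shape[0], B2.shape[1])), B2])
    return A, B, np.hstack([C1, np.zeros((1, A2.shape[1]))])

sys1 = (np.array([[0.5]]), np.array([[1.0]]), np.array([[1.0]]))   # T1 = 1/(l-0.5)
sys2 = (np.array([[-0.5]]), np.array([[1.0]]), np.array([[1.0]]))  # T2 = 1/(l+0.5)

def markov(sys, i):
    A, B, C = sys
    return (C @ np.linalg.matrix_power(A, i) @ B).item()

sums = [markov(parallel(sys1, sys2), i) for i in range(4)]   # expansion of T1+T2
prods = [markov(series(sys1, sys2), i) for i in range(4)]    # expansion of T1*T2
```

Here T_1 + T_2 = 2λ/(λ² - 1/4) and T_1T_2 = 1/(λ² - 1/4), and the computed Markov parameters match their expansions.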
11. POLYNOMIAL SYSTEM MATRICES

The object of this section is to make contact with linear system theory as developed by Rosenbrock. We have seen in Sec. 9 that any rational transfer function T ∈ λ^{-1}(U, Y)F[[λ^{-1}]] has left and right coprime factorizations given by (9-2) and (9-3), respectively, and with each factorization there is a naturally associated state space realization. We consider now more general factorizations. Assume the transfer function
T has a representation of the form

T = VD^{-1}W + Q        (11-1)

where W ∈ (U, X)F[λ], D ∈ (X, X)F[λ], V ∈ (X, Y)F[λ], and Q ∈ (U, Y)F[λ]. Here X is another finite dimensional vector space over F. No assumptions concerning the left coprimeness of D and W or the right coprimeness of D and V are made. With each factorization (11-1) of a transfer function T we associate a block matrix of the form

P = [  D   W ]
    [ -V   Q ]        (11-2)
and call such a matrix a polynomial system matrix. In analogy with the constructions of the previous sections we use (11-1) as the basis for a state space realization of T. We take K_D as our state space and define maps B: U → K_D and C: K_D → Y by

Bξ = π_DWξ    for    ξ ∈ U        (11-3)

and

Cf = (VD^{-1}f)_1    for    f ∈ K_D        (11-4)

We claim that (S(D), B, C) is a realization of T. So if T(λ) = Σ_{n=0}^∞ T_nλ^{-n-1} we will show that T_n = CS(D)^nB. This follows from the following computation:

CS(D)^nBξ = CS(D)^nπ_DWξ = Cπ_Dλ^nπ_DWξ = (VD^{-1}π_Dλ^nπ_DWξ)_1
= (VD^{-1}Dπ_-D^{-1}λ^nWξ)_1 = (Vπ_-D^{-1}Wλ^nξ)_1 = (VD^{-1}Wλ^nξ)_1 = T_nξ

We call the realization (S(D), B, C) constructed above the state space model associated with the polynomial system matrix P. From the previous section we know that (S(D), B, C) is reachable if and only if (D, W)_L = I and observable if and only if (D, V)_R = I.
Assume now that T = VD^{-1}W + Q = V_1D_1^{-1}W_1 + Q_1 are two different representations of the transfer function T. The dimensions of the spaces X and X_1 are not necessarily equal. To the two factorizations we associate two realizations (S(D), B, C) and (S(D_1), B_1, C_1), respectively, where B_1 and C_1 are defined by formulas analogous to (11-3) and (11-4), respectively. Let us assume now that the two state space realizations are similar and study the effect of this on the relation between the two polynomial system matrices. By similarity there exists an invertible linear map Z: K_D → K_{D_1} for which ZS(D) = S(D_1)Z, ZB = B_1, and C_1Z = C hold. The structure of transformations that intertwine two canonical models is given by Theorem 4-9. Thus there exist M
and M_1 in (X, X_1)F[λ] for which

MD = D_1M_1        (11-5)

and Z is given by

Zf = π_{D_1}Mf    for    f ∈ K_D        (11-6)

Since Z is assumed invertible we must have (D_1, M)_L = I and (D, M_1)_R = I. Next we consider the relation between the respective input and output maps. Since we have ZB = B_1, then for every ξ ∈ U

π_{D_1}W_1ξ = π_{D_1}Mπ_DWξ = π_{D_1}MWξ        (11-7)

The last equality follows from (11-5), which is equivalent to

MDX[λ] ⊂ D_1X_1[λ]        (11-8)

From (11-7) it follows that

π_{D_1}(W_1 - MW)ξ = 0    for all    ξ ∈ U        (11-9)

and this implies the existence of an L_1 ∈ (U, X_1)F[λ] such that

W_1 - MW = D_1L_1        (11-10)

or

W_1 = MW + D_1L_1        (11-11)

Similarly we have C_1Z = C and more generally CS(D)^n = C_1S(D_1)^nZ. Therefore we get for f ∈ K_D

(VD^{-1}π_Dλ^nf)_1 = (V_1D_1^{-1}π_{D_1}λ^nπ_{D_1}Mf)_1

or

(VD^{-1}λ^nf)_1 = (V_1π_-D_1^{-1}Mλ^nf)_1

which implies in turn

((V - V_1M_1)D^{-1}λ^nf)_1 = 0    for all    n ≥ 0

Hence (V - V_1M_1)D^{-1} is necessarily equal to some K ∈ (X, Y)F[λ]. For this K we have

V - V_1M_1 = KD        (11-13)
It is clear that the equalities (11-5), (11-11), and (11-13) are equivalent to the matrix equality

[ M  0 ] [  D   W ]   [  D_1   W_1 ] [ M_1  -L_1 ]
[ K  I ] [ -V   Q ] = [ -V_1   Q_1 ] [  0     I  ]        (11-14)
That KW + Q = V_1L_1 + Q_1 follows from the fact that

KW - V_1L_1 = (V - V_1M_1)D^{-1}W - V_1D_1^{-1}(W_1 - MW)
= VD^{-1}W - V_1M_1D^{-1}W - V_1D_1^{-1}W_1 + V_1D_1^{-1}MW = Q_1 - Q

Here we use the fact that

VD^{-1}W + Q = V_1D_1^{-1}W_1 + Q_1 = T

whereas the equality

V_1M_1D^{-1}W = V_1D_1^{-1}MW

is equivalent to (11-5).
The converse result holds also. If two polynomial system matrices

P = [  D   W ]        and        P_1 = [  D_1   W_1 ]
    [ -V   Q ]                         [ -V_1   Q_1 ]

are connected via (11-14), with the coprimeness conditions (M, D_1)_L = I and (M_1, D)_R = I holding, then the two respective state space models are similar. This motivates the following definition.

Definition 11-1 Let two polynomial system matrices P and P_1, as above, be given. We say P and P_1 are strictly system equivalent if (11-14) holds together with the coprimeness conditions (M, D_1)_L = I and (M_1, D)_R = I.

As a direct consequence of the previous discussion and definition we have

Theorem 11-2 Two polynomial system matrices are strictly system equivalent if and only if their associated state space models are similar.

We remark that, as similarity is an equivalence relation, it follows that strict system equivalence is also an equivalence relation.
12. GENERALIZED RESULTANT THEOREM

Throughout this chapter the coprimeness of polynomial matrices has played a central role. Thus it seems appropriate to give some effective ways of determining the left or right coprimeness of two polynomial matrices. A classical result of Sylvester gives a simple criterion, in terms of the nonsingularity of the resultant matrix, for the coprimeness of two polynomials. As motivation for the more general results of this section we review the classical result.
Lemma 12-1 Let p, q ∈ F[λ]. Then p and q are coprime if and only if

F[λ]/pqF[λ] = p{F[λ]/qF[λ]} + q{F[λ]/pF[λ]}        (12-1)

Here we identify the quotient ring elements with their unique representatives of lowest degree.

PROOF Assume p and q are coprime; then for every f ∈ F[λ] there exist a, b ∈ F[λ] for which f = ap + bq. Thus f mod(pq) = p(a mod q) + q(b mod p), or

F[λ]/pqF[λ] ⊂ p{F[λ]/qF[λ]} + q{F[λ]/pF[λ]}

The converse inclusion holds by a dimensionality argument. Conversely, assume now the equality (12-1). In particular there exist polynomials a and b such that 1 = ap + bq, but this is equivalent to the coprimeness of p and q.

Assume now that
p(λ) = p_0 + p_1λ + ... + p_nλ^n

and

q(λ) = q_0 + q_1λ + ... + q_mλ^m        (12-2)

then F[λ]/pF[λ] is isomorphic to F_n[λ], the set of all polynomials of degree less than n, with the multiplication being modulo p. Similarly F[λ]/qF[λ] is isomorphic to F_m[λ]. The following follows easily from Lemma 12-1.

Corollary 12-2 Let the polynomials p and q in F[λ] be given by (12-2). Then p and q are coprime if and only if

F_{m+n}[λ] = pF_m[λ] + qF_n[λ]
Theorem 12-3 Let p and q be given by (12-2). Then p and q are coprime if and only if det R(p, q) ≠ 0, where R(p, q) is the resultant matrix

           [ p_0  p_1  ...  p_n                  ]
           [      p_0  p_1  ...  p_n             ]
           [             .               .       ]
R(p, q) =  [           p_0  p_1  ...   p_n       ]        (12-3)
           [ q_0  q_1  ...  q_m                  ]
           [      q_0  q_1  ...  q_m             ]
           [             .               .       ]
           [           q_0  q_1  ...   q_m       ]

consisting of m rows of shifted p-coefficients followed by n rows of shifted q-coefficients.

PROOF By Corollary 12-2, p and q are coprime if and only if the set

B = {λ^ip | i = 0, ..., m-1} ∪ {λ^jq | j = 0, ..., n-1}

is a basis for F_{m+n}[λ]. In terms of the polynomial coefficients this is equivalent to det R(p, q) ≠ 0.
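The resultant matrix (12-3) is easy to assemble and test numerically; the helper and the two example polynomial pairs below are illustrative:

```python
import numpy as np

# Sylvester resultant matrix of Theorem 12-3: m rows of shifted
# p-coefficients followed by n rows of shifted q-coefficients;
# det R(p, q) != 0 iff p and q are coprime.
def resultant(p, q):
    """p, q: coefficient lists [c_0, ..., c_deg] in ascending order."""
    n, m = len(p) - 1, len(q) - 1
    R = np.zeros((m + n, m + n))
    for i in range(m):                 # rows for l^i * p, i = 0, ..., m-1
        R[i, i:i + n + 1] = p
    for j in range(n):                 # rows for l^j * q, j = 0, ..., n-1
        R[m + j, j:j + m + 1] = q
    return R

# (l-1)(l-2) and (l-3) share no root: coprime, nonzero determinant.
d1 = np.linalg.det(resultant([2.0, -3.0, 1.0], [-3.0, 1.0]))
# (l-1)(l-2) and (l-1) share the root 1: the resultant vanishes.
d2 = np.linalg.det(resultant([2.0, -3.0, 1.0], [-1.0, 1.0]))
```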
A somewhat similar criterion for the coprimeness of two polynomials p and q follows directly from Theorem 4-8.

Theorem 12-4 Given p and q in F[λ], p and q are coprime if and only if det p(Q) ≠ 0, where Q is the companion matrix of q.

PROOF By Theorem 4-8, p and q are coprime if and only if p(S(q)) is invertible. But S(q) is similar to Q and hence the result.
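Theorem 12-4 in computational form: with Q the companion matrix of q, coprimality of p and q is decided by det p(Q) ≠ 0. A sketch with assumed example polynomials:

```python
import numpy as np

# Coprimality test of Theorem 12-4: p and q are coprime iff p(Q) is
# invertible, Q being the companion matrix of (monic) q.
def companion(q):
    """Companion matrix of monic q given as [c_0, ..., c_{m-1}, 1.0]."""
    m = len(q) - 1
    Q = np.zeros((m, m))
    Q[1:, :-1] = np.eye(m - 1)
    Q[:, -1] = -np.array(q[:-1])
    return Q

def polyval_matrix(p, M):
    """Evaluate the polynomial p (ascending coefficients) at the matrix M."""
    result = np.zeros_like(M)
    for c in reversed(p):            # Horner's rule
        result = result @ M + c * np.eye(M.shape[0])
    return result

q = [2.0, -3.0, 1.0]                 # q(l) = (l-1)(l-2)
coprime = abs(np.linalg.det(polyval_matrix([-3.0, 1.0], companion(q)))) > 1e-9  # p = l-3
shared  = abs(np.linalg.det(polyval_matrix([-1.0, 1.0], companion(q)))) > 1e-9  # p = l-1
```

Note the determinant involved is of order deg q, smaller than the (m+n)-order resultant above.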
This result involves calculating a determinant of lower order than the resultant. In view of Theorem 10-1 it has an obvious interpretation in terms of controllability.

We now pass to the generalized result. Let D_1 and D_2 be two nonsingular polynomial matrices in F^{n×n}[λ] and let M_i = D_iF^n[λ] be the corresponding full submodules. Define M by M = M_1 ∩ M_2; then M is also a full submodule and hence has a representation M = DF^n[λ] for some nonsingular D. Since M ⊂ M_i there exist polynomial matrices E_i for which the equalities

D = D_1E_1 = D_2E_2        (12-4)
hold.

Theorem 12-5

(a) The polynomial matrices D_1 and D_2 are left coprime if and only if the equality

det D = det D_1 · det D_2        (12-5)

holds up to a constant factor on one side. The left coprimeness of D_1 and D_2 implies the right coprimeness of E_1 and E_2 in (12-4).
(b) The equality

F^n[λ]/DF^n[λ] = D_1{F^n[λ]/E_1F^n[λ]} + D_2{F^n[λ]/E_2F^n[λ]}        (12-6)

holds if and only if D_1 and D_2 are left coprime.

That this generalizes the resultant theorem is obvious from a comparison with Lemma 12-1.
PROOF Suppose D_1 and D_2 are left coprime. By Theorem 2-8 there exist polynomial matrices G_1 and G_2 such that I = D_1G_1 + D_2G_2. Therefore every f ∈ F^n[λ] has a representation

f = D_1G_1f + D_2G_2f = D_1f_1 + D_2f_2

If we apply the projection π_D of F^n[λ] onto K_D and use the equalities (12-4), then

π_Df = Dπ_-D^{-1}f = D_1E_1π_-E_1^{-1}D_1^{-1}D_1f_1 + D_2E_2π_-E_2^{-1}D_2^{-1}D_2f_2 = D_1π_{E_1}f_1 + D_2π_{E_2}f_2

Therefore we get the inclusion K_D ⊂ D_1K_{E_1} + D_2K_{E_2}. To prove the converse
LINEAR ALGEBRA AND FINITE DIMENSIONAL LINEAR SYSTEMS
49
inclusion it suffices, by symmetry, to show that D_1K_{E_1} ⊂ K_D, and hence the equality (12-6) is proved. From the proof it is clear that the inclusion

D_1K_{E_1} + D_2K_{E_2} ⊂ K_D        (12-7)

always holds.

We consider the rational function D_2^{-1}D_1, which to begin with we assume to be proper. By Theorem 9-1 there exist polynomial matrices F_1 and F_2 which are right coprime and for which

D_2^{-1}D_1 = F_2F_1^{-1}        (12-8)

which is equivalent to

D_1F_1 = D_2F_2        (12-9)

Since clearly D_1F_1F^n[λ] ⊂ D_1F^n[λ] and D_2F_2F^n[λ] ⊂ D_2F^n[λ], it follows from (12-9) that

D_1F_1F^n[λ] ⊂ D_1F^n[λ] ∩ D_2F^n[λ] = DF^n[λ]
Thus for some polynomial matrix G we have D_1F_1 = D_2F_2 = DG, or

DG = D_1E_1G = D_2E_2G

and hence F_1 = E_1G and F_2 = E_2G. But F_1 and F_2 are assumed to be right coprime and hence necessarily G is unimodular. The unimodularity of G now implies also the right coprimeness of E_1 and E_2.

We recall that we assume D_2^{-1}D_1 to be a proper rational matrix. We apply now the realization theory developed in Sec. 10 to deduce the similarity of S(D_2) and S(E_1). This in turn implies the equivalence of D_2 and E_1 and hence in particular the equality

det D_2 = det E_1        (12-10)
holds. Using (12-10) and (12-4) the equality (12-5) follows.

To prove the converse half of the theorem we assume D_1 and D_2 to have a nontrivial greatest common left divisor L. L is determined only up to a right unimodular matrix. Thus we have

D_1 = LC_1   and   D_2 = LC_2        (12-11)

and C_1, C_2 are left coprime. Now

D_1F^n[λ] ∩ D_2F^n[λ] = L{C_1F^n[λ] ∩ C_2F^n[λ]} = LD'F^n[λ] = DF^n[λ]

and det D' = det C_1 · det C_2.
Clearly

det D = det L · det D' = det L · det C_1 · det C_2 ≠ det D_1 · det D_2

Similarly the equality (12-6) cannot hold, by a dimensionality argument. As linear spaces, the dimension of F^n[λ]/DF^n[λ] is equal to the degree of the polynomial det D = det L · det D', whereas the dimension of D_1{F^n[λ]/E_1F^n[λ]} + D_2{F^n[λ]/E_2F^n[λ]} is equal to the degree of det D'. Since L is not unimodular there cannot be equality.
We indicate now how to remove the restriction that D_2^{-1}D_1 is proper rational. Let π_- be the projection of F^{n×n}((λ^{-1})) onto λ^{-1}F^{n×n}[[λ^{-1}]]. Then we define A_1 by A_1 = D_2π_-D_2^{-1}D_1. Obviously D_2^{-1}A_1 is proper rational and D_1 = A_1 + D_2R for some R in F^{n×n}[λ]. Since the conditions (D_1, D_2)_L = I and (A_1, D_2)_L = I are equivalent, there exist F_1 and F_2 satisfying (F_1, F_2)_R = I and D_2^{-1}A_1 = F_2F_1^{-1}. From the first part of the proof we have det D_2 = det F_1. Now A_1F_1 = D_2F_2 implies D_1E_1 = D_2E_2, where E_1 = F_1 and E_2 = F_2 + RF_1, and the right coprimeness of E_1 and E_2 follows from that of F_1 and F_2.
13. FEEDBACK

We conclude this chapter by a short study of feedback, feedback equivalence, and canonical forms obtained through the use of feedback. Let (A, B) be a reachable pair with A ∈ (V, V)_F and B ∈ (U, V)_F, where U and V are finite dimensional vector spaces over F. If we augment the dynamical equation

x_{t+1} = Ax_t + Bu_t        (13-1)

by the identity readout map

y_t = x_t        (13-2)

then the transfer function of the triple (A, B, I) is given by

T(λ) = (λI - A)^{-1}B        (13-3)

If we require the input to be a linear combination of a new input and the state, that is, we put

u_t = Kx_t + w_t        (13-4)

then the dynamic equation (13-1) is replaced by

x_{t+1} = (A + BK)x_t + Bw_t        (13-5)
Relation (13-4) is called a feedback law. We say that the pair (A + BK, B) has been obtained from (A, B) by state feedback. Clearly the applications of feedback form a commutative group. If we enlarge this group to the one generated by similarity transformations in U and V as well as state feedback, we obtain the noncommutative feedback group ℱ. Thus an element of ℱ is a triple of maps
(R, K, P) with R ∈ (V, V)_F and P ∈ (U, U)_F nonsingular and K ∈ (V, U)_F. The feedback group ℱ acts on a pair (A, B) by

(A, B) →^{(R,K,P)} (R^{-1}AR + R^{-1}BK, R^{-1}BP)        (13-6)

This implies that the group composition law is

(R, K, P) ∘ (R_1, K_1, P_1) = (RR_1, PK_1 + KR_1, PP_1)        (13-7)

and is associative, as it can be expressed in terms of matrix multiplications as follows

| R  0 | | R_1  0   |   |    RR_1        0   |
| K  P | | K_1  P_1 | = | KR_1 + PK_1   PP_1 |        (13-8)

From (13-8) it also follows that

(R, K, P)^{-1} = (R^{-1}, -P^{-1}KR^{-1}, P^{-1})        (13-9)

which shows that ℱ is a bona fide group. From the matrix representation of the feedback group it follows that every element of ℱ is the product of elements of three basic types, namely
(a) similarity, or change of basis, in the state space, (b) similarity, or change of basis, in the input space, and finally (c) pure feedbacks. This is clear from

(R, K, P) = (R, 0, I) ∘ (I, K, I) ∘ (I, 0, P)        (13-10)
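The composition law (13-7) and the inverse formula (13-9) can be verified directly through the matrix representation (13-8). The following sketch does so over the reals with NumPy, for a hypothetical state dimension 3 and input dimension 2 (random matrices are almost surely invertible).

```python
import numpy as np

rng = np.random.default_rng(0)

def elem(R, K, P):
    """Block matrix [[R, 0], [K, P]] representing the triple (R, K, P) as in (13-8)."""
    n, m = R.shape[0], P.shape[0]
    return np.block([[R, np.zeros((n, m))], [K, P]])

R, P = rng.normal(size=(3, 3)), rng.normal(size=(2, 2))
K = rng.normal(size=(2, 3))
R1, P1 = rng.normal(size=(3, 3)), rng.normal(size=(2, 2))
K1 = rng.normal(size=(2, 3))

# composition law (13-7): the matrix product realizes (RR1, KR1 + PK1, PP1)
lhs = elem(R, K, P) @ elem(R1, K1, P1)
rhs = elem(R @ R1, K @ R1 + P @ K1, P @ P1)
print(np.allclose(lhs, rhs))  # True

# inverse formula (13-9): (R, K, P)^{-1} = (R^{-1}, -P^{-1} K R^{-1}, P^{-1})
Rinv, Pinv = np.linalg.inv(R), np.linalg.inv(P)
print(np.allclose(elem(R, K, P) @ elem(Rinv, -Pinv @ K @ Rinv, Pinv), np.eye(5)))  # True
```

The block-triangular form makes the noncommutativity of ℱ visible: the (2,1) block of the product mixes K and K_1 asymmetrically.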
The feedback group ℱ induces a natural equivalence relation in the set of reachable pairs (A, B) with state space and input space given by V and U, respectively. Thus (A, B) and (A_1, B_1) are feedback equivalent if there exists an element of ℱ which transforms (A, B) into (A_1, B_1). It is easily checked that the relation of feedback equivalence is a proper equivalence relation. The equivalence classes are called orbits of the group, and we are interested in a characterization of orbits and of orbit invariants. Moreover, we would like to obtain a canonical way of choosing one element in each orbit, a canonical form, which exhibits the orbit invariants.

The situation regarding the question of feedback equivalence of two reachable pairs (A, B) and (A_1, B_1) is analogous to the problem of deciding when two linear transformations A and A_1 are similar. By Corollary 4-12 A and A_1 are similar if and only if λI - A and λI - A_1 are equivalent. By Corollary 3-2 and Lemma 2-20(b) this is equivalent to λI - A and λI - A_1 having the same invariant factors. This can be checked by bringing both λI - A and λI - A_1 to their Smith canonical forms.

For the methods that will be applied in this section we will have to relax slightly the notion of feedback equivalence. Thus if (A_1, B_1) is another reachable pair with state and input spaces given by V_1 and U_1, respectively, we say that
(A_1, B_1) is feedback equivalent to (A, B) if there exist invertible maps P: U_1 → U and R: V_1 → V such that (RA_1R^{-1}, RB_1P^{-1}) is feedback equivalent to (A, B).

The feedback group has been introduced through state space formalism. However, we intend to study it through the use of canonical models, polynomial system matrices, and the realization procedures of Sec. 11. To this end we introduce a generalized control canonical form.
Given a reachable pair (A, B), in the corresponding transfer function (λI - A)^{-1}B the factorization is left coprime. Associated with it is a right coprime factorization

(λI - A)^{-1}B = H(λ)D(λ)^{-1}        (13-11)

where the factors H and D are uniquely determined up to a right unimodular factor. By Theorem 11-2 it follows that the pair (S(D), π_D) is isomorphic to (A, B), and hence in the study of feedback we may as well start with the former. The factor D in (13-11) is a nonsingular element of (U, U)_F[λ] and hence has the representation

D(λ) = D_0 + D_1λ + ··· + D_sλ^s        (13-12)
As before we denote by π_+ and π_- the canonical projections of U((λ^{-1})) on U[λ] and λ^{-1}U[[λ^{-1}]], respectively. Then we define

U_s[λ^{-1}] = π_-λ^{-s}U[λ] = { Σ_{j=1}^s ξ_jλ^{-j} | ξ_j ∈ U }        (13-13)

We clearly have the following direct sum decomposition

λ^{-1}U[[λ^{-1}]] = U_s[λ^{-1}] ⊕ λ^{-s-1}U[[λ^{-1}]]        (13-14)

For every y ∈ λ^{-s-1}U[[λ^{-1}]] we have, since D has degree s,

π_+D(λ)y(λ) = 0        (13-15)

Hence to obtain all vectors in L_D, defined by (4-21), it suffices to consider the linear combinations of the vectors in U_s[λ^{-1}].
For 1 ≤ j ≤ s

π_+D(λ)(ξ/λ^j) = (D_j + D_{j+1}λ + ··· + D_sλ^{s-j})ξ        (13-16)

Let us define now s + 1 polynomials in (U, U)_F[λ] by

E_j(λ) = 0 for j = 0,   E_j(λ) = D_j + D_{j+1}λ + ··· + D_sλ^{s-j} for 1 ≤ j ≤ s        (13-17)

Equation (13-16) can be rewritten now as

π_+D(ξ/λ^j) = E_j(λ)ξ,   1 ≤ j ≤ s        (13-18)

and so

ξ/λ^j = π_-D^{-1}E_jξ        (13-19)

So for L_D we have the representation

L_D = { Σ_{j=1}^s π_-D^{-1}E_jξ_j | ξ_j ∈ U }        (13-20)

Multiplication by D maps L_D onto K_D and, recalling the definition of the projection π_D, we obtain

K_D = { Σ_{j=1}^s π_DE_j(λ)ξ_j | ξ_j ∈ U }        (13-21)
We shall call representation (13-21) of K_D the control representation of K_D. The usefulness of the control representation (13-21) of K_D becomes apparent in the study of the operator S(D). Indeed we have the following result.

Theorem 13-1 Let S(D): K_D → K_D be defined by (4-11) and let E_j be defined by (13-17). Then

S(D)π_DE_j(λ)ξ = π_DE_{j-1}(λ)ξ - π_DD_{j-1}ξ        (13-22)

PROOF

S(D)π_DE_j(λ)ξ = π_Dλπ_DE_j(λ)ξ = π_DλE_j(λ)ξ = π_D(E_{j-1}(λ) - D_{j-1})ξ = π_DE_{j-1}(λ)ξ - π_DD_{j-1}ξ

For j = 1 we have of course

S(D)π_DE_1(λ)ξ = -π_DD_0ξ        (13-23)
In order to obtain some feeling for the preceding theorem let us specialize to the case of a degree s monic polynomial D. Thus D(λ) = D_0 + ··· + D_{s-1}λ^{s-1} + Iλ^s. In this case

π_DE_j(λ)ξ = E_j(λ)ξ
This implies that

S(D)E_j(λ)ξ = E_{j-1}(λ)ξ - D_{j-1}ξ        (13-24)

Since K_D coincides with the set of all vector polynomials of degree at most s - 1, each such vector polynomial u(λ) can be uniquely expressed in the form u(λ) = Σ_{j=1}^s E_j(λ)ξ_j. If we map K_D bijectively onto U^s by mapping Σ_{j=1}^s E_j(λ)ξ_j into (ξ_1, ..., ξ_s), we obtain for S(D) the block matrix representation

|   0     I     0   ···    0      |
|   0     0     I   ···    0      |
|   ·     ·     ·          ·      |        (13-25)
|   0     0     0   ···    I      |
| -D_0  -D_1  -D_2  ···  -D_{s-1} |

But this is just the classical control canonical form for S(D).
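For a scalar monic D the representation (13-25) reduces to the ordinary companion matrix, whose eigenvalues are the zeros of det D(λ). The sketch below (NumPy; the helper name `control_canonical` is ours, and D is assumed monic with identity leading coefficient) builds the block matrix (13-25) from the coefficients D_0, ..., D_{s-1}.

```python
import numpy as np

def control_canonical(D):
    """Block matrix (13-25) representing S(D) for a monic matrix polynomial
    D(λ) = D_0 + D_1 λ + ... + I λ^s.

    D is a list [D_0, ..., D_{s-1}] of k x k coefficient matrices."""
    k, s = D[0].shape[0], len(D)
    A = np.zeros((k * s, k * s))
    A[:k * (s - 1), k:] = np.eye(k * (s - 1))   # identity blocks above the diagonal
    A[k * (s - 1):, :] = -np.hstack(D)          # bottom block row: -D_0, ..., -D_{s-1}
    return A

# scalar example: D(λ) = 2 - 3λ + λ² = (λ-1)(λ-2)
A = control_canonical([np.array([[2.0]]), np.array([[-3.0]])])
print(np.allclose(sorted(np.linalg.eigvals(A).real), [1.0, 2.0]))  # True
```

The eigenvalues 1 and 2 recovered here are exactly the zeros of det D(λ), in agreement with the spectral description of S(D).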
That D be monic is not necessary for π_DE_jξ = E_jξ to hold. In fact we have the following.

Lemma 13-2 Let D ∈ (U, U)_F[λ]. If D^{-1} is proper then

π_DE_jξ = E_jξ        (13-26)

for all ξ ∈ U and 1 ≤ j ≤ s.

PROOF

π_DE_jξ = Dπ_-D^{-1}E_jξ

and

D^{-1}E_jξ = D(λ)^{-1}(D(λ) - (D_0 + ··· + D_{j-1}λ^{j-1}))(ξ/λ^j) = ξ/λ^j - D(λ)^{-1}(D_0 + ··· + D_{j-1}λ^{j-1})(ξ/λ^j)

Now ξ/λ^j and (D_0 + ··· + D_{j-1}λ^{j-1})ξ/λ^j are strictly proper, whereas D(λ)^{-1} is proper by assumption. Thus also

D(λ)^{-1}(D_0 + ··· + D_{j-1}λ^{j-1})(ξ/λ^j)

is strictly proper, and hence

π_-D^{-1}E_jξ = D^{-1}E_jξ

and this implies (13-26).
The next theorem is a key result in the study of state feedback.

Theorem 13-3 Let Q, D ∈ (U, U)_F[λ] with D nonsingular and QD^{-1} strictly proper. Let D_1 = D + Q, and let E_j and E'_j be the polynomials associated with D and D_1, respectively, that are defined by (13-17). Then the map X defined by

Xf = π_+D_1D^{-1}f   for   f ∈ K_D        (13-27)

is an invertible map of K_D onto K_{D_1} that satisfies

Xπ_DE_jξ = π_{D_1}E'_jξ        (13-28)

PROOF Assume D(λ) = D_0 + D_1λ + ··· + D_sλ^s. Since D_1D^{-1} = I + QD^{-1} with QD^{-1} strictly proper, it follows that D_1(λ) = D'_0 + D'_1λ + ··· + D'_sλ^s with D'_s = D_s. Let D_1D^{-1} have the expansion

D_1(λ)D(λ)^{-1} = I + Γ_1λ^{-1} + Γ_2λ^{-2} + ···        (13-29)

or

D_1(λ) = (I + Γ_1λ^{-1} + Γ_2λ^{-2} + ···)D(λ)        (13-30)

By equating coefficients we obtain

D'_s = D_s
D'_{s-1} = D_{s-1} + Γ_1D_s
   ·
   ·
D'_0 = D_0 + Γ_1D_1 + ··· + Γ_sD_s        (13-31)

or in block matrix form

| D'_s     |   | I                     | | D_s     |
| D'_{s-1} |   | Γ_1   I               | | D_{s-1} |
|    ·     | = | Γ_2   Γ_1   I         | |    ·    |        (13-32)
| D'_0     |   | Γ_s   ···   Γ_1   I   | | D_0     |
Now

Xπ_DE_jξ = π_+D_1D^{-1}π_DE_jξ = π_+D_1D^{-1}(Dπ_-D^{-1}E_jξ) = π_+D_1π_-D^{-1}E_jξ
        = π_{D_1}π_+D_1D^{-1}E_jξ
        = π_{D_1}π_+{(I + Γ_1λ^{-1} + Γ_2λ^{-2} + ···)(D_j + D_{j+1}λ + ··· + D_sλ^{s-j})}ξ
        = π_{D_1}{(D_j + Γ_1D_{j+1} + ··· + Γ_{s-j}D_s) + (D_{j+1} + Γ_1D_{j+2} + ··· + Γ_{s-j-1}D_s)λ + ··· + D_sλ^{s-j}}ξ
        = π_{D_1}(D'_j + D'_{j+1}λ + ··· + D'_sλ^{s-j})ξ = π_{D_1}E'_jξ
This shows, by the control representations of K_D and K_{D_1}, that X maps K_D onto K_{D_1}. If we define Y on K_{D_1} by

Yg = π_+DD_1^{-1}g   for   g ∈ K_{D_1}        (13-33)

then it is easily checked that for f ∈ K_D

YXf = π_+DD_1^{-1}π_+D_1D^{-1}f = π_+DD_1^{-1}(I - π_-)D_1D^{-1}f
    = π_+DD_1^{-1}D_1D^{-1}f - π_+DD_1^{-1}π_-D_1D^{-1}f
    = f - π_+DD_1^{-1}π_-D_1D^{-1}f = f

as π_+DD_1^{-1}π_-D_1D^{-1}f = 0, the product of the proper DD_1^{-1} with the strictly proper π_-D_1D^{-1}f being strictly proper. We conclude that X is also injective, hence invertible. Necessarily X^{-1} = Y.
The map X defined by (13-27) relates also the projections π_D and π_{D_1} in a simple way.

Lemma 13-4 Let D and D_1 be as in Theorem 13-3, and let X: K_D → K_{D_1} be defined by (13-27). Then for every P ∈ (U, U)_F and ξ ∈ U we have

Xπ_DPξ = π_{D_1}Pξ        (13-34)

PROOF

Xπ_DPξ = π_+D_1D^{-1}π_DPξ = π_+D_1D^{-1}Dπ_-D^{-1}Pξ = π_+D_1π_-D^{-1}Pξ = π_{D_1}π_+D_1D^{-1}Pξ = π_{D_1}Pξ

As a corollary to Theorem 13-3 we can state the following result.
Theorem 13-5 With the notation of Theorem 13-3, the operator X: U[λ] → U[λ] defined by

Xf = π_+D_1D^{-1}f   for   f ∈ U[λ]        (13-35)

is an invertible map in U[λ].

PROOF We clearly have the direct sum decompositions U[λ] = K_D ⊕ DU[λ] = K_{D_1} ⊕ D_1U[λ]. We saw that X maps K_D bijectively onto K_{D_1}. Moreover it clearly maps DU[λ] bijectively onto D_1U[λ] and hence is invertible.
The following theorem enables us to study the effect of feedback transformations in terms of the coprime factorization of transfer functions and related polynomial system matrices. The result is due to Hautus and Heymann.

Theorem 13-6 Let (A, B), with A ∈ (V, V)_F and B ∈ (U, V)_F, be a reachable pair and let H(λ)D(λ)^{-1} be a right coprime factorization of (λI - A)^{-1}B. Then a necessary and sufficient condition for a reachable pair (A_1, B_1) to be feedback equivalent to (A, B) is that

(λI - A_1)^{-1}B_1 = RH(λ)(D(λ) + Q(λ))^{-1}P^{-1}        (13-36)

for some Q ∈ (U, U)_F[λ] for which QD^{-1} is strictly proper, and invertible maps R and P in (V, V)_F and (U, U)_F, respectively.

PROOF Assume T(λ) = (λI - A)^{-1}B = H(λ)D(λ)^{-1} is a right coprime factorization, and let (A_1, B_1) be feedback equivalent to (A, B). Thus there exist invertible maps R: V → V and P: U → U such that

A_1 = R(A + BK)R^{-1}   and   B_1 = RBP^{-1}

Hence

(λI - A_1)^{-1}B_1 = (R(λI - A - BK)^{-1}R^{-1})RBP^{-1} = R(λI - A - BK)^{-1}BP^{-1}

Now

(λI - A - BK)^{-1}B = [(λI - A)(I - (λI - A)^{-1}BK)]^{-1}B = (I - (λI - A)^{-1}BK)^{-1}(λI - A)^{-1}B = (I - T(λ)K)^{-1}T(λ)

But from the equality

T(λ)(I - KT(λ)) = (I - T(λ)K)T(λ)

it follows that

(I - T(λ)K)^{-1}T(λ) = T(λ)(I - KT(λ))^{-1}

Hence it follows that

T_f(λ) = (λI - A_1)^{-1}B_1 = RT(λ)(I - KT(λ))^{-1}P^{-1} = RH(λ)D(λ)^{-1}(I - KH(λ)D(λ)^{-1})^{-1}P^{-1} = RH(λ)(D(λ) - KH(λ))^{-1}P^{-1}
If we put Q(λ) = -KH(λ) then clearly

T_f(λ) = RH(λ)(D(λ) + Q(λ))^{-1}P^{-1}

and QD^{-1} = -KT is strictly proper. This proves the necessity part of the theorem.

To prove sufficiency it suffices to show that H(λ)(D(λ) + Q(λ))^{-1} = (λI - A_1)^{-1}B_1 for a pair (A_1, B_1) which is feedback equivalent to (A, B). In that case RH(λ)(D(λ) + Q(λ))^{-1}P^{-1} is associated with (RA_1R^{-1}, RB_1P^{-1}).
Let D_1 = D + Q; then by Theorem 13-3 the map X: K_D → K_{D_1} defined by (13-27) is invertible and its inverse Y = X^{-1} is given by (13-33). The realization procedure of Sec. 11 associates with the factorizations HD^{-1} and HD_1^{-1} realizations in which the state and input operators are (S(D), π_D) and (S(D_1), π_{D_1}), respectively. Thus for our purposes we have to show that (S(D), π_D) and (S(D_1), π_{D_1}) are feedback equivalent, or that for some invertible map Y: K_{D_1} → K_D and K: K_D → U we have

S(D) - YS(D_1)Y^{-1} = BK        (13-37)

where B: U → K_D is given by Bξ = π_Dξ for ξ ∈ U. Clearly (13-37) is equivalent to

S(D)Y - YS(D_1) = BK_1        (13-38)

By Lemma 4-1, applied with the vector space structure, it suffices to show that

Range(S(D)Y - YS(D_1)) ⊂ Range B        (13-39)

and it is this we shall prove.

From the control representations of K_{D_1} and K_D we know that they are spanned by vectors of the form π_{D_1}E'_jξ and π_DE_jξ, respectively. Thus it suffices to show (13-39) for vectors of this form. Using (13-28), (13-34), and (13-22) we have

(S(D)Y - YS(D_1))π_{D_1}E'_jξ = S(D)π_DE_jξ - Y{π_{D_1}E'_{j-1}ξ - π_{D_1}D'_{j-1}ξ}
    = {π_DE_{j-1}ξ - π_DD_{j-1}ξ} - {π_DE_{j-1}ξ - π_DD'_{j-1}ξ}
    = -π_D(D_{j-1} - D'_{j-1})ξ

which proves the assertion.
At this point it will be convenient to fix bases in U and V, respectively. Thus we can identify U and V with F^m and F^n, respectively, and A and B will denote n × n and n × m matrices, respectively. Without loss of generality we will assume B to be injective, that is, of full column rank. Now with the reachable pair (S(D), π_D) we associate the polynomial matrix (D  I), which is just the upper row of Rosenbrock's polynomial system matrix. Using Theorem 13-6 and previously obtained results on invertible module homomorphisms between canonical models, namely Theorems 4-9 and 4-11, we can state the following.
Theorem 13-7 Let D, Q, N, and M be m × m polynomial matrices with D nonsingular, QD^{-1} strictly proper, and N and M unimodular. Let P be an invertible constant m × m matrix. Then (D  I) and (N(D + Q)M  NP) are associated with feedback equivalent pairs.

PROOF We consider the following special cases of Theorems 4-9 and 4-11:

D_1 = ND        (13-40)

and

D_1 = DM        (13-41)

Let the maps X: K_D → K_{D_1} and Y: K_D → K_{D_1} be defined by

Xf = π_{D_1}Nf        (13-42)

and

Yf = π_{D_1}f        (13-43)

respectively. Then both maps are invertible, as the required coprimeness conditions are trivially satisfied. To see how the input operators are transformed we check that

Xπ_Dξ = π_{D_1}Nπ_Dξ = π_{D_1}Nξ   whereas   Yπ_Dξ = π_{D_1}π_Dξ = π_{D_1}ξ

In terms of polynomial matrices we have the equivalence of (D  I) with either (DM  I) or (ND  N). Combining this with Theorem 13-6 the result follows.
We recall that left (right) multiplication by a unimodular matrix is equivalent to a finite series of elementary row (column) operations. Our next object is to use the freedom of Theorem 13-7 to reduce (D  I) to canonical form. To this end we introduce the notion of column properness. Let D(λ) be an m × m nonsingular polynomial matrix with columns d^{(1)}(λ), ..., d^{(m)}(λ). We define the degree of d^{(i)}(λ), deg d^{(i)}(λ), to be the degree of the highest degree element in d^{(i)}(λ). D(λ) is called column proper if deg det D(λ) = Σ_{i=1}^m deg d^{(i)}(λ).

Theorem 13-8 Let D be an m × m nonsingular polynomial matrix. Then there exists a unimodular polynomial matrix M such that DM is column proper. If f^{(i)} are the columns of DM we may assume without loss of generality that, with κ_i = deg f^{(i)}, κ_1 ≥ κ_2 ≥ ··· ≥ κ_m ≥ 0.
PROOF Given a nonsingular m × m polynomial matrix D we denote by D_h the constant matrix whose ith column d_h^{(i)} consists of the coefficients of the terms of highest degree in the ith column of D. It is clear that D is column proper if and only if det D_h ≠ 0.

Now let d^{(i)}, i = 1, ..., m, be the columns of D. If D_h is nonsingular then D is column proper and there is nothing to prove. So we assume det D_h = 0. Let δ_i = deg d^{(i)}; then clearly Σ_{i=1}^m δ_i ≥ deg det D. We will construct now a unimodular matrix M such that in DM the degree of one column is decreased, the rest remaining unchanged. To this end we observe that if det D_h = 0 there exist a_1, ..., a_m in F, not all zero, such that

Σ_{i=1}^m a_id_h^{(i)} = 0        (13-44)

Let j be an index for which a_j ≠ 0 and δ_i ≤ δ_j whenever a_i ≠ 0. By dividing (13-44) by a_j we may assume without loss of generality that a_j = 1. Define a unimodular matrix M by letting M(λ) agree with the identity matrix except in its jth column, whose entries are given by

M(λ)_{ij} = a_iλ^{δ_j - δ_i} for i ≠ j,   M(λ)_{jj} = 1        (13-45)

The columns of DM agree with those of D for i ≠ j, whereas the jth column of DM is of lower degree than that of D. Since det DM = det D we still have Σ_{i=1}^m δ'_i ≥ deg det D, where δ'_i are the degrees of the columns of DM. We proceed inductively until a column proper matrix is obtained.
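The criterion det D_h ≠ 0 is straightforward to implement. In the sketch below (an illustration with our own coefficient-array convention) a polynomial matrix is stored as an m × m × (d+1) array of λ-coefficients, and the highest-coefficient matrix D_h is assembled column by column.

```python
import numpy as np

def is_column_proper(D):
    """Column properness test via D_h: the i-th column of D_h holds the
    coefficients of the highest-degree terms in the i-th column of D(λ).

    D has shape (m, m, d+1): D[i, j, k] is the λ^k coefficient of entry (i, j)."""
    m = D.shape[0]
    Dh = np.zeros((m, m))
    for j in range(m):
        col = D[:, j, :]                     # m x (d+1) coefficient block
        deg = max(k for k in range(D.shape[2]) if np.any(col[:, k]))
        Dh[:, j] = col[:, deg]               # leading coefficients of column j
    return abs(np.linalg.det(Dh)) > 1e-12

# D(λ) = [[λ, λ+1], [λ, λ]]: both columns have degree 1, but the leading
# coefficient columns coincide, so D_h is singular and D is not column proper
# (indeed det D(λ) = -λ has degree 1 < 1 + 1).
D = np.zeros((2, 2, 2))
D[0, 0, 1] = 1                  # entry (1,1): λ
D[1, 0, 1] = 1                  # entry (2,1): λ
D[0, 1, 0] = 1; D[0, 1, 1] = 1  # entry (1,2): 1 + λ
D[1, 1, 1] = 1                  # entry (2,2): λ
print(is_column_proper(D))  # False
```

A column-proper example is any diagonal Δ(λ) = diag(λ^{κ_1}, ..., λ^{κ_m}), for which D_h is the identity.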
As an immediate corollary we obtain the following simple derivation of feedback canonical forms, the first in terms of polynomial system matrices whereas the second is in terms of matrix representations. Both canonical forms are named Brunovsky canonical forms.

Theorem 13-9 Let D be a nonsingular m × m polynomial matrix with n = deg det D. Then there exist uniquely determined numbers κ_1 ≥ ··· ≥ κ_m ≥ 0 with Σ_{i=1}^m κ_i = n such that (D  I) and (Δ  I), where

Δ(λ) = diag(λ^{κ_1}, ..., λ^{κ_m})        (13-46)

are associated with the feedback equivalent pairs (S(D), π_D) and (S(Δ), π_Δ), respectively.
PROOF By Theorems 13-7 and 13-8 we may assume without loss of generality that D is column proper with column degrees κ_1 ≥ ··· ≥ κ_m ≥ 0 and Σ_{i=1}^m κ_i = n. By left multiplication with a constant matrix P we can bring (D  I) to (PD  P), where PD has the form Δ + Q and the column degrees of Q are less than the corresponding column degrees of Δ. By the similarity P^{-1} in the input space (Δ + Q  P) is transformed into (Δ + Q  I). Finally, by state feedback (Δ + Q  I) is transformed into (Δ  I).

To prove the uniqueness of the numbers κ_1, ..., κ_m assume (S(Δ), π_Δ) and (S(Δ_1), π_{Δ_1}) are feedback equivalent with Δ_1 = diag(λ^{δ_1}, ..., λ^{δ_m}). By Theorem 13-6 we have Δ_1 = P(Δ + Q) for some invertible matrix P. This in turn implies that Δ_1Δ^{-1} = P(I + QΔ^{-1}) is proper. Since Δ_1Δ^{-1} = diag(λ^{δ_1-κ_1}, ..., λ^{δ_m-κ_m}) we have κ_j ≥ δ_j for j = 1, ..., m. Equality now follows by symmetry considerations.
The numbers κ_1, ..., κ_m will be called the reachability indices of D. We proceed to show that this definition is in agreement with the common definition of the reachability indices. Given the pair (A, B) we let ℬ = Range B and ℬ_i = ℬ + Aℬ + ··· + A^{i-1}ℬ. Let

α_i = dim ℬ_i - dim ℬ_{i-1} for i > 1,   α_1 = dim ℬ        (13-47)

It is clear that, assuming B to be injective, m = α_1 ≥ α_2 ≥ ··· ≥ α_n. By the Cayley–Hamilton theorem ℬ_{n+k} = ℬ_n for k > 0. We define now the reachability indices of the pair (A, B) to be the set of numbers κ_1, ..., κ_m defined by

κ_j = Card{α_i | α_i ≥ j}        (13-48)

It follows that κ_1 ≥ κ_2 ≥ ··· ≥ κ_m and Σ_{j=1}^m κ_j = n. Since the α_j are ℱ-orbit invariants, so are the κ_j. Now it is easy to check that the reachability indices of Δ, Δ being given by (13-46), coincide with the reachability indices of the pair (S(Δ), π_Δ).
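The numbers (13-47) and (13-48) can be computed directly by rank evaluations. The sketch below (NumPy; the helper name `reachability_indices` is ours) recovers the indices of a small Brunovsky-type pair.

```python
import numpy as np

def reachability_indices(A, B):
    """Reachability indices of a pair (A, B) via (13-47)-(13-48).

    alpha_i = dim B_i - dim B_{i-1}, where B_i = B + AB + ... + A^{i-1}B,
    and kappa_j = Card{ i : alpha_i >= j }."""
    n, m = B.shape
    blocks, alphas, prev = [B], [], 0
    for i in range(n):
        d = np.linalg.matrix_rank(np.hstack(blocks))  # dim B_{i+1}
        alphas.append(d - prev)
        prev = d
        blocks.append(A @ blocks[-1])
    return [sum(1 for a in alphas if a >= j) for j in range(1, m + 1)]

# pair with indices (2, 1): a 2-step chain fed by the first input,
# a 1-step chain fed by the second
A = np.array([[0., 0., 0.], [1., 0., 0.], [0., 0., 0.]])
B = np.array([[1., 0.], [0., 0.], [0., 1.]])
print(reachability_indices(A, B))  # [2, 1]
```

Here α_1 = 2, α_2 = 1, α_3 = 0, so κ_1 = 2 and κ_2 = 1, consistent with Σκ_j = n = 3.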
Corollary 13-10 Let A and B be n × n and n × m matrices, respectively. Assume the pair (A, B) is reachable, B injective, and the reachability indices of (A, B) are κ_1 ≥ ··· ≥ κ_m. Then (A, B) is feedback equivalent to the pair (A_c, B_c) where A_c = diag(A_1, ..., A_m), B_c = diag(b_1, ..., b_m) with

      | 0  0  ···  0  0 |
      | 1  0  ···  0  0 |
A_j = | 0  1  ···  0  0 |        (13-49)
      | ·  ·       ·  · |
      | 0  0  ···  1  0 |   (κ_j × κ_j)
and

      | 1 |
b_j = | 0 |        (13-50)
      | · |
      | 0 |   (κ_j × 1)

PROOF Let H(λ)D(λ)^{-1} be a right coprime factorization of (λI - A)^{-1}B. The pair (A, B) is isomorphic to (S(D), π_D) and feedback equivalent to (S(Δ), π_Δ), where Δ is given by (13-46). Since Δ is diagonal we have

K_Δ = K_{λ^{κ_1}} ⊕ ··· ⊕ K_{λ^{κ_m}}        (13-51)

Let e_1, ..., e_m be the standard basis in F^m; then the vectors {λ^ie_j | j = 1, ..., m, i = 0, ..., κ_j - 1} are a basis for K_Δ. Relative to these bases the pair (S(Δ), π_Δ) has the matrix representation (A_c, B_c), and this is the Brunovsky canonical form.
NOTES AND REFERENCES For the necessary algebraic background one can consult Jacobson [74, 75], Lang [80], MacLane and Birkhoff [88], or van der Waerden [119]. For matrix theory and polynomial matrices MacDuffee [86] is still useful. Gantmacher [55] is a comprehensive survey. The development of the structure theory for linear transformations in finite dimensional vector spaces follows [51] and is motivated by results in the theory of invariant subspaces that are discussed in the second chapter.

The sections on linear system theory have been written as an effort to bridge the gap between Kalman's stress on seeing linear time invariant systems as F[λ]-modules [76, 77], state space theory [15], and Rosenbrock's polynomial system matrices approach [100]. The material on linear system theory is fairly standard now. References include, besides the previously mentioned books, also the pathbreaking Zadeh and Desoer [126], Wolovich [123], and Wonham [124]. The concepts of controllability and observability have been introduced by Kalman. Reference [76] contains a historical discussion as well as a comprehensive bibliography. The section on simulation has been motivated by [76] and is based on [53], which contains some applications. Coprime factorizations of rational transfer functions play a dominant role in Rosenbrock's theory. The use of coprimeness in the study of composite systems has been utilized in [19, 46].

The resultant of two polynomials has been introduced by Sylvester [80, 119]. For generalizations and the use of resultants in system theory we refer to Barnett [9], Rowe [102], and Gohberg and Lerer [60]. A recent series of papers by Gohberg, Lancaster, and Rodman [57-59] contains a large number of results relevant to system theory.
CHAPTER TWO

OPERATORS IN HILBERT SPACE
1. GEOMETRY OF HILBERT SPACE

Hilbert space is going to provide the setting for most of the rest of this work. This section provides a quick introduction to the important results concerning the geometry of Hilbert spaces. We define an inner product space to be a complex linear space H with a function ( , ): H × H → C that satisfies

(a) (x, x) ≥ 0, and (x, x) = 0 if and only if x = 0
(b) (α_1x_1 + α_2x_2, y) = α_1(x_1, y) + α_2(x_2, y) for α_1, α_2 ∈ C and x_1, x_2, y ∈ H        (1-1)
(c) (x, y) = conj (y, x)

It follows from (1-1) that the form (x, y) is antilinear in y. A form satisfying (1-1) is also called a Hermitian form. Define

‖x‖ = (x, x)^{1/2}        (1-2)

which is the norm induced by the inner product. It clearly satisfies ‖x‖ ≥ 0 for all x, and ‖x‖ = 0 if and only if x = 0. Also ‖αx‖ = |α| ‖x‖ for all α ∈ C and x ∈ H. The proof of the triangle inequality will follow that of the Schwarz inequality.

Inner product spaces allow us to introduce the important notion of orthogonality. We say two vectors x, y ∈ H are orthogonal, and write x ⊥ y, if (x, y) = 0. Given a set M we write x ⊥ M if x ⊥ m for all m ∈ M. A set of vectors {x_α} is called an orthogonal set if (x_α, x_β) = 0 whenever α ≠ β. A vector x is normalized if ‖x‖ = 1. We define an orthonormal set as an orthogonal set of normalized vectors. Thus {e_α} is an orthonormal set if (e_α, e_β) = δ_{αβ}.
Theorem 1-1 (Pythagorean theorem) Let {x_i}_{i=1}^n be an orthogonal set in the inner product space H; then

‖Σ_{i=1}^n x_i‖² = Σ_{i=1}^n ‖x_i‖²

PROOF

‖Σ_{i=1}^n x_i‖² = (Σ_{i=1}^n x_i, Σ_{j=1}^n x_j) = Σ_{i=1}^n Σ_{j=1}^n (x_i, x_j) = Σ_{i=1}^n ‖x_i‖²
Theorem 1-2 (Schwarz inequality) For all x, y in H we have

|(x, y)| ≤ ‖x‖ · ‖y‖        (1-3)

PROOF For y = 0 this is trivial. Assume y ≠ 0 and let e = ‖y‖^{-1}y. Noting that x = (x, e)e + (x - (x, e)e) and that x - (x, e)e ⊥ e, the Pythagorean theorem implies

‖x‖² = ‖(x, e)e‖² + ‖x - (x, e)e‖² ≥ ‖(x, e)e‖² = |(x, e)|²

or |(x, e)| ≤ ‖x‖. Substituting ‖y‖^{-1}y for e, (1-3) is obtained.

Theorem 1-3 (Triangle inequality) For all x, y ∈ H we have
‖x + y‖ ≤ ‖x‖ + ‖y‖        (1-4)

PROOF

‖x + y‖² = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y)
         = ‖x‖² + 2Re(x, y) + ‖y‖² ≤ ‖x‖² + 2|(x, y)| + ‖y‖²
         ≤ ‖x‖² + 2‖x‖ · ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²
With the proof of the triangle inequality we have proved that the norm defined by (1-2) is a bona fide norm. Thus an inner product space becomes a metric space with the metric ρ defined by ρ(x, y) = ‖x - y‖. Convergence of a sequence of vectors in this metric is called strong or norm convergence; that is, a sequence x_n converges to x if ‖x_n - x‖ → 0. We recall that a metric space is called complete if every Cauchy sequence converges to an element of the space. A complete inner product space will be called a Hilbert space. A subset M of a Hilbert space H is a linear manifold if whenever x, y ∈ M and α, β ∈ C we have αx + βy ∈ M. A linear manifold which is closed is called a subspace. Thus a subspace of a Hilbert space is also a Hilbert space.

Theorem 1-4 The inner product is a continuous function in each of its variables.
PROOF Let x_n converge to x; then

|(x_n, y) - (x, y)| = |(x_n - x, y)| ≤ ‖x_n - x‖ · ‖y‖

and hence (x_n, y) converges to (x, y).

Actually we can strengthen this result and show that the inner product is simultaneously continuous in both variables. Thus let lim x_n = x and lim y_n = y; then

|(x_n, y_n) - (x, y)| ≤ |(x_n - x, y_n)| + |(x, y_n - y)| ≤ ‖x_n - x‖ · ‖y_n‖ + ‖x‖ · ‖y_n - y‖

Now the sequence ‖y_n‖ is bounded, being convergent, and hence we obtain lim (x_n, y_n) = (x, y).
Corollary 1-5 Given y ∈ H, the set {x | (x, y) = 0} is a subspace.

The norm in a Hilbert space was defined by means of the inner product. It turns out that the inner product can be recovered from the norm. The proof is a simple computation and is omitted.

Theorem 1-6 (Polarization identity) For all vectors x, y ∈ H

(x, y) = ¼{‖x + y‖² - ‖x - y‖² + i‖x + iy‖² - i‖x - iy‖²}        (1-5)

Theorem 1-7 (Parallelogram identity) For all vectors x, y ∈ H we have

‖x + y‖² + ‖x - y‖² = 2(‖x‖² + ‖y‖²)        (1-6)
PROOF Computational.
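Identities (1-5) and (1-6) are easy to confirm numerically in C^n with the inner product (x, y) = Σ x_i conj(y_i), linear in the first argument as in (1-1). A quick sketch (note that NumPy's `vdot` conjugates its *first* argument, so the arguments are swapped below):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=4) + 1j * rng.normal(size=4)

ip = lambda u, v: np.vdot(v, u)        # (u, v) = sum_i u_i conj(v_i)
nrm = lambda u: np.sqrt(ip(u, u).real) # induced norm (1-2)

# polarization identity (1-5)
pol = 0.25 * (nrm(x + y)**2 - nrm(x - y)**2
              + 1j * nrm(x + 1j * y)**2 - 1j * nrm(x - 1j * y)**2)
print(np.isclose(pol, ip(x, y)))  # True

# parallelogram identity (1-6)
print(np.isclose(nrm(x + y)**2 + nrm(x - y)**2,
                 2 * (nrm(x)**2 + nrm(y)**2)))  # True
```

The factor i in (1-5) is essential: with real scalars only the real part of the inner product could be recovered from the norm.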
Theorem 1-8 (Bessel's inequality) Let {e_i} be any orthonormal set; then for each vector x ∈ H we have

Σ |(x, e_i)|² ≤ ‖x‖²        (1-7)

PROOF For each finite orthonormal set {e_1, ..., e_n} we have

x = (x - Σ_{i=1}^n (x, e_i)e_i) + Σ_{i=1}^n (x, e_i)e_i

and

x - Σ_{i=1}^n (x, e_i)e_i

is orthogonal to e_1, ..., e_n and hence to the subspace spanned by them. Using the Pythagorean theorem we obtain

‖x‖² = ‖x - Σ_{i=1}^n (x, e_i)e_i‖² + ‖Σ_{i=1}^n (x, e_i)e_i‖² ≥ Σ_{i=1}^n |(x, e_i)|²

If our orthonormal set is infinite we let n go to infinity to obtain the required inequality.
Let y be any vector in the subspace spanned by e_1, ..., e_n; then y = Σ_{i=1}^n a_ie_i for some a_i ∈ C. Now

x - y = (x - Σ_{i=1}^n (x, e_i)e_i) - Σ_{i=1}^n (a_i - (x, e_i))e_i

and hence, as in the proof of Bessel's inequality, we have

‖x - y‖² = ‖x - Σ_{i=1}^n (x, e_i)e_i‖² + Σ_{i=1}^n |a_i - (x, e_i)|² ≥ ‖x - Σ_{i=1}^n (x, e_i)e_i‖²

Thus Σ_{i=1}^n (x, e_i)e_i is the vector in the subspace spanned by e_1, ..., e_n which is closest to x. This observation can be used as a basis for the orthogonal decomposition of a Hilbert space with respect to a subspace. To this end define two subspaces M and N to be orthogonal if (m, n) = 0 for all m ∈ M and n ∈ N. Given two orthogonal subspaces M and N we write M ⊕ N for the orthogonal direct sum of M and N, that is, for the subspace {m + n | m ∈ M, n ∈ N}.

A set K in a linear space is called convex if x, y ∈ K implies αx + (1 - α)y ∈ K for all 0 ≤ α ≤ 1. Of course any subspace is automatically a convex set. The next theorem, of independent interest, addresses itself to the problem of best approximation by convex sets.
Theorem 1-9 Let K be a closed convex subset of a Hilbert space H and let x ∉ K. Then there exists a unique vector k ∈ K such that ‖x - k‖ < ‖x - k'‖ for all k' ∈ K - {k}.

PROOF Let d = inf{‖x - k'‖ | k' ∈ K} and let k_n be a sequence in K such that lim_{n→∞} ‖x - k_n‖ = d. The sequence k_n is a Cauchy sequence for, by the parallelogram identity,

2(‖x - k_n‖² + ‖x - k_m‖²) = 4‖x - (k_n + k_m)/2‖² + ‖k_n - k_m‖²

or

‖k_n - k_m‖² = 2(‖x - k_n‖² + ‖x - k_m‖²) - 4‖x - (k_n + k_m)/2‖²

But (k_n + k_m)/2 ∈ K by convexity, so

‖x - (k_n + k_m)/2‖ ≥ d

and

‖k_n - k_m‖² ≤ 2(‖x - k_n‖² + ‖x - k_m‖²) - 4d²
Since the right side approaches zero as n, m → ∞ it follows that k_n is a Cauchy sequence. Since K is closed there exists a vector k ∈ K such that lim k_n = k. By continuity of the norm we have ‖x - k‖ = d.

To show uniqueness assume k' is another vector in K for which ‖x - k'‖ = d. Then

0 ≤ ‖k - k'‖² = 2(‖x - k‖² + ‖x - k'‖²) - 4‖x - (k + k')/2‖² = 4d² - 4‖x - (k + k')/2‖² ≤ 0
So k = k' and the proof is complete.

Given any set S we define

S⊥ = {y ∈ H | (y, s) = 0 for all s ∈ S} = ∩_{s∈S} {y ∈ H | (y, s) = 0}

By Corollary 1-5, S⊥ is a subspace. For a subspace M, M⊥ is called the orthogonal complement of M by reason of the following.
Theorem 1-10 Let M be a subspace of a Hilbert space H; then H = M ⊕ M⊥.

PROOF It suffices to show that each vector x in H can be written as x = y + z with y ∈ M and z ⊥ M. Let y be the unique vector in M, whose existence has been established in Theorem 1-9, which is closest to x. Put z = x - y; then clearly x = y + z, and the proof will be complete once we establish that z = x - y is orthogonal to M. Let y' be any vector in M; then

‖x - y‖² ≤ ‖x - (y + y')‖² = ‖(x - y) - y'‖² = ‖x - y‖² + ‖y'‖² - 2Re(x - y, y')

so that ‖y'‖² - 2Re(x - y, y') ≥ 0. Let y' = ρe^{iθ}e for some e in M and ρ > 0, and choose θ so that e^{-iθ}(x - y, e) = |(x - y, e)|. Then 0 ≤ ρ²‖e‖² - 2ρ|(x - y, e)|. Dividing by ρ > 0 and letting ρ approach zero we obtain 0 ≤ -2|(x - y, e)|, or (x - y, e) = 0. This proves the theorem.
A direct corollary is the Riesz representation theorem for functionals in a Hilbert space. A linear functional in a Hilbert space is a linear map F from H into ℂ. A functional is continuous if the function F is continuous.

Theorem 1-11 (Riesz representation theorem) Let F be any continuous linear functional in H. Then there exists a unique vector y ∈ H such that

$$F(x)=(x,y)\quad\text{for all }x\in H \qquad (1-8)$$
68
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE
PROOF If F = 0 choose y = 0. Otherwise let M = Ker F = {x | F(x) = 0}. Since F is linear and continuous, M is a subspace. Let 0 ≠ z ∈ M⊥ and define y by y = (‾F(z)‾/‖z‖²)z, where the bar denotes complex conjugation; then F(z) = (z, y) and F(y) = ‖y‖². Let x ∈ H; then

$$x=\left(x-\frac{F(x)}{F(y)}y\right)+\frac{F(x)}{F(y)}y$$

where x − (F(x)/F(y))y ∈ M and is therefore orthogonal to y. Taking the inner product of x and y we have

$$(x,y)=\left(\frac{F(x)}{F(y)}y,\,y\right)=\frac{F(x)}{F(y)}\|y\|^2=F(x)$$

which proves the existence of y.
To prove uniqueness let y′ be another vector such that F(x) = (x, y′) for all x ∈ H. Then (x, y − y′) = 0 for all x, and choosing x = y − y′ we have ‖y − y′‖² = 0, which implies y = y′.
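In ℂⁿ the Riesz representation is explicit: a functional F(x) = Σ cᵢxᵢ is represented by the vector y with coordinates ‾cᵢ‾. A minimal numerical sketch (assuming numpy; the coefficient vector `c` is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
c = rng.standard_normal(4) + 1j * rng.standard_normal(4)

def F(x):
    # a continuous linear functional on C^4: F(x) = sum_i c_i x_i
    return np.sum(c * x)

def inner(x, y):
    # (x, y) = sum_i x_i * conj(y_i), linear in the first argument
    return np.sum(x * np.conj(y))

# The representing vector of Theorem 1-11:
y = np.conj(c)

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
assert np.isclose(F(x), inner(x, y))
```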
We continue with the study of orthonormal sets. An orthonormal set {eₐ} is an orthonormal basis for a Hilbert space H if the smallest subspace containing it is H. An orthonormal set {eₐ} is closed if there exists no nonzero vector orthogonal to all eₐ. Thus a closed orthonormal set is also a maximal orthonormal set in the sense that there exists no orthonormal set which properly includes it. An orthonormal set is complete if for each vector x ∈ H we have the Parseval identity

$$\|x\|^2=\sum_\alpha |(x,e_\alpha)|^2 \qquad (1-9)$$

It is a simple consequence of Bessel's inequality that for an arbitrary orthonormal set at most a countable number of the (x, eₐ), that is, of the generalized Fourier coefficients, are nonzero. A Hilbert space H is separable if there exists a countable dense subset in H.
Theorem 1-12 A Hilbert space H is separable if and only if it has a countable orthonormal basis.
PROOF If H has a countable orthonormal basis, say {eᵢ}ᵢ₌₁^∞, then the set {Σᵢ₌₁ⁿ aᵢeᵢ | n > 0, Re aᵢ and Im aᵢ rational} is a countable dense subset. Conversely assume H is separable and let {xᵢ}ᵢ₌₁^∞ be a countable dense subset of H. We construct an orthonormal basis by the Gram-Schmidt orthonormalization procedure. If x₁ ≠ 0 let e₁ = x₁/‖x₁‖. Suppose e₁, ..., eₙ have been defined; we proceed inductively. Let x be the first vector in the sequence {xᵢ} which is not in the subspace spanned by e₁, ..., eₙ. Define eₙ₊₁ by

$$e_{n+1}=\frac{x-\sum_{i=1}^n (x,e_i)e_i}{\left\|x-\sum_{i=1}^n (x,e_i)e_i\right\|}$$

The resulting orthonormal set {eᵢ} is an orthonormal basis for H as it spans the same subspace as the set {xᵢ}.
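The Gram-Schmidt step above, including the skipping of vectors already in the span of the previous ones, can be sketched in a few lines (assuming numpy; `gram_schmidt` and its tolerance are illustrative, not from the text):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a sequence, skipping vectors already in the span
    of the previously constructed e_1, ..., e_n."""
    basis = []
    for x in vectors:
        # subtract the generalized Fourier coefficients (x, e_i) e_i;
        # np.vdot(e, x) = sum conj(e_i) x_i, which equals (x, e) here
        v = x - sum(np.vdot(e, x) * e for e in basis)
        n = np.linalg.norm(v)
        if n > tol:                  # x was not in span{e_1, ..., e_n}
            basis.append(v / n)
    return basis

vecs = [np.array([1.0, 1.0, 0.0]),
        np.array([2.0, 2.0, 0.0]),   # dependent on the first: skipped
        np.array([1.0, 0.0, 1.0])]
E = gram_schmidt(vecs)
G = np.array([[np.vdot(e, f) for f in E] for e in E])  # Gram matrix, should be I
```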
For simplicity we restrict ourselves from now on to the case of separable Hilbert spaces.
Theorem 1-13 Let H be a separable Hilbert space. Then the following statements are equivalent.

(a) {eᵢ}ᵢ₌₁^∞ is an orthonormal basis for H.
(b) {eᵢ}ᵢ₌₁^∞ is a complete orthonormal set.
(c) {eᵢ}ᵢ₌₁^∞ is a closed orthonormal set.

PROOF Assume {eᵢ}ᵢ₌₁^∞ is an orthonormal basis; then by Bessel's inequality Σᵢ₌₁^∞ |(x, eᵢ)|² < ∞ and hence Σᵢ₌₁^∞ (x, eᵢ)eᵢ converges. Since x − Σᵢ₌₁^∞ (x, eᵢ)eᵢ is clearly orthogonal to the subspace spanned by the eᵢ, that is, to all of H, we have x = Σᵢ₌₁^∞ (x, eᵢ)eᵢ. A simple computation yields ‖x‖² = Σᵢ₌₁^∞ |(x, eᵢ)|². Thus the Parseval identity holds for all x and {eᵢ}ᵢ₌₁^∞ is a complete orthonormal set. Next assume the set is complete and let x be orthogonal to all eᵢ; then from the Parseval identity we have x = 0 and the orthonormal set {eᵢ}ᵢ₌₁^∞ is closed. Finally assume {eᵢ}ᵢ₌₁^∞ is closed. Again Σᵢ₌₁^∞ (x, eᵢ)eᵢ converges, and x − Σᵢ₌₁^∞ (x, eᵢ)eᵢ, being orthogonal to all eᵢ, is necessarily zero. So x = Σᵢ₌₁^∞ (x, eᵢ)eᵢ for all x and hence {eᵢ} is an orthonormal basis, which completes the proof.
Theorem 1-14 Let H be any separable Hilbert space. Then the cardinalities of any two orthonormal bases are equal.

PROOF Since the space is separable any orthonormal basis is at most countable. If H has a finite orthonormal basis {e₁, ..., eₙ} and {fⱼ}ⱼ is any other orthonormal basis then by Parseval's identity

$$\sum_j \|f_j\|^2=\sum_j\sum_i |(f_j,e_i)|^2=\sum_i\sum_j |(e_i,f_j)|^2=\sum_{i=1}^n \|e_i\|^2=n$$

Since ‖fⱼ‖ = 1 we have Σⱼ ‖fⱼ‖² = n and hence the basis {fⱼ} also has n elements.
We define the dimension of a Hilbert space to be the number of elements in any orthonormal basis. By the previous theorem the dimension is well defined. In the same way we define the dimension of a subspace. Given a subspace M the codimension of M is the dimension of M⊥.
2. BOUNDED OPERATORS IN HILBERT SPACE

Let H₁, H₂ be two Hilbert spaces. A linear operator T : H₁ → H₂ will be called a bounded operator if its norm ‖T‖ defined by

$$\|T\|=\sup_{x\ne 0}\frac{\|Tx\|_2}{\|x\|_1} \qquad (2-1)$$

is finite. The set of all bounded linear operators from H₁ into H₂ will be denoted by B(H₁, H₂). We will write B(H) for B(H, H) and note that while B(H₁, H₂) is a Banach space in the operator norm, B(H) is actually a Banach algebra [29, 125]. If T ∈ B(H₁, H₂) then its adjoint T* is the unique operator in B(H₂, H₁) that satisfies

$$(Tx,y)_2=(x,T^*y)_1 \qquad (2-2)$$

for all x ∈ H₁ and y ∈ H₂. The existence and uniqueness of T* are a straightforward consequence of the Riesz representation theorem. Moreover a simple computation yields the equality ‖T‖ = ‖T*‖.
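In finite dimensions the adjoint is the conjugate transpose, and both (2-2) and ‖T‖ = ‖T*‖ can be verified directly (a numpy sketch; the random matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))  # T: C^2 -> C^3
T_adj = T.conj().T                                                  # the adjoint T*

def inner(u, v):
    # (u, v) = sum_i u_i conj(v_i); np.vdot conjugates its first argument
    return np.vdot(v, u)

x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = inner(T @ x, y)        # (Tx, y)_2
rhs = inner(x, T_adj @ y)    # (x, T*y)_1
assert np.isclose(lhs, rhs)

# ||T|| = ||T*||: both equal the largest singular value
assert np.isclose(np.linalg.norm(T, 2), np.linalg.norm(T_adj, 2))
```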
An operator T in B(H) will be called self-adjoint if T = T*. Self-adjoint operators play in B(H) the role that the reals ℝ play in the complex number field ℂ. Thus any operator T in B(H) has a unique representation of the form T = A + iB with A and B self-adjoint. The importance of the class of self-adjoint operators stems from their appearance in many applications as well as from the fact that they form one of a small class of operators whose structure is completely understood. We will return to the study of self-adjoint operators in Sec. 5. An operator T ∈ B(H) is called a contraction if ‖T‖ ≤ 1. Every bounded operator can be rescaled, through multiplication by a scalar, to become a contraction. Thus the study of the structure of contractions yields information about the most general bounded operators. With each operator T in B(H₁, H₂) we associate two linear manifolds
Ker T = {x ∈ H₁ | Tx = 0}

and

Range T = {Tx | x ∈ H₁}

From the assumption of boundedness it follows that Ker T is a subspace of H₁, while Range T is a linear manifold in H₂ which is not necessarily closed. The following is a simple but important result.
Theorem 2-1 Let T ∈ B(H₁, H₂); then

(a) H₁ = Ker T ⊕ (Range T*)⁻, and
(b) H₂ = Ker T* ⊕ (Range T)⁻,

where the bar denotes closure.

PROOF From the equality (Tx, y) = (x, T*y)
it follows that x ∈ Ker T if and only if it is orthogonal to Range T* and hence to the closure of Range T*. Thus (a); (b) follows by duality.

Let M be a subspace of a Hilbert space H. By Theorem 1-10 we have the direct sum decomposition H = M ⊕ M⊥. If x = y + z with y ∈ M and z ∈ M⊥ is the unique representation of x relative to this direct sum decomposition then we define an operator P ∈ B(H) by Px = y. P is linear and satisfies

$$P^2=P=P^* \qquad (2-3)$$
Conversely, any operator P ∈ B(H) satisfying (2-3) induces a direct sum decomposition H = M ⊕ N with M = Range P = Ker(I − P) and N = Range(I − P) = Ker P. We call an operator P satisfying P² = P a projection, and an orthogonal projection if it satisfies also P = P*. Given a subspace M the orthogonal projection on it is uniquely determined and denoted by P_M.

Let T be a bounded operator in H. A subspace M ⊂ H is called an invariant subspace for T if TM ⊂ M. M is invariant under T if and only if M⊥ is invariant under T*. This is a trivial consequence of (2-2). A subspace M is a reducing subspace for T, or reduces T, if it is invariant under both T and T*. Equivalently M reduces T if and only if both M and M⊥ are invariant under T. Since a subspace M and the orthogonal projection P_M on it are in one-to-one correspondence, invariance and reducibility are expressible in terms of the projection P_M.
Theorem 2-2 Let M be a subspace of a Hilbert space H, P the orthogonal projection on it, and T ∈ B(H). Then

(a) M is invariant under T if and only if PTP = TP.
(b) M is a reducing subspace for T if and only if PT = TP.

PROOF (a) Assume M is invariant under T. For each x ∈ H we have Px ∈ M and hence TPx ∈ M by invariance. Now M = Ker(I − P), hence (I − P)TPx = 0, or PTPx = TPx for all x. Conversely assume PTP = TP and let x ∈ M; then Px = x. Since PTPx = TPx we have PTx = Tx, so Tx ∈ Ker(I − P) = M, that is, M is invariant. To prove (b) we note that M⊥ is invariant under T if and only if (I − P)T(I − P) = T(I − P), which simplifies to PTP = PT. Thus reducibility implies PT = TP. Conversely the equality PT = TP implies, by left and right multiplication by P, both PTP = TP and PTP = PT, which are equivalent to the invariance of M and M⊥ under T.

An operator T ∈ B(H₁, H₂) is right (respectively left) invertible if there exists an operator R (resp. S) in B(H₂, H₁) such that TR = I_{H₂} (resp. ST = I_{H₁}). Clearly right invertibility implies T is onto H₂, that is, surjective, and left invertibility implies T is one-to-one, that is, injective. The converse of the first implication is true, and is a special case of Theorem 7-1, but the converse of the second is false. An operator T is invertible if it is both left and right invertible. Thus the invertibility of T implies T is one-to-one and onto. The converse is true, but while the existence of an algebraic inverse is trivial the proof of the boundedness of the inverse needs all the power of the closed graph theorem [29, 96].
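The projection criteria of Theorem 2-2 are easy to check numerically: with M = span{e₁, e₂} in ℂ³, a block upper triangular matrix leaves M invariant (so PTP = TP) without commuting with P, while a block diagonal matrix is reduced by M. A numpy sketch with illustrative matrices:

```python
import numpy as np

# M = span{e1, e2} in C^3; P the orthogonal projection on M
P = np.diag([1.0, 1.0, 0.0])

# block upper triangular: TM ⊂ M, so PTP = TP ...
T_inv = np.array([[1.0, 2.0, 3.0],
                  [0.0, 4.0, 5.0],
                  [0.0, 0.0, 6.0]])
assert np.allclose(P @ T_inv @ P, T_inv @ P)     # invariance criterion (a)
assert not np.allclose(P @ T_inv, T_inv @ P)     # ... but M does not reduce T_inv

# block diagonal: M reduces T, so PT = TP
T_red = np.diag([1.0, 2.0, 3.0])
assert np.allclose(P @ T_red, T_red @ P)         # reducing criterion (b)
```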
We denote the inverse of an invertible operator T by T⁻¹. Thus

$$TT^{-1}=T^{-1}T=I$$

Taking adjoints of the previous equality we obtain

$$(T^{-1})^*T^*=T^*(T^{-1})^*=I$$

and hence T* is invertible and

$$(T^*)^{-1}=(T^{-1})^* \qquad (2-4)$$
A relaxation of the notion of invertibility is obtained by introducing quasiaffinities. A bounded map X : H₁ → H₂ which is one-to-one and has dense range is called a quasiaffinity or, equivalently, a quasi-invertible transformation.

A different situation occurs when an operator T : H₁ → H₂ has closed range. In that case T : {Ker T}⊥ → Range T is a boundedly invertible operator. The inverse, originally defined on Range T, can be extended to all of H₂ by letting its restriction to {Range T}⊥ = Ker T* be zero. The extended operator, uniquely determined by T and denoted by T⁺, is called the pseudoinverse of T. The pseudoinverse of T satisfies

$$TT^+T=T \qquad (2-5)$$
Due to the existence of an inner product the notion of isomorphism can be specialized. Generally two spaces H₁ and H₂ are isomorphic if there exists an invertible transformation T from H₁ onto H₂. In the case of Hilbert spaces we say two spaces H₁ and H₂ are isometrically isomorphic, or unitarily equivalent, if there exists an invertible map U : H₁ → H₂ satisfying

$$(Ux,Uy)_2=(x,y)_1 \qquad (2-6)$$

for all x, y ∈ H₁. An invertible map U satisfying (2-6) is called a unitary map. From (2-6) it follows that

$$U^*U=I \qquad (2-7)$$

and since U is assumed invertible we have also

$$UU^*=I \qquad (2-8)$$

which together with (2-7) is equivalent to U⁻¹ = U*. An operator U : H₁ → H₂ is called an isometry if it satisfies (2-7) and a coisometry if (2-8) is satisfied. Contrary to the finite dimensional situation there exist nonunitary isometries in B(H), assuming the dimension of H to be infinite. Thus the set of isometries in B(H) is properly larger than the set of unitary operators. For many purposes it turns out to be useful to introduce a class of operators wider than the class of isometries, namely the partial isometries.

An operator V ∈ B(H₁, H₂) is called a partial isometry if there exists a subspace M ⊂ H₁ such that

‖Vx‖ = ‖x‖  if x ∈ M
‖Vx‖ = 0   if x ∈ M⊥
The subspace M is called the initial space of V whereas Range V, which is a closed subspace of H₂, is called the final space of V. The basic properties of partial isometries are given below.

Theorem 2-3

(a) A bounded linear transformation V : H₁ → H₂ is a partial isometry if and only if V*V is a projection.
(b) V is a partial isometry if and only if V* is, and the initial space of V* is the final space of V.
(c) If V is a partial isometry then V*V is the orthogonal projection on the initial space of V whereas VV* is the orthogonal projection on the final space of V.
(d) V is a partial isometry if and only if V⁺ = V*, where V⁺ is the pseudoinverse of V.
PROOF (a) Let V : H₁ → H₂ be a partial isometry with initial space M and let P be the orthogonal projection on M. If x ∈ M then

(V*Vx, x) = ‖Vx‖² = ‖x‖² = (Px, x)

On the other hand if x ∈ M⊥ then (V*Vx, x) = ‖Vx‖² = 0 = (Px, x). Thus (V*Vx, x) = (Px, x) for all x in H₁, hence V*V = P, which proves also the first part of (c).

Conversely assume V*V = P is a projection, necessarily orthogonal since P is self-adjoint, on a subspace M. Then for x ∈ M

‖Vx‖² = (V*Vx, x) = (Px, x) = ‖x‖²

and for x ∈ M⊥

‖Vx‖² = (V*Vx, x) = (Px, x) = 0

Thus V is a partial isometry with M = Range P as initial space.

To prove (b) assume V is a partial isometry and let N be its final space. Let y ∈ N; then y = Vx for some x ∈ M and ‖V*y‖² = (VV*y, y) = (VV*Vx, Vx) = (VPx, Vx) = ‖Vx‖² = ‖x‖² = ‖y‖². If y ∈ N⊥ then for every x ∈ H₁ we have (V*y, x) = (y, Vx) = 0, since Vx ∈ N; hence V*y = 0. Thus ‖V*y‖ equals ‖y‖ if y ∈ N and 0 if y ∈ N⊥, so V* is a partial isometry with N = Range V as initial space. It follows from the first part of (c) that VV* is the orthogonal projection onto the final space of V.

(d) Assume V is a partial isometry; then V*V = P, which is equivalent to V⁺ = V*. Conversely if V⁺ = V* then V*V = P, and from (a) it follows that V is a partial isometry.
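A partial isometry with a nontrivial kernel is easy to manufacture from a truncated SVD, and parts (c) and (d) of Theorem 2-3 can then be checked numerically (a numpy sketch; the matrix and truncation rank are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

r = 2                              # keep two singular directions
V = U[:, :r] @ Vh[:r, :]           # a partial isometry C^3 -> C^4, rank-2 initial space

P_init = V.conj().T @ V            # projection on the initial space (Theorem 2-3(c))
P_fin = V @ V.conj().T             # projection on the final space

for P in (P_init, P_fin):
    assert np.allclose(P @ P, P)            # P^2 = P
    assert np.allclose(P.conj().T, P)       # P = P*

# Theorem 2-3(d): the pseudoinverse of a partial isometry is its adjoint
assert np.allclose(np.linalg.pinv(V), V.conj().T)
```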
While the spectral theorem and the Wold decomposition give us complete
information as to the structure of isometries the structure theory for partial isometries is equivalent to the theory of an arbitrary bounded operator in a Hilbert space [64, Problem 103]. The notion of isomorphism of spaces extends to isomorphism of operators.
Given T₁ ∈ B(H₁) and T₂ ∈ B(H₂) we say T₁ and T₂ are unitarily equivalent if there exists a unitary operator U ∈ B(H₁, H₂) for which UT₁ = T₂U. T₁ and T₂ are similar if there exists a boundedly invertible operator R ∈ B(H₁, H₂) for which RT₁ = T₂R. We say T₁ is a quasiaffine transform of T₂ if there exists a quasiaffinity X ∈ B(H₁, H₂) such that XT₁ = T₂X. T₁ and T₂ are quasisimilar if each is a quasiaffine transform of the other.
Unitary equivalence, similarity, and quasisimilarity are all equivalence relations, while being a quasiaffine transform is a reflexive and transitive but not necessarily symmetric relation.

The object of spectral theory is to study the structure of operators by decomposing them into more elementary ones. This is in analogy with the finite dimensional situation summarized by Theorem I 4-21 describing the Jordan canonical form. To this end we introduce the relevant terminology. We define the resolvent set of a bounded operator T as the set ρ(T) of all complex numbers λ such that λI − T has a bounded inverse. We put R(λ, T) = (λI − T)⁻¹ and call it the resolvent function. The complement of ρ(T) is called the spectrum of T and denoted by σ(T). The spectrum of T can be classified more finely. We denote by σₚ(T), the point spectrum of T, the set of all eigenvalues of T, that is, the set of all λ for which λI − T is not injective. If λI − T is one-to-one but not onto then we assign λ to the continuous spectrum σ_c(T) if the range of λI − T is dense and to the residual spectrum σᵣ(T) otherwise. The important basic facts concerning spectrum and resolvent are summarized by the following theorem. The theorem holds just as well in any Banach space.
Theorem 2-4 Let T be a bounded operator; then the resolvent set ρ(T) is open and R(λ, T) is analytic on ρ(T). The spectrum σ(T) is a nonempty compact set and

$$r(T)=\sup\{|\lambda| \mid \lambda\in\sigma(T)\}=\lim_{n\to\infty}\|T^n\|^{1/n} \qquad (2-9)$$

The resolvent equation

$$R(\lambda,T)-R(\mu,T)=-(\lambda-\mu)R(\lambda,T)R(\mu,T) \qquad (2-10)$$

holds for all λ, μ ∈ ρ(T).
PROOF We use the fact that if ‖A‖ < 1 then I − A is invertible and the inverse is given by the uniformly convergent series Σₙ₌₀^∞ Aⁿ. Thus if λ₀ ∈ ρ(T) is an arbitrary point of ρ(T) we will show that a full neighbourhood of λ₀ is in ρ(T). Indeed

$$\lambda I-T=(\lambda-\lambda_0)I+(\lambda_0 I-T)=(\lambda_0 I-T)\bigl(I+(\lambda-\lambda_0)R(\lambda_0,T)\bigr)$$

Since λ₀I − T is invertible by assumption, λI − T is invertible if I + (λ − λ₀)R(λ₀, T) is, and this is certainly true if ‖(λ − λ₀)R(λ₀, T)‖ < 1, that is, if |λ − λ₀| < ‖R(λ₀, T)‖⁻¹.
Thus ρ(T) is open and moreover we get

$$R(\lambda,T)=\sum_{n=0}^\infty (-1)^n(\lambda-\lambda_0)^n R(\lambda_0,T)^{n+1}$$

for all λ such that |λ − λ₀| < ‖R(λ₀, T)‖⁻¹, which shows the analyticity of the resolvent. Also it follows that if d(λ) is the distance of the point λ ∈ ρ(T) from the spectrum then necessarily ‖R(λ, T)‖ ≥ d(λ)⁻¹, which shows that ‖R(λ, T)‖ → ∞ as λ approaches σ(T), and hence ρ(T) is the natural domain of analyticity of the resolvent function. For λ, μ ∈ ρ(T) the resolvent equation follows from the equality (λ − μ)I = (λI − T) − (μI − T) by multiplication by R(λ, T)R(μ, T). We have for the derivative of the resolvent the formula dR(λ, T)/dλ = −R(λ, T)².

To see that σ(T) is bounded we note that for complex λ satisfying |λ| > ‖T‖ we have λ ∈ ρ(T), as (λI − T)⁻¹ = Σₙ₌₀^∞ (Tⁿ/λⁿ⁺¹) and the series converges. If σ(T) were empty then R(λ, T) would be an entire function vanishing at infinity, thus necessarily zero. This implies that all coefficients of the Laurent expansion Σₙ₌₀^∞ (Tⁿ/λⁿ⁺¹) vanish, including I, which yields a contradiction.

A classical argument about power series shows that the expansion R(λ, T) = Σₙ₌₀^∞ (Tⁿ/λⁿ⁺¹) actually converges for |λ| > sup {|λ| | λ ∈ σ(T)}; hence lim sup ‖Tⁿ‖^{1/n} ≤ r(T). Now if λ ∈ σ(T) it clearly follows that λⁿ ∈ σ(Tⁿ) and hence |λ|ⁿ ≤ ‖Tⁿ‖. Thus r(T) = sup {|λ| | λ ∈ σ(T)} ≤ ‖Tⁿ‖^{1/n} for every n, or r(T) ≤ lim inf ‖Tⁿ‖^{1/n}. This together with the previous inequality implies that lim ‖Tⁿ‖^{1/n} exists and is equal to the spectral radius r(T).
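Gelfand's formula (2-9) is easy to watch converge numerically. For a nonnormal matrix ‖T‖ exceeds r(T), yet ‖Tⁿ‖^{1/n} still settles on the spectral radius (a numpy sketch with an illustrative matrix):

```python
import numpy as np

T = np.array([[0.5, 1.0],
              [0.0, 0.3]])          # nonnormal: ||T|| > r(T)

r = max(abs(np.linalg.eigvals(T)))  # spectral radius from the eigenvalues: 0.5

# Gelfand's formula (2-9): ||T^n||^{1/n} -> r(T)
n = 200
Tn = np.linalg.matrix_power(T, n)
approx = np.linalg.norm(Tn, 2) ** (1.0 / n)

assert abs(approx - r) < 1e-2       # slow but visible convergence
```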
If λ ∈ ρ(T) then taking the adjoint of the equality

$$(\lambda I-T)R(\lambda,T)=R(\lambda,T)(\lambda I-T)=I \qquad (2-11)$$

implies

$$R(\lambda,T)^*=R(\bar\lambda,T^*) \qquad (2-12)$$

If T is a contraction and λ ∈ σₚ(T) we do not necessarily have λ̄ ∈ σₚ(T*). More can be said if |λ| = 1.
Lemma 2-5 Let T be a contraction in a Hilbert space H. If |λ| = 1 and Tx = λx then T*x = λ̄x.

PROOF By considering λ̄T instead of T it suffices, without loss of generality, to prove that Tx = x implies T*x = x. Thus we assume Tx = x and hence (T*x, x) = (x, Tx) = (x, x) = ‖x‖². Now

$$0\le\|T^*x-x\|^2=\|T^*x\|^2-2\,\mathrm{Re}(T^*x,x)+\|x\|^2=\|T^*x\|^2-2\|x\|^2+\|x\|^2=\|T^*x\|^2-\|x\|^2\le 0$$

which implies ‖T*x − x‖ = 0, or T*x = x.

As in the finite dimensional case, given a polynomial p(λ) = Σᵢ₌₀ⁿ aᵢλⁱ and a
bounded operator T, p(T) is defined by

$$p(T)=\sum_{i=0}^n a_i T^i \qquad (2-13)$$
This is not very useful in the infinite dimensional case due to the fact that the case where p(T) = 0 for some nonzero polynomial is the exception rather than the rule. Thus it is of interest to enlarge the class of functions f for which f (T) can be defined.
By a functional calculus we mean a homomorphic map of some algebra of functions into the algebra of bounded operators on a Hilbert or a Banach space. An important instance of a functional calculus is the Riesz-Dunford functional calculus, which is the operator theoretic analog of the Cauchy integral formula. Let T be a bounded operator with spectrum σ(T), let Ω be an open set that contains σ(T), and finally let A(Ω) be the set of all analytic functions in Ω. For f ∈ A(Ω) we define
$$f(T)=\frac{1}{2\pi i}\oint_\gamma f(\lambda)R(\lambda,T)\,d\lambda \qquad (2-14)$$

where γ is any closed positively oriented path in Ω around σ(T).
Theorem 2-6 The map f → f(T) defined by (2-14) is a (multiplicative) homomorphism of A(Ω) into B(H) that satisfies

$$f(T)=\sum_{n=0}^\infty a_n T^n \qquad (2-15)$$

for every entire function f(z) = Σₙ₌₀^∞ aₙzⁿ.

PROOF That the map f → f(T) is linear is obvious. Multiplicativity is a consequence of the resolvent equation (2-10). Indeed let γ₁ and γ₂ be positively oriented paths around σ(T) and assume without loss of generality that γ₁ lies inside γ₂, that is, the index of each point of γ₁ relative to γ₂ is 1. In that case

$$f(T)g(T)=-\frac{1}{4\pi^2}\left\{\oint_{\gamma_1} f(\lambda)R(\lambda,T)\,d\lambda\right\}\left\{\oint_{\gamma_2} g(\mu)R(\mu,T)\,d\mu\right\}$$

$$=-\frac{1}{4\pi^2}\oint_{\gamma_1}\oint_{\gamma_2} f(\lambda)g(\mu)\,\frac{R(\lambda,T)-R(\mu,T)}{\mu-\lambda}\,d\mu\,d\lambda$$

$$=\frac{1}{2\pi i}\oint_{\gamma_1} f(\lambda)R(\lambda,T)\left\{\frac{1}{2\pi i}\oint_{\gamma_2}\frac{g(\mu)}{\mu-\lambda}\,d\mu\right\}d\lambda-\frac{1}{2\pi i}\oint_{\gamma_2} g(\mu)R(\mu,T)\left\{\frac{1}{2\pi i}\oint_{\gamma_1}\frac{f(\lambda)}{\mu-\lambda}\,d\lambda\right\}d\mu$$

$$=\frac{1}{2\pi i}\oint_{\gamma_1} f(\lambda)g(\lambda)R(\lambda,T)\,d\lambda=(fg)(T)$$

since for λ on γ₁ the inner integral over γ₂ equals g(λ), while for μ on γ₂ the inner integral over γ₁ vanishes, f being analytic inside γ₂.
Finally, assume f is an entire function having a Taylor series f(z) = Σₙ₌₀^∞ aₙzⁿ. Expand R(λ, T) around infinity; then for a contour γ lying in {λ | |λ| > ‖T‖} we have

$$f(T)=\frac{1}{2\pi i}\sum_{n=0}^\infty a_n\oint_\gamma \lambda^n R(\lambda,T)\,d\lambda=\sum_{n=0}^\infty a_n\,\frac{1}{2\pi i}\oint_\gamma \lambda^n\sum_{k=0}^\infty \frac{T^k}{\lambda^{k+1}}\,d\lambda=\sum_{n=0}^\infty a_n T^n$$
If f(T) is defined by (2-14) then we expect to recover the spectrum of f(T) from the knowledge of σ(T) and the analytic behavior of f. This is the content of the following theorem, known as the spectral mapping theorem. It is the analog in this context of Theorem I 4-8.
Theorem 2-7 If σ(T) ⊂ Ω and f ∈ A(Ω) then σ(f(T)) = f(σ(T)).

PROOF It suffices to show that f(T) is invertible if and only if f(λ) is different from zero on σ(T). Assume f(λ) ≠ 0 for all λ ∈ σ(T); then h(λ) = 1/f(λ) is analytic in a neighborhood Ω₁ of σ(T). By the multiplicativity property of the map f → f(T) it follows that h(T)f(T) = I. Conversely assume λ ∈ σ(T) and f(λ) = 0. Define g by g(μ) = f(μ)/(λ − μ); then g ∈ A(Ω) and f(T) = (λI − T)g(T). If f(T) were invertible with inverse S then so would be λI − T, with inverse g(T)S, in contradiction to the assumption λ ∈ σ(T).
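For a polynomial f the spectral mapping theorem can be checked directly in finite dimensions, where f(T) can be evaluated by matrix powers rather than by the contour integral (2-14). A numpy sketch with an illustrative polynomial and random matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((4, 4))

def p(z):
    # an entire function, so (2-15) applies and p(T) is the matrix polynomial
    return z**3 - 2.0 * z + 1.0

I = np.eye(4)
pT = np.linalg.matrix_power(T, 3) - 2.0 * T + I

evT = np.linalg.eigvals(T)
evpT = np.linalg.eigvals(pT)

# spectral mapping: every p(mu), mu in sigma(T), is an eigenvalue of p(T)
for mu in evT:
    assert np.min(np.abs(evpT - p(mu))) < 1e-8
```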
3. UNBOUNDED OPERATORS

Unbounded operators will be encountered frequently in the sequel, especially as infinitesimal generators of semigroups. Thus it will be convenient to review some of the basic facts concerning them.

Let T be a linear map whose domain of definition D_T is a linear manifold in a Hilbert space H₁, and whose range is included in a Hilbert space H₂. We define the graph of T as the set Γ(T) of all pairs {[x, Tx] | x ∈ D_T} in H₁ ⊕ H₂, the direct sum of H₁ and H₂. The operator T is called closed if its graph Γ(T) is a closed linear manifold, that is, a subspace, of H₁ ⊕ H₂. Equivalently stated, T is closed if for any sequence xₙ in D_T for which xₙ → x and Txₙ → y we have necessarily x ∈ D_T and y = Tx. Every bounded linear operator T from H₁ to H₂ is closed. The converse is not generally true; however, a closed linear operator whose domain of definition D_T is a Banach space is bounded. This is the content of the closed graph theorem.

If T is injective then T⁻¹ is defined on Range T by T⁻¹y = x where y = Tx. Clearly T⁻¹ is well defined in this case, linear, and closed whenever T is closed. We define the resolvent set ρ(T) and the resolvent function R(λ, T) for an unbounded operator just as for bounded ones. Thus λ ∈ ρ(T) if and only if R(λ, T) =
(λI − T)⁻¹ exists as a bounded operator, that is, R(λ, T) is bounded and

(λI − T)R(λ, T)x = x   for x ∈ H

and

R(λ, T)(λI − T)x = x   for x ∈ D_T
While R(λ, T) is analytic on ρ(T), the spectrum of T may be unbounded on the one hand or empty on the other.

We say that a linear operator S is an extension of the linear operator T, and write T ⊂ S, if D_T ⊂ D_S and Sx = Tx for all x ∈ D_T. Equivalently, S is an extension of T if and only if Γ(T) ⊂ Γ(S). A linear manifold Γ ⊂ H₁ ⊕ H₂ is a graph of a linear operator if and only if [0, y] ∈ Γ implies y = 0. If T is not closed, the closure of Γ(T) in H₁ ⊕ H₂ may fail to be a graph. If the closure of Γ(T) is a graph we say T is closable. In that case let T̄ be defined by Γ(T̄) = Γ(T)⁻ and we call T̄ the minimal closed extension of T.

Given a linear operator T with domain D_T in a Hilbert space H₁ and range in a Hilbert space H₂ we consider D, the set of vectors y ∈ H₂ for which φ_y(x) = (Tx, y) is a continuous linear functional on H₁. By the Riesz representation theorem there exists a vector y* such that (Tx, y) = (x, y*). Obviously y* is uniquely determined if and only if D_T is dense in H₁. Assuming that, we define T*, the adjoint
of T, by D_{T*} = D and T*y = y* for y ∈ D_{T*}. The study of the adjoint operator is facilitated by studying the related graph. To this end consider the map U : H ⊕ H → H ⊕ H defined by

$$U[x,y]=[iy,-ix]=i[y,-x] \qquad (3-1)$$

It is easily checked that U defined by (3-1) is unitary and satisfies also U² = I.
We call such an operator a conjugation. In terms of the conjugation U defined by (3-1) we can relate the graphs of T and T*.

Lemma 3-1 Let T be a densely defined operator in a Hilbert space H; then

$$\Gamma(T^*)=\{U\Gamma(T)\}^\perp \qquad (3-2)$$

the orthogonal complement taken in H ⊕ H.

PROOF Let [y, z] ∈ {UΓ(T)}⊥; then for all x ∈ D_T we have

0 = ([iTx, −ix], [y, z]) = i(Tx, y) − i(x, z)

or (Tx, y) = (x, z) for all x ∈ D_T. This means that y ∈ D_{T*} and T*y = z, or equivalently that [y, z] ∈ Γ(T*). The converse follows from the same calculation.
Corollary 3-2 Let T be a densely defined operator; then T* is closed. If T is also closed then T** = T.

PROOF By the previous lemma Γ(T*) is a closed subspace of H ⊕ H, so T* is closed. U, being unitary, satisfies U(M⊥) = (UM)⊥ for each subspace M.
Applying the previous lemma twice we have

$$\Gamma(T^{**})=\{U\Gamma(T^*)\}^\perp=\{U\{U\Gamma(T)\}^\perp\}^\perp=\{U^2\Gamma(T)\}^{\perp\perp}=\Gamma(T)^{\perp\perp}=\Gamma(T)^-$$

as U² = I. If T is closed Γ(T)⁻ = Γ(T) and T** = T.

A densely defined operator A will be called dissipative if Im(Ax, x) ≥ 0 for all x ∈ D_A. A is symmetric if Im(Ax, x) = 0 for all x ∈ D_A, which is equivalent to (Ax, y) = (x, Ay) for all x, y ∈ D_A. Stated another way, A is symmetric if and only if A ⊂ A*; in particular every symmetric operator is closable. A symmetric operator A is called self-adjoint if A = A*, so a self-adjoint operator is automatically closed. A symmetric operator A is said to be maximal symmetric if it has no proper symmetric extension; that is, if A ⊂ A₁ and A₁ symmetric then necessarily A = A₁.
Theorem 3-3 Let A be self-adjoint; then A is maximal symmetric.

PROOF Let A be self-adjoint and A₁ a symmetric extension. Since A ⊂ A₁ it follows from Lemma 3-1 that A₁* ⊂ A*, which together with the symmetry of A₁ yields

A ⊂ A₁ ⊂ A₁* ⊂ A* = A

and hence A = A₁ = A₁*.

Theorem 3-4 Let A be a dissipative, not necessarily densely defined, operator. Then the following statements are true.

(a) ‖(A − iI)x‖² ≤ ‖x‖² + ‖Ax‖² ≤ ‖(A + iI)x‖² for all x ∈ D_A.
(b) A + iI is injective.
(c) A is a closed operator if and only if Range(A + iI) is closed.

PROOF Let A be dissipative and x ∈ D_A; then

$$\|(A-iI)x\|^2=\|x\|^2+\|Ax\|^2-2\,\mathrm{Im}(Ax,x)\le\|x\|^2+\|Ax\|^2\le\|x\|^2+\|Ax\|^2+2\,\mathrm{Im}(Ax,x)=\|(A+iI)x\|^2$$

This proves (a), and (b) is an immediate consequence.
Assume now A is closed. Let (A + iI)xₙ be a Cauchy sequence in Range(A + iI). By (a) both xₙ and Axₙ are Cauchy sequences, converging to x and y, respectively. Since A is closed x ∈ D_A and y = Ax. Thus (A + iI)xₙ converges to (A + iI)x and Range(A + iI) is closed. Suppose conversely that Range(A + iI) is closed. By (a) we have ‖x‖² ≤ ‖(A + iI)x‖² for all
x ∈ D_A. Since A + iI is injective, (A + iI)⁻¹ is a well-defined contraction operator on Range(A + iI). Since the inverse of a bounded injective operator is closed it follows that A + iI, and hence also A, is closed.
If one considers operators, bounded or unbounded, on a Hilbert space H to be generalizations of complex numbers then one finds that many simple results concerning complex numbers have nontrivial analogs in the operator theoretic context. One of the most striking and useful results in this vein is the Cayley transform. It is well known that

$$w=\frac{z-i}{z+i}$$

is a fractional linear transformation (Moebius transformation) mapping the upper half plane bijectively onto the unit disc, and the real line onto the unit circle with the point 1 deleted. Consider now dissipative operators as generalizations of the complex numbers lying in the closed upper half plane {z | Im z ≥ 0}. With each dissipative operator A we can associate a linear operator in the following manner. Let us define T by
$$Tx=(A-iI)(A+iI)^{-1}x \qquad (3-3)$$

for all x ∈ Range(A + iI). T is well defined on Range(A + iI) as A + iI is injective. From Theorem 3-4(a) it follows that T is a contraction. We call T the Cayley transform of A. Since

$$Tx=(A+iI-2iI)(A+iI)^{-1}x=x-2i(A+iI)^{-1}x$$

it follows that

$$(T-I)x=-2i(A+iI)^{-1}x \qquad (3-4)$$

and therefore T − I is invertible on Range(A + iI). We can recover A from T as follows. From (3-4) it follows that

$$2(T-I)^{-1}=i(A+iI)=iA-I$$

or

$$A=-i(I+2(T-I)^{-1})=-i(T+I)(T-I)^{-1}$$

or

$$A=i(I+T)(I-T)^{-1} \qquad (3-5)$$
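The transform (3-3) and its inverse (3-5) can be exercised on matrices. The sketch below (assuming numpy; the particular construction of a dissipative A, with symmetric real part and positive semidefinite imaginary part, is one illustrative choice) checks that T is a contraction and that A is recovered exactly:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3
S = rng.standard_normal((n, n))
S = (S + S.T) / 2                      # real symmetric: contributes Im(Sx, x) = 0
B = rng.standard_normal((n, n))
A = S + 1j * (B @ B.T)                 # Im(Ax, x) = (B B^T x, x) >= 0: A is dissipative

I = np.eye(n)
T = (A - 1j * I) @ np.linalg.inv(A + 1j * I)   # Cayley transform (3-3)

# T is a contraction ...
assert np.linalg.norm(T, 2) <= 1 + 1e-10
# ... and the inverse transform (3-5) recovers A
A_back = 1j * (I + T) @ np.linalg.inv(I - T)
assert np.allclose(A, A_back)
```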
Lemma 3-5 Every densely defined dissipative operator in H has a maximal dissipative extension which is necessarily closed.

PROOF Let T be the Cayley transform of A. Clearly from (3-3) and (3-5) A has a proper dissipative extension if and only if T has a proper contractive extension. Contractions, however, are easily extended to all of H. For given T which is contractive on D_T we can extend it by continuity to the closure of D_T and define the extension T₁ to be zero on the orthogonal complement of that closure. Next we show that 1 is not an eigenvalue of T₁. If 1 is an eigenvalue of T₁ with an eigenvector y then we have also, by Lemma 2-5, that T₁*y = y. Let x be an arbitrary vector in D_A; then

(y, (A + iI)x) = (T₁*y, (A + iI)x) = (y, T₁(A + iI)x) = (y, T(A + iI)x) = (y, (A − iI)x)

This equality implies (y, x) = 0 for all x in D_A, and since we assumed A to be densely defined, y = 0 and 1 is not an eigenvalue of T₁. Define now A₁ to be the inverse Cayley transform of T₁; then A₁ is a maximal dissipative extension of A. A₁ is closed by Theorem 3-4(c) as
D_{T₁} = H = Range(A₁ + iI)

The following theorem characterizes maximal dissipative operators.

Theorem 3-6 Let A be a densely defined linear operator in H. The following statements are equivalent.

(a) A is maximal dissipative.
(b) A is dissipative and Range(A + iI) = H.
(c) A = i(I + T)(I − T)⁻¹ for some everywhere defined contraction T for which 1 ∉ σₚ(T).

PROOF Let T be the Cayley transform of A. From the previous lemma it is clear that A has a proper dissipative extension if and only if D_T ≠ H. However, the domain of T is equal to the range of A + iI. Thus (a) and (b) are equivalent. From the discussion preceding Lemma 3-5 it follows that the Cayley transform of a dissipative operator is contractive. Also from (3-4) it follows that I − T is invertible on Range(A + iI). Thus if A is maximal dissipative then Range(A + iI) = H and I − T is injective, or 1 ∉ σₚ(T). So (a) implies (c). Finally assume A is given by (3-5) and 1 ∉ σₚ(T). As all vectors in D_A are of the form y = (I − T)x we have

$$\mathrm{Im}(Ay,y)=\mathrm{Im}\bigl(i((I+T)x,(I-T)x)\bigr)=\mathrm{Re}((I+T)x,(I-T)x)=\|x\|^2-\|Tx\|^2\ge 0$$

Thus A is dissipative, and maximality follows from the fact that T is everywhere defined.
Since symmetric operators are automatically dissipative all the previous results hold true for this class of operators. However, since A is symmetric together with −A, the results can be sharpened.

Theorem 3-7 Let A be a not necessarily densely defined symmetric operator. Then the following statements are true.
(a) ‖(A − iI)x‖² = ‖x‖² + ‖Ax‖² = ‖(A + iI)x‖² for all x ∈ D_A.
(b) A is closed if and only if Range(A ± iI) is closed.
(c) Either Range(A + iI) = H or Range(A − iI) = H implies A is maximal symmetric.
(d) The Cayley transform V of a closed symmetric operator A is isometric if and only if Range(A + iI) = H, coisometric if and only if Range(A − iI) = H, and unitary if and only if Range(A ± iI) = H.
Corollary 3-8 A symmetric operator A has a self-adjoint extension if and only if the codimensions of Range(A + iI) and Range(A − iI) are equal.

A class of operators closely related to dissipative operators is the class of accretive operators. A densely defined operator A is accretive if Re(Ax, x) ≤ 0 for all x ∈ D_A. Clearly A is accretive if and only if −iA is dissipative. Thus all results concerning dissipative operators can be easily translated to the case of accretive operators. We have, for example, the following theorem.

Theorem 3-9 Let A be a densely defined linear operator in H. The following statements are equivalent.

(a) A is maximal accretive.
(b) A is accretive and Range(A − I) = H.
(c) A = (T + I)(T − I)⁻¹ for some everywhere defined contraction T for which 1 ∉ σₚ(T).

The maximal accretive operators can be characterized through the resolvent function.
Theorem 3-10 A closed densely defined linear operator A in a Hilbert space H is maximal accretive if and only if

    ‖R(λ, A)‖ ≤ λ⁻¹   for all   λ > 0    (3-6)
PROOF Assume (3-6) holds for all λ > 0, that is, ‖λR(λ, A)y‖ ≤ ‖y‖ for all y. Since (λI − A)D_A = H we have y = (λI − A)x for some x ∈ D_A. So ‖λx‖² ≤ ‖(λI − A)x‖² for all x ∈ D_A. Expanding, we have for λ > 0

    λ²‖x‖² ≤ λ²‖x‖² − 2λ Re(Ax, x) + ‖Ax‖²

from which the inequality

    Re(Ax, x) ≤ (1/2λ)‖Ax‖²

follows. Letting λ → ∞ we have Re(Ax, x) ≤ 0, that is, A is accretive. Since (A − I)D_A = H, A is maximal accretive by Theorem 3-9.
Conversely let A be maximal accretive. Then, as Re(Ax, x) ≤ 0, we have

    λ²‖x‖² ≤ λ²‖x‖² + ‖Ax‖² − 2λ Re(Ax, x) = ‖(λI − A)x‖²

for all λ > 0 and x ∈ D_A. Since A is maximal, (λI − A)D_A = H, hence for all y ∈ H we have λ²‖R(λ, A)y‖² ≤ ‖y‖², which is equivalent to (3-6).
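The resolvent bound (3-6) can be observed directly on matrices. In the sketch below (an illustration with arbitrary random data, not from the text) the book's sign convention Re(Ax, x) ≤ 0 is built in by taking A = −P + S with P positive semidefinite Hermitian and S skew-Hermitian; the code then verifies ‖R(λ, A)‖ ≤ λ⁻¹ for several λ > 0:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P = B.conj().T @ B                 # positive semidefinite: Re (Ax, x) = -(Px, x) <= 0
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = (S - S.conj().T) / 2           # skew-Hermitian: contributes nothing to Re (Ax, x)
A = -P + S                         # accretive in the book's sign convention

I = np.eye(n)
for lam in [0.1, 1.0, 10.0, 100.0]:
    R = np.linalg.inv(lam * I - A)             # resolvent R(lam, A)
    assert np.linalg.norm(R, 2) <= 1.0 / lam + 1e-12
```

The bound follows from the inequality ‖(λI − A)x‖² ≥ λ²‖x‖² derived in the proof above.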
4. REPRESENTATION THEOREMS

Representation theorems for certain classes of analytic functions have been instrumental in the development of functional analysis in general and of the theory of self-adjoint operators in Hilbert space in particular. As we shall see later, many of the classical representation theorems have direct interpretations as solutions of special realization problems. While the current approach to the spectral theorem via the theory of Banach algebras, the Gelfand transform, and the Gelfand-Naimark theorem is unparalleled in power and elegance, we will take the classical approach which, though less general, is much closer to the application-oriented reader. Our starting point for the development will be the representation theorem of Herglotz.
A (complex) function u in a domain (open connected set) Ω will be called harmonic if it satisfies the Laplace equation

    ∂²u/∂x² + ∂²u/∂y² = 0
analytic function f its complex conjugate f̄, as well as its real and imaginary parts, are harmonic. In particular zⁿ = rⁿe^{inθ} and z̄ⁿ = rⁿe^{−inθ} are harmonic for n ≥ 0, that is, the functions r^{|n|}e^{inθ} are harmonic for all n ∈ ℤ. It is natural to expect that limits, in a sufficiently strong topology, of harmonic functions will be harmonic. This is true for uniform convergence on compact subsets of the unit disc D. An important special case is given by
    P_r(θ) = Σ_{n=−∞}^{∞} r^{|n|} e^{inθ}    (4-1)

which is harmonic in D. This can also be verified directly since

    P_r(θ) = Re[(1 + z)/(1 − z)] = Re[(1 + re^{iθ})/(1 − re^{iθ})] = (1 − r²)/(1 − 2r cos θ + r²)

We call P_r(θ) the Poisson kernel.
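The identity between the series (4-1) and the closed form is easy to confirm numerically. The following sketch (parameters chosen arbitrarily, for illustration only) compares a long truncation of Σ r^{|n|}e^{inθ} with (1 − r²)/(1 − 2r cos θ + r²), and also checks the normalization (1/2π)∫P_r(θ) dθ = 1:

```python
import numpy as np

def poisson_closed(r, theta):
    return (1 - r**2) / (1 - 2 * r * np.cos(theta) + r**2)

def poisson_series(r, theta, N=2000):
    n = np.arange(-N, N + 1)
    return np.real(np.sum(r ** np.abs(n) * np.exp(1j * n * theta)))

for r in [0.3, 0.7, 0.95]:
    for th in np.linspace(-np.pi, np.pi, 7):
        assert abs(poisson_closed(r, th) - poisson_series(r, th)) < 1e-8

# normalization: (1/2π) ∫ P_r(θ) dθ = 1, computed as a mean over one period
ts = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
assert abs(poisson_closed(0.7, ts).mean() - 1.0) < 1e-10
```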
We note the following important properties of the Poisson kernel:

    P_r(θ) ≥ 0,   (1/2π) ∫_{−π}^{π} P_r(θ) dθ = 1,   P_r(t) ≤ P_r(δ)  for  δ ≤ |t| ≤ π    (4-2)

and

    lim_{r→1} P_r(δ) = 0   for each   0 < δ ≤ π
Thus the Poisson kernel is a summability kernel or, alternatively, an approximate identity. The latter name is derived from its role in the commutative Banach algebra L¹(𝕋), where multiplication is defined by convolution. Summability kernels play an important role in the derivation of inversion formulas for Fourier series and Fourier transforms. We will return to them in the study of the Fourier transform in Sec. 10. Denote by C(𝕋) the space of all continuous functions on the unit circle.
Theorem 4-1 Let u ∈ C(𝕋). Then the function u_r(e^{iθ}) = u(re^{iθ}) defined by

    u(re^{iθ}) = (1/2π) ∫_{−π}^{π} P_r(θ − t) u(e^{it}) dt    (4-3)

is harmonic in D and u_r converges uniformly to u.

PROOF Without loss of generality we assume u is real valued. u(re^{iθ}) is harmonic as the real part of the analytic function

    (1/2π) ∫_{−π}^{π} [(e^{it} + z)/(e^{it} − z)] u(e^{it}) dt
Now

    |u_r(e^{iθ}) − u(e^{iθ})| = |(1/2π) ∫_{−π}^{π} P_r(t)(u(e^{i(θ−t)}) − u(e^{iθ})) dt|
        ≤ (1/2π) ∫_{|t|≤δ} P_r(t)|u(e^{i(θ−t)}) − u(e^{iθ})| dt + (1/2π) ∫_{δ≤|t|≤π} P_r(t)|u(e^{i(θ−t)}) − u(e^{iθ})| dt

Given ε > 0 we can, by the uniform continuity of u on the unit circle, choose δ > 0 so that |u(e^{i(θ−t)}) − u(e^{iθ})| < ε for |t| < δ and all θ. Thus the first integral is bounded by ε. The second integral can be majorized by 2‖u‖_∞ max_{δ≤|t|≤π} P_r(t), where ‖u‖_∞ is the sup norm of u. Since lim_{r→1} max_{δ≤|t|≤π} P_r(t) = 0 the result follows.
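The uniform convergence asserted in Theorem 4-1 can be watched numerically. The sketch below (boundary data and grid sizes chosen arbitrarily, for illustration) evaluates the Poisson integral (4-3) by the periodic trapezoid rule for the boundary function u(e^{it}) = cos 2t, whose harmonic extension is r² cos 2θ:

```python
import numpy as np

def poisson(r, x):
    return (1 - r**2) / (1 - 2 * r * np.cos(x) + r**2)

u = lambda t: np.cos(2 * t)            # boundary data; its harmonic extension is r^2 cos 2θ
ts = np.linspace(-np.pi, np.pi, 4096, endpoint=False)

def u_r(r, theta):                     # (1/2π) ∫ P_r(θ − t) u(e^{it}) dt
    return np.mean(poisson(r, theta - ts) * u(ts))

for r in [0.2, 0.5, 0.9]:
    for theta in [0.0, 1.0, 2.5]:
        assert abs(u_r(r, theta) - r**2 * np.cos(2 * theta)) < 1e-6

# uniform convergence u_r -> u as r -> 1 (here at rate 1 - r^2)
err = max(abs(u_r(0.99, th) - u(th)) for th in np.linspace(0, np.pi, 9))
assert err < 0.03
```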
Theorem 4-1 thus provides a solution to the Dirichlet problem for the unit disc, that of finding a harmonic function in the open disc having preassigned boundary values. The uniqueness of the solution to the Dirichlet problem is a consequence of the following theorem.
Theorem 4-2 Let u be continuous in the closed unit disc and harmonic in the open disc D. If u(e^{it}) = 0 on the boundary then u(re^{it}) is identically equal to zero in D.
PROOF Without loss of generality we assume u to be real valued. Suppose for some point z₀ of D we had u(z₀) > 0. Choose ε so that 0 < ε < u(z₀) and define v by v(z) = u(z) + ε|z|². Then v(z₀) > ε and v(z) = ε on 𝕋. It follows that v has a local maximum at an interior point z₁ of D. But at a local maximum we must have, since v is twice differentiable, v_xx ≤ 0 and v_yy ≤ 0. An easy computation yields, however, v_xx + v_yy = 4ε > 0, since u is harmonic. Thus u(z) > 0 is impossible. Analogously u(z) < 0 is impossible. So necessarily u(z) = 0 for all z ∈ D.
The Poisson integral can be extended to a much larger space. Specifically, let M(𝕋) be the set of all finite complex Borel measures on 𝕋, which, by the Riesz representation theorem, is just the dual of C(𝕋). For every measure μ in M(𝕋) we define

    u(re^{iθ}) = u_r(e^{iθ}) = ∫_{−π}^{π} P_r(θ − t) dμ(t)    (4-4)

By the same reasoning as before u is harmonic in D, and it is a simple consequence of Theorem 4-1 that the measures (1/2π)u_r dt converge to μ in the w*-topology of M(𝕋). We say a measure μ in M(𝕋) is positive if for each Borel subset σ of 𝕋 we have μ(σ) ≥ 0. This in turn is equivalent to ∫ f dμ ≥ 0 for each nonnegative function f in C(𝕋). The Poisson integrals of positive measures are the key to several important representation theorems. The first is due to Herglotz.
Theorem 4-3 (Herglotz) A function u in D is harmonic and nonnegative if and only if it is the Poisson integral of a positive measure μ in M(𝕋). The measure μ is uniquely determined.
PROOF If μ ∈ M(𝕋) is positive and u is defined by (4-4) then clearly u is positive harmonic. To prove the converse let u ≥ 0 be harmonic and put u_r(e^{iθ}) = u(re^{iθ}). Since

    (1/2π) ∫_{−π}^{π} |u(re^{iθ})| dθ = (1/2π) ∫_{−π}^{π} u(re^{iθ}) dθ = u(0)

it follows that the L¹ norms of the u_r are uniformly bounded. Thus we have also the uniform boundedness in M(𝕋) of the absolutely continuous measures (1/2π)u_r(e^{iθ}) dθ. But as M(𝕋) = C(𝕋)* we have, applying the Banach-Alaoglu theorem [29], that every closed ball in M(𝕋) is w*-compact. Thus there exists a measure μ ∈ M(𝕋) and a sequence r_j → 1 such that (1/2π)u_{r_j} dt → μ in the
w*-topology of M(𝕋). In particular for each nonnegative g in C(𝕋) we have

    ∫ g dμ = lim_{j→∞} (1/2π) ∫_{−π}^{π} g(e^{iθ}) u_{r_j}(e^{iθ}) dθ ≥ 0

and, since the integrands on the right are nonnegative, it follows that μ is positive. To complete the proof we note that

    ∫_{−π}^{π} P_r(θ − t) dμ(t) = lim_{j→∞} (1/2π) ∫_{−π}^{π} P_r(θ − t) u_{r_j}(e^{it}) dt = lim_{j→∞} u(r_j re^{iθ}) = u(re^{iθ})

which is the required representation.
To prove the uniqueness part suppose a harmonic function u has two representations of the form (4-4) corresponding to the measures μ₁ and μ₂. Let ν = μ₁ − μ₂; then 0 = ν_r(e^{iθ}) = ∫ P_r(θ − t) dν(t). Since the measures (1/2π)ν_r dt = 0 converge to ν in the w*-topology we have in particular

    ∫ e^{−int} dν = lim_{r→1} (1/2π) ∫ e^{−int} ν_r(e^{it}) dt = 0

so ∫ p(e^{it}) dν = 0 for every trigonometric polynomial p and hence for all functions in C(𝕋). Thus ν = 0 and μ₁ = μ₂. This completes the proof.

The next result is also associated with Herglotz's name.
Theorem 4-4 A function f analytic in D has a nonnegative real part if and only if it has a representation of the form

    f(z) = iβ + ∫_{−π}^{π} [(e^{it} + z)/(e^{it} − z)] dμ(t)    (4-5)

for some positive measure μ ∈ M(𝕋) and real number β. The measure μ in (4-5) is uniquely determined.
PROOF If f has such a representation then

    Re f = Re ∫ [(e^{it} + z)/(e^{it} − z)] dμ = ∫ P_r(θ − t) dμ ≥ 0

Conversely assume u = Re f is nonnegative. Since u is harmonic the previous theorem guarantees the existence of a positive measure μ ∈ M(𝕋) such that u(re^{iθ}) = ∫_{−π}^{π} P_r(θ − t) dμ. Define g by

    g(z) = ∫_{−π}^{π} [(e^{it} + z)/(e^{it} − z)] dμ

Clearly f − g is analytic in D and its real part is zero. Thus necessarily f − g is a constant which can be taken to be purely imaginary. The uniqueness part follows from the uniqueness part of Theorem 4-3.
Since a function analytic in the unit disc is completely determined by the coefficients of its Taylor expansion at the origin, it is natural to expect a characterization of the class of functions analytic in D and having positive real part in terms of the Taylor coefficients. In doing this we make contact with positive definiteness and moment problems. We say a sequence {c_n}_{n=−∞}^{∞} is positive definite if for each finite set ξ₀, ..., ξ_n of complex numbers we have

    Σ_{i=0}^{n} Σ_{j=0}^{n} c_{i−j} ξ_i ξ̄_j ≥ 0    (4-6)
Theorem 4-5 Let f(z) = c + Σ_{n=1}^{∞} c_n zⁿ be an analytic function in D and let c₀ = 2 Re c, c_{−n} = c̄_n for n > 0. Then f has positive real part if and only if the sequence {c_n}_{n=−∞}^{∞} is positive definite.
PROOF Assume f has positive real part. By Theorem 4-4

    f(z) = iβ + ∫_{−π}^{π} [(e^{it} + z)/(e^{it} − z)] dμ′

for some positive measure μ′. By expanding the kernel of the integral, and using the absolute and uniform convergence of the related series on closed subsets of D, we obtain

    f(z) = iβ + ∫ dμ′ + 2 Σ_{n=1}^{∞} (∫_{−π}^{π} e^{−int} dμ′) zⁿ

If we define μ by μ = 2μ′ we readily obtain

    c_n = ∫_{−π}^{π} e^{−int} dμ   for all   n ∈ ℤ

Now let ξ₀, ..., ξ_n be any finite set of complex numbers; then

    Σ_{i=0}^{n} Σ_{j=0}^{n} c_{i−j} ξ_i ξ̄_j = ∫_{−π}^{π} |Σ_{k=0}^{n} ξ_k e^{−ikt}|² dμ ≥ 0    (4-8)

and hence the sequence {c_n}_{n=−∞}^{∞} is positive definite.

Conversely assume {c_n}_{n=−∞}^{∞} is a positive definite sequence. Choosing ξ₀ = 1 and all other ξ_k zero we have from (4-6) that c₀ ≥ 0. Choosing ξ₀ = 1, ξ₁ = ⋯ = ξ_{k−1} = 0 and ξ_k arbitrary we have from (4-6) that c₀(1 + |ξ_k|²) + c_{−k}ξ_k + c_k ξ̄_k ≥ 0, so that c_{−k}ξ_k + c_k ξ̄_k is real for all complex ξ_k. Since c_k ξ̄_k + c̄_k ξ_k is also real we have necessarily c_{−k} = c̄_k. Finally, applying (4-6) again with ξ₀ = 1 and ξ_k = ρe^{iφ}, the choice φ = arg c_k implies c₀ρ² + 2|c_k|ρ + c₀ ≥ 0 for all real ρ. The discriminant of this quadratic has to be nonpositive and we obtain |c_k| ≤ c₀ for all k.

Define now a function f in D by f(z) = (1/2)c₀ + Σ_{n=1}^{∞} c_n zⁿ. The inequality |c_n| ≤ c₀ implies the analyticity of f in D. We will show that f has positive
real part. Indeed for |z| < 1

    2 Re f(z)/(1 − |z|²) = (f(z) + f̄(z)) Σ_{m=0}^{∞} z^m z̄^m
        = (Σ_{k=1}^{∞} c_k z^k + Σ_{k=0}^{∞} c_{−k} z̄^k) Σ_{m=0}^{∞} z^m z̄^m
        = Σ_{m=0}^{∞} Σ_{k=1}^{∞} c_k z^{k+m} z̄^m + Σ_{m=0}^{∞} Σ_{k=0}^{∞} c_{−k} z^m z̄^{k+m}
        = Σ_{m=0}^{∞} Σ_{n=m+1}^{∞} c_{n−m} z^n z̄^m + Σ_{m=0}^{∞} Σ_{n=m}^{∞} c_{m−n} z^m z̄^n
        = Σ_{n=0}^{∞} Σ_{m=0}^{∞} c_{n−m} z^n z̄^m = lim_{N→∞} Σ_{n=0}^{N} Σ_{m=0}^{N} c_{n−m} z^n z̄^m ≥ 0

and the proof is complete.
The proof of Theorem 4-5 actually contains a solution of the trigonometric moment problem which is to characterize the Fourier coefficients of positive measures.
Theorem 4-6 Given a doubly infinite sequence {c_n}_{n=−∞}^{∞} of complex numbers, there exists a positive Borel measure μ on 𝕋 such that

    c_n = ∫_{−π}^{π} e^{−int} dμ   for all   n ∈ ℤ    (4-9)

if and only if {c_n}_{n=−∞}^{∞} is a positive definite sequence. The measure μ is uniquely determined.
PROOF Assume μ is a positive measure and (4-9) holds; then the sequence {c_n}_{n=−∞}^{∞} is positive definite by (4-8). Conversely if {c_n}_{n=−∞}^{∞} is a positive definite sequence then by Theorem 4-5 the function f(z) = (1/2)c₀ + Σ_{n=1}^{∞} c_n zⁿ is analytic in D, has positive real part and Im f(0) = 0. Thus it has the representation

    f(z) = (1/2) ∫_{−π}^{π} [(e^{it} + z)/(e^{it} − z)] dμ

for some positive measure μ. This representation clearly implies (4-9). The uniqueness part follows, by the use of the Poisson integral, from the w*-convergence of P_r ∗ μ to μ.
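Theorem 4-6 says that moments of a positive measure form a positive definite sequence, equivalently that every finite Toeplitz matrix (c_{j−k}) is positive semidefinite. A small numerical sketch (discrete measure and sizes chosen arbitrarily, for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.random(6)                      # positive weights: μ = Σ w_j δ_{t_j} on the circle
t = rng.uniform(-np.pi, np.pi, 6)

def c(n):                              # moments c_n = ∫ e^{-int} dμ
    return np.sum(w * np.exp(-1j * n * t))

N = 8
C = np.array([[c(j - k) for k in range(N)] for j in range(N)])   # Toeplitz form c_{j-k}
assert np.allclose(C, C.conj().T)      # Hermitian, since c_{-n} = conj(c_n)
evals = np.linalg.eigvalsh(C)
assert evals.min() >= -1e-10           # the quadratic form (4-6) is nonnegative
```

The nonnegativity is exactly (4-8): the quadratic form equals ∫ |Σ ξ_k e^{−ikt}|² dμ ≥ 0.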
There is an alternative way of introducing positive definiteness. Given a sequence {c_k}_{k=−∞}^{∞} we define a functional C on the set of all trigonometric polynomials γ(e^{it}) = Σ_{k=−N}^{N} η_k e^{ikt}, N ≥ 0, by

    Cγ = Σ_{k=−N}^{N} c_k η_k    (4-10)

The functional C is called a positive functional if Cγ ≥ 0 whenever γ(e^{it}) = Σ η_k e^{ikt} ≥ 0. It is not surprising that the two notions of positivity coincide. Indeed we have the following theorem.

Theorem 4-7 A necessary and sufficient condition for a sequence {c_k}_{k=−∞}^{∞} to be positive definite is that the functional C defined by (4-10) be a positive functional.
PROOF To prove necessity assume {c_k}_{k=−∞}^{∞} is a positive definite sequence. By Theorem 4-6, c_k = ∫ e^{−ikt} dμ for some positive Borel measure μ. Let γ(e^{it}) = Σ_{k=−n}^{n} η_k e^{ikt} be a positive trigonometric polynomial; then so is γ(e^{−it}) = Σ_{k=−n}^{n} η_k e^{−ikt}. Hence

    Cγ = Σ_{k=−n}^{n} c_k η_k = Σ_{k=−n}^{n} η_k ∫ e^{−ikt} dμ = ∫ Σ_{k=−n}^{n} η_k e^{−ikt} dμ ≥ 0

and C is a positive functional.
Conversely assume C is a positive functional and let ξ₀, ..., ξ_n be any set of complex numbers. Define a trigonometric polynomial γ by γ(e^{it}) = |p(e^{it})|² where p is the analytic polynomial p(e^{it}) = Σ_{k=0}^{n} ξ_k e^{ikt}. From this we get the following representation for γ:

    γ(e^{it}) = |Σ_{k=0}^{n} ξ_k e^{ikt}|² = (Σ_{k=0}^{n} ξ_k e^{ikt})(Σ_{l=0}^{n} ξ̄_l e^{−ilt}) = Σ_{k=0}^{n} Σ_{l=0}^{n} ξ_k ξ̄_l e^{i(k−l)t} = Σ_{ν=−n}^{n} η_ν e^{iνt}

where

    η_ν = Σ_{k=ν}^{n} ξ_k ξ̄_{k−ν}   for   ν ≥ 0    (4-11)

and η_ν = η̄_{−ν} for ν < 0. Since C is a positive functional it follows that Σ_{ν=−n}^{n} c_ν η_ν ≥ 0. But

    Σ_{ν=−n}^{n} c_ν η_ν = Σ_{k=0}^{n} Σ_{l=0}^{n} c_{k−l} ξ_k ξ̄_l

which proves the positive definiteness of the sequence {c_k}_{k=−∞}^{∞}.
The necessity part could be proved without recourse to the solution of the trigonometric moment problem by applying a factorization theorem of Fejér and Riesz which is of interest in itself.
Theorem 4-8 A necessary and sufficient condition for a trigonometric polynomial γ(e^{it}) = Σ_{k=−n}^{n} η_k e^{ikt} to have a representation

    γ(e^{it}) = |p(e^{it})|²    (4-12)
for some analytic polynomial p(e^{it}) = Σ_{k=0}^{n} ξ_k e^{ikt} is that γ(e^{it}) ≥ 0. In the factorization (4-12) of a nonnegative trigonometric polynomial γ we may choose p to have no zeros in the open unit disc.

PROOF The necessity part is trivial. So we assume γ is nonnegative and therefore

    η_{−k} = η̄_k   for   k = 0, ..., n    (4-13)
For simplicity assume η_n ≠ 0. Define now a polynomial Y by Y(z) = Σ_{k=−n}^{n} η_k z^{n+k}, which is of degree 2n. In C_{2n}[z], the space of all polynomials of degree ≤ 2n, we introduce an involution by defining q*(z) = z^{2n} conj(q(1/z̄)). Condition (4-13) guarantees that Y* = Y. But from this we infer immediately that if α₁ is a zero of Y so is ᾱ₁⁻¹. Now h₁(z) = (z − α₁)(1 − ᾱ₁z) is a polynomial with zeros at α₁ and ᾱ₁⁻¹ which satisfies h₁* = h₁ and for which z̄h₁(z) is nonnegative on 𝕋. Since h₁ divides Y we have Y = h₁Y₁, and Y₁ satisfies Y₁* = Y₁ and z̄^{n−1}Y₁(z) is nonnegative on 𝕋. Repeating this process we can remove all zeros of Y which lie off the unit circle and write

    Y(z) = {Π_{j=1}^{s} (z − α_j)(1 − ᾱ_jz)} X(z)

where X = X* is a polynomial all of whose zeros lie on the unit circle and for which z̄^{n−s}X(z) is nonnegative there. Next we show that every zero of Y on the unit circle is of even order. If we let y(z) = Σ_{k=−n}^{n} η_k z^k then Y(z) = zⁿy(z). For a positive number a, y(z) + a = [Y(z) + azⁿ]/zⁿ, which shows that on 𝕋, y(z) + a and Y(z) + azⁿ have the same zeros, but y(z) + a has none. Since Y(z) + azⁿ is invariant with respect to the previously defined involution its zeros come in pairs of numbers conjugate with respect to the unit circle. Choosing a small enough, an application of a theorem of Hurwitz shows that the zeros of Y on 𝕋 are of even order. Now for β of absolute value one the polynomial g defined by g(z) = (z − β)(1 − β̄z) = −β̄(z − β)² satisfies g* = g and z̄g(z) ≥ 0 on 𝕋. Everything put together shows that Y has the representation

    Y(z) = A{Π_{j=1}^{s} (z − α_j)(1 − ᾱ_jz)}{Π_{j=1}^{n−s} (z − β_j)(1 − β̄_jz)}    (4-14)

for some positive number A, α_j in D and β_j of absolute value one. Define a polynomial p by

    p(z) = A^{1/2}{Π_{j=1}^{s} (1 − ᾱ_jz)}{Π_{j=1}^{n−s} (1 − β̄_jz)}    (4-15)

Then a simple calculation yields (4-12) and moreover p has no zeros in D. Clearly we could have chosen p instead to have no zeros outside the closed unit disc.
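The zero-pairing idea of the proof translates directly into a numerical spectral factorization. The sketch below is an illustration, not the book's inductive argument: it generates a nonnegative trigonometric polynomial as |p₀|², forms Y(z), splits its roots into the pairs (α, 1/ᾱ), and rebuilds a factor from the roots inside D (the alternative choice, with no zeros outside the closed disc, mentioned at the end of the proof). The example polynomial is chosen so that no zeros fall on the circle:

```python
import numpy as np

xi = np.array([1.0, -0.4 + 0.3j, 0.2 - 0.1j])    # an arbitrary analytic polynomial p0
n = len(xi) - 1

# η_k = Σ_j ξ_{j+k} conj(ξ_j): coefficients of γ(e^{it}) = |p0(e^{it})|², as in (4-11)
eta = np.array([sum(xi[j + k] * np.conj(xi[j])
                    for j in range(n + 1) if 0 <= j + k <= n)
                for k in range(-n, n + 1)])

def gamma(t):
    return np.real(sum(eta[k + n] * np.exp(1j * k * t) for k in range(-n, n + 1)))

# Zeros of Y(z) = Σ_k η_k z^{n+k} pair up as (α, 1/conj(α)); keep those in D
roots = np.roots(eta[::-1])                      # np.roots wants coefficients high-to-low
inside = roots[np.abs(roots) < 1]
q = np.poly(inside)                              # monic polynomial with the inside zeros

def absq2(t):
    return abs(np.polyval(q, np.exp(1j * t))) ** 2

scale = gamma(0.0) / absq2(0.0)                  # the positive constant A of (4-14)
ts = np.linspace(-np.pi, np.pi, 200)
assert scale > 0
assert np.allclose([gamma(t) for t in ts], [scale * absq2(t) for t in ts], atol=1e-8)
```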
We give now another proof of the necessity part of Theorem 4-7. Let {c_k}_{k=−∞}^{∞} be a positive definite sequence and let γ(e^{it}) = Σ_{ν=−n}^{n} η_ν e^{iνt} be a positive trigonometric polynomial. By the Fejér-Riesz theorem γ(e^{it}) = |p(e^{it})|² for some p(e^{it}) = Σ_{k=0}^{n} ξ_k e^{ikt}, and η_ν is given by (4-11). So

    Cγ = Σ_{ν=−n}^{n} c_ν η_ν = Σ_{k=0}^{n} Σ_{l=0}^{n} c_{k−l} ξ_k ξ̄_l ≥ 0

and C is a positive functional.
Theorem 4-5 can be used as a starting point for obtaining other related representation theorems. The following one is due to Nevanlinna.
Theorem 4-9 A function F defined in the open upper half plane Π⁺ = {w | Im w > 0} admits a representation

    F(w) = α + γw + ∫_{−∞}^{∞} [(1 + tw)/(t − w)] dσ(t)    (4-16)

where α and γ ≥ 0 are real constants and σ a finite positive Borel measure on ℝ, if and only if it is analytic in Π⁺ and has a nonnegative imaginary part. The measure σ is uniquely determined by F.
PROOF Assume F is analytic in Π⁺ and has a nonnegative imaginary part. The fractional linear transformation w = i[(1 + z)/(1 − z)] maps the unit disc D onto Π⁺. Define a function f on D by f(z) = −iF{i[(1 + z)/(1 − z)]}. Then f is analytic in D and has a nonnegative real part. By Theorem 4-4, f(z) = iβ + ∫ [(e^{it} + z)/(e^{it} − z)] dμ for some real number β and a positive measure μ on 𝕋. Let γ = μ({1}) and define μ′ on 𝕋 by μ′ = μ − γδ where δ is the Dirac measure of the point 1. Thus we have

    f(z) = iβ + γ(1 + z)/(1 − z) + ∫ [(e^{it} + z)/(e^{it} − z)] dμ′    (4-17)

Since z = (w − i)/(w + i) it follows that

    (e^{it} + z)/(e^{it} − z) = [(w + i)e^{it} + (w − i)]/[(w + i)e^{it} − (w − i)]
        = [w(e^{it} + 1) + i(e^{it} − 1)]/[w(e^{it} − 1) + i(e^{it} + 1)]
        = −i[w cot(t/2) − 1]/[w + cot(t/2)]

Define a measure σ on ℝ such that for each Borel subset Δ of ℝ, σ(Δ) = μ′({e^{it} | −cot(t/2) ∈ Δ}). We obtain

    f(z) = iβ + γ(1 + z)/(1 − z) − i ∫_{−∞}^{∞} [(1 + sw)/(s − w)] dσ(s)

and, since (1 + z)/(1 − z) = −iw, this in turn implies that

    F(w) = if(z) = −β + γw + ∫_{−∞}^{∞} [(1 + sw)/(s − w)] dσ(s)

which reduces to (4-16) by putting α = −β.
Conversely if F has the representation (4-16) then it is clearly analytic in Π⁺ and it is easily checked that

    Im F(w) = γ Im w + ∫_{−∞}^{∞} [(1 + t²) Im w / |t − w|²] dσ ≥ 0

The uniqueness of the measure σ in (4-16) follows from the uniqueness part of Theorem 4-4. This completes the proof.

Up to this point our representation theorems were all of the Poisson type. To get a Cauchy-type representation much stronger growth conditions have to be satisfied. A typical result is the following.
Theorem 4-10 A function F defined in the open upper half plane Π⁺ admits a representation of the form

    F(w) = ∫_{−∞}^{∞} dμ(t)/(t − w)    (4-18)

for some finite positive Borel measure μ on ℝ if and only if F is analytic in Π⁺, has nonnegative imaginary part and satisfies

    sup_{y>0} |yF(iy)| < ∞    (4-19)

The measure μ is uniquely determined by F.
PROOF If μ is a finite positive measure and F is defined by (4-18) then F is clearly analytic in Π⁺ and for w ∈ Π⁺

    Im F(w) = ∫ [Im w / |t − w|²] dμ ≥ 0

Moreover

    |yF(iy)| = |∫ [y/(t − iy)] dμ| ≤ ∫ [y/(t² + y²)^{1/2}] dμ ≤ ∫ dμ = ‖μ‖

which implies (4-19).
To prove the converse we apply Theorem 4-9 to obtain the representation (4-16). For w = iy we have

    yF(iy) = αy + iγy² + ∫ [y(1 + isy)/(s − iy)] dσ(s)

and by assumption (4-19)

    |αy + iγy² + ∫ [y(1 + isy)/(s − iy)] dσ(s)| ≤ M
for some M > 0 and all y > 0. The boundedness of the previous expression is equivalent to the separate boundedness of both the real and imaginary parts. Since

    y(1 + isy)/(s − iy) = y(1 + isy)(s + iy)/(s² + y²) = [ys(1 − y²) + iy²(1 + s²)]/(s² + y²)

we have for y > 0

    |αy + ∫ [ys(1 − y²)/(s² + y²)] dσ| ≤ M    (4-20)

and

    |γy² + ∫ [y²(1 + s²)/(s² + y²)] dσ| ≤ M    (4-21)

Since the integral in (4-21) is positive we must have |γy²| ≤ M for all y > 0, which forces γ = 0. As

    lim_{y→∞} y²(1 + s²)/(s² + y²) = 1 + s²

it follows that ∫ (1 + s²) dσ ≤ M. Define a measure μ by dμ = (1 + s²) dσ; then μ is a finite Borel measure on ℝ. From (4-20) we have

    α = −lim_{y→∞} ∫ [s(1 − y²)/(s² + y²)] dσ = ∫ s dσ

and upon substituting back in (4-16), with γ = 0, we obtain

    F(w) = α + ∫ [(1 + sw)/(s − w)] dσ = ∫ s dσ + ∫ [(1 + sw)/(s − w)] dσ
         = ∫ [s + (1 + sw)/(s − w)] dσ = ∫ [(1 + s²)/(s − w)] dσ = ∫ dμ(s)/(s − w)

Once again the uniqueness of the measure μ in the representation (4-18) follows from the uniqueness of the measure σ in the representation (4-16).
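For a discrete measure the Cauchy-type representation (4-18) and the growth condition (4-19) are easy to verify directly. A numerical sketch (the point masses and weights are arbitrary illustration data):

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.random(5)                      # μ = Σ w_j δ_{t_j}, a finite positive measure on R
t = 3 * rng.standard_normal(5)

def F(z):                              # F(w) = ∫ dμ(t)/(t − w)
    return np.sum(w / (t - z))

for _ in range(200):                   # Im F >= 0 on the upper half plane
    z = rng.standard_normal() + 1j * abs(rng.standard_normal())
    if z.imag > 0:
        assert F(z).imag >= 0

for y in [0.1, 1.0, 10.0, 1e4]:        # growth condition (4-19): |y F(iy)| <= ||μ||
    assert abs(y * F(1j * y)) <= w.sum() + 1e-9
assert abs(1e6 * F(1j * 1e6)) > 0.999 * w.sum()   # the bound is attained as y -> ∞
```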
5. THE SPECTRAL THEOREM

The representation theorems obtained in the previous section form the basis for the analysis of the structure of self-adjoint operators. There is an advantage in using this approach inasmuch as there is no need to assume that the self-adjoint operator under study is bounded.

Let us review first the finite dimensional situation concerning self-adjoint operators. In this case there exists an orthonormal basis {e_i}_{i=1}^{n} consisting of eigenvectors of the self-adjoint operator A corresponding to the real eigenvalues λ_i. Since {e_i}_{i=1}^{n} is an orthonormal basis we have x = Σ_{i=1}^{n} (x, e_i)e_i and hence Ax = Σ_{i=1}^{n} λ_i(x, e_i)e_i. Now the operator E_i defined by E_ix = (x, e_i)e_i is an orthogonal projection and hence A has the representation A = Σ_{i=1}^{n} λ_iE_i. If we pass to the infinite dimensional situation we expect the finite sum to be replaced either by an infinite sum or by an integral. This in fact is the situation. An examination of the simplest concrete self-adjoint operators in infinite dimensional Hilbert spaces
indicates that we cannot expect in general the spectrum to consist of eigenvalues. Thus if we consider L²(0, 1), the space of all (equivalence classes of) Lebesgue square integrable functions on (0, 1), and the operator A defined by

    (Af)(λ) = λf(λ)

then A is a bounded self-adjoint operator with σ(A) = σ_c(A) = [0, 1] and σ_p(A) = ∅. Still, in some sense L²(0, 1) decomposes into reducing subspaces of A on which A acts as multiplication. To make this more precise we introduce the notion of spectral measures.
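The finite dimensional decomposition A = Σ λ_iE_i recalled above is immediate to verify numerically. A small sketch (random Hermitian matrix, size chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2               # a self-adjoint matrix

lam, V = np.linalg.eigh(A)             # real eigenvalues, orthonormal eigenvectors e_i
E = [np.outer(V[:, i], V[:, i].conj()) for i in range(4)]   # E_i x = (x, e_i) e_i

assert np.allclose(sum(l * P for l, P in zip(lam, E)), A)   # A = Σ λ_i E_i
assert np.allclose(sum(E), np.eye(4))                        # Σ E_i = I
assert np.allclose(E[0] @ E[1], 0)                           # mutually orthogonal
assert np.allclose(E[0] @ E[0], E[0])                        # each E_i is a projection
```

The spectral measure introduced next generalizes the assignment δ ↦ Σ_{λ_i∈δ} E_i from finite sets of eigenvalues to Borel subsets of ℝ.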
We define a spectral measure on ℝ to be a map E from the σ-algebra of all Borel sets in ℝ into the set of orthogonal projections in H which has the following properties:

(a) E(ℝ) = I
(b) If {δ_i} is a countable set of pairwise disjoint Borel sets then for each x ∈ H, E(∪_{i=1}^{∞} δ_i)x = Σ_{i=1}^{∞} E(δ_i)x.
We shall use the spectral measure to construct self-adjoint operators in H. Let B be the set of all bounded Borel measurable functions on ℝ. In particular simple functions, that is functions of the form φ = Σ a_iχ_{δ_i}, where χ_{δ_i} is the characteristic function of a Borel set δ_i, are in B, and are dense there in the norm

    ‖f‖_∞ = sup_{λ∈ℝ} |f(λ)|

For a simple function φ = Σ a_iχ_{δ_i}(λ) we define

    ∫ φ(λ) E(dλ) = Σ a_iE(δ_i)

It is easily checked that the integral of simple functions with respect to a spectral measure is well defined in the sense that it is independent of the particular representation of φ used in defining it. If x, y are vectors in H then μ_{x,y} = (E(·)x, y) is a complex Borel measure on ℝ. It follows that

    |∫ φ(λ)(E(dλ)x, y)| = |∫ φ(λ) dμ_{x,y}| ≤ ‖φ‖_∞ ‖μ_{x,y}‖

where ‖φ‖_∞ is the norm of φ as an element of B and ‖μ_{x,y}‖ is the total variation of μ_{x,y}. Using the decomposition of complex measures [29] we have

    ‖μ_{x,y}‖ ≤ 4 sup_σ |μ_{x,y}(σ)|

the supremum taken over all Borel subsets of ℝ. Now |μ_{x,y}(δ)| = |(E(δ)x, y)| ≤ ‖E(δ)‖·‖x‖·‖y‖ ≤ ‖x‖·‖y‖ and hence

    |∫ φ(λ)(E(dλ)x, y)| ≤ 4‖φ‖_∞·‖x‖·‖y‖    (5-1)
From this inequality it is clear that if {φ_n} is a sequence in B converging to φ then we can define ∫ φ(λ)(E(dλ)x, y) = lim_{n→∞} ∫ φ_n(λ)(E(dλ)x, y), and the inequality
(5-1) is still satisfied. By the Riesz representation theorem there exists a bounded operator A(φ) such that

    (A(φ)x, y) = ∫ φ(λ)(E(dλ)x, y)    (5-2)

Using the properties of the spectral measure it follows that the map φ → A(φ) is not only linear but also multiplicative, that is A(φψ) = A(φ)A(ψ); thus it is a homomorphism of B into B(H). Since for a simple function φ(λ) = Σ a_iχ_{δ_i}(λ) we have φ̄(λ) = Σ ā_iχ_{δ_i}(λ), it follows that

    A(φ̄) = Σ ā_iE(δ_i) = (Σ a_iE(δ_i))* = A(φ)*
and by a continuity argument this holds for all φ ∈ B. Thus the homomorphism φ → A(φ) is actually a *-homomorphism. It is clear that A(φ) is self-adjoint if and only if φ is real valued almost everywhere with respect to the spectral measure E. Next we note that for simple functions φ we have

    ‖A(φ)x‖² = ∫ |φ(λ)|² (E(dλ)x, x)    (5-3)

and again by continuity this holds for all φ ∈ B.

If the spectral measure E has compact support then A = ∫ λE(dλ) is a well-defined bounded self-adjoint operator. The case in which the spectral measure does not have compact support is different and we no longer expect a bounded operator as the outcome of the integral ∫ λE(dλ). We state the result as the following theorem.
Theorem 5-1 Let E be a spectral measure on ℝ. Define an operator A by

    D_A = {x | ∫ λ² (E(dλ)x, x) < ∞}    (5-4)

and

    Ax = lim_{n→∞} ∫_{−n}^{n} λE(dλ)x    (5-5)

Then A is a densely defined self-adjoint operator.

PROOF Let φ_n(λ) = λ for |λ| ≤ n and zero otherwise; then

    A(φ_n)x = ∫ φ_n(λ)E(dλ)x = ∫_{−n}^{n} λE(dλ)x
Since φ_n ∈ B we have ‖A(φ_n)x‖² = ∫_{−n}^{n} λ² (E(dλ)x, x); hence if x ∈ H is such that lim_{n→∞} ∫_{−n}^{n} λE(dλ)x exists then necessarily ∫ λ² (E(dλ)x, x) < ∞. Conversely if x ∈ D_A then for n > m

    ‖A(φ_n)x − A(φ_m)x‖² = ∫_{m<|λ|≤n} λ² (E(dλ)x, x)

which shows that A(φ_n)x is a Cauchy sequence and hence there exists Ax = lim_{n→∞} A(φ_n)x.

To see that the operator A is densely defined we note that for each x ∈ H, lim_{n→∞} E((−n, n))x = x and hence ∪_n E((−n, n))H is dense in H. But for each n, E((−n, n))H ⊂ D_A since, with δ_n = (−n, n),

    ∫ λ² (E(dλ)E(δ_n)x, E(δ_n)x) ≤ n²‖x‖² < ∞

To conclude we will prove that A is self-adjoint. Let x and y be in D_A; then

    (Ax, y) = ∫ λ(E(dλ)x, y) = conj(∫ λ(E(dλ)y, x)) = conj((Ay, x)) = (x, Ay)

which shows that A is symmetric, that is, A ⊂ A*. It remains to show that D_{A*} ⊂ D_A. Let y ∈ D_{A*}; then with the previous definitions of δ_n and A(φ_n) we have, since A(φ_n) is bounded and self-adjoint,

    (A(φ_n)x, y) = (AE(δ_n)x, y) = (x, E(δ_n)A*y)

From this equality it follows that for y ∈ D_{A*} we have A*y = lim_{n→∞} E(δ_n)A*y = lim_{n→∞} A(φ_n)y. Since ‖A(φ_n)y‖² = ∫_{−n}^{n} λ² (E(dλ)y, y) it follows that D_{A*} ⊂ D_A and the proof is complete.
Our next concern is to show that the self-adjoint operators constructed by integration of spectral measures are not special but represent the most general self-adjoint operators. To show this we start with the study of the spectrum of a self-adjoint operator as well as the analytic properties of its resolvent function.
Theorem 5-2 Let A be a self-adjoint operator with domain D(A) dense in the Hilbert space H. Then the spectrum of A is real and the resolvent of A satisfies

    R(ζ, A)* = R(ζ̄, A)    (5-6)

and

    ‖R(ζ, A)‖ ≤ |Im ζ|⁻¹   for   Im ζ ≠ 0    (5-7)

PROOF Let x ∈ D(A) and ζ = ρ + iσ; then

    ‖(ζI − A)x‖² = ((ζI − A)x, (ζI − A)x) = ‖(ρI − A)x‖² + ‖σx‖² ≥ σ²‖x‖²

which implies

    ‖(ζI − A)x‖ ≥ |Im ζ|·‖x‖   for   x ∈ D(A)    (5-8)
The above inequality shows that (ζI − A)⁻¹ exists as a bounded operator with norm bounded by |Im ζ|⁻¹. To show that ζ is in the resolvent set of A we have to show that the domain of (ζI − A)⁻¹ coincides with H. Since self-adjoint operators are automatically closed and the inverse of a closed operator is closed, it follows that (ζI − A)⁻¹ is closed, and being bounded its domain is closed. To show density of the domain of (ζI − A)⁻¹, which is the same as the range of (ζI − A), let us note that y is orthogonal to the range of ζI − A if and only if for all x ∈ D(A), 0 = ((ζI − A)x, y) = (x, (ζ̄I − A)y), and since D(A) is dense in H this holds only if (ζ̄I − A)y = 0. However, Im ζ̄ = −Im ζ ≠ 0, hence from inequality (5-8) it follows that y is necessarily zero. The relations (5-6) and (5-7) now follow from (5-8).

From Theorem 5-2 it follows that R(ζ, A) is defined in the open upper and lower half planes.
Lemma 5-3 The resolvent function of A satisfies the resolvent equation

    R(z, A) − R(ζ, A) = −(z − ζ)R(z, A)R(ζ, A)    (5-9)

for all z, ζ ∈ ρ(A).

PROOF We observe that for all z in ρ(A) the range of R(z, A) coincides with D(A). Thus if z, ζ ∈ ρ(A) we have

    R(z, A)(z − ζ)R(ζ, A) = R(z, A)(z − A + A − ζ)R(ζ, A) = R(ζ, A) − R(z, A)

which is equivalent to (5-9).

As an immediate corollary we have
Corollary 5-4 The resolvent function of a self-adjoint operator A is analytic in the open lower and upper half planes and for each x ∈ H

    Im(R(z, A)x, x) = −‖R(z, A)x‖² (Im z)    (5-10)

PROOF For all x, y ∈ H and z, ζ nonreal we have

    [(R(z, A)x, y) − (R(ζ, A)x, y)]/(z − ζ) = −(R(z, A)R(ζ, A)x, y)    (5-11)

Letting ζ approach z we have

    d(R(z, A)x, y)/dz = −(R(z, A)²x, y)

and R(z, A) is weakly analytic. However, the various types of analyticity are equivalent [29], which implies that dR(z, A)/dz = −R(z, A)² actually holds in the norm operator topology. To prove (5-10) we substitute ζ = z̄ in (5-11) and use the fact that R(z̄, A)* = R(z, A).
Theorem 5-5 (Spectral theorem) A densely defined operator A in a Hilbert space H is self-adjoint if and only if there exists a spectral measure E defined on the Borel subsets of the real line such that D_A = {x | ∫ λ² (E(dλ)x, x) < ∞} and

    Ax = ∫ λE(dλ)x    (5-12)

where the last integral is defined in the strong topology.

PROOF Let A be self-adjoint. For each vector x ∈ H we consider the function φ_x(z) = (R(z, A)x, x), defined and analytic off the spectrum of A, in particular in the upper and lower half planes. By (5-10), φ_x(z) has nonpositive imaginary part in the upper half plane and from (5-7) it follows that it satisfies sup_{y>0} |yφ_x(iy)| < ∞. Thus, making allowance for the required change in sign, Theorem 4-10 implies the existence of a finite positive Borel measure μ_x such that

    (R(z, A)x, x) = ∫ dμ_x(λ)/(z − λ)    (5-13)

for all z in the upper half plane. Since R(z, A)* = R(z̄, A) we have

    (R(z̄, A)x, x) = conj((R(z, A)x, x)) = ∫ dμ_x(λ)/(z̄ − λ)

and hence the representation (5-13) holds actually for all nonreal complex numbers.
We apply now the polarization identity, by which for all x, y in H

    (R(z, A)x, y) = ¼{(R(z, A)(x + y), x + y) − (R(z, A)(x − y), x − y)
        + i[(R(z, A)(x + iy), x + iy) − (R(z, A)(x − iy), x − iy)]}

If we define a complex measure μ_{x,y} by

    μ_{x,y} = ¼{μ_{x+y} − μ_{x−y} + iμ_{x+iy} − iμ_{x−iy}}

then μ_{x,y} is a finite complex Borel measure and

    (R(z, A)x, y) = ∫ dμ_{x,y}(λ)/(z − λ)    (5-14)

By the uniqueness of the representing measure we must have that μ_{x,y} is linear in x and antilinear in y. From (5-7) we have

    |η(R(iη, A)x, x)| ≤ ‖x‖²
and hence

    |∫ [η/(iη − λ)] dμ_x(λ)| ≤ ‖x‖²

By the Lebesgue dominated convergence theorem it follows, as lim_{η→∞} η/(iη − λ) = −i, that ‖μ_x‖ ≤ ‖x‖². This implies the boundedness of the Hermitian bilinear form ∫_σ dμ_{x,y} = μ_{x,y}(σ) for each Borel set σ. We apply now the Riesz representation theorem to obtain the existence of a self-adjoint operator E(σ) of norm less than or equal to one such that

    (E(σ)x, y) = μ_{x,y}(σ)    (5-15)

and moreover the operator E(σ) is uniquely determined. Applying the resolvent identity we obtain

    (R(z, A)x, R(ζ̄, A)y) = (R(ζ, A)R(z, A)x, y) = [1/(z − ζ)]((R(ζ, A) − R(z, A))x, y)

and hence

    ∫ [1/(z − λ)] (E(dλ)x, R(ζ̄, A)y) = [1/(z − ζ)] ∫ [1/(ζ − λ) − 1/(z − λ)] (E(dλ)x, y)
        = ∫ [1/((z − λ)(ζ − λ))] (E(dλ)x, y)
By the uniqueness part of Theorem 4-10 we have for each Borel set σ

    (E(σ)x, R(ζ̄, A)y) = ∫_σ [1/(ζ − λ)] (E(dλ)x, y) = ∫ [1/(ζ − λ)] χ_σ(λ)(E(dλ)x, y)

But

    (E(σ)x, R(ζ̄, A)y) = ∫ [1/(ζ − λ)] (E(σ)x, E(dλ)y)

So we have the equality

    ∫ [1/(ζ − λ)] (E(σ)x, E(dλ)y) = ∫ [1/(ζ − λ)] χ_σ(λ)(E(dλ)x, y)

We apply now once more the uniqueness part of Theorem 4-10 to obtain

    ∫_ρ (E(σ)x, E(dλ)y) = ∫_ρ χ_σ(λ)(E(dλ)x, y)    (5-16)

for each Borel subset ρ of ℝ. Equality (5-16) is equivalent to the following one

    (E(σ)x, E(ρ)y) = (E(σ ∩ ρ)x, y)    (5-17)
for all Borel sets σ and ρ and all vectors x, y in H. Since E(ρ) is self-adjoint we have finally

    E(σ)E(ρ) = E(σ ∩ ρ)    (5-18)

In particular E(σ) is orthogonal projection valued, that is, E is a spectral measure.
The spectral measure reduces the resolvent function of A; that is, for each Borel set σ and all nonreal ζ we have

    E(σ)R(ζ, A) = R(ζ, A)E(σ)    (5-19)

To see this we note that, using (5-18),

    (E(σ)R(ζ, A)x, x) = (R(ζ, A)x, E(σ)x) = ∫ [1/(ζ − λ)] (E(dλ)x, E(σ)x)
        = ∫ [1/(ζ − λ)] χ_σ(λ)(E(dλ)x, x) = ∫ [1/(ζ − λ)] (E(dλ)E(σ)x, x) = (R(ζ, A)E(σ)x, x)

from which (5-19) follows.
We proceed with the characterization of the domain of A. Let $\mu$ be any nonreal complex number. Then the range of $R(\mu,A)$ coincides with the domain of $\mu I - A$ and hence with that of A. If x is in $D_A$ then there exists a vector y such that $x = R(i,A)y$. Now

$$\|R(i,A)y\|^2 = (R(i,A)y, R(i,A)y) = (R(-i,A)R(i,A)y, y) = \frac{1}{2i}\big((R(i,A)-R(-i,A))y, y\big)$$

$$= \frac{1}{2i}\int \Big(\frac{1}{\lambda-i} - \frac{1}{\lambda+i}\Big)(E(d\lambda)y, y) = \int \frac{1}{1+\lambda^2}\,(E(d\lambda)y, y)$$

Using the reducibility of $R(\zeta,A)$ by the spectral measure we have

$$(E(\sigma)x, x) = (E(\sigma)^2 R(i,A)y, R(i,A)y) = \|R(i,A)E(\sigma)y\|^2 = \int_\sigma \frac{1}{1+\lambda^2}\,(E(d\lambda)y, y)$$
and therefore

$$\int \lambda^2\,(E(d\lambda)x, x) = \int \frac{\lambda^2}{1+\lambda^2}\,(E(d\lambda)y, y) \le \int (E(d\lambda)y, y) = \|y\|^2$$

Thus we have proved that $D_A \subset \{x \mid \int \lambda^2\,(E(d\lambda)x, x) < \infty\}$. To show that we have actually equality we define an operator $A_1$ by

$$D_{A_1} = \Big\{x \;\Big|\; \int \lambda^2\,(E(d\lambda)x, x) < \infty\Big\}$$

and

$$A_1 x = \lim_{n\to\infty} \int_{-n}^{n} \lambda\,E(d\lambda)x \qquad \text{for } x \in D_{A_1}$$
By Theorem 5-1 $A_1$ is a densely defined self-adjoint operator. We will show that A and $A_1$ coincide. For arbitrary $u \in H$ we have

$$(R(i,A)(A_1 - iI)x, u) = \int \frac{1}{\lambda-i}\,(E(d\lambda)(A_1 - iI)x, u) = \int \frac{\lambda-i}{\lambda-i}\,(E(d\lambda)x, u) = (x, u)$$

Since u is arbitrary we obtain

$$R(i,A)(A_1 - iI)x = x \qquad \text{for all } x \in D_{A_1} \tag{5-20}$$

Thus Range $R(i,A) = D_{A_1}$, which together with Range $R(i,A) = D_{A-iI} = D_A$ shows that $D_A = D_{A_1}$. Now (5-20), together with $R(i,A)(A - iI)x = x$ for all $x \in D_A$, implies that $R(i,A)\big((A - iI)x - (A_1 - iI)x\big) = 0$. Since $R(i,A)$ is injective it follows that $(A - iI)x = (A_1 - iI)x$ for all $x \in D_A$ and hence that $A = A_1$. This completes the proof.
As a straightforward consequence of the spectral theorem we can construct a much more powerful functional calculus than the Riesz-Dunford calculus described in Sec. 2. This calculus is based on integration of bounded Borel functions with respect to spectral measures. Let A be a self-adjoint operator which is not necessarily bounded. For $f \in B$ we define $f(A)$ by

$$f(A) = \int f(\lambda)\,E(d\lambda) \tag{5-21}$$

where E is the spectral measure of A.
Theorem 5-6 The map $f \mapsto f(A)$ is an algebra homomorphism of B into B(H) mapping the function 1 onto I; it satisfies $\bar f(A) = f(A)^*$ and

$$\|f(A)x\|^2 = \int |f(\lambda)|^2\,(E(d\lambda)x, x) \qquad \text{for each } x \in H \tag{5-22}$$

If $(f_n)$ is a uniformly bounded sequence of functions converging pointwise to f then $f_n(A)$ converges to $f(A)$ strongly.

PROOF That the map $f \mapsto f(A)$ is a homomorphism has been established in the beginning of this section. That $\bar f(A) = f(A)^*$ is trivial for simple functions and extends by continuity to all of B. Since $\|f(A)x\|^2 = (f(A)x, f(A)x) = (\bar f(A)f(A)x, x)$, (5-22) follows. Finally, since

$$\|f(A)x - f_n(A)x\|^2 = \int |f(\lambda) - f_n(\lambda)|^2\,(E(d\lambda)x, x)$$

the last statement is a consequence of the Lebesgue dominated convergence theorem.
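In finite dimensions the calculus (5-21) amounts to applying f to the eigenvalues: if $A = \sum_i \lambda_i E_i$ then $f(A) = \sum_i f(\lambda_i)E_i$. The properties of Theorem 5-6 can then be checked numerically; the following sketch (the matrix and the functions are arbitrary choices, not from the text) uses numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                     # a self-adjoint matrix

w, V = np.linalg.eigh(A)

def calculus(f):
    """f(A) = sum_i f(lambda_i) E_i: integration of f against the spectral measure."""
    return (V * f(w)) @ V.conj().T    # V diag(f(w)) V*

f = lambda t: np.exp(1j * t)          # a bounded Borel function
g = lambda t: 1.0 / (1.0 + t * t)     # another one

# multiplicativity: (fg)(A) = f(A) g(A)
assert np.allclose(calculus(lambda t: f(t) * g(t)), calculus(f) @ calculus(g))
# conjugation: conj(f)(A) = f(A)*
assert np.allclose(calculus(lambda t: np.conj(f(t))), calculus(f).conj().T)
# the constant function 1 goes to the identity
assert np.allclose(calculus(lambda t: np.ones_like(t)), np.eye(5))
```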
If the operator A is bounded we can replace the algebra B by the algebra $B(\sigma(A))$ of all Borel measurable functions defined and bounded on $\sigma(A)$. In this case all polynomials are in $B(\sigma(A))$ and if $p(\lambda) = \sum_{i=0}^{n} a_i\lambda^i$ we have

$$\int \sum_{i=0}^{n} a_i\lambda^i\,E(d\lambda) = \sum_{i=0}^{n} a_i A^i$$

thus the various definitions of $p(A)$ coincide. In the next section we will extend the present functional calculus even further.

An important consequence of the spectral theorem is the existence of square roots of positive operators. An operator A in a Hilbert space H is called positive if $(Ax, x) \ge 0$ for each $x \in H$. Since the Hilbert space is assumed to be complex, a positive operator is self-adjoint. For real Hilbert spaces the self-adjointness has to be postulated.

Theorem 5-7 A bounded operator A is positive if and only if it has a positive square root. The positive square root of a positive operator is unique.
PROOF Let P be a positive square root of A. Then $(Ax, x) = (P^2x, x) = \|Px\|^2 \ge 0$ and A is positive. Conversely assume A is positive; then its spectrum lies in $[0, \infty)$. The positive function $\lambda^{1/2}$ is in $B(\sigma(A))$ and hence $P = \int \lambda^{1/2}\,E(d\lambda)$ is a well-defined positive operator. By the properties of the functional calculus $P^2 = A$.

To prove uniqueness assume Q is another positive square root of A. The operators Q and A commute, as $QA = Q^3 = AQ$, and consequently, P being a limit in norm of polynomials in A, P and Q commute. Let now x be an arbitrary vector in H and put $y = (P - Q)x$. Then if $P^{1/2}$ and $Q^{1/2}$ are arbitrary positive square roots of P and Q, respectively,

$$\|P^{1/2}y\|^2 + \|Q^{1/2}y\|^2 = (Py, y) + (Qy, y) = ((P+Q)(P-Q)x, y) = ((P^2 - Q^2)x, y) = 0$$

and hence $Py = Qy = 0$. Now

$$\|(P-Q)x\|^2 = ((P-Q)^2x, x) = ((P-Q)y, x) = 0$$

and we obtain $Px = Qx$. As x was arbitrary $P = Q$ and the proof is complete.

We saw already that bounded operators on a Hilbert space have properties resembling those of the complex numbers. An important instance is the generalization of the fact that every complex number z can be represented uniquely in the polar form $z = re^{i\theta}$ where $r \ge 0$ and $0 \le \theta < 2\pi$.

Theorem 5-8 (Polar decomposition)
(a) Every bounded operator T in a Hilbert space H can be written in the form

$$T = VP \tag{5-23}$$

where P is positive and V a partial isometry. The decomposition (5-23) is unique if we require

$$\mathrm{Ker}\,V = \mathrm{Ker}\,P \tag{5-24}$$

(b) Every operator $T \in B(H)$ can be written in the form

$$T = QW \tag{5-25}$$

where Q is positive and W a partial isometry. The decomposition (5-25) is unique if we require

$$\mathrm{Ker}\,W^* = \mathrm{Ker}\,Q \tag{5-26}$$

PROOF (a) Define $P = (T^*T)^{1/2}$, which exists by Theorem 5-7 as $T^*T$ is clearly a positive operator. Now for each $x \in H$

$$\|Px\|^2 = (P^2x, x) = (T^*Tx, x) = \|Tx\|^2 \tag{5-27}$$

Define an operator V on Range P by $VPx = Tx$. Equality (5-27) shows that V is isometric on its domain of definition and hence can be extended by continuity to an isometry on $\overline{\mathrm{Range}\,P}$. Extend the domain of definition of V to all of H by letting $V|\{\mathrm{Range}\,P\}^\perp = V|\mathrm{Ker}\,P = 0$. So V is a partial isometry with $\overline{\mathrm{Range}\,P}$ as initial space and $\overline{\mathrm{Range}\,T}$ as final space. So (5-23) is proved and (5-24) satisfied.

To prove uniqueness assume $T = WQ$ is another decomposition of the same type with $\mathrm{Ker}\,W = \mathrm{Ker}\,Q$. By Theorem 2-3 $W^*W$ is a projection on the initial space of W, which is equal to $\{\mathrm{Ker}\,W\}^\perp = \{\mathrm{Ker}\,Q\}^\perp = \overline{\mathrm{Range}\,Q}$. Consequently $W^*WQx = Qx$ for each $x \in H$ and hence $T^*T = QW^*WQ = Q^2$, or $Q = (T^*T)^{1/2} = P$ by the uniqueness part of Theorem 5-7. This in turn yields the equality $VP = WP$, so V and W are equal on Range P. Now $\{\mathrm{Range}\,P\}^\perp = \mathrm{Ker}\,P = \mathrm{Ker}\,V = \mathrm{Ker}\,W$ and hence V and W are equal.

To prove (b) apply (a) to $T^*$. We note that necessarily $Q = (TT^*)^{1/2}$ in this case.
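In finite dimensions both factors of (5-23) are computable: P = $(T^*T)^{1/2}$ by the eigenvalue construction of Theorem 5-7 and, when T is invertible (so Ker P = {0}), $V = TP^{-1}$. A numerical sketch under that invertibility assumption (the matrix is an arbitrary choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4))       # generic, hence invertible, so Ker P = {0}

# P = (T*T)^{1/2} via the functional calculus construction of Theorem 5-7
w, V0 = np.linalg.eigh(T.T @ T)
P = (V0 * np.sqrt(np.clip(w, 0, None))) @ V0.T

V = T @ np.linalg.inv(P)              # determined by VPx = Tx

assert np.allclose(V @ P, T)                      # T = VP
assert np.allclose(V.T @ V, np.eye(4))            # V isometric (here unitary: full initial space)
assert np.all(np.linalg.eigvalsh(P) >= -1e-10)    # P positive
```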
Generally similarity of two operators does not imply their unitary equivalence, but for self-adjoint operators even more is true.
Theorem 5-9 Let A and $A_1$ be two self-adjoint operators acting in the Hilbert spaces H and $H_1$, respectively. Let $X: H \to H_1$ intertwine A and $A_1$, that is, $XA = A_1X$. Then

(a) If X has range dense in $H_1$ there exists a coisometry V such that $VA = A_1V$
(b) If X is one-to-one there exists an isometry W such that $WA = A_1W$
(c) If X is one-to-one and has range dense in $H_1$ (in particular if X is boundedly invertible) then there exists a unitary U such that $UA = A_1U$

PROOF From $XA = A_1X$ it follows by taking adjoints that $AX^* = X^*A_1$ and hence $AX^*X = X^*A_1X = X^*XA$, or $A(X^*X) = (X^*X)A$, and analogously $A_1(XX^*) = (XX^*)A_1$. By a standard approximation argument it follows that

$$A(X^*X)^{1/2} = (X^*X)^{1/2}A \tag{5-28}$$

and

$$A_1(XX^*)^{1/2} = (XX^*)^{1/2}A_1 \tag{5-29}$$

Now assume X has range dense in $H_1$. Since $\{0\} = \{\mathrm{Range}\,X\}^\perp = \mathrm{Ker}\,X^* = \mathrm{Ker}\,(XX^*)^{1/2} = \{\mathrm{Range}\,(XX^*)^{1/2}\}^\perp$ it follows that $(XX^*)^{1/2}$ also has range dense in $H_1$. From the equality $\|X^*y\| = \|(XX^*)^{1/2}y\|$ it follows that if we define V by $VX^*y = (XX^*)^{1/2}y$ then V can be extended by continuity to an isometry from $\overline{\mathrm{Range}\,X^*}$ onto $H_1$. Extend V to all of H by defining $V|\mathrm{Ker}\,X = 0$, and V becomes a coisometry satisfying $VX^* = (XX^*)^{1/2}$. By our assumption $(XX^*)^{1/2}$ has dense range, hence $(XX^*)^{-1/2}$ is a closed densely defined operator and $(XX^*)^{1/2}(XX^*)^{-1/2}y = y$ for all y in $\mathrm{Range}\,(XX^*)^{1/2}$. Since V is isometric on $\overline{\mathrm{Range}\,X^*}$, the operator $X^*(XX^*)^{-1/2}$ is isometric on its domain of definition, hence extendible by continuity to an isometry on $H_1$ which has to coincide with $V^*$. So we have

$$V = (XX^*)^{-1/2}X \tag{5-30}$$

Since from (5-29) it follows that $A_1(XX^*)^{-1/2} = (XX^*)^{-1/2}A_1$, we have

$$VA = (XX^*)^{-1/2}XA = (XX^*)^{-1/2}A_1X = A_1(XX^*)^{-1/2}X = A_1V$$

which proves (a). Part (b) follows by duality considerations. Finally, if X is one-to-one and has dense range then both $X^*(XX^*)^{-1/2}$ and $X(X^*X)^{-1/2}$ are isometric. Now from the equality $X(X^*X) = (XX^*)X$ it follows that $X(X^*X)^{1/2} = (XX^*)^{1/2}X$ and hence that $(XX^*)^{-1/2}X = X(X^*X)^{-1/2}$. This means that V given by (5-30) is also isometric and therefore unitary.
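The formula (5-30) is explicit enough to compute with. In the finite-dimensional sketch below (an arbitrary example, not from the text) $A = A_1$ has a repeated eigenvalue, X is invertible but not unitary, and $U = (XX^*)^{-1/2}X$ comes out unitary and still intertwining, as part (c) asserts:

```python
import numpy as np

rng = np.random.default_rng(3)

# A self-adjoint with a repeated eigenvalue, and an invertible intertwining X
A = np.diag([1.0, 1.0, 2.0])
A1 = A
X = np.zeros((3, 3))
X[:2, :2] = rng.standard_normal((2, 2))   # generic invertible block on the eigenspace of 1
X[2, 2] = 3.0
assert np.allclose(X @ A, A1 @ X)         # X intertwines, but is not unitary

# U = (XX*)^{-1/2} X, as in (5-30)
w, V = np.linalg.eigh(X @ X.T)
inv_sqrt = (V * (1.0 / np.sqrt(w))) @ V.T
U = inv_sqrt @ X

assert np.allclose(U @ U.T, np.eye(3))    # U is unitary
assert np.allclose(U @ A, A1 @ U)         # and still intertwines A and A1
```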
6. SPECTRAL REPRESENTATIONS

The spectral theorem for self-adjoint operators proved in the previous section, while stating that diagonalization of these operators is possible, does not yield much insight into their structure.
In the spirit of Sec. 1-4 we would like to describe a general self-adjoint operator in terms of operators of simple type. Essentially, given a self-adjoint operator A we look for a model of it, that is a unitarily equivalent operator, acting in a function space.

Consider a positive Borel measure $\mu$ on $\mathbb{R}$. For each $\varphi \in B$, that is, for each bounded Borel measurable function on $\mathbb{R}$, we define a multiplication operator $M_{\varphi,\mu}$ acting in the Hilbert space $L^2(\mu)$ by

$$M_{\varphi,\mu}f = \varphi f \tag{6-1}$$

We could replace B by $B(\Omega)$, the set of all bounded Borel measurable functions on the closed set $\Omega$ which we assume contains the support of $\mu$. Algebraically we have induced in $L^2(\mu)$ a B, or $B(\Omega)$, module structure. We note that the map $\varphi \mapsto M_{\varphi,\mu}$ is an algebra homomorphism of B, or $B(\Omega)$, into $B(L^2(\mu))$ which satisfies

$$M_{\varphi,\mu}^* = M_{\bar\varphi,\mu} \tag{6-2}$$

and

$$\|M_{\varphi,\mu}\| \le \|\varphi\|_\infty \tag{6-3}$$

where $\|\varphi\|_\infty$ is the sup norm of $\varphi$. If the support of $\mu$ is compact then, with $\chi(\lambda) = \lambda$, $M_{\chi,\mu}$ is a bounded self-adjoint operator, and so is a direct sum $M_{\chi,\mu_1} \oplus \cdots \oplus M_{\chi,\mu_r}$ of such operators acting in $L^2(\mu_1) \oplus \cdots \oplus L^2(\mu_r)$. Actually if $\{\mu_\alpha\}$ is any family of positive measures with uniformly bounded supports then $\oplus_\alpha M_{\chi,\mu_\alpha}$ acting in $\oplus_\alpha L^2(\mu_\alpha)$ is a bounded self-adjoint operator. If we want the Hilbert space under consideration to be separable then necessarily the family $\{\mu_\alpha\}$ has to be countable.
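When $\mu$ has finite support, $L^2(\mu)$ is $\mathbb{C}^n$ with a weighted inner product and $M_{\varphi,\mu}$ acts coordinatewise, so (6-2) and (6-3) become elementary statements about weighted sums. A numerical sketch (atoms, weights, and $\varphi$ are arbitrary choices, not from the text):

```python
import numpy as np

support = np.array([-1.0, 0.5, 2.0])      # atoms of a discrete measure mu
weights = np.array([0.2, 1.0, 0.7])       # mu({atom}) > 0

def inner(f, g):
    """Inner product of L^2(mu) for a finitely supported mu."""
    return np.sum(f * np.conj(g) * weights)

norm = lambda h: np.sqrt(inner(h, h).real)

phi = lambda t: t / (1 + t * t)           # a bounded Borel function
f = np.array([1.0, -2.0, 0.5])
g = np.array([0.3, 1.0, -1.0])

# (6-2): the adjoint of M_phi is multiplication by conj(phi)
assert np.isclose(inner(phi(support) * f, g), inner(f, np.conj(phi(support)) * g))

# (6-3): ||M_phi f|| <= sup|phi| ||f||
assert norm(phi(support) * f) <= np.max(np.abs(phi(support))) * norm(f) + 1e-12

# M_chi, chi(lambda) = lambda, is the bounded self-adjoint model: here a diagonal matrix
M_chi = np.diag(support)
assert np.allclose(M_chi, M_chi.T)
```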
Given a self-adjoint operator A in a Hilbert space H, a unitary map $\Phi: H \to \oplus_\alpha L^2(\mu_\alpha)$ is called a spectral representation of A if for each $\varphi \in B$ we have

$$(\Phi\,\varphi(A)x)_\alpha = M_{\varphi,\mu_\alpha}(\Phi x)_\alpha \tag{6-4}$$

for all $x \in H$. A moment's reflection brings us to the conclusion that if there exists a spectral representation it is not unique. The questions of existence and uniqueness, assuming some extra conditions, of spectral representations are central in this section. To simplify matters as much as possible we will not discuss the most general self-adjoint operator but rather restrict ourselves to the case of finite multiplicity.
The general case can be handled in similar fashion.

Given a self-adjoint operator A with the associated spectral measure E we induce a B-module structure on H by letting

$$\varphi \cdot x = \varphi(A)x = \int \varphi(\lambda)\,E(d\lambda)x \tag{6-5}$$

A self-adjoint operator A is called cyclic if there exists a vector x in H such that the set of vectors $\{\varphi(A)x \mid \varphi \in B\}$ is dense in H. For a bounded operator this is equivalent to the density of the set $\{p(A)x \mid p \in \mathbb{C}[\lambda]\}$. More generally a set of vectors $\{x_\alpha\}$ in H is a set of generators for A if the set of all finite sums $\sum \varphi_\alpha(A)x_\alpha$ with $\varphi_\alpha \in B$ is dense in H. A self-adjoint operator has finite multiplicity if there exists a finite set of generators for it. A minimal set of generators is a set of generators of smallest possible cardinality. The multiplicity of A is the cardinality of a minimal set of generators.

Let $\{x_1, \ldots, x_r\}$ be a fixed set of generators for a self-adjoint operator A and let $B^r$ be the cartesian product of r copies of B. Clearly $B^r$ is a B-module. We define the map $\rho: B^r \to H$ by

$$\rho(\varphi_1, \ldots, \varphi_r) = \sum_{i=1}^{r} \varphi_i(A)x_i \tag{6-6}$$

where $x_1, \ldots, x_r$ is the fixed set of generators for A. The map $\rho$ is, by elementary properties of the functional calculus, a B-module homomorphism, and by our assumption that $x_1, \ldots, x_r$ is a set of generators it follows that $\rho$ has range dense in H. Computing the norm of $\rho(\varphi_1, \ldots, \varphi_r)$ we obtain

$$\Big\|\sum_i \varphi_i(A)x_i\Big\|^2 = \sum_i\sum_j (\varphi_j(A)x_j, \varphi_i(A)x_i) = \sum_i\sum_j (\bar\varphi_i(A)\varphi_j(A)x_j, x_i) = \sum_i\sum_j \int \varphi_j(\lambda)\overline{\varphi_i(\lambda)}\,(E(d\lambda)x_j, x_i)$$

Define now the (complex) measures $\mu_{ij}$ by

$$\mu_{ij}(\sigma) = (E(\sigma)x_j, x_i) \tag{6-7}$$

for all Borel sets $\sigma$, and let M be the matrix whose i, j entry is $\mu_{ij}$. We call such an object a matrix measure [29]. We say a matrix measure is a positive matrix measure if for each Borel set $\sigma$, $M(\sigma)$ is a nonnegative definite Hermitian matrix. It is easily checked that the matrix measure M constructed in (6-7) is a positive matrix measure. Indeed let $\sigma$ be a Borel subset of $\mathbb{R}$ and let $a_1, \ldots, a_r$ be complex numbers. Then with $a = (a_1, \ldots, a_r)$

$$(M(\sigma)a, a) = \sum_{i,j} \mu_{ij}(\sigma)\,a_j\bar a_i = \sum_{i,j} (E(\sigma)x_j, x_i)\,a_j\bar a_i = \Big(E(\sigma)\sum_j a_jx_j,\ \sum_i a_ix_i\Big) = \Big\|E(\sigma)\sum_j a_jx_j\Big\|^2 \ge 0$$

In terms of the matrix measure introduced we have

$$\|\rho F\|^2 = \Big\|\sum_i f_i(A)x_i\Big\|^2 = \int (dM\,F, F) \tag{6-8}$$

where $F \in B^r$ is the vector function whose components are $f_1, \ldots, f_r$. Equality (6-8) indicates that if we define properly the $L^2$ space of a matrix measure M, which we will denote naturally by $L^2(M)$, then the map $\rho: B^r \to H$ will have a natural extension to a unitary map of $L^2(M)$ onto H. Moreover, such a map satisfies

$$\rho(\varphi F) = \varphi(A)(\rho F) \qquad \text{for all } \varphi \in B \tag{6-9}$$
Also for any vector x in the domain of A we have

$$[\rho^{-1}(Ax)](\lambda) = \lambda\,(\rho^{-1}x)(\lambda) \tag{6-10}$$

Thus in the functional representation A acts like multiplication by $\lambda$. We note that M has a convenient description in terms of the spectral measure that is associated with A. If $J: \mathbb{C}^r \to H$ is the map sending $(a_1, \ldots, a_r)$ onto $\sum_{i=1}^{r} a_ix_i$ then for each Borel set $\sigma$ we have

$$M(\sigma) = J^*E(\sigma)J \tag{6-11}$$

To define $L^2(M)$ we proceed as follows. We denote by $L_0^2(M)$ the set of all r-tuples $(f_1, \ldots, f_r)$ of Borel measurable functions for which

$$\|F\|^2 = \int (dM\,F, F) = \sum_{i=1}^{r}\sum_{j=1}^{r} \int f_j(\lambda)\overline{f_i(\lambda)}\,d\mu_{ij} < \infty \tag{6-12}$$
and define $L^2(M)$ as the set of all equivalence classes in $L_0^2(M)$ modulo the set of null functions, a null function being one for which $\|F\| = 0$. With the inner product in $L^2(M)$ defined by

$$(F, G) = \int (dM\,F, G) = \sum_{i=1}^{r}\sum_{j=1}^{r} \int f_j(\lambda)\overline{g_i(\lambda)}\,d\mu_{ij} \tag{6-13}$$

$L^2(M)$ becomes a pre-Hilbert space and the only open question is that of completeness. There is one class of matrix measures for which $L^2(M)$ is clearly complete, namely the class of positive diagonal measures, that is, those for which $i \ne j$ implies $\mu_{ij} = 0$ and the diagonal elements are positive measures. If $\mu_1, \ldots, \mu_r$ are the diagonal elements of a diagonal matrix measure then in this case

$$\|F\|^2 = \sum_{i=1}^{r} \int |f_i(\lambda)|^2\,d\mu_i = \sum_{i=1}^{r} \|f_i\|^2$$

where $\|f_i\|$ is the norm of $f_i$ as an element of $L^2(\mu_i)$. Hence in this case $L^2(M)$ is clearly equal to the direct sum $L^2(\mu_1) \oplus \cdots \oplus L^2(\mu_r)$, which is a complete space. We will use this observation to show completeness of $L^2(M)$ by exhibiting a unitary map that diagonalizes M.

As a first step we simplify the problem by replacing matrix measures by density
matrices and one scalar measure. We choose a positive measure $\mu$ such that all $\mu_{ij}$ are absolutely continuous with respect to $\mu$, $\mu_{ij} \ll \mu$. One candidate is the sum of the total variations of all the $\mu_{ij}$. A better choice turns out later to be the trace of M. If $m_{ij} = d\mu_{ij}/d\mu$ is the Radon-Nikodym derivative of $\mu_{ij}$ with respect to $\mu$ then we introduce the density matrix

$$M(\lambda) = (m_{ij}(\lambda)) \tag{6-14}$$

Lemma 6-1 If $M(\lambda)$ is the density matrix of a matrix measure M with respect to a scalar measure $\mu$ then $M(\lambda)$ is nonnegative definite $\mu$-a.e.
PROOF Observe first that the set $\Lambda_0 = \{\lambda \mid (m_{ij}(\lambda)) \text{ is nonnegative definite}\}$ is a measurable set, for it is the intersection of the sets $\{\lambda \mid \sum_i\sum_j m_{ij}(\lambda)\xi_j\bar\xi_i \ge 0\}$ as $\xi = (\xi_1, \ldots, \xi_r)$ varies over all vectors with rational coordinates. If $M(\lambda)$ is not nonnegative definite $\mu$-a.e. then there exists a set $\Lambda$ of positive $\mu$-measure in the complement of $\Lambda_0$, and a rational vector $\xi$ such that for $\lambda$ in $\Lambda$, $\sum_i\sum_j m_{ij}(\lambda)\xi_j\bar\xi_i < 0$. This would imply

$$(M(\Lambda)\xi, \xi) = \int_\Lambda \sum_i\sum_j m_{ij}(\lambda)\xi_j\bar\xi_i\,d\mu < 0$$

contradicting our assumption that M is a positive matrix measure.
Consider next the set of all positive matrix measures on $\mathbb{R}$. We say that M divides N, and write M|N, if there exists a Borel matrix function H such that

$$dM = H^*\,dN\,H \tag{6-15}$$

Two matrix measures M and N are equivalent, and we write M ∼ N, if M|N and N|M.

The division relation is clearly reflexive and transitive and hence induces a partial order in the set of all matrix measures. Relation (6-15) is a generalization of the concept of absolute continuity as applied to matrix measures. Heuristically the matrix function H has the interpretation of a "square root" of a generalized Radon-Nikodym derivative of M with respect to N. We point out that M and N do not necessarily have to be of the same size. In that case H will not be a square matrix. For scalar measures $\mu$ and $\nu$ we have of course $\mu|\nu$ if and only if $\mu \ll \nu$. The partial order in the set of positive matrix measures is reflected in the corresponding $L^2(M)$ spaces. Given two matrix measures M and N we say that a map $U: L^2(M) \to L^2(N)$ is an embedding if it is an injective B-homomorphism. If U is also an isometry we say U is an isometric embedding. The next lemma provides a large class of isometric embeddings.
Lemma 6-2 Let M and N be positive matrix measures and assume that M|N. Then there exists an isometric embedding of $L^2(M)$ into $L^2(N)$.
PROOF Since M|N there exists a measurable matrix function H such that (6-15) holds. Define $U_M^N: L^2(M) \to L^2(N)$ by

$$U_M^N F = HF \qquad \text{for } F \in L^2(M) \tag{6-16}$$

Then clearly

$$\|U_M^N F\|^2 = \int (dN\,HF, HF) = \int (H^*\,dN\,H\,F, F) = \int (dM\,F, F) = \|F\|^2$$

So $U_M^N$ is an isometry and it is easily checked that it is a B-homomorphism.

We note that the set of isometries $U_M^N$ is a coherent set of isometries [18] in the sense that if M|N and N|S then we have

$$U_M^S = U_N^S U_M^N \tag{6-17}$$
The equivalence of two matrix measures can also be described in terms of their density matrices with respect to a common scalar measure. To this end we define a notion of equivalence between measurable matrix functions. Let M and N be Borel measurable n × m matrix functions defined on a subset of $\mathbb{R}$, and let $\sigma$ be a positive measure on $\mathbb{R}$. We say that M and N are $\sigma$-equivalent if there exist $\sigma$-a.e. invertible measurable n × n and m × m matrix functions P and R such that

$$M(\lambda) = P(\lambda)N(\lambda)R(\lambda) \qquad \sigma\text{-a.e.} \tag{6-18}$$

If M and N are square matrix functions we say that M and N are unitarily $\sigma$-equivalent if there exists a measurable $\sigma$-a.e. unitary matrix function P such that

$$M(\lambda) = P(\lambda)^*N(\lambda)P(\lambda) \qquad \sigma\text{-a.e.} \tag{6-19}$$

It is clear that both relations are bona fide equivalence relations and unitary $\sigma$-equivalence implies $\sigma$-equivalence. Also if $\nu$ is a positive measure and $\nu \ll \sigma$ then $\sigma$-equivalence implies $\nu$-equivalence.
We prove now a lemma which is the main technical result needed for the proof of completeness. In terms of the equivalence notions introduced we can state it as follows.
Lemma 6-3 Let M be a positive matrix measure and let $M(\lambda)$ be its density matrix with respect to a positive measure $\mu$ that satisfies $\mu_{ij} \ll \mu$. Then there exists a diagonal matrix function D such that M and D are unitarily $\mu$-equivalent. Alternately stated, there exists a measurable matrix function H such that

$$H(\lambda)^*H(\lambda) = I \tag{6-20}$$

and

$$M(\lambda) = H(\lambda)^*D(\lambda)H(\lambda) \tag{6-21}$$

hold $\mu$-a.e.
PROOF We note that $\mu$-a.e. $M(\lambda)$ is a nonnegative definite matrix and hence can be diagonalized by a unitary matrix. The content of the lemma is that the pointwise diagonalizations can be fitted together in a globally measurable way.

Observe first that if there exists a sequence of sets $e_n$ and $\mu$-measurable matrix functions $H^{(n)}(\lambda)$ and $D^{(n)}(\lambda)$, which are unitary and diagonal, respectively, and such that $M(\lambda) = H^{(n)}(\lambda)^*D^{(n)}(\lambda)H^{(n)}(\lambda)$ holds $\mu$-a.e. on $e_n$, then $H(\lambda)$ and $D(\lambda)$ defined by $H(\lambda) = H^{(n)}(\lambda)$ and $D(\lambda) = D^{(n)}(\lambda)$ for $\lambda \in e_n - \bigcup_{i<n} e_i$ satisfy (6-20) and (6-21) on $\bigcup_{n=1}^\infty e_n$. By a theorem of Lusin [29], given any $\varepsilon > 0$ the $m_{ij}$ are actually continuous on a set whose complement has at most $\mu$-measure $\varepsilon$. Taking a sequence of $\varepsilon_n$ converging to zero, and recalling that $\mu$ is a finite measure, then, but for a set of $\mu$-measure zero, $\mathbb{R}$ is the union of measurable sets where all $m_{ij}$ are continuous. By the preceding remark we only need to prove the lemma on a measurable set e where all $m_{ij}$ are continuous.

Now if the $m_{ij}(\lambda)$ are continuous on a set e so are the eigenvalues of $M(\lambda)$, that is, the zeros of $\det[M(\lambda) - \zeta I]$, for they are the zeros of a polynomial with continuously varying coefficients. However, in general it is impossible to choose a continuously varying set of corresponding eigenvectors. The trouble can occur at points where the multiplicity of an eigenvalue changes. To avoid this difficulty we let $n(\lambda)$ denote the number of distinct eigenvalues of $M(\lambda)$. Then the sets $\{\lambda \in e \mid n(\lambda) \ge s\}$ are open in the relative topology of e. It follows that $e_s = \{\lambda \in e \mid n(\lambda) = s\} = \{\lambda \in e \mid n(\lambda) \ge s\} - \{\lambda \in e \mid n(\lambda) \ge s+1\}$ is a Borel set. Thus e decomposes into the union of $e_1, \ldots, e_n$, which are disjoint Borel sets, and it suffices to construct $H(\lambda)$ and $D(\lambda)$ on any of the $e_k$.

Let $\lambda_0 \in e_k$ and let $\varphi_1(\lambda_0), \ldots, \varphi_k(\lambda_0)$ be the distinct eigenvalues of $M(\lambda_0)$. Then there exists, by continuity of the eigenvalues, a unique enumeration $\varphi_1(\lambda), \ldots, \varphi_k(\lambda)$ of the eigenvalues of $M(\lambda)$ such that the $\varphi_i(\lambda)$ are continuous on $e_k$. Let $\varepsilon > 0$ be such that no two distinct eigenvalues of $M(\lambda_0)$ are closer than $\varepsilon$. Let $\gamma_i$ be a positively oriented circular path with center at $\varphi_i(\lambda_0)$ and radius less than $\varepsilon/2$. Let $E_{M(\lambda_0)}$ and $E_{M(\lambda)}$ be the spectral measures of $M(\lambda_0)$ and $M(\lambda)$, respectively. For $\lambda$ in a sufficiently small neighborhood $N_1$ of $\lambda_0$, $\varphi_i(\lambda)$ will be within the circle $\gamma_i$. In that case we have

$$E_i(\lambda_0) = E_{M(\lambda_0)}(\{\varphi_i(\lambda_0)\}) = \frac{1}{2\pi i}\int_{\gamma_i} R(\zeta, M(\lambda_0))\,d\zeta$$

and

$$E_i(\lambda) = E_{M(\lambda)}(\{\varphi_i(\lambda)\}) = \frac{1}{2\pi i}\int_{\gamma_i} R(\zeta, M(\lambda))\,d\zeta$$

Since $M(\lambda)$ varies continuously with $\lambda \in N_1$ so does $R(\zeta, M(\lambda))$, and therefore $E_i(\lambda)$ is a continuous function of $\lambda \in N_1$.

Choose an orthonormal basis $v_1, \ldots, v_n$ of $\mathbb{C}^n$ consisting of eigenvectors of $M(\lambda_0)$, ordered so that $E_i(\lambda_0)v_j = v_j$ for $n_{i-1} < j \le n_i$, $0 = n_0 < n_1 < \cdots < n_k = n$. Define $\beta_j(\lambda) = E_i(\lambda)v_j$ for $n_{i-1} < j \le n_i$. The $\beta_j(\lambda)$ depend continuously on $\lambda \in N_1$ and, as $E_i(\lambda)\beta_j(\lambda) = \beta_j(\lambda)$, they form a basis of eigenvectors of $M(\lambda)$, but it may fail to be an orthonormal basis. Since eigenvectors corresponding to distinct eigenvalues of a self-adjoint operator are orthogonal, we can produce an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors of $M(\lambda)$ by applying the Gram-Schmidt process to each of the sets $\beta_{n_{i-1}+1}(\lambda), \ldots, \beta_{n_i}(\lambda)$. If $v_{n_{i-1}+1}(\lambda), \ldots, v_{n_i}(\lambda)$ is the resultant orthonormal basis for the range of $E_i(\lambda)$ then $v_1(\lambda), \ldots, v_n(\lambda)$ is an orthonormal basis for $\mathbb{C}^n$ and $M(\lambda)v_i(\lambda) = \varphi_i(\lambda)v_i(\lambda)$, i = 1, ..., n. Let $e_1, \ldots, e_n$ be the standard orthonormal basis of $\mathbb{C}^n$ and let $H(\lambda)$ be the matrix function determined by $H(\lambda)v_i(\lambda) = e_i$, i = 1, ..., n. As $H(\lambda)$ transforms one orthonormal basis into another it is necessarily unitary. Moreover, from $M(\lambda)v_i(\lambda) = \varphi_i(\lambda)v_i(\lambda)$ it follows that $H(\lambda)^*D(\lambda)H(\lambda)v_i(\lambda) = \varphi_i(\lambda)v_i(\lambda)$, that is, (6-21) holds with

$$D(\lambda) = \mathrm{diag}(\varphi_1(\lambda), \ldots, \varphi_n(\lambda))$$

This completes the proof of the lemma.
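Pointwise, Lemma 6-3 is one unitary diagonalization per value of $\lambda$; the measurability question disappears on a finite grid. The sketch below (the $\lambda$-dependent density matrix is an arbitrary choice, not from the text) checks (6-20) and (6-21), with $H(\lambda)$ defined, as in the proof, by sending the eigenvectors to the standard basis:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 11)

def M(lam):
    """An arbitrary continuously varying nonnegative definite density matrix."""
    C = np.array([[1.0, lam], [0.0, 1.0 - lam]])
    return C.T @ C

for lam in grid:
    w, V = np.linalg.eigh(M(lam))          # M = V diag(w) V*
    H = V.T                                # H v_i = e_i, so M = H* D H as in (6-21)
    D = np.diag(w)
    assert np.allclose(H.T @ H, np.eye(2))       # (6-20)
    assert np.allclose(H.T @ D @ H, M(lam))      # (6-21)
    assert np.all(w >= -1e-12)                   # Lemma 6-1: nonnegative definite
```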
Theorem 6-4 If M is a positive matrix measure on $\mathbb{R}$ then $L^2(M)$ is a Hilbert space.

PROOF Let $\mu$, H, and D be as in the previous lemma and let $dD = D(\lambda)\,d\mu$. The map $U_M^D: L^2(M) \to L^2(D)$ given by (6-16) is an isometric embedding. However, since $D(\lambda) = H(\lambda)M(\lambda)H(\lambda)^*$, this embedding is invertible and we have

$$(U_M^D)^{-1} = U_D^M = (U_M^D)^* \tag{6-22}$$

where $U_D^M G = H^*G$. Thus $U_M^D$ is a unitary map and

$$L^2(D) = L^2(\delta_1) \oplus \cdots \oplus L^2(\delta_r) \tag{6-23}$$

where the $\delta_i$ are the measures defined by $\delta_i(\sigma) = \int_\sigma d_i(\lambda)\,d\mu$. Thus $L^2(D)$ is complete and so is $L^2(M)$.
For $\varphi \in B$ we define the operator of multiplication by $\varphi$ in $L^2(M)$ by

$$M_{\varphi,M}F = \varphi F \qquad \text{for } F \in L^2(M) \tag{6-24}$$

In terms of the identity function $\chi$, $\chi(\lambda) = \lambda$, we can summarize the previous results and exhibit a functional representation for the Hilbert space H and the self-adjoint operator A acting in it.

Theorem 6-5 An operator A in a Hilbert space H is unitarily equivalent to an operator $M_{\chi,M}$ in $L^2(M)$, for some positive matrix measure M on the real line, if and only if A is a finitely generated self-adjoint operator.
The combination of Lemma 6-3 and Theorem 6-4 which is now available to us naturally poses the question of canonical forms. Our aim is to simplify a density matrix by transformations of the form (6-19) for a $\sigma$-a.e. unitary measurable matrix function P, and this simplification will be reflected in a simpler spectral representation for the corresponding self-adjoint operator. This problem of canonical forms for self-adjoint operators is a classical one, first resolved by Hellinger. Our approach uses only simple matrix manipulation. The price for that is the loss of generality involved in assuming finite multiplicity.

Lemma 6-6 Let $L = (\lambda_{ij})$ be a positive matrix measure and let $\sigma$ be a positive measure such that each $\lambda_{ij}$ is absolutely continuous with respect to $\sigma$. Then there exists a diagonal matrix measure M with diagonal entries $\mu_1, \ldots, \mu_p$ such that $d\mu_i = m_i\,d\sigma$ and the following statements hold:

(a) $\mu_1 \gg \mu_2 \gg \cdots \gg \mu_p$, and
(b) L and M are unitarily $\sigma$-equivalent.

Moreover if N is another diagonal matrix measure with diagonal entries $\nu_1, \ldots, \nu_p$ such that $d\nu_i = n_i\,d\rho$ and the statements

(a′) $\nu_1 \gg \nu_2 \gg \cdots \gg \nu_p$, and
(b′) L and N are unitarily $\rho$-equivalent

hold, then M and N are unitarily $\tau$-equivalent, where $\tau = \rho \wedge \sigma$ is the infimum of the measures $\rho$ and $\sigma$ [62].

PROOF By Lemma 6-3 it suffices to show that a given diagonal matrix measure L can be reduced to canonical form. Thus without loss of generality we let L be diagonal with diagonal elements $\lambda_1, \ldots, \lambda_p$ where, by assumption, $\lambda_i \ll \sigma$. Let $d\lambda_i = l_i\,d\sigma$, that is, $l_i$ is the Radon-Nikodym derivative of $\lambda_i$ with respect to $\sigma$. For simplicity of notation we assume p = 2. Let $\lambda_2 = \lambda_2' + \lambda_2''$ be the Lebesgue decomposition of $\lambda_2$ with respect to $\lambda_1$, with $\lambda_2' \ll \lambda_1$ and $\lambda_2'' \perp \lambda_1$. Let $l_2 = l_2' + l_2''$ with $l_2'$ and $l_2''$ the respective Radon-Nikodym derivatives of $\lambda_2'$ and $\lambda_2''$ with respect to $\sigma$. Let $E_2 = \{\lambda \mid l_2''(\lambda) \ne 0\}$ and $F_2 = \{\lambda \mid l_2''(\lambda) = 0\}$ and let $\chi_{E_2}$ and $\chi_{F_2}$ be the corresponding characteristic functions of the two sets. Define a 2 × 2 matrix function $H(\lambda)$ by

$$H(\lambda) = \begin{pmatrix} \chi_{F_2}(\lambda) & \chi_{E_2}(\lambda) \\ \chi_{E_2}(\lambda) & \chi_{F_2}(\lambda) \end{pmatrix}$$

A simple calculation yields the equality

$$\tilde L(\lambda) = H(\lambda)\,L(\lambda)\,H(\lambda) \tag{6-25}$$

where

$$L(\lambda) = \begin{pmatrix} l_1(\lambda) & 0 \\ 0 & l_2(\lambda) \end{pmatrix} \qquad \text{and} \qquad \tilde L(\lambda) = \begin{pmatrix} l_1(\lambda) + l_2''(\lambda) & 0 \\ 0 & l_2'(\lambda) \end{pmatrix} \tag{6-26}$$

which proves the statement for p = 2, since $\lambda_2' \ll \lambda_1 \ll \lambda_1 + \lambda_2''$. The modifications needed to make the proof work for p > 2 are obvious. Thus we have proved the existence of the canonical diagonalization.

To prove the uniqueness part we note the obvious fact that if L and M are unitarily $\sigma$-equivalent they are also unitarily $\sigma'$-equivalent for any $\sigma' \ll \sigma$.
It follows, forming the infimum $\tau = \rho \wedge \sigma$ of the measures $\rho$ and $\sigma$, that by transitivity M and N are unitarily $\tau$-equivalent. Thus $\tau$-a.e. the diagonal matrices

$$\begin{pmatrix} m_1'(\lambda) & & 0 \\ & \ddots & \\ 0 & & m_p'(\lambda) \end{pmatrix} \qquad \text{and} \qquad \begin{pmatrix} n_1'(\lambda) & & 0 \\ & \ddots & \\ 0 & & n_p'(\lambda) \end{pmatrix}$$

are unitarily equivalent. Here $m_i'$ and $n_i'$ are the Radon-Nikodym derivatives of $\mu_i$ and $\nu_i$ with respect to $\tau$. Since assumptions (a) and (a′) imply that $m_{i+1}'(\lambda) = 0$ whenever $m_i'(\lambda) = 0$, and similarly for the $n_i'$, it follows that the zero sets of $m_i'$ and $n_i'$ are equal $\tau$-a.e. This is equivalent to $\mu_i \sim \nu_i$.
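For atomic measures the 2 × 2 reduction step of the proof can be carried out explicitly: the singular part of $\lambda_2$ with respect to $\lambda_1$ sits on the atoms where the density $l_1$ vanishes, and $H(\lambda)$ swaps the coordinates exactly there. A numerical sketch (the weights are arbitrary choices, not from the text):

```python
import numpy as np

# densities of two atomic measures lambda_1, lambda_2 w.r.t. sigma, on five atoms
l1 = np.array([0.5, 0.0, 1.2, 0.0, 0.3])
l2 = np.array([0.4, 0.7, 0.0, 0.9, 0.1])

# Lebesgue decomposition of lambda_2 w.r.t. lambda_1: a.c. part + singular part
l2_ac = np.where(l1 > 0, l2, 0.0)
l2_s = np.where(l1 == 0, l2, 0.0)

new1, new2 = l1 + l2_s, l2_ac          # the reduced diagonal densities of (6-26)

for k in range(len(l1)):
    on_E2 = l2_s[k] != 0               # chi_{E_2}(lambda) at this atom
    H = np.array([[0.0, 1.0], [1.0, 0.0]]) if on_E2 else np.eye(2)
    L = np.diag([l1[k], l2[k]])
    assert np.allclose(H @ L @ H, np.diag([new1[k], new2[k]]))   # (6-25)

# ordering (a): new2 << new1, i.e. new1 is positive wherever new2 is
assert np.all(new1[new2 > 0] > 0)
```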
As an immediate corollary we obtain the ordered spectral representation of a finitely generated self-adjoint operator A. The integer p below is referred to as the multiplicity of A.

Theorem 6-7 Let A be a finitely generated self-adjoint operator in a Hilbert space H. Then there exists a finite sequence of positive measures $\mu_1 \gg \mu_2 \gg \cdots \gg \mu_p$ such that A is unitarily equivalent to

$$M_{\chi,\mu_1} \oplus \cdots \oplus M_{\chi,\mu_p} \tag{6-27}$$

acting in

$$L^2(\mu_1) \oplus \cdots \oplus L^2(\mu_p) \tag{6-28}$$

The sequence $\mu_1, \ldots, \mu_p$ is determined by A up to equivalence of measures.
There is another representation associated with a self-adjoint operator which is closely related to the ordered spectral representation.

Theorem 6-8 Let A be a finitely generated self-adjoint operator in a Hilbert space H. Then there exists a finite sequence of mutually singular positive measures $\nu_1, \ldots, \nu_p$ such that A is unitarily equivalent to

$$M_{\chi,N_1} \oplus \cdots \oplus M_{\chi,N_p} \tag{6-29}$$

acting in

$$L^2(N_1) \oplus \cdots \oplus L^2(N_p) \tag{6-30}$$

where $N_j = \nu_j I_j$, $I_j$ being the j × j identity matrix. The sequence of measures $\nu_1, \ldots, \nu_p$ is determined by A up to equivalence of measures.

The representation (6-29) is referred to as the canonical spectral representation of A.

The passage from Theorem 6-7 to Theorem 6-8 is straightforward, using repeatedly the Lebesgue decomposition theorem for measures. We omit the details.
We remark that an alternative way of writing the canonical spectral representation is to define the matrix measure N by

$$N = \begin{pmatrix} \nu_1 + \cdots + \nu_p & & & \\ & \nu_2 + \cdots + \nu_p & & \\ & & \ddots & \\ & & & \nu_p \end{pmatrix} \tag{6-31}$$

Then A is unitarily equivalent to the operator

$$M_{\chi,N} \tag{6-32}$$

acting in $L^2(N)$.
Having obtained a spectral representation for a self-adjoint operator it is easy to extend the functional calculus constructed in Sec. 5. Let $\Phi: H \to \oplus_{i=1}^{p} L^2(\mu_i)$ be the ordered spectral representation of a finitely generated self-adjoint operator A. Since the spectral representation is ordered we have $\mu_i \ll \mu_1$ for all i; that is, multiplication by $L^\infty(\mu_1)$ functions makes sense in each $L^2(\mu_i)$ and the usual module axioms are satisfied. Thus also the direct sum $\oplus_{i=1}^{p} L^2(\mu_i)$ becomes a module over $L^\infty(\mu_1)$. Using the map $\Phi$ we induce an $L^\infty(\mu_1)$ module structure in H by defining, for each $x \in H$ and $\varphi \in L^\infty(\mu_1)$,

$$\varphi(A)x = \Phi^{-1}(\varphi \cdot \Phi x) \tag{6-33}$$

It is easily checked that for functions $\varphi \in B$ the new definition coincides with the previous one. We summarize the result.

Theorem 6-9 The map $\varphi \mapsto \varphi(A)$ defined by (6-33) is an isometric algebra isomorphism of $L^\infty(\mu_1)$ into B(H) that satisfies $\bar\varphi(A) = \varphi(A)^*$.
Suppose now we are given two self-adjoint operators $A_1$ and $A_2$ acting in $H_1$ and $H_2$, respectively. By our functional calculus each of the spaces $H_1$ and $H_2$ becomes a B-module. Given a map $X: H_1 \to H_2$ that intertwines $A_1$ and $A_2$, that is, for which

$$XA_1 = A_2X \tag{6-34}$$

it follows easily that for each function $\varphi \in B$ we have

$$X\varphi(A_1) = \varphi(A_2)X \tag{6-35}$$

and this means that $X: H_1 \to H_2$ is a B-module homomorphism. Our object is to study these module homomorphisms in terms of the spectral representations of $A_1$ and $A_2$.

As a consequence of Theorem 6-5 the study of operators intertwining two (finitely generated) self-adjoint operators reduces to the study of those intertwining two operators of the form $M_{\chi,M}$. In the set of all matrix measures we single out the scalar type measures, which are the matrix measures of the form $\sigma I$, that is, diagonal matrix measures with all diagonal elements equal to $\sigma$. Given a matrix measure M, a subspace K of $L^2(M)$ is called an invariant subspace if

$$M_{\varphi,M}K \subset K \tag{6-36}$$

for all $\varphi \in B$.

It is important to have a characterization of invariant subspaces and this is given by the next theorem.
Theorem 6-10 A subspace K of $L^2(\sigma I)$ is an invariant subspace if and only if $K = PL^2(\sigma I)$ where P is a measurable $\sigma$-a.e. projection valued matrix function.

PROOF The if part is trivial. So assume K is an invariant subspace of $L^2(\sigma I)$, where I is the n × n identity matrix. Let $e_1, \ldots, e_n$ be the standard orthonormal basis in $\mathbb{C}^n$. Let $\psi_i$ denote the orthogonal projection of the constant function $e_i$ onto K. We apply the Gram-Schmidt procedure to the vectors $\psi_1(\lambda), \ldots, \psi_n(\lambda)$ to obtain locally an orthonormal set $\varphi_1(\lambda), \ldots, \varphi_{n(\lambda)}(\lambda)$. The number $n(\lambda)$ of elements in the set varies with $\lambda$. It is clear from the way the $\varphi_i$ are constructed that they are measurable functions. Let $\varphi_i(\lambda)$ have $\varphi_{ij}(\lambda)$ as its components relative to the standard orthonormal basis in $\mathbb{C}^n$. Define a projection P by

$$(Pf)(\lambda) = P(\lambda)f(\lambda) = \sum_k (f(\lambda), \varphi_k(\lambda))\,\varphi_k(\lambda)$$

Relative to the standard orthonormal basis in $\mathbb{C}^n$, $P(\lambda)$ has the matrix representation $(p_{ij}(\lambda))$ with $p_{ij}(\lambda) = \sum_k \varphi_{ki}(\lambda)\overline{\varphi_{kj}(\lambda)}$ and is therefore measurable.

We will show that $K = PL^2(\sigma I)$. It is clear that $K \subset PL^2(\sigma I)$. Suppose $g \in PL^2(\sigma I)$ is orthogonal to K; then necessarily $\int (g(\lambda), \varphi_j(\lambda))\,\lambda^n\,d\sigma = 0$ for all n. This means that g is pointwise orthogonal $\sigma$-a.e. to all the $\varphi_j$, hence is the zero function by the definition of P.

Clearly a subspace K is invariant if and only if its orthogonal complement $K^\perp$ is invariant. If $P^\perp$ is the projection valued function corresponding to $K^\perp$ then we have $P^\perp = I - P$.
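Over a finitely supported $\sigma$, Theorem 6-10 can be illustrated directly: choose a projection $P(\lambda)$ at each atom and observe that $K = PL^2(\sigma I)$ is carried into itself by every multiplication operator. A numerical sketch (atoms and projections are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(5)
atoms = np.array([0.0, 1.0, 2.0])

# a projection valued function: at each atom, projection onto a random line in C^2
Ps = []
for _ in atoms:
    v = rng.standard_normal(2)
    v /= np.linalg.norm(v)
    Ps.append(np.outer(v, v))

phi = lambda t: np.cos(t)               # any bounded Borel function

F = [rng.standard_normal(2) for _ in atoms]
K_elem = [P @ f for P, f in zip(Ps, F)]                    # an element of K = P L^2(sigma I)
MK_elem = [phi(t) * g for t, g in zip(atoms, K_elem)]      # M_{phi} applied to it

# invariance (6-36): the result is still in K, i.e. fixed pointwise by P
for P, g in zip(Ps, MK_elem):
    assert np.allclose(P @ g, g)
```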
Theorem 6-11 Let X: L2(al) -. L2(aI) be a B-homomorphism. Then there exists a measurable a-a.e. bounded matrix function such that (XF) (A) = .(A) F(A)
for all
F e L2(aI)
(6-37)
Conversely any operator X defined by (6-37) is a B-homomorphism.
PROOF If Ξ is a measurable matrix function satisfying ‖Ξ(λ)‖ ≤ M α-a.e. for some M > 0, then for X defined by (6-37) we have

‖XF‖² = ∫ ‖Ξ(λ)F(λ)‖² dα ≤ M² ∫ ‖F(λ)‖² dα

which shows that ‖X‖ ≤ M. It is obvious that X is a B-homomorphism.

Conversely, assume X is a bounded B-homomorphism. We suppose that I is the identity matrix in Cⁿ and e₁, …, eₙ is an orthonormal basis there. Since Xeᵢ ∈ L²(αI) we choose some representative for it, say ξᵢ, which is defined up to a set of α-measure zero. We define Ξ(λ) by Ξ(λ)eᵢ = ξᵢ(λ) and extend Ξ(λ) by linearity to all of Cⁿ. For any Borel function φ and all x ∈ Cⁿ we have f = φx ∈ L²(αI) and therefore

∫ |φ(λ)|² ‖Ξ(λ)x‖² dα = ‖Xf‖² ≤ ‖X‖² ‖f‖² = ‖X‖² ∫ |φ(λ)|² ‖x‖² dα

From this we conclude that ‖Ξ(λ)x‖ ≤ ‖X‖ ‖x‖ α-a.e. Furthermore, from its definition, Ξ is measurable.

Given a scalar type measure αI and a matrix measure M with M | αI, or equivalently dM = H(λ)*H(λ) dα, we will write U_M^α for the isometric embedding of L²(M) into L²(αI). If M(λ) is the Radon-Nikodym derivative of M with respect to α then M(λ) = H(λ)*H(λ) α-a.e. It will be of interest to have a concrete representation for (U_M^α)*, the adjoint of the isometric embedding U_M^α.
Theorem 6-12 Let M be a matrix measure with M | αI and

dM = M(λ) dα = H(λ)*H(λ) dα   (6-38)

Let P be the projection valued function corresponding to the invariant subspace U_M^α L²(M) of L²(αI). Then we have

(U_M^α)* G = H†PG   (6-39)

for all G ∈ L²(αI), where H† is the pseudoinverse of H.

PROOF Let F ∈ L²(M) and G ∈ L²(αI); then

(U_M^α F, G) = (F, (U_M^α)* G)

and

(U_M^α F, G) = ∫ (H(λ)F(λ), G(λ)) dα = ∫ (H(λ)F(λ), P(λ)G(λ)) dα

Since PG ∈ U_M^α L²(M) there exists an element G₀ ∈ L²(M) such that HG₀ = PG. By the definition of the pseudoinverse we have H = HH†H, and therefore

PG = HG₀ = HH†HG₀ = HH†PG

Using this equality we obtain

(F, (U_M^α)* G) = ∫ (H(λ)F(λ), P(λ)G(λ)) dα = ∫ (H(λ)F(λ), H(λ)H(λ)†P(λ)G(λ)) dα
= ∫ (H(λ)*H(λ)F(λ), H(λ)†P(λ)G(λ)) dα = (F, H†PG)_{L²(M)}

which proves (6-39).
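The identity H = HH†H used in the proof is one of the Moore-Penrose conditions defining the pseudoinverse. As a quick numerical sketch (the rank-deficient matrix below is an arbitrary illustration, not taken from the text), it can be checked with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
# a deliberately rank-deficient 4 x 3 matrix (rank <= 2)
H = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))
Hp = np.linalg.pinv(H)                # Moore-Penrose pseudoinverse H†

print(np.allclose(H @ Hp @ H, H))     # True: H = H H† H
print(np.allclose(Hp @ H @ Hp, Hp))   # True: the dual identity H† = H† H H†
```

The same identities hold pointwise for the matrix function H(λ) in (6-38).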
Using this theorem we can obtain a representation for the adjoint of any isometric embedding.
Corollary 6-13 Let M and N be matrix measures such that M | N, and let αI be a scalar-type measure divisible by both M and N. Assume dM = H*H dα and dN = K*K dα; then

(U_M^N)* F = H†QKF   (6-40)

for all F ∈ L²(N), where Q is the projection valued function corresponding to the invariant subspace U_M^α L²(M) of L²(αI).

PROOF We have U_M^α = U_N^α U_M^N and hence (U_M^α)* = (U_M^N)*(U_N^α)*. Since U_N^α is isometric we have

(U_M^N)* = (U_M^α)* U_N^α   (6-41)

Applying Theorem 6-12 to (6-41) yields (6-40).
The next two results are instances of lifting theorems. They describe complicated B-homomorphisms between two spaces of type L²(M) in terms of B-homomorphisms of L²(αI), which have been described in Theorem 6-11.

Lemma 6-14 Let M be a matrix measure and assume M | αI. Let X: L²(αI) → L²(M) be a B-homomorphism. Then there exists a B-homomorphism X̃: L²(αI) → L²(αI) for which

X = (U_M^α)* X̃   (6-42)

and ‖X‖ = ‖X̃‖. This implies the existence of a measurable, α-a.e. bounded matrix function Ξ, with ‖Ξ‖∞ = ‖X‖, in terms of which we have the representation

XF = H†PΞF   for F ∈ L²(αI)   (6-43)

where P is the projection valued matrix function corresponding to U_M^α L²(M). Conversely, any map X: L²(αI) → L²(M) defined by (6-42), where X̃ is a B-homomorphism, is also a B-homomorphism and

‖X‖ = ‖X̃‖ = ‖Ξ‖∞   (6-44)

PROOF If X̃: L²(αI) → L²(αI) is a B-homomorphism then so is its composition with (U_M^α)*, and obviously (6-44) holds. Conversely, let X: L²(αI) → L²(M) be a B-homomorphism. Define X̃: L²(αI) → L²(αI) by

X̃F = U_M^α XF   for F ∈ L²(αI)   (6-45)

Clearly X̃, as a product of B-homomorphisms, is also one, and since U_M^α is isometric ‖X̃‖ = ‖X‖. By Theorem 6-11 there exists an α-a.e. bounded measurable matrix function Ξ for which X̃F = ΞF, and hence (6-43) holds by an application of Theorem 6-12.
Theorem 6-15 Let M and N be matrix measures and X: L²(M) → L²(N) a B-homomorphism. Let αI be a positive scalar-type measure divisible by both M and N, and let dM = H*H dα and dN = K*K dα. Let P and Q be the measurable projection valued functions corresponding to U_M^α L²(M) and U_N^α L²(N), respectively. Then there exists a B-homomorphism X̃: L²(αI) → L²(αI) satisfying ‖X̃‖ = ‖X‖ for which

XF = (U_N^α)* X̃ U_M^α F   for F ∈ L²(M)   (6-46)

Moreover, there exists a measurable, α-a.e. bounded matrix function Ξ satisfying

‖Ξ‖∞ = ‖X̃‖ = ‖X‖   (6-47)

and

Ξ(λ) = Q(λ)Ξ(λ) = Ξ(λ)P(λ)   α-a.e.   (6-48)

and for which

XF = K†ΞHF   for all F ∈ L²(M)   (6-49)
Conversely, every operator X defined by (6-49) for a measurable and α-a.e. bounded Ξ is a B-homomorphism from L²(M) into L²(N).

PROOF If X is given by (6-49) then it is clearly a B-homomorphism and satisfies (6-47). Let us assume therefore that X: L²(M) → L²(N) is a B-homomorphism. Define Y: L²(αI) → L²(N) by

YF = X(U_M^α)* F   (6-50)

Y is a B-homomorphism as a product of such, and Y | {U_M^α L²(M)}⊥ = 0, or equivalently YP⊥L²(αI) = 0, which reduces to

YP⊥ = 0   (6-51)

If we now apply Lemma 6-14 then we obtain

Y = (U_N^α)* X̃   (6-52)

for a B-homomorphism X̃: L²(αI) → L²(αI). Now X̃F = ΞF, where Ξ is a measurable, α-a.e. bounded matrix function that satisfies ‖Ξ‖∞ = ‖X̃‖ = ‖X‖. Since by (6-51) X̃P⊥L²(αI) = U_N^α YP⊥L²(αI) = 0 we have

ΞP⊥ = 0   (6-53)

which is equivalent to

Ξ = ΞP   (6-54)

Also, since X̃ = U_N^α Y, we have ΞL²(αI) ⊂ U_N^α L²(N) = QL²(αI), which implies

Q⊥Ξ = 0   (6-55)

or equivalently

Ξ = QΞ   (6-56)

and (6-48) is proved. We note also that (6-48) implies the equality

ΞP⊥ = Q⊥Ξ   (6-57)

Representation (6-49) now follows from (6-50), (6-52), and the formulas for U_M^α and (U_N^α)*.
We note for future reference that X*: L²(N) → L²(M) is also a B-homomorphism. In terms of the notation of the previous theorem we have the following corollary.

Corollary 6-16 If X: L²(M) → L²(N) is the B-homomorphism having the representation (6-49) with (6-48) satisfied, then X*: L²(N) → L²(M) is a B-homomorphism having the representation

X*G = (U_M^α)* X̃* U_N^α G   for G ∈ L²(N)   (6-58)

or, more specifically,

X*G = H†PΞ*KG   (6-59)

where

Ξ(λ)* = Ξ(λ)*Q(λ) = P(λ)Ξ(λ)*   (6-60)

holds α-a.e.
For the analysis of the deeper properties of intertwining operators we will introduce several relevant notions of coprimeness. All definitions will be relative to a fixed positive scalar measure α. A measurable projection valued n × n matrix function P will be called trivial with respect to α, or α-trivial, if P(λ) = I α-a.e. Two measurable matrix functions A and B, of sizes n × m and n × l respectively, are called α-left coprime if there exists no α-nontrivial projection function P for which A = PA and B = PB. We denote the α-left coprimeness of A and B by (A, B)_L = I. Analogously we define α-right coprimeness and denote it by (A, B)_R = I. There is also a stronger notion of coprimeness. We say A and B are strongly α-left coprime, and write [A, B]_L = I, if there exists a δ > 0 such that for all ξ with ‖ξ‖ = 1 we have

‖A(λ)*ξ‖ + ‖B(λ)*ξ‖ ≥ δ   α-a.e.   (6-61)

Again the analogous notion of strong α-right coprimeness is introduced in the same manner. The above definitions extend easily to the coprimeness of a finite number of matrix functions.

As expected, the coprimeness relations are connected with the ideal structure
in the algebra of bounded measurable functions.

Theorem 6-17
(a) Let A₁, …, A_p be bounded measurable n × mᵢ matrix valued functions. Then there exist bounded measurable mᵢ × n matrix valued functions Bᵢ such that

Σᵢ₌₁ᵖ Aᵢ(λ)Bᵢ(λ) = I   α-a.e.   (6-62)

if and only if

[A₁, …, A_p]_L = I   (6-63)

(b) Let A₁, …, A_p be measurable mᵢ × n matrix functions. Then there exist n × mᵢ matrix functions Bᵢ such that

Σᵢ₌₁ᵖ Bᵢ(λ)Aᵢ(λ) = I   α-a.e.   (6-64)

if and only if

[A₁, …, A_p]_R = I   (6-65)
PROOF Assume there exist Bᵢ such that (6-62) holds. Taking adjoints and applying the resulting equality to a unit vector ξ we have ξ = Σᵢ Bᵢ(λ)*Aᵢ(λ)*ξ, and hence

1 = ‖Σᵢ Bᵢ(λ)*Aᵢ(λ)*ξ‖ ≤ Σᵢ ‖Bᵢ(λ)*‖ ‖Aᵢ(λ)*ξ‖ ≤ B Σᵢ ‖Aᵢ(λ)*ξ‖

where B = maxᵢ ‖Bᵢ‖∞. Equivalently we have

Σᵢ ‖Aᵢ(λ)*ξ‖ ≥ 1/B

that is, [A₁, …, A_p]_L = I. Conversely, assume A₁, …, A_p are strongly α-left coprime. From (6-63) it follows that

Σᵢ ‖Aᵢ(λ)*ξ‖² ≥ δ²   (6-66)

for some δ > 0 and all unit vectors ξ. Inequality (6-66) can be rewritten as Σᵢ Aᵢ(λ)Aᵢ(λ)* ≥ δ²I. Thus Σᵢ Aᵢ(λ)Aᵢ(λ)* is measurable and invertible in the algebra of all bounded measurable n × n matrix functions. Define Bᵢ by

Bᵢ(λ) = Aᵢ(λ)* (Σⱼ Aⱼ(λ)Aⱼ(λ)*)⁻¹

Then the Bᵢ are bounded and measurable and (6-62) holds. Part (b) follows by a simple duality argument.

The following corollary justifies the distinction between α-left coprimeness and strong α-left coprimeness.
Corollary 6-18 If A₁, …, A_p are bounded measurable n × mᵢ matrix valued functions then [A₁, …, A_p]_L = I implies (A₁, …, A_p)_L = I.

PROOF Assume [A₁, …, A_p]_L = I. Then there exist Bᵢ such that Σᵢ AᵢBᵢ = I. From this it follows that A₁, …, A_p cannot have a common α-nontrivial projection valued left factor. Thus the α-left coprimeness of A₁, …, A_p follows.
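The Bezout-type construction in the proof of Theorem 6-17 can be tried out numerically at a single point λ. Below, random matrices stand in for the values Aᵢ(λ); this is only an illustrative sketch of the formula Bᵢ = Aᵢ*(Σⱼ AⱼAⱼ*)⁻¹, assuming the sum is invertible, which is exactly strong coprimeness at that point:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 3, 2, 2
A = [rng.standard_normal((n, m)) for _ in range(p)]   # stand-ins for the values A_i(lambda)

G = sum(Ai @ Ai.T for Ai in A)            # G = sum_i A_i A_i^*; here G >= delta^2 I
B = [Ai.T @ np.linalg.inv(G) for Ai in A] # B_i = A_i^* G^{-1}

# the Bezout identity (6-62): sum_i A_i B_i = I
print(np.allclose(sum(Ai @ Bi for Ai, Bi in zip(A, B)), np.eye(n)))  # True
```

When G fails to be boundedly invertible on a set of positive measure, no such bounded Bᵢ exist, which is the content of the theorem.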
The various coprimeness relations provide the language in which to phrase the next result.
Theorem 6-19 Let X: L²(M) → L²(N) be a B-homomorphism having the representation (6-49) with relation (6-48) satisfied. Then

(a) X has dense range if and only if

(Ξ, Q⊥)_L = I   (6-67)

(b) X is one-to-one if and only if

(Ξ, P⊥)_R = I   (6-68)

(c) X has a bounded right inverse if and only if

[Ξ, Q⊥]_L = I   (6-69)

(d) X has a bounded left inverse if and only if

[Ξ, P⊥]_R = I   (6-70)
PROOF
(a) The range of X is dense in L²(N) if and only if the range of X̃ is dense in U_N^α L²(N) = QL²(αI). This occurs if and only if the span of the two linear manifolds {ΞHF | F ∈ L²(M)} and Q⊥L²(αI) is all of L²(αI). Now {ΞHF | F ∈ L²(M)} = ΞPL²(αI), and since ΞP⊥ = Q⊥Ξ it follows that ΞP⊥L²(αI) ⊂ Q⊥L²(αI). Hence X has dense range if and only if the span of ΞL²(αI) and Q⊥L²(αI) is L²(αI). Since the span of two invariant subspaces is an invariant subspace, we apply Theorem 6-10 on the characterization of invariant subspaces to obtain the result that

ΞL²(αI) ∨ Q⊥L²(αI) = L²(αI)   (6-71)

if and only if (6-67) holds.

(b) This follows from (a) by a duality argument. X is one-to-one if and only if X*: L²(N) → L²(M) has dense range. Now X* is given by (6-59) with relation (6-60) holding. By applying part (a), X* has dense range if and only if

(Ξ*, P⊥)_L = I   (6-72)

which is equivalent to (6-68).

(c) Assume (6-69) holds. By Theorem 6-17 there exist matrix valued functions Θ and R such that

Ξ(λ)Θ(λ) + Q⊥(λ)R(λ) = I   α-a.e.   (6-73)
Define maps Ỹ: L²(αI) → L²(αI) and Y: L²(N) → L²(M) by

ỸF = ΘF   for F ∈ L²(αI)   (6-74)

and

YF = (U_M^α)* Ỹ U_N^α F   for F ∈ L²(N)   (6-75)

Obviously Ỹ and Y are bounded linear operators. We claim XY = I. Let F ∈ L²(N); then

XYF = (U_N^α)* X̃ U_M^α (U_M^α)* Ỹ U_N^α F

Since U_M^α is an isometry, U_M^α(U_M^α)* is the projection on the range of U_M^α, which is just multiplication by the projection valued function P. So

XYF = K†QΞPΘKF

and using the equality ΞP = QΞ as well as Q² = Q yields

XYF = K†QΞΘKF

From (6-73) we have ΞΘ = I − Q⊥R, and since QQ⊥ = 0 we have

XYF = K†QKF = (U_N^α)*U_N^α F = F

To prove the necessity of condition (6-69) for the existence of a bounded right inverse for X it suffices, by duality considerations, to prove the necessity of condition (6-70) for the existence of a bounded left inverse for X. Thus assume (6-70) is not satisfied. We will show the existence of a sequence of functions Fₙ in L²(M) such that lim ‖Fₙ‖ = 1 and lim ‖XFₙ‖ = 0. This would imply the nonexistence of a bounded left inverse for X. Since (6-70) is not satisfied, for each n > 0 there exists a unit vector ξₙ for which

‖Ξ(λ)ξₙ‖ + ‖P⊥(λ)ξₙ‖ < 1/n   (6-76)

for all λ in a set Aₙ of positive α-measure. Let χ_{Aₙ} be the characteristic function of the set Aₙ; then

Ψₙ(λ) = [α(Aₙ)]^{−1/2} χ_{Aₙ}(λ) ξₙ

is a function in L²(αI) of norm one. We decompose Ψₙ relative to the direct sum L²(αI) = PL²(αI) ⊕ P⊥L²(αI) to obtain Ψₙ = Φₙ + Γₙ with

Φₙ = PΨₙ   and   Γₙ = P⊥Ψₙ

Since Φₙ ∈ PL²(αI) = U_M^α L²(M) we have Φₙ = U_M^α Fₙ for some Fₙ ∈ L²(M) with ‖Fₙ‖ = ‖Φₙ‖. We note also that

‖Γₙ‖ = [α(Aₙ)]^{−1/2} (∫_{Aₙ} ‖P⊥(λ)ξₙ‖² dα)^{1/2} ≤ 1/n
and therefore

lim ‖Fₙ‖² = lim {‖Ψₙ‖² − ‖Γₙ‖²} = 1

We will show now that lim ‖XFₙ‖ = 0. Now

XFₙ = (U_N^α)* X̃ U_M^α Fₙ = (U_N^α)* X̃ Φₙ

and

X̃Ψₙ = [α(Aₙ)]^{−1/2} χ_{Aₙ} Ξξₙ

We now give the following estimate:

‖XFₙ‖ = ‖(U_N^α)* X̃Φₙ‖ ≤ ‖X̃Ψₙ‖ + ‖X̃Γₙ‖
≤ [α(Aₙ)]^{−1/2} (∫_{Aₙ} ‖Ξ(λ)ξₙ‖² dα)^{1/2} + ‖Ξ‖∞ ‖Γₙ‖
≤ 1/n + ‖Ξ‖∞/n = (1 + ‖Ξ‖∞)/n

which completes the proof of (c).

(d) Follows from (c) by duality.
One final remark is in order. Throughout this chapter we have dealt with self-adjoint operators, but essentially all results can be easily translated to the case of one or a pair of unitary operators. This is best done through the use of the Cayley transform.

Thus let U be a unitary operator in a Hilbert space H. Since Ker(I − U) = Ker(I − U*) by Lemma 2-5, it follows that this kernel is a reducing subspace of U. Thus without loss of generality we will assume 1 ∉ σ_p(U). Define A by

A = i(I + U)(I − U)⁻¹

Then A is a, not necessarily bounded, self-adjoint operator and can be represented as

A = ∫ λ E(dλ)

for some spectral measure E on R. The Cayley transform can be inverted, its inverse being given by

U = (A − iI)(A + iI)⁻¹

and as, for real λ, the function ψ(λ) = (λ − i)/(λ + i) is continuous and satisfies |ψ(λ)| = 1, it follows by the functional calculus for self-adjoint operators that

U = ∫ (λ − i)/(λ + i) E(dλ)   (6-77)
124
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE
Define now a spectral measure F on the unit circle T by

F(σ) = E[ψ⁻¹(σ)]

for each Borel set σ on the unit circle. Then a change of variable in the integral representation yields

U = ∫ e^{it} F(dt)

which is the form of the spectral theorem for unitary operators. In an analogous way we can obtain the spectral representation theory for unitary operators, as well as the characterization of intertwining operators. We will use these results in the sequel without formalizing them as theorems.

There is, however, one difference which should be pointed out. A subspace which is invariant for a self-adjoint operator A is automatically a reducing subspace. This is no longer the case for a unitary operator. Thus if μ is a measure defined on the unit circle then there are subspaces of L²(μI) which are invariant but not reducing. We study these subspaces in Sec. 12. Therefore in this case Theorem 6-10 is a characterization of the reducing subspaces of the unitary operator M_{λ,μI}. There is also a difference regarding the definition of cyclicity. A unitary operator U in H is called cyclic, and x a cyclic vector, if the smallest reducing subspace for U which contains x is all of H. This is equivalent to the density in H of all linear combinations of the vectors Uⁿx, n ∈ Z. This differs from the standard definition of cyclicity.
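For a bounded self-adjoint operator the Cayley transform and its inverse can be illustrated with a finite matrix; the following is only a numerical sketch of the formulas above, for an arbitrary real symmetric A:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M + M.T                                   # a self-adjoint (real symmetric) matrix
I = np.eye(4)

U = (A - 1j * I) @ np.linalg.inv(A + 1j * I)  # U = (A - iI)(A + iI)^{-1}
print(np.allclose(U.conj().T @ U, I))         # True: U is unitary

A_back = 1j * (I + U) @ np.linalg.inv(I - U)  # A = i(I + U)(I - U)^{-1}
print(np.allclose(A_back, A))                 # True: the transform inverts
```

Note that I − U is invertible here because 1 is never an eigenvalue of U when A is self-adjoint, matching the assumption 1 ∉ σ_p(U).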
7. THE DOUGLAS FACTORIZATION THEOREM AND RELATED RESULTS

This section is devoted to a factorization result of Douglas and to various corollaries of it.
Theorem 7-1 Let A and B be bounded operators in a Hilbert space H. Then the following statements are equivalent:

(a) A = BC for some bounded operator C.
(b) AA* ≤ λ²BB* for some λ > 0.
(c) Range A ⊂ Range B.
PROOF Statement (c) follows trivially from (a). Similarly, if (a) holds then A* = C*B* and therefore for each x in H

‖A*x‖² = ‖C*B*x‖² ≤ ‖C*‖² ‖B*x‖²

which is equivalent to (b) with λ = ‖C*‖.

Now assume (b) holds; then for all vectors x in H, ‖A*x‖ ≤ λ‖B*x‖. Define an operator D from Range B* into H by DB*x = A*x; then ‖DB*x‖ ≤ λ‖B*x‖ and D extends by continuity to the closure of Range B*. Since (Range B*)⊥ = Ker B, if we define D | Ker B = 0 then D is a well-defined bounded operator for which DB* = A*. So (a) holds with C = D*.

Finally assume (c) holds. Let B₀ be the restriction of B to {Ker B}⊥. Obviously B₀ is an injective operator onto Range B, which implies that B₀⁻¹ exists as a closed operator from Range B into {Ker B}⊥. Since Range A ⊂ Range B, the operator C = B₀⁻¹A is a closed, well-defined operator from H into {Ker B}⊥ and, by the closed graph theorem, is necessarily bounded. From this we get A = BC, which completes the proof.
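The implication (c) ⇒ (a) can be illustrated numerically: for matrices, the factor C = B₀⁻¹A of the proof is realized by the Moore-Penrose pseudoinverse. A sketch with arbitrary matrices (B is made rank deficient on purpose, and A is built so that Range A ⊂ Range B):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))  # rank <= 2
A = B @ rng.standard_normal((4, 4))   # forces Range A ⊂ Range B

C = np.linalg.pinv(B) @ A             # C maps into (Ker B)⊥, as in the proof
print(np.allclose(B @ C, A))          # True: A = BC
```

Since B pinv(B) is the orthogonal projection onto Range B, the product BC reproduces A exactly when (c) holds.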
We single out the special case λ = 1.
Corollary 7-2 Let A and B be bounded operators in a Hilbert space H. Then there exists a contraction Z such that

A = BZ   (7-1)

if and only if

AA* ≤ BB*   (7-2)

The above corollary can be strengthened somewhat if we have equality in (7-2).
Corollary 7-3 Let A and B be bounded operators in a Hilbert space H. Then

AA* = BB*   (7-3)

if and only if there exists a partial isometry U, with Range B* as its final space, such that

A = BU   (7-4)

PROOF Assume there exists a partial isometry U such that (7-4) holds. Then UU* is the orthogonal projection on the final space of U, that is, on Range B*. Thus UU*B* = B*, which implies (7-3). Conversely assume (7-3). Then for all x in H we have ‖A*x‖ = ‖B*x‖. Define V by VB*x = A*x and V | Ker B = 0; then V extends by continuity to a partial isometry with Range B* as initial space. Thus (7-4) follows with U = V*. Obviously the initial space of V is the final space of U.
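The "if" direction of Corollary 7-3 is easy to check numerically; in this sketch U is taken unitary, which is the special case of a partial isometry whose initial and final spaces are the whole space:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3))
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # an orthogonal (unitary) matrix

A = B @ U
print(np.allclose(A @ A.T, B @ B.T))  # True: AA* = BUU*B* = BB*
```

The converse direction, constructing U from A and B, follows the map VB*x = A*x of the proof.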
We note that Theorem 7-1 holds just as well for operators A and B whose domains of definition are different Hilbert spaces. The interesting case is when B acts on a direct sum of Hilbert spaces.

Theorem 7-4 Let H₀, H₁, …, Hₙ and K be Hilbert spaces and let Aᵢ ∈ B(Hᵢ, K). Then there exist Zᵢ ∈ B(H₀, Hᵢ) such that

A₀ = Σᵢ₌₁ⁿ AᵢZᵢ   (7-5)

if and only if for some λ > 0

A₀A₀* ≤ λ² Σᵢ₌₁ⁿ AᵢAᵢ*   (7-6)

The operators Zᵢ can be taken to satisfy

Σᵢ₌₁ⁿ Zᵢ*Zᵢ ≤ I_{H₀}   (7-7)

if and only if (7-6) holds with λ = 1.

PROOF We define B: H₁ ⊕ ⋯ ⊕ Hₙ → K by B(x₁, …, xₙ) = Σᵢ₌₁ⁿ Aᵢxᵢ; then its adjoint is given by B*y = (A₁*y, …, Aₙ*y). Condition (7-6) is equivalent to (b) of Theorem 7-1. The operator C: H₀ → H₁ ⊕ ⋯ ⊕ Hₙ is given by Cx = (Z₁x, …, Zₙx), and the result follows from Theorem 7-1 and its corollary, as C*C = Σᵢ Zᵢ*Zᵢ.

A different extension of Theorem 7-1 is obtained by relaxing somewhat condition (c) of the theorem. It suffices to assume the existence of a subset H₀ of the second category in H whose image under A is included in the range of B. This uses a somewhat stronger form of the closed graph theorem, as given in [96]. This observation is the key to the following.
Theorem 7-5 Let K, H₀, H₁, … be Hilbert spaces and let Aᵢ ∈ B(H₀, K) and Bᵢ ∈ B(Hᵢ, K). Assume that for each x in H₀ there exists an index i such that Aᵢx ∈ Range Bᵢ. Then for some index m, Range Aₘ ⊂ Range Bₘ and hence Aₘ = BₘC for some C ∈ B(H₀, Hₘ).

PROOF Let Mᵢ = {x ∈ H₀ | Aᵢx ∈ Range Bᵢ}. By assumption H₀ = ⋃ᵢ₌₁^∞ Mᵢ, and since H₀ is complete we have, by the Baire category theorem [96], that at least one of the Mᵢ, say Mₘ, is of the second category in H₀. By the remarks preceding the theorem we have Aₘ = BₘC.
8. SHIFTS, ISOMETRIES, AND THE WOLD DECOMPOSITION

Let N be a Hilbert space. By l²(−∞, ∞; N) we denote the Hilbert space of all doubly infinite sequences {xₙ}ₙ₌₋∞^∞ with xₙ ∈ N and Σₙ₌₋∞^∞ ‖xₙ‖² < ∞. Similarly we define l²(0, ∞; N) as the space of all one-sided sequences {xₙ}ₙ₌₀^∞ for which Σₙ₌₀^∞ ‖xₙ‖² < ∞. Define U in l²(−∞, ∞; N) by U{xₙ} = {yₙ} with

yₙ = xₙ₋₁   (8-1)

We call U the bilateral right shift in l²(−∞, ∞; N). Its adjoint U* acts by U*{xₙ} = {yₙ} with

yₙ = xₙ₊₁   (8-2)

Clearly both U and U* are isometric, hence both are unitary operators. U* is called the bilateral left shift.
Similarly we define the unilateral right shift S in l²(0, ∞; N) by S{xₙ} = {yₙ} where

yₙ = xₙ₋₁ for n > 0,  y₀ = 0   (8-3)

Again it is obvious that S is an isometry. However, its adjoint S*, given by

S*{xₙ} = {yₙ}   (8-4)

with

yₙ = xₙ₊₁   n ≥ 0

is not isometric, as it has a nontrivial null space. S* is referred to as the left shift, or sometimes as the backward shift.
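The two shifts can be sketched directly on finitely supported sequences; the arrays below stand for elements of l²(0, ∞; C) with only finitely many nonzero terms:

```python
import numpy as np

def right_shift(x):      # S: (x0, x1, ...) -> (0, x0, x1, ...)
    return np.concatenate(([0.0], x))

def left_shift(x):       # S*: (x0, x1, ...) -> (x1, x2, ...)
    return x[1:]

x = np.array([1.0, 2.0, 3.0])
print(right_shift(x))              # [0. 1. 2. 3.]
print(left_shift(right_shift(x)))  # [1. 2. 3.]  -- S*S = I
print(right_shift(left_shift(x)))  # [0. 2. 3.]  -- SS* kills x0: S* is not isometric
print(np.isclose(np.linalg.norm(right_shift(x)), np.linalg.norm(x)))  # True: S is isometric
```

The nontrivial null space of S* is visible in the third line: the coordinate x₀ is lost.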
The shift operators have been introduced in a concrete way. However, it is easy to abstract the properties which characterize shifts up to unitary equivalence. Let us begin with the unilateral shift. Consider the subspace L of l²(0, ∞; N) consisting of all sequences {xₙ} for which xₙ = 0 for n > 0. The subspace L has the following properties:

(a) L ⊥ SⁿL for n > 0, and
(b) l²(0, ∞; N) = ⊕ₙ₌₀^∞ SⁿL.

Let now V be a general isometry in a Hilbert space H. A subspace L will be called a wandering subspace for the isometry V if L ⊥ VⁿL for n > 0. Thus we can form the orthogonal direct sum of the subspaces VⁿL to obtain ⊕ₙ₌₀^∞ VⁿL. If we have H = ⊕ₙ₌₀^∞ VⁿL then V is clearly unitarily equivalent to the right shift in l²(0, ∞; L). The multiplicity of the right shift is defined as the dimension of L, where L is a spanning wandering subspace. L is uniquely determined by V and we have

L = {Range V}⊥ = Ker V*   (8-5)

We note that two unilateral shifts V and V₁ are unitarily equivalent if and only if they have the same multiplicity. Equal multiplicity follows from unitary equivalence by (8-5). Conversely, if V and V₁ are of the same multiplicity, let L and L₁ be their corresponding spanning wandering subspaces. Let {e_α | α ∈ A} and {e_α¹ | α ∈ A} be orthonormal bases for L and L₁, respectively; then {Vⁿe_α | n ≥ 0, α ∈ A} and {V₁ⁿe_α¹ | n ≥ 0, α ∈ A} are orthonormal bases for H and H₁, respectively. Define a map φ: H → H₁ by φ(Vⁿe_α) = V₁ⁿe_α¹ for all n ≥ 0, α ∈ A, and extend by linearity. Obviously φ is unitary and φV = V₁φ. Thus the unitary equivalence of V and V₁ is proved.

Let S be the right shift in l²(0, ∞; N). Let x = {ξₙ}; then x ∈ Range Sⁿ if and only if ξᵢ = 0 for i = 0, …, n − 1. Thus ⋂ₙ₌₀^∞ Range Sⁿ = {0}. This yields another characterization of unilateral shifts.
Lemma 8-1 Let V be an isometry in a Hilbert space H; then V is unitarily equivalent to a unilateral right shift if and only if

⋂ₙ₌₀^∞ VⁿH = {0}   (8-6)

PROOF It remains to prove the if part. Let L = (Range V)⊥ = H ⊖ VH = Ker V*. L is a wandering subspace for V, for VⁿL ⊂ VⁿH ⊂ VH ⊥ L. From L = H ⊖ VH it follows that H = L ⊕ VH, and since V is isometric VH = VL ⊕ V²H. Thus by an induction argument we have H = L ⊕ VL ⊕ ⋯ ⊕ VⁿL ⊕ Vⁿ⁺¹H. So

{⊕ₜ₌₀ⁿ VᵗL}⊥ = Vⁿ⁺¹H

and hence

{⊕ₙ₌₀^∞ VⁿL}⊥ = ⋂ₙ₌₀^∞ VⁿH   (8-8)

From the last equality it follows that (8-6) implies V is a unilateral shift. This lemma contains the essence of the next result, generally known as the Wold decomposition.
Theorem 8-2 Let V be an isometry in a Hilbert space H. Then there exists a unique decomposition of H into a direct sum of reducing subspaces of V, H = H₀ ⊕ H₁, such that V | H₀ is unitary and V | H₁ is a unilateral shift.

PROOF Let H₀ = ⋂ₙ₌₀^∞ VⁿH; then H₀ = ⋂ₙ₌₁^∞ VⁿH, as {VⁿH} is a monotonically decreasing sequence of subspaces. The invariance of H₀ is obvious. Now V*H₀ = V* ⋂ₙ₌₁^∞ VⁿH = ⋂ₙ₌₀^∞ VⁿH = H₀. Thus H₀ is reducing. Define L by L = H ⊖ VH; then by (8-8) H₁ = H ⊖ H₀ = {⋂ₙ₌₀^∞ VⁿH}⊥ = ⊕ₙ₌₀^∞ VⁿL. Since H₁ is the orthogonal complement of a reducing subspace of V it is also reducing. Now V and V* are clearly isometric when restricted to H₀, so V | H₀ is unitary, and V | H₁ is a unilateral shift.
As the unilateral shift could be defined abstractly, so can the bilateral shift. If U is the bilateral right shift in l²(−∞, ∞; N), we note that if we consider l²(0, ∞; N) as naturally embedded in l²(−∞, ∞; N) then the following properties hold:

U l²(0, ∞; N) ⊂ l²(0, ∞; N)

⋂ₙ₌₋∞^∞ Uⁿ l²(0, ∞; N) = {0}

and

⋁ₙ₌₋∞^∞ Uⁿ l²(0, ∞; N) = l²(−∞, ∞; N)

Taking these properties as our model, we call a subspace D an outgoing subspace for a unitary operator U acting in a Hilbert space H if it satisfies

UD ⊂ D

⋂ₙ₌₋∞^∞ UⁿD = {0}   (8-9)

⋁ₙ₌₋∞^∞ UⁿD = H

The definition of outgoing subspaces, as well as the following theorem which gives an intrinsic characterization of bilateral shifts, are due to Lax and Phillips [82].

Theorem 8-3 Let D be an outgoing subspace for a unitary operator U. Then U is unitarily equivalent to the bilateral shift in l²(−∞, ∞; N) for some Hilbert space N.

PROOF Let us define N by N = D ⊖ UD. Since D is invariant under U, U | D is isometric; hence, applying the Wold decomposition, we have D = {⊕ₙ₌₀^∞ UⁿN} ⊕ H₀ with H₀ reducing and U | H₀ unitary. Since ⋂ₙ₌₋∞^∞ UⁿD = {0} we have necessarily H₀ = {0}, and the last condition of (8-9) implies H = ⊕ₙ₌₋∞^∞ UⁿN. Clearly D = ⊕ₙ₌₀^∞ UⁿN. Thus H is isomorphic to l²(−∞, ∞; N) and U to the bilateral right shift.
9. CONTRACTIONS, DILATIONS, AND MODELS

The Wold decomposition for isometries in a Hilbert space can be extended to all contractions. We will say that a contraction T in a Hilbert space H is completely nonunitary if there exists no nontrivial reducing subspace of T on which T acts unitarily.
After decomposing a contraction T into its unitary and completely nonunitary parts, we will introduce the notion of isometric and unitary dilations and see how the study of a large class of contractions can be facilitated by identifying them as parts of special isometric or unitary operators.

We start by introducing some notation. Given a contraction T, both (I − T*T) and (I − TT*) are positive operators and hence have, by Theorem 5-7, unique positive square roots. Let us define

D_T = (I − T*T)^{1/2}   (9-1)

and

D_{T*} = (I − TT*)^{1/2}   (9-2)

Since TD_T² = T(I − T*T) = (I − TT*)T = D_{T*}²T, it follows by induction that for every complex polynomial p we have

Tp(D_T²) = p(D_{T*}²)T   (9-3)

Choosing a sequence of polynomials approximating uniformly the square root function on [0, 1], we have in the limit

TD_T = D_{T*}T   (9-4)

and, by taking adjoints, also

T*D_{T*} = D_T T*   (9-5)

Also we note that for all x ∈ H

‖x‖² = ‖Tx‖² + ‖D_T x‖²  and  ‖x‖² = ‖T*x‖² + ‖D_{T*} x‖²   (9-6)

D_T and D_{T*} are called the defect operators of T; they give a measure of the distance of T and T* from being isometric. Thus T is isometric if and only if D_T = 0, and similarly for T*. We let

𝒟 = closure of Range D_T   (9-7)

and

𝒟* = closure of Range D_{T*}   (9-8)

and call these the defect spaces of T, and their respective dimensions the defect numbers of T.
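The defect operators and the relations (9-4) and (9-6) can be checked numerically for a matrix contraction; the square root is computed via the spectral decomposition, as in Theorem 5-7. This is only a sketch with an arbitrary random contraction:

```python
import numpy as np

def psd_sqrt(M):
    # unique positive square root of a positive semidefinite matrix
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

rng = np.random.default_rng(5)
T = rng.standard_normal((4, 4))
T /= 1.1 * np.linalg.norm(T, 2)          # scale so that ||T|| < 1 (a contraction)

DT  = psd_sqrt(np.eye(4) - T.T @ T)      # D_T   = (I - T*T)^{1/2}
DTs = psd_sqrt(np.eye(4) - T @ T.T)      # D_{T*} = (I - TT*)^{1/2}

print(np.allclose(T @ DT, DTs @ T))      # True: the intertwining relation (9-4)

x = rng.standard_normal(4)
print(np.isclose(np.linalg.norm(x)**2,
                 np.linalg.norm(T @ x)**2 + np.linalg.norm(DT @ x)**2))  # True: (9-6)
```

For real matrices the adjoint is the transpose, so T.T plays the role of T*.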
Theorem 9-1 Let T be a contraction in a Hilbert space H. Then there exists a unique decomposition of H into a direct sum H = H₀ ⊕ H₁ of reducing subspaces of T such that T | H₀ is unitary and T | H₁ is completely nonunitary.

PROOF We define H₀ by H₀ = {x | ‖Tⁿx‖ = ‖T*ⁿx‖ = ‖x‖, n ≥ 0}. To see that H₀ is actually a subspace we note that ‖Tⁿx‖ = ‖x‖ if and only if (I − T*ⁿTⁿ)x = 0, and ‖T*ⁿx‖ = ‖x‖ if and only if (I − TⁿT*ⁿ)x = 0. Thus H₀ is the intersection of the kernels of all operators of the form (I − T*ⁿTⁿ) and (I − TⁿT*ⁿ) and hence is a subspace of H. Next we show that H₀ is invariant under T. Let x ∈ H₀; then ‖Tⁿ(Tx)‖ = ‖Tⁿ⁺¹x‖ = ‖x‖ = ‖Tx‖. Since T is a contraction we have ‖x‖ = ‖Tx‖ if and only if x = T*Tx. Using this we have for x ∈ H₀

‖T*ⁿTx‖ = ‖T*ⁿ⁻¹T*Tx‖ = ‖T*ⁿ⁻¹x‖ = ‖x‖ = ‖Tx‖

and the invariance of H₀ with respect to T is proved. As the definition of H₀ is symmetric with respect to T and T*, it follows that H₀ is invariant also under T* and is therefore a reducing subspace for T. From the definition of H₀ it is clear that T | H₀ is unitary. Let H₁ = H₀⊥; then necessarily T | H₁ is completely nonunitary, for if L ⊂ H₁ is a reducing subspace of T on which T acts unitarily then for x ∈ L we have ‖x‖ = ‖Tⁿx‖ = ‖T*ⁿx‖. Thus L ⊂ H₀ ∩ H₁ = {0}, which completes the proof.
Contrary to the case of isometries, where the structure was determined by the unitary part, completely described by the spectral theorem, and the unilateral shift, the structure of a general contraction in a Hilbert space is as complicated as that of its completely nonunitary part, which is generally difficult to describe. There is, however, a special subclass of contractions which is closely related to the shift operators and which points out the importance of shift operators as models for other operators.

Theorem 9-2 Let T be a contraction in a Hilbert space H. T* is unitarily equivalent to the restriction of a left shift to one of its invariant subspaces if and only if T*ⁿ → 0 strongly.
PROOF Let M ⊂ l²(0, ∞; N) be an invariant subspace of the left shift S*; then clearly (S* | M)ⁿ = S*ⁿ | M → 0 strongly, and so must every operator unitarily equivalent to it.

Conversely, let T*ⁿ → 0 strongly. Since ‖x‖² − ‖T*x‖² = ‖D_{T*}x‖² for every x ∈ H, we have

Σₙ₌₀^∞ ‖D_{T*}T*ⁿx‖² = ‖x‖²   (9-9)

Let 𝒟* be defined by (9-8) and let S be the right shift in l²(0, ∞; 𝒟*). If we define a map W: H → l²(0, ∞; 𝒟*) by

Wx = (D_{T*}x, D_{T*}T*x, D_{T*}T*²x, …)

it follows that

‖Wx‖² = Σₙ₌₀^∞ ‖D_{T*}T*ⁿx‖² = Σₙ₌₀^∞ {‖T*ⁿx‖² − ‖T*ⁿ⁺¹x‖²} = lim {‖x‖² − ‖T*ⁿ⁺¹x‖²} = ‖x‖²

Obviously W is isometric, Range W is S*-invariant, and WT* = S*W, which proves that T* is unitarily equivalent to S* | Range W. Let P be the orthogonal projection of l²(0, ∞; 𝒟*) onto Range W; then for all x, y ∈ Range W we have

(S*ⁿx, y) = (x, Sⁿy) = (Px, Sⁿy) = (x, PSⁿy)

and hence

(S* | Range W)*ⁿ = PSⁿ | Range W   (9-10)

which implies that the T of Theorem 9-2 is unitarily equivalent to the operator PS | Range W and, more generally, that Tⁿ is unitarily equivalent to PSⁿ | Range W.
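The isometry W in the proof can be approximated by truncating the sequence after many terms; since ‖T‖ < 1 in the sketch below, T*ⁿ → 0 and the tail is negligible. This is only a numerical illustration with an arbitrary matrix contraction:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 3
T = rng.standard_normal((d, d))
T /= 1.2 * np.linalg.norm(T, 2)           # ||T|| < 1, so T*^n -> 0 strongly

w, V = np.linalg.eigh(np.eye(d) - T @ T.T)
DTs = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T   # D_{T*} = (I - TT*)^{1/2}

x = rng.standard_normal(d)
# truncation of Wx = (D_{T*} x, D_{T*} T* x, D_{T*} T*^2 x, ...)
rows = [DTs @ np.linalg.matrix_power(T.T, n) @ x for n in range(200)]

print(np.isclose(sum(r @ r for r in rows), x @ x))  # True: ||Wx||^2 = ||x||^2
```

The telescoping sum in the proof is exactly what makes the squared norms of the rows add up to ‖x‖².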
This brings us to the following definitions. Let M be a subspace of a Hilbert space H and P the orthogonal projection of H onto M. Let T be a bounded operator in M and A a bounded operator in H. We say that A is a dilation of T if

T = PA | M   (9-11)

and a strong dilation of T if

Tⁿ = PAⁿ | M   for all n ≥ 0   (9-12)

We also refer to T as the compression of A. If the operator A has special properties, as for example if it is isometric or unitary, we will speak of (strong) isometric or unitary dilations. Given any dilation A of T we define H₀ = ⋁ₙ₌₀^∞ AⁿM. H₀ is clearly invariant under A, and obviously A | H₀ is also a dilation of T, and a strong dilation if A is a strong dilation. If we are interested in the uniqueness of strong dilations then the condition

H = ⋁ₙ₌₀^∞ AⁿM   (9-13)

is a natural one to impose. Next we proceed to study the question of the existence and uniqueness of strong
isometric and unitary dilations of contractions in a Hilbert space.

Theorem 9-3 Let T be a contraction in a Hilbert space H. Then there exists a strong isometric dilation V in a Hilbert space K ⊃ H satisfying

K = ⋁ₙ₌₀^∞ VⁿH   (9-14)

Any two isometric dilations satisfying condition (9-14) are unitarily equivalent.

PROOF Consider the Hilbert space l²(0, ∞; H) and the embedding ρ: H → l²(0, ∞; H) given by ρ(x) = (x, 0, 0, …). Let P be the projection of l²(0, ∞; H) onto ρ(H). We define a map W: l²(0, ∞; H) → l²(0, ∞; H) by

W(x₀, x₁, …) = (Tx₀, D_T x₀, x₁, x₂, …)   (9-15)

By virtue of (9-6) W is an isometry and is clearly a strong dilation of T. Since (9-14) is not necessarily satisfied we let K = ⋁ₙ₌₀^∞ Wⁿρ(H) and let V = W | K, which proves the existence.
Let now V and V₁ be strong isometric dilations of T, acting in K and K₁, satisfying condition (9-14). Since the vectors of the form {Σⱼ₌₀ⁿ Vʲxⱼ | n ≥ 0, xⱼ ∈ H} and {Σⱼ₌₀ⁿ V₁ʲxⱼ | n ≥ 0, xⱼ ∈ H} span K and K₁, respectively, we define a map φ: K → K₁ by

φ(Σⱼ₌₀ⁿ Vʲxⱼ) = Σⱼ₌₀ⁿ V₁ʲxⱼ   (9-16)

Since V and V₁ are isometric we have, assuming without loss of generality that i ≥ j,

(Vⁱxᵢ, Vʲxⱼ) = (V*ʲVⁱxᵢ, xⱼ) = (Vⁱ⁻ʲxᵢ, xⱼ) = (Vⁱ⁻ʲxᵢ, Pxⱼ) = (PVⁱ⁻ʲxᵢ, xⱼ) = (Tⁱ⁻ʲxᵢ, xⱼ)

Thus it follows that for all i and j

(Vⁱxᵢ, Vʲxⱼ) = (V₁ⁱxᵢ, V₁ʲxⱼ)

and hence

‖Σᵢ₌₀ⁿ Vⁱxᵢ‖² = Σᵢ₌₀ⁿ Σⱼ₌₀ⁿ (Vⁱxᵢ, Vʲxⱼ) = Σᵢ₌₀ⁿ Σⱼ₌₀ⁿ (V₁ⁱxᵢ, V₁ʲxⱼ) = ‖Σᵢ₌₀ⁿ V₁ⁱxᵢ‖²

So φ defined by (9-16) is isometric on a dense subset of K and hence has an isometric extension to all of K. Its range includes a dense subset of K₁ and hence φ, having closed range, is necessarily onto and therefore unitary. The relation φV = V₁φ follows trivially from the definition of φ, and this proves the unitary equivalence of V and V₁.
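The distinction between dilations (9-11) and strong dilations (9-12) is genuine, and is already visible for a 1 × 1 contraction T = [t]: a rotation matrix is a unitary dilation of t, but not a strong one. This is a hypothetical toy example, not taken from the text:

```python
import numpy as np

t = 0.6
s = np.sqrt(1.0 - t * t)
A = np.array([[t, -s],
              [s,  t]])   # a rotation: unitary, and dilates the contraction T = [t]

# compression to M = span{e0} is the (0, 0) entry
print(np.isclose(A[0, 0], t))            # True: A is a dilation, (9-11) holds
print(np.isclose((A @ A)[0, 0], t * t))  # False: (A^2)_{00} = t^2 - s^2, so (9-12) fails
```

A strong unitary dilation of [t] requires a larger space, which is exactly what the constructions of Theorems 9-3 and 9-5 provide.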
For a given contraction T, the isometric dilation V that satisfies (9-14), unique up to unitary equivalence, will be called the minimal isometric dilation. In that case we will also say that V* is the minimal coisometric dilation of T*.

We note that for each x ∈ H we have PVⁿ⁺¹x = Tⁿ⁺¹x = TTⁿx = PVPVⁿx, or PV(I − P)Vⁿx = 0 for all x ∈ H. If we assume that V is the minimal isometric dilation of T then the set of vectors of the form Vⁿx, x ∈ H, spans K, and hence we have PV(I − P) = 0, or PV = PVP. Taking adjoints we obtain

V*P = PV*P   (9-17)

which is equivalent to H being invariant under V*. Also, given x, y ∈ H, we have (V*x, y) = (x, Vy) = (x, PVy) = (x, Ty) = (T*x, y), which implies

T* = V* | H   (9-18)

Thus we have proved the following.

Theorem 9-4 Let T be a contraction in a Hilbert space H and let V be its minimal isometric dilation acting in the dilation space K. Then H is invariant under V* and (9-18) holds.
It is convenient to have a concrete representation for the minimal isometric dilation of a contraction T. This can be done and is summed up by the following.

Theorem 9-5 Let T be a contraction in a Hilbert space H. Then the operator V defined by the operator matrix

V = [ T     0     0    ···
      D_T   0     0    ···
      0     I_𝒟   0    ···
      0     0     I_𝒟  ···
      ⋮                 ⋱ ]    (9-19)

where I_𝒟 denotes the identity on the defect space 𝒟_T, acting in the Hilbert space

K = H ⊕ 𝒟_T ⊕ 𝒟_T ⊕ ···    (9-20)

is the minimal isometric dilation of T. V is unitarily equivalent to a unilateral right shift if and only if T*^n → 0 strongly.
PROOF Let (x₀, x₁, ...) ∈ K; then V(x₀, x₁, ...) = (Tx₀, D_T x₀, x₁, ...), and by (9-6) V is isometric. It is clearly well defined, as for each x₀ ∈ H we have D_T x₀ ∈ 𝒟_T. Thus we have only to show that K = ⋁_{n≥0} V^n H. Since obviously V^n H ⊂ K the inclusion ⋁_{n≥0} V^n H ⊂ K clearly holds. Now for each x₀ ∈ H,

V(x₀, 0, ...) − (Tx₀, 0, ...) = (0, D_T x₀, 0, ...)

which implies H ∨ VH = H ⊕ 𝒟_T ⊕ {0} ⊕ ···, and as V^n[{0} ⊕ 𝒟_T ⊕ {0} ⊕ ···] = {0} ⊕ ··· ⊕ {0} ⊕ 𝒟_T ⊕ {0} ⊕ ··· (with 𝒟_T in the (n+1)-st place), the result follows.
Finally, if V is a unilateral right shift then by (9-18) T* = V*|H and hence T*^n → 0 strongly. Conversely, assume T*^n → 0 strongly; then by Theorem 9-2, T has an isometric dilation S which is a unilateral right shift. It follows that the minimal isometric dilation is, up to unitary equivalence, S restricted to a reducing subspace of S. However, a unilateral shift restricted to a reducing subspace is also a unilateral shift. This together with the uniqueness of the minimal isometric dilation proves the theorem.

The importance of Theorems 9-2 and 9-5 will become clear once functional models for shifts are available. In that case a natural functional model is obtained for any contraction whose isometric dilation is a unilateral shift. These models are extremely useful for spectral analysis.

Given a contraction T we may apply the previous construction to T* instead of T to obtain the operator matrix

[ T*     0      0     ···
  D_{T*} 0      0     ···
  0      I_{𝒟*} 0     ···
  ⋮                    ⋱ ]    (9-21)

acting in

K' = H ⊕ 𝒟_{T*} ⊕ 𝒟_{T*} ⊕ ···    (9-22)

as the minimal isometric dilation of T*.
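As a finite-dimensional sanity check (not part of the text), the structure of (9-19) can be tested numerically by truncating the dilation space to finitely many blocks; the sizes and the random contraction below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 5                                  # block size and number of blocks kept
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = 0.9 * M / np.linalg.norm(M, 2)           # a contraction: ||T|| < 1

# defect operator D_T = (I - T*T)^(1/2), computed via the spectral theorem
w, Q = np.linalg.eigh(np.eye(n) - T.conj().T @ T)
DT = Q @ np.diag(np.sqrt(np.clip(w, 0, None))) @ Q.conj().T

# truncation of (9-19): V(x0, x1, ...) = (T x0, D_T x0, x1, x2, ...)
V = np.zeros(((N + 1) * n, N * n), dtype=complex)
V[:n, :n] = T
V[n:2 * n, :n] = DT
for j in range(1, N):                        # remaining coordinates are shifted down
    V[(j + 1) * n:(j + 2) * n, j * n:(j + 1) * n] = np.eye(n)

assert np.allclose(V.conj().T @ V, np.eye(N * n))          # V*V = I: V is isometric
assert np.allclose(DT @ DT + T.conj().T @ T, np.eye(n))    # the defect relation (9-6)
```

The isometry of the truncated matrix reduces, column by column, exactly to the defect relation D_T² + T*T = I used in the proof.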
We can now put the two pieces together to obtain a matrix representation for the minimal unitary dilation of a contraction T. The resulting matrix (9-23) is known as the Schäffer matrix.
Theorem 9-6 Let T be a contraction in a Hilbert space H. Then there is a minimal strong unitary dilation of T and two minimal unitary dilations of T are unitarily equivalent.
PROOF Let K = ··· ⊕ 𝒟_{T*} ⊕ 𝒟_{T*} ⊕ H ⊕ 𝒟_T ⊕ 𝒟_T ⊕ ···, where 𝒟_T and 𝒟_{T*} are the defect spaces of T defined by (9-7) and (9-8), respectively. Define U to be the lower triangular operator matrix

U = [ ⋱
      ···  I_{𝒟*}  0       0     0     ···
      ···  0       D_{T*}  T     0     ···
      ···  0       −T*     D_T   0     ···
      ···  0       0       0     I_𝒟   ···
                                       ⋱ ]    (9-23)

where the column containing T and D_T corresponds to the H coordinate. Thus if x = (..., x_{−1}, x₀, x₁, ...) ∈ K with the x₀ coordinate in H we have

Ux = (..., x_{−2}, Tx₀ + D_{T*}x_{−1}, D_T x₀ − T*x_{−1}, x₁, x₂, ...)    (9-24)

To show that U is well defined in K it suffices to show that D_T x₀ − T*x_{−1} ∈ 𝒟_T. However, x_{−1} ∈ 𝒟_{T*} and this follows by virtue of the relation T*D_{T*} = D_T T*.
To show that U is isometric we observe that

‖Tx₀ + D_{T*}x_{−1}‖² + ‖D_T x₀ − T*x_{−1}‖²
  = ‖Tx₀‖² + ‖D_{T*}x_{−1}‖² + 2 Re(Tx₀, D_{T*}x_{−1}) + ‖D_T x₀‖² + ‖T*x_{−1}‖² − 2 Re(D_T x₀, T*x_{−1})
  = {‖Tx₀‖² + ‖D_T x₀‖²} + {‖T*x_{−1}‖² + ‖D_{T*}x_{−1}‖²}
  = ‖x₀‖² + ‖x_{−1}‖²

Here we used relation (9-5) to get (Tx₀, D_{T*}x_{−1}) = (x₀, T*D_{T*}x_{−1}) = (x₀, D_T T*x_{−1}) = (D_T x₀, T*x_{−1}), as well as (9-6). Since U* has a similar matrix representation, an analogous computation shows that U* is isometric, and hence the unitarity of U follows. That U is a strong dilation of T can be observed from the lower triangularity of the matrix U, or equivalently from (9-24).
The proof of minimality is along the lines of the proof of Theorem 9-3, whereas the unitary equivalence of two minimal unitary dilations of T is proved in the same way as the unitary equivalence of two minimal isometric dilations of T. Details are omitted.
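The unitarity of the Schäffer matrix rests entirely on its central 2×2 block and on the intertwining relation T D_T = D_{T*} T. A small numerical illustration (random contraction, sizes arbitrary, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = 0.8 * M / np.linalg.norm(M, 2)           # a contraction

def defect(A):                               # D_A = (I - A*A)^(1/2)
    w, Q = np.linalg.eigh(np.eye(len(A)) - A.conj().T @ A)
    return Q @ np.diag(np.sqrt(np.clip(w, 0, None))) @ Q.conj().T

DT, DTs = defect(T), defect(T.conj().T)

# the central block of the Schaffer matrix (9-23), acting on D_T* (+) H
C = np.block([[DTs, T], [-T.conj().T, DT]])

assert np.allclose(C.conj().T @ C, np.eye(2 * n))   # isometric
assert np.allclose(C @ C.conj().T, np.eye(2 * n))   # coisometric, hence unitary
assert np.allclose(T @ DT, DTs @ T)                 # intertwining T D_T = D_T* T
```

The off-diagonal cancellations in C*C are exactly the cross-term cancellations in the isometry computation of the proof.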
It should be observed that the matrix representation (9-19) of the minimal isometric dilation of T is just the lower right-hand corner of the Schäffer matrix (9-23), and hence the restriction of the minimal unitary dilation U of T to the invariant subspace ⋁_{n≥0} U^n H is the minimal isometric dilation V of T.

We now introduce notation that simplifies the description of the geometry of the dilation space. We define subspaces ℒ and ℒ_* of the dilation space K by

ℒ = closure of (U − T)H    (9-25)

and

ℒ_* = closure of (U* − T*)H    (9-26)
Clearly, if a vector x in K is written as a two-sided sequence (..., x_{−2}, x_{−1}, x₀, x₁, x₂, ...) with x₀ ∈ H, x_i ∈ 𝒟_T for i > 0 and x_i ∈ 𝒟_{T*} for i < 0, then ℒ is the space of all vectors for which x_i = 0 for i ≠ 1 and ℒ_* the space of all vectors for which x_i = 0 for i ≠ −1. Moreover, for n ≥ 0, x ∈ U^nℒ if and only if x_i = 0 for i ≠ n + 1, and x ∈ U^{*n}ℒ_* if and only if x_i = 0 for i ≠ −n − 1. The dilation space K now has the direct sum representation

K = ··· ⊕ U*ℒ_* ⊕ ℒ_* ⊕ H ⊕ ℒ ⊕ Uℒ ⊕ ···    (9-27)
Obviously the two-sided direct sums of subspaces ⊕_{n=−∞}^{∞} U^nℒ and ⊕_{n=−∞}^{∞} U^{*n}ℒ_* are subspaces of the dilation space K, and a natural question is under what circumstances one or the other actually equals K.
Theorem 9-7 Let T be a contraction in a Hilbert space H and let U be its minimal unitary dilation acting in K. Then

(a) K = ⊕_{n=−∞}^{∞} U^nℒ if and only if T^n tends to zero strongly
(b) K = ⊕_{n=−∞}^{∞} U^{*n}ℒ_* if and only if T^{*n} tends to zero strongly
(c) K = (⊕_{n=−∞}^{∞} U^nℒ) ∨ (⊕_{n=−∞}^{∞} U^{*n}ℒ_*) if and only if T is completely nonunitary.

PROOF Assume T^n tends to zero strongly. Since ⊕_{n=−∞}^{∞} U^nℒ is obviously a reducing subspace for U, and since ⋁_{n≥0} U^nH = K by minimality, it is sufficient to show that H ⊂ ⊕_{n=−∞}^{∞} U^nℒ. So for x ∈ H we write x − U^{−n}T^n x as a telescoping sum:

x − U^{−n}T^n x = Σ_{j=0}^{n−1} (U^{−j}T^j x − U^{−j−1}T^{j+1}x)
               = Σ_{j=0}^{n−1} U^{−j}(I − U^{−1}T)T^j x
               = Σ_{j=0}^{n−1} U^{−j−1}(U − T)T^j x

which implies that x − U^{−n}T^n x ∈ ⊕_{j=1}^{n} U^{−j}ℒ. But since T^n x tends to zero we have, by passing to the limit, that x ∈ ⊕_{j=1}^{∞} U^{−j}ℒ ⊂ ⊕_{n=−∞}^{∞} U^nℒ. Since, under the hypothesis of (a), V* = U^{−1}|⊕_{j=1}^{∞} U^{−j}ℒ is the minimal isometric dilation of T* and is obviously a unilateral shift, the converse is contained in Theorem 9-2. Thus part (a) is proved, and part (b) follows by symmetry.
Finally, to prove (c), let N be defined by

N = [(⊕_{n=−∞}^{∞} U^nℒ) ∨ (⊕_{n=−∞}^{∞} U^{*n}ℒ_*)]^⊥

We will show that N = {0} if and only if T is completely nonunitary. To this end we need to identify the vectors in N. By the representation (9-27) of the dilation space K it is clear that N ⊂ H. If x ∈ N then for all y ∈ H and n > 0 we have

0 = (x, U^{−n}(U − T)y) = (U^{n−1}x, y) − (U^n x, Ty) = (T^{n−1}x, y) − (T^n x, Ty)

In particular the choice y = T^{n−1}x yields the equality ‖T^{n−1}x‖ = ‖T^n x‖. In an analogous fashion we show ‖T^{*(n−1)}x‖ = ‖T^{*n}x‖ for all n > 0. Thus N is equal to the subspace H₀ = {x | ‖x‖ = ‖T^n x‖ = ‖T^{*n}x‖, n ≥ 0}. By Theorem 9-1, T is completely nonunitary if and only if H₀ = {0}, and this completes the proof of the theorem.
We conclude this section with a proof of Naimark's theorem on unitary representations of groups. In Sec. 4 we already saw the close connection between positive definite functions and integral representations. For example, {c_n} is a positive definite sequence if and only if c_n = ∫ e^{int} dμ for a positive measure μ on T. If we introduce in the Hilbert space L²(μ) the unitary operator U defined by Uf = χf, where χ(e^{it}) = e^{it}, and if φ is the function in L²(μ) defined by φ(e^{it}) = 1, then c_n = (U^nφ, φ) for all n ∈ ℤ. We can consider U^n to be a unitary representation of the group ℤ. The formula c_n = (U^nφ, φ) can also be considered a dilation result. If c_n is considered as a contraction on the Hilbert space ℂ of complex numbers, we can identify ℂ with the subspace H of L²(μ) of all constant functions. If P is the orthogonal projection of L²(μ) onto H then c_n = PU^n|H.

This circle of ideas can be generalized considerably. Given a group G and a complex Hilbert space H, a B(H)-valued function T on G is said to be positive definite if

Σ_i Σ_j (T(g_i^{−1}g_j) h_j, h_i) ≥ 0    (9-28)

for every finite set of vectors h_i ∈ H and any g_i ∈ G. A unitary representation of G is a homomorphism U of G into the set of unitary operators on H which satisfies U(e) = I. The connection between positive definite functions on groups and unitary representations is the content of the following theorem of Naimark.
Theorem 9-8 Let U(g) be a unitary representation of the group G in the Hilbert space K and let H be a subspace of K with P the orthogonal projection of K on H. Then

T(g) = PU(g)|H    (9-29)

is a positive definite function which is weakly continuous if U(g) is. Conversely, if T(g) is a positive definite B(H)-valued function on G for which T(e) = I, then there exists a Hilbert space K ⊃ H and a unitary representation U(g) of G such that (9-29) holds together with

K = ⋁_{g∈G} U(g)H    (9-30)

If G is a topological group and T(g) weakly continuous then so is the unitary representation.
PROOF Assume U(g) is a unitary representation of the group G. Then with T(g) defined by (9-29) we have, for h_i ∈ H,

Σ_{i=1}^n Σ_{j=1}^n (T(g_i^{−1}g_j)h_j, h_i) = Σ_{i=1}^n Σ_{j=1}^n (PU(g_i)*U(g_j)h_j, h_i)
  = Σ_{i=1}^n Σ_{j=1}^n (U(g_j)h_j, U(g_i)h_i) = ‖Σ_{j=1}^n U(g_j)h_j‖² ≥ 0

Continuity of T follows from that of U by (9-29).

To prove the converse let T be a positive definite B(H)-valued function on G. Let L be the set of all finitely nonzero H-valued functions defined on G. L becomes a complex linear space under pointwise addition and multiplication by scalars. In L we introduce the inner product

(f, f') = Σ_{h∈G} Σ_{g∈G} (T(g^{−1}h) f(h), f'(g))    (9-31)
where the sum, having only a finite number of nonzero terms, makes sense. By the assumption of positive definiteness the above inner product is a nonnegative definite Hermitian form. It may happen, however, that (f, f) = 0 without f = 0. We denote by N the set of all elements f in L for which (f, f) = 0. From the Schwarz inequality it follows that if f ∈ N then (f, f') = 0 for all f' ∈ L; in particular N is a subspace. To make the inner product definite we factor out the set of null elements. Let therefore K' = L/N be the set of all equivalence classes modulo the null elements and K the completion of K' in the induced norm topology. So K is a Hilbert space. H can be considered as embedded in K by the following considerations. Define for a given x ∈ H the function f_x ∈ L by f_x(e) = x and f_x(g) = 0 for g ≠ e. Since

(f_x, f_y) = Σ_{h∈G} Σ_{g∈G} (T(g^{−1}h) f_x(h), f_y(g)) = (T(e)f_x(e), f_y(e)) = (x, y)

H is isometrically embedded in L and hence also in L/N and eventually in K. Define now translation operators in L as follows. Given f ∈ L let

(U(g₀)f)(h) = f(g₀^{−1}h) = f_{g₀}(h)
Given f, f' ∈ L we then have

(U(g₀)f, U(g₀)f') = (f_{g₀}, f'_{g₀}) = Σ_{h∈G} Σ_{g∈G} (T(g^{−1}h) f_{g₀}(h), f'_{g₀}(g))
  = Σ_{h∈G} Σ_{g∈G} (T(g^{−1}h) f(g₀^{−1}h), f'(g₀^{−1}g))
  = Σ_{h∈G} Σ_{g∈G} (T((g₀^{−1}g)^{−1}(g₀^{−1}h)) f(g₀^{−1}h), f'(g₀^{−1}g))

Now as h and g vary over all elements of G so do g₀^{−1}h and g₀^{−1}g. Hence

(U(g₀)f, U(g₀)f') = (f, f')

which shows that U(g₀) is isometric for each g₀ ∈ G. Since U(g₀) is invertible, in fact U(g₀)^{−1} = U(g₀^{−1}), U(g) is actually a unitary representation of G in L for which N is a reducing subspace. Passing to K' and K we still have a unitary representation, which we denote by the same letter. Let x, y ∈ H, identified with f_x, f_y ∈ L; then

(U(g₀)x, y) = Σ_{g∈G} Σ_{h∈G} (T(g^{−1}h) x(g₀^{−1}h), y(g))

Now y(g) = y for g = e and zero otherwise, and x(g₀^{−1}h) = x for h = g₀ and zero otherwise. Therefore it follows that

(U(g₀)x, y) = (T(g₀)x, y)    (9-32)

which is equivalent to (9-29). Since L is clearly the set of all finite linear combinations of translates of elements of H, we have L = ⋁_{g∈G} U(g)H, and (9-30) follows by going to the quotient space K' first and then to the completion.

Assume finally that T(g) is weakly continuous; then each function of the form (T(g)x, y) is continuous. Equality (9-32) shows that (U(g)x, y) is continuous for x and y in H, hence for x and y in L, and since ‖U(g)‖ = 1 continuity follows for all x, y in K. This completes the proof.
Naimark's theorem yields another approach to the existence of minimal unitary dilations of contractions. Let T be a contraction in the Hilbert space H. Define a B(H)-valued function T(n) on ℤ by

T(n) = T^n for n ≥ 0,  T(n) = T^{*|n|} for n < 0

Then Halperin [65] has observed that T(n) is a positive definite function on ℤ. The existence of a strong unitary dilation of T follows by a straightforward application of Naimark's theorem. A similar approach works for the unitary dilation of contractive semigroups.
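Halperin's observation can be illustrated numerically: condition (9-28) for G = ℤ says that the block Toeplitz matrix [T(j − i)] must be positive semidefinite. The contraction and sizes below are arbitrary test choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 6                                  # operator size, number of group elements
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = 0.9 * M / np.linalg.norm(M, 2)           # a contraction

def Tfun(k):                                 # Halperin's function: T^n or T*^|n|
    P = np.linalg.matrix_power
    return P(T, k) if k >= 0 else P(T.conj().T, -k)

# the Gram matrix of (9-28) with g_i = i in Z: blocks T(g_i^{-1} g_j) = T(j - i)
G = np.block([[Tfun(j - i) for j in range(m)] for i in range(m)])

assert np.allclose(G, G.conj().T)                 # Hermitian
assert np.linalg.eigvalsh(G).min() > -1e-10       # positive semidefinite
```

Replacing T by a matrix with norm greater than one makes the smallest eigenvalue of G go negative, so the contraction hypothesis is essential.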
10. SEMIGROUPS OF OPERATORS

Semigroups of operators arise in the solution of differential equations with operator coefficients in the same way as the exponential of a matrix is used to solve first order constant linear systems of differential equations. Thus semigroups are operator solutions of the functional equation of the exponential function. Specifically, a one-parameter semigroup of operators in a Banach space X is a family {T(t) | t ≥ 0} of bounded operators on X which satisfies

(a) T(0) = I
(b) T(t + s) = T(t)T(s) for all t, s ≥ 0
Our first object in the study of semigroups is to find an exponential representation for the semigroup. This is possible if some continuity assumptions are made. It turns out that the assumption of continuity of the semigroup in the operator norm is much too restrictive and would exclude most of the applications. Strong continuity is sufficient for the development of the exponential representation. We say that the semigroup {T(t)} is strongly continuous if for each x ∈ X the X-valued function T(t)x is norm continuous on [0, ∞). Given the strongly continuous semigroup {T(t)} we define the infinitesimal generator of the semigroup to be the operator

Ax = lim_{t→0+} (T(t)x − x)/t    (10-1)

where D_A, the domain of definition of A, is the set of vectors for which the limit (10-1) exists.
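For a bounded generator everything can be computed explicitly. The following sketch (the matrix is an arbitrary choice, not from the text) illustrates the semigroup laws and the limit (10-1):

```python
import numpy as np

A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])                  # a bounded generator

def T(t):                                    # T(t) = exp(tA) via diagonalization
    w, V = np.linalg.eig(t * A)
    return (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

s, t = 0.3, 0.7
assert np.allclose(T(0.0), np.eye(2))        # property (a): T(0) = I
assert np.allclose(T(s + t), T(s) @ T(t))    # property (b): the semigroup law

x = np.array([1.0, 1.0])
h = 1e-6
assert np.allclose((T(h) @ x - x) / h, A @ x, atol=1e-4)   # the limit (10-1)
```

Here D_A is all of X since A is bounded; the interesting unbounded case (e.g. differentiation, treated below) is exactly where D_A becomes a proper dense subspace.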
Lemma 10-1

(a) The set D_A is a linear manifold and A is linear on D_A.
(b) If x ∈ D_A then T(t)x has a strong derivative given by

(d/dt) T(t)x = AT(t)x = T(t)Ax    (10-2)

(c) For x in D_A

T(t)x − x = ∫₀ᵗ T(τ)Ax dτ = ∫₀ᵗ AT(τ)x dτ
PROOF (a) is obvious from the definition. Let x ∈ D_A, t ≥ 0 and τ > 0; then

(T(t + τ)x − T(t)x)/τ = T(t)(T(τ)x − x)/τ = (T(τ)T(t)x − T(t)x)/τ

Since

lim_{τ→0} T(t)(T(τ)x − x)/τ = T(t)Ax

it follows that

lim_{τ→0} (T(τ)T(t)x − T(t)x)/τ = T(t)Ax

Thus T(t)x ∈ D_A and by the definition of the infinitesimal generator (10-2) follows. Part (c) follows by integrating (10-2).

Part (c) of the lemma can be strengthened as follows.

Lemma 10-2 For all x in X we have

T(t)x − x = A ∫₀ᵗ T(τ)x dτ
PROOF

A ∫₀ᵗ T(τ)x dτ = lim_{h→0} [(T(h) − I)/h] ∫₀ᵗ T(τ)x dτ
  = lim_{h→0} (1/h) ∫₀ᵗ [T(h + τ)x − T(τ)x] dτ
  = lim_{h→0} (1/h) ∫_t^{t+h} T(τ)x dτ − lim_{h→0} (1/h) ∫₀^h T(τ)x dτ
  = T(t)x − x

by the strong continuity of the semigroup.

Corollary 10-3 The linear manifold D_A is dense in X and A is a closed operator.

PROOF For each x ∈ X the vector (1/t) ∫₀ᵗ T(τ)x dτ is in D_A, and x = lim_{t→0} (1/t) ∫₀ᵗ T(τ)x dτ, which proves density. To show that A is closed, let x_n ∈ D_A with x_n → x and Ax_n → y. By integrating (10-2) we have T(t)x_n − x_n = ∫₀ᵗ T(τ)Ax_n dτ, and passing to the limit, T(t)x − x = ∫₀ᵗ T(τ)y dτ. So (T(t)x − x)/t = (1/t) ∫₀ᵗ T(τ)y dτ, and letting t → 0 we obtain x ∈ D_A and Ax = y, which proves the closedness of A.
From now on we will assume that the semigroup is a strongly continuous semigroup of contractions, that is, that ‖T(t)‖ ≤ 1 for t ≥ 0. This assumption gives us some information concerning the spectrum of the infinitesimal generator.

Theorem 10-4 Let {T(t)} be a strongly continuous semigroup of contractions; then ρ(A), the resolvent set of A, includes the open right half plane. Moreover for λ > 0 we have

R(λ, A) = ∫₀^∞ e^{−λt} T(t) dt    (10-3)
PROOF Consider the semigroup {e^{−λt}T(t)}, having the infinitesimal generator A − λI. Applying Lemma 10-2 and Lemma 10-1 we have

x − e^{−λt}T(t)x = (λI − A) ∫₀ᵗ e^{−λτ}T(τ)x dτ    for all x ∈ X    (10-4)

and

x − e^{−λt}T(t)x = ∫₀ᵗ e^{−λτ}T(τ)(λI − A)x dτ    for x ∈ D_A    (10-5)

Now the operator R(λ) defined by the norm convergent integral

R(λ) = ∫₀^∞ e^{−λt}T(t) dt

is clearly a well-defined bounded operator. Passing to the limit in (10-4) and (10-5) we obtain

x = (λI − A)R(λ)x    for all x ∈ X

and

x = R(λ)(λI − A)x    for x ∈ D_A

Thus every λ > 0 is in ρ(A) and R(λ) = R(λ, A); the same computation applies to any complex λ with Re λ > 0, which gives the full open right half plane.
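Formula (10-3) can be checked by quadrature for a matrix generator of a contraction semigroup; the matrix, the value of λ, and the truncated integration grid below are illustrative choices.

```python
import numpy as np

A = np.array([[-2.0, 1.0],
              [0.0, -1.0]])                  # A + A* < 0, so T(t) = exp(tA) is contractive
lam = 0.5

w, V = np.linalg.eig(A)
Vi = np.linalg.inv(V)
def T(t):                                    # T(t) = exp(tA) via diagonalization
    return (V @ np.diag(np.exp(t * w)) @ Vi).real

# R(lam, A) = integral_0^infty e^{-lam t} T(t) dt, truncated at t = 60 (tail ~ e^{-30})
ts = np.linspace(0.0, 60.0, 6001)
dt = ts[1] - ts[0]
ys = np.array([np.exp(-lam * t) * T(t) for t in ts])
R = (ys[0] + ys[-1]) / 2 * dt + ys[1:-1].sum(axis=0) * dt   # trapezoid rule

assert np.allclose(R, np.linalg.inv(lam * np.eye(2) - A), atol=1e-3)
assert np.linalg.norm(R, 2) <= 1.0 / lam     # the contraction bound, cf. (10-6) below
```

The second assertion anticipates the Hille-Yosida bound: the Laplace-transform representation immediately gives ‖R(λ, A)‖ ≤ 1/λ.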
A strongly continuous semigroup determines uniquely its infinitesimal generator. The converse is also true and is given by the following lemma.
Lemma 10-5 A strongly continuous semigroup is uniquely determined by its infinitesimal generator.

PROOF Assume {T₁(t)} and {T₂(t)} are two strongly continuous semigroups having the same infinitesimal generator A. Then for all x ∈ D_A we have also T_i(t)x ∈ D_A, i = 1, 2. On differentiating T₂(t − τ)T₁(τ)x with x ∈ D_A we obtain

(d/dτ) T₂(t − τ)T₁(τ)x = T₂(t − τ)AT₁(τ)x − AT₂(t − τ)T₁(τ)x
  = T₂(t − τ)AT₁(τ)x − T₂(t − τ)AT₁(τ)x = 0

Thus T₂(t − τ)T₁(τ)x is a constant vector. Evaluating it at τ = 0 and τ = t we have T₂(t)x = T₁(t)x for all t ≥ 0 and x ∈ D_A. Since D_A is dense and both operators are bounded, this equality holds for all x ∈ X, that is, the semigroups are equal.
Our next object is the characterization of infinitesimal generators of contractive semigroups, that is, of strongly continuous semigroups satisfying ‖T(t)‖ ≤ 1.
Moreover we want a procedure which will enable us to reconstruct the semigroup, given its infinitesimal generator. This is the content of the Hille-Yosida theorem.
Theorem 10-6 (Hille-Yosida) A closed densely defined operator A is the infinitesimal generator of a contractive semigroup if and only if

‖R(λ, A)‖ ≤ λ^{−1}    for all λ > 0    (10-6)
PROOF Assume {T(t)} is a contractive semigroup. By (10-3) we have R(λ, A) = ∫₀^∞ e^{−λt}T(t) dt, and hence for λ > 0

‖R(λ, A)‖ = ‖∫₀^∞ e^{−λt}T(t) dt‖ ≤ ∫₀^∞ e^{−λt}‖T(t)‖ dt ≤ ∫₀^∞ e^{−λt} dt = λ^{−1}
To prove the converse we construct a family of approximating semigroups having bounded infinitesimal generators. Let A_λ = λ²R(λ, A) − λI and define T_λ(t) by

T_λ(t) = e^{tA_λ}    (10-7)

T_λ(t) is well defined, as e^z is an entire function and so T_λ(t) = Σ_{n=0}^∞ tⁿA_λⁿ/n!. From (10-6) we have for x ∈ D_A

‖λR(λ, A)x − x‖ = ‖R(λ, A)Ax‖ ≤ λ^{−1}‖Ax‖

and hence

lim_{λ→∞} λR(λ, A)x = x    for x ∈ D_A    (10-8)

Since D_A is dense and the set of operators λR(λ, A) is uniformly bounded, (10-8) holds for all x ∈ X. In particular, since

λR(λ, A)Ax = λ[R(λ, A)(A − λI)x + λR(λ, A)x] = λ²R(λ, A)x − λx = A_λx

we have for all x ∈ D_A that

lim_{λ→∞} A_λx = Ax    (10-9)
The approximating semigroups {T_λ(t)} are also contractive, as is seen from the following estimate:

‖T_λ(t)‖ = ‖e^{−λt} e^{λ²R(λ,A)t}‖ ≤ e^{−λt} Σ_{n=0}^∞ (λ²t)ⁿ‖R(λ, A)‖ⁿ/n! ≤ e^{−λt} Σ_{n=0}^∞ (λt)ⁿ/n! = e^{−λt}e^{λt} = 1
Since T_λ(t) and T_μ(t) commute for all λ, μ > 0 we have

T_λ(t)x − T_μ(t)x = ∫₀ᵗ (d/dτ)[T_μ(t − τ)T_λ(τ)x] dτ = ∫₀ᵗ T_μ(t − τ)T_λ(τ)(A_λ − A_μ)x dτ

from which the estimate

‖T_λ(t)x − T_μ(t)x‖ ≤ t‖A_λx − A_μx‖

follows. Thus from (10-9) it follows that lim_{λ→∞} T_λ(t)x exists for all x ∈ D_A, uniformly on compact subsets of [0, ∞). Again the uniform boundedness of the T_λ(t) implies the existence of the limit for all x ∈ X. Define now T(t) by

T(t)x = lim_{λ→∞} T_λ(t)x    (10-10)

for t ≥ 0. We show first that {T(t)} is a strongly continuous semigroup.
T(t + s)x = lim_{λ→∞} T_λ(t + s)x = lim_{λ→∞} T_λ(t)T_λ(s)x = T(t)T(s)x

So T(t) is a semigroup. By the inequality ‖T(t)x‖ = lim ‖T_λ(t)x‖ ≤ ‖x‖ the semigroup is contractive. Finally, as

‖x − T(t)x‖ ≤ ‖x − T_λ(t)x‖ + ‖T_λ(t)x − T(t)x‖

and since T_λ(t)x converges to T(t)x uniformly on compact subsets, the strong continuity of T(t) follows from that of T_λ(t).

To conclude we show that A is the infinitesimal generator of the semigroup {T(t)}. Let B be the infinitesimal generator of {T(t)}. For x ∈ D_A we have
T_λ(h)x − x = ∫₀^h T_λ(τ)A_λx dτ

and hence also

(T_λ(h)x − x)/h = (1/h) ∫₀^h T_λ(τ)A_λx dτ

Letting λ → ∞ we obtain

(T(h)x − x)/h = (1/h) ∫₀^h T(τ)Ax dτ

But the last equality implies, on letting h → 0, that x ∈ D_B and Bx = Ax, which is equivalent to A ⊂ B. Now for λ > 0, λI − A is invertible while λI − B is injective (by Theorem 10-4 applied to {T(t)}), so the inclusion A ⊂ B forces B = A, which completes the proof.
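The Yosida approximation at the heart of this proof can be watched converging for a matrix generator; the dissipative matrix A and the vector x below are arbitrary choices.

```python
import numpy as np

A = np.array([[-1.0, 1.0],
              [-1.0, -2.0]])                 # A + A* < 0: generator of a contraction semigroup
x = np.array([1.0, -1.0])
I2 = np.eye(2)

def R(lam):                                  # resolvent R(lam, A) = (lam I - A)^{-1}
    return np.linalg.inv(lam * I2 - A)

# the Hille-Yosida bound (10-6)
for lam in (1.0, 10.0, 100.0):
    assert np.linalg.norm(R(lam), 2) <= 1.0 / lam + 1e-12

# A_lam = lam^2 R(lam, A) - lam I converges to A, cf. (10-9); error decays like 1/lam
errs = [np.linalg.norm((lam**2 * R(lam) - lam * I2) @ x - A @ x)
        for lam in (10.0, 100.0, 1000.0)]
assert errs[0] > errs[1] > errs[2] and errs[2] < 1e-2
```

Since A_λx − Ax = R(λ, A)A²x for x in the domain, the error indeed shrinks like ‖A²x‖/λ, which is what the decreasing sequence of errors exhibits.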
Applying Theorem 3-10, which characterizes the maximal accretive operators, we can restate the Hille-Yosida theorem as follows.

Theorem 10-7 A closed densely defined operator is the infinitesimal generator of a strongly continuous semigroup of contractions if and only if it is maximal accretive.

Given a strongly continuous semigroup {T(t)}, now specifically assumed to act in a Hilbert space H, we define the adjoint semigroup to be the family of adjoints {T(t)* | t ≥ 0}. It is clear that {T(t)*} is also a semigroup. Since lim_{t→0} (T(t)*x, y) = lim_{t→0} (x, T(t)y) = (x, y) for all x, y ∈ H, it follows that

0 ≤ ‖T(t)*x − x‖² = ‖T(t)*x‖² − (T(t)*x, x) − (x, T(t)*x) + (x, x) ≤ 2‖x‖² − (T(t)*x, x) − (x, T(t)*x) → 0

as t → 0. Thus necessarily lim_{t→0} T(t)*x = x, which shows that the adjoint semigroup is also strongly continuous. The expected relation between the infinitesimal generators of adjoint semigroups holds and is given by the following theorem.
Theorem 10-8 The infinitesimal generator of the adjoint semigroup is the adjoint of the infinitesimal generator of the original semigroup.

PROOF Let A and B be the infinitesimal generators of the semigroups {T(t)} and {T(t)*}, respectively, and let D_A and D_B be their respective domains of definition. Choose x ∈ D_A and y ∈ D_B; then

(Ax, y) = lim_{t→0} ((T(t)x − x)/t, y) = lim_{t→0} (x, (T(t)*y − y)/t) = (x, By)

So B ⊂ A*. Conversely, if x ∈ D_A and y ∈ D_{A*},

(T(t)x − x, y) = ∫₀ᵗ (AT(τ)x, y) dτ = ∫₀ᵗ (x, T(τ)*A*y) dτ

which implies T(t)*y − y = ∫₀ᵗ T(τ)*A*y dτ. Dividing by t > 0 and letting t tend to zero we have By = A*y, so A* ⊂ B. Thus B = A* and the proof is complete.

If A is the infinitesimal generator of a contractive semigroup then, by Theorem 3-9, its Cayley transform T defined by

T = (A + I)(A − I)^{−1}    (10-11)
is a contraction. We call this contraction the infinitesimal cogenerator of the semigroup. Since the Cayley transform is invertible there is a bijective correspondence between a semigroup and its cogenerator. The cogenerator reflects in a more faithful way the properties of the semigroup. The following theorems, which characterize the infinitesimal cogenerators of isometric and unitary semigroups, are instances of this relation. We precede these theorems by showing that a semigroup and its cogenerator have the same invariant subspaces.

Theorem 10-9 Let {T(t)} be a contractive semigroup and T its infinitesimal cogenerator. A subspace M is invariant under the semigroup if and only if it is invariant under T.

PROOF Assume M is a subspace invariant under the semigroup. Since T = I + 2(A − I)^{−1} and, by Theorem 10-4, (I − A)^{−1}x = ∫₀^∞ e^{−t}T(t)x dt, it is clear that M is also invariant under T.
Conversely, assume M is invariant under T. For λ > 0 let A_λ be defined by

A_λ = λ²R(λ, A) − λI    (10-12)

Then for M to be invariant under A_λ it suffices to show that it is invariant under R(λ, A). Now

R(λ, A) = [λ − (T + I)(T − I)^{−1}]^{−1} = (T − I)[λ(T − I) − (T + I)]^{−1}
        = (T − I)(λ + 1)^{−1}[((λ − 1)/(λ + 1))T − I]^{−1}

For λ > 0 we have (λ − 1)/(λ + 1) < 1, and hence {[(λ − 1)/(λ + 1)]T − I}^{−1} can be expanded into a norm convergent geometric series in powers of T. This shows that M is invariant under A_λ and hence also under the semigroups {T_λ(t)} = {e^{tA_λ}} used in the proof of the Hille-Yosida theorem. Finally, since T(t)x = lim_{λ→∞} T_λ(t)x, the result follows.
Theorem 10-10 The following statements are equivalent:

(a) {T(t)} is a strongly continuous semigroup of isometries.
(b) The infinitesimal generator A satisfies A ⊂ −A*, or equivalently iA ⊂ (iA)*.
(c) Re(Ax, x) = 0 for all x ∈ D_A.
(d) The infinitesimal cogenerator T of the semigroup is isometric and 1 ∉ σ_p(T).
PROOF Assume {T(t)} is isometric. Thus ‖T(t)x‖² = ‖x‖². Differentiating this equality, assuming x ∈ D_A, we have

(AT(t)x, T(t)x) + (T(t)x, AT(t)x) = 0

In particular (Ax, x) + (x, Ax) = 2 Re(Ax, x) = 0, which also implies A ⊂ −A*. Thus (b) and (c) follow. The equivalence of (b) and (c) is obvious. Assume now (c) holds; then

‖(A + I)x‖² = ‖x‖² + ‖Ax‖² = ‖(A − I)x‖²
Since Range(A − I) = H we have, for all y ∈ H,

‖Ty‖² = ‖(A + I)(A − I)^{−1}y‖² = ‖y‖²

so T is isometric; moreover T − I = 2(A − I)^{−1} is injective, so 1 ∉ σ_p(T). Thus (c) implies (d). If T is isometric and 1 ∉ σ_p(T) then by Theorem 3-9, A = (T + I)(T − I)^{−1} is maximal accretive. Now A + I = 2T(T − I)^{−1} and A − I = 2(T − I)^{−1}, so for all x in D_A = Range(T − I) we have

‖(A + I)x‖ = ‖2T(T − I)^{−1}x‖ = ‖2(T − I)^{−1}x‖ = ‖(A − I)x‖

But the equality ‖(A + I)x‖ = ‖(A − I)x‖ is equivalent to Re(Ax, x) = 0, and hence (d) implies (b) and (c). Finally, since (d/dt)‖T(t)x‖² = 2 Re(AT(t)x, T(t)x) = 0 for all x ∈ D_A, the function ‖T(t)x‖ is constant, and hence necessarily ‖T(t)x‖ = ‖x‖ for all t ≥ 0 and x ∈ D_A. By continuity this holds for all x, and the implication of (a) by (d) is proved.
If {T(t)} is a semigroup of unitary operators then it can be extended to a unitary group {U(t) | t ∈ ℝ} by letting

U(t) = T(t) for t ≥ 0,  U(t) = T(−t)* = T(−t)^{−1} for t < 0    (10-13)
Theorem 10-11 (Stone) The following statements are equivalent:

(a) U(t) is a strongly continuous group of unitary operators.
(b) The infinitesimal generator of U(t) is skew self-adjoint.
(c) The cogenerator U of the group is unitary and 1 ∉ σ_p(U).

PROOF Assume U(t) is a group of unitary operators; then so is U₁(t) = U(t)*. The infinitesimal generator of the adjoint group U₁(t) is A*. Applying the previous theorem we have A ⊂ −A* and A* ⊂ −A**. However, since A is closed and densely defined we have, by Corollary 3-2, that A** = A, and hence A* = −A; the equivalence of (a) and (b) follows. If U(t) is unitary then both U = (A + I)(A − I)^{−1} and U₁ = (A* + I)(A* − I)^{−1} are isometries. However, on the dense set D_{A*}, both U* and U₁ agree with (A* − I)^{−1}(A* + I), so U* = U₁ by continuity, and U is unitary. Conversely, if U is unitary and 1 ∉ σ_p(U) then U and U* are the infinitesimal cogenerators of the unitary semigroups T(t) and T(t)*. Define U(t) by (10-13); it is easily checked that U(t) is a unitary group.

If A is the infinitesimal generator of the strongly continuous group of unitary operators {U(t)} then B = −iA is a self-adjoint operator. Thus there exists, by the spectral theorem, a spectral measure E on ℝ such that for all x ∈ D_B

Bx = ∫_{−∞}^{∞} λ E(dλ) x
This implies that the group {U(t)} has the integral representation

U(t)x = ∫_{−∞}^{∞} e^{itλ} E(dλ) x    (10-14)

and conversely, for each spectral measure E on ℝ, the group U(t) defined by (10-14) is a strongly continuous unitary group.
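Stone's theorem is transparent in finite dimensions: with H self-adjoint and A = iH, both the group e^{tA} = e^{itH} and the Cayley transform of A are unitary. A small sketch, with H a random self-adjoint matrix (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
H = rng.standard_normal((n, n))
H = (H + H.T) / 2                            # self-adjoint
A = 1j * H                                   # skew self-adjoint: A* = -A
I = np.eye(n)

# the cogenerator U = (A + I)(A - I)^{-1} is unitary (Cayley transform)
U = (A + I) @ np.linalg.inv(A - I)
assert np.allclose(U.conj().T @ U, I)

# the group (10-14): U(t) = exp(itH) via the spectral decomposition of H = B
w, Q = np.linalg.eigh(H)
def Ugrp(t):
    return Q @ np.diag(np.exp(1j * t * w)) @ Q.conj().T

t = 1.7
assert np.allclose(Ugrp(t).conj().T @ Ugrp(t), I)     # each U(t) is unitary
assert np.allclose(Ugrp(t) @ Ugrp(-t), I)             # group property: U(t)^{-1} = U(-t)
```

The diagonalization of H plays the role of the spectral measure E: the finite sum over eigenprojections is the discrete analogue of the integral in (10-14).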
To give important examples of isometric and unitary semigroups we consider the Hilbert space L²(−∞, ∞) of all Lebesgue square integrable functions on ℝ with the norm given by

‖f‖² = ∫_{−∞}^{∞} |f(x)|² dx    (10-15)

In L²(−∞, ∞) we define the group {U(t)} by

(U(t)f)(x) = f(x − t)    (10-16)
Obviously {U(t)} is a unitary group with U(t)^{−1} = U(t)* = U(−t). To see that {U(t)} is strongly continuous we note that for any continuous function f of compact support the L² norm of f − U(t)f can be made arbitrarily small, as f is actually uniformly continuous. Since these functions are dense in L²(−∞, ∞) and the norms of U(t) are uniformly bounded, the strong continuity follows. We call {U(t) | t ∈ ℝ} the translation group, with {U(t) | t ≥ 0} the right translation semigroup and {U(t)* | t ≥ 0} = {U(−t) | t ≥ 0} the left translation semigroup.

The subspace L²(0, ∞), considered as a subspace of L²(−∞, ∞), is clearly invariant under the right translation semigroup. We define {V(t) | t ≥ 0} to be the restriction of the right translation semigroup to L²(0, ∞). Thus we have

(V(t)f)(x) = f(x − t) for x ≥ t,  (V(t)f)(x) = 0 for 0 ≤ x < t    (10-17)

and {V(t)} is a strongly continuous semigroup of isometries, the semigroup of right translations in L²(0, ∞). The adjoint semigroup is given by

(V(t)*f)(x) = f(x + t)    (10-18)

which defines the semigroup of left translations. We have

‖V(t)*f‖² = ∫₀^∞ |f(x + t)|² dx = ∫_t^∞ |f(x)|² dx

so lim_{t→∞} V(t)*f = 0 for all f ∈ L²(0, ∞).
We determine next the infinitesimal generators of the various semigroups. To begin, let A be the infinitesimal generator of the group {U(t)}. Let f ∈ D_A; then (f(x − t) − f(x))/t converges, in L²(−∞, ∞), to a function g ∈ L²(−∞, ∞). Integrating against the characteristic function of the interval (a, b) we have

lim_{t→0} ∫_a^b (f(x − t) − f(x))/t dx = ∫_a^b g(x) dx

By a change of variable we have

∫_a^b (f(x − t) − f(x))/t dx = (1/t) ∫_{a−t}^{a} f(x) dx − (1/t) ∫_{b−t}^{b} f(x) dx

The function f is locally in L¹, and therefore for almost every a

lim_{t→0} (1/t) ∫_{a−t}^{a} f(x) dx = f(a)

and equally for the other integral. So for almost all a and b we have

f(a) − f(b) = ∫_a^b g(x) dx    (10-19)

We redefine f, on a set of measure zero, so that (10-19) holds for all a and b. But (10-19) means that f is absolutely continuous and almost everywhere f'(x) = −g(x). Summarizing, if A is the infinitesimal generator of the right translation semigroup in L²(−∞, ∞) then

D_A = {f ∈ L²(−∞, ∞) | f absolutely continuous and f' ∈ L²(−∞, ∞)}    (10-20)

and

Af = −f'    (10-21)

In exactly the same way, if B is the infinitesimal generator of the left translation semigroup in L²(−∞, ∞) then D_B = D_A and

Bf = f'    (10-22)
Let U be the infinitesimal cogenerator of the right translation semigroup. Then U* is the cogenerator of the left translation semigroup. To obtain explicit formulas for U and U* we note that U = (A + I)(A − I)^{−1} = I + 2(A − I)^{−1}. Let Uf = g; then f + 2(A − I)^{−1}f = g. If (A − I)^{−1}f = h then f = (A − I)h = −h' − h. So h is the solution of the differential equation

h' + h = −f    (10-23)

that lies in L²(−∞, ∞). The general solution of (10-23) is e^{−t}{c − ∫₀ᵗ e^{τ}f(τ) dτ}, and for that to be in L²(−∞, ∞) we have to have c = −∫_{−∞}^{0} e^{τ}f(τ) dτ. This yields

(Uf)(t) = f(t) − 2e^{−t} ∫_{−∞}^{t} e^{τ}f(τ) dτ    (10-24)
By a similar computation for U* we obtain

(U*f)(t) = f(t) − 2e^{t} ∫_{t}^{∞} e^{−τ}f(τ) dτ    (10-25)

It can be seen directly from (10-24) and (10-25) that L²(0, ∞) is invariant under U and L²(−∞, 0) is invariant under U*. This is of course evident as a consequence of Theorem 10-9.
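Formula (10-24) can be tested numerically: U, being the cogenerator of a unitary group, must preserve the L² norm. The discretization below (grid, test function, tolerance) is an ad hoc choice.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
f = np.exp(-x**2)                            # test function, effectively supported on the grid

# cumulative integral I(t) = integral_{-inf}^t e^tau f(tau) dtau (trapezoid rule)
g = np.exp(x) * f
I = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) / 2.0) * dx))

Uf = f - 2.0 * np.exp(-x) * I                # the cogenerator formula (10-24)

def norm2(h):                                # squared L^2 norm on the grid
    return np.sum((np.abs(h[1:])**2 + np.abs(h[:-1])**2) / 2.0) * dx

assert abs(norm2(Uf) - norm2(f)) < 1e-3      # U is isometric (in fact unitary)
```

Applying the same formula to an f supported in (0, ∞) leaves Uf supported there as well, which is the invariance of L²(0, ∞) noted above.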
The semigroups of left and right translation in L²(−∞, ∞) are clearly unitarily equivalent. This follows easily from the spectral analysis below, but can be seen directly by observing that if we define a map J: L²(−∞, ∞) → L²(−∞, ∞) by

(Jf)(x) = −f(−x)    (10-26)

then J is unitary, J* = J^{−1} = J, and

U(t)J = JU(−t) = JU(t)*    (10-27)

is satisfied. It is easy to check that relation (10-27) holds also for the cogenerators, that is, we have

UJ = JU*    (10-28)
By Stone's theorem both A and B are skew self-adjoint, and from Theorem 10-8 we have B = A* = −A, so A = −B = iH for some self-adjoint operator H. As a consequence the spectra of A and B lie on the imaginary axis. To determine the spectrum of A and B we note first that the point spectrum of A is empty. Indeed the only solutions of the equation Af = λf are of the form f(x) = ce^{−λx}, and none of these is in L²(−∞, ∞). To characterize the spectrum the simplest approach is to determine the spectral representation of the unitary group {U(t)}. This is done through the introduction of the Fourier transformation in L²(−∞, ∞).
We define the Fourier transform initially in L¹(−∞, ∞), the space of all Lebesgue integrable functions on the real line, the norm being given by ‖f‖₁ = ∫|f(x)| dx. We observe that L¹(−∞, ∞) is a Banach space, and moreover a commutative Banach algebra under convolution multiplication

(f * g)(x) = ∫_{−∞}^{∞} f(x − t)g(t) dt    for f, g ∈ L¹(−∞, ∞)    (10-29)
The Fourier transform ℱ is defined by ℱf = f̂ with

f̂(λ) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{iλx} dx    (10-30)

If f, g ∈ L¹(−∞, ∞) and h = f * g then we have ĥ(λ) = √(2π) f̂(λ)ĝ(λ), or equivalently

ℱ(f * g) = √(2π) ℱ(f)ℱ(g)    (10-31)

Under suitable smoothness and growth conditions the Fourier transform can be inverted.
OPERATORS IN HILBERT SPACE
151
To obtain an inversion formula for the Fourier transform we use the idea of a summability kernel, introduced in Sec. 4, with the only difference that the domain of definition is taken to be the real line. Thus a family of functions {k_λ(x)} defined on ℝ is a summability kernel if k_λ(x) ≥ 0, ∫_{-∞}^∞ k_λ(x) dx = 1 and lim_{λ→∞} ∫_{|x|≥δ} k_λ(x) dx = 0 for all δ > 0. If k ∈ L¹(-∞, ∞) and k(x) ≥ 0 with ∫_{-∞}^∞ k(x) dx = 1, then k_λ(x) = λk(λx) is easily checked to be a summability kernel. The important property of summability kernels is that for all f ∈ L¹(-∞, ∞) we have lim_{λ→∞} ‖f - k_λ * f‖₁ = 0.

Let us pick now

    k(x) = (1/2π) (sin(x/2) / (x/2))² = (1/2π) ∫_{-1}^{1} (1 - |ξ|) e^{-iξx} dξ

which is the Fejér kernel. That ∫_{-∞}^∞ k(x) dx = 1 can be proved by contour integration and the residue theorem, whereas the representation of k(x) as a Fourier transform follows by integration by parts of (1/2π) ∫_{-1}^{1} (1 - |ξ|) e^{-iξx} dξ. Since k_λ(x) = (1/2π) ∫_{-λ}^{λ} (1 - |ξ|/λ) e^{-iξx} dξ we have

    (k_λ * f)(x) = (1/√(2π)) ∫_{-λ}^{λ} (1 - |ξ|/λ) f̂(ξ) e^{-iξx} dξ

and hence

    f(x) = lim_{λ→∞} (1/√(2π)) ∫_{-λ}^{λ} (1 - |ξ|/λ) f̂(ξ) e^{-iξx} dξ        (10-32)

the limit existing in the L¹(-∞, ∞) norm. In particular we have the uniqueness of the Fourier transform of L¹(-∞, ∞) functions: for assume f̂ = 0, then (10-32) implies f = 0. We will extend now the definition of the Fourier transform to L²(-∞, ∞).
Let f be continuously differentiable and of compact support. This guarantees that ∫_{-∞}^∞ |f̂(λ)|² dλ < ∞. Define h(x) = \overline{f(-x)}; then ĥ(λ) = \overline{f̂(λ)}. Let g = f * h; then

    g(x) = ∫_{-∞}^∞ f(x - t) \overline{f(-t)} dt = ∫_{-∞}^∞ f(x + τ) \overline{f(τ)} dτ        (10-33)

and in particular g(0) = ∫_{-∞}^∞ |f(τ)|² dτ. Passing to the Fourier transform of the convolution g = f * h we have ĝ(λ) = |f̂(λ)|². From (10-32) it follows, putting x = 0, that lim_{λ→∞} ∫_{-λ}^{λ} (1 - |ξ|/λ) |f̂(ξ)|² dξ = g(0) = ∫_{-∞}^∞ |f(x)|² dx. As the integrand on the left is nonnegative the limit exists and is equal to ∫_{-∞}^∞ |f̂(λ)|² dλ, which proves that ‖f̂‖₂ = ‖f‖₂ for f continuously differentiable and of compact support. This is a set that is obviously dense in L²(-∞, ∞). For these f we define 𝓕 by (10-30); then 𝓕 is an isometry defined on a dense subset of L²(-∞, ∞) into L²(-∞, ∞) and hence has a unique extension by continuity to an isometry defined on all of L²(-∞, ∞). Since every twice differentiable function of compact support is a Fourier transform of a
continuous integrable function, it follows that the range of 𝓕 is dense in L²(-∞, ∞) and, 𝓕 being an isometry, is actually all of L²(-∞, ∞). Summarizing, we have obtained the following.
Theorem 10-12 The operator 𝓕 defined on L¹(-∞, ∞) ∩ L²(-∞, ∞) by

    (𝓕f)(λ) = f̂(λ) = (1/√(2π)) ∫_{-∞}^∞ f(x) e^{iλx} dx           (10-35)

has a unique extension to a unitary operator of L²(-∞, ∞) onto itself. Thus for f ∈ L²(-∞, ∞) we have f̂ ∈ L²(-∞, ∞) and

    ∫_{-∞}^∞ |f̂(λ)|² dλ = ∫_{-∞}^∞ |f(x)|² dx                     (10-36)
We call the extended operator 𝓕 the Fourier-Plancherel transform and (10-36) the Plancherel identity. We record now for later use some of the basic properties of the Fourier-Plancherel transform.

Theorem 10-13 Let f ∈ L²(-∞, ∞). Then:

(a) If t^n f(t) ∈ L²(-∞, ∞) then f̂ is n times differentiable and

    (1/√(2π)) ∫_{-∞}^∞ t^n f(t) e^{iωt} dt = (-i)^n (d^n/dω^n) (1/√(2π)) ∫_{-∞}^∞ f(t) e^{iωt} dt = (-i)^n f̂^{(n)}(ω)        (10-37)

(b) If f^{(n)} ∈ L²(-∞, ∞) then f^{(j)} ∈ L²(-∞, ∞) for 0 ≤ j ≤ n and

    (1/√(2π)) ∫_{-∞}^∞ f^{(n)}(t) e^{iωt} dt = (-iω)^n f̂(ω)       (10-38)

If f is in L²(0, ∞) we need the extra assumption f^{(j)}(0) = 0 for 0 ≤ j ≤ n - 1.

(c) If e^{at} f(t) ∈ L²(-∞, ∞) then

    (1/√(2π)) ∫_{-∞}^∞ e^{at} f(t) e^{iωt} dt = f̂(ω - ia)         (10-39)
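Property (b) can be checked by direct numerical quadrature: for a Gaussian, whose transform and derivative are explicit, the computed transform of f′ matches -iω times the transform of f. A small sketch (the grid and the sample frequencies are arbitrary choices):

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 600001)   # f decays fast; truncation error is negligible
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)                  # f(t) = e^{-t^2/2}
fp = -x * np.exp(-x**2 / 2)            # f'(t)

def ft(g, w):
    # (1/sqrt(2 pi)) int g(t) e^{i w t} dt, by a Riemann sum
    return np.sum(g * np.exp(1j * w * x)) * dx / np.sqrt(2 * np.pi)

for w in (0.7, 2.0):
    # (10-38) with n = 1: the transform of f' is (-i w) fhat(w)
    assert abs(ft(fp, w) - (-1j * w) * ft(f, w)) < 1e-8
    # for this Gaussian, fhat(w) = e^{-w^2/2}
    assert abs(ft(f, w) - np.exp(-w**2 / 2)) < 1e-8
```

For a smooth rapidly decaying integrand the Riemann sum is accurate far beyond the tolerance used here.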
We return now to the study of the infinitesimal generator A of the right translation semigroup. We introduce the self-adjoint operator H given by A = iH; then on D_H = D_A given by (10-20), H is defined by

    Hf = if′                                                       (10-40)
As f and f′ are in L²(-∞, ∞), by integration by parts we have

    (1/√(2π)) ∫_{-∞}^∞ e^{iλt} f′(t) dt = -iλ (1/√(2π)) ∫_{-∞}^∞ e^{iλt} f(t) dt        (10-41)

which can be rewritten as

    𝓕(Hf) = H′𝓕(f)                                                 (10-42)

where H′ is the self-adjoint operator in L²(-∞, ∞) defined by

    D_{H′} = {f | f and Xf ∈ L²(-∞, ∞)}                            (10-43)

where X is the identity function X(λ) = λ and

    (H′f)(λ) = λf(λ)                                               (10-44)
Thus the operators H and H′ are unitarily equivalent. Actually by the Fourier-Plancherel transform we have obtained a spectral representation for the operator of differentiation. The spectral measure is given by

    (E(σ)f)(λ) = χ_σ(λ) f(λ)                                       (10-45)

where χ_σ is the characteristic function of the Borel set σ. The group {U(t)} has the representation

    (𝓕U(t)f)(λ) = e^{iλt} (𝓕f)(λ)                                  (10-46)

which holds for all t ∈ ℝ. From (10-44) we see that σ(H) = σ(H′) = ℝ and hence σ(A) is the whole imaginary axis.

The situation changes drastically when we consider the left translation semigroup in L²(0, ∞). In this case the infinitesimal generator B has domain

    D_B = {f ∈ L²(0, ∞) | f absolutely continuous and f′ ∈ L²(0, ∞)}        (10-47)

and

    Bf = f′                                                        (10-48)
If we look for solutions of Bf = λf we have to solve the differential equation f′ = λf in L²(0, ∞). The functions which are constant multiples of e^{λt} for Re λ < 0 are in L²(0, ∞) and solve the equation, so σ_p(B) = {λ | Re λ < 0}. Since we have a contractive semigroup, each λ with positive real part is in the resolvent set of B, and σ(B), being closed, is necessarily equal to the closed left half plane. Next we determine the infinitesimal generator of the right translation semigroup in L²(0, ∞). We can proceed directly or compute the adjoint of the previously determined infinitesimal generator of the left translation semigroup. We choose the first course and follow the line we took before. Let A be the infinitesimal generator; then for f ∈ D_A we have
    (Af)(x) = lim_{τ→0} [f(x - τ) - f(x)]/τ = g(x)
the limit existing in L²(0, ∞). Integrating against the characteristic function of [0, t] we have ∫_0^t {[f(x - τ) - f(x)]/τ} dx → ∫_0^t g(x) dx. But

    ∫_0^t [f(x - τ) - f(x)]/τ dx = (1/τ) ∫_0^t f(x - τ) dx - (1/τ) ∫_0^t f(x) dx
                                 = (1/τ) ∫_0^{t-τ} f(x) dx - (1/τ) ∫_0^t f(x) dx = -(1/τ) ∫_{t-τ}^t f(x) dx

For almost all t we have lim_{τ→0} (1/τ) ∫_{t-τ}^t f(x) dx = f(t). Redefining f on a set of measure zero we obtain the representation f(t) = -∫_0^t g(x) dx for g ∈ L²(0, ∞). Thus the domain of definition of A is

    D_A = {f ∈ L²(0, ∞) | f absolutely continuous, f(0) = 0, f′ ∈ L²(0, ∞)}        (10-49)

and

    Af = -f′    for f ∈ D_A                                        (10-50)
Since A is the adjoint of the previously determined B we have immediately that σ(A) = {λ | Re λ ≤ 0}. However, λ is an eigenvalue of A only if f is a constant multiple of e^{-λt} for Re λ > 0, and none of these is in D_A, so the point spectrum is empty.
To compute the infinitesimal cogenerator V of the right translation semigroup in L²(0, ∞) we can proceed directly in a way analogous to the determination of the cogenerator U of the right translation semigroup in L²(-∞, ∞). However, it is simpler to derive a formula for V out of that of U. Indeed we have L²(0, ∞) naturally embedded in L²(-∞, ∞), where if f ∈ L²(0, ∞) we assume f(x) = 0 for x < 0. Then we have V(t) = U(t)|L²(0, ∞) for t ≥ 0. This implies that V = U|L²(0, ∞) and hence from (10-24) we obtain

    (Vf)(t) = f(t) - 2e^{-t} ∫_0^t e^τ f(τ) dτ                     (10-51)

By Theorem 10-10 V is isometric and V*, the cogenerator of the left translation semigroup, coisometric. If P is the orthogonal projection of L²(-∞, ∞) onto L²(0, ∞) then from V = U|L²(0, ∞) it follows that for all f in L²(0, ∞) we have V*f = PU*f. Using (10-25) we obtain the explicit formula

    (V*f)(t) = f(t) - 2e^t ∫_t^∞ e^{-τ} f(τ) dτ                    (10-52)

for all f ∈ L²(0, ∞). Naturally we are considering only nonnegative values of t. The kernel of V* is the set of all f ∈ L²(0, ∞) for which f(t) = 2e^t ∫_t^∞ e^{-τ} f(τ) dτ.
Let us put g(t) = e^{-t} f(t); then f ∈ Ker V* if and only if g(t) = 2 ∫_t^∞ g(τ) dτ. So g is absolutely continuous and g′(t) = -2g(t), which implies g(t) = ce^{-2t}, or f(t) = ce^{-t}. So Ker V* is one dimensional. By induction it is easy to prove that f ∈ Ker V*^n if and only if f(t) = p_{n-1}(t) e^{-t} for some polynomial p_{n-1} of degree n - 1 at most. Since ⋁{Ker V*^n | n ≥ 0} = (⋂_{n≥0} V^n L²(0, ∞))^⊥ and as the set of functions p(t) e^{-t},
with p any polynomial, is dense in L²(0, ∞), it follows that V is a completely nonunitary isometry, equivalent by Lemma 8-1 to a unilateral shift of multiplicity one.
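The formulas (10-51) and (10-52) and the computation of Ker V* can be confirmed numerically. Applying (10-51) to f₀(t) = √2 e^{-t} gives Vf₀ = √2(1 - 2t)e^{-t}, which has unit norm and is orthogonal to f₀, as the shift picture predicts. A minimal sketch (the quadrature grid is an arbitrary choice):

```python
import numpy as np

t = np.linspace(0.0, 40.0, 400001)

def cumtrapz(y):
    """Cumulative trapezoidal integral of y over the grid t."""
    dt = np.diff(t)
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)))

def V(f):
    # (Vf)(t) = f(t) - 2 e^{-t} int_0^t e^{tau} f(tau) dtau   -- eq. (10-51)
    return f - 2.0 * np.exp(-t) * cumtrapz(np.exp(t) * f)

def inner(f, g):
    return cumtrapz(f * g)[-1]

f0 = np.sqrt(2.0) * np.exp(-t)
f1 = V(f0)

assert abs(inner(f0, f0) - 1.0) < 1e-6     # f0 is normalized
assert abs(inner(f1, f1) - 1.0) < 1e-5     # V preserves the norm of f0
assert abs(inner(f1, f0)) < 1e-6           # Vf0 is orthogonal to f0
assert np.allclose(f1, np.sqrt(2.0) * (1 - 2*t) * np.exp(-t), atol=1e-6)
```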
It is easy to make contact now with the special orthonormal basis of L²(0, ∞) consisting of the Laguerre functions.
Let f₀(t) = √2 e^{-t}; then f₀ is an orthonormal basis for Ker V* and hence, V being unitarily equivalent to a simple right shift, {V^n f₀}_{n=0}^∞ is an orthonormal basis for L²(0, ∞). Define f_{n+1} = Vf_n; then f₀(0) = √2 and from (10-51) it follows that f_{n+1}(0) = (Vf_n)(0) = f_n(0), so f_n(0) = √2 for all n ≥ 0. By differentiating (10-51)

    f′_{n+1}(t) = f′_n(t) - 2f_n(t) + 2e^{-t} ∫_0^t e^τ f_n(τ) dτ

or

    f′_{n+1} + f_{n+1} = f′_n - f_n                                (10-53)
which together with the obvious relation f′₀ = -f₀ implies, by summing up the equalities from 0 to n + 1, that f_{n+1} is the solution of the recursive set of differential equations

    f′_{n+1} + f_{n+1} = -2 Σ_{i=0}^n f_i,    f_{n+1}(0) = √2      (10-54)
Now in L²(0, ∞) we define a linear transformation W by (Wf)(t) = (1/√2) f(t/2); then W is an isometry which is clearly invertible with (W⁻¹g)(t) = √2 g(2t). So W is a unitary map in L²(0, ∞) and hence maps an orthonormal basis onto another orthonormal basis. If we let φ_n be defined by φ_n = Wf_n then {φ_n} is an orthonormal basis in L²(0, ∞). Moreover, transforming (10-54), it follows that the φ_n are defined by the following set of recursive differential equations

    2φ′_{n+1}(t) + φ_{n+1}(t) = -2 Σ_{i=0}^n φ_i(t),    φ_{n+1}(0) = 1        (10-55)
The functions {φ_n} are known as the Laguerre functions. In summary we have proved the following theorem.

Theorem 10-14 Let {V(t) | t ≥ 0} be the right translation semigroup in L²(0, ∞). Its infinitesimal generator is the operator A defined by (10-50) with domain D_A given by (10-49). The cogenerator V is given by (10-51), is an isometry, and with respect to the orthonormal basis {f_n} of L²(0, ∞) consisting of the solutions of the recursive set of differential equations (10-54) it is a simple unilateral shift.

The left translation semigroup {V(t)*} has the infinitesimal generator B given by (10-48) with domain D_B given by (10-47). The cogenerator V* is coisometric and given by (10-52).

Corollary 10-15 The Laguerre functions {φ_n} defined by the recursive system of differential equations (10-55) are an orthonormal basis for L²(0, ∞).
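Corollary 10-15 can be checked against the classical Laguerre polynomials L_n: with the standard identification φ_n(t) = e^{-t/2} L_n(t) (one verifies directly that these satisfy (10-55)), the inner products reduce to integrals against the weight e^{-t}, which Gauss-Laguerre quadrature evaluates exactly for polynomial integrands. A sketch using NumPy (the degrees and the quadrature order are arbitrary choices):

```python
import numpy as np
from numpy.polynomial.laguerre import Laguerre, laggauss

nmax = 6
x, w = laggauss(20)                  # nodes/weights for the weight e^{-t} on [0, inf)
L = [Laguerre.basis(n)(x) for n in range(nmax)]

# <phi_m, phi_n> = int_0^inf e^{-t} L_m(t) L_n(t) dt = delta_{mn}
G = np.array([[np.dot(w, L[m] * L[n]) for n in range(nmax)] for m in range(nmax)])
assert np.allclose(G, np.eye(nmax), atol=1e-10)

# the initial condition phi_{n+1}(0) = 1 in (10-55): L_n(0) = 1
assert all(abs(Laguerre.basis(n)(0.0) - 1.0) < 1e-12 for n in range(nmax))
```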
Actually we can use the orthonormal basis {f_n} of L²(0, ∞) to produce an orthonormal basis for L²(-∞, ∞). Define g_n = Jf_n for the unitary map J of (10-26). Since J maps L²(0, ∞) onto L²(-∞, 0) it is clear that {g_n}_{n=0}^∞ is an orthonormal basis for L²(-∞, 0). Moreover since Uf_n = f_{n+1}, relation (10-28) implies

    U*g_n = g_{n+1}                                                (10-56)

To link the two sequences we consider both L²(-∞, 0) and L²(0, ∞) as subspaces of L²(-∞, ∞). It is important to note that the restriction of an L²(0, ∞) function to the negative half axis is zero. By a direct computation we have U*f₀ = g₀. We state the results concerning the translation group in L²(-∞, ∞) as a theorem.
Theorem 10-16 Let {U(t) | t ≥ 0} be the right translation semigroup in L²(-∞, ∞). Its infinitesimal generator A is given by (10-21) with domain given by (10-20). The cogenerator U is unitary and given by (10-24). The sequence of functions {..., g₂, g₁, g₀, f₀, f₁, ...}, with the {f_n} defined by (10-54) and the {g_n} by g_n(t) = -f_n(-t), or alternately by the recursive system of differential equations

    g′_{n+1} - g_{n+1} = 2 Σ_{i=0}^n g_i,    g_{n+1}(0) = -√2      (10-57)

for the half line (-∞, 0], is an orthonormal basis for L²(-∞, ∞). With respect to this basis U is a bilateral right shift. The cogenerator U* of {U(-t) | t ≥ 0} = {U(t)* | t ≥ 0} is given by (10-25) and is a bilateral left shift of multiplicity one.
The same analysis of the translation semigroups can be carried out in the vectorial case. Thus instead of scalar valued functions we consider, for a given separable Hilbert space N, the space of all N-valued weakly measurable functions satisfying ‖f‖² = ∫_{-∞}^∞ ‖f(x)‖²_N dx < ∞. We denote this space by L²(-∞, ∞; N) and define L²(0, ∞; N) analogously. We just note that in this case the respective cogenerators are shifts of multiplicity equal to dim N.

A semigroup {T(t)} is called a completely nonunitary semigroup if there exists no nontrivial subspace reducing all the T(t) on which T(t) acts unitarily. As for contractions, given a contraction semigroup {T(t)} we can separate its unitary and completely nonunitary parts.
Theorem 10-17 Given a contraction semigroup {T(t)} in a Hilbert space H, there exists a unique direct sum decomposition H = H₀ ⊕ H₁ into subspaces that reduce the T(t), such that {T₀(t)} = {T(t)|H₀} is a unitary semigroup and {T₁(t)} = {T(t)|H₁} is a completely nonunitary semigroup.

PROOF Let {T(t)} be a contractive semigroup and T its infinitesimal cogenerator. Let T = T₀ ⊕ T₁ be its unique decomposition into its unitary and completely nonunitary parts. The existence of this decomposition has been proved in Theorem 9-1. Now 1 does not belong to σ_p(T) if and only if it
does not belong to σ_p(T_i), i = 0, 1. Thus T₀ and T₁ are also infinitesimal cogenerators of contraction semigroups {T₀(t)} and {T₁(t)} acting in H₀ and H₁, respectively. By Theorem 10-10 the semigroup {T₀(t)} is unitary. By the same theorem {T₁(t)} must be completely nonunitary, for if {T₁(t)} had as a direct summand a unitary semigroup then the cogenerator T₁ could not be completely nonunitary.
More can be said about the relation of the semigroup and its cogenerator if we assume T(t)* tends strongly to zero. This turns out to be equivalent to T*^n tending strongly to zero. To prove it we need the following result, which is extremely important for its own sake. It is the continuous analogue of Theorem 9-2 and points out the importance of the translation semigroups in providing universal models for contraction semigroups.

Theorem 10-18 A contraction semigroup {T(t)} is unitarily equivalent to the adjoint of the semigroup of left translations restricted to a left invariant subspace of some L²(0, ∞; N) space if and only if T(t)* tends strongly to zero.
PROOF Let {V(t)} be the right translation semigroup in L²(0, ∞; N) where N is a separable Hilbert space. Assume M is a left invariant subspace of L²(0, ∞; N) and let P be the orthogonal projection of L²(0, ∞; N) on M. Define W(t) by

    W(t) = PV(t)|M                                                 (10-58)

then {W(t)} is a strongly continuous semigroup and

    W(t)* = V(t)*|M                                                (10-59)

Since ‖W(t)*f‖² = ∫_t^∞ ‖f(x)‖² dx we have lim_{t→∞} ‖W(t)*f‖ = 0 for all f ∈ M. Consequently, if {T(t)} is unitarily equivalent to {W(t)} we must have lim_{t→∞} ‖T(t)*x‖ = 0 for all x ∈ H.
Conversely assume lim_{t→∞} ‖T(t)*x‖ = 0 for all x ∈ H. We want to identify a vector x ∈ H with the vectorial function T(t)*x and construct a new norm in H such that this identification will be unitary. So we want to have

    ‖x‖² = ∫_0^∞ ‖T(t)*x‖₁² dt                                     (10-60)

and since this has to hold for all vectors, in particular we must have

    ‖T(s)*x‖² = ∫_0^∞ ‖T(t)*T(s)*x‖₁² dt = ∫_0^∞ ‖T(t+s)*x‖₁² dt = ∫_s^∞ ‖T(t)*x‖₁² dt        (10-61)
Differentiating (10-61) with respect to s and letting B denote the infinitesimal generator of {T(t)*} we obtain

    -‖T(s)*x‖₁² = (BT(s)*x, T(s)*x) + (T(s)*x, BT(s)*x)            (10-62)

and by evaluating at s = 0

    ‖x‖₁² = -(Bx, x) - (x, Bx) = -2 Re(Bx, x)                      (10-63)

Now B, being the infinitesimal generator of a contractive semigroup, is maximal accretive, so indeed the norm defined by (10-63) is nonnegative. Now for every x in D_B we take (10-63) as the definition of the norm. From (10-61) we have (d/ds)‖T(s)*x‖² = -‖T(s)*x‖₁² and integrating over (0, t)

    ∫_0^t ‖T(s)*x‖₁² ds = ‖x‖² - ‖T(t)*x‖²                         (10-64)
By our assumption T(t)* tends strongly to zero, so (10-60) follows from (10-64). Define N₀ as the set of all vectors x in D_B for which ‖x‖₁ = 0, and let N be the completion of D_B/N₀ in the new norm. N is a Hilbert space and, since D_B is dense in H and the map from D_B to L²(0, ∞; N) defined by x → T(t)*x is isometric, it can be extended to all of H. If M is the image of H under this map then clearly M is left invariant. For if V(t)* is the left translation semigroup in L²(0, ∞; N) then

    V(t)*T(s)*x = T(t+s)*x = T(t)*T(s)*x = T(s)*T(t)*x             (10-65)

Equality (10-65) shows not only the left invariance of M but also that the action of T(t)* in H is mapped into the action of V(t)* in M. Thus the unitary equivalence of {T(t)*} and {V(t)*|M}. This proves the theorem.
Theorem 10-19 Let {T(t)} be a strongly continuous semigroup and T its infinitesimal cogenerator. Then the conditions

    lim_{t→∞} T(t)*x = 0    for all x ∈ H                          (10-66)

and

    lim_{n→∞} T*^n x = 0    for all x ∈ H                          (10-67)

are equivalent.
PROOF Assume (10-66) holds. By the previous theorem we may assume that {T(t)*} is the left translation semigroup restricted to a left invariant subspace M of some L²(0, ∞; N) space. Its cogenerator V* is given by (10-52)
and by Theorem 10-9 M is also invariant under V*, so T* = V*|M. Now V* is unitarily equivalent to a left shift so (10-67) follows.

Conversely assume (10-67) holds. By Theorem 9-2 we may assume that T* is the left shift restricted to a left invariant subspace of l²(0, ∞; N), or for that matter that it is the operator V* given by (10-52) restricted to a left invariant subspace of L²(0, ∞; N). Therefore the semigroup T(t)* is unitarily equivalent to the left translation semigroup restricted to a left translation invariant subspace of L²(0, ∞; N) and consequently (10-66) holds.
Theorem 10-18 provided an isometric dilation for a strongly continuous semigroup provided T(t)* → 0 strongly. Actually every strongly continuous contractive semigroup {T(t)} has a unitary dilation in the sense that there exists a Hilbert space K ⊃ H and a strongly continuous unitary semigroup {U(t)} in K such that

    T(t) = PU(t)|H                                                 (10-68)

with P the orthogonal projection of K on H.

Theorem 10-20 Every strongly continuous contractive semigroup {T(t)} in a Hilbert space H has a unitary dilation.
PROOF There are several ways to proceed. We can check that the function T̃(t) defined on ℝ by

    T̃(t) = { T(t)      t ≥ 0
            { T(-t)*    t < 0                                      (10-69)

is a positive definite function on ℝ and apply the Naimark dilation theorem. Otherwise let T be the cogenerator of the semigroup {T(t)}. We know that T is a contraction and 1 ∉ σ_p(T). Let U be the minimal unitary dilation of T; then 1 ∉ σ_p(U) and hence U is the cogenerator of a unitary semigroup {U(t)}. This semigroup provides the unitary dilation of {T(t)}.
Next we describe another application of Naimark's theorem. Let F be a positive operator measure on ℝ, that is, a B(H)-valued function defined on the Borel subsets of ℝ such that (F(·)x, x) is a positive measure for each x ∈ H. The question that naturally arises is when F can be dilated to a spectral measure on a larger space K.
Theorem 10-21 Let F be a positive operator measure on ℝ which satisfies F(σ) ≤ I for each Borel set σ. Then there exists a Hilbert space K ⊃ H and a spectral measure E in K such that for each Borel set σ

    F(σ) = PE(σ)|H                                                 (10-70)
PROOF We can use the positive operator measure F to construct a function T(t) on ℝ by letting

    T(t) = ∫ e^{iλt} F(dλ)

where the integral certainly exists in the weak sense. We check that T(t) is a positive definite function on ℝ. Let x_i ∈ H and t_i ∈ ℝ; then

    Σ_{i=1}^n Σ_{j=1}^n (T(t_i - t_j)x_j, x_i) = Σ_{i=1}^n Σ_{j=1}^n ∫ e^{iλ(t_i - t_j)} (F(dλ)x_j, x_i)
        = ∫ (F(dλ) Σ_{j=1}^n e^{-iλt_j}x_j, Σ_{i=1}^n e^{-iλt_i}x_i) ≥ 0
By Naimark's theorem there exists a unitary representation U(t) of ℝ in K ⊃ H such that

    (T(t)x, y) = (U(t)x, y)    for all x, y ∈ H

Applying Stone's theorem we have U(t) = ∫ e^{iλt} E(dλ) for some spectral measure E in K. It follows that for all x, y ∈ H

    ∫ e^{iλt} (F(dλ)x, y) = ∫ e^{iλt} (E(dλ)x, y)

By the uniqueness of the Fourier-Stieltjes transform the measures (F(·)x, y) and (E(·)x, y) are equal, that is, for each Borel subset σ of ℝ we have

    (F(σ)x, y) = (E(σ)x, y)                                        (10-71)

Condition (10-71) is equivalent to (10-70).
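The positive definiteness computed in the proof can be illustrated in the simplest scalar case H = ℂ with F a purely atomic positive measure: the matrix [T(t_i - t_j)] is then a positive weighted sum of rank-one Gram matrices, hence positive semidefinite. A sketch (the atoms, weights and sample times are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = rng.standard_normal(5)         # atoms of the measure F
wts = rng.random(5)                  # positive masses F({lam_k})

def T(t):
    # T(t) = int e^{i lam t} F(dlam) = sum_k wts_k e^{i lam_k t}
    return np.sum(wts * np.exp(1j * lam * t))

ts = rng.standard_normal(6)
G = np.array([[T(ti - tj) for tj in ts] for ti in ts])

# T(-t) = conj(T(t)), so G is Hermitian, and it is positive semidefinite
assert np.allclose(G, G.conj().T)
assert np.linalg.eigvalsh(G).min() > -1e-10
```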
With the results obtained so far we are in a position to give an abstract characterization of the group of translations in L²(-∞, ∞), a characterization analogous to the one obtained in Theorem 8-3 for bilateral shifts in l²(-∞, ∞). Let {U(t)} be any strongly continuous group of unitary operators. A subspace D is an outgoing subspace for the group {U(t)} if it satisfies U(t)D ⊂ D for t > 0, ⋂_{t∈ℝ} U(t)D = {0} and ⋁_{t∈ℝ} U(t)D = H.

Lemma 10-22 D is an outgoing subspace for the unitary group {U(t)} if and only if it is an outgoing subspace for the unitary cogenerator U of the group.
PROOF By Theorem 10-9 D is invariant under the semigroup {U(t) | t ≥ 0} if and only if D is invariant under the cogenerator U. If a subspace M is invariant under all the U(t) then it is invariant under all the U^n, and U^nM = M = U(t)M.

Let now M = ⋂_{t∈ℝ} U(t)D and M′ = ⋂_{n∈ℤ} U^nD. Since U(t)M = M for all t and U^nM′ = M′ for all n we have also U^nM = M and U(t)M′ = M′
so consequently

    M = ⋂_{n∈ℤ} U^nM ⊂ ⋂_{n∈ℤ} U^nD = M′ = ⋂_{t∈ℝ} U(t)M′ ⊂ ⋂_{t∈ℝ} U(t)D = M

So M = M′, and ⋂_{n∈ℤ} U^nD = {0} if and only if ⋂_{t∈ℝ} U(t)D = {0}. Finally, let N = ⋁_{t∈ℝ} U(t)D and N′ = ⋁_{n∈ℤ} U^nD. N and N′ are invariant under all the U(t) and U^n, hence

    N = ⋁_{n∈ℤ} U^nN ⊃ ⋁_{n∈ℤ} U^nD = N′ = ⋁_{t∈ℝ} U(t)N′ ⊃ ⋁_{t∈ℝ} U(t)D = N

So N = N′, and N = H if and only if N′ = H, which completes the proof.

Theorem 10-23 Let D be an outgoing subspace for a strongly continuous group {U₁(t)} of unitary operators in a Hilbert space H. Then {U₁(t)} is unitarily equivalent to the group of translations {U(t)} in some L²(-∞, ∞; N) space.
PROOF Let U₁ be the cogenerator of {U₁(t) | t ≥ 0}; then U₁ is unitary and D is an outgoing subspace with respect to U₁. By Theorem 8-3 we may assume without loss of generality that U₁ is the right shift in l²(-∞, ∞; N) for some auxiliary Hilbert space N, and D is l²(0, ∞; N). If {f_n} and {g_n} are the sequences of functions defined by (10-54) and (10-57), respectively, and x = {x_n}_{n=-∞}^∞ is in l²(-∞, ∞; N), then we define a map φ: l²(-∞, ∞; N) → L²(-∞, ∞; N) by

    φ(x) = Σ_{n≥0} f_n x_n + Σ_{n<0} g_{-n-1} x_n

It is clear that φ is a unitary map and, as a consequence of Theorem 10-16,

    φU₁ = Uφ                                                       (10-72)

where U is given by (10-24). But U is the cogenerator of the right translation semigroup in L²(-∞, ∞; N), so {U₁(t)} and {U(t)} are unitarily equivalent. This completes the proof.
11. THE LIFTING THEOREM

This section is devoted to a proof of the lifting theorem, describing the commutant of a given contraction in terms of the commutant of its minimal isometric dilation.

We begin by presenting a theorem that is instrumental in the proof of the main result.
Theorem 11-1 Let T₁ and T₂ be contractions in the Hilbert spaces H₁ and H₂, respectively, and let X ∈ B(H₂, H₁). A necessary and sufficient condition that the operator on H₁ ⊕ H₂ defined by the operator matrix

    ( T₁  X  )
    ( 0   T₂ )

be a contraction is that there exists a contraction C ∈ B(H₂, H₁) such that

    X = D_{T₁*} C D_{T₂}                                           (11-1)

where D_T and D_{T*} are defined by (I - T*T)^{1/2} and (I - TT*)^{1/2}, respectively.

PROOF Assume the operator defined by the matrix
    ( T₁  X  )
    ( 0   T₂ )

is a contraction. Then

    ( T₁  X  ) ( T₁*  0   )      ( I_{H₁}  0      )
    ( 0   T₂ ) ( X*   T₂* )  ≤   ( 0       I_{H₂} )                (11-2)

But

    ( T₁  X  ) ( T₁*  0   )     ( T₁T₁* + XX*  XT₂*  )     ( T₁T₁*  0 )     ( XX*   XT₂*  )
    ( 0   T₂ ) ( X*   T₂* )  =  ( T₂X*         T₂T₂* )  =  ( 0      0 )  +  ( T₂X*  T₂T₂* )

which implies that

    ( XX*   XT₂*  )      ( D_{T₁*}²  0      )
    ( T₂X*  T₂T₂* )  ≤   ( 0         I_{H₂} )                      (11-3)

Applying Corollary 7-2 this is equivalent to the existence of a contraction in H₁ ⊕ H₂ represented by a matrix

    ( K₁₁  K₁₂ )
    ( K₂₁  K₂₂ )

which satisfies

    ( 0  X  )     ( D_{T₁*}  0      ) ( K₁₁  K₁₂ )
    ( 0  T₂ )  =  ( 0        I_{H₂} ) ( K₂₁  K₂₂ )

From this equality it follows that K₂₁ = 0, K₂₂ = T₂ and X = D_{T₁*}K₁₂. Moreover if

    ( K₁₁  K₁₂ )
    ( 0    T₂  )

is a contraction, then so is

    M  =  ( 0  K₁₂ )  =  ( K₁₁  K₁₂ ) ( 0  0 )
          ( 0  T₂  )     ( 0    T₂  ) ( 0  I )
Since M is a contraction we have

    M*M  =  ( 0     0   ) ( 0  K₁₂ )     ( 0  0               )      ( I_{H₁}  0      )
            ( K₁₂*  T₂* ) ( 0  T₂  )  =  ( 0  K₁₂*K₁₂ + T₂*T₂ )  ≤   ( 0       I_{H₂} )

from which K₁₂*K₁₂ ≤ I_{H₂} - T₂*T₂ = D_{T₂}² follows. Another application of Corollary 7-2 yields the existence of a contraction C such that K₁₂ = CD_{T₂}, and (11-1) is proved. Conversely if (11-1) holds for a contraction C then we define K₁₂ = CD_{T₂}, which implies K₁₂*K₁₂ ≤ I - T₂*T₂, and hence that M defined as above is a contraction. Since obviously

    ( 0  X  )     ( 0  D_{T₁*}CD_{T₂} )     ( D_{T₁*}  0      ) ( 0  CD_{T₂} )
    ( 0  T₂ )  =  ( 0  T₂             )  =  ( 0        I_{H₂} ) ( 0  T₂      )

relation (11-3) follows and, adding

    ( T₁T₁*  0 )
    ( 0      0 )

to both sides, also (11-2), and the proof is complete.

The characterization of the commutant of the general contraction in a Hilbert space will be based on that of partial isometries. Let Q be a partial isometry in a Hilbert space K. Let H be the final space of Q and G its orthogonal complement in K. Thus K = H ⊕ G and with respect to this direct sum decomposition Q has a matrix representation
    Q  =  ( T  S )
          ( 0  0 )                                                 (11-4)

By Theorem 2-3

    QQ*  =  ( TT* + SS*  0 )
            ( 0          0 )

is the orthogonal projection on the final space of Q, hence TT* + SS* = I_H.
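Theorem 11-1 lends itself to a direct numerical check in finite dimensions: pick contractions T₁, T₂ and C at random, form X = D_{T₁*}CD_{T₂}, and verify that the block matrix has operator norm at most one. A minimal sketch in Python/NumPy (the dimensions, seed, and the normalization used to generate contractions are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_contraction(n, m):
    A = rng.standard_normal((n, m))
    return A / np.linalg.norm(A, 2)            # scale to operator norm 1

def defect(M):
    """D_M = (I - M^* M)^{1/2}, computed via an eigendecomposition."""
    H = np.eye(M.shape[1]) - M.conj().T @ M
    w, V = np.linalg.eigh(H)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

n1, n2 = 3, 4
T1 = random_contraction(n1, n1)
T2 = random_contraction(n2, n2)
C = random_contraction(n1, n2)                 # C : H2 -> H1

# X = D_{T1*} C D_{T2}; note defect(T1*) = (I - T1 T1*)^{1/2} = D_{T1*}
X = defect(T1.conj().T) @ C @ defect(T2)
block = np.block([[T1, X], [np.zeros((n2, n1)), T2]])
assert np.linalg.norm(block, 2) <= 1.0 + 1e-12
```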
Theorem 11-2 Let G and H be Hilbert spaces, T a contraction in H and S ∈ B(G, H) for which TT* + SS* = I_H. Let Q be the partial isometry defined by (11-4) acting in H ⊕ G and let X be an operator in H that commutes with T. Then there exists an operator

    Y  =  ( X  A )
          ( 0  B )                                                 (11-5)

in H ⊕ G such that Y commutes with Q and ‖Y‖ = ‖X‖.
PROOF We may assume without loss of generality that ‖X‖ = 1. An operator Y defined by (11-5) commutes with Q, assuming X commutes with T, if and only if

    TA + SB = XS                                                   (11-6)

To prove the theorem we will show the existence of operators A ∈ B(G, H) and B ∈ B(G, G) such that (11-6) holds and Y is a contraction. This, in view of Theorem 11-1, is equivalent to ‖B‖ ≤ 1 and A = D_{X*}CD_B for some contraction C. Since X and T commute we have XTT*X* = TXX*T*. Using the fact that X is a contraction the inequality

    I_H - TXX*T* ≥ XX* - XTT*X* = X(I - TT*)X* = XSS*X*

follows. Replacing I_H by TT* + SS* we obtain

    XSS*X* ≤ TD_{X*}²T* + SS*                                      (11-7)

which by an application of Theorem 7-4 guarantees the existence of operators K and B which satisfy

    TD_{X*}K + SB = XS                                             (11-8)

and

    B*B + K*K = I_G                                                (11-9)

From (11-9) clearly B itself is a contraction and K*K ≤ D_B². Applying Corollary 7-2 we infer the existence of a contraction C such that K = CD_B. Letting A = D_{X*}CD_B completes the proof.
We proceed to the statement and proof of the lifting theorem. Since we can start with the minimal isometric dilations of both T and T*, we can state the result in terms of the minimal coisometric or isometric dilations of T. We begin with the first.
Theorem 11-3 Let T be a contraction in a Hilbert space H and let W be the minimal coisometric dilation of T acting in K ⊃ H. Given an operator X in H that commutes with T, there exists an operator Y in K that commutes with W and satisfies YH ⊂ H, X = Y|H and ‖Y‖ = ‖X‖.

PROOF By the remarks following Theorem 9-5 we have the matrix representation (9-21) for W in K = H ⊕ Z_* ⊕ Z_* ⊕ ⋯. Let us define H₀ = H ⊕ {0} ⊕ {0} ⊕ ⋯ and H_n = H ⊕ Z_* ⊕ ⋯ ⊕ Z_* ⊕ {0} ⊕ ⋯, with n copies of Z_*. We observe that H_n is invariant under W, and define W_n = W|H_n. Thus with respect to the direct sum H ⊕ Z_* ⊕ ⋯ ⊕ Z_*, W_n has the matrix representation (11-10).
Using the equality TT* + D_{T*}² = I_H a simple matrix calculation yields that W_nW_n* is the projection on H ⊕ Z_* ⊕ ⋯ ⊕ Z_* ⊕ {0}, that is, on H_{n-1} embedded naturally in H_n, which means that W_n is a partial isometry with H_{n-1} as final space. In terms of the direct sum decomposition H_n = H_{n-1} ⊕ Z_*, W_n, for n ≥ 2, has the matrix representation

    W_n  =  ( W_{n-1}  S_{n-1} )
            ( 0        0       )

and W_{n-1}W_{n-1}* + S_{n-1}S_{n-1}* = I_{H_{n-1}}.
with Y. acting We define now inductively a sequence of operators in H,,. Let Yo = X. Applying Theorem 11-2 there exists an operator YI in HI such that Ho is invariant under Y1, X = Y1 I H. and II Y1 II =IIXII . Suppose that for 1 <_ k 5 n we defined Yk acting in Hk such that YkWk = WkYk,
YkHk-1
and
Hk-1, IIYkII = IIXII
Yk-1 = YkIHk-1 (11-11)
Applying Theorem 11-2 once again we have the existence of an operator satisfying (11-11) for k = n + 1. If we embed all spaces H. Yn+1 in as a sequence of operators in K, naturally in K then we can consider { by letting Y.I H. = {0}. In the linear manifold L= un 0 H,,, which is obviously dense in K, we define an operator Y by Yx = Y.x if x e H,,. From (11-11) it follows that Y is well defined on L and for each xe L there exists an n such that II YxII = II Y.x11 < II Y. 1111 x 11 = IIXII IIxII which implies II Y11 s II X II . Since obviously II Y11 >_ II Y. 11 we have actually the equality II Y11 = IIXII
Thus Y can be extended to a bounded operator on all of K. Since Y_kW_k = W_kY_k on H_k we have actually YW = WY on all of L, and by continuity this holds also in K. This completes the proof.

As a direct corollary we have this alternative form of the lifting theorem.

Theorem 11-4 Let T be a contraction in a Hilbert space H and let V be the minimal isometric dilation of T acting in K ⊃ H. Given an operator X in H that commutes with T there exists an operator Y in K that commutes with V and satisfies Y(K ⊖ H) ⊂ K ⊖ H, ‖Y‖ = ‖X‖ and

    X = P_H Y|H                                                    (11-12)
PROOF Let W be the minimal coisometric dilation of T*; then V = W* is the minimal isometric dilation of T. Since X* commutes with T* there exists an operator Z commuting with W and satisfying ZH ⊂ H, ‖Z‖ = ‖X*‖ and Z|H = X*. Define Y = Z*; then Y commutes with V and ‖Y‖ = ‖X‖. The invariance of K ⊖ H under Y is equivalent to that of H under Z. Finally for all x, y in H we have (Xx, y) = (x, X*y) = (x, Zy) = (Z*x, y) = (Yx, y) = (P_H Yx, y) and (11-12) follows.
The assumption of the minimality of the coisometric dilation is not essential and it will be convenient to have a formulation of Theorem 11-3 which is independent of it. Naturally the same holds for Theorem 11-4.
Theorem 11-5 Let T be a contraction in a Hilbert space H and let W be a coisometric dilation of T acting in a Hilbert space K ⊃ H. For every operator X that commutes with T there exists an operator Y in K that commutes with W and satisfies YH ⊂ H, X = Y|H and ‖Y‖ = ‖X‖.
PROOF Let K₁ ⊂ K be the smallest reducing subspace of W that contains H. Then W₁ = W|K₁ is, up to unitary equivalence, the minimal coisometric dilation of T. Let Y₁ be the operator in K₁ that satisfies Y₁W₁ = W₁Y₁, Y₁H ⊂ H and Y₁|H = X, which exists by virtue of Theorem 11-3. Let K₂ be the orthogonal complement of K₁ in K. Relative to the direct sum decomposition K = K₁ ⊕ K₂ define an operator Y by the 2 × 2 operator matrix

    Y  =  ( Y₁  0 )
          ( 0   0 )

then clearly Y has the required properties.
The above two theorems generalize easily to the context of intertwining operators for two contractions T₁ and T₂ that act in the Hilbert spaces H₁ and H₂, respectively.
Theorem 11-6 Let, for i = 1, 2, T_i be contractions in the Hilbert spaces H_i and let W_i be coisometric dilations acting in the Hilbert spaces K_i. Let X: H₁ → H₂ be an operator that intertwines T₁ and T₂, that is, satisfies T₂X = XT₁. Then there exists an operator Y: K₁ → K₂ such that YW₁ = W₂Y, YH₁ ⊂ H₂, X = Y|H₁ and ‖Y‖ = ‖X‖.
PROOF Define operators T̃ and X̃ on the direct sum H₁ ⊕ H₂ by the 2 × 2 operator matrices

    T̃  =  ( T₁  0  )        and        X̃  =  ( 0  0 )
           ( 0   T₂ )                          ( X  0 )             (11-13)
then XT₁ = T₂X is equivalent to X̃T̃ = T̃X̃. Since obviously

    W̃  =  ( W₁  0  )
           ( 0   W₂ )

is a coisometric dilation of T̃ we can apply Theorem 11-5 to get the existence of an operator

    Ỹ  =  ( Z₁  Z₂ )
           ( Y   Z₃ )

on K₁ ⊕ K₂ such that ỸW̃ = W̃Ỹ, X̃ = Ỹ|H₁ ⊕ H₂ and ‖Ỹ‖ = ‖X̃‖. It follows immediately that Y has the required properties.
Theorem 11-7 Let, for i = 1, 2, T_i be contractions in the Hilbert spaces H_i and let V_i be isometric dilations acting in the Hilbert spaces K_i ⊃ H_i. Let X: H₁ → H₂ be an operator that satisfies XT₁ = T₂X. Then there exists an operator Y: K₁ → K₂ such that YV₁ = V₂Y, Y(K₁ ⊖ H₁) ⊂ K₂ ⊖ H₂, ‖Y‖ = ‖X‖ and

    X = P_{H₂} Y|H₁                                                (11-14)
We note that in Theorems 11-5 and 11-7 the assumption of minimality of the isometric or coisometric dilation has not been made.
12. ELEMENTS OF H² THEORY

In the previous two sections we obtained abstract models for certain contractions and a characterization of operators intertwining two contractions. More information, especially of a spectral nature, could be obtained if the Hilbert spaces under consideration had more structure. Thus our aim will be to transform the Hilbert spaces, by way of the Fourier transformation, into function spaces. The relevant spaces turn out to be H² spaces and this section is devoted to a short survey of the necessary background.
Let 𝕋 be the unit circle and D the open unit disc. We denote by Lᵖ(𝕋), or simply by Lᵖ, the Banach space of all functions whose pth power is integrable with respect to the normalized Lebesgue measure, with the usual conventions. Thus we have for 1 ≤ p < p′ ≤ ∞ that L¹ ⊃ Lᵖ ⊃ Lᵖ′ ⊃ L^∞; L^∞ is defined as the space of all essentially bounded functions. Each f ∈ L¹ has well-defined Fourier coefficients given by

aₙ = (1/2π) ∫₀^{2π} f(e^{it}) e^{−int} dt   (12-1)

We define for 1 ≤ p ≤ ∞ the Hardy space Hᵖ to be the closed subspace of Lᵖ consisting of all functions for which aₙ = 0 when n < 0. Since the map f → aₙ
168
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE
is continuous on each space Lᵖ, Hᵖ is a closed subspace and hence also a Banach space. Of course H² inherits from L² the Hilbert space structure. For p = 2 we have the orthogonal decomposition of L² given by L² = H² ⊕ R₀², where R₀² = {f ∈ L² | (1/2π) ∫ f(e^{it}) e^{−int} dt = 0, n ≥ 0}. For later use we also define R² by R² = {f ∈ L² | (1/2π) ∫ f(e^{it}) e^{−int} dt = 0, n > 0}. Clearly f ∈ H² if and only if f̃, defined by f̃(e^{it}) = f(e^{−it}), is in R², which explains the notation.

It is worthwhile to point out right at the beginning the connection of the H² spaces with the analysis of shift operators. By the completeness of L² it follows that Σₙ aₙe^{int} converges if and only if Σₙ |aₙ|² < ∞. We define the Fourier transform 𝓕 to be the map 𝓕: l²(−∞, ∞) → L² defined by 𝓕({aₙ}) = Σₙ aₙe^{int}. It is easily checked that it is a unitary map. Moreover if U is the bilateral right shift in l²(−∞, ∞) then, defining the operator U₁ in L² by (U₁f)(e^{it}) = e^{it}f(e^{it}), we have

𝓕U = U₁𝓕   (12-2)

Also it is clear that 𝓕l²(0, ∞) = H², and the operators U|l²(0, ∞) and U₁|H² are unitarily equivalent, the unitary equivalence provided by the Fourier transform restricted to l²(0, ∞).
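The intertwining relation (12-2) is easy to check numerically for trigonometric polynomials. The sketch below is our own illustration, not from the text; the function and variable names are ours.

```python
import numpy as np

# Our illustration of (12-2): for a truncated two-sided sequence {a_n},
# the map F{a}(e^{it}) = sum_n a_n e^{int} carries the bilateral right
# shift (Ua)_n = a_{n-1} into multiplication by e^{it}.
N = 8
rng = np.random.default_rng(0)
a = rng.standard_normal(2 * N + 1)            # coefficients a_{-N}, ..., a_N
n = np.arange(-N, N + 1)
t = np.linspace(0.0, 2 * np.pi, 256, endpoint=False)

def fourier(coeffs, idx):
    """Evaluate sum_k coeffs[k] e^{i idx[k] t} on the grid t."""
    return (coeffs[:, None] * np.exp(1j * np.outer(idx, t))).sum(axis=0)

f_shift = fourier(a, n + 1)               # F(Ua): every frequency raised by one
f_mult = np.exp(1j * t) * fourier(a, n)   # U_1 F(a)
assert np.allclose(f_shift, f_mult)       # (12-2) on the sample grid
```

The agreement is exact up to rounding, since both sides are the same trigonometric polynomial.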
We will now focus our attention on the structure of invariant subspaces. Thus let μ be a positive Borel measure on the unit circle and let L²(μ) be the space of all square integrable functions with respect to μ, with the norm

‖f‖² = ∫ |f(e^{it})|² dμ

In L²(μ) we single out the operator U_μ defined by (U_μf)(e^{it}) = e^{it}f(e^{it}), or alternately U_μf = χf with χ the identity function on 𝕋. Clearly U_μ is a unitary map in L²(μ) and the restriction of U_μ to any of its invariant subspaces is isometric. A straightforward application of the Wold decomposition (Theorem 8-2) yields the following.
Theorem 12-1 Let μ be a positive measure on 𝕋 and let M be an invariant subspace. Then there exists a unique direct sum decomposition M = M₀ ⊕ M₁ which reduces the isometry V = U_μ|M and such that V|M₀ is unitary and ⋂_{n=0}^∞ VⁿM₁ = {0}.
Since each of the two subspaces is invariant under U_μ it is of interest to characterize the two types of invariant subspaces.

Lemma 12-2 Let μ be a Borel measure on 𝕋 that satisfies ∫ χⁿ dμ = 0 for all n ∈ ℤ; then μ = 0.

PROOF The measure μ represents a continuous linear functional on C(𝕋) which vanishes on the dense subset of all trigonometric polynomials and so is necessarily zero.

Corollary 12-3 Let μ be a real Borel measure on 𝕋 for which ∫ χⁿ dμ = 0 for n > 0; then μ is a constant multiple of the normalized Lebesgue measure σ.
PROOF Since μ is real we actually have ∫ χⁿ dμ = 0 for all n ≠ 0. This is also true for the Lebesgue measure σ, and hence for some α we have ∫ χⁿ(dμ − α dσ) = 0 for all n. By the previous lemma μ = ασ.

Theorem 12-4 Let μ be a positive Borel measure on 𝕋. A subset M of L²(μ) is a reducing subspace for U_μ if and only if M = χ_E L²(μ), where χ_E is the characteristic function of a Borel subset E of 𝕋.

PROOF If M = χ_E L²(μ) then M is a linear manifold. It is closed as the range of the orthogonal projection P_M defined by P_M f = χ_E f, and moreover it clearly reduces U_μ.

Conversely let M be a reducing subspace of U_μ and let P be the orthogonal projection on M. Since M reduces U_μ we have U_μP = PU_μ. By the results of Sec. 6, easily adapted to the case of measures on 𝕋, P is represented by multiplication by a function π in L^∞(μ). Since P² = P we have π(e^{it})² = π(e^{it}) almost everywhere with respect to μ. So π = χ_E for some Borel subset E.
The above theorem settles the case of reducing subspaces. At the other extreme we have those invariant subspaces of U_μ which do not contain a reducing direct summand. This is equivalent to the restriction of U_μ being completely nonunitary.

Theorem 12-5 Let μ be a positive measure on 𝕋. A subspace M of L²(μ) is invariant and V = U_μ|M completely nonunitary if and only if M = φH² for some Borel measurable function φ such that |φ|² dμ = dσ.

PROOF Let φ be a Borel measurable function for which |φ|² dμ = dσ. Define a map Φ: H² → L²(μ) by Φf = φf; then

‖Φf‖² = ∫ |φf|² dμ = ∫ |f|² |φ|² dμ = ∫ |f|² dσ = ‖f‖²

So Φ is an isometry and M = φH² a closed subspace. If U₀ is defined in H² by U₀f = χf then clearly ΦU₀ = U_μΦ. V is completely nonunitary if and only if ⋂_{n=0}^∞ VⁿM = ⋂_{n=0}^∞ U_μⁿM = {0}. But ⋂_{n=0}^∞ U_μⁿM = ⋂_{n=0}^∞ U_μⁿφH² = ⋂_{n=0}^∞ φU₀ⁿH² = φ ⋂_{n=0}^∞ U₀ⁿH² = {0}.
Conversely let M be an invariant subspace of L²(μ) for which ⋂_{n=0}^∞ U_μⁿM = {0}. By the Wold decomposition M = ⊕_{n=0}^∞ U_μⁿL, where L = M ⊖ U_μM. Given any unit vector φ in L we have φ ⊥ U_μⁿφ for n > 0, that is, ∫ |φ|² χⁿ dμ = 0 for n ≠ 0. By Corollary 12-3 we have |φ|² dμ = dσ. We claim L is one-dimensional. If φ′ is another unit vector in L which is orthogonal to φ then (U_μⁿφ, U_μᵐφ′) = 0 for all n, m ≥ 0. This means ∫ φφ̄′χᵏ dμ = 0 for all k ∈ ℤ and hence that φφ̄′ dμ = 0, or φφ̄′ = 0 almost everywhere with respect to μ. But |φ|² dμ = |φ′|² dμ = dσ, which is a contradiction. We conclude that L is one-dimensional and M = ⊕_{n=0}^∞ U_μⁿL = φH².
A special case of the previous result is Beurling's characterization of the invariant subspaces of H², which proved to be one of the turning points in operator theory. To this end we define a function q in H^∞ to be inner if |q(e^{it})| = 1 almost everywhere on 𝕋. A function f ∈ H² is called outer if the linear combinations of the functions fχⁿ, n ≥ 0, are dense in H².
Theorem 12-6 (Beurling) Let S be defined in H² by Sf = χf; then a subspace M of H² is invariant under S if and only if M = qH² for some inner function q. The inner function q is determined up to a constant of absolute value one.

PROOF If M = qH² for some inner function q then it is clearly a closed invariant subspace. Conversely, if M is invariant under S, then since ⋂_{n=0}^∞ SⁿH² = {0} we have by the previous theorem that M = qH² for some measurable q such that |q|² dσ = dσ, or |q|² = 1 almost everywhere with respect to σ. But as the function 1 is in H² we have q ∈ H², so q is inner. Finally let q′ be another inner function such that M = q′H². Since qH² = q′H² it follows that q̄q′ and q̄′q are in H². So ∫ q̄q′χⁿ dσ = 0 for all n ≠ 0 and hence q̄q′ is a constant λ of modulus one. Thus q′ = λq.
Corollary 12-7 A subspace M of H² is invariant under multiplication by χ if and only if it is invariant under multiplication by all H^∞ functions.

PROOF If M is invariant under multiplication by χ it is, by Beurling's theorem, of the form qH² for some inner function. Now since H² is invariant under multiplication by H^∞ functions so is M = qH². The converse is obvious.

Corollary 12-8 If f is in H² and f = 0 on a set of positive measure then f is the zero function.

PROOF Let M be the invariant subspace spanned by the functions fχⁿ. By Beurling's theorem, if M is nontrivial we have M = qH² for some inner function q. Obviously all functions in M are zero wherever f is. Since q ∈ M we get a contradiction. Thus necessarily f is the zero function.

The same ideas yield the important factorization of H² functions into inner and outer factors.

Theorem 12-9 Every nonzero function f ∈ H² has a factorization f = qg where q is inner or constant and g is outer. This factorization is unique up to constant factors of absolute value one.

PROOF If f is outer take g = f and q = 1. Otherwise let M_f be the subspace spanned by the functions fχⁿ for n ≥ 0. M_f is an invariant subspace so, by Beurling's theorem, M_f = qH² for some inner function q. As f ∈ M_f we have f = qg for some g ∈ H². Since multiplication by an inner function q is an isometry in H², M_f = qM_g, M_g being the invariant subspace spanned
by the functions gχⁿ. This implies that M_g = H², or that g is outer. If f = q₁g₁ is another factorization of f into inner and outer factors we have M_f = q₁H², so q and q₁ are equal up to a constant of modulus one.
The structure of invariant subspaces has been established only for the H² case. The same result holds also for all Hᵖ spaces, 1 ≤ p ≤ ∞. Thus a subspace M of Hᵖ is invariant if and only if M = qHᵖ for some inner function q, which is uniquely determined up to a constant of absolute value one. The case p = ∞ holds true for those subspaces which are w*-closed, a characterization which will prove useful to us later. For the proofs of the quoted results we refer to [67].
Theorem 12-10 (F. and M. Riesz) Let μ be a finite Borel measure on 𝕋 satisfying

∫ χⁿ dμ = 0  for n > 0   (12-3)
Then μ is absolutely continuous with respect to Lebesgue measure and hence dμ = f dσ for some f in H¹.

PROOF Let ν be the total variation of μ. Then ν is a finite positive measure on 𝕋 and dμ = h dν for some measurable h satisfying |h(e^{it})| = 1 ν-almost everywhere. We also have that μ and ν are equivalent measures. Let now M be the subspace of L²(ν) spanned by the functions {hχⁿ | n ≥ 0}. If U_ν f = χf for f ∈ L²(ν) then U_ν is unitary and M clearly an invariant subspace. Since |h(e^{it})| = 1 ν-almost everywhere the set of functions {hχⁿ | n ∈ ℤ} spans all of L²(ν), which implies that the spaces U_ν^{−n}M, n ≥ 0, span L²(ν). Next we note that ⋂_{n=0}^∞ U_νⁿM = {0}. Applying Theorem 8-2 it follows that U_ν|M is unitarily equivalent to a unilateral shift of multiplicity one, and in particular U_ν and U_σ are unitarily equivalent. This implies the mutual absolute continuity of ν and σ, and hence of μ and σ. Thus dμ = f dσ for some f in L¹(σ), and the assumption (12-3) implies that actually f ∈ H¹.

So far we have considered H², and more generally Hᵖ, functions as defined on the unit circle. However, with any function f in H¹ we can associate a uniquely determined analytic function in the unit disc. If f is in H¹ then f has a Fourier series of analytic type, that is, of the form Σ_{n=0}^∞ aₙe^{int}. By the Riemann–Lebesgue lemma lim aₙ = 0, so the sequence {aₙ} is bounded. This implies that the power series Σ_{n=0}^∞ aₙzⁿ converges absolutely and uniformly on compact subsets of D. Define f̂ by f̂(z) = Σ_{n=0}^∞ aₙzⁿ; then f̂ provides the analytic extension of f into D. The precise relationship between functions in Hᵖ and their analytic extensions is our next object. To explore this further we give an alternate definition of the Hᵖ spaces. We define now Hᵖ(D), for 1 ≤ p ≤ ∞, as the space of all analytic functions
in D satisfying

‖f‖_p = lim_{r→1} ‖f_r‖_p < ∞

where f_r is the function on the unit circle defined by f_r(e^{iθ}) = f(re^{iθ}). The spaces Hᵖ(D) are Banach spaces, whereas H^∞(D) is actually a commutative Banach algebra.
Theorem 12-11 There exists an isometric isomorphism between Hᵖ and Hᵖ(D), given by f → f̂, for all 1 ≤ p ≤ ∞.

PROOF Let f ∈ Hᵖ and f̂ be the analytic extension of f into D. If P_r(θ) is the Poisson kernel then, as the Fourier coefficients of the convolution of two L¹ functions are the products of the corresponding Fourier coefficients, we have f̂_r = f ∗ P_r, or f̂_r(e^{iθ}) = Σ_{n=0}^∞ aₙrⁿe^{inθ}. Using the summability properties of the Poisson kernel, in line with the proof of Theorem 4-1, we have lim_{r→1} ‖f̂_r − f‖_p = 0 for 1 ≤ p < ∞, and lim_{r→1} f̂_r = f in the w*-topology of L^∞ for f ∈ H^∞. In particular it follows that lim_{r→1} ‖f̂_r‖_p = ‖f‖_p, so the map f → f̂ is a linear isometry.

Conversely suppose f ∈ Hᵖ(D) and f(z) = Σ_{n=0}^∞ aₙzⁿ. Then ‖f_r‖_p is bounded for 0 ≤ r < 1 and we may assume, without loss of generality, that ‖f_r‖_p ≤ 1. For 1 < p ≤ ∞ the functions f_r lie in the unit ball of a dual Banach space, which is w*-compact by the Banach–Alaoglu theorem [29]. Thus there exists a function f̃ in Lᵖ such that f_r converges to f̃ in the w*-topology of Lᵖ. In particular, as χⁿ ∈ L^q, we have lim_{r→1} (f_r, χⁿ) = (f̃, χⁿ) for all n ∈ ℤ. Now (f_r, χⁿ) = aₙrⁿ, so f̃ has the Fourier series Σ_{n=0}^∞ aₙe^{int}. By comparing Fourier series we have f_r = f̃ ∗ P_r, and hence by the first part of the proof f_r → f̃ in Lᵖ norm for 1 < p < ∞. The case p = 1 requires separate treatment. Thus if f ∈ H¹(D) we consider the family of measures f_r dσ, which again we consider to lie in the unit ball of M(𝕋), the space of all finite Borel measures on 𝕋. By applying the Banach–Alaoglu theorem once again, there exists a measure μ ∈ M(𝕋) such that f_r dσ converges to μ in the w*-topology of M(𝕋). Since all functions χⁿ are in C(𝕋),

lim_{r→1} (1/2π) ∫ e^{int} f_r(e^{it}) dt = ∫ e^{int} dμ

and as f_r has a Fourier series of analytic type we have ∫ e^{int} dμ = 0 for n > 0. Thus μ is an analytic measure and by the F. and M. Riesz theorem dμ = f̃ dσ for some f̃ in H¹. Again f_r = f̃ ∗ P_r, and hence f_r converges to f̃ in L¹ norm. This completes the proof.
From now on we will use both representations of Hᵖ interchangeably and use the same letter to denote both the function on the circle and its analytic extension.

In terms of the analytic extension the structure of inner and outer functions can be completely described. A function q is inner if and only if it is of the form
q(z) = λB(z)S(z), where λ is a constant of modulus one,

B(z) = zᵏ Πⱼ (|aⱼ|/aⱼ) (aⱼ − z)/(1 − āⱼz)   (12-4)

is a Blaschke product, the condition Σⱼ (1 − |aⱼ|) < ∞ being satisfied, whereas

S(z) = exp{−∫ (e^{it} + z)/(e^{it} − z) dμ(t)}   (12-5)

with μ a positive measure singular with respect to Lebesgue measure, is a singular inner function. A function f in Hᵖ is outer if and only if

f(z) = λ exp{(1/2π) ∫ (e^{it} + z)/(e^{it} − z) k(e^{it}) dt}   (12-6)

for some constant λ of modulus one and some real valued k in L¹. In that case we have necessarily k(e^{it}) = log |f(e^{it})|.
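The defining property of the factors in (12-4) is easy to verify numerically. The sketch below is our own illustration (the function name is ours): a single Blaschke factor has modulus one on the unit circle, modulus less than one inside D, and vanishes at its zero.

```python
import numpy as np

# Our illustration: one Blaschke factor from (12-4),
# b_a(z) = (|a|/a)(a - z)/(1 - conj(a) z), is inner.
def blaschke_factor(a, z):
    return (abs(a) / a) * (a - z) / (1 - np.conj(a) * z)

a = 0.5 + 0.3j                                   # a zero inside D
t = np.linspace(0.0, 2 * np.pi, 400, endpoint=False)
on_circle = blaschke_factor(a, np.exp(1j * t))
assert np.allclose(np.abs(on_circle), 1.0)       # modulus one on the boundary
assert abs(blaschke_factor(a, 0.2 - 0.1j)) < 1.0 # contractive inside D
assert blaschke_factor(a, a) == 0                # vanishes at its zero
```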
We note that since f_r converges to f in Hᵖ, a subsequence f_{rₙ} converges to f almost everywhere. A theorem of Fatou states that actually the nontangential limits of f(z) exist almost everywhere on the unit circle.

An important result concerning H^∞, the proof of which is outside the scope of this book, is the corona theorem of Carleson [20, 30, 73]. It is equivalent to the density of the open unit disc in the maximal ideal space of the commutative Banach algebra H^∞. In analytic terms it states that, given a₁, ..., aₙ ∈ H^∞, there exist b₁, ..., bₙ ∈ H^∞ such that Σ_{i=1}^n aᵢ(z)bᵢ(z) = 1 for all z in D if and only if there exists a δ > 0 such that for all z in D

Σ_{i=1}^n |aᵢ(z)| ≥ δ

If a₁, ..., aₙ satisfy this condition we say that they are strongly coprime. They are called coprime if their greatest common inner divisor is 1, and we write a₁ ∧ ⋯ ∧ aₙ = 1. Clearly strong coprimeness implies coprimeness.
The study of Hᵖ spaces is not restricted to the unit disc, and other domains of definition can be considered. For us the interesting case is that of the upper and lower half planes. Moreover we will consider only the H² and H^∞ spaces. In our approach we will stress the close connection between the corresponding spaces in the disc and in a half plane.

Thus let Π₊ be the open upper half plane Π₊ = {λ | Im λ > 0} and Π₋ the lower half plane. We denote by H^∞(Π₊) the space of all functions analytic in Π₊ and satisfying

‖f‖_∞ = sup_{z∈Π₊} |f(z)| < ∞   (12-7)

We let H²(Π₊) be the space of functions analytic in Π₊ and satisfying

‖f‖² = sup_{y>0} ∫_{−∞}^∞ |f(x + iy)|² dx < ∞   (12-8)

H^∞(Π₊) is clearly a commutative Banach algebra, and there is a simple isomorphism between H^∞(Π₊) and H^∞ of the unit disc which is induced by the
fractional linear transformation z = (w − i)/(w + i), which maps the upper half plane Π₊ onto the unit disc D. The induced map is given by φ(z) = f(i(1 + z)/(1 − z)) and its inverse by f(w) = φ((w − i)/(w + i)), for f ∈ H^∞(Π₊) and φ ∈ H^∞.
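The mapping properties of this fractional linear transformation can be checked numerically. The sketch below is our own illustration (the helper names are ours): points of Π₊ land inside D, the real axis lands on the unit circle, and the two maps invert each other.

```python
import numpy as np

# Our illustration of the Cayley transform z = (w - i)/(w + i)
# and its inverse w = i(1 + z)/(1 - z).
def to_disc(w):
    return (w - 1j) / (w + 1j)

def to_half_plane(z):
    return 1j * (1 + z) / (1 - z)

w = np.array([1j, 2 + 3j, -5 + 0.1j])            # points with Im w > 0
z = to_disc(w)
assert np.all(np.abs(z) < 1)                     # Pi_+ goes inside D
assert np.allclose(to_half_plane(z), w)          # the maps are inverse
x = np.linspace(-10.0, 10.0, 50)
assert np.allclose(np.abs(to_disc(x)), 1.0)      # R goes to the circle
```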
The H² spaces of the disc and the upper half plane are also related. For this the Paley–Wiener characterization of H²(Π₊) is instrumental.

Theorem 12-12 (Paley–Wiener) A complex valued function F defined in Π₊ is in H²(Π₊) if and only if

F(w) = (1/√(2π)) ∫₀^∞ f(t) e^{itw} dt   (12-9)
for some f in L²(0, ∞).

PROOF Assume f ∈ L²(0, ∞) and let w = x + iy ∈ Π₊; then

F_y(x) = F(x + iy) = (1/√(2π)) ∫₀^∞ f(t) e^{−yt} e^{ixt} dt

By the Fourier–Plancherel theorem F_y ∈ L²(−∞, ∞) and

∫_{−∞}^∞ |F_y(x)|² dx = ∫_{−∞}^∞ |F(x + iy)|² dx = ∫₀^∞ |e^{−yt}f(t)|² dt ≤ ‖f‖²

For Im w, Im w₀ ≥ δ > 0 we have

|F(w) − F(w₀)| = (1/√(2π)) |∫₀^∞ (e^{itw} − e^{itw₀}) f(t) dt| ≤ (1/√(2π)) {∫₀^∞ |e^{itw} − e^{itw₀}|² dt}^{1/2} {∫₀^∞ |f(t)|² dt}^{1/2}

Now in the half plane Im w ≥ δ the function |e^{itw} − e^{itw₀}|² is bounded by 4e^{−2δt}, which is certainly in L¹(0, ∞). Since for each t we have lim_{w→w₀} |e^{itw} − e^{itw₀}|² = 0, the Lebesgue dominated convergence theorem yields the continuity of F in Im w ≥ δ > 0. Since δ > 0 was arbitrary, F is actually continuous in Π₊. If γ is any closed contour in Π₊ then ∫_γ e^{itw} dw = 0, and by the use of Fubini's theorem it follows that ∫_γ F(w) dw = 0. So F is analytic by Morera's theorem.
Conversely assume F is in H²(Π₊). Let F_y(x) = F(x + iy); then F_y ∈ L²(−∞, ∞) and hence F_y is the Fourier–Plancherel transform of an L²(−∞, ∞) function which we denote by f_y, that is,

F_y(x) = (1/√(2π)) ∫_{−∞}^∞ f_y(t) e^{ixt} dt   (12-10)

and by the inversion formula f_y(t) = (1/√(2π)) ∫ F_y(x) e^{−itx} dx. We will show that e^{yt}f_y(t) is independent of y > 0. We note that

(1/√(2π)) ∫_{−∞}^∞ F_y(x) e^{−itx} dx = (1/√(2π)) ∫_{Im w=y} F(w) e^{−itw} e^{−yt} dw

or

e^{yt} f_y(t) = (1/√(2π)) ∫_{Im w=y} F(w) e^{−itw} dw   (12-11)

The function F(w)e^{−iwt} is analytic in Π₊ and so its integral on any closed contour lying in Π₊ is zero. We integrate it on the positively oriented rectangle whose vertices are at the points −ξ + iy₁, ξ + iy₁, ξ + iy₂, and −ξ + iy₂, so we have

(1/√(2π)) { ∫_{−ξ+iy₁}^{ξ+iy₁} F(w)e^{−iwt} dw + ∫_{ξ+iy₁}^{ξ+iy₂} F(w)e^{−iwt} dw + ∫_{ξ+iy₂}^{−ξ+iy₂} F(w)e^{−iwt} dw + ∫_{−ξ+iy₂}^{−ξ+iy₁} F(w)e^{−iwt} dw } = 0   (12-12)

Let us estimate now the second integral:

|∫_{ξ+iy₁}^{ξ+iy₂} F(w)e^{−iwt} dw|² = |∫_{y₁}^{y₂} F(ξ + iu) e^{−i(ξ+iu)t} du|² ≤ ∫_{y₁}^{y₂} |F(ξ + iu)|² du · ∫_{y₁}^{y₂} e^{2ut} du

From (12-8) it follows, by Fubini's theorem, that ∫_{−∞}^∞ ∫_{y₁}^{y₂} |F(ξ + iu)|² du dξ is finite, and hence there exists a sequence of points ξₙ → ∞ such that

lim_{n→∞} ∫_{y₁}^{y₂} |F(±ξₙ + iu)|² du = 0

This in turn implies lim_{n→∞} |∫_{±ξₙ+iy₁}^{±ξₙ+iy₂} F(w)e^{−iwt} dw| = 0 for each t. In the same way we estimate the fourth integral in (12-12). Letting now ξ go to infinity through the sequence ξₙ we have, from (12-12), that

(1/√(2π)) ∫_{Im w=y₁} F(w)e^{−iwt} dw = (1/√(2π)) ∫_{Im w=y₂} F(w)e^{−iwt} dw

that is, the integral (1/√(2π)) ∫_{Im w=y} F(w)e^{−iwt} dw is independent of y > 0. In view of (12-11) we may therefore define a function f by f(t) = e^{yt}f_y(t). Applying the Plancherel identity to f_y(t) = e^{−yt}f(t) we have

∫_{−∞}^∞ |e^{−yt}f(t)|² dt = ∫_{−∞}^∞ |f_y(t)|² dt = ∫_{−∞}^∞ |F(x + iy)|² dx ≤ ‖F‖²
Letting y → ∞ it follows from this that f(t) = 0 almost everywhere in (−∞, 0), while letting y → 0 implies f ∈ L²(0, ∞). Substituting e^{−yt}f(t) for f_y(t) in (12-10) we get the required representation (12-9).
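For a concrete instance of the representation (12-9), take f(t) = e^{−t}, for which the integral can be evaluated in closed form as F(w) = (1/√(2π)) · 1/(1 − iw), analytic in Im w > 0. The sketch below is our own check, using an ad hoc midpoint quadrature on a truncated interval.

```python
import numpy as np

# Our check of (12-9) for f(t) = e^{-t}:
# F(w) = (1/sqrt(2 pi)) / (1 - i w) on Im w > 0.
dt = 1e-4
t = (np.arange(400_000) + 0.5) * dt              # midpoints covering (0, 40)
f = np.exp(-t)

def F_quad(w):
    # crude midpoint-rule approximation of the Paley-Wiener integral
    return np.sum(f * np.exp(1j * w * t)) * dt / np.sqrt(2 * np.pi)

def F_closed(w):
    return 1.0 / (np.sqrt(2 * np.pi) * (1 - 1j * w))

for w in (1j, 1 + 0.5j, -2 + 3j):
    assert abs(F_quad(w) - F_closed(w)) < 1e-5
```

Note that the decay of e^{−t}e^{itw} for Im w > 0 is what makes the truncation to (0, 40) harmless.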
As a corollary to the Paley–Wiener theorem we prove the existence of boundary values of H²(Π₊) functions, at least in the L² sense.

Theorem 12-13 If F ∈ H²(Π₊) then F̂ = lim_{y→0} F_y exists in L²(−∞, ∞) norm and F̂(x) = (1/√(2π)) ∫₀^∞ e^{ixt} f(t) dt for some f ∈ L²(0, ∞).

PROOF If F ∈ H²(Π₊) then F(w) = (1/√(2π)) ∫₀^∞ f(t)e^{itw} dt for some f ∈ L²(0, ∞). In particular for the restriction to the real axis we have

F̂(x) = (1/√(2π)) ∫₀^∞ f(t) e^{ixt} dt

Now

F_y(x) − F̂(x) = (1/√(2π)) ∫₀^∞ e^{ixt}(e^{−yt} − 1) f(t) dt

and by the Plancherel identity (10-36) we have

‖F_y − F̂‖² = ∫₀^∞ |f(t)|² |e^{−yt} − 1|² dt

Applying the Lebesgue dominated convergence theorem we have
lim_{y→0} ‖F_y − F̂‖ = 0.

To obtain the pointwise limits almost everywhere we will first seek a concrete isomorphism between H² and H²(Π₊). To this end we start with a deeper study of the properties of the set of functions {fₙ} that were introduced in Sec. 10 as the uniquely determined solutions of the recursive set of differential equations

fₙ′ + fₙ = −2 Σ_{j=0}^{n−1} fⱼ,  fₙ(0) = √2   (12-13)

Given a set of functions {fₙ}, a function of two variables φ(t, z) is called a generating function of {fₙ} if

φ(t, z) = Σ_{n=0}^∞ fₙ(t) zⁿ   (12-14)

Lemma 12-14 The generating function of the set {fₙ} defined by (12-13) is

φ(t, z) = √2 e^{−t(1+z)/(1−z)} / (1 − z)   (12-15)
The functions fₙ defined by (12-13) are explicitly given by

fₙ(t) = (√2/n!) eᵗ (dⁿ/dtⁿ)(e^{−2t}tⁿ)   (12-16)

and

fₙ(t) = √2 e^{−t} Σ_{k=0}^n ((−2)ᵏ/k!) (n choose k) tᵏ   (12-17)

and the Fourier–Plancherel transform of fₙ is given by

(𝓕fₙ)(w) = (1/√π) (i/(w + i)) ((w − i)/(w + i))ⁿ   (12-18)
PROOF Let ψ(t, z) = √2 e^{−t(1+z)/(1−z)} = Σ_{n=0}^∞ αₙ(t)zⁿ. We will show that the αₙ are solutions of the recursive set of differential equations

α₀′ + α₀ = 0,  α₀(0) = √2
αₙ′ + αₙ = −2 Σ_{j=0}^{n−1} αⱼ,  αₙ(0) = 0 for n > 0   (12-19)

and that

fₙ = Σ_{j=0}^n αⱼ   (12-20)

Indeed, differentiating the series expansion of ψ(t, z) we have

Σ_{n=0}^∞ αₙ′(t)zⁿ = −((1 + z)/(1 − z)) ψ(t, z)   (12-21)

the term by term differentiation being permissible as the expansion of ψ(t, z) is uniformly convergent on finite intervals for each z in D. Now −(1 + z)/(1 − z) = −(1 + z) Σ_{k=0}^∞ zᵏ = −1 − 2 Σ_{k=1}^∞ zᵏ. So from (12-21) it follows that

Σ_{n=0}^∞ αₙ′(t)zⁿ = −Σ_{n=0}^∞ {αₙ + 2 Σ_{j=0}^{n−1} αⱼ} zⁿ

Consequently the αₙ are solutions of (12-19), with the correct initial values satisfied as ψ(0, z) = √2 = Σ_{n=0}^∞ αₙ(0)zⁿ.
Let now gₙ = Σ_{j=0}^n αⱼ. We will show that the gₙ are also solutions of (12-13) and hence coincide with the fₙ. First we check that gₙ(0) = Σ_{j=0}^n αⱼ(0) = α₀(0) = √2, that is, the initial conditions are satisfied. Since α₀ = f₀ and α₀ = g₀ by the definition of {gₙ}, we have g₀ = f₀. Assume gᵢ = fᵢ for 0 ≤ i ≤ n and proceed by induction:

g′_{n+1} + g_{n+1} = Σ_{j=0}^{n+1} αⱼ′ + Σ_{j=0}^{n+1} αⱼ = Σ_{j=0}^{n+1} (αⱼ′ + αⱼ)

By the induction hypothesis

Σ_{j=0}^{n+1} (αⱼ′ + αⱼ) = −2 Σ_{j=0}^{n+1} Σ_{k=0}^{j−1} αₖ = −2 Σ_{j=0}^n gⱼ
and hence the gₙ satisfy (12-13). This proves gₙ = fₙ = Σ_{j=0}^n αⱼ. Now consider

Σ_{n=0}^∞ fₙ(t)zⁿ = Σ_{n=0}^∞ {Σ_{j=0}^n αⱼ(t)} zⁿ = (Σ_{n=0}^∞ αₙ(t)zⁿ)(Σ_{k=0}^∞ zᵏ) = ψ(t, z)/(1 − z)

which proves (12-15). To prove (12-17) we compute the power series expansion of eᵗφ(t, z):

Σ_{n=0}^∞ eᵗfₙ(t)zⁿ = eᵗφ(t, z) = √2 e^{−2tz/(1−z)}/(1 − z) = √2 Σ_{k=0}^∞ ((−2)ᵏtᵏ/k!) zᵏ/(1 − z)^{k+1}

By differentiating the power series of 1/(1 − z) k times we have

zᵏ/(1 − z)^{k+1} = Σ_{n=k}^∞ (n choose k) zⁿ

which yields, upon substituting back in the previous equality,

Σ_{n=0}^∞ eᵗfₙ(t)zⁿ = √2 Σ_{k=0}^∞ ((−2)ᵏtᵏ/k!) Σ_{n=k}^∞ (n choose k) zⁿ = √2 Σ_{n=0}^∞ {Σ_{k=0}^n ((−2)ᵏ/k!) (n choose k) tᵏ} zⁿ

Equating the coefficients of equal powers of z proves (12-17).
Apply now the Leibniz differentiation rule to the nth derivative of e^{−2t}tⁿ:

(dⁿ/dtⁿ)(e^{−2t}tⁿ) = Σ_{k=0}^n (n choose k)(−2)ᵏ e^{−2t} n(n−1)⋯(k+1) tᵏ = Σ_{k=0}^n (n choose k)(−2)ᵏ e^{−2t} (n!/k!) tᵏ

which shows that (12-16) is equivalent to (12-17). If we put lₙ(t) = eᵗfₙ(t) then lₙ(t) is an nth degree polynomial, and the nth Laguerre polynomial Lₙ(t) is obtained from it by Lₙ(t) = (1/√2) lₙ(t/2). So for Lₙ we have

Lₙ(t) = Σ_{k=0}^n ((−1)ᵏ/k!) (n choose k) tᵏ   (12-22)

as in [12].
To compute the Fourier–Plancherel transforms of the fₙ our starting point is (12-16), and we use the computational rules for the Fourier–Plancherel transform outlined in Theorem 10-13. By direct computation

(1/√(2π)) ∫₀^∞ e^{−2t} e^{iwt} dt = (1/√(2π)) · i/(w + 2i)

from which

(1/√(2π)) ∫₀^∞ e^{−2t}tⁿ e^{iwt} dt = (1/√(2π)) · iⁿ⁺¹n!/(w + 2i)ⁿ⁺¹

Next

(1/√(2π)) ∫₀^∞ (dⁿ/dtⁿ)(e^{−2t}tⁿ) e^{iwt} dt = (−iw)ⁿ · (1/√(2π)) · iⁿ⁺¹n!/(w + 2i)ⁿ⁺¹ = (1/√(2π)) · i n! wⁿ/(w + 2i)ⁿ⁺¹

and, finally,

(𝓕fₙ)(w) = (√2/n!) · (1/√(2π)) ∫₀^∞ eᵗ (dⁿ/dtⁿ)(e^{−2t}tⁿ) e^{iwt} dt = (1/√π) (i/(w + i)) ((w − i)/(w + i))ⁿ

which proves (12-18).
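The explicit formula (12-17) makes the orthonormality of the fₙ in L²(0, ∞), shown in Sec. 10, easy to test numerically. The following sketch is our own check, using a crude midpoint rule on a truncated interval; the helper name is ours.

```python
import numpy as np
from math import comb, factorial

# Our check: the functions of (12-17),
# f_n(t) = sqrt(2) e^{-t} sum_k ((-2)^k / k!) C(n, k) t^k,
# form an orthonormal set in L^2(0, oo).
def f_n(n, t):
    s = sum(((-2.0) ** k / factorial(k)) * comb(n, k) * t ** k
            for k in range(n + 1))
    return np.sqrt(2.0) * np.exp(-t) * s

dt = 1e-4
t = (np.arange(600_000) + 0.5) * dt              # midpoints covering (0, 60)
fs = [f_n(n, t) for n in range(4)]
gram = np.array([[np.sum(fs[n] * fs[m]) * dt for m in range(4)]
                 for n in range(4)])
assert np.allclose(gram, np.eye(4), atol=1e-4)   # Gram matrix is the identity
```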
Theorem 12-15 The map J: H² → H²(Π₊) defined by

(Jf)(w) = (1/√π) (i/(w + i)) f((w − i)/(w + i))   (12-23)

is a unitary map of H² onto H²(Π₊), with the inverse map being given by

(J*F)(z) = (2√π/(1 − z)) F(i(1 + z)/(1 − z))   (12-24)
PROOF The set of functions {χⁿ | n ≥ 0} is an orthonormal set in H², integration being with respect to the normalized Lebesgue measure. The set of functions {fₙ} in L²(0, ∞) defined by (12-13) is an orthonormal basis for L²(0, ∞), as was shown in Sec. 10. So if we define a map Φ by Φ(χⁿ) = fₙ then Φ has a unique extension to a unitary map of H² onto L²(0, ∞). Compose this map with the Fourier–Plancherel transform 𝓕 to obtain, by the Paley–Wiener theorem, a unitary map J: H² → H²(Π₊) defined by J(χⁿ) = 𝓕(fₙ). Let S be the right shift in H², that is, Sf = χf for f ∈ H²; then since fₙ₊₁ = Vfₙ, with V defined by (10-51), it follows that

Φ(Sχⁿ) = Φ(χⁿ⁺¹) = fₙ₊₁ = Vfₙ = VΦ(χⁿ)

and by extension to all of H² we have

ΦS = VΦ   (12-25)

By applying the Fourier–Plancherel transform we have, from (12-18),

(𝓕fₙ₊₁)(w) = ((w − i)/(w + i)) (𝓕fₙ)(w)

which implies

(𝓕Vf)(w) = ((w − i)/(w + i)) (𝓕f)(w)   (12-26)

By composition we also have

(JSg)(w) = ((w − i)/(w + i)) (Jg)(w)   (12-27)

In other words JSJ* is the multiplication by (w − i)/(w + i) operator in H²(Π₊). Now V is the infinitesimal cogenerator of the right translation semigroup {V(t)} in L²(0, ∞), so 𝓕V𝓕* is the infinitesimal cogenerator of the semigroup 𝓕V(t)𝓕* in H²(Π₊). Consequently the infinitesimal generator of 𝓕V(t)𝓕* is the operator of multiplication by iw, which means that the action of the semigroup 𝓕V(t)𝓕* in H²(Π₊) is multiplication by e^{iwt}. This can of course be verified directly, as

(1/√(2π)) ∫₀^∞ f(τ − t) e^{iwτ} dτ = e^{iwt} (1/√(2π)) ∫₀^∞ f(τ) e^{iwτ} dτ   (12-28)

for all f ∈ L²(0, ∞). From the definition of J we have, by summing finite powers, that (12-23) is satisfied for any polynomial, and by continuity it is true for all f ∈ H². Actually we have proved more than was claimed and we state it as a theorem.
Theorem 12-16 The right translation semigroup {V(t)} in L²(0, ∞) is unitarily equivalent to the semigroup of multiplication by e^{iwt} in H²(Π₊). The Fourier–Plancherel transform provides the unitary equivalence.

As a corollary to Theorem 12-15 we obtain the Poisson formula in the upper half plane and the existence of nontangential boundary values for H²(Π₊) functions.
Theorem 12-17 If F ∈ H²(Π₊) then for w = ξ + iη in the upper half plane

F(ξ + iη) = (η/π) ∫_{−∞}^∞ F(x)/((ξ − x)² + η²) dx   (12-29)

and the nontangential limits of F(w) exist almost everywhere on ℝ. In particular lim_{y→0} F(x + iy) = F(x) almost everywhere.
PROOF Let F ∈ H²(Π₊) and use Theorem 12-15 to define

f(z) = (2√π/(1 − z)) F(i(1 + z)/(1 − z))

which is in H² of the disc. Obviously (1 − z)f(z) is also in H², so by the Poisson formula in D we have

(1 − z)f(z) = (1/2π) ∫ (1 − e^{it}) f(e^{it}) Re{(e^{it} + z)/(e^{it} − z)} dt   (12-30)

Define the function F on ℝ by transforming the boundary values of f, that is,

F(x) = (i/√π) (1/(x + i)) f((x − i)/(x + i))

and note that as e^{it} = (x − i)/(x + i) it follows that ie^{it} dt = (2i dx)/(x + i)², and hence that dt = 2 dx/(1 + x²). From (12-30) we obtain by a change of variables

F(w) = (1/π) ∫_{−∞}^∞ F(x) Re{(e^{it} + z)/(e^{it} − z)} dx/(1 + x²)

Now

Re{(e^{it} + z)/(e^{it} − z)} = Im{(1 + xw)/(x − w)} = (Im w)(1 + x²)/|x − w|²

which yields

F(w) = (1/π) ∫_{−∞}^∞ (Im w) F(x)/|x − w|² dx   (12-31)

Formula (12-31) is evidently equivalent to (12-29).
Using Fatou's theorem in the disc we obtain the existence of nontangential boundary values for F(w). We bear in mind that the fractional linear
transformation z = (w - i)/(w + i) is a conformal map of II+ onto D and hence preserves nontangential arcs.
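The unitarity of the map J of Theorem 12-15 can also be checked numerically: for a polynomial f(z) = Σ aₖzᵏ the H² norm is Σ |aₖ|², and by (12-23) it should equal the L² norm of Jf over the real axis. The sketch below is our own check; truncating the line to a finite interval leaves a small O(1/X) tail error.

```python
import numpy as np

# Our check of the isometry in (12-23):
# Jf(w) = (i / (sqrt(pi) (w + i))) f((w - i)/(w + i)).
a = np.array([1.0, -0.5, 2.0, 0.25])             # f(z) = sum a_k z^k

def f(z):
    return np.polyval(a[::-1], z)                # polyval wants highest degree first

dx = 1e-2
x = (np.arange(-1_000_000, 1_000_000) + 0.5) * dx    # midpoints on (-10^4, 10^4)
Jf = (1j / (np.sqrt(np.pi) * (x + 1j))) * f((x - 1j) / (x + 1j))
lhs = np.sum(np.abs(Jf) ** 2) * dx               # ||Jf||^2 over the real axis
rhs = np.sum(np.abs(a) ** 2)                     # ||f||^2 in H^2 of the disc
assert abs(lhs - rhs) < 2e-3
```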
We have now at hand all that is needed for the characterization of the right translation invariant subspaces of L²(0, ∞). A subspace M of L²(0, ∞) is called right translation invariant if V(t)M ⊂ M for all members V(t) of the right translation semigroup (10-17). We begin with two lemmas.
Lemma 12-18 A subspace M of L²(0, ∞) is right translation invariant if and only if its Fourier–Plancherel transform 𝓕M is invariant under multiplication by all functions e^{iwt}, t ≥ 0.

PROOF Follows from (12-28).
Lemma 12-19 A subspace N of H²(Π₊) is invariant under multiplication by all functions e^{iwt}, t ≥ 0, if and only if it is invariant under multiplication by all H^∞(Π₊) functions.

PROOF Since for t ≥ 0 the functions e^{iwt} are in H^∞(Π₊), the "if" part is trivial. Conversely assume N is invariant under multiplication by all e^{iwt} for t ≥ 0. In view of Corollary 12-7 and (12-27) it suffices to prove that N is invariant under multiplication by (w − i)/(w + i), or equivalently by i/(w + i), as (w − i)/(w + i) = 1 − 2i/(w + i). Now

i/(w + i) = ∫₀^∞ e^{−t} e^{iwt} dt = lim_{n→∞} ∫₀^n e^{−t} e^{iwt} dt

so the function i/(w + i) can be approximated, boundedly and pointwise, by combinations of the exponentials e^{iwt}. So the invariance of N follows.
Theorem 12-20 (Lax) A subspace M of L²(0, ∞) is right translation invariant if and only if its Fourier–Plancherel transform has the form QH²(Π₊) for some inner function Q in the upper half plane.

PROOF Let M be a right translation invariant subspace of L²(0, ∞). By the two preceding lemmas 𝓕M is invariant under multiplication by all H^∞(Π₊) functions. This means that J*𝓕M = Φ*M is a subspace of H² invariant under multiplication by all H^∞ functions. Applying Beurling's theorem we have Φ*M = qH² for some inner function q. Using the explicit representation (12-23) for J we have 𝓕M = J(qH²) = QH²(Π₊), where

Q(w) = q((w − i)/(w + i))

is obviously an inner function. The other part is trivial.
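That the substitution Q(w) = q((w − i)/(w + i)) turns a disc inner function into a half-plane inner function can be seen numerically. The sketch below is our own illustration, with the sample choice q(z) = z³ (our assumption; any inner q would do): |Q| = 1 on the real axis and |Q| < 1 in Π₊.

```python
import numpy as np

# Our illustration of Theorem 12-20's substitution with the sample q(z) = z^3.
def q(z):
    return z ** 3

def Q(w):
    return q((w - 1j) / (w + 1j))

x = np.linspace(-50.0, 50.0, 1001)
assert np.allclose(np.abs(Q(x)), 1.0)            # modulus one on the boundary R
assert np.all(np.abs(Q(x + 2j)) < 1.0)           # contractive inside Pi_+
```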
We conclude this section with a summary of results about vectorial H² spaces. Let N be a separable Hilbert space; then we denote by L²_N the space of all (equivalence classes of) weakly measurable N-valued functions for which

‖F‖² = ∫ ‖F(e^{it})‖² dσ < ∞   (12-32)

Here ‖F(e^{it})‖ is the pointwise norm of F(e^{it}) as a vector in N. L²_N is a Hilbert space, the inner product of two functions F, G ∈ L²_N being given by

(F, G) = ∫ (F(e^{it}), G(e^{it})) dσ   (12-33)
There are two natural ways of expanding elements of L²_N into infinite series. Roughly, the first corresponds to writing a function F in terms of a fixed orthonormal basis of N, the coordinates being scalar L² functions. The other expansion is a Fourier expansion with coefficients in N. To get the first expansion let {eₙ | n ≥ 0} be a fixed orthonormal basis for N and let F ∈ L²_N. Define fₙ by

fₙ(e^{it}) = (F(e^{it}), eₙ)   (12-34)

then, by the assumed weak measurability of F, the fₙ are measurable functions, and as

∫ |fₙ(e^{it})|² dσ = ∫ |(F(e^{it}), eₙ)|² dσ ≤ ∫ ‖F(e^{it})‖² dσ = ‖F‖²

we have actually fₙ ∈ L² for all n ≥ 0. The Parseval equality applied to (12-34) implies that a.e. on the unit circle

‖F(e^{it})‖² = Σ_{n=0}^∞ |fₙ(e^{it})|²   (12-35)

and by integration

‖F‖² = Σ_{n=0}^∞ ‖fₙ‖²   (12-36)
which is one form of the Parseval equality.

In the second representation we write

F = Σ_{k=−∞}^∞ φₖe^{ikt}  with φₖ ∈ N   (12-37)

To obtain the φₖ we note that the set {eₙe^{ikt} | n ≥ 0, k ∈ ℤ} is an orthonormal basis for L²_N. The orthonormality is obvious, and for completeness it suffices to show, by Theorem 1-13, that this orthonormal set is closed. If G ∈ L²_N is orthogonal to all of them we have

∫ (G(e^{it}), eₙ) e^{−ikt} dσ = 0  for all n ≥ 0, k ∈ ℤ   (12-38)
Fixing n, (12-38) implies (G(e^{it}), eₙ) = 0 a.e. for each n. So G(e^{it}) = 0 a.e., that is, G = 0. If F = Σ_{n=0}^∞ fₙeₙ, let fₙ(e^{it}) = Σ_{k=−∞}^∞ φₙₖe^{ikt} be the scalar Fourier series of fₙ. Since ‖fₙ‖² = Σ_{k=−∞}^∞ |φₙₖ|², we obtain from (12-36) that

‖F‖² = Σ_{n=0}^∞ ‖fₙ‖² = Σ_{n=0}^∞ Σ_{k=−∞}^∞ |φₙₖ|²

and

F(e^{it}) = Σ_{n=0}^∞ Σ_{k=−∞}^∞ φₙₖe^{ikt}eₙ = Σ_{k=−∞}^∞ {Σ_{n=0}^∞ φₙₖeₙ} e^{ikt}   (12-39)

and all series are norm convergent. Comparing coefficients with (12-37) we have φₖ = Σ_{n=0}^∞ φₙₖeₙ and

‖F‖² = Σ_{k=−∞}^∞ ‖φₖ‖² = Σ_{k=−∞}^∞ Σ_{n=0}^∞ |φₙₖ|²   (12-40)

holds. As in the scalar case we call the φₖ the Fourier coefficients of F.
HN is then defined to be the subspace of LN of all functions whose negatively indexed Fourier coefficients vanish. Thus F e HN if and only if F 0 fnen and fn e H2. Functions in H2 have analytic extensions into D and so do functions in HN by defining F(z) = 0 fn(z) en. The last series converges absolutely (in norm) and uniformly on compact subsets of D hence F(z) represents a vector is the Taylor expansion of fn valued analytic function in D. If f .(z) = Yk o it follows, with the previous notation, that F(z) = yk o cpkzk. Letting F,(z) _ F(rz) we have F, _ >2 fn,,en and II F - F,11' = °°_ o II fn - f ..II 2 which implies that F, converges to F in the LN norm. Separability of N coupled with Fatou's theorem shows that actually the nontangential limits of F(z) exist a.e. and are equal to F(e"). In a completely analogous way we can treat operator valued functions. Given two separable Hilbert spaces N and M we say a function A: IF -+ B(N, M) is weakly measurable if for all x e N and y E M the function x, y) is meapnkzk
surable. L'N,M) is the space of all weakly measurable essentially bounded B(N, M) valued functions. The norm given by
    ‖A‖_∞ = ess sup {‖A(e^{it})‖ | 0 ≤ t < 2π}                             (12-41)
Each element A ∈ L^∞_{B(N,M)} has a natural Fourier series associated with it. In fact for x ∈ N, Ax ∈ L²_M and hence A(e^{it})x = Σ_{k=-∞}^{∞} A_k(x) e^{ikt}. Since A_k(x) for a fixed x ∈ N depends linearly on x, it follows, noting that

    ‖A_k(x)‖² ≤ ∫ ‖A(e^{it})x‖² dσ ≤ ‖A‖²_∞ ‖x‖²

that

    A_k(x) = A_k x,    k ∈ Z                                               (12-42)
for some uniformly bounded set of operators A_k ∈ B(N, M). Again H^∞_{B(N,M)} is defined as the subspace of all L^∞_{B(N,M)} functions whose negatively indexed Fourier coefficients vanish. Again every A ∈ H^∞_{B(N,M)} has an analytic extension into D and is recoverable by strong nontangential limits. In L^∞_{B(N,M)} we have an important conjugation given by

    Ã(e^{it}) = A(e^{-it})*                                                (12-43)

This definition induces a conjugation in H^∞_{B(N,M)} by the same formula, as A → Ã naturally preserves analyticity. We saw already in Theorem 6-11 that L^∞_{B(N,M)} is a representation of the set of all L^∞-homomorphisms of L²_N, that is, every operator commuting with all multiplications by bounded Borel functions is actually multiplication by an L^∞_{B(N,M)} function. The same is true in the context of H² spaces. Each H²_N is actually an H^∞-module, where for each φ ∈ H^∞ we denote by M_φ the operator of multiplication by φ. The H^∞-homomorphisms are easily determined.
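As a concrete illustration of the coefficient bookkeeping in (12-37)-(12-40), here is a small numerical sketch (not from the book; the sample function and grid size are arbitrary choices) checking the vectorial Parseval identity for a C²-valued trigonometric polynomial:

```python
import numpy as np

# Sketch: verify (12-40), ||F||^2 = sum_k ||phi_k||^2 = sum_{n,k} |phi_{nk}|^2,
# for a C^2-valued trigonometric polynomial F sampled on the unit circle.
M = 64
t = 2 * np.pi * np.arange(M) / M
f0 = 3.0 + 2.0 * np.exp(1j * t) - 1.0 * np.exp(-2j * t)   # component f_0
f1 = 0.5 + 1j * np.exp(3j * t)                            # component f_1
F = np.stack([f0, f1])                                    # F[n, j] = f_n(e^{it_j})

phi = np.fft.fft(F, axis=1) / M           # phi[n, k] = phi_{nk} (k taken mod M)
norm_F_sq = np.sum(np.abs(F) ** 2) / M    # ||F||^2 w.r.t. normalized measure
parseval = np.sum(np.abs(phi) ** 2)       # sum over n and k of |phi_{nk}|^2
assert np.isclose(norm_F_sq, parseval)
```

The coefficients 3, 2, −1 and 0.5, i give Σ|φ_{nk}|² = 14 + 1.25 = 15.25, matching the grid norm.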
Theorem 12-21 A bounded operator A: H²_N → H²_M is an H^∞-homomorphism, that is,

    AM_φ = M_φA                                                            (12-44)

holds for all φ ∈ H^∞, if and only if there exists a unique operator valued analytic function A ∈ H^∞_{B(N,M)} such that

    (AF)(z) = A(z)F(z)                                                     (12-45)

and

    ‖A‖ = ‖A‖_∞ = sup_{z∈D} ‖A(z)‖                                         (12-46)

PROOF The direct part is obvious. Conversely, assume A is a bounded operator from H²_N into H²_M which commutes with all multiplication operators M_φ. For any vector ξ ∈ N we have Aξ ∈ H²_M. Using the fact that AM_φ = M_φA for all φ ∈ H^∞ we obtain, for any vector polynomial p(z) = Σ_{j=0}^{m} ξ_j z^j, that (Ap)(z) = Σ_{j=0}^{m} (Aξ_j)(z) z^j. Since (Aξ)(z) is linear in ξ there exists an operator valued function A(z) for which (Aξ)(z) = A(z)ξ. Thus (Ap)(z) = A(z)p(z) for all vector polynomials. If we restrict ourselves to the unit circle then we define, with χ(e^{it}) = e^{it},

    (A(χ̄p))(e^{it}) = χ̄(e^{it}) A(e^{it}) p(e^{it}) = A(e^{it}) (χ̄p)(e^{it})        (12-47)

Thus (12-45) holds for all vector trigonometric polynomials, and as these are dense in L²_N it holds by continuity for all F ∈ L²_N. It is also clear from (12-47) that the norm of A as an operator from H²_N into H²_M is equal to its norm as an operator from L²_N into L²_M; therefore (12-46) follows from Theorem 6-11 and Fatou's theorem concerning the existence of strong radial limits of A, which satisfy the equality sup_{z∈D} ‖A(z)‖ = ess sup ‖A(e^{it})‖.
In order to effectively develop the spectral theory we need the vectorial version of Beurling's theorem. Thus a subspace M of H²_N is called invariant, or right invariant, if it is an invariant subspace for all M_φ, φ ∈ H^∞.

Theorem 12-22 Let M ⊂ H²_N be an invariant subspace. Then there exists a function Q ∈ H^∞_{B(N)} with the following properties:

    ‖Q‖_∞ ≤ 1                                                              (12-48)

    Q(e^{it}) is a.e. a partial isometry with a fixed initial space        (12-49)

and

    M = QH²_N                                                              (12-50)

Conversely, every subspace M defined in this way is an invariant subspace of H²_N.

PROOF If M is defined by (12-50) then certainly it is invariant. Moreover, multiplication by Q in H²_N is a partial isometry, its initial space given by the set of all f ∈ H²_N such that f(e^{it}) belongs a.e. to the initial space of Q(e^{it}). It follows that M is closed. To prove the converse assume M is an invariant subspace. Let S = M_χ; then S is a right unilateral shift in H²_N. S restricted to the invariant subspace M is then also a unilateral shift. By the Wold decomposition we have M = ⊕_{n=0}^{∞} SⁿL where L = M ⊖ SM. Choose an orthonormal basis {q_i} in L; then, because q_i ∈ M ⊖ SM, the vectors q_i(e^{it}) are pointwise orthogonal a.e. on T. Since dim L ≤ dim N we can find an orthonormal set {e_i} in N of the same cardinality. Let K be the subspace spanned by the e_i. Define a map Q by Qe_i = q_i and extend it by linearity and continuity to all of K. Finally, let Q|K^⊥ = 0. It is clear that Q(e^{it}) is a.e. a partial isometry having K as its initial space, and Q extends to an analytic operator valued function in D, as we have Q(z)ξ = Σ_i α_i q_i(z) for ξ = Σ_i α_i e_i. Now every function in L can be written as Σ_i α_i q_i(z) = Q(z) Σ_i α_i e_i, and so every function in M can be written as Σ_i φ_i(z) q_i(z) = Q(z) Σ_i φ_i(z) e_i for H² functions φ_i, or as Q(z)g(z) for g ∈ H²_K. Since H²_N = H²_K ⊕ H²_{K^⊥} and multiplication by Q annihilates H²_{K^⊥}, we may as well write M = QH²_N. This completes the proof. Of course the function Q is not uniquely determined, as our choice of the subspace K of N was arbitrary.
We call the functions described in the previous theorem rigid functions. Out of the class of rigid functions we will be interested in a particular subclass which arises out of a special class of invariant subspaces. We say that an invariant subspace M of H²_N is an invariant subspace of full range if a.e. on the unit circle {f(e^{it}) | f ∈ M} spans N. In this case dim L = dim N and we can choose the orthonormal set {e_i} to be an orthonormal basis of N. It is then obvious that in this case the function Q(e^{it}) is a.e. unitary. Such functions are called inner functions, and they generalize the scalar inner functions. Given an invariant subspace of full range, the inner function corresponding to it is only determined up to a choice
of an orthonormal basis in N. Thus if Q and Q₁ are inner then QH²_N = Q₁H²_N if and only if Q₁ = QU where U is a fixed unitary operator in N. If Q ∈ H^∞_{B(N)} is inner we will also say that Q is an inner function in N. There is a natural partial ordering of inner functions induced by the partial ordering of invariant subspaces. We say an inner function Q is stronger than P, and write Q < P, if QH²_N ⊂ PH²_N.

Lemma 12-23 Let P and Q be inner functions in N. Q is stronger than P if and only if

    Q = PR                                                                 (12-51)

for some inner function R in N.
PROOF QH²_N ⊂ PH²_N if and only if P*QH²_N ⊂ H²_N, that is, if and only if R = P*Q is inner. This is equivalent to (12-51).

In the case of a factorization Q = PR of an inner function Q into inner factors we say that P is a left inner factor of Q, R a right inner factor, and Q a left inner multiple of R. If I is the identity operator in N and q a scalar inner function then qI is also inner. We call such inner functions scalar inner functions. An inner function Q has a scalar inner multiple q if there exists another inner function R such that

    QR = RQ = qI                                                           (12-52)
If N is finite dimensional then any inner function in N has a scalar multiple.
Theorem 12-24 Let N be finite dimensional and Q an inner function in N. Then q = det Q is inner and qI < Q.

PROOF Let adj Q be the classical adjoint of Q. By Cramer's rule we have

    Q adj Q = qI                                                           (12-53)

Since the elements of adj Q are analytic, adj Q is inner. Equality (12-53) shows that q is a scalar multiple of Q, which is equivalent to qI < Q.

Given two inner functions P and R having scalar multiples p and r, respectively, we can consider M = PH²_N ∩ RH²_N. M is obviously an invariant subspace of H²_N and it certainly contains prH²_N, as

    prH²_N ⊂ pH²_N ∩ rH²_N ⊂ PH²_N ∩ RH²_N = M

Therefore M is a subspace of full range and has the representation M = QH²_N for some inner function Q in N. Q is determined up to a constant unitary factor on the right. We sometimes write Q = P ∨_l R and say that Q is the least common left inner multiple of P and R. For scalar inner functions p and r we write p ∨ r for the least common inner multiple. We define the least common right inner multiple of P and R by P ∨_r R = (P̃ ∨_l R̃)~.
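The determinant argument of Theorem 12-24 can be checked numerically. The following sketch (my own illustration, with N = C² and Q a diagonal matrix of Blaschke factors, an ad hoc choice) verifies that Q(e^{it}) is unitary, that q = det Q has unimodular boundary values, and that Cramer's rule Q adj Q = qI holds:

```python
import numpy as np

# Sketch: N = C^2, Q(z) = diag(b1(z), b2(z)) with Blaschke factors b_a.
# On the circle: Q is unitary, q = det Q is unimodular, and Q adj(Q) = q I.
def blaschke(a):
    return lambda z: (z - a) / (1 - np.conj(a) * z)

b1, b2 = blaschke(0.5), blaschke(-0.3 + 0.4j)
for t in np.linspace(0.0, 2 * np.pi, 7):
    z = np.exp(1j * t)
    Q = np.array([[b1(z), 0.0], [0.0, b2(z)]])
    q = np.linalg.det(Q)
    assert np.isclose(abs(q), 1.0)                   # det Q inner on T
    assert np.allclose(Q @ Q.conj().T, np.eye(2))    # Q(e^{it}) unitary
    adjQ = np.array([[Q[1, 1], -Q[0, 1]], [-Q[1, 0], Q[0, 0]]])
    assert np.allclose(Q @ adjQ, q * np.eye(2))      # Cramer's rule (12-53)
```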
In an analogous way we can consider the invariant subspace L = PH²_N ∨ RH²_N. If P and R are inner, L has full range, so L = QH²_N for some inner Q. We call Q the greatest common left inner divisor of P and R and write Q = P ∧_l R.

If Q is an inner function having a scalar multiple q, then qH²_N ⊂ QH²_N. Let J be the set of all functions φ in H^∞ for which φH²_N ⊂ QH²_N. Clearly J is a nontrivial ideal in H^∞ and in fact a w*-closed one. To see the w*-closure of J let φ_α be a net in J that converges to φ in the w*-topology. Since for every f ∈ H²_N and ξ ∈ N the function (Q*f, ξ) is in L¹ of the circle we have, due to the analyticity of φ_α(Q*f, ξ), that for n > 0

    ∫ φ(e^{it}) (Q(e^{it})* f(e^{it}), ξ) e^{int} dt = lim_α ∫ φ_α(e^{it}) (Q(e^{it})* f(e^{it}), ξ) e^{int} dt = 0

This shows that φQ*f is in H²_N for every f ∈ H²_N, and hence φ is in J. Every w*-closed ideal in H^∞ is of the form J = mH^∞ for some inner function m which is uniquely determined up to a constant factor of absolute value one. We call it the minimal inner function of Q.
Lemma 12-25 Let Q be an inner function in an n-dimensional space N, and let m be its minimal inner function. If q = det Q then we have

    m | q    and    q | mⁿ                                                 (12-54)

PROOF That m divides q follows from the fact that q belongs to the previously defined ideal J. Since m is a scalar multiple of Q we have mH²_N ⊂ QH²_N, or mI = QR for some inner function R. Taking the determinant of the last equality we obtain mⁿ = q det R, or q | mⁿ.
It will turn out to be important to characterize the points of the unit circle where an inner function Q has an analytic continuation to the exterior of the unit disc. For this we need the following version of the Schwarz reflection principle.

Theorem 12-26 Let f and g be analytic in the domains

    Ω_f = {re^{iθ} | 1 − ε < r < 1, α < θ < β}    and    Ω_g = {re^{iθ} | 1 < r < 1 + ε, α < θ < β}

respectively. Assume that a.e. on γ = {e^{iθ} | α < θ < β} the radial limits of f and g exist and

    f(e^{it}) = lim_{r→1⁻} f(re^{it}) = lim_{r→1⁺} g(re^{it})              (12-55)

and also that

    lim_{r→1⁻} ∫_γ |f(e^{it}) − f(re^{it})| dt = lim_{r→1⁺} ∫_γ |f(e^{it}) − g(re^{it})| dt = 0    (12-56)

Then f and g are analytic continuations of each other across γ.
PROOF Choose ε₁, α₁, and β₁ so that 0 < ε₁ < ε, α < α₁ < β₁ < β and the radial limits of f and g exist at e^{iα₁} and e^{iβ₁}. Let Γ be the positively oriented contour along the boundary of the circular strip Ω = {re^{iθ} | 1 − ε₁ < r < 1 + ε₁, α₁ < θ < β₁} and define h(z) by

    h(z) = f(z) for z ∈ Ω_f,    h(z) = g(z) for z ∈ Ω_g

Define now a function H(z) in Ω by

    H(z) = (1/2πi) ∫_Γ h(ζ)/(ζ − z) dζ    for z ∈ Ω

H is obviously analytic in Ω and it remains to show that H coincides with f in Ω ∩ Ω_f and with g in Ω ∩ Ω_g. Let z ∈ Ω ∩ Ω_f, and for 0 < δ < ε₁ let Γ₁, Γ₂, and Γ₃ be the positively oriented contours along the boundaries of the circular strips

    Ω₁ = {re^{iθ} | 1 − ε₁ < r < 1 − δ, α₁ < θ < β₁}
    Ω₂ = {re^{iθ} | 1 − δ < r < 1 + δ, α₁ < θ < β₁}
    Ω₃ = {re^{iθ} | 1 + δ < r < 1 + ε₁, α₁ < θ < β₁}

Then

    H(z) = (1/2πi) ∫_Γ h(ζ)/(ζ − z) dζ = (1/2πi) ∫_{Γ₁} h(ζ)/(ζ − z) dζ + (1/2πi) ∫_{Γ₂} h(ζ)/(ζ − z) dζ + (1/2πi) ∫_{Γ₃} h(ζ)/(ζ − z) dζ    (12-57)

By Cauchy's theorem, if δ is sufficiently small then

    f(z) = (1/2πi) ∫_{Γ₁} h(ζ)/(ζ − z) dζ

whereas

    (1/2πi) ∫_{Γ₃} h(ζ)/(ζ − z) dζ = 0

The middle integral tends to zero with δ by our assumptions on f and g, and so H coincides with f in Ω ∩ Ω_f. That H coincides with g on Ω ∩ Ω_g can be shown similarly.
We observe that L²-convergence to the boundary values can replace the weaker condition (12-56). Essentially the same result holds for vectorial functions, and we will use the theorem freely in that context.

Theorem 12-27 Let |λ| = 1. An inner function Q has an analytic continuation at λ if and only if there exists a neighborhood V of λ such that Q(z) is invertible for z ∈ V ∩ D and ‖Q(z)^{-1}‖ is uniformly bounded there.
PROOF Assume Q has an analytic continuation across the unit circle at λ. Since Q(λ) is necessarily unitary, Q(λ)^{-1} = Q(λ)*. The set of invertible elements is open, so in a neighborhood of λ, Q(z) is invertible and Q(z)^{-1} is uniformly bounded.

To prove the converse, assume Q(z)^{-1} is uniformly bounded in V ∩ D, where V is an open neighborhood of λ. Since Q is inner, for almost all points μ of the unit circle Q(z) has strong radial limits at μ, and the limits Q(μ) are unitary. Thus for such a point μ and a vector η ∈ N there exists a vector ξ ∈ N for which Q(μ)ξ = η. Now we show that Q(z)^{-1}η also has a radial limit at μ. To this end write

    Q(z)^{-1}η = Q(z)^{-1}Q(μ)ξ = Q(z)^{-1}[Q(z) + (Q(μ) − Q(z))]ξ = ξ + Q(z)^{-1}(Q(μ) − Q(z))ξ

Now ‖Q(z)^{-1}‖ is uniformly bounded in V ∩ D, whereas the radial limit of Q(z)ξ coincides with Q(μ)ξ. Thus we obtain the existence of the radial limit of Q(z)^{-1}η, and lim Q(z)^{-1}η = ξ as z → μ radially. Define now a function Q̂ in the exterior of the unit circle by Q̂(z) = Q(1/z̄)*^{-1}. Obviously Q̂ is analytic at all points z for which Q(1/z̄) is invertible. Our assumptions guarantee the analyticity and uniform boundedness of Q̂(z) in the intersection of a neighborhood of λ with D_e. The strong radial limits of Q̂ exist, and lim_{R→1⁺} Q̂(Rμ) = Q(μ)*^{-1} = Q(μ), as Q(μ) is unitary. We apply now Theorem 12-26 to infer that Q and Q̂ are analytic continuations of each other.
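The reflection formula used in the proof can be made concrete in the scalar case. The sketch below (an illustration of mine, not the book's; the zero a and the sample points are arbitrary) checks that for a Blaschke factor the pseudocontinuation Q̂(z) = (conj Q(1/z̄))^{-1} agrees with the direct analytic expression outside the disc:

```python
import numpy as np

# Sketch: for Q(z) = (z - a)/(1 - conj(a) z), the exterior pseudocontinuation
# Q_hat(z) = conj(Q(1/conj(z)))**(-1) coincides with Q evaluated directly.
a = 0.4 - 0.2j
Q = lambda z: (z - a) / (1 - np.conj(a) * z)
for z in [1.5, 2.0 + 1.0j, -3.0j]:          # points in the exterior D_e
    Q_hat = 1.0 / np.conj(Q(1.0 / np.conj(z)))
    assert np.isclose(Q_hat, Q(z))
```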
13. MODELS FOR CONTRACTIONS AND THEIR SPECTRA

We saw already several instances of the desirability of having a functional model for the study of an operator or a semigroup of operators. In this way we obtained spectral representations for self-adjoint and unitary operators, and by way of the Fourier-Plancherel transform also a spectral representation of the translation semigroups in L²(−∞, ∞; N) and L²(0, ∞; N). Similarly the bilateral and unilateral right shifts in l²(−∞, ∞; N) and l²(0, ∞; N) have the multiplication operators M_χ in L²_N(T) and M_χ|H²_N as their models.

By Theorem 9-5, the minimal isometric dilation V of a contraction T is a unilateral right shift if and only if T*ⁿ tends strongly to zero. If that is the case we have a functional model for T induced by that of V. Let S be the right shift in H²_N.

Theorem 13-1 A contraction T is unitarily equivalent to the operator R defined by

    Rf = P_M Sf,    f ∈ M                                                  (13-1)

where M is a left invariant subspace of some H²_N space, P_M being the orthogonal projection on M, if and only if T*ⁿ tends to zero strongly.
PROOF Since M is a left invariant subspace, that is, invariant under S*, the adjoint of R is given by

    R* = S*|M                                                              (13-2)

as R*f = P_M S*f = S*f. In terms of the analytic extension of f into D we have, for f(z) = Σ_{k=0}^{∞} ξ_k z^k, that (R*f)(z) = (f(z) − f(0))/z = Σ_{k=1}^{∞} ξ_k z^{k−1}. So ‖R*ⁿf‖² = Σ_{k=n}^{∞} ‖ξ_k‖² and lim_{n→∞} ‖R*ⁿf‖ = 0. So if T is unitarily equivalent to R, necessarily lim_{n→∞} ‖T*ⁿx‖ = 0 for all x ∈ H.

Conversely, assume T*ⁿ tends to zero strongly; then, by Theorem 9-5, V, the minimal isometric dilation of T, is a unilateral shift in K ⊃ H, and H is V*-invariant. Representing V as the right shift S in H²_N, T is then given by (13-1) for some left invariant subspace M of H²_N.
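On Taylor coefficients, R* of (13-2) is the backward shift, so ‖R*ⁿf‖ is a tail sum that decreases to zero. A finite-dimensional sketch of my own (truncating to 8 coefficients, an arbitrary choice):

```python
import numpy as np

# Sketch: (R* f)(z) = (f(z) - f(0))/z acts on Taylor coefficients as the
# backward shift, so ||R*^n f||^2 is a tail sum and tends to zero.
xi = np.array([1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125])

def backward_shift(c):
    return np.append(c[1:], 0.0)    # drop xi_0, shift the rest down

c, norms = xi.copy(), []
for n in range(9):
    norms.append(np.linalg.norm(c))
    c = backward_shift(c)
assert norms[0] > norms[4] > norms[8] == 0.0
```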
Applying the Beurling-Lax theorem we obtain the representation M^⊥ = QH²_N for some rigid function Q as introduced in Sec. 12. For an N-operator valued rigid function Q we define H(Q) by

    H(Q) = {QH²_N}^⊥                                                       (13-3)

and so H(Q) is a left invariant subspace of H²_N. In H(Q) we define an operator S(Q) by

    S(Q)f = P_{H(Q)}Sf,    f ∈ H(Q)                                        (13-4)

Then from the previous discussion it follows that

    S(Q)* = S*|H(Q)                                                        (13-5)

So S(Q)* is a restriction of the left shift operator, or in short a restricted shift. By abuse of language we refer to S(Q) also as a restricted shift, though a compression of a shift would be more accurate. If Q is an inner function this has some justification, deriving from the next theorem.

We define a map J: L²_N → L²_N by

    (Jf)(e^{it}) = e^{-it} f(e^{-it})    for f ∈ L²_N                      (13-6)

Clearly J is unitary and satisfies J² = I and J = J* = J^{-1}. For a given inner function Q we define a map τ_Q: L²_N → L²_N by

    τ_Q f = Q̃(Jf)                                                          (13-7)

where as usual Q̃(z) = Q(z̄)*.
Theorem 13-2 For a given inner function Q, the operator τ_Q defined by (13-7) is a unitary operator in L²_N for which the following relations hold (here H̄²_{0,N} = L²_N ⊖ H²_N):

    τ_Q(QH²_N) = H̄²_{0,N}                                                 (13-8)

    τ_Q(H̄²_{0,N}) = Q̃H²_N                                                 (13-9)

    τ_Q(H(Q)) = H(Q̃)                                                       (13-10)

    τ_Q^{-1} = τ_Q* = τ_{Q̃}                                                (13-11)

and

    τ_Q P_{H(Q)} = P_{H(Q̃)} τ_Q                                            (13-12)

If we restrict τ_Q to H(Q) then the following diagram is commutative:

    H(Q)   --τ_Q-->   H(Q̃)
      |                 |
    S(Q)              S(Q̃)*                                                (13-13)
      |                 |
      v                 v
    H(Q)   --τ_Q-->   H(Q̃)

and hence S(Q) and S(Q̃)* are unitarily equivalent.

PROOF Since Q is inner we have for f ∈ L²_N

    ‖τ_Q f‖² = ∫ ‖e^{-it} Q̃(e^{it}) f(e^{-it})‖² dσ = ∫ ‖f(e^{-it})‖² dσ = ‖f‖²

So τ_Q is isometric and so is τ_{Q̃}. An easy check yields τ_{Q̃}τ_Q = I, so (13-11) is proved. For f = Qh ∈ QH²_N we have

    (τ_Q f)(e^{it}) = e^{-it} Q̃(e^{it}) Q(e^{-it}) h(e^{-it}) = e^{-it} h(e^{-it})

But e^{-it}h(e^{-it}) belongs to H̄²_{0,N}, so τ_Q(QH²_N) ⊂ H̄²_{0,N}. Similarly, if h ∈ H̄²_{0,N} then

    (τ_Q h)(e^{it}) = e^{-it} Q̃(e^{it}) h(e^{-it})

which belongs to Q̃H²_N, and so τ_Q(H̄²_{0,N}) ⊂ Q̃H²_N. By the symmetry of the situation relative to the inner functions Q and Q̃ the inclusions are actually equalities, and this implies that (13-10) holds too. The relation (13-12) follows from (13-8)-(13-10). Applying it, and noting that for the bilateral shift U = M_χ in L²_N we have τ_Q χf = χ̄ τ_Q f, we obtain for f ∈ H(Q)

    τ_Q P_{H(Q)} Uf = τ_Q P_{H(Q)} χf = P_{H(Q̃)} τ_Q χf = P_{H(Q̃)} χ̄ τ_Q f = P_{H(Q̃)} U* τ_Q f

which implies the commutativity of the diagram (13-13).
The operator S(Q) is completely determined by the rigid function Q, and our aim is to study S(Q), as well as T, in terms of Q. The relation of T and S(Q) to the function Q is as that of the general completely nonunitary contraction to its characteristic function in the Sz.-Nagy-Foias theory. Our study of the spectrum of S(Q) starts with the point spectrum.

Theorem 13-3 Let Q be a rigid function.
(a) For |λ| < 1, λ̄ ∈ σ_p(S(Q)*) if and only if Q(λ)* has a nontrivial null space, and

    dim Ker(λ̄I − S(Q)*) = dim Ker Q(λ)*                                   (13-14)

The normalized eigenfunctions of S(Q)* have the form

    (1 − |λ|²)^{1/2} ξ/(1 − λ̄z)                                            (13-15)

where ξ is a unit vector in Ker Q(λ)*.
(b) If Q is inner and |λ| < 1, then λ ∈ σ_p(S(Q)) if and only if Q(λ) has a nontrivial null space, and

    dim Ker(λI − S(Q)) = dim Ker Q(λ)                                      (13-16)

The normalized eigenfunctions of S(Q) have the form

    (1 − |λ|²)^{1/2} Q(z)ξ/(z − λ)                                         (13-17)

where ξ is a unit vector in Ker Q(λ).
PROOF An eigenfunction of S(Q)* relative to the eigenvalue λ̄ satisfies S(Q)*f = λ̄f, or (f(z) − f(0))/z = λ̄f(z), which means that f(z) = f(0)/(1 − λ̄z). Thus λ̄ ∈ σ_p(S(Q)*) if and only if for some ξ ∈ N, ξ/(1 − λ̄z) is orthogonal to QH²_N. The orthogonality condition is, for every g ∈ H²_N,

    0 = (1/2π) ∫ (Q(e^{it})g(e^{it}), ξ) e^{it}/(e^{it} − λ) dt = (1/2πi) ∮ (Q(ζ)g(ζ), ξ)/(ζ − λ) dζ = (Q(λ)g(λ), ξ) = (g(λ), Q(λ)*ξ)

Since this holds for all g ∈ H²_N, ξ/(1 − λ̄z) belongs to H(Q) if and only if Q(λ)*ξ = 0. To compute the norm of ξ/(1 − λ̄z) we consider the integral

    (1/2π) ∫ (ξ/(1 − λ̄e^{it}), ξ/(1 − λ̄e^{it})) dt = ‖ξ‖² (1/2π) ∫ dt/((1 − λ̄e^{it})(1 − λe^{-it}))
        = ‖ξ‖² (1/2πi) ∮ dζ/((1 − λ̄ζ)(ζ − λ)) = ‖ξ‖²/(1 − |λ|²)

by a straightforward application of the Cauchy formula. This proves (13-15), and that the map ξ → (1 − |λ|²)^{1/2}ξ/(1 − λ̄z) is a unitary map of Ker Q(λ)* onto Ker(λ̄I − S(Q)*). Condition (13-14) is a consequence of that fact.

To prove part (b) we use the unitary equivalence of S(Q) and S(Q̃)* proved in Theorem 13-2. λI − S(Q) has a null function if and only if λ̄I − S(Q̃)* has, and this occurs, by part (a), if and only if Ker Q̃(λ̄)* is nontrivial. But Q̃(λ̄)* = Q(λ), so (13-16) follows. The normalized eigenfunctions of S(Q) are the images under τ_{Q̃} of the functions

    (1 − |λ|²)^{1/2} ξ/(1 − λz)

and a simple computation yields (13-17).
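The norm computation behind (13-15) is easy to confirm numerically: 1/(1 − λ̄z) has Taylor coefficients λ̄ⁿ, so its H² norm is (1 − |λ|²)^{-1/2}. A sketch (the value of λ is an arbitrary choice of mine):

```python
import numpy as np

# Sketch: H^2 norm of the Cauchy kernel 1/(1 - conj(lam) z), whose Taylor
# coefficients are conj(lam)^n, equals (1 - |lam|^2)^(-1/2); hence the
# prefactor (1 - |lam|^2)^(1/2) in (13-15) normalizes the eigenfunction.
lam = 0.6 + 0.3j
coeffs = np.conj(lam) ** np.arange(200)       # truncated Taylor coefficients
h2_norm = np.sqrt(np.sum(np.abs(coeffs) ** 2))
assert np.isclose(h2_norm, 1.0 / np.sqrt(1.0 - abs(lam) ** 2))
```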
If N is finite dimensional the analysis can always be made more precise, due to the availability of the determinant function. One instance of that is the following.

Lemma 13-4 Let N be finite dimensional and Q a noninner rigid function. Then σ_p(S(Q)*) is equal to the open unit disc D.

PROOF Since Q is not inner, det Q(e^{it}) vanishes on a set of positive measure and hence det Q vanishes identically. Q(λ) is therefore singular for every λ ∈ D, and since the space N is finite dimensional, Q(λ) and Q(λ)* have nontrivial kernels. The result follows from part (a) of Theorem 13-3.

Before proceeding with the analysis of the spectrum of S(Q) we collect some information concerning functions in H(Q).
Lemma 13-5 Let Q be inner. A function f ∈ H²_N is in H(Q) if and only if Q*f ∈ H̄²_{0,N}.

PROOF The map of L²_N onto itself given by f → Q*f is clearly unitary. Since multiplication by Q* maps QH²_N onto H²_N, necessarily the image of H(Q) is in H̄²_{0,N}. Conversely, if Q*f ∈ H̄²_{0,N} then for every g ∈ H²_N we have (f, Qg) = (Q*f, g) = 0. Since f is in H²_N and orthogonal to QH²_N it follows that f ∈ H(Q).
Lemma 13-6 Let Q be inner. A function f ∈ H(Q) can be continued analytically across any point λ, |λ| = 1, where Q has an analytic continuation.

PROOF Let |λ| = 1 and assume Q has an analytic continuation across λ. If f ∈ H(Q) then f = Qg with g = Q*f ∈ H̄²_{0,N}. Q(z) is invertible for all points z in D that are sufficiently close to λ, and the analytic continuation of Q to D_e is given by Q(1/z̄)*^{-1}. Similarly, g is the boundary function of a function analytic in D_e. Applying Theorem 12-26 we conclude that the function Q(1/z̄)*^{-1} g(z) is the analytic continuation of f across λ.
The following lemma is instrumental in the spectral analysis of restricted shifts.

Lemma 13-7 Let Q be an inner function and |λ| < 1. Given any unit vector ξ ∈ N there exists a function f ∈ H(Q) such that for all n ≥ 0

    ‖(S(Q)*ⁿ − λ̄ⁿI)f‖ ≤ 4 ‖Q(λ)*ξ‖ ‖f‖                                    (13-18)

PROOF If Q(λ)*ξ = 0 we choose f(z) = (1 − |λ|²)^{1/2}ξ/(1 − λ̄z), which is an eigenfunction of S(Q)* corresponding to the eigenvalue λ̄, and hence (13-18) holds trivially. In the general case, given

    e(z) = (1 − |λ|²)^{1/2} ξ/(1 − λ̄z)                                     (13-19)
we write e = f + g for the decomposition of e relative to the direct sum H²_N = H(Q) ⊕ QH²_N, with f ∈ H(Q) and g ∈ QH²_N. It is elementary to check that

    f(z) = (1 − |λ|²)^{1/2} (I − Q(z)Q(λ)*) ξ/(1 − λ̄z)                     (13-20)

and

    g(z) = (1 − |λ|²)^{1/2} Q(z)Q(λ)* ξ/(1 − λ̄z)                           (13-21)

Since Q is inner, ‖g‖² = ‖Q(λ)*ξ‖², and hence

    ‖f‖² = 1 − ‖Q(λ)*ξ‖²                                                   (13-22)

Now e is a null function of S* − λ̄I, and hence

    0 = (S*ⁿ − λ̄ⁿI)e = (S(Q)*ⁿ − λ̄ⁿI)f + (S*ⁿ − λ̄ⁿI)g

Now ‖S*ⁿ − λ̄ⁿI‖ ≤ 2, and consequently

    ‖(S(Q)*ⁿ − λ̄ⁿI)f‖ ≤ min{2‖g‖, 2‖f‖}

For a unit vector ξ ∈ N, if ‖Q(λ)*ξ‖ ≥ 1/2 then

    ‖(S(Q)*ⁿ − λ̄ⁿI)f‖ ≤ 2‖f‖ ≤ 4‖Q(λ)*ξ‖ ‖f‖

On the other hand, if ‖Q(λ)*ξ‖ < 1/2 then from (13-22) we conclude that ‖f‖ > 1/2, and so

    ‖(S(Q)*ⁿ − λ̄ⁿI)f‖ ≤ 2‖g‖ = 2‖Q(λ)*ξ‖ < 4‖Q(λ)*ξ‖ ‖f‖

which proves the lemma.
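The identity (13-22) can be verified numerically in the scalar case. The sketch below is my own: a single Blaschke factor and ad hoc parameters a and λ, with e, f, and g built from their Taylor series:

```python
import numpy as np

# Sketch (N = C, Q one Blaschke factor): check ||f||^2 = 1 - |Q(lam)|^2,
# i.e. (13-22), for the decomposition e = f + g of (13-19)-(13-21).
K = 800                                    # truncation order of the series
a, lam = 0.5, 0.3 + 0.2j
Q = lambda z: (z - a) / (1 - a * z)

e = np.sqrt(1 - abs(lam) ** 2) * np.conj(lam) ** np.arange(K)   # e(z), (13-19)
q = np.empty(K)
q[0] = -a
q[1:] = (1 - a * a) * a ** np.arange(K - 1)                     # Taylor series of Q
g = np.conj(Q(lam)) * np.convolve(q, e)[:K]                     # g(z), (13-21)
f = e - g                                                       # f(z), (13-20)
assert np.isclose(np.sum(np.abs(f) ** 2), 1 - abs(Q(lam)) ** 2)
```

Since f ⊥ g, the two squared norms also recombine to ‖e‖² = 1.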
We have now at hand all that is needed for the characterization of the spectra of the restricted shift operators.

Theorem 13-8 (a) The spectrum of S(Q)* consists of all points λ̄, |λ| < 1, for which Q(λ) is not boundedly invertible, together with those points λ̄, |λ| = 1, for which Q has no analytic continuation at λ.
(b) The spectrum of S(Q) consists of all points λ, |λ| < 1, for which Q(λ) is not boundedly invertible, together with those points λ, |λ| = 1, at which Q has no analytic continuation.

PROOF In view of Theorem 13-2 it suffices to prove part (a). We assume therefore that λ is in D and that Q(λ), and with it Q(λ)*, is boundedly invertible. By Theorem 13-3, λ̄I − S(Q)* is injective, and to show that it is surjective it is enough to show the solvability of the equation

    (λ̄I − S(Q)*)f = g    for each g ∈ H(Q)                                 (13-23)
Since S(Q)* acts as the left shift, the above equation reduces to

    λ̄f(z) − (f(z) − f(0))/z = g(z)                                         (13-24)

or f(z) = (f(0) − zg(z))/(1 − λ̄z), and we have to show that ξ = f(0) can be chosen so that f is in H(Q). With Lemma 13-5 in mind, f is in H(Q) if and only if Q*f is in H̄²_{0,N}. Now g ∈ H(Q), so g = Qh for some h ∈ H̄²_{0,N}, and the function k defined by

    k(e^{it}) = (Q*f)(e^{it}) = (e^{-it}Q(e^{it})*ξ − h(e^{it}))/(e^{-it} − λ̄)

has a meromorphic extension to D_e, the exterior of the closed unit disc, the only possible pole being located at λ̄^{-1}. The extension is given by

    k(z) = (z^{-1}Q(1/z̄)*ξ − h(z))/(z^{-1} − λ̄)

and is actually in H̄²_{0,N} if the numerator vanishes at λ̄^{-1}. This is equivalent to Q(λ)*ξ = λ̄^{-1}h(λ̄^{-1}), and since we assumed Q(λ), and with it Q(λ)*, to be invertible, the choice ξ = λ̄^{-1}Q(λ)*^{-1}h(λ̄^{-1}), when substituted back into f, provides a solution of (13-23).

The same argument works for |λ| = 1. In this case, since S(Q)* is completely nonunitary, in fact S(Q)*ⁿ tends strongly to zero, λ̄I − S(Q)* and its adjoint are injective. Again all we have to show is that (13-23) is solvable. By our assumption Q has an analytic continuation at λ and, applying Lemma 13-6, so has the function h, this time into D. The function k(z) defined as before has a numerator analytic at λ̄^{-1} = λ, and it will be in H̄²_{0,N} if Q(λ)*ξ = λh(λ). By Theorem 12-27, Q(λ) is invertible, and so the choice ξ = λQ(λ)*^{-1}h(λ) is, as before, the key to the solution of (13-23).

To prove the converse, assume first that |λ| < 1 and that Q(λ) is not boundedly invertible. We will show that in this case λ̄ ∈ σ(S(Q)*). Without loss of generality we may assume that Ker Q(λ) and Ker Q(λ)* are trivial; the other alternatives have been settled already by Theorem 13-3. This leaves us with the assumption 0 ∈ σ_c(Q(λ)*), a condition implying the existence of a sequence of unit vectors ξ_n ∈ N for which lim_{n→∞} ‖Q(λ)*ξ_n‖ = 0. We will show the existence of a sequence of functions f_n in H(Q) such that lim ‖f_n‖ = 1 and ‖(S(Q)*ᵏ − λ̄ᵏI)f_n‖ → 0 uniformly in k, showing that λ̄ ∈ σ(S(Q)*).

Let μ be a complex number with |μ| > 1; then

    (μI − S(Q)*)^{-1} = μ^{-1}(I − μ^{-1}S(Q)*)^{-1} = Σ_{n=0}^{∞} S(Q)*ⁿ/μ^{n+1}

Define an operator valued function Γ by

    Γ(μ) = (μI − S(Q)*)^{-1} − (μ − λ̄)^{-1}I = Σ_{n=0}^{∞} (S(Q)*ⁿ − λ̄ⁿI)/μ^{n+1}

On the other hand we have

    Γ(μ) = (μI − S(Q)*)^{-1}(I − (μ − λ̄)^{-1}(μI − S(Q)*)) = −(μ − λ̄)^{-1}(μI − S(Q)*)^{-1}(λ̄I − S(Q)*)        (13-25)
and so Γ(μ) is invertible, for |μ| > 1, if and only if λ̄I − S(Q)* is. For ξ_n ∈ N define f_n and g_n by (13-20) and (13-21), respectively; then the series expansion of Γ(μ) implies

    ‖Γ(μ)f_n‖ = ‖Σ_{k=0}^{∞} (S(Q)*ᵏ − λ̄ᵏI)f_n/μ^{k+1}‖ ≤ 4‖Q(λ)*ξ_n‖ ‖f_n‖/(|μ| − 1)

The last inequality is a consequence of Lemma 13-7. As lim ‖Q(λ)*ξ_n‖ = 0 and lim ‖f_n‖ = 1, we have also lim ‖Γ(μ)f_n‖ = 0. Thus Γ(μ) is not invertible, and so λ̄ ∈ σ(S(Q)*).

Finally, let |λ| = 1 and assume Q has no analytic continuation at λ. By Theorem 12-27 there exist points λ_n ∈ D and unit vectors ξ_n ∈ N such that lim λ_n = λ and lim ‖Q(λ_n)*ξ_n‖ = 0. Define f_n by (13-20), relative to λ_n and ξ_n; then by Lemma 13-7, lim ‖f_n‖ = 1 and

    ‖(λ̄I − S(Q)*)f_n‖ ≤ |λ̄ − λ̄_n| ‖f_n‖ + ‖(λ̄_nI − S(Q)*)f_n‖

The right-hand side tends to zero, which shows λ̄ ∈ σ_a(S(Q)*) and completes the proof.
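A minimal sanity check of Theorem 13-8 (my own illustration): for q(z) = zⁿ the model space H(q) is spanned by 1, z, …, z^{n−1}, S(q) is a nilpotent Jordan block, and indeed σ(S(q)) = {0}, the only point of D where q(λ) = λⁿ fails to be invertible; q continues analytically across all of T, so the circle contributes nothing.

```python
import numpy as np

# Sketch: matrix of S(q) for q(z) = z^n on the basis 1, z, ..., z^{n-1};
# it sends z^k -> z^{k+1} and z^{n-1} -> 0, hence is nilpotent and
# sigma(S(q)) = {0}, matching Theorem 13-8.
n = 5
Sq = np.diag(np.ones(n - 1), -1)                    # subdiagonal of ones
assert np.allclose(np.linalg.matrix_power(Sq, n), 0.0)   # S(q)^n = 0
```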
14. THE FUNCTIONAL CALCULUS FOR CONTRACTIONS

The spectral theorem for self-adjoint and unitary operators in Hilbert space provided the key to the construction of a functional calculus for these operators. Thus for any unitary U and bounded measurable function f on T the operator f(U) was defined. Theorem 6-9 provided an extension of the calculus to L^∞(ν), where ν is a scalar measure on T such that (E(·)x, x) << ν for all x. In particular, if E is absolutely continuous with respect to σ, the normalized Lebesgue measure on T, then f(U) is well defined for all f ∈ L^∞ = L^∞(σ).

Let now T be an arbitrary contraction in a Hilbert space H. By Theorem 9-1, T decomposes into a direct sum T = T₀ ⊕ T₁ relative to the direct sum decomposition H = H₀ ⊕ H₁ into reducing subspaces of T, such that T₀ = T|H₀ is unitary whereas T₁ = T|H₁ is completely nonunitary. Questions of constructing a functional calculus for T₀ have already been settled, and we are left with the task of producing a satisfactory functional calculus for T₁. Hence without loss of generality we may assume that T is a completely nonunitary contraction. We let U be the minimal unitary dilation of T, acting in K ⊃ H.

Let now u be an analytic function in D having a Taylor expansion u(z) = Σ_{n=0}^{∞} a_n zⁿ satisfying Σ_{n=0}^{∞} |a_n| < ∞. Since T is a contraction, Σ_{n=0}^{∞} a_n Tⁿ converges
in the operator norm, and we define u(T) by

    u(T) = Σ_{n=0}^{∞} a_n Tⁿ                                              (14-1)

Since U is the strong minimal unitary dilation of T we have, by (9-24), Tⁿ = PUⁿ|H, where P is the orthogonal projection of K on H. Substituting back in (14-1) we obtain

    u(T) = Σ_{n=0}^{∞} a_n PUⁿ|H = P(Σ_{n=0}^{∞} a_n Uⁿ)|H = Pu(U)|H

or

    u(T) = Pu(U)|H                                                         (14-2)

Using (14-2) as a guide it is easy to generalize this simple calculus. If T is completely nonunitary, the spectral measure E of its minimal strong unitary dilation U is absolutely continuous with respect to the Lebesgue measure on T. Thus u(U) is well defined for all u ∈ L^∞. However, we restrict ourselves to the subalgebra H^∞ of L^∞.
Theorem 14-1 Let T be a completely nonunitary contraction in H, U its minimal strong unitary dilation acting in K ⊃ H, and P the orthogonal projection of K on H. The map u → u(T) defined by (14-2) is a continuous algebra homomorphism of H^∞ into B(H) that satisfies

    ‖u(T)‖ ≤ ‖u‖_∞                                                         (14-3)

and

    u(T)* = ũ(T*)                                                          (14-4)

where ũ is defined by

    ũ(z) = Σ_{n=0}^{∞} ā_n zⁿ for u(z) = Σ_{n=0}^{∞} a_n zⁿ, that is, ũ(z) = \overline{u(z̄)}    (14-5)

If u_n ∈ H^∞ converge to u boundedly pointwise a.e. on T, then u_n(T) converges strongly to u(T).

PROOF That the map u → u(T) is linear is obvious. To show that the map is multiplicative let K₁ = ⋁_{n≥0} UⁿH; then clearly K₁ is an invariant subspace of U and, using Theorem 9-5, so is {⋁_{n≥0} UⁿH} ⊖ H. Now for all x ∈ H

    (uv)(T)x = P(uv)(U)x = Pu(U)v(U)x = Pu(U)Pv(U)x = u(T)v(T)x

Next, the simple estimate

    ‖Pu(U)x‖ ≤ ‖u(U)x‖ ≤ ‖u‖_∞ ‖x‖

proves (14-3).
As U* is the minimal unitary dilation of T*, we have ũ(T*) = Pũ(U*)|H. So if x, y ∈ H

    (u(T)*x, y) = (x, u(T)y) = (x, Pu(U)y) = (x, u(U)y) = (u(U)*x, y) = (ũ(U*)x, y) = (Pũ(U*)x, y)

which implies (14-4).

Finally, if {u_n} is a uniformly bounded sequence that converges to u pointwise a.e. on T, then for x ∈ H

    ‖(u_n(T) − u(T))x‖² = ‖P(u_n(U) − u(U))x‖² ≤ ‖(u_n(U) − u(U))x‖² = ∫ |u_n(e^{it}) − u(e^{it})|² (E(dt)x, x)

and the right-hand side tends to zero by the Lebesgue dominated convergence theorem.
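The bound (14-3) is a form of von Neumann's inequality and is easy to test for a concrete contraction; the matrix and the polynomial below are ad hoc choices of mine, not taken from the text:

```python
import numpy as np

# Sketch: check ||u(T)|| <= ||u||_inf for a contraction T and the
# polynomial u(z) = z^2 + 0.5 z - 0.25, estimating ||u||_inf on a fine grid.
A = np.array([[0.2, 0.7, 0.0],
              [0.0, 0.1, 0.5],
              [0.3, 0.0, 0.4]])
T = A / np.linalg.norm(A, 2)                       # normalize so ||T|| = 1
uT = T @ T + 0.5 * T - 0.25 * np.eye(3)            # u(T)
w = np.exp(1j * np.linspace(0, 2 * np.pi, 4096))
sup_u = np.max(np.abs(w ** 2 + 0.5 * w - 0.25))    # ~ ||u||_inf on T
assert np.linalg.norm(uT, 2) <= sup_u + 1e-6
```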
The restriction of the functional calculus to H^∞ rather than L^∞ provided us with the multiplicative property of the functional calculus, that is, with the equality (uv)(T) = u(T)v(T) for all u, v ∈ H^∞. This property is not generally satisfied for L^∞ functions, but it should be noted that important classes of operators are introduced in this way. A particularly important one is the class of Toeplitz operators.

If S is the right shift in H², then U, the bilateral right shift in L², is its minimal unitary dilation. For u ∈ L^∞ the operator u(U) is just the operator of multiplication by u on L². Hence we define an operator T_u in H² by

    T_u f = Pu(U)f = P(uf),    f ∈ H²                                      (14-6)

where P is the orthogonal projection of L² on H². The operator T_u is called a Toeplitz operator. For extensive studies of Toeplitz operators and their relation to Wiener-Hopf equations we refer to [24, 56].
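In the basis {z^k} of H², T_u has the matrix [û(j − k)], constant along diagonals. A sketch of my own (the symbol u(e^{it}) = 2 + 2 cos t is an arbitrary example) building a finite section:

```python
import numpy as np

# Sketch: finite section of the Toeplitz matrix (T_u)_{jk} = u_hat(j - k)
# for the real symbol u(e^{it}) = 2 + e^{it} + e^{-it} = 2 + 2 cos t.
M = 256
t = 2 * np.pi * np.arange(M) / M
u = 2.0 + np.exp(1j * t) + np.exp(-1j * t)
c = np.fft.fft(u) / M                    # c[m] ~ u_hat(m), indices mod M
n = 5
Tu = np.array([[c[(j - k) % M] for k in range(n)] for j in range(n)])
assert np.allclose(Tu, Tu.conj().T)      # real symbol gives a self-adjoint T_u
assert np.allclose(np.diag(Tu), 2.0)     # diagonals carry single coefficients
```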
We first note that if Q is a matrix rigid function, that is N is assumed to be finite dimensional, then the availability of the determinant function simplifies the analysis considerably. Also the restriction to the class of restricted shifts yields a very concrete representation for u(T).
Theorem 14-2 Let Q be a rigid function and S(Q) the operator in H(Q) defined by (13-4). For u ∈ H^∞, u(S(Q)) is given by

    u(S(Q))f = P_{H(Q)}(uf)    for f ∈ H(Q)                                (14-7)
and

    u(S(Q))*f = P_{H(Q)}(ūf)                                               (14-8)

where ū denotes the complex conjugate of u.

PROOF A (not necessarily minimal) unitary dilation of S(Q) is given by U, the bilateral right shift in L²_N. Since for u ∈ L^∞, u(U)f = uf for all f ∈ L²_N, the result follows. We note that, as H²_N is invariant under all u(U) for u ∈ H^∞, the projection P_{H(Q)} in (14-7) may be taken as the orthogonal projection of H²_N onto H(Q), whereas in (14-8) its interpretation is the projection of L²_N onto H(Q).
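For polynomial u, formula (14-7) can be checked directly in the finite dimensional model q(z) = z⁴ (an illustrative choice of mine): truncated series multiplication agrees with evaluating u at the matrix of S(q):

```python
import numpy as np

# Sketch: with q(z) = z^4, (14-7) reads u(S(q))f = P_{H(q)}(uf), i.e.
# multiply Taylor series and drop powers z^4 and above; this agrees with
# evaluating u at the nilpotent matrix of S(q) on the basis 1, z, z^2, z^3.
n = 4
S = np.diag(np.ones(n - 1), -1)          # matrix of S(q)
u0, u1, u2 = 0.5, -1.0, 2.0              # u(z) = 0.5 - z + 2 z^2
f = np.array([1.0, 3.0, 0.0, -2.0])      # f(z) = 1 + 3z - 2 z^3
uf = np.convolve([u0, u1, u2], f)[:n]    # P_{H(q)}(u f)
uS = u0 * np.eye(n) + u1 * S + u2 * (S @ S)
assert np.allclose(uS @ f, uf)
```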
If N is a finite dimensional Hilbert space and A is an operator in N, then we define det A, the determinant of A, as the determinant of the matrix representation of A relative to any basis of N. It is a well-known fact that det A is well defined, that is, its definition is independent of the particular basis used.
Lemma 14-3 Let A be a linear operator in an n-dimensional Hilbert space N; then

    |det A| ≥ ‖A^{-1}‖^{-n}                                                (14-9)

where we put ‖A^{-1}‖^{-1} = 0 whenever A is not invertible.

PROOF If A is not invertible, (14-9) reduces to a triviality. Otherwise det A = ∏_{i=1}^{n} α_i, where the α_i are the eigenvalues of A. Since the α_i^{-1} are the eigenvalues of A^{-1}, and clearly |α_i^{-1}| ≤ ‖A^{-1}‖, we have |α_i| ≥ ‖A^{-1}‖^{-1}, and by taking the product of these inequalities we obtain (14-9).

As was the case in the previous section, we begin with the analysis of the point spectrum.
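A quick numerical check of (14-9) on random matrices (an illustration, not part of the text):

```python
import numpy as np

# Sketch: verify |det A| >= ||A^{-1}||^{-n} for random real 4x4 matrices.
rng = np.random.default_rng(0)
for _ in range(100):
    A = rng.standard_normal((4, 4))        # almost surely invertible
    lhs = abs(np.linalg.det(A))
    rhs = np.linalg.norm(np.linalg.inv(A), 2) ** (-4)
    assert lhs >= rhs * (1 - 1e-9)         # tiny slack for rounding
```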
Theorem 14-4 Let N be finite dimensional, Q a rigid function, S(Q) the restricted shift operator acting in H(Q) = {QH²_N}^⊥, and u ∈ H^∞.

(a) If Q is not inner, that is, det Q is identically equal to zero, then u(S(Q)*) is injective if and only if u is an outer function.
(b) If Q is inner, u(S(Q)) is injective if and only if u and q = det Q are coprime, that is, have no nontrivial common inner factor.
(c) If Q is inner, u(S(Q)*) is injective if and only if u and q̃ are coprime.

PROOF If Q is not inner then we will show that for each scalar inner function q there exists a nonzero function f ∈ H(Q) such that q̄f is orthogonal, in L²_N, to H²_N, that is, f ∈ H(Q) ∩ H(qI). This will show that if u has a nontrivial inner factor, that is, u is not outer, then u(S(Q)*) is not injective.

Now Q(e^{it}) is a.e. on T a partial isometry with a fixed initial space M ⊂ N, and the inclusion is proper since Q is assumed to be noninner. Let {e₁, …, e_n} be an orthonormal basis for N such that {e₁, …, e_m} is an orthonormal basis for M. Let Qe_i = q_i for i = 1, …, m. If for the given q there exists no such f in H(Q), then this is equivalent to H(Q) ∩ H(qI) = {0}, or alternatively to QH²_N ∨ qH²_N = H²_N. Now {qe₁, …, qe_n, q₁, …, q_m} is a
set of generators for H²_N. If we consider the invariant subspace spanned by {qe_1, ..., qe_n, q_1, ..., q_m}, then it is given by a rigid function Q_1 with the same initial space. If q^{(1)}, ..., q^{(m)} are the nonzero columns of Q_1, then the n × n matrix function with columns q^{(1)}, ..., q^{(m)}, qe_{m+1}, ..., qe_n corresponds to H²_N and hence is a constant unitary matrix. But that is impossible, as q^{n−m} is a factor of its determinant. To prove the converse, assume u is a nontrivial function in H^∞ and u(S(Q)*)f = 0 for some nonzero f in H(Q). The set J = {v | v(S(Q)*)f = 0} is a w*-closed ideal in H^∞, hence of the form qH^∞ for some inner function q. Since u ∈ J it is not outer. So we have proved (a).

To prove (b), assume first that u(S(Q)) is not injective. Hence there exists a nontrivial f in H(Q) for which uf = Qg for some g ∈ H²_N. If q = det Q, then by Cramer's rule we can write MQ = QM = qI, where M is the inner function whose entries are the cofactors of Q. Thus we have

    uMf = qg

Now if q and u are coprime then Mf = qh for some h ∈ H²_N, and hence qf = QMf = qQh. In other words f = Qh, which means that f ∈ QH²_N, contrary to assumption. So q and u have a nontrivial common inner factor. Conversely, let us assume u and q have a nontrivial common inner factor. If that is the case then also u and m have a nontrivial common factor, where m is the minimal function of Q. Let us put m = ψa and u = ψb, where ψ is the greatest common inner divisor of m and u. Since we cannot have H(Q) ⊥ aH²_N, there exists a g ∈ H²_N for which the decomposition ag = f + Qh relative to the direct sum H²_N = H(Q) ⊕ QH²_N yields a nontrivial f. Obviously f is in Ker(u(S(Q))), for

    uf = ψbf = ψb(ag − Qh) = m(bg) − Q(uh)

So uf ∈ QH²_N and hence u(S(Q))f = 0. Part (c) follows from (b) by an application of Theorem 13-2: u(S(Q)*) is unitarily equivalent to ũ(S(Q̃)) acting in H(Q̃). Hence u(S(Q)*) is injective if and only if ũ and q̃ = det Q̃ are coprime.
Theorem 14-5 Let N be finite dimensional, Q a rigid function, S(Q) the restricted shift operator acting in H(Q) = {QH²_N}^⊥, and u ∈ H^∞. Then u(S(Q)) is boundedly invertible if and only if for some δ > 0

    |u(z)| + ||Q(z)^{-1}||^{-1} ≥ δ   (14-10)

holds for all z in D.

PROOF If (14-10) holds for all z in D, then by Lemma 14-3 there exists a δ' > 0 such that

    |u(z)| + |q(z)| ≥ δ'   (14-11)

for all z in D, that is, u and q are strongly coprime. By the corona theorem of Carleson there exist two functions a and b in H^∞ such that au + bq = 1.
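The strong coprimeness condition can be illustrated numerically. In the sketch below (an illustration, not the book's construction) we take u(z) = z and the single Blaschke factor q(z) = (z − 1/2)/(1 − z/2); the pair has no common zero in the closed disk, and sampling the disk shows |u| + |q| bounded away from zero, which is condition (14-11).

```python
import numpy as np

# Sampling illustration of strong coprimeness, condition (14-11):
# u(z) = z and the Blaschke factor q(z) = (z - 1/2)/(1 - z/2) have no
# common zero in the closed disk, so inf (|u(z)| + |q(z)|) > 0 on D.
u = lambda z: z
q = lambda z: (z - 0.5) / (1 - 0.5 * z)

r = np.linspace(0, 0.999, 400)
t = np.linspace(0, 2 * np.pi, 400)
Z = np.outer(r, np.exp(1j * t))          # grid of points in the disk
delta = np.min(np.abs(u(Z)) + np.abs(q(Z)))
assert delta > 0.4                       # bounded away from zero
```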
The functional calculus transforms this into

    a(S(Q))u(S(Q)) + b(S(Q))q(S(Q)) = I

But q(S(Q)) = 0, so we are left with u(S(Q))^{-1} = a(S(Q)), and u(S(Q)) is boundedly invertible.

To prove the converse we argue as in Theorem 13-8. If condition (14-10) is not satisfied, there exists a sequence of points λ_n in D and unit vectors ξ_n in N such that lim_{n→∞} |u(λ_n)| = 0 and lim_{n→∞} ||Q(λ_n)*ξ_n|| = 0. Let e_n, f_n, and g_n be defined by (13-19), (13-20), and (13-21), respectively, with λ replaced by λ_n and ξ replaced by ξ_n. Since ||g_n||² + ||f_n||² = ||e_n||² = 1, we have lim ||f_n||² = 1 − lim ||g_n||² = 1. For the left shift S* in H²_N we have

    u(S)*e_n = ũ(S*)e_n = \overline{u(λ_n)}e_n

Hence

    u(S(Q))*f_n − \overline{u(λ_n)}f_n = u(S)*f_n − \overline{u(λ_n)}f_n = −(u(S)*g_n − \overline{u(λ_n)}g_n)

Consequently we obtain the estimate

    ||u(S(Q))*f_n|| ≤ 2|u(λ_n)| + ||u||_∞ ||g_n||

and the right-hand side tends to zero. If Q is not inner, then for each λ ∈ D there exists, by Theorem 13-3, at least one eigenfunction of S(Q)*; no decomposition of the eigenfunctions as carried out above is necessary. This completes the proof.
Corollary 14-6 λ ∈ ρ(u(S(Q))) if and only if for some δ > 0

    |λ − u(z)| + ||Q(z)^{-1}||^{-1} ≥ δ   (14-12)

for all z in D.
Restricting ourselves to functions in the algebra A of functions analytic in D and continuous in D̄, we obtain the classical spectral mapping theorem.

Theorem 14-7 Let u ∈ A and Q be a rigid function. Then σ(u(S(Q))) = u(σ(S(Q))).

PROOF Assume λ ∈ σ(S(Q)). By Corollary 14-6 there are points λ_n in D such that lim λ_n = λ and lim ||Q(λ_n)^{-1}||^{-1} = 0. It follows by continuity that lim {|u(λ_n) − u(λ)| + ||Q(λ_n)^{-1}||^{-1}} = 0, so that u(λ) ∈ σ(u(S(Q))).

Conversely assume, without loss of generality, that 0 ∈ σ(u(S(Q))). Then there are points λ_n in D with |u(λ_n)| + ||Q(λ_n)^{-1}||^{-1} → 0 and, by passing to a subsequence, we may assume λ_n → λ. Clearly λ ∈ σ(S(Q)) and u(λ) = 0.
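A finite dimensional special case makes the spectral mapping theorem concrete. For the scalar inner function q(z) = z^n, the model space H(q) consists of the polynomials of degree less than n and S(q) acts as a nilpotent Jordan block, so σ(S(q)) = {0}. The sketch below (an illustration under these assumptions, not the book's construction) verifies that σ(u(S(q))) = {u(0)} for u(z) = e^z.

```python
import math
import numpy as np

# For q(z) = z^n the model space H(q) is the polynomials of degree < n
# and S(q) is the truncated shift, a nilpotent Jordan block; its
# spectrum is {0}.  For u(z) = exp(z) the series for u(S(q)) terminates
# because S(q) is nilpotent, and the spectral mapping theorem gives
# sigma(u(S(q))) = {u(0)} = {1}.
n = 6
S = np.diag(np.ones(n - 1), -1)          # S(q) in the monomial basis
uS = sum(np.linalg.matrix_power(S, k) / math.factorial(k) for k in range(n))
assert np.allclose(np.linalg.eigvals(uS), 1.0)
```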
The analysis carried out above can be generalized and the same methods applied to operators intertwining two restricted shifts. Let T_1 and T_2 be contractions and assume T_i^{*n} → 0 strongly. This assumption simplifies significantly the construction of a functional model for the T_i. In fact, by Theorem 12-1 each T_i can be represented by an operator of the form S(Q_i) acting in a left invariant subspace H(Q_i) of H²_{N_i}, where Q_i is a rigid function. From now on we will assume the Q_i to be inner and the spaces N_i to be finite dimensional. The lifting theorem, that is, Theorem 11-4, when translated into the language of vectorial function theory reads as follows.
Theorem 14-8 Let Q_i, i = 1, 2, be inner functions, N_i finite dimensional Hilbert spaces, and S(Q_i) the restricted shifts acting in H(Q_i) = {Q_iH²_{N_i}}^⊥. An operator X: H(Q_1) → H(Q_2) satisfies

    XS(Q_1) = S(Q_2)X   (14-13)

if and only if there exist functions Ξ, Ξ_1 ∈ H^∞_{B(N_1,N_2)} satisfying

    ΞQ_1 = Q_2Ξ_1   (14-14)

    ||Ξ||_∞ = ||X||   (14-15)

and for which

    Xf = P_{H(Q_2)}Ξf   (14-16)

PROOF If X is given by (14-16) then, since by (14-14) multiplication by Ξ maps Q_1H²_{N_1} into Q_2H²_{N_2}, we have for f ∈ H(Q_1)

    XS(Q_1)f = P_{H(Q_2)}ΞP_{H(Q_1)}(zf) = P_{H(Q_2)}Ξ(zf) = P_{H(Q_2)}z(Ξf) = P_{H(Q_2)}zP_{H(Q_2)}(Ξf) = S(Q_2)Xf

Moreover we obviously have ||X|| ≤ ||Ξ||_∞.
Conversely, assume X satisfies (14-13), that is, X intertwines S(Q_1) and S(Q_2). The right shifts in H²_{N_1} and H²_{N_2}, respectively, provide isometric dilations, which are not necessarily minimal. By Theorem 11-7 there exists an operator Y intertwining the right shifts for which ||Y|| = ||X||. Since an operator intertwining the right shifts is necessarily a multiplication operator by a bounded operator valued analytic function Ξ, we have (14-16). Now the operator Y satisfies ΞQ_1H²_{N_1} ⊂ Q_2H²_{N_2}, which is equivalent to the existence of a function Ξ_1 such that (14-14) holds.

Theorem 14-9 Let X: H(Q_1) → H(Q_2) be the map defined in Theorem 14-8. Then its adjoint X*: H(Q_2) → H(Q_1) is unitarily equivalent to the map X_1: H(Q̃_2) → H(Q̃_1) defined by

    X_1g = P_{H(Q̃_1)}Ξ̃_1g   (14-17)

with Ξ̃_1 satisfying

    Ξ̃_1Q̃_2 = Q̃_1Ξ̃   (14-18)
PROOF Let, for i = 1, 2, τ_{Q_i} be the unitary maps defined by (12-7). We will prove that X* = τ_{Q_1}^{-1}X_1τ_{Q_2}. Note that (14-18) follows from (14-14) and is in turn equivalent to (14-19). The last equality gives the intertwining of multiplication by Ξ with multiplication by Ξ̃_1 under the maps τ_{Q_i}; recall also that τ_{Q_2}P_{H(Q_2)} = P_{H(Q̃_2)}τ_{Q_2}. Therefore for all f ∈ H(Q_2) and g ∈ H(Q_1)

    (X*f, g) = (f, Xg) = (τ_{Q_2}f, τ_{Q_2}Xg) = (τ_{Q_2}f, τ_{Q_2}P_{H(Q_2)}Ξg) = (τ_{Q_2}f, P_{H(Q̃_2)}τ_{Q_2}Ξg) = (τ_{Q_2}f, τ_{Q_2}Ξg)
    = (Ξ̃_1τ_{Q_2}f, τ_{Q_1}g) = (P_{H(Q̃_1)}Ξ̃_1τ_{Q_2}f, τ_{Q_1}g) = (X_1τ_{Q_2}f, τ_{Q_1}g)

which proves the theorem.
In preparation for Theorem 14-11 we prove a matrix generalization of the Carleson corona theorem.

Theorem 14-10 Let N, N_i, i = 1, ..., p, be finite dimensional Hilbert spaces.

(a) Given A_i ∈ H^∞_{B(N,N_i)}, a necessary and sufficient condition for the existence of B_i ∈ H^∞_{B(N_i,N)} satisfying

    Σ_{i=1}^p B_i(z)A_i(z) = I_N   (14-20)

is that the strong coprimeness condition

    [A_1, ..., A_p]_R = I_N   (14-21)

be satisfied.

(b) Given A_i ∈ H^∞_{B(N_i,N)}, a necessary and sufficient condition for the existence of B_i ∈ H^∞_{B(N,N_i)} satisfying

    Σ_{i=1}^p A_i(z)B_i(z) = I_N   (14-22)

is that the strong coprimeness condition

    [A_1, ..., A_p]_L = I_N   (14-23)

be satisfied.

PROOF
(a) The necessity part of the proof is simple. If (14-21) is not satisfied, then there exist points λ_n ∈ D and unit vectors ξ_n ∈ N such that for i = 1, ..., p, lim ||A_i(λ_n)ξ_n|| = 0. In that case (14-20) cannot hold, for it implies ξ_n = Σ_{i=1}^p B_i(λ_n)A_i(λ_n)ξ_n and hence the estimate

    1 = ||ξ_n|| ≤ Σ_{i=1}^p ||B_i||_∞ ||A_i(λ_n)ξ_n||

and the right-hand side tends to zero.

To prove sufficiency we fix orthonormal bases in N, N_i, i = 1, ..., p, and express the A_i in matrix form, retaining the letters A_i for the corresponding matrices. Denote by A^{(i)}_{kl} the elements of A_i. A_i is an n_i × n matrix, where n and n_i are the dimensions of N and N_i, respectively. Let W be the (Σ n_i) × n matrix composed of the rows of all the A_i, and let W_{i_1...i_n} be the n × n matrix whose rows are the i_1, ..., i_n rows of W.
We claim that if (14-21) holds then the set of scalar functions det W_{i_1...i_n}, 1 ≤ i_1 < ... < i_n ≤ Σ_{i=1}^p n_i, is strongly coprime, that is, there exists a δ > 0 such that

    Σ |det W_{i_1...i_n}(z)| ≥ δ   (14-24)

for all z ∈ D. The basic idea is that if for some λ ∈ D we have Σ |det W_{i_1...i_n}(λ)| = 0, then the vectors represented by the rows of the matrices A_i(λ) all lie in a proper subspace of N. If that is the case, then there exists a nonzero vector ξ ∈ N orthogonal to all of them, implying that Σ_{i=1}^p ||A_i(λ)ξ|| = 0, in contradiction to (14-21). In the general case we have to argue differently. If (14-24) is not satisfied, then there exists a sequence of points λ_n ∈ D for which

    lim Σ |det W_{i_1...i_n}(λ_n)| = 0   (14-25)
We will show that (14-25) contradicts (14-21) by proving that

    lim inf { Σ_{i=1}^p ||A_i(λ_n)x|| : x ∈ N, ||x|| = 1 } = 0   (14-26)

Let ε_n = Σ |det W_{i_1...i_n}(λ_n)| and let γ_i^{(n)} be the ith row of W(λ_n). There is one set of indices i_1 < ... < i_n such that for all j_1 < ... < j_n we have

    |det W_{j_1...j_n}(λ_n)| ≤ |det W_{i_1...i_n}(λ_n)|

By Lemma 14-3 there exists a unit vector x_n ∈ N for which

    |(x_n, γ_{i_k}^{(n)})| ≤ ε_n^{1/n}   (14-27)

If ε_n > 0 then the vectors γ_{i_1}^{(n)}, ..., γ_{i_n}^{(n)} are a basis for N. Each γ_j^{(n)} has a representation γ_j^{(n)} = Σ_{k=1}^n β_{jk}γ_{i_k}^{(n)}, and (14-27) implies that |β_{jk}| ≤ 1 and with it that for each γ_j^{(n)}, |(x_n, γ_j^{(n)})| ≤ nε_n^{1/n}. This estimate shows that (14-25) implies (14-26), contradicting (14-21).

Invoke now the scalar corona theorem to deduce the existence of a_{i_1...i_n} ∈ H^∞, 1 ≤ i_1 < ... < i_n ≤ Σ_{j=1}^p n_j, such that

    Σ a_{i_1...i_n}(z) det W_{i_1...i_n}(z) = 1   (14-28)

To complete the proof we have to show the existence of B^{(i)}_{jk} in H^∞
such that

    Σ_{i=1}^p Σ_{k=1}^{n_i} B^{(i)}_{jk}(z) A^{(i)}_{kl}(z) = δ_{jl}   (14-29)

holds. From equality (14-28) we can, by collecting terms, define matrices B^{(i)} with elements B^{(i)}_{jk} in H^∞ such that

    Σ_{i=1}^p Σ_{k=1}^{n_i} B^{(i)}_{jk}(z) A^{(i)}_{kj}(z) = 1

Thus (14-29) is satisfied for j = l. If j ≠ l, then

    Σ_{i=1}^p Σ_{k=1}^{n_i} B^{(i)}_{jk}(z) A^{(i)}_{kl}(z) = Σ a_{i_1...i_n}(z) det W'_{i_1...i_n}(z)

where W'_{i_1...i_n} is equal to W_{i_1...i_n} with the jth column replaced by the lth, and thus det W'_{i_1...i_n}(z) = 0. As a consequence (14-29) holds for all j, l = 1, ..., n.

(b) This follows from part (a) by noting that Σ_{i=1}^p A_i(z)B_i(z) = I_N if and only if Σ_{i=1}^p B̃_i(z)Ã_i(z) = I_N, and that [Ã_1, ..., Ã_p]_R = I_N is equivalent to (14-23).
Theorem 14-11 Let, for i = 1, 2, N_i be finite dimensional Hilbert spaces, Q_i inner functions, and S(Q_i) the restricted shifts in H(Q_i) = {Q_iH²_{N_i}}^⊥. Let X: H(Q_1) → H(Q_2) be an operator intertwining S(Q_1) and S(Q_2), having the representation (14-16) with (14-14) satisfied. Then

(a) X is injective if and only if

    (Ξ_1, Q_1)_R = I_{N_1}   (14-30)

(b) X* is injective if and only if

    (Ξ, Q_2)_L = I_{N_2}   (14-31)

(c) X has a bounded left inverse if and only if

    [Ξ_1, Q_1]_R = I_{N_1}   (14-32)

(d) X has a bounded right inverse if and only if

    [Ξ, Q_2]_L = I_{N_2}   (14-33)
PROOF Let us start with (b). X* is injective if and only if the range of X is dense in H(Q_2), and this occurs if and only if the invariant subspace spanned by ΞH²_{N_1} and Q_2H²_{N_2} is all of H²_{N_2}. This is equivalent to (14-31). Next, X* is unitarily equivalent to X_1, where X_1 is defined by (14-17). Applying part (b) to X_1, X is injective if and only if (Ξ̃_1, Q̃_1)_L = I_{N_1}, and this condition is equivalent to (14-30), proving (a).
Assume now (14-33) holds. By Theorem 14-10 there exist Θ ∈ H^∞_{B(N_2,N_1)} and R ∈ H^∞_{B(N_2)} such that ΞΘ + Q_2R = I_{N_2}. Define a bounded operator T: H(Q_2) → H(Q_1) by Tf = P_{H(Q_1)}Θf for all f ∈ H(Q_2). For such f we have, recalling that ΞQ_1H²_{N_1} ⊂ Q_2H²_{N_2},

    XTf = P_{H(Q_2)}ΞP_{H(Q_1)}Θf = P_{H(Q_2)}{ΞΘf + Q_2Rf} = P_{H(Q_2)}f = f

or XT = I, proving that T is a right inverse of X. In the same manner, using Theorem 14-9, we prove that (14-32) is sufficient for the existence of a bounded left inverse for X.

To prove the necessity part we argue in the way we did in the proof of Theorem 14-5. We will show that (14-33) is necessary for the existence of a bounded right inverse for X. Now X has a bounded right inverse if and only if X* has a bounded left inverse. Since X* and X_1 are unitarily equivalent, it is enough to show that X_1 does not have a bounded left inverse. For this it suffices to exhibit a sequence of functions F_n such that lim_{n→∞} ||F_n|| = 1 and lim_{n→∞} ||X_1F_n|| = 0.
Since we assume that (14-33) does not hold, there exists a sequence of points λ_n in D and unit vectors ξ_n ∈ N_2 such that lim ||Ξ(λ_n)*ξ_n|| = lim ||Q_2(λ_n)*ξ_n|| = 0. Given λ_n and ξ_n, define the functions e_n, f_n, and g_n by (13-19), (13-20), and (13-21), respectively. The function f_n is the projection of e_n onto H(Q_2). Apply the unitary transformation τ_{Q_2} to all three functions and let E_n = τ_{Q_2}e_n, F_n = τ_{Q_2}f_n, and G_n = τ_{Q_2}g_n. A simple calculation yields

    E_n(z) = (1 − |λ_n|²)^{1/2} Q̃_2(z)ξ_n/(z − λ̄_n)   (14-34)

    F_n(z) = (1 − |λ_n|²)^{1/2} (Q̃_2(z) − Q_2(λ_n)*)ξ_n/(z − λ̄_n)   (14-35)

    G_n(z) = (1 − |λ_n|²)^{1/2} Q_2(λ_n)*ξ_n/(z − λ̄_n)   (14-36)

where Q̃_2(λ̄_n) = Q_2(λ_n)*. Moreover, since ||F_n||² + ||G_n||² = 1 and lim ||G_n||² = 0, we have lim ||F_n||² = 1. Note also that E_n ⊥ Q̃_2H²_{N_2} and F_n = P_{H(Q̃_2)}E_n. To compute X_1F_n we note that

    Ξ̃_1(z)F_n(z) = (1 − |λ_n|²)^{1/2} Ξ̃_1(z)Q̃_2(z)ξ_n/(z − λ̄_n) − (1 − |λ_n|²)^{1/2} Ξ̃_1(z)Q_2(λ_n)*ξ_n/(z − λ̄_n)
    = (1 − |λ_n|²)^{1/2} Q̃_1(z)Ξ̃(z)ξ_n/(z − λ̄_n) − (1 − |λ_n|²)^{1/2} Ξ̃_1(z)Q_2(λ_n)*ξ_n/(z − λ̄_n)
    = Q̃_1(z)(1 − |λ_n|²)^{1/2}(Ξ̃(z) − Ξ(λ_n)*)ξ_n/(z − λ̄_n) + (1 − |λ_n|²)^{1/2} Q̃_1(z)Ξ(λ_n)*ξ_n/(z − λ̄_n) − (1 − |λ_n|²)^{1/2} Ξ̃_1(z)Q_2(λ_n)*ξ_n/(z − λ̄_n)

Since the first term in the last sum is obviously in Q̃_1H²_{N_1}, taking the projection on H(Q̃_1) we obtain the following estimate:

    ||X_1F_n|| ≤ ||Ξ(λ_n)*ξ_n|| + ||Ξ̃_1||_∞ ||Q_2(λ_n)*ξ_n||

and so lim ||X_1F_n|| = 0, and (d) is proved. The necessity of condition (14-32) for the existence of a bounded left inverse for X follows now by duality arguments.
15. JORDAN MODELS

In Chap. I we showed how the theory of equivalence of matrices over F[λ] is connected with the theory of similarity of matrices over the field F. We proved that matrices A and A_1 are similar if and only if the polynomial matrices λI − A and λI − A_1 are equivalent. As a consequence, the reduction of λI − A to its Smith canonical form was the key to the reduction of A to its first canonical form, and from that the Jordan canonical form followed. Our aim in this section is to develop the same theory for the case of restricted shifts of finite multiplicity.

We will consider the set of all n × n matrices over H^∞, that is, the matrix ring (H^∞)^{n×n}. H^∞ is a ring, actually an algebra over the complex field. As a ring H^∞ is an integral domain, that is, it has no zero divisors. However, H^∞ is not a principal ideal domain; the w*-closed ideals in H^∞ are exactly the principal ideals generated by inner functions. Thus a straightforward application of the classical algebraic theory is impossible. We will have to relax our notion of equivalence to obtain a richer theory.

In H^∞, as in any ring, we have the natural division relation. We say that b divides a, and write b | a, if for some c we have a = bc. By f_1 ∧ f_2 we will denote the greatest common inner divisor of f_1 and f_2, and by f_1 ∨ f_2 their least common inner multiple. Both f_1 ∧ f_2 and f_1 ∨ f_2 are determined up to a constant of absolute value one.
We begin by proving some lemmas which are important for the development of the theory.
Lemma 15-1 Let f_1, f_2 be functions in H^∞ and let ω be an inner function. Suppose

    ω ∧ f_1 ∧ f_2 = 1   (15-1)

Then for every complex number α, with the exception of at most a countable number of values, we have

    ω ∧ (f_1 + αf_2) = 1   (15-2)

PROOF For complex α let r_α = ω ∧ (f_1 + αf_2). If β ≠ α then necessarily r_α ∧ r_β = 1, for if r = r_α ∧ r_β then r divides both f_1 + αf_2 and f_1 + βf_2 and
hence their difference (α − β)f_2. Since α ≠ β, r divides f_2 and as a consequence also f_1. Now r is an inner factor of ω and so must be equal to one, otherwise (15-1) is contradicted.

Now we claim that at most a countable number of the r_α are nontrivial. Let us factor the inner function ω as BS, where B is a Blaschke product corresponding to the zeros of ω and S a singular inner function which is associated with the singular measure μ. If r_α is an inner divisor of ω, let r_α = B_αS_α be its corresponding factorization. The zeros of B_α form a subset of the zeros of B, the measure μ_α is also singular, and μ − μ_α is positive. If r_α ∧ r_β = 1 then B_α ∧ B_β = 1 and S_α ∧ S_β = 1. This means that the zero sets of B_α and B_β are nonintersecting and that the singular measures μ_α and μ_β are mutually singular. From this it is clear that at most a countable number of the r_α are nontrivial.
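For finite Blaschke products (no singular factor) the inner-function calculus of the lemma is completely concrete: the greatest common inner divisor is the Blaschke product over the common zeros, counted with multiplicity. The following sketch (an illustration only, not the book's machinery) represents such a product by the multiset of its zeros.

```python
from collections import Counter

# For finite Blaschke products the greatest common inner divisor f ^ g
# is the Blaschke product over the common zeros, counted with
# multiplicity; we represent such a product by the multiset of its
# zeros in D.
def inner_gcd(zeros_f, zeros_g):
    return sorted((Counter(zeros_f) & Counter(zeros_g)).elements())

w  = [0.5, 0.5, -0.3]      # zeros of omega
f1 = [0.5, 0.2]
f2 = [-0.3, 0.2, 0.7]
# omega ^ f1 ^ f2 = 1: no zero is common to all three (hypothesis (15-1)) ...
assert inner_gcd(inner_gcd(w, f1), f2) == []
# ... although omega is not coprime with f1 or with f2 separately
assert inner_gcd(w, f1) == [0.5] and inner_gcd(w, f2) == [-0.3]
```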
The preceding lemma has some immediate generalizations.
Lemma 15-2 Let f_1, ..., f_m be functions in H^∞ and let ω be an inner function. Assume

    ω ∧ f_1 ∧ ... ∧ f_m = 1   (15-3)

Then there exist complex numbers α_2, ..., α_m such that

    ω ∧ (f_1 + α_2f_2 + ... + α_mf_m) = 1   (15-4)

PROOF We prove the lemma by induction. Let us define r_i = ω ∧ f_1 ∧ ... ∧ f_i. Obviously we have the division relations 1 = r_m | r_{m−1} | ... | r_1 | ω. For i = 2 we have (ω/r_2) ∧ (f_1/r_2) ∧ (f_2/r_2) = 1 which, by the previous lemma, guarantees the existence of a complex α_2 such that

    (ω/r_2) ∧ (f_1/r_2 + α_2f_2/r_2) = 1

which is the same as

    ω ∧ (f_1 + α_2f_2) = r_2

Assume we have proved the existence of α_2, ..., α_k such that ω ∧ (f_1 + α_2f_2 + ... + α_kf_k) = r_k. Clearly we must have ω ∧ (f_1 + α_2f_2 + ... + α_kf_k) ∧ f_{k+1} = r_{k+1}. Applying the previous lemma once more, there exists an α_{k+1} such that ω ∧ (f_1 + α_2f_2 + ... + α_{k+1}f_{k+1}) = r_{k+1}. Since r_m = 1 the lemma is proved.
Lemma 15-3 Let (f_{ij}) be an n × m matrix with H^∞ entries and let ω_i be inner functions. Suppose

    ω_i ∧ f_{i1} ∧ ... ∧ f_{im} = 1,   i = 1, ..., n   (15-5)

Then there exist complex numbers λ_2, ..., λ_m such that

    ω_i ∧ (f_{i1} + λ_2f_{i2} + ... + λ_mf_{im}) = 1,   i = 1, ..., n   (15-6)
PROOF The proof is analogous to the previous one. If r_i^{(j)} = ω_i ∧ f_{i1} ∧ ... ∧ f_{ij}, at each step λ_j can be chosen arbitrarily except at most in the union of n countable sets. This means that the λ_j can be chosen to fit all i, i = 1, ..., n.

Corollary 15-4 Let (a_{ij}) be an n × m matrix with H^∞ entries and ω an inner function. Then there exist complex numbers λ_2, ..., λ_m and functions h_i ∈ H^∞ such that

    a_{i1} + λ_2a_{i2} + ... + λ_ma_{im} = h_i(a_{i1} ∧ ... ∧ a_{im})   (15-7)

and h_i ∧ ω = 1.
A matrix X in (H^∞)^{n×n} is a unit if X^{-1} is also in (H^∞)^{n×n}, which is equivalent to det X being an invertible element of H^∞. Two matrices A and B in (H^∞)^{n×n} are equivalent if there exist units X and Y in (H^∞)^{n×n} for which

    XA = BY   (15-8)

A function X in (H^∞)^{n×n} has a scalar multiple φ ∈ H^∞ if there exists a matrix X° ∈ (H^∞)^{n×n} such that

    X°X = XX° = φI   (15-9)

Clearly det X is a scalar multiple of X. Given an inner function ω, we denote by 𝒩_ω(n) the set of all matrices in (H^∞)^{n×n} that have a scalar multiple φ which is coprime with ω, that is, for which φ ∧ ω = 1. In particular all units in (H^∞)^{n×n} belong to 𝒩_ω(n). Given matrices A and B in (H^∞)^{n×n} and an inner function ω, we say A and B are ω-equivalent if there exist X and Y in 𝒩_ω(n) for which (15-8) holds. If A and B are ω-equivalent for every inner function ω, then we say that A and B are quasiequivalent.

All three notions of equivalence are bona fide equivalence relations, that is, they are reflexive, symmetric, and transitive. For equivalence the three properties are immediate, and for quasiequivalence they would follow from proving them for ω-equivalence. Reflexivity is trivial, choosing X = Y = I. To show symmetry, assume (15-8) holds and X, Y have scalar multiples φ and ψ, respectively, satisfying φ ∧ ω = ψ ∧ ω = 1. There exist X° and Y° such that (15-9) holds, as well as the analogous condition for Y. From (15-8) we have XAY° = BYY° = ψB, which implies X°XAY° = ψX°B, or A(φY°) = (ψX°)B. Since (φY°)Y = φψI and φψ ∧ ω = 1, symmetry follows. Finally, if besides (15-8) we have also ZB = CW with Z°Z = ZZ° = ζI and ζ ∧ ω = 1, it follows that (ZX)A = Z(BY) = C(WY). ZX has ζφ as a scalar multiple, and clearly ζφ ∧ ω = 1. Similarly WY has a scalar multiple coprime with ω, which shows transitivity.

It is clear that equivalence implies quasiequivalence, but generally the converse is not true. To give an example, consider the diagonal matrices A and B given by

    A = (1 0; 0 φψ),   B = (φ 0; 0 ψ)

where φ and ψ are the inner functions φ(z) = e^{(z+1)/(z−1)} and ψ the Blaschke
product with zeros at the points 1 − 1/n². We will show that A and B are quasiequivalent but not equivalent. For the equivalence of A and B we would have to have unit matrices

    X = (ξ_11 ξ_12; ξ_21 ξ_22)   and   Y = (η_11 η_12; η_21 η_22)

in (H^∞)^{2×2} such that (15-8) holds. A simple calculation shows that necessarily φ | ξ_11 and ψ | ξ_21, say ξ_11 = φη_11 and ξ_21 = ψη_21. Thus

    det X = φη_11ξ_22 − ψη_21ξ_12

and since det X is a unit in H^∞, we must have that φ and ψ satisfy the Carleson condition inf_{z∈D} {|φ(z)| + |ψ(z)|} ≥ δ for some δ > 0. Our choice of φ and ψ rules that out, so A and B are not equivalent. It is easy, however, to show that A and B are quasiequivalent. Let ω be an arbitrary inner function. By Lemma 15-1 we can choose a complex α so that (ψ + αφ) ∧ ω = 1. Define matrices X and Y by

    X = (αφ 1; −ψ 1)   and   Y = (α ψ; −1 φ)

then clearly XA = BY and det X = det Y = ψ + αφ. This implies ω-equivalence and, as ω was arbitrary, the quasiequivalence of A and B.

For a matrix A in (H^∞)^{n×n} we introduce the determinant divisors D_i(A) and invariant factors E_i(A) in analogy with those of Chap. I. We let D_0(A) = 1 and define D_i(A) to be the greatest common inner divisor of all i × i minors of A. We write D_i(A) = 0 if all i × i minors are identically equal to zero. By the expansion rules of determinants it follows that D_i(A) | D_{i+1}(A), and in particular D_i(A) = 0 implies D_{i+1}(A) = 0. Next we define E_i(A) by E_i(A) = D_i(A)/D_{i−1}(A) if D_{i−1}(A) ≠
0 and E_i(A) = 0 if D_{i−1}(A) = 0.

For A ∈ (H^∞)^{n×n} we define the compound matrices A^{(p)} of A as in Chap. I. The results of Theorem 2-17 there hold equally well over H^∞. In particular if A and B are equivalent so are A^{(p)} and B^{(p)}. The same is true for ω-equivalence, for if (15-8) holds and φ = det X and ψ = det Y are coprime with ω, then

    X^{(p)}A^{(p)} = B^{(p)}Y^{(p)}   (15-10)

holds, and as det X^{(p)} and det Y^{(p)} are powers of φ and ψ, respectively, they are also coprime with ω. From this it also follows that quasiequivalence of A and B implies the quasiequivalence of A^{(p)} and B^{(p)}. We can prove now the analog of Lemma 2-20 of Chap. I.
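The determinant divisors and invariant factors just defined can be computed mechanically in the polynomial case of Chap. I, which is the model for the H^∞ theory. The sketch below (Python with SymPy; an illustration only) computes D_i as the gcd of the i × i minors for a polynomial analogue of the pair A, B of the preceding example, and confirms that they share the same determinant divisors.

```python
import itertools
from functools import reduce
from sympy import Matrix, diag, factor, gcd, symbols

x = symbols('x')

# Determinant divisors over C[x], the Chap. I analogue of the inner
# function gcd: D_i(A) is the gcd of all i x i minors of A, and the
# invariant factors are E_i = D_i / D_{i-1}.
def det_divisors(A):
    n = A.rows
    D = [1]                                   # D_0 = 1
    for i in range(1, n + 1):
        minors = [A[list(r), list(c)].det()
                  for r in itertools.combinations(range(n), i)
                  for c in itertools.combinations(range(n), i)]
        D.append(factor(reduce(gcd, minors)))
    return D

# polynomial analogue of the pair A = diag(1, phi*psi), B = diag(phi, psi)
A = diag(1, x * (x - 1))
B = diag(x, x - 1)
# same determinant divisors, hence the same invariant factors
# E_1 = 1 and E_2 = x*(x - 1)
assert det_divisors(A) == det_divisors(B) == [1, 1, x * (x - 1)]
```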
Lemma 15-5 Let A, B ∈ (H^∞)^{n×n} and let ω be an inner function. Then

    D_i(A) = D_1(A^{(i)})   (15-11)

If A and B are ω-equivalent, then there exist inner functions α_i and β_i coprime with ω such that

    D_i(A) | α_iD_i(B),   D_i(B) | β_iD_i(A),   i = 1, ..., n   (15-12)

If A and B are quasiequivalent, then

    D_i(A) = D_i(B),   i = 1, ..., n   (15-13)
PROOF That D_i(A) = D_1(A^{(i)}) is obvious from the definition of the determinant divisors. Assume now A and B are ω-equivalent, thus XA = BY with φ = det X and ψ = det Y coprime with ω. From this we obtain

    XAY° = ψB,   X°BY = φA   (15-14)

where X° and Y° are the classical adjoints of X and Y, respectively. From (15-14) it is clear that

    D_1(A) | ψD_1(B)   and   D_1(B) | φD_1(A)   (15-15)

From (15-14) we have immediately

    X^{(p)}A^{(p)}(Y°)^{(p)} = ψ^pB^{(p)},   (X°)^{(p)}B^{(p)}Y^{(p)} = φ^pA^{(p)}   (15-16)

and hence

    D_1(A^{(p)}) | ψ^pD_1(B^{(p)}),   D_1(B^{(p)}) | φ^pD_1(A^{(p)})   (15-17)

which is equivalent to (15-12) once we use the identity (15-11).
If A and B are quasiequivalent, then for a fixed index i choose ω = D_i(A)D_i(B); then the inner functions α_i and β_i of (15-12) are coprime with D_i(A) and D_i(B) separately, and in this case (15-12) implies (15-13).

Next we introduce some convenient notation. Let u = (0, u_2, ..., u_n) and v = (0, v_2, ..., v_n) be vectors with H^∞ entries. We define matrices C(u) and R(v) in (H^∞)^{n×n} by letting C(u) have ones on the diagonal, first column (1, u_2, ..., u_n)^t, and zeros elsewhere, and R(v) have ones on the diagonal, first row (1, v_2, ..., v_n), and zeros elsewhere   (15-18)

Obviously C(u) and R(v) are units in (H^∞)^{n×n}. In fact we have

    C(u)C(−u) = I,   R(v)R(−v) = I   (15-19)
We are ready for the main theorem of diagonalization.
Theorem 15-6 Let A ∈ (H^∞)^{n×n}; then A is quasiequivalent to a diagonal matrix having the invariant factors E_i(A) on the diagonal, and we have

    E_i(A) | E_{i+1}(A)   (15-20)
PROOF We will show that for any choice of inner function ω in H^∞, A is ω-equivalent to diag(E_1(A), ..., E_n(A)). For A = 0 this is trivial, so we assume A is nonzero and hence D_1(A) is a nontrivial inner function. Let ω' denote the product of ω and all nontrivial determinant divisors of A. Thus a function f is coprime with ω' if and only if it is coprime with ω and with all nonzero D_i(A).

Let a_i' = ∧_{j=1}^n a_{ij}; then a_{i1}/a_i', ..., a_{in}/a_i' are coprime and hence, by Lemma 15-3, there exist complex numbers α_2, ..., α_n such that a_i defined by

    a_i = a_{i1} + α_2a_{i2} + ... + α_na_{in} = a_i'h_i

satisfies h_i ∧ ω' = 1. Since ∧_{i=1}^n a_i' = D_1(A), another application of Lemma 15-3 shows there exist complex numbers β_2, ..., β_n such that

    a_1 + β_2a_2 + ... + β_na_n = hD_1(A)   and   h ∧ ω' = 1

For the vectors a = (0, α_2, ..., α_n) and b = (0, β_2, ..., β_n) we define now the matrices C(a) and R(b) by (15-18). Let A' be given by

    A' = R(b)AC(a)   (15-21)

Since C(a) and R(b) are units, A' is equivalent to A and so has the same determinant divisors as A. Moreover, by the special structure of R(b) and C(a), it follows that A' has the form

    A' = (hD_1(A) a'_12 ... a'_1n; a'_21 a'_22 ... a'_2n; ...; a'_n1 a'_n2 ... a'_nn)   (15-22)

Define A'' by

    A'' = (D_1(A) a'_12 ... a'_1n; a'_21 ha'_22 ... ha'_2n; ...; a'_n1 ha'_n2 ... ha'_nn)   (15-23)

then we have

    A' diag(1, h, ..., h) = diag(h, 1, ..., 1) A''

Since both of the diagonal matrices have scalar multiples, h^{n−1} and h, respectively, which are coprime with ω', it follows that A' and A'' are ω'-equivalent.

Now, as D_1(A) = D_1(A'), it follows that D_1(A) divides all the a'_1j and all the a'_i1. Let u_i = a'_i1/D_1(A) and v_j = a'_1j/D_1(A). If u = (0, u_2, ..., u_n) and v = (0, v_2, ..., v_n), we form C(u) and R(v) by (15-18) and define A''' by

    A''' = C(−u)A''R(−v)   (15-24)
then A''' is equivalent to A'' and has the form

    A''' = (D_1(A) 0; 0 A_1)   (15-25)

where A_1 is in (H^∞)^{(n−1)×(n−1)}. Now D_1(A) divides all elements of A' by equivalence. A simple check shows that D_1(A) divides all elements of A'' and, by equivalence, those of A'''. In particular all elements of A_1 are divisible by D_1(A).

We proceed by induction to obtain the ω'-equivalence of A to a diagonal matrix Λ = diag(δ_1, ..., δ_n) with δ_i | δ_{i+1} and δ_1 = D_1(A). Since A is ω'-equivalent to Λ it is also ω-equivalent and, ω being arbitrary, A and Λ are quasiequivalent. Therefore D_i(A) = D_i(Λ) = δ_1 ... δ_i. Now the invariant factors of Λ are δ_1, ..., δ_n and therefore also E_i(A) = δ_i. This proves the theorem.
Corollary 15-7 Two matrices A and B in (H^∞)^{n×n} are quasiequivalent if and only if they have the same invariant factors.

PROOF Assume A and B have the same invariant factors δ_1, ..., δ_n. Both A and B are quasiequivalent to diag(δ_1, ..., δ_n), and since quasiequivalence is transitive, A and B are quasiequivalent. The converse has already been proved in Lemma 15-5.
Now we proceed to the study of the relation between quasiequivalence and quasisimilarity and determine a quasisimilarity invariant for a class of restricted shifts.
To begin with we note that in one direction we can get the analog of the finite dimensional situation.
Theorem 15-8 Let Q_1 and Q_2 be n × n inner functions. If Q_1 and Q_2 are equivalent, then the operators S(Q_1) and S(Q_2), defined by (13-4) in the left invariant subspaces H(Q_1) and H(Q_2) defined by (13-3), are similar.

PROOF By equivalence there exist units Ξ and Ξ_1 in (H^∞)^{n×n} such that

    ΞQ_1 = Q_2Ξ_1   (15-26)

Define an operator X: H(Q_1) → H(Q_2) by

    Xf = P_{H(Q_2)}Ξf,   f ∈ H(Q_1)   (15-27)

then X satisfies XS(Q_1) = S(Q_2)X and is boundedly invertible by Theorem 14-11.

Given inner functions q_1, ..., q_n in H^∞ which satisfy q_{i+1} | q_i, we define a Jordan operator to be an operator of the form

    S(q_1) ⊕ ... ⊕ S(q_n)   (15-28)
acting in the space

    H(q_1) ⊕ ... ⊕ H(q_n)   (15-29)

Our aim will be to establish that the Jordan operator is a quasisimilarity invariant for the set of restricted shifts corresponding to matrix inner functions.
Theorem 15-9 Let Q_1 and Q_2 be n × n inner functions. If Q_1 and Q_2 are quasiequivalent, then S(Q_1) and S(Q_2) are quasisimilar.

PROOF Since Q_1 and Q_2 are quasiequivalent, for each inner function ω there exist Ξ and Ξ_1 in 𝒩_ω(n) such that (15-26) holds and both det Ξ and det Ξ_1 are coprime with ω. Choose ω to be equal to det Q_1 det Q_2; then necessarily we obtain the coprimeness relations

    (Ξ, Q_2)_L = I   and   (Ξ_1, Q_1)_R = I

Define X: H(Q_1) → H(Q_2) by (15-27); then X satisfies XS(Q_1) = S(Q_2)X. Moreover, by Theorem 14-11, both X and X* are injective. Since {Range X}^⊥ = Ker X*, it follows that X has dense range and so X is quasi-invertible. By symmetry there exists also a quasi-invertible operator Y: H(Q_2) → H(Q_1) for which YS(Q_2) = S(Q_1)Y, and so the quasisimilarity of S(Q_1) and S(Q_2) is proved.
Theorem 15-10 Let Q be an n × n inner function and let q_1, ..., q_n be its invariant factors, ordered so that q_{i+1} | q_i. Then S(Q) is quasisimilar to the Jordan operator S(q_1) ⊕ ... ⊕ S(q_n).

PROOF As Q is quasiequivalent to diag(q_1, ..., q_n), the theorem follows from the previous one.

The question of whether the converse to Theorem 15-9 holds is connected to the question of whether two quasisimilar Jordan operators are necessarily equal. Both questions can be answered in the affirmative, but we will first have to establish some preliminary results.
Theorem 15-11 Let q be an inner function in H^∞ and Q_1 an m × m inner function. Let Q = qI_n and let X: H(Q) → H(Q_1) be a quasi-invertible operator satisfying

    XS(Q) = S(Q_1)X   (15-30)

then necessarily n ≤ m.

PROOF Since S(Q_1) is quasisimilar to its Jordan model, we may assume without loss of generality that Q_1 = diag(q_1, ..., q_m), where the q_i are the invariant factors of Q_1 and q_{i+1} | q_i. From (15-30) it follows that the equality

    Xφ(S(Q)) = φ(S(Q_1))X   (15-31)
holds for all φ ∈ H^∞. In particular the choice φ = q yields q(S(Q)) = 0, and as the range of X is dense in H(Q_1) this in turn implies q(S(Q_1)) = q(S(q_1)) ⊕ ... ⊕ q(S(q_m)) = 0, or q_j | q for all j. Therefore there exist inner functions r_j for which

    q = r_jq_j,   j = 1, ..., m   (15-32)

By Theorem 14-8 there exist matrices Ξ and Θ = (η_{ij}) in (H^∞)^{m×n} such that

    ΞQ = Q_1Θ   (15-33)

for

    Xf = P_{H(Q_1)}Ξf,   f ∈ H(Q)   (15-34)

and the coprimeness relations (Ξ, Q_1)_L = I and (Θ, Q)_R = I are satisfied. From (15-32) and (15-33) it follows that η_{ij} = r_iξ_{ij}, or that

    Θ = RΞ   (15-35)

where R = diag(r_1, ..., r_m). Since (Θ, Q)_R = I, it is impossible for all the η_{ij} to be divisible by q. Let r be the order of a maximal minor of Θ whose determinant is not divisible by q. Without loss of generality we may assume it to be the minor (η_{ij}), i, j = 1, ..., r. Obviously 1 ≤ r ≤ m.
Assume now n > m; we will show this leads to a contradiction. Define functions u_1, ..., u_{r+1} in H^∞ through the following determinant expansion

    det (η_11 ... η_1r η_1,r+1; ...; η_r1 ... η_rr η_r,r+1; x_1 ... x_r x_{r+1}) = Σ_{i=1}^{r+1} x_iu_i   (15-36)

We note that

    Σ_{j=1}^{r+1} η_{ij}u_j is 0 for i = 1, ..., r, and divisible by q for i > r   (15-37)

as for i ≤ r this is the expansion of a determinant with two equal rows, whereas for i > r it is the determinant expansion of a minor of Θ of order r + 1, hence divisible by q. Let now u ∈ H²_{C^n} have u_1, ..., u_{r+1} as its first r + 1 components, all others equal to zero. Since

    u_{r+1} = det (η_11 ... η_1r; ...; η_r1 ... η_rr)

is not divisible by q, the function u is not in QH²_{C^n} = qH²_{C^n}, and so v = P_{H(Q)}u is nonzero. We will show now that Xv = 0. Indeed (15-37) shows that Θu ∈ qH²_{C^m}, and as qI = RQ_1 we obtain Θu ∈ RQ_1H²_{C^m}. This coupled with (15-35) implies Ξu ∈ Q_1H²_{C^m}. Now Xv = P_{H(Q_1)}Ξv = P_{H(Q_1)}ΞP_{H(Q)}u = P_{H(Q_1)}Ξu = 0. This contradicts the assumption that X is quasi-invertible, and in particular injective. So n ≤ m and the theorem is proved.
Lemma 15-12 Let q = pr be a factorization of the inner function q into inner factors. Then

    H(q) = H(r) ⊕ rH(p)    (15-38)

PROOF We have H² = H(p) ⊕ pH². Since multiplication by an inner function is an isometry in H², it preserves orthogonality. Hence

    rH² = rH(p) ⊕ rpH² = rH(p) ⊕ qH²

and as a consequence

    H² = H(r) ⊕ rH² = H(r) ⊕ rH(p) ⊕ qH²

which is equivalent to (15-38).
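The decomposition (15-38) can be checked numerically for finite Blaschke products, representing H² functions by their boundary values on a grid of roots of unity and the analytic projection P_+ by discarding negative Fourier coefficients. This is only a sketch; the grid size and the sample zeros 0.3 and −0.5 are arbitrary choices.

```python
import numpy as np

N = 1024
z = np.exp(2j * np.pi * np.arange(N) / N)   # grid on the unit circle

def blaschke(a, z):
    # elementary Blaschke factor with zero at a, |a| < 1
    return (z - a) / (1 - np.conjugate(a) * z)

def P_plus(f):
    # analytic (Riesz) projection: keep only non-negative Fourier frequencies
    c = np.fft.fft(f)
    c[N // 2:] = 0.0     # the upper bins alias the negative frequencies
    return np.fft.ifft(c)

def proj_H(q, f):
    # orthogonal projection of f onto the model space H(q) = H^2 (-) q H^2
    return f - q * P_plus(np.conjugate(q) * f)

ip = lambda u, v: np.mean(u * np.conjugate(v))   # L^2 inner product on the circle

r = blaschke(0.3, z)
p = blaschke(-0.5, z)
q = r * p

f  = 1.0 + 0.5 * z + 0.25 * z**2   # an arbitrary H^2 element
x  = proj_H(q, f)                  # x in H(q)
x1 = proj_H(r, x)                  # component in H(r)
x2 = x - x1                        # claimed component in r*H(p)

y = P_plus(np.conjugate(r) * x2)   # candidate for x2 / r
err_factor = np.abs(x2 - r * y).max()        # x2 = r*y
err_model  = np.abs(y - proj_H(p, y)).max()  # y lies in H(p)
err_ortho  = abs(ip(x1, x2))                 # H(r) is orthogonal to r*H(p)
```

All three errors are at machine-precision level, in agreement with (15-38).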
Lemma 15-13 Let q and q' be inner functions in H^∞ and let r = q ∧ q', q = rq̂. Then

    (q'(S(q))H(q))¯ = rH(q̂)    (15-39)

and there exists a quasi-invertible operator K : H(q̂) → rH(q̂) such that

    KS(q̂) = (S(q)|rH(q̂))K    (15-40)

PROOF Define X : H(q̂) → H(q) by

    Xg = P_{H(q)}q'g  for g ∈ H(q̂)    (15-41)

Clearly

    XS(q̂) = S(q)X    (15-42)

Let f ∈ H(q) and let f = g + q̂h be its decomposition with respect to the direct sum representation H(q) = H(q̂) ⊕ q̂H(r). Now

    q'(S(q))f = P_{H(q)}q'f = P_{H(q)}(q'g + q'q̂h) = P_{H(q)}q'g = XP_{H(q̂)}f

Here we used the fact that q divides q'q̂. Define now q̂' = q'/r; then q̂' ∧ q̂ = 1 and from (15-41) we have

    Xg = P_{H(q)}q'g = P_{H(q)}rq̂'g = rP_{H(q̂)}q̂'g

By Theorem 14-5, X is injective and the range of q̂'(S(q̂)) is dense in H(q̂), and so the range of X is dense in rH(q̂). Define K : H(q̂) → rH(q̂) by Kg = Xg; then K is quasi-invertible and (15-40) holds.
Corollary 15-14 Let q = pr be a factorization of the inner function q into inner factors. Then the closure of the range of r(S(q)) is rH(p), and the operators S(p) and S(q)|rH(p) are unitarily equivalent.

PROOF The operator X : H(p) → H(q) defined by Xg = rg is isometric, with range equal to the invariant subspace rH(p) of H(q). If we restrict the range of X to rH(p) it is unitary, and it provides the required unitary equivalence of S(p) and S(q)|rH(p). The range statement follows from Lemma 15-13 with q' = r, since q ∧ r = r and q/r = p.
Lemma 15-15 If q_1, q_2, and q' are inner functions and q_2 | q_1, then

    (q_2/(q_2 ∧ q'))  |  (q_1/(q_1 ∧ q'))    (15-43)

PROOF Since q_2 | q_1 there exists an inner function s such that q_1 = q_2s. It follows that

    q_1 ∧ q' = q_2s ∧ q' = (q_2 ∧ q')·(s ∧ (q'/(q_2 ∧ q')))

and as a consequence

    q_1/(q_1 ∧ q') = (q_2/(q_2 ∧ q'))·(s/(s ∧ (q'/(q_2 ∧ q'))))    (15-44)

But (15-44) clearly implies (15-43).
The next result shows that two quasisimilar Jordan operators are necessarily equal.
Theorem 15-16 Let q_1, ..., q_n and q'_1, ..., q'_m be inner functions in H^∞ such that q_{i+1} | q_i and q'_{i+1} | q'_i. Let Q = diag(q_1, ..., q_n) and Q' = diag(q'_1, ..., q'_m). If X : H(Q) → H(Q') is a quasi-invertible operator for which

    XS(Q) = S(Q')X    (15-45)

then n = m and q_i = q'_i.

PROOF Note that H(Q) = H(q_1) ⊕ ... ⊕ H(q_n) and S(Q) = S(q_1) ⊕ ... ⊕ S(q_n); similarly H(Q') = H(q'_1) ⊕ ... ⊕ H(q'_m) and S(Q') = S(q'_1) ⊕ ... ⊕ S(q'_m). Since q_n | q_j for j = 1, ..., n we have q_j = r_jq_n for some inner functions r_j, j = 1, ..., n, with r_n = 1. Consider the subspace H_0 = r_1H(q_n) ⊕ ... ⊕ r_nH(q_n) of H(Q), which is an invariant subspace of S(Q). By Corollary 15-14, S(Q)|H_0 is unitarily equivalent to S(q_nI_n). X restricted to H_0 is a quasi-invertible operator from H_0 into H'_0 = (XH_0)¯, which is an S(Q')-invariant subspace of H(Q'). Now S(Q')|H'_0 is unitarily equivalent to S(T) for some m × m inner function T. Applying Theorem 15-11 we have n ≤ m.
Fix now 1 ≤ k ≤ m. Since Xφ(S(Q)) = φ(S(Q'))X for every φ ∈ H^∞, the choice φ = q'_k yields

    X[q'_k(S(q_1)) ⊕ ... ⊕ q'_k(S(q_n))]
        = [q'_k(S(q'_1)) ⊕ ... ⊕ q'_k(S(q'_{k-1})) ⊕ {0} ⊕ ... ⊕ {0}]X    (15-46)

By Lemma 15-13 and Corollary 15-14

    (q'_k(S(q_j))H(q_j))¯ = (q_j ∧ q'_k)H(q_j/(q_j ∧ q'_k))

and

    S(q_j)|(q_j ∧ q'_k)H(q_j/(q_j ∧ q'_k))

is unitarily equivalent to S(q_j/(q_j ∧ q'_k)). Similarly (q'_k(S(q'_j))H(q'_j))¯ = q'_kH(q'_j/q'_k) and S(q'_j)|q'_kH(q'_j/q'_k) is unitarily equivalent to S(q'_j/q'_k). Thus there exists a quasi-invertible operator X_1 for which

    X_1[S(q_1/(q_1 ∧ q'_k)) ⊕ ... ⊕ S(q_n/(q_n ∧ q'_k))]
        = [S(q'_1/q'_k) ⊕ ... ⊕ S(q'_{k-1}/q'_k)]X_1

and Lemma 15-15 guarantees the division relations

    (q_{j+1}/(q_{j+1} ∧ q'_k)) | (q_j/(q_j ∧ q'_k))  and  (q'_{j+1}/q'_k) | (q'_j/q'_k)

Let j be the maximal index for which q_j/(q_j ∧ q'_k) is nontrivial. Since

    S(q_1/(q_1 ∧ q'_k)) ⊕ ... ⊕ S(q_j/(q_j ∧ q'_k))

restricted to a suitable invariant subspace is unitarily equivalent to the direct sum of j copies of S(q_j/(q_j ∧ q'_k)), another application of Theorem 15-11 shows that j ≤ k − 1. This implies q_k = q_k ∧ q'_k, or q_k | q'_k.

To complete the proof we take the adjoint of equality (15-45). Thus we have

    X*S(Q')* = S(Q)*X*    (15-47)

and X* is also quasi-invertible. Now, by Theorem 13-2, S(Q)* is unitarily equivalent to S(Q̃) and S(Q̃) = S(q̃_1) ⊕ ... ⊕ S(q̃_n); similarly for S(Q')*. Applying the first part of the proof we obtain m ≤ n and q̃'_k | q̃_k, which clearly implies q'_k | q_k, and this completes the proof.
By Theorem 15-10, given any matrix inner function Q, S(Q) is quasisimilar to a Jordan operator, and by Theorem 15-16 this Jordan operator is unique. We call it the Jordan model of the operator S(Q).

Theorem 15-17 Let Q_1 and Q_2 be two matrix inner functions. Then S(Q_1) and S(Q_2) are quasisimilar if and only if they have the same Jordan model.

PROOF Assume S(Q_1) and S(Q_2) have the same Jordan model S(Λ). Since both S(Q_i) are quasisimilar to S(Λ), the quasisimilarity of S(Q_1) and S(Q_2) follows by transitivity. Conversely let S(Λ_1) and S(Λ_2) be the Jordan models of S(Q_1) and S(Q_2), respectively. Again by transitivity S(Λ_1) and S(Λ_2) are quasisimilar, and by Theorem 15-16, Λ_1 and Λ_2 coincide.
Theorem 15-18 Let Q_1 and Q_2 be two matrix inner functions. Then S(Q_1) and S(Q_2) are quasisimilar if and only if Q_1 and Q_2 are quasiequivalent.

PROOF The "if" part has been proved in Theorem 15-9. Conversely, let Λ_1 and Λ_2 be the diagonal inner functions with the invariant factors of Q_1 and Q_2 as their respective entries; then Q_i is quasiequivalent to Λ_i. As S(Λ_1) and S(Λ_2) are quasisimilar by the transitivity of the quasisimilarity relation, we must have, by Theorem 15-16, that Λ_1 = Λ_2. By transitivity of quasiequivalence, Q_1 and Q_2 are quasiequivalent.
In a finite dimensional vector space a linear transformation is cyclic if and only if its characteristic and minimal polynomials coincide. An analogous situation holds for Jordan operators. Let us denote by C_0 the class of all completely nonunitary contractions T such that φ(T) = 0 for some nonzero function φ in H^∞.
Lemma 15-19 Let T ∈ C_0. Then J = {ψ ∈ H^∞ | ψ(T) = 0} is a w*-closed ideal in H^∞.

PROOF That J is an ideal is clear from the properties of the functional calculus, and as T ∈ C_0 it is a nontrivial ideal. To see that J is w*-closed, let φ_α be a net in J converging to φ in the w*-topology of H^∞. Recall that the minimal unitary dilation U of T has a spectral measure E which is absolutely continuous with respect to Lebesgue measure. Let x and y be arbitrary vectors in H; then we have

    (φ_α(T)x, y) = (P_Hφ_α(U)x, y) = (φ_α(U)x, y) = ∫φ_α(e^{it})(E(dt)x, y)

Since by the Radon-Nikodym theorem (E(dt)x, y) = k(e^{it})dt for some k ∈ L¹, we have

    ∫φ(e^{it})(E(dt)x, y) = lim ∫φ_α(e^{it})(E(dt)x, y) = 0

This implies that (φ(T)x, y) = 0 for all x and y, hence φ(T) = 0.

We use now the representation theorem for w*-closed ideals in H^∞ to get J = m_TH^∞ for some inner function m_T which is uniquely determined up to a constant factor of absolute value one. We call m_T the minimal function of T. If T ∈ C_0 so does T*, and we have m_{T*} = m̃_T. If q is an inner function in H^∞ then clearly for S(q) the minimal function coincides with q. Now let S(q_1) ⊕ ... ⊕ S(q_n) be a Jordan operator. Since q_{i+1} | q_i for i = 1, ..., n − 1, we immediately infer that the minimal function of S(q_1) ⊕ ... ⊕ S(q_n) is q_1.

The minimal function is a quasisimilarity invariant. Actually we prove a slightly stronger result.
Theorem 15-20 Let T, TI be two completely nonunitary contractions and X a quasi-invertible operator that intertwines them, i.e.
XT= T1X
(15-48)
Then if one of the operators is of class C_0 so is the other, and their minimal functions coincide.

PROOF From (15-48) it follows that

    Xφ(T) = φ(T_1)X  for all φ ∈ H^∞    (15-49)

Assume T ∈ C_0. Then m_T(T_1)X = Xm_T(T) = 0, and as the range of X is dense we must have m_T(T_1) = 0; so T_1 ∈ C_0 and m_{T_1} | m_T. Conversely if T_1 ∈ C_0 then 0 = m_{T_1}(T_1)X = Xm_{T_1}(T). Since X is injective, m_{T_1}(T) = 0, which shows that T ∈ C_0 and m_T | m_{T_1}. Hence the two minimal functions coincide.
We have now two notions of minimal inner functions associated with a matrix inner function Q. One is the minimal inner function of Q while the other is the minimal function of S(Q). Not surprisingly the two notions of minimality coincide.
Theorem 15-21 Let Q be an inner function in C^n. Then m, the minimal inner function of Q, and m_1, the minimal function of S(Q), coincide.

PROOF Since m_1(S(Q)) = 0 it follows that m_1H(Q) ⊂ QH²_{C^n}, and since clearly m_1QH²_{C^n} = Qm_1H²_{C^n} ⊂ QH²_{C^n}, we have m_1H²_{C^n} ⊂ QH²_{C^n}, or m | m_1. Conversely, from mH²_{C^n} ⊂ QH²_{C^n} we have m(S(Q)) = 0, and so m_1 | m. Thus m and m_1 coincide.
An operator T in H is cyclic, and b a cyclic vector, if the set of linear combinations of the vectors T^jb, j = 0, 1, ..., is dense in H. Cyclicity too is a quasisimilarity invariant.

Theorem 15-22 Let T and T_1 be contractions and X a quasi-invertible operator that intertwines T and T_1. Then if T is cyclic so is T_1. Consequently, if T and T_1 are quasisimilar then T is cyclic if and only if T_1 is cyclic.

PROOF Let T be cyclic and b a cyclic vector. From (15-48) it follows that

    XT^nb = T_1^nXb = T_1^nb_1    (15-50)

where b_1 = Xb. As the range of X is dense, this shows that T_1 is cyclic with b_1 a cyclic vector.

We describe next an important class of cyclic operators.
Lemma 15-23 Let q ∈ H^∞ be inner. Then S(q) is cyclic, and a function f ∈ H(q) is a cyclic vector for S(q) if and only if f ∧ q = 1.

PROOF A function f in H(q) is a cyclic vector for S(q) if and only if the smallest shift invariant subspace that contains f and qH² is H². This, by Beurling's theorem, is the case if and only if f ∧ q = 1. To see that S(q) is cyclic it suffices to exhibit an outer function in H(q). Let k = P_{H(q)}1 = 1 − āq, where a = q(0). In fact |q(0)| < 1 as a consequence of the maximum modulus theorem, so |k| is bounded below by 1 − |q(0)| > 0; hence k is actually an invertible element of H^∞, and in particular outer.
As a result of the preceding development we can obtain a characterization of the restricted shifts of finite multiplicity which are cyclic.

Theorem 15-24 Let Q be an inner function in C^n. Then the following statements are equivalent.
(a) S(Q) is cyclic.
(b) S(Q)* is cyclic.
(c) S(Q) is quasisimilar to S(q), where q = det Q.
(d) m = q, where m is the minimal function of S(Q).

PROOF Since cyclicity is a quasisimilarity invariant we can replace S(Q) by its Jordan model S(q_1) ⊕ ... ⊕ S(q_n), where q_{i+1} | q_i for i = 1, ..., n − 1. Since S(q) is cyclic, clearly (c) → (a). Also, if S(Q) is quasisimilar to S(q) then S(Q)* is quasisimilar to S(q)*. However, S(q)* is unitarily equivalent, by Theorem 13-2, to S(q̃), which is again cyclic, so (c) → (b). Minimal functions are a quasisimilarity invariant, so since q is clearly the minimal function of S(q) we have the implication (c) → (d). To see that (d) → (c), note that the minimal function of the Jordan model is q_1, whereas its determinant is q_1 ⋯ q_n = q; thus (d) is equivalent to q_2 = ⋯ = q_n = 1, that is, to the Jordan model being S(q).

Finally assume (a) holds, that is, S(Q), or S(q_1) ⊕ ... ⊕ S(q_n), is cyclic, and let f = f_1 ⊕ ... ⊕ f_n be a cyclic vector for it. First we show that we may replace f by another cyclic vector ξ_1 ⊕ ... ⊕ ξ_n with ξ_i ∈ H^∞, i = 1, ..., n. To this end let S be the right shift in H²_{C^n} and let M be the invariant subspace spanned by the S^kf. By Theorem 12-22, M = ΩH²_K for some rigid function Ω, where K ⊂ C^n is the initial space of Ω. Since f ∈ M, f = Ωg for some g ∈ H²_K. Since Ω is isometric on H²_K, g necessarily spans H²_K under the shift, which implies that dim K = 1. By a suitable choice of basis we may assume K coincides with the subspace spanned by e_1, the first element of the standard orthonormal basis of C^n. Let ξ_1, ..., ξ_n be the elements of the first column of Ω; then f = (ξ_1 ⊕ ... ⊕ ξ_n)g and P_{H(Q)}(ξ_1 ⊕ ... ⊕ ξ_n) is also a cyclic vector for S(q_1) ⊕ ... ⊕ S(q_n). If k = P_{H(q_1)}1 = 1 − āq_1, with a = q_1(0), then we define an operator X : H(q_1) → H(q_1) ⊕ ... ⊕ H(q_n) by

    XS(q_1)^mk = S(q_1)^mξ_1 ⊕ ... ⊕ S(q_n)^mξ_n    (15-51)

Clearly X extends by linearity to a bounded operator. We have in fact

    Xh = P_{H(q_1)}ξ_1h ⊕ ... ⊕ P_{H(q_n)}ξ_nh  for h ∈ H(q_1)    (15-52)

Since P_{H(Q)}(ξ_1 ⊕ ... ⊕ ξ_n) is a cyclic vector, X clearly has dense range. If X is not injective then neither is the operator Y : H(q_1) → H(q_1) given by

    Yh = P_{H(q_1)}ξ_1h    (15-53)

This is equivalent to ξ_1 and q_1 having a nontrivial common inner factor, which contradicts the fact that P_{H(q_1)}ξ_1 is a cyclic vector for S(q_1). Therefore the operator X defined by (15-52) is quasi-invertible, and it clearly satisfies

    XS(q_1) = (S(q_1) ⊕ ... ⊕ S(q_n))X    (15-54)

We apply Theorem 15-16 to infer that q_2 = ⋯ = q_n = 1. So S(q_1) = S(q) is the Jordan model of S(Q); hence they are quasisimilar, and the proof is complete.
The previous theorem suggests the question as to when S(Q) is actually similar to S(q). Such questions are always associated with the Carleson corona theorem, and this one is no exception.

Theorem 15-25 Let Q be an inner function in C^n, and let Ω = (ω_{ij}) be the classical adjoint of Q, that is, ΩQ = QΩ = qI where q = det Q. Then S(Q) is similar to S(q) if and only if there exist h, u_i, v_j ∈ H^∞ such that

    Σ_{i,j=1}^n ω_{ij}u_iv_j + qh = 1    (15-55)
PROOF Assume S(Q) and S(q) are similar. Then there exist boundedly invertible maps X : H(Q) → H(q) and Y : H(q) → H(Q) satisfying XY = I_{H(q)}, YX = I_{H(Q)},

    XS(Q) = S(q)X    (15-56)

and

    S(Q)Y = YS(q)    (15-57)

By the representation theory for intertwining operators there exist 1 × n and n × 1 bounded matrix functions Φ and Ψ, respectively, such that

    Ψq = QΨ_1  and  ΦQ = qΦ_1    (15-58)

    Xf = P_{H(q)}Φf  for f ∈ H(Q)    (15-59)

    Yg = P_{H(Q)}Ψg  for g ∈ H(q)    (15-60)

and the coprimeness relations

    [Φ, q]_L = I,  [Φ_1, Q]_R = I    (15-61)

    [Ψ, Q]_L = I,  [Ψ_1, q]_R = I    (15-62)

The relation XY = I_{H(q)} implies that for g ∈ H(q)

    g = P_{H(q)}ΦP_{H(Q)}Ψg = P_{H(q)}ΦΨg

and hence that

    1 − ΦΨ = qh    (15-63)

for some h ∈ H^∞. Finally, from ΦQ = qΦ_1 and QΩ = qI we get Φ = Φ_1Ω, so that

    1 − Φ_1ΩΨ = qh    (15-64)

Let

    Φ_1 = (u_1, ..., u_n),  Ψ = col(v_1, ..., v_n)    (15-65)

then (15-55) follows.
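Condition (15-55) involves only the entries of the classical adjoint Ω. Two finite dimensional facts about the adjoint used in this proof, the identity ΩQ = QΩ = qI and the divisibility of every 2 × 2 minor of Ω (that is, of every entry of the second compound matrix Ω^(2)) by q, can be checked numerically for an integer matrix. This is a sketch; the sample matrix is an arbitrary choice.

```python
import numpy as np
from itertools import combinations

def adjugate(A):
    # classical adjoint: transpose of the cofactor matrix
    n = A.shape[0]
    C = np.zeros_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * round(np.linalg.det(minor))
    return C.T

Q = np.array([[2, 1, 3],
              [0, 1, 4],
              [1, 0, 2]])
q = round(np.linalg.det(Q))            # here q = 5
Omega = adjugate(Q)

# Omega * Q = Q * Omega = q * I
assert (Omega @ Q == q * np.eye(3, dtype=int)).all()
assert (Q @ Omega == q * np.eye(3, dtype=int)).all()

# every 2x2 minor of the adjugate is divisible by q (Jacobi's identity)
minors = [Omega[np.ix_(r, c)] for r in combinations(range(3), 2)
                              for c in combinations(range(3), 2)]
assert all((m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]) % q == 0 for m in minors)
```

The second assertion is the arithmetic shadow of the divisibility argument used in the converse direction of the proof.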
Conversely assume there exist h, u_i, v_i ∈ H^∞ for which (15-55) holds. Define Φ_1 and Ψ by (15-65) and let us define Φ and Ψ_1 by

    Φ = Φ_1Ω,  Ψ_1 = ΩΨ    (15-66)

From (15-66) we get, multiplying by Q on the right and on the left, respectively, that (15-58) holds. Therefore the maps X and Y defined by (15-59) and (15-60), respectively, are bounded maps that satisfy (15-56) and (15-57). It remains to show that X and Y are invertible. It is clear that XY = I_{H(q)}, which follows from (15-55), and so X is surjective while Y is injective. To complete the proof we have to show that YX = I_{H(Q)}, or equivalently, since YXf = P_{H(Q)}ΨΦf for f ∈ H(Q) and ΨΦQ = ΨqΦ_1 = QΨ_1Φ_1 = QΩΨΦ_1, that

    (ΨΦ, Q)_L = I    (15-67)

From (15-64), by multiplication by Ω, we obtain

    (Φ_1ΩΨ)Ω = Ω − qΩh    (15-68)

We will show now that q divides all elements of F = (Φ_1ΩΨ)Ω − (ΩΨ)(Φ_1Ω). Since Φ_1 and Ψ are given by (15-65), the k, l element of F is given by

    Σ_{i,j=1}^n ω_{ij}u_iv_jω_{kl} − (Σ_{j=1}^n ω_{kj}v_j)(Σ_{i=1}^n u_iω_{il})
        = Σ_{i,j=1}^n u_iv_j(ω_{ij}ω_{kl} − ω_{kj}ω_{il})

Now ω_{ij}ω_{kl} − ω_{kj}ω_{il} is the general element of the second compound matrix Ω^(2), and we have to show its divisibility by q. Observe that (15-55) implies that D_{n−1}(Q) ∧ q = 1, and as D_{n−1}(Q) | D_n(Q) = det Q = q, necessarily D_{n−1}(Q) = 1. Thus the invariant factors of Q are q, 1, ..., 1; the Jordan model of S(Q) is S(q), which is cyclic, and Q is quasiequivalent to Q̂ = diag(q, 1, ..., 1). Choosing ω = q in the definition of quasiequivalence, there exist matrices Ξ and Θ over H^∞ with det Ξ ∧ q = det Θ ∧ q = 1 such that ΞQ = Q̂Θ. Taking classical adjoints in this equality gives ΩΞ° = Θ°Ω̂, where Ξ°, Θ°, and Ω̂ are the classical adjoints of Ξ, Θ, and Q̂, and det Ξ° = (det Ξ)^{n−1}, det Θ° = (det Θ)^{n−1} are again coprime to q. Now Ω̂ = diag(1, q, ..., q), so q divides all elements of the last n − 1 columns of Ω̂, and hence q | D_2(Ω̂). By the Cauchy-Binet formula q then divides D_2(Θ°Ω̂) = D_2(ΩΞ°), and det Ξ° being coprime to q, q | D_2(Ω). This means all elements of Ω^(2), and therefore of F, are divisible by q.

Let F = qD for some n × n matrix D over H^∞. Equality (15-68) can then be rewritten as

    Ω = ΩΨΦ + q(D + Ωh)

or, multiplying on the left by Q and using QΩ = qI, as

    I = ΨΦ + Q(D + Ωh)

By Theorem 14-10 this is equivalent to (15-67), and this completes the proof of the theorem.

The terminology concerning Jordan operators is not in total agreement with the finite dimensional situation. In fact a Jordan operator is the proper generalization of the first canonical form (I 4-21), which is derived from Theorem I 4-14. A
natural question poses itself: that of finding conditions under which a further reduction is possible. The extreme simplification would be to find conditions under which S(Q) is completely diagonalizable, in the sense that there exists a basis consisting of eigenfunctions of S(Q). We restrict ourselves to the study of the scalar case only.

Let us review the finite dimensional situation. Consider a polynomial q ∈ F[λ] which has the zeros λ_1, ..., λ_n, all of them assumed to be simple. The quotient ring F[λ]/(q) is isomorphic to the set of polynomials of degree at most n − 1. Define

    π_i(λ) = Π_{j≠i} (λ − λ_j)/(λ_i − λ_j)    (15-69)

Then we clearly have

    π_i(λ_j) = δ_{ij}  for 1 ≤ i, j ≤ n    (15-70)

The polynomials π_1, ..., π_n are clearly linearly independent, as a consequence of (15-70); hence they form a basis for F[λ]/(q). Now given any set of numbers a_1, ..., a_n in F, the polynomial a(λ) = Σ_{i=1}^n a_iπ_i(λ) is an interpolating polynomial in the sense that a(λ_j) = a_j, j = 1, ..., n. Finally we note that if S(q) is defined by (I 4-8) then the polynomials π_i are eigenvectors of S(q) corresponding to the eigenvalues λ_i. This follows from the fact that (λ − λ_i)π_i(λ) is divisible by q. The moral of this simple example is that certain problems of interpolation are connected with the problem of existence of bases consisting of eigenfunctions.

We make a few definitions first. A sequence of points {λ_j} in the open unit disc D is called uniformly separated, or equivalently a Carleson sequence, if there exists a δ > 0 such that

    Π_{j≠k} |(λ_j − λ_k)/(1 − λ̄_jλ_k)| ≥ δ  for all k ≥ 1    (15-71)
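The finite dimensional picture above can be made concrete: a small numpy sketch of the Lagrange basis (15-69), its interpolating property (15-70), and the eigenvector property of the π_i under multiplication by λ modulo q. The node values are arbitrary choices.

```python
import numpy as np

nodes = np.array([0.0, 1.0, 2.0, 4.0])   # simple zeros of q
n = len(nodes)
q = np.poly(nodes)                        # monic polynomial with these zeros

def lagrange(i):
    # coefficients of pi_i = prod_{j != i} (x - x_j) / (x_i - x_j)
    p = np.poly(np.delete(nodes, i))
    return p / np.polyval(p, nodes[i])

pis = [lagrange(i) for i in range(n)]

# interpolating property (15-70): pi_i(x_j) = delta_ij
V = np.array([[np.polyval(pis[i], x) for x in nodes] for i in range(n)])
assert np.allclose(V, np.eye(n))

def times_x_mod_q(p):
    # multiplication by x in F[x]/(q): multiply then take the remainder
    return np.polydiv(np.polymul(p, [1.0, 0.0]), q)[1]

# eigenvector property: x * pi_i = x_i * pi_i modulo q, since
# (x - x_i) * pi_i is a scalar multiple of q
for i in range(n):
    r = times_x_mod_q(pis[i])
    assert np.allclose(np.polyval(r, nodes), nodes[i] * np.polyval(pis[i], nodes))
```

Both assertions pass, mirroring the statement that the π_i form an eigenbasis of S(q) in the simple-zeros case.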
It is quite easy to see that if {λ_i} is a uniformly separated sequence then there exists a Blaschke product with the set {λ_i} as its set of zeros. Without loss of generality we may assume λ_0 = 0; then by (15-71)

    Π_{j>0} |λ_j| = Π_{j>0} |(λ_j − λ_0)/(1 − λ̄_0λ_j)| ≥ δ > 0

The convergence of the last infinite product is equivalent to the convergence of the series Σ_j (1 − |λ_j|).

We will say that a sequence of points {λ_i} in D is a p-interpolating sequence if there exists a constant M such that for any sequence a = {a_i} in l^p there exists a function f ∈ H^p that satisfies

    f(λ_i) = a_i/(1 − |λ_i|²)^{1/p}    (15-72)

and

    ‖f‖_p ≤ M‖a‖_p    (15-73)

that is, we can interpolate the values a_i/(1 − |λ_i|²)^{1/p} in a uniformly bounded way. An ∞-interpolating sequence is also referred to simply as an interpolating sequence. In this case (15-72) is replaced by

    f(λ_i) = a_i    (15-74)
We continue with the introduction of some notions concerning bases in a Hilbert space. Let {f_i} be a sequence of vectors in a separable Hilbert space H. Then {f_i} is a basis if for each f ∈ H there exist unique scalars a_i(f) such that lim_{n→∞} ‖f − Σ_{i=1}^n a_i(f)f_i‖ = 0. A basis {f_i} is called a bounded basis if there exists a constant m > 0 for which m^{−1} ≤ ‖f_i‖ ≤ m for all i ≥ 0. A basis {f_i} is called Hilbertian if for each sequence {a_i} ∈ l² the series Σ a_if_i converges. A basis {f_i} is called Besselian if for each g ∈ H we have Σ |(g, ψ_i)|² < ∞, where {ψ_i} is the sequence biorthogonal to {f_i}, that is, the sequence satisfying (f_i, ψ_j) = δ_{ij}.
It will turn out to be useful to have some additional information concerning bases in Hilbert space. Let us assume H is a separable Hilbert space and {φ_i} a basis in H. Let {ψ_i} be the biorthogonal sequence that corresponds to {φ_i}, that is, (φ_i, ψ_j) = δ_{ij}. Finally let {e_i} be the standard orthonormal basis of l².

Theorem 15-26 The following statements are equivalent.
(a) {φ_i} is a Besselian basis for H.
(b) There exists a bounded map T : H → l² such that

    Tφ_i = e_i    (15-75)

(c) {ψ_i} is a Hilbertian basis for H.
(d) There exists a bounded map R : l² → H such that

    Re_i = ψ_i    (15-76)

PROOF Assume {φ_i} is a Besselian basis for H. Thus Σ_{i=1}^∞ |(x, ψ_i)|² < ∞ for all x in H. Define a map T : H → l² by

    T(Σ_{i=1}^∞ a_iφ_i) = {a_i}    (15-77)

Clearly T is linear, its domain of definition is all of H, and it is easily checked to have a closed graph. So by the closed graph theorem T is bounded. In particular we have the inequality

    {Σ |(x, ψ_i)|²}^{1/2} ≤ ‖T‖·‖x‖    (15-78)

Conversely, if there exists a bounded linear map T : H → l² that satisfies (15-75), then we have

    a_j = (Σ_{i=1}^∞ a_ie_i, e_j) = (T Σ_{i=1}^∞ a_iφ_i, e_j) = (Σ_{i=1}^∞ a_iφ_i, T*e_j)

We must have therefore that

    ψ_j = T*e_j    (15-79)

and so we obtain

    Σ |(x, ψ_j)|² = Σ |(x, T*e_j)|² = Σ |(Tx, e_j)|² = ‖Tx‖² ≤ ‖T‖²‖x‖²

which shows that {φ_i} is a Besselian basis. We have proved so far the equivalence of (a) and (b). In a completely analogous way we can show the equivalence of (c) and (d). But equality (15-79) shows that (b) → (d). Similarly, if R satisfies (15-76) then

    R*φ_i = e_i    (15-80)

and so (d) → (b), and this completes the proof.
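In finite dimensions every basis is both Hilbertian and Besselian, and the maps T and R of Theorem 15-26 are explicit. A sketch; the dimension and the perturbation used to generate a non-orthogonal basis are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
# columns of Phi: a non-orthogonal basis phi_i of C^n, a bounded perturbation
# of the standard basis
Phi = np.eye(n) + 0.3 * rng.standard_normal((n, n))
assert np.linalg.matrix_rank(Phi) == n

# biorthogonal system: (phi_i, psi_j) = delta_ij  <=>  Psi = (Phi^{-1})^*
Psi = np.linalg.inv(Phi).conj().T
assert np.allclose(Phi.conj().T @ Psi, np.eye(n))

# the bounded map T with T phi_i = e_i is Phi^{-1}, and R = T^* sends
# e_i to psi_i, as in (15-79) and (15-76)
T = np.linalg.inv(Phi)
assert np.allclose(T @ Phi, np.eye(n))
R = T.conj().T
assert np.allclose(R, Psi)

# Besselian bound (15-78): sum_i |(x, psi_i)|^2 <= ||T||^2 ||x||^2
x = rng.standard_normal(n)
coeffs = Psi.conj().T @ x                  # the numbers (x, psi_i)
opnormT = np.linalg.norm(T, 2)
assert (np.abs(coeffs) ** 2).sum() <= opnormT ** 2 * (x ** 2).sum() + 1e-12
```

The point of the infinite dimensional theorem is precisely that these inequalities survive with constants independent of the dimension.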
For two subspaces M and N of a Hilbert space H we define the angle between the two subspaces, which we denote by α(M, N), by

    α(M, N) = arccos sup{|(x, y)| : x ∈ M, y ∈ N, ‖x‖ = ‖y‖ = 1}

and we take 0 ≤ α(M, N) ≤ π/2. It is obvious that inf{‖x − y‖ : x ∈ M, y ∈ N, ‖x‖ = ‖y‖ = 1} > 0 if and only if α(M, N) > 0.

Lemma 15-27 Let R : H → K be a boundedly invertible map between two Hilbert spaces. Then there exists a d > 0 such that for any pair of orthogonal subspaces M and N in H we have α(RM, RN) ≥ d.

PROOF It suffices to show that there exists a d > 0 such that for every pair of orthogonal subspaces M and N in H we have

    inf{‖m' − n'‖ : m' ∈ RM, n' ∈ RN, ‖m'‖ = ‖n'‖ = 1} ≥ d

We will show that we can take d = 2^{1/2}‖R‖^{−1}‖R^{−1}‖^{−1}. Note the two elementary inequalities

    ‖R^{−1}y‖ ≥ ‖R‖^{−1}‖y‖  and  ‖Rx‖ ≥ ‖R^{−1}‖^{−1}‖x‖

Let m' ∈ RM and n' ∈ RN be unit vectors and let m = R^{−1}m', n = R^{−1}n'. Then ‖m' − n'‖ ≥ ‖R^{−1}‖^{−1}‖m − n‖, or, using the orthogonality of m and n,

    ‖m' − n'‖² ≥ ‖R^{−1}‖^{−2}‖m − n‖² = ‖R^{−1}‖^{−2}(‖m‖² + ‖n‖²)
               ≥ ‖R^{−1}‖^{−2}(‖R‖^{−2}‖m'‖² + ‖R‖^{−2}‖n'‖²) = 2‖R^{−1}‖^{−2}‖R‖^{−2}

and this proves the lemma.
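The bound d = √2·‖R‖^{−1}‖R^{−1}‖^{−1} of Lemma 15-27 can be probed numerically: the minimal distance between unit vectors of two subspaces is √(2 − 2σ), where σ, the largest singular value of the product of orthonormal basis matrices, is the cosine of the smallest principal angle. A sketch with random data; the dimensions and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
dim, k = 8, 3

# orthogonal subspaces M, N of R^dim: disjoint groups of orthonormal columns
U, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
M, N = U[:, :k], U[:, k:2 * k]

R = rng.standard_normal((dim, dim)) + 2.0 * np.eye(dim)   # boundedly invertible
d = np.sqrt(2.0) / (np.linalg.norm(R, 2) * np.linalg.norm(np.linalg.inv(R), 2))

# orthonormal bases of the image subspaces RM and RN
QM, _ = np.linalg.qr(R @ M)
QN, _ = np.linalg.qr(R @ N)

# largest cosine of a principal angle between RM and RN
sigma = np.linalg.svd(QM.T @ QN, compute_uv=False).max()
min_dist = np.sqrt(max(2.0 - 2.0 * sigma, 0.0))

assert sigma < 1.0                 # the images intersect only at 0
assert min_dist >= d - 1e-12       # the lemma's lower bound holds
```

The observed minimal distance is typically much larger than d; the lemma only guarantees uniformity over all pairs of orthogonal subspaces.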
We can state now the main result about uniformly separated sequences.
Theorem 15-28 Let {λ_i} be a sequence of points in the open unit disc D. Then the following statements are equivalent.
(a) {λ_i} is a uniformly separated sequence, satisfying

    Π_{j≠k} |(λ_j − λ_k)/(1 − λ̄_jλ_k)| ≥ δ  for all k ≥ 1    (15-81)

(b) The map T defined by

    Tf = {f(λ_i)(1 − |λ_i|²)^{1/2}}    (15-82)

is a bounded map of H² onto l².
(c) Each of the two sequences of functions

    f_j(z) = (1 − |λ_j|²)^{1/2}/(1 − λ̄_jz),  j ≥ 1    (15-83)

and

    g_j(z) = (1 − |λ_j|²)^{1/2}B(z)/(z − λ_j),  j ≥ 1    (15-84)

is a Hilbertian and Besselian basis for H(B), where B is the Blaschke product whose zeros are the points λ_j.
(d) The operator S(B) defined in H(B) by (13-4) is similar to a normal operator.

We note that if the operator T defined by (15-82) is bounded, then f ∈ H² belongs to Ker T if and only if f(λ_i) = 0 for all i, which is equivalent to f ∈ BH², where B is the Blaschke product whose zeros are the points λ_i. This means that T restricted to H(B) = {BH²}^⊥ is a boundedly invertible map of H(B) onto l². We can restate statement (b) as follows.

(b') There exists a constant M such that for all f ∈ H²

    Σ_i |f(λ_i)|²(1 − |λ_i|²) ≤ M‖f‖²    (15-85)

and there exists another constant K, which can be taken to be 2δ^{−4}(1 − 2 log δ), such that for each sequence {α_i} in l² there exists a function f in H² for which

    f(λ_i) = α_i/(1 − |λ_i|²)^{1/2}    (15-86)

and

    ‖f‖² ≤ K‖α‖²    (15-87)

Before we prove the theorem we state and prove some lemmas.
Lemma 15-29 Let (a_{ij}), i, j = 1, 2, ..., be a Hermitian matrix, that is a_{ij} = ā_{ji}, and suppose for some constant M we have

    Σ_{j=1}^∞ |a_{ij}| ≤ M  for all i

Then for any sequence {ξ_j} in l² we have

    |Σ_{i,j=1}^∞ a_{ij}ξ_iξ̄_j| ≤ M Σ_{j=1}^∞ |ξ_j|²    (15-88)

PROOF We show first that

    |Σ_{i,j=1}^n a_{ij}ξ_iξ̄_j| ≤ M Σ_{j=1}^n |ξ_j|²    (15-89)

for all n. For the finite Hermitian matrix A_n = (a_{ij}), i, j = 1, ..., n, we have

    ‖A_n‖ = sup{ |Σ_{i,j=1}^n a_{ij}ξ_iξ̄_j| : Σ_{j=1}^n |ξ_j|² ≤ 1 }    (15-90)

and since ‖A_n‖ is equal to the largest absolute value of an eigenvalue of A_n, we have to get an estimate on these. If μ is an eigenvalue of A_n and (ξ_1, ..., ξ_n) a corresponding eigenvector, then

    μξ_i = Σ_{j=1}^n a_{ij}ξ_j

and therefore

    |μ||ξ_i| ≤ Σ_{j=1}^n |a_{ij}||ξ_j|

Summing up these inequalities over all i, interchanging the order of summation, and dividing by Σ_j |ξ_j|, we obtain |μ| ≤ M. Since the norms ‖A_n‖ are uniformly bounded by M, the matrix (a_{ij}) induces a bounded operator A on l² with ‖A‖ ≤ M; indeed it suffices to check the bound on a fundamental set, and for the standard orthonormal basis {e_i} of l² we clearly have (Ae_i, e_j) = a_{ji}. Hence (15-88) holds.

Lemma 15-30 Let {λ_i} be a uniformly separated sequence satisfying (15-81). Then

    Σ_{j=1}^∞ (1 − |λ_j|²)(1 − |λ_k|²)/|1 − λ̄_jλ_k|² ≤ 1 − 2 log δ  for all k ≥ 1    (15-91)
PROOF From (15-81) we obtain

    −Σ_{j≠k} log|(λ_k − λ_j)/(1 − λ̄_jλ_k)|² ≤ −log δ² = −2 log δ    (15-92)

Applying the inequality −log y ≥ 1 − y to the identity

    (1 − |λ_j|²)(1 − |λ_k|²)/|1 − λ̄_jλ_k|² = 1 − |(λ_k − λ_j)/(1 − λ̄_jλ_k)|²    (15-93)

we obtain, by summing over all j ≠ k, that

    Σ_{j≠k} (1 − |λ_j|²)(1 − |λ_k|²)/|1 − λ̄_jλ_k|² ≤ −Σ_{j≠k} log|(λ_k − λ_j)/(1 − λ̄_jλ_k)|² ≤ −2 log δ

which is equivalent, adding 1 to both sides of the inequality, to (15-91).

PROOF OF THEOREM 15-28 We will prove the implications (a) → (b) → (c) → (d) → (a). Let us assume first that {λ_i} is a uniformly separated sequence. We already know that there exists a Blaschke product B whose zero set is {λ_i}. Let us define also
    B_n(z) = Π_{i=1}^n (z − λ_i)/(1 − λ̄_iz)    (15-94)

    B_{nj}(z) = B_n(z)(1 − λ̄_jz)/(z − λ_j)  for j ≤ n    (15-95)

and

    b_{nj} = B_{nj}(λ_j)    (15-96)

Let us solve the finite interpolation problem of finding a function h_n ∈ H² that satisfies h_n(λ_j) = α_j/(1 − |λ_j|²)^{1/2} for j = 1, ..., n. Now the function B_{nj}(z)/b_{nj} satisfies

    B_{nj}(λ_k)/b_{nj} = δ_{jk}  for 1 ≤ j, k ≤ n    (15-97)

Therefore a solution to the finite interpolation problem is given by

    h_n(z) = Σ_{j=1}^n α_jB_{nj}(z)/(b_{nj}(1 − |λ_j|²)^{1/2})
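This explicit solution can be verified numerically; the sample points and the target values below are arbitrary choices.

```python
import numpy as np

lam = np.array([0.1, -0.3 + 0.2j, 0.5j, 0.7])      # distinct points in D
alpha = np.array([1.0, -2.0, 0.5, 3.0])
n = len(lam)

def B_nj(z, j):
    # the finite Blaschke product (15-95): all factors of (15-94) except the j-th
    return np.prod([(z - a) / (1 - np.conjugate(a) * z)
                    for i, a in enumerate(lam) if i != j])

b = np.array([B_nj(lam[j], j) for j in range(n)])   # b_nj of (15-96)

def h_n(z):
    # the interpolation formula displayed above
    return sum(alpha[j] * B_nj(z, j) / (b[j] * np.sqrt(1 - abs(lam[j]) ** 2))
               for j in range(n))

# h_n interpolates: h_n(lam_j) = alpha_j / (1 - |lam_j|^2)^{1/2}
values = np.array([h_n(lam[j]) for j in range(n)])
target = alpha / np.sqrt(1 - np.abs(lam) ** 2)
assert np.allclose(values, target)
```

Each term kills every node except its own, exactly as in (15-97), so the cross terms vanish identically.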
Moreover, any other solution of the finite interpolation problem differs from h_n by B_ng for some g in H². To find the solution of minimal norm we have to compute inf_{g∈H²} ‖h_n − B_ng‖. We rewrite h_n − B_ng as follows

    h_n(z) − B_n(z)g(z) = B_n(z){ Σ_{j=1}^n α_j(1 − λ̄_jz)/(b_{nj}(z − λ_j)(1 − |λ_j|²)^{1/2}) − g(z) }

Since B_n is inner we have

    ‖h_n − B_ng‖ = ‖ Σ_{j=1}^n α_j(1 − λ̄_jz)/(b_{nj}(z − λ_j)(1 − |λ_j|²)^{1/2}) − g ‖    (15-98)

If M is any subspace of a Hilbert space then we know that

    inf{‖x − y‖ : y ∈ M} = sup{|(x, k)| : k ∈ M^⊥, ‖k‖ ≤ 1}

Applying this identity to (15-98), with H² regarded as a subspace of L², we have

    inf_{g∈H²} ‖h_n − B_ng‖
        = sup{ |(1/2π)∫ Σ_{j=1}^n α_j(1 − λ̄_je^{it})/(b_{nj}(e^{it} − λ_j)(1 − |λ_j|²)^{1/2})·k(e^{it}) dt| : k ∈ H²_0, ‖k‖ ≤ 1 }

If k ∈ H²_0 then k(e^{it}) = e^{it}f(e^{it}) for some f ∈ H². Thus we obtain finally, evaluating the integral by the use of Cauchy's theorem, that

    inf_{g∈H²} ‖h_n − B_ng‖ = sup{ |Σ_{j=1}^n (α_j/b_{nj})f(λ_j)(1 − |λ_j|²)^{1/2}| : f ∈ H², ‖f‖ = 1 }

Given a sequence α = {α_j} in l² which satisfies ‖α‖² = Σ|α_j|² ≤ 1, we define

    m_n(α) = sup{ |Σ_{j=1}^n (α_j/b_{nj})f(λ_j)(1 − |λ_j|²)^{1/2}| : f ∈ H², ‖f‖ = 1 }

Then, assuming (15-81) holds, so that δ ≤ |b_{nj}| ≤ 1, we easily obtain

    sup{ Σ_{j=1}^n |f(λ_j)|²(1 − |λ_j|²) : f ∈ H², ‖f‖ = 1 } ≤ [sup_n sup_α m_n(α)]²

with a reverse inequality up to the factor δ^{−2}, and we conclude that (15-85) holds for all f ∈ H² if and only if sup_n sup_α m_n(α) < ∞.

To complete the proof we will show that for K = 2δ^{−4}(1 − 2 log δ), (15-86) and (15-87) hold. To this end define
    g_{nj}(z) = (1 − |λ_j|²)^{3/2}B_n(z)²/(z − λ_j)² = (1 − |λ_j|²)^{3/2}B_{nj}(z)²/(1 − λ̄_jz)²

We note that

    g_{nj}(λ_k) = δ_{jk}b_{nj}²/(1 − |λ_j|²)^{1/2}

and therefore

    φ_n(z) = Σ_{j=1}^n (α_j/b_{nj}²)g_{nj}(z)

satisfies

    φ_n(λ_j) = α_j/(1 − |λ_j|²)^{1/2}    (15-99)

Next we estimate ‖φ_n‖. We have

    ‖φ_n‖² = Σ_{i,j=1}^n (α_jᾱ_i/(b_{nj}²b̄_{ni}²))(g_{nj}, g_{ni})    (15-100)

and since B_n is an inner function

    (g_{ni}, g_{nj}) = (1 − |λ_i|²)^{3/2}(1 − |λ_j|²)^{3/2}·(1/2π)∫ dt/((e^{it} − λ_i)²(e^{−it} − λ̄_j)²)
                     = (1 − |λ_i|²)^{3/2}(1 − |λ_j|²)^{3/2}(1 + λ̄_jλ_i)/(1 − λ̄_jλ_i)³    (15-101)

From the trivial inequality 2 Re λ̄_iλ_j ≤ |λ_i|² + |λ_j|² we obtain

    (1 − |λ_i|²)^{1/2}(1 − |λ_j|²)^{1/2} ≤ |1 − λ̄_iλ_j|

Substituting this back into (15-101) we get

    |(g_{ni}, g_{nj})| ≤ 2(1 − |λ_i|²)(1 − |λ_j|²)/|1 − λ̄_iλ_j|²

so that, by Lemma 15-30, Σ_j |(g_{ni}, g_{nj})| ≤ 2(1 − 2 log δ) for every i. Using Lemma 15-29, the lower bound |b_{nj}| ≥ δ, and (15-100), we obtain

    ‖φ_n‖² ≤ 2δ^{−4}(1 − 2 log δ)Σ_k |α_k|²    (15-102)
Since the sequence (pn is uniformly bounded there is a subsequence that converges weakly to a function f in H2 whose norm is bounded by the same bound. Since weak convergence in H2 implies pointwise convergence in the open unit disc the function f satisfies (15-86) and (15-87). To prove that (b) implies (c) we assume the map T defined by (15-82) is
a bounded map of H2 onto 12. As we saw this means that its restriction to H(B) = {Ker T j' is boundedly invertible. Let {e1 } be the standard ortho-
normal basis in l2 then the functions di a H(B) defined by di = T-'ei are uniformly bounded. Moreover di has to satisfy d;(1)
Vij
(15-103)
IAjI2)1/2
We can easily derive an explicit expression for di. Indeed if B is the Blaschke product that corresponds to the set of zeroes {.li} 1 then we let
2M
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE
B1l1(z) = B(z)((l - A;z)/(z - A1)), that is B(') is the Blaschke product that corresponds to the same set of zeroes with A; excluded. It is easy to check that necessarily B(z)
(15-104)
Using the fact that B is inner the norm of d; turns out to be equal to
Since the d; are uniformly bounded this means that IBI')(A;)l is bounded away from zero. So in particular {A;} is a uniformly separated sequence. We note that the map T : H(B) 1 z defined by (15-82) is a bounded map for which Td, = e, where d, is defined by (15-104). By Theorem 15-26 we infer that {d1} is a Besselian basis for H(B). Now it is easily seen that a sequence {A;} is uniformly separated if and only if the sequence {A;} is. To the sequence ii} corresponds to Blaschke product $(z) and so the sequence of functions { j} is a Besselian basis for H($). We compute now the biorthogonal sequence that corresponds to {d; I. Let { f; } be the sequence of functions
defined by (15-83) then (f, f,) _ (1 - 12;12)"2 f(A;) for all feH2 and moreover all f; belong to H(B). Therefore we have (d1,.f1) = (1 -
IA;I2)''2
d1(A;) = bl;
and as a consequence of Theorem 15-26 it follows that { f; } is a Hilbertian basis for H(B). Now let TB: H(B) --> H(s) be the unitary map defined by (13-7) then by an elementary computation we have (15-105) Te9j =J This shows that %) is a Hilbertian and Besselian basis for H(s) and so
TB.fj = 9j
{ f; } is a Hilbertian and Besselian basis for H(B). The result for {g; } follows by symmetry. Of course {g,) and {d; } differ only by factors of a bounded sequence which is also bounded away from zero. So one is Besselian, or Hilbertian, if and only if the other is. Finally assume that B is an inner function in H°° for which S(B) is similar to a normal operator. Since similarity preserves invariant subspaces
and since for a normal operator every invariant subspace is reducing it follows that every invariant subspace M of S(B) has a complementary invariant subspace N such that H(B) = M + N and M n N = {O). By Lemma 15-27 the angle between M and N is positive. If M = H(B1) and N = H(B2) it
follows that H(B) = H(B1) + H(B2) with B = BIB2. The condition
H(B1) n H(B2) = {0} is equivalent to B1 A B2 = 1. Since the angle between H(B1) and H(B2) is positive actually more is true as will be proved in the next chapter. However, since for an arbitrary factorization B1B2 of B we have B1 A B2 = 1 we conclude that B is a Blaschke product with simple
zeroes Let {A;} be the set of zeroes of B. Clearly the functions g;(z) = (1 - jA;j2)'12 {B(z)/(z - A1)} are eigenfunctions of S(B), corresponding to the eigenvalues A;, and they span H(B). Let L be a normal operator in a Hilbert
space H which is similar to S(B). H has therefore an orthonormal basis
{e_i} consisting of eigenvectors of L. Assume RL = S(B)R and let M_i and N_i be the one-dimensional subspaces of H and H(B) spanned by e_i and g_i, respectively. By Lemma 15-27 the angle between N_i and {B^(i)H²}^⊥ is bounded away from zero, uniformly in i. We compute

    inf{ ||g_i - x|| : x ∈ H(B^(i)) }

which by the projection theorem is equal to ||g_i - P_{H(B^(i))} g_i||. This projection is easily computed to be

    (P_{H(B^(i))} g_i)(z) = (1 - |λ_i|²)^{1/2} {B^(i)(z) - B^(i)(λ_i)}/(z - λ_i)

and hence ||g_i - P_{H(B^(i))} g_i|| = |B^(i)(λ_i)|. Since

    B^(i)(λ_i) = ∏_{j≠i} (λ_i - λ_j)/(1 - λ̄_j λ_i)

it follows that the sequence {λ_i} is uniformly separated, which completes the proof.
As a corollary we obtain the solution to the H∞-interpolation problem.

Theorem 15-31 A necessary and sufficient condition for {λ_i} to be an ∞-interpolating sequence is that it be uniformly separated.

PROOF Assume {λ_i} is an ∞-interpolating sequence. Consider the sequences {δ_ki}_k, and let f_i be interpolating functions such that

    f_i(λ_k) = δ_ki        (15-106)

and

    ||f_i|| ≤ M ||{δ_ki}|| = M        (15-107)

hold. If B_i is the Blaschke product corresponding to the zeroes {λ_k}_{k≠i} then B_i divides f_i. So f_i = B_i h_i with ||h_i|| = ||f_i|| ≤ M, and therefore

    1 = f_i(λ_i) = B_i(λ_i) h_i(λ_i)

implies |B_i(λ_i)| ≥ δ with δ = M^{-1}. Since

    |B_i(λ_i)| = ∏_{j≠i} |(λ_i - λ_j)/(1 - λ̄_j λ_i)|

it follows that {λ_i} is uniformly separated.
Conversely assume {λ_i} is uniformly separated and let {a_i} be an l∞ sequence. Define bounded operators A and L in l² by

    A e_i = a_i e_i    and    L e_i = λ_i e_i

where {e_i} is the standard orthonormal basis of l². Clearly AL = LA, and so if R: l² → H(B) is defined by

    R Σ a_i e_i = Σ a_i f_i

then R is a boundedly invertible operator, S(B) = R L R^{-1}, and R A R^{-1} is in the commutant of S(B). By the scalar version of Theorem 14-8 there exists a function φ in H∞ such that φ(S(B)) = R A R^{-1}. Since φ(λ_i) = a_i, {λ_i} is an interpolating sequence.
NOTES AND REFERENCES

The study of Hilbert spaces began with the work of Hilbert and others on integral equations. The axiomatic introduction of Hilbert spaces and much of the early theory is due to von Neumann. Among the other pioneers in the field one should mention Riesz, Fischer, and Stone. There exist several excellent introductions to Hilbert spaces and the theory of linear operators. Among others Stone
[111], Akhiezer and Glazman [2], Riesz and Sz.-Nagy [99], Halmos [62, 64], Douglas [24], as well as the relevant parts of Dunford and Schwartz [29], should be mentioned. Unbounded operators were studied by von Neumann, who was the first to prove the spectral theorem for unbounded self-adjoint operators. To von Neumann is also due the Cayley transform technique. The integral representations obtained in Sec. 4 have been the object of study of many mathematicians early in the century, among them Caratheodory, Herglotz, Toeplitz, Bochner, and Nevanlinna. For harmonic functions and the Poisson integral one can refer to Hoffman [73]. The exposition in the text of the relation of the representation theorems to the spectral theorem and moment problems follows Akhiezer and Glazman [2]. A short and beautifully written introduction to spectral theory is in Lorch [85]. The commutative B*-algebra approach to the spectral theorem may be found in Dunford and Schwartz [29] and Douglas [24]. The theory of spectral representations and multiplicity theory is developed in Halmos [62], Dunford and Schwartz [29], Beals [10], and Plessner [98], to cite a few. The exposition in Sec. 6 uses L² spaces of matrix measures which, though less general than, say, the use of direct integrals of Hilbert spaces, is more concrete.
The writing of this section has been motivated in part by Brown [18] and Nelson [94]. The factorization lemma of Douglas appears in [23]. Embry's [34] contains a correct Banach space formulation. Theorem 7-5 is from [41]. Wold's decomposition is from [122] though it has been obtained earlier by von Neumann. The study of shifts and models received its impetus from Beurling's fundamental paper [11] characterizing the invariant subspaces of the shift as well as Rota's elegant paper on models [101]. Other motivating sources were Livsic's introduction of characteristic functions [83], scattering theory as developed by Lax and Phillips [82], work in prediction theory by Wiener and Masani [121] and Helson and Lowdenslager [68]. An important influence was the work of de Branges and Rovnyak on the invariant subspace problem [14] and finally the large body of work of Sz.-Nagy and Foias on contractions [115, 116-118]. Three
excellent references are Fillmore [37], Helson [67], and the survey of Douglas [25]. There is no attempt to credit all who contributed to the theory and only a few theorems are cited. Dilations and compressions were introduced by Halmos [61], minimal unitary dilations by Sz.-Nagy [112], the latter being a special case of Naimark's Theorem 9-8 on unitary representations of positive definite functions on groups [92]. The structure of the minimal dilation space has been elucidated by Schaffer [105] and Sz.-Nagy and Foias [116]. Semigroup theory is very completely covered in Hille and Phillips [72]. Other references are Yosida [125] and Dunford and Schwartz [29]. The proof of the Hille-Yosida theorem in the text follows Lax and Phillips [82]. Infinitesimal cogenerators were introduced by Sz.-Nagy and Foias [115].
For the Fourier-Plancherel transform basic references are Akhiezer and Glazman [2], Bochner and Chandrasekharan [13], and Katznelson [78]. In particular the proof of Theorem 10-13 can be found in [13]. Theorem 10-17 and its discrete analog are due to Sz.-Nagy and Foias. Theorems 10-18 and 10-23 are from Lax and Phillips [82]. The results on isometric semigroups are due to Cooper [21] and Sz.-Nagy [113]. The proof of the lifting theorem of Sec. 11 is due to Douglas, Muhly, and Pearcy [27]. The scalar version of the theorem was proved first by Sarason [104], whereas the general case is due to Sz.-Nagy and Foias [115]. For Hᵖ-theory the most readable account is Hoffman's excellent exposition [73]. Another comprehensive survey is Duren [30], which also contains a proof of Carleson's corona theorem. The proof of the Paley-Wiener theorem follows Dym and McKean [31]. The connection between the Hᵖ spaces of the disc and the half plane appears in Hoffman [73]. Theorem 12-20 is due to Lax [81] in the vectorial case. The extension of Beurling's theorem given in Theorem 12-22 is due to Halmos [63]. For vectorial H²-theory Helson [67] and Sz.-Nagy and Foias [115] are good sources. Our exposition follows in large part Helson. Theorem 12-27 is adapted from [82]. The content of Sec. 13 lies halfway between Rota's paper [101] and the Sz.-Nagy and Foias theory of characteristic functions and functional models [115]. The convenient unitary maps T_Q introduced in Theorem 13-2 are from [39]. They have their counterpart in the Sz.-Nagy and Foias work. Lemma 13-7 is
adapted from its continuous analog in Lax and Phillips [82]. Theorem 13-8 originated with Moeller [89], Lax and Phillips [82], Helson [67], and Sz.-Nagy and Foias [115]. The functional calculus constructed in Sec. 14 is due to Sz.-Nagy and Foias [115]. The spectral analysis of Theorems 14-4 and 14-5 is from [39]; so is the matrix version of the corona theorem. The extension to the spectral analysis of operators intertwining compressions of shifts is from [40]. Further extensions appear in Sz.-Nagy and Foias [118]. The study of diagonalization of matrices over H∞ was initiated by Nordgren [97], who also introduced quasiequivalence. The present exposition follows mainly Sz.-Nagy [114]. The fundamental Lemma 15-1 is due to Sherman [108]. Independently it has been proved by the author in [50] as an extension
of a lemma of Wonham. Lemma 15-12 is from Ahern and Clark [1]. Theorem 15-18 is due to Moore and Nordgren [91]. Theorem 15-25 is from Sz.-Nagy and Foias [116] while generally the theory of Jordan models follows [114, 117]. The material on bases in left invariant subspaces is based on Shapiro and Shields [107] as well as Nikolskii and Pavlov [95].
CHAPTER THREE
LINEAR SYSTEMS IN HILBERT SPACE
1. FUNDAMENTAL CONCEPTS

As will become clear in this chapter there is a great similarity in the formalism of finite and infinite dimensional linear systems. However, the infinite dimensional cases abound with a variety of phenomena missing from the finite dimensional situations. In the following we will focus our attention on the theory of modeling infinite dimensional linear time invariant systems in state space form. We will consider solely systems with a finite number of inputs and outputs. The need for such a theory is obvious inasmuch as realistic modeling of most physical systems must include the distributed effects. Whereas in some cases these effects can be safely ignored, there are various situations when they have to be taken into account. We note that while it is possible by experimental tests to conclude that a system is not finite dimensional, it seems much harder, if not altogether impossible, to conclude that a system is finite dimensional. This indicates that it may be better to view finite dimensional systems as specializations of infinite dimensional ones, as opposed to viewing infinite dimensional systems as extensions of finite dimensional ones. We note also that to specify an infinite dimensional system does not require an infinite number of parameters. Thus the nonrational transfer function e^{-αs}/(s + β) does not have a finite dimensional realization but can be specified by the two parameters α and β. Finally, even if one were to replace an infinite dimensional system by a finite dimensional approximation, to do it the right way would necessarily require the development of a complete state space theory for infinite dimensional systems.
We will restrict ourselves to systems which can be realized on state spaces which are Hilbert spaces. This is certainly not the most general framework discussed in the literature; one alternative would be to use a distributional framework. Our assumption, though somewhat more restrictive, has the advantage
of yielding a richer theory and a very explicit structure theory for the dynamical models. In fact it is this insistence on getting at a precise description of the internal structure which forces the various assumptions made in this chapter. Our aim will be to develop a theory of infinite dimensional systems that encompasses both the discrete and continuous time cases. As in the finite dimensional case, to specify a system, or rather a constant, or time invariant, linear system, we need a quadruple (A, B, C, D) of operators and three spaces U, K, and Y. The input and output value spaces are U and Y and will be assumed finite dimensional. K denotes the state space and will be assumed to be a Hilbert space. The operators are assumed linear, with A: K → K, B: U → K, C: K → Y, and D: U → Y. In the discrete time case (A, B, C, D) stands for the set of dynamical equations
    x_{n+1} = A x_n + B u_n
    y_n = C x_n + D u_n        (1-1)

whereas in the continuous time case the system equations will be

    ẋ(t) = A x(t) + B u(t)
    y(t) = C x(t) + D u(t)        (1-2)
Of course these definitions are strictly formal and meaningless unless further assumptions are made on the spaces as well as the operators involved.
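When the state space is finite dimensional the discrete time equations (1-1) can be exercised directly. The following is a minimal sketch, with an arbitrarily chosen illustrative quadruple (A, B, C, D); none of these values come from the text.

```python
# Minimal simulation of the discrete time equations (1-1):
#   x_{n+1} = A x_n + B u_n,   y_n = C x_n + D u_n.
# The quadruple below is an arbitrary, illustrative finite dimensional example.

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

A = [[0.5, 0.1], [0.0, 0.3]]   # state operator A : K -> K
B = [[1.0], [1.0]]             # input operator B : U -> K
C = [[1.0, 0.0]]               # output operator C : K -> Y
D = [[0.0]]                    # feedthrough  D : U -> Y

def simulate(inputs, x0=(0.0, 0.0)):
    """Run (1-1) over a finite input string, returning the outputs y_n."""
    x, ys = list(x0), []
    for u in inputs:
        ys.append(vadd(matvec(C, x), matvec(D, u))[0])
        x = vadd(matvec(A, x), matvec(B, u))
    return ys

ys = simulate([[1.0], [0.0], [0.0]])   # response to an impulse at n = 0
```

With D = 0 the impulse response begins 0, CB, CAB, ..., which is exactly the weighting pattern discussed later in this section.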
Even a cursory acquaintance with modern operator theory, its extreme richness on the one hand and its essential incompleteness on the other, suggests immediately that there are only very few general results one could expect, and that it is only by restricting the setting that we can expect to develop an interesting theory. Here one must be guided both by physical intuition and by a sense of mathematical aesthetics. Our choice of Hilbert spaces could be justified in certain cases as ensuing naturally out of energy considerations, but mostly it is done for mathematical convenience. Within this general framework it is possible to develop a relatively complete and satisfying theory of systems. To continue with our introduction we consider for the time being only discrete time systems. We assume all operators A, B, C, and D to be bounded and linear.
Let u_{-j} ∈ U denote an input applied at time t = -j. Assuming the system to have been at rest in the remote past, we obtain at time t = 1 the state

    x_1 = Σ_{j=0}^∞ A^j B u_{-j}        (1-3)
Contrary to the development in Chap. I we may consider an infinite number of nonzero inputs, but we have to assume that the series in (1-3) converges. If as of time t = 1 no further inputs are applied, we obtain a sequence of outputs

    y_k = Σ_{j=0}^∞ C A^{j+k-1} B u_{-j}        (1-4)
Let us consider now the space of input strings, or input functions, to be l²(-∞, 0; U) and the space of output functions to be l²(1, ∞; Y). We define a map f: l²(-∞, 0; U) → l²(1, ∞; Y) by

    f({u_j}_{j=-∞}^0) = {y_k}_{k=1}^∞        (1-5)

where the y_k are determined by (1-4).
We call f the restricted input/output map of the system. We note that the restricted input/output map does not depend on the operator D. Of course initially f is not defined on all of l²(-∞, 0; U), as the series (1-3) may fail to converge. Thus the initial domain of definition of f is the dense set of all finitely nonzero sequences in l²(-∞, 0; U). The map f may or may not extend by continuity to all of l²(-∞, 0; U). Even in the latter case its range may fail to be in l²(1, ∞; Y). Thus our first restriction will be to study only systems whose restricted input/output map f, defined by (1-4) and (1-5), extends to a bounded linear map of l²(-∞, 0; U) into l²(1, ∞; Y). Let us denote by S*_- and S*_+ the left shifts in l²(-∞, 0; U) and l²(1, ∞; Y), respectively. Thus
    S*_- {u_{-j}}_{j=0}^∞ = {v_{-j}}_{j=0}^∞        (1-6)

with

    v_{-j} = u_{-j+1}   for j > 0
    v_0 = 0                                         (1-7)

whereas

    S*_+ {y_j}_{j=1}^∞ = {z_j}_{j=1}^∞              (1-8)

with

    z_j = y_{j+1}   for j ≥ 1                       (1-9)
Obviously S*_- is unitarily equivalent to the right shift in l²(0, ∞; U). It is clear from the definition of f that it necessarily satisfies the functional equation

    f S*_- = S*_+ f        (1-10)

We take this functional equation as our intrinsic definition of a restricted input/output map, which for us is any bounded operator f: l²(-∞, 0; U) → l²(1, ∞; Y) that satisfies (1-10).
It is of interest to derive a matrix representation for f. If we write {u_{-j}}_{j=0}^∞ and {y_k}_{k=1}^∞ as column vectors, then (1-4) can be rewritten as

    [ y_1 ]   [ CB      CAB     CA²B    ... ] [ u_0    ]
    [ y_2 ] = [ CAB     CA²B    CA³B    ... ] [ u_{-1} ]        (1-11)
    [ y_3 ]   [ CA²B    CA³B    CA⁴B    ... ] [ u_{-2} ]
    [ ...  ]  [ ...                         ] [ ...    ]
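A finite section of the block Hankel matrix in (1-11) can be assembled directly from the Markov parameters CA^jB. The sketch below uses a hypothetical scalar (one-input, one-output) example; the helper names are our own.

```python
# Finite truncation of the block Hankel matrix in (1-11), built from the
# Markov parameters C A^j B of a hypothetical scalar system.

def markov(A, B, C, n):
    """Return the Markov parameters C A^j B, j = 0 .. n-1, in the scalar case."""
    params, col = [], [b[0] for b in B]          # col holds A^j B as a vector
    for _ in range(n):
        params.append(sum(c * x for c, x in zip(C[0], col)))
        col = [sum(a * x for a, x in zip(row, col)) for row in A]
    return params

def hankel_section(params, rows, cols):
    """Finite section H[k][j] = params[k + j] of the matrix in (1-11)."""
    return [[params[k + j] for j in range(cols)] for k in range(rows)]

A = [[0.5, 0.0], [0.0, 0.25]]
B = [[1.0], [1.0]]
C = [[1.0, 1.0]]

p = markov(A, B, C, 6)          # CB, CAB, CA^2 B, ...
H = hankel_section(p, 3, 3)     # constant along antidiagonals
```

The characteristic Hankel structure, H[k][j] depending only on k + j, is exactly the statement that the map commutes with the shifts as in (1-10).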
Thus the matrix representation of f is a block Hankel matrix. This leads us to define a Hankel operator to be any bounded operator H: l²(-∞, 0; U) → l²(1, ∞; Y) which satisfies (1-10). We define an extended causal input/output map to be a bounded linear map f̄: l²(-∞, ∞; U) → l²(-∞, ∞; Y) which satisfies

    f̄ U = U f̄        (1-12)

and

    f̄ l²(0, ∞; U) ⊂ l²(0, ∞; Y)        (1-13)

where U denotes the bilateral right shift in l²(-∞, ∞; U) and l²(-∞, ∞; Y). We say that f̄ is strictly causal if (1-13) is replaced by

    f̄ l²(0, ∞; U) ⊂ l²(1, ∞; Y)        (1-14)

It is natural to inquire whether a given restricted input/output map f can be extended to an extended causal input/output map f̄. A straightforward application of Theorem II 11-7 yields the following.
Theorem 1-1 Let f: l²(-∞, 0; U) → l²(1, ∞; Y) be a restricted input/output map. Then there exists a map f̄: l²(-∞, ∞; U) → l²(-∞, ∞; Y) satisfying

    ||f̄|| = ||f||        (1-15)

as well as (1-12), and for which

    f = P_{l²(1,∞;Y)} f̄ | l²(-∞, 0; U)        (1-16)
Of course Theorem 1-1 does not say anything about causality. In fact to study causal extensions it is convenient to work with a functional representation for f and f̄, and to this end we utilize the Fourier transform. Under the Fourier transform ℱ we have ℱl²(-∞, ∞; U) = L²_U, ℱl²(-∞, ∞; Y) = L²_Y, ℱl²(-∞, 0; U) = H̄²_U, and ℱl²(1, ∞; Y) = H²_{0,Y}. Since f̄ commutes with the bilateral shift, its image under the Fourier transform H = ℱ f̄ ℱ^{-1} commutes with all multiplication operators by scalar bounded measurable functions. It follows that H is a multiplication operator by a function in L∞(B(U, Y)). This implies the following result.

Theorem 1-2 Let H: H̄²_U → H²_{0,Y} be a bounded Hankel operator. Then there exists a function T ∈ L∞(B(U, Y)) such that

    Hg = P_{H²_{0,Y}} Tg   for all g ∈ H̄²_U        (1-17)
We say that H is the Hankel operator induced by T and write H = H_T. Since T is in L∞(B(U, Y)) it has a Fourier expansion of the form

    T(e^{it}) = Σ_{n=-∞}^∞ T_n e^{int}

with the T_n linear maps from U to Y. Since the extension f̄ acts as multiplication by T on L²_U, the causality condition

    T H²_U ⊂ H²_Y        (1-18)

is satisfied if and only if T_n = 0 for n < 0. Strict causality is equivalent to T_n = 0 for n ≤ 0.
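The vanishing of the negative Fourier coefficients for a causal symbol can be checked numerically for a concrete function. The sketch below uses the hypothetical symbol T(e^{it}) = 1/(1 - e^{it}/2), which lies in H∞, and approximates its Fourier coefficients by a Riemann sum.

```python
# Numerical check of the causality criterion: for a symbol in H-infinity the
# negative Fourier coefficients vanish. Here, for the hypothetical symbol
#   T(e^{it}) = 1/(1 - e^{it}/2) = sum_{n >= 0} 2^{-n} e^{int}.

import cmath

def fourier_coeff(symbol, n, N=4096):
    """Riemann-sum approximation of (1/2pi) * integral of T(e^{it}) e^{-int} dt."""
    s = 0j
    for k in range(N):
        t = 2 * cmath.pi * k / N
        s += symbol(cmath.exp(1j * t)) * cmath.exp(-1j * n * t)
    return s / N

T = lambda w: 1.0 / (1.0 - 0.5 * w)

t_neg = fourier_coeff(T, -1)    # should vanish: the symbol is causal
t_0 = fourier_coeff(T, 0)       # should be 1
t_2 = fourier_coeff(T, 2)       # should be 1/4
```

Since the Taylor coefficients 2^{-n} decay geometrically, the aliasing error of the N-point sum is negligible and the computed negative coefficient is zero to machine precision.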
Corollary 1-3 Let f: l²(-∞, 0; U) → l²(1, ∞; Y) be a restricted input/output map. Then f can be extended to a causal input/output map f̄ if and only if the Hankel operator is induced by a function

    T(e^{it}) = Σ_{n=0}^∞ T_n e^{int}   in H∞(B(U, Y))
Given a dynamical system (A, B, C, D) we define the transfer function T of the system by

    T(z) = D + zC(I - zA)^{-1}B        (1-19)

T is analytic on the set {λ : λ^{-1} ∈ ρ(A)}, and for all values of z in a sufficiently small neighborhood of the origin, at least for all z such that |z| < ||A||^{-1}, it is given by

    T(z) = D + Σ_{j=0}^∞ C A^j B z^{j+1}        (1-20)
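The agreement of the resolvent form (1-19) with the power series (1-20) is easy to verify numerically for a small quadruple; the 2 × 2 example below is arbitrarily chosen for illustration.

```python
# Check, for a hypothetical 2x2 example, that the resolvent form (1-19) of the
# transfer function agrees with the power series (1-20) for |z| < 1/||A||.

A = [[0.5, 0.2], [0.0, 0.25]]
B = [1.0, 2.0]
C = [1.0, -1.0]
D = 0.5
z = 0.3

# T(z) = D + z C (I - zA)^{-1} B, via the explicit 2x2 inverse.
m = [[1 - z * A[0][0], -z * A[0][1]], [-z * A[1][0], 1 - z * A[1][1]]]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
x = [(m[1][1] * B[0] - m[0][1] * B[1]) / det,
     (m[0][0] * B[1] - m[1][0] * B[0]) / det]        # x = (I - zA)^{-1} B
T_resolvent = D + z * (C[0] * x[0] + C[1] * x[1])

# T(z) = D + sum_j C A^j B z^{j+1}, truncated well past what convergence needs.
T_series, col = D, B[:]
for j in range(60):
    T_series += z ** (j + 1) * (C[0] * col[0] + C[1] * col[1])
    col = [A[0][0] * col[0] + A[0][1] * col[1],
           A[1][0] * col[0] + A[1][1] * col[1]]
```

Here |z| · ||A|| is well below 1, so the truncated series is accurate to machine precision.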
The system (A, B, C, D) has associated with it an extended causal input/output map if and only if its transfer function is in H∞(B(U, Y)). The transfer function can be considered as the Fourier transform of the sequence

    (D, CB, CAB, ...)        (1-21)

which is called the weighting pattern or impulse response of the system.
which is called the weighting pattern or impulse response of the system. Next we associate with any given system (A, B, C, D) a reachability operator
R and observability operator O. R is defined on the set of finitely nonzero sequences 12(- co, 0; U) by
R({u;E AjBu_;
(1-22)
0 = {CAjx}j o
(1-23)
i=o
whereas 0 is defined by for each x in the state space.
Again there is an ambiguity concerning the natural domain of definition of R and the range of O. Contrary to the finite dimensional situation, there are various possibilities regarding the definition of reachability and observability, all of which use the basic formulas (1-22) and (1-23).

The system (A, B, C), where we omit D as it does not influence the dynamic behavior of the system, is called reachable if the reachability operator R has a range that is dense in the state space. This is clearly equivalent to

    ∩_{j=0}^∞ Ker B*A*^j = {0}        (1-24)
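In the finite dimensional case density of the range of R reduces to the familiar rank test rank[B, AB, ..., A^{n-1}B] = n. A sketch with a hypothetical pair (A, b) and a plain Gaussian elimination rank routine (all names ours):

```python
# Finite dimensional analogue of condition (1-24): the pair (A, b) is reachable
# iff the columns b, Ab, ..., A^{n-1}b span the n-dimensional state space.

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rank(cols, tol=1e-12):
    """Rank of the matrix whose columns are the given vectors, by elimination."""
    rows = [list(r) for r in zip(*cols)]
    r = 0
    for c in range(len(cols)):
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]), default=None)
        if pivot is None or abs(rows[pivot][c]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

A = [[0.5, 1.0], [0.0, 0.5]]   # a Jordan block, reachable from b below
b = [0.0, 1.0]

cols, v = [], b
for _ in range(2):             # collect b and Ab for a 2-dimensional state space
    cols.append(v)
    v = matvec(A, v)

reachable = rank(cols) == 2
```

The infinite dimensional condition (1-24) is the dual formulation of this span requirement.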
A system is strongly reachable if any state x is reachable from the origin through the application of a finite input string. That is, for each x there exist u_0, u_{-1}, ..., u_{-n+1} such that x = Σ_{j=0}^{n-1} A^j B u_{-j}. As a consequence of Theorem II 7-5, if a system is strongly reachable then there is a uniform bound on the length of the input string needed to reach any given state. This means that for some n the map R_n: U^n → K defined by

    R_n(u_0, ..., u_{-n+1}) = Σ_{j=0}^{n-1} A^j B u_{-j}        (1-25)

is a surjective map. By Theorem II 7-1 this implies that R_n R_n* is a strictly positive operator, or equivalently that for some δ > 0

    Σ_{j=0}^{n-1} A^j B B* A*^j ≥ δ² I        (1-26)

Since we assume U to be finite dimensional, BB* is an operator of finite rank, and so (1-26) can hold only if K is finite dimensional. Thus a finite input infinite dimensional system cannot be strongly reachable [41].
There are other definitions of reachability which are appropriate in the infinite dimensional situation. A system (A, B, C) is called continuously reachable if its reachability operator R extends to a bounded operator from l²(-∞, 0; U) onto a dense subset of the state space K. We will say that (A, B, C) is exactly reachable if it is continuously reachable and R is a surjective map. In our definition continuous reachability refers to the space of input functions l²(-∞, 0; U). Other input function spaces may be used, and we will see one instance of this in Sec. 7. For observability we have analogous definitions. Thus (A, B, C) is, respectively, observable, continuously observable, and exactly observable if (A*, C*, B*) is, respectively, reachable, continuously reachable, and exactly reachable. In particular the observability condition reduces to

    ∩_{j=0}^∞ Ker C A^j = {0}        (1-27)
If f is the restricted input/output map of the realization Σ = (A, B, C) and if we assume Σ to be continuously reachable and continuously observable, then we have f = OR.

Lemma 1-4 Two systems Σ = (A, B, C, D) and Σ_1 = (A_1, B_1, C_1, D_1) have the same transfer function if and only if

    D = D_1   and   C A^j B = C_1 A_1^j B_1   for j ≥ 0        (1-28)
If both systems are continuously reachable and continuously observable, then they have the same transfer function if and only if

    D = D_1   and   O A^j R = O_1 A_1^j R_1   for j ≥ 0        (1-29)
PROOF The first part follows trivially from the definition of a transfer function. Equality (1-29) follows from (1-28) by observing that O A^j R has the matrix representation

               [ CA^jB       CA^{j+1}B   ... ]
    O A^j R =  [ CA^{j+1}B   CA^{j+2}B   ... ]        (1-30)
               [ ...                         ]

which is an extension of (1-11).
One of the central problems we focus on in this chapter is the question of isomorphism theorems for systems. To this end we introduce some definitions and derive some elementary results. Given two realizations Σ = (A, B, C) and Σ_1 = (A_1, B_1, C_1) with the same input and output spaces U and Y and state spaces K and K_1, respectively, we will say that a map X: K → K_1 intertwines Σ and Σ_1 if the diagram

              B          A          C
        U ------->  K ------->  K ------->  Y
        |           |           |           |
        |           | X         | X         |
        v           v           v           v
        U ------->  K_1 ----->  K_1 ----->  Y
              B_1        A_1         C_1           (1-31)

(with the identity maps on U and Y) is commutative, that is, XB = B_1, XA = A_1X, and C_1X = C. If only the diagram

              B          A
        U ------->  K ------->  K
        |           |           |
        |           | X         | X
        v           v           v
        U ------->  K_1 ----->  K_1
              B_1        A_1                       (1-32)
commutes then we say that X intertwines (A, B) and (A_1, B_1). Similarly we define an intertwining map of (A, C) and (A_1, C_1). A system Σ = (A, B, C) is a quasi-invertible transform of Σ_1 = (A_1, B_1, C_1) if there exists a quasi-invertible operator X: K → K_1 which intertwines Σ and Σ_1. Two systems Σ and Σ_1 are quasisimilar if each is a quasi-invertible transform of the other, and similar if there exists a boundedly invertible operator X that intertwines the two systems.

For operators quasisimilarity does not imply similarity, as we saw in Sec. II-15. However, for systems the intertwining relation is more rigid, and hence the next result.
Lemma 1-5 Let Σ = (A, B, C) and Σ_1 = (A_1, B_1, C_1) be two reachable systems. Then Σ and Σ_1 are quasisimilar if and only if they are similar.

PROOF The "if" part is trivial. Thus let us assume X: K → K_1 and X_1: K_1 → K are quasi-invertible maps that intertwine the two systems. Since intertwining is a transitive relation, it follows that X_1X intertwines Σ with itself. Thus X_1XA^nB = A^nB for all n ≥ 0, and analogously XX_1A_1^nB_1 = A_1^nB_1. By the assumption of reachability it follows that X_1X = I_K and XX_1 = I_{K_1}, and hence X and X_1 are boundedly invertible and the two systems are similar.

A similar result holds naturally if we replace the condition of reachability by that of observability. If no additional assumptions are made on the nature of the systems involved, then nothing can be said concerning the existence and uniqueness of intertwining operators. One uniqueness result is the following.
Lemma 1-6 Given two systems Σ = (A, B, C) and Σ_1 = (A_1, B_1, C_1).
(a) Let Σ be reachable; then if there exists an operator X intertwining Σ and Σ_1, it is unique.
(b) Let Σ_1 be observable; then if there exists an operator X intertwining Σ and Σ_1, it is unique.

PROOF Assume two intertwining operators X and X' are given. Then Z = X - X' satisfies ZA^iB = 0 for all i ≥ 0. Since Σ is reachable this implies Z = 0. Statement (b) follows by duality.
The next theorem relates properties of an intertwining operator to the properties of the corresponding systems.
Theorem 1-7 Let X be a bounded operator intertwining the systems Σ = (A, B, C) and Σ_1 = (A_1, B_1, C_1).
(a) If X has dense range then the reachability of Σ implies that of Σ_1. If X is surjective then the exact reachability of Σ implies that of Σ_1.
(b) If Σ_1 is reachable then X has dense range. If Σ_1 is exactly reachable then X is surjective.
(c) If X is injective then the observability of Σ_1 implies that of Σ. If X* is surjective and Σ_1 is exactly observable, then Σ is also exactly observable.
(d) If Σ is observable then X is injective. If Σ is exactly observable then X* is surjective.

PROOF Let z ∈ ∩_{n=0}^∞ Ker B_1*A_1*^n; then for all n ≥ 0 we have B_1*A_1*^n z = 0. Since X intertwines Σ and Σ_1 we have B_1*A_1*^n = B*A*^n X*, and so X*z ∈ ∩_{n=0}^∞ Ker B*A*^n = {0}. If X has dense range then X* is injective and so z = 0, which shows that ∩_{n=0}^∞ Ker B_1*A_1*^n = {0}, or in other words the reachability of Σ_1. If X is surjective then R_1 = XR shows that if R is onto so is R_1. This proves (a). Part (c) follows from (a) by duality. Statement (b) is a direct consequence of the definitions. Finally, to prove the first part of (d) we note that C_1A_1^n Xz = CA^n z for each z ∈ K. Thus if Xz = 0 then z ∈ ∩_{n=0}^∞ Ker CA^n and hence z = 0. The second part of statement (d) follows from (b) by duality.
Lemma 1-8 Let X intertwine Σ = (A, B, C) and Σ_1 = (A_1, B_1, C_1). Then Σ and Σ_1 realize the same transfer function.

PROOF Since X intertwines Σ and Σ_1 we have XA^j = A_1^jX, XB = B_1, and C_1X = C. Thus

    C_1A_1^jB_1 = C_1A_1^jXB = C_1XA^jB = CA^jB   for j ≥ 0
and hence the transfer functions, modulo the constant term, coincide.

The converse of this lemma is not generally true. In the finite dimensional case the additional assumption that both systems are canonical guarantees the similarity of the two systems. This is the content of Theorem I 8-4. In the infinite dimensional case we have to strengthen the assumptions further.

Theorem 1-9 Let Σ = (A, B, C, D) and Σ_1 = (A_1, B_1, C_1, D_1) be two realizations of the same transfer function T in the state spaces K and K_1, respectively.
(a) If Σ is continuously observable and exactly reachable and Σ_1 is continuously observable and continuously reachable, then there exists a quasi-invertible operator X which intertwines Σ and Σ_1.
(b) If Σ and Σ_1 are both continuously observable and exactly reachable, then the two systems are similar.
PROOF Let R and R_1 be the respective reachability operators and O and O_1 the respective observability operators. By Lemma 1-4 we have

    O A^j R = O_1 A_1^j R_1        (1-33)

For j = 0 this implies, O and O_1 being injective, that Ker R = Ker R_1. If Σ is exactly reachable then the restriction R | {Ker R}^⊥ → K is a boundedly invertible operator; denote it again by R. Let X = R_1R^{-1}; then X is a bounded quasi-invertible operator from K to K_1. Now from (1-33) restricted to {Ker R}^⊥ = {Ker R_1}^⊥ we have O A^j = O_1 A_1^j X. For j = 0 this reduces to O = O_1X, which implies C = C_1X. Substituting O = O_1X back into (1-33) yields O_1XA^jR = O_1A_1^jR_1 and, since O_1 is injective, XA^jR = A_1^jR_1 for all j ≥ 0. For j = 0 this implies XR = R_1 and so also XB = B_1. Hence from XA^jR = A_1^jR_1 we obtain XA^jR = A_1^jXR for all j ≥ 0. As R has dense range this implies XA^j = A_1^jX for all j ≥ 0. Thus X intertwines the two systems, which proves (a). If Σ_1 is also exactly reachable then X as defined before is boundedly invertible, and hence the two systems are similar.
The previous theorem is an instance of an infinite dimensional state space isomorphism theorem. The proof depended crucially on the assumption of exact reachability; of course exact observability might have been used instead. Another version of an isomorphism theorem will be encountered in Sec. 7 where, instead of strengthening the reachability or observability properties, use is made of some symmetry assumptions.
That some extra assumptions need to be made to guarantee similarity is clear from the following example. Let A: l²(0, ∞) → l²(0, ∞) be given by A e_i = λ_i e_i, where {e_i}_{i=0}^∞ is the standard orthonormal basis in l²(0, ∞) and the λ_i are distinct numbers with |λ_i| < 1. Let b and c be vectors in l²(0, ∞) with real coordinates {β_i}_{i=0}^∞ and {γ_i}_{i=0}^∞, respectively. Consider next the systems Σ = (A, b, c) and Σ_1 = (A, c, b). Since (A^j b, c) = Σ_i λ_i^j β_i γ_i = (A^j c, b), the two systems realize the same transfer function. Let X be a bounded operator that intertwines Σ and Σ_1. As XA = AX it follows from the cyclicity of A that X = φ(A) for some bounded measurable φ. Let φ_i = φ(λ_i); then X e_i = φ_i e_i. Since Xb = c we must have φ_i β_i = γ_i, which implies that

    sup_{i≥0} |γ_i/β_i| = sup_{i≥0} |φ_i| = ||X||

But for arbitrary b and c in l²(0, ∞), sup_{i≥0} |γ_i/β_i| need not be finite. As a case in point we might take γ_n = n^{-1} and β_n = n^{-2}. So the two systems cannot be similar in this case. This example of nonisomorphic systems still had the same generators. We will see later that there exist canonical realizations of the same transfer function whose generators have widely differing spectra.
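The concluding choice can be checked numerically: with β_n = n^{-2} and γ_n = n^{-1} both sequences are square summable, yet the ratios γ_n/β_n = n are unbounded, so no bounded intertwining operator exists. A sketch (indexing from n = 1 to avoid the zero index):

```python
# With beta_n = 1/n^2 and gamma_n = 1/n both vectors lie in l^2, but the
# ratios gamma_n / beta_n = n are unbounded, so any X with X b = c would
# have to be unbounded.

N = 10000
beta = [1.0 / n ** 2 for n in range(1, N + 1)]
gamma = [1.0 / n for n in range(1, N + 1)]

norm_b_sq = sum(x * x for x in beta)            # partial sum of zeta(4)
norm_c_sq = sum(x * x for x in gamma)           # partial sum of zeta(2)
ratios = [g / b for g, b in zip(gamma, beta)]   # ratios[n-1] is (about) n
```

Both norms stay below their convergent limits (π⁴/90 and π²/6) while the ratio sequence grows without bound.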
2. HANKEL OPERATORS AND REALIZATION THEORY

It became clear in the previous section that the input/output behavior of a system is, except for the map D, completely determined by its Hankel operator. Thus given an impulse response (T_0, T_1, T_2, ...) or its associated transfer function

    T(z) = Σ_{j=0}^∞ T_j z^j

a system (A, B, C, D) is said to realize T if T_0 = D and T_j = CA^{j-1}B for j ≥ 1. Assume the function T is in H∞(B(U, Y)) and let H_T: H̄²_U → H²_{0,Y} be its induced Hankel operator. Taking the finite dimensional theory as a guide, we expect that the construction of a state space model should use H̄²_U/Ker H_T or the closure {Range H_T}⁻ of the range of H_T as possible state spaces. This expectation turns out to be justified.
Theorem 2-1 Let T ∈ H∞(B(U, Y)). Then there exists a reachable and exactly observable Hilbert space realization of T.

PROOF It is easy to construct a realization of T. Let S_+ be the unilateral right shift in H²_{0,Y}. Define operators A_1, B_1, and C_1 by A_1 = S*_+, B_1u = H_T u, C_1f = (S*_+f)(0), and D_1 = T_0; then the system (A_1, B_1, C_1, D_1) is a realization of T. This realization may or may not be reachable, depending on whether {Range H_T}⁻ = H²_{0,Y}, but it is exactly observable, and the observability operator O: H²_{0,Y} → l²(1, ∞; Y) is just the identity map. To obtain a reachable realization of T all we have to do is to replace the state space H²_{0,Y} by {Range H_T}⁻. Since H_T satisfies the functional equation

    H_T S*_- = S*_+ H_T        (2-1)

it follows that Ker H_T is S*_--invariant whereas {Range H_T}⁻ is an S*_+-invariant subspace. Define now A = S*_+ | {Range H_T}⁻, B = B_1, C = C_1 | {Range H_T}⁻, and D = T_0; then (A, B, C, D) is a reachable and exactly observable realization of T. We call this realization the shift realization of T. The reachability operator R of the shift realization of T coincides with H_T. This follows from the fact that for all n

    A^n B = S*_+^n H_T = H_T S*_-^n

On the other hand the observability operator of the system O: {Range H_T}⁻ → l²(1, ∞; Y) is given by Of = f, and so its adjoint O*: H²_{0,Y} → {Range H_T}⁻ has the representation O* = P_{{Range H_T}⁻}.
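For a polynomial symbol the range of H_T is finite dimensional and the shift construction can be written out in full. The following finite dimensional caricature of the construction uses the hypothetical scalar symbol T(z) = z + 3z² + 2z³; the state is the truncated vector of future output coefficients and A acts as the backward shift.

```python
# Finite dimensional caricature of the shift realization in Theorem 2-1,
# for the hypothetical scalar polynomial symbol T(z) = z + 3 z^2 + 2 z^3.

T = [0.0, 1.0, 3.0, 2.0]      # T_0 = D, then T_1, T_2, T_3
m = len(T) - 1

def A(x):                     # backward shift (the role of S*_+), truncated
    return x[1:] + [0.0]

def B(u):                     # B u = H_T u: the future outputs (T_1 u, ..., T_m u)
    return [t * u for t in T[1:]]

def C(x):                     # C reads off the first future output coefficient
    return x[0]

def markov(j):                # C A^{j-1} B applied to the input u = 1
    x = B(1.0)
    for _ in range(j - 1):
        x = A(x)
    return C(x)

recovered = [markov(j) for j in range(1, m + 1)]
```

The realization reproduces the Taylor coefficients T_j = CA^{j-1}B exactly, mirroring the identity A^nB = S*_+^n H_T above.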
It follows immediately from the previous theorem that a function T in H∞(B(U, Y)) has an exactly reachable and exactly observable Hilbert space realization if Range H_T is closed. We will defer to the next section the characterization of those functions admitting such a realization. The realizability criterion given by Theorem 2-1 provides a sufficient condition only. In the discrete time case it is easy to characterize all weighting patterns having Hilbert space realizations. We say that a sequence {T_n}_{n=0}^∞ of operators in B(U, Y) is of exponential type if there exist constants M and ω such that

    ||T_n|| ≤ M ω^n   for n ≥ 0        (2-2)

Theorem 2-2 A weighting pattern {T_n}_{n=0}^∞ has a canonical Hilbert space realization if and only if it is of exponential type.
PROOF Assume (A, B, C, D) is a realization of {T_n}_{n=0}^∞. Since T_n = CA^{n-1}B it follows that ||T_n|| ≤ ||C|| ||B|| ||A||^{n-1}, and so {T_n}_{n=0}^∞ is of exponential type. Conversely, if {T_n}_{n=0}^∞ is of exponential type then for some nonzero λ, sufficiently small, the sequence {λ^nT_n}_{n=0}^∞ is the sequence of Taylor coefficients of an H∞(B(U, Y)) function. By Theorem 2-1 it has a canonical Hilbert space realization (A, B, C, D). It is clear then that (λ^{-1}A, λ^{-1}B, C, D) is a realization of {T_n}_{n=0}^∞, and the rescaling affects neither reachability nor observability.
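The forward half of the argument is easy to see numerically: the Markov parameters of any finite dimensional realization are dominated by ||C|| ||B|| ||A||^{n-1}. A sketch with an arbitrary 2 × 2 example, using the maximum absolute row sum (the induced infinity norm) as a convenient stand-in for the operator norm:

```python
# Numerical illustration of the exponential-type bound (2-2) for a
# hypothetical 2x2 realization: |T_n| <= M w^n with w an upper bound on ||A||.

A = [[0.6, 0.3], [0.0, 0.6]]
B = [1.0, 1.0]
C = [1.0, 2.0]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# maximum absolute row sum: the matrix norm induced by the sup norm
w = max(sum(abs(a) for a in row) for row in A)   # = 0.9 here

T, x = [], B[:]
for n in range(1, 30):
    T.append(C[0] * x[0] + C[1] * x[1])   # T_n = C A^{n-1} B
    x = matvec(A, x)

M_const = max(abs(t) / w ** n for n, t in enumerate(T, start=1))
bounded = all(abs(t) <= M_const * w ** n + 1e-12 for n, t in enumerate(T, start=1))
```

Since w < 1 the parameters decay geometrically, which is exactly the exponential-type condition with ω = w.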
It will be seen later that while rescaling of a weighting pattern does not affect realizability the fine structure of the realization constructed in the proof of Theorem 2-1 is sensitive to rescaling. The proof of the realizability result of Theorem 2-1 was asymmetrical inasmuch as it used only RangeHT as state space, not considering the possibility of using Hu G KerHT. To get a more symmetric theory we have to use adjoints. Given a system E = (A, B, C, D) we define the adjoint system E* by E* _ (A*, C*, B*, D*). Thus dynamic equations for the adjoint systems are
x"+I = A*x" + C*y" u" = B*x" + D*y"
(2-3)
If T̃ denotes the transfer function of the adjoint system then it is clear that we obtain the following relation between it and the transfer function T of the original system, namely

    T̃(z) = T(z̄)*
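For a finite dimensional system the relation T̃(z) = T(z̄)* can be checked directly from the formula T(z) = D + zC(I − zA)⁻¹B. A numerical sketch (Python; the matrices are random illustrative data, and the helper name `transfer` is ours):

```python
import numpy as np

rng = np.random.default_rng(0)
# A small complex system (illustrative data).
A = 0.4 * rng.standard_normal((3, 3)) + 0.2j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))

def transfer(A, B, C, D, z):
    # T(z) = D + z C (I - zA)^{-1} B, valid for z near the origin.
    return D + z * C @ np.linalg.solve(np.eye(A.shape[0]) - z * A, B)

z = 0.3 + 0.1j
# Transfer function of the adjoint system (A*, C*, B*, D*) ...
lhs = transfer(A.conj().T, C.conj().T, B.conj().T, D.conj().T, z)
# ... versus T(z̄)*.
rhs = transfer(A, B, C, D, np.conj(z)).conj().T
assert np.allclose(lhs, rhs)
```

The identity follows from (I − z̄A)* = I − zA* together with taking adjoints term by term.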
Now, given a function T ∈ H^∞(B(U, Y)), let Σ_T denote its shift realization. Since T̃ is in H^∞(B(Y, U)) whenever T is in H^∞(B(U, Y)) we can apply the construction of Theorem 2-1 to obtain the shift realization of T̃, Σ_{T̃} = (Ã, B̃, C̃, D̃). Thus Σ_{T̃} has Range H_{T̃} as state space. Clearly, by the definition of the adjoint system and the remarks following it, the system (Σ_{T̃})* = (Ã*, C̃*, B̃*, D̃*) is a realization of the transfer function T. We call this realization the *-shift realization of T.
It is of interest to inquire about the relation between the shift and the *-shift realizations of a given transfer function. Contrary to the finite dimensional situation the two realizations, though both canonical, need not be isomorphic. Some obviously necessary conditions for isomorphism arise out of spectral considerations. If we consider the state operators A = S₊|Range H_T and Ã* = (S₊|Range H_{T̃})* then their spectra are determined by Theorem II 13-8 as well as the representation theorem for right invariant subspaces, namely Theorem II 12-22. If we assume that {Range H_T}⊥ = QH²_{0,Y} and {Range H_{T̃}}⊥ = Q₁H²_{0,U} for some rigid functions Q and Q₁, then it is a consequence of Lemma II 13-4, U and Y being finite dimensional, that if Q and Q₁ fail to be inner then the spectra of A and Ã* coincide with the closed unit disc. However, while the point spectrum of A coincides with the open unit disc, Ã* has no point spectrum at all. Thus the possibility of the similarity of A and Ã* is excluded.
A spectrum of the kind exhibited by the operators above, in the case that Q and Q₁ fail to be inner, is of a kind that operators arising in applications rarely have. This is an indication that the theory of the shift realization should be restricted to transfer functions T for which {Range H_T}⊥ is an invariant subspace of full range. This theme is the subject of the next sections.
Before proceeding in that direction we elucidate the connection between the Hankel operator induced by T, namely H_T, and H_{T̃}.
Theorem 2-3 Let T ∈ H^∞(B(U, Y)) and let H_T and H_{T̃} be the Hankel operators induced by T and T̃, respectively. Then the operators H_T* and H_{T̃} are unitarily equivalent.

PROOF It is simple to check that if H_T: H²_U → H²_{0,Y} is defined by (1-17) then its adjoint H_T*: H²_{0,Y} → H²_U is given by

    H_T* g = P_{H²_U}T*g   for g ∈ H²_{0,Y}        (2-4)

Define now a map σ: L²_U → L²_U by

    (σf)(e^{it}) = e^{-it}f(e^{-it})        (2-5)

σ is unitary and satisfies σ* = σ⁻¹ = σ as well as σH²_{0,U} = H²_U. For every g ∈ H²_Y we have

    σ(H_{T̃}g) = σP_{H²_{0,U}}(T̃g) = P_{H²_U}σ(T̃g) = P_{H²_U}T*(σg) = H_T*(σg)

and hence the equality

    σH_{T̃} = H_T*σ        (2-6)

follows.
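On Fourier coefficients the map σ of (2-5) simply sends the coefficient at index n to index −n−1; this makes the interchange of H² with H²₀ and the involution property σ² = I transparent. A scalar sketch (Python; the dictionary encoding of coefficients is our own device):

```python
def sigma(coeffs):
    """Flip map (sigma f)(e^{it}) = e^{-it} f(e^{-it}) on Fourier coefficients.

    coeffs: dict {n: c_n} for f = sum_n c_n e^{int}.  Since
    e^{-it} * e^{-int} = e^{-i(n+1)t}, sigma sends index n to -n-1.
    """
    return {-n - 1: c for n, c in coeffs.items()}

f = {0: 1.0, 1: 2.0, 3: -1.0}   # an "analytic" trigonometric polynomial (H^2 side)
g = sigma(f)                     # lands on the H^2_0 side (all indices <= -1)
assert all(n <= -1 for n in g)
assert sigma(g) == f             # sigma is an involution: sigma^2 = identity
```

The index exchange n ↦ −n−1 is exactly what swaps the two halves of L² and intertwines H_{T̃} with H_T* in (2-6).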
3. RESTRICTED SHIFT SYSTEMS

It was indicated in the previous section that the shift realization might be a useful tool for the study of state space models of transfer functions T for which the induced Hankel operator H_T has a range whose orthogonal complement is an invariant subspace of full range. In general we define a system (A, B, C) to be a restricted shift system whenever its generator, or state transition operator, A is unitarily equivalent to the restriction of a left shift to a left invariant subspace whose orthogonal complement is of full range. Thus A is unitarily equivalent to S₀(Q)* where

    S₀(Q)* = S₊*|H₀(Q)        (3-1)

and

    H₀(Q) = {QH²_{0,N}}⊥        (3-2)
It is a consequence of Theorem II 13-2 that (A, B, C) is a restricted shift system if and only if (A*, C*, B*) is. We further assume that the inner function Q has a scalar multiple. This is certainly the case whenever dim N < ∞, when det Q provides a scalar multiple of Q. As before, the spaces U and Y of input and output values are assumed to be finite dimensional.
With this definition we have greatly restricted the class of systems under consideration, and our first object is the characterization of the class of input/output maps that have realizations by means of restricted shift systems. Applying B to a fixed vector ξ in U we have Bξ ∈ H₀(Q) and hence (Bξ)(z) = b_ξ(z). Since the vector function b_ξ depends linearly on ξ there exists a B(U, N)-valued function D, analytic in the open unit disc, for which

    (Bξ)(z) = b_ξ(z) = D(z)ξ        (3-3)

Similarly there exists a B(Y, N)-valued analytic function E for which

    (C*η)(z) = E(z)η   for all η ∈ Y        (3-4)
The functions D and E need not be norm bounded in the open unit disc.
If we compute (CAⁿBξ, η) for arbitrary ξ ∈ U, η ∈ Y, and n ≥ 0 we obtain

    (CAⁿBξ, η) = (S₀(Q)*ⁿBξ, C*η) = (S₀(Q)*ⁿDξ, Eη)
               = (1/2π) ∫₀^{2π} e^{-int}(E(e^{it})*D(e^{it})ξ, η) dt

Letting (E*D)(e^{it}) = Σ_{n≥1} T_n e^{i(n-1)t} we get T_n = CA^{n-1}B for n > 0. Thus the transfer function of (A, B, C) coincides with Σ_{n=1}^∞ T_n zⁿ.
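In the scalar case the Fourier coefficient extracted above is just a cross-correlation of Taylor coefficients: the n-th coefficient of E(e^{it})*D(e^{it}) equals Σ_k ē_k d_{k+n}. A numerical sketch (Python; the polynomials are illustrative data):

```python
import numpy as np

# Scalar analytic polynomials (illustrative): E(z) = sum e_k z^k, D(z) = sum d_k z^k.
e = np.array([1.0, 0.5, -0.25])
d = np.array([2.0, -1.0, 0.5, 0.25])

def fourier_coeff(n, npts=512):
    """n-th Fourier coefficient of E(e^{it})* D(e^{it}) by numerical integration
    (exact here, since the integrand is a trigonometric polynomial)."""
    t = 2 * np.pi * np.arange(npts) / npts
    z = np.exp(1j * t)
    Ebar = np.conj(np.polyval(e[::-1], z))   # E(e^{it})* (scalar conjugate)
    D = np.polyval(d[::-1], z)
    return np.mean(Ebar * D * np.exp(-1j * n * t))

# Coefficient at index n >= 0 is the cross-correlation sum_k conj(e_k) d_{k+n}.
for n in range(3):
    direct = sum(np.conj(e[k]) * d[k + n] for k in range(len(e)) if k + n < len(d))
    assert abs(fourier_coeff(n) - direct) < 1e-12
```

In the matrix-valued case the products ē_k d_{k+n} become operator products e_k* d_{k+n}, which is how the Markov parameters T_n arise from E and D.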
The fact that the multiplication operator B defined by (3-3) has its range in H0(Q) implies a certain factorization representation given next.
Theorem 3-1 Let Q be an inner function in N and let B: U → H₀(Q) be a bounded linear operator defined by (3-3) for some B(U, N)-valued function D which is analytic in the open disc. Then D is factorable on the unit circle in the form

    D(e^{it}) = Q(e^{it})F(e^{it})*        (3-5)

where F is another B(N, U)-valued analytic function in the unit disc.

PROOF We use Lemma II 13-5. Since Dξ is in H₀(Q) for each ξ ∈ U it follows that Q*Dξ is in H̄²_N for each ξ. Since the dependence on ξ is linear there exists a function F, analytic in the unit disc, such that Q*Dξ = F*ξ, which proves the theorem.
We can now give a first characterization of transfer functions realizable by restricted shift systems.
Theorem 3-2 An H^∞(B(U, Y)) function T is realizable by a restricted shift system if and only if T is factorable on the unit circle as

    T(e^{it}) = E(e^{it})*Q(e^{it})F(e^{it})*        (3-6)

for some inner function Q acting in a Hilbert space N, where E and F are, respectively, B(Y, N)- and B(N, U)-valued analytic functions in the open unit disc that induce bounded multiplication operators through (3-3) and (3-4).
PROOF We saw that if T is realizable by a restricted shift system then it can be factored as T = E*D. Theorem 3-1 and the factorization (3-5) of D imply the result. Conversely, if T admits a factorization of the form (3-6) then the system (A, B, C) defined by (3-1), (3-3), and (3-4) is clearly a restricted shift system. Moreover (A, B, C, T₀) realizes T.

We would like to relate the possibility of factoring a function as in Theorem 3-2 to some intrinsic property of the function. To this end we introduce some definitions.
We say that a function T ∈ H^∞(B(U, Y)) is cyclic (cyclicity here is relative to the left shift in H²_{0,Y}) if Range H_T is dense in H²_{0,Y}, and noncyclic otherwise. A function T is called strictly noncyclic if {Range H_T}⊥ is an invariant subspace of full range. In case dim Y = 1 noncyclicity and strict noncyclicity coincide, but for shifts of multiplicity greater than one the notions differ. Of course strict noncyclicity implies noncyclicity. We note also that Range H_T cannot equal H²_{0,Y}. This is excluded by the functional equation (2-1) of the Hankel operator and the fact that the left shift in H²_{0,Y} is similar neither to the right shift nor to its compression to a left invariant subspace.

Let Ω be a domain in the complex plane. A B(U, Y)-valued function F is meromorphic of bounded type in Ω if F = G/g where G and g are, respectively, bounded B(U, Y)-valued and scalar-valued analytic functions in Ω.
If a function T in H^∞(B(U, Y)) is the strong radial limit of a function T̂ meromorphic and of bounded type in D_e = {λ | 1 < |λ| ≤ ∞} then we say that T̂ is a meromorphic pseudocontinuation of bounded type of T. Clearly a meromorphic continuation of bounded type of T is at the same time a pseudocontinuation, but the converse is generally false. There are functions in H^∞ for which the unit disc is the natural domain of analyticity but which still admit a meromorphic pseudocontinuation.
Lemma 3-3 Let T ∈ L^∞(B(U, Y)). Suppose there exists a nonzero function φ ∈ H^∞ such that φT ∈ H^∞(B(U, Y)); then there exists an inner function q for which qT ∈ H^∞(B(U, Y)).
PROOF The set J = {ψ | ψ ∈ H^∞, ψT ∈ H^∞(B(U, Y))} is a nontrivial w*-closed ideal in H^∞, hence of the form J = qH^∞.
Consequently, if F is a meromorphic function of bounded type in D_e then in a representation F = G/g the denominator g can be taken to be an inner function in D_e. Inner functions acting in finite dimensional Hilbert spaces have meromorphic pseudocontinuations of bounded type to D_e. Actually a somewhat stronger result holds.
Lemma 3-4 Let Q be an inner function in N. Then Q has a meromorphic pseudocontinuation of bounded type if and only if it has a scalar multiple.

PROOF Assume Q has a scalar multiple q. Thus

    QQ° = Q°Q = qI        (3-7)

holds for some function Q°. Since clearly qQ* is analytic it follows from Lemma 3-3 that without loss of generality we may assume q, and therefore also Q°, are inner. For z ∈ D_e we define Q̂(z) = Q°(z̄⁻¹)*/\overline{q(z̄⁻¹)}. Clearly Q°(z̄⁻¹)* and \overline{q(z̄⁻¹)} are bounded analytic functions in D_e. Thus Q̂ is meromorphic and of bounded type in D_e. From the definition of Q̂ it follows that a.e. on the unit circle

    lim_{R→1⁺} Q̂(Re^{it}) = lim_{r→1⁻} Q°(re^{it})*/\overline{q(re^{it})} = Q°(e^{it})*/\overline{q(e^{it})} = q(e^{it})Q°(e^{it})*

But from (3-7) we obtain q(Q°)* = Q on the unit circle and so

    lim_{R→1⁺} Q̂(Re^{it}) = lim_{r→1⁻} Q(re^{it})   a.e.

and Q̂ is a pseudocontinuation of Q. Conversely, if Q has a meromorphic pseudocontinuation Q̂ of bounded type in D_e then Q̂ = G/g and g may be taken to be inner in D_e. This implies that the reflected function q(z) = \overline{g(z̄⁻¹)} is a scalar multiple of Q.

We note that, as every inner function Q acting in a finite dimensional space has a scalar multiple, the construction in the lemma yields a meromorphic pseudocontinuation of bounded type for Q. Moreover, since (3-7) implies that, whenever q(z) ≠ 0, Q(z)⁻¹ = Q°(z)/q(z), the function Q̂ can be written also as Q̂(z) = [Q(z̄⁻¹)*]⁻¹.

Theorem 3-5 Let T ∈ L^∞(B(U, Y)) where U and Y are finite dimensional Hilbert spaces. Then the following statements are equivalent:
(a) T is strictly noncyclic.
(b) T is a strong radial limit a.e. of a meromorphic function of bounded type in D_e.
(c) On the unit circle T is factorable as

    T = PC* = C₁*P₁        (3-8)

where P and P₁ are inner functions in Y and U, respectively, C and C₁ are in H^∞(B(Y, U)), and the coprimeness relations

    (P, C)_R = I_Y   and   (P₁, C₁)_L = I_U        (3-9)

hold.
PROOF Assume T is strictly noncyclic; thus {Range H_T}⊥ = PH²_{0,Y} for some inner function P acting in Y, or equivalently Range H_T ⊂ H₀(P). Let ξ ∈ U; then H_Tξ = P_{H²_{0,Y}}Tξ and, by applying Lemma II 13-5, P*P_{H²_{0,Y}}Tξ ∈ H̄²_Y. Since P*H̄²_Y ⊂ H̄²_Y we have P*Tξ ∈ H̄²_Y for all ξ ∈ U. This implies that P*T, which is in L^∞(B(U, Y)), has a Fourier expansion in which all positively indexed Fourier coefficients vanish. Thus P*T = C* for some C in H^∞(B(Y, U)), which implies the factorization T = PC*.

Assume now T ∈ L^∞(B(U, Y)) admits a factorization of the form T = PC* with P and C as before. Define a function T̂ in D_e by T̂(z) = P̂(z)C(z̄⁻¹)* = [P(z̄⁻¹)*]⁻¹C(z̄⁻¹)*. Since C(z) is analytic in the unit disc, C(z̄⁻¹)* is analytic in D_e. Also P̂(z) = [P(z̄⁻¹)*]⁻¹ is, by Lemma 3-4, meromorphic of bounded type in D_e and hence so is T̂. Moreover

    lim_{R→1⁺} T̂(Re^{it}) = lim_{R→1⁺} P̂(Re^{it})C((1/R)e^{it})* = P(e^{it})C(e^{it})*   a.e.

Thus (a) implies (b) and the first factorization in (3-8).
Next assume T(e") is a.e. the strong radial limit of a meromorphic function of bounded type Tin De. If T= G/g with g an inner function then necessarily G = H* and g = q where H E H°" (B(Y,, U)) and q = g is inner. This last representation implies that qTT e My for all E U. But qTI _ q {PH,TT + PH2 TT} = q {PH2T + HT}. As qHY c HY it follows that H,Y. for "all
in U and this implies that Range HT c H0(gl) _
gH0',1}1. Since qH',y is a subspace of full range so is {RangeHT}1 which includes qff,1. Thus T is strictly noncyclic and for this the coprimeness relations (3-9) are irrelevant.
If T is the radial limit of a function T̂ which is meromorphic and of bounded type in D_e then T̃ is likewise the radial limit of such a function. Thus T is strictly noncyclic if and only if T̃ is.
Hence, if T̃ is strictly noncyclic, T̃ = R₁D₁* with R₁ inner in U and D₁ ∈ H^∞(B(U, Y)). It follows that T = C₁*P₁ with P₁ = R̃₁ and C₁ = D̃₁.

Finally, assume Range H_T = H₀(P), which implies that T = PC*. Apply the map τ_Q: L²_Y → L²_Y defined by

    τ_Q f = χQ̃(Jf)        (3-10)

where (Jf)(e^{it}) = f(e^{-it}). τ_Q is closely related to the map τ_Q introduced in Sec. II-13, and all duality results obtained there are easily adapted to the present setting. τ_P maps H₀(P) unitarily onto H₀(P̃) and τ_P P_{H₀(P)} = P_{H₀(P̃)}τ_P.
It follows that for f ∈ H²_U

    τ_P H_T f = τ_P P_{H₀(P)}Tf = P_{H₀(P̃)}τ_P Tf = P_{H₀(P̃)}χP̃P̃*C̃(Jf) = P_{H₀(P̃)}χC̃(Jf)

Hence Range H_T = H₀(P) if and only if the operator M_C̃: H²_U → H₀(P̃) given by

    M_C̃ g = P_{H₀(P̃)}χC̃g   for g ∈ H²_U

has dense range. By Theorem II 14-11 this is equivalent to the coprimeness condition (C̃, P̃)_L = I_U or, alternatively, to (C, P)_R = I_Y. In a completely analogous fashion Range H_{T̃} = H₀(P̃₁) if and only if (P₁, C₁)_L = I_U. This completes the proof.
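A scalar illustration of statement (b): a Blaschke factor b is inner in the disc, and the reflection formula b̂(z) = 1/conj(b(1/z̄)) continues it to D_e; since b is rational this pseudocontinuation is in fact an actual meromorphic continuation. A numerical sketch (Python; the point a and the tolerances are illustrative):

```python
import numpy as np

a = 0.5 + 0.2j  # zero of the Blaschke factor, |a| < 1

def b(z):
    # Scalar Blaschke factor, inner in the unit disc.
    return (z - a) / (1 - np.conj(a) * z)

def b_pseudo(z):
    # Pseudocontinuation to |z| > 1:  1 / conj(b(1/conj(z))).
    return 1.0 / np.conj(b(1.0 / np.conj(z)))

t = 0.7
assert abs(abs(b(np.exp(1j * t))) - 1.0) < 1e-12   # |b| = 1 on the circle
inner = b(0.9999 * np.exp(1j * t))                 # radial limit from inside
outer = b_pseudo(1.0001 * np.exp(1j * t))          # radial limit from outside
assert abs(inner - outer) < 1e-3                   # boundary values agree

# For rational b the pseudocontinuation coincides with analytic continuation.
assert abs(b_pseudo(2.0 + 0j) - b(2.0 + 0j)) < 1e-12
```

For non-rational inner functions (e.g. singular inner functions) the two boundary limits still agree a.e., even though the disc boundary is a natural boundary for analytic continuation.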
As a straightforward corollary we obtain

Corollary 3-6
(a) Let T₁ and T₂ be strictly noncyclic functions in H^∞(B(U, Y)); then T₁ + T₂ is strictly noncyclic.
(b) Let T₁ ∈ H^∞(B(U, W)) and T₂ ∈ H^∞(B(W, Y)) be strictly noncyclic; then T₂T₁ is strictly noncyclic.
(c) Let T ∈ H^∞(B(U, Y)) be strictly noncyclic; then T̃ is strictly noncyclic in H^∞(B(Y, U)).
Corollary 3-7 A function T ∈ H^∞(B(U, Y)) is realizable by a restricted shift system if and only if it is strictly noncyclic.

PROOF By Theorem 3-2 realizability by a restricted shift system is equivalent to a factorization of the form (3-6) on the unit circle. Since the factorizations (3-8) are special cases of (3-6) it is clear that strictly noncyclic functions are realizable by restricted shift systems. Conversely, if T = E*QF* then, applying the previous theorem, QF* = F₁*Q₁ and T = E*F₁*Q₁ = (F₁E)*Q₁ is strictly noncyclic.
If we consider the restricted shift systems as generalizations of finite dimensional systems then strictly noncyclic functions take the place of rational functions. In fact the coprime factorizations (3-6) can be viewed as replacing the description of rational functions as quotients of polynomials. In this connection we note the following result.
Theorem 3-8 Let T ∈ H^∞(B(U, Y)). Then Range H_T is finite dimensional if and only if T is rational.

PROOF Assume Range H_T = H₀(Q) is finite dimensional. Thus Q is a finite Blaschke function, that is, q = det Q is a finite Blaschke product. Since the pseudomeromorphic continuation of T is an actual analytic continuation, T has only a finite number of poles on the Riemann sphere and hence is rational. Conversely, if T is rational then T(z) = G(z)/g(z) where g(z) is a polynomial of degree k with zeroes in D_e. Let g₁(z) = zᵏ\overline{g(z̄⁻¹)}/g(z); then g₁ is inner and we can write T(z) = g₁(z)G(z)/zᵏ\overline{g(z̄⁻¹)}. This implies that Range H_T ⊂ {g₁H²_{0,Y}}⊥. But {g₁H²_{0,Y}}⊥ is of finite dimension equal to k dim Y.
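A scalar instance of the theorem: T(z) = z/(1 − az) is rational with a single pole, its weighting pattern is T_n = a^{n-1}, and every finite Hankel section built from it has rank one. A sketch (Python; the value of a and the section size are illustrative):

```python
import numpy as np

a, N = 0.5, 8
T = [a ** (n - 1) for n in range(1, 2 * N)]   # T_1, T_2, ... = 1, a, a^2, ...

# Sanity check of the expansion z/(1 - a z) = sum_{n>=1} a^{n-1} z^n.
z = 0.3
assert abs(sum(Tn * z ** n for n, Tn in enumerate(T, 1)) - z / (1 - a * z)) < 1e-6

# Hankel section H[i, j] = T_{i+j+1}; its rows are proportional (row i = a^i * row 0).
H = np.array([[T[i + j] for j in range(N)] for i in range(N)])
assert np.linalg.matrix_rank(H) == 1
```

For a general rational T the rank of the Hankel sections stabilizes at the McMillan degree, matching the dimension count k dim Y in the proof.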
Given a restricted shift system (A, B, C), its reachability and observability properties are completely determined by the analytic functions induced by the operators B and C and by the corresponding inner function. The next theorem describes the various possibilities in terms of coprimeness.

Theorem 3-9 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic, admitting the factorization T = E*QF* on the unit circle where E and F are, respectively, B(Y, N)- and B(N, U)-valued analytic functions in the unit disc that induce bounded multiplication operators. The realization (A, B, C, D) of T, where

    A = S₀(Q)* = S₊*|H₀(Q)        (3-11)
    B = QF*        (3-12)
    Cf = (P_{H²_Y}E*f)(0)        (3-13)
    D = T₀        (3-14)

is
(a) reachable if and only if
    (Q, F)_R = I_N        (3-15)
(b) exactly reachable if and only if
    [Q, F]_R = I_N        (3-16)
(c) observable if and only if
    (E, Q)_L = I_N        (3-17)
(d) exactly observable if and only if
    [E, Q]_L = I_N        (3-18)
PROOF The reachability and observability operators of the system are given by R: H²_U → H₀(Q) and O: H₀(Q) → H²_{0,Y} where

    Rf = P_{H₀(Q)}QF*f,   f ∈ H²_U        (3-19)

and

    O*g = P_{H₀(Q)}χEg,   g ∈ H²_{0,Y}        (3-20)
For O* the result follows directly from Theorem II 14-11. To get the result for the reachability operator we apply the map τ_Q defined by (3-10). Since

    τ_Q Rf = P_{H₀(Q̃)}χF̃(Jf)   for f ∈ H²_U

and JH²_U = H²_U, we apply Theorem II 14-11 once again and the proof is complete.
Next we characterize the class of functions in H^∞(B(U, Y)) whose induced Hankel operators have closed range.

Theorem 3-10 Let T ∈ H^∞(B(U, Y)). Then the following statements are equivalent.
(a) H_T has closed range.
(b) H_{T̃} has closed range.
(c) On the unit circle T factors as

    T = QH*        (3-21)

where Q is inner in Y, H ∈ H^∞(B(Y, U)), and

    [Q, H]_R = I_Y        (3-22)

holds.
(d) On the unit circle T factors as

    T = H₁*Q₁        (3-23)

where Q₁ is inner in U, H₁ ∈ H^∞(B(Y, U)), and

    [Q₁, H₁]_L = I_U        (3-24)

holds.
PROOF That (c) implies (a) follows from the previous theorem. Suppose conversely that H_T has closed range. As H_T satisfies the functional equation

    S₊H_T = H_TS₊        (3-25)

we deduce that Range H_T is S₊-invariant and Ker H_T is S₊-invariant. The restriction of H_T to {Ker H_T}⊥ is a boundedly invertible operator of {Ker H_T}⊥ onto Range H_T which moreover satisfies

    (S₊|Range H_T)H_T = H_T(P_{{Ker H_T}⊥}S₊|{Ker H_T}⊥)

Thus the operators S₊|Range H_T and P_{{Ker H_T}⊥}S₊|{Ker H_T}⊥ are similar. If T is not strictly noncyclic then Range H_T = {QH²_{0,Y}}⊥ for some rigid
function Q which is not inner. In that case it follows from Lemma II 13-4 that every point of the open unit disc is an eigenvalue of S₊|Range H_T. On the other hand the operator P_{{Ker H_T}⊥}S₊|{Ker H_T}⊥, which is unitarily equivalent to a compression of the right shift in H²_U to a left invariant subspace, can have at most a countable number of eigenvalues. Thus necessarily T is strictly noncyclic and Range H_T = H₀(Q) for some inner function Q. So the factorization (3-21) holds and the strong coprimeness condition (3-22) follows from the previous theorem. By similar reasoning statements (b) and (d) are equivalent. Finally, we use the fact that a bounded operator A has closed range if and only if its adjoint A* has closed range. So Range H_T is closed if and only if Range H_T* is closed. But H_T* and H_{T̃} are unitarily equivalent, the equivalence given by Eqs. (2-5) and (2-6). Thus (a) and (b) are equivalent.
Corollary 3-11 Let T ∈ H^∞(B(U, Y)), U, Y finite dimensional. Then T is realizable by an exactly reachable and exactly observable system if and only if Range H_T is closed.

PROOF If Range H_T is closed then T is strictly noncyclic and factors on the unit circle as T = QH* where Q and H are strongly right coprime. Theorem 3-9 provides a realization which by Theorem 3-10 is exactly reachable and exactly observable. Conversely, assume T is realizable by an exactly reachable and exactly observable system. Since the shift realization of T is reachable and exactly observable it follows from Theorem 1-9 that the two systems are isomorphic. In particular the reachability operators are similar. But the reachability operator of the shift realization is H_T and so necessarily H_T has closed range.
4. SPECTRAL MINIMALITY OF RESTRICTED SHIFT SYSTEMS

In the absence of a general state space isomorphism theorem in the infinite dimensional context we are faced with the situation, and the last example in this section shows that this is a reality, that there may exist canonical realizations of the same transfer function which, besides being nonisomorphic, have generators with widely differing spectra. From an intuitive point of view it seems clear that a state space model should, through the spectrum of the state operator, reflect the singularities of the transfer function in a faithful way. In some sense we should look for realizations where the state operator has the smallest possible spectrum required to model the singularities of the transfer function. To make this more precise let (A, B, C, D) be a realization of a transfer function T; that is, in a neighborhood of the origin we have
    T(z) = D + zC(I − zA)⁻¹B        (4-1)
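Formula (4-1) and its use for analytic continuation can be illustrated with a nilpotent A: then (I − zA)⁻¹ is a polynomial in z and (4-1) extends T to the entire plane, the spectrum of A being {0}. A sketch (Python; the matrices are an illustrative choice, giving T(z) = z² exactly):

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # nilpotent: A^2 = 0, spectrum {0}
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def T(z):
    # Transfer function via (4-1): T(z) = D + z C (I - zA)^{-1} B.
    return D + z * C @ np.linalg.solve(np.eye(2) - z * A, B)

# Near the origin (4-1) agrees with the series D + sum_{n>=1} C A^{n-1} B z^n.
z = 0.25
series = D + sum(C @ np.linalg.matrix_power(A, n - 1) @ B * z ** n
                 for n in range(1, 30))
assert np.allclose(T(z), series)

# (4-1) continues T analytically far beyond the unit disc; here T(z) = z^2.
assert np.allclose(T(5.0), 25.0)
```

The example also shows why spectral minimality is a constraint only on the singular set: an entire transfer function forces no spectrum beyond that of A itself.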
Clearly this representation is analytic at all points z where |z|⁻¹ is larger than the spectral radius of A. But the formula (4-1) can be used as a basis for the analytic continuation of T, at least to all points z where z⁻¹ ∈ ρ₀(A), ρ₀(A) being the principal component of ρ(A), the resolvent set of A. If σ₀(A) denotes the complement of ρ₀(A) we clearly have

    {λ | λ⁻¹ ∈ σ(T)} ⊂ σ₀(A)        (4-2)

where σ(T) denotes the set of nonanalyticity of T. We call (4-2) the spectral inclusion property and say that (A, B, C, D) is a spectrally minimal realization if there exists an analytic continuation of T for which equality holds in (4-2).

This section is devoted to a finer spectral analysis of the shift realization for strictly noncyclic functions, the ultimate aim being the proof of the spectral minimality of such realizations. In the process we prove some results concerning inner functions that, interesting in themselves, turn out to be useful also for the degree theory developed in the next section. We recall, Lemma II 12-25, that given any inner function Q acting in an n-dimensional Hilbert space N there exists a unique, up to a constant factor of absolute value one, scalar inner function σ which satisfies σH²_N ⊂ QH²_N and σ | a for every a ∈ H^∞ for which aH²_N ⊂ QH²_N. The function σ is the minimal inner function of Q. We have σ | det Q and det Q | σⁿ.

Let now P and R be two inner functions acting in a finite dimensional Hilbert space N. Since PH²_N ∩ RH²_N is an invariant subspace of full range, by the remarks following Theorem II 12-24, we have PH²_N ∩ RH²_N = QH²_N for some inner function Q.
Theorem 4-1 Let P, R, and Q be inner functions such that QH²_N = PH²_N ∩ RH²_N and let π, ρ, and σ be their respective minimal inner functions. Then
(a) π | σ, ρ | σ, and σ | πρ.
(b) σ = πρ, equality up to a constant factor of absolute value one, if and only if π and ρ are coprime.

PROOF Since QH²_N = PH²_N ∩ RH²_N we have QH²_N ⊂ PH²_N. Thus σH²_N ⊂ QH²_N ⊂ PH²_N and hence π | σ, and similarly ρ | σ. Now as πH²_N ⊂ PH²_N and ρH²_N ⊂ RH²_N it follows that

    (π ∨ ρ)H²_N = πH²_N ∩ ρH²_N ⊂ PH²_N ∩ RH²_N = QH²_N

and hence σ | π ∨ ρ. This together with the division relations π | σ and ρ | σ implies that σ = π ∨ ρ, and clearly π ∨ ρ | πρ, which proves (a). Statement (b) follows from the observation that π ∨ ρ = πρ if and only if π ∧ ρ = 1, that is, if and only if π and ρ are coprime.

We proceed to discuss some closely related notions of minimality. Given an inner function Q acting in N, assumed to be finite dimensional, we have associated with it its minimal inner function. A contraction operator X is said to belong to the class C₀ if it is a completely
nonunitary contraction for which there exists a nontrivial function a in H^∞ for which a(X) = 0. Clearly, given the matrix inner function Q, the operators S(Q) and S₀(Q) defined by (II 13-4) and (3-1) are C₀ contractions. An annihilating function can be taken to be q = det Q. Given a C₀ contraction X, then J_X = {φ ∈ H^∞ | φ(X) = 0} is a w*-closed ideal in H^∞ and hence has the representation J_X = m_XH^∞ for some inner function m_X. The function m_X is called the minimal inner function of X. Finally, with each strictly noncyclic function T ∈ L^∞(B(U, Y)) we associate
the minimal inner function μ_T, that is, the minimal inner function μ for which μT admits an analytic extension to the exterior of the closed unit disc.
Theorem 4-2 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic having the coprime factorizations

    T = QH* = H₁*Q₁        (4-3)

on the unit circle. Let σ, m, and μ be the minimal inner functions of Q, S₀(Q), and T, respectively. Then, up to a constant factor of absolute value one, σ, m, and μ coincide.

PROOF For each f ∈ H²_{0,N} we have P_{H₀(Q)}σf = 0 as σH²_{0,N} ⊂ QH²_{0,N}. This implies that m | σ. Conversely, if m(S₀(Q)) = 0 it follows, by the invariance of QH²_{0,N} under multiplication by m, that P_{H₀(Q)}mf = 0 for all f ∈ H²_{0,N}. Thus mH²_{0,N} ⊂ QH²_{0,N} and so σ | m, and the coincidence of σ and m follows.

Since σ̂Q̂ has an analytic extension to D_e so has σ̂T̂ = σ̂Q̂Ĥ*, which shows that μ | σ. Conversely, since μ̂T̂ extends analytically to D_e we have μ̂T̂ = Ĝ* for some G in H^∞(B(Y, U)). Thus T = μG* and Range H_T ⊂ {μH²_{0,Y}}⊥, which is equivalent to μH²_{0,N} ⊂ QH²_{0,N}. Thus σ | μ, which completes the proof.
Coprime factorizations of the minimal inner function σ of an inner function Q induce factorizations of Q itself.

Theorem 4-3 Let N be an n-dimensional Hilbert space and let Q be an inner function acting in N. Let σ be the minimal inner function of Q and σ = πρ any coprime factorization of σ. Then there exist inner functions P and R, having π and ρ as their respective minimal inner functions, such that

    QH²_N = PH²_N ∩ RH²_N        (4-4)

and

    det Q = det P det R        (4-5)

Furthermore there exist inner functions P₁ and R₁ for which

    Q = PR₁ = RP₁        (4-6)

PROOF Assume σ = πρ is a coprime factorization of σ. Let M_π = {f ∈ H²_N | ρf ∈ QH²_N} and M_ρ = {f ∈ H²_N | πf ∈ QH²_N}. Clearly M_π and M_ρ are
invariant subspaces of H²_N and moreover QH²_N ⊂ M_π ∩ M_ρ. Since QH²_N is of full range so are M_π and M_ρ, and they have therefore representations of the form M_π = PH²_N and M_ρ = RH²_N for some inner functions P and R. To prove the converse inclusion M_π ∩ M_ρ ⊂ QH²_N it suffices to show that H(Q) = {QH²_N}⊥ ⊂ {PH²_N ∩ RH²_N}⊥, or that H(Q) is orthogonal to PH²_N ∩ RH²_N. To this end let f and g be arbitrary elements of H(Q) and PH²_N ∩ RH²_N, respectively. Define J = {φ ∈ H^∞ | ∫ χⁿφ(g, f) dt = 0, n ≥ 0}. Obviously J is an invariant subspace of H^∞ which is nontrivial, as both π and ρ are in J. Since π and ρ are coprime this implies that J = H^∞. Letting n = 0 we obtain ∫(g(e^{it}), f(e^{it})) dt = 0. Since f was arbitrary in H(Q), necessarily g ∈ QH²_N, or PH²_N ∩ RH²_N ⊂ QH²_N, and equality follows. Since PH²_N ∩ RH²_N ⊂ PH²_N we have the existence of an inner function R₁ such that Q = PR₁, and analogously there exists an inner function P₁ such that Q = RP₁.

To see that π is the minimal inner function of P we note that for each f ∈ H²_N, ρ(πf) = (ρπ)f = σf ∈ QH²_N; hence from the definition of M_π it follows that πH²_N ⊂ M_π = PH²_N. If a is any inner function for which aH²_N ⊂ M_π then (aρ)H²_N ⊂ QH²_N, which implies that σ | aρ, or that π | a. Thus π is the minimal inner function of P.

Finally we prove equality (4-5). From the factorizations (4-6) we obtain det P | det Q as well as det R | det Q. Since det P and det R are coprime it follows that det P det R | det Q. To prove the converse we note that σ | det Q and det Q | σⁿ. Since σⁿ = πⁿρⁿ it follows that det Q can be factored as det Q = pr where p | πⁿ and r | ρⁿ. Now ρM_π = ρPH²_N ⊂ QH²_N implies that det Q | ρⁿ det P, or that p | det P. Similarly r | det R, and the two division relations taken together with the coprimeness of p and r yield det Q | det P det R, which proves (4-5).
The importance of the minimal inner function μ of a strictly noncyclic function T stems from the fact that it gives a parametrization of the singularities of T̂, the analytic extension of T to D_e. Since T is an operator valued function this description is insufficient, and this is the motivation for deriving the next results concerning ideals in H^∞(B(U, Y)).

Let V₁ and V₂ be two Hilbert spaces. We denote by TC(V₁, V₂) and HS(V₁, V₂) the trace class and Hilbert-Schmidt class of operators from V₁ to V₂, respectively. The trace class norm of an operator T ∈ B(V₁, V₂) is denoted by ‖T‖₁. We let L¹(TC(V₁, V₂)) be the space of all weakly measurable TC(V₁, V₂)-valued functions F on the unit circle that satisfy

    ‖F‖_{L¹(TC(V₁,V₂))} = (1/2π) ∫₀^{2π} ‖F(e^{it})‖₁ dt < ∞

TC(V₁, V₂) considered as a Banach space has B(V₁, V₂) as its dual under the pairing

    ⟨T, X⟩ = tr(X*T)

As a consequence the dual of L¹(TC(V₁, V₂)) is given by L^∞(B(V₁, V₂)) where the
pairing is

    ⟨F, G⟩ = (1/2π) ∫₀^{2π} tr(G(e^{it})*F(e^{it})) dt

The Hilbert-Schmidt class HS(V₁, V₂) becomes a Hilbert space under the inner product (F₁, F₂) = tr(F₂*F₁). Thus it is natural to consider the corresponding Hardy space H²_{HS(V₁,V₂)}. A representation theorem extending the Beurling-Lax Theorem II 12-22 is the following.

Theorem 4-4
(a) A subspace of H²_{HS(V₁,V₂)} is invariant under right multiplication by all H^∞(B(V₁)) functions if and only if it is of the form QH²_{HS(V₁,V₂)} for some rigid function Q.
(b) A subspace of H²_{HS(V₁,V₂)} is invariant under left multiplication by all H^∞(B(V₂)) functions if and only if it is of the form H²_{HS(V₁,V₂)}Q₁ for some rigid function Q₁.
PROOF Since HS(V₁, V₂) is invariant under right multiplication by B(V₁) operators it follows that, given a rigid function Q, the space QH²_{HS(V₁,V₂)} is invariant under right multiplication by all H^∞(B(V₁)) functions. It is closed since it is the range of a partial isometry in H²_{HS(V₁,V₂)} induced by Q. The initial space of this partial isometry is the set of all H²_{HS(V₁,V₂)} functions whose values lie almost everywhere in the initial subspace of Q.

Conversely, let M ⊂ H²_{HS(V₁,V₂)} be invariant under right multiplication by all H^∞(B(V₁)) functions. Let M₀ be the subspace of H²_{V₂} spanned by all functions of the form Tξ where T ∈ M and ξ ∈ V₁. Clearly M₀ is an invariant subspace of H²_{V₂} and hence, by Theorem II 12-22, there exists a rigid function Q for which M₀ = QH²_{V₂}. Given T ∈ M and ξ ∈ V₁ we have Tξ = Qφ_ξ for some φ_ξ ∈ H²_{V₂}. The function φ_ξ is not uniquely determined, but we can make it so by the additional requirement that φ_ξ lie almost everywhere in the initial space of Q. Thus we may define an analytic operator valued function Φ by Φ(z)ξ = φ_ξ(z). For each ξ ∈ V₁ we have ‖T(z)ξ‖² = ‖Q(z)φ_ξ(z)‖² = ‖Q(z)Φ(z)ξ‖². If {ξ_j} is an orthonormal basis in V₁ it follows that, almost everywhere on the unit circle, ‖T(e^{it})‖²_{HS} = Σ_j ‖T(e^{it})ξ_j‖² = ‖Φ(e^{it})‖²_{HS}, which shows that Φ(e^{it}) ∈ HS(V₁, V₂), Φ ∈ H²_{HS(V₁,V₂)}, and M ⊂ QH²_{HS(V₁,V₂)}.

For a fixed ξ, {Tξ | T ∈ M} is an invariant subspace included in QH²_{V₂}. Thus for an arbitrary e ∈ V₂ there exists an f ∈ V₁ such that Tf = Qe, and this in turn implies that Qe ⊗ f belongs to M for all f ∈ V₁ and e ∈ V₂, e ⊗ f being defined by (e ⊗ f)x = (x, f)e. From this we infer directly that M ⊇ QH²_{HS(V₁,V₂)}, which proves (a). Part (b) follows by duality.
In the case of finite dimensional spaces V₁ and V₂ the trace class TC(V₁, V₂) and the Hilbert-Schmidt class HS(V₁, V₂) both coincide with B(V₁, V₂), and this identification will be used in the sequel. Let now T be a strictly noncyclic function in H^∞(B(U, Y)). Given a complex
λ, |λ| > 1, we define I_L(T; λ) and I_R(T; λ) by

    I_L(T; λ) = {P ∈ H^∞(B(Y)) | P*T̂ extends analytically to λ}        (4-7)

and similarly

    I_R(T; λ) = {P ∈ H^∞(B(U)) | T̂P* extends analytically to λ}        (4-8)
Clearly I_L(T; λ) and I_R(T; λ) are right and left ideals in H^∞(B(Y)) and H^∞(B(U)), respectively. Moreover I_L(T; λ) = H^∞(B(Y)) if and only if T̂ is analytic at λ, and likewise for I_R(T; λ). If μ is the minimal function of T then μI belongs to I_L(T; λ) as well as to I_R(T; λ), which shows that both are subspaces of full range. Since they are also clearly w*-closed, Theorem 4-4 can be applied in order to get the representations I_L(T; λ) = S_λH^∞(B(Y)) and I_R(T; λ) = H^∞(B(U))S₁,λ where S_λ and S₁,λ are inner functions. The ideals I_L(T; λ) and I_R(T; λ) serve as a local measure of the singularities
of a strictly noncyclic function T. To get a global measure we introduce I_L(T) and I_R(T) through

    I_L(T) = {P ∈ H^∞(B(Y)) | P*T̂ extends analytically to D_e}        (4-9)

and

    I_R(T) = {P ∈ H^∞(B(U)) | T̂P* extends analytically to D_e}        (4-10)
The spaces I_L(T) and I_R(T) are w*-closed right and left ideals in H^∞(B(Y)) and H^∞(B(U)), respectively, which are of full range and hence have representations I_L(T) = SH^∞(B(Y)) and I_R(T) = H^∞(B(U))S₁, respectively, where S and S₁ are inner functions.

With the above definitions we can introduce some equivalence relations in the class of strictly noncyclic functions. We say that two strictly noncyclic functions T₁ and T₂ have equivalent left singularities at a point λ if I_L(T₁; λ) = I_L(T₂; λ), and similarly for right singularities. T₁ and T₂ have globally equivalent left singularities if I_L(T₁) = I_L(T₂), and similarly for globally equivalent right singularities. Thus the inner functions S_λ, S₁,λ, S, and S₁ parametrize the local and global singularities. Next we show that they are essentially related to the coprime factorizations of a strictly noncyclic function.
Theorem 4-5 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic, having the coprime factorizations T = QH* = H_1*Q_1 on the unit circle, and let I_L(T) = SH^∞(B(Y)) and I_R(T) = H^∞(B(U))S_1. Then Q and S are equal up to a constant right unitary factor, and Q_1 and S_1 are equal up to a constant left unitary factor.
PROOF Since T = QH*, it follows that Q*T = H* extends analytically to D_e, and so Q ∈ I_L(T), or Q = SR for some, necessarily inner, function R. This means that QH^2_Y ⊂ SH^2_Y. Conversely, since S*Tξ is orthogonal to H^2_Y for every ξ ∈ U, and since L^2_Y ⊖ H^2_Y is invariant under multiplication by S*, it follows that

    P_{H^2_Y} S* P_{H^2_Y} Tξ = P_{H^2_Y} S* Tξ = 0
265
LINEAR SYSTEMS IN HILBERT SPACE
for all ξ ∈ U. Since the vectors of the form P_{H^2_Y}Tξ span H(Q) = Range H_T, it follows that S*H(Q) ⊥ H^2_Y. By Lemma II 13-5 we have H(Q) ⊂ H(S), or SH^2_Y ⊂ QH^2_Y. Together with the previously obtained inverse inclusion we have QH^2_Y = SH^2_Y, and hence Q and S differ by at most a constant right unitary factor.

Corollary 4-6
(a) Given two strictly noncyclic functions T ∈ H^∞(B(U, Y)) and T_1 ∈ H^∞(B(U_1, Y)), T and T_1 have equivalent left singularities if and only if Range H_T = Range H_{T_1}.
(b) Given two strictly noncyclic functions T ∈ H^∞(B(U, Y)) and T_1 ∈ H^∞(B(U, Y_1)), T and T_1 have equivalent right singularities if and only if Ker H_T = Ker H_{T_1}.

Corollary 4-7 If T ∈ H^∞(B(U, Y)) is strictly noncyclic and has the coprime
factorization T = QH*, then T is left equivalent to Q.

Next we pass on to the analysis of the local singularities. For every λ such that |λ| > 1 it is obvious that I_L(T) ⊂ I_L(T; λ), and hence S = S_λS^λ for some inner function S^λ. Assuming T admits the coprime factorization T = QH*, we let a be the minimal inner function of Q. Let a = a_λc_λ, where a_λ is the Blaschke factor that corresponds to the zeros of a at λ^{-1}; thus c_λ(λ^{-1}) ≠ 0. This means that a_λ and c_λ are coprime. By Theorem 4-3 this factorization induces factorizations

    Q = Q_λQ^λ = Q'^λQ'_λ        (4-11)

of Q on the unit circle. a_λ is the minimal inner function of Q_λ and Q'_λ, whereas c_λ is the minimal inner function of Q^λ and Q'^λ.

Theorem 4-8 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic, admitting the coprime factorization T = QH* on the unit circle. Let Q be factored as in (4-11) and let I_L(T; λ) = S_λH^∞(B(Y)). Then Q_λ and S_λ coincide up to a constant right unitary factor.

PROOF Since T = QH* = Q_λQ^λH*, it follows that Q_λ*T = Q^λH*, which has an extension to the exterior of the unit disc given by Q^λ(z^{-1})^{-1}H(z^{-1}); this extension is analytic at λ as Q^λ(λ^{-1}) is invertible. Thus Q_λ ∈ I_L(T; λ), and Q_λ = S_λR for some inner function R.

Conversely assume Q_λ = S_λR, R being a nontrivial inner factor. Obviously the minimal inner function of R is a factor of a_λ. Thus the only singularity of the analytic extension of R to D_e is a pole at λ. Now S_λ*T extends meromorphically to D_e and the extension is analytic at λ. From the coprime factorization of T and (4-11) we have

    T' = S_λ*T = S_λ*Q_λQ^λH* = RQ^λH*

Since the minimal inner functions of R and Q^λ are coprime there exist, by Theorem 4-3, inner functions R'' and Q'' satisfying RQ^λ = Q''R'', det R =
T= det R" and det QA = det QA". Now T' = RQAH* = Q"AR"H* and so R"H* extends analytically to De with the possible exception of A. But it must
be analytic at A too since T extends analytically to A. Thus the range of the Hankel operator induced by R"H* is trivial which shows that R" is constant and hence so is R.
Lastly we pass on to the study of the boundary behavior of a strictly noncyclic function T and its associated inner functions.

Theorem 4-9 Let C ∈ H^∞(B(U, Y)), let P be an inner function in H^∞(B(Y)), and assume that (P, C)_L = I. If P(z)^{-1}C(z) has an analytic continuation at a point λ of the unit circle, then P has an analytic continuation at λ.

PROOF Let U be a disc centered at λ on which P(z)^{-1}C(z) is analytic, and assume ||P(z)^{-1}C(z)|| ≤ M for z ∈ U ∩ D. Let V be a disc centered at λ and properly contained in U. Let π be the minimal inner function of P, which we factor as π = π_λπ^λ with π_λ ∧ π^λ = 1. We assume that π_λ(z) ≠ 0 for z ∈ D − V and that the singular measure in the integral representation of the singular factor of π_λ is supported on the intersection of the unit circle with the closure of V; similarly π^λ(z) ≠ 0 for z ∈ V ∩ D and its singular measure is supported on T − V. It follows that inf{|π_λ(z)| : z ∈ D − U} > 0, and in particular π_λ extends analytically at all points of T − U, by an application of Theorem II 12-27.

Corresponding to the factorization of π we have a factorization P = P_λP^λ, with π_λ and π^λ the minimal inner functions of P_λ and P^λ, respectively. Consider now the function A defined by A(z) = P^λ(z)P(z)^{-1}C(z) = P_λ(z)^{-1}C(z). As π_λ is the minimal inner function of P_λ, there exists an inner function Π_λ for which P_λΠ_λ = π_λI. Therefore P_λ(z)^{-1} = π_λ(z)^{-1}Π_λ(z), from which the boundedness of P_λ(z)^{-1}, and hence also that of A(z), in D − U follows. For z ∈ D ∩ U, P(z)^{-1}C(z) is bounded by assumption and hence A(z) is bounded there as well. Thus A ∈ H^∞(B(U, Y)) and so C = P_λA, which together
with the factorization P = P_λP^λ contradicts the coprimeness condition (P, C)_L = I. Thus necessarily P_λ is trivial, that is, P_λ is a constant unitary operator; the minimal inner function of P is then π^λ, which has an analytic continuation at λ, and therefore P itself is analytically continuable at λ.

Theorem 4-10 Let T be strictly noncyclic in H^∞(B(U, Y)), admitting the coprime factorization T = QH* on the unit circle. Then T has an analytic continuation at a point λ, |λ| = 1, if and only if Q has.

PROOF If Q is analytically continuable at λ then, by Lemma II 13-6, so is every function in H_0(Q). Since for all ξ ∈ U the function H_Tξ = P_{H^2_{0,Y}}Tξ is in H_0(Q), it follows that (T(z) − T(0))/z has an analytic continuation at λ, and so has T(z).
Conversely, assume T has the coprime factorization T = QH* on the unit circle. Thus Q̃(z^{-1})^{-1}H̃(z^{-1}) is the meromorphic extension of T to D_e, which by our assumption is analytic at λ. This is equivalent to Q̃(z)^{-1}H̃(z) having an analytic continuation at λ^{-1}. Now (Q, H)_R = I implies (Q̃, H̃)_L = I, and the result follows from the previous theorem.
In conclusion we are ready to put everything together and state the central result of this section.
Theorem 4-11 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic. Then the shift realization of T is spectrally minimal.
PROOF Let T = QH* be the coprime factorization of T on the unit circle. The state space of the shift realization is H_0(Q), and the generator, or state operator, of the shift realization is S_0(Q)*. By Theorem II 13-8 the spectrum of S_0(Q)* is completely determined by Q: it is equal to the set of all points λ, |λ| < 1, where Q(λ) is not invertible, together with the points λ, |λ| = 1, where Q is not analytically continuable. Now T extends meromorphically to D_e with the exception of at most a countable number of poles, located at the points λ where Q(λ^{-1}) is not invertible. Similarly, T and Q are analytically continuable at the same points of the unit circle; thus T has no analytic extension at λ, |λ| = 1, if and only if λ̄ = λ^{-1} belongs to the continuous spectrum of S_0(Q)*. This completes the proof.
For functions which are not strictly noncyclic the shift realization does not provide a useful tool for analysis. This is the setting for a striking counterexample to the state space isomorphism theorem. For simplicity we restrict ourselves to the scalar case.
Let T ∈ H^∞ be a nonrational noncyclic function relative to the left shift operator. Let T_ρ(z) = T(ρz), where 0 < ρ < 1. Obviously if T ∈ H^∞ so are all T_ρ. In fact T_ρ is analytic in the region |z| < ρ^{-1}, and since T_ρ is not rational it is necessarily a cyclic function for the left shift. Let Σ = (A, B, C) and Σ_ρ = (A_ρ, B_ρ, C_ρ) be the shift realizations of T and T_ρ, respectively. Clearly Σ'_ρ = (ρ^{-1}A_ρ, B_ρ, C_ρ) is a realization of T. Now A = S*|Range H_T, whereas ρ^{-1}A_ρ = ρ^{-1}S*, as Range H_{T_ρ} = H^2_0. Since Range H_T is a proper left invariant subspace of H^2_0, the spectrum of A has at most a countable number of points inside the open unit disc. The spectrum of ρ^{-1}A_ρ, on the other hand, coincides with the closed disc of radius ρ^{-1}. This excludes the possibility of the isomorphism of the two realizations, as similarity preserves the spectrum.
5. DEGREE THEORY FOR STRICTLY NONCYCLIC FUNCTIONS

Although a general degree theory, extending the McMillan degree of rational functions, is not available, it is possible to develop a complete degree theory for the case of strictly noncyclic transfer functions. The main difference is that while in the finite dimensional case the degree is defined by use of the dimension function, and hence is essentially an additive function, in the infinite dimensional case we will use a multiplicative analogue.
Given a finite dimensional vector space V, the dimension is a function from the set of all subspaces of V into the monoid Z_+ which has the following properties:

(a) M ⊂ M_1 implies dim M ≤ dim M_1
(b) dim(M + N) ≤ dim M + dim N        (5-1)

with equality if and only if M ∩ N = {0}.

Let now N denote a finite dimensional Hilbert space. Let L_N denote the set of all left invariant subspaces of H^2_N whose orthogonal complement has full range. Thus L_N coincides with the set of all subspaces of H^2_N of the form H(Q) = {QH^2_N}^⊥, where Q is an inner function. Let I_N denote the multiplicative monoid of scalar inner functions. Define a map d: L_N → I_N by

    d(H(Q)) = det Q        (5-2)
then the next theorem shows that the function d is a suitable generalization of the dimension function.
Theorem 5-1 Let P and R be inner functions in N. Then the following statements are true.
(a) H(P) ⊂ H(R) implies d(H(P)) | d(H(R))
(b) d(H(P) ∨ H(R)) | d(H(P))·d(H(R))
(c) d(H(P) ∨ H(R)) = d(H(P))·d(H(R)) if and only if (P, R)_L = I_N.
PROOF (a) H(P) ⊂ H(R) if and only if RH^2_N ⊂ PH^2_N, which is equivalent to the factorization R = PS for some inner function S. By taking determinants we obtain det P | det R, which is the same as d(H(P)) | d(H(R)).

Next we prove (c). Assume H(P) ∩ H(R) = {0}, which is equivalent to PH^2_N ∨ RH^2_N = H^2_N, or to (P, R)_L = I_N. Let Q be an inner function for which QH^2_N = PH^2_N ∩ RH^2_N; Q exists since for N of finite dimension the intersection of invariant subspaces of full range has full range. We note that Q is determined only up to a constant unitary factor on the right. Since QH^2_N ⊂ PH^2_N we have Q = PR_1, and similarly Q = RP_1, for some inner functions P_1 and R_1. Define A = P*R; then A is obviously a strictly noncyclic function in L^∞(B(N)). By Theorem 3-5 A has also a factorization
A = R_2P_2* on the unit circle, and the coprimeness condition (R_2, P_2)_R = I_N is satisfied. Since R_2 is inner the same must be true of P_2, as A is a.e. unitary on the unit circle. Now the equality P*R = R_2P_2* implies

    RP_2 = PR_2        (5-3)
We can now apply Theorem II 14-11 to infer that S(P_2) and S(P) are quasisimilar, and by the same token also S(R) and S(R_2) are quasisimilar. By Theorem II 15-17, P and P_2 have the same Jordan model, and the same is true of R and R_2. Since P and P_2 have the same invariant factors, it follows in particular that det P = det P_2, and similarly that det R = det R_2. We will show that, modulo a constant unitary factor on the right, P_1 is equal to P_2, and the same holds for R_1 and R_2. From (5-3) it follows that RP_2H^2_N ⊂ RH^2_N and also RP_2H^2_N = PR_2H^2_N ⊂ PH^2_N, so RP_2H^2_N ⊂ RH^2_N ∩ PH^2_N = QH^2_N. Hence there exists an inner function Z for which RP_2 = QZ = RP_1Z, or P_2 = P_1Z. By similar reasoning R_2 = R_1Z, and so Z is a common right inner factor of P_2 and R_2; by the assumed right coprimeness of R_2 and P_2, Z is constant. This implies that, up to a constant of absolute value one,

    det Q = det P · det R        (5-4)
If P and R are not left coprime, let S be a greatest common left inner divisor. Thus P = SP' and R = SR' with (P', R')_L = I_N. Since SP'H^2_N ∩ SR'H^2_N = S(P'H^2_N ∩ R'H^2_N) = SQ'H^2_N = QH^2_N, we can apply the first part of the proof to obtain det Q' = det P'·det R'. Hence det Q = det SQ' = det S·det Q' = det S·det P'·det R', which divides (det S)^2·det P'·det R' = det SP'·det SR' = det P·det R. This proves (b).
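As a concrete scalar check of the theorem, one may take P and R to be powers of a single Blaschke factor; the following worked special case is an illustration added here, not part of the original text, and the notation b_a is ours:

```latex
\text{For } P = b_a^{\,m},\quad R = b_a^{\,n},\qquad
b_a(z) = \frac{z-a}{1-\bar a z},\ |a|<1:
\qquad \dim H(P) = m,\qquad d(H(P)) = \det P = b_a^{\,m},
\]
\[
PH^2 \cap RH^2 = b_a^{\max(m,n)}H^2,\qquad
d\bigl(H(P)\vee H(R)\bigr) = b_a^{\max(m,n)} \ \Bigm|\ b_a^{\,m+n} = d(H(P))\,d(H(R)),
```

with equality precisely when min(m, n) = 0, that is, when one of the two inner functions is constant; this is the only way b_a^m and b_a^n can be left coprime, in agreement with part (c).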
Corollary 5-2 Let P and R be inner functions in N and let Q be an inner function for which QH^2_N = PH^2_N ∩ RH^2_N. Then there exist inner functions P_1 and R_1 such that Q = PR_1 = RP_1, and moreover (P, R)_L = I_N if and only if (P_1, R_1)_R = I_N.

PROOF That the factorizations Q = PR_1 = RP_1 hold follows from the inclusions PH^2_N ∩ RH^2_N ⊂ PH^2_N and PH^2_N ∩ RH^2_N ⊂ RH^2_N. Assume (P, R)_L = I_N, hence H(P) ∩ H(R) = {0}, and with it the equalities det R = det R_1 and det P = det P_1. Now Q̃ = R̃_1P̃ = P̃_1R̃, and since det Q = det P·det R = det P_1·det R_1, it follows that (P̃_1, R̃_1)_L = I_N, which is the same as (P_1, R_1)_R = I_N.
From Theorem 5-1 it is clear that the determinant of an inner function Q provides a suitable generalization of the concept of dimension for subspaces of the form H(Q). This will also be the key to the generalization of the McMillan degree theory of rational functions to the case of strictly noncyclic functions. Equality (5-4) is equivalent to H(Q) = H(P) ∨ H(R) and H(P) ∩ H(R) = {0}. This is certainly satisfied whenever

    H(Q) = H(P) + H(R)        (5-5)

where + denotes the, not necessarily orthogonal, direct sum of H(P) and H(R).
For (5-5) to hold the condition (P, R)_L = I_N is in general insufficient. One expects that (P, R)_L = I_N should be replaced by the stronger coprimeness condition [P, R]_L = I_N, and this turns out to be true. In preparation we prove some necessary lemmas.
If R is an inner function acting in N, then H_R, the Hankel operator induced by R and defined by (1-17), is a partial isometry from H^2_N into H^2_{0,N}. Its range is given by H_0(R) = H^2_{0,N} ⊖ RH^2_{0,N}. Applying Theorem II 2-3, the orthogonal projection of H^2_{0,N} onto H_0(R) is given by

    P_{H_0(R)} = H_R H_R*        (5-6)
Lemma 5-3 Let P and R be inner functions in N. If (P, R)_L = I_N then P_{H_0(R)}{PH^2_{0,N}} is dense in H_0(R).

PROOF Since for f ∈ H^2_{0,N}

    P_{H_0(R)}Pf = H_R H_R* Pf = H_R P_{H^2_N} R*Pf

a simple adaptation of Theorem 3-5 shows that the coprimeness condition (R, P)_L = I_N implies that the map f → P_{H^2_N}R*Pf, which is just H_{R*P}, has range dense in H^2_N ⊖ R*H^2_N. Since Ker H_R = R*H^2_N, the result follows.
Using the available information concerning the range closure of Hankel operators, we can strengthen the previous lemma to obtain the following.
Lemma 5-4 Let P and R be inner functions in N. If [P, R]_L = I_N then P_{H_0(R)}{PH^2_{0,N}} = H_0(R).

PROOF By Theorem 3-10 the range of H_{R*P} is H^2_N ⊖ R*H^2_N, given that [P, R]_L = I_N is satisfied. But H^2_N ⊖ R*H^2_N is just the initial space of H_R, and hence it is mapped isometrically onto a closed subspace of H^2_{0,N}, which by the previous lemma has to coincide with H_0(R).
We can now relate strong coprimeness of inner functions to the geometry of left invariant subspaces in H^2_{0,N}. For any two subspaces M_1 and M_2 of a Banach space X which satisfy M_1 ∩ M_2 = {0}, the sum M_1 + M_2 is closed if and only if for some d > 0

    inf{||x_1 − x_2|| : x_i ∈ M_i, ||x_i|| = 1, i = 1, 2} ≥ d        (5-7)

In a Hilbert space condition (5-7) is equivalent to

    sup{|(x_1, x_2)| : x_i ∈ M_i, ||x_i|| = 1, i = 1, 2} < 1        (5-8)
This last condition has the interpretation that the angle between M1 and M2 is positive.
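In the finite dimensional case conditions (5-7) and (5-8) can be exercised numerically. The sketch below is an illustration added here, not from the text; the helper name `cos_angle` is ours, and it computes the supremum in (5-8) as the largest singular value of the cross-Gram matrix of two orthonormal bases.

```python
import numpy as np

def cos_angle(U1, U2):
    # U1, U2: matrices whose orthonormal columns span M1 and M2.
    # The supremum in (5-8) equals the largest singular value of U1* U2.
    return np.linalg.svd(U1.conj().T @ U2, compute_uv=False)[0]

# Two lines in R^2 meeting at 45 degrees:
U1 = np.array([[1.0], [0.0]])
U2 = np.array([[1.0], [1.0]]) / np.sqrt(2)

c = cos_angle(U1, U2)        # sup |(x1, x2)| over unit vectors
d = np.sqrt(2.0 - 2.0 * c)   # the corresponding infimum in (5-7), real case
```

Here c < 1 reflects the positive angle between the two subspaces; as the angle shrinks, c tends to 1 and d tends to 0, which is exactly the degeneration exploited in the approximation argument of Theorem 5-5 below.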
Theorem 5-5 Let P and R be inner functions in N. The angle between the left invariant subspaces H_0(P) and H_0(R) is positive if and only if [P, R]_L = I_N. Equivalently, there exists an inner function Q such that

    H_0(Q) = H_0(P) + H_0(R)        (5-9)

and

    det Q = det P · det R        (5-10)

hold if and only if [P, R]_L = I_N.

PROOF Assume [P, R]_L = I_N. Let PH^2_{0,N} ∩ RH^2_{0,N} = QH^2_{0,N}; then by Theorem 3-10 and Corollary 5-2, Q = PR_1 = RP_1 and also [P_1, R_1]_R = I_N. We now apply Theorem II 14-10 to infer the existence of Φ and Ψ in H^∞(B(N)) for which ΦP_1 + ΨR_1 = I_N. In turn this implies Q* = ΦP_1Q* + ΨR_1Q* = ΦR* + ΨP*, and by taking adjoints we obtain

    Q = RΦ* + PΨ*        (5-11)
We saw already that the weaker condition (P, R)_L = I_N implies H_0(Q) = H_0(P) ∨ H_0(R). Now from (5-11) it follows that

    H_0(Q) = Range H_Q = Range H_{RΦ*+PΨ*} ⊂ Range H_{RΦ*} + Range H_{PΨ*} ⊂ H_0(R) + H_0(P) ⊂ H_0(R) ∨ H_0(P) = H_0(Q)

Hence the equality (5-9) is obtained.

To show the necessity of the strong coprimeness condition [P, R]_L = I_N, we assume that P and R are not strongly coprime. The most obvious violation of [P, R]_L = I_N is the existence of a nonzero vector η ∈ N and a point λ in the open unit disc for which

    P(λ)*η = R(λ)*η = 0        (5-12)

But P(λ)*η = 0 implies that the function χ(1 − λ̄χ)^{-1}η is in H_0(P), and hence (5-12) implies H_0(P) ∩ H_0(R) ≠ {0}.
In general (5-12) does not hold, and we resort to an approximation argument similar to the one used in the proof of Theorem II 14-11. If [P, R]_L ≠ I_N there exists a sequence of points λ_n, |λ_n| < 1, and a sequence of unit vectors η_n ∈ N for which

    lim ||P(λ_n)*η_n|| = lim ||R(λ_n)*η_n|| = 0

We will show the existence of a sequence F_n ∈ H_0(P) and a sequence F'_n ∈ H_0(R) for which lim ||F_n|| = lim ||F'_n|| = 1 and also lim (F_n, F'_n) = 1. This implies that H_0(P) and H_0(R) have zero angle between them.
The functions H_n = (1 − |λ_n|^2)^{1/2} χ(1 − λ̄_nχ)^{-1}η_n are normalized eigenfunctions of the left shift in H^2_{0,N}. Let F_n and F'_n be their orthogonal projections on H_0(P) and H_0(R), respectively, and let G_n = H_n − F_n and G'_n = H_n − F'_n. It follows from Lemma II 13-7 that

    G_n(z) = (1 − |λ_n|^2)^{1/2} z(1 − λ̄_n z)^{-1} P(z)P(λ_n)*η_n

and hence that lim ||G_n|| = lim ||G'_n|| = 0, and as a consequence that lim ||F_n|| = lim ||F'_n|| = 1. Now

    (F_n, F'_n) = (H_n, H_n) − (H_n, G'_n) − (G_n, H_n) + (G_n, G'_n)

The last three terms obviously tend to zero, and we have lim (F_n, F'_n) = 1 as required.

We recall that the McMillan degree of a proper rational matrix function T is defined as the dimension of the state space of any, and hence by Theorem I 8-4 all, canonical realizations of T. Thus δ is a map from the set of proper rational functions into Z_+ which satisfies

(a) δ(T_1 + T_2) ≤ δ(T_1) + δ(T_2)
(b) δ(T_1T_2) ≤ δ(T_1) + δ(T_2)        (5-13)

Equalities in (a) and (b) are subject to coprimeness conditions which guarantee that no pole-zero cancellations occur. An alternative way to define the McMillan degree of a rational function T is as the rank of the associated Hankel matrix. Since the rank of a matrix is the dimension of its range space, then just as we used the determinant function to replace the concept of dimension, we are led to make the following definition. Let T be a strictly noncyclic function in H^∞(B(U, Y)). We define the degree of T, denoted by Δ(T), by

    Δ(T) = d(Range H_T)        (5-14)

where d is defined by (5-2). Thus if Q is an inner function such that {Range H_T}^⊥ = QH^2_{0,Y}, then Δ(T) = det Q.

Theorem 5-6 Let T ∈ H^∞(B(U, Y)) be strictly noncyclic. Then
Δ(T̃) = Δ̃(T)
(5-15)
PROOF That T̃ is strictly noncyclic is the content of Corollary 3-6(c). If T = QH* = H_1*Q_1 are coprime factorizations of T, then T̃ = Q̃_1H̃_1* = H̃*Q̃. Thus Δ(T̃) = det Q̃_1. Since S(Q) and S(Q_1) are quasisimilar, Q and Q_1 are quasiequivalent, and hence det Q = det Q_1. This proves the theorem.

The degree function Δ defined by (5-14) satisfies the multiplicative analogs of (5-13), that is

(a) Δ(T_1 + T_2) | Δ(T_1)·Δ(T_2)
(b) Δ(T_1T_2) | Δ(T_1)·Δ(T_2)        (5-16)
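In the rational scalar case the notions above are directly computable: the McMillan degree is the rank of the Hankel matrix built from the Markov parameters. The sketch below is an illustration added here, not from the text; the helper names are ours, and the convention T(z) = Σ_{n≥1} t_n z^n with t_n = CA^{n-1}B (as in the proof of Theorem 5-15 below) is assumed. It checks the additive bounds (5-13) numerically in a case without pole-zero cancellation.

```python
import numpy as np

def markov(A, B, C, n):
    # Markov parameters t_1, ..., t_n of T(z) = z C (I - zA)^{-1} B
    return [(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(n)]

def degree(t, size):
    # McMillan degree as the rank of the Hankel matrix H[i, j] = t_{i+j+1}
    H = np.array([[t[i + j] for j in range(size)] for i in range(size)])
    return np.linalg.matrix_rank(H)

# Two degree-one transfer functions with distinct poles:
A1, B1, C1 = np.array([[0.5]]), np.array([[1.0]]), np.array([[1.0]])
A2, B2, C2 = np.array([[-0.3]]), np.array([[1.0]]), np.array([[1.0]])
t1, t2 = markov(A1, B1, C1, 8), markov(A2, B2, C2, 8)

d1, d2 = degree(t1, 4), degree(t2, 4)               # both equal 1
dsum = degree([a + b for a, b in zip(t1, t2)], 4)   # degree of T1 + T2
# Cauchy product of the two power series gives the Markov parameters of T1*T2:
tp = [sum(t1[i] * t2[m - 1 - i] for i in range(m)) for m in range(8)]
dprod = degree(tp, 4)                               # degree of T1*T2
```

Since the two poles are distinct, no cancellation occurs and both bounds in (5-13) are attained: `dsum` and `dprod` both equal d1 + d2 = 2.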
To prove this we have to study in detail the ranges of Hankel operators induced by sums and products of strictly noncyclic functions. We begin with the study of products. Let L, M, and N be three finite dimensional Hilbert spaces. Let A and B be strictly noncyclic functions in L^∞(B(N, M)) and L^∞(B(L, N)), respectively. By Theorem 3-5 the functions A and B have the following factorizations on the unit circle

    A = PC* = C_1*P_1        (5-17)

and

    B = RD* = D_1*R_1        (5-18)

where P is an inner function in M, P_1 and R are inner in N, R_1 is inner in L, whereas C, C_1 ∈ H^∞(B(M, N)) and D, D_1 ∈ H^∞(B(N, L)). Moreover we assume the factorizations to be coprime, that is, the conditions

    (P, C)_R = I_M,    (P_1, C_1)_L = I_N        (5-19)

and

    (R, D)_R = I_N,    (R_1, D_1)_L = I_L        (5-20)

are satisfied.
Since A and B are strictly noncyclic, both have meromorphic extensions of bounded type to D_e, and hence so does their product AB. So T = AB is itself strictly noncyclic and, by the same theorem used before, admits factorizations

    T = QH* = H_1*Q_1        (5-21)

with

    (Q, H)_R = I_M,    (Q_1, H_1)_L = I_L        (5-22)

satisfied. The analysis of the general case will be based on the two special cases B = R and B = D*.
Lemma 5-7 Let A ∈ L^∞(B(N, M)) be strictly noncyclic and let R be an inner function in N; then Range H_A ⊂ Range H_{AR}.

PROOF Let f ∈ H^2_N. Then

    H_{AR}(R*f) = P_{H^2_{0,M}} ARR*f = P_{H^2_{0,M}} Af = H_A f

which proves the stated range inclusion.

In this case T = AR, and from the coprime factorization (5-21) we obtain Range H_{AR} = H_0(Q).
Lemma 5-8 Let A and AR have the coprime factorizations (5-17) and (5-21), respectively. Then

    det Q = det P · det R        (5-23)

if and only if

    (R, C)_L = I_N        (5-24)

PROOF Assume R and C have a nontrivial greatest common left inner factor S. Thus R = SR_2 and C = SC_2 with (R_2, C_2)_L = I_N. Since S is nontrivial and det R_2 | det R, we have det R ≠ det R_2. Now T = AR = PC*R = PC_2*R_2. By Theorem 3-5 there exist R_3 and C_3 satisfying (R_3, C_3)_R = I_M as well as C_2*R_2 = R_3C_3*. Since R_2 and R_3 are quasiequivalent, the equality det R_2 = det R_3 holds. Now from T = AR = PC_2*R_2 = PR_3C_3* it follows that H_0(Q) = Range H_{AR} ⊂ H_0(PR_3). Thus PR_3 is divisible on the left by Q, and hence det Q | det P·det R_3, which is a proper divisor of det P·det R; with it (5-23) is impossible.

To prove the converse, assume (5-24) holds. As before, we may write C*R = R_2C_2* with (R_2, C_2)_R = I_M, so that T = AR = PC*R = PR_2C_2* = QH*. From Lemma 5-7, H_0(P) ⊂ H_0(Q), and hence Q = PS for some inner function S. Thus PC*R = QH* = PSH* with (S, H)_R = I_M, and from it the equality R_2C_2* = SH*. As both factorizations are coprime we have H_0(R_2) = H_0(S), and therefore R_2 and S differ at most by a constant unitary factor on the right. In particular det S = det R_2 = det R, and (5-23) is satisfied.
This lemma can be sharpened to yield a result about the range closure of HAR.
Lemma 5-9 Let A and AR admit the coprime factorizations (5-17) and (5-21), respectively. If H_A has closed range, then H_{AR} has closed range H_0(Q) with (5-23) satisfied if and only if

    [R, C]_L = I_N        (5-25)

PROOF For (5-23) to hold the coprimeness condition (5-24) is necessary. Using the notation of the proof of the previous lemma, AR = PC*R = PR_2C_2*, and the last factorization is coprime. For the range closure of H_{AR}, [PR_2, C_2]_R = I_M is necessary, which implies the necessity of the weaker condition [R_2, C_2]_R = I_M; this is equivalent to (5-25).

Conversely, assume (5-25) holds; then by the previous lemma Range H_{AR} is dense in H_0(Q) = H_0(PR_2). Since

    H_0(PR_2) = H_0(P) ⊕ PH_0(R_2)        (5-26)

and as Range H_A = H_0(P) ⊂ Range H_{AR}, it suffices to show that PH_0(R_2) ⊂ Range H_{AR}. To this end let f ∈ H^2_N; then

    H_{AR}f = P_{H^2_{0,M}} ARf = P_{H^2_{0,M}} PR_2C_2*f = P·P_{H^2_{0,M}} R_2C_2*f = P·H_{R_2C_2*}f

Now (5-25) implies [R_2, C_2]_R = I_M, and with it Range H_{R_2C_2*} = H_0(R_2).
Next we assume B = D* for D ∈ H^∞(B(N, L)), or equivalently T = AD*.
Lemma 5-10 Let A be strictly noncyclic in L^∞(B(N, M)) and D ∈ H^∞(B(N, L)). Then AD* is strictly noncyclic and Range H_{AD*} ⊂ Range H_A.

PROOF Let f ∈ H^2_L. Then

    H_{AD*}f = P_{H^2_{0,M}} AD*f = H_A(D*f)

If T = AD* = PC*D* factors coprimely as before by (5-21), we have Range H_{AD*} = H_0(Q) ⊂ H_0(P). The inclusion implies P = QS for some inner function S.
Lemma 5-11 Let A be strictly noncyclic in L^∞(B(N, M)) and let A and T = AD* have the coprime factorizations (5-17) and (5-21), respectively. A necessary and sufficient condition for the equality

    det P = det Q        (5-27)

to be satisfied, up to a constant of absolute value one, is

    (P_1, D)_R = I_N        (5-28)

PROOF The proof of necessity follows along the lines of the proof of Lemma 5-8. For the proof of sufficiency we note that

    AD* = PC*D* = C_1*P_1D* = QH*

Since P = QS it follows that SC*D* = H*. Now (P, C)_R = I_M implies (S, C)_R = I_M, and hence SC* = C_2*S_2 with (S_2, C_2)_L = I_N. Now the Hankel operator induced by H* is the zero operator. This implies that

    P_{H^2_{0,M}} C_2* P_{H^2_{0,N}} S_2D*f = 0    for all f ∈ H^2_L

and hence that the operator defined on H^2_{0,N} by g → P_{H^2_{0,M}}C_2*g has a nontrivial kernel. By an application of Theorem II 14-11 this would contradict the coprimeness relation (C_2, S_2)_L = I_N. Necessarily, therefore, Range H_{S_2D*} is trivial, which can occur only if S_2 is constant. Since P = QS, the determinants of P and Q differ at most by a constant of absolute value one.
As in the case of Lemma 5-8, this lemma too can be sharpened, yielding the following.
Lemma 5-12 Let A be strictly noncyclic in L^∞(B(N, M)), having the coprime factorization (5-17) on the unit circle, and assume Range H_A is closed. Let D ∈ H^∞(B(N, L)). Then a necessary and sufficient condition for the equality

    Range H_{AD*} = Range H_A        (5-29)

to hold is

    [P_1, D]_R = I_N        (5-30)
PROOF We begin by proving the necessity of (5-30). We saw already that (P_1, D)_R = I_N is necessary for Range H_{AD*} = Range H_A. In that case P_1D* = D_2*P_2 with P_2 inner and (P_2, D_2)_L = I_L. Since AD* = PC*D* = C_1*P_1D* = C_1*D_2*P_2, it follows that for Range H_{AD*} = H_0(P) it is necessary that [P_2, D_2C_1]_L = I_L holds. Hence the weaker condition [P_2, D_2]_L = I_L is also necessary, and this is equivalent to (5-30).

Conversely, we assume that (5-30) holds. Thus Range H_{P_1D*} = H_0(P_1); clearly also Range H_{P_1} = H_0(P_1). Now for f ∈ H^2_L

    H_{AD*}f = P_{H^2_{0,M}} AD*f = P_{H^2_{0,M}} C_1*P_1D*f = P_{H^2_{0,M}} C_1* P_{H^2_{0,N}} P_1D*f

and so

    Range H_{AD*} = {P_{H^2_{0,M}} C_1* g | g ∈ H_0(P_1)}
                 = {P_{H^2_{0,M}} C_1* P_{H^2_{0,N}} P_1 f | f ∈ H^2_N}
                 = {P_{H^2_{0,M}} C_1* P_1 f | f ∈ H^2_N} = Range H_A = H_0(P)

We can combine now the results of the previous lemmas to yield the following theorem.
Theorem 5-13 Let A ∈ L^∞(B(N, M)) and B ∈ L^∞(B(L, N)) be strictly noncyclic, having the coprime factorizations (5-17) and (5-18), respectively. Let T = AB have the coprime factorizations (5-21).
(a) A necessary and sufficient condition for

    det Q = det P · det R        (5-31)

to hold is

    (C, R)_L = I_N    and    (P_1, D_1)_R = I_N        (5-32)

(b) Assume H_A and H_B have closed range. Then H_{AB} has closed range and (5-31) is satisfied if and only if

    [C, R]_L = I_N    and    [P_1, D_1]_R = I_N        (5-33)
PROOF (a) The necessity of conditions (5-32) for (5-31) to hold follows along the lines of the proof of Lemma 5-8; in particular we always have det Q | det P·det R. So we assume (5-32) to hold and consider PC*R = C_1*P_1R. As (R, C)_L = I_N, we have (P_1R, C_1)_L = I_N, and thus the range closure of H_{PC*R} is H_0(Q') for some inner function Q' acting in M. For Q' we have, by Lemma 5-8, det Q' = det P·det R. Next we consider AB = (PC*R)D* = (C_1*P_1R)D*. By Lemma 5-11, equality (5-31) will hold if and only if (P_1R, D)_R = I_N. Since P_1RD* = P_1D_1*R_1, it follows from Lemma 5-8 that (P_1R, D)_R = I_N is equivalent to (P_1, D_1)_R = I_N and (R_1, D_1)_L = I_L, which proves the sufficiency part.

(b) By part (a), conditions (5-32) are already necessary for equality (5-31) to hold. If C*R = R_2C_2* with (R_2, C_2)_R = I_M, then AB = PC*RD* = PR_2C_2*D*. For H_{AB} to have closed range it is necessary therefore that [PR_2, DC_2]_R = I_M, and hence the weaker condition [R_2, C_2]_R = I_M is also necessary, this last condition
being equivalent to [R, C]_L = I_N. The necessity of [P_1, D_1]_R = I_N is proved analogously, using the representation AB = C_1*P_1D_1*R_1.

To prove the converse, let us assume the strong coprimeness conditions in (5-33) to hold. Then the closure of Range H_{AB} is H_0(Q), and (5-31) holds by part (a). It suffices therefore to prove the range closure of H_{AB}. By Lemma 5-9, H_{PC*R} has closed range. Now PC*R = C_1*P_1R; hence, by Lemma 5-12, [P_1R, C_1]_L = I_N holds. To prove the range closure of H_{AB} it suffices to show that [P_1R, D]_R = I_N. To see this we note that by our assumptions the range of H_{P_1D_1*} is closed, and since P_1RD* = P_1D_1*R_1, the assumption [R_1, D_1]_L = I_L yields, by another application of Lemma 5-9, the range closure of H_{P_1RD*}. Hence [P_1R, D]_R = I_N and H_{AB} has closed range.
We pass now to the analysis of Hankel operators induced by sums of strictly noncyclic functions. So we assume that A and B are two strictly noncyclic functions in L^∞(B(N, M)), having the respective factorizations (5-17) and (5-18) on the unit circle. We assume now the coprimeness relations

    (P, C)_R = I_M,    (P_1, C_1)_L = I_N        (5-34)

and

    (R, D)_R = I_M,    (R_1, D_1)_L = I_N        (5-35)

As A + B has a meromorphic extension of bounded type to D_e whenever both A and B have one, it is clearly strictly noncyclic. We assume that A + B factors as

    A + B = SH* = H_1*S_1        (5-36)

and that the conditions

    (S, H)_R = I_M,    (S_1, H_1)_L = I_N        (5-37)

are satisfied.

Theorem 5-14
(a) Let A, B be two strictly noncyclic functions in L^∞(B(N, M)) and assume the factorizations (5-17) and (5-18) hold, together with the coprimeness conditions (5-34) and (5-35). Let S be the inner function defined by (5-36) and (5-37). Then det S | det P·det R, and

    det S = det P · det R        (5-38)

if and only if

    (P, R)_L = I_M    and    (P_1, R_1)_R = I_N        (5-39)

(b) If (5-34) and (5-35) are replaced by

    [P, C]_R = I_M    and    [P_1, C_1]_L = I_N        (5-40)

and

    [R, D]_R = I_M    and    [R_1, D_1]_L = I_N        (5-41)

respectively, then Range H_{A+B} = H_0(S), and (5-38) holds if and only if

    [P, R]_L = I_M    and    [P_1, R_1]_R = I_N        (5-42)
PROOF Since M is finite dimensional and the subspaces PH^2_M and RH^2_M are of full range, so is their intersection, and therefore there exists an inner function Q for which

    QH^2_M = PH^2_M ∩ RH^2_M        (5-43)

Moreover, for some inner functions P' and R' we have

    Q = PR' = RP'        (5-44)

From the obvious relations

    Range H_{A+B} = Range (H_A + H_B) ⊂ Range H_A ∨ Range H_B = H_0(P) ∨ H_0(R) = H_0(Q)

together with Range H_{A+B} = H_0(S), we obtain the inclusion H_0(S) ⊂ H_0(Q) and the consequent factorization Q = SW of Q, with W also an inner function. This implies det S | det Q. Since by Corollary 5-2 det Q | det P·det R, the relation det S | det P·det R follows.

Now for equality (5-38) to hold it is necessary that det Q = det P·det R, which is equivalent, again by Corollary 5-2, to (P, R)_L = I_M. Thus (P, R)_L = I_M is a necessary condition for (5-38) to hold but generally not sufficient. We obtain another necessary condition, namely the coprimeness relation (P_1, R_1)_R = I_N, by considering Ã + B̃ in place of A + B. These two necessary coprimeness conditions of (5-39) turn out to be sufficient.

To this end we note that from B = D_1*R_1 it follows that Ker H_B = R_1*H^2_N. By restricting H_{A+B} to Ker H_B we obtain, for f ∈ H^2_N,

    H_{A+B}R_1*f = (H_A + H_B)R_1*f = H_AR_1*f = P_{H^2_{0,M}} AR_1*f = P_{H^2_{0,M}} C_1*P_1R_1*f = H_{C_1*P_1R_1*}f

or Range H_{C_1*P_1R_1*} = H_A(R_1*H^2_N). By Lemma 5-11 the condition (P_1, R_1)_R = I_N implies H_0(P) = Range H_{C_1*P_1R_1*}, and so H_0(P) ⊂ Range H_{A+B}; analogously H_0(R) ⊂ Range H_{A+B}. Hence H_0(P) ∨ H_0(R) ⊂ Range H_{A+B}. Since the inverse inclusion always holds, we must have the equality H_0(P) ∨ H_0(R) = Range H_{A+B} = H_0(S). The coprimeness condition (P, R)_L = I_M now implies that H_0(Q) = H_0(P) ∨ H_0(R) and det Q = det P·det R, and hence equality (5-38).

To prove part (b) we assume H_A and H_B to have closed ranges. For (5-38) to hold, conditions (5-39) are necessary by part (a) and imply Range H_{A+B} = H_0(Q). From (5-44) together with
A + B = PC* + RD* = QH*
we obtain

    H = CR' + DP'        (5-45)

For H_{A+B} to have closed range it is necessary that [Q, CR' + DP']_R = I_M holds. This implies the necessity of [R', P']_R = I_M, which is equivalent to [P, R]_L = I_M. The necessity of [P_1, R_1]_R = I_N follows by duality considerations.

Conversely, we assume the strong coprimeness conditions (5-42). By our assumptions Range H_A = H_0(P) and Range H_B = H_0(R), and by Theorem 5-5 [P, R]_L = I_M implies that the angle between H_0(P) and H_0(R) is positive; thus we obtain H_0(Q) = H_0(P) + H_0(R). To complete the proof it suffices to show that H_0(P) and H_0(R) are both included in Range H_{A+B}. As in part (a), H_{A+B}(Ker H_B) = H_A(R_1*H^2_N) = Range H_{AR_1*}. We can apply Lemma 5-12 to see that Range H_{AR_1*} = Range H_A = H_0(P). By symmetry also H_0(R) = Range H_B ⊂ Range H_{A+B}, and the proof is complete.
We remark that part of the content of Theorem 5-13 and Theorem 5-14 is the proof that the degree function A defined by (5-14) indeed satisfies relations (5-16). Moreover equalities in (5-16) are dependent on coprimeness conditions. The degree theory development so far is closely related to the study of systems connected in series and in parallel and we shall delve into this in more detail. Let E, = (A1, B1, C,, D,) and E2 = (A2, B2, C2, D2) be two systems which realize the transfer functions T, and T2, respectively. The series connection of E1 and E2, denoted by E,E2 is obtained, assuming the obvious compatibility conditions that the output space of E1 coincides with the integral space of E2, by feeding the output of E, into E2. The dynamic equations are
x(1)_{n+1} = A1 x(1)_n + B1 u_n
y(1)_n = C1 x(1)_n + D1 u_n        (5-46)

and

x(2)_{n+1} = A2 x(2)_n + B2 y(1)_n
y_n = C2 x(2)_n + D2 y(1)_n        (5-47)

In matrix form this becomes

[x(1)_{n+1}; x(2)_{n+1}] = [A1, 0; B2 C1, A2] [x(1)_n; x(2)_n] + [B1; B2 D1] u_n
y_n = [D2 C1, C2] [x(1)_n; x(2)_n] + D2 D1 u_n        (5-48)

in other words

E2E1 = ([A1, 0; B2 C1, A2], [B1; B2 D1], [D2 C1, C2], D2 D1)        (5-49)
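In the finite dimensional case the block formulas above are directly computable. The following sketch (an illustration, not from the text; the matrix sizes are arbitrary, and the transfer function convention T(z) = D + zC(I − zA)⁻¹B is the one used in the proof of Theorem 5-15) builds the series connection per (5-49) and checks that its transfer function is the product T2T1:

```python
import numpy as np

def transfer(A, B, C, D, z):
    """Transfer function T(z) = D + z C (I - zA)^{-1} B of (A, B, C, D)."""
    n = A.shape[0]
    return D + z * C @ np.linalg.solve(np.eye(n) - z * A, B)

def series(s1, s2):
    """Series connection E2E1 per (5-49): the output of s1 feeds s2."""
    A1, B1, C1, D1 = s1
    A2, B2, C2, D2 = s2
    A = np.block([[A1, np.zeros((A1.shape[0], A2.shape[0]))],
                  [B2 @ C1, A2]])
    B = np.vstack([B1, B2 @ D1])
    C = np.hstack([D2 @ C1, C2])
    return A, B, C, D2 @ D1

rng = np.random.default_rng(0)
s1 = (0.3 * rng.standard_normal((3, 3)), rng.standard_normal((3, 2)),
      rng.standard_normal((2, 3)), rng.standard_normal((2, 2)))
s2 = (0.3 * rng.standard_normal((2, 2)), rng.standard_normal((2, 2)),
      rng.standard_normal((1, 2)), rng.standard_normal((1, 2)))
z = 0.4
T_series = transfer(*series(s1, s2), z)
T_prod = transfer(*s2, z) @ transfer(*s1, z)
assert np.allclose(T_series, T_prod)
```

The assertion is exactly Theorem 5-15(a) evaluated at one point of the disc.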
In the same way, given two systems E1 = (A1, B1, C1, D1) and E2 = (A2, B2, C2, D2) having the same input and output spaces, we define the parallel connection of E1 and E2, denoted by E1 + E2, by feeding the same input to both systems and combining their outputs. The dynamic equations are

x(1)_{n+1} = A1 x(1)_n + B1 u_n
y(1)_n = C1 x(1)_n + D1 u_n        (5-50)

x(2)_{n+1} = A2 x(2)_n + B2 u_n
y(2)_n = C2 x(2)_n + D2 u_n        (5-51)

and

y_n = y(1)_n + y(2)_n        (5-52)
In matrix form this becomes

[x(1)_{n+1}; x(2)_{n+1}] = [A1, 0; 0, A2] [x(1)_n; x(2)_n] + [B1; B2] u_n
y_n = [C1, C2] [x(1)_n; x(2)_n] + (D1 + D2) u_n        (5-53)

and hence

E1 + E2 = ([A1, 0; 0, A2], [B1; B2], [C1, C2], D1 + D2)        (5-54)
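As with the series connection, (5-54) is easy to check numerically; the sketch below (illustrative only, with arbitrary matrix sizes) verifies that the parallel connection's transfer function is the sum of the two transfer functions:

```python
import numpy as np

def transfer(A, B, C, D, z):
    return D + z * C @ np.linalg.solve(np.eye(A.shape[0]) - z * A, B)

def parallel(s1, s2):
    """Parallel connection E1 + E2 per (5-54): same input, outputs added."""
    A1, B1, C1, D1 = s1
    A2, B2, C2, D2 = s2
    A = np.block([[A1, np.zeros((A1.shape[0], A2.shape[0]))],
                  [np.zeros((A2.shape[0], A1.shape[0])), A2]])
    return A, np.vstack([B1, B2]), np.hstack([C1, C2]), D1 + D2

rng = np.random.default_rng(1)
s1 = (0.3 * rng.standard_normal((3, 3)), rng.standard_normal((3, 2)),
      rng.standard_normal((2, 3)), rng.standard_normal((2, 2)))
s2 = (0.3 * rng.standard_normal((4, 4)), rng.standard_normal((4, 2)),
      rng.standard_normal((2, 4)), rng.standard_normal((2, 2)))
z = 0.4
T_par = transfer(*parallel(s1, s2), z)
T_sum = transfer(*s1, z) + transfer(*s2, z)
assert np.allclose(T_par, T_sum)
```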
Theorem 5-15
(a) The transfer function of the series connection E2E1 of the systems E1 and E2 is the product T2T1 of their respective transfer functions.
(b) The transfer function of the parallel connection E1 + E2 of the systems E1 and E2 is the sum T1 + T2 of their respective transfer functions.

PROOF (a) The transfer function of the series connection of E1 and E2 is given by
T(z) = D2D1 + z [D2 C1, C2] [I − zA1, 0; −zB2 C1, I − zA2]⁻¹ [B1; B2 D1]

Now

[I − zA1, 0; −zB2 C1, I − zA2]⁻¹ = [(I − zA1)⁻¹, 0; z(I − zA2)⁻¹ B2 C1 (I − zA1)⁻¹, (I − zA2)⁻¹]
and the result follows. Part (b) is proved by a similar, even simpler, computation.

We pass now to the study of the series connection of shift realizations.

Theorem 5-16 Let L, M, and N be finite dimensional Hilbert spaces and let A ∈ H∞(B(M, N)) and B ∈ H∞(B(L, M)) be two strictly noncyclic functions having the factorizations
A = PC* = C1* P1        (5-55)

and

B = RD* = D1* R1        (5-56)

and satisfying the coprimeness conditions

(P, C)R = IN,   (P1, C1)L = IM        (5-57)

and

(R, D)R = IM,   (R1, D1)L = IL        (5-58)
respectively. Then the following statements hold.
(a) The series connection EAEB of the shift realizations EA and EB of A and B, respectively, is observable if and only if (R, C)L = IM holds, and exactly observable if and only if [R, C]L = IM.
(b) The series connection E'AE'B of the *-shift realizations of A and B is reachable if and only if (P1, D1)R = IM, and exactly reachable if and only if [P1, D1]R = IM.
(c) A sufficient condition for the reachability of EAEB is (P1, D1)R = IM. If EA and EB are both exactly reachable then EAEB is exactly reachable if and only if [P1, D1]R = IM.
(d) A sufficient condition for the observability of E'AE'B is (R, C)L = IM. If E'A and E'B are both exactly observable then E'AE'B is exactly observable if and only if [R, C]L = IM.
PROOF The shift realization of A is given, omitting the constant term for simplicity, by EA = (S0(P)*, MA, γ0(P)*) in the state space H0(P), where S0(P) is given by (3-1), MA: M → H0(N) is defined by

MA ξ = P_{H0(N)} A ξ        for ξ ∈ M        (5-59)

and γ0(P): N → H0(P) is defined by

γ0(P) η = P_{H0(P)} χη        (5-60)

where (χη)(z) = zη; this is equivalent to (γ0(P)η)(z) = z(I − P(z)P(0)*)η. Similar formulas can be derived for EB. Note that γ0(P)* is given by

γ0(P)* f = f'(0)        for f ∈ H0(P)        (5-61)
Also, using the transformation τ_P: H0(P) → H0(P), we have

τ_P γ0(P) = Γ0(P)        (5-62)

where Γ0(P): N → H0(P) is defined by

Γ0(P) η = (P − P(0)) η        (5-63)
From the preceding representations of the shift realizations of A and B we see that the series connection EAEB of EA and EB has H0(R) ⊕ H0(P) as state space, is given by the system

([S0(R)*, 0; MA γ0(R)*, S0(P)*], [MB; 0], [0, γ0(P)*])        (5-64)

and has AB as its transfer function.
Assume first that (R, C)L = IM. By Theorem 3-5, there exist an inner function R' acting in N and C' ∈ H∞(B(N, M)) such that C*R = R'C'*, (R', C')R = IN, and det R = det R'. From the factorizations (5-55) and (5-56) we obtain AB = PC*RD* = PR'C'*D*, and this factorization of AB enables us to write down explicitly the shift realization EAB of AB. It has H0(PR') as state space and is given by (S0(PR')*, MAB, γ0(PR')*). This realization is clearly exactly observable but not necessarily reachable. Reachability is equivalent to H0(PR') = Range H_AB, and this is equivalent to the coprimeness condition (P1, D1)R = IM. To prove our theorem we have to study EAB in more detail. By the vector-valued version of Lemma II 15-12, H0(PR') has a direct sum representation
H0(PR') = H0(P) ⊕ P H0(R')        (5-65)

Hence there exists a unitary map of H0(PR') onto H0(R') ⊕ H0(P) given by f = g + Ph → (h, g), where g + Ph is the unique decomposition of f ∈ H0(PR') relative to the direct sum (5-65). From this representation of f ∈ H0(PR') we have, using the fact that (S* f)(z) = f(z)/z − f'(0), and recalling that h(0) = 0 for h ∈ H0(N),

f(z)/z − f'(0) = g(z)/z − g'(0) + P(z) h(z)/z − P(0) h'(0)
             = g(z)/z − g'(0) + P(z)(h(z)/z − h'(0)) + (P(z) − P(0)) h'(0)

and hence

S0(PR')* f = S0(P)* g + P S0(R')* h + Γ0(P) γ0(R')* h        (5-66)
Next, from f = g + Ph it follows, again using the fact that h(0) = 0, that f'(0) = g'(0) + P(0) h'(0), which implies that

γ0(PR')* f = γ0(P)* g + P(0) γ0(R')* h        (5-67)

Finally, for ξ ∈ L let MAB ξ = ABξ = g + Ph with g ∈ H0(P) and h ∈ H0(R');
then we have

h = P_{H0(R')} C'* MB ξ        (5-68)

and

g = MAB ξ − P · P_{H0(R')} C'* MB ξ        (5-69)

Thus with respect to the direct sum H0(R') ⊕ H0(P) the shift realization EAB of AB is given by

([S0(R')*, 0; Γ0(P) γ0(R')*, S0(P)*], [P_{H0(R')} C'* MB; MAB − P · P_{H0(R')} C'* MB], [P(0) γ0(R')*, γ0(P)*])        (5-70)
As our next step we construct a map X: H0(R) ⊕ H0(P) → H0(R') ⊕ H0(P) which intertwines EAEB and EAB. A comparison of the state generators of the two systems (5-64) and (5-70), which are both lower triangular, indicates that we should look for an intertwining operator X of the form

X = [W, 0; Z, I]        (5-71)

where W: H0(R) → H0(R') and Z: H0(R) → H0(P) are bounded. For the generator W the natural candidate is the quasi-invertible operator that intertwines the shift and the *-shift realizations of the analytic part of C*R = R'C'*. These two realizations are given by

(S0(R)*, Γ0(R), MC)        (5-72)

and

(S0(R')*, MC', γ0(R')*)        (5-73)
respectively. From the commutativity of the diagram

                    M
             MC ↙       ↘ MC'
        H0(R) ----W----> H0(R')
          |                 |
       S0(R)*            S0(R')*
          ↓                 ↓
        H0(R) ----W----> H0(R')        (5-74)
we have that

W Γ0(R) ξ = P_{H0(R')} C'* R ξ        for ξ ∈ M

and since W S0(R)* = S0(R')* W this implies that for each f ∈ H0(R)

W f = P_{H0(R')} C'* f = P_{H2(N)} C'* f        (5-75)

and we take (5-75) to be the definition of W. The coprimeness conditions (R, C)L = IM and (R', C')R = IN guarantee that W is a quasi-invertible operator. Now X given by (5-71) intertwines EAEB and EAB if and only if the following relations hold:

W S0(R)* = S0(R')* W        (5-76)

Z S0(R)* + MA γ0(R)* = Γ0(P) γ0(R')* W + S0(P)* Z        (5-77)

P(0) γ0(R')* W + γ0(P)* Z = 0        (5-78)

W MB = P_{H0(R')} C'* MB        (5-79)

and

Z MB = MAB − P · P_{H0(R')} C'* MB        (5-80)
Equalities (5-76) and (5-79) follow directly from the definition of W given by (5-75). We define Z: H0(R) → H0(P) by

Z f = P_{H0(P)} P C* f        for f ∈ H0(R)        (5-81)
Now (5-81) immediately implies (5-80). Also, for f ∈ H0(R) we have

P(0) γ0(R')* W f + γ0(P)* Z f = (P · P_{H0(R')} C'* f + P_{H0(P)} P C* f)'(0) = (PC* f)'(0) = 0

since both PC* and f vanish at zero; this proves (5-78). Finally, in order to prove (5-77), we note that for f ∈ H0(R)

(Z S0(R)* − S0(P)* Z) f = P_{H0(P)} P C* P_{H0(M)} S* f − P_{H0(P)} S* P_{H0(P)} P C* f        (5-82)
Take the decomposition of PC*f, f ∈ H0(R), relative to H0(PR') = H0(P) ⊕ PH0(R'). Thus PC*f = g + Ph with g ∈ H0(P) and h ∈ H0(R'). Clearly we have g = P_{H0(P)} PC* f and h = P_{H0(R')} C'* f. This implies the equality

P_{H0(P)} S* P C* f = P_{H0(P)} S* P_{H0(P)} P C* f + P_{H0(P)} S* P · P_{H0(R')} C'* f        (5-83)

Substituting (5-83) back into (5-82) yields

(Z S0(R)* − S0(P)* Z) f = P_{H0(P)} P C* P_{H0(M)} S* f − P_{H0(P)} S* P C* f + P_{H0(P)} S* P · P_{H0(R')} C'* f        (5-84)
If we apply equality (5-66) to the element P W f = P · P_{H0(R')} C'* f we obtain

P_{H0(P)} S* P W f = P_{H0(P)} {P S0(R')* W f + Γ0(P) γ0(R')* W f} = Γ0(P) γ0(R')* W f

Also, for the first two terms on the right-hand side of (5-84) we have

P_{H0(P)} P C* P_{H0(M)} S* f − P_{H0(P)} S* P C* f = P_{H0(P)} S* P C* (f − χ f'(0)) − P_{H0(P)} S* P C* f
   = −P_{H0(P)} P C* f'(0)
   = −P_{H0(P)} P C* γ0(R)* f = −MA γ0(R)* f

which proves (5-77).
Since X is quasi-invertible if and only if W is, our assumption of the coprimeness relation (R, C)L = IM implies the quasi-invertibility of X. Since the shift realization EAB of AB is exactly observable, it follows by Theorem 1-7 (c) that EAEB is observable. Replacing (R, C)L = IM by the stronger coprimeness relation [R, C]L = IM guarantees that W, and hence X, is boundedly invertible, and so, again by Theorem 1-7 (c), the exact observability of EAB implies that of the series connection EAEB. Conversely, if we assume the observability of EAEB then X as defined by (5-71), (5-75), and (5-81) is a map that intertwines EAEB and EAB. By Lemma 1-6 (b), X is the only intertwining map and, by Theorem 1-7 (d), it is injective. Thus the injectivity of W follows and hence (R, C)L = IM has to be satisfied. Similarly, if EAEB is assumed exactly observable then X becomes boundedly invertible, which in turn implies [R, C]L = IM. Thus part (a) of the theorem is proved, and part (b) follows directly by duality considerations. Indeed, let E'A and E'B be the *-shift realizations of A and B, which are unitarily equivalent to the adjoints of the corresponding shift realizations. Thus the series connection E'AE'B is unitarily equivalent to (EBEA)*, the adjoint system to the series connection of the shift realizations of B and A. The map X intertwining EAEB and EAB has its counterpart now in a map X' that intertwines E'AB and E'AE'B. Also note that reachability properties of the system E'AE'B are equivalent to observability properties of EBEA. Since there exist quasi-invertible maps that intertwine E'A and EA, E'B and EB, respectively, their direct sum, denoted by Ξ, is a map that intertwines E'AE'B and EAEB. Thus we obtain the chain of intertwining maps

E'AB --X'--> E'AE'B --Ξ--> EAEB --X--> EAB        (5-85)

The transformation X has dense range by construction and is injective if and only if (R, C)L = IM. X' is injective by construction and has dense range if (P1, D1)R = IM. To obtain bounded invertibility of X and X' the coprimeness conditions have to be replaced by strong coprimeness conditions. The map Ξ is boundedly invertible if and only if the maps intertwining the *-shift and shift realizations of A and B, respectively, are actually boundedly invertible, and this is equivalent to the exact reachability and exact observability of the four systems EA, EB, E'A, and E'B. In turn this is equivalent to the strong coprimeness conditions [P, C]R = IN, [P1, C1]L = IM, [R, D]R = IM, and [R1, D1]L = IL.
We add the obvious remark that in the case of rational functions the map Ξ is always boundedly invertible, and hence this theorem provides a complete analysis of the series connection in the finite dimensional case.
Next we analyse the parallel connection of the shift realizations of two strictly noncyclic functions. The analysis is simpler as the parallel coupling of shift systems is also a shift system.
We begin by proving a simple lemma concerning inner functions.
Lemma 5-17 Given a Hilbert space M, an inner function Q acting in M ⊕ M is a left inner factor of [IM; IM] if and only if it has, up to a constant right unitary factor, the form

Q = (1/2) [IM + S, IM − S; IM − S, IM + S]        (5-86)

for some inner function S acting in M.
PROOF Let S be an inner function acting in M. Then Q defined by (5-86) is also inner, and since

(1/2) [IM + S, IM − S; IM − S, IM + S] [IM; IM] = [IM; IM]        (5-87)

it is a left inner factor of [IM; IM].
Conversely, we consider the constant unitary operator U in M ⊕ M defined by

U = (1/√2) [IM, IM; IM, −IM]        (5-88)

which extends naturally to a unitary operator in H2(M ⊕ M). Obviously we have

U [IM; IM] = √2 [IM; 0]        (5-89)
Thus an inner function Q acting in M ⊕ M is a left inner factor of [IM; IM] if and only if UQ is a left inner factor of [IM; 0]. However, left inner factors of [IM; 0] are those associated with full range right invariant subspaces of H2(M ⊕ M) which contain H2(M) ⊕ {0}. These subspaces are clearly of the form H2(M) ⊕ SH2(M) for some inner function S acting in M. Thus the corresponding inner functions are of the form [IM, 0; 0, S], and hence

Q = U* [IM, 0; 0, S]

Since Q is unique up to a right constant unitary factor, right multiplication by U gives

Q U = U* [IM, 0; 0, S] U = (1/2) [IM + S, IM − S; IM − S, IM + S]

and we obtain the representation (5-86).

Lemma 5-18 Given inner functions P and R acting in M, the following two statements are true.
(a) The coprimeness conditions
([P, 0; 0, R], [IM; IM])L = I_{M⊕M}        (5-90)

and

(P, R)L = IM        (5-91)

are equivalent.
(b) The strong coprimeness conditions

[[P, 0; 0, R], [IM; IM]]L = I_{M⊕M}        (5-92)

and

[P, R]L = IM        (5-93)

are equivalent.

PROOF
(a) Let S be a common left inner factor of P and R, so that P = SP1 and R = SR1. Since

[P, 0; 0, R] = (1/2)[IM + S, IM − S; IM − S, IM + S] · (1/2)[IM + S, S − IM; S − IM, IM + S] · [P1, 0; 0, R1] = [SP1, 0; 0, SR1]

it follows, together with (5-87), that Q defined by (5-86) is a nontrivial common left inner factor of [P, 0; 0, R] and [IM; IM].
Conversely, if we assume that [P, 0; 0, R] and [IM; IM] have a common left inner factor, then by Lemma 5-17 it is necessarily of the form (5-86). Thus

[P, 0; 0, R] = (1/2)[IM + S, IM − S; IM − S, IM + S] [A, B; C, D]

from which we obtain the relations P = S(A − C) and R = S(D − B), which taken together show that S is a common left inner factor of P and R.
(b) The key to the proof is Theorem II 14-10. If [P, R]L = IM holds then there exist P1 and R1 in H∞(B(M, M)) for which PP1 + RR1 = IM. This implies in turn that

[P, 0; 0, R] [P1, −P1; −R1, R1] + [IM; IM] [RR1, PP1] = I_{M⊕M}

which shows that the coprimeness relation (5-92) holds.
Conversely, assume (5-92) holds. Then there exist bounded analytic functions [A, B; C, D] and [E, F] for which

[P, 0; 0, R] [A, B; C, D] + [IM; IM] [E, F] = I_{M⊕M}

This implies in particular that PA − RC = IM, and applying Theorem II 14-10 again we have (5-93).
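In the scalar case the 2×2 inner functions of Lemma 5-17 can be checked numerically. The sketch below (an illustration, not from the text) takes the simplest nonconstant inner function S(z) = z and verifies that Q(z) = ½[[1+S, 1−S], [1−S, 1+S]] is unitary on the unit circle and, as in (5-87), maps the column (1, 1)ᵀ to itself:

```python
import numpy as np

def Q(z, S):
    """The inner function (5-86) for a scalar inner function S."""
    s = S(z)
    return 0.5 * np.array([[1 + s, 1 - s],
                           [1 - s, 1 + s]])

S = lambda z: z                      # the simplest nonconstant inner function
ok_unitary = True
ok_factor = True
for theta in np.linspace(0.0, 2 * np.pi, 50):
    z = np.exp(1j * theta)           # a point on the unit circle
    Qz = Q(z, S)
    ok_unitary &= np.allclose(Qz @ Qz.conj().T, np.eye(2))  # unitary boundary values
    ok_factor &= np.allclose(Qz @ np.ones(2), np.ones(2))   # Q (1;1) = (1;1), cf. (5-87)

assert ok_unitary and ok_factor
```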
Theorem 5-19 Let A and B be strictly noncyclic functions in H∞(B(N, M)) having the coprime factorizations (5-55) and (5-56), respectively. Then
(a) The parallel connection EA + EB of the shift realizations EA and EB is observable if and only if

(P, R)L = IM        (5-94)

and is exactly observable if and only if

[P, R]L = IM        (5-95)

(b) The parallel connection E'A + E'B of the *-shift realizations E'A and E'B is reachable if and only if

(P1, R1)R = IM        (5-96)

and exactly reachable if and only if

[P1, R1]R = IM        (5-97)
(c) A sufficient condition for the reachability of EA + EB is (5-96). If EA and EB are exactly reachable then (5-96) is also necessary.
PROOF The parallel connection EA + EB of EA and EB has state space H0(P) ⊕ H0(R) and is given by

([S0(P)*, 0; 0, S0(R)*], [MA; MB], [γ0(P)*, γ0(R)*])        (5-98)

Thus EA + EB is observable and exactly observable if and only if (5-90) and (5-92) hold, respectively. But these coprimeness conditions are equivalent to (5-91) and (5-93), respectively, by Lemma 5-18, which proves (a). Part (b) follows by duality considerations. To prove (c), let XA be the map that intertwines the *-shift and shift realizations of A; similarly we define XB. XA and XB are quasi-invertible maps, and hence so is XA ⊕ XB, which is a map from H0(P1) ⊕ H0(R1) into H0(P) ⊕ H0(R) that intertwines E'A + E'B and EA + EB. Thus the reachability of E'A + E'B implies that of EA + EB. If both systems EA and EB are exactly reachable then XA and XB are boundedly invertible and so is XA ⊕ XB. In this case E'A + E'B is reachable or exactly reachable if and only if EA + EB has these properties.
6. CONTINUOUS TIME SYSTEMS

The study of infinite dimensional continuous time systems presents some difficulties which are absent from the discrete time case. Probably the greatest one is that of deciding how large a class of systems one wants to study. Thus while we want to develop a theory of systems whose internal representations are of the form
ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)        (6-1)
it is far from obvious what restrictions one wants to impose on the operators involved. As we shall see a strict interpretation of (6-1) limits the input/output relations realizable by such systems and hence in order to obtain a theory that would encompass more general input/output relations one would have to relax the restrictions on the operators, A, B, C, D. The central theme of this section is the discussion of the realization problem for continuous time systems and to this end we want to use the continuous time analogue of the restricted shift realization, namely, a realization that utilizes the left translation semigroup. This is a natural approach both for its similarity to the discrete time methods and for the universal properties of the left translation semigroup as given by Theorem II 10-18. Let us study equations (6-1) for a moment. As usual we consider finite input/
finite output systems, which means that the spaces U and Y are assumed to be finite dimensional. The state x takes its values in a Hilbert space X. For (6-1) to make sense strongly one assumes A to be the infinitesimal generator of a strongly continuous semigroup, which will be denoted by e^{At}, that u is a continuous function, and that the range of B is included in the domain of A. Moreover one assumes that the domain of C, DC, includes the domain of A, DA, and that the restriction of C to DA is continuous. Under these assumptions, given an initial state x0 ∈ DA, a solution of (6-1) exists and is given by the variation of parameters formula
x(t) = e^{At} x0 + ∫₀ᵗ e^{A(t−τ)} B u(τ) dτ        (6-2)

which yields

y(t) = C e^{At} x0 + ∫₀ᵗ C e^{A(t−τ)} B u(τ) dτ        (6-3)
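In the matrix case, formula (6-2) can be checked directly. The sketch below is an illustration only: it takes a diagonalizable random A and a constant input u, so the convolution integral has the closed form A⁻¹(e^{At} − I)Bu, and compares the result with a Runge–Kutta integration of (6-1):

```python
import numpy as np

def expm(M):
    """Matrix exponential via eigendecomposition (assumes M diagonalizable)."""
    w, V = np.linalg.eig(M)
    return (V * np.exp(w)) @ np.linalg.inv(V)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) - 2 * np.eye(3)   # shifted to be comfortably stable
B = rng.standard_normal((3, 2))
x0 = rng.standard_normal(3)
u = np.array([1.0, -0.5])                          # constant input
t = 1.5

# variation of parameters (6-2); for constant u the convolution integral
# equals A^{-1} (e^{At} - I) B u
x_vp = expm(A * t) @ x0 + np.linalg.solve(A, (expm(A * t) - np.eye(3)) @ B @ u)

# classical RK4 integration of x' = Ax + Bu for comparison
x, h = x0.astype(complex), t / 2000
f = lambda v: A @ v + B @ u
for _ in range(2000):
    k1 = f(x); k2 = f(x + h/2*k1); k3 = f(x + h/2*k2); k4 = f(x + h*k3)
    x = x + h/6 * (k1 + 2*k2 + 2*k3 + k4)

assert np.allclose(x_vp, x, atol=1e-6)
```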
The function C e^{At} B is the weighting pattern of the system. Reversing our steps, we may start with a weighting pattern, which we need not restrict to be a function but may allow to take distributional values, and study input/output relations of the form

y(t) = ∫₀ᵗ W(t − τ) u(τ) dτ        (6-4)

Under assumptions, which will be made precise later, the Fourier transform of (6-4) can be taken, which yields

ℱy = (ℱW)(ℱu)        (6-5)
The Fourier transform of W will be called the transfer function of the system. In the realization problem we seek, given W or its Fourier transform, a system (A, B, C, D) within a prescribed class whose weighting pattern coincides with W. By weakening our concept of solution we can enlarge the class of systems under consideration. Thus an X-valued function x(t) will be called a weak solution if (6-2) is satisfied. Thus x0 need not be restricted to DA and the function u need not be continuous; being locally L2 is sufficient. Furthermore we do not assume that the range of B is included in DA. What this amounts to is the interpretation of (6-1) in the weak or distributional sense.
Suppose that we assume both operators B: U → X and C: X → Y to be bounded linear operators. Then, assuming x0 = 0, the weighting pattern W(t) = C e^{At} B is a continuous B(U, Y)-valued function. Requiring a weighting pattern to be continuous is a severe restriction on the theory which will exclude most interesting physical systems. Also, the boundedness requirements on B and C exclude the cases where either the controls or the observations are applied at the boundary. Thus we are faced with the need to relax our assumptions on the operators that constitute a system. We will call a system (A, B, C, D) which has W(t) as its weighting pattern a regular realization of W if B and C are bounded linear maps. One way to relax
the conditions on B and C is the introduction of balanced realizations. A realization (A, B, C, D) is called balanced if B: U → X is bounded with Range B ⊂ DA, and C is a closed linear operator for which DC ⊃ DA and C restricted to DA is continuous with respect to the graph topology of DA, that is, the topology induced by the norm ||x||_A = {||x||² + ||Ax||²}^{1/2}. An equivalent way of stating this is to say that C is A-bounded, that is,

||Cx|| ≤ K(||x||² + ||Ax||²)^{1/2}        for all x ∈ DA        (6-6)
Theorem 6-1 A weighting pattern W(t) has a balanced realization if and only if it has a regular one.
PROOF Suppose (A, B, C) is a regular realization of W(t). Define a system (F, G, H) by

F = A,   G = (λI − A)⁻¹ B,   H = C(λI − A)        (6-7)

where we take λ in ρ(A). Clearly the range of G is in DA. Moreover, since DC ⊃ DA and C is continuous on DA with respect to the graph topology of DA, we have DH ⊃ DA and H is continuous on DA. Conversely, suppose (F, G, H) is a balanced realization of W(t). Define (A, B, C) by

A = F,   B = (λI − F) G,   C = H(λI − F)⁻¹        (6-8)

As (λI − F) is closed, it follows that B is an everywhere defined closed transformation. The closed graph theorem guarantees that B is bounded. Similarly, (λI − F)⁻¹ is a continuous map into DF and, as H is F-bounded, the continuity of C follows. This is equivalent to boundedness.
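In the matrix case the passage (6-7) leaves the weighting pattern untouched, because e^{At} commutes with (λI − A). A quick numerical check (illustrative only; λ is an arbitrary point of the resolvent set):

```python
import numpy as np

def expm(M):
    """Matrix exponential via eigendecomposition (assumes M diagonalizable)."""
    w, V = np.linalg.eig(M)
    return (V * np.exp(w)) @ np.linalg.inv(V)

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 4))
lam = 10.0                                   # comfortably outside the spectrum

# (6-7): F = A, G = (lam I - A)^{-1} B, H = C (lam I - A)
F = A
G = np.linalg.solve(lam * np.eye(4) - A, B)
H = C @ (lam * np.eye(4) - A)

ok = True
for t in (0.3, 1.0):
    W_regular = C @ expm(A * t) @ B          # weighting pattern of (A, B, C)
    W_balanced = H @ expm(F * t) @ G         # weighting pattern of (F, G, H)
    ok &= np.allclose(W_regular, W_balanced)
assert ok
```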
It may be of interest to probe a bit further as to the limits of applicability of regular, or alternatively balanced, realizations. For simplicity we take up the case of scalar-valued weighting patterns. A function f defined on (0, ∞) is said to be of exponential order if there exist positive constants M and ω for which |f(t)| ≤ Me^{ωt}. Obviously, if f is measurable and of exponential order then e^{−σt} f(t) is in L2(0, ∞) for large enough σ.
Theorem 6-2 A necessary condition for w(t) to have a balanced realization is that w be continuous and of exponential order. A sufficient condition is that w be absolutely continuous and ẇ be of exponential order (in the sense that |ẇ(t)| ≤ Me^{ωt} a.e.).

PROOF If w has a balanced realization then it has also a regular one, (A, B, C). Since w(t) = C e^{At} B it follows that w is continuous. Also, since ||e^{At}|| ≤ Me^{βt} for every strongly continuous semigroup, we have |w(t)| ≤ ||C|| ||B|| Me^{βt} and w is of exponential order.
To prove the sufficiency part we note that if (A, B, C) is a realization of w(t) then (A − λI, B, C) is a realization of e^{−λt}w(t). Thus without loss of generality we may assume that w and ẇ belong to L2(0, ∞). Let V(t) be the right translation semigroup in L2(0, ∞) and V(t)* the left translation semigroup. Let A be the infinitesimal generator of V(t)*; as was shown in Sec. II-10 we have DA = {f ∈ L2(0, ∞) | f absolutely continuous and f' ∈ L2(0, ∞)} and Af = f' for f ∈ DA. Let us define B: C → DA by Bα = αw. Finally, let C be the linear functional defined by Cf = f(0). Certainly C is defined on DA. Moreover we have, by integration by parts, that

|f(0)|² = −∫₀^∞ (|f(x)|²)' dx = −2 Re ∫₀^∞ f(x)‾ f'(x) dx

Applying Schwarz's inequality we have

|f(0)|² ≤ 2||f|| ||f'|| ≤ ||f||² + ||f'||² = ||f||² + ||Af||²

Thus the map C is A-bounded. Now

C V(t)* Bα = α C(V(t)* w) = α w(t)

which shows that (A, B, C) is a balanced realization of w.
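The boundary inequality used above, |f(0)|² ≤ 2‖f‖‖f'‖ ≤ ‖f‖² + ‖f'‖², can be tested numerically. The sketch below is an illustration with one concrete f in the domain of the generator, evaluating the L²(0, ∞) norms by a simple Riemann sum:

```python
import numpy as np

# a sample function with f, f' in L^2(0, oo)
f  = lambda x: (1 + 3 * x) * np.exp(-2 * x)
fp = lambda x: (1 - 6 * x) * np.exp(-2 * x)   # its derivative

x = np.linspace(0.0, 30.0, 300001)            # [0, 30] captures the decaying tail
dx = x[1] - x[0]
norm_f  = np.sqrt(np.sum(f(x) ** 2) * dx)
norm_fp = np.sqrt(np.sum(fp(x) ** 2) * dx)

# |f(0)|^2 <= 2 ||f|| ||f'|| <= ||f||^2 + ||f'||^2  (small slack for quadrature error)
assert f(0.0) ** 2 <= 2 * norm_f * norm_fp + 1e-6
assert 2 * norm_f * norm_fp <= norm_f ** 2 + norm_fp ** 2 + 1e-12
```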
The construction of the balanced realization in the previous proof contains the central idea of most approaches to the realization problem. We want a state space model that uses the left translation semigroup; B should embed the weighting pattern in some vectorial L2(0, ∞) space, and C should act as evaluation at zero. Of course, one could just as well approach the realization problem in the frequency domain. To this end let Γ+ denote the open right half plane {λ | Re λ > 0} and let H2(Γ+; N) be the corresponding Hardy space. By the results of Sec. II-12 there exists a unitary map J: H2_N → H2(Γ+; N) given by

(Jf)(w) = (1/√π) (1/(w + 1)) f((w − 1)/(w + 1))        (6-9)
Also, by the Paley–Wiener theorem the Fourier transform

(ℱf)(w) = (1/√(2π)) ∫₀^∞ f(t) e^{−wt} dt        (6-10)

is a unitary map of L2(0, ∞; N) onto H2(Γ+; N). If {V(t)} is the right translation semigroup in L2(0, ∞; N) then V̂(t) = ℱ V(t) ℱ⁻¹ is a unitarily equivalent semigroup acting in H2(Γ+; N) which is given by

(V̂(t)F)(w) = e^{−wt} F(w)        (6-11)
In terms of the boundary values of F on the imaginary axis we have

(V̂(t)F)(iω) = e^{−iωt} F(iω)        (6-12)

The adjoint semigroup is given by

V̂(t)* F = P_{H2(Γ+;N)} (e^{iωt} F)        (6-13)
With the obvious trivial modifications resulting from the different half plane used, all the results of Sec. II-12 can be applied here. Now the problem of central interest to us is that of realization. As in the case of discrete time systems, certain factorizations of the transfer function are intimately connected to specific realizations. We let H2(Γ+; HS(U, Y)) denote the Hardy space of all analytic functions in the right half plane whose values are Hilbert–Schmidt operators from U into Y. If U and Y are finite dimensional then, by a choice of bases in U and Y, a function T in H2(Γ+; HS(U, Y)) is just a finite matrix with H2(Γ+) entries. In such a case the map M_T: U → H2(Γ+; Y) defined by

M_T ξ = Tξ        (6-14)

is a bounded map.
Theorem 6-3 Let T be a continuous B(U, Y)-valued weighting pattern whose Laplace transform T̂ is analytic in the right half plane. If T̂ ∈ H1(Γ+; TC(U, Y)) and on the imaginary axis admits a factorization T̂(iω) = C(iω)* B(iω), where B ∈ H2(Γ+; HS(U, V)) and C ∈ H2(Γ+; HS(Y, V)), then T has a regular realization.

PROOF We choose H2(Γ+; V) as our state space and define a system (F, G, H) in the following way. We let F be the infinitesimal generator of the semigroup {V̂(t)*} defined by (6-13), whereas the operators G and H are defined by G = M_B, with M_T given by (6-14), and H*η = M_C η. Thus H = M_C* is given by

Hf = (1/2π) ∫ C(iω)* f(iω) dω

By our assumptions G and H are bounded operators and {V̂(t)*} is a strongly continuous semigroup. Moreover, from the factorization of T̂ it is clear that T̂ is in H1(Γ+; TC(U, Y)), so that for each ξ ∈ U

H V̂(t)* G ξ = (1/2π) ∫ e^{iωt} C(iω)* B(iω) ξ dω = (1/2π) ∫ e^{iωt} T̂(iω) ξ dω = T(t) ξ

and hence we have obtained a regular realization of T.
We note that this theorem is stronger than the sufficiency part of Theorem 6-2. Indeed, let T and Ṫ be in L2(0, ∞; B(U, Y)). Then, since the Fourier transform of Ṫ is essentially iω T̂(iω), it follows that (1 − iω) T̂(iω) is in H2(Γ+; HS(U, Y)). Since Y is finite dimensional, (1 + iω)⁻¹ I_Y ∈ H2(Γ+; HS(Y, Y)), and it follows that T̂ is factorable in the form

T̂(iω) = ((1 + iω)⁻¹ I_Y)* ((1 − iω) T̂(iω))

and hence T is regularly realizable.
In the case of the unit disc the inclusion relations H∞ ⊂ H2 ⊂ H1 hold. In that case, once a realization procedure for H2 functions is obtained it is automatically valid for H∞ functions. Passing to a half plane, those inclusions fail to hold any more. Thus, if we want a realization procedure that will work for bounded analytic transfer functions in the right half plane, then the previous procedures have to be modified.
That this is not purely academic follows by considering the ideal delay by a units. The corresponding input/output relation is

y(t) = u(t − a)        (6-15)

and the transfer function is F(w) = e^{−aw}, which is in H∞(Γ+) but is certainly not in H2(Γ+). More generally, we will consider input/output relations of the form

y(t) = ∫₀ᵗ dM(τ) u(t − τ)        (6-16)
where M is a finite B(U, Y)-valued Borel measure on [0, ∞). If M is absolutely continuous with respect to Lebesgue measure then there exists a density matrix Ṁ(t) for which dM(t) = Ṁ(t) dt and for which (Ṁ(t)ξ, η) is in L1(0, ∞) for all choices of ξ ∈ U and η ∈ Y. In such a case (6-16) reduces to the more familiar input/output relation

y(t) = ∫₀ᵗ Ṁ(τ) u(t − τ) dτ = ∫₀ᵗ Ṁ(t − τ) u(τ) dτ        (6-17)
In the general case a further reduction is possible. Let D = M({0}); then (6-16) transforms into

y(t) = Du(t) + ∫_{(0,t]} dM(τ) u(t − τ)        (6-18)
Let us denote by F the Fourier transform of the measure M, that is,

F(w) = ∫₀^∞ e^{−wτ} dM(τ)        (6-19)

Clearly

F(w) = M({0}) + ∫_{(0,∞)} e^{−wτ} dM(τ)
Now F(w) is analytic in Γ+ and bounded there. Moreover, by an application of the Lebesgue dominated convergence theorem, it follows that

lim_{x→∞} F(x + iy) − M({0}) = 0

uniformly in y. Thus it is natural to denote M({0}) by F(∞), in the sense that lim_{x→∞} F(x + iy) = F(∞) uniformly in y. The realization procedure that follows
has been developed with this class of transfer functions in mind. The new concept needed for what follows is that of a rigged structure on a Hilbert space X. By a rigged structure we mean a Hilbert space X and a linear manifold K ⊂ X which is itself a Hilbert space under another, stronger, norm. If K' is the space of all continuous linear functionals on K, then any vector x ∈ X induces a continuous linear functional l_x on K given by l_x(y) = (y, x) for all y ∈ K. This enables us to view X as embedded in K', the embedding being given by L: X → K' with Lx = l_x. Thus we obtain the inclusions

K ⊂ X ⊂ K'        (6-20)
Let M be an operator in X for which K is M*-invariant. Given any y ∈ K and l ∈ K' we have in l(M*y) a well-defined continuous linear functional on K whose dependence on l is linear. Thus there exists a bounded operator M̄ in K' satisfying

(M̄ l)(y) = l(M*y)        (6-21)

For linear functionals arising out of vectors in X we have

(M̄ l_x)(y) = l_x(M*y) = (M*y, x) = (y, Mx) = l_{Mx}(y)

or, as l_x = Lx, this can be written as

M̄ L = L M        (6-22)
which shows that M̄ is the continuous extension of M to K', and henceforward we will use M to denote M̄ as well. If B: U → K' and C: K → U are continuous maps then so are B*: K → U and C*: U → K', which are given by (u, B*k) = (Bu)(k) and (C*u)(k) = (u, Ck). Let now X, U, and Y be Hilbert spaces, and let A be the infinitesimal generator of a strongly continuous contraction semigroup acting in X. We noted before that, as A is closed, DA becomes a Hilbert space with the inner product induced by the graph norm, ||x||²_A = ||x||² + ||Ax||². If D'A is the dual of DA we naturally obtain the rigged structure

DA ⊂ X ⊂ D'A        (6-23)

and similarly

DA* ⊂ X ⊂ D'A*        (6-24)

Since both A and A* generate strongly continuous contractive semigroups, (λI − A)⁻¹ and (λI − A*)⁻¹ exist as bounded operators for all λ such that Re λ > 0. Now (λI − A)⁻¹ X ⊂ DA, and the map (λI − A)⁻¹ is continuous relative
to the graph topology of DA. Similarly (λI − A*)⁻¹ X ⊂ DA*. By duality we obtain

(λI − A)⁻¹ D'A* ⊂ X   and   (λI − A*)⁻¹ D'A ⊂ X        (6-25)

Also, from the invariance of DA under the semigroup e^{At}, it follows that e^{At} is a continuous map of D'A* into itself. Generally, a continuous time linear system will denote a quadruple of operators (A, B, C, D) with A as before, B: U → D'A*, C: DC → Y, and D: U → Y, where we assume the continuity of B, DC ⊃ DA, and the restriction of C to DA to be continuous. By a compatible system we denote a system for which

(λ0I − A)⁻¹ BU ⊂ DC        (6-26)

for some λ0 with Re λ0 > 0.
Let us note first that the condition (6-26) is independent of the point λ in Re λ > 0.

Lemma 6-4 Let (A, B, C) be a compatible system. Then (λI − A)⁻¹ Bu ∈ DC for all u ∈ U and λ with Re λ > 0.
PROOF Since (A, B, C) is assumed compatible, (6-26) holds for some λ0 with Re λ0 > 0. By the resolvent equation we have

(λI − A)⁻¹ = (λ0I − A)⁻¹ + (λ0 − λ)(λ0I − A)⁻¹(λI − A)⁻¹

Applying this to Bu we have

(λI − A)⁻¹ Bu = (λ0I − A)⁻¹ Bu + (λ0 − λ)(λ0I − A)⁻¹(λI − A)⁻¹ Bu        (6-27)

Now Bu ∈ D'A*, and from (6-25) it follows that (λI − A)⁻¹ Bu is in X. Now (λ0I − A)⁻¹ X ⊂ DA ⊂ DC, and so the right term in (6-27) is in DC. But (λ0I − A)⁻¹ Bu ∈ DC by the assumption of compatibility, and so (λI − A)⁻¹ Bu ∈ DC.
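The resolvent equation used in the proof is a purely algebraic identity and can be sanity-checked on matrices (an illustrative sketch with an arbitrary stable matrix and two points in the right half plane):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5)) - 3 * np.eye(5)   # spectrum pushed into Re < 0
I = np.eye(5)

def resolvent(lam):
    """(lam I - A)^{-1}"""
    return np.linalg.inv(lam * I - A)

lam, lam0 = 1.0, 2.5                              # both with Re > 0
lhs = resolvent(lam)
rhs = resolvent(lam0) + (lam0 - lam) * resolvent(lam0) @ resolvent(lam)
assert np.allclose(lhs, rhs)
```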
We turn now to showing that all transfer functions of the form (6-19) are realizable by compatible systems. This is done by associating with the continuous
time realization problem a discrete time problem which is easier to solve and whose solution suitably transformed yields a compatible continuous time realization.
Assume now that F is given by (6-19). The map

z = (w − 1)/(w + 1)        (6-28)

is a fractional linear transformation that maps the right half plane Γ+ onto the open unit disc D. The inverse transformation is given by

w = (1 + z)/(1 − z)        (6-29)
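The mapping properties of (6-28) and (6-29) are easy to confirm numerically (illustrative sketch; the sample points are arbitrary):

```python
import numpy as np

to_disc = lambda w: (w - 1) / (w + 1)        # (6-28)
to_half_plane = lambda z: (1 + z) / (1 - z)  # (6-29)

rng = np.random.default_rng(4)
w = rng.uniform(0.1, 5, 100) + 1j * rng.uniform(-5, 5, 100)  # points with Re w > 0
z = to_disc(w)

assert np.all(np.abs(z) < 1)                 # right half plane -> open unit disc
assert np.allclose(to_half_plane(z), w)      # the maps are mutually inverse
# the imaginary axis is carried onto the unit circle
assert np.allclose(np.abs(to_disc(1j * np.linspace(-9, 9, 50))), 1)
```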
Define now a function Φ in the unit disc by

Φ(z) = F((1 + z)/(1 − z))        (6-30)

Clearly Φ ∈ H∞(B(U, Y)), and moreover the strong nontangential limits of Φ at z = 1 exist and are equal to Φ(1) = F(∞). By considering Φ to be the transfer function of a discrete time system we can apply the results of Sec. 2 to obtain a Hilbert space realization for Φ. Specifically, the shift realization (F, G, H, E) of Φ can be used. Thus we identify the state space M of the realization with Range H_Φ,

F = S0* | Range H_Φ,   Gξ = (Φ − Φ(0))ξ,   Hf = f'(0) for f ∈ M,   and   E = Φ(0)
In terms of these operators we can write
C(z) = E + zH(I - zF)-' G
(6-31)
Since the nontangential limits of CD at z = 1 exist and are equal to 0(1) = F(oo) we obtain the relation
C(1) = E + H(I - F) - 1 G = F(co)
(6-32)
From (6-31) we obtain, using the previously introduced fractional linear transformation, that

F(w) = Φ((w − 1)/(w + 1))
     = E + ((w − 1)/(w + 1)) H(I − ((w − 1)/(w + 1))F)⁻¹ G
     = E + (w − 1) H((w + 1)I − (w − 1)F)⁻¹ G
     = E + (w − 1) H(w(I − F) + (I + F))⁻¹ G
     = E + (w − 1) H(I − F)⁻¹(wI − (F + I)(F − I)⁻¹)⁻¹ G

Since F is a completely nonunitary contraction, (F − I)⁻¹ exists as a possibly unbounded, closed operator. Let A₀ = (F + I)(F − I)⁻¹; then A₀ is maximal dissipative and hence the infinitesimal generator of a strongly continuous contractive semigroup. Moreover the relation

(I − F)⁻¹ = (I − A₀)/2      (6-33)

implies

F(w) = E + ½(w − 1) H(I − A₀)(wI − A₀)⁻¹ G      (6-34)

We define new operators B₀ and C₀ by

B₀ = −(1/(2√π))(I − A₀)G      (6-35)

and

C₀ = √π H(I − A₀)      (6-36)
Thus (6-34) can be rewritten as

F(w) = E − C₀(w − 1)(wI − A₀)⁻¹(I − A₀)⁻¹B₀      (6-37)

and, by applying the resolvent identity, we obtain the following representation for F:

F(w) = E − C₀(I − A₀)⁻¹B₀ + C₀(wI − A₀)⁻¹B₀ = F(∞) + C₀(wI − A₀)⁻¹B₀      (6-38)

In summary, the system (A₀, B₀, C₀, F(∞)) is a realization of the transfer function F, and it will be shown that it is actually a compatible realization. From (6-35) we have

B₀* = −(1/(2√π)) G*(I − A₀*)

and hence, for every x ∈ D_A₀*,

∥B₀*x∥ = (1/(2√π)) ∥G*(I − A₀*)x∥ ≤ (1/(2√π)) ∥G*∥(∥x∥ + ∥A₀*x∥)

Hence B₀*: D_A₀* → U is continuous with respect to the graph norm of D_A₀*, and by duality B₀: U → D′_A₀* is a continuous map. Similarly, since C₀ = √π H(I − A₀) and as D_A₀ = Range (I − A₀)⁻¹, the boundedness of H implies that D_A₀ ⊂ D_C₀ and that the restriction of C₀ to D_A₀ is continuous. Also F(∞) = E − C₀(I − A₀)⁻¹B₀ shows that for each ξ ∈ U we have (I − A₀)⁻¹B₀ξ ∈ D_C₀, and so the compatibility condition is satisfied.
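The passage from the contraction F to the generator A₀ = (F + I)(F − I)⁻¹ can be illustrated in finite dimensions. The sketch below, with a randomly chosen strict contraction (purely for illustration), verifies that A₀ is dissipative and has spectrum in the open left half plane, as required of a generator of a contractive semigroup:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
F = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F *= 0.9 / np.linalg.norm(F, 2)          # a strict contraction, ||F|| = 0.9

I = np.eye(n)
A0 = (F + I) @ np.linalg.inv(F - I)      # the Cayley transform of F

# dissipativity: with x = (F - I)u one has Re <A0 x, x> = ||Fu||^2 - ||u||^2 < 0
for _ in range(100):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    assert (x.conj() @ A0 @ x).real < 0

# the spectrum of A0 lies in the open left half plane
assert np.max(np.linalg.eigvals(A0).real) < 0
```

The identity Re⟨A₀x, x⟩ = ∥Fu∥² − ∥u∥² with x = (F − I)u is exactly what makes the Cayley transform carry contractions to dissipative operators.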
Although we have obtained a compatible realization, the result is unsatisfactory inasmuch as the state space of the realization is a subspace of H²_{0,Y} of the unit disc, whereas we would like the setting of the realization to be either a space of functions analytic in the right half plane or a subspace of some L² space on [0, ∞). This is indeed possible and is summarized by the following theorem, which is the main realization result.

Theorem 6-5 Let M be a complex B(U, Y)-valued Borel measure on [0, ∞) and let

F(w) = ∫₀^∞ e^{−wt} dM(t)

(a) The state space system (A, B, C, D) with state space M = J₀M₀ = J₀ Range H_Φ, where J₀: H²_{0,Y} → H²(Π₊; Y) is given by

(J₀f)(w) = (1/√π)(1/(w + 1)) f((w − 1)/(w + 1))      (6-39)

and A, B, C, D are given by

(Af)(iω) = iωf(iω)   for f ∈ D_A      (6-40)
(Bξ, g) = (1/2π) ∫_{−∞}^{∞} ((F(iω) − F(1))ξ, g(iω)) dω      (6-41)

for ξ ∈ U and g ∈ D_A*,

Cf = (1/2π) ∫_{−∞}^{∞} f(iω) dω   for f ∈ D_C      (6-42)

and
D = F(∞)      (6-43)

is a compatible realization of F.

(b) The state space system (Â, B̂, Ĉ, D̂) with the state space M̂ = ℱ⁻¹M, ℱ being the Fourier–Plancherel transform, and with

(Âφ)(s) = φ′(s)   for φ ∈ D_Â      (6-44)

(B̂ξ, y) = ∫₀^∞ (φ_ξ(s), y(s) + y′(s)) ds      (6-45)

for ξ ∈ U and y ∈ D_Â*, where φ_ξ = ℱ⁻¹[((F(w) − F(1))/(1 − w))ξ],

(Ĉφ, η) = ∫₀^∞ (φ(s) − φ′(s), e^{−s}η) ds      (6-46)

for φ ∈ D_Ĉ and η ∈ Y, and

D̂ = F(∞)      (6-47)

is a compatible realization of F.
PROOF (a) The map J₀ defined by (6-39) is a minor modification of (6-9) and it maps H²_{0,Y} unitarily onto H²(Π₊; Y). Under J₀ the right shift in H²_{0,Y} is mapped onto the operator of multiplication by (w − 1)/(w + 1) in H²(Π₊; Y). If M = J₀M₀ = J₀ Range H_Φ then M⊥ is a subspace of H²(Π₊; Y) invariant under multiplication by all H^∞(Π₊) functions. If (F, G, H, E) is the shift realization of Φ defined previously, then (J₀FJ₀⁻¹, J₀G, HJ₀⁻¹, E) is a unitarily equivalent realization of Φ in M. The operator A₀ = (F + I)(F − I)⁻¹ is unitarily equivalent to A = J₀A₀J₀⁻¹ = (J₀FJ₀⁻¹ + I)(J₀FJ₀⁻¹ − I)⁻¹. Since F*φ = P_M₀ χ̄φ it follows that

J₀F*J₀⁻¹f = P_M ((χ̄ − 1)/(χ̄ + 1)) f

where χ denotes in both cases the identity function, in D and Π₊, respectively. Thus A*f = P_M χ̄f, and hence for f ∈ D_A we obtain, in terms of the boundary values,

Af = P_M χf      (6-48)
or, somewhat less precisely, (Af)(iω) = P_M(iωf(iω)) for all f ∈ D_A. The action of the semigroup generated by A in M is given by (e^{At}f)(iω) = P_M(e^{iωt}f(iω)), and this proves (6-40).

To compute B we note that B = J₀B₀ = −(1/(2√π)) J₀(I − A₀)G = −(1/(2√π))(I − A)J₀G. Now (Gξ)(z) = (H_Φξ)(z) = (Φ(z) − Φ(0))ξ, which implies that

(J₀Gξ)(w) = (1/√π)(1/(w + 1))[F(w) − F(1)]ξ

since Φ((w − 1)/(w + 1)) = F(w) and Φ(0) = F(1). Hence for each g ∈ D_A* and ξ ∈ U we have

(Bξ, g) = −(1/(2√π))((I − A)J₀Gξ, g) = −(1/(2√π))(J₀Gξ, (I − A*)g) = (1/2π) ∫_{−∞}^{∞} ((F(iω) − F(1))ξ, g(iω)) dω

which proves (6-41).
To calculate the representation of C we note that for φ ∈ M₀ we have Hφ = φ′(0). Now the inverse of J₀ is given by

(J₀⁻¹f)(z) = (2√π/(1 − z)) f((1 + z)/(1 − z))      (6-49)

so if φ = J₀⁻¹f then φ′(0) is obtained by differentiating (6-49) at z = 0. Therefore Cf = C₀J₀⁻¹f = √π H(I − A₀)J₀⁻¹f = √π HJ₀⁻¹(I − A)f. Now evaluation at any point λ ∈ Π₊ is a continuous map on H²(Π₊; Y), and we have for each f ∈ H²(Π₊; Y) and η ∈ Y that

(f(λ), η) = (1/2π) ∫_{−∞}^{∞} (f(iω), η)/(λ − iω) dω      (6-50)

Hence, for f ∈ D_A,

(Cf, η) = (√π HJ₀⁻¹(I − A)f, η) = (1/2π) ∫_{−∞}^{∞} (f(iω), η) dω

This proves (6-42), while (6-43) follows from (6-38).
To prove part (b) we apply the inverse Fourier–Plancherel transform to obtain a time domain realization

(Â, B̂, Ĉ, D̂) = (ℱ⁻¹Aℱ, ℱ⁻¹B, Cℱ, D)

in the state space M̂ = ℱ⁻¹M. From previous considerations Â is the infinitesimal generator of the left translation semigroup restricted to the left invariant subspace M̂ of L²(0, ∞; Y). So (e^{Ât}φ)(s) = φ(s + t) for all φ ∈ M̂, and for φ ∈ D_Â we have Âφ = φ′. From (6-41) it follows that

(Bξ, g) = (1/2π) ∫_{−∞}^{∞} ((F(iω) − F(1))ξ, g(iω)) dω = (1/2π) ∫_{−∞}^{∞} (((F(iω) − F(1))/(1 − iω))ξ, (1 + iω)g(iω)) dω

and taking the inverse Fourier transform we obtain

(B̂ξ, y) = ∫₀^∞ (φ_ξ(s), y(s) + y′(s)) ds

for all ξ ∈ U and y ∈ D_Â*. Here φ_ξ denotes the function

φ_ξ = ℱ⁻¹[((F(w) − F(1))/(1 − w))ξ]      (6-51)

Finally for f ∈ D_A we have, letting φ = ℱ⁻¹f,

(Ĉφ, η) = (1/2π) ∫_{−∞}^{∞} (f(iω), η) dω = (1/2π) ∫_{−∞}^{∞} ((1 − iω)f(iω), η/(1 + iω)) dω

or

(Ĉφ, η) = ∫₀^∞ (φ(s) − φ′(s), e^{−s}η) ds

Integration by parts yields, for differentiable φ, certainly for φ ∈ D_Ĉ, that

Ĉφ = φ(0)      (6-52)
It is of interest to verify directly that we have obtained a realization. To this end we evaluate

(D + C(λI − A)⁻¹B)ξ = Dξ + C(λI − A)⁻¹Bξ = F(∞)ξ + (1/2π) ∫_{−∞}^{∞} (F(iω) − F(1))ξ/(λ − iω) dω

Using the scalar resolvent identity

1/(λ − iω) = 1/(1 − iω) + (1 − λ)/((1 − iω)(λ − iω))
we can rewrite the last integral as

(1/2π) ∫_{−∞}^{∞} (F(iω) − F(1))ξ/(1 − iω) dω + (1 − λ)(1/2π) ∫_{−∞}^{∞} (F(iω) − F(1))ξ/((1 − iω)(λ − iω)) dω

Since (F(w) − F(1))ξ/(1 − w) is in H²(Π₊; Y), it follows from (6-50) that

(1 − λ)(1/2π) ∫_{−∞}^{∞} (F(iω) − F(1))ξ/((1 − iω)(λ − iω)) dω = (F(λ) − F(1))ξ

Thus it remains to evaluate the first integral. To this end we recall that

F(w) = ∫₀^∞ e^{−wt} dM(t) = F(∞) + ∫_{0+}^∞ e^{−wt} dM(t)

Let us define G by

G(w) = F(w) − F(∞) = ∫_{0+}^∞ e^{−wt} dM(t)

Clearly G(1) = F(1) − F(∞) and so

(F(w) − F(1))/(1 − w) = (G(w) − G(1))/(1 − w)

By the Paley–Wiener theorem

(F(w) − F(1))/(1 − w) = (G(w) − G(1))/(1 − w)

applied to ξ ∈ U is the Fourier–Plancherel transform of an L²(0, ∞; Y) function, which we denote by φ_ξ. Neither F(w)ξ/(1 − w) nor F(1)ξ/(1 − w) is in H²(Π₊; Y) unless F(1)ξ = 0, but by restricting ourselves to the half plane Re w > 1 we can identify F(w)ξ/(1 − w) as the Laplace transform of the convolution of the function −e^t with the measure Mξ, that is,

F(w)ξ/(1 − w) = −∫₀^∞ e^{−wt} e^t (∫₀^t e^{−τ} dM(τ)ξ) dt

Also

F(1)ξ/(1 − w) = −∫₀^∞ e^{−wt} e^t (∫₀^∞ e^{−τ} dM(τ)ξ) dt

which implies the equality

φ_ξ(t) = e^t ∫_t^∞ e^{−τ} dM(τ)ξ      (6-53)
We can interpret (6-53) as the variation of parameters formula for the solution of the nonhomogeneous differential equation

φ_ξ(t) − φ′_ξ(t) = Ṁ(t)ξ      (6-54)

The differential equation has to be considered in the distributional sense, the solution being actually in L²(0, ∞; Y). We can evaluate now the integral

(1/2π) ∫_{−∞}^{∞} (F(iω) − F(1))ξ/(1 − iω) dω

This integral is equal to Ĉφ_ξ, where by (6-46)

(Ĉφ_ξ, η) = ∫₀^∞ (φ_ξ(s) − φ′_ξ(s), e^{−s}η) ds

By (6-54), φ_ξ − φ′_ξ = Ṁξ on (0, ∞), and so

Ĉφ_ξ = ∫_{0+}^∞ e^{−s} dM(s)ξ = G(1)ξ = (F(1) − F(∞))ξ

Collecting terms, Dξ + C(λI − A)⁻¹Bξ = F(∞)ξ + (F(1) − F(∞))ξ + (F(λ) − F(1))ξ = F(λ)ξ, which shows that we have indeed solved the realization problem. We call the realization provided by Theorem 6-5 the restricted translation realization.
As an example of the preceding theorem we obtain a state space realization of the simple delay line. The input/output relation is y(t) = u(t − a) for some a > 0. Thus the transfer function is F(w) = e^{−aw}I, which is the transform of the weighting pattern M = δ_a·I, where δ_a is the Dirac delta function at a.

To realize δ_a·I we take our state space to be L²(0, a; Y). The operator Â will be the infinitesimal generator of the left translation semigroup restricted to L²(0, a; Y). Thus for φ ∈ D_Â we have Âφ = φ′. By (6-53) we have in our case

φ_ξ(s) = e^s ∫_s^∞ e^{−τ} δ_a(τ) dτ ξ = e^{s−a}ξ for s ≤ a,   φ_ξ(s) = 0 for s > a

This implies that for ξ ∈ U and y ∈ D_Â*

(B̂ξ, y) = ∫₀^a (φ_ξ(s), y(s) + y′(s)) ds = ∫₀^a e^{s−a}(ξ, y(s) + y′(s)) ds

Now as y ∈ D_Â* it follows from the results of Sec. II-10 that y(0) = 0. Hence integration by parts yields (B̂ξ, y) = (ξ, y(a)), which is formally equivalent to B̂ξ = δ_aξ.
Now from the variation of parameters formula we obtain

x(t, s) = ∫₀^t (e^{Â(t−τ)}B̂u(τ))(s) dτ = ∫₀^t u(τ)δ_a(s + t − τ) dτ = u(s + t − a)

Assuming u is continuous we immediately obtain y(t) = Ĉx(t) = x(t, 0) = u(t − a), i.e. we have realized the simple delay.

We turn now to the discussion of Hankel operators. The state space of the shift realization was best characterized as the range of the Hankel operator of the transfer function. We expect the same to be true in the continuous time case if the Hankel operator is suitably defined. Given F ∈ H^∞(Π₊; B(U, Y)) we define H_F: H²(Π₋; U) → H²(Π₊; Y) by

H_F v = P_{H²(Π₊; Y)} Fv      (6-55)
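Looking back at the delay-line example, it admits an elementary sanity check: approximating the state space L²(0, a; Y) by a shift register of N cells reproduces y(t) = u(t − a). This is only an illustrative discretization, not the operator construction itself.

```python
import numpy as np

N = 100                       # number of cells; a = N * h
h = 0.01                      # time step = cell width, so a = 1.0
u = lambda t: np.sin(t)       # a continuous input signal

state = np.zeros(N)           # samples of x(t, s) on [0, a)
outputs, times = [], []
for k in range(300):
    t = k * h
    outputs.append(state[0])  # y(t) = C x(t) = x(t, 0)
    state[:-1] = state[1:]    # left translation semigroup, one step
    state[-1] = u(t)          # B feeds the input in at s = a
    times.append(t)

# once the register has filled, y(t) = u(t - a) with a = 1.0
for k in range(N, 300):
    assert abs(outputs[k] - u(times[k] - 1.0)) < 1e-12
```

The left translation of the state together with injection of the input at s = a is exactly the semigroup picture of the restricted translation realization.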
Let Φ ∈ H^∞(B(U, Y)) and F be related by (6-30) and let J₀: H²_{0,Y} → H²(Π₊; Y) be defined by (6-39). Then for u ∈ H²_U we have

J₀H_Φu = J₀P_{H²_{0,Y}}Φu = P_{H²(Π₊; Y)}J₀Φu = P_{H²(Π₊; Y)}F(J₀u) = H_F(J₀u)

or

J₀H_Φ = H_FJ₀      (6-56)

that is, the two Hankel operators H_Φ and H_F are unitarily equivalent. By the Paley–Wiener theorem the Fourier–Plancherel transform maps L²(−∞, 0; U) and L²(0, ∞; Y) onto H²(Π₋; U) and H²(Π₊; Y), respectively. Thus we want to identify ℱ⁻¹H_Fℱ. Assume again that F(w) = ∫₀^∞ e^{−wt} dM(t) for some complex matrix valued measure M. Since the Fourier transform of a convolution is the product of the Fourier transforms we have

(ℱ⁻¹(Fv))(t) = ∫_{−∞}^0 dM(t − σ)(ℱ⁻¹v)(σ) = ∫₀^∞ dM(t + σ)(ℱ⁻¹v)(−σ)

or, for u ∈ L²(−∞, 0; U),

(ℱ⁻¹H_Fℱu)(t) = ∫₀^∞ dM(t + σ)u(−σ)      (6-57)
With the above identification all the results concerning reachability, observability, and spectral analysis carry over to the continuous time setting.

To discuss reachability let us consider a pair (A, B), where we assume first that B: U → X, which is more restrictive than the assumption of compatibility. We say that (A, B) is a reachable pair if ⋁_{t≥0} K_t = X, where K_t is the linear manifold consisting of all vectors of the form ∫₀^t e^{A(t−τ)}Bu(τ) dτ, the control function u being restricted to some function space, say C([0, ∞); U).

Lemma 6-6 The pair (A, B) is a reachable pair if and only if

∩_{t≥0} Ker B*e^{A*t} = {0}      (6-58)

PROOF (A, B) is not a reachable pair if and only if there exists a nonzero vector x ∈ X for which

0 = (x, ∫₀^t e^{A(t−τ)}Bu(τ) dτ) = ∫₀^t (B*e^{A*(t−τ)}x, u(τ)) dτ

for all t ≥ 0 and all admissible u. Choosing u(τ) = ψ(τ)η, where η ∈ U and ψ is any scalar C^∞ function, this is the case if and only if x ∈ ∩_{t≥0} Ker B*e^{A*t}.
If we assume that A is the infinitesimal generator of a strongly continuous semigroup of contractions, the reachability of (A, B) can be shown to be equivalent to the reachability of an associated discrete time system.

Lemma 6-7 Let A be the infinitesimal generator of a strongly continuous semigroup of completely nonunitary contractions in X and let T = (A + I)(A − I)⁻¹ be its Cayley transform. Then (A, B) is reachable if and only if (T, B) is reachable.

PROOF The reachability of the two systems is equivalent to ∩_{t≥0} Ker B*e^{A*t} = {0} and ∩_{n≥0} Ker B*T*ⁿ = {0}, respectively. Thus it suffices to show the equivalence of the latter two conditions. To this end we recall that e^{At} = e_t(T), where e_t(z) = e^{t(z+1)/(z−1)} = Σ aₙ(t)zⁿ and e_t(T) = lim_{r→1} e_{t,r}(T) with e_{t,r}(z) = e_t(rz). Thus

B*e^{A*t}x = lim_{r→1} Σ_{n=0}^∞ āₙ(t) rⁿ B*T*ⁿx

so x ∈ ∩_{n≥0} Ker B*T*ⁿ implies x ∈ ∩_{t≥0} Ker B*e^{A*t}.

To prove the converse we note that with ψ_t(z) = (z − 1 + t)/(z − 1 − t) we have T = lim_{t→0} ψ_t(e^{At}) and similarly T* = lim_{t→0} ψ_t(e^{A*t}). This shows that T*ⁿ can be expanded in terms of the operators e^{A*t}, which implies the converse inclusion ∩_{t≥0} Ker B*e^{A*t} ⊂ ∩_{n≥0} Ker B*T*ⁿ.
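In finite dimensions Lemma 6-7 reduces to an elementary fact: since T is a rational function of A (and conversely), the pairs (A, B) and (T, B) generate the same Krylov subspace. A small numerical illustration with random data (for illustration only):

```python
import numpy as np

def krylov_rank(A, B):
    # rank of [B, AB, ..., A^{n-1}B], the finite dimensional reachability test
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
# shift the spectrum into Re < 0, so that A generates a stable flow and A - I is invertible
A -= (np.max(np.linalg.eigvals(A).real) + 0.5) * np.eye(n)
B = rng.standard_normal((n, 1))

T = (A + np.eye(n)) @ np.linalg.inv(A - np.eye(n))  # Cayley transform of A

assert krylov_rank(A, B) == krylov_rank(T, B) == n
```

The equality of the two ranks holds because A and T generate the same algebra of operators, so the cyclic subspaces generated by B coincide.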
Of course reachability could also be discussed in terms of the reachability operator. We define R on C₀^∞((−∞, 0]; U) by

Ru = ∫₀^∞ e^{At}Bu(−t) dt      (6-59)

Since u has compact support, R is well defined. It is very simple to check that (A, B) is a reachable pair if and only if the range of R is dense in X.

We pass now to the discussion of compatible systems. In this case R is a priori a map into D′_A*. However, since we restricted R to a space of C^∞ functions (in fact once differentiable would be sufficient for our purposes), it follows by integration by parts that

Ru = ∫₀^∞ e^{At}Bu(−t) dt = −(A − λ₀)⁻¹Bu(0) + ∫₀^∞ e^{At}(A − λ₀)⁻¹B(u̇(−t) − λ₀u(−t)) dt

which shows by compatibility that Ru ∈ X. We say now that (A, B) is reachable if R thus defined has range dense in X. The pair (A, B) is an exactly reachable pair if R has an extension to a continuous map of L²(−∞, 0; U) onto X. We can now apply Lemmas 6-6 and 6-7 to compatible systems.

Corollary 6-8 Let (A, B) be a compatible pair. Then

(a) (A, B) is reachable if and only if

∩_{t≥0} Ker B*(λ̄₀I − A*)⁻¹e^{A*t} = {0}      (6-60)

(b) If A is the infinitesimal generator of a completely nonunitary semigroup then (A, B) is a reachable pair if and only if (T, (I − A)⁻¹B) is reachable, where T is the Cayley transform of A.
As in the discrete time case we can identify the reachability operator of the restricted translation realization with the corresponding Hankel operator. Indeed if F ∈ H^∞(Π₊; B(U, Y)) then for u ∈ L²(−∞, 0; U)

Ru = ∫₀^∞ e^{At}Bu(−t) dt = P_{H²(Π₊; Y)} ∫₀^∞ e^{iωt}(F(iω) − F(1))u(−t) dt
   = P_{H²(Π₊; Y)} ∫₀^∞ e^{iωt}F(iω)u(−t) dt − P_{H²(Π₊; Y)} ∫₀^∞ e^{iωt}F(1)u(−t) dt
   = P_{H²(Π₊; Y)} ∫₀^∞ e^{iωt}F(iω)u(−t) dt = H_F(ℱu)

the second term vanishing since ∫₀^∞ e^{iωt}u(−t) dt is the boundary value of a function in H²(Π₋; U).

We say that F ∈ H^∞(Π₊; B(U, Y)) is strictly noncyclic if (Range H_F)⊥ is an invariant subspace of full range. Thus we infer that, for Φ defined by (6-30), Φ is strictly noncyclic if and only if F is. This implies that theorems like Theorem 3-5 and Theorem 3-10 hold for functions in H^∞(Π₊; B(U, Y)).

An analogous discussion holds for observability. The observability operator
O is defined through its adjoint

O*y = ∫₀^∞ e^{A*t}C*y(t) dt

For the restricted translation realization we obtain from (6-42) that (C*η)(iω) = η for all η ∈ Y, and hence

O*y = ∫₀^∞ e^{A*t}C*y(t) dt = ∫₀^∞ e^{−iωt}y(t) dt = (ℱy)(iω)

Thus O* can be identified with the Fourier transform, and so the restricted translation realization is exactly observable.
7. SYMMETRIC SYSTEMS

The realization theory developed in the previous section using the translation semigroup has as its natural domain of applicability the set of strictly noncyclic functions. This by no means exhausts the interesting cases. In particular, certain internal symmetry properties of systems are reflected in the corresponding transfer function. Thus we expect a more limited set of transfer functions, or weighting patterns, to be realizable by systems with internal symmetries. In some sense this is an approach contrary in spirit to the shift and translation realizations, which were highly nonsymmetric.

The systems to be studied in this section are of the form (A, B, C) with A a, possibly unbounded, self-adjoint operator in the Hilbert space H which generates a strongly continuous semigroup. This is equivalent to the semiboundedness of A from above. That is, we assume there exists an ω > 0 such that

(Ax, x) ≤ ω∥x∥²      (7-1)

for all x ∈ D_A. This implies, and is actually equivalent to, σ(A) ⊂ (−∞, ω]. The semigroup e^{At} generated by A will be contractive if and only if σ(A) ⊂ (−∞, 0]. Hence by replacing A by A − ωI, and the semigroup e^{At} by e^{(A−ωI)t}, we may as well assume that A is the infinitesimal generator of a strongly continuous semigroup of contractions. A system (A, B, C) will be called a self-adjoint system if the spaces U and Y are equal and if, besides the self-adjointness of A, we assume C = B*. We say that (A, B, C) is a stable self-adjoint system if it is a self-adjoint system and A generates a strongly continuous semigroup of self-adjoint contractions.

Just as shift operators could be utilized to study systems, mainly because we had a convenient functional model for them, the same is true in the case of systems with self-adjoint generators. The theory of spectral representations for self-adjoint operators provides the tool to study these systems. Let us assume that A has finite multiplicity n. In this case we may assume that A is given in its spectral representation. Thus the Hilbert space H can be identified with L²(M) for some n × n matrix measure M. The operator A acts on functions in its domain by

(Af)(λ) = λf(λ)      (7-2)
and the action of the semigroup is

(e^{At}f)(λ) = e^{λt}f(λ)      (7-3)

for all f ∈ L²(M). If the input space U is finite dimensional then, by a choice of basis, it can be identified with ℂᵐ where m = dim U. Since B: ℂᵐ → L²(M), there exists a measurable n × m matrix valued function B(λ) such that

(Bξ)(λ) = B(λ)ξ      (7-4)

for all ξ ∈ ℂᵐ. We define the reachability operator R by

Ru = ∫_{−∞}^0 e^{−Aτ}Bu(τ) dτ      (7-5)

and take its domain of definition to be the space of all bounded measurable ℂᵐ-valued functions of compact support. Thus defined, R is a map into L²(M). We can use the spectral representation of A to obtain a corresponding representation for R. It follows from (7-5) that

(Ru)(λ) = ∫_{−∞}^0 e^{−λτ}B(λ)u(τ) dτ = B(λ) ∫_{−∞}^0 e^{−λτ}u(τ) dτ      (7-6)

But ∫_{−∞}^0 e^{−λτ}u(τ) dτ is just the Laplace transform û = ℒu of u. Thus we can define R̂ on the set of all Laplace transforms of permissible inputs by R̂û = Ru, or equivalently R̂ℒ = R. So

(R̂û)(λ) = B(λ)û(λ)      (7-7)

If this is properly extended by continuity to a function space which is a B-module, where B is the algebra of bounded Borel measurable functions on ℝ, then R̂ becomes a B-homomorphism.

The observability operator, or rather its adjoint, can be analogously analyzed. Given a state f ∈ L²(M) we let

(Of)(t) = Ce^{At}f      (7-8)

where we assume C: L²(M) → ℂᵖ, having identified Y with ℂᵖ. Let v be any ℂᵖ-valued function for which the L²(0, ∞; ℂᵖ) inner product (Of, v) makes sense. Since C*: ℂᵖ → L²(M), we have

(C*η)(λ) = C(λ)*η      (7-9)

for some measurable p × n matrix function C. From (7-9) it follows that

(Cf, η) = (f, C*η) = ∫ (dMf(λ), C(λ)*η) = ∫ (C(λ) dMf(λ), η) = (∫ C(λ) dMf(λ), η)
or

Cf = ∫ C(λ) dMf(λ)   for f ∈ L²(M)

and this in turn implies

(Of)(t) = Ce^{At}f = ∫ C(λ) dM(e^{At}f)(λ)

or

(Of)(t) = ∫ e^{λt}C(λ) dMf(λ)      (7-10)

Thus the representation

(O*v)(λ) = C(λ)* ∫₀^∞ e^{λt}v(t) dt      (7-11)

holds, and in terms of Laplace transforms on [0, ∞) this can be rewritten as

(Ô*v̂)(λ) = C(λ)*v̂(λ)      (7-12)

where now v̂ refers to the Laplace transforms of functions defined on [0, ∞). Thus proper extensions of O* are also B-homomorphisms.

Now the reachability operator R is completely determined by the matrix measure M and the matrix function B, and so it is natural to characterize reachability in those terms. Analogously we want to characterize observability in terms of C and M. In our context reachability and observability mean that R and O* have dense range, conditions which are obviously equivalent to

∩_{t≥0} Ker B*e^{At} = {0}      (7-13)

and

∩_{t≥0} Ker Ce^{At} = {0}      (7-14)

respectively.
Let us choose a scalar measure σ such that M | σIₙ. Thus L²(M) can be unitarily embedded in L²(σIₙ), the embedding operator being U_M as defined by (6-15)–(6-16) of Chap. II. We furthermore define another pair of operators B′: ℂᵐ → L²(σIₙ) and C′*: ℂᵖ → L²(σIₙ) by

B′ = U_M B   and   C′* = U_M C*      (7-15)

The introduction of B′ and C′*, which are both multiplication operators, is made in order to remove redundancies in the definition of B and C. In particular, B′ and C′ will be zero on the complement of the support of M. We can now apply Theorem II 6-19 to obtain the following characterization of reachability and observability.
Theorem 7-1 Let (A, B, C) be a system with state space L²(M) where A, B, and C are defined by (7-2), (7-4), and (7-9), respectively. Let B′ and C′ be defined by (7-15) and let P be the measurable projection valued function corresponding to U_M L²(M). Then the system (A, B, C) is reachable if and only if

[B′, P⊥]_L = I      (7-16)

and observable if and only if

[C′, P⊥]_R = I      (7-17)
If we specialize this to the case of a matrix measure of scalar type, that is, to the case M = μIₙ, then obviously σ can be identified with μ. Since P is identically equal to I we have P⊥ = 0, and so [B′, P⊥]_L = I if and only if there exists no μ-nontrivial projection valued function R such that B = RB. This is obviously the case if and only if B has full row rank μ-a.e. Summarizing, we have obtained:

Corollary 7-2 Let M = μIₙ and let the system (A, B, C) be as in Theorem 7-1. Then (A, B, C) is reachable if and only if B has full row rank μ-a.e. and observable if and only if C has full column rank μ-a.e. That is,

rank B(λ) = n   μ-a.e.      (7-18)

and

rank C(λ) = n   μ-a.e.      (7-19)

respectively.
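Corollary 7-2 turns reachability into a pointwise linear-algebra test. The sketch below samples the rank condition on a grid, a numerical stand-in for the μ-a.e. statement; the matrix functions are invented purely for illustration:

```python
import numpy as np

def full_row_rank_on(B_of_lam, grid, n):
    # sampled version of "rank B(lambda) = n mu-a.e."
    return all(np.linalg.matrix_rank(B_of_lam(lam)) == n for lam in grid)

grid = np.linspace(-3.0, -0.1, 50)   # sample points of supp(mu) in (-inf, 0]

# rows proportional for every lambda: rank 1, so the system is not reachable
B_bad = lambda lam: np.array([[1.0, lam], [lam, lam ** 2]])
# unit lower triangular: rank 2 everywhere, so the reachability test passes
B_good = lambda lam: np.array([[1.0, 0.0], [lam, 1.0]])

assert not full_row_rank_on(B_bad, grid, 2)
assert full_row_rank_on(B_good, grid, 2)
```

A grid check of course cannot certify an a.e. statement; it only illustrates how the criterion is applied pointwise.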
From this corollary we can immediately deduce the general reachability conditions. To this end we utilize the canonical spectral representation of A. A is unitarily equivalent to the multiplication by λ operator in L²(N) where N is the diagonal matrix measure

N = diag(ν₁ + ⋯ + νₙ, ν₂ + ⋯ + νₙ, …, νₙ)      (7-20)

Let σ = ν₁ + ⋯ + νₙ, let nᵢ be the Radon–Nikodym derivative of νᵢ with respect to σ, and let Eᵢ = {λ | nᵢ(λ) ≠ 0}. Then σ-a.e. the measurable projection valued function P that satisfies U_N L²(N) = PL²(σIₙ) is given by

P(λ) = diag(χ_{E₁∪⋯∪Eₙ}(λ), χ_{E₂∪⋯∪Eₙ}(λ), …, χ_{Eₙ}(λ))      (7-21)
With this notation we can state the general reachability and observability conditions as follows.

Theorem 7-3 Let (A, B, C) be the system of Theorem 7-1. Then, with respect to the canonical spectral representation, the system (A, B, C) is reachable if and only if

rank B(λ) = k   ν_k-a.e.      (7-22)

for all k = 1, …, n. Analogously the system is observable if and only if

rank C(λ) = k   ν_k-a.e.      (7-23)

for all k = 1, …, n.

As is clear from the simple example at the end of Sec. 1, two systems with self-adjoint generators that realize the same transfer function need not be similar. Therefore, for isomorphism results we have either to strengthen the reachability (or alternatively the observability) conditions or to tighten the relation between the operators B and C. We begin by studying the first possibility, modeling our approach on the exact reachability conditions introduced previously. Let R be the reachability operator of the system (A, B, C) with state space L²(M), and let σ be a positive measure on (−∞, 0]. We say the system (A, B, C) is σ-exactly reachable if R can be extended by continuity to a bounded operator from L²(σ) onto L²(M); σ-exact observability is defined analogously. Note that the σ-exact reachability of the system (A, B, C) shows that R is a B-homomorphism.

Lemma 7-4 Let (A, B, C) be as in Theorem 7-1. If (A, B, C) is σ-exactly reachable and σ ≪ ρ, then (A, B, C) is also ρ-exactly reachable.
PROOF By Lemma II 6-14 there exists a B-homomorphism R̂ for which R = (U_M)*R̂. If σ ≪ ρ, then R̂ can be lifted to a B-homomorphism R̃ for which

R̂U_σ = U_σR̃      (7-24)

Thus (U_M)*R̃ provides the necessary bounded extension of R to a B-homomorphism of L²(ρ) onto L²(M).
To get a characterization of σ-exact reachability in terms of the matrix measure M, the measure σ, and the function B, we first show that L²(M) can be considered as a subspace of L²(σIₙ). Actually we prove a bit more, and the result may be of independent interest. It is the converse to Lemma II 6-2.

Theorem 7-5 Let M and N be two n × n positive matrix measures and let X: L²(M) → L²(N) be a B-homomorphism. Then
(a) If X is one to one then M | N.
(b) If X has dense range then N | M.
PROOF (a) By Theorem II 5-9 there exists an isometry U: L²(M) → L²(N), which can be checked to be a B-homomorphism. Let E be any Borel set with compact closure, χ_E its characteristic function and ξ ∈ ℂⁿ. Then χ_Eξ belongs to any L²(M) space. Since U is a B-homomorphism we have

(U(χ_Eξ))(λ) = χ_E(λ)(Uξ)(λ) = χ_E(λ)J(λ)ξ

for some measurable matrix function J. Since U is isometric we have

∫ (dM χ_E(λ)ξ, χ_E(λ)ξ) = ∫ (dN χ_E(λ)J(λ)ξ, χ_E(λ)J(λ)ξ)

or

∫_E (J(λ)* dN J(λ)ξ, ξ) = ∫_E (dM ξ, ξ)      (7-25)

Since (7-25) holds for arbitrary vectors ξ ∈ ℂⁿ and every Borel set E, then

dM = J* dN J      (7-26)

or M | N. Part (b) follows by duality.

Corollary 7-6 Let (A, B, C) be a system as in Theorem 7-1. If (A, B, C) is σ-exactly reachable then M | σIₙ.
An easy application of Theorem II 6-19 yields the characterization of σ-exact reachability and observability.

Theorem 7-7 Let (A, B, C) be a system in L²(M) where A, B, and C are defined by (7-2), (7-4), and (7-9), respectively, and let B′ and C′ be defined by (7-15). Then

(a) (A, B, C) is σ-exactly reachable if and only if M | σIₙ and

[B′, P⊥]_L = I      (7-27)

where P is the orthogonal projection valued function for which U_M L²(M) = PL²(σIₙ).

(b) (A, B, C) is σ-exactly observable if and only if M | σIₙ and

[C′, P⊥]_R = I      (7-28)

where P is as in part (a).

The concept of σ-exact reachability allows us to state and prove a theorem analogous to the isomorphism result of Theorem 1-9.

Theorem 7-8 Let (A, B, C) and (A₁, B₁, C₁) be two systems with finite multiplicity self-adjoint generators and assume both systems realize the same weighting pattern. If both systems are observable and σ-exactly reachable then they are isomorphic.
PROOF Let R, R₁ and O, O₁ be the respective reachability and observability operators of the two systems. Since both systems realize the same weighting pattern we have OR = O₁R₁, and by the assumption of observability Ker O = Ker O₁ = {0}. This implies that Ker R = Ker R₁ and hence also Ker R̄ = Ker R̄₁ for the extended operators. Define a map X: L²(M) → L²(M₁) by

XR̄ = R̄₁      (7-29)

As the pseudoinverse R̄† of R̄ is defined and bounded on all of L²(M) and as R̄₁ is onto L²(M₁), it follows that X = R̄₁R̄† and X is a B-homomorphism of L²(M) onto L²(M₁).

From the definition of X we have XB(λ) = B₁(λ), and XA = A₁X follows from the fact that X is a B-homomorphism and hence is given by a multiplication operator that intertwines scalar multiplication operators. Finally, since OR = O₁R₁, using (7-29) we have O₁R₁ = O₁XR = OR. As R has dense range we deduce that O₁X = O, which in turn implies C₁X = C.

We expect to obtain stronger statements by limiting the class of systems under consideration. Here we focus our attention on stable self-adjoint systems. Our first object is the characterization of the class of weighting patterns that are realizable by such systems. To this end we introduce complete monotonicity. A scalar function w defined on (0, ∞) is called completely monotonic if it is C^∞ and satisfies (−1)ⁿ w⁽ⁿ⁾(t) ≥ 0 for all n ≥ 0 and t > 0. This extends easily to a Hilbert space H operator valued function W: we say W is completely monotonic if the scalar function (W(t)x, x) is completely monotonic for every x ∈ H. Scalar completely monotonic functions have analytic extensions to the right half plane and so W has a weakly analytic extension. Since weak and uniform analyticity are equivalent [29], it follows that a completely monotonic operator valued function is actually infinitely differentiable in the uniform operator topology.
Theorem 7-9 A B(U)-valued weighting pattern W defined on [0, ∞) is realizable by a stable self-adjoint system if and only if it is completely monotonic.
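By Bernstein's theorem, any Laplace transform of a finite nonnegative measure on (−∞, 0] is completely monotonic. The scalar sketch below checks the alternating-sign condition for the hand-picked example W(t) = e^{−t} + 2e^{−3t}, whose representing measure puts mass 1 at λ = −1 and mass 2 at λ = −3:

```python
import numpy as np

def W_deriv(n, t):
    # n-th derivative of W(t) = exp(-t) + 2 exp(-3t), in closed form
    return (-1.0) ** n * (np.exp(-t) + 2.0 * 3.0 ** n * np.exp(-3.0 * t))

for n in range(8):
    for t in np.linspace(0.1, 5.0, 20):
        # complete monotonicity: (-1)^n W^(n)(t) >= 0
        assert (-1.0) ** n * W_deriv(n, t) > 0
```

Each differentiation pulls down a factor λ from e^{λt} with λ ≤ 0, which is exactly the mechanism behind the forward direction of the proof below.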
PROOF Suppose the stable self-adjoint system (A, B, B*) realizes W in the state space H. Let E be the spectral measure of A. Then, by the spectral theorem,

W(t) = B*e^{At}B = B* ∫_{−∞}^0 e^{λt}E(dλ) B = ∫_{−∞}^0 e^{λt} B*E(dλ)B

Hence

(−1)ⁿ W⁽ⁿ⁾(t) = ∫_{−∞}^0 (−λ)ⁿ e^{λt} B*E(dλ)B

or

(−1)ⁿ (W⁽ⁿ⁾(t)x, x) = ∫_{−∞}^0 (−λ)ⁿ e^{λt} (E(dλ)Bx, Bx) ≥ 0

since −λ ≥ 0 on (−∞, 0].
Conversely assume W is completely monotonic. We apply an integral representation theorem of Bernstein [120] to the effect that the class of completely monotonic functions is in one-to-one correspondence with the set of Laplace transforms of finite nonnegative measures supported on (−∞, 0]. Thus every completely monotonic function is uniquely representable in the form ∫_{−∞}^0 e^{λt} dμ. Since W is assumed completely monotonic, it follows that for every ξ ∈ U there exists a measure μ_ξ for which

(W(t)ξ, ξ) = ∫_{−∞}^0 e^{λt} dμ_ξ      (7-30)

By polarization we have for each ξ and η in U the existence of a unique finite complex Borel measure μ_{ξ,η} such that

(W(t)ξ, η) = ∫_{−∞}^0 e^{λt} dμ_{ξ,η}      (7-31)

By essentially the same method used in the proof of the spectral theorem there exists a complex operator valued measure M on U for which

(M(α)ξ, η) = μ_{ξ,η}(α)      (7-32)

for every Borel set α. Since μ_{ξ,ξ} is nonnegative for every ξ ∈ U, the measure M is also nonnegative. From (7-31) it follows that

W(t) = ∫_{−∞}^0 e^{λt} dM      (7-33)

To get a realization out of the above representation we assume, without loss of generality, as this can be achieved by rescaling, that M((−∞, 0]) ≤ I. Theorem II 10-21 guarantees the existence of a dilation space K ⊃ U and a spectral measure E in K such that for every Borel set α

M(α) = PE(α)|U      (7-34)

where P is the orthogonal projection of K onto U. In particular this implies the factorization

M(α) = B*E(α)B      (7-35)

where B is the injection of U into K and B*: K → U satisfies B*x = Px for every x ∈ K. Using the spectral measure E in K we define a self-adjoint operator A by

A = ∫ λE(dλ)      (7-36)

The system (A, B, B*) is a self-adjoint system and it realizes W as

B*e^{At}B = B* ∫ e^{λt}E(dλ) B = ∫ e^{λt} B*E(dλ)B = ∫ e^{λt} dM = W(t)
We did not have to assume the finite dimensionality of U. However, if U is finite dimensional it can be identified, through a choice of an orthonormal basis, with ℂⁿ. Thus M is in this case a nonnegative matrix measure and the dilation space can be identified with L²(M). The operator A is taken to be the multiplication by λ operator in L²(M), whereas B: U → L²(M) is the embedding which takes a vector ξ ∈ U into the constant function ξ̂, that is, ξ̂(λ) = ξ. The spectral measure E of the self-adjoint operator A is given by (E(α)f)(λ) = χ_α(λ)f(λ), where χ_α is the characteristic function of α. For every Borel set α we have

(B*E(α)Bξ, η) = (E(α)ξ̂, η̂) = ∫_α (dM ξ, η) = (M(α)ξ, η)

and so the factorization (7-35) follows. As a consequence of Theorem 7-3 this realization is canonical.
For self-adjoint systems the conditions for observability and reachability coincide. Previously we characterized reachability in terms of M and B. Since the spectral measure E determines A uniquely, we can obtain an equivalent characterization in these terms.

Theorem 7-10 Let (A, B, B*) be a self-adjoint system with state space X and let E be the spectral measure of A. Then the system is canonical if and only if

∩_{α∈Σ} Ker B*E(α) = {0}      (7-37)

the intersection being taken over the set Σ of all Borel subsets of (−∞, 0].

PROOF It suffices to show that ∩_{α∈Σ} Ker B*E(α) = ∩_{t≥0} Ker B*e^{At}. Assume x ∈ ∩_{α∈Σ} Ker B*E(α); then

B*e^{At}x = ∫_{−∞}^0 e^{λt} B*E(dλ)x = 0,   t ≥ 0

and hence the inclusion ∩_{α∈Σ} Ker B*E(α) ⊂ ∩_{t≥0} Ker B*e^{At}. To prove the converse inclusion let x ∈ ∩_{t≥0} Ker B*e^{At}; then ∫_{−∞}^0 e^{λt} B*E(dλ)x = 0 for all t ≥ 0, and the uniqueness part of Bernstein's theorem implies x ∈ ∩_{α∈Σ} Ker B*E(α). This proves the theorem.
Theorem 7-11 Let (A, B, B*) be a canonical self-adjoint realization of a transfer function Γ. Then the realization is spectrally minimal.

PROOF Since A is self-adjoint, A has the representation A = ∫ λE(dλ) with respect to a uniquely determined spectral measure E. The spectral measure E((a, b)) of an open interval (a, b) can be obtained from the resolvent function
LINEAR SYSTEMS AND OPERATORS IN HILBERT SPACE
R(λ; A) by

E((a, b)) = lim_{δ→0} lim_{ε→0} (1/2πi) ∫_{a+δ}^{b-δ} [R(λ - iε; A) - R(λ + iε; A)] dλ    (7-38)

Hence for every pair of vectors ξ, η ∈ U

(E((a, b))Bξ, Bη) = lim_{δ→0} lim_{ε→0} (1/2πi) ∫_{a+δ}^{b-δ} ([Γ(λ - iε) - Γ(λ + iε)]ξ, η) dλ    (7-39)

Suppose (a, b) is an open interval which is included in the domain of analyticity of Γ. Then, by Cauchy's theorem, the previous equality yields E((a, b))Bξ = 0. Since the semigroup e^{At} commutes with the spectral measure E it follows that

E((a, b)) e^{At}Bξ = e^{At}E((a, b))Bξ = 0

Reachability of the system is equivalent to the vectors of the form e^{At}Bξ spanning the state space H. Hence we can conclude that E((a, b)) = 0, that is, (a, b) ∩ σ(A) = ∅. Thus σ(A) ⊂ σ(Γ), and as the reverse inclusion always holds the proof is complete.
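The Stone formula (7-38) can be checked numerically for a symmetric matrix. In this sketch (our own illustration; the name stone_projection is not from the book) the double limit is imitated by fixed small δ and ε and a quadrature grid, so only a few digits of accuracy should be expected.

```python
import numpy as np

# Numerical check of the Stone formula (7-38) for a symmetric matrix: the
# spectral projection E((a, b)) is recovered from boundary values of the
# resolvent R(lambda; A) = (lambda I - A)^{-1}.

def stone_projection(A, a, b, delta=1e-3, eps=1e-2, npts=4001):
    I = np.eye(A.shape[0])
    grid = np.linspace(a + delta, b - delta, npts)
    acc = np.zeros_like(I, dtype=complex)
    for lam in grid:
        Rm = np.linalg.inv((lam - 1j * eps) * I - A)   # R(lambda - i eps; A)
        Rp = np.linalg.inv((lam + 1j * eps) * I - A)   # R(lambda + i eps; A)
        acc += Rm - Rp
    # Riemann sum of (1 / 2 pi i) * integral over (a + delta, b - delta).
    return ((grid[1] - grid[0]) / (2j * np.pi) * acc).real

A = np.diag([-1.0, -2.0, -3.0])
E = stone_projection(A, -2.5, -1.5)   # should project onto the -2 eigenspace
assert np.allclose(E, np.diag([0.0, 1.0, 0.0]), atol=0.05)
```

The integrand is a sum of Poisson kernels of width ε centered at the eigenvalues, so eigenvalues inside (a, b) contribute mass close to 1 and the others contribute O(ε).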
A direct consequence of the spectral minimality property for canonical self-adjoint systems is the fact that the spectra of the generators in two different canonical self-adjoint realizations of a given transfer function coincide. In fact an isomorphism result can be proved, and as a step in that direction we prove the following.
Lemma 7-12 Let (A, B, B*) and (A1, B1, B1*) be two canonical self-adjoint realizations with Γ and Γ1 their respective transfer functions and E and E1 the spectral measures of A and A1, respectively. Then the transfer functions Γ and Γ1 coincide if and only if for every Borel set σ on the real line we have

B*E(σ)B = B1*E1(σ)B1    (7-40)
PROOF If (7-40) holds then

Γ(z) = B*(zI - A)⁻¹B = ∫ (z - λ)⁻¹ B*E(dλ)B = ∫ (z - λ)⁻¹ B1*E1(dλ)B1 = B1*(zI - A1)⁻¹B1 = Γ1(z)

The converse follows from (7-39) for open intervals, and hence by standard measure theoretic techniques for all Borel sets.
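A finite-dimensional sketch of the lemma (our own illustration; the function names are not from the book): a realization (A, B) and a unitarily equivalent copy (QAQ^T, QB) share both their transfer function and the sandwiched spectral measure B*E(σ)B.

```python
import numpy as np

# Two canonical self-adjoint realizations share a transfer function exactly
# when the sandwiched spectral measures B* E(sigma) B agree (Lemma 7-12).

rng = np.random.default_rng(0)

A = np.diag([-1.0, -2.0, -3.0])
B = np.array([[1.0], [2.0], [-1.0]])

# A second realization, unitarily equivalent via a random orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A1, B1 = Q @ A @ Q.T, Q @ B

def transfer(A, B, z):
    """Gamma(z) = B* (zI - A)^{-1} B."""
    n = A.shape[0]
    return B.T @ np.linalg.solve(z * np.eye(n) - A, B)

def sandwiched_measure(A, B, a, b):
    """B* E((a, b)) B, with E the spectral measure of the symmetric A."""
    w, V = np.linalg.eigh(A)
    mask = (w > a) & (w < b)
    P = V[:, mask] @ V[:, mask].T      # projection onto eigenspaces in (a, b)
    return B.T @ P @ B

z = 0.5 + 1.0j
assert np.allclose(transfer(A, B, z), transfer(A1, B1, z))
assert np.allclose(sandwiched_measure(A, B, -2.5, -0.5),
                   sandwiched_measure(A1, B1, -2.5, -0.5))
```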
LINEAR SYSTEMS IN HILBERT SPACE
We can now state and prove the state space isomorphism theorem for self-adjoint systems.

Theorem 7-13 Let (A, B, B*) and (A1, B1, B1*) be two canonical self-adjoint systems in the Hilbert spaces H and H1, respectively. A necessary and sufficient condition that the two systems realize the same transfer function is that they be unitarily equivalent.
PROOF The sufficiency part is trivial. To prove the converse we note that every self-adjoint operator has a unitarily equivalent spectral representation. Thus there is no loss of generality in assuming that both operators are given in their canonical spectral representation. So by Theorem II 6-8 we can identify the state spaces of the two systems with ⊕_{j=1}^∞ L²(ν_j | j) and ⊕_{j=1}^∞ L²(ν_j^{(1)} | j), where the ν_j are mutually singular. The supports of the ν_j are the sets of multiplicity j. By Lemma 7-12 equality (7-40) holds for all Borel sets σ, which implies, integrating over the supports of ν_j and ν_j^{(1)}, that ν_j and ν_j^{(1)} are equivalent measures. It follows that dν_j = h_j dν_j^{(1)} for some measurable function h_j. Moreover for λ in E_j, the support of ν_j, we have

B(λ)*B(λ) = h_j(λ)² B1(λ)*B1(λ)    (7-41)

Define a map U_j(λ) on E_j by

U_j(λ)B(λ) = h_j(λ)B1(λ)    (7-42)

By reachability B and B1 have full rank a.e. with respect to ν_j (or ν_j^{(1)}); hence U_j(λ) is invertible a.e. and U_j is unitary. Let U: ⊕_{j=1}^∞ L²(ν_j | j) → ⊕_{j=1}^∞ L²(ν_j^{(1)} | j) be defined by U = ⊕_{j=1}^∞ U_j; then U is unitary. Moreover, U clearly commutes with multiplication by λ and thus intertwines A and A1. Condition (7-41) is equivalent to UB = B1, and so U provides the unitary equivalence.
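In the simplest matrix setting the intertwining unitary can be assembled explicitly. The following sketch (our own illustration, not the book's construction; it assumes A symmetric with simple, i.e. multiplicity-one, spectrum) builds U eigenvalue by eigenvalue, with each eigenvector's sign fixed by matching B against B1, the finite-dimensional shadow of defining U_j(λ) on the support E_j.

```python
import numpy as np

# Matrix sketch of Theorem 7-13 for a symmetric A with simple spectrum.

def intertwining_unitary(A, B, A1, B1):
    w, V = np.linalg.eigh(A)
    w1, V1 = np.linalg.eigh(A1)
    assert np.allclose(w, w1), "spectra must coincide (spectral minimality)"
    U = np.zeros_like(A)
    for k in range(len(w)):
        v, v1 = V[:, k], V1[:, k]
        # Reachability: B has a nonzero component in every eigenspace, so the
        # sign of v1 relative to v is determined by matching B against B1.
        s = np.sign((v1 @ B1.ravel()) * (v @ B.ravel()))
        U += np.outer(s * v1, v)       # U = sum_k (+/- v1_k) v_k^T
    return U

A = np.diag([-1.0, -2.0, -3.0])
B = np.array([[1.0], [2.0], [-1.0]])
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((3, 3)))
A1, B1 = Q @ A @ Q.T, Q @ B           # a second canonical realization

U = intertwining_unitary(A, B, A1, B1)
assert np.allclose(U @ U.T, np.eye(3))   # U is unitary (real orthogonal)
assert np.allclose(U @ A, A1 @ U)        # U intertwines A and A1
assert np.allclose(U @ B, B1)            # and carries B to B1
```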
NOTES AND REFERENCES

Almost simultaneously with the development of finite dimensional system theory attention has been directed to the study of infinite dimensional problems. Of the
early work in this direction one should mention Balakrishnan's [3, 4]. That modern operator theory is relevant to system theory has been recognized concurrently by several researchers, namely, Dewilde [22], Helton [69-71], Baras and Brockett [5, 6], and the author [42-50, 54]. Essentially it is this body of work that
motivated the book and constitutes the content of this chapter. It was Helton [69] who introduced the concept of exact reachability and proved the version of the state space isomorphism theorem appearing in Theorem 1-9. The realization procedure using shift operators has been constructed by the author in the scalar case [42] and by Helton in the general case using Hankel operators [69]. The results concerning restricted shift systems follow the author's [43-45]. Theorem 3-5 characterizing strictly noncyclic functions is motivated by and based on the fundamental paper of Douglas, Shapiro, and Shields [28], as well
as Kriete [79]. That Theorem II 14-11 can be applied to study Hankel operator ranges has been observed first by Clark in the scalar case [69] and worked out in detail by the author in [43, 44]. Spectral minimality and its importance in system theory have been stressed by Baras and Brockett [6]. The proof of the spectral minimality of the shift realization of strictly noncyclic functions is due to the author. Section 5 is based on several of the author's papers, namely [47-49] which deal with ranges of Hankel operators induced by sums and products and with series and parallel connection of systems. Some work on degree theory for infinite dimensional systems has been done also by Dewilde [22] who also recognized the role of the determinant of an inner function as a proper substitute for a degree function. The study of the infinite dimensional realization problem in continuous time began with Balakrishnan. The first rigorous work seems to be that of Baras and Brockett [6]. The idea of using rigged Hilbert spaces is due to Helton [71] who introduced also compatible systems. We follow Hedberg's [66] approach to the continuous time realization problem. This is done by associating with it a discrete time realization problem and appropriately transforming the solution. Such an association appears also in the study of controllability in [41, 43]. The restriction to transfer functions that are Fourier transforms of measures is done for technical reasons. There is no doubt, however, that the theory can be developed in much greater generality. Section 7 is based largely on [16, 54]; Theorem 7-3 is due to Fattorini [35].
Infinite dimensional systems with other symmetry constraints are discussed in [17, 36, 84].
REFERENCES

1. Ahern, P. R. and D. N. Clark: On functions orthogonal to invariant subspaces, Acta Math., 124, 191-204, 1970.
2. Akhiezer, N. I. and I. M. Glazman: "Theory of Linear Operators in Hilbert Space," F. Ungar, New York, 1961.
3. Balakrishnan, A. V.: Linear systems with infinite dimensional state spaces, Symposium on System Theory, Polytechnic Institute of Brooklyn, 1965.
4. Balakrishnan, A. V.: System theory and stochastic optimization, Proc. NATO Advanced Study Institute of Network and Signal Theory, Bournemouth, England, 1972.
5. Baras, J. S.: Algebraic structure of infinite dimensional linear systems in Hilbert space, Proc. of CNR-CISM Symposium, Advanced School on Algebraic System Theory, Udine, Springer, 1975.
6. Baras, J. S. and R. W. Brockett: H²-functions and infinite-dimensional realization theory, SIAM J. Control, 13, 221-241, 1975.
7. Baras, J. S., Brockett, R. W., and P. A. Fuhrmann: State space models for infinite-dimensional systems, IEEE Trans. Automatic Control, 19, 693-700, 1974.
8. Baras, J. S. and P. Dewilde: Invariant subspace methods in linear multivariable distributed systems and lumped distributed network synthesis, Proc. IEEE, 64, 160-178, 1976.
9. Barnett, S.: Matrices, polynomials and linear time invariant systems, IEEE Trans. Automatic Control, 18, 1-10, 1973.
10. Beals, R.: "Topics in Operator Theory," Univ. Chicago Press, Chicago and London, 1971.
11. Beurling, A.: On two problems concerning linear transformations in Hilbert space, Acta Math., 81, 239-255, 1949.
12. Birkhoff, G. and G. C. Rota: "Ordinary Differential Equations," Blaisdell, Waltham, Mass., 1962.
13. Bochner, S. and K. Chandrasekharan: "Fourier Transforms," Princeton Univ. Press, Princeton, N.J., 1949.
14. De Branges, L. and J. Rovnyak: The existence of invariant subspaces, Bull. Amer. Math. Soc., 70, 718-721, 1964; and 71, 396, 1965.
15. Brockett, R. W.: "Finite Dimensional Linear Systems," Wiley, New York, 1970.
16. Brockett, R. W. and P. A. Fuhrmann: Normal symmetric dynamical systems, SIAM J. Control, 14, 107-119, 1976.
17. Brodskii, M. S.: "Triangular and Jordan Representations of Linear Operators," Amer. Math. Soc., Providence, R.I., 1971.
18. Brown, A.: A version of multiplicity theory, in "Topics in Operator Theory" (ed. C. Pearcy), Amer. Math. Soc., Providence, R.I., 1974.
19. Callier, F. M. and C. D. Nahum: Necessary and sufficient conditions for the complete controllability and observability of systems in series using the coprime decomposition of a rational matrix, IEEE Trans. Circuits and Systems, 22, 90-95, 1975.
20. Carleson, L.: Interpolation by bounded analytic functions and the corona problem, Ann. of Math., 76, 547-559, 1962.
21. Cooper, J. L. B.: One parameter semi-groups of isometric operators in Hilbert space, Ann. of Math., 48, 827-842, 1947.
22. Dewilde, P.: Input-output description of roomy systems, SIAM J. Control, 14, 712-736, 1976.
23. Douglas, R. G.: On majorization, factorization and range inclusion of operators on Hilbert space, Proc. Amer. Math. Soc., 17, 413-415, 1966.
24. Douglas, R. G.: "Banach Algebra Techniques in Operator Theory," Academic Press, New York, 1972.
25. Douglas, R. G.: Canonical models, in "Topics in Operator Theory" (ed. C. Pearcy), Amer. Math. Soc., Providence, R.I., 1974.
26. Douglas, R. G. and J. W. Helton: Inner dilations of analytic matrix functions and Darlington synthesis, Acta Sci. Math., 34, 301-310, 1973.
27. Douglas, R. G., Muhly, P. S. and C. Pearcy: Lifting commuting operators, Mich. Math. J., 15, 385-395, 1968.
28. Douglas, R. G., Shapiro, H. S. and A. L. Shields: Cyclic vectors and invariant subspaces for the backward shift, Ann. Inst. Fourier, Grenoble, 20, 1, 37-76, 1971.
29. Dunford, N. and J. T. Schwartz: "Linear Operators," Vols. 1, 2, Interscience, New York, 1957, 1963.
30. Duren, P. L.: "Theory of H^p Spaces," Academic Press, New York, 1970.
31. Dym, H. and H. P. McKean: "Fourier Series and Integrals," Academic Press, New York, 1972.
32. Eckberg, A.: A characterization of linear systems via polynomial matrices and module theory, MIT Report ESL-R-528, 1974.
33. Eilenberg, S.: "Automata, Languages and Machines," Vol. A, Academic Press, New York, 1974.
34. Embry, M. R.: Factorization of operators on Banach space, Proc. Amer. Math. Soc., 38, 587-590, 1973.
35. Fattorini, H. O.: On complete controllability of linear systems, J. Diff. Eq., 391-402, 1967.
36. Feintuch, A.: Realization theory for symmetric systems, J. Math. Anal. Appl., 71, 131-146, 1979.
37. Fillmore, P. A.: "Notes on Operator Theory," Van Nostrand, New York, 1968.
38. Forney, G. D.: Minimal bases of rational vector spaces with applications to multivariable linear systems, SIAM J. Control, 13, 493-520, 1975.
39. Fuhrmann, P. A.: On the corona theorem and its applications to spectral problems in Hilbert space, Trans. Amer. Math. Soc., 132, 55-66, 1968.
40. Fuhrmann, P. A.: A functional calculus in Hilbert space based on operator valued analytic functions, Israel J. Math., 6, 267-278, 1968.
41. Fuhrmann, P. A.: On weak and strong reachability and controllability of infinite-dimensional linear systems, J. Opt. Th. Appl., 9, 77-87, 1972.
42. Fuhrmann, P. A.: On realizations of linear systems and applications to some questions of stability, Math. Syst. Th., 8, 132-141, 1974.
43. Fuhrmann, P. A.: Exact controllability and observability and realization theory, J. Math. Anal. Appl., 53, 377-392, 1976.
44. Fuhrmann, P. A.: Realization theory in Hilbert space for a class of transfer functions, J. Funct. Anal., 18, 338-349, 1975.
45. Fuhrmann, P. A.: On Hankel operator ranges, meromorphic pseudo-continuation and factorization of operator valued analytic functions, J. Lond. Math. Soc., (2) 13, 323-327, 1975.
46. Fuhrmann, P. A.: On controllability and observability of systems connected in parallel, IEEE Trans. Circuits and Systems, CAS-22, 57, 1975.
47. Fuhrmann, P. A.: On canonical realizations of sums and products of nonrational transfer functions, Proc. 8th Princeton Conf. on Information Sciences and Systems, 213-217, 1974.
48. Fuhrmann, P. A.: On generalized Hankel operators induced by sums and products, Israel J. Math., 21, 279-295, 1975.
49. Fuhrmann, P. A.: On series and parallel coupling of infinite dimensional linear systems, SIAM J. Control, 14, 339-358, 1976.
50. Fuhrmann, P. A.: Some results on controllability, Ricerche di Automatica, 5, 1-5, 1974.
51. Fuhrmann, P. A.: Algebraic system theory: an analyst's point of view, J. Franklin Inst., 301, 521-540, 1976.
52. Fuhrmann, P. A.: On strict system equivalence and similarity, Int. J. Control, 25, 5-10, 1977.
53. Fuhrmann, P. A.: Simulation of linear systems and factorization of matrix polynomials, Int. J. Control, 28, 689-705, 1978.
54. Fuhrmann, P. A.: Operator measures, self-adjoint operators and dynamical systems, SIAM J. Math. Anal., 11, 1980.
55. Gantmacher, F. R.: "The Theory of Matrices," Chelsea, New York, 1959.
56. Gohberg, I. C. and I. A. Feldman: "Convolution Equations and Projection Methods for Their Solution," Amer. Math. Soc., Providence, R.I., 1974.
57. Gohberg, I., Lancaster, P. and L. Rodman: Spectral analysis of matrix polynomials, I. Canonical forms and divisors, Linear Algebra Appl., 20, 1-44, 1977.
58. Gohberg, I., Lancaster, P. and L. Rodman: Spectral analysis of matrix polynomials, II. The resolvent form and spectral divisor, Linear Algebra Appl., 21, 65-88, 1978.
59. Gohberg, I., Lancaster, P. and L. Rodman: Representations and divisibility of operator polynomials, Canadian J. of Math., 30, 1045-1069, 1978.
60. Gohberg, I. C. and L. E. Lerer: Resultants of matrix polynomials.
61. Halmos, P. R.: Normal dilations and extensions of operators, Summa Brasil., 2, 125-134, 1950.
62. Halmos, P. R.: "Introduction to Hilbert Space and the Theory of Spectral Multiplicity," Chelsea, New York, 1951.
63. Halmos, P. R.: Shifts on Hilbert spaces, J. Reine Angew. Math., 208, 102-112, 1961.
64. Halmos, P. R.: "A Hilbert Space Problem Book," Van Nostrand, New York, 1967.
65. Halperin, I.: The unitary dilation of a contraction operator, Duke Math. J., 28, 563-571, 1961.
66. Hedberg, D. J.: "Operator Models of Infinite Dimensional Systems," Ph.D. Thesis, Dept. of System Science, UCLA, 1977.
67. Helson, H.: "Lectures on Invariant Subspaces," Academic Press, New York, 1964.
68. Helson, H. and D. Lowdenslager: Prediction theory and Fourier series in several variables, Acta Math., 99, 165-202, 1958; II, ibid., 106, 175-213, 1961.
69. Helton, J. W.: Discrete time systems, operator models and scattering theory, J. Funct. Anal., 16, 15-38, 1974.
70. Helton, J. W.: A spectral factorization approach to the distributed stable regulator problem: the algebraic Riccati equation, SIAM J. Control, 14, 639-661, 1976.
71. Helton, J. W.: Systems with infinite-dimensional state space: the Hilbert space approach, Proc. IEEE, 64, 145-160, 1976.
72. Hille, E. and R. S. Phillips: "Functional Analysis and Semigroups," Amer. Math. Soc., Providence, 1957.
73. Hoffman, K.: "Banach Spaces of Analytic Functions," Prentice Hall, Englewood Cliffs, N.J., 1962.
74. Jacobson, N.: Lectures in Abstract Algebra, Vol. 2 "Linear Algebra," Van Nostrand, Princeton, 1953.
75. Jacobson, N.: "Basic Algebra I," W. H. Freeman, San Francisco, 1974.
76. Kalman, R. E.: "Lectures on Controllability and Observability," CIME Summer Course 1968; Cremonese, Roma, 1969.
77. Kalman, R. E., Falb, P. L. and M. A. Arbib: "Topics in Mathematical System Theory," McGraw-Hill, New York, 1969.
78. Katznelson, Y.: "An Introduction to Harmonic Analysis," Wiley, New York, 1968.
79. Kriete, T. L.: A generalized Paley-Wiener theorem, J. Math. Anal. Appl., 36, 529-555, 1971.
80. Lang, S.: "Algebra," Addison Wesley, Reading, Mass., 1965.
81. Lax, P. D.: Translation invariant subspaces, Acta Math., 101, 163-178, 1959.
82. Lax, P. D. and R. S. Phillips: "Scattering Theory," Academic Press, New York, 1967.
83. Livsic, M. S.: On the spectral resolution of linear non-selfadjoint operators, Mat. Sb., 34 (76), 145-199, 1954.
84. Livsic, M. S.: "Operators, Oscillations, Waves. Open Systems," Amer. Math. Soc. Translations, 34, 1973.
85. Lorch, E. R.: "Spectral Theory," Oxford University Press, New York, 1962.
86. MacDuffee, C. C.: "The Theory of Matrices," Chelsea, New York, 1946.
87. MacDuffee, C. C.: Some applications of matrices in the theory of equations, Amer. Math. Monthly, 57, 154-161, 1950.
88. MacLane, S. and G. Birkhoff: "Algebra," Macmillan, New York, 1967.
89. Moeller, J. W.: On the spectra of some translation invariant subspaces, J. Math. Anal. Appl., 4, 276-296, 1962.
90. Moore, B. III: Canonical forms in linear systems, Proc. Eleventh Allerton Conference, University of Illinois, 36-44, 1973.
91. Moore, B. III and E. A. Nordgren: On quasi-equivalence and quasi-similarity, Acta Sci. Math., 34, 311-316, 1973.
92. Naimark, M. A.: Positive definite operator functions on a commutative group, Izvestija Akad. Nauk SSSR, 7, 237-244, 1943.
93. Naimark, M. A.: On a representation of additive operator set functions, Doklady Akad. Nauk SSSR, 41, 359-361, 1943.
94. Nelson, E.: "Topics in Dynamics I: Flows," Princeton Univ. Press, Princeton, N.J., 1968.
95. Nikolskii, N. K. and B. S. Pavlov: Eigenvector bases of completely nonunitary contractions and the characteristic function, Math. USSR-Izvestija, 4, 91-134, 1970.
96. Nirenberg, L.: "Functional Analysis," Lecture Notes of the CIMS, New York, 1961.
97. Nordgren, E. A.: On quasi-equivalence of matrices over H^∞, Acta Sci. Math., 34, 301-310, 1973.
98. Plessner, A. I.: "Spectral Theory of Linear Operators I, II," F. Ungar, New York, 1969.
99. Riesz, F. and B. Sz.-Nagy: "Functional Analysis," F. Ungar, New York, 1955.
100. Rosenbrock, H. H.: "State Space and Multivariable Theory," Wiley, New York, 1970.
101. Rota, G. C.: On models for linear operators, Comm. Pure and Appl. Math., 13, 469-472, 1960.
102. Rowe, A.: The generalized resultant matrix, J. Inst. Math. Appl., 9, 390-396, 1972.
103. Sarason, D.: A remark on the Volterra operator, J. Math. Anal. Appl., 12, 244-246, 1965.
104. Sarason, D.: Generalized interpolation in H^∞, Trans. Amer. Math. Soc., 127, 179-203, 1967.
105. Schaffer, J. J.: On unitary dilations of contractions, Proc. Amer. Math. Soc., 6, 322, 1955.
106. Schatten, R.: "Norm Ideals of Completely Continuous Operators," Springer, Berlin, 1960.
107. Shapiro, H. S. and A. L. Shields: On some interpolation problems for analytic functions, Amer. J. Math., 83, 513-532, 1961.
108. Sherman, M. J.: Operators and inner functions, Pacific J. Math., 22, 159-170, 1967.
109. Sontag, E. D.: Linear systems over commutative rings: a survey, Ricerche di Automatica, 7, 1-34, 1976.
110. Sontag, E. D.: On linear systems and noncommutative rings, Math. Sys. Th., 9, 327-344, 1976.
111. Stone, M. H.: Linear transformations in Hilbert space and their applications to analysis, Amer. Math. Soc. Colloquium Pub., 15, 1932.
112. Sz.-Nagy, B.: Sur les contractions de l'espace de Hilbert, Acta Sci. Math., 15, 87-92, 1953.
113. Sz.-Nagy, B.: Isometric flows in Hilbert space, Proc. Cambridge Phil. Soc., 60, 45-49, 1964.
114. Sz.-Nagy, B.: Diagonalization of matrices over H^∞, Acta Sci. Math., 38, 223-238, 1976.
115. Sz.-Nagy, B. and C. Foias: "Harmonic Analysis of Operators on Hilbert Space," North Holland, Amsterdam, 1970.
116. Sz.-Nagy, B. and C. Foias: Operateurs sans multiplicite, Acta Sci. Math., 30, 1-18, 1969.
117. Sz.-Nagy, B. and C. Foias: Modele de Jordan pour une classe d'operateurs de l'espace de Hilbert, Acta Sci. Math., 31, 91-117, 1970.
118. Sz.-Nagy, B. and C. Foias: On the structure of intertwining operators, Acta Sci. Math., 35, 225-254, 1973.
119. Van der Waerden, B. L.: "Modern Algebra," F. Ungar, New York, 1949.
120. Widder, D. V.: "The Laplace Transform," Princeton Univ. Press, Princeton, N.J., 1946.
121. Wiener, N. and P. Masani: The prediction theory of multivariate stochastic processes, I, Acta Math., 98, 111-150, 1957; II, ibid., 99, 93-139, 1958.
122. Wold, H.: "A Study in the Analysis of Stationary Time Series," Almqvist and Wiksell, Stockholm, 1938.
123. Wolovich, W. A.: "Linear Multivariable Systems," Springer Verlag, New York, 1974.
124. Wonham, W. M.: "Linear Multivariable Control," Springer Verlag, Berlin, 1974.
125. Yosida, K.: "Functional Analysis," Springer Verlag, Berlin, 1968.
126. Zadeh, L. A. and C. A. Desoer: "Linear System Theory," McGraw-Hill, New York, 1963.
INDEX
a-bounded operator, 291 accretive operators, 82 adjoint, 70 adjoint system, 250 angle between subspaces, 228 approximate identity, 84 backward shift, 127 balanced realization, 291 basis, 5 Besselian basis, 227 Bessel's inequality, 65 bilateral shift, 126 Blaschke product, 173 bounded basis, 227 bounded operator, 70 Brunovsky canonical form, 60 canonical models, 18 canonical realization, 31 canonical spectral representation, 113 Carleson sequence, 226 Cayley transform, 80 closed operator, 77 closed set, 68 codimension, 69 coherent set of isometries, 108 coisometry, 72 companion matrix, 26 compatible system, 296 completely monotonic function, 313 completely nonunitary contraction, 129 completely nonunitary semigroup, 156 completeness, 64
compound matrices, 11 compression, 132 conjugation, 78 continuous observability, 244 continuous spectrum, 74 continuous time linear system, 296 contraction, 70 control representation, 53 convex set, 66 coprime factorization, 38 cyclic module, 5 cyclic operator, 26, 221 cyclic vector, 26, 221
defect numbers, 130 defect operators, 130 defect spaces, 130 degree, 272 determinant divisors, 12 dilation, 131 dimension, 69 discrete time linear system, 28 dissipative operator, 79 division of measures, 108 division of rational matrices, 34 elementary matrices, 13 elementary operations, 14 embedding, 108 entire ring, 2 equivalence, 7 equivalence of measures, 108 exact observability, 244 exact reachability, 244
exact sequence, 4 exponential order, 291 extended causal input/output map, 242 external description, 29 feedback, 50
feedback equivalence, 51 feedback group, 50 feedback law, 50 field, 2 final space, 73
finite multiplicity, 105 finitely generated, 5 first canonical form, 27 Fourier transform, 150 Fourier-Plancherel transform, 152 free module, 5 full ideal, 9 full submodule, 8 functional calculus, 76 generalized Fourier coefficients, 68 generating function, 176 generators, 5 greatest common divisor, 2 greatest common left inner divisor, 188
Hankel matrix, 33 harmonic function, 83 Hermitian form, 63 Hilbertian basis, 227 Hilbert space, 64 ideal, 3 impulse response, 243 infinitesimal cogenerator, 145 infinitesimal generator, 140 initial space, 73
inner function, 170 inner product space, 63 input/output map, 29 input space, 28 internal description, 29 interpolating sequence, 227 intertwining map, 19 invariant factors, 6, 14 invariant subspace, 71, 114 invariant subspace of full range, 186 invertible operator, 71 isometric embedding, 108 isometric isomorphism, 72 isometry, 72
Jordan model, 220 Jordan operator, 214
Laguerre functions, 155 least common left multiple, 2 left associate, 2 left coprime, 2 left divisor, 2 left ideal, 3 left inner factor, 187 left module, 4 left multiple, 2 left shift, 127 left translation group, 148 linear functional, 67 linear independence, 5 linear manifold, 64
matrix measure, 106 maximal orthonormal set, 68 McMillan degree, 43 meromorphic function of bounded type, 254 meromorphic pseudocontinuation, 253 minimal closed extension, 78 minimal coisometric dilation, 133 minimal inner function, 188, 221, 261 minimal isometric dilation, 133 minimal set of generators, 106 module, 4 multiplicity, 106, 113, 127
Naimark's theorem, 137 noncyclic function, 253 observability map, 31, 243 observable, 31, 244 ordered spectral representation, 113 orthogonality, 63 orthogonal complement, 67 orthogonal projection, 71 orthogonal set, 63 orthonormal basis, 63 orthonormal set, 63 outer function, 170 outgoing subspace, 129 output space, 28
parallel connection, 280 partial isometry, 72 p-interpolating sequence, 226 Plancherel identity, 152 point spectrum, 74 Poisson kernel, 83 polar decomposition, 103 polynomial system matrix, 44 positive definite function, 137 positive definite sequence, 87 positive functional, 89 principal ideal, 3
principal ideal domain, 3 projection, 71 proper rational, 6 pseudoinverse, 72 Pythagorean theorem, 64 quasiaffinity, 72 quasiequivalence, 210 quasiinvertible transformation, 72, 246 quasisimilarity of systems, 246
rational, 6 reachability indices, 61 reachability map, 31, 243 reachable, 31, 243 realization, 30, 32 reducing subspace, 71 regular realization, 290 residual spectrum, 74 resolvent function, 74, 77 resolvent set, 74, 77 restricted input/output map, 30 restricted shift, 191 restricted shift system, 251
restricted translation realization, 303 result, 29 Riesz representation theorem, 67 right ideal, 3 right invertible, 71 right multiple, 2 right shift, 127 right translation group, 148 right translation invariant subspace, 182 rigid function, 186 ring, 2 ring homomorphism, 2 scalar multiple, 210 scalar type measure, 114 Schaffer matrix, 134 self adjoint, 70, 79 self adjoint system, 307 semigroup, 140 series connection, 279
σ-equivalence, 109
σ-exact observability, 311 σ-exact reachability, 311 σ-left coprimeness, 119 σ-right coprimeness, 119 similarity of systems, 246 simulation, 34 singular inner function, 173 Smith canonical form, 14 spectral inclusion property, 260 spectral measure, 94 spectral minimality, 260 spectral representation, 105 spectral theorem, 98 spectrum, 74 stable self adjoint system, 307 standard controllable realization, 43 standard observable realization, 42 state space, 28 state space isomorphism theorem, 37 strict causality, 242 strictly noncyclic function, 253 strict system equivalence, 46 strong coprimeness, 173 strong dilation, 132 strong reachability, 244 strong σ-left coprimeness, 119 subspace, 64 summability kernel, 84 symmetric operator, 79
set of generators, 105
shift realization, 249 short exact sequence, 4
Toeplitz operator, 199 torsion element, 5 torsion module, 5 transfer function, 30, 290 translation group, 148 uniformly separated sequence, 226 unimodular, 7 unit, 2 unitary equivalence, 72 unitary operator, 72 unitary representation, 137 unitary σ-equivalence, 109
wandering subspace, 127 weighting pattern, 243
Other McGraw-Hill titles: ADVANCED MATHEMATICAL METHODS FOR SCIENTISTS AND ENGINEERS
Carl M. Bender, Washington University Steven A. Orszag, Massachusetts Institute of Technology 640 pages
This outstanding new book presents and explains mathematical methods for obtaining approximate analytical solutions to differential and difference equations that cannot be solved exactly. The authors do not dwell on equations whose exact solutions are well tabulated. Rather, they aim to help scientists and engineers to develop the skills necessary to analyze equations that they encounter in their work. The text develops the insights and methods that are most useful for attacking a new problem. Briefly, this work emphasizes applications, avoiding long and boring expositions of intuitive concepts whose essence can be stated in a few lines. Computer plots are provided that compare exact and approximate answers, justifying the results of the text's analyses. MODERN METHODS IN PARTIAL DIFFERENTIAL EQUATIONS
An Introduction Martin Schechter, Yeshiva University, New York 296 pages
This is the first monograph to present recent important accomplishments in the theory of Linear Partial Differential Equations without requiring extensive background of the reader. With a unified theme in method and approach throughout the book, even undergraduates and beginning graduate students will be able to understand this material. Only advanced calculus and rudimentary complex function theory are required of the reader. It is an up-to-date treatment of an advanced topic written by a first-rate research mathematician. The purpose of this monograph is to make accessible to students and researchers discoveries of the past 30 years (mostly of researchers other than the author) and to present this modern approach in a simplified and coherent manner.
Albert Wilansky, Lehigh University, Bethlehem, Pennsylvania 304 pages This book develops the theory and then, by means of examples and problems, carefully illustrates the necessity of various assumptions, raising and answering natural questions. Results are classified by means of unifying principles called programs (e.g. the equivalence program) which have never before been made so explicit. The core of the book is the introduction of duality, Chapter 8, followed by a careful study, emphasizing dual pairs, of the important topologies and equicontinuity and reflexivity, leading to a climax in Chapter 12 in which such topics as full completeness, open and closed graph theorems, completion, and the Grothendieck interchange are presented in detail. The study of operators appears in Chapters 10 and 12. Chapters 13-15 apply these results in a detailed study of inductive limits, function spaces, barrelled spaces, and the separable quotient problem for Banach spaces.
McGraw-Hill International Book Company Serving the Need for Knowledge
0-07-022589-3