LONDON MATHEMATICAL SOCIETY STUDENT TEXTS
Managing editor: Professor E.B. Davies, Department of Mathematics, King's College, Strand, London WC2R 2LS
1 Introduction to combinators and λ-calculus, J.R. HINDLEY & J.P. SELDIN
2 Building models by games, WILFRID HODGES
3 Local fields, J.W.S. CASSELS
4 An introduction to twistor theory, S.A. HUGGETT & K.P. TOD
5 Introduction to general relativity, L. HUGHSTON & K.P. TOD
6 Lectures on stochastic analysis: diffusion theory, DANIEL W. STROOCK
7 The theory of evolution and dynamical systems, J. HOFBAUER & K. SIGMUND
8 Summing and nuclear norms in Banach space theory, G.J.O. JAMESON
9 Automorphisms of surfaces after Nielsen and Thurston, A. CASSON & S. BLEILER
10 Non-standard analysis and its applications, N. CUTLAND (ed)
11 The geometry of spacetime, G. NABER
12 Undergraduate algebraic geometry, MILES REID
13 An Introduction to Hankel Operators, J.R. PARTINGTON
London Mathematical Society Student Texts. 13
An Introduction to Hankel Operators JONATHAN R. PARTINGTON Fellow and Director of Studies in Mathematics, Fitzwilliam College Cambridge
The right of the University of Cambridge to print and sell all manner of books was granted by Henry VIII in 1534. The University has printed and published continuously since 1584.
CAMBRIDGE UNIVERSITY PRESS Cambridge
New York New Rochelle Melbourne Sydney
CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, Sao Paulo
Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521366113
© Cambridge University Press 1988
This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 1988 Re-issued in this digitally printed version 2007
A catalogue record for this publication is available from the British Library ISBN 978-0-521-36611-3 hardback ISBN 978-0-521-36791-2 paperback
To my mother, and in memory of my father
CONTENTS
0 Introduction
1 Compact Operators on a Hilbert Space
2 Hardy Spaces
3 Basic Properties of Hankel Operators
4 Hankel Operators on the Half Plane
5 Linear Systems and $H_\infty$
6 Hankel-norm Approximation
7 Special Classes of Hankel Operator
Appendix
Exercises
Bibliography
Index
0. INTRODUCTION

In Riemann, Hilbert or in Banach space
Let superscripts and subscripts go their ways.
Our asymptotes no longer out of phase,
We shall encounter, counting, face to face.
Stanislaw Lem (The Cyberiad)

We apologise for the fact that in the title of the Tensors talk in the last newsletter, the words "theoretical physics" came out as "impossible ideas".
Archimedeans' Newsletter, January 1986.

Many have been led astray by their speculations,
And false conjectures have impaired their judgement.
Ecclesiasticus 3, 24.
A Hankel matrix is one of the form

$$\begin{pmatrix} a_0 & a_1 & a_2 & \cdots \\ a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$

that is, a matrix $(c_{ij})$, $i, j = 0, 1, 2, \ldots$, where $c_{ij}$ depends only on $i+j$, so can be written $c_{ij} = a_{i+j}$, for some sequence $a_0, a_1, a_2, \ldots$

Under suitable conditions such a matrix gives rise in a natural way to a linear map (an operator) $\Gamma$ on the Hilbert space $\ell_2$ of square summable sequences, and we have that

$$(\Gamma x)_i = \sum_{j=0}^{\infty} a_{i+j} x_j, \quad \text{for } x = (x_0, x_1, x_2, \ldots) \in \ell_2.$$

$\Gamma$ is a Hankel operator. Similarly, a Hankel integral operator on $L_2(0, \infty)$ has the representation

$$(\Gamma x)(t) = \int_0^{\infty} h(t + s)\, x(s)\, ds,$$

so that the kernel, $h(t + s)$, depends on the sum of the two variables involved.
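As a concrete illustration of the rule "entry (i, j) depends only on i + j", the following Python sketch builds a finite n-by-n truncation of a Hankel matrix and applies it to a vector. The function names `hankel_matrix` and `hankel_apply`, the truncation to a finite block, and the sample sequence $a_k = 2^{-k}$ are assumptions made for this example; they do not come from the text.

```python
def hankel_matrix(a, n):
    """Return the n-by-n truncation c[i][j] = a[i + j] of a Hankel matrix.

    The sequence `a` must supply the entries a[0], ..., a[2n - 2].
    """
    if len(a) < 2 * n - 1:
        raise ValueError("need at least 2n - 1 terms of the sequence")
    return [[a[i + j] for j in range(n)] for i in range(n)]


def hankel_apply(a, x):
    """Compute y[i] = sum_j a[i + j] * x[j] for the truncated operator."""
    n = len(x)
    matrix = hankel_matrix(a, n)
    return [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]


# Illustrative data (an assumption for this sketch): a_k = 2^(-k).
a = [1.0 / 2 ** k for k in range(5)]
x = [1.0, 0.0, 2.0]
y = hankel_apply(a, x)
print(y)
```

Each anti-diagonal of the truncated matrix is constant, which is exactly the Hankel property; the infinite operator on square summable sequences is the limit of such truncations only under the "suitable conditions" mentioned above.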
As we shall see in more detail later, $\ell_2$ is isomorphic to the Hardy space $H_2$ of analytic functions on the unit disc $\{|z| < 1\}$: this is the space of all functions $f(z) = \sum_{n=0}^{\infty} a_n z^n$ with norm

$$\|f\|^2 = \sum_{n=0}^{\infty} |a_n|^2 < \infty.$$

We thus have a connection between Hankel operators and complex variable theory, which turns out to be very important. Similarly, $L_2(0, \infty)$ is easily related to another Hardy space, this time of functions defined on the right half plane $\mathbb{C}_+$, using the Laplace transform.
Hankel operators have in recent years been shown to have widespread applications to both Systems Theory and Approximation Theory: we explore these here.
In Chapter 1 we start with some general operator theory. Compact operators on Hilbert spaces can be written in the form

$$\Gamma x = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i,$$

with $\sigma_1 \ge \sigma_2 \ge \ldots \ge 0$, and $(v_i)$ and $(w_i)$ orthonormal sequences in the given Hilbert space.

The $\sigma_i$ are called singular values (approximation numbers, generalised eigenvalues) and have many important properties. For example we can consider what it means to say that $\sum \sigma_i < \infty$, or that $\sum \sigma_i^2 < \infty$ (nuclear operators and Hilbert-Schmidt operators).
Hardy spaces are introduced in Chapter 2. For the applications to Hankel operators we are only concerned with $H_2$, $H_\infty$ and (occasionally) $H_1$, and we give a more elementary discussion than is customary (for example we are able to avoid the use of maximal functions entirely). We also treat Hardy spaces on $\mathbb{C}_+$ by considering their equivalence with Hardy spaces on the disc.
Having established the background we are able to introduce Hankel operators in Chapter 3.
Nehari's Theorem and the Carathéodory-Fejér and Nevanlinna-Pick problems are treated. In addition we establish Hartman's theorem on compact Hankel operators.
Hankel integral operators on $L_2(0, \infty)$ and their equivalent forms on $H_2(\mathbb{C}_+)$ are discussed in Chapter 4. Most results here are obtained using equivalences with Hankel operators on the disc.

An elementary treatment of linear systems and $H_\infty$ is presented in Chapter 5. Some infinite-dimensional systems (where the associated Hankel operator is of infinite rank) are discussed. Here we give the physical motivation for Model Reduction - approximation by simpler functions in suitable norms.

In Chapter 6 we present Beurling's Theorem and the Adamjan-Arov-Krein results on Hankel-norm approximation. Here we follow Power's simplified treatment, giving additional proofs, examples and explanations of this rather deep problem.

The final chapter connects the general operator theory of the first chapter with the Hardy space theory. Various results on Hilbert-Schmidt and nuclear Hankel operators are presented, culminating in the recent results of Peller, Coifman and Rochberg, Bonsall and Walsh, and including various inequalities which give $L_1$ and $H_\infty$ error bounds for model reduction.
We conclude with an appendix covering various background results in functional analysis which may be unfamiliar to some readers. These include standard results from Operator Theory and Measure Theory, and we give them in their simplest form.
These notes are based on those for a Part III Mathematics course on Hankel operators given to an audience of Mathematicians and Engineers at Cambridge University in the Michaelmas Term, 1987. I am grateful to Dr B. Bollobás, Dr T.K. Carne, Dr K. Glover, Dr T.W. Körner, Mr D.C. McFarlane and Dr R. Ober for useful discussions and comments; also to the Departments of Engineering and of Pure Mathematics and Mathematical Statistics of Cambridge University, to Fitzwilliam College, Cambridge, and to the Science and Engineering Research Council for their assistance.
1. COMPACT OPERATORS ON A HILBERT SPACE
In this first chapter, we begin by considering linear operators in general - we specialise to Hankel operators in Chapter 3. Although it is possible to discuss operators defined on a general normed space, we shall not do so, but just consider linear operators defined on a complete inner-product space, a Hilbert space. The properties in which we are interested are of greatest importance when the operator is compact, that is, close to being a finite-rank operator (a formal definition will be given later).

For compact operators which are also Hermitian there is the Spectral Theorem, which shows how the action of the operator is fully determined by its eigenvalues and eigenvectors. From this we move to the Schmidt expansion of a general compact operator, and come naturally to the definition of the approximation numbers (singular values) of a compact operator (to be denoted $(\sigma_i)$).

A brief discussion of the polar decomposition follows: this enables us to refer to the modulus of an operator, itself an operator with several useful properties.

We spend the remainder of the chapter in considering operators of the class $C_p$ ($1 \le p < \infty$), that is, with $\sum_{i=1}^{\infty} \sigma_i^p < \infty$. These form normed spaces, of which the spaces $C_1$ of nuclear operators, and $C_2$ of Hilbert-Schmidt operators are by far the most important. We establish the fact that these are indeed normed spaces (in fact $C_2$ is an inner-product space), discussing the trace function en route.
The results of this chapter are mostly standard and can be found, in various forms, in the cited books of Dunford and Schwartz, Gohberg and Krein, and Schatten.
All operators will be assumed to be continuous unless otherwise specified, and the underlying Hilbert spaces will be complex.
For $A: H \to H$ linear, we recall that its norm is given by $\|A\| = \sup\{\|Ax\|/\|x\| : x \ne 0\}$. Also the adjoint $A^*$ of $A$ satisfies $(A^*x, y) = (x, Ay)$ for all $x$ and $y$, and $A^{**} = A$. $A$ is a compact operator if the closure of $A(U)$ is compact, where $U$ denotes the unit ball $\{\|x\| \le 1\}$ of $H$. An equivalent, simpler condition for compactness for operators on a Hilbert space is that $A$ is compact if and only if there is a sequence $(A_n)$ of finite rank operators (i.e. $A_n(H)$ is a finite-dimensional subspace), such that $\|A_n - A\| \to 0$. Equivalently again, $A$ is compact if and only if, given any bounded sequence $(x_n)$, the sequence $(Ax_n)$ has a convergent subsequence.
If A is a compact operator, then so are A* and any scalar multiple of A. The sum of two compact operators is compact, and if A is compact and B is bounded then both AB and BA are compact.
We recall the Spectral Theorem for Compact Hermitian Operators on a Hilbert Space (see the Appendix for more details):
Proposition 1.1 $A$ is a compact Hermitian operator if and only if there exists a sequence $(\lambda_n)$ of real numbers which tends to zero, and an orthonormal basis $(x_n)$ of $H$, such that

$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, x_n) x_n.$$

The $(x_n)$ are eigenvectors of $A$ and the $(\lambda_n)$ eigenvalues. With respect to this orthonormal basis, $A$ thus has a diagonal matrix.
We observe that there will always be a certain non-uniqueness in this representation: the (xn)
can always be multiplied by scalars of modulus one, and, in the case when we have repeated eigenvalues (eigenspaces of (finite) dimension greater than 1) we can choose an orthonormal basis for such an eigenspace arbitrarily.
We note that the operators $A_m$ ($m = 1, 2, \ldots$) defined by

$$A_m x = \sum_{n=1}^{m} \lambda_n (x, x_n) x_n$$

satisfy $\mathrm{rank}(A_m) < \infty$ and $\|A_m - A\| \to 0$, so that $A$ is indeed the norm limit of finite rank operators.

We say that $A$ is positive, and write $A \ge 0$, if $A$ is compact Hermitian with positive (that is, non-negative) eigenvalues. It then has a unique positive square root, $A^{1/2}$, defined by

$$A^{1/2} x = \sum_{n=1}^{\infty} \lambda_n^{1/2} (x, x_n) x_n,$$

which satisfies $A^{1/2} A^{1/2} = A$. For the uniqueness, we note that if $B \ge 0$ and $B^2 = A$, then the eigenvectors of $B$ are eigenvectors of $A$, and thus if $Bx = \sum_{n=1}^{\infty} \mu_n (x, y_n) y_n$, then

$$Ax = \sum_{n=1}^{\infty} \mu_n^2 (x, y_n) y_n.$$

Now, looking at the eigenspaces of $A$, we obtain the uniqueness of the square root.
Theorem 1.2 (Schmidt expansion of a compact operator) An operator $T$ is compact if and only if there exist orthonormal sequences $(v_i)$, $(w_i)$, $i \ge 1$, and scalars $\sigma_i$ decreasing to 0, such that

$$Tx = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i.$$

Proof 'If': Writing $T_m x = \sum_{i=1}^{m} \sigma_i (x, v_i) w_i$, we have $\mathrm{rank}(T_m) \le m$ and $\|T_m - T\| = \sigma_{m+1} \to 0$.

'Only if': Since $(T^*Tx, x) = (Tx, Tx) \ge 0$ and $(T^*Tx, y) = (x, T^*Ty)$, we have $T^*T \ge 0$. Let $\lambda_1, \lambda_2, \ldots$ be the nonzero eigenvalues of $T^*T$, ordered in decreasing size, $v_1, v_2, \ldots$ the corresponding eigenvectors (orthonormal), and $\sigma_i = \lambda_i^{1/2}$. Now write $w_i = Tv_i/\sigma_i$. We thus have

$$(w_i, w_j) = (Tv_i, Tv_j)/\sigma_i\sigma_j = (T^*Tv_i, v_j)/\sigma_i\sigma_j = \sigma_i (v_i, v_j)/\sigma_j = \delta_{ij},$$

i.e. the $(w_i)$ are orthonormal.

Note that $T^*Tx = 0$ if and only if $Tx = 0$, so that $Tx = \sum_{i=1}^{\infty} (x, v_i) Tv_i = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i$. Also $Tv_i = \sigma_i w_i$ and $T^*w_i = \sigma_i v_i$; $TT^*w_i = \sigma_i^2 w_i = \lambda_i w_i$ and $T^*x = \sum_{i=1}^{\infty} \sigma_i (x, w_i) v_i$.

The numbers $(\sigma_i)$ are called singular values (sometimes approximation numbers, s-numbers or generalised eigenvalues).
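The construction in the proof can be followed by hand for a small matrix. The Python sketch below is a finite-dimensional illustration only: it computes the eigenvalues and eigenvectors of $T^*T$ for a real invertible 2-by-2 matrix in closed form, sets $w_i = Tv_i/\sigma_i$, and then evaluates the Schmidt expansion. The function names `schmidt_2x2` and `apply_expansion` and the sample matrix are assumptions made for this example.

```python
import math


def schmidt_2x2(A):
    """Return [(sigma, v, w), ...] for a real invertible 2x2 matrix A.

    Follows the proof: diagonalise M = A^T A, put sigma = sqrt(lambda)
    and w = A v / sigma. Invertibility guarantees sigma > 0.
    """
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d  # M = [[p, q], [q, r]]
    mean = (p + r) / 2
    disc = math.sqrt(((p - r) / 2) ** 2 + q * q)
    if abs(q) < 1e-14:
        # M is already diagonal: the standard basis vectors are eigenvectors.
        pairs = sorted([(p, (1.0, 0.0)), (r, (0.0, 1.0))], reverse=True)
    else:
        # (q, lam - p) is an eigenvector of M for the eigenvalue lam.
        pairs = [(lam, (q, lam - p)) for lam in (mean + disc, mean - disc)]
    terms = []
    for lam, (v0, v1) in pairs:
        length = math.hypot(v0, v1)
        v = (v0 / length, v1 / length)
        sigma = math.sqrt(lam)
        w = ((a * v[0] + b * v[1]) / sigma, (c * v[0] + d * v[1]) / sigma)
        terms.append((sigma, v, w))
    return terms


def apply_expansion(terms, x):
    """Evaluate sum_i sigma_i (x, v_i) w_i."""
    y = [0.0, 0.0]
    for sigma, v, w in terms:
        coeff = sigma * (x[0] * v[0] + x[1] * v[1])
        y[0] += coeff * w[0]
        y[1] += coeff * w[1]
    return y


A = [[3.0, 0.0], [4.0, 5.0]]          # sample matrix (an assumption)
terms = schmidt_2x2(A)
sigmas = [sigma for sigma, _, _ in terms]
y = apply_expansion(terms, [1.0, -2.0])  # should agree with A applied to (1, -2)
print(sigmas, y)
```

For this matrix $T^*T$ has eigenvalues 45 and 5, so the singular values are $\sqrt{45}$ and $\sqrt{5}$, and the expansion reproduces $A(1, -2)^T = (3, -6)^T$ up to rounding.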
Corollary 1.3 If $A$ is an $m$-by-$m$ matrix, we can find unitary matrices $U$ and $V$ and a positive semi-definite diagonal matrix $D$ such that $A = UDV$.

Proof $A$ corresponds to a finite rank operator $T: \mathbb{C}^m \to \mathbb{C}^m$. With respect to the orthonormal bases $(v_i)$ and $(w_i)$ (extended if necessary by adding vectors from the kernels of $T$ and $T^*$), $T$ has the diagonal matrix $D$. Changing back to the standard orthonormal basis transforms $D$ into $UDV$, where $U$ and $V$ are unitary matrices.
We can interpret $\sigma_1(T)$ as $\|T\|$. More generally we have the following result.

Theorem 1.4 For $n \ge 1$,

$$\sigma_n(T) = \inf\{\|T - S\| : \mathrm{rank}(S) < n\}.$$

The infimum is actually attained.

Proof We may assume without loss of generality that $n$ is at least 2, since for $n = 1$, $S = 0$ will do. Clearly, taking

$$Sx = \sum_{i=1}^{n-1} \sigma_i (x, v_i) w_i,$$

we have $\mathrm{rank}(S) < n$ and

$$(T - S)(x) = \sum_{i=n}^{\infty} \sigma_i (x, v_i) w_i,$$

and so $\|T - S\| = \sigma_n$.

Suppose now that $R$ is any operator of rank $k$, say, with $k < n$, and consider $L$, the linear span of the vectors $v_1, \ldots, v_{k+1}$. Since $\dim(L) > \mathrm{rank}(R)$, we see that the restriction $R: L \to \mathrm{Im}\, R$ is not injective and there exists a vector $x$ of norm 1 with $x \in L$ and $Rx = 0$. But $\|Tx\| \ge \sigma_{k+1}\|x\|$, since the coordinates of $x$ are each magnified at least that much, and so

$$\|(T - R)x\| \ge \sigma_{k+1}\|x\|,$$

which implies that $\|T - R\| \ge \sigma_{k+1} \ge \sigma_n$, and the result follows.

This explains why the $\sigma_i$ are sometimes called approximation numbers of $T$. When $T$ is not compact, but merely bounded, we can still define

$$\sigma_i(T) = \inf\{\|T - S\| : \mathrm{rank}(S) < i\},$$

and clearly $\sigma_i(T) \to 0$ if and only if $T$ is compact.
Corollary 1.5 $\sigma_{m+n-1}(S + T) \le \sigma_m(S) + \sigma_n(T)$ and $\sigma_{m+n-1}(ST) \le \sigma_m(S)\sigma_n(T)$ for $m, n \ge 1$. In particular, $\sigma_m(ST) \le \sigma_m(S)\|T\|$ and $\sigma_m(TS) \le \sigma_m(S)\|T\|$.

Proof For any $\epsilon > 0$ we may choose operators $S_1$ of rank at most $m-1$, and $T_1$ of rank at most $n-1$, with $\|S - S_1\| \le \sigma_m(S) + \epsilon$ and $\|T - T_1\| \le \sigma_n(T) + \epsilon$.

Then $\|S + T - (S_1 + T_1)\| \le \sigma_m(S) + \sigma_n(T) + 2\epsilon$, and

$$\mathrm{rank}(S_1 + T_1) \le m + n - 2,$$

which means that the left hand side is at least $\sigma_{m+n-1}(S + T)$. The result follows on letting $\epsilon \to 0$. Similarly

$$\|ST - (S_1(T - T_1) + ST_1)\| \le \|(S - S_1)(T - T_1)\| \le (\sigma_m(S) + \epsilon)(\sigma_n(T) + \epsilon),$$

which implies the second inequality, since the rank of $S_1(T - T_1) + ST_1$ is at most $m + n - 2$.

The Polar Decomposition of a compact operator

Suppose, as usual, that
$$Tx = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i,$$

so that

$$T^*Tx = \sum_{i=1}^{\infty} \sigma_i^2 (x, v_i) v_i.$$

We define the modulus of $T$, $|T|$, to be the operator $(T^*T)^{1/2}$, that is

$$|T|x = \sum_{i=1}^{\infty} \sigma_i (x, v_i) v_i,$$

and we define the operator $U = U_T$ by

$$Ux = \sum_{i=1}^{\infty} (x, v_i) w_i,$$

i.e. $U(v_i) = w_i$ and $U$ is an isometry (not in general compact) of the closed linear span of $\{v_1, v_2, \ldots\}$ onto the closed linear span of $\{w_1, w_2, \ldots\}$, which takes its orthogonal complement to zero. This is a partial isometry. The polar decomposition of $T$ is then $T = U_T|T|$, which may be compared with the writing of a nonzero complex number as $z = e^{i\theta}|z|$ (or, for $z = 0$, just 0 times 0!)
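A small worked instance may make the definitions concrete. The Python sketch below takes the nilpotent 2-by-2 matrix sending $e_2$ to $2e_1$ and $e_1$ to 0 (so $\sigma_1 = 2$, $v_1 = e_2$, $w_1 = e_1$), writes down its modulus and partial isometry by hand, and checks the relations by matrix multiplication. The matrix and the helper names `matmul` and `transpose` are assumptions chosen for this example; since the entries are real, the adjoint is just the transpose.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(A):
    """Transpose (the adjoint, for real matrices)."""
    return [[A[j][i] for j in range(2)] for i in range(2)]


T = [[0.0, 2.0], [0.0, 0.0]]      # T e2 = 2 e1, T e1 = 0
mod_T = [[0.0, 0.0], [0.0, 2.0]]  # |T| x = sigma_1 (x, v_1) v_1 with v_1 = e2
U = [[0.0, 1.0], [0.0, 0.0]]      # U v_1 = w_1 = e1, zero on the complement

polar_product = matmul(U, mod_T)          # should reproduce T
modulus_squared = matmul(mod_T, mod_T)    # should equal T*T
initial_projection = matmul(transpose(U), U)  # projection onto span{v_1}
print(polar_product, modulus_squared, initial_projection)
```

Here $U$ is not unitary: it is isometric on the span of $v_1$ and zero on its orthogonal complement, which is what the term partial isometry records.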
Proposition 1.6 Let $T = U_T|T|$ be the polar decomposition of a compact operator $T$. Then $|T| = U_T^*T$. Moreover, if also $T^* = U_{T^*}|T^*|$, then $U_{T^*} = U_T^*$ and hence $T = |T^*|U_T$.

Proof This is merely a matter of checking what each of these operators actually does.

Since $U_Tx = \sum_{i=1}^{\infty} (x, v_i) w_i$, thus mapping $v_i$ to $w_i$ and anything orthogonal to the $v_i$ to 0, it is simple to verify that $U_T^*x = \sum_{i=1}^{\infty} (x, w_i) v_i$, which therefore maps $w_i$ to $v_i$ in a similar fashion. Thus

$$U_T^*T = U_T^*U_T|T| = |T|.$$

Since $T^*x = \sum_{i=1}^{\infty} \sigma_i (x, w_i) v_i$, the same formula shows that $U_{T^*} = U_T^*$. Moreover $T^* = U_{T^*}|T^*|$, which implies that

$$T = (T^*)^* = |T^*|U_{T^*}^* = |T^*|U_T,$$

taking adjoints. This proves the proposition.

In fact the polar decomposition does exist for any bounded operator: $T = U_T|T|$, where $|T|^2 = T^*T$, $U_T$ is a partial isometry with $\mathrm{Ker}\, U_T = \mathrm{Ker}\, |T|$, and $|T| \ge 0$, although we shall not prove this.
Definition We say that a compact operator $T$ is in the class $C_p$ ($1 \le p < \infty$) if and only if

$$\sum_{i=1}^{\infty} \sigma_i(T)^p < \infty.$$

Two particular values of $p$ are of interest (as usual):
$C_1$: The nuclear or trace-class operators, and
$C_2$: The Hilbert-Schmidt operators.

Proposition 1.7 The class $C_p$ is a linear space and an ideal, that is
(i) $T \in C_p$, $\lambda \in \mathbb{C}$ $\Rightarrow$ $\lambda T \in C_p$;
(ii) $S, T \in C_p$ $\Rightarrow$ $S + T \in C_p$;
(iii) $S$ bounded, $T \in C_p$ $\Rightarrow$ $ST \in C_p$ and $TS \in C_p$.

Proof Since $\sigma_i(\lambda T) = |\lambda|\sigma_i(T)$ we get (i) immediately.

By Corollary 1.5, $\sigma_{2i-1}(S + T) \le \sigma_i(S) + \sigma_i(T) \le 2\max(\sigma_i(S), \sigma_i(T))$, and hence $(\sigma_{2i-1}(S + T))^p \le 2^p(\sigma_i(S)^p + \sigma_i(T)^p)$, and likewise

$$(\sigma_{2i}(S + T))^p \le 2^p(\sigma_i(S)^p + \sigma_{i+1}(T)^p).$$

Summing over $i$ we see that

$$\sum_{i=1}^{\infty} (\sigma_i(S + T))^p \le 2^p\Big(2\sum_{i=1}^{\infty} \sigma_i(S)^p + 2\sum_{i=1}^{\infty} \sigma_i(T)^p\Big) < \infty.$$

Finally, $\sigma_i(ST) \le \sigma_i(T)\|S\|$ and $\sigma_i(TS) \le \sigma_i(T)\|S\|$, also by Corollary 1.5, and the result follows.

In fact $C_p$ is a Banach space, with norm $\|T\|_{C_p} = \big(\sum_{i=1}^{\infty} \sigma_i(T)^p\big)^{1/p}$, having properties similar to those of $\ell_p$. We shall investigate the properties of $C_1$ and $C_2$ in detail. We remark also that $T \in C_p$ if and only if $T^* \in C_p$, simply because $T$ and $T^*$ have the same singular values.
Proposition 1.8 If $\sum_{i=1}^{\infty} \sigma_i(T)^2 < \infty$, then, for any orthonormal basis $(x_i)$ we have

$$\sum_{i=1}^{\infty} \|Tx_i\|^2 = \sum_{i=1}^{\infty} \sigma_i^2(T).$$

Proof Let $(x_i)$ and $(y_j)$ be any orthonormal bases. Then

$$\sum_i \|Tx_i\|^2 = \sum_i \sum_j |(Tx_i, y_j)|^2,$$

by the Riesz-Fischer theorem (see the Appendix),

$$= \sum_{i,j} |(x_i, T^*y_j)|^2 = \sum_{i,j} |(T^*y_j, x_i)|^2 = \sum_j \|T^*y_j\|^2,$$

again by the Riesz-Fischer theorem.

Hence the answer we get is independent of our choice of orthonormal basis (and we get the same answer for $T^*$ instead of $T$.) Thus, extending $(v_i)$ to an orthonormal basis, by adding in vectors from the kernel of $T = \sum_{i=1}^{\infty} \sigma_i(\cdot, v_i)w_i$ if necessary, we obtain $\sum_{i=1}^{\infty} \|Tv_i\|^2$, which is equal to $\sum_{i=1}^{\infty} \sigma_i^2(T)$, as required.
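The basis independence is easy to observe numerically in two dimensions. In the Python sketch below, every orthonormal basis of the real plane with positive orientation is a rotation of the standard one, so the sum of the squared norms of the images of the basis vectors is computed for several rotation angles. The function name `sum_of_squares`, the sample matrix and the chosen angles are assumptions made for this example.

```python
import math


def sum_of_squares(T, angle):
    """Return ||T x_1||^2 + ||T x_2||^2 for the orthonormal basis obtained
    by rotating the standard basis of R^2 through `angle`."""
    basis = [(math.cos(angle), math.sin(angle)),
             (-math.sin(angle), math.cos(angle))]
    total = 0.0
    for x in basis:
        tx0 = T[0][0] * x[0] + T[0][1] * x[1]
        tx1 = T[1][0] * x[0] + T[1][1] * x[1]
        total += tx0 ** 2 + tx1 ** 2
    return total


T = [[3.0, 0.0], [4.0, 5.0]]   # sample matrix (an assumption)
values = [sum_of_squares(T, angle) for angle in (0.0, 0.3, 1.0, 2.5)]
print(values)
```

For this matrix $T^*T$ has eigenvalues 45 and 5, so every value printed agrees, up to rounding, with $\sigma_1^2 + \sigma_2^2 = 50$, which is also the sum of the squares of the matrix entries.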
Corollary 1.9 The space $C_2$ is an inner-product space, under the inner product

$$\langle S, T \rangle = \sum_{k=1}^{\infty} (Sx_k, Tx_k),$$

where $(x_k)$ is an orthonormal basis, and the value obtained is independent of the basis $(x_k)$. Hence the space $C_2$ becomes a normed space with norm

$$\|S\|_{HS} = \Big(\sum_{k=1}^{\infty} \|Sx_k\|^2\Big)^{1/2} = \Big(\sum_{k=1}^{\infty} \sigma_k(S)^2\Big)^{1/2}.$$

Proof Observe first that

$$\sum_{k=1}^{\infty} |(Sx_k, Tx_k)| \le \sum_{k=1}^{\infty} \|Sx_k\|\,\|Tx_k\| \le \Big(\sum_{k=1}^{\infty} \|Sx_k\|^2\Big)^{1/2}\Big(\sum_{k=1}^{\infty} \|Tx_k\|^2\Big)^{1/2} < \infty$$

(Cauchy-Schwarz), so that the definition makes sense. Clearly the expression $\langle S, T \rangle$ will be linear in $S$, conjugate-linear in $T$, conjugate-symmetric and positive definite. We need to show that the formula does not depend on our choice of orthonormal basis $(x_k)$: so we note that, by Proposition 1.8, $\langle S, S \rangle$, $\langle S + T, S + T \rangle$, etc. are uniquely determined, and hence, by the polarization identity

$$\langle S, T \rangle = \tfrac{1}{4}\big(\langle S + T, S + T \rangle - \langle S - T, S - T \rangle + i\langle S + iT, S + iT \rangle - i\langle S - iT, S - iT \rangle\big),$$

we have the uniqueness of $\langle S, T \rangle$, as required.

We can now say something useful about nuclear operators.
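Proposition 1.10 $T$ is nuclear if and only if $T$ can be written $T = S_1S_2$ for some $S_1$, $S_2$ Hilbert-Schmidt.

Proof If $T$ is nuclear, say $Tx = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i$ with $\sum_{i=1}^{\infty} \sigma_i < \infty$, write

$$S_1x = \sum_{i=1}^{\infty} \sigma_i^{1/2} (x, v_i) w_i$$

(mapping $v_i$ to $\sigma_i^{1/2}w_i$), and

$$S_2x = \sum_{i=1}^{\infty} \sigma_i^{1/2} (x, v_i) v_i$$

(mapping $v_i$ to $\sigma_i^{1/2}v_i$).

Then $S_1$ and $S_2$ both have as singular values the sequence $\sigma_1^{1/2}, \sigma_2^{1/2}, \ldots$, and so are Hilbert-Schmidt. Indeed $S_2 = |T|^{1/2}$. Clearly $T = S_1S_2$.

Conversely, if $S_1$ and $S_2$ are Hilbert-Schmidt, then

$$\sigma_{2i-1}(S_1S_2) \le \sigma_i(S_1)\sigma_i(S_2), \text{ by Corollary 1.5,}$$
$$\le \sigma_i^2(S_1) + \sigma_i^2(S_2),$$

and $\sigma_{2i}(S_1S_2) \le \sigma_i^2(S_1) + \sigma_i^2(S_2)$ similarly, since $\sigma_{2i} \le \sigma_{2i-1}$. Adding up, we see that $S_1S_2$ is nuclear.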
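Definition For $T \in C_1$, we define the trace of $T$ by

$$\mathrm{tr}(T) = \sum_{k=1}^{\infty} (Tx_k, x_k),$$

for any orthonormal basis $(x_k)$. This definition is independent of the choice of $(x_k)$, since, writing $T = S_1S_2$ as in Proposition 1.10,

$$\mathrm{tr}(T) = \sum_{k=1}^{\infty} (S_1S_2x_k, x_k) = \sum_{k=1}^{\infty} (S_2x_k, S_1^*x_k) = \langle S_2, S_1^* \rangle.$$

Lemma 1.11 Let $T$ be nuclear, with singular values $(\sigma_i)$; then
(i) $\mathrm{tr}\,|T| = \sum_{i=1}^{\infty} \sigma_i$;
(ii) $|\mathrm{tr}(T)| \le \mathrm{tr}\,|T|$;
(iii) $\mathrm{tr}\,|ST| \le \|S\|\,\mathrm{tr}\,|T|$ for any bounded operator $S$ (and likewise for $\mathrm{tr}\,|TS|$);
(iv) If $S_1$ and $S_2$ are Hilbert-Schmidt, then $\langle S_1, S_2 \rangle = \mathrm{tr}(S_2^*S_1) = \overline{\mathrm{tr}(S_1^*S_2)}$.

Proof (i) $\mathrm{tr}\,|T| = \sum_{i=1}^{\infty} (|T|v_i, v_i) = \sum_{i=1}^{\infty} \sigma_i$.

(ii) $|\mathrm{tr}(T)| \le \sum_{i=1}^{\infty} |(Tv_i, v_i)| \le \sum_{i=1}^{\infty} \sigma_i = \mathrm{tr}\,|T|$.

(iii) $\sigma_i(ST) \le \|S\|\sigma_i(T)$; now use (i).

(iv) $\langle S_1, S_2 \rangle = \sum_{i=1}^{\infty} (S_1x_i, S_2x_i) = \sum_{i=1}^{\infty} (S_2^*S_1x_i, x_i) = \sum_{i=1}^{\infty} (x_i, S_1^*S_2x_i)$, as required.

Theorem 1.12 Let $T$ be nuclear, with singular values $(\sigma_j)$. Then

$$\sum_{j=1}^{\infty} \sigma_j = \sup\Big\{\sum_{j=1}^{\infty} |(Tx_j, y_j)| : (x_j), (y_j) \text{ orthonormal sequences}\Big\}.$$

Proof Clearly the right hand side above is greater than or equal to the left hand side, for if we write $Tx = \sum_{j=1}^{\infty} \sigma_j (x, v_j) w_j$, then taking $x_j = v_j$ and $y_j = w_j$ gives us $\sum_{j=1}^{\infty} \sigma_j$. Conversely, given $(x_j)$ and $(y_j)$, we may assume without loss of generality that $(Tx_j, y_j) \ge 0$, as it does not change the value of the expression to multiply the vectors by scalars of modulus 1. Then $\sum_{j=1}^{\infty} (Tx_j, y_j) = \sum_{j=1}^{\infty} (Tx_j, Ux_j)$ where $U$ is the norm 1 map (partial isometry) taking $x_j$ to $y_j$ for each $j$ (and zero on the orthogonal complement of the $(x_j)$). Thus

$$\sum_{j=1}^{\infty} (Tx_j, y_j) = \mathrm{tr}(U^*T) \le \mathrm{tr}\,|U^*T| \le \mathrm{tr}\,|T|,$$

and the result follows.

Corollary 1.13 $C_1$ is a normed space, with norm $\|T\|_N = \mathrm{tr}\,|T|$.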
Proof All that remains is to prove the triangle inequality. However

$$\|T_1 + T_2\|_N = \sup \sum_{j=1}^{\infty} |((T_1 + T_2)x_j, y_j)| \le \sup \sum_{j=1}^{\infty} |(T_1x_j, y_j)| + \sup \sum_{j=1}^{\infty} |(T_2x_j, y_j)| = \|T_1\|_N + \|T_2\|_N.$$
The final result of this section should be compared with Proposition 1.10.
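Corollary 1.14 If $S_1, S_2 \in C_2$, then $\|S_1S_2\|_N \le \|S_1\|_{HS}\|S_2\|_{HS}$.

Proof

$$\sum_{j=1}^{\infty} |(S_1S_2x_j, y_j)| = \sum_{j=1}^{\infty} |(S_2x_j, S_1^*y_j)| \le \sum_{j=1}^{\infty} \|S_2x_j\|\,\|S_1^*y_j\| \le \Big(\sum_{j=1}^{\infty} \|S_2x_j\|^2\Big)^{1/2}\Big(\sum_{j=1}^{\infty} \|S_1^*y_j\|^2\Big)^{1/2}$$
$$\le \|S_2\|_{HS}\|S_1^*\|_{HS} = \|S_2\|_{HS}\|S_1\|_{HS}.$$

The result now follows by Theorem 1.12.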
We shall return to the subject of singular values later (notably in the final two chapters), but for the present we leave this topic and consider some spaces of analytic functions.
2. HARDY SPACES
As will be seen in Chapter 3, the most fruitful way of looking at Hankel operators is to consider them as acting on certain spaces of analytic functions, namely the Hardy spaces. The classic Hardy spaces ($H_p$ spaces) consist of analytic functions defined on the open unit disc that exhibit suitable behaviour as one approaches the boundary. The discussion is simplest for $H_2$, which is naturally regarded as a closed subspace of $L_2(\mathbb{T})$, $\mathbb{T}$ being the unit circle, and thus a Hilbert space in its own right. Similarly the other $H_p$ spaces embed in the corresponding $L_p$ spaces. After treating $H_2$, we consider $H_\infty$ and $H_1$ in detail, $H_\infty$ by relatively elementary arguments, $H_1$ by more complicated reasoning. This involves us in a discussion of Blaschke products, the Poisson kernel and, ultimately, the Riesz factorization theorem.

It is possible to define Hardy spaces on other domains than the disc; the right half plane $\mathbb{C}_+$ is important for our purposes. An important fact here is that $L_p(i\mathbb{R})$ can be related to $L_p(\mathbb{T})$ using a Möbius map. The spaces $H_p(\mathbb{C}_+)$ are defined and related to the Hardy spaces on the disc in the same manner. In addition, it turns out that the Laplace transform, often regarded merely as a trick for solving differential equations, plays a precise role as an isomorphism between $H_2(\mathbb{C}_+)$ and the Lebesgue space $L_2(0, \infty)$. We therefore discuss this in detail.

The theorems of this chapter are again standard, though we have given a more elementary treatment than is customary (avoiding the use of harmonic functions almost entirely and dispensing with maximal functions). The cited books of Duren, Garnett, Koosis, Körner and Widder between them cover most of the results in this chapter, though with different approaches.
We start by reviewing the basic properties of $L_p(\mathbb{T})$, $L_p(0, \infty)$, and $L_p(\mathbb{R})$.

$L_p(\mathbb{T})$ is the space of all measurable `functions' on the circle $\mathbb{T} = \{|z| = 1\}$ (parametrised as $\{e^{i\theta} : 0 \le \theta \le 2\pi\}$), with the norm

$$\|f\|_p = \Big(\int_0^{2\pi} |f(e^{i\theta})|^p\, d\theta/2\pi\Big)^{1/p} < \infty,$$

where we identify two functions if they are equal almost everywhere. Similarly $L_\infty(\mathbb{T})$ is the space of all essentially bounded measurable functions on the circle with

$$\|f\|_\infty = \mathrm{ess.\,sup}\, |f(e^{i\theta})| < \infty,$$

where to form the essential supremum we are allowed to disregard sets of measure zero. Similarly for $L_p(0, \infty)$, and $L_p(\mathbb{R}) = L_p(-\infty, \infty)$.

It is worth observing that various important subspaces are dense in $L_p$, and that the norm can be considered as defined by taking the completion of the norm on these subspaces. For example, $C(\mathbb{T})$, the space of continuous functions on the circle, is dense in $L_p(\mathbb{T})$ (with the $L_p$ norm), at least if $1 \le p < \infty$.

Similarly $C_{00}(\mathbb{R})$, the space of all continuous functions with compact support (i.e. for which there exists an $M$ such that $f(x) = 0$ for $|x| \ge M$) is dense in $L_p(\mathbb{R})$ for $1 \le p < \infty$. Analogously for $L_p(0, \infty)$.

Finally, the space of step functions (i.e. those taking finitely many values, which are continuous at all except finitely many points, and in the case of $\mathbb{R}$ and $(0, \infty)$ are eventually zero) is again dense in $L_p$ for $1 \le p < \infty$.
As usual L2(T) is the easiest space to consider, and we can relate Linear Analysis, Complex Analysis and Fourier Series in doing so.
Proposition 2.1 The set of functions $\{z^n : n \in \mathbb{Z}\} = \{e^{in\theta} : n \in \mathbb{Z}\}$ forms an orthonormal basis for $L_2(\mathbb{T})$, and its linear span is dense in $L_p(\mathbb{T})$ for all $1 \le p < \infty$.

Proof We recall that the inner product on $L_2(\mathbb{T})$ is given by

$$(f, g) = \int_0^{2\pi} f(e^{i\theta})\overline{g(e^{i\theta})}\, d\theta/2\pi.$$

Thus $(z^n, z^m) = \int_0^{2\pi} e^{in\theta}e^{-im\theta}\, d\theta/2\pi = \delta_{mn}$, hence the set is orthonormal. To show that the linear span of the $(z^n)$, i.e. the trigonometric polynomials, is dense in $L_p(\mathbb{T})$, and hence that the $(z^n)$ form an orthonormal basis for $L_2$, we argue as follows. Take any $f \in L_p$ and $\epsilon > 0$; then there exists $g \in C(\mathbb{T})$ with $\|f - g\|_p < \epsilon/2$. But by the Stone-Weierstrass theorem (see the Appendix) any continuous function on $\mathbb{T}$ can be uniformly approximated by trigonometric polynomials, so there exists $h = \sum_{n=-N}^{N} a_n e^{in\theta}$ such that $\|g - h\|_\infty < \epsilon/2$. But $\|\cdot\|_p \le \|\cdot\|_\infty$, so that $\|g - h\|_p < \epsilon/2$ and hence $\|f - h\|_p < \epsilon$, as required.

Note that the function $z^n$ is analytic in $\{|z| < 1\}$ only for $n \ge 0$. Relating $L_p$ to spaces of functions analytic in the disc gives us the Hardy spaces.
Definition 2.2 For $1 \le p < \infty$ the Hardy space $H_p$ is the space of all analytic functions on $\{|z| < 1\}$ such that

$$\|f\|_{H_p} = \sup_{r<1} \Big(\int_0^{2\pi} |f(re^{i\theta})|^p\, d\theta/2\pi\Big)^{1/p} < \infty.$$

$H_\infty$ is the space of all bounded analytic functions on $\{|z| < 1\}$ with the norm

$$\|f\|_{H_\infty} = \sup_{|z|<1} |f(z)|.$$

We note that all such $f$ can be written as $\sum_{n=0}^{\infty} a_n z^n$, a power series of radius of convergence at least 1.
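Lemma 2.3 If $f(z) = \sum_{n=0}^{\infty} a_n z^n$, then $\|f\|_{H_2}^2 = \sum_{n=0}^{\infty} |a_n|^2$, and hence $H_2$ is a Hilbert space with orthonormal basis $(z^n : 0 \le n < \infty)$, which can be naturally regarded as a closed subspace of $L_2(\mathbb{T})$. Moreover, in Definition 2.2, the supremum over $r < 1$ is also the limit as $r \to 1$.

Proof

$$\int_0^{2\pi} f(re^{i\theta})\overline{f(re^{i\theta})}\, d\theta/2\pi = \int_0^{2\pi} \Big(\sum_{n=0}^{\infty} a_n r^n e^{in\theta}\Big)\Big(\sum_{n=0}^{\infty} \bar a_n r^n e^{-in\theta}\Big)\, d\theta/2\pi$$
$$= \int_{|z|=1} \Big(\sum_{n=0}^{\infty} a_n r^n z^n\Big)\Big(\sum_{n=0}^{\infty} \bar a_n r^n \bar z^n\Big)\, dz/2\pi i z,$$

writing $z = e^{i\theta}$, $dz = iz\, d\theta$,

$$= \sum_{n=0}^{\infty} r^{2n}|a_n|^2,$$

as we can rearrange the absolutely convergent sum. Now, letting $r \to 1$, we obtain the required expression for the $H_2$ norm. The inner product

$$(f, g) = \sum_{n=0}^{\infty} a_n \bar b_n = \lim_{r \to 1} \int_0^{2\pi} f(re^{i\theta})\overline{g(re^{i\theta})}\, d\theta/2\pi,$$

where $f(z) = \sum_{n=0}^{\infty} a_n z^n$ and $g(z) = \sum_{n=0}^{\infty} b_n z^n$, makes $H_2$ into an inner-product space. The space is complete, since it is isometrically isomorphic to $\ell_2$: any sequence $(a_n)$ with $\sum_{n=0}^{\infty} |a_n|^2 < \infty$ corresponds to a power series $\sum_{n=0}^{\infty} a_n z^n$ with radius of convergence at least 1. It follows that $H_2$ and $\ell_2$ are isometrically isomorphic Hilbert spaces.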
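The identity between the circle means and the coefficient sums can be checked numerically. The Python sketch below uses the sample function $f(z) = 1/(1 - z/2) = \sum_n 2^{-n}z^n$, for which the mean of $|f|^2$ over the circle of radius $r$ should equal $\sum_n r^{2n}4^{-n} = 1/(1 - r^2/4)$, increasing to $\|f\|_{H_2}^2 = 4/3$ as $r \to 1$. The choice of $f$, the function name `circle_mean_square` and the use of an equally spaced quadrature rule are assumptions made for this example.

```python
import cmath
import math


def circle_mean_square(f, r, samples=256):
    """Approximate the mean of |f(r e^{i theta})|^2 over theta in [0, 2 pi)
    by averaging over equally spaced points on the circle of radius r."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += abs(f(z)) ** 2
    return total / samples


def f(z):
    """Sample function with Taylor coefficients a_n = 2^(-n)."""
    return 1 / (1 - z / 2)


# Sum of |a_n|^2 = 4^(-n); the tail beyond 200 terms is negligible.
coefficient_sum = sum(0.25 ** n for n in range(200))

means = {r: circle_mean_square(f, r) for r in (0.5, 0.9, 0.999)}
for r, value in means.items():
    print(r, value, 1 / (1 - r * r / 4))
print(coefficient_sum)
```

The printed means increase with $r$ and stay below the coefficient sum, in line with the statement that the supremum over $r < 1$ is also the limit as $r \to 1$.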
Theorem 2.4 For $1 \le p \le \infty$, $H_p$ is a Banach space.

Proof The basic properties of the norm, including the triangle inequality, are proved similarly to the known result for $L_p$ spaces. It remains only to establish completeness.

So suppose that $(f_n)$ is a Cauchy sequence in $H_p$. The case $p = \infty$ is easily dealt with, since the functions are then uniformly convergent and tend to a limit function which will also be in $H_\infty$. For $1 \le p < \infty$, we need to work slightly harder. If $|w| < r$ and $f$ is analytic, we have

$$f(w) = (1/2\pi i)\int_{|z|=r} f(z)\, dz/(z - w),$$

which is Cauchy's formula. Hence, writing $z = re^{i\theta}$, we obtain

$$|f(w)| = \Big|(1/2\pi i)\int_0^{2\pi} f(re^{i\theta})\, ire^{i\theta}\, d\theta/(re^{i\theta} - w)\Big|,$$

which, by Hölder's inequality (see the Appendix), is at most

$$\Big(\int_0^{2\pi} |f(re^{i\theta})|^p\, d\theta/2\pi\Big)^{1/p}\Big(\int_0^{2\pi} (r/(r - |w|))^q\, d\theta/2\pi\Big)^{1/q},$$

where $1/p + 1/q = 1$. But this is in turn at most $\|f\|_{H_p}(r/(r - |w|))$. We now choose $r$ to be $(1 + |w|)/2$; thus

$$|f(w)| \le \|f\|_{H_p}\,((1 + |w|)/(1 - |w|)).$$

Therefore, if $(f_n)$ is a Cauchy sequence in $H_p$ we have $(f_n)$ converging locally uniformly to a limit function, $f$, say, since

$$|f_n(w) - f_m(w)| \le ((1 + |w|)/(1 - |w|))\,\|f_n - f_m\|_{H_p}.$$

It follows that $f$ is analytic, and

$$\Big(\int_0^{2\pi} |f_n(re^{i\theta}) - f(re^{i\theta})|^p\, d\theta/2\pi\Big)^{1/p} \le \liminf_{m \to \infty} \Big(\int_0^{2\pi} |f_n(re^{i\theta}) - f_m(re^{i\theta})|^p\, d\theta/2\pi\Big)^{1/p},$$

by Fatou's Lemma (see the Appendix), which, given $\epsilon > 0$, is less than $\epsilon$ for $n$ sufficiently large, independent of $r$. Similarly we see that $\|f\|_{H_p} < \infty$, i.e. the limit function is actually in $H_p$.

Looking at $f \in H_2$ now, we can relate its values inside the disc, given by a power series, to its values on the boundary, given by regarding $H_2$ as a subspace of $L_2$.
Proposition 2.5 If $f(w) = \sum_{n=0}^{\infty} a_n w^n \in H_2$, then for $|w| < 1$,

$$f(w) = (1/2\pi i)\int_{|z|=1} \tilde f(z)\, dz/(z - w),$$

where $\tilde f$ is the extension to $\mathbb{T}$ defined by taking the limit in the $L_2$ sense of the functions $f_N(z) = \sum_{n=0}^{N} a_n z^n$.

Proof Clearly, we do have $f_N(w) = (1/2\pi i)\int_{|z|=1} f_N(z)\, dz/(z - w)$, by Cauchy's theorem. This is the same as $(f_N, h)$, where $h \in L_2$, and $\overline{h(e^{i\theta})} = e^{i\theta}/(e^{i\theta} - w)$.

Now as $N \to \infty$, $f_N \to \tilde f$ in $L_2(\mathbb{T})$, and so $(f_N, h) \to (\tilde f, h)$; since $f_N(w) \to f(w)$, pointwise, the result follows.
We now turn our attention to $H_\infty$, and prove a few useful properties of it, which enable us to regard it as a subspace of $L_\infty(\mathbb{T})$.

One preliminary remark is in order. If $f(e^{i\theta}) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ is a function in $L_2(\mathbb{T})$, let us write $f_N(e^{i\theta}) = \sum_{n=-N}^{N} a_n e^{in\theta}$. Now $\|f_N - f\|_{L_2} \to 0$ and hence $f_N \to f$ in measure, and there is a subsequence $(f_{N_k})$ converging to $f$ pointwise almost everywhere (see the Appendix).

Let us now consider how $H_\infty$ acts by multiplication on $H_2$.
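Theorem 2.6 Let $f \in H_\infty$ ($\subset H_2$) and $\tilde f$ be as in Proposition 2.5. Then
(i) $\|\tilde f\|_\infty = \sup\{\|\tilde f g\|_{L_2}/\|g\|_{L_2} : g \in L_2(\mathbb{T}), \text{ nonzero}\}$;
(ii) $\|\tilde f\|_\infty = \sup\{\|\tilde f g\|_{L_2}/\|g\|_{L_2} : g \text{ a nonzero polynomial in } H_2\}$;
(iii) $\|\tilde f\|_{L_\infty} = \|f\|_{H_\infty}$.

Proof (i) Clearly $\|\tilde f g\|_2 \le \|\tilde f\|_\infty \|g\|_2$. However, if $|\tilde f| \ge M$ on a set $A$ of measure $\mu > 0$, then with

$$g(e^{i\theta}) = \chi_A(e^{i\theta})\,|\tilde f(e^{i\theta})|/\tilde f(e^{i\theta}),$$

where $\chi_A(e^{i\theta})$ is the characteristic function (indicator function) of $A$, we have $\tilde f g = \chi_A|\tilde f|$; thus $\|g\|_2 = \mu^{1/2}$ and $\|\tilde f g\|_2 \ge M\mu^{1/2}$. This implies the result.

(ii) We clearly have that the supremum in (ii) is at most $\|\tilde f\|_\infty$, using (i). But if $\|\tilde f g\|_2 \le K\|g\|_2$ for all polynomials $g$, then

$$\Big\|\tilde f(z)\, z^N \sum_{n=-N}^{N} b_n z^n\Big\|_{L_2} \le K\,\Big\|z^N \sum_{n=-N}^{N} b_n z^n\Big\|_{L_2},$$

i.e.

$$\Big\|\tilde f(z) \sum_{n=-N}^{N} b_n z^n\Big\|_{L_2} \le K\,\Big\|\sum_{n=-N}^{N} b_n z^n\Big\|_{L_2}$$

for all trigonometric polynomials. But then, given any $g \in L_2$ the analogous inequality will hold, since we can find a sequence $(g_k)$ of trigonometric polynomials converging to $g$ in $L_2$ and almost everywhere. Then, by Fatou's Lemma (see the Appendix),

$$\|\tilde f(z)g(z)\|_{L_2} \le \liminf \|\tilde f(z)g_k(z)\|_{L_2} \le K\|g\|_{L_2},$$

and hence, by (i), $\|\tilde f\|_{L_\infty} \le K$.

(iii) Suppose that $f \in H_\infty$, corresponding to $\tilde f \in L_2$, and that $g \in H_2$ is a polynomial, corresponding to $\tilde g \in L_2$.

Thus $f(z) = \sum_{n=0}^{\infty} a_n z^n \in H_\infty$, $g(z) = \sum_{n=0}^{N} b_n z^n \in H_\infty$, and clearly the extension of $fg$ to $\mathbb{T}$ is obtained by multiplying series. Hence

$$\|\tilde f\|_\infty = \sup\{\|\tilde f \tilde g\|_{L_2}/\|\tilde g\|_{L_2} : g \text{ a polynomial}\} = \sup\{\|fg\|_{H_2}/\|g\|_{H_2} : g \text{ a polynomial}\} \le \|f\|_{H_\infty}.$$

Thus we do at least know that $\tilde f$ is in $L_\infty$, and so, whenever $g \in H_2$, $\tilde f \tilde g$ is in $L_2$. It is also easily verified that $(fg)^\sim = \tilde f \tilde g$, that is, to obtain the $L_2$ expansion of $\tilde f \tilde g$, one just multiplies the power series. In particular, $\|f^k\|_{H_2} = \|\tilde f^k\|_{L_2}$.

Now $(\|f^k\|_{H_2})^{1/k} \to \|f\|_{H_\infty}$, as $k \to \infty$, since the L.H.S. is less than or equal to the R.H.S. for each $k$ and, if $|f| > M$ at an interior point $a$, $|a| < 1$, then by continuity $|f| > M$ on some arc of $\{|z| = |a|\}$, of angle $\phi$ say, and

$$(\|f^k\|_{H_2})^{1/k} \ge \Big(\int_{|z|=|a|} |f|^{2k}\, d\theta/2\pi\Big)^{1/2k} \ge (M^{2k}\phi/2\pi)^{1/2k} \to M$$

as $k \to \infty$.

Similarly $(\|\tilde f^k\|_{L_2})^{1/k} \to \|\tilde f\|_{L_\infty}$ as $k \to \infty$. This completes the proof.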
We have thus established the existence of norm-preserving maps
J: H2 -4 L2, and
J: H -9 L,,, each taking f to 71 which assign `boundary values' to analytic functions in the disc. We shall often identify f with 71 regarding H2 as a subspace of L2, or H,,, as a subspace of L,,. Now we consider the space H1, taking in various topics of interest and importance en route.
Blaschke products
Suppose that we have a function $B(z) = e^{i\phi} \prod_{j=1}^{n} (z - z_j)/(1 - \bar z_j z)$, where $\phi \in \mathbb{R}$ and $|z_j| < 1$ for $j = 1, \ldots, n$. We call such a function $B$ a finite Blaschke product. The following proposition collects together some standard facts about Blaschke products.
Proposition 2.7 (i) $B$ is analytic in $\{|z| < 1\}$ and continuous in $\{|z| \le 1\}$;
(ii) $|B(z)| = 1$ if $|z| = 1$;
(iii) $B$ has zeroes at $z_1, \ldots, z_n$ only, and poles at $1/\bar z_1, \ldots, 1/\bar z_n$ only.
Proof This is elementary verification and is left as an exercise.
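The facts collected in Proposition 2.7 are easy to check numerically. The following is a minimal sketch, not part of the original text: the zeros and the angle `phi` are hypothetical values chosen for illustration, and the helper name `blaschke` is our own.

```python
import cmath
import math


def blaschke(z, zeros, phi=0.0):
    """Evaluate B(z) = e^{i*phi} * prod_j (z - z_j) / (1 - conj(z_j) * z)."""
    value = cmath.exp(1j * phi)
    for zj in zeros:
        value *= (z - zj) / (1 - zj.conjugate() * z)
    return value


# Hypothetical zeros inside the unit disc, chosen only for illustration.
zeros = [0.5 + 0.0j, -0.3 + 0.4j, 0.1 - 0.7j]

# Largest deviation of |B| from 1 over 360 sample points of the unit circle.
max_deviation = max(
    abs(abs(blaschke(cmath.exp(2j * math.pi * k / 360), zeros, phi=0.3)) - 1.0)
    for k in range(360)
)
print("max | |B| - 1 | on the circle:", max_deviation)
print("|B| at the prescribed zeros:", [abs(blaschke(zj, zeros)) for zj in zeros])
```

The deviation on the circle should be of the order of rounding error, and the values at the prescribed zeros should vanish, in line with parts (ii) and (iii).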
Definition An $H_\infty$ function which has unit modulus almost everywhere on $\mathbb{T}$ is called an inner function.
We are interested in factorizing an analytic function into a Blaschke product (with perhaps infinitely many factors) multiplied by a function with no zeroes. It is therefore important to know something about the zeroes of such a function.
Lemma 2.8 If $f(z) \in H_p$, and is not identically zero, then the zeroes $(z_n)$ of $f$ are countable in number and satisfy $\sum (1 - |z_n|) < \infty$.
Proof Take $r < 1$, and consider the zeroes $z_1, \ldots, z_m$ in $\{|z| < r\}$, assuming that none satisfy $|z_k| = r$; there can only be finitely many, since they are isolated.
The function $f_r(z) = f(rz)$ is analytic in $\{|z| \le 1\}$, and has zeroes $z_1/r, \ldots, z_m/r$. Thus we can write
$f(rz) = \prod_{k=1}^{m} ((z - z_k/r)/(1 - \bar z_k z/r))\, g(z)$,
where $g$ is analytic and nonzero. Thus
$\log g(0) = \int_0^{2\pi} \log g(e^{i\theta})\, d\theta/2\pi$.
Taking real parts, we obtain
$\log|g(0)| = \int_0^{2\pi} \log|g(e^{i\theta})|\, d\theta/2\pi = \int_0^{2\pi} \log|f(re^{i\theta})|\, d\theta/2\pi$.
That is,
$\log|f(0)| + \sum_{|z_k|<r} \log(r/|z_k|) = \int_0^{2\pi} \log|f(re^{i\theta})|\, d\theta/2\pi \le (1/p)\log \int_0^{2\pi} |f(re^{i\theta})|^p\, d\theta/2\pi$, by Jensen's Inequality (see the Appendix), $\le \log\|f\|_{H_p}$.
Now, letting $r \to 1$, we see that $\sum \log(1/|z_k|) < \infty$. This is equivalent to the assertion that $\sum (1 - |z_k|) < \infty$, as is easily verified.
Theorem 2.9 Let $f \in H_p$. Then the infinite Blaschke product $B(z) = z^m \prod_{|z_n| \ne 0} (-\bar z_n/|z_n|)(z - z_n)/(1 - \bar z_n z)$, where $(z_n)$ are the zeroes of $f$, $m$ of them being at 0, converges uniformly on compact sets to an $H_\infty$ function whose only zeroes are the $(z_n)$, with the correct multiplicities. Moreover $|B(z)| \le 1$ and $|B(e^{i\theta})| = 1$ almost everywhere.
Proof It will be sufficient to prove the result for $f(z)/z^m$. Write
$b_n(z) = (-\bar z_n/|z_n|)(z - z_n)/(1 - \bar z_n z)$, where the first term is a factor chosen to make $b_n(0) > 0$. Then $\prod b_n$ converges to an analytic function with the correct zeroes if and only if $\sum \log|b_n|$ converges locally uniformly; this happens if and only if $\sum |1 - b_n(z)|$ converges locally uniformly. But
$|1 - b_n(z)| = |1 + (\bar z_n/|z_n|)(z - z_n)/(1 - \bar z_n z)| = (1 - |z_n|)\,|\bar z_n z + |z_n||\,/\,(|z_n|\,|1 - \bar z_n z|) \le (1 - |z_n|)(1 + |z|)/(1 - |z|)$,
which gives convergence.
Thus $B(z) \in H_\infty$, and $\|B\|_{H_\infty} \le 1$, so that the boundary function (defined using the map $J: H_\infty \to L_\infty(\mathbb{T})$) satisfies $|B(e^{i\theta})| \le 1$ almost everywhere. But, writing $B_n = \prod_{k=1}^{n} b_k$, we see that $B/B_n$ is another Blaschke product, and so
$|B(0)/B_n(0)| \le \int_0^{2\pi} |B(e^{i\theta})/B_n(e^{i\theta})|\, d\theta/2\pi = \int_0^{2\pi} |B(e^{i\theta})|\, d\theta/2\pi$.
Letting $n \to \infty$, we obtain $\int_0^{2\pi} |B(e^{i\theta})|\, d\theta/2\pi = 1$, and so $|B(e^{i\theta})| = 1$ a.e.
Recall that
$\|f\|_{H_\infty} = \lim_{r \to 1} \sup_{|z|=r} |f(z)|$, and $\|f\|_{H_2} = \lim_{r \to 1} (\int_0^{2\pi} |f(re^{i\theta})|^2\, d\theta/2\pi)^{1/2}$
by the proof of Lemma 2.3. These indicate that functions grow towards the boundary of the disc in a strong sense, and, while studying $H_1$, we would like an analogous result. This we prove using a form of the Poisson Kernel.
Proposition 2.10 Let $P(z, w) = (|z|^2 - |w|^2)/|z - w|^2$ for $0 < |w| < |z| \le 1$ be the `Poisson Kernel'. Then
(i) if $f$ is analytic in $\{|z| < 1\}$, then $f(w) = \int_0^{2\pi} P(re^{i\theta}, w) f(re^{i\theta})\, d\theta/2\pi$;
(ii) $\int_0^{2\pi} P(re^{i\theta}, w)\, d\theta/2\pi = 1$ for $|w| < r$; and
(iii) $\int_0^{2\pi} P(z, se^{i\phi})\, d\phi/2\pi = 1$ for $|z| > s$.
Proof (i) $\int_0^{2\pi} P(re^{i\theta}, w) f(re^{i\theta})\, d\theta/2\pi = (1/2\pi i)\int_{|z|=r} (r^2 - w\bar w) f(z)\, dz/(z(z - w)(r^2/z - \bar w))$,
setting $z = re^{i\theta}$, $dz = iz\, d\theta$. The result now follows by the Residue theorem, since the only pole is at $w$, with residue $f(w)$.
(ii) Take $f(z) = 1$, and use (i).
(iii) $\int_0^{2\pi} P(z, se^{i\phi})\, d\phi/2\pi = (1/2\pi i)\int_{|w|=s} (z\bar z - s^2)\, dw/(w(z - w)(\bar z - s^2/w)) = (1/2\pi i)\int_{|w|=s} (z\bar z - s^2)\, dw/((z - w)(\bar z w - s^2))$. The result follows again from the residue theorem: this time the only pole is at $w = s^2/\bar z$.
Corollary 2.11 If $s < r < 1$ and $f$ is analytic in $\{|z| < 1\}$, then
$\int_0^{2\pi} |f(se^{i\phi})|\, d\phi/2\pi \le \int_0^{2\pi} |f(re^{i\theta})|\, d\theta/2\pi$, and thus, if $f \in H_1$, then
$\|f\|_{H_1} = \lim_{r \to 1} \int_0^{2\pi} |f(re^{i\theta})|\, d\theta/2\pi$.
Proof $\int_0^{2\pi} |f(se^{i\phi})|\, d\phi/2\pi = \int_0^{2\pi} |\int_0^{2\pi} f(re^{i\theta}) P(re^{i\theta}, se^{i\phi})\, d\theta/2\pi|\, d\phi/2\pi \le \int_0^{2\pi}\int_0^{2\pi} |f(re^{i\theta})|\, P(re^{i\theta}, se^{i\phi})\, d\theta/2\pi\, d\phi/2\pi$.
By Fubini's theorem (see the Appendix), we may swap the order of integration and integrate with respect to $\phi$ first, obtaining $\int_0^{2\pi} |f(re^{i\theta})|\, d\theta/2\pi$, by Proposition 2.10.
Theorem 2.12 (F. Riesz). Let $f(z) \in H_p$ be not identically zero, and $B(z)$ be the infinite Blaschke product on the zeroes $(z_n)$ of $f$. Then $f(z) = g(z)B(z)$, where $\|f\|_p = \|g\|_p$ for $p = 1, 2$. (In fact $\|f\|_p = \|g\|_p$ for all values of $p$, but we do not prove this.)
Proof Let $g(z) = f(z)/B(z)$ and $g_n(z) = f(z)/B_n(z)$, where $B_n(z)$ is the Blaschke product corresponding to the first $n$ zeroes of $f$. If $r < 1$, then
$\int_0^{2\pi} |g_n(re^{i\theta})|^p\, d\theta/2\pi \le \lim_{R \to 1} \int_0^{2\pi} |f(Re^{i\theta})|^p/|B_n(Re^{i\theta})|^p\, d\theta/2\pi$,
as we have seen that this limit exists for $p = 1, 2$. Now as $R \to 1$, $|B_n(Re^{i\theta})| \to 1$ uniformly, so that
$\int_0^{2\pi} |g_n(re^{i\theta})|^p\, d\theta/2\pi \le \lim_{R \to 1} \int_0^{2\pi} |f(Re^{i\theta})|^p\, d\theta/2\pi = \|f\|_{H_p}^p$.
Now $|g_n(z)| \to |g(z)|$ monotonically (increasing) as $n \to \infty$, so by the Monotone Convergence Theorem (see the Appendix), $\|g\|_{H_p} \le \|f\|_{H_p}$; but $|g(z)| \ge |f(z)|$ for all $z$ in the disc, and thus $\|g\|_{H_p} = \|f\|_{H_p}$.
Theorem 2.13 (Riesz Factorization Theorem) $f(z) \in H_1$ if and only if there exist $g(z), h(z) \in H_2$ such that $f = gh$, and $\|f\|_{H_1} = \|g\|_{H_2}\|h\|_{H_2}$. In other words we can regard $H_1$ functions as products of $H_2$ functions, rather in the same way as we regarded $C_1$ operators as the products of $C_2$ operators.
Proof It follows immediately from the Cauchy-Schwarz inequality that we always have $\|gh\|_{H_1} \le \|g\|_{H_2}\|h\|_{H_2}$.
Conversely, given any $f(z) \in H_1$, we may write $f(z) = f_1(z)B(z)$, where $B$ is the Blaschke product as in Theorem 2.12, $f_1$ has no zeroes in the disc, and $\|f\|_{H_1} = \|f_1\|_{H_1}$. Since $f_1$ is analytic and nonzero, it has an analytic square root, $g$ say, and $\|f\|_{H_1} = \|g\|_{H_2}^2$. Thus $f(z) = g(z)g(z)B(z)$, and $\|f\|_{H_1} = \|g\|_{H_2}\|gB\|_{H_2}$, since we already have `$\le$' and of course $\|gB\|_{H_2} \le \|g\|_{H_2}$.
Corollary 2.14 $H_1$ embeds linearly and isometrically into $L_1(\mathbb{T})$ by an extension of the map $J$ already defined on $H_2$.
Proof The set of polynomials is dense in $H_1$, since if $f(z) = g(z)h(z) \in H_1$, where $g(z) = \sum_0^\infty a_k z^k$ and $h(z) = \sum_0^\infty b_k z^k$ are in $H_2$, then writing $g_n(z) = \sum_0^n a_k z^k$ and $h_n(z) = \sum_0^n b_k z^k$, we see that
$\|gh - g_n h_n\|_{H_1} = \|g(h - h_n) + h_n(g - g_n)\|_{H_1} \le \|g\|_{H_2}\|h - h_n\|_{H_2} + \|h\|_{H_2}\|g - g_n\|_{H_2}$,
which tends to zero as $n \to \infty$. But for polynomials we do have that $\|Jf\|_{L_1} = \|f\|_{H_1}$, by Corollary 2.11, and the result now follows by defining $J$ to be the unique linear and isometric extension to $H_1$.
Equivalence of Hilbert Spaces
Writing $\ell_2 = \ell_2(0, 1, 2, \ldots)$, we have an isomorphism between $\ell_2$ and $H_2$ in the usual way, with the sequence $(a_n)$ corresponding to $\sum_0^\infty a_n z^n$. We also have $\ell_2$ isomorphic to the orthogonal complement $H_2^\perp$, by mapping the sequence $(a_n)$ to $\sum_0^\infty a_n z^{-1-n}$. Also we can define an isomorphism $T: H_2 \to H_2^\perp$, by $(Tf)(z) = (1/z)f(1/z)$.
Hardy spaces on the half plane
We now wish to consider analytic functions whose domain is a half plane rather than a disc. As regards `boundary' values (i.e. $L_p(\mathbb{T})$ and $L_p(i\mathbb{R})$), the following result will be extremely useful to us.
Proposition 2.15 The Möbius map $M(z) = (1 - z)/(1 + z)$ is a self-inverse bijection from the disc $\{|z| < 1\}$ to the right half plane $\mathbb{C}_+ = \{x + iy: x > 0\}$. A function $g(z)$ defined on the unit circle is in $L_p(\mathbb{T})$ if and only if the function $G: i\mathbb{R} \to \mathbb{C}$ defined by
$G(s) = \pi^{-1/p}(1 + s)^{-2/p}g(Ms)$
is in $L_p(i\mathbb{R})$; moreover $\|g\|_p = \|G\|_p$.
Proof The properties of $M$ claimed in the first sentence are straightforward. Further,
$\|G\|_p^p = \int_{-\infty}^{\infty} |G(iy)|^p\, dy = -i\int_{-i\infty}^{i\infty} |g(Ms)|^p (1/\pi)\, ds/|1 + s|^2 = i\int_{|z|=1} |g(z)|^p (1/\pi)(-2\, dz/(1 + z)^2)\,|1 + z|^2/4$,
since $s = (1 - z)/(1 + z)$, $ds/dz = -2/(1 + z)^2$, and $1 + s = 2/(1 + z)$. We have also picked up another minus sign through the need to integrate anti-clockwise round the circle. This expression equals
$-i\int_0^{2\pi} |g(e^{i\theta})|^p (1/2\pi)\, d\theta\; iz\,|1 + z|^2/(1 + z)^2$.
However, $z|1 + z|^2 = z(1 + z)(1 + \bar z) = z(1 + z)(1 + 1/z) = (1 + z)^2$,
which reduces the expression to $\int_0^{2\pi} |g(e^{i\theta})|^p\, d\theta/2\pi = \|g\|_p^p$.
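The properties of $M$ described as straightforward in the proof can also be checked numerically. The sketch below is illustrative only; the sample points are hypothetical choices and `mobius` is our own helper name.

```python
import cmath
import math


def mobius(z):
    """The map M(z) = (1 - z) / (1 + z) of Proposition 2.15."""
    return (1 - z) / (1 + z)


# Hypothetical sample points in the open unit disc.
disc_points = [0.3 + 0.2j, -0.5 + 0.1j, 0.0j, 0.7j, -0.9 + 0.0j]

for z in disc_points:
    s = mobius(z)
    # s should lie in the right half plane, and M(M(z)) should return z.
    print(z, "->", s, "| round trip error:", abs(mobius(s) - z))

# Points of the unit circle (other than z = -1) should land on the imaginary axis.
circle_points = [cmath.exp(1j * t) for t in (0.1, 1.0, 2.0, 3.0, -2.5)]
max_real_part = max(abs(mobius(z).real) for z in circle_points)
print("largest |Re M(z)| on the circle samples:", max_real_part)
```

Each image of a disc point should have positive real part, and the round trip error should be at the level of rounding.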
For reference, we note that the inverse map, expressing $g$ in terms of $G$, is obtained on writing $z = Ms$ and $Mz = s$, namely $G(Mz) = \pi^{-1/p}(1 + z)^{2/p}2^{-2/p}g(z)$, or,
$g(z) = 2^{2/p}\pi^{1/p}(1 + z)^{-2/p}G(Mz)$. For $p = \infty$ we clearly have the much simpler expression $G(s) = g(Ms)$ and $g(z) = G(Mz)$, with $\|g\|_\infty = \|G\|_\infty$.
We next wish to define Hardy spaces on the right half plane $\mathbb{C}_+$. For $p = \infty$, there are no problems in deciding on a natural definition; for $p = 2$, we make the following comments. In Proposition 2.15, the orthonormal basis $(z^n)_{n \in \mathbb{Z}}$ of $L_2(\mathbb{T})$ is mapped to what will be an orthonormal basis of $L_2(i\mathbb{R})$, the functions $G_n(s) = \pi^{-1/2}(1 - s)^n/(1 + s)^{n+1}$. Those functions with $n \ge 0$ are actually analytic in the right half plane, and will generate the space $H_2(\mathbb{C}_+)$ for us.
However, the `usual' definition of $H_p(\mathbb{C}_+)$ is slightly different, and we give this now.
Definition 2.16 $H_\infty(\mathbb{C}_+) = \{G: G(s)$ analytic and bounded in Re $s > 0\}$, with $\|G\|_{H_\infty(\mathbb{C}_+)} = \sup |G(s)|$.
For $1 < p < \infty$, $H_p(\mathbb{C}_+) = \{G: G(s)$ analytic in Re $s > 0$, with $\|G\| = \sup_{x>0} (\int_{-\infty}^{\infty} |G(x + iy)|^p\, dy)^{1/p} < \infty\}$.
Note that, if $G(s) \in H_2(\mathbb{C}_+)$, then $g(z) = 2\pi^{1/2}G(Mz)/(1 + z)$ is analytic in $\{|z| < 1\}$ and thus we have an expansion $G(s) = \sum_0^\infty a_n \pi^{-1/2}(1 - s)^n/(1 + s)^{n+1}$ converging uniformly on compact sets.
Now, it is a routine but tedious calculation to verify, for finite sums, that
$\|\sum_0^N a_n \pi^{-1/2}(1 - s)^n/(1 + s)^{n+1}\|_{H_2(\mathbb{C}_+)} = (\sum_0^N |a_n|^2)^{1/2}$.
(One can transform the integral along a line to an integral in the disc, using the map $M$. In fact it is elementary to show this result for polynomials if one replaces `$\sup_{x>0}$' by `$\lim_{x \to 0}$' in the definition of $H_2(\mathbb{C}_+)$.)
We may deduce that the injections $H_2(\mathbb{C}_+) \to L_2(i\mathbb{R})$ and $H_\infty(\mathbb{C}_+) \to L_\infty(i\mathbb{R})$ are isometric. Therefore
$H_2(\mathbb{C}_+) = \{G(s)$, analytic in Re $s > 0$: $g(z) = 2\pi^{1/2}G(Mz)/(1 + z) \in H_2\}$, and $\|G\|_{H_2(\mathbb{C}_+)} = \|g\|_{H_2}$.
We write $V: H_2 \to H_2(\mathbb{C}_+)$ for the isometric map given above, taking $g(z)$ to $G(s)$.
The Laplace Transform
We recall that $\ell_2$ and $H_2$ were shown to be isometrically isomorphic as Hilbert spaces in a natural way using Fourier series. Similar things are true on the half-plane, namely that $L_2(0, \infty)$ and $H_2(\mathbb{C}_+)$ are isometrically isomorphic using the Laplace transform.
Definition 2.17 For $h(t) \in L_2(0, \infty)$, define the Laplace transform
$G(s) = (\mathcal{L}h)(s) = \int_0^\infty e^{-st}h(t)\, dt$.
By the Cauchy-Schwarz inequality, we see that this exists pointwise, at least in $\mathbb{C}_+$.
For more general functions, the Laplace transform can often still be defined by the above integral: for example it will converge if $h \in L_1(0, \infty)$, or if $h \in L_\infty(0, \infty)$.
Example
If $h(t) = e^{\lambda t}$, then $(\mathcal{L}h)(s) = \int_0^\infty e^{-st}e^{\lambda t}\, dt = 1/(s - \lambda)$, at least in the half-plane Re $s$ > Re $\lambda$. More examples will be found in the Exercises.
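As an illustration of this example, the transform of an exponential can be compared with a direct numerical quadrature. This is a rough sketch under stated assumptions: the values of `lam` and `s`, the truncation point `upper` and the step count are all hypothetical choices, and Simpson's rule is used only for convenience.

```python
import cmath


def laplace_numeric(h, s, upper=40.0, steps=40000):
    """Composite Simpson approximation to the integral of e^{-st} h(t) over [0, upper]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    dt = upper / steps
    total = h(0.0) + cmath.exp(-s * upper) * h(upper)
    for k in range(1, steps):
        t = k * dt
        weight = 4 if k % 2 else 2
        total += weight * cmath.exp(-s * t) * h(t)
    return total * dt / 3


lam = -0.5 + 0.3j   # hypothetical exponent with Re(lam) < 0, so h lies in L2(0, infinity)
s = 2.0 + 1.0j      # hypothetical point with Re(s) > Re(lam)

approx = laplace_numeric(lambda t: cmath.exp(lam * t), s)
exact = 1 / (s - lam)
print("numerical:", approx)
print("1/(s - lam):", exact)
print("difference:", abs(approx - exact))
```

The truncation at `upper` is harmless here because the integrand decays like $e^{-2.5t}$; for points $s$ closer to the line Re $s$ = Re $\lambda$ a longer interval would be needed.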
Theorem 2.18 The Laplace transform determines a linear bijection between $L_2(0, \infty)$ and $H_2(\mathbb{C}_+)$, such that $\|\mathcal{L}h\|_{H_2(\mathbb{C}_+)} = (2\pi)^{1/2}\|h\|_{L_2(0,\infty)}$.
Proof (Sketch). If $h(t) \in L_2(0, \infty)$, then, since the integral converges absolutely and uniformly on compact subsets of $\mathbb{C}_+$, it is clear that $\mathcal{L}h(s)$ is analytic in $\mathbb{C}_+$, with derivative
$(\mathcal{L}h)'(s) = -\int_0^\infty te^{-st}h(t)\, dt = -\mathcal{L}(th)(s)$.
Now one way to proceed is as follows: if $h_n \to h$ in $L_2(0, \infty)$ norm, then clearly $\mathcal{L}h_n \to \mathcal{L}h$ pointwise and locally uniformly. So let us consider the dense subspace of $L_2(0, \infty)$ containing all functions of the form
$h(t) = \sum_{j=0}^{N-1} a_j \chi_{[jM/N, (j+1)M/N]}(t)$. We obtain
$(\mathcal{L}h)(s) = \sum_{j=0}^{N-1} a_j e^{-jMs/N}(1 - e^{-Ms/N})/s$,
which is analytic on all of $\mathbb{C}$. One can verify by direct calculation that, for such $h$, $\|\mathcal{L}h\|_{H_2(\mathbb{C}_+)} = (2\pi)^{1/2}\|h\|_{L_2(0,\infty)}$, and thus define $\mathcal{L}$ to be the unique extension of this map into $H_2(\mathbb{C}_+)$ from the dense subspace. We note that this gives the correct `pointwise' values, since if $h_n \to h$ in $L_2$, then $V^{-1}\mathcal{L}h_n \to V^{-1}\mathcal{L}h$ in $H_2$, and by Proposition 2.5 the value as an $H_2$ function determines the pointwise values $(\mathcal{L}h)(s)$ in the interior of the domain.
Finally $\mathcal{L}$ is onto $H_2(\mathbb{C}_+)$, since $H_2(\mathbb{C}_+)$ has as an orthonormal basis the set of functions $f_n(s) = \pi^{-1/2}(1 - s)^n/(1 + s)^{n+1}$. These are certainly in the image of $L_2(0, \infty)$ under $\mathcal{L}$, since $\mathcal{L}(t^n e^{-t}) = n!/(s + 1)^{n+1}$. This completes the proof.
Comments
1. There is a standard formula for the inverse transform:
$(\mathcal{L}^{-1}G)(t) = (1/2\pi i)\int_\Gamma G(s)e^{st}\, ds$,
where $\Gamma$ is the contour $\{$Re $s = \gamma\}$ for any $\gamma > 0$. We shall not require it.
2. The orthonormal functions $(z^n)_{n \ge 0}$ in $H_2$ are transformed by $V$ to $\pi^{-1/2}(1 - s)^n/(1 + s)^{n+1}$ in $H_2(\mathbb{C}_+)$. These in turn transform under $\mathcal{L}^{-1}$ to functions $p_n(t)e^{-t}$ in $L_2(0, \infty)$, where $p_n$ is a real polynomial of degree exactly $n$. (It can't be more than $n$, and if it were less then the $p_n$ wouldn't be independent.)
Also $(p_n e^{-t}, p_m e^{-t}) = \delta_{mn}/2\pi$.
However, the Laguerre polynomials $L_n(t) = e^t(t^n e^{-t})^{(n)}/n!$ satisfy
$\int_0^\infty L_n(t)L_m(t)e^{-t}\, dt = \delta_{mn}$, i.e. $(L_n(t)e^{-t/2})$ is an orthonormal sequence. It follows therefore that in fact
$p_n(t) = \pm\pi^{-1/2}L_n(2t)$. The following simple result is also useful.
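Theorem 2.19 $\mathcal{L}$ gives a contraction from $L_1(0, \infty)$ into $H_\infty(\mathbb{C}_+)$.
Proof $|\int_0^\infty e^{-st}h(t)\, dt| \le \int_0^\infty |h(t)|\, dt \cdot \sup_t |e^{-st}| \le \|h\|_{L_1(0,\infty)}$.
(Recall similarly that $\|\sum_0^\infty a_n z^n\|_{H_\infty} \le \sum_0^\infty |a_n|$.)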
Remark 2.20 The two-sided Laplace transform gives a bijection between $L_2(-\infty, \infty)$ and $L_2(i\mathbb{R})$. We shall denote this by $L$. Writing
$(Lh)(s) = \int_{-\infty}^{\infty} e^{-st}h(t)\, dt$, we see that $L$ takes $L_2(0, \infty)$ to $H_2(\mathbb{C}_+)$ and $L_2(-\infty, 0)$ to $H_2(\mathbb{C}_-)$, so that the decompositions
$L_2(\mathbb{R}) = L_2(-\infty, 0) \oplus L_2(0, \infty)$ and
$L_2(i\mathbb{R}) = H_2(\mathbb{C}_-) \oplus H_2(\mathbb{C}_+)$ are respected by $L$. For comparison, recall that $\ell_2(\mathbb{Z}) = \ell_2(\ldots, -2, -1) \oplus \ell_2(0, 1, 2, \ldots)$ corresponds to $L_2(\mathbb{T}) = H_2^\perp \oplus H_2$ for Hardy spaces in the disc.
With this elegant set of equivalent decompositions we complete the background analysis and are now ready to introduce Hankel operators themselves: we shall see that they may be considered to act on several of the spaces discussed in this chapter.
3. BASIC PROPERTIES OF HANKEL OPERATORS
Having established the necessary background, we cover a variety of topics in this chapter. Starting from the definition of a Hankel matrix we give three equivalent approaches to the task of defining a Hankel operator on $H_2$ - that is, an operator whose matrix is a Hankel matrix with respect to the usual basis, $(1, z, z^2, \ldots)$. All three approaches have been used in the literature and we choose what is in some ways the simplest, explaining how one can easily pass from this to the others.
The first big theorem is Nehari's theorem, which associates with a bounded Hankel operator a function in $L_\infty(\mathbb{T})$ (a symbol) whose norm is the same as the operator norm. We give some examples, including Hilbert's Hankel matrix.
Next we come to two famous problems of complex analysis, the Carathéodory-Fejér and Nevanlinna-Pick problems, which we state in their simplest forms (many more difficult versions have been analysed). Each can be reduced to the Nehari extension problem - that of finding a symbol of minimum norm - and we give Sarason's elegant solution to this.
Turning to the more general theory of Hankel operators, we give Kronecker's theorem, characterising finite-rank Hankel operators in terms of rational symbols. We then prove Hartman's theorem characterising compact Hankel operators - a much deeper result. En route we introduce the disc algebra, a subspace of $H_\infty$. Much of the material of this chapter is covered by the cited works of Francis, Garnett, Power and Sarason.
Recall that a Hankel matrix has the form
$\begin{pmatrix} a_0 & a_1 & a_2 & \cdots \\ a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}$
which we shall write as $(a_0, a_1, \ldots)$ when no confusion is likely.
We shall be working with operators on $\ell_2 = \ell_2(0, 1, 2, \ldots)$ and $H_2$ and we ask the question: what operators on $H_2$ could give rise to such a matrix? Evidently we require a map $\Gamma$ such that
$\Gamma(z^m) = a_m + a_{m+1}z + a_{m+2}z^2 + \cdots$.
We give three possible approaches.
Approach A
Let $R$ denote the reversion map in $L_2(\mathbb{T})$ which takes $z^m$ to $z^{-m}$. Suppose now that $g(z) = \sum_{-\infty}^{\infty} g_k z^k$ is a function in $L_\infty(\mathbb{T})$ (and hence $L_2(\mathbb{T})$): let $M_g$ denote the bounded operator on $L_2$ which consists of multiplication by $g$. Thus
$M_g R(z^m) = \sum_{k=-\infty}^{\infty} g_k z^{-m+k} = \sum_{l=-\infty}^{\infty} g_{m+l} z^l$.
Finally, if we let $P$ denote the orthogonal projection from $L_2(\mathbb{T})$ onto $H_2$, which takes negative powers of $z$ to zero, we see that
$P\sum_{l=-\infty}^{\infty} g_{m+l} z^l = \sum_{l=0}^{\infty} g_{m+l} z^l$.
Thus the map $\Gamma$ taking $f$ to $PM_gRf$ will be given by the Hankel matrix $(a_0, a_1, \ldots)$ if $g(z) = \sum_{-\infty}^{\infty} g_k z^k$ and $g_k = a_k$ for $k \ge 0$. For the operator to be bounded it is sufficient that $g \in L_\infty(\mathbb{T})$, in which case $\|\Gamma\| \le \|g\|_{L_\infty}$.
Approach B
If we start off with $M_g$, taking $z^m$ to $\sum_{-\infty}^{\infty} g_k z^{m+k}$, then following this by $R$ (as defined above) now produces
$\sum_{k=-\infty}^{\infty} g_k z^{-m-k} = \sum_{l=-\infty}^{\infty} g_{-(l+m)} z^l$.
The effect of $P$ now is to take this to $\sum_{l=0}^{\infty} g_{-(l+m)} z^l$.
Thus the map taking $f$ to $PRM_gf = PRgf$ will be given by the Hankel matrix $(a_0, a_1, \ldots)$ if $g_{-k} = a_k$ for $k \ge 0$ and the operator will be bounded if $g \in L_\infty(\mathbb{T})$. This is the approach taken in Chapter 1 of Power's book.
Approach C
We begin with $M_g$, taking $z^m$ to $\sum g_k z^{m+k}$, and then project onto $H_2^\perp$ using the map $Q = I - P$. This gives us $\sum_{l=1}^{\infty} g_{-l-m} z^{-l} \in H_2^\perp$. (Indeed some authors even define the Hankel operator as the map $QM_g$ from $H_2$ to $H_2^\perp$.)
Applying $U$ now, where $(Uh)(z) = (1/z)h(1/z)$, takes this to
$\sum_{l=0}^{\infty} g_{-l-1-m} z^l = g_{-m-1} + g_{-m-2}z + \cdots$
and so the map taking $f$ to $UQM_gf = UQgf$ will be given by the matrix $(a_0, a_1, \ldots)$ if $g_{-k-1} = a_k$ for $k \ge 0$, or by the matrix $(a_1, a_2, \ldots)$ if $g_{-k} = a_k$ for $k \ge 1$.
This approach is commonly used in the theory of discrete-time linear systems.
Approach A does have the advantage that we end up with $g_k = a_k$ and we shall summarise what it says in a theorem.
Theorem 3.1 If $g(z) = \sum_{-\infty}^{\infty} g_k z^k \in L_\infty(\mathbb{T})$, then the operator $\Gamma = PM_gR: H_2 \to H_2$ is a Hankel operator given by the matrix $(g_0, g_1, \ldots)$, where $R: H_2 \to L_2$, $M_g: L_2 \to L_2$, and $P: L_2 \to H_2$ are defined as follows: $R(\sum_0^\infty a_n z^n) = \sum_0^\infty a_n z^{-n}$; $M_g h = gh$; and $P(\sum_{-\infty}^{\infty} c_n z^n) = \sum_0^\infty c_n z^n$. Moreover $\|\Gamma\| \le \|g\|_{L_\infty}$.
Proof We saw above that $\Gamma$ was correctly defined on powers of $z$. It has a unique continuous linear extension to all $H_2$, as indicated above, since $\|P\| = 1$, $\|M_g\| \le \|g\|_{L_\infty}$, and $\|R\| = 1$.
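The composition $PM_gR$ of Approach A can be imitated on Laurent polynomials, which makes the Hankel structure visible. The sketch below is illustrative only: functions are stored as dictionaries mapping powers of $z$ to coefficients, the symbol `g` is a hypothetical trigonometric polynomial, and all helper names are our own.

```python
def reverse(f):
    """R: send z^m to z^(-m); f is a dict {power: coefficient}."""
    return {-power: coeff for power, coeff in f.items()}


def multiply(g, f):
    """M_g: multiply two Laurent polynomials given as coefficient dicts."""
    product = {}
    for i, a in g.items():
        for j, b in f.items():
            product[i + j] = product.get(i + j, 0) + a * b
    return product


def project(f):
    """P: discard all negative powers of z."""
    return {power: coeff for power, coeff in f.items() if power >= 0}


def hankel_apply(g, f):
    """Gamma f = P M_g R f."""
    return project(multiply(g, reverse(f)))


# Hypothetical symbol with coefficients g_{-2}, ..., g_3.
g = {-2: 7, -1: 5, 0: 1, 1: 2, 2: 3, 3: 4}

size = 4
# Column m of the matrix holds the coefficients of Gamma(z^m).
columns = []
for m in range(size):
    image = hankel_apply(g, {m: 1})
    columns.append([image.get(l, 0) for l in range(size)])

for row in zip(*columns):
    print(row)
```

The printed matrix depends only on $g_0, g_1, g_2, g_3$; the coefficients $g_{-1}$ and $g_{-2}$ play no part, which is the reason a symbol is far from unique.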
Conversely, given a Hankel matrix $(a_0, a_1, \ldots)$, we would like to be able to choose $g \in L_\infty$ such that $g_k = a_k$ for all $k \ge 0$. The fact that we can do this effectively is the content of our next result.
Theorem 3.2 (Nehari's Theorem). If $\Gamma: H_2 \to H_2$ is a bounded Hankel operator given by a matrix $(a_0, a_1, \ldots)$ with respect to the standard basis, then there exists a function $g \in L_\infty(\mathbb{T})$ such that $\Gamma = PM_gR$ and $\|g\|_{L_\infty} = \|\Gamma\|$.
Proof We observe that, taking $L_2$ inner products,
$(\Gamma z^n, z^m) = a_{m+n} = (\Gamma z^{n+m}, 1)$.
Therefore $(\Gamma\sum_0^N b_n z^n, \sum_0^N \bar c_m z^m) = (\Gamma(\sum_0^N b_n z^n)(\sum_0^N c_m z^m), 1)$.
Let us write $(\sum c_m z^m)^+$ to denote $\sum \bar c_m z^m$.
Thus, if $f_1$ and $f_2$ are polynomials, the linear functional $\alpha(f_1 f_2) = (\Gamma(f_1 f_2), 1) = (\Gamma f_1, f_2^+)$ satisfies
$|\alpha(f_1 f_2)| \le \|\Gamma\|\,\|f_1\|_2\,\|f_2\|_2$.
But products of polynomials are dense in $H_1$, by the proof of Corollary 2.14, and we obtain a unique extension $\alpha: H_1 \to \mathbb{C}$, given by $\alpha(f_1 f_2) = (\Gamma f_1, f_2^+)$. (This equals $(\Gamma(f_1 f_2), 1)$ if $f_1 f_2 \in H_2$.) Moreover $|\alpha(f_1 f_2)| \le \|\Gamma\|\,\|f_1\|_2\,\|f_2\|_2$, and so $\|\alpha\| \le \|\Gamma\|$, by Theorem 2.13.
Now we can regard $H_1$ as a subspace of $L_1$, using $J$, and extend $\alpha$ to a linear map $\tilde\alpha: L_1 \to \mathbb{C}$ with $\|\tilde\alpha\| = \|\alpha\|$, using the Hahn-Banach Theorem (see the Appendix). This implies that
$\tilde\alpha(f) = \int_0^{2\pi} f(e^{i\theta})h(e^{i\theta})\, d\theta/2\pi$,
for some $h \in L_\infty(\mathbb{T})$ with $\|h\|_{L_\infty} = \|\tilde\alpha\| \le \|\Gamma\|$. Thus
$a_{m+n} = (\Gamma z^{m+n}, 1) = \int_0^{2\pi} e^{i(m+n)\theta}h(e^{i\theta})\, d\theta/2\pi$,
and so $h_{-k} = a_k$ for $k = 0, 1, 2, \ldots$. Thus, taking $g(e^{i\theta}) = h(e^{-i\theta})$, we have that $\|g\|_{L_\infty} \le \|\Gamma\|$ and $a_k = g_k$ for $k = 0, 1, 2, \ldots$, as required.
Definition 3.3 A function $g \in L_\infty(\mathbb{T})$ such that $g_k = a_k$ for $k = 0, 1, 2, \ldots$ is called a symbol for the Hankel operator corresponding to the matrix $(a_0, a_1, \ldots)$. We see from Theorem 3.2 that $g$ can be chosen with $\|g\|_{L_\infty} = \|\Gamma\|$, and indeed
$\|\Gamma\| = \inf\{\|g\|_{L_\infty}: g \in L_\infty, g$ a symbol for $\Gamma\}$. The problem of finding such a $g$ is called the Nehari extension problem.
Corollary 3.4 If $g$ is a symbol for $\Gamma$, then $\|\Gamma\| = \mathrm{dist}(g, H_\infty^\perp)$, where $H_\infty^\perp$ is the space of functions $\sum_{k=1}^{\infty} c_k z^{-k}$ which are analytic and bounded in $\{|z| > 1\}$, i.e. those functions $f(z)$ such that $Uf(z) = (1/z)f(1/z) \in H_\infty$.
Proof Clearly $\|\Gamma\| = \inf\{\|g - h\|_\infty: (g - h) \in L_\infty$, a symbol for $\Gamma\} = \inf\{\|g - h\|_\infty: h \in H_\infty^\perp\}$.
Examples of symbols
Example 3.5 Let $\Gamma$ be a rank-1 Hankel matrix, which without loss of generality has the form $(1, \alpha, \alpha^2, \ldots)$. We shall take $|\alpha| < 1$, which is necessary to make the operator bounded. Since it has rank one, its operator norm is straightforward to calculate (either by noting that $(1, \bar\alpha, \bar\alpha^2, \ldots)$ is an unnormalised Schmidt vector, or by working out the operator's Hilbert-Schmidt norm by taking the sum of the squares of all the matrix entries), and it is in fact $1/(1 - |\alpha|^2)$.
The most obvious choice of symbol for $\Gamma$ is
$g_1(z) = 1 + \alpha z + \alpha^2 z^2 + \cdots = 1/(1 - \alpha z)$. Thus $g_1(z) \in H_\infty$ and $\|g_1\|_\infty = 1/(1 - |\alpha|)$. This is too big to be optimal.
If, however, we consider
$g_2(z) = g_1(z) - \bar\alpha/(z(1 - |\alpha|^2))$,
adding on just one anti-analytic term, we obtain, after a small calculation,
$g_2(z) = (1/(1 - |\alpha|^2)) \cdot (z - \bar\alpha)/(z(1 - \alpha z))$,
which is interesting because the second factor is actually a Blaschke product and hence an inner function, i.e. it has constant modulus one on the unit circle. Therefore $\|g_2\|_{L_\infty} = 1/(1 - |\alpha|^2)$, which is optimal.
We recall that the map taking $f(z)$ to $(1/z)f(1/z)$ is an isomorphism between $H_\infty$ and $H_\infty^\perp$, and if we translate the above arguments across, we see that the best approximation of the $H_\infty^\perp$ function $F(z) = 1/(z - \alpha)$ by an $H_\infty$ function is just the constant $\bar\alpha/(1 - |\alpha|^2)$, with $L_\infty$ error $1/(1 - |\alpha|^2)$.
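The claims of Example 3.5 can be checked numerically for a particular $\alpha$. The sketch below is illustrative only: the value of `alpha`, the number of sample points and the size of the truncated matrix are hypothetical choices.

```python
import cmath
import math

alpha = 0.6 + 0.3j                      # hypothetical value with |alpha| < 1
bound = 1 / (1 - abs(alpha) ** 2)       # claimed operator norm


def g1(z):
    """The analytic symbol 1/(1 - alpha z)."""
    return 1 / (1 - alpha * z)


def g2(z):
    """The symbol obtained by adding one anti-analytic term to g1."""
    return g1(z) - alpha.conjugate() / (z * (1 - abs(alpha) ** 2))


circle = [cmath.exp(2j * math.pi * k / 720) for k in range(720)]
sup_g1 = max(abs(g1(z)) for z in circle)
mods_g2 = [abs(g2(z)) for z in circle]

# Hilbert-Schmidt norm of a 200 x 200 section of the matrix (alpha^(i+j)).
hs_norm = math.sqrt(
    sum(abs(alpha) ** (2 * (i + j)) for i in range(200) for j in range(200))
)

print("1/(1 - |alpha|^2):", bound)
print("sampled sup |g1|:", sup_g1, "compared with 1/(1 - |alpha|) =", 1 / (1 - abs(alpha)))
print("min and max of |g2|:", min(mods_g2), max(mods_g2))
print("Hilbert-Schmidt norm of the section:", hs_norm)
```

For a rank-one operator the Hilbert-Schmidt norm coincides with the operator norm, so the last printed value should agree with the constant modulus of $g_2$.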
Example 3.6 (Hilbert's Matrix) This is the Hankel matrix $H = (1, 1/2, 1/3, \ldots)$ (in the usual notation). Hilbert's inequality (proved by Schur) says that $\|H\| = \pi$. By taking
$g(e^{i\theta}) = i(\pi - \theta)e^{-i\theta}$ $(0 \le \theta < 2\pi)$, we have a function with $L_\infty$ norm at most $\pi$, and
$g_n = \int_0^{2\pi} i(\pi - \theta)e^{-i\theta}e^{-in\theta}\, d\theta/2\pi = [(\theta - \pi)e^{-i(n+1)\theta}/(2\pi(n + 1))]_0^{2\pi} - \int_0^{2\pi} (1/(n + 1))e^{-i(n+1)\theta}\, d\theta/2\pi = 1/(n + 1)$ $(n \ge 0)$.
Thus $g$ is a suitable symbol for $H$.
Two famous problems
1. Carathéodory-Fejér Given a polynomial $a_0 + a_1 z + \cdots + a_n z^n$, choose coefficients $a_{n+1}, a_{n+2}, \ldots$ to minimize $\|\sum_0^\infty a_k z^k\|_{H_\infty}$, i.e. minimize $\|a_0 + a_1 z + \cdots + a_n z^n + z^{n+1}f(z)\|_{H_\infty}$ over all $f(z)$ analytic (and bounded) in $\{|z| < 1\}$.
2. Nevanlinna-Pick Given $z_1, \ldots, z_n$ and $w_1, \ldots, w_n$, complex numbers, with $|z_k| < 1$ for each $k$, find a function $f$ analytic in $\{|z| < 1\}$ such that $f(z_k) = w_k$ for $k = 1, \ldots, n$ and with $\|f\|_\infty$ minimal.
Theorem 3.7 The Carathéodory-Fejér problem can be reduced to the Nehari extension problem.
Proof Let $h(z) = a_0 + \cdots + a_n z^n$; then we have to calculate $\mathrm{dist}(h(z), z^{n+1}H_\infty)$ and to find a closest $z^{n+1}f(z)$ in the $H_\infty$ norm. Equivalently we need to determine $\mathrm{dist}(h(z)/z^{n+1}, H_\infty)$. But this is the same as $\mathrm{dist}((1/z)h(1/z)z^{n+1}, H_\infty^\perp)$, using the isometric isomorphism $U: H_\infty \to H_\infty^\perp$, $Uk(z) = (1/z)k(1/z)$. Thus solving the Carathéodory-Fejér problem for $a_0 + \cdots + a_n z^n$ is equivalent to solving the Nehari problem for $a_n + a_{n-1}z + \cdots + a_0 z^n$, and the minimum norm is just the operator norm of the Hankel matrix $(a_n, \ldots, a_0)$.
Example
Let $h(z) = 1 + 2z$; thus $\|h\|_{H_\infty} = 3$. We have that
$\inf\{\|1 + 2z + a_2 z^2 + \cdots\|_{H_\infty}: a_2, a_3, \ldots \in \mathbb{C}\}$ is the operator norm of the Hankel matrix $(2, 1, 0, 0, \ldots)$, or equivalently of the 2-by-2 matrix
$\begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}$,
that is, its largest singular value.
Since the matrix is self-adjoint, with eigenvalues $1 \pm \sqrt 2$, we see that in fact the above infimum is $1 + \sqrt 2$, which is strictly less than 3.
In fact the optimal extension here is the function $(1 + \sqrt 2)(z + a)/(1 + az)$, where $a = \sqrt 2 - 1$; it is just a constant multiple of a Blaschke product, and constant in modulus on the unit circle.
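The numbers in this example can be confirmed with a few lines of arithmetic. The sketch below is illustrative only: it recovers the eigenvalues from the characteristic polynomial and estimates the first two Taylor coefficients of the stated extension by a discrete Fourier average over a hypothetical grid of 256 points.

```python
import cmath
import math

# Eigenvalues of the symmetric matrix [[2, 1], [1, 0]] from the characteristic
# polynomial t^2 - (trace) t + (determinant) = 0.
trace, det = 2 + 0, 2 * 0 - 1 * 1
root = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = ((trace - root) / 2, (trace + root) / 2)

a = math.sqrt(2) - 1


def extension(z):
    """The stated optimal extension (1 + sqrt 2)(z + a)/(1 + a z)."""
    return (1 + math.sqrt(2)) * (z + a) / (1 + a * z)


n_points = 256
circle = [cmath.exp(2j * math.pi * k / n_points) for k in range(n_points)]


def taylor_coefficient(k):
    """Approximate the k-th Taylor coefficient by a discrete Fourier average."""
    return sum(extension(z) * z ** (-k) for z in circle) / n_points


moduli = [abs(extension(z)) for z in circle]
print("eigenvalues:", eigenvalues)
print("first Taylor coefficients:", taylor_coefficient(0), taylor_coefficient(1))
print("min and max modulus on the circle:", min(moduli), max(moduli))
```

The first two coefficients should be close to 1 and 2, so the function really does extend $1 + 2z$, and its modulus on the circle should equal the larger eigenvalue.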
Theorem 3.8 The Nevanlinna-Pick problem can be reduced to the Nehari extension problem.
Proof Certainly there do exist bounded analytic functions in the unit disc with $f(z_i) = w_i$ for $i = 1, \ldots, n$: for example a polynomial interpolant. Let $g(z)$ be one such. Then if $h(z)$ is another, we have that $g - h \in H_\infty$, with $(g - h)(z_1) = \cdots = (g - h)(z_n) = 0$. It thus follows, e.g. as in Proposition 2.7, that $g - h = B(z)k(z)$, where $B$ is a Blaschke product on the zeroes, a product of factors $(z - z_i)/(1 - \bar z_i z)$. Conversely, if $k(z) \in H_\infty$, then $g(z) + B(z)k(z)$ is another interpolant. We thus seek to find $\inf\{\|g + Bk\|_\infty: k \in H_\infty\}$. But this is just $\inf\{\|gB^{-1} + k\|_\infty: k \in H_\infty\} = \mathrm{dist}(gB^{-1}, H_\infty)$, since $B(z)$ is inner, that is $|B(e^{i\theta})| = 1$ almost everywhere (in this case, everywhere). This reduces to the Nehari problem, by the proof of Theorem 3.7.
Example
Find a minimum norm interpolant $f$ such that $f(0) = 1$ and $f(1/2) = 0$. We start with $g(z) = 1 - 2z$, which certainly takes the required values. The set of all possible interpolants $g + Bk$ (as above) is $\{(1 - 2z) + (z(z - 1/2)/(1 - z/2))k(z), k \in H_\infty\}$. Thus we seek $k$ to minimize $\|gB^{-1} + k\|_\infty$, where
$gB^{-1} + k = (1 - 2z)(1 - z/2)/(z(z - 1/2)) + k = (z - 2)/z + k$.
The minimum is therefore 2, and taking $k = -1$ (a constant function) will do.
Thus $f(z) = (1 - 2z) - (z(z - 1/2)/(1 - z/2))$ is optimal; this is $-2(z - 1/2)/(1 - z/2) = (1 - 2z)/(1 - z/2)$, which is again a constant multiple of an inner function.
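A quick numerical check of this interpolation example follows. It is a sketch for illustration only; the 720 sample points on the circle are an arbitrary choice.

```python
import cmath
import math


def g(z):
    """The initial interpolant 1 - 2z."""
    return 1 - 2 * z


def f(z):
    """The stated optimal interpolant (1 - 2z)/(1 - z/2)."""
    return (1 - 2 * z) / (1 - z / 2)


circle = [cmath.exp(2j * math.pi * k / 720) for k in range(720)]
sup_g = max(abs(g(z)) for z in circle)
mods_f = [abs(f(z)) for z in circle]

print("f(0), f(1/2):", f(0), f(0.5))
print("sup |g| on the circle:", sup_g)
print("min and max |f| on the circle:", min(mods_f), max(mods_f))
```

Both functions interpolate the data, but the sampled supremum of $|g|$ is 3 while $|f|$ is constant and equal to 2 on the circle.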
We intend now to give an explicit solution to the Nehari problem: however we require a technical lemma on $H_2$ functions before we do this.
Lemma 3.9 If $f \in H_2$, and $f$ is not identically zero, then
$\int_0^{2\pi} \log|f(e^{i\theta})|\, d\theta/2\pi > -\infty$,
and hence $f(e^{i\theta}) \ne 0$ almost everywhere.
Proof (We are interpreting $H_2$ as a subspace of $L_2$, as usual.) Without loss of generality $f(0) \ne 0$, as otherwise we may consider $f(z)/z^n$ for some suitable $n$. Writing $f_r(z) = f(rz)$ for $r < 1$, we recall that $\|f_r - f\|_{L_2} \to 0$, and hence there exist $r_m \to 1$ such that $f_{r_m}(e^{i\theta}) \to f(e^{i\theta})$ almost everywhere. We recall from the proof of Lemma 2.8 that, if $(z_k)$ are the zeroes of $f$, we have
$\log|f(0)| + \sum_{|z_k|<r} \log(r/|z_k|) = \int_0^{2\pi} \log|f(re^{i\theta})|\, d\theta/2\pi$,
so that
$\log|f(0)| \le \int_0^{2\pi} \log|f(re^{i\theta})|\, d\theta/2\pi$.
Let us write $\log(x) = \log^+(x) + \log^-(x)$, for $x \ge 0$, where $\log^+(x) = \max(0, \log(x))$, and $\log^-(x) = \min(0, \log(x))$. Then, since $\log^+(x) \le x^2$,
$\int_0^{2\pi} \log^+|f(r_m e^{i\theta})|\, d\theta/2\pi \le \int_0^{2\pi} |f(r_m e^{i\theta})|^2\, d\theta/2\pi = \|f_{r_m}\|_2^2 \le \|f\|_2^2$.
Hence
$\int_0^{2\pi} \log^-|f(r_m e^{i\theta})|\, d\theta/2\pi \ge \log|f(0)| - \|f\|_2^2$
for each $m$. Thus, by Fatou's Lemma applied to minus the integrals above (see the Appendix), we obtain
$\int_0^{2\pi} \log^-|f(e^{i\theta})|\, d\theta/2\pi \ge \log|f(0)| - \|f\|_2^2$,
and hence
$\int_0^{2\pi} \log|f(e^{i\theta})|\, d\theta/2\pi > -\infty$,
as required.
Theorem 3.10 (Sarason) If $\Gamma: H_2 \to H_2$ corresponds to the Hankel matrix $(a_0, a_1, \ldots)$, and $f \in H_2$, $f$ not the zero function, is such that $\|\Gamma f\| = \|\Gamma\|\,\|f\|$, then there is a unique symbol $g$ for $\Gamma$ of minimal norm, $\|g\| = \|\Gamma\|$, and it is given by $g = \Gamma f/Rf$, i.e. $g(z) = (\Gamma f)(z)/f(1/z)$. Moreover $|g(e^{i\theta})|$ is constant almost everywhere.
Proof Let $g$ be a symbol of minimal norm, which exists by Theorem 3.2. Recall that $\Gamma f = PM_gRf$ so that
$\|\Gamma\|\,\|f\| = \|\Gamma f\|_2 \le \|gRf\|_2 \le \|g\|_\infty\,\|Rf\|_2$.
But we have equality throughout, and hence $\Gamma f = gRf$, so that $g = \Gamma f/Rf$, since $f(e^{i\theta})$, and hence $(Rf)(e^{i\theta}) = f(e^{-i\theta})$, is nonzero almost everywhere. Moreover, since $\|gRf\|_2 = \|g\|_\infty\,\|Rf\|_2$, we have $|g(e^{i\theta})| = \|g\|_\infty$ almost everywhere.
Since for a compact operator $\Gamma$ with $\Gamma x = \sum_i \sigma_i(x, v_i)w_i$, $\Gamma$ attains its norm ($\sigma_1$) at $v_1$, we can solve the Nehari problem explicitly for compact operators.
Example For the Hankel matrix $(1, \alpha, \alpha^2, \ldots)$ a symbol was shown (in Example 3.5) to be
$(1/(1 - |\alpha|^2))\,(1 - \alpha z)^{-1}/(1 - \bar\alpha z^{-1})^{-1} = (1/(1 - |\alpha|^2))\,(z - \bar\alpha)/(z(1 - \alpha z))$.
Here $f(z) = (1 - \bar\alpha z)^{-1}$ and $(\Gamma f)(z) = (1 - \alpha z)^{-1}/(1 - |\alpha|^2)$.
We are now interested in knowing what conditions on $g(z) = \sum_{-\infty}^{\infty} g_n z^n \in L_\infty$ force the Hankel matrix $\Gamma$ to be finite rank or compact. Clearly $\Gamma = 0$ if and only if $g(z) \in H_\infty^\perp$ (so that there are no nonzero $g_n$ with $n \ge 0$).
Theorem 3.11 (Kronecker's Theorem) The Hankel matrix $(a_0, a_1, \ldots)$ has finite rank if and only if $f(z) = a_0 + a_1 z + a_2 z^2 + \cdots$ is a rational function of $z$, which is if and only if $Uf(z) = a_0/z + a_1/z^2 + \cdots$ is a rational function of $z$. The rank of $(a_0, a_1, \ldots)$ is equal to the number of poles of $Uf(z)$, which must be in the open disc $\{|z| < 1\}$ if $(a_0, a_1, \ldots)$ determines an operator.
Proof Clearly $f(z)$ is rational if and only if $Uf$ is, so we shall consider $Uf$. If the matrix has finite rank $r$, say, then the first $(r + 1)$ rows are linearly dependent, and hence
$\sum_{i=0}^{r} \lambda_i a_{i+m} = 0$ for all $m \ge 0$, for some $\lambda_0, \ldots, \lambda_r$ not all zero. But then
$(\lambda_0 + \lambda_1 z + \cdots + \lambda_r z^r)(a_0/z + a_1/z^2 + \cdots)$ is a polynomial in $z$ of degree $r - 1$, since the coefficient of a negative power $z^{-m-1}$ is
$\lambda_0 a_m + \lambda_1 a_{m+1} + \cdots + \lambda_r a_{m+r} = 0$,
and hence $Uf(z)$ is a rational function of degree $r$.
Conversely, if $P(z)\sum_0^\infty a_i z^{-1-i} = Q(z)$, for $P(z)$ a polynomial of degree at most $r$ and $Q$ another polynomial, then working backwards we obtain a recurrence relation between the $(a_i)$ and the rank is finite.
Note that if $\sum_{i=0}^{r} \lambda_i a_{i+m} = 0$ then the poles of $Uf(z)$ are the roots of $\lambda_0 + \lambda_1 z + \cdots + \lambda_r z^r = 0$, and thus have to satisfy $|z| < 1$ in order that this determine a bounded operator, since the roots of this polynomial are involved in the general solution to the recurrence relation for the $(a_i)$ and $a_i \to 0$ if and only if $|z_j| < 1$ for all $j$. For example, one may easily verify Kronecker's theorem for $(1, \alpha, \alpha^2, \ldots)$ and $(a_0, a_1, \ldots, a_n, 0, 0, \ldots)$.
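Kronecker's theorem can be explored on finite sections of a Hankel matrix using exact rational arithmetic. The sketch below is illustrative only: the three sequences and the section size are hypothetical choices, and a finite section can in general only bound the rank from below, although for these sequences a 6-by-6 section already attains it. Poles are counted with multiplicity, so the finitely supported sequence $(1, 2, 3, 0, \ldots)$, for which $Uf(z) = 1/z + 2/z^2 + 3/z^3$ has a triple pole at 0, gives rank 3.

```python
from fractions import Fraction


def hankel_section(seq, size):
    """The size x size matrix with (i, j) entry seq[i + j]."""
    return [[seq[i + j] for j in range(size)] for i in range(size)]


def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [row[:] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[found])]
        found += 1
    return found


size = 6
terms = 2 * size - 1

# (1, a, a^2, ...) with a = 1/2: Uf has a single pole at 1/2.
geometric = [Fraction(1, 2) ** n for n in range(terms)]
# a_n = (1/2)^n + (1/3)^n: Uf has simple poles at 1/2 and 1/3.
two_poles = [Fraction(1, 2) ** n + Fraction(1, 3) ** n for n in range(terms)]
# (1, 2, 3, 0, 0, ...): Uf has a pole of order 3 at 0.
finite = [Fraction(v) for v in (1, 2, 3)] + [Fraction(0)] * (terms - 3)

for name, seq in (("geometric", geometric), ("two poles", two_poles), ("finite", finite)):
    print(name, rank(hankel_section(seq, size)))
```

Exact fractions are used so that the rank is not blurred by rounding; with floating-point entries one would need a tolerance when testing for zero pivots.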
Corollary 3.12 $g(z) = \sum_{-\infty}^{\infty} g_k z^k$ determines a finite rank Hankel operator if and only if $g(z) \in H_\infty^\perp + RH_\infty$, where $RH_\infty$ is the set of rational $H_\infty$ functions.
Proof Immediate from Theorem 3.11.
We are now interested in characterising compact Hankel operators. At this stage it is useful to introduce one or two new normed spaces.
Definition 3.13 The Disc Algebra $A_0$ is defined to be $H_\infty \cap C(\mathbb{T})$, i.e. the set of those analytic functions in the disc which are bounded and determine continuous boundary functions.
Note that $C(\mathbb{T}) \not\subset H_\infty$, since for example $\bar z = 1/z$ is in $C(\mathbb{T})$ but is not analytic. Moreover $H_\infty \not\subset C(\mathbb{T})$, since e.g. $e^{-M(z)} = e^{-(1-z)/(1+z)}$ is in $H_\infty$: at the boundary it takes the values $e^{iy}$ $(y \in \mathbb{R})$, but as $z \to -1$, $y \to \infty$ and the function is discontinuous. We therefore do get something new.
We shall also be interested in the space $H_\infty + C(\mathbb{T}) = \{f + g: f \in H_\infty, g \in C(\mathbb{T})\}$, regarded as a subspace of $L_\infty(\mathbb{T})$. Also in $H_\infty^\perp + C(\mathbb{T})$, defined similarly. Note that $U$ is an automorphism of $C(\mathbb{T})$, where $(Uf)(z) = (1/z)f(1/z)$, and hence $U: H_\infty + C(\mathbb{T}) \to H_\infty^\perp + C(\mathbb{T})$ is an isometric isomorphism.
Theorem 3.14 If $g(z) \in H_\infty^\perp + C(\mathbb{T})$, then $g$ determines a compact Hankel operator.
Proof Since $H_\infty^\perp$ functions give the zero operator, we may assume without loss of generality that $g \in C(\mathbb{T})$. By the Stone-Weierstrass theorem (see the Appendix) there exist $g_n(z)$, $n = 1, 2, \ldots$, such that $g_n(e^{i\theta})$ is a trigonometric polynomial for each $n$, with $\|g_n - g\|_\infty \to 0$. Thus $g_n(z) = p_n(z) + q_n(1/z)$, with $p_n$ and $q_n$ polynomials. But now $\|\Gamma_{g_n} - \Gamma_g\| \le \|g_n - g\|_\infty \to 0$, and so $\Gamma_g$ is the limit of a sequence of finite rank operators, and hence is compact.
Corollary 3.15 If $(a_n)_0^\infty \in \ell_1$, then the Hankel operator $(a_0, a_1, \ldots)$ is compact.
Proof $\sum_0^\infty a_n z^n$ converges uniformly on $\{|z| \le 1\}$, giving a function in $C(\mathbb{T})$ as a symbol for $\Gamma$.
(This result can also be proved fairly directly - see the Exercises.)
For the converse, we start by showing that if $\Gamma = \Gamma_g$ is compact, then $g$ is in the closure of $H_\infty^\perp + C(\mathbb{T})$.
Proposition 3.16 If $\Gamma = \Gamma_g$ is compact, then there exist $g_n \in H_\infty^\perp + C(\mathbb{T})$ such that $\|g - g_n\|_\infty \to 0$.
Proof We write $S$ for the right shift on $\ell_2$, i.e. $S(c_0, c_1, \ldots) = (0, c_0, c_1, \ldots)$. Now for any finite rank operator $T$, with
$Tx = \sum_i \sigma_i(x, v_i)w_i$, we have
$TS^n x = \sum_i \sigma_i(x, S^{*n}v_i)w_i$,
and $S^*$ is the left shift. Hence $\|TS^n\| \to 0$, since $S^{*n}v_i \to 0$ for each $i$. Moreover, if $T$ is a compact operator, then there is a sequence $(T_k)$ of finite rank operators, with $\|T_k - T\| \to 0$. Given $\varepsilon > 0$, choose $T_k$ such that $\|T_k - T\| < \varepsilon$; then $\|T_k S^n - TS^n\| < \varepsilon$, and it follows that $\|TS^n\| < 2\varepsilon$ for large $n$; hence $\|TS^n\| \to 0$.
Checking the effect on the basis vectors of $H_2$, we see that $\Gamma_g S^n = \Gamma_{z^{-n}g}$. Thus, by Theorem 3.2, there exist $h_n \in H_\infty^\perp$ such that $\|z^{-n}g + h_n\| \to 0$, that is $\|g + z^n h_n\| \to 0$, and taking $g_n = -z^n h_n \in H_\infty^\perp + C(\mathbb{T})$ gives the required result. Thus a compact Hankel operator is actually the limit of a sequence of finite rank Hankel operators.
We wish to show that $H_\infty^\perp + C(\mathbb{T})$ is closed in $L_\infty(\mathbb{T})$, or equivalently, using the isometry U, that $H_\infty + C(\mathbb{T})$ is closed in $L_\infty$. It is convenient now to prove one or two further facts about the Poisson kernel $P(z, w) = (|z|^2 - |w|^2)/|z - w|^2$, which relate to Proposition 2.10.
For $f \in L_\infty(\mathbb{T})$, $r < 1$ and $|w| = 1$, write $P_r f(w) = \int_0^{2\pi} P(e^{i\theta}, rw) f(e^{i\theta})\, d\theta/2\pi$. Note that Proposition 2.10 implies that $\|P_r f\|_\infty \le \|f\|_\infty$.
Lemma 3.17 (i) If $f(w) = w^n$, then $P_r f(w) = r^{|n|} w^n$;
(ii) If $f \in H_\infty$, then $P_r f \in A_0$;
(iii) If $f \in C(\mathbb{T})$, then $P_r f \to f$ uniformly as $r \to 1$.
Proof (i) For $n \ge 0$, this follows from Proposition 2.10 (i); for $n < 0$ the result can be obtained by taking complex conjugates.
(ii) This follows since $P_r f = f_r$ where $f_r(z) = f(rz)$.
(iii) Given $\varepsilon > 0$, choose g such that $g(e^{i\theta}) = \sum_{-N}^{N} a_n e^{in\theta}$ and $\|f - g\|_\infty < \varepsilon/3$. Then $P_r g(e^{i\theta}) = \sum_{-N}^{N} a_n r^{|n|} e^{in\theta}$; thus
$\|f - P_r f\| \le \|f - g\| + \|g - P_r g\| + \|P_r g - P_r f\| \le \varepsilon/3 + \varepsilon/3 + \|g - P_r g\| < \varepsilon$, for r sufficiently close to 1.
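Lemma 3.17 (i) and (iii) can be seen at work on a grid. The sketch below is an illustration only: the test function $f(e^{i\theta}) = |\theta - \pi|$ is an assumed example, the Fourier coefficients are approximated by a discrete Fourier transform on 4096 points, and $P_r$ is applied by multiplying the n-th coefficient by $r^{|n|}$, as part (i) suggests. The helper name `poisson_smooth` is hypothetical.

```python
import numpy as np

M = 4096
theta = 2 * np.pi * np.arange(M) / M
f = np.abs(theta - np.pi)                 # assumed test function, continuous on T
n = np.fft.fftfreq(M, d=1.0 / M)          # integer frequencies in FFT ordering
coeffs = np.fft.fft(f)

def poisson_smooth(r):
    """Apply P_r on the grid by scaling the n-th Fourier mode by r**|n|."""
    return np.fft.ifft(coeffs * r ** np.abs(n)).real

radii = (0.5, 0.9, 0.99, 0.999)
sup_errors = [np.max(np.abs(poisson_smooth(r) - f)) for r in radii]
for r, e in zip(radii, sup_errors):
    print(f"r={r}: sup |P_r f - f| on the grid = {e:.4f}")
```

The printed sup-norm errors decrease as r approaches 1, which is the uniform convergence asserted in part (iii).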
Corollary 3.18 If $f \in C(\mathbb{T})$, then $\mathrm{dist}(f, H_\infty) = \mathrm{dist}(f, A_0)$.
Proof There is a function $g \in H_\infty$ such that $\|f - g\|_\infty = \mathrm{dist}(f, H_\infty)$, by Nehari's theorem. Let $f_r = P_r f$ and $g_r = P_r g$. Then $\|f - g_r\| \le \|f - f_r\| + \|f_r - g_r\|$, which, given $\varepsilon > 0$, is at most $\varepsilon + \|f - g\|_\infty$ for r sufficiently close to 1, by Lemma 3.17, so that $\mathrm{dist}(f, A_0) \le \mathrm{dist}(f, H_\infty)$, since $g_r \in A_0$.
Theorem 3.19 (Sarason). $H_\infty + C(\mathbb{T})$ is a closed subspace of $L_\infty(\mathbb{T})$.
Proof Given $h \in L_\infty$ in the closure of $H_\infty + C(\mathbb{T})$, there exist $f_n \in H_\infty$, $g_n \in C(\mathbb{T})$ with $\|h - (f_n + g_n)\| < 2^{-n}$. It follows that $\mathrm{dist}(g_n - g_{n+1}, H_\infty) < 2^{-n+1}$, and so, by Corollary 3.18, there exist $k_n \in A_0$ such that $\|g_n - g_{n+1} - k_n\|_\infty < 2^{-n+1}$. Let $G_1 = g_1$ and $G_n = g_n + k_1 + \ldots + k_{n-1} \in C(\mathbb{T})$ for $n > 1$. Hence $\|G_n - G_{n+1}\| < 2^{-n+1}$, so that the functions $G_n$ converge to some $g \in C(\mathbb{T})$.
Now write $F_n = (f_n + g_n) - G_n$, which are in $H_\infty$ since $g_n - G_n \in H_\infty$. Thus $F_n \to h - g$, so $h - g \in H_\infty$. Finally $h = (h - g) + g \in H_\infty + C(\mathbb{T})$, as required.
To sum up:
Theorem 3.20 (Hartman's theorem) $\Gamma = \Gamma_g$ is compact if and only if $g \in H_\infty^\perp + C(\mathbb{T})$.
Proof Use Theorem 3.14, Proposition 3.16 and Theorem 3.19.
4. HANKEL OPERATORS ON THE HALF PLANE
In the previous chapter we treated Hankel operators defined by means of a Hankel matrix: we now turn our attention to a second kind of Hankel operator, the Hankel Integral Operator on $L_2(0, \infty)$. With the aid of the Laplace transform we are able to determine the action which such operators induce on the Hardy space $H_2(\mathbb{C}_+)$.
As with the Hankel operators of Chapter 3, the Hankel operators on $H_2(\mathbb{C}_+)$ can be regarded as being produced by applying an inversion, followed by a multiplication and finally an orthogonal projection. Using the isometric isomorphism between $H_2(\mathbb{C}_+)$ and $H_2$ that was defined in Chapter 2, we then see that Hankel integral operators correspond to Hankel operators on $H_2$.
This correspondence allows us to reap several corollaries, deducing versions of the Nehari, Kronecker and Hartman theorems from the corresponding results of Chapter 3.
The references most nearly related to the material of this chapter include Glover, Glover et al, and Power, although some of the calculations appear to have the status of folklore.
We begin with the Hankel integral operators on $L_2(0, \infty)$.
Proposition 4.1. If $h(x) \in L_1(0, \infty) \cap L_2(0, \infty)$, then the Hankel Integral Operator $\Gamma_h: L_2(0, \infty) \to L_2(0, \infty)$ given by
$(\Gamma_h u)(x) = \int_0^\infty h(x + y)\, u(y)\, dy$
is well-defined and bounded, with $\|\Gamma_h\| \le \|h\|_1$.
Proof Since $h \in L_2(0, \infty)$, it is clear from the Cauchy-Schwarz inequality that $\Gamma_h u$ is defined pointwise. Now, if $v \in L_2(0, \infty)$, we have
$(\Gamma_h u, v) = \int_0^\infty \int_0^\infty h(x + y)\, u(y)\, \overline{v(x)}\, dy\, dx$,
so that
$|(\Gamma_h u, v)| \le \int_{z=0}^\infty \int_{y=0}^{z} |h(z)|\, |u(y)|\, |v(z - y)|\, dy\, dz$,
letting z = x + y and using Fubini's theorem (see the Appendix) to justify rearranging the integral. This is in turn at most
$\int_0^\infty |h(z)| \int_0^z |u(y)|\, |v(z - y)|\, dy\, dz \le \|h\|_1 \|u\|_2 \|v\|_2$,
and hence $\|\Gamma_h\| \le \|h\|_1$.
We shall see later that $\Gamma_h$ is actually compact as well.
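The bound $\|\Gamma_h\| \le \|h\|_1$ can be checked on a discretised version of the operator. This is a rough sketch under assumptions of my own choosing: the kernel is $h(t) = e^{-t}$ (so $\|h\|_1 = 1$), the half-line is cut off at L = 20, and the integral is replaced by a midpoint rule with step 0.02. For this kernel, $h(x + y) = e^{-x}e^{-y}$ is rank one and the exact operator norm is $\int_0^\infty e^{-2x}\,dx = 1/2$.

```python
import numpy as np

dx, L = 0.02, 20.0
x = (np.arange(int(L / dx)) + 0.5) * dx      # midpoint grid on (0, L)

def h(t):
    return np.exp(-t)                        # assumed kernel, with ||h||_1 = 1

# Quadrature discretisation of (Gamma_h u)(x) = int_0^inf h(x + y) u(y) dy.
K = h(np.add.outer(x, x)) * dx
op_norm = np.linalg.norm(K, 2)
h_l1 = h(x).sum() * dx

print(f"discretised operator norm ~ {op_norm:.4f}, ||h||_1 ~ {h_l1:.4f}")
```

The computed norm stays below the $L_1$ norm of h, and for this particular kernel it is close to 1/2.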
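Corollary 4.2 If $h \in L_1 \cap L_2(0, \infty)$ and $u \in L_1 \cap L_2(0, \infty)$, then $\Gamma_h u \in L_1 \cap L_2(0, \infty)$, and also $\|\Gamma_h u\|_1 \le \|h\|_1 \|u\|_1$.
Proof As for Proposition 4.1, with $v \in L_\infty(0, \infty)$ this time:
$\left| \int_0^\infty (\Gamma_h u)(x)\, \overline{v(x)}\, dx \right| \le \|h\|_1 \|u\|_1 \|v\|_\infty$,
which implies the result.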
Notation We shall let g(x) and h(x) be functions in $L_1 \cap L_2(0, \infty)$, and let G(s) and H(s) be their Laplace transforms, in $H_2 \cap H_\infty(\mathbb{C}_+)$, which actually converge uniformly on the imaginary axis s = iy.
Lemma 4.3 With g, h, G and H as above, and $s \in i\mathbb{R}$,
$L(\Gamma_g h)(s) + L(\Gamma_h g)(-s) = G(s) H(-s)$.
Proof The left hand side is
$\int_0^\infty \int_0^\infty g(x + y)\, h(y)\, e^{-sx}\, dy\, dx + \int_0^\infty \int_0^\infty h(x + z)\, g(z)\, e^{sx}\, dz\, dx$.
Write z = x + y in the first integral, and y = x + z in the second. The Jacobians for the change of variables come out to be unity. We thus obtain
$\int_{z=0}^\infty \int_{y=0}^{z} g(z)\, h(y)\, e^{-sz+sy}\, dy\, dz + \int_{y=0}^\infty \int_{z=0}^{y} h(y)\, g(z)\, e^{-sz+sy}\, dz\, dy$
$= \int_0^\infty \int_0^\infty h(y)\, e^{sy}\, g(z)\, e^{-sz}\, dy\, dz = G(s) H(-s)$.
This enables us to identify the effects of the integral operator in the space $H_2(\mathbb{C}_+)$.
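As a check on Lemma 4.3, here is a worked instance with the assumed sample functions $g(x) = e^{-x}$ and $h(x) = e^{-2x}$ (these particular choices are not from the text); all integrals are elementary.

```latex
\begin{align*}
G(s) &= \frac{1}{s+1}, \qquad H(s) = \frac{1}{s+2}, \qquad H(-s) = \frac{1}{2-s},\\
(\Gamma_g h)(x) &= \int_0^\infty e^{-(x+y)} e^{-2y}\,dy = \tfrac{1}{3}e^{-x},
  \qquad L(\Gamma_g h)(s) = \frac{1}{3(s+1)},\\
(\Gamma_h g)(x) &= \int_0^\infty e^{-2(x+y)} e^{-y}\,dy = \tfrac{1}{3}e^{-2x},
  \qquad L(\Gamma_h g)(-s) = \frac{1}{3(2-s)},\\
L(\Gamma_g h)(s) + L(\Gamma_h g)(-s)
  &= \frac{(2-s) + (s+1)}{3(s+1)(2-s)} = \frac{1}{(s+1)(2-s)} = G(s)H(-s).
\end{align*}
```

The first summand is analytic in the right half plane and the second in the left half plane, which is the splitting used in the next corollary.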
Corollary 4.4 With g, h, G, H as above, we have $L(\Gamma_g h)(s) = P_+\{G(s)H(-s)\}$, where $P_+: L_2(i\mathbb{R}) \to H_2(\mathbb{C}_+)$ is the orthogonal projection. That is, $L(\Gamma_g h)(s) = P_+ M_G R(H(s))$, where $M_G: L_2(i\mathbb{R}) \to L_2(i\mathbb{R})$ is multiplication by $G \in L_\infty(i\mathbb{R})$, and $R: H_2(\mathbb{C}_+) \to L_2(i\mathbb{R})$ is defined by $(RH)(s) = H(-s)$.
Proof As Lemma 4.3 showed, there is a simple decomposition of $G(s)H(-s) \in L_2(i\mathbb{R})$ into an $H_2(\mathbb{C}_+)$ part, namely $L(\Gamma_g h)(s)$, and an $H_2(\mathbb{C}_-)$ part, namely $L(\Gamma_h g)(-s)$.
Theorem 4.5 If $h \in L_1 \cap L_2(0, \infty)$, then $\Gamma_h$ has a unique continuous extension to an operator on $L_2(0, \infty)$, or equivalently an operator $\tilde{\Gamma}_h$ on $H_2(\mathbb{C}_+)$, and $\|\Gamma_h\| = \|\tilde{\Gamma}_h\| \le \|Lh\|_\infty$.
Proof Define $(\tilde{\Gamma}_g H)(s) = P_+(G(s)H(-s))$, $H \in H_2(\mathbb{C}_+)$; then the equation $L(\Gamma_g h) = \tilde{\Gamma}_g(Lh)$ defines $\Gamma_g$ uniquely on $L_2(0, \infty)$ with the required properties.
Given a function $G(s) \in L_\infty(i\mathbb{R})$ we shall refer to an operator of the form $H \mapsto P_+(G(s)H(-s))$ as a Hankel operator on the halfplane (i.e. on $H_2(\mathbb{C}_+)$). We now wish to relate these to Hankel operators on the disc (i.e. on $H_2$). Recall that we have maps $V: H_2 \to H_2(\mathbb{C}_+)$ and $V^{-1}: H_2(\mathbb{C}_+) \to H_2$ which are isometric, defined by
$(Vg)(s) = \pi^{-1/2} g(Ms)/(1 + s)$, $g \in H_2$
and
$(V^{-1}G)(z) = 2\pi^{1/2} G(Mz)/(1 + z)$, $G \in H_2(\mathbb{C}_+)$,
where $Ms = (1 - s)/(1 + s)$.
Theorem 4.6 Let $G(s) \in L_\infty(i\mathbb{R})$. Then the Hankel operator determined on $H_2(\mathbb{C}_+)$ by G is equivalent to the Hankel operator determined on $H_2$ by the function $g(z) = G(Mz)/z \in L_\infty(\mathbb{T})$.
Proof It is sufficient to verify the result on the functions $z^n$ in $H_2$, since all the maps involved are linear and bounded. It also makes it easier to see the effects of the maps involved.
Firstly $z^n$ maps under V to the function $\pi^{-1/2}(1 - s)^n/(1 + s)^{n+1} \in H_2(\mathbb{C}_+)$. The effect of the Hankel operator determined by G is to map this to
$P_+\left(G(s)\,\pi^{-1/2}(1 + s)^n/(1 - s)^{n+1}\right) \in H_2(\mathbb{C}_+)$.
This is just V applied to
$P\left((2\pi^{1/2}/(1 + z))\,G(Mz)\,\pi^{-1/2} z^{-n}/(1 - s)\right)$, with s = Mz,
since $VP = P_+V$ (i.e. V preserves the orthogonal decomposition of $L_2(\mathbb{T})$ into $H_2$ and its complement). This is then
$P\left((2/(1 + z))\,G(Mz)\,z^{-n}(1 + z)/2z\right) = P\left((G(Mz)/z)\,z^{-n}\right)$,
which is the Hankel operator with symbol $G(Mz)/z$, as required.
Example
The Hankel operator determined by $1/(s - a)$, for $a \in \mathbb{C}_-$, is equivalent to the Hankel operator on $H_2$ with symbol $(1/z)(1/(Mz - a)) = (1 + z)/(z(1 - a - z(1 + a)))$. This is rational, so, applying U to this (in order to determine its rank), we obtain $(z + 1)/((1 - a)z - (1 + a))$, which has just the one pole, at $(1 + a)/(1 - a)$ in the disc, and is thus a rank-one Hankel operator.
In the special case a = -1, this gives the symbol $(1 + z)/2z$, which is equivalent to 1/2 (throwing away the $H_\infty^\perp$ part, which has no effect). Thus we can conclude that the Hankel integral operator with kernel $e^{-t}$ and the Hankel matrix (1/2, 0, 0, ...) are equivalent, and determine rank-one operators of norm 1/2.
We shall write W for the map obtained in Theorem 4.6, from $L_\infty(i\mathbb{R})$ to $L_\infty(\mathbb{T})$, with $(WG)(z) = G(Mz)/z$. Its inverse is easily calculated to be the map given by $(W^{-1}g)(s) = (Ms)\,g(Ms)$.
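The algebra in this example is easy to spot-check numerically. The sketch below is illustrative only: the pole a = -0.7 + 0.4i and the two sample points are arbitrary assumptions, and `M`, `W` and `W_inv` are hypothetical helper names that transcribe the formulas $Ms = (1 - s)/(1 + s)$, $(WG)(z) = G(Mz)/z$ and $(W^{-1}g)(s) = (Ms)g(Ms)$.

```python
import cmath

def M(s):
    """The map Ms = (1 - s)/(1 + s) used in Chapter 4."""
    return (1 - s) / (1 + s)

def W(G):
    """(WG)(z) = G(Mz)/z."""
    return lambda z: G(M(z)) / z

def W_inv(g):
    """(W^{-1} g)(s) = (Ms) g(Ms)."""
    return lambda s: M(s) * g(M(s))

a = -0.7 + 0.4j                      # assumed sample pole in the left half plane
G = lambda s: 1 / (s - a)
closed_form = lambda z: (1 + z) / (z * (1 - a - z * (1 + a)))

z = cmath.exp(0.9j)                  # sample point on the unit circle
s = 1.3j                             # sample point on the imaginary axis

symbol_gap = abs(W(G)(z) - closed_form(z))   # symbol formula from the Example
round_trip_gap = abs(W_inv(W(G))(s) - G(s))  # W_inv undoes W, since M(M(s)) = s
pole_modulus = abs((1 + a) / (1 - a))        # pole of U(WG); should lie in the disc

print(symbol_gap, round_trip_gap, pole_modulus)
```

Both gaps are at rounding level, and the pole $(1 + a)/(1 - a)$ has modulus less than one whenever Re a < 0.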
We remark that $\|WG\|_\infty = \|G\|_\infty$, and we can thus identify particular types of operators on the halfplane, using W and $W^{-1}$. The remaining results of this chapter all follow from their analogues in Chapter 3.
Corollary 4.7 (Nehari's Theorem for the halfplane) A Hankel operator on $H_2(\mathbb{C}_+)$ is given by
$\Gamma H = P_+\{G(s)H(-s)\}$, $H \in H_2(\mathbb{C}_+)$, $G \in L_\infty(i\mathbb{R})$,
and it is possible to choose a suitable symbol $G \in L_\infty(i\mathbb{R})$ with
$\|\Gamma\| = \|G\|_\infty = \inf\{\|K\|_\infty: K \text{ a symbol for } \Gamma\}$.
Proof From Theorems 3.2 and 4.6.
Corollary 4.8 $G \in L_\infty(i\mathbb{R})$ determines the zero Hankel operator if and only if $G \in H_\infty(\mathbb{C}_-)$.
Proof G(s) is analytic and bounded in $\{\mathrm{Re}\, s < 0\}$ if and only if G(Mz) is analytic and bounded in $\{|z| > 1\}$. This happens if and only if $G(Mz)/z \in H_\infty^\perp$.
The above result can also be proved directly, but the final two corollaries are most easily deduced from their `disc' versions of Chapter 3.
Corollary 4.9 (Kronecker's Theorem for the halfplane) $G \in L_\infty(i\mathbb{R})$ determines a finite rank Hankel operator if and only if $G \in H_\infty(\mathbb{C}_-) + RH_\infty(\mathbb{C}_+)$, where $RH_\infty(\mathbb{C}_+)$ is the set of rational $H_\infty(\mathbb{C}_+)$ functions (so that their poles are in $\mathbb{C}_-$). The operator's rank is the number of poles in $\mathbb{C}_-$.
Proof G determines a finite rank operator if and only if $U(G(Mz)/z)$ is in $RH_\infty^\perp + H_\infty$, which is if and only if $G(M(1/z)) = G((z - 1)/(z + 1))$ is `rational + $H_\infty$' with poles in the disc. This is if and only if G(s) is `rational + $H_\infty(\mathbb{C}_-)$', with poles in the left half plane.
Corollary 4.10 (Hartman's Theorem for the halfplane) $G \in L_\infty(i\mathbb{R})$ determines a compact Hankel operator if and only if $G \in H_\infty(\mathbb{C}_-) + C^*(i\mathbb{R})$, where $C^*(i\mathbb{R})$ is the space consisting of those functions continuous on $i\mathbb{R}$, with a (unique) limit at $\pm i\infty$.
Proof $G(Mz)/z \in H_\infty^\perp + C(\mathbb{T})$ if and only if $G((z - 1)/(z + 1)) \in H_\infty + C(\mathbb{T})$, that is, if and only if $G(s) \in H_\infty(\mathbb{C}_-) + C^*(i\mathbb{R})$.
We remark that $H_\infty(\mathbb{C}_-) + C^*(i\mathbb{R})$ is the same as $H_\infty(\mathbb{C}_-) + C_0(i\mathbb{R})$, where $C_0(i\mathbb{R})$ is the space of continuous functions tending to zero at $\pm i\infty$.
Thus, for example, $e^{-s}$ does not determine a compact Hankel operator, and it corresponds to $(1/z)e^{(z-1)/(z+1)}$ on the disc, which, apart from the 1/z factor, is the example that we gave earlier of a function which is in $H_\infty$, but not $C(\mathbb{T})$.
This concludes our discussion of the relationships between Hankel operators on the disc and halfplane. We meet the latter again in a practical context, in the next chapter.
5. LINEAR SYSTEMS AND $H_\infty$
Unlike the previous chapters, which had a purely analytic theme, this chapter is more applied in flavour. We start with the notion of a continuous-time finite-dimensional linear system (given by a set of differential equations), outline how solutions may be expressed in terms of the Laplace transform, and observe that a finite-dimensional linear system corresponds naturally to a rational $H_\infty(\mathbb{C}_+)$ function, whose degree is the rank of the associated Hankel operator. More generally, one can define infinite-dimensional systems and find examples of them in real physical problems. We consider one such example in some detail.
This gives a physical motivation for the principle of model reduction - approximating a system by a simpler system - and hence for the problem of rational approximation.
The use of Laplace transforms in solving differential equations can be found in the books of Jacobs and Körner; the tie-up with $H_\infty$ is explained in more detail in the works of Francis, Fuhrmann, Glover and Glover et al.
We avoid discussing systems with more than one input and output, mainly because analogous results about the corresponding matrix-valued Hardy spaces are not always straightforward to derive. We also refrain from discussing discrete-time systems, although the Exercises provide some opportunity for encountering them.
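A continuous-time finite-dimensional linear system is conventionally specified by a pair of matrix equations:
$\dot{x}(t) = Ax(t) + Bu(t)$
$y(t) = Cx(t) + Du(t)$,
where
u(t) is the input $u: (0, \infty) \to \mathbb{C}^m$,
x(t) is the state $x: (0, \infty) \to \mathbb{C}^n$,
and y(t) is the output $y: (0, \infty) \to \mathbb{C}^p$.
We shall typically restrict to cases when m = p = 1 (Single Input, Single Output, or SISO), and $u, y \in L_2(0, \infty)$.
Example 5.1 $\ddot{y} + \alpha\dot{y} + \beta y = u(t)$. If we take as states $x_1(t) = y(t)$ and $x_2(t) = \dot{y}(t)$, we have $\dot{x}_1(t) = x_2(t)$, $\dot{x}_2(t) = \ddot{y}(t) = -\beta x_1(t) - \alpha x_2(t) + u(t)$, and $y(t) = x_1(t)$, that is,
$\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\beta & -\alpha \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u$,
$y = (1, 0) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + 0u$.
Proposition 5.2 The solution to the system above is
$x(t) = e^{At}x(0) + \int_0^t e^{A(t-\tau)}Bu(\tau)\,d\tau$ $(t \ge 0)$
and
$y(t) = Ce^{At}x(0) + \int_0^t Ce^{A(t-\tau)}Bu(\tau)\,d\tau + Du(t)$.
Proof Verify by differentiating.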
We now make two more simplifying assumptions.
(i) x(0) = 0;
(ii) D = 0.
Thus $y(t) = \int_0^t h(t - \tau)\, u(\tau)\, d\tau$, where $h(t) = Ce^{At}B$.
Proposition 5.3 Suppose $s \in \mathbb{C}$ and Re s is sufficiently large. Then provided that $U(s) = (Lu)(s)$ exists, it follows that $X(s) = (Lx)(s)$, $Y(s) = (Ly)(s)$ and $H(s) = (Lh)(s)$ exist at s and that $Y(s) = H(s)U(s)$, where $H(s) = C(sI - A)^{-1}B$.
Proof It is easy to see that x, y, and h cannot grow faster than exponentially, so that X, Y and H exist if Re s is large enough. Taking Laplace transforms, we obtain
$(L\dot{x})(s) = \int_0^\infty \dot{x}(t)\, e^{-st}\, dt = \left[x(t)e^{-st}\right]_0^\infty + s\int_0^\infty x(t)\, e^{-st}\, dt = sX(s)$, since x(0) = 0.
Thus $sX(s) = AX + BU$ and $Y = CX$, so that $X = (sI - A)^{-1}BU$, and $Y(s) = H(s)U(s)$. Note that
$\int_0^\infty Ce^{At}B\, e^{-st}\, dt = \int_0^\infty Ce^{(A - sI)t}B\, dt = \left[C(A - sI)^{-1}e^{(A - sI)t}B\right]_0^\infty = C(sI - A)^{-1}B$,
as required.
To guarantee that if $U \in H_2(\mathbb{C}_+)$ then $Y \in H_2(\mathbb{C}_+)$, we require that $H(s) \in H_\infty(\mathbb{C}_+)$, for which it is sufficient that the eigenvalues of A lie in $\mathbb{C}_-$. In this case $h(t) \in L_1(0, \infty)$. This is one notion that has been called stability, though definitions vary, so we shall avoid the term.
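For the second-order system of Example 5.1, the formula $H(s) = C(sI - A)^{-1}B$ should reduce to $1/(s^2 + \alpha s + \beta)$. The sketch below checks this at one point; the coefficient values $\alpha = 3$, $\beta = 2$ and the evaluation point are assumptions made only for the illustration, and `transfer` is a hypothetical helper name.

```python
import numpy as np

alpha, beta = 3.0, 2.0          # assumed sample coefficients
A = np.array([[0.0, 1.0], [-beta, -alpha]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def transfer(s):
    """H(s) = C (sI - A)^{-1} B, evaluated at one complex point s."""
    return (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]

s = 0.5 + 2.0j
H_state_space = transfer(s)
H_direct = 1.0 / (s**2 + alpha * s + beta)
eigenvalues = np.linalg.eigvals(A)

print(H_state_space, H_direct)
print(eigenvalues)
```

With these coefficients the eigenvalues of A are -1 and -2, both in the left half plane, so the sufficient condition stated above is met.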
Definition 5.4 An $H_\infty$ system is a function $h(t) \in L_1(0, \infty)$, together with its Laplace transform $H(s) \in H_\infty(\mathbb{C}_+)$, and the associated maps $T_h: L_2(0, \infty) \to L_2(0, \infty)$ and $T_H: H_2(\mathbb{C}_+) \to H_2(\mathbb{C}_+)$, defined by
$(T_h u)(t) = \int_0^t h(t - \tau)\, u(\tau)\, d\tau$,
and
$(T_H U)(s) = H(s)U(s)$.
Its degree is the rank of the associated Hankel operator $\Gamma_h$ given by
$(\Gamma_h u)(t) = \int_0^\infty h(t + \tau)\, u(\tau)\, d\tau$.
A system is of finite degree if and only if $H(s) \in RH_\infty(\mathbb{C}_+)$, and its degree is the number of poles. In the non-degenerate case an n-state system (with n-by-n A-matrix) has degree n.
It is customary to call h(t) the impulse response and H(s) the transfer function. Since the Laplace transform of $e^{\lambda t}$ is $1/(s - \lambda)$, a mode $e^{\lambda t}$ corresponds to a pole at $\lambda$ in H(s), which must be in $\mathbb{C}_-$ if we have a stable system.
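The statement that an n-state system has degree n can be seen from the factorisation $h(t + \tau) = Ce^{At}e^{A\tau}B$. A minimal numerical sketch follows; it assumes a diagonal A with eigenvalues -1 and -2 and particular vectors B and C (so that $h(t) = e^{-t} - \tfrac{1}{2}e^{-2t}$), and it samples the Hankel kernel on a grid rather than working with the operator itself.

```python
import numpy as np

lam = np.array([-1.0, -2.0])        # assumed eigenvalues of a diagonal A, both in C_-
B = np.array([1.0, 1.0])
C = np.array([1.0, -0.5])

def h(t):
    """Impulse response h(t) = C exp(At) B for diagonal A."""
    return np.exp(np.multiply.outer(t, lam)) @ (C * B)

dt, N = 0.1, 60
t = dt * np.arange(N)
samples = h(np.add.outer(t, t))     # samples h(t_i + t_j) of the Hankel kernel

sing = np.linalg.svd(samples, compute_uv=False)
numerical_rank = np.linalg.matrix_rank(samples, tol=1e-10 * sing[0])

print(sing[:4] / sing[0])
print(numerical_rank)
```

Only two singular values are above rounding level, matching the two poles of $H(s) = 1/(s + 1) - \tfrac{1}{2}/(s + 2)$.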
Since all that is required to determine a system is the associated function h (or indeed H), it is encouraging to see that some infinite-dimensional systems do occur naturally. These can arise in
various physical ways, for example systems with built-in delays, transmission lines, and from partial differential equations such as the Heat Equation.
Example 5.5 A delay system (infinite-dimensional)
Consider the equation $\dot{x}(t) = -x(t - 1)$, $t \ge 1$, with x(t) given for $0 \le t \le 1$, x(0) = 0.
Figure 1: $\dot{x}(t) = -x(t - 1)$
A popular example here (which is further discussed in Körner's book) is that of a shower bath whose temperature the user is trying to control without being in turn scalded and frozen.
Let $u(t) = \dot{x}(t)$ for $0 \le t \le 1$, u(t) = 0 for t > 1, and y(t) = x(t), assumed zero for $t \le 0$. Thus $\dot{x}(t) = -x(t - 1) + u(t)$ for all $t \ge 0$.
The Laplace transform of x(t - 1) is
$\int_0^\infty e^{-st} x(t - 1)\, dt = \int_0^\infty e^{-s(\tau + 1)} x(\tau)\, d\tau = e^{-s}X(s)$,
putting $\tau = t - 1$, and noting that x = 0 for t < 0.
Hence $sX(s) = -e^{-s}X(s) + U(s)$, so that $H(s) = 1/(s + e^{-s})$. Now $s + e^{-s} \ne 0$ in $\mathbb{C}_+$, since if s = a + ib, and $(a + ib) + e^{-a}(\cos b - i \sin b) = 0$, then
$a + e^{-a}\cos b = 0$, and
$b - e^{-a}\sin b = 0$,
so $|b| < 1$, and hence $\cos b > 0$, which makes the first equation impossible. We may conclude that $H(s) \in H_\infty(\mathbb{C}_+)$, since the function is clearly bounded on the imaginary axis. Figure 1 shows x(t) plotted against t, with the initial conditions x(t) = t for $0 \le t \le 1$.
Figure 2: $\dot{x}(t) = -2x(t - 1)$
If instead we consider the equation
$\dot{x}(t) = -2x(t - 1)$,
we obtain an unstable system: the transfer function is now
$H(s) = 1/(s + 2e^{-s})$, which has some poles in $\mathbb{C}_+$ (as is easily verified).
Figure 3: $\dot{x}(t) = -(\pi/2)x(t - 1)$
The critical value between 1 and 2 turns out to be $\pi/2$: the function $H(s) = 1/(s + (\pi/2)e^{-s})$ has poles on the imaginary axis, and the equation is `neutrally' stable.
We can say more: $s + e^{-s}$ has a zero at approximately (-0.32, 1.34) or $(-0.32, \pi/2.34)$, which is reflected in the fact that we obtain damped oscillations with period about 4.5.
Similarly $s + 2e^{-s}$ has a zero at approximately (0.17, 1.67) or $(0.17, \pi/1.88)$, reflected in an unbounded oscillation with period about 4.
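The quoted zeros can be reproduced with a few steps of Newton's method applied to $f(s) = s + ce^{-s}$ for c = 1, $\pi/2$ and 2. This is only a sketch: the starting point 1.5i and the number of steps are assumptions chosen for convenience, and `root_of_delay_equation` is a hypothetical helper name.

```python
import cmath

def root_of_delay_equation(c, s0=1.5j, steps=50):
    """Newton's method for f(s) = s + c*exp(-s), started at s0."""
    s = s0
    for _ in range(steps):
        f = s + c * cmath.exp(-s)
        df = 1 - c * cmath.exp(-s)
        s = s - f / df
    return s

roots = {}
for c in (1.0, cmath.pi / 2, 2.0):
    s = root_of_delay_equation(c)
    roots[c] = s
    residual = abs(s + c * cmath.exp(-s))
    print(f"c={c:.4f}: zero near ({s.real:.3f}, {s.imag:.3f}), residual={residual:.1e}")
```

The real part of the zero changes sign as c passes through $\pi/2$, which is the transition from damped to growing oscillations described above; the imaginary part b gives an oscillation period of $2\pi/b$.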
The principle of model reduction is to approximate non-rational H(s) by rational $H_1(s)$ of low degree (or to approximate high-order rational functions by lower order ones). Its motivation is, apart from aesthetics, crudely financial: low order systems are cheaper to build, as well as easier to analyse.
Our arguments above were rough and heuristic. We would like to be able to approximate H(s) in some suitable norm. The $H_\infty$ norm is the most natural for many Engineering purposes, though one might also consider the $H_2$ norm, the $L_1$ norm of the impulse response, the norm of the Hankel operator or even its Hilbert-Schmidt norm or its nuclear norm. Because of the tie-up between $H_\infty$ and the Hankel operator, it is considered easiest to examine the Hankel norm, and indeed the solution in this case is very elegant, as the next chapter shows.
Other applications of Hankel operators to problems of Control Theory can be found in the book of Francis. There, the model-matching problem, the tracking problem and the robust stabilization problem are analysed and the connections with $H_\infty$ established. We shall not discuss these further.
6. HANKEL-NORM APPROXIMATION
The main aim of this chapter is to present the celebrated results of Adamjan, Arov and Krein, which give the achievable error in approximating a Hankel operator $\Gamma$ by another one of smaller rank. For this we begin by reformulating Kronecker's theorem from Chapter 4 in terms of Blaschke products.
The next step in the argument requires us to consider the kernel (null space) of a Hankel operator: here we give the elegant theorem of Beurling on closed shift-invariant subspaces of $H_2$ (which itself requires no operator theory at all).
With Beurling's theorem established we proceed to examine the Schmidt pairs of $\Gamma$ in detail. An inner function b appears naturally at this stage of the argument and to complete the proof we consider the subspace $bH_2$ in order to show that b is a Blaschke product with the required properties.
The proof is not an easy one and so we clarify some of the details with an illustrative example - for simplicity we are content to approximate a rank two Hankel operator by one of rank one.
In one sense, the main A-A-K result is a natural generalization of Nehari's theorem; as there, it turns out that, once one knows that there is an optimal rank-n approximant, then it is relatively straightforward to work out what it must be. We perform this calculation and hence establish uniqueness.
We follow Power's simplified approach to the Adamjan-Arov-Krein results in this chapter, expanding upon some of the more tricky details. We have also profited from reading the cited paper of Adamjan, Arov and Krein (though the English translation is unreliable), as well as from the state-space approach given by Glover. More on Beurling's theorem can be found in Garnett's book.
We recall that, in the disc, $(a_0, a_1, \ldots)$ determines a Hankel operator of rank n if and only if $a_0/z + a_1/z^2 + \ldots$ is rational with n poles in $\{|z| < 1\}$ (Kronecker's theorem). For the purposes of this chapter, we shall require the following alternative version.
Proposition 6.1 $\Gamma: H_2 \to H_2$ has finite rank n if and only if $\Gamma = \Gamma_g$ for some g = Bh, B a Blaschke product with n zeroes in the disc, $h \in H_\infty^\perp$.
Proof $\Gamma$ has rank n if and only if $\Gamma = \Gamma_g$ with $(Ug)(z) = (1/z)g(1/z)$ a rational $H_\infty^\perp$ function with n poles in the disc. This implies that $\Gamma = \Gamma_g$ with $(Ug)(z) = k(z)/b(z)$, $k \in H_\infty$, b(z) a Blaschke product with n zeroes in the disc. Hence $g(z) = (1/z)k(1/z)/b(1/z) = B(z)h(z)$, as required, since if $b(z) = z^m \prod ((z - a)/(1 - \bar{a}z))$, then
$1/b(1/z) = z^m \prod ((1 - \bar{a}/z)/(1/z - a)) = z^m \prod ((z - \bar{a})/(1 - az))$,
which is a Blaschke product, and $(Uk)(z) \in H_\infty^\perp$. Conversely, if $g(z) = B(z)h(z)$, we have that $(Ug)(z) = k(z)/b(z)$, $k \in H_\infty$, b a Blaschke product, which is in $H_\infty + RH_\infty^\perp$, with n poles; then $\Gamma_g = \Gamma_{g_1}$, where $Ug_1$ is the $RH_\infty^\perp$ part.
Suppose now that $\Gamma$ is a compact operator. We now wish to consider
$\inf\{\|\Gamma - \Gamma'\|: \mathrm{rank}(\Gamma') \le n, \ \Gamma' \text{ a Hankel operator}\}$.
By Theorem 1.4, this is necessarily at least $\sigma_{n+1}(\Gamma)$, the (n + 1)st singular value of $\Gamma$. (Approximating with a Hankel operator is a special case of Theorem 1.4.) There are numerous other ways of characterising this quantity, some of which we collect below.
Proposition 6.2 Suppose that $\Gamma = \Gamma_g$, $g \in L_\infty(\mathbb{T})$. Then
$\inf\{\|\Gamma - \Gamma'\|: \mathrm{rank}(\Gamma') \le n\}$
$= \inf\{\|g - Bh - k\|_\infty: h, k \in H_\infty^\perp, \ B \text{ a Blaschke product with n zeroes in the disc}\}$
$= \inf\{\|g/B - h - k/B\|_\infty: h, k, B \text{ as above}\}$
$= \inf\{\|\Gamma_{g/B}\|: B \text{ a Blaschke product with n zeroes in the disc}\}$
$= \inf\{\|\Gamma_{g(Rb)}\|: b \text{ a Blaschke product with n zeroes in the disc}\}$
$= \inf\{\|Ug - H_1 - k_1\|_\infty: H_1 \in RH_\infty^\perp, \text{ with n poles in the disc}, \ k_1 \in H_\infty\}$
$= \inf\{\|W^{-1}g - H - K\|_\infty: H \in RH_\infty(\mathbb{C}_+), \text{ with n poles in } \mathbb{C}_-, \ K \in H_\infty(\mathbb{C}_-)\}$.
Proof The first equality follows from Proposition 6.1 and Nehari's Theorem. The second holds because a Blaschke product B is inner, and so unimodular on $\mathbb{T}$. The third is again Nehari's theorem (noting that $k/B \in H_\infty^\perp$). The fourth follows because R applied to a Blaschke product gives the reciprocal of a Blaschke product. Finally the last two formulations follow on applying the isometric maps U and W (cf. 4.6 and 4.9).
The case n = 0 is of course familiar. The renowned results of Adamjan, Arov and Krein (universally referred to as A-A-K) will show us that all the quantities above are equal to $\sigma_{n+1}(\Gamma)$.
We start by examining the kernel (null space) of a Hankel operator, that is the set of all vectors u such that $\Gamma u = 0$. This will be useful when we need to consider where two given operators are equal.
Lemma 6.3. Let $S: H_2 \to H_2$ be the shift $(Sf)(z) = zf(z)$. Then for any Hankel operator $\Gamma$, $S(\mathrm{Ker}\, \Gamma) \subseteq \mathrm{Ker}\, \Gamma$.
Proof We claim that $\Gamma S = S^*\Gamma$ (this actually characterises Hankel operators). By linearity and continuity, it is enough to check the basis vectors $e_i = z^i$. Note that
$\Gamma S(e_i) = \Gamma(e_{i+1}) = \sum_{j=0}^\infty a_{i+j+1} e_j = S^*\left(\sum_{j=0}^\infty a_{i+j+1} e_{j+1} + a_i e_0\right) = (S^*\Gamma)e_i$.
Thus if $\Gamma f = 0$, then $S^*\Gamma f = 0$, so that $\Gamma(Sf) = 0$, as required.
The next result, Beurling's theorem, actually characterises such closed shift-invariant subspaces. It is perhaps worth comparing it with the result that says that $\mathbb{C}[z]$ is a principal ideal domain: that is, that any shift invariant subspace of the polynomial ring consists of all multiples of some fixed polynomial. Beurling's result is of a similar nature.
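The intertwining relation $\Gamma S = S^*\Gamma$ used in Lemma 6.3 can be checked on a finite section. The sketch below assumes an arbitrary random sequence $(a_k)$ and an 8 x 8 truncation; because the truncated shift loses the last basis vector, the two sides are compared only away from the last row and column.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
a = rng.standard_normal(2 * N - 1)                    # arbitrary sample sequence a_0, a_1, ...
gamma = a[np.add.outer(np.arange(N), np.arange(N))]   # entries a_{i+j}

S = np.eye(N, k=-1)        # shift: e_j -> e_{j+1} (truncated at e_{N-1})
left = gamma @ S           # columns of gamma moved one place left
right = S.T @ gamma        # rows of gamma moved one place up

# The truncation only affects the last column of `left` and the last row of `right`.
intertwines = np.allclose(left[:-1, :-1], right[:-1, :-1])
print(intertwines)
```

Both products have (i, j) entry $a_{i+j+1}$, which is exactly the computation in the proof.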
Theorem 6.4 (Beurling's Theorem) Let X be a closed subspace of $H_2$ which is invariant under the shift S. Then if $X \ne 0$, there is an inner function $b(z) \in H_\infty$ (i.e. $|b(e^{i\theta})| = 1$ almost everywhere) such that $X = bH_2 = \{b(z)f(z): f \in H_2\}$. Moreover b is unique to within a constant factor.
(Clearly any subspace $bH_2$ is closed and S-invariant, if b is inner.)
Proof For the uniqueness we note that if $b_1H_2 = b_2H_2$, then $b_1 = b_2f_2$ and $b_2 = b_1f_1$, with $f_1, f_2 \in H_2$. Thus $b_1/b_2 \in H_\infty$ and is inner, likewise $b_2/b_1$. This implies that $|b_1/b_2| \le 1$ on the disc, and $|b_2/b_1| \le 1$ on the disc. Hence $|b_1/b_2| = 1$ and therefore $b_1/b_2$ is constant.
We now establish the existence of such a function b. Suppose first that every function in X is divisible by $z^k$ for some k > 0. Then $X = z^kY$ for some closed S-invariant subspace Y. We may thus take the factor $z^k$ into the Blaschke product if necessary and hence assume without loss of generality that there exists a function $f \in X$ with $f(0) \ne 0$.
To obtain b, let us write $H_2 = X \oplus X^\perp$, and define g(z) to be the orthogonal projection of the constant function $1 \in H_2$ onto X. Then $g \in X$ and $1 - g \in X^\perp$. Therefore
$(z^n g, 1 - g) = \int_0^{2\pi} e^{in\theta} g(e^{i\theta})\, \overline{(1 - g(e^{i\theta}))}\, d\theta/2\pi = 0$ for n = 0, 1, 2, ...
since $z^n g(z) \in X$. Hence
$\int_0^{2\pi} e^{in\theta} |g(e^{i\theta})|^2\, d\theta/2\pi = \int_0^{2\pi} e^{in\theta} g(e^{i\theta})\, d\theta/2\pi = 0$
for $n \ge 1$, as the right hand side is just $z^n g(z)$ evaluated at 0, by Proposition 2.5. On taking complex conjugates, we see that this is true for negative values of n as well. Therefore $|g(e^{i\theta})|^2$ must be constant. Now g cannot be the zero constant, or else 1 would be in $X^\perp$ and hence
$f(0) = (f, 1) = \int_0^{2\pi} f(e^{i\theta})\, d\theta/2\pi = 0$ for all $f \in X$, which is a contradiction. Thus writing $b(z) = g(z)/\|g\|_2$, we see that b is inner.
Since $b \in X$, we certainly have $bH_2 \subseteq X$, since X is closed and S-invariant. If $bH_2 \ne X$, then there is a function $h \in X$, $h \ne 0$, and h orthogonal to all of $bH_2$. It is convenient now to look at $\bar{g}h$, as follows. We have that
$\int_0^{2\pi} h(e^{i\theta})\, e^{-in\theta}\, \overline{g(e^{i\theta})}\, d\theta/2\pi = (h, z^n g) = 0$
for n = 0, 1, 2, ... since $z^n g \in bH_2$. Also
$\int_0^{2\pi} e^{in\theta} h(e^{i\theta})\, \overline{(1 - g(e^{i\theta}))}\, d\theta/2\pi = (z^n h, 1 - g) = 0$
for n = 0, 1, 2, ..., since $z^n h \in X$ and $1 - g \in X^\perp$. So, since $\int_0^{2\pi} e^{in\theta} h(e^{i\theta})\, d\theta/2\pi = 0$ for $n \ge 1$, because $z^n h(z)$ is zero at 0, we conclude that
$\int_0^{2\pi} h(e^{i\theta})\, e^{in\theta}\, \overline{g(e^{i\theta})}\, d\theta/2\pi = 0$ for all $n \in \mathbb{Z}$.
As $g \in L_\infty$ and $h \in H_2$, $\bar{g}h \in L_2$ and is therefore zero. Hence h = 0 almost everywhere and $X = bH_2$, as required.
Corollary 6.5 If $\Gamma = \Gamma_g$ is a Hankel operator, then $\mathrm{Ker}\, \Gamma = bH_2$ for some inner function b, and $\Gamma_{g(Rb)} = 0$.
Proof $\mathrm{Ker}\, \Gamma = bH_2$ by Lemma 6.3 and Theorem 6.4. Also if $f \in H_2$, then $\Gamma_{g(Rb)}f = P\,g(Rb)(Rf) = \Gamma_g(bf) = 0$.
We now need to consider the Schmidt pairs of $\Gamma$ in more detail. For this purpose we introduce the following notation.
If $f \in L_2(\mathbb{T})$, let $f^+ = \overline{(Rf)}$, so that $f^+(z) = \overline{f(\bar{z})}$, and if $f(z) = \sum_{-\infty}^{\infty} c_n z^n$, then $f^+(z) = \sum_{-\infty}^{\infty} \bar{c}_n z^n$. We collect together a few useful properties of this mapping.
Proposition 6.6 (i) If $f \in L_\infty(\mathbb{T})$ is a symbol for $\Gamma$, then $f^+$ is a symbol for $\Gamma^*$.
(ii) If $\Gamma^*\Gamma v = \sigma^2 v$, then $\Gamma\Gamma^* v^+ = \sigma^2 v^+$.
(iii) The mapping $f \to f^+$ is a conjugate-linear isometry between $\mathrm{Ker}(\Gamma^*\Gamma - \sigma^2)$ and $\mathrm{Ker}(\Gamma\Gamma^* - \sigma^2)$.
(iv) If $|f| = 1$ on $\mathbb{T}$, then $f^+ = 1/Rf$.
Proof (i) If $\Gamma$ is given by the Hankel matrix $(a_0, a_1, \ldots)$, then $\Gamma^*$ is given by the Hankel matrix $(\bar{a}_0, \bar{a}_1, \ldots)$, and hence $f^+$ is a symbol for $\Gamma^*$ (looking at its positive Fourier coefficients).
(ii) Likewise if $\Gamma v = \sigma w$, then $\Gamma^* v^+ = \sigma w^+$, so that $\Gamma\Gamma^* v^+ = \Gamma(\sigma w^+) = (\Gamma^*(\sigma w))^+ = (\sigma^2 v)^+ = \sigma^2 v^+$.
(iii) Clearly the map is conjugate linear and $L_2$-norm-preserving. Since $f^{++} = f$ and $\Gamma^{**} = \Gamma$, we do indeed obtain a bijection.
(iv) Immediate.
Suppose now that $\Gamma$ is a compact Hankel operator, g a symbol for $\Gamma$, and that the singular values of $\Gamma$ are $\sigma_1, \sigma_2, \ldots$ We have, as usual, $\Gamma x = \sum_1^\infty \sigma_i (x, v_i) w_i$, with $(v_i, w_i)$ the Schmidt pairs of $\Gamma$. We shall make the standing assumption that n > 0 and that
$\sigma_n > \sigma = \sigma_{n+1} = \ldots = \sigma_{n+k} > \sigma_{n+k+1}$
and construct a Blaschke product b with n zeroes such that $\|\Gamma_{g(Rb)}\| = \sigma_{n+1}$. By Proposition 6.2, this is equivalent to constructing a Hankel operator $\Gamma_1$ with $\|\Gamma - \Gamma_1\| = \sigma_{n+1}$ and $\mathrm{rank}(\Gamma_1) = n$.
When we solved Nehari's problem for compact operators by producing a suitable symbol (Theorem 3.10) we found it helpful to look at $w_1/Rv_1$, with $(v_1, w_1)$ a leading Schmidt pair. Similar considerations apply here.
Lemma 6.7 Let (v, w), (v', w') be Schmidt pairs for $\Gamma$, so that $\Gamma v = \sigma w$, $\Gamma v' = \sigma w'$, $\Gamma^* w = \sigma v$ and $\Gamma^* w' = \sigma v'$. Then $w/Rv = w'/Rv'$ almost everywhere on $\mathbb{T}$, and is unimodular there.
Proof It is convenient to consider the following relationships which hold between various inner products.
Firstly, $(z^n v, v') = (z^n v, \Gamma^* w')/\sigma = (\Gamma z^n v, w')/\sigma = (gR(z^n v), w')/\sigma = (gz^{-n}(Rv), w')/\sigma = (g(Rv), z^n w')/\sigma = (\Gamma v, z^n w')/\sigma = (w, z^n w')$.
Similarly $(z^{-n}v, v') = (v, z^n v') = (z^n w, w') = (w, z^{-n}w')$. Thus
$\int_0^{2\pi} e^{in\theta} v(e^{i\theta})\, \overline{v'(e^{i\theta})}\, d\theta/2\pi = \int_0^{2\pi} e^{-in\theta} w(e^{i\theta})\, \overline{w'(e^{i\theta})}\, d\theta/2\pi$
for all n, and thus
$\int_0^{2\pi} e^{in\theta} (Rv)\, \overline{(Rv')}\, d\theta/2\pi = \int_0^{2\pi} e^{in\theta} w\, \overline{w'}\, d\theta/2\pi$
for all n, which means that $(Rv)\overline{(Rv')} = w\overline{w'}$ almost everywhere.
In particular, with v = v', we have $|Rv|^2 = |w|^2$ and $|w/Rv| = 1$ a.e. (We recall that by Lemma 3.9 $Rv \ne 0$ almost everywhere on $\mathbb{T}$).
Moreover $(w/Rv) \cdot \overline{(w'/Rv')} = 1$ so that $w/Rv = w'/Rv'$ a.e.
We write $h(z) = \sigma_i w_i(z)/Rv_i(z)$, which is independent of i = n+1, ..., n+k. Also $h^+(z) = \sigma_i v_i(z)/Rw_i(z)$, by Proposition 6.6 (iv), since $w_i/Rv_i$ is unimodular. The function h has some useful properties, as we now show.
Lemma 6.8 There is an inner function $b \in H_\infty$ such that $\Gamma_{g(Rb)} = \Gamma_{h(Rb)}$. Moreover $\|\Gamma_{g(Rb)}\| \le \sigma_{n+1}$.
Proof Observe first that, if $n+1 \le i \le n+k$, then
$\Gamma_h v_i = P((\sigma_i w_i/Rv_i)Rv_i) = P(\sigma_i w_i) = \sigma_i w_i$.
Hence $\Gamma_g = \Gamma_h$ on $\mathrm{Lin}(v_{n+1}, \ldots, v_{n+k})$. Now $\mathrm{Ker}(\Gamma_g - \Gamma_h) = bH_2$ for some inner b and $\Gamma_{g(Rb)} = \Gamma_{h(Rb)}$, by Corollary 6.5. Finally, $\|\Gamma_{h(Rb)}\| \le \|h(Rb)\|_\infty = \sigma_{n+1}$.
Our aim is to show that b is a Blaschke product with at most n zeroes, as required by Proposition 6.2.
Given two inner functions $b_1$ and $b_2$, we say that $b_1$ divides $b_2$ and write $b_1 | b_2$ if $b_2 \in b_1H_2$, or equivalently if $b_2H_2 \subseteq b_1H_2$. We have already noted that if $b_1 | b_2$ and $b_2 | b_1$, then $b_1/b_2$ is a constant almost everywhere.
Lemma 6.9 Let b be inner. Then the space $(bH_2)^\perp \subseteq H_2$ has dimension $p < \infty$ if and only if b is a Blaschke product with p zeroes.
Proof If b is a Blaschke product with p zeroes, say $a_1, \ldots, a_p$, then we can find functions $e_1, \ldots, e_p \in H_2$ such that $e_i(a_j) = \delta_{ij}$. Therefore any $f \in H_2$ is of the form $f = \sum f(a_i)e_i + bg$, where $g \in H_2$. Hence $bH_2$ has codimension p, as required.
For a general inner function b, we can write
$b(z) = z^q \prod \left((-\bar{a}_i/|a_i|)(z - a_i)/(1 - \bar{a}_i z)\right) h(z) = b_0(z)h(z)$,
where $b_0$ is a possibly infinite Blaschke product and h is inner and nonzero (by Theorems 2.9 and 2.12).
If h is nonconstant, then it has an infinite chain $b_1, b_2, b_3, \ldots$ of inner divisors, for example $h^{1/2}, h^{1/4}, h^{1/8}, \ldots$, which means that $bH_2 \subset b_1H_2 \subset b_2H_2 \subset \ldots$, with strict inclusion, and so $(bH_2)^\perp$ is infinite-dimensional. A similar argument applies if the Blaschke product is infinite.
Hence the only way that $(bH_2)^\perp$ can have dimension $p < \infty$ is for b to be a Blaschke product with p factors.
We now start finding some Schmidt pairs for $\Gamma_{g(Rb)} = \Gamma_{h(Rb)}$.
Lemma 6.10 Let (v, w), h, $\sigma$ and b be as in (6.7) and (6.8). If $b_0$ is an inner divisor of b, then $\Gamma^*_{h(Rb)}\Gamma_{h(Rb)}\, v/b_0 = \sigma^2 v/b_0$.
Proof Note that $v \in bH_2 = \mathrm{Ker}(\Gamma_g - \Gamma_h)$, so $v/b_0 \in H_2$. Also $w^+ \in bH_2$, by (6.6). Thus $w \in b^+H_2$, and we may write $w = b^+w_0$. Then
$\Gamma_{h(Rb)}\, v/b_0 = P\{h(Rb)(Rv)/(Rb_0)\} = P\{(\sigma w/Rv)(Rb)(Rv)/(Rb_0)\} = P\{\sigma(b^+w_0)(Rb)/(Rb_0)\} = P\{\sigma w_0/(Rb_0)\} = \sigma w_0/(Rb_0)$,
since $b^+(Rb) = 1$ and $w_0/(Rb_0) = w_0 b_0^+ \in H_2$. Hence
$\Gamma^*_{h(Rb)}\Gamma_{h(Rb)}\, v/b_0 = P\{\sigma h^+\, \bar{b}\, (Rw_0)/b_0\} = P\{(\sigma^2 v/Rw)\, \bar{b}\, (Rw_0)/b_0\} = \sigma^2 v/b_0$,
since $Rw = (Rb^+)(Rw_0) = \bar{b}(Rw_0)$.
This allows us to obtain a large number of eigenvectors of $\Gamma^*_{h(Rb)}\Gamma_{h(Rb)}$.
Corollary 6.11 $\Gamma_{h(Rb)}$ has $\sigma = \sigma_{n+1}$ as a singular value with multiplicity at least k+m, where $\sigma_{n+1}(\Gamma) = \ldots = \sigma_{n+k}(\Gamma)$ and $\dim (bH_2)^\perp = m$.
Proof If m is finite, so that $b = \prod_1^m ((z - a_i)/(1 - \bar{a}_i z))$, up to a constant of modulus one, then let $b_j = \prod_1^j ((z - a_i)/(1 - \bar{a}_i z))$, j = 1, ..., m. Writing $V = \mathrm{Lin}(v_{n+1}, \ldots, v_{n+k})$, we see that $\mathrm{Lin}(V, V/b_1, \ldots, V/b_m)$ is an eigenspace for $\Gamma^*_{h(Rb)}\Gamma_{h(Rb)}$. Its dimension is at least k + m, since dim V = k and the successive spaces each contain at least one independent function, as is easily seen.
If m is infinite, then b has infinitely many inner divisors $b_1 | b_2 | b_3 | \ldots$ and $b_1H_2 \supset b_2H_2 \supset b_3H_2 \supset \ldots$. Thus using (6.10) we see that $\Gamma^*_{h(Rb)}\Gamma_{h(Rb)}$ has an infinite dimensional eigenspace corresponding to $\sigma_{n+1}^2$. For if v = by, with $y \in H_2$, then $(b/b_1)y, (b/b_2)y, \ldots$ is a sequence of independent functions of the form $v/b_0$, $b_0$ inner.
Theorem 6.12 With the notation of this chapter, $\|\Gamma_{g(Rb)}\| \le \sigma_{n+1}$, and b is a Blaschke product with at most n zeroes. Hence $\inf\{\|\Gamma - \Gamma'\|: \mathrm{rank}(\Gamma') \le n\} = \sigma_{n+1}$.
Proof From Lemma 6.8 we do have that $\Gamma_{g(Rb)} = \Gamma_{h(Rb)}$, which has norm at most $\sigma_{n+1}$. Now the singular values of $\Gamma$ are
$\sigma_1, \sigma_2, \ldots, \sigma_{n+1}, \ldots, \sigma_{n+k}, \sigma_{n+k+1}, \ldots$
and those of $\Gamma_{g(Rb)}$ start
$\sigma_{n+1}, \sigma_{n+1}, \ldots, \sigma_{n+1}, \ldots$
with at least m + k repetitions of the first one. But $\Gamma_{g(Rb)}f = P\,g(Rb)(Rf) = \Gamma_g(bf) = \Gamma_g M_b f$; and so by Corollary 1.5, $\sigma_i(\Gamma_{g(Rb)}) \le \sigma_i(\Gamma_g)$ for all i. Thus $m + k \le n + k$: that is, $\dim (bH_2)^\perp \le n$, and so b is a Blaschke product with at most n zeroes, as required.
Example 6.13 The Hankel matrix $(1, 1, 0, 0, \ldots)$ has rank two and a symbol $g(z) = 1 + z$. We seek an optimal rank-one Hankel approximant to it.
Its eigenvalues satisfy $\lambda^2 - \lambda - 1 = 0$, and its singular values are just the absolute values of the eigenvalues, since the matrix is self-adjoint. So $\sigma_1 = (\sqrt{5} + 1)/2$ and $\sigma_2 = (\sqrt{5} - 1)/2$. For unnormalized Schmidt vectors we can take
$v_1 = (1, \sigma_1 - 1)$, $w_1 = (1, \sigma_1 - 1)$ and $v_2 = (1, -1 - \sigma_2)$, $w_2 = (-1, 1 + \sigma_2)$.
Corresponding to these latter two we see that
$$h(z) = \frac{\sigma_2(-1 + (\sigma_2 + 1)z)}{1 - (1 + \sigma_2)/z} = \frac{z(z - \sigma_2)}{z - (1 + \sigma_2)},$$
and so
$$g(z) - h(z) = (1 + z) - \frac{z(z - \sigma_2)}{z - (1 + \sigma_2)} = \frac{-(1 + \sigma_2)}{z - (1 + \sigma_2)} = \frac{1}{1 - \sigma_2 z}.$$
We are looking for a Blaschke product $B$ with one zero such that $(g - h)/B \in H^\perp$. Evidently in this case
$B(z) = (z - \sigma_2)/(1 - \sigma_2 z)$
makes $(g - h)/B = 1/(z - \sigma_2)$, which is in $H^\perp$, since $U$ applied to this gives $z/(1 - \sigma_2 z)$, which is in $H_\infty$.
Hence $\Gamma_{g/B} = \Gamma_{h/B}$ and it has norm $\sigma_2$. In fact $h/B$ is just $-\sigma_2 z$, which gives a rank-two operator $(0, -\sigma_2)$ with singular values $\sigma_2$ repeated twice.
Thus $g(z) = -\sigma_2 z B + (g - h)$, or, translating into Hankel operators, $\Gamma_g = \Gamma_1 + \Gamma_2$, where $\|\Gamma_1\| = \sigma_2$ and $\operatorname{rank}(\Gamma_2) = 1$. Indeed $(g - h)(z) = 1/(1 - \sigma_2 z)$, which gives $\Gamma_2$ the rank-one matrix $(1, \sigma_2, \sigma_2^2, \ldots)$.
In fact, as with Nehari's theorem, once we know that there is an optimal rank-n approximation we can say rather more directly what it must be.
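The algebra in Example 6.13 can be checked numerically. The following is a minimal sketch, assuming the formulas for $g$, $h$ and $B$ as stated in the example; the twelve sample points on the unit circle are an arbitrary choice made here for illustration.

```python
import math

# Singular values of the Hankel matrix [[1, 1], [1, 0]] from Example 6.13.
sigma1 = (math.sqrt(5) + 1) / 2
sigma2 = (math.sqrt(5) - 1) / 2

def g(z):
    return 1 + z

def h(z):
    # h(z) = z (z - sigma2) / (z - (1 + sigma2))
    return z * (z - sigma2) / (z - (1 + sigma2))

def B(z):
    # Blaschke factor with its single zero at sigma2
    return (z - sigma2) / (1 - sigma2 * z)

# Sample points on the unit circle (arbitrary illustrative choice).
points = [complex(math.cos(t), math.sin(t))
          for t in (2 * math.pi * k / 12 + 0.1 for k in range(12))]

# g - h should equal 1 / (1 - sigma2 z), and h / B should equal -sigma2 z.
max_err_gh = max(abs(g(z) - h(z) - 1 / (1 - sigma2 * z)) for z in points)
max_err_hB = max(abs(h(z) / B(z) + sigma2 * z) for z in points)
# On the circle |B| = 1, so |h| should be constant and equal to sigma2.
max_dev_modulus = max(abs(abs(h(z)) - sigma2) for z in points)
```

The constant modulus of $h$ on the circle is the behaviour one expects of a minimal-norm error symbol for the optimal approximation.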
Theorem 6.14 Let $T$ be any compact operator and $T_1$ any rank-$n$ operator such that $\|T - T_1\| = \sigma_{n+1}(T) < \sigma_n(T)$. Then $(T - T_1)v_{n+1} = \sigma_{n+1}w_{n+1}$, where $(v_{n+1}, w_{n+1})$ is the $(n+1)$st Schmidt pair for $T$.
Proof With the usual notation for Schmidt pairs, let $P_1$ denote the orthogonal projection onto $\operatorname{Lin}(w_1, \ldots, w_{n+1})$. Then $\|P_1(T - T_1)\| \le \sigma_{n+1}$. Since $P_1T_1: \operatorname{Lin}(v_1, \ldots, v_{n+1}) \to \operatorname{Lin}(w_1, \ldots, w_{n+1})$ has rank at most $n$, there is a norm one vector $x = \sum_{i=1}^{n+1} a_i v_i \in \operatorname{Ker} P_1T_1$. Now $P_1Tx = P_1(T - T_1)x$, so $\|P_1Tx\| \le \sigma_{n+1}$. But, looking at the coordinates of $x$, we see that $a_1 = \ldots = a_n = 0$ and that $x$ has to be a multiple of $v_{n+1}$. Since $\|Tx - T_1x\| \le \sigma_{n+1}$, we have that $T_1 v_{n+1} = 0$, and $(T - T_1)v_{n+1} = \sigma_{n+1}w_{n+1}$, as required.
Our final result should be compared with 3.10.
Corollary 6.15 Let $\Gamma = \Gamma_g$ be a compact Hankel operator and $\Gamma_1$ an optimal rank-$n$ approximant; i.e. $\|\Gamma - \Gamma_1\| = \sigma_{n+1}$. Then there is a unique symbol for $\Gamma - \Gamma_1$ of minimal $L_\infty(\mathbb{T})$ norm, given by $h = \sigma_{n+1}w_{n+1}/(Rv_{n+1}) = \Gamma v_{n+1}/Rv_{n+1}$. Moreover $|h(e^{i\theta})| = \sigma_{n+1}$ almost everywhere. In particular $\Gamma_1$ is also unique.
Proof By Theorem 6.14, if $h$ is any symbol for $\Gamma - \Gamma_1$, then $\sigma_{n+1}w_{n+1} = \Gamma_h v_{n+1} = P(hR(v_{n+1}))$. Therefore
$$\sigma_{n+1} = \|\sigma_{n+1}w_{n+1}\|_2 = \|P(hR(v_{n+1}))\|_2 \le \|hR(v_{n+1})\|_2 \le \|h\|_\infty \|Rv_{n+1}\|_2 \le \|h\|_\infty.$$
So, if we have that $\|h\|_\infty = \sigma_{n+1}$, we have equality throughout and $\sigma_{n+1}w_{n+1} = hR(v_{n+1})$, i.e. $h = \sigma_{n+1}w_{n+1}/Rv_{n+1}$ and $|h(e^{i\theta})| = \sigma_{n+1}$ a.e., as required.
By using the map $W$ in section 4, it follows for example that for a Hankel operator on $H_2(\mathbb{C}_+)$, with symbol $G(s)$, a unique optimal rank-$n$ approximant can be constructed, with a symbol for the error given by
$G(s) - G_1(s) = \sigma_{n+1}W_{n+1}(s)/V_{n+1}(-s) \in L_\infty(i\mathbb{R})$, and $|G(iy) - G_1(iy)| = \sigma_{n+1}$ almost everywhere ($y \in \mathbb{R}$).
It is clear that the singular values of Hankel operators play an important role in approximation problems. In the final chapter, we examine these singular values in a more general context.
7. SPECIAL CLASSES OF HANKEL OPERATOR
In this final chapter we tie together some of the strands begun in previous chapters. Singular values, nuclear and Hilbert-Schmidt Hankel operators are discussed in detail, with the emphasis being on Hankel integral operators of the form introduced in Chapter 4.
We begin with the Riemann-Lebesgue lemma, itself an elegant result of classical analysis,
deducing from this that an $L_1$ kernel determines a compact Hankel operator. Hilbert-Schmidt operators are next treated - these are relatively simple to characterise.
We then consider nuclearity, which is rather more difficult. Some useful inequalities are proved and related to the material on Hankel-norm approximation in Chapter 6. Finally we give a deep result, first discovered by Peller, Coifman and Rochberg, which characterises nuclear Hankel
operators in terms of their symbols lying in some space of analytic functions (a Bergman space).
The proof we give is due to Bonsall and Walsh, and it provides some quantitative estimates in
terms of norms. As a corollary we give a result which essentially states that a nuclear Hankel integral operator has a continuous impulse response which dies away to zero rapidly at infinity.
The papers of Bonsall and Walsh, and of Glover et al between them contain most of the material of this chapter. The latter might be suitable for further reading. The papers of Coifman and Rochberg and of Peller, and the book of Power are also relevant here.
We recall that the Hankel integral operators discussed in Chapter 4 had the following form:
$(\Gamma_h u)(t) = \int_0^\infty h(t + \tau)u(\tau)\,d\tau$, i.e. $(\Gamma_h U)(s) = P_+(H(s)U(-s))$. We begin by giving a simple sufficient condition for compactness, using the results of Chapter 4.
Lemma 7.1 (The Riemann-Lebesgue lemma) If $h \in L_1$, then $H = \mathcal{L}h$ is continuous on the imaginary axis $s = iy$ and tends to zero as $y \to \pm\infty$.
Proof If $h$ is a step function the result is clearly true, since
$$\mathcal{L}\Big(\sum_{j=0}^{N-1} a_j \chi_{[jM/N, (j+1)M/N)}\Big)(s) = \sum_{j=0}^{N-1} a_j e^{-jMs/N}(1 - e^{-Ms/N})/s,$$
as in the proof of Theorem 2.18.
Now if $h$ is an arbitrary $L_1$ function we can choose a sequence $(h_n)$ of step functions with $h_n \to h$ in $L_1$. Then $\mathcal{L}h_n \to \mathcal{L}h$ uniformly in $\operatorname{Re} s \ge 0$, using the result of Theorem 2.19 that $\|\mathcal{L}h\|_\infty \le \|h\|_{L_1}$, and so the result holds for $\mathcal{L}h$.
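The step-function formula used in the proof of Lemma 7.1 can be compared with a direct numerical integration. This is only a sketch: the step heights, the value of $M$ and the sample frequencies are hypothetical choices, and the midpoint rule stands in for the exact integral.

```python
import cmath

def laplace_step_closed_form(coeffs, M, s):
    """Closed form for the Laplace transform of sum_j a_j * chi_[jM/N, (j+1)M/N)."""
    N = len(coeffs)
    return sum(a * cmath.exp(-j * M * s / N) * (1 - cmath.exp(-M * s / N)) / s
               for j, a in enumerate(coeffs))

def laplace_step_numeric(coeffs, M, s, steps_per_piece=2000):
    """Midpoint-rule approximation of the same Laplace integral."""
    N = len(coeffs)
    width = M / N
    dt = width / steps_per_piece
    total = 0j
    for j, a in enumerate(coeffs):
        for k in range(steps_per_piece):
            t = j * width + (k + 0.5) * dt
            total += a * cmath.exp(-s * t) * dt
    return total

coeffs = [1.0, -2.0, 0.5]    # hypothetical step heights a_0, a_1, a_2
M = 3.0
s = 2j                       # a point s = iy on the imaginary axis
closed = laplace_step_closed_form(coeffs, M, s)
numeric = laplace_step_numeric(coeffs, M, s)

# Decay along the imaginary axis: |H(iy)| <= 2 * sum|a_j| / y for a step function.
frequencies = (10.0, 100.0, 1000.0)
decay = [abs(laplace_step_closed_form(coeffs, M, 1j * y)) for y in frequencies]
```

The bound $2\sum|a_j|/y$ quoted in the comment follows directly from the closed form, since each factor $|e^{-jMs/N}(1 - e^{-Ms/N})|$ is at most 2 when $s = iy$.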
Corollary 7.2 If $h \in L_1$, then $\Gamma_h$ is a compact operator.
Proof This follows from Lemma 7.1 and Corollary 4.10.
We now turn our attention to Hilbert-Schmidt operators. For $h \in L_1$ and $t \ge 0$ we write $h_t$ for the function which is $h$ shifted by $t$, i.e. $h_t(\tau) = h(t + \tau)$.
Theorem 7.3 Suppose that $h \in L_1$ determines the bounded Hankel operator $\Gamma$. Then (i) $\bar{h}$ determines the operator $\Gamma^*$; (ii) $\Gamma$ is Hilbert-Schmidt if and only if $t^{1/2}h(t) \in L_2(0, \infty)$, and if so $\|\Gamma\|_{HS} = \|t^{1/2}h\|_{L_2}$.
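Proof (i) Suppose that $u, v \in L_1 \cap L_\infty(0, \infty)$. Then
$$(\Gamma_h u, v) = \int_0^\infty \int_0^\infty h(t + \tau)\, u(\tau)\, \overline{v(t)}\,d\tau\,dt = \int_0^\infty u(\tau) \int_0^\infty h(t + \tau)\, \overline{v(t)}\,dt\,d\tau,$$
by Fubini's theorem (see the Appendix), $= (u, \Gamma_{\bar h}v)$.
Thus $(u, \Gamma_h^* v) = (u, \Gamma_{\bar h}v)$ at least on a dense set of $u$ and $v$. Since $\Gamma_h^*$ and $\Gamma_{\bar h}$ are continuous, this is true always, and $\Gamma_h^* = \Gamma_{\bar h}$.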
(ii) We have $(h_t, \bar{v}_i) = \sigma_i w_i(t)$ almost everywhere. Therefore
$$\sum_i \sigma_i^2 = \sum_i \|\Gamma v_i\|^2 = \sum_i \int_0^\infty |(h_t, \bar{v}_i)|^2\,dt = \int_0^\infty \sum_i |(h_t, \bar{v}_i)|^2\,dt,$$
since all terms are positive. This gives $\int_0^\infty \|h_t\|^2\,dt$, since the sequence $(v_i)$ (augmented if necessary by vectors from the kernel of $\Gamma$) forms an orthonormal basis. Hence
$$\sum_i \sigma_i^2 = \int_0^\infty \int_0^\infty |h(t + \tau)|^2\,d\tau\,dt = \int_{r=0}^\infty \int_{t=0}^{r} |h(r)|^2\,dt\,dr = \int_0^\infty r\,|h(r)|^2\,dr$$
(changing variables by setting $r = t + \tau$). If either side converges, then so does the other.
For example, the kernel $h(t) = e^{-t}/\sqrt{t}$ determines a Hilbert-Schmidt operator, as does any $L_2$ function of compact support. Nuclearity is more complicated, but the following inequality is useful.
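For the kernel $h(t) = e^{-t}/\sqrt{t}$ the weighted integral in Theorem 7.3(ii) is $\int_0^\infty e^{-2r}\,dr = 1/2$. The sketch below compares this with a brute-force midpoint approximation of the double integral of $|h(t + \tau)|^2$; the truncation lengths and grid sizes are assumptions made for illustration, and the double integral is only accurate to about two decimal places because of the mild singularity at the origin.

```python
import math

def h(t):
    # Kernel from the text: h(t) = exp(-t) / sqrt(t)
    return math.exp(-t) / math.sqrt(t)

def hs_norm_sq_single(T=20.0, n=200000):
    """Midpoint rule for the integral of r * |h(r)|^2 over (0, T)."""
    dr = T / n
    return sum((k + 0.5) * dr * h((k + 0.5) * dr) ** 2 for k in range(n)) * dr

def hs_norm_sq_double(T=10.0, n=400):
    """Midpoint rule for the double integral of |h(t + tau)|^2 over (0, T)^2."""
    d = T / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            # The midpoint of cell (i, j) has t + tau = (i + j + 1) * d.
            total += h((i + j + 1) * d) ** 2
    return total * d * d

single = hs_norm_sq_single()
double = hs_norm_sq_double()
```

Both quantities approximate $\|\Gamma\|_{HS}^2$, so under these assumptions the Hilbert-Schmidt norm of this operator is $1/\sqrt{2}$.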
Theorem 7.4 For the integral operator $(\Gamma u)(t) = \int_0^\infty h(t + \tau)u(\tau)\,d\tau$, with $h \in L_1$, we have
$$\|\Gamma\| \le \|\mathcal{L}h\|_\infty \le \|h\|_{L_1} \le 2\|\Gamma\|_N.$$
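Proof Only the final inequality is in need of proof (see Theorem 2.19 and Proposition 4.1 for the others).
Suppose first that $h$ is continuous and of compact support, $[0, M]$, say. For $\alpha > 0$ and $n = 0, 1, 2, \ldots$ define the functions $e_n(t) = \alpha^{-1/2}\chi_{(n\alpha, (n+1)\alpha)}(t)$. Then $e_0, e_1, \ldots$ is an orthonormal sequence in $L_2(0, \infty)$ and
$$(\Gamma e_n, e_n) = \frac{1}{\alpha}\int_{n\alpha}^{(n+1)\alpha}\int_{n\alpha}^{(n+1)\alpha} h(t + \tau)\,dt\,d\tau.$$
By the uniform continuity of $h$ we have for sufficiently small $\alpha$ that
$|h(v_1) - h(v_2)| < \varepsilon/2M$ if $|v_1 - v_2| \le 2\alpha$.
Thus $|(\Gamma e_n, e_n) - \alpha h(2n\alpha)| \le \alpha\varepsilon/2M$, and
$$\Big|\tfrac{1}{2}\int_{2n\alpha}^{2(n+1)\alpha} |h(r)|\,dr - \alpha|h(2n\alpha)|\Big| \le \alpha\varepsilon/2M.$$
Hence
$$\Big||(\Gamma e_n, e_n)| - \tfrac{1}{2}\int_{2n\alpha}^{2(n+1)\alpha} |h(r)|\,dr\Big| \le \alpha\varepsilon/M,$$
and so
$$\Big|\sum_{n=0}^\infty |(\Gamma e_n, e_n)| - \tfrac{1}{2}\|h\|_1\Big| \le (\alpha\varepsilon/M)(1 + M/\alpha),$$
so that $\sum_{n=0}^\infty |(\Gamma e_n, e_n)| \to \tfrac{1}{2}\|h\|_1$ as $\alpha \to 0$.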
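But for any $h_0 \in L_1$, corresponding to the operator $\Gamma_0$, and $\alpha > 0$, we have
$$|(\Gamma_0 e_n, e_n)| \le \int_{2n\alpha}^{2(n+1)\alpha} |h_0(r)|\,dr,$$
since for each $r$ we obtain an obvious upper bound in the double integral for $(\Gamma_0 e_n, e_n)$.
Thus, given $\varepsilon > 0$, and $h \in L_1$, let $h = h_1 + h_2$, where $h_1$ is continuous with compact support and $\|h_2\|_1 < \varepsilon/2$. Then
$$\Big|\sum_{n=0}^\infty |(\Gamma e_n, e_n)| - \tfrac{1}{2}\|h\|_1\Big| \le \sum_{n=0}^\infty |(\Gamma_2 e_n, e_n)| + \Big|\sum_{n=0}^\infty |(\Gamma_1 e_n, e_n)| - \tfrac{1}{2}\|h_1\|_1\Big| + \tfrac{1}{2}\|h_2\|_1$$
$$\le \|h_2\|_1 + \Big|\sum_{n=0}^\infty |(\Gamma_1 e_n, e_n)| - \tfrac{1}{2}\|h_1\|_1\Big| + \tfrac{1}{2}\|h_2\|_1 < \varepsilon$$
if $\alpha$ is sufficiently small.
But, by Theorem 1.12, $\|\Gamma\|_N \ge \sum_{n=0}^\infty |(\Gamma e_n, e_n)|$, and hence $\|\Gamma\|_N \ge \tfrac{1}{2}\|h\|_1$, as required.
The above result has a corollary which gives some useful bounds for model reduction. More such results may be found in the papers of Glover and Partington, and of Glover et al.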
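The limit used in the proof of Theorem 7.4 can be observed for the hypothetical kernel $h(t) = e^{-t}$, for which $\|h\|_1 = 1$. In this case the double integral factorises, and a short calculation (made here, not taken from the text) gives $\sum_n (\Gamma e_n, e_n) = \tanh(\alpha/2)/\alpha$, which tends to $1/2$ as $\alpha \to 0$. A minimal sketch:

```python
import math

def diagonal_sum(alpha, terms=None):
    """Sum of (Gamma e_n, e_n) for h(t) = exp(-t) and step width alpha.

    For this kernel the double integral factorises, so
    (Gamma e_n, e_n) = (1/alpha) * (integral of exp(-t) over (n*alpha, (n+1)*alpha))**2.
    """
    if terms is None:
        terms = int(40 / alpha) + 1   # contributions beyond t = 40 are negligible
    total = 0.0
    for n in range(terms):
        piece = math.exp(-n * alpha) - math.exp(-(n + 1) * alpha)
        total += piece * piece / alpha
    return total

alphas = (1.0, 0.1, 0.01)
sums = {alpha: diagonal_sum(alpha) for alpha in alphas}
closed_forms = {alpha: math.tanh(alpha / 2) / alpha for alpha in alphas}
half_l1_norm = 0.5   # (1/2) * ||h||_1 for h(t) = exp(-t)
```

For this kernel the operator has rank one and nuclear norm $1/2$, so the final inequality of Theorem 7.4 holds with equality.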
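Corollary 7.5 Let $H = \mathcal{L}h$ determine a nuclear Hankel operator. If $\Gamma_1$ is an optimal rank-$n$ approximant to $\Gamma$ (as in Chapter 6) and $H_1 = \mathcal{L}h_1 \in RH_\infty(\mathbb{C}_+)$ a rational symbol for $\Gamma_1$, then
$$\|h - h_1\|_1 \le 2\|\Gamma - \Gamma_1\|_N \le 4n\sigma_{n+1}(\Gamma) + 2(\sigma_{n+1}(\Gamma) + \sigma_{n+2}(\Gamma) + \ldots).$$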
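Proof The singular values of $\Gamma - \Gamma_1$ satisfy
(a) $\sigma_i(\Gamma - \Gamma_1) \le \sigma_{n+1}(\Gamma)$ for all $i$, since $\|\Gamma - \Gamma_1\| \le \sigma_{n+1}(\Gamma)$; and
(b) $\sigma_{j+n}(\Gamma - \Gamma_1) \le \sigma_j(\Gamma)$ for all $j$, since $\operatorname{rank}(\Gamma_1) = n$ and so $\sigma_{n+1}(\Gamma_1) = 0$ (using Corollary 1.5).
Hence $\|\Gamma - \Gamma_1\|_N \le 2n\sigma_{n+1}(\Gamma) + (\sigma_{n+1}(\Gamma) + \sigma_{n+2}(\Gamma) + \ldots)$, and the result follows from Theorem 7.4.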
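The right-hand side of Corollary 7.5 is easy to evaluate once the singular values are known. The helper below is a hypothetical illustration (the function name and the sample singular values are not from the text); it treats any singular values beyond the end of the supplied list as zero.

```python
def l1_error_bound(singular_values, n):
    """Right-hand side of Corollary 7.5: 4*n*sigma_{n+1} + 2*(sigma_{n+1} + sigma_{n+2} + ...).

    `singular_values` is a decreasing list [sigma_1, sigma_2, ...]; entries beyond
    the end of the list are treated as zero (an assumption of this sketch).
    """
    tail = singular_values[n:]          # sigma_{n+1}, sigma_{n+2}, ...
    if not tail:
        return 0.0
    return 4 * n * tail[0] + 2 * sum(tail)

# Hypothetical singular values, chosen only for illustration.
sigmas = [1.0, 0.5, 0.25, 0.125]
bounds = [l1_error_bound(sigmas, n) for n in range(len(sigmas) + 1)]
```

With these sample values the bound does not improve between $n = 0$ and $n = 1$, which shows that the term $4n\sigma_{n+1}$ can dominate when the singular values decay slowly.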
We recall from Chapter 6 that we also have $\|H - H_1 - F\|_\infty \le \sigma_{n+1}(\Gamma)$, for some $F$.
We conclude with a characterisation of nuclear Hankel operators as those which are an $l_1$ sum of rank-one Hankel operators:
$\Gamma = \sum_{i=1}^\infty \Gamma_i$, with $\operatorname{rank}(\Gamma_i) = 1$ for all $i$ and $\sum_{i=1}^\infty \|\Gamma_i\| < \infty$.
Certainly any nuclear operator $T$ with $Tx = \sum_{i=1}^\infty \sigma_i(x, v_i)w_i$ is an $l_1$ sum of rank-one operators, but here we require the summands to be Hankel operators. The results that follow are due principally to Peller, Coifman and Rochberg, though we shall give a simplified presentation due to Bonsall and Walsh. It will be convenient to work with operators on the disc.
We begin by defining a new class of normed spaces.
Definition 7.6 For $1 \le p < \infty$, the Bergman space $B_p$ is the space of analytic functions on $D = \{|z| < 1\}$ such that
$$\|f\|_{B_p} = \Big(\frac{1}{\pi}\iint_D |f(z)|^p\,dx\,dy\Big)^{1/p} < \infty.$$
These spaces thus have a strong similarity to the $H_p$ spaces: it is the rate of growth of the function at the boundary that determines whether a function is in $B_p$.
Theorem 7.7 (Peller, Coifman and Rochberg, Bonsall and Walsh) Let $g \in H_2$ and $h(z) = z^2g(z)$. Then the following are equivalent.
(i) $\Gamma_g$ is nuclear;
(ii) $h'' \in B_1$;
(iii) $g(z) = \sum_{k=1}^\infty \lambda_k(1 - |w_k|^2)/(1 - w_kz)$, with $\sum_{k=1}^\infty |\lambda_k| < \infty$ and $w_k \in D$, the sum converging uniformly in $D$.
Moreover $\|h''\|_{B_1} \le (8/\pi)\|\Gamma_g\|_N$,
$\inf\{\sum_{k=1}^\infty |\lambda_k| : \text{(iii) holds}\} \le \|h''\|_{B_1}$ and
$\|\Gamma_g\|_N \le \inf\{\sum_{k=1}^\infty |\lambda_k| : \text{(iii) holds}\}$.
Remarks We saw in Example 3.5 that $(1 - |w|^2)/(1 - wz)$ does determine a rank-one Hankel matrix of norm 1, provided that $|w| < 1$. There is no restriction in assuming that $g \in H_2$, since $g \in P(L_\infty)$ anyway, by Nehari's Theorem. Moreover it is also true that $\Gamma_g$ is nuclear if and only if $g''$ is in $B_1$, but one cannot obtain an estimate of the norm that way since $g'' = 0$ for all polynomials of degree one or less.
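The first remark can be illustrated on a finite truncation. The sketch below assumes that the symbol $(1 - |w|^2)/(1 - wz)$ corresponds to the Hankel matrix with $(i, j)$ entry $(1 - |w|^2)w^{i+j}$; the point $w$ and the truncation size are arbitrary choices. Since such a matrix is a scalar multiple of $uu^T$ with $u_i = w^i$, its operator norm is $(1 - |w|^2)\sum_i |w|^{2i}$.

```python
def hankel_entries(w, size):
    """Truncated Hankel matrix with (i, j) entry (1 - |w|^2) * w^(i + j)."""
    scale = 1 - abs(w) ** 2
    return [[scale * w ** (i + j) for j in range(size)] for i in range(size)]

def max_adjacent_minor(matrix):
    """Largest modulus of an adjacent 2x2 minor; these vanish for a rank-one matrix."""
    size = len(matrix)
    worst = 0.0
    for i in range(size - 1):
        for j in range(size - 1):
            minor = (matrix[i][j] * matrix[i + 1][j + 1]
                     - matrix[i][j + 1] * matrix[i + 1][j])
            worst = max(worst, abs(minor))
    return worst

def rank_one_norm(w, size):
    """Operator norm of the truncation, using that it equals scale * u u^T with u_i = w^i."""
    return (1 - abs(w) ** 2) * sum(abs(w) ** (2 * i) for i in range(size))

w = 0.6 + 0.3j      # hypothetical point of the disc
H = hankel_entries(w, 40)
minor = max_adjacent_minor(H)
norm = rank_one_norm(w, 40)
```

The truncated norm equals $1 - |w|^{2N}$ for an $N \times N$ truncation, which approaches 1 as $N$ grows.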
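Proof of Theorem 7.7 We show that (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i).
(i) $\Rightarrow$ (ii): If $\Gamma_g$ is nuclear then, as usual, we have $\Gamma_g x = \sum_{i=1}^\infty \sigma_i(x, v_i)w_i$. By Cauchy's integral formula (see the Appendix)
$$h''(w) = \frac{1}{i\pi}\int_{|z|=1} \frac{g(z)z^2}{(z - w)^3}\,dz = \frac{1}{\pi}\int_0^{2\pi} \frac{g(e^{i\theta})\,e^{3i\theta}}{(e^{i\theta} - w)^3}\,d\theta.$$
Let us write $f_w(z) = (1 - wz)^{-3/2}$, for $w \in D$. Our aim is to express the Bergman norm of $h''$ in terms of the functions $f_w$.
Now
$$h''(w) = \frac{1}{\pi}\int_0^{2\pi} \frac{g(e^{i\theta})}{(1 - we^{-i\theta})^3}\,d\theta = \frac{1}{\pi}\int_0^{2\pi} g(e^{i\theta})\,(Rf_w)(e^{i\theta})\,\overline{f_{\bar w}(e^{i\theta})}\,d\theta = 2(g(Rf_w), f_{\bar w}) = 2(\Gamma_g f_w, f_{\bar w}) = 2\sum_{i=1}^\infty \sigma_i(f_w, v_i)(w_i, f_{\bar w}).$$
We are therefore led to consider $(v, f_{\bar w}) = \int_0^{2\pi} v(e^{i\theta})(1 - we^{-i\theta})^{-3/2}\,d\theta/2\pi$ for $v \in H_2$.
Note that if $v(z) = z^n$, then
$$(v, f_{\bar w}) = \int_0^{2\pi} e^{in\theta}(1 - we^{-i\theta})^{-3/2}\,\frac{d\theta}{2\pi} = \int_0^{2\pi} e^{-in\theta}(1 - we^{i\theta})^{-3/2}\,\frac{d\theta}{2\pi} = \frac{1}{2\pi i}\int_{|z|=1} z^{-n-1}(1 - wz)^{-3/2}\,dz.$$
We can evaluate this integral using the Residue Theorem: the only pole of this function is at zero, and the residue is the coefficient of $z^n$ in $(1 - wz)^{-3/2}$, which is
$(1/n!)(3/2)(5/2)\ldots((2n + 1)/2)\,w^n = \beta_n w^n$, say, where $\beta_n = (2n + 1)!/(2^{2n}(n!)^2)$.
Moreover $(f_w, v_i) = \overline{(v_i, f_w)}$. Hence
$$\|h''\|_{B_1} \le 2\sum_{i=1}^\infty \sigma_i\,\|(v_i, f_{\bar w})\|_{B_2}\,\|(w_i, f_{\bar w})\|_{B_2},$$
by the Cauchy-Schwarz inequality. But, calculating $\|(v, f_{\bar w})\|_{B_2}^2$, we obtain
$$\frac{1}{\pi}\int_0^1\int_0^{2\pi} |(v, f_{\bar w})|^2\,r\,d\theta\,dr, \quad \text{where } w = re^{i\theta},$$
and if $v(z)$ has the power series expansion $\sum_{n=0}^\infty a_nz^n$ then $(v, f_{\bar w}) = \sum_{n=0}^\infty a_n\beta_nw^n$: hence
$$\|(v, f_{\bar w})\|_{B_2}^2 = 2\int_0^1 \sum_{n=0}^\infty |\beta_n|^2|a_n|^2 r^{2n+1}\,dr = \sum_{n=0}^\infty |a_n|^2|\beta_n|^2/(n + 1).$$
Let us write $\gamma_n = \beta_n^2/(n + 1)$. Then $\gamma_0 = 1$, $\gamma_1 = 9/8$, $\gamma_2 = 75/64$, and generally
$\gamma_{n+1}/\gamma_n = (4n^2 + 12n + 9)/(4n^2 + 12n + 8)$, so that $\gamma_n$ increases with $n$. Moreover, recalling Stirling's formula that $m!$ is asymptotic to $m^me^{-m}(2\pi m)^{1/2}$ as $m \to \infty$, we may evaluate $\lim \gamma_n$, to obtain the following.
$$\lim \gamma_n = \lim \frac{(n + 1)^{-1}(2n + 1)^{4n+3}e^{-(4n+2)}\,2\pi}{2^{4n}n^{4n+2}e^{-4n}\,4\pi^2} = \lim\,(4/\pi)e^{-2}(1 + 1/2n)^{4n} = 4/\pi.$$
It follows that $\|(v, f_{\bar w})\|_{B_2}^2 \le (4/\pi)\|v\|^2$.
Using this, we see that $\|h''\|_{B_1} \le (8/\pi)\sum_{i=1}^\infty \sigma_i$, as required.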
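The stated values of $\gamma_n$, the ratio formula and the limit $4/\pi$ can all be checked with exact rational arithmetic. This is a small sketch using only the definitions $\beta_n = (2n+1)!/(2^{2n}(n!)^2)$ and $\gamma_n = \beta_n^2/(n+1)$ from the proof; the cut-offs 50 and 2000 are arbitrary.

```python
from fractions import Fraction
import math

def beta(n):
    """beta_n = (2n + 1)! / (2^(2n) (n!)^2), the coefficient of (wz)^n in (1 - wz)^(-3/2)."""
    return Fraction(math.factorial(2 * n + 1), 4 ** n * math.factorial(n) ** 2)

def gamma(n):
    """gamma_n = beta_n^2 / (n + 1)."""
    return beta(n) ** 2 / (n + 1)

first_values = [gamma(n) for n in range(3)]
ratios_ok = all(
    gamma(n + 1) / gamma(n) == Fraction(4 * n * n + 12 * n + 9, 4 * n * n + 12 * n + 8)
    for n in range(50)
)
increasing = all(gamma(n) < gamma(n + 1) for n in range(50))
# gamma_n increases towards 4/pi, so this gap should be small and positive.
gap_to_limit = 4 / math.pi - float(gamma(2000))
```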
(ii) $\Rightarrow$ (iii): the building blocks that we shall require to compose $h''$, rather than $g$, will be
$b_w(z) = (z^2(1 - |w|^2)/(1 - wz))''$, $w \in D$, i.e.
$b_w(z) = 2(1 - |w|^2)/(1 - wz)^3$.
Note that $\|b_w\|_{B_1} \le (8/\pi)$, and the corresponding $g$ determines a rank-one Hankel operator of norm 1.
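The bound $\|b_w\|_{B_1} \le 8/\pi$ can be probed numerically. The sketch below approximates the Bergman norm by a midpoint rule in polar coordinates; the grid sizes and the sample points $w$ are assumptions made for illustration, and the quadrature is not a proof.

```python
import cmath
import math

def b(w, z):
    """Building block b_w(z) = 2 (1 - |w|^2) / (1 - w z)^3."""
    return 2 * (1 - abs(w) ** 2) / (1 - w * z) ** 3

def bergman_b1_norm(f, radial=300, angular=300):
    """Midpoint rule in polar coordinates for (1/pi) * integral over the disc of |f|."""
    dr = 1.0 / radial
    dtheta = 2 * math.pi / angular
    total = 0.0
    for i in range(radial):
        r = (i + 0.5) * dr
        for j in range(angular):
            z = cmath.rect(r, (j + 0.5) * dtheta)
            total += abs(f(z)) * r
    return total * dr * dtheta / math.pi

sample_points = (0.0, 0.5, 0.8j)    # hypothetical choices of w in the disc
norms = {w: bergman_b1_norm(lambda z, w=w: b(w, z)) for w in sample_points}
```

For $w = 0$ the function $b_w$ is the constant 2, so its $B_1$ norm is exactly 2, which gives a simple check on the quadrature.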
To obtain a decomposition of functions in $B_1$, it is helpful to identify the dual space $B_1^*$.
Let $A$ be the space of functions $f$ which are analytic in $D$, vanish at $z = 0$, and satisfy
$\|f\|_A = \sup\{(1 - |z|^2)|f'(z)| : z \in D\} < \infty$.
Lemma 7.8 The dual of $B_1$ is isomorphic to $A$, with the pairing $g \in A \leftrightarrow \phi \in B_1^*$, where
$$\phi(f) = \frac{1}{\pi}\int_0^1\int_0^{2\pi} (1 - r^2)\,g'(re^{-i\theta})\,f(re^{i\theta})\,r\,d\theta\,dr,$$
and $\|\phi\|_{B_1^*} \le \|g\|_A = \sup\{|\phi(b_w)| : w \in D\} \le (8/\pi)\|\phi\|_{B_1^*}$.
Proof Clearly $|\phi(f)| \le \|f\|_{B_1}\|g\|_A$, and hence the first inequality follows.
Now, given $\phi \in B_1^*$, define
$$g(a) = \int_0^a \phi(b_w)\,dw/(1 - |w|^2) = \int_0^a \phi(2/(1 - wz)^3)\,dw.$$
Then $g$ is analytic with $g(0) = 0$ and $g'(a) = \phi(2/(1 - az)^3) = \phi(b_a)/(1 - |a|^2)$, a power series in $a$. Thus $\|g\|_A = \sup\{|g'(a)|(1 - |a|^2)\} \le (8/\pi)\|\phi\|_{B_1^*}$.
We therefore have two continuous linear maps:
$\alpha: A \to B_1^*$ (taking $g$ to $\phi$), and
$\beta: B_1^* \to A$ (taking $\phi$ to $g$).
It is therefore sufficient to verify that $\beta\alpha$ is the identity map: that is, that if $g$ determines $\phi$ by the integral formula above, then
$$\phi(b_a) = \frac{1}{\pi}\int_0^1\int_0^{2\pi} (1 - r^2)\,g'(\bar z)\,\frac{2(1 - |a|^2)}{(1 - az)^3}\,r\,d\theta\,dr \quad (z = re^{i\theta})$$
equals $(1 - |a|^2)g'(a)$.
As usual it is sufficient to verify this for $g(z) = z^n$, when we obtain
$$\frac{1}{i\pi}\int_0^1\int_{|z|=r} (1 - r^2)\,r\,n(r^2/z)^{n-1}\,\frac{2(1 - |a|^2)}{(1 - az)^3}\,\frac{dz}{z}\,dr = 4\int_0^1 r^{2n-1}(1 - r^2)\,na^{n-1}\,\frac{n(n + 1)}{2}\,(1 - |a|^2)\,dr,$$
by the Residue Theorem (the only pole being at 0),
$$= 2\Big(\frac{1}{2n} - \frac{1}{2n + 2}\Big)na^{n-1}\,n(n + 1)(1 - |a|^2) = na^{n-1}(1 - |a|^2),$$
as required.
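The arithmetic in the last display can be confirmed exactly. In this sketch (the function names are illustrative only) the radial integral $\int_0^1 r^{2n-1}(1 - r^2)\,dr$ is evaluated in closed form and combined with the remaining factors; the result should be $n$, the coefficient of $a^{n-1}$ in $g'(a)$ for $g(z) = z^n$.

```python
from fractions import Fraction

def radial_integral(n):
    """Exact value of the integral of r^(2n-1) (1 - r^2) over [0, 1]."""
    return Fraction(1, 2 * n) - Fraction(1, 2 * n + 2)

def pairing_coefficient(n):
    """Factor multiplying a^(n-1) (1 - |a|^2) in phi(b_a) when g(z) = z^n.

    This is 4 * (radial integral) * n * n(n+1)/2, following the displayed computation.
    """
    return 4 * radial_integral(n) * n * Fraction(n * (n + 1), 2)

coefficients = [pairing_coefficient(n) for n in range(1, 11)]
```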
Now, to complete the proof that (ii) $\Rightarrow$ (iii), observe that for any $\phi \in B_1^*$, we have that $\|\phi\| \le \sup\{|\phi(b_w)| : w \in D\}$. Consider now the closed absolutely convex set $S$ which is the closure of $\{\sum_{k=1}^n \lambda_kb_{w_k} : n \in \mathbb{N}, \sum_{k=1}^n |\lambda_k| \le 1\}$.
If $f \in B_1$ and $\|f\|_{B_1} \le 1$, then $f \in S$, for if not it would follow that there existed a functional $\phi \in B_1^*$ with $|\phi(f)| > 1$ but $\sup\{|\phi(b_w)| : w \in D\} \le 1$, by the Hahn-Banach theorem (in its `separating hyperplane' version - see the Appendix). But then $\|\phi\| \le 1$, which is a contradiction.
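Hence, given $\varepsilon > 0$, there is a convex combination $\sum_{k=1}^n \lambda_kb_{w_k}$, such that $\sum_{k=1}^n |\lambda_k| \le 1$ and $\|f - \sum_{k=1}^n \lambda_kb_{w_k}\| < \varepsilon$. Iterating, using $f - \sum_{k=1}^n \lambda_kb_{w_k}$, we obtain $\|f - \sum_{k=1}^n \lambda_kb_{w_k} - \sum_k \mu_kb_{x_k}\| < \varepsilon^2$, and $\sum_k |\mu_k| \le \varepsilon$, and so on. Thus ultimately we obtain an expansion $f = \sum_{k=1}^\infty \lambda_kb_{w_k}$, with $\sum_{k=1}^\infty |\lambda_k| \le 1 + \varepsilon + \varepsilon^2 + \ldots = 1/(1 - \varepsilon)$.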
As a result of this, given a function $h$ with $h'' \in B_1$, and $\varepsilon > 0$, there are constants $(\lambda_k)$ and points $(w_k)$ such that
$$h'' = \sum_{k=1}^\infty \lambda_k\,(z^2(1 - |w_k|^2)/(1 - w_kz))'',$$
converging locally uniformly since $|b_w(z)| \le 2(1 - |z|)^{-3}$, and with $\sum_{k=1}^\infty |\lambda_k| \le (1 + \varepsilon)\|h''\|_{B_1}$. Integrating term by term gives
$$h(z) = \sum_{k=1}^\infty \lambda_kz^2(1 - |w_k|^2)/(1 - w_kz),$$
since $h(0) = h'(0) = 0$. Hence (iii) follows.
Finally (iii) $\Rightarrow$ (i) easily, since the space of nuclear operators is complete, and if
$g(z) = \sum_{k=1}^\infty \lambda_k(1 - |w_k|^2)/(1 - w_kz)$,
then we have convergence in the nuclear norm, since $\sum_{k=1}^\infty |\lambda_k| < \infty$, and each rank one operator with symbol $(1 - |w_k|^2)/(1 - w_kz)$ has nuclear norm one. We also obtain convergence in $H_\infty$, for similar reasons. This completes the proof of Theorem 7.7.
As usual it is possible to translate results on the disc into results on the halfplane. The corresponding formula is as follows.
Corollary 7.9 A Hankel operator $\Gamma$ on $H_2(\mathbb{C}_+)$ is nuclear if and only if it has a symbol of the form $G(s) = \sum_{k=1}^\infty \lambda_k\,(2(\operatorname{Re} a_k)/(s - a_k))$, with $\sum_{k=1}^\infty |\lambda_k| < \infty$, and $a_k \in \mathbb{C}_-$, the series converging in $H_\infty$. Moreover $\inf\{\sum_{k=1}^\infty |\lambda_k| : G(s) \text{ can be written as above}\} \le (8/\pi)\|\Gamma\|_N$.
Proof This follows using the equivalences established in Chapter 4.
We conclude with a result that tells us that if a Hankel integral operator with kernel $h$ is nuclear then $h$ must be well-behaved, both as to smoothness and as to rate of decay at $\infty$. Various partial converses are known.
Corollary 7.10 If $h(t) \in L_1(0, \infty)$ determines a nuclear Hankel operator, then $h$ is equal almost everywhere to a function $h_0$ which is continuous on $(0, \infty)$; in addition $h(t)$ satisfies
$|h(t)| \le (16/e\pi)\|\Gamma\|_N/t$ almost everywhere.
Proof Given $\varepsilon > 0$, take $h_\varepsilon(t) = \sum_{k=1}^\infty 2\lambda_k(\operatorname{Re} a_k)e^{a_kt}$, where $\sum_{k=1}^\infty |\lambda_k| \le (8/\pi + \varepsilon)\|\Gamma\|_N$ and $a_k \in \mathbb{C}_-$. Now the series for $h_\varepsilon$ converges uniformly on $[\delta, \infty)$ for any $\delta > 0$, since $\sup\{|xe^{-tx}| : t \in [\delta, \infty)\} = xe^{-\delta x}$, and $\sup\{|xe^{-\delta x}| : x \ge 0\} = 1/e\delta$.
It follows therefore that $h_\varepsilon$ is continuous and that $|h_\varepsilon(t)| \le 2(8/\pi + \varepsilon)\|\Gamma\|_N/et$. Since $h = h_\varepsilon$ almost everywhere, the result follows on letting $\varepsilon \to 0$.
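Two ingredients of Corollary 7.10 can be checked numerically: the elementary supremum $\sup_{x \ge 0} xe^{-\delta x} = 1/e\delta$, and the decay bound itself for one concrete kernel. The sketch below uses the hypothetical kernel $h(t) = e^{-t}$, whose Hankel operator has rank one and nuclear norm $1/2$; the grid parameters are arbitrary.

```python
import math

def sup_x_exp(delta, grid=200000, x_max=None):
    """Grid approximation to the supremum over x >= 0 of x * exp(-delta * x)."""
    if x_max is None:
        x_max = 50.0 / delta
    step = x_max / grid
    return max(k * step * math.exp(-delta * k * step) for k in range(grid + 1))

delta = 0.25
approx_sup = sup_x_exp(delta)
exact_sup = 1 / (math.e * delta)

# Decay bound of Corollary 7.10 for the hypothetical kernel h(t) = exp(-t),
# whose Hankel operator has rank one and nuclear norm 1/2.
nuclear_norm = 0.5

def bound(t):
    return (16 / (math.e * math.pi)) * nuclear_norm / t

bound_holds = all(math.exp(-t) <= bound(t) for t in (0.01 * k for k in range(1, 5001)))
```

For this kernel the bound is far from sharp, which is consistent with the remark that the constant is unlikely to be optimal.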
The constant $16/e\pi$ in the above corollary has no great significance, and is unlikely to be optimal. Although a lot is known about the singular values of Hankel operators, there is a regrettably large amount that is still a mystery. However the Exercises contain further material of interest.
APPENDIX - BACKGROUND RESULTS IN ANALYSIS
We collect here various results in Functional Analysis and Measure Theory which may be unfamiliar to some readers. It is not our intention to summarise the whole of Analysis within a few pages, but we do supply the necessary background to the results used in the main text.
Normed spaces, Banach spaces and Hilbert spaces
We recall that a (real or complex) vector space V is a set of points (or vectors) forming a commutative group under addition and with multiplication by scalars defined such that
(i) $\lambda(x + y) = \lambda x + \lambda y$, (ii) $(\lambda + \mu)x = \lambda x + \mu x$, (iii) $\lambda(\mu x) = (\lambda\mu)x$, and
(iv) $1x = x$, for all vectors $x$ and $y$, and all scalars $\lambda$ and $\mu$ (real or complex, as appropriate).
A norm on a vector space $V$ is a real non-negative function (the norm of a vector $x$ conventionally written $\|x\|$), such that
(i) $\|x\| > 0$ except when $x = 0$,
(ii) $\|\lambda x\| = |\lambda|\,\|x\|$, and
(iii) $\|x + y\| \le \|x\| + \|y\|$ (the triangle inequality),
where $\lambda$ is a scalar and $x$ and $y$ are vectors.
This makes $V$ into a normed space.
We give some important examples of normed spaces.
(i) $\mathbb{R}^n$ and $\mathbb{C}^n$, with the Euclidean norm $\|x\| = (|x_1|^2 + \ldots + |x_n|^2)^{1/2}$,
(ii) $C[0, 1]$, the space of continuous functions on the interval $[0, 1]$ with norm $\|f\| = \max |f(x)|$,
(iii) $CL_p[0, 1]$, $(1 \le p < \infty)$, the space of continuous functions on $[0, 1]$ with norm $\|f\| = (\int_0^1 |f(t)|^p\,dt)^{1/p}$,
(iv) $l_p$, the space of sequences $(x_n)$ such that $\|(x_n)\| = (\sum_{n=1}^\infty |x_n|^p)^{1/p} < \infty$,
(v) $L_p(A)$, for $A$ an interval, possibly infinite, the set of functions on $A$ whose $p$th powers are Lebesgue integrable (see below), with norm $\|f\| = (\int_A |f(t)|^p\,dt)^{1/p}$.
We say that a sequence $(x_n)$ of points in a normed space converges to a limit $x$, if $\|x_n - x\| \to 0$ as $n \to \infty$. A sequence in a normed space is said to be a Cauchy sequence if, given any $\varepsilon > 0$, there is a number $N$, such that $\|x_n - x_m\| < \varepsilon$ if $n, m > N$. Convergent sequences are always Cauchy sequences, but the converse is not true.
A normed space is complete if every Cauchy sequence has a limit. For example, $CL_p[0, 1]$ is not complete, since the sequence of functions $(f_n)$ taking the values 0 on $[0, 1/2 - 1/(n+1)]$, 1 on $[1/2, 1]$ and linear in between forms a Cauchy sequence with no continuous limit. The other spaces listed above are complete. A complete normed space is called a Banach space.
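The Cauchy property of the sequence $(f_n)$ can be seen numerically in the case $p = 1$. The sketch below (with a midpoint-rule integral and arbitrarily chosen indices) compares the computed distances with the value $\tfrac{1}{2}(1/(n+1) - 1/(m+1))$ for $m > n$, which is the area between the two ramps; that closed form is worked out here rather than quoted from the text.

```python
def f(n, x):
    """f_n: 0 on [0, 1/2 - 1/(n+1)], 1 on [1/2, 1], linear in between."""
    left = 0.5 - 1.0 / (n + 1)
    if x <= left:
        return 0.0
    if x >= 0.5:
        return 1.0
    return (x - left) * (n + 1)

def l1_distance(n, m, steps=100000):
    """Midpoint-rule approximation to the integral of |f_n - f_m| over [0, 1]."""
    dx = 1.0 / steps
    return sum(abs(f(n, (k + 0.5) * dx) - f(m, (k + 0.5) * dx))
               for k in range(steps)) * dx

pairs = [(2, 4), (10, 20), (100, 200)]
distances = [l1_distance(n, m) for n, m in pairs]
expected = [0.5 * (1.0 / (n + 1) - 1.0 / (m + 1)) for n, m in pairs]
```

The distances shrink to zero, while the pointwise limit is the discontinuous step function at $x = 1/2$, so no continuous limit exists.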
One way of obtaining a norm is by means of an inner-product (scalar product) on a vector space. This assigns to each pair of vectors, $x$ and $y$, a scalar quantity $(x, y)$, and satisfies:
(i) $(x, y) = (y, x)$ if the space is real, $(x, y) = \overline{(y, x)}$ if the space is complex;
(ii) $(x + y, z) = (x, z) + (y, z)$;
(iii) $(\lambda x, y) = \lambda(x, y)$, and hence also (iv) $(x, \lambda y) = \bar{\lambda}(x, y)$;
(v) $(x, x) \ge 0$ and $(x, x) = 0$ only if $x = 0$.
An inner-product gives rise to a norm, by setting $\|x\| = \sqrt{(x, x)}$. A Hilbert space is a complete inner-product space. Of the normed spaces listed above, $CL_2$ and $L_2$ are inner-product spaces, if we define $(f, g) = \int f\bar{g}$. The space $L_2$ is a Hilbert space, as is $l_2$.
One important concept in a Hilbert space is that of an orthonormal sequence. This is a sequence $(e_n)$ such that $(e_n, e_m) = 0$ if $n \ne m$, and $(e_n, e_n) = 1$ for all $n$. If every vector $x$ can be written as a sum $x = \sum_{n=1}^\infty a_ne_n$, then we say that $(e_n)$ is an orthonormal basis. In this case $a_n = (x, e_n)$ and $\|x\|^2 = \sum_{n=1}^\infty |a_n|^2$ (the Riesz-Fischer Theorem, used in Proposition 1.8).
Operators and Spectral Theory
Let $X$ be a complex normed space, and $T$ a function on $X$ with values in $X$. Then $T$ is a linear operator if
$T(\lambda x + \mu y) = \lambda Tx + \mu Ty$ for all $x, y \in X$, and $\lambda, \mu \in \mathbb{C}$.
$T$ is continuous (or bounded) if there exists a positive constant $C$ with $\|Tx\| \le C\|x\|$ for all $x \in X$. The least such $C$ for which the inequality holds is called the norm of $T$, $\|T\|$. We write $I$ to denote the identity map on $X$.
The spectrum of $T$, $\operatorname{Sp}(T)$, is the set of those $\lambda \in \mathbb{C}$ for which $T - \lambda I$ fails to have a continuous inverse. We write $\rho(T)$ for the spectral radius of $T$, which is defined to be $\max\{|\lambda| : \lambda \in \operatorname{Sp}(T)\}$.
A.1. The spectrum of $T$ is a closed, bounded, nonempty subset of $\mathbb{C}$ which contains all the eigenvalues of $T$. Moreover $\rho(T) = \lim_{n\to\infty} \|T^n\|^{1/n}$.
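The spectral radius formula in A.1 can be watched converging for a small matrix. This is a sketch under stated assumptions: the $2 \times 2$ matrix is a hypothetical example whose only eigenvalue is $0.5$, and the maximum row-sum norm is used in place of the operator norm (in finite dimensions all norms are equivalent, so the limit is the same).

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def row_sum_norm(a):
    """Maximum absolute row sum of a matrix."""
    return max(sum(abs(entry) for entry in row) for row in a)

def gelfand_estimates(t, powers):
    """Return ||T^n||^(1/n) for each n in `powers` (listed in increasing order)."""
    estimates = {}
    current = [[1.0, 0.0], [0.0, 1.0]]
    n = 0
    for target in powers:
        while n < target:
            current = mat_mul(current, t)
            n += 1
        estimates[target] = row_sum_norm(current) ** (1.0 / target)
    return estimates

T = [[0.5, 1.0], [0.0, 0.5]]    # hypothetical example: only eigenvalue 0.5, norm 1.5
estimates = gelfand_estimates(T, [1, 10, 100, 500])
```

The estimates decrease from the norm $1.5$ towards the spectral radius $0.5$, although slowly, because of the off-diagonal entry.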
In a normed space a set $S$ is compact if and only if it is sequentially compact, that is, if any sequence of elements of $S$ has a norm-convergent subsequence. An operator is said to be compact if and only if it maps bounded sets into subsets of compact sets - equivalently, given any bounded sequence $(x_n)$, the sequence $(Tx_n)$ has a convergent subsequence.
An important achievement of spectral theory is to describe the action of an operator in terms of the operator's eigenvalues and eigenvectors. In the finite-dimensional case, it is often possible
to choose a basis consisting of eigenvectors - with respect to such a basis the linear map is represented by a diagonal matrix. Failing this, one is able to choose two different bases such that the linear map then takes a diagonal form.
The natural infinite-dimensional setting for these ideas is in the study of compact operators. The Riesz theory of compact operators on a Banach space includes the following results.
A.2. If $T$ is compact then $\operatorname{Sp}(T)$ is either finite, or consists of a countable sequence of points, tending to zero. Every nonzero point $\lambda$ of $\operatorname{Sp}(T)$ is an eigenvalue, and has the property that the eigenspace $K_\lambda = \operatorname{Ker}(T - \lambda I)$ is finite dimensional.
On a Hilbert space H, where the norm is given by an inner product, we say that T is Hermitian if T = T*, that is, if (Tx, y) = (x, Ty) for all x, y e H. For compact Hermitian operators we shall encounter the Spectral Theorem. Its proof proceeds using the following subsidiary results which are of some interest in their own right.
A.3. If T is Hermitian then Sp(T) is real.
A.4. If $T$ is compact and Hermitian then either $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$, and hence $\rho(T) = \|T\|$. Moreover any nonzero eigenvalue $\lambda$ determines a finite-dimensional eigenspace $K_\lambda$, and induces a decomposition
$H = K_\lambda \oplus (K_\lambda)^\perp$, where each subspace is preserved by the action of $T$.
By an induction argument - which at each stage selects an eigenvalue of largest modulus and iterates the above decomposition into eigenspace plus orthogonal complement by looking next at
the restriction of T to the orthogonal complement - one obtains the spectral theorem in the following form.
A.5. If $T$ is compact and Hermitian, then there exists a sequence of real numbers $(\lambda_k)$ which is either finite or, if infinite, tends to zero, and a corresponding sequence of mutually orthogonal finite-dimensional eigenspaces $(K_{\lambda_k})$, such that every vector $y$ in $H$ has a unique decomposition
$y = \sum_{k=1}^\infty y_k + y_\infty$, with $y_k \in K_{\lambda_k}$ for each $k$ and with $y_\infty$ orthogonal to every $K_{\lambda_k}$; and such that $Ty = \sum_{k=1}^\infty \lambda_ky_k$.
By taking orthonormal bases inside each $K_{\lambda_k}$, we obtain the spectral theorem in the form given in Proposition 1.1.
The Stone-Weierstrass Theorem
Let K be a compact metric space, for example [0, 1] or T. (Generalisations to more abstract
topological spaces exist but need not concern us here.) We write CR(K) for the space of continuous real-valued functions on K, C(K) for the space of continuous complex-valued functions. Each is a normed space over the appropriate field, with the supremum norm
$\|f\| = \sup\{|f(x)| : x \in K\}$. The classical Weierstrass approximation theorem states that the polynomials are dense in CR([0, 1]), that is, that a real continuous function can be uniformly approximated by polynomials on the interval [0, 1].
The Stone-Weierstrass theorem is a generalization of this, and requires us to consider the notion of an algebra of functions. This is a set of functions that forms a vector space and is also
closed under multiplication. So, for example, the polynomials form an algebra, as do the trigonometric polynomials (polynomials in $e^{it}$ and $e^{-it}$).
An algebra $A$ of functions is said to separate points if, given any two distinct points $x, y \in K$, there is a function $f \in A$ such that $f(x) \ne f(y)$. Over the reals we then have the simplest form of the Stone-Weierstrass theorem as follows.
A.6. If A is a real algebra of continuous functions on K (a compact metric space) which separates points and contains the constant functions, then A is dense in CR(K) - that is, every function in CR(K) can be approximated arbitrarily closely (in the uniform norm) by functions in A.
Various proofs of this result are known. One such proof shows that if $f \in A$, then $|f|$ is in $\bar{A}$, the closure of $A$ (this is analogous to approximating $|x|$ by polynomials in $x$); it then proceeds by showing that if $f, g \in A$, then
$\max(f, g) = (f + g + |f - g|)/2 \in \bar{A}$, and that $\min(f, g) \in \bar{A}$. It follows that $\bar{A}$ is a lattice. One now uses the lattice operations to perform the desired approximation.
Over the complex numbers the above form of the Stone-Weierstrass theorem does not hold, since, for example, the function $f(z) = \bar{z}$ cannot be approximated arbitrarily closely on $\mathbb{T}$ by polynomials in $z$, since it is not analytic. However, allowing for this special case, we obtain a complex form of the theorem: it can be deduced from the real form by taking real and imaginary parts.
A.7. If A is a complex algebra of continuous functions on K (a compact metric space) which separates points, contains the constant functions, and is closed under complex conjugation, then A is dense in C(K).
The application of the Stone-Weierstrass theorem that we require is for the case of $C(\mathbb{T})$, with $A$ the algebra of trigonometric polynomials (i.e. polynomials in $z$ and $\bar{z}$). The fact that any continuous function can be uniformly approximated by trigonometric polynomials is used in Proposition 2.1 and Theorem 3.14. This particular result can also be derived using Fourier series methods, as in the book of Körner.
The Hahn-Banach Theorem
Suppose that $X$ is a normed space (real or complex). Then the norm of a continuous linear map $f: X \to \mathbb{C}$ (a continuous linear functional) is given by $\|f\| = \sup\{|f(x)| : \|x\| \le 1\}$.
The dual space, $X^*$, is the space of linear functionals equipped with the above norm.
Suppose now that $X$ and $Y$ are two normed spaces with $X \subset Y$. Then an element $g$ of $Y^*$ clearly determines a unique element $g|_X$ of $X^*$ by restricting its action to $X$. Moreover $\|g|_X\| \le \|g\|$. The Hahn-Banach theorem is concerned with the converse situation: the extension of a linear functional to a larger normed space. In its most common form it is stated as follows.
A.8. If $X$ and $Y$ are normed spaces with $X \subset Y$, and $f \in X^*$, then there exists a functional $\tilde{f} \in Y^*$ such that $\tilde{f}(x) = f(x)$ for all $x \in X$, and such that $\|\tilde{f}\|_{Y^*} = \|f\|_{X^*}$.
The most natural proof of this result proceeds by increasing the dimension of the space one
step at a time (thus adding in one independent vector to X, and repeating). Some set-theoretic arguments are required to complete the extension, and we shall not discuss them.
In the proof of Nehari's Theorem (Theorem 3.2) we use the Hahn-Banach theorem in the above form. Regarding H1 as a subspace of LI(T), we extend a linear functional defined on the smaller space to give one defined on the larger space.
Another form of the Hahn-Banach theorem is the more geometrical Separating Hyperplane Theorem. Note that A.8 implies that if $x \in X$ and $\|x\| > 1$, then there exists a linear functional $f \in X^*$ such that $f(x) > 1$ and $\|f\| \le 1$. In this case the set $\{y : f(y) = 1\}$ is a hyperplane separating $x$ from the unit ball $\{y : \|y\| \le 1\}$. To see this more generally, we make the following definitions.
A set $S$ in a normed space is convex, if for all $s, t \in S$, the line segment joining $s$ and $t$ is contained in $S$, i.e.
$\lambda s + (1 - \lambda)t \in S$ for all $0 \le \lambda \le 1$.
A set $S$ is said to be absolutely convex if it is convex and, in addition,
$\lambda s \in S$ for all $s \in S$, $|\lambda| \le 1$,
$\lambda$ being real or complex, as appropriate. Thus the unit ball of a normed space is always absolutely convex. The theorem of the separating hyperplane can now be stated, in the following form.
A.9. Let $X$ be a normed space, $S$ a closed absolutely convex subset of $X$, and $x$ a point of $X$ which is not in $S$. Then there exists a functional $f \in X^*$ such that $|f(s)| \le 1$ for all $s \in S$, and $|f(x)| > 1$. The result is used in this form in the proof of Theorem 7.7.
Results from Measure Theory
We begin with a few comments on Measure and Integration for the benefit of any readers unfamiliar with the concepts, and then discuss some key theorems in the subject.
A measure $\mu$, defined on a suitable class of sets (a measure space) is a function which takes non-negative values (we permit $\infty$) and is countably additive, that is
$\mu(A_1 \cup A_2 \cup \ldots) = \mu(A_1) + \mu(A_2) + \ldots$,
whenever the sets $(A_i)$ are pairwise disjoint. Informally one thinks of a measure as a generalization of length, area or volume - for some sets this can be measured directly, for others we have to proceed more carefully.
The most important example of a measure in Analysis is that of Lebesgue measure, which assigns to finite intervals (a, b) the measure b - a, and is extended from this to a larger class of sets, the measurable sets. For technical reasons it is not possible to define a measure on all sets,
but such problems do not concern us here. Under Lebesgue measure all finite sets and all
countable sets have measure zero. A property is said to hold almost everywhere if it holds everywhere except for a set of measure zero.
A real function $f$ is said to be measurable if $\{x : f(x) < a\}$ is a measurable set for every $a \in \mathbb{R}$. A complex function is measurable if and only if its real and imaginary parts are. The class of measurable functions includes all continuous functions and we shall implicitly assume from now on that all our functions are measurable.
The Lebesgue Integral is defined using Lebesgue measure, but can also be regarded as an
extension of the more classical Riemann integral to a larger set of functions. For example, continuous bounded functions defined on closed bounded intervals are Lebesgue integrable. The integral has the property that
$\int \sum_{i=1}^\infty a_i\chi_{A_i}(x)\,d\mu(x) = \sum_{i=1}^\infty a_i\mu(A_i)$, if the sets $(A_i)$ are disjoint and if the sum converges absolutely. Here we use the notation $\chi_A(x)$ to denote the characteristic function of a set $A$ (also known as the indicator function). This function takes the values 1 on $A$, 0 on the complement of $A$. Note that if a function is zero almost everywhere then its integral is zero: for many purposes it is convenient to identify two functions if they differ only on a set of measure zero.
We make extensive use of normalized Lebesgue measure on $\mathbb{T}$ in our discussion of Hardy spaces. This is constructed by parametrising $\mathbb{T}$ as $\{e^{i\theta} : 0 \le \theta < 2\pi\}$, and then transferring the Lebesgue measure from $[0, 2\pi)$, dividing it by $2\pi$ so as to make the total measure equal to one.
Convergence in measure
A sequence of functions $(f_n)$ is said to converge to a function $f$ in measure if, for all $\varepsilon > 0$, $\mu(\{x : |f_n(x) - f(x)| > \varepsilon\}) \to 0$ as $n \to \infty$.
Thus, for large $n$, $f_n$ is close to $f$, except on a small set. This does not guarantee that $f_n(x)$ actually tends to $f(x)$ for any $x$, and we make the following further definition. A sequence of functions $(f_n)$ is said to converge to a function $f$ almost everywhere if there is a set $E$ with $\mu(E) = 0$ such that $f_n(x) \to f(x)$ for all $x \notin E$. The connection between these two modes of convergence is as follows, though the proof is technical and will be omitted.
A.10. If a sequence (fn) of functions converges to f in measure, then there is a subsequence which converges to f almost everywhere. Conversely, if the functions are defined on a measure
space with finite total measure, then the condition that fn - f almost everywhere implies that
fn -a f in measure. We remark that, if a sequence (fn) of functions converges to f in Lp norm, i.e. if
J Vn -I' then fn
* 0,
fin measure. This is easily verified and is used in Chapter 2.
The Monotone Convergence Theorem
Suppose we are given a sequence of functions $f_n(x)$ that converges to a limit function $f(x)$ almost everywhere. It need not be true that $\int f_n \to \int f$, as the following easy example shows.
Let $f_n(x) = \max(n - n^2 x, 0)$ for $x \ge 0$, so that $f_n(x) = 0$ for $x \ge 1/n$, $f_n(0) = n$, and $f_n$ is linear on $[0, 1/n]$ and $[1/n, \infty)$. Then $\int f_n = 1/2$ for all n, and $f_n(x) \to 0$ as $n \to \infty$, for all $x > 0$; so the limit function has integral zero. It is therefore useful to establish conditions under which $\lim (\int f_n) = \int (\lim f_n)$. The theorem which follows gives one such condition.
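The spike example can be checked numerically. The sketch below is only an illustration: it assumes the spike has the form $f_n(x) = \max(n - n^2 x, 0)$, which is the form consistent with $f_n(0) = n$ and $f_n(x) = 0$ for $x \ge 1/n$, and the helper names `f_n` and `integral_f_n` are hypothetical.

```python
def f_n(n, x):
    """Spike f_n(x) = max(n - n^2 x, 0), defined for x >= 0 (assumed form)."""
    return max(n - n * n * x, 0.0)


def integral_f_n(n, pieces=1000):
    """Midpoint-rule integral of f_n over [0, 1/n], the support of f_n.

    f_n is linear on [0, 1/n], so the midpoint rule is exact up to rounding.
    """
    h = (1.0 / n) / pieces
    return sum(f_n(n, (k + 0.5) * h) for k in range(pieces)) * h


if __name__ == "__main__":
    # At a fixed point x > 0 the values eventually vanish,
    # while the integral does not change with n.
    for n in (1, 10, 100, 1000):
        print(n, f_n(n, 0.05), integral_f_n(n))
```

The printed integrals stay at the same value for every n while the values at the fixed point 0.05 eventually become zero, which is the failure of $\int f_n \to \int f$ described above.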
A.11. Suppose that $(f_n)$ is a sequence of real functions, that $f_n(x) \to f(x)$ almost everywhere and that the sequence $f_n(x)$ is monotonically increasing for almost all x. Then $\int f_n \to \int f$.
The above result is used in Chapter 2 during the proof of Theorem 2.12. Other convergence theorems are available, of which the following result (the Dominated Convergence Theorem) is perhaps the most generally useful.
A.12. Suppose that $f_n(x) \to f(x)$ almost everywhere and that there exists an integrable function g such that $|f_n(x)| \le g(x)$ almost everywhere. Then $\int f_n \to \int f$.
We omit the proofs of these results. The next section gives a further convergence result of a more general nature.
Fatou's Lemma
Suppose that $(a_n)$ is any sequence of real numbers. We recall the following definition.
$$\liminf a_m = \lim_{n \to \infty} \Bigl( \inf_{m \ge n} a_m \Bigr).$$
The limit on the right hand side is easily seen to be a supremum as well. Fatou's Lemma may be stated in the following form.
A.13. Let $(f_n)$ be any sequence of non-negative functions. Then
$$\int \liminf f_m \le \liminf \int f_m.$$
This result is used in the proofs of Theorems 2.4 and 2.6. Often we have the sequence $(f_n)$ converging pointwise (almost everywhere). In this case the result takes the simpler form:
$$\int \lim f_n \le \liminf \int f_m.$$
Fatou's Lemma may easily be derived from the Monotone Convergence Theorem as follows. For each n, write $g_n(x) = \inf_{m \ge n} f_m(x)$. Then it is not hard to see that each $g_n$ is integrable (using the fact that the $(f_n)$ are non-negative), and that the sequence $g_n$ converges to $\liminf f_m$ monotonically (in fact it is increasing). Hence, by the Monotone Convergence Theorem,
$$\int \liminf f_m = \lim \int g_n \le \lim_{n \to \infty} \inf_{m \ge n} \int f_m,$$
as required.
In the proof of Lemma 3.9 we turn this theorem round by applying it to the functions $(-f_n)$. This gives the following restatement, in the form which is used there.
A.14. Let $(g_n)$ be a sequence of non-positive functions such that $g_n(x) \to g(x)$ almost everywhere and $\int g_n \ge A$ for all n. Then $\int g \ge A$.
Fubini's Theorem
We have found it necessary on several occasions to rearrange double integrals, and it is helpful to have a result which reassures us that we still obtain the same answer after doing so. As with rearranging double summations, it is possible to construct examples where one does obtain a
different answer by these means, but it turns out that, assuming that the function f(x, y) being integrated is positive (or indeed, assuming merely that $|f|$ is integrable), then one may rearrange the integral with impunity. This parallels the notion of rearranging absolutely convergent series.
One version of Fubini's Theorem is as follows. Similar forms are sometimes referred to by the name of Tonelli's Theorem.
A.15. Let f(x, y) be a measurable function of two real variables, such that
$$\int\!\!\int |f(x, y)|\, dx\, dy < \infty.$$
Then for almost all x and almost all y the partial integrals $F(x) = \int f(x, y)\, dy$ and $G(y) = \int f(x, y)\, dx$ exist and are integrable functions, and
$$\int\!\!\int f(x, y)\, dx\, dy = \int\!\!\int f(x, y)\, dy\, dx.$$
Fubini's Theorem is used in this form in the proofs of Corollary 2.11, Proposition 4.1 and Theorem 7.3.
Hölder's Inequality
This famous inequality for integrals may most usefully be stated in the following form.
A.16. Let p and q be numbers such that $1 < p < \infty$, $1 < q < \infty$ and $(1/p) + (1/q) = 1$. Let A be a set on which measurable functions f and g are defined and such that $\int_A |f|^p$ and $\int_A |g|^q$ are finite. Then fg is integrable over A and
$$\left| \int_A fg \right| \le \left( \int_A |f|^p \right)^{1/p} \left( \int_A |g|^q \right)^{1/q}.$$
This inequality is used in the proof of Theorem 2.4. The special case p = q = 2 is rather more elementary and is known as the Cauchy-Schwarz Inequality.
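As an informal numerical illustration of the inequality, the following sketch approximates both sides by midpoint Riemann sums on a bounded interval. The function name `holder_sides` and the choice of test functions are assumptions made for this example only, and a Riemann sum is of course only a stand-in for the Lebesgue integral.

```python
import math


def holder_sides(f, g, p, a, b, pieces=10000):
    """Return (|integral of f*g|, ||f||_p * ||g||_q) over [a, b].

    q is the conjugate exponent, 1/p + 1/q = 1. Integrals are
    approximated by midpoint Riemann sums with equal-width pieces.
    """
    if p <= 1:
        raise ValueError("p must satisfy 1 < p < infinity")
    q = p / (p - 1.0)
    h = (b - a) / pieces
    xs = [a + (k + 0.5) * h for k in range(pieces)]
    lhs = abs(sum(f(x) * g(x) for x in xs) * h)
    norm_f = (sum(abs(f(x)) ** p for x in xs) * h) ** (1.0 / p)
    norm_g = (sum(abs(g(x)) ** q for x in xs) * h) ** (1.0 / q)
    return lhs, norm_f * norm_g


if __name__ == "__main__":
    lhs, rhs = holder_sides(lambda x: x, math.exp, 3.0, 0.0, 1.0)
    print(lhs, rhs, lhs <= rhs)
```

Taking p = 2 and g = f makes the two sides agree up to rounding, which is the equality case of the Cauchy-Schwarz Inequality.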
Jensen's Inequality
This useful analytic inequality has been stated in many versions, and we give the one used in the proof of Lemma 2.8.
A function $\phi(x)$ defined on some interval I of the real line is said to be convex if for $x, y \in I$ and $0 \le \lambda \le 1$, we have
$$\phi(\lambda x + (1 - \lambda) y) \le \lambda \phi(x) + (1 - \lambda) \phi(y).$$
Informally, if one joins two points on the graph of $\phi$ by a chord, then the chord lies above the graph. Jensen's inequality for convex functions may be stated in the following form.
A.17. Suppose that f is a bounded measurable function on a finite interval and that $\phi$ is a convex function on an interval containing the range of f. Then
$$\phi\Bigl( \int f(x) \Bigr) \le \int \phi(f(x)),$$
where the integrals are taken with respect to a measure of total mass one on the interval (such as the normalized Lebesgue measure described above).
A function $\phi$ is concave if $(-\phi)$ is convex; this is equivalent to a reversal of the convexity inequality, or, informally, to the condition that the chord lies below the graph. For concave functions the inequality (A.17) is easily seen to be reversed. One example of a concave function is $\phi(x) = \log(x)$: the proof of Lemma 2.8 uses Jensen's inequality for this function.
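The reversed inequality for the concave function log can be illustrated numerically. This sketch assumes normalized Lebesgue measure on $[0, 2\pi)$ (total mass one) and uses the arbitrarily chosen positive function $f(\theta) = 2 + \cos\theta$; the helper name `normalized_mean` is hypothetical.

```python
import math


def normalized_mean(func, pieces=2000):
    """Average of func over [0, 2*pi) with respect to normalized measure,
    approximated by an equally spaced Riemann sum."""
    h = 2.0 * math.pi / pieces
    return sum(func(k * h) for k in range(pieces)) / pieces


def f(theta):
    """A bounded positive function on the circle (chosen for illustration)."""
    return 2.0 + math.cos(theta)


# Jensen for the concave function log: mean of log(f) <= log of mean of f.
log_of_mean = math.log(normalized_mean(f))
mean_of_log = normalized_mean(lambda t: math.log(f(t)))

if __name__ == "__main__":
    print(mean_of_log, log_of_mean, mean_of_log <= log_of_mean)
```

Replacing log by a convex function such as the exponential reverses the comparison, as in A.17.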
Complex Integration and the Residue Theorem
We make extensive use of the notion of the integral of a complex function along a simple closed curve - usually a circle. Given a smooth curve C, parametrised as $\{\gamma(t) : a \le t \le b\}$, and a continuous function f(z) defined on C, we may define $\int_C f(z)\, dz$ to be
$$\int_a^b f(\gamma(t))\, \gamma'(t)\, dt.$$
In the case in which f(z) is complex analytic inside C (i.e. differentiable) at all but finitely
many points one may obtain an expression for the integral of f along C using the Residue Theorem.
Suppose that f(z) is analytic on some small neighbourhood of $z_0$ with the possible exception of $z_0$ itself. The function f then has a Laurent series $f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$, valid in some punctured disc $\{0 < |z - z_0| < \varepsilon\}$. The residue of f at $z_0$, $R(f, z_0)$, is the number $a_{-1}$. In the simplest case, that of a simple pole, in which the only negative power of $(z - z_0)$ appearing is the $a_{-1}(z - z_0)^{-1}$ term, the residue is given by $R(f, z_0) = \lim_{z \to z_0} f(z)(z - z_0)$. The Residue Theorem is as follows.
A.18. Suppose that f(z) is analytic within a given circle C as well as on a neighbourhood of C, except for finitely many points $z_1, \ldots, z_n$ inside C. Then
$$\int_C f(z)\, dz = 2\pi i \sum_{j=1}^{n} R(f, z_j).$$
This result is used in the proofs of Proposition 2.10 and Theorem 7.7, and Chapter 2 contains other results which have some similarity to this. For example, the following formula (Cauchy's integral formula for analytic functions) may easily be obtained from the residue theorem, since the integrand has a simple pole at w:
$$f(w) = \frac{1}{2\pi i} \int_C \frac{f(z)\, dz}{z - w}.$$
By differentiating the above, one obtains the following formula:
$$f^{(k)}(w) = \frac{k!}{2\pi i} \int_C \frac{f(z)\, dz}{(z - w)^{k+1}}$$
for $k \ge 0$. This has a wide range of applications in complex analysis.
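The definition of the contour integral and the two formulae above can be tried out numerically. The sketch below is an illustration under stated assumptions: the circle is parametrised as $\gamma(t) = c + re^{it}$, the integral over t is approximated by an equally spaced sum, and the names `circle_integral` and `cauchy_derivative` are hypothetical helpers, not anything defined in the text.

```python
import cmath
import math


def circle_integral(f, centre, radius, pieces=400):
    """Approximate the integral of f(z) dz round the circle |z - centre| = radius,
    using gamma(t) = centre + radius * exp(i t), 0 <= t < 2*pi."""
    total = 0j
    for k in range(pieces):
        t = 2.0 * math.pi * k / pieces
        e = cmath.exp(1j * t)
        z = centre + radius * e      # gamma(t)
        dz = 1j * radius * e         # gamma'(t)
        total += f(z) * dz
    return total * (2.0 * math.pi / pieces)


def cauchy_derivative(f, w, k, centre=0j, radius=1.0):
    """k-th derivative of f at w via (k!/(2 pi i)) * integral of f(z)/(z - w)^(k+1).

    Assumes w lies inside the circle and f is analytic on and inside it.
    """
    integral = circle_integral(lambda z: f(z) / (z - w) ** (k + 1), centre, radius)
    return math.factorial(k) * integral / (2j * math.pi)


if __name__ == "__main__":
    # Simple pole at 0.5 with residue 1: the integral should be close to 2*pi*i.
    print(circle_integral(lambda z: 1.0 / (z - 0.5), 0j, 1.0))
    # Every derivative of exp at w equals exp(w).
    print(cauchy_derivative(cmath.exp, 0.3, 0), cauchy_derivative(cmath.exp, 0.3, 2))
```

For smooth periodic integrands such as these, the equally spaced sum converges very quickly, so a few hundred sample points are ample.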
EXERCISES
I am a mathematician, sir. I never permit myself to think.
John Dickson Carr (The hollow man)
That's sum puzzle you solved, doc.
The Sun, March 1988.
1. Verify that if A = A*, then the singular values of A are just the absolute values of its eigenvalues.
2. Show that $C_1$ and $C_2$ are complete normed spaces.
3. Using trace, identify $C_2$ with its own dual. What is the dual of K(H), the space of compact operators on a Hilbert space? What is the dual of $C_1$?
4. Given an operator T with $Tx = \sum_{i=1}^{\infty} \sigma_i (x, v_i) w_i$, as usual, and an operator $T_n$ of rank n, show that one can find an orthonormal sequence $x_1, x_2, \ldots$ with $x_i \in \mathrm{Lin}(v_1, \ldots, v_{n+i})$ for each i, and $T_n x_i = 0$. Deduce that $\|T - T_n\|_{HS}^2 \ge \sum_{i=1}^{\infty} \sigma_{n+i}^2$.
5. In question 4, let U be the partial isometry taking $v_i$ to $w_i$ for each i. Show that $(Tx_i, Ux_i) \ge \sigma_{n+i}$ for each i, and deduce that $\|T - T_n\|_N \ge \sigma_{n+1} + \sigma_{n+2} + \cdots$.
6. Show that for an arbitrary compact operator T there is in general more than one rank-n approximant $T_n$ such that $\|T - T_n\| = \sigma_{n+1}$.
7. Show that $C_{00}(\mathbb{R})$, the space of all continuous functions with compact support, is dense in $L_p(\mathbb{R})$ for $1 \le p < \infty$, but not for $p = \infty$.
8. Define $H_1(C_+)$ in terms of $H_1$ of the disc, and prove a Riesz factorization theorem: if $f \in H_1(C_+)$, then $f = gh$ with $g, h \in H_2(C_+)$ and $\|f\|_1 = \|g\|_2 \|h\|_2$.
9. What functions play the role of Blaschke products in $H_p(C_+)$? That is, given $s_1$ in $C_+$, find a rational function $G(s) \in H_\infty(C_+)$ with $|G(iy)| = 1$, G having a zero at $s_1$ and a pole in $C_-$, and no other zeroes or poles.
10. Consider $f(z) = i \log((1 - z)/(1 + z))$: show that Re f is bounded and harmonic in the disc, but Im f is unbounded. Thus the condition that Re f is bounded is necessary but not sufficient for f to be in $H_\infty$.
11. Show that if $h(t) = t^n e^{-t}$, then its Laplace transform is given by $(Lh)(s) = n!/(s + 1)^{n+1}$.
12. If $Lh = H$, and $g(t) = \int_0^t h(x)\, dx$, show that $Lg(s) = H(s)/s$.
13. Prove that the Laplace transforms of $\cos \lambda t$ and $\sin \lambda t$ are respectively $s/(s^2 + \lambda^2)$ and $\lambda/(s^2 + \lambda^2)$.
14. Identify $H_2$ with its own dual in a natural way.
15. By embedding $H_1$ in $L_1(T)$, and using the Hahn-Banach theorem, show that the dual of $H_1$ can be identified with the quotient space $L_\infty / H_1^\circ$, where $H_1^\circ$ is the annihilator of $H_1$ in $L_\infty$, that is the set of all $g \in L_\infty$ such that
$$\int_0^{2\pi} g(e^{i\theta}) f(e^{i\theta})\, d\theta / 2\pi = 0 \quad \text{for all } f \in H_1.$$
16. Using the fact that the polynomials are $L_1$ dense in $H_1$, show that, in question 15, $H_1^\circ$ is $H_\infty$. Using Nehari's theorem deduce that the dual space of $H_1$ can be identified with the space of Hankel operators. ($H_1^*$ can also be identified with a space known as BMOA, the space of Analytic functions of Bounded Mean Oscillation: those functions in $H_2$ which are projections of $L_\infty$ functions.)
17. Prove Corollary 3.15 directly by showing how to exhibit an arbitrary Hankel matrix $(a_0, a_1, \ldots)$ with $(a_i) \in \ell_1$ as the norm limit of a sequence of finite rank matrices.
18. If $f \in L_1(T)$, show that
$$\|f\|_1 = \sup \Bigl\{ \Bigl| \int_0^{2\pi} f(e^{i\theta}) g(e^{i\theta})\, d\theta / 2\pi \Bigr| : g \in C(T),\ \|g\|_\infty = 1 \Bigr\}.$$
(Hint: it is enough to prove this for step functions.) Deduce that if $f \in L_1(T)$ and $(f, z^n) = 0$ for all n, then f = 0; and if $(f, z^n) = 0$ for all $n \ne 0$, then f is constant. (The corresponding result for $L_2(T)$ is much easier.)
19. The Riemann mapping theorem states that any domain $D \subset \mathbb{C}$ ($D \ne \mathbb{C}$) which is simply connected is conformally equivalent to $\Delta = \{|z| < 1\}$, i.e. there is an analytic bijection $h : D \to \Delta$. Given such a D and h, explain how to solve the Nevanlinna-Pick problem for D: to find f with $f(z_i) = w_i$, and $\|f\|_{H_\infty(D)} = \sup \{|f(z)| : z \in D\}$ minimal. Invent an example for $D = C_+$ and solve it.
20. Solve the Carathéodory-Fejér problem for $\{|z| \le r\}$, $r > 0$. As an example find the extension of $1 + 2z$, and determine how the solution varies with r.
21. Show that the $H_2$ version of the Nevanlinna-Pick problem can be solved by elementary methods.
22. For the nth order differential equation
$$y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_0 y = u,$$
take as states $x_1 = y, \ldots, x_n = y^{(n-1)}$, and interpret the eigenvalues of the A matrix that you obtain.
23. Verify that $H(s) = 1/(2 + e^{-s})$ is not in $H_\infty(C_+)$, but that $H(s) \in H_\infty(C_+) + RH_\infty(C_-)$.
24. Show that the z-transform $(a_0, a_1, a_2, \ldots) \to a_0 + a_1/z + a_2/z^2 + \cdots$ is an isometric isomorphism between $\ell_2$ and the space $R(H_2)$, which is the orthogonal complement of $zH_2$ in $L_2(T)$.
25. Consider the linear discrete-time dynamical system specified by
$$a_n y(i + n) + \cdots + a_1 y(i + 1) + a_0 y(i) = b_m x(i + m) + \cdots + b_1 x(i + 1) + b_0 x(i).$$
Suppose that $x(0) = \cdots = x(m - 1) = y(0) = \cdots = y(n - 1) = 0$, and that X(z) and Y(z) are the z-transforms of x and y, as defined in the previous question. Show that
$$Y(z)/X(z) = (b_m z^m + \cdots + b_0)/(a_n z^n + \cdots + a_0) = G(z), \text{ say},$$
and state necessary and sufficient conditions on the transfer function G(z) such that $Y(z) \in R(H_2)$ whenever $X(z) \in R(H_2)$.
26. Consider the discrete-time state-space model
$$x(t + 1) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t).$$
Show that the z-plane transfer function is given by $G(z) = D + C(zI - A)^{-1}B$, and explain the significance of the position of the eigenvalues of A.
27. Verify that the A-A-K results do go through to Hankel operators on the halfplane $C_+$. As an example find an optimal rank-one approximation to the Hankel operator with symbol $1/(s + 2) + 1/(s + 1)$.
28. Let $\Gamma$ be a Hankel integral operator. Show that $\Gamma S_\delta = S_\delta^* \Gamma$, where $S_\delta$ is a shift in $L_2(0, \infty)$, i.e. $(S_\delta u)(t) = 0$ for $t < \delta$, and $(S_\delta u)(t) = u(t - \delta)$ for $t \ge \delta$. Compare Lemma 6.3.
29. Let T be a conformal bijection from $C_+$ to itself (for example $Ts = 1/s$). Show that for $G \in H_\infty(C_+)$ the Hankel operators determined by G(s) and G(Ts) have the same sequence of singular values. (Hint: use the A-A-K results to express $\sigma_{n+1}(\Gamma)$ as $\inf \{\|G(s) - G_0(s) - F(s)\|_\infty : G_0 \in RH_\infty(C_+)$ with at most n poles, $F \in \ldots\}$.)
30. Show that if $G(s) = \sum_{n=1}^{\infty} a_n (2\, \mathrm{Re}\, s_n)/(s - s_n)$, with $\mathrm{Re}\, s_n < 0$ and $\sum_{n=1}^{\infty} |a_n| < \infty$, then the sum converges in $H_\infty(C_+)$ and that G determines a nuclear Hankel operator with nuclear norm at most $\sum_{n=1}^{\infty} |a_n|$.
31. Show that a Hankel matrix $A = (a_0, a_1, \ldots)$ is Hilbert-Schmidt if and only if
$$\|A\|_{HS}^2 = \sum_{n=0}^{\infty} (n + 1)|a_n|^2 < \infty.$$
32. By considering $\sum |(Ae_n, f_n)|$ for suitable choices of $(e_n)$ and $(f_n)$ show that (i) $\|A\|_N \ge |a_0| + |a_2| + \cdots$ and (ii) $\|A\|_N \ge |a_1| + |a_3| + \cdots$. Deduce that $\|(a_n)\|_{\ell_1} \le 2\|A\|_N$.
33. Show that if $A = (a_0, a_1, \ldots)$ is a positive Hankel operator, then A is nuclear if and only if $\sum_{n=0}^{\infty} a_{2n} < \infty$. Why are the odd coefficients still relevant?
34. Let $A = (a_0, a_1, \ldots)$ be a Hankel matrix such that $a_n = O(n^{-\alpha})$, $\alpha > 1/2$. Estimate the Hilbert-Schmidt norm of the Hankel matrices $A_n = (0, \ldots, 0, a_n, a_{n+1}, \ldots)$, $n = 1, 2, \ldots$, and thus obtain an upper bound for $\sigma_{n+1}^2(A) + \sigma_{n+2}^2(A) + \cdots$ using question 4. Deduce that $\sigma_n(A) = O(n^{-\alpha + 1/2})$.
35. Show that the rank-two Hankel matrix (0, 1, 0, 0, ...) cannot be written as a finite sum of rank-one Hankel matrices.
36. Consider the Hankel integral operator $\Gamma$ with kernel $h(t) = \chi_{(0,1)}(t)$. By solving a differential equation for its eigenvalues and eigenvectors show that $\Gamma$ has singular values $\sigma_i = 1/(\pi(i - 1/2))$, and that its Schmidt vectors vanish for $t \ge 1$, whereas for $0 \le t \le 1$ they are proportional to $\cos(t/\sigma_i)$. Note that $\Gamma$ is Hilbert-Schmidt but not nuclear.
37. For $\Gamma$ as in question 36, verify directly the formula for the Hilbert-Schmidt norm given in Theorem 7.3, and verify also that $\|\Gamma\| \le \|h\|_1$. Calculate Lh, and write down a differential equation (with delayed terms) that has Lh as its transfer function.
38. Let $T : L_2(0, \infty) \to L_2(0, \infty)$ be an integral operator given by
$$(Tu)(t) = \int_0^{\infty} K(t, \tau)\, u(\tau)\, d\tau,$$
where the kernel, $K(t, \tau)$, is real, measurable and symmetric, i.e. $K(t, \tau) = K(\tau, t)$ for all $t, \tau \in (0, \infty)$, and K satisfies
$$\int_0^{\infty}\!\!\int_0^{\infty} |K(t, \tau)|^2\, dt\, d\tau < \infty.$$
Show that T is a bounded operator, and that its Hilbert-Schmidt norm satisfies
$$\|T\|_{HS}^2 = \int_0^{\infty}\!\!\int_0^{\infty} |K(t, \tau)|^2\, dt\, d\tau.$$
Deduce Theorem 7.3 (ii).
39. Suppose that $h(t) \in L_2(0, \infty)$. Let $\Theta$ be the operator defined by
$$(\Theta u)(t) = \pi^{-1/2} \int_0^{\infty} t^{-1/4}\, h(t + \tau)\, u(\tau)\, \tau^{-1/4}\, d\tau.$$
Using question 38 show that $\|\Theta\|_{HS} = \|h\|_2$. Show also that, if $h \in L_1 \cap L_2(0, \infty)$ then $\Theta$ has the same rank as the usual Hankel operator $\Gamma$, if the Laplace transform of h is rational. Using question 4, derive a lower bound for $\inf \{\|h - \hat{h}\|_2 : \mathrm{degree}(\hat{h}) \le n\}$ in terms of the singular values of $\Theta$.
40. Show that for the function h(t) given in question 36, the singular values of $\sqrt{\pi}\, \Theta$ are at least as large as the singular values of $\Gamma$. Deduce that the $L_2$ error of any rank-n approximant to h is bounded below by $Cn^{-1/2}$ for some C > 0 independent of n.
41. Show that, with the notation of Theorem 7.7, the function $h''$ is in the Bergman space $B_1$ if and only if $g''$ is in $B_1$.
42. Using Theorem 7.7 and the Möbius map M, derive necessary and sufficient conditions for a Hankel operator $\Gamma$ on $H_2(C_+)$ to be nuclear in terms of the derivatives of the transfer function G(s).
43. Show that the transfer function $G(s) = 1/(s + e^{-s})$ determines a nuclear Hankel operator, but that $H(s) = e^{s/2} G(s)$ does not. Does H determine a Hilbert-Schmidt operator?
44. Suppose that $g(z) = \sum_{n=0}^{\infty} a_n z^n$ is analytic in a disc of radius R > 1. Show that
$$\|g(z) - (a_0 + a_1 z + \cdots + a_n z^n)\|_\infty = O(K^{-n})$$
for some K > 1. Deduce that g determines a nuclear Hankel operator whose singular values satisfy $\sigma_n = O(K^{-n})$.
45. Given A > 0 show how to find a Hilbert-Schmidt Hankel operator $\Gamma$ with kernel h(t) such that $|h(t)| > A\|\Gamma\|_{HS}$ on an interval of positive length. Deduce that the inequality $|h(t)| \le (16/e\pi)\|\Gamma\|_N / t$ (almost everywhere) in Corollary 7.10 cannot be reformulated using the Hilbert-Schmidt norm.
BIBLIOGRAPHY
Recommended Books
N. Dunford and J.T. Schwartz, Linear Operators, Interscience, 1957. Contains most of the basic results on Operator Theory that are needed.
P. Duren, $H^p$ spaces, Academic Press, 1970. A helpful book on Hardy spaces.
B.A. Francis, A course in $H_\infty$ Control theory, Springer, 1987. Chapters 1, 2 and 5 contain lots of useful background material, without many proofs.
P.A. Fuhrmann, Linear systems and operators in Hilbert space, McGraw-Hill, 1981. Shows the connections between Operator Theory and Systems Theory, which are explored here.
J.B. Garnett, Bounded analytic functions, Academic Press, 1981. An excellent introduction to Hardy spaces.
I.C. Gohberg and M.G. Krein, Introduction to the theory of linear nonselfadjoint operators in Hilbert space, Transl. Math. Monographs 18, Providence, 1969. Contains most of the basic results on Operator Theory that are needed.
J.W. Helton, Operator Theory, Analytic Functions, Matrices, and Electrical Engineering, American Math. Soc., Providence, 1987. An entertaining book that ties together some of the themes pursued here.
O.L.R. Jacobs, Introduction to Control Theory, Oxford University Press, 1974. The first chapter or so contains useful introductory Systems Theory.
P. Koosis, Introduction to $H_p$ spaces, LMS Lecture Notes 40, Cambridge University Press, 1980. A fairly comprehensive modern book on Hardy spaces.
T.W. Körner, Fourier Series, Cambridge University Press, 1988. A highly entertaining book on the grand scale that shows that Fourier methods are to be met in a large number of different branches of Mathematics.
S.C. Power, Hankel Operators on Hilbert Space, Pitman, 1982. Contains most recent results on Hankel operators but is rather too advanced to act as an introduction.
S.C. Power (Ed.), Operators and Function Theory, D. Reidel, 1984. A conference proceedings with several relevant survey-type articles.
M. Rosenblum and J. Rovnyak, Hardy classes and Operator theory, Oxford, 1985. Relevant but not at all easy.
W. Rudin, Functional Analysis, McGraw-Hill, 1973. A good background book for most areas of analysis and this is no exception.
R. Schatten, Norm ideals of completely continuous operators, Springer, 1950. Something of a classic. Contains the basic material of Chapter 1.
R.L. Wheeden and A. Zygmund, Measure and Integral, Marcel Dekker, New York, 1977. Contains useful background Measure Theory, including Jensen's Inequality.
D.V. Widder, The Laplace Transform, Princeton, 1946. One of many appropriate books on the subject.
Relevant Articles
V.M. Adamjan, D.Z. Arov and M.G. Krein, Analytic properties of Schmidt pairs for a Hankel operator and the generalised Schur-Takagi problem, Math. USSR Sbornik, 15 (1971), 31-73.
F.F. Bonsall, Decompositions of functions as sums of elementary functions, Quart. J. Math. (2), 37 (1986), 129-136.
F.F. Bonsall and D. Walsh, Symbols for trace class Hankel operators with good estimates for norms, Glasgow Math. J. 28 (1986), 47-54.
R.R. Coifman and R. Rochberg, Representation theorems for holomorphic and harmonic functions in $L^p$, Asterisque 77 (1980), 11-66.
K. Glover, All optimal Hankel-norm approximations of linear multivariable systems and their $L_\infty$ error bounds, Int. J. Control 39 (1984), 1115-1193.
K. Glover and J.R. Partington, Bounds on the achievable accuracy in model reduction, "Modelling robustness and sensitivity reduction in control systems", 95-118, Ed. R.F. Curtain, NATO ASI series F (1987).
K. Glover, R.F. Curtain and J.R. Partington, Realisation and approximation of linear infinite dimensional systems with error bounds, SIAM J. on Control and Optimization, to appear (1988). Report CUED/F-CAMS/TR.258, Cambridge University Engineering Department, 1986.
V.V. Peller, Hankel operators of class $C_p$ and their applications (Rational approximation, Gaussian processes, the problem of majorizing operators), Math. USSR Sbornik 41 (1982), No. 4, 443-479.
S.C. Power, Hankel operators on Hilbert space, Bull. London Math. Soc. 12 (1980), 422-442.
D. Sarason, Generalised interpolation in $H^\infty$, Trans. Amer. Math. Soc. 127 (1967), 179-203.