OPERATOR THEORY

These lecture notes are based on the courses Operator Theory developed at King’s College London by G. Barbatis, E.B. Davies and J.A. Erdos, and Functional Analysis II developed at the University of Sussex by P.J. Bushell, D.E. Edmunds and D.G. Vassiliev. As usual, all errors are entirely my responsibility. The same applies to the accompanying exercise sheets.
1 Introduction

Spaces
By IF we will always denote either the field IR of real numbers or the field C of complex numbers.

1.1. Definition A norm on a vector space X over IF is a map k · k : X → [0, ∞) satisfying the conditions
(i) kxk = 0 ⇐⇒ x = 0;
(ii) kλxk = |λ|kxk, ∀x ∈ X, ∀λ ∈ IF;
(iii) kx + yk ≤ kxk + kyk, ∀x, y ∈ X (the triangle inequality).
A vector space X equipped with a norm is called a normed space.

1.2. Definition A normed space X is called a Banach space if it is complete, i.e. if every Cauchy sequence in X is convergent.

1.3. Examples
1. Any finite-dimensional normed space is a Banach space.
2. Let lp, 1 ≤ p ≤ ∞, denote the vector space of all sequences x = (x1, x2, . . .), xk ∈ IF, such that
$$\|x\|_p := \Big(\sum_{k=1}^{\infty} |x_k|^p\Big)^{1/p} < \infty, \quad 1 \le p < \infty,$$
$$\|x\|_\infty := \sup_{k \ge 1} |x_k| < \infty.$$
Then k · kp is a norm on lp and lp, 1 ≤ p ≤ ∞, are Banach spaces.
3. Let C([0, 1]) be the space of all continuous functions on [0, 1] and
$$\|f\|_p := \Big(\int_0^1 |f(t)|^p \, dt\Big)^{1/p}, \quad 1 \le p < \infty,$$
$$\|f\|_\infty := \sup_{0 \le t \le 1} |f(t)|.$$
Then each k · kp is a norm on C([0, 1]). The space C([0, 1]) with the norm k · k∞ is a Banach space. The space C([0, 1]) with the norm k · kp, 1 ≤ p < ∞, is not a Banach space. Indeed, let us consider the sequence (fn) in C([0, 1]), where
$$f_n(t) = \begin{cases} 0 & \text{if } 0 \le t \le \tfrac12 - \tfrac{1}{2n}, \\[2pt] nt - \tfrac{n-1}{2} & \text{if } \tfrac12 - \tfrac{1}{2n} \le t \le \tfrac12 + \tfrac{1}{2n}, \\[2pt] 1 & \text{if } \tfrac12 + \tfrac{1}{2n} \le t \le 1. \end{cases}$$
It is easily seen that each fn is a piece-wise linear function, 0 ≤ fn ≤ 1 and
$$\|f_n - f_m\|_p < \Big(\int_{\frac12 - \max\{\frac{1}{2n},\frac{1}{2m}\}}^{\frac12 + \max\{\frac{1}{2n},\frac{1}{2m}\}} 1^p \, dt\Big)^{1/p} \le \Big(\max\Big\{\frac1n, \frac1m\Big\}\Big)^{1/p} \to 0, \quad \text{as } m, n \to \infty.$$
So, (fn) is a Cauchy sequence with respect to k · kp. Now, suppose there exists f ∈ C([0, 1]) such that kf − fn kp → 0. Taking into account that for an arbitrary δ ∈ ]0, 1/2[ there exists N such that fn = 1 on [1/2 + δ, 1], ∀n > N, we obtain
$$\int_{1/2+\delta}^{1} |1 - f(t)|^p \, dt = \int_{1/2+\delta}^{1} |f_n(t) - f(t)|^p \, dt \le \|f - f_n\|_p^p \to 0 \quad \text{as } n \to \infty.$$
Since the first integral is independent of n it has to be zero, which implies that f = 1 on [1/2 + δ, 1] and hence on ]1/2, 1] since δ was arbitrary. A similar argument shows that f = 0 on [0, 1/2[. Thus f is discontinuous at t = 1/2 and we obtain a contradiction.
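This example is easy to probe numerically. The following Python sketch is an illustration only; the grid size, the exponent p and the simple Riemann-sum quadrature are arbitrary choices. It checks that (fn) is Cauchy with respect to k · kp while its k · kp-limit is the discontinuous indicator function of [1/2, 1]:

    import numpy as np

    # f_n from the example: 0 on [0, 1/2 - 1/(2n)], 1 on [1/2 + 1/(2n), 1], linear in between
    def f(n, t):
        return np.clip(n * t - (n - 1) / 2.0, 0.0, 1.0)

    t = np.linspace(0.0, 1.0, 20001)
    dt = t[1] - t[0]
    p = 2

    def lp_dist(g, h):
        # crude Riemann-sum approximation of the L^p distance on [0, 1]
        return (np.sum(np.abs(g - h) ** p) * dt) ** (1.0 / p)

    for n in (10, 100, 1000):
        print(n, lp_dist(f(n, t), f(2 * n, t)))    # tends to 0: (f_n) is Cauchy in the p-norm

    step = (t >= 0.5).astype(float)                # the pointwise limit, discontinuous at t = 1/2
    print(lp_dist(f(1000, t), step))               # small: f_n approaches the step function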
1.4. Definition Normed spaces X and Y are called isomorphic if there exists a linear isometry from X onto Y . An isometry is a map which preserves the norm, i.e. it does not change the norms of the corresponding points.
1.5. Theorem For any normed space (X, k · k) there exists a Banach space (X 0 , k · k0 ) and a linear isometry from X onto a dense linear subspace of X 0 . Two Banach spaces in which (X, k · k) can be so imbedded are isomorphic . 1.6. Definition The space (X 0 , k · k0 ) from Theorem 1.5 is called the completion of (X, k · k). The completion of the space (C([0, 1]), k · kp ), 1 ≤ p < ∞, is the Banach space Lp ([0, 1]) from the theory of Lebesgue integral.
Operators

1.7. Definition Let X and Y be vector spaces. A map B : X → Y is called a linear operator (map) if B(λx + µz) = λBx + µBz, ∀x, z ∈ X, ∀λ, µ ∈ IF.

1.8. Theorem Let X and Y be normed spaces. For a linear operator B : X → Y the following statements are equivalent: (i) B is continuous; (ii) B is continuous at 0; (iii) there exists a constant C < +∞ such that kBxk ≤ Ckxk, ∀x ∈ X.

1.9. Definition A linear operator B is bounded if it satisfies the last (and hence all) of the three conditions above. If B is bounded we define its norm by the equality kBk := inf{C : kBxk ≤ Ckxk, ∀x ∈ X}. It is easy to see that kBxk ≤ kBk kxk, ∀x ∈ X, and
$$\|B\| = \inf\{C : \|Bx\| \le C \text{ for all } x \text{ with } \|x\| \le 1\} = \inf\{C : \|Bx\| \le C \text{ for all } x \text{ with } \|x\| = 1\} = \sup\Big\{\frac{\|Bx\|}{\|x\|} : x \ne 0\Big\} = \sup\{\|Bx\| : \|x\| \le 1\} = \sup\{\|Bx\| : \|x\| = 1\} \qquad (1.1)$$
(prove these relations!). 1.10. Theorem Let X and Y be normed spaces. Then k · k is indeed a norm on the vector space B(X, Y ) of all bounded linear operators from X into Y, and kABk ≤ kAkkBk, ∀B ∈ B(X, Y ), ∀A ∈ B(Y, Z).
(1.2)
Moreover, if Y is complete then B(X, Y ) is a Banach space. Let X be a Banach space. Theorem 1.10 says that B(X) := B(X, X) is actually what we call a Banach algebra. A vector space E is called an algebra if for any pair (x, y) ∈ E × E a unique product xy ∈ E is defined with the properties (xy)z = x(yz), x(y + z) = xy + xz, (x + y)z = xz + yz, λ(xy) = (λx)y = x(λy), for all x, y, z ∈ E and scalars λ. E is called an algebra with identity if it contains an element e such that for all x ∈ E, ex = xe = x, ∀x ∈ E. A normed algebra is a normed space which is an algebra such that kxyk ≤ kxkkyk, ∀x, y ∈ E, and if E has an identity e, kek = 1. A Banach algebra is a normed algebra which is complete, considered as a normed space .
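For the finite-dimensional case with the Euclidean norm, the characterisations (1.1) and the submultiplicative inequality (1.2) can be checked directly. A minimal Python sketch (random matrices; the random sampling of the unit sphere only gives a lower bound for the supremum):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))

    # for the Euclidean norm, ||B|| = sup{||Bx|| : ||x|| = 1} is the largest singular value
    op_norm = np.linalg.norm(B, 2)

    xs = rng.standard_normal((4, 10000))
    xs /= np.linalg.norm(xs, axis=0)               # 10000 random unit vectors
    sampled = np.max(np.linalg.norm(B @ xs, axis=0))
    print(sampled, "<=", op_norm)                  # the sampled supremum stays below ||B||

    # submultiplicativity (1.2): ||AB|| <= ||A|| ||B||
    print(np.linalg.norm(A @ B, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(B, 2) + 1e-12)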
1.11. Definition Let X and Y be subspaces of X̃ and Ỹ. An operator B̃ : X̃ → Ỹ is said to be an extension of B : X → Y if B̃x = Bx, ∀x ∈ X.
1.12. Theorem Let X and Y be Banach spaces and D be a dense linear subspace of X. Let A be a bounded linear operator from D (equipped with the X−norm) into Y. Then there exists a unique extension of A to a bounded linear operator Ā : X → Y defined on the whole X; moreover kĀk = kAk.

1.13. Example Let X = Y be the space L2([a, b]), i.e. the completion of D = (C([a, b]), k · k2), (−∞ < a < b < +∞). Consider the operator A : C([a, b]) → C([a, b]) ⊂ L2([a, b]),
$$(Af)(t) = \int_a^b k(t, \tau)\, f(\tau)\, d\tau, \quad \text{where } k \in C([a, b]^2).$$
Theorem 1.12 allows one to extend this operator to a bounded linear operator Ā : L2([a, b]) → L2([a, b]), since D is dense in its completion X = L2([a, b]). We only need to prove that A : D → D is a bounded operator. Denote
$$K = \max_{(t,\tau) \in [a,b]^2} |k(t, \tau)|.$$
Using the Cauchy–Schwarz inequality we obtain
$$|(Af)(t)| = \Big|\int_a^b k(t, \tau) f(\tau)\, d\tau\Big| \le \Big(\int_a^b |k(t, \tau)|^2\, d\tau\Big)^{1/2} \Big(\int_a^b |f(\tau)|^2\, d\tau\Big)^{1/2} \le K\sqrt{b-a}\,\|f\|_2, \quad \forall f \in C([a, b]).$$
Therefore
$$\|Af\|_2 = \Big(\int_a^b |(Af)(t)|^2\, dt\Big)^{1/2} \le K(b-a)\|f\|_2, \quad \forall f \in C([a, b]),$$
i.e. A is bounded and kAk ≤ K(b − a). (Let Y be the Banach space (C([a, b]), k · k∞). Then the above proof shows that A can be extended to a bounded linear operator Ā : L2([a, b]) → C([a, b]), kAk ≤ K√(b − a).)

1.14. Definition Let X be a normed space and xn ∈ X, n ∈ IN. We say that the series x1 + x2 + x3 + · · · is convergent to a vector x ∈ X if the sequence (Sj) of partial sums, Sj = x1 + · · · + xj, converges to x. We then write
$$\sum_{n=1}^{\infty} x_n = x.$$
We say that the series converges absolutely if
$$\sum_{n=1}^{\infty} \|x_n\| < \infty.$$
1.15. Theorem Any absolutely convergent series in a Banach space is convergent.
Proof: For m > k we have
$$\|S_m - S_k\| = \Big\|\sum_{n=k+1}^{m} x_n\Big\| \le \sum_{n=k+1}^{m} \|x_n\| \to 0, \quad \text{as } k, m \to \infty.$$
Hence the sequence (Sj ) is Cauchy and therefore converges to some x ∈ X. 2
2 Spectral theory of bounded linear operators

Auxiliary results
Let us recall that for any normed spaces X and Y we denote by B(X, Y ) the space of all bounded linear operators acting from X into Y . We use also the following notation B(X) = B(X, X). 2.1. Definition Let A be a linear operator from a vector space X into a vector space Y. The kernel of A is the set Ker(A) := {x ∈ X : Ax = 0}. The range of the operator A is the set Ran(A) := {Ax : x ∈ X}. Ker(A) and Ran(A) are linear subspaces of X and Y correspondingly. (Why?) Moreover, if X and Y are normed spaces and A ∈ B(X, Y ), then Ker(A) is closed (why?); this is not necessarily true for Ran(A). 6
2.2. Theorem (Banach) Let X and Y be Banach spaces and let B ∈ B(X, Y ) be one-to-one and onto (i.e. Ker(B) = {0} and Ran(B) = Y ). Then the inverse operator B −1 : Y → X is bounded, i.e. B −1 ∈ B(Y, X). Proof: The proof of this fundamental result can be found in any textbook on functional analysis. 2 Let X and Y be vector spaces and let an operator B : X → Y have a right inverse and a left inverse operators Br−1 , Bl−1 : Y → X, Bl−1 B = IX , BBr−1 = IY . (By I we always denote the identity operator: Ix = x, ∀x ∈ X. Subscript (if any) indicates the space on which the identity operator acts.) Then B has a two-sided inverse operator B −1 = Br−1 = Bl−1 . Indeed, Br−1 = IX Br−1 = (Bl−1 B)Br−1 = Bl−1 (BBr−1 ) = Bl−1 IY = Bl−1 . 2.3. Lemma Let X, Y and Z be vector spaces. (i) If operators B : X → Y and T : Y → Z are invertible, then T B : X → Z is invertible too and (T B)−1 = B −1 T −1 . (ii) If operators B, T : X → X commute: T B = BT , then T B : X → X is invertible if and only if both B and T are invertible. (iii) If operators B, T : X → X commute and B is invertible, then B −1 and T also commute. Proof: (i) B −1 T −1 T B = B −1 B = IX , T BB −1 T −1 = T T −1 = IZ . (ii) According to (i) we have to prove only that the invertibility of T B implies the invertibility of B and T . Let S : X → X be the inverse of T B, i.e. ST B = T BS = I. It is clear that ST is a left inverse of B. Since B and T commute, we have BT S = I, i.e. T S is a right inverse of B. Hence B is invertible and its inverse equals ST = T S. Similarly we prove that T is invertible. (iii) B −1 T = B −1 T BB −1 = B −1 BT B −1 = T B −1 . 2 7
2.4. Lemma Let X be a Banach space, B ∈ B(X) and kBk < 1. Then I − B is invertible,
$$(I - B)^{-1} = \sum_{n=0}^{\infty} B^n \qquad (2.1)$$
and k(I − B)−1 k ≤ 1/(1 − kBk).
Proof: It follows from (1.2) that kB n k ≤ kB n−1 kkBk ≤ · · · ≤ kBkn , ∀n ∈ IN. Hence the series in the right-hand side of (2.1) is absolutely convergent and, consequently, convergent in B(X) (see Theorems 1.10 and 1.15). Let us denote its sum by R ∈ B(X). We have
$$(I - B)R = (I - B)\lim_{N \to \infty} \sum_{n=0}^{N} B^n = \lim_{N \to \infty} (I - B^{N+1}) = I,$$
because kB N +1 k ≤ kBkN +1 → 0 as N → ∞. Analogously we prove that R(I − B) = I. Thus (I − B)−1 = R ∈ B(X). Further,
$$\|(I - B)^{-1}\| = \lim_{N \to \infty} \Big\|\sum_{n=0}^{N} B^n\Big\| \le \lim_{N \to \infty} \sum_{n=0}^{N} \|B\|^n = \lim_{N \to \infty} \frac{1 - \|B\|^{N+1}}{1 - \|B\|} = \frac{1}{1 - \|B\|}. \quad \Box$$

2.5. Lemma Let X and Y be Banach spaces, A, B ∈ B(X, Y ). Let A be invertible and kBk < 1/kA−1 k. Then A + B is invertible,
$$(A + B)^{-1} = \Big(\sum_{n=0}^{\infty} (-A^{-1}B)^n\Big) A^{-1} = A^{-1}\Big(\sum_{n=0}^{\infty} (-BA^{-1})^n\Big)$$
and
$$\|(A + B)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|B\|\,\|A^{-1}\|}.$$
Proof: We have A + B = A(I + A−1 B) = (I + BA−1 )A and kA−1 Bk ≤ kA−1 kkBk < 1, kBA−1 k ≤ kBkkA−1 k < 1. Now it is left to apply Lemmas 2.3(i) and 2.4. 2
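In finite dimensions Lemma 2.4 can be tested directly. The sketch below is an illustration with an arbitrary matrix rescaled so that kBk < 1; it sums the Neumann series (2.1) and compares it with the exact inverse and with the bound k(I − B)−1 k ≤ 1/(1 − kBk):

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((3, 3))
    B *= 0.5 / np.linalg.norm(B, 2)        # rescale so that ||B|| = 0.5 < 1

    S, term = np.zeros((3, 3)), np.eye(3)
    for _ in range(60):                    # partial sums of the Neumann series sum_n B^n
        S += term
        term = term @ B

    exact = np.linalg.inv(np.eye(3) - B)
    print(np.linalg.norm(S - exact, 2))    # ~ 0: the series converges to (I - B)^{-1}
    print(np.linalg.norm(exact, 2) <= 1.0 / (1.0 - np.linalg.norm(B, 2)))   # bound from Lemma 2.4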
The spectrum In this subsection we will deal only with complex vector spaces, i.e. with the case IF = C, if it is not stated otherwise. 2.6. Definition Let X be a Banach space and B ∈ B(X). The resolvent set ρ(B) of the operator B is defined to be the set of all λ ∈ C such that B − λI has an inverse operator (B − λI)−1 ∈ B(X). The spectrum σ(B) of the operator B is the complement of ρ(B): σ(B) = C\ρ(B), i.e. σ(B) is the set of all λ ∈ C such that B − λI is not invertible on X. A complex number λ is called an eigenvalue of B if there exists x ∈ X\{0} such that Bx = λx. In this case x is called an eigenvector of B corresponding to the eigenvalue λ. 2.7. Lemma All eigenvalues of B belong to σ(B). Proof: Suppose λ is an eigenvalue and x 6= 0 is a corresponding eigenvector of B. Then (B − λI)x = 0. Since we have also (B − λI)0 = 0, the operator B − λI is not one–to–one. So, B − λI is not invertible on X, i.e. λ ∈ σ(B). 2 On the other hand the set of all eigenvalues may be smaller than the spectrum. 2.8. Examples 1. If the space X is finite-dimensional then the spectrum of B ∈ B(X) coincides with the set of all its eigenvalues, i.e. with the set of zeros of the determinant of a matrix corresponding to B − λI (with respect to a given basis of X). 2. Let X be one of the Banach spaces C([0, 1]), Lp ([0, 1]), 1 ≤ p ≤ ∞, and let B be defined by the formula Bf (t) = tf (t), t ∈ [0, 1]. Then σ(B) = [0, 1] but B does not have eigenvalues (exercise).
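In the setting of Example 2.8-1 the spectrum can be computed numerically as the set of eigenvalues of a matrix. The sketch below, with an arbitrarily chosen 2×2 matrix, also illustrates the inclusion σ(B) ⊂ {λ : |λ| ≤ kBk} proved in the next lemma, and the growth of the resolvent norm as λ approaches the spectrum, anticipating inequality (2.6) below:

    import numpy as np

    B = np.array([[0.0, 1.0], [-2.0, -3.0]])
    eigs = np.linalg.eigvals(B)                    # in finite dimensions sigma(B) = {eigenvalues}
    print(eigs)                                    # here -1 and -2
    print(np.max(np.abs(eigs)) <= np.linalg.norm(B, 2))   # |lambda| <= ||B||, cf. (2.2)

    lam0 = eigs[0]
    for eps in (1e-1, 1e-3, 1e-6):
        R = np.linalg.inv(B - (lam0 + eps) * np.eye(2))   # resolvent R(B; lambda)
        print(eps, np.linalg.norm(R, 2))           # grows like 1/eps as lambda approaches sigma(B)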
2.9. Lemma Let X be a Banach space and B ∈ B(X). Then σ(B) is a compact set and σ(B) ⊂ {λ ∈ C : |λ| ≤ kBk}. (2.2) Proof: Suppose |λ| > kBk. Then kBk < 1/|λ|−1 = 1/k − λ−1 Ik = 1/k(−λI)−1 k. Hence, according to Lemma 2.5 the operator B − λI is invertible, i.e. λ ∈ / σ(B). So, we have proved (2.2). Let us take an arbitrary λ0 ∈ ρ(B). Then for any λ ∈ C such that |λ − λ0 | < 1/k(B − λ0 I)−1 k we conclude from Lemma 2.5 and the representation B − λI = (B − λ0 I) + (λ0 − λ)I
(2.3)
that the operator B − λI is invertible, i.e. λ ∈ ρ(B). Hence ρ(B) is an open set and its complement σ(B) is closed. So, σ(B) is a bounded (see (2.2)) closed set, i.e. a compact set. 2 2.10. Definition Let X be a Banach space and B ∈ B(X). The operatorvalued function ρ(B) 3 λ → R(B; λ) := (B − λI)−1 ∈ B(X) is called the resolvent of the operator B. 2.11. Lemma (the resolvent equation) R(B; λ) − R(B; λ0 ) = (λ − λ0 )R(B; λ)R(B; λ0 ), ∀λ, λ0 ∈ ρ(B). Proof: R(B; λ) − R(B; λ0 ) = (B − λI)−1 − (B − λ0 I)−1 = (B − λI)−1 ((B − λ0 I) − (B − λI))(B − λ0 I)−1 = (λ − λ0 )R(B; λ)R(B; λ0 ). 2 2.12. Lemma If operators A, B ∈ B(X) commute: AB = BA, then for any λ ∈ ρ(B) and µ ∈ ρ(A) the operators A, B, R(A; µ) and R(B; λ) all 10
commute with each other.
Proof: It is clear that the operators A, B, A − µI and B − λI commute. Now the desired result follows from Lemma 2.3(iii). 2
Taking A = B in the last lemma we obtain the following result.

2.13. Corollary For any λ, µ ∈ ρ(B) the operators B, R(B; λ) and R(B; µ) commute. 2

For any normed space X we denote by X ∗ its dual (= adjoint) space, i.e. the space of all bounded linear functionals on X, i.e. B(X, IF).

2.14. Theorem Let Z be a complex Banach space, Ω ⊂ C be an open subset of the complex plane and F : Ω → Z be a Z–valued function. Then the following statements are equivalent:
(i) for any λ0 ∈ Ω there exists the derivative
$$\frac{dF}{d\lambda}(\lambda_0) = F'(\lambda_0) := \lim_{\lambda \to \lambda_0} \frac{1}{\lambda - \lambda_0}\big(F(\lambda) - F(\lambda_0)\big) \in Z,$$
i.e.
$$\Big\|\frac{1}{\lambda - \lambda_0}\big(F(\lambda) - F(\lambda_0)\big) - F'(\lambda_0)\Big\| \to 0 \quad \text{as } \lambda \to \lambda_0;$$
(ii) any λ0 ∈ Ω has a neighbourhood where F (λ) can be represented by an absolutely convergent series
$$F(\lambda) = \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n F_n(\lambda_0), \qquad F_n(\lambda_0) \in Z;$$
(iii) for any G ∈ Z ∗ the complex–valued function Ω 3 λ 7→ G(F (λ)) ∈ C is holomorphic (= analytic) in Ω. If Z = B(X, Y ) for some Banach spaces X and Y , then for the operator– valued function F the above statements are equivalent to (iv) for any x ∈ X and g ∈ Y ∗ the complex–valued function Ω 3 λ 7→ g(F (λ)x) ∈ C 11
is holomorphic (= analytic) in Ω.
Proof: It is clear that each of the statements (i) and (ii) implies (iii). (Why?) It is also obvious that (iii) implies (iv). Indeed, for any x ∈ X and g ∈ Y ∗ the mapping B(X, Y ) ∋ B ↦ g(Bx) ∈ C is a bounded linear functional on B(X, Y ). The opposite implications are very non–trivial. We will not give the proof here. It can be found, e.g., in E. Hille & R.S. Phillips, Functional Analysis and Semi–Groups, Ch. III, Sect. 2. 2

2.15. Definition A vector–valued (operator–valued) function is called holomorphic (= analytic) in Ω if it satisfies the conditions (i)–(iii) (conditions (i)–(iv)) of the above theorem.

2.16. Remark We will not use the non–trivial part of Theorem 2.14. We will only use the fact that each of the statements (i) and (ii) implies (iii) and (iv). In the proof of the following theorem we will check that the resolvent satisfies both (i) and (ii).

2.17. Theorem The B(X)–valued function R(B; ·) is analytic in ρ(B) and has the following properties
$$\left.\frac{dR(B; \lambda)}{d\lambda}\right|_{\lambda = \lambda_0} = R^2(B; \lambda_0), \quad \forall \lambda_0 \in \rho(B), \qquad (2.4)$$
$$-\lambda R(B; \lambda) \to I \ \text{ as } |\lambda| \to \infty, \qquad (2.5)$$
$$\|R(B; \lambda)\| \ge \frac{1}{d(\lambda, \sigma(B))}, \quad \lambda \in \rho(B), \qquad (2.6)$$
where d(λ, K) := inf{|λ − µ| : µ ∈ K} denotes the distance of λ from K.
Proof: Let us take an arbitrary λ0 ∈ ρ(B). It follows from Lemma 2.5 and the proof of Lemma 2.9 that the function R(B; ·) is bounded in a neighbourhood of λ0. From the resolvent equation (Lemma 2.11) we obtain R(B; λ) → R(B; λ0) as λ → λ0 and
$$\lim_{\lambda \to \lambda_0} \frac{R(B; \lambda) - R(B; \lambda_0)}{\lambda - \lambda_0} = \lim_{\lambda \to \lambda_0} R(B; \lambda) R(B; \lambda_0) = R^2(B; \lambda_0).$$
Thus (2.4) is valid for any point λ0 ∈ ρ(B), i.e. R(B; ·) is analytic in ρ(B). Note that (2.3) and Lemma 2.5 imply the following Taylor expansion
$$R(B; \lambda) = \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n R^{n+1}(B; \lambda_0), \quad \text{if } |\lambda - \lambda_0| < 1/\|(B - \lambda_0 I)^{-1}\|. \qquad (2.7)$$
Further,
$$\|-\lambda R(B; \lambda) - I\| = \|(I - \lambda^{-1}B)^{-1} - I\| = \Big\|\sum_{n=1}^{\infty} \lambda^{-n} B^n\Big\| \le \sum_{n=1}^{\infty} |\lambda|^{-n}\|B\|^n = \frac{|\lambda|^{-1}\|B\|}{1 - |\lambda|^{-1}\|B\|} = \frac{\|B\|}{|\lambda| - \|B\|} \to 0, \quad \text{as } |\lambda| \to \infty$$
(see Lemma 2.4) and (2.5) is proved. Let us take an arbitrary λ0 ∈ ρ(B). The proof of Lemma 2.9 implies that λ ∈ ρ(B) if |λ − λ0| < 1/kR(B; λ0)k. Hence d(λ0, σ(B)) ≥ 1/kR(B; λ0)k, i.e.
$$\|R(B; \lambda_0)\| \ge \frac{1}{d(\lambda_0, \sigma(B))}, \quad \forall \lambda_0 \in \rho(B). \quad \Box$$
2.18. Lemma Let X be a normed space. Then for any z ∈ X there exists g ∈ X ∗ such that g(z) = kzk and kgk = 1. Proof: Let X0 = lin{z} be the one–dimensional subspace of X spanned by z: X0 := {αz| α ∈ IF}. We define a functional g0 : X0 → IF by the equality g0 (αz) = αkzk. It is clear that g0 (z) = kzk and kg0 k = 1. (Why?) Due to the Hahn–Banach theorem g0 can be extended to a functional g ∈ X ∗ such that g(z) = kzk and kgk = 1. 2 2.19. Lemma σ(B) 6= ∅ for any B ∈ B(X).
Proof: Let us suppose that σ(B) = ∅ and take arbitrary x ∈ X\{0}, g ∈ X ∗ . It follows from Theorem 2.17 that the function f (λ) := g(R(B; λ)x) is analytic in ρ(B) = C and f (λ) → 0 as |λ| → ∞ (see (2.5) and also Theorem 2.14 and Remark 2.16). Liouville’s theorem implies f ≡ 0. Let us take g ∈ X ∗ such that g(R(B; λ0 )x) = kR(B; λ0 )xk for the given x ∈ X\{0} and some λ0 ∈ C (see Lemma 2.18). Then we obtain 0 = f (λ0 ) = g(R(B; λ0 )x) = kR(B; λ0 )xk, i.e. R(B; λ0 )x = 0. Consequently x = (B − λ0 I)R(B; λ0 )x = 0 for x ∈ X\{0}. The obtain contradiction shows that σ(B) cannot be empty. 2 Combining Lemmas 2.9, 2.19 and Theorem 2.17 we obtain the following result. 2.20. Theorem Let X be a Banach space and B ∈ B(X). The spectrum of B is a non-empty compact set contained in the closed disk {λ ∈ C : |λ| ≤ kBk} and the resolvent (B − λI)−1 is an analytic B(X)–valued function on C\σ(B). 2 2.21. Remark Let (ak ) be a sequence of real numbers. We will use the following notation: lim inf an := n→∞ lim inf ak , n→∞ k≥n
$$\limsup_{n\to\infty} a_n := \lim_{n\to\infty} \sup_{k\ge n} a_k.$$
These limits exist (finite or infinite) because bn := inf_{k≥n} ak and cn := sup_{k≥n} ak are monotone. The Sandwich Theorem implies that the limit lim_{n→∞} an exists if liminf_{n→∞} an = limsup_{n→∞} an.
2.22. Definition Let B ∈ B(X). The spectral radius r(B) of the operator B is defined by the equality r(B) := sup{|λ| : λ ∈ σ(B)}. 14
This is the radius of the minimal disk centred at 0 and containing σ(B). 2.23. Theorem (the spectral radius formula) Let X be a Banach space and B ∈ B(X). Then r(B) = n→∞ lim kB n k1/n . (2.8) Proof: Let us take an arbitrary λ ∈ σ(B). We have B n − λn I = (B − λI)(λn−1 I + λn−2 B + · · · + λB n−2 + B n−1 ). Since B−λI is not invertible and the operators in the RHS commute, B n −λn I is not invertible (see Lemma 2.3(ii)). So, λ ∈ σ(B) implies λn ∈ σ(B n ). Consequently r(B)n = (sup{|λ| : λ ∈ σ(B)})n = sup{|λ|n : λ ∈ σ(B)} ≤ r(B n ). Hence, according to (2.2) (with B n instead of B) r(B) ≤ (r(B n ))1/n ≤ kB n k1/n , and therefore r(B) ≤ lim inf kB n k1/n . n→∞
Now it is sufficient to prove that
$$\limsup_{n\to\infty} \|B^n\|^{1/n} \le r(B). \qquad (2.9)$$
If λ ∈ C is such that |λ| > kBk, then λ 6∈ σ(B) and
$$(B - \lambda I)^{-1} = -\lambda^{-1}(I - \lambda^{-1}B)^{-1} = -\sum_{n=0}^{\infty} \lambda^{-n-1} B^n \qquad (2.10)$$
(see Lemma 2.4). Let us take arbitrary x ∈ X, g ∈ X ∗ and consider the function f (λ) := g(R(B; λ)x). If |λ| > kBk then (2.10) implies the following Laurent expansion
$$f(\lambda) = -\sum_{n=0}^{\infty} \lambda^{-n-1} g(B^n x). \qquad (2.11)$$
Since R(B; ·) is analytic in C\σ(B) (see Theorem 2.17), so is f (see Theorem 2.14 and Remark 2.16). Hence, (2.11) is valid if |λ| > r(B). Taking λ = ae^{iθ}, a > r(B), and integrating the series for λ^{m+1} f (λ) term by term with respect to θ we have
$$\int_0^{2\pi} a^{m+1} e^{i(m+1)\theta} f(ae^{i\theta})\, d\theta = -\sum_{n=0}^{\infty} a^{m-n} g(B^n x) \int_0^{2\pi} e^{i(m-n)\theta}\, d\theta = -2\pi\, g(B^m x).$$
Thus
$$|g(B^m x)| \le \frac{1}{2\pi} \int_0^{2\pi} a^{m+1} |g(R(B; ae^{i\theta})x)|\, d\theta \le a^{m+1} M(a)\, \|g\|\,\|x\|,$$
where
$$M(a) := \sup_{0 \le \theta \le 2\pi} \|R(B; ae^{i\theta})\|$$
(see (1.1)). Taking g ∈ X ∗ such that g(B m x) = kB m xk, kgk = 1 (see Lemma 2.18) we obtain kB m xk ≤ am+1 M (a)kxk, ∀x ∈ X, i.e. kB m k ≤ am+1 M (a), ∀m ∈ IN. Consequently lim sup kB m k1/m ≤ a, if a > r(B), m→∞
i.e. we have proved (2.9). 2
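The spectral radius formula is easy to observe numerically. In the sketch below B is a deliberately non-normal matrix, so that kBk is much larger than r(B) and the limit in (2.8) is genuinely needed; the particular matrix and the exponents printed are arbitrary choices:

    import numpy as np

    B = np.array([[0.3, 1.0], [0.0, 0.3]])          # non-normal: ||B|| is much larger than r(B) = 0.3
    r_exact = np.max(np.abs(np.linalg.eigvals(B)))  # spectral radius from the eigenvalues
    for n in (1, 5, 20, 100, 400):
        Bn = np.linalg.matrix_power(B, n)
        print(n, np.linalg.norm(Bn, 2) ** (1.0 / n))   # ||B^n||^(1/n), cf. (2.8)
    print("r(B) =", r_exact)                        # the printed values decrease towards r(B)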
The functional calculus

Let p(λ) = a0 + a1λ + · · · + aN λ^N be a polynomial and let B ∈ B(X). It is natural to define p(B) by the equality p(B) = a0 I + a1B + · · · + aN B^N. It turns out that one can define f (B) for any complex–valued function f which is analytic in some neighbourhood of σ(B).

Let f be a complex–valued function which is analytic in a disk {λ ∈ C : |λ| < rf }, where rf > 0 is some given number. Then we have the following Taylor expansion
$$f(\lambda) = \sum_{n=0}^{\infty} a_n \lambda^n, \quad |\lambda| < r_f,$$
where an = f^{(n)}(0)/n! and the series is absolutely convergent.

2.24. Definition Let X be a Banach space, B ∈ B(X) and r(B) < rf , i.e. σ(B) ⊂ {λ ∈ C : |λ| < rf }. Then we define f (B) by the equality
$$f(B) := \sum_{n=0}^{\infty} a_n B^n. \qquad (2.12)$$
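For an entire function such as f(λ) = e^λ (so that rf = ∞ > r(B) for every B) Definition 2.24 can be evaluated by truncating the series (2.12). A small Python sketch, with an arbitrary 2×2 matrix and the eigendecomposition used only as an independent reference value:

    import numpy as np

    B = np.array([[0.0, 1.0], [-1.0, 0.0]])     # any matrix will do, since exp is entire
    fB, term = np.zeros((2, 2)), np.eye(2)
    for n in range(1, 30):                      # truncated series sum_n B^n / n!  (a_n = 1/n!)
        fB += term
        term = term @ B / n

    w, V = np.linalg.eig(B)                     # reference value through the eigendecomposition
    ref = (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real
    print(np.linalg.norm(fB - ref, 2))          # ~ 1e-16: both give exp(B)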
We need to check that this definition makes sense. Let us take ε > 0 such that r(B) + ε < rf. It follows from the spectral radius formula that for sufficiently large n we have kB n k < (r(B) + ε)n. Therefore the power series (2.12) is absolutely convergent. Hence, f (B) ∈ B(X) (see Theorems 1.10 and 1.15).
A disadvantage of the above definition is that f has to be analytic in some disk containing σ(B). This does not seem fair if σ(B) is far from being a disk, e.g. is a curve. So, it is natural to look for an alternative definition of f (B).

2.25. Definition A bounded open set Ω ⊂ C is called admissible if its boundary ∂Ω consists of a finite number of smooth closed pairwise disjoint curves. The orientation of ∂Ω is chosen in such a way that Ω remains on the left as we move along ∂Ω in the positive direction.

Let Ω be an admissible set and let f be a complex–valued function which is analytic in some neighbourhood of the closure Cl(Ω) of Ω. Then we can write the Cauchy integral formula from Complex Analysis:
$$f(\lambda_0) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f(\lambda)}{\lambda - \lambda_0}\, d\lambda, \quad \lambda_0 \in \Omega.$$
The formal substitution λ0 = B gives
$$f(B) = \frac{1}{2\pi i} \int_{\partial\Omega} f(\lambda)(\lambda I - B)^{-1}\, d\lambda.$$
The RHS is well defined and belongs to B(X) if ∂Ω ∩ σ(B) = ∅. Here and below the integrals are understood as norm limits of the corresponding Riemann sums. This motivates the following definition.

2.26. Definition Let X be a Banach space, B ∈ B(X). Suppose f is analytic in some (open) neighbourhood ∆f of σ(B) and Ω is an admissible set such that
$$\sigma(B) \subset \Omega \subset \mathrm{Cl}(\Omega) \subset \Delta_f. \qquad (2.13)$$
Then
$$f(B) := -\frac{1}{2\pi i} \int_{\partial\Omega} f(\lambda)\, R(B; \lambda)\, d\lambda. \qquad (2.14)$$
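Formula (2.14) can also be approximated numerically by replacing the contour integral with a Riemann sum over a circle enclosing σ(B), in line with the remark above that the integral is a norm limit of Riemann sums. In the sketch below the circle |λ| = 2, the number of nodes and the matrix are arbitrary choices, and the result is compared with f(B) computed from the eigendecomposition:

    import numpy as np

    B = np.array([[0.0, 1.0], [-1.0, 0.0]])      # sigma(B) = {i, -i}
    f = np.exp                                   # analytic on a neighbourhood of sigma(B)
    R_circ, m = 2.0, 400                         # contour: circle |lambda| = 2, with m nodes
    theta = 2 * np.pi * np.arange(m) / m
    I2 = np.eye(2)

    total = np.zeros((2, 2), dtype=complex)
    for lam in R_circ * np.exp(1j * theta):
        # integrand f(lambda) R(B; lambda) d(lambda), with d(lambda) = i*lambda*d(theta)
        total += f(lam) * np.linalg.inv(B - lam * I2) * (1j * lam) * (2 * np.pi / m)
    fB = total * (-1.0 / (2j * np.pi))           # formula (2.14) as a Riemann sum

    w, V = np.linalg.eig(B)
    ref = V @ np.diag(f(w)) @ np.linalg.inv(V)   # f(B) through the eigendecomposition
    print(np.linalg.norm(fB - ref, 2))           # small: the contour integral reproduces f(B)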
It is always possible to find an admissible set satisfying (2.13). Indeed, σ(B) is a closed bounded subset of the open set ∆f . Therefore d(σ(B), ∂∆f ) > 0. Here d(K1 , K2 ) := inf{|λ1 − λ2 | : λ1 ∈ K1 , λ2 ∈ K2 } denotes the distance between two sets. 2.27. Proposition The RHS of (2.14) does not depend on the choice of an admissible set satisfying (2.13). Proof: Let Ω and Ω1 be admissible sets satisfying (2.13). Then there exists an admissible set Ω0 such that σ(B) ⊂ Ω0 ⊂ Cl(Ω0 ) ⊂ Ω
∩ Ω1 ⊂ ∆f .
It is sufficient to prove that 1 Z 1 Z f (λ)R(B; λ)dλ = f (λ)R(B; λ)dλ, 2πi ∂Ω 2πi ∂Ω0
(2.15)
because the same argument will apply to the pair Ω1 and Ω0 . Let us take arbitrary x ∈ X, g ∈ X ∗ and prove that 1 Z 1 Z f (λ)g(R(B; λ)x)dλ = f (λ)g(R(B; λ)x)dλ. 2πi ∂Ω 2πi ∂Ω0
(2.16)
The function λ 7→ f (λ)g(R(B; λ)x) ∈ C is analytic in a neighbourhood of S Cl(Ω)\Ω0 . Since the boundary of this set equals ∂Ω ∂Ω0 , (2.16) follows from the Cauchy theorem. 18
Thus, 1 Z 1 Z f (λ)R(B; λ)xdλ − f (λ)R(B; λ)xdλ = 0, ∀g ∈ X ∗ . g 2πi ∂Ω 2πi ∂Ω0
Consequently 1 Z 1 Z f (λ)R(B; λ)xdλ = f (λ)R(B; λ)xdλ, ∀x ∈ X, 2πi ∂Ω 2πi ∂Ω0 (see Lemma 2.18), i.e. (2.15) holds. 2 Our aim now is to show that Definition 2.26 does not contradict the “common sense”, i.e. that in the situations where we now what f (B) is, (2.14) gives the same result. 2.28. Lemma Let Ω be an admissible neighbourhood of σ(B). Then for any λ0 6∈ Cl(Ω) we have 1 Z (B − λ0 I) = − (λ − λ0 )m R(B; λ)dλ, ∀m ∈ Z. 2πi ∂Ω m
(2.17)
In particular R(B; λ0 ) = (B − λ0 I)−1 = −
1 Z (λ − λ0 )−1 R(B; λ)dλ. 2πi ∂Ω
Proof: Let us denote the RHS of (2.17) by Am . Since λ0 6∈ Cl(Ω), the function f (λ) := (λ − λ0 )m is analytic in some neighbourhood of Cl(Ω) and 1 Z (λ − λ0 )m dλ = 0. 2πi ∂Ω Using this equality and the resolvent equation (Lemma 2.11) R(B; λ) = R(B; λ0 ) + (λ − λ0 )R(B; λ0 )R(B; λ), we obtain Am = −R(B; λ0 ) R(B; λ0 )
1 Z (λ − λ0 )m dλ − 2πi ∂Ω
1 Z (λ − λ0 )m+1 R(B; λ)dλ = R(B; λ0 )Am+1 . 2πi ∂Ω 19
Hence Am = (B − λ0 I)−1 Am+1 , m = 0, ±1, ±2, . . . This recursion formula shows that (2.17) follows from the case m = 0. We thus have to prove that −
1 Z R(B; λ)dλ = I. 2πi ∂Ω
According to Proposition 2.27 it will suffice to prove that −
1 Z R(B; λ)dλ = I 2πi |λ|=r
for sufficiently large r > 0. If |λ| = r > kBk we have the following absolutely and uniformly convergent expansion (B − λI)−1 = −
∞ X
λ−n−1 B n
n=0
(see (2.10)). Termwise integration gives Z ∞ X 1 Z n 1 − R(B; λ)dλ = B λ−n−1 dλ = B 0 = I. 2 2πi |λ|=r 2πi |λ|=r n=0
2.29. Corollary Let Ω be an admissible neighbourhood of σ(B). Then for any polynomial p we have 1 Z p(λ)R(B; λ)dλ. p(B) = − 2πi ∂Ω In particular [ 1 Z B =− λn R(B; λ)dλ, ∀n ∈ IN {0}. 2πi ∂Ω n
Proof: It is sufficient to represent p(λ) in the form p(λ) = λ0 6∈ Cl(Ω), and apply the last lemma. 2
PN
m m=0 cm (λ − λ0 ) ,
2.30. Theorem Let f be analytic in a disk ∆f = {λ ∈ C : |λ| < rf }, 20
where rf > r(B). Then Definitions 2.24 and 2.26 give the same result. Proof: Termwise integration gives ∞ X 1 Z 1 Z − f (λ)R(B; λ)dλ = − an λn R(B; λ)dλ = 2πi ∂Ω 2πi ∂Ω n=0 ∞ ∞ X X 1 Z an − λn R(B; λ)dλ = an B n 2πi ∂Ω n=0 n=0
!
(see Corollary 2.29). 2 Let f and g be functions analytic in some (open) neighbourhoods ∆f and ∆g of σ(B). It follows directly from Definition 2.26 that (αf + βg)(B) = αf (B) + βg(B), ∀α, β ∈ C.
(2.18)
2.31. Theorem (f g)(B) = f (B)g(B). Proof: Let us take admissible sets Ωf and Ωg such that σ(B) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆f
\
∆g .
(2.19)
Using the resolvent equation (Lemma 2.11) R(B; λ)R(B; µ) = −
R(B; λ) R(B; µ) − , λ 6= µ, µ−λ λ−µ
we obtain !
!
1 Z 1 Z f (B)g(B) = − f (λ)R(B; λ)dλ − g(µ)R(B; µ)dµ = 2πi ∂Ωf 2πi ∂Ωg ! 1 Z 1 Z − f (λ) − g(µ)R(B; λ)R(B; µ)dµ dλ = 2πi ∂Ωf 2πi ∂Ωg ! 1 Z 1 Z g(µ) − f (λ)R(B; λ) dµ dλ − 2πi ∂Ωf 2πi ∂Ωg µ − λ ! 1 Z 1 Z g(µ) − f (λ) R(B; µ)dµ dλ. 2πi ∂Ωf 2πi ∂Ωg λ − µ 21
Since λ ∈ Ωg for any λ ∈ ∂Ωf and µ 6∈ Cl(Ωf ) for any µ ∈ ∂Ωg (see (2.19)), the Cauchy theorem implies 1 Z g(µ) dµ = g(λ), 2πi ∂Ωg µ − λ
1 Z f (λ) dλ = 0. 2πi ∂Ωf λ − µ
Therefore 1 Z f (B)g(B) = − f (λ)g(λ)R(B; λ)dλ − 2πi ∂Ωf ! 1 Z 1 Z f (λ) dλ g(µ)R(B; µ)dµ = (f g)(B) − 0 = (f g)(B). 2 2πi ∂Ωg 2πi ∂Ωf λ − µ 2.32. Corollary f (B)g(B) = g(B)f (B). Proof: f (B)g(B) = (f g)(B) = (gf )(B) = g(B)f (B). 2 2.33. Theorem Let A, B ∈ B(X) and AB = BA. Suppose f and g are analytic in some neighbourhoods ∆f and ∆g of σ(A) and σ(B) correspondingly. Then f (A)g(B) = g(B)f (A). Proof: Let us take admissible sets Ωf and Ωg such that σ(A) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ ∆f ,
σ(B) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆g .
Since AB = BA implies R(A; µ)R(B; λ) = R(B; λ)R(A; µ) for any λ ∈ ρ(B) and µ ∈ ρ(A) (see Lemma 2.12), we have !
!
1 Z 1 Z f (A)g(B) = − f (µ)R(A; µ)dµ − g(λ)R(B; λ)dλ = 2πi ∂Ωf 2πi ∂Ωg Z 1 Z f (µ)g(λ)R(A; µ)R(B; λ)dλdµ = − 2 4π ∂Ωf ∂Ωg 1 Z Z f (µ)g(λ)R(B; λ)R(A; µ)dµdλ = − 2 4π ∂Ωg ∂Ωf ! ! 1 Z 1 Z − g(λ)R(B; λ)dλ − f (µ)R(A; µ)dµ = g(B)f (A). 2 2πi ∂Ωg 2πi ∂Ωf
22
2.34. Theorem (the spectral mapping theorem) Let f be analytic in some neighbourhood of σ(B). Then σ(f (B)) = f (σ(B)) := {f (λ) : λ ∈ σ(B)}. Proof: Let us prove that σ(f (B)) ⊂ f (σ(B)).
(2.20)
Let take any ζ ∈ C\f (σ(B)). Then f − ζ = 6 0 in some neighbourhood of σ(B). Hence the function g := 1/(f − ζ) is analytic in a neighbourhood of σ(B). Now Theorem 2.31 implies !
g(B)(f (B) − ζI) =
1 (f − ζ) (B) = 1(B) = I. f −ζ
Similarly (f (B) − ζI)g(B) = I. Therefore g(B) = R(f (B); ζ) and ζ 6∈ σ(f (B)). Thus ζ 6∈ f (σ(B)) =⇒ ζ 6∈ σ(f (B)), i.e. (2.20) holds. It is left to prove that f (σ(B)) ⊂ σ(f (B)).
(2.21)
Let us take an arbitrary µ ∈ σ(B) and consider the function (
h(λ) :=
(f (λ) − f (µ))/(λ − µ) if λ 6= µ, f 0 (µ) if λ = µ.
It is clear that h is analytic in some neighbourhood of σ(B) and f (λ)−f (µ) = (λ − µ)h(λ). It follows from Theorem 2.31 that f (B) − f (µ)I = (B − µI)h(B). The operator B − µI is not invertible, because µ ∈ σ(B). Since the operators B − µI and h(B) commute (see Corollary 2.32), the operator f (B) − f (µ)I cannot be invertible (see Theorem 2.3(ii)). Thus µ ∈ σ(B) =⇒ f (µ) ∈ σ(f (B)), i.e. (2.21) holds. 2
23
2.35. Theorem Let f be analytic in some neighbourhood ∆f of σ(B) and g be analytic in some neighbourhood ∆g of f (σ(B)). Then for the composition g ◦ f (defined by (g ◦ f )(λ) := g(f (λ))) we have (g ◦ f )(B) = g(f (B)). Proof: The RHS of the last equality is well defined due to the spectral mapping theorem. Let us take admissible sets Ωf and Ωg such that σ(B) ⊂ Ωf ⊂ Cl(Ωf ) ⊂ ∆f , σ(f (B)) = f (σ(B)) ⊂ f (Cl(Ωf )) ⊂ Ωg ⊂ Cl(Ωg ) ⊂ ∆g . For any ζ ∈ ∂Ωg the function hζ := 1/(f − ζ) is analytic in some neighbourhood of Cl(Ωf ). Exactly as in the proof of Theorem 2.34 we prove that hζ (B) = R(f (B); ζ). Using the Cauchy theorem we obtain 1 Z 1 Z g(f (B)) = − g(ζ)R(f (B); ζ)dζ = − g(ζ)hζ (B)dζ = 2πi ∂Ωg 2πi ∂Ωg ! 1 Z 1 Z 1 − g(ζ) − R(B; λ)dλ dζ = 2πi ∂Ωg 2πi ∂Ωf f (λ) − ζ ! 1 Z 1 Z g(ζ) R(B; λ) dζ dλ = − 2πi ∂Ωf 2πi ∂Ωg ζ − f (λ) 1 Z 1 Z − g(f (λ))R(B; λ)dλ = − (g ◦ f )(λ)R(B; λ)dλ = 2πi ∂Ωf 2πi ∂Ωf (g ◦ f )(B). 2
24
Summary Suppose X is a Banach space and B ∈ B(X). Let A(B) be the set of all functions f which are analytic in some neighbourhood of σ(B). (The neighbourhood can depend on f ∈ A(B).) The map A(B) 3 f 7→ f (B) ∈ B(X) defined by (2.14) has the following properties: (i) if f, g ∈ A(B), α, β ∈ C, then αf + βg, f g ∈ A(B) and (αf + βg)(B) = αf (B) + βg(B), (f g)(B) = f (B)g(B); P n (ii) if f has the power series expansion f (λ) = ∞ n=0 cn λ , valid in a neighP∞ bourhood of σ(B), then f (B) = n=0 cn B n ; (iii) σ(f (B)) = f (σ(B)) (the spectral mapping theorem); (iv) if f ∈ A(B) and g ∈ A(f (B)), then (g ◦ f )(B) = g(f (B)).
Linear operator equations Let X and Y be normed spaces and A ∈ B(X, Y ). The adjoint operator A∗ : Y ∗ → X ∗ is defined by the equality (A∗ g)(x) := g(Ax), ∀g ∈ Y ∗ , ∀x ∈ X. It is not difficult to prove that A∗ ∈ B(Y ∗ , X ∗ ) and kA∗ k = kAk (Functional Analysis I). It turns out that A∗ plays a very important role in the study of the following linear operator equation Ax = y, y ∈ Y.
(2.22)
2.36. Lemma For the closure of the range of A we have Ran(A) ⊂ {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )}.
(2.23)
Proof: Let us take an arbitrary y ∈ Ran(A) and g ∈ Ker(A∗ ). There exist xn ∈ X such that yn := Axn converge to y as n → ∞. Hence g(y) = g( lim Axn ) = lim g(Axn ) = lim (A∗ g)(xn ) = lim 0(xn ) = 0. 2 n→∞
n→∞
n→∞
25
n→∞
2.37. Corollary (a necessary condition of solvability) If (2.22) has a solution then g(y) = 0, ∀g ∈ Ker(A∗ ). Proof: (2.22) is solvable if and only if y ∈ Ran(A). 2 Now we are going to prove that (2.23) is in fact an equality. We start with the following corollary of the Hahn–Banach theorem. 2.38. Lemma Let Y0 be a closed linear subspace of Y and y0 ∈ Y \Y0 . Then there exists g ∈ Y ∗ such that g(y0 ) = 1, g(y) = 0 for all y ∈ Y0 and kgk = 1/d, where d = d(y0 , Y0 ) = inf y∈Y0 ky0 − yk. (d > 0 because Y0 is closed.) Proof: Let Y1 := {λy0 + y : λ ∈ IF, y ∈ Y0 }. Define a functional g1 : Y1 → IF by the equality g1 (λy0 + y) = λ. It is clear that g1 is linear and g1 (y0 ) = 1, g1 (y) = 0 for all y ∈ Y0 . Further, |g1 (λy0 + y)| |λ| = sup = λy0 +y6=0 kλy0 + yk λy0 +y6=0 kλy0 + yk |λ| 1 1 = sup = sup = −1 kλy0 + yk λ6=0, y∈Y0 ky0 + λ yk z∈Y0 ky0 + zk 1 1 1 = = . inf z∈Y0 ky0 + zk inf w∈Y0 ky0 − wk d kg1 k =
sup λ6=0, y∈Y0
sup
Due to the Hahn–Banach theorem g1 can be extended to a functional g ∈ Y ∗ such that g(y0 ) = 1, g(y) = 0 for all y ∈ Y0 and kgk = 1/d. 2 2.39. Theorem Ran(A) = {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )}. Proof: According to Lemma 2.36 it is sufficient to prove that N (A) := {y ∈ Y : g(y) = 0, ∀g ∈ Ker(A∗ )} ⊂ Ran(A). 26
Suppose the contrary: there exists y0 ∈ N (A), y0 6∈ Ran(A). Lemma 2.38 implies the existence of g ∈ Y ∗ such that g(y0 ) = 1, g(y) = 0 for all y ∈ Ran(A). Then (A∗ g)(x) = g(Ax) = 0, ∀x ∈ X, i.e A∗ g = 0, i.e. g ∈ Ker(A∗ ). Since y0 ∈ N (A), we have g(y0 ) = 0. Contradiction! 2 2.40. Corollary Suppose Ran(A) is closed. Then (2.22) is solvable if and only if g(y) = 0, ∀g ∈ Ker(A∗ ). Proof: (2.22) is solvable if and only if y ∈ Ran(A) = Ran(A). 2
Projections 2.41. Definition Let X be a normed space. An operator B ∈ B(X) is called a projection if it is idempotent, i.e. B 2 = B. 2.42. Lemma Let P ∈ B(X) be a projection. Then Q := I − P is a projection, P Q = QP = 0 and Ran(P ) = Ker(Q), Ran(Q) = Ker(P ). Proof: Q2 = (I − P )2 = I − 2P + P 2 = I − 2P + P = I − P = Q, i.e. Q is a projection. P Q = P (I − P ) = P − P 2 = 0 = (I − P )P = QP. It follows from the equality QP = 0 that Ran(P ) ⊂ Ker(Q) (why?). On the other hand for any x ∈ Ker(Q) we have x − P x = 0, i.e. x = P x, hence x ∈ Ran(P ). Thus Ker(Q) ⊂ Ran(P ). So, Ran(P ) = Ker(Q). The equality Ran(Q) = Ker(P ) follows from the last one if P is replaced by Q. 2 2.43. Definition Let L, L1 , L2 be subspaces of a vector space X. We say that L is the direct sum of L1 , L2 and write L = L1 ⊕ L2 if L1 ∩ L2 = {0} and L = L1 + L2 := {y1 + y2 : y1 ∈ L1 , y2 ∈ L2 }. 27
2.44. Proposition Let L, L1 , L2 be subspaces of a vector space X. Then L = L1 ⊕ L2 iff every element y of L can be uniquely written as y = y1 + y2 with yk ∈ Lk . Proof: Exercise. 2 2.45. Lemma If P ∈ B(X) is a projection, then Ran(P ) is closed and X = Ran(P ) ⊕ Ker(P ). Proof: Ran(P ) = Ker(I − P ), so Ran(P ) is closed. The equality I = P + (I − P ) and Lemma 2.42 imply Ran(P ) + Ker(P ) = Ran(P ) + Ran(I − P ) = X. So, it is left to prove that Ran(P ) ∩ Ker(P ) = {0}. Let us take an arbitrary x ∈ Ran(P ) ∩ Ker(P ). Since x ∈ Ran(P ), there exists y ∈ X such that x = P y. Consequently P x = P 2 y = P y = x. On the other hand x ∈ Ker(P ), i.e. P x = 0. Thus x = 0. 2 2.46. Theorem Let X be a Banach space and P ∈ B(X) be a non–trivial projection, i.e. P 6= 0, I. Then σ(P ) = {0, 1}. Proof: Using the spectral mapping theorem (Theorem 2.34) we obtain {0} = σ(0) = σ(P − P 2 ) = {λ − λ2 : λ ∈ σ(P )}. So, λ ∈ σ(P ) =⇒ λ = 0 or 1, i.e. σ(P ) ⊂ {0, 1}. On the other hand, Ran(P ) 6= {0} and Ran(I − P ) 6= {0} (why?), i.e. Ker(I − P ) 6= {0} and Ker(P ) 6= {0} (see Lemma 2.42). Consequently the operators P and P − I are not invertible, i.e. {0, 1} ⊂ σ(P ). 2 T Suppose B ∈ B(X) and Ω is an admissible set such that ∂Ω σ(B) = ∅. Consider the operator Z 1 R(B; λ)dλ. (2.24) P := − 2πi ∂Ω T If Ω σ(B) = ∅ then P = 0 (follows from the Cauchy theorem). If σ(B) ⊂ Ω then P = 1(B) = I (Lemma 2.28). The only non–trivial T case is when there exists a non–empty subset σ of σ(B) such that σ 6= σ(B), σ ⊂ Ω, Ω (σ(B)\σ) = ∅ and both σ and σ(B)\σ are closed. This can happen only if σ(B) is not connected. Let an admissible set Ω0 and open sets ∆, ∆0 be such that σ ⊂ Ω ⊂ Cl(Ω) ⊂ ∆, σ(B)\σ ⊂ Ω0 ⊂ Cl(Ω0 ) ⊂ ∆0
28
T S and ∆ ∆0 = ∅. Let us consider the function f : ∆1 := ∆ S∆0 → C which equals S 1 in ∆ and 0 in ∆0 . It is analytic in ∆1 . Since σ(B) ⊂ Ω1 := Ω Ω0 and ∂Ω1 = ∂Ω ∂Ω0 , we have Z Z 1 1 R(B; λ)dλ = − f (λ)R(B; λ)dλ = f (B). P =− 2πi ∂Ω 2πi ∂Ω1 It is clear that f 2 = f . Hence P 2 = P (Theorem 2.31). It follows from the spectral mapping theorem (Theorem 2.34) that σ(P ) = σ(f (B)) = {0, 1}. Thus P is a non–trivial projection which commutes with g(B) for any function g analytic in some neighbourhood of σ(B) (Corollary 2.32). P is called the spectral projection corresponding to the spectral set σ. Since g(B)P x = P g(B)x ∈ Ran(P ), ∀x ∈ X, and P g(B)x0 = g(B)P x0 = g(B)0 = 0, ∀x0 ∈ Ker(P ), Ran(P ) and Ker(P ) are invariant under g(B), i.e. g(B)Ran(P ) ⊂ Ran(P ),
g(B)Ker(P ) ⊂ Ker(P ).
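The spectral projection (2.24) lends itself to the same kind of numerical experiment as the functional calculus above. In the sketch below the matrix is chosen so that its spectrum splits into a cluster near 1 and a point near 5, and the contour is a circle around the cluster only; the matrix, the centre and the radius of the circle are arbitrary choices made so that the contour does not meet σ(B):

    import numpy as np

    B = np.diag([1.0, 1.2, 5.0]) + 0.1 * np.ones((3, 3))   # two eigenvalues near 1, one near 5
    c, r, m = 1.0, 1.0, 400                                # circle |lambda - 1| = 1 encloses the pair near 1
    I3 = np.eye(3)

    total = np.zeros((3, 3), dtype=complex)
    for th in 2 * np.pi * np.arange(m) / m:
        lam = c + r * np.exp(1j * th)
        dlam = 1j * r * np.exp(1j * th) * (2 * np.pi / m)
        total += np.linalg.inv(B - lam * I3) * dlam
    P = total * (-1.0 / (2j * np.pi))            # spectral projection (2.24) as a Riemann sum

    print(np.linalg.norm(P @ P - P))             # ~ 0: P is a projection
    print(np.round(np.trace(P).real))            # 2.0: rank of P = number of enclosed eigenvalues
    print(np.linalg.norm(B @ P - P @ B))         # ~ 0: P commutes with B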
Compact operators 2.47. Definition Let X be a normed space. A set K ⊂ X is relatively compact if every sequence in K has a Cauchy subsequence. K is called compact if every sequence in K has a subsequence which converges to an element of K. 2.48. Proposition K relatively compact =⇒ K bounded. K compact =⇒ K closed and bounded. Any subset of a relatively compact set is relatively compact. Any closed subset of a compact set is compact. 2.49. Theorem K is relatively compact if and only if it is totally bounded, i.e. for every ε > 0, X contains a finite set, called ε−net for K, such that the finite set of open balls of radius ε and centres in the ε−net covers K. Proof: See e.g. C. Goffman & G. Pedrick, First Course in Functional Analysis, Section 1.11, Lemma 1. 2
29
2.50. Proposition Let X be finite–dimensional. K ⊂ X is relatively compact iff K is bounded. K ⊂ X is compact iff K is closed and bounded. 2.51. Theorem On a finite–dimensional vector space X, any norm k · k is equivalent to any other norm k · k0 , i.e. there exist constants c1 , c2 > 0 such that c1 kxk ≤ kxk0 ≤ c2 kxk, ∀x ∈ X. Proof : See e.g. E. Kreyszig, Introductory Functional Analysis with Applications, Theorem 2.4-5. 2 2.52. Definition Let X and Y be normed spaces. A linear operator T : X → Y is called compact (or completely continuous) if it maps bounded sets of X into relatively compact sets of Y . We will denote the set of all compact linear operators acting from X into Y by Com(X, Y ). Since any relatively compact set is bounded, T ∈ Com(X, Y ) maps bounded sets into bounded sets. In particular the image T (SX ) of the unit ball SX := {x ∈ X : kxk ≤ 1}
(2.25)
is bounded, i.e T ∈ B(X, Y ). Thus Com(X, Y ) ⊂ B(X, Y ). It follows from Definition 2.47, that a linear operator T : X → Y is compact iff for any bounded sequence xn ∈ X, n ∈ IN, the sequence (T xn )n∈IN has a Cauchy subsequence. 2.53. Lemma Let X and Y be normed spaces. A linear operator T : X → Y is compact iff T (SX ) is relatively compact. Proof: Since SX is bounded, T ∈ Com(X, Y ) implies that T (SX ) is relatively compact (see Definition 2.52). Hence we need to prove that if T (SX ) is relatively compact then T ∈ Com(X, Y ). It is clear that for any t > 0 the set tT (SX ) = T (tSX ) is relatively compact. (Why?) Let W ⊂ X be an arbitrary bounded set. Then W ⊂ tSX if t ≥ sup{kxk : x ∈ W }. Using the obvious fact that any subset 30
of a relatively compact set is relatively compact (see Proposition 2.48) we conclude from T (W ) ⊂ T (tSX ) that T (W ) is relatively compact. 2 2.54. Theorem Let X, Y and Z be normed spaces. (i) If T1 , T2 ∈ Com(X, Y ) and α, β ∈ IF, then αT1 + βT2 ∈ Com(X, Y ). (ii) If T ∈ Com(X, Y ), A ∈ B(Z, X) and B ∈ B(Y, Z), then T A ∈ Com(Z, Y ) and BT ∈ Com(X, Z). (iii) If Tk ∈ Com(X, Y ), k ∈ IN, and kT − Tk k → 0 as k → ∞, then T ∈ Com(X, Y ). Proof: (i). Let xn ∈ X, n ∈ IN be an arbitrary bounded sequence. It follows from the compactness of T1 that (T1 xn ) has a Cauchy subsequence (T1 x(1) n ). Using the compactness of T2 we can extract a Cauchy subsequence (1) (2) (T2 x(2) n ) from the sequence (T2 xn ). It is clear that (T1 xn ) is a Cauchy sequence. Consequently any xn ∈ X, n ∈ IN has a sub bounded sequence (2) (2) sequence (xn ) such that (αT1 + βT2 )xn is a Cauchy sequence. Thus αT1 + βT2 ∈ Com(X, Y ). (ii) Let zn ∈ Z, n ∈ IN be an arbitrary bounded sequence. Then (Azn ) is a bounded sequence in X and it follows from the compactness of T that (T Azn ) has a Cauchy subsequence (T Azn(1) ). So, T A ∈ Com(Z, Y ). For any bounded sequence xn ∈ X, n ∈ IN, the sequence (T xn ) has a Cauchy subse(1) quence (T x(1) n ). It is easy to see that (BT xn ) is a Cauchy sequence. Hence BT ∈ Com(X, Z). (iii) Let xn ∈ X, n ∈ IN be an arbitrary bounded sequence. Since T1 is compact (T1 xn ) has a Cauchy subsequence (T1 x(1) n ). Using the compactness of T2 (2) we can extract a Cauchy subsequence (T2 xn ) from the sequence (T2 x(1) n ). It (2) is clear that (T1 xn ) is a Cauchy sequence. Repeating this process we obtain (m1 ) 0) ) is a subsequence of (x(m ) if sequences (x(m) n n )n∈IN in X such that (xn (m) m1 > m0 , and (Tk xn )n∈IN is a Cauchy sequence if k ≤ m. Let us consider (n) the diagonal sequence (x(n) n ). It is easily seen that (xn ) is a subsequence of (n) (xn ). (Tk x(n) n )n∈IN is a Cauchy sequence for any k ∈ IN, because (Tk xn )n≥k is (n) a subsequence of the Cauchy sequence (Tk x(k) n )n∈IN . Let us prove that (T xn ) is also Cauchy. We have (m) (n) (n) (n) (m) kT x(n) n − T xm k ≤ kT xn − Tk xn k + kTk xn − Tk xm k +
(m) (n) (m) (n) (m) kTk x(m) m − T xm k ≤ kT − Tk k kxn k + kxm k + kTk xn − Tk xm k.
31
For a given ε > 0, we choose k so large that kT − Tk k <
ε , 3M
where M := supn∈IN kxn k. Then we determine n0 so that ε (m) kTk x(n) n − Tk xm k < , ∀n, m ≥ n0 . 3 Now we have (m) kT x(n) n − T xm k < ε, ∀n, m ≥ n0 .
Hence (T x(n) n ) is a Cauchy sequence, i.e. T ∈ Com(X, Y ). 2 Propositions (i) and (iii) of the last theorem mean that Com(X, Y ) is a closed linear subspace of B(X, Y ). Let Y = X. It follows from (ii) that Com(X) := Com(X, X) is actually what we call an ideal in the Banach algebra B(X). 2.55. Definition We say that T ∈ B(X, Y ) is a finite rank operator if Ran(T ) is finite–dimensional. It is clear that any finite rank operator is compact (see Proposition 2.50). Theorem 2.54(iii) implies that a limit of a convergent sequence of finite rank operators is compact. 2.56. Theorem Every bounded subset of a normed space X is relatively compact iff X is finite–dimensional. Proof: See e.g. L.A. Liusternik & V.J. Sobolev, Elements of Functional Analysis, §16). 2 2.57. Corollary dimensional. 2
The identity operator IX is compact iff X is finite–
2.58. Theorem Let T ∈ Com(X, Y ) and let at least one of the spaces X and Y be infinite–dimensional. Then T is not invertible. Proof: Suppose T is invertible: T −1 ∈ B(Y, X). Then operators T −1 T = IX and T T −1 = IY are compact by Theorem 2.54(ii). This however contradicts Corollary 2.57. 2 32
2.59. Corollary Let X be an infinite–dimensional Banach space and T ∈ Com(X). Then 0 ∈ σ(T ). 2 2.60. Theorem Any non-zero eigenvalue λ of a compact operator T ∈ Com(X) has finite multiplicity, i.e. the subspace Xλ spanned by eigenvectors of T corresponding to λ is finite–dimensional. Proof: Any bounded subset Ω of Xλ is relatively compact because T x = λx, ∀x ∈ Ω, i.e. Ω = λ−1 T (Ω), and T is a compact operator. Hence Xλ is finite–dimensional by Theorem 2.56. 2 2.61. Theorem Let T ∈ Com(X). Then σ(T ) is at most countable and has at most one limit point, namely, 0. Any non-zero point of σ(T ) is an eigenvalue of T , which has finite multiplicity according to Theorem 2.60. In a neighbourhood of λ0 ∈ σ(T )\{0} the resolvent R(T ; λ) admits the following representation: R(T ; λ) =
k X
˜ ; λ) (λ − λ0 )−n−1 (T − λ0 I)n P + R(T
n=0
˜ ; λ) is analytic in some neighbourhood of λ0 , P is the spectral projection corwhere R(T responding to λ0 . P and hence (T − λ0 I)n P are finite rank operators.
Proof: See e.g. W. Rudin, Functional Analysis, Theorem 4.25 and T. Kato, Perturbation Theory For Linear Operators, Ch. III, §6, Section 5. 2
2.62. Theorem Let X and Y be Banach spaces and T ∈ B(X, Y ). Then T is compact if and only if T ∗ is compact. Proof: See W. Rudin, Functional Analysis, Theorem 4.19. 2 Let (X, k · kX ) be a normed space and Z be its linear subspace. Suppose Z is equipped with a norm k · kZ . We will say that (Z, k · kZ ) is continuously embedded in (X, k · kX ) if the operator I : (Z, k · kZ ) → (X, k · kX ), Ix = x, is continuous, i.e. kxkX ≤ constkxkZ , ∀x ∈ Z. If I : (Z, k · kZ ) → (X, k · kX ) is compact we say that (Z, k · kZ ) is compactly embedded in (X, k · kX ).
33
2.63. Proposition Let (Z, k · kZ ) be compactly embedded in (X, k · kX ), Y be a normed space and let A : Y → (Z, k · kZ ) be bounded. Then A ∈ Com(Y, X). Proof: Apply Theorem 2.54(ii) to the operator A = IA. 2 2.64. Example Let m ∈ IN {0}, −∞ < a < b < ∞. We denote by C m ([a, b]) the space of all m times continuously differentiable functions on [a, b] equipped with the norm S
kuk(m) :=
m X j=0
dj u(t) max . t∈[a,b] dtj
C m ([a, b]) is a Banach space. C m ([a, b]) is compactly embedded in C n ([a, b]) if m > n. (Let X and Z be some spaces of functions defined on a compact set K. Normally it is reasonable to expect Z to be compactly embedded in X if Z consists of functions which are “more smooth” than functions from X.) It follows from Proposition 2.63 that if a linear operator T is continuous from C n ([a, b]) to C m ([a, b]), m > n, then it is compact in C n ([a, b]). For example the operator defined by the formula (T u)(t) :=
Z
b
k(t, s)u(s)ds, k ∈ C 1 ([a, b]2 ),
a 0
is bounded from C ([a, b]) = C([a, b]) to C 1 ([a, b]) and hence compact in C([a, b]).
3
The geometry of Hilbert spaces
3.1. Definition An inner (scalar) product space is a vector space H together with a map (· , ·) : H × H → IF such that (λx + µy, z) = λ(x, z) + µ(y, z),
(3.1)
(x, y) = (y, x),
(3.2)
(x, x) ≥ 0, and (x, x) = 0 ⇐⇒ x = 0,
(3.3)
for all x, y, z ∈ H and λ, µ ∈ IF.
34
(3.1)–(3.3) imply ¯ y) + µ (x, λy + µz) = λ(x, ¯(x, z),
(3.4)
(x, 0) = (0, x) = 0.
(3.5)
3.2. Examples P 1. CN , (x, y) := N ¯k , wk > 0. (For wk = 1 we have the standard k=1 wk xk y N inner product on C .) P 2. l2 , (x, y) := ∞ ¯k . (The series converges because of the inequality k=1 xk y |xk yk | ≤ (|xk |2 + |yk |2 )/2.) R 3. C([0, 1]), (f, g) := 01 f(t)g(t)dt. R 4. C 1 ([0, 1]), (f, g) := 01 f (t)g(t) + f 0 (t)g 0 (t) dt. 3.3. Theorem Every inner product space becomes a normed space by setting kxk := (x, x)1/2 . Proof: Linear Analysis. 2 3.4. Theorem (Cauchy–Schwarz inequality) |(x, y)| ≤ kxkkyk, ∀x, y ∈ H.
(3.6)
Proof: Linear Analysis. 2 3.5. Definition A system {xα }α∈J ⊂ H is called orthogonal if (xα , xβ ) = 0, for all α, β ∈ J such that α 6= β. 3.6. Proposition (Pythagoras’ theorem) If a system of vectors x1 , . . . , xn ∈ H is orthogonal, then kx1 + x2 + · · · + xn k2 = kx1 k2 + kx2 k2 + · · · + kxn k2 . Proof: Express both sides in terms of inner products. 2 3.7. Proposition (polarization identity) If IF = IR then 4(x, y) = kx + yk2 − kx − yk2 , ∀x, y ∈ H. 35
If IF = C then 4(x, y) = kx + yk2 − kx − yk2 + ikx + iyk2 − ikx − iyk2 , ∀x, y ∈ H. Proof: The same. 2 3.8. Proposition (parallelogram law) kx + yk2 + kx − yk2 = 2kxk2 + 2kyk2 , ∀x, y ∈ H. Proof: The same. 2 According to Proposition 3.8 for the norm induced by an inner product the parallelogram law holds. Now the following result implies that a norm is induced by an inner product iff the parallelogram law holds. 3.9. Theorem (Jordan – Von Neumann) A norm satisfying the parallelogram law is derived from an inner product. 2 3.10. Definition A complete inner product space is called a Hilbert space. So, every Hilbert Space is a Banach space and the whole Banach space theory can be applied to Hilbert spaces. 3.11. Examples 1. CN with any of the norms of Example 3.2-1 is a Hilbert space. 2. l2 is a Hilbert space. 3. The inner product space from Example 3.2-3 is not a Hilbert space.
Orthogonal complements 3.12. Theorem Let L be a closed linear subspace of a Hilbert space H. For each x ∈ H there exists a unique point y ∈ L such that kx − yk = d(x, L) := inf{kx − zk : z ∈ L}. This point satisfies the equality (x − y, z) = 0, ∀z ∈ L. 36
Proof: Step I. Let yn ∈ L be such that kx−yn k → d := d(x, L). Applying the parallelogram law to the vectors x − yn and x − ym and using the fact that (yn + ym )/2 ∈ L we have kyn − ym k2 = 2kx − yn k2 + 2kx − ym k2 − kx − yn + x − ym k2 = 1 2kx − yn k2 + 2kx − ym k2 − 4kx − (yn + ym )k2 ≤ 2 2 2 2 2 2 2 2kx − yn k + 2kx − ym k − 4d → 2d + 2d − 4d = 0 as n, m → ∞. Hence (yn ) is a Cauchy sequence, its limit y belongs to L and kx − yk = lim kx − yn k = d(x, L). Step II. Let us take an arbitrary z ∈ L such that kzk = 1. Then w := y + λz ∈ L, ∀λ ∈ IF (why?). For λ = (x − y, z) we have d2 ≤ kx − wk2 = (x − y − λz, x − y − λz) = kx − yk2 − λ(z, x − y) − ¯ − y, z) + |λ|2 = kx − yk2 − |λ|2 = d2 − |λ|2 . λ(x Hence (x − y, z) = λ = 0, ∀z ∈ L s.t. kzk = 1. Consequently (x − y, z) = 0, ∀z ∈ L\{0} (why?). It is clear that the last equality is valid for z = 0 as well. Step III. Let y0 ∈ L be such that kx − y0 k = d(x, L). Then we obtain from Step II: (x − y0 , z) = 0, ∀z ∈ L. Hence (y − y0 , z) = 0, ∀z ∈ L. Taking z = y − y0 ∈ L we obtain ky − y0 k2 = 0, i.e. y = y0 . This proves the uniqueness. 2
3.13. Remark The last theorem and its proof remain true in the case when H is an inner product space and L is its complete linear subspace. 3.14. Definition The orthogonal complement M ⊥ of a set M ⊂ H is the set M ⊥ := {x ∈ H : (x, y) = 0, ∀y ∈ M }. 3.15. Proposition Let M be an arbitrary subset of H. Then (i) M ⊥ is a closed linear subspace of H; (ii) M ⊂ M ⊥⊥ ; (iii) M is dense in H =⇒ M ⊥ = {0}. Proof: Exercise. 2 3.16. Theorem For any closed linear subspace M of a Hilbert space H we 37
have H = M ⊕ M ⊥ . Proof: According to Theorem 3.12 for any x ∈ H there exists y ∈ M such that kx − yk = d(x, M ) and (x − y, z) = 0, ∀z ∈ M. So, u := x − y ∈ M ⊥ and x=y+u is a decomposition of the required form, i.e. H = M + M ⊥ . Let us prove now that M ∩ M ⊥ = {0}. For any w ∈ M ∩ M ⊥ we have kwk2 = (w, w) = 0, i.e. w = 0. 2
Complete orthonormal sets 3.17. Definition The linear span of a subset A of a vector space X is the linear subspace (
linA =
N X
)
λn xn : xn ∈ A, λn ∈ IF, n = 1, . . . , N, N ∈ IN .
n=1
If X is a normed space than the closed linear span of A is the closure of linA. We will use the following notation for the closed linear span of A spanA := Cl(linA). 3.18. Proposition If A is a finite set then spanA = linA. Proof: Cf. Example 1.3–1. 2 3.19. Definition A system {eα }α∈J ⊂ H is called orthonormal if (
(eα , eβ ) =
δαβ
=
38
1 if β = α , 0 if β 6= α .
3.20. Definition A system {eα }α∈J ⊂ H is called linearly independent if for P any finite subset {eα1 , . . . , eαN }, N n=1 cn eαn = 0 =⇒ c1 = · · · = cN = 0. 3.21. Proposition Any orthonormal system is linearly independent. Proof: Suppose
PN
n=1 cn eαn
0=
N X
= 0. Then N X
!
cn eαn , eαm
=
n=1
cn (eαn , eαm ) = cm
n=1
for any m = 1, . . . , N . 2 Let us consider the Gauss approximation problem: suppose {e1 , . . . , eN } is an orthonormal system in an inner product space H. For a given x ∈ H P find c1 , . . . , cN such that kx − N n=1 cn en k is minimal. 3.22. Theorem The Gauss approximation problem has a unique soluP tion cn = (x, en ). The vector w := x − N e )e is orthogonal to n=1 (x, P n n 2 L := lin{e1 , . . . , eN }. Moreover, kwk2 = kxk2 − N n=1 |(x, en )| (Bessel’s PN equality) and, consequently, n=1 |(x, en )|2 ≤ kxk2 (Bessel’s inequality). Proof: For arbitrary c1 , . . . , cN we have
2
N
X
c e
x − n n =
x−
n=1
2
kxk − 2
kxk −
N X
2
N X
cn (en , x) n=1 N X
|(x, en )| +
n=1
−
N X
n=1
kxk2 −
N X n=1
cn en , x −
n=1
cm (x, em ) +
m=1 N X
(x, en )en −
n=1
N X
N X
cn en ,
n=1
cn en ,
N X
(x, em )em −
m=1
N X m=1 N X m=1 N X
!
cm em = !
cm em = !
cm em =
m=1
2 N N
X
X
|(x, en )|2 + (x, en )en − cn en
.
n=1
n=1
2 2 Therefore kx − N n=1 cn en k is minimal when cn = (x, en ) and kwk = kxk − PN 2 n=1 |(x, en )| . The fact that w is orthogonal to L follows from Theorem
P
39
3.12 (see also Remark 3.13). Let us give an independent direct proof of this fact: (w, em ) = x −
N X
!
(x, en )en , em = (x, em ) −
n=1
N X
(x, en )(en , em ) =
n=1
(x, em ) −
N X
(x, en )δnm = (x, em ) − (x, em ) = 0
n=1
for any m = 1, . . . , N . So, w is orthogonal to e1 , . . . , eN , i.e. to L. 2 3.23. Lemma Let H be a Hilbert space and {en } be an orthonormal P P∞ 2 system, cn ∈ IF, n ∈ IN and ∞ n=1 |cn | < ∞. Then the series n=1 cn en is convergent and for its sum y we have kyk =
∞ X
!1/2 2
|cn |
.
n=1
Proof: Let ym = m theorem (see Proposin=1 cn en . Then, using Pythagoras’ P∞ 2 tion 3.6) and the convergence of the series n=1 |cn | , we obtain P
kym
2
X
m m X
− yk k2 =
cn en
= |cn |2 → 0, as k, m → ∞, (m > k).
n=k+1
n=k+1
Thus (ym ) is a Cauchy sequence, has a limit y and 2
2
kyk = m→∞ lim kym k = m→∞ lim
m X
2
|cn | =
n=1
∞ X
|cn |2 . 2
n=1
3.24. Corollary Let H be a Hilbert space and {en } be an orthonormal set. P For any x ∈ H the series ∞ n=1 (x, en )en is convergent and for its sum y we have ! kyk =
∞ X
1/2
2
|(x, en )|
≤ kxk.
n=1
Proof: It follows from Theorem 3.22 that N X
|(x, en )|2 ≤ kxk2
n=1
40
for any N ∈ IN. Therefore ∞ X
|(x, en )|2 ≤ kxk2 . 2
n=1
3.25. Definition The Fourier coefficients of an element x ∈ H with respect to an orthonormal set {en } are the numbers (x, en ). 3.26. Theorem Let {en } be an orthonormal set in a Hilbert space H. Then the following statements are equivalent P (i) x = ∞ (x, e )e , ∀x ∈ H; n=1 P∞ n n 2 (ii) kxk = n=1 |(x, en )|2 , ∀x ∈ H (Parseval’s identity); (iii) (x, en ) = 0, ∀n ∈ IN =⇒ x = 0; (iv) lin{en } is dense in H. Proof: (i) =⇒ (ii) Follows from Corollary 3.24. (ii) =⇒ (iii) Immediate. (iii) =⇒ (iv) For an arbitrary x ∈ H and y := ∞ n=1 (x, en )en (see Corollary 3.24) we have (x − y, em ) = 0, ∀m ∈ IN (see the proof of Theorem 3.22). Then x = y by (iii). It is clear that y ∈ span{en }. Consequently x ∈ span{en } for any x ∈ H, i.e. Cl(lin{en }) = H. P
(iv) =⇒ (i) For an arbitrary x ∈ H and y = ∞ n=1 (x, en )en we have (x − y, en ) = 0, ∀n ∈ IN. Therefore (x − y, z) = 0, ∀z ∈ lin{en }, i.e. x − y ∈ (lin{en })⊥ . But (lin{en })⊥ = {0} since lin{en } is dense (see Proposition 3.15). Hence x = y. 2 P
3.27. Definition The orthonormal set {en } is called complete if it satisfies any (and hence all) of the conditions of Theorem 3.26. 3.28. Examples 1. An orthonormal subset of a finite-dimensional Hilbert space is complete iff it is a basis. 2. Let H = l2 , en = (0, . . . , 0, 1, 0, . . .). |
{z n
41
}
Then {en }n∈IN is a complete orthonormal set (why?). 3. The set {en }n∈Z , en (t) = ei2πnt , is orthonormal in the inner product space C([0, 1]) of Examples 3.2-3, 3.11-3 (check this!). It is one of the basic results of the classical Fourier analysis P that it has also the properties (ii)–(iv) of Theorem 3.26 (with Z and ∞ n=−∞ P P∞ instead of IN and ∞ n=1 correspondingly). The Fourier series n=−∞ (f, en )en of a function f ∈ C([0, 1]) is convergent to f with respect to the norm k · k2 , but may be not uniformly convergent, i.e. not convergent with respect to the norm k · k∞ . 3.29. Theorem (Gram–Schmidt orthogonalization process) For an arbitrary countable or finite set {yn } of vectors of an inner product space H there exists a countable or finite orthonormal set {en } such that lin{yn } = lin{en }. Proof: Let us first construct a set {zn } such that lin{yn } = lin{zn } and zn are linearly independent. We define zn inductively: z1 = yn1 , where yn1 is the first non-zero yn , z2 = yn2 , where yn2 is the first yn not in the linear span of z1 , and, generally, zk = ynk , where ynk is the first yn not in the linear span of z1 , . . . , zk−1 . It is clear that lin{yn } = lin{zn } and zn are linearly independent (prove this!). Now we define inductively vectors en such that for any N the set {en }N n=1 N is orthonormal and lin{en }N = lin{z } . By our construction z is non1 n n=1 n=1 zero, so z1 e1 = kz1 k
42
is well defined and has all the necessary properties in the case N = 1. Assume that vectors e1 , . . . , eN −1 have been defined. By hypothesis zN ∈ / LN −1 := N −1 N −1 lin{en }n=1 = lin{zn }n=1 , so the vector uN = zN −
N −1 X
(zN , ek )ek
k=1
must be non-zero and hence eN =
uN kuN k
is well defined. We have N lin{en }N n=1 = lin(eN , LN −1 ) = lin(uN , LN −1 ) = lin(zN , LN −1 ) = lin{zn }n=1 .
⊥
−1 Further, uN ∈ lin{en }N by Theorem 3.22. Consequently eN is orthogon=1 −1 nal to any element of the orthonormal set {en }N n=1 . It is clear that keN k = 1. So the set {en }N n=1 is orthonormal. This completes the induction step.
Thus we have constructed an orthonormal set {en } such that lin{en } = lin{zn } = lin{yn }. 2
3.30. Definition A normed space is called separable if it contains a countable dense subset. 3.31. Theorem A Hilbert space H is separable iff it contains a countable (or finite) complete orthonormal set. Proof: Let H be a separable Hilbert space, {yn }n∈IN be a dense subset and let {en } be the orthonormal set of the last theorem. Then lin{en } = lin{yn } is dense in H, i.e. {en } is complete. Conversely, assume that H contains a countable (or finite) complete orthonormal set {en }. Let (
QN =
N X
)
λn en : λ1 , . . . , λN ∈ Q + iQ ,
n=1
43
where Q is the field of rational numbers, and let Q = ∪QN . Then (exercise) Q is a countable dense subset of H. 2
Let U : H1 → H2 be an isomorphism of inner product spaces H1 and H2 (see Definition 1.4). It follows from the polarization identity (see Proposition 3.7) that U preserves the inner products (U x, U y)2 = (x, y)1 , ∀x, y ∈ H1 . 3.32. Theorem All infinite dimensional separable Hilbert spaces are isomorphic to l2 and hence to each other. Proof: Let H be an infinite dimensional separable Hilbert space and let {en }n∈IN be a complete orthonormal set in it. Let us define U : l2 → H by P∞ the formula U (λn )n∈IN = n=1 λn en (see Lemma 3.23). It is clear that U is linear. Lemma 3.23 and Theorem 3.26 (Definition 3.27) imply that U is onto and an isometry. 2 3.33. Remark Suppose {eα } is a not necessarily countable orthonormal set. It is called complete if (x, eα ) = 0, ∀α =⇒ x = 0. Any two complete orthonormal subsets of a Hilbert space H have the same cardinality. This cardinality is called the dimension of H. Two Hilbert spaces are isomorphic iff they have the same dimension. (For proofs see e.g. C. Goffman & G. Pedrick, First Course in Functional Analysis, Section 4.7.).
Bounded linear functionals on a Hilbert space

3.34. Theorem (Riesz) f is a bounded linear functional on a Hilbert space H, i.e. a bounded linear operator from H to IF, if and only if there exists a unique z ∈ H such that
f(x) = (x, z), ∀x ∈ H.    (3.7)
Moreover, kf k = kzk.
Proof: It is clear that the equality (3.7) defines a bounded linear functional for an arbitrary z ∈ H. So, we have to prove that any bounded linear functional on H has a unique representation of the form (3.7).
Uniqueness. If z_1, z_2 satisfy (3.7), then (x, z_1) = (x, z_2), ∀x ∈ H, and therefore z_1 − z_2 ∈ H^⊥ = {0}.
Existence. If f = 0, take z = 0. Suppose f ≠ 0. Then Ker(f) is a closed linear subspace of H (see Definition 2.1) and Ker(f) ≠ H. Let y ∈ Ker(f)^⊥, y ≠ 0. We have
f(f(x)y − f(y)x) = f(x)f(y) − f(y)f(x) = 0, ∀x ∈ H,
i.e. f(x)y − f(y)x ∈ Ker(f), ∀x ∈ H. Consequently
(f(x)y − f(y)x, y) = 0, ∀x ∈ H.
The left-hand side of the last equality equals f(x)kyk² − f(y)(x, y) = f(x)kyk² − (x, \overline{f(y)}y). Thus
f(x) = (x, z), ∀x ∈ H, where z := (\overline{f(y)}/kyk²) y.
The equality kf k = kzk follows from the relations |f(x)| = |(x, z)| ≤ kzk kxk, ∀x ∈ H, and f(z) = kzk². 2
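In IF^n with the standard inner product (x, z) = ∑ x_i \overline{z_i} the Riesz representer can be written down explicitly. The sketch below is an added illustration, not part of the notes; the functional f and its representer z are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(5) + 1j * rng.standard_normal(5)
f = lambda x: a @ x                       # a linear functional on C^5, f(x) = sum_i a_i x_i

z = np.conj(a)                            # candidate representer: (x, z) = sum_i x_i conj(z_i) = f(x)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
print(np.isclose(f(x), np.vdot(z, x)))    # (3.7) holds
print(np.isclose(np.linalg.norm(z),       # kfk = kzk, attained at x = z/kzk
                 abs(f(z / np.linalg.norm(z)))))
```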
4
Spectral theory in Hilbert spaces

The adjoint operator
4.1. Theorem Let H1 and H2 be Hilbert spaces and B ∈ B(H1, H2). There exists a unique operator B* ∈ B(H2, H1) such that
(Bx, y) = (x, B*y), ∀x ∈ H1, ∀y ∈ H2.    (4.1)
Proof: Let us take an arbitrary y ∈ H2. Then
H1 ∋ x ↦ f(x) := (Bx, y) ∈ IF
is a linear functional and |f(x)| ≤ kBxk kyk ≤ kBk kxk kyk = (kBk kyk)kxk, ∀x ∈ H1. So, f is a bounded linear functional on H1 and kf k ≤ kBk kyk. According to the Riesz theorem (Theorem 3.34) there exists a unique z = z(B, y) ∈ H1 such that (Bx, y) = f(x) = (x, z), ∀x ∈ H1, and kzk = kf k ≤ kBk kyk. It is easy to see that the map H2 ∋ y ↦ B*y := z ∈ H1 is linear (check this!). We have already proved that kB*yk ≤ kBk kyk, ∀y ∈ H2, i.e. B* is bounded and
kB*k ≤ kBk.    (4.2)
It is clear that (4.1) is satisfied and that the constructed operator B* is the unique operator satisfying this equality. 2
4.2. Definition The operator B* from the previous theorem is called the adjoint of B.
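For matrices the adjoint of Definition 4.2 is the conjugate transpose. The following quick numerical check of (4.1) and of kB*k = kBk is an added illustration; the matrices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))   # B : C^4 -> C^3
B_adj = B.conj().T                                                    # conjugate transpose

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
# (Bx, y) = (x, B*y), with (u, v) = sum_i u_i conj(v_i) = np.vdot(v, u)
print(np.isclose(np.vdot(y, B @ x), np.vdot(B_adj @ y, x)))           # True
print(np.isclose(np.linalg.norm(B, 2), np.linalg.norm(B_adj, 2)))     # kB*k = kBk
```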
4.3. Theorem Let H1, H2 and H3 be Hilbert spaces, B, B1, B2 ∈ B(H1, H2) and T ∈ B(H2, H3). Then
(i) (αB1 + βB2)* = ᾱB1* + β̄B2*, ∀α, β ∈ IF;
(ii) (TB)* = B*T*;
(iii) B** = B;
(iv) kB*k = kBk;
(v) kB*Bk = kBB*k = kBk²;
(vi) if B is invertible then so is B* and (B*)^{−1} = (B^{−1})*.
Proof: (i) ((αB1 + βB2)x, y) = α(B1x, y) + β(B2x, y) = α(x, B1*y) + β(x, B2*y) = (x, (ᾱB1* + β̄B2*)y), ∀x ∈ H1, ∀y ∈ H2.
(ii) (TBx, z) = (Bx, T*z) = (x, B*T*z), ∀x ∈ H1, ∀z ∈ H3.
(iii) (B**x, y) = \overline{(y, B**x)} = \overline{(y, (B*)*x)} = \overline{(B*y, x)} = (x, B*y) = (Bx, y), ∀x ∈ H1, ∀y ∈ H2.
Consequently ((B − B ∗∗ )x, y) = 0, ∀x ∈ H1 , ∀y ∈ H2 . Taking y = (B − B ∗∗ )x we obtain (B − B ∗∗ )x = 0, ∀x ∈ H1 . (iv) follows from (4.2) and (iii): kBk = k(B ∗ )∗ k ≤ kB ∗ k ≤ kBk.
(v) kB*Bk ≤ kB*k kBk = kBk² (see (iv)). On the other hand
kBk² = sup_{kxk=1} kBxk² = sup_{kxk=1} (Bx, Bx) = sup_{kxk=1} (x, B*Bx) ≤ sup_{kxk=1} kxk kB*Bxk = kB*Bk.
Thus kB*Bk = kBk². Using this equality with B* instead of B we derive from (iii) and (iv) that kBB*k = kBk².
(vi) It is sufficient to take adjoints in the equalities BB^{−1} = I_{H2}, B^{−1}B = I_{H1} (see (ii)). 2
Parts (i)–(iii) of Theorem 4.3 show that B(H) is a Banach algebra with an involution. Part (v) shows that it is, in fact, a C*–algebra. A mapping x ↦ x* of an algebra E into itself is called an involution if it has the following properties
(αx + βy)* = ᾱx* + β̄y*, (xy)* = y*x*, x** = x,
for all x, y ∈ E and α, β ∈ IF. A Banach algebra E with an involution satisfying the identity kx*xk = kxk², ∀x ∈ E, is called a C*–algebra.
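The C*-identity of part (v) can be observed numerically for matrices, where the operator norm is the largest singular value. This is only an added illustration of the identity, not a proof:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Bs = B.conj().T
op = lambda A: np.linalg.norm(A, 2)        # operator norm = largest singular value

print(np.isclose(op(Bs @ B), op(B) ** 2))  # kB*Bk = kBk^2
print(np.isclose(op(B @ Bs), op(B) ** 2))  # kBB*k = kBk^2
```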
The analogue of Theorem 4.3(i) for operators acting on normed spaces has the following form
(αB1 + βB2)* = αB1* + βB2*, ∀α, β ∈ IF.    (4.3)
Let us explain why this equality is different from that of Theorem 4.3(i). If H1 and H2 are Hilbert spaces, then we have two definitions of the adjoint of B ∈ B(H1, H2): Definition 4.2 and that for operators acting on normed spaces. B* ∈ B(H2, H1) according to the first one, while the second one gives us an operator B* ∈ B(H2*, H1*). In order to connect them to each other we have to use the identification H1* ∼ H1, H2* ∼ H2 given by Theorem 3.34. This identification is conjugate linear (anti-linear):
f_j(x) = (x, z_j), j = 1, 2, ∀x ∈ H =⇒ αf_1(x) + βf_2(x) = (x, ᾱz_1 + β̄z_2), ∀x ∈ H.
This is why we have the complex conjugation in Theorem 4.3(i) and do not have it in (4.3). Note also that we use the following definition of linear operations on X* (for an arbitrary normed space X)
(f_1 + f_2)(x) := f_1(x) + f_2(x), ∀f_1, f_2 ∈ X*, ∀x ∈ X,
(αf)(x) := αf(x), ∀f ∈ X*, ∀x ∈ X, ∀α ∈ IF.
One can replace the last definition by
(αf)(x) := ᾱf(x), ∀f ∈ X*, ∀x ∈ X, ∀α ∈ IF.
In this case Theorem 4.3(i) is valid for operators acting on normed spaces.
4.4. Theorem Let H1 and H2 be Hilbert spaces, B ∈ B(H1, H2). Then
Ker(B*) = Ran(B)^⊥, Ker(B) = Ran(B*)^⊥.
Proof: B*y = 0 ⇐⇒ (x, B*y) = 0, ∀x ∈ H1 ⇐⇒ (Bx, y) = 0, ∀x ∈ H1 ⇐⇒ y ∈ Ran(B)^⊥. Thus Ker(B*) = Ran(B)^⊥. Since B** = B, the second assertion follows from the first if B is replaced by B*. 2
4.5. Corollary (cf. Theorem 2.39). Let H1 and H2 be Hilbert spaces, B ∈ B(H1, H2). Then
Cl(Ran(B)) = Ker(B*)^⊥, Cl(Ran(B*)) = Ker(B)^⊥.
Proof: Note that for any linear subspace M of a Hilbert space H we have the equality Cl(M ) = M ⊥⊥ . (Prove this!) 2 4.6. Definition Let H, H1 and H2 be Hilbert spaces. An operator B ∈ B(H) is said to be (i) normal if BB ∗ = B ∗ B, (ii) self-adjoint if B ∗ = B, i.e. if (Bx, y) = (x, By), ∀x, y ∈ H. An operator U ∈ B(H1 , H2 ) is called unitary if U ∗ U = IH1 , U U ∗ = IH2 , i.e. if U −1 = U ∗ . Note that any self-adjoint operator is normal. Any unitary operator U ∈ B(H) is normal as well. 4.7. Theorem An operator B ∈ B(H) is normal if and only if kBxk = kB ∗ xk, ∀x ∈ H.
(4.4)
Proof: We have kBzk² = (Bz, Bz) = (B*Bz, z), kB*zk² = (B*z, B*z) = (BB*z, z) for any z ∈ H. So, (4.4) holds if B is normal. If kBzk = kB*zk, ∀z ∈ H, then using the polarization identity (see Proposition 3.7) we obtain (Bx, By) = (B*x, B*y), ∀x, y ∈ H, i.e. ((B*B − BB*)x, y) = 0, ∀x, y ∈ H. Taking y = (B*B − BB*)x we deduce that (B*B − BB*)x = 0, ∀x ∈ H. 2
4.8. Theorem Let B ∈ B(H) be a normal operator. Then
(i) Ran(B*)^⊥ = Ker(B) = Ker(B*) = Ran(B)^⊥,
(ii) Bx = αx =⇒ B*x = ᾱx,
(iii) eigenvectors corresponding to different eigenvalues of B are orthogonal to each other.
Proof: (i) follows from Theorems 4.4 and 4.7. Applying (i) to B − αI in place of B we obtain (ii). (Exercise: prove that B − αI is normal.) Finally, suppose Bx = αx, By = βy and α ≠ β. We have from (ii)
α(x, y) = (αx, y) = (Bx, y) = (x, B*y) = (x, β̄y) = β(x, y).
Since α ≠ β, we conclude (x, y) = 0. 2
4.9. Theorem If U ∈ B(H1, H2), the following statements are equivalent.
(i) U is unitary.
(ii) Ran(U) = H2 and (Ux, Uy) = (x, y), ∀x, y ∈ H1.
(iii) Ran(U) = H2 and kUxk = kxk, ∀x ∈ H1.
Proof: If U is unitary, then Ran(U) = H2 because UU* = I_{H2}. Also, U*U = I_{H1}, so that (Ux, Uy) = (x, U*Uy) = (x, y), ∀x, y ∈ H1. Thus (i) implies (ii). It is obvious that (ii) implies (iii). It follows from the polarization identity (see Proposition 3.7) that (iii) implies (ii). So, if (iii) holds, then (U*Ux, y) = (Ux, Uy) = (x, y), ∀x, y ∈ H1. Consequently U*U = I_{H1} (why?). On the other hand (iii) implies that U is a linear isometry of H1 onto H2. Hence U is invertible. Since U*U = I_{H1}, we have U^{−1} = U* (why?), i.e. U is unitary. 2
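The following sketch, added here for illustration, builds a normal matrix as U diag(λ)U* with U unitary and checks Theorem 4.8(ii), (iii) and Theorem 4.9(ii) numerically; the particular eigenvalues are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
lam = np.array([2.0, -1.0 + 1j, 3j, 0.5])
B = U @ np.diag(lam) @ U.conj().T

print(np.allclose(B @ B.conj().T, B.conj().T @ B))          # B is normal
v = U[:, 1]                                                 # eigenvector of B for lam[1]
print(np.allclose(B.conj().T @ v, np.conj(lam[1]) * v))     # Theorem 4.8(ii)
print(np.isclose(np.vdot(U[:, 0], U[:, 1]), 0))             # Theorem 4.8(iii)
x, y = rng.standard_normal(4), rng.standard_normal(4)
print(np.isclose(np.vdot(U @ y, U @ x), np.vdot(y, x)))     # the unitary U preserves inner products
```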
Theorem 4.9 shows that the notion of a unitary operator coincides with the notion of an isomorphism of Hilbert spaces (see Definition 1.4) and that these isomorphisms preserve the inner products (see also the equality in the paragraph between Theorems 3.31 and 3.32). 4.10. Theorem (i) B is self-adjoint, λ ∈ IR =⇒ λB is self-adjoint. (ii) B1 , B2 are self-adjoint =⇒ B1 + B2 is self-adjoint. (iii) Let B1 , B2 be self-adjoint. Then B1 B2 is self-adjoint if and only if B1 and B2 commute. (iv) If Bn , n ∈ IN are self-adjoint and kB − Bn k → 0, then B is self-adjoint.
Proof: Exercise. 2
4.11. Theorem Let H be a complex Hilbert space (i.e. IF = C) and B ∈ B(H). Then B is self-adjoint iff (Bx, x) is real for all x ∈ H.
Proof: If B is self-adjoint, then (Bx, x) = (x, Bx) = \overline{(Bx, x)}, ∀x ∈ H, i.e. (Bx, x) is real. Let us prove the converse. For any x, y ∈ H we have
4(Bx, y) = (B(x + y), x + y) − (B(x − y), x − y) + i(B(x + iy), x + iy) − i(B(x − iy), x − iy)    (4.5)
(check this by expanding the right-hand side; this is the polarization identity for operators) and similarly
4(x, By) = (x + y, B(x + y)) − (x − y, B(x − y)) + i(x + iy, B(x + iy)) − i(x − iy, B(x − iy)).
Since (Bz, z) = \overline{(Bz, z)} = (z, Bz), ∀z ∈ H, we have (Bx, y) = (x, By), ∀x, y ∈ H, i.e. B is self-adjoint. 2
4.12. Theorem All eigenvalues of a self-adjoint operator B ∈ B(H) are real and eigenvectors corresponding to different eigenvalues of B are orthogonal to each other.
Proof: According to Theorem 4.8(iii) it is sufficient to prove the first statement. Let λ be an eigenvalue and x ∈ H\{0} be a corresponding eigenvector: Bx = λx. Using Theorem 4.11 we obtain
λ = (Bx, x)/kxk² ∈ IR. 2
4.13. Theorem Let H be a Hilbert space. Each of the following four properties of a projection P ∈ B(H) implies the other three: (i) P is self-adjoint. (ii) P is normal. (iii) Ran(P ) = Ker(P )⊥ . (iv) (P x, x) = kP xk2 , ∀x ∈ H. Proof: It is trivial that (i) implies (ii). Theorem 4.8(i) shows that Ker(P ) = Ran(P )⊥ if P is normal. Since P is a projection Ran(P ) is closed (see Lemma 2.45). So, Ran(P ) = Ran(P )⊥⊥ = Ker(P )⊥ (see the proof of Corollary 4.5). Thus (ii) implies (iii). Let us prove that (iii) implies (i). Indeed, if (iii) holds, then (P x, (I − P )y) = 0, ((I − P )x, P y) = 0, ∀x, y ∈ H, (see Lemma 2.42). Therefore, (P x, y) = (P x, P y + (I − P )y) = (P x, P y) = (P x + (I − P )x, P y) = (x, P y), i.e. P is self-adjoint. So, we have proved that the properties (i)–(iii) are equivalent to each other. Now it is sufficient to prove that (iii) is equivalent to (iv). We have seen above that (iii) implies the equality (P x, (I − P )x) = 0, ∀x ∈ H. Hence (P x, x) = (P x, P x) + (P x, (I − P )x) = (P x, P x) = kP xk2 , ∀x ∈ H. Finally, assume (iv) holds. Let us take arbitrary y ∈ Ran(P ) and z ∈ Ker(P ) and consider x = y + tz, t ∈ IR. It is clear that P x = y. Consequently 0 ≤ kP xk2 = (P x, x) = (y, y + tz) = kyk2 + t(y, z). Thus t(y, z) ≥ −kyk2 , ∀t ∈ IR, i.e. (y, z) = 0, ∀y ∈ Ran(P ), ∀z ∈ Ker(P ), i.e. (iii) holds. (In the case when H is a complex Hilbert space the proof can be slightly simplified: after proving the implications (i) =⇒ (ii) =⇒ (iii) =⇒ (iv) as above, we obtain directly from Theorem 4.11 that (iv) =⇒ (i).) 2
4.14. Definition Property (iii) of the last theorem is usually expressed by saying that P is an orthogonal projection.
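A concrete orthogonal projection: if the columns of Q form an orthonormal basis of a subspace L, then P = QQ* projects onto L. The sketch below is an added illustration verifying properties (i) and (iv) of Theorem 4.13; the subspace is a random choice.

```python
import numpy as np

rng = np.random.default_rng(5)
# Orthogonal projection onto a 2-dimensional subspace of C^5 spanned by the columns of Q
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2)))
P = Q @ Q.conj().T

print(np.allclose(P @ P, P))                     # P is a projection
print(np.allclose(P, P.conj().T))                # (i) self-adjoint
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
print(np.isclose(np.vdot(x, P @ x).real,         # (iv) (Px, x) = kPxk^2
                 np.linalg.norm(P @ x) ** 2))
print(np.isclose(np.vdot(x, P @ x).imag, 0))     # in particular (Px, x) is real
```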
Numerical range

4.15. Theorem Let H be a Hilbert space and B ∈ B(H). Suppose there exists c > 0 such that
|(Bx, x)| ≥ ckxk², ∀x ∈ H.
(4.6)
Then B is invertible and kB −1 k ≤ 1/c. Proof: It follows from (4.6) that Ker(B) = {0} and ckxk2 ≤ kBxkkxk, ∀x ∈ H, i.e. kxk ≤ c−1 kBxk, ∀x ∈ H.
(4.7)
Let us prove that Ran(B) is closed. For any y ∈ Cl(Ran(B)) there exist xn ∈ H such that Bxn → y as n → ∞. It follows from (4.7) that kxn − xm k ≤ c−1 kBxn − Bxm k → 0 as n, m → ∞. Hence (xn ) is a Cauchy sequence in the Hilbert space H. Let us denote its limit by z. We have
Bz = B(lim_{n→∞} x_n) = lim_{n→∞} Bx_n = y.
Thus y ∈ Ran(B), i.e. Ran(B) is closed. (4.6) implies that if x ∈ Ran(B)^⊥, then x = 0. So, Ran(B)^⊥ = {0}, i.e. Ran(B) is dense in H (why?). Consequently Ran(B) = H and the operator B is invertible. The inequality kB^{−1}k ≤ 1/c follows from (4.7). 2
4.16. Definition Let H be a Hilbert space and let B ∈ B(H). Then the set
Num(B) := {(Bx, x) : kxk = 1, x ∈ H}    (4.8)
is called the numerical range of the operator B. It is clear that |(Bx, x)| ≤ kBxkkxk ≤ kBkkxk2 , ∀x ∈ H, so, Num(B) ⊂ {µ ∈ C : |µ| ≤ kBk}.
(4.9)
The numerical range of any bounded linear operator is convex (see e.g. G. Bachman & L. Narici, Functional Analysis, 21.4). 4.17. Theorem For any B ∈ B(H) we have σ(B) ⊂ Cl(Num(B)).
(4.10)
Proof: Let us take an arbitrary λ ∈ C\Cl(Num(B)) and z ∈ H such that kzk = 1. It is clear that (Bz, z) ∈ Num(B). Hence
|((B − λI)z, z)| = |(Bz, z) − λ| ≥ d > 0,
where d is the distance from λ to Cl(Num(B)). Now for arbitrary x ∈ H\{0} we have
|((B − λI)x, x)| = kxk² |((B − λI)(x/kxk), x/kxk)| ≥ dkxk².
According to Theorem 4.15 B − λI is invertible, i.e. λ ∉ σ(B). Hence σ(B) ⊂ Cl(Num(B)). 2
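In finite dimensions the inclusion σ(B) ⊂ Cl(Num(B)) and the bound (4.9) can be observed by sampling the numerical range. The following is only a numerical illustration with a random matrix, added here and not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Sample points of Num(B) = {(Bx, x) : kxk = 1}
xs = rng.standard_normal((2000, 4)) + 1j * rng.standard_normal((2000, 4))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
num = np.array([np.vdot(x, B @ x) for x in xs])               # (Bx, x)

print(np.max(np.abs(num)) <= np.linalg.norm(B, 2) + 1e-12)    # (4.9): Num(B) lies in the disc of radius kBk

# Every eigenvalue equals (Bv, v) for a unit eigenvector v, so it lies in Num(B)
w, V = np.linalg.eig(B)
for lam, v in zip(w, V.T):
    v = v / np.linalg.norm(v)
    print(np.isclose(np.vdot(v, B @ v), lam))
```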
Spectra of self-adjoint and unitary operators

It follows from Theorem 4.11 that if B ∈ B(H) is self-adjoint then Num(B) ⊂ IR. Therefore Theorem 4.17 implies the following result.
4.18. Theorem Let B ∈ B(H) be self-adjoint and
m := inf_{kxk=1} (Bx, x), M := sup_{kxk=1} (Bx, x).
Then σ(B) ⊂ [m, M] ⊂ IR. 2
4.19. Theorem If U ∈ B(H) is unitary, then σ(U) ⊂ {λ ∈ C : |λ| = 1}.
Proof: It follows from Theorem 4.9(iii) that kUk = 1 = kU^{−1}k. If |λ| > 1 then λ ∉ σ(U) by Lemma 2.9. Suppose |λ| < 1. We have U − λI = U(I − λU^{−1}) and kλU^{−1}k = |λ| < 1. Hence U − λI is invertible by Lemmas 2.3(i) and 2.4. So, λ ∉ σ(U), i.e. σ(U) ⊂ {λ ∈ C : |λ| = 1}. 2
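Theorems 4.18 and 4.19 are easy to observe for matrices: a Hermitian matrix has real eigenvalues and a unitary matrix has eigenvalues of modulus one. A small check, added here for illustration with arbitrary matrices:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
B = (A + A.conj().T) / 2                       # self-adjoint
U, _ = np.linalg.qr(A)                         # unitary

print(np.allclose(np.linalg.eigvals(B).imag, 0))          # σ(B) ⊂ IR (Theorem 4.18)
print(np.allclose(np.abs(np.linalg.eigvals(U)), 1))       # σ(U) lies on the unit circle (Theorem 4.19)

# In finite dimensions m and M of Theorem 4.18 are the extreme eigenvalues,
# attained at the corresponding unit eigenvectors
w, V = np.linalg.eigh(B)
x = V[:, 0]
print(np.isclose(np.vdot(x, B @ x).real, w[0]))
```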
Spectra of normal operators

4.20. Theorem The spectral radius of an arbitrary normal operator B ∈ B(H) equals its norm: r(B) = kBk (cf. Theorem 2.23).
Proof: Replacing x by Bx in (4.4) we obtain kB²xk = kB*Bxk, ∀x ∈ H. So, kB²k = kB*Bk. Now Theorem 4.3(v) implies kB²k = kB*Bk = kBk². Hence kB²k = kBk². Since (B^n)* = (B*)^n (see Theorem 4.3(ii)), if B is normal then so is B^n (why?). Hence it follows by induction that kB^m k = kBk^m for all integers m of the form 2^k. Using Theorem 2.23 we can write
r(B) = lim_{n→∞} kB^n k^{1/n} = lim_{k→∞} kB^{2^k}k^{1/2^k} = lim_{k→∞} kBk = kBk. 2
4.21. Theorem Let B ∈ B(H) be a normal operator. Then
kBk = sup_{kxk=1} |(Bx, x)|.    (4.11)
Proof: Theorems 4.17 and 4.20 imply
kBk = r(B) = sup{|λ| : λ ∈ σ(B)} ≤ sup{|λ| : λ ∈ Cl(Num(B))} = sup{|λ| : λ ∈ Num(B)} = sup_{kxk=1} |(Bx, x)| ≤ kBk. 2
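Equality (4.11) genuinely uses normality. The sketch below, an added illustration with an arbitrarily chosen normal matrix, shows the supremum attained at an eigenvector, and recalls the standard counterexample of the 2×2 Jordan block, for which sup_{kxk=1} |(Jx, x)| = 1/2 < 1 = kJk.

```python
import numpy as np

rng = np.random.default_rng(9)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
B = U @ np.diag([1.0, -2.0, 1j, 0.3]) @ U.conj().T     # a normal matrix with kBk = 2

v = U[:, 1]                                # unit eigenvector for the eigenvalue -2
print(np.linalg.norm(B, 2))                # 2.0
print(abs(np.vdot(v, B @ v)))              # 2.0: the sup in (4.11) is attained here

# For a non-normal operator equality can fail: for the Jordan block J,
# |(Jx, x)| = |x_2 conj(x_1)| <= 1/2 for every unit vector x, while kJk = 1.
J = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.linalg.norm(J, 2))                # 1.0
```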
Compact operators in Hilbert spaces

4.22. Theorem Let X be a normed space, H be a Hilbert space and let T ∈ Com(X, H). Then there exists a sequence of finite rank operators T_n ∈ Com(X, H) such that kT − T_n k → 0 as n → ∞ (see Definition 2.55).
Proof: Let y_1^{(n)}, . . . , y_{k_n}^{(n)} be a 1/n–net of the relatively compact set T(S_X) (see (2.25) and Theorem 2.49) and let L_n := span{y_1^{(n)}, . . . , y_{k_n}^{(n)}}. Let P_n be the orthogonal projection onto L_n, i.e. P_n y = z, where y = z + w, z ∈ L_n, w ∈ L_n^⊥ (see Theorem 3.16). It is easy to see that T_n := P_n T is a finite rank operator and
k(T − T_n)xk = kTx − P_n Txk = dist(Tx, L_n) < 1/n, ∀x ∈ S_X,
(see Theorem 3.12). Thus kT − T_n k ≤ 1/n. 2
4.23. Remark A subset {e_n}_{n∈IN} of a Banach space X is called a Schauder basis if for any x ∈ X there exists a unique representation of the form
x = ∑_{n=1}^{∞} λ_n e_n, λ_n ∈ IF.
It is clear that any Banach space having a Schauder basis is separable (cf. the proof of Theorem 3.31). Theorem 4.22 remains valid if we replace H by a Banach space having a Schauder basis. In 1973 P. Enflo gave a negative solution to the long-standing problem of the existence of a Schauder basis in every separable Banach space: he constructed a separable Banach space without a Schauder basis. He also proved that there exist a separable Banach space and a compact operator acting in it which is not the limit of a norm-convergent sequence of finite rank operators.
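A finite-dimensional picture of Theorem 4.22: truncating the singular value decomposition of a matrix with rapidly decaying singular values produces finite rank approximations whose operator-norm error is the first discarded singular value. The matrix below is an illustrative stand-in for a compact operator and is not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(8)
U, _ = np.linalg.qr(rng.standard_normal((30, 30)))
V, _ = np.linalg.qr(rng.standard_normal((30, 30)))
s = 2.0 ** -np.arange(30)                        # singular values 1, 1/2, 1/4, ...
T = U @ np.diag(s) @ V.T

for n in (1, 3, 6, 10):
    Tn = U[:, :n] @ np.diag(s[:n]) @ V[:, :n].T  # a finite rank (rank-n) operator
    print(n, np.linalg.norm(T - Tn, 2))          # operator-norm error = s[n], tending to 0
```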
Hilbert–Schmidt operators

Let H1 and H2 be separable Hilbert spaces and {e_n}_{n∈IN} be a complete orthonormal set in H1. Then for any operator T ∈ B(H1, H2) the quantity ∑_{n=1}^{∞} kT e_n k² is independent of the choice of a complete orthonormal set {e_n} (this quantity may be infinite). Indeed, if {f_n}_{n∈IN} is a complete orthonormal set in H2 then
∑_{n=1}^{∞} kT e_n k² = ∑_{n=1}^{∞} ∑_{k=1}^{∞} |(T e_n, f_k)|² = ∑_{n=1}^{∞} ∑_{k=1}^{∞} |(e_n, T* f_k)|² = ∑_{k=1}^{∞} kT* f_k k².
So, ∑_{n=1}^{∞} kT e_n k² = ∑_{n=1}^{∞} kT e'_n k² for any two complete orthonormal sets {e_n} and {e'_n} in H1.
4.24. Definition The Hilbert–Schmidt norm of an operator T ∈ B(H1, H2) is defined as
kT k_{HS} := (∑_{n=1}^{∞} kT e_n k²)^{1/2}
and the operators for which it is finite are called Hilbert–Schmidt operators. Note that
kT k ≤ kT k_{HS}.    (4.12)
Indeed,
kT xk = kT(∑_{n=1}^{∞} (x, e_n)e_n)k = k∑_{n=1}^{∞} (x, e_n)T e_n k ≤ ∑_{n=1}^{∞} |(x, e_n)| kT e_n k ≤ (∑_{n=1}^{∞} |(x, e_n)|²)^{1/2} (∑_{n=1}^{∞} kT e_n k²)^{1/2} = kT k_{HS} kxk, ∀x ∈ H1,
(see Theorem 3.26).
4.25. Exercise Prove that the set of all Hilbert–Schmidt operators is a subspace of B(H1, H2) and that k · k_{HS} is indeed a norm on that subspace.
4.26. Theorem Hilbert–Schmidt operators are compact.
Proof: Let T be a Hilbert–Schmidt operator and let {e_n}_{n∈IN} be a complete orthonormal set in H1. The operator T_N defined by
T_N(∑_{n=1}^{∞} a_n e_n) = ∑_{n=1}^{N} a_n T e_n
is a finite rank operator (why?). Moreover for any x = ∑_{n=1}^{∞} a_n e_n ∈ H1 we have
kT x − T_N xk = k∑_{n=N+1}^{∞} a_n T e_n k ≤ ∑_{n=N+1}^{∞} |a_n| kT e_n k ≤ (∑_{n=N+1}^{∞} |a_n|²)^{1/2} (∑_{n=N+1}^{∞} kT e_n k²)^{1/2} ≤ kxk (∑_{n=N+1}^{∞} kT e_n k²)^{1/2}
and therefore
kT − T_N k ≤ (∑_{n=N+1}^{∞} kT e_n k²)^{1/2} → 0 as N → ∞.
Hence T is compact being the limit of a sequence of compact operators (see Theorem 2.54(iii)). 2
4.27. Example (Integral operators) Let H = L²([0, 1]) and let a measurable function k : [0, 1] × [0, 1] → IF be such that
∫_0^1 ∫_0^1 |k(t, τ)|² dτ dt < ∞.
Define the operator K by
(Kf)(t) := ∫_0^1 k(t, τ)f(τ) dτ.
Then K : L²([0, 1]) → L²([0, 1]) is a Hilbert–Schmidt operator and
kKk_{HS} = (∫_0^1 ∫_0^1 |k(t, τ)|² dτ dt)^{1/2}.
Proof: For t ∈ [0, 1] let k_t(τ) := k(t, τ). Let {e_n}_{n∈IN} be a complete orthonormal set in L²([0, 1]). Then
(Ke_n)(t) = ∫_0^1 k(t, τ)e_n(τ) dτ = (k_t, \overline{e_n}).
Hence
kKe_n k² = ∫_0^1 |(Ke_n)(t)|² dt = ∫_0^1 |(k_t, \overline{e_n})|² dt
and therefore
∑_{n=1}^{∞} kKe_n k² = ∑_{n=1}^{∞} ∫_0^1 |(k_t, \overline{e_n})|² dt = ∫_0^1 ∑_{n=1}^{∞} |(k_t, \overline{e_n})|² dt = ∫_0^1 kk_t k² dt = ∫_0^1 ∫_0^1 |k(t, τ)|² dτ dt < ∞,
since {\overline{e_n}} is also a complete orthonormal set in L²([0, 1]).
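Example 4.27 can be explored numerically by sampling the kernel on a grid; the kernel k(t, τ) = min(t, τ) used below is an assumption made only for this illustration. The double integral becomes a Riemann sum, and for the discretized matrix the Hilbert–Schmidt norm is the Frobenius norm, which dominates the operator norm as in (4.12).

```python
import numpy as np

k = lambda t, tau: np.minimum(t, tau)            # illustrative kernel (not from the notes)

n = 400
h = 1.0 / n
t = (np.arange(n) + 0.5) * h                     # midpoint grid on [0, 1]
K = k(t[:, None], t[None, :])                    # samples k(t_i, tau_j)

hs_numeric = np.sqrt(h * h * np.sum(np.abs(K) ** 2))   # ≈ (∫∫ |k|^2 dτ dt)^{1/2}
hs_exact = np.sqrt(1.0 / 6.0)                    # ∫0^1 ∫0^1 min(t, τ)^2 dτ dt = 1/6
print(hs_numeric, hs_exact)

A = h * K                                        # discretized operator acting on sample vectors
print(np.linalg.norm(A, 'fro') >= np.linalg.norm(A, 2))   # kKk ≤ kKk_HS, cf. (4.12)
```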
5
Spectral theory of compact normal operators
In this chapter H will always denote a Hilbert space.
5.1. Theorem Let T ∈ Com(H) be a normal operator. Then T has an eigenvalue λ such that |λ| = kT k.
Proof: This result follows from Theorems 2.61 and 4.20. Since we have not proved Theorem 2.61, we give an independent proof of Theorem 5.1.
There is nothing to prove if T = 0. So, let us suppose that T ≠ 0. It follows from Theorem 4.21 that there exist x_n ∈ H, n ∈ IN, such that kx_n k = 1 and |(T x_n, x_n)| → kT k. We can suppose that the sequence of numbers (T x_n, x_n) is convergent (otherwise we could take a subsequence of
(x_n)). Let λ be the limit of this sequence. It is clear that |λ| = kT k ≠ 0. We have
kT x_n − λx_n k² = (T x_n − λx_n, T x_n − λx_n) = kT x_n k² − λ(x_n, T x_n) − λ̄(T x_n, x_n) + |λ|² kx_n k² = kT x_n k² − 2Re(λ̄(T x_n, x_n)) + |λ|² ≤ 2|λ|² − 2Re(λ̄(T x_n, x_n)) → 2|λ|² − 2|λ|² = 0 as n → ∞.
Hence kT x_n − λx_n k → 0 as n → ∞. Since T is a compact operator the sequence (T x_n) has a Cauchy subsequence (T x_{n_k})_{k∈IN}. The last sequence is convergent because the space H is complete. Let us denote its limit by y. It follows from the formula
x_{n_k} = (1/λ)(T x_{n_k} − (T x_{n_k} − λx_{n_k}))
that the sequence (x_{n_k})_{k∈IN} converges to x := λ^{−1} y. Consequently
T x = T(lim_{k→∞} x_{n_k}) = lim_{k→∞} T x_{n_k} = y = λx
and
kxk = klim_{k→∞} x_{n_k}k = lim_{k→∞} kx_{n_k}k = 1.
Thus x ≠ 0 is an eigenvector of T corresponding to the eigenvalue λ. 2
5.2. Theorem (Spectral theorem for a compact normal operator) Let T ∈ Com(H) be a normal operator. Then there exists a finite or a countable orthonormal set {e_n}_{n=1}^{N}, N ∈ IN ∪ {∞}, of eigenvectors of T such that any x ∈ H has a unique representation of the form
x = ∑_{n=1}^{N} c_n e_n + y, y ∈ Ker(T), c_n ∈ C.    (5.1)
One then has
T x = ∑_{n=1}^{N} λ_n c_n e_n,    (5.2)
where λ_n ≠ 0 is the eigenvalue of T corresponding to the eigenvector e_n. Moreover,
σ(T)\{0} = {λ_n}_{n=1}^{N},    (5.3)
|λ_1| ≥ |λ_2| ≥ · · ·    (5.4)
and
lim_{n→∞} λ_n = 0, if N = ∞.    (5.5)
Proof: Step I. We will use an inductive process. Let λ_1 be an eigenvalue of T such that |λ_1| = kT k (see Theorem 5.1) and let e_1 be the corresponding eigenvector such that ke_1 k = 1.
Suppose we have already constructed non-zero eigenvalues λ_1, . . . , λ_k of T satisfying (5.4) and corresponding eigenvectors e_1, . . . , e_k such that the set {e_n}_{n=1}^{k} is orthonormal. Let L_k := lin{e_1, . . . , e_k}, H_k := L_k^⊥. It is clear that
L_1 ⊂ L_2 ⊂ · · · ⊂ L_k, H_1 ⊃ H_2 ⊃ · · · ⊃ H_k.    (5.6)
Any x ∈ L_k has a unique representation of the form x = ∑_{n=1}^{k} c_n e_n, c_n ∈ C. Consequently
T x = ∑_{n=1}^{k} λ_n c_n e_n ∈ L_k, T* x = ∑_{n=1}^{k} λ̄_n c_n e_n ∈ L_k
(see Theorem 4.8(ii)). Hence T L_k ⊂ L_k, T* L_k ⊂ L_k. For any z ∈ H_k = L_k^⊥ we have
(x, T z) = (T* x, z) = 0, (x, T* z) = (T x, z) = 0, ∀x ∈ L_k,
because T* x, T x ∈ L_k. Thus T z, T* z ∈ H_k. So, T H_k ⊂ H_k, T* H_k ⊂ H_k.
Let us consider the restriction of T to the Hilbert space H_k: T_k := T|_{H_k} ∈ B(H_k). It is easy to see that T_k is a compact normal operator (why?). There are two possibilities.
Case I. T_k = 0. In this case the construction terminates.
Case II. T_k ≠ 0. In this case there exists an eigenvalue λ_{k+1} ≠ 0 of T_k such that |λ_{k+1}| = kT_k k = kT|_{H_k}k (see Theorem 5.1). It follows from the construction and from (5.6) that |λ_{k+1}| ≤ |λ_k| ≤ · · · ≤ |λ_1|. Let e_{k+1} ∈ H_k = L_k^⊥ be an eigenvector of T_k corresponding to λ_{k+1} and such that ke_{k+1}k = 1. It is clear that e_{k+1} is an eigenvector of T and that the set {e_n}_{n=1}^{k+1} is orthonormal.
This construction gives us a finite or a countable orthonormal set {e_n}_{n=1}^{N}, N ∈ IN ∪ {∞}, of eigenvectors of T and corresponding non-zero eigenvalues λ_n satisfying (5.4). This set is finite if we have Case I for some k and infinite if we always have Case II.
Step II. Let us prove (5.5). Suppose (5.5) does not hold. Then there exists δ > 0 such that |λ_n| ≥ δ, ∀n ∈ IN (see (5.4)). Consequently
kT e_n − T e_m k² = kλ_n e_n − λ_m e_m k² = |λ_n|² + |λ_m|² ≥ 2δ² > 0 if n ≠ m.
Hence (T e_n) cannot have a Cauchy subsequence. However, this contradicts the compactness of T. So, we have proved (5.5).
Step III. Let us consider the subspace H_N := ∩_{n=1}^{N} H_n. If N is finite, then according to our construction T|_{H_N} = 0, i.e. H_N ⊂ Ker(T). If N = ∞, then for any x ∈ H_N we have
kT xk ≤ kT|_{H_k}k kxk = |λ_{k+1}| kxk → 0 as k → ∞.
Thus kT xk = 0, i.e. T x = 0, ∀x ∈ H_N. Consequently H_N ⊂ Ker(T). On the other hand Ker(T) ⊂ L_k^⊥ = H_k for any k by Theorem 4.8(iii). Hence
H_N = Ker(T).    (5.7)
Step IV. For an arbitrary x ∈ H we have
(x − ∑_{n=1}^{N} (x, e_n)e_n, e_k) = (x, e_k) − ∑_{n=1}^{N} (x, e_n)(e_n, e_k) = (x, e_k) − (x, e_k) = 0, ∀k = 1, . . . , N,
(cf. the proof of Theorem 3.22). Thus x has a representation of the form (5.1), where c_n = (x, e_n) and
y := x − ∑_{n=1}^{N} (x, e_n)e_n ∈ H_N = Ker(T).
Taking the inner products of both sides of (5.1) with e_k and using (5.7) we obtain c_k = (x, e_k), i.e. a representation of the form (5.1) is unique.
Step V. (5.2) follows immediately from (5.1) and the continuity of T. It is left to prove (5.3). Let us take an arbitrary λ ∈ C\({λ_n}_{n=1}^{N} ∪ {0}). It is clear that the distance d from λ to the closed set {λ_n}_{n=1}^{N} ∪ {0} is positive. (5.1), (5.2) imply
(T − λI)x = ∑_{n=1}^{N} (λ_n − λ)(x, e_n)e_n − λy.
It is easy to see that the operator T − λI is invertible and
(T − λI)^{−1} x = ∑_{n=1}^{N} (λ_n − λ)^{−1} (x, e_n)e_n − λ^{−1} y.    (5.8)
Note that
k(T − λI)^{−1} xk² = ∑_{n=1}^{N} |λ_n − λ|^{−2} |(x, e_n)|² + |λ|^{−2} kyk² ≤ d^{−2} (∑_{n=1}^{N} |(x, e_n)|² + kyk²) = d^{−2} kxk² < ∞
(see Proposition 3.6, Lemma 3.23 and (5.7)). So, we have proved that λ ∉ σ(T), i.e. σ(T) ⊂ {λ_n}_{n=1}^{N} ∪ {0}. On the other hand we have {λ_n}_{n=1}^{N} ⊂ σ(T). 2
5.3. Remark Since any self-adjoint operator is normal (see Definition 4.6), the previous theorem is valid for self-adjoint operators. In this case all eigenvalues λ_n are real (see Theorem 4.12 or Theorem 4.18). This variant of Theorem 5.2 is known as the Hilbert–Schmidt theorem.
Let P_n be the orthogonal projection onto the one–dimensional subspace generated by e_n, i.e. P_n := (·, e_n)e_n, and let P_{Ker(T)} be the orthogonal projection onto Ker(T). Then (5.1), (5.2) and (5.8) can be rewritten in the following way:
I = ∑_{n=1}^{N} P_n + P_{Ker(T)},    (5.9)
T = ∑_{n=1}^{N} λ_n P_n,    (5.10)
R(T; λ) = (T − λI)^{−1} = ∑_{n=1}^{N} (λ_n − λ)^{−1} P_n − λ^{−1} P_{Ker(T)}    (5.11)
(see Step IV of the proof of Theorem 5.2), where the series are strongly convergent. (A series ∑_{n=1}^{∞} A_n, A_n ∈ B(X, Y), is called strongly convergent if the series ∑_{n=1}^{∞} A_n x converges in Y for any x ∈ X.) On the other hand it is easy to see that in the case N = ∞ the series (5.9) and (5.11) do not converge in the B(H)–norm, while (5.10) does (prove this!).
Suppose a function f is analytic in some neighbourhood ∆_f of σ(T) and Ω is an admissible set such that σ(T) ⊂ Ω ⊂ Cl(Ω) ⊂ ∆_f. Then we obtain from Definition 2.26, (5.11) and the Cauchy theorem
f(T)x = −(1/(2πi)) ∫_{∂Ω} f(λ) R(T; λ) dλ x = −(1/(2πi)) ∫_{∂Ω} f(λ) (∑_{n=1}^{N} (λ_n − λ)^{−1} P_n x − λ^{−1} P_{Ker(T)} x) dλ
= −∑_{n=1}^{N} ((1/(2πi)) ∫_{∂Ω} f(λ)(λ_n − λ)^{−1} dλ) P_n x + ((1/(2πi)) ∫_{∂Ω} f(λ)λ^{−1} dλ) P_{Ker(T)} x
= ∑_{n=1}^{N} f(λ_n) P_n x + f(0) P_{Ker(T)} x, ∀x ∈ H.    (5.12)
The series here are convergent because (P_n x, P_m x) = 0 if m ≠ n and ∑_{n=1}^{N} kP_n xk² = ∑_{n=1}^{N} |(x, e_n)|² ≤ kxk² (see Lemma 3.23 and Corollary 3.24).
Note that if dim(H) < +∞ it may happen that 0 ∉ σ(T) and 0 ∉ Ω. If in this case f(0) ≠ 0 then
(1/(2πi)) ∫_{∂Ω} f(λ)λ^{−1} dλ = 0 ≠ f(0).
Nevertheless (5.12) holds because P_{Ker(T)} = 0, since Ker(T) = {0}.
It is clear that the RHS of (5.12) is well defined for any bounded (not necessarily analytic) function f on σ(T). This motivates the following definition.
5.4. Definition Let T ∈ Com(H) be a normal operator and f be a bounded function on σ(T). Then
f(T)x := ∑_{n=1}^{N} f(λ_n) P_n x + f(0) P_{Ker(T)} x, ∀x ∈ H.
Note that if dim(H) < +∞ and 0 ∉ σ(T), then f may be not defined at 0. The above definition, however, still makes sense because in this case P_{Ker(T)} = 0, since Ker(T) = {0}, and we assume that f(0)P_{Ker(T)} = 0. The operator f(T) is well defined because
k∑_{n=1}^{N} f(λ_n) P_n x + f(0) P_{Ker(T)} xk² = ∑_{n=1}^{N} |f(λ_n)(x, e_n)|² + |f(0)|² kP_{Ker(T)} xk² ≤ (sup_{λ∈σ(T)} |f(λ)|)² kxk² < ∞, ∀x ∈ H,    (5.13)
(cf. Step V of the proof of Theorem 5.2). It follows from (5.13) that
kf(T)k ≤ sup_{λ∈σ(T)} |f(λ)|.
It is easy to see that if Ker(T) = {0}, then
kf(T)k ≤ sup_{λ∈{λ_n}} |f(λ)| = sup_{λ∈σ(T)\{0}} |f(λ)|.
Since kf(T)e_n k = kf(λ_n)e_n k = |f(λ_n)| and kf(T)yk = kf(0)yk = |f(0)| kyk for any y ∈ Ker(T), we have
kf(T)k ≥ sup_{λ∈σ(T)} |f(λ)| if Ker(T) ≠ {0},  kf(T)k ≥ sup_{λ∈σ(T)\{0}} |f(λ)| if Ker(T) = {0}.
Thus
kf(T)k = sup_{λ∈σ(T)} |f(λ)| if Ker(T) ≠ {0},  kf(T)k = sup_{λ∈σ(T)\{0}} |f(λ)| if Ker(T) = {0}.    (5.14)
If dim(H) < +∞ and Ker(T) = {0}, then 0 ∉ σ(T) and σ(T)\{0} = σ(T). If dim(H) = +∞ and Ker(T) = {0}, then 0 ∈ σ(T) is the only limit point of the set σ(T)\{0}. Therefore
sup_{λ∈σ(T)} |f(λ)| = sup_{λ∈σ(T)\{0}} |f(λ)|, if f is continuous at 0.
However, if f is not continuous at 0, the above equality is not necessarily true.
It is easy to see that the functional calculus f ↦ f(T) from Definition 5.4 has the same properties as that from Definition 2.26 (see Theorems 2.30, 2.31, 2.34 and 2.35). The advantages of Definition 5.4 are that f has to be defined only on σ(T) and may be non-analytic. The disadvantage is that T has to be a compact normal operator acting on a Hilbert space, while Definition 2.26 deals with an arbitrary bounded linear operator acting on a Banach space.
5.5. Theorem Let T ∈ Com(H) be a normal operator and f be a bounded function on σ(T). The operator f(T) is normal. It is compact if and only if one of the following statements is true:
(i) N < +∞, dim(Ker(T)) < +∞, i.e. dim(H) < +∞,
(ii) N < +∞, dim(Ker(T)) = +∞ and f(0) = 0,
(iii) N = +∞, dim(Ker(T)) < +∞ and f(λ_n) → 0 as n → ∞,
(iv) N = +∞, dim(Ker(T)) = +∞, f(0) = 0 and f(λ_n) → 0 as n → ∞.
The operator f(T) is self–adjoint if and only if one of the following statements is true:
(a) Ker(T) = {0} and f(λ_n) ∈ IR for each n,
(b) Ker(T) ≠ {0}, f(0) ∈ IR and f(λ_n) ∈ IR for each n.
Proof: Exercise. 2
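To close, a finite-dimensional sketch of Theorem 5.2 and Definition 5.4, added here for illustration; the eigenvalues, the unitary U and the helper name f_of_T are all arbitrary choices. A normal matrix is written as ∑ λ_n P_n, and f(T) computed from the spectral decomposition is compared with the power series for exp and with (5.14).

```python
import numpy as np

rng = np.random.default_rng(10)
# A normal operator on C^5 with non-trivial kernel, written as T = sum_n λ_n P_n
U, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
lam = np.array([2.0, -1.0 + 1j, 0.5j, 0.0, 0.0])           # two zero eigenvalues: dim Ker(T) = 2
T = U @ np.diag(lam) @ U.conj().T

P = [np.outer(U[:, n], U[:, n].conj()) for n in range(5)]   # P_n = (·, e_n) e_n
print(np.allclose(T, sum(l * p for l, p in zip(lam, P))))   # (5.10)

def f_of_T(f):
    """f(T) as in Definition 5.4 (the projections spanning Ker(T) carry the value f(0))."""
    return U @ np.diag([f(l) for l in lam]) @ U.conj().T

# exp(T) from the spectral decomposition agrees with the power series sum_k T^k / k!
S, Tk = np.zeros((5, 5), dtype=complex), np.eye(5, dtype=complex)
for k in range(25):
    S, Tk = S + Tk, Tk @ T / (k + 1)
print(np.allclose(f_of_T(np.exp), S))

# (5.14): since Ker(T) is non-trivial, kf(T)k is the sup of |f| over σ(T)
f = lambda l: l ** 2
print(np.isclose(np.linalg.norm(f_of_T(f), 2), max(abs(f(l)) for l in lam)))
```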