This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
)
Chapter IV. Determinants
and =
...
Since
(i,j = ... n)
<(p* x*i, xJ> = <X*i, (p x1>
1
it follows that = A*(x*l ... x*n)A(cox1 ... (4.28)
But
A*(p*x*l,
and
q,*X*fl) = detco*zl*(x*l ...
= det
A((p X1 ... (p
A (x1 ...
and so we obtain from (4.28) that (det (p* — det co)zl* (x*' ... x*'l)A (x1 ...
=0
whence (4.27)
The above result implies that transposed n x n-matrices have the same determinant. In fact, let A be an n x n-matrix and let (p be a linear transformation of an n-dimensional vector space such that (with respect to a given basis) M(ço)=A. Then it follows that detA* = detM(cp)* = detM((p*) = = det (p* = det (p = det M ((p) = det A.
Problems 1. Show that the determinant-functions, A, 4* of § 1, problem 1 are dual. 2. Using the expansion formula (4.16) prove that,
detA* = detA. §
5. The
adjoint matrix
4.12. Definition. Let (p:E—*E(dim E=ii) be a linear transformation and let ad((p) be the adjoint transformation (cf. sec. 4.6). We shall express the matrix of ad ((p) in terms of the matrix of It is called the adjoint of the matrix of (p. Let a1
a,, be a basis of E such that 4(a1 ad(p)a1 =
I and write
adjoint matrix
§ 5. The
(v= I ...
Then, setting
a)
L15
and x=a1 in the identity (4.14) we obtain ...,
zi(a1.
= zl(a1
whence
=
(i,j= 1 ... ii).
a1,
This equation shows that = det
where
E—*E is the linear transformation given by =
(v
Since the matrix of
i)
(i,j = ...
a, =
1
a)
with respect to the basis a1
-'
a is
given by
+1
1
I
1 1
f—I
i
i±1
f-fl
•..
...
a
CL1-f1
+1•••
1
we have
= det is called the cofactor of Deilnition: The determinant of denoted by Thus we can write
and is
(i,j= la); i.e.. the adjoint matrix is the transpose of the matrix of cofactors. 4.13. Cramer's formula. In view of the result above we obtain from Proposition V. sec. 4.6, the relations =
. detA
(4.36)
and (4.37)
Chapter IV. Determinants
Setting k=i in (4.36) we obtain the expansion formula by cofactors.
(i= I
detA
and define a matrix
Now assume that detA
=
Then these equations yield
(4.38)
n). by setting
detA
is the inverse of the matrix (xi). showing that the matrix From (4.36) we obtain Cramers solution formula of a system of n linear equations in a unknowns whose determinant is non-zero. In fact, let
=
=
a
(4.39)
k=1
be such a system. Then we have, in view of (4.36).
whence
=
=
This formula expresses the (unique) solution of (4.39) in terms of the
cofactors of the matrix A and the determinant of A. Given an (a x ii)-matrix denote for each 4.14. The submatrices pair (Lj) (i= 1 .. a, j = I ... a) by SI the (a — 1) x (a — 1)—matrix obtained from A by deleting the i-th row and the j-th column. We shall show that .
cof
=(—
1)1
+
det
(i, I = 1 ... a).
(4.40)
In fact, by (i—I) interchanges of rows and (j— I) interchanges of columns we can transform 1
into the matrix (cf. sec. 4.12)
§ 5. The
Thus
adjoint matrix
= det
=(
117
—
On the other hand we have, in view of the expansion formula (4.38), with 1= 1
detBl =
and so (4.40) follows. 4.15. Expansion by cofactors. From the relations (4.35) and (4.30) we
obtain the expansion-formula of the determinant with respect to the 1th row detA = (1 = 1... ii). (4.41) By this formula the evaluation of the determinant of n rows is reduced to the evaluation of n determinants of n— 1 rows. In the same way the expansion-formula with respect to the Jth column is proved: + det A = ( det S/ (4.42) (j = 1 ... n).
4.16. Minors. Let A =
be a given ii x in-matrix. For every system
of indices and
the submatrix of A, consisting of the rows 1l'k and the is called a minor of order k of the matrix A. It will be shown that in a matrix of rank r there is always a minor of order r which is different from zero, whereas all minors of order be a minor of order k > r. Then the row-vectors k > r are zero. Let of A are linearly dependent. This implies that the rows of the a,1.. are also linearly dependent and thus the determinant must matrix be zero. It remains to be shown that there is a minor of order r which is different denote by
columns /1•Jk• The determinant of
from zero. Since A has rank r, there are r linearly independent rowvectors a, ..
a. The submatrix consisting of these row-vectors has again
the rank r. Therefore it must contain r linearly independent columnvectors h' ... 1* (cf. sec. 3.4). Consider the matrix Its columnvectors are linearly independent, whence
detA/'j"+O. If A is a square-matrix, the minors det
are called the principal minors of order k.
Chapter IV. Determinants
Problems 1. Prove the Laplace expansioii formula for the determinant of an 1) be a fixed integer. Then
(ii x n)-matrix: Let p (1 det A =
det
v1,) det A"
c (v1
where
= B'"
+
and
= 2.
(ii)
— 1)'
1
h, be two bases of a vector space E and a, and h1 1) be given. Show that the vectors can be reordered
Let a1
let p(l such that (i)
(v,—i) (
hr,, is a basis of E, is also a basis of E.
a1
h,,
3. Compute the inverse of the following matrices. 11
1
1
1
—1
1
1
—1 —1 1
—1 1
21 21 B= 2
4. Show that a)
xaa...a
a x a...a det
:
:
a...ax
= [x + (n —
1)
a] (x —
— 1
§ 5. The acijoint matrix
and that
r1
b)
...
1
1
22
det
... 2
1 1
—1
1
—1
Show
(n>2)
that
z11=x1; z12=x1x2+1. 6. Verify the following formula for a quasi-triangular
x11...x1 p
O....O
determinant:
xp+1p+1....xp+ln
o....o I
=
det
xnl 7.
xnn
det
xnp+1
Prove that the operation
properties: a) ad (A B)= ad B ad A. b) det ad A = (det c) ad ad A=(det
A =(det
xnn
_.)
1)2
(cf. sec.4.12) has the following
Chapter IV. Determinants
12()
§ 6.
The characteristic polynomial
4.17. Eigenvectors. Consider a linear transformation 'p of an n-dimensional linear space E. A vector a+O of E is called an eigenvector of 'p if 'p a =
2
a.
The scalar 2 is called the corresponding eigen value. A linear transformation (p need not have eigenvectors. As an example let E be a real linear space of two dimensions and define 'p by
(px1=x2 çox2=—x1 the vectors x1 and x2 form a basis of E. This mapping does not have eigenvectors. in fact, assume that where
a=
x1
+
is an eigenvector. Then pa=2a and hence These
equations yield
whence
+
=
and
4.18. The characteristic equation. Assume that a is an eigenvector of 'p and that 2 is the corresponding eigenvalue. Then
'pa=2a,
a+O.
This equation can be written as ('p — 2i)a = 0
(4.43)
showing that p—21 is not regular. This implies that det('p — 2i) = 0.
(4.44)
Hence, every eigenvalue of 'p satisfies the equation (4.44). Conversely, assume that 2 is a solution of the equation (4.44). Then 'p—2i is not regular. Consequently there is a vector a+O such that ('p — 2i)a = 0,
whence 'pa=2a. Thus, the eigenvalues of'p are the solutions of the equation (4.44). This equation is called the characteristic equation of the linear transformation 'p.
§ 6. The characteristic polynomial
121
4.19. The characteristic polynomial. To obtain a more explicit expression for the characteristic equation choose a determinant function A +0 in E. Then
=
A(px1 — Ax1 ... çox, —
=
1
det(ço
—
Ai)A(x1
... n).
(4.45)
Expanding the left hand-side we obtain a sum of
terms of the form
A(z1 ...
Denote by the sum of all terms in which p arguments are equal to and il—p arguments are equal to Collect in each term of Si,, the indices (v1 <
such that
and the indices
=—
=
—
Introducing the permutation a by
(i= in) we can write
= = =
A (z1 ...
... —
—
...
(—
xa(p+ 1)
Thus,
=
(4.46)
...
(—
where the sum is extended over all permutations a subject to the conditions
a(l)<...
=
(,—
(4.47)
where the sum on the right hand-side is taken over all permutations. Let be the function defined by =
...
...
(0
p n)
Chapter IV. Determinants
and -r be an arbitrary permutation of (1 .. ./1). Then
=
Xta(l)
A
=
A
=
Xrc(p + 1)
Xta(p), Xra(p + 1)
1)
A
Xra(fl))
1)
...
=
This equation shows that ments. This implies that =
is skew symmetric with respect to all argu(4.48)
—
(—
where ce,, is a scalar. Inserting (4.48) into (4.47) we obtain
= Hence, the left hand-side of (4.45) can be written as A((px1 — 2x1, ...
= A(x1 ...
—
(4.49)
Now equations (4.45) and (4.49) yield det (p —
2 i)
=
ce,,
—"
p=o
showing that the determinant of —2 i is a polynomial of degree n in 2. This polynomial is called the characteristic polynomial of the linear transformation p. The coefficients of the characteristic polynomial are determined by equation (4.48), and are called the characteristic coefficients. These relations yield for p = 0 and p = n
= (—
and ; = detq
respectively.
4.20. Existence of eigenvalues. Combining the results of sec. 4.18 and 4.19, we see that the eigenvalues of 'p are the roots of the characteristic polynomial
f(2) = This shows that a linear transformation of an n-dimensional linear space has at most n d!fferent eigen values.
§ 6. The characteristic polynomial
123
Assume that E is a complex linear space. Then, according to the fundamental theorem of algebra, the polynomialf has at least one zero. Consequently, every linear transformation of a complex linear space has at least one eigen value.
if E is a real linear space, this does not generally hold, as it has been shown in the beginning of this paragraph. Now assume that the dimension of E is odd. Then
lirnf(2)=+oo
and
and thus the
must have at least one zero. This proves that a linear transformation of an odd-dimensional real linear space has at least one eigen value. Observing that
f(O)=
=
detço
we see that a linear transformation of positive determinant has at least one positive eigenvalue and a linear transformation of negative determinant has at least one negative eigenvalue, provided that E has odd dimension. If the dimension of E is even we have the relations
lim f
=
urn f
and
= oo
A—' —03
and hence nothing can be said if det co>0. However,
if det p<0,
there
exists at least one positive and one negative eigenvalue. 4.21. The characteristic polynomial of the inverse mapping. It follows from (4.27) that the characteristic polynomial of the dual transformation coincides with the characteristic polynomial of 'p. Suppose now that where E1 and 122 are stable subspaces. Then the result of sec. 4.7 implies that the characteristic polynomial of 'p is the product of the characteristic polynomials of the induced transformations 'Pi : E1 and 'P2 Finally, let p: E—÷E be a regular linear transformation and consider the
inverse transformation p'. The characteristic polynomial of p' is defined by
F(2) = det('p1 Now, —
2i =
'p'
—
— 2'p) = — 2co_1
— )_1
Chapter IV. Determinants
124
whence
det((p'
—
2i) =
—
(—
).' i).
This equation shows that the characteristic polynomials of are related by F(2) = (— Expanding Fe.) as -
F (2) = we
and of
obtain the following relations between the coefficients of f and of F:
(v=0...n). 4.22. The characteristic polynomial of a matrix. Let e%(v = 1.. ii) be a basis of E and A = be the matrix of the linear transformation relative to this basis. Then M(p — 2t) =
—
2M(i) =
A
—
whence
det ('p — 2 i) = det M ('p — 2 i) = det (A — 2 J).
Thus, the characteristic polynomial of 'p can be written as
f
= det (A —
2
J).
(4.50)
The polynomial (4.50) is called the characteristic polynoiiiial of tile matrix A. The roots of the polynomialf are called the eigenvalues of the matrix A.
Problems 1. Compute the eigenvalues of the matrix
/1
0
\i
—1
3
(3 —2 —l 1
2. Show that the eigenvalues of the real matrix / ci
are real.
§ 6. The characteristic polynomial 3.
125
Prove that the characteristic polynomial of a projection
(see Chapter II, sec. 2.19) is given by —
= (— where n
dim E and p = dim E1.
4. Show that the coefficients of the characteristic polynomial of an involution satisfy the relations
(p=O...n). 5. Consider a direct decomposition Given linear transfor(i = 1,2) consider the linear transformation ço = mations E—÷ E. Prove that the characteristic polynomial of 'p is the product of the characteristic polynomials of and of 'pa.
6. Let ço:E—*E be a linear transformation and assume that E1 is E1 a stable subspace. Consider the induced transformations and Prove that :
X
Xi X
Xi and denote the characteristic polynomials of 'p, respectively. In particular show that where
and
=(
is the induced transformation of E/ker 'p and s denotes the dimension of ker 'p. 7. A linear transformation, p, of E is called nilpotent if ç0k=0 for some k. Prove that 'p is nilpotent if and only if the characteristic polynomial has the form where
x (2) = (—
Hint: Use problem 6. 8. Given two linear transformations 'p and cli of E show that det('p — Açli) is a polynomial in 2. 9. Let 'p and cl' be two linear transformations. Prove that 'pot/i and have the same characteristic polynomial. Hint: Consider first the case that is regular.
Chapter IV. Determinants
26
§
7. The trace
4.23. The trace of a linear transformation. In a similar way as the determinant, another scalar can be associated with a given linear transformation (p. Let 44z0 be a determinant function in E. Consider the sum
This sum obviously is again a determinant function and thus it can be written as (4.51)
is a scalar. This scalar which is uniquely determined by 'p is called the trace of 'p and will be denoted by tr 'p. It follows immediately that where
the trace depends linearly on p, tr(2ço +
= Atr'p +
Next we show that =
(4.52)
for any two linear transformations 'p and i/i. The trace of by the equation ...
A (x1 ...
Replacing the vectors
= tr
0p)A (x1 ...
l...n)
by
we
p
E
is defined
E.
obtain (4.53)
=
The left hand-side of this equation can be written as A
x1
...
o 'p
o
x1...
= det
A (x1 ... ('p
...
= and thus (4.53) implies that
=
(4.54)
If is regular, this equation may be divided by det cli yielding (4.52). If cl' is non-regular, consider the mapping t/i —). t where is different from all
§7. The trace
eigenvalues
of
Then
tr
1/I
127
—21 is regular, whence
-21)].
-1 i) q] = tr
In view of the linearity of the trace-operator this equation yields tr(110(p) — 2trp = — 2tr(p whence (4.52). Finally it will be shown that the coefficient of 2" in the characteristic
polynomial of p can be written as (4.55)
=(— Formula (4.48) yields for p = ...
= (—
(4.56)
A (x1 ...
the sum being taken over all permutations a subject to the restrictions This sum can be written as 1
(
A
x1 ...
...
A (x1 ...
=
x1,
1
We thus obtain from (4.56) =
(—
(4.57)
Comparing the relations (4.57) and (4.51) we find (4.55). = 1.. .n) be a basis of E. Then q 4.24. The trace of a matrix. Let determines an ii x n-matrix by the equations (4.58)
Inserting
l...n) in (4.51) we find = trcoA(e1
(4.59)
Equations (4.58) and (4.59) imply that
=
A(e1 ...
A(e1 ...
whence (4.60)
Observing that
=
çü e%>,
Chapter IV. Determinants
12g
where e*%' (v = 1 .. ii)
is the dual basis of
we can rewrite equation
(4.60) as tr çø =
(4.61)
<e*1, ço e>.
Formula (4.60) shows that the trace of a linear transformation is equal to the sum of all entries in the main-diagonal of the corresponding matrix. For any ii x n-matrix A this sum is called the trace of A and will be denoted by tr A, (4.62)
Now equation (4.60) can be written in the form
tr(p = trM(p).
4.25. The duality of L(E; F) and L(F; E). Now consider two linear spaces Eand Fand the spaces L(E; F) and L(F; E) of all linear mappings With the help of the trace a scalar product can be p:E—÷F and introduced in these spaces in the following way: q7EL(E; F),
=
E).
(4.63)
The function defined by (4.63) is obviously bilinear. Now assume that (4.64)
for a fixed mapping coeL(E; F) and all linear mappings /ieL(F; E). It has to be shown that this implies that ço=O. Assume that 'p+O. Then Extend the vector b1 =pa to there exists a vector aEE such that a basis (b1 .
.
.bm)
of F and define the linear mapping i/i: F—>E by
(p=2...in).
Then
=0
= b1,
= 2... in),
whence
=
=
= 1.
This is in contradiction with (4.64). Interchanging E and F we see that the relation
for a fixed mapping /ieL(F; E) and all mappings peL(E; F) implies that Hence, a scalar-product is defined in L(E; F) and L(F; E) by (4.63).
§7. The trace
129
Problems 1. Show that the characteristic polynomial of a linear transformation of a 2-dimensional linear space can be written as
f()) = )2 Verify that every such
satisfies (p2 —
— 2trço + detçD.
its characteristic equation,
(ptr(p + ldet(p = 0.
2. Given three linear transformations 'p, i/i,
of E show that
tr(yoç/ioço) + in general. 3. Show that the trace of a projection operator ir:E—+E1 II sec. 2.19) is equal to the dimension of Im it.
(see
Chapter
4. Consider two pairs of dual spaces E*, E and F*, F. Prove that the spaces L (E; F) and L (E*; F*) are dual with respect to the scalar-product defined by = F*). (pEL(E; F)
5. Letf be a linear function in the space L(E; E). Show thatf can be written as
f((p) = is a fixed linear transformation in E. Prove that is uniquely determined by f 6. Assume thatf is a linear function in the space L(E; E) such that where
Prove that
f(ço)=2tr(p where 2 is a scalar. 7. Let and be two linear transformations of E. Consider the sum
i*j
A (x1 ... (p Xi ... cli x1...
where A+O is a determinant function in E. This sum is again a determinant function and hence it can be written as i4j 9
A (x1
Linear Algebra
(p
...
cli
x1 ...
= B (p, cl') A (x1 ...
Chapter IV. Determinants
I 3()
By the above relation a bilinear function B is defined in the space L (E; E). Prove: a) B(p, p tr b) IB((p, (p)=(— where is the coefficient of 2P12 in the characteristic polynomial of çø. c)
=
—
8. Consider two ii x n-matrices A and B. Prove the relation tr(A B) = tr(BA)
a) by direct computation.
b) using the relation tr p=tr M(p). 9. If çø and are two linear transformations of a 2-dimensional linear space prove the relation +
=
+
+
—
10. Let A:L(E; E)—÷L(E; E) be a linear transformation such that
A
a 2-dimensional vector space and p be a linear transformation of E. Prove that ço satisfies the equation p2= —2i, 2>0 if and
only if
detço>0 and trp=0. 12. Let ço:E1—*E1 and
be linear transformations. Consider
=
Prove that tr ço=tr +tr P2• be a linear transformation and assume that there is a 13. Let into subspaces such that decomposition (1= l...r). Prove that tr q=0. be linear maps between finite dimen14. Let p: and /i: a sional vector spaces. Show that tr ((p /1) = )-th I 5. Show that the trace of ad ((p) (cf. sec. 4.6) is the negative (ii characteristic coefficient of 1
§ 8. Oriented vector spaces
§ 8.
131
Oriented vector spaces
In this paragraph E will be a real vector space of dimension n 1. 4.26. Orientation by a determinant function. Let A1 + 0 and A2 + 0 be two determinant functions in E. Then A2=2A1 where 2+0 is a real number. Hence we can introduce an equivalence relation in the set of all determinant functions A +0 as follows: if
2>0.
it is easy to verify that this is indeed an equivalence. Hence a decomposition of all determinant functions A +0 into two equivalence classes is induced. Each of these classes is called an orientation of E. If (A) is an orientation and A e (A) we shall say that A represents the given orientation. Since there are two equivalence classes of determinant functions the vec-
tor space E can be oriented in two different ways. A basis (v = n) of an oriented vector space is called positive if 1 . . .
where A is a representing determinant function. If (e1 .. is a positive basis and a is a permutation of the numbers (l...n) then the basis (eat!)... ea(fl)) is positive if and only if the permutation a is even. Suppose now that E* is a dual space of E and that an orientation is defined in E. Then the dual determinant function (cf. sec. 4.10) determines an orientation in E*. it is clear that this orientation depends only on the orientation of E. Hence, an orientation in E* is induced by the orientation of E. .
4.27. Orientation preserving linear mappings. Let E and F be two oriented vector spaces of the same dimension n and be a linear isomorphism. Given two representing determinant functions AE and AF defined by in E and F consider the function x1, ...,
= Clearly
is again a determinant function in E and hence we have that =
2+0 is a real number. The sign of 2 depends only on p and on the given orientations (and not on the choice of the representing determinant functions). The linear isomorphism çø is called orientation preserving if where
Chapter IV. Determinants
132
).>O. The above argument shows that, given a linear isomorphism and an orientation in E, then there exists precisely one orientation in F such that
preserves the orientation. This orientation will be called the orientation induced by ço. Now let be a linear automorphisni; i.e., F=E. Then we have AF=AF and hence it follows that = det(p.
This relation shows that a linear automorphism
preserves the
orientation if and only if det ço >0. As an example consider the mapping ço= —i. Since det(— i) = (— 1)'
it follows that ii is even.
preserves the orientation if and only if the dimension of
4.28. Factor spaces. Let E be an orientated vector space and F be an oriented subspace. Then an orientation is induced in the factor space E/F in the following way: Let A be a representing determinant function
in E and a1 .
.
be a positive basis of F. Then the function A (a1 ... ap,
depends only on the classes are equivalent mod F. Then
+
EE
1
in fact, assume for instance that
and
)p+1 = Xp+i +
and we obtain A (a1 ...
a
•..
A
=
A
(a1 ...
of (n—p) vectors in E1'F is well defined by
=
...
(4.65)
it is clear that 21 is linear with respect to every argument and skew symmetric. Hence A is a determinant function in ElF. It will now be shown that the orientation defined in E/F by A depends only on the orientations of E and F. Clearly, if A' is another representing determinant function in
§ 8. Oriented vector spaces
133
E we have that A'
2A, 2>0 and hence A' = 221. Now let another positive basis of F. Then we have that
>0
= whence A (a'1 ...
a,
.. .a,) be
=
1
A (a1 ...
1
It follows that the function A' obtained from the basis a'1 . . .a, is a positive multiple of the function A obtained from the basis a1... a direct decomposition
E=E1E13E2
(4.66)
and assume that orientations are defined in E1 and E2. Then an orientation is induced in E as follows: Let l...p) and l...q) be positive bases of E1 and E2 respectively. Then choose the orientation of E such that the basis a1 . b1 . . .bq is positive. To prove that this orientation depends only on the orientations of E1 and E2 let (i= I .. .p) and (1= 1... q) be two other positive bases of E1 and E2. Consider the linear transformations p:E1—+E1 and defined by .
(i=1...p) Then the transformation basis (a1..
b1
. .
(j=1...q).
and
carries the basis
bi...bq) into the
Since det ço >0 and det i/i >0 it follows from sec. 4.7
.bq).
that
>0
det(p
hi...hq) is again a positive basis of E. and hence Suppose now that in the direct decomposition (4.66) orientations are given in Eand E1. Then an orientation is induced in E2. In fact, consider the projection it:E—÷E2 defined by the decomposition (4.66). It induces an isomorphism
p:E/E1 4E2. In view of sec. 4.28 an orientation in E/E1 is determined by the orientations of E and E1. Hence an orientation is induced in E2 by p. To describe this orientation explicitly let A be a representing determinant function in E and e1
ep be a positive basis of E1. Then formula (4.65)
implies that the induced orientation in E2 is represented by the determinant function A2
...
=
A (e1
...
... yb),
(4.67)
Chapter IV. Determinants
134
Now let ..., be a positive basis of E7 with respect to the induced orientation. Then we have
and hence formula (4.67) implies that 4(e1 ... It follows that the basis e1 ...
of E is positive. In other words,
1
the orientation induced in E by E1 and 122 coincides with the original orientation. The space 122 in turn induces an orientation in E1. It will be shown that this orientation coincides with the original orientation of E1 if and only if p (n —p) is even. The induced orientation of E1 is represented by the determinant-function A1(x1 ...
=
...
...
(4.68)
where eA(2—p+ 1, ...n) is a positive basis of E2. Substituting (v
1.. .n) in equation (4.68) we find that
A1(e1 ...
=
But e)(i.=p+ I
...
...
= (— 1)
...
en).
(4.69)
n) is a positive basis of E2 whence
...
(4.70)
It follows from (4.69) and (4.70) that A1(e1 ...e")
c>
0
if p(n — p) is even if p(n—p)is odd.
(4.71)
is positive with respect to the original orienSince the basis of tation, relation (4.71) shows that the induced orientation coincides with the original orientation if and only if p(n—p) is even. 4.30. Example. Consider a 2-dimensional linear space 12. Given a basis (e1, e2) we choose the orientation of E in which the basis e1, e2 is positive. Then the determinant function A, defined by
zl(e1,e2) =
1
1,2) generrepresents this orientation. Now consider the subspace 1,2) with the orientation defined by Then E1 induces in ated by 122 the given orientation, but E2 induces in E1 the inverse orientation.
§ 8. Oriented vector spaces
135
In fact, defining the determinant-functions A1 and A1 (x)
A (e2,
x)
xe E1,
A2 (x) = A
and
A2 in E1 and in E2 by (e1,
x)
x e E2
we find that
4.31.
A2(e2) = A(e1, e2) =
1
Intersections. Let E1
and E2 be
and
A1 (e1) =
A(e2,
e1) = —
two subspaces of E
E = E1 + E2
1.
such
that (4.72)
and assume that orientations are given in E1, E2
and E. It
will be shown
that then an orientation is induced in the intersection E12=E1 n
E2.
Setting
dimE1 = p,
dimE2 = q,
dimE12 = r
we obtain from (4.72) and (1.32) that
r= p + q — n. Now consider the isomorphisms ço:E/E1 and Vi:E/E2
E1/E12.
Since orientations are induced in E/E1 and E/E2 these isomorphisms determine orientations in E2/E12 and in E1/E12 respectively. Now choose two positive bases +1• a,, and hr . bq in E1/E12 and E2/E12 respecand bJEE2 be vectors such that tively and let .
and
where it1 and
it2
denote the canonical projections and
ir2:E2—+E2/E12.
Now define the function A12 by A12 (z1 ... Zr) = A (Z1 ... Zr, ar+ 1
a,,, br+ 1 .. bq).
(4.73)
In a similar way as in sec. 4.30 it is shown that the orientation defined in depends only on the orientations of E1, and E (and not on E12 by the choice of the vectors and ba). Hence an orientation is induced in E12.
Chapter IV. Determinants
Interchanging E1 and E2 in (4.73) we obtain
(:i
•..
(:i
=A
r,
br+ 1
bq,
1
ar).
(4.74)
Hence it follows that =
(q-r)4
1)(P_r)
(n-q)4
= (_
(475)
Now consider the special case of a direct decomposition. Then p+q=n and E12=(0). The function /112 reduces to the scalar (4.76)
Moreover the number—2 depends
It follows from (4.76) that
only on the orientations of E1, E2 and E. It is called the intersection number
of the oriented subspaces E1 and E2. From (4.75) we obtain the relation =(
= 1.. ii) be two bases of E. Basis deformation. Let and is called deformable into the basis if there exist n Then the basis continuous mappings 4.32.
t t1
t
satisfying the conditions 1. (t0) = and (t1) = (t) (v = 2. The vectors 1
The
.
.11) are
linearly independent for every fixed t.
deformability of two bases is obviously an equivalence relation.
Hence, the set of all bases of E is decomposed into classes of deformable bases. We shall now prove that there are precisely two such classes. This is a consequence of the following Theorem: Two bases and (v = 1.. ./1) are deformable into each other if and only if the linear transformation p: E defined by 'p = has positive determinant.
Proof: Let 4+0 be an arbitrary determinant function. Then formula lii) are 4.17 together with the observation that the components continuous functions of shows that the mapping Ex x E—*R defined by A is continuous. into the is a deformation of the basis Now assume that basis Consider the real valued function (t) ...
§ 8. Oriented vector spaces
137
The continuity of the function 4 and the mappings
the function Ji
is
implies that
continuous. Furthermore,
ck(t)+0 the vectors (t) (v = I .n) are linearly independent. Thus the function cP assumes the same sign at t=t0 and at t=t1. But because
. .
= 4(pa1 ...
= A(b1
= det(pA(a1 ...
=
whence
detço >0 and so the first part of the theorem is proved. 4.33. Conversely, assume that the linear transformation a deformation assume first that the vector n-tuple (a1 ...
...
(4.77)
1). Then consider the de-
is linearly independent for every 1(1 composition = By the above assumption the vectors (a1..
are linearly independ-
ent, whence /3" + 0. Define the number c,, by
1+1 if if
L—1
It will be shown that the n mappings
Ix,(t)=a.(v=1...n—1) = define
(1 —
+
(0
a deformation
(a1... a determinant function in E. Then
4 (x1 (t) ... Since
(t)) = ((1
—
t) +
t) A (a1 ... an).
it follows that
(0t1)
Chapter IV. Determinants
138
whence
(0 t 1).
+0
(t) ...
zi
This implies the linear independence of the vectors
(t)(v =
1 . .
ii) for
every t.
In the same way a deformation
(a1...
(a1 ...
can be constructed where tain a deformation (a1 ...
—÷
1= ± I. Continuing this way we finally ob-
b1 ... c,1
= ±
1
(v
1 ... n).
To construct a deformation (b1 ...
(e1 b1 ...
consider the linear transformations
(v=1...n) and
(v=1...n). The product of these linear transformations is given by
(v=1...n). By hypothesis,
det(i/iop)>O
(4.78)
detp>O.
(4.79)
and by the result of sec. 4.32 Relations (4.78), and (4.79) imply that > 0.
det But whence
+1. Thus, the number of equal to — is even. Rearranging the vectors (v = 1.. n) we can achieve that 1
.
5—i
vl+i
(v=1...2p)
(v=2p+1...n).
§ 8. Oriented vector spaces
139
Then a deformation b1
...
e,,
-÷ (b1 ...
is defined by the mappings
+
=— =—
(v — —
1
—
(v=2p+1...n) 4.34. The case remains to be considered that not all the vector n-tuples (4.77) are linearly independent. Let A + 0 be a determinant function. The linear independence of the vectors (v = 1 . n) implies that . .
Since zl is a continuous function, there exists a spherical neighbourhood that Ua
(v= 1...n).
if XvEUa
Choose a vector a'1 e Uaj which is not contained in the (n — 1)-dimensional subspace generated by the vectors (b2. . Then the vectors (a'1, b2.. are linearly independent. Next, choose a vector a'2 E Ua2 which is not con-
tained in the (n— 1)-dimensional subspace generated by the vectors (a'1, b2. Then the vectors (a'1, a'2, b3. are linearly independent. Going on this way we finally obtain a system of n vectors (v = 1.. n) .
.
.
such that every n-tuple ...
(a'1
is
linearly independent. Since
Hence the vectors
(1= 1 ... ;i)
it follows that
(v = 1. . n) form a basis of E. The n mappings .
define a deformation (4.80)
In fact,
(t)(0 t
1) is contained in Uav whence A
(t) ...
(t)) + 0
(0
t
This implies the linear independence of the vectors
1).
(t)(v = 1.. .n).
140
Chapter IV. Determinants
By the result of sec. 4.33 there exists a deformation ...
... ba).
(4.81)
The two deformations (4.80) and (4.81) yield a deformation (a1 ...
... ba).
This completes the proof of the theorem in sec. 4.32. 4.35. Basis deformation in an oriented linear space. If an orientation is given in the linear space E, the theorem of sec. 4.32 can be formulated as and (v = I n) can be deformed into each other follows: Two bases if and only if they are both positive or both negative with respect to the given orientation. In fact, the linear transformation . .
p:
.
(v = I ... a)
—*
has positive determinant if and only if the bases and b%, (v = I .. are both positive or both negative. Thus the two classes of deformable bases consist of all positive bases and all negative bases. 4.36. Complex vector spaces. The existence of two orientations in a
real linear space is based upon the fact that every real number 2t0 is either positive or negative. Therefore it is not possible to distinguish two orientations of a complex linear space. In this context the question arises whether any two bases of a complex linear space can be deformed into each other. It will be shown that this is indeed always possible. Consider two bases and (v = a) of the complex linear space E. As in sec. 4.33 we can assume that the vector n-tuples 1
(a1...
. .
...
are linearly independent for every 1(1 1). It follows from the above assumption that the coefficient in the decomposition = is
different from zero. The complex number
Now choose a positive continuous function
r(0) =
1,
r(I) = r
can
be written as
1) such that (4.82)
§ 8. Oriented vector spaces
and a continuous function
1) such that
(t)(O t
(4.83)
Define mappings
(t),
(0
t 1)
(v =
=
by 1 ...
n — 1) )
and (t)
=t
Then the vectors (t) (v = fact, assume a relation
1 ..
(4.84)
1.
t
flV
+ r(t) .
n)
are linearly independent for every t. In
0. Then
+
r (t)
+
=
0
whence
(v==1...n—1) and = 0.
1, the last equation implies that )J' = 0. r (t) + 0 for 0 t (n— 1) equations reduce to in— 1). It follows from (4.84), (4.82) and (4.83) that
Since
Hence the
first
=
and
=
Thus the mappings (4.84) define a deformation (a1 ...
—÷
Continuing this way we obtain after 1...n). into the basis
(a1 n
...
ba).
steps a deformation of the basis
Problems 1. Let E be an oriented n-dimensional linear space and
(v = 1.. .11) be
the subspace generated by the vectors a positive basis; denote by is positive with respect Prove that the basis (x1 to
the orientation induced in
by the vector
l)i —
1
xL.
Chapter IV. Determinants
142
2. Let E be an oriented vector space of dimension 2 and let a1, a2 be two linearly independent vectors. Consider the 1-dimensional subspaces E1 and E2 generated by a1 and a2 and define orientations in E. such that the bases a are positive (i= 1,2). Show that the intersection number of E1 and E2 is + 1 if and only if the basis a1, a2 of E is positive.
3. Let E be a vector space of dimension 4 and assume that (v
= ... 1
4)
is a basis of E. Consider the following quadruples of vectors:
I. e1+e2,e1+e2+e3,e1+e2+e3+e4,e1—e2+e4 II. e1+2e3, e2+e4, e2—e1+e4, e2
III.
e2+e4, e3+e2, e2—e1
1V.
e1+e2—e3, e2—e4, e3+e2, e2—e1
V. e1—3e3, e2+e4, e2—e1—e4, e2. a) Verify that each quadruple is a basis of E and decide for each pair of bases if they determine the same orientation of E. b) If for any pair of bases, the two bases determine the same orientation, construct an explicit deformation. c) Consider E as a subspace of a 5-dimensional vector space E and assume that (v = 1, . .5) is a basis of E. Extend each of the bases above to a basis of which determines the same orientation as the basis (v= 1, ..., 5). Construct the corresponding deformations explicitly. 4. Let E be an oriented vector space and let E1, E2 be two oriented subspaces such that E== E1 + E2. Consider the intersection E1 fl E2 together with the induced orientation. Given a positive basis (c1, ..., Cr) of extend it to a positive basis (c1, ..., Cr, ar+ E1 fl a positive basis (C1, ..., Cr, br+ i, ..., bq) of E2. Prove that then (C1, ..., Cr, ar+ 1' ..., ap, br+ i, ..., bq) is a positive basis of E. .
5. Linear isotopices. Let E be an n-dimensional real vector space. will be called linearly Two linear automorphisms p: and i/i: (I the closed unit isotopic if there is a continuous map 1: I x and such that for each interval) such that x)=p(x), x) is a linear automorphism. tEl the map given by (i) Show that two automorphisms of E are linearly isotopic if and only if their determinants have the same sign. (ii) Let j: E—+E be a linear map such that = — Show that j is linearly isotopic to the identity map and conclude that detj= 1. 6. Complex structures. A complex structure in real vector space E is a linear transformation j: E—+E satisfying j2 = — 1.
§8. Oriented vector spaces
143
(i) Show that a complex structure exists in an n-dimensional vector space if and only if ii is even. (ii) Let j be a complex structure in E where dimE=2n. Let 4 be a determinant function. Show that for (v= 1 ... 11) either 4(x1
or 4(x1 ,...,x,1,jx1 The natural orientation of (E, J) is
defined to be the orientation re-
presented by a non-zero determinant function satisfying the first of these.
Conclude that the underlying real vector space of a complex space carries a natural orientation. (iii) Let be a vector space with complex structure and consider the complex structure (E, —j). Show that the natural orientations of and (E, —j) coincide if and only if n is even (dimE=2n).
Chapter V
Algebras In paragraphs one and two all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0.
§ 1. Basic properties 5.1. Definition: An algebra, A, is a vector space together with a mapping A x such that the conditions (M1) and (M2) below both hold. The image of two vectors xeA, yeA, under this mapping is called the product of x and y and will be denoted by xy. The mapping A x A-+A is required to satisfy: (M1) (M2)
(2x1 + jix2)y = 2(x1 y) + ,u(x2y) + 11y2)
2(xy1) + ji(xy2).
As an immediate consequence of the definition we have that
0x = x0 0. Suppose B is a second algebra. Then a linear mapping p : a homomorphism (of algebras) if çø preserves products; i.e.,
p(xy)=pxpy.
is called (5.1)
A homomorphism that is injective (resp. surjective, bijective) is called a monomorphism (resp. epimorphism, isomorphism). If B=A, p is called an endomorphism. Note: To distinguish between mappings of vector spaces and mappings of algebras, we reserve the word linear mapping for a mapping between vector spaces satisfying (1.8), (1.9) and homomorphism for a linear mapping between algebras which satisfies (5.1). Let A be a given algebra and let U, V be two subsets of A. We denote by U V, the set
§ 1. Basic properties
145
Every vector aeA induces a linear mapping p(a):A—*A defined by
1u(a)x=ax
(5.2)
4u (a) is called the multiplication operator determined by a.
An algebra A is called associative if
x(yz)=(xy)z and commutative if
xy=yx
x,y,zEA
x,yeA.
From every algebra A we can obtain a second algebra (x y)OPP
by defining
=yx
AON' is called the algebra opposite to A. It is clear that if A is associative
then so is A°". If A is commutative we have A°"=A. If A is an associative algebra, a subset Sc: A is called a system of generators of A if each vector xeA is a linear combination of products of elements in 5, 1... x= e S, 1... Vp ... (v)
A unit element (or identity) in an algebra is an element e such that for every x
xe=ex=x.
(5.3)
if A has a unit element, then it is unique. In fact, if e and e' are unit elements, we obtain from (5.3) e = e e' = e'. Let A be an algebra with unit element eA and ço be an epimorphism of A onto a second algebra B. Then eB = eA is the unit element of B. In fact,
if yEB is arbitrary, there exists an element xeA such that y= çox. This gives
yeB =
=
=
= y.
In the same way it is shown that An algebra with unit element is called a division algebra, if to every element a 0 there is an element a' such that a a — = a — = c. 1
5.2. Examples: 1. Consider the space L(E; E) of all linear transformations of a vector space E. Define the product of two transformations by 10
= i/i o Greub Linear Algebra
Chapter V. Algebras
146
The relations (2.17) imply that the mapping (p, i)—*çlip satisfies (M1) and (M2) and hence L(E; E) is made into an algebra. L(E; E) together with this multiplication is called the algebra of linear transformations of E and is denoted by A (E; E). The identity transformation i acts as unit element in A (E; E). It follows from (2.14) that the algebra A (E; E) is associative.
However, it is not commutative if dim E 2. In fact, write E= (x1) and (x2) are the one-dimensional subspaces generated by two linearly independent vectors x1 and x2, and F is a complementary subspace. Define linear transformations 'p and by
'px1=O,
(px2=x1,
py=O,yeF
ç/ix2=O;
çliy=O,yeF.
and Then
= 'p0 = 0 while (pX2
=
çlix1
= x2
whence
Suppose now that A is an associative algebra and consider the linear mapping defined by
p(a)x=ax.
(5.4)
Then we have that
p(a b)x = a bx = p(a)p(b)x whence
p(ab) = p(a)p(b). This
relation shows that p is a homomorphism of A
into A (A; A).
M X be the vector space of (n x mi)-matrices for a given integer ii and define the product of two (mi x n)-matrices by formula Example 2: Let
(3.20). Then it follows from the results of sec. 3.10 that the space is made into an associative algebra under this multiplication with the unit matrix J as unit element. Now consider a vector space E of dimension n with a distinguished basis (v = 1. n). Then every linear transforE determines a matrix M ('p). The correspondence 'p —+ M(co) mation 'p . .
§ 1. Basic properties
determines a linear isomorphism of A (E; E) onto
147 X
In view of sec.
3.10 we have that
M is an isomorphism of the algebra A (E; E) onto the opposite algebra (M" X n)0PP. Example 3: Suppose F1 cF is a subfield. Then F is an algebra over F1. We show first that F is a vector space over F1. In fact, consider the mapping F1 x F—+F defined by
(2,x)-÷2x, It satisfies the relations
2eF1,xEF.
+ x =2x+ x 2(x + y) = Ax + 2y (24u)x = 2(1ux)
lx = x where 2,
x, yeF. Thus F is a vector space over F by (x, y) x y (field multiplication).
Then M1 and M2 follow from the distribution laws for field multiplication.
Hence F is an associative commutative algebra over F1 with 1 as unit element.
Example 4: Let cr be the vector space of functions of a real variable t which have derivatives up to order r. Defining the product by
(1 g) (t) = f (t) g (t) we obtain an associative and commutative algebra in which the function 1(t) = 1 acts as unit element. 5.3. Subalgebras and ideals. A subalgebra, A1, of an algebra A is a linear subspace which is closed under the multiplication in A; that is, if x and y are arbitrary elements of A1, then xyeA1. Thus A inherits the structure of an algebra from A. It is clear that a subalgebra of an associative (commutative) algebra is itself associative (commutative). Let S be a subset of A, and suppose that A is associative. Then the subspace B A generated (linearly) by elements of the form
is
SEES Si...Sr, clearly a subalgebra of A, called the subalgebra generated by S. It is
easily verified that
where the 10*
B= A containing S.
Chapter V. Algebras
A right (left) ideal in an algebra A is a subspace I such that for every xEI, and every yEA, xyEI(yxEI). A subspace that is both a right and left ideal is called a tivt'o-sided ideal, or simply an ideal in A. Clearly, every
right (left) ideal is a subalgebra. As an example of an ideal, consider the subspace A2 (linearly generated by the products xy). A2 is clearly an ideal and is called the derived algebra. The ideal I generated by a set S is the intersection of all ideals containing S. If A is associative, I is the subspace of A generated (linearly) by elements of the form
s,as,sa
sES,aEA.
In particular every single clement a generates an ideal principal ideal generated by a.
4
is
called the
Example 5: Suppose A is an algebra with unit element e, and let (p:F-÷A be the linear mapping defined by ço2 = Ae.
Considering F as an algebra over itself we have that =
e
=
e)(R e) =
then ).e=0 whence p is a homomorphism. Moreover, if )L=0. It follows that p is a monomorphism. Consequently we may identify F with its image under p. Then F becomes a subalgebra of A and scalar multiplication coincides with algebra multiplication. In fact, if). is any scalar, then = (2e).a = = Example
6: Given an element a of an associative algebra consider
the set, Na, of all elements xEA such that ax =0. If xENa then we have for every VEA
and so xyENa. This shows that Na is a right ideal in A. It is called the right annihilator of a. Similarly the left annihilator of a is defined. 5.4. Factor algebras. Let A be an algebra and B be an arbitrary subspace of A. Consider the canonical projection
It will be shown that A/B admits a multiplication such that ir is a homomorphism if and only if B is an ideal in A. Assume first that there exists such a multiplication in A/B. Then for
§ 1. Basic properties
149
every xeA, yeB, we have
ir(xy) =
irxiry =
it follows that and so B must be an ideal. Conversely, assume B is an ideal. Then define the multiplication in
A/B by
= ir(xy)
(5.5)
where x and y are any representatives of
and 7 respectively.
It has to be shown that the above product does not depend on the choice of x and y. Let x' and y' be two other elements such that it x' = and iry' =7. Then
x'—xeB and y'—yeB. Hence we can write
x'==x+b, It
beB and y'—y+c,
ceB.
follows that
x'y' — xy = by + xc + bceB and so
rr(x'y') = it(xy). The multiplication in A/B clearly satisfies (M1) and (M2) as follows from the linearity of it. Finally, rewriting (5.5) in the form
it (x y) = it x iry
we see that it is a homomorphism and that the multiplication in A/B is uniquely determined by the requirement that it be a homomorphism. The vector space A/B together with the multiplication (5.5) is called the
factor algebra of A with respect to the ideal B. It is clear that if A is associative (commutative) then so is A/B. If A has a unit element e then ë= ire is the unit element of the algebra A/B. 5.5. Homomorphisms. Suppose A and B are algebras and is
a homomorphism. Then the kernel of p is an ideal in A. In fact, if xeker ço and yEA
are
arbitrary we have that
p(xy) =
=
=
0
whence xyEker qi. In the same way it follows that yxEker tp. Next consider the subspace Im pcB. Since for every two elements x, yeA = qi(xy)elmq
it follows that Im 'p is a subalgebra of B.
Chapter V. Algebras
Now let
be the induced injective linear mapping. Then we have the commutative d iagrani
A/kerço
and since
is
a homomorphism, it follows that = = = =
'p
(x y) (x). 'p (y)
is a homomorphism and hence a monomorphism. In particular, the induced mapping This relation shows that
4 Imço is an isomorphism. Finally, assume that C is a third algebra, and let be a homomorphism. Then the composition ti0 p: A —+ C is again a homomorphism. In fact, we have 'ps. = = =
Let 'p:A—*B be any homomorphism of associative algebras and S be by a system of generators for A. Then 'p determines a set map (po:
'p0x=çox,
xeS. In fact, if
The homomorphism 'p is completely determined by
x=
...
E S,
(v)
is
an arbitrary element we have that
... 'poxv
= (v)
I
Vp
e I'
§ 1. Basic properties
151
be an arbitrary set map. Then Po can be Proposition I: Let can be extended to a homomorphism ço:A—+B if and only if ...
whenever
0
po
...
= 0. (5.6)
(v)
(v)
Proof: It is clear that the above condition is necessary. Conversely, assume that (5.6) is satisfied. Then define a mapping ço:A—÷B by
= (v)
...
(5.7)
(v)
To show that çü is, in fact, well defined we notice that if = ... •.. Ypq
(v)
(p)
then —
Ypq
= 0.
(ji)
(v)
In view of (5.6) ...
...
—
(v)
0
(p)
and so
...
It follows from (5.7) that
px=ip0x xeS p(2x + jiy) = 2px + and
'p (x y) = 'p
'p y
and hence 'p is a homomorphism. Now suppose that
be a linear map such
is a basis for A and let
= for each x, j3. Then 'p is a homomorphism, as follows from the relation 'p (x y) =
'p
ep)}
e2) (> p
2,fl
=
'p (ep)) =
'p p
'p
(x) 'p (y).
Chapter V. Algebras
152
5.6. Derivations. A linear mapping 0: A called a derivation if
A of an algebra into itself is
X,yEA.
(5.8)
and As an example let A be the algebra of Cr-functions f: define the mapping 0 by 0:f—+f' wheref' denotes the derivative off. Then the elementary rules of calculus imply that 0 is a derivation. If A has a unit element e it follows from (5.8) that Oe =
Oe
+ 0e
whence Oe=—O. A derivation is completely determined by its action on a
system of generators of A, as follows from an argument similar to that used to prove the same result for homomorphisms. Moreover, if O:A—*A
is a linear map such that
=
+
is a basis for A, then 0 is a derivation in A. For every derivation 0 we have the Leibniz formula
where
on(xy)
(5.9)
= In fact, for n = 1, (5.9) coincides with (5.8). Suppose now by induction that (5.9) holds for some n. Then +
1(x y) =
o on
(x y) or+
-r y +
= x.On+ly +
+
or x
-r+
=
= x.On+ly + n+ 1
and so the induction is closed.
Orx.On+l_ry +
(n+1) Orx.On_r+ly +
§ 1. Basic properties
153
The image of a derivation 0 in A is of course a subspace of A, but it is in general not a subalgebra. Similarly, the kernel is a subalgebra, but it is not, in general, an ideal. To see that ker 0 is a subalgebra, we notice that for any two elements x, yeker 0
0(xy) =
0
whence xy E ker 0.
it follows immediately from (5.8) that a linear combination of derivations 01:A—*A is again a derivation in A. But the product of two derivations 01, 02 satisfies
(0102)(xy) = 01(02x.y + x.02y) (5.10)
and so is, in general, not a derivation. However, the commutator [01,02] = 0102—0201 is again a derivation, as follows at once from (5.10). 5.7. ip-derivations. Let A and B be algebras and 'p : A B be a fixed homomorphism. Then a linear mapping 0: A B is called a 'p-derivation if
0(xy)=
x,yeA.
+
denotes In particular, all derivations in A are i-derivations where i : A the identity map. As an example of a 'p-derivation, let A be the algebra of Cm-functions f: R—* and let B= 11. Define the homomorphism cp to be the evaluation homomorphism : f -÷ (0) and the mapping 0 by
f
0:f Then it follows that
0(fg) = (1 g)'(O) = = 0
f' (0) g (0) + f (0) g' (0)
a 'p-derivation.
More generally, if °A is any derivation in A, then derivation. In fact, 0(xy) = çoOA(xy) + xOAv) = = 'p
+
'p x
0 y.
Similarly, if °B is a derivation in B, then °B° 'p
is
a 'p-derivation.
is
a ço-
Chapter V. Algebras
154
5.8. Antiderivations.
Recall that an involution in a linear space is a
linear transformation whose square is the identity. Similarly we define an involution w in an algebra A to be an endomorphisin of A whose square is the identity map. Clearly the identity map of A is an involution. If A has a unit element e it follows from sec. 5.1 that we=e. Now let w be a fixed involution in A. A linear transformation Q:A—*A will be called an antiderivation with respect to w if it satisfies the relation
Q(xj) =
+
wxQy.
(5.11)
In particular, a derivation is an antiderivation with respect to the involution i. As in the case of a derivation it is easy to show that an antiderivation is determined by its action on a system of generators for A and that ker Q is a subalgebra of A. Moreover, if A has a unit element e, then Qe=O. It also follows easily that any linear combination of antiderivations with respect to a fixed involution w is again an antiderivation with respect to w. Suppose next that Q1 and Q2 are antiderivations in A with respect to 0w1. Then w1 oW2 the involutions w1 and w2 and assume that w1 0w2 is again an involution. The relations
(Q122)(xy)= Q1(Q2xy + w2xQ2y)=
+
and
(Q2Q1)(xy)=
+ w1xQ1y)= Q2Q1xy + + w2Q1xQ2y + Q2w1xQ1y + w2w1xQ2Q1y
yield
(Q1Q2 ± Q2 ± Q2w1)xQ1 y + +(Q1w2 ± w2Q1)xQ2y + W1W2X.(Q1Q2 ± Q2Q1)y. Q2
(5.12)
Now consider the following special cases: 1. w1 Q2 = Q2 w1 and w2 Q1 = Q1 (this is trivially true if w1 = ± i and Q2 —Q2 is an antiderivation = ± i). Then the relation shows that with respect to the involution W1 w2. In particular, if Q is an antiderivation
with respect to w and 0 is a derivation such that 0)0=0W, then OQ—Q0 is again an antiderivation with respect to oi. 2. w1Q2=—Q2W1andw2Q1=—Q1W2.ThenQ1Q2+Q2Q1isanantiderivation with respect to the involution W1 0)2.
§ 1. Basic properties
155
Now let Q1 and Q2 be two antiderivations with respect to the same involution w such that
(i=1,2). Then it follows that Q1 Q2 + Q2 Q1 is a derivation. In particular, if Q is any antiderivation such that wQ = - Qw then Q2 is a derivation. Finally, let B be a second algebra, and let cp: A —÷ B be a homomorphism. Assume that WA is an involution in A. Then a p-antiderivation with respect to WA is a linear mapping Q:A-+B satisfying (5.13) If WB is
an involution in B such that WA
WB 'P
then equation (5.13) can be rewritten in the form
Q(xy) =
+
Problems 1. Let A be an arbitrary algebra and consider the set C(A) of elements aEA that commute with every element in A. Show that C(A) is a subspace
of A. If A is associative, prove that C(A) is a subalgebra of A. C(A) is called the centre of A. 2. If A is any algebra and 0 is a derivation in A, prove that C(A) and the derived algebra are stable under 0. 3. Construct an explicit example to prove that the sum of two endomorphisms is in general not an endomorphism. 4. Suppose 'p A —÷ B is a homomorphism of algebras and let ). + 0, 1 be
an arbitrarily chosen scalar. Prove that 2p is a homomorphism if and only if the derived algebra is contained in ker 'p. 5. Let C1 and C denote respectively the algebras of continuously differentiable and continuous functionsf: (cf. Example 4). Consider the linear mapping d: C' C
given by df=f' wheref' is the derivative off.
Chapter V. Algebras
a) Prove that this is an i-derivation where i: C1—*C denotes the canonical injection. b) Show that d is surjective and construct a right inverse for d.
c) Prove that d cannot be extended to a derivation in the algebra C. 6. Suppose A is an associative commutative algebra and 0 is a derivation in A. Prove that Ox" = px"' 0(x). 7. Suppose that 0 is a derivation in an associative commutative algebra A with identity e and assume that xeA is invertible; i.e.; there exists an element x ' such that
Prove that x" (p 1) is invertible and that
(x")' —(x')" Denoting the inverse of x" by
show that for every derivation 0
0(x") = — px"1 0(x). 8. Let L be an algebra in which the product of two elements x, y is denoted by [x, y]. Assume that [x, y] + [y, x] = z], x] + y], z] +
0
(skew symmetry)
x], y] = 0 (Jacobi identity)
Then L is called a Lie algebra. Let Ad (a) be the multiplication operator in the Lie algebra L. Prove that Ad (a) is a derivation.
9. Let A be an associative algebra with product xy. Show that the multiplication (x, y)—÷[x, y] where
[x,y]
xy — yx
makes A into a Lie algebra. 10. Let A be any algebra and consider the space D (A) of derivations in A. Define a multiplication in D(A) by setting [01,02] = a)
0102 —0201.
Prove that D (A) is a Lie algebra.
b) Assume that A is a Lie algebra itself and consider the mapping given by Show that 'p is a homomorphism of Lie algebras. Determine the kernel of p.
§ 1. Basic properties
11. If L is a Lie algebra and I is an ideal in A, prove that the algebra L/I is again a Lie algebra. 12. Let E be a finite dimensional vector space. Show that the mapping 1: A (E; E) —÷ A (E*; E*)OPP
is an isomorphism of algebras. given by 13. Let A be any algebra with identity and consider the multiplication
operator
ji: A
A (A; A).
Show that p is a monomorphism. If A = L (E; E) show that by a suitable restriction of p a monomorphism
(i= 1
E be an n-dimensional vector space. Show that each basis n) of L(E; E) such that n) of E determines a basis 1
(i) QijOlk = (ii)
i.
Conversely, given n2 linear transformations of E satisfying i) and ii), prove that they form a basis of L(E; E) and are induced by a basis of E. Show that two bases e1 and of E determine the same basis of L (E; E) if and only if e = A 2 eF. 15. Define an equivalence relation in the set of all linearly independent E), in the following way: n2-tuples (p1...pn2), ...
('P1 ...
if and only if there exists an element =
such that (v
= ... 1
,12)
Prove that 2EV
only if 2=1. 16. Prove that the bases of L(E; E) defined in problem 14 form an equivalence class under the equivalence relation of problem 15. Use this to show that every non-zero endomorph ism cP: A (E; E)—*A (E; E) is an inner automorphism; i.e., there exists a fixed linear automorphism of E such that 1
17. Let A be an associative algebra, and let L denote the corresponding
Chapter V. Algebras
158
is Lie algebra (cf. problem 9). Show that a linear mapping derivation in A only if it is a derivation in L. 18. Let E be a finite dimensional vector space and consider the mapping E)—*A(E; E) defined by
= — Prove that is a derivation. Conversely, prove that every derivation in A (E; E) is of this form. Hint. Use problem 14. § 2.
Ideals
5.9. The lattice of ideals. Let A be an algebra, and consider the set 1 of ideals in A. We order this set by inclusion; i.e., if and '2 are ideals if and only if in A, then we write '1 The relation is clearly a partial order in 1 (cf. sec. 0.6). Now let and '2 be ideals in A. Then it is easily cheeked that + and '1 n '2 are again ideals, and are in
fact the least upper bound and the greatest lower bound of '1 and '2 Hence, the relation induces in 5 the structure of a lattice. 5.10. Nilpotent ideals. Let A be an associative algebra. Then an element aEA will be called ni/potent if for some k, (5.14)
The least k for which (5.14) holds is called the degree of ni/potency of a. An ideal I will be called nilpotent if for some k,
,k0
(5.15)
The least k for which (5.15) holds is called the degree of ni/potency of I and will be denoted by deg I. 5.11.* Radicals. Let A be an associative commutative algebra. Then the nilpotent elements of A form an ideal. In fact, if x and y are nilpotent of degree p and q respectively we have that p+q
+
p+q
p
= i=O
and
(xy)" = x"y" = 0.
§ 2. Ideals
The ideal consisting of the nilpotent elements is called the radical of A and will be denoted by rad A. (The definition of radical can be generalized to the non-commutative case; the theory is then much more difficult and belongs to the theory of rings and algebras. The reader is referred to [14]).
It is clear that rad (rad A) = rad A.
The factor algebra A/rad A contains no non-zero nilpotent elements. To prove this assume that x̄ ∈ A/rad A is an element such that x̄^k = 0 for some k. Then x^k ∈ rad A and hence the definition of rad A yields an l such that x^{kl} = (x^k)^l = 0. It follows that x ∈ rad A, whence x̄ = 0. The above result can be expressed by the formula

rad(A/rad A) = 0.
Now assume that the algebra A has dimension n. Then rad A is a nilpotent ideal, and

deg(rad A) ≤ dim(rad A) + 1 ≤ n + 1.   (5.16)

For the proof, we choose a basis e₁, …, e_r of rad A. Then each e_i is nilpotent. Let k = max_i (deg e_i), and consider the ideal (rad A)^{rk}. An arbitrary element in this ideal is a sum of elements of the form

λ e₁^{k₁} ⋯ e_r^{k_r},   k₁ + ⋯ + k_r ≥ rk,

so that k_j ≥ k for at least one j, whence each such product is zero. This shows that (rad A)^{rk} = 0 and so rad A is nilpotent. Now let s be the degree of nilpotency of rad A, and suppose that for some m < s,

(rad A)^m = (rad A)^{m+1}.   (5.17)

Then we obtain by induction that

(rad A)^m = (rad A)^{m+1} = (rad A)^{m+2} = ⋯ = (rad A)^s = 0

which is a contradiction. Hence (5.17) is false and so in particular

dim(rad A)^m > dim(rad A)^{m+1},   m < s.

It follows at once that s − 1 cannot be greater than the dimension of rad A, which proves (5.16). As a corollary, we notice that for any nilpotent element x ∈ A, its degree of nilpotency is less than or equal to n + 1,

deg x ≤ n + 1.
5.12.* Simple algebras. An algebra A is called simple if it has no proper non-trivial ideals and if A² ≠ 0. As an example consider a field Γ as an algebra over a subfield Γ₁. Let I ≠ 0 be an ideal in Γ. If x is a non-zero element of I, then 1 = x⁻¹x ∈ I, and it follows that

Γ = Γ·1 ⊂ I

whence Γ = I. Since Γ² ≠ 0, Γ is simple.

As a second example consider the algebra A(E; E) where E is a vector space of dimension n. Suppose I is a non-trivial ideal in A(E; E) and let φ ≠ 0 be an arbitrary element of I. Then there exists a vector a ∈ E such that φa ≠ 0. Now define the linear transformations σ_k by

σ_k e_j = δ_jk a,   j, k = 1 … n,

where e_i (i = 1 … n) is a basis of E. Choose linear transformations χ_i such that

χ_i(φ a) = e_i,   i = 1 … n.

Let ψ ∈ A(E; E) be arbitrary and let (α_k^i) be the matrix of ψ with respect to the basis e_i. Then

(Σ_{i,k} α_k^i χ_i ∘ φ ∘ σ_k)(e_j) = Σ_i α_j^i χ_i(φ a) = Σ_i α_j^i e_i = ψ e_j,

whence

ψ = Σ_{i,k} α_k^i χ_i ∘ φ ∘ σ_k.

It follows that ψ ∈ I and so I = A(E; E). Since (clearly) A(E; E)² ≠ 0, A(E; E) is a simple algebra.

Theorem I: Let A be a simple commutative associative algebra. Then A is a division algebra.

Proof: We show first that A has a unit element. Since A² ≠ 0, there is an element a ∈ A such that the ideal I_a = A·a is non-zero. Since A is simple, I_a = A. Thus there is an element e ∈ A such that

a e = a.   (5.18)
Next we show that e² = e. In fact, equation (5.18) implies that

a(e² − e) = (ae)e − ae = ae − ae = 0

and so e² − e is contained in the annihilator of a, e² − e ∈ N_a. Since N_a is an ideal and N_a ≠ A, it follows that N_a = 0, whence e² = e.

Now let x ∈ A be any element. Since a = ae ∈ I_e we have I_e ≠ 0 and so I_e = A. Thus we can write

x = ey   for some y ∈ A.

It follows that ex = e²y = ey = x and so e is the unit element of A. Thus every element x ∈ A satisfies x = x·e; i.e., x ∈ I_x. In particular, I_x ≠ 0 if x ≠ 0. Hence, if x ≠ 0, I_x = A and so there is an element x⁻¹ such that x x⁻¹ = x⁻¹ x = e; i.e., A is a division algebra.
5.13.* Totally reducible algebras. An algebra A is called totally reducible if to every ideal I there is a complementary ideal I′,

A = I ⊕ I′.

Every ideal I in a totally reducible algebra is itself a totally reducible algebra. In fact, let I′ be a complementary ideal. Then

I·I′ ⊂ I ∩ I′ = 0.

Consequently, if J is an ideal in I, we have A·J = (I + I′)·J ⊂ J, and so J is an ideal in A. Let J′ be a complementary ideal in A, A = J ⊕ J′. Intersecting with I and observing that J ⊂ I, we obtain (cf. sec. 1.13)

I = J ⊕ (J′ ∩ I).

It follows that I is again totally reducible. An algebra, A, is called irreducible if it cannot be written as the direct sum of two non-trivial ideals.

5.14.* Semisimple algebras. In this section A will denote an associative commutative algebra. A will be called semisimple if it is totally reducible and if for every non-zero ideal I, I² ≠ 0.
Proposition I: If A is totally reducible, then A is the direct sum of its radical and a semisimple ideal. The square of the radical is zero.

Proof: Let B denote a complementary ideal for rad A,

A = rad A ⊕ B.

It follows from sec. 5.13 that B is totally reducible. B contains no non-zero nilpotent elements and so I² ≠ 0 for every non-zero ideal I of B; hence B is semisimple.

To show that the square of rad A is zero, let k be the degree of nilpotency of rad A, (rad A)^k = 0. Then (rad A)² is an ideal in rad A, and rad A is totally reducible, so there exists a complementary ideal J,

(rad A)² ⊕ J = rad A.

Now we have the relations

rad A · J ⊂ (rad A)² ∩ J = 0.

These relations yield

(rad A)² = rad A · ((rad A)² ⊕ J) = (rad A)³ = ⋯ = (rad A)^k = 0.

Corollary: A is semisimple if and only if A is totally reducible and rad A = 0.
Problems

1. Suppose that I₁, I₂ are ideals in an algebra A. Prove that

(I₁ + I₂)/I₁ ≅ I₂/(I₁ ∩ I₂).

2. Show that the algebra C′ defined in Example 4, § 1, has no nilpotent elements ≠ 0.

3. Consider the set S of step functions f: [0, 1] → ℝ. Show that the operations

(f + g)(t) = f(t) + g(t),
(λf)(t) = λ f(t),
(fg)(t) = f(t) g(t)
make S into a commutative associative algebra with identity. (A function f: [0, 1] → ℝ is called a step function if there exists a decomposition of the unit interval,

0 = t₀ < t₁ < ⋯ < t_m = 1,

such that f is constant on each open interval (t_{i−1}, t_i).)

5. Show that the algebra S of problem 3 has ideals which are not principal. Let (a, b) ⊂ [0, 1] be any open interval, and let f be a step function such that f(t) = 0 if and only if a < t < b. Prove that the ideal generated by f is precisely the subset of functions g such that g(t) = 0 for a < t < b.

6. Let E be an infinite dimensional vector space. Show that the linear transformations of E whose kernels have finite codimension form an ideal. Conclude that A(E; E) is not simple. Hint: See problem 11, chap. II, § 6 and problem 8, chap. I, § 4.

§ 3. Change of coefficient field of a vector space
5.15. Vector space over a subfield. Let E be a vector space over a field Γ and let Δ be a subfield of Γ. The vector space structure of E involves a mapping

Γ × E → E

satisfying the conditions (1.1), (1.2) and (1.3) of sec. 1.1. The restriction of this mapping to Δ × E again satisfies these conditions, and so it determines on E the structure of a vector space over Δ. A subspace (factor space) of E considered as a Δ-vector space will be called a Δ-subspace (Δ-factor space). Similarly we refer to Γ-subspaces and Γ-factor spaces. Clearly every Γ-subspace (factor space) is a Δ-subspace (Δ-factor space). Now let F be a second vector space over Γ and suppose that φ: E → F is a Γ-linear mapping; i.e.,

φ(λx + μy) = λφx + μφy,   x, y ∈ E; λ, μ ∈ Γ.

Then φ is a Δ-linear mapping if E and F are considered as Δ-vector spaces.
As an example, consider the field Γ as a 1-dimensional vector space over itself. Then (cf. sec. 5.2, Example 3) Γ is an algebra over Δ.

5.16. Dimensions. To distinguish between the dimension of E over Γ and over Δ we shall write dim_Γ E and dim_Δ E. Suppose now that Γ is finite dimensional over Δ. Assume further that the dimension of E over Γ is finite. It will be shown that

dim_Δ E = dim_Γ E · dim_Δ Γ.

Let e_i (i = 1 … n) be a basis of E over Γ and consider the Γ-subspace E_i generated by e_i. Then there is a Γ-isomorphism φ_i: Γ → E_i. But φ_i is also a Δ-isomorphism and hence it follows that

dim_Δ E_i = dim_Δ Γ,   i = 1 … n.   (5.21)

Since the Δ-vector space E is the direct sum of the Δ-vector spaces E_i, we obtain from (5.21) that

dim_Δ E = n · dim_Δ Γ = dim_Γ E · dim_Δ Γ.

As an example let E be a complex vector space of dimension n. Then, since dim_ℝ ℂ = 2, E, considered as a real vector space, has dimension 2n. If e_ν (ν = 1 … n) is a basis of the complex vector space E then the vectors e_ν, i e_ν (ν = 1 … n) form a basis of E considered as a real vector space.

5.17. Algebras over subfields. Again let Δ be a subfield of Γ and let A be an algebra over Γ. Then A may be considered as a vector space over Δ, and it is clear that A, together with its Δ-vector space structure, is an algebra over Δ. We distinguish (in a way similar to the case of vector spaces) between Δ-subalgebras, Δ-homomorphisms and Γ-subalgebras, Γ-homomorphisms. Clearly every Γ-subalgebra (Γ-homomorphism) is a Δ-subalgebra (Δ-homomorphism).
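The complex/real example above can be checked mechanically. The following sketch (not part of the original text; numpy assumed) realifies ℂⁿ and verifies that the 2n vectors e_ν, i·e_ν become an ℝ-basis of ℝ^{2n}:

```python
import numpy as np

# Check dim_R E = dim_C E * dim_R C for E = C^n by realifying C^n.
n = 3

def realify(z):
    """Map a vector in C^n to the corresponding vector in R^(2n)."""
    return np.concatenate([z.real, z.imag])

complex_basis = [np.eye(n, dtype=complex)[k] for k in range(n)]
candidates = complex_basis + [1j * e for e in complex_basis]

# The 2n realified vectors have full rank, hence form an R-basis of R^(2n).
M = np.stack([realify(v) for v in candidates])
assert np.linalg.matrix_rank(M) == 2 * n
```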
5.18. Extension fields as subalgebras of A_Δ(E; E). Let E be a non-trivial vector space over Γ and Δ ⊂ Γ be a subfield. Then E can be considered as a vector space over Δ. Denote by A_Δ(E; E) the algebra (over Δ) of Δ-linear transformations of E. Define a mapping Φ: Γ → A_Δ(E; E) by

Φ(α)x = αx,   α ∈ Γ, x ∈ E,   (5.22)

where αx is the ordinary scalar multiplication defined between Γ and E. Then

Φ(αβ)x = (αβ)x = α(βx) = (Φ(α) ∘ Φ(β))x

and

Φ(α + β)x = (α + β)x = αx + βx = (Φ(α) + Φ(β))x

whence

Φ(αβ) = Φ(α)Φ(β)   and   Φ(α + β) = Φ(α) + Φ(β).

Since Δ ⊂ Γ, it follows that Φ is a Δ-homomorphism. Moreover, Φ is injective. In fact, Φ(α) = 0 implies that αx = 0 for every x ∈ E, whence α = 0. Since Φ is a monomorphism we may identify Γ with the Δ-subalgebra Im Φ of A_Δ(E; E).
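For Γ = ℂ, Δ = ℝ and E = ℂ the monomorphism Φ can be written out concretely: multiplication by α = a + bi is the ℝ-linear transformation of ℝ² with matrix ((a, −b), (b, a)). A small numeric check, offered only as an illustration (numpy assumed):

```python
import numpy as np

def Phi(alpha: complex) -> np.ndarray:
    """R-linear transformation of E = C = R^2 given by z -> alpha*z."""
    a, b = alpha.real, alpha.imag
    return np.array([[a, -b],
                     [b,  a]])

alpha, beta = 2 + 3j, -1 + 0.5j

# Phi is an (injective) algebra homomorphism: it preserves products and sums.
assert np.allclose(Phi(alpha * beta), Phi(alpha) @ Phi(beta))
assert np.allclose(Phi(alpha + beta), Phi(alpha) + Phi(beta))
```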
Conversely, let E be a vector space over a field Δ and assume that Γ ⊂ A_Δ(E; E) is a field containing the identity ι. We may identify Δ with the subalgebra of Γ consisting of the elements of the form λι, λ ∈ Δ, the identification map being given by λ ↦ λι. Then we have Δ ⊂ Γ; i.e., Δ is a subfield of Γ. Now define a mapping Γ × E → E by

(φ, x) ↦ φx,   φ ∈ Γ, x ∈ E.   (5.23)

Then we have the relations

φ(x + y) = φx + φy,
(φ + ψ)x = φx + ψx,
(φψ)x = φ(ψx),
ιx = x,   x, y ∈ E; φ, ψ ∈ Γ,

and hence E is made into a vector space over Γ. The restriction of the mapping (5.23) to Δ gives the original structure of E as a vector space over Δ, while the mapping (5.23) restricted to Δ reduces to the canonical injection of Δ into A_Δ(E; E).

5.19. Linear transformations over extension fields. Let Δ ⊂ Γ be a subfield and E be a vector space over Γ. Then we have shown that Γ ⊂ A_Δ(E; E) and A_Γ(E; E) ⊂ A_Δ(E; E). Now we shall prove the more precise

Proposition: A_Γ(E; E) is the subalgebra of A_Δ(E; E) consisting of those Δ-linear transformations which commute with every Δ-linear transformation of the form Φ(α), α ∈ Γ.
Proof: Let φ ∈ A_Γ(E; E). Then

φ(αx) = α φx,   α ∈ Γ, x ∈ E,

and so

φ ∘ Φ(α) = Φ(α) ∘ φ,   α ∈ Γ.   (5.24)

Conversely, if (5.24) holds, then by inverting the above argument we obtain that φ is Γ-linear.

Corollary: Suppose that E is a vector space over Δ and Γ ⊂ A_Δ(E; E) is a field such that ι ∈ Γ. Then a transformation ψ ∈ A_Δ(E; E) is contained in A_Γ(E; E) if and only if it commutes with every Δ-linear transformation φ ∈ Γ.
Problems

1. Suppose Δ ⊂ Γ is a subfield of Γ such that Γ has finite dimension over Δ. Suppose further that Λ is a subalgebra of Γ containing Δ. Prove that Λ is a subfield of Γ.

2. Show that if Δ ⊂ Γ is a subfield, and Γ has finite dimension over Δ, then there are no non-trivial derivations in the Δ-algebra Γ.

3. A complex number α is called algebraic if it satisfies an equation of the form

Σ_ν k_ν α^ν = 0   (k_ν rational)

where not all the coefficients are zero. Prove that the algebraic numbers form a subfield, A, of ℂ and that A has infinite dimension over the rationals. Prove that there are no non-trivial derivations in A.
Chapter VI
Gradations and homology

In this chapter all vector spaces are defined over a fixed, but arbitrarily chosen, field Γ of characteristic 0.

§ 1. G-graded vector spaces

6.1. Definition. Let E be a vector space and G be an abelian group. Suppose that a direct decomposition

E = Σ_α E_α   (6.1)

is given and that to every subspace E_α an element k(α) of G is assigned such that the mapping α ↦ k(α) is injective. Then E is called a G-graded vector space. G is called the group of degrees for E. The vectors of E_α are called homogeneous of degree k(α) and we shall write

deg x = k(α),   x ∈ E_α.

In particular, the zero vector is homogeneous of every degree. If the mapping α ↦ k(α) is bijective we may use the group G as index set in the decomposition (6.1). Then formula (6.1) reads

E = Σ_{k∈G} E_k

where E_k denotes the subspace of the homogeneous elements of degree k. If G = ℤ, E will be called simply a graded vector space.

Suppose that E is a vector space with direct decomposition

E = Σ_{k=0}^∞ E_k.

Then by setting deg x = k for x ∈ E_k (k ∈ ℤ, E_k = 0 for k ≤ −1) we make E into a graded space, and whenever we refer to the graded space E = Σ_{k=0}^∞ E_k, we shall mean this particular gradation. A gradation of E such that E_k = 0, k ≤ −1, is called a positive gradation.
Now let E be a G-graded space, E = Σ_{k∈G} E_k, and consider a subspace F ⊂ E such that

F = Σ_{k∈G} (F ∩ E_k).

Then a G-gradation is induced in F by assigning the degree k to the vectors of F ∩ E_k. F together with its induced gradation is called a G-graded subspace of E.

Suppose next that {E_λ}, λ ∈ I, is a family of G-graded spaces indexed by a set I and let E be the direct sum of the E_λ. Then a G-gradation is induced in E by

E = Σ_{k∈G} E_k   where   E_k = Σ_{λ∈I} (E_λ)_k.

This follows from the relation

Σ_{k∈G} Σ_{λ∈I} (E_λ)_k = Σ_{λ∈I} Σ_{k∈G} (E_λ)_k = Σ_{λ∈I} E_λ = E.
6.2. Linear mappings of G-graded spaces. Let E and F be two G-graded spaces and let φ: E → F be a linear map. The map φ is called homogeneous if there exists a fixed element k ∈ G such that

φ(E_j) ⊂ F_{j+k},   j ∈ G.   (6.2)

k is called the degree of the homogeneous mapping φ. The kernel of a homogeneous mapping is a graded subspace of E. In fact, if π_j and ρ_j denote the projection operators in E and F induced by the gradations of E and F, it follows from (6.2) that

ρ_{k+j} ∘ φ = φ ∘ π_j.   (6.3)

Relation (6.3) implies that ker φ is stable under the projection operators π_j and hence ker φ is a G-graded subspace of E. Similarly, the image of φ is a G-graded subspace of F.

Now let E be a G-graded vector space, F be an arbitrary vector space (without gradation) and suppose that φ: E → F is a linear map of E onto F such that ker φ is a graded subspace of E. Then there is a uniquely determined G-gradation in F such that φ is homogeneous of degree zero. The G-gradation of F is given explicitly by

F = Σ_{j∈G} F_j   where   F_j = φ(E_j).   (6.4)

To show that (6.4) defines a G-gradation in F, we notice first that since φ is onto,

F = φ(E) = φ(Σ_j E_j) = Σ_j φ(E_j) = Σ_j F_j.

To prove that the decomposition (6.4) is direct assume that

Σ_j y_j = 0,   y_j ∈ F_j.

Since F_j = φ(E_j), every y_j can be written in the form y_j = φ(x_j), x_j ∈ E_j. It follows that φ(Σ_j x_j) = 0, whence Σ_j x_j ∈ ker φ. Since ker φ is a graded subspace of E we obtain x_j ∈ ker φ for each j, whence y_j = 0. Thus the decomposition (6.4) is direct and hence it defines a G-gradation in F. Clearly, the mapping φ is homogeneous of degree zero with respect to the induced gradation. Finally, it is clear that any G-gradation of F such that φ is homogeneous of degree zero must assign to the elements of φ(E_j) the degree j. In view of the decomposition (6.4) it follows that this G-gradation is uniquely determined by the requirement that φ be homogeneous of degree zero.

This result implies in particular that there is a unique G-gradation determined in the factor space of E with respect to a G-graded subspace such that the canonical projection is homogeneous of degree zero. Such a factor space, together with its G-gradation, is called a G-graded factor space of E.
Now let E = Σ_{k∈G} E_k and F = Σ_{k∈G} F_k be two G-graded spaces, and suppose that φ: E → F is a linear mapping homogeneous of degree l. Denote by φ_k the restriction of φ to E_k,

φ_k: E_k → F_{k+l}.

Then clearly

φ = Σ_{k∈G} φ_k.

It follows that φ is injective (surjective, bijective) if and only if each φ_k is injective (surjective, bijective).
6.3. Gradations with respect to several groups. Suppose that τ: G → G′ is a homomorphism of G into another abelian group G′. Then a G-gradation of E induces a G′-gradation of E by

E = Σ_{β∈G′} E′_β   where   E′_β = Σ_{τ(α)=β} E_α.   (6.5)

To prove this we note first that E = Σ_β E′_β since E = Σ_α E_α. The directness of the decomposition (6.5) follows from the fact that every space E′_β is a sum of certain subspaces E_α and that the decomposition E = Σ_α E_α is direct.
A G^p-gradation is a gradation with respect to the group

G ⊕ ⋯ ⊕ G   (p summands).

If G = ℤ we refer simply to a p-gradation of E. Given a G^p-gradation in E consider the homomorphism τ: G^p → G given by

τ(k₁, …, k_p) = k₁ + ⋯ + k_p.

The induced (simple) G-gradation of E is given by

E = Σ_{k∈G} E_k,   E_k = Σ_{k₁+⋯+k_p=k} E_{k₁,…,k_p}.   (6.6)

The G-gradation (6.6) of E is called the (simple) G-gradation induced by the given G^p-gradation.

Finally, suppose E is a vector space, and assume that

E = Σ_{j∈G} E_j   and   E = Σ_{k∈H} F_k   (6.7)

define G- and H-gradations in E. Then the gradations will be called compatible if

E = Σ_{j,k} (E_j ∩ F_k).

If the two gradations given by (6.7) are compatible, they determine a (G ⊕ H)-gradation in E by the assignment

deg x = (j, k),   x ∈ E_j ∩ F_k.
Conversely, suppose a (G ⊕ H)-gradation in E is given,

E = Σ_{j,k} E_{j,k}.

Then compatible G- and H-gradations of E are defined by

E_j = Σ_{k∈H} E_{j,k},   j ∈ G,   and   F_k = Σ_{j∈G} E_{j,k},   k ∈ H,

and the (G ⊕ H)-gradation of E determined by these G- and H-gradations is given by

E_{j,k} = E_j ∩ F_k,   j ∈ G, k ∈ H.
6.4. The Poincaré series. A gradation E = Σ_k E_k of a vector space E is called almost finite if the dimension of every space E_k is finite. To every almost finite positive gradation we assign the formal series

P_E(t) = Σ_{k=0}^∞ (dim E_k) t^k.

P_E(t) is called the Poincaré series of the graded space E. If the dimension of E is finite, then P_E(t) is a polynomial and

P_E(1) = dim E.

The direct sum of two almost finite positively graded spaces E and F is again an almost finite positively graded space, and its Poincaré series is given by

P_{E⊕F}(t) = P_E(t) + P_F(t).

Two almost finite positively graded spaces E and F are connected by a homogeneous linear isomorphism of degree 0 if and only if P_E(t) = P_F(t). In fact, suppose φ: E → F is such a homogeneous linear isomorphism of degree 0. Writing

φ = Σ_{k=0}^∞ φ_k

(cf. sec. 6.2) we obtain that each φ_k is a linear isomorphism, E_k ≅ F_k. Hence

dim E_k = dim F_k   (k = 0, 1, …)   (6.8)

and so P_E(t) = P_F(t). Conversely, assume that P_E(t) = P_F(t). Then (6.8) must hold, and thus there are linear isomorphisms

φ_k: E_k → F_k.   (6.9)

Since E = Σ_k E_k we can construct the linear mapping

φ = Σ_{k=0}^∞ φ_k: E → F

which is clearly homogeneous of degree zero. Moreover, in view of sec. 6.2 it follows from (6.9) that φ is a linear isomorphism.

6.5. Dual G-graded spaces. Suppose E = Σ_{k∈G} E_k and F = Σ_{k∈G} F_k are two G-graded vector spaces, and assume that φ: E × F → Γ
and Fif (6.10)
for each pair of distinct degrees, k +j. Every bilinear function cv: E x F—÷F which respects G-gradations determines bilinear functions pk: Ek x (keG) by
(PIEkXFk=(Pk,
keG.
(6.11)
Conversely, if any bilinear functions (pk: Ek x are given, then a unique bilinear function p:Ex F-+F which respects G-gradations is determined by (6.10) and (6.11). In particular, it follows that ço is non-degenerate if and only if each is non-degenerate. Thus a scalar product which respects G-gradations determines a scalar product between each pair (Ek, Fk), and conversely if a scalar product is defined between each pair (Ek, Fk) then the given scalar product can be extended in a unique way to a G-gradation-respecting scalar product between E and F. E and F, together with a G-gradation-respecting scalar product, will be called dual G-graded spaces.
Now suppose that E and F are dual almost finite G-graded spaces.
§ 1. G-graded vector spaces
173
Then Ek and Fk are dual, and so
dimEk=dimFk In particular, if G
kEG.
and the gradations of E and F are positive, we have
=
Problems 1. Let çø : E—*F be an injective linear mapping. Assume that F is a Ggraded vector space and that Im ço is a G-graded subspace of F. Prove that there is a unique G-gradation in E so that p becomes homogeneous of degree zero. 2. Prove that every G-graded subspace of a G-graded vector space has a complementary G-graded subspace. 3. Let E, F be G-graded vector spaces and suppose that E1 E, F1 F are G-graded subspaces. Let p: E—+F be a linear mapping homogeneous of degree k. Assume that p can be restricted to E1, F1 to obtain a linear mapping p1 :E1—*F1 and an induced mapping
Prove that q1 and are homogeneous of degree k. 4. If cp, E, F are as in problem 3, prove that if has a left (right) inverse, then a left (right) homogeneous inverse of must exist. What are the possible degrees of such a homogeneous left (right) inverse mapping? 5. Let E1, E2, 123 be G-graded vector spaces. Suppose that p: E1 and i/i:E1—+E3 are linear mappings, homogeneous of degree k and 1 respectively. Assume that can be factored over 'p. Prove that cli can be factored over 'p with a homogeneous linear mapping x: and determine the degree of x. Hint: See problem 5, chap. II, § 1.
6. Let E, E* and F, F* be two pairs of dual G-graded vector spaces. Assume that and 'p is homogeneous of degree k, prove that 'p* homogeneous of degree k. and 7. Let E be an almost finite graded space. Suppose that are G-graded spaces each of which is dual to the G-graded space E. Construct
is
Chapter VI. Gradations and homology
174
a homogeneous linear isomorphism of degree zero
such that
<(py*x> =
xeE.
8. Let F, E* be a pair of almost finite dual G-graded spaces. Let F be a G-graded subspace of E. Prove that F' is a G-graded subspace of E*
and that (F')'=F. 9. Suppose E, E*, F are as in problem 8. Let F1 be a complementary G-graded subspace for F in E (cf. problem 2). Prove that
E* =F'f13F1' and that F, F and F1, F' are two pairs of dual G-graded spaces. 10. Suppose E, E* and F, F* are two pairs of almost finite dual G1L
graded vector spaces, and let p:E—*Fbe a linear mapping homogeneous of degree k. Prove that çQ* exists.
11. Suppose F, E* is a pair of almost finite dual G-graded vector spaces. Let be a basis of F consisting of the set union of bases for the in E* exists. homogeneous subspaces of E. Prove that a dual basis 12. Let F and F be two G-graded vector spaces and p: E—+ F be a homogeneous linear mapping of degree k. Assume further that a homomorphism w:G—*H is given. Prove that p is homogeneous of degree w(k) with erspect to the induced H-gradation. § 2.
G-graded algebras
6.6. G-graded algebras. Let A be an algebra and suppose that a Ggradation A =
is defined in the vector space A. Then A is called a kEG
G-graded algebra if for every two homogeneous elements x and y, xy is
homogeneous, and
deg(xy) = degx + degy. Suppose that A = keG
(6.12)
is a graded algebra with identity element e.
Then e is homogeneous of degree 0. In fact, writing
e=
ek
keG
ekeAk
§ 2. G-graded algebras
175
we obtain for each xeA that
x is homogeneous of degree 1 kEG
whence
xek—O for k+O and so
xe0=x.
(6.13)
Since (6.13) holds for each homogeneous vector x it follows that e0 is a right identity for A, whence e = e e0
= e0.
Thus eeA0 and so it is homogeneous of degree 0.
It is clear that every subalgebra of A that is simultaneously a G-graded subspace, is a G-graded algebra. Such subalgebras are called G-graded subalgebras.
Now suppose that 1c A is a G-graded ideal. Then the factor algebra A/I has a natural G-gradation as a linear space (cf. sec. 6.2) such that the canonical projection ir:A—A/I is homogeneous of degree zero. Hence if and 9 are any two homogeneous elements in A/I we have .9 =
and so
is
deg
= 7t (x y)
homogeneous. Moreover, = deg(x y) = deg x + deg y = deg
+ deg9.
Consequently, A/I is a G-graded algebra. More generally, if B is a second algebra without gradation, and p : A—÷B is an epimorphism whose kernel is a G-graded ideal in A, then the induced G-gradation (cf. sec. 6.2) makes B into a G-graded algebra. Now let A and B be G-graded algebras, and assume that p:A—+B is a
homogeneous homomorphism of degree k. Then ker 'p is a G-graded ideal in A and Im 'p is a G-graded subalgebra of B. G' is a homoSuppose next that A is a G-graded algebra, and r: morphism, G' being a second abelian group. Then it is easily checked that the induced G'-gradation of A makes A into a G'-graded algebra.
Chapter VI. Gradations and homology
176
The reader should also verify that if E is simultaneously a G- and an H-graded algebra such that the gradations are compatible, then the inalgebra. of A makes A into a duced A graded algebra A is called anticommutative if for every two homogeneous elements x and y = if x and y are two homogeneous elements in an associative anticommux y comtative graded algebra such that deg mute, and so we obtain the binomial formula n
yfl - i.
(x +
= an involution w is defined by
In every graded algebra A =
1)kx, wx = (6.14) XEAk. In fact, if XEAk and yeA1 are two homogeneous elements we have
w(xy) =
1)k+lxy = (_
1)1y
=
w preserves products. It follows immediately from (6.14) that = i and so w is an involution. w will be called the canonical involution of the graded algebra A. A homogeneous antiderivation with respect to the canonical involution (6.14) will simply be called an antiderivation in the graded algebra A. It satisfies the relation
Q(xy)=
+(— 1)kxQy,
xeAk,yeA.
If Q1 and Q2 are antiderivations of odd degree then Q1 Q2 + Q2 Q1 is a
derivation. If Q is an antiderivation of odd degree and 0 is a derivation then QO —OQ is an antiderivation (cf. sec. 5.8). Now assume that A is an associative anticommutative graded algebra
and let hEA be a fixed element of odd (even) degree. Then, if Q is a homogeneous antiderivation, p (h) Q is a homogeneous derivation (antiderivation) and if 0 is a homogeneous derivation, p(h)O is a homogeneous antiderivation (derivation) as is easily checked.
Problems 1. Let A be a G-graded algebra and suppose x is an invertible element homogeneous of degree k (cf. problem 7, chap. V, § 1). Prove that 1
§ 2. G-graded algebras
177
is homogeneous and calculate the degree. Conclude that if A is a positively graded algebra, then kz=O. 2. Suppose that A is a graded algebra without zero divisors. Prove that every invertible element is homogeneous of degree zero.
3. Let E, F be G-graded vector spaces. Show that the vector space LG(E; F) generated by the homogeneous linear mappings ço:E-+F is a subspace of L(E; F). Define a natural G-gradation in this subspace such that an element cOELG(E; F) is homogeneous if and only if it is a homogeneous linear mapping. 4. Prove that the G-graded space LG (E; E) (E is a G-graded vector space) is a subalgebra of A (E; E). Prove that the G-gradation makes LG (E; E) into a G-graded algebra (which is denoted by AG (E; E)). 5. Let E be a positively graded vector space. Show that an injective (surjective) linear mapping E) has degree Conclude that a homogeneous linear automorphism of E has degree zero. 6. Let A be a positively graded algebra. Show that the subset Ak of A consisting of the linear combinations of homogeneous elements of degree
k is an ideal. 7. Let E, E* be a pair of almost finite dual G-graded vector spaces. Construct an isomorphism of algebras: &AG(E;E)4 AG(E*;E*)0PP. Hint: See problem 12, chap. V, § 1. Show that there is a natural G-gradation in AG(E*; such that 1i is homogeneous of degree zero. 8. Consider the G-graded space LG (E; E) (E is a G-graded vector space). Assign a new gradation to LG (E; E) by setting
degp =
— degp
whenever PELG(E; E) is a homogeneous element. Show that with this new gradation LG (E; E) is again a G-graded space and AG (F; E) is a
G-graded algebra. To avoid confusion, we denote these objects by LG(E; E) and AG(E; E). Prove that the scalar product between LG (E;
E) and EG (E; E) defined
by
= makes these spaces into dual G-graded vector spaces.
9. Let A = Σ_k A_k be a graded algebra and consider the linear mapping θ: A → A defined by

θ x = p x,   x ∈ A_p.

Show that θ is a derivation.

§ 3.* Differential spaces and differential algebras
6.7. Differential spaces. A differential operator in a vector space E is a linear mapping ∂: E → E such that ∂² = 0. The vectors of ker ∂ = Z(E) are called cycles and the vectors of Im ∂ = B(E) are called boundaries. It follows from ∂² = 0 that B(E) ⊂ Z(E). The factor space

H(E) = Z(E)/B(E)

is called the homology space of E with respect to the differential operator ∂. A vector space E together with a fixed differential operator ∂_E is called a differential space.

A linear mapping φ of a differential space (E, ∂_E) into a differential space (F, ∂_F) is called a homomorphism (of differential spaces) if

∂_F ∘ φ = φ ∘ ∂_E.   (6.15)

It follows from (6.15) that φ maps Z(E) into Z(F) and B(E) into B(F). Hence a homomorphism φ induces a linear mapping φ_*: H(E) → H(F). If φ is an isomorphism of differential spaces and φ⁻¹ is the linear inverse isomorphism, then by applying φ⁻¹ on the left and right of (6.15) we obtain

φ⁻¹ ∘ ∂_F = ∂_E ∘ φ⁻¹

and so φ⁻¹ is an isomorphism of differential spaces as well. If ψ is a homomorphism of (F, ∂_F) into a third differential space (G, ∂_G), we have clearly

(ψ ∘ φ)_* = ψ_* ∘ φ_*.

In particular, if φ is an isomorphism of E onto F and ψ = φ⁻¹ is the inverse isomorphism, we have

(φ⁻¹)_* ∘ φ_* = ι   and   φ_* ∘ (φ⁻¹)_* = ι.

Consequently, φ_* is an isomorphism of H(E) onto H(F).
6.8. The exact triangle. An exact sequence of differential spaces is an exact sequence (6.16)
where (F, aF), (E, and (G, are differential spaces, and 'p, are homomorphisms. Suppose we are given an exact sequence of differential spaces then the sequence H(E) H(G) o is exact at H(E). In fact, clearly, = 0 and so Tm * c ker Conversely, let and choose an element yEZ(E) which repre-
sents /3. Then, since z1EG.
is surjective, there is an element y1 E E such that t/iy1 = z1. It follows that — CGZ1 = 0. — CEll) = = — 6G Since
Hence, by exactness at E,
for some XEF. Applying
we
obtain CE(PX = GEl
=0
whence
Since p is injective, this implies that i.e., XEZ(F). Thus x represents an element It follows from the definitions that (p*
= /3
and so
It should be observed that the induced homology sequence is not short exact. However, there is a linear map, y: H(G)—*H(F) which makes the triangle
H(F)-*H(E)
(6.17)
A
H(G)
exact. It is defined as follows: Let sentative of Choose VEE such that =
and let ZEZ(G) be a repreThen = GG
Z
=0
Chapter VI. Gradations and homology
and so there is an element XEF such that = Since
=
=
=
0
and since is injective, it follows that IFX=O and so x represents an element x of H(F). It is straightforward to check that is independent of all choices and so a linear map y: is defined by It is not difficult to verify that this linear map makes the triangle (6.17) exact at 11(G) and H(F). / is called the connecting homomorphism for the short exact sequence (6.16). 6.9. Dual differential spaces. Suppose (E, is a differential space and consider the dual space E* = L(E). Let a* he the dual map of Then for all xEE, X*EE* we have = <x*,0> =
=
0
whence 3*a*x*_o i.e., (o*)2 = 0.
Thus (E*,
is again a differential space. It is called the dual differential
space.
= Z (E*) are called cocycles (for E) and the vectors The vectors of ker of B(E*) are called coboundaries (for E). The factor space
H(E*) = Z(E*)/B(E*) is called the cohoinology space for E. it will now be shown that the scalar product between E and E* determines a scalar product between the homology and cohomology spaces. In view of sec. 2.26 and 2.28 we have the relations
Z(E*)=B(E)L,
Z(E)=B(E*)L
(6.18)
B(E*) = Z(E)-'-,
B(E) = Z(E*)--.
(6.19)
We can now construct a scalar product between H(E) and H(E*). Consider the restriction of the scalar product between E and E* to Z(E) x Z(E*), Z(E) x Then since, in view of (6.18) and (6.19),
z(E*)±
n
Z(E) = B(E)
n
Z(E) = B(E)
§ 3. Differential spaces and differential algebras
and n
181
B(E*) n z(E*) =
z(E*)
the equation
=
a scalar product between H(E) and H(E*) (cf. sec. 2.23). (E*, Finally, suppose (E, and (F, SF)' (F*, CF*) are two pairs of dual differential spaces. Let
defines
ço:E-*F be a homomorphism of differential spaces, with dual map (p*: Then dualizing (6.15) we obtain *
and so
induced
°
*_ F
* * E0(P
is a homomorphism of differential spaces. It is clear that the mappings
and
H(F*)
: H(E*)
are again dual with respect to the induced scalar products; i.e., I
*\,'# — —I
\*
6.10. G-graded differential spaces. Let E be a G-graded space, E= peG
and consider a differential operator in E that is homogeneous of some degree k. Then a gradation is induced in Z(E) and B(E) by
Z(E)=
Z,,(E)
and B(E)=
peG
and
where
peG
(cf. sec. 6.2).
Now consider the canonical projection
Since rr is an onto map and the kernel of is a graded subspace of Z (E), a G-gradation is induced in the homology space H(E) by
H(E)=
where
peG
Chapter VI. Gradations and homology
The Now consider the subspaces and factor space is called the p-ill homology space of the graded In differential space E. It is canonically isomorphic to the space fact, if then denotes the restriction of it to the spaces
is an onto map and the kernel of =
n
is given by
kent =
n
B(E) =
induces a linear isomorphism of if E is a graded space and dim is finite we write
Hence,
onto
=
dim
is called the p-th Betti number of the graded differential space (E,
If E is an almost finite positively graded space, then clearly, so is H(E). The Poincaré series for H(E) is given by P
D
H(E)
P=o
6.11. Dual G-graded differential spaces. Suppose (E= (E* = keG
E,
Ek, c5)
and
kEG is
a pair of dual G-graded differential spaces. Then if 0E
is homogeneous of degree 1, we have that
E1
E1 +
and hence
= <)j,GEXJ> = 0 unless
it follows that
fl
j+i—l
is homogeneous of degree —1. and so Now consider the induced G-gradations in the homology spaces
The induced scalar product is given by (6.22)
Since
and
§ 3. Differential spaces and differential algebras
183
are homogeneous of degree zero, it follows that the scalar product (6.22) respects the gradations. Hence H(E) and H(E*) are again dual G-graded vector spaces. In particular, if G = the p-th homology and cohomology spaces of E are dual. If has finite dimension we obtain that dim
=
dim
6.12. Differential algebras. Suppose that A is an algebra and that 13 is a differential operator in the vector space A. Assume further that an involution w of the algebra A is given such that 13w+w13=O, and that 13 is an antiderivation with respect to i.e., that (6.23)
Then (A, 13) is called a differential algebra.
It follows from (6.23) that the subspace Z(A) is a subalgebra of A. Further, the subspace B(A) is an ideal in the algebra Z(A). In fact, if and 13xeB(A) we have
yeZ(A)
13(xy) = 13xy
and
13(wy.x)= w2 y13x +
y13x
yeZ(A)
whence 13x .yeB(A) 13xeB(A). Hence, a multiplication is induced in the homology space H(A). The space H(A) together with this multiplication is called the homology algebra of the differential algebra (A, 13). The multiplication in H(A) is given by
irz1 nz2 =
z1,z2eZ(A)
m(z1 z2)
where ir:Z(A)—÷H(A) denotes the canonical projection. If A is associative (commutative) then so is H(A). Let (A, 8A) and (B, 0B) be differential algebras. Then a homomorphism A -+B is called a homomorphism of d(fferential algebras if 'P 13A =
13B
It follows easily that the induced mapping : is a homomorphism of homology algebras. Suppose now A is a graded algebra and that 13 and a) are both homogeneous, w of degree zero. The A is called a graded differential algebra. Consider the induced gradation in H(A). Since the canonical projection m:Z(A)—÷H(A) is a homogeneous map of degree zero it follows that H(A) is a graded algebra.
Chapter VI. Gradations and homology
If A is an anticommutative graded algebra, then so is H(A) as follows from the fact that rr is a homogeneous epimorphism of degree zero.
Problems 1. Let (E, 0k), (F, 02) be two differential spaces and define the differby ential operator 8 in 8= Prove that
2. Given a differential space (E, 0), consider a differential subspace; i.e., a subspace E1 that is stable under 0. Assume that Q:E—÷E1 is a linear mapping such that i) ii)
VEE1
iii) QX — XEB(E)
XEZ(E).
Prove that the induced mapping
H(E1) is a linear isomorphism. 3. Let (F, 0) be a differential space. A homotopy operator in E is a linear transformation h:E—*E such that
hO + Oh = i.
Show that a homotopy operator exists in E if and only if H(E)==O. 4. Let (E, be and (F, OF) be two differential spaces and let =1/1* if and only if homomorphisms of differential spaces. Prove that there exists a linear mapping such that liCE + CFh =
—
is called a honiotopy operator connecting çø and //. Show that problem 3 is a special case of problem 4. 5. Let 81, 02 be differential operators in Ewhich commute, 8102=8281. a) Prove that 01 02 is a differential operator in E. b) Let B1, B2, B be the boundaries with respect to 82 and 82. Prove that 02(B1) = 81(B2) = B. h
§ 3. Differential spaces and differential algebras
c) Let Z1, Z2, Z that
Z1 + Z2
be the cycles with respect to 02 and Z. Establish natural linear isomorphisms
Z/Z1 4B1 n
Z2
and
Z/Z2
185
Show
4B2 fl Z1.
d) Establish a natural linear isomorphism (B1 n
n
Z1)/02(Z1)
and then show that each of these spaces is linearly isomorphic to Z/(Z1 + Z2). induces a differential operator in Z2. Let P1 denote e) Show that the corresponding homology space. Assume now that Z=Z1 +Z2 and prove that P1 can be identified with a subspace of the homology space H1 =Z1/B1. State and prove a similar result for 02. f) Show that the results a) to e) remain true if 0102 02 be differential operators in E such that 0102= are differential operators in E. a) Prove that 01+02 and b) With the notation of problem 5, assume that
B=B1nB2 and Z=Z1+Z2. Prove that the homology space is linearly isomorphic to
of each of the differential operators in a)
Z1 n Z2/(Z1 (This
n
B2 + Z2 n B1).
is essentially the Künneth theorem of sec. 2.10
7.
formula. Let E =
be
volume II.)
a finite dimensional
graded
differential space and assume that 0 is homogeneous of degree — be a homomorphism of differential spaces, (1=0 in E0). Let p: homogeneous of degree zero. Denote the restrictions of p (respectively by ço1, (respectively Prove the (respectively to Lefschetz formula —
—
p=O
Conclude
=
p=O
that —
(EP)
p=O
and PH(E). (Euler-Poincaré formula). Express this formula in terms of The number given in (EP). is called the Euler-Poincaré characteristic
of E.
Chapter VII
Inner product spaces In this chapter all vector spaces are assumed to be real vector spaces
§ 1. The inner product 7.1. Definition. An inner product in a real vector space E is a bilinear function (,) having the following properties: 1. Symmetry: (x, y)=(y, x).
2. Positive definiteness: (x, x)O, and (x, x)=O only for the vector x==O.
A vector space in which an inner product is defined is called an inner product space. An inner product space of finite dimension is also called a Euclidean space. The norm xl of a vector xEE is defined as the positive square-root
A unit vector is a vector with the norm 1. The set of all unit vectors is called the unit-sphere. It follows from the bilinearity and symmetry of the inner product that x + y12 = 1x12 + 2(x,y)+ JyV
whence
(x,y) = 4(lx +
y12
-
This equation shows that the inner product can be expressed in terms of the norm. The restriction of the bilinear function (,) to a subspace E1 E has again properties 1 and 2 and hence every subspace of an inner product space is itself an inner product space. The bilinear function (,)is non-degenerate. In fact, assume that (a,y)=O
for a fixed vector aeE and every vector yeE. Setting y=a we obtain (a, a) =0 whence a =0. It follows that an inner product space is dual to itself.
7.2.
§ 1. The inner product
187
Examples. 1. In the real number-space
standard inner pro-
duct is defined by
(x,y) where
x =
and
...
...
y =
2. Let E be an n-dimensional real vector space and basis of E. Then an inner product can be defined by
(v =
1.
.
.n) be a
(x,y) = where
3.
Consider the space C of all continuous functions f in the interval 1 and define the inner product by
(f,g) = Jf(t)g(t)dt. 7.3. Orthogonality. Two vectors xeE and yeE are said to be orthogonal if (x, y)=O. The definiteness implies that only the zero-vector is orthogonal to itself. A system of p vectors +0 in which any two vectors are orthogonal, is linearly independent. In fact, the rex. and lation =0 yields x,1) =
0
=1
...
p)
whence
Two subspaces E1 E and E2 c E are called orthogonal, denoted as E1 ±E2, if any two vectors X1EE1 and x2eE2 are orthogonal.
7.4. The Schwarz-inequality. Let x and y be two arbitrary vectors of the inner product space E. Then the Schwarz-ine quality asserts that (x,y)2
1x121y12
(7.1)
and that equality holds if and only if the vectors are linearly dependent. To prove this consider the function Ix
+
Chapter VII. Inner product spaces
of the real variable 2. The definiteness of the inner product implies that
Expanding the norm we obtain )2
+ 22(x, y) + x12
0.
Hence the discriminant of the above quadratic expression must be negative or zero,
(x,r)2 < Now assume that equality holds in (7.1). Then the discriminant of the quadratic equation + 22(x,y) + lxi2 = 0
221yi2
is zero.*) Hence equation (7.2) has a real solution
(7.2)
It follows that
i2oy + XV = 0, whence 2oY + X = 0.
Thus, the vectors x and y are linearly dependent. 7.5. Angles. Given two vectors x + 0 and y 0, the Schwarz-inequality implies that (x,y) —
1
- lxi -1.
Consequently, there exists exactly one real number w (0
(x y)
cosw=—
w
m) such that (7.3)
.
lxi
The number w is called the angle between the vectors x and y. The sym-
metry of the inner product implies that the angle is symmetric with respect to x and y. If the vectors x and y are orthogonal, it follows that cos co =0, whence U) =
—
2
Now assume that the vectors x and y are linearly dependent, y=2x, Then
2
1+1
121
L—1
if 2>0 if 2<0
*) Without loss of generality we may assume that y
0.
§ 1. The inner product
and hence (0
if 2>0
(ic
if 2<0.
189
With the help of (7.3) the equation fx — y12
= ixi2 — 2(x,y) + iyi2
can be written in the form fx — y12 = 1x12 + 1y12 — 2fxi iylcosw.
This formula is known as the cosine-theorem. If the vectors x and y are
orthogonal, the cosine-theorem reduces to the Pythagorean theorem
ix- y12 = 7.6.
ixi2 + fyi2.
The triangle-inequality. It follows from the Schwarz-inequality
that lx
ixi2 + 2fxf fyI + fyi2
+ yi2 = ixi2 + 2(x,y) + fyi2
(lxi +
whence
x+yf fxi +iyi.
(7.4)
Relation (7.4) is called the triangle-inequality. To discuss the equalitysign we may exclude the trivial case y 0. It will be shown that equality holds in (7.4) if and only if
x—2y, 2>0. The equation
Ix+yi=ixi+iyi implies that lxi
2
+ 2(x,y)+
fyi2
= lxi 2 + 2fxifyf +
whence
(x, y) = lxi fyi.
(7.5)
Thus, the vectors x and y must be linearly dependent,
x=2y. Equations (7.5) and (7.6) yield 2=121, whence 20. Conversely, assume that x = 2y, where 2 0. Then fx
+ yf = (2 + 1)yI = (2 + 1)fyf = 2fyf + iyf = fxf + fyj.
(7.6)
Chapter VII. Inner product spaces
Given three vectors x, y, z, the triangle-inequality can be written in the form
x-yl Ix-zI +lz-yi.
(7.7)
As a generalization of (7.7), we prove the Ptolemy-inequality
ix-yllzl ly-zHxl +lz-xilyi.
(7.8)
Relation (7.8) is trivial if one of the three vectors is zero. Hence we
may assume that x+O, y+O and z+O. Define the vectors x', y' and z' by x
X—2, Y lxi
y
,
z Z—2. ,
2'
Izi
Then
lv'—y'l2=
k—12 xi2 in2'
1
lxl2
lxl2 lyi2
lyl2
Applying the inequality (7.7) to the vectors x', y' and z' lxi
=
we
obtain
Izi lxi'
Izi
whence (7.8).
7.7. The Riesz theorem. Let E be an inner product space of dimension n and consider the space L(E) of linear functions. Then the spaces L(E) and E are dual with respect to the bilinear function defined by
On the other hand, E is dual to itself with respect to the inner product. Hence Corollary II to Proposition I sec. 2.33 implies that there is a linear isomorphism a—*fa of E onto L(E) such that fa(Y) = (a,y). In
other words, every linear function f in E can be written in the form
f(y) and
= (a,y)
the vector aEE is uniquely determined byf(Riesz theorem).
Problems 1. For x =
1
(x,y) satisfies
show that the bilinear function
in
and y = —
the properties listed in sec. 7.1.
—
+
§ 2. Orthonormal bases
191
2. Consider the space S of all infinite sequences that
Show that
•••) such
converges and that the bilinear function (x,
is an inner product. 3. Consider three distinct vectors x40, y+O and z40. Prove that the equation lxi + iz - xi fyi ix izi = iY
-
-
holds if and only if the four points x, y, z, 0 are contained on a circle such
that the pairs x, y and z, 0 separate each other. 4. Consider two inner product spaces E1 and E2. Prove that an inner product is defined in the direct sum E1 EEE2 by x1,y1 eE1,
((x1,x2),(y1,y2)) = (x1,y1) + (x2,y2) 5.
x2,y2eE2.
Given a subspace E1 of a finite dimensional inner product space E,
consider the factor space E/E1. Prove that every equivalence class con-
tains exactly one vector which is orthogonal to E1. § 2.
Orthonormal bases
7.8. Definition. Let E be an n-dimensional inner product space and e_ν (ν = 1 … n) be a basis of E. Then the bilinear function (,) determines a symmetric matrix

g_{νμ} = (e_ν, e_μ)   (ν, μ = 1 … n).   (7.9)

The inner product of two vectors x = Σ_ν ξ^ν e_ν and y = Σ_μ η^μ e_μ can be written as

(x, y) = Σ_{ν,μ} g_{νμ} ξ^ν η^μ   (7.10)

and hence it appears as a bilinear form with the coefficient-matrix g_{νμ}. The basis e_ν (ν = 1 … n) is called orthonormal if the vectors e_ν are mutually orthogonal and have norm 1,

(e_ν, e_μ) = δ_{νμ}.   (7.11)

Then formula (7.10) reduces to

(x, y) = Σ_ν ξ^ν η^ν   (7.12)

and in the case y = x

|x|² = Σ_ν (ξ^ν)².

The substitution y = e_μ in (7.12) yields

ξ^μ = (x, e_μ)   (μ = 1 … n).   (7.13)

Now assume that x ≠ 0, and denote by θ_μ the angle between the vectors x and e_μ. Formulas (7.3) and (7.13) imply that

ξ^μ = |x| cos θ_μ   (μ = 1 … n).   (7.14)

If x is a unit-vector, (7.14) reduces to

ξ^μ = cos θ_μ   (μ = 1 … n).   (7.15)
.
and determine the scalar A such that (h1, h2)=0. This yields
(a2,b1) + A(b1,b1) = 0. Since b1 40, this equation can be solved with respect to A. The vector b2 thus obtained is different from zero because otherwise a1 and a2 would be linearly dependent.
To obtain b3 set
b3=a3 -i-pb1 +-vb2
and determine the scalars p and v such that (b1,b3) =
0
and
(b2,b3) = 0.
§ 2. Orthonormal bases
193
This yields
b1) = 0
(a3, b1) +
and (a3, b2) + v(b2,b2)
0.
Since b1 + 0 and b2 + 0, these equations can be solved with respect to ji and v. The linear independence of the vectors a1, a2, a3 implies that b3 +0. Continuing this way we finally obtain a system of n vectors
such that
+ 0 (v = 1.. .n)
(v+ji).
It follows from the criterion in sec. 7.3, that the vectors are linearly independent and hence they form a basis of E. Consequently the vectors (v
=
= 1 ... n)
form an orthonormal basis. 7.10. Orthogonal transformations. Consider two orthogonal bases and (v = 1.. .n) of E. Denote by the matrix of the basis-transformation (7.16)
The relations and imply that (7.17)
This equation shows that the product of the matrix and the transposed matrix is equal to the unit-matrix. In other words, the transposed matrix coincides with the inverse matrix. A matrix of this kind is called orthogonal.
Hence, two orthonormal bases are related by an orthogonal matrix. Conversely, given an orthonormal basis (v 1. . n) and an orthogonal the basis defined by (7.16) is again orthonormal. n x n-matrix .
7.11. Orthogonal complement. Let E be an inner product space (of finite or infinite dimension) and E1 be a subspace of E. Denote by set of all vectors which are orthogonal to E1. Obviously,
E only.
of the zero-vector
is called the orthogonal complement of E1. If E has finite dimen-
sion, then we have that
dimE1 +dimEiL= dimE 13
the
Greub. Linear Algebra
Chapter VII. Inner product spaces
and hence E1
n E iL
=
0
implies that (7.18)
Select an orthonormal basis and a vector y= of
E1 consider
rn) of E1. Given a vector XEE
1
the difference z
=x
—
y.
Then (z,
= (x,
—
= (x,
(y,
if and only if
This equation shows that z is contained in (R
—
= 1... rn).
We thus obtain the decomposition
x=p+h
(7.19)
where
p is called the orthogonal projection of x onto E1.
Passing over to the norm in the decomposition (7.19) we obtain the relation ix12
= pt2 + hI2.
(7.20)
Formula (7.20) yields Bessel's-inequality lxi
showing that the norm of the projection never exceeds the norm of x. The
equality holds if and only if h=0, i.e. if and only if xeE1. The number ihi is called the distance of x from the subspace E1.
Problems 1. Starting from the basis a1 = (1,0,1) a2 = (2,1,
—
3)
a3 = (— 1,1,0)
construct an orthonormal basis by the Schmidtof the number-space orthogonalization process.
§ 3. Normed determinant functions
195
2. Let E be an inner product space and consider E as dual to itself. Prove that the orthonormal bases are precisely the bases which are dual to themselves.
3. Given an inner product space E and a subspace E, of finite dimension consider a decomposition
x=x1+x2 and the projection
x,eE,
x=p+h
Prove that 1X21
Ihi
that equality is assumed only if x, =p and x2 = h. 4. Let C be the space of all continuous functions in the interval 0 t 1 with the inner product defined as in sec. 7.2. If C' denotes the subspace of all continuously differentiable functions, show that (C1)' = 0. 5. Consider a subspace E1 of E. Assume an orthogonal decomposition and
E,=F1$G, F1±G,. Establish the relations
F,' =
I G1
and
IF1.
=
6. Let F3 be the space of all polynomials of degree inner product of two polynomials as follows: (P,Q)
2. Define the
= f P(t)Q(t)dt.
The vectors 1, t, t2 form a basis in F3. Orthogonalize and orthonormalize this basis. Generalize the result for the case of the space P of polynomials of degree § 3.
Normed determinant functions
7.12. Definition. Let E be an n-dimensional inner product space and Δ₀ ≠ 0 be a determinant function in E. Since E is dual to itself we have, in view of (4.21),

Δ₀(x₁, …, x_n) Δ₀(y₁, …, y_n) = α det((x_i, y_j)),   x_i, y_j ∈ E,

where α is a real constant. Setting x_i = y_i = e_i, where e_i ∈ E is an orthonormal basis, we obtain

α = Δ₀(e₁, …, e_n)²   (7.21)

and so the constant α is positive. Now define a determinant function Δ by

Δ = ± Δ₀/√α.   (7.22)

Then we have

Δ(x₁, …, x_n) Δ(y₁, …, y_n) = det((x_i, y_j)).   (7.23)

A determinant function in an inner product space which satisfies (7.23) is called a normed determinant function. It follows from (7.22) that there are precisely two normed determinant functions Δ and −Δ in E. Now assume that an orientation is defined in E. Then one of the functions Δ and −Δ represents the orientation. Consequently, in an oriented inner product space there exists exactly one normed determinant function representing the given orientation.
7.13. Angles in an oriented plane. With the help of a normed determinant function it is possible to attach a sign to the angle between two vectors of a 2-dimensional oriented inner product space. Consider the normed determinant function Δ which represents the given orientation. Then the identity (7.23) yields

|x|²|y|² − (x, y)² = Δ(x, y)².   (7.24)

Now assume that x ≠ 0 and y ≠ 0. Dividing (7.24) by |x|²|y|² we obtain the relation

((x, y)/(|x||y|))² + (Δ(x, y)/(|x||y|))² = 1.

Consequently, there exists exactly one real number θ mod 2π such that

cos θ = (x, y)/(|x||y|)   and   sin θ = Δ(x, y)/(|x||y|).   (7.25)

This number is called the oriented angle between x and y.

If the orientation is changed, Δ has to be replaced by −Δ, and hence θ changes into −θ. Furthermore it follows from (7.25) that θ changes sign if the vectors x and y are interchanged and that

θ(x, −y) = θ(x, y) + π mod 2π.
7.14. The Gram determinant. Given p vectors x_ν (ν = 1 … p) in an inner product space E, the Gram determinant G(x₁, …, x_p) is defined by

G(x₁, …, x_p) = det((x_ν, x_μ))   (ν, μ = 1 … p).   (7.26)

It will be shown that

G(x₁, …, x_p) ≥ 0   (7.27)

and that equality holds if and only if the vectors x₁, …, x_p are linearly dependent. In the case p = 2, (7.27) reduces to the Schwarz inequality.

To prove (7.27), assume first that the vectors x_ν (ν = 1 … p) are linearly dependent. Then the rows of the matrix (7.26) are also linearly dependent, whence G(x₁, …, x_p) = 0. If the vectors x_ν (ν = 1 … p) are linearly independent, they generate a p-dimensional subspace E₁ of E. E₁ is again an inner product space. Denote by Δ₁ a normed determinant function in E₁. Then it follows from (7.23) that

G(x₁, …, x_p) = Δ₁(x₁, …, x_p)².

The linear independence of the vectors x_ν (ν = 1 … p) implies that Δ₁(x₁, …, x_p) ≠ 0, whence G(x₁, …, x_p) > 0.
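As a quick numeric illustration (not in the original text; numpy assumed, vectors made up), the Gram determinant is a determinant of inner products and its sign behaves as just proved:

```python
import numpy as np

def gram_det(vectors):
    """Gram determinant G(x_1,...,x_p) = det((x_i, x_j)), formula (7.26)."""
    X = np.array(vectors, dtype=float)
    return np.linalg.det(X @ X.T)

x1, x2 = np.array([1.0, 2.0, 0.0]), np.array([0.0, 1.0, 1.0])
assert gram_det([x1, x2]) > 0                    # independent vectors
assert np.isclose(gram_det([x1, 2 * x1]), 0.0)   # dependent vectors
# For p = 2 the inequality G >= 0 is exactly the Schwarz inequality (7.1):
assert np.dot(x1, x2) ** 2 <= np.dot(x1, x1) * np.dot(x2, x2)
```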
7.15. The volume of a parallelepiped. Let p linearly independent vectors a_ν (ν = 1 … p) be given in E. The set

{x = Σ_ν λ^ν a_ν,   0 ≤ λ^ν ≤ 1}   (7.28)

is called the p-dimensional parallelepiped spanned by the vectors a_ν (ν = 1 … p). The volume V(a₁, …, a_p) of the parallelepiped is defined by

V(a₁, …, a_p) = |Δ₁(a₁, …, a_p)|   (7.29)

where Δ₁ is a normed determinant function in the subspace generated by the vectors a_ν (ν = 1 … p). In view of the identity (7.23), formula (7.29) can be written as

V(a₁, …, a_p)² = det((a_ν, a_μ)).   (7.30)
In the case p = 2 the above formula yields

V(a₁, a₂)² = |a₁|²|a₂|² − (a₁, a₂)² = |a₁|²|a₂|² sin² θ,   (7.31)

where θ denotes the angle between a₁ and a₂. Taking the square-root on both sides of (7.31), we obtain the well-known formula for the area of a parallelogram:

V(a₁, a₂) = |a₁||a₂| sin θ.

Going back to the general case, select an integer i (1 ≤ i ≤ p) and decompose a_i in the form

a_i = Σ_{ν≠i} λ^ν a_ν + h_i,   where   (h_i, a_ν) = 0   (ν ≠ i).   (7.32)

Then (7.29) can be written as

V(a₁, …, a_p) = |Δ₁(a₁, …, h_i, …, a_p)|.

Employing the identity (7.23) and observing that (h_i, a_ν) = 0 (ν ≠ i) we obtain*)

V(a₁, …, a_p)² = |h_i|² · G(a₁, …, â_i, …, a_p).   (7.33)

The Gram determinant in this equation represents the square of the volume of the (p − 1)-dimensional parallelepiped generated by the vectors a₁, …, â_i, …, a_p. We thus obtain the formula

V(a₁, …, a_p) = V(a₁, …, â_i, …, a_p) · |h_i|   (1 ≤ i ≤ p),

showing that the volume V(a₁, …, a_p) is the product of the volume of the "base" and the corresponding height.

7.16. The cross product. Let E be an oriented 3-dimensional Euclidean space and Δ be the normed determinant function which represents the orientation. Given two vectors x ∈ E and y ∈ E, consider the linear function f defined by

f(z) = Δ(x, y, z).   (7.34)

In view of the Riesz theorem there exists precisely one vector u ∈ E such that

f(z) = (u, z).   (7.35)

*) The symbol â_i indicates that the vector a_i is deleted.
The vector u is called the cross product of x and y and is denoted by x × y. Relations (7.34) and (7.35) yield

(x × y, z) = Δ(x, y, z).   (7.36)

It follows from the linearity of Δ in x and y that the cross product is distributive,

(λx₁ + μx₂) × y = λ x₁ × y + μ x₂ × y
x × (λy₁ + μy₂) = λ x × y₁ + μ x × y₂,

and hence it defines an algebra in E. The reader should observe that the cross product depends on the orientation of E. If the orientation is reversed then the cross product changes its sign. From the skew symmetry of Δ we obtain that

x × y = − y × x.

Setting z = x in (7.36) we obtain that

(x × y, x) = 0.

Similarly it follows that

(x × y, y) = 0

and so the cross product is orthogonal to both factors.

It will now be shown that x × y ≠ 0 if and only if x and y are linearly independent. In fact, if y = λx, it follows immediately from the skew symmetry that x × y = 0. Conversely, assume that the vectors x and y are linearly independent. Then choose a vector z ∈ E such that the vectors x, y, z form a basis of E. It follows from (7.36) that

(x × y, z) = Δ(x, y, z) ≠ 0

whence x × y ≠ 0.

Formula (7.36) yields for z = x × y

Δ(x, y, x × y) = |x × y|².   (7.37)

If x and y are linearly independent it follows from (7.37) that Δ(x, y, x × y) > 0 and so the basis x, y, x × y is positive with respect to the given orientation.

Finally the identity

(x₁ × x₂, y₁ × y₂) = (x₁, y₁)(x₂, y₂) − (x₁, y₂)(x₂, y₁)   (7.38)

will be proved. We may assume that the vectors x₁, x₂ are linearly independent because otherwise both sides of (7.38) are zero. Multiplying the relations

Δ(x₁, x₂, x₃) = (x₁ × x₂, x₃)   and   Δ(y₁, y₂, y₃) = (y₁ × y₂, y₃)

we obtain in view of (7.23)

(x₁ × x₂, x₃)(y₁ × y₂, y₃) = det((x_i, y_j)).

Setting y₃ = x₁ × x₂ and expanding the determinant on the right hand side by the last row we obtain that

(x₁ × x₂, x₃)(y₁ × y₂, x₁ × x₂) = (x₁ × x₂, x₃)[(x₁, y₁)(x₂, y₂) − (x₁, y₂)(x₂, y₁)].   (7.39)

Since x₁ and x₂ are linearly independent we have that x₁ × x₂ ≠ 0 and hence x₃ can be chosen such that (x₁ × x₂, x₃) ≠ 0. Hence formula (7.39) implies (7.38).

Formula (7.38) yields for x₁ = y₁ = x and x₂ = y₂ = y

|x × y|² = |x|²|y|² − (x, y)².   (7.40)

If θ (0 ≤ θ ≤ π) denotes the angle between x and y we can rewrite (7.40) in the form

|x × y| = |x||y| sin θ,   x ≠ 0, y ≠ 0.

Now we establish the triple identity

x × (y × z) = (x, z)y − (x, y)z.   (7.41)

In fact, let u ∈ E be arbitrary. Then formulae (7.36) and (7.38) yield

(x × (y × z), u) = Δ(x, y × z, u) = − Δ(y × z, x, u) = − (y × z, x × u)
= − (y, x)(z, u) + (y, u)(z, x) = ((x, z)y − (x, y)z, u).

Since u ∈ E is arbitrary, (7.41) follows. From (7.41) we obtain the Jacobi identity

x × (y × z) + y × (z × x) + z × (x × y) = 0.

Finally note that if e₁, e₂, e₃ is a positive orthonormal basis of E we have the relations

e₁ × e₂ = e₃,   e₂ × e₃ = e₁,   e₃ × e₁ = e₂.
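The identities (7.38), (7.40) and (7.41) are easy to confirm numerically with the standard cross product of ℝ³, which realizes the construction above for the standard orientation. A minimal sketch (not part of the original text; numpy assumed, random test vectors):

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, z, u = (rng.standard_normal(3) for _ in range(4))
cross, dot = np.cross, np.dot

# Identity (7.38):
lhs = dot(cross(x, y), cross(z, u))
rhs = dot(x, z) * dot(y, u) - dot(x, u) * dot(y, z)
assert np.isclose(lhs, rhs)

# Triple identity (7.41) and the Jacobi identity:
assert np.allclose(cross(x, cross(y, z)), dot(x, z) * y - dot(x, y) * z)
jac = cross(x, cross(y, z)) + cross(y, cross(z, x)) + cross(z, cross(x, y))
assert np.allclose(jac, 0)
```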
Problems 1. Given a vector a +0 determine the locus of all vectors x such that x—a is orthogonal to x+a. 2. Prove that the cross product defines a Lie-algebra in a 3-dimensional inner product space (cf. problem 8, Chap. V, § 1). 3. Let e be a given unit vector of an n-dimensional inner product space E and E1 be the orthogonal complement of e. Show that the distance of a vector xeE from the subspace E1 is given by
d= (x,e)J. 4. Prove that the area of the parallelogram generated by the vectors x1 and x2 is given by A = — a)(s — b)(s — c), where
a=1x11,
b=1x21, c=1x2—xiI, s=4(a+b+c).
5. Let a +0 and b be two given vectors of an oriented 3-space. Prove that the equation x x a = b has a solution if and only if (a, b) = 0. If this condition is satisfied and x0 is a particular solution, show that the general solution is x0+2a. 6. Consider an oriented inner product space of dimension 2. Given two positive orthonormal bases (e1, e2) and (e1, e2), prove that = e1 cos w — e2 sin co = e1 sin co + e2 cos co. where co is the oriented angle between e1 and ë1.
7. Let a1 and a2 be two linearly independent vectors of an oriented Euclidean 3-space and F be the plane generated by a1 and a2. Introduce an orientation in F such that the basis a1, a2 is positive. Prove that the angle between two vectors and
is determined by the equations —
cosO
=
lxi lyl
and sinO =
lxi lyl
x a21(— ic <0 pr).
Chapter VII. Inner product spaces
2U2
8. Given an orthonormal basis linear transformations
=x x Prove
9.
1, 2, 3) in the 3-space, define
by
(v = 1,2,3).
that
Let zl be a determinant function in the plane. Prove the identity
(x. x1)zl(x,, x3) + (x. x2)4(x3, x1) + (x, x3)zl(x1,
10. Let
x2)= 0
(1= 1. 2, 3) be unit vectors in an oriented plane and denote
by
the
e
formulae
xEE.
cosO13 = cose12 cosO23 — sin e12 sine23 sin e13
11. Let x, y, z
=
cose23 + cosO12
sine23.
be three vectors of a plane such that x and y are linearly
independent and that x+y+z=0.
a)
Prove that the ordered
pairs x, y; y, z and
z,
x represent
the same
orientation. Then show that
O(x,y)+
where
the angles refer
O(y,z) + O(z,x)=
to the above orientation.
b) Prove that O(y, — x) + O(z,
—
y) + O(x, — z) =
What is the geometric significance of the two above relations?
12. Given p vectors x1, ..., G(x1,
x,,
prove the inequality 1x112 x212...
Then derive Hadamard's inequality for a determinant /a11 ... a
§ 4.
Duality in an inner product space
7.18. The isomorphism t. Let E be an inner product space ofdi let E* be a vector space which is dual to E with respect to a scalar
ii and
§ 4. Duality in an inner product space
203
product <,). Since E* and E are both dual to E it follows from sec. 2.33
such that
that there is a linear isomorphism
<xx,y>_—(x,y) x,yEE.
(7.42)
With the aid of this isomorphism we can introduce a positive definite inner product in E* given by y*). (x*, y*) = (r (7.43) 1
1
Now introduce a scalar product in Ex E* by =
(7.44)
Then it follows from (7.42) and (7.44) that
= (x,y) = (y,x) =
and so z is dual to itself. e*v (v = 1. .n) be a pair of dual bases of E and E* and consider Let the matrices = (e*v, and (7.45) = .
It follows from the symmetry of the inner product that the matrices (7.45) are symmetric. On the other hand, the linear isomorphism determines an n x n-matrix by teA
Taking the inner product with
yields
=
= (eA,
=
(7.46)
A similar argument shows that — 1
=
e,2.
From (7.46) and (7.47) we obtain that
s'
pKçK
and hence the matrices (7.45) are inverse to each other.
(7.47)
Chapter VII. Inner product spaces
204
If
x=
(7.48)
is an arbitrary vector of E we can write (7.49)
The numbers are called the Co variant components of x with respect to the basis e1 ().= I .n). It follows from (7.48) that . .
=
=
=
whence
=
(7.50)
We finally note that the covariant components of a vector XEE are its inner products with the basis vectors. In fact, from (7.48) and (7.50) we obtain that (x, e%,) =
=
=
If the basis e%,(v= l...n) is orthonormal we have that formulae (7.46) simplify to It follows that t
= and hence
every orthonormal basis of E into the dual basis. Moreover, the equations (7.50) reduce in the case of an orthonormal basis to
maps
— —
Problems 1. Let 1 ...n) be a basis of E consisting of unit vectors. Given a vector xeE write x = + h, where is the orthogonal projection of x onto the subspace defined by (x, e,) = 0. Show that
i=1...n where the are the covariant components of x with respect to the basis 2. Let E, E* be a dual pair of finite dimensional vector spaces and consider a linear isomorphism Find necessary and sufficient con-
§ 5. Normed vector spaces
205
ditions such that the bilinear function defined by
(x,y)=
x,yeE
be a positive definite inner product. § 5.
Normed vector spaces
7.19. Norm-functions. Let E be a real linear space of finite or infinite dimension. A norm-function in E is a real-valued function ii having the following properties: ii
0 for every x e E, and lix ii = 0 only if x = 0. N1: lxii N2: lix + iixil + iiyii. N3:
=
A linear space in which a norm-function is defined is called a normed linear space. The distance of two vectors x and y of a normed linear space is defined by = lix - yii. N1, N2 and N3 imply respectively
Q(x,y)>O (x, y)
Q(x,y)=
if x+y (x, z) + (z, y)
(triangle inequality)
Q(y,x).
is a metric in E and so it defines a topology in E, called the norm topology. It follows from N2 and N3 that the linear operations are continuous in this topology and so E becomes a topological vector space. 7.20. Examples. 1. Every inner product space is a normed linear space with the norm defined by Hence
iixii
2.
Let C be the linear space of all continuous functions fin the interval 1. Then a norm is defined in C by
= max Ot 1
f(t)l.
Conditions N1 and N3 are obviously satisfied. To prove N2 observe
that
1(t) + g(t)l
If (t)I + ig(t)i
ill ii +
whence
ui +
ii + iigii.
,
(0
t 1)
Chapter VII. inner product spaces
3. Consider an n-dimensional (real or complex) linear space E and let (v 1 .. .n) be a basis of E. Define the norm of a vector
= by 7.21. Bounded linear transformations. A linear transformation p: of a normed space is called bounded if there exists a number M such that
xEE.
(7.51)
It is easily verified that a linear transformation is bounded if and only if it is continuous. It follows from N2 and N3 that a linear combination of bounded transformations is again bounded. Hence, the set B(E; E) of all bounded linear transformations is a subspace of L (E; E). Let be a bounded linear transformation. Then the set = 1 is bounded. Its least upper bound will be denoted by II'pII
= sup
(7.52)
lixil = 1
It follows from (7.52) that
xeE. Now it will be shown that the function (p—* thus obtained is indeed a norm-function in B(E; E). Conditions N1 and N3 are obviously satisfied. To prove N2 let ço and be two bounded linear transformations. Then
xeE and consequently
The norm-function
has the following additional property: <
(7.53)
.
.
In fact, .
XEE
whence (7.53).
7.22. Normed spaces of finite dimension. Suppose now that E is a normed vector space of finite dimension. Then it will be shown that the norm topology of E coincides with the natural topology (cf. sec. 1 .22). Since the linear operations are continuous it has only to be shown that a linear function is continuous in the norm topology. Let (v = I, ..., 11) be a basis
§ 5. Normed vector spaces
207
of E. Then we have in view of N2 and N3 that = This relation implies that the function
is continuous in the natural
topology. Now consider the set Q c E defined by
in the naturaltopology and Q that there exists a positive constant 1)1 such that
for xeQ it follows
xeQ. Now N3 yields
xeE whence
v= 1,...,n.
(7.54)
Let f be a linear function in E. Then we have in view of (7.54) that
M
f(x)l = and sof is continuous. This completes the proof.
Since every linear transformation 'p of E is continuous (cf. sec. 1.22) it follows that 'p is bounded and hence B(E; E)=L(E; E). Thus L(E; E) becomes a normed space, the norm of a transformation 'p being given by kpM = max IIxH =1
Problems 1. Let E be a normed linear space and E1 be a subspace of E. Show that a norm-function is defined in the factor-space E/E1 by
= inf xex 1, 2...) of a normed linear 2. An infinite sequence of vectors space E is called convergent towards x if the following condition holds: To every positive number c there exists an integer N such that
if n>N.
Chapter VII. Inner product spaces
a) Prove that every convergent sequence satisfies the following Cauchycriterion: To every positive number a there exists an integer N such that — Xmh
n > N and in > N.
b) Prove that every Cauchy.sequence*) in a normed linear space of finite dimension is convergent.
c) Give an example showing that the assertion b) is not necessarily correct if the dimension of E is infinite. 3. A normed linear space is called complete if every Cauchy-sequence is convergent. Let E be a complete normed linear space and be a linear
transformation of E such that
<1. Prove that the series
convergent and that the linear transformation
v0
pV is
= has
the following properties:
a) b) 1—
§ 6.
The algebra of quaternions
7.23. Definition. Let E be an oriented Euclidean space of dimension 4. Choose a unit vector e, and let E1 denote the orthogonal complement of e. Let E1 have the orientation induced by the orientation of E and by e (cf. sec. 4.29). Observe that every vector XEE can be uniquely decomposed in the form
x=),e+x1 Now consider the bilinear map E x
X1EE1.
defined by
ex= x, xe= x =
— (x,
v) e + x x v
XEE
(7.55)
x.
(7.56)
where x denotes the cross product in the oriented 3-space E1. It is easily checked (by means of formula (7.41)) that this bilinear map makes E into an associative algebra. This algebra is called the algebra *) i.e. a sequence satisfying the Cauchy-criterion.
§ 6. The algebra of quaternions
209
of quaternions and is denoted by 0+ In view of (7.55), e is the unit element of 0-fl while (7.56) shows that
xy+yx=—2(x,y)e
x,yEE1
and so the algebra 0-fl is not commutative.
The elements of E are called quaternions, and the elements of E1 are
called pure quaternions. The conjugate of a quaternion e R, X1EE1) is defined by
=
C
— X1
It is easily verified that X,yEE and
XEE.
Note that xeE1 if and only if x= —x. Next, let XEO-fl be arbitrary and write
x
=
e
+ (x1, x1) e = (x, x) e =
e.
Similarly,
x=
e.
Thus we have the relations x=
=
x
e.
Now assume that x+O and define x' by 1
X
X.
=
Then the relations above yield
x x' =
x=e
(x + 0).
Thus every non-zero element in has a left and right inverse and is a division algebra. Finally note that the multiplication in satisfies the relations I-fl
so 0-0
0-fl
(x y, x z) = and
z)
(y. x, z. x) = (y,
x, y, ZE0-fl
z)
(7.57)
(7.58)
which are easily verified using (7.36) and (7.38). In particular, we have
= 14
Greub. Linear Algebra
.
yI
x, yE0+
Chapter VII. Inner product spaces
Proposition I: The 3-linear function in E1 given by = (x1 x2, x3)
x1,
X3EE1
is a normed determinant function in E1 and represents the orientation of E1.
Proof: First we show that 4 is skew symmetric. In fact, formulae (7.58) and (7.57) imply that 4(x1, x, x) =
(x1
.
x, x) =(x1, e)
=0
and zl(x, x2, x) = (x . x2, x) = (x2, e) x12 = 0.
Thus zl is skew symmetric. Next observe that 4(x1, x2. x3) = {4(x1, x2, x3) = —
4(x2, x1. x3)}
x x2, x3).
.
This relation shows that 4 is a normed determinant function in E1 and represents the orientation of E1. 7.24. Associative division algebras. Let A be an associative division algebra (cf. sec. 5.1) with unit element e. Observe that the real numbers, the complex numbers and the quaternions form associative division The dimensions of these algebras are respectively 1, 2 algebras over and 4. We shall show that these are the on/i finite dimensional associative division algebras over Let A be any such algebra with unit element e. Let aEA be arbitrary and consider the n + 1 powers
a'(v=0
n,a°=e)
n=dimA.
Since these elements are linearly dependent we have a non-trivial relation C!' = 0
This relation can be written in the form
f(a) = where f is the polynomial given by
f(t)
tv
0
(7.59)
§ 6. The algebra of quaternions
211
By the fundamental theorem of algebra, J is a product of irreducible polynomials of degree 1 and 2. Since A is a division algebra, it follows from (7.59) that a satisfies an equation of the form
p(a)=O where p is of degree two or one. Equivalently, every element a E A satisfies
an equation of the form = —fl2e
(a
Lemma I: If x and v are elements of A satisfying x2 =
—e
and y2 =
— e,
then
Proof? Without loss of generality we may assume that y ± x. Then the elements e, x. v are linearly independent as is easily checked.
Now observe that x+v and x—v satisfy quadratic equations + 2fie = 0
(7.60)
+?(x—y)+2àe=O
(7.61)
(x + y)2 +
and
(x—y) 2
+
Adding these equations and observing that x2 = + ;') x +
y
—
=—e
we obtain
+ 2(fl + —2) e = 0.
Since the vectors e, x and y are linearly independent, the equation above yields
cx+;'=O, fi
=
+
2
and therefore Now equations (7.60) and (7.61) reduce to and
(x+v)2 = —2fle
(7.62)
=
(7.63)
(x — y)2
e.
—
Since the vectors e, x. y are linearly independent the polynomial t2 + 2/i must be irreducible and so /3>0. Similarly, 5 >0. Now the equation j3 + 5 2 implies that 0< /3<2. Finally, relation (7.62) yields
xv + vx = 2(1 Since 0< /3 <2, the lemma follows.
—
/3) e.
Chapter VII. Inner product spaces
Lemma II: Let F be the subset of A consisting of those elements x which satisfy
=
—
for some
e
Then
(i) F is a vector space. (ii) A=(e)EJ3F where (e) denotes the 1-dimensional subspace generated by e. (iii) IfxEF and yEF, then xy+vxE(e) and Xy—VXEF. Now let xcF and VEF. Proof: (i) If XEF, then clearly, ).xEF for We may assume that x2 = — e and y2 = — e. Then Lemma I yields + xv + yx =
(x + 1)2 = x2 +
— 1)
e,
I
1.
It follows that x+yEF. Thus F is a vector space. (ii) Clearly, (e) fl F=O. Finally, let aEA be arbitrary. Then a satisfies an equation of the form (a + e)2 =
— 112
flE
e
Set
and
Then x1E(e), x2EF and x1+x2=a.
(iii) Let XEF and VEF. Again we may assume that x2 = —e and —e. Then, by Lemma I, xy+yxE(e). To show that xy— VXEF observe that xv —
yx =
— (x
+ y)(x —
v)
= (x
—
y)(x + v)
Thus (x v
v
x)2 = — (x + y) (x —
y)2
(x + v).
Since, by (i), x+VEF and x—yEF we have =— e (x + (x—y)2= —f32e
whence (xv — vx)2 =
—
112
e
and so xy—yxEF. 7.25. The inner product in F. Let xEF and yEF. Then, by Lemma II, xy+vxE(e). Thus a symmetric bilinear function, (,), is defined in F by xv + y x =
— 2(x, y) e
Since x2
—
(x, x) e
x, yEF.
(7.64)
§ 6. The algebra of quaternions
it follows that (,)
is
213
positive definite and so it makes F into a Euclidean
space.
On the other hand, again by Lemma II sec. 7.24, x y — y xe F. Hence a F x F—+F is defined by skew symmetric bilinear map
xy—yx=2W(x,y)
X,yEF.
(7.65)
Formulae (7.64) and (7.65) imply that
xy=—(x,y)e+W(x,y)
X,yEF.
(7.66)
Finally, observe that
(xy—yx,y)=O
X,yEF
(7.67)
since
(xy—yx))' +y(xy—vx)= xy2 —y2 x= fl2(xe—ex)=O.
7.26. Theorem: Let A he a finite dimensional associative division algebra over
Then A
C or A
A
9-0.
Proof: We may assume that n2 (n=dim A). If n=2, then F has dimension 1. Choose a vector JE F such that phism A 4 C is given by
= — e.
Then an isomor-
Now consider the case n> 2. Then dim F 2. Hence there are unit vectors e1EF, C2EF such that (e1, e2) = 0.
It follows that e1 e2 + e2 e1 = — 2(e1, e2) e = 0.
Now set e3 =
Then we have
2_
,
e3 — e1 e2 c1
e1
e2.
2 2_ ,— — — e1 e2 —
whence e3EF. Moreover, e1 e3 =
— e2,
e2 e3 =
These relations imply that (e3, e3)= 1 and
(e1, e3) = (e2, e3) = 0.
Thus the vectors e1, e2, e3 are orthonormal.
e1.
e
Chapter VII. Inner product spaces
214
In particular. it follows that dim Now we show that the vectors e1. c2. e3 span F. In fact, let :eF. We may assume that :2 = e. Then —
= :e1 e7 — e1e2:
= [— 2(:, e1) e — e1 :] C2 — C1 [— 2(:. e2) — = e2] = —2(:, e1)e2 + 2(:, e2)e1. On
the other hand, _C3
+
e3
= —2(:, e3)e.
Adding these equations we obtain
= —(, e1) e2 + (-.
—
(:, e3) e
whence
=
(, e1) C1 + L e2) e2 + (,
C3.
This shows that: is a linear combination of e1, e2 and e3. Hence dim F= 3
and so dim A = 4. Finally, we show that A is the quaternion algebra. Let A be the trilinear function in F defined by A(x, y, ) = (V'(x,
p),)
x. y,
:EF.
(7.68)
We show that A is skew symmetric. Clearly, A(x, x, =)=O. On the other hand, formula (7.67) yields A(x,
y)
= (W(x. v), i') =
—
Thus A is a determinant in F. Since A(e1, e2, e3) =
— C2
e3) = (e3, e3) =
1,
A is a norined determinant function in the Euclidean space F. In particular, A specifies an orientation and so the cross product is defined in F. Now relation (7.68) implies that v) = x x v
x, VEF.
Combining (7.66) and (7.69) yields
xi'=—(x,v)e+xxy
X,I'EF.
Since on the other hand,
xe=ex=x and A =
XEA
it follows that A is the algebra of quaternions.
(7.69)
§ 6. The algebra of quaternions
215
Remark: It is a deep theorem of Adams that if one does not require associativity there is only one additional finite dimensional division
algebra over
It is the algebra of Cavley numbers defined in Chap. XI, § 2,
problem 6.
Problems 1. Let a 0 be a quaternion which is not a negative multiple of e. Show that the equation x2 = a has exactly two solutions. (1= 1, 2, 3) are three vectors in E1 prove the identity 2. If Y2
Y3 = — 4(Yl' Y2' y3) e — (y1, y2) y3 + (y1, y3) y2 — (y2,
y1.
3. Let p be a fixed quaternion and consider the linear transformation p given by p x = px. Show that the characteristic polynomial of 'p reads
f(t) = (t2 - 2(p, e) t + p12)2. Conclude that p has no real eigenvalues unless p is a multiple of e.
Chapter VIII
Linear mappings of inner product spaces In this chapter all linear spaces are assumed to be real and to have finite dimension
§ 1. The adjoint mapping 8.1. Definition. Consider two inner product spaces Eand Fand assume that a linear mapping p: E—*Fis given. If E* and F* are two linear spaces
dual to E and F respectively, the mapping ço induces a dual mapping The mappings p and are related by
x>
xe E, y* e F*.
(8.1)
Since inner products are defined in E and in F, these linear spaces can be considered as dual to themselves. Then the dual mapping is a linear mapping of F into E. This mapping is called the adjoint mapping of p and will be denoted by Replacing the scalar product by the inner product in (8.1) we obtain the relation
(px,y)=(x,çzy) xEE,yeF.
(8.2)
In this way every linear mapping p of an inner product space E into an inner product space F determines a linear mapping of F into E. The adjoint mapping of is again p. In fact, the mappings and are related by (8.3)
(y,
Equations (8.2) and (8.3) yield
xeE,yeF = ço. Hence, the relation between a linear mapping and the adjoint mapping is symmetric. As it has been shown in sec. 2.35 the subspaces Im çø and ker are whence
§ 1. The adjoint mapping
217
orthogonal complements. We thus obtain the orthogonal decomposition (8.4)
8.2. The relation between the matrices. Employing two bases (v = 1.. .n) and = 1.. .m) of E and of F, we obtain from the mappings and and two matrices defined by the equations
= and
Substituting
and
in (8.2) we obtain the relation (8.5)
= Introducing the components = (xv, xA)
and
=
YK)
of the metric tensors we can write the relation (8.5) as
Multiplication by the inverse matrix gVQ yields the formula (8.6)
Now assume that the bases normal, =
(v =
1. . .n)
and YL (ji = 1. . . m) are ortho-
=
Then formula (8.6) reduces to
= This relation shows that with respect to orthonormal bases, the matrices of adjoint mappings are transposed to each other. 8.3. The adjoint linear transformation. Let us now consider the case that F= E. Then to every linear transformation 'p of E corresponds an adjoint transformation Since is dual to 'p relative to the inner pro*) The
subscript indicates the row.
218
Chapter VIII. Linear mappings of inner product spaces
duct, it follows that det
= det
and
tr
The adjoint mapping of the product i/is l/ioço
The matrices of
tr ço. is
given by
=
and çø relative to an orthonormal basis are transposes
of each other.
Suppose now that e and ë are eigenvectors of p and
respectively.
Then we have that
çoe=2e and whence in view of (8.2) —
2) (e, e)
= 0.
It follows that (e, e) = 0 whenever 2; that is, any two eigenvectors of p and whose eigenvalues are different are orthogonal. 8.4. The relation between linear transformations and bilinear functions. Given a linear transformation ço:E—*E consider the bilinear function
1i(x,y) = (px,y).
(8.7)
The correspondence p —* c/i defines a linear mapping
Q:L(E; E) denotes the space of bilinear functions in Ex E. It will be shown that this linear mapping is a linear isomorphism of L(E; E) onto B(E, E). To prove that is regular, assume that a certain ço determines the zero-function. Then ('p x, y) =0 for every x e E and every ye E, whence 'p = 0.
It remains to be shown that is a mapping onto B(E, E). Given a bilinear function c/i, choose a fixed vector xeE and consider the linear funcdefined by = cIl(x,y). By the Riesz-theorem (cf. sec. 7.7) this function can be written in the form
L(y) = (x',y) where
the vector x'eE is uniquely determined by x.
§ 1. The adjoint mapping
219
Define a linear transformation ço:E—÷E by
X = X'. Then
xeE,yeE. Thus, there is a one-to-one correspondence between the linear transformations of E and the bilinear functions in E. In particular, the identitymap corresponds to the bilinear function defined by the inner product. Let çji be the bilinear function which corresponds to the adjoint transformation. Then
=(x,py) =
=
((py,x)
This equation shows that the bilinear functions cb and çji from each other by interchanging the arguments. 8.5. Normal transformations. A linear transformation p : normal, if
are
obtained is called (8.9)
The above condition is equivalent to
x,yeE.
(8.10)
In fact, assume that çø is normal. Then
(px,py) =
=(x,p
(x,
=
Conversely, condition (8.10) implies that (y,
=(py,px) =
= (y,p
whence (8.9).
Formula (8.10) is equivalent to xeE. This relation implies that the kernels of p and
coincide,
kerq = Hence, the orthogonal decomposition (8.4) can be written in the form (8.11)
Relation (8.11) implies that the restriction of 'p to Im 'p is regular. Hence, has the same rank as 'p. The same argument shows that all the trans-
formations p"(k=2, 3...) have the same rank as 'p.
Chapter VIII. Linear mappings of inner product spaces
220
It is easy to verify that if ço is a normal transformation then so is p —Ai, Hence it follows that
Ae 11.
ker(q — 2z)=
—
2i).
In other words, çø and have the same eigenvectors. Now the result at the end of sec. 8.3. implies that every two eigenvectors of a normal transformation whose eigenvalues are different must be orthogonal. be a linear transformation and assume that an orthogonal Let ço decomposition the restriction Then p is normal if and only if the subspaces are stable under and the transformations are normal. In fact, assume that p is normal and let be arbitrary. Then we
is given such that the subspaces E1 are stable. Denote by
of p to
have for every
= 0.
=
This implies that The normality of
Thus E, is stable under j+ whence follows immediately from the relation
Conversely, assume that
is stable under
and
that
is normal. Then
we have for every vector that 1px12 =
and
=
=
Vpjxjl2 =
=
so p is normal.
Problems 1. Consider two inner product spaces E and F. Prove that an inner product is defined in the space L(E; F) by
Derive the inequality
and show that equality holds only if
§ 2. Selfadjoint mappings
221
be the adjoint transbe a linear transformation and 2. Let (p: formation. Prove that ifFc:Eis stable under (p, then F' is stable under 3. Prove that the matrix of a normal transformation of a 2-dimensional space with respect to an orthonormal basis has the form fl\
/
)
§ 2.
11
or
Selfadjoint mappings
8.6. Eigenvalue problem. A linear transformation p: E—+E is called p or equivalently
selfadjoint if =
x,yeE.
(cox,y)=(x,coy)
The above equation implies that the matrix of a selfadjoint transformation relative to an orthonormal basis is symmetric. If p: E is a stable subspace
then the orthogonal complement F' is stable as well. In fact, let ZEF' be any vector. Then we have for every yeF
(coz,y) =(z,çoy)= 0 whence pzEF'. It is the aim of this paragraph to show that a selfadjoint transformation of an n-dimensional inner product space E has n eigenvectors which are mutually orthogonal. Define the function F by
F(x)=
(x,px)
x+O.
(x,x)
(8.12)
This function is defined for all vectors x +0. As a quotient of continuous functions, F is also continuous. Moreover, F is homogeneous of degree zero, i.e.
F(2x)=F(x)
(2+0).
(8.13)
Consider the function F on the unit sphere lxi = 1. Since the unit sphere is a bounded and closed subset of E, F assumes a minimum on the sphere lxi = 1. Let e1 be a unit vector such that
F(e1) F(x)
(8.14)
222
Chapter VIII. Linear mappings of inner product spaces
for all vectors
=
1.
Relations (8.13) and (8.14) imply that
F(e1) F(x)
(8.15)
for all vectors x+O. in fact, if x*O is an arbitrary vector, consider the corresponding unit-vector e. Then x=JxIe, whence in view of (8.13)
F(x) = F(e) F(e1). Now it will be shown that e1 is an eigenvector of çø. Let y be an arbitrary vector and define the functionf by
f(t) =
F(e1
+ tj).
(8.16)
Then it follows from (8.15) thatf assumes a minimum at t=O, whence f' (O)=O. inserting the expression (8.12) into (8.16) we can write (e1 +
ty,e1 + ty) Ditlèrentiating this function at t=O we obtain
f'(O) =
+
—
(8.17)
Since ço is selfadjoint,
= and hence equation (8.17) can be written as
f'(O) = 2((pe1,y) —
2(e1,çoe1)(e1,y).
(8.18)
We thus obtain (8.19)
for every vector yEE. This implies that pe1
i.e. e1 is an eigenvector of 'p and the corresponding eigenvalue is = (e1,'pe1).
8.7. Representation in diagonal form. Once an eigenvector of 'p has been constructed it is easy to find a system of ii orthogonal eigenvectors. In fact, consider the 1-dimensional subspace (e1) generated by e1. Then (e1) is stable under 'p and hence so is the orthogonal complement E1 of
§ 2. Selfadjoint mappings
223
Clearly the induced linear transformation is again selfadjoint and hence the above construction can be applied to E1. Hence, there exists an eigenvector e2 such that (e1, e2)=0. Continuing this way we finally obtain a system of n eigenvectors (v = 1.. .n) such that (e1).
e,4) =
form an orthonormal basis of E. In this basis the mapping p has the form The eigenvectors
(8.20)
denotes the eigenvalue of These equations show that the matrix of a selfadjoint mapping has diagonal form if the eigenvectors are used as a basis. 8.8. The eigenvector-spaces. If 2 is an eigenvalue of 'p, the corresponding eigen-space E(2) is the set of all vectors x satisfying the equation px=2x. Two eigen-spaces E(2) and E(2') corresponding to different eigenvalues are orthogonal. In fact, assume that where
çø e = 2 e
p e' = 2' e'.
and
Then (e', 'p e) = 2 (e, e') and
(e, 'p e')
=
2' (e,
e').
Subtracting these equations we obtain
(2'— 2)(e,e') = 0, whence (e, e') = 0 if 1' + 2. Denote by (v = . .r) the different eigenvalues of 'p. Then every two eigenspaces and are orthogonal. Since every vector xeE can be written as a linear combination of eigenvectors it follows that the direct sum of the spaces is E. We thus obtain the orthogonal decomposition 1
.
(8.21)
Let
be the transformation induced by 'p in ço,x
Then
xeE(23.
=
This implies that the characteristic polynomial of
det
—
2i) = (2
—
2)ki
(1 =
is given by 1
... r)
(8.22)
where k. is the dimension of E(21). It follows from (8.21), and (8.22)
224
Chapter VIII. Linear mappings of inner product spaces
that the characteristic polynomial of det((p —)
1)
is equal to the product
— 2)k1
=
(2r
—
(8.23)
The representation 8.23 shows that the characteristic polynomial of a selfadjoint transformation has n real zeros, if every zero is counted with its multiplicity. As another consequence of (8.23) we note that the dimenin sion of the eigen-space is equal to the multiplicity of the zero the characteristic polynomial. 8.9. The characteristic polynomial of a symmetric matrix. The above has n real eigenresult implies that a symmetric n x n-matrix A = values. In fact, consider the transformation =
(v =
1
... n)
(v = 1.. .n) is an orthonormal basis of E. Then çø is selfadjoint and hence the characteristic polynomial of p has the form (8.23). At the same time we know that
where
det ('p — 2 i)
= det (A — 2 J).
(8.24)
Equations (8.23) and (8.24) yield det(A — 2J) =
— 2)k1 ...
—
8.10. Eigenvectors of bilinear functions. in sec. 8.4 a one-to-one correspondence between all the bilinear functions qi in E and all the linear transformations 'p E-+E has been established. A bilinear function and the corresponding transformation 'p are related by the equation :
cli(x,y)=(çox,y)
x,yeE.
Using this relation, we define eigenvectors and eigenvalues of a bilinear function to be the eigenvectors and eigenvalues of the corresponding transformation. Let e be an eigenvector of and 2 be the corresponding eigenvalue. Then çJi
(e,
y) =
('p e,
y) =
2 (e,
y)
(8.25)
for every vector yeE. Now assume that the bilinear function k is symmetric,
cli(x,y)= cli(y,x). Then the corresponding transformation 'p
is
selfadjoint. Consequently,
§ 2. Selfadjoint mappings
225
there exists an orthonormal system of ii eigenvectors
(v = 1... n).
=
(8.26)
This implies that cP
2v
Hence, to every symmetric bilinear function in E there exists an orthonormal basis of E in which the matrix of P has diagonal-form.
Problems 1. Prove by direct computation that a symmetric 2 x 2-matrix has only real eigenvalues. 2. Compute the eigenvalues of the matrix
-1 —1 —2 2
L 3.
Find the eigenvalues of the bilinear function v*1L
4. Prove that the product of two selfadjoint transformations p and i,1i is selfadjoint if and only if cli p = 'p 0k/i. 5. A selfadjoint transformation 'p is called positive, if
(x,çox) 0 for every xEE. Given a positive selfadjoint transformation ço, prove that there exists exactly one positive selfadjoint transformation i/i such that
6. Given a selfadjoint mapping çø, consider a vector be(ker p)'. Prove that there exists exactly one vector aEker p' such that çoa=b. 7. Let 'p be a selfadjoint mapping and let 1 ...n) be a system of n orthonormal eigenvectors. Define the mapping 'pA by 'pA = 'p — 2t
where 2
is
a real parameter. Prove that PA
provided that 2 15
Greub. Linear Algebra
is
not an eigenvalue of 'p.
xeE.
Chapter VIII. Linear mappings of inner product spaces
226
8. Let p be a linear transformation of a real n-dimensional linear space E. Show that an inner product can be introduced in E such that becomes a selfadjoint mapping if and only if has n linearly independent eigenvectors. 9. Let be a linear transformation of E and the adjoint map. Denote the norm of which is induced by the Euclidean norm of E (cf. by sec. (7.19)). Prove that = where 2 is the largest eigenvalue of the selfadjoint mapping
çø.
10. Let ço be any linear transformation of an inner product space E. Prove that ço is a positive self-adjoint mapping. Prove that
if and only if ço is regular.
11. Prove that a regular linear transformation 'p of a Euclidean space can be uniquely written in the form (p = 0o1
is a positive selfadjoint transformation and z is a rotation. Hint: Use problems 5 and 10. (This is essentially the unitary trick of
where
Weyl).
§ 3.
Orthogonal projections
8.11. Definition. A linear transformation rc:E—÷E of an inner product space is called an orthogonal projection if it is selfadjoint and satisfies the condition it2 = it. For every orthogonal projection we have the orthogonal decomposition
E = ker it
Im it
and the restriction of it to Im it is the identity. Clearly every orthogonal
projection is normal. Conversely, a normal transformation 'p which satis= = 'p is an orthogonal projection. In fact, since fies the relation we can write
x='px+x1
x1eker'p.
Since 'p is normal we have that ker 'p = ker
and so it follows that
x1eker Hence
we obtain for an arbitrary vector yEE
(x,'py)=('px,'py) +(x1,q,y) =(px,'py)
= (px,'py)
§ 3. Orthogonal projections
227
whence
(x,(py) =
(y,çox).
It follows that p is selfadjoint. To every subspace E1 E there exists precisely one orthogonal projection it such that Im it=E1. It is clear that iris uniquely determined byE1. and define it by To obtain it consider the orthogonal complement iry = y,yeE1;
irz =
Then it is easy to verify that it2 = and = Consider two subspaces E1 and E2 of E and the corresponding orthogonal projections it 1: E—+ E1 and it2 E—3 E2. It will be shown that it2 it1 =0 if and only if E1 and E2 are orthogonal to each other. Assume first that =0. ConE1 ±E2. Then it1xEE± for every vector xeE, whence it2 versely, the equation =0 implies that it1 xeE± for every vector xeE, whence E1EE±. 8.12. Sum of two projections. The sum of two projections it1 and it2:E-+E2 is again a projection if and only if the subspaces E1 and E2 are orthogonal. Assume first that E1 ±E2 and consider the transformation Then it.
it
:
it(x1+x2)=itx1+irx2=x1+x2 Hence, it reduces
x1eE1,x2eE2.
to the identity-map in the sum E1$E2. On the other
hand,
itx=0
if
is the orthogonal complement of the sum n it is the projection of E onto Conversely, assume that it1 + it2 is a projection. Then But
(it1 + it2)2 =
it1
and hence
+ it2,
whence it1 This
it2 + it2 it1 =
0.
(8.27)
equation implies that
it1oit2oit1+it2oit1=0
(8.28)
=0.
(8.29)
and
it1oit2 + Adding
(8.28) and (8.29) and using (8.27) we obtain
=0.
(8.30)
Chapter VIII. Linear mappings of inner product spaces
The equations (8.30) and (8.28) yield
0.
7C2oit1
This implies that E1 I E2, as has been shown at the end of sec. 8.11. 8.13. Difference of two projections. The difference it1 — it2 of two proand it2 : E-+E2 is a projection if and only if E2 is a subjections it1 : space of E1. To prove this, consider the mapping
= t —(itt —it2)=(i —it1)+ it2. it follows that p is a projection if Since i —itt is the projection If this condition is fuland only if i.e., if and only if E1 This implies that filled, p is the projection onto the subspace —it2 = i —p is the projection onto the subspace n
This subspace is the orthogonal complement of E2 relative to E1.
8.14. Product of two projections. The product of two projections it1 : E—+E1 and it2 E—*E2 is an orthogonal projection if and only if the projections commute. Assume first that it2 it1 = it1 it2. Then it2 it1 x = it2 x = x for every vector
XE E1 fl E2.
(8.31)
On the other hand, it2 it1 reduces to the zero-map in the subspace (E1 n E2)' =
+ E±. In fact, consider a vector X=
+ X±
Then it2it1X =
=
+
+ it1
= 0.
Equations (8.31) and show that it2 0it1 is the projection Conversely, if it2 cit1 is a projection, it follows that it2
=
7t1 OTt2
(8.32) n E2.
= it1
Problems 1. Prove that a subspace JcE if and only if
J=Jn
is
stable under the projection it:E—+E1
§ 4. Skew mappings
229
2. Prove that two projections it1 : E—+E1 and it2 : E—+E2 commute if and only if
E1 +E2 =E1 n E2 +E1 fl
E at a subspace E1 is defined by
ox = where
x=p+h(peE1,
tion it:
p
—
h
Show that the reflection
and the projec-
are related by
= 2ir — 4. Consider linear transformation of a real linear space E. Prove that an inner product can be introduced in E such that p becomes an orthogonal projection if and only if p2 = 5. Given a selfadjoint mapping p of E, consider the distinct eigenvalues denotes and the corresponding eigenspaces 1 ...r). If the orthogonal projection prove the relations:
a)
(i+j).
b) c)
§ 4. Skew mappings
8.15. Definition. A linear transformation in E is called skew if —!i. The above condition is equivalent to the relation
x,yeE.
(8.33)
It follows from (8.33) that the matrix of a skew mapping relative to an orthonormal basis is skew symmetric. Substitution of y=x in (8.33) yields the equation
(x,t/ix)=O
xeE
(8.34)
showing that every vector is orthogonal to its image-vector. Conversely, a transformation having this property is skew. In fact, replacing x by x+y in (8.34) we obtain = 0, (x + + whence (y,
x) + (x, t/i v) = 0.
Chapter VIII. Linear mappings of inner product spaces
23()
It follows from (8.34) that a skew mapping can only have the eigenvalue I;. =0. The relation /i= —p/i implies that
tn/i =
0
and
den/i = The last equation shows that
(—
detk/J = 0
if the dimension of E is odd. More general, it will now be shown that the rank of a skew transformation is always even. Since every skew mapping is normal (see sec. 8.5) the image space is the orthogonal complement of the kernel. Consequently, the induced transformation : Im k/i—*Im is regular. Since is again skew, it follows that the dimension of Im must be even. It follows from this result that the rank of a skew-symmetric matrix is always even. 8.16. The normal-form of a skew symmetric matrix. In this section it will be shown that to every skew mapping cli an orthonormal basis (v = 1 n) can be constructed in which the matrix of i/i has the form .
. .
0
K1
— K1
0
(8.35) —
0
=
Consider the mapping Then 8.7, there exists an orthonormal basis 'p
=
=
According to the result of sec. (v = 1 ii) in which 'p has the form . . .
(v = I ... ii)
All the eigenvalues L, are negative or zero. In fact, the equation
pe=i.e
eJ=l
§ 4. Skew mappings
231
implies that 2
= (e,(pe) =
=
—
0.
(t/ie,çl'e)
has the same rank as i/i, the rank of Since the rank of is even and must be even. Consequently, the number of negative eigenvalues is even and we can enumerate the vectors (v = 1 . n) such that (p
. .
(v=2p+ in).
(v= l...2p) and Define the orthonormal basis
(v = 1. . ii) by .
(v=1...p) and
(v=2p+1...n).
In this basis the matrix of
has the form (8.35).
Problems 1. Show that every skew mapping (p of a 2-dimensional inner product space satisfies the relation (p
(p
(x, y).
2. Skew transfbrmations of 3-space. Let E be an oriented Euclidean 3-space.
(i) Show that every vector aEE determines a skew transformation of E given by (ii) Show that
x x.
a, bEE. = — (iii) Show that every skew map p: E—÷E determines a unique vector aEE such that given by Hint: Consider the skew 3-linear map P: E x E x
P(x, v, :)=(px, v) :+(pv. :) .v+(p:, x) y and choose aEE to be the vector determined by v. :) = zl(x, v, :)a
sec. 4.3, Proposition II). (iv) Let e1. e2, e3 be a positive orthonorma! basis of E. Show that the vector a of part (iii) is given by (cf.
a= where
C1 +
C2 +
C3
is the matrix of (p with respect to this basis.
Chapter VIII. Linear mappings of inner product spaces
232
3. Assume that p4O and are two skew mappings of the 3-space having the same kernel. Prove that i/i = ). çø where is a scalar. 4. Applying the result of problem 3 to the mappings çox = (a1 x a2) x x and
— a1(a2,x)
x= prove
the formula (a1 x a2) x a3 = a2(a1,a3) — a1(a2,a3).
E—*E satisfies the relation that a linear transformation if and only if p is selfadjoint or skew. 6. Show that every skew symmetric bilinear function P in an oriented 5. Prove
3-space E can be represented in the form
cP(x,y)=(x xy,a) and that the vector a is uniquely determined by 'P. 7. Prove that the product of a selfadjoint mapping and a skew mapping has trace zero. 8. Prove that the characteristic polynomial of a skew mapping satisfies the equation
x(- = (-
From this relation derive that the coefficient 0f is zero for every odd v. 9. Let çø be a linear transformation of a real linear space E. Prove that an inner product can be defined in E such that ço becomes a skew mapping if and only if the following conditions are satisfied: 1. The space E can be decomposed into ker 'p and stable planes. 2. The mappings which are induced in these planes have positive determinant and trace zero. 10. Given a skew symmetric 4 x 4-matrix A = verify the identity
detA = §
5.
+
+
Isometric mappings
8.17. Definition. Consider two inner product spaces E and F. A linear mapping 'p:E—*F is called isometric if the inner product is preserved under
('px1,'px2)=(x1,x2) x1,x2eE. Setting x1 =
= x we find
kpxl=IxI xeE.
§ 5. Isometric mappings
Conversely, the above relation implies that
=
+
x2)12
is
233
isometric. In fact,
—
—
= x1 + x2f2 — 1x112 — 1x212 = 2(x1,x2).
an isometric mapping preserves the norm it is always injective. We assume in the following that the spaces E and F have the same dimension. Then every isometric mapping ço:E—÷Fis a linear isomorphism of E onto F and hence there exists an inverse isomorphism p 1: F—FE. The isometry of implies that Since
((px,y)=(x,p'y)
xeE,yeF,
whence (8.36)
Conversely every linear isomorphism çü satisfying the equation (8.36) is isometric. In fact,
= (x1,x2).
=
(x1,
The image of an orthonormal basis
(v =
1. . . n)
of E under an isometric
mapping is an orthonormal basis of F. Conversely, a linear mapping which sends an orthonormal basis of E into an orthonormal basis (v = 1.. .n) of F is isometric. To prove this, consider two vectors
and then
and whence
(pxl,px2) =
=
=
=
It follows from this remark that an isometric mapping can be defined between any two inner product spaces E and F of the same dimension: Select orthonormal bases and (v = 1. n) in E and in F respectively . .
and define ço by
8.18. The condition for the matrix. Assume that an isometric mapping lii) we obtain from p:E—*Fis given. Employing two bases and ço an n x n-matrix by the equations =
234
Chapter VIII. Linear mappings of inner product spaces
Then the equations ((p
(p aM) =
aM)
can be written as (b), bK) = (as,, aM).
Introducing the matrices
=
and
(au,
h)K
(b1, bK)
we obtain the relation
=
(8.37)
Conversely, (8.37) implies that the inner products of the basis vectors are preserved under (p and hence that (p is an isometric mapping. If the bases and are orthonormal,
=
=
,
relation (8.37) reduces to
= showing that the matrix of an isometric mapping relative to orthonormal
bases is orthogonal. 8.19. Rotations. A rotation of an inner product space E is an isometric mapping of E into itself. Formula (8.36) implies that
(detp)2 =
1
showing that the determinant of a rotation is ± 1. A rotation is called proper if det (p = + 1 and improper if det (p = — 1. imEvery eigenvalue of a rotation is ± 1. In fact, the equation plies that let = let, whence )= ± 1. A rotation need not have eigenvecvectors as can already be seen in the plane (cf. sec. 4, 17). Suppose now that the dimension of E is odd and let be a proper rotation. Then it follows from sec. 4.20 that has at least one positive eigenvalue ).. On the other hand we have that ). = ± 1 whence 2= 1. Hence, every proper rotation of an odd-dimensional space has the eigenvalue 1. The corresponding eigenvector e satisfies the equation (pe=e; that is, e A similar argument shows that to every imremains invariant under proper rotation of an odd-dimensional space there exists a vector e such that pe= —e. If the dimension of E is even, nothing can be said about
§ 5. Isometric mappings
235
the existence of eigenvalues for a proper rotation. However, to every im-
proper rotation, there is at least one invariant vector and at least one vector e such that çoe= —e (cf. sec. 4.20). Let ço:E-+E be a rotation and assume that FcE is a stable subspace. Then the orthogonal complement F' is stable as well. In fact, if z is arbitrary we have for every yeF (coz,y) =
=
0
whence çozeF'.
The product of two rotations is obviously again a rotation and the inverse of a rotation is also a rotation. In other words, the set of all rotations of an n-dimensional inner product space forms a group, called the general orthogonal group. The relation det (p2 ° pi) = det c°2 det q1
implies that the set of all proper rotations forms a subgroup, the special orthogonal group.
A linear transformation of the form 2 ço where 2 0 and p is a proper rotation is called homothetic. 8.20. Decomposition into stable planes and straight lines. With the aid of the results of § 2 it will now be shown that for every rotation 'p there exists an orthogonal decomposition of E into stable subspaces of dimension 1 and 2. Denote by E1 and E2 the eigenspaces which correspond to the eigenvalues 2 + 1 and 2 = — 1 respectively. Then E1 is orthogonal to E2. In fact, let x1 eE1 and x2eE2 be two arbitrary vectors. Then
'px1=x1 and 'px2=—x2. These
equations yield
(x1,x2) = — (x1,x2), whence (x1, x2)=0. It follows from sec. 8.19 that the subspace F= (E1 is again stable under 'p. Moreover, F does not contain an eigenvector of 'p and hence F has even dimension. Now consider the selfadjoint mapping
of F. The result of sec. 8.6 assures that there exists an eigenvector e of i/i. If 2 denotes the corresponding eigenvalue we have the relation çoe
+
'p' e = 2e.
Chapter VIII. Linear mappings of inner product spaces
236
Applying p we obtain ço2e=)Lçce—e.
(8.38)
Since there are no eigenvectors of çø in F the vectors e and e are linearly independent and hence they generate a plane F1. Equation (8.38) shows
that this plane is stable under 'p. The induced mapping is a proper rotation (otherwise there would be eigenvectors in F1). F is again stable The orthogonal complement under and hence the same construction can be applied to Fj'-. Continuing in this way we finally obtain an orthogonal decomposition ofF into mutually orthogonal stable planes. Now select orthonormal bases in E1, 122 and in every stable plane. These bases determine together an orthonormal basis of E. In this basis the matrix of 'p has the form
cosO1
sin 01
—sin01
cos01
±1
—
where
(i =
1.
. .
k) denotes
(v =
1
...p)
2k=n—p
COSOk
sinOk
sin °k
C05 °k
the corresponding rotation angle (cf. sec. 8.21).
Problems 1. Given a skew transformation k/I of E, prove that
is a rotation and that — is not an eigenvalue of 'p. Conversely, if 'p is a rotation, not having the eigenvalue — I prove that =('p — i)o('p + 1)-i is a skew mapping. 2. Let 'p be a regular linear transformation of a Euclidean space E such that ('p x, coy) = 0 whenever (x, y) = 0. Prove that 'p is of the form 'p = )L t, ).+O where z is a rotation. 1
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
237
3. Assume two inner products qi and 'P in E such that all oriented angles with respect to and 'I' coincide. Prove that 'P(x, y) where 2>0 is a constant. 4. Prove that every normal non-selfadjoint transformation of a plane is homothetic. 5. Let be a mapping of the inner product space E into itself such that (pO=0 and çox — (P11 = lx — x. VEE.
Prove that
is then linear.
there exists a continuous 6. Prove that to every proper rotation family (p(0t I) of rotations such that and (P1 = 7. Let be a linear automorphism of an n-dimensional real linear space E. Show that an inner product can be defined in E such that p becomes a rotation if and only if the following conditions are fulfilled:
(i) The space E
can
be decomposed into stable planes and stable
straight lines. (ii) Every stable straight line remains pointwise fixed or is reflected at the point 0.
(iii) In every irreducible invariant plane a linear automorphism induced such that and tn/il <2. =
is
1
8. If
is a rotation of an n-dimensional Euclidean space, show that
Jtrcoln. 9. Prove that the characteristic polynomial of a proper rotation satisfies the relation =(
—
)4nf(2—
10. Let E be an inner product space of dimension n>2. Consider a proper rotation t which commutes with all proper rotations. Prove that = i where c = if ii is odd and = ± if n is even. 1
§ 6.
1
Rotations of Euclidean spaces of dimension 2,3 and 4
8.21. Proper rotations of the plane. Let E be an oriented Euclidean plane and let zl denote the normed positive determinant function in E. is determined by the equation Then a linear transformation j: zi (x, i') = (j
i)
x, v E.
Chapter VIII. Linear mappings of inner product spaces
238
The transformation / has the following properties:
I) (jx,v)+(x,jv)=O (jx,jv)=(x,v)
2)
3)
4)
detj=1.
In fact, 1) follows directly from the definition. To verify 2) observe that the identity (7.24), sec. 7.13, implies that (x, y)2 + (/x,v)2 y
= IxV
x we obtain, in view of 1),
(jx, jx)2 = j is injective (as follows from the definition) we obtain
= imply that
Now the relations .1= —.1 and
=—
Finally, we
obtain from 1), 2), 3) and from the definition of /
4(jx,iv) = (/2 x, iv) = — (x,jy) = (jx, v) = zl(x,
y)
whence
det j =
1.
The transformation / is called the canonical complex structure of the oriented Euclidean plane E. Next, let p be any proper rotation of E. Fix a non-zero vector x and denote by 0 the oriented angle between x and p x (cf. sec. 7.1 3). It is determined mod 2 it by the equation
px = x cos 0 +j(x) sin 0.
(8.39)
We shall show that cos 0 = tr 'p
and
sin 0 = 4 tr(jo p).
(8.40)
In fact, by definition of the trace we have
zl(p x, y) + 4(x, py) = 4(x, y). tr 'p. Inserting (8.39) into (8.41) and using properties 2) and 3) of j
(8.41)
we
obtain
2cos0 .A(x, v)=zl(x,y) . trqi and so the first relation (8.40) follows. The second relation is proved in a similar way.
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
239
Relations (8.40) show that the angle (9 is independent of x. It is called the rotation angle of and is denoted by e((p). Now (8.39) reads
cosO((p)+j(x) sinê((p)
XEE.
In particular, we have
O(i)=0,
/ix=x
&(—i)=m, cos
sin
a second proper rotation with rotation angle shows that is
a
simple calculation
= x cos(&(p)+ (9(h)) +j(x) sin((9(p)+ Thus any two proper rotations commute and the rotation angle for their
product is given by (p) = (9((p) +
(mod 2it).
(8.42)
Finally observe that if e1, e2 is a positive orthonormal basis of E then we have the relations (p e1 e1 cos O(p) + e2 sin O((p) (p e2 = — e1 sin O(p) + e2 cos O(p).
Remark: If E is a non-oriented plane we can still assign a rotation angle to a proper rotation (p. It is defined by the equation
cos 9(p) = tr p
(0
it).
(8.43)
Observe that in this case (9(p) is always between 0 and it whereas in the oriented case 9(p) can be normalized between —it and it. 8.22. Proper rotations of 3-space. Consider a proper rotation p of a 3-dimensional inner product space E. As has been shown in sec. 8.19,
there exists a 1-dimensional subspace E1 of E whose vectors remain fixed. If p is different from the identity-map there are no other invariant vectors
(an invariant vector is a vector xeE such that (px=x). In fact, assume that a and b are two linearly independent invariant vectors. Let c(c+0) be a vector which is orthogonal to a and to b. Then (pc—2c where 2= ± 1. Now the equation det (p= implies that 2= + 1 showing that (p is the identity. In the following discussion it is assumed that +1. Then the invariant vectors generate a i-dimensional subspace E1 called the axis of (p. 1
240
Chapter VIII. Linear mappings of inner product spaces
To determine the axis of a given rotation (p consider the skew mapping (8.44)
and introduce an orientation in E. Then
can
t/ix=u xx
be written in the form
UEE
(8.45)
(cf. problem 2, § 4). The vector u which is uniquely determined by (p iS
called the rotation-vector. The rotation vector is contained in the axis of (p. In fact, let a+O be a vector on the axis. Then equations (8.45) and (8.44) yield (8.46)
showing that u is a multiple of a. Hence (8.45) can be used to find the rotation axis provided that the rotation vector is different from zero. This exceptional case occurs if and only if = i.e. if and only if Then (p has the eigenvalues 1, —1 and — 1. In other words, is (p = (p a reflection at the rotation axis. 8.23. The rotation angle. Consider the plane F which is orthogonal to E1. Then transforms F into itself and the induced rotation is again proper. Denote by 0 the rotation angle of (p1 (cf. remark at the end of see 8.21). Then, in view of (8.43),
cosO =
Observing that trço =
+1
trço1
we obtain the formula cosO = 3-(trp
1).
To find a formula for sin 0 consider the orientation of Fwhicl' is induced by E and by the vector u (cf. sec. 4.29)*). This orientation is represented by the normed determinant function
A1(y,z)= lul
where A is the normed determinant function representing the orientation
of E. Then formula (7.25) yields
sinO=A1(y,(py)='A(u,y,(py) lul
*) It is assumed that u
0.
(8.47)
§ 6. Rotations of Euclidean spaces of dimension 2. 3 and 4
241
where y is an arbitrary unit vector of F. Now
A(u,y,çoy)= = 4(u,co' y,y)= — A(u,y,ço1y) and hence equation (8.47) can be written as (8.48) lul
By adding formulae (8.47) and (8.48) we obtain (8.49)
lxi
21u1
Inserting the expression (8.45) in (8.49), we thus obtain
x y)=
sin0 = lul Since
x lul
(8.50)
y is a unit-vector orcnogonal to u, it follows that iu
xyl=iuilyl=lui
and hence (8.50) yields the formula
sin0=iui. This equation shows that sin 0 is positive and hence that 0<0< it if the above orientation of F is used. Altogether we thus obtain the following geometric significance of the rotation-vector u: 1. u is contained in the axis of 'p. 2. The norm of u is equal to sin 0. 3. If the orientation induced by u is used in F, then 0 is contained in the interval 0<0< it. 1 has obviously the Let us now compare the rotations 'p and p 1= same axis as 'p. Observing that we see that the rotation vector of 1 is — u. This implies that the inverse rotation induces the inverse orientation in the plane F. To obtain an explicit expression for u select a positive orthonormal be the corresponding matrix of 'p. Then ,/i basis e1, e2, e3 in E and let has the matrix — = 16
Greub. Linear Algebra
Chapter VIII. Linear mappings of inner product spaces
242
and the components of u are given by
It should be observed that a proper rotation is not uniquely determined by its rotation vector. In fact, if and (p2 are two rotations about the same axis whose rotation angles satisfy 01 +02 = it, then the rotation coincide. Conversely, if ç1 and p2 have the same and vectors of rotation vector, then their rotation axes coincide and the rotation angles satisfy either 01 =02 or 01 + 02 = it. Since cos(m—0)= —cos0 it follows from the remark above that a rotation is completely determined by its rotation vector and the cosine of the rotation angle. 8.24. Proper rotations and quaternions. Let E be an oriented 4-dimensional Euclidean space. Make E into the algebra of quaternions as described in sec. 7.23. Fix a unit vector a and consider the linear transformation
(px=ax
Then
(p is
xeE.
a proper rotation. In fact, (pxl
= laxl= al xl = xl.
is proper choose a continuous map f from the closed unit interval to S3 such that
To show that (p
f(1)=a and set
xeE. Then every map
is
a rotation whence
det(p1=±l Since det
is a continuous function of t it follows that det 'p1 = det
= det i = +
1.
This implies that det (p = +
1
and so (p is a proper rotation. In the same way it follows that the linear transformation x=xa xeE is
a proper rotation.
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
243
Next, let a and b be fixed unit vectors and consider the proper rotation 'r of E given by XEE. (8.51)
If a = b = p (p a unit vector) we have
xeE.
(8.52)
It follows that t (e) = e and so t induces a proper rotation, t1, of the orthogonal complement, E1, of E. We shall determine the axis and the rotation angle of r1. Let qeE1 be the vector given by
q q
determines the axis of the rotation -c1. We shall assume that
±e so that q#0. To determine the rotation angle of 'r1 consider the 2-dimensional subspace, F of E1 orthogonal to q. We shall show that, for ZEF, TZ
=
—
l)z +
(8.53)
In fact, the equations
and p'=).e—q yield
+
q)
qz— zq
and
z
= 2(q x z)
qz+zq= —2(q,z)e——O
we obtain ZEF
and so (8.53) follows.
Let F have the orientation induced by the orientation of E1 and the vector q (cf. sec. 4.29). Then the rotation angle ê is determined by the equations (cf. see 8.21)
cos ê = (z, z, z) and .
sin ê =
1
zl
(q, z, t z)
where z is any unit vector in F.
—
244
Chapter VIII. Linear mappings of inner product spaces
Using formula (8.53) we obtain
cos 0 = and
2).2
—
sine
=2)
qx
Since
=
cos
w and
= sin
U)
where 0) denotes the angle between e and p (0< (y) <m) these relations
can be written in the form
cos 0 = 2 cos2 0) — 1 = cos 20) and
sin 0 = 2 cos 0)
sin
U)
= sin 2w.
These relations imply that
0 =2w. Thus we have shown that the axis of 'r1 is generated by q and that the rotation angle of is twice the angle between p and e.
Proposition I: Two unit quaternions p1 and p2 determine the same rotation of E1 only if p2 = ±p1. Moreover, every proper rotation of E1 can be represented in the form (8.52).
Proof: The first part of the proposition follows directly from the result above. To prove the second part let a be any proper rotation of E1. We may assume that a a unit vector such that aa=a and let F E1 be the plane orthogonal to a. Give F1 the orientation induced by a and denote the rotation angle of a by p = e cos
(0< & <2 it). Set
+ a sin
Then the rotation i given by
zx=pxp1 coincides with a. In fact, if q is the vector determined by p=Ae+q, qeE1, then q = a sin
and so i and a have the same rotation axis. Next, observe that since sin
>0, q is a positive multiple of a and so
the vectors q and a induce the same orientation in F. But, as has been
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
245
shown above, the rotation angle of 'r with respect to this orientation is given by
t9=2 It follows that t = a. This completes the proof. Proposition II: Every proper rotation of E can be written in the form (8.51). Moreover, if a1, b1 and a2, b2 are two pairs of unit vectors such that the corresponding rotations coincide, then
a2=ra1 and b2=rb, where
± 1. Proof: Let a be a proper rotation of E and define t. by
t(x)=a(x)a(e)1
XEE.
Then t is again a proper rotation and satisfies 'r (e) = e. Thus 'r restricts to a proper rotation of E1. Hence, by Proposition I, there is a unit vector p such that
xeE. It follows that
a(e)= axb'
a(x)= t(x)a(e) = with a = p and b = p. Finally, assume that a1 x
xeE.
= a2 x
Setting x = e yields a1 bj
1
= a2 b1.
Then we obtain from (8.53)
pxp1=x
xeE.
Now Proposition I implies that p = r e, r = ± 1 and so
a2=ra1, This completes the proof.
b2=cb1.
(8.53)
Chapter VIII. Linear mappings of inner product spaces
246
Problems I. Show that the rotation vector of a proper rotation ço of 3-space is zero if and only if is of the form (px= — x+2(x,e)e
where e is a unit vector. 2. Let p be a linear automorphism of a real 2-dimensional linear space E. Prove that an inner product can be introduced in E such that 'p becomes a proper rotation if and only if det 'p =
1
and
2.
Itr
3. Consider the set H of all homothetic transformations 'p of the plane. Prove:
a) If 'p1eH and 'p2eH, then 2p1+jup2eH. b) If the multiplication is defined in H in the natural way, the set H becomes a commutative field. c) Choose a fixed unit-vector e. Then, to every vector xeE there exists Define a multisuch that exactly one homothetic mapping Prove that E becomes a field under this multiplication in E by xy = defines an isomorphism of E onto plication and that the mapping H. d) Prove that E is isomorphic to the field of complex numbers. 4. Given an improper rotation p of the plane construct an orthonormal basis e1, e2 such that çoe1 =e1 and pe2= —e2. 5. Show that every skew mapping of the plane is homothetic. If #0,
prove that the angle of the corresponding rotation is equal to + orientation is defined by the determinant-function
A(x,y)=(çlix,y)
x,yeE.
6. Find the axis and the angle of the rotation defined by = çoe2 =
=
e1
+ 2e2 — 2e3)
+ 2e2 + e3) — e2 —
2e3)
(v = 1, 2, 3) is a positive orthonormal basis. If 'p is a proper rotation of the 3-space, prove the relation
where 7.
det('p + i) = 4(1 + cosO) where 0 is the rotation angle.
2
if the
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
247
whose determinant is + 1.
8. Consider an orthogonal 3 x 3-matrix Prove the relation — — +
4
V
V
9. Let e be a unit-vector of an oriented 3-space and 0 ( — ir <0 ic) be a given angle. Denote by Fthe plane orthogonal to e. Consider the proper rotation 'p whose axis is generated by e and whose angle is 0 if the orientation induced by e is used in F. Prove the formula (p x = x cos 0 + e (e, x)(1
—
cos 0)
+ (e x x) sin 0.
10. Prove that two proper rotations of the 3-space commute if and only if they have the same axis. 11. Let be a proper rotation of the 3-space not having the eigenvalue —1. Prove that the skew transformations are
x=(co—i)o(ca+iY' connected by the equation
and
1
1 + cos0
'1'
where 0 denotes the rotation-angle of 'p. 12. Assume that an improper rotation 'p
—
i of the Euclidean 3-space
is given.
a) Prove that the vectors x for which çox= —x, form a 1-dimensional subspace E1. is induced in the plane F orthogonal b) Prove that a proper rotation is the to E1. Defining the rotation-vector u as in sec. 8.22, prove that identity if and only if u = 0. is given by c) Show that the rotation-angle of + 1)
cosO
and that 0<0< ic if the induced orientation is used in F.
13. Let a be a vector in an oriented Euclidean 3-space such that 1. Consider the linear transformation 'pa given by (flaX = x cos(ic
+
1
(a, x)f /Ial\2 + (a x
where f is defined by
f(t)=-1-sinmt
Chapter VHI. Linear mappings of inner product spaces
(i) Show that
is a proper rotation whose axis is generated by a
and whose rotation angle is e=m (ii) Show that = i and that
(PaX =
if
x + 2a(a, x)
1.
(iii) Suppose that a+h. Show that P,=Pb if and only if and that there is a correspondence between the proper rotations of and the straight lines in h = — a. Conclude 14.
1
1
Let E be an oriented 3-dimensional inner product space.
a) Consider E together with the cross product as an algebra. Show that the set of non-zero endomorphisms of this algebra is precisely the group of proper rotations of E. b) Suppose a multiplication is defined in E such that every proper rotation z is an endomorphism, t (x y) = t x t y. Show that
x y) xy = a constant. 15. Let a be a skew linear transformation of the Euclidean space a) Show that a can be written in the form
where
is
ax=px+xq
I-Il.
XE1-11
and that the vectors p and q are uniquely determined by a. b) Show that E1 is stable under a if and only if q = — p. c) Establish the formula det a =
— 1q12)2.
16. Let p be a unit vector in E and consider the rotation i x = p x Show that the rotation vector (cf. sec. 8.22) of t is given by
u=2).(p—).e)
)=(p,e).
17. Let p + ± e be a unit quaternion. Denote by F the plane spanned by e and p and let F' be its orthogonal complement. Orient F so that the vectors e, p form a positive basis and give F' the induced orientation. Consider the rotations
çox=px and t/ix=xp.
§ 7. Differentiable families of linear automorphisms a)
Show that the planes F and F' are stable under p and
249
and that
in F.
b) Denote by e the
in F and by the common rotation angle of p and rotation angles for and in F'. Show that e and
Differentiable families of linear automorphisms
§ 7.
8.25. Differentiation formulae. Let E be an n-dimensional inner product space and let L(E; E) be the space of all linear transformations of E. It has been shown in sec. 7.21 that a norm is defined in the space L(E; E) by the equation = max kpxl. IxI=1
A continuous mapping of a closed interval into the space L (E; E) will be called a continuous family of linear transformations or a continuous curve in L (E; E). A continuous curve çø (t) is called d(fferentiable if the limit lim
exists for every t(t0 every fixed t.
t
(p(t+LIt)—(p(t)
ut
t1). The mapping
= is obviously again linear for
The following formulae are immediate consequences of the above definition: 1.
p+p
—2
+p
(2, p constants)
2. 3.
4. If p differentiable curves in L (E; E) and çJi is a p-linear function in L(E; E), then d
dt
= v=1
A curve çø (t)(t0 t is called continuously differentiable if the mapping t-÷ (t) is again continuous. Throughout this paragraph all differentiable
curves are assumed to be continuously differentiable. 8.26. Differentiable families of linear automorphisms. Our first aim is to establish a one-to-one correspondence between all differentiable families of linear automorphisms on the one hand and all continuous families of linear transformations on the other hand. Let a differentiable family
Chapter VIII. Linear mappings of inner product spaces
of linear automorphisms be given such that 'p (t0) = i. of linear transformations is defined by
'p
Then a continuous family i1Li (t)
Interpreting t as time we obtain the following physical significance of the mappings cli (t): Let x be a fixed vector of E and
x(t) = (p(t)x (t) is given by
the corresponding orbit. Then the velocity vector
= ç3(t)x = q3(t)q,(t)_1 x(t) = i/i(t)x(t). the mapping (t) associates with every vector x (t) its velocity at the instant t. Now it will be shown that, conversely, to every continuous curve i/i (t) in L (E; E) there exists exactly one differentiable family 'p (t) (t0 t t1) of linear automorphisms satisfying the differential equation Hence,
= cl/(t)o(p(t)
(8.54)
and the initial-condition 'p (t0) = z. First of all we notice that the differential equation (8.54) together with the above initial condition is equivalent to the integral equation
'p(t) = z+
(t0
t
t3.
(8.55)
In the next section the solution of the integral equation (8.55) will be constructed by the method of successive approximations. 8.27. The Picard iteration process. Define the curves 0,1...) by the equations
=
i
and
(8.56) +
1(t)
z + 5 (t) o
(0 dt
(n = 0, 1 ...)
Introducing the differences
=
—
(t)
(n = 1,2, ...)
(8.57)
§ 7. Differentiable families of linear automorphisms
we obtain from (8.56) the relations
(n = 2,3,...).
=
(8.58)
Equation (8.57) yields for n= 1
-
A1(t) =
=
Define the number M by
M= max It/i(t)I. tO
t
t1
Then
(t)I
M(t — t0).
(8.59)
Employing the equation (8.58) for n=2 we obtain in view of (8.59) IA
M2(
f (t — t0) dt
2(t)I
—
and in general (n
= 1,2...).
Now relations (8.57) imply that n+p A%,(t),
—
whence n+p MV
n+p —
(t —
vn+1
vn+1
v!
Let >0 be an arbitrary number. It follows from the convergence of the series
MV —i-- (t1
_t0)V that there exists an integer N such that
n+p MV v!
(t1 — to)V
<
c
for
n > N and p 1.
The inequalities (8.60) and (8.61) yield —
p
1.
(8.61)
Chapter VIII. Linear mappings of inner product spaces
These relations show that the sequence the interval
is uniformly convergent in (p(t).
In view of the uniform convergence, equation (8.56) implies that
t t1).
= i+
(8.62)
As a uniform limit of continuous curves the curve p (t) is itself continuous. Hence, the right hand-side of (8.62) is differentiable and so p (t) must be differentiable. Differentiating (8.62) we obtain the relation
= //(t)op(t) showing that 'p (t) satisfies the differential equation (8.54). The equation
i is an immediate consequence of (8.62). 8.28. The determinant of It remains to be shown that the mappings 'p (t) are linear automorphisms. This will be done by proving the formula
'p (ta) =
f tri/i(t)dt
det'p(t) = eto Let A
(8.63)
0 be a determinant function in E. Then A ('p (t) x1 ... 'p (t)
= det 'p (t) A
(x1 ...
E E.
Differentiating this equation and using the differential equation (8.54) we obtain ... (8.64)
= Observing that (ço (t)x1 ...
=
...
(co(t)x1 ...
= tr /i (t) det 'p (t) A
(x1 ...
we obtain from (8.64) the differential equation det 'p (t) = tr
(t) det 'p (t)
(8.65)
families of linear automorphisms
§ 7. Differentiable
253
for the function det p (t). Integrating this differential equation and observing the initial condition det (t0) = det i = 1 we find (8.63). 8.29. Uniqueness of the solution. Assume that (t) and p2(t) are two solutions of the differential equation (8.54) with the initial condition = z. Consider the difference p
p(t) = p2(t) — The curve 'p (t) is again a solution of the differential equation (8.54) and it satisfies the initial condition 'p (t0) = 0. This implies the inequality
=
(8.66) to
to
Now define the function F by
F(t)=
(8.67)
Then (8.66) implies the relation
E(t) MF(t). Multiplying by e_tM we obtain E(t)e_tM — Me_tMF(t)
0,
whence d
dt
(F(t)e_tM)
0. —
Integrating this inequality and observing that F(t0) = 0, we obtain
F(t)e_tM <0 and consequently
F(t)O
(8.68)
On the other hand it follows from (8.67) that
F(t)O
(8.69)
Relations (8.68) and (8.69) imply that F(t)_= 0 whence 'p (t)m 0. Consequently, the two solutions (t) and 'p2(t) coincide.
254
Chapter VIII. I.inear mappings of inner product spaces
8.30. 1-parameter groups of linear automorphisms. A differentiable family of linear automorphisms (p (t) ( — cc
Equation (8.70) implies indeed that the automorphisms (p(t) form an (abelian) group. Inserting t=0 we find (p(O)=l. Now equation (8.70) yields
q(t)0q(— t)==z showing that with every automorphism (p (t) the inverse automorphism (p (t)' is contained in the family (t)( — cc
+ t) = Inserting t = 0 we obtain the differential equation
(—cc<x
(8.71)
çà(0). Conversely, consider the differential equation (8.71) is a given transformation of E. It will be shown that the solution (p (r) of this differential equation to the initial condition p (0) i is a Iparameter group of automorphisms. To prove this let t be fixed and consider the curves where where
p1(t)= p(t+
(8.72)
and
(p2(t) = ço(t)op(t).
(8.73)
Differentiating the equations (8.72) and (8.73) we obtain (8.74)
and
=
= l//o(p(t)o(p(r) =
(8.75)
Relations (8.74) and (8.75) show that the two curves c°i (t) and (p2(t) satisfy the same differential equation. Moreover, (0) = (p2(0)
(p (x).
Thus, it follows from the uniqueness theorem of sec. 8.28 that whence (8.70).
(t)
(t)
§ 7. Differentiable families of linear automorphisms
8.31. Differentiable families of rotations. Let p(t)(t0 t1) be a differentiable family of rotations such that 'p (ta) = i. Since det 'p (t) = ± 1 for every t and det 'p (ta) = + lit follows from the continuity that det 'p (t) = + 1, i.e. all rotations cp (t) are proper. Now it will be shown that the linear transformations
= are
skew. Differentiating the identity
we
obtain
=
Inserting
= and
=
=
into this equation we find
+ i/J(t))op(t) = 0, whence (t) +
(t) = 0.
Conversely, let the family of linear automorphisms 'p (t) be defined by the differential equation
= b(t)0p(t),
p(t0) = 1
where cli (t) is a continuous family of skew mappings. Then every automorphism 'p (t) is a proper rotation. To prove this, define the family x (t) by
x(t) = Then
=
-
+
=0
and X(to)
i.
Now the uniqueness theorem implies that x (t)
i, whence
This equation shows that the mappings 'p (t) are rotations.
Chapter VIII. Linear mappings of inner product spaces
8.32. Angular velocity. As an example, let p(t) be a differentiable family of rotations of the 3-space such that çø (0) = i. If t is interpreted as the time, the family (t) can be considered as a rigid motion of the space E. Given a vector x, the curve
x(t) = co(t)x describes
its orbit. The corresponding velocity-vector is determined by =
=
=
(8.76)
Now assume that an orientation is defined in E. Then every mapping i/i
(t) can be written as
= u(t) x y.
(8.77)
The vector u (t) is uniquely determined by (t) and hence by t. Equations (8.76) and (8.77) yield ±(t) u(t) x x(t). (8.78)
The vector u(t) is called the angular velocity at the time t. To obtain a physical interpretation of the angular velocity, fix a certain instant t and assume that u(t)+0. Then equation (8.78) shows that (t) a multiple of u (t). In other words, the straight line generated by u(t) consists of all vectors having the velocity zero at the instant t. This straight line is called the instantaneous axis. Equation (8.78) implies that the velocity-vector ±(t) is orthogonal to the instantaneous axis. Passing over to the norm in equation (8.78) we find that J±(t)I
= u(t)I Ih(t)I
is the distance of the vector x 0') from the instantaneous axis (fig. 1). ConseFig. 1 quently, the norm of u 0') is equal to the magnitude of the velocity of a vector having the distance 1 from the instantaneous axis. The uniqueness theorem in sec. 8.28 implies that the rigid motion ço 0') is uniquely determined by p (t0) if the angular velocity is a given function
where h (t)I
of t.
8.33. The trigonometric functions. In this concluding section we shall apply our general results about families of rotations to the Euclidean plane and show that this leads to the trigonometric function cos and sin. This definition has the advantage that the addition theorems can be proved in a simple fashion, without making use of the geometric intuition.
§ 7. Differentiable families of linear automorphisms
257
Let E be an oriented Euclidean plane and zi be the normed determinant function representing the given orientation. Consider the skew mapping which is defined by the equation
x, y) =
A
(x, y).
(8.79)
First of all we notice that cli is a proper rotation. In fact, the identity 7.24 yields
(Vix,y)2 = A(x,y)2 =(x,x)(y,y) — (x,y)2.
Inserting y i,li x we find
= Now i/i is regular as follows from (8.79). Hence the above equation implies
that
= (x,x). Replacing x and y by clix and çliy respectively in (8.79) we obtain the relation x, y) = x, i/i y) = (cii x, y) = A (x, y) showing that detcii = + 1. Let (t)( — cc
=
c&o(p(t)
(0)
=
(8.80)
and the initial condition z.
Then it follows from the result of sec. 8.29 that + x) = p(t)0p(t).
(8.81)
We now define functions c and s by
c(t) = 4trp(t) and
—
(t) =
—
tr (ci'
cc < t
(8.82)
p (t))
These functions are the well-known functions cos and sin. In fact, all the properties of the trigonometric functions can easily be derived from (8.82). Select an arbitrary unit-vector e. Then the vectors e and i/ie form 17
Greub. Linear Algebra
Chapter VIII. Linear mappings of inner product spaces
an orthonormal basis of E. Consequently,
tr(p(t) =
+
(8.83)
is itself a proper rotation, the mappings ço (t) and cli commute. Hence, the second term in (8.83) can be written as Since çfr
= (p(t)e,e).
= We
thus obtain c(t) = (p(t)e,e).
(8.84)
In the same way it is shown that
s(t)
(8.85)
Equations (8.84) and (8.85) imply that
p(t)e = c(t)e + s(t)l/Je.
(8.86)
Replacing t by t+t in (8.84) and using the formulae (8.81) and (8.86) we obtain
c(t + r) = (q,(t + t)e,e) = (p(t)q(t)e,e) = c (t) (p (t) e, e) — s (t) ('p (t) e, e).
(8.87)
Equations (8.87), (8.84) and (8.85) yield the addition theorem of the function c: c(t + t) = c(t)c(t) — s(t)s(t). In the same way it is shown that
s(t + t) = s(t)c(t) + c(t)s(t). Problems 1. Let cli be a linear transformation of the inner product space E. Define the linear automorphism exp cl' by
= 'p(l) where 'p (t) is the family of linear automorphisms defined by
i/iop(t),p(O) = z. Prove that
'p(t)=
t
§ 7. Differentiable families of linear automorphisms
259
defined in problem 1 has the fol-
2. Show that the mapping lowing properties:
I.
2. exp(—çli)=(exp 3.
exp0=l.
4.
exp
if is skew. that exp is a 3. Consider the family of rotations 'p (t) defined by (8.80). a) Assuming that there is a real number p + 0 such that 'p (p) = i, prove
that (p(t+p)=p(t)(—cc
r
if and only if
dt
c) Show that the family 'p (t)
(k=0,±1,±2,...). has
derivatives of every order and that
(v=0,1...). d) Define the curve x(t) by
x(t) = p(t)e where e is a fixed unit-vector. Show that
= 1.
Derive from formulae (8.82) that the function c is even and that the function s is odd. 5. Let be the skew mapping defined by (8.79). Prove De Moivre's 4.
formula
= c(t)i + 6. Let a skew mapping of an n-dimensional inner product space and 'p (t) the corresponding family of rotations. Consider the normal form (8.35) of the matrix of Vi. Prove that the function 'p (t)( — oo
Chapter VIJI. Linear mappings of inner oroduct spaces
260
7. Let A be a finite dimensional associative real algebra. such a) Consider a differentiable family of endomorphisms is a derivation in A. that Po 1. Prove that b) Let 0 be a derivation in A and define the family çot of linear transformations by
c°o=1. is an automorphism of A. Show that every
Prove that every mutes with 0.
com-
8. The quaternionic exponential function. Fix a quaternion a and consider the differential equation
±(t)=ax(t) with the initial condition x(0) = e. Define exp a by
expa=x(1). Decompose a in the form a =2 e + h where I e and (b, e) =0. (i) Show that a)
exp 2 is the exponential function for the reals.
b) exp b=ecos IbI c)
sin IbI.
exp a = exp 1 (e cos lb I +
sin lb I).
(ii) Show that
lexpal=expl. (iii) Prove that exp a e if and only if
2=0 and
lbl=2km,
ke7L.
(iv) Let a1 = e + b1 and a2 =22 e + b2 ne two quaternions with b1 + 0 and b2 + 0. Show that exp a2 = exp a1 if and only if
22=21
kEl.
and
(v) Let B3 be the closed 3-ball given by (x, e) = 0, lxl mapp: ERxB3-*Eby
p(2,y)=exp(le+y)
yeB3.
Show that a) p maps B3 onto E —0. b) 'p is injective in the interior of B3. c) If S3 denotes the boundary of B3 then
'p(l,y)= —exple
yeS3.
Define a
Chapter IX
Symmetric bilinear functions All the properties of an inner product space discussed in Chapter VII are based upon the bilinearity, the symmetry and the definiteness of the inner product. The question arises which of these properties do not depend on the definiteness and hence can be carried over to a real linear space with an indefinite inner product. Linear spaces of this type will be discussed in § 4. First of all, the general properties of a symmetric bilinear function will be investigated. It will be assumed throughout the chapter that all linear spaces are real. § 1.
Bilinear and quadratic functions
9.1. Definition. Let E be a real vector space and qi be a bilinear function in E x E. The bilinear function cP is called symmetric if
x,yEE. Given a symmetric bilinear function çji consider the (non-linear) function 'I' defined by 1i(x,x). Then is uniquely determined by In fact, replacing x by x+y in (9.1) we obtain (9.2)
whence
=
+ y) —
—
(9.3)
Equation (9.3) shows that different symmetric bilinear functions 1i lead to different functions W. Replacing y by —y in (9.2) we find —
y) = W(x)
—
+
(9.4)
Chapter IX. Symmetric bilinear functions
262
Adding the equations (9.2) and (9.4) we obtain the so-called parallelogram-identity
W(x + y) +
(x
(x) +
y)
—
(9.5)
9.2. Quadratic functions. A continuous function W of one vector which satisfies the parallelogram-identity will be called a quadratic function. Every symmetric bilinear function yields a quadratic function by setting x=y. We shall now prove that, conversely, every quadratic function can be obtained in this way. Substituting x==y=O in the parallelogram-identity we find that (9.6)
Now the same identity yields for x
0
W(— y)
= W(y)
showing that a quadratic function is an even function.
If there exists at all a symmetric bilinear function çji such that
cli(x,x)= this function is given by the equation
cli(x,y) =
+ y) —
— W(y)}.
(9.7)
Therefore it remains to be shown that the function cP defined by (9.7) is indeed bilinear and symmetric. The symmetry is an immediate consequence of (9.7). Next, we prove the relation cli(x1 + x2,y) = cP(x1,y) + cP(x2,y).
(9.8)
Equation (9.7) yields
+ x2,y) = =
W(x1 + y) —
—
2cP(x2,y) =
+ y) —
—
+ x2 + y) — W(x1 + x2) —
whence 2
(x1 + x2, y) —
— {W(x1
+ y) +
(xi, y) — (x2, y)} = {W (x1 + x2 + y) + W (y)} — — W(x2)}. (9.9) + y)} — {W(x1 + x2) —
It follows from (9.5) that
W(x1 + x2 + y) + W(y) =
+ x2 + 2y) + W(x1 + x2)}
(9.10)
§ 1. Bilinear and quadratic functions
263
and
+ y)+ W(x2 + y)=
+ x2 + 2y)+
—
x2)}.
(9.11)
Subtracting (9.11) from (9.10) and using the parallelogram-identity again we find that {W(x1 + x2 + y) + W(y)} — {W(x1 + y) + W(x2 + y)} — x2)} = — W(x1)— W(x2) +
= 4{W(x1 + x2) —
(9.12)
+ x2).
Now equations (9.9) and (9.12) imply (9.8). Inserting x1 = x and x2 = into (9.8) we obtain
x,y) = — cP(x,y).
—x
(9.13)
It remains to be shown that 'Ii(Ax,y) = Acli(x,y)
(9.14)
for every real number A. First of all it follows from (9.8) that
cP(kx,y)= kP(x,y) for a positive integer k. Equation (9.13) shows that (9.14) is also correct for negative integers. Now consider a rational number A
"
(p,q integers).
q
Then
(p \q
\ J
whence
\j= p cli(x,y). j q
(p \q
To prove (9.14) for an irrational factor A we note first that is a continuous function of x and y, as follows from the continuity of W. Now select a sequence of rational numbers such that lim n
= A.
cxj
Then we have that
(9.15)
For n-+cx we obtain from (9.15) the relation (9.14). Our result shows that the relations (9.1) and (9.7) define a one-to-one correspondence between all symmetric bilinear functions and all qua-
Chapter IX. Symmetric bilinear functions
264
dratic functions. If no ambiguity is possible we shall designate a symme-
tric bilinear function and the corresponding quadratic function by the same symbol, i.e., we shall simply write
=
9.3. Bilinear and quadratic forms. Now assume that E has dimension n and let (v = n) be a basis of E. Then a symmetric bilinear 1
function
. . .
can be expressed as a bilinear form
P(x,y) =
(9.16) p
where and the matrix
is defined by *)
=
It follows from the symmetry of cP that the matrix
is symmetric:
= Replacing y by x in (9.16) we obtain the corresponding quadratic form cP(x) =
Problems 1. Letf and g be two linearly independent linear functions in E and let 0 be a derivation in the algebra 11. Show that the function
f(x) 0 [g(x)] —g(x) 0[f(x)]
satisfies the paralelogram identity and the relation Prove that the function cP obtained from W by (9.7) is bilinear if and only
if 0=0. 2. Prove that a symmetric bilinear function in E defines a quadratic function in the direct sum 3. Denote by A and by A the matrices of the bilinear function çji with respect to two bases and (v = I ii). Show that . .
A= TAT* where T is the matrix of the basis transformation *) The first index counts the row.
§ 2. The decomposition of E
265
§ 2. The decomposition of E
9.4. Rank. Let E be a vector space of dimension n and a symmetric bilinear function in E x E. Recall that the nulispace E0 of cP is defined to be the set of all vectors x0eE such that cb(x0,y) =
for every yeE.
0
(9.17)
The difference of the dimensions of E and E0 is called the rank of Hence çJi is non-degenerate if and only it has rank n. Now let E* be a dual space and consider the linear mapping p : defined by
x,yeE.
(9.18)
Then the null-space of çJi obviously coincides with the kernel of (p,
= kerp. Consequently, the rank of çJi is equal to the rank of the mapping p. Let be the matrix of relative to a basis x,, (v = 1. n) of E. Then relation . .
(9.18) yields
=
=
is the matrix of the mapping p. This implies that the rank of the matrix is equal to the rank of 'p and hence equal to the rank of In particular, a symmetric bilinear function is non-degenerate if and only if the determinant of is different from zero. 9.5. Definiteness. A symmetric bilinear function cli is called positive
showing that
definite if cli (x) > 0
for all vectors x +0. As has been shown in sec. 7.4, a positive definite bilinear function satisfies the Schwarz-inequality
cP(x,y)2 cIl(x)cP(y)
X,yEE.
Equality holds if and only if the vectors x and y are linearly dependent. A positive definite function çJl is non-degenerate. If cll(x)O for all vectors xeE, but cli(x)=O for some vectors x+0, the function cli is called positive semidefinite. The Schwarz inequality is still valid for a semidefinite function. But now equality may hold without the vectors x and y being linearly dependent. A semidefinite function is al-
Chapter 1X. Symmetric bilinear functions
ways degenerate. In fact, consider a vector x0 0 such that di (x0) =0. Then the Schwarz inequality implies that cP (x0, y)2 cP (x0) cP (y) = 0
whence dl (x0, y) =0 for all vectors y. In the same way negative definite and negative semidefinite bilinear functions are defined. The bilinear function cP is called indefinite if the function (x) assumes positive and negative values. An indefinite function may be degenerate or non-degenerate. 9.6. The decomposition of E. Let a non-degenerate indefinite bilinear function dl be given in the n-dimensional space E. It will be shown that E such that the space Ecan be decomposed into two subspaces cli is positive definite in and is negative definite in E. Since cli is indefinite, there is a non-trivial subspace of E in which cli is positive definite. For instance, every vector a for which cIl(a)>0 generates such a subspace. Let be a subspace of maximal dimension such that dl is positive with redefinite in Et Consider the orthogonal complement E of spect to the scalar product defined by c/I. Since cli is positive definite in E the intersection E consists only of the zero-vector. At the same time we have the relation (cf. Proposition II, sec. 233)
dimE = dimE. This yields the direct decomposition E =
Now it will be shown that cli is negative definite in E. Given a vector z +0 of E -, consider the subspace E1 generated by E + and z. Every vector of this subspace can be written as
This implies that cli (x) =
cli
(y) +
dl (z).
(9.19)
Now assume that li(z)>0. Then equation (9.19) shows that cli is positive definite in the subspace E1 which is a contradiction to the maximumproperty of E Consequently,
dl(z) 0
for all vectors
zeE
§ 2. The decomposition of E i.e.,
is
267
negative semidefInite in E. Using the Schwarz inequality
z1eE,zeE
(9.20)
we can prove that cP is even negative definite in E. Assume that cb(z1)=0 for a vector z1eE. Then the inequality (9.20) yields
1(z1,z) = 0
for all vectors zeE. At the same time we know that =0
for all vectors yeEt These two equations imply that 'k(z1,x)—_ 0
for all vectors xeE, whence z1 =0. 9.7. The decomposition in the degenerate case. If the bilinear function 'P is degenerate, select a subspace E1 complementary to the nulispace E0, E = E0 EBE1. Then 'P is non-degenerate in E1. In fact, assume that
'P(x1,y1)= 0 for a fixed vector x1 e E1 and all vectors Yi e E1. Consider an arbitrary vector yeE. This vector can be written as
YYo+Yi
y0eE0,y1eE1
whence
'P(x1,y) = 'P(x1,y0) + 'P(x1,y1) = 0.
(9.21)
This equation shows that x1 is contained in E1 and hence it is contained in the intersection E0 n E1. This implies that x1 =0. Now the construction of sec. 9.6 can be applied to the subspace E1. We thus obtain altogether a direct decomposition (9.22)
of E such that 'P is positive definite in and negative definite in E. 9.8. Diagonalization of the matrix. Let (x1 .x5) be a basis of E which is orthonormal with respect to 'P, (Xs+i...Xr) be a basis of E which is . .
Chapter IX. Symmetric bilinear functions
orthonormal with respect to —P, and (Xr+i...Xn) be an arbitrary basis of Then + l(v= is) ç — I (v = s + I ... r) where = (xv, = (
O(v=r+1...n)
then form a basis of E in which the matrix of Ji has the following diagonal-form: The vectors (x1 .
. .
0
9.9. The index. It is clear from the above construction that there are infinitely many different decompositions of the form (9.22). However, the dimensions of E + and E - are uniquely determined by the bilinear function b. To prove this, consider two decompositions (9.23)
and (9.24)
such that P is positive definite in
and
and negative definite in
Ej and E. This implies that n
whence dim
n.
(9.25)
+dimEj +dimE0=n.
(9.26)
+ dim Ej + dim E0
Comparing the dimensions in (9.23) we find
Equations (9.25) and (9.26) yield
§ 2. The decomposition of E
Interchanging
and
269
we obtain
whence
=
Consequently, the dimension of is uniquely determined by cu. This number is called the index of the bilinear function cui and the number —dim E =2s—r is called the signature of cP. dim Now suppose that (v = 1.. .n) is a basis of E in which the quadratic function cli has diagonal form 11(x) = and assume that
and Then p is the index of cli. In fact, the vectors (v = 1. . .p) generate a subspace of maximal dimension in which cli is positive definite. From the above result we obtain Sylvester's law of inertia which asserts
that the number of positive coefficients is the same for every diagonal form. 9.10. The rank and the index of a symmetric bilinear function can be determined explicitly from the corresponding quadratic form cil(x) =
We can exclude the case cu =0. Then at least one coefficient from zero. If apply the substitution
= Then
where say
=
+
is different
—
=
+0 and
0. Thus, we may assume that at least one coefficient is different from zero. Then çJl (x) can be written as
2"
X11
)
The substitution ill =
+
in (v=2...n)
Chapter IX. Symmetric bilinear functions
yields
(x) =
(p1)2 1
+
(9.27)
The sum in (9.27) is a symmetric bilinear form in (n — 1) variables and hence the same reduction can be applied to this sum. Continuing this way we finally obtain an expression of the form = Rearranging the variables we can achieve that
)V>O (v=l...s) (v=s+1...r) (v=r+l...n). Then r is the rank and s is the index of P.
Problems 1. Let cP + 0 be a given quadratic function. Prove that cu can be written in the form
±1 wheref is a linear function, if and only if the corresponding bilinear function has rank 1. 2. Given a non-degenerate symmetric bilinear form cu in E, let J be a subspace of maximal dimension such that cui(x, x)=0 for every xeJ. Prove that dim J = mm (s, n — s).
Hint: Introduce two dual spaces E* and F* and linear mappings and defined by
cli(x,y)=
3. Define the bilinear function
= Let S(E; E) be the space of all selfadjoint mappings and A (E; E) be the space of all skew mappings with respect to a positive definite inner product. Prove:
§ 2. The decomposition of E
271
a) cb(p, (p)>O for every p+O in S(E; E), b) 'P(p, (p)
d) The index of
is
n(n+1)
where n
2
dim E.
4. Find the index of the bilinear function = tr(i/10p)
—
in the space L(E; E). 5. Find the index of the quadratic form =
6. Let
i<j
be a bilinear function in E. Assume that E1 is a subspace of
E such that çji is non-degenerate in E1. Define the subspace E2 as follows:
A vector x2eE is contained in E2 if b(x1, x2) =
for all vectors x1 eE1.
0
Prove that
E=E1 7. Consider a (not necessarily symmetric) bilinear function such that (x, x)>O for all vectors x #0. Construct a basis of E in which the matrix of çji has the form 1
K1
—K1 1
1
1
1, Hint: Decompose CP in the form
=
+
where
cP1(x,y)=
+ CP(y,x))
and P2 (x, y) =
(cP (x, y)
— 1i
(y, x)).
Chapter IX. Symmetric bilinear functions
272
8. Let E be a 2-dimensional vector space, and consider the 4-dimensional space L(E; E). Prove that there exists a 3-dimensional subspace Fc:L(E; E) and a symmetric bilinear function in F such that the nilpotent transformations (cf. problem 7, Chap. IV, § 6) are precisely the transformations t satisfying (In other words, the nilpotent transformations form a cone in F). § 3.
Pairs of symmetric bilinear functions
9.11. In this paragraph we shall investigate the question under which
conditions two symmetric bilinear functions cP and W can be simultaneously reduced to diagonal form. To obtain a first criterion we consider the case that one of the bilinear functions, say W, is non-degenerate. Then the vector space E is self-dual with respect to W and hence there exists a linear transformation satisfying
x,yeE (cf. Prop. Ill, sec. 2.33). Suppose now that x1 and x2 are eigenvectors of p such that the corresponding cigenvalues and 22 are different. Then we have that = ¶t'(x1,x2) and
cP(x2,x1) = 22 ¶P(x2,x1)
whence in view of the symmetry of cP and W —
Since
22) W(x1,x2)
0.
it follows that ¶I'(x1,x2)=O and hence k(x1, x2)=O.
Proposition: Assume that Y is non-degenerate. Then P and
are
simultaneously diagonalizable if and only if the linear transformation 'p has n linearly independent eigenvectors. If 'p has /1 linearly independent eigenvectors consider the distinct eigenvalues of 'p. Then it follows that . .
E=E1EE...EEE. where E, is the eigenspace of V'
(x,
=
Then we have for 0
and
CP (xe,
and
1jj
= 0.
Now choose a basis in each space E1 such that ¶1' has diagonal form (cf.
§ 3. Pairs of symmetric bilinear functions sec.
273
9.8). Since
= it follows that has also diagonal form in this basis. Combining all these bases of the E such that and W have diagonal form. Conversely, let e (i = 1.. .n) be a basis of E such that (ei, = 0 and W (ei, = 0 if i +j. Then we have that
i+j. This equation shows that the vector (p e1 is contained in the orthogonal complement (with respect to the scalar product defined by of the subspace v + i. But Fj is the 1-dimensional generated by the vectors subspace generated by e, =2 e1. In other words, the are eigenvectors of (p. As an example let E be a plane with basis a, b and consider the bilinear functions W given by
P(a,b)=O,
P(b,b)=—1
and
W(a,a)= 0,
0.
1,
It is easy to verify that then the linear transformation p is given by
(pa=b, pb=—a. Since the characteristic polynomial of p is 22+ 1 it follows that (p has no eigenvectors. Hence, the bilinear functions and 'P are not simultaneously diagonalizable. Theorem: Let E be a vector space of dimension n 3 and let and 'I' be two symmetric bilinear functions such that
D(x)2-E-W(x)2+O
if x+O.
Then çji and 'P are simultaneously diagonalizable. Before giving the proof we comment that the theorem is not correct for dimension 2 as the example above shows. 9.12. To prove the above theorem we employ a similar method as in sec. 8.6. If one of the functions rIi and 'P,say 'P, is positive definite the desired basis-vectors are those for which the function
F (x) = 18
Greub. Linear Algebra
P (x)
x
+ 0.
(9.28)
274
Chapter IX. Symmetric bilinear functions
assumes a relative minimum. However, if is indefinite, the denominator and hence the in (9.28) assumes the value zero for certain vectors
function F is no longer defined in the entire space x+0. The method of sec. 8.6 can still be carried over to the present case if the function F is replaced by the function
arc tanF(x).
(9.29)
To avoid difficulties arising from the fact that the function arc tan is not single-valued, we shall write the function as a line-integral. At this point the hypothesis n 3 will be essential *). Let E be the deleted space x 0 and x = x (t)(0 t 1) be a differentiable curve in E. Consider the line-integral
=
(9.30)
+
J
0
taken along the curve x (t). First of all it will be shown that the integral J
depends only on the initial point x0=x(O) and the endpoint x=x(l) of the curve x(t). For this purpose define the following mapping of E into the complex w-plane:
w(x)= cP(x)+ I W(x). The image of the curve xQ) under this mapping is the curve w (t) =
cP
(x (t)) + I 'I' (x (t))
in the w-plane. The hypothesis 1i
(0 t 1)
(9.31)
+ 'P (x)2 + 0 implies that the curve
w(t)(0t 1) does not go through the point w=0. The integral (9.30) can now be written as 1
2J u +v dt 2
2
0
where the integration is taken along the curve (9.31). Now let 0(t) be an angle-function for the curve w(t) i.e. a continuous function of t such that v(t) u(t) (9.32) and sin 0(t) = cos0(t) = .
w(t)I
*) The proof given is due to JOHN MILNOR.
Jw(t)I
§ 3. Pairs of symmetric bilinear functions
275
(cf. fig. 2) *). It follows from the differentiability of the curve w (t) that
the angle-function 0 is differentiable and we thus obtain from (9.32) —uiwi +
d wi (9.33)
iwi
and
d wi Iwi
Fig. 2
(9.34) by
0
0 and adding these equa-
tions we find that
ui'—üv
0=-
-
+
Integration from t = 0 to t = gives 1
liv 2dt=O(l)—0(0) +V
ut) — U
2
showing that the integral J is equal to the change of the angle-function 0 along the curve w (t), (9.35) 2J = 0(1) — 0(0).
Now consider another differentiable curve x = (t)(0 t 1) in E with the initial point x0 and the endpoint x and denote by Jthe integral (9.30) Then formula (9.35) yields taken along the curve
2J= 0(1) — 0(0)
(9.36)
where U is an angle-function for the curve (t) =
cP
(t)) + i
(t))
Since the curves w (t) and (t)(0 t the same endpoint it follows that O(O)—O(O)=2k0m
and
(0 t 1).
1) have the same initial point and
O(l)—0(l)=2k1it
(9.37)
*) For more details cf. P.S. A1ExANDRoV. Combinatorial Topology, Vol. I. chapter II. § 2.
Chapter IX. Symmetric bilinear functions
where k0 and k1 are integers. Equations (9.35), (9.36) and (9.37) show that the difference J—i is a multiple of 7t,
J—J = km.
it remains to be shown that k=0. The hypothesis n3 implies that the space E is simply connected. in other words, there exists a continuous 1 into E such that mapping x = x (t, t) of the square 0 t 1, 0 t
x(t,O)=x(t), and
x(O,t)=x0,
x(1,'r)=x
The mapping x(t, t) can even be assumed to be differentiable. Then, for every fixed 'r, we can form the integral (9.30) along the curve
x(t,z) This integral is a continuous function J(t) of i. At the same time we know that the difference —J is a multiple of m, J(t) — J = mk('r).
Hence, k (z)
(9.38)
a continuous integer-valued function in the interval Ot I and thus k(t) must be a constant. Since k(0)=0 it follows that is
1). Now equation (9.38) yields
J(x)=J Inserting t = 1 we obtain the relation
i=J showing that the integral (9.30) is indeed independent of the curve x (t). 9.13. The function F. We now can define a single-valued function F in the space E by
F(x) =
r (x) I
J
-
W (x
cP(x)2
-
(x
+ W(x)2
W
(x)
dt
(9.39)
where the integration is taken along an arbitrary differentiable curve x(t) in E leading from x0 to x. The function F is homogeneous of degree zero,
F(2 x) = F(x),
> 0.
(9.40)
§ 3. Pairs of symmetric bilinear functions
277
To prove this, observe that F(itx)—F(x)=
— + W(x)2
I
j
—-dt.
Choosing the straight segment
as path of integration we find that
=
(1
—
whence
= 0.
—
This implies the equation (9.40). 9.14. The construction of eigenvectors. From now on our proof will follow the same lines as in sec. 8.6. We consider first the case that is non-
degenerate. Introduce a positive definite inner product in E. Then the continuous function F assumes a minimum on the sphere lxi = 1. Let e1 be a vector on lxi = I such that
F(e1) F(x) for all vectors lxi = 1. Then the homogeneity of F implies that
F(e1) F(x) for all vectors x +0. Consequently, the function
f(t) =
F(e1
+ ty),
where y is an arbitrary vector, assumes a minimum at t==0, whence
f'(O)=O.
(9.41)
Carrying out the differentiation we find that (0) — —
Equations
'P(e',y)
y)
P(e1)2 + ¶P(e1)2
(9 42)
(9.41) and (9.42) imply that
cl(e1,y)
k11(e1)
— P(e1) ¶I'(e1,y) = 0
(9.43)
for all vectors yeE. In this equation W(e1)+0. In fact, assume that
_____
278
Chapter IX. Symmetric bilinear functions
W(e1)=0. Then and hence equation (9.43) yields ¶I'(e1, y)=0 for all vectors yEE. This is a contradiction to our assumption that W is non-degenerate. Define the number by = then equation (9.43) can be written as VEE.
9.15. Now consider the subspace E1 defined by the equation
V'(e1,z) = 0.
Since W is non-degenerate, E1 has the dimension n — I. Moreover, the restriction of 'P to E1 is again non-degenerate: Assume that z1 is a vector of E1 such that
'P(z1,z)=O
(9.44)
for all vectors zeE1. Equation (9.44) implies that (9.45)
for every vector xEE because x can be decomposed in the form zEE1.
=0, and so 'P is non-degenerate in E1. Therefore, the construction of sec. 9.14 can be applied to E1. We thus obtain a vector e2eE1 with the property that Now it follows from (9.45) that
b (e2, z) =
22
'P (e2, z)
for every vector z
E E1
(9.46)
where — 2
1(e2) W(e2)
Equation (9.46) implies that
=
22
¶I'(e2,y)
for every vector yeE; in fact, y can be decomposed in the form ZEE1
(9.47)
§ 3. Pairs of symmetric bilinear functions
and
279
we thus obtain
+ P(e2,z) =
= =
+
(9.48)
=
W(e1,e2) +
and =
W(e2,z).
+
(9.49)
Equations (9.46), (9.48) and (9.49) yield (9.47).
Continuing this construction we obtain after n steps a system of n vectors
subject to the following conditions:
;') =
P
(9.50)
E
(er, v)
W
4 0
= Rearranging the vectors
+ ii).
0
and multiplying them with appropriate scalars
we can achieve that = where
that
s
ç+i(v=i...s) — 1(v
(9.51)
= s + 1 ...n)
denotes the signature of W. It follows from the above relations
the vectors
Inserting
form a basis of E. in the first equation
(9.50)
we
find
(9.52)
= Equations (9.51) and (9.52) show that the bilinear functions diagonal form in the basis 1.. .n).
and 'P have
9.16. The degenerate case. The degenerate case now remains to be con-
We may assume that W+O. Then it will be shown that there exists a scalar is non-desuch that the bilinear function sidered.
generate Let E* be a dual space of E and consider the linear mappings
and defined
by the equations
cP(x,y)=
and
Then Im
To prove this relation, let
fl
p (ker fl
= 0. be any vector. Then
(9.53)
=px
Chapter IX. Symmetric bilinear functions
for some xe ker i/i. Hence (9.54)
=0
¶P(x) =
and
=
x,
=
x> =0
(9.55)
and
because
and hence that = ço x = 0. x Now let I .n) be a basis of E such that the vectors (Xr+ i. form a basis of ker i/i. Employing a determinant-function zl +0 in E we obtain zl(gox1 + + = + ... (PXr + . .
.
The expansion of this expression yields a polynomial f(2) starting with
the term is not identically zero. This follows from the relation (9.53) and the fact that the r vectors = ...r) and the (n —r) in) are linearly independent. (a=r+ vectors (pxae(p(ker i/i) implies Hence, f is a polynomial of degree r. Our assumption that r 1. Consequently, a number can be chosen such is non-degenerate (cf. sec. 9.4). Then By the previous theorem there exists a basis (v = .. .n) of E in which the bilinear functions P and P+)LW both have diagonal form. Then the The coefficient of
1
1
functions
and ¶1' have also diagonal form in this basis.
Problems 1. Let 'k and 'I' be two symmetric bilinear functions in E. Prove that the condition P(x)2 + W(x)2 > 0, x+0 is equivalent to the following one: There exist real numbers ). and ji such that (x) + /2 W (x) > 0 for every x +0. and B = 2. Let A = be two symmetric n x n-matrices and assume that the equations and
§4. Pseudo-Euclidean spaces
together imply that
(v=
1
281
n). Prove that the polynomial
+ AB)
is of degree r and has r real roots where r is the rank of B. § 4.
Pseudo-Euclidean spaces
9.17. Definition. A pseudo-Euclidean space is a real linear space in which a non-degenerate indefinite bilinear function is given. As in the positive definite case, this bilinear function is called the inner product and
is denoted by (,).The index of the inner product is called the index of the pseudo-Euclidean space. Since the inner product is indefinite, the number (x, x) can be positive, negative or zero, depending on the vector x. A vector x+0 is called space-like, if (x, x) >0 time-like, if (x, x) <0
a light-vector, if (x, x)=0 The light-cone is the set of all light-vectors. As in the definite case two vectors x and y are called orthogonal if (x, y) =0. The light-cone consists of all vectors which are orthogonal to themselves.
A basis
1 ...n) is called orthonormal if e,2) =
where
c+i(v=i...s)
In sec. 9.8 we have shown that an orthonormal basis can always be constructed.
If an orthonormal basis two vectors
(v = 1. . .n)
is chosen, the inner product of
and
is given by the bilinear form
(x,y) =
= and the equation of the light-cone reads
v1
vs+1
(9.56)
Chapter IX. Symmetric bilinear functions
9.18. Orthogonal complements. Since the inner product in E is nondegenerate the space E is dual to itself. Hence, every subspace E1 c: E determines an orthogonal complement EjL which is again a subspace of E
and has complementary dimension. does not necessarily consist of the However, the intersection E1 n zero-vector alone, as in the positive definite case. Assume, for instance, that E1 is the 1-dimensional subspace generated by a light-vector 1. Then E1 is contained in E1L.
It will be shown that E1 fl E1'=O if and only if the inner product is non-degenerate in E1. Assume first that this condition is fulfilled. Let x1 be a vector of E1 n E1L. Then (x1,y1) = 0 for all vectors Yi EE1 and thus x1 =0. Conversely, assume that E1 fl E1L =
0.
(9.57)
,
Then it follows that (9.58)
since E1 and have complementary dimension. Now let x1 be a vector of E1 such that (x1, v1) =
0
for all vectors Ji
It follows from (9.58) that every vector y of E can be written as Y = Yi + 3
ii E E1,
E
whence
(x1, y) = (x1,
+ (x1,
y e E.
This equation implies that x1 =0. Consequently, the inner product is non-degenerate in E1. 9.19. Normed determinant functions. Let be a determinant function in E. Since E is dual to itself, the identity (4.21) applies to E yielding
40(x1, Substituting basis, we obtain
= in (9.59), where e%(v=
=
This equation shows that
(—
(9.59) I
n) is an orthonormal
§ 4. Pseudo-Euclidean spaces
283
Consequently, another determinant function, A, can be defined by
(9.60) Then the identity (9.59) assumes the form A (x1
...
(y1 ...
y1).
= (—
(961)
A determinant function satisfying the relation (9.61) is called a normed determinant function. Equation (9.60) shows that there exist exactly two normed determinant functions A and —A in E. 9.20. The pseudo-Euclidean plane. The simplest example of a pseudo-
Euclidean space is a 2-dimensional linear space with an inner product of index I. Then the light-cone consists of two straight lines. Selecting two vectors and 12 which generate these lines we have the equations (li,
=
0
and
(12, /2)
= 0.
(9.62)
But
(l1,12)+0 because otherwise the inner product would be identically zero. We can therefore assume that (9.63) (l1,12)=— 1. Given a vector x= + 12 of E the equations (9.62) and (9.63) yield
(x,x) =
—
<0 and x is time-like if >0. In showing that x is space-like if other words, the space-like vectors are contained in the two sectors S1 and T. and S2 of fig. 3 and the time-like vectors are contained in The inner product of two vectors and
\ 7
(2z)
Fig. 3
is
given by (x
\:
This formula shows that the inner product of two space-like vectors is positive if and only if these vectors are contained in the same one of the sectors .
.
and
Let an orientation be defined in E by the normed determinant function A.
284
Chapter IX. Symmetric bilinear functions
Then the identity (9.61) yields (n=2, s= 1) (x,
—
(9.64)
(x, y)2 = (x, x) (y, y).
A
If x and y are not light-vectors equation (9.64) may be written in the form
2
2
—='
(x, x) (y, y)
(9.65)
(x, x) (y, y)
Now assume in addition, that the vectors x and y are space-like and are and S2. Then contained in the same one of the sectors
(x,y)>O.
(9.66)
Relations (9.65) and (9.66) imply that there exists exactly one real number 0 ( — oo <0< cc) such that cosh 0 =
(x,y)
and
sinh 0 =
lxi
A(x,y) .
-
(9.67)
lxi
This number is called the pseudo-Euclidean angle between the space-like
vectors x and y. We finally note that the vectors e1 =
and
12)
e2
+ 12)
form an orthonormal basis of E. Relative to this basis the equation of the light-cone assumes the form —
= 0.
9.21. Pseudo-Euclidean spaces of index n—i. More generally let us consider an n-dimensional pseudo-Euclidean space with index n —1. Then
every fixed time-like unit vector z determines an orthogonal decomposition of E into an (n — 1)-dimensional subspace consisting of spacelike vectors and the 1-dimensional subspace generated by z. In fact, every vector xEE can be uniquely decomposed in the form
(z,y)=0 where the scalar 2 is given by 2
= —(x,z).
Passing over to the norm we obtain the equation
(x,x)=
+(y,y)
§ 4. Pseudo-Euclidean spaces
285
showing that
<(y, y) if x is space-like A2 > (y, y) if x is time-like A2 = (y, y) if x is a light-vector.
(9.68)
From this decomposition we shall now derive the following properties: (1) Two time-like vectors are never orthogonal. (2) A time-like vector is never orthogonal to a light-vector. (3) Two light-vectors are orthogonal if and only if they are linearly dependent.
(4) The orthogonal complement of a light-vector is an (n — 1)-dimensional subspace of E in which the inner product is positive semidefinite and has rank n—2. To prove (1), consider another time-like vector z1. This vector z1 can be written as (z,y1) = 0. (9.69) = Az + Yi Then
> (yr)
whence
Inner multiplication of (9.69) by z yields
(z,z1) = A(z,z) + 0. Next, consider a light-vector 1. Then
l=Az+y
(z,y)=0
and
A2 =(y,y) >0. These two relations imply that
(!,z) = A(z,z) + 0 which proves (2). Now let and '2 decompositions
be
two orthogonal light-vectors. Then we have the
=21z+y1 and
12
22z+y2,
whence —
A2 + (Y1'Y2) = 0.
(9.70)
Observing that
= (Yi'Yi) and
= (Y2'Y2)
we obtain from (9.71) the equation (Yl,Yl)(Y2'Y2) = (Y1'Y2)2.
(9.71)
Chapter IX. Symmetric bilinear functions
286
The vectors Yi and Y2 are contained in the orthogonal complement of z. In this space the inner product is positive definite and hence equation. Inserting (9.71) implies that Yt and Y2 are linearly dependent, Y2 whence 12=2'l. this into (9.70) we find Finally, let / be a light-vector and E1 be the orthogonal complement of 1. It follows from property (2) that E1 does not contain time-like vectors. In other words, the inner product is positive semidefinite in E1. To find the null-space of the inner product, assume that Yi is a vector of E1 such that (Yi'Y) = 0 for all vectors yeE1.
This implies that (Yi' Yi)=O showing that Yi is a light-vector. Now it follows from property (3) that Yi is a multiple of 1. Consequently, the null-space of the inner product in E1 is generated by 1. 9.22. Fore-cone and past-cone. As another consequence of the properties established in the last section it will now be shown that the set of all T (cf. fig. 4.) To time-like vectors consists of two disjoint sectors this purpose we define an equivalence relation in the set T of all time-like
vectors in the following way:
if (:1,z2)<0.
2
(9.72)
Relation (9.72) is obviously symmetric and reflexive.
To prove the transitivity, consider three time-like 1, 2, 3) and assume that
vectors
(z1,z3)<0 and (z2,z3)<0. We have to show that Fig. 4
(z1,z2) <0.
We may assume that is a time-like unit-vector. Then the vectors z1 and z2 can be decomposed in the form
=
=
+
(i = 1,2)
—
(9.73)
where the vectors Yi and Y2 are contained in the orthogonal complement F of z3. Equations (9.73) yield
=
—
(1 = 1,2)
+
(9.74)
and
(zl,:2) = —
+
(9.75)
§ 4. Pseudo-Euclidean spaces
287
It follows from (9.68) that (ye, yi) <
(1 = 1,2).
Now observe that the inner product is positive definite in the subspace F. Consequently, the Schwarz inequality applies to the vectors Yi and Y2' yielding (Y1'Y2)2
inequality shows that the first term determines the sign on the right-hand side of (9.75). But this term is negative because = — z3) >0 (1 = 1, 2) and we thus obtain
This
(z1,z2)
<0.
The equivalence relation (9.72) induces a decomposition of the set T into two classes T which are obtained from each other by the reflection x—÷ —x.
9.23. The two subsets any two vectors z1 and
z2
T are convex, i.e., they contain with the straight segment
z(t)=(1—t)z1+tz2 In fact, assume that z1 and z2 ETt
Then
(z1, z1) <0,(z2, z2) < 0 and (z1, z2) <0, whence
(z(t),z(t))
(1 — t)2(z1,z1) + 2t(1 —
t)(z1,z2) + t2(z2,z2) <0,
In the special theory of relativity the sectors and the past-cone.
T
The set S of the space-like vectors is not convex as fig. 4 shows.
Problems 1. Let E be a pseudo-Euclidean plane and g1, g2 be the two straight lines generated by the light-vectors. Introduce a Euclidean metric in E such that g1 and g2 are orthogonal. Prove that two vectors x+0 and y+O are orthogonal with respect to the pseudo-Euclidean metric if and only if they generate the Euclidean bisectors of g1 and g2.
2. Consider a pseudo-Euclidean space of dimension 3 and index 2. Assume that an orientation is defined in E by the normed determinant
Chapter IX. Symmetric bilinear functions
288
function A. As in a Euclidean space define the cross product of two vectors x1 and x2 by the relation
(x1 x x2,x3) = A(x1,x2,x3). Prove: a) x1 x x2 = 0 if and only if the vectors x1 and x2 are linearly dependent. b) (x1 x x2, x1 x x2)= (x1, x2)2 — (x1, x1) (x2, x2) c) If e1, e2, e3 is a positive orthonormal basis of E, then
e1 x e2 =
— e3,
= — e2,
e1 x
e2 x
=
e1
3. Let E be an n-dimensional pseudo-Euclidean space of index n — 1. Given two time-like unit vectors z1 and z2 prove: a) The vector +z2 is time-like or space-like depending on whether z1 and z2 are contained in the same cone or in different cones. b) The Schwarz inequality holds in the reversed form (z1,z2)2 (z1,z1)(z2, z2).
Equality holds if and only if z1 and z2
are
linearly dependent.
4. Denote by S the set of all space-like vectors. Prove that the set S More precisely: Given two vectors x0eS and x1eS is connected if there exists a continuous curve x=x(t)(O t 1) in S such that x(O)=x0
x(l)=x1. §
5. Linear mappings of pseudo-Euclidean spaces
9.24. The adjoint mapping. Let p a linear transformation of the n-di-
mensional pseudo-Euclidean space E. Since E is dual to itself with respect to the inner product the adjoint mapping can be defined as in sec. (8.1). The mappings çø and are connected by the relation
X,yEE. The duality of the mappings ço and det
= det 'p
(9.76)
implies that and tr
= tr 'p.
(v, = 1 ii) be the matrices of 'p and relative to an and Let into (9.76) we find that Inserting and orthonormal basis .
.
(v,u= I ...n)
§ 5. Linear mappings of pseudo-Euclidean spaces
where
289
J+1(v=1...s) l—1(v=s+1...n).
Now assume that the mapping p is selfadjoint, = p. In the positive definite case we have shown that there exists a system of n orthonormal eigenvectors. This result can be carried over to pseudo-Euclidean spaces of dimension n 3 if we add the hypothesis that (x, (px) + 0 for all lightvectors. To prove this, consider the symmetric bilinear functions
k(x,y)=(çox,y) and W(x,y)=(x,y). It follows from the above assumption that çji
(x)2
> 0 for all vectors x + 0.
+ 'I'
Hence the theorem of sec. 9.11 applies to and exists an orthonormal basis (v= 1 ...n) such that
showing that there
(v,p = 1... n).
=
(9.77)
Equations (9.77) imply that
(v=1...n) showing that the are eigenvectors of (p. 9.25. Pseudo-Euclidean rotations. A linear transformation çø of the pseudo-Euclidean space E which preserves the inner product, (çox,çoy) = (x,y)
(9.78)
is called a pseudo-Euclidean rotation. Replacing y by x in (9.78) we obtain the equation
(px,px)=(x,x)
xeE
showing that a pseudo-Euclidean rotation sends space-like vectors into space-like vectors, time-like vectors into time-like vectors and lightvectors into light-vectors. A rotation is always regular. In fact, assume that çox =0 for a vector xeE. Then it follows from (9.78) that
(x,y) =((px,py) = 0 for all vectors yeE, whence x=0. Comparing the relations (9.76) and (9.78) we see that the adjoint and the inverse of a pseudo-Euclidean rotation coincide, (9.79) 19
Greub. Linear Algebra
Chapter IX. Symmetric bilinear functions
290
Equation (9.79) shows that the determinant of must be ± 1, as in the Euclidean case. Now let e be an eigenvector of 'p and ). be the corresponding eigenvalue,
çoe
=
Passing over to the norms we obtain (coe,coe) =
equation shows that ).= ± 1 provided that e is not a light-vector. An eigenvector which is contained in the light-cone may have an eigenvalue ± I as can be seen from examples. If an orthonormal basis is chosen in E the matrix of satisfies the relations = This
A matrix of this kind is called pseudo-orthogonal. 9.26. Pseudo-Euclidean rotations of the plane. In particular, consider
a pseudo-Euclidean rotation 'p of a 2-dimensional space with index 1. Then the light-cone consists of two straight lines. Since the light-cone is preserved under the rotation ço, it follows that these straight lines are either transformed into themselves or they are interchanged. Now assume that 'p is a proper rotation i.e. det 'p = + I. Then the second case is im-
possible because the inner product is preserved under 'p. Thus we can write
and
1
'p12—j2,
(9.80)
It
the basis of E defined in sec. 9.20. The number). is positive or negative depending on whether the sectors T are mapped onto themselves or interchanged. where
/2 is
Now consider an arbitrary vector (9.81)
Then equations (9.80) and (9.81) yield
'p
(x, x).
(9.82)
§ 5. Linear mappings of pseudo-Euclidean spaces
291
This equation shows, that the inner product of x and çox depends only on the norm of x as in the case of a Euclidean rotation (cf. sec. 8.21). To find the "rotation-angle" of (p, introduce an orientation in E such is positive. Let zI be a normed determinant function that the basis which represents this orientation. Then identity (9.64) yields 4(11,12)2 =
12) =
1,
whence 4(11,12) = 1.
Inserting the vectors x and çox into A we find that
= T Now assume in addition that p transforms the sectors and T). Then into themselves (i. e., that tp does not interchange
2>
and equation (9.82) shows that (x, q,x) >0 for every space-like vector x. Using formulae (9.67) we obtain the equations 0
cosh ü =
(2 +
and sinh
=1
(9.83) —
where 0 denotes the pseudo-Euclidean angle between the vectors x and Now consider the orthonormal basis of E which is determined by the vectors
=
— 12)
and
e2
+ 12).
Then equations (9.80) yield
—
+ 2(A
thus obtain the following representation of p, which corresponds to the representation of a Euclidean rotation given in sec. 8.21: We
0 + e2 sinh 0 = e1 sinh 0 + e2 cosh 0.
p e1 = e1 cosh
q, e2 19*
292
Chapter IX. Symmetric bilinear functions
9.27. Lorentz-transformations. A 4-dimensional pseudo-Euclidean space with index 3 is called Minkowski-space. A Lorentz-transformation is
a rotation of the Minkowski-space. The purpose of this section is to show that a proper Lorentz-transforrnation ço possesses always at least one eigen vector on the We may restrict ourselves to Lorentz-
transformations which do not interchange fore-cone and past-cone because this can be achieved by multiplication with — I. These transformations are called orthochroneous. First of all we observe that a lightvector 1 is an eigenvector of ço if and only if (1, col) = 0. In fact, the equation çol=2l yields (l,(p/) = 2(1,1) = 0.
Conversely, assume that 1 is a light-vector with the property that (1, pl) = 0. Then it follows from sec. (9.21) property (3) that the vectors 1 and çol are linearly dependent. In other words, 1 is an eigenvector of ço. Now consider the selfadjoint mapping (9.84) Then
=(x,gox) xeE.
= j-(x,(px)+
(9.85)
It follows from the above remark and from (9.85) that a light-vector 1 is an eigenvector of p if and only if (I, = 0. We now preceed indirectly and assume that ço does not have an eigenvector on the lightcone. Then (x, = (x, cox) 0 for all light-vectors and hence we can apply the result of sec. 9.24 to the mapping /i: There exist four eigenvectors (v = 1.. .4) such that
(+
=
=
1(v = 1,2,3) 1
(v
= 4).
Let us denote the time-like eigenvector e4 by e and the corresponding eigenvalue by 2. Then and hence it follows from (9.84) that ço2e = 22çIie
—
e.
Next, we wish to construct a time-like eigenvector of the mapping 'p. If çoe is a multiple of e, e is such a vector. We thus may assume that the vectors e and çoe are linearly independent. Then these two vectors generate *) Observe that a proper Euclidean rotation of a 4-dimensional space need not have
eig envectors.
§ 5. Linear mappings of pseudo-Euclidean spaces
a
293
plane F which is invariant under ço. This plane intersects the light-cone
in two straight lines. Since the plane F and the light-cone are both invariant under these two lines are either interchanged or transformed into themselves. In the second case we have two eigenvectors of p on the light-cone, in contradiction to our assumption. In the first case select two generating vectors and '2 on these lines such that (/i, /2)= 1. Then and
The condition (911,(p12) = (11,12)
implies that cxf3= 1. Now consider the vector z Z
= 11 + CL!2.
Then
=
+ cx/311
cL!2
+
=
z
To show that z is timelike observe that
(z,z) = and that the vector t=11
is
2ci(11,12)
time-like. Moreover,
(t, q,t) <0 because ço leaves the fore-cone and the past-cone in-
variant (cf. sec. 9.22). Hence equation implies that <0, showing that z is time-like. Using the time-like vector z we shall now construct an eigenvector on the light-cone which will give us a contradiction. Let E1 be the orthogonal complement of z. E1 is a 3-dimensional Euclidean subspace of E which is invariant under çø. Since p is a proper Lorentz-transformation it induces a proper Euclidean rotation in E1. Consequently, there exists an invariant axis in E1 (cf. sec. 8.22). Let y be a vector of this axis such that (i'. i')= —2 Then / = v + z is a light-vector and CL.
y
+z=
1
i. e. 1 is an eigenvector of p. Hence, the assumption that there are no eigenvectors on the light-cone leads to a contradiction and the assertion in the beginning of this section is proved.
Chapter IX. Symmetric bilinear functions
294
We finally note that every eigenvalue 2 of p whose eigenvector 1 lies on the light-cone is positive. In fact, select a space-like unit-vector y such that (1, y) = and consider the vector z = 1+ ry where t is a real parameter. Then we have the relation 1
(:,:)
= 2t + z2
showing that z is time-like if —2<'r
preserves
(z,çoz)
(—2
+ ty,2l+
t(2
fore-cone and
But
and we thus obtain +
(—2
<0
+
Letting 'r—+O we see that 2 must be positive.
Problems 1. Let p be a linear automorphism of the plane E. Prove that an inner product of index 1 can be introduced in E such that p becomes a proper pseudo-Euclidean rotation if and only if the following conditions are satisfied:
1. There are two linearly independent eigenvectors.
2. detp=l. 3.
Find the eigenvectors of the Lorentz-transformation defined by the matrix 2.
0 —l
H1
0
0
H
0
1
1
0
1
Verify that there exists an eigenvector on the light-cone. 3. Let a and b be two linearly independent light-vectors in the pseudoof E is defined by Euclidean plane E. Then a linear transformation
i/ib=—b.
§ 5. Linear mappings of pseudo-Euclidean spaces
295
Consider the family of linear automorphisms (p (t) which is defined by the
differential equation
= and the initial condition
(p (0) = a) Prove that (p (t) is a family of proper rotations carrying fore-cone and past-cone into themselves. b) Define the functions C(t) and S(t) by
and S(t) =
C(t) = Prove the functional-equations
C(t1 + t2) = C(t1)C(t2) + S(t1)S(t2) and S(t1
+ t2) = S(t1)C(t2) + S(t2)C(t1).
c) Prove that
(p(t)a=eta
and
(p(t)b_—etb.
4. Let E be a pseudo-Euclidean space and consider an orthogonal decomposition
E=
such that the restriction of the inner product to
(E) is positive
(negative) definite. Let w be a selfadjoint involution of E such that (and hence E) is stable under w. Define a symmetric bilinear function
x,yEE.
W(x,y)=(wx,y) Prove that the signature of
is given by
—trw where
w denote the restrictions of w to
E respectively.
Chapter X Q uadrics
*
In the present Chapter the theory of the bilinear functions developed
in Chapter IX will be applied to the discussion of quadrics. In this context we shall have to deal with affine spaces. § 1.
Affine spaces
10.1. Points and vectors. Let E be a real n-dimensional linear space and let A be a set of elements F, Q... which will be called points. Assume that a relation between points and vectors is defined in the following way:
1. To every ordered pair F, Q of A there is assigned a vector of E, called the vector and denoted by PQ. 2. To every, point PeA and every vector xeE there exists exactly one
point QeA such that PQ=x. 3. If P, Q, R are three arbitrary points, then (10.1)
A is called an n-dimensional affine space with the d?fference space E.
Insertion of Q =P in (10.1) yields PP+ FR=PR, whence PP= 0 for every point PeA. Using this relation we obtain from (10.1)
QP = - P Q. The equation P1Q1=P2Q2 implies that P1P2=Q1Q2 (parallelogramlaw). In fact
P1P2=P1Q2-P2Q2 =
P1Q2
—
Q1.
Subtraction of these equations yields P1P2= For any given linear space E, an affine space can be constructed which possesses E as difference space:
§ 1. Affine spaces
297
Define the points as the vectors of E and the difference-vector of two
points x and y as the vector y—x. Then the above conditions are obviously satisfied.
Let A be a given affine space. If a fixed point 0 is distinguished as origin, every point P is uniquely determined by the vector x = OP. x is called the position-vector of P and every point P can be identified with the corresponding position-vector x. The difference-vector of two points x and y is simply the vector y —x. 10.2. Affine coordinate systems. An affine coordinate-system (0; x1 . l...n) of the consists of a fixed point OEA, the origin, and a basis .
difference-space E. Then every point PeA determines a system of n numbers
(v= 1.. ii) by
OP = (v = .. n) are called the affine coordinates of P relative to the given system. The origin 0 has the coordinates Now consider two affine coordinate-systems
The numbers
1
(0; x1 ...
and
(0'; Yi ...
the matrix of the basis-transformation affine coordinates of 0' relative to the system (0; x1
and by f3V the
Denote by
. .
= The affine coordinates
tems (0; x1 .
.
and
and (0'; Yi
. .
of a point P, corresponding to the sys.y,,)
respectively, are given by
and
01P _>:JJVy
(10.2)
Inserting O'P= OP—00' in the second equation (10.2) we obtain = — whence
(JL==1...n).
Multiplication by the inverse matrix yields the transformation-formula for the affine coordinates:
(v=1...n).
Chapter X. Quadrics
10.3. Affine subspaces. An affine suhspace of A is a subset A1 of A such
that the vectors PQ(PeA1, QeA1) form a subspace of E. If 0 is the origin of A and (01; is an affine coordinate-system of A1, the points of A can be represented as 1
=
(10.3)
+
Forp= I we obtain a straight line through °1 with the "direction vector"
0P= 0
+ x.
In the case p=2 equation (10.3) reads
It then represents the plane through 01 generated by the vectors x1 and x2. An affine subspace of dimension n — is called a hyperplane. 1
Two affine subspaces A1 and A2 of A are called parallel if the difference-space E1 of A1 is contained in the difference-space E2 of A2, or conversely. Parallel subspaces are either disjoint or contained in each other. Assume, for instance, that E2 is contained in E1. Let Q be a point of the intersection A1 fl A2 and P2 be an arbitrary point of A2. Then
QP2 is contained in E2 and hence is contained in E1. This implies that P2EA1, whence A2 c: A1. 10.4. Affine mappings. Let P—÷P' be a mapping of A into itself subject
to the following conditions: 1. P1 Q1 =P2Q2
implies that
2. The mapping p:
defined by p (PQ)=P'Q' is linear.
Then P—*P' is called an affine mapping. Given two points 0 and 0' and
a linear mapping p: E—*E, there exists exactly one affine mapping which sends 0 into 0' and induces 'p. This mapping is defined by
0
=
00' + 'p (OP).
If a fixed origin is used in A, every affine mapping x—÷x' can be written in the form
x' =
'p
x + b,
where 'p is the induced linear mapping and b =0 0'.
§ 1. Affine spaces
299
A translation is an affine mapping which induces the identity in E,
For two arbitrary points P and F' there obviously exists exactly one translation which sends P into P'. 10.5. Euclidean space. Let A be an n-dimensional affine space and assume that a positive definite inner product is defined in the differencespace E. Then A is called a Euclidean space. The distance of two points P and Q is defined by
Q(P,Q)= IPQI.
It follows from this definition that the distance has the following properties:
1. Q(P,Q)0 and Q(P,Q)=0 if and only ifP=Q. 2. 3.
Q(P,R) + Q(R,Q). All the metric concepts defined for an inner product space (cf. Chap. VII) can be applied to a Euclidean space. Given a point x1eA of A and a vector p +0 there exists exactly one hyperplane through x1 whose difference-space is orthogonal to p. This hyperplane is represented by the equation (x — x1,p)= 0. A rigid motion of a Euclidean space is an affine mapping P—+P' which preserves the distance, (10.4) Q(P',Q') = Q(P,Q).
Condition (10.4) implies that the linear mapping, which is induced in the difference-space by a rigid motion, is a rotation. Conversely, given a rotation p and two points OeA and O'eA, there exists exactly one rigid motion which induces 'p and maps 0 into 0'.
Problems in an affine space are said to be in general 1. (p+ 1) points are not contained in a (p — 1)-dimensional subposition, if the points .p) are in general position if and space. Prove that the points only if the vectors
are linearly independent.
Chapter X. Quadrics
in general position, the set of all
2. Given (p+ 1) points points P satisfying —--*
p
p
is called the p-simplex spanned by the points .p). If 0 is the origin of A, prove that a point P of the above simplex can be uniquely represented as .
= 1.
= The numbers
(v=O.. .p) are called the barycentric coordinates of P.
The point B with the barycentric coordinates
1
+
(v=O. ..p) is
called the center of S. 3. Given a p-simplex (P0.. (p 2) consider the (p — 1) simplex Si deand denote by B1 the center fined by the points of Show that the straight lines (P,B1) and (i+j) intersect each other at the center B of S and that
BB=
1
p+l
B.P..
4. An equilateral simplex of length a in a Euclidean space is a simplex (P0..
=a(v+p). Find the angle between
with the property that
the vectors
and B
+ v, 2 + v) and between the vectors
is the center of (P0
. . .
Assume that an orientation is defined in the difference-space E by the determinant function A. An ordered system of (n + 1) points (P0.. . in general position is called positive with respect to the given orientation, if A (P0 P1 ... P0 P,j > 0. 5.
a) If the system (P0..
is positive and
numbers (0, 1 .. .n), show that the system
is a permutation of the . . .
is again positive if
and only if the permutation a is even. b) Let A1 be the (n — 1)-dimensional subspace spanned by the points P0.. P1.. Introduce an orientation in the difference-space of A, with
the help of the determinant function
§ 2. Quadrics in the affine space
301
where Q is an arbitrary point of A1. Prove that the ordered n-tuple is positive with respect to the determinant function an n-simplex andSbe its center. Denote by Sll...ik the center of the (n —k)-simplex obtained by deleting the vertices P,.., and define Now select an ordered system of n integers ii,..., the affine mapping by 6. Let (P0..
Prove that
(ii+l)!
is the corresponding linear map.
where
In this equation a denotes the permutation a(v)=i%. (v= l...n), a(O)=k ia). where k is the integer not appearing among the numbers 7. Let g1 and g2 be two straight lines in a Euclidean space which are not parallel and do not intersect. Prove that there exists exactly one . . .
point P1 ong1(i=l, 2) such that P1 P2 is orthogonal to g1 and tog2. 8. Let A1 and A2 be two subspaces of the affine space such that the difference-spaces E1 and E2 form a direct decomposition of E. Prove that the intersection A1 n A2 consists of exactly one point. 9. Prove that a rigid motion x' = tx + a has a fixed point if and only if the vector a is orthogonal to all the vectors invariant under t. A rigid motion is called proper if det t = + 1. Prove that every proper rigid motion of the Euclidean plane without fixed points is a translation.
10. Consider a proper rigid motion x'=tx+a(t+i) of the Euclidean plane. Prove that there is exactly one fixed point x0 and that + b is a vector of the same length as a and orthogonal to a. O designates the rotation-angle relative to the orientation defined by the basis (a, b). 11. Prove that two proper rigid motions + t and /3 + t of the plane commute, if and only if one of the two conditions holds: 1. cz and /3 are both translations 2. and /3 have the same fixed point.
§ 2.
Quadrics in the affine space
10.6. Definition. From elementary analytic geometry it is well known
Chapter X. Quadrics
that every conic section can be represented by an equation of the form
and are constants. Generalizing this to higher dimensions we define a quaciric Q in an n-dimensional affine space A as the set of all points satisfying an equation of the form where
'P(x) + 2f(x) =
(10.5)
is a quadratic function,f a linear function and a constant. For the following discussion it will be convenient to introduce a dual space E* of the difference-space E. Then the bilinear function 'P can be written in the form where
x,yeE, where p is a linear mapping of E into E* which is dual to itself: = çø. Moreover, the linear functionf determines a vector a*EE* such that
f
xeE.
Hence, equation (10.5) can be written in the form
(10.6)
We recall that the null-space of the bilinear function 'P coincides with the kernel of the linear mapping çø.
10.7. Cones. Let us assume that there exists a point x0eQ such that (pxo+a*=0. Then (10.6) can be written as
cc
(10.7)
and the substitution x=x0 gives cc =
—
x0, x0>.
Inserting this into (10.7) we obtain
+
—
Hence, the equation of Q assumes the form 'P(x —
x0)= 0.
A quadric of this kind is called a cone li'ith the vertex x0. For the sake of
§ 2. Quadrics in the affine space
303
simplicity, cones will be excluded in the following discussion. In other
words, it will be assumed that
p x + a* + 0 for all points x e Q.
(10.8)
10.8. Tangent-space. Consider a fixed point x0eQ. It follows from condition (10.8) that the orthogonal complement of the vector (pxo+a* is of E. This subspace is called the an (n — 1)-dimensional subspace if and tangent-space of Q at the point x0. A vectoryEEis contained in only if (10.9) 0.
andf equation (10.9) can be written as
In terms of the functions
(xe, y) + f (y)
0.
(10.10)
affine subspace which is determined by the point x0 and the tangent-space is called the tangent-hyperplane of Q at x0. It consists of all points The (n — 1)-dimensional
x=x0+y Inserting y=x—x0 into equation (10.10) we obtain
cli(x0,x—x0)+f(x—x0)=0.
(10.11)
Observing that cP(x0)+
we can write equation (10.11) of the tangent-hyperplane in the form
+ x)=
(10.12)
To obtain a geometric picture of the tangent-space, consider a 2dimensional plane (10.13)
through x0 where a and b are two linearly independent vectors. Inserting (10.13) into equation (10.5) we obtain the relation + +
+
a) + f (a)) +
+
b) + 1(b)) =
0
(10.14)
showing that the plane F intersects Q in a conic Define a linear function g by setting (10.15) g(x) = x) + .1(x)).
Chapter X. Quadrics
304
Now the equation of the conic y can be
then. in view of (10.8), written in the form
+ ijg(b) = 0.
b) + ij2'P(b) +
+
(10.16)
Now assume that the vectors a and h are chosen such that g(a) and g(h) are not both equal to zero. Then the conic has a unique tangent at the and this tangent is generated by the vector point
t=—g(b)a +g(a)b.
(10.17)
this follows from the The vector t is contained in the tangent-space equation g(t) = — g(b)g(a) + g(a)g(b) = 0.
can be obtained in this way. Every vector y+O of the tangent-space in fact, let a be a vector such that g (a) = and consider the plane through x0 spanned by a andy. Then equation (10.17) yields 1
t=—g(y)a +g(a)y=y
(10.18)
showing that y is the tangent-vector of the intersection Q fl F at the point
Note: If g(a)=0 and g(b)=0 equation (10.16) reduces to + ?J2cIi(b) = 0.
+
Then the intersection of Q and F consists of a) two straight lines intersecting at x0, if cli(a,b)2 — b)
the point x0 only, if 1(a, b)2 — cP(a)cP(b)
<0,
c) one straight line through x0, if —
but not all three coefficients (a),
7i(a)cP(b) = 0,
(b) and P (a, b) are zero
d) The entire plane F, if D(a) = 'P(b) = P(a,b) = 0. 10.9. Uniqueness of the representation. Assume that a quadric Q is represented in two ways (x) + 2f1 (x) = (10.19)
§ 2. Quadrics in the affine space
305
and
'P2 (x) + 212(x) =
(10.20)
It will be shown that =
=
'P2 =
+0 is a real number. Let x0 be a fixed point of Q. It follows from hypothesis (10.8) that the linear functions g1 and g2, defined by where ).
g1 (x) =
'P1
(x, x0) +
(x) and g2 (x) = 'P2 (x, x0) + 12(x)
(10.21)
are not identically zero. Choose a vector a such that g1 (a) + 0 and g2 (a) + 0, and a vector b +0 such that g1 (h) 0. Obviously a and h are linearly independent. The plane
x=
x0
+
+
then intersects the quadric Q in a conic y whose equation is given by each one of the equations 'P1(a) +
(b) + g1 (a) =
'P1 (a, b) +
0
(10.22)
and
'P2(a) +
'P2(b) + g2 (a) + g2 (b) = 0.
'P2 (a, b) +
(10.23)
The tangent of this curve at the point çe = = 0 is generated by the vector
=
g1
(a) b
and also by =
— g2
(b) a + g2 (a) b.
This implies that (10.24)
But b+0 was an arbitrary vector of the kernel of g1. Hence equation (10.24) shows that g2(b)=0 whenever g1(b)=0. In other words, the linear functions g1 and g2 have the same kernel. Consequently g2 is a constant multiple of g1
g2=Ag1 /+0.
(10.25)
Multiplying equation (10.22) by ). and subtracting it from (10.23) we obtain in view of (10.24) that —
).P1)(a) +
—
+
—
= 0. (10.26)
20
(Ireub. Linear Algebra
Chapter X. Quadrics
In this equation all three coefficients must be zero. In fact, if at least one coefficient is different from zero, equation (10.26) implies that the conic consists of two straight lines, one straight line or the point x0 only. But this is impossible because g1 (a)+0. We thus obtain from (10.26) that 2 (a, b) = 2 (a, b), (a), (10.27) 2 (b).
These equations show that 2
(x)
(10.28)
for all vectors XEE: If g1 (x)=0, (10.28) follows from the third equation (10.27); if g1 (x)+0, then g2 (x)+0 [in view of (10.25)] and (10.28) follows from the first equation (10.27). Altogether we thus obtain the identities (P2—2(P1
2+0.
and g2=1g1
Now relations (10.21) imply that 12 = 2f1
and equation (10.20) can be written as 2(cP1 (x)
+ 2f1 (x))
(10.29)
Comparing equations (10.19) and (10.29) we finally obtain completes the proof of the uniqueness theorem. 10.10. Centers. Let Q: (P (x) + 2f (x) = be
==2a1. This
a given quadric and c be an arbitrary point of the space A. if we
introduce c as a new origin,
x= the
C + X',
equation of Q is transformed into (P (x') + 2 ((P (c, x') + f (x')) =
— (P (c) —
2f (c).
(10.30)
Here the question arises whether the point c can be chosen such that the linear terms in (10.30) disappear, i. e. that (P (c,
x') + f (x') = 0
(10.31)
for all vectors x'eE. If this is possible, c is called a center of Q. Writing
§ 2. Quadrics in the affine space
307
equation (10.31) in the form
x'eE we see that c is a center of Q if and only if (cf. sec. 10.7)
coc=_a*.
(10.32)
This implies that the quadric Q has a center if and only if the vector a* is contained in the image space Im cp. Observing that Im is the orthogonal complement of the kernel of p we obtain the following criterion: A quadric Q has a center if and only if the vector a* is orthogonal to the kernel of p. if this condition is satisfied, the center is determined up to a vector of ker ço. In other words, the set of all centers is an affine subspace of A with ker ço as difference-space. Now assume that the bilinear function cli
is
non-degenerate. Then
is
a
regular mapping and hence equation (10.32) has exactly one solution. Thus it follows from the above criterion that a non-degenerate quadric has exactly one center. 10.11. Normal-form of a quadric with center. Suppose that Q is a quad-
nc with centers. If a center is used as origin the equation of Q assumes the form ck(x)=f3 (10.33) $+0.
Then the tangent-vectors y at a point x0 e Q are characterized by the equation
(10.34)
It follows from (10.34) that a center of Q is never contained in a tangenthyperplane. 1 Dividing (10.33) by ,13 and replacing the quadratic function cli by cli we can write the equation of Q in the normal-form cll(x) = 1.
Now select a basis
(10.35)
(v = 1.. .n) of E such that
+ 1(v = =
is)
= — I (v = s + 1... r)
0(v=r+1...n)
(10.36)
Chapter X. Quadrics
308
where r denotes the rank and s denotes the index of form (10.36) can be written as
Then the normal-
1.
(10.37)
10.12. Normal-form of a quadric without center. Now consider a quadnc Q without a center. If a point of Q is chosen as origin the constant in (10.5) becomes zero and the equation of Q reads
ck(x) + 2 = 0.
(10.38)
By multiplying equation (10.38) with — if necessary we can achieve that 2s r. In other words, we can assume that the signature of çJi is not 1
negative
To reduce equation (10.38) to a normal form consider the tangentspace T0 at the origin. Equation (10.9) shows that T0 is the orthogonal complement of a*. Hence, a* is contained in the orthogonal complement On the other hand, a* is not contained in the orthogonal complement K' (K=ker p) because otherwise Q would have a center (cf. sec. 10.10). show that T01cf K'. Taking the orthoThe relations a*ET0' and gonal complement we obtain the relation T0 K showing that there exists a vector aeK which is not contained in T0 (cf. fig. 5). Then +0 and hence we may assume that = 1. Now T0 has dimension n— 1 and hence every vector
xEE can be written in the form
yeT0.
(10.39)
Inserting (10.39) into equation (10.38) we obtain
2
Fig. 5
= 0.
(10.40)
Now
and
cP(a)=0,
because aeK, and = 0, because yeT0. Hence, equation (10.40) reduces to the following normalform: = 0. (10.41) cP(y) +
Since a is contained in the null space of 'P it follows from the decomposition (10.39) that the restriction of 'P to the tangent-space T0 has again
§2. Quadrics in the affine space
rank r and index s. Therefore we can select a basis = 1. such that (v'u = 1 ... n — I).
309
.
.n — 1) of T0
Then the vectors (v = 1.. .n — 1) and a form a basis of E in which the normal form (10.41) can be written as = 0.
+
(10.42)
Problems 1. Let E be a 3-dimensional pseudo-Euclidean space with index 2. Given an orientation in E define the cross product x x y by
(x x y, z) = A (x, y, z)
x, y, z e E
where A is a normed determinant function (cf. sec. 9.19) which represents the orientation. Consider a point x0 + 0 of the light-cone (x, x)
0
andaplane which does not contain the point 0. Prove that the intersection of the plane F and the light-cone is an ellipse if a x b is time-like a hyperbola if a x b is space-like a parabola if a x b is a light-vector.
2. Consider the quadric Q : (x) = where 1i is a non-degenerate quadratic function. Then every point x1 +0 defines an (n — 1)-dimensional 1
subspace P(x1) by the equation
cP(x,x1)= 1. This subspace is called the polar of x1. It follows from the above equation
that the polar P(x1) does not contain the center 0. a) Prove that x2eP(x1) if and only if x1 eP(x2). b) Given an (n — 1)-dimensional affine subspace A1 of A which does not contain 0, show that there exists exactly one point x1 such that A1 cP(x1). c) Show that P(x1) is a tangent-plane of Q if and only if x1 eQ. 3. Let x1 be a point of the quadric (x) = 1. Prove that the restriction has the rank r — and of the bilinear function to the tangent-space index s—l. 4. Show that a center of a quadric Q lies on Q only if Q is a cone. 1
Chapter X. Quadrics
5. Let x0 be a point of the quadric 'P (x) + 2f (x) + ci =
0
and consider the skew bilinear mapping w: E x w (x, y) = g (x) y — g (v) x
defined by x, y E E
the linear function g is defined by (10.15). Show that the linear closure of the set w (x,y) under this mapping is the tangent-space where
§ 3.
Affine equivalence of quadrics + b of A onto itself be
10.13. Definition. Let an affine mapping x' = given. Then the image of a quadric Q: 'P (x) is
+ 2f (x) =
the quadric Q' defined by the equation Q': W(x) + 2g(x) =
J3,
where
= 'P(t'x), g(x) = — 'P(t'x,r1 b) + f
(10.43)
(t1 x)
(10.44)
and (10.45)
In fact, relations (10.43), (10.44) and (10.45) yield
+ b) + 2g(tx + b)
= 'P(x) + 21(x)
—
that a point XEA is contained in Q if and only if the point x'=tx+b is contained in Q'. showing
Two quadrics Q1 and Q2 are called affine equivalent if there exists a one-to-one affine mapping of A onto itself which carries Q1 into Q2. The
affine equivalence induces a decomposition of all possible quadrics into affine equivalence classes. It is the purpose of this paragraph to construct a complete system of representatives of these equivalence classes. 10.14. The affine classification of quadrics is based upon the following theorem: Let E and F be two n-dimensional linear spaces and 'P and
two symmetric bilinear functions in E and in F. Then there exists a
§ 3. Affine equivalence of quadrics
linear isomorphism r:
311
with the property that
x,yeE
=
(10.46)
if and only if cP and W have the same rank and the same index. To prove this assume first that the relation (10.46) holds. Select a basis (v= 1.. ii) of E such that
+1(v=1..s) = — 1(v = s + 1 ... r)
(10.47)
0(v=r+1...n). Then equations (10.46) and (10.47) yield
=
=
showing that W has rank r and index s.
Conversely, assume that this condition is satisfied. Then there exist and of E and of F such that
bases
(ar, a,L)
and
=
=
W (by,
by the equations
Define the isomorphism 'r:
(v=1...n). Then
(v,u = 1 ... n)
P(aV,a,L) =
and consequently
x,yeE.
1(x,y)=W(xx,xy)
10.15. Affine classification. First of all it will be shown that the centers are invariant under an affine mapping. In fact, let Q:
çJi
(x — c)
=
/3
be a quadric with c as center and x' = xx + b an affine mapping of A onto itself. Then the image Q' of Q is given by the equation Q':
W
(x — c')
=
/3,
where
P(x1x) and
c' =
b
+ x c.
This equation shows that c' is a center of Q'.
Chapter X. Quadrics
312
Now consider two quadrics with center
Q1P1(x
(10.48)
— c1)= 1
and
Q2:P2(x—c2)=
(10.49)
1
is an affine mapping carrying Q1 into Q2. Since and assume that centers are transformed into centers we may assume the mapping x—+x' sends c1 into c2 and hence it has the form x' = r(x
—
c1) + c2.
By hypothesis, Q1 is mapped onto Q2 and hence the equation
c12(r(x — c1)) = 1
(10.50)
must represent the quadric Q1. Comparing (10.48) and (10.50) and applying the uniqueness theorem of sec. 10.9 we find that
= This relation implies that
= r2
r1
and
s1
= s2.
(10.51)
Conversely, the relations (10.51) imply that there exists a linear automorphism t of E such that 'P1(x) = 'P2(rx). Then the affine mapping
defined by
x' =
— c1) + c2
transforms Q1 into Q2. We thus obtain the following criterion: The two
normal forms (10.48) and (10.49) represent affine equivalent quadrics if and only if the bilinear functions and P2 have the same rank and the same index. 10.16. Next, let
=0
q1eQ1
(10.52)
=
q2 EQ2
(10.53)
and Q2:
(x
—
q2)
+2
x — q2>
0
be two quadrics without a center. It is assumed that the equations (10.52) and (10.53) are written in such a way that 2s1 r1 and 2s2 r2. If x' = t(x—q1) + q2 is an affine mapping transforming Q1 into Q2, the
§ 3. Affine equivalence of quadrics
313
equation of Q1 can be written in the form —
q1))
+
q1)>
—
= 0.
Now the uniqueness theorem yields
= 2b2('rx) where 2+0 is a constant. This relation implies that the bilinear functions have the same rank r and that cP1 and or =r depending
on whether 2>0 or 2<0. But the equation s2=r—s1 is only compatible with the inequalities 2s1
ifs1
and 2s2
=
and hence we see
that =s2 in either case. Conversely, assume that r1 = r2 = r and = s. To find an affine mapping which transforms Q1 into Q2 consider the tangent-spaces As has been mentioned in sec. 10.12 the restriction 71(Q1) and to the subspace (i= 1, 2) has the same rank and the same of index as cli1. Consequently, there exists an isomorphism _* such that yET41(Q1). P1(y) = Now select a vector a1 in the nullspace of
=
1
(i =
1,
2) such that
(1 = 1,2)
and define the linear automorphism 'r of E by the equations
ye71(Q1) and
'ra1=a2.
(10.54)
Then
=
+
cP1(x)
xeE.
+
(10.55)
in fact, every vector xEE can be decomposed in the form (10.56)
Equations (10.54), (10.55) and (10.56) imply that c112(tx) = cIi2(tY +
=
=
cP1(y)
= P1(y +
=
cP1(x)
(10.57)
and
=
+
=
=
=
(10.58)
Chapter X. Quadrics
314
Adding (10.57) and (10.58) we obtain (10.55). Relation (10.55) shows
that the affine mapping x'=r(x—q1)+q2 sends Q1 into
Q2 and we
have the following result: The normal-forms (10.52) and (10.53) represent affine equivalent quadrics if and only if the bilinear functions and '2 have the same rank and the same index. 10.17. The affine classes. It follows from the two criteria in sec. 10.15 and 10.16 that the normal forms
and
(r 2s) form a complete system of representatives of the affine classes. Denote by
N1 (r) and by N2 (r) the total number of affine classes with center and without center respectively of a given rank r. Then the above equations show that r+1 if r is odd 2
N1(r)=r and N2(r)= r+2. 2
.
if r is even
0
——
r=n.
The following list contains a system of representatives of the affine classes in the plane and in 3-space *):
Plane: I. Quadrics with center:
1. r=2: a) s=2: 2. r= I:
ellipse,
1
b) 5= 1:
s=1:
±
=1
hyperbola. two parallel lines.
=0
parabola.
1
11. Quadrics without center:
r= 1, s= 1: 3-space:
I. Quadrics with center:
1. r=3: a) s=3: b) s = 2:
c) s= 1:
ellipsoid,
+
— = I hyperboloid with one shell, =1
hyperboloid with two shells.
*) In the following equations the coordinates are denoted by scripts indicate exponents.
and the super-
§ 3. Affine equivalence of quadrics 2.
r
2: a) s = 2:
+ p12
elliptic cylinder,
1
=
b) s— 1: 3. r = I: s = 1:
315
hyperbolic cylinder.
two parallel planes.
=±I 1!. Quadrics without center:
1. r—2: a) s=2:
elliptic paraboloid,
=0 hyperbolic paraboloid.
b) s= 1: 2. r= I, s= 1:
parabolic cylinder.
Problems 1. Let Q be a given quadric and C be a given point. Show that C is a
center of Q if and only if the affine mapping —
defined by CP'=
CF transforms Q into itself. 2. If 1i is an indefinite quadratic function, show that the quadrics
=
1
'k(x) = — 1
and
are equivalent if and only if the signature of cP is zero.
3. Denote by N1 and by N2 the total number of affine classes with center and without center respectively. Prove that N1 = N2
n(n + 1) 2
(k2+k—1 if n=z2k
if n=2k+1.
4. Let x1 and x2 be two points of the quadric
Q:'P(x)= 1. Assume that an isomorphism t:
is given such that
P('ry,tz) = 1(y,z) which transforms Q into itself and Construct an affine mapping in the tangent-space which induces the isomorphism t 5. Prove the assertion of problem 4 for the quadric
P(x) + 2 = 0.
Chapter X. Quadrics
§ 4.
Quadrics in the Euclidean space
10.18. Normal-vector. Let A be an n-dimensional Euclidean space and
determines a selfadjoint linear
be a quadric in A. The bilinear function transformation of E by the equation
ck(x,y) = (çox,y).
The linear function f can be written as
f(x) = (a,x) where a is a fixed vector of E. Cones will again be excluded; i. e. we shall assume that çox4 —a for all points xeQ. Let x0 be a fixed point of Q. Then equation (10.10) shows that the tangent-space consists of all vectors y satisfying the relation
+ a,y) = 0. In other words, the tangent-space
is the orthogonal complement of
the normal-vector
p(x0) =
çox0
+ a.
The straight line determined by the point x0 and the vector p (x0) is called the normal of Q at x0. 10.19. Quadrics with center. Now consider a quadric with center
Q:P(x)= 1.
(10.59)
Then the normal-vector p (x0) is simply given by
p(x0) =
cpx0.
This equation shows that the linear mapping p associates with every point x0eQ the corresponding normal-vector. In particular, let x0 be a point of Q whose position-vector is an eigenvector of ço. Then we have the relation = showing that the normal-vector is a multiple of the position-vector x0.
§ 4. Quadrics in the Euclidean space
317
Inserting this into equation (10.59) we see that the corresponding eigen-
value is equal to
2=—. 1
1Xo12
As has been shown in sec. 8.7 there exists an orthonormal system of n eigenvectors 1.. .n). Then (v = 1... n)
=
(10.60)
whence
=
cP (es,,
Let us enumerate the eigenvectors
such that
(10.61)
where r is the rank and s is the index of cb. Then equation (10.59) can be written as 1.
(10.62)
The vectors
(v=1...s)
(v=s+1...r)
and
are called the principal axes and the conjugate principal axes of Q. Inserting
(v=1...s)
(v=s+1...r)
and
Ia%I
into (10.62) we obtain the metric normal-form of Q: (10.63) i
1
a straight line which intersects the and —ar. The straight lines generated by the conjugate axes have no points in common with Q but they intersect the conjugate quadric
quadric Q in the points
I
at the points
and
l...r).
Chapter X. Quadrics
318
10.20. Quadrics without center. Now consider a quadric Q without center. Using an arbitrary point of Q as origin, we can write the equation of Q in the form +
2(a,x) =
0
(10.64)
where a is a normal vector of Q at the point x=0. For every point xeQ the vector
p(x) =
+a
(10.65)
is contained in the normal of Q. A point xEQ is called a vertex if the corresponding normal is contained in the null-space K of cP.
It will be shown that every quadric without center has at least one vertex. Applying
to the equation (10.65) we obtain
(pp(x)=
+
a point xeQ is a vertex if and only if
ço2x=—qa.
(10.66)
To find all vertices of Q we thus have to determine all the solutions of equations (10.64) and (10.66). The self-adjointness of implies that the mappings ço and 'p2 have the same image-space and the same kernel (cf. sec. 8.7). Consequently, equation (10.66) has at least one solution x. The general
solution of (10.66) can be written in the form
x=
x0
+z
where z is an arbitrary vector of the kernel K. Inserting this into equation
(10.64) we obtain
+ (a,x0) + (a,z) = 0.
(10.67)
Now (otherwise Q would have a center) and consequently (10.67) has a solution zeK. This solution is determined up to an arbitrary vector
of the intersection K n
T0.
In other words, the set of all vertices of Q
forms an affine subspace with the difference-space K fl T0. This subspace
has dimension (n—r-—1). Now we are ready to construct the normal form of the quadric (10.64). First of all we select a vertex of Q as origin. Then the vector a in (10.64) is
contained in the kernel K. Multiplying equation (10.64) by an appropriate scalar we can achieve that al = 1 and that Now let I
§ 4. Quadrics in the Euclidean space n — 1)
319
be a basis of T0 consisting of eigenvectors of Then the vectors — 1) and a form an orthonormal basis of E such that
(v = 1.. . n
(v,u = 1...
=
—
1)
and
=
=0
(v = 1... n — 1).
In this basis the equation of Q assumes the metric normal-form +
= 0.
(10.68)
Upon introduction of the principal axes and the principal conjugate axes
(v=1...s)
(v=s+1...r)
and
the normal-form (10.68) can be also written as
v1
(10.69) v=s+1
10.21. Metric classification of bilinear forms. Two quadrics Q and Q' in the Euclidean space A are called metrically equivalent, if there exists a rigid motion which transforms Q into Q'. Two metrically equiv-
alent quadrics are a fortiori affine equivalent. Hence, the metric classification of quadrics consists in the construction of the metric subclasses within every affine equivalence class.
It will be shown that the lengths of the principal axes form a complete system of metric invariants. In other words, two affine equivalent quadrics Q and Q' are metrically equivalent if and only if the principal axes of Q and Q' respectively have the same length.
We prove first the following criterion: Let E and F be two n-dimensional Euclidean spaces and consider two symmetric bilinear functions and having the same rank and the same index. Then there exists an isometric mapping t: E—+F such that
cli(x,y)=W(tx,ry)
x,yeE
if and only if cP and W have the same eigenvalues. E and Define linear transformations p: P (x, y) = ('p x,
x, y E E
and
W (x, y)
(10.70)
F by x, y)
x, y e F.
Chapter X. Quadrics
Then the eigenvalues of 'P and W are equal to the eigenvalues of
and
// respectively (cf. sec. 8.10).
Now assume that 'r is an isometric mapping of E onto F such that relation (10.70) holds. Then
(px,y) = (ç/itx,ty)
(10.71)
whence
= 'c_I cl//cl. This relation implies that (p and i/i have the same eigenvalues. have the same eigenvalues. Then Conversely, assume that p and there is an orthonormal basis in E and an orthonormal basis 1...n) in F such that = 2%a%,
and
(v = 1... ii).
l//b%, =
Hence, an isometric mapping 'r:
(10.72)
defined by
'rap =
1... n).
(v
(10.73)
Equations (10.72) and (10.73) imply that ((p
=
(k/i
(v, p =
'r
1
... n)
whence (10.71).
10.22. Metric classification of quadrics. Consider first two quadrics Q and Q' with center. Since a translation does not change the principal axes we may assume that Q and Q' have the common center 0. Then the equations of Q and Q' read
Q: 'P (x) =
1
and
Q':cP'(x)= 1. Now assume that there exists a rotation of E carrying Q into Q'. Then
cP(x)='P'('cx)
xEE.
(10.74)
It follows from the criterion in sec. 10.21 that the bilinear functions 'P and cP' have the same eigenvalues. This implies that the principal axes of Q and Q' have the same length.
=
(v
=
1
...r).
Conversely, assume the relations (10.75). Then
(v=1...n).
(10.75)
§ 4. Quadrics in the Euclidean space
321
Observing the conditions (10.61) we see that l...ii). According to the criterion in sec. 10.21 there exists a rotation z of E such that
= cP(-rx').
This rotation obviously transforms Q into Q'. Now let Q and Q' be two quadrics without center. Without loss of generality we may assume that Q and Q' have the common vertex 0. Then the equations of Q and Q' read
Q:cP(x)+2(a,x)=0
aeK
al = 1,
(10.76)
Q':P'(x)+2(a',x)=0
a'EK
la'I = 1.
(10.77)
and
If Q and Q' are metrically equivalent there exists a rotation 'r such that cP(x) = cb'(rx).
Then
= 1 ... r).
(v
(10.78)
Conversely, equations (10.78) imply that the bilinear functions P and ck' have the same eigenvalues,
=
(v =
1
... n).
(10.79)
Now consider the restriction W of to the subspace T0 (Q). Then every eigenvalue of W is also an eigenvalue of cb. in fact, assume that W(e,y) = )L(e,y) for a fIxed vector eeT0(Q) and all vectors yET0(Q). Then cP (e, x) =
'P (e,
a + y) =
'P (e,
(e, y) = 'P (e, a) +
a) +
(e, y)
(10.80)
for an arbitrary vector xEE. Since the point 0 is a vertex of Q that 'P(e, a)=0. We thus obtain from (10.80) the relation 'P (e, x) = ). (e, y) =
A (e,
we have
a + y) = A (e, x)
showing that A is an eigenvalue of 'P. Hence we see that the bilinear In the same way we see that the function W has the eigenvalues restriction (/1' of'P' to the subspace T0 (Q') has the eigenvalues Ai.. Now it follows from (10.79) and the criterion in sec. 10.21 that there exists
an isometric mapping T0(Q') 21
Greub. Linear Algebra
Chapter X. Quadrics
322
with the property that y) = Define
(y)
y E T0
(Q).
the rotation t of E by )'ET0(Q)
t a = a'. Then
xeE and consequently, t transforms Q into Q'. 10.23. The metric normal-forms in the plane and in 3-space. Equations
(10.63) and (10.69) yield the following metric normal forms for the dimensions n =2 and n =3: Plane:
I. Quadrics with center: 1.
+
= 1,
2.
—
=1
a
ellipse with the axes a and b.
b
hyperbola with the axes a and b.
two parallel lines with the distance
= ±a
3.
2a.
II. Quadrics without center:
=
a
parabola with latus rectum of length a.
2
3-space:
I. Quadrics with center: 1.
2.
2
a
a2
+
+
+
—
—
b
—
= 1, a b c ellipsoid with axes a, b, c.
c
2=
1,
a
hyperboloid with one shell and axes
a,b,c. = 1, b
c
b
c
hyperboloid with two shells and axes
a,b,c.
§ 4. Quadrics in the Euclidean space
=
+
a2
b
2
1,
a
elliptic cylinder with the axes a and b.
b
hyperbolic cylinder with the axes a and b.
=1
two parallel planes with the distance
= ±a
6.
323
2a.
II. Quadrics without center: = 2 (,
—
a2 3.
b
a b
elliptic paraboloid with axes a and b.
hyperbolic paraboloid with axes a
2(
and b.
= 2(
a
parabolic cylinder with latus rectum of length a.
Problems 1. Give the center or vertex, the type and the axes of the following quadrics in the 3-space: (2
a) b) c)
1.
+
+ 7(2 — 1
—
—
= 9.
d)
2. Given a non-degenerate quadratic function qi, consider the family of quadrics defined by
Show that every point x +0 is contained in exactly one quadric Prove that the linear transformation 'p of E defined by
= ('px,y) with every point x +0 the normal vector of the quadric passing through x. 3. Consider the quadric associates
Q:P(x)= 1, 21*
Chapter X. Quadrics
324
where D is a non-degenerate bilinear function. Denote by Q' the image of Q under the mapping p which corresponds to çJi. Prove that the principal axes of Q' and Q are connected by the relation
(v=1...n). 4.
Given two points p, q and a number
consider the
locus Q of all points x such that Ix
Prove that Q
—
+ Ix — ql =
a quadric of index n whose principal axes have the length = (v = 2... n). a1! = — q12 is
5. Let cP (x) = be the equation of a non-degenerate quadric Q with the property that x is a normal vector at every point of Q. Prove that Q is a sphere. 1
Chapter XI
Unitary spaces Hermitian functions
§ 1.
11.1. Sesquilinear functions in a complex space. Let E be an n-dimensional complex linear space and ck: Ex E—*C be a function such that
+px2,y)=,Wi(x1,y)+#cli(x2,y) 2P(x,y1)+ j7P(x,y2)
cli(x,2y1 +
(11.1)
where 2 and 4ã are the complex conjugate coefficients. Then cP will be called a sesquilinear function. Replacing y by x we obtain from çji the corresponding quadratic function 11(x) = P(x,x).
It follows from (11.1) that
(11.2)
satisfies the relations
+ y) +
—
y) =
+
and
W(2x)
1212
The function çji can be expressed in terms of W. In fact, equation (11.2) yields
W(x + y) =
+ W(y) + cP(x,y) +
(11.4)
Replacing y by iy we obtain W (x + i y) =
W
(x) + ¶1' (y) — I cP (x, y) + I P (y, x).
Multiplying (11.5) by iand adding it to (11.4) we find
(x + y) + i
(x + iy) =
(1
+ i)(W (x) +
(y)) + 2P (x, y),
whence (x, y) = { W (x + y) —
W
(x) —
W
(y)} + I { W (x + iy) —
(x)
—
(y)} (11.6)
326
Chapter XI. Unitary spaces
Note; The fact that (P is uniquely determined by the function W is due to the sesquilinearity. We recall that a bilinear function has to be symmetric in order to be uniquely determined by the corresponding quadratic function. 11.2. Hermitian functions. With every sesquilinear function (P we can associate another sesquilinear function (P given by
= (P(y,x). A sesquilinear function (P is called Hermitian if (P= (P, i. e. (P (x, y) = (P (y, x).
(11.7)
Inserting y = x in (11.7) we find that
= V'(x).
(11.8)
Hence the quadratic function 'I' is real valued. Conversely, a sesquilinear function (P whose quadratic function is real valued is Hermitian. In fact, if is real valued, both parentheses in (11.6) are real. Interchange of x and y yields
2(P(y,x)={W(x+y)— W(x)— ¶t'(y)} + i{W(y+ ix)— W(x)— W(y)}. (11.9)
Comparison of (11.6) and (11.9) shows that the real parts coincide. The sum of the imaginary parts is equal to
W(x + iy) +
+ ix) — 2W(x)
—
2W(y).
Replacing y by iy in (11.3) we see that this is equal to zero, whence
(P(y,x) = (P(x,y). A Hermitian function (P is called positive definite, if ¶1' (x) >0 for all vec-
tors x+0. 11.3. Hermitian matrices. = 1 . ii) be a basis of E. Then every sesquilinear function (P defines a complex ii x n-matrix .
= (P (xv, xv).
The function (P is uniquely determined by the matrix and
In fact, if
§2. Unitary spaces
327
are two arbitrary vectors, we have that
The matrices relation
and
of cP and çli are obviously connected by the
=
If P is a Hermitian function it follows that = A complex n x n-matrix satisfying this relation is called a Hermitian matrix.
Problems 1. Prove that a skew-symmetric seq uilinear function is identically zero. 2. Show that the decomposition constructed in sec. 9.6 can be carried over to Hermitian functions. § 2.
Unitary spaces
11.4. Definition. A unitary space is an n-dimensional complex linear space E in which a positive definite Hermitian function, denoted by (,), is distinguished. The number (x, v) is called the Hermitian inner product of the vectors x and y. It has the following properties: 1. (Ax1 (x, Ay1 +PY2)
2. (x,y)=(y,x). In particular, (x, x) is real. 3. (x, x)>O for all vectors x+O. Example I: A Hermitian inner product in the complex number space C' is defined by (x. where
and
be the complex ificapie II: Let E be a Euclidean space and let tion of the vector space E (cf. sec. 2.16). Then a Hermitian inner product Exam
Chapter Xl. Unitary spaces
328
is defined in ('1 +
by
+ 'j2) =(x1. x)+(v1. 1))+
(x.
unitary space so obtained is called the coin plcxificutioii of the Euclidean space E. The norm of a vector x of a unitary space is defined as the positive square-root lxi = The Schwarz-inequality (11.10) l(x,y)l The
is proved in the same way as for real inner product spaces. Equality holds if and only if the vectors x and y are linearly dependent. From (11 .10) we obtain the triangle-inequality
lx+yi
+Jyl.
Equality holds if and only ify=Ax where ). is real and non-negative. In fact, assume that (11.11) lx+yi = fxl + lyl.
Squaring this equation we obtain
(x,v)+(x,y)=2lxHy(.
(11.12)
This can be written as
Re(x,v) = lxi yJ where Re denotes the real part. The above relation yields (x,y)I = lxl and
hence it implies that the vectors x and y are linearly dependent,
y—=Ax. Inserting this into (11.12) we obtain 2121
whence
Re2 =
121.
Hence, 2 is real and non-negative. Conversely, it is clear that
I(1+2)xI=lxH-21x1 for every real, non-negative number 2.
§ 2. Unitary spaces
329
Two vectors xeE and yeE are called orthogonal, if
(x,y)= 0. Every subspace E1 E determines an orthogonal complement consisting of all vectors which are orthogonal to E1. The spaces E1 and form a direct decomposition of E:
E= E1 A basis
(v = I ... n) of E is called orthonormal, if
=
(xv,
The inner product of two vectors and is
then given by
(x,y) = Replacing y by x we obtain x12
=
Orthogonal bases can be constructed in the same way as in a real inner
product space by the Schmidt-orthogonalization process. Consider two orthonormal bases and (v = 1.. .n). Then the matrix satisfies the relations of the basis-transformation
A complex matrix of this kind is called a unitary matrix. Conversely, if an orthonormal basis and a unitary matrix is given, the basis
= again orthonormal. 11.5. The conjugate space. To every complex vector space E we can assign a second complex vector space. E. in the following way: E coinis
cides with E as a real vector space. However, scalar multiplication in E, denoted by
z)—*2
.
z
is defined by
E is called the conjugate rector space. Clearly the identity map satisfies
—i
:eE.
Chapter XI. Unitary spaces
Now assume that (.) is a Hermitian inner product in E. Then a complex bilinear function. <>. is defined in E x E by
xeE. teE. Clearly this bilinear function is non-degenerate and so it makes E. E into a pair of dual complex spaces.
Now all the properties arising from duality can be carried over to unitary spaces. The Ries: theorem asserts that every linear function / in a unitary space can be uniquely represented in the form 1(x) = (x, a)
xeE.
In fact, consider the conjugate space E. In view of the duality between E and E there is a unique vector beE such that
f(x) =
<x, h>.
Now set a=K1(h). 11.6. INormed determinant functions. A nouined (leteulni000t Junction in a unitary space is a determinant function, zl, which satisfies x1eE.
4k1
(11.13)
We shall show that normed determinant functions always exist and and 47 are connected by that two normed determinant functions. the relation 47 where r is a complex number of magnitude I. In fact, let be any determinant function in E. Then a determinant in E is given by function.
\.*fl)_A(k._I\.*l
Since E and E
are
dual with respect to the scalar product <.> for-
mula (4.21) yields
4o(xi where
is a complex constant. Setting =
we
obtain x,)
Now set It follows that
where
(v= 1 ... n) 40(e1
is an orthonormal basis of E. =
y
§2. Unitary spaces and so
331
is real and positive. Let ). be any complex number satisfying and define a new determinant function, 4, by setting 4 I,.
Then (11.14) yields the relation
4
,...,x,,)
v,,) = det ((xi, vi))
4(v1
showing that zi is a normed determinant function. Finally, assume that z11 and are normed determinant functions in E. 1. Then formula (11.13) implies that 42=r41,
11.7. Real forms. Let E be an n-dimensional complex vector space and consider the underlying 2 n-dimensional real vector space E is a subspace, Fc: such that F is a real form of E, then every vector zeE can be uniquely written in the form z=x+iy. x. yeF.
Every complex space has real forms. In fact, let z1 ... z,7 be a basis of E and define F to be the subspace of consisting of the vectors X
E
=
Then clearly, F is a real form of E. A conjugation in a complex vector space is an satisfying of
iz =
involution,
—
Every real form F of E determines a conjugation given by x + ivF—*x —
iy
x.
yeF.
is a conjugation in E,then the vectors z which satisfy = z determine a real form F of E as is easily checked. The vectors of F are called real (with respect to the conjugation). Conversely, if
Problems 1. Prove that the Gram determinant
x1) ... (xv,
Chapter XI. Unitary spaces
332
of p vectors of a unitary space is real and non negative. Show that if and only if the vectors
are linearly dependent.
2. Let E be a unitary space. (i) Show that there are conjugations in E satisfying (ii) Given such a conjugation show that the Hermitian inner product of E defines a Euclidean inner product in the corresponding real subspace.
3. Let F be a real form of a complex vector space E and assume that a positive definite inner product is defined in F. Show that a Hermitian inner product is given in E by + i((x1.
:2) = ((x1. x2) + 2)
a unit quaternion / (cf. sec. 7.23) such that (I. e)=O.
Use / to make the space of quaternions into a 2-dimensional complex vector space. Show that this is not an algebra over C. 5. The complex cross product. Let E be a 3-dimensional unitary space and choose a normed determinant function /1. (i) Show that a skew symmetric bilinear map E x E E (over the reals) is determined by the equation
x, v. :eE. (ii) Prove the relations
(ix)xv= xx(iv)= —i(xxv)
(xxi',x)=(xxv,v)=O (x1 x x,.
X 12) = (yl. x1)(v2. x7) x
=
—
x1)(11. x2)
(x1. x2H2
6. Cavlev iiumhers. Let E be a 4-dimensional Euclidean space and E (cf. Example II, sec 11.4). Choose a and let E1 denote the orthogonal complement of e. unit vector Choose a normed determinant function 4 in E1 and let x be the corresponding cross product (cf. problem 5). Consider as a real 8-dimenlet
§2. Unitary spaces
333
sional vector space F and define a bilinear map x, y x y = —(x, y)e +
and
xx v
x i by setting
x. tEE1 tEC. VEE1 xeE1
x(ie)=/x
Show that this bilinear map makes F into a (non-associative) division algebra over the reals. Verify the formulae x =(xv)y and x2 y=x(xy). 7. Symp/ectic spaces. Let E be an rn-dimensional vector space over a field F. A symplectic inner product in E is a non-degenerate skew symmetric bilinear function <.> in E. (i) Show that a symplectic inner product exists in E if and only if in is even, rn=2n. (ii) A basis a1 a,,. b1 h, of a symplectic vector space is called syinpiectic. if K
a1>
=0
=
Kb,.
0
Show that every symplectic space has a symplectic basis. (iii) A linear transformation p: E—÷E is called symplectic. if it satisfies
x. yeE.
If (P and W are symplectic inner products in E, show that there is a linear automorphism of E such that x. yeE. 8. Symplectic ad joint transJorrnations. i) If p: E —* E is a linear transformation of a symplectic space the syrnp/ectic adjoint transformation. is defined by the equation <(PX. V> = <x. 10Y>
—(p) then (respectively (respectively skew syniplectic).
X.
VEE.
is called svinp/ectic se//ac/joint
(i) Show that = p. Conclude that every linear transformation ço of E where is symplectic can be uniquely written in the form selfadjoint and p2 is skew symplectic. (ii) Show that the matrix of a skew symplectic transformation with respect to a symplectic basis has the form (A B
Chapter XI. Unitary spaces
334
where A. B. C, D are square (n x ii)-mat rices satisfying
D=_A*.
C*=C.
B*=B.
Conclude that the dimensions of the spaces of symplectic selfadjoint (respectively skew symplectic) transformations are given by = ii(2ii
= ii(2,i
— 1)
—1.— 1).
(iii) Show that the symplectic adjoint of a linear transformation p of a 2-dimensional space is given by =
i
trço —
çø.
(iv) Let E be an n-dimensional unitary space with underlying real vector space Show that a positive definite symmetric inner product and a symplectic inner product is defined in by (x. j') = Re (x. j') and
x.
<x, i'> = Im (x, v)
Establish the relations <x, y> = — (ix.
and
v)
= (x.v).
Conclude that the transformation symplectic.
§
is both symplectic and skew
3. Linear mappings of unitary spaces
11.8. The adjoint mapping. Let E and F be two unitary spaces and (p: E-÷ F a linear mapping of E into F. As in the real case we can associate
of F into E. A conjugation in a unitary with ço an adjoint mapping of the underlying real vector space space is a linear automorphism
such that ix=
and
fl=(x,v).
If
is a conjugation in E,
then a non-degenerate complex bilinear function is defined by <x,y>= (x, and be conjugations in E and in F, respecx, yEE. Let tively. Then E and F are dual to themselves and hence 'p determines a dual mapping 'p*: by the relation
(11.16)
§ 3. Linear
mappings of unitary spaces
335
Replacing y by 9 in (11.16) we obtain (11.17)
Observing the relation between inner product aid scalar product we can rewrite (11.17) in the form
(çox,y) =(x,co*9). Now define the mapping
E+-F by (11.19)
Then relation (11. 18) reads
xeE, yEF. The mapping assume that (11.20). Then
(11.20)
does not depend on the conjugations in E and F. In fact, and are two linear mappings of F into E satisfying
0.
—
This equation holds for every fixed yeF and all vectors xeE and hence it implies that
=
The mapping
is called the adjoint of the mapping
It follows from equation (11 .18) that the relations
and hold for any two linear mappings and for every complex coefficient 2. Equation (11.20) implies that the matrices of 'p and relative to two orthonormal bases of E and F are connected by the relation
Now consider the case E= F. Then the determinants of 'p and are complex conjugates. To prove this, let A +0 be a determinant function in E and A be the conjugate determinant function. Then it follows from the definition of A that = det 'p A
...
) = det 'p
This equation implies that deUp = det'p.
A (x1
...
___________
336
Chapter XI. Unitary spaces
If (p is replaced by p —Ai, where 2 (11.21) yields det
—2
is
a complex parameter, relation
i) = det ((p — 2 i).
Expanding both sides with respect to 2 we obtain ;n—V
=
This equation shows that corresponding coefficients in the characteristic polynomials of (p and are complex conjugates. In particular,
= trço.
11.9. The inner product in the space L(E; E). Consider the space L (E; E). An inner product can be introduced in this space by
=
(11.22)
It follows immediately from (Il .22) that the function (p, i/i) is sesquilinear. Interchange of and yields
To prove that the Hermitian function (11.22) is positive definite let (v= in) be an orthonormal basis. Then and where
(11.23)
Equations (11.23) yield (p çoe%, =
whence
=
=
=
This formula shows that ((p, p)> 0 for every transformation (p + 0. 11.10. Normal mappings. A linear transformation (p : E—+ E is called normal, if the mappings (p and commute, (11.24)
In the same way as for a real inner product (cf. sec. 8.5) it is shown
§ 3. Linear mappings of unitary spaces
337
that the condition (11.24) is equivalent to
xeE.
It follows from (11.25) that the kernels of (p,
(11.25)
coincide, ker p = ker
and
We thus obtain the direct decomposition
E = ker (p
(11.26)
Tm (p.
The relation ker (p = ker implies that the mappings (p and have the same eigenvectors and that the corresponding eigenvalues are complex conjugates. In fact, assume that e is an eigenvector of (p and that is the corresponding eigenvalue, çoe = 2e. Then e is contained in the kernel of (p —2i. Since the mapping (p —2i is
again normal, e must also be contained in the kernel of —L, i. e.
qe—)e. In sec. 8.7 we have seen that a selfadjoint linear transformation of a real inner product space of dimension n always possesses n eigenvectors which are mutually orthogonal. Now it will be shown that in a complex space the same assertion holds even for normal mapping. Consider the characteristic polynomial of (p. According to the fundamental theorem Then of algebra this polynomial must have a zero is an eigenvalue of Let e1 be a corresponding eigenvector and E1 the orthogonal complement of e1. The space E1 is stable under (p. In fact, let y be an arbitrary vector of E1. Then ((py,e1) = (y,
=
= 2(y,e1) =
0
and hence is contained in E1. The induced mapping is obviously again normal and hence there exists an eigenvector e2 in E1. Continuing this way we finally obtain n eigenvectors (v = 1. n) which are mutually orthogonal, . .
(eV,e,L)=O
(v+ii).
If these vectors are normed to length one, they form an orthonormal basis of E. Relative to this basis the matrix of has diagonal form with the eigenvalues in the main-diagonal, (pep = 22
Greub. Linear Algebra
(v = 1... n).
(11.27)
Chapter XI. Unitary spaces
11.11. Selfadjoint and skew mappings. Let ço be a selfadjoint linear
transformation of E; i.e., a mapping such that
= (p. Then relation
(11.20) yields
(çox,y)=(x,coy)
X,yEE.
Replacing y by x we obtain
((px,x) = (x,px)
=(qx,x)
showing that ((px, x) is real for every vector xEE. This implies that all eigenvalues of a selfadjoint transformation are real. In fact, let e be an eigenvector and )L be the corresponding eigenvalue. Then çoe=Ae, whence
(pe,e) = 2(e,e). Since (çoe, e) and (e, e) + 0 are real, ). must be real.
Every selfadjoint mapping is obviously normal and hence there exists a system of n orthonormal eigenvectors. Relative to this system the matrix of p has the form (11.27) where all numbers are real.
The matrix of a selfadjoint mapping relative to an orthonormal basis is Hermitian. A linear transformation p of E is called skew if p= —(p. In a unitary
space there is no essential difference between selfadjoint and skew mappings. In fact, the relation iço
shows that multiplication by i associates with every selfadjoint mapping a skew mapping and conversely. 11.12. Unitary mappings. A unitary mapping is a linear transformation of E which preserves the inner product,
(çox,çoy)=(x,y)
x,yeE.
(11.28)
Relation (11.28) implies that
xEE showing that every unitary mapping is regular and hence it is an automorphism of E. If equation (11 .28) is written in the form
((px,y) =(x,(p1 y) it shows that the inverse mapping of (p is equal to the adjoint mapping, (11.29)
§ 3. Linear mappings of unitary spaces
339
Passing over to the determinants we obtain 1
whence
= 1. Every eigenvalue of a unitary mapping has norm 1. In fact, the equation çoe = 2e
yields = 121 el
whence 121=1.
Equation (11.29) shows that a unitary map is normal. Hence, there exists an orthonormal basis l...n) such that
= where the
are
(v = 1... n)
complex numbers of absolute value one.
Problems 1. Given a linear transformation p: E-÷E show that the bilinear function ck defined by
P(x,y)=(px,y)
is sesquilinear. Conversely, prove that every sesquilinear function can be obtained in this way. Prove that the adjoint transformation determines the Hermitian conjugate function.
2. Show that the set of selfadjoint transformations is a real vector space of dimension n2. 3. Let p be a linear transformation of a complex vector space E. a) Prove that a positive definite inner product can be introduced in E
such that p becomes a normal mapping if and only if ço has n linearly independent eigenvectors. b) Prove that a positive definite inner product can be introduced such
that p is i) selfadjoint ii) skew iii) unitary
if and only if in addition the following conditions are fulfilled in corresponding order: i) all eigenvalues of çø are real ii) all eigenvalues of 'p are imaginary or zero iii) all eigenvalues have absolute value 1. 22*
Chapter XI. Unitary spaces
4. Denote by S(E) the space of all selfadjoint mappings and by A (E) the space of all skew mappings of the unitary space E. Prove that a multiplication is defined in S(E) and A (E) by and
=
—
respectively and that these spaces become Lie algebras under the above multiplications. Show that S(E) and A(E) are real forms of the complex space L(E; E) (cf. problem 8, § 1, chap. I and sec. 11.7). §
4•* Unitary mappings of the complex plane
11.13. Definition. In this paragraph we will study the unitary mappings of a 2-dimensional unitary space C in further detail. Let -r be a unitary mapping of C. Employing an orthonormal basis e1, e-, we can represent the mapping i in the form (11.30) ie1 =c,e1 ±f3e2 're2 = c(— f3e1 + e
are complex numbers subject to the conditions =
1cx12 + 11312
and
= 1.
These equations show that
dett =
c.
We are particularly interested in the unitary mappings with the determinant + 1. For every such mapping equations (11.30) reduce to
te1=cxe1+$e2
2 cx
+
—
This implies that
-r'e1
—/3e2
T'e2=13e1 +cxe2. Adding the above relations in the corresponding order we find that (i +
= (cx +
t+
=
t' = itri.
(11.31)
§ 4. Unitary mappings of the complex plane
341
Formula (11.31) implies that
(z,rz) + (z,x' z) = zl2trt for every vector zeC. Observing that
(z,'r' z)
(tz,z)
(z,tz)
thus obtain the relation
we
2Re(z,rz)=Izl2trx
zEC
(11.32)
showing that if det t = 1. then the real part of the inner product (:.-r:) depends only on the norm of:. (11.32) is the complex analogue of the relation (8.42) for a proper rotation of the real plane. We finally note that the set of all unitary mappings with the determinant + 1 forms a subgroup of the group of all unitary mappings. 11.14. The algebra Q. Let Q denote the set of complex linear transformations of the complex plane C which satisfy
q3+ço=ltrço. It
(11.33)
is easy to see that these transformations form a real 4-dimensional
vector space, Q. Since (11.33) is equivalent to ç75 = ad p (cf. sec. 4.6 and problem 8, § 2, chapter IV) it follows that Q is an algebra under composition. Moreover, we have the relation pEQ.
(11.34)
Define a positive definite inner product in Q by setting (cf. sec. 11.9)
Then we have
(p.p)=detp.
çoeQ
(11.35)
and (11.36)
Now we shall establish an isomorphism between the algebra Q and the algebra I-fl of quaternions defined in sec. 7.23. Consider the subspace Q1 of Q consisting of those elements which satisfy or equivalently (p. i)=O. Then Q1 is a 3-dimensional Euclidean subspace of Q.
Chapter XI. Unitary spaces
342
and thus (11.34) yields
For q7eQ1 we have, in view of(11.33).
=—detq 1=— (p,p)i
eQ1.
This relation implies that =
+
'
— ((p,
(11.37)
Next, observe that for çoEQ1 and /JEQ1
((pop whence
°
—
i)=
—
Thus a bilinear map. x, is defined in Q1 by
°
(p x
(11.38)
It satisfies ((p
and
x
(q x
(11.39)
Moreover, equations (11.37) and (11.38) yield (11.40)
It follows that (11.41)
Finally, let /1 be the trilinear function in Q1 given by L1((p,
In view of (11.40) we can write
= (p x
L1((p,
(11.42)
This relation together with (11.39) implies that I is skew symmetric and hence a determinant function in Q1. Moreover, setting x = (p x yields, in view of (11.41), X
.
Thus A is a uoi';ned determinant function in the Euclidean space Q1. Hence it follows from (11.42) that the bilinear function x is the cross product in Q1 if Q1 is given the orientation determined by I (cf. sec. 7.16). Now equation (11.40) shows that Q is isomorphic to the algebra 1-fl of quaternions (cf. sec. 7.23).
§ 4. Unitary mappings of the complex plane
343
Remark: In view of equation (11.35), Q is an associative division algebra of dimension 4 over 11. Therefore it must be isomorphic to
I-fl
(cf. sec. 7.26). The above argument makes this isomorphism explicit. 11.15. The multiplication in C. Select a fixed unit-vector a in C. Then
to every vector :EC there exists a unique mapping
such that
This mapping is determined by the equations
a = a + f3 b = — fla + b
is a unit-vector orthogonal to a and
z = ca + fib. The correspondence
obviously satisfies the relation
+
=
for any two real numbers 2 and p. Hence, it defines a linear mapping of
C onto the linear space Q, if C is considered as a 4-dimensional real linear space. This suggests defining a multiplication among the vectors zEC in the following way: Z1Z2 = (11.43) Then (Pa1 and z1z2eC. (11.44) = In fact, the two mappings
Z2
and
°
are both contained in Q and
a into the same vector. Relation (11.44) shows that the correspondence preserves products. Consequently, the space C besend
comes a (real) division-algebra under the multiplication (11.43) and this algebra is isomorphic to the algebra of quaternions. Equation (11.31) implies that = 2(p, i)a z+ (11.45) for every unit-vector z. In fact, if z is a unit-vector then is a unitary mapping with determinant 1 and thus (11.31) and (11.36) yield z
+
= çoa +
=
+
a=
=
Finally, it will be shown that the inner products in C and Q are connected by the relation Re (z1, z2) =
(p.,
p:2).
(11.46)
Chapter XI. Unitary spaces
To prove this we may again assume that and :2 are unit-vectors. Then and are unitary mappings and we can write
pa,a) = ((p,-!a,a).
=
(:1,:2) =
(11.47)
is also unitary formula (11.32) yields
Since
a, a) = =
tr (p2_i
—1
= -3
= -3
(11.48)
=
Relations (11.47) and (11.48) imply (11.46).
Problems 1. Assume that an orthonormal basis is chosen in C. Prove that the transformations which correspond to the matrices (1
(i
O\ 1)
( 0-i
(O-l\
O\
o)
0
form an orthonormal basis of Q. 2. Show that a transformation çOEQ is skew if and only if tr p=O. 3. Prove that a transformation çOEQ satisfies the equation
=
—
if and only if
detp=1
and
trço=O.
4. Verify the formula (z1 z, z2 z) = (z1, z2)
z e C.
5. Let E be the underlying oriented 4-dimensional vector space of the complex plane C and define a positive definite inner product in E by
(x, J')E = Re(x, v).
Let a. h be an orthonormal basis of C such that a is the unit element for the multiplication defined in sec. 11.15. Show that the vectors = a.
e1 = ia_
c7 =
h,
c3 = ib
form an orthonormal basis of E and that, if 4 is the normed positive determinant function in E, = — I.
§ 5. Application to Lorentz-transformation
345
Let E1 denote the orthogonal complement of e in E (cf. problem 5). t/i be a skew Hermitian transformation of C with tn/i=0. a) Show that there is a unique vector peE1 such that x. xeE. 6.
Let
b)If / is the matrix of with respect to the orthonormal basis a. b in problem 5, show that p= + fi e2 + ; e3. §
5•* Application to Lorentz-transformations
11.16. Selfadjoint linear transformations of the complex plane. Consider the set S of all selfadjoint mappings a of the complex plane C. S is a real 4-dimensional linear space. In this space introduce a real inner product by (11.69)
This inner product is indefinite and has index 3. To prove this we note first that — (trcr)2) = — deta = (11.70) and
=—4tra.
(11.71)
Now select an orthonormal basis z1,z2 of C and consider the transformations (j = 1,2,3) which correspond to the Pauli-matrices
/1
0\ —i)
i\
/0
/0
o)
—i 0
Then it follows from (11.69) that
=
and
0,
= — 1.
These equations show that the mappings i, a1, a2, a3 form an orthonormal basis of S with respect to the inner product (11.69) and that this inner product has index 3. Thus S becomes a Minkowski space (cf. sec. 9.7).
The orthogonal complement of the identity-map consists of all selfadjoint transformations with the trace zero.
Chapter XI. Unitary spaces
346
11.17. The relation between the spaces Q and S. Recall from sec. 11 .14 the definition of the 4-dimensional Euclidean space Q. Introduce a new inner product in Q by setting
We shall define a linear isomorphism Q:
such that
In fact, set
Q(p=
itrgo+igo
goeQ.
Then
=j
—
Since
.
trgo
— i(go
and so it follows that
we have
- Q go =0; i.e., Qgo is selfadjoint. The map W:
W(oi=-
given by
l+i
oeS
is easily shown to be inverse to Q and thus Q is a linear isomorphism. Finally, since = (I —
i)
-i tr(p•
+
1—
tr(p + go tn/i) — (po /1
it follows that
tr
= trgo . tn/i — tr(goo
whence — trQgo
=
go
=
tn/i —
trgo
= (go,
Note as well that, by definition,
Now consider an arbitrary linear trans11.18. The transformations formation of the complex plane C such that det = 1. Then a linear transformation T2: S—÷S is defined by TES.
(11.79)
§ 5.
Application to Lorentz-transformations
347
In fact, the equation
=
=
=
is again selfadjoint. The transformation shows that the mapping preserves the inner product (11.69) and hence it is a Lorentz-transformation (cf. sec. 9.7):
=
= — detaldetcxl2 = — detcr
Every Lorentz-transformation obtained in this way is proper. To prove this let 1) be a continuous family of linear transformations of C such that cx(O) = z = and detcx(t) = 1 (0 t 1).
it follows from the result of sec. 4.36 that such a family exists. The continuous function
det T2 (t)
t 1)
(0
is equal to ± 1 for every t. In particular det
= det i; = 1.
This implies that =1
(0 t 1)
whence
= 1. The transformations that
are orthochroneous. To prove this, observe i=
whence
=
=—
<0.
This relation shows that the time-like vectors i and are contained in the same cone (cf. sec. (9.22)). 11.19. In this way every transformation with determinant 1 defines a Obviously, proper Lorentz-transformation (11.80)
Now it will be shown that two transformations and if f3_— In view of(l 1.80) it is sufficient to prove that operator only if = ± i. If is the identity, then
forevery
creS.
coincide only is the identity (11.81)
Chapter XI. Unitary spaces
348
Inserting a = we find that cx o = i whence cx = implies that 1
1 Now equation (11.81)
cxa=a0x forevery aES.
(11.82)
To show that cx= ± i select an arbitrary unit-vector eEC and define a selfadjoint mapping a by a:=(z,e)e ZEC. Then
(aocx)e =(cxe,e)e and
(cxoa)e =
cxe.
Employing (11.82) we find that cx e =
(cx
e, e) e.
In other words, every vector cxz is a multiple of z. Now it follows from the linearity that cx = 2i where )L is a complex constant. Observing that det cx = we finally see that A = ± 1. 1
11.20. In this section it will be shown conversely that every proper orthochroneous Lorentz-transformation T can be represented in the
form (11.17). Consider first the case that i is invariant under T,
Ti = i. Employing the isomorphism Q:Q—*S (cf. sec. 11.17) we introduce the transformation
T'=Q'0T0Q
(11.83)
of Q. Obviously,
=
<(p,l/J>
(11.84)
i = i.
(11.85)
and
Besides the inner product of sec. 11.17 we have in Q the positive inner product defined in sec. 11.14. Comparing these two inner products we see that (q,,
+ tr
=
tr
=
-2
i>.
(11.86)
Now formulae (11.84), (11.86) and (11.85) yield
(T'ço,T'l./J)=(ço,l/J)
p,l/JEQ
showing that T' is also an isometry with respect to the positive definite inner product. Hence, by Proposition I, sec. 8.24, there exists a unitvector /3EQ such that
T'p=fopo/31.
(11.87)
§ 5. Application
to Lorentz-transformations
349
Using formulae (11.83), (11.87) and (11.74) we thus obtain
Tci =(Q0T'0Q1)a = Since
$0a0f31.
=f3 (cf. sec. 11.34). This equation can be written in the form
T ci = fi
ci
=
fi
ci.
Finally,
detfi =(/3,/3)= 1. Now consider the case Ti
i. Then, since T is a proper orthochroneous
Lorentz-transformation, Ti and i are linearly independent. Consider the plane F generated by the vectors i and Ti. Let w be a vector of F such that
(11.88)
trw=0 and detw=—l. Therefore WoW
1.
By hypothesis, T preserves fore-cone and past-cone. Hence Ti can be written as (cf. sec. 9.26)
Ti = i cosh 0 + wsinh0.
(11.91)
Let cx be the selfadjoint transformation defined by
=
i cosh
0
+ w sinh —.
(11.92)
Then
Ti=
0
= i cosh2 + —
2 w cosh
2
=
i cosh 0
0
0
2
2
- sinh - +
0
w w sinh2 — 2
(11.93)
+ w sinh 0.
Comparing (11.91) and (11.93) we see that
Ti = Tai. This equation shows that the transformation T' ° T leaves vector i invariant. As it has been shown already there exists a linear transformation f3e Q of determinant 1 such that T=
Chapter XI. Unitary spaces
Hence,
T= It remains to be proved that cz has determinant 1. But this follows from (11.70), (11.92) and (11.88): det
= —
=
0
— <1,
0
i> cosh2 — <w, w> sinh2 - = 1.
Problems 1) Let be the linear transformation of a complex plane defined by the matrix
(
1
21 3
Find the real 4 x 4 matrix which corresponds to the Lorentz-transformation with respect to the basis i, a3 (cf. sec. 11.16). 2) A fractional linear transformation of the complex plane formation of the form
T(:) =
c:+d
ad— hc
=
is a trans-
1
where a. b. c. d arc complex numbers. (i) Show that the fractional linear transformations form a group.
(ii) Show that this group is isomorphic to the group of proper orthochroneous Lorentz transformations.
Chapter XII
Polynomial Algebras in this chapter F denotes a fIeld of characteristic zero. § 1. Basic properties In this paragraph we shall define the polynomial algebra over F and establish its elementary properties. Some of the work done here is
simply a specialization of the more general results of Chapter VII. Volume II. and is included here so that the results of the following chapter will be accessible to the reader who has not seen the volume on multilinear algebra. 12.1. Polynomials over F. A polynomial over F is an infinite sequence
are different from zero. The sum and such that only finitely many the product of two polynomials =
...)
and
g=
...)
are defined by
jg =
...)
where i-t-j=k
These operations make the set of all polynomials over F into a commutative associative algebra. It is called the polynomial algebra over F and is denoted by ['[t]. The unit element, 1, of F[t] is the sequence (1.0...). It is easy to check that the map i: F—+F[t] given by
is an injective homomorphism of algebras.
Thus we may identify F with its image under 1. The polynomial (0, 1, 0 ...) will be denoted by
t=(0.l,0...).
Chapter XII. Polynomial algebra
It follows that
tk=(0. (11.0...)
k =0.1....
Thus every polynomial I can be written in the form (12.1)
where only finitely many are different from zero. Since the elements (k=0. 1. ...) are linearly independent, the representation (12.1) is unique. Thus these elements form a basis of the vector space T[t]. Now let .1
0
=
be a non-zero polynomial. Then ; is called the /eadi;ig ('oeff!cwIlt of f The leading coefficient of is the product of the leading (ff0, coefficients off and g. This implies that fg 0 whenever / 0 and g 0. A polynomial with leading coefficient is called monk. On the other hand, for every polynomial the element I
is called the sea/a, term. It is easily checked that the map is a surjective homomorphism. given by Consider a non-zero polynomial k=O
The number ii is called the degree of
I.
If g is a second non zero
polynomial, then clearly
deg(f + g) max(degf,degg) and
deg(fg) = degf + degg.
(12.2)
A polynomial of the form ;t' (; +0) is called a monomial of degree ii. be the space of the monomials of degree ii together with the Let zero element. Then clearly
F[t] and by assigning the degree a to the elements of
we make r[t]
into a graded algebra as follows from (12.2). The homogeneous elements
§ 1. Basic properties
353
of degree n with respect to this gradation are precisely the monomials of degree a.
However, the structure of F[t] as a graded algebra does not play a role in the subsequent theory. Consequently we shall consider simply its structure as an algebra. 12.2. The homomorphism Let A be any associative algebra with unit element e and choose a fixed element aEA. Then the map 1
e, t
a
can be extended in a unique way to an algebra homomorphism
The uniqueness follows immediately from the fact that the elements 1 and t generate the algebra F[t]. To prove existence, simply set
=
a".
It follows easily that k is an algebra homomorphism. The image of T[t] under J) will be denoted by T(a). It is clearly the subalgebra of A generated by e and a. Its elements are called polynomials in a, and are denoted by J(a). The homomorphism cP:f—÷f(a) induces a monomorphism cP:
In particular, if A is generated by e and a then cP (and hence P) are surjective and so P is an isomorphism
P: F[t]/kerP4A in this case.
Example: Let AeT[t] and let
be a polynomial. Then we
have
f(g) =
ti'.
In particular. the scalar term of f(g) is given by =
=J(/30).
Thus we can write
where g: 23
Greub. Linear Algebra
is the homomorphism defined in sec. 12.1.
Chapter XII. Polynomial algebra
12.3. Differentiation. Consider the linear mapping
d:E[t]—*F[t] defined by
dt"—pt"' dl =0. Then we have for p,q1 = = (p + = + t"q = + =
d
d
+
d
(12.3)
It is easily checked that (12.3) continues to hold for p=O or q=0. Since the polynomials form a basis of T[t] (ef. sec. 12.1) it follows from (12.3) that the mapping d is a derivation in the algebra F[t]. d is called the d(fferentiation map in F[t], and is the unique derivation which maps t into 1. It follows from the definition of d that d lowers the degree of a polynomial by 1. In particular we have ker d = F
and so the relations
df = dg and
are equivalent. The polynomial df will be denoted byf'. The chain rule states that for any two polynomials f and g,
(f(g))'=f'(g)g'. For the proof we comment first that
dgk=kgk_ldg dg° =
(12.4)
0
which follows easily from an induction argument. Now let
I= Then
I (g) =
gk
§ 1. Basic properties
355
and hence formula (12.4) yields
(f(g))' =
1) is called the r-th derivative of f and is The polynomial d7 We extend the notation to the case r=O by usually denoted by if and only setting f(0)=f It follows from the definition that if r exceeds the degree of j 12.4. Taylor's formula. Let A be an associative commutative algebra over F with unit element e. Recall from sec. 12.2 that if jET[t] and aeA then an element [(a)eA is determined by J(a) = Tailor's fhrmula states that for aEA, beA fl
j(a+h)= >J——--,-—-h". p=o
(12.5)
P•
Since the relation (12.5) is linear in j it is sufficient to consider the case f= t". Then we have by the binomial formula f(a + h) =
(a
=
+
p==o
P
...
(a
p
+ 1)
...
(a — p
+ 1)
On the other hand. = n(n
— 1)
—
whence (a)
= a (a —
1)
and so we obtain (12.5).
Problems defined by 1. Consider the mapping F[t] x a) Show that this mapping does not make the space F[t] into an algebra.
b) Show that the mapping is associative and has a left and right identity.
Chapter XII. Polynomial algebra
c) Show that the mapping is not commutative. d) Prove that the mapping obeys the left cancellation law but not the right cancellation law; i.e., (g) =f2(g) implies that f1 =f2 but f(g1) f(g2) does not imply that g1 =g2. 2. Construct a linear mapping
r[t] such that
do$=i. Prove that if such that
and
$2 are
two such mappings, then there is a fixed scalar,
-$2)f In particular, prove that there is a unique homogeneous linear mapping $
of the graded space F [t] into itself and calculate its degree. $ integration operator. 3. Consider the homomorphism Q:F[t]—J' defined by
is
called the
=
Q
Show that
=f(O). Prove that if $ is the integration operator in F[z'], then
=i Use
—
this relation to obtain the formula for integration by parts: $1 g'
4. What is the Poincaré series for the graded space F[t]? 5. Show that if ThF[t]—+F[t] is a non-trivial homogeneous antiderivation in the graded algebra F[t] with respect to the canonical involution, and then
pl.
6. Calculate Taylor's expansion
f (g + h) = I (g) + hf'(g) +.. for the following cases and so verify that it holds in each case:
a) f=t2—t+l,
g=t3+2t,
h=t—5
b)f=t2+l, g=t3+t—1, h=—t+l c)f=3t2+2t+5, g=l, h=—l d)f=t3—t2+t—l,
g=t,
h=t2—t+1
§ 2. Ideals and divisibility
357
7. For the polynomials in problem 6 verify the chain rule
[1(g)]' explicitly. Express the polynomialf(g(h))' in terms of the derivatives off, g and h and calculate [f(g(h))]' explicitly for the polynomials of problem 6.
§ 2.
Ideals and divisibility
12.5. Ideals in F Iti. In this section it will be shown that every ideal in
the algebra F[t] is a principal ideal (cf. sec. 5.3). We first prove the following
Lemma I: (Euclid algorithm): Let [+0 and g+0 be two polynomials. Then there exist polynomials q and r such that
f = gq + and deg r<deg g or r=0. Proof: Let degf= n and deg g = in. If in > n we write
f =gO+f and the lemma is proved. Now consider the case Without loss of generality we may assume thatf and g are monic polynomials. Then we have
f=
+
degf1
(12.6)
1ff1 + 0 assume (by induction on n) that the lemma holds forf1. Then
f1=gq1+r1
(12.7)
where deg r1 <deg g or r1 =0. Combining (12.6) and (12.7) we obtain
f = (tn_rn + q1)g + r1 and so the lemma follows by induction.
Proposition 1: Every ideal in F[t] is principal. Proof: Let I be the given ideal. We may assume that 1+0. Let h be a monic polynomial of minimum degree in I. It will be shown that I 4 (cf. sec. 5.3). Clearly, 4 c I. Conversely, let fel be an arbitrary polynomial. Then by the lemma, f= hq+ where (12.8) degr<degh or r=0.
Chapter XII. Polynomial algebra
Sincefe I and he I we have
r= and hence if
f
— h
q el
deg rdeg h. Now (12.8) implies that r=O; i.e.f=hq
and sofe The monic polynomial h is uniquely determined by I. In fact. assume Then there are monic that k is a second polynomial such that polynomials g1 and g, such that k=g1li and h=g2k. It follows that
k =g1g,k whence g1g2 = I. Since g1 and g2 are monic we obtain g1 =g2 = whence
k=li.
12.6. Ideals and divisors in F[t]. Let [and g be non-zero polynomials. We say that g dirides f or / is a multiple of g if there is a polynomial Ii such that
/=
g Ii.
In this case we write g/f Clearly. f is a multiple of g if and only if it is contained in the ideal, 1g. generated by g. If li/g and g/f then li/f Two monic polynomials which divide each other are equal. Next. letf and g be monic polynomials and consider the ideal I+Ig. In view of Proposition I. sec. 12.5. there is a unique monic polynomial. [v g. such that 'fvg = 1f + 1g. It is called the greatest coinnion dicisor of I and g. Since
g
and
[vg is indeed a common divisor of •[ and g. On the other hand, if li/f and li/g, then
whence
and 'g
g
and so li/f v g.
This shows that every common divisor of I and g is a divisor of I v g. A polynomial f whose only divisors are scalars and scalar multiples of I is called irreducible ar prime. In a similar way the greatest common divisor of a finite number of monic polynomials (i= I ... r) is defined. It is denoted by f v and is characterized by the relation 111 v
v
=
+
+ 1j;.
(12.9)
polynomials f are called relatirel prime. 12.7. Again let I and g be monic polynomials and consider the ideal n 'g. By Proposition I, sec. 12.5. there is a unique monic polynomial, .1 A g. such that the
'fAg
= 'f
§ 2. Ideals and divisibility
359
It is called the least common 17111/tip/c of f and g. It follows from the definition that [Ag is indeed a common multiple off and g and every common multiple off and g is a multiple of [Ag. If 1 (1= 1 ... r) are monic polynomials their least common multiple. A Jr. is defined by the equation ji A A
=
A
(12.10)
Proposition II: 1ff is the greatest common divisor of the f (i= 1 ... then there are polynomials such that
r)
(12.11)
Proof: It follows from (12.9) that every element
can be written as
= Since tel1. [ must be of the form (12.11). Corollary: If the polynomials nomials (1= 1.. .r) such that
are
relatively prime there exist poly(12.12)
Conversely if there exists a relation of the form (12.12) then the J
are
relatively prime.
Proof: The first part follows immediately from the proposition. Now assume that there is a relation of the form (12.12). Then every common divisor of theJ divides 1 and hence is a scalar.
Proposition III. Let f be the greatest common divisor of the monic polynomials
12 and write
= f/i1
(i = 1.2).
Then the polynomials /1-, are relatively prime and the least common multiple of the polynomials /12. f Proof: In view of Proposition II we can write .
f=>fg1.
(12.13)
Chapter XII. Polynomial algebra
It follows that .1 =
g1
whence 1.
This shows that and /12 are relatively prime. Clearly, the polynomial I 112 is a common multiple of .11 and .12. Now let g be any common multiple of f and and write
g=jp1.
g=J2p7.
Then (12.13) yields
=.1 1i2(g1 P2 + g2 Pi).
This implies that
Pi = h2(g1 P2 + g2 Pi) whence
g = ./1
Pi
.1
112 (g1 P2 + g2 Pi);
i.e., g is a multiple off. /11 112.
Corollan.: If the monic polynomials f (i= I ... r) are relatively prime, then
a
.
A
A fr = /1 ... fr
Proof: If r=2 this follows immediately from the proposition. If r>2 simple induction argument is required.
12.8. The lattice of ideals in F[t]. Let f denote the set of all ideals in r[t]. Recall from sec. 5.9 that .1 is a lattice with respect to the partial order given by inclusion. On the other hand, let 5) denote the set of all monic polynomials together with the zero polynomial. Define a partial order in 5) in by setting I g if g/f (.1 +0, g +
g0
for every
Then, in view of sec. 12.6 and 12.7, 5) becomes a lattice as well. Now let Ii.. Then .f g implies that P: 5) .1 be the map given by b: and so cP is a lattice homomorphism. Moreover, since b is bijective, it is a lattice isomorphism (cf. Proposition I, sec. 12.5).
§2. Ideals and divisibility
361
12.9. The decomposition of a polynomial into prime factors.
Theorem 1: Every monic polynomial can be written (12.14) f where the J are distinct irreducible monic polynomials and degf 1. The decomposition is unique up to the ordering of the prime factors. Proof: The existence of the decomposition (12.14) is proved by induction on the degree off. if degf=O thenf= 1 and the decomposition is trivial. Suppose that the decomposition (12.14) exists for polynomials of degree
f=gh deg h<degf it follows by induction that
and —
whence
—
t...
ii
t
1••
Collecting the powers of the same prime polynomials we obtain the decomposition (12.14).
The uniqueness part follows (with the aid of a similar induction argument) from
Lemma II: Let g, h be monic polynomials and assume that h is irreducible. Let m be a positive integer. Then htm divides fg if and only such that if there is an integer p (1
h"/f and
htm - 17g.
Proof The "if" part of the statement is trivial. Now suppose that Let pO be the largest integer such that /v!)/f. If p=m there is nothing to prove. If p<m, write Then
fg = h" 11 g.
On the other hand. by hypothesis, for some polynomial k.
Chapter XII. Polynomial algebra
These relations yield fi g =
is not divisible by /1. Since Ii is irreducible, By the definition of p. it follows that Ii and are relatively prime. Thus there are polynomials u and u such that (cf. the corollary to Proposition II, sec. 12.7) u Ii
+u
=
1.
Set g1
k.
= ug +
Then we have Ii g1
= liii g
+ r /i"j' k
g = g;
= /10 g + 1 i g = (Ii U +
i.e., hg1=g.
Now
equation (12.15) implies that f
/1i11—
g1
p—i
k
divides g. This establishes the Continuing this process we see that is complete. lemma and so the proof of Theorem I
Carol/an'. The monic polynomials which divide the polynomial are
precisely the polynomials
(v=1,...,r). Now let (12.14) be the decomposition of the monic polynomial f and set q = fk ... ... (1 = ... 1
It will be shown that the q are relatively prime and that for every i. I is the least common multiple of and the polynomial V q1, (12.16)
V 1=1
j*i Let g be a monic polynomial which divides that g has the form
(12.17)
Then Theorem I implies
g divides all polynomials q1, it follows that g =
1,
whence (12.16).
§ 2. Ideals and divisibility
A similar argument shows that V
and
363
now rormula (12.17)
follows from the relation q, A
and the fact that
and
q f:k
(V
=f
j *i
are relatively prime.
Proposition IV: Suppose g is a product of relatively prime irreducible
polynomials and suppose f is a polynomial such that g/j'" for some in 1. Then g/f Let
j—j1 ... I,
k,
be the decomposition of f into its prime factors. Then the corollary of Theorem I implies that g is of the form ho
= /1f/i
Is
by hypothesis, g is a product of relatively prime irreducible polynomials it follows that L = = J = whence Since,
1
...
and so g/f Proposition
V: A polynomial f is the product of relatively prime
irreducible polynomials if and only if f and f' are relatively prime. Proof Let jk
1— 1k
.1
Jr
J1
be the decomposition of f into prime factors. Suppose that some Then writing we
f = h j2
I
for
lie I' [t]
obtain
+ 2/if
f
f and
prime.
f and f' are not relatively
Conversely, assume that (i= I... r). Then, if I and 1' have a common factor, the corollary to Theorem I, implies that, for some i, J divides f'. 1
Since
= j=1
...
... Jr =
11
...
f1
...
Jy + /1
...
Jr
Chapter XII. Polynomial algebra
we obtain 11.11
.11'
fr
is irreducible and the polynomials .11 fr are relatively prime. This But this is impossible since follows that
Since it
contradiction shows that / and f' are relatively prime. 12.10. Polynomial functions. Let (F; F) be the space of all set maps furnished with the linear structure defined in sec. 1.2, Example 3. determines an elementf of (F; F) defined
Then every polynomialf= by
The functions f are called polynomial functions. for some lET, then 2 is called a root off. I isaroot off if and only if t—1 dividesf. In fact, if
f=(t-2)g it follows thatf(2) = 0. Conversely, if t —2 does not dividef, then t —2 and fare relatively prime and hence there exist polynomials q and s such that
I q + (t —1) S = 1.
This implies that
f(2)q(1)= 1 whencef(2)+0. It follows from the above remark that a polynomial of degree n has at most ii roots. Proposition VI: The mappingf—*Jis injective. for every Proof. Suppose J=O. Then Since F has characteristic zero it contains infinitely many elements and hence it follows that
f=0. In view of the above proposition we may denote the polynomial function f simply by f
Problems I. Let I be a polynomial such that 1(0) 0. Consider two polynomials g1 and g2 are g1 and g2 such that relatively prime.
____________
§ 2. Ideals and divisibility
365
2. Consider the set of all pairs (f, g) where g #0. Define an equivalence relation in this set by
(f,g)
if and only if
= Jg.
Show that this is indeed an equivalence relation. Denote the equivalence
classes by. Prove that the operations (f1,g1)
+(f2,g2) =(fi g2 + f2g1,g1 g2)
and
(f1,g1)(f2,g2) = (f1f2,g1g2) are well defined.
Show that with these operations the set of equivalence classes becomes
a field, denoted by 01[t]. Prove that the mapping is a monomorphism of the algebra F[t] into the algebra Q1[t]. 3. Extend the derivation d to a derivation in and show that this extension is unique. Show that the integration operator j' (cf. problem 2, § 1) cannot be extended to Q1[t]. 4. Show that any ideal in T[t] is contained in only finitely many ideals. 5. Consider the mapping 11[t] x R[t]—+111[t] defined by
Show that this mapping makes R[t] into an inner product space. Prove that the induced topology makes R[t] into a topological algebra (addition, scalar multiplication, multiplication and division are continuous). have the inner product of problem 5. Let I be any ideal. 6. Let Calculate I explicitly. Under what conditions do either of the equations =
(I')' = hold? Show that
((I')')' = I'. 7. Let f and g be any two non-zero polynomials and assume that deg g. Write
I=
+ g1
where g1 =0 or deg g1 <deg g. Prove that the greatest common divisor off and g coincides with the greatest common divisor of g and g1, unless
Chapter XII. Polynomial algebra
g dividesf. If g1 +0 write g2 =
g = p2g1 + g2,
degg2 <degg1.
or
0
Show that the repeated application of this process yields an explicit calculation of the greatest common divisor of f and g. (This method is
called the Euclidean algorithm).
8. Calculate the greatest common divisors and the least common multiples of the following polynomials over a) + t4 + t3 + b) c) 5t4 — +
d) 818 +
e) 3t4 +
12
—
— —
9t2
+ I + 1, t + 7, 3t
t6 — 712
+ 7,
+ 16, 218 + +
—
+ + 841 + 5, —
I
+ 16
72
+
—
2912 —
—
+ 21 + 1
— 612 —
64t
3t
— 15
+ 4.
9. 1ff, g are two polynomials and d is their greatest common divisor use the Euclidean algorithm to construct polynomials r, s such that
+ g s = d.
10. Construct the polynomials r, s explicitly for the polynomials of problem 8, (in parts b) and c) it will be necessary to construct three polynomials).
11. Decide whether the following polynomials are products of relatively prime irreducible polynomials: a) 17—t5+t3—t; b)
g
c) the polynomials of problem 6, § 1; d) the polynomials of problem 8. 12. Let g1, g2 be non-zero polynomials such that g1 +g2. Show that g2 divides f(g1) —
§ 3.
Factor algebras
12.11. Minimum polynomial. Let A be a finite dimensional associative
commutative algebra with unit element e. Fix an element OEA and consider the homomorphism P: T[t]—÷A given by
P(f)=f(a)
fer[t]
§ 3. Factor algebras
367
sec. 12.2). Its image is the subalgebra of A generated by e and a. It will be denoted by ['(a). The kernel, K of (P is an ideal in T[t]. Since A has finite dimension, By Proposition I, (Sec. 12.5) there is a unique monic polynomial. p is called the mininuim poli'nomial of a. p, such that K = 0, p must have positive degree. To see this assume that p = 1. If A and so (P =0. It follows that e=(P(1)=0 whence A =0. Then The homomorphism (P factors over the canonical projection to induce an isomorphism (cf.
['(ii)
W:
such
that the diagram
T[t]
['(a)
commutes.
Example I: If a=0, then K consists of all polynomials whose scalar term is zero and so p=1. Example II: If a=e. then K consists of all polynomials I In this case we have p=t— I.
satisfying
Example III: Set A = F [1]/Ih (where h is a fixed monic polynomial) and
a=t where t=m(t). Then, for every polynomial I (P(f) = and
=
ir(tY =
=
so (P coincides with the canonical projection. It follows that the
minimum polynomial of is the polynomial h.
Proposition I: The dimension of ['(a) is equal to the degree of the minimum polynomial of a. Proof: Let r denote the degree of p. Then it is easily checked that the tr_l form a basis of F[t]/I,,. Thus we have, in view of elements I. t the isomorphism W. dim ['(a) = dim T[t]/I,, = I'. Proposition II: Every ideal in ['(a) is principal.
Then I is an Proof: Let 14 be an ideal in ['(a) and set I = ideal in F[t]. Thus, by Proposition V. sec. 12.5. there is an element fE J'[t] such that I = I. It follows that = (P(I) =
=
Chapter XII. Polynomial algebra
12.12. Nilpotent elements in 1(a). Suppose that I (a) is a nilpotent element of 1(a). Then, for some in 1. 1(a)" = 0. It follows that f.rn) =
= 1(a)" = 0
and so the minimum polynomial. p. of a divides I Now decompose p into its prime factors P=
/1
tr
.
and set
g=f1... Then
fr
we have g/p1/f'. Now Proposition IV. sec. 12.9. implies that g
divides f Conversely, assume that g//. Set k = max
(k1
kr). Then
p, g' and so
we have
It follows that I = 0 and so 1(a) is nilpotent. Thus I (a) is in/potent if and on/v if g divides f. In particular. if all the
exponents in the decomposition of p are I. then p=g and so f(a)
is
nilpotent if and only if p/f; i.e.. if and only if f(a)=0. Hence, in this case there are no non-zero nilpotent elements in 1(a). 12.13. Factor algebras of an irreducible polynomial. Theorem I: Let I be a polynomial. Then the factor algebra 1(t) is a field if and only if f is irreducible.
Proof: Supposef= gh,wheredegg 1 and degh and so h+O. On the other hand, and so 1(1) has zero divisors. Consequently, it is not a field. Conversely, suppose f is irreducible. 1(1) is an associative commutative algebra with identity T. To prove that 1(1) is a field we need only show that every non-zero element has a multiplicative inverse. Let be any representing polynomial. Then since it follows that g is not divisible by f Since f is irreducible, f and g are relatively prime and so there exist polynomials h and k such that
gh +fk whence
Butf=O and so 1. Hence ii
is an inverse of
I
§ 4. The structure of factor algebras
369
Corollary: If j. is irreducible, then 1(1) is an extension field of F. Proof: Consider the homomorphism co:F—*F(i) defined by It is clear that ço is a monomorphism and so the corollary follows.
Problems 1. Consider the irreducible polynomial f= t2 + 5t + 1 as an element of [t] (where Q is the field of rational numbers). Let m:Q[t]—÷Q[t]/11
be the canonical projection. a) Decide whether the polynomials of problem 6, § 1, problem 8, § 2 (except for part d) and problems 11 a) and b), § 2 are in the kernel of it. b) For each polynomial p of part a) such that itp+O construct a polynomial [t] such that
= (mp)'. 2. Let feF[t] be any polynomial. Consider an arbitrary element xel'[t]/11. Prove that the minimum polynomial of x has degree
degf.
3. Suppose fer[t] is an irreducible polynomial, and consider the polynomial algebra F[t]/11[t] denoted by f'1[t]. a) Show that F[t] may be identified in a natural way with a subalgebra of F1[t].
b) Prove, that if two polynomials in F[t] are relatively prime, then they are relatively prime when considered as polynomials in ['1[t]. c) Construct an example to prove that an irreducible polynomial in F[t] is not necessarily irreducible in F1[t]. d) Suppose thatf has degree 2, and that g is an irreducible polynomial of degree 3 in F[t]. Prove that g is irreducible in F1[t]. §
4•* The structure of factor algebras
In this paragraph f will denote a fixed monic polynomial. and F(t) will denote the factor algebra F [t]/If. 12.14. The lattice of ideals in 1(i). Consider the set of all monic polyof nomials which divide 1. These polynomials form a sublattice then the greatest (cf. sec. 12.8). In fact. if ... is any finite set in 24
(i rcuh. Linctr \Igchra
Chapter XII. Polynomial algebra
370
common divisor and the least common multiple of the is again contained in f is a lower bound and 1 is an upper bound of ideals in the algebra On the other hand, consider the lattice F(i)=F[t]/11. The remarks of sec. 12.11 establish a bijection defined by
'Pg = where 'g denotes
the ideal in T(i) generated by
check that 'P and phism: i.e.,
The reader can easily 'P
cP(\/
a lattice isomor-
=
and
In
particular,
'P(1)=F(i) and
cP(f)=O. 12.15. Decomposition of F(i) into irreducible ideals. Let f =f1 denote the ideal in 1(1) generated byf,. Consider the ideal
. .
and let
(12.18)
Proposition I. I = if and only if the polynomials j are relatively prime. The sum (12.18) is direct if and only if, for each j. the polynomials and V J havef as least common multiple.
Proof: To prove the first part of the proposition we notice that F (1) = is
equivalent to 'P(1) = 'P(v fL)
which in turn is equivalent to
1= V
But according to sec. 12.6, this holds if and only if the J are relatively prime.
§ 4. The structure of factor algebras
371
For the second part we observe that the sum is direct if and only if
(j=1...m).
i*j Since
un
i*j
i*j
this is equivalent to
(j=1...m).
i*j Theorem I: Let be
the decomposition off into prime polynomials and let the polynomials be defined by k1
...f1
k2
fr
Then
1(1) = where
(12.19)
Moreover let
denotes the ideal generated by
(12.20)
and 1=1k
(12.21)
be the decompositions determined by (12.19). Then
is an identity in
and for every je1(f) q
41
=
(13.
(12.22)
Finally, if I is any ideal in 1(1), then (12.23)
Proof: The relation (12.19) is an immediate consequence of Proposition I, and formulae (12.16) and (12.17). To show that ë is an identity in 1(t) let be arbitrary. Then ci =
T
=
ci.
Since forj#i =0 24*
Chapter XII. Polynomial algebra
372
it follows that
= Now let
be
j.
an arbitrary element of T(t) and let q
=
we obtain q—
+...+
+•••+
But
and so It follows that k=O
+...+
= i=1
Finally, let I be any ideal in F(i). Then clearly
mu and so (12.23) is an immediate consequence of (12.20).
Theorem II. With the notation of Theorem I let —p
be
the homomorphism defined by =
p.(i) = Then
Thus
is an epimorphism and ker induces an isomorphism
=
F
(12.24)
In particular, the minimum polynomial of isf11". Proof: (12.22) shows that is an epimorphism. Next we prove that = ker We have
and the sum (12.19) is direct it follows that
(i=1...r);
§ 4. The structure of factor algebras
373
Now consider the induced map
i.e.,
Then
qEF[t]
= q(i,) and, in view of(12.22), j41
But
j*i
and thus
is the ideal generated
= 'fkj. This completes the proof.
Corollary I: An element cje 1(1) is contained in
if and only if
j+i.
and
Theorem III: The ideals are irreducible and (12.19) is the unique decomposition of 1(1) into a direct sum of irreducible ideals. Proof: Let In view of the isomorphism (12.24) it is sufficient to prove that the algebra 1' [t]/Igj is irreducible. According to Proposition I, T[t]/Igj is the direct sum of two ideals and '2 only if and '2 =
q1 and q2 are relatively prime divisors of
where
But this can only
happen if either q1 = or q2 = 1. If q1 1, say, then '1 is the full algebra and so 12=0. It follows that F(t)/Igj is irreducible. 1
Now suppose that I is an irreducible ideal in 1(1). Then Theorem I, sec. 12.16 gives
In I
I
for some i. Thus if
T(i)= for some ideal J, then intersection with I, gives I1).
is irreducible, it follows that Jn pletes the proof. Since
whence
This com-
374
Chapter XII. Polynomial algebra
Corollary I. The irreducible factor algebras 1(1) are precisely those for which f is a power of an irreducible polynomial.
Corollary II: Suppose the ideal I is a direct summand in 1(1). Then I is a direct sum of the I. Proof: Let J be an ideal such that J = 1(1). It is obvious that in a finite dimensional algebra any ideal is a direct sum of irreducible ideals. Since I and J are ideals, I J= 0, and so any ideal in 1(J) is an ideal in 1(1). Now the result follows from Theorem III. 12.16. Semisimple elements. Let A be a finite dimensional asso-
ciative commutative algebra with unit element e and let aeA. Recall from sec. 12.11 that the minimum polynomial, /, of a has positive degree. The element a is called seinisim pie. if f is the product of relatively prime irreducible polynomials.
Lemma I: a is semisimple if and only if the element f'(a) is invertible in the algebra T(a). If a is semisimple, then the polynomials I and f' are relatively prime (cf. Proposition V. sec. 12.9). Thus, by Corollary I to Proposition II. sec. 12.8. there are polynomials p and q such that
pf+qf'= I. Since f(a)=0 it follows that q(a)f'(a)=e and so f'(a) is invertible. Conversely, assume that ['(a) is invertible in 1(a). Then there is a poly-
nomial g such that f'(a)g(a)=e.
Set
Ii = 1' g
Then
1.
ii(a) = f'(a)g(a)
— e
= 0.
Since [ is the minimum polynomial of a it follows that f/i. Thus we can write
/
g
—
I
=
f. q
where q is some polynomial. It follows that [and [' are relatively prime.
Lemma II: Assume that a is semisimple and let h be a polynomial such that Ii(aY1=0 for some ,n 1. Then Ii(a)=0. Proof: Let .1 be the minimum polynomial of a. Then the hypothesis implies that f//i"'. Since a is semisimple. [is a product of relatively prime irreducible polynomials and it follows that f/li (cf. Proposition IV. sec. 12.9). Thus h(a)=0.
§ 4. The structure of factor algebras
375
IV: Let A be a finite dimensional associative commutative algebra. Let xEA and let Theorem
j=
Irk
the decomposition of the minimum polynomial of x into prime factors. Set h(1=1/1 r be
and
(i=l...r). Then there are unique elements xsEA and XvEA such that x5 is semisimple, x\ is nilpotent and X=
.vs
+ XN.
The minimum polynomials of xs and =g
are given by
and
=
Proof: 1. Existence: Identify the subalgebra 1(x) with the factor algebra
T(fl=r[t]/11 via x=t. Lemma IV below (cf. sec. 12.17) yields elements UET[t] and (')ET[t] with the following properties: (i)
(ii)
(iii)
U+Wt, g divides w.
Projection onto the quotient algebra yields the relations (12.25) (12.26)
and (12.27)
Now set
xs = ii and
=
Then g(x5) = 0.
Thus the minimum polynomial of xs divides g and hence it is a product of relatively prime irreducible polynomials. Thus, by definition. x5 is semisimple. Relation (12.26) shows that xv is nilpotent and from (12.27) we obtain the relation xv + xs = x. Finally. x5 and
are polynomials in x.
Chapter XII. Polynomial algebra
376
2. .\Iiiiiiniiin poltiiomiuls: Next we show that g is the minimum poiy— nomial of Let Ii he the minimum polynomial of Then Ii divides g. On the other hand. Taylor's expansion gives
+ x) =
/i(x) =
=
+q
q
where qEA is some element. This shows that /i(x) is nilpotent. Now Sec. 12.12 implies that g,'/i. Thus g=/i. consider me minimum polynomial, gI()L Thus ''V. It follows that
=
Since g/w we have
of
= 0.
= 1' with / k. To show that k / we may assume that k 2. Then (Ii
Thus
=g
r and r is
relatively prime tog (cf. Lemma III and IV, sec. 12.17). Hence r' is relatively prime to f It follows that r(x) is invertible. Now
=
= g(x) i(x)
=g.
and so
= xç. =
g(x)'
= 0.
Since t(x) is invertible we obtain g(x)'=O whence g'(x)=O. This implies that f/gt whence k 1. Thus k = 1: i.e.. = 3. Uniqueness: Assume a decomposition X=
Ys
is nilpotent. Set
where v5 is semisimple and
== and
+ .
Then
—
so = is nilpotent. We must show that ==0. Taylor expansion yields I
(i)(
j=1
)
x
j=O
g (i)(
)
.1•
showing that gft5) is nilpotent. Hence, for some in. (g(ys)Y1 = 0.
Since v5 is semisimple it follows that g(v5)=0 (cf. Lemma II) sec. 12.16 and so I.
(J(j) (
'=O.
(12.28)
§ 4. The structure of factor algebras
377
Let / 1 denote the degree of nilpotency of:. We show that 1= 1. In fact, assume that /2. Then, multiplying the equation above by :1_2 we obtain
=0.
In view of Lemma I (applied with A=T(x5)). g'(x5) is invertible in and hence invertible in A. It follows that zi =0, in contradiction to the choice of!. Thus 1=1; i.e., :=0. Thus i'5=X5 and IN=XN and the proof is complete.
are called the semisimple and
DefInition: The elements XN and ni/potent parts of x. 12.17. Lemma III:
Let g be a polynomial such that g and g' are rel-
atively prime. Then for each integer k 1 there are polynomials u and t' with the following properties: (i)
(ii) (iii)
u + gr = t. If k2, then r is relatively prime tog.
Proof: For k = I set u = t and v = 0. Next consider the case k =2. Since g and g' are relatively prime there are polynomials p and q such that
1+pg'=qg. The Taylor expansion (cf. sec. 12.4) gives, in view of the relation above, 0(n)
cx)
= g + g'pg +
g(t +
=g(1 +g'p)+g2 .
=
g2
+
+ (12.29)
/
q + g2 / = g2(q + /).
Now set
u=t+gp and r=—p. Then we have
u+gr=t
(12.30)
and, in view of(l2.29). = g2(q + /) which achieves the result for k =2. Finally, suppose the proposition holds for k — (k 3) and define Uk by 1
=
U2(11k1).
Chapter XII. Polynomial algebra
378
Replacing t by
in (12.30) we obtain )+
112(11k
12
By induction hypothesis
(uk_i) = Uk_i.
(12.31)
gkl
g
.
where q is some polynomial. Now (12.31) can be written as 11k
+ g.
q
.
Uk_i.
u2U(k_i)
(12.32)
Finally, (ii) yields in view of the induction hypothesis
=
t
(12.33)
g
Relations (12.32) and (12.33) imply that 11k
Setting tk = q
v2 (Uk
+g.
+
_
(q
17(11k_i)
.
we
+ tk—
= 1.
obtain
Uk + g Vk = t.
It remains to be shown that gk divides
But
=
=
Now the lemma, applied for k=2, shows that gk_
Since
it follows that and so the induction is closed. Thus property (i) and (ii) are established. (iii): Let k2. Suppose that ii is a common divisor of g and i.
g=p/i,
1:=q
11.
Then, by (ii)
g(u) =g(t — ig) = g(t
—
pqli2)
and so Taylor's expansion yields g(u) =
g
—
.
/
where / is some polynomial. Since k2. (i) implies that g2/g(u) and so /i2/g(u). Now the equation above shows that divides g g=
/i
.
§ 4. The structure of factor algebras
379
It follows that g' =
in +
2 Ii
m'
and so Ii is a common divisor of g and g'. Since g and g' are relatively prime. Ii must be a scalar. Hence g and r are relatively prime. This completes the proof.
Lemma IV: Let g be a polynomial such that g and g' are relatively prime. Then for each k I there are polynomials u and co with the following properties: (1) gk dirides g(u).
(ii) U+('J=t. (iii) g dirides ('i.
Proof: Define e by 0) = g
1
where r is the polynomial obtained in
Lemma III. 12.18. Decomposition of 1(t) into the radical and a direct sum of fields. Let be
the decomposition off into its prime factors and set
a= I/1 h
r
Consider the factor algebra T(i)=T[t]/11. By Theorem IV there is a unique decomposition =
ts + IN
is nilpotent. Moreover. g is the minimum is semisimple and polynomial oft5. Now let A be the subalgebra of 1(i) generated by I and Let be the ideal in T[f] generated by / (1= 1 ...r). where
Theorem V: 1. A is the (unique) direct sum of irreducible ideals (1=1 ... A
= Ii
and T[t]/11
In particular, each
(i =
1
... ij).
is a field and so A contains no non-zero nilpotent
elements.
2. The radical of T(t) consists precisely of the nilpotent elements in and is generated as an ideal by
Chapter XII. Polynomial algebra
3. The vector space T(t) is the direct sum of the subalgebra A and the ideal rad T(t). T(t) = A rad 1(t).
4. The subalgebra A consists precisely of the sernisimple elements in fliL 5. A is the only subalgebra of T(ii which complements the radical. Proof: 1) Consider the surjective homomorphism flt)—*A given by tF—*t. Since g is the minimum polynomial of it induces an isomorphism
T[t]/Ig
A.
According to Theorem III (applied to g) T[t]/Ig is the unique direct ['[tI/I1. (1 = ... sum of irreducible ideals where the corresponding ideal in A. Then 1
r).
Let
also denote
and the are irreducible ideals. is irreducible, Theorem I, sec. 12.13, implies Finally, since each is a field. that 2) and 3): Let denote the ideal generated by and let I be the ideal of nilpotent elements in T'(t). Since is nilpotent we have (12.34)
it follows that
Next, since
=
A
+
(12.35)
Moreover, since A is the direct sum of the fields rad
nA
rad A =
rad
= 0.
(12.36)
Relations (12.34) and (12.36) imply that the decomposition (12.35) is direct,
['(t) =
A
This yields
rad 1'(t) = rad 1'(t)n A + I\ = I\ whence I = rad ['(t). It follows that
T(t) =
A
rad T(t).
(12.37)
§4. The structure of factor algebras
381
We show first that every element in A is semisimple. In fact, let
4)
xe. Then, by Theorem IV, sec. 12.16, there are elements XNE['(X) such
that
is semisimple,
= In particular,
is
nilpotent and
+ XN.
is nilpotent in A. Thus, by 1),
and so
is semi-
simple.
be semisimple. In view of 3) we can write
Conversely, let
x=
XA
+ XR
T(t).
XAEA,
(12.38)
is nilpotent. Thus (12.38) must be the is semisimple and unique decomposition of into its semisimple and nilpotent parts.
Then
Since
is
semisimple it follows that
whence
which complements the radical. 5) Let B be any subalgebra of Then dividing out by the radical we obtain an algebra isomorphism B.
A
Thus. by 4), the elements of B are semisimple. Hence, again by 4),
BcA. It follows that B=A. Corollary: Let xET(t). Then the decomposition X = + XR obtained from the decomposition 3) is the decomposition of x into its
semisimple and nilpotent parts. The results of this paragraph yield at once Theorem VI: Let (12.39)
be the decomposition of the algebra T(t) into irreducible ideals. Then every ideal
is a direct sum rad I
I. = where
(12.40)
is a field isomorphic to T[t]/11 (i= I ..r). Moreover, the .
decompositions (12.39) and (12.40) are connected by rad T(t) = and A
rad I,
Chapter XII. Polynomial algebra
382
Problems 1. Consider the Polynomials
a) f=t3—6t2+ lit— 16. b) 1= t2 + t + 7. c) I = t2
—
5.
Prove that in each case I is the product of relatively prime irreducible polynomials. For k=2, 3. construct polynomials ii and r which satisfy the conditions of Lemma III. sec. 12.17. 2. Show that if and are any two polynomials satisfying the conditions of Lemma III (for some fixed k), then gk divides and g"' divides
Chapter XIII
Theory of a linear transformation Iii tills (ila/)ter E will ileiiote a clef iiied ouci
a/i aihitiaiv tie/cl T of characteristic 0.
l(X'tOl space E—E will
a liiiear tiaiisforiiiatioii.
§ 1. Polynomials in a linear transformation 13.1. Minimum polynomial of a linear transformation. Consider the algebra A(E; E) of linear transformations and fix an element E). E) is defined by Then a homomorphism P: cP:
fF—*f((p)
sec. 12.11). Let p be the minimum polynomial of (p. Since A(E; E) is non-trivial and has finite dimension, it follows that deg p (cf. sec. 12.11). The minimum polynomial of the zero transformation is t whereas the minimum polynomial of the identity map is t— 1. Proposition I. sec. 12.1 1. shows that (cf.
1
dim T(p) = deg p where
T(p)= Im&
13.2. The space Kifi. Fix a polynomial f and denote by K([) the kernel of the linear transformation f(p). Then the subspace is stable under (p. In fact. if xcK(f). then f(p)x=0 and so
=
= 0;
xEK(f). In particular we have
K(l)=0.
K(t)=kerço
and
where p denotes the minimum polynomial of (p.
K(p)=E.
Chapter XIII. Theory of a linear transformation
and let q1 : F F he the Let F E he ani suhspace stable ttIIL'iC /Lj: induced linear transformation. Then fL((pj;) 0 ciiid 50 denotes the minimum polynomial Now let g be a second polynomial and assume that I. Then
K(g)cK(f).
(13.1)
In fact, writingf==gg1 we obtain that for every vector xEK(g)
f(p)x =g1(p)g(p)x =0 whence xEK(f). This proves (13.1).
Proposition I. Letf and g be any two non-zero polynomials, and let d be their greatest common divisor. Then
K(d)=K(f)n K(g). Proof: Since d/f and d/g it follows that K (d)
K (f) and K (d)
K (g)
whence
K(d)
K(f) fl K(g).
(13.2)
On the other hand, since d is the greatest common divisor of f and g, there exist polynomialsf1 and g1 such that
d=f1f +g1g. Thus if xEK(f)n K(g) we have
d(p)x = f1(p)f(p)x + g1(p)g(p)x = 0 and hence xeK(d). It follows that
K(d)
K(f) fl K(g)
(13.3)
which, together with (13.2) establishes the proposition.
Corollary I: Letf be any polynomial and let d be the greatest common divisor off and 1u. Then
K(f) = K(d). Proof: Since K(p)=E, it follows from the proposition that
K(d)=K(f)n E=K(f).
§ 1. Polynomials in a linear transformation
385
Proposition II: Let f and g be any two non-zero polynomials, and let v be their least common multiple. Then
K(v) =
K(f) + K(g).
(13.4)
Proof. Sinceflv and gjv it follows that K(f)c:K(v) and K(g)cK(v); whence
K(v)
K(f) + K(g).
(13.5)
On the other hand, since v is the least common multiple off and g, there are polynomials f1 and g1 such that
f1f = v = andf1 and g1 are relatively prime. Choose polynomialsf2 and g2 so that f2f1 +g2g1 = 1; then + g2('p)g1(p) = 1. Consequently each xeE can be written as
x = x1 + x2 where
f2(p)f1(p)x and x2 = g2(p)g1 ((p)x.
x1
(13.6)
Now suppose that xeK(v). Then
f1(co)f(co)x = g1(ço)g(ço)x = v(p)x =
0
and sO (13.6) implies that
f(ço)x1 =O=g(p)x2. Hence x1EK(f), and x2eK(g), so that
xeK(f) + K(g) that is,
K(v) c K(f) + K(g).
(13.7)
(13.4) follows from (13.5) and (13.7).
Corollary I: 1ff and g are relatively prime, then K (1 g) = K (f)
K (g).
(13.8)
Proof: Sincef and g are relatively prime, their least common multiple is fg and it follows from Proposition 11 that
K(fg)= K(f)+ K(g). 25
Greub. Linear Algebra
(13.9)
Chapter XIII. Theory of a linear transformation
On the other hand the greatest common divisor of f and g is I, and so Proposition I yields that
K(f)n K(g)=K(l)=0.
(13.10)
Now (13.9) and (13.10) imply (13.8).
Corollary II. Suppose
f=fi"fr
is a decomposition of the polynomialf into relatively prime factors. Then
K(f)= Proof: This is an immediate consequence of Corollary I with the aid of an induction argument on r. Let f and g be any two non-zero polynomials such that g is a proper but the inclusion need not be proper. In divisor off Then fact, let g=p and I =/ip where ii is any polynomial with deg/i 1. Then g I (properly) but K (g) = E = K(f).
Proposition III: Letf and g be non-zero polynomials such that (i)
and (ii)
(properly)
Then
K (g) c K (f)
(properly).
Proof: (i) and (ii) imply that there are polynomialsf1 and g1 such that
p=ff1 and f=gg1
degg1>0.
(13.11)
Setting g2=gf1 we have degg2<degp and so p is not a divisor of g2. it follows that g2 cannot annihilate all the vectors of E; i.e., there is a vector xeE such that (13.12)
Let y =f1 (p)x. Then we obtain from (13.11) and (13.12) that
f(ço)y =f(p)f1(p)x = p(p)x =
0
while
= g(p)f1(p)x = g2(p)x
0.
§ I. Polynomials in a linear transformation
Thus
yeK(f), but
387
and so K(g) is a proper subspace of K(f).
Corollary I: Letf be a non-zero polynomial. Then
K(f)=O if and only iff and
(13.13)
are relatively prime.
1ff and /2 are relatively prime, then Corollary I to Proposition II gives
K(f)=K(f)n E=K(f)n K(/2)=O. Conversely suppose (13.13) holds, and let d be the greatest common divisor off and p. Then l/d and d/p but
K(d)=K(f)fl K(p)=O=K(1). It follows from Proposition III that 1 cannot be a proper divisor of d; hence 1 andf and p are relatively prime. Corollary II: Let f be any non-zero monic polynomial that divides p, and let ço1 denote the restriction of çü to K(f). Let pj. denote the minimum polynomial of p1. Then (13.14) pf=f.
Proof: We have from the definitions thatf(p1)=O, and hence ps/f. It K(f) and since K(pf) K(f) we obtain
follows that
K(p1) =
On the other hand,fjp and cannot be a proper divisor of Proposition IV: Let be
K(f).
Now Proposition III implies that,u1 which yields (13.14).
pf1...fr
a decomposition of p into relatively prime factors. Then
Moreover, if co denotes the restriction of q, to K(f,) and then polynomial of
pi =fi.
25*
is the minimum
Chapter XIII. Theory of a linear transformation
Proof: The proposition is an immediate consequence of Corollary II to Proposition 11 and Corollary 11 to Proposition 111. such 13.3. Eigenvalues. Recall that an eigenvalue of is a scalar that
(px=/x
(13.15)
for some non-zero vector XEE. and that x is called an eigenvector corresponding to the eigenvalue ).. (13.1 5) is clearly equivalent to
K(f)
0
(13.16)
where / is the polynomial t In view of Corollary I to Proposition III. (13.16) is equivalent to requiring that / and p have a non-scalar common divisor. Siiice deg/ I. this is the same as requiring that Thus the eigenvalues of p are precisely the distinct roots of p. Now consider the characteristic polynomial of çø, x
=
are the characteristic coefficients of p defined in sec. 4.19. The corresponding polynomial function is then given by where the
det ((p — )L i):
x
it follows from the definition that
dimE = Recall that the distinct roots of are precisely the eigenvalues of (cf. sec. 4.20). Hence the distinct roots of the characteristic polynomial coincide with those of the minimum polynominal. In sec. 13.20 it will be shown that the minimum polynomial is even a divisor of the characterisic polynomial.
Problems 1. Calculate the minimum polynomials for the following linear transformations: a) ço=).t b) (p is a projection operator c) (p is an involution
§ I. Polynomials in a linear transformation d) e)
389
'P is a differential operator
is a (proper or improper) rotation of a Euclidean plane or of
Euclidean 3-space 2. Given an example of linear transformations (p, t/i:E—*E such that 0i/i do not have the same minimum polynomial. o 'P and where (1= 1,2) are 3. Suppose E—E1E13E2 and linear transformations. Let p, 111, P2 be the minimum polynomials of (p, and 'P2. Prove that p is the least common multiple of P1 and P2. 4. More generally, suppose E1, E2 c E are stable under 'P and E= and 'P12:E1fl E2 be the E1+E2. Let q,1:E1—+E1, of'P and suppose that they have minimum restrictions polynomials P1,
is the least common multiple of Pt and P2• where v is the greatest common divisor of c) Give an example showing that in general P12 + 5. Suppose E1 E is a subspace stable under 'P. Let and minimum polynomials of p
b)
and
P2.
and 1ü be the
and
Prove that let v be the least common multiple of Pi and Construct an example where v=p+üp1 and an example where v+p= Finally construct an example where
6. Show that the minimal polynomial p of a linear transformation 'P can be constructed in the following way: Select an arbitrary vector x1eE and determine the smallest integer k1, such that the vectors (PVX1 (v= 0.. .k1) are linearly dependent,
= 0. Define a polynomial ft by k1
11
v0
tv.
If the vectors (PVXI (v=O...k1) do not generate the space E select a vector x2 which is not a linear combination of these vectors and apply the same
construction to x2. Let f2 be the corresponding polynomial. Continue this procedure until the whole space E is exhausted. Then p is the least common multiple of the polynomialsfQ. 7. Construct
the minimum and characteristic polynomials for the
following linear transformations of R4. Verify in each case that p divides
Chapter XIII. Theory of a linear transformation
390
a) b)
= =
c)
=
+ + + —
d)
+ —
+
+
+
+
—
—
—
+
— —
8. Let p be a rotation of an inner product space. Prove that the coefficients
of the minimum polynomial satisfy the relations
k=degp,v=0...k ± 1 depending on whether the rotation is proper or improper. 9. Show that the minimum polynomial of a selfadjoint transformation of a unitary space has real coefficients. 10. Assume that a conjugation z—*2 is defined in the complex vector space E (cf. sec. 11.7). Let ço:E—÷E be a linear transformation such that Prove that the minimum polynomial of p has real coefficients. 11. Show that the set of stable subspaces of E under a linear transformation is a lattice with respect to inclusion. Establish a lattice homomorphism of this lattice onto the lattice of ideals in F((p). 12. Given a regular linear transformation show that is a polywhere
nomial in çø.
13. Suppose p e L(E, E) is regular. Assume that for every //e L(E: E) Prove that J= I for some k. If/k is the leasi. integer (some
such that )'= I, prove that the minimum polynomial,
of p can be
written
=
§ 2.
Generalized eigenspaces
13.4. Generalized eigenspaces. Let
=
...
ft irreducible
(13.17)
be the decomposition of p into its prime factors (cf. sec. 12.9). Then the spaces
E. = K are called the generalized eigenspaces of p. It follows from sec. 13.2 that are stable under 'p. Moreover, Proposition IV, sec. 13.2 implies the
§ 2. Generalized eigenspaces
391
that (13j8) and
=
jk
where p, denotes the minimum polynomial of the restriction of p to E1. In particular, dim Now suppose 2 is an eigenvalue for 'p. Then t—2Jp, and so for some i t
Hence the eigenspaces of
—2.
are precisely the spaces
K(f) where the J, are the those polynomials in the decomposition (13.17) which are linear.
13.5. The projection operators. Let the projection operators in E associated with the decomposition (13.18) be denoted by ira. It will be shown that the mappings are polynomials in p,
i=1,...,r. If r= 1, it1 = t and the assertion is trivial. Suppose now that r> 1 and define polynomials by —
g
are relatively prime, and hence there
exist polynomials h1 such that (13.19)
On the other hand, it follows from Corollary II, Proposition II, sec. 13.2 that
K(g) =
j*i
and so, in particular, =
0
xe
j*i
Now let xeE be an arbitrary vector, and let x1EE1
(13.20)
392
Chapter XIII. Theory of a linear transformation
be the decomposition of x determined by (13.18). Then (13.19) and (13.20) yield the relation x=
=
x =
h (go) g1 (go)
h. (go) g. ((p) x1
whence x1
=
I = 1,
...,r.
(13.21)
It follows at once from (13.20) and (13.21) that
=
I
=
1,
...,r
which completes the proof. 13.6. Arbitrary stable subspaces. Let Fc: E be any stable subspace. Then
F=
(13.22)
n
where the are the generalized eigenspaces of go. In fact, since the projection operators are polynomials in go, it follows that F is stable under each 7C1.
Now we have for each xeF that X
=1X=
X
and fl
It follows that
E1,
whence E1.
inclusion in the other direction is obvious, (13.22) is established. 13.7. The Fitting decomposition. Suppose F0 is the generalized eigenspace of go corresponding to the irreducible polynomial t (if t does not divide then of course F0=0). Let F1 be the direct sum of the remaining generalized eigenspaces. The decomposition Since
E=
F0
F1
is called the Fitting decomposition of E with respect to go. F0 and F1 are called respectively the Fitting-null component and the Fitting-one component of E.
§ 2. Generalized eigenspaces
393
Clearly F0 and F1 are stable subspaces. Moreover it follows from the and F1, then definitions thatif 'p is ni/potent; i.e.,
forsomel>O
while Pi is a linear isomorphism. Finally, we remark that the corresponding projection operators are polynomials in (p, since they are sums of the projection operators defined in sec. 13.5. 13.8. Dual mappings. Let E* be a space dual to E and let E*
çp* E*
be the linear transformation dual to 'p. Then if f is any polynomial, it follows from sec. 2.25 that This
=
implies that f(p*) = 0 if and only if f('p) 0. In particular, the
minimum polynomials of 'p and
coincide.
Now suppose that F is any stable subspace of E. Then F' is stable under co*. In fact, if yEF and y*EF' are arbitrarily chosen, we have <(p* y*, y> =
=
'p
0
whence 'p*y*EFI. This proves that F' is stable. Thus 'p* induces a linear transformation ((p*)I.
E*/FI
E*/FI.
On the other hand, let and ('p*)1 are be the restriction of 'p to F. It will now be shown that dual with respect to the induced scalar product between F and E*/FI is a representative of (cf. sec. 2.24). In fact, if yEF is any vector and an arbitrary vektor e E*/FI, then <('p*)1
which proves the duality Suppose next that
v*, y> =
= =
=
'pF
and ('p*)1. Thus we may write
E=
F1
F2
is a decomposition of E into two stable subspaces. Then it follows that E* = F±
F1'
Chapter XIII. Theory of a linear transformation
394
is a decomposition of E* into stable subspaces (under co*). Moreover, the pairs and are dual, and
induce dual mappings in each pair. are two dual subspaces Conversely, assume that F1 c:E and stable under p and respectively. Then we have the direct decompositions (cf. sec. 2.30), and it is easily checked that çø and
E== F1
and E*
=
FjL
are again stable.
(cf. sec. 2.30). Clearly the subspaces (F1*)I and More generally. a direct decomposition
E into several stable subspaces determines a direct decomposition of E* into stable subspaces,
j*i
F)1
as follows by an argument similar to that used above in the case r=2. Each pair is dual and the restrictions to F, p and and
are dual mappings.
Pro position I: Let
p=
.
the decomposition of the common minimum polynomial. p. of p and (p* Consider the direct decompositions be
and (13.23)
of
£ and E*
into the generalized eigenspaces of p and
= Proof: Consider
Es)'
i = 1, ...,r.
j*i
the subspaces F,*cE*
F1*=(>EJ)l
j*i
defined by
i=1,...,r.
Then (13.24)
§ 2. Generalized eigenspaces
are stable under
Then, as shown above, the
395
and (13.25)
It will now be shown that (13.26)
Let y*EF,* be arbitrarily chosen. Then for each
=
=
we have
= 0.
In view of the duality between EL and F1*, this implies thatf/'1(cp*)y* =0;
y*EE* This establishes (13.26). Now a comparison of the decompositions (13.23) and (13.25) yields (13.24).
Problems 1. Show that the minimum polynomial of p is completely reducible (i.e. all prime factors are of degree 1) if and only if every stable subspace contains an eigenvector.
2. Suppose that the minimum polynomial p of ço is completely reducible. Construct a basis of E with respect to which the matrix of ço is lower triangular; i.e., the matrix has the form 0
*
Hint: Use problem 1.
3. Let E be an n-dimensional real vector space and p: E—*E be a regular linear transformation. Show that çø can be written p=c01p2 where every eigenvalue of p1 is positive and every eigenvalue of .p2 is negative.
4. Use problem 3 to derive a simple proof of the basis deformation theorem of sec. 4.32. 5. Let ço:E-*E be a linear transformation and consider the subspaces F0 and F1 defined by F0 =
and
F1 =
fl
Chapter XIII. Theory of a linear transformation
396
a) Show that F0 = U ker
ço1.
1
b)
Show that
c) Prove that F0 and F1 are stable under
and that the restrictions are respectively nilpotent and regular. and cond) Prove that c) characterizes the decomposition
(p0:F0-3F0 and ço1
clude that F0 and F1 are respectively the Fitting null and the Fitting 1-component of E.
6. Consider the linear transformations of problem 7, § 1. For each transformation a) Construct the decomposition of into the generalized eigenspaces. b) Determine the eigenspaces. c) Calculate explicitly polynomials g, such that the g.(p) are the pro-
jection operators in E corresponding to the generalized eigenspaces. Verify by explicit consideration of the vectors that the (ço) are in fact the projection operators. d) Determine the Fitting decomposition of E. 7. Let E= be the decomposition of E into generalized eigenspaces
of p, and let
be the corresponding projection operators. Show that
there exist unique polynomials
(p) = Conclude that the polynomials
such that and
deg
deg
depend only on p. E*
8. Let E* be dual to E and
E* into generalized eigenspaces of co* prove that
=
are the corresponding projection operators and the g, are defined in problem 7. Use this result to show that and are dual and to obtain formula
where the
(13.24).
9. Let FcE be stable under 'p and consider the induced mappings E/F—+ E/F. Let E= be the decomposition of F into
c°F :F—÷F and
be the canonical injection and generalized eigenspaces of 'p. Let Q:E—*E/Fbe the canonical projection. a) Show that the decomposition of F into generalized eigenspaces is given by
F=
where
F
fl
§ 3. Cyclic spaces
397
Show that the decomposition of E/F into generalized eigenspaces of is given by E/F = (ElF)1 where (E/F), = (E1). b)
Conclude that
determines a linear isomorphism (E/F),.
E1/F,
c) If denote the projection operators in E, F and E/F associated with the decompositions, prove that the diagrams
and F
are commutative, and that
are the unique linear mappings for
which this is the case. Conclude that if g are the polynomials of problem 7,then F g1((p) and = —
10. Suppose that the minimum polynomial ji of çü is completely reducible. a) By considering first the case p = (t —
dimE.
degp
b) With the aid of a) prove that of p. § 3.
prove that
x the characteristic polynomial
Cyclic spaces
13.9. The map 0a• Fix a vector aEE and consider the linear map ar,:
given by a
fer[t].
Let K1, denote the kernel of ar,. It follows from the definition that feK,, Ka, where p is if and only if aEK(J). K,, is an ideal in T[t]. Clearly the minimum polynomial of p. Proposition 1: There exists a rector a E E such that K,, = I,, Proof: Consider first the case that p is of the form
p=
k 1.
/ irreducible.
Chapter XIII. Theory of a linear transformation
398
Then there is a vector tIE E such that (13.27)
0.
Suppose now that hEKa and let g be the greatest common divisor of h and p. Since aeK(fl, Corollary Ito Proposition I, sec. 13.2 yields
ueK(g).
(13.28)
Since g, p it follows that g =1' where / k. Hence 1 (p) a = () and relation (13.27) implies that /=k. Thus K(g)=K0. Now it follows from (13.28) that
In the general case decompose p in the form =
J irreducible
...
and let (i= 1 i) be be the corresponding decomposition of E. Let is the induced transformation. Then the minimum polynomial of given by (cf. sec. 13.4) = r) (1= 1
Thus, by the first part of the proof, there are vectors K,, =
I,,
(i =
I
such that
. . .
Now set
(1(l1 Assume that jEK11. Then f(p) a=0 whence
= 0.
Since f(p)
it follows that
f(p) whence fe Ka (i =
1
=0
(1 =
I
... r). Thus •f C
and so tel0. This shows that
•'' fl
=
III
whence K(,=I,L.
13.10. Cyclic spaces. The vector space E is called crc/ic (with respect
to the linear transformation p) if there exists a vector acE such that is surjective. Every such vector is called a geiiclaloI of E. the map In fact, let I cK and let XEE. Then, Ifa is a generator of E. then
§ 3. Cyclic spaces
399
for some geT[t], x=g(p)a. It follows that
=f(p)g(p)a = whence
f(p)a = g(p)O = 0
and so tel1.
Proposition II: If E is cyclic, then
degp = dimE.
Proof Let a be a generator of E. Then, since Ka =
0a induces an
isomorphism E.
It follows that (cf. Proposition I, sec. 12.11)
= degp.
dim
Proposition III: Let a be a generator of E and let deg p = m. Then the vectors
a,p(a)
form a basis of E.
Proof. Let / he the largest integer such that the vectors a, p(a)
(13.29)
are linearly independent. Then these vectors generate E. In fact, every
vector xeE can be written in the form x=f(p)a where I =
is a
polynomial. It follows that i—i
k
X
=
(a) = V=
p' (a). 1)
1)
Thus the vectors (13.29) form a basis of E. Now Proposition II implies that I=ni. Proposition IV: The space E is cyclic if and only if there exists a basis (v=O... n—i) of E such that =
(v = 0 ... ii —2).
Proof: If E is cyclic with generator a set a=a and apply Proposiu—I) is a basis satisfying tion III. Conversely. assume that
the conditions above. Let v= e I'[t] by
be an arbitrary vector and define V
Chapter XIII. Theory of a linear transformation
400
Then
=x as is easily checked and so E is cyclic. (v = 0 ii — I be a basis as in the Proposition CoioI/urr: Let above. Then the minimum polynomial of is given by )
= where the
are
—
determined by
Proof: It is easily checked that p(p)=O and so p is a multiple of the minimum polynomial of p. On the other hand. by Proposition II. the minimum polynomial of has degree ii and thus it must coincide with p. 13.11. Cyclic subspaces. A stable subspace is called a crc/ic siihspacc if it is cyclic with respect to the induced transformation. Every vector UEE determines a cyclic subspace. namely the subspace = Im Proposition V: There exists a cyclic subspace whose dimension is equal to degp. Proof: In view of Proposition I there is a vector u e F such that ker = Then induces an isomorphism E11.
It follows that
dimE, = diml'[t]
'R
= degp.
Throic;,, I: The degree of the minimum polynomial satisfies
degp dimE. Equality holds if and only if E is cyclic. Moreover, if F is any cyclic subspace of E. then
dimF degp.
Proof: Proposition V implies that deg p dimE. If E is cyclic. equality
holds (cf. Proposition III). Conversely. assume that degp=dimE. By Proposition V there exists a cyclic subspace with dim F=degp. It follows that F=E and so E is cyclic.
§ 3. Cyclic spaces and irreducible spaces
401
Finally, let Fc:E be any cyclic subspace and let v denote the minimum polynomial of the induced transformation. Then, as we have seen above, dim F = deg v. But v/p (cf. sec. 13.2) and so we obtain dim F deg p.
Corollary: Let F E be any cyclic subspace, and let v denote the minimum polynomial of the linear transformation induced in F by (p. Then
v=u if and only if
dimF = degp. Proof: From the theorem we have
dimF =
degv
while according to sec. 13.2 v divides p. Hence v=p if and only if deg v=deg ,i; i.e., if and only if
dimF = degp. 13.12. Decomposition of E into cyclic subspaces. Theorem II: There exists a decomposition of E into a direct sum of cyclic subspaces. Proof: The theorem is an immediate consequence (with the aid of an induction argument on the dimension of E) of the following lemma.
E such that
Lemma I. Let dimEa = degp.
Then there is a complementary stable subspace, Fc E, E = Ea
F.
Proof: Let Ea
denote the restriction of (p to Ea, and let (p*:E*
E*
the linear transformation in E* dual to (p. Then (cf. sec. 13.8) stable under (p*, and the induced linear transformation be
*
*
J_
*
(pa:E lEa *—E lEa 26
Greub. Linear Algebra
is
Chapter XIII. Theory of a linear transformations
402
is dual to
with respect to the induced scalar product between Ea and
The corollary to Theorem I, sec. 13.11 implies that the minimum polynomial Of 'Pa is again p. Hence (cf. sec. 13.8) the minimum polynomial
are dual, so that
of p is p. But Ea and
= dimEa = degp.
dim
is cyclic with respect to çü.
Thus Theorem I, sec. 13.11 implies that Now let
be the projection and choose u*eE* so that the element generates E*/E±. Then, by Proposition III, the vectors (p = 0... in — 1) form a basis of Hence the vectors
(p* V
(p = I ... in — I) are linearly independent.
Now consider the cyclic subspace Since I) it follows that On the other hand. Theorem I. sec. 13.11. implies that dim Hence dim
= (p = 0... is injective. Thus 0
Finally, since 7r* TE to
dim
+ dim
=
in —
= m. I
)
it follows that the restriction of But
in + (ii — in)
=
ii
(n =dim E)
and thus we have the direct decomposition E* =
of E* into stable subspaces. Taking orthogonal complements we obtain the direct decomposition
E=
E into stable subspaces which completes the proof.
§ 4.
Irreducible spaces
13.13. Definition. A vector space E is called
!I1(Iecoin/)os(Ib/e or
ii'i'cthicible with respect to a linear transformation 'P. if it can not be expressed as a direct sum of two proper stable subspaces. A stable
§4. Irreducible spaces
403
subspace F E is called irreducible if it is irreducible with respect to the linear transformation induced by (p. Proposition I: E is the direct sum of iireducihle subspaces. Proof: Let
E into stable subspaces such that s is maximized
(this is clearly possible. since for all decompositions we have sdim E). Then the spaces are irreducible. In fact, assume that for some i, =
is a decomposition of
dim
Ft"
> 0, dim
>0
into stable subspaces. Then
E=
F Fe" j*i is a decomposition of E into (s+ I) stable subspaces, which contradicts the maximality of s. An irreducible space E is always cyclic. In fact, by Theorem II, sec. 13.12, E is the direct sum of cyclic subspaces.
E=
If E is indecomposable. it follows that 1=and so E is cyclic. On the other hand, a cyclic space is not necessarily indecomposable. In fact, let E be a 2-dimensional vector space with basis a. h and define p: E—+E by setting (pa=h and (ph =0. Then (p is cyclic (cf. Proposition IV, sec. 13.10). On the other hand, E is the direct sum of the stable subspaces generated by a + b and a — b. 1
Theorem I: E is irreducible if and only if i)
p=fk,
f irreducible;
ii) E is cyclic.
Proof. Suppose E is irreducible. Then, by the remark above. E is E into the cyclic. Moreover, if generalized eigenspaces (cf. sec. 13.4), it follows that r= 1 and so
(f irreducible). Conversely, suppose that (i) and (ii) hold. Let E=
E1
E2
Chapter XIII. Theory of a linear transformations
404
be any decomposition of E into stable subspaces. Denote by and c02 the linear transformations induced in F1 and E, by ço, and let and 112 be the minimum polynomials of and q2. Then sec. 13.2 implies that Hence, we obtain from (i) that p and 111 =
k2 k.
=
(13.33)
Without loss of generality we may assume that k1 k2. Then xEE1
or
XEE7
and so
0.
It follows that
whence k1 k. In view of (13.33) we obtain k1 =k
i.e., It
= Iti•
Now Theorem I, sec. 13.10 yields that
dimE1 degp.
(13.34)
On the other hand, since E is cyclic, the same Theorem implies that
dimE = degjt.
(13.35)
Relations (13.34) and (13.35) give that
dimE = dimE1. Thus E= F1 and F2 = 0. It follows that E is irreducible.
Corollary I. Any decomposition of F into a direct sum of irreducible subspaces is simultaneously a decomposition into cyclic subspaces.
Then a stable subspace Corollary II. Suppose that of F is cyclic if and only if it is irreducible. 13.14. The Jordan canonical matrix. Suppose that F irreducible with respect to p. Then it follows from Theorem I sec. 13.13 that F is cyclic and that the minimum polynomial of ço has the form
p_fk
kl
(13.36)
where f is an irreducible polynomial. Let e be a generator of E and
§4. Irreducible spaces
405
consider the vectors (13.37)
it will be shown that these form a basis of E. Since
dimE =
deg,u
= pk
it is sufficient to show that the vectors (13.37) generate E. Let Fc:E be the subspace generated by the vectors (13.37). Writing
we obtain that
i==1,...,k j=1,...,p—1 e
= f(çp)1_i ço"e =
p—i —
i1,...,k1 These
equations show that the subspace F is stable under 'p. Moreover,
e = a11 e F. On the other hand, since E is cyclic, E is the smallest stable
subspace containing e. It follows that F—E. Now consider the basis ...
a11,
...;akl ...
of E. The matrix of 'p relative to this basis has the form 0 1
(13.38)
0
406
Chapter XIII. Theory of a linear transformation
are all equal, and given by
where the submatrices
01 0
0 1
0•.
j=1,...,k. 0
1
The matrix (13.38) is called a Jordan canonical matrix of the irreducible transformation p. Next let ço be an arbitrary linear transformation. In view of sec. 13.12 there exists a decomposition of E into irreducible subspaces. Choose a basis in every subspace relative to which the induced transformation has the Jordan canonical form. Combining these bases we obtain a basis of E. In this basis the matrix of p consists of submatrices of the form (13.38)
following each other along the main diagonal. This matrix is called a Jordan canonical matrix of çø. 13.15. Completely reducible minimum polynomials. Suppose now that E is an irreducible space and that the minimum polynomial is completely
reducible (p= 1); i.e. that
p = (t
—
2)k.
It follows that the
are (1 x 1)-matrices given by Jordan canonical matrix of ço is given by
21
Hence the
0
2 1. (13.39)
..1 0
In particular, if E is a complex vector space (or more generally a vector space over an algebraically closed field) which is irreducible with respect to 'p, then the Jordan canonical matrix of çü has the form (13.39). 13.16. Real vector spaces. Next, let E be a real vector space which is irreducible with respect to ço. Then the polynomialf in (13.36) has one of the two forms f=t—2 2ER or
§4. Irreducible spaces
407
In the first case the Jordan canonical matrix of p has the form (13.39). In the second case the are 2 x 2-matrices given by 0
Hence
1
the Jordan canonical matrix of 0
has the form
1
-fl
1
- -
1
H 13.17. The number of irreducible subspaces. It is clear from the construction in sec. 13.12 that a vector space can be decomposed in several ways into irreducible subspaces. However, the number of irreducible subspaces of any given dimension is uniquely determined by as will be shown in this section. Consider first the case that the minimum polynomial of 'p has the form
p=f
degf=p
wheref is irreducible. Assume that a decomposition of E into irreducible subspaces is given. The dimension of every such subspace is of the form pK(l as follows from sec. 13.12. Denote by FK the direct sum of the irreducible subspaces of dimension pic and denote by NK the number of the subspaces Then we have
E=
FK.
Comparing the dimensions in (13.40) we obtain the equation = dim E=n.
(13.40)
Chapter XIII. Theory of a linear transformation
405
Now consider the transformation
it follows from (13.40) that
Since the subspaces FK are stable under
(13.41)
By the definition of
we have FK
dim
=
whence
=
Since the dimension of each follows that
decreases by p under k/I (cf. sec. 1 3. 14) it
dimk//Fh =
—
(13.42)
1)NK.
Equations (13.41) and (13.42) yield (r(k/I) = rank t/i)
K=2
Repeating the above argument we obtain the equations j=1,...,k.
Kj+
1
Replacing] byj+ I and]—! respectively we find that
(K—j—l)NK=p (13.43)
and —j + 1)NK =
=
—j)NK +
pINK.
(13.44)
Adding (13.43) and (13.44) we obtain
+
= 2p
(K
=2p Kj+
1
—j)NK +
+
+
§4. Irreducible spaces
409
whence
=
+
j
—
= 1,...,k.
This equation shows that the numbers N1 are uniquely determined by the
(j= l,...,k).
ranks of the transformations In particular.
1.(i/1k_l)>
Nk=
I
if]> k. Thus the degree of is pk where k is the largest integer k and such that Nk>O. In the general case consider the decomposition of E into the generalized eigenspaces E= a a direct sum of irreducible subspaces. Then E every irreducible subspace F, is contained in some E (cf. Theorem I, sec. 13.13). Hence the decomposition determines a decomposition of
each E, into irreducible subspaces. Moreover, it is clear that these subspaces are irreducible with respect to the induced transformation E. Hence the number of irreducible subspaces in E1 of a given dimension is determined by and thus by p. It follows that the number of spaces F, of a given dimension depends only on (p. 13.18. Conjugate linear transformations. Let E and F be u-dimensional vector spaces. Two linear transformations p: E—*E and i/i: F—÷F are called conjugate if there is a linear isomorphism such that (pt: E1
= Proposition II: Two linear transformations (p and u/i
are
conjugate if
and only if they satisfy the following conditions: (i)
The minimum polynomials of (p and
factors /1
u/i
have
the same prime
Ir
(ii)
Proof: If (p and u/i
1=1 ... are
I.
conjugate the conditions above are clearly
satisfied.
To prove the converse we distinguish two cases.
Chapter XIII. Theory of a linear transformation
410
Casc I: The minimum polynomials of p and /i are of the form Pq)
.1
=
and
f irreducible.
.11.
Decompose E and F into the irreducible subspaces I
i=-1 j=1
The numbers =
.\ (/1)
F=>i=1 j=1
and N1(l/J) are given by
-
+
p=degf
and
= (cf.
-
+
sec. 13.17). Thus (ii) implies that
il. Since
i>k and
I>!
it follows that k=l. In view of Theorem I, sec. 13.13, the spaces El and F/ are cyclic. Thus Lemma I below yields an isomorphism such that El
= These
isomorphisms determine an isomorphism 1/I
Case II: and ized eigenspaces
=
0
such that
0
are arbitrary. Decompose E and F into the general-
and set
(i= I ... i). Then
restricts to a linear isomorphism in the subspace
§4. Irreducible spaces
Moreover, for
411
is zero in E,. Thus dimE —
=
= 1 ...
dimF —
=
=
Similarly.
... r.
i
Now (ii) implies that
1=1 ...r. Let and denote the respectively. The minimum polynomials of
restrictions of 'p and are of the form
and
i=l ... r and Since
+ dimE —
and
= r(f
j
+ dim F — dim
1
—
it follows from (ii) that l
(j(y) = Thus the pair
satisfies
...
the hypotheses of the theorem and so we
are reduced to case I.
Lemma I: Let p: E E and /i: F —* F (dim E = dim F = ii) be linear transformations having the same minimum polynomial. Then, if E and F are cyclic, 'p and are conjugate. Proof: Choose generators a and b of E and F. Then the vectors 01='pi(a) and
ii— I) forma basis ofEand F respectively.
Now set a,='p(a) and h= t/i'(h). Then, for some and
a,, = 1=
h=
()
h1. 1=
()
Moreover, the minimum polynomials of 'p and =
t'
and
—
1=)
=
t'
are given by —
1=0
Chapter XIII. Theory of' a linear transformations
412
(cf. the corollary of Proposition IV, sec. 1 3.10). Since 1 = 0 ...
=
it follows that
I).
n
by setting
define an isomorphism =
=
/ = 0 ...
h1
n —
Then we have
= (a,) = These
= h,. j=()
j=()
equations imply that
j=0...ii— I. This shows that Anticipating the result of sec. 13.20 we also have Corollary I: Two linear transformations are conjugate if and only if they have the same minimum polynomial and satisfy condition (ii) of Proposition II.
Corollary II: Two linear transformations are conjugate if and only if they have the same characteristic polynomial (cf. sec. 4.19) and satisfy (ii).
Proof: Let J be the common characteristic polynomial of p and /i. Write
=
. .
=
...
.
(f irreducible).
and = .1111
Set
the generalized eigenspaces for direct decompositions
and
fr
13(1=1 ... r) denote 1= 1 ... r; let and t/i respectively. Then we have the
§4. Irreducible spaces
it follows that
Since
413
(1=1 ... r). Similarly.
whence
(i=l...r). Thus we may assume that (g irreducible). Then (kni). ,=g'
I
is
of the form f=g" and
the corollary
follows immediately from the proposition.
Corollaiv III: Every linear transformation is conjugate to its dual.
Problems 1.
(i)
Let P: A(E; E) be a non-zero algebra endomorphism. Show that for any projection operator it.
r(P(m))=i(rr), (ii) Use (i) to prove that 2. Show that every non-zero endomorphism of the algebra A (E; E) is an inner automorphism. Hint: Use Problem I and Proposition II. sec. 13.18. 3. Construct a decomposition of into irreducible subspaces with respect to the linear transformations of problem 7, § 1. Hence obtain the Jordan canonical matrices. 4. Which of the linear transformations of problem 7, § make into a cyclic space? For each such transformation find a generator. For which of these transformations is irreducible? 5. Let E1 cE be a stable subspace under p and consider the induced mappings p1: E1 E1 and E/E1 E/E1. Let p, 1u1, ji be the corresponding minimum polynomials. a) Prove that E is cyclic if and only if 1
i) E1 is cyclic ii) E/E1 is cyclic
iii) p=u1ü. In particular conclude that every subspace of a cyclic space is again cyclic.
b) Construct examples in which conditions i), ii), iii) respectively fail, while the remaining two continue to hold. Hint: Use problem 5, § 1. 6. Let j=1 are
E into subspaces and suppose linear transformations with minimum polynomials
Chapter XIII. Theory of a linear transformations
414
Define a linear transformation
of E by
a) Prove that E is cyclic if and only if each relatively prime.
is cyclic and the
are
b) Conclude that if E is cyclic, then each F3 is a sum of generalized eigenspaces for çø.
c) Prove that if E is cyclic and
a generates Eifand only if generates 1. ..s). 7. Suppose Fc: E is stable under p and let (pF:F-+F (minimum polynomial jib.) and (minimum polynomial ji) be the induced transformations. Show that E is irreducible if and only if i) E/F is irreducible. ii) F is irreducible. iii) wheref is an irreducible polynomial. 8. Suppose that Eis irreducible with respect to çø. Letf (f irreducible)
be the minimum polynomial of ço. a) Prove that the k subspaces K(f) stable subspaces of E b) Conclude that lmf(co)K =
K(fk_l) are the only non-trivial
K
k.
9. Find necessary and sufficient conditions that E have no nontrivial stable subspaces. 10. Let be a differential operator in E.
a) Show that in any decomposition of E into irreducible subspaces, each subspace has dimension I or 2. b) Let N1 be the number of/-dimensional irreducible subspaces in the above decomposition (j= 1, 2). Show that
N1-f-2N,=dimE and N1=dimH(E). c) Using part b) prove that two differential operators in E are conjugate if and only if the corresponding homology spaces coincide. Hint: Use Proposition II, sec. 13.18.
11. Show that two linear transformations of a 3-dimensional vector space are conjugate if and only if they have the same minimum polynomial.
§ 5. Applications of cyclic spaces
12. Let be a linear transformation. Show that there exists a (not necessarily unique) multiplication in E such that i) E is an associative commutative algebra ii) E contains a subalgebra A isomorphic to F((p)
iii) If &F(p)4A is the isomorphism, then
13. Let F be irreducible (and hence cyclic) with respect to p. Show that the set S of generators of the cyclic space E is not a subspace. Construct a subspace F such that S is in I — correspondence with the non-zero elements of E/F. 1
14. Let
be
a linear transformation of a real vector space having can not be written in the
distinct eigenvalues, all negative. Show that form § 5. Applications
of cyclic spaces
In this paragraph we shall apply the theory developed in the preceding paragraph to obtain three important, independent theorems. 13.19. Generalized eigenspaces. Direct sums of the generalized eigenspaces of 'p are characterized by the following
Theorem I: Let (13.45)
be any decomposition of E into a direct sum of stable subspaces. Then the following three conditions are equivalent: (i) Each is a direct sum of some of the generalized eigenspaces E, of 'p. (ii) The projection operators
in E associated with the decomposition
(13.45) are polynomials in 'p. (iii) Every stable subspace UcE satisfies U=
U n
Proof: Suppose that (i) holds. Then the projection operators are sums of the projection operators associated with the decomposition of F into generalized eigenspaces, and so it follows from sec. 13.5 that they are polynomials in 'p. Thus (i) implies (ii).
416
Chapter XIII. Theory of a linear transformation
Now suppose that (ii) holds, and let Uc E be any stable subspace. Then i, we have
since
Uc>o1U. Since Uis stable underQ1 it follows that Q1Uc: Un U c:
U
fl
whence
F1.
The inclusion in the other direction is obvious. Thus (ii) implies (iii). Finally, suppose that (iii) holds. To show that (i) must also hold we first prove
Lemma I: Suppose that (iii) holds, and let
E into generalized eigenspaces. Then to every i, such that (1= 1, ...,r) there corresponds precisely one integer], (1 E1 n
Proof of lemma I: Suppose first that s). Then from (iii) we obtain every ./(1
E. =
E, n
=0 for a fixed I and for
fl
=
0
j= 1
which is clearly false (cf. sec. 13.4). Hence there is at least one] such that
To prove that there is at most one] (for any fixed 1) such that E1 n we shall assume that for some f,]1,]2, fl
+0
and E1 n
0,
+0
and derive a contradiction. Without loss of generality we may assume that
E1flF1+0 and E1flF2+0. Choose two non-zero vectors
y1eE1flF1 and y2EE1flF2. Then since Yi' y2eE1 we have that =
§ 5. Applications of cyclic spaces
417
J irreducible, is the decomposition of p. Let and '2 the least integers such that =0 and =0. We may assume that = '2 we simply replace Yt by the vector where ji
. .
be
Then
f
= 0 and
11
+ 0 for k <
12.
Now set Y = Yi + Y2 and let V be the cyclic subspace generated by y. Clearly Yc F1 SF2, and so
in view of (iii) we obtain Y= Y
Y n F2.
n F1
It will now be shown that Y n F1 = 0. Let ue Yn F1 be any vector. Since ue Ywe have that
u =f(ço)y
+f(co)y2
for some polynomialf. Since ueF1, it follows that
f(p)y2=u—f(p)y1eF1 n F2=0 and hence d(cp)y2=0 where d is the greatest common divisor of f and Thus we obtain d=ft where
f1k
But dif; and so u=f(p)y=f(q,)y1+f(q,)y2=0. This proves that Yn F1 =0. A similar argument shows that Yn F2 = so that V = Y n F1 Y n F2 = 0. This is the desired contradiction, and it
completes the
0
proof of the lemma.
We now revert to the proof of the theorem. Recall that we assume that (iii) holds, and are required to prove (i). In view of the above lemma we can define a set mapping such that fl FC(I)
0
i = 1,...,r and
fl
Then (iii) yields that E1
= 27
Greub. Linear Algebra
E1 n
=
E1 n
=
0
Chapter XIII. Theory of a linear transforniation
Finally, the relation
E=
E
i1
t
j
i
implies that and
iet'(j)
j
E.
Hence t is a surjection, and for every integer] (1
of some of the
is a direct sum
and so (i) is proved. Thus (iii) implies (i), and the
proof of the theorem is complete.
13.20. Cayley-Hamilton theorem. It is the purpose of this section to prove the Theorem II: (Cayley-Hamilton) Let x denote the characteristic polynomial of 'p. Then or, equivalently, 'p satisfies its own characteristic equation.
Before proceeding to the proof of this theorem we establish some elementary results. Suppose ).ET is any scalar, and let v denote the minimum polynomial
of 'p —ii. Assume further that v has degree in, and let v be given explicitly by
=
+
trn
(13.46)
Then o=
-
- 2i)m + rn—i
(prn+
= where
(13.47)
•
f ('p)
j=O
=
trn
+
It follows that p/f, and so in particular degp
degf = degv.
On the other hand, 'p =
('p
—
i) +
i
(13.48)
§ 5. Applications of cyclic spaces
419
and thus a similar argument shows that
degv degp. This, together with (13.48) implies that
degf =
degv
degp.
(13.49)
andf has leading coefficient 1, we obtain that
I =p. In particular, sincef=v(g), where g=t—)L, we have
= v(0) = /3g.
=
(13.50)
Lemma II: Suppose E is cyclic and dim E= m, and let x be the characteristic polynomial for 'p. Then x
=(—
Proof: Let )LEF be any scalar, and let v be the minimum polynomial for ço—2i. Since E is cyclic (with respect to 'p), Theorem I, sec. 13.1 1
implies that
dimE = m. Now we obtain from (13.49) that deg v=m, and so a second application of Theorem I, sec. 13.11 shows that E is cyclic with respect to 'p —2i. Let a be any generator of E (with respect to 'p —2i). Then
is a basis for E (cf. Proposition III, sec. 13.10). Now suppose that zi is a non-trivial determinant function for E. Then —
2i)a,...,('p —
= det(q
2
—
=A(('p_2,)a,...,('p_Ai)ma) =
—
21)ma,('p
—
a) (13.51)
2z)a,...,('p
—
2,)m_1
a).
On the other hand, if (13.46) gives the minimum polynomial of 'p —2:, we obtain that ('p —2i)ma
Chapter XIII. Theory ofa linear transformation
42()
and substitution in (13.51) yields the relation —
=
—
—
(—
= It follows that
—
2l)m1 a)
—
—
= (and in view of (13.50) we obtain that
=
REF.
(—
(13.52)
Finally, since (13.52) holds for every 2eF, we can conclude (cf. sec. 12.10) that
=
(—
Proof of Theorem II: According to Proposition V, sec. 13.11 there
exists a cyclic subspace Eac:E such that dimEa = By Lemma I, sec. 13.12, Ea has a complementary stable subspace F0, E
Ea
F.
Let Xa and XF be the characteristic polynomials of the linear transformations induced in Ea and in F by ço. Then X
XaXF
(cf. sec. 4.21).
On the other hand, the minimum polynomial of the linear transformation induced in Ea by p is as follows from the corollary to Theorem I, sec. 13.11. Now the lemma implies that 1a
±/1
Hence,
13.21.* The commutant of p. The commutant of ço, C(co), is the sub-
algebra of A(E; E) consisting of all the linear transformations that commute with p. Let f be any polynomial. Then K(f) is stable under every cli e C(co). In fact ifyEK(f) is any vector, then
and so clJYEK(f).
§ 5. Applications of cyclic spaces
421
Next suppose that e C(ço) is any linear transformation. Consider the decompositions of E into generalized eigenspaces of and of i/i,
(forp) and (fon/i)
and the corresponding projection operators in E, mappings and it follows that
Since the and are respectively polynomials in p and (cf. sec. 13.5)
ltjoQjQjoJtj Now define linear transformations
=
in E by o
Then we obtain that
= and hence the
are
o Qj
o
=
o
=
o
o
again projection operators in E.
Since
cEn
and
=
=
it follows that
E=
fl
cE
whence
and
E=
n
F,.
E a direct Proposition I: Let E= F1 sum of subspaces. Then the subspaces F; are stable under 'p if and only if the projection operators aj are contained in C(p). . . .
Proof: Since
flkera1
1*j
it follows that the are stable under 'p if the e C(ço). Conversely, if the are stable under 'p we have for each YEFJ that 'pyEF1, and hence =
coy
=
422
Chapter XIII. Theory of a linear transformation
while
1+).
c11(py=O=çocr1y
Thus the cr1 commute with (p. 13.22.* The bicommutant of p. The hicoiiimutant, C2((p), of (p is the
subalgebra of (cf. example I, see 5.2) consisting of all the linear transformations which commute with every linear transformation in C(p). Theorem Ill. C2(co) coincides with the linear transformations which are polynomials in (p.
Proof Clearly Conversely, suppose t/JEC2(q9)
is
any linear transformation and let (13.53)
be a decomposition of E into cyclic subspaces with respect to p. A decomposition (13.53) exists by Theorem II, sec. 13.12. Let {a,} be any fixed generators of the spaces F induced by p, and let be the minimum polynomial of (pt. Then (cf. sec. 13.2) so we can write
1=1
S.
In view of Proposition V and the corollary to Theorem I, sec. 13.11 we may (and do) assume that Now the Fare stable subspaces of E (under and so by Proposition I, sec. 13.21 the projection operators in Eassociated with (13.53) commute with (p. Hence they commute with cli as well, and so a second application of Proposition 1 shows that the F are stable under i/i. In particular, Since is cyclic with respect to (p we can write
i=1 Thus ifh(p)aEF, is an arbitrary vector in
we obtain
= h(p) k/Ia =
=
since (p and k/I commute. It follows that
= g(p)
I=
1
s
where k/Is denotes the restriction of 4' to F. In the following it will be
§ 5. Applications of cyclic spaces
423
shown that
thus proving the theorem. Consider now linear transformations
(2
I s) in E defined by
j4:i
X1XX = To show that x,
is
well-defined it is clearly sufficient to prove that
f((p)v1(p)a1=0
whenever
But iff(ço)a1=0, then pIf and so
divides
whence
= 0.
The relation =
=
shows that xL commutes with 'p, and hence with t/i. On
the other hand we
have that =
=
=
and =
=
=
whence
=0. This relation implies that
But
=p and so
— g1).
Since
p=v1p1, we obtain that —g1.
This last relation yields that for any vector
i=2,...,s. It follows that
I'=gl@p) which completes the proof.
Problems 1. Let k1
"fr
kr
Chapter XIII. Theory of a linear transformation
424
be the decomposition of the minimum polynomial of p, and let P21 be the
the number of
generalized eigenspaces. 1ff has degree p denote by irreducible subspaces of E1 of dimension p1(1
Set
= Show that the characteristic polynomial of p is given by
2. Prove that E is cyclic if and only if x
=±
IL.
3. Let (p be a linear transformation of E and assume that E=>F1 is a decomposition of E as a direct sum of stable subspaces. if each is a sum of generalized eigenspaces, prove that each is stable under every is stable under every k/JEC(p) Conversely, assume that each and prove that each is a sum of generalized eigenspaces of (p. 4. a) Show that the only projection operators in C(ço) are i and 0 if and only if E is irreducible with respect to ço.
b) Show that the set of projection operators in C(p)
is
a subset of
C2(p) if and only if E is cyclic with respect to (p. 5.
a) Define C3(p) to be the set of all linear transformations in E
commuting with every transformation in C2(ço). Prove that C3((p) = C(ço).
Prove that C2(p) = C(p) if and only if E is cyclic. 6. Let E be cyclic with respect to p and S be the set of generators of P2. Let G be the set of linear automorphisms in C(p). a) Prove that G is a group. b) Prove that for each (/EG, S is stable under and the restriction of ,/' to S is a bijection. Show that has no fixed points if k/i i. c) Let aES be a fixed generator and let a, lEG be arbitrary. Prove that b)
a=r if and only if
aa =
hence in particular, a=r if and only if d) Prove that G acts transitively on S; i.e. for each a, bES there is a
and
k/lEG such that /ia=b.
e) Conclude that if aES is a generator, then the mapping given by
is a hijection.
§6. Nilpotent and seniisimple transformations
425
7. Let be a differential operator in E. Consider the set I of transformations k/JEC(() such that k/JZ(E)c:B(E) (cf. sec. 6.7).
Show that I is an ideal in
and establish an algebra isomorphism
A(H(E); H(E)). § 6.
Nilpotent and semisimple transformations
13.23. Nilpotent transformations. A linear transformation,
is called
ni/potent if (pk=0 for some integer k or equivalently, if its minimal polynomial has the form =
tm.
The exponent ni is called the degree of ço. It follows from sec. 13.7 that p is nilpotent if and only if the Fitting null component is the entire space.
It is clear that the restriction of a nilpotent transformation to a stable subspace is again nilpotent. are two commuting nilpotent transforSuppose now that and mations. Then the transformations çü + i/i and i/i ço are again nilpotent. In fact, jfk and / denote the degrees of 'p and i/i, then ('p +
(k+ 1)
=
=
0
j=k+1
j=O
and
which proves that 'p +
(k+l)
+ =
=
0
'p are nilpotent.
and
Assume that E is irreducible with respect to the nilpotent transformation 'p. Then it follows from sec. 13.14 that the Jordan canonical matrix of has the form 0
(13.54) 0
Hence, the Jordan canonical matrix of any nilpotent transformation consists of matrices of the form (13.54) following each other along the main diagonal.
Chapter XIII. Theory of a linear transformation
426
Suppose now that decomposition
Letf be the polynomial
is
any linear transformation, and that p has the =
f
J=
Then if/i is any polynomial, h(co) is nilpotent if and only at once from sec. 12.12.
as follows
Is 13.24. Semisimple transformations. A linear transformation called semisimple if every stable subspace E1 c:E has a complementary stable subspace. Example I: Let E be a Euclidean space and be a rotation of E. Since the orthogonal complement of every stable subspace is stable (cf. sec. 8.19) it follows that is semisimple. Example II: In a Euclidean space every selfadjoint and every skew transformation is semisimple, as follows from a similar argument. Example III: Let E be a unitary space. Then every unitary and every selfadjoint transformation is semisimple. Let ço be a semisimple transformation and suppose that E1 is a stable subspace. Then the restriction go1 of to E1 is semisimple. In fact, suppose F1 is stable under goi. Then F1 is stable under go, and hence there exists a complementary stable subspace, F2, in E,
E= Intersection with E1 yields E1 = Since F2 n
E1
F1
F1
F2. (F2 fl E1).
is (clearly) stable under go1 it follows that go1 is semisimple.
Proposition I: Suppose go is semisimple, and let f be any polynomial. Thenf(go) is nilpotent if and only iff(go)=O. Proof. The if part is trivial. Suppose now thatf(go) is nilpotent of degree k. Then is stable under go and so we can write E = where F is stable under go, and hence under f(go). On the other hand, it is
clear thatf(go)FcK(f"_ 1) whence
F=O i.e.,
§ 6. Nilpotent and semisimple transformations
427
It follows that E=K(f') where l=max (k— 1,1). Since k is the degree of nilpotency off(p) we have
l>k
whence k=l== 1. Hencef(p)=O. is simultaneously nilpotent and semisimple, then Corollary I: If (p = 0.
The major result on semisimple transformations obtained in this section is the following criterion:
Theorem I: A linear transformation is semisimple if and only if its minimum polynomial is the product of relatively prime irreducible polynomials (or equivalently, if the polynomials p and ji' are relatively prime).
Remark: Theorem I shows that a linear transformation p is semisimple if and only if it is a semisimple element of the algebra F(p) (cf. sec. 12.17).
Proof: Suppose (p
=
j1ki
is
...
semisimple. Consider the decompositions
1 irreducible and relatively prime
of the minimum polynomial, and set
ffifr
Then
f(p) is nilpotent, and hence by Proposition I of this section,
f(p)=O. It follows that
by definition, we have
This proves the only if part of the theorem.
To prove the second part of the theorem we consider first the special case that the minimum polynomial, ji, of ip is irreducible. To show that (p is semisimple consider the subalgebra, T(p), of A (E; E) generated by (p and i. Since p is irreducible, F(p) is a field (cf. sec. 12.14). E(p) contains F and hence it is an extension field of F, and E may be considered as a vector space over F(p) (cf. § 3, Chapt. V). Since a subspace of the F-vector space E is stable under p if and only if it is stable under every transforma-
tion of F(p), it follows that the stable subspaces of E are precisely the F(p)-subspaces of the space E. Since every subspace of a vector space has a complementary subspace it follows that 'p is semisimple.
Chapter XIII. Theory of a linear transformation
Now consider the general case
P=
fi
•
.
. fr
ft irreducible and relatively prime.
Then we have the decomposition
E into generalized eigenspaces. Since the minimum polynomial of the is preciselyf (cf. sec. 13.4) it follows induced transformation from the above result that is semisim pIe. Now let E be a stable subspace. Then we have, in view of sec. 13.6,
F F is a stable subspace of complementary subspace
=
and hence there exists a stable
(F n E1)
These equations yield =
H=
H is a stable subspace of E it follows that p is semisimple. Corollary I. Let p be any linear transformation and assume that
E=(F E into stable subspaces such that the induced are semisimple. Then p is semisimple.
transformations be the minimum polynomial of the induced transforProof: Let Since is semisimple each p is a product of relatively mation prime irreducible polynomials. Hence, the least common multiple, f, of the is again a product of such polynomials. Butf(p) annihilates E and hence the minimum polynomial, p, of 'p dividesf It follows that p is a product of relatively prime irreducible polynomials. Now Theorem I implies that 'p is semisimple. Proposition H: Let A c F be a subfield, and assume that E, considered as a A-vector space has finite dimension. Then every (F-linear) transfor-
mation, çü, of E which is semisimple as a A-linear transformation is semisimple considered as F-linear transformation.
§ 6. Nilpotent
and semisimple transformations
429
Proof Let p4 be the minimum polynomial of p considered as a zi-linear are relatively
transformation. It follows from Theorem I that and prime. Hence there are polynomials g,rE4[t] such that
(13.55)
On the other hand, every polynomial over A[t] may be considered as a polynomial in F[t]. Since ((p) 0 we have JLr
denotes the minimum polynomial of the (F-linear) transformation p. Hence we may write = p1 h some /1 E F [t] and so = (13.56) + Pr/I'. where
Combining (13.55) and (13.56) we obtain
q/zp1+h'rp1+hrp-= 1 whence
p- = I
(q Ii + h 'r) Pr + (Ii
are relatively prime. This relation shows that the polynomials and Now Theorem I implies that the r-linear transformation ço is semisimple.
Theorem II: Every linear transformation p can be written in the form
where c°s is semisimple and p and
are given by
and
k=max(k1 Moreover, if N
any decomposition of p into a semisimple and nilpotent transformation such that then is
and Proof': For the existence apply Theorem I and Theorem IV. sec. 12.16
with A=F(p).
430
Chapter XIII. Theory of a linear transformation
To prove the uniqueness let
be
any decomposition of (p
into a semisimple and a nilpotent transformation such that with Then the subalgebra of A(E; E) generated by i.
commutes is and
commutative and contains (p. Now apply the uniqueness part of Theorem IV. sec. 12.16.
13.25. The Jordan normal form of a semisimple transformation. Suppose that E is irreducible with respect to a semisimple transformation φ. Then it follows from sec. 13.13 and Theorem I, sec. 13.24, that the minimum polynomial of φ has the form

μ = f,

where f is irreducible. Hence, writing f = t^p + α_{p−1} t^{p−1} + ⋯ + α_0, the Jordan canonical matrix of φ has the form (cf. sec. 13.15)

( 0  0  ⋯  0  −α_0     )
( 1  0  ⋯  0  −α_1     )
( 0  1  ⋯  0  −α_2     )   (13.57)
( ⋮  ⋮      ⋮  ⋮       )
( 0  0  ⋯  1  −α_{p−1} ),   p = deg f.

It follows that the Jordan canonical matrix of an arbitrary semisimple transformation consists of submatrices of the form (13.57) following each other along the main diagonal.

Now consider the special case that E is irreducible with respect to a semisimple transformation whose minimum polynomial is completely reducible. Then p = 1, and hence E has dimension 1. It follows that if φ is a semisimple transformation with completely reducible minimum polynomial, then E is the direct sum of stable subspaces of dimension 1; i.e., E has a basis of eigenvectors. The matrix of φ with respect to this basis is of the form
( λ_1           )
(      ⋱        )   (13.58)
(           λ_n )
where the λ_i are the (not necessarily distinct) eigenvalues of φ. A linear transformation with a matrix of the form (13.58) is called diagonalizable. Thus semisimple linear transformations with completely reducible minimum polynomial are diagonalizable.
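Both conditions are visible in a quick check (an illustration of ours using sympy's built-in test):

```python
from sympy import Matrix

A = Matrix([[0, -1], [1, 0]])                  # mu = t**2 + 1
print(A.is_diagonalizable(reals_only=True))    # False: mu irreducible over R
print(A.is_diagonalizable())                   # True:  mu splits over C
B = Matrix([[1, 1], [0, 1]])                   # mu = (t - 1)**2
print(B.is_diagonalizable())                   # False: B is not semisimple
```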
Finally, let φ be a semisimple transformation of a real vector space E. Then a similar argument shows that E is the direct sum of irreducible subspaces of dimension 1 or 2.

13.26.* The commutant of a semisimple transformation.
Theorem III: The commutant C(φ) of a semisimple transformation is a direct sum of ideals (in the algebra C(φ)) each of which is isomorphic to the full algebra of linear transformations of a vector space over an extension field of Γ.

Proof: Let
E = E_1 ⊕ ⋯ ⊕ E_r   (13.59)

be the decomposition of E into the generalized eigenspaces. It follows from sec. 13.21 that the eigenspaces E_i are stable under every transformation ψ ∈ C(φ). Now let C_j be the subspace consisting of all transformations ψ ∈ C(φ) such that

ψ E_k = 0,  k ≠ j.

Since E_j is stable under each element of C(φ) it follows that C_j is an ideal in the algebra C(φ). As an immediate consequence of the definition, we have

C_j ∩ Σ_{k≠j} C_k = 0,  j = 1, …, r.   (13.60)
Now let ψ ∈ C(φ) be arbitrary and consider the projection operators π_j associated with the decomposition (13.59). Then

ψ = Σ_j ψ_j,   (13.61)

where

ψ_j = ψ π_j.   (13.62)

It follows from (13.62) that ψ_j ∈ C_j. Hence formulae (13.60) and (13.61) imply that

C(φ) = C_1 ⊕ ⋯ ⊕ C_r.
It is clear that C_j ≅ C(φ_j), where φ_j is the restriction of φ to E_j and where the isomorphism is obtained by restricting a transformation ψ ∈ C_j to E_j. Now consider the transformations φ_j induced by φ. Since the minimum polynomial of φ_j is irreducible, Γ(φ_j) is an extension field of Γ, and we obtain from chap. V, § 3 that E_j is a vector space over Γ(φ_j) and that C(φ_j) is the full algebra of Γ(φ_j)-linear transformations of E_j. This completes the proof.
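The commutant can also be computed directly. A sketch (our illustration, not part of the text): C(φ) is the kernel of X ↦ AX − XA, and vectorizing gives vec(AX − XA) = (I ⊗ A − Aᵀ ⊗ I) vec(X). For the semisimple matrix below, Theorem III predicts dim C(φ) = 2² + 1² = 5.

```python
from sympy import Matrix, eye, diag

def kron(A, B):                      # Kronecker product, written out
    p, q = B.shape
    return Matrix(A.rows * p, A.cols * q,
                  lambda i, j: A[i // p, j // q] * B[i % p, j % q])

A = diag(1, 1, 2)                    # eigenvalue multiplicities 2 and 1
K = kron(eye(3), A) - kron(A.T, eye(3))
print(len(K.nullspace()))            # 5 = 2**2 + 1**2
```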
13.27. Semisimple sets of linear transformations. A set {φ_α} of linear transformations of E will be called semisimple if to every subspace F_1 ⊂ E which is stable under each φ_α there exists a complementary subspace F_2 which is stable under each φ_α.
Suppose now that {φ_α} is any set of linear transformations and let A be the subalgebra of A(E; E) generated by the φ_α. Then clearly, a subspace F ⊂ E is stable under each φ_α if and only if it is stable under each ψ ∈ A. In particular, the set {φ_α} is semisimple if and only if the algebra A is a semisimple set.

Theorem IV: Let {φ_α} be a set of commuting semisimple transformations. Then {φ_α} is a semisimple set.
Proof: We first consider the case of a finite set φ_1, …, φ_s of transformations and proceed by induction on s. If s = 1 the theorem is trivial. Suppose now it holds for s − 1, and assume for the moment that the minimum polynomial μ_1 of φ_1 is irreducible. Then E may be considered as a Γ(φ_1)-vector space. Since the φ_i (i = 2, …, s) commute with φ_1 they may be considered as Γ(φ_1)-linear transformations (cf. Chap. V, § 3). Moreover, Proposition II, sec. 13.24, implies that the φ_i, considered as Γ(φ_1)-linear transformations, are again semisimple. Now let F_1 ⊂ E be any subspace stable under the φ_i (i = 1, …, s). Then, since F_1 is stable under φ_1, it is a Γ(φ_1)-subspace of E. Hence, by the induction hypothesis, there exists a Γ(φ_1)-subspace F_2 of E which is stable under φ_2, …, φ_s and such that

E = F_1 ⊕ F_2.

Since F_2 is a Γ(φ_1)-subspace, it is also stable under φ_1, and so it is a stable subspace complementary to F_1.

Now let the minimum polynomial μ_1 of φ_1 be arbitrary. Since φ_1 is semisimple, we have

μ_1 = f_1 ⋯ f_r,  f_i irreducible and relatively prime.
Let

E = E_1 ⊕ ⋯ ⊕ E_r

be the decomposition of E into the generalized eigenspaces of φ_1. Now assume that F_1 ⊂ E is a subspace stable under each φ_i (i = 1, …, s). According to sec. 13.21 each E_j is stable under each φ_i. It follows that the subspaces F_1 ∩ E_j are also stable under every φ_i. Moreover the restrictions of the φ_i to each E_j are again semisimple (cf. sec. 13.24) and, in particular, the restriction of φ_1 to E_j has as minimum polynomial the irreducible polynomial f_j. Thus it follows from the case settled above that the restrictions of the φ_i to E_j form a semisimple set, and hence there exist subspaces P_j ⊂ E_j which are stable under each φ_i and which satisfy

E_j = (F_1 ∩ E_j) ⊕ P_j,  j = 1, …, r.

Setting F_2 = Σ_j P_j we have that F_2 is stable under each φ_i and that

E = F_1 ⊕ F_2.
This closes the induction, and completes the proof for the case that the φ_α form a finite set. If the set is infinite, consider the subalgebra A ⊂ A(E; E) generated by the φ_α. Then A is a commutative algebra, and hence every subset of A consists of commuting transformations. In view of the discussion at the beginning of this section it is sufficient to construct a semisimple system of generators for A. But A is finite dimensional and so has a finite system of generators (which may be chosen from among the φ_α). Hence the theorem is reduced to the case of a finite set.

Theorem IV has the following converse:
Theorem V: Suppose A ⊂ A(E; E) is a commutative semisimple set. Then every φ ∈ A is a semisimple transformation.

Proof: Let φ ∈ A be arbitrary and consider the decomposition

μ = f_1^{k_1} ⋯ f_r^{k_r}

of its minimum polynomial μ. Define a polynomial, g, by

g = f_1 ⋯ f_r.
Since the set A is commutative, the kernel K(g) of g(φ) is stable under every ψ ∈ A. Hence
there exists a subspace E_1 ⊂ E which is stable under every ψ ∈ A and such that

E = K(g) ⊕ E_1.   (13.63)

Now let

h = f_1^{k_1 − 1} ⋯ f_r^{k_r − 1}.

Then we have (since g h = μ)

h(φ) E ⊂ K(g).

On the other hand, since E_1 is stable under h(φ),

h(φ) E_1 ⊂ E_1,

whence

h(φ) E_1 ⊂ K(g) ∩ E_1 = 0.

It follows that E_1 ⊂ K(h). Now consider the polynomial

p = f_1^{m_1} ⋯ f_r^{m_r},  m_i = max(k_i − 1, 1).

Since m_i ≥ k_i − 1 it follows that

p(φ) x = 0,  x ∈ E_1,   (13.64)

and from m_i ≥ 1 we obtain

p(φ) x = 0,  x ∈ K(g).   (13.65)

In view of (13.63), (13.64) and (13.65) imply that p(φ) = 0, and so μ | p. Now it follows that

max(k_i − 1, 1) ≥ k_i,  i = 1, …, r,

whence

k_i = 1,  i = 1, …, r.

Hence φ is semisimple.

Corollary: If φ and ψ are two commuting semisimple transformations, then φ + ψ and φψ are again semisimple.
Proof: Consider the subalgebra A ⊂ A(E; E) generated by φ and ψ. Then Theorem IV implies that A is a semisimple set. Hence it follows from Theorem V that φ + ψ and φψ are semisimple.
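A small sanity check of the corollary (an illustration of ours, with two commuting diagonalizable matrices):

```python
from sympy import Matrix

A = Matrix([[1, 1], [1, 1]])         # eigenvalues 0, 2
B = Matrix([[0, 1], [1, 0]])         # eigenvalues 1, -1
assert A * B == B * A                # the two transformations commute
print((A + B).is_diagonalizable())   # True
print((A * B).is_diagonalizable())   # True
```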
Problems

1. Let φ be nilpotent, and let N_p be the number of subspaces of dimension p in a decomposition of E into irreducible subspaces. Prove that

dim ker φ = Σ_p N_p.
2. Let φ be nilpotent of degree k in a 6-dimensional vector space E. For each k (1 ≤ k ≤ 6) determine the possible ranks of φ, and show that k and r(φ) determine the numbers N_p (cf. problem 1) explicitly. Conclude that two nilpotent transformations φ and ψ of E are conjugate if and only if k(φ) = k(ψ) and r(φ) = r(ψ).

3. Suppose φ is nilpotent and let φ*: E* ← E* be the dual mapping. Assume that E is cyclic with respect to φ and that φ is of degree k. Let a be a generator of E. Prove that E* is cyclic with respect to φ* and that φ* is of degree k. Let a* ∈ E* be any vector. Show that ⟨a*, φ^{k−1} a⟩ ≠ 0 if and only if a* is a generator of E*.

4. Prove that a linear transformation φ with minimum polynomial μ is diagonalizable if and only if
i) μ is completely reducible,
ii) φ is semisimple.
Show that i) and ii) are equivalent to

μ = (t − λ_1) ⋯ (t − λ_r),

where the λ_i are distinct scalars.
5. a) Prove that two commuting diagonalizable transformations are simultaneously diagonalizable; i.e., there exists a basis of E with respect to which both matrices are diagonal.
b) Use a) to prove that if φ and ψ are commuting semisimple transformations of a complex space, then φ + ψ and φψ are again semisimple.

6. Let φ be a linear transformation of a complex space E. Let E = Σ_i E_i be the decomposition of E into generalized eigenspaces, and let π_i be the corresponding projection operators. Assume that the minimum polynomial of the induced transformation φ_i is (t − λ_i)^{k_i}. Prove that the semisimple part of φ is given by

φ_S = Σ_i λ_i π_i.

7. Let E be a complex vector space and φ be a linear transformation with eigenvalues λ_ν (ν = 1, …, n) (n = dim E), not necessarily distinct. Given an arbitrary polynomial f, prove directly that the linear transformation f(φ) has the eigenvalues f(λ_ν) (ν = 1, …, n).
8. Give an example of a semisimple set of linear transformations which contains transformations that are not semisimple.
9. Let A be an algebra of commuting linear transformations in a complex space E.
a) Construct a decomposition E = E_1 ⊕ ⋯ ⊕ E_r such that, for any φ ∈ A, each E_i is stable under φ and the minimum polynomial of the induced transformation is of the form (t − λ_i(φ))^{k_i}.
b) Show that the mapping A → ℂ given by φ ↦ λ_i(φ), φ ∈ A, is a linear function in A. Prove that λ_i preserves products and so is a homomorphism.
c) Show that the nilpotent transformations in A form an ideal which is precisely rad A (cf. chap. V, § 2). Consider the subspace T of L(A) generated by the λ_i. Prove that rad A = T^⊥.
d) Prove that the semisimple transformations in A form a subalgebra, A_S. Consider the linear functions in A_S obtained by restricting the λ_i to A_S. Show that they generate (linearly) the dual space L(A_S). Prove that the induced mapping T → L(A_S) is a linear isomorphism.
10. Assume that E is a complex vector space of dimension n. Prove that every commutative algebra of semisimple transformations is contained in an n-dimensional commutative algebra of semisimple transformations.

11. Calculate the semisimple and nilpotent parts of the linear transformations of problem 7, § 1.

§ 7. Applications to inner product spaces
In this concluding paragraph we shall apply our general decomposition theorems to inner product spaces. Decompositions of an inner product space into irreducible subspaces with respect to selfadjoint mappings, skew mappings and isometries have already been constructed in chap. VIII. Generalizing these results we shall now construct a decomposition with respect to a normal transformation. Since a complex linear space is fully reducible with respect to a normal endomorphism (cf. sec. 11.10) we can restrict ourselves to real inner product spaces.
13.28. Normal transformations. Let E be an inner product space and φ: E → E be a normal transformation (cf. sec. 8.5). It is clear that every polynomial in φ is again normal. Moreover, since the rank of φ^k (k = 2, 3, …) is equal to the rank of φ, it follows that φ is nilpotent only if φ = 0. Now consider the decomposition of the minimum polynomial into its prime factors,

μ = f_1^{k_1} ⋯ f_r^{k_r},   (13.66)

and the corresponding decomposition of E into the generalized eigenspaces,

E = E_1 ⊕ ⋯ ⊕ E_r.   (13.67)

Since the projection operators π_i associated with the decomposition (13.67) are polynomials in φ, they are normal. On the other hand, π_i² = π_i, and so it follows from sec. 8.11 that the π_i are selfadjoint. Now let x_i ∈ E_i and x_j ∈ E_j (i ≠ j) be arbitrary. Then

(x_i, x_j) = (π_i x_i, x_j) = (x_i, π_i x_j) = 0,  i ≠ j;

i.e., the decomposition (13.67) is orthogonal.

Now consider the induced transformations φ_i: E_i → E_i. It follows from sec. 8.5 that the φ_i are again normal and hence so are the transformations f_i(φ_i). On the other hand, f_i(φ_i) is nilpotent. It follows that f_i(φ_i) = 0 and hence all the exponents in (13.66) are equal to 1. Now Theorem I of sec. 13.24 implies that a normal transformation is semisimple.

Theorem I: Let E be an inner product space. Then a linear transformation φ is normal if and only if
i) the generalized eigenspaces E_i are mutually orthogonal;
ii) the restrictions φ_i of φ to E_i are homothetic (cf. sec. 8.19).
Proof: Let φ be a normal transformation. It has been shown already that the spaces E_i are mutually orthogonal. Now consider the minimum polynomial, f_i, of the induced transformation φ_i. Since f_i is irreducible over ℝ, it follows that either

f_i = t − λ_i   (13.68)

or

f_i = t² + α_i t + β_i,  α_i² − 4β_i < 0.   (13.69)

In the first case we have that φ_i = λ_i ι and so φ_i is homothetic. Now consider the case (13.69). Then (dropping the index i) φ satisfies the relation
φ² + α φ + β ι = 0,  α² − 4β < 0,   (13.70)

and hence the proof is reduced to showing that a normal transformation φ which satisfies (13.70) is homothetic. We prove first that φ − φ* is regular.
In fact, let K be the kernel of φ − φ*. If x ∈ K is an arbitrary vector, we have

(φ − φ*) φ x = φ (φ − φ*) x = 0,

whence φ x ∈ K. It follows that K is stable under φ and hence stable under φ*. Clearly the restriction of φ to K is selfadjoint. Hence, if K ≠ 0, φ has an eigenvector in K, which contradicts the hypothesis α² − 4β < 0. Consequently, K = 0. Equation (13.70) implies that

φ*² + α φ* + β ι = 0.   (13.71)
Multiplying (13.70) and (13.71) respectively by φ* and φ and subtracting, we find that

(φ φ* − β ι)(φ − φ*) = 0,

whence, in view of the regularity of φ − φ*,

φ φ* = β ι.   (13.72)

Define a transformation, τ, by

φ = √β τ

(notice that α² − 4β < 0 implies that β > 0). Then (13.72) yields τ τ* = ι and so τ is a rotation. This proves that every normal mapping satisfies i) and ii). The converse follows immediately from sec. 8.5.
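The 2 × 2 model of this last step can be checked symbolically (an illustration of ours): C = [[a, −b], [b, a]] is normal, satisfies a relation of the form (13.70), and is √β times a rotation.

```python
from sympy import symbols, Matrix, eye, zeros, expand

a, b = symbols('a b', real=True)
C = Matrix([[a, -b], [b, a]])              # a normal 2 x 2 block
alpha, beta = -2*a, a**2 + b**2            # alpha**2 - 4*beta = -4*b**2 < 0
Z = (C**2 + alpha*C + beta*eye(2)).applyfunc(expand)
assert Z == zeros(2)                       # C satisfies (13.70)
assert (C * C.T).applyfunc(expand) == beta * eye(2)   # C C^T = beta*I
```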
Corollary I: If φ is a normal transformation then the orthogonal complement of a stable subspace is stable.

Proof: Let F be a stable subspace. In view of sec. 13.6 we have

F = Σ_i (E_i ∩ F).

Since the restriction of φ to E_i is homothetic, it follows that the orthogonal complement, H_i, of E_i ∩ F in E_i is again stable under φ. Hence the space H = Σ_i H_i is stable under φ. On the other hand, the equations

E_i = (E_i ∩ F) ⊕ H_i

yield

E = F ⊕ H,  H = F^⊥.
Hence F^⊥ is a stable subspace.

As an immediate consequence of Theorem I we obtain

Theorem II: Let E be an inner product space. Then a linear transformation φ is normal if and only if E can be written as the sum of mutually orthogonal irreducible subspaces such that the restriction of φ to every subspace is homothetic.
13.29. Semisimple transformations of a real vector space. In sec. 13.28 it has been shown that every normal transformation of an inner product space is semisimple. Conversely, let φ: E → E be a semisimple transformation of a real vector space. Then a positive definite inner product can be introduced in E such that φ becomes a normal mapping. To prove this let

E = Σ_i F_i

be a decomposition of E into irreducible subspaces. In view of Theorem II it is sufficient to define a positive inner product in each F_i such that the restrictions, φ_i, of φ to F_i are homothetic; we then simply extend these inner products to an inner product in E such that the F_i are mutually orthogonal. Let F be one of the irreducible subspaces. Since φ is semisimple, F has dimension 1 or 2. If dim F = 1 we choose the inner product in F arbitrarily. If dim F = 2 there exists a basis a, b of F such that

φ a = b,  φ b = − β a − α b,  α² − 4β < 0

(cf. sec. 13.16). Define the inner product by

(a, a) = 1,  (a, b) = − α/2,  (b, b) = β.

Then we have for every vector x = ξ a + η b of F

(x, x) = ξ² − α ξ η + β η².

Since α² − 4β < 0 it follows that (x, x) ≥ 0, and equality holds only for x = 0. Moreover, since

(φa, φa) = β = β (a, a),  (φb, φb) = β² = β (b, b)  and  (φa, φb) = − α β/2 = β (a, b),

it follows that

|φ x|² = β |x|²,  x ∈ F.

This equation shows that φ is homothetic, and so the proof is complete.
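The construction can be verified in matrix form (our sketch): with G the Gram matrix of the inner product just defined and M the matrix of φ in the basis a, b, one checks Mᵀ G M = β G, which is exactly |φx|² = β |x|².

```python
from sympy import symbols, Matrix, simplify, zeros

alpha, beta = symbols('alpha beta', real=True)
G = Matrix([[1, -alpha/2], [-alpha/2, beta]])  # Gram matrix in the basis a, b
M = Matrix([[0, -beta],                        # columns: phi(a) = b,
            [1, -alpha]])                      #          phi(b) = -beta*a - alpha*b
assert (M.T * G * M - beta * G).applyfunc(simplify) == zeros(2)
```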
13.30. Lorentz transformations. As a second example we shall construct a decomposition of the Minkowski space into irreducible subspaces with respect to a Lorentz transformation φ (cf. sec. 9.27). For the sake of simplicity we assume that the Lorentz transformation is proper orthochroneous. Since φ is an isometry, the inverse of every eigenvalue of φ is again an eigenvalue. Since there exists at least one eigenvalue (cf. sec. 9.27), the minimum polynomial, μ, of φ has at least one real root. Now we distinguish three cases:

I. The minimum polynomial μ contains a prime factor f = t² + α t + β, α² − 4β < 0, of second degree. Then consider the mapping

τ = φ² + α φ + β ι.

The kernel of τ is a stable subspace F of even dimension containing no eigenvectors. Since φ has an eigenvector in E, E ≠ F. Thus F has necessarily dimension 2 and hence it is a plane. The intersection of the plane F and the light-cone consists of two straight lines, one straight line, or the point 0 only. The first two cases are impossible because the plane F does not contain eigenvectors (cf. sec. 9.26). Thus the inner product must be positive definite in F, and the induced transformation φ_1 is a proper Euclidean rotation. (An improper rotation of F would have eigenvectors.)
Now consider the orthogonal complement F^⊥. The restriction of the inner product to F^⊥ has index 1. Hence F^⊥ is a pseudo-Euclidean plane. Denote by φ_2 the induced transformation of F^⊥. The equation

det φ = det φ_1 · det φ_2

implies that det φ_2 = + 1, showing that φ_2 is a proper pseudo-Euclidean rotation. Choosing orthonormal bases in F and in F^⊥ we obtain an orthonormal basis of E in which the matrix of φ has the form
(  cos ω   sin ω     0        0     )
( −sin ω   cos ω     0        0     )
(    0       0     cosh θ   sinh θ  )
(    0       0     sinh θ   cosh θ  )
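A quick numerical check (ours; the angle ω and parameter θ below are arbitrary sample values): this rotation-plus-boost matrix preserves the Minkowski inner product with Gram matrix diag(1, 1, 1, −1).

```python
import numpy as np

w, th = 0.7, 1.3
R = np.array([[np.cos(w), np.sin(w)],
              [-np.sin(w), np.cos(w)]])        # Euclidean rotation on F
B = np.array([[np.cosh(th), np.sinh(th)],
              [np.sinh(th), np.cosh(th)]])     # pseudo-Euclidean rotation on F-perp
A = np.block([[R, np.zeros((2, 2))],
              [np.zeros((2, 2)), B]])
G = np.diag([1.0, 1.0, 1.0, -1.0])
print(np.allclose(A.T @ G @ A, G))             # True: A is a Lorentz transformation
```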
II. The minimum polynomial is completely reducible, and not all its roots are equal to 1. Then φ has eigenvalues λ and λ^{−1}, λ ≠ ± 1. Let e and e′ be corresponding eigenvectors. The condition λ² ≠ 1 implies that e and e′ are light-vectors. These vectors are linearly independent, whence (e, e′) ≠ 0 (cf. sec. 9.21). Let F be the plane generated by e and e′ and let z = ξ e + η e′ be any vector of F. Then

(z, z) = 2 ξ η (e, e′).

This equation shows that the induced inner product has index 1. The orthogonal complement F^⊥ is therefore a Euclidean plane and the induced mapping is a Euclidean rotation. The angle of this rotation must be 0 or π, because otherwise the minimum polynomial of φ would contain an irreducible factor of second degree. Select orthonormal bases in F and F^⊥. These two bases form an orthonormal basis of E in which the matrix of φ has the form
( cosh θ   sinh θ   0   0 )
( sinh θ   cosh θ   0   0 )
(   0        0      ε   0 )
(   0        0      0   ε ),   ε = ± 1.
III. The minimum polynomial of φ has the form

μ = (t − 1)^k.

If k = 1, φ reduces to the identity map. Next, it will be shown that the case k = 2 is impossible. If k = 2, applying φ^{−1} to the equation (φ − ι)² = 0 yields

φ + φ^{−1} = 2 ι,

whence (since (φ^{−1} x, x) = (x, φ x))

(φ x, x) = (x, x),  x ∈ E.

Inserting for x a light-vector, l, we see that (l, φ l) = 0. But two light-vectors can be orthogonal only if they are linearly dependent. We thus obtain φ l = λ l. Since φ does not have eigenvalues λ ≠ 1, it follows that φ l = l for all light-vectors l. But this implies that φ is the identity. Hence the minimum polynomial is t − 1, in contradiction to our assumption k = 2.
Now consider the case k ≥ 3. As has been shown in sec. 9.27 there exists an eigenvector e on the light-cone. The orthogonal complement E_1 of e is a 3-dimensional subspace of E which contains the light-vector e. The induced inner product has rank and index 2 (cf. sec. 9.21). Let F be a 2-dimensional subspace of E_1 in which the inner product is positive definite. Selecting an orthonormal basis e_1, e_2 in F we can write

φ e_1 = e_1 cos ω + e_2 sin ω + ξ_1 e,
φ e_2 = − e_1 sin ω + e_2 cos ω + ξ_2 e,   (13.73)
φ e = e.

The coefficients ξ_1 and ξ_2 are not both zero. In fact, if ξ_1 = 0 and ξ_2 = 0 the plane F is invariant under φ, and we have a direct decomposition of E into two 2-dimensional invariant subspaces. This would imply that k ≤ 2.
Now consider the characteristic polynomial, χ_1, of the induced mapping φ_1: E_1 → E_1. Computing the characteristic polynomial from the matrix (13.73) we find that

χ_1 = (t² − 2 t cos ω + 1)(1 − t).   (13.74)

At the same time we know that

χ_1 = (1 − t)³.   (13.75)
Comparing the polynomials (13.74) and (13.75) we find that ω = 0. Hence, equations (13.73) reduce to

φ e_1 = e_1 + ξ_1 e,  φ e_2 = e_2 + ξ_2 e,  φ e = e.

Now consider the vector

y = ξ_1 e_2 − ξ_2 e_1.

Then

(y, y) = ξ_1² + ξ_2² > 0

and

φ y = ξ_1 φ e_2 − ξ_2 φ e_1 = ξ_1 (e_2 + ξ_2 e) − ξ_2 (e_1 + ξ_1 e) = y.
In other words, y is a space-like eigenvector of φ. Denote by Y the 1-dimensional subspace generated by y. Then we have the orthogonal decomposition

E = Y ⊕ Y^⊥

into two invariant subspaces. The orthogonal complement Y^⊥ is a 3-dimensional pseudo-Euclidean space with index 2.
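An explicit instance of this case (our construction, modelled on (13.73) with ω = 0; the fourth basis vector f and the coefficient c are our additions to complete the basis): the matrix below is an isometry with (φ − ι)³ = 0 but (φ − ι)² ≠ 0, and y = ξ_1 e_2 − ξ_2 e_1 is a space-like eigenvector.

```python
from sympy import Matrix, Rational, eye, zeros

xi1, xi2 = 1, 2                           # sample coefficients
c = Rational(xi1**2 + xi2**2, 2)
# basis e1, e2, e, f with (e1,e1) = (e2,e2) = 1, (e,f) = 1, all else 0
G = Matrix([[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 1, 0]])
A = Matrix([[1, 0, 0, -xi1],              # columns: images of e1, e2, e, f
            [0, 1, 0, -xi2],
            [xi1, xi2, 1, -c],
            [0, 0, 0, 1]])
assert A.T * G * A == G                   # phi is an isometry
assert (A - eye(4))**3 == zeros(4)        # minimum polynomial (t - 1)**3
assert (A - eye(4))**2 != zeros(4)
y = Matrix([-xi2, xi1, 0, 0])             # y = xi1*e2 - xi2*e1
assert A * y == y and (y.T * G * y)[0] > 0   # space-like eigenvector
```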
The subspace Y^⊥ is irreducible with respect to φ. This follows from our hypothesis that the degree of the minimum polynomial μ is at least 3. On the other hand, μ cannot have degree 4, because then the space E would be irreducible.

Combining our results we see that the decomposition of a Minkowski space with respect to a proper orthochroneous Lorentz transformation has one of the following forms:

I. E is completely reducible. Then φ is the identity.
II. E is the direct sum of an invariant Euclidean plane and an invariant pseudo-Euclidean plane. These planes are irreducible except for the cases where the induced mappings are ± ι (Euclidean plane) or ι (pseudo-Euclidean plane).
III. E is the direct sum of a space-like 1-dimensional stable subspace (eigenvalue 1) and an irreducible subspace of dimension 3 and index 2.
Problems

1. Suppose E is an n-dimensional vector space over Γ. Assume that a symmetric bilinear function ⟨ , ⟩: E × E → Γ is defined such that ⟨x, x⟩ ≠ 0 whenever x ≠ 0.
a) Prove that ⟨ , ⟩ is a scalar product and thus E is self-dual.
b) If F ⊂ E is a subspace, show that E = F ⊕ F^⊥.
c) Suppose φ: E → E is a linear transformation such that φ φ* = φ* φ. Prove that φ is semisimple.

2. Let φ be a linear transformation of a unitary space. Prove that φ is normal if and only if, for some polynomial f,

φ* = f(φ).

3. Suppose φ is a linear transformation of a complex vector space such that φ^k = ι for some integer k. Show that E can be made into a unitary space such that φ becomes a unitary mapping.

4. Let E be a real linear space and let φ be a linear transformation of E. Prove that a positive definite inner product can be introduced in E such that φ becomes a normal mapping if and only if the following conditions are satisfied:
a) The space E can be decomposed into invariant subspaces of dimension 1 and 2.
b) If τ is the induced mapping in an irreducible subspace of dimension 2, then (tr τ)² − 4 det τ < 0.
5. Consider a 3-dimensional pseudo-Euclidean space E with index 2. Let l_i (i = 1, 2, 3) be three light-vectors such that

(l_i, l_j) = 1,  i ≠ j.

Define a linear transformation, φ, by the equations

φ l_1 = l_1,  φ l_2 = … ,  φ l_3 = … .

Prove that φ is a rotation and that E is irreducible with respect to φ.
6. Show that a real 3-dimensional vector space cannot be irreducible with respect to a semisimple linear transformation. Conclude that the pseudo-Euclidean rotation of problem 5 is not semisimple. Use this to show that a linear transformation in a self-dual space which satisfies φ φ* = φ* φ is not necessarily semisimple (cf. problem 1).

7. Let a Lorentz transformation φ be defined by the matrix

… .

Construct a decomposition of E into irreducible subspaces.

8. Consider the group G of Lorentz transformations. Let e be a time-like unit vector and F be the orthogonal complement of e. Consider the subgroup H ⊂ G consisting of all Lorentz transformations φ such that φ F = F.
a) Prove that H is a compact subgroup.
b) Prove that H is not properly contained in a compact subgroup of G.
Hint: Show first that if K is a compact subgroup of G and φ ∈ K, then every real eigenvalue of φ is ± 1.