This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
)
Chapter IV. Determinants
and =
...
Since
(i,j = ... n)
<(p* x*i, xJ> = <X*i, (p x1>
1
it follows that = A*(x*l ... x*n)A(cox1 ... (4.28)
But
A*(p*x*l,
and
q,*X*fl) = detco*zl*(x*l ...
= det
A((p X1 ... (p
A (x1 ...
and so we obtain from (4.28) that (det (p* — det co)zl* (x*' ... x*'l)A (x1 ...
=0
whence (4.27)
The above result implies that transposed n x n-matrices have the same determinant. In fact, let A be an n x n-matrix and let (p be a linear transformation of an n-dimensional vector space such that (with respect to a given basis) M(ço)=A. Then it follows that detA* = detM(cp)* = detM((p*) = = det (p* = det (p = det M ((p) = det A.
Problems 1. Show that the determinant-functions, A, 4* of § 1, problem 1 are dual. 2. Using the expansion formula (4.16) prove that,
detA* = detA. §
5. The
adjoint matrix
4.12. Definition. Let (p:E—*E(dim E=ii) be a linear transformation and let ad((p) be the adjoint transformation (cf. sec. 4.6). We shall express the matrix of ad ((p) in terms of the matrix of It is called the adjoint of the matrix of (p. Let a1
a,, be a basis of E such that 4(a1 ad(p)a1 =
I and write
adjoint matrix
§ 5. The
(v= I ...
Then, setting
a)
L15
and x=a1 in the identity (4.14) we obtain ...,
zi(a1.
= zl(a1
whence
=
(i,j= 1 ... ii).
a1,
This equation shows that = det
where
E—*E is the linear transformation given by =
(v
Since the matrix of
i)
(i,j = ...
a, =
1
a)
with respect to the basis a1
-'
a is
given by
+1
1
I
1 1
f—I
i
i±1
f-fl
•..
...
a
CL1-f1
+1•••
1
we have
= det is called the cofactor of Deilnition: The determinant of denoted by Thus we can write
and is
(i,j= la); i.e.. the adjoint matrix is the transpose of the matrix of cofactors. 4.13. Cramer's formula. In view of the result above we obtain from Proposition V. sec. 4.6, the relations =
. detA
(4.36)
and (4.37)
Chapter IV. Determinants
Setting k=i in (4.36) we obtain the expansion formula by cofactors.
(i= I
detA
and define a matrix
Now assume that detA
=
Then these equations yield
(4.38)
n). by setting
detA
is the inverse of the matrix (xi). showing that the matrix From (4.36) we obtain Cramers solution formula of a system of n linear equations in a unknowns whose determinant is non-zero. In fact, let
=
=
a
(4.39)
k=1
be such a system. Then we have, in view of (4.36).
whence
=
=
This formula expresses the (unique) solution of (4.39) in terms of the
cofactors of the matrix A and the determinant of A. Given an (a x ii)-matrix denote for each 4.14. The submatrices pair (Lj) (i= 1 .. a, j = I ... a) by SI the (a — 1) x (a — 1)—matrix obtained from A by deleting the i-th row and the j-th column. We shall show that .
cof
=(—
1)1
+
det
(i, I = 1 ... a).
(4.40)
In fact, by (i—I) interchanges of rows and (j— I) interchanges of columns we can transform 1
into the matrix (cf. sec. 4.12)
§ 5. The
Thus
adjoint matrix
= det
=(
117
—
On the other hand we have, in view of the expansion formula (4.38), with 1= 1
detBl =
and so (4.40) follows. 4.15. Expansion by cofactors. From the relations (4.35) and (4.30) we
obtain the expansion-formula of the determinant with respect to the 1th row detA = (1 = 1... ii). (4.41) By this formula the evaluation of the determinant of n rows is reduced to the evaluation of n determinants of n— 1 rows. In the same way the expansion-formula with respect to the Jth column is proved: + det A = ( det S/ (4.42) (j = 1 ... n).
4.16. Minors. Let A =
be a given ii x in-matrix. For every system
of indices and
the submatrix of A, consisting of the rows 1l'k and the is called a minor of order k of the matrix A. It will be shown that in a matrix of rank r there is always a minor of order r which is different from zero, whereas all minors of order be a minor of order k > r. Then the row-vectors k > r are zero. Let of A are linearly dependent. This implies that the rows of the a,1.. are also linearly dependent and thus the determinant must matrix be zero. It remains to be shown that there is a minor of order r which is different denote by
columns /1•Jk• The determinant of
from zero. Since A has rank r, there are r linearly independent rowvectors a, ..
a. The submatrix consisting of these row-vectors has again
the rank r. Therefore it must contain r linearly independent columnvectors h' ... 1* (cf. sec. 3.4). Consider the matrix Its columnvectors are linearly independent, whence
detA/'j"+O. If A is a square-matrix, the minors det
are called the principal minors of order k.
Chapter IV. Determinants
Problems 1. Prove the Laplace expansioii formula for the determinant of an 1) be a fixed integer. Then
(ii x n)-matrix: Let p (1 det A =
det
v1,) det A"
c (v1
where
= B'"
+
and
= 2.
(ii)
— 1)'
1
h, be two bases of a vector space E and a, and h1 1) be given. Show that the vectors can be reordered
Let a1
let p(l such that (i)
(v,—i) (
hr,, is a basis of E, is also a basis of E.
a1
h,,
3. Compute the inverse of the following matrices. 11
1
1
1
—1
1
1
—1 —1 1
—1 1
21 21 B= 2
4. Show that a)
xaa...a
a x a...a det
:
:
a...ax
= [x + (n —
1)
a] (x —
— 1
§ 5. The acijoint matrix
and that
r1
b)
...
1
1
22
det
... 2
1 1
—1
1
—1
Show
(n>2)
that
z11=x1; z12=x1x2+1. 6. Verify the following formula for a quasi-triangular
x11...x1 p
O....O
determinant:
xp+1p+1....xp+ln
o....o I
=
det
xnl 7.
xnn
det
xnp+1
Prove that the operation
properties: a) ad (A B)= ad B ad A. b) det ad A = (det c) ad ad A=(det
A =(det
xnn
_.)
1)2
(cf. sec.4.12) has the following
Chapter IV. Determinants
12()
§ 6.
The characteristic polynomial
4.17. Eigenvectors. Consider a linear transformation 'p of an n-dimensional linear space E. A vector a+O of E is called an eigenvector of 'p if 'p a =
2
a.
The scalar 2 is called the corresponding eigen value. A linear transformation (p need not have eigenvectors. As an example let E be a real linear space of two dimensions and define 'p by
(px1=x2 çox2=—x1 the vectors x1 and x2 form a basis of E. This mapping does not have eigenvectors. in fact, assume that where
a=
x1
+
is an eigenvector. Then pa=2a and hence These
equations yield
whence
+
=
and
4.18. The characteristic equation. Assume that a is an eigenvector of 'p and that 2 is the corresponding eigenvalue. Then
'pa=2a,
a+O.
This equation can be written as ('p — 2i)a = 0
(4.43)
showing that p—21 is not regular. This implies that det('p — 2i) = 0.
(4.44)
Hence, every eigenvalue of 'p satisfies the equation (4.44). Conversely, assume that 2 is a solution of the equation (4.44). Then 'p—2i is not regular. Consequently there is a vector a+O such that ('p — 2i)a = 0,
whence 'pa=2a. Thus, the eigenvalues of'p are the solutions of the equation (4.44). This equation is called the characteristic equation of the linear transformation 'p.
§ 6. The characteristic polynomial
121
4.19. The characteristic polynomial. To obtain a more explicit expression for the characteristic equation choose a determinant function A +0 in E. Then
=
A(px1 — Ax1 ... çox, —
=
1
det(ço
—
Ai)A(x1
... n).
(4.45)
Expanding the left hand-side we obtain a sum of
terms of the form
A(z1 ...
Denote by the sum of all terms in which p arguments are equal to and il—p arguments are equal to Collect in each term of Si,, the indices (v1 <
such that
and the indices
=—
=
—
Introducing the permutation a by
(i= in) we can write
= = =
A (z1 ...
... —
—
...
(—
xa(p+ 1)
Thus,
=
(4.46)
...
(—
where the sum is extended over all permutations a subject to the conditions
a(l)<...
=
(,—
(4.47)
where the sum on the right hand-side is taken over all permutations. Let be the function defined by =
...
...
(0
p n)
Chapter IV. Determinants
and -r be an arbitrary permutation of (1 .. ./1). Then
=
Xta(l)
A
=
A
=
Xrc(p + 1)
Xta(p), Xra(p + 1)
1)
A
Xra(fl))
1)
...
=
This equation shows that ments. This implies that =
is skew symmetric with respect to all argu(4.48)
—
(—
where ce,, is a scalar. Inserting (4.48) into (4.47) we obtain
= Hence, the left hand-side of (4.45) can be written as A((px1 — 2x1, ...
= A(x1 ...
—
(4.49)
Now equations (4.45) and (4.49) yield det (p —
2 i)
=
ce,,
—"
p=o
showing that the determinant of —2 i is a polynomial of degree n in 2. This polynomial is called the characteristic polynomial of the linear transformation p. The coefficients of the characteristic polynomial are determined by equation (4.48), and are called the characteristic coefficients. These relations yield for p = 0 and p = n
= (—
and ; = detq
respectively.
4.20. Existence of eigenvalues. Combining the results of sec. 4.18 and 4.19, we see that the eigenvalues of 'p are the roots of the characteristic polynomial
f(2) = This shows that a linear transformation of an n-dimensional linear space has at most n d!fferent eigen values.
§ 6. The characteristic polynomial
123
Assume that E is a complex linear space. Then, according to the fundamental theorem of algebra, the polynomialf has at least one zero. Consequently, every linear transformation of a complex linear space has at least one eigen value.
if E is a real linear space, this does not generally hold, as it has been shown in the beginning of this paragraph. Now assume that the dimension of E is odd. Then
lirnf(2)=+oo
and
and thus the
must have at least one zero. This proves that a linear transformation of an odd-dimensional real linear space has at least one eigen value. Observing that
f(O)=
=
detço
we see that a linear transformation of positive determinant has at least one positive eigenvalue and a linear transformation of negative determinant has at least one negative eigenvalue, provided that E has odd dimension. If the dimension of E is even we have the relations
lim f
=
urn f
and
= oo
A—' —03
and hence nothing can be said if det co>0. However,
if det p<0,
there
exists at least one positive and one negative eigenvalue. 4.21. The characteristic polynomial of the inverse mapping. It follows from (4.27) that the characteristic polynomial of the dual transformation coincides with the characteristic polynomial of 'p. Suppose now that where E1 and 122 are stable subspaces. Then the result of sec. 4.7 implies that the characteristic polynomial of 'p is the product of the characteristic polynomials of the induced transformations 'Pi : E1 and 'P2 Finally, let p: E—÷E be a regular linear transformation and consider the
inverse transformation p'. The characteristic polynomial of p' is defined by
F(2) = det('p1 Now, —
2i =
'p'
—
— 2'p) = — 2co_1
— )_1
Chapter IV. Determinants
124
whence
det((p'
—
2i) =
—
(—
).' i).
This equation shows that the characteristic polynomials of are related by F(2) = (— Expanding Fe.) as -
F (2) = we
and of
obtain the following relations between the coefficients of f and of F:
(v=0...n). 4.22. The characteristic polynomial of a matrix. Let e%(v = 1.. ii) be a basis of E and A = be the matrix of the linear transformation relative to this basis. Then M(p — 2t) =
—
2M(i) =
A
—
whence
det ('p — 2 i) = det M ('p — 2 i) = det (A — 2 J).
Thus, the characteristic polynomial of 'p can be written as
f
= det (A —
2
J).
(4.50)
The polynomial (4.50) is called the characteristic polynoiiiial of tile matrix A. The roots of the polynomialf are called the eigenvalues of the matrix A.
Problems 1. Compute the eigenvalues of the matrix
/1
0
\i
—1
3
(3 —2 —l 1
2. Show that the eigenvalues of the real matrix / ci
are real.
§ 6. The characteristic polynomial 3.
125
Prove that the characteristic polynomial of a projection
(see Chapter II, sec. 2.19) is given by —
= (— where n
dim E and p = dim E1.
4. Show that the coefficients of the characteristic polynomial of an involution satisfy the relations
(p=O...n). 5. Consider a direct decomposition Given linear transfor(i = 1,2) consider the linear transformation ço = mations E—÷ E. Prove that the characteristic polynomial of 'p is the product of the characteristic polynomials of and of 'pa.
6. Let ço:E—*E be a linear transformation and assume that E1 is E1 a stable subspace. Consider the induced transformations and Prove that :
X
Xi X
Xi and denote the characteristic polynomials of 'p, respectively. In particular show that where
and
=(
is the induced transformation of E/ker 'p and s denotes the dimension of ker 'p. 7. A linear transformation, p, of E is called nilpotent if ç0k=0 for some k. Prove that 'p is nilpotent if and only if the characteristic polynomial has the form where
x (2) = (—
Hint: Use problem 6. 8. Given two linear transformations 'p and cli of E show that det('p — Açli) is a polynomial in 2. 9. Let 'p and cl' be two linear transformations. Prove that 'pot/i and have the same characteristic polynomial. Hint: Consider first the case that is regular.
Chapter IV. Determinants
26
§
7. The trace
4.23. The trace of a linear transformation. In a similar way as the determinant, another scalar can be associated with a given linear transformation (p. Let 44z0 be a determinant function in E. Consider the sum
This sum obviously is again a determinant function and thus it can be written as (4.51)
is a scalar. This scalar which is uniquely determined by 'p is called the trace of 'p and will be denoted by tr 'p. It follows immediately that where
the trace depends linearly on p, tr(2ço +
= Atr'p +
Next we show that =
(4.52)
for any two linear transformations 'p and i/i. The trace of by the equation ...
A (x1 ...
Replacing the vectors
= tr
0p)A (x1 ...
l...n)
by
we
p
E
is defined
E.
obtain (4.53)
=
The left hand-side of this equation can be written as A
x1
...
o 'p
o
x1...
= det
A (x1 ... ('p
...
= and thus (4.53) implies that
=
(4.54)
If is regular, this equation may be divided by det cli yielding (4.52). If cl' is non-regular, consider the mapping t/i —). t where is different from all
§7. The trace
eigenvalues
of
Then
tr
1/I
127
—21 is regular, whence
-21)].
-1 i) q] = tr
In view of the linearity of the trace-operator this equation yields tr(110(p) — 2trp = — 2tr(p whence (4.52). Finally it will be shown that the coefficient of 2" in the characteristic
polynomial of p can be written as (4.55)
=(— Formula (4.48) yields for p = ...
= (—
(4.56)
A (x1 ...
the sum being taken over all permutations a subject to the restrictions This sum can be written as 1
(
A
x1 ...
...
A (x1 ...
=
x1,
1
We thus obtain from (4.56) =
(—
(4.57)
Comparing the relations (4.57) and (4.51) we find (4.55). = 1.. .n) be a basis of E. Then q 4.24. The trace of a matrix. Let determines an ii x n-matrix by the equations (4.58)
Inserting
l...n) in (4.51) we find = trcoA(e1
(4.59)
Equations (4.58) and (4.59) imply that
=
A(e1 ...
A(e1 ...
whence (4.60)
Observing that
=
çü e%>,
Chapter IV. Determinants
12g
where e*%' (v = 1 .. ii)
is the dual basis of
we can rewrite equation
(4.60) as tr çø =
(4.61)
<e*1, ço e>.
Formula (4.60) shows that the trace of a linear transformation is equal to the sum of all entries in the main-diagonal of the corresponding matrix. For any ii x n-matrix A this sum is called the trace of A and will be denoted by tr A, (4.62)
Now equation (4.60) can be written in the form
tr(p = trM(p).
4.25. The duality of L(E; F) and L(F; E). Now consider two linear spaces Eand Fand the spaces L(E; F) and L(F; E) of all linear mappings With the help of the trace a scalar product can be p:E—÷F and introduced in these spaces in the following way: q7EL(E; F),
=
E).
(4.63)
The function defined by (4.63) is obviously bilinear. Now assume that (4.64)
for a fixed mapping coeL(E; F) and all linear mappings /ieL(F; E). It has to be shown that this implies that ço=O. Assume that 'p+O. Then Extend the vector b1 =pa to there exists a vector aEE such that a basis (b1 .
.
.bm)
of F and define the linear mapping i/i: F—>E by
(p=2...in).
Then
=0
= b1,
= 2... in),
whence
=
=
= 1.
This is in contradiction with (4.64). Interchanging E and F we see that the relation
for a fixed mapping /ieL(F; E) and all mappings peL(E; F) implies that Hence, a scalar-product is defined in L(E; F) and L(F; E) by (4.63).
§7. The trace
129
Problems 1. Show that the characteristic polynomial of a linear transformation of a 2-dimensional linear space can be written as
f()) = )2 Verify that every such
satisfies (p2 —
— 2trço + detçD.
its characteristic equation,
(ptr(p + ldet(p = 0.
2. Given three linear transformations 'p, i/i,
of E show that
tr(yoç/ioço) + in general. 3. Show that the trace of a projection operator ir:E—+E1 II sec. 2.19) is equal to the dimension of Im it.
(see
Chapter
4. Consider two pairs of dual spaces E*, E and F*, F. Prove that the spaces L (E; F) and L (E*; F*) are dual with respect to the scalar-product defined by = F*). (pEL(E; F)
5. Letf be a linear function in the space L(E; E). Show thatf can be written as
f((p) = is a fixed linear transformation in E. Prove that is uniquely determined by f 6. Assume thatf is a linear function in the space L(E; E) such that where
Prove that
f(ço)=2tr(p where 2 is a scalar. 7. Let and be two linear transformations of E. Consider the sum
i*j
A (x1 ... (p Xi ... cli x1...
where A+O is a determinant function in E. This sum is again a determinant function and hence it can be written as i4j 9
A (x1
Linear Algebra
(p
...
cli
x1 ...
= B (p, cl') A (x1 ...
Chapter IV. Determinants
I 3()
By the above relation a bilinear function B is defined in the space L (E; E). Prove: a) B(p, p tr b) IB((p, (p)=(— where is the coefficient of 2P12 in the characteristic polynomial of çø. c)
=
—
8. Consider two ii x n-matrices A and B. Prove the relation tr(A B) = tr(BA)
a) by direct computation.
b) using the relation tr p=tr M(p). 9. If çø and are two linear transformations of a 2-dimensional linear space prove the relation +
=
+
+
—
10. Let A:L(E; E)—÷L(E; E) be a linear transformation such that
A
a 2-dimensional vector space and p be a linear transformation of E. Prove that ço satisfies the equation p2= —2i, 2>0 if and
only if
detço>0 and trp=0. 12. Let ço:E1—*E1 and
be linear transformations. Consider
=
Prove that tr ço=tr +tr P2• be a linear transformation and assume that there is a 13. Let into subspaces such that decomposition (1= l...r). Prove that tr q=0. be linear maps between finite dimen14. Let p: and /i: a sional vector spaces. Show that tr ((p /1) = )-th I 5. Show that the trace of ad ((p) (cf. sec. 4.6) is the negative (ii characteristic coefficient of 1
§ 8. Oriented vector spaces
§ 8.
131
Oriented vector spaces
In this paragraph E will be a real vector space of dimension n 1. 4.26. Orientation by a determinant function. Let A1 + 0 and A2 + 0 be two determinant functions in E. Then A2=2A1 where 2+0 is a real number. Hence we can introduce an equivalence relation in the set of all determinant functions A +0 as follows: if
2>0.
it is easy to verify that this is indeed an equivalence. Hence a decomposition of all determinant functions A +0 into two equivalence classes is induced. Each of these classes is called an orientation of E. If (A) is an orientation and A e (A) we shall say that A represents the given orientation. Since there are two equivalence classes of determinant functions the vec-
tor space E can be oriented in two different ways. A basis (v = n) of an oriented vector space is called positive if 1 . . .
where A is a representing determinant function. If (e1 .. is a positive basis and a is a permutation of the numbers (l...n) then the basis (eat!)... ea(fl)) is positive if and only if the permutation a is even. Suppose now that E* is a dual space of E and that an orientation is defined in E. Then the dual determinant function (cf. sec. 4.10) determines an orientation in E*. it is clear that this orientation depends only on the orientation of E. Hence, an orientation in E* is induced by the orientation of E. .
4.27. Orientation preserving linear mappings. Let E and F be two oriented vector spaces of the same dimension n and be a linear isomorphism. Given two representing determinant functions AE and AF defined by in E and F consider the function x1, ...,
= Clearly
is again a determinant function in E and hence we have that =
2+0 is a real number. The sign of 2 depends only on p and on the given orientations (and not on the choice of the representing determinant functions). The linear isomorphism çø is called orientation preserving if where
Chapter IV. Determinants
132
).>O. The above argument shows that, given a linear isomorphism and an orientation in E, then there exists precisely one orientation in F such that
preserves the orientation. This orientation will be called the orientation induced by ço. Now let be a linear automorphisni; i.e., F=E. Then we have AF=AF and hence it follows that = det(p.
This relation shows that a linear automorphism
preserves the
orientation if and only if det ço >0. As an example consider the mapping ço= —i. Since det(— i) = (— 1)'
it follows that ii is even.
preserves the orientation if and only if the dimension of
4.28. Factor spaces. Let E be an orientated vector space and F be an oriented subspace. Then an orientation is induced in the factor space E/F in the following way: Let A be a representing determinant function
in E and a1 .
.
be a positive basis of F. Then the function A (a1 ... ap,
depends only on the classes are equivalent mod F. Then
+
EE
1
in fact, assume for instance that
and
)p+1 = Xp+i +
and we obtain A (a1 ...
a
•..
A
=
A
(a1 ...
of (n—p) vectors in E1'F is well defined by
=
...
(4.65)
it is clear that 21 is linear with respect to every argument and skew symmetric. Hence A is a determinant function in ElF. It will now be shown that the orientation defined in E/F by A depends only on the orientations of E and F. Clearly, if A' is another representing determinant function in
§ 8. Oriented vector spaces
133
E we have that A'
2A, 2>0 and hence A' = 221. Now let another positive basis of F. Then we have that
>0
= whence A (a'1 ...
a,
.. .a,) be
=
1
A (a1 ...
1
It follows that the function A' obtained from the basis a'1 . . .a, is a positive multiple of the function A obtained from the basis a1... a direct decomposition
E=E1E13E2
(4.66)
and assume that orientations are defined in E1 and E2. Then an orientation is induced in E as follows: Let l...p) and l...q) be positive bases of E1 and E2 respectively. Then choose the orientation of E such that the basis a1 . b1 . . .bq is positive. To prove that this orientation depends only on the orientations of E1 and E2 let (i= I .. .p) and (1= 1... q) be two other positive bases of E1 and E2. Consider the linear transformations p:E1—+E1 and defined by .
(i=1...p) Then the transformation basis (a1..
b1
. .
(j=1...q).
and
carries the basis
bi...bq) into the
Since det ço >0 and det i/i >0 it follows from sec. 4.7
.bq).
that
>0
det(p
hi...hq) is again a positive basis of E. and hence Suppose now that in the direct decomposition (4.66) orientations are given in Eand E1. Then an orientation is induced in E2. In fact, consider the projection it:E—÷E2 defined by the decomposition (4.66). It induces an isomorphism
p:E/E1 4E2. In view of sec. 4.28 an orientation in E/E1 is determined by the orientations of E and E1. Hence an orientation is induced in E2 by p. To describe this orientation explicitly let A be a representing determinant function in E and e1
ep be a positive basis of E1. Then formula (4.65)
implies that the induced orientation in E2 is represented by the determinant function A2
...
=
A (e1
...
... yb),
(4.67)
Chapter IV. Determinants
134
Now let ..., be a positive basis of E7 with respect to the induced orientation. Then we have
and hence formula (4.67) implies that 4(e1 ... It follows that the basis e1 ...
of E is positive. In other words,
1
the orientation induced in E by E1 and 122 coincides with the original orientation. The space 122 in turn induces an orientation in E1. It will be shown that this orientation coincides with the original orientation of E1 if and only if p (n —p) is even. The induced orientation of E1 is represented by the determinant-function A1(x1 ...
=
...
...
(4.68)
where eA(2—p+ 1, ...n) is a positive basis of E2. Substituting (v
1.. .n) in equation (4.68) we find that
A1(e1 ...
=
But e)(i.=p+ I
...
...
= (— 1)
...
en).
(4.69)
n) is a positive basis of E2 whence
...
(4.70)
It follows from (4.69) and (4.70) that A1(e1 ...e")
c>
0
if p(n — p) is even if p(n—p)is odd.
(4.71)
is positive with respect to the original orienSince the basis of tation, relation (4.71) shows that the induced orientation coincides with the original orientation if and only if p(n—p) is even. 4.30. Example. Consider a 2-dimensional linear space 12. Given a basis (e1, e2) we choose the orientation of E in which the basis e1, e2 is positive. Then the determinant function A, defined by
zl(e1,e2) =
1
1,2) generrepresents this orientation. Now consider the subspace 1,2) with the orientation defined by Then E1 induces in ated by 122 the given orientation, but E2 induces in E1 the inverse orientation.
§ 8. Oriented vector spaces
135
In fact, defining the determinant-functions A1 and A1 (x)
A (e2,
x)
xe E1,
A2 (x) = A
and
A2 in E1 and in E2 by (e1,
x)
x e E2
we find that
4.31.
A2(e2) = A(e1, e2) =
1
Intersections. Let E1
and E2 be
and
A1 (e1) =
A(e2,
e1) = —
two subspaces of E
E = E1 + E2
1.
such
that (4.72)
and assume that orientations are given in E1, E2
and E. It
will be shown
that then an orientation is induced in the intersection E12=E1 n
E2.
Setting
dimE1 = p,
dimE2 = q,
dimE12 = r
we obtain from (4.72) and (1.32) that
r= p + q — n. Now consider the isomorphisms ço:E/E1 and Vi:E/E2
E1/E12.
Since orientations are induced in E/E1 and E/E2 these isomorphisms determine orientations in E2/E12 and in E1/E12 respectively. Now choose two positive bases +1• a,, and hr . bq in E1/E12 and E2/E12 respecand bJEE2 be vectors such that tively and let .
and
where it1 and
it2
denote the canonical projections and
ir2:E2—+E2/E12.
Now define the function A12 by A12 (z1 ... Zr) = A (Z1 ... Zr, ar+ 1
a,,, br+ 1 .. bq).
(4.73)
In a similar way as in sec. 4.30 it is shown that the orientation defined in depends only on the orientations of E1, and E (and not on E12 by the choice of the vectors and ba). Hence an orientation is induced in E12.
Chapter IV. Determinants
Interchanging E1 and E2 in (4.73) we obtain
(:i
•..
(:i
=A
r,
br+ 1
bq,
1
ar).
(4.74)
Hence it follows that =
(q-r)4
1)(P_r)
(n-q)4
= (_
(475)
Now consider the special case of a direct decomposition. Then p+q=n and E12=(0). The function /112 reduces to the scalar (4.76)
Moreover the number—2 depends
It follows from (4.76) that
only on the orientations of E1, E2 and E. It is called the intersection number
of the oriented subspaces E1 and E2. From (4.75) we obtain the relation =(
= 1.. ii) be two bases of E. Basis deformation. Let and is called deformable into the basis if there exist n Then the basis continuous mappings 4.32.
t t1
t
satisfying the conditions 1. (t0) = and (t1) = (t) (v = 2. The vectors 1
The
.
.11) are
linearly independent for every fixed t.
deformability of two bases is obviously an equivalence relation.
Hence, the set of all bases of E is decomposed into classes of deformable bases. We shall now prove that there are precisely two such classes. This is a consequence of the following Theorem: Two bases and (v = 1.. ./1) are deformable into each other if and only if the linear transformation p: E defined by 'p = has positive determinant.
Proof: Let 4+0 be an arbitrary determinant function. Then formula lii) are 4.17 together with the observation that the components continuous functions of shows that the mapping Ex x E—*R defined by A is continuous. into the is a deformation of the basis Now assume that basis Consider the real valued function (t) ...
§ 8. Oriented vector spaces
137
The continuity of the function 4 and the mappings
the function Ji
is
implies that
continuous. Furthermore,
ck(t)+0 the vectors (t) (v = I .n) are linearly independent. Thus the function cP assumes the same sign at t=t0 and at t=t1. But because
. .
= 4(pa1 ...
= A(b1
= det(pA(a1 ...
=
whence
detço >0 and so the first part of the theorem is proved. 4.33. Conversely, assume that the linear transformation a deformation assume first that the vector n-tuple (a1 ...
...
(4.77)
1). Then consider the de-
is linearly independent for every 1(1 composition = By the above assumption the vectors (a1..
are linearly independ-
ent, whence /3" + 0. Define the number c,, by
1+1 if if
L—1
It will be shown that the n mappings
Ix,(t)=a.(v=1...n—1) = define
(1 —
+
(0
a deformation
(a1... a determinant function in E. Then
4 (x1 (t) ... Since
(t)) = ((1
—
t) +
t) A (a1 ... an).
it follows that
(0t1)
Chapter IV. Determinants
138
whence
(0 t 1).
+0
(t) ...
zi
This implies the linear independence of the vectors
(t)(v =
1 . .
ii) for
every t.
In the same way a deformation
(a1...
(a1 ...
can be constructed where tain a deformation (a1 ...
—÷
1= ± I. Continuing this way we finally ob-
b1 ... c,1
= ±
1
(v
1 ... n).
To construct a deformation (b1 ...
(e1 b1 ...
consider the linear transformations
(v=1...n) and
(v=1...n). The product of these linear transformations is given by
(v=1...n). By hypothesis,
det(i/iop)>O
(4.78)
detp>O.
(4.79)
and by the result of sec. 4.32 Relations (4.78), and (4.79) imply that > 0.
det But whence
+1. Thus, the number of equal to — is even. Rearranging the vectors (v = 1.. n) we can achieve that 1
.
5—i
vl+i
(v=1...2p)
(v=2p+1...n).
§ 8. Oriented vector spaces
139
Then a deformation b1
...
e,,
-÷ (b1 ...
is defined by the mappings
+
=— =—
(v — —
1
—
(v=2p+1...n) 4.34. The case remains to be considered that not all the vector n-tuples (4.77) are linearly independent. Let A + 0 be a determinant function. The linear independence of the vectors (v = 1 . n) implies that . .
Since zl is a continuous function, there exists a spherical neighbourhood that Ua
(v= 1...n).
if XvEUa
Choose a vector a'1 e Uaj which is not contained in the (n — 1)-dimensional subspace generated by the vectors (b2. . Then the vectors (a'1, b2.. are linearly independent. Next, choose a vector a'2 E Ua2 which is not con-
tained in the (n— 1)-dimensional subspace generated by the vectors (a'1, b2. Then the vectors (a'1, a'2, b3. are linearly independent. Going on this way we finally obtain a system of n vectors (v = 1.. n) .
.
.
such that every n-tuple ...
(a'1
is
linearly independent. Since
Hence the vectors
(1= 1 ... ;i)
it follows that
(v = 1. . n) form a basis of E. The n mappings .
define a deformation (4.80)
In fact,
(t)(0 t
1) is contained in Uav whence A
(t) ...
(t)) + 0
(0
t
This implies the linear independence of the vectors
1).
(t)(v = 1.. .n).
140
Chapter IV. Determinants
By the result of sec. 4.33 there exists a deformation ...
... ba).
(4.81)
The two deformations (4.80) and (4.81) yield a deformation (a1 ...
... ba).
This completes the proof of the theorem in sec. 4.32. 4.35. Basis deformation in an oriented linear space. If an orientation is given in the linear space E, the theorem of sec. 4.32 can be formulated as and (v = I n) can be deformed into each other follows: Two bases if and only if they are both positive or both negative with respect to the given orientation. In fact, the linear transformation . .
p:
.
(v = I ... a)
—*
has positive determinant if and only if the bases and b%, (v = I .. are both positive or both negative. Thus the two classes of deformable bases consist of all positive bases and all negative bases. 4.36. Complex vector spaces. The existence of two orientations in a
real linear space is based upon the fact that every real number 2t0 is either positive or negative. Therefore it is not possible to distinguish two orientations of a complex linear space. In this context the question arises whether any two bases of a complex linear space can be deformed into each other. It will be shown that this is indeed always possible. Consider two bases and (v = a) of the complex linear space E. As in sec. 4.33 we can assume that the vector n-tuples 1
(a1...
. .
...
are linearly independent for every 1(1 1). It follows from the above assumption that the coefficient in the decomposition = is
different from zero. The complex number
Now choose a positive continuous function
r(0) =
1,
r(I) = r
can
be written as
1) such that (4.82)
§ 8. Oriented vector spaces
and a continuous function
1) such that
(t)(O t
(4.83)
Define mappings
(t),
(0
t 1)
(v =
=
by 1 ...
n — 1) )
and (t)
=t
Then the vectors (t) (v = fact, assume a relation
1 ..
(4.84)
1.
t
flV
+ r(t) .
n)
are linearly independent for every t. In
0. Then
+
r (t)
+
=
0
whence
(v==1...n—1) and = 0.
1, the last equation implies that )J' = 0. r (t) + 0 for 0 t (n— 1) equations reduce to in— 1). It follows from (4.84), (4.82) and (4.83) that
Since
Hence the
first
=
and
=
Thus the mappings (4.84) define a deformation (a1 ...
—÷
Continuing this way we obtain after 1...n). into the basis
(a1 n
...
ba).
steps a deformation of the basis
Problems 1. Let E be an oriented n-dimensional linear space and
(v = 1.. .11) be
the subspace generated by the vectors a positive basis; denote by is positive with respect Prove that the basis (x1 to
the orientation induced in
by the vector
l)i —
1
xL.
Chapter IV. Determinants
142
2. Let E be an oriented vector space of dimension 2 and let a1, a2 be two linearly independent vectors. Consider the 1-dimensional subspaces E1 and E2 generated by a1 and a2 and define orientations in E. such that the bases a are positive (i= 1,2). Show that the intersection number of E1 and E2 is + 1 if and only if the basis a1, a2 of E is positive.
3. Let E be a vector space of dimension 4 and assume that (v
= ... 1
4)
is a basis of E. Consider the following quadruples of vectors:
I. e1+e2,e1+e2+e3,e1+e2+e3+e4,e1—e2+e4 II. e1+2e3, e2+e4, e2—e1+e4, e2
III.
e2+e4, e3+e2, e2—e1
1V.
e1+e2—e3, e2—e4, e3+e2, e2—e1
V. e1—3e3, e2+e4, e2—e1—e4, e2. a) Verify that each quadruple is a basis of E and decide for each pair of bases if they determine the same orientation of E. b) If for any pair of bases, the two bases determine the same orientation, construct an explicit deformation. c) Consider E as a subspace of a 5-dimensional vector space E and assume that (v = 1, . .5) is a basis of E. Extend each of the bases above to a basis of which determines the same orientation as the basis (v= 1, ..., 5). Construct the corresponding deformations explicitly. 4. Let E be an oriented vector space and let E1, E2 be two oriented subspaces such that E== E1 + E2. Consider the intersection E1 fl E2 together with the induced orientation. Given a positive basis (c1, ..., Cr) of extend it to a positive basis (c1, ..., Cr, ar+ E1 fl a positive basis (C1, ..., Cr, br+ i, ..., bq) of E2. Prove that then (C1, ..., Cr, ar+ 1' ..., ap, br+ i, ..., bq) is a positive basis of E. .
5. Linear isotopices. Let E be an n-dimensional real vector space. will be called linearly Two linear automorphisms p: and i/i: (I the closed unit isotopic if there is a continuous map 1: I x and such that for each interval) such that x)=p(x), x) is a linear automorphism. tEl the map given by (i) Show that two automorphisms of E are linearly isotopic if and only if their determinants have the same sign. (ii) Let j: E—+E be a linear map such that = — Show that j is linearly isotopic to the identity map and conclude that detj= 1. 6. Complex structures. A complex structure in real vector space E is a linear transformation j: E—+E satisfying j2 = — 1.
§8. Oriented vector spaces
143
(i) Show that a complex structure exists in an n-dimensional vector space if and only if ii is even. (ii) Let j be a complex structure in E where dimE=2n. Let 4 be a determinant function. Show that for (v= 1 ... 11) either 4(x1
or 4(x1 ,...,x,1,jx1 The natural orientation of (E, J) is
defined to be the orientation re-
presented by a non-zero determinant function satisfying the first of these.
Conclude that the underlying real vector space of a complex space carries a natural orientation. (iii) Let be a vector space with complex structure and consider the complex structure (E, —j). Show that the natural orientations of and (E, —j) coincide if and only if n is even (dimE=2n).
Chapter V
Algebras In paragraphs one and two all vector spaces are defined over a fixed, but arbitrarily chosen field F of characteristic 0.
§ 1. Basic properties 5.1. Definition: An algebra, A, is a vector space together with a mapping A x such that the conditions (M1) and (M2) below both hold. The image of two vectors xeA, yeA, under this mapping is called the product of x and y and will be denoted by xy. The mapping A x A-+A is required to satisfy: (M1) (M2)
(2x1 + jix2)y = 2(x1 y) + ,u(x2y) + 11y2)
2(xy1) + ji(xy2).
As an immediate consequence of the definition we have that
0x = x0 0. Suppose B is a second algebra. Then a linear mapping p : a homomorphism (of algebras) if çø preserves products; i.e.,
p(xy)=pxpy.
is called (5.1)
A homomorphism that is injective (resp. surjective, bijective) is called a monomorphism (resp. epimorphism, isomorphism). If B=A, p is called an endomorphism. Note: To distinguish between mappings of vector spaces and mappings of algebras, we reserve the word linear mapping for a mapping between vector spaces satisfying (1.8), (1.9) and homomorphism for a linear mapping between algebras which satisfies (5.1). Let A be a given algebra and let U, V be two subsets of A. We denote by U V, the set
§ 1. Basic properties
145
Every vector aeA induces a linear mapping p(a):A—*A defined by
1u(a)x=ax
(5.2)
4u (a) is called the multiplication operator determined by a.
An algebra A is called associative if
x(yz)=(xy)z and commutative if
xy=yx
x,y,zEA
x,yeA.
From every algebra A we can obtain a second algebra (x y)OPP
by defining
=yx
AON' is called the algebra opposite to A. It is clear that if A is associative
then so is A°". If A is commutative we have A°"=A. If A is an associative algebra, a subset Sc: A is called a system of generators of A if each vector xeA is a linear combination of products of elements in 5, 1... x= e S, 1... Vp ... (v)
A unit element (or identity) in an algebra is an element e such that for every x
xe=ex=x.
(5.3)
if A has a unit element, then it is unique. In fact, if e and e' are unit elements, we obtain from (5.3) e = e e' = e'. Let A be an algebra with unit element eA and ço be an epimorphism of A onto a second algebra B. Then eB = eA is the unit element of B. In fact,
if yEB is arbitrary, there exists an element xeA such that y= çox. This gives
yeB =
=
=
= y.
In the same way it is shown that An algebra with unit element is called a division algebra, if to every element a 0 there is an element a' such that a a — = a — = c. 1
5.2. Examples: 1. Consider the space L(E; E) of all linear transformations of a vector space E. Define the product of two transformations by 10
= i/i o Greub Linear Algebra
Chapter V. Algebras
146
The relations (2.17) imply that the mapping (p, i)—*çlip satisfies (M1) and (M2) and hence L(E; E) is made into an algebra. L(E; E) together with this multiplication is called the algebra of linear transformations of E and is denoted by A (E; E). The identity transformation i acts as unit element in A (E; E). It follows from (2.14) that the algebra A (E; E) is associative.
However, it is not commutative if dim E 2. In fact, write E= (x1) and (x2) are the one-dimensional subspaces generated by two linearly independent vectors x1 and x2, and F is a complementary subspace. Define linear transformations 'p and by
'px1=O,
(px2=x1,
py=O,yeF
ç/ix2=O;
çliy=O,yeF.
and Then
= 'p0 = 0 while (pX2
=
çlix1
= x2
whence
Suppose now that A is an associative algebra and consider the linear mapping defined by
p(a)x=ax.
(5.4)
Then we have that
p(a b)x = a bx = p(a)p(b)x whence
p(ab) = p(a)p(b). This
relation shows that p is a homomorphism of A
into A (A; A).
M X be the vector space of (n x mi)-matrices for a given integer ii and define the product of two (mi x n)-matrices by formula Example 2: Let
(3.20). Then it follows from the results of sec. 3.10 that the space is made into an associative algebra under this multiplication with the unit matrix J as unit element. Now consider a vector space E of dimension n with a distinguished basis (v = 1. n). Then every linear transforE determines a matrix M ('p). The correspondence 'p —+ M(co) mation 'p . .
§ 1. Basic properties
determines a linear isomorphism of A (E; E) onto
147 X
In view of sec.
3.10 we have that
M is an isomorphism of the algebra A (E; E) onto the opposite algebra (M" X n)0PP. Example 3: Suppose F1 cF is a subfield. Then F is an algebra over F1. We show first that F is a vector space over F1. In fact, consider the mapping F1 x F—+F defined by
(2,x)-÷2x, It satisfies the relations
2eF1,xEF.
+ x =2x+ x 2(x + y) = Ax + 2y (24u)x = 2(1ux)
lx = x where 2,
x, yeF. Thus F is a vector space over F by (x, y) x y (field multiplication).
Then M1 and M2 follow from the distribution laws for field multiplication.
Hence F is an associative commutative algebra over F1 with 1 as unit element.
Example 4: Let cr be the vector space of functions of a real variable t which have derivatives up to order r. Defining the product by
(1 g) (t) = f (t) g (t) we obtain an associative and commutative algebra in which the function 1(t) = 1 acts as unit element. 5.3. Subalgebras and ideals. A subalgebra, A1, of an algebra A is a linear subspace which is closed under the multiplication in A; that is, if x and y are arbitrary elements of A1, then xyeA1. Thus A inherits the structure of an algebra from A. It is clear that a subalgebra of an associative (commutative) algebra is itself associative (commutative). Let S be a subset of A, and suppose that A is associative. Then the subspace B A generated (linearly) by elements of the form
is
SEES Si...Sr, clearly a subalgebra of A, called the subalgebra generated by S. It is
easily verified that
where the 10*
B= A containing S.
Chapter V. Algebras
A right (left) ideal in an algebra A is a subspace I such that for every xEI, and every yEA, xyEI(yxEI). A subspace that is both a right and left ideal is called a tivt'o-sided ideal, or simply an ideal in A. Clearly, every
right (left) ideal is a subalgebra. As an example of an ideal, consider the subspace A2 (linearly generated by the products xy). A2 is clearly an ideal and is called the derived algebra. The ideal I generated by a set S is the intersection of all ideals containing S. If A is associative, I is the subspace of A generated (linearly) by elements of the form
s,as,sa
sES,aEA.
In particular every single clement a generates an ideal principal ideal generated by a.
4
is
called the
Example 5: Suppose A is an algebra with unit element e, and let (p:F-÷A be the linear mapping defined by ço2 = Ae.
Considering F as an algebra over itself we have that =
e
=
e)(R e) =
then ).e=0 whence p is a homomorphism. Moreover, if )L=0. It follows that p is a monomorphism. Consequently we may identify F with its image under p. Then F becomes a subalgebra of A and scalar multiplication coincides with algebra multiplication. In fact, if). is any scalar, then = (2e).a = = Example
6: Given an element a of an associative algebra consider
the set, Na, of all elements xEA such that ax =0. If xENa then we have for every VEA
and so xyENa. This shows that Na is a right ideal in A. It is called the right annihilator of a. Similarly the left annihilator of a is defined. 5.4. Factor algebras. Let A be an algebra and B be an arbitrary subspace of A. Consider the canonical projection
It will be shown that A/B admits a multiplication such that ir is a homomorphism if and only if B is an ideal in A. Assume first that there exists such a multiplication in A/B. Then for
§ 1. Basic properties
149
every xeA, yeB, we have
ir(xy) =
irxiry =
it follows that and so B must be an ideal. Conversely, assume B is an ideal. Then define the multiplication in
A/B by
= ir(xy)
(5.5)
where x and y are any representatives of
and 7 respectively.
It has to be shown that the above product does not depend on the choice of x and y. Let x' and y' be two other elements such that it x' = and iry' =7. Then
x'—xeB and y'—yeB. Hence we can write
x'==x+b, It
beB and y'—y+c,
ceB.
follows that
x'y' — xy = by + xc + bceB and so
rr(x'y') = it(xy). The multiplication in A/B clearly satisfies (M1) and (M2) as follows from the linearity of it. Finally, rewriting (5.5) in the form
it (x y) = it x iry
we see that it is a homomorphism and that the multiplication in A/B is uniquely determined by the requirement that it be a homomorphism. The vector space A/B together with the multiplication (5.5) is called the
factor algebra of A with respect to the ideal B. It is clear that if A is associative (commutative) then so is A/B. If A has a unit element e then ë= ire is the unit element of the algebra A/B. 5.5. Homomorphisms. Suppose A and B are algebras and is
a homomorphism. Then the kernel of p is an ideal in A. In fact, if xeker ço and yEA
are
arbitrary we have that
p(xy) =
=
=
0
whence xyEker qi. In the same way it follows that yxEker tp. Next consider the subspace Im pcB. Since for every two elements x, yeA = qi(xy)elmq
it follows that Im 'p is a subalgebra of B.
Chapter V. Algebras
Now let
be the induced injective linear mapping. Then we have the commutative d iagrani
A/kerço
and since
is
a homomorphism, it follows that = = = =
'p
(x y) (x). 'p (y)
is a homomorphism and hence a monomorphism. In particular, the induced mapping This relation shows that
4 Imço is an isomorphism. Finally, assume that C is a third algebra, and let be a homomorphism. Then the composition ti0 p: A —+ C is again a homomorphism. In fact, we have 'ps. = = =
Let 'p:A—*B be any homomorphism of associative algebras and S be by a system of generators for A. Then 'p determines a set map (po:
'p0x=çox,
xeS. In fact, if
The homomorphism 'p is completely determined by
x=
...
E S,
(v)
is
an arbitrary element we have that
... 'poxv
= (v)
I
Vp
e I'
§ 1. Basic properties
151
be an arbitrary set map. Then Po can be Proposition I: Let can be extended to a homomorphism ço:A—+B if and only if ...
whenever
0
po
...
= 0. (5.6)
(v)
(v)
Proof: It is clear that the above condition is necessary. Conversely, assume that (5.6) is satisfied. Then define a mapping ço:A—÷B by
= (v)
...
(5.7)
(v)
To show that çü is, in fact, well defined we notice that if = ... •.. Ypq
(v)
(p)
then —
Ypq
= 0.
(ji)
(v)
In view of (5.6) ...
...
—
(v)
0
(p)
and so
...
It follows from (5.7) that
px=ip0x xeS p(2x + jiy) = 2px + and
'p (x y) = 'p
'p y
and hence 'p is a homomorphism. Now suppose that
be a linear map such
is a basis for A and let
= for each x, j3. Then 'p is a homomorphism, as follows from the relation 'p (x y) =
'p
ep)}
e2) (> p
2,fl
=
'p (ep)) =
'p p
'p
(x) 'p (y).
Chapter V. Algebras
152
5.6. Derivations. A linear mapping 0: A called a derivation if
A of an algebra into itself is
X,yEA.
(5.8)
and As an example let A be the algebra of Cr-functions f: define the mapping 0 by 0:f—+f' wheref' denotes the derivative off. Then the elementary rules of calculus imply that 0 is a derivation. If A has a unit element e it follows from (5.8) that Oe =
Oe
+ 0e
whence Oe=—O. A derivation is completely determined by its action on a
system of generators of A, as follows from an argument similar to that used to prove the same result for homomorphisms. Moreover, if O:A—*A
is a linear map such that
=
+
is a basis for A, then 0 is a derivation in A. For every derivation 0 we have the Leibniz formula
where
on(xy)
(5.9)
= In fact, for n = 1, (5.9) coincides with (5.8). Suppose now by induction that (5.9) holds for some n. Then +
1(x y) =
o on
(x y) or+
-r y +
= x.On+ly +
+
or x
-r+
=
= x.On+ly + n+ 1
and so the induction is closed.
Orx.On+l_ry +
(n+1) Orx.On_r+ly +
§ 1. Basic properties
153
The image of a derivation 0 in A is of course a subspace of A, but it is in general not a subalgebra. Similarly, the kernel is a subalgebra, but it is not, in general, an ideal. To see that ker 0 is a subalgebra, we notice that for any two elements x, yeker 0
0(xy) =
0
whence xy E ker 0.
it follows immediately from (5.8) that a linear combination of derivations 01:A—*A is again a derivation in A. But the product of two derivations 01, 02 satisfies
(0102)(xy) = 01(02x.y + x.02y) (5.10)
and so is, in general, not a derivation. However, the commutator [01,02] = 0102—0201 is again a derivation, as follows at once from (5.10). 5.7. ip-derivations. Let A and B be algebras and 'p : A B be a fixed homomorphism. Then a linear mapping 0: A B is called a 'p-derivation if
0(xy)=
x,yeA.
+
denotes In particular, all derivations in A are i-derivations where i : A the identity map. As an example of a 'p-derivation, let A be the algebra of Cm-functions f: R—* and let B= 11. Define the homomorphism cp to be the evaluation homomorphism : f -÷ (0) and the mapping 0 by
f
0:f Then it follows that
0(fg) = (1 g)'(O) = = 0
f' (0) g (0) + f (0) g' (0)
a 'p-derivation.
More generally, if °A is any derivation in A, then derivation. In fact, 0(xy) = çoOA(xy) + xOAv) = = 'p
+
'p x
0 y.
Similarly, if °B is a derivation in B, then °B° 'p
is
a 'p-derivation.
is
a ço-
Chapter V. Algebras
154
5.8. Antiderivations.
Recall that an involution in a linear space is a
linear transformation whose square is the identity. Similarly we define an involution w in an algebra A to be an endomorphisin of A whose square is the identity map. Clearly the identity map of A is an involution. If A has a unit element e it follows from sec. 5.1 that we=e. Now let w be a fixed involution in A. A linear transformation Q:A—*A will be called an antiderivation with respect to w if it satisfies the relation
Q(xj) =
+
wxQy.
(5.11)
In particular, a derivation is an antiderivation with respect to the involution i. As in the case of a derivation it is easy to show that an antiderivation is determined by its action on a system of generators for A and that ker Q is a subalgebra of A. Moreover, if A has a unit element e, then Qe=O. It also follows easily that any linear combination of antiderivations with respect to a fixed involution w is again an antiderivation with respect to w. Suppose next that Q1 and Q2 are antiderivations in A with respect to 0w1. Then w1 oW2 the involutions w1 and w2 and assume that w1 0w2 is again an involution. The relations
(Q122)(xy)= Q1(Q2xy + w2xQ2y)=
+
and
(Q2Q1)(xy)=
+ w1xQ1y)= Q2Q1xy + + w2Q1xQ2y + Q2w1xQ1y + w2w1xQ2Q1y
yield
(Q1Q2 ± Q2 ± Q2w1)xQ1 y + +(Q1w2 ± w2Q1)xQ2y + W1W2X.(Q1Q2 ± Q2Q1)y. Q2
(5.12)
Now consider the following special cases: 1. w1 Q2 = Q2 w1 and w2 Q1 = Q1 (this is trivially true if w1 = ± i and Q2 —Q2 is an antiderivation = ± i). Then the relation shows that with respect to the involution W1 w2. In particular, if Q is an antiderivation
with respect to w and 0 is a derivation such that 0)0=0W, then OQ—Q0 is again an antiderivation with respect to oi. 2. w1Q2=—Q2W1andw2Q1=—Q1W2.ThenQ1Q2+Q2Q1isanantiderivation with respect to the involution W1 0)2.
§ 1. Basic properties
155
Now let Q1 and Q2 be two antiderivations with respect to the same involution w such that
(i=1,2). Then it follows that Q1 Q2 + Q2 Q1 is a derivation. In particular, if Q is any antiderivation such that wQ = - Qw then Q2 is a derivation. Finally, let B be a second algebra, and let cp: A —÷ B be a homomorphism. Assume that WA is an involution in A. Then a p-antiderivation with respect to WA is a linear mapping Q:A-+B satisfying (5.13) If WB is
an involution in B such that WA
WB 'P
then equation (5.13) can be rewritten in the form
Q(xy) =
+
Problems 1. Let A be an arbitrary algebra and consider the set C(A) of elements aEA that commute with every element in A. Show that C(A) is a subspace
of A. If A is associative, prove that C(A) is a subalgebra of A. C(A) is called the centre of A. 2. If A is any algebra and 0 is a derivation in A, prove that C(A) and the derived algebra are stable under 0. 3. Construct an explicit example to prove that the sum of two endomorphisms is in general not an endomorphism. 4. Suppose 'p A —÷ B is a homomorphism of algebras and let ). + 0, 1 be
an arbitrarily chosen scalar. Prove that 2p is a homomorphism if and only if the derived algebra is contained in ker 'p. 5. Let C1 and C denote respectively the algebras of continuously differentiable and continuous functionsf: (cf. Example 4). Consider the linear mapping d: C' C
given by df=f' wheref' is the derivative off.
Chapter V. Algebras
a) Prove that this is an i-derivation where i: C1—*C denotes the canonical injection. b) Show that d is surjective and construct a right inverse for d.
c) Prove that d cannot be extended to a derivation in the algebra C. 6. Suppose A is an associative commutative algebra and 0 is a derivation in A. Prove that Ox" = px"' 0(x). 7. Suppose that 0 is a derivation in an associative commutative algebra A with identity e and assume that xeA is invertible; i.e.; there exists an element x ' such that
Prove that x" (p 1) is invertible and that
(x")' —(x')" Denoting the inverse of x" by
show that for every derivation 0
0(x") = — px"1 0(x). 8. Let L be an algebra in which the product of two elements x, y is denoted by [x, y]. Assume that [x, y] + [y, x] = z], x] + y], z] +
0
(skew symmetry)
x], y] = 0 (Jacobi identity)
Then L is called a Lie algebra. Let Ad (a) be the multiplication operator in the Lie algebra L. Prove that Ad (a) is a derivation.
9. Let A be an associative algebra with product xy. Show that the multiplication (x, y)—÷[x, y] where
[x,y]
xy — yx
makes A into a Lie algebra. 10. Let A be any algebra and consider the space D (A) of derivations in A. Define a multiplication in D(A) by setting [01,02] = a)
0102 —0201.
Prove that D (A) is a Lie algebra.
b) Assume that A is a Lie algebra itself and consider the mapping given by Show that 'p is a homomorphism of Lie algebras. Determine the kernel of p.
§ 1. Basic properties
11. If L is a Lie algebra and I is an ideal in A, prove that the algebra L/I is again a Lie algebra. 12. Let E be a finite dimensional vector space. Show that the mapping 1: A (E; E) —÷ A (E*; E*)OPP
is an isomorphism of algebras. given by 13. Let A be any algebra with identity and consider the multiplication
operator
ji: A
A (A; A).
Show that p is a monomorphism. If A = L (E; E) show that by a suitable restriction of p a monomorphism
(i= 1
E be an n-dimensional vector space. Show that each basis n) of L(E; E) such that n) of E determines a basis 1
(i) QijOlk = (ii)
i.
Conversely, given n2 linear transformations of E satisfying i) and ii), prove that they form a basis of L(E; E) and are induced by a basis of E. Show that two bases e1 and of E determine the same basis of L (E; E) if and only if e = A 2 eF. 15. Define an equivalence relation in the set of all linearly independent E), in the following way: n2-tuples (p1...pn2), ...
('P1 ...
if and only if there exists an element =
such that (v
= ... 1
,12)
Prove that 2EV
only if 2=1. 16. Prove that the bases of L(E; E) defined in problem 14 form an equivalence class under the equivalence relation of problem 15. Use this to show that every non-zero endomorph ism cP: A (E; E)—*A (E; E) is an inner automorphism; i.e., there exists a fixed linear automorphism of E such that 1
17. Let A be an associative algebra, and let L denote the corresponding
Chapter V. Algebras
158
is Lie algebra (cf. problem 9). Show that a linear mapping derivation in A only if it is a derivation in L. 18. Let E be a finite dimensional vector space and consider the mapping E)—*A(E; E) defined by
= — Prove that is a derivation. Conversely, prove that every derivation in A (E; E) is of this form. Hint. Use problem 14. § 2.
Ideals
5.9. The lattice of ideals. Let A be an algebra, and consider the set 1 of ideals in A. We order this set by inclusion; i.e., if and '2 are ideals if and only if in A, then we write '1 The relation is clearly a partial order in 1 (cf. sec. 0.6). Now let and '2 be ideals in A. Then it is easily cheeked that + and '1 n '2 are again ideals, and are in
fact the least upper bound and the greatest lower bound of '1 and '2 Hence, the relation induces in 5 the structure of a lattice. 5.10. Nilpotent ideals. Let A be an associative algebra. Then an element aEA will be called ni/potent if for some k, (5.14)
The least k for which (5.14) holds is called the degree of ni/potency of a. An ideal I will be called nilpotent if for some k,
,k0
(5.15)
The least k for which (5.15) holds is called the degree of ni/potency of I and will be denoted by deg I. 5.11.* Radicals. Let A be an associative commutative algebra. Then the nilpotent elements of A form an ideal. In fact, if x and y are nilpotent of degree p and q respectively we have that p+q
+
p+q
p
= i=O
and
(xy)" = x"y" = 0.
§ 2. Ideals
The ideal consisting of the nilpotent elements is called the radical of A and will be denoted by rad A. (The definition of radical can be generalized to the non-commutative case; the theory is then much more difficult and belongs to the theory of rings and algebras. The reader is referred to [14]).
It is clear that rad (rad A) = rad A.
The factor algebra A/rad A contains no non-zero nilpotent elements. To prove this assume that x̄ ∈ A/rad A is an element such that x̄^k = 0 for some k. Then x^k ∈ rad A and hence the definition of rad A yields an l such that x^{kl} = (x^k)^l = 0. It follows that x ∈ rad A, whence x̄ = 0. The above result can be expressed by the formula

rad(A/rad A) = 0.
Now assume that the algebra A has dimension n. Then rad A is a nilpotent ideal, and

deg(rad A) ≤ dim(rad A) + 1 ≤ n + 1.   (5.16)

For the proof, we choose a basis e₁, …, e_r of rad A. Then each e_i is nilpotent. Let k = max_i (deg e_i), and consider the ideal (rad A)^{rk}. An arbitrary element in this ideal is a sum of elements of the form

λ e₁^{k₁} ⋯ e_r^{k_r},   k₁ + ⋯ + k_r ≥ rk,

so that k_j ≥ k for at least one j, whence each such product is zero. This shows that (rad A)^{rk} = 0 and so rad A is nilpotent. Now let s be the degree of nilpotency of rad A, and suppose that for some m < s,

(rad A)^m = (rad A)^{m+1}.   (5.17)

Then we obtain by induction that

(rad A)^m = (rad A)^{m+1} = (rad A)^{m+2} = ⋯ = (rad A)^s = 0

which is a contradiction. Hence (5.17) is false and so in particular

dim(rad A)^m > dim(rad A)^{m+1},   m < s.

It follows at once that s − 1 cannot be greater than the dimension of rad A, which proves (5.16). As a corollary, we notice that for any nilpotent element x ∈ A, its degree of nilpotency is less than or equal to n + 1,

deg x ≤ n + 1.
5.12.* Simple algebras. An algebra A is called simple if it has no proper non-trivial ideals and if A² ≠ 0. As an example consider a field Γ as an algebra over a subfield Γ₁. Let I ≠ 0 be an ideal in Γ. If x is a non-zero element of I, then 1 = x⁻¹x ∈ I, and it follows that

Γ = Γ·1 ⊂ I

whence Γ = I. Since Γ² ≠ 0, Γ is simple.

As a second example consider the algebra A(E; E) where E is a vector space of dimension n. Suppose I is a non-trivial ideal in A(E; E) and let φ ≠ 0 be an arbitrary element of I. Then there exists a vector a ∈ E such that φa ≠ 0. Now define the linear transformations σ_k by

σ_k e_j = δ_jk a,   j, k = 1 … n,

where e_i (i = 1 … n) is a basis of E. Choose linear transformations χ_i such that

χ_i(φ a) = e_i,   i = 1 … n.

Let ψ ∈ A(E; E) be arbitrary and let (α_k^i) be the matrix of ψ with respect to the basis e_i. Then

(Σ_{i,k} α_k^i χ_i ∘ φ ∘ σ_k)(e_j) = Σ_i α_j^i χ_i(φ a) = Σ_i α_j^i e_i = ψ e_j,

whence

ψ = Σ_{i,k} α_k^i χ_i ∘ φ ∘ σ_k.

It follows that ψ ∈ I and so I = A(E; E). Since (clearly) A(E; E)² ≠ 0, A(E; E) is a simple algebra.

Theorem I: Let A be a simple commutative associative algebra. Then A is a division algebra.

Proof: We show first that A has a unit element. Since A² ≠ 0, there is an element a ∈ A such that the ideal I_a = A·a is non-zero. Since A is simple, I_a = A. Thus there is an element e ∈ A such that

a e = a.   (5.18)
Next we show that e² = e. In fact, equation (5.18) implies that

a(e² − e) = (ae)e − ae = ae − ae = 0

and so e² − e is contained in the annihilator of a, e² − e ∈ N_a. Since N_a is an ideal and N_a ≠ A, it follows that N_a = 0, whence e² = e.

Now let x ∈ A be any element. Since a = ae ∈ I_e we have I_e ≠ 0 and so I_e = A. Thus we can write

x = ey   for some y ∈ A.

It follows that ex = e²y = ey = x and so e is the unit element of A. Thus every element x ∈ A satisfies x = x·e; i.e., x ∈ I_x. In particular, I_x ≠ 0 if x ≠ 0. Hence, if x ≠ 0, I_x = A and so there is an element x⁻¹ such that x x⁻¹ = x⁻¹ x = e; i.e., A is a division algebra.
5.13.* Totally reducible algebras. An algebra A is called totally reducible if to every ideal I there is a complementary ideal I′,

A = I ⊕ I′.

Every ideal I in a totally reducible algebra is itself a totally reducible algebra. In fact, let I′ be a complementary ideal. Then

I·I′ ⊂ I ∩ I′ = 0.

Consequently, if J is an ideal in I, we have A·J = (I + I′)·J ⊂ J, and so J is an ideal in A. Let J′ be a complementary ideal in A, A = J ⊕ J′. Intersecting with I and observing that J ⊂ I, we obtain (cf. sec. 1.13)

I = J ⊕ (J′ ∩ I).

It follows that I is again totally reducible. An algebra, A, is called irreducible if it cannot be written as the direct sum of two non-trivial ideals.

5.14.* Semisimple algebras. In this section A will denote an associative commutative algebra. A will be called semisimple if it is totally reducible and if for every non-zero ideal I, I² ≠ 0.
Proposition I: If A is totally reducible, then A is the direct sum of its radical and a semisimple ideal. The square of the radical is zero.

Proof: Let B denote a complementary ideal for rad A,

A = rad A ⊕ B.

It follows from sec. 5.13 that B is totally reducible. B contains no non-zero nilpotent elements and so I² ≠ 0 for every non-zero ideal I of B; hence B is semisimple.

To show that the square of rad A is zero, let k be the degree of nilpotency of rad A, (rad A)^k = 0. Then (rad A)² is an ideal in rad A, and rad A is totally reducible, so there exists a complementary ideal J,

(rad A)² ⊕ J = rad A.

Now we have the relations

rad A · J ⊂ (rad A)² ∩ J = 0.

These relations yield

(rad A)² = rad A · ((rad A)² ⊕ J) = (rad A)³ = ⋯ = (rad A)^k = 0.

Corollary: A is semisimple if and only if A is totally reducible and rad A = 0.
Problems

1. Suppose that I₁, I₂ are ideals in an algebra A. Prove that

(I₁ + I₂)/I₁ ≅ I₂/(I₁ ∩ I₂).

2. Show that the algebra C′ defined in Example 4, § 1, has no nilpotent elements ≠ 0.

3. Consider the set S of step functions f: [0, 1] → ℝ. Show that the operations

(f + g)(t) = f(t) + g(t),
(λf)(t) = λ f(t),
(fg)(t) = f(t) g(t)
make S into a commutative associative algebra with identity. (A function f: [0, 1] → ℝ is called a step function if there exists a decomposition of the unit interval,

0 = t₀ < t₁ < ⋯ < t_m = 1,

such that f is constant on each open interval (t_{i−1}, t_i).)

5. Show that the algebra S of problem 3 has ideals which are not principal. Let (a, b) ⊂ [0, 1] be any open interval, and let f be a step function such that f(t) = 0 if and only if a < t < b. Prove that the ideal generated by f is precisely the subset of functions g such that g(t) = 0 for a < t < b.

6. Let E be an infinite dimensional vector space. Show that the linear transformations of E whose kernels have finite codimension form an ideal. Conclude that A(E; E) is not simple. Hint: See problem 11, chap. II, § 6 and problem 8, chap. I, § 4.

§ 3. Change of coefficient field of a vector space
5.15. Vector space over a subfield. Let E be a vector space over a field Γ and let Δ be a subfield of Γ. The vector space structure of E involves a mapping

Γ × E → E

satisfying the conditions (1.1), (1.2) and (1.3) of sec. 1.1. The restriction of this mapping to Δ × E again satisfies these conditions, and so it determines on E the structure of a vector space over Δ. A subspace (factor space) of E considered as a Δ-vector space will be called a Δ-subspace (Δ-factor space). Similarly we refer to Γ-subspaces and Γ-factor spaces. Clearly every Γ-subspace (factor space) is a Δ-subspace (Δ-factor space). Now let F be a second vector space over Γ and suppose that φ: E → F is a Γ-linear mapping; i.e.,

φ(λx + μy) = λφx + μφy,   x, y ∈ E; λ, μ ∈ Γ.

Then φ is a Δ-linear mapping if E and F are considered as Δ-vector spaces.
As an example, consider the field Γ as a 1-dimensional vector space over itself. Then (cf. sec. 5.2, Example 3) Γ is an algebra over Δ.

5.16. Dimensions. To distinguish between the dimension of E over Γ and over Δ we shall write dim_Γ E and dim_Δ E. Suppose now that Γ is finite dimensional over Δ. Assume further that the dimension of E over Γ is finite. It will be shown that

dim_Δ E = dim_Γ E · dim_Δ Γ.

Let e_i (i = 1 … n) be a basis of E over Γ and consider the Γ-subspace E_i generated by e_i. Then there is a Γ-isomorphism φ_i: Γ → E_i. But φ_i is also a Δ-isomorphism and hence it follows that

dim_Δ E_i = dim_Δ Γ,   i = 1 … n.   (5.21)

Since the Δ-vector space E is the direct sum of the Δ-vector spaces E_i, we obtain from (5.21) that

dim_Δ E = n · dim_Δ Γ = dim_Γ E · dim_Δ Γ.

As an example let E be a complex vector space of dimension n. Then, since dim_ℝ ℂ = 2, E, considered as a real vector space, has dimension 2n. If e_ν (ν = 1 … n) is a basis of the complex vector space E then the vectors e_ν, i e_ν (ν = 1 … n) form a basis of E considered as a real vector space.

5.17. Algebras over subfields. Again let Δ be a subfield of Γ and let A be an algebra over Γ. Then A may be considered as a vector space over Δ, and it is clear that A, together with its Δ-vector space structure, is an algebra over Δ. We distinguish (in a way similar to the case of vector spaces) between Δ-subalgebras, Δ-homomorphisms and Γ-subalgebras, Γ-homomorphisms. Clearly every Γ-subalgebra (Γ-homomorphism) is a Δ-subalgebra (Δ-homomorphism).
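The complex/real example above can be checked mechanically. The following sketch (not part of the original text; numpy assumed) realifies ℂⁿ and verifies that the 2n vectors e_ν, i·e_ν become an ℝ-basis of ℝ^{2n}:

```python
import numpy as np

# Check dim_R E = dim_C E * dim_R C for E = C^n by realifying C^n.
n = 3

def realify(z):
    """Map a vector in C^n to the corresponding vector in R^(2n)."""
    return np.concatenate([z.real, z.imag])

complex_basis = [np.eye(n, dtype=complex)[k] for k in range(n)]
candidates = complex_basis + [1j * e for e in complex_basis]

# The 2n realified vectors have full rank, hence form an R-basis of R^(2n).
M = np.stack([realify(v) for v in candidates])
assert np.linalg.matrix_rank(M) == 2 * n
```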
5.18. Extension fields as subalgebras of A_Δ(E; E). Let E be a non-trivial vector space over Γ and Δ ⊂ Γ be a subfield. Then E can be considered as a vector space over Δ. Denote by A_Δ(E; E) the algebra (over Δ) of Δ-linear transformations of E. Define a mapping Φ: Γ → A_Δ(E; E) by

Φ(α)x = αx,   α ∈ Γ, x ∈ E,   (5.22)

where αx is the ordinary scalar multiplication defined between Γ and E. Then

Φ(αβ)x = (αβ)x = α(βx) = (Φ(α) ∘ Φ(β))x

and

Φ(α + β)x = (α + β)x = αx + βx = (Φ(α) + Φ(β))x

whence

Φ(αβ) = Φ(α)Φ(β)   and   Φ(α + β) = Φ(α) + Φ(β).

Since Δ ⊂ Γ, it follows that Φ is a Δ-homomorphism. Moreover, Φ is injective. In fact, Φ(α) = 0 implies that αx = 0 for every x ∈ E, whence α = 0. Since Φ is a monomorphism we may identify Γ with the Δ-subalgebra Im Φ of A_Δ(E; E).
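For Γ = ℂ, Δ = ℝ and E = ℂ the monomorphism Φ can be written out concretely: multiplication by α = a + bi is the ℝ-linear transformation of ℝ² with matrix ((a, −b), (b, a)). A small numeric check, offered only as an illustration (numpy assumed):

```python
import numpy as np

def Phi(alpha: complex) -> np.ndarray:
    """R-linear transformation of E = C = R^2 given by z -> alpha*z."""
    a, b = alpha.real, alpha.imag
    return np.array([[a, -b],
                     [b,  a]])

alpha, beta = 2 + 3j, -1 + 0.5j

# Phi is an (injective) algebra homomorphism: it preserves products and sums.
assert np.allclose(Phi(alpha * beta), Phi(alpha) @ Phi(beta))
assert np.allclose(Phi(alpha + beta), Phi(alpha) + Phi(beta))
```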
Conversely, let E be a vector space over a field Δ and assume that Γ ⊂ A_Δ(E; E) is a field containing the identity ι. We may identify Δ with the subalgebra of Γ consisting of the elements of the form λι, λ ∈ Δ, the identification map being given by λ ↦ λι. Then we have Δ ⊂ Γ; i.e., Δ is a subfield of Γ. Now define a mapping Γ × E → E by

(φ, x) ↦ φx,   φ ∈ Γ, x ∈ E.   (5.23)

Then we have the relations

φ(x + y) = φx + φy,
(φ + ψ)x = φx + ψx,
(φψ)x = φ(ψx),
ιx = x,   x, y ∈ E; φ, ψ ∈ Γ,

and hence E is made into a vector space over Γ. The restriction of the mapping (5.23) to Δ gives the original structure of E as a vector space over Δ, while the mapping (5.23) restricted to Δ reduces to the canonical injection of Δ into A_Δ(E; E).

5.19. Linear transformations over extension fields. Let Δ ⊂ Γ be a subfield and E be a vector space over Γ. Then we have shown that Γ ⊂ A_Δ(E; E) and A_Γ(E; E) ⊂ A_Δ(E; E). Now we shall prove the more precise

Proposition: A_Γ(E; E) is the subalgebra of A_Δ(E; E) consisting of those Δ-linear transformations which commute with every Δ-linear transformation of the form Φ(α), α ∈ Γ.
Proof: Let φ ∈ A_Γ(E; E). Then

φ(αx) = α φx,   α ∈ Γ, x ∈ E,

and so

φ ∘ Φ(α) = Φ(α) ∘ φ,   α ∈ Γ.   (5.24)

Conversely, if (5.24) holds, then by inverting the above argument we obtain that φ is Γ-linear.

Corollary: Suppose that E is a vector space over Δ and Γ ⊂ A_Δ(E; E) is a field such that ι ∈ Γ. Then a transformation ψ ∈ A_Δ(E; E) is contained in A_Γ(E; E) if and only if it commutes with every Δ-linear transformation φ ∈ Γ.
Problems

1. Suppose Δ ⊂ Γ is a subfield of Γ such that Γ has finite dimension over Δ. Suppose further that Λ is a subalgebra of Γ containing Δ. Prove that Λ is a subfield of Γ.

2. Show that if Δ ⊂ Γ is a subfield, and Γ has finite dimension over Δ, then there are no non-trivial derivations in the Δ-algebra Γ.

3. A complex number α is called algebraic if it satisfies an equation of the form

Σ_ν k_ν α^ν = 0   (k_ν rational)

where not all the coefficients are zero. Prove that the algebraic numbers form a subfield, A, of ℂ and that A has infinite dimension over the rationals. Prove that there are no non-trivial derivations in A.
Chapter VI
Gradations and homology

In this chapter all vector spaces are defined over a fixed, but arbitrarily chosen, field Γ of characteristic 0.

§ 1. G-graded vector spaces

6.1. Definition. Let E be a vector space and G be an abelian group. Suppose that a direct decomposition

E = Σ_α E_α   (6.1)

is given and that to every subspace E_α an element k(α) of G is assigned such that the mapping α ↦ k(α) is injective. Then E is called a G-graded vector space. G is called the group of degrees for E. The vectors of E_α are called homogeneous of degree k(α) and we shall write

deg x = k(α),   x ∈ E_α.

In particular, the zero vector is homogeneous of every degree. If the mapping α ↦ k(α) is bijective we may use the group G as index set in the decomposition (6.1). Then formula (6.1) reads

E = Σ_{k∈G} E_k

where E_k denotes the subspace of the homogeneous elements of degree k. If G = ℤ, E will be called simply a graded vector space.

Suppose that E is a vector space with direct decomposition

E = Σ_{k=0}^∞ E_k.

Then by setting deg x = k for x ∈ E_k (k ∈ ℤ, E_k = 0 for k ≤ −1) we make E into a graded space, and whenever we refer to the graded space E = Σ_{k=0}^∞ E_k, we shall mean this particular gradation. A gradation of E such that E_k = 0, k ≤ −1, is called a positive gradation.
Now let E be a G-graded space, E = Σ_{k∈G} E_k, and consider a subspace F ⊂ E such that

F = Σ_{k∈G} (F ∩ E_k).

Then a G-gradation is induced in F by assigning the degree k to the vectors of F ∩ E_k. F together with its induced gradation is called a G-graded subspace of E.

Suppose next that {E_λ}, λ ∈ I, is a family of G-graded spaces indexed by a set I and let E be the direct sum of the E_λ. Then a G-gradation is induced in E by

E = Σ_{k∈G} E_k   where   E_k = Σ_{λ∈I} (E_λ)_k.

This follows from the relation

Σ_{k∈G} Σ_{λ∈I} (E_λ)_k = Σ_{λ∈I} Σ_{k∈G} (E_λ)_k = Σ_{λ∈I} E_λ = E.
6.2. Linear mappings of G-graded spaces. Let E and F be two G-graded spaces and let φ: E → F be a linear map. The map φ is called homogeneous if there exists a fixed element k ∈ G such that

φ(E_j) ⊂ F_{j+k},   j ∈ G.   (6.2)

k is called the degree of the homogeneous mapping φ. The kernel of a homogeneous mapping is a graded subspace of E. In fact, if π_j and ρ_j denote the projection operators in E and F induced by the gradations of E and F, it follows from (6.2) that

ρ_{k+j} ∘ φ = φ ∘ π_j.   (6.3)

Relation (6.3) implies that ker φ is stable under the projection operators π_j and hence ker φ is a G-graded subspace of E. Similarly, the image of φ is a G-graded subspace of F.

Now let E be a G-graded vector space, F be an arbitrary vector space (without gradation) and suppose that φ: E → F is a linear map of E onto F such that ker φ is a graded subspace of E. Then there is a uniquely determined G-gradation in F such that φ is homogeneous of degree zero. The G-gradation of F is given explicitly by

F = Σ_{j∈G} F_j   where   F_j = φ(E_j).   (6.4)

To show that (6.4) defines a G-gradation in F, we notice first that since φ is onto,

F = φ(E) = φ(Σ_j E_j) = Σ_j φ(E_j) = Σ_j F_j.

To prove that the decomposition (6.4) is direct assume that

Σ_j y_j = 0,   y_j ∈ F_j.

Since F_j = φ(E_j), every y_j can be written in the form y_j = φ(x_j), x_j ∈ E_j. It follows that φ(Σ_j x_j) = 0, whence Σ_j x_j ∈ ker φ. Since ker φ is a graded subspace of E we obtain x_j ∈ ker φ for each j, whence y_j = 0. Thus the decomposition (6.4) is direct and hence it defines a G-gradation in F. Clearly, the mapping φ is homogeneous of degree zero with respect to the induced gradation. Finally, it is clear that any G-gradation of F such that φ is homogeneous of degree zero must assign to the elements of φ(E_j) the degree j. In view of the decomposition (6.4) it follows that this G-gradation is uniquely determined by the requirement that φ be homogeneous of degree zero.

This result implies in particular that there is a unique G-gradation determined in the factor space of E with respect to a G-graded subspace such that the canonical projection is homogeneous of degree zero. Such a factor space, together with its G-gradation, is called a G-graded factor space of E.
Now let E = Σ_{k∈G} E_k and F = Σ_{k∈G} F_k be two G-graded spaces, and suppose that φ: E → F is a linear mapping homogeneous of degree l. Denote by φ_k the restriction of φ to E_k,

φ_k: E_k → F_{k+l}.

Then clearly

φ = Σ_{k∈G} φ_k.

It follows that φ is injective (surjective, bijective) if and only if each φ_k is injective (surjective, bijective).
6.3. Gradations with respect to several groups. Suppose that τ: G → G′ is a homomorphism of G into another abelian group G′. Then a G-gradation of E induces a G′-gradation of E by

E = Σ_{β∈G′} E′_β   where   E′_β = Σ_{τ(α)=β} E_α.   (6.5)

To prove this we note first that E = Σ_β E′_β since E = Σ_α E_α. The directness of the decomposition (6.5) follows from the fact that every space E′_β is a sum of certain subspaces E_α and that the decomposition E = Σ_α E_α is direct.
A G^p-gradation is a gradation with respect to the group

G ⊕ ⋯ ⊕ G   (p summands).

If G = ℤ we refer simply to a p-gradation of E. Given a G^p-gradation in E consider the homomorphism τ: G^p → G given by

τ(k₁, …, k_p) = k₁ + ⋯ + k_p.

The induced (simple) G-gradation of E is given by

E = Σ_{k∈G} E_k,   E_k = Σ_{k₁+⋯+k_p=k} E_{k₁,…,k_p}.   (6.6)

The G-gradation (6.6) of E is called the (simple) G-gradation induced by the given G^p-gradation.

Finally, suppose E is a vector space, and assume that

E = Σ_{j∈G} E_j   and   E = Σ_{k∈H} F_k   (6.7)

define G- and H-gradations in E. Then the gradations will be called compatible if

E = Σ_{j,k} (E_j ∩ F_k).

If the two gradations given by (6.7) are compatible, they determine a (G ⊕ H)-gradation in E by the assignment

deg x = (j, k),   x ∈ E_j ∩ F_k.
Conversely, suppose a (G ⊕ H)-gradation in E is given,

E = Σ_{j,k} E_{j,k}.

Then compatible G- and H-gradations of E are defined by

E_j = Σ_{k∈H} E_{j,k},   j ∈ G,   and   F_k = Σ_{j∈G} E_{j,k},   k ∈ H,

and the (G ⊕ H)-gradation of E determined by these G- and H-gradations is given by

E_{j,k} = E_j ∩ F_k,   j ∈ G, k ∈ H.
6.4. The Poincaré series. A gradation E = Σ_k E_k of a vector space E is called almost finite if the dimension of every space E_k is finite. To every almost finite positive gradation we assign the formal series

P_E(t) = Σ_{k=0}^∞ (dim E_k) t^k.

P_E(t) is called the Poincaré series of the graded space E. If the dimension of E is finite, then P_E(t) is a polynomial and

P_E(1) = dim E.

The direct sum of two almost finite positively graded spaces E and F is again an almost finite positively graded space, and its Poincaré series is given by

P_{E⊕F}(t) = P_E(t) + P_F(t).

Two almost finite positively graded spaces E and F are connected by a homogeneous linear isomorphism of degree 0 if and only if P_E(t) = P_F(t). In fact, suppose φ: E → F is such a homogeneous linear isomorphism of degree 0. Writing

φ = Σ_{k=0}^∞ φ_k

(cf. sec. 6.2) we obtain that each φ_k is a linear isomorphism, E_k ≅ F_k. Hence

dim E_k = dim F_k   (k = 0, 1, …)   (6.8)

and so P_E(t) = P_F(t). Conversely, assume that P_E(t) = P_F(t). Then (6.8) must hold, and thus there are linear isomorphisms

φ_k: E_k → F_k.   (6.9)

Since E = Σ_k E_k we can construct the linear mapping

φ = Σ_{k=0}^∞ φ_k: E → F

which is clearly homogeneous of degree zero. Moreover, in view of sec. 6.2 it follows from (6.9) that φ is a linear isomorphism.

6.5. Dual G-graded spaces. Suppose E = Σ_{k∈G} E_k and F = Σ_{k∈G} F_k are two G-graded vector spaces, and assume that φ: E × F → Γ
and Fif (6.10)
for each pair of distinct degrees, k +j. Every bilinear function cv: E x F—÷F which respects G-gradations determines bilinear functions pk: Ek x (keG) by
(PIEkXFk=(Pk,
keG.
(6.11)
Conversely, if any bilinear functions (pk: Ek x are given, then a unique bilinear function p:Ex F-+F which respects G-gradations is determined by (6.10) and (6.11). In particular, it follows that ço is non-degenerate if and only if each is non-degenerate. Thus a scalar product which respects G-gradations determines a scalar product between each pair (Ek, Fk), and conversely if a scalar product is defined between each pair (Ek, Fk) then the given scalar product can be extended in a unique way to a G-gradation-respecting scalar product between E and F. E and F, together with a G-gradation-respecting scalar product, will be called dual G-graded spaces.
Now suppose that E and F are dual almost finite G-graded spaces.
§ 1. G-graded vector spaces
173
Then Ek and Fk are dual, and so
dimEk=dimFk In particular, if G
kEG.
and the gradations of E and F are positive, we have
=
Problems 1. Let çø : E—*F be an injective linear mapping. Assume that F is a Ggraded vector space and that Im ço is a G-graded subspace of F. Prove that there is a unique G-gradation in E so that p becomes homogeneous of degree zero. 2. Prove that every G-graded subspace of a G-graded vector space has a complementary G-graded subspace. 3. Let E, F be G-graded vector spaces and suppose that E1 E, F1 F are G-graded subspaces. Let p: E—+F be a linear mapping homogeneous of degree k. Assume that p can be restricted to E1, F1 to obtain a linear mapping p1 :E1—*F1 and an induced mapping
Prove that q1 and are homogeneous of degree k. 4. If cp, E, F are as in problem 3, prove that if has a left (right) inverse, then a left (right) homogeneous inverse of must exist. What are the possible degrees of such a homogeneous left (right) inverse mapping? 5. Let E1, E2, 123 be G-graded vector spaces. Suppose that p: E1 and i/i:E1—+E3 are linear mappings, homogeneous of degree k and 1 respectively. Assume that can be factored over 'p. Prove that cli can be factored over 'p with a homogeneous linear mapping x: and determine the degree of x. Hint: See problem 5, chap. II, § 1.
6. Let E, E* and F, F* be two pairs of dual G-graded vector spaces. Assume that and 'p is homogeneous of degree k, prove that 'p* homogeneous of degree k. and 7. Let E be an almost finite graded space. Suppose that are G-graded spaces each of which is dual to the G-graded space E. Construct
is
Chapter VI. Gradations and homology
174
a homogeneous linear isomorphism of degree zero
such that
<(py*x> =
xeE.
8. Let F, E* be a pair of almost finite dual G-graded spaces. Let F be a G-graded subspace of E. Prove that F' is a G-graded subspace of E*
and that (F')'=F. 9. Suppose E, E*, F are as in problem 8. Let F1 be a complementary G-graded subspace for F in E (cf. problem 2). Prove that
E* =F'f13F1' and that F, F and F1, F' are two pairs of dual G-graded spaces. 10. Suppose E, E* and F, F* are two pairs of almost finite dual G1L
graded vector spaces, and let p:E—*Fbe a linear mapping homogeneous of degree k. Prove that çQ* exists.
11. Suppose F, E* is a pair of almost finite dual G-graded vector spaces. Let be a basis of F consisting of the set union of bases for the in E* exists. homogeneous subspaces of E. Prove that a dual basis 12. Let F and F be two G-graded vector spaces and p: E—+ F be a homogeneous linear mapping of degree k. Assume further that a homomorphism w:G—*H is given. Prove that p is homogeneous of degree w(k) with erspect to the induced H-gradation. § 2.
G-graded algebras
6.6. G-graded algebras. Let A be an algebra and suppose that a Ggradation A =
is defined in the vector space A. Then A is called a kEG
G-graded algebra if for every two homogeneous elements x and y, xy is
homogeneous, and
deg(xy) = degx + degy. Suppose that A = keG
(6.12)
is a graded algebra with identity element e.
Then e is homogeneous of degree 0. In fact, writing
e=
ek
keG
ekeAk
§ 2. G-graded algebras
175
we obtain for each xeA that
x is homogeneous of degree 1 kEG
whence
xek—O for k+O and so
xe0=x.
(6.13)
Since (6.13) holds for each homogeneous vector x it follows that e0 is a right identity for A, whence e = e e0
= e0.
Thus eeA0 and so it is homogeneous of degree 0.
It is clear that every subalgebra of A that is simultaneously a G-graded subspace, is a G-graded algebra. Such subalgebras are called G-graded subalgebras.
Now suppose that 1c A is a G-graded ideal. Then the factor algebra A/I has a natural G-gradation as a linear space (cf. sec. 6.2) such that the canonical projection ir:A—A/I is homogeneous of degree zero. Hence if and 9 are any two homogeneous elements in A/I we have .9 =
and so
is
deg
= 7t (x y)
homogeneous. Moreover, = deg(x y) = deg x + deg y = deg
+ deg9.
Consequently, A/I is a G-graded algebra. More generally, if B is a second algebra without gradation, and p : A—÷B is an epimorphism whose kernel is a G-graded ideal in A, then the induced G-gradation (cf. sec. 6.2) makes B into a G-graded algebra. Now let A and B be G-graded algebras, and assume that p:A—+B is a
homogeneous homomorphism of degree k. Then ker 'p is a G-graded ideal in A and Im 'p is a G-graded subalgebra of B. G' is a homoSuppose next that A is a G-graded algebra, and r: morphism, G' being a second abelian group. Then it is easily checked that the induced G'-gradation of A makes A into a G'-graded algebra.
Chapter VI. Gradations and homology
176
The reader should also verify that if E is simultaneously a G- and an H-graded algebra such that the gradations are compatible, then the inalgebra. of A makes A into a duced A graded algebra A is called anticommutative if for every two homogeneous elements x and y = if x and y are two homogeneous elements in an associative anticommux y comtative graded algebra such that deg mute, and so we obtain the binomial formula n
yfl - i.
(x +
= an involution w is defined by
In every graded algebra A =
1)kx, wx = (6.14) XEAk. In fact, if XEAk and yeA1 are two homogeneous elements we have
w(xy) =
1)k+lxy = (_
1)1y
=
w preserves products. It follows immediately from (6.14) that = i and so w is an involution. w will be called the canonical involution of the graded algebra A. A homogeneous antiderivation with respect to the canonical involution (6.14) will simply be called an antiderivation in the graded algebra A. It satisfies the relation
Q(xy)=
+(— 1)kxQy,
xeAk,yeA.
If Q1 and Q2 are antiderivations of odd degree then Q1 Q2 + Q2 Q1 is a
derivation. If Q is an antiderivation of odd degree and 0 is a derivation then QO —OQ is an antiderivation (cf. sec. 5.8). Now assume that A is an associative anticommutative graded algebra
and let hEA be a fixed element of odd (even) degree. Then, if Q is a homogeneous antiderivation, p (h) Q is a homogeneous derivation (antiderivation) and if 0 is a homogeneous derivation, p(h)O is a homogeneous antiderivation (derivation) as is easily checked.
Problems 1. Let A be a G-graded algebra and suppose x is an invertible element homogeneous of degree k (cf. problem 7, chap. V, § 1). Prove that 1
§ 2. G-graded algebras
177
is homogeneous and calculate the degree. Conclude that if A is a positively graded algebra, then kz=O. 2. Suppose that A is a graded algebra without zero divisors. Prove that every invertible element is homogeneous of degree zero.
3. Let E, F be G-graded vector spaces. Show that the vector space LG(E; F) generated by the homogeneous linear mappings ço:E-+F is a subspace of L(E; F). Define a natural G-gradation in this subspace such that an element cOELG(E; F) is homogeneous if and only if it is a homogeneous linear mapping. 4. Prove that the G-graded space LG (E; E) (E is a G-graded vector space) is a subalgebra of A (E; E). Prove that the G-gradation makes LG (E; E) into a G-graded algebra (which is denoted by AG (E; E)). 5. Let E be a positively graded vector space. Show that an injective (surjective) linear mapping E) has degree Conclude that a homogeneous linear automorphism of E has degree zero. 6. Let A be a positively graded algebra. Show that the subset Ak of A consisting of the linear combinations of homogeneous elements of degree
k is an ideal. 7. Let E, E* be a pair of almost finite dual G-graded vector spaces. Construct an isomorphism of algebras: &AG(E;E)4 AG(E*;E*)0PP. Hint: See problem 12, chap. V, § 1. Show that there is a natural G-gradation in AG(E*; such that 1i is homogeneous of degree zero. 8. Consider the G-graded space LG (E; E) (E is a G-graded vector space). Assign a new gradation to LG (E; E) by setting
degp =
— degp
whenever PELG(E; E) is a homogeneous element. Show that with this new gradation LG (E; E) is again a G-graded space and AG (F; E) is a
G-graded algebra. To avoid confusion, we denote these objects by LG(E; E) and AG(E; E). Prove that the scalar product between LG (E;
E) and EG (E; E) defined
by
= makes these spaces into dual G-graded vector spaces.
9. Let A = Σ_k A_k be a graded algebra and consider the linear mapping θ: A → A defined by

θ x = p x,   x ∈ A_p.

Show that θ is a derivation.

§ 3.* Differential spaces and differential algebras
6.7. Differential spaces. A differential operator in a vector space E is a linear mapping ∂: E → E such that ∂² = 0. The vectors of ker ∂ = Z(E) are called cycles and the vectors of Im ∂ = B(E) are called boundaries. It follows from ∂² = 0 that B(E) ⊂ Z(E). The factor space

H(E) = Z(E)/B(E)

is called the homology space of E with respect to the differential operator ∂. A vector space E together with a fixed differential operator ∂_E is called a differential space.

A linear mapping φ of a differential space (E, ∂_E) into a differential space (F, ∂_F) is called a homomorphism (of differential spaces) if

∂_F ∘ φ = φ ∘ ∂_E.   (6.15)

It follows from (6.15) that φ maps Z(E) into Z(F) and B(E) into B(F). Hence a homomorphism φ induces a linear mapping φ_*: H(E) → H(F). If φ is an isomorphism of differential spaces and φ⁻¹ is the linear inverse isomorphism, then by applying φ⁻¹ on the left and right of (6.15) we obtain

φ⁻¹ ∘ ∂_F = ∂_E ∘ φ⁻¹

and so φ⁻¹ is an isomorphism of differential spaces as well. If ψ is a homomorphism of (F, ∂_F) into a third differential space (G, ∂_G), we have clearly

(ψ ∘ φ)_* = ψ_* ∘ φ_*.

In particular, if φ is an isomorphism of E onto F and ψ = φ⁻¹ is the inverse isomorphism, we have

(φ⁻¹)_* ∘ φ_* = ι   and   φ_* ∘ (φ⁻¹)_* = ι.

Consequently, φ_* is an isomorphism of H(E) onto H(F).
6.8. The exact triangle. An exact sequence of differential spaces is an exact sequence (6.16)
where (F, aF), (E, and (G, are differential spaces, and 'p, are homomorphisms. Suppose we are given an exact sequence of differential spaces then the sequence H(E) H(G) o is exact at H(E). In fact, clearly, = 0 and so Tm * c ker Conversely, let and choose an element yEZ(E) which repre-
sents /3. Then, since z1EG.
is surjective, there is an element y1 E E such that t/iy1 = z1. It follows that — CGZ1 = 0. — CEll) = = — 6G Since
Hence, by exactness at E,
for some XEF. Applying
we
obtain CE(PX = GEl
=0
whence
Since p is injective, this implies that i.e., XEZ(F). Thus x represents an element It follows from the definitions that (p*
= /3
and so
It should be observed that the induced homology sequence is not short exact. However, there is a linear map, y: H(G)—*H(F) which makes the triangle
H(F)-*H(E)
(6.17)
A
H(G)
exact. It is defined as follows: Let sentative of Choose VEE such that =
and let ZEZ(G) be a repreThen = GG
Z
=0
Chapter VI. Gradations and homology
and so there is an element XEF such that = Since
=
=
=
0
and since is injective, it follows that IFX=O and so x represents an element x of H(F). It is straightforward to check that is independent of all choices and so a linear map y: is defined by It is not difficult to verify that this linear map makes the triangle (6.17) exact at 11(G) and H(F). / is called the connecting homomorphism for the short exact sequence (6.16). 6.9. Dual differential spaces. Suppose (E, is a differential space and consider the dual space E* = L(E). Let a* he the dual map of Then for all xEE, X*EE* we have = <x*,0> =
=
0
whence 3*a*x*_o i.e., (o*)2 = 0.
Thus (E*,
is again a differential space. It is called the dual differential
space.
= Z (E*) are called cocycles (for E) and the vectors The vectors of ker of B(E*) are called coboundaries (for E). The factor space
H(E*) = Z(E*)/B(E*) is called the cohoinology space for E. it will now be shown that the scalar product between E and E* determines a scalar product between the homology and cohomology spaces. In view of sec. 2.26 and 2.28 we have the relations
Z(E*)=B(E)L,
Z(E)=B(E*)L
(6.18)
B(E*) = Z(E)-'-,
B(E) = Z(E*)--.
(6.19)
We can now construct a scalar product between H(E) and H(E*). Consider the restriction of the scalar product between E and E* to Z(E) x Z(E*), Z(E) x Then since, in view of (6.18) and (6.19),
z(E*)±
n
Z(E) = B(E)
n
Z(E) = B(E)
§ 3. Differential spaces and differential algebras
and n
181
B(E*) n z(E*) =
z(E*)
the equation
=
a scalar product between H(E) and H(E*) (cf. sec. 2.23). (E*, Finally, suppose (E, and (F, SF)' (F*, CF*) are two pairs of dual differential spaces. Let
defines
ço:E-*F be a homomorphism of differential spaces, with dual map (p*: Then dualizing (6.15) we obtain *
and so
induced
°
*_ F
* * E0(P
is a homomorphism of differential spaces. It is clear that the mappings
and
H(F*)
: H(E*)
are again dual with respect to the induced scalar products; i.e., I
*\,'# — —I
\*
6.10. G-graded differential spaces. Let E be a G-graded space, E= peG
and consider a differential operator in E that is homogeneous of some degree k. Then a gradation is induced in Z(E) and B(E) by
Z(E)=
Z,,(E)
and B(E)=
peG
and
where
peG
(cf. sec. 6.2).
Now consider the canonical projection
Since rr is an onto map and the kernel of is a graded subspace of Z (E), a G-gradation is induced in the homology space H(E) by
H(E)=
where
peG
Chapter VI. Gradations and homology
The Now consider the subspaces and factor space is called the p-ill homology space of the graded In differential space E. It is canonically isomorphic to the space fact, if then denotes the restriction of it to the spaces
is an onto map and the kernel of =
n
is given by
kent =
n
B(E) =
induces a linear isomorphism of if E is a graded space and dim is finite we write
Hence,
onto
=
dim
is called the p-th Betti number of the graded differential space (E,
If E is an almost finite positively graded space, then clearly, so is H(E). The Poincaré series for H(E) is given by P
D
H(E)
P=o
6.11. Dual G-graded differential spaces. Suppose (E= (E* = keG
E,
Ek, c5)
and
kEG is
a pair of dual G-graded differential spaces. Then if 0E
is homogeneous of degree 1, we have that
E1
E1 +
and hence
= <)j,GEXJ> = 0 unless
it follows that
fl
j+i—l
is homogeneous of degree —1. and so Now consider the induced G-gradations in the homology spaces
The induced scalar product is given by (6.22)
Since
and
§ 3. Differential spaces and differential algebras
183
are homogeneous of degree zero, it follows that the scalar product (6.22) respects the gradations. Hence H(E) and H(E*) are again dual G-graded vector spaces. In particular, if G = the p-th homology and cohomology spaces of E are dual. If has finite dimension we obtain that dim
=
dim
6.12. Differential algebras. Suppose that A is an algebra and that 13 is a differential operator in the vector space A. Assume further that an involution w of the algebra A is given such that 13w+w13=O, and that 13 is an antiderivation with respect to i.e., that (6.23)
Then (A, 13) is called a differential algebra.
It follows from (6.23) that the subspace Z(A) is a subalgebra of A. Further, the subspace B(A) is an ideal in the algebra Z(A). In fact, if and 13xeB(A) we have
yeZ(A)
13(xy) = 13xy
and
13(wy.x)= w2 y13x +
y13x
yeZ(A)
whence 13x .yeB(A) 13xeB(A). Hence, a multiplication is induced in the homology space H(A). The space H(A) together with this multiplication is called the homology algebra of the differential algebra (A, 13). The multiplication in H(A) is given by
irz1 nz2 =
z1,z2eZ(A)
m(z1 z2)
where ir:Z(A)—÷H(A) denotes the canonical projection. If A is associative (commutative) then so is H(A). Let (A, 8A) and (B, 0B) be differential algebras. Then a homomorphism A -+B is called a homomorphism of d(fferential algebras if 'P 13A =
13B
It follows easily that the induced mapping : is a homomorphism of homology algebras. Suppose now A is a graded algebra and that 13 and a) are both homogeneous, w of degree zero. The A is called a graded differential algebra. Consider the induced gradation in H(A). Since the canonical projection m:Z(A)—÷H(A) is a homogeneous map of degree zero it follows that H(A) is a graded algebra.
Chapter VI. Gradations and homology
If A is an anticommutative graded algebra, then so is H(A) as follows from the fact that rr is a homogeneous epimorphism of degree zero.
Problems 1. Let (E, 0k), (F, 02) be two differential spaces and define the differby ential operator 8 in 8= Prove that
2. Given a differential space (E, 0), consider a differential subspace; i.e., a subspace E1 that is stable under 0. Assume that Q:E—÷E1 is a linear mapping such that i) ii)
VEE1
iii) QX — XEB(E)
XEZ(E).
Prove that the induced mapping
H(E1) is a linear isomorphism. 3. Let (F, 0) be a differential space. A homotopy operator in E is a linear transformation h:E—*E such that
hO + Oh = i.
Show that a homotopy operator exists in E if and only if H(E)==O. 4. Let (E, be and (F, OF) be two differential spaces and let =1/1* if and only if homomorphisms of differential spaces. Prove that there exists a linear mapping such that liCE + CFh =
—
is called a honiotopy operator connecting çø and //. Show that problem 3 is a special case of problem 4. 5. Let 81, 02 be differential operators in Ewhich commute, 8102=8281. a) Prove that 01 02 is a differential operator in E. b) Let B1, B2, B be the boundaries with respect to 82 and 82. Prove that 02(B1) = 81(B2) = B. h
§ 3. Differential spaces and differential algebras
c) Let Z1, Z2, Z that
Z1 + Z2
be the cycles with respect to 02 and Z. Establish natural linear isomorphisms
Z/Z1 4B1 n
Z2
and
Z/Z2
185
Show
4B2 fl Z1.
d) Establish a natural linear isomorphism (B1 n
n
Z1)/02(Z1)
and then show that each of these spaces is linearly isomorphic to Z/(Z1 + Z2). induces a differential operator in Z2. Let P1 denote e) Show that the corresponding homology space. Assume now that Z=Z1 +Z2 and prove that P1 can be identified with a subspace of the homology space H1 =Z1/B1. State and prove a similar result for 02. f) Show that the results a) to e) remain true if 0102 02 be differential operators in E such that 0102= are differential operators in E. a) Prove that 01+02 and b) With the notation of problem 5, assume that
B=B1nB2 and Z=Z1+Z2. Prove that the homology space is linearly isomorphic to
of each of the differential operators in a)
Z1 n Z2/(Z1 (This
n
B2 + Z2 n B1).
is essentially the Künneth theorem of sec. 2.10
7.
formula. Let E =
be
volume II.)
a finite dimensional
graded
differential space and assume that 0 is homogeneous of degree — be a homomorphism of differential spaces, (1=0 in E0). Let p: homogeneous of degree zero. Denote the restrictions of p (respectively by ço1, (respectively Prove the (respectively to Lefschetz formula —
—
p=O
Conclude
=
p=O
that —
(EP)
p=O
and PH(E). (Euler-Poincaré formula). Express this formula in terms of The number given in (EP). is called the Euler-Poincaré characteristic
of E.
Chapter VII
Inner product spaces In this chapter all vector spaces are assumed to be real vector spaces
§ 1. The inner product 7.1. Definition. An inner product in a real vector space E is a bilinear function (,) having the following properties: 1. Symmetry: (x, y)=(y, x).
2. Positive definiteness: (x, x)O, and (x, x)=O only for the vector x==O.
A vector space in which an inner product is defined is called an inner product space. An inner product space of finite dimension is also called a Euclidean space. The norm xl of a vector xEE is defined as the positive square-root
A unit vector is a vector with the norm 1. The set of all unit vectors is called the unit-sphere. It follows from the bilinearity and symmetry of the inner product that x + y12 = 1x12 + 2(x,y)+ JyV
whence
(x,y) = 4(lx +
y12
-
This equation shows that the inner product can be expressed in terms of the norm. The restriction of the bilinear function (,) to a subspace E1 E has again properties 1 and 2 and hence every subspace of an inner product space is itself an inner product space. The bilinear function (,)is non-degenerate. In fact, assume that (a,y)=O
for a fixed vector aeE and every vector yeE. Setting y=a we obtain (a, a) =0 whence a =0. It follows that an inner product space is dual to itself.
7.2.
§ 1. The inner product
187
Examples. 1. In the real number-space
standard inner pro-
duct is defined by
(x,y) where
x =
and
...
...
y =
2. Let E be an n-dimensional real vector space and basis of E. Then an inner product can be defined by
(v =
1.
.
.n) be a
(x,y) = where
3.
Consider the space C of all continuous functions f in the interval 1 and define the inner product by
(f,g) = Jf(t)g(t)dt. 7.3. Orthogonality. Two vectors xeE and yeE are said to be orthogonal if (x, y)=O. The definiteness implies that only the zero-vector is orthogonal to itself. A system of p vectors +0 in which any two vectors are orthogonal, is linearly independent. In fact, the rex. and lation =0 yields x,1) =
0
=1
...
p)
whence
Two subspaces E1 E and E2 c E are called orthogonal, denoted as E1 ±E2, if any two vectors X1EE1 and x2eE2 are orthogonal.
7.4. The Schwarz-inequality. Let x and y be two arbitrary vectors of the inner product space E. Then the Schwarz-ine quality asserts that (x,y)2
1x121y12
(7.1)
and that equality holds if and only if the vectors are linearly dependent. To prove this consider the function Ix
+
Chapter VII. Inner product spaces
of the real variable 2. The definiteness of the inner product implies that
Expanding the norm we obtain )2
+ 22(x, y) + x12
0.
Hence the discriminant of the above quadratic expression must be negative or zero,
(x,r)2 < Now assume that equality holds in (7.1). Then the discriminant of the quadratic equation + 22(x,y) + lxi2 = 0
221yi2
is zero.*) Hence equation (7.2) has a real solution
(7.2)
It follows that
i2oy + XV = 0, whence 2oY + X = 0.
Thus, the vectors x and y are linearly dependent. 7.5. Angles. Given two vectors x + 0 and y 0, the Schwarz-inequality implies that (x,y) —
1
- lxi -1.
Consequently, there exists exactly one real number w (0
(x y)
cosw=—
w
m) such that (7.3)
.
lxi
The number w is called the angle between the vectors x and y. The sym-
metry of the inner product implies that the angle is symmetric with respect to x and y. If the vectors x and y are orthogonal, it follows that cos co =0, whence U) =
—
2
Now assume that the vectors x and y are linearly dependent, y=2x, Then
2
1+1
121
L—1
if 2>0 if 2<0
*) Without loss of generality we may assume that y
0.
§ 1. The inner product
and hence (0
if 2>0
(ic
if 2<0.
189
With the help of (7.3) the equation fx — y12
= ixi2 — 2(x,y) + iyi2
can be written in the form fx — y12 = 1x12 + 1y12 — 2fxi iylcosw.
This formula is known as the cosine-theorem. If the vectors x and y are
orthogonal, the cosine-theorem reduces to the Pythagorean theorem
ix- y12 = 7.6.
ixi2 + fyi2.
The triangle-inequality. It follows from the Schwarz-inequality
that lx
ixi2 + 2fxf fyI + fyi2
+ yi2 = ixi2 + 2(x,y) + fyi2
(lxi +
whence
x+yf fxi +iyi.
(7.4)
Relation (7.4) is called the triangle-inequality. To discuss the equalitysign we may exclude the trivial case y 0. It will be shown that equality holds in (7.4) if and only if
x—2y, 2>0. The equation
Ix+yi=ixi+iyi implies that lxi
2
+ 2(x,y)+
fyi2
= lxi 2 + 2fxifyf +
whence
(x, y) = lxi fyi.
(7.5)
Thus, the vectors x and y must be linearly dependent,
x=2y. Equations (7.5) and (7.6) yield 2=121, whence 20. Conversely, assume that x = 2y, where 2 0. Then fx
+ yf = (2 + 1)yI = (2 + 1)fyf = 2fyf + iyf = fxf + fyj.
(7.6)
Chapter VII. Inner product spaces
Given three vectors x, y, z, the triangle-inequality can be written in the form
x-yl Ix-zI +lz-yi.
(7.7)
As a generalization of (7.7), we prove the Ptolemy-inequality
ix-yllzl ly-zHxl +lz-xilyi.
(7.8)
Relation (7.8) is trivial if one of the three vectors is zero. Hence we
may assume that x+O, y+O and z+O. Define the vectors x', y' and z' by x
X—2, Y lxi
y
,
z Z—2. ,
2'
Izi
Then
lv'—y'l2=
k—12 xi2 in2'
1
lxl2
lxl2 lyi2
lyl2
Applying the inequality (7.7) to the vectors x', y' and z' lxi
=
we
obtain
Izi lxi'
Izi
whence (7.8).
7.7. The Riesz theorem. Let E be an inner product space of dimension n and consider the space L(E) of linear functions. Then the spaces L(E) and E are dual with respect to the bilinear function defined by
On the other hand, E is dual to itself with respect to the inner product. Hence Corollary II to Proposition I sec. 2.33 implies that there is a linear isomorphism a—*fa of E onto L(E) such that fa(Y) = (a,y). In
other words, every linear function f in E can be written in the form
f(y) and
= (a,y)
the vector aEE is uniquely determined byf(Riesz theorem).
Problems 1. For x =
1
(x,y) satisfies
show that the bilinear function
in
and y = —
the properties listed in sec. 7.1.
—
+
§ 2. Orthonormal bases
191
2. Consider the space S of all infinite sequences that
Show that
•••) such
converges and that the bilinear function (x,
is an inner product. 3. Consider three distinct vectors x40, y+O and z40. Prove that the equation lxi + iz - xi fyi ix izi = iY
-
-
holds if and only if the four points x, y, z, 0 are contained on a circle such
that the pairs x, y and z, 0 separate each other. 4. Consider two inner product spaces E1 and E2. Prove that an inner product is defined in the direct sum E1 EEE2 by x1,y1 eE1,
((x1,x2),(y1,y2)) = (x1,y1) + (x2,y2) 5.
x2,y2eE2.
Given a subspace E1 of a finite dimensional inner product space E,
consider the factor space E/E1. Prove that every equivalence class con-
tains exactly one vector which is orthogonal to E1. § 2.
Orthonormal bases
7.8. Definition. Let E be an n-dimensional inner product space and e_ν (ν = 1 … n) be a basis of E. Then the bilinear function (,) determines a symmetric matrix

g_{νμ} = (e_ν, e_μ)   (ν, μ = 1 … n).   (7.9)

The inner product of two vectors x = Σ_ν ξ^ν e_ν and y = Σ_μ η^μ e_μ can be written as

(x, y) = Σ_{ν,μ} g_{νμ} ξ^ν η^μ   (7.10)

and hence it appears as a bilinear form with the coefficient-matrix g_{νμ}. The basis e_ν (ν = 1 … n) is called orthonormal if the vectors e_ν are mutually orthogonal and have norm 1,

(e_ν, e_μ) = δ_{νμ}.   (7.11)

Then formula (7.10) reduces to

(x, y) = Σ_ν ξ^ν η^ν   (7.12)

and in the case y = x

|x|² = Σ_ν (ξ^ν)².

The substitution y = e_μ in (7.12) yields

ξ^μ = (x, e_μ)   (μ = 1 … n).   (7.13)

Now assume that x ≠ 0, and denote by θ_μ the angle between the vectors x and e_μ. Formulas (7.3) and (7.13) imply that

ξ^μ = |x| cos θ_μ   (μ = 1 … n).   (7.14)

If x is a unit-vector, (7.14) reduces to

ξ^μ = cos θ_μ   (μ = 1 … n).   (7.15)
.
and determine the scalar A such that (h1, h2)=0. This yields
(a2,b1) + A(b1,b1) = 0. Since b1 40, this equation can be solved with respect to A. The vector b2 thus obtained is different from zero because otherwise a1 and a2 would be linearly dependent.
To obtain b3 set
b3=a3 -i-pb1 +-vb2
and determine the scalars p and v such that (b1,b3) =
0
and
(b2,b3) = 0.
§ 2. Orthonormal bases
193
This yields
b1) = 0
(a3, b1) +
and (a3, b2) + v(b2,b2)
0.
Since b1 + 0 and b2 + 0, these equations can be solved with respect to ji and v. The linear independence of the vectors a1, a2, a3 implies that b3 +0. Continuing this way we finally obtain a system of n vectors
such that
+ 0 (v = 1.. .n)
(v+ji).
It follows from the criterion in sec. 7.3, that the vectors are linearly independent and hence they form a basis of E. Consequently the vectors (v
=
= 1 ... n)
form an orthonormal basis. 7.10. Orthogonal transformations. Consider two orthogonal bases and (v = 1.. .n) of E. Denote by the matrix of the basis-transformation (7.16)
The relations and imply that (7.17)
This equation shows that the product of the matrix and the transposed matrix is equal to the unit-matrix. In other words, the transposed matrix coincides with the inverse matrix. A matrix of this kind is called orthogonal.
Hence, two orthonormal bases are related by an orthogonal matrix. Conversely, given an orthonormal basis (v 1. . n) and an orthogonal the basis defined by (7.16) is again orthonormal. n x n-matrix .
7.11. Orthogonal complement. Let E be an inner product space (of finite or infinite dimension) and E1 be a subspace of E. Denote by set of all vectors which are orthogonal to E1. Obviously,
E only.
of the zero-vector
is called the orthogonal complement of E1. If E has finite dimen-
sion, then we have that
dimE1 +dimEiL= dimE 13
the
Greub. Linear Algebra
Chapter VII. Inner product spaces
and hence E1
n E iL
=
0
implies that (7.18)
Select an orthonormal basis and a vector y= of
E1 consider
rn) of E1. Given a vector XEE
1
the difference z
=x
—
y.
Then (z,
= (x,
—
= (x,
(y,
if and only if
This equation shows that z is contained in (R
—
= 1... rn).
We thus obtain the decomposition
x=p+h
(7.19)
where
p is called the orthogonal projection of x onto E1.
Passing over to the norm in the decomposition (7.19) we obtain the relation ix12
= pt2 + hI2.
(7.20)
Formula (7.20) yields Bessel's-inequality lxi
showing that the norm of the projection never exceeds the norm of x. The
equality holds if and only if h=0, i.e. if and only if xeE1. The number ihi is called the distance of x from the subspace E1.
Problems 1. Starting from the basis a1 = (1,0,1) a2 = (2,1,
—
3)
a3 = (— 1,1,0)
construct an orthonormal basis by the Schmidtof the number-space orthogonalization process.
§ 3. Normed determinant functions
195
2. Let E be an inner product space and consider E as dual to itself. Prove that the orthonormal bases are precisely the bases which are dual to themselves.
3. Given an inner product space E and a subspace E, of finite dimension consider a decomposition
x=x1+x2 and the projection
x,eE,
x=p+h
Prove that 1X21
Ihi
that equality is assumed only if x, =p and x2 = h. 4. Let C be the space of all continuous functions in the interval 0 t 1 with the inner product defined as in sec. 7.2. If C' denotes the subspace of all continuously differentiable functions, show that (C1)' = 0. 5. Consider a subspace E1 of E. Assume an orthogonal decomposition and
E,=F1$G, F1±G,. Establish the relations
F,' =
I G1
and
IF1.
=
6. Let F3 be the space of all polynomials of degree inner product of two polynomials as follows: (P,Q)
2. Define the
= f P(t)Q(t)dt.
The vectors 1, t, t2 form a basis in F3. Orthogonalize and orthonormalize this basis. Generalize the result for the case of the space P of polynomials of degree § 3.
Normed determinant functions
7.12. Definition. Let E be an n-dimensional inner product space and Δ₀ ≠ 0 be a determinant function in E. Since E is dual to itself we have, in view of (4.21),

Δ₀(x₁, …, x_n) Δ₀(y₁, …, y_n) = α det((x_i, y_j)),   x_i, y_j ∈ E,

where α is a real constant. Setting x_i = y_i = e_i, where e_i ∈ E is an orthonormal basis, we obtain

α = Δ₀(e₁, …, e_n)²   (7.21)

and so the constant α is positive. Now define a determinant function Δ by

Δ = ± Δ₀/√α.   (7.22)

Then we have

Δ(x₁, …, x_n) Δ(y₁, …, y_n) = det((x_i, y_j)).   (7.23)

A determinant function in an inner product space which satisfies (7.23) is called a normed determinant function. It follows from (7.22) that there are precisely two normed determinant functions Δ and −Δ in E. Now assume that an orientation is defined in E. Then one of the functions Δ and −Δ represents the orientation. Consequently, in an oriented inner product space there exists exactly one normed determinant function representing the given orientation.
7.13. Angles in an oriented plane. With the help of a normed determinant function it is possible to attach a sign to the angle between two vectors of a 2-dimensional oriented inner product space. Consider the normed determinant function Δ which represents the given orientation. Then the identity (7.23) yields

|x|²|y|² − (x, y)² = Δ(x, y)².   (7.24)

Now assume that x ≠ 0 and y ≠ 0. Dividing (7.24) by |x|²|y|² we obtain the relation

((x, y)/(|x||y|))² + (Δ(x, y)/(|x||y|))² = 1.

Consequently, there exists exactly one real number θ mod 2π such that

cos θ = (x, y)/(|x||y|)   and   sin θ = Δ(x, y)/(|x||y|).   (7.25)

This number is called the oriented angle between x and y.

If the orientation is changed, Δ has to be replaced by −Δ, and hence θ changes into −θ. Furthermore it follows from (7.25) that θ changes sign if the vectors x and y are interchanged and that

θ(x, −y) = θ(x, y) + π mod 2π.
7.14. The Gram determinant. Given p vectors x_ν (ν = 1 … p) in an inner product space E, the Gram determinant G(x₁, …, x_p) is defined by

G(x₁, …, x_p) = det((x_ν, x_μ))   (ν, μ = 1 … p).   (7.26)

It will be shown that

G(x₁, …, x_p) ≥ 0   (7.27)

and that equality holds if and only if the vectors x₁, …, x_p are linearly dependent. In the case p = 2, (7.27) reduces to the Schwarz inequality.

To prove (7.27), assume first that the vectors x_ν (ν = 1 … p) are linearly dependent. Then the rows of the matrix (7.26) are also linearly dependent, whence G(x₁, …, x_p) = 0. If the vectors x_ν (ν = 1 … p) are linearly independent, they generate a p-dimensional subspace E₁ of E. E₁ is again an inner product space. Denote by Δ₁ a normed determinant function in E₁. Then it follows from (7.23) that

G(x₁, …, x_p) = Δ₁(x₁, …, x_p)².

The linear independence of the vectors x_ν (ν = 1 … p) implies that Δ₁(x₁, …, x_p) ≠ 0, whence G(x₁, …, x_p) > 0.
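As a quick numeric illustration (not in the original text; numpy assumed, vectors made up), the Gram determinant is a determinant of inner products and its sign behaves as just proved:

```python
import numpy as np

def gram_det(vectors):
    """Gram determinant G(x_1,...,x_p) = det((x_i, x_j)), formula (7.26)."""
    X = np.array(vectors, dtype=float)
    return np.linalg.det(X @ X.T)

x1, x2 = np.array([1.0, 2.0, 0.0]), np.array([0.0, 1.0, 1.0])
assert gram_det([x1, x2]) > 0                    # independent vectors
assert np.isclose(gram_det([x1, 2 * x1]), 0.0)   # dependent vectors
# For p = 2 the inequality G >= 0 is exactly the Schwarz inequality (7.1):
assert np.dot(x1, x2) ** 2 <= np.dot(x1, x1) * np.dot(x2, x2)
```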
7.15. The volume of a parallelepiped. Let p linearly independent vectors a_ν (ν = 1 … p) be given in E. The set

{x = Σ_ν λ^ν a_ν,   0 ≤ λ^ν ≤ 1}   (7.28)

is called the p-dimensional parallelepiped spanned by the vectors a_ν (ν = 1 … p). The volume V(a₁, …, a_p) of the parallelepiped is defined by

V(a₁, …, a_p) = |Δ₁(a₁, …, a_p)|   (7.29)

where Δ₁ is a normed determinant function in the subspace generated by the vectors a_ν (ν = 1 … p). In view of the identity (7.23), formula (7.29) can be written as

V(a₁, …, a_p)² = det((a_ν, a_μ)).   (7.30)
In the case p = 2 the above formula yields

V(a₁, a₂)² = |a₁|²|a₂|² − (a₁, a₂)² = |a₁|²|a₂|² sin² θ,   (7.31)

where θ denotes the angle between a₁ and a₂. Taking the square-root on both sides of (7.31), we obtain the well-known formula for the area of a parallelogram:

V(a₁, a₂) = |a₁||a₂| sin θ.

Going back to the general case, select an integer i (1 ≤ i ≤ p) and decompose a_i in the form

a_i = Σ_{ν≠i} λ^ν a_ν + h_i,   where   (h_i, a_ν) = 0   (ν ≠ i).   (7.32)

Then (7.29) can be written as

V(a₁, …, a_p) = |Δ₁(a₁, …, h_i, …, a_p)|.

Employing the identity (7.23) and observing that (h_i, a_ν) = 0 (ν ≠ i) we obtain*)

V(a₁, …, a_p)² = |h_i|² · G(a₁, …, â_i, …, a_p).   (7.33)

The Gram determinant in this equation represents the square of the volume of the (p − 1)-dimensional parallelepiped generated by the vectors a₁, …, â_i, …, a_p. We thus obtain the formula

V(a₁, …, a_p) = V(a₁, …, â_i, …, a_p) · |h_i|   (1 ≤ i ≤ p),

showing that the volume V(a₁, …, a_p) is the product of the volume of the "base" and the corresponding height.

7.16. The cross product. Let E be an oriented 3-dimensional Euclidean space and Δ be the normed determinant function which represents the orientation. Given two vectors x ∈ E and y ∈ E, consider the linear function f defined by

f(z) = Δ(x, y, z).   (7.34)

In view of the Riesz theorem there exists precisely one vector u ∈ E such that

f(z) = (u, z).   (7.35)

*) The symbol â_i indicates that the vector a_i is deleted.
The vector u is called the cross product of x and y and is denoted by x × y. Relations (7.34) and (7.35) yield

(x × y, z) = Δ(x, y, z).   (7.36)

It follows from the linearity of Δ in x and y that the cross product is distributive,

(λx₁ + μx₂) × y = λ x₁ × y + μ x₂ × y
x × (λy₁ + μy₂) = λ x × y₁ + μ x × y₂,

and hence it defines an algebra in E. The reader should observe that the cross product depends on the orientation of E. If the orientation is reversed then the cross product changes its sign. From the skew symmetry of Δ we obtain that

x × y = − y × x.

Setting z = x in (7.36) we obtain that

(x × y, x) = 0.

Similarly it follows that

(x × y, y) = 0

and so the cross product is orthogonal to both factors.

It will now be shown that x × y ≠ 0 if and only if x and y are linearly independent. In fact, if y = λx, it follows immediately from the skew symmetry that x × y = 0. Conversely, assume that the vectors x and y are linearly independent. Then choose a vector z ∈ E such that the vectors x, y, z form a basis of E. It follows from (7.36) that

(x × y, z) = Δ(x, y, z) ≠ 0

whence x × y ≠ 0.

Formula (7.36) yields for z = x × y

Δ(x, y, x × y) = |x × y|².   (7.37)

If x and y are linearly independent it follows from (7.37) that Δ(x, y, x × y) > 0 and so the basis x, y, x × y is positive with respect to the given orientation.

Finally the identity

(x₁ × x₂, y₁ × y₂) = (x₁, y₁)(x₂, y₂) − (x₁, y₂)(x₂, y₁)   (7.38)

will be proved. We may assume that the vectors x₁, x₂ are linearly independent because otherwise both sides of (7.38) are zero. Multiplying the relations

Δ(x₁, x₂, x₃) = (x₁ × x₂, x₃)   and   Δ(y₁, y₂, y₃) = (y₁ × y₂, y₃)

we obtain in view of (7.23)

(x₁ × x₂, x₃)(y₁ × y₂, y₃) = det((x_i, y_j)).

Setting y₃ = x₁ × x₂ and expanding the determinant on the right hand side by the last row we obtain that

(x₁ × x₂, x₃)(y₁ × y₂, x₁ × x₂) = (x₁ × x₂, x₃)[(x₁, y₁)(x₂, y₂) − (x₁, y₂)(x₂, y₁)].   (7.39)

Since x₁ and x₂ are linearly independent we have that x₁ × x₂ ≠ 0 and hence x₃ can be chosen such that (x₁ × x₂, x₃) ≠ 0. Hence formula (7.39) implies (7.38).

Formula (7.38) yields for x₁ = y₁ = x and x₂ = y₂ = y

|x × y|² = |x|²|y|² − (x, y)².   (7.40)

If θ (0 ≤ θ ≤ π) denotes the angle between x and y we can rewrite (7.40) in the form

|x × y| = |x||y| sin θ,   x ≠ 0, y ≠ 0.

Now we establish the triple identity

x × (y × z) = (x, z)y − (x, y)z.   (7.41)

In fact, let u ∈ E be arbitrary. Then formulae (7.36) and (7.38) yield

(x × (y × z), u) = Δ(x, y × z, u) = − Δ(y × z, x, u) = − (y × z, x × u)
= − (y, x)(z, u) + (y, u)(z, x) = ((x, z)y − (x, y)z, u).

Since u ∈ E is arbitrary, (7.41) follows. From (7.41) we obtain the Jacobi identity

x × (y × z) + y × (z × x) + z × (x × y) = 0.

Finally note that if e₁, e₂, e₃ is a positive orthonormal basis of E we have the relations

e₁ × e₂ = e₃,   e₂ × e₃ = e₁,   e₃ × e₁ = e₂.
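The identities (7.38), (7.40) and (7.41) are easy to confirm numerically with the standard cross product of ℝ³, which realizes the construction above for the standard orientation. A minimal sketch (not part of the original text; numpy assumed, random test vectors):

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, z, u = (rng.standard_normal(3) for _ in range(4))
cross, dot = np.cross, np.dot

# Identity (7.38):
lhs = dot(cross(x, y), cross(z, u))
rhs = dot(x, z) * dot(y, u) - dot(x, u) * dot(y, z)
assert np.isclose(lhs, rhs)

# Triple identity (7.41) and the Jacobi identity:
assert np.allclose(cross(x, cross(y, z)), dot(x, z) * y - dot(x, y) * z)
jac = cross(x, cross(y, z)) + cross(y, cross(z, x)) + cross(z, cross(x, y))
assert np.allclose(jac, 0)
```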
Problems 1. Given a vector a +0 determine the locus of all vectors x such that x—a is orthogonal to x+a. 2. Prove that the cross product defines a Lie-algebra in a 3-dimensional inner product space (cf. problem 8, Chap. V, § 1). 3. Let e be a given unit vector of an n-dimensional inner product space E and E1 be the orthogonal complement of e. Show that the distance of a vector xeE from the subspace E1 is given by
d= (x,e)J. 4. Prove that the area of the parallelogram generated by the vectors x1 and x2 is given by A = — a)(s — b)(s — c), where
a=1x11,
b=1x21, c=1x2—xiI, s=4(a+b+c).
5. Let a +0 and b be two given vectors of an oriented 3-space. Prove that the equation x x a = b has a solution if and only if (a, b) = 0. If this condition is satisfied and x0 is a particular solution, show that the general solution is x0+2a. 6. Consider an oriented inner product space of dimension 2. Given two positive orthonormal bases (e1, e2) and (e1, e2), prove that = e1 cos w — e2 sin co = e1 sin co + e2 cos co. where co is the oriented angle between e1 and ë1.
7. Let a1 and a2 be two linearly independent vectors of an oriented Euclidean 3-space and F be the plane generated by a1 and a2. Introduce an orientation in F such that the basis a1, a2 is positive. Prove that the angle between two vectors and
is determined by the equations —
cosO
=
lxi lyl
and sinO =
lxi lyl
x a21(— ic <0 pr).
Chapter VII. Inner product spaces
2U2
8. Given an orthonormal basis linear transformations
=x x Prove
9.
1, 2, 3) in the 3-space, define
by
(v = 1,2,3).
that
Let zl be a determinant function in the plane. Prove the identity
(x. x1)zl(x,, x3) + (x. x2)4(x3, x1) + (x, x3)zl(x1,
10. Let
x2)= 0
(1= 1. 2, 3) be unit vectors in an oriented plane and denote
by
the
e
formulae
xEE.
cosO13 = cose12 cosO23 — sin e12 sine23 sin e13
11. Let x, y, z
=
cose23 + cosO12
sine23.
be three vectors of a plane such that x and y are linearly
independent and that x+y+z=0.
a)
Prove that the ordered
pairs x, y; y, z and
z,
x represent
the same
orientation. Then show that
O(x,y)+
where
the angles refer
O(y,z) + O(z,x)=
to the above orientation.
b) Prove that O(y, — x) + O(z,
—
y) + O(x, — z) =
What is the geometric significance of the two above relations?
12. Given p vectors x1, ..., G(x1,
x,,
prove the inequality 1x112 x212...
Then derive Hadamard's inequality for a determinant /a11 ... a
§ 4.
Duality in an inner product space
7.18. The isomorphism t. Let E be an inner product space ofdi let E* be a vector space which is dual to E with respect to a scalar
ii and
§ 4. Duality in an inner product space
203
product <,). Since E* and E are both dual to E it follows from sec. 2.33
such that
that there is a linear isomorphism
<xx,y>_—(x,y) x,yEE.
(7.42)
With the aid of this isomorphism we can introduce a positive definite inner product in E* given by y*). (x*, y*) = (r (7.43) 1
1
Now introduce a scalar product in Ex E* by =
(7.44)
Then it follows from (7.42) and (7.44) that
= (x,y) = (y,x) =
and so z is dual to itself. e*v (v = 1. .n) be a pair of dual bases of E and E* and consider Let the matrices = (e*v, and (7.45) = .
It follows from the symmetry of the inner product that the matrices (7.45) are symmetric. On the other hand, the linear isomorphism determines an n x n-matrix by teA
Taking the inner product with
yields
=
= (eA,
=
(7.46)
A similar argument shows that — 1
=
e,2.
From (7.46) and (7.47) we obtain that
s'
pKçK
and hence the matrices (7.45) are inverse to each other.
(7.47)
Chapter VII. Inner product spaces
204
If
x=
(7.48)
is an arbitrary vector of E we can write (7.49)
The numbers are called the Co variant components of x with respect to the basis e1 ().= I .n). It follows from (7.48) that . .
=
=
=
whence
=
(7.50)
We finally note that the covariant components of a vector XEE are its inner products with the basis vectors. In fact, from (7.48) and (7.50) we obtain that (x, e%,) =
=
=
If the basis e%,(v= l...n) is orthonormal we have that formulae (7.46) simplify to It follows that t
= and hence
every orthonormal basis of E into the dual basis. Moreover, the equations (7.50) reduce in the case of an orthonormal basis to
maps
— —
Problems 1. Let 1 ...n) be a basis of E consisting of unit vectors. Given a vector xeE write x = + h, where is the orthogonal projection of x onto the subspace defined by (x, e,) = 0. Show that
i=1...n where the are the covariant components of x with respect to the basis 2. Let E, E* be a dual pair of finite dimensional vector spaces and consider a linear isomorphism Find necessary and sufficient con-
§ 5. Normed vector spaces
205
ditions such that the bilinear function defined by
(x,y)=
x,yeE
be a positive definite inner product. § 5.
Normed vector spaces
7.19. Norm-functions. Let E be a real linear space of finite or infinite dimension. A norm-function in E is a real-valued function ii having the following properties: ii
0 for every x e E, and lix ii = 0 only if x = 0. N1: lxii N2: lix + iixil + iiyii. N3:
=
A linear space in which a norm-function is defined is called a normed linear space. The distance of two vectors x and y of a normed linear space is defined by = lix - yii. N1, N2 and N3 imply respectively
Q(x,y)>O (x, y)
Q(x,y)=
if x+y (x, z) + (z, y)
(triangle inequality)
Q(y,x).
is a metric in E and so it defines a topology in E, called the norm topology. It follows from N2 and N3 that the linear operations are continuous in this topology and so E becomes a topological vector space. 7.20. Examples. 1. Every inner product space is a normed linear space with the norm defined by Hence
iixii
2.
Let C be the linear space of all continuous functions fin the interval 1. Then a norm is defined in C by
= max Ot 1
f(t)l.
Conditions N1 and N3 are obviously satisfied. To prove N2 observe
that
1(t) + g(t)l
If (t)I + ig(t)i
ill ii +
whence
ui +
ii + iigii.
,
(0
t 1)
Chapter VII. inner product spaces
3. Consider an n-dimensional (real or complex) linear space E and let (v 1 .. .n) be a basis of E. Define the norm of a vector
= by 7.21. Bounded linear transformations. A linear transformation p: of a normed space is called bounded if there exists a number M such that
xEE.
(7.51)
It is easily verified that a linear transformation is bounded if and only if it is continuous. It follows from N2 and N3 that a linear combination of bounded transformations is again bounded. Hence, the set B(E; E) of all bounded linear transformations is a subspace of L (E; E). Let be a bounded linear transformation. Then the set = 1 is bounded. Its least upper bound will be denoted by II'pII
= sup
(7.52)
lixil = 1
It follows from (7.52) that
xeE. Now it will be shown that the function (p—* thus obtained is indeed a norm-function in B(E; E). Conditions N1 and N3 are obviously satisfied. To prove N2 let ço and be two bounded linear transformations. Then
xeE and consequently
The norm-function
has the following additional property: <
(7.53)
.
.
In fact, .
XEE
whence (7.53).
7.22. Normed spaces of finite dimension. Suppose now that E is a normed vector space of finite dimension. Then it will be shown that the norm topology of E coincides with the natural topology (cf. sec. 1 .22). Since the linear operations are continuous it has only to be shown that a linear function is continuous in the norm topology. Let (v = I, ..., 11) be a basis
§ 5. Normed vector spaces
207
of E. Then we have in view of N2 and N3 that = This relation implies that the function
is continuous in the natural
topology. Now consider the set Q c E defined by
in the naturaltopology and Q that there exists a positive constant 1)1 such that
for xeQ it follows
xeQ. Now N3 yields
xeE whence
v= 1,...,n.
(7.54)
Let f be a linear function in E. Then we have in view of (7.54) that
M
f(x)l = and sof is continuous. This completes the proof.
Since every linear transformation 'p of E is continuous (cf. sec. 1.22) it follows that 'p is bounded and hence B(E; E)=L(E; E). Thus L(E; E) becomes a normed space, the norm of a transformation 'p being given by kpM = max IIxH =1
Problems 1. Let E be a normed linear space and E1 be a subspace of E. Show that a norm-function is defined in the factor-space E/E1 by
= inf xex 1, 2...) of a normed linear 2. An infinite sequence of vectors space E is called convergent towards x if the following condition holds: To every positive number c there exists an integer N such that
if n>N.
Chapter VII. Inner product spaces
a) Prove that every convergent sequence satisfies the following Cauchycriterion: To every positive number a there exists an integer N such that — Xmh
n > N and in > N.
b) Prove that every Cauchy.sequence*) in a normed linear space of finite dimension is convergent.
c) Give an example showing that the assertion b) is not necessarily correct if the dimension of E is infinite. 3. A normed linear space is called complete if every Cauchy-sequence is convergent. Let E be a complete normed linear space and be a linear
transformation of E such that
<1. Prove that the series
convergent and that the linear transformation
v0
pV is
= has
the following properties:
a) b) 1—
§ 6.
The algebra of quaternions
7.23. Definition. Let E be an oriented Euclidean space of dimension 4. Choose a unit vector e, and let E1 denote the orthogonal complement of e. Let E1 have the orientation induced by the orientation of E and by e (cf. sec. 4.29). Observe that every vector XEE can be uniquely decomposed in the form
x=),e+x1 Now consider the bilinear map E x
X1EE1.
defined by
ex= x, xe= x =
— (x,
v) e + x x v
XEE
(7.55)
x.
(7.56)
where x denotes the cross product in the oriented 3-space E1. It is easily checked (by means of formula (7.41)) that this bilinear map makes E into an associative algebra. This algebra is called the algebra *) i.e. a sequence satisfying the Cauchy-criterion.
§ 6. The algebra of quaternions
209
of quaternions and is denoted by 0+ In view of (7.55), e is the unit element of 0-fl while (7.56) shows that
xy+yx=—2(x,y)e
x,yEE1
and so the algebra 0-fl is not commutative.
The elements of E are called quaternions, and the elements of E1 are
called pure quaternions. The conjugate of a quaternion e R, X1EE1) is defined by
=
C
— X1
It is easily verified that X,yEE and
XEE.
Note that xeE1 if and only if x= —x. Next, let XEO-fl be arbitrary and write
x
=
e
+ (x1, x1) e = (x, x) e =
e.
Similarly,
x=
e.
Thus we have the relations x=
=
x
e.
Now assume that x+O and define x' by 1
X
X.
=
Then the relations above yield
x x' =
x=e
(x + 0).
Thus every non-zero element in has a left and right inverse and is a division algebra. Finally note that the multiplication in satisfies the relations I-fl
so 0-0
0-fl
(x y, x z) = and
z)
(y. x, z. x) = (y,
x, y, ZE0-fl
z)
(7.57)
(7.58)
which are easily verified using (7.36) and (7.38). In particular, we have
= 14
Greub. Linear Algebra
.
yI
x, yE0+
Chapter VII. Inner product spaces
Proposition I: The 3-linear function in E1 given by = (x1 x2, x3)
x1,
X3EE1
is a normed determinant function in E1 and represents the orientation of E1.
Proof: First we show that 4 is skew symmetric. In fact, formulae (7.58) and (7.57) imply that 4(x1, x, x) =
(x1
.
x, x) =(x1, e)
=0
and zl(x, x2, x) = (x . x2, x) = (x2, e) x12 = 0.
Thus zl is skew symmetric. Next observe that 4(x1, x2. x3) = {4(x1, x2, x3) = —
4(x2, x1. x3)}
x x2, x3).
.
This relation shows that 4 is a normed determinant function in E1 and represents the orientation of E1. 7.24. Associative division algebras. Let A be an associative division algebra (cf. sec. 5.1) with unit element e. Observe that the real numbers, the complex numbers and the quaternions form associative division The dimensions of these algebras are respectively 1, 2 algebras over and 4. We shall show that these are the on/i finite dimensional associative division algebras over Let A be any such algebra with unit element e. Let aEA be arbitrary and consider the n + 1 powers
a'(v=0
n,a°=e)
n=dimA.
Since these elements are linearly dependent we have a non-trivial relation C!' = 0
This relation can be written in the form
f(a) = where f is the polynomial given by
f(t)
tv
0
(7.59)
§ 6. The algebra of quaternions
211
By the fundamental theorem of algebra, J is a product of irreducible polynomials of degree 1 and 2. Since A is a division algebra, it follows from (7.59) that a satisfies an equation of the form
p(a)=O where p is of degree two or one. Equivalently, every element a E A satisfies
an equation of the form = —fl2e
(a
Lemma I: If x and v are elements of A satisfying x2 =
—e
and y2 =
— e,
then
Proof? Without loss of generality we may assume that y ± x. Then the elements e, x. v are linearly independent as is easily checked.
Now observe that x+v and x—v satisfy quadratic equations + 2fie = 0
(7.60)
+?(x—y)+2àe=O
(7.61)
(x + y)2 +
and
(x—y) 2
+
Adding these equations and observing that x2 = + ;') x +
y
—
=—e
we obtain
+ 2(fl + —2) e = 0.
Since the vectors e, x and y are linearly independent, the equation above yields
cx+;'=O, fi
=
+
2
and therefore Now equations (7.60) and (7.61) reduce to and
(x+v)2 = —2fle
(7.62)
=
(7.63)
(x — y)2
e.
—
Since the vectors e, x. y are linearly independent the polynomial t2 + 2/i must be irreducible and so /3>0. Similarly, 5 >0. Now the equation j3 + 5 2 implies that 0< /3<2. Finally, relation (7.62) yields
xv + vx = 2(1 Since 0< /3 <2, the lemma follows.
—
/3) e.
Chapter VII. Inner product spaces
Lemma II: Let F be the subset of A consisting of those elements x which satisfy
=
—
for some
e
Then
(i) F is a vector space. (ii) A=(e)EJ3F where (e) denotes the 1-dimensional subspace generated by e. (iii) IfxEF and yEF, then xy+vxE(e) and Xy—VXEF. Now let xcF and VEF. Proof: (i) If XEF, then clearly, ).xEF for We may assume that x2 = — e and y2 = — e. Then Lemma I yields + xv + yx =
(x + 1)2 = x2 +
— 1)
e,
I
1.
It follows that x+yEF. Thus F is a vector space. (ii) Clearly, (e) fl F=O. Finally, let aEA be arbitrary. Then a satisfies an equation of the form (a + e)2 =
— 112
flE
e
Set
and
Then x1E(e), x2EF and x1+x2=a.
(iii) Let XEF and VEF. Again we may assume that x2 = —e and —e. Then, by Lemma I, xy+yxE(e). To show that xy— VXEF observe that xv —
yx =
— (x
+ y)(x —
v)
= (x
—
y)(x + v)
Thus (x v
v
x)2 = — (x + y) (x —
y)2
(x + v).
Since, by (i), x+VEF and x—yEF we have =— e (x + (x—y)2= —f32e
whence (xv — vx)2 =
—
112
e
and so xy—yxEF. 7.25. The inner product in F. Let xEF and yEF. Then, by Lemma II, xy+vxE(e). Thus a symmetric bilinear function, (,), is defined in F by xv + y x =
— 2(x, y) e
Since x2
—
(x, x) e
x, yEF.
(7.64)
§ 6. The algebra of quaternions
it follows that (,)
is
213
positive definite and so it makes F into a Euclidean
space.
On the other hand, again by Lemma II sec. 7.24, x y — y xe F. Hence a F x F—+F is defined by skew symmetric bilinear map
xy—yx=2W(x,y)
X,yEF.
(7.65)
Formulae (7.64) and (7.65) imply that
xy=—(x,y)e+W(x,y)
X,yEF.
(7.66)
Finally, observe that
(xy—yx,y)=O
X,yEF
(7.67)
since
(xy—yx))' +y(xy—vx)= xy2 —y2 x= fl2(xe—ex)=O.
7.26. Theorem: Let A he a finite dimensional associative division algebra over
Then A
C or A
A
9-0.
Proof: We may assume that n2 (n=dim A). If n=2, then F has dimension 1. Choose a vector JE F such that phism A 4 C is given by
= — e.
Then an isomor-
Now consider the case n> 2. Then dim F 2. Hence there are unit vectors e1EF, C2EF such that (e1, e2) = 0.
It follows that e1 e2 + e2 e1 = — 2(e1, e2) e = 0.
Now set e3 =
Then we have
2_
,
e3 — e1 e2 c1
e1
e2.
2 2_ ,— — — e1 e2 —
whence e3EF. Moreover, e1 e3 =
— e2,
e2 e3 =
These relations imply that (e3, e3)= 1 and
(e1, e3) = (e2, e3) = 0.
Thus the vectors e1, e2, e3 are orthonormal.
e1.
e
Chapter VII. Inner product spaces
214
In particular. it follows that dim Now we show that the vectors e1. c2. e3 span F. In fact, let :eF. We may assume that :2 = e. Then —
= :e1 e7 — e1e2:
= [— 2(:, e1) e — e1 :] C2 — C1 [— 2(:. e2) — = e2] = —2(:, e1)e2 + 2(:, e2)e1. On
the other hand, _C3
+
e3
= —2(:, e3)e.
Adding these equations we obtain
= —(, e1) e2 + (-.
—
(:, e3) e
whence
=
(, e1) C1 + L e2) e2 + (,
C3.
This shows that: is a linear combination of e1, e2 and e3. Hence dim F= 3
and so dim A = 4. Finally, we show that A is the quaternion algebra. Let A be the trilinear function in F defined by A(x, y, ) = (V'(x,
p),)
x. y,
:EF.
(7.68)
We show that A is skew symmetric. Clearly, A(x, x, =)=O. On the other hand, formula (7.67) yields A(x,
y)
= (W(x. v), i') =
—
Thus A is a determinant in F. Since A(e1, e2, e3) =
— C2
e3) = (e3, e3) =
1,
A is a norined determinant function in the Euclidean space F. In particular, A specifies an orientation and so the cross product is defined in F. Now relation (7.68) implies that v) = x x v
x, VEF.
Combining (7.66) and (7.69) yields
xi'=—(x,v)e+xxy
X,I'EF.
Since on the other hand,
xe=ex=x and A =
XEA
it follows that A is the algebra of quaternions.
(7.69)
§ 6. The algebra of quaternions
215
Remark: It is a deep theorem of Adams that if one does not require associativity there is only one additional finite dimensional division
algebra over
It is the algebra of Cavley numbers defined in Chap. XI, § 2,
problem 6.
Problems 1. Let a 0 be a quaternion which is not a negative multiple of e. Show that the equation x2 = a has exactly two solutions. (1= 1, 2, 3) are three vectors in E1 prove the identity 2. If Y2
Y3 = — 4(Yl' Y2' y3) e — (y1, y2) y3 + (y1, y3) y2 — (y2,
y1.
3. Let p be a fixed quaternion and consider the linear transformation p given by p x = px. Show that the characteristic polynomial of 'p reads
f(t) = (t2 - 2(p, e) t + p12)2. Conclude that p has no real eigenvalues unless p is a multiple of e.
Chapter VIII
Linear mappings of inner product spaces In this chapter all linear spaces are assumed to be real and to have finite dimension
§ 1. The adjoint mapping 8.1. Definition. Consider two inner product spaces Eand Fand assume that a linear mapping p: E—*Fis given. If E* and F* are two linear spaces
dual to E and F respectively, the mapping ço induces a dual mapping The mappings p and are related by
x>
xe E, y* e F*.
(8.1)
Since inner products are defined in E and in F, these linear spaces can be considered as dual to themselves. Then the dual mapping is a linear mapping of F into E. This mapping is called the adjoint mapping of p and will be denoted by Replacing the scalar product by the inner product in (8.1) we obtain the relation
(px,y)=(x,çzy) xEE,yeF.
(8.2)
In this way every linear mapping p of an inner product space E into an inner product space F determines a linear mapping of F into E. The adjoint mapping of is again p. In fact, the mappings and are related by (8.3)
(y,
Equations (8.2) and (8.3) yield
xeE,yeF = ço. Hence, the relation between a linear mapping and the adjoint mapping is symmetric. As it has been shown in sec. 2.35 the subspaces Im çø and ker are whence
§ 1. The adjoint mapping
217
orthogonal complements. We thus obtain the orthogonal decomposition (8.4)
8.2. The relation between the matrices. Employing two bases (v = 1.. .n) and = 1.. .m) of E and of F, we obtain from the mappings and and two matrices defined by the equations
= and
Substituting
and
in (8.2) we obtain the relation (8.5)
= Introducing the components = (xv, xA)
and
=
YK)
of the metric tensors we can write the relation (8.5) as
Multiplication by the inverse matrix gVQ yields the formula (8.6)
Now assume that the bases normal, =
(v =
1. . .n)
and YL (ji = 1. . . m) are ortho-
=
Then formula (8.6) reduces to
= This relation shows that with respect to orthonormal bases, the matrices of adjoint mappings are transposed to each other. 8.3. The adjoint linear transformation. Let us now consider the case that F= E. Then to every linear transformation 'p of E corresponds an adjoint transformation Since is dual to 'p relative to the inner pro*) The
subscript indicates the row.
218
Chapter VIII. Linear mappings of inner product spaces
duct, it follows that det
= det
and
tr
The adjoint mapping of the product i/is l/ioço
The matrices of
tr ço. is
given by
=
and çø relative to an orthonormal basis are transposes
of each other.
Suppose now that e and ë are eigenvectors of p and
respectively.
Then we have that
çoe=2e and whence in view of (8.2) —
2) (e, e)
= 0.
It follows that (e, e) = 0 whenever 2; that is, any two eigenvectors of p and whose eigenvalues are different are orthogonal. 8.4. The relation between linear transformations and bilinear functions. Given a linear transformation ço:E—*E consider the bilinear function
1i(x,y) = (px,y).
(8.7)
The correspondence p —* c/i defines a linear mapping
Q:L(E; E) denotes the space of bilinear functions in Ex E. It will be shown that this linear mapping is a linear isomorphism of L(E; E) onto B(E, E). To prove that is regular, assume that a certain ço determines the zero-function. Then ('p x, y) =0 for every x e E and every ye E, whence 'p = 0.
It remains to be shown that is a mapping onto B(E, E). Given a bilinear function c/i, choose a fixed vector xeE and consider the linear funcdefined by = cIl(x,y). By the Riesz-theorem (cf. sec. 7.7) this function can be written in the form
L(y) = (x',y) where
the vector x'eE is uniquely determined by x.
§ 1. The adjoint mapping
219
Define a linear transformation ço:E—÷E by
X = X'. Then
xeE,yeE. Thus, there is a one-to-one correspondence between the linear transformations of E and the bilinear functions in E. In particular, the identitymap corresponds to the bilinear function defined by the inner product. Let çji be the bilinear function which corresponds to the adjoint transformation. Then
=(x,py) =
=
((py,x)
This equation shows that the bilinear functions cb and çji from each other by interchanging the arguments. 8.5. Normal transformations. A linear transformation p : normal, if
are
obtained is called (8.9)
The above condition is equivalent to
x,yeE.
(8.10)
In fact, assume that çø is normal. Then
(px,py) =
=(x,p
(x,
=
Conversely, condition (8.10) implies that (y,
=(py,px) =
= (y,p
whence (8.9).
Formula (8.10) is equivalent to xeE. This relation implies that the kernels of p and
coincide,
kerq = Hence, the orthogonal decomposition (8.4) can be written in the form (8.11)
Relation (8.11) implies that the restriction of 'p to Im 'p is regular. Hence, has the same rank as 'p. The same argument shows that all the trans-
formations p"(k=2, 3...) have the same rank as 'p.
Chapter VIII. Linear mappings of inner product spaces
220
It is easy to verify that if ço is a normal transformation then so is p —Ai, Hence it follows that
Ae 11.
ker(q — 2z)=
—
2i).
In other words, çø and have the same eigenvectors. Now the result at the end of sec. 8.3. implies that every two eigenvectors of a normal transformation whose eigenvalues are different must be orthogonal. be a linear transformation and assume that an orthogonal Let ço decomposition the restriction Then p is normal if and only if the subspaces are stable under and the transformations are normal. In fact, assume that p is normal and let be arbitrary. Then we
is given such that the subspaces E1 are stable. Denote by
of p to
have for every
= 0.
=
This implies that The normality of
Thus E, is stable under j+ whence follows immediately from the relation
Conversely, assume that
is stable under
and
that
is normal. Then
we have for every vector that 1px12 =
and
=
=
Vpjxjl2 =
=
so p is normal.
Problems 1. Consider two inner product spaces E and F. Prove that an inner product is defined in the space L(E; F) by
Derive the inequality
and show that equality holds only if
§ 2. Selfadjoint mappings
221
be the adjoint transbe a linear transformation and 2. Let (p: formation. Prove that ifFc:Eis stable under (p, then F' is stable under 3. Prove that the matrix of a normal transformation of a 2-dimensional space with respect to an orthonormal basis has the form fl\
/
)
§ 2.
11
or
Selfadjoint mappings
8.6. Eigenvalue problem. A linear transformation p: E—+E is called p or equivalently
selfadjoint if =
x,yeE.
(cox,y)=(x,coy)
The above equation implies that the matrix of a selfadjoint transformation relative to an orthonormal basis is symmetric. If p: E is a stable subspace
then the orthogonal complement F' is stable as well. In fact, let ZEF' be any vector. Then we have for every yeF
(coz,y) =(z,çoy)= 0 whence pzEF'. It is the aim of this paragraph to show that a selfadjoint transformation of an n-dimensional inner product space E has n eigenvectors which are mutually orthogonal. Define the function F by
F(x)=
(x,px)
x+O.
(x,x)
(8.12)
This function is defined for all vectors x +0. As a quotient of continuous functions, F is also continuous. Moreover, F is homogeneous of degree zero, i.e.
F(2x)=F(x)
(2+0).
(8.13)
Consider the function F on the unit sphere lxi = 1. Since the unit sphere is a bounded and closed subset of E, F assumes a minimum on the sphere lxi = 1. Let e1 be a unit vector such that
F(e1) F(x)
(8.14)
222
Chapter VIII. Linear mappings of inner product spaces
for all vectors
=
1.
Relations (8.13) and (8.14) imply that
F(e1) F(x)
(8.15)
for all vectors x+O. in fact, if x*O is an arbitrary vector, consider the corresponding unit-vector e. Then x=JxIe, whence in view of (8.13)
F(x) = F(e) F(e1). Now it will be shown that e1 is an eigenvector of çø. Let y be an arbitrary vector and define the functionf by
f(t) =
F(e1
+ tj).
(8.16)
Then it follows from (8.15) thatf assumes a minimum at t=O, whence f' (O)=O. inserting the expression (8.12) into (8.16) we can write (e1 +
ty,e1 + ty) Ditlèrentiating this function at t=O we obtain
f'(O) =
+
—
(8.17)
Since ço is selfadjoint,
= and hence equation (8.17) can be written as
f'(O) = 2((pe1,y) —
2(e1,çoe1)(e1,y).
(8.18)
We thus obtain (8.19)
for every vector yEE. This implies that pe1
i.e. e1 is an eigenvector of 'p and the corresponding eigenvalue is = (e1,'pe1).
8.7. Representation in diagonal form. Once an eigenvector of 'p has been constructed it is easy to find a system of ii orthogonal eigenvectors. In fact, consider the 1-dimensional subspace (e1) generated by e1. Then (e1) is stable under 'p and hence so is the orthogonal complement E1 of
§ 2. Selfadjoint mappings
223
Clearly the induced linear transformation is again selfadjoint and hence the above construction can be applied to E1. Hence, there exists an eigenvector e2 such that (e1, e2)=0. Continuing this way we finally obtain a system of n eigenvectors (v = 1.. .n) such that (e1).
e,4) =
form an orthonormal basis of E. In this basis the mapping p has the form The eigenvectors
(8.20)
denotes the eigenvalue of These equations show that the matrix of a selfadjoint mapping has diagonal form if the eigenvectors are used as a basis. 8.8. The eigenvector-spaces. If 2 is an eigenvalue of 'p, the corresponding eigen-space E(2) is the set of all vectors x satisfying the equation px=2x. Two eigen-spaces E(2) and E(2') corresponding to different eigenvalues are orthogonal. In fact, assume that where
çø e = 2 e
p e' = 2' e'.
and
Then (e', 'p e) = 2 (e, e') and
(e, 'p e')
=
2' (e,
e').
Subtracting these equations we obtain
(2'— 2)(e,e') = 0, whence (e, e') = 0 if 1' + 2. Denote by (v = . .r) the different eigenvalues of 'p. Then every two eigenspaces and are orthogonal. Since every vector xeE can be written as a linear combination of eigenvectors it follows that the direct sum of the spaces is E. We thus obtain the orthogonal decomposition 1
.
(8.21)
Let
be the transformation induced by 'p in ço,x
Then
xeE(23.
=
This implies that the characteristic polynomial of
det
—
2i) = (2
—
2)ki
(1 =
is given by 1
... r)
(8.22)
where k. is the dimension of E(21). It follows from (8.21), and (8.22)
224
Chapter VIII. Linear mappings of inner product spaces
that the characteristic polynomial of det((p —)
1)
is equal to the product
— 2)k1
=
(2r
—
(8.23)
The representation 8.23 shows that the characteristic polynomial of a selfadjoint transformation has n real zeros, if every zero is counted with its multiplicity. As another consequence of (8.23) we note that the dimenin sion of the eigen-space is equal to the multiplicity of the zero the characteristic polynomial. 8.9. The characteristic polynomial of a symmetric matrix. The above has n real eigenresult implies that a symmetric n x n-matrix A = values. In fact, consider the transformation =
(v =
1
... n)
(v = 1.. .n) is an orthonormal basis of E. Then çø is selfadjoint and hence the characteristic polynomial of p has the form (8.23). At the same time we know that
where
det ('p — 2 i)
= det (A — 2 J).
(8.24)
Equations (8.23) and (8.24) yield det(A — 2J) =
— 2)k1 ...
—
8.10. Eigenvectors of bilinear functions. in sec. 8.4 a one-to-one correspondence between all the bilinear functions qi in E and all the linear transformations 'p E-+E has been established. A bilinear function and the corresponding transformation 'p are related by the equation :
cli(x,y)=(çox,y)
x,yeE.
Using this relation, we define eigenvectors and eigenvalues of a bilinear function to be the eigenvectors and eigenvalues of the corresponding transformation. Let e be an eigenvector of and 2 be the corresponding eigenvalue. Then çJi
(e,
y) =
('p e,
y) =
2 (e,
y)
(8.25)
for every vector yeE. Now assume that the bilinear function k is symmetric,
cli(x,y)= cli(y,x). Then the corresponding transformation 'p
is
selfadjoint. Consequently,
§ 2. Selfadjoint mappings
225
there exists an orthonormal system of ii eigenvectors
(v = 1... n).
=
(8.26)
This implies that cP
2v
Hence, to every symmetric bilinear function in E there exists an orthonormal basis of E in which the matrix of P has diagonal-form.
Problems 1. Prove by direct computation that a symmetric 2 x 2-matrix has only real eigenvalues. 2. Compute the eigenvalues of the matrix
-1 —1 —2 2
L 3.
Find the eigenvalues of the bilinear function v*1L
4. Prove that the product of two selfadjoint transformations p and i,1i is selfadjoint if and only if cli p = 'p 0k/i. 5. A selfadjoint transformation 'p is called positive, if
(x,çox) 0 for every xEE. Given a positive selfadjoint transformation ço, prove that there exists exactly one positive selfadjoint transformation i/i such that
6. Given a selfadjoint mapping çø, consider a vector be(ker p)'. Prove that there exists exactly one vector aEker p' such that çoa=b. 7. Let 'p be a selfadjoint mapping and let 1 ...n) be a system of n orthonormal eigenvectors. Define the mapping 'pA by 'pA = 'p — 2t
where 2
is
a real parameter. Prove that PA
provided that 2 15
Greub. Linear Algebra
is
not an eigenvalue of 'p.
xeE.
Chapter VIII. Linear mappings of inner product spaces
226
8. Let p be a linear transformation of a real n-dimensional linear space E. Show that an inner product can be introduced in E such that becomes a selfadjoint mapping if and only if has n linearly independent eigenvectors. 9. Let be a linear transformation of E and the adjoint map. Denote the norm of which is induced by the Euclidean norm of E (cf. by sec. (7.19)). Prove that = where 2 is the largest eigenvalue of the selfadjoint mapping
çø.
10. Let ço be any linear transformation of an inner product space E. Prove that ço is a positive self-adjoint mapping. Prove that
if and only if ço is regular.
11. Prove that a regular linear transformation 'p of a Euclidean space can be uniquely written in the form (p = 0o1
is a positive selfadjoint transformation and z is a rotation. Hint: Use problems 5 and 10. (This is essentially the unitary trick of
where
Weyl).
§ 3.
Orthogonal projections
8.11. Definition. A linear transformation rc:E—÷E of an inner product space is called an orthogonal projection if it is selfadjoint and satisfies the condition it2 = it. For every orthogonal projection we have the orthogonal decomposition
E = ker it
Im it
and the restriction of it to Im it is the identity. Clearly every orthogonal
projection is normal. Conversely, a normal transformation 'p which satis= = 'p is an orthogonal projection. In fact, since fies the relation we can write
x='px+x1
x1eker'p.
Since 'p is normal we have that ker 'p = ker
and so it follows that
x1eker Hence
we obtain for an arbitrary vector yEE
(x,'py)=('px,'py) +(x1,q,y) =(px,'py)
= (px,'py)
§ 3. Orthogonal projections
227
whence
(x,(py) =
(y,çox).
It follows that p is selfadjoint. To every subspace E1 E there exists precisely one orthogonal projection it such that Im it=E1. It is clear that iris uniquely determined byE1. and define it by To obtain it consider the orthogonal complement iry = y,yeE1;
irz =
Then it is easy to verify that it2 = and = Consider two subspaces E1 and E2 of E and the corresponding orthogonal projections it 1: E—+ E1 and it2 E—3 E2. It will be shown that it2 it1 =0 if and only if E1 and E2 are orthogonal to each other. Assume first that =0. ConE1 ±E2. Then it1xEE± for every vector xeE, whence it2 versely, the equation =0 implies that it1 xeE± for every vector xeE, whence E1EE±. 8.12. Sum of two projections. The sum of two projections it1 and it2:E-+E2 is again a projection if and only if the subspaces E1 and E2 are orthogonal. Assume first that E1 ±E2 and consider the transformation Then it.
it
:
it(x1+x2)=itx1+irx2=x1+x2 Hence, it reduces
x1eE1,x2eE2.
to the identity-map in the sum E1$E2. On the other
hand,
itx=0
if
is the orthogonal complement of the sum n it is the projection of E onto Conversely, assume that it1 + it2 is a projection. Then But
(it1 + it2)2 =
it1
and hence
+ it2,
whence it1 This
it2 + it2 it1 =
0.
(8.27)
equation implies that
it1oit2oit1+it2oit1=0
(8.28)
=0.
(8.29)
and
it1oit2 + Adding
(8.28) and (8.29) and using (8.27) we obtain
=0.
(8.30)
Chapter VIII. Linear mappings of inner product spaces
The equations (8.30) and (8.28) yield
0.
7C2oit1
This implies that E1 I E2, as has been shown at the end of sec. 8.11. 8.13. Difference of two projections. The difference it1 — it2 of two proand it2 : E-+E2 is a projection if and only if E2 is a subjections it1 : space of E1. To prove this, consider the mapping
= t —(itt —it2)=(i —it1)+ it2. it follows that p is a projection if Since i —itt is the projection If this condition is fuland only if i.e., if and only if E1 This implies that filled, p is the projection onto the subspace —it2 = i —p is the projection onto the subspace n
This subspace is the orthogonal complement of E2 relative to E1.
8.14. Product of two projections. The product of two projections it1 : E—+E1 and it2 E—*E2 is an orthogonal projection if and only if the projections commute. Assume first that it2 it1 = it1 it2. Then it2 it1 x = it2 x = x for every vector
XE E1 fl E2.
(8.31)
On the other hand, it2 it1 reduces to the zero-map in the subspace (E1 n E2)' =
+ E±. In fact, consider a vector X=
+ X±
Then it2it1X =
=
+
+ it1
= 0.
Equations (8.31) and show that it2 0it1 is the projection Conversely, if it2 cit1 is a projection, it follows that it2
=
7t1 OTt2
(8.32) n E2.
= it1
Problems 1. Prove that a subspace JcE if and only if
J=Jn
is
stable under the projection it:E—+E1
§ 4. Skew mappings
229
2. Prove that two projections it1 : E—+E1 and it2 : E—+E2 commute if and only if
E1 +E2 =E1 n E2 +E1 fl
E at a subspace E1 is defined by
ox = where
x=p+h(peE1,
tion it:
p
—
h
Show that the reflection
and the projec-
are related by
= 2ir — 4. Consider linear transformation of a real linear space E. Prove that an inner product can be introduced in E such that p becomes an orthogonal projection if and only if p2 = 5. Given a selfadjoint mapping p of E, consider the distinct eigenvalues denotes and the corresponding eigenspaces 1 ...r). If the orthogonal projection prove the relations:
a)
(i+j).
b) c)
§ 4. Skew mappings
8.15. Definition. A linear transformation in E is called skew if —!i. The above condition is equivalent to the relation
x,yeE.
(8.33)
It follows from (8.33) that the matrix of a skew mapping relative to an orthonormal basis is skew symmetric. Substitution of y=x in (8.33) yields the equation
(x,t/ix)=O
xeE
(8.34)
showing that every vector is orthogonal to its image-vector. Conversely, a transformation having this property is skew. In fact, replacing x by x+y in (8.34) we obtain = 0, (x + + whence (y,
x) + (x, t/i v) = 0.
Chapter VIII. Linear mappings of inner product spaces
23()
It follows from (8.34) that a skew mapping can only have the eigenvalue I;. =0. The relation /i= —p/i implies that
tn/i =
0
and
den/i = The last equation shows that
(—
detk/J = 0
if the dimension of E is odd. More general, it will now be shown that the rank of a skew transformation is always even. Since every skew mapping is normal (see sec. 8.5) the image space is the orthogonal complement of the kernel. Consequently, the induced transformation : Im k/i—*Im is regular. Since is again skew, it follows that the dimension of Im must be even. It follows from this result that the rank of a skew-symmetric matrix is always even. 8.16. The normal-form of a skew symmetric matrix. In this section it will be shown that to every skew mapping cli an orthonormal basis (v = 1 n) can be constructed in which the matrix of i/i has the form .
. .
0
K1
— K1
0
(8.35) —
0
=
Consider the mapping Then 8.7, there exists an orthonormal basis 'p
=
=
According to the result of sec. (v = 1 ii) in which 'p has the form . . .
(v = I ... ii)
All the eigenvalues L, are negative or zero. In fact, the equation
pe=i.e
eJ=l
§ 4. Skew mappings
231
implies that 2
= (e,(pe) =
=
—
0.
(t/ie,çl'e)
has the same rank as i/i, the rank of Since the rank of is even and must be even. Consequently, the number of negative eigenvalues is even and we can enumerate the vectors (v = 1 . n) such that (p
. .
(v=2p+ in).
(v= l...2p) and Define the orthonormal basis
(v = 1. . ii) by .
(v=1...p) and
(v=2p+1...n).
In this basis the matrix of
has the form (8.35).
Problems 1. Show that every skew mapping (p of a 2-dimensional inner product space satisfies the relation (p
(p
(x, y).
2. Skew transfbrmations of 3-space. Let E be an oriented Euclidean 3-space.
(i) Show that every vector aEE determines a skew transformation of E given by (ii) Show that
x x.
a, bEE. = — (iii) Show that every skew map p: E—÷E determines a unique vector aEE such that given by Hint: Consider the skew 3-linear map P: E x E x
P(x, v, :)=(px, v) :+(pv. :) .v+(p:, x) y and choose aEE to be the vector determined by v. :) = zl(x, v, :)a
sec. 4.3, Proposition II). (iv) Let e1. e2, e3 be a positive orthonorma! basis of E. Show that the vector a of part (iii) is given by (cf.
a= where
C1 +
C2 +
C3
is the matrix of (p with respect to this basis.
Chapter VIII. Linear mappings of inner product spaces
232
3. Assume that p4O and are two skew mappings of the 3-space having the same kernel. Prove that i/i = ). çø where is a scalar. 4. Applying the result of problem 3 to the mappings çox = (a1 x a2) x x and
— a1(a2,x)
x= prove
the formula (a1 x a2) x a3 = a2(a1,a3) — a1(a2,a3).
E—*E satisfies the relation that a linear transformation if and only if p is selfadjoint or skew. 6. Show that every skew symmetric bilinear function P in an oriented 5. Prove
3-space E can be represented in the form
cP(x,y)=(x xy,a) and that the vector a is uniquely determined by 'P. 7. Prove that the product of a selfadjoint mapping and a skew mapping has trace zero. 8. Prove that the characteristic polynomial of a skew mapping satisfies the equation
x(- = (-
From this relation derive that the coefficient 0f is zero for every odd v. 9. Let çø be a linear transformation of a real linear space E. Prove that an inner product can be defined in E such that ço becomes a skew mapping if and only if the following conditions are satisfied: 1. The space E can be decomposed into ker 'p and stable planes. 2. The mappings which are induced in these planes have positive determinant and trace zero. 10. Given a skew symmetric 4 x 4-matrix A = verify the identity
detA = §
5.
+
+
Isometric mappings
8.17. Definition. Consider two inner product spaces E and F. A linear mapping 'p:E—*F is called isometric if the inner product is preserved under
('px1,'px2)=(x1,x2) x1,x2eE. Setting x1 =
= x we find
kpxl=IxI xeE.
§ 5. Isometric mappings
Conversely, the above relation implies that
=
+
x2)12
is
233
isometric. In fact,
—
—
= x1 + x2f2 — 1x112 — 1x212 = 2(x1,x2).
an isometric mapping preserves the norm it is always injective. We assume in the following that the spaces E and F have the same dimension. Then every isometric mapping ço:E—÷Fis a linear isomorphism of E onto F and hence there exists an inverse isomorphism p 1: F—FE. The isometry of implies that Since
((px,y)=(x,p'y)
xeE,yeF,
whence (8.36)
Conversely every linear isomorphism çü satisfying the equation (8.36) is isometric. In fact,
= (x1,x2).
=
(x1,
The image of an orthonormal basis
(v =
1. . . n)
of E under an isometric
mapping is an orthonormal basis of F. Conversely, a linear mapping which sends an orthonormal basis of E into an orthonormal basis (v = 1.. .n) of F is isometric. To prove this, consider two vectors
and then
and whence
(pxl,px2) =
=
=
=
It follows from this remark that an isometric mapping can be defined between any two inner product spaces E and F of the same dimension: Select orthonormal bases and (v = 1. n) in E and in F respectively . .
and define ço by
8.18. The condition for the matrix. Assume that an isometric mapping lii) we obtain from p:E—*Fis given. Employing two bases and ço an n x n-matrix by the equations =
234
Chapter VIII. Linear mappings of inner product spaces
Then the equations ((p
(p aM) =
aM)
can be written as (b), bK) = (as,, aM).
Introducing the matrices
=
and
(au,
h)K
(b1, bK)
we obtain the relation
=
(8.37)
Conversely, (8.37) implies that the inner products of the basis vectors are preserved under (p and hence that (p is an isometric mapping. If the bases and are orthonormal,
=
=
,
relation (8.37) reduces to
= showing that the matrix of an isometric mapping relative to orthonormal
bases is orthogonal. 8.19. Rotations. A rotation of an inner product space E is an isometric mapping of E into itself. Formula (8.36) implies that
(detp)2 =
1
showing that the determinant of a rotation is ± 1. A rotation is called proper if det (p = + 1 and improper if det (p = — 1. imEvery eigenvalue of a rotation is ± 1. In fact, the equation plies that let = let, whence )= ± 1. A rotation need not have eigenvecvectors as can already be seen in the plane (cf. sec. 4, 17). Suppose now that the dimension of E is odd and let be a proper rotation. Then it follows from sec. 4.20 that has at least one positive eigenvalue ).. On the other hand we have that ). = ± 1 whence 2= 1. Hence, every proper rotation of an odd-dimensional space has the eigenvalue 1. The corresponding eigenvector e satisfies the equation (pe=e; that is, e A similar argument shows that to every imremains invariant under proper rotation of an odd-dimensional space there exists a vector e such that pe= —e. If the dimension of E is even, nothing can be said about
§ 5. Isometric mappings
235
the existence of eigenvalues for a proper rotation. However, to every im-
proper rotation, there is at least one invariant vector and at least one vector e such that çoe= —e (cf. sec. 4.20). Let ço:E-+E be a rotation and assume that FcE is a stable subspace. Then the orthogonal complement F' is stable as well. In fact, if z is arbitrary we have for every yeF (coz,y) =
=
0
whence çozeF'.
The product of two rotations is obviously again a rotation and the inverse of a rotation is also a rotation. In other words, the set of all rotations of an n-dimensional inner product space forms a group, called the general orthogonal group. The relation det (p2 ° pi) = det c°2 det q1
implies that the set of all proper rotations forms a subgroup, the special orthogonal group.
A linear transformation of the form 2 ço where 2 0 and p is a proper rotation is called homothetic. 8.20. Decomposition into stable planes and straight lines. With the aid of the results of § 2 it will now be shown that for every rotation 'p there exists an orthogonal decomposition of E into stable subspaces of dimension 1 and 2. Denote by E1 and E2 the eigenspaces which correspond to the eigenvalues 2 + 1 and 2 = — 1 respectively. Then E1 is orthogonal to E2. In fact, let x1 eE1 and x2eE2 be two arbitrary vectors. Then
'px1=x1 and 'px2=—x2. These
equations yield
(x1,x2) = — (x1,x2), whence (x1, x2)=0. It follows from sec. 8.19 that the subspace F= (E1 is again stable under 'p. Moreover, F does not contain an eigenvector of 'p and hence F has even dimension. Now consider the selfadjoint mapping
of F. The result of sec. 8.6 assures that there exists an eigenvector e of i/i. If 2 denotes the corresponding eigenvalue we have the relation çoe
+
'p' e = 2e.
Chapter VIII. Linear mappings of inner product spaces
236
Applying p we obtain ço2e=)Lçce—e.
(8.38)
Since there are no eigenvectors of çø in F the vectors e and e are linearly independent and hence they generate a plane F1. Equation (8.38) shows
that this plane is stable under 'p. The induced mapping is a proper rotation (otherwise there would be eigenvectors in F1). F is again stable The orthogonal complement under and hence the same construction can be applied to Fj'-. Continuing in this way we finally obtain an orthogonal decomposition ofF into mutually orthogonal stable planes. Now select orthonormal bases in E1, 122 and in every stable plane. These bases determine together an orthonormal basis of E. In this basis the matrix of 'p has the form
cosO1
sin 01
—sin01
cos01
±1
—
where
(i =
1.
. .
k) denotes
(v =
1
...p)
2k=n—p
COSOk
sinOk
sin °k
C05 °k
the corresponding rotation angle (cf. sec. 8.21).
Problems 1. Given a skew transformation k/I of E, prove that
is a rotation and that — is not an eigenvalue of 'p. Conversely, if 'p is a rotation, not having the eigenvalue — I prove that =('p — i)o('p + 1)-i is a skew mapping. 2. Let 'p be a regular linear transformation of a Euclidean space E such that ('p x, coy) = 0 whenever (x, y) = 0. Prove that 'p is of the form 'p = )L t, ).+O where z is a rotation. 1
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
237
3. Assume two inner products qi and 'P in E such that all oriented angles with respect to and 'I' coincide. Prove that 'P(x, y) where 2>0 is a constant. 4. Prove that every normal non-selfadjoint transformation of a plane is homothetic. 5. Let be a mapping of the inner product space E into itself such that (pO=0 and çox — (P11 = lx — x. VEE.
Prove that
is then linear.
there exists a continuous 6. Prove that to every proper rotation family (p(0t I) of rotations such that and (P1 = 7. Let be a linear automorphism of an n-dimensional real linear space E. Show that an inner product can be defined in E such that p becomes a rotation if and only if the following conditions are fulfilled:
(i) The space E
can
be decomposed into stable planes and stable
straight lines. (ii) Every stable straight line remains pointwise fixed or is reflected at the point 0.
(iii) In every irreducible invariant plane a linear automorphism induced such that and tn/il <2. =
is
1
8. If
is a rotation of an n-dimensional Euclidean space, show that
Jtrcoln. 9. Prove that the characteristic polynomial of a proper rotation satisfies the relation =(
—
)4nf(2—
10. Let E be an inner product space of dimension n>2. Consider a proper rotation t which commutes with all proper rotations. Prove that = i where c = if ii is odd and = ± if n is even. 1
§ 6.
1
Rotations of Euclidean spaces of dimension 2,3 and 4
8.21. Proper rotations of the plane. Let E be an oriented Euclidean plane and let zl denote the normed positive determinant function in E. is determined by the equation Then a linear transformation j: zi (x, i') = (j
i)
x, v E.
Chapter VIII. Linear mappings of inner product spaces
238
The transformation / has the following properties:
I) (jx,v)+(x,jv)=O (jx,jv)=(x,v)
2)
3)
4)
detj=1.
In fact, 1) follows directly from the definition. To verify 2) observe that the identity (7.24), sec. 7.13, implies that (x, y)2 + (/x,v)2 y
= IxV
x we obtain, in view of 1),
(jx, jx)2 = j is injective (as follows from the definition) we obtain
= imply that
Now the relations .1= —.1 and
=—
Finally, we
obtain from 1), 2), 3) and from the definition of /
4(jx,iv) = (/2 x, iv) = — (x,jy) = (jx, v) = zl(x,
y)
whence
det j =
1.
The transformation / is called the canonical complex structure of the oriented Euclidean plane E. Next, let p be any proper rotation of E. Fix a non-zero vector x and denote by 0 the oriented angle between x and p x (cf. sec. 7.1 3). It is determined mod 2 it by the equation
px = x cos 0 +j(x) sin 0.
(8.39)
We shall show that cos 0 = tr 'p
and
sin 0 = 4 tr(jo p).
(8.40)
In fact, by definition of the trace we have
zl(p x, y) + 4(x, py) = 4(x, y). tr 'p. Inserting (8.39) into (8.41) and using properties 2) and 3) of j
(8.41)
we
obtain
2cos0 .A(x, v)=zl(x,y) . trqi and so the first relation (8.40) follows. The second relation is proved in a similar way.
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
239
Relations (8.40) show that the angle (9 is independent of x. It is called the rotation angle of and is denoted by e((p). Now (8.39) reads
cosO((p)+j(x) sinê((p)
XEE.
In particular, we have
O(i)=0,
/ix=x
&(—i)=m, cos
sin
a second proper rotation with rotation angle shows that is
a
simple calculation
= x cos(&(p)+ (9(h)) +j(x) sin((9(p)+ Thus any two proper rotations commute and the rotation angle for their
product is given by (p) = (9((p) +
(mod 2it).
(8.42)
Finally observe that if e1, e2 is a positive orthonormal basis of E then we have the relations (p e1 e1 cos O(p) + e2 sin O((p) (p e2 = — e1 sin O(p) + e2 cos O(p).
Remark: If E is a non-oriented plane we can still assign a rotation angle to a proper rotation (p. It is defined by the equation
cos 9(p) = tr p
(0
it).
(8.43)
Observe that in this case (9(p) is always between 0 and it whereas in the oriented case 9(p) can be normalized between —it and it. 8.22. Proper rotations of 3-space. Consider a proper rotation p of a 3-dimensional inner product space E. As has been shown in sec. 8.19,
there exists a 1-dimensional subspace E1 of E whose vectors remain fixed. If p is different from the identity-map there are no other invariant vectors
(an invariant vector is a vector xeE such that (px=x). In fact, assume that a and b are two linearly independent invariant vectors. Let c(c+0) be a vector which is orthogonal to a and to b. Then (pc—2c where 2= ± 1. Now the equation det (p= implies that 2= + 1 showing that (p is the identity. In the following discussion it is assumed that +1. Then the invariant vectors generate a i-dimensional subspace E1 called the axis of (p. 1
240
Chapter VIII. Linear mappings of inner product spaces
To determine the axis of a given rotation (p consider the skew mapping (8.44)
and introduce an orientation in E. Then
can
t/ix=u xx
be written in the form
UEE
(8.45)
(cf. problem 2, § 4). The vector u which is uniquely determined by (p iS
called the rotation-vector. The rotation vector is contained in the axis of (p. In fact, let a+O be a vector on the axis. Then equations (8.45) and (8.44) yield (8.46)
showing that u is a multiple of a. Hence (8.45) can be used to find the rotation axis provided that the rotation vector is different from zero. This exceptional case occurs if and only if = i.e. if and only if Then (p has the eigenvalues 1, —1 and — 1. In other words, is (p = (p a reflection at the rotation axis. 8.23. The rotation angle. Consider the plane F which is orthogonal to E1. Then transforms F into itself and the induced rotation is again proper. Denote by 0 the rotation angle of (p1 (cf. remark at the end of see 8.21). Then, in view of (8.43),
cosO =
Observing that trço =
+1
trço1
we obtain the formula cosO = 3-(trp
1).
To find a formula for sin 0 consider the orientation of Fwhicl' is induced by E and by the vector u (cf. sec. 4.29)*). This orientation is represented by the normed determinant function
A1(y,z)= lul
where A is the normed determinant function representing the orientation
of E. Then formula (7.25) yields
sinO=A1(y,(py)='A(u,y,(py) lul
*) It is assumed that u
0.
(8.47)
§ 6. Rotations of Euclidean spaces of dimension 2. 3 and 4
241
where y is an arbitrary unit vector of F. Now
A(u,y,çoy)= = 4(u,co' y,y)= — A(u,y,ço1y) and hence equation (8.47) can be written as (8.48) lul
By adding formulae (8.47) and (8.48) we obtain (8.49)
lxi
21u1
Inserting the expression (8.45) in (8.49), we thus obtain
x y)=
sin0 = lul Since
x lul
(8.50)
y is a unit-vector orcnogonal to u, it follows that iu
xyl=iuilyl=lui
and hence (8.50) yields the formula
sin0=iui. This equation shows that sin 0 is positive and hence that 0<0< it if the above orientation of F is used. Altogether we thus obtain the following geometric significance of the rotation-vector u: 1. u is contained in the axis of 'p. 2. The norm of u is equal to sin 0. 3. If the orientation induced by u is used in F, then 0 is contained in the interval 0<0< it. 1 has obviously the Let us now compare the rotations 'p and p 1= same axis as 'p. Observing that we see that the rotation vector of 1 is — u. This implies that the inverse rotation induces the inverse orientation in the plane F. To obtain an explicit expression for u select a positive orthonormal be the corresponding matrix of 'p. Then ,/i basis e1, e2, e3 in E and let has the matrix — = 16
Greub. Linear Algebra
Chapter VIII. Linear mappings of inner product spaces
242
and the components of u are given by
It should be observed that a proper rotation is not uniquely determined by its rotation vector. In fact, if and (p2 are two rotations about the same axis whose rotation angles satisfy 01 +02 = it, then the rotation coincide. Conversely, if ç1 and p2 have the same and vectors of rotation vector, then their rotation axes coincide and the rotation angles satisfy either 01 =02 or 01 + 02 = it. Since cos(m—0)= —cos0 it follows from the remark above that a rotation is completely determined by its rotation vector and the cosine of the rotation angle. 8.24. Proper rotations and quaternions. Let E be an oriented 4-dimensional Euclidean space. Make E into the algebra of quaternions as described in sec. 7.23. Fix a unit vector a and consider the linear transformation
(px=ax
Then
(p is
xeE.
a proper rotation. In fact, (pxl
= laxl= al xl = xl.
is proper choose a continuous map f from the closed unit interval to S3 such that
To show that (p
f(1)=a and set
xeE. Then every map
is
a rotation whence
det(p1=±l Since det
is a continuous function of t it follows that det 'p1 = det
= det i = +
1.
This implies that det (p = +
1
and so (p is a proper rotation. In the same way it follows that the linear transformation x=xa xeE is
a proper rotation.
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
243
Next, let a and b be fixed unit vectors and consider the proper rotation 'r of E given by XEE. (8.51)
If a = b = p (p a unit vector) we have
xeE.
(8.52)
It follows that t (e) = e and so t induces a proper rotation, t1, of the orthogonal complement, E1, of E. We shall determine the axis and the rotation angle of r1. Let qeE1 be the vector given by
q q
determines the axis of the rotation -c1. We shall assume that
±e so that q#0. To determine the rotation angle of 'r1 consider the 2-dimensional subspace, F of E1 orthogonal to q. We shall show that, for ZEF, TZ
=
—
l)z +
(8.53)
In fact, the equations
and p'=).e—q yield
+
q)
qz— zq
and
z
= 2(q x z)
qz+zq= —2(q,z)e——O
we obtain ZEF
and so (8.53) follows.
Let F have the orientation induced by the orientation of E1 and the vector q (cf. sec. 4.29). Then the rotation angle ê is determined by the equations (cf. see 8.21)
cos ê = (z, z, z) and .
sin ê =
1
zl
(q, z, t z)
where z is any unit vector in F.
—
244
Chapter VIII. Linear mappings of inner product spaces
Using formula (8.53) we obtain
cos 0 = and
2).2
—
sine
=2)
qx
Since
=
cos
w and
= sin
U)
where 0) denotes the angle between e and p (0< (y) <m) these relations
can be written in the form
cos 0 = 2 cos2 0) — 1 = cos 20) and
sin 0 = 2 cos 0)
sin
U)
= sin 2w.
These relations imply that
0 =2w. Thus we have shown that the axis of 'r1 is generated by q and that the rotation angle of is twice the angle between p and e.
Proposition I: Two unit quaternions p1 and p2 determine the same rotation of E1 only if p2 = ±p1. Moreover, every proper rotation of E1 can be represented in the form (8.52).
Proof: The first part of the proposition follows directly from the result above. To prove the second part let a be any proper rotation of E1. We may assume that a a unit vector such that aa=a and let F E1 be the plane orthogonal to a. Give F1 the orientation induced by a and denote the rotation angle of a by p = e cos
(0< & <2 it). Set
+ a sin
Then the rotation i given by
zx=pxp1 coincides with a. In fact, if q is the vector determined by p=Ae+q, qeE1, then q = a sin
and so i and a have the same rotation axis. Next, observe that since sin
>0, q is a positive multiple of a and so
the vectors q and a induce the same orientation in F. But, as has been
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
245
shown above, the rotation angle of 'r with respect to this orientation is given by
t9=2 It follows that t = a. This completes the proof. Proposition II: Every proper rotation of E can be written in the form (8.51). Moreover, if a1, b1 and a2, b2 are two pairs of unit vectors such that the corresponding rotations coincide, then
a2=ra1 and b2=rb, where
± 1. Proof: Let a be a proper rotation of E and define t. by
t(x)=a(x)a(e)1
XEE.
Then t is again a proper rotation and satisfies 'r (e) = e. Thus 'r restricts to a proper rotation of E1. Hence, by Proposition I, there is a unit vector p such that
xeE. It follows that
a(e)= axb'
a(x)= t(x)a(e) = with a = p and b = p. Finally, assume that a1 x
xeE.
= a2 x
Setting x = e yields a1 bj
1
= a2 b1.
Then we obtain from (8.53)
pxp1=x
xeE.
Now Proposition I implies that p = r e, r = ± 1 and so
a2=ra1, This completes the proof.
b2=cb1.
(8.53)
Chapter VIII. Linear mappings of inner product spaces
246
Problems I. Show that the rotation vector of a proper rotation ço of 3-space is zero if and only if is of the form (px= — x+2(x,e)e
where e is a unit vector. 2. Let p be a linear automorphism of a real 2-dimensional linear space E. Prove that an inner product can be introduced in E such that 'p becomes a proper rotation if and only if det 'p =
1
and
2.
Itr
3. Consider the set H of all homothetic transformations 'p of the plane. Prove:
a) If 'p1eH and 'p2eH, then 2p1+jup2eH. b) If the multiplication is defined in H in the natural way, the set H becomes a commutative field. c) Choose a fixed unit-vector e. Then, to every vector xeE there exists Define a multisuch that exactly one homothetic mapping Prove that E becomes a field under this multiplication in E by xy = defines an isomorphism of E onto plication and that the mapping H. d) Prove that E is isomorphic to the field of complex numbers. 4. Given an improper rotation p of the plane construct an orthonormal basis e1, e2 such that çoe1 =e1 and pe2= —e2. 5. Show that every skew mapping of the plane is homothetic. If #0,
prove that the angle of the corresponding rotation is equal to + orientation is defined by the determinant-function
A(x,y)=(çlix,y)
x,yeE.
6. Find the axis and the angle of the rotation defined by = çoe2 =
=
e1
+ 2e2 — 2e3)
+ 2e2 + e3) — e2 —
2e3)
(v = 1, 2, 3) is a positive orthonormal basis. If 'p is a proper rotation of the 3-space, prove the relation
where 7.
det('p + i) = 4(1 + cosO) where 0 is the rotation angle.
2
if the
§ 6. Rotations of Euclidean spaces of dimension 2, 3 and 4
247
whose determinant is + 1.
8. Consider an orthogonal 3 x 3-matrix Prove the relation — — +
4
V
V
9. Let e be a unit-vector of an oriented 3-space and 0 ( — ir <0 ic) be a given angle. Denote by Fthe plane orthogonal to e. Consider the proper rotation 'p whose axis is generated by e and whose angle is 0 if the orientation induced by e is used in F. Prove the formula (p x = x cos 0 + e (e, x)(1
—
cos 0)
+ (e x x) sin 0.
10. Prove that two proper rotations of the 3-space commute if and only if they have the same axis. 11. Let be a proper rotation of the 3-space not having the eigenvalue —1. Prove that the skew transformations are
x=(co—i)o(ca+iY' connected by the equation
and
1
1 + cos0
'1'
where 0 denotes the rotation-angle of 'p. 12. Assume that an improper rotation 'p
—
i of the Euclidean 3-space
is given.
a) Prove that the vectors x for which çox= —x, form a 1-dimensional subspace E1. is induced in the plane F orthogonal b) Prove that a proper rotation is the to E1. Defining the rotation-vector u as in sec. 8.22, prove that identity if and only if u = 0. is given by c) Show that the rotation-angle of + 1)
cosO
and that 0<0< ic if the induced orientation is used in F.
13. Let a be a vector in an oriented Euclidean 3-space such that 1. Consider the linear transformation 'pa given by (flaX = x cos(ic
+
1
(a, x)f /Ial\2 + (a x
where f is defined by
f(t)=-1-sinmt
Chapter VHI. Linear mappings of inner product spaces
(i) Show that
is a proper rotation whose axis is generated by a
and whose rotation angle is e=m (ii) Show that = i and that
(PaX =
if
x + 2a(a, x)
1.
(iii) Suppose that a+h. Show that P,=Pb if and only if and that there is a correspondence between the proper rotations of and the straight lines in h = — a. Conclude 14.
1
1
Let E be an oriented 3-dimensional inner product space.
a) Consider E together with the cross product as an algebra. Show that the set of non-zero endomorphisms of this algebra is precisely the group of proper rotations of E. b) Suppose a multiplication is defined in E such that every proper rotation z is an endomorphism, t (x y) = t x t y. Show that
x y) xy = a constant. 15. Let a be a skew linear transformation of the Euclidean space a) Show that a can be written in the form
where
is
ax=px+xq
I-Il.
XE1-11
and that the vectors p and q are uniquely determined by a. b) Show that E1 is stable under a if and only if q = — p. c) Establish the formula det a =
— 1q12)2.
16. Let p be a unit vector in E and consider the rotation i x = p x Show that the rotation vector (cf. sec. 8.22) of t is given by
u=2).(p—).e)
)=(p,e).
17. Let p + ± e be a unit quaternion. Denote by F the plane spanned by e and p and let F' be its orthogonal complement. Orient F so that the vectors e, p form a positive basis and give F' the induced orientation. Consider the rotations
çox=px and t/ix=xp.
§ 7. Differentiable families of linear automorphisms a)
Show that the planes F and F' are stable under p and
249
and that
in F.
b) Denote by e the
in F and by the common rotation angle of p and rotation angles for and in F'. Show that e and
Differentiable families of linear automorphisms
§ 7.
8.25. Differentiation formulae. Let E be an n-dimensional inner product space and let L(E; E) be the space of all linear transformations of E. It has been shown in sec. 7.21 that a norm is defined in the space L(E; E) by the equation = max kpxl. IxI=1
A continuous mapping of a closed interval into the space L (E; E) will be called a continuous family of linear transformations or a continuous curve in L (E; E). A continuous curve çø (t) is called d(fferentiable if the limit lim
exists for every t(t0 every fixed t.
t
(p(t+LIt)—(p(t)
ut
t1). The mapping
= is obviously again linear for
The following formulae are immediate consequences of the above definition: 1.
p+p
—2
+p
(2, p constants)
2. 3.
4. If p differentiable curves in L (E; E) and çJi is a p-linear function in L(E; E), then d
dt
= v=1
A curve çø (t)(t0 t is called continuously differentiable if the mapping t-÷ (t) is again continuous. Throughout this paragraph all differentiable
curves are assumed to be continuously differentiable. 8.26. Differentiable families of linear automorphisms. Our first aim is to establish a one-to-one correspondence between all differentiable families of linear automorphisms on the one hand and all continuous families of linear transformations on the other hand. Let a differentiable family
Chapter VIII. Linear mappings of inner product spaces
of linear automorphisms be given such that 'p (t0) = i. of linear transformations is defined by
'p
Then a continuous family i1Li (t)
Interpreting t as time we obtain the following physical significance of the mappings cli (t): Let x be a fixed vector of E and
x(t) = (p(t)x (t) is given by
the corresponding orbit. Then the velocity vector
= ç3(t)x = q3(t)q,(t)_1 x(t) = i/i(t)x(t). the mapping (t) associates with every vector x (t) its velocity at the instant t. Now it will be shown that, conversely, to every continuous curve i/i (t) in L (E; E) there exists exactly one differentiable family 'p (t) (t0 t t1) of linear automorphisms satisfying the differential equation Hence,
= cl/(t)o(p(t)
(8.54)
and the initial-condition 'p (t0) = z. First of all we notice that the differential equation (8.54) together with the above initial condition is equivalent to the integral equation
'p(t) = z+
(t0
t
t3.
(8.55)
In the next section the solution of the integral equation (8.55) will be constructed by the method of successive approximations. 8.27. The Picard iteration process. Define the curves 0,1...) by the equations
=
i
and
(8.56) +
1(t)
z + 5 (t) o
(0 dt
(n = 0, 1 ...)
Introducing the differences
=
—
(t)
(n = 1,2, ...)
(8.57)
§ 7. Differentiable families of linear automorphisms
we obtain from (8.56) the relations
(n = 2,3,...).
=
(8.58)
Equation (8.57) yields for n= 1
-
A1(t) =
=
Define the number M by
M= max It/i(t)I. tO
t
t1
Then
(t)I
M(t — t0).
(8.59)
Employing the equation (8.58) for n=2 we obtain in view of (8.59) IA
M2(
f (t — t0) dt
2(t)I
—
and in general (n
= 1,2...).
Now relations (8.57) imply that n+p A%,(t),
—
whence n+p MV
n+p —
(t —
vn+1
vn+1
v!
Let >0 be an arbitrary number. It follows from the convergence of the series
MV —i-- (t1
_t0)V that there exists an integer N such that
n+p MV v!
(t1 — to)V
<
c
for
n > N and p 1.
The inequalities (8.60) and (8.61) yield —
p
1.
(8.61)
Chapter VIII. Linear mappings of inner product spaces
These relations show that the sequence the interval
is uniformly convergent in (p(t).
In view of the uniform convergence, equation (8.56) implies that
t t1).
= i+
(8.62)
As a uniform limit of continuous curves the curve p (t) is itself continuous. Hence, the right hand-side of (8.62) is differentiable and so p (t) must be differentiable. Differentiating (8.62) we obtain the relation
= //(t)op(t) showing that 'p (t) satisfies the differential equation (8.54). The equation
i is an immediate consequence of (8.62). 8.28. The determinant of It remains to be shown that the mappings 'p (t) are linear automorphisms. This will be done by proving the formula
'p (ta) =
f tri/i(t)dt
det'p(t) = eto Let A
(8.63)
0 be a determinant function in E. Then A ('p (t) x1 ... 'p (t)
= det 'p (t) A
(x1 ...
E E.
Differentiating this equation and using the differential equation (8.54) we obtain ... (8.64)
= Observing that (ço (t)x1 ...
=
...
(co(t)x1 ...
= tr /i (t) det 'p (t) A
(x1 ...
we obtain from (8.64) the differential equation det 'p (t) = tr
(t) det 'p (t)
(8.65)
families of linear automorphisms
§ 7. Differentiable
253
for the function det p (t). Integrating this differential equation and observing the initial condition det (t0) = det i = 1 we find (8.63). 8.29. Uniqueness of the solution. Assume that (t) and p2(t) are two solutions of the differential equation (8.54) with the initial condition = z. Consider the difference p
p(t) = p2(t) — The curve 'p (t) is again a solution of the differential equation (8.54) and it satisfies the initial condition 'p (t0) = 0. This implies the inequality
=
(8.66) to
to
Now define the function F by
F(t)=
(8.67)
Then (8.66) implies the relation
E(t) MF(t). Multiplying by e_tM we obtain E(t)e_tM — Me_tMF(t)
0,
whence d
dt
(F(t)e_tM)
0. —
Integrating this inequality and observing that F(t0) = 0, we obtain
F(t)e_tM <0 and consequently
F(t)O
(8.68)
On the other hand it follows from (8.67) that
F(t)O
(8.69)
Relations (8.68) and (8.69) imply that F(t)_= 0 whence 'p (t)m 0. Consequently, the two solutions (t) and 'p2(t) coincide.
254
Chapter VIII. I.inear mappings of inner product spaces
8.30. 1-parameter groups of linear automorphisms. A differentiable family of linear automorphisms (p (t) ( — cc
Equation (8.70) implies indeed that the automorphisms (p(t) form an (abelian) group. Inserting t=0 we find (p(O)=l. Now equation (8.70) yields
q(t)0q(— t)==z showing that with every automorphism (p (t) the inverse automorphism (p (t)' is contained in the family (t)( — cc
+ t) = Inserting t = 0 we obtain the differential equation
(—cc<x
(8.71)
çà(0). Conversely, consider the differential equation (8.71) is a given transformation of E. It will be shown that the solution (p (r) of this differential equation to the initial condition p (0) i is a Iparameter group of automorphisms. To prove this let t be fixed and consider the curves where where
p1(t)= p(t+
(8.72)
and
(p2(t) = ço(t)op(t).
(8.73)
Differentiating the equations (8.72) and (8.73) we obtain (8.74)
and
=
= l//o(p(t)o(p(r) =
(8.75)
Relations (8.74) and (8.75) show that the two curves c°i (t) and (p2(t) satisfy the same differential equation. Moreover, (0) = (p2(0)
(p (x).
Thus, it follows from the uniqueness theorem of sec. 8.28 that whence (8.70).
(t)
(t)
§ 7. Differentiable families of linear automorphisms
8.31. Differentiable families of rotations. Let p(t)(t0 t1) be a differentiable family of rotations such that 'p (ta) = i. Since det 'p (t) = ± 1 for every t and det 'p (ta) = + lit follows from the continuity that det 'p (t) = + 1, i.e. all rotations cp (t) are proper. Now it will be shown that the linear transformations
= are
skew. Differentiating the identity
we
obtain
=
Inserting
= and
=
=
into this equation we find
+ i/J(t))op(t) = 0, whence (t) +
(t) = 0.
Conversely, let the family of linear automorphisms 'p (t) be defined by the differential equation
= b(t)0p(t),
p(t0) = 1
where cli (t) is a continuous family of skew mappings. Then every automorphism 'p (t) is a proper rotation. To prove this, define the family x (t) by
x(t) = Then
=
-
+
=0
and X(to)
i.
Now the uniqueness theorem implies that x (t)
i, whence
This equation shows that the mappings 'p (t) are rotations.
Chapter VIII. Linear mappings of inner product spaces
8.32. Angular velocity. As an example, let p(t) be a differentiable family of rotations of the 3-space such that çø (0) = i. If t is interpreted as the time, the family (t) can be considered as a rigid motion of the space E. Given a vector x, the curve
x(t) = co(t)x describes
its orbit. The corresponding velocity-vector is determined by =
=
=
(8.76)
Now assume that an orientation is defined in E. Then every mapping i/i
(t) can be written as
= u(t) x y.
(8.77)
The vector u (t) is uniquely determined by (t) and hence by t. Equations (8.76) and (8.77) yield ±(t) u(t) x x(t). (8.78)
The vector u(t) is called the angular velocity at the time t. To obtain a physical interpretation of the angular velocity, fix a certain instant t and assume that u(t)+0. Then equation (8.78) shows that (t) a multiple of u (t). In other words, the straight line generated by u(t) consists of all vectors having the velocity zero at the instant t. This straight line is called the instantaneous axis. Equation (8.78) implies that the velocity-vector ±(t) is orthogonal to the instantaneous axis. Passing over to the norm in equation (8.78) we find that J±(t)I
= u(t)I Ih(t)I
is the distance of the vector x 0') from the instantaneous axis (fig. 1). ConseFig. 1 quently, the norm of u 0') is equal to the magnitude of the velocity of a vector having the distance 1 from the instantaneous axis. The uniqueness theorem in sec. 8.28 implies that the rigid motion ço 0') is uniquely determined by p (t0) if the angular velocity is a given function
where h (t)I
of t.
8.33. The trigonometric functions. In this concluding section we shall apply our general results about families of rotations to the Euclidean plane and show that this leads to the trigonometric function cos and sin. This definition has the advantage that the addition theorems can be proved in a simple fashion, without making use of the geometric intuition.
§ 7. Differentiable families of linear automorphisms
257
Let E be an oriented Euclidean plane and zi be the normed determinant function representing the given orientation. Consider the skew mapping which is defined by the equation
x, y) =
A
(x, y).
(8.79)
First of all we notice that cli is a proper rotation. In fact, the identity 7.24 yields
(Vix,y)2 = A(x,y)2 =(x,x)(y,y) — (x,y)2.
Inserting y i,li x we find
= Now i/i is regular as follows from (8.79). Hence the above equation implies
that
= (x,x). Replacing x and y by clix and çliy respectively in (8.79) we obtain the relation x, y) = x, i/i y) = (cii x, y) = A (x, y) showing that detcii = + 1. Let (t)( — cc
=
c&o(p(t)
(0)
=
(8.80)
and the initial condition z.
Then it follows from the result of sec. 8.29 that + x) = p(t)0p(t).
(8.81)
We now define functions c and s by
c(t) = 4trp(t) and
—
(t) =
—
tr (ci'
cc < t
(8.82)
p (t))
These functions are the well-known functions cos and sin. In fact, all the properties of the trigonometric functions can easily be derived from (8.82). Select an arbitrary unit-vector e. Then the vectors e and i/ie form 17
Greub. Linear Algebra
Chapter VIII. Linear mappings of inner product spaces
an orthonormal basis of E. Consequently,
tr(p(t) =
+
(8.83)
is itself a proper rotation, the mappings ço (t) and cli commute. Hence, the second term in (8.83) can be written as Since çfr
= (p(t)e,e).
= We
thus obtain c(t) = (p(t)e,e).
(8.84)
In the same way it is shown that
s(t)
(8.85)
Equations (8.84) and (8.85) imply that
p(t)e = c(t)e + s(t)l/Je.
(8.86)
Replacing t by t+t in (8.84) and using the formulae (8.81) and (8.86) we obtain
c(t + r) = (q,(t + t)e,e) = (p(t)q(t)e,e) = c (t) (p (t) e, e) — s (t) ('p (t) e, e).
(8.87)
Equations (8.87), (8.84) and (8.85) yield the addition theorem of the function c: c(t + t) = c(t)c(t) — s(t)s(t). In the same way it is shown that
s(t + t) = s(t)c(t) + c(t)s(t). Problems 1. Let cli be a linear transformation of the inner product space E. Define the linear automorphism exp cl' by
= 'p(l) where 'p (t) is the family of linear automorphisms defined by
i/iop(t),p(O) = z. Prove that
'p(t)=
t
§ 7. Differentiable families of linear automorphisms
259
defined in problem 1 has the fol-
2. Show that the mapping lowing properties:
I.
2. exp(—çli)=(exp 3.
exp0=l.
4.
exp
if is skew. that exp is a 3. Consider the family of rotations 'p (t) defined by (8.80). a) Assuming that there is a real number p + 0 such that 'p (p) = i, prove
that (p(t+p)=p(t)(—cc
r
if and only if
dt
c) Show that the family 'p (t)
(k=0,±1,±2,...). has
derivatives of every order and that
(v=0,1...). d) Define the curve x(t) by
x(t) = p(t)e where e is a fixed unit-vector. Show that
= 1.
Derive from formulae (8.82) that the function c is even and that the function s is odd. 5. Let be the skew mapping defined by (8.79). Prove De Moivre's 4.
formula
= c(t)i + 6. Let a skew mapping of an n-dimensional inner product space and 'p (t) the corresponding family of rotations. Consider the normal form (8.35) of the matrix of Vi. Prove that the function 'p (t)( — oo
Chapter VIJI. Linear mappings of inner oroduct spaces
260
7. Let A be a finite dimensional associative real algebra. such a) Consider a differentiable family of endomorphisms is a derivation in A. that Po 1. Prove that b) Let 0 be a derivation in A and define the family çot of linear transformations by
c°o=1. is an automorphism of A. Show that every
Prove that every mutes with 0.
com-
8. The quaternionic exponential function. Fix a quaternion a and consider the differential equation
±(t)=ax(t) with the initial condition x(0) = e. Define exp a by
expa=x(1). Decompose a in the form a =2 e + h where I e and (b, e) =0. (i) Show that a)
exp 2 is the exponential function for the reals.
b) exp b=ecos IbI c)
sin IbI.
exp a = exp 1 (e cos lb I +
sin lb I).
(ii) Show that
lexpal=expl. (iii) Prove that exp a e if and only if
2=0 and
lbl=2km,
ke7L.
(iv) Let a1 = e + b1 and a2 =22 e + b2 ne two quaternions with b1 + 0 and b2 + 0. Show that exp a2 = exp a1 if and only if
22=21
kEl.
and
(v) Let B3 be the closed 3-ball given by (x, e) = 0, lxl mapp: ERxB3-*Eby
p(2,y)=exp(le+y)
yeB3.
Show that a) p maps B3 onto E —0. b) 'p is injective in the interior of B3. c) If S3 denotes the boundary of B3 then
'p(l,y)= —exple
yeS3.
Define a
Chapter IX
Symmetric bilinear functions All the properties of an inner product space discussed in Chapter VII are based upon the bilinearity, the symmetry and the definiteness of the inner product. The question arises which of these properties do not depend on the definiteness and hence can be carried over to a real linear space with an indefinite inner product. Linear spaces of this type will be discussed in § 4. First of all, the general properties of a symmetric bilinear function will be investigated. It will be assumed throughout the chapter that all linear spaces are real. § 1.
Bilinear and quadratic functions
9.1. Definition. Let E be a real vector space and qi be a bilinear function in E x E. The bilinear function cP is called symmetric if
x,yEE. Given a symmetric bilinear function çji consider the (non-linear) function 'I' defined by 1i(x,x). Then is uniquely determined by In fact, replacing x by x+y in (9.1) we obtain (9.2)
whence
=
+ y) —
—
(9.3)
Equation (9.3) shows that different symmetric bilinear functions 1i lead to different functions W. Replacing y by —y in (9.2) we find —
y) = W(x)
—
+
(9.4)
Chapter IX. Symmetric bilinear functions
262
Adding the equations (9.2) and (9.4) we obtain the so-called parallelogram-identity
W(x + y) +
(x
(x) +
y)
—
(9.5)
9.2. Quadratic functions. A continuous function W of one vector which satisfies the parallelogram-identity will be called a quadratic function. Every symmetric bilinear function yields a quadratic function by setting x=y. We shall now prove that, conversely, every quadratic function can be obtained in this way. Substituting x==y=O in the parallelogram-identity we find that (9.6)
Now the same identity yields for x
0
W(— y)
= W(y)
showing that a quadratic function is an even function.
If there exists at all a symmetric bilinear function çji such that
cli(x,x)= this function is given by the equation
cli(x,y) =
+ y) —
— W(y)}.
(9.7)
Therefore it remains to be shown that the function cP defined by (9.7) is indeed bilinear and symmetric. The symmetry is an immediate consequence of (9.7). Next, we prove the relation cli(x1 + x2,y) = cP(x1,y) + cP(x2,y).
(9.8)
Equation (9.7) yields
+ x2,y) = =
W(x1 + y) —
—
2cP(x2,y) =
+ y) —
—
+ x2 + y) — W(x1 + x2) —
whence 2
(x1 + x2, y) —
— {W(x1
+ y) +
(xi, y) — (x2, y)} = {W (x1 + x2 + y) + W (y)} — — W(x2)}. (9.9) + y)} — {W(x1 + x2) —
It follows from (9.5) that
W(x1 + x2 + y) + W(y) =
+ x2 + 2y) + W(x1 + x2)}
(9.10)
§ 1. Bilinear and quadratic functions
263
and
+ y)+ W(x2 + y)=
+ x2 + 2y)+
—
x2)}.
(9.11)
Subtracting (9.11) from (9.10) and using the parallelogram-identity again we find that {W(x1 + x2 + y) + W(y)} — {W(x1 + y) + W(x2 + y)} — x2)} = — W(x1)— W(x2) +
= 4{W(x1 + x2) —
(9.12)
+ x2).
Now equations (9.9) and (9.12) imply (9.8). Inserting x1 = x and x2 = into (9.8) we obtain
x,y) = — cP(x,y).
—x
(9.13)
It remains to be shown that 'Ii(Ax,y) = Acli(x,y)
(9.14)
for every real number A. First of all it follows from (9.8) that
cP(kx,y)= kP(x,y) for a positive integer k. Equation (9.13) shows that (9.14) is also correct for negative integers. Now consider a rational number A
"
(p,q integers).
q
Then
(p \q
\ J
whence
\j= p cli(x,y). j q
(p \q
To prove (9.14) for an irrational factor A we note first that is a continuous function of x and y, as follows from the continuity of W. Now select a sequence of rational numbers such that lim n
= A.
cxj
Then we have that
(9.15)
For n-+cx we obtain from (9.15) the relation (9.14). Our result shows that the relations (9.1) and (9.7) define a one-to-one correspondence between all symmetric bilinear functions and all qua-
Chapter IX. Symmetric bilinear functions
264
dratic functions. If no ambiguity is possible we shall designate a symme-
tric bilinear function and the corresponding quadratic function by the same symbol, i.e., we shall simply write
=
9.3. Bilinear and quadratic forms. Now assume that E has dimension n and let (v = n) be a basis of E. Then a symmetric bilinear 1
function
. . .
can be expressed as a bilinear form
P(x,y) =
(9.16) p
where and the matrix
is defined by *)
=
It follows from the symmetry of cP that the matrix
is symmetric:
= Replacing y by x in (9.16) we obtain the corresponding quadratic form cP(x) =
Problems 1. Letf and g be two linearly independent linear functions in E and let 0 be a derivation in the algebra 11. Show that the function
f(x) 0 [g(x)] —g(x) 0[f(x)]
satisfies the paralelogram identity and the relation Prove that the function cP obtained from W by (9.7) is bilinear if and only
if 0=0. 2. Prove that a symmetric bilinear function in E defines a quadratic function in the direct sum 3. Denote by A and by A the matrices of the bilinear function çji with respect to two bases and (v = I ii). Show that . .
A= TAT* where T is the matrix of the basis transformation *) The first index counts the row.
§ 2. The decomposition of E
265
§ 2. The decomposition of E
9.4. Rank. Let E be a vector space of dimension n and a symmetric bilinear function in E x E. Recall that the nulispace E0 of cP is defined to be the set of all vectors x0eE such that cb(x0,y) =
for every yeE.
0
(9.17)
The difference of the dimensions of E and E0 is called the rank of Hence çJi is non-degenerate if and only it has rank n. Now let E* be a dual space and consider the linear mapping p : defined by
x,yeE.
(9.18)
Then the null-space of çJi obviously coincides with the kernel of (p,
= kerp. Consequently, the rank of çJi is equal to the rank of the mapping p. Let be the matrix of relative to a basis x,, (v = 1. n) of E. Then relation . .
(9.18) yields
=
=
is the matrix of the mapping p. This implies that the rank of the matrix is equal to the rank of 'p and hence equal to the rank of In particular, a symmetric bilinear function is non-degenerate if and only if the determinant of is different from zero. 9.5. Definiteness. A symmetric bilinear function cli is called positive
showing that
definite if cli (x) > 0
for all vectors x +0. As has been shown in sec. 7.4, a positive definite bilinear function satisfies the Schwarz-inequality
cP(x,y)2 cIl(x)cP(y)
X,yEE.
Equality holds if and only if the vectors x and y are linearly dependent. A positive definite function çJl is non-degenerate. If cll(x)O for all vectors xeE, but cli(x)=O for some vectors x+0, the function cli is called positive semidefinite. The Schwarz inequality is still valid for a semidefinite function. But now equality may hold without the vectors x and y being linearly dependent. A semidefinite function is al-
Chapter 1X. Symmetric bilinear functions
ways degenerate. In fact, consider a vector x0 0 such that di (x0) =0. Then the Schwarz inequality implies that cP (x0, y)2 cP (x0) cP (y) = 0
whence dl (x0, y) =0 for all vectors y. In the same way negative definite and negative semidefinite bilinear functions are defined. The bilinear function cP is called indefinite if the function (x) assumes positive and negative values. An indefinite function may be degenerate or non-degenerate. 9.6. The decomposition of E. Let a non-degenerate indefinite bilinear function dl be given in the n-dimensional space E. It will be shown that E such that the space Ecan be decomposed into two subspaces cli is positive definite in and is negative definite in E. Since cli is indefinite, there is a non-trivial subspace of E in which cli is positive definite. For instance, every vector a for which cIl(a)>0 generates such a subspace. Let be a subspace of maximal dimension such that dl is positive with redefinite in Et Consider the orthogonal complement E of spect to the scalar product defined by c/I. Since cli is positive definite in E the intersection E consists only of the zero-vector. At the same time we have the relation (cf. Proposition II, sec. 233)
dimE = dimE. This yields the direct decomposition E =
Now it will be shown that cli is negative definite in E. Given a vector z +0 of E -, consider the subspace E1 generated by E + and z. Every vector of this subspace can be written as
This implies that cli (x) =
cli
(y) +
dl (z).
(9.19)
Now assume that li(z)>0. Then equation (9.19) shows that cli is positive definite in the subspace E1 which is a contradiction to the maximumproperty of E Consequently,
dl(z) 0
for all vectors
zeE
§ 2. The decomposition of E i.e.,
is
267
negative semidefInite in E. Using the Schwarz inequality
z1eE,zeE
(9.20)
we can prove that cP is even negative definite in E. Assume that cb(z1)=0 for a vector z1eE. Then the inequality (9.20) yields
1(z1,z) = 0
for all vectors zeE. At the same time we know that =0
for all vectors yeEt These two equations imply that 'k(z1,x)—_ 0
for all vectors xeE, whence z1 =0. 9.7. The decomposition in the degenerate case. If the bilinear function 'P is degenerate, select a subspace E1 complementary to the nulispace E0, E = E0 EBE1. Then 'P is non-degenerate in E1. In fact, assume that
'P(x1,y1)= 0 for a fixed vector x1 e E1 and all vectors Yi e E1. Consider an arbitrary vector yeE. This vector can be written as
YYo+Yi
y0eE0,y1eE1
whence
'P(x1,y) = 'P(x1,y0) + 'P(x1,y1) = 0.
(9.21)
This equation shows that x1 is contained in E1 and hence it is contained in the intersection E0 n E1. This implies that x1 =0. Now the construction of sec. 9.6 can be applied to the subspace E1. We thus obtain altogether a direct decomposition (9.22)
of E such that 'P is positive definite in and negative definite in E. 9.8. Diagonalization of the matrix. Let (x1 .x5) be a basis of E which is orthonormal with respect to 'P, (Xs+i...Xr) be a basis of E which is . .
Chapter IX. Symmetric bilinear functions
orthonormal with respect to —P, and (Xr+i...Xn) be an arbitrary basis of Then + l(v= is) ç — I (v = s + I ... r) where = (xv, = (
O(v=r+1...n)
then form a basis of E in which the matrix of Ji has the following diagonal-form: The vectors (x1 .
. .
0
9.9. The index. It is clear from the above construction that there are infinitely many different decompositions of the form (9.22). However, the dimensions of E + and E - are uniquely determined by the bilinear function b. To prove this, consider two decompositions (9.23)
and (9.24)
such that P is positive definite in
and
and negative definite in
Ej and E. This implies that n
whence dim
n.
(9.25)
+dimEj +dimE0=n.
(9.26)
+ dim Ej + dim E0
Comparing the dimensions in (9.23) we find
Equations (9.25) and (9.26) yield
§ 2. The decomposition of E
Interchanging
and
269
we obtain
whence
=
Consequently, the dimension of is uniquely determined by cu. This number is called the index of the bilinear function cui and the number —dim E =2s—r is called the signature of cP. dim Now suppose that (v = 1.. .n) is a basis of E in which the quadratic function cli has diagonal form 11(x) = and assume that
and Then p is the index of cli. In fact, the vectors (v = 1. . .p) generate a subspace of maximal dimension in which cli is positive definite. From the above result we obtain Sylvester's law of inertia which asserts
that the number of positive coefficients is the same for every diagonal form. 9.10. The rank and the index of a symmetric bilinear function can be determined explicitly from the corresponding quadratic form cil(x) =
We can exclude the case cu =0. Then at least one coefficient from zero. If apply the substitution
= Then
where say
=
+
is different
—
=
+0 and
0. Thus, we may assume that at least one coefficient is different from zero. Then çJl (x) can be written as
2"
X11
)
The substitution ill =
+
in (v=2...n)
Chapter IX. Symmetric bilinear functions
yields
(x) =
(p1)2 1
+
(9.27)
The sum in (9.27) is a symmetric bilinear form in (n — 1) variables and hence the same reduction can be applied to this sum. Continuing this way we finally obtain an expression of the form = Rearranging the variables we can achieve that
)V>O (v=l...s) (v=s+1...r) (v=r+l...n). Then r is the rank and s is the index of P.
Problems 1. Let cP + 0 be a given quadratic function. Prove that cu can be written in the form
±1 wheref is a linear function, if and only if the corresponding bilinear function has rank 1. 2. Given a non-degenerate symmetric bilinear form cu in E, let J be a subspace of maximal dimension such that cui(x, x)=0 for every xeJ. Prove that dim J = mm (s, n — s).
Hint: Introduce two dual spaces E* and F* and linear mappings and defined by
cli(x,y)=
3. Define the bilinear function
= Let S(E; E) be the space of all selfadjoint mappings and A (E; E) be the space of all skew mappings with respect to a positive definite inner product. Prove:
§ 2. The decomposition of E
271
a) cb(p, (p)>O for every p+O in S(E; E), b) 'P(p, (p)
d) The index of
is
n(n+1)
where n
2
dim E.
4. Find the index of the bilinear function = tr(i/10p)
—
in the space L(E; E). 5. Find the index of the quadratic form =
6. Let
i<j
be a bilinear function in E. Assume that E1 is a subspace of
E such that çji is non-degenerate in E1. Define the subspace E2 as follows:
A vector x2eE is contained in E2 if b(x1, x2) =
for all vectors x1 eE1.
0
Prove that
E=E1 7. Consider a (not necessarily symmetric) bilinear function such that (x, x)>O for all vectors x #0. Construct a basis of E in which the matrix of çji has the form 1
K1
—K1 1
1
1
1, Hint: Decompose CP in the form
=
+
where
cP1(x,y)=
+ CP(y,x))
and P2 (x, y) =
(cP (x, y)
— 1i
(y, x)).
Chapter IX. Symmetric bilinear functions
272
8. Let E be a 2-dimensional vector space, and consider the 4-dimensional space L(E; E). Prove that there exists a 3-dimensional subspace Fc:L(E; E) and a symmetric bilinear function in F such that the nilpotent transformations (cf. problem 7, Chap. IV, § 6) are precisely the transformations t satisfying (In other words, the nilpotent transformations form a cone in F). § 3.
Pairs of symmetric bilinear functions
9.11. In this paragraph we shall investigate the question under which
conditions two symmetric bilinear functions cP and W can be simultaneously reduced to diagonal form. To obtain a first criterion we consider the case that one of the bilinear functions, say W, is non-degenerate. Then the vector space E is self-dual with respect to W and hence there exists a linear transformation satisfying
x,yeE (cf. Prop. Ill, sec. 2.33). Suppose now that x1 and x2 are eigenvectors of p such that the corresponding cigenvalues and 22 are different. Then we have that = ¶t'(x1,x2) and
cP(x2,x1) = 22 ¶P(x2,x1)
whence in view of the symmetry of cP and W —
Since
22) W(x1,x2)
0.
it follows that ¶I'(x1,x2)=O and hence k(x1, x2)=O.
Proposition: Assume that Y is non-degenerate. Then P and
are
simultaneously diagonalizable if and only if the linear transformation 'p has n linearly independent eigenvectors. If 'p has /1 linearly independent eigenvectors consider the distinct eigenvalues of 'p. Then it follows that . .
E=E1EE...EEE. where E, is the eigenspace of V'
(x,
=
Then we have for 0
and
CP (xe,
and
1jj
= 0.
Now choose a basis in each space E1 such that ¶1' has diagonal form (cf.
§ 3. Pairs of symmetric bilinear functions sec.
273
9.8). Since
= it follows that has also diagonal form in this basis. Combining all these bases of the E such that and W have diagonal form. Conversely, let e (i = 1.. .n) be a basis of E such that (ei, = 0 and W (ei, = 0 if i +j. Then we have that
i+j. This equation shows that the vector (p e1 is contained in the orthogonal complement (with respect to the scalar product defined by of the subspace v + i. But Fj is the 1-dimensional generated by the vectors subspace generated by e, =2 e1. In other words, the are eigenvectors of (p. As an example let E be a plane with basis a, b and consider the bilinear functions W given by
P(a,b)=O,
P(b,b)=—1
and
W(a,a)= 0,
0.
1,
It is easy to verify that then the linear transformation p is given by
(pa=b, pb=—a. Since the characteristic polynomial of p is 22+ 1 it follows that (p has no eigenvectors. Hence, the bilinear functions and 'P are not simultaneously diagonalizable. Theorem: Let E be a vector space of dimension n 3 and let and 'I' be two symmetric bilinear functions such that
D(x)2-E-W(x)2+O
if x+O.
Then çji and 'P are simultaneously diagonalizable. Before giving the proof we comment that the theorem is not correct for dimension 2 as the example above shows. 9.12. To prove the above theorem we employ a similar method as in sec. 8.6. If one of the functions rIi and 'P,say 'P, is positive definite the desired basis-vectors are those for which the function
F (x) = 18
Greub. Linear Algebra
P (x)
x
+ 0.
(9.28)
274
Chapter IX. Symmetric bilinear functions
assumes a relative minimum. However, if is indefinite, the denominator and hence the in (9.28) assumes the value zero for certain vectors
function F is no longer defined in the entire space x+0. The method of sec. 8.6 can still be carried over to the present case if the function F is replaced by the function
arc tanF(x).
(9.29)
To avoid difficulties arising from the fact that the function arc tan is not single-valued, we shall write the function as a line-integral. At this point the hypothesis n 3 will be essential *). Let E be the deleted space x 0 and x = x (t)(0 t 1) be a differentiable curve in E. Consider the line-integral
=
(9.30)
+
J
0
taken along the curve x (t). First of all it will be shown that the integral J
depends only on the initial point x0=x(O) and the endpoint x=x(l) of the curve x(t). For this purpose define the following mapping of E into the complex w-plane:
w(x)= cP(x)+ I W(x). The image of the curve xQ) under this mapping is the curve w (t) =
cP
(x (t)) + I 'I' (x (t))
in the w-plane. The hypothesis 1i
(0 t 1)
(9.31)
+ 'P (x)2 + 0 implies that the curve
w(t)(0t 1) does not go through the point w=0. The integral (9.30) can now be written as 1
2J u +v dt 2
2
0
where the integration is taken along the curve (9.31). Now let 0(t) be an angle-function for the curve w(t) i.e. a continuous function of t such that v(t) u(t) (9.32) and sin 0(t) = cos0(t) = .
w(t)I
*) The proof given is due to JOHN MILNOR.
Jw(t)I
§ 3. Pairs of symmetric bilinear functions
275
(cf. fig. 2) *). It follows from the differentiability of the curve w (t) that
the angle-function 0 is differentiable and we thus obtain from (9.32) —uiwi +
d wi (9.33)
iwi
and
d wi Iwi
Fig. 2
(9.34) by
0
0 and adding these equa-
tions we find that
ui'—üv
0=-
-
+
Integration from t = 0 to t = gives 1
liv 2dt=O(l)—0(0) +V
ut) — U
2
showing that the integral J is equal to the change of the angle-function 0 along the curve w (t), (9.35) 2J = 0(1) — 0(0).
Now consider another differentiable curve x = (t)(0 t 1) in E with the initial point x0 and the endpoint x and denote by Jthe integral (9.30) Then formula (9.35) yields taken along the curve
2J= 0(1) — 0(0)
(9.36)
where U is an angle-function for the curve (t) =
cP
(t)) + i
(t))
Since the curves w (t) and (t)(0 t the same endpoint it follows that O(O)—O(O)=2k0m
and
(0 t 1).
1) have the same initial point and
O(l)—0(l)=2k1it
(9.37)
*) For more details cf. P.S. A1ExANDRoV. Combinatorial Topology, Vol. I. chapter II. § 2.
Chapter IX. Symmetric bilinear functions
where k0 and k1 are integers. Equations (9.35), (9.36) and (9.37) show that the difference J—i is a multiple of 7t,
J—J = km.
it remains to be shown that k=0. The hypothesis n3 implies that the space E is simply connected. in other words, there exists a continuous 1 into E such that mapping x = x (t, t) of the square 0 t 1, 0 t
x(t,O)=x(t), and
x(O,t)=x0,
x(1,'r)=x
The mapping x(t, t) can even be assumed to be differentiable. Then, for every fixed 'r, we can form the integral (9.30) along the curve
x(t,z) This integral is a continuous function J(t) of i. At the same time we know that the difference —J is a multiple of m, J(t) — J = mk('r).
Hence, k (z)
(9.38)
a continuous integer-valued function in the interval Ot I and thus k(t) must be a constant. Since k(0)=0 it follows that is
1). Now equation (9.38) yields
J(x)=J Inserting t = 1 we obtain the relation
i=J showing that the integral (9.30) is indeed independent of the curve x (t). 9.13. The function F. We now can define a single-valued function F in the space E by
F(x) =
r (x) I
J
-
W (x
cP(x)2
-
(x
+ W(x)2
W
(x)
dt
(9.39)
where the integration is taken along an arbitrary differentiable curve x(t) in E leading from x0 to x. The function F is homogeneous of degree zero,
F(2 x) = F(x),
> 0.
(9.40)
§ 3. Pairs of symmetric bilinear functions
277
To prove this, observe that F(itx)—F(x)=
— + W(x)2
I
j
—-dt.
Choosing the straight segment
as path of integration we find that
=
(1
—
whence
= 0.
—
This implies the equation (9.40). 9.14. The construction of eigenvectors. From now on our proof will follow the same lines as in sec. 8.6. We consider first the case that is non-
degenerate. Introduce a positive definite inner product in E. Then the continuous function F assumes a minimum on the sphere lxi = 1. Let e1 be a vector on lxi = I such that
F(e1) F(x) for all vectors lxi = 1. Then the homogeneity of F implies that
F(e1) F(x) for all vectors x +0. Consequently, the function
f(t) =
F(e1
+ ty),
where y is an arbitrary vector, assumes a minimum at t==0, whence
f'(O)=O.
(9.41)
Carrying out the differentiation we find that (0) — —
Equations
'P(e',y)
y)
P(e1)2 + ¶P(e1)2
(9 42)
(9.41) and (9.42) imply that
cl(e1,y)
k11(e1)
— P(e1) ¶I'(e1,y) = 0
(9.43)
for all vectors yeE. In this equation W(e1)+0. In fact, assume that
_____
278
Chapter IX. Symmetric bilinear functions
W(e1)=0. Then and hence equation (9.43) yields ¶I'(e1, y)=0 for all vectors yEE. This is a contradiction to our assumption that W is non-degenerate. Define the number by = then equation (9.43) can be written as VEE.
9.15. Now consider the subspace E1 defined by the equation
V'(e1,z) = 0.
Since W is non-degenerate, E1 has the dimension n — I. Moreover, the restriction of 'P to E1 is again non-degenerate: Assume that z1 is a vector of E1 such that
'P(z1,z)=O
(9.44)
for all vectors zeE1. Equation (9.44) implies that (9.45)
for every vector xEE because x can be decomposed in the form zEE1.
=0, and so 'P is non-degenerate in E1. Therefore, the construction of sec. 9.14 can be applied to E1. We thus obtain a vector e2eE1 with the property that Now it follows from (9.45) that
b (e2, z) =
22
'P (e2, z)
for every vector z
E E1
(9.46)
where — 2
1(e2) W(e2)
Equation (9.46) implies that
=
22
¶I'(e2,y)
for every vector yeE; in fact, y can be decomposed in the form ZEE1
(9.47)
§ 3. Pairs of symmetric bilinear functions
and
279
we thus obtain
+ P(e2,z) =
= =
+
(9.48)
=
W(e1,e2) +
and =
W(e2,z).
+
(9.49)
Equations (9.46), (9.48) and (9.49) yield (9.47).
Continuing this construction we obtain after n steps a system of n vectors
subject to the following conditions:
;') =
P
(9.50)
E
(er, v)
W
4 0
= Rearranging the vectors
+ ii).
0
and multiplying them with appropriate scalars
we can achieve that = where
that
s
ç+i(v=i...s) — 1(v
(9.51)
= s + 1 ...n)
denotes the signature of W. It follows from the above relations
the vectors
Inserting
form a basis of E. in the first equation
(9.50)
we
find
(9.52)
= Equations (9.51) and (9.52) show that the bilinear functions diagonal form in the basis 1.. .n).
and 'P have
9.16. The degenerate case. The degenerate case now remains to be con-
We may assume that W+O. Then it will be shown that there exists a scalar is non-desuch that the bilinear function sidered.
generate Let E* be a dual space of E and consider the linear mappings
and defined
by the equations
cP(x,y)=
and
Then Im
To prove this relation, let
fl
p (ker fl
= 0. be any vector. Then
(9.53)
=px
Chapter IX. Symmetric bilinear functions
for some xe ker i/i. Hence (9.54)
=0
¶P(x) =
and
=
x,
=
x> =0
(9.55)
and
because
and hence that = ço x = 0. x Now let I .n) be a basis of E such that the vectors (Xr+ i. form a basis of ker i/i. Employing a determinant-function zl +0 in E we obtain zl(gox1 + + = + ... (PXr + . .
.
The expansion of this expression yields a polynomial f(2) starting with
the term is not identically zero. This follows from the relation (9.53) and the fact that the r vectors = ...r) and the (n —r) in) are linearly independent. (a=r+ vectors (pxae(p(ker i/i) implies Hence, f is a polynomial of degree r. Our assumption that r 1. Consequently, a number can be chosen such is non-degenerate (cf. sec. 9.4). Then By the previous theorem there exists a basis (v = .. .n) of E in which the bilinear functions P and P+)LW both have diagonal form. Then the The coefficient of
1
1
functions
and ¶1' have also diagonal form in this basis.
Problems 1. Let 'k and 'I' be two symmetric bilinear functions in E. Prove that the condition P(x)2 + W(x)2 > 0, x+0 is equivalent to the following one: There exist real numbers ). and ji such that (x) + /2 W (x) > 0 for every x +0. and B = 2. Let A = be two symmetric n x n-matrices and assume that the equations and
§4. Pseudo-Euclidean spaces
together imply that
(v=
1
281
n). Prove that the polynomial
+ AB)
is of degree r and has r real roots where r is the rank of B. § 4.
Pseudo-Euclidean spaces
9.17. Definition. A pseudo-Euclidean space is a real linear space in which a non-degenerate indefinite bilinear function is given. As in the positive definite case, this bilinear function is called the inner product and
is denoted by (,).The index of the inner product is called the index of the pseudo-Euclidean space. Since the inner product is indefinite, the number (x, x) can be positive, negative or zero, depending on the vector x. A vector x+0 is called space-like, if (x, x) >0 time-like, if (x, x) <0
a light-vector, if (x, x)=0 The light-cone is the set of all light-vectors. As in the definite case two vectors x and y are called orthogonal if (x, y) =0. The light-cone consists of all vectors which are orthogonal to themselves.
A basis
1 ...n) is called orthonormal if e,2) =
where
c+i(v=i...s)
In sec. 9.8 we have shown that an orthonormal basis can always be constructed.
If an orthonormal basis two vectors
(v = 1. . .n)
is chosen, the inner product of
and
is given by the bilinear form
(x,y) =
= and the equation of the light-cone reads
v1
vs+1
(9.56)
Chapter IX. Symmetric bilinear functions
9.18. Orthogonal complements. Since the inner product in E is nondegenerate the space E is dual to itself. Hence, every subspace E1 c: E determines an orthogonal complement EjL which is again a subspace of E
and has complementary dimension. does not necessarily consist of the However, the intersection E1 n zero-vector alone, as in the positive definite case. Assume, for instance, that E1 is the 1-dimensional subspace generated by a light-vector 1. Then E1 is contained in E1L.
It will be shown that E1 fl E1'=O if and only if the inner product is non-degenerate in E1. Assume first that this condition is fulfilled. Let x1 be a vector of E1 n E1L. Then (x1,y1) = 0 for all vectors Yi EE1 and thus x1 =0. Conversely, assume that E1 fl E1L =
0.
(9.57)
,
Then it follows that (9.58)
since E1 and have complementary dimension. Now let x1 be a vector of E1 such that (x1, v1) =
0
for all vectors Ji
It follows from (9.58) that every vector y of E can be written as Y = Yi + 3
ii E E1,
E
whence
(x1, y) = (x1,
+ (x1,
y e E.
This equation implies that x1 =0. Consequently, the inner product is non-degenerate in E1. 9.19. Normed determinant functions. Let be a determinant function in E. Since E is dual to itself, the identity (4.21) applies to E yielding
40(x1, Substituting basis, we obtain
= in (9.59), where e%(v=
=
This equation shows that
(—
(9.59) I
n) is an orthonormal
§ 4. Pseudo-Euclidean spaces
283
Consequently, another determinant function, A, can be defined by
(9.60) Then the identity (9.59) assumes the form A (x1
...
(y1 ...
y1).
= (—
(961)
A determinant function satisfying the relation (9.61) is called a normed determinant function. Equation (9.60) shows that there exist exactly two normed determinant functions A and —A in E. 9.20. The pseudo-Euclidean plane. The simplest example of a pseudo-
Euclidean space is a 2-dimensional linear space with an inner product of index I. Then the light-cone consists of two straight lines. Selecting two vectors and 12 which generate these lines we have the equations (li,
=
0
and
(12, /2)
= 0.
(9.62)
But
(l1,12)+0 because otherwise the inner product would be identically zero. We can therefore assume that (9.63) (l1,12)=— 1. Given a vector x= + 12 of E the equations (9.62) and (9.63) yield
(x,x) =
—
<0 and x is time-like if >0. In showing that x is space-like if other words, the space-like vectors are contained in the two sectors S1 and T. and S2 of fig. 3 and the time-like vectors are contained in The inner product of two vectors and
\ 7
(2z)
Fig. 3
is
given by (x
\:
This formula shows that the inner product of two space-like vectors is positive if and only if these vectors are contained in the same one of the sectors .
.
and
Let an orientation be defined in E by the normed determinant function A.
284
Chapter IX. Symmetric bilinear functions
Then the identity (9.61) yields (n=2, s= 1) (x,
—
(9.64)
(x, y)2 = (x, x) (y, y).
A
If x and y are not light-vectors equation (9.64) may be written in the form
2
2
—='
(x, x) (y, y)
(9.65)
(x, x) (y, y)
Now assume in addition, that the vectors x and y are space-like and are and S2. Then contained in the same one of the sectors
(x,y)>O.
(9.66)
Relations (9.65) and (9.66) imply that there exists exactly one real number 0 ( — oo <0< cc) such that cosh 0 =
(x,y)
and
sinh 0 =
lxi
A(x,y) .
-
(9.67)
lxi
This number is called the pseudo-Euclidean angle between the space-like
vectors x and y. We finally note that the vectors e1 =
and
12)
e2
+ 12)
form an orthonormal basis of E. Relative to this basis the equation of the light-cone assumes the form —
= 0.
9.21. Pseudo-Euclidean spaces of index n—i. More generally let us consider an n-dimensional pseudo-Euclidean space with index n —1. Then
every fixed time-like unit vector z determines an orthogonal decomposition of E into an (n — 1)-dimensional subspace consisting of spacelike vectors and the 1-dimensional subspace generated by z. In fact, every vector xEE can be uniquely decomposed in the form
(z,y)=0 where the scalar 2 is given by 2
= —(x,z).
Passing over to the norm we obtain the equation
(x,x)=
+(y,y)
§ 4. Pseudo-Euclidean spaces
285
showing that
<(y, y) if x is space-like A2 > (y, y) if x is time-like A2 = (y, y) if x is a light-vector.
(9.68)
From this decomposition we shall now derive the following properties: (1) Two time-like vectors are never orthogonal. (2) A time-like vector is never orthogonal to a light-vector. (3) Two light-vectors are orthogonal if and only if they are linearly dependent.
(4) The orthogonal complement of a light-vector is an (n — 1)-dimensional subspace of E in which the inner product is positive semidefinite and has rank n—2. To prove (1), consider another time-like vector z1. This vector z1 can be written as (z,y1) = 0. (9.69) = Az + Yi Then
> (yr)
whence
Inner multiplication of (9.69) by z yields
(z,z1) = A(z,z) + 0. Next, consider a light-vector 1. Then
l=Az+y
(z,y)=0
and
A2 =(y,y) >0. These two relations imply that
(!,z) = A(z,z) + 0 which proves (2). Now let and '2 decompositions
be
two orthogonal light-vectors. Then we have the
=21z+y1 and
12
22z+y2,
whence —
A2 + (Y1'Y2) = 0.
(9.70)
Observing that
= (Yi'Yi) and
= (Y2'Y2)
we obtain from (9.71) the equation (Yl,Yl)(Y2'Y2) = (Y1'Y2)2.
(9.71)
Chapter IX. Symmetric bilinear functions
286
The vectors Yi and Y2 are contained in the orthogonal complement of z. In this space the inner product is positive definite and hence equation. Inserting (9.71) implies that Yt and Y2 are linearly dependent, Y2 whence 12=2'l. this into (9.70) we find Finally, let / be a light-vector and E1 be the orthogonal complement of 1. It follows from property (2) that E1 does not contain time-like vectors. In other words, the inner product is positive semidefinite in E1. To find the null-space of the inner product, assume that Yi is a vector of E1 such that (Yi'Y) = 0 for all vectors yeE1.
This implies that (Yi' Yi)=O showing that Yi is a light-vector. Now it follows from property (3) that Yi is a multiple of 1. Consequently, the null-space of the inner product in E1 is generated by 1. 9.22. Fore-cone and past-cone. As another consequence of the properties established in the last section it will now be shown that the set of all T (cf. fig. 4.) To time-like vectors consists of two disjoint sectors this purpose we define an equivalence relation in the set T of all time-like
vectors in the following way:
if (:1,z2)<0.
2
(9.72)
Relation (9.72) is obviously symmetric and reflexive.
To prove the transitivity, consider three time-like 1, 2, 3) and assume that
vectors
(z1,z3)<0 and (z2,z3)<0. We have to show that Fig. 4
(z1,z2) <0.
We may assume that is a time-like unit-vector. Then the vectors z1 and z2 can be decomposed in the form
=
=
+
(i = 1,2)
—
(9.73)
where the vectors Yi and Y2 are contained in the orthogonal complement F of z3. Equations (9.73) yield
=
—
(1 = 1,2)
+
(9.74)
and
(zl,:2) = —
+
(9.75)
§ 4. Pseudo-Euclidean spaces
287
It follows from (9.68) that (ye, yi) <
(1 = 1,2).
Now observe that the inner product is positive definite in the subspace F. Consequently, the Schwarz inequality applies to the vectors Yi and Y2' yielding (Y1'Y2)2
inequality shows that the first term determines the sign on the right-hand side of (9.75). But this term is negative because = — z3) >0 (1 = 1, 2) and we thus obtain
This
(z1,z2)
<0.
The equivalence relation (9.72) induces a decomposition of the set T into two classes T which are obtained from each other by the reflection x—÷ —x.
9.23. The two subsets any two vectors z1 and
z2
T are convex, i.e., they contain with the straight segment
z(t)=(1—t)z1+tz2 In fact, assume that z1 and z2 ETt
Then
(z1, z1) <0,(z2, z2) < 0 and (z1, z2) <0, whence
(z(t),z(t))
(1 — t)2(z1,z1) + 2t(1 —
t)(z1,z2) + t2(z2,z2) <0,
In the special theory of relativity the sectors and the past-cone.
T
The set S of the space-like vectors is not convex as fig. 4 shows.
Problems 1. Let E be a pseudo-Euclidean plane and g1, g2 be the two straight lines generated by the light-vectors. Introduce a Euclidean metric in E such that g1 and g2 are orthogonal. Prove that two vectors x+0 and y+O are orthogonal with respect to the pseudo-Euclidean metric if and only if they generate the Euclidean bisectors of g1 and g2.
2. Consider a pseudo-Euclidean space of dimension 3 and index 2. Assume that an orientation is defined in E by the normed determinant
Chapter IX. Symmetric bilinear functions
288
function A. As in a Euclidean space define the cross product of two vectors x1 and x2 by the relation
(x1 x x2,x3) = A(x1,x2,x3). Prove: a) x1 x x2 = 0 if and only if the vectors x1 and x2 are linearly dependent. b) (x1 x x2, x1 x x2)= (x1, x2)2 — (x1, x1) (x2, x2) c) If e1, e2, e3 is a positive orthonormal basis of E, then
e1 x e2 =
— e3,
= — e2,
e1 x
e2 x
=
e1
3. Let E be an n-dimensional pseudo-Euclidean space of index n — 1. Given two time-like unit vectors z1 and z2 prove: a) The vector +z2 is time-like or space-like depending on whether z1 and z2 are contained in the same cone or in different cones. b) The Schwarz inequality holds in the reversed form (z1,z2)2 (z1,z1)(z2, z2).
Equality holds if and only if z1 and z2
are
linearly dependent.
4. Denote by S the set of all space-like vectors. Prove that the set S More precisely: Given two vectors x0eS and x1eS is connected if there exists a continuous curve x=x(t)(O t 1) in S such that x(O)=x0
x(l)=x1. §
5. Linear mappings of pseudo-Euclidean spaces
9.24. The adjoint mapping. Let p a linear transformation of the n-di-
mensional pseudo-Euclidean space E. Since E is dual to itself with respect to the inner product the adjoint mapping can be defined as in sec. (8.1). The mappings çø and are connected by the relation
X,yEE. The duality of the mappings ço and det
= det 'p
(9.76)
implies that and tr
= tr 'p.
(v, = 1 ii) be the matrices of 'p and relative to an and Let into (9.76) we find that Inserting and orthonormal basis .
.
(v,u= I ...n)
§ 5. Linear mappings of pseudo-Euclidean spaces
where
289
J+1(v=1...s) l—1(v=s+1...n).
Now assume that the mapping p is selfadjoint, = p. In the positive definite case we have shown that there exists a system of n orthonormal eigenvectors. This result can be carried over to pseudo-Euclidean spaces of dimension n 3 if we add the hypothesis that (x, (px) + 0 for all lightvectors. To prove this, consider the symmetric bilinear functions
k(x,y)=(çox,y) and W(x,y)=(x,y). It follows from the above assumption that çji
(x)2
> 0 for all vectors x + 0.
+ 'I'
Hence the theorem of sec. 9.11 applies to and exists an orthonormal basis (v= 1 ...n) such that
showing that there
(v,p = 1... n).
=
(9.77)
Equations (9.77) imply that
(v=1...n) showing that the are eigenvectors of (p. 9.25. Pseudo-Euclidean rotations. A linear transformation çø of the pseudo-Euclidean space E which preserves the inner product, (çox,çoy) = (x,y)
(9.78)
is called a pseudo-Euclidean rotation. Replacing y by x in (9.78) we obtain the equation
(px,px)=(x,x)
xeE
showing that a pseudo-Euclidean rotation sends space-like vectors into space-like vectors, time-like vectors into time-like vectors and lightvectors into light-vectors. A rotation is always regular. In fact, assume that çox =0 for a vector xeE. Then it follows from (9.78) that
(x,y) =((px,py) = 0 for all vectors yeE, whence x=0. Comparing the relations (9.76) and (9.78) we see that the adjoint and the inverse of a pseudo-Euclidean rotation coincide, (9.79) 19
Greub. Linear Algebra
Chapter IX. Symmetric bilinear functions
290
Equation (9.79) shows that the determinant of must be ± 1, as in the Euclidean case. Now let e be an eigenvector of 'p and ). be the corresponding eigenvalue,
çoe
=
Passing over to the norms we obtain (coe,coe) =
equation shows that ).= ± 1 provided that e is not a light-vector. An eigenvector which is contained in the light-cone may have an eigenvalue ± I as can be seen from examples. If an orthonormal basis is chosen in E the matrix of satisfies the relations = This
A matrix of this kind is called pseudo-orthogonal. 9.26. Pseudo-Euclidean rotations of the plane. In particular, consider
a pseudo-Euclidean rotation 'p of a 2-dimensional space with index 1. Then the light-cone consists of two straight lines. Since the light-cone is preserved under the rotation ço, it follows that these straight lines are either transformed into themselves or they are interchanged. Now assume that 'p is a proper rotation i.e. det 'p = + I. Then the second case is im-
possible because the inner product is preserved under 'p. Thus we can write
and
1
'p12—j2,
(9.80)
It
the basis of E defined in sec. 9.20. The number). is positive or negative depending on whether the sectors T are mapped onto themselves or interchanged. where
/2 is
Now consider an arbitrary vector (9.81)
Then equations (9.80) and (9.81) yield
'p
(x, x).
(9.82)
§ 5. Linear mappings of pseudo-Euclidean spaces
291
This equation shows, that the inner product of x and çox depends only on the norm of x as in the case of a Euclidean rotation (cf. sec. 8.21). To find the "rotation-angle" of (p, introduce an orientation in E such is positive. Let zI be a normed determinant function that the basis which represents this orientation. Then identity (9.64) yields 4(11,12)2 =
12) =
1,
whence 4(11,12) = 1.
Inserting the vectors x and çox into A we find that
= T Now assume in addition that p transforms the sectors and T). Then into themselves (i. e., that tp does not interchange
2>
and equation (9.82) shows that (x, q,x) >0 for every space-like vector x. Using formulae (9.67) we obtain the equations 0
cosh ü =
(2 +
and sinh
=1
(9.83) —
where 0 denotes the pseudo-Euclidean angle between the vectors x and Now consider the orthonormal basis of E which is determined by the vectors
=
— 12)
and
e2
+ 12).
Then equations (9.80) yield
—
+ 2(A
thus obtain the following representation of p, which corresponds to the representation of a Euclidean rotation given in sec. 8.21: We
0 + e2 sinh 0 = e1 sinh 0 + e2 cosh 0.
p e1 = e1 cosh
q, e2 19*
292
Chapter IX. Symmetric bilinear functions
9.27. Lorentz-transformations. A 4-dimensional pseudo-Euclidean space with index 3 is called Minkowski-space. A Lorentz-transformation is
a rotation of the Minkowski-space. The purpose of this section is to show that a proper Lorentz-transforrnation ço possesses always at least one eigen vector on the We may restrict ourselves to Lorentz-
transformations which do not interchange fore-cone and past-cone because this can be achieved by multiplication with — I. These transformations are called orthochroneous. First of all we observe that a lightvector 1 is an eigenvector of ço if and only if (1, col) = 0. In fact, the equation çol=2l yields (l,(p/) = 2(1,1) = 0.
Conversely, assume that 1 is a light-vector with the property that (1, pl) = 0. Then it follows from sec. (9.21) property (3) that the vectors 1 and çol are linearly dependent. In other words, 1 is an eigenvector of ço. Now consider the selfadjoint mapping (9.84) Then
=(x,gox) xeE.
= j-(x,(px)+
(9.85)
It follows from the above remark and from (9.85) that a light-vector 1 is an eigenvector of p if and only if (I, = 0. We now preceed indirectly and assume that ço does not have an eigenvector on the lightcone. Then (x, = (x, cox) 0 for all light-vectors and hence we can apply the result of sec. 9.24 to the mapping /i: There exist four eigenvectors (v = 1.. .4) such that
(+
=
=
1(v = 1,2,3) 1
(v
= 4).
Let us denote the time-like eigenvector e4 by e and the corresponding eigenvalue by 2. Then and hence it follows from (9.84) that ço2e = 22çIie
—
e.
Next, we wish to construct a time-like eigenvector of the mapping 'p. If çoe is a multiple of e, e is such a vector. We thus may assume that the vectors e and çoe are linearly independent. Then these two vectors generate *) Observe that a proper Euclidean rotation of a 4-dimensional space need not have
eig envectors.
§ 5. Linear mappings of pseudo-Euclidean spaces
a
293
plane F which is invariant under ço. This plane intersects the light-cone
in two straight lines. Since the plane F and the light-cone are both invariant under these two lines are either interchanged or transformed into themselves. In the second case we have two eigenvectors of p on the light-cone, in contradiction to our assumption. In the first case select two generating vectors and '2 on these lines such that (/i, /2)= 1. Then and
The condition (911,(p12) = (11,12)
implies that cxf3= 1. Now consider the vector z Z
= 11 + CL!2.
Then
=
+ cx/311
cL!2
+
=
z
To show that z is timelike observe that
(z,z) = and that the vector t=11
is
2ci(11,12)
time-like. Moreover,
(t, q,t) <0 because ço leaves the fore-cone and the past-cone in-
variant (cf. sec. 9.22). Hence equation implies that <0, showing that z is time-like. Using the time-like vector z we shall now construct an eigenvector on the light-cone which will give us a contradiction. Let E1 be the orthogonal complement of z. E1 is a 3-dimensional Euclidean subspace of E which is invariant under çø. Since p is a proper Lorentz-transformation it induces a proper Euclidean rotation in E1. Consequently, there exists an invariant axis in E1 (cf. sec. 8.22). Let y be a vector of this axis such that (i'. i')= —2 Then / = v + z is a light-vector and CL.
y
+z=
1
i. e. 1 is an eigenvector of p. Hence, the assumption that there are no eigenvectors on the light-cone leads to a contradiction and the assertion in the beginning of this section is proved.
Chapter IX. Symmetric bilinear functions
294
We finally note that every eigenvalue 2 of p whose eigenvector 1 lies on the light-cone is positive. In fact, select a space-like unit-vector y such that (1, y) = and consider the vector z = 1+ ry where t is a real parameter. Then we have the relation 1
(:,:)
= 2t + z2
showing that z is time-like if —2<'r
preserves
(z,çoz)
(—2
+ ty,2l+
t(2
fore-cone and
But
and we thus obtain +
(—2
<0
+
Letting 'r—+O we see that 2 must be positive.
Problems 1. Let p be a linear automorphism of the plane E. Prove that an inner product of index 1 can be introduced in E such that p becomes a proper pseudo-Euclidean rotation if and only if the following conditions are satisfied:
1. There are two linearly independent eigenvectors.
2. detp=l. 3.
Find the eigenvectors of the Lorentz-transformation defined by the matrix 2.
0 —l
H1
0
0
H
0
1
1
0
1
Verify that there exists an eigenvector on the light-cone. 3. Let a and b be two linearly independent light-vectors in the pseudoof E is defined by Euclidean plane E. Then a linear transformation
i/ib=—b.
§ 5. Linear mappings of pseudo-Euclidean spaces
295
Consider the family of linear automorphisms (p (t) which is defined by the
differential equation
= and the initial condition
(p (0) = a) Prove that (p (t) is a family of proper rotations carrying fore-cone and past-cone into themselves. b) Define the functions C(t) and S(t) by
and S(t) =
C(t) = Prove the functional-equations
C(t1 + t2) = C(t1)C(t2) + S(t1)S(t2) and S(t1
+ t2) = S(t1)C(t2) + S(t2)C(t1).
c) Prove that
(p(t)a=eta
and
(p(t)b_—etb.
4. Let E be a pseudo-Euclidean space and consider an orthogonal decomposition
E=
such that the restriction of the inner product to
(E) is positive
(negative) definite. Let w be a selfadjoint involution of E such that (and hence E) is stable under w. Define a symmetric bilinear function
x,yEE.
W(x,y)=(wx,y) Prove that the signature of
is given by
—trw where
w denote the restrictions of w to
E respectively.
Chapter X Q uadrics
*
In the present Chapter the theory of the bilinear functions developed
in Chapter IX will be applied to the discussion of quadrics. In this context we shall have to deal with affine spaces. § 1.
Affine spaces
10.1. Points and vectors. Let E be a real n-dimensional linear space and let A be a set of elements F, Q... which will be called points. Assume that a relation between points and vectors is defined in the following way:
1. To every ordered pair F, Q of A there is assigned a vector of E, called the vector and denoted by PQ. 2. To every, point PeA and every vector xeE there exists exactly one
point QeA such that PQ=x. 3. If P, Q, R are three arbitrary points, then (10.1)
A is called an n-dimensional affine space with the d?fference space E.
Insertion of Q =P in (10.1) yields PP+ FR=PR, whence PP= 0 for every point PeA. Using this relation we obtain from (10.1)
QP = - P Q. The equation P1Q1=P2Q2 implies that P1P2=Q1Q2 (parallelogramlaw). In fact
P1P2=P1Q2-P2Q2 =
P1Q2
—
Q1.
Subtraction of these equations yields P1P2= For any given linear space E, an affine space can be constructed which possesses E as difference space:
§ 1. Affine spaces
297
Define the points as the vectors of E and the difference-vector of two
points x and y as the vector y—x. Then the above conditions are obviously satisfied.
Let A be a given affine space. If a fixed point 0 is distinguished as origin, every point P is uniquely determined by the vector x = OP. x is called the position-vector of P and every point P can be identified with the corresponding position-vector x. The difference-vector of two points x and y is simply the vector y —x. 10.2. Affine coordinate systems. An affine coordinate-system (0; x1 . l...n) of the consists of a fixed point OEA, the origin, and a basis .
difference-space E. Then every point PeA determines a system of n numbers
(v= 1.. ii) by
OP = (v = .. n) are called the affine coordinates of P relative to the given system. The origin 0 has the coordinates Now consider two affine coordinate-systems
The numbers
1
(0; x1 ...
and
(0'; Yi ...
the matrix of the basis-transformation affine coordinates of 0' relative to the system (0; x1
and by f3V the
Denote by
. .
= The affine coordinates
tems (0; x1 .
.
and
and (0'; Yi
. .
of a point P, corresponding to the sys.y,,)
respectively, are given by
and
01P _>:JJVy
(10.2)
Inserting O'P= OP—00' in the second equation (10.2) we obtain = — whence
(JL==1...n).
Multiplication by the inverse matrix yields the transformation-formula for the affine coordinates:
(v=1...n).
Chapter X. Quadrics
10.3. Affine subspaces. An affine suhspace of A is a subset A1 of A such
that the vectors PQ(PeA1, QeA1) form a subspace of E. If 0 is the origin of A and (01; is an affine coordinate-system of A1, the points of A can be represented as 1
=
(10.3)
+
Forp= I we obtain a straight line through °1 with the "direction vector"
0P= 0
+ x.
In the case p=2 equation (10.3) reads
It then represents the plane through 01 generated by the vectors x1 and x2. An affine subspace of dimension n — is called a hyperplane. 1
Two affine subspaces A1 and A2 of A are called parallel if the difference-space E1 of A1 is contained in the difference-space E2 of A2, or conversely. Parallel subspaces are either disjoint or contained in each other. Assume, for instance, that E2 is contained in E1. Let Q be a point of the intersection A1 fl A2 and P2 be an arbitrary point of A2. Then
QP2 is contained in E2 and hence is contained in E1. This implies that P2EA1, whence A2 c: A1. 10.4. Affine mappings. Let P—÷P' be a mapping of A into itself subject
to the following conditions: 1. P1 Q1 =P2Q2
implies that
2. The mapping p:
defined by p (PQ)=P'Q' is linear.
Then P—*P' is called an affine mapping. Given two points 0 and 0' and
a linear mapping p: E—*E, there exists exactly one affine mapping which sends 0 into 0' and induces 'p. This mapping is defined by
0
=
00' + 'p (OP).
If a fixed origin is used in A, every affine mapping x—÷x' can be written in the form
x' =
'p
x + b,
where 'p is the induced linear mapping and b =0 0'.
§ 1. Affine spaces
299
A translation is an affine mapping which induces the identity in E,
For two arbitrary points P and F' there obviously exists exactly one translation which sends P into P'. 10.5. Euclidean space. Let A be an n-dimensional affine space and assume that a positive definite inner product is defined in the differencespace E. Then A is called a Euclidean space. The distance of two points P and Q is defined by
Q(P,Q)= IPQI.
It follows from this definition that the distance has the following properties:
1. Q(P,Q)0 and Q(P,Q)=0 if and only ifP=Q. 2. 3.
Q(P,R) + Q(R,Q). All the metric concepts defined for an inner product space (cf. Chap. VII) can be applied to a Euclidean space. Given a point x1eA of A and a vector p +0 there exists exactly one hyperplane through x1 whose difference-space is orthogonal to p. This hyperplane is represented by the equation (x — x1,p)= 0. A rigid motion of a Euclidean space is an affine mapping P—+P' which preserves the distance, (10.4) Q(P',Q') = Q(P,Q).
Condition (10.4) implies that the linear mapping, which is induced in the difference-space by a rigid motion, is a rotation. Conversely, given a rotation p and two points OeA and O'eA, there exists exactly one rigid motion which induces 'p and maps 0 into 0'.
Problems in an affine space are said to be in general 1. (p+ 1) points are not contained in a (p — 1)-dimensional subposition, if the points .p) are in general position if and space. Prove that the points only if the vectors
are linearly independent.
Chapter X. Quadrics
in general position, the set of all
2. Given (p+ 1) points points P satisfying —--*
p
p
is called the p-simplex spanned by the points .p). If 0 is the origin of A, prove that a point P of the above simplex can be uniquely represented as .
= 1.
= The numbers
(v=O.. .p) are called the barycentric coordinates of P.
The point B with the barycentric coordinates
1
+
(v=O. ..p) is
called the center of S. 3. Given a p-simplex (P0.. (p 2) consider the (p — 1) simplex Si deand denote by B1 the center fined by the points of Show that the straight lines (P,B1) and (i+j) intersect each other at the center B of S and that
BB=
1
p+l
B.P..
4. An equilateral simplex of length a in a Euclidean space is a simplex (P0..
=a(v+p). Find the angle between
with the property that
the vectors
and B
+ v, 2 + v) and between the vectors
is the center of (P0
. . .
Assume that an orientation is defined in the difference-space E by the determinant function A. An ordered system of (n + 1) points (P0.. . in general position is called positive with respect to the given orientation, if A (P0 P1 ... P0 P,j > 0. 5.
a) If the system (P0..
is positive and
numbers (0, 1 .. .n), show that the system
is a permutation of the . . .
is again positive if
and only if the permutation a is even. b) Let A1 be the (n — 1)-dimensional subspace spanned by the points P0.. P1.. Introduce an orientation in the difference-space of A, with
the help of the determinant function
§ 2. Quadrics in the affine space
301
where Q is an arbitrary point of A1. Prove that the ordered n-tuple is positive with respect to the determinant function an n-simplex andSbe its center. Denote by Sll...ik the center of the (n —k)-simplex obtained by deleting the vertices P,.., and define Now select an ordered system of n integers ii,..., the affine mapping by 6. Let (P0..
Prove that
(ii+l)!
is the corresponding linear map.
where
In this equation a denotes the permutation a(v)=i%. (v= l...n), a(O)=k ia). where k is the integer not appearing among the numbers 7. Let g1 and g2 be two straight lines in a Euclidean space which are not parallel and do not intersect. Prove that there exists exactly one . . .
point P1 ong1(i=l, 2) such that P1 P2 is orthogonal to g1 and tog2. 8. Let A1 and A2 be two subspaces of the affine space such that the difference-spaces E1 and E2 form a direct decomposition of E. Prove that the intersection A1 n A2 consists of exactly one point. 9. Prove that a rigid motion x' = tx + a has a fixed point if and only if the vector a is orthogonal to all the vectors invariant under t. A rigid motion is called proper if det t = + 1. Prove that every proper rigid motion of the Euclidean plane without fixed points is a translation.
10. Consider a proper rigid motion x'=tx+a(t+i) of the Euclidean plane. Prove that there is exactly one fixed point x0 and that + b is a vector of the same length as a and orthogonal to a. O designates the rotation-angle relative to the orientation defined by the basis (a, b). 11. Prove that two proper rigid motions + t and /3 + t of the plane commute, if and only if one of the two conditions holds: 1. cz and /3 are both translations 2. and /3 have the same fixed point.
§ 2.
Quadrics in the affine space
10.6. Definition. From elementary analytic geometry it is well known
Chapter X. Quadrics
that every conic section can be represented by an equation of the form
and are constants. Generalizing this to higher dimensions we define a quaciric Q in an n-dimensional affine space A as the set of all points satisfying an equation of the form where
'P(x) + 2f(x) =
(10.5)
is a quadratic function,f a linear function and a constant. For the following discussion it will be convenient to introduce a dual space E* of the difference-space E. Then the bilinear function 'P can be written in the form where
x,yeE, where p is a linear mapping of E into E* which is dual to itself: = çø. Moreover, the linear functionf determines a vector a*EE* such that
f
xeE.
Hence, equation (10.5) can be written in the form
(10.6)
We recall that the null-space of the bilinear function 'P coincides with the kernel of the linear mapping çø.
10.7. Cones. Let us assume that there exists a point x0eQ such that (pxo+a*=0. Then (10.6) can be written as
cc
(10.7)
and the substitution x=x0 gives cc =
—
x0, x0>.
Inserting this into (10.7) we obtain
+
—
Hence, the equation of Q assumes the form 'P(x —
x0)= 0.
A quadric of this kind is called a cone li'ith the vertex x0. For the sake of
§ 2. Quadrics in the affine space
303
simplicity, cones will be excluded in the following discussion. In other
words, it will be assumed that
p x + a* + 0 for all points x e Q.
(10.8)
10.8. Tangent-space. Consider a fixed point x0eQ. It follows from condition (10.8) that the orthogonal complement of the vector (pxo+a* is of E. This subspace is called the an (n — 1)-dimensional subspace if and tangent-space of Q at the point x0. A vectoryEEis contained in only if (10.9) 0.
andf equation (10.9) can be written as
In terms of the functions
(xe, y) + f (y)
0.
(10.10)
affine subspace which is determined by the point x0 and the tangent-space is called the tangent-hyperplane of Q at x0. It consists of all points The (n — 1)-dimensional
x=x0+y Inserting y=x—x0 into equation (10.10) we obtain
cli(x0,x—x0)+f(x—x0)=0.
(10.11)
Observing that cP(x0)+
we can write equation (10.11) of the tangent-hyperplane in the form
+ x)=
(10.12)
To obtain a geometric picture of the tangent-space, consider a 2dimensional plane (10.13)
through x0 where a and b are two linearly independent vectors. Inserting (10.13) into equation (10.5) we obtain the relation + +
+
a) + f (a)) +
+
b) + 1(b)) =
0
(10.14)
showing that the plane F intersects Q in a conic Define a linear function g by setting (10.15) g(x) = x) + .1(x)).
Chapter X. Quadrics
304
Now the equation of the conic y can be
then. in view of (10.8), written in the form
+ ijg(b) = 0.
b) + ij2'P(b) +
+
(10.16)
Now assume that the vectors a and h are chosen such that g(a) and g(h) are not both equal to zero. Then the conic has a unique tangent at the and this tangent is generated by the vector point
t=—g(b)a +g(a)b.
(10.17)
this follows from the The vector t is contained in the tangent-space equation g(t) = — g(b)g(a) + g(a)g(b) = 0.
can be obtained in this way. Every vector y+O of the tangent-space in fact, let a be a vector such that g (a) = and consider the plane through x0 spanned by a andy. Then equation (10.17) yields 1
t=—g(y)a +g(a)y=y
(10.18)
showing that y is the tangent-vector of the intersection Q fl F at the point
Note: If g(a)=0 and g(b)=0 equation (10.16) reduces to + ?J2cIi(b) = 0.
+
Then the intersection of Q and F consists of a) two straight lines intersecting at x0, if cli(a,b)2 — b)
the point x0 only, if 1(a, b)2 — cP(a)cP(b)
<0,
c) one straight line through x0, if —
but not all three coefficients (a),
7i(a)cP(b) = 0,
(b) and P (a, b) are zero
d) The entire plane F, if D(a) = 'P(b) = P(a,b) = 0. 10.9. Uniqueness of the representation. Assume that a quadric Q is represented in two ways (x) + 2f1 (x) = (10.19)
§ 2. Quadrics in the affine space
305
and
'P2 (x) + 212(x) =
(10.20)
It will be shown that =
=
'P2 =
+0 is a real number. Let x0 be a fixed point of Q. It follows from hypothesis (10.8) that the linear functions g1 and g2, defined by where ).
g1 (x) =
'P1
(x, x0) +
(x) and g2 (x) = 'P2 (x, x0) + 12(x)
(10.21)
are not identically zero. Choose a vector a such that g1 (a) + 0 and g2 (a) + 0, and a vector b +0 such that g1 (h) 0. Obviously a and h are linearly independent. The plane
x=
x0
+
+
then intersects the quadric Q in a conic y whose equation is given by each one of the equations 'P1(a) +
(b) + g1 (a) =
'P1 (a, b) +
0
(10.22)
and
'P2(a) +
'P2(b) + g2 (a) + g2 (b) = 0.
'P2 (a, b) +
(10.23)
The tangent of this curve at the point çe = = 0 is generated by the vector
=
g1
(a) b
and also by =
— g2
(b) a + g2 (a) b.
This implies that (10.24)
But b+0 was an arbitrary vector of the kernel of g1. Hence equation (10.24) shows that g2(b)=0 whenever g1(b)=0. In other words, the linear functions g1 and g2 have the same kernel. Consequently g2 is a constant multiple of g1
g2=Ag1 /+0.
(10.25)
Multiplying equation (10.22) by ). and subtracting it from (10.23) we obtain in view of (10.24) that —
).P1)(a) +
—
+
—
= 0. (10.26)
20
(Ireub. Linear Algebra
Chapter X. Quadrics
In this equation all three coefficients must be zero. In fact, if at least one coefficient is different from zero, equation (10.26) implies that the conic consists of two straight lines, one straight line or the point x0 only. But this is impossible because g1 (a)+0. We thus obtain from (10.26) that 2 (a, b) = 2 (a, b), (a), (10.27) 2 (b).
These equations show that 2
(x)
(10.28)
for all vectors XEE: If g1 (x)=0, (10.28) follows from the third equation (10.27); if g1 (x)+0, then g2 (x)+0 [in view of (10.25)] and (10.28) follows from the first equation (10.27). Altogether we thus obtain the identities (P2—2(P1
2+0.
and g2=1g1
Now relations (10.21) imply that 12 = 2f1
and equation (10.20) can be written as 2(cP1 (x)
+ 2f1 (x))
(10.29)
Comparing equations (10.19) and (10.29) we finally obtain completes the proof of the uniqueness theorem. 10.10. Centers. Let Q: (P (x) + 2f (x) = be
==2a1. This
a given quadric and c be an arbitrary point of the space A. if we
introduce c as a new origin,
x= the
C + X',
equation of Q is transformed into (P (x') + 2 ((P (c, x') + f (x')) =
— (P (c) —
2f (c).
(10.30)
Here the question arises whether the point c can be chosen such that the linear terms in (10.30) disappear, i. e. that (P (c,
x') + f (x') = 0
(10.31)
for all vectors x'eE. If this is possible, c is called a center of Q. Writing
§ 2. Quadrics in the affine space
307
equation (10.31) in the form
x'eE we see that c is a center of Q if and only if (cf. sec. 10.7)
coc=_a*.
(10.32)
This implies that the quadric Q has a center if and only if the vector a* is contained in the image space Im cp. Observing that Im is the orthogonal complement of the kernel of p we obtain the following criterion: A quadric Q has a center if and only if the vector a* is orthogonal to the kernel of p. if this condition is satisfied, the center is determined up to a vector of ker ço. In other words, the set of all centers is an affine subspace of A with ker ço as difference-space. Now assume that the bilinear function cli
is
non-degenerate. Then
is
a
regular mapping and hence equation (10.32) has exactly one solution. Thus it follows from the above criterion that a non-degenerate quadric has exactly one center. 10.11. Normal-form of a quadric with center. Suppose that Q is a quad-
nc with centers. If a center is used as origin the equation of Q assumes the form ck(x)=f3 (10.33) $+0.
Then the tangent-vectors y at a point x0 e Q are characterized by the equation
(10.34)
It follows from (10.34) that a center of Q is never contained in a tangenthyperplane. 1 Dividing (10.33) by ,13 and replacing the quadratic function cli by cli we can write the equation of Q in the normal-form cll(x) = 1.
Now select a basis
(10.35)
(v = 1.. .n) of E such that
+ 1(v = =
is)
= — I (v = s + 1... r)
0(v=r+1...n)
(10.36)
Chapter X. Quadrics
308
where r denotes the rank and s denotes the index of form (10.36) can be written as
Then the normal-
1.
(10.37)
10.12. Normal-form of a quadric without center. Now consider a quadnc Q without a center. If a point of Q is chosen as origin the constant in (10.5) becomes zero and the equation of Q reads
ck(x) + 2 = 0.
(10.38)
By multiplying equation (10.38) with — if necessary we can achieve that 2s r. In other words, we can assume that the signature of çJi is not 1
negative
To reduce equation (10.38) to a normal form consider the tangentspace T0 at the origin. Equation (10.9) shows that T0 is the orthogonal complement of a*. Hence, a* is contained in the orthogonal complement On the other hand, a* is not contained in the orthogonal complement K' (K=ker p) because otherwise Q would have a center (cf. sec. 10.10). show that T01cf K'. Taking the orthoThe relations a*ET0' and gonal complement we obtain the relation T0 K showing that there exists a vector aeK which is not contained in T0 (cf. fig. 5). Then +0 and hence we may assume that = 1. Now T0 has dimension n— 1 and hence every vector
xEE can be written in the form
yeT0.
(10.39)
Inserting (10.39) into equation (10.38) we obtain
2
Fig. 5
= 0.
(10.40)
Now
and
cP(a)=0,
because aeK, and = 0, because yeT0. Hence, equation (10.40) reduces to the following normalform: = 0. (10.41) cP(y) +
Since a is contained in the null space of 'P it follows from the decomposition (10.39) that the restriction of 'P to the tangent-space T0 has again
§2. Quadrics in the affine space
rank r and index s. Therefore we can select a basis = 1. such that (v'u = 1 ... n — I).
309
.
.n — 1) of T0
Then the vectors (v = 1.. .n — 1) and a form a basis of E in which the normal form (10.41) can be written as = 0.
+
(10.42)
Problems 1. Let E be a 3-dimensional pseudo-Euclidean space with index 2. Given an orientation in E define the cross product x x y by
(x x y, z) = A (x, y, z)
x, y, z e E
where A is a normed determinant function (cf. sec. 9.19) which represents the orientation. Consider a point x0 + 0 of the light-cone (x, x)
0
andaplane which does not contain the point 0. Prove that the intersection of the plane F and the light-cone is an ellipse if a x b is time-like a hyperbola if a x b is space-like a parabola if a x b is a light-vector.
2. Consider the quadric Q : (x) = where 1i is a non-degenerate quadratic function. Then every point x1 +0 defines an (n — 1)-dimensional 1
subspace P(x1) by the equation
cP(x,x1)= 1. This subspace is called the polar of x1. It follows from the above equation
that the polar P(x1) does not contain the center 0. a) Prove that x2eP(x1) if and only if x1 eP(x2). b) Given an (n — 1)-dimensional affine subspace A1 of A which does not contain 0, show that there exists exactly one point x1 such that A1 cP(x1). c) Show that P(x1) is a tangent-plane of Q if and only if x1 eQ. 3. Let x1 be a point of the quadric (x) = 1. Prove that the restriction has the rank r — and of the bilinear function to the tangent-space index s—l. 4. Show that a center of a quadric Q lies on Q only if Q is a cone. 1
Chapter X. Quadrics
5. Let x0 be a point of the quadric 'P (x) + 2f (x) + ci =
0
and consider the skew bilinear mapping w: E x w (x, y) = g (x) y — g (v) x
defined by x, y E E
the linear function g is defined by (10.15). Show that the linear closure of the set w (x,y) under this mapping is the tangent-space where
§ 3.
Affine equivalence of quadrics + b of A onto itself be
10.13. Definition. Let an affine mapping x' = given. Then the image of a quadric Q: 'P (x) is
+ 2f (x) =
the quadric Q' defined by the equation Q': W(x) + 2g(x) =
J3,
where
= 'P(t'x), g(x) = — 'P(t'x,r1 b) + f
(10.43)
(t1 x)
(10.44)
and (10.45)
In fact, relations (10.43), (10.44) and (10.45) yield
+ b) + 2g(tx + b)
= 'P(x) + 21(x)
—
that a point XEA is contained in Q if and only if the point x'=tx+b is contained in Q'. showing
Two quadrics Q1 and Q2 are called affine equivalent if there exists a one-to-one affine mapping of A onto itself which carries Q1 into Q2. The
affine equivalence induces a decomposition of all possible quadrics into affine equivalence classes. It is the purpose of this paragraph to construct a complete system of representatives of these equivalence classes. 10.14. The affine classification of quadrics is based upon the following theorem: Let E and F be two n-dimensional linear spaces and 'P and
two symmetric bilinear functions in E and in F. Then there exists a
§ 3. Affine equivalence of quadrics
linear isomorphism r:
311
with the property that
x,yeE
=
(10.46)
if and only if cP and W have the same rank and the same index. To prove this assume first that the relation (10.46) holds. Select a basis (v= 1.. ii) of E such that
+1(v=1..s) = — 1(v = s + 1 ... r)
(10.47)
0(v=r+1...n). Then equations (10.46) and (10.47) yield
=
=
showing that W has rank r and index s.
Conversely, assume that this condition is satisfied. Then there exist and of E and of F such that
bases
(ar, a,L)
and
=
=
W (by,
by the equations
Define the isomorphism 'r:
(v=1...n). Then
(v,u = 1 ... n)
P(aV,a,L) =
and consequently
x,yeE.
1(x,y)=W(xx,xy)
10.15. Affine classification. First of all it will be shown that the centers are invariant under an affine mapping. In fact, let Q:
çJi
(x — c)
=
/3
be a quadric with c as center and x' = xx + b an affine mapping of A onto itself. Then the image Q' of Q is given by the equation Q':
W
(x — c')
=
/3,
where
P(x1x) and
c' =
b
+ x c.
This equation shows that c' is a center of Q'.
Chapter X. Quadrics
312
Now consider two quadrics with center
Q1P1(x
(10.48)
— c1)= 1
and
Q2:P2(x—c2)=
(10.49)
1
is an affine mapping carrying Q1 into Q2. Since and assume that centers are transformed into centers we may assume the mapping x—+x' sends c1 into c2 and hence it has the form x' = r(x
—
c1) + c2.
By hypothesis, Q1 is mapped onto Q2 and hence the equation
c12(r(x — c1)) = 1
(10.50)
must represent the quadric Q1. Comparing (10.48) and (10.50) and applying the uniqueness theorem of sec. 10.9 we find that
= This relation implies that
= r2
r1
and
s1
= s2.
(10.51)
Conversely, the relations (10.51) imply that there exists a linear automorphism t of E such that 'P1(x) = 'P2(rx). Then the affine mapping
defined by
x' =
— c1) + c2
transforms Q1 into Q2. We thus obtain the following criterion: The two
normal forms (10.48) and (10.49) represent affine equivalent quadrics if and only if the bilinear functions and P2 have the same rank and the same index. 10.16. Next, let
=0
q1eQ1
(10.52)
=
q2 EQ2
(10.53)
and Q2:
(x
—
q2)
+2
x — q2>
0
be two quadrics without a center. It is assumed that the equations (10.52) and (10.53) are written in such a way that 2s1 r1 and 2s2 r2. If x' = t(x—q1) + q2 is an affine mapping transforming Q1 into Q2, the
§ 3. Affine equivalence of quadrics
313
equation of Q1 can be written in the form —
q1))
+
q1)>
—
= 0.
Now the uniqueness theorem yields
= 2b2('rx) where 2+0 is a constant. This relation implies that the bilinear functions have the same rank r and that cP1 and or =r depending
on whether 2>0 or 2<0. But the equation s2=r—s1 is only compatible with the inequalities 2s1
ifs1
and 2s2
=
and hence we see
that =s2 in either case. Conversely, assume that r1 = r2 = r and = s. To find an affine mapping which transforms Q1 into Q2 consider the tangent-spaces As has been mentioned in sec. 10.12 the restriction 71(Q1) and to the subspace (i= 1, 2) has the same rank and the same of index as cli1. Consequently, there exists an isomorphism _* such that yET41(Q1). P1(y) = Now select a vector a1 in the nullspace of
=
1
(i =
1,
2) such that
(1 = 1,2)
and define the linear automorphism 'r of E by the equations
ye71(Q1) and
'ra1=a2.
(10.54)
Then
=
+
cP1(x)
xeE.
+
(10.55)
in fact, every vector xEE can be decomposed in the form (10.56)
Equations (10.54), (10.55) and (10.56) imply that c112(tx) = cIi2(tY +
=
=
cP1(y)
= P1(y +
=
cP1(x)
(10.57)
and
=
+
=
=
=
(10.58)
Chapter X. Quadrics
314
Adding (10.57) and (10.58) we obtain (10.55). Relation (10.55) shows
that the affine mapping x'=r(x—q1)+q2 sends Q1 into
Q2 and we
have the following result: The normal-forms (10.52) and (10.53) represent affine equivalent quadrics if and only if the bilinear functions and '2 have the same rank and the same index. 10.17. The affine classes. It follows from the two criteria in sec. 10.15 and 10.16 that the normal forms
and
(r 2s) form a complete system of representatives of the affine classes. Denote by
N1 (r) and by N2 (r) the total number of affine classes with center and without center respectively of a given rank r. Then the above equations show that r+1 if r is odd 2
N1(r)=r and N2(r)= r+2. 2
.
if r is even
0
——
r=n.
The following list contains a system of representatives of the affine classes in the plane and in 3-space *):
Plane: I. Quadrics with center:
1. r=2: a) s=2: 2. r= I:
ellipse,
1
b) 5= 1:
s=1:
±
=1
hyperbola. two parallel lines.
=0
parabola.
1
11. Quadrics without center:
r= 1, s= 1: 3-space:
I. Quadrics with center:
1. r=3: a) s=3: b) s = 2:
c) s= 1:
ellipsoid,
+
— = I hyperboloid with one shell, =1
hyperboloid with two shells.
*) In the following equations the coordinates are denoted by scripts indicate exponents.
and the super-
§ 3. Affine equivalence of quadrics 2.
r
2: a) s = 2:
+ p12
elliptic cylinder,
1
=
b) s— 1: 3. r = I: s = 1:
315
hyperbolic cylinder.
two parallel planes.
=±I 1!. Quadrics without center:
1. r—2: a) s=2:
elliptic paraboloid,
=0 hyperbolic paraboloid.
b) s= 1: 2. r= I, s= 1:
parabolic cylinder.
Problems 1. Let Q be a given quadric and C be a given point. Show that C is a
center of Q if and only if the affine mapping —
defined by CP'=
CF transforms Q into itself. 2. If 1i is an indefinite quadratic function, show that the quadrics
=
1
'k(x) = — 1
and
are equivalent if and only if the signature of cP is zero.
3. Denote by N1 and by N2 the total number of affine classes with center and without center respectively. Prove that N1 = N2
n(n + 1) 2
(k2+k—1 if n=z2k
if n=2k+1.
4. Let x1 and x2 be two points of the quadric
Q:'P(x)= 1. Assume that an isomorphism t:
is given such that
P('ry,tz) = 1(y,z) which transforms Q into itself and Construct an affine mapping in the tangent-space which induces the isomorphism t 5. Prove the assertion of problem 4 for the quadric
P(x) + 2 = 0.
Chapter X. Quadrics
§ 4.
Quadrics in the Euclidean space
10.18. Normal-vector. Let A be an n-dimensional Euclidean space and
determines a selfadjoint linear
be a quadric in A. The bilinear function transformation of E by the equation
ck(x,y) = (çox,y).
The linear function f can be written as
f(x) = (a,x) where a is a fixed vector of E. Cones will again be excluded; i. e. we shall assume that çox4 —a for all points xeQ. Let x0 be a fixed point of Q. Then equation (10.10) shows that the tangent-space consists of all vectors y satisfying the relation
+ a,y) = 0. In other words, the tangent-space
is the orthogonal complement of
the normal-vector
p(x0) =
çox0
+ a.
The straight line determined by the point x0 and the vector p (x0) is called the normal of Q at x0. 10.19. Quadrics with center. Now consider a quadric with center
Q:P(x)= 1.
(10.59)
Then the normal-vector p (x0) is simply given by
p(x0) =
cpx0.
This equation shows that the linear mapping p associates with every point x0eQ the corresponding normal-vector. In particular, let x0 be a point of Q whose position-vector is an eigenvector of ço. Then we have the relation = showing that the normal-vector is a multiple of the position-vector x0.
§ 4. Quadrics in the Euclidean space
317
Inserting this into equation (10.59) we see that the corresponding eigen-
value is equal to
2=—. 1
1Xo12
As has been shown in sec. 8.7 there exists an orthonormal system of n eigenvectors 1.. .n). Then (v = 1... n)
=
(10.60)
whence
=
cP (es,,
Let us enumerate the eigenvectors
such that
(10.61)
where r is the rank and s is the index of cb. Then equation (10.59) can be written as 1.
(10.62)
The vectors
(v=1...s)
(v=s+1...r)
and
are called the principal axes and the conjugate principal axes of Q. Inserting
(v=1...s)
(v=s+1...r)
and
Ia%I
into (10.62) we obtain the metric normal-form of Q: (10.63) i
1
a straight line which intersects the and —ar. The straight lines generated by the conjugate axes have no points in common with Q but they intersect the conjugate quadric
quadric Q in the points
I
at the points
and
l...r).
Chapter X. Quadrics
318
10.20. Quadrics without center. Now consider a quadric Q without center. Using an arbitrary point of Q as origin, we can write the equation of Q in the form +
2(a,x) =
0
(10.64)
where a is a normal vector of Q at the point x=0. For every point xeQ the vector
p(x) =
+a
(10.65)
is contained in the normal of Q. A point xEQ is called a vertex if the corresponding normal is contained in the null-space K of cP.
It will be shown that every quadric without center has at least one vertex. Applying
to the equation (10.65) we obtain
(pp(x)=
+
a point xeQ is a vertex if and only if
ço2x=—qa.
(10.66)
To find all vertices of Q we thus have to determine all the solutions of equations (10.64) and (10.66). The self-adjointness of implies that the mappings ço and 'p2 have the same image-space and the same kernel (cf. sec. 8.7). Consequently, equation (10.66) has at least one solution x. The general
solution of (10.66) can be written in the form
x=
x0
+z
where z is an arbitrary vector of the kernel K. Inserting this into equation
(10.64) we obtain
+ (a,x0) + (a,z) = 0.
(10.67)
Now (otherwise Q would have a center) and consequently (10.67) has a solution zeK. This solution is determined up to an arbitrary vector
of the intersection K n
T0.
In other words, the set of all vertices of Q
forms an affine subspace with the difference-space K fl T0. This subspace
has dimension (n—r-—1). Now we are ready to construct the normal form of the quadric (10.64). First of all we select a vertex of Q as origin. Then the vector a in (10.64) is
contained in the kernel K. Multiplying equation (10.64) by an appropriate scalar we can achieve that al = 1 and that Now let I
§ 4. Quadrics in the Euclidean space n — 1)
319
be a basis of T0 consisting of eigenvectors of Then the vectors — 1) and a form an orthonormal basis of E such that
(v = 1.. . n
(v,u = 1...
=
—
1)
and
=
=0
(v = 1... n — 1).
In this basis the equation of Q assumes the metric normal-form +
= 0.
(10.68)
Upon introduction of the principal axes and the principal conjugate axes
(v=1...s)
(v=s+1...r)
and
the normal-form (10.68) can be also written as
v1
(10.69) v=s+1
10.21. Metric classification of bilinear forms. Two quadrics Q and Q' in the Euclidean space A are called metrically equivalent, if there exists a rigid motion which transforms Q into Q'. Two metrically equiv-
alent quadrics are a fortiori affine equivalent. Hence, the metric classification of quadrics consists in the construction of the metric subclasses within every affine equivalence class.
It will be shown that the lengths of the principal axes form a complete system of metric invariants. In other words, two affine equivalent quadrics Q and Q' are metrically equivalent if and only if the principal axes of Q and Q' respectively have the same length.
We prove first the following criterion: Let E and F be two n-dimensional Euclidean spaces and consider two symmetric bilinear functions and having the same rank and the same index. Then there exists an isometric mapping t: E—+F such that
cli(x,y)=W(tx,ry)
x,yeE
if and only if cP and W have the same eigenvalues. E and Define linear transformations p: P (x, y) = ('p x,
x, y E E
and
W (x, y)
(10.70)
F by x, y)
x, y e F.
Chapter X. Quadrics
Then the eigenvalues of 'P and W are equal to the eigenvalues of
and
// respectively (cf. sec. 8.10).
Now assume that 'r is an isometric mapping of E onto F such that relation (10.70) holds. Then
(px,y) = (ç/itx,ty)
(10.71)
whence
= 'c_I cl//cl. This relation implies that (p and i/i have the same eigenvalues. have the same eigenvalues. Then Conversely, assume that p and there is an orthonormal basis in E and an orthonormal basis 1...n) in F such that = 2%a%,
and
(v = 1... ii).
l//b%, =
Hence, an isometric mapping 'r:
(10.72)
defined by
'rap =
1... n).
(v
(10.73)
Equations (10.72) and (10.73) imply that ((p
=
(k/i
(v, p =
'r
1
... n)
whence (10.71).
10.22. Metric classification of quadrics. Consider first two quadrics Q and Q' with center. Since a translation does not change the principal axes we may assume that Q and Q' have the common center 0. Then the equations of Q and Q' read
Q: 'P (x) =
1
and
Q':cP'(x)= 1. Now assume that there exists a rotation of E carrying Q into Q'. Then
cP(x)='P'('cx)
xEE.
(10.74)
It follows from the criterion in sec. 10.21 that the bilinear functions 'P and cP' have the same eigenvalues. This implies that the principal axes of Q and Q' have the same length.
=
(v
=
1
...r).
Conversely, assume the relations (10.75). Then
(v=1...n).
(10.75)
§ 4. Quadrics in the Euclidean space
321
Observing the conditions (10.61) we see that l...ii). According to the criterion in sec. 10.21 there exists a rotation z of E such that
= cP(-rx').
This rotation obviously transforms Q into Q'. Now let Q and Q' be two quadrics without center. Without loss of generality we may assume that Q and Q' have the common vertex 0. Then the equations of Q and Q' read
Q:cP(x)+2(a,x)=0
aeK
al = 1,
(10.76)
Q':P'(x)+2(a',x)=0
a'EK
la'I = 1.
(10.77)
and
If Q and Q' are metrically equivalent there exists a rotation 'r such that cP(x) = cb'(rx).
Then
= 1 ... r).
(v
(10.78)
Conversely, equations (10.78) imply that the bilinear functions P and ck' have the same eigenvalues,
=
(v =
1
... n).
(10.79)
Now consider the restriction W of to the subspace T0 (Q). Then every eigenvalue of W is also an eigenvalue of cb. in fact, assume that W(e,y) = )L(e,y) for a fIxed vector eeT0(Q) and all vectors yET0(Q). Then cP (e, x) =
'P (e,
a + y) =
'P (e,
(e, y) = 'P (e, a) +
a) +
(e, y)
(10.80)
for an arbitrary vector xEE. Since the point 0 is a vertex of Q that 'P(e, a)=0. We thus obtain from (10.80) the relation 'P (e, x) = ). (e, y) =
A (e,
we have
a + y) = A (e, x)
showing that A is an eigenvalue of 'P. Hence we see that the bilinear In the same way we see that the function W has the eigenvalues restriction (/1' of'P' to the subspace T0 (Q') has the eigenvalues Ai.. Now it follows from (10.79) and the criterion in sec. 10.21 that there exists
an isometric mapping T0(Q') 21
Greub. Linear Algebra
Chapter X. Quadrics
322
with the property that y) = Define
(y)
y E T0
(Q).
the rotation t of E by )'ET0(Q)
t a = a'. Then
xeE and consequently, t transforms Q into Q'. 10.23. The metric normal-forms in the plane and in 3-space. Equations
(10.63) and (10.69) yield the following metric normal forms for the dimensions n =2 and n =3: Plane:
I. Quadrics with center: 1.
+
= 1,
2.
—
=1
a
ellipse with the axes a and b.
b
hyperbola with the axes a and b.
two parallel lines with the distance
= ±a
3.
2a.
II. Quadrics without center:
=
a
parabola with latus rectum of length a.
2
3-space:
I. Quadrics with center: 1.
2.
2
a
a2
+
+
+
—
—
b
—
= 1, a b c ellipsoid with axes a, b, c.
c
2=
1,
a
hyperboloid with one shell and axes
a,b,c. = 1, b
c
b
c
hyperboloid with two shells and axes
a,b,c.
§ 4. Quadrics in the Euclidean space
=
+
a2
b
2
1,
a
elliptic cylinder with the axes a and b.
b
hyperbolic cylinder with the axes a and b.
=1
two parallel planes with the distance
= ±a
6.
323
2a.
II. Quadrics without center: = 2 (,
—
a2 3.
b
a b
elliptic paraboloid with axes a and b.
hyperbolic paraboloid with axes a
2(
and b.
= 2(
a
parabolic cylinder with latus rectum of length a.
Problems 1. Give the center or vertex, the type and the axes of the following quadrics in the 3-space: (2
a) b) c)
1.
+
+ 7(2 — 1
—
—
= 9.
d)
2. Given a non-degenerate quadratic function qi, consider the family of quadrics defined by
Show that every point x +0 is contained in exactly one quadric Prove that the linear transformation 'p of E defined by
= ('px,y) with every point x +0 the normal vector of the quadric passing through x. 3. Consider the quadric associates
Q:P(x)= 1, 21*
Chapter X. Quadrics
324
where D is a non-degenerate bilinear function. Denote by Q' the image of Q under the mapping p which corresponds to çJi. Prove that the principal axes of Q' and Q are connected by the relation
(v=1...n). 4.
Given two points p, q and a number
consider the
locus Q of all points x such that Ix
Prove that Q
—
+ Ix — ql =
a quadric of index n whose principal axes have the length = (v = 2... n). a1! = — q12 is
5. Let cP (x) = be the equation of a non-degenerate quadric Q with the property that x is a normal vector at every point of Q. Prove that Q is a sphere. 1
Chapter XI
Unitary spaces Hermitian functions
§ 1.
11.1. Sesquilinear functions in a complex space. Let E be an n-dimensional complex linear space and ck: Ex E—*C be a function such that
+px2,y)=,Wi(x1,y)+#cli(x2,y) 2P(x,y1)+ j7P(x,y2)
cli(x,2y1 +
(11.1)
where 2 and 4ã are the complex conjugate coefficients. Then cP will be called a sesquilinear function. Replacing y by x we obtain from çji the corresponding quadratic function 11(x) = P(x,x).
It follows from (11.1) that
(11.2)
satisfies the relations
+ y) +
—
y) =
+
and
W(2x)
1212
The function çji can be expressed in terms of W. In fact, equation (11.2) yields
W(x + y) =
+ W(y) + cP(x,y) +
(11.4)
Replacing y by iy we obtain W (x + i y) =
W
(x) + ¶1' (y) — I cP (x, y) + I P (y, x).
Multiplying (11.5) by iand adding it to (11.4) we find
(x + y) + i
(x + iy) =
(1
+ i)(W (x) +
(y)) + 2P (x, y),
whence (x, y) = { W (x + y) —
W
(x) —
W
(y)} + I { W (x + iy) —
(x)
—
(y)} (11.6)
326
Chapter XI. Unitary spaces
Note; The fact that (P is uniquely determined by the function W is due to the sesquilinearity. We recall that a bilinear function has to be symmetric in order to be uniquely determined by the corresponding quadratic function. 11.2. Hermitian functions. With every sesquilinear function (P we can associate another sesquilinear function (P given by
= (P(y,x). A sesquilinear function (P is called Hermitian if (P= (P, i. e. (P (x, y) = (P (y, x).
(11.7)
Inserting y = x in (11.7) we find that
= V'(x).
(11.8)
Hence the quadratic function 'I' is real valued. Conversely, a sesquilinear function (P whose quadratic function is real valued is Hermitian. In fact, if is real valued, both parentheses in (11.6) are real. Interchange of x and y yields
2(P(y,x)={W(x+y)— W(x)— ¶t'(y)} + i{W(y+ ix)— W(x)— W(y)}. (11.9)
Comparison of (11.6) and (11.9) shows that the real parts coincide. The sum of the imaginary parts is equal to
W(x + iy) +
+ ix) — 2W(x)
—
2W(y).
Replacing y by iy in (11.3) we see that this is equal to zero, whence
(P(y,x) = (P(x,y). A Hermitian function (P is called positive definite, if ¶1' (x) >0 for all vec-
tors x+0. 11.3. Hermitian matrices. = 1 . ii) be a basis of E. Then every sesquilinear function (P defines a complex ii x n-matrix .
= (P (xv, xv).
The function (P is uniquely determined by the matrix and
In fact, if
§2. Unitary spaces
327
are two arbitrary vectors, we have that
The matrices relation
and
of cP and çli are obviously connected by the
=
If P is a Hermitian function it follows that = A complex n x n-matrix satisfying this relation is called a Hermitian matrix.
Problems 1. Prove that a skew-symmetric seq uilinear function is identically zero. 2. Show that the decomposition constructed in sec. 9.6 can be carried over to Hermitian functions. § 2.
Unitary spaces
11.4. Definition. A unitary space is an n-dimensional complex linear space E in which a positive definite Hermitian function, denoted by (,), is distinguished. The number (x, v) is called the Hermitian inner product of the vectors x and y. It has the following properties: 1. (Ax1 (x, Ay1 +PY2)
2. (x,y)=(y,x). In particular, (x, x) is real. 3. (x, x)>O for all vectors x+O. Example I: A Hermitian inner product in the complex number space C' is defined by (x. where
and
be the complex ificapie II: Let E be a Euclidean space and let tion of the vector space E (cf. sec. 2.16). Then a Hermitian inner product Exam
Chapter Xl. Unitary spaces
328
is defined in ('1 +
by
+ 'j2) =(x1. x)+(v1. 1))+
(x.
unitary space so obtained is called the coin plcxificutioii of the Euclidean space E. The norm of a vector x of a unitary space is defined as the positive square-root lxi = The Schwarz-inequality (11.10) l(x,y)l The
is proved in the same way as for real inner product spaces. Equality holds if and only if the vectors x and y are linearly dependent. From (11 .10) we obtain the triangle-inequality
lx+yi
+Jyl.
Equality holds if and only ify=Ax where ). is real and non-negative. In fact, assume that (11.11) lx+yi = fxl + lyl.
Squaring this equation we obtain
(x,v)+(x,y)=2lxHy(.
(11.12)
This can be written as
Re(x,v) = lxi yJ where Re denotes the real part. The above relation yields (x,y)I = lxl and
hence it implies that the vectors x and y are linearly dependent,
y—=Ax. Inserting this into (11.12) we obtain 2121
whence
Re2 =
121.
Hence, 2 is real and non-negative. Conversely, it is clear that
I(1+2)xI=lxH-21x1 for every real, non-negative number 2.
§ 2. Unitary spaces
329
Two vectors xeE and yeE are called orthogonal, if
(x,y)= 0. Every subspace E1 E determines an orthogonal complement consisting of all vectors which are orthogonal to E1. The spaces E1 and form a direct decomposition of E:
E= E1 A basis
(v = I ... n) of E is called orthonormal, if
=
(xv,
The inner product of two vectors and is
then given by
(x,y) = Replacing y by x we obtain x12
=
Orthogonal bases can be constructed in the same way as in a real inner
product space by the Schmidt-orthogonalization process. Consider two orthonormal bases and (v = 1.. .n). Then the matrix satisfies the relations of the basis-transformation
A complex matrix of this kind is called a unitary matrix. Conversely, if an orthonormal basis and a unitary matrix is given, the basis
= again orthonormal. 11.5. The conjugate space. To every complex vector space E we can assign a second complex vector space. E. in the following way: E coinis
cides with E as a real vector space. However, scalar multiplication in E, denoted by
z)—*2
.
z
is defined by
E is called the conjugate rector space. Clearly the identity map satisfies
—i
:eE.
Chapter XI. Unitary spaces
Now assume that (.) is a Hermitian inner product in E. Then a complex bilinear function. <>. is defined in E x E by
xeE. teE. Clearly this bilinear function is non-degenerate and so it makes E. E into a pair of dual complex spaces.
Now all the properties arising from duality can be carried over to unitary spaces. The Ries: theorem asserts that every linear function / in a unitary space can be uniquely represented in the form 1(x) = (x, a)
xeE.
In fact, consider the conjugate space E. In view of the duality between E and E there is a unique vector beE such that
f(x) =
<x, h>.
Now set a=K1(h). 11.6. INormed determinant functions. A nouined (leteulni000t Junction in a unitary space is a determinant function, zl, which satisfies x1eE.
4k1
(11.13)
We shall show that normed determinant functions always exist and and 47 are connected by that two normed determinant functions. the relation 47 where r is a complex number of magnitude I. In fact, let be any determinant function in E. Then a determinant in E is given by function.
\.*fl)_A(k._I\.*l
Since E and E
are
dual with respect to the scalar product <.> for-
mula (4.21) yields
4o(xi where
is a complex constant. Setting =
we
obtain x,)
Now set It follows that
where
(v= 1 ... n) 40(e1
is an orthonormal basis of E. =
y
§2. Unitary spaces and so
331
is real and positive. Let ). be any complex number satisfying and define a new determinant function, 4, by setting 4 I,.
Then (11.14) yields the relation
4
,...,x,,)
v,,) = det ((xi, vi))
4(v1
showing that zi is a normed determinant function. Finally, assume that z11 and are normed determinant functions in E. 1. Then formula (11.13) implies that 42=r41,
11.7. Real forms. Let E be an n-dimensional complex vector space and consider the underlying 2 n-dimensional real vector space E is a subspace, Fc: such that F is a real form of E, then every vector zeE can be uniquely written in the form z=x+iy. x. yeF.
Every complex space has real forms. In fact, let z1 ... z,7 be a basis of E and define F to be the subspace of consisting of the vectors X
E
=
Then clearly, F is a real form of E. A conjugation in a complex vector space is an satisfying of
iz =
involution,
—
Every real form F of E determines a conjugation given by x + ivF—*x —
iy
x.
yeF.
is a conjugation in E,then the vectors z which satisfy = z determine a real form F of E as is easily checked. The vectors of F are called real (with respect to the conjugation). Conversely, if
Problems 1. Prove that the Gram determinant
x1) ... (xv,
Chapter XI. Unitary spaces
332
of p vectors of a unitary space is real and non negative. Show that if and only if the vectors
are linearly dependent.
2. Let E be a unitary space. (i) Show that there are conjugations in E satisfying (ii) Given such a conjugation show that the Hermitian inner product of E defines a Euclidean inner product in the corresponding real subspace.
3. Let F be a real form of a complex vector space E and assume that a positive definite inner product is defined in F. Show that a Hermitian inner product is given in E by + i((x1.
:2) = ((x1. x2) + 2)
a unit quaternion / (cf. sec. 7.23) such that (I. e)=O.
Use / to make the space of quaternions into a 2-dimensional complex vector space. Show that this is not an algebra over C. 5. The complex cross product. Let E be a 3-dimensional unitary space and choose a normed determinant function /1. (i) Show that a skew symmetric bilinear map E x E E (over the reals) is determined by the equation
x, v. :eE. (ii) Prove the relations
(ix)xv= xx(iv)= —i(xxv)
(xxi',x)=(xxv,v)=O (x1 x x,.
X 12) = (yl. x1)(v2. x7) x
=
—
x1)(11. x2)
(x1. x2H2
6. Cavlev iiumhers. Let E be a 4-dimensional Euclidean space and E (cf. Example II, sec 11.4). Choose a and let E1 denote the orthogonal complement of e. unit vector Choose a normed determinant function 4 in E1 and let x be the corresponding cross product (cf. problem 5). Consider as a real 8-dimenlet
§2. Unitary spaces
333
sional vector space F and define a bilinear map x, y x y = —(x, y)e +
and
xx v
x i by setting
x. tEE1 tEC. VEE1 xeE1
x(ie)=/x
Show that this bilinear map makes F into a (non-associative) division algebra over the reals. Verify the formulae x =(xv)y and x2 y=x(xy). 7. Symp/ectic spaces. Let E be an rn-dimensional vector space over a field F. A symplectic inner product in E is a non-degenerate skew symmetric bilinear function <.> in E. (i) Show that a symplectic inner product exists in E if and only if in is even, rn=2n. (ii) A basis a1 a,,. b1 h, of a symplectic vector space is called syinpiectic. if K
a1>
=0
=
Kb,.
0
Show that every symplectic space has a symplectic basis. (iii) A linear transformation p: E—÷E is called symplectic. if it satisfies
x. yeE.
If (P and W are symplectic inner products in E, show that there is a linear automorphism of E such that x. yeE. 8. Symplectic ad joint transJorrnations. i) If p: E —* E is a linear transformation of a symplectic space the syrnp/ectic adjoint transformation. is defined by the equation <(PX. V> = <x. 10Y>
—(p) then (respectively (respectively skew syniplectic).
X.
VEE.
is called svinp/ectic se//ac/joint
(i) Show that = p. Conclude that every linear transformation ço of E where is symplectic can be uniquely written in the form selfadjoint and p2 is skew symplectic. (ii) Show that the matrix of a skew symplectic transformation with respect to a symplectic basis has the form (A B
Chapter XI. Unitary spaces
334
where A. B. C, D are square (n x ii)-mat rices satisfying
D=_A*.
C*=C.
B*=B.
Conclude that the dimensions of the spaces of symplectic selfadjoint (respectively skew symplectic) transformations are given by = ii(2ii
= ii(2,i
— 1)
—1.— 1).
(iii) Show that the symplectic adjoint of a linear transformation p of a 2-dimensional space is given by =
i
trço —
çø.
(iv) Let E be an n-dimensional unitary space with underlying real vector space Show that a positive definite symmetric inner product and a symplectic inner product is defined in by (x. j') = Re (x. j') and
x.
<x, i'> = Im (x, v)
Establish the relations <x, y> = — (ix.
and
v)
= (x.v).
Conclude that the transformation symplectic.
§
is both symplectic and skew
3. Linear mappings of unitary spaces
11.8. The adjoint mapping. Let E and F be two unitary spaces and (p: E-÷ F a linear mapping of E into F. As in the real case we can associate
of F into E. A conjugation in a unitary with ço an adjoint mapping of the underlying real vector space space is a linear automorphism
such that ix=
and
fl=(x,v).
If
is a conjugation in E,
then a non-degenerate complex bilinear function is defined by <x,y>= (x, and be conjugations in E and in F, respecx, yEE. Let tively. Then E and F are dual to themselves and hence 'p determines a dual mapping 'p*: by the relation
(11.16)
§ 3. Linear
mappings of unitary spaces
335
Replacing y by 9 in (11.16) we obtain (11.17)
Observing the relation between inner product aid scalar product we can rewrite (11.17) in the form
(çox,y) =(x,co*9). Now define the mapping
E+-F by (11.19)
Then relation (11. 18) reads
xeE, yEF. The mapping assume that (11.20). Then
(11.20)
does not depend on the conjugations in E and F. In fact, and are two linear mappings of F into E satisfying
0.
—
This equation holds for every fixed yeF and all vectors xeE and hence it implies that
=
The mapping
is called the adjoint of the mapping
It follows from equation (11 .18) that the relations
and hold for any two linear mappings and for every complex coefficient 2. Equation (11.20) implies that the matrices of 'p and relative to two orthonormal bases of E and F are connected by the relation
Now consider the case E= F. Then the determinants of 'p and are complex conjugates. To prove this, let A +0 be a determinant function in E and A be the conjugate determinant function. Then it follows from the definition of A that = det 'p A
...
) = det 'p
This equation implies that deUp = det'p.
A (x1
...
___________
336
Chapter XI. Unitary spaces
If (p is replaced by p —Ai, where 2 (11.21) yields det
—2
is
a complex parameter, relation
i) = det ((p — 2 i).
Expanding both sides with respect to 2 we obtain ;n—V
=
This equation shows that corresponding coefficients in the characteristic polynomials of (p and are complex conjugates. In particular,
= trço.
11.9. The inner product in the space L(E; E). Consider the space L (E; E). An inner product can be introduced in this space by
=
(11.22)
It follows immediately from (Il .22) that the function (p, i/i) is sesquilinear. Interchange of and yields
To prove that the Hermitian function (11.22) is positive definite let (v= in) be an orthonormal basis. Then and where
(11.23)
Equations (11.23) yield (p çoe%, =
whence
=
=
=
This formula shows that ((p, p)> 0 for every transformation (p + 0. 11.10. Normal mappings. A linear transformation (p : E—+ E is called normal, if the mappings (p and commute, (11.24)
In the same way as for a real inner product (cf. sec. 8.5) it is shown
§ 3. Linear mappings of unitary spaces
337
that the condition (11.24) is equivalent to
xeE.
It follows from (11.25) that the kernels of (p,
(11.25)
coincide, ker p = ker
and
We thus obtain the direct decomposition
E = ker (p
(11.26)
Tm (p.
The relation ker (p = ker implies that the mappings (p and have the same eigenvectors and that the corresponding eigenvalues are complex conjugates. In fact, assume that e is an eigenvector of (p and that is the corresponding eigenvalue, çoe = 2e. Then e is contained in the kernel of (p —2i. Since the mapping (p —2i is
again normal, e must also be contained in the kernel of —L, i. e.
qe—)e. In sec. 8.7 we have seen that a selfadjoint linear transformation of a real inner product space of dimension n always possesses n eigenvectors which are mutually orthogonal. Now it will be shown that in a complex space the same assertion holds even for normal mapping. Consider the characteristic polynomial of (p. According to the fundamental theorem Then of algebra this polynomial must have a zero is an eigenvalue of Let e1 be a corresponding eigenvector and E1 the orthogonal complement of e1. The space E1 is stable under (p. In fact, let y be an arbitrary vector of E1. Then ((py,e1) = (y,
=
= 2(y,e1) =
0
and hence is contained in E1. The induced mapping is obviously again normal and hence there exists an eigenvector e2 in E1. Continuing this way we finally obtain n eigenvectors (v = 1. n) which are mutually orthogonal, . .
(eV,e,L)=O
(v+ii).
If these vectors are normed to length one, they form an orthonormal basis of E. Relative to this basis the matrix of has diagonal form with the eigenvalues in the main-diagonal, (pep = 22
Greub. Linear Algebra
(v = 1... n).
(11.27)
Chapter XI. Unitary spaces
11.11. Selfadjoint and skew mappings. Let ço be a selfadjoint linear
transformation of E; i.e., a mapping such that
= (p. Then relation
(11.20) yields
(çox,y)=(x,coy)
X,yEE.
Replacing y by x we obtain
((px,x) = (x,px)
=(qx,x)
showing that ((px, x) is real for every vector xEE. This implies that all eigenvalues of a selfadjoint transformation are real. In fact, let e be an eigenvector and )L be the corresponding eigenvalue. Then çoe=Ae, whence
(pe,e) = 2(e,e). Since (çoe, e) and (e, e) + 0 are real, ). must be real.
Every selfadjoint mapping is obviously normal and hence there exists a system of n orthonormal eigenvectors. Relative to this system the matrix of p has the form (11.27) where all numbers are real.
The matrix of a selfadjoint mapping relative to an orthonormal basis is Hermitian. A linear transformation p of E is called skew if p= —(p. In a unitary
space there is no essential difference between selfadjoint and skew mappings. In fact, the relation iço
shows that multiplication by i associates with every selfadjoint mapping a skew mapping and conversely. 11.12. Unitary mappings. A unitary mapping is a linear transformation of E which preserves the inner product,
(çox,çoy)=(x,y)
x,yeE.
(11.28)
Relation (11.28) implies that
xEE showing that every unitary mapping is regular and hence it is an automorphism of E. If equation (11 .28) is written in the form
((px,y) =(x,(p1 y) it shows that the inverse mapping of (p is equal to the adjoint mapping, (11.29)
§ 3. Linear mappings of unitary spaces
339
Passing over to the determinants we obtain 1
whence
= 1. Every eigenvalue of a unitary mapping has norm 1. In fact, the equation çoe = 2e
yields = 121 el
whence 121=1.
Equation (11.29) shows that a unitary map is normal. Hence, there exists an orthonormal basis l...n) such that
= where the
are
(v = 1... n)
complex numbers of absolute value one.
Problems 1. Given a linear transformation p: E-÷E show that the bilinear function ck defined by
P(x,y)=(px,y)
is sesquilinear. Conversely, prove that every sesquilinear function can be obtained in this way. Prove that the adjoint transformation determines the Hermitian conjugate function.
2. Show that the set of selfadjoint transformations is a real vector space of dimension n2. 3. Let p be a linear transformation of a complex vector space E. a) Prove that a positive definite inner product can be introduced in E
such that p becomes a normal mapping if and only if ço has n linearly independent eigenvectors. b) Prove that a positive definite inner product can be introduced such
that p is i) selfadjoint ii) skew iii) unitary
if and only if in addition the following conditions are fulfilled in corresponding order: i) all eigenvalues of çø are real ii) all eigenvalues of 'p are imaginary or zero iii) all eigenvalues have absolute value 1. 22*
Chapter XI. Unitary spaces
4. Denote by S(E) the space of all selfadjoint mappings and by A (E) the space of all skew mappings of the unitary space E. Prove that a multiplication is defined in S(E) and A (E) by and
=
—
respectively and that these spaces become Lie algebras under the above multiplications. Show that S(E) and A(E) are real forms of the complex space L(E; E) (cf. problem 8, § 1, chap. I and sec. 11.7). §
4•* Unitary mappings of the complex plane
11.13. Definition. In this paragraph we will study the unitary mappings of a 2-dimensional unitary space C in further detail. Let -r be a unitary mapping of C. Employing an orthonormal basis e1, e-, we can represent the mapping i in the form (11.30) ie1 =c,e1 ±f3e2 're2 = c(— f3e1 + e
are complex numbers subject to the conditions =
1cx12 + 11312
and
= 1.
These equations show that
dett =
c.
We are particularly interested in the unitary mappings with the determinant + 1. For every such mapping equations (11.30) reduce to
te1=cxe1+$e2
2 cx
+
—
This implies that
-r'e1
—/3e2
T'e2=13e1 +cxe2. Adding the above relations in the corresponding order we find that (i +
= (cx +
t+
=
t' = itri.
(11.31)
§ 4. Unitary mappings of the complex plane
341
Formula (11.31) implies that
(z,rz) + (z,x' z) = zl2trt for every vector zeC. Observing that
(z,'r' z)
(tz,z)
(z,tz)
thus obtain the relation
we
2Re(z,rz)=Izl2trx
zEC
(11.32)
showing that if det t = 1. then the real part of the inner product (:.-r:) depends only on the norm of:. (11.32) is the complex analogue of the relation (8.42) for a proper rotation of the real plane. We finally note that the set of all unitary mappings with the determinant + 1 forms a subgroup of the group of all unitary mappings. 11.14. The algebra Q. Let Q denote the set of complex linear transformations of the complex plane C which satisfy
q3+ço=ltrço. It
(11.33)
is easy to see that these transformations form a real 4-dimensional
vector space, Q. Since (11.33) is equivalent to ç75 = ad p (cf. sec. 4.6 and problem 8, § 2, chapter IV) it follows that Q is an algebra under composition. Moreover, we have the relation pEQ.
(11.34)
Define a positive definite inner product in Q by setting (cf. sec. 11.9)
Then we have
(p.p)=detp.
çoeQ
(11.35)
and (11.36)
Now we shall establish an isomorphism between the algebra Q and the algebra I-fl of quaternions defined in sec. 7.23. Consider the subspace Q1 of Q consisting of those elements which satisfy or equivalently (p. i)=O. Then Q1 is a 3-dimensional Euclidean subspace of Q.
Chapter XI. Unitary spaces
342
and thus (11.34) yields
For q7eQ1 we have, in view of(11.33).
=—detq 1=— (p,p)i
eQ1.
This relation implies that =
+
'
— ((p,
(11.37)
Next, observe that for çoEQ1 and /JEQ1
((pop whence
°
—
i)=
—
Thus a bilinear map. x, is defined in Q1 by
°
(p x
(11.38)
It satisfies ((p
and
x
(q x
(11.39)
Moreover, equations (11.37) and (11.38) yield (11.40)
It follows that (11.41)
Finally, let /1 be the trilinear function in Q1 given by L1((p,
In view of (11.40) we can write
= (p x
L1((p,
(11.42)
This relation together with (11.39) implies that I is skew symmetric and hence a determinant function in Q1. Moreover, setting x = (p x yields, in view of (11.41), X
.
Thus A is a uoi';ned determinant function in the Euclidean space Q1. Hence it follows from (11.42) that the bilinear function x is the cross product in Q1 if Q1 is given the orientation determined by I (cf. sec. 7.16). Now equation (11.40) shows that Q is isomorphic to the algebra 1-fl of quaternions (cf. sec. 7.23).
§ 4. Unitary mappings of the complex plane
343
Remark: In view of equation (11.35), Q is an associative division algebra of dimension 4 over 11. Therefore it must be isomorphic to
I-fl
(cf. sec. 7.26). The above argument makes this isomorphism explicit. 11.15. The multiplication in C. Select a fixed unit-vector a in C. Then
to every vector :EC there exists a unique mapping
such that
This mapping is determined by the equations
a = a + f3 b = — fla + b
is a unit-vector orthogonal to a and
z = ca + fib. The correspondence
obviously satisfies the relation
+
=
for any two real numbers 2 and p. Hence, it defines a linear mapping of
C onto the linear space Q, if C is considered as a 4-dimensional real linear space. This suggests defining a multiplication among the vectors zEC in the following way: Z1Z2 = (11.43) Then (Pa1 and z1z2eC. (11.44) = In fact, the two mappings
Z2
and
°
are both contained in Q and
a into the same vector. Relation (11.44) shows that the correspondence preserves products. Consequently, the space C besend
comes a (real) division-algebra under the multiplication (11.43) and this algebra is isomorphic to the algebra of quaternions. Equation (11.31) implies that = 2(p, i)a z+ (11.45) for every unit-vector z. In fact, if z is a unit-vector then is a unitary mapping with determinant 1 and thus (11.31) and (11.36) yield z
+
= çoa +
=
+
a=
=
Finally, it will be shown that the inner products in C and Q are connected by the relation Re (z1, z2) =
(p.,
p:2).
(11.46)
Chapter XI. Unitary spaces
To prove this we may again assume that and :2 are unit-vectors. Then and are unitary mappings and we can write
pa,a) = ((p,-!a,a).
=
(:1,:2) =
(11.47)
is also unitary formula (11.32) yields
Since
a, a) = =
tr (p2_i
—1
= -3
= -3
(11.48)
=
Relations (11.47) and (11.48) imply (11.46).
Problems 1. Assume that an orthonormal basis is chosen in C. Prove that the transformations which correspond to the matrices (1
(i
O\ 1)
( 0-i
(O-l\
O\
o)
0
form an orthonormal basis of Q. 2. Show that a transformation çOEQ is skew if and only if tr p=O. 3. Prove that a transformation çOEQ satisfies the equation
=
—
if and only if
detp=1
and
trço=O.
4. Verify the formula (z1 z, z2 z) = (z1, z2)
z e C.
5. Let E be the underlying oriented 4-dimensional vector space of the complex plane C and define a positive definite inner product in E by
(x, J')E = Re(x, v).
Let a. h be an orthonormal basis of C such that a is the unit element for the multiplication defined in sec. 11.15. Show that the vectors = a.
e1 = ia_
c7 =
h,
c3 = ib
form an orthonormal basis of E and that, if 4 is the normed positive determinant function in E, = — I.
§ 5. Application to Lorentz-transformation
345
Let E1 denote the orthogonal complement of e in E (cf. problem 5). t/i be a skew Hermitian transformation of C with tn/i=0. a) Show that there is a unique vector peE1 such that x. xeE. 6.
Let
b)If / is the matrix of with respect to the orthonormal basis a. b in problem 5, show that p= + fi e2 + ; e3. §
5•* Application to Lorentz-transformations
11.16. Selfadjoint linear transformations of the complex plane. Consider the set S of all selfadjoint mappings a of the complex plane C. S is a real 4-dimensional linear space. In this space introduce a real inner product by (11.69)
This inner product is indefinite and has index 3. To prove this we note first that — (trcr)2) = — deta = (11.70) and
=—4tra.
(11.71)
Now select an orthonormal basis z1,z2 of C and consider the transformations (j = 1,2,3) which correspond to the Pauli-matrices
/1
0\ —i)
i\
/0
/0
o)
—i 0
Then it follows from (11.69) that
=
and
0,
= — 1.
These equations show that the mappings i, a1, a2, a3 form an orthonormal basis of S with respect to the inner product (11.69) and that this inner product has index 3. Thus S becomes a Minkowski space (cf. sec. 9.7).
The orthogonal complement of the identity-map consists of all selfadjoint transformations with the trace zero.
Chapter XI. Unitary spaces
346
11.17. The relation between the spaces Q and S. Recall from sec. 11 .14 the definition of the 4-dimensional Euclidean space Q. Introduce a new inner product in Q by setting
We shall define a linear isomorphism Q:
such that
In fact, set
Q(p=
itrgo+igo
goeQ.
Then
=j
—
Since
.
trgo
— i(go
and so it follows that
we have
- Q go =0; i.e., Qgo is selfadjoint. The map W:
W(oi=-
given by
l+i
oeS
is easily shown to be inverse to Q and thus Q is a linear isomorphism. Finally, since = (I —
i)
-i tr(p•
+
1—
tr(p + go tn/i) — (po /1
it follows that
tr
= trgo . tn/i — tr(goo
whence — trQgo
=
go
=
tn/i —
trgo
= (go,
Note as well that, by definition,
Now consider an arbitrary linear trans11.18. The transformations formation of the complex plane C such that det = 1. Then a linear transformation T2: S—÷S is defined by TES.
(11.79)
§ 5.
Application to Lorentz-transformations
347
In fact, the equation
=
=
=
is again selfadjoint. The transformation shows that the mapping preserves the inner product (11.69) and hence it is a Lorentz-transformation (cf. sec. 9.7):
=
= — detaldetcxl2 = — detcr
Every Lorentz-transformation obtained in this way is proper. To prove this let 1) be a continuous family of linear transformations of C such that cx(O) = z = and detcx(t) = 1 (0 t 1).
it follows from the result of sec. 4.36 that such a family exists. The continuous function
det T2 (t)
t 1)
(0
is equal to ± 1 for every t. In particular det
= det i; = 1.
This implies that =1
(0 t 1)
whence
= 1. The transformations that
are orthochroneous. To prove this, observe i=
whence
=
=—
<0.
This relation shows that the time-like vectors i and are contained in the same cone (cf. sec. (9.22)). 11.19. In this way every transformation with determinant 1 defines a Obviously, proper Lorentz-transformation (11.80)
Now it will be shown that two transformations and if f3_— In view of(l 1.80) it is sufficient to prove that operator only if = ± i. If is the identity, then
forevery
creS.
coincide only is the identity (11.81)
Chapter XI. Unitary spaces
348
Inserting a = we find that cx o = i whence cx = implies that 1
1 Now equation (11.81)
cxa=a0x forevery aES.
(11.82)
To show that cx= ± i select an arbitrary unit-vector eEC and define a selfadjoint mapping a by a:=(z,e)e ZEC. Then
(aocx)e =(cxe,e)e and
(cxoa)e =
cxe.
Employing (11.82) we find that cx e =
(cx
e, e) e.
In other words, every vector cxz is a multiple of z. Now it follows from the linearity that cx = 2i where )L is a complex constant. Observing that det cx = we finally see that A = ± 1. 1
11.20. In this section it will be shown conversely that every proper orthochroneous Lorentz-transformation T can be represented in the
form (11.17). Consider first the case that i is invariant under T,
Ti = i. Employing the isomorphism Q:Q—*S (cf. sec. 11.17) we introduce the transformation
T'=Q'0T0Q
(11.83)
of Q. Obviously,
=
<(p,l/J>
(11.84)
i = i.
(11.85)
and
Besides the inner product of sec. 11.17 we have in Q the positive inner product defined in sec. 11.14. Comparing these two inner products we see that (q,,
+ tr
=
tr
=
-2
i>.
(11.86)
Now formulae (11.84), (11.86) and (11.85) yield
(T'ço,T'l./J)=(ço,l/J)
p,l/JEQ
showing that T' is also an isometry with respect to the positive definite inner product. Hence, by Proposition I, sec. 8.24, there exists a unitvector /3EQ such that
T'p=fopo/31.
(11.87)
§ 5. Application
to Lorentz-transformations
349
Using formulae (11.83), (11.87) and (11.74) we thus obtain
Tci =(Q0T'0Q1)a = Since
$0a0f31.
=f3 (cf. sec. 11.34). This equation can be written in the form
T ci = fi
ci
=
fi
ci.
Finally,
detfi =(/3,/3)= 1. Now consider the case Ti
i. Then, since T is a proper orthochroneous
Lorentz-transformation, Ti and i are linearly independent. Consider the plane F generated by the vectors i and Ti. Let w be a vector of F such that
(11.88)
trw=0 and detw=—l. Therefore WoW
1.
By hypothesis, T preserves fore-cone and past-cone. Hence Ti can be written as (cf. sec. 9.26)
Ti = i cosh 0 + wsinh0.
(11.91)
Let cx be the selfadjoint transformation defined by
=
i cosh
0
+ w sinh —.
(11.92)
Then
Ti=
0
= i cosh2 + —
2 w cosh
2
=
i cosh 0
0
0
2
2
- sinh - +
0
w w sinh2 — 2
(11.93)
+ w sinh 0.
Comparing (11.91) and (11.93) we see that
Ti = Tai. This equation shows that the transformation T' ° T leaves vector i invariant. As it has been shown already there exists a linear transformation f3e Q of determinant 1 such that T=
Chapter XI. Unitary spaces
Hence,
T= It remains to be proved that cz has determinant 1. But this follows from (11.70), (11.92) and (11.88): det
= —
=
0
— <1,
0
i> cosh2 — <w, w> sinh2 - = 1.
Problems 1) Let be the linear transformation of a complex plane defined by the matrix
(
1
21 3
Find the real 4 x 4 matrix which corresponds to the Lorentz-transformation with respect to the basis i, a3 (cf. sec. 11.16). 2) A fractional linear transformation of the complex plane formation of the form
T(:) =
c:+d
ad— hc
=
is a trans-
1
where a. b. c. d arc complex numbers. (i) Show that the fractional linear transformations form a group.
(ii) Show that this group is isomorphic to the group of proper orthochroneous Lorentz transformations.
Chapter XII
Polynomial Algebras in this chapter F denotes a fIeld of characteristic zero. § 1. Basic properties In this paragraph we shall define the polynomial algebra over F and establish its elementary properties. Some of the work done here is
simply a specialization of the more general results of Chapter VII. Volume II. and is included here so that the results of the following chapter will be accessible to the reader who has not seen the volume on multilinear algebra. 12.1. Polynomials over F. A polynomial over F is an infinite sequence
are different from zero. The sum and such that only finitely many the product of two polynomials =
...)
and
g=
...)
are defined by
jg =
...)
where i-t-j=k
These operations make the set of all polynomials over F into a commutative associative algebra. It is called the polynomial algebra over F and is denoted by ['[t]. The unit element, 1, of F[t] is the sequence (1.0...). It is easy to check that the map i: F—+F[t] given by
is an injective homomorphism of algebras.
Thus we may identify F with its image under 1. The polynomial (0, 1, 0 ...) will be denoted by
t=(0.l,0...).
Chapter XII. Polynomial algebra
It follows that
tk=(0. (11.0...)
k =0.1....
Thus every polynomial I can be written in the form (12.1)
where only finitely many are different from zero. Since the elements (k=0. 1. ...) are linearly independent, the representation (12.1) is unique. Thus these elements form a basis of the vector space T[t]. Now let .1
0
=
be a non-zero polynomial. Then ; is called the /eadi;ig ('oeff!cwIlt of f The leading coefficient of is the product of the leading (ff0, coefficients off and g. This implies that fg 0 whenever / 0 and g 0. A polynomial with leading coefficient is called monk. On the other hand, for every polynomial the element I
is called the sea/a, term. It is easily checked that the map is a surjective homomorphism. given by Consider a non-zero polynomial k=O
The number ii is called the degree of
I.
If g is a second non zero
polynomial, then clearly
deg(f + g) max(degf,degg) and
deg(fg) = degf + degg.
(12.2)
A polynomial of the form ;t' (; +0) is called a monomial of degree ii. be the space of the monomials of degree ii together with the Let zero element. Then clearly
F[t] and by assigning the degree a to the elements of
we make r[t]
into a graded algebra as follows from (12.2). The homogeneous elements
§ 1. Basic properties
353
of degree n with respect to this gradation are precisely the monomials of degree a.
However, the structure of F[t] as a graded algebra does not play a role in the subsequent theory. Consequently we shall consider simply its structure as an algebra. 12.2. The homomorphism Let A be any associative algebra with unit element e and choose a fixed element aEA. Then the map 1
e, t
a
can be extended in a unique way to an algebra homomorphism
The uniqueness follows immediately from the fact that the elements 1 and t generate the algebra F[t]. To prove existence, simply set
=
a".
It follows easily that k is an algebra homomorphism. The image of T[t] under J) will be denoted by T(a). It is clearly the subalgebra of A generated by e and a. Its elements are called polynomials in a, and are denoted by J(a). The homomorphism cP:f—÷f(a) induces a monomorphism cP:
In particular, if A is generated by e and a then cP (and hence P) are surjective and so P is an isomorphism
P: F[t]/kerP4A in this case.
Example: Let AeT[t] and let
be a polynomial. Then we
have
f(g) =
ti'.
In particular. the scalar term of f(g) is given by =
=J(/30).
Thus we can write
where g: 23
Greub. Linear Algebra
is the homomorphism defined in sec. 12.1.
Chapter XII. Polynomial algebra
12.3. Differentiation. Consider the linear mapping
d:E[t]—*F[t] defined by
dt"—pt"' dl =0. Then we have for p,q1 = = (p + = + t"q = + =
d
d
+
d
(12.3)
It is easily checked that (12.3) continues to hold for p=O or q=0. Since the polynomials form a basis of T[t] (ef. sec. 12.1) it follows from (12.3) that the mapping d is a derivation in the algebra F[t]. d is called the d(fferentiation map in F[t], and is the unique derivation which maps t into 1. It follows from the definition of d that d lowers the degree of a polynomial by 1. In particular we have ker d = F
and so the relations
df = dg and
are equivalent. The polynomial df will be denoted byf'. The chain rule states that for any two polynomials f and g,
(f(g))'=f'(g)g'. For the proof we comment first that
dgk=kgk_ldg dg° =
(12.4)
0
which follows easily from an induction argument. Now let
I= Then
I (g) =
gk
§ 1. Basic properties
355
and hence formula (12.4) yields
(f(g))' =
1) is called the r-th derivative of f and is The polynomial d7 We extend the notation to the case r=O by usually denoted by if and only setting f(0)=f It follows from the definition that if r exceeds the degree of j 12.4. Taylor's formula. Let A be an associative commutative algebra over F with unit element e. Recall from sec. 12.2 that if jET[t] and aeA then an element [(a)eA is determined by J(a) = Tailor's fhrmula states that for aEA, beA fl
j(a+h)= >J——--,-—-h". p=o
(12.5)
P•
Since the relation (12.5) is linear in j it is sufficient to consider the case f= t". Then we have by the binomial formula f(a + h) =
(a
=
+
p==o
P
...
(a
p
+ 1)
...
(a — p
+ 1)
On the other hand. = n(n
— 1)
—
whence (a)
= a (a —
1)
and so we obtain (12.5).
Problems defined by 1. Consider the mapping F[t] x a) Show that this mapping does not make the space F[t] into an algebra.
b) Show that the mapping is associative and has a left and right identity.
Chapter XII. Polynomial algebra
c) Show that the mapping is not commutative. d) Prove that the mapping obeys the left cancellation law but not the right cancellation law; i.e., (g) =f2(g) implies that f1 =f2 but f(g1) f(g2) does not imply that g1 =g2. 2. Construct a linear mapping
r[t] such that
do$=i. Prove that if such that
and
$2 are
two such mappings, then there is a fixed scalar,
-$2)f In particular, prove that there is a unique homogeneous linear mapping $
of the graded space F [t] into itself and calculate its degree. $ integration operator. 3. Consider the homomorphism Q:F[t]—J' defined by
is
called the
=
Q
Show that
=f(O). Prove that if $ is the integration operator in F[z'], then
=i Use
—
this relation to obtain the formula for integration by parts: $1 g'
4. What is the Poincaré series for the graded space F[t]? 5. Show that if ThF[t]—+F[t] is a non-trivial homogeneous antiderivation in the graded algebra F[t] with respect to the canonical involution, and then
pl.
6. Calculate Taylor's expansion
f (g + h) = I (g) + hf'(g) +.. for the following cases and so verify that it holds in each case:
a) f=t2—t+l,
g=t3+2t,
h=t—5
b)f=t2+l, g=t3+t—1, h=—t+l c)f=3t2+2t+5, g=l, h=—l d)f=t3—t2+t—l,
g=t,
h=t2—t+1
§ 2. Ideals and divisibility
357
7. For the polynomials in problem 6 verify the chain rule
[1(g)]' explicitly. Express the polynomialf(g(h))' in terms of the derivatives off, g and h and calculate [f(g(h))]' explicitly for the polynomials of problem 6.
§ 2.
Ideals and divisibility
12.5. Ideals in F Iti. In this section it will be shown that every ideal in
the algebra F[t] is a principal ideal (cf. sec. 5.3). We first prove the following
Lemma I: (Euclid algorithm): Let [+0 and g+0 be two polynomials. Then there exist polynomials q and r such that
f = gq + and deg r<deg g or r=0. Proof: Let degf= n and deg g = in. If in > n we write
f =gO+f and the lemma is proved. Now consider the case Without loss of generality we may assume thatf and g are monic polynomials. Then we have
f=
+
degf1
(12.6)
1ff1 + 0 assume (by induction on n) that the lemma holds forf1. Then
f1=gq1+r1
(12.7)
where deg r1 <deg g or r1 =0. Combining (12.6) and (12.7) we obtain
f = (tn_rn + q1)g + r1 and so the lemma follows by induction.
Proposition 1: Every ideal in F[t] is principal. Proof: Let I be the given ideal. We may assume that 1+0. Let h be a monic polynomial of minimum degree in I. It will be shown that I 4 (cf. sec. 5.3). Clearly, 4 c I. Conversely, let fel be an arbitrary polynomial. Then by the lemma, f= hq+ where (12.8) degr<degh or r=0.
Chapter XII. Polynomial algebra
Sincefe I and he I we have
r= and hence if
f
— h
q el
deg rdeg h. Now (12.8) implies that r=O; i.e.f=hq
and sofe The monic polynomial h is uniquely determined by I. In fact. assume Then there are monic that k is a second polynomial such that polynomials g1 and g, such that k=g1li and h=g2k. It follows that
k =g1g,k whence g1g2 = I. Since g1 and g2 are monic we obtain g1 =g2 = whence
k=li.
12.6. Ideals and divisors in F[t]. Let [and g be non-zero polynomials. We say that g dirides f or / is a multiple of g if there is a polynomial Ii such that
/=
g Ii.
In this case we write g/f Clearly. f is a multiple of g if and only if it is contained in the ideal, 1g. generated by g. If li/g and g/f then li/f Two monic polynomials which divide each other are equal. Next. letf and g be monic polynomials and consider the ideal I+Ig. In view of Proposition I. sec. 12.5. there is a unique monic polynomial. [v g. such that 'fvg = 1f + 1g. It is called the greatest coinnion dicisor of I and g. Since
g
and
[vg is indeed a common divisor of •[ and g. On the other hand, if li/f and li/g, then
whence
and 'g
g
and so li/f v g.
This shows that every common divisor of I and g is a divisor of I v g. A polynomial f whose only divisors are scalars and scalar multiples of I is called irreducible ar prime. In a similar way the greatest common divisor of a finite number of monic polynomials (i= I ... r) is defined. It is denoted by f v and is characterized by the relation 111 v
v
=
+
+ 1j;.
(12.9)
polynomials f are called relatirel prime. 12.7. Again let I and g be monic polynomials and consider the ideal n 'g. By Proposition I, sec. 12.5. there is a unique monic polynomial, .1 A g. such that the
'fAg
= 'f
§ 2. Ideals and divisibility
359
It is called the least common 17111/tip/c of f and g. It follows from the definition that [Ag is indeed a common multiple off and g and every common multiple off and g is a multiple of [Ag. If 1 (1= 1 ... r) are monic polynomials their least common multiple. A Jr. is defined by the equation ji A A
=
A
(12.10)
Proposition II: 1ff is the greatest common divisor of the f (i= 1 ... then there are polynomials such that
r)
(12.11)
Proof: It follows from (12.9) that every element
can be written as
= Since tel1. [ must be of the form (12.11). Corollary: If the polynomials nomials (1= 1.. .r) such that
are
relatively prime there exist poly(12.12)
Conversely if there exists a relation of the form (12.12) then the J
are
relatively prime.
Proof: The first part follows immediately from the proposition. Now assume that there is a relation of the form (12.12). Then every common divisor of theJ divides 1 and hence is a scalar.
Proposition III. Let f be the greatest common divisor of the monic polynomials
12 and write
= f/i1
(i = 1.2).
Then the polynomials /1-, are relatively prime and the least common multiple of the polynomials /12. f Proof: In view of Proposition II we can write .
f=>fg1.
(12.13)
Chapter XII. Polynomial algebra
It follows that .1 =
g1
whence 1.
This shows that and /12 are relatively prime. Clearly, the polynomial I 112 is a common multiple of .11 and .12. Now let g be any common multiple of f and and write
g=jp1.
g=J2p7.
Then (12.13) yields
=.1 1i2(g1 P2 + g2 Pi).
This implies that
Pi = h2(g1 P2 + g2 Pi) whence
g = ./1
Pi
.1
112 (g1 P2 + g2 Pi);
i.e., g is a multiple off. /11 112.
Corollan.: If the monic polynomials f (i= I ... r) are relatively prime, then
a
.
A
A fr = /1 ... fr
Proof: If r=2 this follows immediately from the proposition. If r>2 simple induction argument is required.
12.8. The lattice of ideals in F[t]. Let f denote the set of all ideals in r[t]. Recall from sec. 5.9 that .1 is a lattice with respect to the partial order given by inclusion. On the other hand, let 5) denote the set of all monic polynomials together with the zero polynomial. Define a partial order in 5) in by setting I g if g/f (.1 +0, g +
g0
for every
Then, in view of sec. 12.6 and 12.7, 5) becomes a lattice as well. Now let Ii.. Then .f g implies that P: 5) .1 be the map given by b: and so cP is a lattice homomorphism. Moreover, since b is bijective, it is a lattice isomorphism (cf. Proposition I, sec. 12.5).
§2. Ideals and divisibility
361
12.9. The decomposition of a polynomial into prime factors.
Theorem 1: Every monic polynomial can be written (12.14) f where the J are distinct irreducible monic polynomials and degf 1. The decomposition is unique up to the ordering of the prime factors. Proof: The existence of the decomposition (12.14) is proved by induction on the degree off. if degf=O thenf= 1 and the decomposition is trivial. Suppose that the decomposition (12.14) exists for polynomials of degree
f=gh deg h<degf it follows by induction that
and —
whence
—
t...
ii
t
1••
Collecting the powers of the same prime polynomials we obtain the decomposition (12.14).
The uniqueness part follows (with the aid of a similar induction argument) from
Lemma II: Let g, h be monic polynomials and assume that h is irreducible. Let m be a positive integer. Then htm divides fg if and only such that if there is an integer p (1
h"/f and
htm - 17g.
Proof The "if" part of the statement is trivial. Now suppose that Let pO be the largest integer such that /v!)/f. If p=m there is nothing to prove. If p<m, write Then
fg = h" 11 g.
On the other hand. by hypothesis, for some polynomial k.
Chapter XII. Polynomial algebra
These relations yield fi g =
is not divisible by /1. Since Ii is irreducible, By the definition of p. it follows that Ii and are relatively prime. Thus there are polynomials u and u such that (cf. the corollary to Proposition II, sec. 12.7) u Ii
+u
=
1.
Set g1
k.
= ug +
Then we have Ii g1
= liii g
+ r /i"j' k
g = g;
= /10 g + 1 i g = (Ii U +
i.e., hg1=g.
Now
equation (12.15) implies that f
/1i11—
g1
p—i
k
divides g. This establishes the Continuing this process we see that is complete. lemma and so the proof of Theorem I
Carol/an'. The monic polynomials which divide the polynomial are
precisely the polynomials
(v=1,...,r). Now let (12.14) be the decomposition of the monic polynomial f and set q = fk ... ... (1 = ... 1
It will be shown that the q are relatively prime and that for every i. I is the least common multiple of and the polynomial V q1, (12.16)
V 1=1
j*i Let g be a monic polynomial which divides that g has the form
(12.17)
Then Theorem I implies
g divides all polynomials q1, it follows that g =
1,
whence (12.16).
§ 2. Ideals and divisibility
A similar argument shows that V
and
363
now rormula (12.17)
follows from the relation q, A
and the fact that
and
q f:k
(V
=f
j *i
are relatively prime.
Proposition IV: Suppose g is a product of relatively prime irreducible
polynomials and suppose f is a polynomial such that g/j'" for some in 1. Then g/f Let
j—j1 ... I,
k,
be the decomposition of f into its prime factors. Then the corollary of Theorem I implies that g is of the form ho
= /1f/i
Is
by hypothesis, g is a product of relatively prime irreducible polynomials it follows that L = = J = whence Since,
1
...
and so g/f Proposition
V: A polynomial f is the product of relatively prime
irreducible polynomials if and only if f and f' are relatively prime. Proof Let jk
1— 1k
.1
Jr
J1
be the decomposition of f into prime factors. Suppose that some Then writing we
f = h j2
I
for
lie I' [t]
obtain
+ 2/if
f
f and
prime.
f and f' are not relatively
Conversely, assume that (i= I... r). Then, if I and 1' have a common factor, the corollary to Theorem I, implies that, for some i, J divides f'. 1
Since
= j=1
...
... Jr =
11
...
f1
...
Jy + /1
...
Jr
Chapter XII. Polynomial algebra
we obtain 11.11
.11'
fr
is irreducible and the polynomials .11 fr are relatively prime. This But this is impossible since follows that
Since it
contradiction shows that / and f' are relatively prime. 12.10. Polynomial functions. Let (F; F) be the space of all set maps furnished with the linear structure defined in sec. 1.2, Example 3. determines an elementf of (F; F) defined
Then every polynomialf= by
The functions f are called polynomial functions. for some lET, then 2 is called a root off. I isaroot off if and only if t—1 dividesf. In fact, if
f=(t-2)g it follows thatf(2) = 0. Conversely, if t —2 does not dividef, then t —2 and fare relatively prime and hence there exist polynomials q and s such that
I q + (t —1) S = 1.
This implies that
f(2)q(1)= 1 whencef(2)+0. It follows from the above remark that a polynomial of degree n has at most ii roots. Proposition VI: The mappingf—*Jis injective. for every Proof. Suppose J=O. Then Since F has characteristic zero it contains infinitely many elements and hence it follows that
f=0. In view of the above proposition we may denote the polynomial function f simply by f
Problems I. Let I be a polynomial such that 1(0) 0. Consider two polynomials g1 and g2 are g1 and g2 such that relatively prime.
____________
§ 2. Ideals and divisibility
365
2. Consider the set of all pairs (f, g) where g #0. Define an equivalence relation in this set by
(f,g)
if and only if
= Jg.
Show that this is indeed an equivalence relation. Denote the equivalence
classes by. Prove that the operations (f1,g1)
+(f2,g2) =(fi g2 + f2g1,g1 g2)
and
(f1,g1)(f2,g2) = (f1f2,g1g2) are well defined.
Show that with these operations the set of equivalence classes becomes
a field, denoted by 01[t]. Prove that the mapping is a monomorphism of the algebra F[t] into the algebra Q1[t]. 3. Extend the derivation d to a derivation in and show that this extension is unique. Show that the integration operator j' (cf. problem 2, § 1) cannot be extended to Q1[t]. 4. Show that any ideal in T[t] is contained in only finitely many ideals. 5. Consider the mapping 11[t] x R[t]—+111[t] defined by
Show that this mapping makes R[t] into an inner product space. Prove that the induced topology makes R[t] into a topological algebra (addition, scalar multiplication, multiplication and division are continuous). have the inner product of problem 5. Let I be any ideal. 6. Let Calculate I explicitly. Under what conditions do either of the equations =
(I')' = hold? Show that
((I')')' = I'. 7. Let f and g be any two non-zero polynomials and assume that deg g. Write
I=
+ g1
where g1 =0 or deg g1 <deg g. Prove that the greatest common divisor off and g coincides with the greatest common divisor of g and g1, unless
Chapter XII. Polynomial algebra
g dividesf. If g1 +0 write g2 =
g = p2g1 + g2,
degg2 <degg1.
or
0
Show that the repeated application of this process yields an explicit calculation of the greatest common divisor of f and g. (This method is
called the Euclidean algorithm).
8. Calculate the greatest common divisors and the least common multiples of the following polynomials over a) + t4 + t3 + b) c) 5t4 — +
d) 818 +
e) 3t4 +
12
—
— —
9t2
+ I + 1, t + 7, 3t
t6 — 712
+ 7,
+ 16, 218 + +
—
+ + 841 + 5, —
I
+ 16
72
+
—
2912 —
—
+ 21 + 1
— 612 —
64t
3t
— 15
+ 4.
9. 1ff, g are two polynomials and d is their greatest common divisor use the Euclidean algorithm to construct polynomials r, s such that
+ g s = d.
10. Construct the polynomials r, s explicitly for the polynomials of problem 8, (in parts b) and c) it will be necessary to construct three polynomials).
11. Decide whether the following polynomials are products of relatively prime irreducible polynomials: a) 17—t5+t3—t; b)
g
c) the polynomials of problem 6, § 1; d) the polynomials of problem 8. 12. Let g1, g2 be non-zero polynomials such that g1 +g2. Show that g2 divides f(g1) —
§ 3.
Factor algebras
12.11. Minimum polynomial. Let A be a finite dimensional associative
commutative algebra with unit element e. Fix an element OEA and consider the homomorphism P: T[t]—÷A given by
P(f)=f(a)
fer[t]
§ 3. Factor algebras
367
sec. 12.2). Its image is the subalgebra of A generated by e and a. It will be denoted by ['(a). The kernel, K of (P is an ideal in T[t]. Since A has finite dimension, By Proposition I, (Sec. 12.5) there is a unique monic polynomial. p is called the mininuim poli'nomial of a. p, such that K = 0, p must have positive degree. To see this assume that p = 1. If A and so (P =0. It follows that e=(P(1)=0 whence A =0. Then The homomorphism (P factors over the canonical projection to induce an isomorphism (cf.
['(ii)
W:
such
that the diagram
T[t]
['(a)
commutes.
Example I: If a=0, then K consists of all polynomials whose scalar term is zero and so p=1. Example II: If a=e. then K consists of all polynomials I In this case we have p=t— I.
satisfying
Example III: Set A = F [1]/Ih (where h is a fixed monic polynomial) and
a=t where t=m(t). Then, for every polynomial I (P(f) = and
=
ir(tY =
=
so (P coincides with the canonical projection. It follows that the
minimum polynomial of is the polynomial h.
Proposition I: The dimension of ['(a) is equal to the degree of the minimum polynomial of a. Proof: Let r denote the degree of p. Then it is easily checked that the tr_l form a basis of F[t]/I,,. Thus we have, in view of elements I. t the isomorphism W. dim ['(a) = dim T[t]/I,, = I'. Proposition II: Every ideal in ['(a) is principal.
Then I is an Proof: Let 14 be an ideal in ['(a) and set I = ideal in F[t]. Thus, by Proposition V. sec. 12.5. there is an element fE J'[t] such that I = I. It follows that = (P(I) =
=
Chapter XII. Polynomial algebra
12.12. Nilpotent elements in 1(a). Suppose that I (a) is a nilpotent element of 1(a). Then, for some in 1. 1(a)" = 0. It follows that f.rn) =
= 1(a)" = 0
and so the minimum polynomial. p. of a divides I Now decompose p into its prime factors P=
/1
tr
.
and set
g=f1... Then
fr
we have g/p1/f'. Now Proposition IV. sec. 12.9. implies that g
divides f Conversely, assume that g//. Set k = max
(k1
kr). Then
p, g' and so
we have
It follows that I = 0 and so 1(a) is nilpotent. Thus I (a) is in/potent if and on/v if g divides f. In particular. if all the
exponents in the decomposition of p are I. then p=g and so f(a)
is
nilpotent if and only if p/f; i.e.. if and only if f(a)=0. Hence, in this case there are no non-zero nilpotent elements in 1(a). 12.13. Factor algebras of an irreducible polynomial. Theorem I: Let I be a polynomial. Then the factor algebra 1(t) is a field if and only if f is irreducible.
Proof: Supposef= gh,wheredegg 1 and degh and so h+O. On the other hand, and so 1(1) has zero divisors. Consequently, it is not a field. Conversely, suppose f is irreducible. 1(1) is an associative commutative algebra with identity T. To prove that 1(1) is a field we need only show that every non-zero element has a multiplicative inverse. Let be any representing polynomial. Then since it follows that g is not divisible by f Since f is irreducible, f and g are relatively prime and so there exist polynomials h and k such that
gh +fk whence
Butf=O and so 1. Hence ii
is an inverse of
I
§ 4. The structure of factor algebras
369
Corollary: If j. is irreducible, then 1(1) is an extension field of F. Proof: Consider the homomorphism co:F—*F(i) defined by It is clear that ço is a monomorphism and so the corollary follows.
Problems 1. Consider the irreducible polynomial f= t2 + 5t + 1 as an element of [t] (where Q is the field of rational numbers). Let m:Q[t]—÷Q[t]/11
be the canonical projection. a) Decide whether the polynomials of problem 6, § 1, problem 8, § 2 (except for part d) and problems 11 a) and b), § 2 are in the kernel of it. b) For each polynomial p of part a) such that itp+O construct a polynomial [t] such that
= (mp)'. 2. Let feF[t] be any polynomial. Consider an arbitrary element xel'[t]/11. Prove that the minimum polynomial of x has degree
degf.
3. Suppose fer[t] is an irreducible polynomial, and consider the polynomial algebra F[t]/11[t] denoted by f'1[t]. a) Show that F[t] may be identified in a natural way with a subalgebra of F1[t].
b) Prove, that if two polynomials in F[t] are relatively prime, then they are relatively prime when considered as polynomials in ['1[t]. c) Construct an example to prove that an irreducible polynomial in F[t] is not necessarily irreducible in F1[t]. d) Suppose thatf has degree 2, and that g is an irreducible polynomial of degree 3 in F[t]. Prove that g is irreducible in F1[t]. §
4•* The structure of factor algebras
In this paragraph f will denote a fixed monic polynomial. and F(t) will denote the factor algebra F [t]/If. 12.14. The lattice of ideals in 1(i). Consider the set of all monic polyof nomials which divide 1. These polynomials form a sublattice then the greatest (cf. sec. 12.8). In fact. if ... is any finite set in 24
(i rcuh. Linctr \Igchra
Chapter XII. Polynomial algebra
370
common divisor and the least common multiple of the is again contained in f is a lower bound and 1 is an upper bound of ideals in the algebra On the other hand, consider the lattice F(i)=F[t]/11. The remarks of sec. 12.11 establish a bijection defined by
'Pg = where 'g denotes
the ideal in T(i) generated by
check that 'P and phism: i.e.,
The reader can easily 'P
cP(\/
a lattice isomor-
=
and
In
particular,
'P(1)=F(i) and
cP(f)=O. 12.15. Decomposition of F(i) into irreducible ideals. Let f =f1 denote the ideal in 1(1) generated byf,. Consider the ideal
. .
and let
(12.18)
Proposition I. I = if and only if the polynomials j are relatively prime. The sum (12.18) is direct if and only if, for each j. the polynomials and V J havef as least common multiple.
Proof: To prove the first part of the proposition we notice that F (1) = is
equivalent to 'P(1) = 'P(v fL)
which in turn is equivalent to
1= V
But according to sec. 12.6, this holds if and only if the J are relatively prime.
§ 4. The structure of factor algebras
371
For the second part we observe that the sum is direct if and only if
(j=1...m).
i*j Since
un
i*j
i*j
this is equivalent to
(j=1...m).
i*j Theorem I: Let be
the decomposition off into prime polynomials and let the polynomials be defined by k1
...f1
k2
fr
Then
1(1) = where
(12.19)
Moreover let
denotes the ideal generated by
(12.20)
and 1=1k
(12.21)
be the decompositions determined by (12.19). Then
is an identity in
and for every je1(f) q
41
=
(13.
(12.22)
Finally, if I is any ideal in 1(1), then (12.23)
Proof: The relation (12.19) is an immediate consequence of Proposition I, and formulae (12.16) and (12.17). To show that ë is an identity in 1(t) let be arbitrary. Then ci =
T
=
ci.
Since forj#i =0 24*
Chapter XII. Polynomial algebra
372
it follows that
= Now let
be
j.
an arbitrary element of T(t) and let q
=
we obtain q—
+...+
+•••+
But
and so It follows that k=O
+...+
= i=1
Finally, let I be any ideal in F(i). Then clearly
mu and so (12.23) is an immediate consequence of (12.20).
Theorem II. With the notation of Theorem I let —p
be
the homomorphism defined by =
p.(i) = Then
Thus
is an epimorphism and ker induces an isomorphism
=
F
(12.24)
In particular, the minimum polynomial of isf11". Proof: (12.22) shows that is an epimorphism. Next we prove that = ker We have
and the sum (12.19) is direct it follows that
(i=1...r);
§ 4. The structure of factor algebras
373
Now consider the induced map
i.e.,
Then
qEF[t]
= q(i,) and, in view of(12.22), j41
But
j*i
and thus
is the ideal generated
= 'fkj. This completes the proof.
Corollary I: An element cje 1(1) is contained in
if and only if
j+i.
and
Theorem III: The ideals are irreducible and (12.19) is the unique decomposition of 1(1) into a direct sum of irreducible ideals. Proof: Let In view of the isomorphism (12.24) it is sufficient to prove that the algebra 1' [t]/Igj is irreducible. According to Proposition I, T[t]/Igj is the direct sum of two ideals and '2 only if and '2 =
q1 and q2 are relatively prime divisors of
where
But this can only
happen if either q1 = or q2 = 1. If q1 1, say, then '1 is the full algebra and so 12=0. It follows that F(t)/Igj is irreducible. 1
Now suppose that I is an irreducible ideal in 1(1). Then Theorem I, sec. 12.16 gives
In I
I
for some i. Thus if
T(i)= for some ideal J, then intersection with I, gives I1).
is irreducible, it follows that Jn pletes the proof. Since
whence
This com-
374
Chapter XII. Polynomial algebra
Corollary I. The irreducible factor algebras 1(1) are precisely those for which f is a power of an irreducible polynomial.
Corollary II: Suppose the ideal I is a direct summand in 1(1). Then I is a direct sum of the I. Proof: Let J be an ideal such that J = 1(1). It is obvious that in a finite dimensional algebra any ideal is a direct sum of irreducible ideals. Since I and J are ideals, I J= 0, and so any ideal in 1(J) is an ideal in 1(1). Now the result follows from Theorem III. 12.16. Semisimple elements. Let A be a finite dimensional asso-
ciative commutative algebra with unit element e and let aeA. Recall from sec. 12.11 that the minimum polynomial, /, of a has positive degree. The element a is called seinisim pie. if f is the product of relatively prime irreducible polynomials.
Lemma I: a is semisimple if and only if the element f'(a) is invertible in the algebra T(a). If a is semisimple, then the polynomials I and f' are relatively prime (cf. Proposition V. sec. 12.9). Thus, by Corollary I to Proposition II. sec. 12.8. there are polynomials p and q such that
pf+qf'= I. Since f(a)=0 it follows that q(a)f'(a)=e and so f'(a) is invertible. Conversely, assume that ['(a) is invertible in 1(a). Then there is a poly-
nomial g such that f'(a)g(a)=e.
Set
Ii = 1' g
Then
1.
ii(a) = f'(a)g(a)
— e
= 0.
Since [ is the minimum polynomial of a it follows that f/i. Thus we can write
/
g
—
I
=
f. q
where q is some polynomial. It follows that [and [' are relatively prime.
Lemma II: Assume that a is semisimple and let h be a polynomial such that Ii(aY1=0 for some ,n 1. Then Ii(a)=0. Proof: Let .1 be the minimum polynomial of a. Then the hypothesis implies that f//i"'. Since a is semisimple. [is a product of relatively prime irreducible polynomials and it follows that f/li (cf. Proposition IV. sec. 12.9). Thus h(a)=0.
§ 4. The structure of factor algebras
375
IV: Let A be a finite dimensional associative commutative algebra. Let xEA and let Theorem
j=
Irk
the decomposition of the minimum polynomial of x into prime factors. Set h(1=1/1 r be
and
(i=l...r). Then there are unique elements xsEA and XvEA such that x5 is semisimple, x\ is nilpotent and X=
.vs
+ XN.
The minimum polynomials of xs and =g
are given by
and
=
Proof: 1. Existence: Identify the subalgebra 1(x) with the factor algebra
T(fl=r[t]/11 via x=t. Lemma IV below (cf. sec. 12.17) yields elements UET[t] and (')ET[t] with the following properties: (i)
(ii)
(iii)
U+Wt, g divides w.
Projection onto the quotient algebra yields the relations (12.25) (12.26)
and (12.27)
Now set
xs = ii and
=
Then g(x5) = 0.
Thus the minimum polynomial of xs divides g and hence it is a product of relatively prime irreducible polynomials. Thus, by definition. x5 is semisimple. Relation (12.26) shows that xv is nilpotent and from (12.27) we obtain the relation xv + xs = x. Finally. x5 and
are polynomials in x.
Chapter XII. Polynomial algebra
376
2. .\Iiiiiiniiin poltiiomiuls: Next we show that g is the minimum poiy— nomial of Let Ii he the minimum polynomial of Then Ii divides g. On the other hand. Taylor's expansion gives
+ x) =
/i(x) =
=
+q
q
where qEA is some element. This shows that /i(x) is nilpotent. Now Sec. 12.12 implies that g,'/i. Thus g=/i. consider me minimum polynomial, gI()L Thus ''V. It follows that
=
Since g/w we have
of
= 0.
= 1' with / k. To show that k / we may assume that k 2. Then (Ii
Thus
=g
r and r is
relatively prime tog (cf. Lemma III and IV, sec. 12.17). Hence r' is relatively prime to f It follows that r(x) is invertible. Now
=
= g(x) i(x)
=g.
and so
= xç. =
g(x)'
= 0.
Since t(x) is invertible we obtain g(x)'=O whence g'(x)=O. This implies that f/gt whence k 1. Thus k = 1: i.e.. = 3. Uniqueness: Assume a decomposition X=
Ys
is nilpotent. Set
where v5 is semisimple and
== and
+ .
Then
—
so = is nilpotent. We must show that ==0. Taylor expansion yields I
(i)(
j=1
)
x
j=O
g (i)(
)
.1•
showing that gft5) is nilpotent. Hence, for some in. (g(ys)Y1 = 0.
Since v5 is semisimple it follows that g(v5)=0 (cf. Lemma II) sec. 12.16 and so I.
(J(j) (
'=O.
(12.28)
§ 4. The structure of factor algebras
377
Let / 1 denote the degree of nilpotency of:. We show that 1= 1. In fact, assume that /2. Then, multiplying the equation above by :1_2 we obtain
=0.
In view of Lemma I (applied with A=T(x5)). g'(x5) is invertible in and hence invertible in A. It follows that zi =0, in contradiction to the choice of!. Thus 1=1; i.e., :=0. Thus i'5=X5 and IN=XN and the proof is complete.
are called the semisimple and
DefInition: The elements XN and ni/potent parts of x. 12.17. Lemma III:
Let g be a polynomial such that g and g' are rel-
atively prime. Then for each integer k 1 there are polynomials u and t' with the following properties: (i)
(ii) (iii)
u + gr = t. If k2, then r is relatively prime tog.
Proof: For k = I set u = t and v = 0. Next consider the case k =2. Since g and g' are relatively prime there are polynomials p and q such that
1+pg'=qg. The Taylor expansion (cf. sec. 12.4) gives, in view of the relation above, 0(n)
cx)
= g + g'pg +
g(t +
=g(1 +g'p)+g2 .
=
g2
+
+ (12.29)
/
q + g2 / = g2(q + /).
Now set
u=t+gp and r=—p. Then we have
u+gr=t
(12.30)
and, in view of(l2.29). = g2(q + /) which achieves the result for k =2. Finally, suppose the proposition holds for k — (k 3) and define Uk by 1
=
U2(11k1).
Chapter XII. Polynomial algebra
378
Replacing t by
in (12.30) we obtain )+
112(11k
12
By induction hypothesis
(uk_i) = Uk_i.
(12.31)
gkl
g
.
where q is some polynomial. Now (12.31) can be written as 11k
+ g.
q
.
Uk_i.
u2U(k_i)
(12.32)
Finally, (ii) yields in view of the induction hypothesis
=
t
(12.33)
g
Relations (12.32) and (12.33) imply that 11k
Setting tk = q
v2 (Uk
+g.
+
_
(q
17(11k_i)
.
we
+ tk—
= 1.
obtain
Uk + g Vk = t.
It remains to be shown that gk divides
But
=
=
Now the lemma, applied for k=2, shows that gk_
Since
it follows that and so the induction is closed. Thus property (i) and (ii) are established. (iii): Let k2. Suppose that ii is a common divisor of g and i.
g=p/i,
1:=q
11.
Then, by (ii)
g(u) =g(t — ig) = g(t
—
pqli2)
and so Taylor's expansion yields g(u) =
g
—
.
/
where / is some polynomial. Since k2. (i) implies that g2/g(u) and so /i2/g(u). Now the equation above shows that divides g g=
/i
.
§ 4. The structure of factor algebras
379
It follows that g' =
in +
2 Ii
m'
and so Ii is a common divisor of g and g'. Since g and g' are relatively prime. Ii must be a scalar. Hence g and r are relatively prime. This completes the proof.
Lemma IV: Let g be a polynomial such that g and g' are relatively prime. Then for each k I there are polynomials u and co with the following properties: (1) gk dirides g(u).
(ii) U+('J=t. (iii) g dirides ('i.
Proof: Define e by 0) = g
1
where r is the polynomial obtained in
Lemma III. 12.18. Decomposition of 1(t) into the radical and a direct sum of fields. Let be
the decomposition off into its prime factors and set
a= I/1 h
r
Consider the factor algebra T(i)=T[t]/11. By Theorem IV there is a unique decomposition =
ts + IN
is nilpotent. Moreover. g is the minimum is semisimple and polynomial oft5. Now let A be the subalgebra of 1(i) generated by I and Let be the ideal in T[f] generated by / (1= 1 ...r). where
Theorem V: 1. A is the (unique) direct sum of irreducible ideals (1=1 ... A
= Ii
and T[t]/11
In particular, each
(i =
1
... ij).
is a field and so A contains no non-zero nilpotent
elements.
2. The radical of T(t) consists precisely of the nilpotent elements in and is generated as an ideal by
Chapter XII. Polynomial algebra
3. The vector space T(t) is the direct sum of the subalgebra A and the ideal rad T(t). T(t) = A rad 1(t).
4. The subalgebra A consists precisely of the sernisimple elements in fliL 5. A is the only subalgebra of T(ii which complements the radical. Proof: 1) Consider the surjective homomorphism flt)—*A given by tF—*t. Since g is the minimum polynomial of it induces an isomorphism
T[t]/Ig
A.
According to Theorem III (applied to g) T[t]/Ig is the unique direct ['[tI/I1. (1 = ... sum of irreducible ideals where the corresponding ideal in A. Then 1
r).
Let
also denote
and the are irreducible ideals. is irreducible, Theorem I, sec. 12.13, implies Finally, since each is a field. that 2) and 3): Let denote the ideal generated by and let I be the ideal of nilpotent elements in T'(t). Since is nilpotent we have (12.34)
it follows that
Next, since
=
A
+
(12.35)
Moreover, since A is the direct sum of the fields rad
nA
rad A =
rad
= 0.
(12.36)
Relations (12.34) and (12.36) imply that the decomposition (12.35) is direct,
['(t) =
A
This yields
rad 1'(t) = rad 1'(t)n A + I\ = I\ whence I = rad ['(t). It follows that
T(t) =
A
rad T(t).
(12.37)
§4. The structure of factor algebras
381
We show first that every element in A is semisimple. In fact, let
4)
xe. Then, by Theorem IV, sec. 12.16, there are elements XNE['(X) such
that
is semisimple,
= In particular,
is
nilpotent and
+ XN.
is nilpotent in A. Thus, by 1),
and so
is semi-
simple.
be semisimple. In view of 3) we can write
Conversely, let
x=
XA
+ XR
T(t).
XAEA,
(12.38)
is nilpotent. Thus (12.38) must be the is semisimple and unique decomposition of into its semisimple and nilpotent parts.
Then
Since
is
semisimple it follows that
whence
which complements the radical. 5) Let B be any subalgebra of Then dividing out by the radical we obtain an algebra isomorphism B.
A
Thus. by 4), the elements of B are semisimple. Hence, again by 4),
BcA. It follows that B=A. Corollary: Let xET(t). Then the decomposition X = + XR obtained from the decomposition 3) is the decomposition of x into its
semisimple and nilpotent parts. The results of this paragraph yield at once Theorem VI: Let (12.39)
be the decomposition of the algebra T(t) into irreducible ideals. Then every ideal
is a direct sum rad I
I. = where
(12.40)
is a field isomorphic to T[t]/11 (i= I ..r). Moreover, the .
decompositions (12.39) and (12.40) are connected by rad T(t) = and A
rad I,
Chapter XII. Polynomial algebra
382
Problems 1. Consider the Polynomials
a) f=t3—6t2+ lit— 16. b) 1= t2 + t + 7. c) I = t2
—
5.
Prove that in each case I is the product of relatively prime irreducible polynomials. For k=2, 3. construct polynomials ii and r which satisfy the conditions of Lemma III. sec. 12.17. 2. Show that if and are any two polynomials satisfying the conditions of Lemma III (for some fixed k), then gk divides and g"' divides
Chapter XIII
Theory of a linear transformation Iii tills (ila/)ter E will ileiiote a clef iiied ouci
a/i aihitiaiv tie/cl T of characteristic 0.
l(X'tOl space E—E will
a liiiear tiaiisforiiiatioii.
§ 1. Polynomials in a linear transformation 13.1. Minimum polynomial of a linear transformation. Consider the algebra A(E; E) of linear transformations and fix an element E). E) is defined by Then a homomorphism P: cP:
fF—*f((p)
sec. 12.11). Let p be the minimum polynomial of (p. Since A(E; E) is non-trivial and has finite dimension, it follows that deg p (cf. sec. 12.11). The minimum polynomial of the zero transformation is t whereas the minimum polynomial of the identity map is t— 1. Proposition I. sec. 12.1 1. shows that (cf.
1
dim T(p) = deg p where
T(p)= Im&
13.2. The space Kifi. Fix a polynomial f and denote by K([) the kernel of the linear transformation f(p). Then the subspace is stable under (p. In fact. if xcK(f). then f(p)x=0 and so
=
= 0;
xEK(f). In particular we have
K(l)=0.
K(t)=kerço
and
where p denotes the minimum polynomial of (p.
K(p)=E.
Chapter XIII. Theory of a linear transformation
and let q1 : F F he the Let F E he ani suhspace stable ttIIL'iC /Lj: induced linear transformation. Then fL((pj;) 0 ciiid 50 denotes the minimum polynomial Now let g be a second polynomial and assume that I. Then
K(g)cK(f).
(13.1)
In fact, writingf==gg1 we obtain that for every vector xEK(g)
f(p)x =g1(p)g(p)x =0 whence xEK(f). This proves (13.1).
Proposition I. Letf and g be any two non-zero polynomials, and let d be their greatest common divisor. Then
K(d)=K(f)n K(g). Proof: Since d/f and d/g it follows that K (d)
K (f) and K (d)
K (g)
whence
K(d)
K(f) fl K(g).
(13.2)
On the other hand, since d is the greatest common divisor of f and g, there exist polynomialsf1 and g1 such that
d=f1f +g1g. Thus if xEK(f)n K(g) we have
d(p)x = f1(p)f(p)x + g1(p)g(p)x = 0 and hence xeK(d). It follows that
K(d)
K(f) fl K(g)
(13.3)
which, together with (13.2) establishes the proposition.
Corollary I: Letf be any polynomial and let d be the greatest common divisor off and 1u. Then
K(f) = K(d). Proof: Since K(p)=E, it follows from the proposition that
K(d)=K(f)n E=K(f).
§ 1. Polynomials in a linear transformation
385
Proposition II: Let f and g be any two non-zero polynomials, and let v be their least common multiple. Then
K(v) =
K(f) + K(g).
(13.4)
Proof. Sinceflv and gjv it follows that K(f)c:K(v) and K(g)cK(v); whence
K(v)
K(f) + K(g).
(13.5)
On the other hand, since v is the least common multiple off and g, there are polynomials f1 and g1 such that
f1f = v = andf1 and g1 are relatively prime. Choose polynomialsf2 and g2 so that f2f1 +g2g1 = 1; then + g2('p)g1(p) = 1. Consequently each xeE can be written as
x = x1 + x2 where
f2(p)f1(p)x and x2 = g2(p)g1 ((p)x.
x1
(13.6)
Now suppose that xeK(v). Then
f1(co)f(co)x = g1(ço)g(ço)x = v(p)x =
0
and sO (13.6) implies that
f(ço)x1 =O=g(p)x2. Hence x1EK(f), and x2eK(g), so that
xeK(f) + K(g) that is,
K(v) c K(f) + K(g).
(13.7)
(13.4) follows from (13.5) and (13.7).
Corollary I: 1ff and g are relatively prime, then K (1 g) = K (f)
K (g).
(13.8)
Proof: Sincef and g are relatively prime, their least common multiple is fg and it follows from Proposition 11 that
K(fg)= K(f)+ K(g). 25
Greub. Linear Algebra
(13.9)
Chapter XIII. Theory of a linear transformation
On the other hand the greatest common divisor of f and g is I, and so Proposition I yields that
K(f)n K(g)=K(l)=0.
(13.10)
Now (13.9) and (13.10) imply (13.8).
Corollary II. Suppose
f=fi"fr
is a decomposition of the polynomialf into relatively prime factors. Then
K(f)= Proof: This is an immediate consequence of Corollary I with the aid of an induction argument on r. Let f and g be any two non-zero polynomials such that g is a proper but the inclusion need not be proper. In divisor off Then fact, let g=p and I =/ip where ii is any polynomial with deg/i 1. Then g I (properly) but K (g) = E = K(f).
Proposition III: Letf and g be non-zero polynomials such that (i)
and (ii)
(properly)
Then
K (g) c K (f)
(properly).
Proof: (i) and (ii) imply that there are polynomialsf1 and g1 such that
p=ff1 and f=gg1
degg1>0.
(13.11)
Setting g2=gf1 we have degg2<degp and so p is not a divisor of g2. it follows that g2 cannot annihilate all the vectors of E; i.e., there is a vector xeE such that (13.12)
Let y =f1 (p)x. Then we obtain from (13.11) and (13.12) that
f(ço)y =f(p)f1(p)x = p(p)x =
0
while
= g(p)f1(p)x = g2(p)x
0.
§ I. Polynomials in a linear transformation
Thus
yeK(f), but
387
and so K(g) is a proper subspace of K(f).
Corollary I: Letf be a non-zero polynomial. Then
K(f)=O if and only iff and
(13.13)
are relatively prime.
1ff and /2 are relatively prime, then Corollary I to Proposition II gives
K(f)=K(f)n E=K(f)n K(/2)=O. Conversely suppose (13.13) holds, and let d be the greatest common divisor off and p. Then l/d and d/p but
K(d)=K(f)fl K(p)=O=K(1). It follows from Proposition III that 1 cannot be a proper divisor of d; hence 1 andf and p are relatively prime. Corollary II: Let f be any non-zero monic polynomial that divides p, and let ço1 denote the restriction of çü to K(f). Let pj. denote the minimum polynomial of p1. Then (13.14) pf=f.
Proof: We have from the definitions thatf(p1)=O, and hence ps/f. It K(f) and since K(pf) K(f) we obtain
follows that
K(p1) =
On the other hand,fjp and cannot be a proper divisor of Proposition IV: Let be
K(f).
Now Proposition III implies that,u1 which yields (13.14).
pf1...fr
a decomposition of p into relatively prime factors. Then
Moreover, if co denotes the restriction of q, to K(f,) and then polynomial of
pi =fi.
25*
is the minimum
Chapter XIII. Theory of a linear transformation
Proof: The proposition is an immediate consequence of Corollary II to Proposition 11 and Corollary 11 to Proposition 111. such 13.3. Eigenvalues. Recall that an eigenvalue of is a scalar that
(px=/x
(13.15)
for some non-zero vector XEE. and that x is called an eigenvector corresponding to the eigenvalue ).. (13.1 5) is clearly equivalent to
K(f)
0
(13.16)
where / is the polynomial t In view of Corollary I to Proposition III. (13.16) is equivalent to requiring that / and p have a non-scalar common divisor. Siiice deg/ I. this is the same as requiring that Thus the eigenvalues of p are precisely the distinct roots of p. Now consider the characteristic polynomial of çø, x
=
are the characteristic coefficients of p defined in sec. 4.19. The corresponding polynomial function is then given by where the
det ((p — )L i):
x
it follows from the definition that
dimE = Recall that the distinct roots of are precisely the eigenvalues of (cf. sec. 4.20). Hence the distinct roots of the characteristic polynomial coincide with those of the minimum polynominal. In sec. 13.20 it will be shown that the minimum polynomial is even a divisor of the characterisic polynomial.
Problems 1. Calculate the minimum polynomials for the following linear transformations: a) ço=).t b) (p is a projection operator c) (p is an involution
§ I. Polynomials in a linear transformation d) e)
389
'P is a differential operator
is a (proper or improper) rotation of a Euclidean plane or of
Euclidean 3-space 2. Given an example of linear transformations (p, t/i:E—*E such that 0i/i do not have the same minimum polynomial. o 'P and where (1= 1,2) are 3. Suppose E—E1E13E2 and linear transformations. Let p, 111, P2 be the minimum polynomials of (p, and 'P2. Prove that p is the least common multiple of P1 and P2. 4. More generally, suppose E1, E2 c E are stable under 'P and E= and 'P12:E1fl E2 be the E1+E2. Let q,1:E1—+E1, of'P and suppose that they have minimum restrictions polynomials P1,
is the least common multiple of Pt and P2• where v is the greatest common divisor of c) Give an example showing that in general P12 + 5. Suppose E1 E is a subspace stable under 'P. Let and minimum polynomials of p
b)
and
P2.
and 1ü be the
and
Prove that let v be the least common multiple of Pi and Construct an example where v=p+üp1 and an example where v+p= Finally construct an example where
6. Show that the minimal polynomial p of a linear transformation 'P can be constructed in the following way: Select an arbitrary vector x1eE and determine the smallest integer k1, such that the vectors (PVX1 (v= 0.. .k1) are linearly dependent,
= 0. Define a polynomial ft by k1
11
v0
tv.
If the vectors (PVXI (v=O...k1) do not generate the space E select a vector x2 which is not a linear combination of these vectors and apply the same
construction to x2. Let f2 be the corresponding polynomial. Continue this procedure until the whole space E is exhausted. Then p is the least common multiple of the polynomialsfQ. 7. Construct
the minimum and characteristic polynomials for the
following linear transformations of R4. Verify in each case that p divides
Chapter XIII. Theory of a linear transformation
390
a) b)
= =
c)
=
+ + + —
d)
+ —
+
+
+
+
—
—
—
+
— —
8. Let p be a rotation of an inner product space. Prove that the coefficients
of the minimum polynomial satisfy the relations
k=degp,v=0...k ± 1 depending on whether the rotation is proper or improper. 9. Show that the minimum polynomial of a selfadjoint transformation of a unitary space has real coefficients. 10. Assume that a conjugation z—*2 is defined in the complex vector space E (cf. sec. 11.7). Let ço:E—÷E be a linear transformation such that Prove that the minimum polynomial of p has real coefficients. 11. Show that the set of stable subspaces of E under a linear transformation is a lattice with respect to inclusion. Establish a lattice homomorphism of this lattice onto the lattice of ideals in F((p). 12. Given a regular linear transformation show that is a polywhere
nomial in çø.
13. Suppose p e L(E, E) is regular. Assume that for every //e L(E: E) Prove that J= I for some k. If/k is the leasi. integer (some
such that )'= I, prove that the minimum polynomial,
of p can be
written
=
§ 2.
Generalized eigenspaces
13.4. Generalized eigenspaces. Let
=
...
ft irreducible
(13.17)
be the decomposition of p into its prime factors (cf. sec. 12.9). Then the spaces
E. = K are called the generalized eigenspaces of p. It follows from sec. 13.2 that are stable under 'p. Moreover, Proposition IV, sec. 13.2 implies the
§ 2. Generalized eigenspaces
391
that (13j8) and
=
jk
where p, denotes the minimum polynomial of the restriction of p to E1. In particular, dim Now suppose 2 is an eigenvalue for 'p. Then t—2Jp, and so for some i t
Hence the eigenspaces of
—2.
are precisely the spaces
K(f) where the J, are the those polynomials in the decomposition (13.17) which are linear.
13.5. The projection operators. Let the projection operators in E associated with the decomposition (13.18) be denoted by ira. It will be shown that the mappings are polynomials in p,
i=1,...,r. If r= 1, it1 = t and the assertion is trivial. Suppose now that r> 1 and define polynomials by —
g
are relatively prime, and hence there
exist polynomials h1 such that (13.19)
On the other hand, it follows from Corollary II, Proposition II, sec. 13.2 that
K(g) =
j*i
and so, in particular, =
0
xe
j*i
Now let xeE be an arbitrary vector, and let x1EE1
(13.20)
392
Chapter XIII. Theory of a linear transformation
be the decomposition of x determined by (13.18). Then (13.19) and (13.20) yield the relation x=
=
x =
h (go) g1 (go)
h. (go) g. ((p) x1
whence x1
=
I = 1,
...,r.
(13.21)
It follows at once from (13.20) and (13.21) that
=
I
=
1,
...,r
which completes the proof. 13.6. Arbitrary stable subspaces. Let Fc: E be any stable subspace. Then
F=
(13.22)
n
where the are the generalized eigenspaces of go. In fact, since the projection operators are polynomials in go, it follows that F is stable under each 7C1.
Now we have for each xeF that X
=1X=
X
and fl
It follows that
E1,
whence E1.
inclusion in the other direction is obvious, (13.22) is established. 13.7. The Fitting decomposition. Suppose F0 is the generalized eigenspace of go corresponding to the irreducible polynomial t (if t does not divide then of course F0=0). Let F1 be the direct sum of the remaining generalized eigenspaces. The decomposition Since
E=
F0
F1
is called the Fitting decomposition of E with respect to go. F0 and F1 are called respectively the Fitting-null component and the Fitting-one component of E.
§ 2. Generalized eigenspaces
393
Clearly F0 and F1 are stable subspaces. Moreover it follows from the and F1, then definitions thatif 'p is ni/potent; i.e.,
forsomel>O
while Pi is a linear isomorphism. Finally, we remark that the corresponding projection operators are polynomials in (p, since they are sums of the projection operators defined in sec. 13.5. 13.8. Dual mappings. Let E* be a space dual to E and let E*
çp* E*
be the linear transformation dual to 'p. Then if f is any polynomial, it follows from sec. 2.25 that This
=
implies that f(p*) = 0 if and only if f('p) 0. In particular, the
minimum polynomials of 'p and
coincide.
Now suppose that F is any stable subspace of E. Then F' is stable under co*. In fact, if yEF and y*EF' are arbitrarily chosen, we have <(p* y*, y> =
=
'p
0
whence 'p*y*EFI. This proves that F' is stable. Thus 'p* induces a linear transformation ((p*)I.
E*/FI
E*/FI.
On the other hand, let and ('p*)1 are be the restriction of 'p to F. It will now be shown that dual with respect to the induced scalar product between F and E*/FI is a representative of (cf. sec. 2.24). In fact, if yEF is any vector and an arbitrary vektor e E*/FI, then <('p*)1
which proves the duality Suppose next that
v*, y> =
= =
=
'pF
and ('p*)1. Thus we may write
E=
F1
F2
is a decomposition of E into two stable subspaces. Then it follows that E* = F±
F1'
Chapter XIII. Theory of a linear transformation
394
is a decomposition of E* into stable subspaces (under co*). Moreover, the pairs and are dual, and
induce dual mappings in each pair. are two dual subspaces Conversely, assume that F1 c:E and stable under p and respectively. Then we have the direct decompositions (cf. sec. 2.30), and it is easily checked that çø and
E== F1
and E*
=
FjL
are again stable.
(cf. sec. 2.30). Clearly the subspaces (F1*)I and More generally. a direct decomposition
E into several stable subspaces determines a direct decomposition of E* into stable subspaces,
j*i
F)1
as follows by an argument similar to that used above in the case r=2. Each pair is dual and the restrictions to F, p and and
are dual mappings.
Pro position I: Let
p=
.
the decomposition of the common minimum polynomial. p. of p and (p* Consider the direct decompositions be
and (13.23)
of
£ and E*
into the generalized eigenspaces of p and
= Proof: Consider
Es)'
i = 1, ...,r.
j*i
the subspaces F,*cE*
F1*=(>EJ)l
j*i
defined by
i=1,...,r.
Then (13.24)
§ 2. Generalized eigenspaces
are stable under
Then, as shown above, the
395
and (13.25)
It will now be shown that (13.26)
Let y*EF,* be arbitrarily chosen. Then for each
=
=
we have
= 0.
In view of the duality between EL and F1*, this implies thatf/'1(cp*)y* =0;
y*EE* This establishes (13.26). Now a comparison of the decompositions (13.23) and (13.25) yields (13.24).
Problems 1. Show that the minimum polynomial of p is completely reducible (i.e. all prime factors are of degree 1) if and only if every stable subspace contains an eigenvector.
2. Suppose that the minimum polynomial p of ço is completely reducible. Construct a basis of E with respect to which the matrix of ço is lower triangular; i.e., the matrix has the form 0
*
Hint: Use problem 1.
3. Let E be an n-dimensional real vector space and p: E—*E be a regular linear transformation. Show that çø can be written p=c01p2 where every eigenvalue of p1 is positive and every eigenvalue of .p2 is negative.
4. Use problem 3 to derive a simple proof of the basis deformation theorem of sec. 4.32. 5. Let ço:E-*E be a linear transformation and consider the subspaces F0 and F1 defined by F0 =
and
F1 =
fl
Chapter XIII. Theory of a linear transformation
396
a) Show that F0 = U ker
ço1.
1
b)
Show that
c) Prove that F0 and F1 are stable under
and that the restrictions are respectively nilpotent and regular. and cond) Prove that c) characterizes the decomposition
(p0:F0-3F0 and ço1
clude that F0 and F1 are respectively the Fitting null and the Fitting 1-component of E.
6. Consider the linear transformations of problem 7, § 1. For each transformation a) Construct the decomposition of into the generalized eigenspaces. b) Determine the eigenspaces. c) Calculate explicitly polynomials g, such that the g.(p) are the pro-
jection operators in E corresponding to the generalized eigenspaces. Verify by explicit consideration of the vectors that the (ço) are in fact the projection operators. d) Determine the Fitting decomposition of E. 7. Let E= be the decomposition of E into generalized eigenspaces
of p, and let
be the corresponding projection operators. Show that
there exist unique polynomials
(p) = Conclude that the polynomials
such that and
deg
deg
depend only on p. E*
8. Let E* be dual to E and
E* into generalized eigenspaces of co* prove that
=
are the corresponding projection operators and the g, are defined in problem 7. Use this result to show that and are dual and to obtain formula
where the
(13.24).
9. Let FcE be stable under 'p and consider the induced mappings E/F—+ E/F. Let E= be the decomposition of F into
c°F :F—÷F and
be the canonical injection and generalized eigenspaces of 'p. Let Q:E—*E/Fbe the canonical projection. a) Show that the decomposition of F into generalized eigenspaces is given by
F=
where
F
fl
§ 3. Cyclic spaces
397
Show that the decomposition of E/F into generalized eigenspaces of is given by E/F = (ElF)1 where (E/F), = (E1). b)
Conclude that
determines a linear isomorphism (E/F),.
E1/F,
c) If denote the projection operators in E, F and E/F associated with the decompositions, prove that the diagrams
and F
are commutative, and that
are the unique linear mappings for
which this is the case. Conclude that if g are the polynomials of problem 7,then F g1((p) and = —
10. Suppose that the minimum polynomial ji of çü is completely reducible. a) By considering first the case p = (t —
dimE.
degp
b) With the aid of a) prove that of p. § 3.
prove that
x the characteristic polynomial
Cyclic spaces
13.9. The map 0a• Fix a vector aEE and consider the linear map ar,:
given by a
fer[t].
Let K1, denote the kernel of ar,. It follows from the definition that feK,, Ka, where p is if and only if aEK(J). K,, is an ideal in T[t]. Clearly the minimum polynomial of p. Proposition 1: There exists a rector a E E such that K,, = I,, Proof: Consider first the case that p is of the form
p=
k 1.
/ irreducible.
Chapter XIII. Theory of a linear transformation
398
Then there is a vector tIE E such that (13.27)
0.
Suppose now that hEKa and let g be the greatest common divisor of h and p. Since aeK(fl, Corollary Ito Proposition I, sec. 13.2 yields
ueK(g).
(13.28)
Since g, p it follows that g =1' where / k. Hence 1 (p) a = () and relation (13.27) implies that /=k. Thus K(g)=K0. Now it follows from (13.28) that
In the general case decompose p in the form =
J irreducible
...
and let (i= 1 i) be be the corresponding decomposition of E. Let is the induced transformation. Then the minimum polynomial of given by (cf. sec. 13.4) = r) (1= 1
Thus, by the first part of the proof, there are vectors K,, =
I,,
(i =
I
such that
. . .
Now set
(1(l1 Assume that jEK11. Then f(p) a=0 whence
= 0.
Since f(p)
it follows that
f(p) whence fe Ka (i =
1
=0
(1 =
I
... r). Thus •f C
and so tel0. This shows that
•'' fl
=
III
whence K(,=I,L.
13.10. Cyclic spaces. The vector space E is called crc/ic (with respect
to the linear transformation p) if there exists a vector acE such that is surjective. Every such vector is called a geiiclaloI of E. the map In fact, let I cK and let XEE. Then, Ifa is a generator of E. then
§ 3. Cyclic spaces
399
for some geT[t], x=g(p)a. It follows that
=f(p)g(p)a = whence
f(p)a = g(p)O = 0
and so tel1.
Proposition II: If E is cyclic, then
degp = dimE.
Proof Let a be a generator of E. Then, since Ka =
0a induces an
isomorphism E.
It follows that (cf. Proposition I, sec. 12.11)
= degp.
dim
Proposition III: Let a be a generator of E and let deg p = m. Then the vectors
a,p(a)
form a basis of E.
Proof. Let / he the largest integer such that the vectors a, p(a)
(13.29)
are linearly independent. Then these vectors generate E. In fact, every
vector xeE can be written in the form x=f(p)a where I =
is a
polynomial. It follows that i—i
k
X
=
(a) = V=
p' (a). 1)
1)
Thus the vectors (13.29) form a basis of E. Now Proposition II implies that I=ni. Proposition IV: The space E is cyclic if and only if there exists a basis (v=O... n—i) of E such that =
(v = 0 ... ii —2).
Proof: If E is cyclic with generator a set a=a and apply Proposiu—I) is a basis satisfying tion III. Conversely. assume that
the conditions above. Let v= e I'[t] by
be an arbitrary vector and define V
Chapter XIII. Theory of a linear transformation
400
Then
=x as is easily checked and so E is cyclic. (v = 0 ii — I be a basis as in the Proposition CoioI/urr: Let above. Then the minimum polynomial of is given by )
= where the
are
—
determined by
Proof: It is easily checked that p(p)=O and so p is a multiple of the minimum polynomial of p. On the other hand. by Proposition II. the minimum polynomial of has degree ii and thus it must coincide with p. 13.11. Cyclic subspaces. A stable subspace is called a crc/ic siihspacc if it is cyclic with respect to the induced transformation. Every vector UEE determines a cyclic subspace. namely the subspace = Im Proposition V: There exists a cyclic subspace whose dimension is equal to degp. Proof: In view of Proposition I there is a vector u e F such that ker = Then induces an isomorphism E11.
It follows that
dimE, = diml'[t]
'R
= degp.
Throic;,, I: The degree of the minimum polynomial satisfies
degp dimE. Equality holds if and only if E is cyclic. Moreover, if F is any cyclic subspace of E. then
dimF degp.
Proof: Proposition V implies that deg p dimE. If E is cyclic. equality
holds (cf. Proposition III). Conversely. assume that degp=dimE. By Proposition V there exists a cyclic subspace with dim F=degp. It follows that F=E and so E is cyclic.
§ 3. Cyclic spaces and irreducible spaces
401
Finally, let Fc:E be any cyclic subspace and let v denote the minimum polynomial of the induced transformation. Then, as we have seen above, dim F = deg v. But v/p (cf. sec. 13.2) and so we obtain dim F deg p.
Corollary: Let F E be any cyclic subspace, and let v denote the minimum polynomial of the linear transformation induced in F by (p. Then
v=u if and only if
dimF = degp. Proof: From the theorem we have
dimF =
degv
while according to sec. 13.2 v divides p. Hence v=p if and only if deg v=deg ,i; i.e., if and only if
dimF = degp. 13.12. Decomposition of E into cyclic subspaces. Theorem II: There exists a decomposition of E into a direct sum of cyclic subspaces. Proof: The theorem is an immediate consequence (with the aid of an induction argument on the dimension of E) of the following lemma.
E such that
Lemma I. Let dimEa = degp.
Then there is a complementary stable subspace, Fc E, E = Ea
F.
Proof: Let Ea
denote the restriction of (p to Ea, and let (p*:E*
E*
the linear transformation in E* dual to (p. Then (cf. sec. 13.8) stable under (p*, and the induced linear transformation be
*
*
J_
*
(pa:E lEa *—E lEa 26
Greub. Linear Algebra
is
Chapter XIII. Theory of a linear transformations
402
is dual to
with respect to the induced scalar product between Ea and
The corollary to Theorem I, sec. 13.11 implies that the minimum polynomial Of 'Pa is again p. Hence (cf. sec. 13.8) the minimum polynomial
are dual, so that
of p is p. But Ea and
= dimEa = degp.
dim
is cyclic with respect to çü.
Thus Theorem I, sec. 13.11 implies that Now let
be the projection and choose u*eE* so that the element generates E*/E±. Then, by Proposition III, the vectors (p = 0... in — 1) form a basis of Hence the vectors
(p* V
(p = I ... in — I) are linearly independent.
Now consider the cyclic subspace Since I) it follows that On the other hand. Theorem I. sec. 13.11. implies that dim Hence dim
= (p = 0... is injective. Thus 0
Finally, since 7r* TE to
dim
+ dim
=
in —
= m. I
)
it follows that the restriction of But
in + (ii — in)
=
ii
(n =dim E)
and thus we have the direct decomposition E* =
of E* into stable subspaces. Taking orthogonal complements we obtain the direct decomposition
E=
E into stable subspaces which completes the proof.
§ 4.
Irreducible spaces
13.13. Definition. A vector space E is called
!I1(Iecoin/)os(Ib/e or
ii'i'cthicible with respect to a linear transformation 'P. if it can not be expressed as a direct sum of two proper stable subspaces. A stable
§4. Irreducible spaces
403
subspace F E is called irreducible if it is irreducible with respect to the linear transformation induced by (p. Proposition I: E is the direct sum of iireducihle subspaces. Proof: Let
E into stable subspaces such that s is maximized
(this is clearly possible. since for all decompositions we have sdim E). Then the spaces are irreducible. In fact, assume that for some i, =
is a decomposition of
dim
Ft"
> 0, dim
>0
into stable subspaces. Then
E=
F Fe" j*i is a decomposition of E into (s+ I) stable subspaces, which contradicts the maximality of s. An irreducible space E is always cyclic. In fact, by Theorem II, sec. 13.12, E is the direct sum of cyclic subspaces.
E=
If E is indecomposable. it follows that 1=and so E is cyclic. On the other hand, a cyclic space is not necessarily indecomposable. In fact, let E be a 2-dimensional vector space with basis a. h and define p: E—+E by setting (pa=h and (ph =0. Then (p is cyclic (cf. Proposition IV, sec. 13.10). On the other hand, E is the direct sum of the stable subspaces generated by a + b and a — b. 1
Theorem I: E is irreducible if and only if i)
p=fk,
f irreducible;
ii) E is cyclic.
Proof. Suppose E is irreducible. Then, by the remark above. E is E into the cyclic. Moreover, if generalized eigenspaces (cf. sec. 13.4), it follows that r= 1 and so
(f irreducible). Conversely, suppose that (i) and (ii) hold. Let E=
E1
E2
Chapter XIII. Theory of a linear transformations
404
be any decomposition of E into stable subspaces. Denote by and c02 the linear transformations induced in F1 and E, by ço, and let and 112 be the minimum polynomials of and q2. Then sec. 13.2 implies that Hence, we obtain from (i) that p and 111 =
k2 k.
=
(13.33)
Without loss of generality we may assume that k1 k2. Then xEE1
or
XEE7
and so
0.
It follows that
whence k1 k. In view of (13.33) we obtain k1 =k
i.e., It
= Iti•
Now Theorem I, sec. 13.10 yields that
dimE1 degp.
(13.34)
On the other hand, since E is cyclic, the same Theorem implies that
dimE = degjt.
(13.35)
Relations (13.34) and (13.35) give that
dimE = dimE1. Thus E= F1 and F2 = 0. It follows that E is irreducible.
Corollary I. Any decomposition of F into a direct sum of irreducible subspaces is simultaneously a decomposition into cyclic subspaces.
Then a stable subspace Corollary II. Suppose that of F is cyclic if and only if it is irreducible. 13.14. The Jordan canonical matrix. Suppose that F irreducible with respect to p. Then it follows from Theorem I sec. 13.13 that F is cyclic and that the minimum polynomial of ço has the form
p_fk
kl
(13.36)
where f is an irreducible polynomial. Let e be a generator of E and
§4. Irreducible spaces
405
consider the vectors (13.37)
it will be shown that these form a basis of E. Since
dimE =
deg,u
= pk
it is sufficient to show that the vectors (13.37) generate E. Let Fc:E be the subspace generated by the vectors (13.37). Writing
we obtain that
i==1,...,k j=1,...,p—1 e
= f(çp)1_i ço"e =
p—i —
i1,...,k1 These
equations show that the subspace F is stable under 'p. Moreover,
e = a11 e F. On the other hand, since E is cyclic, E is the smallest stable
subspace containing e. It follows that F—E. Now consider the basis ...
a11,
...;akl ...
of E. The matrix of 'p relative to this basis has the form 0 1
(13.38)
0
406
Chapter XIII. Theory of a linear transformation
are all equal, and given by
where the submatrices
01 0
0 1
0•.
j=1,...,k. 0
1
The matrix (13.38) is called a Jordan canonical matrix of the irreducible transformation p. Next let ço be an arbitrary linear transformation. In view of sec. 13.12 there exists a decomposition of E into irreducible subspaces. Choose a basis in every subspace relative to which the induced transformation has the Jordan canonical form. Combining these bases we obtain a basis of E. In this basis the matrix of p consists of submatrices of the form (13.38)
following each other along the main diagonal. This matrix is called a Jordan canonical matrix of çø. 13.15. Completely reducible minimum polynomials. Suppose now that E is an irreducible space and that the minimum polynomial is completely
reducible (p= 1); i.e. that
p = (t
—
2)k.
It follows that the
are (1 x 1)-matrices given by Jordan canonical matrix of ço is given by
21
Hence the
0
2 1. (13.39)
..1 0
In particular, if E is a complex vector space (or more generally a vector space over an algebraically closed field) which is irreducible with respect to 'p, then the Jordan canonical matrix of çü has the form (13.39). 13.16. Real vector spaces. Next, let E be a real vector space which is irreducible with respect to ço. Then the polynomialf in (13.36) has one of the two forms f=t—2 2ER or
§4. Irreducible spaces
407
In the first case the Jordan canonical matrix of p has the form (13.39). In the second case the are 2 x 2-matrices given by 0
Hence
1
the Jordan canonical matrix of 0
has the form
1
-fl
1
- -
1
H 13.17. The number of irreducible subspaces. It is clear from the construction in sec. 13.12 that a vector space can be decomposed in several ways into irreducible subspaces. However, the number of irreducible subspaces of any given dimension is uniquely determined by as will be shown in this section. Consider first the case that the minimum polynomial of 'p has the form
p=f
degf=p
wheref is irreducible. Assume that a decomposition of E into irreducible subspaces is given. The dimension of every such subspace is of the form pK(l as follows from sec. 13.12. Denote by FK the direct sum of the irreducible subspaces of dimension pic and denote by NK the number of the subspaces Then we have
E=
FK.
Comparing the dimensions in (13.40) we obtain the equation = dim E=n.
(13.40)
Chapter XIII. Theory of a linear transformation
405
Now consider the transformation
it follows from (13.40) that
Since the subspaces FK are stable under
(13.41)
By the definition of
we have FK
dim
=
whence
=
Since the dimension of each follows that
decreases by p under k/I (cf. sec. 1 3. 14) it
dimk//Fh =
—
(13.42)
1)NK.
Equations (13.41) and (13.42) yield (r(k/I) = rank t/i)
K=2
Repeating the above argument we obtain the equations j=1,...,k.
Kj+
1
Replacing] byj+ I and]—! respectively we find that
(K—j—l)NK=p (13.43)
and —j + 1)NK =
=
—j)NK +
pINK.
(13.44)
Adding (13.43) and (13.44) we obtain
+
= 2p
(K
=2p Kj+
1
—j)NK +
+
+
§4. Irreducible spaces
409
whence
=
+
j
—
= 1,...,k.
This equation shows that the numbers N1 are uniquely determined by the
(j= l,...,k).
ranks of the transformations In particular.
1.(i/1k_l)>
Nk=
I
if]> k. Thus the degree of is pk where k is the largest integer k and such that Nk>O. In the general case consider the decomposition of E into the generalized eigenspaces E= a a direct sum of irreducible subspaces. Then E every irreducible subspace F, is contained in some E (cf. Theorem I, sec. 13.13). Hence the decomposition determines a decomposition of
each E, into irreducible subspaces. Moreover, it is clear that these subspaces are irreducible with respect to the induced transformation E. Hence the number of irreducible subspaces in E1 of a given dimension is determined by and thus by p. It follows that the number of spaces F, of a given dimension depends only on (p. 13.18. Conjugate linear transformations. Let E and F be u-dimensional vector spaces. Two linear transformations p: E—*E and i/i: F—÷F are called conjugate if there is a linear isomorphism such that (pt: E1
= Proposition II: Two linear transformations (p and u/i
are
conjugate if
and only if they satisfy the following conditions: (i)
The minimum polynomials of (p and
factors /1
u/i
have
the same prime
Ir
(ii)
Proof: If (p and u/i
1=1 ... are
I.
conjugate the conditions above are clearly
satisfied.
To prove the converse we distinguish two cases.
Chapter XIII. Theory of a linear transformation
410
Casc I: The minimum polynomials of p and /i are of the form Pq)
.1
=
and
f irreducible.
.11.
Decompose E and F into the irreducible subspaces I
i=-1 j=1
The numbers =
.\ (/1)
F=>i=1 j=1
and N1(l/J) are given by
-
+
p=degf
and
= (cf.
-
+
sec. 13.17). Thus (ii) implies that
il. Since
i>k and
I>!
it follows that k=l. In view of Theorem I, sec. 13.13, the spaces El and F/ are cyclic. Thus Lemma I below yields an isomorphism such that El
= These
isomorphisms determine an isomorphism 1/I
Case II: and ized eigenspaces
=
0
such that
0
are arbitrary. Decompose E and F into the general-
and set
(i= I ... i). Then
restricts to a linear isomorphism in the subspace
§4. Irreducible spaces
Moreover, for
411
is zero in E,. Thus dimE —
=
= 1 ...
dimF —
=
=
Similarly.
... r.
i
Now (ii) implies that
1=1 ...r. Let and denote the respectively. The minimum polynomials of
restrictions of 'p and are of the form
and
i=l ... r and Since
+ dimE —
and
= r(f
j
+ dim F — dim
1
—
it follows from (ii) that l
(j(y) = Thus the pair
satisfies
...
the hypotheses of the theorem and so we
are reduced to case I.
Lemma I: Let p: E E and /i: F —* F (dim E = dim F = ii) be linear transformations having the same minimum polynomial. Then, if E and F are cyclic, 'p and are conjugate. Proof: Choose generators a and b of E and F. Then the vectors 01='pi(a) and
ii— I) forma basis ofEand F respectively.
Now set a,='p(a) and h= t/i'(h). Then, for some and
a,, = 1=
h=
()
h1. 1=
()
Moreover, the minimum polynomials of 'p and =
t'
and
—
1=)
=
t'
are given by —
1=0
Chapter XIII. Theory of' a linear transformations
412
(cf. the corollary of Proposition IV, sec. 1 3.10). Since 1 = 0 ...
=
it follows that
I).
n
by setting
define an isomorphism =
=
/ = 0 ...
h1
n —
Then we have
= (a,) = These
= h,. j=()
j=()
equations imply that
j=0...ii— I. This shows that Anticipating the result of sec. 13.20 we also have Corollary I: Two linear transformations are conjugate if and only if they have the same minimum polynomial and satisfy condition (ii) of Proposition II.
Corollary II: Two linear transformations are conjugate if and only if they have the same characteristic polynomial (cf. sec. 4.19) and satisfy (ii).
Proof: Let J be the common characteristic polynomial of p and /i. Write
=
. .
=
...
.
(f irreducible).
and = .1111
Set
the generalized eigenspaces for direct decompositions
and
fr
13(1=1 ... r) denote 1= 1 ... r; let and t/i respectively. Then we have the
§4. Irreducible spaces
it follows that
Since
413
(1=1 ... r). Similarly.
whence
(i=l...r). Thus we may assume that (g irreducible). Then (kni). ,=g'
I
is
of the form f=g" and
the corollary
follows immediately from the proposition.
Corollaiv III: Every linear transformation is conjugate to its dual.
Problems 1.
(i)
Let P: A(E; E) be a non-zero algebra endomorphism. Show that for any projection operator it.
r(P(m))=i(rr), (ii) Use (i) to prove that 2. Show that every non-zero endomorphism of the algebra A (E; E) is an inner automorphism. Hint: Use Problem I and Proposition II. sec. 13.18. 3. Construct a decomposition of into irreducible subspaces with respect to the linear transformations of problem 7, § 1. Hence obtain the Jordan canonical matrices. 4. Which of the linear transformations of problem 7, § make into a cyclic space? For each such transformation find a generator. For which of these transformations is irreducible? 5. Let E1 cE be a stable subspace under p and consider the induced mappings p1: E1 E1 and E/E1 E/E1. Let p, 1u1, ji be the corresponding minimum polynomials. a) Prove that E is cyclic if and only if 1
i) E1 is cyclic ii) E/E1 is cyclic
iii) p=u1ü. In particular conclude that every subspace of a cyclic space is again cyclic.
b) Construct examples in which conditions i), ii), iii) respectively fail, while the remaining two continue to hold. Hint: Use problem 5, § 1. 6. Let j=1 are
E into subspaces and suppose linear transformations with minimum polynomials
Chapter XIII. Theory of a linear transformations
414
Define a linear transformation
of E by
a) Prove that E is cyclic if and only if each relatively prime.
is cyclic and the
are
b) Conclude that if E is cyclic, then each F3 is a sum of generalized eigenspaces for çø.
c) Prove that if E is cyclic and
a generates Eifand only if generates 1. ..s). 7. Suppose Fc: E is stable under p and let (pF:F-+F (minimum polynomial jib.) and (minimum polynomial ji) be the induced transformations. Show that E is irreducible if and only if i) E/F is irreducible. ii) F is irreducible. iii) wheref is an irreducible polynomial. 8. Suppose that Eis irreducible with respect to çø. Letf (f irreducible)
be the minimum polynomial of ço. a) Prove that the k subspaces K(f) stable subspaces of E b) Conclude that lmf(co)K =
K(fk_l) are the only non-trivial
K
k.
9. Find necessary and sufficient conditions that E have no nontrivial stable subspaces. 10. Let be a differential operator in E.
a) Show that in any decomposition of E into irreducible subspaces, each subspace has dimension I or 2. b) Let N1 be the number of/-dimensional irreducible subspaces in the above decomposition (j= 1, 2). Show that
N1-f-2N,=dimE and N1=dimH(E). c) Using part b) prove that two differential operators in E are conjugate if and only if the corresponding homology spaces coincide. Hint: Use Proposition II, sec. 13.18.
11. Show that two linear transformations of a 3-dimensional vector space are conjugate if and only if they have the same minimum polynomial.
§ 5. Applications of cyclic spaces
12. Let be a linear transformation. Show that there exists a (not necessarily unique) multiplication in E such that i) E is an associative commutative algebra ii) E contains a subalgebra A isomorphic to F((p)
iii) If &F(p)4A is the isomorphism, then
13. Let F be irreducible (and hence cyclic) with respect to p. Show that the set S of generators of the cyclic space E is not a subspace. Construct a subspace F such that S is in I — correspondence with the non-zero elements of E/F. 1
14. Let
be
a linear transformation of a real vector space having can not be written in the
distinct eigenvalues, all negative. Show that form § 5. Applications
of cyclic spaces
In this paragraph we shall apply the theory developed in the preceding paragraph to obtain three important, independent theorems. 13.19. Generalized eigenspaces. Direct sums of the generalized eigenspaces of 'p are characterized by the following
Theorem I: Let (13.45)
be any decomposition of E into a direct sum of stable subspaces. Then the following three conditions are equivalent: (i) Each is a direct sum of some of the generalized eigenspaces E, of 'p. (ii) The projection operators
in E associated with the decomposition
(13.45) are polynomials in 'p. (iii) Every stable subspace UcE satisfies U=
U n
Proof: Suppose that (i) holds. Then the projection operators are sums of the projection operators associated with the decomposition of F into generalized eigenspaces, and so it follows from sec. 13.5 that they are polynomials in 'p. Thus (i) implies (ii).
416
Chapter XIII. Theory of a linear transformation
Now suppose that (ii) holds, and let Uc E be any stable subspace. Then i, we have
since
Uc>o1U. Since Uis stable underQ1 it follows that Q1Uc: Un U c:
U
fl
whence
F1.
The inclusion in the other direction is obvious. Thus (ii) implies (iii). Finally, suppose that (iii) holds. To show that (i) must also hold we first prove
Lemma I: Suppose that (iii) holds, and let
E into generalized eigenspaces. Then to every i, such that (1= 1, ...,r) there corresponds precisely one integer], (1 E1 n
Proof of lemma I: Suppose first that s). Then from (iii) we obtain every ./(1
E. =
E, n
=0 for a fixed I and for
fl
=
0
j= 1
which is clearly false (cf. sec. 13.4). Hence there is at least one] such that
To prove that there is at most one] (for any fixed 1) such that E1 n we shall assume that for some f,]1,]2, fl
+0
and E1 n
0,
+0
and derive a contradiction. Without loss of generality we may assume that
E1flF1+0 and E1flF2+0. Choose two non-zero vectors
y1eE1flF1 and y2EE1flF2. Then since Yi' y2eE1 we have that =
§ 5. Applications of cyclic spaces
417
J irreducible, is the decomposition of p. Let and '2 the least integers such that =0 and =0. We may assume that = '2 we simply replace Yt by the vector where ji
. .
be
Then
f
= 0 and
11
+ 0 for k <
12.
Now set Y = Yi + Y2 and let V be the cyclic subspace generated by y. Clearly Yc F1 SF2, and so
in view of (iii) we obtain Y= Y
Y n F2.
n F1
It will now be shown that Y n F1 = 0. Let ue Yn F1 be any vector. Since ue Ywe have that
u =f(ço)y
+f(co)y2
for some polynomialf. Since ueF1, it follows that
f(p)y2=u—f(p)y1eF1 n F2=0 and hence d(cp)y2=0 where d is the greatest common divisor of f and Thus we obtain d=ft where
f1k
But dif; and so u=f(p)y=f(q,)y1+f(q,)y2=0. This proves that Yn F1 =0. A similar argument shows that Yn F2 = so that V = Y n F1 Y n F2 = 0. This is the desired contradiction, and it
completes the
0
proof of the lemma.
We now revert to the proof of the theorem. Recall that we assume that (iii) holds, and are required to prove (i). In view of the above lemma we can define a set mapping such that fl FC(I)
0
i = 1,...,r and
fl
Then (iii) yields that E1
= 27
Greub. Linear Algebra
E1 n
=
E1 n
=
0
Chapter XIII. Theory of a linear transforniation
Finally, the relation
E=
E
i1
t
j
i
implies that and
iet'(j)
j
E.
Hence t is a surjection, and for every integer] (1
of some of the
is a direct sum
and so (i) is proved. Thus (iii) implies (i), and the
proof of the theorem is complete.
13.20. Cayley-Hamilton theorem. It is the purpose of this section to prove the Theorem II: (Cayley-Hamilton) Let x denote the characteristic polynomial of 'p. Then or, equivalently, 'p satisfies its own characteristic equation.
Before proceeding to the proof of this theorem we establish some elementary results. Suppose ).ET is any scalar, and let v denote the minimum polynomial
of 'p —ii. Assume further that v has degree in, and let v be given explicitly by
=
+
trn
(13.46)
Then o=
-
- 2i)m + rn—i
(prn+
= where
(13.47)
•
f ('p)
j=O
=
trn
+
It follows that p/f, and so in particular degp
degf = degv.
On the other hand, 'p =
('p
—
i) +
i
(13.48)
§ 5. Applications of cyclic spaces
419
and thus a similar argument shows that
degv degp. This, together with (13.48) implies that
degf =
degv
degp.
(13.49)
andf has leading coefficient 1, we obtain that
I =p. In particular, sincef=v(g), where g=t—)L, we have
= v(0) = /3g.
=
(13.50)
Lemma II: Suppose E is cyclic and dim E= m, and let x be the characteristic polynomial for 'p. Then x
=(—
Proof: Let )LEF be any scalar, and let v be the minimum polynomial for ço—2i. Since E is cyclic (with respect to 'p), Theorem I, sec. 13.1 1
implies that
dimE = m. Now we obtain from (13.49) that deg v=m, and so a second application of Theorem I, sec. 13.11 shows that E is cyclic with respect to 'p —2i. Let a be any generator of E (with respect to 'p —2i). Then
is a basis for E (cf. Proposition III, sec. 13.10). Now suppose that zi is a non-trivial determinant function for E. Then —
2i)a,...,('p —
= det(q
2
—
=A(('p_2,)a,...,('p_Ai)ma) =
—
21)ma,('p
—
a) (13.51)
2z)a,...,('p
—
2,)m_1
a).
On the other hand, if (13.46) gives the minimum polynomial of 'p —2:, we obtain that ('p —2i)ma
Chapter XIII. Theory ofa linear transformation
42()
and substitution in (13.51) yields the relation —
=
—
—
(—
= It follows that
—
2l)m1 a)
—
—
= (and in view of (13.50) we obtain that
=
REF.
(—
(13.52)
Finally, since (13.52) holds for every 2eF, we can conclude (cf. sec. 12.10) that
=
(—
Proof of Theorem II: According to Proposition V, sec. 13.11 there
exists a cyclic subspace Eac:E such that dimEa = By Lemma I, sec. 13.12, Ea has a complementary stable subspace F0, E
Ea
F.
Let Xa and XF be the characteristic polynomials of the linear transformations induced in Ea and in F by ço. Then X
XaXF
(cf. sec. 4.21).
On the other hand, the minimum polynomial of the linear transformation induced in Ea by p is as follows from the corollary to Theorem I, sec. 13.11. Now the lemma implies that 1a
±/1
Hence,
13.21.* The commutant of p. The commutant of ço, C(co), is the sub-
algebra of A(E; E) consisting of all the linear transformations that commute with p. Let f be any polynomial. Then K(f) is stable under every cli e C(co). In fact ifyEK(f) is any vector, then
and so clJYEK(f).
§ 5. Applications of cyclic spaces
421
Next suppose that e C(ço) is any linear transformation. Consider the decompositions of E into generalized eigenspaces of and of i/i,
(forp) and (fon/i)
and the corresponding projection operators in E, mappings and it follows that
Since the and are respectively polynomials in p and (cf. sec. 13.5)
ltjoQjQjoJtj Now define linear transformations
=
in E by o
Then we obtain that
= and hence the
are
o Qj
o
=
o
=
o
o
again projection operators in E.
Since
cEn
and
=
=
it follows that
E=
fl
cE
whence
and
E=
n
F,.
E a direct Proposition I: Let E= F1 sum of subspaces. Then the subspaces F; are stable under 'p if and only if the projection operators aj are contained in C(p). . . .
Proof: Since
flkera1
1*j
it follows that the are stable under 'p if the e C(ço). Conversely, if the are stable under 'p we have for each YEFJ that 'pyEF1, and hence =
coy
=
422
Chapter XIII. Theory of a linear transformation
while
1+).
c11(py=O=çocr1y
Thus the cr1 commute with (p. 13.22.* The bicommutant of p. The hicoiiimutant, C2((p), of (p is the
subalgebra of (cf. example I, see 5.2) consisting of all the linear transformations which commute with every linear transformation in C(p). Theorem Ill. C2(co) coincides with the linear transformations which are polynomials in (p.
Proof Clearly Conversely, suppose t/JEC2(q9)
is
any linear transformation and let (13.53)
be a decomposition of E into cyclic subspaces with respect to p. A decomposition (13.53) exists by Theorem II, sec. 13.12. Let {a,} be any fixed generators of the spaces F induced by p, and let be the minimum polynomial of (pt. Then (cf. sec. 13.2) so we can write
1=1
S.
In view of Proposition V and the corollary to Theorem I, sec. 13.11 we may (and do) assume that Now the Fare stable subspaces of E (under and so by Proposition I, sec. 13.21 the projection operators in Eassociated with (13.53) commute with (p. Hence they commute with cli as well, and so a second application of Proposition 1 shows that the F are stable under i/i. In particular, Since is cyclic with respect to (p we can write
i=1 Thus ifh(p)aEF, is an arbitrary vector in
we obtain
= h(p) k/Ia =
=
since (p and k/I commute. It follows that
= g(p)
I=
1
s
where k/Is denotes the restriction of 4' to F. In the following it will be
§ 5. Applications of cyclic spaces
423
shown that
thus proving the theorem. Consider now linear transformations
(2
I s) in E defined by
j4:i
X1XX = To show that x,
is
well-defined it is clearly sufficient to prove that
f((p)v1(p)a1=0
whenever
But iff(ço)a1=0, then pIf and so
divides
whence
= 0.
The relation =
=
shows that xL commutes with 'p, and hence with t/i. On
the other hand we
have that =
=
=
and =
=
=
whence
=0. This relation implies that
But
=p and so
— g1).
Since
p=v1p1, we obtain that —g1.
This last relation yields that for any vector
i=2,...,s. It follows that
I'=gl@p) which completes the proof.
Problems 1. Let k1
"fr
kr
Chapter XIII. Theory of a linear transformation
424
be the decomposition of the minimum polynomial of p, and let P21 be the
the number of
generalized eigenspaces. 1ff has degree p denote by irreducible subspaces of E1 of dimension p1(1
Set
= Show that the characteristic polynomial of p is given by
2. Prove that E is cyclic if and only if x
=±
IL.
3. Let (p be a linear transformation of E and assume that E=>F1 is a decomposition of E as a direct sum of stable subspaces. if each is a sum of generalized eigenspaces, prove that each is stable under every is stable under every k/JEC(p) Conversely, assume that each and prove that each is a sum of generalized eigenspaces of (p. 4. a) Show that the only projection operators in C(ço) are i and 0 if and only if E is irreducible with respect to ço.
b) Show that the set of projection operators in C(p)
is
a subset of
C2(p) if and only if E is cyclic with respect to (p. 5.
a) Define C3(p) to be the set of all linear transformations in E
commuting with every transformation in C2(ço). Prove that C3((p) = C(ço).
Prove that C2(p) = C(p) if and only if E is cyclic. 6. Let E be cyclic with respect to p and S be the set of generators of P2. Let G be the set of linear automorphisms in C(p). a) Prove that G is a group. b) Prove that for each (/EG, S is stable under and the restriction of ,/' to S is a bijection. Show that has no fixed points if k/i i. c) Let aES be a fixed generator and let a, lEG be arbitrary. Prove that b)
a=r if and only if
aa =
hence in particular, a=r if and only if d) Prove that G acts transitively on S; i.e. for each a, bES there is a
and
k/lEG such that /ia=b.
e) Conclude that if aES is a generator, then the mapping given by
is a hijection.
§6. Nilpotent and seniisimple transformations
425
7. Let be a differential operator in E. Consider the set I of transformations k/JEC(() such that k/JZ(E)c:B(E) (cf. sec. 6.7).
Show that I is an ideal in
and establish an algebra isomorphism
A(H(E); H(E)). § 6.
Nilpotent and semisimple transformations
13.23. Nilpotent transformations. A linear transformation,
is called
ni/potent if (pk=0 for some integer k or equivalently, if its minimal polynomial has the form =
tm.
The exponent ni is called the degree of ço. It follows from sec. 13.7 that p is nilpotent if and only if the Fitting null component is the entire space.
It is clear that the restriction of a nilpotent transformation to a stable subspace is again nilpotent. are two commuting nilpotent transforSuppose now that and mations. Then the transformations çü + i/i and i/i ço are again nilpotent. In fact, jfk and / denote the degrees of 'p and i/i, then ('p +
(k+ 1)
=
=
0
j=k+1
j=O
and
which proves that 'p +
(k+l)
+ =
=
0
'p are nilpotent.
and
Assume that E is irreducible with respect to the nilpotent transformation 'p. Then it follows from sec. 13.14 that the Jordan canonical matrix of has the form 0
(13.54) 0
Hence, the Jordan canonical matrix of any nilpotent transformation consists of matrices of the form (13.54) following each other along the main diagonal.
Chapter XIII. Theory of a linear transformation
426
Suppose now that decomposition
Letf be the polynomial
is
any linear transformation, and that p has the =
f
J=
Then if/i is any polynomial, h(co) is nilpotent if and only at once from sec. 12.12.
as follows
Is 13.24. Semisimple transformations. A linear transformation called semisimple if every stable subspace E1 c:E has a complementary stable subspace. Example I: Let E be a Euclidean space and be a rotation of E. Since the orthogonal complement of every stable subspace is stable (cf. sec. 8.19) it follows that is semisimple. Example II: In a Euclidean space every selfadjoint and every skew transformation is semisimple, as follows from a similar argument. Example III: Let E be a unitary space. Then every unitary and every selfadjoint transformation is semisimple. Let ço be a semisimple transformation and suppose that E1 is a stable subspace. Then the restriction go1 of to E1 is semisimple. In fact, suppose F1 is stable under goi. Then F1 is stable under go, and hence there exists a complementary stable subspace, F2, in E,
E= Intersection with E1 yields E1 = Since F2 n
E1
F1
F1
F2. (F2 fl E1).
is (clearly) stable under go1 it follows that go1 is semisimple.
Proposition I: Suppose go is semisimple, and let f be any polynomial. Thenf(go) is nilpotent if and only iff(go)=O. Proof. The if part is trivial. Suppose now thatf(go) is nilpotent of degree k. Then is stable under go and so we can write E = where F is stable under go, and hence under f(go). On the other hand, it is
clear thatf(go)FcK(f"_ 1) whence
F=O i.e.,
§ 6. Nilpotent and semisimple transformations
427
It follows that E=K(f') where l=max (k— 1,1). Since k is the degree of nilpotency off(p) we have
l>k
whence k=l== 1. Hencef(p)=O. is simultaneously nilpotent and semisimple, then Corollary I: If (p = 0.
The major result on semisimple transformations obtained in this section is the following criterion:
Theorem I: A linear transformation is semisimple if and only if its minimum polynomial is the product of relatively prime irreducible polynomials (or equivalently, if the polynomials p and ji' are relatively prime).
Remark: Theorem I shows that a linear transformation p is semisimple if and only if it is a semisimple element of the algebra F(p) (cf. sec. 12.17).
Proof: Suppose (p
=
j1ki
is
...
semisimple. Consider the decompositions
1 irreducible and relatively prime
of the minimum polynomial, and set
ffifr
Then
f(p) is nilpotent, and hence by Proposition I of this section,
f(p)=O. It follows that
by definition, we have
This proves the only if part of the theorem.
To prove the second part of the theorem we consider first the special case that the minimum polynomial, ji, of ip is irreducible. To show that (p is semisimple consider the subalgebra, T(p), of A (E; E) generated by (p and i. Since p is irreducible, F(p) is a field (cf. sec. 12.14). E(p) contains F and hence it is an extension field of F, and E may be considered as a vector space over F(p) (cf. § 3, Chapt. V). Since a subspace of the F-vector space E is stable under p if and only if it is stable under every transforma-
tion of F(p), it follows that the stable subspaces of E are precisely the F(p)-subspaces of the space E. Since every subspace of a vector space has a complementary subspace it follows that 'p is semisimple.
Chapter XIII. Theory of a linear transformation
Now consider the general case
P=
fi
•
.
. fr
ft irreducible and relatively prime.
Then we have the decomposition
E into generalized eigenspaces. Since the minimum polynomial of the is preciselyf (cf. sec. 13.4) it follows induced transformation from the above result that is semisim pIe. Now let E be a stable subspace. Then we have, in view of sec. 13.6,
F F is a stable subspace of complementary subspace
=
and hence there exists a stable
(F n E1)
These equations yield =
H=
H is a stable subspace of E it follows that p is semisimple. Corollary I. Let p be any linear transformation and assume that
E=(F E into stable subspaces such that the induced are semisimple. Then p is semisimple.
transformations be the minimum polynomial of the induced transforProof: Let Since is semisimple each p is a product of relatively mation prime irreducible polynomials. Hence, the least common multiple, f, of the is again a product of such polynomials. Butf(p) annihilates E and hence the minimum polynomial, p, of 'p dividesf It follows that p is a product of relatively prime irreducible polynomials. Now Theorem I implies that 'p is semisimple. Proposition H: Let A c F be a subfield, and assume that E, considered as a A-vector space has finite dimension. Then every (F-linear) transfor-
mation, çü, of E which is semisimple as a A-linear transformation is semisimple considered as F-linear transformation.
§ 6. Nilpotent
and semisimple transformations
429
Proof Let p4 be the minimum polynomial of p considered as a zi-linear are relatively
transformation. It follows from Theorem I that and prime. Hence there are polynomials g,rE4[t] such that
(13.55)
On the other hand, every polynomial over A[t] may be considered as a polynomial in F[t]. Since ((p) 0 we have JLr
denotes the minimum polynomial of the (F-linear) transformation p. Hence we may write = p1 h some /1 E F [t] and so = (13.56) + Pr/I'. where
Combining (13.55) and (13.56) we obtain
q/zp1+h'rp1+hrp-= 1 whence
p- = I
(q Ii + h 'r) Pr + (Ii
are relatively prime. This relation shows that the polynomials and Now Theorem I implies that the r-linear transformation ço is semisimple.
Theorem II: Every linear transformation p can be written in the form
where c°s is semisimple and p and
are given by
and
k=max(k1 Moreover, if N
any decomposition of p into a semisimple and nilpotent transformation such that then is
and Proof': For the existence apply Theorem I and Theorem IV. sec. 12.16
with A=F(p).
430
Chapter XIII. Theory of a linear transformation
To prove the uniqueness let
be
any decomposition of (p
into a semisimple and a nilpotent transformation such that with Then the subalgebra of A(E; E) generated by i.
commutes is and
commutative and contains (p. Now apply the uniqueness part of Theorem IV. sec. 12.16.
13.25. The Jordan normal form of a semisimple transformation. Suppose that E is irreducible with respect to a semisimple transformation φ. Then it follows from sec. 13.13 and Theorem I, sec. 13.24, that the minimum polynomial of φ has the form

μ = f,

where f is irreducible. Hence, writing f = t^p + α_{p−1} t^{p−1} + ⋯ + α_0, the Jordan canonical matrix of φ has the form (cf. sec. 13.15)

( 0  0  ⋯  0  −α_0     )
( 1  0  ⋯  0  −α_1     )
( 0  1  ⋯  0  −α_2     )   (13.57)
( ⋮  ⋮      ⋮  ⋮       )
( 0  0  ⋯  1  −α_{p−1} ),   p = deg f.

It follows that the Jordan canonical matrix of an arbitrary semisimple transformation consists of submatrices of the form (13.57) following each other along the main diagonal.

Now consider the special case that E is irreducible with respect to a semisimple transformation whose minimum polynomial is completely reducible. Then p = 1, and hence E has dimension 1. It follows that if φ is a semisimple transformation with completely reducible minimum polynomial, then E is the direct sum of stable subspaces of dimension 1; i.e., E has a basis of eigenvectors. The matrix of φ with respect to this basis is of the form
( λ_1           )
(      ⋱        )   (13.58)
(           λ_n )
where the λ_i are the (not necessarily distinct) eigenvalues of φ. A linear transformation with a matrix of the form (13.58) is called diagonalizable. Thus semisimple linear transformations with completely reducible minimum polynomial are diagonalizable.
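Both conditions are visible in a quick check (an illustration of ours using sympy's built-in test):

```python
from sympy import Matrix

A = Matrix([[0, -1], [1, 0]])                  # mu = t**2 + 1
print(A.is_diagonalizable(reals_only=True))    # False: mu irreducible over R
print(A.is_diagonalizable())                   # True:  mu splits over C
B = Matrix([[1, 1], [0, 1]])                   # mu = (t - 1)**2
print(B.is_diagonalizable())                   # False: B is not semisimple
```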
Finally, let φ be a semisimple transformation of a real vector space E. Then a similar argument shows that E is the direct sum of irreducible subspaces of dimension 1 or 2.

13.26.* The commutant of a semisimple transformation.
Theorem III: The commutant C(φ) of a semisimple transformation is a direct sum of ideals (in the algebra C(φ)) each of which is isomorphic to the full algebra of linear transformations of a vector space over an extension field of Γ.

Proof: Let
E = E_1 ⊕ ⋯ ⊕ E_r   (13.59)

be the decomposition of E into the generalized eigenspaces. It follows from sec. 13.21 that the eigenspaces E_i are stable under every transformation ψ ∈ C(φ). Now let C_j be the subspace consisting of all transformations ψ ∈ C(φ) such that

ψ E_k = 0,  k ≠ j.

Since E_j is stable under each element of C(φ) it follows that C_j is an ideal in the algebra C(φ). As an immediate consequence of the definition, we have

C_j ∩ Σ_{k≠j} C_k = 0,  j = 1, …, r.   (13.60)
Now let ψ ∈ C(φ) be arbitrary and consider the projection operators π_j associated with the decomposition (13.59). Then

ψ = Σ_j ψ_j,   (13.61)

where

ψ_j = ψ π_j.   (13.62)

It follows from (13.62) that ψ_j ∈ C_j. Hence formulae (13.60) and (13.61) imply that

C(φ) = C_1 ⊕ ⋯ ⊕ C_r.
It is clear that C_j ≅ C(φ_j), where φ_j is the restriction of φ to E_j and where the isomorphism is obtained by restricting a transformation ψ ∈ C_j to E_j. Now consider the transformations φ_j induced by φ. Since the minimum polynomial of φ_j is irreducible, Γ(φ_j) is an extension field of Γ, and we obtain from chap. V, § 3 that E_j is a vector space over Γ(φ_j) and that C(φ_j) is the full algebra of Γ(φ_j)-linear transformations of E_j. This completes the proof.
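The commutant can also be computed directly. A sketch (our illustration, not part of the text): C(φ) is the kernel of X ↦ AX − XA, and vectorizing gives vec(AX − XA) = (I ⊗ A − Aᵀ ⊗ I) vec(X). For the semisimple matrix below, Theorem III predicts dim C(φ) = 2² + 1² = 5.

```python
from sympy import Matrix, eye, diag

def kron(A, B):                      # Kronecker product, written out
    p, q = B.shape
    return Matrix(A.rows * p, A.cols * q,
                  lambda i, j: A[i // p, j // q] * B[i % p, j % q])

A = diag(1, 1, 2)                    # eigenvalue multiplicities 2 and 1
K = kron(eye(3), A) - kron(A.T, eye(3))
print(len(K.nullspace()))            # 5 = 2**2 + 1**2
```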
13.27. Semisimple sets of linear transformations. A set {φ_α} of linear transformations of E will be called semisimple if to every subspace F_1 ⊂ E which is stable under each φ_α there exists a complementary subspace F_2 which is stable under each φ_α.
Suppose now that {φ_α} is any set of linear transformations and let A be the subalgebra of A(E; E) generated by the φ_α. Then clearly, a subspace F ⊂ E is stable under each φ_α if and only if it is stable under each ψ ∈ A. In particular, the set {φ_α} is semisimple if and only if the algebra A is a semisimple set.

Theorem IV: Let {φ_α} be a set of commuting semisimple transformations. Then {φ_α} is a semisimple set.
Proof: We first consider the case of a finite set φ_1, …, φ_s of transformations and proceed by induction on s. If s = 1 the theorem is trivial. Suppose now it holds for s − 1, and assume for the moment that the minimum polynomial μ_1 of φ_1 is irreducible. Then E may be considered as a Γ(φ_1)-vector space. Since the φ_i (i = 2, …, s) commute with φ_1 they may be considered as Γ(φ_1)-linear transformations (cf. Chap. V, § 3). Moreover, Proposition II, sec. 13.24, implies that the φ_i, considered as Γ(φ_1)-linear transformations, are again semisimple. Now let F_1 ⊂ E be any subspace stable under the φ_i (i = 1, …, s). Then, since F_1 is stable under φ_1, it is a Γ(φ_1)-subspace of E. Hence, by the induction hypothesis, there exists a Γ(φ_1)-subspace F_2 of E which is stable under φ_2, …, φ_s and such that

E = F_1 ⊕ F_2.

Since F_2 is a Γ(φ_1)-subspace, it is also stable under φ_1, and so it is a stable subspace complementary to F_1.

Now let the minimum polynomial μ_1 of φ_1 be arbitrary. Since φ_1 is semisimple, we have

μ_1 = f_1 ⋯ f_r,  f_i irreducible and relatively prime.
Let

E = E_1 ⊕ ⋯ ⊕ E_r

be the decomposition of E into the generalized eigenspaces of φ_1. Now assume that F_1 ⊂ E is a subspace stable under each φ_i (i = 1, …, s). According to sec. 13.21 each E_j is stable under each φ_i. It follows that the subspaces F_1 ∩ E_j are also stable under every φ_i. Moreover the restrictions of the φ_i to each E_j are again semisimple (cf. sec. 13.24) and, in particular, the restriction of φ_1 to E_j has as minimum polynomial the irreducible polynomial f_j. Thus it follows from the case settled above that the restrictions of the φ_i to E_j form a semisimple set, and hence there exist subspaces P_j ⊂ E_j which are stable under each φ_i and which satisfy

E_j = (F_1 ∩ E_j) ⊕ P_j,  j = 1, …, r.

Setting F_2 = Σ_j P_j we have that F_2 is stable under each φ_i and that

E = F_1 ⊕ F_2.
This closes the induction, and completes the proof for the case that the φ_α form a finite set. If the set is infinite, consider the subalgebra A ⊂ A(E; E) generated by the φ_α. Then A is a commutative algebra, and hence every subset of A consists of commuting transformations. In view of the discussion at the beginning of this section it is sufficient to construct a semisimple system of generators for A. But A is finite dimensional and so has a finite system of generators (which may be chosen from among the φ_α). Hence the theorem is reduced to the case of a finite set.

Theorem IV has the following converse:
Theorem V: Suppose A ⊂ A(E; E) is a commutative semisimple set. Then every φ ∈ A is a semisimple transformation.

Proof: Let φ ∈ A be arbitrary and consider the decomposition

μ = f_1^{k_1} ⋯ f_r^{k_r}

of its minimum polynomial μ. Define a polynomial, g, by

g = f_1 ⋯ f_r.
Since the set A is commutative, the kernel K(g) of g(φ) is stable under every ψ ∈ A. Hence
there exists a subspace E_1 ⊂ E which is stable under every ψ ∈ A and such that

E = K(g) ⊕ E_1.   (13.63)

Now let

h = f_1^{k_1 − 1} ⋯ f_r^{k_r − 1}.

Then we have (since g h = μ)

h(φ) E ⊂ K(g).

On the other hand, since E_1 is stable under h(φ),

h(φ) E_1 ⊂ E_1,

whence

h(φ) E_1 ⊂ K(g) ∩ E_1 = 0.

It follows that E_1 ⊂ K(h). Now consider the polynomial

p = f_1^{m_1} ⋯ f_r^{m_r},  m_i = max(k_i − 1, 1).

Since m_i ≥ k_i − 1 it follows that

p(φ) x = 0,  x ∈ E_1,   (13.64)

and from m_i ≥ 1 we obtain

p(φ) x = 0,  x ∈ K(g).   (13.65)

In view of (13.63), (13.64) and (13.65) imply that p(φ) = 0, and so μ | p. Now it follows that

max(k_i − 1, 1) ≥ k_i,  i = 1, …, r,

whence

k_i = 1,  i = 1, …, r.

Hence φ is semisimple.

Corollary: If φ and ψ are two commuting semisimple transformations, then φ + ψ and φψ are again semisimple.
Proof: Consider the subalgebra A ⊂ A(E; E) generated by φ and ψ. Then Theorem IV implies that A is a semisimple set. Hence it follows from Theorem V that φ + ψ and φψ are semisimple.
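A small sanity check of the corollary (an illustration of ours, with two commuting diagonalizable matrices):

```python
from sympy import Matrix

A = Matrix([[1, 1], [1, 1]])         # eigenvalues 0, 2
B = Matrix([[0, 1], [1, 0]])         # eigenvalues 1, -1
assert A * B == B * A                # the two transformations commute
print((A + B).is_diagonalizable())   # True
print((A * B).is_diagonalizable())   # True
```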
Problems

1. Let φ be nilpotent, and let N_p be the number of subspaces of dimension p in a decomposition of E into irreducible subspaces. Prove that

dim ker φ = Σ_p N_p.
2. Let φ be nilpotent of degree k in a 6-dimensional vector space E. For each k (1 ≤ k ≤ 6) determine the possible ranks of φ, and show that k and r(φ) determine the numbers N_p (cf. problem 1) explicitly. Conclude that two nilpotent transformations φ and ψ of E are conjugate if and only if k(φ) = k(ψ) and r(φ) = r(ψ).

3. Suppose φ is nilpotent and let φ*: E* ← E* be the dual mapping. Assume that E is cyclic with respect to φ and that φ is of degree k. Let a be a generator of E. Prove that E* is cyclic with respect to φ* and that φ* is of degree k. Let a* ∈ E* be any vector. Show that ⟨a*, φ^{k−1} a⟩ ≠ 0 if and only if a* is a generator of E*.

4. Prove that a linear transformation φ with minimum polynomial μ is diagonalizable if and only if
i) μ is completely reducible,
ii) φ is semisimple.
Show that i) and ii) are equivalent to

μ = (t − λ_1) ⋯ (t − λ_r),

where the λ_i are distinct scalars.
5. a) Prove that two commuting diagonalizable transformations are simultaneously diagonalizable; i.e., there exists a basis of E with respect to which both matrices are diagonal.
b) Use a) to prove that if φ and ψ are commuting semisimple transformations of a complex space, then φ + ψ and φψ are again semisimple.

6. Let φ be a linear transformation of a complex space E. Let E = Σ_i E_i be the decomposition of E into generalized eigenspaces, and let π_i be the corresponding projection operators. Assume that the minimum polynomial of the induced transformation φ_i is (t − λ_i)^{k_i}. Prove that the semisimple part of φ is given by

φ_S = Σ_i λ_i π_i.

7. Let E be a complex vector space and φ be a linear transformation with eigenvalues λ_ν (ν = 1, …, n) (n = dim E), not necessarily distinct. Given an arbitrary polynomial f, prove directly that the linear transformation f(φ) has the eigenvalues f(λ_ν) (ν = 1, …, n).
8. Give an example of a semisimple set of linear transformations which contains transformations that are not semisimple.
9. Let A be an algebra of commuting linear transformations in a complex space E.
a) Construct a decomposition E = E_1 ⊕ ⋯ ⊕ E_r such that, for any φ ∈ A, each E_i is stable under φ and the minimum polynomial of the induced transformation is of the form (t − λ_i(φ))^{k_i}.
b) Show that the mapping A → ℂ given by φ ↦ λ_i(φ), φ ∈ A, is a linear function in A. Prove that λ_i preserves products and so is a homomorphism.
c) Show that the nilpotent transformations in A form an ideal which is precisely rad A (cf. chap. V, § 2). Consider the subspace T of L(A) generated by the λ_i. Prove that rad A = T^⊥.
d) Prove that the semisimple transformations in A form a subalgebra, A_S. Consider the linear functions in A_S obtained by restricting the λ_i to A_S. Show that they generate (linearly) the dual space L(A_S). Prove that the induced mapping T → L(A_S) is a linear isomorphism.
10. Assume that E is a complex vector space of dimension n. Prove that every commutative algebra of semisimple transformations is contained in an n-dimensional commutative algebra of semisimple transformations.

11. Calculate the semisimple and nilpotent parts of the linear transformations of problem 7, § 1.

§ 7. Applications to inner product spaces
In this concluding paragraph we shall apply our general decomposition theorems to inner product spaces. Decompositions of an inner product space into irreducible subspaces with respect to selfadjoint mappings, skew mappings and isometries have already been constructed in chap. VIII. Generalizing these results we shall now construct a decomposition with respect to a normal transformation. Since a complex linear space is fully reducible with respect to a normal endomorphism (cf. sec. 11.10) we can restrict ourselves to real inner product spaces.
13.28. Normal transformations. Let E be an inner product space and φ: E → E be a normal transformation (cf. sec. 8.5). It is clear that every polynomial in φ is again normal. Moreover, since the rank of φ^k (k = 2, 3, …) is equal to the rank of φ, it follows that φ is nilpotent only if φ = 0. Now consider the decomposition of the minimum polynomial into its prime factors,

μ = f_1^{k_1} ⋯ f_r^{k_r},   (13.66)

and the corresponding decomposition of E into the generalized eigenspaces,

E = E_1 ⊕ ⋯ ⊕ E_r.   (13.67)

Since the projection operators π_i associated with the decomposition (13.67) are polynomials in φ, they are normal. On the other hand, π_i² = π_i, and so it follows from sec. 8.11 that the π_i are selfadjoint. Now let x_i ∈ E_i and x_j ∈ E_j (i ≠ j) be arbitrary. Then

(x_i, x_j) = (π_i x_i, x_j) = (x_i, π_i x_j) = 0,  i ≠ j;

i.e., the decomposition (13.67) is orthogonal.

Now consider the induced transformations φ_i: E_i → E_i. It follows from sec. 8.5 that the φ_i are again normal and hence so are the transformations f_i(φ_i). On the other hand, f_i(φ_i) is nilpotent. It follows that f_i(φ_i) = 0 and hence all the exponents in (13.66) are equal to 1. Now Theorem I of sec. 13.24 implies that a normal transformation is semisimple.

Theorem I: Let E be an inner product space. Then a linear transformation φ is normal if and only if
i) the generalized eigenspaces E_i are mutually orthogonal;
ii) the restrictions φ_i of φ to E_i are homothetic (cf. sec. 8.19).
Proof: Let φ be a normal transformation. It has been shown already that the spaces E_i are mutually orthogonal. Now consider the minimum polynomial, f_i, of the induced transformation φ_i. Since f_i is irreducible over ℝ, it follows that either

f_i = t − λ_i   (13.68)

or

f_i = t² + α_i t + β_i,  α_i² − 4β_i < 0.   (13.69)

In the first case we have that φ_i = λ_i ι and so φ_i is homothetic. Now consider the case (13.69). Then (dropping the index i) φ satisfies the relation
φ² + α φ + β ι = 0,  α² − 4β < 0,   (13.70)

and hence the proof is reduced to showing that a normal transformation φ which satisfies (13.70) is homothetic. We prove first that φ − φ* is regular.
In fact, let K be the kernel of φ − φ*. If x ∈ K is an arbitrary vector, we have

(φ − φ*) φ x = φ (φ − φ*) x = 0,

whence φ x ∈ K. It follows that K is stable under φ and hence stable under φ*. Clearly the restriction of φ to K is selfadjoint. Hence, if K ≠ 0, φ has an eigenvector in K, which contradicts the hypothesis α² − 4β < 0. Consequently, K = 0. Equation (13.70) implies that

φ*² + α φ* + β ι = 0.   (13.71)
Multiplying (13.70) and (13.71) respectively by φ* and φ and subtracting, we find that

(φ φ* − β ι)(φ − φ*) = 0,

whence, in view of the regularity of φ − φ*,

φ φ* = β ι.   (13.72)

Define a transformation, τ, by

φ = √β τ

(notice that α² − 4β < 0 implies that β > 0). Then (13.72) yields τ τ* = ι and so τ is a rotation. This proves that every normal mapping satisfies i) and ii). The converse follows immediately from sec. 8.5.
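The 2 × 2 model of this last step can be checked symbolically (an illustration of ours): C = [[a, −b], [b, a]] is normal, satisfies a relation of the form (13.70), and is √β times a rotation.

```python
from sympy import symbols, Matrix, eye, zeros, expand

a, b = symbols('a b', real=True)
C = Matrix([[a, -b], [b, a]])              # a normal 2 x 2 block
alpha, beta = -2*a, a**2 + b**2            # alpha**2 - 4*beta = -4*b**2 < 0
Z = (C**2 + alpha*C + beta*eye(2)).applyfunc(expand)
assert Z == zeros(2)                       # C satisfies (13.70)
assert (C * C.T).applyfunc(expand) == beta * eye(2)   # C C^T = beta*I
```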
Corollary I: If φ is a normal transformation then the orthogonal complement of a stable subspace is stable.

Proof: Let F be a stable subspace. In view of sec. 13.6 we have

F = Σ_i (E_i ∩ F).

Since the restriction of φ to E_i is homothetic, it follows that the orthogonal complement, H_i, of E_i ∩ F in E_i is again stable under φ. Hence the space H = Σ_i H_i is stable under φ. On the other hand, the equations

E_i = (E_i ∩ F) ⊕ H_i

yield

E = F ⊕ H,  H = F^⊥.
Hence F^⊥ is a stable subspace.

As an immediate consequence of Theorem I we obtain

Theorem II: Let E be an inner product space. Then a linear transformation φ is normal if and only if E can be written as the sum of mutually orthogonal irreducible subspaces such that the restriction of φ to every subspace is homothetic.
13.29. Semisimple transformations of a real vector space. In sec. 13.28 it has been shown that every normal transformation of an inner product space is semisimple. Conversely, let φ: E → E be a semisimple transformation of a real vector space. Then a positive definite inner product can be introduced in E such that φ becomes a normal mapping. To prove this let

E = Σ_i F_i

be a decomposition of E into irreducible subspaces. In view of Theorem II it is sufficient to define a positive inner product in each F_i such that the restrictions, φ_i, of φ to F_i are homothetic; we then simply extend these inner products to an inner product in E such that the F_i are mutually orthogonal. Let F be one of the irreducible subspaces. Since φ is semisimple, F has dimension 1 or 2. If dim F = 1 we choose the inner product in F arbitrarily. If dim F = 2 there exists a basis a, b of F such that

φ a = b,  φ b = − β a − α b,  α² − 4β < 0

(cf. sec. 13.16). Define the inner product by

(a, a) = 1,  (a, b) = − α/2,  (b, b) = β.

Then we have for every vector x = ξ a + η b of F

(x, x) = ξ² − α ξ η + β η².

Since α² − 4β < 0 it follows that (x, x) ≥ 0, and equality holds only for x = 0. Moreover, since

(φa, φa) = β = β (a, a),  (φb, φb) = β² = β (b, b)  and  (φa, φb) = − α β/2 = β (a, b),

it follows that

|φ x|² = β |x|²,  x ∈ F.

This equation shows that φ is homothetic, and so the proof is complete.
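The construction can be verified in matrix form (our sketch): with G the Gram matrix of the inner product just defined and M the matrix of φ in the basis a, b, one checks Mᵀ G M = β G, which is exactly |φx|² = β |x|².

```python
from sympy import symbols, Matrix, simplify, zeros

alpha, beta = symbols('alpha beta', real=True)
G = Matrix([[1, -alpha/2], [-alpha/2, beta]])  # Gram matrix in the basis a, b
M = Matrix([[0, -beta],                        # columns: phi(a) = b,
            [1, -alpha]])                      #          phi(b) = -beta*a - alpha*b
assert (M.T * G * M - beta * G).applyfunc(simplify) == zeros(2)
```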
13.30. Lorentz transformations. As a second example we shall construct a decomposition of the Minkowski space into irreducible subspaces with respect to a Lorentz transformation φ (cf. sec. 9.27). For the sake of simplicity we assume that the Lorentz transformation is proper orthochroneous. Since φ is an isometry, the inverse of every eigenvalue of φ is again an eigenvalue. Since there exists at least one eigenvalue (cf. sec. 9.27), the minimum polynomial, μ, of φ has at least one real root. Now we distinguish three cases:

I. The minimum polynomial μ contains a prime factor f = t² + α t + β, α² − 4β < 0, of second degree. Then consider the mapping

τ = φ² + α φ + β ι.

The kernel of τ is a stable subspace F of even dimension containing no eigenvectors. Since φ has an eigenvector in E, E ≠ F. Thus F has necessarily dimension 2 and hence it is a plane. The intersection of the plane F and the light-cone consists of two straight lines, one straight line, or the point 0 only. The first two cases are impossible because the plane F does not contain eigenvectors (cf. sec. 9.26). Thus the inner product must be positive definite in F, and the induced transformation φ_1 is a proper Euclidean rotation. (An improper rotation of F would have eigenvectors.)
Now consider the orthogonal complement F^⊥. The restriction of the inner product to F^⊥ has index 1. Hence F^⊥ is a pseudo-Euclidean plane. Denote by φ_2 the induced transformation of F^⊥. The equation

det φ = det φ_1 · det φ_2

implies that det φ_2 = + 1, showing that φ_2 is a proper pseudo-Euclidean rotation. Choosing orthonormal bases in F and in F^⊥ we obtain an orthonormal basis of E in which the matrix of φ has the form
(  cos ω   sin ω     0        0     )
( −sin ω   cos ω     0        0     )
(    0       0     cosh θ   sinh θ  )
(    0       0     sinh θ   cosh θ  )
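A quick numerical check (ours; the angle ω and parameter θ below are arbitrary sample values): this rotation-plus-boost matrix preserves the Minkowski inner product with Gram matrix diag(1, 1, 1, −1).

```python
import numpy as np

w, th = 0.7, 1.3
R = np.array([[np.cos(w), np.sin(w)],
              [-np.sin(w), np.cos(w)]])        # Euclidean rotation on F
B = np.array([[np.cosh(th), np.sinh(th)],
              [np.sinh(th), np.cosh(th)]])     # pseudo-Euclidean rotation on F-perp
A = np.block([[R, np.zeros((2, 2))],
              [np.zeros((2, 2)), B]])
G = np.diag([1.0, 1.0, 1.0, -1.0])
print(np.allclose(A.T @ G @ A, G))             # True: A is a Lorentz transformation
```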
II. The minimum polynomial is completely reducible, and not all its roots are equal to 1. Then φ has eigenvalues λ and λ^{−1}, λ ≠ ± 1. Let e and e′ be corresponding eigenvectors. The condition λ² ≠ 1 implies that e and e′ are light-vectors. These vectors are linearly independent, whence (e, e′) ≠ 0 (cf. sec. 9.21). Let F be the plane generated by e and e′ and let z = ξ e + η e′ be any vector of F. Then

(z, z) = 2 ξ η (e, e′).

This equation shows that the induced inner product has index 1. The orthogonal complement F^⊥ is therefore a Euclidean plane and the induced mapping is a Euclidean rotation. The angle of this rotation must be 0 or π, because otherwise the minimum polynomial of φ would contain an irreducible factor of second degree. Select orthonormal bases in F and F^⊥. These two bases form an orthonormal basis of E in which the matrix of φ has the form
( cosh θ   sinh θ   0   0 )
( sinh θ   cosh θ   0   0 )
(   0        0      ε   0 )
(   0        0      0   ε ),   ε = ± 1.
III. The minimum polynomial of φ has the form

μ = (t − 1)^k.

If k = 1, φ reduces to the identity map. Next, it will be shown that the case k = 2 is impossible. If k = 2, applying φ^{−1} to the equation (φ − ι)² = 0 yields

φ + φ^{−1} = 2 ι,

whence (since (φ^{−1} x, x) = (x, φ x))

(φ x, x) = (x, x),  x ∈ E.

Inserting for x a light-vector, l, we see that (l, φ l) = 0. But two light-vectors can be orthogonal only if they are linearly dependent. We thus obtain φ l = λ l. Since φ does not have eigenvalues λ ≠ 1, it follows that φ l = l for all light-vectors l. But this implies that φ is the identity. Hence the minimum polynomial is t − 1, in contradiction to our assumption k = 2.
Now consider the case k ≥ 3. As has been shown in sec. 9.27 there exists an eigenvector e on the light-cone. The orthogonal complement E_1 of e is a 3-dimensional subspace of E which contains the light-vector e. The induced inner product has rank and index 2 (cf. sec. 9.21). Let F be a 2-dimensional subspace of E_1 in which the inner product is positive definite. Selecting an orthonormal basis e_1, e_2 in F we can write

φ e_1 = e_1 cos ω + e_2 sin ω + ξ_1 e,
φ e_2 = − e_1 sin ω + e_2 cos ω + ξ_2 e,   (13.73)
φ e = e.

The coefficients ξ_1 and ξ_2 are not both zero. In fact, if ξ_1 = 0 and ξ_2 = 0 the plane F is invariant under φ, and we have a direct decomposition of E into two 2-dimensional invariant subspaces. This would imply that k ≤ 2.
Now consider the characteristic polynomial, χ_1, of the induced mapping φ_1: E_1 → E_1. Computing the characteristic polynomial from the matrix (13.73) we find that

χ_1 = (t² − 2 t cos ω + 1)(1 − t).   (13.74)

At the same time we know that

χ_1 = (1 − t)³.   (13.75)
Comparing the polynomials (13.74) and (13.75) we find that ω = 0. Hence, equations (13.73) reduce to

φ e_1 = e_1 + ξ_1 e,  φ e_2 = e_2 + ξ_2 e,  φ e = e.

Now consider the vector

y = ξ_1 e_2 − ξ_2 e_1.

Then

(y, y) = ξ_1² + ξ_2² > 0

and

φ y = ξ_1 φ e_2 − ξ_2 φ e_1 = ξ_1 (e_2 + ξ_2 e) − ξ_2 (e_1 + ξ_1 e) = y.
In other words, y is a space-like eigenvector of φ. Denote by Y the 1-dimensional subspace generated by y. Then we have the orthogonal decomposition

E = Y ⊕ Y^⊥

into two invariant subspaces. The orthogonal complement Y^⊥ is a 3-dimensional pseudo-Euclidean space with index 2.
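An explicit instance of this case (our construction, modelled on (13.73) with ω = 0; the fourth basis vector f and the coefficient c are our additions to complete the basis): the matrix below is an isometry with (φ − ι)³ = 0 but (φ − ι)² ≠ 0, and y = ξ_1 e_2 − ξ_2 e_1 is a space-like eigenvector.

```python
from sympy import Matrix, Rational, eye, zeros

xi1, xi2 = 1, 2                           # sample coefficients
c = Rational(xi1**2 + xi2**2, 2)
# basis e1, e2, e, f with (e1,e1) = (e2,e2) = 1, (e,f) = 1, all else 0
G = Matrix([[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 1, 0]])
A = Matrix([[1, 0, 0, -xi1],              # columns: images of e1, e2, e, f
            [0, 1, 0, -xi2],
            [xi1, xi2, 1, -c],
            [0, 0, 0, 1]])
assert A.T * G * A == G                   # phi is an isometry
assert (A - eye(4))**3 == zeros(4)        # minimum polynomial (t - 1)**3
assert (A - eye(4))**2 != zeros(4)
y = Matrix([-xi2, xi1, 0, 0])             # y = xi1*e2 - xi2*e1
assert A * y == y and (y.T * G * y)[0] > 0   # space-like eigenvector
```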
The subspace Y^⊥ is irreducible with respect to φ. This follows from our hypothesis that the degree of the minimum polynomial μ is at least 3. On the other hand, μ cannot have degree 4, because then the space E would be irreducible.

Combining our results we see that the decomposition of a Minkowski space with respect to a proper orthochroneous Lorentz transformation has one of the following forms:

I. E is completely reducible. Then φ is the identity.
II. E is the direct sum of an invariant Euclidean plane and an invariant pseudo-Euclidean plane. These planes are irreducible except for the cases where the induced mappings are ± ι (Euclidean plane) or ι (pseudo-Euclidean plane).
III. E is the direct sum of a space-like 1-dimensional stable subspace (eigenvalue 1) and an irreducible subspace of dimension 3 and index 2.
Problems

1. Suppose E is an n-dimensional vector space over Γ. Assume that a symmetric bilinear function ⟨ , ⟩: E × E → Γ is defined such that ⟨x, x⟩ ≠ 0 whenever x ≠ 0.
a) Prove that ⟨ , ⟩ is a scalar product and thus E is self-dual.
b) If F ⊂ E is a subspace, show that E = F ⊕ F^⊥.
c) Suppose φ: E → E is a linear transformation such that φ φ* = φ* φ. Prove that φ is semisimple.

2. Let φ be a linear transformation of a unitary space. Prove that φ is normal if and only if, for some polynomial f,

φ* = f(φ).

3. Suppose φ is a linear transformation of a complex vector space such that φ^k = ι for some integer k. Show that E can be made into a unitary space such that φ becomes a unitary mapping.

4. Let E be a real linear space and let φ be a linear transformation of E. Prove that a positive definite inner product can be introduced in E such that φ becomes a normal mapping if and only if the following conditions are satisfied:
a) The space E can be decomposed into invariant subspaces of dimension 1 and 2.
b) If τ is the induced mapping in an irreducible subspace of dimension 2, then (tr τ)² − 4 det τ < 0.
5. Consider a 3-dimensional pseudo-Euclidean space E with index 2. Let l_i (i = 1, 2, 3) be three light-vectors such that

(l_i, l_j) = 1,  i ≠ j.

Define a linear transformation, φ, by the equations

φ l_1 = l_1,  φ l_2 = … ,  φ l_3 = … .

Prove that φ is a rotation and that E is irreducible with respect to φ.
6. Show that a real 3-dimensional vector space cannot be irreducible with respect to a semisimple linear transformation. Conclude that the pseudo-Euclidean rotation of problem 5 is not semisimple. Use this to show that a linear transformation in a self-dual space which satisfies φ φ* = φ* φ is not necessarily semisimple (cf. problem 1).

7. Let a Lorentz transformation φ be defined by the matrix

… .

Construct a decomposition of E into irreducible subspaces.

8. Consider the group G of Lorentz transformations. Let e be a time-like unit vector and F be the orthogonal complement of e. Consider the subgroup H ⊂ G consisting of all Lorentz transformations φ such that φ F = F.
a) Prove that H is a compact subgroup.
b) Prove that H is not properly contained in a compact subgroup of G.
Hint: Show first that if K is a compact subgroup of G and φ ∈ K, then every real eigenvalue of φ is ± 1.