Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen

F. and R. Nevanlinna

Absolute Analysis

Translated from the German by Phillip Emig

With 5 Figures

Springer-Verlag New York Heidelberg Berlin 1973
F. and R. Nevanlinna, Department of Mathematics, University of Helsinki, Helsinki/Finland

Translator: P. Emig, Granada Hills, CA 91344

Geschäftsführende Herausgeber: B. Eckmann, Eidgenössische Technische Hochschule Zürich; B. L. van der Waerden, Mathematisches Institut der Universität Zürich

AMS Subject Classifications (1970): 26 A 60

ISBN 0-387-05917-2 Springer-Verlag New York Heidelberg Berlin
ISBN 3-540-05917-2 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher. © by Springer-Verlag Berlin Heidelberg 1973. Printed in Germany. Library of Congress Catalog Card Number 73-75652.
To the Memory of our father Otto Nevanlinna
Foreword

The first edition of this book, published in German, came into being as the result of lectures which the authors held over a period of several years since 1953 at the Universities of Helsinki and Zurich. The Introduction, which follows, provides information on what motivated our presentation of an absolute, coordinate- and dimension-free infinitesimal calculus.

Little previous knowledge is presumed of the reader. It can be recommended to students familiar with the usual structure, based on coordinates, of the elements of analytic geometry, differential and integral calculus and of the theory of differential equations.

We are indebted to H. Keller, T. Klemola, T. Nieminen, Ph. Tondeur and K. I. Virtanen, who read our presentation in our first manuscript, for important critical remarks.

The present new English edition deviates at several points from the first edition (cf. Introduction). Professor I. S. Louhivaara has from the beginning to the end taken part in the production of the new edition and has advanced our work by suggestions on both content and form. For his important support we wish to express our hearty thanks. We are indebted also to W. Greub and to H. Haahti for various valuable remarks.

Our manuscript for this new edition has been translated into English by Doctor P. Emig. We express to him our gratitude for his careful interest and skillful attention during this work. Our thanks are also due to Professor F. K. Schmidt and to Springer-Verlag for including this new edition in the "Grundlehren der mathematischen Wissenschaften" series.

Helsinki, May 1973

THE AUTHORS
Table of Contents

Introduction
I. Linear Algebra
  § 1. The Linear Space with Real Multiplier Domain
  § 2. Finite Dimensional Linear Spaces
  § 3. Linear Mappings
  § 4. Bilinear and Quadratic Functions
  § 5. Multilinear Functions
  § 6. Metrization of Affine Space
II. Differential Calculus
  § 1. Derivatives and Differentials
  § 2. Taylor's Formula
  § 3. Partial Differentiation
  § 4. Implicit Functions
III. Integral Calculus
  § 1. The Affine Integral
  § 2. The Theorem of Stokes
  § 3. Applications of Stokes' Theorem
IV. Differential Equations
  § 1. Normal Systems
  § 2. The General Differential Equation of First Order
  § 3. The Linear Differential Equation of Order One
V. Theory of Curves and Surfaces
  § 1. Regular Curves and Surfaces
  § 2. Curve Theory
  § 3. Surface Theory
  § 4. Vectors and Tensors
  § 5. Integration of the Derivative Formulas
  § 6. Theorema Egregium
  § 7. Parallel Translation
  § 8. The Gauss-Bonnet Theorem
VI. Riemannian Geometry
  § 1. Affine Differential Geometry
  § 2. Riemannian Geometry
Bibliography
Index
Introduction

In the modern development of functional analysis, inspired by the pioneering work of David Hilbert, Erhard Schmidt and Friedrich Riesz, the appearance of J. von Neumann (1928) signaled a decisive change. Before him the theory of linear operators and of quadratic and Hermitian forms was tied in an essential way to the coordinate representation of the vector spaces considered and to the matrix calculus. Von Neumann's investigations brought about an essentially new situation. Linear and quadratic analysis were freed from these restrictions and shaped into an "absolute" theory, independent of coordinate representations and also largely of the dimension of the vector spaces. It was only on the basis of the general axiomatic foundation created by von Neumann that the geometric points of view, crucial to Hilbert's conception of functional analysis, were able to prevail. It is not necessary here to recall in more detail the enormous development to which von Neumann's ideas opened the way.

In this work an attempt is made to present a systematic basis for a general, absolute, coordinate- and dimension-free infinitesimal vector calculus. The beginnings of such a calculus appear in the literature quite early. Above all, we should mention the works of M. Fréchet, in which the notion of a differential was introduced in a function space. This same trend, the translation of differential calculus to functional analysis, is pursued in a number of later investigations (Gateaux, Hildebrandt, Fischer, Graves, Keller, Kerner, Michal, Elconin, Taylor, Rothe, Sebastião e Silva, Laugwitz, Bartle, Whitney, Fischer and others)¹. In all of these less attention was paid to classical analysis, the theory of finite dimensional spaces. And yet it seems that already here the absolute point of view offers essential advantages. The elimination of coordinates signifies a gain not only in a formal sense. It leads to a greater unity and simplicity in the theory of functions of arbitrarily many variables, the algebraic structure of analysis is clarified, and at the same time the geometric aspects of linear algebra become more prominent, which simplifies one's ability to comprehend the overall structures and promotes the formation of new ideas and methods.

¹ Cf., e.g., the book of E. Hille and R. S. Phillips [1] as well as the bibliography in this book.
Since this way of viewing things is just as valid for classical analysis as for general functional analysis, our presentation is restricted for the most part to the finite dimensional case, that is, to the theory of finitely many variables. But it lies in the nature of the methods that they can be applied, either directly or with certain modifications, which in general are readily apparent, to the case of infinitely many dimensions (for Hilbert or Banach spaces).

Our presentation starts with a chapter on linear algebra and the analytic geometry of finite dimensional spaces. As is well known, a great number of works are available on this basic topic, among them some in which the general points of view discussed above are fully considered. In this regard the fundamental presentation of N. Bourbaki deserves special attention. Nevertheless, we have deemed it necessary to enter into a thorough discussion of linear algebra, in order to introduce the basic concepts and the notations that have fundamental significance for infinitesimal analysis.

Our presentation of the infinitesimal calculus begins in the second chapter, where the central problems of differential calculus are treated briefly. Among those problems for which the advantages of absolute analysis are particularly apparent, the theorems on the commutativity of differentiation and the theory of implicit functions merit special attention. The coordinate-free method together with the application of an extension of the classical mean value theorem leads in a natural way to a uniqueness result for the second problem that is complete as regards the precision of the domain of validity for the inverse of a function or the solution of an equation.

The following Chapter III is devoted to integral calculus. The integration of a multilinear alternating differential form constitutes the central problem. The "affine integral" introduced here is essentially the same as the Grassmann-Cartan integral of an "exterior differential". As an application, the problem of integrating a skew symmetric tensor field is solved.

In the theory of differential equations (Chapter IV), the "absolute" mode of viewing things also produces order and unity. After a preparatory treatment of normal systems, there follows the solution of partial differential equations by two different methods, the second of which leads to a sharpening of the conditions under which the problem is usually solved.

In Chapter V the basic features of the theory of curves and of the Gaussian theory of surfaces are presented. Although this chapter offers nothing new as far as content is concerned, the usefulness of the absolute coordinate-free point of view is also shown in this case. In the theory of surfaces we restrict ourselves to the case of an m-dimensional surface
embedded in an (m + 1)-dimensional euclidean space. In accordance with the basic theme of our work, the "inner geometry" is heavily emphasized, and the theory is so constructed that it also includes an independent presentation of Riemannian geometry and of affine differential geometry. To this end it was necessary for tensor calculus to receive special attention. This latter theory is also developed in a coordinate-free way. The elimination of coordinates and indices, which in the usual treatment of tensors is already typographically burdensome, simplifies the notation. On the other hand, very extensive abstractions are avoided, as occur in Bourbaki, for example. It was our goal to so shape the tensor calculus that its connection with the classical version is not broken and that it retains the character of an automatic calculus. It seems to us that the thus modified calculus, as indeed the absolute infinitesimal calculus, can be used to advantage not only in mathematics, but also in theoretical physics.

From many quarters the wish has been expressed that the authors append a synopsis of the elements of Riemannian geometry. Such a survey is contained in the last chapter (Chapter VI) of this new edition. Otherwise this edition also deviates at several points from the first. We refer in particular to the theory of implicit functions, which is presented in Chapter II following two different methods, and to Chapter IV, on differential equations, which has been substantially reworked and extended.
I. Linear Algebra

§ 1. The Linear Space with Real Multiplier Domain

1.1. The basic relations of linearity. Let R be a set whose elements we denote by

    a, b, c, ..., x, y, z, ...

We assume that addition and multiplication by a real scalar λ are defined so as to satisfy the following rules of linearity:

Sum. Corresponding to each pair of elements a, b there is a unique element a + b in R, the sum of a and b, with the following properties:

I.1. The sum is associative,

    a + (b + c) = (a + b) + c.

I.2. There exists in R a unique element 0, the zero, so that for every a

    a + 0 = 0 + a = a.

I.3. Each element a has a unique inverse, -a, in R with the property

    a + (-a) = (-a) + a = 0.

I.4. The sum is commutative,

    a + b = b + a.

The first three axioms state that with respect to addition R is a group, and as a consequence of I.4, a commutative or abelian group. They imply the existence of a unique difference a - b with the property b + (a - b) = (a - b) + b = a; namely

    a - b = a + (-b) = (-b) + a.

Product. Uniquely corresponding to each real λ and each element a ∈ R there is in R an element λa, the product of λ and a, with the following properties:

II.1. 1·a = a.

II.2. The product is associative,

    λ(μa) = (λμ)a.

II.3. The product is distributive,

    λ(a + b) = λa + λb,
    (λ + μ)a = λa + μa.
It follows from the distributive laws for b = 0 and μ = 0 that

    λ·0 = 0·a = 0.¹

Conversely, from λa = 0 one concludes, provided λ ≠ 0, that

    a = 1·a = (λ⁻¹λ)a = λ⁻¹(λa) = λ⁻¹·0 = 0.

The product therefore vanishes if and only if one of the factors is zero.

Since for a ≠ 0 the equation λa = μa holds only for λ = μ, the additive abelian group R either reduces to the element zero or is of infinite order; for with a ≠ 0, R includes all of the elements λa, which for different scalars λ are different.
Further, note that as a consequence of II.3 and II.1 we have for an integer λ = m

    ma = (1 + ... + 1)a = 1·a + ... + 1·a = a + ... + a   (m terms);

consequently ma is equal to the m-fold multiple of a. From here it follows that

    a = 1·a = (m·(1/m))a = m((1/m)a),

and hence (1/m)a is equal to the quotient a/m. Thus for a positive rational λ = p/q

    (p/q)a = p((1/q)a) = p(a/q).
1.2. Linear dependence. Dimension. A set R whose elements satisfy the axioms of addition and of multiplication listed above is called a linear space over the real multiplier domain. If a_1, ..., a_n are arbitrary elements of the space and λ_1, ..., λ_n are arbitrary real numbers, the linear combination

    λ_1 a_1 + ... + λ_n a_n

also has meaning and is contained in R.

The elements a_1, ..., a_n are said to be linearly independent if the above linear combination is equal to the zero element only for λ_1 = ... = λ_n = 0; otherwise, they are linearly dependent. It follows immediately from this definition that every subset of linearly independent elements is likewise linearly independent. Because the linear independence of one single element a is synonymous with a ≠ 0, this implies in particular that linearly independent elements are different from zero.

¹ We use the symbol 0 for every zero element. In the above formula, 0 on the right and left stands for the zero element in the linear space, in the middle for the number zero.
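As a concrete aside (ours, not part of the original text): once coordinates are available, linear independence can be tested by comparing the rank of the matrix whose columns are the given vectors with the number of vectors. A minimal NumPy sketch with arbitrarily chosen sample vectors:

```python
import numpy as np

def linearly_independent(vectors):
    """Vectors are linearly independent iff the matrix having them
    as columns has rank equal to the number of vectors."""
    a = np.column_stack(vectors)
    return np.linalg.matrix_rank(a) == len(vectors)

a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([1.0, 1.0, 0.0])
a3 = a1 + 2 * a2          # a dependent linear combination

print(linearly_independent([a1, a2]))      # True
print(linearly_independent([a1, a2, a3]))  # False
```

The rank criterion is exactly the definition above, phrased for a coordinate space: the only vanishing combination λ_1 a_1 + ... + λ_n a_n is the trivial one precisely when the column rank is n.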
With regard to the linear dependence of the elements of a linear space R there are two possibilities: Either there exists a positive integer m such that m + 1 elements are always linearly dependent, while on the other hand at least one system of m linearly independent elements can be found. Or there is no such maximal system: for every m, no matter how large, there are always m + 1 linearly independent elements. In the first case m is called the dimension of the linear space R; in the second case the dimension is infinite. We shall be concerned primarily with the first, far simpler case. However, we wish to at first discuss a few concepts which are independent of the dimension.

1.3. Subspaces. Congruences. A subset U of the linear space R that with respect to the given fundamental relations is itself a linear space is called a subspace of R. For this it is necessary and sufficient that U contain every linear combination of its elements. Every subspace contains the zero element of the given space R, and this element by itself is a subspace.

Let U be a subspace, a and b two elements in R. The elements a and b are said to be congruent modulo U,
    a ≡ b (mod U),
if b - a is contained in U. It follows from the notion of a subspace, first of all, that congruence is an equivalence:

1. a ≡ a;
2. a ≡ b implies b ≡ a;
3. a ≡ b and b ≡ c imply a ≡ c;

all modulo U. Further, if a ≡ b and c ≡ d, the congruence

    a + c ≡ b + d

and for every λ the congruence

    λa ≡ λb

hold modulo U. Conversely, an equivalence a ∼ b in R having the last two operational properties is a congruence modulo a definite subspace U, which contains precisely those elements of the space R that are equivalent to zero (cf. 1.6, exercise 1). Therefore, a congruence in the space R can be defined as an equivalence that has the above two operational properties.

1.4. Hyperplanes. Factor spaces. Any equivalence in R decomposes all of the R-elements into disjoint classes so that two elements belong to the same equivalence class precisely when they are equivalent. If the equivalence is in particular a congruence modulo a subspace U, then two elements belong to the same congruence class provided their difference is an element of U.

In conformity with the geometric terminology introduced in the next section, we call these congruence classes modulo U hyperplanes (or more briefly planes) that are "parallel" to U. If a is an arbitrary element of such a hyperplane parallel to U, it contains exactly the elements a + U. Of these parallel hyperplanes only U, which contains the zero element, is a subspace.
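As an illustrative aside of ours (not from the text): when U is given concretely by a spanning set of column vectors, the congruence a ≡ b (mod U) amounts to b - a lying in that span, which can be checked numerically with a least-squares residual. The subspace and points below are arbitrary choices:

```python
import numpy as np

def congruent_mod(a, b, U):
    """a ≡ b (mod U) iff b - a lies in the subspace spanned by
    the columns of U (checked via a least-squares residual)."""
    diff = b - a
    coef, *_ = np.linalg.lstsq(U, diff, rcond=None)
    return np.allclose(U @ coef, diff)

U = np.array([[1.], [1.], [0.]])          # a line in R^3
a = np.array([0., 0., 5.])
b = np.array([3., 3., 5.])                # b - a = 3·(1, 1, 0) lies in U
print(congruent_mod(a, b, U))             # True
print(congruent_mod(a, b + np.array([0., 0., 1.]), U))  # False
```

The hyperplane through a parallel to U is then exactly the set of points congruent to a, i.e. a + U.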
Retaining the basic relations of linearity, replace the original identity of elements with congruence modulo U. By means of this identification modulo U there results a new linear space, the factor space of R modulo U:

    R/U.

As elements of this factor space, the elements of the hyperplane a + U are all equal to a.

1.5. The affine vector space. Parallel displacement. Up until now we have spoken of "elements" of the "space" R without worrying about concrete interpretations of these concepts. It is however useful to think of the abstractly defined linear space as a generalization of the concrete 1-, 2- and 3-dimensional spaces and to introduce a corresponding geometric terminology.
An ordered pair of elements [a, b] of the linear space R is called a vector; a is the "tail", b the "head" of the vector. Using the basic linear relations defined in R, we define

    [a, b] + [c, d] = [a + c, b + d]

and for every real λ

    λ[a, b] = [λa, λb].

One verifies at once that the vectors together with these definitions form a linear space, the affine vector space associated with R. The zero element of this vector space is the vector [0, 0], and the vector inverse of [a, b] is

    -[a, b] = [-a, -b].

If one sets

    [a, b] ≡ [c, d]

whenever b - a = d - c, an equivalence (indeed a congruence) is defined in this vector space. For it follows from [a, b] ≡ [c, d] and [a', b'] ≡ [c', d'], by virtue of the above definitions, that also

    [a, b] + [a', b'] ≡ [c, d] + [c', d']

and

    λ[a, b] ≡ λ[c, d].
The modulus of this congruence is the subspace of the affine vector space consisting of all vectors [a, a], and the factor space that corresponds to this subspace consists of all vectors with an arbitrarily fixed tail, for example, of all vectors [0, x] with the fixed tail 0. This factor space is hence isomorphic to the original linear space R: in the one-to-one correspondence [0, x] ↔ x, the elements [0, x] + [0, y] ≡ [0, x + y] and x + y, and likewise λ[0, x] ≡ [0, λx] and λx, are image elements. The element [0, x] can therefore be thought of either as a vector congruent to [0, x], or as a point, namely as the point x at the head of the vector [0, x].

From the congruence [a, b] ≡ [c, d] it follows that c - a = d - b, which implies the congruence [a, c] ≡ [b, d], and conversely. The congruent vectors [a, b], [c, d] and [a, c], [b, d] are the "parallel" sides of the parallelogram a b c d. One says that [c, d] is obtained from [a, b] through parallel displacement by the vector [a, c] ≡ [b, d], and [b, d] from [a, c] through parallel displacement by the vector [a, b] ≡ [c, d]. According to the definitions we have established,

    [a, b] + [b, c] = [a + b, b + c] ≡ [a, c];

this is the elementary geometric rule for the "combining" of two vectors, which more generally implies that

    [a, b_1] + [b_1, b_2] + ... + [b_{m-1}, b_m] + [b_m, c] ≡ [a, c].
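A small numeric sketch (ours; NumPy points in the plane stand in for elements of R) of the pair calculus and the combining rule just stated:

```python
import numpy as np

def add(v, w):
    """Sum of two vectors: [a, b] + [c, d] = [a + c, b + d]."""
    return (v[0] + w[0], v[1] + w[1])

def congruent(v, w):
    """[a, b] ≡ [c, d] iff b - a = d - c (equal head-minus-tail)."""
    return np.allclose(v[1] - v[0], w[1] - w[0])

a = np.array([0., 0.])
b = np.array([2., 1.])
c = np.array([3., 4.])

s = add((a, b), (b, c))        # = [a + b, b + c]
print(congruent(s, (a, c)))    # True: the combining rule s ≡ [a, c]
```

Note that the sum of pairs is [a + b, b + c], not [a, c] itself; the two agree only modulo the congruence, which is precisely the point of the elementary rule.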
1.6. Exercises. 1. Let ∼ be a given equivalence in the linear space R with the following properties: from a ∼ b and c ∼ d it follows that

    a + c ∼ b + d,

and for every real λ

    λa ∼ λb.

Prove that the set of all elements a ∼ 0 is a subspace U of R and further that the equivalence a ∼ b is the same as the congruence

    a ≡ b (mod U).

2. Let M_1, ..., M_k be arbitrary point sets of the linear space R and

    (M_1, ..., M_k)

the set of all finite linear combinations of elements in these sets. Prove that this set is a subspace. This subspace of R is said to be generated or spanned by the sets M_1, ..., M_k.
3. Suppose that the sets in the previous exercise are in particular subspaces U_1, ..., U_k of R and that

    U = (U_1, ..., U_k)

is the subspace of R spanned by these subspaces. Show that every element in U can be represented uniquely as a sum of elements from the subspaces U_i precisely when the equation

    u_1 + ... + u_k = 0

with u_i ∈ U_i holds only for u_1 = ... = u_k = 0. In this case the subspaces U_i are called linearly independent and U is written as the direct sum

    U = U_1 + ... + U_k.

4. Show, keeping the above notation, that the intersection

    [U_1, ..., U_k],

i.e., the set of elements common to all of the subspaces U_i, is a subspace.

5. Prove: In order that the subspaces U_1, ..., U_k of the space R be linearly independent it is necessary and sufficient that the k intersections

    [U_i, (U_1, ..., U_{i-1}, U_{i+1}, ..., U_k)] = 0    (i = 1, ..., k).

Thus in particular two subspaces U_1 and U_2 are linearly independent precisely when they are disjoint except for the zero point.
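In coordinates, the direct-sum condition of exercise 3 has a simple rank formulation, which we sketch here as our own illustration (the subspaces of R^3 are arbitrary choices): subspaces given by bases are linearly independent precisely when stacking all the basis vectors together loses no rank.

```python
import numpy as np

def subspaces_independent(bases):
    """Subspaces U_1, ..., U_k (each given by a basis as matrix
    columns) are linearly independent iff the concatenated matrix
    has rank equal to the sum of the individual dimensions."""
    all_cols = np.hstack(bases)
    dims = sum(b.shape[1] for b in bases)
    return np.linalg.matrix_rank(all_cols) == dims

u1 = np.array([[1.], [0.], [0.]])               # the x-axis in R^3
u2 = np.array([[0., 0.], [1., 0.], [0., 1.]])   # the yz-plane
u3 = np.array([[1.], [1.], [0.]])               # lies in (u1, u2)

print(subspaces_independent([u1, u2]))          # True:  R^3 = u1 + u2
print(subspaces_independent([u1, u2, u3]))      # False: u3 meets (u1, u2)
```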
6. Let R_x and R_y be two linear spaces over the real multiplier domain. We consider the set of all ordered pairs of points [x, y] and define:

    [x_1, y_1] = [x_2, y_2]

if and only if x_1 = x_2 in R_x and y_1 = y_2 in R_y. Further, let

    [x_1, y_1] + [x_2, y_2] = [x_1 + x_2, y_1 + y_2]

and for each λ

    λ[x, y] = [λx, λy].

Prove that with these definitions the elements [x, y] form a linear space, the product space

    R_x × R_y.

Construct more generally the product space of k linear spaces R_1, ..., R_k.

7. In the above construction, identify [x, 0] with x and [0, y] with y, and show that R_x and R_y are then linearly independent subspaces of the product space and that

    R_x × R_y = R_x + R_y.
§ 2. Finite Dimensional Linear Spaces

2.1. Linear coordinate systems. Having introduced the basic concepts of general linear spaces we now consider those of finite dimension. Let R^m be a linear space of dimension m: Every system of m + 1 vectors is linearly dependent, while there exists at least one system of m linearly independent vectors a_1, ..., a_m. It follows from this that any vector x in the space can be uniquely represented as a linear combination of these vectors,

    x = Σ_{i=1}^{m} ξ^i a_i.

The linearly independent vectors a_i consequently generate the entire space R^m = (a_1, ..., a_m). They form a basis or a linear coordinate system for the space. The unique real numbers ξ^i are the m linear coordinates of the point x in the given coordinate system. We claim: Any coordinate system of an m-dimensional space contains precisely m linearly independent vectors.

Let b_1, ..., b_n be a second basis, in addition to a_1, ..., a_m. Since these vectors are linearly independent, it is necessarily true that n ≤ m; we maintain that n = m. Since the vectors b_i by hypothesis generate the entire space R^m, the vectors a_i are, in particular, unique linear combinations of the vectors b_i. Since a_1 ≠ 0, the coefficients in the representation of a_1 cannot all vanish. This equation can therefore be solved for some b_i, say for b_1, and this expression for b_1 substituted into those for a_2, ..., a_m, which then become linear combinations of a_1 and b_2, ..., b_n. Because of the linear independence of a_1 and a_2, not all of the coefficients of b_2, ..., b_n vanish in the expression thus obtained for a_2, and the equation therefore can be solved for b_2, say, and b_2 can be eliminated from the expressions for a_3, ..., a_m.

Now if it were true that n < m, then the continuation of this elimination process would after n steps lead to an equation of the form

    a_{n+1} = Σ_{i=1}^{n} λ_i a_i,

which contradicts the linear independence of the vectors a_i. Consequently n = m, and the assertion is proved.

2.2. The isomorphism of the spaces R^m. It follows from the above theorem that with respect to the relations of linearity all m-dimensional linear spaces are isomorphic and consequently have the same linear structure. In fact: Let R^m and R̄^m be two m-dimensional linear spaces with coordinate systems a_i and ā_i. Two arbitrary points x and x̄ in these spaces then
have unique coordinates,

    x = Σ_{i=1}^{m} ξ^i a_i,    x̄ = Σ_{i=1}^{m} ξ̄^i ā_i.

If one now lets x and x̄ correspond if and only if

    ξ^i = ξ̄^i    (i = 1, ..., m),

a one-to-one mapping

    x ↔ x̄

is obtained that has the following properties:

1. From x + y = z in R^m it follows that x̄ + ȳ = z̄ in R̄^m.

2. From y = λx in R^m it follows that ȳ = λx̄ in R̄^m.

The one-to-one mapping just defined consequently preserves the basic relations of linearity, to which any linear statement can be reduced. Every linear statement which is correct in R^m remains correct when the points which appear are replaced by their image points in R̄^m, and conversely. This means that spaces of equal dimension have the same linear structure; they are linearly isomorphic. Conversely, it obviously follows from a given isomorphic mapping R^m ↔ R̄^m that there is corresponding to any coordinate system a_i in R^m a uniquely determined coordinate system ā_i in R̄^m. We shall come back to the determination of this coordinate system and the corresponding isomorphisms later in connection with linear mappings.

2.3. Subspaces and factor spaces in R^m. Let U^d be a subspace of R^m of dimension d; thus 0 ≤ d ≤ m. For d = 0, U^d reduces to the zero point; for d = m it spans R^m. Let now U^d be a subspace with

    0 < d < m.

If a_1, ..., a_d is then a basis for U^d, it can be completed to a coordinate system for the entire space R^m. For this choose new basis vectors a_{d+1}, ..., a_m so that a_{d+i+1} is not contained in the subspace (a_1, ..., a_{d+i}) generated by the preceding vectors. By virtue of this completion one can write for any x in R^m

    x = Σ_{i=1}^{d} ξ^i a_i + Σ_{i=d+1}^{m} ξ^i a_i,

or

    x ≡ Σ_{i=d+1}^{m} ξ^i a_i    (mod U^d).

From the linear independence of the vectors a_1, ..., a_m it follows that the m - d basis vectors on the right are even independent modulo U^d, and consequently they form a basis for the factor space R^m/U^d. This factor space thus has dimension m - d.
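As an aside of ours (not the authors'): once a basis is fixed, the coordinates ξ^i of 2.1 are obtained by solving a single linear system with the basis vectors as columns. A minimal NumPy sketch with an arbitrarily chosen basis of R^3:

```python
import numpy as np

# Columns of A form a basis a_1, a_2, a_3 of R^3.
A = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])
x = np.array([2., 3., 1.])

xi = np.linalg.solve(A, x)      # the unique coordinates of x
print(xi)                       # x = xi[0]*a_1 + xi[1]*a_2 + xi[2]*a_3
print(np.allclose(A @ xi, x))   # True: the representation reproduces x
```

The uniqueness asserted in 2.1 appears here as the nonsingularity of the basis matrix: linearly independent columns make the system solvable in exactly one way.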
As vectors in the original space R^m the vectors a_{d+1}, ..., a_m generate a subspace V^{m-d} of dimension m - d. The subspaces U^d and V^{m-d} are obviously linearly independent, and

    R^m = U^d + V^{m-d}.

These subspaces are linearly independent complements of one another in R^m. It can be seen from the above that a linearly independent complement V^{m-d} for a given space U^d can be generated in many ways. They are all of dimension m - d and are linearly isomorphic to each other and to the factor space R^m/U^d. In general, the following holds: If the linearly independent subspaces U^{d_i} (i = 1, ..., k) are, respectively, of dimension d_i and generate R^m, then

    Σ_{i=1}^{k} d_i = m.

Then if a_1^i, ..., a_{d_i}^i is an arbitrary basis for U^{d_i}, the set of these m basis vectors provides a coordinate system in R^m.

2.4. Hyperplanes in R^m. We have defined hyperplanes in 1.4 as congruence classes modulo the subspaces. A subspace U^d of dimension d ≤ m and a point x_0 of R^m determine a d-dimensional hyperplane E^d through x_0 parallel to U^d and containing the set of points x with the property x ≡ x_0 (mod U^d). For d = 0, E^d reduces to the point x_0; for d = 1, E^d is a line through x_0, etc. Two points x_0, x_1 uniquely determine a line containing these points, and three points not lying on the same line, a plane through these points. We prove: If x_0, x_1, ..., x_d (d ≤ m) are points in the space R^m such that the d differences

    x_1 - x_0, ..., x_d - x_0

are linearly independent, then there is one and only one d-dimensional plane E^d through these points; E^d then contains precisely the set of points

    x = Σ_{i=0}^{d} μ^i x_i    with    Σ_{i=0}^{d} μ^i = 1,

where the numbers μ^i are uniquely determined for any x in the plane.

For the proof, observe that it makes no difference which of the d + 1 vectors x_j appears in the above differences as the subtrahend. For if they are linearly independent, the same is obviously true for the differences

    x_0 - x_j, ..., x_{j-1} - x_j, x_{j+1} - x_j, ..., x_d - x_j
for every j = 1, ..., d. They determine the same d-dimensional subspace U^d.

The desired plane E^d through the given points is parallel to the subspace U^d and consequently contains, since it goes through x_0, all points

    x = x_0 + Σ_{i=1}^{d} μ^i (x_i - x_0).

If we set

    μ^0 = 1 - Σ_{i=1}^{d} μ^i,

we obtain the equation for the plane E^d in the above form. Conversely, this point set obviously defines a hyperplane E^d through the given points parallel to the subspace U^d, where the numbers μ^i are uniquely determined for a given x in E^d. This proves the assertion.

The plane E^d discussed above contains the origin and coincides with the subspace U^d if there exists a system of numbers μ^i with Σ_{i=0}^{d} μ^i = 1 such that

    Σ_{i=0}^{d} μ^i x_i = 0,

that is, if the d + 1 vectors x_i are linearly dependent. This is for d = m, naturally, always the case, and then E^m = U^m = R^m.

2.5. Simplexes. Barycentric coordinates. A configuration of the kind considered above, which consists of d + 1 arbitrary points x_0, x_1, ..., x_d (d ≤ m) with linearly independent differences x_i - x_0, is called a d-dimensional simplex. As was shown above, the simplex determines a plane E^d in which the simplex lies, and the points of this plane are uniquely given by

    x = Σ_{i=0}^{d} μ^i x_i    with    Σ_{i=0}^{d} μ^i = 1.

This representation can be interpreted in the following way: If the total mass 1 is to be distributed among the vertices x_i in such a way that the center of mass of the system lies at x, then the point x_i must be given precisely the mass μ^i. For this reason the numbers μ^i are called barycentric coordinates of the E^d-point x with respect to the simplex that determines E^d. If the barycentric coordinates of the point x are all positive, x is called an interior point of the simplex. If one or more of the coordinates is zero, the remainder positive, then x is a boundary point. Finally, if at least one of these coordinates is negative, x is called an exterior point of the simplex. The vertices x_j of the d-dimensional simplex are obtained for μ^j = 1 and μ^i = 0 (i ≠ j), and hence, using the Kronecker delta δ_j^i, for μ^i = δ_j^i.
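The barycentric coordinates μ^i can be computed concretely by solving Σ μ^i x_i = x together with the constraint Σ μ^i = 1. A small NumPy sketch of ours (the triangle and the test point are arbitrary choices, not from the text):

```python
import numpy as np

def barycentric(vertices, x):
    """Solve sum(mu_i * x_i) = x together with sum(mu_i) = 1
    for the d+1 barycentric coordinates mu_i."""
    V = np.column_stack(vertices)               # m x (d+1) vertex matrix
    A = np.vstack([V, np.ones(len(vertices))])  # append the constraint row
    b = np.append(x, 1.0)
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

# A 2-simplex (triangle) in the plane and a point inside it.
tri = [np.array([0., 0.]), np.array([1., 0.]), np.array([0., 1.])]
mu = barycentric(tri, np.array([0.25, 0.25]))
print(mu)              # all coordinates positive
print(np.all(mu > 0))  # True: by the classification above, an interior point
```

The sign pattern of mu then classifies the point exactly as in the text: all positive for interior, a zero among positives for boundary, any negative for exterior.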
These are the side simplexes of dimension zero. In general, a d-dimensional simplex has

    (d + 1 choose p + 1)

side simplexes of dimension p, whose interior points are obtained whenever d - p of the barycentric coordinates are zero and the remaining p + 1 positive.

2.6. Exercises. 1. Let U^p and V^q be subspaces of the linear space R^m of dimensions p and q, respectively. Suppose the space (U^p, V^q) generated by these subspaces has dimension s and the intersection [U^p, V^q] dimension r. Show that

    s + r = p + q,

and determine r in the case where the subspaces U^p and V^q are of dimension p = q = m - 1.

2. Let

    s^m = s^m(x_0, ..., x_m),    s̄^m = s̄^m(x̄_0, ..., x̄_m)

be two m-dimensional simplexes in the same m-dimensional plane E^m and

    x̄_j = Σ_{i=0}^{m} μ_j^i x_i    (j = 0, ..., m)

the barycentric representation of the points x̄_j with respect to s^m. Further let x be a point in the plane E^m and

    x = Σ_{j=0}^{m} μ̄^j x̄_j

the barycentric representation of x with respect to s̄^m. Determine the barycentric representation of this point with respect to s^m.

3. In the preceding exercise suppose that x̄_j = x_j for j = 1, ..., m and

    x̄_0 = μ^0 x_0 + Σ_{i=1}^{m} μ^i x_i

is the barycentric representation of the vertex x̄_0 with respect to s^m and, conversely, as a consequence,

    x_0 = (1/μ^0) x̄_0 - Σ_{i=1}^{m} (μ^i/μ^0) x_i

the barycentric representation of x_0 with respect to s̄^m. Prove: The simplexes

    s^m(x_0, x_1, ..., x_m)    and    s̄^m(x̄_0, x_1, ..., x_m)
in the plane E^m with the common (m - 1)-dimensional side simplex

    s^{m-1}(x_1, ..., x_m)

then have no common interior points if and only if μ^0 < 0.
§ 3. Linear Mappings

3.1. Definition. Let R_x and R_y be two linear spaces of dimensions m and n, respectively, and G_x a point set in R_x which for the time being is arbitrary. Then if corresponding to each point x in G_x there is a unique point

    x → y = y(x)

in the space R_y, a mapping of the set G_x in R_x into R_y is defined. G_x is the domain and the set y(G_x) of image points in R_y the range of the vector function y(x). We shall thoroughly investigate such general vector functions later. Here only the simplest, namely the linear mappings, are to be discussed. The mapping x → y = y(x) is said to be linear if

    y(x_1 + x_2) = y(x_1) + y(x_2)

and

    y(λx) = λ y(x)

for every real λ, and consequently

    y(Σ_{i=1}^{k} λ_i x_i) = Σ_{i=1}^{k} λ_i y(x_i).

We shall in what follows denote the vector function y(x) by A(x) and in general omit the parentheses for linear mappings. We write, instead of y = A(x), y = Ax for short.
3.2. Domain of definition and range of linear mappings. In order for the above definition of a linear mapping to make sense, the domain, the point set G_x, must contain all linear combinations of its vectors and therefore must be a subspace of R_x^m. In the following we restrict ourselves to this subspace and denote it from the start by R_x^m. Let

y = Ax

be a linear mapping from R_x^m into R_y^n. We prove the following theorem: In R_y^n the image A U_x of a subspace U_x of R_x^m, and thus in particular the entire range A R_x^m, is a subspace of R_y^n whose dimension is at most equal to the dimension of the preimage U_x.

Let x = Σ_{i=1}^k λ_i x_i be a finite linear combination of vectors x_i in U_x and A x_i = y_i. Then

y = A x = A(Σ_{i=1}^k λ_i x_i) = Σ_{i=1}^k λ_i A x_i = Σ_{i=1}^k λ_i y_i.
Conversely, if y_1, …, y_k are arbitrary points in the image A U_x, then there are in U_x points x_i such that the above equations hold. Since U_x is a subspace, x also lies in U_x and therefore y in A U_x. Accordingly, the latter set contains every finite linear combination of its vectors and is therefore a subspace of R_y^n. The image of the smallest subspace, which contains only the zero point of R_x^m, is a subspace; since every subspace contains the zero point, it thus follows that A0 = 0. This equality also follows, by the way, directly from A(λx) = λ Ax for λ = 0. It follows from this remark and from the above equations that the image vectors y_i = A x_i of linearly dependent vectors x_i are likewise linearly dependent. The dimension of the image space A U_x can therefore not exceed the dimension of the preimage.
3.3. Regular and nonregular linear mappings. The set of those vectors in the original space R_x^m which in the linear mapping A are mapped onto the zero vector of the image space R_y^n obviously contains all finite linear combinations of such vectors and is therefore a subspace

K_x^p = K^p(A)

of R_x^m. This subspace, whose dimension we assume is p (0 ≤ p ≤ m), is called the kernel of A. If p = 0, then Ax = 0 only for x = 0, and hence Ax_1 = Ax_2 only for x_1 = x_2. The mapping onto the range A R_x^m in R_y^n is then one-to-one; the mapping is said to be regular in this case. Since for a regular mapping the linear dependence of the vectors x_i follows from the linear dependence of the image vectors y_i = A x_i, the dimension of the preimage U_x is no larger than the dimension of the image space A U_x. These dimensions are thus equal for a regular linear mapping. In particular, the dimension of the range A R_x^m is also equal to m, and therefore necessarily m ≤ n.

If the dimension of the kernel K^p(A) is not 0, then the mapping is said to be nonregular. To determine the dimension of the image A U^d of a d-dimensional subspace U^d of R_x^m in this case, observe that A considered as a mapping of the space U^d has as kernel the intersection

K_0 = [U^d, K^p(A)],

whose dimension we suppose is p_0 (0 ≤ p_0 ≤ d, p). If we now go over to the factor space U_{K_0} = U^d/K_0, replacing the original identity relation in U^d by congruence modulo K_0, A obviously becomes regular in U_{K_0}. Since nothing was changed in the image space R_y^n, A U_{K_0} = A U^d, and the image space has by the above the same dimension as U_{K_0}, i.e., d − p_0.
If U^d and K^p are linearly independent, then p_0 = 0, and the dimension of A U^d is equal to d. If K^p is a subspace of U^d, this dimension is d − p. Finally, if U^d is a subspace of the kernel K^p, then A U^d is also of dimension 0 and reduces to the zero point. From this simple argument it follows in particular that the dimension of the range A R_x^m is equal to

r = m − p.

For a linear mapping of R_x^m into R_y^n, therefore, we must always have r ≤ n, and consequently the dimension of the kernel has to be p ≥ m − n.
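The relation r = m − p between the dimension of the range and the dimension p of the kernel can be checked by row reduction. A minimal sketch (my own example matrix, not from the text):

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form; returns (matrix, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [a / m[r][c] for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

# A as a 2 x 3 matrix: a linear mapping from R^3 into R^2.
A = [[1, 2, 3],
     [2, 4, 6]]              # second row dependent on the first
m_dim = 3
R, pivots = rref(A)
rank = len(pivots)           # dimension r of the range A R^3
nullity = m_dim - rank       # dimension p of the kernel K(A)
print(rank, nullity)         # 1 2
assert rank + nullity == m_dim
```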
3.4. Matrices. We fix two coordinate systems a_1, …, a_m and b_1, …, b_n in the spaces R_x^m and R_y^n. Then

x = Σ_{i=1}^m ξ^i a_i and y = A x = Σ_{j=1}^n η^j b_j.

In particular, let

A a_i = Σ_{j=1}^n α_i^j b_j;

as a consequence

A x = A(Σ_{i=1}^m ξ^i a_i) = Σ_{i=1}^m ξ^i A a_i = Σ_{j=1}^n (Σ_{i=1}^m α_i^j ξ^i) b_j.

Hence

η^j = Σ_{i=1}^m α_i^j ξ^i   (j = 1, …, n).

We see: Corresponding to a linear mapping or, as we also wish to say, a linear operator A from R_x^m into R_y^n there is, with respect to two fixed coordinate systems for these spaces, a system of equations that gives the n coordinates of y = Ax as linear homogeneous expressions in the m coordinates of x. The coefficients in this system of equations form a matrix

(α_i^j)

with n rows and m columns. Conversely, by means of the above system of equations such a matrix defines a linear mapping y = Ax when we set

x = Σ_{i=1}^m ξ^i a_i and y = Σ_{j=1}^n η^j b_j.
If R_z^p is a third linear space and z = By stands for a linear mapping of R_y^n into R_z^p with the matrix (β_j^k), with respect to the above coordinate system b_1, …, b_n in R_y^n and a coordinate system c_1, …, c_p in R_z^p, the equation

z = By = BAx

obviously defines a linear mapping from R_x^m into R_z^p which is produced by composition of the linear mappings A and B. With respect to the coordinate systems a_1, …, a_m in R_x^m and c_1, …, c_p in R_z^p this mapping has as matrix the product of the matrices (β_j^k) and (α_i^j),

(γ_i^k) = (β_j^k)(α_i^j) = (Σ_{j=1}^n β_j^k α_i^j),

with p rows and m columns.
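The statement that the composite mapping has the product matrix can be sketched in code (my own example matrices; `matmul` and `apply` are illustrative helper names):

```python
def matmul(B, A):
    """(BA)_i^k = sum_j beta_j^k alpha_i^j: rows of B against columns of A."""
    return [[sum(B[k][j] * A[j][i] for j in range(len(A)))
             for i in range(len(A[0]))] for k in range(len(B))]

def apply(M, x):
    """Coordinates of the image: eta^j = sum_i alpha_i^j xi^i."""
    return [sum(M[j][i] * x[i] for i in range(len(x))) for j in range(len(M))]

A = [[1, 0, 2],
     [0, 1, 1]]        # n = 2 rows, m = 3 columns: a mapping R^3 -> R^2
B = [[1, 1],
     [2, 0],
     [0, 3]]           # p = 3 rows, n = 2 columns: a mapping R^2 -> R^3
x = [1, 2, 3]
# Composition z = B(Ax) agrees with the product matrix BA applied to x.
assert apply(B, apply(A, x)) == apply(matmul(B, A), x)
print(matmul(B, A))
```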
3.5. The linear operator space. We now consider the set of all linear operators from R_x^m into R_y^n and are going to look at them as elements of a space, to which we give a linear structure by means of the following definitions.

If A and B are two linear operators from R_x^m to R_y^n, we set A = B if

Ax = Bx

identically in x. We define the sum A + B by means of the identity

(A + B)x = Ax + Bx

and the product λA by

(λA)x = λ Ax.

With these definitions the linear operators considered obviously form a linear space, the linear operator space from R_x^m into R_y^n, whose zero element is the identically vanishing linear mapping which maps all vectors in R_x^m onto the zero vector in R_y^n.

In order to determine the dimension of this linear operator space we consider an arbitrary operator A in the space and have, with the earlier notation, for every x in R_x^m

A x = Σ_{j=1}^n (Σ_{i=1}^m α_i^j ξ^i) b_j = Σ_{i=1}^m Σ_{j=1}^n α_i^j ξ^i b_j = Σ_{i=1}^m Σ_{j=1}^n α_i^j A_i^j x,

where

A_i^j x = ξ^i b_j   (i = 1, …, m; j = 1, …, n)
are linear mappings of R_x^m into R_y^n. Since the above equations hold identically in x, we have in the operator space

A = Σ_{i=1}^m Σ_{j=1}^n α_i^j A_i^j,

from which it can be seen that the mn operators A_i^j span the entire operator space. Furthermore, because

A a_i = Σ_{j=1}^n α_i^j b_j,

the coefficients α_i^j are uniquely determined by the operator A, and the generators A_i^j are consequently linearly independent and form a basis for the operator space. This space therefore has dimension mn.

3.6. The case n = 1. The dual space. The case n = 1, where R_y^n is a one-dimensional line

y = η b,

which, if one wishes, can be identified with the real η-axis, merits special attention. The linear operator space from R_x^m into the real axis has dimension m according to the above and in this case is called the linear space dual to R_x^m. This linear space is spanned by the linearly independent operators

A^i = A_i^1:   A^i x = ξ^i b,

where, as before, for any x in R_x^m

x = Σ_{i=1}^m ξ^i a_i.

If the basis A^i, dual to a_i, of the dual operator space is denoted by a^{*i} and the operators in general by x^*, one has for every operator x^* in the dual space

x^* = Σ_{i=1}^m ξ_i^* a^{*i}

with uniquely determined real coefficients ξ_i^* = x^* a_i. If this operator is applied to x, one obtains the linear mapping

x^* x = Σ_{i=1}^m ξ_i^* ξ^i

of the space R_x^m into the real axis. According to this, to each vector x in R_x^m and to each operator or "vector" x^* in the dual space R_x^{m*} there corresponds a real number which depends linearly on x as well as on x^* and is therefore a real-valued bilinear function of these vectors. Corresponding to each fixed x_0^* of the dual operator space R_x^{m*} there is a real linear function
x_0^* x of the vector x, and each fixed x_0 in R_x^m gives a real linear function x^* x_0 of the vector operator x^* of the dual space R_x^{m*}, thus an element x_0^{**} of the likewise m-dimensional space R_x^{m**} dual to the former operator space. The correspondence

x_0 ↔ x_0^{**}

is one-to-one and linearly isomorphic. If these image elements are identified, R_x^m = R_x^{m**} becomes, conversely, the space dual to R_x^{m*}; in this sense the duality is in fact symmetric, thus motivating the term "dual".
3.7. The case n = m. Linear transformations. For n = m the spaces R_x^m and R_y^m are isomorphic: there exists a one-to-one mapping that preserves linear relationships. Such an isomorphic mapping is obviously a regular linear mapping of R_x^m onto R_y^m. For if the well-defined image of x in R_y^m is denoted by y = Ax, it follows from the invariance of linear relations that A is linear; and since the mapping is one-to-one, A is moreover regular. Conversely, any such linear mapping of R_x^m onto R_y^m provides an isomorphism between these spaces. In the case at hand the inverse linear mapping x = A^{−1} y from R_y^m onto R_x^m exists; it is likewise regular, and the identities

A^{−1} A x = x,   A A^{−1} y = y

hold.

If the isomorphic spaces R_x^m and R_y^m are identified, the linear mappings become linear self-mappings of the m-dimensional linear space R^m. We are going to call such self-mappings of R^m linear transformations and denote them by y = Tx. If such a transformation is regular, it maps R^m one-to-one onto itself, and the inverse linear transformation T^{−1} likewise exists and has the property

T^{−1} T = T T^{−1} = I,

where I stands for the identity transformation y = x. If on the other hand the transformation T is nonregular, with the p-dimensional kernel K^p = K^p(T), then T K^p = 0, and the space R^m is mapped onto the (m − p)-dimensional subspace T R^m, which is isomorphic to the factor space R^m/K^p.

With regard to regular linear transformations of the linear space R^m, observe that they obviously form a group with respect to composition or multiplication. For if T_1 and T_2 are two regular linear transformations, then the composite transformation

T x = T_1 T_2 x
is also linear and regular. Furthermore, the identity transformation I is regular and

T I = I T = T.

Finally, as has already been remarked, every regular linear transformation T has a regular linear inverse transformation T^{−1}. This group of regular linear transformations is not commutative if m > 1. On the other hand, the set of all linear transformations of R^m does not form a group. Of course, the composite of two transformations is again a linear transformation, and the identity transformation does exist. But a nonregular transformation has no inverse.
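These group properties can be sketched concretely. In the following illustration (my own; the Gauss-Jordan routine stands in for T^{−1}), matrices over the rationals represent transformations of R^2; a nonregular example shows why no inverse exists:

```python
from fractions import Fraction

def matmul(S, T):
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))] for i in range(len(S))]

def inverse(T):
    """Gauss-Jordan inversion; raises if T is nonregular (singular)."""
    n = len(T)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(T)]
    for c in range(n):
        piv = next((i for i in range(c, n) if M[i][c] != 0), None)
        if piv is None:
            raise ValueError("nonregular transformation: no inverse")
        M[c], M[piv] = M[piv], M[c]
        M[c] = [a / M[c][c] for a in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

I = [[1, 0], [0, 1]]
T1 = [[1, 1], [0, 1]]
T2 = [[2, 0], [1, 1]]
assert matmul(inverse(T1), T1) == I          # T^-1 T = I
assert matmul(T1, T2) != matmul(T2, T1)      # the group is not commutative for m > 1
try:
    inverse([[1, 2], [2, 4]])                # nonregular: kernel of dimension 1
except ValueError as e:
    print(e)                                 # prints "nonregular transformation: no inverse"
```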
3.8. Determination of all linear coordinate systems in R^m. A linear coordinate system a_1, …, a_m for R^m is mapped by a regular linear transformation T onto m vectors b_i = T a_i, which are likewise linearly independent and which therefore form a basis for R^m; and indeed one obtains all linear coordinate systems in this way. For if b_1, …, b_m are at first an arbitrary ordered system of vectors in R^m, there exists a unique linear transformation y = Tx, namely

y = Tx = Σ_{i=1}^m ξ^i b_i for x = Σ_{i=1}^m ξ^i a_i,

which transforms the a_i into the b_i. If the vectors b_i are in addition linearly independent, T is obviously regular: from y = 0 it follows that x = 0. We see: The ordered coordinate systems for the space R^m, on the one hand, and the regular linear transformations of this space, on the other hand, are in one-to-one correspondence.

Let a_i and b_i be two coordinate systems for the space R^m. Then for an arbitrary x

x = Σ_{i=1}^m ξ^i a_i = Σ_{i=1}^m η^i b_i.

According to the above, a unique regular transformation T exists such that

b_i = T a_i = Σ_{j=1}^m τ_i^j a_j,   a_i = T^{−1} b_i = Σ_{j=1}^m σ_i^j b_j,

where, because T^{−1} T = T T^{−1} = I,

Σ_{j=1}^m σ_i^j τ_j^k = Σ_{j=1}^m τ_i^j σ_j^k = δ_i^k.

If the expressions for a_i and b_i are substituted into the above representations of x, the formulas for the linear transformation of the coordinates are obtained:

η^j = Σ_{i=1}^m σ_i^j ξ^i,   ξ^j = Σ_{i=1}^m τ_i^j η^i.
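A small numeric check of the coordinate-transformation formulas (my own sketch; the basis vectors and the exact solver are illustrative): given a second basis b_i, the new coordinates η^i of a fixed x are obtained by solving the linear system Σ η^i b_i = x.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M c = rhs exactly by Gaussian elimination (M square, regular)."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        piv = next(i for i in range(c, n) if A[i][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for i in range(n):
            if i != c and A[i][c] != 0:
                A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[c])]
    return [row[n] for row in A]

# Two coordinate systems for R^2: the standard basis a_i and a second basis b_i.
b1, b2 = [1, 1], [1, -1]
x = [3, 1]                          # coordinates xi^i with respect to a_i
# Coordinates eta^i with respect to b_i: solve eta^1 b1 + eta^2 b2 = x.
eta = solve([[b1[0], b2[0]], [b1[1], b2[1]]], x)
print([int(v) for v in eta])        # [2, 1]
assert [eta[0] * u + eta[1] * v for u, v in zip(b1, b2)] == x
```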
3.9. Affine transformations. Besides the linear transformations of the space R^m we consider somewhat more generally the affine transformations of this space. Such a transformation results from the composition of a linear transformation Tx with a translation of the space. A translation of the space R^m is a single-valued mapping A(x) of this space onto itself such that every vector [x_1, x_2], with initial point x_1 and end point x_2, is transformed into a congruent vector [A(x_1), A(x_2)]. Consequently, we have

A(x_1) − x_1 = A(x_2) − x_2.

If one sets x_1 = 0, x_2 = x, it follows from this that the translation is necessarily of the form

A(x) = x_0 + x,

where x_0 = A(0). This is the general form of a translation: for an arbitrary vector x_0 the expression A(x) = x_0 + x possesses the property required by the definition. The general affine transformation is thus

A(x) = x_0 + T x,

where Tx is a linear transformation of the space R^m. Those affine transformations for which the linear transformation T is regular form a group. In this group the set of translations is a commutative subgroup.
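The closure of affine transformations under composition can be verified directly: composing A_1(x) = x_01 + T_1 x with A_2 gives an affine map with translation x_02 + T_2 x_01 and linear part T_2 T_1. A sketch with invented example data:

```python
def matvec(T, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in T]

def matmul(S, T):
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))] for i in range(len(S))]

def affine(x0, T):
    """The affine transformation A(x) = x0 + Tx as a closure."""
    return lambda x: [a + b for a, b in zip(x0, matvec(T, x))]

x01, T1 = [1, 2], [[0, -1], [1, 0]]     # rotation plus translation
x02, T2 = [3, 0], [[2, 0], [0, 2]]      # scaling plus translation
A1, A2 = affine(x01, T1), affine(x02, T2)
# The composite A2(A1(x)) is again affine, with translation x02 + T2 x01
# and linear part T2 T1.
C = affine([a + b for a, b in zip(x02, matvec(T2, x01))], matmul(T2, T1))
x = [4, -1]
assert A2(A1(x)) == C(x)
# A pure translation (T = I) satisfies A(x1) - x1 = A(x2) - x2 for all x1, x2.
Tr = affine([5, 7], [[1, 0], [0, 1]])
assert [a - b for a, b in zip(Tr([1, 0]), [1, 0])] == \
       [a - b for a, b in zip(Tr([4, 3]), [4, 3])] == [5, 7]
```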
3.10. Exercises. 1. We start from the matrix (α_i^j) introduced in 3.4, with n rows and m columns, and with the notation used there define the "column vectors"

y_i = Σ_{j=1}^n α_i^j b_j   (i = 1, …, m)

in R_y^n and the "row vectors"

x_j = Σ_{i=1}^m α_i^j a_i   (j = 1, …, n)

in R_x^m. The transpose

(α_j^i)

has, on the other hand, the column vectors x_j and the row vectors y_i. By means of the equations

A a_i = y_i   (i = 1, …, m),   A^* b_j = x_j   (j = 1, …, n)

we define in R_x^m and R_y^n the linear mappings A and A^* and denote their kernels, of dimensions p ≤ m and q ≤ n, respectively, by K_x^p = K_x^p(A) and K_y^q = K_y^q(A^*); therefore, A x = 0 or A^* y = 0 precisely when x lies in K_x^p or y lies in K_y^q. Prove that

dim A R_x^m = m − p = n − q = dim A^* R_y^n.

Proof. A is a regular linear mapping of the factor space R_x^m/K_x^p and consequently dim A R_x^m = dim A(R_x^m/K_x^p) = m − p; for the same reason dim A^* R_y^n = dim A^*(R_y^n/K_y^q) = n − q. In order to prove that m − p = n − q, show that A R_x^m and K_y^q(A^*) are linearly independent subspaces in R_y^n. In fact: If y is a vector common to these subspaces, one has

y = A x = Σ_{i=1}^m ξ^i y_i = Σ_{j=1}^n η^j b_j

with

A^* y = Σ_{j=1}^n η^j x_j = 0,

and consequently

η^j = Σ_{i=1}^m α_i^j ξ^i with Σ_{j=1}^n α_i^j η^j = 0.

Thus

0 = Σ_{i=1}^m ξ^i Σ_{j=1}^n α_i^j η^j = Σ_{j=1}^n η^j Σ_{i=1}^m α_i^j ξ^i = Σ_{j=1}^n (η^j)²,

and therefore y = 0. According to this it follows from A^* A x = 0 that A x = 0, and A is therefore a regular mapping of the factor space R_x^m/K_x^p into the factor space R_y^n/K_y^q, and thus m − p ≤ n − q. The linear independence of the subspaces A^* R_y^n and K_x^p(A) in R_x^m is established in the same way, from which the converse n − q ≤ m − p follows. Therefore, m − p = n − q, which was to be proved.

Remark 1. The matrix invariant

r = m − p = n − q,

which by the above is independent of the choice of the coordinate systems a_i and b_j, is called the rank of the matrix (α_i^j). It indicates the number of linearly independent column vectors and row vectors. If m = n, the matrix is said to be square. The square matrix (α_i^j) is symmetric if α_i^j = α_j^i, that is, if the matrix is identical with its transpose. Further, a square, not necessarily symmetric matrix is called regular if the rank of the matrix is r = m.
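The rank invariant m − p = n − q says in matrix terms that a matrix and its transpose have the same rank. A sketch with an invented 3 x 4 example:

```python
from fractions import Fraction

def rank(rows):
    """Row rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]                      # n = 3 rows, m = 4 columns; row 3 = row 1 + row 2
At = [list(col) for col in zip(*A)]     # transpose: column and row vectors exchanged
n, m = len(A), len(A[0])
r = rank(A)
p, q = m - r, n - r                     # kernel dimensions of A and A*
print(r, p, q)                          # 2 2 1
assert rank(At) == r                    # independent rows = independent columns
assert m - p == n - q == r
```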
Remark 2. It follows from the above that A R_x^m and K_y^q(A^*) are linearly independent complements in R_y^n and that the same is true for A^* R_y^n and K_x^p(A) in R_x^m:

R_y^n = A R_x^m + K_y^q(A^*),   R_x^m = A^* R_y^n + K_x^p(A).

2. The results of exercise 1 contain the complete theory of systems of real linear equations. Verify the following main theorems:

a. If the coefficient matrix in the linear homogeneous system of equations

Σ_{i=1}^m α_i^j ξ^i = 0   (j = 1, …, n)

is of rank r, the system has precisely m − r linearly independent solution vectors x = Σ_{i=1}^m ξ^i a_i.

b. For the solvability of the corresponding nonhomogeneous system

Σ_{i=1}^m α_i^j ξ^i = β^j   (j = 1, …, n)

it is necessary and sufficient that for every solution y = Σ_{j=1}^n η^j b_j of the transposed homogeneous system

Σ_{j=1}^n α_i^j η^j = 0   (i = 1, …, m)

the equation

Σ_{j=1}^n β^j η^j = 0

holds. Then if x_0 = Σ_{i=1}^m ξ_0^i a_i is a particular solution of the nonhomogeneous system, one obtains the general solution by adding the general solution of the corresponding homogeneous system of equations.

Proof. The condition named in theorem b is obviously necessary. That it also is sufficient follows from remark 2 of exercise 1, according to which b = Σ_{j=1}^n β^j b_j can be uniquely decomposed into two components

b = c + d with c = Σ_{j=1}^n γ^j b_j in A R_x^m and A^* d = 0.

Therefore, d = Σ_{j=1}^n δ^j b_j is a solution vector of the transposed homogeneous system of equations, and there exists an x = Σ_{i=1}^m ξ^i a_i in R_x^m with the property c = A x, i.e., γ^j = Σ_{i=1}^m α_i^j ξ^i. Now since for every solution y = Σ_{j=1}^n η^j b_j of the transposed homogeneous system, because

Σ_{j=1}^n γ^j η^j = Σ_{i=1}^m ξ^i Σ_{j=1}^n α_i^j η^j = 0,

the equation

Σ_{j=1}^n β^j η^j = Σ_{j=1}^n δ^j η^j

holds and the left hand side is by hypothesis always = 0, we have for η^j = δ^j

Σ_{j=1}^n δ^j η^j = Σ_{j=1}^n (δ^j)² = 0,

and therefore d = 0 and b = c.

3. Let T be a linear transformation in R^m. Show that T is regular if and only if the matrix corresponding to T with respect to an arbitrary coordinate system is regular (i.e., the matrix is of rank m).
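Theorem b can be tested on a small example. In the sketch below (my own invented system), a rank-1 matrix is paired with one right-hand side that is orthogonal to the solutions of the transposed homogeneous system and one that is not:

```python
from fractions import Fraction

def solve_consistent(A, b):
    """Attempt to solve Ax = b by elimination; return a particular solution or None."""
    n, m = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(A, b)]
    pivots, r = [], 0
    for c in range(m):
        piv = next((i for i in range(r, n) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [a / M[r][c] for a in M[r]]
        for i in range(n):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * v for a, v in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(all(v == 0 for v in row[:m]) and row[m] != 0 for row in M):
        return None                      # inconsistent system
    x = [Fraction(0)] * m
    for row_i, c in enumerate(pivots):
        x[c] = M[row_i][m]
    return x

A = [[1, 2],
     [2, 4]]                 # rank 1
y = [2, -1]                  # solves the transposed homogeneous system:
assert all(sum(A[j][i] * y[j] for j in range(2)) == 0 for i in range(2))
b_good = [1, 2]              # orthogonal to y: 2*1 + (-1)*2 = 0  -> solvable
b_bad = [1, 1]               # not orthogonal to y               -> unsolvable
assert solve_consistent(A, b_good) is not None
assert solve_consistent(A, b_bad) is None
```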
4. Let x^* = L_1, …, x^* = L_m be elements of the space R_x^{m*} dual to R_x^m. Show that they are linearly independent precisely when the system of equations

L_i x = 0   (i = 1, …, m)

has the vector x = 0 as its only solution in R_x^m. It follows from this that x = 0 provided L x = 0 for all operators x^* = L of the dual space.

5. Let A = (α_i^j), B = (β_i^j) be matrices, A with n rows and m columns, B with m rows and p columns. We denote the transposes by A' and B'. Verify that:

a. (A B)' = B' A'.

b. If n = m and A is regular, i.e., of rank m, the matrix A^{−1} having the property

A A^{−1} = A^{−1} A = I

exists, where I is the unit matrix (δ_i^j) (δ_i^j is the Kronecker delta). Furthermore, one then has

(A^{−1})' = (A')^{−1}.

6. Let Tx be a linear transformation of the linear space R_x^m. Prove: The subspaces K(T) and T R_x^m are linearly independent, with

R_x^m = K(T) + T R_x^m,

if and only if K(T) = K(T²).
7. Let U and V be linearly independent complements of the space R_x^m, so that for every x in R_x^m

x = u + v

with uniquely determined vectors u in U and v in V. Show that u = Px, v = Qx are linear transformations with the property

P² x = P x,   Q² x = Q x.

Px is called the projection of x onto U in the "direction" V, and Qx = v the projection of x onto V in the direction U.

8. Show conversely: If Px is a linear transformation of R_x^m with the property P² x = P x, there exist uniquely determined linearly independent complements U and V of R_x^m such that u = Px is the projection of x onto U in the direction V and v = x − Px = Qx is the projection of x onto V in the direction U.

9. Let T be a linear transformation in R_x^m and λ a real number. Show that the set of all solutions to the equation

T x = λ x

is a subspace of R_x^m. If the dimension of this subspace is d > 0, λ is called an eigenvalue of T of multiplicity d; the solutions x are the associated eigenvectors and the subspace the eigenspace belonging to λ.

10. Show that the transformation T, taking into account the multiplicity of the eigenvalues, can have at most m eigenvalues. Hint. The eigenspaces belonging to the various eigenvalues are linearly independent.

11. Let T x = P x be the projection onto U in the direction V. Determine the eigenvalues and the eigenspaces of P.
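Exercises 7, 8 and 11 can be illustrated with a concrete projection. In this sketch (my own choice of U and V in R^2), P projects onto U = span{(1, 0)} in the direction V = span{(1, 1)}; the idempotence P² = P and the eigenvalues 1 and 0 are checked directly:

```python
def matvec(T, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in T]

def matmul(S, T):
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))] for i in range(len(S))]

# x = u + v with u in U = span{(1,0)}, v in V = span{(1,1)}
# gives u = Px and v = Qx = x - Px.
P = [[1, -1],
     [0, 0]]
Q = [[0, 1],
     [0, 1]]
assert matmul(P, P) == P                   # P^2 = P (exercise 7)
assert matmul(Q, Q) == Q                   # Q^2 = Q
x = [3, 2]
u, v = matvec(P, x), matvec(Q, x)
assert [a + b for a, b in zip(u, v)] == x  # x = u + v
# Eigenvalues of P (exercise 11): Pu = u on U (eigenvalue 1), Pv = 0 on V (eigenvalue 0).
assert matvec(P, [1, 0]) == [1, 0]
assert matvec(P, [1, 1]) == [0, 0]
```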
§ 4. Bilinear and Quadratic Functions

4.1. Real bilinear and quadratic functions. Let R_x^m, R_y^n and R_z^p be three linear spaces and B a mapping that assigns to each ordered pair (x, y) an element z:

(x, y) → z.

The mapping is called bilinear provided it is linear in both arguments x, y, which is to be indicated by the notation

z = B x y.

In the following we shall restrict ourselves to the case p = 1 and discuss, which is then no further restriction, only real-valued bilinear functions B x y.

We additionally assume n = m. The argument spaces R_x^m and R_y^m are then linearly isomorphic. If vectors which correspond to one another in some isomorphic mapping of these spaces are identified, we are in what follows treating real bilinear functions of the vectors x and y, which vary independently in the m-dimensional linear space R^m. If a coordinate system is fixed in R^m in which

x = Σ_{i=1}^m ξ^i a_i and y = Σ_{j=1}^m η^j a_j,

it follows from the bilinearity of B that

B x y = Σ_{i,j=1}^m ξ^i η^j B a_i a_j = Σ_{i,j=1}^m β_{ij} ξ^i η^j

becomes a bilinear form of the coordinates ξ^i and η^j with real coefficients. Conversely, every such form with arbitrary real coefficients defines, if ξ^i and η^j are interpreted as coordinates with respect to a linear coordinate system, a real bilinear function on R^m. The square matrix of the coefficients

β_{ij} = B a_i a_j

is called the matrix of the bilinear form with respect to the fixed coordinate system.

For y = x the bilinear function B x y becomes the associated quadratic function or form

B x x = B x² = Σ_{i,j=1}^m β_{ij} ξ^i ξ^j,

which is related to the generating bilinear function through the polarization identity

B(x + y)² − B(x − y)² = 2(B x y + B y x).

The bilinear function is symmetric if

B y x = B x y

and alternating if

B y x = − B x y.

Any bilinear function B can be represented in a unique way as the sum of a symmetric bilinear function S and an alternating function A:

B x y = S x y + A x y,

where

S x y = ½ (B x y + B y x),   A x y = ½ (B x y − B y x).
The quadratic function generated by an alternating bilinear function vanishes identically, and it follows from the polarization identity that the converse is also true. The symmetric part S of a bilinear function B generates the same quadratic function as B itself and, according to the polarization identity,

B(x + y)² − B(x − y)² = 4 S x y.

The symmetric part S is therefore uniquely determined by the quadratic function B x².
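The polarization identity and the symmetric/alternating decomposition can be checked numerically. A sketch with an arbitrary coefficient matrix of my own choosing:

```python
# A bilinear function on R^2 given by its coefficient matrix beta_ij = B a_i a_j.
beta = [[1, 3],
        [-1, 2]]

def B(x, y):
    return sum(beta[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def S(x, y): return (B(x, y) + B(y, x)) / 2     # symmetric part
def A(x, y): return (B(x, y) - B(y, x)) / 2     # alternating part

x, y = [1, 2], [-3, 1]
add = [a + b for a, b in zip(x, y)]
sub = [a - b for a, b in zip(x, y)]
assert B(x, y) == S(x, y) + A(x, y)
# Polarization: B(x+y)^2 - B(x-y)^2 = 2(Bxy + Byx) = 4 Sxy.
assert B(add, add) - B(sub, sub) == 2 * (B(x, y) + B(y, x)) == 4 * S(x, y)
# The quadratic form of the alternating part vanishes identically.
assert A(x, x) == 0 and A(y, y) == 0
```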
A quadratic function is called positive or negative definite if it vanishes only for x = 0 and otherwise assumes only positive or negative values, respectively. It is semidefinite if it also vanishes for certain vectors x ≠ 0, but is otherwise positive or negative, respectively. It is indefinite if it assumes positive as well as negative values. We also use the corresponding terminology for the generating symmetric bilinear function.

4.2. The inertia theorem. Let B x² be a real quadratic function in the space R^m and B x y the generating symmetric bilinear function; consequently,

B(x + y)² − B(x − y)² = 4 B x y.

We prove the following
Inertia theorem. In R^m there exist coordinate systems e_1, …, e_m such that

(*)   B e_i² = +1 or = −1 or = 0,   B e_i e_j = 0 for i ≠ j.

Then if B e_i² = +1 for i = 1, …, p; B e_i² = −1 for i = p + 1, …, p + q; and B e_i² = 0 for i = p + q + 1, …, p + q + r = m; and if U^p, V^q, W^r are the linearly independent subspaces spanned by these three sets of vectors:

R^m = U^p + V^q + W^r,

the following invariances hold: The dimensions p, q, r of the subspaces named are invariant numbers characteristic of the function B for every coordinate system with the properties (*), and indeed the subspace W^r is itself invariant.
Before we go on to the proof, with a view to later metric concepts, we introduce the following terminology. Two vectors x and y are said to be orthogonal to one another with respect to the given symmetric bilinear function B if

B x y = B y x = 0.

Further, x is called a positive or negative vector according as B x² > 0 or B x² < 0, and a "unit vector" if B x² = ±1, while x is a "null vector" provided B x² = 0; all with respect to B. A positive or negative vector can be normalized to a unit vector by multiplying with

λ = 1/√|B x²|.

By this, a coordinate system with the properties (*) is orthogonal and normalized, for short orthonormal, with respect to B. In such a coordinate system a symmetric bilinear function has an especially simple form. Namely, if

x = Σ_{i=1}^m ξ^i e_i,   y = Σ_{i=1}^m η^i e_i,

then

B x y = Σ_{i=1}^p ξ^i η^i − Σ_{i=p+1}^{p+q} ξ^i η^i.

The subspaces U^p, V^q, W^r with vectors

u = Σ_{i=1}^p ξ^i e_i,   v = Σ_{i=p+1}^{p+q} ξ^i e_i,   w = Σ_{i=p+q+1}^{p+q+r} ξ^i e_i

are pairwise orthogonal in that

B u v = B u w = B v w = 0,

and for every x in R^m the representation x = u + v + w is unique. Furthermore,

B u² = Σ_{i=1}^p (ξ^i)²,   B v² = − Σ_{i=p+1}^{p+q} (ξ^i)²,   B w² = 0,

and B is therefore positive definite in U^p, negative definite in V^q, while W^r contains nothing but null vectors.
4.3. First proof of the inertia theorem. If R^m contains only null vectors, it follows from the polarization identity that B x y ≡ 0. All vectors are orthogonal to one another with respect to B, and any coordinate system satisfies the inertia theorem. We have U^p = V^q = 0, and R^m reduces to the null space W^r = W^m.

If this is not the case, then R^m contains positive or negative vectors, consequently also unit vectors. Let e_1, say, be a positive unit vector, so that B e_1² = 1. Then the set of all vectors x_1 orthogonal to e_1 (B e_1 x_1 = B x_1 e_1 = 0) is obviously a subspace. We claim that this subspace and the one-dimensional subspace (e_1) generated by e_1 are linearly independent complements in R^m. In fact, every vector x can be decomposed in a unique way into components

x = ξ¹ e_1 + x_1

from these subspaces; for from

0 = B x_1 e_1 = B(x − ξ¹ e_1) e_1 = B x e_1 − ξ¹,

ξ¹ is uniquely determined to be

ξ¹ = B x e_1.

Here ξ¹ e_1 is the orthogonal projection of the vector x onto e_1 and x_1 the projecting normal with respect to B. The orthogonal and linearly independent complement to (e_1), which consists of these normals, has dimension m − 1 and can be denoted by R^{m−1}. One proceeds with R^{m−1} just as above with R^m and continues the procedure until a subspace R^r of dimension r ≥ 0 is reached which contains nothing but null vectors and in which, hence, by the polarization identity, B x y ≡ 0. One has then found in turn m − r positive or negative pairwise orthogonal unit vectors, which are completed, with an arbitrary coordinate system for the null space R^r orthogonal to these unit vectors, to a complete coordinate system which is orthonormal with respect to B. If p ≥ 0 of these vectors are positive and q ≥ 0 are negative, then p + q + r = m; the positive unit vectors generate a p-dimensional subspace U^p where B is positive definite; in the q-dimensional subspace V^q generated by the negative unit vectors B is negative definite; and W^r = R^r contains nothing but null vectors. This establishes the existence of a coordinate system of the kind required in the inertia theorem, and it remains to prove the asserted invariances.

First, the invariance of the null space R^r = W^r follows from the fact that this space contains precisely those vectors which are orthogonal to every R^m-vector x with respect to B. In fact, according to the above, for an arbitrary x = u + v + w and a w_0 from W^r

B x w_0 = B u w_0 + B v w_0 + B w w_0 = 0.

Conversely, if the identity B x y_0 ≡ 0 holds in R^m for a y_0 = u_0 + v_0 + w_0, then for x = u_0 one has

0 = B u_0 y_0 = B u_0² + B u_0 v_0 + B u_0 w_0 = B u_0²,

and therefore, since B is positive definite in U^p, u_0 = 0. In the same way it follows that v_0 = 0, and consequently y_0 = w_0, which proves the invariance of the null space W^r.¹

¹ The invariant null space contains nothing but null vectors, to be sure, but in general by no means all null vectors. In fact, if x = u + v + w, then

B x² = B u² + B v² = 0

precisely when −B v² = B u². Only if B x² vanishes identically or is semidefinite does it follow from the above that u = v = 0 and therefore x = w.
On the other hand, the positive space U^p and the negative space V^q are in general not invariant as subspaces; however, their dimensions p and q are. To see this, we consider a second decomposition of the required kind,

R^m = Ū^p̄ + V̄^q̄ + W̄^r̄,

where by the above W̄^r̄ = W^r. Let the dimensions of Ū and V̄ be p̄ and q̄; the claim is that p̄ = p and q̄ = q. In fact, for an arbitrary u in U^p one has uniquely

u = ū + v̄ + w̄

by virtue of the second decomposition. Here ū = A u is obviously a linear mapping of the space U^p into the space Ū^p̄, and indeed a regular mapping. For from ū = A u = 0 it follows that u = v̄ + w̄ and hence

B u² = B(v̄ + w̄)² = B v̄² + 2 B v̄ w̄ + B w̄² = B v̄² ≤ 0;

thus, because B is positive definite in U^p, B u² = 0 and u = 0. But then by virtue of 3.3 one must have p ≤ p̄, and since the converse also holds for reasons of symmetry, one has p̄ = p and consequently also q̄ = m − p̄ − r = m − p − r = q, with which the invariance claims of the inertia theorem have been established.
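The invariance of p and q can be illustrated numerically (my own sketch, not from the text): for the indefinite form B x y = ξ¹η¹ − ξ²η² on R², two different B-orthonormal bases yield the same counts p, q, r. The second basis uses the rational point (5/4, 3/4) on the unit hyperbola.

```python
from fractions import Fraction as F

# The symmetric bilinear function B x y = xi^1 eta^1 - xi^2 eta^2 on R^2.
def B(x, y):
    return x[0] * y[0] - x[1] * y[1]

def signature(basis):
    """Count p (B e^2 = +1), q (B e^2 = -1), r (B e^2 = 0) for a basis."""
    sq = [B(e, e) for e in basis]
    return sq.count(1), sq.count(-1), sq.count(0)

e = [[F(1), F(0)], [F(0), F(1)]]                 # standard basis
f = [[F(5, 4), F(3, 4)], [F(3, 4), F(5, 4)]]     # a second B-orthonormal basis
for basis in (e, f):                             # both are orthonormal w.r.t. B:
    assert B(basis[0], basis[1]) == 0
    assert all(B(v, v) in (1, -1) for v in basis)
# The inertia theorem: p, q, r agree for every such coordinate system.
assert signature(e) == signature(f) == (1, 1, 0)
```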
4.4. E. Schmidt's orthogonalization process. Second proof of the inertia theorem. We give in addition a second variant of the above proof, which is no shorter, but which does give rise to considerations that are of interest in themselves.

Consider first the case where B is definite in R^m, e.g. positive definite. Under this hypothesis we shall construct an orthonormal system with respect to B, starting from an arbitrary coordinate system

a_1, …, a_m.

Since a_1 ≠ 0, B a_1² > 0. If the real number λ_11 ≠ 0 is defined by

λ_11² = B a_1²,

the equation

a_1 = λ_11 e_1,

because λ_11 ≠ 0, yields a positive unit vector e_1. Then project a_2 onto e_1 and thus determine the number λ_21 so that

B(a_2 − λ_21 e_1) e_1 = 0,

from which

λ_21 = B a_2 e_1

follows. Since a_1 and a_2 are linearly independent, so also are e_1 and a_2; consequently, the normal a_2 − λ_21 e_1 ≠ 0, and B(a_2 − λ_21 e_1)² > 0. The roots of the equation

λ_22² = B(a_2 − λ_21 e_1)²

are thus real and ≠ 0, so that a positive unit vector e_2 orthogonal to e_1 is determined by

a_2 = λ_21 e_1 + λ_22 e_2.

In the third step we project a_3 onto the subspace (e_1, e_2) = (a_1, a_2) and thus determine the numbers λ_31 and λ_32 so that

B(a_3 − λ_31 e_1 − λ_32 e_2) e_i = 0

for i = 1, 2; they are

λ_3i = B a_3 e_i.

Because of the linear independence of the vectors a_1, a_2 and a_3, the vectors e_1, e_2 and a_3 are also linearly independent, and consequently the normal a_3 − λ_31 e_1 − λ_32 e_2 of a_3 on (e_1, e_2) = (a_1, a_2) is different from zero. Therefore B(a_3 − λ_31 e_1 − λ_32 e_2)² > 0. We determine λ_33 ≠ 0 from

λ_33² = B(a_3 − λ_31 e_1 − λ_32 e_2)²

and define by means of

a_3 = λ_31 e_1 + λ_32 e_2 + λ_33 e_3

a third unit vector e_3 which is orthogonal to e_1 and e_2. Continuing in this way, one obtains a system of equations for determining the orthonormal system e_1, …, e_m:

a_1 = λ_11 e_1
a_2 = λ_21 e_1 + λ_22 e_2
a_3 = λ_31 e_1 + λ_32 e_2 + λ_33 e_3
. . . . . . . . . .
a_m = λ_m1 e_1 + λ_m2 e_2 + λ_m3 e_3 + … + λ_mm e_m,

where for j < i ≤ m

λ_ij = B a_i e_j

and for i = 1, …, m

λ_ii² = B(a_i − λ_i1 e_1 − … − λ_i,i−1 e_{i−1})².

For every i, (e_1, …, e_i) = (a_1, …, a_i), and by successive solution of the above system of equations one obtains

e_1 = μ_11 a_1
e_2 = μ_21 a_1 + μ_22 a_2
e_3 = μ_31 a_1 + μ_32 a_2 + μ_33 a_3
. . . . . . . . . .
e_m = μ_m1 a_1 + μ_m2 a_2 + μ_m3 a_3 + … + μ_mm a_m.

That completes the orthogonalization procedure of E. Schmidt.

Now let B be an arbitrary real symmetric bilinear function on R^m which does not vanish identically. Then if B x² assumes positive values,
for example, let Ut be a maximal positive space, that is, a subspace of the highest possible dimension p where B is positive definite. In this subspace we can by means of the Schmidt orthogonalization procedure construct a coordinate system
which is orthonormal with respect to B. We thus have B et ej = 0 for
i*j,Be?=+land U1'=(es,...,eP).
We are now going to project the R"-vector x onto U' and determine the corresponding normal of x to UP. It is thus a question of decomposing x into two components P
x=EE'e1+n=no+it
r_, in such a way that the normal it is orthogonal to all vectors it and con-
sequently B it it = 0 for every it in UP. For this it is necessary and sufficient for is to be orthogonal to all of the generators ej of UP: P
Bne, = B(x - Z 'er)ej= Bxej-0. -r
Accordingly, the projection is t,
troe,Bxeti rvt
and it. = x - no. Note that this projection as well as the corresponding normal is uniquely determined by x and UP and is thus independent of the choice of the orthonormal system er in UP. For if
x= uo+n', is a second decomposition of the required kind then the UP-vector fro - uo = it - n' is orthogonal to UP and in particular to itself, and consequently B (it; - uo)' = 0. Since B is definite in UP, we must have it; = uo and as a consequence also n' = is. The set of all UP-normals is obviously form a subspace N"'-P, the linearly independent orthogonal complement of UP in R"' with respect to
B. Because of the maximality of U^p, N^{m−p} can include no positive vectors; for if B n^2 > 0, then for every u in U^p we would have
    B (u + n)^2 = B u^2 + 2 B u n + B n^2 = B u^2 + B n^2 ≥ B n^2 > 0 ,
and B would therefore be positive definite in the (p + 1)-dimensional space generated by U^p and n. If N^{m−p} contains negative vectors, let V^q be a maximal negative subspace of dimension q in N^{m−p} and e_{p+1}, ..., e_{p+q}
I. Linear Algebra
an orthonormal coordinate system constructed by means of the Schmidt orthogonalization process, so that hence B e_i e_j = 0 for i ≠ j and
    B e_i^2 = −1 .
Then if W^r is the orthogonal complement of V^q in N^{m−p} constructed by the above method, W^r contains nothing but null vectors and is of dimension r = m − p − q. One has R^m = U^p + V^q + W^r, and if one adds an arbitrary basis
    e_{p+q+1}, ..., e_{p+q+r} = e_m
for the subspace W^r to the above p + q vectors, a coordinate system has been constructed in R^m that satisfies the requirements of the inertia theorem. The invariance of the space W^r and of the dimensions p and q was proved in the preceding section.
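In matrix language, the construction amounts to finding a basis in which the Gram matrix of B becomes diagonal with p entries +1, q entries −1 and r zeros. A minimal numerical sketch (the matrix B below, and the use of an eigendecomposition in place of the Schmidt procedure, are assumptions of this illustration):

```python
import numpy as np

def inertia_basis(B, tol=1e-12):
    """Return a basis matrix E and the signature (p, q, r) such that
    E^T B E is diagonal with entries +1 (p times), -1 (q times), 0 (r times)."""
    w, V = np.linalg.eigh(B)                 # B = V diag(w) V^T, B symmetric
    # order eigenvectors: positive eigenvalues first, then negative, then zero
    key = np.where(w > tol, 0, np.where(w < -tol, 1, 2))
    order = np.argsort(key, kind="stable")
    w, V = w[order], V[:, order]
    # rescale so nonzero diagonal entries become exactly +1 or -1
    scale = np.ones_like(w)
    nz = np.abs(w) > tol
    scale[nz] = 1.0 / np.sqrt(np.abs(w[nz]))
    E = V * scale
    p = int(np.sum(w > tol)); q = int(np.sum(w < -tol))
    return E, (p, q, len(w) - p - q)

B = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])              # assumed example, degenerate
E, (p, q, r) = inertia_basis(B)
D = E.T @ B @ E                              # the normal form of the inertia theorem
```

Here p, q and r are invariants of B, while the basis E realizing the normal form is far from unique.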
4.5. Orthogonal transformations. In connection with the inertia theorem we add a few supplementary remarks. The first concerns the determination of all coordinate systems for the space R^m that are orthonormal with respect to the symmetric bilinear function B. Thus let ē_i be a second basis besides e_i which is orthonormal with respect to B. According to the inertia theorem this basis likewise contains p positive and q negative unit vectors and r null vectors. The null vectors span the same null space W^r as the r null vectors e_i. If one further orders the vectors in both systems, for example so that the positive vectors are written first, then the negative ones and the null vectors last, one has for all indices i, j = 1, ..., m
    B ē_i ē_j = B e_i e_j .
For any ordering of this kind, according to 3.8 there exists a uniquely determined linear transformation
    x̄ = T x
which maps e_i onto ē_i. Since both systems of vectors are linearly independent, this transformation is regular. Then if
    x = Σ_{i=1}^m ξ^i e_i ,    y = Σ_{i=1}^m η^i e_i ,
we have
    x̄ = T x = Σ_{i=1}^m ξ^i ē_i ,    ȳ = T y = Σ_{i=1}^m η^i ē_i ,
and
    B x̄ ȳ = Σ_{i,j=1}^m ξ^i η^j B ē_i ē_j = Σ_{i,j=1}^m ξ^i η^j B e_i e_j = B x y ,
and thus B (T x) (T y) = B x y. The symmetric bilinear function B x y is thus invariant with respect to the regular linear transformation T. Such a linear transformation of the space R^m is said to be orthogonal with respect to B. If, conversely, T is an arbitrary regular transformation that leaves
B x y invariant and maps e_i onto T e_i = ā_i, then the m vectors ā_i, because of the regularity of T, are linearly independent and moreover, as a consequence of the equations
    B ā_i ā_j = B (T e_i) (T e_j) = B e_i e_j ,
orthonormal relative to B. They hence form a basis for the space R^m that is orthonormal relative to B. We see: If we understand a linear transformation T to be orthogonal with respect to the symmetric bilinear function B x y whenever it is, first, regular and, second, leaves the latter bilinear function invariant, then all ordered coordinate systems for R^m that are orthonormal relative to B, on the one hand, and all transformations of this space which are orthogonal relative to B, on the other, are in one-to-one correspondence.
The linear transformations T of the space R^m which are orthogonal for a symmetric bilinear function B obviously form a group of transformations. It is a subgroup of the group, mentioned in 3.7, of all regular linear transformations of the space.
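For an indefinite B the B-orthogonal transformations are in general not orthogonal in the ordinary sense. A small numerical check (the hyperbolic rotations below, for B = diag(1, −1), are an assumed example): the invariance B (T x) (T y) = B x y is equivalent to T'B T = B, and it is preserved under products, as the group property requires.

```python
import numpy as np

def boost(t):
    # hyperbolic rotation: orthogonal with respect to B = diag(1, -1)
    return np.array([[np.cosh(t), np.sinh(t)],
                     [np.sinh(t), np.cosh(t)]])

B = np.diag([1.0, -1.0])
T1, T2 = boost(0.3), boost(-1.1)
# B(Tx)(Ty) = Bxy for all x, y is equivalent to T^T B T = B
ok1 = np.allclose(T1.T @ B @ T1, B)
# group property: the product of two B-orthogonal transformations is B-orthogonal
P = T1 @ T2
ok2 = np.allclose(P.T @ B @ P, B)
```

In this one-parameter family the group law is simply boost(s) boost(t) = boost(s + t).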
4.6. Degenerate bilinear functions. Let B x y be a real, not necessarily symmetric, bilinear function on the linear space R^m. Those vectors y for which
    B x y = 0
holds identically in x obviously form a subspace in R^m. Provided the dimension r of this space is positive, we say that B x y is r-fold degenerate
with respect to y. Thus if B x y is not degenerate with respect to y, it follows from the above identity in x that y = 0. If in an arbitrary coordinate system a_i
    x = Σ_{i=1}^m ξ^i a_i ,    y = Σ_{j=1}^m η^j a_j ,
and hence
    B x y = Σ_{i,j=1}^m ξ^i η^j B a_i a_j = Σ_{i,j=1}^m β_ij ξ^i η^j ,
then the fact that B is r-fold degenerate with respect to y is equivalent with the linear homogeneous system of equations
    Σ_{j=1}^m β_ij η^j = 0    (i = 1, ..., m)
having precisely r linearly independent solution vectors y. Since the transposed system of equations
    Σ_{i=1}^m β_ij ξ^i = 0    (j = 1, ..., m)
then likewise has precisely r linearly independent solution vectors x (cf. 3.10, exercises 1 and 2), we see: The bilinear function B x y is r-fold degenerate with respect to y as well as with respect to x, and hence simply r-fold degenerate, if the rank of the matrix (β_ij) is m − r. In particular, B is not degenerate precisely when this rank is m and the matrix is therefore regular. If B is symmetric, then B is obviously r-fold degenerate precisely when the dimension of the null space W^r mentioned in the inertia theorem is equal to r.
Note in addition that the polarized symmetric bilinear function corresponding to a semidefinite quadratic function B x^2 is obviously always degenerate. A nondegenerate symmetric bilinear function always generates a quadratic function that is definite or indefinite, never semidefinite.
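Numerically, the criterion of this subsection reduces to a rank computation; the matrix (β_ij) below is an assumed example of a once-degenerate (r = 1) symmetric bilinear function:

```python
import numpy as np

beta = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 0.0],
                 [0.0, 0.0, 2.0]])   # assumed example: rank 2, so r = 1
m = beta.shape[0]
r = m - np.linalg.matrix_rank(beta)  # degree of degeneracy r = m - rank
# a solution vector y of the homogeneous system, i.e. B x y = 0 for every x
y = np.array([1.0, -1.0, 0.0])
residual = beta @ y                  # should be the zero vector
```
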
4.7. Theorem of Fréchet-Riesz. Let B x y be a nondegenerate real bilinear function in R^m; symmetry is not required. For a fixed y
    B x y = L x
is a real linear function of x, thus an element of the space dual to R^m. When y runs through the space R^m, we obtain in this way all of the elements of the dual space, each once. Namely, if a_1, ..., a_m is a coordinate system in R^m, where
    y = Σ_{i=1}^m η^i a_i ,
then
    L x = B x y = Σ_{i=1}^m η^i B x a_i = Σ_{i=1}^m η^i L_i x ,
and here the operators L_i on the right are linearly independent and hence form a basis for the dual space if B is not degenerate. For each L in the space dual to R^m there exists, therefore, a unique y in R^m such that for all x in R^m
    L x = B x y .
This is the theorem of Fréchet-Riesz in the present elementary case where the dimension is finite¹.
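In coordinates the theorem reduces to solving a regular linear system: if L x = ℓ · x and B x y = x'B y with a regular matrix B, the representing vector is y = B^{-1} ℓ. A sketch (the matrix B and the functional ℓ are assumed example data; note that symmetry of B is indeed not required):

```python
import numpy as np

B = np.array([[0.0, 1.0],
              [2.0, 0.0]])            # nondegenerate, not symmetric (assumed example)
ell = np.array([3.0, -1.0])           # the linear function L x = ell . x
y = np.linalg.solve(B, ell)           # the unique y with B x y = L x for all x
# verify the identity on the standard basis vectors
checks = [np.isclose(np.eye(2)[i] @ B @ y, ell[i]) for i in range(2)]
```
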
4.8. Adjoint linear transformations. Retaining the above hypotheses, let T now be an arbitrary linear transformation. Then for a fixed y, B (T x) y is a linear function of x, and according to the theorem of Fréchet-Riesz, for each y in R^m there exists, therefore, a unique y* in R^m such that
    B (T x) y = B x y*
identically in x. One verifies at once that
    y* = T* y
is a linear transformation of the space R^m, the transformation adjoint to T relative to B. We thus have B (T x) y = B x (T* y) identically in x and y. Provided the nondegenerate bilinear function B x y is in addition symmetric, this identity can be written B y (T x) = B (T* y) x or, after switching x and y,
    B (T* x) y = B x (T y) .
Thus T is conversely the linear transformation adjoint to T*, and consequently
(T*)*=T**=T.
The relation of adjunction is thus involutory. The linear transformations that are self-adjoint (or symmetric) relative to a given symmetric, nondegenerate, bilinear function B, those for which T* = T, deserve special attention.
If the linear transformation T is orthogonal with respect to B and therefore regular, B (T x) (T y) = B x y, and thus B (T x) y = B x (T^{-1} y). According to this
    T* = T^{-1} ,
a relation which is obviously equivalent with the original definition of a transformation which is orthogonal with respect to B, provided B is symmetric and nondegenerate. If a linear transformation commutes with its adjoint,
    T T* = T* T ,
then T is said to be normal with respect to B. Self-adjoint and orthogonal transformations are special normal transformations.
¹ Actually the theorem of Fréchet-Riesz is understood to be the deeper corresponding theorem in infinite dimensional Hilbert space. We have retained the same name for the almost trivial case of a finite dimensional space.
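In the coordinate description (taken up again in exercise 6 of 4.9), the adjoint relative to B x y = x'B y has, for symmetric B, the matrix T* = B^{-1} T' B. A numerical sketch (the matrices B, T and the vectors x, y are assumed example data) checking the defining identity and the involution (T*)* = T:

```python
import numpy as np

def adjoint(T, B):
    """Matrix of the adjoint T* relative to B x y = x^T B y, obtained from
    B(Tx)y = Bx(T*y), i.e. T^T B = B T*, so T* = B^{-1} T^T B."""
    return np.linalg.solve(B, T.T @ B)

B = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # symmetric, nondegenerate (assumed example)
T = np.array([[1.0, 4.0],
              [0.0, 2.0]])
Ts = adjoint(T, B)
x, y = np.array([1.0, -2.0]), np.array([0.5, 3.0])
lhs = (T @ x) @ B @ y                 # B(Tx)y
rhs = x @ B @ (Ts @ y)                # Bx(T*y)
Tss = adjoint(Ts, B)                  # involution: (T*)* should equal T
```
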
4.9. Exercises. 1. Let B = (β_ij) be a symmetric square matrix with m rows and columns. Prove that there are regular square matrices M such that
    M' B M = D ,
where M' is the transpose of M and D stands for a diagonal matrix (δ_ij) with δ_ij = 0 for j ≠ i and
    δ_ii = +1 for i = 1, ..., p ,
    δ_ii = −1 for i = p + 1, ..., p + q ,
    δ_ii = 0 for i = p + q + 1, ..., p + q + r = m .
Show further that provided M = M_0 accomplishes the above, the remaining M can be obtained from
    M = T M_0 ,
where T stands for an arbitrary regular matrix for which
    T' B T = B .
2. Let B x y and C x y be two bilinear functions on the linear space R^m such that B x y = 0 implies C x y = 0. Prove that
    C x y = κ B x y ,
where κ stands for a real constant.
Hint. If B x y ≡ 0, then C x y ≡ 0, and there is nothing to prove. Otherwise, let B x_0 y_0 ≠ 0. For an arbitrary pair of vectors x, y set x = ξ x_0 + x_1, y = η y_0 + y_1 and determine the coefficients ξ, η so that
    B x_1 y_0 = B x_0 y_1 = 0 ,
which is the case if
    B x y_0 = ξ B x_0 y_0 ,    B x_0 y = η B x_0 y_0 .
Then by hypothesis C x_1 y_0 = C x_0 y_1 = 0 too, and
    B x y = ξ η B x_0 y_0 + B x_1 y_1 ,    C x y = ξ η C x_0 y_0 + C x_1 y_1 ,
and therefore
    B x_0 y_0 C x y − C x_0 y_0 B x y = B x_0 y_0 C x_1 y_1 − C x_0 y_0 B x_1 y_1 .
Since the right hand side of this equation is independent of ξ and η, the equation holds for every pair of vectors x' = ξ' x_0 + x_1, y' = η' y_0 + y_1 with an arbitrary choice of the coefficients ξ' and η'. If the latter are chosen so that
    B x' y' = ξ' η' B x_0 y_0 + B x_1 y_1 = 0 ,
then by hypothesis C x' y' = 0 too, and the right hand side of the above equation is therefore = 0, from which the assertion follows with κ = C x_0 y_0 / B x_0 y_0 .
3. Let B x y be a symmetric bilinear function in R^m and U a subspace of R^m. Set up the necessary and sufficient condition for the existence of a normal with respect to B at a given point x of U. Give the general expression for these normals and show in particular that the normal is uniquely determined precisely when B is not degenerate in U.
4. Prove, provided B x y is positive definite in R^m, the so-called Bessel inequality
    B p^2 ≤ B x^2 ,
where p stands for the orthogonal projection of x on U, and show further that the so-called Parseval equation
    B p^2 = B x^2
holds only for x = p.
5. Let B x y be a nondegenerate bilinear function in R^m and T x a linear transformation of R^m, T* x its adjoint transformation relative to B. Prove that the kernels of these transformations have the same dimension.
6. With the hypotheses and notations of the preceding exercise we fix in R^m an arbitrary coordinate system. With respect to this coordinate system the bilinear function B and the linear transformations T and T* have well-defined square matrices, which we denote by the same letters:
    B = (β_ij) ,    T = (τ^i_j) ,    T* = (τ*^i_j) .
Prove that
    T* = (B^{-1})' T' B' ,    T = B^{-1} (T*)' B ,
and more particularly show:
a. T is self-adjoint relative to B if
    T B^{-1} = B^{-1} T' .
b. T is orthogonal relative to B if
    T B^{-1} = B^{-1} (T^{-1})' .
c. T is normal relative to B if
    T' B T B^{-1} = B T B^{-1} T' .
Finally show that provided B is symmetric and definite, the coordinate system can be chosen so that
    T* = T' .
7. Let R_x and R_y be linear spaces in each of which a real bilinear function is given, which we denote by (x_1, x_2)_x and (y_1, y_2)_y for short. Show:
a. Provided the bilinear functions are not degenerate, the linear mappings y = A x from R_x into R_y and the linear mappings x = A* y from R_y into R_x are pairwise adjoint to each other, so that
    (A x, y)_y = (x, A* y)_x
holds identically in x and y.
b. If the bilinear functions are in addition definite, then
    R_x = K(A) + A* R_y ,    R_y = K(A*) + A R_x ,
where K(A) and K(A*) stand for the kernels of the mappings A and A*.
Hint. a is a direct consequence of the theorem of Fréchet-Riesz. To prove b, show that the subspaces K(A) and A* R_y have no vectors in common other than the zero vector and are therefore linearly independent. Namely, if x = A* y is a kernel vector, then because A x = 0,
    (x, x)_x = (x, A* y)_x = (A x, y)_y = 0 ,
and therefore x = A* y = 0.
Remark. The above clearly repeats in a shorter formulation what has already been said in exercises 1 and 2 of 3.10.
8. Let A x y be a real bilinear alternating function that is defined in
the space R^m and that does not vanish identically; thus for arbitrary vectors in the space
    A x y = − A y x
and A x x = 0. Prove: There exist coordinate systems e_1, ..., e_m in R^m and a number n (≤ m/2) such that
    A e_{2i−1} e_{2i} = − A e_{2i} e_{2i−1} = 1    for i = 1, ..., n ,
while A e_h e_k = 0 for every other index pair h, k, and consequently
    A x y = Σ_{i=1}^n (ξ^{2i−1} η^{2i} − ξ^{2i} η^{2i−1}) ,
where ξ^1, ..., ξ^m and η^1, ..., η^m stand for the coordinates of x and y in such a distinguished coordinate system.
, ;j"' stand for the coordinates of x and y in such a distinguished coordinate system. Hint. Since A x y does not vanish identically, there exist two vec-
tors ai and a= such that A ai a2 = - A as a1 > 0. These vectors are obviously linearly independent, and the same holds for the normalized vectors ai
el
a1
es
A ca ci = f and which span a two-dimensional subspace U; in R'". Each vector x in the space R"' can now be decomposed in a unique way into two components for which A ei e=
so that
x -- /'i + -V1 e1 + tl' e, lies in Ell, while xi stands perpendicular to U
relative to A. For A
el x.
,
The normals x_1 form in R^m a linearly independent complement R_1^{m−2} to U_1^2, and if
    y = q_1 + y_1 = η^1 e_1 + η^2 e_2 + y_1
is the above decomposition for y,
    A x y = A p_1 q_1 + A x_1 y_1 = (ξ^1 η^2 − ξ^2 η^1) + A x_1 y_1 .
If A x_1 y_1 ≡ 0 in R_1^{m−2}, then we are done. Otherwise, one continues the above technique in R_1^{m−2}, and in this way, step by step, one finally reaches a subspace R_n^{m−2n} whose vectors x_n, y_n are orthogonal to the pairwise perpendicular subspaces U_i^2 (i = 1, ..., n) and where in addition A x_n y_n ≡ 0. Then
    x = Σ_{i=1}^n p_i + x_n ,    y = Σ_{i=1}^n q_i + y_n ,
and
    A x y = Σ_{i=1}^n A p_i q_i = Σ_{i=1}^n (ξ^{2i−1} η^{2i} − ξ^{2i} η^{2i−1}) ,
q.e.d.
Remark. The vectors in the space R_n^{m−2n} are, as is easily seen, not only mutually perpendicular, but are perpendicular, relative to A, to all of the vectors in the space R^m, and the space R^{m−2n} is therefore through this property uniquely determined by A. In particular, the number n is according to this an invariant which is uniquely determined by A.
9. With the hypotheses and notations of the previous exercise show: All coordinate systems of the kind mentioned there are obtained from one by means of the group of regular transformations T that leave A invariant, so that
    A (T x) (T y) = A x y .

§ 5. Multilinear Functions

5.1. Real n-linear functions. A real function
    M x_1 ... x_n
defined for the vectors x_1, ..., x_n of the space R^m is said to be n-linear if it is linear in each of its arguments. For n = 1, M is a linear, for n = 2, a bilinear function.
In an arbitrary coordinate system a_i for the space R^m let
    x_j = Σ_{i=1}^m ξ_j^i a_i    (j = 1, ..., n) .
Then
    M x_1 ... x_n = Σ μ_{i_1 ... i_n} ξ_1^{i_1} ... ξ_n^{i_n} ,
with
    μ_{i_1 ... i_n} = M a_{i_1} ... a_{i_n} ,
is a real homogeneous form of degree n in the coordinates of the vectors x_j. Conversely, such a form with arbitrary real coefficients and a given coordinate system a_i in R^m defines a real n-linear function.
The n vectors x_j admit the n! permutations of the symmetric permutation group. These permutations are even or odd according as they can be broken up into an even or an odd number of transpositions (x_i, x_j). The n!/2 even permutations form the alternating subgroup of the symmetric permutation group.
An n-linear vector function M which remains unchanged under permutations from the symmetric permutation group, and thus for any transposition of the vectors x_i, is called symmetric.
If, on the other hand, it has the value M_1 for precisely those permutations of the alternating permutation group, then it has a value M_2 (≠ M_1) for all of the odd permutations. M_1 − M_2 is then an alternating n-linear function, which for any transposition of the vectors x_i only changes its sign.
If all of the permutations of the alternating permutation group are applied to an arbitrary n-linear function M x_1 ... x_n, the sum of the functions obtained is either symmetric or has at most two values M_1 and M_2. The n-linear alternating function
    ½ (M_1 − M_2)
is then called the alternating part of the n-linear function M and is denoted by
    A M x_1 ... x_n .¹
In what follows the real alternating multilinear functions of several vectors will be of particular interest to us.
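The passage from M to its alternating part can be made concrete. The sketch below uses the common normalization of the alternating part as the signed average over all n! permutations (an assumption of this illustration; any fixed positive factor yields the same alternating character), checked on an assumed bilinear example:

```python
import itertools
import numpy as np

def alternating_part(M, xs):
    """Signed average of M over all permutations of its arguments:
    one common normalization of the alternating part A M."""
    n = len(xs)
    total, count = 0.0, 0
    for perm in itertools.permutations(range(n)):
        # parity of the permutation by counting inversions
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        sign = -1.0 if inv % 2 else 1.0
        total += sign * M([xs[k] for k in perm])
        count += 1
    return total / count

# assumed example: a bilinear function M x y = x^T C y with non-symmetric C
C = np.array([[0.0, 1.0], [0.0, 0.0]])
M = lambda args: args[0] @ C @ args[1]
x, y = np.array([1.0, 2.0]), np.array([3.0, 5.0])
a1 = alternating_part(M, [x, y])        # for n = 2 this is (M x y - M y x)/2
a2 = -alternating_part(M, [y, x])       # alternating: a transposition flips the sign
```
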
5.2. Alternating functions and determinants. Thus let
    D x_1 ... x_n
be a real, multilinear, alternating function defined in R^m.
For n = 1, D x_1 is a real linear function. It is convenient to also think of such a function as being "alternating", because all theorems valid for actual alternating multilinear functions (n > 1) then hold for n = 1 as well, as is easily verified in each individual case.
¹ In Cartan's alternating calculus the transition from a multilinear form to its alternating part is usually indicated by separating its arguments by the mark ∧. We prefer to write this symbol in front of the operator of the multilinear form. The symbol becomes a linear operator, which presents advantages for the techniques of the alternating calculus.
Switching two vectors x_i and x_j (i ≠ j) changes the sign of D, and for x_i = x_j, D must therefore vanish. From this we see more generally: If the vectors x_1, ..., x_n are linearly dependent, then
    D x_1 ... x_n = 0 .
For n = 1 this says that any simply linear function D x_1 vanishes for x_1 = 0. If n > 1, then one of the vectors, for example
    x_n = Σ_{i=1}^{n−1} λ^i x_i ,
is a linear combination of the others, and therefore because of the linearity of D in x_n
    D x_1 ... x_n = Σ_{i=1}^{n−1} λ^i D x_1 ... x_{n−1} x_i ,
and here all of the terms on the right vanish.
If the number n of arguments is greater than the dimension m of the space R^m, the argument vectors are always linearly dependent, and therefore D ≡ 0. Alternating n-linear functions that do not vanish identically thus exist in R^m only for n ≤ m. In what follows we consider in particular the case n = m.
Then if the value of D is given for one linearly independent vector system a_1, ..., a_m, D is uniquely determined in R^m. In fact, for an arbitrary system of vectors x_1, ..., x_m,
    x_i = Σ_{j=1}^m ξ_i^j a_j ,
and since
    D a_{i_1} ... a_{i_m} = ± D a_1 ... a_m ,
with the sign + or − according as the permutation ν → i_ν (ν = 1, ..., m) is even or odd, one obtains
    D x_1 ... x_m = δ D a_1 ... a_m ,
where the real number
    δ = Σ ± ξ_1^{i_1} ... ξ_m^{i_m} = det (ξ_i^j)
is the m-rowed determinant of the coordinates ξ_i^j. The value D x_1 ... x_m is therefore, according to the above equation, uniquely determined if D a_1 ... a_m is given.
From this it follows that D vanishes identically in R^m if it is equal to zero for one single linearly independent system of vectors a_i. If this case is excluded, then D vanishes only if the m argument vectors are
linearly dependent. There then exists, up to the arbitrarily normalizable factor
    D a_1 ... a_m = α ≠ 0 ,
precisely one m-linear alternating function, namely
    D x_1 ... x_m = α det (ξ_i^j) .
However, we shall not make use of the ordinary theory of determinants. To the contrary, this theory can be derived from the concept of an m-linear alternating vector function on the space R^m. In order to prove the multiplication rule for determinants, say, using this approach, start with two arbitrary m-rowed determinants det (ξ_j^i) and det (η_k^j) and, taking a_1, ..., a_m to be a basis for R^m, set
    x_j = Σ_{i=1}^m ξ_j^i a_i ,    y_k = Σ_{j=1}^m η_k^j x_j ,
from which
    y_k = Σ_{i=1}^m ( Σ_{j=1}^m η_k^j ξ_j^i ) a_i
follows. Then on the one hand
    D y_1 ... y_m = det ( Σ_{j=1}^m η_k^j ξ_j^i ) D a_1 ... a_m ,
and on the other hand
    D y_1 ... y_m = det (η_k^j) D x_1 ... x_m = det (η_k^j) det (ξ_j^i) D a_1 ... a_m ,
and consequently, since D a_1 ... a_m ≠ 0,
    det (η_k^j) det (ξ_j^i) = det ( Σ_{j=1}^m η_k^j ξ_j^i ) .
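Both conclusions of this subsection are easy to check numerically: the formula D x_1 ... x_m = δ D a_1 ... a_m with δ = det (ξ_i^j), and the multiplication rule for determinants. The basis and vectors below are assumed example data, with D normalized by D a_1 ... a_m = 1:

```python
import numpy as np

def D(vectors, A, d0=1.0):
    """The unique m-linear alternating function with D a_1 ... a_m = d0,
    the a_i being the columns of the basis matrix A: by the formula above,
    D x_1 ... x_m = det(xi) * d0, with xi the coordinate matrix."""
    xi = np.linalg.solve(A, np.column_stack(vectors))
    return np.linalg.det(xi) * d0

A = np.array([[1.0, 1.0],
              [0.0, 2.0]])                     # assumed basis a_1, a_2
x1, x2 = np.array([3.0, 1.0]), np.array([1.0, 1.0])
val = D([x1, x2], A)
swapped = D([x2, x1], A)                       # alternating: opposite sign
degenerate = D([x1, x1], A)                    # dependent arguments: zero
# multiplication rule det(H) det(X) = det(H X), checked on random matrices
rng = np.random.default_rng(0)
H, X = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
prod_ok = np.isclose(np.linalg.det(H) * np.linalg.det(X), np.linalg.det(H @ X))
```
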
5.3. Orientation of a simplex. Referring to what was said in 2.4 and 2.5, we consider an m-dimensional simplex s^m(x_0, ..., x_m) in a linear space R^n (m ≤ n) with the vertices x_0, ..., x_m and linearly independent edges x_1 − x_0, ..., x_m − x_0. These edges generate an m-dimensional subspace U^m, and the simplex lies in a hyperplane E^m parallel to the space U^m, whose points
    x = Σ_{i=0}^m μ^i x_i
have unique barycentric coordinates μ^i relative to the simplex, with
    Σ_{i=0}^m μ^i = 1 .
In order to orient all simplexes lying in E^m or in planes parallel to it (observe that the edges of such simplexes thus determine the same subspace U^m), we take on the space U^m an m-linear real alternating function
space U'"), we take on the space UM an »s-linear real alternating function
I), which is uniquely determined up to an arbitrary factor, and for each of the simplexes sM(xo, ... , x,") mentioned we form the expression
D (xl - x0) ... (x", - xo) = d (x0, ... , x,") Since the edges x, - x0 are linearly independent vectors in the space Ul", this real number is different from zero and therefore positive or negative. We define : For a given ordering of the vertices of the simplex, ? " ( x 0 , . .. , x",) is positively or negatively oriented with respect to d according as the above expression is positive or negative.
The function Δ is, to be sure, not linear, but still it is alternating, and therefore changes its sign for any transposition (x_i, x_j). For i, j ≠ 0 this is evident. For a transposition (x_0, x_i) set
    x_j − x_i = (x_j − x_0) + (x_0 − x_i)
for each j, from which it follows that
    Δ(x_i, x_1, ..., x_{i−1}, x_0, x_{i+1}, ..., x_m)
    = D (x_1 − x_i) ... (x_{i−1} − x_i) (x_0 − x_i) (x_{i+1} − x_i) ... (x_m − x_i)
    = D (x_1 − x_0) ... (x_{i−1} − x_0) (x_0 − x_i) (x_{i+1} − x_0) ... (x_m − x_0)
    = − D (x_1 − x_0) ... (x_{i−1} − x_0) (x_i − x_0) (x_{i+1} − x_0) ... (x_m − x_0)
    = − Δ(x_0, x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_m) .
Thus Δ also changes its sign for this transposition and is therefore alternating. Note that this also holds for m = 1.
From this it follows by virtue of the above definition that the orientation of a simplex remains unchanged under even permutations of its vertices and changes its sign under odd ones¹.
The m-dimensional simplex s^m(x_0, ..., x_m) has the (m − 1)-dimensional side simplexes
    s_i^{m−1}(x_0, ..., x̂_i, ..., x_m)    (i = 0, ..., m) ,
where ^ indicates the omission of the point thus designated. If s^m is oriented in the above way by Δ(x_0, ..., x_m), then we define the orientation of the side simplexes s_i^{m−1} induced by this orientation by means of the signs of the alternating functions
    Δ_i(x_0, ..., x̂_i, ..., x_m) = (−1)^i Δ(x_0, ..., x_i, ..., x_m) ,
where on the right the omitted point x_i is held fixed.
¹ This is the usual definition of orientation. We have given preference to the one above because the function Δ(x_0, ..., x_m) not only decides the orientation of the simplex s^m(x_0, ..., x_m), but also has a significance for the simplex which becomes evident in the theorem that follows in 5.6.
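With D taken concretely as the ordinary determinant form on R^m (one admissible normalization), the orientation test of this subsection becomes a determinant sign. A sketch with assumed example vertices in R^2:

```python
import numpy as np

def delta(vertices):
    """Delta(x_0, ..., x_m) = D (x_1 - x_0) ... (x_m - x_0), with D taken
    as the ordinary determinant form on R^m."""
    x0 = vertices[0]
    return np.linalg.det(np.column_stack([v - x0 for v in vertices[1:]]))

# a 2-simplex in R^2 (assumed example vertices)
x0, x1, x2 = np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])
d = delta([x0, x1, x2])            # positive: positively oriented w.r.t. this Delta
d_swap = delta([x0, x2, x1])       # a transposition reverses the sign
d_even = delta([x1, x2, x0])       # an even permutation preserves it
```
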
Observe that Δ_i has the same meaning with respect to the side simplex s_i^{m−1} and the space U_i^{m−1} spanned by its edges as
    Δ(x_0, ..., x_m) = D h_1 ... h_m    (h_i = x_i − x_0)
has for the space U^m. For, taking at first i ≠ 0,
    D h_1 ... h_{i−1} (x_i − x_0) h_{i+1} ... h_m = D_i h_1 ... ĥ_i ... h_m
becomes for a fixed x_i a nonzero (m − 1)-linear alternating function of the edges h_1, ..., h_{i−1}, h_{i+1}, ..., h_m of the side simplex s_i^{m−1}, which can be taken to be the fundamental form of the space U_i^{m−1}. For i = 0 write, for example, x_i − x_0 = (x_i − x_1) + (x_1 − x_0), and hence
    Δ_0(x_0, x_1, ..., x_m) = D (x_1 − x_0) k_2 ... k_m    (k_i = x_i − x_1)
    = D_0 k_2 ... k_m .
For a fixed x_0, D_0 is here a nonzero (m − 1)-linear alternating function of the edges k_2, ..., k_m of the side simplex s_0^{m−1}, which can be used as the fundamental form of the space U_0^{m−1}.
5.4. Simplicial subdivisions. Let s^m be a closed m-dimensional simplex (i.e. the closed convex hull of the points x_0, ..., x_m). We consider a subdivision D of s^m, i.e. a finite set of closed m-dimensional subsimplexes s_i^m (i = 1, 2, ..., N) with the properties:
1. The simplex s^m is the union of the subsimplexes s_i^m.
2. Any two subsimplexes s_i^m and s_j^m have no common interior points.
The subdivision D is simplicial if in addition:
3. Each (m − 1)-dimensional face s^{m−1} of the subsimplexes s_i^m that contains interior points of s^m belongs as a face to precisely two subsimplexes s_i^m and s_j^m.
Following an idea due to H. Whitney we will construct a simplicial subdivision D of the simplex s^m.
5.5. Construction of a simplicial subdivision. With the vertices x_0, x_1, ..., x_m of the given simplex s^m we form the points
    x_ij = ½ (x_i + x_j)    (0 ≤ i ≤ j ≤ m; x_ii = x_i)
and partially order them by a relation ≤ such that
    x_ij ≤ x_hk    if h ≤ i and j ≤ k .
Consider now all increasing sequences σ of m + 1 points x_hk, beginning with one of the points x_ii (i = 0, 1, ..., m) and ending with x_0m. For each x_hk (h > 0, k < m) there are two possible successors, namely x_(h−1)k and x_h(k+1). The number of sequences σ is 2^m, and they
correspond to 2^m one-dimensional oriented polygonal paths. The figure illustrates the case m = 4.
Fig. 1
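The count of the sequences σ can be checked mechanically. The sketch below (encoding each point x_hk by the index pair (h, k), an implementation assumption) enumerates all increasing sequences for the case m = 4 of the figure:

```python
from itertools import product

def whitney_sequences(m):
    """Enumerate the increasing sequences sigma of m+1 points x_hk:
    start at some x_ii, end at x_0m; each step replaces x_hk by
    x_(h-1)k or x_h(k+1)."""
    seqs = []
    for start in range(m + 1):
        # each of the m steps either decrements h or increments k
        for moves in product("hk", repeat=m):
            h = k = start
            path = [(h, k)]
            for mv in moves:
                if mv == "h":
                    h -= 1
                else:
                    k += 1
                path.append((h, k))
            # keep only admissible paths ending at x_0m
            if path[-1] == (0, m) and all(0 <= a <= b <= m for a, b in path):
                seqs.append(path)
    return seqs

seqs = whitney_sequences(4)        # the case of the figure: 2^4 = 16 sequences
```

Each start x_ii contributes C(m, i) sequences, and the binomial sum gives the stated total 2^m.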
The simplicial subdivision D is now defined by the 2^m simplexes s_σ^m which have the points of σ as vertices. In order to show that D is a simplicial subdivision, we first prove property 3. Let s_σ^{m−1}(x_ii, ..., x̂_rt, ..., x_0m) be an (m − 1)-dimensional face of s_σ^m(x_ii, ..., x_rt, ..., x_0m). One has to distinguish three cases:
(α) The neighbors of x_rt in the sequence σ are x_r(t−1), x_r(t+1) or x_(r+1)t, x_(r−1)t. Then the face s_σ^{m−1} consists of points of the boundary ∂s^m of the given simplex s^m, since all points x ∈ s_σ^{m−1} have the barycentric coordinate 0 with respect to the vertex x_t in the first case and x_r in the second.
(β) Similarly, if r = t = i or if r = 0 and t = m, then the face s_σ^{m−1} ⊂ ∂s^m.
(γ) The neighbors of x_rt are
    x_(r+1)t, x_r(t+1)    or    x_r(t−1), x_(r−1)t .
In these cases there are precisely two subsimplexes of the subdivision D which have s_σ^{m−1}(x_ii, ..., x̂_rt, ..., x_0m) as a common face, namely the given subsimplex s_σ^m and the subsimplex s_σ'^m in which the point x_rt has been replaced by x_(r+1)(t+1) in the first case and by x_(r−1)(t−1) in the second.
For the proof of properties 1 and 2, let the sequences σ be numbered as follows:
    σ_1: x_00, x_01, x_02, ..., x_0m ,
    σ_2: x_11, x_01, x_02, ..., x_0m ,
    σ_3: x_11, x_12, x_02, ..., x_0m ,
    . . . . . . . . . . . . . . . . . . . .
    σ_{2^m}: x_mm, x_(m−1)m, ..., x_0m .
Then the corresponding consecutive simplexes s_i^m = s_{σ_i}^m and s_{i+1}^m (i = 1, 2, ..., 2^m − 1) have precisely one common face s_i^{m−1}. The vertices of s_i^{m−1} lie on the boundary ∂s^m of the given simplex s^m and have the same order in both simplexes s_i^m and s_{i+1}^m. Hence the orientations of these simplexes are opposite (cf. 5.3). By Lemma A, which will
be formulated in 5.7, one concludes that the simplexes s_i^m and s_{i+1}^m contain no common interior points.
Because of the convexity of s^m, the (m − 1)-dimensional simplex s_i^{m−1} (1 ≤ i ≤ 2^m − 1) cuts s^m in two closed convex polyhedrons with no common interior point. One of them, P_i^m, has the point x_0 as a vertex. Denote by Q_i^m the other, complementary polyhedron. We prove now: The interiors of the simplexes s_1^m, ..., s_i^m are disjoint, and the union
    ∪_{k=1}^i s_k^m = P_i^m .
These statements are evident for i = 1, P_1^m = s_1^m. Assuming that they hold for i = 1, ..., k (≤ 2^m − 1), we will prove their validity for i = k + 1.
By assumption, the polyhedron P_k^m is the union of the simplexes s_1^m, ..., s_k^m, any two of which have no common interior point, and the face s_k^{m−1} separates P_k^m from the complement Q_k^m. As we have seen, the face s_k^{m−1} also separates the simplexes s_k^m and s_{k+1}^m, and therefore s_{k+1}^m ⊂ Q_k^m. Hence, s_{k+1}^m has no interior points in common with any of the simplexes s_1^m, ..., s_k^m. The union P_k^m ∪ s_{k+1}^m is a polyhedron cut by the face s_{k+1}^{m−1} from the given simplex s^m and having the point x_0 as a vertex. Thus, P_k^m ∪ s_{k+1}^m = P_{k+1}^m, and P_{k+1}^m is, as stated, the union of the simplexes s_1^m, ..., s_{k+1}^m.
Repeating this reasoning for k = 1, ..., 2^m − 1, one concludes that the properties 1 and 2 are valid.
5.6. Additivity of the function Δ. We now consider a decomposition of the closed simplex s^m(x_0, ..., x_m) (i.e., of the closed convex hull of the points x_0, ..., x_m) into a finite number of m-dimensional subsimplexes s_k^m(x_0^k, ..., x_m^k):
    s^m = ∪_k s_k^m ,
with the two first properties 1, 2 of 5.4:
1. s^m is the union of the closed subsimplexes s_k^m.
2. Any two subsimplexes have no common interior points.
For such a decomposition one has the following
Theorem. If s^m and the subsimplexes s_k^m have the same orientation with respect to
    Δ(x_0, ..., x_m) = D (x_1 − x_0) ... (x_m − x_0) ,
then
    Δ(x_0, ..., x_m) = Σ_k Δ(x_0^k, ..., x_m^k) ,
and the sum on the right is therefore independent of the decomposition.
In this sense the function Δ is an additive set function.
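A numerical instance of the theorem (here a star decomposition of a triangle at its centroid; the vertices and the choice of the ordinary determinant for D are assumptions of this illustration):

```python
import numpy as np

def delta(vertices):
    # Delta(x_0, ..., x_m) = det(x_1 - x_0, ..., x_m - x_0), a concrete D
    x0 = vertices[0]
    return np.linalg.det(np.column_stack([v - x0 for v in vertices[1:]]))

# assumed example triangle, decomposed at an interior point x
x0, x1, x2 = np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 2.0])
x = (x0 + x1 + x2) / 3.0                      # centroid
total = delta([x0, x1, x2])
parts = (delta([x, x1, x2]) +                 # vertex x_0 replaced by x
         delta([x0, x, x2]) +                 # vertex x_1 replaced by x
         delta([x0, x1, x]))                  # vertex x_2 replaced by x
```

All three subsimplexes carry the orientation of the original simplex, and their Δ-values sum to Δ(x_0, x_1, x_2), as the theorem asserts.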
5.7. Lemmas. In order to be able to establish the proof without disturbing interruptions, we wish to present a few preparatory considerations. Let
    x = Σ_{i=0}^m μ^i x_i
be an arbitrary point in the plane of the simplex s^m(x_0, ..., x_m). Because
    Σ_{i=0}^m μ^i = 1 ,
one has
    x − x_0 = Σ_{i=1}^m μ^i (x_i − x_0) ,
and consequently, if we replace the point x_i in Δ(x_0, ..., x_m) by x, first for i ≠ 0,
    Δ(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m)
    = D (x_1 − x_0) ... (x_{i−1} − x_0) (x − x_0) (x_{i+1} − x_0) ... (x_m − x_0)
    = Σ_{ν=1}^m μ^ν D (x_1 − x_0) ... (x_{i−1} − x_0) (x_ν − x_0) (x_{i+1} − x_0) ... (x_m − x_0) ,
and thus, since all terms vanish for ν ≠ i,
    Δ(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m) = μ^i Δ(x_0, ..., x_{i−1}, x_i, x_{i+1}, ..., x_m) .
For i = 0 write
    x_i − x_0 = (x_i − x_m) + (x_m − x_0) ,
for example, from which it follows that
    Δ(x_0, x_1, ..., x_m) = D (x_1 − x_m) ... (x_{m−1} − x_m) (x_m − x_0) .
If instead of x_0 one substitutes
    x = Σ_{i=0}^m μ^i x_i ,
then
    Δ(x, x_1, ..., x_m) = − μ^0 D (x_1 − x_m) ... (x_{m−1} − x_m) (x_0 − x_m) = μ^0 Δ(x_0, x_1, ..., x_m) ,
from which it can be seen that the above equation also holds for i = 0. If in addition exercise 3 in 2.6 is taken into account, one has
Lemma A. If
    s^m(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m)    and    s^m(x_0, ..., x_{i−1}, x_i, x_{i+1}, ..., x_m)
are simplexes of the same m-dimensional hyperplane with the common side simplex s^{m−1}(x_0, ..., x̂_i, ..., x_m), and if in the barycentric representation of x with respect to s^m the coefficient of x_i is equal to μ^i, then for i = 0, ..., m
    Δ(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m) = μ^i Δ(x_0, ..., x_{i−1}, x_i, x_{i+1}, ..., x_m) .
Since μ^i < 0 is the necessary and sufficient condition for the simplexes to have no common interior points, this is the case precisely when for the given ordering of the vertices the simplexes are oppositely oriented.
If in this lemma i is set equal to 0, ..., m successively, addition yields
Lemma B. For every x in the plane of the simplex s^m
    Σ_{i=0}^m Δ(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m) = Δ(x_0, ..., x_m) .
Observe that this equation already implies the additivity to be proved in the special case of a "star" decomposition. For if x lies in the interior or on the boundary of s^m,
    s^m(x_0, ..., x_m) = Σ_{i=0}^m s^m(x_0, ..., x_{i−1}, x, x_{i+1}, ..., x_m)
is obviously a simplicial star decomposition of s^m centered at x, where because μ^i ≥ 0 the subsimplexes on the right are according to Lemma A all oriented like s^m.
5.8. Proof of the theorem. After these preparations we now give a general proof of the asserted additivity of Δ, whereby we restrict ourselves to simplicial decompositions
    s^m = Σ_k s_k^m(x_0^k, ..., x_m^k)
of the simplex s^m. Letting x stand for a temporarily arbitrary point in the plane of s^m, we have, according to Lemma B, for every k
    Δ(x_0^k, ..., x_m^k) = Σ_{i=0}^m Δ(x_0^k, ..., x_{i−1}^k, x, x_{i+1}^k, ..., x_m^k) ,
and therefore
    Σ_k Δ(x_0^k, ..., x_m^k) = Σ_k Σ_{i=0}^m Δ(x_0^k, ..., x_{i−1}^k, x, x_{i+1}^k, ..., x_m^k) .
The problem is the evaluation of the double sum on the right.
For this let s^{m−1} be an (m − 1)-dimensional side simplex of the subsimplexes that contains interior points of s^m. Since the decomposition is simplicial, there exist precisely two subsimplexes,
    s_h^m(x_0^h, ..., x_{p−1}^h, x_p^h, x_{p+1}^h, ..., x_m^h)
and
    s_k^m(x_0^k, ..., x_{q−1}^k, x_q^k, x_{q+1}^k, ..., x_m^k) ,
that have the side simplex s^{m−1} in common, where x_p^h and x_q^k are the only vertices of the neighboring simplexes s_h^m and s_k^m not common to both. Now since these simplexes have no common interior points and
are for the above sequence of vertices by hypothesis oriented in the same way, it follows from Lemma A that the ordering
    x_0^h, ..., x_{p−1}^h, x_q^k, x_{p+1}^h, ..., x_m^h
must be an odd permutation of the ordering
    x_0^k, ..., x_{q−1}^k, x_q^k, x_{q+1}^k, ..., x_m^k
of the vertices of s_k^m. But then
    x_0^h, ..., x_{p−1}^h, x, x_{p+1}^h, ..., x_m^h
is likewise an odd permutation of the ordering
    x_0^k, ..., x_{q−1}^k, x, x_{q+1}^k, ..., x_m^k ,
and the corresponding Δ-terms in the above double sum consequently cancel each other out.
Thus in this double sum only those terms remain which come from side simplexes s^{m−1} of the subsimplexes that contain no interior and thus only boundary points of the decomposed simplex s^m. If in addition the point x is now shifted to a vertex of the simplex, to x_0, for example, then only the terms which correspond to the side simplex s_0^{m−1} of s^m opposite to x_0 remain. If, finally, in these remaining terms x_0 is brought into the first place by means of an even permutation of the vertices, which is obviously possible for m > 1 and which leaves the orientations unchanged, then
    Σ_k Δ(x_0^k, ..., x_m^k) = Σ_h Δ(x_0, y_1^h, ..., y_m^h) ,
where as always the sum on the left is taken over all subsimplexes of the original simplicial decomposition of s'" and the sum on the right over the induced simplicial decomposition Sa -'(xl,
... , .Y",) = E S"-,(Y;, }M) Is
of the side simplex s_0^{m-1}. Now if y_1, \dots, y_m denote arbitrary points in the plane of the side simplex s_0^{m-1}, then, according to the remark in 5.3,

\Delta(x_0, y_1, \dots, y_m) = \Delta_0(y_1, \dots, y_m)

is the (up to a real factor) uniquely determined (m-1)-linear alternating fundamental form for the space U_0^{m-1} parallel to s_0^{m-1}. We assert that the expressions

\Delta(x_0, y_1^h, \dots, y_m^h)

all have the sign of \Delta(x_0, \dots, x_m) and consequently are oriented in the same way.
I. Linear Algebra
In fact, there is a subsimplex in the original decomposition that up to an even permutation of the vertices is equal to

s^m(y_0, y_1^h, \dots, y_m^h),

where y_0 is an interior vertex of the decomposition. This simplex is therefore oriented like s^m(x_0, x_1, \dots, x_m), but on the other hand also like s^m(x_0, y_1^h, \dots, y_m^h). For in the barycentric representation

y_0 = \mu^0 x_0 + \sum_{i=1}^{m} \mu^i x_i

the coefficients \mu^i, thus in particular \mu^0, are positive. If the barycentric representations of x_1, \dots, x_m with respect to y_1^h, \dots, y_m^h are substituted here, the barycentric representation of y_0 with respect to x_0, y_1^h, \dots, y_m^h is obtained, where the coefficient of x_0 is unchanged and equal to \mu^0 and therefore positive. Now by virtue of Lemma A

\Delta(y_0, y_1^h, \dots, y_m^h) = \mu^0\, \Delta(x_0, y_1^h, \dots, y_m^h),

from which it can be seen that \Delta(x_0, y_1^h, \dots, y_m^h) in fact has the sign of \Delta(y_0, y_1^h, \dots, y_m^h) and thus for every h the sign of \Delta(x_0, x_1, \dots, x_m). The induced simplicial decomposition of s_0^{m-1} is therefore oriented in the same way.
Assuming the theorem to have been proved for dimensions < m, it follows from the above that

\sum_h \Delta(x_0, y_1^h, \dots, y_m^h) = \Delta(x_0, x_1, \dots, x_m).

But then also

\sum_k \Delta(x_0^k, \dots, x_m^k) = \sum_h \Delta(x_0, y_1^h, \dots, y_m^h) = \Delta(x_0, \dots, x_m),

and the theorem is also true for dimension m. For the dimension m = 1 the theorem is trivial, and the proof is therefore complete.

5.9. Exercises. 1. Let

D\, x_1 \dots x_n

be a real n-linear alternating function in the linear space R^m (n \le m), and further let

x_i = \sum_{j=1}^{m} \xi_i^j a_j \qquad (i = 1, \dots, n)
in a coordinate system a_1, \dots, a_m for this space. Show that

D\, x_1 \dots x_n = \sum_{1 \le i_1 < \dots < i_n \le m} \delta^{i_1 \dots i_n}\, D\, a_{i_1} \dots a_{i_n},

where \delta^{i_1 \dots i_n} stands for the determinant formed by the i_1-th, \dots, i_n-th rows of the matrix

\begin{pmatrix} \xi_1^1 & \cdots & \xi_n^1 \\ \vdots & & \vdots \\ \xi_1^m & \cdots & \xi_n^m \end{pmatrix}.
2. Show that the above determinants \delta^{i_1 \dots i_n} are linearly independent n-linear alternating functions that span the linear space of all such functions of x_1, \dots, x_n. Note in particular the extreme cases n = 1 and n = m.

3. Let T be a linear transformation of the space R^m and D the space's real, m-linear alternating fundamental form, which is uniquely determined up to a real factor. Show: The quotient

\frac{D\, T x_1 \dots T x_m}{D\, x_1 \dots x_m}

is independent of the vectors x_1, \dots, x_m, and in every coordinate system is equal to the determinant

\det T = \det(\tau_i^j)

of the transformation T; this determinant is consequently an invariant (independent of the coordinate system).

4. Show generally that for each k (1 \le k \le m) the quotient

Q_k = \frac{\sum_{1 \le i_1 < \dots < i_k \le m} D\, x_1 \dots T x_{i_1} \dots T x_{i_k} \dots x_m}{D\, x_1 \dots x_m}

is independent of the vectors x_1, \dots, x_m and in any coordinate system is equal to (-1)^{m-k} times the coefficient of \lambda^{m-k} in the m-th degree polynomial

\det(T - \lambda I) = \det(\tau_i^j - \lambda \delta_i^j),

where I denotes the identity transformation.

Remark. In particular,

Q_1 = \frac{\sum_{i=1}^{m} D\, x_1 \dots x_{i-1}\, T x_i\, x_{i+1} \dots x_m}{D\, x_1 \dots x_m}

is equal to the trace

\operatorname{Tr} T = \sum_{i=1}^{m} \tau_i^i

of the transformation T (or of the matrix (\tau_i^j)).
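As a numerical illustration of exercises 3 and 4 (ours, not part of the text), the following sketch realizes D in R^3 as the ordinary determinant of the coordinate matrix and checks that the quotient D Tx_1 Tx_2 Tx_3 / D x_1 x_2 x_3 equals det T and that Q_1 equals the trace; the particular matrix T and vectors x_i are arbitrary choices.

```python
# Illustration (not from the text): D as the 3x3 determinant, T as a matrix.
import math
from fractions import Fraction
from itertools import permutations

def parity(p):
    p, s = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def det(rows):
    n = len(rows)
    return sum(parity(p) * math.prod(rows[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def D(*xs):                       # the m-linear alternating fundamental form
    return det([list(x) for x in xs])

def apply(T, x):                  # the linear transformation T
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

T  = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]      # arbitrary sample transformation
xs = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]      # arbitrary independent vectors

# exercise 3: the quotient is det T, independent of the choice of the x_i
detT = Fraction(D(*[apply(T, x) for x in xs]), D(*xs))

# exercise 4, remark: Q_1 equals the trace of T
Q1 = Fraction(sum(D(*(xs[:i] + [apply(T, xs[i])] + xs[i + 1:])) for i in range(3)),
              D(*xs))
```

Exact rational arithmetic is used so the quotients come out as integers, not floating-point approximations.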
5. Prove: In order that \lambda_0 be an eigenvalue of the transformation T, it is necessary and sufficient that \lambda_0 satisfy the secular equation

\det(T - \lambda I) = \det(\tau_i^j - \lambda \delta_i^j) = 0

(cf. 3.10, exercises 9-10).

Further show: If \lambda_0 is an n-fold (1 \le n \le m) eigenvalue of the transformation, then it is at least an n-fold root of the secular equation, and it is precisely an n-fold root of this equation when the kernels of T - \lambda_0 I and (T - \lambda_0 I)^2 coincide.

Hint. With no loss of generality assume \lambda_0 = 0. Since in the preceding exercise the expression for the coefficient Q_k of (-1)^{m-k} \lambda^{m-k} is independent of the vectors x_1, \dots, x_m, it is possible to take for x_1, \dots, x_n a basis of the eigenspace corresponding to the n-fold eigenvalue \lambda = 0, i.e., of the kernel K(T) of the transformation T. Then Q_k certainly vanishes for k = m, m-1, \dots, m-n+1, so that \lambda = 0 is at least an n-fold root of the secular equation. The coefficient of \lambda^n is

(-1)^n Q_{m-n} = (-1)^n\, \frac{D\, x_1 \dots x_n\, T x_{n+1} \dots T x_m}{D\, x_1 \dots x_m}.

This coefficient is \ne 0 precisely when x_1, \dots, x_n, T x_{n+1}, \dots, T x_m are linearly independent, which according to exercise 3.10.6 is the case if and only if K(T) = K(T^2), so that K(T) and T(R^m) are linearly independent complements in the space R^m. This is, for example, not the case for the linear transformation T defined by

T x_1 = \dots = T x_n = 0, \qquad T x_{n+1} = x_n, \dots, T x_m = x_{m-1}.
In fact, as is at once apparent, the secular equation here is \lambda^m = 0, while \lambda = 0 is only an n-fold eigenvalue of the transformation T.

6. We consider an m-dimensional simplex

s^m = s^m(x_0, \dots, x_m)

and order the vertices into (m+1)! different sequences

x_{p_0}, x_{p_1}, \dots, x_{p_m}.

Prove that the closed simplexes s^m(p) corresponding to these (m+1)! permutations p, with vertices

y_k = \frac{1}{k+1} \sum_{i=0}^{k} x_{p_i} \qquad (k = 0, \dots, m)

at the centers of gravity of the side simplexes of 0-th, \dots, m-th dimension of s^m, decompose this closed simplex simplicially.
Show further: If D is the m-linear alternating fundamental form (which is uniquely determined up to a real factor) of the space R^m = (h_1, \dots, h_m) spanned by the edges h_i = x_i - x_0 of the simplex s^m, or, using the earlier notation,

D\, h_1 \dots h_m = \Delta(x_0, \dots, x_m),

then for each of the above subsimplexes s^m(p)

|\Delta(y_0, \dots, y_m)| = \frac{1}{(m+1)!}\, |\Delta(x_0, \dots, x_m)|.
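For m = 2 this volume relation can be checked directly. The following sketch (ours, not from the text) runs over the 3! = 6 orderings of a triangle's vertices, builds the barycentric subsimplex of each, and compares the |\Delta| values in exact rational arithmetic; the triangle chosen is arbitrary.

```python
# Illustration (not from the text): barycentric decomposition of a triangle.
from fractions import Fraction
from itertools import permutations

x = [(Fraction(0), Fraction(0)), (Fraction(4), Fraction(0)), (Fraction(1), Fraction(3))]

def delta(a, b, c):               # Delta(a, b, c) = D(b - a, c - a)
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def centroid(pts):                # center of gravity of a side simplex
    return tuple(sum(p[i] for p in pts) / len(pts) for i in range(2))

big  = delta(*x)
subs = [delta(centroid(p[:1]), centroid(p[:2]), centroid(p[:3]))
        for p in permutations(x)]
```

Each of the six subsimplexes carries exactly one sixth of |\Delta(x_0, x_1, x_2)|, and their absolute values together exhaust it, consistent with a simplicial decomposition.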
Finally, show: If y(p) stands for the center of gravity of the subsimplex s^m(p) corresponding to the permutation p, then

\frac{1}{(m+1)!} \sum_{p} y(p) = \frac{1}{m+1} \sum_{i=0}^{m} x_i.

In other words, the center of gravity of these (m+1)! centers of
gravity is equal to the center of gravity of the simplex s^m.

Remark. The above simplicial decomposition of s^m is called the barycentric decomposition of first order. If each subsimplex s^m(p) is again barycentrically decomposed, one obtains the barycentric decomposition of second order of s^m, etc.

7. Let h_i = x_i - x_0 (i = 1, \dots, m) be linearly independent vectors of a linear space and D the m-linear alternating fundamental form of the space R^m spanned by these vectors. Show:

a. The m simplexes

s_i^m(x_0 + h_m, \dots, x_{i-1} + h_m, x_i, \dots, x_{m-1}, x_i + h_m),

where i = 0, \dots, m-1, are oriented in the same way and have the same "volume" in that for each i

\Delta(x_0 + h_m, \dots, x_{i-1} + h_m, x_i, \dots, x_{m-1}, x_i + h_m) = D\, h_1 \dots h_m.

b. The simplexes named decompose the prism

x = \lambda h_m + \sum_{i=0}^{m-1} \mu^i x_i, \qquad 0 \le \lambda \le 1, \quad \mu^i \ge 0, \quad \sum_{i=0}^{m-1} \mu^i = 1,

simplicially.
Remark. This simplicial decomposition of the m-dimensional prism is a generalization of the decomposition which for m = 2 and m = 3 had already been given by Euclid.

8. Prove that the m-dimensional parallelepiped

x = x_0 + \sum_{i=1}^{m} \mu^i h_i \qquad (0 \le \mu^i \le 1),
spanned at the point x_0 by the linearly independent vectors h_1, \dots, h_m, can be simplicially decomposed into m! "equal volume" like-oriented simplexes.
Hint. The proof follows from problem 7 by means of induction on the dimension m.
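One concrete realization of such a decomposition (our illustration, not the text's construction) uses the "order simplexes" of the unit cube: for each permutation p, the set 1 \ge z_{p(1)} \ge \dots \ge z_{p(m)} \ge 0 is a simplex with edge vectors e_{p(1)}, e_{p(1)} + e_{p(2)}, \dots, hence of volume 1/m!. The sketch below checks, for m = 3, that the m! simplexes have equal volume and cover the cube without interior overlap.

```python
# Illustration (not from the text): the unit cube as m! order simplexes.
import math
import random
from itertools import permutations

m = 3
perms = list(permutations(range(m)))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def edges(p):                     # vertex chain 0, e_p(1), e_p(1)+e_p(2), ...
    h, hs = [0] * m, []
    for i in p:
        h = h.copy()
        h[i] += 1
        hs.append(h)              # h_k = v_k - v_0
    return hs

vols = [abs(det3(edges(p))) / math.factorial(m) for p in perms]

random.seed(1)
covers = []
for _ in range(1000):             # a random point (almost surely with distinct
    z = [random.random() for _ in range(m)]   # coordinates) lies in exactly
    inside = [p for p in perms                # one order simplex
              if all(z[p[i]] >= z[p[i + 1]] for i in range(m - 1))]
    covers.append(len(inside))
```

The determinant of the edge vectors is \pm 1 for every permutation, so each simplex indeed has volume 1/m! and the six volumes sum to that of the cube.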
§ 6. Metrization of Affine Spaces

6.1. The natural topology of linear spaces of finite dimension. A linear space of finite dimension possesses a "natural" topology which goes back to the topology of the real multiplier domain. If a_1, \dots, a_m is a basis of the space in which

x = \sum_{i=1}^{m} \xi^i a_i,

then, for example, the limit x \to x_0 can be defined by the m limits \xi^i \to \xi_0^i (i = 1, \dots, m), which are meaningful in the domain of the real numbers, and, indeed, this can be done in a way that is independent of the choice of a coordinate system: If according to this definition x \to x_0 in one coordinate system, then it follows from the finite, linear formulas for the transformation of the coordinates that the same is the case in all coordinate systems. The corresponding holds for the remaining basic topological concepts and relations such as the accumulation point of a point set, interior point of a region, etc. It is because of the existence of this natural topology that the basic
notions and relations in the "absolute analysis" developed in the following chapters will be for the greatest part of a purely affine character in finite dimensional spaces. We shall, however, partly in order to be able to formulate the concepts and theorems conveniently and partly for technical reasons having to do with the proofs, almost everywhere use the metric concepts introduced in this section. But the notions and theorems of absolute analysis in themselves will mostly be independent of the particular choice of the auxiliary metrics introduced.
6.2. The Minkowski-Banach metric. The in many ways simplest d-dimensional point set of an affine space is the closed d-dimensional simplex

s^d(x_0, \dots, x_d)

with the vertices x_i and the d linearly independent edges x_1 - x_0, \dots, x_d - x_0. This configuration is the most elementary possible principally because the point set

x = \sum_{i=0}^{d} \mu^i x_i, \qquad \sum_{i=0}^{d} \mu^i = 1, \quad \mu^i \ge 0,
is defined in an affine way, without any kind of metric, and invariantly retains its character as a d-dimensional simplex under arbitrary regular linear transformations of the space. In establishing a general measure theory for point sets of the affine space it is thus expedient first to define the notion of measure accordingly for simplexes.

For d = 0 the simplex shrinks to one single point, whose "measure" we set = 0. For d = 1 we are dealing with a closed line segment

x = \mu^0 x_0 + \mu^1 x_1, \qquad \mu^0 + \mu^1 = 1, \quad \mu^0 \ge 0, \quad \mu^1 \ge 0.

We start by introducing a suitable definition for the measure or length of such a segment x_0 x_1. If this notion of length is to conform to our usual ideas, then it will be a real number |x_0, x_1|, defined for any segment x_0 x_1, that satisfies the following postulates:

A. The length |x_0, x_1| is independent of the orientation of the simplex s^1(x_0, x_1) and invariant with respect to parallel translations of the latter. According to this,

|x_0, x_1| = |x_1, x_0| = |0, x_1 - x_0| = |0, x_0 - x_1|.

We therefore denote this length more briefly with

|x_1 - x_0| = |x_0 - x_1|.

B. The length is to be additive in the following sense: For an interior point

x = \mu^0 x_0 + \mu^1 x_1, \qquad \mu^0 + \mu^1 = 1, \quad \mu^0 > 0, \quad \mu^1 > 0,

of the segment we have

|x - x_0| + |x_1 - x| = |x_1 - x_0|.

C. We have

|x_1 - x_0| \ge 0,

and = 0 only if x_1 = x_0 and the segment degenerates to one single point.

These postulates are equivalent to the following ones (cf. 6.11, problem 1). Associated with each vector x of the affine space is a real number |x|, the length or the norm of the vector, which has the following properties:

1. For every real \lambda, |\lambda x| = |\lambda| |x|.
2. |x| \ge 0, and = 0 only for x = 0.

When in addition the triangle inequality

3. |x_2 - x_0| \le |x_2 - x_1| + |x_1 - x_0|

holds for each triangle in the space, one has a Minkowski-Banach
metric for the affine space, and, indeed, a Minkowski or a Banach one according as the dimension of the space is finite or infinite. It is easily shown that the "unit sphere"

|x| \le 1

of such a metric is a convex point set. Conversely, keeping the remaining postulates, the triangle inequality can be replaced by requiring the unit sphere to be convex (cf. 6.11, problem 2).
In what follows, the case, considered by Minkowski, of a finite dimensional space will be treated almost exclusively. Regarding such a Minkowski metric the following should be observed. In some coordinate system a_1, \dots, a_m let

x = \sum_{i=1}^{m} \xi^i a_i.

Then each length |x| is a continuous function of the coordinates, and, conversely, the latter, as functions of x, are continuous with respect to the given metric.

It obviously suffices to show this for \xi^1 = \dots = \xi^m = 0, or x = 0. For this we set

\varrho^2 = \sum_{i=1}^{m} (\xi^i)^2

and thus claim that |x| \to 0 for \varrho \to 0, and conversely. In fact, as a consequence of the triangle inequality and 1,

|x| \le \sum_{i=1}^{m} |\xi^i| |a_i|,

and therefore

|x|^2 \le \Big( \sum_{i=1}^{m} |\xi^i| |a_i| \Big)^2 \le \sum_{i=1}^{m} |a_i|^2 \sum_{i=1}^{m} (\xi^i)^2 = K^2 \varrho^2;

consequently |x| \le K \varrho, and |x| \to 0 as \varrho \to 0. Since according to this |x| is a continuous function of the m coordinates \xi^i, |x| has on the surface

\sum_{i=1}^{m} (\xi^i)^2 = 1

a nonnegative lower bound k which is reached for at least one system \xi_0^i. Because

x_0 = \sum_{i=1}^{m} \xi_0^i a_i \ne 0,

k = |x_0| is by 2 positive and, as a consequence of 1, for an arbitrary x

|x| = \varrho \left| \frac{x}{\varrho} \right| \ge k \varrho,

from which it follows, conversely, that \varrho \to 0 for |x| \to 0. Hence if x \to 0 in the natural topology of the space, then |x| \to 0 also in every Minkowski metric, and conversely.
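As an illustration (ours, not from the text) in R^2: take the Minkowski norm |x| = max(|\xi^1|, |\xi^2|) and the euclidean \varrho as above. The constants of the argument are then K = \sqrt{2} and k = \sqrt{2}/2, and k \varrho \le |x| \le K \varrho throughout.

```python
# Illustration (not from the text): two-sided bounds for a Minkowski norm.
import math
import random

def minkowski(x):                 # |x| = max(|xi1|, |xi2|); postulates 1-3 hold
    return max(abs(x[0]), abs(x[1]))

K = math.sqrt(2.0)                # K^2 = |a_1|^2 + |a_2|^2 as in the proof

# lower bound k: minimum of |x| over a fine sampling of the circle rho = 1
k = min(minkowski((math.cos(2 * math.pi * t / 1000),
                   math.sin(2 * math.pi * t / 1000))) for t in range(1000))

random.seed(0)
checks = []
for _ in range(500):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    rho = math.hypot(*x)
    checks.append(k * rho - 1e-9 <= minkowski(x) <= K * rho + 1e-9)
```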
6.3. Norms of linear operators. Let

y = A x

be a linear mapping of the Minkowski space R_x^m into the Minkowski space R_y^n. Since in the natural topologies of these spaces obviously A x \to 0 as x \to 0, this is also the case in the Minkowski metrics: |A x| \to 0 as |x| \to 0. By the triangle inequality, for arbitrary x and h in R_x^m,

\big| |A(x+h)| - |A x| \big| \le |A(x+h) - A x| = |A h|,

and it follows from this that A x as well as |A x| are continuous functions of x. Consequently, on the m-dimensional sphere |x| = 1, |y| = |A x| has a finite upper bound

\sup_{|x| = 1} |A x| = |A|,

which is even reached at at least one point of the sphere. This is the norm of the linear operator A with respect to the metrics introduced in the spaces R_x^m and R_y^n. For an arbitrary x \ne 0 one has

|A x| = |x| \left| A \frac{x}{|x|} \right| \le |A| |x|.

If A x vanishes identically, then |A| = 0; it can be seen from the above inequality that, conversely, the identical vanishing of A x follows from |A| = 0. Therefore |A| = 0 precisely when A is the null operator of the m n-dimensional operator space introduced in 3.5. Further, for every real \lambda,

|\lambda A| = |\lambda| |A|.

Finally, it follows from the triangle inequality that when B is a second linear operator in this space,

|(A + B) x| = |A x + B x| \le |A x| + |B x| \le (|A| + |B|) |x|;

hence the inequality

|A + B| \le |A| + |B|

is valid for the norms.
We see: If the operator space mentioned is metrized by the introduction of Minkowski metrics in the spaces R_x^m and R_y^n, then this norm metric is likewise a Minkowski one. Moreover, notice the following. If R_z^p stands for a third Minkowski space and

z = C y

is a linear mapping from R_y^n into R_z^p with the norm

|C| = \sup_{|y| = 1} |C y|,

then one obtains for the composite linear mapping C B

|C B x| \le |C| |B x| \le |C| |B| |x|,

and hence for the norms

|C B| \le |C| |B|.

The norm of a multilinear mapping

y = M x_1 \dots x_p

between two Minkowski spaces R_x^m and R_y^n is defined in a corresponding fashion as the least upper bound

|M| = \sup_{|x_1| = \dots = |x_p| = 1} |M x_1 \dots x_p|.

For arbitrary vectors x_1, \dots, x_p in R_x^m one thus has

|M x_1 \dots x_p| \le |M| |x_1| \dots |x_p|.
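The norm bounds of this section can be illustrated numerically (our sketch, not the text's): with the euclidean norm as the Minkowski metric in R^2, |A| is estimated by sampling the unit circle, and the inequalities |A + B| \le |A| + |B| and |C B| \le |C| |B| are then checked for arbitrarily chosen matrices.

```python
# Illustration (not from the text): operator norms over the unit circle.
import math

def apply(M, x):
    return tuple(sum(M[i][j] * x[j] for j in range(2)) for i in range(2))

def opnorm(M, steps=20000):       # sup of |Mx| over the sampled unit circle
    return max(math.hypot(*apply(M, (math.cos(2 * math.pi * t / steps),
                                     math.sin(2 * math.pi * t / steps))))
               for t in range(steps))

A = [[1.0, 2.0], [0.0, 1.0]]      # arbitrary sample operators
B = [[0.5, -1.0], [2.0, 0.0]]
C = [[1.0, 1.0], [-1.0, 2.0]]

CB  = [[sum(C[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
ApB = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

triangle  = opnorm(ApB) <= opnorm(A) + opnorm(B) + 1e-6
submultip = opnorm(CB) <= opnorm(C) * opnorm(B) + 1e-6
```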
6.4. The euclidean metric. In a general Minkowski metric there is no measurement of angles. On the other hand, in elementary euclidean geometry, besides the measurement of lengths, a measure for the angle \theta formed by two vectors x and y is introduced which is tied to the lengths of these vectors by the "law of cosines"

|x + y|^2 = |x|^2 + |y|^2 + 2 |x| |y| \cos\theta.

If y is replaced in this formula by -y, the addition of both equations yields the parallelogram identity

4. |x + y|^2 + |x - y|^2 = 2 |x|^2 + 2 |y|^2,

which now only contains lengths of vectors and according to which the sum of the squares of the diagonals is equal to the sum of the squares of the four sides of the parallelogram with the vertices 0, x, y, x + y.

If one adds this equation, which makes sense in any Minkowski metric, as a fourth postulate to the three already mentioned, the thus specialized metric is, as is known, a euclidean one, i.e.:
There exists in the affine space a uniquely determined bilinear, symmetric, positive definite function G x y such that for every x

|x| = + \sqrt{G x^2}.

If, conversely, a norm of this kind is introduced in an affine space, whereby the fundamental metric form G has the properties mentioned above, but may otherwise be chosen arbitrarily, then this metric satisfies the three postulates 1, 2, 3 of a Minkowski metric and in addition the parallelogram identity 4. In both of the following sections we shall, for the sake of completeness, briefly prove these assertions, which are fundamental for the euclidean metric.

6.5. Derivation of G from the four postulates. In an affine space (here the dimension makes no difference) suppose the length of vectors is defined in a way which satisfies the four postulates 1-4 mentioned above. The claim is that a bilinear, symmetric, positive definite function G x y then exists so that |x| = + \sqrt{G x^2}. Assuming the correctness of the assertion, it follows from the polarization formula (cf. 4.1) for G that

4\, G x y = G(x + y)^2 - G(x - y)^2 = |x + y|^2 - |x - y|^2,
and thus it must be shown that this expression is actually a bilinear, symmetric, positive definite function which stands in the asserted relationship to |x|. First, the above definition for y = x yields by 1, according to which |0|^2 = 0 and |2x|^2 = 4|x|^2, that

G x^2 = |x|^2,

and G therefore according to 2 is a positive definite function of x which stands in the proper relationship to |x|. Since further |y - x| = |x - y|, G x y is symmetric, and it remains to demonstrate the linearity. For the most part this will be a consequence of the parallelogram identity 4. Thus the equations

G(x + y) z = G x z + G y z, \qquad G(\lambda x) y = \lambda\, G x y

must be verified. The first identity is according to the definition of G equivalent to

|x + y + z|^2 - |x + y - z|^2 = |x + z|^2 - |x - z|^2 + |y + z|^2 - |y - z|^2.
Now by 4

2|x + z|^2 + 2|y + z|^2 = |x + y + 2z|^2 + |x - y|^2;

furthermore,

|x + y + 2z|^2 + |x + y|^2 = 2|x + y + z|^2 + 2|z|^2,

and therefore

2|x + z|^2 + 2|y + z|^2 = 2|x + y + z|^2 + 2|z|^2 - |x + y|^2 + |x - y|^2.

With -z in place of z, since |-z| = |z|, this becomes

2|x - z|^2 + 2|y - z|^2 = 2|x + y - z|^2 + 2|z|^2 - |x + y|^2 + |x - y|^2,

and the subtraction of these two identities yields the desired identity for G.
In order to prove the second equation, G(\lambda x) y = \lambda\, G x y, for real \lambda, observe that

4\, G(\lambda x) y = |\lambda x + y|^2 - |\lambda x - y|^2

is, as a consequence of the triangle inequality 3 and 1, a continuous function of \lambda. For one has

|\lambda' x + y| = |(\lambda' - \lambda'') x + \lambda'' x + y| \le |(\lambda' - \lambda'') x| + |\lambda'' x + y| = |\lambda' - \lambda''| |x| + |\lambda'' x + y|,

and hence, since \lambda' and \lambda'' can be exchanged,

\big| |\lambda' x + y| - |\lambda'' x + y| \big| \le |\lambda' - \lambda''| |x|.

It is therefore possible to restrict the multiplier \lambda to rational and even to positive rational values. But then the asserted homogeneity in \lambda follows from the additivity proved above. In the first place, for a positive integer \lambda = p, because p x = x + \dots + x (p terms),

G(p x) y = p\, G x y,

and thus, when p x is replaced by x,

G\Big(\frac{1}{p}\, x\Big) y = \frac{1}{p}\, G x y,

and consequently for a positive rational \lambda = p/q

G\Big(\frac{p}{q}\, x\Big) y = \frac{p}{q}\, G x y,

which completes the proof.
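A quick numerical check (ours, not from the text) of the polarization recipe, with G taken as the standard inner product of R^3: the expression (|x+y|^2 - |x-y|^2)/4 recovers G x y for random vectors.

```python
# Illustration (not from the text): polarization recovers G from the norm.
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nsq(u):                       # |u|^2 = G u^2 for the standard G
    return dot(u, u)

random.seed(2)
ok = True
for _ in range(100):
    x = [random.uniform(-3, 3) for _ in range(3)]
    y = [random.uniform(-3, 3) for _ in range(3)]
    G = (nsq([a + b for a, b in zip(x, y)]) -
         nsq([a - b for a, b in zip(x, y)])) / 4
    ok = ok and abs(G - dot(x, y)) < 1e-9
```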
6.6. Derivation of the four postulates from G. Suppose, conversely, that G x y is an arbitrary bilinear, symmetric, positive definite function in the finite or infinite dimensional affine space under consideration. We shall briefly present the well-known proof for the fact
that the norm

|x| = + \sqrt{G x^2}

satisfies postulates 1-4. Properties 1 and 2 are clear without further ado. Moreover,

|x + y|^2 = G(x + y)^2 = G x^2 + 2\, G x y + G y^2 = |x|^2 + 2\, G x y + |y|^2,

and with -y in place of y we have an equation whose addition to the above relation yields the parallelogram identity. And, finally, to prove the triangle inequality

|x + y|^2 \le |x|^2 + 2 |x| |y| + |y|^2,

as a glance at the preceding identity shows, the inequality G x y \le |x| |y|, that is, Schwarz's inequality

(G x y)^2 \le G x^2\, G y^2,

must hold. In the trivial cases x = 0 or y = 0 it, together with the triangle inequality, clearly holds with the equality sign. In the general case these inequalities result from the identity
G x^2\, G(y - \lambda x)^2 = G x^2\, G y^2 - (G x y)^2 + (\lambda\, G x^2 - G x y)^2

for

\lambda = \frac{G x y}{G x^2} = \frac{G x y}{|x|^2},

from which the relation

G x^2\, G y^2 - (G x y)^2 = G x^2\, G(y - \lambda x)^2 \ge 0

follows. Here equality holds precisely for y = \lambda x, so that x and y are linearly dependent. In order for equality to then also hold in the triangle inequality one must furthermore have G x y = |x| |y|, and hence \lambda = |y|/|x| > 0; the linearly dependent vectors x and y must therefore have the same direction.
In summary, a euclidean space can, as is customarily done, be defined as an affine space with an arbitrary bilinear, symmetric, positive definite fundamental metric form

G x y = (x, y),

called the inner product or the scalar product of the vectors x and y, where |x|^2 = (x, x); or as an affine space in which a metric is introduced that satisfies the four often mentioned postulates.
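The identity used above for Schwarz's inequality can also be verified numerically; in the following sketch (ours, not part of the text) G is the standard inner product of R^3 and \lambda = G x y / G x^2, as above.

```python
# Illustration (not from the text): the identity behind Schwarz's inequality.
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(3)
ok = True
for _ in range(100):
    x = [random.uniform(0.5, 2.0) for _ in range(3)]   # kept away from x = 0
    y = [random.uniform(-2.0, 2.0) for _ in range(3)]
    gxx, gyy, gxy = dot(x, x), dot(y, y), dot(x, y)
    lam = gxy / gxx
    r = [yi - lam * xi for xi, yi in zip(x, y)]        # r = y - lambda x
    lhs = gxx * gyy - gxy ** 2
    ok = ok and abs(lhs - gxx * dot(r, r)) < 1e-8 and lhs >= -1e-12
```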
6.7. Angles and orthogonality in the euclidean space. By virtue of Schwarz's inequality one can define the angle \theta formed by the vectors x and y using the equation

G x y = (x, y) = |x| |y| \cos\theta.
Accordingly, these vectors are orthogonal (with respect to G) when
G x y = 0, in agreement with the general terminology already used in 4.2.
All that has been said about euclidean spaces up to this point is independent of the dimension. If this dimension is now finite and = m, it then follows from the general inertia theorem in 4.2 that coordinate systems e_1, \dots, e_m exist which are orthonormal with respect to the inner product: G e_i e_j = \delta_{ij}. Thus if

x = \sum_{i=1}^{m} \xi^i e_i, \qquad y = \sum_{i=1}^{m} \eta^i e_i,

then in such an orthonormal system \xi^i = G x e_i, \eta^i = G y e_i, and therefore

G x y = (x, y) = \sum_{i=1}^{m} \xi^i \eta^i \qquad \text{and} \qquad G x^2 = |x|^2 = \sum_{i=1}^{m} (\xi^i)^2.
Orthonormal coordinate systems can be constructed from arbitrary bases by means of the Schmidt orthogonalization process. Once such a system has been found, all other orthonormal systems are obtained by means of the group of linear transformations T which are orthogonal with respect to the fundamental metric form and which thus preserve the inner product: (T x, T y) = (x, y).
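The Schmidt process just mentioned can be sketched in a few lines; the following is our illustration for the standard inner product in R^3 (the basis chosen is arbitrary but linearly independent).

```python
# Illustration (not from the text): Schmidt orthogonalization in R^3.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def schmidt(basis):               # basis assumed linearly independent
    es = []
    for a in basis:
        v = list(a)
        for e in es:              # subtract the projections onto the
            c = dot(v, e)         # vectors already constructed
            v = [vi - c * ei for vi, ei in zip(v, e)]
        n = dot(v, v) ** 0.5
        es.append([vi / n for vi in v])
    return es

es = schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
gram = [[dot(ei, ej) for ej in es] for ei in es]
```

The Gram matrix of the result is the identity, i.e., the constructed system is orthonormal.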
6.8. The principal axis problem. Let an arbitrary real symmetric bilinear function B x y be given in the euclidean space R^m which has been metrized with the fundamental form G x y = (x, y). The principal axis problem consists in finding a coordinate system e_1, \dots, e_m (the principal axes), orthonormal with respect to G, such that

B x y = \sum_{i=1}^{m} \lambda_i \xi^i \eta^i,

that is, B e_i e_j = \lambda_i \delta_{ij}. Since the treatment of this classical problem of linear algebra may be assumed as known here, only the line of thought behind one of the many solutions to this problem is to be briefly sketched. We present the details as problems in 6.11. According to the Fréchet-Riesz theorem a unique linear transformation T of R^m exists such that

B x y = (x, T y) = G x\, T y.

Because of the symmetry of G and B,

(x, T y) = (T x, y),
16. Metrlsatlon of Affins Spacos
65
and T is therefore self-adjoint with respect to G. Conversely, each such transformation defines by means of the above equation a bilinear symmetric function B. Assuming the principal axis problem to have been solved for a given B, it follows from the above representation of B that

(T e_i, e_j) = B e_i e_j = \lambda_i \delta_{ij},

and thus, since G is not degenerate,

T e_i = \lambda_i e_i \qquad (i = 1, \dots, m).

The self-adjoint linear transformation T therefore has m eigenvalues \lambda_i with the associated eigenvectors e_i (cf. 3.10, problems 9-10). Conversely, the principal axis problem is obviously solved when it can be shown that every linear transformation of R^m which is self-adjoint with respect to G has precisely m eigenvalues \lambda_i whose associated eigenvectors e_i form an orthonormal coordinate system. This is proved in a coordinate-free way in exercises 7-12 of 6.11.
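For m = 2 the whole problem can be carried out by hand; the sketch below (ours, not from the text) takes G as the standard dot product, B (and hence T) as the symmetric matrix [[2, 1], [1, 2]], solves T e = \lambda e directly, and checks B e_i e_j = \lambda_i \delta_{ij}.

```python
# Illustration (not from the text): principal axes of a 2x2 symmetric form.
import math

a, b, c = 2.0, 1.0, 2.0                    # B, and hence T, as [[a, b], [b, c]]
disc = math.sqrt((a - c) ** 2 + 4 * b ** 2)
lams = [(a + c + disc) / 2, (a + c - disc) / 2]

es = []
for lam in lams:
    v = (b, lam - a)                       # then T v = lam v (valid since b != 0)
    n = math.hypot(*v)
    es.append((v[0] / n, v[1] / n))

def B(x, y):
    return a * x[0] * y[0] + b * (x[0] * y[1] + x[1] * y[0]) + c * x[1] * y[1]

vals = [[B(es[i], es[j]) for j in range(2)] for i in range(2)]
```

In the principal axis system the matrix of B is diagonal with the eigenvalues of T on the diagonal, and the axes are mutually orthogonal.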
6.9. Measure of a simplex. Assume that 1 \le p \le m and that

s^p(x_0, \dots, x_p)

is a p-dimensional closed simplex of the affine space R^m. The p linearly independent edges x_i - x_0 span a subspace U^p in R^m parallel to the plane of the simplex, and for the time being we restrict ourselves to p-dimensional simplexes that are parallel to a given subspace U^p. If a measure is to be defined for these p-dimensional simplexes in
a natural way, the simplest course would seem to be to retain the first three postulates A-C of 6.2 in an appropriately generalized form. Accordingly we postulate:

A. The measure of simplexes parallel to U^p remains invariant under reorientation and parallel translation.

This postulate permits us to take the vertices x_0, \dots, x_p in an arbitrary sequence and to shift the origin of the space to one of the vertices, e.g., to x_0. Then x_1, \dots, x_p are the linearly independent edges of the simplex, and, in agreement with the notation in the case p = 1, we denote the measure of the simplex s^p(0, x_1, \dots, x_p), which is to be defined, by

|x_1, \dots, x_p|.
B. For a decomposition of the simplex in the sense of 5.6 the measure is an additive function.

Observe that for a decomposition of the simplex s^p the subsimplexes lie in the same subspace U^p determined by s^p and in this sense are parallel; for p = 1 one has precisely postulate B of 6.2.

C. The measure is a positive definite function of the edges x_i; thus

|x_1, \dots, x_p| \ge 0,
and = 0 precisely when the edges are linearly dependent and the simplex consequently degenerates to a simplex of lower dimension. For p = 1 this condition coincides with postulate C of 6.2.

Taking the theorem in 5.6 into account, these postulates are easily satisfied if one defines

|x_1, \dots, x_p| = |D\, x_1 \dots x_p|

for the p-dimensional simplexes "in the direction U^p", which are thus parallel to the fixed subspace U^p. Here D is the not identically vanishing, real, p-linear alternating function of the edges x_i, which is uniquely determined in U^p up to a normalizing factor.
6.10. Connection with the fundamental metric form. Final definition of |x_1, \dots, x_p|. Now assume that R^m is a euclidean space with the inner product G x y = (x, y), which until now was not relevant. In order to find the connection between the above definition and the one-dimensional metric determined by G, it is convenient to orthogonalize the subspace U^p with respect to G. Therefore, let e_1, \dots, e_p be an arbitrary orthonormal coordinate system in U^p and set

x_i = \sum_{j=1}^{p} \xi_i^j e_j \qquad (i = 1, \dots, p).

By 5.2,

D\, x_1 \dots x_p = \det(\xi_i^j)\, D\, e_1 \dots e_p.

Using the theorem on the multiplication of determinants deduced there, according to which

(\det(\xi_i^j))^2 = \det\Big( \sum_{j=1}^{p} \xi_i^j \xi_k^j \Big) = \det(G x_i x_k),

one obtains

(D\, x_1 \dots x_p)^2 = \det(G x_i x_k)\, (D\, e_1 \dots e_p)^2.
This formula shows, first, that

\det(G x_i x_k) \ge 0,

and = 0 precisely when x_1, \dots, x_p are linearly dependent. This is a generalization of Schwarz's inequality, which is included here for p = 2.

The formula further shows that |D\, x_1 \dots x_p| is invariant under orthogonal transformations of the subspace U^p and that |D\, e_1 \dots e_p| is independent of the choice of the orthonormal system e_i. We can therefore fix this positive quantity arbitrarily for each p-dimensional subspace U^p. Now if the measure being defined for the simplex s^p is to be "isotropic", that is, independent of the direction U^p of the simplex (as is the case, of course, for p = 1), then the factor

|D\, e_1 \dots e_p| = |D(U^p)|
must be assigned the same value for every p-dimensional subspace. It is now only a question of doing this in an appropriate way. It seems natural, moreover, to make this assignment so that the unit cube receives measure 1. According to problem 8 in 5.9 one then gets, for every p,

p!\, |D\, e_1 \dots e_p| = 1.

Thus in

|x_1, \dots, x_p| = \frac{1}{p!} \sqrt{\det(G x_i x_k)}
we have a definition for the euclidean measure or the volume of a p-dimensional simplex that satisfies all the postulates proposed. This definition is also satisfactory insofar as the volume remains invariant under orthogonal transformations of the euclidean space.

6.11. Exercises. 1. Show that postulates A, B, C and 1, 2 in 6.2 are equivalent.

Hint. One shows first, keeping in mind the final remark in 1.1, that

\Big| \frac{p}{q}\, x \Big| = \frac{p}{q}\, |x|
follows from B for every positive rational p/q. Then if \lambda is a positive irrational number, let (\alpha) be the class of positive rational numbers < \lambda and (\beta) the class of rational numbers > \lambda. It then follows from the above and B that

\alpha |x| < |\lambda x| < \beta |x|

on the one hand, while, on the other hand,

\alpha |x| < \lambda |x| < \beta |x|.

Consequently, for every pair \alpha, \beta

\big| |\lambda x| - \lambda |x| \big| < (\beta - \alpha) |x|,

from which it follows that |\lambda x| = \lambda |x|. Hence because of A, |\lambda x| = |\lambda| |x| for every real \lambda. Conversely, A and B are immediate consequences of 1.

2. Show that the unit sphere |x| \le 1 in a Minkowski-Banach metric
is a convex point set and, conversely, that the triangle inequality is a consequence of the convexity together with 1.

3. Let x_0, x_1, \dots be an infinite sequence of points in the space R^m. Prove the Cauchy convergence criterion: In order for there to exist a limit point

x = \lim_{n \to \infty} x_n

in the sense of the natural topology of the space, it is necessary and sufficient that, as p, q \to \infty,

|x_q - x_p| \to 0

in an arbitrary Minkowski metric. In a Hilbert or Banach space the validity of the Cauchy criterion is an axiom, independent of the other axioms defining these spaces.
4. Let s^m(x_0, \dots, x_m) be a simplex in the m-dimensional linear space R^m. Show that there exists precisely one euclidean metric in which all the edges x_i - x_j have unit length. Further determine the angle between two adjoining edges and the height of such a regular m-dimensional simplex.

5. Prove Schwarz's inequality as a consequence of the Pythagorean theorem, according to which in a euclidean space

|y|^2 = |p|^2 + |n|^2.

Here p stands for the projection of the arbitrary vector y onto a vector x \ne 0, and n = y - p is the projecting normal.

6. Let e_1, \dots, e_m be a coordinate system which is orthonormal with respect to the inner product G x y = (x, y) of a euclidean space R^m. Establish for an arbitrary vector x of the space the orthogonal expansion with respect to G:

x = \sum_{i=1}^{m} e_i\, G x e_i = \sum_{i=1}^{m} (x, e_i)\, e_i.

Develop in a corresponding way the vector x into an orthogonal series with respect to a nondegenerate symmetric bilinear function B x y.
7. In a euclidean space with the inner product G x y = (x, y) let

p = P x

be the orthogonal projection of x onto a subspace U and

q = x - P x = Q x

be the projecting normal. Prove that P and Q are self-adjoint linear transformations with respect to G which are characterized as orthogonal projections by this property and by the (purely affine) property P^2 = P, Q^2 = Q (cf. 3.10, exercise 7), which is valid for any projection.
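These properties are easy to verify concretely; in the sketch below (ours, not from the text) P projects onto the plane of R^3 spanned by two arbitrary vectors, built via an orthonormal basis of U, and both P^2 = P and the symmetry of the matrix of P are checked.

```python
# Illustration (not from the text): an orthogonal projection is a symmetric
# idempotent.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u1 = [1.0, 0.0, 1.0]              # arbitrary spanning vectors of U
u2 = [0.0, 1.0, 1.0]

e1 = [t / dot(u1, u1) ** 0.5 for t in u1]
w  = [t - dot(u2, e1) * e for t, e in zip(u2, e1)]
e2 = [t / dot(w, w) ** 0.5 for t in w]

# P x = (x, e1) e1 + (x, e2) e2, i.e. P_ij = e1_i e1_j + e2_i e2_j
P  = [[e1[i] * e1[j] + e2[i] * e2[j] for j in range(3)] for i in range(3)]
P2 = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]

idempotent = all(abs(P2[i][j] - P[i][j]) < 1e-9 for i in range(3) for j in range(3))
symmetric  = all(abs(P[i][j] - P[j][i]) < 1e-9 for i in range(3) for j in range(3))
```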
8. Let T be a self-adjoint linear transformation of the euclidean space R^m and set

\sup_{|x| = |y| = 1} |(T x, y)| = |B|, \qquad \sup_{|x| = 1} |(T x, x)| = |Q|, \qquad \sup_{|x| = 1} |T x| = |T|.

Prove that

|B| = |Q| = |T|.

Hint. For arbitrary x and y

|(T x, y)| \le |B| |x| |y|, \qquad |(T x, x)| \le |Q| |x|^2, \qquad |T x| \le |T| |x|.

From |T x|^2 = (T x, T x) \le |B| |x| |T x| it thus follows that |T x| \le |B| |x|, hence |T| \le |B|. Further, it follows from the polarization formula and the parallelogram identity that

4 |(T x, y)| \le |Q| \big( |x + y|^2 + |x - y|^2 \big) = 2 |Q| \big( |x|^2 + |y|^2 \big).

Hence |B| \le |Q|, and therefore

|T| \le |B| \le |Q|.

On the other hand, it goes without saying that |Q| \le |B|, and, as a consequence of Schwarz's inequality, |(T x, y)| \le |T x| |y|. Therefore |B| \le |T|, and hence

|Q| \le |B| \le |T|.
IQISIBISITI 9. Because of the finite dimension of R", the sphere IxI = I is compact, and the least upper bound IQI is thus achieved for at least one unit vector el. Prove that
T el = Al e1
and that any further possible eigenvalues of T are in absolute value S Ia11
Hint. According to Schwarz's inequality IQI = I (T
e1)I
and e1 and e1 are linearly dependent. 10. Let Ui' be the eigenspace of dimension d1 corresponding to the eigenvalue Al found above and P1 x the orthogonal projection of x onto U;1. If one sets
x=P1x+(x-P1x)=P1x+Q1x, then
Tx=A1 P1x+ TQ1x.
Show that

T_1 = T Q_1 = Q_1 T Q_1 = Q_1 T,

and that T_1 is consequently self-adjoint; further, that it maps U_1^{d_1} onto x = 0 and the (m - d_1)-dimensional orthogonal complement R_1^{m - d_1} of U_1^{d_1} into itself, and that there T_1 x = T x.

11. Treat T_1 in R_1^{m - d_1} exactly as T was treated above in R^m and show, by continuing the method followed, that T has eigenvalues \lambda_1, \dots, \lambda_k (k \le m) with multiplicity sum \sum_{i=1}^{k} d_i = m, where |\lambda_1| \ge \dots \ge |\lambda_k|; further, that the corresponding eigenspaces U_i^{d_i} as a consequence of the method are pairwise orthogonal and yield R^m as a direct sum; and, finally, that

T x = \sum_{i=1}^{k} \lambda_i P_i x,
where P, x stands for the projection of x onto the eigenspace U. 12. Orthogonalize the eigenspaces U°' and show that in this way there results a principal axis system e,, ... , en for the symmetric bilinear function
1'xv- (Tx,y)=(x.T')').
Remark. The results in the above exercises 7–12 contain a complete solution of the principal axis problem in the euclidean space R^m.

13. Let A x be a skew-symmetric linear transformation of the euclidean space R^n, that is, A* = −A. Then there exist an m (≤ n/2) and an orthonormal coordinate system e_1, ..., e_n such that

A e_{2i−1} = α_i e_{2i},  A e_{2i} = −α_i e_{2i−1}

for i = 1, ..., m, while A e_i = 0 for i > 2m.
Hint. The transformation A² is self-adjoint and has negative eigenvalues −α_1², ..., −α_m², which are each double, and the (n − 2m)-fold eigenvalue zero. The orthonormalized vectors e_{2i−1} and

e_{2i} = A e_{2i−1} / |A e_{2i−1}|

are eigenvectors corresponding to the eigenvalue −α_i².

14. Let R_x^m and R_y^n be euclidean spaces with the inner products (x_1, x_2)_x and (y_1, y_2)_y, respectively, and A and A* adjoint linear mappings (cf. 4.9, exercise 7), so that

(A x, y)_y = (x, A* y)_x.
Show:
a. A* A = T_x and A A* = T_y are self-adjoint linear transformations of the spaces R_x^m and R_y^n, respectively.
b. If K^p(A) and K^q(A*) are the kernels of the mappings A and A*, respectively, of dimensions p and q, then zero is a p-fold eigenvalue of the transformation T_x with the eigenspace K^p(A) and a q-fold eigenvalue of T_y with the eigenspace K^q(A*).
c. The r = m − p = n − q nonzero eigenvalues of the transformations T_x and T_y are precisely the same and of the same multiplicity.

15. Let G x y be a bilinear, symmetric, positive definite function in an affine space of arbitrary dimension. Give a proof of the generalized Schwarz inequality det(G x_i x_k) ≥ 0 based on the fact that, with arbitrary real coefficients λ_i,

G (λ_1 x_1 + ... + λ_m x_m)² = Σ_{i,k=1}^m λ_i λ_k G x_i x_k

is a positive semidefinite quadratic form in the λ_i which vanishes only for

λ_1 x_1 + ... + λ_m x_m = 0.

16. We close this section with the following consideration of E. Schmidt's orthogonalization procedure, described in 4.4; the details are left to be carried out by the reader. Let U^m be a subspace of a euclidean space with the inner product G x y = (x, y). Suppose D is the, up to a real factor, unique fundamental form of this subspace and that a_1, ..., a_m is an arbitrary orthonormal coordinate system in U^m.
Then if x_1, ..., x_m and y_1, ..., y_m are two arbitrary systems of vectors in U^m, with

x_k = Σ_i ξ_k^i a_i  and  y_k = Σ_i η_k^i a_i,

one has

D x_1 ... x_m = det(ξ_k^i) D a_1 ... a_m,  D y_1 ... y_m = det(η_k^i) D a_1 ... a_m.

Since the coordinate system a_i is orthonormal with respect to G x y = (x, y),

Σ_i ξ_k^i η_l^i = (x_k, y_l),

and the multiplication law for determinants therefore yields

D x_1 ... x_m · D y_1 ... y_m = det((x_k, y_l)) (D a_1 ... a_m)².

The formula

(D x_1 ... x_m)² = det((x_k, x_l)) (D a_1 ... a_m)²

already mentioned follows from here for y_i = x_i.
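In an orthonormal coordinate system the identity reduces to the multiplication law for determinants itself. A small numeric sketch of our own (2-dimensional case, with D a_1 a_2 = 1):

```python
# Illustration (ours, not the book's): with orthonormal coordinates and
# D a1 a2 = 1, the identity D x1 x2 * D y1 y2 = det((x_i, y_k)) becomes
# det(X) * det(Y) = det(X Y^T) for the coordinate matrices X and Y.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

X = [[2.0, 1.0], [0.5, 3.0]]    # rows: coordinates of x1, x2
Y = [[1.0, -1.0], [4.0, 2.0]]   # rows: coordinates of y1, y2

lhs = det2(X) * det2(Y)
gram = [[dot(X[i], Y[k]) for k in range(2)] for i in range(2)]
assert abs(lhs - det2(gram)) < 1e-12
print(lhs)
```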
Presuming this, let z_1, ..., z_m be a linearly independent system of vectors in U^m. If this system is orthogonalized by means of the Schmidt procedure, one obtains m equations of the form

z_i = λ_{i1} e_1 + ... + λ_{ii} e_i  (i = 1, ..., m),

where e_1, ..., e_m is an orthonormal system for the space U^m. This system and the coefficients λ_{ik} (1 ≤ k ≤ i ≤ m) are uniquely determined by the vectors z_i, in the given order, provided at each step of the procedure the sign of λ_{ii} is fixed. We wish, for example, to take λ_{ii} > 0 for all indices i, which means that the simplexes s^m(0, e_1, ..., e_{i−1}, z_i, ..., z_m) are all oriented like the simplex s^m(0, z_1, ..., z_m). It remains to set up explicit expressions for the coefficients λ_{kk} in terms of the given vectors z_i.

To this end we take, for 1 ≤ k ≤ m, in the above formula

x_1 = z_1, ..., x_{k−1} = z_{k−1}, x_k = z_k, x_{k+1} = e_{k+1}, ..., x_m = e_m,
y_1 = z_1, ..., y_{k−1} = z_{k−1}, y_k = z_k, y_{k+1} = e_{k+1}, ..., y_m = e_m,

and thus obtain

(D z_1 ... z_{k−1} z_k e_{k+1} ... e_m)² = Δ_{kk} (D a_1 ... a_m)²,

where Δ_{kk} denotes the Gram determinant

Δ_{kk} = det((z_i, z_j))_{i,j=1,...,k}.

But on the other hand, according to Schmidt's scheme,

D z_1 ... z_{k−1} z_k e_{k+1} ... e_m = λ_{11} ... λ_{(k−1)(k−1)} λ_{kk} D e_1 ... e_m,

and, therefore,

(D z_1 ... z_{k−1} z_k e_{k+1} ... e_m)² = λ_{11}² ... λ_{(k−1)(k−1)}² λ_{kk}² (D e_1 ... e_m)² = λ_{11}² ... λ_{(k−1)(k−1)}² λ_{kk}² (D a_1 ... a_m)².

Since (D a_1 ... a_m)² ≠ 0, comparison of both expressions yields

λ_{11}² ... λ_{(k−1)(k−1)}² λ_{kk}² = Δ_{kk}
for 1 ≤ k ≤ m. In particular, in view of the choice of sign,

λ_{kk} > 0,

and in this way we obtain the required representation

λ_{kk} = √(Δ_{kk} / Δ_{(k−1)(k−1)})  (k = 1, ..., m),

where Δ_{00} is to be set = 1.
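The representation can be tested directly. The following sketch (our own, with an arbitrarily chosen independent system in R³) runs the Schmidt procedure with positive diagonal coefficients and compares each λ_{kk} with the quotient of Gram determinants:

```python
# Numeric check (illustration, not the book's text): after Schmidt
# orthogonalization z_i = λ_i1 e_1 + ... + λ_ii e_i, the diagonal
# coefficient satisfies λ_kk = sqrt(Δ_k / Δ_{k-1}), where Δ_k is the
# Gram determinant det((z_i, z_j))_{i,j<=k} and Δ_0 = 1.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

z = [[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 2.0, 3.0]]

e, lam = [], []
for zi in z:                      # Schmidt procedure, λ_ii > 0
    v = zi[:]
    coeffs = []
    for ej in e:
        c = dot(zi, ej)
        coeffs.append(c)
        v = [a - c * b for a, b in zip(v, ej)]
    lkk = math.sqrt(dot(v, v))    # the chosen positive sign
    coeffs.append(lkk)
    lam.append(coeffs)
    e.append([a / lkk for a in v])

def det(m):                       # Laplace expansion, small matrices only
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

delta = [1.0] + [det([[dot(z[i], z[j]) for j in range(k)] for i in range(k)])
                 for k in range(1, 4)]
for k in range(3):
    assert abs(lam[k][k] - math.sqrt(delta[k + 1] / delta[k])) < 1e-9
print("λ_kk = sqrt(Δ_k / Δ_{k-1}) verified")
```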
II. Differential Calculus

§ 1. Derivatives and Differentials

1.1. Vector functions. Let R_x^m and R_y^n be two linear spaces and G_x^m a region in R_x^m, that is, a point set which is open and connected with respect to the natural topology. In G_x^m let a single-valued vector function

x → y = y(x)

be given, providing a mapping of G_x^m into R_y^n. The simplest examples of such functions are the linear mappings y = A x of the space R_x^m into the space R_y^n. In what follows we shall be dealing with more general mappings and vector functions. If coordinate systems a_i and b_j are introduced in R_x^m and R_y^n, so that

x = Σ_{i=1}^m ξ^i a_i,  y = Σ_{j=1}^n η^j b_j,

then corresponding to the relation y = y(x) there is a system of n real functions

η^j = η^j(ξ^1, ..., ξ^m)  (j = 1, ..., n)

of the real variables ξ^1, ..., ξ^m. Conversely, any such system given in G_x^m defines by means of the coordinate systems a_i and b_j a vector function y = y(x). For n = 1 this function reduces to one single component, hence to a real function of the m real variables ξ^i. If in addition m = 1, we have the elementary case of a real function of one real variable.
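In coordinates a vector function is thus nothing but a system of n real component functions. A tiny concrete instance of our own (m = 2, n = 3; the particular components are arbitrary):

```python
# Illustration (ours, not the book's): with fixed coordinate systems and
# m = 2, n = 3, a vector function x -> y(x) is a system of three real
# functions η^j(ξ^1, ξ^2).
import math

def y(xi):
    x1, x2 = xi
    return (x1 * x2, math.sin(x1), x2 ** 2)   # η^1, η^2, η^3

print(y((math.pi / 2, 3.0)))
```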
The vector function y(x) and the corresponding mapping x → y is continuous at the point x of the domain of definition G_x^m provided it is continuous at x in the natural topologies of the spaces R_x^m and R_y^n. The same is then also true with respect to arbitrary Minkowski metrics, in particular also euclidean metrics, and conversely. Consequently, for every ε > 0 there exists a δ > 0 such that in R_y^n

|y(x + h) − y(x)| < ε
provided |h| < δ in R_x^m. The real components η^j(ξ^1, ..., ξ^m) are then also continuous, and, conversely, the continuity of y(x) follows from the continuity of these functions.

1.2. The derivative. In the simplest case m = n = 1, where y = y(x) can be thought of as a real function of the real variable x, the derivative at the point x is defined in the elements of the differential calculus as the limit

lim_{|h|→0} (y(x + h) − y(x)) / h = a(x),
provided this limit exists and is finite. For a general vector function this definition is obviously meaningless. But the above definition can also be written

y(x + h) − y(x) = a(x) h + |h| (h; x),

where |(h; x)| → 0 for |h| → 0, and in this equivalent form it can be generalized in a natural way. In fact, y = a(x) h defines a linear mapping of the real h-axis into the real y-axis and, conversely, each such mapping has the above form. That yields the following definition of differentiability and of the derivative of a general vector function y = y(x): The mapping y = y(x) defined in the region G_x^m is said to be differentiable at the point x provided a linear mapping A(x) h of R_x^m into R_y^n exists such that

y(x + h) − y(x) = A(x) h + |h| (h; x),

where the length (measured in R_y^n) |(h; x)| → 0 when the length (measured in R_x^m) |h| → 0. The linear operator A(x) is called the derivative of the vector function y(x) at the point x.
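In coordinates the derivative is the Jacobian matrix, and the definition says precisely that the remainder after the linear approximation is o(|h|). A finite-difference sketch of our own:

```python
# Illustration (ours): for y(ξ^1, ξ^2) = (ξ^1 ξ^2, (ξ^1)^2 + ξ^2) the
# derivative at x is the Jacobian applied to h, and the remainder
# y(x+h) - y(x) - A(x)h divided by |h| tends to zero with |h|.
import math

def y(x):
    return (x[0] * x[1], x[0] ** 2 + x[1])

def A(x, h):                      # Jacobian of y at x, applied to h
    return (x[1] * h[0] + x[0] * h[1], 2.0 * x[0] * h[0] + h[1])

x = (1.0, 2.0)
ratios = []
for t in (1e-1, 1e-2, 1e-3):
    h = (t, -t)
    num = y((x[0] + h[0], x[1] + h[1]))
    lin = A(x, h)
    rem = math.hypot(num[0] - y(x)[0] - lin[0], num[1] - y(x)[1] - lin[1])
    ratios.append(rem / math.hypot(*h))
assert ratios[0] > ratios[1] > ratios[2]   # remainder is o(|h|)
print(ratios)
```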
The derivative is, at every point x ∈ G_x^m where it exists, a linear operator, which only in the simplest case m = n = 1 can be characterized by a single real number a(x). However, even for this generalized derivative we retain the usual Lagrange notation A(x) = y′(x). The defining equation

y(x + h) − y(x) = y′(x) h + |h| (h; x)

states that the mapping y(x + h) − y(x) of the neighborhood of the point x can in the first approximation be replaced by the linear mapping y′(x) h.

The above definition presumes the introduction of some kind of Minkowski metrics in the spaces R_x^m and R_y^n. However, from what was said in I. 6.2 it follows that differentiability is independent of this
choice, and the existence of the derivative is therefore an affine property of the function y(x) as long as the dimensions m and n are finite.
Further observe that the operator y′(x), provided the function is differentiable at all, is uniquely determined. For if one has

y(x + h) − y(x) = A(x) h + |h| (h; x)_1 = B(x) h + |h| (h; x)_2,

then

A(x) h − B(x) h = (A(x) − B(x)) h = |h| ((h; x)_2 − (h; x)_1) = |h| (h; x).

If h is replaced by λ h, where h is an arbitrary vector in the space R_x^m and λ is a sufficiently small real multiplier, then

(A(x) − B(x)) h = |h| (λ h; x) → 0

for λ → 0. Consequently, (A(x) − B(x)) h = 0, and hence for every h in R_x^m

A(x) h = B(x) h = y′(x) h.

1.3. The differential. Following the example of elementary calculus we call the vector
dy = y′(x) h

the differential at the point x of the differentiable function y(x), corresponding to the argument differential h. If in particular y = A x is a linear mapping, then

A(x + h) − A x = A h,

and the derivative A′ thus is identical with the linear operator A. For y = x, dy = h = dx, and accordingly we can write

dy = y′(x) dx,

whereby the argument differential dx is an arbitrary vector in the space R_x^m. This equation suggests introducing the Leibniz notation

y′(x) = dy/dx,

where, of course, the right hand side is not a quotient, but a symbol for the linear operator y′(x). In what follows we shall often operate, instead of with the derivatives, with the in many respects more convenient differentials

dy = y′(x) dx = (dy/dx) dx.
1.4. Coordinate representation of the derivative. If one introduces coordinate systems a_i and b_j in the linear spaces R_x^m and R_y^n, so that

x = Σ_{i=1}^m ξ^i a_i,  y = Σ_{j=1}^n η^j b_j,
then, as already mentioned, the vector function y(x) can be represented by the system

η^j = η^j(ξ^1, ..., ξ^m)

of real functions. If this vector function is differentiable at the point x, then for every

Δx = dx = Σ_{i=1}^m dξ^i a_i,

provided x + dx lies in G_x^m, one has

Δy = Σ_{j=1}^n Δη^j b_j = y(x + Δx) − y(x) = y′(x) dx + |dx| (dx; x),

where |(dx; x)| → 0 as |dx| → 0. Corresponding to the linear mapping y′(x) there is (with respect to the fixed coordinate systems) a matrix (α_i^j(x)), and one has for j = 1, ..., n

Δη^j = Σ_{i=1}^m α_i^j(x) dξ^i + |dx| (dx; x)^j,

where |(dx; x)^j| → 0 for |dx| → 0. This means that all of the components η^j are differentiable in the sense of ordinary differential calculus. The partial derivatives

∂η^j/∂ξ^i = α_i^j(x)

exist, and one has

Δη^j = Σ_{i=1}^m (∂η^j/∂ξ^i) dξ^i + |dx| (dx; x)^j = dη^j + |dx| (dx; x)^j.

Conversely, it follows from these relations that the vector function y(x) is differentiable in our sense, with the linear derivative operator y′(x) which, with respect to the fixed coordinate systems, is uniquely determined by the functional matrix

(∂η^j/∂ξ^i).

1.5. The differential rules. The definition given above of the differential dy of a vector function y(x),

dy = y′(x) dx,  Δy = y(x + dx) − y(x) = dy + |dx| (dx; x),

is formally the same as in the elementary case of a real function y = y(x) of one real variable x. Since the derivation of the differential rules for such a function does not depend on the differential dy = y′(x) dx being
a product, but only on dy being linearly dependent on dx, it is clear that the differential rules known from elementary analysis also hold for vector functions.
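The product rule for a scalar function times a vector function, for instance, can be checked by difference quotients. A one-variable sketch of our own:

```python
# Finite-difference sanity check (our own example) of the rule
# d(λ y) = λ dy + dλ · y for a real function λ(x) and a vector
# function y(x) of one real variable.
def lam(t): return t * t + 1.0
def yv(t):  return (t ** 3, 2.0 * t)

def dlam(t): return 2.0 * t            # λ'(t)
def dyv(t):  return (3.0 * t * t, 2.0) # y'(t)

t, h = 1.3, 1e-6
num = [(lam(t + h) * a - lam(t) * b) / h
       for a, b in zip(yv(t + h), yv(t))]
rule = [lam(t) * dy + dlam(t) * y0 for dy, y0 in zip(dyv(t), yv(t))]
assert all(abs(a - b) < 1e-4 for a, b in zip(num, rule))
print("product rule verified")
```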
Thus, if y_1(x) and y_2(x) are two differentiable mappings into R_y^n defined in G_x^m, then with arbitrary real coefficients λ_1 and λ_2 one has

d(λ_1 y_1 + λ_2 y_2) = λ_1 dy_1 + λ_2 dy_2.

Further, if λ(x) is a real function differentiable in G_x^m, so that

dλ = λ′(x) dx = Σ_{i=1}^m (∂λ/∂ξ^i) dξ^i,

and if y(x) is a differentiable mapping of G_x^m into R_y^n, then one has

d(λ y) = λ dy + dλ · y,

and, provided λ ≠ 0,

d(y/λ) = (λ dy − dλ · y) / λ².

Finally, the chain rule holds for differentials of composite differentiable mappings. Let R_z^p be a third linear space, which we endow with an arbitrary Minkowski auxiliary metric. Assuming that the vector function y = y(x), which is differentiable at the point x of the region G_x^m, maps this region onto a region G_y^n in R_y^n, where a mapping z = z(y) into R_z^p which is differentiable at the point y = y(x) of G_y^n is given, then

z = z(y(x))

is a mapping of G_x^m into R_z^p. The chain rule states that this mapping is differentiable at the point x with the differential

dz = z′(y) dy = z′(y) y′(x) dx.

We wish to prove this rule as an example.
Therefore, let Δx = dx and y(x + Δx) = y + Δy. Since y(x) is differentiable at the point x, one has

Δy = y(x + dx) − y(x) = y′(x) dx + |dx| (dx; x) = dy + |dx| (dx; x),

and, since z(y) is differentiable at the point y = y(x),

Δz = z(y + Δy) − z(y) = z′(y) Δy + |Δy| (Δy; y).

Hence,

Δz = z′(y) (dy + |dx| (dx; x)) + |Δy| (Δy; y(x)) = z′(y) dy + |dx| z′(y) (dx; x) + |Δy| (Δy; y(x)).

If |z′(y)| stands for the norm of the linear operator z′(y), then one has

|dx| |z′(y) (dx; x)| ≤ |z′(y)| |dx| |(dx; x)|,
and hence |dx| z′(y) (dx; x) is a vector (in R_z^p) that can be denoted by |dx| (dx; x). Further,

|Δy| ≤ |dx| |y′(x)| + |dx| |(dx; x)| = |dx| (|y′(x)| + |(dx; x)|),

where |y′(x)| stands for the norm of y′(x). Consequently, |Δy| (Δy; y(x)) is also of the form |dx| (dx; x), and altogether we have

dz = z′(y) dy + |dx| (dx; x) = z′(y) y′(x) dx + |dx| (dx; x),

which proves the chain rule.
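In coordinates the chain rule says that the Jacobian of the composite is the matrix product of the Jacobians. A finite-difference check on an example of our own (m = n = p = 2):

```python
# Illustration (ours): dz = z'(y) y'(x) dx in coordinates means
# Jacobian(z ∘ y)(x) = Jz(y(x)) · Jy(x), verified by difference quotients.
def yf(x): return (x[0] * x[1], x[0] + x[1])
def zf(y): return (y[0] ** 2, y[0] - 3.0 * y[1])

def Jy(x): return [[x[1], x[0]], [1.0, 1.0]]
def Jz(y): return [[2.0 * y[0], 0.0], [1.0, -3.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x = (0.5, 2.0)
J = matmul(Jz(yf(x)), Jy(x))          # z'(y(x)) y'(x)

h = 1e-6
for j in range(2):                     # compare column j with a quotient
    xp = list(x); xp[j] += h
    num = [(a - b) / h for a, b in zip(zf(yf(xp)), zf(yf(x)))]
    assert all(abs(num[i] - J[i][j]) < 1e-4 for i in range(2))
print("chain rule verified")
```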
1.6. The mean value theorem. We wish to derive an analogue of the elementary mean value theorem for the vector function y(x) given in G_x^m.

Let x_1 and x_2 be two points of the region G_x^m such that the connecting line segment

x = x(τ) = x_1 + τ (x_2 − x_1)  (0 ≤ τ ≤ 1)

lies in G_x^m. We make the following assumptions:
1. y(x) is continuous on the closed segment 0 ≤ τ ≤ 1.
2. y(x) is differentiable at each interior point 0 < τ < 1.
Now let L y be an arbitrary real linear function in the space R_y^n; thus, L is an element of the space dual to R_y^n. If using this function the composite function

f(τ) = L y(x(τ)) = L y(x_1 + τ (x_2 − x_1))

is formed, we obtain a continuous real function on the interval 0 ≤ τ ≤ 1 which, by the chain rule, is differentiable at each interior point of this interval with the derivative

f′(τ) = L y′(x(τ)) (x_2 − x_1);

for dL = L dy, dy = y′(x) dx and dx = (x_2 − x_1) dτ. The real function f(τ) therefore satisfies the hypotheses of the elementary mean value theorem on the interval 0 ≤ τ ≤ 1, according to which

f(1) − f(0) = f′(θ),

with 0 < θ < 1. Since, on the other hand, f(1) − f(0) = L (y(x_2) − y(x_1)), it follows that

L (y(x_2) − y(x_1)) = L y′(x(θ)) (x_2 − x_1),

where x(θ) = x_1 + θ (x_2 − x_1) and θ denotes a value (dependent on the choice of L) in the interval 0 < θ < 1. That is the general formulation of the mean value theorem for a vector function satisfying conditions 1 and 2.
If one introduces a euclidean metric with inner product (y_1, y_2) in R_y^n, it becomes convenient to take in particular

L y = (y, e) = (e, y)

with an arbitrary vector e from R_y^n. If one then sets

x_2 − x_1 = Δx,  y_2 − y_1 = Δy,  y(x_i) = y_i,

the above mean value theorem yields

(Δy, e) = (y′(x_e) Δx, e),

with x_e = x_1 + θ_e Δx and 0 < θ_e < 1.

In this form the mean value theorem plays the same role in vector analysis as the ordinary mean value theorem in elementary differential calculus. We therefore wish to give it a complete formulation:

Mean value theorem. Let R_x^m and R_y^n be linear spaces and G_x^m a region of the space R_x^m where a mapping

y = y(x)

of G_x^m into R_y^n is given. Suppose further that this vector function satisfies the following conditions on the segment x = x_1 + τ (x_2 − x_1) (0 ≤ τ ≤ 1), which is contained in G_x^m:
1. y(x) is continuous on the closed segment x_1 x_2.
2. y(x) is differentiable at each interior point of the segment.
If in R_y^n a euclidean metric is given, there then exists for each vector e in R_y^n at least one mean value x_e = x_1 + θ_e (x_2 − x_1) (0 < θ_e < 1) such that

(Δy, e) = (y′(x_e) Δx, e),

where Δx = x_2 − x_1, Δy = y(x_2) − y(x_1). When this equation is written

(Δy − y′(x_e) Δx, e) = 0,

it states, interpreted geometrically, that for each vector e in R_y^n there is an interior point x_e of the segment x_1 x_2 such that the difference Δy − y′(x_e) Δx is perpendicular to e.

In particular, if n = 1 and hence y(x) = η(x) e = η(ξ^1, ..., ξ^m) e, one obtains the traditional differential calculus mean value theorem

Δη = Σ_{i=1}^m (∂η/∂ξ^i)(x_1 + θ (x_2 − x_1)) Δξ^i  (0 < θ < 1),

and for m = 1 the elementary mean value theorem, on which, of course, the above generalization is based.
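The mean value x_e really does depend on e and lies strictly inside the segment. A numeric illustration of our own, finding θ_e by bisection for one fixed e:

```python
# Illustration (ours): for fixed e, f(τ) = (y(x1 + τ Δx), e) has an
# interior point θ_e with f'(θ_e) = f(1) - f(0), i.e.
# (Δy, e) = (y'(x_e) Δx, e).
def y(x):
    return (x[0] ** 2, x[0] * x[1])

def Jdx(x, dx):                    # y'(x) applied to dx
    return (2.0 * x[0] * dx[0], x[1] * dx[0] + x[0] * dx[1])

x1, x2 = (0.0, 1.0), (1.0, 3.0)
dx = (x2[0] - x1[0], x2[1] - x1[1])
e = (1.0, 2.0)

def dot(u, v): return u[0] * v[0] + u[1] * v[1]
def xt(t):     return (x1[0] + t * dx[0], x1[1] + t * dx[1])

target = dot(y(x2), e) - dot(y(x1), e)          # (Δy, e)
g = lambda t: dot(Jdx(xt(t), dx), e) - target   # continuous in t

a, b = 0.0, 1.0
assert g(a) * g(b) < 0                          # sign change on (0, 1)
for _ in range(60):                             # bisection for θ_e
    m = 0.5 * (a + b)
    if g(a) * g(m) <= 0:
        b = m
    else:
        a = m
theta = 0.5 * (a + b)
assert 0.0 < theta < 1.0 and abs(g(theta)) < 1e-9
print("theta_e ≈", round(theta, 6))
```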
1.7. Fundamental theorem of integral calculus. As the first application of the mean value theorem we are going to prove the so-called fundamental theorem of integral calculus.
The proof results from the following simple inequality. If in the mean value theorem e is specially fixed so that |e| = 1 and Δy = |Δy| e, the result is

|Δy| = (y′(x_e) Δx, e),

and as a consequence of Schwarz's inequality

|Δy| ≤ |y′(x_e) Δx|,

an in itself important estimate which we shall also use later. Now if the derivative operator y′(x) = 0 in the region G_x^m, it follows from here that |Δy| = |y(x) − y(x_0)| = 0, and therefore y(x) = y(x_0), at first in the largest sphere |x − x_0| ≤ ρ contained in G_x^m, from which one then deduces in a well-known fashion that y(x) is constant in the entire region G_x^m. Since, conversely, a constant vector function obviously has the zero operator as its derivative, this completes the proof of the fundamental theorem of integral calculus for vector functions: In order for a function which is defined and differentiable in G_x^m to be constant it is necessary and sufficient that its derivative be the zero operator.
1.8. We wish to prove this theorem again by means of a classical method whose central idea was employed by Goursat in his proof of Cauchy's integral theorem in complex function theory. As we shall later see, this principle can be vastly generalized (cf. III. 2.7-2.9 and IV. 3.10).
Let G_x^m, as above, contain the entire segment

x = x_1 + τ (x_2 − x_1)  (0 ≤ τ ≤ 1)

of length |x_2 − x_1| = l. If this segment is bisected and x_1′, x_2′ stand for the end points of that half for which the difference |y(x_2′) − y(x_1′)| = Δ_1 is the greater, it follows from the triangle inequality that

Δ_0 = |y(x_2) − y(x_1)| ≤ 2 Δ_1.

By treating the segment

x = x_1′ + τ (x_2′ − x_1′)  (0 ≤ τ ≤ 1)

in the same fashion, one obtains Δ_1 ≤ 2 Δ_2 = 2 |y(x_2″) − y(x_1″)|, and after n steps one has

Δ_0 ≤ 2^n Δ_n = 2^n |y(x_2^{(n)}) − y(x_1^{(n)})|.

The segments x_1 x_2, x_1′ x_2′, ..., x_1^{(n)} x_2^{(n)}, ... are nested, and the length of the segment x_1^{(n)} x_2^{(n)} is equal to

|x_2^{(n)} − x_1^{(n)}| = 2^{−n} |x_2 − x_1| = l / 2^n.
Consequently, one uniquely determined point x̄ which is common to all of the segments lies on the closed segment x_1 x_2. At this point y(x) has by hypothesis the zero operator for its derivative, and therefore

y(x) − y(x̄) = |x − x̄| (x − x̄; x̄),

where |(x − x̄; x̄)| → 0 for |x − x̄| → 0. Thus

Δ_n = |y(x_2^{(n)}) − y(x_1^{(n)})| ≤ |y(x_2^{(n)}) − y(x̄)| + |y(x̄) − y(x_1^{(n)})| ≤ 2^{−n} l ε(n),

with ε(n) → 0 for n → ∞, and hence

Δ_0 ≤ 2^n Δ_n ≤ l ε(n) → 0.

Therefore Δ_0 = 0 and y(x_2) = y(x_1), which was to be proved.
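The bookkeeping of the halving argument is easy to exercise numerically. The following scalar sketch (our own example; here y is not constant, so Δ_0 stays positive) keeps at each step the half with the larger increment and checks the inequality Δ_0 ≤ 2^n Δ_n while the segments shrink to a common point:

```python
# Illustration (ours) of the halving scheme in 1.8: always keep the half
# segment with the larger increment; then Δ_0 <= 2^n Δ_n at every step,
# and the nested segments contract to a single common point.
def y(t):
    return t * t * t - t          # any differentiable scalar function

a, b = 0.0, 2.0
delta0 = abs(y(b) - y(a))
for n in range(40):
    m = 0.5 * (a + b)
    left, right = abs(y(m) - y(a)), abs(y(b) - y(m))
    a, b = (a, m) if left >= right else (m, b)
    # triangle inequality guarantees Δ_0 <= 2^(n+1) Δ_(n+1)
    assert delta0 <= 2 ** (n + 1) * abs(y(b) - y(a)) + 1e-9
print("common point ≈", round(0.5 * (a + b), 6))
```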
1.9. Linear operator functions. The derivative y′(x) of a vector function differentiable in G_x^m is a linear operator function defined in G_x^m. Generally, in what follows we consider a linear operator function A(x) defined in a region G_x^m of the linear space R_x^m, which for each fixed x in G_x^m thus maps the space R_x^m linearly into a space R_y^n. Hence for a fixed x

y = A(x) k

is a linear mapping of the space R_x^m into R_y^n. If, after the introduction of arbitrary Minkowski metrics, the norm

|A(x + h) − A(x)| = sup_{|k|=1} |A(x + h) k − A(x) k| → 0

for |h| → 0, the operator A(x) is said to be continuous at the point x. Because

|A(x + h) k − A(x) k| ≤ |A(x + h) − A(x)| |k|,

it follows from here that for a fixed k the mapping of G_x^m into R_y^n defined by y = A(x) k is continuous for each k. The operator A(x) is said to be differentiable at the point x if the vector function A(x) k is differentiable for each fixed k, that is, if

A(x + h) k − A(x) k = B(x, k) h + |h| (h, k; x),

where B(x, k) is a linear operator and |(h, k; x)| → 0 for |h| → 0. To investigate the dependence of the operator B(x, k) on the parameter k, we replace k successively by k_1, k_2 and an arbitrary linear combination λ_1 k_1 + λ_2 k_2. If the first two equations, multiplied by λ_1 and λ_2, respectively, are added, subtraction of the third yields, because of the linearity of the left hand side in k,

B(x, λ_1 k_1 + λ_2 k_2) h − λ_1 B(x, k_1) h − λ_2 B(x, k_2) h = − |h| {(h, λ_1 k_1 + λ_2 k_2; x) − λ_1 (h, k_1; x) − λ_2 (h, k_2; x)},
where

|{ }| ≤ |(h, λ_1 k_1 + λ_2 k_2; x)| + |λ_1| |(h, k_1; x)| + |λ_2| |(h, k_2; x)| → 0

as |h| → 0. If h is replaced by λ h, a factor λ comes out on both sides, and if one then lets λ, for arbitrarily fixed h, converge to zero, the result is

B(x, λ_1 k_1 + λ_2 k_2) h − λ_1 B(x, k_1) h − λ_2 B(x, k_2) h = 0.

Thus B(x, k) h is not only linear in h, but also in k. We can therefore write

B(x, k) h = A′(x) h k,

where for x ∈ G_x^m A′(x) is a bilinear operator on the space R_x^m with range in R_y^n. But then

(h, k; x) = (h; x) k

is also linear in k, and altogether we have

A(x + h) k − A(x) k = A′(x) h k + |h| (h; x) k,

where the norm |(h; x)| of the linear operator (h; x) converges to zero with |h|. The bilinear operator A′(x) is called the derivative of the operator A(x) at the point x. From the inequality

|A(x + h) k − A(x) k| ≤ (|A′(x)| + |(h; x)|) |h| |k|

it can be seen that

|A(x + h) − A(x)| → 0

for |h| → 0. The continuity of the operator A(x) at the point x therefore follows from its differentiability.

The above argument can be carried out without modification for multilinear operators. Let A(x) be a p-linear operator function defined in G_x^m; thus

y = A(x) h_1 ... h_p

is for each x ∈ G_x^m a multilinear function of the R_x^m-vectors h_1, ..., h_p with range in R_y^n. The operator A(x) is said to be continuous at the point x if the norm

|A(x + h) − A(x)| = sup_{|h_1|=...=|h_p|=1} |A(x + h) h_1 ... h_p − A(x) h_1 ... h_p|

converges to zero with h. It is differentiable at the point x if the vector function A(x) h_1 ... h_p is differentiable for each fixed system h_1, ..., h_p. One then has

A(x + h) h_1 ... h_p − A(x) h_1 ... h_p = A′(x) h h_1 ... h_p + |h| (h; x) h_1 ... h_p,
where A′(x), the derivative of the operator A(x), is a (p + 1)-linear operator and the norm |(h; x)| of the p-fold multilinear operator (h; x) converges to zero with |h|.
1.10. The second derivative. Now let y(x) be a vector function which is differentiable in the region G_x^m; y′(x) is therefore a linear operator function defined in this region. If this operator is differentiable at the point x in the sense of the preceding paragraph, then its derivative is a bilinear operator which we shall denote by y″(x). For a sufficiently small h and an arbitrary k ∈ R_x^m one has

y′(x + h) k − y′(x) k = y″(x) h k + |h| (h; x) k,

where the norm |(h; x)| → 0 if |h| → 0. The bilinear operator y″(x) is the second derivative of the given vector function at the point x.

If coordinate systems are introduced in R_x^m and R_y^n, in which

x = Σ_i ξ^i a_i,  y = Σ_j η^j b_j,

it follows from the existence of the bilinear operator y″(x), first of all, that the n matrices

(∂²η^j / ∂ξ^s ∂ξ^i)  (j = 1, ..., n)

exist. If

dη^j = Σ_{i=1}^m (∂η^j/∂ξ^i) dξ^i

is the differential of the component η^j corresponding to the vector differential

k = Σ_{i=1}^m dξ^i a_i,

one further has

Δ dη^j = Σ_{s,i} (∂²η^j / ∂ξ^s ∂ξ^i) Δξ^s dξ^i + |h| ((h; x) k)^j

for the increase in dη^j that corresponds to the vector differential

h = Σ_{s=1}^m Δξ^s a_s.

Because of

|(h; x) k| ≤ |(h; x)| |k|,

this increase can also be written

Δ dη^j = Σ_{s,i} (∂²η^j / ∂ξ^s ∂ξ^i) Δξ^s dξ^i + (h; x)^j |h| |k|,

with |(h; x)^j| → 0 for |h| → 0.
Conversely, it obviously follows from n such component equations that y(x) is twice differentiable, with the second differential

d²y = y″(x) h k = Σ_j (Σ_{s,i} (∂²η^j / ∂ξ^s ∂ξ^i) Δξ^s dξ^i) b_j.
1.11. Symmetry of the second derivative. The fundamental symmetry

y″(x) h k = y″(x) k h

of the second derivative operator of a vector function follows from the above coordinate representation of the second differential under the well-known sufficient conditions for commutativity of partial differentiation. However, we wish to go in the opposite direction and give a direct, coordinate-free proof of this symmetry. We start out from the following, abundantly sufficient assumption which, as we shall later see, can be weakened in various directions: The second derivative operator y″ exists in a neighborhood of the point x and is continuous at this point, so that the norm

|y″(x + Δx) − y″(x)| → 0

for |Δx| → 0.

For the proof, take h and k so small that the parallelogram with vertices x, x + h, x + k, x + h + k lies in the mentioned neighborhood of the point x, and form the second difference

Δ²y = y(x + h + k) − y(x + h) − y(x + k) + y(x),

which is symmetric in h and k. If for brevity we denote the first differences by

φ(x) = y(x + h) − y(x),  ψ(x) = y(x + k) − y(x),

then

Δ²y = φ(x + k) − φ(x) = ψ(x + h) − ψ(x).

Now let L be an arbitrary operator from the space dual to R_y^n, L y thus being a real linear function of y. From the existence of the derivatives y′ and y″ in the neighborhood of the point x it follows, in view of the mean value theorem in 1.6, that, on the one hand,

L Δ²y = L (φ(x + k) − φ(x)) = L φ′(x + θ_2 k) k
      = L (y′(x + h + θ_2 k) − y′(x + θ_2 k)) k = L y″(x + θ_1 h + θ_2 k) h k,

where 0 < θ_1, θ_2 < 1. On the other hand, one obtains in the same way

L Δ²y = L y″(x + θ_3 h + θ_4 k) k h,
with 0 < θ_3, θ_4 < 1, and consequently

L (y″(x + θ_1 h + θ_2 k) h k − y″(x + θ_3 h + θ_4 k) k h) = 0.

If the original vectors h and k are replaced here by λ h and λ k, with arbitrarily fixed vectors h and k and sufficiently small real λ, then λ² comes out on the left. Because of the continuity of the second derivative at the point x, for λ → 0 one obtains for each pair h, k from R_x^m

L (y″(x) h k − y″(x) k h) = 0.

Since this holds for every operator L in the space dual to R_y^n, it follows (cf. I. 3.10, exercise 4) that

y″(x) h k = y″(x) k h,

with which the asserted symmetry of the bilinear operator y″(x) is proved.
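The second difference at the heart of the proof is also a convenient numerical probe. A scalar sketch of our own: Δ²y is symmetric in h and k by construction, and divided by the increments it approximates the mixed partial derivative.

```python
# Illustration (ours): Δ²y = y(x+h+k) - y(x+h) - y(x+k) + y(x) is
# symmetric in h and k, and Δ²y / (|h| |k|) approximates the mixed
# second partial for coordinate increments h, k.
import math

def f(x):
    return math.exp(x[0]) * math.sin(x[1]) + x[0] * x[1] ** 2

def second_diff(x, h, k):
    return (f((x[0] + h[0] + k[0], x[1] + h[1] + k[1]))
            - f((x[0] + h[0], x[1] + h[1]))
            - f((x[0] + k[0], x[1] + k[1])) + f(x))

x = (0.3, 0.7)
h, k = (1e-4, 0.0), (0.0, 2e-4)
d1 = second_diff(x, h, k)
d2 = second_diff(x, k, h)          # symmetric in h and k
assert abs(d1 - d2) < 1e-12

mixed = d1 / (1e-4 * 2e-4)         # ≈ ∂²f/∂ξ^1 ∂ξ^2 at x
exact = math.exp(0.3) * math.cos(0.7) + 2 * 0.7
assert abs(mixed - exact) < 1e-3
print("second difference symmetric; mixed partial ≈", round(mixed, 4))
```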
1.12. Higher derivatives. Provided the bilinear, symmetric operator y″(x) is again differentiable in the sense of 1.9, one defines y‴(x) to be the third derivative of the vector function y(x). In the same way one has generally the definition

y^{(q+1)}(x) = (y^{(q)})′(x).

Thus if the derivative operators exist up to the order p + 1, this means that for q = 0, 1, ..., p

y^{(q)}(x + h) h_1 ... h_q − y^{(q)}(x) h_1 ... h_q = y^{(q+1)}(x) h h_1 ... h_q + |h| (h; x) h_1 ... h_q,

where the norm |(h; x)| converges to zero with |h| and y^{(0)}(x) = y(x).

From this definition it follows more generally that for nonnegative integral indices i and j

(d^i/dx^i)(d^j y/dx^j) = (d^j/dx^j)(d^i y/dx^i) = d^{i+j} y/dx^{i+j},

provided the derivative operators which appear exist. It further results from here by means of induction, observing the symmetry of the second derivative proved in 1.11, that under certain sufficient conditions the pth differential

d^p y = y^{(p)}(x) h_1 ... h_p = y^{(p)}(x) d_1 x ... d_p x

is a multilinear and symmetric function of the arbitrary R_x^m-vectors h_i = d_i x.
1.13. Exercises. 1. Let R_x^m be a euclidean space and x = |x| e(x). Show that

|x| de = dx − (e, dx) e,  d|x| = (e, dx),

and

|x|² e″(x) h k = 3 (e, h) (e, k) e − (e, h) k − (e, k) h − (h, k) e.

2. The mapping y = y(x) of the euclidean space R_x^m onto itself is conformal at the point x provided the derivative y′(x) = λ(x) T(x), where λ(x) is real and T(x) is an orthogonal transformation of the space R_x^m. Prove that the mapping by means of reciprocal radii

y = x / |x|²

is conformal.
3. Let y = y(x), x = x(y) be inverse, twice differentiable vector functions. Then for two pairs of associated differentials d_1x, d_1y and d_2x, d_2y,

(d²x/dy²) d_1y d_2y + (dx/dy) (d²y/dx²) d_1x d_2x = 0.
4. If the one-to-one, twice differentiable variable transformations x → y → z are carried out for the differentiable function u = u(x), then

(du/dx) (d²x/dz²) d_1z d_2z + (du/dy) (d²y/dx²) d_1x d_2x + (du/dz) (d²z/dy²) d_1y d_2y = 0.
5. Let μ > 0 and φ(ρ) be a monotonically increasing continuous function defined for ρ > 0 with lim_{ρ→0} φ(ρ) = 0, such that the least upper bound ρ_0 of the ρ for which φ(ρ) < μ is positive (finite or infinite). Further suppose R_x^m is a euclidean space, x = |x| e(x), and

y(x) = ∫_0^{|x|} (μ − φ(ρ)) dρ · e(x) = μ x − ∫_0^{|x|} φ(ρ) dρ · e(x).

Show:
a. The function y(x) is differentiable in the sphere |x| < ρ_0.
b. The norm |y′(0)| = μ.
c. The least upper bound of the norm |y′(x) − y′(0)| for |x| ≤ ρ < ρ_0 is precisely = φ(ρ).
d. If one sets

μ ρ_x − ∫_0^{ρ_x − 0} φ(ρ) dρ = ∫_0^{ρ_x − 0} (μ − φ(ρ)) dρ = ρ_y,

then y = y(x) defines a one-to-one mapping of the spheres |x| < ρ_x and |y| < ρ_y, and ρ_x and ρ_y are the largest radii for which this is the case.
6. Let y = y(x) be a differentiable function in the region G_x^m of the space R_x^m with range in R_y^n; consequently, for each x in G_x^m,

y(x + h) − y(x) = y′(x) h + |h| (h; x)

with |(h; x)| → 0 for |h| → 0. Prove: In order that the derivative y′(x) be continuous in G_x^m it is necessary and sufficient that the equation

lim_{|h|→0} |(h; x)| = 0

hold uniformly on each compact subregion of G_x^m.

Proof. Since the ratio of two Minkowski lengths |x|′ and |x|″, according to what was said in I. 6.2, lies between two positive bounds which are independent of x, the hypotheses and conclusions of the theorem are independent of the choice of the metrics. We can therefore give the spaces R_x^m and R_y^n euclidean metrics. If one then sets (h; x) = |(h; x)| e, the mean value theorem, in view of Schwarz's inequality, yields

|(h; x)| ≤ |y′(x + θ h) − y′(x)|  (0 < θ < 1).

Now if y′(x) is continuous in G_x^m, it is even uniformly continuous on each compact subregion of G_x^m, from which the necessity of the condition in the theorem follows.

The condition is also sufficient. For if x is an arbitrary point in G_x^m, take ρ_0 > 0 at first so small that x + k lies in G_x^m for |k| ≤ ρ_0. If one further sets h = |h| e = λ e with an arbitrary unit vector e, then

|y′(x + k) e − y′(x) e| ≤ (1/λ) (|y(x + λe + k) − y(x + λe)| + |y(x + k) − y(x)|) + |(λe; x + k)| + |(λe; x)|.

In consequence of the uniform convergence |(h; x)| → 0 for |h| → 0 one can, for preassigned η > 0, at first fix λ > 0 so small that the last two terms are each smaller than η/3 for |k| ≤ ρ_0, and then ρ ≤ ρ_0 so small that the first term on the right is also smaller than η/3 for |k| < ρ. Since this holds for every unit vector e in the space R_x^m, the norm

|y′(x + k) − y′(x)| < η for |k| < ρ,

which was to be proved.

In general, the following holds: Provided the p-linear operator A(x) is differentiable in G_x^m, and consequently

A(x + h) h_1 ... h_p − A(x) h_1 ... h_p = A′(x) h h_1 ... h_p + |h| (h; x) h_1 ... h_p,
then A′(x) is continuous in G_x^m precisely when

lim_{|h|→0} |(h; x)| = 0

uniformly on every compact subregion of G_x^m.
7. Let yi,(x) (p = 1, 2, ...) be differentiable functions in the region G. of the space R' with ranges in R;. Prove: Provided the limit function and the limit operator Yp(x) - Y(x)
,
yp(x) - A (x)
exist in G' for p - oo and the second convergence is uniform on each compact subregion of G,", y(x) is differentiable and y'(x) = A (x)
at every point where A (x) is continuous.
Hint. If for each p one sets
ya(x+h) - y,,(x) =Y;(x)k+ IhI (h;x)p with I(h; x) ,I -. 0 for Ihl --,. 0, the convergence (h; x)p - (h; x) for p -. oo follows, and therefore
y (x + h) - y(x) = A (x) h + IhI (h; x). It remains to be shown that I(h; x)I -,. 0 for Ihl -. 0. From the mean value theorem it follows that for every p I(h; x)pl S Iyp (x + 0p h) - Yp(x)I 9 ly' (x + '8p h) - A (x + Op h) I + I A (x + $n h) - A (x) + IA (x) - yp(x) I
with 0 < 6p < 1, which because of the uniformity of the convergence
yp -. A and the asserted continuity of A at the point x implies the conclusion. If, granting the uniform convergence y'p -. A, the derivatives y'p are
continuous in the region G,,, A(x) is also continuous, and the equation y'(x) = A(x) holds for each x in Gz.
II. Differential Calculus

8. Let $A$ be a bilinear operator that maps the product space $R_x^m \times R_y^n$ into $R_y^n$. Assuming further that $A$ is symmetric, set
\[
A h\,A k\,y = A k\,A h\,y, \qquad \underbrace{A x \cdots A x}_{p}\,y = (A x)^p\,y
\]
and prove: For each $y_0$ from $R_y^n$ the vector function
\[
y(x) = \sum_{p=0}^{\infty} \frac{(A x)^p\,y_0}{p!}
\]
is meaningful in the entire space $R_x^m$ and satisfies the linear differential equation
\[
dy = A\,dx\,y .
\]

9. If $y = y(x)$ is a differentiable, homogeneous vector function of degree $p$ of the vector $x$, i.e., $y(\lambda x) = \lambda^p y(x)$, then Euler's equation $y'(x)\,x = p\,y(x)$ holds.
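Euler's relation of exercise 9 is easy to test numerically. The sketch below (not from the text; the function `y` and all names are illustrative choices) approximates $y'(x)\,x$ as the derivative of $t \mapsto y(tx)$ at $t = 1$ by a central difference and compares it with $p\,y(x)$ for a homogeneous function of degree $p = 3$.

```python
# Numerical check of Euler's relation y'(x) x = p * y(x) (exercise 9).
# The homogeneous function below is an illustrative choice, degree p = 3.

def y(x1, x2):
    # homogeneous of degree 3: y(t*x1, t*x2) = t**3 * y(x1, x2)
    return (x1 * x1 * x2, x1 ** 3 + x2 ** 3)

def derivative_along_x(x1, x2, eps=1e-6):
    # y'(x) x = d/dt y(t*x) at t = 1, by a central difference
    up = y((1 + eps) * x1, (1 + eps) * x2)
    dn = y((1 - eps) * x1, (1 - eps) * x2)
    return tuple((u - d) / (2 * eps) for u, d in zip(up, dn))

x = (1.3, -0.7)
lhs = derivative_along_x(*x)
rhs = tuple(3 * c for c in y(*x))
assert all(abs(a - b) < 1e-6 for a, b in zip(lhs, rhs))
```

Since $y(tx) = t^3 y(x)$, the central difference reproduces $3\,y(x)$ up to $O(\varepsilon^2)$.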
§ 2. Taylor's Formula

2.1. Powers and polynomials. In preparation for what follows we consider vector powers and polynomials.

Let $R_x^m$ be a linear space and $A$ a constant, multilinear, symmetric operator from the former space into the linear space $R_y^n$, that is, let
\[
y = A\,x_1 \cdots x_p
\]
be a $p$-linear symmetric function of the $R_x^m$-vectors $x_1, \ldots, x_p$. One such vector function of the differentials $h_i = d_i x$, for example, is the $p$th differential of a sufficiently differentiable vector function $y(x)$ for a fixed $x$. If one sets $x_1 = \cdots = x_p = x$, this function transforms into a homogeneous vector function of degree $p$, which we briefly denote by
\[
y = A x^p
\]
and call a $p$th "power" of $x$. For $p = 1$ this is a linear, for $p = 2$ a quadratic vector function. With respect to the coordinate systems $a_i$ and $b_i$, in which
\[
x = \sum_{i=1}^{m} \xi^i a_i, \qquad y = \sum_{i=1}^{n} \eta^i b_i ,
\]
one has
\[
\eta^i = \sum_{i_1, \ldots, i_p = 1}^{m} \alpha^i_{i_1 \cdots i_p}\, \xi^{i_1} \cdots \xi^{i_p} ,
\]
whereby the coefficient $\alpha^i_{i_1 \cdots i_p}$, determined from
\[
A\,a_{i_1} \cdots a_{i_p} = \sum_{i=1}^{n} \alpha^i_{i_1 \cdots i_p}\, b_i ,
\]
is symmetric in the indices $i_1, \ldots, i_p$.

We are going to calculate the differential of the power $A x^p$. As a consequence of the linearity and symmetry of the function $A\,x_1 \cdots x_p$,
\[
A(x+h)^p = \sum_{i=0}^{p} \binom{p}{i} A\,x^{p-i}\,h^i ,
\]
and consequently
\[
A(x+h)^p - A x^p = p\,A x^{p-1}\,h + |h|\,(h; x) ,
\]
where the norm
\[
|(h; x)| \le |A|\,|h| \sum_{i=2}^{p} \binom{p}{i} |x|^{p-i}\,|h|^{i-2}
\]
converges to zero as $|h| \to 0$; $|A|$ is the norm of the operator $A$. Accordingly,
\[
d\,A x^p = p\,A x^{p-1}\,h = p\,A x^{p-1}\,d_1 x ,
\]
where the differential $h$ has been denoted by $d_1 x$. One finds in the same way, formally exactly as for the elementary power $x^p$ of a real variable,
\[
p\,A(x+h)^{p-1}\,d_1 x - p\,A x^{p-1}\,d_1 x = p(p-1)\,A x^{p-2}\,h\,d_1 x + |h|\,(h; x)\,d_1 x ,
\]
where the norm $|(h; x)|$ vanishes with $|h|$. Consequently, the second differential corresponding to the differentials $d_1 x$ and $h = d_2 x$ is
\[
d^2 A x^p = p(p-1)\,A x^{p-2}\,d_2 x\,d_1 x = p(p-1)\,A x^{p-2}\,d_1 x\,d_2 x .
\]
In general one obtains for each positive $i \le p$
\[
d^i A x^p = p(p-1) \cdots (p-i+1)\,A x^{p-i}\,d_1 x\,d_2 x \cdots d_i x .
\]
For $i = p$
\[
d^p A x^p = p!\,A\,d_1 x \cdots d_p x
\]
is a multilinear and symmetric vector function of the differentials $d_i x$, independent of $x$, and thus for $i > p$
\[
d^i A x^p = 0 .
\]
A sum of powers
\[
P(x) = \sum_{q=0}^{p} A_{p-q}\,x^q
\]
we call a vector polynomial of degree $p$ in the vector $x$, which varies in $R_x^m$. One has for $i \le p$
\[
d^i P(x) = \sum_{q=i}^{p} q(q-1) \cdots (q-i+1)\,A_{p-q}\,x^{q-i}\,d_1 x\,d_2 x \cdots d_i x ,
\]
and for $i > p$, $d^i P(x) = 0$.
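The binomial expansion of the power $A(x+h)^p$ underlying these differential formulas can be checked exactly for $p = 2$, where it reads $A(x+h)^2 = A x^2 + 2 A x h + A h^2$. A small sketch (illustrative, not from the text; the symmetric bilinear map `A` is an arbitrary choice):

```python
# Exact check of A(x+h)^2 = A x^2 + 2 A x h + A h^2 for a symmetric
# bilinear A : R^2 x R^2 -> R^2 (an illustrative choice, not from the text).

def A(u, v):
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def scale(t, a):
    return tuple(t * p for p in a)

x, h = (0.5, -1.0), (0.25, 2.0)
xh = add(x, h)
lhs = A(xh, xh)                                        # A (x+h)^2
rhs = add(A(x, x), add(scale(2.0, A(x, h)), A(h, h)))  # Ax^2 + 2Axh + Ah^2
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))
```

The identity holds exactly by bilinearity and symmetry; in particular $d\,A x^2 = 2 A x h$, in agreement with the general formula above.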
2.2. The Taylor polynomial. In analogy with the elementary differential calculus, we can now form the Taylor polynomial of degree $p$, $T_p(x, x_0)$, for a $p$-times differentiable vector function $y(x)$ about the expansion center $x_0$. Take $x_0 = 0$; we are then treating the Maclaurin polynomial
\[
T_p(x, 0) = \sum_{q=0}^{p} A_q\,x^q ,
\]
which for $x = 0$ has, up to the order $p$, the same derivative operators as the given function $y(x)$. According to 2.1, for $i \le p$ and $x = 0$
\[
d^i T_p(0, 0) = i!\,A_i\,d_1 x \cdots d_i x ,
\]
and therefore
\[
i!\,A_i = T_p^{(i)}(0, 0) = y^{(i)}(0) ,
\]
from which
\[
T_p(x, 0) = \sum_{q=0}^{p} \frac{y^{(q)}(0)\,x^q}{q!}
\]
follows.
With this we have the Maclaurin formula
\[
y(x) = T_p(x, 0) + R_{p+1}(x, 0) .
\]
It still remains, as in the elementary differential calculus, to investigate the remainder term $R_{p+1}$ for $p \to \infty$.

2.3. The asymptotic behavior of $R_{p+1}$. If, as above, one assumes that $y(x)$ is $p$-times differentiable at the origin, this is also true for $R_{p+1}(x, 0) = R_{p+1}(x)$, and one has
\[
R_{p+1}(0) = R_{p+1}'(0) = \cdots = R_{p+1}^{(p)}(0) = 0 .
\]
Set (in an arbitrary Minkowski metric) $x = |x|\,e$. With $x$ fixed and an arbitrary operator $L$ from the space $R_y^{n*}$ dual to $R_y^n$ form the real function
\[
f(r) = L\,R_{p+1}(r e) \qquad (0 \le r \le |x|) .
\]
Then
\[
f^{(q)}(0) = L\,R_{p+1}^{(q)}(0)\,e^q = 0
\]
for $q = 0, 1, \ldots, p$, and for a sufficiently small $|x|$ and $q = 0, 1, \ldots, p-1$ the derivatives
\[
f^{(q)}(r) = L\,R_{p+1}^{(q)}(r e)\,e^q
\]
also exist on the interval $0 \le r \le |x|$. They are even continuous if we in addition assume that $y^{(p-1)}(x)$ is continuous in a certain neighborhood of the origin. Then, because $f^{(p)}(0) = 0$, for $0 \le t \le |x|$
\[
f^{(p-1)}(t) = (t)\,t
\]
with $|(t)| \to 0$ for $t \to 0$. For $r = |x|$ we have, according to this,
\[
L\,R_{p+1}(x) = f(|x|) = (x)\,|x|^p ,
\]
where $|(x)| \to 0$ for $|x| \to 0$.
This holds for each operator $L$ from the space $R_y^{n*}$ dual to $R_y^n$, thus in particular also for the inner product $L y = (a, y)$ of a euclidean metric introduced in $R_y^n$, where $a = R_{p+1}(x)/|R_{p+1}(x)|$. Then
\[
L\,R_{p+1}(x) = |R_{p+1}(x)| = (x)\,|x|^p
\]
with $|(x)| \to 0$ for $|x| \to 0$, for arbitrary Minkowski metrics in the spaces $R_x^m$ and $R_y^n$.

Under the assumptions made:
1. $y^{(p-1)}(x)$ exists and is continuous in a certain neighborhood of $x = 0$;
2. $y^{(p)}(0)$ exists;
we see: the Maclaurin polynomial gives an asymptotic representation of the vector function $y(x)$, so that in arbitrary Minkowski metrics
\[
\frac{1}{|x|^p}\,\Bigl| y(x) - \sum_{q=0}^{p} \frac{y^{(q)}(0)\,x^q}{q!} \Bigr| \to 0
\]
for $|x| \to 0$.
2.4. The remainder formulas in Taylor's theorem. If one assumes that the function $y(x)$ is $p$-times continuously differentiable in the neighborhood of the origin and that moreover the $(p+1)$st derivative operator $y^{(p+1)}(x)$ exists there, then the real auxiliary function
\[
f(r) = L\,R_{p+1}(r e)
\]
has corresponding properties on the interval $0 \le r \le |x|$ for a sufficiently small $|x|$: it is $p$-times continuously differentiable and the derivative
\[
f^{(p+1)}(r) = L\,R_{p+1}^{(p+1)}(r e)\,e^{p+1} = L\,y^{(p+1)}(r e)\,e^{p+1}
\]
exists. Since $f(0) = f'(0) = \cdots = f^{(p)}(0) = 0$, the elementary Maclaurin formula with the Lagrange remainder term
\[
f(r) = \frac{1}{(p+1)!}\,f^{(p+1)}(\vartheta r)\,r^{p+1} \qquad (0 < \vartheta < 1)
\]
thus yields for $r = |x|$
\[
L\,R_{p+1}(x) = \frac{1}{(p+1)!}\,L\,y^{(p+1)}(\vartheta x)\,x^{p+1} ,
\]
and that again for each $L$ of the space $R_y^{n*}$ dual to $R_y^n$, whereby the number $\vartheta$ naturally depends on $x$ as well as on $L$. Corresponding formulas are obtained if other of the known forms of the remainder (like those of Cauchy or of Schlömilch-Roche) are used for $f(r)$.
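For a concrete scalar instance of the Lagrange remainder estimate, take $y = \sin$, whose derivatives are all bounded by $1$, so that $|R_{p+1}(x)| \le |x|^{p+1}/(p+1)!$. A numerical sketch (illustrative; not from the text):

```python
import math

# Lagrange bound for y = sin: every derivative is bounded by 1, hence
# |R_{p+1}(x)| <= |x|**(p+1) / (p+1)!  (illustrative scalar case).

def taylor_sin(x, p):
    # Maclaurin polynomial of sin up to degree p (odd terms only)
    total = 0.0
    for q in range(1, p + 1, 2):
        total += (-1) ** ((q - 1) // 2) * x ** q / math.factorial(q)
    return total

x = 0.9
for p in (3, 5, 7, 9):
    remainder = abs(math.sin(x) - taylor_sin(x, p))
    bound = x ** (p + 1) / math.factorial(p + 1)
    assert remainder <= bound
```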
From here one obtains as above estimates for the remainder term $R_{p+1}(x)$. Whenever it follows that $R_{p+1}(x) \to 0$ for a fixed $x$ and $p \to \infty$, the existence of the Maclaurin expansion
\[
y(x) = \sum_{q=0}^{\infty} \frac{y^{(q)}(0)\,x^q}{q!}
\]
can be inferred.
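A familiar scalar instance: for $y = \exp$ the remainder $R_{p+1}(x) \to 0$ for every fixed $x$, so the Maclaurin expansion exists everywhere. The following sketch (illustrative, not from the text) watches the remainder decrease:

```python
import math

# The Maclaurin series of exp converges for every fixed x: the remainder
# |exp(x) - T_p(x)| decreases to 0 as p grows (illustrative scalar case).

def maclaurin_exp(x, p):
    term, total = 1.0, 1.0
    for q in range(1, p + 1):
        term *= x / q
        total += term
    return total

x = 1.5
remainders = [abs(math.exp(x) - maclaurin_exp(x, p)) for p in range(1, 12)]
assert all(r2 < r1 for r1, r2 in zip(remainders, remainders[1:]))
assert remainders[-1] < 1e-6
```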
It is worth emphasizing that the point set of the linear space $R_x^m$ for which $R_{p+1}(x)$ possibly converges to zero as $p \to \infty$ is independent of the metric and therefore absolutely determined. This concerns the convergence of the Taylor series just as much as the other notions introduced and treated in this chapter, such as the continuity of a vector function, the existence of certain derivative operators, etc. They have an "affine character", independent of the auxiliary metrics introduced to facilitate formulation or proof, provided the linear spaces $R_x$ and $R_y$ are of finite dimension and therefore possess a natural topology.

If, on the other hand, one goes over to normed or Hilbert spaces of infinite dimension, the concepts of absolute analysis remain meaningful, to be sure, but for the most part only with respect to the metric or the topology which is introduced. Thus, for example, the same vector function can be differentiable in the sense of the definition given in 1.2 with respect to one metric, with respect to another not.

2.5. Exercises. 1. Show with the aid of the fundamental theorem of
integral calculus: If $y = y(x)$ is a vector function whose $(p+1)$st differential vanishes identically in a region of the linear space $R_x^m$,
\[
d^{p+1} y(x) = 0 ,
\]
then $y(x)$ is a vector polynomial of degree $p$. The set of all such polynomials is the general solution of the above differential equation.

2. Let the sequence of powers $A_i x^i \in R_y^n$ be so given that the power series $\sum_{i=0}^{\infty} |A_i|\,\varrho^i$ converges ($|A_i|$ = the norm of $A_i$ relative to a Minkowski metric on the space $R_x^m$). Then the series
\[
\sum_{i=0}^{\infty} A_i x^i
\]
is absolutely and uniformly convergent for $|x| \le r < \varrho$.

3. If the power series $y(x) = \sum_{i=0}^{\infty} A_i x^i$ converges uniformly for $|x| < r$, one obtains the derivative $y'(x)$ in the same sphere by termwise differentiation.¹

¹ Boundedness of the linear derivative operator $A(x) = y'(x)$ with respect to the metrics must then be required.
§ 3. Partial Differentiation

3.1. Partial derivatives and differentials. In the definitions of the derivative and the differential of a vector function $y(x)$ given in 1.2 and 1.3 it was assumed that the argument differential $dx = h$ varies freely within the space $R_x^m$. One can for this reason more accurately describe the differential defined in this way,
\[
dy = y'(x)\,dx ,
\]
as the total differential of the function $y(x)$ at the point $x$. If $h$ is restricted in these definitions to a certain subspace $U$ of the space $R_x^m$, we obtain the notion of the partial derivative or partial differential of the vector function in the direction of the subspace $U$. Hence the function $y(x)$ being differentiable at the point $x$ in the direction of the subspace $U$ means that a linear derivative operator
\[
y_U(x) = \frac{\partial y(x)}{\partial U}
\]
from the subspace $U$ into the space $R_y^n$ exists so that
\[
y(x + h) - y(x) = y_U(x)\,h + |h|\,(h; x) ,
\]
where $|(h; x)| \to 0$ when $h$ tends to zero in the subspace $U$. The operator $y_U(x)$ is then called the partial derivative and the vector
\[
d_U y = y_U(x)\,h
\]
the partial differential of the function $y(x)$ in the direction of the subspace $U$. One shows as in 1.2 that this partial derivative, provided it exists, is uniquely determined.

If the function is differentiable in the direction of the subspace $U$, then it is of course differentiable with the same derivative operator in the direction of each subspace of $U$. In particular, it is partially differentiable in every direction provided it is totally differentiable. Further, it follows from the uniqueness of the partial derivative that the function is differentiable in the direction of a nonempty intersection
\[
W = U \cap V
\]
provided the partial derivatives $y_U(x)$ and $y_V(x)$ in the directions of the subspaces $U$ and $V$ exist, whereby for $h \in W$
\[
y_W(x)\,h = y_U(x)\,h = y_V(x)\,h .
\]
However, this does not in general imply the existence of the partial derivative in the direction of the space $(U, V)$ generated by $U$ and $V$.
3.2. Functions of several variables. The case where the vector function $y$ is given as a function of several vector variables $x_1, \ldots, x_p$, each of which varies in a certain region $G_i^{m_i}$ of the linear space $R_i^{m_i}$, can be reduced to the case of one single variable $x$. For this purpose one forms the product space (cf. I.1.6, exercises 6 and 7)
\[
R_x^m = R_1^{m_1} \times \cdots \times R_p^{m_p}
\]
of dimension $m = m_1 + \cdots + m_p$, where the spaces $R_i^{m_i}$ are linearly independent subspaces which generate the product space $R_x^m$. The vector
\[
x = x_1 + \cdots + x_p
\]
is in the product space, and conversely each $x$ of the space $R_x^m$ can be represented in a unique way as such a sum. In the product domain
\[
G^m = G_1^{m_1} \times \cdots \times G_p^{m_p}
\]
the original function
\[
y(x_1, \ldots, x_p) = y(x)
\]
is therefore a single-valued function of $x$.

We say the original function $y(x_1, \ldots, x_p)$ is differentiable for $(x_1, \ldots, x_p)$ provided $y(x)$ is totally differentiable at the point $x = x_1 + \cdots + x_p$ in the product space. For this it is necessary and sufficient that for arbitrary differentials $h_i$ from $R_i^{m_i}$
\[
y(x_1 + h_1, \ldots, x_p + h_p) - y(x_1, \ldots, x_p) = \sum_{i=1}^{p} A_i(x_1, \ldots, x_p)\,h_i + |h|\,(h; x_1, \ldots, x_p) ,
\]
where the $A_i(x_1, \ldots, x_p)$ stand for linear mappings of the spaces $R_i^{m_i}$ into the range space $R_y^n$ and $|(h; x_1, \ldots, x_p)| \to 0$ for
\[
|h|^2 = \sum_{i=1}^{p} |h_i|^2 \to 0 .
\]
The operators
\[
A_i(x_1, \ldots, x_p) = y_{x_i}(x_1, \ldots, x_p) = \frac{\partial y}{\partial x_i}
\]
are the partial derivatives of the function $y(x_1, \ldots, x_p)$ with respect to $x_i$. Interpreted as a derivative of the function $y(x)$, $A_i$ is the partial derivative in the direction of the space $R_i^{m_i}$ as a subspace of the product space $R_x^m$. We leave the simple proofs of these assertions to the reader.
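The decomposition of the increment into the partial differentials $A_i h_i$ can be observed numerically. In the sketch below (illustrative; not from the text) $y(x_1, x_2) = \langle x_1, x_2 \rangle$ on $R^2 \times R^2$, where $A_1 h_1 = \langle h_1, x_2 \rangle$ and $A_2 h_2 = \langle x_1, h_2 \rangle$, and the residue is $o(|h|)$:

```python
# y(x1, x2) = <x1, x2> on R^2 x R^2 (illustrative): the increment splits
# into the partial differentials A1 h1 = <h1, x2> and A2 h2 = <x1, h2>
# up to a term that is o(|h|).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x1, x2 = (1.0, 2.0), (-0.5, 3.0)
h1, h2 = (1e-4, -2e-4), (3e-4, 1e-4)
shifted = dot([a + b for a, b in zip(x1, h1)],
              [a + b for a, b in zip(x2, h2)])
increment = shifted - dot(x1, x2)
linear_part = dot(h1, x2) + dot(x1, h2)
hnorm = (dot(h1, h1) + dot(h2, h2)) ** 0.5
assert abs(increment - linear_part) / hnorm < 1e-3
```

Here the exact residue is $\langle h_1, h_2 \rangle$, which is quadratic in $h$ and hence $o(|h|)$.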
3.3. Partial derivatives of higher order. If the vector function $y(x)$ is partially differentiable in the direction of a subspace $V$ of $R_x^m$ at each point $x$ of a region $G_x^m$, the partial differential
\[
d_V y = y_V(x)\,k
\]
defines for each fixed $k$ from $V$ a vector function. If it is partially differentiable in the direction of the subspace $U$ at the point $x$, one shows exactly as in 1.9 and 1.10 that a bilinear operator
\[
y_{UV}(x) = \frac{\partial^2 y}{\partial U\,\partial V} ,
\]
defined for each $h$ from $U$ and each $k$ from $V$, exists such that
\[
y_V(x+h)\,k - y_V(x)\,k = y_{UV}(x)\,h\,k + |h|\,(h; x)\,k .
\]
Here, calculated in the subspace $V$, the norm
\[
|(h; x)|_V = \sup_{|k|=1,\ k \in V} |(h; x)\,k| \to 0
\]
when $|h|$ tends to zero in $U$. The bilinear operator $y_{UV}(x)$ is the second partial derivative of the function $y(x)$, taken first in the direction $V$ and then in the direction $U$. Partial derivatives and differentials of higher order are defined similarly.

3.4. Theorem of H. A. Schwarz. Regarding the sequence of the differentiations there is the following generalization of the classical theorem of H. A. Schwarz.

Let $y(x)$ be a vector function with the following properties:
1. The function $y(x)$ and its partial derivatives $y_U(x)$ and $y_V(x)$ in the direction of the subspaces $U$ and $V$ of the space $R_x^m$ are continuous in a neighborhood of the point $x$.
2. The partial derivative $y_{UV}(x)$ exists in this neighborhood and is continuous at the point $x$.
Then the partial derivative $y_{VU}(x)$ also exists at the point $x$, and for each $h$ in $U$ and each $k$ in $V$
\[
y_{VU}(x)\,k\,h = y_{UV}(x)\,h\,k .
\]
For $U = V = R_x^m$ this includes as a special case the symmetry of the second total differential $y''(x)\,h\,k$ proved in 1.11. The following proof is an easy modification of the well-known proof of Schwarz.
As in 1.11 and with the notation already introduced there, it follows from the existence and continuity of the derivatives $y_U$ and $y_V$ in the vicinity of the point $x$, together with the existence of the derivative $y_{UV}$ in this neighborhood, that
\[
L\,\Delta^2 y = L\bigl(y_U(x + \vartheta_3 h + k) - y_U(x + \vartheta_3 h)\bigr)\,h = L\bigl(y_V(x + h + \vartheta_2 k) - y_V(x + \vartheta_2 k)\bigr)\,k = L\bigl(y_{UV}(x + \vartheta_1 h + \vartheta_2 k)\bigr)\,h\,k ,
\]
and thus¹
\[
L\bigl(y_U(x + \vartheta_3 h + k) - y_U(x + \vartheta_3 h)\bigr)\,h = L\,y_{UV}(x + \vartheta_1 h + \vartheta_2 k)\,h\,k .
\]
Because of the hypothesized continuity of the derivative $y_{UV}$ at the point $x$,
\[
y_{UV}(x + \vartheta_1 h + \vartheta_2 k)\,h\,k = y_{UV}(x)\,h\,k + |h|\,|k|\,\varepsilon(h, k)
\]
with $|\varepsilon(h, k)| \to 0$ for $|h|^2 + |k|^2 \to 0$. The above equation can therefore be written
\[
L\bigl(y_U(x + \vartheta h + k)\,h - y_U(x + \vartheta h)\,h - y_{UV}(x)\,h\,k\bigr) = |h|\,|k|\,L\,\varepsilon(h, k) ,
\]
where the number $\vartheta = \vartheta_3$ depends on the choice of the real linear form $L$ from the space dual to $R_y^n$, but always satisfies the condition $0 < \vartheta < 1$.

If in particular one takes $L$ to be the inner product $L y = (e, y)$ of a euclidean metric on the range space $R_y^n$, whereby $e$ denotes a temporarily arbitrary unit vector, then Schwarz's inequality yields
\[
\bigl|\bigl(e,\ y_U(x + \vartheta h + k)\,h - y_U(x + \vartheta h)\,h - y_{UV}(x)\,h\,k\bigr)\bigr| \le |h|\,|k|\,|\varepsilon(h, k)| ,
\]
where, thus, for an arbitrarily small $\varepsilon > 0$ a $\varrho_\varepsilon > 0$ exists such that $|\varepsilon(h, k)| < \varepsilon$ for $|h|^2 + |k|^2 < \varrho_\varepsilon^2$.

If $h$ is replaced in this equation by $\lambda h$, a factor of $\lambda$ comes out on both sides, and as a result for $\lambda \to 0$, because of the continuity of the derivatives $y_U$ and $y_V$,
\[
\bigl|\bigl(e,\ y_U(x + k)\,h - y_U(x)\,h - y_{UV}(x)\,h\,k\bigr)\bigr| \le \varepsilon\,|h|\,|k|
\]
for $|k| < \varrho_\varepsilon$. If, finally, in this inequality, which holds for every unit vector $e \in R_y^n$, one takes $e$ in the direction of the vector in braces, then
\[
|y_U(x + k)\,h - y_U(x)\,h - y_{UV}(x)\,h\,k| \le \varepsilon\,|h|\,|k| \qquad \text{for } |k| < \varrho_\varepsilon .
\]
Thus one can write
\[
y_U(x + k)\,h - y_U(x)\,h = y_{UV}(x)\,h\,k + |k|\,(k; x)\,h ,
\]
where, calculated in $U$, the norm $|(k; x)|_U = \sup_{|h|=1,\ h \in U} |(k; x)\,h| \le \varepsilon$ for $|k| < \varrho_\varepsilon$.

¹ In formulating the mean value theorem in 1.6, from which the above equation was deduced, the existence of the total derivative $y'(x)$ at the interior points of the segment $x = x_1 + \tau(x_2 - x_1)$ $(0 < \tau < 1)$ was, to be sure, assumed. However, it is at once clear from the proof of the mean value theorem that even the existence of the partial derivative in the direction of the one-dimensional subspace spanned by the vector $x_2 - x_1$ suffices.
Accordingly, the partial derivative $y_{VU}(x)$ exists, and for each $h$ in $U$ and each $k$ in $V$
\[
y_{VU}(x)\,k\,h = y_{UV}(x)\,h\,k , \qquad \text{q.e.d.}
\]
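For $U$ and $V$ one-dimensional coordinate directions in $R^2$, the theorem reduces to the classical equality of mixed partials, which the double difference quotient $\Delta^2 f / (h k)$ exhibits numerically. A sketch (illustrative function, not from the text):

```python
import math

# Symmetry of the mixed second partials via the double difference quotient
# (U, V the two coordinate directions of R^2; illustrative function).

def f(u, v):
    return u * u * v ** 3 + math.sin(u * v)

def mixed(u, v, h=1e-4, k=1e-4):
    # (f(u+h, v+k) - f(u+h, v) - f(u, v+k) + f(u, v)) / (h k) -> f_uv
    return (f(u + h, v + k) - f(u + h, v) - f(u, v + k) + f(u, v)) / (h * k)

u, v = 0.8, -0.4
analytic = 6 * u * v * v + math.cos(u * v) - u * v * math.sin(u * v)
assert abs(mixed(u, v) - analytic) < 1e-3
# the same quotient with different step sizes approximates the same operator
assert abs(mixed(u, v, 1e-4, 2e-4) - analytic) < 1e-3
```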
§ 4. Implicit Functions

4.1. The problem. We consider in the following a function
\[
z = z(x, y)
\]
of two vector variables $x$ and $y$ which is defined in certain neighborhoods of the points $x_0$ and $y_0$ in the linear spaces $R_x^m$ and $R_y^n$ and which has a range in a third space $R_z^p$. The problem then is to solve the equation
\[
z(x, y) = 0
\]
with respect to $x$ or to $y$ in the vicinity of $x_0$ and $y_0$, where $z(x_0, y_0) = 0$.

If coordinate systems are introduced in the three linear spaces, in which
\[
x = \sum_{i=1}^{m} \xi^i a_i, \qquad y = \sum_{j=1}^{n} \eta^j b_j, \qquad z = \sum_{k=1}^{p} \zeta^k c_k ,
\]
then the equations
\[
\zeta^k(\xi^1, \ldots, \xi^m;\ \eta^1, \ldots, \eta^n) = 0 \qquad (k = 1, \ldots, p)
\]
hold for $\xi^i = \xi_0^i$, $\eta^j = \eta_0^j$, and the problem in this formulation is to solve this system of real equations for the variables $\xi^i$ or $\eta^j$ in the neighborhood of the points $\xi_0^i$ and $\eta_0^j$.

In the case at hand, where the dimensions $m$, $n$ and $p$ are finite, the results which we shall obtain will, as far as the content is concerned, contain nothing that is in principle new. It is only the coordinate-free approach used that deviates from the traditional one and permits a brief treatment. But it is worth mentioning that, with modifications, the method used also achieves the same end in the more general Hilbert space case, which, however, we shall not discuss.¹

If, more particularly, the function $z(x, y)$ is of the form
\[
z(x, y) = y(x) - y ,
\]
where $R_z^p$ is thus the same as $R_y^n$, the problem is to find the inverse of a mapping
\[
y = y(x)
\]
from the space $R_x^m$ into the space $R_y^n$ near the points $x_0$ and $y_0 = y(x_0)$. We shall treat this special case of the problem in detail. The investigation of the general problem only requires a few easy modifications, which are in part left to the reader as exercises.

¹ R. Nevanlinna [1], [3]; F. Nevanlinna [1], [2].
4.2. Inversion of differentiable mappings. Let
\[
y = y(x)
\]
be a mapping from the $m$-dimensional linear space $R_x^m$ into the $n$-dimensional linear space $R_y^n$ that satisfies the following conditions:

A. In a neighborhood of the point $x_0$, $y(x)$ is continuous and differentiable.
B. The operator $y'(x)$ is regular for $x = x_0$.
C. The operator $y'(x)$ is continuous at the point $x_0$.

It follows from B that $m$ can be no greater than $n$. We first consider the case $m = n$.

Under these conditions we shall prove that in a certain neighborhood of the point $y(x_0) = y_0$ the inverse mapping $x = x(y)$ exists, is differentiable, and has the derivative
\[
x'(y) = (y'(x))^{-1} \qquad (x = x(y)) .
\]
In this formulation the theorem has a purely affine meaning. But in order to be able to use the methods of coordinate-free analysis, it is also expedient here to equip the spaces $R_x^m$ and $R_y^n$ with arbitrary euclidean metrics. With respect to these metrics the assumptions made can be stated in the following equivalent fashion, whereby we set $x_0 = y_0 = 0$:

A. The function $y(x)$ is differentiable in a spherical neighborhood $|x| < \varrho_x$ of the origin.
B. We have
\[
\inf_{|h|=1} |y'(0)\,h| = \mu > 0 .
\]
Hypothesis C is equivalent to assuming that the norm
\[
|y'(x) - y'(0)| \to 0
\]
for $|x| \to 0$. This condition can be weakened somewhat. For this purpose we consider for $0 \le \varrho < \varrho_x$ the nonnegative function
\[
\varphi(\varrho) = \sup_{|x| \le \varrho} |y'(x) - y'(0)| ,
\]
which increases monotonically with $\varrho$. The above hypothesis C states that the limit
\[
\lim_{\varrho \to 0} \varphi(\varrho) = \varphi(0+) = 0 .
\]
We replace this with the weaker

C'. $\varphi(0+) < \mu$.
From B and the definition of $\varphi(\varrho)$ it follows for $|x| \le \varrho$ $(< \varrho_x)$ that
\[
|y'(x)\,h| \ge |y'(0)\,h| - |y'(x)\,h - y'(0)\,h| \ge (\mu - \varphi(\varrho))\,|h| ,
\]
and therefore
\[
\inf_{|h|=1} |y'(x)\,h| \ge \mu - \varphi(\varrho) . \tag{4.1}
\]
As a consequence of hypothesis C' the least upper bound of those radii $\varrho$ for which $\mu - \varphi(\varrho) > 0$ is positive, and it is consequently possible to assume from the beginning that the number $\varrho_x$ introduced in A is so small that the inequality $\mu - \varphi(\varrho) > 0$ holds for $\varrho < \varrho_x$. Further, since the dimensions $m$ and $n$ of the spaces $R_x^m$ and $R_y^n$ were assumed to be equal, it follows that for $|x| < \varrho_x$ the linear operator $y'(x)$ maps the space $R_x^m$ one-to-one onto the entire space $R_y^m$.

We shall establish the invertibility of the mapping $y = y(x)$ in the following more precise formulation:

Inverse function theorem. Under the hypotheses A, B and C' the mapping
\[
x = x(y) \qquad (x(0) = 0)
\]
inverse to
\[
y = y(x) \qquad (y(0) = 0)
\]
exists in the neighborhood
\[
|y| < \varrho_y = \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho
\]
of the point $y = 0$. It is also differentiable here, having as its derivative
\[
x'(y) = (y'(x))^{-1} \qquad (x = x(y)) .
\]
As follows from exercise 5 in 1.13, this theorem is precise in the following sense: Without additional assumptions the spherical neighborhoods $|x| < \varrho_x$, $|y| < \varrho_y$ cannot be enlarged in a generally valid fashion.

4.3. Proof of the inverse function theorem. We go on to the proof of the inverse function theorem and show in this order the following:
1. The range set $G_y^m$ in $R_y^m$, the image of the sphere $|x| < \varrho_x$ under the function $y = y(x)$, is schlicht: two different points $x_1$ and $x_2$ of the sphere are mapped onto different points $y_1 = y(x_1)$ and $y_2 = y(x_2)$. The inverse mapping $x = x(y)$ $(x(0) = 0)$ therefore exists in $G_y^m$.
2. Interior points of the sphere $|x| < \varrho_x$ are mapped onto interior points of $G_y^m$. The image set $G_y^m$ is consequently an open region.
3. The region $G_y^m$ contains the spherical neighborhood $|y| < \varrho_y$ of the origin.
4. The inverse mapping $x = x(y)$, which is uniquely defined in $G_y^m$, is differentiable and has the derivative $x'(y) = (y'(x))^{-1}$ $(x = x(y))$.
4.4. $G_y^m$ is schlicht. Let $x_1 \ne x_2$ be two points of the sphere $|x| < \varrho_x$,
\[
x_2 - x_1 = \Delta x, \qquad y(x_2) - y(x_1) = \Delta y .
\]
We claim that $\Delta y \ne 0$. With a for the time being arbitrary unit vector $e$ from the space $R_y^m$, because of the mean value theorem,
\[
(\Delta y, e) = (y'(x_\vartheta)\,\Delta x,\ e) ,
\]
where $x_\vartheta = x_1 + \vartheta_1\,\Delta x$ $(0 < \vartheta_1 < 1)$, and consequently
\[
|(\Delta y, e)| \ge |(y'(0)\,\Delta x,\ e)| - |(y'(x_\vartheta)\,\Delta x - y'(0)\,\Delta x,\ e)| .
\]
It follows from Schwarz's inequality, on the one hand, that $|(\Delta y, e)| \le |\Delta y|$, and further that
\[
|(y'(x_\vartheta)\,\Delta x - y'(0)\,\Delta x,\ e)| \le |(y'(x_\vartheta) - y'(0))\,\Delta x| \le |y'(x_\vartheta) - y'(0)|\,|\Delta x| \le \varphi(\varrho)\,|\Delta x| ,
\]
where $\varrho$ $(< \varrho_x)$ stands for the greater of the lengths $|x_1|$ and $|x_2|$. Thus
\[
|\Delta y| \ge |(y'(0)\,\Delta x,\ e)| - \varphi(\varrho)\,|\Delta x| .
\]
Now take the unit vector $e$ so that the first term on the right is as large as possible, which by Schwarz's inequality is the case for $e = y'(0)\,\Delta x / |y'(0)\,\Delta x|$. It then follows from hypothesis B that
\[
|(y'(0)\,\Delta x,\ e)| = |y'(0)\,\Delta x| \ge \mu\,|\Delta x| ,
\]
and we obtain the inequality
\[
|\Delta y| \ge (\mu - \varphi(\varrho))\,|\Delta x| , \tag{4.2}
\]
which is important for the entire proof. From this in particular, because $\mu - \varphi(\varrho) > 0$ for $\varrho < \varrho_x$, it follows that $\Delta y \ne 0$ for $\Delta x \ne 0$, which was to be proved.

4.5. $G_y^m$ is open. If the range set $G_y^m$ includes the entire space $R_y^m$, there is nothing to prove. Thus let $b$ stand for a point of the latter space that does not belong to $G_y^m$, so that $y(x) \ne b$ for $|x| < \varrho_x$. Then if $y(x_0) = y_0$ is an arbitrary point of the range set $G_y^m$, we wish to show that $|b - y_0|$ lies above a positive bound which is independent of $b$: corresponding to the interior point $x_0$ of the sphere $|x| < \varrho_x$ there is then an interior point $y_0$ of the image set.

For this let $|x_0| < \varrho < \varrho_x$ and let $\delta$ be the greatest lower bound of $|y(x) - b|$ in the sphere $|x| \le \varrho$. Because $R_x^m$ is finite dimensional this lower bound is attained at at least one point $x = a$ of this closed sphere:
\[
|y(a) - b| = \delta > 0 .
\]
We first show that $a$ is necessarily a boundary point of the sphere $|x| \le \varrho$ and that consequently $|a| = \varrho$.
In fact, because the operator $y'(x)$ for $|x| < \varrho_x$ maps the space $R_x^m$ one-to-one onto the entire space $R_y^m$, the vector $h$ $(\in R_x^m)$ can be determined so that
\[
y'(a)\,h = b - y(a) .
\]
Now if we had $|a| < \varrho$, then one could take $\lambda > 0$ so small that
\[
|x| = |a + \lambda h| \le |a| + \lambda\,|h| \le \varrho ,
\]
and one would have
\[
y(x) - b = y(a) - b + \lambda\,y'(a)\,h + \lambda\,(\lambda) = (1 - \lambda)\,(y(a) - b) + \lambda\,(\lambda)
\]
with $|(\lambda)| \to 0$ for $\lambda \to 0$. Finally, for a sufficiently small $\lambda$,
\[
|y(x) - b| \le (1 - \lambda)\,|y(a) - b| + \lambda\,|(\lambda)| = \delta - \lambda\,(\delta - |(\lambda)|) < \delta ,
\]
which because $|x| \le \varrho$ contradicts the definition of $\delta$. Hence $|a| = \varrho$.

After this preparation we have (cf. Fig. 2)
\[
|b - y(x_0)| \ge |y(a) - y(x_0)| - |y(a) - b| = |y(a) - y(x_0)| - \delta ,
\]
where by virtue of inequality (4.2)
\[
|y(a) - y(x_0)| \ge (\mu - \varphi(\varrho))\,|a - x_0| \ge (\mu - \varphi(\varrho))\,(\varrho - |x_0|) ,
\]
and, as a consequence of the definition of $\delta$, $\delta \le |b - y(x_0)|$.

Fig. 2

Thus
\[
|b - y(x_0)| \ge \tfrac{1}{2}\,(\mu - \varphi(\varrho))\,(\varrho - |x_0|) > 0 ,
\]
which proves the assertion: If $y_0$ belongs to the image point set $G_y^m$, then for any $\varrho$ in the interval $|x_0| < \varrho < \varrho_x$ the neighborhood
\[
|y - y_0| < \tfrac{1}{2}\,(\mu - \varphi(\varrho))\,(\varrho - |x_0|)
\]
lies entirely in $G_y^m$.
4.6. The region $G_y^m$ contains the sphere $|y| < \varrho_y$. Let $e$ be a unit vector from the space $R_y^m$. Because the open region $G_y^m$ contains the point $y(0) = 0$, there exists for $y = \lambda e$ $(\lambda \ge 0)$ a unique $\Lambda > 0$ so that the segment $y = \lambda^* e$ lies in $G_y^m$ for $\lambda^* < \Lambda$, while, provided $\Lambda$ is finite, this is not the case for $\lambda^* \ge \Lambda$. We have to show that for every $e$
\[
\Lambda \ge \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho .
\]
For this we take $0 < \lambda^* < \Lambda$ and set $\lambda^* e = y^*$, $x(y^*) = x^*$, so that the segment $y = \lambda e$ $(0 \le \lambda \le \lambda^*)$ lies in $G_y^m$ and $|x^*| = \varrho^* < \varrho_x$. Let
\[
0 = \lambda_0 < \lambda_1 < \cdots < \lambda_n = \lambda^*
\]
be a decomposition of the interval $0 \le \lambda \le \lambda^*$, $\lambda_i e = y_i$ and $x(y_i) = x_i$. According to (4.2)
\[
|y_{i+1} - y_i| = \lambda_{i+1} - \lambda_i \ge (\mu - \varphi(\bar{\varrho}_i))\,|x_{i+1} - x_i| ,
\]
where $\varrho_i = |x_i|$ and $\bar{\varrho}_i = \max(\varrho_i, \varrho_{i+1})$. From here it follows that
\[
\lambda^* = \sum_{i=0}^{n-1} |y_{i+1} - y_i| \ge \sum_{i=0}^{n-1} (\mu - \varphi(\bar{\varrho}_i))\,|\varrho_{i+1} - \varrho_i| .
\]
If the terms on the right with $\varrho_{i+1} < \varrho_i$ are omitted, then unlimited refinement of the decomposition yields the estimate
\[
\lambda^* \ge \int_0^{\varrho^*} (\mu - \varphi(\varrho))\,d\varrho ,
\]
from which the desired inequality for $\Lambda$ follows by taking the limit $\lambda^* \to \Lambda$. From
\[
\int_0^{\varrho^* - 0} (\mu - \varphi(\varrho))\,d\varrho = \varrho^*\,(\mu - \varphi(\varrho^* - 0)) + \int_0^{\varrho^* - 0} \varrho\,d\varphi(\varrho)
\]
it follows that
\[
\int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho \ge \int_0^{\varrho_x - 0} \varrho\,d\varphi(\varrho) ,
\]
where equality holds on the right provided $\varrho_x$ stands for the least upper bound of those radii $\varrho$ for which $\mu - \varphi(\varrho) > 0$.
4.7. The derivative $x'(y) = (y'(x))^{-1}$ exists in $G_y^m$. Let $y$ and $y + \Delta y$ be two points of the open image region $G_y^m$, $x$ and $x + \Delta x$ the uniquely determined preimages of these points in the sphere $|x| < \varrho_x$, so that $y(x) = y$, $y(x + \Delta x) = y + \Delta y$. Then because of the differentiability of the function $y(x)$
\[
\Delta y = y'(x)\,\Delta x + |\Delta x|\,(\Delta x; x)
\]
with $|(\Delta x; x)| \to 0$ for $|\Delta x| \to 0$.

Since $y'(x)$ is regular in the sphere $|x| < \varrho_x$ and maps the space $R_x^m$ one-to-one onto the entire space $R_y^m$, the inverse operator $(y'(x))^{-1}$ exists, and by inequality (4.1) its norm is
\[
|(y'(x))^{-1}| \le (\mu - \varphi(\varrho))^{-1} ,
\]
where $|x| \le \varrho < \varrho_x$. The above equation can therefore also be written
\[
(y'(x))^{-1}\,\Delta y = \Delta x + |\Delta x|\,(y'(x))^{-1}\,(\Delta x; x) ,
\]
where
\[
|(y'(x))^{-1}\,(\Delta x; x)| \le (\mu - \varphi(\varrho))^{-1}\,|(\Delta x; x)| .
\]
Further, according to (4.2),
\[
|\Delta x| \le (\mu - \varphi(\varrho))^{-1}\,|\Delta y| ,
\]
and hence
\[
\Delta x = (y'(x))^{-1}\,\Delta y + |\Delta y|\,(\Delta y; y) ,
\]
where in consequence of the above inequalities $|(\Delta y; y)| \to 0$ for $|\Delta y| \to 0$. The inverse mapping $x = x(y)$ is therefore differentiable at the point $y$ with the derivative $x'(y) = (y'(x))^{-1}$ $(x = x(y))$. This completes the proof of the inverse function theorem of 4.2.
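In coordinates the theorem can be exercised numerically: solve $y(x) = t$ by Newton's method and compare a difference quotient of the inverse with a column of $(y'(x))^{-1}$. The map below is an illustrative choice, not from the text.

```python
# Invert y(x) = (x1 + 0.2*x2**2, x2 + 0.2*x1**2) near the origin by
# Newton's method, then compare a difference quotient of the inverse
# with a column of (y'(x))**(-1).  Map and numbers are illustrative.

def y(x1, x2):
    return (x1 + 0.2 * x2 * x2, x2 + 0.2 * x1 * x1)

def jac(x1, x2):
    return ((1.0, 0.4 * x2), (0.4 * x1, 1.0))

def solve(t1, t2, n=25):
    x1 = x2 = 0.0
    for _ in range(n):
        (a, b), (c, d) = jac(x1, x2)
        f1, f2 = y(x1, x2)
        r1, r2 = f1 - t1, f2 - t2
        det = a * d - b * c
        x1 -= (d * r1 - b * r2) / det
        x2 -= (a * r2 - c * r1) / det
    return x1, x2

t = (0.3, -0.1)
x = solve(*t)
assert max(abs(u - v) for u, v in zip(y(*x), t)) < 1e-12

eps = 1e-6
xp = solve(t[0] + eps, t[1])
(a, b), (c, d) = jac(*x)
det = a * d - b * c
# first column of (y'(x))**(-1) is (d/det, -c/det)
assert abs((xp[0] - x[0]) / eps - d / det) < 1e-4
assert abs((xp[1] - x[1]) / eps + c / det) < 1e-4
```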
4.8. The case $m < n$. From hypothesis B of the inverse function theorem it follows, as already remarked, that the dimensions of the spaces $R_x^m$ and $R_y^n$ satisfy the inequality $m \le n$; above we assumed $m = n$. If $m < n$, then the equation
\[
y = y(x)
\]
defines an $m$-dimensional surface in the space $R_y^n$ which at the point $y(0) = 0$ has the $m$-dimensional tangent plane $E_0 = y'(0)\,R_x^m$. Let $P y$ be the orthogonal projection of the vector $y$ onto this tangent plane and
\[
\bar{y}(x) = P\,y(x) .
\]
Now if $y(x)$ satisfies the metric conditions A, B and C', then $\bar{y}(x)$, because $\bar{y}'(x) = P\,y'(x)$, also fulfills hypothesis A, and as a consequence of
\[
\bar{y}'(0)\,h = P\,y'(0)\,h = y'(0)\,h
\]
hypothesis B too. Further, since $|P| = 1$, and consequently
\[
\bar{\varphi}(\varrho) = \sup_{|x| \le \varrho} |\bar{y}'(x) - \bar{y}'(0)| \le \sup_{|x| \le \varrho} |y'(x) - y'(0)| = \varphi(\varrho) ,
\]
$\bar{y}(x)$ also satisfies condition C'.

From the inverse function theorem it now follows that the function $\bar{y} = \bar{y}(x)$ maps the sphere $|x| < \varrho_x$ onto an open region $G_{\bar{y}}^m$ of the tangent plane $E_0$ and that the inverse function $x = x(\bar{y}) = x(P y)$ exists at the very least in the neighborhood
\[
|\bar{y}| < \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho
\]
of the origin $\bar{y} = 0$ on $E_0$. As a result, we have for the function $y = y(x)$ this

Theorem. Let $y = y(x)$ be a function defined for $|x| < \varrho_x$ which satisfies conditions A, B and C' of 4.2. Provided the dimension $n$ of the space $R_y^n$ is greater than the dimension $m$ of $R_x^m$, then $y = y(x)$ maps the sphere $|x| < \varrho_x$ one-to-one onto a point set $G_y \subset R_y^n$ in such a way that the following holds: If $y = y(x)$ is orthogonally projected onto the tangent plane $E_0 = y'(0)\,R_x^m$, then the projections $\bar{y}(x)$ on $E_0$ cover the sphere
\[
|\bar{y}| < \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho
\]
without any gaps.
4.9. Solution of the equation $z(x, y) = 0$. Having treated the case $z(x, y) = y(x) - y$ thoroughly in the preceding paragraphs, we can now take care of the general equation
\[
z(x, y) = 0 \qquad (z(0, 0) = 0)
\]
more briefly.

Let $R_x^m$, $R_y^n$, $R_z^p$ be three spaces with euclidean metrics, and suppose $z = z(x, y)$ $(z(0, 0) = 0)$ is a mapping into the space $R_z^p$ defined in the spheres
\[
|x| < \varrho_x, \qquad |y| < \varrho_y .
\]
We assume: The partial derivative $z_x(x, y)$ exists for $|x| < \varrho_x$, $|y| < \varrho_y$, and one has for its greatest lower bound
\[
\inf_{|h| = 1,\ |y| < \varrho_y} |z_x(0, y)\,h| = \mu > 0 .
\]
From here it follows that $m \le p$; as in the inverse function theorem we restrict ourselves to the case $m = p$, so that for each $y$ in the sphere $|y| < \varrho_y$ the operator $z_x(0, y)$ maps the space $R_x^m$ one-to-one onto the entire space $R_z^m$.

As in the inverse function theorem in 4.2 we require an upper bound for the norm of the operator $z_x(x, y) - z_x(0, y)$. We assume: For $\varrho < \varrho_x$ let
\[
\varphi(\varrho) = \sup_{|x| \le \varrho,\ |y| < \varrho_y} |z_x(x, y) - z_x(0, y)| < \mu .
\]
For a fixed $y$ in the sphere $|y| < \varrho_y$ one can then apply the inverse function theorem to the function $z(x, y) - z(0, y)$ of $x$. It follows that $x$ is defined as a single-valued function $x(z, y)$ of $z$ and $y$ in the spheres
\[
|z - z(0, y)| < \varrho_z = \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho \tag{4.3}
\]
and $|y| < \varrho_y$, so that $x(z(0, y),\ y) = 0$.

We now assume that the least upper bound $\sup_{|y| < \varrho_y} |z(0, y)|$ is smaller than the integral on the right in (4.3). Then if $\varrho_0$ stands for the least upper bound of those numbers $0 < \varrho \le \varrho_y$ for which
\[
\sup_{|y| < \varrho} |z(0, y)| < \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho ,
\]
then $0 < \varrho_0 \le \varrho_y$, and the above result can be applied in particular for $z = 0$. Hence, under this condition, $x = x(0, y) = x(y)$ is defined for $|y| < \varrho_0$ as a single-valued function of $y$, and $z(x(y), y) \equiv 0$ for $|y| < \varrho_0$ $(\le \varrho_y)$. We therefore have this
Theorem. Let
\[
z = z(x, y) \qquad (z(0, 0) = 0)
\]
be a vector function defined for $|x| < \varrho_x$, $|y| < \varrho_y$ $(x \in R_x^m,\ y \in R_y^n,\ z \in R_z^m)$ with these properties:

A. The partial derivative $z_x(x, y)$ exists for $|x| < \varrho_x$, $|y| < \varrho_y$, and
\[
\inf_{|h| = 1,\ |y| < \varrho_y} |z_x(0, y)\,h| = \mu > 0 .
\]
B. For $\varrho < \varrho_x$ we have
\[
\varphi(\varrho) = \sup_{|x| \le \varrho,\ |y| < \varrho_y} |z_x(x, y) - z_x(0, y)| < \mu .
\]
C.
\[
\sup_{|y| < \varrho_y} |z(0, y)| < \int_0^{\varrho_x - 0} (\mu - \varphi(\varrho))\,d\varrho .
\]
Under these conditions there exists for $|y| < \varrho_y$ one and only one function
\[
x = x(y) \qquad (x(0) = 0)
\]
such that
\[
z(x(y), y) \equiv 0 .
\]
If $z(x, y)$ is totally differentiable, then $x(y)$ is also differentiable, and for $|y| < \varrho_y$
\[
x'(y) = -(z_x(x, y))^{-1}\,z_y(x, y) \qquad (x = x(y)) .
\]
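A one-dimensional illustration of the theorem ($m = n = p = 1$; the function $z$ is an arbitrary choice, not from the text): since $z_x = 1 + 0.3\cos x \ge 0.7 > 0$ everywhere, $z(x, y) = 0$ determines $x(y)$, and $x'(y) = -(z_x)^{-1} z_y$.

```python
import math

# m = n = p = 1 instance (illustrative): z(x, y) = x + 0.3*sin(x) - y.
# Here z_x = 1 + 0.3*cos(x) >= 0.7 > 0, so x(y) exists, and since
# z_y = -1, x'(y) = -(z_x)**(-1) * z_y = 1 / (1 + 0.3*cos(x)).

def z(x, y):
    return x + 0.3 * math.sin(x) - y

def x_of_y(y, lo=-5.0, hi=5.0):
    # bisection; z is strictly increasing in x
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if z(mid, y) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

yv = 0.7
xv = x_of_y(yv)
assert abs(z(xv, yv)) < 1e-10

eps = 1e-6
slope = (x_of_y(yv + eps) - x_of_y(yv - eps)) / (2 * eps)
assert abs(slope - 1.0 / (1.0 + 0.3 * math.cos(xv))) < 1e-4
```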
4.10. The essential part of the above theorem is contained in the following corollary: Let the vector function $z = z(x, y)$ $(x \in R_x^m,\ y \in R_y^n,\ z \in R_z^m,\ z(0, 0) = 0)$, which is defined in a neighborhood of the point $x = 0$, $y = 0$, be differentiable with respect to $x$, and let $z(0, y)$, $z_x(0, y)$ furthermore be continuous for $y = 0$. Then provided
\[
\inf_{|h| = 1} |z_x(0, 0)\,h| > 0
\]
and
\[
\lim_{\varrho \to 0} \sup_{|x| \le \varrho} |z_x(x, 0) - z_x(0, 0)| < \inf_{|h| = 1} |z_x(0, 0)\,h| ,
\]
the equation
\[
z(x, y) = 0
\]
can be solved for $x$ in a neighborhood of the point $x = 0$, $y = 0$: There exists a number $\varrho_y > 0$ and a function $x = x(y)$, well-defined for $|y| < \varrho_y$, such that $x(0) = 0$ and $z(x(y), y) \equiv 0$.
4.11. Second proof of the inverse function theorem. Because of the great significance of the theory of implicit functions we shall in addition treat the inversion problem using Picard's method of successive approximations. The following discussion can be read directly, independently of the previous investigation. It takes us to our goal under somewhat broader assumptions than those basic to the proof of 4.2-4.7. The following construction also works without modification for the general case of a Banach space.

4.12. The inverse function theorem. We consider a self-mapping $y = y(x)$ $(y(0) = 0)$ of the Minkowski (or Banach) space $R$, defined for $|x| < r_0 \le \infty$, with the following property: For $0 < r < r_0$ the function $\varphi(x) = y(x) - x$ satisfies the inequality
\[
\theta(r) = \sup_{|x| \le r} \Theta(x) < 1 , \tag{4.4}
\]
where
\[
\Theta(x) = \limsup_{\Delta x \to 0} \frac{|\varphi(x + \Delta x) - \varphi(x)|}{|\Delta x|} .
\]
Under these conditions the mapping $x \to y$ can be uniquely inverted: There exists a well-determined inverse $y \to x = x(y)$, defined in the sphere
\[
|y| < \int_0^{r_0} (1 - \theta(r))\,dr ,
\]
such that $y(x(y)) = y$.
§ 4. Implicit Functions
We first give two simple lemmas.

Lemma 1. In the sphere |x| < r_0 of the space R let a single-valued mapping y = f(x) into R be given. If there exists a number M (0 ≤ M < ∞) such that

    lim sup_{Δx→0} |f(x + Δx) − f(x)| / |Δx| ≤ M ,

then for |a|, |b| < r_0

    |f(a) − f(b)| ≤ M |a − b| .

Proof. Since the assertion is trivial for a = b, we assume that a ≠ b. Let c be an interior point of the line segment a b. Then

    |f(a) − f(b)| ≤ |a − c| (|f(a) − f(c)| / |a − c|) + |c − b| (|f(c) − f(b)| / |c − b|) ,

and, provided a_1 b_1 is that one of the subsegments a c, c b corresponding to the greater of the two difference quotients on the right,

    |f(a) − f(b)| / |a − b| ≤ |f(a_1) − f(b_1)| / |a_1 − b_1| .

Repetition of this argument yields a sequence of nested line segments a_n b_n such that

    |f(a) − f(b)| / |a − b| ≤ |f(a_n) − f(b_n)| / |a_n − b_n|    (n = 1, 2, …) .

We now determine this process so that a_n − b_n → 0 for n → ∞. Then the limit point x = lim a_n = lim b_n exists (|x| < r_0). Provided, for a given n, x lies interior to the segment a_n b_n, the application of the above argument gives

    |f(a) − f(b)| / |a − b| ≤ |f(a_n) − f(b_n)| / |a_n − b_n| ≤ |f(c_n) − f(x)| / |c_n − x| ,

where c_n is either a_n or b_n. The same inequality holds even if x coincides with a_n or b_n, where c_n is then taken to be b_n or a_n, respectively.

Taking the limit n → ∞ one finds

    |f(a) − f(b)| / |a − b| ≤ lim sup_{n→∞} |f(c_n) − f(x)| / |c_n − x| ≤ M ,

which completes the proof of Lemma 1.

Lemma 2. Assuming that

    lim sup_{Δx→0} |f(x + Δx) − f(x)| / |Δx| ≤ M(x) ,

then for |a|, |b| < r_0

    |f(a) − f(b)| ≤ ∫_{ab} M(x) |dx| ,

where on the right we have the upper (Riemann) integral of M(x) along the segment a b.

Proof. Divide a b into finitely many intervals (Δx). By Lemma 1

    |f(x + Δx) − f(x)| ≤ M_Δ |Δx| ,

where M_Δ = sup M(x) on Δx, and the assertion results by summation and then unrestricted refinement of the partition (Δx).
We now come to the proof of the inverse function theorem and show:

1) The mapping x → y is univalent for |x| < r_0. As our first result we have

    |y(a) − y(b)| = |a − b + φ(a) − φ(b)| ≥ |a − b| − |φ(a) − φ(b)| .

If, for example, |b| ≤ |a| (< r_0), then by hypothesis D(x) ≤ θ(|x|) ≤ θ(|a|) on the segment a b, and by Lemma 1

    |φ(a) − φ(b)| ≤ θ(|a|) |a − b| .

Consequently,

    |y(a) − y(b)| ≥ |a − b| − θ(|a|) |a − b| = |a − b| (1 − θ(|a|)) > 0 .

Hence y(a) ≠ y(b) for a ≠ b.

2) By means of Picard's method of successive approximations we construct the sought-for inverse function x = x(y). To this end one sets

    x = y − φ(x)

and for a given y determines the sequence x_n by

    x_{n+1} = y − φ(x_n)    (n = 0, 1, …; x_0 = 0).    (4.5)

We must show:

A. There exists a number ρ_0 > 0 such that the points x_n all lie in the sphere |x| < r_0 provided |y| < ρ_0.

B. The sequence x_n converges for |y| < ρ_0 and n → ∞ to a point x = x(y), |x(y)| < r_0.

It then follows from A and B that x(y) is the sought-for inverse function. For because of hypothesis (4.4) φ(x) and thus also y(x) is continuous for |x| < r_0. It further results from (4.5) and B, for n → ∞, that

    x(y) = y − φ(x(y)) ,    y − y(x(y)) = 0 .

x = x(y) is thus the sought-for inverse function.

Proof of A. A number ρ_0 > 0 is to be determined so that the points x_n computed by means of (4.5) lie in the sphere |x| < r_0. Assuming that y is a point for which the points x_0, x_1, …, x_n have the desired property, one inquires under which additional conditions it is also true that |x_{n+1}| < r_0.
For this one estimates the function |φ(x)|. Let 0 < |x| = r < r_0 and x = r e, where e is a unit vector. If one takes a = x, b = 0, then as a result of Lemma 2, and in view of (4.4), one has

    |φ(x)| ≤ ∫_0^r θ(ρ) dρ ,

and hence

    |x_{n+1}| = |y − φ(x_n)| ≤ |y| + |φ(x_n)| ≤ |y| + ∫_0^{r_0} θ(r) dr .

Now if we are to have |x_{n+1}| < r_0, it suffices to take |y| smaller than

    r_0 − ∫_0^{r_0} θ(r) dr .

For this reason we set

    ρ_0 = r_0 − ∫_0^{r_0} θ(r) dr = ∫_0^{r_0} (1 − θ(r)) dr    (> 0).

With this choice of ρ_0 all of the points x_n lie in the sphere |x| < r_0, provided |y| < ρ_0: for n = 0, |x_0| = 0 < r_0, and the above induction argument shows that |x_n| < r_0 holds for every n. Because

    |x_{n+1}| ≤ |y| + ∫_0^{r_0} θ(r) dr = r_0 − (ρ_0 − |y|) ,

the points x_n even lie in the sphere |x| ≤ r_0 − (ρ_0 − |y|).

Proof of B. By (4.5) one has for n > 0

    |x_{n+1} − x_n| = |φ(x_n) − φ(x_{n−1})| .

If |y| < ρ_0, then the inequality |x_n| ≤ r_0 − (ρ_0 − |y|) follows from A for every n. The same inequality holds for each point x of the line segment x_{n−1} x_n, and by (4.4)

    D(x) ≤ θ(r_0 − (ρ_0 − |y|)) = θ_0 < 1 ,

so that by Lemma 1

    |x_{n+1} − x_n| ≤ θ_0 |x_n − x_{n−1}| ≤ θ_0^n |x_1| ≤ θ_0^n |y| .

The series Σ (x_{n+1} − x_n) thus converges uniformly in each sphere |y| ≤ ρ < ρ_0, from which the existence and even the continuity of the inverse function follows for |y| < ρ_0. The uniqueness of the inverse function results from the single-valuedness of the function y(x).
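The approximation scheme (4.5) can be run directly. Below is a minimal numerical sketch (not from the text), taking R = R^1 and the hypothetical perturbation φ(x) = 0.3 sin x, so that D(x) ≤ 0.3 and θ(r) ≡ 0.3 < 1.

```python
import math

# y(x) = x + phi(x) with phi(x) = 0.3*sin(x); |phi'| <= 0.3, so D(x) <= 0.3 < 1
# and the Picard iteration contracts with ratio theta_0 = 0.3.
def phi(x):    return 0.3 * math.sin(x)
def y_of_x(x): return x + phi(x)

def invert(y, n_iter=80):
    x = 0.0                  # x_0 = 0
    for _ in range(n_iter):
        x = y - phi(x)       # scheme (4.5): x_{n+1} = y - phi(x_n)
    return x

target = 0.8
x = invert(target)
print(abs(y_of_x(x) - target))  # residual of the inversion, essentially zero
```

Each iteration shrinks the error by the factor θ_0 = 0.3, exactly as in the geometric-series estimate of Proof of B.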
4.13. Example. The following example shows that under the hypotheses of the inverse function theorem the inverse mapping cannot in general be continued outside of the sphere

    |y| < ∫_0^{r_0} (1 − θ(r)) dr .

Define y(x) by

    y(x) = x (1 − g(r)) ,    g(r) = (1/r) ∫_0^r θ(ρ) dρ    (r = |x|) ,

where θ(r) is for r ≥ 0 a continuous, monotonically increasing function such that

    0 ≤ θ(r) < 1 for 0 ≤ r < r_0 and θ(r) = 1 for r ≥ r_0 .

Suppose |x| = r < r_0, and further let Δx (|x + Δx| < r_0) be an increment of x and Δr, Δg and Δφ the corresponding increments of the functions r, g and φ = y(x) − x = −x g(r). One has (g(r) = g)

    −Δφ = (x + Δx)(g + Δg) − x g = Δx g + x Δg + Δx Δg .

Thus, since |Δr| ≤ |Δx|,

    |Δφ| ≤ g |Δx| + r |Δg| + |Δx| |Δg| ,

and

    D(x) = lim sup_{Δx→0} |Δφ| / |Δx| ≤ g(r) + r g'(r) = (r g(r))' = θ(r) < 1 .

The hypotheses of the inverse function theorem are thus satisfied, and one concludes that the function y(x) can be uniquely inverted for

    |y| < ρ_0 = ∫_0^{r_0} (1 − θ(r)) dr .

This also follows at once from the expression

    |y(x)| = r (1 − g(r)) = r − ∫_0^r θ(ρ) dρ = ∫_0^r (1 − θ(ρ)) dρ .

The norm |y(x)| increases on the interval 0 ≤ |x| ≤ r_0 from 0 to ρ_0 and for |x| ≥ r_0 maintains the constant value ρ_0. The sphere |y| < ρ_0 is thus the precise domain of existence of the inverse function x = x(y).
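A concrete instance of this example (an illustration, not in the text): take θ(r) = min(r, 1) and r_0 = 1, so that ρ_0 = ∫_0^1 (1 − r) dr = 1/2. The sketch below tabulates |y| = r − ∫_0^r θ(ρ) dρ and confirms that it climbs to ρ_0 at r = r_0 and is constant beyond.

```python
# theta(r) = min(r, 1), r0 = 1:
# |y|(r) = r - integral_0^r theta = r - r^2/2 for r <= 1, and = 1/2 for r >= 1.
def norm_y(r):
    return r - r*r/2 if r <= 1 else 0.5

rho0 = 0.5                                    # = integral_0^1 (1 - theta(r)) dr
vals = [norm_y(k / 100) for k in range(301)]  # r from 0 to 3
print(max(vals))                              # never exceeds rho0
print(norm_y(1.0), norm_y(2.0))               # constant value rho0 for r >= 1
```

Since |y| never exceeds ρ_0, no point with |y| > ρ_0 has a preimage, and every value |y| = ρ_0 is taken by a whole sphere of points: invertibility fails exactly at the boundary |y| = ρ_0.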
4.14. Case where y(x) is differentiable. If y(x) is differentiable for |x| < r_0, then D(x) = |φ'(x)| = |y'(x) − I|¹, where I is the identity mapping. Thus the inverse function theorem holds provided

    θ(r) = sup_{|x|≤r} |y'(x) − I| < 1    (r < r_0) .

The proof in 4.7 shows that the inverse function x(y) is differentiable for |y| < ρ_0 with the derivative x'(y) = (y'(x))^{−1}.

In addition, we apply the inverse function theorem to a twice differentiable function y(x). If y'(0) = I and

    sup_{|x|≤r_0} |y''(x)| = M < ∞ ,

then it follows from Lemma 1, for an arbitrary constant h ∈ R, that

    |(y'(x) − I) h| = |∫_{0x} y''(x) dx h| ≤ M |h| ∫_{0x} |dx| = M r |h| .

Hence

    θ(r) = sup_{|x|≤r} |y'(x) − I| ≤ M r < 1

for r < 1/M, and the inverse function therefore exists in the sphere

    |y| < ρ_0 = ∫_0^{1/M} (1 − M r) dr = 1/(2M) .

With θ(r) = M r the example of 4.13,

    y(x) = x − (x/r) ∫_0^r θ(ρ) dρ = x (1 − M r/2)    (r = |x|) ,

shows that the radius 1/(2M) cannot be increased. For the mapping x → y is no longer schlicht for |x| ≤ r (1/M < r < 2/M), and for |x| < 1/M the image region is precisely the sphere |y| < 1/(2M) = ρ_0.

¹ The norm |φ'(x)| is defined as |φ'(x)| = sup_{|h|=1} |φ'(x) h|.
4.15. Exercises. 1. Let y = y(x) be a mapping from the euclidean space R_x^m into the euclidean space R_y^n which is continuously differentiable in a neighborhood of the point x_0. The point x_0 is to be called a regular point of the function y(x) provided the kernel of the operator y'(x_0) is of the lowest possible dimension. Now let m > n. Then the kernel of the operator y'(x_0) is of dimension p ≥ m − n, and the point x_0 is regular provided this dimension p is exactly m − n.

Prove: Let m > n and suppose x_0 is a regular point of the function y(x). Then if one considers the set of points in the space R_x^m that satisfy the equation

    y(x) = y(x_0) = y_0 ,

the subset of these points that lie in a sufficiently small neighborhood |x − x_0| < r_x can be mapped one-to-one and differentiably onto an open region G^p of a p-dimensional parameter space R_u^p. There consequently exists a function

    x = x(u)    (x_0 = x(u_0))

defined and continuous in G^p such that on the one hand y(x(u)) = y_0 and |x(u) − x_0| < r_x for u ∈ G^p, and on the other hand, conversely, corresponding to each x in the neighborhood |x − x_0| < r_x that satisfies the equation y(x) = y_0 there is a unique u ∈ G^p such that x(u) = x; moreover, the operator x'(u_0) is regular.

Hint. Take for the parameter space R_u^p the kernel of the operator y'(x_0). A linearly independent, e.g. orthogonal, complement R_v^n of R_u^p in R_x^m is then of dimension n = m − p, so that each x ∈ R_x^m can be represented in a unique way as a sum x = u + v. Then y(x) = y(u + v) = y(u, v), and here the partial derivative y_v in the direction of the subspace R_v^n is regular for x_0 = u_0 + v_0, so that

    inf_{|v|=1} |y_v(x_0) v| = μ > 0 .

By the theorem of 4.9 the equation y(x) = y(x_0) = y_0 can be solved in a neighborhood of the point x_0 = u_0 + v_0 for v, which implies the assertion for x = u + v(u) = x(u).

Remark. Phrased geometrically the above theorem states that the equation y(x) = y(x_0) = y_0 defines in a neighborhood of the regular point x_0 a "regular surface" of dimension p = m − n which is embedded in the space R_x^m. The kernel R_u^p defined by y'(x_0) u = 0 is the tangent space and the orthogonal complement R_v^n the normal space to the surface at the point x_0.

Each one-to-one continuously differentiable mapping u = u(ū), ū = ū(u) of the region G^p onto a region Ḡ^p of the same or of another p-dimensional parameter space R_ū^p leads to a new parametric representation x = x(u) = x(u(ū)) = x̄(ū) of the required sort.

2. We consider in the following, as a supplement to the case of a regular point x_0 treated in the text and in the previous exercise, a "degenerate" function y(x) which has no regular points at all in its domain of definition. For each x, therefore, the dimension of the kernel of y'(x) is positive if m ≤ n, and > m − n provided m > n. Let x_0 then be a point where the dimension of the kernel of y'(x) reaches its minimum p; according to hypothesis x_0 is also an irregular point of the function y(x), consequently q = m − p < n.

Prove: In a sufficiently small neighborhood |x − x_0| < r_z a function
    z = z(x)

can be defined whose range lies in a space R_z^q of dimension q = m − p, so that in this neighborhood y becomes a function of z,

    y = y(x) = ȳ(z(x)) ,

which in the image region z(|x − x_0| < r_z) is everywhere regular.

Hint. Suppose, as earlier, that R_u^p is the p-dimensional kernel of y'(x_0) and that R_v^q is a linearly independent complement, so that x = u + v and y(x) = y(u + v) = y(u, v). Then for x_0 = u_0 + v_0

    inf_{|v|=1} |y_v(x_0) v| = μ > 0 ,

and the operator y_v(x_0) therefore maps R_v^q one-to-one onto a q-dimensional strict subspace R_z^q = y_v(x_0) R_v^q of R_y^n. Let P y be the projection of y onto this subspace and

    P y(x) = z(x) = z(u, v) .

Hence z'(x) = P y'(x), and in particular z_v(x_0) = P y_v(x_0) = y_v(x_0). Consequently

    inf_{|v|=1} |z_v(x_0) v| = μ > 0 .

In consequence of the continuity of y'(x) for x = x_0, r_z can therefore be chosen so small that for |x − x_0| < r_z

    0 < inf_{|v|=1} |z_v(x) v| ≤ inf_{|v|=1} |y_v(x) v| .    (a)

Then the dimension of the kernel of y'(x), which by hypothesis is ≥ p, is precisely = p for |x − x_0| < r_z; for it follows from (a) that this dimension can be at most p. If R_{u_x}^p stands for this kernel (u_{x_0} = u), then for each x in the above neighborhood the kernel R_{u_x}^p, which varies with x, and the fixed space R_v^q are linearly independent complements, i.e.,

    R_x^m = R_{u_x}^p + R_v^q .    (b)

According to the inverse function theorem of 4.2, the equation z = z(u, v) can be solved for v in a neighborhood of the points x_0, z_0 = z(x_0), so that

    y = y(u, v(u, z)) = ȳ(u, z)  and  z = z(u, v(u, z))    (c)

identically. We claim that ȳ only depends on z, so that

    ȳ_u(u, z) du = y_u(x) du + y_v(x) dw    (dw = v_u(u, z) du)    (d)

vanishes for |x − x_0| < r_z and for each du of the kernel R_u^p. For the proof, observe that according to (b) the differential du can be uniquely represented as a sum

    du = du_x + dv    (e)

for each x of the sphere |x − x_0| < r_z, where du_x stands for a vector from the kernel of y'(x), while dv is a vector from the fixed space R_v^q. By (c)

    0 = z_u(x) du + z_v(x) dw = z'(x) (du + dw) ,

from which it follows, in view of (e), that

    0 = z'(x) du_x + z'(x) (dv + dw) = z'(x) (dv + dw) ;

for z'(x) du_x = P y'(x) du_x = 0, since du_x is a vector of the kernel of y'(x). But then because of (a)

    dv + dw = 0 ,    (f)

and it therefore results from equations (d), (e), (f) that

    ȳ_u(u, z) du = y'(x) (du + dw) = y'(x) du_x + y'(x) (dv + dw) = 0 ,

so that in fact ȳ only depends on z.

The function y = ȳ(z) has nothing but regular points. For from

    ȳ'(z) dz = y_v(x) dv = 0

it follows by (a) that dv must = 0, which according to (c) further implies that

    dz = z_v(x) dv = 0 ,
which was to be proved.

3. Using the notation of exercise 1, let x_0 be a regular point of the differentiable function y(x) and R_u^p the kernel (p = m − n) of y'(x_0). Let z(x) be a second function which is differentiable in a neighborhood of x_0 with range in a space R_z^l.

Prove: For z(x) to be stationary with respect to y(x) at the point x_0, so that from y'(x_0) du = 0 it always follows that z'(x_0) du = 0, and the kernel of y'(x_0) is thus contained in the kernel of z'(x_0), it is necessary and sufficient that a linear mapping

    z = Λ y

of the space R_y^n into the space R_z^l exist such that for each differential dx from R_x^m

    (z'(x_0) − Λ y'(x_0)) dx = 0 .

Hint. As before, let R_v^n be a linearly independent complement of the kernel R_u^p. The equation y = y(x) = y(u, v) can in a neighborhood of the points x_0, y_0 = y(x_0) be solved for v, v = v(u, y), so that we have identically y = y(u, v(u, y)). Because y_u(x_0) du = 0 it follows that dy = y_v(x_0) dv, and therefore dv = (y_v(x_0))^{−1} dy. Now if z_u(x_0) du = 0 also, then for an arbitrary dx = du + dv

    z'(x_0) dx = z'(x_0) du + z'(x_0) dv = z'(x_0) dv = z'(x_0) (y_v(x_0))^{−1} dy .

Thus if one sets, for the linear mapping of the space R_y^n into R_z^l,

    z'(x_0) (y_v(x_0))^{−1} = Λ ,

then z'(x_0) dx = Λ dy = Λ y'(x_0) dx and (z'(x_0) − Λ y'(x_0)) dx = 0. The condition in the theorem is thus necessary. That it is also sufficient is a result of the above identity for dx = du; for then

    z'(x_0) du = Λ y'(x_0) du = 0 .

Remark. If for given functions y(x) and z(x) one seeks those points x_0 where z(x) is stationary with respect to y(x), then by the above it is necessary to solve the identity

    (z'(x_0) − Λ y'(x_0)) dx = 0

in the unknowns x_0 and Λ. For the m real unknowns x_0 and the l n real unknowns Λ we have from here l m real equations, together with the n equations y(x_0) = y_0. Hence if the problem generally is to have a solution, l m + n must be ≤ m + l n, i.e., (l − 1)(m − n) ≤ 0, which for m > n is the case only for l = 1, that is, for a real function z(x). Then Λ y is a real linear form of y, and the above result contains as a special case the "method of Lagrange multipliers" for determining the stationary points of a real function z(x) on the surface y(x) = 0.
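For l = 1 the stationarity condition is the familiar multiplier system. The sketch below (a hypothetical example, not from the text) finds a stationary point of z(x) = x_1 + x_2 on the circle y(x) = x_1² + x_2² − 1 = 0 by applying Newton's method to the equations z'(x_0) − Λ y'(x_0) = 0, y(x_0) = 0.

```python
import numpy as np

# Stationary points of z(x) = x1 + x2 on the surface y(x) = x1^2 + x2^2 - 1 = 0:
# unknowns (x1, x2, L); equations  grad z - L * grad y = 0  and  y = 0.
def F(v):
    x1, x2, L = v
    return np.array([1 - 2*L*x1,           # d/dx1 (z - L*y)
                     1 - 2*L*x2,           # d/dx2 (z - L*y)
                     x1**2 + x2**2 - 1])   # constraint y = 0

def J(v):  # Jacobian of F
    x1, x2, L = v
    return np.array([[-2*L,  0.0, -2*x1],
                     [ 0.0, -2*L, -2*x2],
                     [2*x1, 2*x2,  0.0]])

v = np.array([1.0, 0.5, 1.0])  # starting guess
for _ in range(50):
    v = v - np.linalg.solve(J(v), F(v))

x1, x2, L = v
print(x1, x2)  # the stationary point (1/sqrt 2, 1/sqrt 2)
```

At the solution the gradient of z is the multiple Λ of the gradient of y, which is exactly the identity (z'(x_0) − Λ y'(x_0)) dx = 0 specialized to l = 1.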
III. Integral Calculus

§ 1. The Affine Integral

1.1. Alternating operators and differentials. Let A(x) be a p-linear alternating operator defined in the open domain G_x^m of the linear space R_x^m: for each fixed x in G_x^m

    y = A(x) h_1 … h_p = A(x) d_1x … d_px    (1.1)

is a p-linear alternating function of the vectors h_i = d_ix ∈ R_x^m with range in a linear space R_y^n. Such a function is called an alternating differential of pth degree.

Even for p = 1 we interpret this differential as "alternating." It will be shown that the concepts and theorems developed in this section which refer to alternating differentials of pth degree also remain meaningful and valid for p = 1. The same holds for p = 0 if a differential of degree zero is understood to be an ordinary vector function A(x).

For p > m the differentials d_ix are linearly dependent, and the alternating differential vanishes identically; therefore suppose p ≤ m. Let U^p be a p-dimensional subspace of R_x^m and D d_1x … d_px stand for the real alternating fundamental form of this subspace, which is uniquely determined up to a real factor. Then one has (x ∈ G_x^m, d_ix ∈ U^p)

    A(x) d_1x … d_px = a(x) D d_1x … d_px ,    (1.2)

where a(x) ∈ R_y^n stands for a vector function that is uniquely defined for a fixed U^p in G_x^m.¹ Since this function consequently depends on the oriented subspace U^p, a(x; U^p) would be a more fitting designation. However, we wish to retain the above shorter notation as long as there is no fear of misunderstanding.

The definitions of the continuity and the differentiability of a multilinear operator were already given in II.1.9. By these definitions the alternating operator A(x) is differentiable at the point x if

    A(x + h) h_1 … h_p = A(x) h_1 … h_p + A'(x) h h_1 … h_p + |h| (h; x) h_1 … h_p ,

where in the present case the (p + 1)-linear derivative operator A'(x) is alternating in the p vectors h_1, …, h_p and the norm |(h; x)| of the likewise alternating operator (h; x) vanishes for |h| → 0. We recall in this connection the following fact (cf. II.1.13, exercise 6): if the derivative operator A'(x) is continuous on a closed subregion Ḡ_x of G_x^m, then in Ḡ_x |(h; x)| → 0 even uniformly for |h| → 0.

¹ For the case of a real form A(x) d_1x … d_px this follows from I.5.2; only in the present case the density a will be a function a = a(x) of the point x. The arguments of I.5.2 remain valid without essential modifications when the value of the given alternating p-linear form varies in a linear space R_y^n.
1.2. The affine integral of an alternating differential. We now suppose that the alternating differential (1.1) is continuous on a p-dimensional closed simplex

    s^p = s^p(x_0, …, x_p)

of the domain G_x^m. The edges h_i = x_i − x_0 emanating from the vertex x_0 generate in R_x^m a p-dimensional subspace U^p which is parallel to the plane x_0 + U^p of the simplex. If p = m, then U^m = x_0 + U^m = R_x^m. We orient the simplex s^p and the other simplexes of the plane x_0 + U^p with the real alternating fundamental form

    D h_1 … h_p = Δ(x_0, …, x_p)

of the space U^p.

We are concerned with the definition and the existence of the integral of the alternating differential (1.2) taken over the simplex s^p. Taking as a model the Cauchy-Riemann integral concept, we consider a simplicial decomposition

    s^p = Σ_j s_j^p(x_0^j, …, x_p^j)    (1.3)

of the simplex s^p into finitely many subsimplexes s_j^p, fix in each subsimplex an interior or boundary point x_*^j and form the sum

    Σ_j A(x_*^j) h_1^j … h_p^j = Σ_j a(x_*^j) D h_1^j … h_p^j ,    (1.4)

where h_i^j = x_i^j − x_0^j and the function a(x), because the vectors h_1^j, …, h_p^j for each j span the same subspace (h_1^j, …, h_p^j) = (h_1, …, h_p) = U^p, is uniquely determined. We then have this

Theorem. If under the assumed continuity of the operator A(x) on the closed simplex s^p all subsimplexes are oriented with respect to D the same as s^p, then for unrestricted refinement of the decomposition the sum (1.4) approaches a uniquely determined limit vector J in the space R_y^n.

Here the "unrestricted refinement" as well as the existence of the limit vector can be understood either in the sense of the natural topologies of the linear spaces R_x^m and R_y^n or with respect to arbitrary Minkowski metrics. If δ stands for the greatest side length in the subsimplexes, then the theorem asserts the existence of a unique vector J ∈ R_y^n such that for each ε > 0

    |Σ_j a(x_*^j) D h_1^j … h_p^j − J| < ε

provided δ is sufficiently small.
The proof can either be based on the Cauchy convergence criterion, which is valid in metric spaces of finite dimension, or also, if the n real components of the sum are considered separately, on the investigation of lower and upper sums, and goes, upon taking the additivity of D into account, in the well-known fashion.¹

The limit vector J, which according to this theorem exists, we call the affine integral of the alternating differential taken over the simplex s^p, and we write²

    J = ∫_{s^p} A(x) d_1x … d_px = ∫_{s^p} a(x) D d_1x … d_px .

For p = 1 we have the line integral

    ∫_{s^1} A(x) dx = ∫_{x_0}^{x_1} A(x) dx .

For p = 0 the differential degenerates into a vector function A(x). For "the integral over the zero-dimensional simplex s^0(x^0) = x^0" it is convenient to understand simply the vector A(x^0).

The following properties are, just as with the Cauchy-Riemann integral concept, immediate consequences of the definition of the affine integral.

First of all, the integral changes its sign when the simplex s^p is reoriented.

Further, the affine integral is additive in the following sense: If (1.3) is a decomposition of the simplex s^p into like-oriented subsimplexes s_j^p, then one has

    ∫_{s^p} = Σ_j ∫_{s_j^p} .

Finally, if the inequality |a(x)| ≤ α holds on s^p, then, because of the additivity of D,

    |∫_{s^p} A(x) d_1x … d_px| = |∫_{s^p} a(x) D d_1x … d_px| ≤ α |D h_1 … h_p| .

The following representation of the vector function a(x) is a result of this inequality. If x* is a point of the simplex s^p and one sets

    a(x) = a(x*) + ε(x) ,

then as a consequence of the continuity of a the length |ε(x)| is smaller than an arbitrarily small ε > 0 as soon as the edges of the simplex are sufficiently small. In the equation

    ∫_{s^p} A(x) d_1x … d_px = a(x*) D h_1 … h_p + ∫_{s^p} ε(x) D d_1x … d_px ,

therefore, according to the above inequality, the norm of the integral on the right is smaller than ε |D h_1 … h_p|, and consequently

    a(x*) = lim_{s^p → x*} (1 / (D h_1 … h_p)) ∫_{s^p} A(x) d_1x … d_px ,    (1.5)

where s^p → x* indicates that s^p shrinks in the fixed plane x* + U^p to the point x*. One sees that a(x*) = a(x*; U^p) has the character of a density of the operator A(x) in the direction U^p.

¹ For the proof of the uniqueness of J one considers two sequences of simplicial subdivisions of s^p. Their intersection is a subdivision of s^p composed of convex polyhedrons. By induction one shows that the polyhedrons can be divided into simplexes which form a common simplicial subdivision of s^p and of the two given subdivisions (cf. T. Nieminen [1]). The uniqueness of J follows as a consequence of the continuity of A(x).

² This integral agrees (except for the notation) with the integral introduced by E. Cartan in his Calcul extérieur.
1.3. Computation of affine integrals. If the alternating operator A(x) is independent of the point x on s^p(x_0, …, x_p), A(x) = A = const., then it follows from (1.4), because of the additivity of the real fundamental form D, that A is also additive, and therefore

    ∫_{s^p} A d_1x … d_px = A h_1 … h_p .    (1.6)

Second, we consider the case where the operator depends linearly on x, A(x) = A x. Then the density of A x in U^p is a linear vector function a x of x, and hence

    A x d_1x … d_px = a x D d_1x … d_px .

Decompose s^p barycentrically r times in succession into N = ((p + 1)!)^r subsimplexes s_j^p(x_0^j, …, x_p^j), which according to exercise 6 in I.5.9 have the same affine measure

    |D h_1^j … h_p^j| = (1/N) |D h_1 … h_p|    (j = 1, …, N).

If these subsimplexes are oriented like s^p and one takes in the sum

    Σ_j a x_*^j D h_1^j … h_p^j = (1/N) D h_1 … h_p Σ_j a x_*^j = D h_1 … h_p a ((1/N) Σ_j x_*^j)

the point x_*^j to be the center of gravity x_*^j = (1/(p + 1)) Σ_{i=0}^p x_i^j of the subsimplex s_j^p, then (cf. I.5.9, exercise 6)

    (1/N) Σ_j x_*^j = x̄ ,

where x̄ stands for the center of gravity of s^p(x_0, …, x_p), and the above sum is therefore for each r equal to

    D h_1 … h_p a x̄ = A x̄ h_1 … h_p .

Consequently,

    ∫_{s^p} A x d_1x … d_px = A x̄ h_1 … h_p .    (1.7)
Finally, we indicate some general formulas for the calculation of affine integrals which will later be of use to us. For the sake of brevity we content ourselves with a differential geometric argument (Fig. 3).

Fig. 3

We assume the side simplex

    s_0^{p−1}(x_1, …, x_p)

of s^p opposite to x_0 to be decomposed somehow into infinitesimal subsimplexes. Let the subsimplex of the former side simplex which contains the point x have edges d_1x, …, d_{p−1}x, whereby we so orient the subsimplexes that D h_1 d_1x … d_{p−1}x has the sign of

    D h_1 (h_2 − h_1) … (h_p − h_1) = D h_1 h_2 … h_p .

We join the vertices of these subsimplexes of s_0^{p−1} with x_0 and cut the resulting pyramids with planes parallel to s_0^{p−1} into infinitely thin truncated pyramids which in the limit behave like prisms. We decompose each of these prisms following the classical method of Euclid into p infinitesimal p-dimensional simplexes of equal affine measure (cf. I.5.9, exercise 7). The prisms between the planes through the points

    x_0 + τ(x − x_0) and x_0 + (τ + dτ)(x − x_0)    (0 ≤ τ < 1, dτ > 0)

each contribute an amount

    p a(x_0 + τ(x − x_0)) D dτ(x − x_0) τ d_1x … τ d_{p−1}x
        = p τ^{p−1} dτ a(x_0 + τ(x − x_0)) D (x − x_0) d_1x … d_{p−1}x
        = d(τ^p) A(x_0 + τ(x − x_0)) (x − x_0) d_1x … d_{p−1}x

to the affine integral. For the affine integral taken over s^p we therefore obtain the formula

    ∫_{s^p} A(x) d_1x … d_px
        = ∫_{s_0^{p−1}} ∫_0^1 d(τ^p) A(x_0 + τ(x − x_0)) (x − x_0) d_1x … d_{p−1}x
        = ∫_{s_0^{p−1}} φA(x) d_1x … d_{p−1}x ,    (1.8)

where the operator φA is defined from the operator A of degree p according to the equation

    φA(x) = ∫_0^1 d(τ^p) A(x_0 + τ(x − x_0)) (x − x_0)

as an alternating operator of degree p − 1. Thus φ is a linear functional that assigns to an alternating operator of degree p an alternating operator of degree p − 1.

Repeated application of this reduction formula ultimately yields a representation for the affine integral over the simplex s^p as an ordinary p-fold Cauchy integral over the unit cube in the p-dimensional number space, from which further representations can then be deduced by means of suitable variable transformations. However, since we do not need these, we shall leave them to the reader as exercises.

1.4. Exercises. 1. If A_p(x) is a p-linear alternating operator which is continuous on the simplex s^p = s^p(x_0, …, x_p), then for q ≤ p

    ∫_{s^p} A_p(x) d_1x … d_px = ∫_{s^{p−q}} A_{p−q}(x) d_1x … d_{p−q}x ,

where, for j = p, …, p − q + 1, A_{j−1}(x) is defined by

    A_{j−1}(x) = φ A_j(x) = ∫_0^1 d(τ^j) A_j(x_{p−j} + τ(x − x_{p−j})) (x − x_{p−j})

as a (j − 1)-linear alternating operator on the simplex s^{p−q} = s^{p−q}(x_q, …, x_p), which is provided with the orientation induced by the orientation of s^p. For q = p,

    ∫_{s^p} A_p(x) d_1x … d_px = A_0 = ∫_{x_{p−1} x_p} A_1(x) dx
        = ∫_0^1 dτ A_1(x_{p−1} + τ(x_p − x_{p−1})) (x_p − x_{p−1}) .

2. Let the p-linear alternating operator A(x) be continuous on the simplex s^p = s^p(x_0, …, x_p). Further let x̃ = x̃(x), x = x(x̃) be an affine mapping which transforms s^p into s̃^p = s̃^p(x̃_0, …, x̃_p), so that x̃_i = x̃(x_i) (i = 0, …, p). Then

    ∫_{s^p} A(x) d_1x … d_px = ∫_{s̃^p} Ã(x̃) d_1x̃ … d_px̃ ,

where the p-linear alternating operator Ã(x̃) is obtained by the substitution x = x(x̃):

    Ã(x̃) d_1x̃ … d_px̃ = A(x(x̃)) d_1x … d_px .
§ 2. Theorem of Stokes

2.1. Formulation of the problem. Let s^{p+1}(x_0, …, x_{p+1}) (x_i − x_0 = h_i) be a closed (p + 1)-dimensional simplex in the space R_x^m (p ≤ m − 1). We determine the orientation of the simplex s^{p+1} from the sign of the real (p + 1)-linear alternating fundamental form

    D h_1 … h_{p+1} = Δ(x_0, …, x_{p+1})

of the subspace U^{p+1} spanned by the edges h_1, …, h_{p+1} of the simplex; suppose this orientation is positive, i.e., for the above ordering of the vertices D h_1 … h_{p+1} > 0. On s^{p+1} we consider a p-linear alternating form

    A(x) k_1 … k_p ∈ R_y^n    (x ∈ s^{p+1}, k_i ∈ U^{p+1}) .

If the operator A(x) is continuous on s^{p+1}, we can form the integral of the differential form A(x) d_1x … d_px over the boundary ∂s^{p+1} of s^{p+1} as the sum of the integrals over the p + 2 boundary simplexes s_i^p(x_0, …, x̂_i, …, x_{p+1}) (i = 0, …, p + 1). The induced orientation of the boundary simplex s_i^p has the sign (−1)^i, so that

    ∫_{∂s^{p+1}} A(x) d_1x … d_px = Σ_{i=0}^{p+1} (−1)^i ∫_{s_i^p} A(x) d_1x … d_px .    (2.1)
The theorem of Stokes transforms this boundary integral into a (p + 1)-fold integral over s^{p+1}. To derive this theorem we analyze the boundary integral (2.1) more carefully.

2.2. Special cases. We first consider two simple kinds of operators A(x).

First, suppose A(x) = A is independent of x. Then according to (1.6), for i = 0,

    ∫_{s_0^p} A d_1x … d_px = A (x_2 − x_1) … (x_{p+1} − x_1) = Σ_{i=1}^{p+1} (−1)^{i−1} A h_1 … ĥ_i … h_{p+1} ,

and for i = 1, …, p + 1

    (−1)^i ∫_{s_i^p} A d_1x … d_px = (−1)^i A h_1 … ĥ_i … h_{p+1} .

Thus by (2.1)

    ∫_{∂s^{p+1}} A d_1x … d_px = 0 .    (2.2)

Second, suppose A(x) = A x is linear in x. By (1.7), for i = 0, one has

    ∫_{s_0^p} A x d_1x … d_px = A x̄_0 (x_2 − x_1) … (x_{p+1} − x_1) = Σ_{i=1}^{p+1} (−1)^{i−1} A x̄_0 h_1 … ĥ_i … h_{p+1} ,

and for i = 1, …, p + 1

    (−1)^i ∫_{s_i^p} A x d_1x … d_px = (−1)^i A x̄_i h_1 … ĥ_i … h_{p+1} ,

whereby x̄_i = (1/(p + 1)) (x_0 + … + x̂_i + … + x_{p+1}) (i = 0, …, p + 1) stands for the center of gravity of s_i^p. This allows (2.1) to be written

    ∫_{∂s^{p+1}} A x d_1x … d_px = (1/(p + 1)) Σ_{i=1}^{p+1} (−1)^{i−1} A h_i h_1 … ĥ_i … h_{p+1} .    (2.3)

The expression on the right is an alternating form in the p + 1 vectors h_1, …, h_{p+1}.
2.3. The differential formula of Stokes. We now go on to the general case where A(x) is a p-linear alternating operator with the following properties:

1. A(x) is continuous on the simplex s^{p+1}.

2. A(x) is differentiable at an arbitrary interior or boundary point x* of the simplex s^{p+1}.

Hence, with arbitrary vectors d_1x, …, d_px from the space R_x^m, after the introduction of a Minkowski metric,

    A(x) d_1x … d_px = A(x*) d_1x … d_px + A'(x*) (x − x*) d_1x … d_px + |x − x*| (x − x*; x*) d_1x … d_px ,    (2.4)

where the norm of the alternating operator (x − x*; x*) converges to zero for |x − x*| → 0.

To compute the boundary integral (2.1) substitute the expression (2.4). For i = 0, …, p + 1

    ∫_{s_i^p} A(x) d_1x … d_px = ∫_{s_i^p} A(x*) d_1x … d_px + ∫_{s_i^p} A'(x*) (x − x*) d_1x … d_px + r_i ,

where

    r_i = ∫_{s_i^p} |x − x*| (x − x*; x*) d_1x … d_px .

We let δ stand for the length of the greatest diameter of s^{p+1}. Then for each x ∈ s^{p+1}, |x − x*| ≤ δ. Further, let

    ε(s^{p+1}) = sup_{x ∈ s^{p+1}} |(x − x*; x*)| .    (2.5)

To estimate the remainder term r_0, using the orienting fundamental form D k_1 … k_{p+1} of the space U^{p+1}, one sets on s_0^p

    (x − x*; x*) d_1x … d_px = ε_0(x) D h_1 d_1x … d_px .

Then ε_0(x) ∈ R_y^n is uniquely determined on the latter side simplex, and for d_ix = h_{i+1} − h_1 (h_i = x_i − x_0) we have

    (x − x*; x*) (h_2 − h_1) … (h_{p+1} − h_1) = ε_0(x) D h_1 (h_2 − h_1) … (h_{p+1} − h_1) = ε_0(x) D h_1 … h_{p+1} .

Thus using the definitions of δ and ε(s^{p+1}) we see that

    |ε_0(x)| |D h_1 … h_{p+1}| ≤ |(x − x*; x*)| |h_2 − h_1| … |h_{p+1} − h_1| ≤ δ^p ε(s^{p+1}) .

Because of the additivity of the absolute value of the p-linear alternating differential D_0 d_1x … d_px = D h_1 d_1x … d_px it follows from here that

    |r_0| ≤ ∫_{s_0^p} |x − x*| |ε_0(x)| |D h_1 d_1x … d_px| ≤ (δ^{p+1} ε(s^{p+1}) / |D h_1 … h_{p+1}|) ∫_{s_0^p} |D h_1 d_1x … d_px| = δ^{p+1} ε(s^{p+1}) .

In order to find an estimate for the remainder terms r_i (i = 1, …, p + 1), one writes on s_i^p

    (x − x*; x*) d_1x … d_px = ε_i(x) D d_1x … d_{i−1}x h_i d_ix … d_px ,

which uniquely determines ε_i(x) ∈ R_y^n on the above side simplex. In particular, for the vectors d_jx = h_j (j = 1, …, i − 1), d_jx = h_{j+1} (j = i, …, p) one has

    (x − x*; x*) h_1 … ĥ_i … h_{p+1} = ε_i(x) D h_1 … h_{p+1} .

Because of the additivity of the absolute value of the p-linear alternating differential D_i d_1x … d_px = D d_1x … d_{i−1}x h_i d_ix … d_px one obtains in this way precisely the same estimate for r_i as we did above for r_0,

    |r_i| ≤ δ^{p+1} ε(s^{p+1}) .

Summation over i now permits (2.1) to be written

    ∫_{∂s^{p+1}} A(x) d_1x … d_px = ∫_{∂s^{p+1}} (A(x*) − A'(x*) x*) d_1x … d_px + ∫_{∂s^{p+1}} A'(x*) x d_1x … d_px + r ,

with

    |r| ≤ (p + 2) δ^{p+1} ε(s^{p+1}) ,    (2.6)

where ε(s^{p+1}) is a number that converges to zero when the simplex s^{p+1} is allowed to converge in the fixed plane x* + U^{p+1} to the point x*.

By (2.2) the first integral on the right vanishes. According to (2.3) the contribution made by the second integral becomes

    ∫_{∂s^{p+1}} A'(x*) x d_1x … d_px = (1/(p + 1)) Σ_{i=1}^{p+1} (−1)^{i−1} A'(x*) h_i h_1 … ĥ_i … h_{p+1} ,

and finally one finds

    ∫_{∂s^{p+1}} A(x) d_1x … d_px = (1/(p + 1)) Σ_{i=1}^{p+1} (−1)^{i−1} A'(x*) h_i h_1 … ĥ_i … h_{p+1} + r .    (2.7)

Equation (2.7) together with the estimate (2.6) of the remainder term contains the differential formula of Stokes, which has been established under hypotheses 1 and 2 of this section.
2.4. The exterior differential. The rotor. By means of the above analysis there is associated with the alternating differential ω = A(x) h_1 … h_p of degree p an alternating differential form

    Λ A'(x) h_1 … h_{p+1} = (1/(p + 1)) Σ_{i=1}^{p+1} (−1)^{i−1} A'(x) h_i h_1 … ĥ_i … h_{p+1}    (2.8)

of degree p + 1, the alternating part of the differential of the given form ω. This (p + 1)-form is (up to the factor 1/(p + 1)) the same as the exterior differential of the form ω introduced by E. Cartan¹.

The operator defined by the exterior differential is the rotor of the operator A(x):

    rot A(x) = Λ A'(x) .    (2.8')
2.5. Coordinate representation of the rotor. We start from the expression (2.8) for the rotor and restrict the differentials d_ix to a (p + 1)-dimensional subspace U^{p+1} of the linear space R_x^m, and consequently consider the rotor in the direction U^{p+1}. If in a linear coordinate system e_1, …, e_{p+1} of this subspace

d_ix = \sum_{j=1}^{p+1} d_i\xi^j\, e_j \qquad (i = 1, \dots, p+1),

then

A(x)\, d_1x \cdots \widehat{d_ix} \cdots d_{p+1}x = \sum_{j=1}^{p+1} \Delta_i^j\, a_j(x), \quad \text{where} \quad a_j(x) = A(x)\, e_1 \cdots \hat e_j \cdots e_{p+1},

and Δ_i^j stands for the subdeterminant of the complete (p + 1)-rowed determinant

\begin{vmatrix} d_1\xi^1 & \cdots & d_1\xi^{p+1} \\ \vdots & & \vdots \\ d_{p+1}\xi^1 & \cdots & d_{p+1}\xi^{p+1} \end{vmatrix}

associated with the differential d_iξ^j. Let A(x) k_1 ⋯ k_p (k_i ∈ U^{p+1}), and thereby also the vector functions a_j(x), be real. Then by the above

A'(x)\, d_ix\, d_1x \cdots \widehat{d_ix} \cdots d_{p+1}x = \sum_{j=1}^{p+1} \sum_{k=1}^{p+1} \frac{\partial a_j}{\partial \xi^k}\, d_i\xi^k\, \Delta_i^j,

and consequently

\operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x = \frac{1}{p+1} \sum_{j=1}^{p+1} \sum_{k=1}^{p+1} \frac{\partial a_j}{\partial \xi^k} \sum_{i=1}^{p+1} (-1)^{i-1}\, d_i\xi^k\, \Delta_i^j.

¹ Our presentation of the "exterior calculus" deviates from that of Cartan in that we proceed in a coordinate-free fashion. A second formal difference is found in the differing notations for the exterior differential. Usually one writes dω for the exterior differential of the form ω. Since we use the symbol d only for the ordinary differentiation operator, we prefer the more explicit notation Λ dω (the alternating part of the ordinary differential dω).
Here the sum over i on the right vanishes for k ≠ j, and for k = j it is equal to (−1)^{j−1} Δ, where Δ = D d_1x ⋯ d_{p+1}x denotes the complete determinant. Therefore,

\operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x = \frac{1}{p+1} \sum_{j=1}^{p+1} (-1)^{j-1} \frac{\partial a_j}{\partial \xi^j}\, \Delta,

which we can write

\operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x = \rho(x)\, D\, d_1x \cdots d_{p+1}x,

where D stands for the real alternating fundamental form of the subspace U^{p+1} with D e_1 ⋯ e_{p+1} = 1, and

\rho(x) = \frac{1}{p+1} \sum_{j=1}^{p+1} (-1)^{j-1} \frac{\partial a_j}{\partial \xi^j}

is the coordinate representation of the rotor density in the direction of this subspace.
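As an illustration for p = 1: in the plane with coordinates (ξ¹, ξ²) and a 1-form with components P = A(x)e_1, Q = A(x)e_2, the density reduces to ρ = ½(∂Q/∂ξ¹ − ∂P/∂ξ²), half the classical two-dimensional curl. The following sketch (Python is not part of the text; the 1-form P, Q is hypothetical) approximates ρ by the quotient of a boundary integral over a small triangle by D h₁h₂, as in the representation of the rotor density given below in 2.6:

```python
import numpy as np

# Hypothetical smooth 1-form A(x)h = P(x) h^1 + Q(x) h^2 in the plane
def P(x): return x[0]**2 * x[1]
def Q(x): return np.sin(x[0]) + x[1]**2

def A(x, h):
    return P(x) * h[0] + Q(x) * h[1]

def boundary_integral(v0, v1, v2, n=20):
    """Integral of A over the boundary of the triangle (v0, v1, v2),
    Gauss-Legendre quadrature on each edge."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    total = 0.0
    for a, b in ((v0, v1), (v1, v2), (v2, v0)):
        h = b - a
        for t, w in zip(nodes, weights):
            x = a + (t + 1) / 2 * h
            total += w / 2 * A(x, h)
    return total

x0 = np.array([0.7, 0.3])
rho_exact = 0.5 * (np.cos(x0[0]) - x0[0]**2)   # (1/2)(dQ/dxi1 - dP/dxi2)

eps = 1e-4
h1, h2 = np.array([eps, 0.0]), np.array([0.0, eps])
D = h1[0] * h2[1] - h1[1] * h2[0]              # D h1 h2 with D e1 e2 = 1
rho_num = boundary_integral(x0, x0 + h1, x0 + h2) / D
assert abs(rho_num - rho_exact) < 1e-3
```

The quotient converges to ρ(x₀) as the triangle shrinks regularly to x₀, in agreement with the coordinate formula.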
2.6. Extension of the definition of the rotor. Suppose the p-linear alternating operator A(x) is continuous in a neighborhood of the point x^* ∈ R_x^m and differentiable at this point. Further, let U^{p+1} be a subspace and s^{p+1}(x_0, …, x_{p+1}) a simplex in the plane x^* + U^{p+1} that lies in the neighborhood mentioned and that contains x^* as an interior or boundary point. Then by 2.3 Stokes's differential formula holds,

\int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px = \operatorname{rot} A(x^*)\, h_1 \cdots h_{p+1} + \delta^{p+1}\, \varepsilon(s^{p+1}; x^*),

where δ stands for the greatest side length |h_i| = |x_i − x_0| and ε(s^{p+1}; x^*) designates a vector whose length vanishes with δ. This formula can, conversely, be used to define rot A(x^*) by postulating that the formula holds. In this way we find the following, relative to (2.8), generalized

Definition. Let the p-linear alternating operator A(x) be continuous in a neighborhood of the point x^*. If a (p + 1)-linear operator B(x^*) exists such that for each simplex s^{p+1}(x_0, …, x_{p+1}) in the space R_x^m that lies in the neighborhood mentioned and contains the point x^* a decomposition

\int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px = B(x^*)\, h_1 \cdots h_{p+1} + \delta^{p+1}\, \varepsilon(s^{p+1}; x^*) \qquad (2.9)

holds with |ε(s^{p+1}; x^*)| → 0 for δ = max |h_i| = max |x_i − x_0| → 0, then we call B(x^*) the rotor of A(x) at the point x^* and write B(x^*) = rot A(x^*).
Regarding this definition observe the following. First, it is clear that the rotor, provided it exists in the sense of the above definition at a point x^*, is uniquely determined by A(x). Further, it follows from the definition that the operator B(x^*), for which (p + 1)-linearity was hypothesized, is an alternating operator. For if two vectors h_i and h_j, and thus the vertices x_i and x_j of the simplex, are commuted in (2.9), the orientation of the simplex s^{p+1} and of the boundary ∂s^{p+1}, and with that also the left-hand side of the equation, changes sign. If in addition the vectors h are replaced by λh (0 < λ ≤ 1), it follows from this equation that

0 = B(x^*)\, h_1 \cdots h_i \cdots h_j \cdots h_{p+1} + B(x^*)\, h_1 \cdots h_j \cdots h_i \cdots h_{p+1} + \varepsilon(\lambda)

with |ε(λ)| → 0 for λ → 0, from which the assertion follows.
Therefore, at each point x where rot A(x) exists in the sense of definition (2.9) one can write

\operatorname{rot} A(x)\, h_1 \cdots h_{p+1} = \rho(x)\, D\, h_1 \cdots h_{p+1}, \qquad (2.10)

where D is the real alternating fundamental form of the subspace U^{p+1}
spanned by the p + 1 vectors h_i and ρ(x) = ρ(x; A; U^{p+1}) stands for the rotor density of the operator A(x) in the direction U^{p+1}. The definition (2.9) yields the representation

\rho(x; A; U^{p+1}) = \lim_{s^{p+1} \to x} \frac{1}{D\, h_1 \cdots h_{p+1}} \int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px \qquad (2.11)

for this rotor density, where the limit s^{p+1} → x is to be taken in the fixed plane x + U^{p+1} in such a way that the regularity index of the simplex s^{p+1},

\operatorname{reg} s^{p+1} = \frac{\delta^{p+1}}{V(s^{p+1})},

remains below a finite bound; here V(s^{p+1}) is set equal to the "volume" |D h_1 ⋯ h_{p+1}| of s^{p+1}.
Finally, the analysis carried out in 2.3 shows that for the existence of the rotor in the sense of definition (2.9) it in any case suffices for A(x) to be continuous in a neighborhood of x^* and differentiable at this point, in which case the rotor can be represented by formula (2.8), and the coordinate representations derived in 2.5 also hold. The above definition, however, hypothesizes nothing about the differentiability of the operator A(x) and therefore provides an extension of the more narrow definition (2.8).
For p = 0 both definitions are equivalent. Then the "0-linear alternating operator" A(x) is simply a vector function A(x), and by (2.8) one then has rot A(x^*) = A'(x^*), while equation (2.9) degenerates into

A(x_1) - A(x_0) = B(x^*)(x_1 - x_0) + |x_1 - x_0|\, \varepsilon(x_1 - x_0; x^*)

with |ε(x_1 − x_0; x^*)| → 0 for x_0, x_1 → x^*. For x_0 = x^* the equation states that A(x) is differentiable at x^* with the derivative A'(x^*) = B(x^*), so that also according to the second definition rot A(x^*) = A'(x^*).
By this latter definition the differential operator rot can therefore be thought of as a formal generalization of the derivative as one passes from a 0- to a p-linear alternating operator.
2.7. The transformation formula of Stokes. We now assume that the p-linear alternating operator A(x) satisfies the following conditions on the closed simplex s^{p+1}(x_0, …, x_{p+1}):
1. A(x) is continuous on s^{p+1}.
2. rot A(x) exists in the sense of the extended definition (2.9) at each point x of the simplex s^{p+1}.
3. rot A(x) is continuous on s^{p+1}.
According to the discussion in 2.3 and 2.6 it is sufficient for this that A(x) be continuously differentiable on the closed simplex s^{p+1}. In what follows we only make use of the three above assumptions, which say nothing about the differentiability of A(x), and prove Stokes's integral theorem:
Provided the p-linear alternating operator A(x) satisfies the above three conditions on the closed simplex s^{p+1}(x_0, …, x_{p+1}) ⊂ R_x^m, the integral transformation formula

\int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px = \int_{s^{p+1}} \operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x \qquad (2.12)

holds, where the boundary ∂s^{p+1}, for a given orientation of s^{p+1}, is endowed with the induced orientation.
For the proof we first remark that, as a consequence of the hypotheses made, on the simplex s^{p+1} as well as on each (p + 1)-dimensional subsimplex s both of the integrals in this formula make sense. Thus for each such subsimplex s the difference

J(s) = \int_{\partial s} A(x)\, d_1x \cdots d_px - \int_{s} \operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x \qquad (2.13)

is meaningful, and in the set (s) of these simplexes it defines a well-determined set function. Stokes's integral theorem asserts that J(s) = 0.
The following proof uses a well-known idea which was applied by Goursat to establish Cauchy's integral theorem. It rests on the following two properties of the set function J(s).
In order to formulate the first one we consider a point x^* of the closed simplex s^{p+1}(x_0, …, x_{p+1}) and a subsimplex s = s(y_0, …, y_{p+1}) that contains this point. Let D be the real, orienting fundamental form of the plane x^* + U^{p+1} of the simplex s^{p+1}. Further let V(s) = |D k_1 ⋯ k_{p+1}| (k_i = y_i − y_0) be the "volume" of s and δ the greatest side length |k_i|. Then if s converges to x^* so that the regularity index of s,

\operatorname{reg} s = \frac{\delta^{p+1}}{V(s)},

remains below a finite bound, then

\lim_{s \to x^*} \frac{|J(s)|}{V(s)} = 0 \qquad (2.14)

at each point x^* ∈ s^{p+1}.
In fact, if ε(s; x^*) stands for a quantity that vanishes with δ, then it follows on the one hand from the existence of rot A(x^*) according to definition (2.9) that

\int_{\partial s} A(x)\, d_1x \cdots d_px = \operatorname{rot} A(x^*)\, k_1 \cdots k_{p+1} + \delta^{p+1}\, \varepsilon(s; x^*)_1.

On the other hand, because of the continuity of rot A(x) at the point x^*,

\int_{s} \operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x = \operatorname{rot} A(x^*)\, k_1 \cdots k_{p+1} + V(s)\, \varepsilon(s; x^*)_2.

Thus

|J(s)| = \bigl| \delta^{p+1}\, \varepsilon(s; x^*)_1 - V(s)\, \varepsilon(s; x^*)_2 \bigr| \le \bigl( \operatorname{reg} s\, |\varepsilon(s; x^*)_1| + |\varepsilon(s; x^*)_2| \bigr)\, V(s),

which implies (2.14), provided reg s remains bounded as δ → 0.
The second property of the set function J(s) is its additivity, in the following sense: If Z stands for a decomposition of a simplex s in the set (s) considered into a finite number of (p + 1)-dimensional simplexes s_Z which are oriented the same as s, then

J(s) = \sum_{Z} J(s_Z). \qquad (2.15)

Now the volume V(s) is also additive, and it follows from (2.15) that

\frac{|J(s)|}{V(s)} = \frac{\bigl| \sum_Z J(s_Z) \bigr|}{V(s)} \le \sum_Z \frac{|J(s_Z)|}{V(s)},

and if one sets max_Z |J(s_Z)|/V(s_Z) = M_Z, then

\frac{|J(s)|}{V(s)} \le \frac{\sum_Z V(s_Z)}{V(s)}\, M_Z = M_Z. \qquad (2.16)
Now suppose Z_1, Z_2, … is an infinite sequence of decompositions of the simplex s with the following properties:¹
A. Z_{i+1} is a subdivision of Z_i.
B. Z_i is refined without bound for i → ∞.
C. The regularity indices of all subsimplexes s_Z that occur are uniformly bounded.
Then let s_1 stand for a simplex in the decomposition Z_1 for which the maximum M_{Z_1} = M_1 is reached, s_2 ⊂ s_1 a simplex in the decomposition Z_2 which corresponds to the maximum M_{Z_2} = M_2, whereby one only takes into account those subsimplexes in the decomposition Z_2 which lie in s_1, etc. Then according to inequality (2.16)

\frac{|J(s)|}{V(s)} \le M_1 \le M_2 \le \cdots \qquad (2.17)

Since the simplexes s, s_1, s_2, … are nested, there exists, because of condition B, a well-determined point x^* ∈ s_i (i = 1, 2, …) such that

¹ The construction of H. Whitney (I.5.5) gives an example of such a decomposition (cf. 2.8). Cf. also T. Nieminen [1].
s_i → x^* for i → ∞. Now in view of condition C and equation (2.14)

\lim_{i \to \infty} M_i = \lim_{i \to \infty} \frac{|J(s_i)|}{V(s_i)} = 0,

and, as a consequence of the inequalities in (2.17), J(s) = 0 for every simplex s of the set (s), in particular also for s = s^{p+1}. With that Stokes's integral theorem is proved.
2.8. A sequence of subdivisions. Using the method of H. Whitney, discussed in I.5.4–5, we will construct a sequence Z_1, Z_2, … of subdivisions satisfying the conditions A, B, C.
Let s^{p+1}(x_0, x_1, …, x_{p+1}) be a simplex with vertices x_0, x_1, …, x_{p+1}. Denoting the edges
k_1 = x_1 - x_0, \quad \dots, \quad k_{p+1} = x_{p+1} - x_0,

the affine volume of s^{p+1} is equal to

A(s^{p+1}) = D (x_1 - x_0)(x_2 - x_0) \cdots (x_{p+1} - x_0) = D\, k_1 k_2 \cdots k_{p+1}.

Each simplex s_Z^{p+1} of the subdivision Z_1 of s^{p+1}, constructed by the method of section I.5.5, has as edges p + 1 vectors defined by differences of midpoints ½(x_i + x_j) − ½(x_k + x_l), i.e., the vectors

\pm \tfrac{1}{2} k_1, \quad \dots, \quad \pm \tfrac{1}{2} k_{p+1}

with one combination of the signs. The affine volume of s_Z^{p+1} is therefore

A(s_Z^{p+1}) = \pm \frac{A(s^{p+1})}{2^{p+1}},

and s_Z^{p+1} is similar to one of the simplexes with the edges ±k_1, …, ±k_{p+1}. The regularity indices of similar simplexes have the same value. Thus, if α is the maximal value of the regularity indices of the simplexes with the edges ±k_1, …, ±k_{p+1} (with all combinations of the signs), then α is equal to the maximum of the regularity indices of all simplexes of the division Z_1. The same is true for the subdivisions Z_2, Z_3, …, where Z_{i+1} is the subdivision of Z_i constructed by the method of Whitney. Hence property C has been proved. Condition A is valid by construction, and property B is evident, because the
edges of the simplexes of Z_{i+1} are obtained by dividing the corresponding edges of Z_i in two equal parts.
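A minimal sketch of the first subdivision step for p + 1 = 2 (Python, with a hypothetical parent triangle): the midpoint subdivision produces four children, each spanned by edge vectors ±½k_i, with affine volume one quarter of the parent's:

```python
import numpy as np

def midpoint_subdivide(tri):
    """Split a triangle into the 4 similar triangles of the midpoint subdivision."""
    x0, x1, x2 = tri
    m01, m02, m12 = (x0 + x1) / 2, (x0 + x2) / 2, (x1 + x2) / 2
    return [(x0, m01, m02), (m01, x1, m12), (m02, m12, x2), (m12, m02, m01)]

def affine_volume(tri):
    """D k1 k2 for the edges k1, k2 from the first vertex (twice the signed area)."""
    x0, x1, x2 = tri
    k1, k2 = x1 - x0, x2 - x0
    return k1[0] * k2[1] - k1[1] * k2[0]

parent = tuple(np.array(v, float) for v in [(0, 0), (2, 0), (0, 2)])
children = midpoint_subdivide(parent)

# each child's affine volume is the parent's divided by 2^{p+1} = 4
for c in children:
    assert abs(abs(affine_volume(c)) - abs(affine_volume(parent)) / 4) < 1e-12
```

Iterating the step halves every edge, so the decompositions refine without bound while the regularity indices stay within the fixed set of values attained by the sign combinations.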
2.9. Remark. As a comment on the proof of Stokes's theorem given above we note that another method of proof, which has often been used in the theory of integral transformation formulas, would at first glance seem more obvious than the one followed above, which in principle stems from Goursat. The former proof would go briefly as follows.
Under the above hypotheses 1, 2, 3 decompose the simplex s^{p+1} into positively oriented subsimplexes s_i^{p+1} (i = 1, …, N) and write the boundary integral of the given alternating differential A(x) d_1x ⋯ d_px as

\int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px = \sum_{i=1}^{N} \int_{\partial s_i^{p+1}} A(x)\, d_1x \cdots d_px. \qquad (2.18)

On the i-th subsimplex, which we designate by s = s_i^{p+1} for short, we choose some point x_i = x and then have by the definition (2.9) of the rotor

\int_{\partial s} A(x)\, d_1x \cdots d_px = \operatorname{rot} A(x)\, k_1 \cdots k_{p+1} + \delta^{p+1}\, \varepsilon(s; x), \qquad (2.18')

where k_1, …, k_{p+1} are the edge vectors, whose greatest length is δ, that span the simplex s; the quantity ε(s; x) vanishes when the simplex converges to the fixed point x. If one proceeds in a similar way for all N subsimplexes s = s_i^{p+1}, the sum of the first terms on the right in (2.18') yields an expression which, in view of the hypothesized continuity of the operator rot A(x), tends to the integral of rot A(x) over s^{p+1} when the decomposition is (regularly) refined without limit. Because of (2.18) Stokes's theorem is thus proved provided one succeeds in proving that the sum of the remainder terms r = δ^{p+1} ε(s; x) vanishes in the limit.
Let us see what can be said about this last question. For the individual remainder term r one has the estimate

|r| = \delta^{p+1}\, |\varepsilon(s; x)| \le M\, |D\, k_1 \cdots k_{p+1}|\, |\varepsilon(s; x)|,

where M is a finite upper bound given a priori for the regularity indices which appear. According to the definition (2.9) of the rotor, ε(s; x) vanishes for a regular approach of s to the fixed point x. If one knew in addition that the convergence |ε(s; x)| → 0 for δ → 0 were uniform with respect to all points x ∈ s^{p+1}, summation of the remainder terms r would obviously yield an expression which for indefinite refinement of the decomposition vanishes, and the proof of Stokes's theorem would then be completed.
But the uniform vanishing for δ → 0 of the quantity ε(s; x) required in this method of proof cannot in general be deduced directly from hypotheses 1, 2, 3 in 2.7. In the case p = 0 it follows, of course, from the mean value theorem. But an analogous generalized mean value theorem is, in general, not available for p > 0; rather such a theorem results only as a corollary to Stokes's theorem (cf. 2.12, exercise 2), which is to be proved. Because of this circularity, the proof sketched above fails for p > 0¹. On the other hand, the proof of Stokes's theorem starting from the general postulates 1, 2, 3 following the method of 2.7, which in no way needs the uniform existence of the rotor of A(x) in the sense described above, succeeds. Herein lies the real point of Goursat's idea.
The advantages of the above sharpened formulation of Stokes's theorem will be clear in the following applications.
2.10. The divergence. As before, let D h_1 ⋯ h_{p+1} be an alternating fundamental form on the subspace U^{p+1} spanned in the space R_x^m by the p + 1 (≤ m) vectors h_i. By means of this form, for a given differentiable vector field u(x) ∈ U^{p+1} (x ∈ R_x^m), another differential operator, the divergence of u(x), can be defined by computing the trace of the operator u'(x) (cf. I.5.9, exercise 4):

\operatorname{div} u(x) = \operatorname{Tr} u'(x) = \frac{\sum_{i=1}^{p+1} D\, h_1 \cdots h_{i-1}\, (u'(x)\, h_i)\, h_{i+1} \cdots h_{p+1}}{D\, h_1 \cdots h_{p+1}}. \qquad (2.19)
There exists a simple connection between the linear operators div and rot. We have

\sum_{i=1}^{p+1} D\, h_1 \cdots h_{i-1}\, (u'(x)\, h_i)\, h_{i+1} \cdots h_{p+1} = \sum_{i=1}^{p+1} (-1)^{i-1}\, D\, (u'(x)\, h_i)\, h_1 \cdots \hat h_i \cdots h_{p+1}.

The last expression is equal to the exterior differential of the p-linear alternating differential form

A(x)\, h_1 \cdots h_p = D\, u(x)\, h_1 \cdots h_p

and can thus be denoted by

(p + 1)\, \operatorname{rot} (D\, u(x))\, h_1 \cdots h_{p+1}.

The divergence of u(x) is therefore equal to the rotor density ρ(x) of the operator D u(x) multiplied by p + 1:

\operatorname{div} u(x) = (p + 1)\, \rho(x) = (p + 1)\, \frac{\operatorname{rot} (D\, u(x))\, h_1 \cdots h_{p+1}}{D\, h_1 \cdots h_{p+1}}. \qquad (2.20)
¹ If one assumes more specially that the operator A(x) is continuously differentiable, then an application of the expression (2.8) for rot A(x) yields the property essential to the above method of proof, the uniform vanishing of ε(s; x), and the path sketched above becomes feasible (cf. II.1.13, exercise 7).
Provided the rotor on the right exists in the extended sense (2.9) of this concept, this equation gives a corresponding extended definition of the divergence, which is obviously independent of the particular normalization of the fundamental form D of the space U^{p+1}.
If a coordinate system e_1, …, e_{p+1} is introduced in the subspace U^{p+1} and the vector u(x) has the representation

u(x) = \sum_{j=1}^{p+1} u^j(x)\, e_j,

one obtains the usual coordinate-dependent representation for div u:

\operatorname{div} u = \sum_{j=1}^{p+1} \frac{\partial u^j}{\partial \xi^j},

where the ξ^j are the coordinates of the point x ∈ R_x^m with respect to the coordinate system e_1, …, e_{p+1} of the space U^{p+1} (cf. 2.12, exercise 4).
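The coordinate formula can be checked against the trace definition (2.19) by finite differences; the field u below is a hypothetical example and the code is only an illustrative sketch, not part of the text:

```python
import numpy as np

# Hypothetical smooth field u : R^2 -> R^2 with known divergence
def u(x):
    return np.array([x[0]**2 * x[1], np.sin(x[0]) - x[1]**3])

def div_u(x, eps=1e-6):
    """Sum of the diagonal entries du^j / dxi^j, i.e. Tr u'(x), by central differences."""
    d = 0.0
    for j in range(2):
        e = np.zeros(2); e[j] = eps
        d += (u(x + e)[j] - u(x - e)[j]) / (2 * eps)
    return d

x = np.array([0.4, -1.2])
exact = 2 * x[0] * x[1] - 3 * x[1]**2   # d(xi1^2 xi2)/dxi1 + d(-xi2^3)/dxi2
assert abs(div_u(x) - exact) < 1e-6
```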
2.11. Gauss's transformation formula. Let u(x) be a vector field defined on the closed simplex s^{p+1} = s^{p+1}(x_0, …, x_{p+1}) (x_i − x_0 = h_i) with values in the subspace U^{p+1} of R_x^m (p + 1 ≤ m) spanned by h_1, …, h_{p+1}. If one applies Stokes's transformation formula (2.12) to the alternating differential¹

A(x)\, d_1x \cdots d_px = D\, u(x)\, d_1x \cdots d_px,

then we get

\int_{\partial s^{p+1}} D\, u(x)\, d_1x \cdots d_px = \int_{s^{p+1}} \operatorname{rot} (D\, u(x))\, d_1x \cdots d_{p+1}x

or, using the above definition (2.20) of the divergence,

\int_{\partial s^{p+1}} D\, u(x)\, d_1x \cdots d_px = \frac{1}{p+1} \int_{s^{p+1}} \operatorname{div} u(x)\, D\, d_1x \cdots d_{p+1}x. \qquad (2.21)

This is the affine form of Gauss's transformation formula. In order to see this and to bring the formula into its usual metric formulation, we introduce a euclidean metric in U^{p+1}, let e_1, …, e_{p+1} stand for an orthonormal coordinate system and normalize the fundamental form D so that D e_1 ⋯ e_{p+1} = 1.
Then if in the above formula s^{p+1} is positively oriented relative to D, according to I.6.9–10,

D\, d_1x \cdots d_{p+1}x = (p + 1)!\ dv_{p+1},

¹ This, according to Stokes's theorem, is permitted provided div u(x) exists on s^{p+1} in the above mentioned extended sense and is continuous there, hence, in particular, provided u(x) is continuously differentiable.
where dv_{p+1} stands for the euclidean volume of the simplex spanned by the vectors d_1x, …, d_{p+1}x. The right side of formula (2.21) can therefore be written

p!\ \int_{s^{p+1}} \operatorname{div} u(x)\, dv_{p+1}.

On the left hand side we determine on each of the p + 2 side simplexes the "positive" unit normal n_i, so that on the simplex s_i^p(x_0, …, \hat x_i, …, x_{p+1}) (i = 0, …, p + 1) the expression

(-1)^{i-1}\, D\, n_i\, d_1x \cdots d_px

turns out, for example, to be positive, whereby the orientation of the side simplexes hypothesized in Stokes's transformation formula is taken into account. Then if u(x) is decomposed on the side simplex s_i^p according to

u(x) = v_i(x)\, n_i + p_i(x),

where p_i(x) stands for the orthogonal projection of u(x) onto s_i^p, then p_i(x) and the differentials d_jx for the side simplex are linearly dependent, and the contribution of this side simplex to the boundary integral, according to the definition of the simplex volume given in I.6.10, becomes

-p!\ \int_{s_i^p} v_i(x)\, dv_p,

where dv_p stands for the euclidean volume element of the side simplex.
Thus, altogether, with the above definition of the "positive" normal component v(x) of u(x) on the boundary ∂s^{p+1},

\int_{\partial s^{p+1}} v(x)\, dv_p + \int_{s^{p+1}} \operatorname{div} u(x)\, dv_{p+1} = 0. \qquad (2.22)

This is Gauss's transformation formula in the customary euclidean formulation. For other formulations of the formulas of Stokes and Gauss we refer to I. S. Louhivaara [2], C. Müller [2] and P. Hermann [5].
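For p + 1 = 2 formula (2.22) is the classical divergence theorem in the plane. The sketch below uses a hypothetical field u and the customary outward unit normal, under which the boundary flux equals the volume integral of the divergence (with the text's "positive" normal the two terms instead sum to zero); Python is assumed, not part of the text:

```python
import numpy as np

# u(x) = x has div u = 2 everywhere
def u(x):
    return x

# unit triangle, vertices in counterclockwise order
verts = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]

flux = 0.0
for i in range(3):
    a, b = verts[i], verts[(i + 1) % 3]
    edge = b - a
    n = np.array([edge[1], -edge[0]])        # outward normal scaled by edge length
    flux += np.dot(u((a + b) / 2), n)        # midpoint rule, exact for linear u

area = 0.5
assert abs(flux - 2 * area) < 1e-12          # flux equals integral of div u
```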
2.12. Exercises. 1. Let R_x^3 be a 3-dimensional linear space in which a euclidean metric is defined by means of the inner product (x_1, x_2). Let y(x) be a differentiable vector function of the variable x ∈ R_x^3, with range in R_x^3, and set

A(x)\, h = (y(x), h) \qquad (h \in R_x^3).

Show that the relation

\operatorname{rot} A(x)\, h_1 h_2 = \tfrac{1}{2}\, (\operatorname{rot} y(x), [h_1, h_2])

holds between the rotor concept defined by (2.8) and the rotor rot y(x) of the ordinary vector analysis, where [h_1, h_2] stands for the "exterior product" of the vectors h_1, h_2 ∈ R_x^3.
2. Prove the following theorem, which can be considered as a generalization of the mean value theorem in II.1.6. Let ω = A(x) d_1x ⋯ d_px be an alternating differential on the closed simplex s^{p+1}(x_0, …, x_{p+1}) of the space R_x^m with values from the euclidean space R_y^n. Suppose this differential satisfies these assumptions: 1. A(x) is continuous on s^{p+1}. 2. rot A(x) exists in the sense of (2.9) and is continuous on s^{p+1}.
Then for each vector e from R_y^n there exists at least one point x_e ∈ s^{p+1} such that (h_i = x_i − x_0)

\Bigl( \int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px,\ e \Bigr) = \bigl( \operatorname{rot} A(x_e)\, h_1 \cdots h_{p+1},\ e \bigr).

Hint. Let D h_1 ⋯ h_{p+1} be the real fundamental form of the subspace parallel to s^{p+1} and ρ(x) the rotor density (cf. (2.10)) in the direction of this subspace. From Stokes's transformation formula (2.12) it then follows that

\Bigl( \int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px,\ e \Bigr) = \Bigl( \int_{s^{p+1}} \operatorname{rot} A(x)\, d_1x \cdots d_{p+1}x,\ e \Bigr) = \Bigl( \int_{s^{p+1}} \rho(x)\, D\, d_1x \cdots d_{p+1}x,\ e \Bigr) = \int_{s^{p+1}} (\rho(x), e)\, D\, d_1x \cdots d_{p+1}x,
which in view of the continuity of the real function (ρ(x), e) on s^{p+1} yields the assertion.
3. Prove with the aid of the above mean value theorem the following generalization of the theorem proved in II.1.13, exercise 6: Provided the p-linear alternating operator A(x) is continuous in the region G_x^m of the space R_x^m and rot A(x) exists in the sense of (2.9) at each point x^* of this region, then rot A(x) is continuous in G_x^m if and only if the equation

\lim_{\delta \to 0} \varepsilon(s^{p+1}; x^*) = 0

holds uniformly with respect to x^* on each compact subregion of G_x^m.
4. Prove the formula

\operatorname{div} u = \sum_{j=1}^{p+1} \frac{\partial u^j}{\partial \xi^j},

where x ∈ R_x^m and u = u(x) ∈ U^{p+1} are vectors with coordinates ξ^j, u^j (j = 1, …, p + 1) in the coordinate system e_1, …, e_{p+1} of the subspace U^{p+1} of R_x^m.
5. Let D h_1 ⋯ h_n be the euclidean volume of the parallelepiped spanned by the vectors h_i ∈ R_x^n (i = 1, …, n),

(D\, h_1 \cdots h_n)^2 = \det \bigl( (h_i, h_j) \bigr),

and π^n ⊂ R^n a polyhedron. Then the "index" j of the origin x = 0 with respect to the boundary ∂π^n of π^n,

j = \int_{\partial \pi^n} \frac{D\, x\, d_1x \cdots d_{n-1}x}{|x|^n},

equals zero provided x = 0 lies outside of π^n, and equals the (oriented) volume

\int_{|x| = 1} D\, x\, d_1x \cdots d_{n-1}x

of the surface of the unit sphere provided x = 0 lies inside of π^n.
Hint. Prove that rot (D x / |x|^n) = 0 for x ≠ 0. The first part of the claim then follows from Stokes's formula. In the second case remove a small sphere |x| ≤ r from π^n; Stokes's theorem then yields

j = \frac{1}{r^n} \int_{|x| = r} D\, x\, d_1x \cdots d_{n-1}x,

independently of r, and the theorem follows for r = 1.
6. Let G be a finite region in R² bounded by a piecewise regular curve ∂G. Prove Stokes's formula

\int_{\partial G} A(x)\, dx = \int_{G} \operatorname{rot} A(x)\, d_1x\, d_2x,

where A(x) is continuously differentiable on G + ∂G.
7. Let R_x be the complex (x = ξ + iη)-plane and G (⊂ R_x) a polygonal region. If the complex-valued function y = y(x) = u(x) + i v(x) is continuously differentiable on G + ∂G, then (the formula of Morera–Pompeiu)

\int_{\partial G} y(x)\, dx = \iint_{G} E_{\xi\eta}(x)\, d\xi\, d\eta,
where

E_{\xi\eta} = -\Bigl( \frac{\partial u}{\partial \eta} + \frac{\partial v}{\partial \xi} \Bigr) + i \Bigl( \frac{\partial u}{\partial \xi} - \frac{\partial v}{\partial \eta} \Bigr).
Hint. The product y(x) dx is a linear differential form, and Stokes's formula yields

\int_{\partial G} y(x)\, dx = \int_{G} \tfrac{1}{2}\, (d_1y\, d_2x - d_2y\, d_1x),

where d_1x and d_2x stand for two arbitrary differentials of x and d_1y and d_2y are the corresponding differentials of y(x). The rotor density

E_{\xi\eta} = \frac{\tfrac{1}{2}\, (d_1y\, d_2x - d_2y\, d_1x)}{D\, d_1x\, d_2x},

where D is an arbitrary real alternating form (≠ 0), is independent of the differentials d_1x, d_2x. For example, if one sets d_1x = 1 and d_2x = i and chooses for D the oriented area of the triangle spanned by the vectors d_1x and d_2x,

D\, d_1x\, d_2x = \tfrac{1}{2}\, |d_1x|\, |d_2x| \sin (d_1x, d_2x),

where the expression in brackets represents the angle between d_1x and d_2x, then D d_1x d_2x = 1/2 and

E_{\xi\eta} = i\, d_1y - d_2y = i\, \frac{\partial y}{\partial \xi} - \frac{\partial y}{\partial \eta},

from which the claim can be inferred.
8. Prove under the hypotheses of the previous exercise the formula

2 \pi i\, y(t) = \int_{\partial G} \frac{y(x)}{x - t}\, dx - \iint_{G} \frac{E_{\xi\eta}(x)}{x - t}\, d\xi\, d\eta,

where t is an interior point of G.
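When y is holomorphic, E_{ξη} vanishes by the Cauchy–Riemann equations, and exercise 8 reduces to Cauchy's integral formula 2πi y(t) = ∮ y(x)/(x − t) dx. A numerical sketch (Python, with the hypothetical choices y = exp and a square contour):

```python
import numpy as np

y = np.exp                                     # an entire function, so E = 0
t = 0.3 + 0.2j                                 # interior point of G
square = [(-1 - 1j), (1 - 1j), (1 + 1j), (-1 + 1j)]   # polygonal boundary of G

# Gauss-Legendre quadrature on each edge of the square
nodes, weights = np.polynomial.legendre.leggauss(40)
integral = 0.0
for k in range(4):
    a, b = square[k], square[(k + 1) % 4]
    x = a + (nodes + 1) / 2 * (b - a)          # map [-1, 1] onto the edge
    integral += np.sum(weights * y(x) / (x - t)) * (b - a) / 2

assert abs(integral - 2j * np.pi * y(t)) < 1e-8
```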
§ 3. Applications of Stokes's Theorem

3.1. Symmetry of the second derivative. For p = 0 a p-linear alternating operator degenerates into a vector function y(x). Assuming that this vector function is continuously differentiable on the closed simplex s^1(x_0, x_1), i.e., on the segment x = x_0 + τ (x_1 − x_0) (0 ≤ τ ≤ 1), Stokes's theorem (2.12), whose proof remains valid even for p = 0, states that

y(x_1) - y(x_0) = \int_{s^1} y'(x)\, dx. \qquad (3.1)

This formula can also be proved, as in the elementary calculus, with the aid of the mean value theorem.
After this preliminary remark, we consider a vector function y(x) in the space R_x^m that satisfies these conditions:
1. y(x) is continuously differentiable in a neighborhood of the point x_0.
2. The second derivative operator y''(x_0) exists.
We assert that even under these conditions, which assume considerably less than the ones indicated in II.1.11, the second derivative is symmetric: For arbitrary vectors h, k of the space R_x^m

y''(x_0)\, h\, k = y''(x_0)\, k\, h. \qquad (3.2)

In fact, it follows from hypothesis 1 and (3.1) that

\int_{\partial s^2} y'(x)\, dx = 0

for every 2-dimensional simplex s^2 in the vicinity of x_0. Consequently, according to the definition (2.9) of the rotor, rot y'(x_0) exists and has
the value zero. Furthermore, since y''(x_0) exists, one has, in view of the expressions (2.8) and (2.8'),

\operatorname{rot} y'(x_0)\, h\, k = \tfrac{1}{2}\, \bigl( y''(x_0)\, h\, k - y''(x_0)\, k\, h \bigr) = 0,

which was to be proved.
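The key step, the vanishing of the boundary integral of y'(x), can be illustrated numerically: for the gradient of a (hypothetical) continuously differentiable scalar function, the loop integral over any 2-simplex is zero up to quadrature error. A Python sketch:

```python
import numpy as np

# hypothetical C^1 scalar function and its gradient, playing the role of y'
def grad_f(x):
    return np.array([3 * x[0]**2 * x[1]**2 + x[1] * np.cos(x[0] * x[1]),
                     2 * x[0]**3 * x[1] + x[0] * np.cos(x[0] * x[1])])

def loop_integral(verts, n=30):
    """Integral of grad_f . dx around the closed polygon verts (Gauss-Legendre)."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    total = 0.0
    for i in range(len(verts)):
        a, b = verts[i], verts[(i + 1) % len(verts)]
        for t, w in zip(nodes, weights):
            x = a + (t + 1) / 2 * (b - a)
            total += w / 2 * np.dot(grad_f(x), b - a)
    return total

tri = [np.array([0.2, 0.1]), np.array([0.9, 0.3]), np.array([0.4, 0.8])]
assert abs(loop_integral(tri)) < 1e-10   # boundary integral of a derivative vanishes
```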
3.2. The equation rot rot A(x) = 0. The symmetry of the second derivative can be thought of as a special case (p = 1) of the following general theorem:
Let A(x) h_1 ⋯ h_{p−1} ∈ R_y^n (x, h_i ∈ R_x^m) be a (p − 1)-linear alternating operator that satisfies the following conditions: 1. A(x) is continuous in a neighborhood of x_0. 2. rot A(x) exists and is continuous in the former neighborhood. Then rot rot A(x_0) exists and vanishes.
Let s^{p+1} be an arbitrary simplex in the mentioned neighborhood that contains the point x_0. Then

\int_{\partial s^{p+1}} \operatorname{rot} A(x)\, d_1x \cdots d_px = \sum_{i=0}^{p+1} (-1)^i \int_{s_i^p} \operatorname{rot} A(x)\, d_1x \cdots d_px.

As a consequence of the hypotheses, Stokes's integral transformation formula (2.12) can be applied to each term on the right. One thus obtains for the boundary integral on the left a double sum of integrals of the differential A(x) d_1x ⋯ d_{p−1}x over the side simplexes s_{ij}^{p−1}, in which each integral appears twice with opposite signs. Consequently, for each simplex s^{p+1} of the kind mentioned,

\int_{\partial s^{p+1}} \operatorname{rot} A(x)\, d_1x \cdots d_px = 0.

But according to the definition (2.9) that means that rot rot A(x_0) exists and has the value zero.
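For p = 1 in R³ with A(x)h = (F(x), h), the theorem specializes to the classical identity div (curl F) = 0 of vector analysis. A finite-difference sketch (the field F is hypothetical; Python is assumed):

```python
import numpy as np

def F(x):
    return np.array([x[1] * x[2]**2, np.sin(x[0]) * x[2], x[0]**2 * x[1]])

def curl_F(x, eps=1e-5):
    """curl of F at x via central differences of the Jacobian."""
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros(3); e[j] = eps
        J[:, j] = (F(x + e) - F(x - e)) / (2 * eps)
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

def div_curl_F(x, eps=1e-3):
    """divergence of curl F at x; should vanish identically."""
    d = 0.0
    for j in range(3):
        e = np.zeros(3); e[j] = eps
        d += (curl_F(x + e)[j] - curl_F(x - e)[j]) / (2 * eps)
    return d

assert abs(div_curl_F(np.array([0.3, -0.7, 1.2]))) < 1e-4
```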
3.3. Integration of the equation dy(x) = A(x) dx. Let G_x^m be an open region in the space R_x^m that is "starlike" with respect to the point x_0: thus along with x, the entire segment x_0 x lies in G_x^m. In G_x^m let a differential A(x) dx ∈ R_y^n (dx ∈ R_x^m) be defined that satisfies the following conditions:
1. A(x) is continuous in G_x^m.
2. rot A(x) exists and vanishes in G_x^m.
The problem is to completely integrate the differential equation

dy(x) = A(x)\, dx. \qquad (3.3)

As a consequence of the theorem in the previous section, condition 2 is necessary for the solvability of the problem we have posed. It will turn out that this integrability condition is also sufficient.
Our differential equation can also be written

\operatorname{rot} y(x)\, dx = A(x)\, dx \qquad (3.3')

and can therefore be conceived of as a special case (p = 1) of the general differential equation

\operatorname{rot} Y(x)\, d_1x \cdots d_px = A(x)\, d_1x \cdots d_px,

which is to be considered later. For this reason we treat the present special case so as to make the unity of the integration method used, which is based essentially on the application of Stokes's transformation formula (2.12), apparent for all dimensions 1 ≤ p ≤ m.
For each x in the starlike region G_x^m the simplex s^1(x_0, x), i.e., the closed segment

t = x_0 + \tau\, (x - x_0) \qquad (0 \le \tau \le 1),

lies in G_x^m.
Supposing the existence of a solution of the differential equation (3.3) in G_x^m that assumes an arbitrarily given value y_0 ∈ R_y^n at x_0, we make use of Stokes's theorem (2.12) for s^1(x_0, x) and y(x) in order to derive an expression for this hypothetical solution. In the present case, because rot y(t) = y'(t), this formula degenerates into

y(x) - y(x_0) = \int_{s^1} y'(t)\, dt,

and it is therefore necessary that

y(x) = y_0 + \int_{s^1} A(t)\, dt = y_0 + \int_0^1 A(x_0 + \tau (x - x_0))\, (x - x_0)\, d\tau. \qquad (3.4)

On the other hand, this expression defines a vector function in the region G_x^m that for x = x_0 assumes the value y_0. We claim that under conditions 1 and 2 this function actually satisfies the differential equation (3.3) presented and is therefore the only solution with the given initial value y_0 at x_0.
For the proof, let x = x_1 (≠ x_0) be an arbitrary point in G_x^m. For a sufficiently small differential dx = h, which we take to be linearly independent of x_1 − x_0, the simplex s^2(x_0, x_1, x_2) (x_2 = x_1 + h) then also lies in G_x^m. As a consequence of assumptions 1 and 2 we can apply Stokes's transformation formula (2.12), according to which

\int_{\partial s^2} A(t)\, dt = \int_{s^2} \operatorname{rot} A(t)\, d_1t\, d_2t.

Because of assumption 2 the integral on the right vanishes, and therefore

\int_{\partial s^2} A(t)\, dt = \int_{x_0 x_1} A(t)\, dt + \int_{x_1 x_2} A(t)\, dt + \int_{x_2 x_0} A(t)\, dt = 0,
and consequently, according to the definition (3.4) of y(x),

y(x_2) - y(x_1) = y(x + h) - y(x) = \int_{x_1 x_2} A(t)\, dt = \int_{x\, (x+h)} A(t)\, dt,

an equation that is valid even if h and x − x_0 are linearly dependent, thus in particular also for x = x_0. Because of the continuity of A(t) at the point t = x, it follows from here that

y(x + h) - y(x) = A(x)\, h + |h|\, \varepsilon(h; x),

with |ε(h; x)| → 0 for |h| → 0, which implies the assertion y'(x) = A(x).
Under conditions 1 and 2 the differential equation (3.3) presented therefore has a solution in G_x^m that assumes an arbitrarily preassigned value at x_0 and is thereby uniquely determined. When the region G_x^m is in particular convex, the point x_0 can be chosen arbitrarily in this region.
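Formula (3.4) can be exercised numerically. In the sketch below (Python; a hypothetical potential f, Gauss–Legendre quadrature for the τ-integral), A is the differential of f, so rot A = 0 and the radial integral must reproduce f(x) − f(x₀) + y₀:

```python
import numpy as np

def f(x):
    return x[0] * x[1]**2 + np.sin(x[0])

def A(x, h):
    """A(x)h = (grad f(x), h), a closed 1-form since it is a differential."""
    g = np.array([x[1]**2 + np.cos(x[0]), 2 * x[0] * x[1]])
    return np.dot(g, h)

def y(x, x0, y0, n=40):
    """y(x) = y0 + integral_0^1 A(x0 + tau (x - x0)) (x - x0) dtau, formula (3.4)."""
    taus, weights = np.polynomial.legendre.leggauss(n)
    val = y0
    for t, w in zip(taus, weights):
        tau = (t + 1) / 2
        val += w / 2 * A(x0 + tau * (x - x0), x - x0)
    return val

x0 = np.zeros(2)
x = np.array([0.8, -0.5])
assert abs(y(x, x0, 0.0) - (f(x) - f(x0))) < 1e-10
```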
If the open connected region G_x^m is not starlike relative to the point x_0, then one obtains a solution of the differential equation (3.3) in G_x^m if one first constructs a solution y_0(x) with y_0(x_0) = y_0 in a convex subregion G_0 containing x_0 and then continues it in the well-known fashion to an arbitrary point x of the region G_x^m. For this, join the points x_0 and x with a finite chain G_0, …, G_j of open convex regions which are chosen so that the intersections G_i ∩ G_{i+1} (i = 0, …, j − 1) are not empty. Then if in each of these intersections one takes a point x_{i+1}, there exists a unique solution y_0(x) in G_0 with y_0(x_0) = y_0, a unique solution y_1(x) in G_1 with y_1(x_1) = y_0(x_1), etc. Since the solutions y_i(x) and y_{i+1}(x) assume the same value y_i(x_{i+1}) at the point x_{i+1} of the convex region of intersection G_i ∩ G_{i+1}, they are by the above identical in the entire intersection and consequently continuations of one another. y_j(x) therefore is a solution element in G_j that uniquely continues y_0(x) along the above chain to the point x.
The integral function thus obtained is unique locally. In general it is true that the integral experiences a zero increase on any path in G_x^m which is homologous to zero. On a closed path which is not homologous to zero the continuation of the integral can yield nonzero periods. The periods form an abelian group which is homomorphic to the homology group of the region G_x^m.
3.4. Integration of the equation rot Y(x) = A(x). Now suppose, more generally, that A(x) d_1x ⋯ d_px ∈ R_y^n (d_ix ∈ R_x^m, 1 ≤ p ≤ m) is a p-linear alternating differential defined in the region G_x^m of R_x^m which is starlike with respect to x_0 and that satisfies both of the conditions in the previous section:
1. A(x) is continuous in G_x^m.
2. rot A(x) exists and vanishes in G_x^m.
Our task is to solve the differential equation rot Y(x) d,x ... dox = A(x) d,x ... d,,x .
(3.5)
Supposing that $Y(x)$ is a $(p-1)$-linear alternating continuous solution of this equation, we wish, in generalization of the method pursued for $p = 1$, first to apply the $p$-dimensional Stokes transformation formula (2.12) to this solution in order to derive an expression for $Y(x)$, and then to show by means of the $(p+1)$-dimensional Stokes theorem that this expression actually gives the general solution of the above equation in $G_x^m$. For this, let $x = x_1$ ($\ne x_0$) be an arbitrary point of the region $G_x^m$ and $h_1, \ldots, h_{p-1}$ vectors that we first assume to be linearly independent of $x_1 - x_0$ and moreover so small that the entire $p$-dimensional simplex
$$s^p = s^p(x_0, \ldots, x_p), \qquad x_{i+1} = x_1 + h_i \quad (i = 1, \ldots, p-1),$$
lies in $G_x^m$.
The hypothetical solution $Y$ was assumed to be continuous. Further, since according to (3.5) $\operatorname{rot} Y$ exists and as a consequence of assumption 1 is continuous in $G_x^m$, we can apply Stokes's theorem (2.12) to $Y$ and thus obtain, because $\operatorname{rot} Y = A$,
$$\int_{\partial s^p} Y(t)\, d_1t \cdots d_{p-1}t = \int_{s^p} A(t)\, d_1t \cdots d_pt,$$
or
$$\int_{s_0^{p-1}} Y(t)\, d_1t \cdots d_{p-1}t = \sum_{i=1}^{p} (-1)^{i-1} \int_{s_i^{p-1}} Y(t)\, d_1t \cdots d_{p-1}t + \int_{s^p} A(t)\, d_1t \cdots d_pt, \qquad (3.6)$$
where the $p+1$ side simplexes of $s^p$ are denoted by
$$s_i^{p-1} = s^{p-1}(x_0, \ldots, \hat{x}_i, \ldots, x_p) \qquad (i = 0, \ldots, p).$$
The left hand side can, because of the continuity of $Y(t)$ for $t = x$, be written
$$\int_{s_0^{p-1}} Y(t)\, d_1t \cdots d_{p-1}t = Y(x)\, h_1 \cdots h_{p-1} + \delta^{p-1}\,(s_0^{p-1}; x), \qquad (3.7)$$
where $\delta = \max |h_i|$ and $(s_0^{p-1}; x)$ in general stands for a quantity that vanishes with $\delta$. We apply formula (1.8) to the right side of (3.6). The last integral becomes
$$\int_{s^p} A(t)\, d_1t \cdots d_pt = \int_{s_0^{p-1}} \varphi A(t)\, d_1t \cdots d_{p-1}t,$$
with the $(p-1)$-linear alternating operator
$$\varphi A(t) = \int_0^1 d(\tau^p)\, A(x_0 + \tau(t - x_0))\,(t - x_0). \qquad (3.8)$$
Because of the continuity of $A(t)$ for $t = x$,
$$\int_{s^p} A(t)\, d_1t \cdots d_pt = \varphi A(x)\, h_1 \cdots h_{p-1} + \delta^{p-1}\,(s_0^{p-1}; x) \qquad (3.9)$$
follows from this. The sum on the right in (3.6) is similarly transformed:
$$\Sigma = \sum_{i=1}^{p} (-1)^{i-1} \int_{s_i^{p-1}} Y(t)\, d_1t \cdots d_{p-1}t = \sum_{i=1}^{p} (-1)^{i-1} \int_{s_i^{p-2}} \varphi Y(t)\, d_1t \cdots d_{p-2}t,$$
where $s_i^{p-2} = s^{p-2}(x_1, \ldots, \hat{x}_i, \ldots, x_p)$ and
$$\varphi Y(t) = \int_0^1 d(\tau^{p-1})\, Y(x_0 + \tau(t - x_0))\,(t - x_0) \qquad (3.10)$$
is a $(p-2)$-linear alternating operator. It further results from this, since the simplexes $s_i^{p-2}$ ($i = 1, \ldots, p$) form the boundary $\partial s_0^{p-1}$ of $s_0^{p-1}$, that
$$\Sigma = \int_{\partial s_0^{p-1}} \varphi Y(t)\, d_1t \cdots d_{p-2}t. \qquad (3.11)$$
As a résumé of formulas (3.7), (3.9), (3.11) it follows from (3.6) that
$$\int_{\partial s_0^{p-1}} \varphi Y(t)\, d_1t \cdots d_{p-2}t = Y(x)\, h_1 \cdots h_{p-1} - \varphi A(x)\, h_1 \cdots h_{p-1} + \delta^{p-1}\,(s_0^{p-1}; x),$$
which according to the definition (2.9) of the rotor states that $\operatorname{rot} \varphi Y$ exists and is equal to $Y - \varphi A$. Consequently, one has
$$Y(x)\, h_1 \cdots h_{p-1} = \operatorname{rot} \varphi Y(x)\, h_1 \cdots h_{p-1} + \varphi A(x)\, h_1 \cdots h_{p-1} \qquad (3.12)$$
for each $x \in G_x^m$.

If our problem can be solved at all, then there is associated with each continuous solution $Y$ a well-determined $(p-2)$-linear alternating operator $\varphi Y$ for which the rotor exists so that equation (3.12) holds. Consequently $\operatorname{rot} \varphi Y$ is even continuous, and according to the theorem proved in 3.2, $\operatorname{rot} \operatorname{rot} \varphi Y = 0$. Thus, because $\operatorname{rot} Y = A$, it is necessarily true that
$$\operatorname{rot} \varphi A(x) = A(x), \qquad (3.13)$$
and $\varphi A$ is therefore a particular solution of equation (3.5).
In the following section it will be shown that, under integrability condition 2, $\varphi A$ actually satisfies equation (3.13). Thus if $B(x)$ is an
arbitrary $(p-1)$-linear alternating operator defined in $G_x^m$ for which $\operatorname{rot} B$ exists and vanishes, then
$$Y(x)\, h_1 \cdots h_{p-1} = B(x)\, h_1 \cdots h_{p-1} + \varphi A(x)\, h_1 \cdots h_{p-1} \qquad (3.14)$$
is the general solution to the problem posed. It follows from the above derivation that the operator $B$ is connected with the solution $Y$ by the formula $B(x) = \operatorname{rot} \varphi Y(x)$.
Concerning the meaning of this arbitrary operator for the problem, let the following be remarked: from the definition (3.8) it follows at once that the differential $\varphi A(x)\, h_1 \cdots h_{p-1}$ vanishes provided the vectors $x - x_0, h_1, \ldots, h_{p-1}$ are linearly dependent. But the result (3.14) remains valid even in this case, and one thus sees that $B$ plays the role of an "initial operator" whose choice uniquely determines the solution.
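For $p = 1$ the operator (3.8) reduces to the radial integral $\varphi A(x) = \int_0^1 A(x_0 + \tau(x - x_0))\,(x - x_0)\, d\tau$, a Poincaré-type potential. The following numerical sketch (an illustration added here, with a hypothetical closed 1-form chosen as a gradient field on $R^2$) checks that the radial integral yields a potential whose gradient reproduces $A$:

```python
import numpy as np

# Hypothetical closed (curl-free) 1-form on R^2: by construction the
# gradient of F(x) = sin(x1)*x2 + x1*x2**2, so "rot A = 0" holds.
def A(x):
    return np.array([np.cos(x[0]) * x[1] + x[1] ** 2,
                     np.sin(x[0]) + 2.0 * x[0] * x[1]])

def phi_A(x, x0=np.zeros(2), n=20000):
    # phi A(x) = integral_0^1 A(x0 + tau (x - x0)) . (x - x0) d(tau), midpoint rule
    taus = (np.arange(n) + 0.5) / n
    vals = [A(x0 + tau * (x - x0)) @ (x - x0) for tau in taus]
    return float(np.mean(vals))

x = np.array([0.7, -0.3])
# The gradient of the radial potential should reproduce A(x):
eps = 1e-5
grad = np.array([(phi_A(x + eps * e) - phi_A(x - eps * e)) / (2 * eps)
                 for e in np.eye(2)])
```

Here `grad` agrees with `A(x)` up to quadrature and differencing error, and `phi_A(x)` coincides with the underlying potential $F(x) - F(x_0)$.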
3.5. Proof of (3.13). In order to prove formula (3.13), we take an arbitrary point $x = x_1$ ($\ne x_0$) in $G_x^m$ and the vectors $k_1, \ldots, k_p$ ($1 \le p \le m - 1$) from $R_x^m$ so that they with $x_1 - x_0$ form a linearly independent system and so that the simplex $s^{p+1} = s^{p+1}(x_0, \ldots, x_{p+1})$ ($x_{i+1} = x_1 + k_i$) lies in $G_x^m$. The operator $A(x)$ is according to hypothesis 1 continuous in $G_x^m$. Further, since the rotor, according to hypothesis 2, exists in $G_x^m$ and because $\operatorname{rot} A = 0$, we can apply Stokes's integral transformation (2.12) to the operator $A$ and the simplex $s^{p+1}$ and find
$$\int_{\partial s^{p+1}} A(x)\, d_1x \cdots d_px = 0. \qquad (3.15)$$
Because of the additivity of the alternating differential $A(x)\, d_1x \cdots d_px$ this result also holds if the vectors $x_1 - x_0, k_1, \ldots, k_p$ form a linearly dependent system and the simplex degenerates. Our argument hence remains valid in the case $p = m$, and this is true for $p \le m$ at the point $x_1 = x_0$ too. Equation (3.15) yields
$$\int_{s_0^p} A(x)\, d_1x \cdots d_px = \sum_{i=1}^{p+1} (-1)^{i-1} \int_{s_i^p} A(x)\, d_1x \cdots d_px, \qquad (3.16)$$
where $s_i^p$ ($i = 0, \ldots, p+1$) stand for the side simplexes of $s^{p+1}$. For $i = 1, \ldots, p+1$ one obtains by (1.8)
$$\int_{s_i^p} A(x)\, d_1x \cdots d_px = \int_{s_i^{p-1}} \varphi A(x)\, d_1x \cdots d_{p-1}x,$$
where $\varphi A$ is the operator (3.8). The simplexes $s_i^{p-1} = s^{p-1}(x_1, \ldots, \hat{x}_i, \ldots, x_{p+1})$ ($i = 1, \ldots, p+1$) form the boundary $\partial s_0^p$ of $s_0^p$, and summation on $i$ in (3.16) gives, in view of the continuity of $A(x)$,
$$\int_{\partial s_0^p} \varphi A(x)\, d_1x \cdots d_{p-1}x = \int_{s_0^p} A(x)\, d_1x \cdots d_px = A(x_1)\, k_1 \cdots k_p + \delta^p\,(s_0^p; x_1),$$
where $\delta = \max |k_i|$ and $(s_0^p; x_1)$ vanishes with $\delta$. But this relation, according to definition (2.9), states that
$$\operatorname{rot} \varphi A(x_1)\, k_1 \cdots k_p = A(x_1)\, k_1 \cdots k_p.$$
Here $x_1$ is an arbitrary point in $G_x^m$. Since the equation is valid as a consequence of the linearity for arbitrary vectors $k_i \in R_x^m$, provided it holds for sufficiently small ones, assertion (3.13) is therefore proved.

3.6. Exercise. Let $A(x)$ be a $p$-linear alternating operator that is continuous in $G_x^m$ and whose rotor exists in $G_x^m$ in the sense of definition
(2.9) and is continuous. If $\operatorname{rot} A(x)$ does not vanish identically, the value of the boundary integral (3.15) is no longer zero. Transform this integral according to Stokes's theorem and then carry out transformation (1.8). By passing to the limit (analogously to 3.5) further prove the formula
$$\operatorname{rot} \varphi A = A - \varphi \operatorname{rot} A,$$
which is important for certain questions in the theory of tensor fields on differentiable manifolds.
IV. Differential Equations

In this chapter the first order differential equation
$$\frac{dy}{dx} = f(x, y) \qquad (0.1)$$
is to be investigated. Here $x$ is a vector in an $m$-dimensional linear space $R_x^m$, and $y = y(x)$ is a vector function, which is to be determined. The range of $y$ lies in an $n$-dimensional space $R_y^n$, while $f(x, y)$ stands for a linear operator, which maps the space $R_x^m$ into the space $R_y^n$. In differential form the equation is
$$dy = f(x, y)\, dx. \qquad (0.1')$$
If a basis $a_1, \ldots, a_m$ is introduced in $R_x^m$ and $b_1, \ldots, b_n$ in $R_y^n$, and one sets
$$x = \sum_{i=1}^{m} \xi^i a_i, \qquad y = \sum_{j=1}^{n} \eta^j b_j,$$
the equation transforms into a system of $mn$ partial differential equations of first order,
$$\frac{\partial \eta^j}{\partial \xi^i} = f_i^j(\xi^1, \ldots, \xi^m;\ \eta^1, \ldots, \eta^n) \qquad (i = 1, \ldots, m;\ j = 1, \ldots, n),$$
for the like number of partial derivatives of the $n$ unknown functions $\eta^j = \eta^j(\xi^1, \ldots, \xi^m)$, where $(f_i^j)$ is the matrix associated with the operator $f$. Conversely, such a system can be summarized vectorially in the form (0.1).
§ 1. Normal Systems

1.1. Definition and problem. If the space $R_x$ of the independent variable $x$ is one-dimensional, $m = 1$, the differential equation
$$\frac{dy}{dx} = f(x, y) \qquad (1.1)$$
is called a normal equation. Written in coordinates ($x = \xi e$, $e \in R_x^1$) it becomes
$$\frac{d\eta^j}{d\xi} = f^j(\xi;\ \eta^1, \ldots, \eta^n) \qquad (j = 1, \ldots, n)$$
and is therefore equivalent to a normal system of $n$ equations for the equally many functions $\eta^j = \eta^j(\xi)$. Keeping the vector notation $y$, equation (1.1) can also be written in the form
$$\frac{dy}{d\xi} = f(x, y)\, e.$$
For the sake of unity we prefer to use the general vector form (1.1) even in the present special case of a one-dimensional $x$-space.

In what follows the uniqueness and existence of the solution to a normal system is to be investigated under the following assumption: the linear operator $f(x, y)$ is continuous for $|x - x_0| \le r_x < \infty$, $|y - y_0| \le r_y < \infty$. We exclude the trivial case where $f(x, y)$ depends only on $x$ and the integration of the differential equation reduces to a quadrature.
1.2. The uniqueness problem. We consider in the following solutions of the normal system, i.e., functions $y = y(x)$ that are differentiable for $|x - x_0| \le r_x$ and whose values fall in the ball $|y - y_0| \le r_y$, so that
$$y'(x) = \frac{dy(x)}{dx} = f(x, y(x))$$
and
$$y(x_0) = y_0. \qquad (1.2)$$
The problem is to determine under which assumptions there exists precisely one solution.

For this purpose we associate with the operator $f(x, y)$ the upper variation of $f$ with respect to $y$, which is defined for each $\varrho \ge 0$ by means of
$$\varphi(\varrho) = \sup |f(x, y_1) - f(x, y_2)| \qquad (1.3)$$
for $|x - x_0| \le r_x$, $|y_i - y_0| \le r_y$ ($i = 1, 2$), $|y_1 - y_2| \le \varrho$. This function satisfies these conditions:

1. $\varphi(\varrho)$ ($\ge 0$) increases monotonically with $\varrho$, $\varphi(0) = 0$ and $\varphi(\varrho) = \varphi(2 r_y)$ for $\varrho \ge 2 r_y$.
2. $\varphi(\varrho)$ is subadditive: $\varphi(\varrho_1 + \varrho_2) \le \varphi(\varrho_1) + \varphi(\varrho_2)$.
3. $\varphi(\varrho)$ is continuous.
Property 1 is evident from the definition of $\varphi$. To prove 2, let $\varrho_1$ and $\varrho_2$ be two arbitrary numbers $\ge 0$. Then choose two points $y_1, y_2$ in the ball $|y - y_0| \le r_y$ so that $|y_1 - y_2| \le \varrho_1 + \varrho_2$. Moreover, let $y = y_3$ be that point on the segment that joins $y_1$ and $y_2$ for which
$$|y_1 - y_3| = \frac{\varrho_1}{\varrho_1 + \varrho_2}\,|y_1 - y_2|, \qquad |y_3 - y_2| = \frac{\varrho_2}{\varrho_1 + \varrho_2}\,|y_1 - y_2|.$$
Then
$$|f(x, y_1) - f(x, y_2)| \le |f(x, y_1) - f(x, y_3)| + |f(x, y_3) - f(x, y_2)| \le \varphi(\varrho_1) + \varphi(\varrho_2).$$
This relation is valid for any pair of values $y_1, y_2$ with $|y_1 - y_2| \le \varrho_1 + \varrho_2$. Hence $\varphi(\varrho_1 + \varrho_2) \le \varphi(\varrho_1) + \varphi(\varrho_2)$.

Properties 1 and 2 hold independently of the assumption of the continuity of $f$. If the continuity requirement is satisfied, then $f(x, y)$ is uniformly continuous on the closed point set $|x - x_0| \le r_x$, $|y - y_0| \le r_y$, and $\varphi(\varrho) \to 0$ for $\varrho \to 0$; i.e., $\varphi$ is continuous for $\varrho = 0$. The subadditivity then yields the continuity of $\varphi(\varrho)$ for every $\varrho \ge 0$. Observe that $\varphi(\varrho) > 0$ for $\varrho > 0$, because $f(x, y)$ would otherwise depend only on $x$, which was excluded above.
1.3. Osgood's uniqueness theorem. Now let $y_1(x)$ and $y_2(x)$ be two solutions of the normal equation (1.1) that satisfy the initial condition (1.2). We define $y(x) = y_1(x) - y_2(x)$ and for $r \le r_x$
$$m(r) = \sup_{|x - x_0| \le r} |y(x)|.$$
Then fix two numbers $r, \Delta r$ so that $0 \le r < r + \Delta r \le r_x$ and choose $x$ and $\Delta x$ in such a way that $|x - x_0| = r$, $|x + \Delta x - x_0| \le r + \Delta r$. Then
$$y'(x) = \frac{dy(x)}{dx} = f(x, y_1(x)) - f(x, y_2(x))$$
and
$$|y(x + \Delta x)| = \Big| y(x) + \int_x^{x+\Delta x} \big(f(t, y_1(t)) - f(t, y_2(t))\big)\, dt \Big| \le |y(x)| + \int_x^{x+\Delta x} |f(t, y_1(t)) - f(t, y_2(t))|\, |dt|$$
$$\le m(r) + \int_x^{x+\Delta x} \varphi(|y(t)|)\, |dt| \le m(r) + \varphi(m(r + \Delta r)) \int_x^{x+\Delta x} |dt| \le m(r) + \varphi(m(r + \Delta r))\, \Delta r.$$
If the maximum $m(r + \Delta r)$ of $|y(t)|$ ($|t - x_0| \le r + \Delta r$) is reached at a point $x'$ of the interval $r < |t - x_0| \le r + \Delta r$, then for $x + \Delta x = x'$
$$m(r + \Delta r) \le m(r) + \varphi(m(r + \Delta r))\, \Delta r.$$
In the other case $m(r + \Delta r) = m(r)$. The above inequality is thus valid in general, and consequently, setting $m(r) = \varrho(r) = \varrho$,
$$\Delta \varrho = \varrho(r + \Delta r) - \varrho(r) \le \varphi(\varrho + \Delta \varrho)\, \Delta r.$$
We let $r_0$ ($\le r_x$) stand for the least upper bound of those numbers $r \le r_x$ for which $m(r) = 0$ and assume $r_0 < r_x$. Then $\varrho(r) \to 0$ as $r \to r_0 + 0$, and $\varrho(r) > 0$ for $r_0 < r \le r_x$. Consequently $\varphi(\varrho + \Delta \varrho) > 0$, so that the above inequality can be written
$$\frac{\Delta \varrho}{\varphi(\varrho + \Delta \varrho)} \le \Delta r$$
for these values of $r$. From this it follows that for $r_0 < r \le r_1$ ($r_1 \le r_x$)
$$\int_{\varrho(r)}^{\varrho(r_1)} \frac{d\varrho}{\varphi(\varrho)} \le r_1 - r.$$
Since $\varrho(r) \to 0$ as $r \to r_0 + 0$, this implies that
$$\int_0^{\varrho(r_1)} \frac{d\varrho}{\varphi(\varrho)} < \infty.$$
Thus if one assumes that the integral
$$\int_0 \frac{d\varrho}{\varphi(\varrho)} = \infty,$$
then necessarily $r_0 = r_x$. Consequently $\varrho(r) = m(r) = 0$ for $r \le r_x$, and therefore $y(x) = y_1(x) - y_2(x) = 0$ on the entire interval $|x - x_0| \le r_x$. This result implies the

Uniqueness theorem of W. F. Osgood. Provided the integral
$$\int_0 \frac{d\varrho}{\varphi(\varrho)} \qquad (1.4)$$
diverges, the normal system (1.1) has at most one solution $y = y(x)$ with the given initial value $y(x_0) = y_0$.
The theorem also includes the case where $f(x, y)$ only depends on $x$, because then $\varphi(\varrho) \equiv 0$. Osgood's condition is certainly fulfilled when $f(x, y)$ satisfies a Lipschitz condition
$$|f(x, y_1) - f(x, y_2)| \le K\,|y_1 - y_2| \qquad (K = \mathrm{const.})$$
with respect to $y$. For then $\varphi(\varrho) \le K\varrho$, and Osgood's integral diverges. More generally, uniqueness holds provided for sufficiently small $|y_1 - y_2|$
$$|f(x, y_1) - f(x, y_2)| \le K\,|y_1 - y_2|\, \log \frac{1}{|y_1 - y_2|} \cdots \log_\nu \frac{1}{|y_1 - y_2|},$$
where $\log_\nu$ is the $\nu$-fold iterated logarithm.
1.4. Inversion of Osgood's theorem. The question arises whether Osgood's sufficient condition is also necessary for the uniqueness of the solution. This is actually the case, in the following sense.

Let $\varphi(\varrho)$ be a continuous function with properties 1 and 2 of 1.2. We consider the set $\{f\}$ of all operators $f(x, y)$ continuous for $|x - x_0| \le r_x$, $|y - y_0| \le r_y$ which satisfy the inequality
$$\varphi_f(\varrho) \le \varphi(\varrho),$$
where $\varphi_f(\varrho)$ stands for the upper variation of $f(x, y)$ defined by (1.3). This theorem is then valid: If the Osgood integral (1.4) converges, there is an operator $f \in \{f\}$ such that the differential equation (1.1) has at least two solutions with the initial condition $y(x_0) = y_0$.

For the proof we restrict the discussion to those special differential equations in the class $\{f\}$ for which the linear operator $f(x, y)$ carries the differential $dx \in R_x^1$ over into a one-dimensional subspace $R_y^1$ of the space $R_y^n$. We then are concerned with the one-dimensional case ($m = n = 1$). After introduction of arbitrary unit vectors in $R_x^1$ and $R_y^1$, the differential equation can be written in the coordinate form $dy = f(x, y)\, dx$, where $x$, $y$ and $f$ are now real. This subclass of $\{f\}$ contains, in particular, the (real) operator
$$f(x, y) = \varphi(|y|),$$
where $\varphi$ is the function defined above for all $|y|$. In fact, for this function
$$|f(x, y_1) - f(x, y_2)| = |\varphi(|y_1|) - \varphi(|y_2|)|,$$
and in view of inequality 2 in 1.2, provided $|y_1| \ge |y_2|$,
$$\varphi(|y_1|) = \varphi\big(|y_2| + (|y_1| - |y_2|)\big) \le \varphi(|y_2|) + \varphi(|y_1| - |y_2|).$$
But here $|y_1| - |y_2| \le |y_1 - y_2|$, and therefore $\varphi(|y_1|) - \varphi(|y_2|) \le \varphi(|y_1 - y_2|)$. Hence
$$|f(x, y_1) - f(x, y_2)| \le \varphi(|y_1 - y_2|),$$
and $f$ thus belongs to the class $\{f\}$ considered. But the real differential equation
$$dy = \varphi(|y|)\, dx$$
has two solutions that vanish for $x = 0$, namely the trivial solution $y \equiv 0$ and the function $y = y(x)$ which for $y \ge 0$, $x \ge 0$ is defined as the inverse function of the integral
$$x = \int_0^y \frac{d\eta}{\varphi(\eta)},$$
and for $x < 0$ by $y(x) = -y(-x)$.
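The classical instance of this construction (a sketch added here for illustration) is $\varphi(\varrho) = 2\sqrt{\varrho}$, whose Osgood integral converges. The equation $dy = 2\sqrt{|y|}\,dx$ then has, besides $y \equiv 0$, the solution obtained by inverting $x = \int_0^y d\eta/(2\sqrt{\eta}) = \sqrt{y}$, i.e. $y = x|x|$; both candidates can be checked numerically:

```python
import numpy as np

f = lambda x, y: 2.0 * np.sqrt(np.abs(y))   # phi(rho) = 2 sqrt(rho): Osgood integral converges

y_trivial = lambda x: 0.0 * x               # the trivial solution through (0, 0)
y_branch  = lambda x: x * np.abs(x)         # inverse of x = sqrt(y), extended oddly for x < 0

xs = np.linspace(-1.0, 1.0, 2001)
h = xs[1] - xs[0]
residuals = []
for y in (y_trivial, y_branch):
    dy = np.gradient(y(xs), h)              # numerical derivative of the candidate
    residuals.append(float(np.max(np.abs(dy - f(xs, y(xs))))))
# both residuals are of the order of the step size h, i.e. both functions solve the ODE
```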
1.5. Existence proof by the polygonal method. We now consider the question of the existence of a solution of the normal system (1.1). This is to be established using the classical polygonal method of Cauchy. We assume for the time being only the continuity condition of 1.1. Let $|f(x, y)| \le M < \infty$ in the region $|x - x_0| \le r_x$, $|y - y_0| \le r_y$; we set $r_0 = \min[r_x, r_y/M]$. Let the function associated with $f(x, y)$ in 1.2 be $\varphi(\varrho)$. Then on the domain just mentioned
$$|f(x, y_1) - f(x, y_2)| \le \varphi(|y_1 - y_2|).$$
Let $x = a$ be one end point of the interval $|x - x_0| \le r_0$. On the closed segment $x_0 a$ insert a monotonic sequence of points $D\colon x_0, \ldots, x_N = a$, $|x_1 - x_0| < \cdots < |x_N - x_0|$. We define, beginning with the initial value $y_0$, the sequence of points
$$y_j = y_{j-1} + \int_{x_{j-1}}^{x_j} f(t, y_{j-1})\, dt \qquad (j = 1, \ldots, N).$$
That this procedure is meaningful follows from the relation
$$|y_j - y_0| \le M\,|x_j - x_0| \le M r_0 \le r_y,$$
which can be established by induction. For if $j = 0$ it is trivial, and if it holds for $y_0, \ldots, y_{j-1}$, then
$$|y_j - y_0| \le |y_j - y_{j-1}| + |y_{j-1} - y_0| \le \int_{x_{j-1}}^{x_j} |f(t, y_{j-1})|\, |dt| + M\,|x_{j-1} - x_0| \le M\big(|x_j - x_{j-1}| + |x_{j-1} - x_0|\big) = M\,|x_j - x_0|.$$
For $|x_{j-1} - x_0| \le |x - x_0| \le |x_j - x_0|$ we now set
$$y_D(x) = y_{j-1} + \int_{x_{j-1}}^{x} f(t, y_{j-1})\, dt. \qquad (1.5)$$
By this equation $y_D(x)$ is defined on the closed segment $x_0 a$ as a continuous function such that $y_D(x_j) = y_j$ ($j = 0, \ldots, N$). For $|x_{j-1} - x_0| < |x - x_0| < |x_j - x_0|$ it has the derivative
$$y_D'(x) = f(x, y_{j-1}), \qquad (1.6)$$
and for $x = x_j$ the one-sided derivatives $f(x_j, y_{j-1})$ and $f(x_j, y_j)$. The derivative is continuous (except at the points $x_j$) and bounded ($|y_D'(x)| \le M$). One further deduces from (1.5) that
$$|y_D(x) - y_0| \le |y_D(x) - y_{j-1}| + |y_{j-1} - y_0| \le \int_{x_{j-1}}^{x} |f(t, y_{j-1})|\, |dt| + M\,|x_{j-1} - x_0| \le M\big(|x - x_{j-1}| + |x_{j-1} - x_0|\big) = M\,|x - x_0| \le M r_0 \le r_y.$$
Now we write
$$y_D'(x) = f(x, y_D(x)) + r_D(x),$$
where according to (1.6)
$$|r_D(x)| = |f(x, y_D(x)) - f(x, y_{j-1})| \le \varphi(|y_D(x) - y_{j-1}|)$$
for $|x_{j-1} - x_0| < |x - x_0| < |x_j - x_0|$. Here, by (1.5),
$$|y_D(x) - y_{j-1}| \le \int_{x_{j-1}}^{x} |f(t, y_{j-1})|\, |dt| \le M\,|x_j - x_{j-1}|.$$
Suppose $\delta_D$ is the largest of the numbers $|x_j - x_{j-1}|$ ($j = 1, \ldots, N$). Then
$$|r_D(x)| \le \varphi(M \delta_D).$$
Therefore at the points of continuity $x \ne x_j$ ($j = 1, \ldots, N$) of the derivative
$$y_D'(x) = f(x, y_D(x)) + \langle \varphi(M \delta_D) \rangle,$$
and the same inequality also holds for the two one-sided derivatives at the points $x = x_j$. The integration of this relation between $x$ and $x + \Delta x$ further yields
$$y_D(x + \Delta x) = y_D(x) + \int_x^{x+\Delta x} f(t, y_D(t))\, dt + \langle \varphi(M \delta_D) \rangle\, |\Delta x|. \qquad (1.7)$$
Now let $\delta$ be an arbitrary positive number. We consider two divisions $D_1$ and $D_2$ of the segment $x_0 a$, so fine that $\delta_{D_1}, \delta_{D_2} \le \delta$. If one then sets
$$y(x) = y_{D_1}(x) - y_{D_2}(x) \qquad \big(|y(x)| \le 2 M r_0\big)$$
and applies the relation (1.7) for $D = D_1$ and $D = D_2$, one obtains by subtraction
$$y(x + \Delta x) = y(x) + \int_x^{x+\Delta x} \big(f(t, y_{D_1}(t)) - f(t, y_{D_2}(t))\big)\, dt + 2\,\langle \varphi(M \delta) \rangle\, |\Delta x|. \qquad (1.8)$$
At the point $x = x_0$, $y_D(x)$ is continuous and $y_D(x_0) = y_0$.

¹ We use $\langle c \rangle$ to designate any quantity whose norm $|\langle c \rangle| \le c$.
We now complete the argument, more or less as in 1.3, in the following way. This time let
$$m(r) = \sup_{|x - x_0| \le r} |y(x)| \qquad (r \le r_0).$$
If one again takes $0 \le r < r + \Delta r \le r_0$ and $|x - x_0| = r \le |x + \Delta x - x_0| \le r + \Delta r$, then in view of (1.8)
$$|y(x + \Delta x)| \le |y(x)| + \int_x^{x+\Delta x} |f(t, y_{D_1}(t)) - f(t, y_{D_2}(t))|\, |dt| + 2\,|\Delta x|\, \varphi(M \delta) \le m(r) + \int_x^{x+\Delta x} \varphi(|y(t)|)\, |dt| + 2\,|\Delta x|\, \varphi(M \delta).$$
Here $|y(t)| \le m(r + \Delta r)$, and one concludes just as in 1.3 that
$$m(r + \Delta r) \le m(r) + \big(\varphi(m(r + \Delta r)) + 2\varphi(M \delta)\big)\, \Delta r.$$
Taking into account the monotonicity of $\varphi$ it follows from here that
$$\Delta m = m(r + \Delta r) - m(r) \le \big(\varphi(m(r + \Delta r)) + 2\varphi(M \delta)\big)\, \Delta r \le 3\,\varphi\big(m(r + \Delta r) + M \delta\big)\, \Delta r$$
and
$$\frac{\Delta m}{\varphi\big(m(r + \Delta r) + M \delta\big)} \le 3\, \Delta r.$$
If one sets $\varrho = m(r) + M \delta$, one finds, as in 1.3,
$$\int_{M\delta}^{m(r_0) + M\delta} \frac{d\varrho}{\varphi(\varrho)} \le 3\, r_0. \qquad (1.9)$$
1.6. We consider now for a given $a > 0$ the integral
$$\int_a^{a+\beta} \frac{d\varrho}{\varphi(\varrho)}$$
as a function of $\beta \ge 0$ (cf. Fig. 4, where $\varrho$ is the abscissa and $1/\varphi(\varrho)$ the ordinate). Since it vanishes for $\beta = 0$ and increases continuously, monotonically and without bound as $\beta$ increases, there is a well-determined value $\beta = \beta(a)$ such that
$$\int_a^{a+\beta(a)} \frac{d\varrho}{\varphi(\varrho)} = 3\, r_0.$$
$\beta(a)$ is defined for $a > 0$ as a positive, monotonically increasing and continuous function of $a$. Hence the limit
$$\lim_{a \to 0} \beta(a) = \beta_0 \ge 0$$
exists, and $\beta_0 > 0$ or $\beta_0 = 0$ according as the Osgood integral (1.4) converges or diverges.
1.7. We return to inequality (1.9). If one sets $M\delta = a$, then
$$\int_a^{m(r_0)+a} \frac{d\varrho}{\varphi(\varrho)} \le 3\, r_0 = \int_a^{a+\beta(a)} \frac{d\varrho}{\varphi(\varrho)},$$
from which follows:
$$m(r_0) \le \beta(a) = \beta(M \delta).$$

Fig. 4

Now if the integral (1.4) is divergent, $\beta(M\delta)$ tends to zero as $\delta \to 0$, and since then for $|x - x_0| \le r_0$
$$|y(x)| = |y_{D_1}(x) - y_{D_2}(x)| \le m(r_0) \le \beta(M \delta), \qquad (1.10)$$
it follows from the Cauchy convergence criterion that for unlimited refinement of the partition $D$ the approximating function tends to a well-defined limit function, uniformly on the interval $|x - x_0| \le r_0$.

The function thus constructed is a solution to our problem. First, $y(x_0) = y_0$ and $|y(x) - y_0| \le r_y$ for $|x - x_0| \le r_0$, because each approximating function possesses these properties. Further, according to (1.7) ($x = x_0$, $\Delta x = x - x_0$),
$$y_D(x) = y_0 + \int_{x_0}^{x} f(t, y_D(t))\, dt + \langle \varphi(M \delta_D) \rangle\, r_0,$$
and refining $D$, one finds
$$y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\, dt.$$
Thus
$$y'(x) = \frac{dy(x)}{dx} = f(x, y(x))$$
for $|x - x_0| \le r_0 = \min[r_x, r_y/M]$. This completes the proof.
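A minimal sketch of the polygonal construction (illustrative; the test equation and names are my own). On a uniform division $D$ the integral in (1.5) is approximated by freezing the integrand at the left endpoint, which is Euler's method; refining $D$ the polygons converge to the solution:

```python
import numpy as np

def cauchy_polygon(f, x0, y0, a, N):
    # Polygonal approximation on the division x0 = t_0 < ... < t_N = a:
    # y_j = y_{j-1} + f(t_{j-1}, y_{j-1}) (t_j - t_{j-1}), cf. (1.5)
    ts = np.linspace(x0, a, N + 1)
    ys = [y0]
    for j in range(1, N + 1):
        ys.append(ys[-1] + f(ts[j - 1], ys[-1]) * (ts[j] - ts[j - 1]))
    return ts, np.array(ys)

# Test problem with a known solution: y' = y, y(0) = 1, so y(1) = e.
_, y_coarse = cauchy_polygon(lambda x, y: y, 0.0, 1.0, 1.0, 500)
_, y_fine   = cauchy_polygon(lambda x, y: y, 0.0, 1.0, 1.0, 2000)
err_coarse = abs(y_coarse[-1] - np.e)
err_fine   = abs(y_fine[-1] - np.e)
# the error at x = 1 shrinks as the division is refined
```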
1.8. Summary. The preceding investigation has led to this result: Let $f(x, y)$ be an operator defined for $|x - x_0| \le r_x$, $|y - y_0| \le r_y$, with the properties:

A. $f(x, y)$ is continuous and bounded ($|f(x, y)| \le M < \infty$).

B. The Osgood integral
$$\int_0 \frac{d\varrho}{\varphi(\varrho)}$$
diverges, where $\varphi(\varrho)$ is the function associated with $f(x, y)$ by means of equation (1.3).

Under these assumptions the differential equation
$$\frac{dy}{dx} = f(x, y)$$
has one and only one solution on the interval $|x - x_0| \le r_0 = \min[r_x, r_y/M]$ such that $y(x_0) = y_0$ and $|y(x) - y_0| \le r_y$.

1.9. The above analysis of the Cauchy polygonal method also provides information about the rapidity of the limit process (inequality (1.10)). If, for example, one assumes the Lipschitz condition $\varphi(\varrho) = K\varrho$, then
$$\int_a^{a+\beta} \frac{d\varrho}{K\varrho} = \frac{1}{K}\, \log\Big(1 + \frac{\beta}{a}\Big),$$
and
$$\beta(a) = \beta(M\delta) = a\,\big(e^{3Kr_0} - 1\big) = \big(e^{3Kr_0} - 1\big)\, M\delta.$$
For $\varphi(\varrho) = K\varrho \log(1/\varrho)$ one finds $\beta(M\delta) \sim (M\delta)^\mu$ with $\mu = e^{-3Kr_0}$. The slower the Osgood integral diverges, the slower the convergence of the polygonal method.
Finally, observe that the above existence proof also includes the Osgood uniqueness theorem. In fact, by the above derivation, inequality (1.10) also holds if $y(x)$ is understood to be the difference between a given solution and an approximating function $y_D(x)$ constructed according to the polygonal method. If $\delta$ is the length of the greatest subsegment in the partition $D$, then
$$|y(x) - y_D(x)| \le \beta(M \delta).$$
If the Osgood integral (1.4) now diverges, $|y(x) - y_D(x)|$ vanishes as $\delta \to 0$, from which the uniqueness of the solution follows.

1.10. Exercises. 1. In the differential equation
$$dy = f(x, y)\, dx \qquad (x \in R_x^1,\ y \in R_y^n)$$
let $f(x, y)$ be a linear operator that is continuous for $|x - x_0| \le r_x$, $|y - y_0| \le r_y$ ($|f(x, y)| \le M$) and satisfies a Lipschitz condition
$$|f(x, y_1) - f(x, y_2)| \le K\,|y_1 - y_2| \qquad (K = \mathrm{const.}).$$
Following Picard's method of successive approximations this normal equation is solved with the initial condition $y(x_0) = y_0$ by taking as the first approximation to the solution $y = y(x)$ the constant $y_0(x) \equiv y_0$ and determining the sequence of approximations $y_i(x)$ ($i = 1, 2, \ldots$) by means of the recursion formula
$$y_i(x) = y_0 + \int_{x_0}^{x} f(t, y_{i-1}(t))\, dt.$$
Show that this sequence converges uniformly for $|x - x_0| \le r_0$ and that the limit function $y(x) = \lim y_i(x)$ solves the differential equation, where $y(x_0) = y_0$ and $|y(x) - y_0| \le r_y$.

Hint. By means of induction, show for each $i$, first, that
$$|y_i(x) - y_0| \le M\,|x - x_0| \le M r_0 \le r_y,$$
and further that
$$|y_i(x) - y_{i-1}(x)| \le \frac{M K^{i-1}\, |x - x_0|^i}{i!}.$$
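A sketch of Picard's recursion (illustrative; the grid-based quadrature is my own choice): for $y' = y$, $y(0) = 1$ the iterates $y_i$ are precisely the partial sums of the exponential series, so they converge to $e^x$:

```python
import numpy as np

def picard(f, x0, y0, x, iters, n=2000):
    # y_i(x) = y0 + integral_{x0}^{x} f(t, y_{i-1}(t)) dt, trapezoidal rule on a grid
    ts = np.linspace(x0, x, n)
    ys = np.full(n, float(y0))              # y_0(t) == y0 on the whole grid
    for _ in range(iters):
        g = f(ts, ys)
        incs = (g[1:] + g[:-1]) / 2.0 * np.diff(ts)
        ys = y0 + np.concatenate(([0.0], np.cumsum(incs)))
    return ys[-1]

approx = picard(lambda t, y: y, 0.0, 1.0, 1.0, iters=20)   # approximately e
```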
2. For the linear differential equation
$$dy = A(x)\, dx\, y \qquad (x \in R_x^1,\ y \in R_y^n),$$
where $A(x)$ is continuous for $|x - x_0| \le r_0$, the Picard method yields the solution
$$y(x) = \sum_{i=0}^{\infty} A_i(x)\, y_0,$$
where the operator $A_i(x)$ is calculated recursively from
$$A_0(x) = I, \qquad A_i(x) = \int_{x_0}^{x} A(t)\, dt\, A_{i-1}(t) \qquad (i = 1, 2, \ldots).$$

Hint. The series $\sum_{i=0}^{\infty} A_i(x)$ has the exponential series
$$\sum_{i=0}^{\infty} \frac{(M_A\, |x - x_0|)^i}{i!},$$
where $M_A = \max |A(x)|$ for $|x - x_0| \le r_0$, as a majorant.

3. The method given above for the existence and the uniqueness of the integral of the equation $dy = f(x, y)\, dx$ gives also a bound for the difference $|y_1(x) - y_2(x)|$ of two solutions $y_1(x)$ and $y_2(x)$, corresponding to two different initial values $y_1(x_0) = y_1$, $y_2(x_0) = y_2$.
§ 2. The General Differential Equation of First Order

2.1. Uniqueness of the solution. In the following we treat the differential equation
$$dy = f(x, y)\, dx, \qquad (2.1)$$
investigated above for $m = 1$, in the general case where the dimension $m$ of the space $R_x^m$ of the independent variable is arbitrary, $m > 1$. We make the following hypothesis concerning the linear operator $f(x, y)$:

1. The operator $f(x, y)$ is continuously differentiable for $|x - x_0| \le r_x$, $|y - y_0| \le r_y$. That means: for each pair of points $x, y$ in the balls mentioned
$$f(x + h, y + k) = f(x, y) + f_x(x, y)\, h + f_y(x, y)\, k + (\delta)$$
holds with the continuous operators $f_x$ and $f_y$, where $\delta^2 = |h|^2 + |k|^2$ and the norm of the operator $(\delta)$ converges to zero for $\delta \to 0$.

Then if $K$ stands for the least upper bound of the norm of the continuous operator $f_y(x, y)$ in the above closed balls, it follows from the mean value theorem that $f(x, y)$ satisfies the Lipschitz condition
$$|f(x, y_1) - f(x, y_2)| \le K\,|y_1 - y_2| \qquad (2.2)$$
for $|x - x_0| \le r_x$, $|y_i - y_0| \le r_y$ ($i = 1, 2$). In what follows we are concerned with the solvability of equation (2.1) under the additional conditions
$$y(x_0) = y_0, \qquad |y(x) - y_0| \le r_y \quad \text{for} \quad |x - x_0| \le r_0 = \min[r_x, r_y/M],$$
where $M = \max |f(x, y)|$ for $|x - x_0| \le r_x$, $|y - y_0| \le r_y$.

Assuming that such a solution $y(x)$ exists, then on the connecting segment $t = x_0 + \tau(x - x_0)$ ($0 \le \tau \le 1$), $y(t)$ is a solution of the normal equation $dy = f(t, y)\, dt$. In view of (2.2) it follows from exercise 1 in 1.10 that this normal equation has one and, as a consequence of Osgood's uniqueness theorem, only one solution $y(t)$ such that $y(x_0) = y_0$ and $|y(t) - y_0| \le r_y$ for $|t - x_0| \le |x - x_0| \le r_0$. For $\tau = 1$ this solution has the well-determined value $y(x)$, and one concludes that this radial integral, constructed for $|x - x_0| \le r_0$, is the only solution of (2.1) with the initial value $y(x_0) = y_0$ and the supplementary property $|y(x) - y_0| \le r_y$, if such a solution exists at all.
2.2. Integrability condition. Now the question arises whether the above constructed radial integral $y(x)$ actually solves the differential equation (2.1). For dimensions $m > 1$ of $R_x^m$ this is not the case unless besides 1 an additional integrability condition is satisfied. For if $y(x)$ is to solve equation (2.1), the operator $y'(x)$ exists, and for an arbitrary differential $dx = h \in R_x^m$ we have
$$y'(x)\, h = f(x, y(x))\, h.$$
In view of hypothesis 1 it follows from this that $y(x)$ is even twice continuously differentiable, so that for an arbitrary second differential $k \in R_x^m$
$$y''(x)\, k\, h = f_x(x, y(x))\, k\, h + f_y(x, y(x))\,(y'(x)\, k)\, h = f_x(x, y(x))\, k\, h + f_y(x, y(x))\,(f(x, y(x))\, k)\, h.$$
But then, because of the symmetry of the continuous operator $y''(x)$, we have
$$\mathop{\mathrm{A}}_{h\,k} \big(f_x(x, y(x))\, h\, k + f_y(x, y(x))\,(f(x, y(x))\, h)\, k\big) = 0.$$
We see: in order for the radial integral $y(x)$ constructed in 2.1 to satisfy the differential equation (2.1) it is necessary that the operator $f(x, y)$ fulfill, besides 1, the following integrability condition:

2. For $y = y(x)$ the bilinear function, alternating in $h$ and $k$,
$$R(x, y)\, h\, k = \mathop{\mathrm{A}}_{h\,k}\big(f_x(x, y)\, h\, k + f_y(x, y)\,(f(x, y)\, h)\, k\big) = 0.$$
continuity assumption I is not only necessary but also sufficient for the radial integral y(x) to satisfy the differential equation (2.1 ). As already mentioned, because of the Lipschitz condition (2.2) we can construct the function y(x), applying Picard's method of successive approximations (exercise -1 of 1.10). We thus join each point x of the ball Ix - x01 <, ro with r0 by the segment I =: x0 +T (x - x0) (0 S T 1) and for each n ? 0 set, starting with yo(x) = yo, v;,. ,(t) dt = /(t, y,,(t)) dl, ,(.r0) = yo, so
that y, ,(x) = yo ± f /(t, y,,(t)) dt = yo + f r-o
r-0
dt
,
(2.1)
whereby dt = dr (x - x0). According to the exercise mentioned y (x) converges as n -+ oo to the function y(x) and in fact uni/or my in the ball Ix - xol r0; consequently, ;ta(x) is continuous in this ball. It further follows from y (x0) = v0 and Iv (x) - yol s r,. that these last relationships are also valid for the limit function y(x).
It follows from hypotheses I by means of induction that the approximating functions v (x) are continuously differentiable fot Ix - x0! - r0. For a differential dx = d7. (x -- x0), which depends linearly on x - x0, we have by construction
!' : 1(x) dx = /(x, y (x)) dx = F5(x) dx ,
(2 ..1)
and for an arbitrary differential $dx$, because $t'(x)\, dx = \tau\, dx$,
$$y_{n+1}'(x)\, dx = \int_{\tau=0}^{1} \big(\tau\, F_n'(t)\, dx\, dt + F_n(t)\, dx\, d\tau\big).$$
This we write, after addition and subtraction of $\tau\, F_n'(t)\, dt\, dx$,
$$y_{n+1}'(x)\, dx = \int_{\tau=0}^{1} \big(\tau\, F_n'(t)\, dt\, dx + F_n(t)\, d\tau\, dx\big) + r_n(x)\, dx,$$
where
$$r_n(x)\, dx = 2 \int_{\tau=0}^{1} \tau \mathop{\mathrm{A}} F_n'(t)\, dx\, dt. \qquad (2.5)$$
In the expression for $y_{n+1}'(x)\, dx$ the integrand in the first term is the derivative of $\tau\, F_n(t)\, dx$ with respect to $\tau$. Consequently,
$$y_{n+1}'(x)\, dx = F_n(x)\, dx + r_n(x)\, dx = f(x, y_n(x))\, dx + r_n(x)\, dx, \qquad (2.6)$$
which implies (2.4), because according to (2.5), $r_n(x)\, dx = 0$ for $dx = d\lambda\,(x - x_0)$, $dt = d\tau\,(x - x_0)$.

In the following section we show that as a consequence of integrability condition 2
$$\lim_{n \to \infty} r_n(x) = 0 \qquad (2.7)$$
uniformly for $|x - x_0| \le r_0$. Since $y_n(x)$ also converges uniformly in this ball to the limit function, which is continuous there, it follows from (2.6) and (2.7), because of the continuity of the operator $f(x, y)$, that $y_{n+1}'(x)$ converges for $n \to \infty$ uniformly in the ball $|x - x_0| \le r_0$ to the continuous limit operator $f(x, y(x))$. But then according to exercise 7 in II.1.13, the operator $y'(x)$ exists. Thus for each $dx \in R_x^m$
$$y'(x)\, dx = f(x, y(x))\, dx,$$
which, up to the proof of (2.7), brings us to our goal.
2.4. Proof of $r_n(x) \to 0$. We must still settle the key point, the uniformity of the convergence (2.7) for $|x - x_0| \le r_0$. Completely written, formula (2.5) for $r_n(x)$ is
$$r_n(x)\, dx = 2 \int_{\tau=0}^{1} \tau \mathop{\mathrm{A}} \big\{ f_t(t, y_n(t))\, dx\, dt + f_y(t, y_n(t))\,(y_n'(t)\, dx)\, dt \big\}.$$
For $n > 0$, by (2.6), $y_n'(t)\, dx = f(t, y_{n-1}(t))\, dx + r_{n-1}(t)\, dx$, and by (2.4), for $dt = d\tau\,(x - x_0)$, $y_n'(t)\, dt = f(t, y_{n-1}(t))\, dt$. We introduce this into the above formula and set for arbitrary differentials $h, k$
$$\mathop{\mathrm{A}} \big\{ f_t(t, y_n(t))\, h\, k + f_y(t, y_n(t))\,(f(t, y_n(t))\, h)\, k \big\} = R_n(t)\, h\, k. \qquad (2.8)$$
Then for $n > 0$
$$r_n(x)\, dx = 2 \int_{\tau=0}^{1} \tau\, R_n(t)\, dx\, dt + 2 \int_{\tau=0}^{1} \tau \mathop{\mathrm{A}} f_y(t, y_n(t))\,(r_{n-1}(t)\, dx)\, dt, \qquad (2.9)$$
while for $n = 0$
$$r_0(x)\, dx = 2 \int_{\tau=0}^{1} \tau\, R_0(t)\, dx\, dt, \qquad (2.9')$$
with
$$R_0(t)\, h\, k = \mathop{\mathrm{A}} f_t(t, y_0)\, h\, k. \qquad (2.8')$$
From the uniform convergence $y_n(t) \to y(t)$ for $|t - x_0| \le r_0$ and the continuity hypothesis 1 it follows, according to (2.8), that as $n \to \infty$ the norm $|R_n(t)|$ tends uniformly in the ball mentioned to the norm $|R(t, y(t))|$. Since the latter according to integrability condition 2 vanishes identically, the maximum $R_n^*$ of $|R_n(t)|$ in the ball $|t - x_0| \le r_0$ also converges to zero:
$$\lim_{n \to \infty} R_n^* = 0. \qquad (2.10)$$
From here it furthermore follows that $R_n^*$ is bounded:
$$R_n^* \le R. \qquad (2.11)$$
By (2.9'),
$$|r_0(x)\, dx| \le R_0^*\, |x - x_0|\, |dx| \int_0^1 2\tau\, d\tau = R_0^*\, |x - x_0|\, |dx|.$$
Therefore $|r_0(x)| \le R_0^*\, |x - x_0|$, which, with regard for what follows, we write
$$\tfrac{1}{2}\,|r_0(x)| \le \frac{R_0^*}{2}\, |x - x_0|.$$
If one takes into account that $|f_y(t, y_n(t))| \le K$ for each $n$ and that according to the above $\tfrac{1}{2}\,|r_0(t)| \le (R_0^*/2)\,|t - x_0| = (R_0^*/2)\,\tau\,|x - x_0|$, then
$$\tfrac{1}{2}\,|r_1(x)| \le R_1^*\, |x - x_0| + \frac{R_0^*\, K}{2}\, |x - x_0|^2$$
follows from (2.9) for $n = 1$. Hence for $n = 0$ and $n = 1$
$$\tfrac{1}{2}\,|r_n(x)| \le \sum_{i=0}^{n} \frac{R_{n-i}^*\, K^i}{(i+1)!}\, |x - x_0|^{i+1}. \qquad (2.12)$$
If this is true for an arbitrary $n$, then it follows from (2.9) that
$$\tfrac{1}{2}\,|r_{n+1}(x)\, dx| \le \frac{R_{n+1}^*}{2}\, |x - x_0|\, |dx| \int_0^1 2\tau\, d\tau + \sum_{i=0}^{n} \frac{R_{n-i}^*\, K^{i+1}}{(i+1)!}\, |x - x_0|^{i+2}\, |dx| \int_0^1 \tau^{i+1}\, d\tau,$$
and therefore
$$\tfrac{1}{2}\,|r_{n+1}(x)| \le \sum_{i=0}^{n+1} \frac{R_{n+1-i}^*\, K^i}{(i+1)!}\, |x - x_0|^{i+1}.$$
Hence, the estimate (2.12) holds for every $n \ge 0$.

Now let $\varepsilon > 0$ be given. As a consequence of (2.10) there exists an $n_0$ such that $R_{n-i}^* < \varepsilon$ for $n - i > n_0$, while according to (2.11) $R_{n-i}^* \le R$ for $n - n_0 \le i \le n$. Accordingly we decompose the sum (2.12) into two parts: $\Sigma_1$ from $i = 0$ to $n - n_0 - 1$, and $\Sigma_2$ for $n - n_0 \le i \le n$. We then have
$$\Sigma_1 \le \varepsilon\, |x - x_0| \sum_{i=0}^{\infty} \frac{(K\, |x - x_0|)^i}{(i+1)!},$$
and consequently for each $x$ in the ball $|x - x_0| \le r_0$
$$\Sigma_1 < \varepsilon\, r_0\, e^{K r_0}.$$
For $\Sigma_2$ one obtains by (2.11)
$$\Sigma_2 \le R\, |x - x_0| \sum_{i=n-n_0}^{n} \frac{(K\, |x - x_0|)^i}{(i+1)!},$$
and for $|x - x_0| \le r_0$
$$\Sigma_2 \le \frac{R\, (K r_0)^{n-n_0}}{(n - n_0 + 1)!}\; r_0\, e^{K r_0}.$$
In summary, in the ball $|x - x_0| \le r_0$
$$\tfrac{1}{2}\,|r_n(x)| \le \Sigma_1 + \Sigma_2 < \varepsilon\, r_0\, e^{K r_0} + \frac{R\, (K r_0)^{n-n_0}}{(n - n_0 + 1)!}\; r_0\, e^{K r_0},$$
and
$$\limsup_{n \to \infty} |r_n(x)| \le 2\, \varepsilon\, r_0\, e^{K r_0}.$$
Since this holds in the ball $|x - x_0| \le r_0$ for each $\varepsilon > 0$ no matter how small, the uniformity of (2.7) is established.
§ 3. The Linear Differential Equation of Order One

We are going to investigate more carefully the special case of a linear homogeneous equation
$$dy = A(x)\, dx\, y, \qquad (3.1)$$
where $A(x)$ is a bilinear operator that maps the space $R_x^m \times R_y^n$ into $R_y^n$. The functions denoted in § 1 by $f^j$ are now linear homogeneous functions of the coordinates $\eta^1, \ldots, \eta^n$ of $y$, and equation (3.1) becomes
$$\frac{\partial \eta^j}{\partial \xi^i} = \sum_{k=1}^{n} a_{ik}^j\, \eta^k \qquad (i = 1, \ldots, m;\ j = 1, \ldots, n).$$
3.1. Uniqueness of the solution. In the present case hypothesis 1 of 2.1 reduces to:

1. The operator $A(x)$ is continuously differentiable in a ball $|x - x_0| \le r$ of the space $R_x^m$.

Instead of the ball one could also take an arbitrary finite compact region $G_x^m$, starlike with respect to $x_0$.

The validity of the Lipschitz condition follows from the continuity of $A(x)$, the constant $K$ being the maximum of the norm $|A(x)|$ for $x \in G_x^m$. The radial integral $y(x)$ can thus be constructed with an arbitrary initial value $y(x_0) = y_0 \in R_y^n$ by means of the Picard method. According to exercise 2 of 1.10 we have
$$y_\nu(x) = \sum_{i=0}^{\nu} A_i(x)\, y_0,$$
where, setting $t = x_0 + \tau\,(x - x_0)$,
$$A_0(x) = I \qquad \text{and} \qquad A_{i+1}(x) = \int_{x_0}^{x} A(t)\, dt\, A_i(t).$$
As $\nu \to \infty$, $y_\nu(x)$ converges in $G_x^m$ uniformly to the radial integral $y(x)$, which by 2.1 is the only possible solution of equation (3.1) with the initial value $y(x_0) = y_0$.
Because (3.1) has the trivial solution $y(x) \equiv 0$, it follows from the uniqueness theorem that a nontrivial solution $y(x)$ nowhere vanishes. We now consider the set of all solutions of (3.1). From the linearity of this equation it follows that any linear combination of particular solutions $y_1(x), \ldots, y_l(x)$ is likewise a solution. If the particular solutions are linearly dependent for one value $x = x_0$, so that
$$y(x_0) = \sum_{i=1}^{l} \lambda^i\, y_i(x_0) = 0 \qquad (\lambda^i \ne 0\ \text{for some}\ i = 1, \ldots, l),$$
then $y(x) \equiv 0$, and the solutions $y_i(x)$ are linearly dependent for all $x \in G_x^m$. This implies that the values $y = y(x)$ of the solution space form a subspace $L_y^q(x)$ of $R_y^n$ whose dimension $q$ ($0 \le q \le n$) is independent of $x$.
3.2. Integrability condition. The necessary and sufficient condition 2 found in § 2 for a radial integral y(x) (y(x₀) = y₀) to satisfy the differential equation is in the present linear homogeneous case

∧_{hk} (A'(x) h k + A(x) k A(x) h) y(x) = 0 ,

which also can be written

R(x) h k y(x) ≡ ∧_{hk} (A' h k − A h A k) y(x) = 0.   (3.2)

This condition must hold for each x ∈ G_x^m and arbitrary differentials dx = h, k.
The expression R(x) h k y is a trilinear form in h, k and y; it is alternating in the first two. For a given pair h, k,

R(x) h k = ∧_{hk} (A'(x) h k − A(x) h A(x) k)

is a linear transformation of the space R_y^n into itself. For fixed values of x, h, k, the solution space of the equation

R(x) h k y = 0   (3.3)

is a linear subspace of R_y^n, namely the kernel K_y(x; h, k) of the linear transformation R(x) h k. For fixed x let the differentials h and k run through the space R_x^m. The intersection of all the kernels K_y(x; h, k) thus obtained is then a subspace K_y^p(x) of R_y^n, of dimension p (0 ≤ p ≤ n). The set of all of the points in this subspace yields the complete solution of equation (3.3) for a given x ∈ R_x^m and for freely varying differentials h, k. On the other hand we have seen in 3.1 that for a given x ∈ G_x^m all the values y = y(x) of the solution set of equation (3.1) form a linear subspace L_y^q(x). From integrability condition (3.2), which holds for x ∈ G_x^m and for arbitrary differentials dx = h, k, it follows that y(x) belongs to the subspace K_y^p(x). Thus for each x ∈ G_x^m, the inclusions

L_y^q(x) ⊂ K_y^p(x) ⊂ R_y^n   (3.4)

hold.
3.3. Complete integrability. We say the differential equation (3.1) is completely integrable if it has a solution with the initial value y(x₀) = y₀ for any y₀ ∈ R_y^n. This is the case precisely when the radial integral represents a solution of the differential equation (3.1) however the initial value y₀ is chosen. It follows from the definition of the subspace L_y^q(x) that for complete integrability the dimension q must be equal to n for each x. Then, by the inclusions (3.4), for x ∈ G_x^m the dimension p = n, and thus L_y^q(x) = K_y^p(x) = R_y^n. Therefore, in order for (3.1) to be completely integrable, it is necessary that p = n and that

R(x) = 0 for x ∈ G_x^m.

This condition is also sufficient. For if the equation holds for h, k ∈ R_x^m and y ∈ R_y^n, integrability condition (3.2) is satisfied.
Later (in 3.12) we shall see that this result is valid under less restrictive hypotheses relative to the operator A(x) than postulated in 3.1. For example, it suffices to assume the simple (not necessarily continuous) differentiability of A.
3.4. Differential equations with constant coefficients. We now consider the case where the operator A(x) = A is constant, i.e., independent of x. The integrability condition for the equation dy = A dx y is now

R h k y = (A k A h y − A h A k y) = 0 ,

or more briefly,

A h A k = A k A h.   (3.5)
The operator A x defines for each x ∈ R_x^m a linear self-mapping of the space R_y^n. The integrability condition thus states that the product of two such transformations A h and A k is commutative.
In order to integrate the differential equation dy = A dx y under this assumption, we fix two points x₀ and x in the space R_x^m and carry out the integration along the segment x₀ x, assigning an arbitrary initial value y₀ ∈ R_y^n to the value x = x₀. Using the method of successive approximations (cf. 1.10, exercises 1-2) one arrives at the point x with the value

y(x) = Σ_{j=0}^{∞} A_j(x) y₀ ,

where A₀(x) ≡ I is the identity transformation and

A_j(x) = ∫_{x₀}^{x} A dt A_{j−1}(t)   (j = 1, 2, ...) .

These integrals can be computed successively for j = 1, 2, ... in an elementary way. One finds

A_j(x) = (1/j!) A(x − x₀) ⋯ A(x − x₀) = (1/j!) (A(x − x₀))^j ;

(A x)^j is the jth power of the operator A x, and thus

y(x) = f(x − x₀) y₀ ,

where f(x) stands for the linear self-mapping

f(x) = Σ_{j=0}^{∞} (1/j!) (A x)^j   ((A x)⁰ = I)   (3.6)

of the space R_y^n.
It follows from the general theory of linear homogeneous differential equations developed above that the function y(x) = f(x − x₀) y₀, which is defined in the whole space R_x^m, is actually the solution; it is uniquely determined by the initial value y(x₀) = y₀. In the simple case at hand this can also be seen directly (cf. II.1.13, exercise 8). The operator series f(x − x₀) and the series obtained from it by termwise differentiation are uniformly convergent for |x − x₀| ≤ r < ∞. In view of the commutativity of the operators A x,

df(x − x₀) = A dx f(x − x₀) ,   dy(x) = d(f(x − x₀) y₀) = A dx y(x) .
The operator f(x) can be differentiated infinitely often, and for dx = x one has d^j f(0) = (A dx)^j = (A x)^j, and the differential quotient d^j f(0)/dx^j is defined by means of the equation (d^j f(0)/dx^j) x^j = (A x)^j. The expansion (3.6) is the Maclaurin expansion of the operator f(x).

3.5. Functional equation. Let a, b, c be three points of the space R_x^m. If y₁, y₂, y₃ are the three corresponding values of a solution y = y(x) of the differential equation dy = A dx y (A constant), then one has
y₂ = f(b − a) y₁ ,   y₃ = f(c − b) y₂ = f(c − a) y₁ ;

hence y₃ = f(c − b) f(b − a) y₁ = f(c − a) y₁. Since this relation holds for each choice of the initial value y₁,

f(c − a) = f(c − b) f(b − a) ,

or, if we set x₁ = b − a, x₂ = c − b,

f(x₁ + x₂) = f(x₂) f(x₁) = f(x₁) f(x₂) .

Since f(0) = I, we have the special result

f(x) f(−x) = I .

The linear self-transformations f(x) are therefore regular, and we have
(f(x))⁻¹ = f(−x) .

The automorphisms f(x) (x ∈ R_x^m) consequently form an abelian topological group³.

3.6. The case m = n. It is of interest to ask under which conditions the solution y = y(x) of the differential equation dy = A dx y (A constant) provides a local, i.e., in the neighborhood of each point x, one-to-one mapping x → y. It follows from the theory of implicit functions (cf. II.4.2) that this is the case precisely when the derivative y'(x) is regular and provides a one-to-one mapping between the spaces R_x^m and R_y^n. This implies that the dimensions m and n are equal. A further analysis of the problem shows that, in the case m = n, y'(x) can be regular only in the two lowest dimension cases m = n = 1 and
m = n = 2. In the former case one is essentially dealing with the elementary real exponential function. In the second case, y(x) is simply periodic, and the mapping x → y is then isomorphic to the one provided by the complex exponential function; by an appropriate choice of the coordinate systems in the x- and y-planes one can bring the equation y = y(x) into the form w = e^{az}, where a is real and z and w are complex variables. The proofs of these assertions can be carried out based on exercise 5 in 3.14.

³ In this regard, see the article of G. Pólya [1].
3.7. Generalizations of the existence theorem. We turn back to the general linear differential equation

dy = A(x) dx y .

It was shown in 3.1-3.3, assuming the continuous differentiability of the bilinear operator A(x), that in order for the differential equation to have a solution that assumes an arbitrarily preassigned value y₀ ∈ R_y^n at an arbitrary point x₀ ∈ G_x^m it is sufficient for the operator

R(x) h k ≡ ∧_{hk} (A'(x) h k − A(x) h A(x) k)

to vanish identically. In what follows this condition for complete integrability of the differential equation will be derived in a new way and at the same time sharpened¹.
3.8. The transformations T and U. Our method is based essentially on certain general structural properties of the integrals of a linear normal system. Let us first summarize these rules. We restrict ourselves in what follows to a convex region G_x^m, where for the present we only require the continuity of the operator function A(x). The differential equation can then be integrated as a normal system along any oriented piecewise regular curve l. The initial value y₁ ∈ R_y^n at the initial point x = x₁ of l can be prescribed arbitrarily, and one arrives at the end point x = x₂ of l with a well-determined value y₂ ∈ R_y^n. From the linear structure of the differential equation it follows, in view of the uniqueness theorem, that the final value y₂ for a given l depends linearly on the initial value y₁; for if the final value of the integral corresponding to the initial value ỹ₁ is ỹ₂, then λ y(x) + μ ỹ(x) (λ and μ real) is the normal integral on l uniquely determined by the initial value λ y₁ + μ ỹ₁, and consequently the final value corresponding to this initial value is λ y(x₂) + μ ỹ(x₂) = λ y₂ + μ ỹ₂.

Hence there is associated with every oriented piecewise regular curve l in the region G_x^m a well-determined linear transformation T_l of the space R_y^n: the integral y(x) of the normal system dy = A(x) dx y along l, uniquely determined by the initial value y(x₁) = y₁, has at the end point x₂ of l the final value

y₂ = y₁ + ∫_l dy(x) = T_l y₁ .

¹ R. Nevanlinna [6, 7].
If I stands for the identity transformation of the space R_y^n, then the increase of the integral y(x) along l,

∫_l dy(x) = ∫_l A dx y(x) = (T_l − I) y₁ = U_l y₁ ,
is likewise a linear transformation of the initial value y₁. From the uniqueness theorem it follows that the transformations T_l have the following properties:

1. If l₁ = x₁ x₂, l₂ = x₂ x₃ are two paths and l = l₂ l₁ = x₁ x₂ x₃ is the "product path" of l₁ and l₂, then T_{l₂ l₁} = T_{l₂} T_{l₁}.

2. If l⁻¹ = x₂ x₁ stands for the path reciprocal to l = x₁ x₂, which results from l by reorientation, then

T_l T_{l⁻¹} = T_{l⁻¹} T_l = I .

From the second rule it follows in particular that the transformations T_l are regular: the inverse transformation T_l⁻¹ = T_{l⁻¹} exists. Besides these group theoretical properties, the transformations T_l in addition satisfy the following metric condition, which has the character of a Lipschitz inequality:

3. If the paths l are restricted to those in a compact subregion Ḡ of G_x^m
with arc lengths |l| ≤ ρ₀ < ∞, then for any metrics on the spaces R_x^m and R_y^n there exists a constant M = M(Ḡ, ρ₀) > 0 such that the norm of the transformation U_l satisfies the inequality

|U_l| = |T_l − I| ≤ M |l| .

This condition means that the inequality

|∫_l dy(x)| = |y₂ − y₁| ≤ M |l| |y₁|

holds for the increase in the integral y(x) constructed along l with the initial value y₁, for any path in Ḡ and every y₁ in R_y^n, provided |l| ≤ ρ₀ (cf. 3.14, exercises 3-4).

Let γ be a closed path, beginning and ending at the point a. The corresponding transformation U_γ = U_a then yields for each y₁ ∈ R_y^n the increase

U_a y₁ = ∫_γ dy(x)
in the integral y(x), integrated around the curve γ with the initial value y₁ at a. If instead of a one chooses another beginning and end point b, which is associated with the transformation U_γ = U_b, and if we denote the portion of the curve from a to b (oriented in the same way as γ) by l, then by rules 1 and 2 the relations

U_b = T_l U_a T_{l⁻¹} ,   U_a = T_{l⁻¹} U_b T_l   (3.7)
hold. In particular this implies: If for one a on γ, U_a represents the null transformation of the space R_y^n, then this is true for every choice of the initial point a. Therefore

∫_γ dy(x) = ∫_γ A(x) dx y(x) = 0 ,

however the initial value of the integral y(x) is chosen at an arbitrary point on the closed curve γ.

Besides (3.7) we shall use another transformation formula, which likewise is an immediate consequence of rules 1 and 2. On an oriented closed curve γ we take, in the order determined by the orientation, p points x₁, ..., x_p. We connect these points with a point x₀ outside of or on γ by means of curves l_i = x₀ x_i. The curves l_i, l_{i+1} (i modulo p) and the arc x_i x_{i+1} of γ then form a closed curve γ_i, which we describe in the sequence x₀ x_i x_{i+1} x₀ from x₀ to x₀. Then (setting T_{γ_i} = T_i, U_{γ_i} = U_i and T_γ = T)

T = T_p T_{p−1} ⋯ T₂ T₁ .
If T_i = I + U_i is introduced, then for U_γ = T − I one obtains the expansion

U_γ = Σ_{i=1}^{p} S_i ,   (3.8)

where

S_i = Σ U_{j_i} ⋯ U_{j_1} ,

taken over all combinations j_i > ⋯ > j_1. Because the products of the transformations U are in general not commutative, the factors in the individual terms of this sum must be taken in the order indicated.
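The expansion (3.8) can be verified directly for matrices: the ordered product of the factors I + U_i, minus the identity, equals the sum of all descending-ordered products of the U's. A small numerical sketch (random matrices; the sizes are arbitrary illustrative choices):

```python
import numpy as np
from itertools import combinations
from functools import reduce

rng = np.random.default_rng(0)
p, n = 4, 3
U = [rng.normal(scale=0.1, size=(n, n)) for _ in range(p)]   # U_1, ..., U_p

# left side: T_p T_{p-1} ... T_1 - I, applying T_1 first
prod = reduce(lambda acc, Ui: (np.eye(n) + Ui) @ acc, U, np.eye(n))
U_gamma = prod - np.eye(n)

# right side of (3.8): sum over all j_i > ... > j_1 of U_{j_i} ... U_{j_1}
S = np.zeros((n, n))
for r in range(1, p + 1):
    for combo in combinations(range(p), r):      # indices in ascending order
        term = reduce(lambda acc, j: U[j] @ acc, combo, np.eye(n))
        S += term                                # larger indices end up on the left
```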
3.9. The integrability condition U_γ = 0. After this preparation we prove

Lemma 1. If the operator A(x) is continuous in the convex region G_x^m, then for the complete integrability of the differential equation dy = A(x) dx y it is necessary and sufficient that the transformation U_γ = 0 for every closed, piecewise regular curve γ in G_x^m.

It follows from (3.7), as already remarked, that it makes no difference at which point of γ one starts along the curve.

The condition is obviously necessary. For if one constructs that normal integral of the differential equation that assumes a preassigned
value y₀ at a point x₀, then the increase of this integral for one revolution is

∫_γ dy(x) = U_γ y₀ .

On the other hand, for complete integrability of the differential equation this normal integral, as a consequence of the uniqueness theorem, must agree on γ with that solution which assumes the value y₀ at x₀; and since this solution is unique in G_x^m, the increase calculated above is in fact equal to zero for each y₀ ∈ R_y^n.

The condition is also sufficient. In order to see this, we integrate the differential equation as a normal system along the segment x₀ x, beginning with an arbitrary initial value y₀ at a point x₀ ∈ G_x^m. One thus obtains a function y(x) which is well-determined in G_x^m as the only possible solution of the differential equation with the property y(x₀) = y₀.
If the integrability condition in the lemma is assumed, this function in fact satisfies our differential equation. For the proof let x and x + h be two arbitrary points of the region G_x^m, and γ = x x₀ (x + h) x the boundary of the simplex determined by the points x₀, x and x + h, described in the direction indicated. On this boundary let ỹ(t) be the integral with the initial value ỹ(x) = y(x) for t = x. It follows from the uniqueness theorem that ỹ(t) = y(t) on the entire polygonal path x x₀ (x + h); consequently the corresponding increase is

Δy = y(x + h) − y(x) = ỹ(x + h) − ỹ(x) = ∫_{x x₀ (x+h)} dỹ(t) .

But according to the integrability condition U_γ y(x) = ∫_γ dỹ(t) = 0; therefore

Δy = ∫_{x x₀ (x+h)} dỹ(t) = ∫_{x (x+h)} dỹ(t) = ∫_{x (x+h)} A(t) dt ỹ(t) ,

so that, because of the continuity of A(t) for t = x,

Δy = A(x) h ỹ(x) + |h| (h; x) = A(x) h y(x) + |h| (h; x) ,

with |(h; x)| → 0 as |h| → 0. Hence for x ∈ G_x^m and h = dx

dy(x) = A(x) dx y(x) ,

which was to be proved.

Remark. As a result of the second part of the above proof, the integrability condition U_γ = 0 can be restricted to the boundaries of triangles in the region G_x^m; the condition then holds for every other closed, piecewise regular curve in the region.
3.10. Reduction of the integrability condition U_γ = 0. The above integrability condition relates to the behavior of the operator A(x) "in the large". The central point of our method of solution consists in being able to replace this condition, with the help of the "Goursat idea" already used in II.1.8 and III.2.7-2.9, by an equivalent local differential condition which refers to the behavior of A(x) at every point of the region G_x^m.

Let x₀ be a point of G_x^m and γ the boundary of a triangle s = s(x₁, x₂, x₃) that contains this point. We set x₂ − x₁ = h, x₃ − x₁ = k and denote by Δ = D h k the real fundamental form of the plane E of s, so that Δ represents the oriented (affine) area of the triangle. If the integrability condition of Lemma 1 holds, then U_γ/Δ ≡ 0, and the limit transformation

lim U_γ/Δ = 0

exists trivially when the triangle in the arbitrary but fixed plane E converges to x₀ so that the greatest side length δ → 0. This obvious statement can now be turned around, and one can even restrict oneself to regular convergence, for which δ²/Δ remains under a finite bound. That follows from
Lemma 2. Let the operator A(x) be continuous in G_x^m, such that for each point x₀ of the region and in each fixed plane E through x₀

U_γ/Δ → 0   (3.9)

when the triangle s = s(x₁, x₂, x₃) with boundary γ in E converges regularly to x₀ ∈ s (Δ = D h k, h = x₂ − x₁, k = x₃ − x₁).¹ Then the integrability condition of Lemma 1 is valid for every closed, piecewise regular curve in the region G_x^m.
Proof. According to the final remark in 3.9 it suffices to show that the integrability condition U_γ = 0 holds for every triangle in G_x^m.

[Fig. 5]

Hence, let s₀ = s₀(x₁, x₂, x₃) be such a triangle. In order to demonstrate the assertion U_{γ₀} = U₀ = 0 for the boundary γ₀ = x₁ x₂ x₃ x₁ we

¹ Such regular sequences exist (cf. construction of Whitney, 1.5.5). Cf. also T. Nieminen [1].
introduce a euclidean metric on the space R_x^m, in such a way that s₀ becomes a right isosceles triangle with the right angle at x₁. Denote by M the constant M(s₀, ρ₀) of postulate 3 in 3.8, whereby ρ₀ = 2 a₀ = 2 |x₃ − x₂| (cf. Fig. 5). We decompose s₀ by means of the perpendicular x₁' x₁ to the hypotenuse x₃ x₂ into two congruent triangles s₁ and s₁' with boundaries γ₁ = x₁' x₁ x₂ x₁' and γ₁' = x₃ x₁ x₁' x₃. If one sets U_{γ₁} = U₁, U_{γ₁'} = U₁', then by (3.8)

U₀ = T_l U T_{l⁻¹} ,   where   U = U₁ + U₁' + U₁' U₁ .

According to condition 3,

|T_{l⁻¹}|, |T_l| ≤ 1 + |U_l| ≤ e^{|U_l|} ≤ e^{M |l|} = e^{M a₀/2} .

Furthermore, provided |U₁| is the larger of the norms |U₁|, |U₁'|,

|U| ≤ |U₁| + |U₁'| + |U₁'| |U₁| ≤ 2 |U₁| (1 + |U₁|) ≤ 2 |U₁| e^{M ρ₀} ,

and therefore

|U₀| ≤ |T_l| |U| |T_{l⁻¹}| ≤ 2 |U₁| e^{2 M ρ₀} .

If Δ₀ and Δ₁ = Δ₀/2 are now the areas of the triangles s₀ and s₁, it follows that

|U₀| / Δ₀ ≤ (|U₁| / Δ₁) e^{2 M ρ₀} .

This procedure can be repeated ad infinitum. One obtains a sequence of triangles s₀ ⊃ s₁ ⊃ ⋯ ⊃ s_j ⊃ ⋯, and there exists a well-determined limit point x₀ ∈ s_j (j = 0, 1, ...). If δ_j = a₀/2^{j/2} stands for the length of the hypotenuse and Δ_j for the area of s_j, the above inequality, after j iterations, yields

|U₀| / Δ₀ ≤ C_j (|U_j| / Δ_j) ,

where the constants C_j remain bounded as j → ∞, since the side lengths δ_j decrease geometrically. The triangles s_j converge (regularly) to the limit point x₀ ∈ s₀, and since the quotient |U_j|/Δ_j converges to zero by (3.9), U₀ = U_{γ₀} = 0, which was to be proved.

3.11. Extension of the definition of the operator R(x). Assuming only the continuity of the operator A(x), Lemmas 1 and 2 have supplied the necessary and sufficient condition (3.9) for the complete integrability of our differential equation. This condition is to be related to the sufficient condition R(x) = 0 cited earlier in 3.7. This connection is apparent from the following
Lemma 3. Let the operator A(x) be continuous in a neighborhood of the point x₀ and differentiable at this point. Then if s = s(x₁, x₂, x₃) is a triangle (∂s = γ) in the neighborhood mentioned containing the point x₀ and δ is the greatest side length, then (x₂ − x₁ = h, x₃ − x₁ = k)

U_γ = R(x₀) h k + δ² (δ)
    = ∧_{hk} (A'(x₀) h k − A(x₀) h A(x₀) k) + δ² (δ) ,   (3.10)

where (δ) stands for a linear transformation of the space R_y^n whose norm converges to zero as δ → 0. The orientation of the boundary ∂s = γ is determined by the above ordering of the vertices, while the beginning point can be chosen arbitrarily. Therefore, if the differential equation dy = A(x) dx y, with the initial value y₀ at an arbitrary point, is integrated as a normal system around the boundary ∂s oriented in the way described, then for every y₀ ∈ R_y^n the increase in the integral y(x) is

U_γ y₀ = ∫_{∂s} dy(x) = ∫_{∂s} A(x) dx y(x) = R(x₀) h k y₀ + δ² (δ) y₀ .   (3.11)
In order not to interrupt the course of our existence proof, we shall prove Lemma 3 later (in 3.13). Formula (3.11) leads to a generalization of the definition of the trilinear operator R(x) analogous to the extension of the definition of the rotor given in III.2.6. In fact, the left hand side of this formula makes sense for every triangle s in the space R_x^m containing x₀ and each y₀ ∈ R_y^n, assuming only the continuity of the operator A(x). Consequently R(x₀) h k y₀ can be uniquely defined by requiring trilinearity and the validity of relation (3.11) for each triangle s. This definition agrees with the earlier one

R(x₀) h k y₀ = ∧_{hk} (A'(x₀) h k − A(x₀) h A(x₀) k) y₀

if the operator A(x) is continuous in a neighborhood of the point x₀ and in addition differentiable at x₀. On the other hand, the extended definition supposes only that the differential equation dy = A(x) dx y can be integrated normally along piecewise regular curves. It can therefore, in certain cases, make sense assuming only the continuity of A(x).

3.12. Existence theorem. Summarizing the results of Lemmas 1, 2 and 3 we obtain the following existence theorem, in which the operator R(x) is to be taken in the sense of the extended definition given in 3.11.
Theorem. Let the bilinear operator A(x) be continuous in the convex region G_x^m of the space R_x^m.
In order that the differential equation dy = A(x) dx y may possess, for every x₀ in G_x^m and an arbitrary y₀ in R_y^n, a solution which for x = x₀ assumes the value y₀, it is necessary and sufficient that the operator R(x) exist in G_x^m and vanish identically, so that for arbitrary vectors h and k in the space R_x^m and every y in R_y^n

R(x) h k y = 0 .

If in particular A(x) is differentiable at every point of the region G_x^m, then

R(x) h k y = ∧_{hk} (A'(x) h k − A(x) h A(x) k) y ,

and the identical vanishing of this expression is necessary and sufficient for the complete integrability of the differential equation.

The second part of this theorem gives in a sharpened formulation the existence theorem mentioned in 3.7 and already proved by another method in 3.1-3.3, where continuous differentiability of the operator was assumed. Our second method of proof, based on the Goursat idea, shows that requiring the continuity of the derivative A'(x) is superfluous. As is known, this sharpening is also typical in other connections, for example in the proof of the Cauchy integral theorem in function theory, using the Goursat idea.

3.13. Proof of Lemma 3. Let A(x) be continuous on G: |x − x₀| ≤ r and suppose |A| is the least upper bound of the norm |A(x)| on G. Since
(for a euclidean metric) any triangle s ⊂ G has a boundary length ≤ 6 r =: ρ₀, by exercise 4 of 3.14 the Lipschitz condition |U_l| ≤ M |l| holds, with M = |A| e^{|A| ρ₀}, for any subsegment l of the boundary ∂s. If δ stands for the length of the longest side of the triangle s = s(x₁, x₂, x₃), then because |l| ≤ 3 δ,

U_l = (δ) ,   (3.12)

where (δ) tends to zero with δ. Now integrate dy(t) = A(t) dt y(t) along the oriented boundary γ = ∂s, beginning with an arbitrary boundary point x = a and an initial value y₀. For a boundary point x and the corresponding subpath ax = l,

y(x) = y₀ + ∫_l A(t) dt y(t) = y₀ + U_l y₀ ,

which because of (3.12) implies that

y(x) = y₀ + (δ) y₀ .   (3.12')

In view of the continuity of the operator A(x) at the point x₀ this implies the more precise relation

y(x) = y₀ + A(x₀) (x − a) y₀ + δ (δ) y₀ .   (3.12")
By hypothesis A(x) is even differentiable at the point x = x₀. Thus at each boundary point

A(x) = A(x₀) + A'(x₀) (x − x₀) + |x − x₀| (x − x₀; x₀) ,

where the norm |(x − x₀; x₀)| tends to zero uniformly on ∂s when δ → 0. Therefore

U_{γₐ} y₀ = ∫_{γₐ} A(x) dx y(x) = ∫_{γₐ} A(x₀) dx y(x) + ∫_{γₐ} A'(x₀) (x − x₀) dx y(x) + δ² (δ) y₀ ,

where γₐ denotes the closed path of integration γ having a as initial and final point. If y(x) is replaced in the first integral by the expression (3.12") and in the second by (3.12'), then

U_{γₐ} y₀ = ∫_{γₐ} {A(x₀) dx + A(x₀) dx A(x₀) (x − a) + A'(x₀) (x − x₀) dx} y₀ + δ² (δ) y₀ .

The first term vanishes, and a simple direct evaluation yields

{− ∧_{hk} A(x₀) h A(x₀) k + ∧_{hk} A'(x₀) h k} y₀

for the remaining ones. Thus, independently of the choice of the initial point a and of the vector y₀,

U_{γₐ} y₀ = R(x₀) h k y₀ + δ² (δ) y₀ ,

which proves Lemma 3.

Remark. The method developed in 3.8-3.13 and based on Goursat's idea of successive subdivisions of a simplex can be generalized to the nonlinear case (cf. G. Bächli [1] and the dissertation of S. Heikkilä [1]).
3.14. Exercises. 1. If the differential equation

dy = A(x) dx y   (x ∈ R_x^m, y ∈ R_y^n),

where the bilinear operator A(x) is continuously differentiable in the region G_x^m (⊂ R_x^m), which is starlike with respect to x₀, is integrated along a straight line from x₀ to x with an initial value y₀ = y(x₀), then (cf. 1.10, exercise 2) the function thus obtained

y(x) = Σ_{i=0}^{∞} A_i(x) y₀

is continuously differentiable in G_x^m, and

dy(x) = y'(x) dx = Σ_{i=0}^{∞} A_i'(x) dx y₀ .
2. Under the hypotheses of exercise 1 the function y = y(x) has a second differential y''(x) h k for every k ∈ R_x^m and for an h that depends linearly on x − x₀.

Hint. By construction of y(x), dy(x) = y'(x) h = A(x) h y(x) for each h which depends linearly on x − x₀. If one sets h = x − x₀, the identity

y'(x) (x − x₀) = A(x) (x − x₀) y(x)

thus holds. Therefore, since y(x) is differentiable, for an arbitrary dx = k,

d(y'(x) (x − x₀)) = A'(x) k (x − x₀) y(x) + A(x) k y(x) + A(x) (x − x₀) y'(x) k .

Applying the definition of the differential, one concludes from this that y''(x) k (x − x₀) exists and is equal to d(y'(x) (x − x₀)) − y'(x) k. Since y'' is a bilinear operator, the differential y''(x) k h hence exists for an arbitrary k and every h that depends linearly on x − x₀. Further, it follows from the expression for y''(x) k h that it is a continuous function of x. Finally, as a result of Schwarz's theorem (cf. II.3.4), for the vectors h and k mentioned, y''(x) h k also exists and equals y''(x) k h.

3. Let A(x) dx y ∈ R_y^n (x ∈ R_x^m, y ∈ R_y^n) be a bilinear function such
that the operator A(x) is continuous in the region G_x^m (⊂ R_x^m). Let the number |A| denote the maximum of the norm |A(x)| on a compact subregion Ḡ of G_x^m. Integrate the differential equation

dy = A(x) dx y

along a piecewise regular path l ⊂ Ḡ that connects the point x = x₁ with x = x₂, so that y(x₁) = y₁. If the final value of the integral y = y(x) is equal to y₂ = y(x₂), then

|y₂| ≤ |y₁| e^{|A| |l|}   (|l| = length of l) .

Hint. The claim is trivial for y₂ = 0. For y₂ ≠ 0 the integral y(x) ≠ 0 at every point x of the curve l. From

d|y| ≤ |dy| = |A(x) dx y| ≤ |A| |dx| |y|

it follows by division with |y| ≠ 0 that

d|y| / |y| ≤ |A| |dx| ,

from which the assertion results by means of integration along l from x₁ to x₂.
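The bound of exercise 3 can be observed numerically. In the sketch below (the matrix function and the discretization are illustrative choices), a forward-Euler transport along [0, 1] satisfies the discrete analogue of the bound, since each step multiplies the norm by at most 1 + Δt·|A| ≤ e^{Δt·|A|}:

```python
import numpy as np

def transport(A, ts, y0):
    """Forward-Euler integration of dy = A(x) dx y along the parameter grid ts."""
    y = np.array(y0, dtype=float)
    for t0, t1 in zip(ts[:-1], ts[1:]):
        y = y + (t1 - t0) * (A(t0) @ y)
    return y

A = lambda t: np.array([[0.0, 1.0 + t], [-2.0, np.sin(3.0 * t)]])
ts = np.linspace(0.0, 1.0, 20000)
y0 = np.array([1.0, 1.0])
y1 = transport(A, ts, y0)

# |A| = maximal operator norm along the path, |l| = 1
normA = max(np.linalg.norm(A(t), 2) for t in ts)
bound = np.linalg.norm(y0) * np.exp(normA * 1.0)
```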
4. Under the assumptions of the previous exercise,

|y₂ − y₁| ≤ |A| e^{|A| |l|} |l| |y₁| ,

and hence for |l| ≤ ρ₀,

|y₂ − y₁| ≤ M |l| |y₁| ,

where M = |A| e^{|A| ρ₀}.
5. Let the constant bilinear operator A be so constituted that the solutions y = y(x) of the differential equation dy = A dx y (x, y ∈ R^n) which do not identically vanish provide a locally one-to-one mapping x → y. Then the dimension is n = 1 or n = 2, and y(x) is isomorphic to the real or complex exponential function.

Hint. Every solution y(x) of the problem is ≠ 0. For if y(x₀) = 0, the representation y(x) = f(x) y(0) shows, in view of the functional equation for the transformation f(x), that

y(x) = f(x) y(0) = f((x − x₀) + x₀) y(0) = f(x − x₀) f(x₀) y(0) = f(x − x₀) y(x₀) = 0

(cf. 3.5), and y(x) would therefore vanish identically (also cf. 3.1). On the other hand, y(x) assumes all values different from zero. We first prove: each locally defined element of the inverse function x = x(y) of y = y(x) can be continued in the region 0 < |y| < ∞. To see this, observe that the minimum a = min |A h k| for |h| = |k| = 1 is positive. Namely, let k be an arbitrary unit vector in R^n. There is then a well-determined solution y = y(x) = f(x) k of our problem such that y(0) = k. We have y'(0) h = A h k, and since the mapping x → y is by hypothesis locally one-to-one, the derivative y'(0) is regular, and A h k ≠ 0 for every |h| = 1. It follows from this that the minimum a is positive.

Now let x = x₁ be chosen arbitrarily and y₁ = y(x₁), where y = y(x) (≠ 0) is an arbitrary solution. Join y₁ (≠ 0) with an arbitrary second point y₂ ≠ 0 by means of a polygonal path l that avoids the point y = 0. In a neighborhood of y = y₁ the inverse function x = x(y) (x₁ = x(y₁)) is single-valued. Now continue this element x = x(y) as inverse function of y = y(x) along l. We claim that the continuation arrives at the end point y = y₂ with a well-determined element x = x(y) of the inverse function of y = y(x).

Otherwise the continuation succeeds only on a subarc y₁ c₀ ⊂ l, without reaching the end point y = c₀. This subarc borders on c₀ with a certain line segment. Let y = c₁, c₂ be two arbitrary interior points of this segment, and aᵢ = x(cᵢ) (i = 1, 2). On the interval c₁ c₂, x = x(y) satisfies the equation dy = A dx y, from which we have

|dy| = |A dx y| ≥ a |dx| |y| ≥ a ϱ |dx| ,
where ϱ > 0 denotes the shortest distance from y = 0 to l. Therefore,

|a₂ − a₁| ≤ ∫_{c₁ c₂} |dx| ≤ (1/(a ϱ)) ∫_{c₁ c₂} |dy| = |c₂ − c₁| / (a ϱ) .

According to the Cauchy criterion this implies the existence of a limit point a₀ = lim x(y) for y → c₀. The function x = x(y) as the inverse of y = y(x) would thus exist in a neighborhood of the point y = c₀, so that y(a₀) = c₀, which contradicts the assumption regarding c₀. The truth of the assertion follows from here: in the continuation of x(y) along l one reaches the end point y = y₂ with a value x(y₂) = x₂, and the inversion of the final element coincides with y = y(x). Consequently y(x₂) = y₂. Since y₂ ≠ 0 was arbitrary, it follows in particular from what was proved above that y(x) assumes all values y ∈ R^n, with the single exception y = 0.

We now consider the unit sphere |y| = 1. According to the above, the inverse function x = x(y), provided the dimension n > 1, can be continued without bound on this surface. We now consider the case n = 3. Since the sphere is simply connected (it has the null homology group), the continued function x = x(y) must be single-valued not only locally, but on the entire sphere. The mapping y → x is hence one-to-one on |y| = 1 and has as image in the x-space a compact surface F₂. But this leads to a contradiction. Suppose y₀ = y(0). Since the form A h y₀ is different from zero, A h y₀ as a linear transformation of h ∈ R^n is regular, and there is a well-determined value h = a ≠ 0 such
that A a y₀ = y₀. On the line x = ξ a (−∞ < ξ < +∞), y = e^ξ y₀ is then a solution of the differential equation. For

dy = e^ξ dξ y₀ = e^ξ dξ A a y₀ = A (dξ a) (e^ξ y₀) = A dx y .

Since this solution agrees with the given solution y(x) for ξ = 0, we thus have y(ξ a) = e^ξ y₀.

Now let x = b + ξ a. Clearly

y(b + ξ a) = f(b + ξ a) y₀ = f(b) f(ξ a) y₀ = f(b) y(ξ a) = e^ξ f(b) y₀ = e^ξ y(b) .

From here one sees that the function y = y(x) maps the line x = b + ξ a one-to-one onto a half ray (y = e^ξ y(b)) that emanates from y = 0. Since for variable b ∈ R³, y(b) assumes all values ≠ 0, these half rays cover the entire punctured space y ≠ 0. Because the rays cut the sphere |y| = 1, the compact surface F₂ must be intersected by every line in the parallel family x = b + ξ a (b variable, a fixed). But for sufficiently large |b| this is impossible, and the thus deduced contradiction shows that the case n = 3 is excluded. It is similarly proved that the assumption n > 3 is also contradictory.
Therefore only the dimensions n = 1, 2 remain. In the case n = 1, y(x) = y(ξ a) = e^ξ y₀ is the real exponential function. In the case n = 2, the following can be deduced. Since the inverse function x = x(y) can be continued without bound on the unit circle |y| = 1, corresponding to one full revolution, starting at y = c, is a continuous arc with end points x = b₁, b₂ such that y(b₁) = y(b₂) = c. In a way similar to the above one sees that this arc cannot be closed. Hence b₁ ≠ b₂. If one sets ω = b₁ − b₂, then y(b₁) = y(b₂), f(b₁) = f(b₂), and

f(ω) = f(b₁ − b₂) = f(b₁) f(−b₂) = f(b₁) (f(b₂))⁻¹ = I .

This further yields

y(x + ω) = f(x + ω) y₀ = f(ω) f(x) y₀ = f(x) y₀ = y(x) ,

and the function y(x) is thus periodic with period ω. From the properties of y(x) on the lines parallel to x = ξ a, where y(x) is nonperiodic, it follows that ω is linearly independent of a. The function then has a primitive period b parallel to ω, and the vectors a, b span the space R². Taking into account the integrability condition A h A k = A k A h it now follows by means of a simple calculation, in which one uses a and b as a coordinate system, that y(x) is isomorphic to the complex exponential function.
V. Theory of Curves and Surfaces

As an application of the preceding chapters, the basic features of the classical curve theory and of the Gaussian surface theory will be presented in this chapter. We shall consider curves and surfaces which lie in an n-dimensional linear space and investigate the properties of these structures relative to a euclidean metric introduced in the embedding space.
§ 1. Regular Curves and Surfaces

1.1. Regular arcs and surfaces. In the following we consider two linear spaces of dimensions m and n. In the former, the so-called parameter space R_x^m, let an open m-dimensional domain G_x be given. Further, let

x → y = y(x)    (1.1)

be a single-valued mapping from G_x into the space R_y^n, with the following properties:

1. The vector function y(x) is continuously differentiable.

2. The derivative operator y'(x) is regular.

Such a mapping defines a local "m-dimensional regular surface" F^m in the space R_y^n. It follows from our assumptions that m ≤ n. The theory of implicit functions (cf. II.4.2) shows that a sufficiently small neighborhood H_x of a point x ∈ G_x is mapped one-to-one onto a set H_y determined by equation (1.1)¹. The set H_y is then, and only then, an n-dimensional neighborhood of the point y = y(x) when m = n. In the case m < n the surface F^m is in the proper sense of the word "embedded" in R_y^n. For m = 1, (1.1) is a regular arc in R_y^n. When m = 2, n = 3, one is dealing with the main problem of the Gaussian surface theory.

We shall in what follows choose the finite dimension n of the embedding space arbitrarily and the dimension of the surface in general to be m = n − 1. However, we do not impose this restriction at first; thus m is an arbitrary number in the interval 1 ≤ m ≤ n − 1.

¹ If this one-to-one-ness does not hold for the entire parameter region G_x, then the latter region can be replaced by a subregion for which this is true, which is possible under hypotheses 1 and 2.
1.2. Transformation of parameters. Tangent space. In the above definition y = y(x) is the equation of an arc or surface F^m with respect to the parameter x, which is called "admissible" in order to indicate briefly that the equation of the surface has the properties 1 and 2 mentioned above relative to this parameter. Now let

x̄ = x̄(x),    x = x(x̄)

be a one-to-one mapping of the open region G_x onto an open region G_x̄ of the same or of another m-dimensional linear space R_x̄. If this function x̄(x) is continuously differentiable in G_x and the derivative x̄'(x) is regular, then x̄ is also an admissible parameter, and the equation of the given surface F^m with respect to this parameter is

y = y(x(x̄)) = ȳ(x̄).

Because the surface F^m in the space R_y^n is given independently of the admissible parametric representations, the representing function y(x), in going from one admissible parameter x to another x̄, is accordingly invariant: one obtains ȳ(x̄) from y(x) simply by substituting x = x(x̄), and conversely y(x) = ȳ(x̄(x)). If one sets x ∼ x̄ whenever these two points from different parametric representations correspond to the same surface point y(x) = ȳ(x̄), then this relation is an equivalence whose equivalence classes (x, x̄, ...) correspond one-to-one to the surface points

y = y(x) = ȳ(x̄) = ...

By the definition of the differential, dx̄ = x̄'(x) dx, dx = x'(x̄) dx̄. The parameter differentials dx = h and dx̄ = h̄ are transformed according to the formulas

h̄ = x̄'(x) h,    h = x'(x̄) h̄.

This is the transformation law of contravariance. On the surface, corresponding to the parameter differential dx = h is the tangent vector

k = dy = y'(x) dx = y'(x) h.

By the chain rule y'(x) = ȳ'(x̄) x̄'(x), ȳ'(x̄) = y'(x) x'(x̄), and consequently

k = dy = y'(x) h = ȳ'(x̄) x̄'(x) h = ȳ'(x̄) h̄ = dȳ = k̄.
This states that the derivative operator y'(x) = A(x), which can also be interpreted as a vector from the linear mn-dimensional operator space, is transformed under an admissible transformation of the parameter according to the law of covariance

Ā = A x'(x̄),    A = Ā x̄'(x).

If a covariant operator operates upon a contravariant vector, there results an invariant, as can be seen from the invariance of the surface tangent k = dy = dȳ = k̄. From this invariance it follows that the set of all tangent vectors at the point y = y(x), the tangent space

y'(x) R_x^m = ȳ'(x̄) R_x̄^m,

is also invariant at the above surface point. It is therefore an m-dimensional subspace of the embedding space R_y^n, independent of the choice of the parameter. The m-dimensional hyperplane E^m(x) = y(x) + y'(x) R_x^m is the tangent plane of the surface at the point y = y(x). Later we shall return to the notions of invariance, contravariance and covariance in more detail and from a more general point of view.
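In modern numerical terms the two transformation laws can be tried out directly. The short sketch below (the curve, the parameter change x = 2x̄, and all function names are our own illustrations, not the text's) checks that the differential h transforms contravariantly while the tangent k = y'(x) h is invariant:

```python
import numpy as np

# Toy example (our own): a curve y(x) in R^2 with admissible
# parameter change x = x(xbar) = 2*xbar, so dx/dxbar = 2.

def y_prime(x):
    return np.array([-np.sin(x), np.cos(x)])       # derivative of (cos x, sin x)

def x_of_xbar(xb):
    return 2.0 * xb

def ybar_prime(xb):
    # chain rule: ybar'(xbar) = y'(x(xbar)) * dx/dxbar  (covariant law)
    return y_prime(x_of_xbar(xb)) * 2.0

xb = 0.7
hb = 0.01                    # parameter differential dxbar
h = 2.0 * hb                 # contravariant law: h = (dx/dxbar) * hbar

k = y_prime(x_of_xbar(xb)) * h      # tangent computed via parameter x
kbar = ybar_prime(xb) * hb          # tangent computed via parameter xbar
print(np.allclose(k, kbar))         # the tangent is invariant
```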
1.3. The surface as m-dimensional manifold. According to the definitions of 1.1 and 1.2 the surface F^m is homeomorphic (topologically equivalent) to the parameter region G_x (and to G_x̄). If "open point sets" or "neighborhoods" H^m on F^m are declared to be images of the open point sets H_x in G_x, then F^m satisfies locally the axioms for a Hausdorff space¹.

But more than that, F^m is an m-dimensional manifold, for (in addition to A and B) the essential axiom holds:

C. There is a covering set (H^m) of F^m, each neighborhood H^m, H̄^m, ... of which is homeomorphic to an open region H_x, H̄_x̄, ... of the parameter spaces R_x, R_x̄, ... Furthermore, for a nonempty intersection H^m ∩ H̄^m (etc.) the equivalence relation x ∼ x̄, in which the parameters have the same image point p on H^m ∩ H̄^m, is topological.

¹ F^m can be covered with a set of neighborhoods (H^m) such that: A. (Axiom for a topological space.) The union of arbitrarily many and the intersection of finitely many neighborhoods is again a neighborhood H^m. B. (Hausdorff's separation axiom.) Two different points of the surface have two disjoint neighborhoods H^m.

If in particular the equivalence relations x = x(x̄) (etc.) are continuously differentiable and the derivatives x'(x̄) regular (which was
hypothesized above), then F^m is a "continuously differentiable" or "regular" manifold. This property allows the notion of a continuously differentiable vector function to be introduced on F^m. One special such invariant is the function y = y(x) by means of which the manifold F^m was defined as a surface embedded in the vector space R_y^n.
1.4. General definition of a surface. The above considerations are based upon the especially simple assumption that the surface F^m in its full extent permits a single-valued parametric mapping G_x → F^m. With such a special assumption one does not come far in the theory of surfaces. In surface theory one is not interested only in the local properties of a surface; however, if one does go on to investigate such a structure in the large, the situation becomes more complicated. Thus, for example, a two-dimensional surface in R_y^3 generally cannot be mapped over its entire extent homeomorphically onto a region G_x ⊂ R_x^2. In order to go on, one can, following the classical procedure of Gauss, construct the surface out of adjacent surface elements that possess the above homeomorphy property C (triangulation of the surface), or, more generally: one covers the surface with a system of neighborhoods H^m so that each H^m has property C. In order to formulate this last point of view into a precise definition of an m-dimensional surface F^m in the space R_y^n, it is convenient to first define the surface independently of the embedding, specifically as a Hausdorff space (axioms A, B) with the parametrization property C. After F^m is thus defined as an m-dimensional manifold, its embedding is brought about by means of a continuous (or differentiable) mapping of the surface points p = (x, x̄, ...) into the vector space R_y^n.

In what follows we shall at first stick mainly to the local properties of a curve or a surface, and for this reason it suffices to undertake our investigation relative to a fixed parameter x. Not until § 4, in connection with the above general definition of a surface, shall we return to a more careful examination of the question of parameter transformations mentioned briefly in 1.2.
1.5. Metrization of the embedding space. The above definitions have assumed no metrization of the embedding space or of the parameter spaces. We now introduce a metric into the embedding space R_y^n by means of a real, bilinear, symmetric, positive definite fundamental form (y₁, y₂), which also induces a euclidean metric in each tangent space of the surface. The goal of the differential geometric embedding theory established by Gauss is to investigate the properties of curves and surfaces relative to a euclidean embedding space. Regarding the dimension m (1 ≤ m ≤ n − 1), we shall restrict ourselves to the extreme cases m = 1 and m = n − 1. The case
m = 1 yields the theory of curves governed by the Frenet equations, while the case m = n − 1 is a generalization of the classical Gaussian surface theory associated with the special dimensions m = 2, n = 3.
§ 2. Curve Theory

2.1. Arc length. We consider a regular arc y = y(ξ) of the euclidean space R_y^n. In this 1-dimensional case we can use the real number line as the parameter space, and the arc is therefore defined by a mapping y = y(ξ) of the interval α < ξ < β into the space R_y^n, so that y(ξ) is continuously differentiable in the latter interval and the mapping is regular. In the case at hand, the derivative can be represented in the usual way as the limit

y'(ξ) = lim_{Δξ→0} (y(ξ + Δξ) − y(ξ))/Δξ,

which for α < ξ < β is therefore continuous and different from zero. Corresponding to the subinterval ξ₀ ≤ ξ there is a subarc with arc length

σ(ξ) = ∫_{ξ₀}^{ξ} |y'(ξ)| dξ.
From σ'(ξ) = |y'(ξ)| > 0 it follows that σ(ξ) is a monotonically increasing function which is differentiable with respect to ξ as often as the function y(ξ) is continuously differentiable. The transformation

σ = σ(ξ),    ξ = ξ(σ)

is consequently admissible, and one can use the arc length σ as the parameter. This offers certain advantages, because

dy/dσ = (dy/dξ)(dξ/dσ) = y'(ξ)/|y'(ξ)|,

and as a consequence

|dy/dσ| = 1,    |dy| = dσ

hold identically.
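The arc length integral σ(ξ) = ∫ |y'| dξ is easy to evaluate numerically. In the sketch below (the helix and all names are our own assumptions, not taken from the text) a trapezoidal approximation is compared with the closed form, which for this curve is √2 · ξ:

```python
import numpy as np

# Our own example curve: the helix y(xi) = (cos xi, sin xi, xi),
# whose speed |y'(xi)| = sqrt(2) is constant, so sigma(xi) = sqrt(2)*xi.

def y_prime(xi):
    return np.array([-np.sin(xi), np.cos(xi), 1.0])

def arc_length(xi0, xi1, steps=4000):
    # sigma = integral of |y'(t)| dt over [xi0, xi1], trapezoidal rule
    t = np.linspace(xi0, xi1, steps)
    speeds = np.array([np.linalg.norm(y_prime(s)) for s in t])
    return float(np.sum(0.5 * (speeds[:-1] + speeds[1:]) * np.diff(t)))

sigma = arc_length(0.0, 2.0)
print(abs(sigma - 2.0 * np.sqrt(2.0)) < 1e-9)   # agrees with the closed form
```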
2.2. The associated n-frame. For our further investigation it is to be assumed that the parametric representation y = y(ξ) satisfies the following more special conditions:

1. The first n + 1 derivatives of the function y(ξ) exist in the interval α < ξ < β.

2. The first n derivatives are for each ξ of this interval linearly independent.
If
ξ̄ = ξ̄(ξ),    ξ = ξ(ξ̄)

is an admissible parameter transformation which is moreover (n + 1)-times differentiable (as is the case, for example, for σ = σ(ξ)), then the above hypotheses are also fulfilled with respect to the parameter ξ̄. The second hypothesis is a consequence of ξ̄'(ξ) ≠ 0 and of the pth derivative (p = 1, ..., n + 1) of y with respect to the one parameter being a linear combination of the first p derivatives with respect to the other parameter.

From these statements it further results that the first p derivatives at each point y = y(ξ) of the arc span a p-dimensional subspace S_p(ξ) which is independent of the choice of the parameter. This is the p-dimensional osculating space and y(ξ) + S_p(ξ) the p-dimensional osculating plane of the curve at the point y(ξ); for p = 1 one has the tangent to the curve at the point in question.

Now orthogonalize the linearly independent derivatives y'(ξ), ..., y^(n)(ξ), in this order, by means of the Schmidt orthogonalization process, with respect to the fundamental euclidean form (y₁, y₂) laid down in 1.5. Using the notation from I.4.4, we find

y^(p)(ξ) = λ_p1(ξ) e₁(ξ) + ... + λ_pp(ξ) e_p(ξ),
e_p(ξ) = μ_p1(ξ) y'(ξ) + ... + μ_pp(ξ) y^(p)(ξ).    (2.1)

Here λ_pp(ξ) μ_pp(ξ) = 1. All coefficients λ_ij and μ_ij are, for a given parameter, uniquely determined if the sign of λ_pp is fixed; for example, we take λ_pp(ξ) > 0.

The first p orthonormal vectors e₁(ξ), ..., e_p(ξ) generate the osculating space S_p(ξ) at the point y(ξ). Because this space is invariant, it follows that the orthogonal coordinate system is uniquely determined, independently of the choice of parameter, at each point of the arc. This is the accompanying n-frame of the curve; e₁ is the unit tangent, e₂ the first or principal normal, e₃ the second or binormal, and so on.

If one takes the arc length σ as parameter, which will be the case in what follows, where we do not write out the argument σ, then it follows from the identities

(y', y') = 1,    (y', y'') = 0

that e₁ = y', e₂ = y''/|y''|. Therefore, in the Schmidt orthogonalization scheme, for the parameter σ,

λ₁₁ = 1,    λ₂₁ = 0,    λ₂₂ = |y''|.
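The Schmidt construction of the accompanying frame is easy to carry out numerically. In the following sketch the arc-length helix and all function names are our own example, not the text's; the sign of each λ_pp is kept positive by dividing by the (positive) norm:

```python
import numpy as np

# Our own example: helix y(sigma) = (cos(r*sigma), sin(r*sigma), r*sigma),
# r = 1/sqrt(2), which is parametrized by arc length (|y'| = 1).

def schmidt_frame(derivs):
    """Orthonormalize y', y'', ... in order; the normalization by the
    positive norm fixes the sign so that lambda_pp = (y^(p), e_p) > 0."""
    es = []
    for d in derivs:
        v = d - sum(np.dot(d, e) * e for e in es)
        es.append(v / np.linalg.norm(v))
    return es

r = 1.0 / np.sqrt(2.0)
sigma = 0.0
y1 = r * np.array([-np.sin(r*sigma), np.cos(r*sigma), 1.0])      # y'
y2 = r*r * np.array([-np.cos(r*sigma), -np.sin(r*sigma), 0.0])   # y''
y3 = r**3 * np.array([np.sin(r*sigma), -np.cos(r*sigma), 0.0])   # y'''

e1, e2, e3 = schmidt_frame([y1, y2, y3])
# e1 = y' (unit tangent, since |y'| = 1), e2 = y''/|y''| (principal normal)
print(np.allclose(e1, y1), np.allclose(e2, y2 / np.linalg.norm(y2)))
```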
2.3. Frenet's formulas. Since the existence of the derivative y^(n+1) was assumed, by the Schmidt orthogonalization scheme not only do the derivatives e₁', ..., e_{n−1}' exist, but so also does e_n'. For each σ of the interval α < σ < β, therefore, there exists a uniquely determined linear transformation A(σ) = A of the space R_y^n such that for p = 1, ..., n

e_p' = A e_p.    (2.2)
A is the Frenet operator and the corresponding matrix (a_p^q),

a_p^q = (e_p', e_q) = (A e_p, e_q),

with respect to the n-frame at the point σ, the Frenet matrix of the arc at this point. In order to investigate the properties of this transformation, we consider the orthogonal transformation T(σ) = T determined by the n equations

e_p(σ) = T(σ) e_p^0    (e_p^0 = e_p(σ₀), α < σ₀ < β),    (2.3)

where T(σ₀) = I is the identity transformation. By means of this
transformation one obtains from

e_p' = T' e_p^0 = T' T^{−1} e_p

the representation

A = T' T^{−1} = T' T*

for the Frenet operator, where T' = dT/dσ and T* stands for the transformation adjoint to T. From this it is seen that A is skew symmetric. In fact, it follows from the identity T T^{−1} = T T* = I, by means of differentiation with respect to σ, that

0 = T' T* + T (T*)' = T' T* + T (T')* = A + A*,

which expresses the skew symmetry of A. Consequently, the matrix of the operator A with respect to any orthonormal coordinate system, in particular also with respect to the n-frame at the point σ, is skew symmetric:

a_p^q + a_q^p = 0.    (2.4)
If one further considers that the unit vector e_p, according to the Schmidt orthogonalization scheme, is a linear combination of the first p derivatives of y, and the derivative y^(p), conversely, is a linear combination of e₁, ..., e_p, then it follows that for q > p + 1

a_p^q = (e_p', e_q) = (A e_p, e_q) = 0.

Because of (2.4), the same is also true for q < p − 1. Thus, if one sets a_p^{p+1} = κ_p (p = 1, ..., n − 1), then

a_p^{p+1} = −a_{p+1}^p = κ_p,    a_p^q = 0    (q ≠ p + 1, p − 1).    (2.5)
The Frenet matrix (a_p^q) is therefore a "skew symmetric Jacobi matrix", and the Frenet equations (2.2), written out with respect to the n-frame e₁, ..., e_n at the point σ, are

e₁' = κ₁ e₂,
e_p' = −κ_{p−1} e_{p−1} + κ_p e_{p+1}    (p = 2, ..., n − 1),    (2.2')
e_n' = −κ_{n−1} e_{n−1}.

The n − 1 quantities κ_p = κ_p(σ) are called the curvatures of the curve y = y(σ) at the point σ; κ₁ is the first or principal curvature, κ₂ the second or torsion, etc. As functions of the arc length σ they are uniquely determined at each point of the arc up to the sign, which depends on the orientation of the unit vectors e_p. From the Schmidt orthogonalization scheme (2.1) and the Frenet formulas (2.2') one obtains the expression

κ_p = λ_{p+1,p+1}/λ_pp    (2.6)

for these curvatures, so that κ_p > 0 provided λ_pp is assumed to be > 0 in equation (2.1) (cf. 2.6, exercise 1). With these formulas one can also easily compute the curvatures directly from the derivatives of the function y(σ) (cf. 2.6, exercise 2).
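The shape of the skew symmetric Jacobi matrix (2.5) can be made concrete in a few lines; the curvature values below are arbitrary illustrations of ours, not values from the text:

```python
import numpy as np

def frenet_matrix(kappas):
    """Assemble the skew symmetric Jacobi matrix (2.5) from the
    curvatures kappa_1, ..., kappa_{n-1}: only a_p^{p+1} = kappa_p
    and a_{p+1}^p = -kappa_p are nonzero."""
    n = len(kappas) + 1
    A = np.zeros((n, n))
    for p, kp in enumerate(kappas):
        A[p, p + 1] = kp
        A[p + 1, p] = -kp
    return A

A = frenet_matrix([2.0, 0.5, 1.5])   # n = 4, arbitrary sample curvatures
print(np.allclose(A, -A.T))          # skew symmetry, i.e. (2.4)
```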
2.4. Integration of the Frenet equations. From (2.1) and (2.6) it can be seen that κ_p(σ) is differentiable (n − p) times in the given interval, provided the curve y = y(σ) satisfies conditions 1 and 2 of 2.2. We now show, conversely: if the κ_p(σ) (p = 1, ..., n − 1) are defined as positive and (n − p)-times-differentiable functions in the interval α < σ < β, then in the euclidean space R_y^n there exists an arc y = y(σ) satisfying conditions 1 and 2 of 2.2 with the arc element dσ and the prescribed curvatures κ_p. This curve is uniquely determined up to a translation and orthogonal transformation of the space.

Assuming there exists an arc in R_y^n with the asserted properties, then its n-frame satisfies the Frenet equations (2.2'). The operator A has with respect to the n-frame at the point σ the prescribed Frenet matrix (2.5) and is in this way uniquely determined for each σ of the interval α < σ < β as a skew symmetric transformation of the space R_y^n. If the orthogonal transformation T is introduced by means of (2.3), then A = T' T^{−1}, and T satisfies the differential equation

T' = A T    (2.7)

in the n²-dimensional operator space, with the initial value T(σ₀) = I.
Here the Frenet operator A depends on the accompanying n-frame of the curve y = y(σ). But in our converse problem, determining this curve by means of the given curvatures κ_p(σ) (p = 1, ..., n − 1), only the alternating Frenet matrix M(σ) is known. In order to fix the operator A, we observe that M(σ) is invariant under the transformation of A(σ) by the orthogonal transformation T(σ), so that A(σ) has the same Frenet matrix in the frame (e₁(σ), ..., e_n(σ)) as the alternating operator B(σ) = T^{−1}(σ) A(σ) T(σ) in the frame (e₁(σ₀), ..., e_n(σ₀)). The equation (2.7) can therefore be written

T'(σ) = A T = T(σ) B(σ).    (2.7')

In order to integrate this equation for the unknown operator T(σ), with the initial value I in an arbitrarily given orthonormal basis (e₁(σ₀), ..., e_n(σ₀)), we use the main theorem of the theory of linear differential equations. The solution T(σ) is uniquely determined, and T(σ₀) = I. This solution is orthogonal. In fact,

(T T*)' = T' T* + T (T')* = T B T* + T B* T* = 0,

because B + B* = 0. Thus T T* is independent of σ and equal to the constant I. That implies the orthogonality of T(σ). Thus if the initial vectors e_p(σ₀) = e_p^0
(p = 1, ..., n) in (2.3) are orthonormalized, the equation (2.7') determines an orthonormal n-frame (e_p(σ)) (p = 1, ..., n). In particular, by y'(σ) = e₁(σ), we conclude that

y(σ) = y₀ + ∫_{σ₀}^{σ} e₁(σ) dσ    (y(σ₀) = y₀)    (2.8)

is the only possible curve of the sort required that goes through the point y₀ with the frame (e₁^0, ..., e_n^0).

This arc, which is uniquely determined by the prescribed curvatures κ_p, the point y₀ and the n-frame (e₁^0, ..., e_n^0), has, in fact, all of the required properties. First, it follows from (2.8) that |y'(σ)| = |e₁(σ)| = 1. The parameter σ is thus the arc length of the constructed curve. Further, because of the (n − p)-differentiability of the functions κ_p, the derivatives

y' = e₁,
y'' = e₁' = A e₁ = κ₁ e₂,
y''' = κ₁' e₂ + κ₁ e₂' = κ₁' e₂ + κ₁ A e₂ = −κ₁² e₁ + κ₁' e₂ + κ₁ κ₂ e₃,
...

and in general, for p = 2, ..., n,

y^(p) = λ_p1 e₁ + ... + λ_pp e_p    with    λ_pp = κ₁ κ₂ ⋯ κ_{p−1}    (> 0)
exist; even the derivative y^(n+1) exists. From the expression for y^(p) it can be seen that the derivatives y', ..., y^(n) are linearly independent and that the above constructed orthonormal system e₁, ..., e_n results from these derivatives by means of the Schmidt orthogonalization process. The vectors e_p thus form the n-frame of the constructed curve at the point σ. Further, in view of formula (2.6) it follows from the above expression for λ_pp that the curve constructed has the prescribed functions κ₁, ..., κ_{n−1} as curvatures at each point of the interval α < σ < β.

The claim relative to the uniqueness of the curve constructed is also a result of the construction. For if y = ỹ(σ) is a second solution that goes through the point ỹ(σ₀) = ỹ₀ with the n-frame ẽ₁^0, ..., ẽ_n^0, there exists a unique orthogonal transformation T₀ such that T₀ e_p^0 = ẽ_p^0. Then the curve y = y*(σ) = ỹ₀ + T₀(y(σ) − y₀) goes through the point y*(σ₀) = ỹ₀ with the n-frame ẽ₁^0, ..., ẽ_n^0. At each point σ of the given interval it has the same curvatures as the constructed curve y = y(σ), and thus also as the second solution y = ỹ(σ). But then we have identically ỹ(σ) ≡ y*(σ), which completes the proof.
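The central step of the proof, that the solution of (2.7') stays orthogonal, can also be observed numerically. The sketch below is our own illustration: with constant curvatures the operator B = A is constant, and a Runge-Kutta integration of T' = A T is checked for orthogonality at the end point:

```python
import numpy as np

def integrate_T(A, sigma, steps=2000):
    """Integrate T' = A T, T(0) = I, by classical fourth-order
    Runge-Kutta with a fixed step (a sketch, not the book's Picard
    iteration)."""
    n = A.shape[0]
    T = np.eye(n)
    h = sigma / steps
    for _ in range(steps):
        k1 = A @ T
        k2 = A @ (T + 0.5 * h * k1)
        k3 = A @ (T + 0.5 * h * k2)
        k4 = A @ (T + h * k3)
        T = T + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return T

# skew symmetric Jacobi matrix (2.5) for n = 3 with constant curvatures
kappa = [1.0, 0.5]
A = np.zeros((3, 3))
for p, kp in enumerate(kappa):
    A[p, p + 1], A[p + 1, p] = kp, -kp

T = integrate_T(A, 2.0)
print(np.allclose(T @ T.T, np.eye(3), atol=1e-8))   # T T* = I throughout
```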
2.5. Degeneration of the curve. In the above discussion it was assumed that the first n derivatives of the function are linearly independent at each point of a certain parameter interval. Now assume that only the first l (< n) derivatives turn out to be linearly independent, while for y^(l+1)(ξ) the linear relation

y^(l+1)(ξ) = Σ_{i=1}^{l} λ_i(ξ) y^(i)(ξ)

holds in this interval, with continuous coefficients λ_i.
For a fixed ξ₀ of the parameter interval the linearly independent derivatives y'(ξ₀), ..., y^(l)(ξ₀) generate an l-dimensional subspace U_{ξ₀} of the space R_y^n. In the latter, U_{ξ₀} has an orthogonal complement of dimension n − l; let a be an arbitrary vector of this complement. With the constant vector a we form the functions

z_i(ξ) = (y^(i)(ξ), a)    (i = 1, ..., l).

These l functions now satisfy the like number of equations in the normal system of linear differential equations

z_i' = z_{i+1}    (i = 1, ..., l − 1),    z_l' = Σ_{i=1}^{l} λ_i(ξ) z_i,

and this system is also satisfied by the identically vanishing functions z_i(ξ) ≡ 0. But for ξ = ξ₀,

z_i(ξ₀) = (y^(i)(ξ₀), a) = 0    (i = 1, ..., l),
and therefore, according to the already often used uniqueness theorem, these two systems of solutions are identical. By this, z_i(ξ) ≡ 0 in the entire parameter interval, and in particular

z₁(ξ) = (y'(ξ), a) = 0,    and hence    (d/dξ)(y(ξ), a) = 0.

Since this equation holds for each a from the orthogonal complement of U_{ξ₀} and each ξ in the parameter interval, the entire arc is orthogonal to this complement and therefore lies in the l-dimensional hyperplane y(ξ₀) + U_{ξ₀}. If the curve is shifted parallel to itself by the vector −y(ξ₀), it lies in the subspace U_{ξ₀}, where the curve theory developed above can be applied to it.

2.6. Exercises. 1. Prove formula (2.6),

κ_p = λ_{p+1,p+1}/λ_pp.
2. Show that the curvature κ_p = κ_p(σ) of the curve y = y(σ) can be computed from

κ_p² = (Δ_{p−1} Δ_{p+1})/Δ_p²,

where Δ₀ = 1, and for p ≥ 1 the Gram determinant

Δ_p =
| (y', y')    ⋯  (y', y^(p))    |
| ⋯                             |
| (y^(p), y') ⋯  (y^(p), y^(p)) |.

Hint. According to exercise 16 in I.6.11, where one is to set x_i = y^(i), one has for p = 1, ..., n

Δ_p = λ₁₁² λ₂₂² ⋯ λ_pp².
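The Gram-determinant formula of exercise 2 can be tried out numerically. The helix below, whose curvature and torsion both equal 1/2 in the arc-length parametrization, is our own test case, not one from the text:

```python
import numpy as np

def gram_det(vectors):
    """Delta_p: determinant of the matrix of inner products (y^(i), y^(j))."""
    V = np.array(vectors)
    return float(np.linalg.det(V @ V.T))

# derivatives at sigma = 0 of the arc-length helix
# y(sigma) = (cos(r*sigma), sin(r*sigma), r*sigma), r = 1/sqrt(2)
r = 1.0 / np.sqrt(2.0)
y1 = np.array([0.0, r, r])          # y'
y2 = np.array([-r*r, 0.0, 0.0])     # y''
y3 = np.array([0.0, -r**3, 0.0])    # y'''

deltas = [1.0, gram_det([y1]), gram_det([y1, y2]), gram_det([y1, y2, y3])]
kappa1 = np.sqrt(deltas[0] * deltas[2]) / deltas[1]
kappa2 = np.sqrt(deltas[1] * deltas[3]) / deltas[2]
print(round(kappa1, 10), round(kappa2, 10))   # both equal 1/2 for this helix
```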
3. Determine the curve in R_y^n (uniquely determined up to a euclidean displacement) whose curvatures κ₁, ..., κ_{n−1} are (positive) constants.

Hint. Let T be the orthogonal operator defined by equation (2.3) and σ₀ = 0. Since the matrix (2.5) of the Frenet operator A with respect to the n-frame e₁, ..., e_n is by hypothesis independent of the arc length σ, integration of the differential equation (2.7) by means of Picard's method of successive approximations (cf. IV.1.10, exercises 1–2) yields

T = T(σ) = Σ_{i=0}^{∞} (σ^i/i!) A^i    (A^0 = T(0) = I),

so that

e_p(σ) = T(σ) e_p(0) = Σ_{i=0}^{∞} (σ^i/i!) A^i e_p(0).    (a)
Because of the skew symmetry of A, by exercise 13 in I.6.13, an orthogonal transformation T₀ and a fixed orthonormal system a₁, ..., a_n exist such that for p = 1, ..., n

e_p(0) = T₀ a_p,

and for q = 1, ..., m = [n/2]

A a_{2q−1} = ρ_q a_{2q},    A a_{2q} = −ρ_q a_{2q−1},    (b)

where for an odd n = 2m + 1 the axis relation

A a_n = 0    (b')
must be added.

From the special Jacobi structure of the Frenet matrix (2.5) one concludes that the numbers ρ_q are ≠ 0. For if one of these numbers were = 0, then it would follow from (b) that the kernel of the operator A would have at least dimension 2. But if one refers this operator to the coordinate system e₁(0), ..., e_n(0), then one finds that the equation

A x = Σ_{p=1}^{n} ξ^p A e_p(0) = 0

holds for an even n only for x = 0 and for an odd n = 2m + 1 only for

x = ξ Σ_{i=0}^{m} (κ₁ κ₃ ⋯ κ_{2i−1})/(κ₂ κ₄ ⋯ κ_{2i}) e_{2i+1}(0)    (c)

(the coefficient for i = 0 is to be read as 1).
The kernel of A is therefore of dimension 0 in the first case, and of dimension 1 in the second.

Because equations (b) are invariant with respect to an orthogonal transformation of the plane spanned by a_{2q−1} and a_{2q}, the transformation T₀ and the orthogonal system a₁, ..., a_n can be normalized in a unique way so that, for example,

e₁(0) = T₀ a₁ = Σ_{q=1}^{m} λ_q a_{2q} + λ a_n,

where λ = 0 for an even n = 2m. If this is substituted in (a), a short calculation yields, because of (b),

y'(σ) = Σ_{q=1}^{m} λ_q (cos(ρ_q σ) a_{2q} − sin(ρ_q σ) a_{2q−1}) + λ a_n,

from which

y = Σ_{q=1}^{m} (λ_q/ρ_q)(cos(ρ_q σ) a_{2q−1} + sin(ρ_q σ) a_{2q}) + λ σ a_n    (d)

is obtained, if the integration constant y(0) is chosen suitably.
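For n = 3 formula (d) is a helix. The following sketch (the parameter values λ₁, ρ₁, λ are our own choice, subject to relation (e) below, λ₁² + λ² = 1) confirms numerically that σ is the arc length:

```python
import numpy as np

# Formula (d) for n = 3 in the standard basis a_1, a_2, a_3 (our example):
lam1, rho1 = 0.8, 2.0
lam = np.sqrt(1.0 - lam1**2)                 # normalization lam1^2 + lam^2 = 1

def y(s):
    return np.array([(lam1/rho1) * np.cos(rho1*s),
                     (lam1/rho1) * np.sin(rho1*s),
                     lam * s])

def y_prime(s):
    return np.array([-lam1 * np.sin(rho1*s),
                      lam1 * np.cos(rho1*s),
                      lam])

# |y'(sigma)| = 1 identically, so sigma is indeed the arc length:
ok = all(abs(np.linalg.norm(y_prime(s)) - 1.0) < 1e-12
         for s in np.linspace(0.0, 10.0, 7))
print(ok)
```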
It can be seen from the above that the constants λ₁, ..., λ_m and (for an odd n) λ satisfy the relation

|y'(σ)|² = Σ_{q=1}^{m} λ_q² + λ² = 1.    (e)

On the other hand, the constants λ_q and, for an odd n, λ too are ≠ 0. Otherwise, as follows from (d), the curve would degenerate, which is not possible if all curvatures are different from zero. Further, it follows from equations (d) and (e) that equation (d) contains precisely n − 1 independent parameters, corresponding to the n − 1 curvatures.

Conversely, if these parameters, that is, the numbers λ_q ≠ 0, ρ_q ≠ 0 and (for an odd n) λ ≠ 0, are arbitrarily prescribed subject to relation (e), then σ is the arc length of the curve, and by the formulas in exercise 2 one obtains for the curvatures of the curve constant nonzero values that up to their sign are uniquely determined.

Remark. Equation (d) shows that for an even n = 2m the curve lies on the sphere

|y| = (Σ_{q=1}^{m} λ_q²/ρ_q²)^{1/2},

while for an odd n = 2m + 1 it winds itself indefinitely around the axis a_n determined by equation (c). For n = 2 the curve is a circle, for n = 3 a helix.

4. Compute the curvatures of the curve (d) in the preceding exercise for n = 2, 3, 4.
Hint. For the computation of the determinant Δ_p in exercise 2, observe that as a consequence of equations (d) and (e)

(y', y') = 1,

and for i + j > 2

(y^(i), y^(j)) = Σ_{q=1}^{m} λ_q² ρ_q^{i+j−2} cos((i − j) π/2).
§ 3. Surface Theory

3.1. The first fundamental form. We refer to the definition given in § 1 and consider an m-dimensional regular surface F^m embedded in the n-dimensional euclidean space R_y^n, where for the time being we assume 1 ≤ m ≤ n − 1. Such a surface is defined by an "admissible" parametric representation (cf. 1.1 and 1.2).

The parameter space R_x^m is mapped by means of the regular linear operator y'(x) one-to-one onto the m-dimensional tangent space of the surface at the surface point y(x) determined by the parameter x. Corresponding to the vectors h and k of the parameter space are the tangents y'(x) h and y'(x) k with the inner product

G(x) h k = (y'(x) h, y'(x) k).    (3.1)
This is the first fundamental form of the surface theory, first introduced by Gauss. It governs the measurement of lengths, angles and volumes on the surface. At each point of the parameter region G_x it is a real, bilinear, symmetric, positive definite function of the parameters h and k. The tangent dy = y'(x) dx = y'(x) h has length

|dy| = |y'(x) h| = √(G(x) h h),

and the angle θ enclosed by the tangent vectors y'(x) h and y'(x) k is determined up to its sign by

cos θ = (y'(x) h, y'(x) k)/(|y'(x) h| |y'(x) k|) = G(x) h k/(√(G(x) h h) √(G(x) k k)).

More generally, the following is true: Corresponding to a d-dimensional simplex spanned at the point x by the linearly independent vectors h₁, ..., h_d (1 ≤ d ≤ m ≤ n − 1), there is, by means of the regular operator y'(x), in the tangent space y'(x) R_x^m a simplex at the surface point y(x), spanned by the linearly independent tangents y'(x) h₁, ..., y'(x) h_d, which according to I.6.10 has the volume

(1/d!) √(det(G(x) h_i h_j)),

where

det(G(x) h_i h_j) =
| G(x) h₁ h₁  ⋯  G(x) h₁ h_d |
| ⋯                          |
| G(x) h_d h₁  ⋯  G(x) h_d h_d |.
In the theory of curves it is convenient to use the arc length of the curve as the parameter, so that the parameter interval and the arc are related isometrically. Corresponding to this, in the theory of surfaces one uses at each point x of the region G_x the first fundamental form as the basic metric form, by means of which the linear parameter space, for a fixed x, becomes euclidean with the inner product (h, k) = G(x) h k. In this metric y'(x) consequently provides an orthogonal mapping of the parameter space R_x^m onto the tangent space y'(x) R_x^m.

If one introduces an affine coordinate system a₁, ..., a_m in the parameter space R_x^m and writes the differentials

h = d₁x = Σ_{i=1}^{m} d₁ξ^i a_i,    k = d₂x = Σ_{i=1}^{m} d₂ξ^i a_i,
the result is the usual coordinate form of the first fundamental form:

G(x) d₁x d₂x = Σ_{i,j=1}^{m} g_{ij}(x) d₁ξ^i d₂ξ^j,    (3.1')

where

g_{ij}(x) = G(x) a_i a_j    (g_{ij} = g_{ji}).
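In coordinates, g_ij is simply the matrix of inner products of the partial derivatives of y. The sketch below computes it for a unit sphere patch, m = 2, n = 3 (the surface and all names are our own illustration, not the text's):

```python
import numpy as np

# Our example: y(u, v) = (cos u cos v, cos u sin v, sin u), the unit sphere.

def y_u(u, v):
    return np.array([-np.sin(u)*np.cos(v), -np.sin(u)*np.sin(v), np.cos(u)])

def y_v(u, v):
    return np.array([-np.cos(u)*np.sin(v), np.cos(u)*np.cos(v), 0.0])

def g(u, v):
    # g_ij = G(x) a_i a_j = (y'(x) a_i, y'(x) a_j)
    Yu, Yv = y_u(u, v), y_v(u, v)
    return np.array([[Yu @ Yu, Yu @ Yv],
                     [Yv @ Yu, Yv @ Yv]])

G = g(0.3, 1.1)
# for this patch: g_11 = 1, g_12 = 0, g_22 = cos^2 u
print(np.allclose(G, np.diag([1.0, np.cos(0.3)**2])))
```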
3.2. The unit normal. From now on we restrict ourselves to the case m = n − 1 (n ≥ 2). Under this assumption the m-dimensional tangent plane of the surface at the point y = y(x), defined by the equation

E^m(x) = y(x) + y'(x) R_x^m,    (3.2)

possesses a well-determined one-dimensional orthogonal complement; this is the normal to the surface at the point y = y(x). With the latter point as the initial point, lay off two opposite unit normals (normals of length 1). For what follows it is important to fix the orientation of the normal. This can be accomplished by means of the Schmidt orthogonalization process in the following way.

Start at a point y = y(x), determine here the tangent plane (3.2) and fix an arbitrary vector a ≠ 0 that pierces this plane. If this vector, which in the following is to be held constant, is projected onto the tangent plane E^m(x) and if the projection is p = p(x), then the unit normal e = e(x) can be fixed at the point x by

e(x) = (a − p(x))/|a − p(x)|    (|a − p(x)| > 0);

the other is −e(x). Because of the continuity of the derivative y'(x) the tangent plane moves continuously under a continuous displacement of the point x, and hence the projection p = p(x) is also a continuous function of the location. The condition |a − p(x)| > 0 is thus satisfied in the vicinity of the initial point x, and the above formula hence determines e(x) uniquely as a continuous function of x.

A careful analytic proof of this plausible conclusion is easy to give with the aid of the Schmidt orthogonalization process (cf. 3.8, exercise 1). One further sees that if y(x) is differentiable several times, say q times, then the projection p(x), and with it also the normal e(x), is differentiable (q − 1) times.
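The projection construction of the unit normal can be imitated numerically. In the sketch below the sphere patch, the auxiliary vector a, and all names are our own choices:

```python
import numpy as np

def unit_normal(tangent_basis, a):
    """Fix e(x): project the constant vector a onto the tangent plane
    (via an orthonormalized tangent basis) and normalize the residue
    a - p(x); requires |a - p(x)| > 0."""
    q, _ = np.linalg.qr(np.column_stack(tangent_basis))
    p = q @ (q.T @ a)                  # projection p(x) onto the tangent plane
    r = a - p
    return r / np.linalg.norm(r)

u, v = 0.3, 1.1                        # point on the unit sphere patch
Yu = np.array([-np.sin(u)*np.cos(v), -np.sin(u)*np.sin(v), np.cos(u)])
Yv = np.array([-np.cos(u)*np.sin(v),  np.cos(u)*np.cos(v), 0.0])
a = np.array([1.0, 1.0, 1.0])          # any vector not lying in the plane

e = unit_normal([Yu, Yv], a)
# for the unit sphere the normal at y is ±y itself:
y = np.array([np.cos(u)*np.cos(v), np.cos(u)*np.sin(v), np.sin(u)])
print(np.allclose(np.abs(e @ y), 1.0))
```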
In the following we fix at an arbitrary point y = y(x₀) of the surface a definite direction of the unit normal as the "positive" one. The positive direction of the normal is then defined at another point x₁ by continuation along an arc x₀ x₁.
3.3. The second fundamental form. Henceforth we are going to assume that the function y(x) is twice continuously differentiable in the region G_x (m = n − 1). By the above, the normal e(x) is then continuously differentiable once.

At each point y = y(x) of the surface the tangent space and the
positive unit normal span the entire embedding space R_y^n. Consequently, for each pair of parameter differentials h, k the vector y''(x) h k can be decomposed in a unique way into two orthogonal components, the tangential and the normal.

We shall treat the first component later. The second component is the orthogonal projection of the vector y''(x) h k onto the positive unit normal, and is therefore equal to

(y''(x) h k, e(x)) e(x) = L(x) h k e(x).
Here

L(x) h k = (y''(x) h k, e(x))    (3.3)

is the second fundamental form of the surface, first introduced by Gauss. It is, like the first fundamental form, a real, bilinear and, because of the symmetry of the second derivative, symmetric function of the differentials h and k.
3.4. The operator Γ(x) and the derivative formula of Gauss. We now investigate the tangential component of the vector y''(x) h k, that is, the orthogonal projection of this vector onto the tangent space y'(x) R_x^m. According to the above this projection, on the one hand, is equal to the difference

y''(x) h k − L(x) h k e(x),

from which it can be seen that it depends linearly and symmetrically on the parameter differentials h and k. On the other hand, as a vector of the tangent space, it has a unique preimage in the parameter space which at the given point x likewise will be a bilinear and symmetric function of h and k and hence can be denoted by

Γ(x) h k = Γ(x) k h.

By this the tangential component of y''(x) h k considered is equal to y'(x) Γ(x) h k. Comparison of both expressions yields the derivative formula of Gauss,

y''(x) h k = y'(x) Γ(x) h k + L(x) h k e(x),    (3.4)

by means of which the decomposition of the second derivative of y(x) into a tangential and a normal component is effected.
3.5. Dependence of the operator Γ(x) on G(x). The bilinear symmetric operator Γ(x) can be computed from the first fundamental form G(x) h k. To see this observe that the unit normal e(x) stands perpendicular to each tangent vector y'(x) l, from which it follows, by means of the above derivative formula, that

(y''(x) h k, y'(x) l) = (y'(x) Γ(x) h k, y'(x) l),

and consequently

G(x) l Γ(x) h k = (y''(x) h k, y'(x) l).

On the other hand, if for brevity we do not indicate the fixed point x and observe the symmetry of the second differential,

(y'' h k, y' l) = G' h k l − (y'' h l, y' k)
= G' h k l − (y'' l h, y' k) = G' h k l − G' l h k + (y'' k l, y' h)
= G' h k l − G' l h k + G' k l h − (y'' h k, y' l).

Consequently

2 (y'' h k, y' l) = G' h k l + G' k l h − G' l h k,

and

G(x) l Γ(x) h k = ½ (G' h k l + G' k l h − G' l h k).     (3.5)
Since the operator G(x) is not degenerate and equation (3.5) holds, for an arbitrarily fixed pair of vectors h, k, for each parameter vector l, it determines the vector Γ(x) h k uniquely.

For the explicit computation of this vector one can, for example, metrize the parameter space at the point x in a euclidean fashion, by using G(x) h k as the fundamental metric form, and construct a coordinate system a_1(x), ..., a_{n−1}(x) which with respect to this metric is orthonormal. Then if Γ^i(x) h k are the components of Γ(x) h k in this coordinate system, we have for i = 1, ..., n − 1

Γ^i h k = G a_i Γ h k = ½ (G' h k a_i + G' k a_i h − G' a_i h k).

If h and k are also expressed in coordinates,

h = Σ_{j=1}^{n−1} ξ^j a_j ,   k = Σ_{l=1}^{n−1} η^l a_l ,

and the coordinate representation (3.1') for G is substituted, one finds

Γ^i h k = Σ_{j,l=1}^{n−1} ξ^j η^l Γ^i a_j a_l

and

Γ^i a_j a_l = ½ (G' a_j a_l a_i + G' a_l a_i a_j − G' a_i a_j a_l).     (3.6)

These are the Christoffel symbols of the first kind.
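As a numerical check on (3.5) in components (an illustrative sketch of ours, not part of the text), the quantities ½(∂_i g_{jl} + ∂_j g_{il} − ∂_l g_{ij}) can be obtained by finite-differencing the first fundamental form of a test surface and compared with (y''(x) h k, y'(x) l) computed directly; the unit sphere serves as the hypothetical test surface.

```python
import numpy as np

def y(u, v):
    # unit-sphere patch used as a concrete test surface (our choice)
    return np.array([np.cos(u) * np.cos(v), np.sin(u) * np.cos(v), np.sin(v)])

d = 1e-4   # finite-difference step

def jac(u, v):
    # columns are the tangent vectors y_u, y_v (central differences)
    yu = (y(u + d, v) - y(u - d, v)) / (2 * d)
    yv = (y(u, v + d) - y(u, v - d)) / (2 * d)
    return np.stack([yu, yv], axis=1)

def metric(u, v):
    J = jac(u, v)
    return J.T @ J          # g_{jl} = (y_j, y_l)

u0, v0 = 0.4, 0.3
dG = [(metric(u0 + d, v0) - metric(u0 - d, v0)) / (2 * d),
      (metric(u0, v0 + d) - metric(u0, v0 - d)) / (2 * d)]   # dG[i] = partial_i g

# formula (3.5) in components: Gamma_{ij,l} = 1/2 (d_i g_{jl} + d_j g_{il} - d_l g_{ij})
Gamma = np.array([[[0.5 * (dG[i][j, l] + dG[j][i, l] - dG[l][i, j])
                    for l in (0, 1)] for j in (0, 1)] for i in (0, 1)])

# compare with G l Gamma h k = (y''(x) h k, y'(x) l), i.e. Gamma_{ij,l} = (y_ij, y_l)
J0 = jac(u0, v0)
y_uu = (y(u0 + d, v0) - 2 * y(u0, v0) + y(u0 - d, v0)) / d**2
y_vv = (y(u0, v0 + d) - 2 * y(u0, v0) + y(u0, v0 - d)) / d**2
y_uv = (y(u0 + d, v0 + d) - y(u0 + d, v0 - d)
        - y(u0 - d, v0 + d) + y(u0 - d, v0 - d)) / (4 * d**2)
second = [[y_uu, y_uv], [y_uv, y_vv]]
D = np.array([[[second[i][j] @ J0[:, l] for l in (0, 1)]
               for j in (0, 1)] for i in (0, 1)])

print(np.allclose(Gamma, D, atol=1e-4))   # True
```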
3.6. The operator A(x) and the differentiation formula of Weingarten. It was remarked above that the existence and continuity of the derivative y''(x) implies the continuous differentiability of the unit normal e(x). To determine the derivative e'(x) we start from the identity

(e(x), e(x)) = 1

and from this, by means of differentiation with the differential dx = h, obtain the equation

(e(x), e'(x) h) = 0.

This identity states that for every h in the parameter space, e'(x) h stands perpendicular to e(x) and therefore lies in the tangent space y'(x) R_x^m. Consequently, this vector has a unique preimage in the parameter space that will likewise depend linearly on h and can thus be denoted by

−A(x) h.

Accordingly, we have

e'(x) h = −y'(x) A(x) h.     (3.7)
This is Weingarten's differentiation formula. In connection with the Gaussian differentiation formula it plays a role in surface theory analogous to that of the Frenet formulas in the theory of curves.

The operator A(x) defines at each point x a linear self-mapping of the parameter space. In order to find the relation of this operator to the fundamental forms G and L, we start from the equation

(y'(x) k, e(x)) = 0,

which holds for x ∈ G_x^m and for each parameter vector k. If we differentiate this identity with respect to x with the differential dx = h, then

(y''(x) h k, e(x)) + (y'(x) k, e'(x) h) = 0.

Here the first term on the left, according to the definition of the second fundamental form, is equal to L(x) h k, while Weingarten's formula yields

−(y'(x) k, y'(x) A(x) h) = −G(x) k A(x) h

for the second term. Consequently,

G(x) k A(x) h = L(x) h k.     (3.8)

This equation holds for an arbitrary fixed h and every k in the parameter space and therefore uniquely determines the vector A(x) h.
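Equation (3.8) determines A concretely: if G and L are written as matrices in some coordinate system, then G A = L, so A = G⁻¹L. A small sketch (our illustration; the sphere data are the hypothetical example):

```python
import numpy as np

# Fundamental forms of the unit sphere at latitude v = 0.3, written as
# matrices in the (u, v) coordinate system, with the outward unit normal.
v = 0.3
G = np.array([[np.cos(v)**2, 0.0],
              [0.0, 1.0]])
L = -G

# G k A h = L h k for all h, k means (as matrices) G A = L, i.e. A = G^{-1} L.
A = np.linalg.solve(G, L)

print(np.allclose((G @ A).T, G @ A))   # True: A is self-adjoint w.r.t. G
print(np.allclose(A, -np.eye(2)))      # True: for this normal, A h = -h everywhere
```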
The coordinate system a_1(x), ..., a_{n−1}(x), introduced in the preceding section, which is orthonormal with respect to G(x) h k, is most naturally chosen to be the principal axis system of the form L(x) h k with respect to G(x) h k. Then, according to what was said in I.6.8,

A(x) a_i(x) = κ_i(x) a_i(x)     (i = 1, ..., n − 1),

where κ_i(x) stands for the eigenvalues of the linear transformation A(x), which because of (3.8) is self-adjoint with respect to G(x) h k. For

h = Σ_{i=1}^{n−1} ξ^i a_i(x),

therefore,

A(x) h = Σ_{i=1}^{n−1} κ_i(x) ξ^i a_i(x).

We shall return to the quantities κ_i(x) in another context.

3.7. The principal curvatures. In the following we investigate the curvature of a surface curve

y = y(x(σ)) = y(σ)
at the point y(x) = y(x(σ)) of the surface. Here σ is the arc length of the surface curve; we assume that the preimage x = x(σ) in the parameter space is twice continuously differentiable. For this choice of the parameter the unit tangent becomes

e_1(σ) = e_1 = y' = y'(x) x',

and therefore

e_1' = y''(x) x' x' + y'(x) x''.

On the other hand, according to Frenet's formulas,

e_1' = κ e_2,

where e_2 = e_2(σ) stands for the principal normal and κ = κ(σ) for the principal curvature ±|y''(σ)| of the surface curve. In the theory of curves we have taken the principal normal in the direction of y''(σ), so that κ(σ) always turns out to be positive. In the present context the orientation is to be taken care of in a different way, namely, so that the angle enclosed by e_2(σ) and the already fixed positive unit normal to the surface, e(x), has a magnitude θ = θ(σ) ≤ π/2, and thus cos θ ≥ 0. From the above equations it now follows, because cos θ = (e_2, e), that

κ cos θ = (κ e_2, e) = (e_1', e) = (y''(x) x' x', e) + (y'(x) x'', e).

Here the second term on the right vanishes, while the first, according to the definition of the second fundamental form, is equal to L(x) x' x'. Therefore,

κ(σ) cos θ(σ) = L(x(σ)) x'(σ) x'(σ).     (3.9)
This formula of Meusnier shows that the curvature κ, for the above normalization of the principal normal e_2, has the sign of L(x) x' x' and otherwise depends only on the direction of the tangent e_1 = y'(x) x' and on the angle θ. We can restrict ourselves to the case θ = 0, where one is dealing with "normal sections", whose osculating plane contains the normal to the surface at the point in question.

In order to better understand the dependence of the curvature κ = L(x) x' x' of such normal sections on the direction of the tangent it is advisable to once more metrize the parameter space at the point x in question by means of the first fundamental form (cf. 3.1), in order to be able to use the principal axes a_1(x), ..., a_{n−1}(x) of the second fundamental form, introduced in the preceding section, as the coordinate system. If

x' = Σ_{i=1}^{n−1} ξ^i a_i(x),

where, since σ is the arc length, G(x) x' x' = 1 and consequently Σ_{i=1}^{n−1} (ξ^i)² = 1, then by (3.8)

κ = L(x) x' x' = Σ_{i=1}^{n−1} L(x) a_i(x) a_i(x) (ξ^i)² = Σ_{i=1}^{n−1} κ_i(x) (ξ^i)²,     (3.10)
where the κ_i(x) are the eigenvalues of the transformation A(x) introduced earlier. In this formula of Euler the eigenvalues

κ_i(x) = G(x) a_i(x) A(x) a_i(x) = L(x) a_i(x) a_i(x)

are the curvatures in the directions

e_i(x) = y'(x) a_i(x)     (i = 1, ..., n − 1),

which at the surface point y(x) form an orthonormal system in the tangent space. These directions are called the principal curvature directions and the quantities κ_i(x) the principal curvatures of the surface at the point y(x). Together with the unit normal e(x) the principal curvature directions form an orthonormal coordinate system

e(x), e_1(x), ..., e_{n−1}(x)

for the embedding space R_y^n; this coordinate system is called the n-frame of the surface at the point y(x).

Besides the principal curvatures κ_1(x), ..., κ_{n−1}(x), the elementary symmetric polynomials in these quantities, in particular the Gaussian curvature

K(x) = κ_1(x) ··· κ_{n−1}(x) = Π_{i=1}^{n−1} G(x) a_i(x) A(x) a_i(x)

of the surface at the point y(x), play a central role in the investigation of the "inner" (independent of the embedding in the surrounding (m + 1)-dimensional space) geometry of the surface. The Gaussian curvature vanishes if and only if one of the principal curvatures is equal to zero.
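For m = 2 the Gaussian curvature is therefore computable as the product of the eigenvalues of A = G⁻¹L, that is, K = det L / det G. A sketch (ours, not from the text) with the fundamental forms of a sphere of radius ρ as the hypothetical data:

```python
import numpy as np

rho, v = 2.0, 0.3   # hypothetical radius and latitude parameter

# fundamental forms of the sphere of radius rho at latitude v (outward normal)
G = rho**2 * np.array([[np.cos(v)**2, 0.0], [0.0, 1.0]])
L = -rho * np.array([[np.cos(v)**2, 0.0], [0.0, 1.0]])

# K = kappa_1 * kappa_2 = det(G^{-1} L) = det L / det G
K = np.linalg.det(L) / np.linalg.det(G)
print(np.isclose(K, 1 / rho**2))   # True: every point of the sphere has K = 1/rho^2
```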
3.8. Exercises. 1. Prove that the projection p(x) of a fixed vector y = a on the tangent plane of a regular m-dimensional surface y = y(x) in the space R_y^n (m < n) is (q − 1) times continuously differentiable if y(x) is continuously differentiable q times.

Hint. Let h_i (i = 1, ..., m) be a (constant) linearly independent set of vectors. At the point x orthonormalize the tangent vectors y'(x) h_i by the Schmidt process:

y_i(x) = (y'(x) h_i − p_i(x)) / |y'(x) h_i − p_i(x)|     (i = 1, ..., m),

where

p_i(x) = Σ_{j=1}^{i−1} (y'(x) h_i, y_j(x)) y_j(x).

From here one can see the claimed differentiability property for y_1(x), p_2(x), y_2(x), ..., p_m(x), y_m(x), and the assertion results from

p(x) = Σ_{i=1}^{m} (a, y_i(x)) y_i(x).
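The hint's construction is exactly the Schmidt orthonormalization followed by an orthogonal projection; a small numerical sketch (ours, with random vectors standing in for the tangent vectors y'(x) h_i):

```python
import numpy as np

rng = np.random.default_rng(0)
tangents = rng.standard_normal((3, 5))   # three (assumed independent) vectors in R^5
a = rng.standard_normal(5)               # the fixed vector to be projected

# Schmidt process: y_i = (t_i - p_i) / |t_i - p_i|, with p_i the projection
# of t_i onto the span of the previously constructed y_1, ..., y_{i-1}.
ys = []
for t in tangents:
    p = sum((t @ y) * y for y in ys)
    y_new = t - p
    ys.append(y_new / np.linalg.norm(y_new))

# p(x) = sum_i (a, y_i) y_i : projection of a onto the tangent plane
proj = sum((a @ y) * y for y in ys)

# a - proj is orthogonal to every one of the original spanning vectors
print(all(abs((a - proj) @ t) < 1e-9 for t in tangents))   # True
```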
2. Prove: A regular m-dimensional surface in a euclidean space of dimension m + 1 = n which has nothing but planar points, so that the second fundamental form identically vanishes, is an m-dimensional subspace (or an m-dimensional portion of one) of the embedding space.

Hint. From Weingarten's formula (3.7) it follows that e'(x) h = −y'(x) A(x) h = 0; for because G(x) k A(x) h = L(x) h k = 0, we have A(x) h = 0. Therefore, e(x) = e_0 is constant, and

L(x) h k = (y''(x) h k, e_0) ≡ 0,

from which follows, first, (y'(x) h, e_0) = const. = 0 and then (y(x) − y(x_0), e_0) = 0.

3. Prove: A sphere

(y(x), y(x)) = ρ²

has only umbilical points, where all of the principal curvatures are equal.

Hint. We have (y'(x) h, y(x)) = 0, and thus y(x) = λ e(x), λ² = ρ², λ = ±ρ. We take, for example,

y(x) = −ρ e(x).
It follows from Weingarten's differentiation formula that

y'(x) h = −ρ e'(x) h = ρ y'(x) A(x) h = y'(x) (ρ A(x) h),

and therefore

A(x) h = (1/ρ) h = κ h.

All of the principal curvatures are therefore equal to 1/ρ, and every point is consequently an umbilical point. Conversely, it can be shown that such a surface is always a sphere (cf. 6.3, exercise 2).
§ 4. Vectors and Tensors

4.1. Parameter transformations. Before we develop differential geometry any further, the transformation character of the quantities introduced up until now is to be investigated as we change from one parameter x to another. For this the surface F^m is to be defined as a continuously differentiable (regular) m-dimensional manifold, in the sense of the discussion in § 1, so that we can disregard its embedding in the space R_y^n. Around the point p on F^m we mark off a portion G_p^m which is homeomorphic to a parameter region G_x^m ⊂ R_x^m, and here we introduce further admissible parameters x̄, .... The vector and the tensor calculus is concerned with the simplest kinds of transformations which quantities given on the manifold F^m can experience when the parameters are changed.
4.2. Invariants. From this standpoint the simplest structures are those which are uniquely defined in G_p^m. The "representatives" F(x), F̄(x̄), ... of such a quantity are obtained from one another through the law of invariance:

F̄(x̄) = F(x(x̄)),   F(x) = F̄(x̄(x)).

If, in particular, the range of F lies in the set of real numbers or, more generally, in a linear space, then such an invariant is also called a scalar.
4.3. Contravariant vectors. In the neighborhood G_p^m of the manifold F^m consider a point p = (x, x̄, ...). Let dx = h be a differential, i.e. a vector in the space R_x^m, and form the differentials dx̄ = h̄ = (dx̄/dx) h, etc., that correspond to h. The equivalence class

dp = (h, h̄, ...)
defines a differential or a contravariant vector at the point p of the manifold.
Such a vector is determined by the pair {p, dp}, where dp is a class of vectors h ∈ R_x^m, h̄ ∈ R_x̄^m, ..., which are related by the law of contravariance

h̄ = (dx̄/dx) h,   h = (dx/dx̄) h̄;

the derivatives here are to be taken at the point p = (x, x̄, ...).

The set of all vectors dp at the fixed point p form an m-dimensional linear space T_p^m if the addition of two vectors d_1 p = (h_1, h̄_1, ...) and d_2 p = (h_2, h̄_2, ...) and the multiplication of dp = (h, h̄, ...) with a real number λ are defined by means of the equations

d_1 p + d_2 p = (h_1 + h_2, h̄_1 + h̄_2, ...),   λ dp = (λ h, λ h̄, ...).

T_p^m is the tangent plane of the surface at the "point of tangency" p. The representative in the parameter region G_x^m of a contravariant vector {p, dp} is the pair {x, dx} or also that vector in the affine space R_x^m that has x as its initial point and x + dx as its end point (cf. I.1.5). This last interpretation corresponds to the elementary geometric view of a "vector on a surface".

On the notation let the following be remarked. Since a point p = (x, x̄, ...) is uniquely determined by any of its representatives and the same holds for a differential dp = (dx, dx̄, ...), the point and the differential can also be denoted by representatives x and dx in an arbitrary one of the admissible parameter spaces R_x^m. Thus if in what follows we speak simply of a contravariant vector h, say, we mean the entire equivalence class¹.

Let (p) be a point set on the surface. If a contravariant vector dp = h = h(p) is associated with each point of (p), the set (h(p)) is called a contravariant vector field (more briefly a contravariant vector) on (p). The notions continuity, differentiability, etc., of such a field are defined in an obvious way with respect to the representatives h(x), h(x̄), ... of h(p) in the admissible parameters x, x̄, ... for p.

¹ To distinguish between the admissible parameters x one can also use an index set (i). A point x on F^m is then defined by the equivalence class (x_i). Correspondingly, a contravariant vector h at the point x is given by a class (h_i), whereby h_i ∈ R_{x_i}^m and the law of contravariance is satisfied: if i and j are two indices in the set (i), then

h_j = (dx_j/dx_i) h_i.

4.4. Covariants. Now let a class of linear operators A, Ā, ... on the parameter spaces R_x^m, R_x̄^m, ... be given at the point p = (x, x̄, ...)
that transforms according to the law of covariance, i.e.,

A = Ā (dx̄/dx),   Ā = A (dx/dx̄),

where the derivatives are to be taken at the point p. The range of the invariant differential

A dp = (A dx, Ā dx̄, ...)

lies in an arbitrarily given, say n-dimensional, linear space. We call such an operator A a covariant of the manifold at the point p.

One sees: If a covariant operates on a contravariant vector, the result is an invariant. This fact can be taken, conversely, as the definition of a covariant. For if the quantity A h is invariant for every contravariant vector h, then, because h = (dx/dx̄) h̄,

Ā h̄ = A h = A (dx/dx̄) h̄;

hence Ā = A (dx/dx̄), i.e., A is covariant. On the other hand, the transformation law of contravariance follows from the definitions of invariance and covariance.
4.5. Covariant vectors. Let A be covariant and h contravariant. Because of the invariance of the differential A h, the covariant A can be thought of as a vector in an mn-dimensional operator space (I.3.5). The case n = 1, where the invariant A h can be taken to be a real number, merits special attention. Then

A = a*

is a linear operator defined in a tangent space T = T_p^m and is thus a vector in the likewise m-dimensional space T* dual to T. Because of the covariance of a*,

a* = ā* (dx̄/dx),   ā* = a* (dx/dx̄);

a* is called a covariant vector.

If a* is a covariant vector and b a contravariant one, the expression a* b is a real invariant that is linear in both arguments a* ∈ T* and b ∈ T. Here, as above, one can think of a* as a linear operator defined on T, or (dually) of b as a linear operator defined on T*. According to the notation we have adopted, the "contravariant argument" b stands to the right of the operator a* in the invariant bilinear form, and the "covariant argument" a* to the left of the "operator" b.
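Numerically (an illustrative sketch of ours, not from the text), the two transformation laws are complementary in exactly the way needed to keep a* b invariant:

```python
import numpy as np

# J stands for the Jacobian dx̄/dx of some admissible parameter change at p;
# h is a contravariant, a a covariant representative in the x-parameters.
rng = np.random.default_rng(1)
J = rng.standard_normal((2, 2)) + 3 * np.eye(2)   # well-conditioned, invertible
h = rng.standard_normal(2)
a = rng.standard_normal(2)                        # row vector a*

h_bar = J @ h                  # law of contravariance: h-bar = (dx̄/dx) h
a_bar = a @ np.linalg.inv(J)   # law of covariance:     a*-bar = a* (dx/dx̄)

print(np.isclose(a @ h, a_bar @ h_bar))   # True: the invariant a* h is unchanged
```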
4.6. Gradient. The simplest example of a covariant vector is the derivative, the gradient a* of a real invariant F, with representatives

a* = dF/dx,   ā* = dF/dx̄,   ....
Conversely, a covariant vector a* in G_p^m is not always a gradient, i.e., the derivative of an invariant F. For that a* must satisfy the integrability condition, i.e., in the neighborhood G_p^m one must necessarily have

rot a* = ∧ da*/dx = 0
(cf. III.3.3). It is clear that this condition is invariant (independent of the choice of the admissible parameter x). For if one carries out a twice differentiable transformation of the parameter x → x̄, then for a covariant A and a contravariant h the invariance

A h = Ā h̄

holds, where h̄ = (dx̄/dx) h. If this is differentiated, corresponding to a differential dx = k, dx̄ = k̄ = (dx̄/dx) k, then

d(A h) = (dA/dx k) h = d(Ā h̄) = (dĀ/dx̄ k̄) h̄ + Ā (d²x̄/dx² k h).

Through a switch of h and k and a subsequent subtraction one obtains, because of the symmetry of the bilinear operator d²x̄/dx²,

(dA/dx k) h − (dA/dx h) k = (dĀ/dx̄ k̄) h̄ − (dĀ/dx̄ h̄) k̄;

i.e., the bilinear form

∧ (dA/dx) h k

is invariant, and the same is thus also true for the equation rot A = ∧ dA/dx = 0.
4.7. Tensors. For the bilinear operator B = ∧ dA/dx the form B h k is invariant, provided the vectors h and k are contravariant. By this property rot A is defined as a covariant tensor of rank 2.

With this we come to the general notion of a tensor. We consider a real (α + β)-linear form, defined at the point p of the manifold F^m, that depends linearly on α contravariant vectors h_1, ..., h_α and β covariant vectors k_1*, ..., k_β*. Such a multilinear operator is to be denoted by A. In analogy with the above, we write the contravariant arguments to the right, the covariant arguments to the left.
Since the form is uniquely given at the point p = (x, x̄, ...), independently of the choice of the parameter, the multilinear form

k_1* ... k_β* A h_1 ... h_α

is thus an invariant (scalar). Under this condition the operator is said to be an α-covariant and β-contravariant tensor at the point p. The sum α + β indicates the rank of the tensor. A vector is thus a tensor of rank one.

We shall also occasionally denote the tensor A by A_α^β, where the upper index indicates the rank of its contravariance, the lower index the rank of its covariance. A somewhat more detailed notation is

○ ... ○ A_α^β ○ ... ○,

with α contravariant "empty places" to the right and β empty covariant places to the left. If this empty form is saturated with α contravariant and β covariant vectors, the result is the above invariant multilinear form, which we can also write

k_1* ... k_β* A_α^β h_1 ... h_α.
For the saturated form the sum of the upper indices is the same as the sum of the lower indices; both are equal to the rank α + β of the tensor.

The transformation law of the representatives A, Ā, ... of a tensor for the different parameters of the point p = (x, x̄, ...) results from the invariance

k̄_1* ... k̄_β* Ā h̄_1 ... h̄_α = k_1* ... k_β* A h_1 ... h_α,

where

h̄_i = (dx̄/dx) h_i,   k̄_j* = k_j* (dx/dx̄).

Using the empty places this law is written

○ ... ○ Ā ○ ... ○ = ○ (dx̄/dx) ... ○ (dx̄/dx) A (dx/dx̄) ○ ... (dx/dx̄) ○.

Conversely, the tensor character of the operator A could be defined by this transformation law for its representatives.
4.8. Transformation of the components. In the usual tensor calculus, the rule for the transformation of a tensor is given with the aid of its components, which are obtained by choosing a coordinate system in each of the parameter spaces. These rules result at once from the above coordinate-free definition of a tensor.

As is customary, we wish to indicate the contravariant components by upper, the covariant by lower indices. Further, we shall omit the summation signs whenever the summation is on an index that appears above as well as below (Einstein's summation convention).

Let a_1, ..., a_m and ā_1, ..., ā_m be two linear coordinate systems for the parameter spaces R_x^m and R_x̄^m, and

dx = dξ^i a_i,   dx̄ = dξ̄^i ā_i

two corresponding parameter differentials, so that

dx̄ = (dx̄/dx) dx,   dx = (dx/dx̄) dx̄.

Then

dx̄ = (∂ξ̄^j/∂ξ^i) dξ^i ā_j = dξ̄^j ā_j,

and consequently

(dx̄/dx) a_i = (∂ξ̄^j/∂ξ^i) ā_j.
Now let l be an arbitrary contravariant vector at the point (x, x̄, ...) with the representatives

l = λ^i a_i,   l̄ = λ̄^i ā_i

in the parameter spaces R_x^m and R_x̄^m. Then according to the above

l̄ = (dx̄/dx) l = λ^i (dx̄/dx) a_i = λ^i (∂ξ̄^j/∂ξ^i) ā_j = λ̄^j ā_j,

and therefore

λ̄^j = (∂ξ̄^j/∂ξ^i) λ^i.

These are the usual transformation formulas for the contravariant vector l = (λ^1, ..., λ^m). Conversely, one obtains from this component representation the coordinate-free law of contravariance: l̄ = (dx̄/dx) l.

Now let l* be a covariant vector at the point (x, x̄, ...). Its representative l* in the space R_x^m is a linear operator. For an l = λ^i a_i of this space, therefore,

l* l = λ^i l* a_i = λ_i a*^i l,

where we have set a*^i l = λ^i and λ_i = l* a_i. The linear forms a*^i (i = 1, ..., m) form the basis dual to the coordinate system a_i, and the λ_i are the coordinates of the vector l* in this dual system. Because

(dx/dx̄) ā_j = (∂ξ^i/∂ξ̄^j) a_i,

it turns out that

λ̄_j = l̄* ā_j = l* (dx/dx̄) ā_j = (∂ξ^i/∂ξ̄^j) λ_i.

According to this the law of covariance for the vector l* = (λ_1, ..., λ_m) is

λ̄_j = (∂ξ^i/∂ξ̄^j) λ_i.
Finally, let A be a general mixed tensor of rank α + β at the point (x, x̄, ...) of the m-dimensional manifold F^m. With the above notations the invariant multilinear form

l_1* ... l_β* A l_1 ... l_α

has the representative

λ_{j_1} ... λ_{j_β} λ^{i_1} ... λ^{i_α} a*^{j_1} ... a*^{j_β} A a_{i_1} ... a_{i_α}

in the parameter space R_x^m, where the quantities

A^{j_1...j_β}_{i_1...i_α} = a*^{j_1} ... a*^{j_β} A a_{i_1} ... a_{i_α}

are the m^{α+β} components of the representative A.

By observing the transformation formulas for contra- and covariant vector components, the customary transformation formulas for the components of the tensor A,

Ā^{j_1...j_β}_{i_1...i_α} = (∂ξ̄^{j_1}/∂ξ^{k_1}) ... (∂ξ̄^{j_β}/∂ξ^{k_β}) (∂ξ^{l_1}/∂ξ̄^{i_1}) ... (∂ξ^{l_α}/∂ξ̄^{i_α}) A^{k_1...k_β}_{l_1...l_α},

result from the invariance of the above multilinear form.

From the coordinate-free tensor definition it follows at once that the transformation formulas for the components are invariant in form, independent of the choice of the coordinate system. If the tensor concept is defined by means of the formulas for the components, this invariance requires further verification.
4.9. Tensor algebra. If A_α^β and B_α^β are two tensors of equal rank, the sum of the saturated empty forms

○ ... ○ A_α^β ○ ... ○ + ○ ... ○ B_α^β ○ ... ○

is a real invariant. If

○ ... ○ C_α^β ○ ... ○

stands for this form, then C_α^β is an α-covariant, β-contravariant tensor, which one defines to be the sum of A_α^β and B_α^β,

C_α^β = A_α^β + B_α^β = B_α^β + A_α^β.

The product λ A_α^β (λ real) is defined correspondingly.

With these definitions the set of all α-covariant and β-contravariant tensors at the point p of the m-dimensional manifold F^m form a linear vector space of dimension m^{α+β}. For α = 0, β = 1 and α = 1, β = 0 this space is the tangent space T or its dual space T*, respectively, at the point p ∈ F^m.

Besides the above linear operations one can also introduce a commutative, associative and distributive tensor product

A_α^β B_γ^δ = C_{α+γ}^{β+δ},

and, in fact, as that tensor whose saturated empty form is equal to the product of the corresponding empty forms of the factors. If in particular the factors are alternating, then the tensor product will in general no longer possess this property. The alternating part of the saturated product

∧ (A_α^β B_γ^δ) = ∧ C_{α+γ}^{β+δ}

is the Grassmann–Cartan exterior product of the tensors A_α^β and B_γ^δ.
4.10. Contraction. With a tensor A_α^β, where α, β ≥ 1, one can associate a tensor A_{α−1}^{β−1} through the process of contraction, which can be defined in a coordinate-free way as follows.

Let us first consider the simplest case α = β = 1, thus a "mixed" tensor A = A_1^1 of rank 2. In the corresponding invariant empty form ○ A ○ fill only one of the empty places, for example the contravariant one (on the right), with a contravariant vector h ∈ T. The "half-saturated" expression A h then defines A as a linear transformation of the tangent space T into itself, for the differential A h is, because of the invariance of the "fully saturated" bilinear form k* (A h) (k* a vector in the space T*), again a contravariant vector.

If A is conceived of in this way as being a linear self-mapping of the tangent space T = T_p^m, then according to exercise 4 in I.5.9 a real number a, namely the trace of A, can be associated with the latter transformation by means of an arbitrary, not identically vanishing, real, alternating form D h_1 ... h_m (h_i ∈ T):

a = Tr A = (1 / D h_1 ... h_m) Σ_{i=1}^{m} D h_1 ... h_{i−1} (A h_i) h_{i+1} ... h_m.

This number is independent of the choice of the auxiliary form D and of the vectors h_i ∈ T. The construction can be repeated for each representative A, Ā, .... The corresponding traces are all equal:

Tr A = Tr Ā.

The proof of this invariance is a simple consequence of the definition of the trace according to exercise 4 in I.5.9 (cf. 4.20, exercise 1). The process of contraction, whereby there is associated with the tensor A_1^1 of rank 2 a real invariant Tr A (a "tensor of rank 0"), consists in the formation of this trace.

The contraction of an arbitrary mixed tensor A_α^β now offers no difficulty. This procedure consists of the elimination of one argument on the right and one on the left of the form ○ ... ○ A ○ ... ○, for example of that one with the index i on the right and j on the left. To this end one considers the expression

○ ... ○ h_j* ○ ... ○ A ○ ... ○ h_i ○ ... ○,

which has only α + β − 2 empty places. This invariant form, provided those α + β − 2 empty places are filled with fixed vectors, defines A as a mixed tensor B = B_1^1 of rank 2 with the two linear arguments h_j* and h_i. If B is now contracted, the result is an invariant form

○ ... ○ A_{α−1}^{β−1} ○ ... ○

with (α − 1) + (β − 1) empty places, and A_{α−1}^{β−1} is the sought-for contracted tensor of rank α + β − 2.
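For a mixed tensor of rank 2 the invariance Tr A = Tr Ā can be checked directly: by 4.15's transformation law the representatives are related by Ā = (dx̄/dx) A (dx/dx̄), a similarity transformation, and similar matrices have equal traces. A sketch (ours, with a random matrix standing in for the Jacobian dx̄/dx):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))            # representative of a mixed tensor
J = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # invertible "Jacobian"

A_bar = J @ A @ np.linalg.inv(J)           # the representative in the x̄-parameters

print(np.isclose(np.trace(A_bar), np.trace(A)))   # True: Tr A is an invariant
```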
4.11. The rotor. We have seen in 4.6 that when A = A(p) is a differentiable covariant vector field defined in a portion G_p^m of the surface F^m and the equivalence relations which determine the points p = (x, x̄, ...) of G_p^m are twice differentiable, then the form

rot A h k = ½ ((dA/dx h) k − (dA/dx k) h),

where h and k are arbitrary contravariant vectors, is invariant. The bilinear operator rot A = ∧ dA/dx is hence a covariant tensor of rank two.

The corresponding holds for the rotor of a covariant tensor A = A_q of rank q. If one sets h̄_i = (dx̄/dx) h_i (i = 1, ..., q + 1), where h_1, ..., h_{q+1} are contravariant vectors, then

Ā h̄_1 ... h̄_i ... h̄_{q+1} = A h_1 ... h_i ... h_{q+1}

(the argument with the index i omitted), and through differentiation with the differential dx = h_i one obtains

(dĀ/dx̄ h̄_i) h̄_1 ... h̄_i ... h̄_{q+1} = (dA/dx h_i) h_1 ... h_i ... h_{q+1}
+ Σ_{j≠i} Ā h̄_1 ... h̄_{j−1} (d²x̄/dx² h_i h_j) h̄_{j+1} ... h̄_i ... h̄_{q+1}.

After multiplication with (−1)^{i−1} / (q + 1) and summation over i = 1, ..., q + 1, the result, in view of the symmetry of the operator d²x̄/dx², is

rot Ā h̄_1 ... h̄_{q+1} = rot A h_1 ... h_{q+1},

which expresses the asserted covariance of the operator rot A = ∧ dA/dx. From this it follows, in particular, that Stokes's formula is invariant with respect to parameter transformations.
4.12. The fundamental metric tensor. Lowering and raising indices. According to the definition of the fundamental symmetric operator G(p) = (G(x), Ḡ(x̄), ...) the expression G(x) h k is invariant:

G h k = Ḡ (dx̄/dx h) (dx̄/dx k) = Ḡ h̄ k̄,

where h and k are arbitrary contravariant vectors. The operator G is thus a (symmetric) covariant tensor of rank two: G = G_2.
212
V. Theory of Curves and Surfaces
From here it follows in particular that the expression

h* = G h

is a covariant vector. Conversely, one can, as a consequence of the Fréchet–Riesz theorem, associate with an arbitrary covariant vector h* a uniquely determined contravariant vector h such that for every contravariant vector k

h* k = G h k.

The dual vectors h* and h, which through G are in one-to-one correspondence, are said to be mutually conjugate; one also speaks of the "covariant or contravariant component" (h* or h) of a vector. The transition from the one component to the other corresponds in the usual tensor calculus to the process of raising and lowering of the indices.

In the equation h* = G h, G can also be interpreted as a linear transformation that maps the "contravariant" parameter space into the dual "covariant" parameter space. This transformation is regular, the reciprocal transformation G^{−1} thus exists, and one has h = G^{−1} h*.
4.13. Volume metric. At the point p = (x, x̄, ...) of the manifold F^m we consider m tangent vectors d_i p = (h_i, h̄_i, ...) (i = 1, ..., m). In order to define the volume dv of the simplex (d_1 p, ..., d_m p) spanned by these vectors take for a fixed p an arbitrary, real, nondegenerate, alternating, fundamental form

dv = D(p) d_1 p ... d_m p.

Since the differentials d_i p are contravariant and the volume is to be invariant,

D(x) h_1 ... h_m = D̄(x̄) h̄_1 ... h̄_m;

the operator D(p) = (D(x), D̄(x̄), ...) is thereby defined as a covariant tensor of rank m. The volume element dv is according to this definition endowed with a sign, corresponding to the orientation of the simplex (d_1 p, ..., d_m p).

The differential dv is dependent on the arbitrary alternating tensor D(p) and uniquely determined only up to an arbitrary real factor. In order to determine this normalization factor in such a way that the volume metric at the point p is connected with the length metric determined by the fundamental tensor G(p) in a euclidean fashion, one proceeds by I.6.10 as follows.

The determinant det(G(x) h_i k_j), where h_i, k_j (i, j = 1, ..., m) are representatives of 2m tangent vectors, is nondegenerate and alternating in the differentials h_i as well as in the differentials k_j. For k_i = h_i this determinant is nonnegative, and it vanishes if and only if the vectors h_1, ..., h_m are linearly dependent. The magnitude of the "locally euclidean" volume element is then (cf. I.6.10) invariantly defined by the expression

|h_1, ..., h_m| = + √(det (G(x) h_i h_j)).

Now if D(p) = (D(x), D̄(x̄), ...) is an arbitrary covariant alternating (nondegenerate) tensor of rank m, the volume element dv is normalized in a locally euclidean fashion by setting

dv = ε(p) D(p) d_1 p ... d_m p,

where for linearly independent d_i p the factor ε(p) (> 0) is

ε(p) = |d_1 p, ..., d_m p| / |D(p) d_1 p ... d_m p|.

If one writes

σ(p) = D(p) d_1 p ... d_m p / |D(p) d_1 p ... d_m p| = ±1,

then dv can also be expressed in the form

dv = σ(p) √(det (G(p) d_i p d_j p)).

If the manifold F^m is embedded by means of a mapping y = y(x) in a euclidean space R_y^n and one defines, as done in this chapter, the fundamental tensor G(x), following Gauss, by

G(x) h k = (y'(x) h, y'(x) k),

then the normalized volume element |dv| coincides with the euclidean volume of the tangent simplex (d_1 y, ..., d_m y) measured in the space R_y^n, whereby d_i y = y'(x) d_i x.
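The last statement can be tested numerically (our sketch, not from the text): for an arbitrary derivative y'(x), the Gram determinant √det G equals the euclidean area of the parallelogram spanned by the tangent vectors (for m = 2; the simplex volume differs only by the fixed factor 1/m!).

```python
import numpy as np

rng = np.random.default_rng(3)
Jy = rng.standard_normal((5, 2))       # a sample derivative y'(x): R^2 -> R^5

G = Jy.T @ Jy                          # Gram matrix = first fundamental form
vol_from_G = np.sqrt(np.linalg.det(G))

# euclidean area of the parallelogram spanned by the two columns of Jy
u, v = Jy[:, 0], Jy[:, 1]
area = np.sqrt((u @ u) * (v @ v) - (u @ v)**2)

print(np.isclose(vol_from_G, area))    # True
```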
The above argument can be repeated without modification if one restricts the consideration to a q-dimensional subspace U_q of the tangent space T_p^m (1 ≤ q ≤ m).
4.14. The second fundamental form. We come to the question of the transformation character of the remaining fundamental quantities in the Gaussian theory of surfaces; the manifold F^m is thus given as an m-dimensional surface, defined by the mapping y = y(p), embedded in a space R_y^n of dimension n = m + 1. In the parameter space R_x^m the second fundamental form has the representation

L(x) h k = (y''(x) h k, e(x)),

where e(x) is the representative of the (invariant) unit normal.
From the invariance

(dy/dx) dx = (dȳ/dx̄) dx̄, according to which dy/dx = (dȳ/dx̄)(dx̄/dx),

it follows by means of differentiation with the differential dx = h that

(d²y/dx² h) k = (d²ȳ/dx̄² h̄) k̄ + (dȳ/dx̄)(d²x̄/dx² h k),

and thus

d²y/dx² h k = d²ȳ/dx̄² h̄ k̄ + (dȳ/dx̄)(d²x̄/dx² h k).     (4.1)

If the second term on the right in these transformation formulas were equal to zero, then d²y/dx² as an operator would be a covariant of rank two. However, this is the case only when the relation x → x̄ is linear. The second derivative d²y/dx² of the invariant y is therefore not a covariant. But in view of the invariance e(x) = ē(x̄) it does follow from the above transformation formulas that

(d²y/dx² h k, e) = (d²ȳ/dx̄² h̄ k̄, ē) + ((dȳ/dx̄)(d²x̄/dx² h k), ē),

and because the second term on the right does vanish,

L h k = L̄ h̄ k̄.

Consequently, the second fundamental operator L is also a covariant tensor of rank two: L = L_2.
4.15. The operators T(a.) and A(r). In order to ascertain the law of transformation for the Christoffel operator 1', we start from the defining formula (3-5):
G(x) i 1'(x) it k=
s
(G'(x) If k 1+ G'(x) k I h - G'(x) l h-k)
If the invariance

G k l = Ḡ k̄ l̄ = Ḡ (dx̄/dx k)(dx̄/dx l)

is differentiated with the differential dx = h, then

G' h k l = Ḡ' h̄ k̄ l̄ + Ḡ (d²x̄/dx² h k) l̄ + Ḡ k̄ (d²x̄/dx² h l).

Thus, according to the above formula for G l Γ h k,

G l Γ h k = Ḡ l̄ Γ̄ h̄ k̄ + Ḡ l̄ d²x̄/dx² h k,

or, because G l Γ h k = Ḡ l̄ dx̄/dx Γ h k,

Ḡ l̄ (dx̄/dx Γ h k − Γ̄ h̄ k̄ − d²x̄/dx² h k) = 0.

Since this holds for every l̄, we have

dx̄/dx Γ h k = Γ̄ h̄ k̄ + d²x̄/dx² h k.   (4.2)
As in the transformation formula (4.1) for the second derivative d²y/dx², a "disturbing term" also turns up here on the right: the operator Γ is not a covariant. We shall presently return to the similarity of these two transformations. The operator A was uniquely determined from the two fundamental operators G and L by means of the formula
G(x) k A(x) h = L(x) h k.

It follows from here that

Ḡ k̄ Ā h̄ = L̄ h̄ k̄ = L h k = G k A h = Ḡ k̄ dx̄/dx A h,

and hence

Ḡ k̄ (Ā h̄ − dx̄/dx A h) = 0,

which yields the transformation formula

Ā h̄ = dx̄/dx A h

for A. With a contravariant vector h and a covariant vector l*, therefore,

l* A h = l̄* Ā h̄

is invariant and real: A is a mixed tensor of rank two: Ā = A.
4.16. Invariance of the principal curvatures. The principal curvature directions e₁, ..., e_{n−1} and the corresponding principal curvatures κ₁, ..., κ_{n−1} of the surface at a point p = (x, x̄, ...) are, as a consequence of their geometric significance, invariant quantities. This can also be verified using the above transformation formulas. Let e_i be one of these unit vectors and κ_i the associated principal curvature; then in the parameter space R_x^m, provided it is metrized at the point x by means of the first fundamental form,

e_i(x) = y'(x) a_i(x)   and   A(x) a_i(x) = κ_i(x) a_i(x)   (|a_i(x)| = 1)
(cf. 3.6–7). If one goes over by means of the mapping x̄ = x̄(x), x = x(x̄) to the parameter space R_x̄^m and if one sets ā_i = dx̄/dx a_i, then the transformation formula for the operator A yields

Ā ā_i = dx̄/dx A a_i = κ_i dx̄/dx a_i = κ_i ā_i,

from which it is clear that the eigenvectors a_i are transformed contravariantly and that the eigenvalues κ_i remain invariant. Then

ē_i = dȳ/dx̄ ā_i = dy/dx a_i = e_i

is also invariant.
Conversely, it would have been possible to deduce the transformation law for A from the invariance of e_i and κ_i and the contravariance of a_i. As a consequence of the invariance of the principal curvatures κ_i, the elementary symmetric polynomials in the κ_i, and thus in particular the Gaussian curvature

K(p) = κ₁(p) ··· κ_{n−1}(p),

are likewise invariant.
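A simple illustration may be appended here (the example is ours, not in the text): for a sphere of radius R, with the unit normal directed toward the center, the operator A(x) is 1/R times the identity in every admissible parameter system, so that

```latex
A(x)\,a = \tfrac{1}{R}\,a \quad (a \neq 0), \qquad
\kappa_1 = \cdots = \kappa_{n-1} = \tfrac{1}{R}, \qquad
K(p) = \kappa_1 \cdots \kappa_{n-1} = R^{-(n-1)},
```

in agreement with the invariance just established: however the sphere is parametrized, the κ_i and K come out the same.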
4.17. The covariant derivative. We turn back to the transformation formulas (4.1) and (4.2) derived above for the second derivative d²y/dx² and for the operator Γ. By (4.1) we have for d²y/dx²

d²y/dx² h k = d²ȳ/dx̄² h̄ k̄ + dȳ/dx̄ d²x̄/dx² h k,

and it follows from the transformation formula (4.2) for Γ, when the operator dȳ/dx̄ is applied to both sides, that

dy/dx Γ h k = dȳ/dx̄ Γ̄ h̄ k̄ + dȳ/dx̄ d²x̄/dx² h k.

Subtracting these equations therefore yields the invariance
(d²y/dx² − dy/dx Γ) h k = (d²ȳ/dx̄² − dȳ/dx̄ Γ̄) h̄ k̄,

which also follows directly from Gauss's derivative formula (3.4)

y''(x) h k − y'(x) Γ(x) h k = L(x) h k e(x),

in view of the invariance of the right hand side. Hence the operator

d²y/dx² − dy/dx Γ

is a covariant tensor of rank two.
This holds not only for the derivative operator dy/dx, but for any simply covariant operator A on the manifold F^m. In fact, if k stands for the representative of a contravariant vector in the parameter space R_x^m, then

A k = Ā k̄ = Ā dx̄/dx k,

from which it follows by means of differentiation with the differential dx = h that

dA/dx h k = dĀ/dx̄ h̄ k̄ + Ā d²x̄/dx² h k.
On the other hand, the transformation formula (4.2) for Γ yields

A Γ h k = Ā dx̄/dx Γ h k = Ā Γ̄ h̄ k̄ + Ā d²x̄/dx² h k,

and the asserted invariance

(dA/dx − A Γ) h k = (dĀ/dx̄ − Ā Γ̄) h̄ k̄

follows by subtraction. We call the covariant operator of rank two

'A = A' − A Γ   (4.3)

the covariant derivative 'A = 'A(x) of the covariant A (with respect to an operator Γ which obeys the transformation law (4.2)). If A is in particular a covariant vector field l* = l*(x), then

'(l*) h k = ((l*)' − l* Γ) h k

is a real invariant. The operator '(l*) is thus a covariant tensor of rank two.
The covariant derivative of a contravariant vector can be defined in a corresponding fashion. If k = k(x) is the representative of such a field, then with a covariant vector l*

l* k = l̄* k̄ = l̄* dx̄/dx k.

If this equation is differentiated with the differential dx = h, then

l̄* dk̄/dx̄ h̄ = l̄* d²x̄/dx² h k + l̄* dx̄/dx dk/dx h.

On the other hand, according to the transformation formula (4.2) for Γ,

l̄* Γ̄ k̄ h̄ = l* Γ k h − l̄* d²x̄/dx² k h,
and adding these equations yields the invariance

l* (dk/dx + Γ k) h = l̄* (dk̄/dx̄ + Γ̄ k̄) h̄,

by which the operator 'k = 'k(x), the covariant derivative of the contravariant vector k, defined by

'k = k' + Γ k,   (4.4)

is a mixed tensor of rank two. The covariant derivative of an invariant is by definition equal to the ordinary derivative. The above definitions can be generalized to tensors of arbitrary rank, and we shall come back to this in exercise 1 of 7.7. Thus, for example, the covariant derivatives of the tensors G and L are covariant tensors of rank three, and the corresponding linear forms
'G(x) h k l = G'(x) h k l − G(x) k Γ(x) h l − G(x) l Γ(x) h k,

'L(x) h k l = L'(x) h k l − L(x) k Γ(x) h l − L(x) l Γ(x) h k

are real invariant forms, where, by the way, the first as a consequence of (3.5) vanishes identically. The second, as we shall see in 6.1, is symmetric not only in k and l, but also in h and k. As a second example, we consider the mixed tensor A of rank two. From the transformation law for this operator, according to which for each contravariant k and covariant l*

l* A k = l̄* Ā k̄ = l̄* dx̄/dx A k,
l* A k = l* Ak = t*d,,dxk, dx dx it follows by differentiation with the differential dx = h that
l*aA hk=1* d.rdx
TV
dx-
But on the other hand, by (4.2)
l*1'hAk=-l*`OxhAk+1*1'h Ak, dr
-l*AI'hk=-1*Ad Zhk-1*."tl'hk, and the addition of these three equations yields the invariance of the form
where the covariant derivative
o'Aoc -zc A'oo +c 1'oAo -cA1'oc is a doubly covariant and simply contravariant tensor of rank three. This operator is also symmetric in the contravariant arguments h and k.
It is easily verified that the above coordinate-free definitions of covariant differentiation coincide, after the introduction of coordinates, with the usual definitions.
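For comparison with the usual definitions, the coordinate expressions of (4.3) and (4.4) may be recalled (classical index notation, with summation over repeated indices; this notation is not otherwise used in the text):

```latex
('l^{*})_{hk} = \partial_h\, l^{*}_{k} - \Gamma^{j}_{hk}\, l^{*}_{j},
\qquad
('k)^{i}_{\;h} = \partial_h\, k^{i} + \Gamma^{i}_{hj}\, k^{j}.
```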
4.18. The divergence. Let u(p) be a contravariant vector field which is differentiable on the region G^m ⊂ F^m. As in 4.13, we make use of a real, differentiable, alternating and invariant fundamental form D(p) d₁p ... d_m p, whereby the differentials d_i p vary in the tangent space T = T_p. Because of the contravariance of u(p) the alternating differential form of rank m − 1

D(p) u(p) d₁p ... d_{m−1} p

is an invariant. From that it follows that its rotor

rot (D(p) u(p)) d₁p ... d_m p

is an invariant alternating differential form of rank m. Referring to the definition of the divergence (cf. III.2.10) we now define this quantity to be the rotor density

div u(p) = (rot (D(p) u(p)) d₁p ... d_m p) / (D(p) d₁p ... d_m p).

According to this definition div u is given as a scalar.
In particular, according to 4.13, the fundamental form D can be chosen in a "locally euclidean" way by means of the fundamental metric form G. Using the extended definition of the operator rot (cf. III.2.6) one also arrives at an extension of div u that does not necessarily presume the differentiability of u.
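In coordinates, with the locally euclidean choice of D, the rotor density just defined reduces to the classical expression (standard formula, quoted here only for orientation):

```latex
\operatorname{div} u = \frac{1}{\sqrt{\det G}}
\sum_{i=1}^{m} \frac{\partial}{\partial x^{i}}\!\left(\sqrt{\det G}\; u^{i}\right).
```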
4.19. The Laplace operator. If the differentiable vector field is covariant, u(p) = u*(p), the technique of 4.18 cannot be used directly for the formation of the divergence. But if the manifold F^m is metrized by a Gaussian tensor G(p), then by the procedure of 4.12 the index can be raised and the given field u* replaced by the dual contravariant field u(p), which is uniquely determined by the equations

u*(p) = G(p) u(p),   u(p) = G⁻¹(p) u*(p).

One then defines

div u*(p) = div u(p) = div (G⁻¹(p) u*(p)).

This technique can be used for the definition of the Laplace differential operator. Let f(p) be a real twice differentiable invariant. The gradient of this quantity, which is defined by the class of representatives

df/dx

as a covariant vector, has by the above definition a well-defined divergence, and this quantity

Δf = div df/dx

is defined to be the generalized Laplace operator (Beltrami operator) of f. According to this definition Δf is an invariant (in this regard cf. exercise 3 in 4.20). By the extended definition of the rotor and of the divergence, the operator Δ can also be defined without assuming f to be twice differentiable.
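In coordinates the two definitions just given combine into the classical Laplace–Beltrami expression (standard formula, stated here only for comparison):

```latex
\Delta f = \operatorname{div}\!\left(G^{-1}\,\frac{df}{dx}\right)
= \frac{1}{\sqrt{\det G}} \sum_{i,j=1}^{m}
\frac{\partial}{\partial x^{i}}\!\left(\sqrt{\det G}\; g^{ij}\,
\frac{\partial f}{\partial x^{j}}\right),
\qquad (g^{ij}) = G^{-1}.
```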
4.20. Exercises. 1. Prove the invariance of the trace for a mixed tensor A of rank 2.

Hint. If A = Ā is interpreted as a linear transformation of the tangent space T, then its trace is defined (cf. I.5.9, exercise 4) as

Tr A = (1 / (D h₁ ... h_m)) Σ_{i=1}^m D h₁ ... h_{i−1} (A h_i) h_{i+1} ... h_m;

this expression is independent of the vectors h₁, ..., h_m as well as of the choice of the real nondegenerate alternating form D. We choose h₁, ..., h_m to be contravariant and fix D arbitrarily. In order to form the trace, observe first that A follows the transformation law Ā h̄ = dx̄/dx A h (h a contravariant vector). To form the trace we use the nondegenerate alternating form

D̄ h̄₁ ... h̄_m = D̄ (dx̄/dx h₁) ... (dx̄/dx h_m) = D h₁ ... h_m,

which is permitted. With this we have

Tr Ā = (1 / (D̄ h̄₁ ... h̄_m)) Σ_{i=1}^m D̄ h̄₁ ... h̄_{i−1} (Ā h̄_i) h̄_{i+1} ... h̄_m
     = (1 / (D h₁ ... h_m)) Σ_{i=1}^m D h₁ ... h_{i−1} (A h_i) h_{i+1} ... h_m = Tr A.
2. Show that the covariant derivative of the first fundamental tensor vanishes:

'G(x) = 0.

3. Show that

Δf = div df/dx

is invariant with respect to transformations of the parameter.
§ 5. Integration of the Derivative Formulas

5.1. Formulation of the problem. We turn back to the theory of surfaces and in particular to the differentiation formulas of Gauss and Weingarten. For a twice differentiable m-dimensional surface F^m: y = y(x) (x ∈ G_x^m, y ∈ R^n, n = m + 1) it follows from (3.4) and (3.7) that in every admissible parameter space

y''(x) h k = y'(x) Γ(x) h k + L(x) h k e(x),   (5.1)
e'(x) h = −y'(x) A(x) h,   (5.2)

where the operators Γ and A are uniquely determined by means of the formulas (cf. (3.5), (3.8))

G(x) l Γ(x) h k = ½ (G'(x) h k l + G'(x) k l h − G'(x) l h k),   (5.3)

G(x) k A(x) h = L(x) h k,   (5.4)

and by the fundamental Gaussian forms

G(x) h k = (y'(x) h, y'(x) k),   (5.5)

L(x) h k = (y''(x) h k, e(x));   (5.6)
h, k, l stand for representatives of three arbitrary contravariant vectors. As in the theory of curves, we shall now reverse the problem and ask:

Suppose two sufficiently differentiable covariant and symmetric tensors G and L of rank two are given on a sufficiently differentiable m-dimensional (m > 1) manifold F^m, whereby G is to be positive definite. Under which additional necessary and sufficient conditions does there then exist a regular m-dimensional surface in a euclidean space R_y^n of dimension n = m + 1 with the prescribed fundamental tensors G and L?

That certain additional integrability conditions are necessary is clear, based on the theory of differential equations given in Chapter IV, since now m > 1, and we are dealing with a system of partial differential equations. We shall in what follows set up these necessary and sufficient conditions, integrate the differentiation formulas and show that up to a translation and an orthogonal transformation of the space R_y^n the surface we seek is uniquely determined by G and L.
5.2. Summary of the derivative formulas. In order to obtain a direct connection to the existence theorem in IV.2, we first summarize the differentiation formulas in one single differential equation.

For this we consider the linear space R^{mn} (of dimension mn = m(m + 1)) of all linear operators z that map the m-dimensional manifold F^m into the space R_y^n, and then form the product space R^{mn} × R_y^n, whose elements consist of all ordered pairs of vectors

u = [z, y].

Here u₁ = u₂ precisely when z₁ = z₂ and y₁ = y₂; further

u₁ + u₂ = [z₁ + z₂, y₁ + y₂]   and   λ u = [λ z, λ y].

With these definitions for the linear relations the product space is linear and its dimension equal to m n + n = (m + 1) n = n² (cf. I.1.6, exercises 6–7); for this reason we denote it by R^{n²}.

Assuming there exists an invariant y(p) on F^m with the prescribed fundamental tensors G(p) and L(p), which we now assume to be twice continuously differentiable, let z(x) = y'(x) and e(x) be the representatives of the derivative y'(p), which is a covariant, and of the invariant unit normal e(p) in the parameter region G_x^m of the manifold F^m. Then the vector function

u(x) = [z(x), e(x)],   (5.7)
according to the differentiation formulas (5.1) and (5.2), satisfies the differential equation

u'(x) h = [z'(x) h, e'(x) h] = [z(x) Γ(x) h + e(x) L(x) h, −z(x) A(x) h].

Here for each x ∈ G_x^m the right hand side is linear in h as well as in u, and u = u(x) therefore satisfies the linear differential equation

du = B(x) dx u,   (5.8)

where

B(x) h u = B(x) h [z, y] = [z Γ(x) h + y L(x) h, −z A(x) h].   (5.9)

Further, since the prescribed tensors G and L are twice continuously differentiable, the operators Γ and A, because of the formulas (5.3) and (5.4), and hence also the bilinear operator B, are continuously differentiable once. Conversely, if u(x) is a solution of the differential equation (5.8), it is twice differentiable, and the same holds for the quantities z(x) and e(x) uniquely determined from (5.7), which according to the definition of B(x) then satisfy the differentiation formulas

z'(x) h = z(x) Γ(x) h + e(x) L(x) h,   e'(x) h = −z(x) A(x) h.   (5.10)

The integration of the differentiation formulas (5.1) and (5.2) is now reduced to the solution of the above linear differential equation (5.8) for u = u(x).
The existence theorem in IV.2.1 provides necessary and sufficient conditions for the integration, which can then be related by means of the definition of the operator B to the operators Γ, A and L and thus ultimately to G and L. We shall return to these integrability conditions later and in this connection extract the following from the existence theorem: If the integrability conditions hold in the region G_x^m, then the differential equation (5.8) for u = u(x) has one and only one continuously differentiable solution u(x) = [z(x), e(x)] which at an arbitrarily given point x₀ ∈ G_x^m assumes an arbitrarily prescribed value u(x₀) = u₀ = [z₀, e₀]. Transferred to the linear operator z and the vector e this means that the equations (5.10) for the derivatives then have solutions z(x) and e(x) that are uniquely determined if one arbitrarily prescribes the linear operator z(x₀) = z₀ and the vector e(x₀) = e₀.
5.3. Construction of the surface from z(x) and e(x). If there exists a surface on the manifold F^m with the derivative y'(x) = z(x) and the unit normal e(x) in the parameter region G_x^m, then for each x in the region G_x^m and for each h in the space R_x^m we must have

C(x) h = (z(x) h, e(x)) = 0,   ε(x) = (e(x), e(x)) = 1.   (5.11)

Since the first fundamental form of the surface is prescribed and equals G(x) h k, we further have

G̃(x) h k = (z(x) h, z(x) k) = G(x) h k.   (5.11')

Because

(y''(x) h k, e(x)) = (z'(x) h k, e(x)) = (z(x) Γ(x) h k + e(x) L(x) h k, e(x)) = L(x) h k,

the surface also has the prescribed second fundamental form. Hence the initial operator z(x₀) = z₀ and the initial vector e(x₀) = e₀ must be chosen so that

(e₀, e₀) = 1,   (z₀ h, e₀) = 0,   (z₀ h, z₀ k) = G(x₀) h k.   (5.12)
These initial conditions are satisfied if first an arbitrary unit vector in the euclidean space R^n is taken for e₀ and then, for z₀, an arbitrary operator that maps the parameter space R_x^m, metrized with G(x₀) h k, orthogonally onto the m-dimensional subspace of the space R^n orthogonal to e₀ (n = m + 1). We claim that the quantities G̃(x), C(x) and ε(x) then satisfy the above identities (5.11) and (5.11') not only for x = x₀, but for every x ∈ G_x^m.
In fact, according to the differentiation formulas (5.10),

G̃' h k l = (z' h k, z l) + (z k, z' h l) = G̃ l Γ h k + G̃ k Γ h l + L h k C l + L h l C k,

C' h k = (z' h k, e) + (z k, e' h) = C Γ h k + L h k ε − G̃ k A h,

ε' h = 2 (e' h, e) = −2 C A h.

But the functions G(x) h k, C̄(x) ≡ 0, ε̄(x) ≡ 1 also satisfy the same system of linear differential equations. For according to equations (5.3) and (5.4), which determine the operators Γ and A,

G' h k l = G l Γ h k + G k Γ h l,   L h k − G k A h = 0,

and consequently

G' h k l = G l Γ h k + G k Γ h l + L h k C̄ l + L h l C̄ k,

C̄' h k = C̄ Γ h k + L h k ε̄ − G k A h,

ε̄' h = −2 C̄ A h.
tensor G. In this product space one obtains as the equivalent of the above system one single linear differential equation for the quantity [e(x), C(x),G(x)) or for [e(x), C(x), G(x)1, to which the existence theorem
in IV.2.1 can be applied. In particular, the uniqueness of the solution for a given initial value at xo follows from this theorem. Now since according to the choice of ee andzo G(xo) = G (xo)
S (xo) = b (xo) - 0 ,
,
f. (Y0) = e(xo) = 1 ,
then in G' (z(x) h, z(x) k) = G(x) h k
,
(z(x) h, e(x)) = 0 ,
(e(x), e(x)) = 1
,
and the claim is proved. As already mentioned it follows from here that (z'(x) It k, e(x)) = L(x) h k
.
The construction of a surface y = y(x) with the prescribed fundamental forms now offers no difficulties. It only remains to integrate the simple differential equation

y'(x) h = z(x) h.   (5.13)

Because of the symmetry of the operators Γ and L,

z' h k = z Γ h k + e L h k = z' k h,
and the integrability condition is consequently satisfied. Since z(x) is twice continuously differentiable, it follows from the general theory (cf. III.3.3) that equation (5.13) has a solution y(x) which is three times continuously differentiable and which is uniquely determined if the point y(x₀) = y₀ in R_y^n is prescribed arbitrarily. Then according to the above, for h ∈ R_x^m,

(y' h, e) = (z h, e) = 0,   (e, e) = 1,
and e = e(x) is therefore the unit normal to the surface y = y(x) with the prescribed fundamental forms

G h k = (z h, z k) = (y' h, y' k),   L h k = (z' h k, e) = (y'' h k, e).

Observe that this surface, because of the regularity of the operator z(x₀) = z₀, is regular at least in a sufficiently small neighborhood of the point y₀ = y(x₀).
The above consideration was restricted to a region G_x^m in a given parameter space R_x^m. One obtains the invariant surface y = y(p) on the m-dimensional manifold F^m if for every admissible parameter transformation x = x(x̄), x̄ = x̄(x), following the law of invariance, ȳ(x̄) and ē(x̄) are defined by

ȳ(x̄) = y(x(x̄)),   ē(x̄) = e(x(x̄)).

Since G and L were given as covariant tensors of rank two, the relations

(y'(x) h, y'(x) k) = G(x) h k,   (y''(x) h k, e(x)) = L(x) h k

then hold however the parameter x is chosen.
5.4. Discussion of uniqueness. We refer to what was said in 3.6–7. Let

κ_i = κ_i(x₀)   (i = 1, ..., n − 1)

stand for the eigenvalues of the linear transformation A(x₀) and let a_i = a_i(x₀) be the corresponding orthonormal eigenvectors with respect to G(x₀) h k as fundamental metric form. They are thus quantities which are uniquely determined by the prescribed fundamental forms

G(x) h k,   L(x) h k = G(x) h A(x) k.

The initial operator z₀ = y'(x₀) maps the eigenvectors a_i to the principal curvature directions

e_i = e_i(x₀) = z₀ a_i   (i = 1, ..., n − 1)

at the point y₀ = y(x₀) of the surface which has been constructed. Together with the arbitrarily prescribed unit normal e₀ = e(x₀) these principal curvature directions constitute an orthonormal coordinate system

e₀, e₁, ..., e_{n−1}

at the point y₀, the n-frame of the constructed surface at this point.
Now if besides the unit normal e₀ the principal curvature directions e₁, ..., e_{n−1} are arbitrarily prescribed in a sequence corresponding to the somehow ordered eigenvalues κ_i, the initial operator z₀ is thereby uniquely determined; for there exists precisely one linear mapping z₀ of R_x^m into the orthogonal complement of e₀ such that z₀ a_i = e_i for i = 1, ..., n − 1. If in addition the point y₀ = y(x₀) is fixed arbitrarily, then the solutions of the integrated differential equations, and therefore also the surface constructed in the previous section, are uniquely determined by y₀ and by the n-frame given here. We conclude:

A surface is uniquely determined by its fundamental tensors G and L up to a translation and an orthogonal transformation of the embedding space.
5.5. The main theorems of the theory of surfaces. The integrability conditions for the differentiation formulas (5.10):

z'(x) h = z(x) Γ(x) h + e(x) L(x) h,   e'(x) h = −z(x) A(x) h,

or for the linear differential equation (5.8) equivalent to this system:

u'(x) h = B(x) h u(x),

are still to be set up. Here u = u(x) varies in the n²-dimensional product space R^{n²} = R^{mn} × R^n, and the bilinear operator B(x) is defined in G_x^m by (5.9):

B(x) h u = B(x) h [z, y] = [z Γ(x) h + y L(x) h, −z A(x) h].

If, as above, the fundamental tensors G and L are taken to be twice continuously differentiable, then Γ and A, and as a consequence also B, are continuously differentiable once in G_x^m. The existence theorem in IV.2.1 then states the following:

In order that the linear differential equation (5.8) possess a solution which is uniquely determined by the initial value u₀ = [z₀, e₀] which is arbitrarily prescribed at the arbitrary point x₀ of the region G_x^m, it is necessary and sufficient that the equation

R(x) h k u = ((B'(x) h k − B(x) h B(x) k) − (B'(x) k h − B(x) k B(x) h)) u = 0   (5.14)

be satisfied in G_x^m for each pair of tangent vectors h, k and each u in the product space R^{n²}.
Here according to the definition of the linear operator B

B' h k u = [z Γ' h k + y L' h k, −z A' h k]

and

B h B k u = [z Γ k Γ h + y L k Γ h − z A k L h, −z Γ k A h − y L k A h].

Thus if the integrability condition is to hold for every u, i.e., for every linear operator z ∈ R^{mn} and every vector y ∈ R^n, then necessarily

Γ' h k l − Γ k Γ h l + A k L h l − Γ' k h l + Γ h Γ k l − A h L k l = 0,   (5.15a)

L' h k l − L k Γ h l − L' k h l + L h Γ k l = 0,   (5.15b)

Γ k A h − A' h k − Γ h A k + A' k h = 0,   (5.15c)

L k A h − L h A k = 0,   (5.15d)
where h, k, l are representatives of contravariant vectors.

Before we go on to the analysis of these integrability conditions, we wish to show directly by means of the differentiation formulas (5.1) and (5.2) that they are in any case necessary. If our problem has a solution, assuming the prescribed tensors G and L to be twice continuously differentiable, then z(x) will be twice and y(x) therefore three times continuously differentiable. From Gauss's differentiation formula (5.1):

y''(x) k l = y'(x) Γ(x) k l + L(x) k l e(x)
it then follows by differentiation with the parameter differential dx = h that

y''' h k l = y'' h Γ k l + y' Γ' h k l + L k l e' h + L' h k l e,

which in view of the two differentiation formulas yields the following decomposition of y''' h k l into tangential and normal components:

y''' h k l = y' (Γ' h k l + Γ h Γ k l − A h L k l) + (L h Γ k l + L' h k l) e.

Here the left hand side is symmetric in h and k, and hence so is the right. Further, since y'(x) is regular, this yields the above equations (5.15a) and (5.15b).
If one differentiates Weingarten's differentiation formula (5.2):

e'(x) k = −y'(x) A(x) k,

the result is

e'' h k = −y'' h A k − y' A' h k,

which in view of Gauss's differentiation formula (5.1) implies that

e'' h k = −y' (Γ h A k + A' h k) − L h A k e.

Both sides here are symmetric in h and k, and this gives equations (5.15c) and (5.15d).
By the above, there are apparently four integrability conditions imposed on the tensors G and L, which by means of formulas (5.3) and (5.4) define the operators Γ and A. If these conditions are to be
compatible and our problem to have any solution at all, these must reduce to at most two independent conditions. That is in fact the case. First, by formula (5.4),

L h A k = G A k A h,

and condition (5.15d) is as a consequence of the hypothesized symmetry satisfied with no further ado. According to the same formula (5.4),

G l A k = L k l,

from which

G l A' h k = L' h k l − G' h l A k = L' h k l − G' h A k l

follows by differentiation. Further, formula (5.3) with A k instead of k yields

G l Γ h A k = ½ (G' h A k l + G' A k l h − G' l h A k),

and by adding these equations one obtains

G l (A' h k + Γ h A k) = L' h k l − ½ (G' h A k l + G' l h A k − G' A k l h).

Here the subtrahend on the right is, according to formulas (5.3) and (5.4), equal to

G A k Γ h l = L k Γ h l,

and therefore

G l (A' h k + Γ h A k) = L' h k l − L k Γ h l.

This relation shows at once that equations (5.15b) and (5.15c) are equivalent.
According to this, the necessary and sufficient integrability conditions are reduced to two, namely

Γ'(x) h k l + Γ(x) h Γ(x) k l − Γ'(x) k h l − Γ(x) k Γ(x) h l = L(x) k l A(x) h − L(x) h l A(x) k,   (5.16a)

L'(x) h k l − L(x) k Γ(x) h l = L'(x) k h l − L(x) h Γ(x) k l.   (5.16b)

These are the fundamental equations of the theory of surfaces. The first is the Gauss–Codazzi formula, the second the Codazzi–Mainardi formula.
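In classical index notation (not used elsewhere in the text, and subject to the usual sign conventions) the two fundamental equations read:

```latex
\partial_h \Gamma^{i}_{kl} - \partial_k \Gamma^{i}_{hl}
+ \Gamma^{i}_{hm}\Gamma^{m}_{kl} - \Gamma^{i}_{km}\Gamma^{m}_{hl}
= L_{kl}\,A^{i}_{\;h} - L_{hl}\,A^{i}_{\;k},
\qquad
\partial_h L_{kl} - \Gamma^{m}_{hl}\,L_{km}
= \partial_k L_{hl} - \Gamma^{m}_{kl}\,L_{hm}.
```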
§ 6. Theorema Egregium

6.1. The curvature tensors. We are going to analyze the fundamental equations (5.16a) and (5.16b) more carefully, and start with
formula (5.16b), where we now write h₁, h₂, h₃ instead of h, k, l. If one subtracts

L(x) h₃ Γ(x) h₁ h₂ = L(x) h₃ Γ(x) h₂ h₁

from both sides, the covariant derivative can be used to write this equation briefly as

'L(x) h₁ h₂ h₃ = 'L(x) h₂ h₁ h₃.   (6.1)

Since the real form on the left

'L(x) h₁ h₂ h₃ = L'(x) h₁ h₂ h₃ − L(x) h₂ Γ(x) h₁ h₃ − L(x) h₃ Γ(x) h₁ h₂

is obviously also symmetric in h₂ and h₃, the Codazzi–Mainardi equation is equivalent to the following statement:

The covariant derivative of the second fundamental tensor L is symmetric.
We go on to formula (5.16a), and for short we write it

R₃(x) h₁ h₂ h₃ = L(x) h₂ h₃ A(x) h₁ − L(x) h₁ h₃ A(x) h₂,   (6.2)

where

R₃(x) h₁ h₂ h₃ = Γ'(x) h₁ h₂ h₃ + Γ(x) h₁ Γ(x) h₂ h₃ − Γ'(x) h₂ h₁ h₃ − Γ(x) h₂ Γ(x) h₁ h₃,   (6.3)

that is, twice the alternating part with respect to h₁ and h₂ of the form (Γ'(x) h₁ h₂ + Γ(x) h₁ Γ(x) h₂) h₃. From the transformation formulas for A and L it follows at once that the operator

R₃(x) = R̄₃(x̄)

is a triply covariant and simply contravariant tensor of rank four; this is the mixed Riemannian curvature tensor.

Observe that this tensor is uniquely determined by the first fundamental tensor G alone. For the expression (6.3) for R₃ contains only the operators Γ and Γ', and these can, based on formula (5.3) which uniquely determines Γ, be computed from G, G' and G''.
We introduce a fourth arbitrary contravariant vector h₄. If one then sets

R₄(x) h₁ h₂ h₃ h₄ = G(x) h₄ R₃(x) h₁ h₂ h₃,   (6.4)

the Gauss–Codazzi formula can, as a consequence of the equations

G(x) h₄ A(x) h₁ = L(x) h₁ h₄,   G(x) h₄ A(x) h₂ = L(x) h₂ h₄,

be brought into the equivalent form

R₄(x) h₁ h₂ h₃ h₄ = L(x) h₁ h₄ L(x) h₂ h₃ − L(x) h₁ h₃ L(x) h₂ h₄.   (6.5)

Here, according to its definition (6.4), R₄ is a covariant tensor of rank four, the covariant Riemannian curvature tensor.
Like R₃, R₄ is also uniquely determined by the first fundamental tensor G alone, and it can be computed from G, G' and G''. Observe further that the above two curvature tensors vanish simultaneously, so that the equations

R₃(x) = 0,   R₄(x) = 0

are equivalent. Certain symmetry properties of the tensor R₄ result from the right side of formula (6.5).
The symmetric group of the 24 permutations of the indices 1, 2, 3, 4 has as normal subgroup the "four-group", which consists of the identity permutation and the permutations

(12)(34),   (13)(24),   (14)(23).

The corresponding quotient group is isomorphic to the symmetric permutation group on three elements. One sees immediately that the form on the right in (6.5) is invariant under the permutations in the four-group and, corresponding to the permutations in the quotient group, assumes altogether six different forms which differ pairwise with respect to sign. The above situation, according to which R₄ is uniquely determined by G, together with these symmetry properties, contains the essence of the Gauss–Codazzi formula.
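Written out for the convention of formula (6.5), these symmetry properties are (our explicit listing):

```latex
R_{h_1 h_2 h_3 h_4} = R_{h_2 h_1 h_4 h_3} = R_{h_3 h_4 h_1 h_2} = R_{h_4 h_3 h_2 h_1}
\quad\text{(four-group)},
\qquad
R_{h_1 h_2 h_3 h_4} = -R_{h_2 h_1 h_3 h_4} = -R_{h_1 h_2 h_4 h_3}.
```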
6.2. Theorema egregium. Among other things, Gauss's classical "theorema egregium" follows from the Gauss–Codazzi formula (6.5). That coset of the four-group whose permutations change the sign of R₄(x) h₁ h₂ h₃ h₄ contains the permutations (12), (12)(12)(34) = (34), (12)(13)(24) = (1423) and (12)(14)(23) = (1324). According to this there are precisely two transpositions, (12) and (34), which change the sign of the form named. For fixed h₃ and h₄, R₄(x) h₁ h₂ h₃ h₄ is alternating in h₁ and h₂; for fixed h₁ and h₂, in h₃ and h₄.
But the form

C₄(x) h₁ h₂ h₃ h₄ = G(x) h₁ h₄ G(x) h₂ h₃ − G(x) h₁ h₃ G(x) h₂ h₄

also has the same property. This immediately implies: If a₁, a₂ and likewise a₃, a₄ are linearly independent vectors of the parameter space R_x^m, then

R₄(x) h₁ h₂ h₃ h₄ / C₄(x) h₁ h₂ h₃ h₄ = R₄(x) a₁ a₂ a₃ a₄ / C₄(x) a₁ a₂ a₃ a₄

for each pair of vectors h₁, h₂ from the two-dimensional subspace spanned by a₁, a₂ and each pair of vectors h₃, h₄ from the subspace spanned by a₃, a₄. If in particular one takes a₁ = a₃ = h, a₂ = a₄ = k, then
R₄(x) h₁ h₂ h₃ h₄ / C₄(x) h₁ h₂ h₃ h₄ = R₄(x) h k h k / C₄(x) h k h k,

provided h₁, h₂, h₃, h₄ vary in the two-dimensional subspace of R_x^m spanned by h and k.
Now if in particular m = 2, n = 3, the above holds with no restrictions on h₁, h₂, h₃, h₄, however the linearly independent coordinate axes h and k are taken for the parameter space R_x^2. Now bring L(x) into the principal axis form with respect to G(x) and take the principal axis directions a₁(x) and a₂(x) for h and k (cf. 3.6–7). Then

L(x) a_i(x) a_i(x) = κ_i(x),   G(x) a_i(x) a_i(x) = 1   (i = 1, 2),

L(x) a₁(x) a₂(x) = 0,   G(x) a₁(x) a₂(x) = 0,

and therefore according to (6.5)

R₄(x) h k h k = −κ₁(x) κ₂(x),   C₄(x) h k h k = −1.

Therefore

K(x) = κ₁(x) κ₂(x) = R₄(x) h k h k / C₄(x) h k h k
     = R₄(x) h₁ h₂ h₃ h₄ / (G(x) h₁ h₄ G(x) h₂ h₃ − G(x) h₁ h₃ G(x) h₂ h₄).

Here on the left stands the Gaussian curvature of the surface at the point y(x), and on the right an expression that does not depend on h₁, h₂, h₃, h₄ and that therefore can be computed from G(x), G'(x) and G''(x) alone. That proves the Theorema egregium of Gauss.
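The theorema egregium can be checked numerically (an illustration of ours, not part of the text): for a sphere of radius R in geographic parameters the first fundamental form has the orthogonal components E = R², F = 0, G = R² sin²u, and the standard curvature formula for orthogonal parameters, K = −(1/(2√(EG))) [∂_u(G_u/√(EG)) + ∂_v(E_v/√(EG))], uses the first fundamental form alone; it must return 1/R².

```python
import math

# First fundamental form of a sphere of radius R in geographic
# coordinates (u, v): E = R^2, F = 0, G = R^2 sin^2(u).
R = 2.0
E = lambda u, v: R * R
G = lambda u, v: R * R * math.sin(u) ** 2

def d_du(f, u, v, h=1e-5):
    """Central difference quotient in u."""
    return (f(u + h, v) - f(u - h, v)) / (2 * h)

def d_dv(f, u, v, h=1e-5):
    """Central difference quotient in v."""
    return (f(u, v + h) - f(u, v - h)) / (2 * h)

def gauss_curvature(u, v):
    """K from E, G alone (orthogonal parameters, F = 0):
       K = -1/(2 sqrt(EG)) * [ d/du (G_u/sqrt(EG)) + d/dv (E_v/sqrt(EG)) ]."""
    P = lambda uu, vv: d_du(G, uu, vv) / math.sqrt(E(uu, vv) * G(uu, vv))
    Q = lambda uu, vv: d_dv(E, uu, vv) / math.sqrt(E(uu, vv) * G(uu, vv))
    return -(d_du(P, u, v) + d_dv(Q, u, v)) / (2 * math.sqrt(E(u, v) * G(u, v)))

K = gauss_curvature(1.0, 0.7)
print(K, 1 / R**2)
```

Within the accuracy of the nested difference quotients, K agrees with 1/R² at every point of the parameter region, independently of the second fundamental form.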
6.3. Exercises. 1. Let R₂ h₂ h₃ be the differential form that arises by contracting the Riemannian differential form h* R₃ h₁ h₂ h₃ with respect to h* and h₁, and R₁ h₂ the linear transformation that results from R₂ h₂ through raising an index. Finally let R be the real scalar that comes from the contraction of h* R₁ h₁. R₂ and R₁ are the so-called covariant and mixed Ricci tensors, respectively, and R is the scalar Riemannian curvature. Show:

If e₁, ..., e_m stands for an arbitrary coordinate system orthonormalized at the point x with respect to the fundamental metric tensor, then

R₂ h₂ h₃ = Σ_{i=1}^m R₄ h₂ e_i e_i h₃ = Σ_{i=1}^m R₄ h₃ e_i e_i h₂ = R₂ h₃ h₂,

R₁ h₂ = Σ_{i=1}^m R₃ h₂ e_i e_i,

R = −Σ_{i,j=1}^m R₄ e_i e_j e_i e_j = Σ_{i≠j} κ_i κ_j,

where κ₁, ..., κ_m stand for the principal curvatures at the point x.

2. Show, inverting exercise 3 in 3.8, that a surface with nothing but umbilical points is a sphere.
Hint. For each x ∈ G_x^m all principal curvatures are equal, A(x) = κ(x) I, and consequently

L(x) k l = G(x) k A(x) l = κ(x) G(x) k l.

Because 'G(x) h k l = 0, covariant differentiation yields

'L(x) h k l = κ'(x) h G(x) k l,

and therefore according to the Codazzi–Mainardi formula

κ'(x) h G(x) k l = κ'(x) k G(x) h l.

If for an arbitrary h the vectors k and l are taken so that l = k ≠ 0 and G(x) h k = 0, then κ'(x) h = 0, and κ is therefore independent of x. The claim is then an immediate consequence of Weingarten's formula.
§ 7. Parallel Translation

7.1. Definition. In a neighborhood G^m on the m-dimensional manifold F^m let a piecewise regular arc p = p(t) be given, where t varies in an interval of some one-dimensional parameter space R_t. At each point p of this arc let a differentiable contravariant vector u = u(p) be defined. Then the covariant derivative 'u(p) is a mixed tensor of rank 2. One says the vector field u(p) has come into being along the curve p = p(t) by means of parallel translation if this derivative vanishes along the curve. In the parameter space R_x^m, corresponding to the quantities p = p(t) and u = u(p), there is an arc x = x(t) and a contravariant vector field u = u(x). The condition for parallelism of the field along x = x(t) is thus, provided dx = x'(t) dt,

'u dx = du + Γ u dx = 0.   (7.1)
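Equation (7.1) can be made concrete with a small numerical sketch (ours; the sphere, its Christoffel operator in coordinates, and the Runge–Kutta integration are illustrative choices, not part of the text). On the unit sphere, parallel translation of a tangent vector once around a latitude circle preserves its G-norm and rotates it by the holonomy angle 2π cos θ₀:

```python
import math

# Parallel translation (7.1) on the unit sphere, coordinates (theta, phi),
# along the latitude circle theta = theta0, phi = t, 0 <= t <= 2*pi.
# Christoffel symbols of ds^2 = dtheta^2 + sin^2(theta) dphi^2:
#   Gamma^theta_{phi phi} = -sin(theta) cos(theta),
#   Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
# Equation (7.1), du + Gamma u dx = 0, becomes along the latitude:
#   du_theta/dt =  sin(theta0) cos(theta0) * u_phi,
#   du_phi/dt   = -cot(theta0) * u_theta.

theta0 = 1.0  # fixed latitude

def rhs(u):
    ut, up = u
    return (math.sin(theta0) * math.cos(theta0) * up,
            -(math.cos(theta0) / math.sin(theta0)) * ut)

def transport(u0, t_end=2 * math.pi, n=20000):
    """Integrate the parallelism equation with classical RK4."""
    h = t_end / n
    u = u0
    for _ in range(n):
        k1 = rhs(u)
        k2 = rhs((u[0] + h/2 * k1[0], u[1] + h/2 * k1[1]))
        k3 = rhs((u[0] + h/2 * k2[0], u[1] + h/2 * k2[1]))
        k4 = rhs((u[0] + h * k3[0], u[1] + h * k3[1]))
        u = (u[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             u[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return u

def g_norm(u):
    """Norm with respect to the first fundamental form at theta = theta0."""
    return math.sqrt(u[0] ** 2 + math.sin(theta0) ** 2 * u[1] ** 2)

u_end = transport((1.0, 0.0))
print(u_end, g_norm(u_end))
```

The transported vector returns with unit G-norm but rotated: the closed-form solution is u_theta(t) = cos(ωt), u_phi(t) = −sin(ωt)/sin θ₀ with ω = cos θ₀, so after a full circuit the vector differs from its initial value, a first glimpse of the path dependence of the translation operator discussed in 7.2.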
The parallel translation of a differentiable covariant vector field u = u(p) along the curve p = p(t) is defined correspondingly. The covariant derivative 'u(p) is in this case a covariant tensor of rank two and parallelism is expressed through the equation

'u dx = du − u Γ dx = 0 .    (7.1')
The left hand side of the last equation is a covariant vector. The condition for the parallelism of a vector field u(p) is thus given by means of a normal linear homogeneous differential equation which is invariant with respect to parameter transformations. Conversely, if the curve p = p(t) is prescribed, a parallel vector field can be constructed by integrating the defining differential equation along the curve. According to the general theory of normal systems (cf. IV.1), the field is uniquely determined if one prescribes the initial value u₀ = u(p₀) of the field vector in an arbitrary way at an arbitrarily chosen point p₀ = p(t₀). The integration then certainly succeeds if the Christoffel operator Γ(p) is continuous or, equivalently, if the fundamental metric tensor is continuously differentiable once. In the "embedding theory" it suffices to assume the embedding mapping y = y(x) to be twice continuously differentiable.
7.2. The translation operator. According to the general theory of linear homogeneous differential equations (cf. IV.3.8) there is associated with equation (7.1) (or (7.1')) a family (T) of regular linear transformations of the tangent space (or of the space dual to the tangent space) to the manifold F^m with the following properties:
1. Corresponding to each oriented piecewise differentiable path l in a (sufficiently small) neighborhood on the manifold F^m there is a well-determined linear transformation T = T_l.
2. For the product l = l₂ l₁ of two paths l₁ and l₂ one has

T_l = T_l₂ T_l₁ .
3. For the path l⁻¹ reciprocal to l, T_l T_l⁻¹ = T_l⁻¹ T_l = I (the identity transformation).
4. If the path l joins the points p = p₁ and p = p₂ on the manifold, then the vectors u₁ = u(p₁) and u₂ = u(p₂) of the field u(p) which is parallel along l are connected through the relations¹

u₂ = T_l u₁ ,  u₁ = T_l⁻¹ u₂ .
In the parameter space R^m_x, corresponding to the operator T, there is a transformation T_l of this space, and one has for an increment dx of the path l = l_x ⊂ R^m_x at the point x the relation

du = T_dx u − u = (dT) u = −Γ dx u .

For the case of a covariant field u(p) it is advisable, in conformity with our presentation of the tensor calculus, to write the operator T_l to the right of the argument u: u₂ = u₁ T_l, etc.
V. Theory of Curves and Surfaces
Thus T is differentiable at the point x = p, and the derivative is T' = −Γ. Conversely, the theory of parallelism can be most simply constructed based on a given operator group (T) and postulates 1-4.
7.3. Metric properties. Since the Christoffel operator Γ is uniquely determined by the fundamental metric tensor G and its first derivative, the translation operator is determined by G and its derivative G'.
To more carefully investigate this connection, we consider in the parameter space R^m_x a piecewise regular arc x = x(t), and along it take two parallel, say contravariant, vector fields u(x) and v(x). Then the expression G(x) u(x) v(x) is an invariant, and its derivative is thus the same as its covariant derivative. If one differentiates it along the curve x = x(t), the result is thus

(G u v)' dx = '(G u v) dx = 'G dx u v + G('u dx) v + G u('v dx) .

Here 'G = 0 (cf. 4.20, exercise 2), and since, because of the parallelism, 'u dx = 'v dx = 0, the entire above expression vanishes. It follows from this that G u v is constant along the curve x = x(t), and one concludes that with respect to the local euclidean metric G(x) the translation operator T_l is an orthogonal transformation of the space R^m_x.
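As a concrete numerical sketch of equations (7.1) and of this metric property, one may integrate the translation equation along a prescribed curve and check that G u u stays constant. The model below is an assumption made for illustration, not taken from the text: the unit sphere with coordinates x = (θ, φ), fundamental tensor G = diag(1, sin²θ), and a circle of latitude as the curve.

```python
import math

# Assumed model (illustration only): unit sphere, coordinates x = (theta, phi),
# G = diag(1, sin^2 theta).  Nonzero Christoffel components:
#   Gamma^theta_{phi phi}  = -sin(theta) cos(theta),
#   Gamma^phi_{theta phi}  = Gamma^phi_{phi theta} = cot(theta).
# The translation equation (7.1), du + Gamma u dx = 0, is integrated
# along the latitude circle theta = const, phi = t.

def rhs(theta, u):
    """du/dt along the latitude circle theta = const, phi = t."""
    ut, up = u
    return [math.sin(theta) * math.cos(theta) * up,
            -math.cos(theta) / math.sin(theta) * ut]

def transport(theta, u0, t_end, n=2000):
    """Classical fourth-order Runge-Kutta integration of (7.1)."""
    u, h = list(u0), t_end / n
    for _ in range(n):
        k1 = rhs(theta, u)
        k2 = rhs(theta, [u[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = rhs(theta, [u[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = rhs(theta, [u[i] + h * k3[i] for i in range(2)])
        u = [u[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
    return u

def norm2(theta, u):
    """Squared length G u u in the metric G = diag(1, sin^2 theta)."""
    return u[0] ** 2 + math.sin(theta) ** 2 * u[1] ** 2

theta0 = math.pi / 3
u0 = [1.0, 0.5]
u1 = transport(theta0, u0, 2 * math.pi)
# G u u is invariant under the translation, as shown above
assert abs(norm2(theta0, u1) - norm2(theta0, u0)) < 1e-9
```

The conservation of G u u to within the integration error is exactly the statement that T_l is orthogonal with respect to the local metric.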
Now if, as is the case in the Gaussian theory, the manifold F^m is embedded in the space R^n so that the metric G is induced by the euclidean metric of the latter space, it turns out that parallel translation on the embedding surface F^m along an arc joining two surface points y₁ and y₂ maps the tangent planes to the surface at these points orthogonally (euclideanly) onto one another. This mapping can be completed to an orthogonal transformation of the entire space R^n by requiring that the unit normals at the points y₁ and y₂ correspond.
7.4. Geodesic lines. We now set ourselves the task of determining
the "straightest" lines on the manifold, i.e., those paths whose tangents are parallel. If the equation for the lines we are seeking is written in the form x = x(τ) (τ real), then a tangent vector has the form u = u(τ) = λ(τ) x'(τ), where λ(τ) (> 0) is a real multiplier. If this vector u = u(τ) is to now define a parallel field along x = x(τ), then by 7.3 its length in the metric G(x) is constant:

G (λ dx/dτ)(λ dx/dτ) = λ² (dσ/dτ)² = const. ,

where dσ stands for the length of the arc differential dx = x'(τ) dτ. Thus, up to a constant multiplier, we must have λ = dτ/dσ, and the tangent vector which is to be translated in a parallel fashion will be equal to

u = (dτ/dσ)(dx/dτ) = dx/dσ .
¹ For this, cf. W. Graeub and R. Nevanlinna [1].
Now if the arc length σ of the path to be determined is chosen as the parameter, substitution of u = dx/dσ in the equation (7.1) for parallel translation yields the condition

d²x/dσ² + Γ (dx/dσ)(dx/dσ) = 0    (7.2)
for the "straightest" or geodesic line x = x(σ). In order to integrate this second order differential equation, one again introduces an arbitrary parameter τ. In this way a second order normal differential equation is obtained for the geodesic line x = x(τ), which according to the theory of normal systems (cf. IV.1) can be integrated (cf. 7.7, exercise 5). Through each point x there goes a one-parameter family of straightest line arcs, which are uniquely determined if the direction of the tangent is fixed at the former point.
The geodesic lines are also characterized by a metric condition, namely they are the shortest lines joining two points (which are sufficiently close to one another) on the manifold (cf. 7.7, exercise 6).
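The geodesic equation (7.2) can likewise be integrated numerically. The sketch below again assumes, purely for illustration, the unit sphere with coordinates (θ, φ) and G = diag(1, sin²θ); there the geodesics are the great circles, which gives an exact solution to compare against.

```python
import math

# Assumed model (illustration only): unit sphere, x = (theta, phi),
# G = diag(1, sin^2 theta).  In components equation (7.2) reads
#   theta'' = sin(theta) cos(theta) (phi')^2,
#   phi''   = -2 cot(theta) theta' phi'.

def rhs(y):
    th, ph, vth, vph = y
    return [vth, vph,
            math.sin(th) * math.cos(th) * vph ** 2,
            -2.0 * math.cos(th) / math.sin(th) * vth * vph]

def integrate(y0, s_end, n=4000):
    """Fourth-order Runge-Kutta integration in the arc length sigma."""
    y, h = list(y0), s_end / n
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs([y[i] + 0.5 * h * k1[i] for i in range(4)])
        k3 = rhs([y[i] + 0.5 * h * k2[i] for i in range(4)])
        k4 = rhs([y[i] + h * k3[i] for i in range(4)])
        y = [y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(4)]
    return y

def embed(th, ph):
    """Embedding of the sphere in R^3."""
    return (math.sin(th) * math.cos(ph),
            math.sin(th) * math.sin(ph),
            math.cos(th))

# unit initial tangent (arc length parametrization): G x' x' = 1
s_end = 2.0
y = integrate([math.pi / 2, 0.0, 0.5, math.sqrt(3) / 2], s_end)
# exact geodesic: the great circle cos(s) p0 + sin(s) v0 with
# p0 = (1, 0, 0) and v0 = (0, sqrt(3)/2, -1/2) in the embedding
q = (math.cos(s_end),
     math.sin(s_end) * math.sqrt(3) / 2,
     -math.sin(s_end) * 0.5)
assert math.dist(embed(y[0], y[1]), q) < 1e-8
# the speed G x' x' stays constant along the solution (cf. exercise 5)
assert abs(y[2] ** 2 + math.sin(y[0]) ** 2 * y[3] ** 2 - 1.0) < 1e-8
```

The second assertion checks the fact, proved in exercise 5 of 7.7, that a solution of (7.2) is automatically parametrized proportionally to arc length.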
7.5. Integrability of the parallel translation equation. Until now the differential equation for the parallel translation of a vector u = u(x) has been integrated along a prescribed path x = x(t). The question now arises, under which conditions can the partial differential equation for the parallel translation of a (for example, contravariant) vector u,

du + Γ u dx = 0 ,

be integrated in a full m-dimensional neighborhood on the manifold F^m. This is the case if and only if the translation operator T_l is independent
of the course of the path l which joins its fixed beginning and end points. For this, by IV.3.9 it is necessary and sufficient that the relation

T_γ = I

holds for the boundary γ = ∂s² of every two-dimensional simplex s² on the manifold. If Γ is continuously differentiable (G is hence twice continuously differentiable) this condition is equivalent with the trilinear differential form (cf. IV.3.12)

Λ(Γ' h k + Γ h Γ k) l

being zero for every h, k, l ∈ R^m_x. But the operator in this form is (up to a factor of 1/2) nothing other than the mixed Riemannian curvature tensor R of the manifold (cf. (6.3)), and therefore:
For the parallel translation equation to be integrable on the manifold F^m it is necessary and sufficient that its curvature vanish.
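Numerically this criterion can be seen to fail on a manifold of nonzero curvature. On the unit sphere (again an assumed model, with coordinates (θ, φ) and G = diag(1, sin²θ)) transport around a closed circle of latitude does not return a vector to itself: the vector comes back rotated by 2π cos θ₀, and the deficit 2π(1 − cos θ₀) is exactly the integral of the curvature K = 1 over the enclosed cap.

```python
import math

# Assumed model (illustration only): unit sphere, G = diag(1, sin^2 theta).
# Transport around the closed latitude circle theta = pi/3 rotates the
# vector by 2*pi*cos(pi/3) = pi in the tangent plane, so T_gamma u = -u;
# T_gamma = I fails, in accordance with the nonvanishing curvature.

def rhs(theta, u):
    ut, up = u
    return [math.sin(theta) * math.cos(theta) * up,
            -math.cos(theta) / math.sin(theta) * ut]

def loop_transport(theta, u0, n=4000):
    u, h = list(u0), 2 * math.pi / n
    for _ in range(n):
        k1 = rhs(theta, u)
        k2 = rhs(theta, [u[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = rhs(theta, [u[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = rhs(theta, [u[i] + h * k3[i] for i in range(2)])
        u = [u[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
    return u

u0 = [1.0, 0.0]
u1 = loop_transport(math.pi / 3, u0)
# T_gamma u0 = -u0: the translation around gamma is not the identity
assert abs(u1[0] + u0[0]) < 1e-9 and abs(u1[1] + u0[1]) < 1e-9
```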
7.6. Determining manifolds of curvature zero. We assume that the curvature R(x) = 0 in a certain parameter neighborhood in the space R^m_x, and wish to show that for a suitable choice of the parameter x̄ the Christoffel operator Γ̄(x̄) can then be made to vanish.
To this end we first fix the parameter space R^m_x arbitrarily. If the Christoffel operator does not yet vanish we try to determine a new admissible parameter x̄ = x̄(x) so that (cf. (4.2) in 4.15)

Γ̄ h k = (dx̄/dx) Γ h k − (d²x̄/dx²) h k = 0 .
(7.3)
To solve this second order differential equation for x̄(x), we introduce the regular operator z = dx̄/dx as the new variable. The first order differential equation which thus results,

dz − z Γ dx = 0 ,    (7.3')

can according to IV.3.12 be completely integrated provided for each h and k

z Λ(Γ' h k − Γ h Γ k) = 0 .

Because of the regularity of z this means that

Λ(Γ' h k − Γ h Γ k) l = −2 R h k l = 0 .

By hypothesis this integrability condition is satisfied, and the operator z = dx̄/dx is hence uniquely determined if it is arbitrarily fixed at an
initial point x = x₀. To then integrate the equation

dx̄ = z dx ,    (7.3'')

observe that by (7.3') the rotor of the operator z is equal to

Λ z' h k = z Λ Γ h k = 0 .

Therefore, the condition of integrability for the differential dx̄ = z dx is satisfied, and by III.3.3 x̄(x) is determined by means of z(x) up to an additive constant. The sought-for parameter space R^m_x̄, in which Γ̄(x̄) = 0, is therewith constructed; it is uniquely determined if one associates with an arbitrary element (p, dp) of the tangent space of F^m an arbitrary line element (x̄, dx̄).
In this distinguished parameter space the equation for parallel translation is simply dū = 0, ū(x̄) = const. Thus parallel translation coincides with the elementary translation of the space R^m_x̄. Actually, the geometry of a manifold F^m of curvature zero is euclidean, for it follows from Γ̄ = 0 that the derivative Ḡ'(x̄) = 'Ḡ(x̄) = 0 (cf. 4.20, exercise 2).
The metric tensor Ḡ(x̄) is thus independent of the point x̄, which implies the euclideanness of the geometry on R^m_x̄.
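The flat case can be contrasted numerically with the spherical one. The euclidean plane in polar coordinates (an assumed illustration, not from the text) has a nonvanishing Christoffel operator, yet its curvature is zero, and transport around a closed circle is the identity, as the discussion above demands.

```python
import math

# Assumed model (illustration only): euclidean plane in polar coordinates
# x = (r, phi), G = diag(1, r^2).  Here Gamma does not vanish
# (Gamma^r_{phi phi} = -r, Gamma^phi_{r phi} = 1/r), but the curvature is
# zero: transport around the closed circle r = const gives T_gamma = I.

def rhs(r, u):
    ur, up = u
    return [r * up, -ur / r]    # du + Gamma u dx = 0 along the circle

def loop_transport(r, u0, n=4000):
    u, h = list(u0), 2 * math.pi / n
    for _ in range(n):
        k1 = rhs(r, u)
        k2 = rhs(r, [u[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = rhs(r, [u[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = rhs(r, [u[i] + h * k3[i] for i in range(2)])
        u = [u[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
    return u

u0 = [0.3, 0.7]
u1 = loop_transport(2.0, u0)
# the vector returns unchanged: parallelism is path-independent here
assert abs(u1[0] - u0[0]) < 1e-9 and abs(u1[1] - u0[1]) < 1e-9
```

In the distinguished (Cartesian) parameters of this example one has Γ̄ = 0 and the transport is literally the elementary translation.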
If conversely a manifold F^m admits a parametric representation where G = const., Γ = 0, then its curvature is obviously zero, and one concludes:
For a manifold to be euclidean it is necessary and sufficient that its curvature vanish.
7.7. Exercises. 1. Let F^m be a differentiable manifold of dimension m
and x ∈ R^m_x the representative of a point p ∈ F^m. Further, suppose A(x) h₁ … h_q is an invariant q-linear form in the contravariant arguments hᵢ. The covariant derivative 'A is defined as a covariant tensor B of rank q + 1 by taking the arguments h₁, …, h_q to be vectors which are parallel along an arc emanating from x that has the tangent vector dx = h at the initial point x and then differentiating the form A h₁ … h_q at the point x with the differential dx = h. One then sets

B h h₁ … h_q = d(A h₁ … h_q) .

According to this, what is the general form of the covariant derivative 'A? Also define analogously the covariant derivative of a contravariant tensor A.
2. Determine the alternating part of the second covariant derivative of a contravariant vector field u(x).
3. Let D₀ h₁ h₂ (h₁, h₂ ∈ R^m_x) be a nondegenerate alternating real form. The alternating fundamental form

D(x) h₁ h₂ = ±√det (G(x) hᵢ hⱼ) ,

where the sign is chosen to be the same as the sign of D₀ h₁ h₂, satisfies the relation

'D h h₁ h₂ = D' h h₁ h₂ − D(Γ h h₁) h₂ − D h₁(Γ h h₂) = 0 .

4. Prove the so-called Bianchi identity

'R h₁ h₂ h₃ h₄ h₅ + 'R h₂ h₃ h₁ h₄ h₅ + 'R h₃ h₁ h₂ h₄ h₅ = 0 .
5. Show that the geodesic line emanating from the point p₀ of the twice continuously differentiable m-dimensional manifold F^m is uniquely given if one prescribes the direction of the tangent at the initial point p₀. These geodesic lines form a field which covers a certain neighborhood of p₀ simply.
Hint. In the parameter space R^m_x, where p₀ has the representative x₀, the equation for the geodesic lines is

x'' + Γ x' x' = 0 ,    (a)
where x' = dx/dσ (σ arc length). For an arbitrary parameter τ one has (ẋ = dx/dτ)

x' = ẋ (dτ/dσ) ,  x'' = ẍ (dτ/dσ)² + ẋ (d²τ/dσ²) ,

and equation (a) becomes

(ẍ + Γ ẋ ẋ)(dτ/dσ)² + ẋ (d²τ/dσ²) = 0 .    (b)

Equation (a) is thus invariant in its form if d²τ/dσ² = 0, τ = α σ + β, i.e., when the parameter is, up to a trivial normalization, equal to the arc length σ.
On the other hand, if x = x(τ) satisfies the equation

ẍ + Γ ẋ ẋ = 0 ,    (c)

then the parameter τ is (up to an affine transformation) equal to the arc length of the curve x = x(τ). For σ̇² = G ẋ ẋ, and by (c)

d(σ̇²)/dτ = G' ẋ ẋ ẋ + 2 G ẋ ẍ = G' ẋ ẋ ẋ − 2 G ẋ Γ ẋ ẋ = 0 ,

and consequently σ̇ = α, σ = α τ + β. Equation (c) is equivalent to the normal system

ẋ = u ,  u̇ = −Γ u u ,

whose solution is uniquely determined if the initial values x₀ = x(τ₀), ẋ₀ = ẋ(τ₀) are given. The first part of the claim follows from that.
For the proof of the field property of the solutions in the vicinity of the point x₀, set τ₀ = σ₀ = 0, and take a unit vector e (G(x₀) e e = 1)
for the initial tangent ẋ(0). If x = x(τ) stands for the solution of equation (c) that satisfies the initial conditions x(0) = x₀, ẋ(0) = e, then β = 0 and α = 1, hence τ = σ.
Corresponding to each e with G(x₀) e e = 1 there is for sufficiently small σ < σ* a well-determined point x = x(σ; e). Thus if one sets σ e = t, then x = x(t) (x(0) = x₀) is a mapping which is well-defined in the sphere |t| < σ* of the space R^m_t metrized with G(x₀), into the space R^m_x metrized with G(x). Further, since the solutions of equation (c) (= (a)), assuming sufficient differentiability of the tensor G(x), are differentiable with respect to σ and e, the derivative dx/dt exists. For t = 0 it reduces to the identity transformation and is therefore regular. It then follows from the inversion theorem in II.4.2 that (t(x₀) = 0) t = t(x) is single-valued (and differentiable) in a certain neighborhood of the point p₀: Through each point x of this neighborhood there consequently
goes precisely one geodesic line which emanates from x₀, namely that one with the unit tangent e at the point x₀, where e = t(x)/√(G(x₀) t(x) t(x)) .
6. The shortest line joining two points which lie sufficiently close to one another on the manifold F^m is geodesic.
Hint. Without drawing upon general principles from the calculus of variations, the assertion can be proved in the following direct way. Let p₀ be a point on F^m and x₀ its representative in R^m_x. The geodesic lines emanating from x₀, according to the above, form a field: thus if τ denotes an arbitrary common parameter for this field of curves, then for each x in a certain neighborhood of x₀ the initial tangents ẋ(0) (ẋ(τ) = dx/dτ) can be taken in a unique way so that the geodesic line x = x(τ; ẋ(0)) joins the point x₀ = x(0; ẋ(0)) with x. Let dx = h be a fixed differential. If one takes into account that the differential
d and the derivative d/dτ commute,

dẋ = (d/dτ)(dx) ,  dσ̇ = (d/dτ)(dσ) ,

then differentiation of the equation σ̇² = G ẋ ẋ with the differential dx yields

2 σ̇ (d/dτ)(dσ) = d(σ̇²) = dG dx ẋ ẋ + 2 G ẋ dẋ = σ̇² dG dx x' x' + 2 G ẋ dẋ  (x' = dx/dσ) ,

and hence

2 (d/dτ)(dσ) = σ̇ dG dx x' x' + 2 G x' dẋ .

Here (cf. 3.5)

dG dx x' x' = 2 dG x' x' dx − 2 G dx Γ x' x' ,

so that finally

(d/dτ)(dσ) = σ̇ (dG x' x' dx − G dx Γ x' x') + G x' dẋ .

This result is valid for a given G, assuming sufficient differentiability, for any field of curves. If in particular the curves are geodesic, then Γ x' x' = −x'', and the above equation yields

(d/dτ)(dσ) = σ̇ (dG x' x' dx + G x'' dx) + G x' dẋ = (d/dτ)(G x' dx) ,
from which it can be seen that the difference dσ − G x' dx is constant on the geodesic arc that joins the points x₀ and x. But for τ → 0, σ → 0 and x → x₀,

G(x) x'(σ) dx → G(x₀) e dx ,

where e = x'(0) stands for the unit tangent to the arc at the initial point x₀. Further, since near x₀ the square σ(x)² can be replaced by G(x₀)(x − x₀)(x − x₀), as x → x₀ we also have

dσ(x) → G(x₀) e dx ,

and the above difference is therefore = 0 on the entire geodesic arc, and consequently

dσ(x) = G(x) x'(σ) dx ,

where x'(σ) stands for the unit tangent to the geodesic line through x at the latter point and dx is an arbitrary differential, dσ(x) being the corresponding differential of the field function. From this it follows by means of Schwarz's inequality that

(dσ)² = (G(x) x' dx)² ≤ G(x) x' x' · G(x) dx dx = G(x) dx dx = |dx|² ,

and therefore |dσ| ≤ |dx|, where |dx| is the length of the line element dx at the point x measured in the metric G(x). The claim is an immediate
result of this inequality.
§ 8. The Gauss-Bonnet Theorem
8.1. The geodesic curvature vector. Suppose a regular, twice differentiable arc on the manifold F^m has the equation x = x(σ) in the parameter region G^m_x (⊂ R^m_x), where σ stands for the arc length. The contravariant vector

g(x) = x'' + Γ x' x' = 'x' x'  (x' = dx/dσ)    (8.1)

vanishes when the arc is geodesic and thus gives a measure for the curvature of the arc in the metric G(x). One calls it the geodesic curvature vector of the arc at the point x = x(σ).
It follows by differentiation of the identity G x' x' = 1 with respect to σ that

0 = G' x' x' x' + 2 G x' x'' = 2 G x'(x'' + Γ x' x') = 2 G x' g ,

from which it can be seen that the geodesic curvature vector is a normal to the arc x = x(σ).
8.2. The total geodesic curvature. In the following the dimension m of the manifold F^m is assumed to be equal to 2.
We consider on the surface F² a neighborhood that corresponds to the region G²_x in the parameter space R²_x. For the orientation we introduce in R²_x an arbitrary nondegenerate real alternating form D₀ h₁ h₂ (hᵢ ∈ R²_x). Then if one sets

det (G(x) hᵢ hⱼ) = (D(x) h₁ h₂)² ,    (8.2)

where D(x) h₁ h₂ is to have the sign of D₀ h₁ h₂, D(x) h₁ h₂ is a bilinear form, defined for each x ∈ G²_x, which is alternating and nondegenerate. In the following we let [h₁ h₂] stand for the angle formed by the vectors h₁ and h₂, which in the locally euclidean metric defined by the fundamental tensor G(x) is uniquely determined modulo 2π by the relations

|h₁| |h₂| sin [h₁ h₂] = D h₁ h₂ ,  |h₁| |h₂| cos [h₁ h₂] = G h₁ h₂  (|hᵢ|² = G hᵢ hᵢ) .    (8.3)
Now let x = x(σ) be a twice differentiable arc in G²_x that joins the points x₁ = x(σ₁) and x₂ = x(σ₂). Since the geodesic curvature vector g(x) is at each point of the curve normal to the curve's tangent dx = x'(σ) dσ, according to the first formula (8.3)

D dx g = |dx| |g| sin [dx g] = ± |g| dσ ,

where |g|² = G g g and the sign ± at each point of the arc is fixed by the sign of D dx g. The integral

∫_{x₁}^{x₂} D(x) dx g(x) = ∫_{σ₁}^{σ₂} ± |g(x(σ))| dσ    (8.4)

is called the total geodesic curvature of the arc x = x(σ) with respect to the metric G(x) and the orientation D₀ of the plane R²_x.
8.3. Computation of the total geodesic curvature. We wish to derive an expression for the total geodesic curvature (8.4) which is important for what follows. For this we consider on the arc x = x(σ) two arbitrary contravariant and continuously differentiable vector fields u(x) and v(x) which we normalize to length one relative to the metric G(x),

G u u = G v v = 1 ,    (8.5)

and compute the derivative of the angle [u v] formed by the vectors u and v, which according to (8.3) and (8.5) is determined at each point of the arc x = x(σ) (modulo 2π) by the relations

sin [u v] = D u v ,  cos [u v] = G u v .

If the first of these formulas is differentiated with respect to σ, then in view of exercise 3 in 7.7 we have

G u v · d[u v]/dσ = D u('v x') − D v('u x') .
Here, as a consequence of (8.5), u and the contravariant vector

'u x' = du/dσ + Γ (dx/dσ) u

obtained by covariant differentiation are mutually perpendicular, and the same is true of v and 'v x' (G u('u x') = G v('v x') = 0), and consequently

D u('v x') = G u v · D v('v x') ,  D v('u x') = G u v · D u('u x') .

If this is substituted in the above equation, the result is the formula

d[u v]/dσ = D v('v x') − D u('u x') .    (8.6)
If in this formula one makes the choice

v = x' = dx/dσ ,

so that according to (8.1) D v('v x') = D x' g, then for the total geodesic curvature of the curve x = x(σ) we find the expression we seek

∫_{x₁}^{x₂} D dx g = ∫_{x₁}^{x₂} D u('u dx) + ∫_{x₁}^{x₂} d[u x'] ,    (8.7)

which forms the foundation for the discussion to follow.
8.4. Case of a closed curve. We shall now apply formula (8.7) to a twice continuously differentiable closed curve γ: x = x(σ) (x(σ₁) = x(σ₂)). Since the angle [u x'] is well-determined modulo 2π, because of the continuity of the vector u and of the tangent x', its increase on the closed path γ is a multiple of 2π, and one has

∫_γ D dx g = ∫_γ D u('u dx) + 2πν ,    (8.8)

where ν is an integer.
This result presumes that the curve γ is twice continuously differentiable. If this is the case only piecewise, then a modification enters in. We indicate this in the special case where γ is the boundary of a triangle s² = s²(x₁, x₂, x₃) under the additional assumption that the vector field u is continuously differentiable not only on γ = ∂s², but on the entire closed simplex s². We wish to determine the sequence of vertices x₁, x₂, x₃ so that the orientation of ∂s² induced by D₀ is positive.
We start from formula (8.7) and apply it to the three edges xᵢ xᵢ₊₁ (i = 1, 2, 3; x₄ = x₁). By summation one obtains

∫_{∂s²} D dx g = ∫_{∂s²} D u('u dx) + Σ_{i=1}^3 ∫_{xᵢ xᵢ₊₁} d[u x'] .    (8.9)

The angle [u x'] here has a jump at the vertices xᵢ which (modulo 2π) is equal to the angle of rotation experienced by the tangent
vector x' at xᵢ measured in the metric G(x). If the corresponding interior angle of the triangle is equal to ωᵢ, then the former angle is π − ωᵢ and thus one has

Σ_{i=1}^3 ∫_{xᵢ xᵢ₊₁} d[u x'] = ∫_{∂s²} d[u x'] − 3π + Ω ,    (8.9')

where Ω gives the sum of the angles of the triangle s² in the metric G(x) and the boundary integral on the right is to be taken in the sense of Stieltjes, taking into account the jumps π − ωᵢ at the vertices. Since u and x' after a complete trip around ∂s² return to their original positions, this Stieltjes integral is in every case an integral multiple of 2π,

∫_{∂s²} d[u x'] = 2πν .    (8.10)
We shall show that here ν = 1. For the proof we decompose s² into four triangles s²ⱼ (j = 1, 2, 3, 4) by drawing through the midpoints of each side of s² the parallels to the two remaining sides. Suppose the integral (8.10) over the boundary ∂s²ⱼ has the value 2πνⱼ, so that

Σ_{j=1}^4 ∫_{∂s²ⱼ} d[u x'] = 2π Σ_{j=1}^4 νⱼ .

At a midpoint of a side the three adjoining angles have the sum π, and the corresponding contribution from these three vertex jumps of the small triangles to the sum on the left is therefore 3(3π − π) = 6π. Since the contributions of the interior edges cancel out in the summation, the above sum is, according to this, larger than the integral (8.10) by 6π, and thus ν = Σ_{j=1}^4 νⱼ − 3 and

ν − 1 = Σ_{j=1}^4 (νⱼ − 1) .

From this it follows that

|ν − 1| ≤ Σ_{j=1}^4 |νⱼ − 1| ≤ 4 |νⱼ₀ − 1| ,

if |νⱼ₀ − 1| stands for the largest of the numbers |νⱼ − 1|. By repeating the thus started "Goursat procedure", one obtains a sequence of nested triangles s²ₙ, which converge to a point x₀ of the closed triangle s², and where

∫_{∂s²ₙ} d[u x'] = 2πνₙ ,  |ν − 1| ≤ 2^{2n} |νₙ − 1| .
But for a sufficiently large n ≥ n₀ this integral has the value 2π. For ultimately the triangle s²ₙ lies in an arbitrarily small neighborhood of the point x₀, and because of the continuity of u(x) and of the fundamental metric form G(x) the above integral differs arbitrarily little from the integral

∫_{∂s²ₙ} d[u₀ x']₀ ,

where u₀ = u(x₀) and the angle [u₀ x']₀ is measured in the constant euclidean metric G(x₀). This integral is obviously equal to 2π. According to this νₙ − 1 = 0 for n ≥ n₀, and it therefore follows from the above inequality for |ν − 1| that ν = 1, which was to be proved.
If the value 2π of the Stieltjes integral (8.10) is substituted in (8.9') and (8.9), the relation (8.9) assumes the form

∫_{∂s²} D dx g = ∫_{∂s²} D u('u dx) + Ω − π .    (8.11)
8.5. The Gauss-Bonnet theorem. We now come to the computation of the integral

∫_{∂s²} D u('u dx)

on the right in formula (8.11). For this we use the Stokes transformation formula, whose application to the linear form

A dx = D u('u dx)

is permitted if, for example, we hypothesize the given unit vector field u = u(x) to be twice continuously differentiable on the simplex s². Observe that this hypothesis implies no restriction, for it is a remarkable fact that the value of the above boundary integral in no way depends on the choice of the vector field u, provided only that it is continuously differentiable once, as can be read off immediately from formula (8.11).
For the computation of the operator rot A we again use the formula in exercise 3 in 7.7, according to which for h, k ∈ R²_x

'A h k = A' h k − A Γ h k = D u(''u h k) + D('u h)('u k) ;

''u stands for the second covariant derivative of the contravariant vector u. Now for the three arbitrary vectors a, b, c ∈ R²_x the formula¹

G c c · D a b = D c b · G c a − D c a · G c b

¹ If one sets G a a = |a|², G b b = |b|², G c c = |c|², then according to (8.3)

D c b G c a − D c a G c b = |a| |b| |c|² (sin [c b] cos [c a] − sin [c a] cos [c b]) = |a| |b| |c|² sin ([c b] − [c a]) = |a| |b| |c|² sin [a b] = G c c · D a b .
holds, from which, because G u u = 1 and G u('u dx) = 0, it follows, with a = 'u h, b = 'u k, c = u, that D('u h)('u k) = 0. Thus in view of exercise 2 in 7.7 we have

2 rot A h k = 2 D u Λ(''u h k) = D u(R h k u) ,

where R is the mixed Riemannian tensor defined by (6.3).
In order to go on, we now take a unit vector v orthogonal to u, so that

G u v = 0 ,  D u v = 1 .

Then

2 rot A h k = |R h k u| sin [u R h k u] = |R h k u| cos [v R h k u] = G v(R h k u) = R h k u v ,

where R is the covariant Riemannian tensor (cf. (6.4)). Since this tensor is alternating in h and k, the quotient R h k u v / D h k is independent of h and k, and in view of the expression (6.6) for the Gaussian curvature K(x) one finds that

R h k u v = (R u v u v / D u v) D h k = R u v u v · D h k = −K D h k .

Further, since the oriented area element df (the area of the simplex spanned by h and k) is according to the first of formulas (8.3) equal to D h k / 2, Stokes's formula finally yields

∫_{∂s²} D u('u dx) = ∫_{∂s²} A dx = ∫_{s²} rot A = −∫_{s²} K df .
When this result is combined with formula (8.11) we obtain the Gauss-Bonnet theorem:

∫_{∂s²} D(x) dx g(x) + ∫_{s²} K(x) df = Ω − π .    (8.12)

The terms on the left are the total geodesic curvature of the boundary ∂s² and the total Gaussian curvature of the simplex s². The sum of these two curvatures is equal to the angular excess Ω − π of the triangle measured in the metric G(x).
8.6. Extensions. The Gauss-Bonnet formula yields a corresponding general relation for a polygon π² ⊂ R²_x. For this one has to triangulate
the polygon and to add the formulas (8.12) for the individual subsimplexes s² in the decomposition. In this way one obtains

∫_{∂π²} D dx g + ∫_{π²} K df = Σ_{s²} (Ω − π) .

To evaluate the last sum, let a₀, a₁, a₂ stand for the number of vertices, edges and triangles in the polyhedron that results from the triangulation of π². If a₀₁ and a₀₂ stand for the number of interior and boundary vertices, respectively, of the polyhedron, then one has a₀ = a₀₁ + a₀₂ and

3 a₂ − 2 a₁ + a₀₂ = 0 .

We now obtain

Σ_{s²} (Ω − π) = 2π a₀₁ + Σ ωᵢ − π a₂ ,

where the ωᵢ are the angles of the polygon π². If the angles φᵢ = π − ωᵢ supplementary to the angles ωᵢ are then introduced,

Σ ωᵢ = π a₀₂ − Σ φᵢ ,

and the above expression is therefore equal to

2π a₀₁ + π a₀₂ − π a₂ − Σ φᵢ = 2π(a₀ − a₁ + a₂) − Σ φᵢ − π(3 a₂ − 2 a₁ + a₀₂) = 2π χ − Σ φᵢ ,

where

χ = a₀ − a₁ + a₂

stands for the Euler characteristic of the polyhedral surface π².
We thus have, in summary, the Gauss-Bonnet formula for the polygon π²

∫_{∂π²} D dx g + ∫_{π²} K df − 2π χ + Φ = 0 ,    (8.12')

where χ is the characteristic and Φ = Σ φᵢ is the sum of the polygon's supplementary angles.
It is an important property of the Gauss-Bonnet theorem that all of the four terms which appear are invariant with respect to twice continuously differentiable transformations of the variable x. From this it follows that the theorem holds unchanged for a curved polygon π² whose bounding sides x = x(σ) are twice continuously differentiable. And with that the validity of the theorem is established for arbitrary triangulable polygons π² on the manifold F². One only has to give a decomposition of π² so fine that the triangles each lie in a parameter neighborhood, and the summation of the Gauss-Bonnet triangle formulas yields the theorem for π².
The above discussion takes on an especially simple form for a closed triangulable surface F². In this case one finds

∫_{F²} K df = 2π χ .
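Both the closed-surface formula and the polygon formula (8.12') can be checked numerically on an assumed model, the unit sphere with K = 1 and area element sin θ dθ dφ (an illustration, not a computation from the text): the total curvature of the sphere is 4π = 2π · 2, and for a spherical cap, a "polygon" without corners, the boundary and curvature terms add to 2π.

```python
import math

# Assumed model (illustration only): unit sphere, K = 1, area element
# sin(theta) d(theta) d(phi).
#  (i)  closed surface: integral of K df over F^2 = 4*pi = 2*pi*chi, chi = 2;
#  (ii) cap theta <= theta0 (chi = 1, Phi = 0), bounded by the circle of
#       latitude with total geodesic curvature 2*pi*cos(theta0): by (8.12')
#       the two terms must add to 2*pi.

def integral_K(theta_max, n=20000):
    """Midpoint-rule integral of K = 1 over 0 <= theta <= theta_max."""
    h = theta_max / n
    return sum(2 * math.pi * math.sin((i + 0.5) * h) * h for i in range(n))

# (i) the whole sphere
assert abs(integral_K(math.pi) - 2 * math.pi * 2) < 1e-6

# (ii) the cap bounded by theta = pi/3
theta0 = math.pi / 3
boundary = 2 * math.pi * math.cos(theta0)   # total geodesic curvature of the rim
assert abs(boundary + integral_K(theta0) - 2 * math.pi) < 1e-6
```

As the cap shrinks toward a point the boundary term tends to 2π and the curvature term to 0, consistent with the euclidean limit.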
VI. Riemannian Geometry
Gaussian surface theory, treated in Chapter V, is relative insofar as the surface metric, which changes from point to point, is induced by the metric of the surrounding higher dimensional euclidean space. For Gauss, however, the crucial point was to construct a theory of surfaces "from the inside out", so to speak, using only concepts that relate to the surface itself, ignoring its embedding in a higher dimensional space.
This guiding principle of Gauss's inner absolute geometry was suggested by a practical task, the geodesic survey of the Kingdom of Hannover, which was entrusted to Gauss in the years 1821-1825. Geodesic cartography rests fundamentally upon local observations and measurements of the topological-metric structure of the surface of the earth undertaken on the latter surface itself.
But here another point enters in that occupies a central position in the Gaussian theory: Any attempt to represent a compact surface (say a sphere) by a two-dimensional, planar map cannot, for topological reasons, succeed "globally" (as already emphasized in V.1.4) using one single "chart". The surface is covered with a set of local neighborhoods (H), each of which can be mapped onto a planar chart K. If two neighborhoods (H) have a nonempty intersection D, any point P of D has an image point P₁ and P₂ on each of the corresponding charts K₁ and K₂. Conversely, one again obtains the entire surface by identifying corresponding points (P₁ ~ P₂, etc.) on the individual charts. In this way the individual charts are joined together into a global map ("atlas") of the surface. This idea is basic in practical geodesy, where the surface is triangulated and the individual triangular maps are then joined together, using the mapping correspondences (P₁ ~ P₂, etc.), into a global entity.
Starting from this idea, which Gauss developed in his general theory of surfaces (and which was also decisive for the discussion in Chapter V),
Riemann was able three decades later to construct his general theory of space, the theory of n-dimensional manifolds. Together with the notion of a "Riemann surface", which Riemann introduced in his fundamental investigations of abelian integrals of a complex variable,
the ideas of Gauss and Riemann are basic in the later investigations of topology, differential geometry and geometric function theory. In the present concluding chapter, the basic features of Riemannian geometry are presented. The reader who wishes to skip Chapter V can begin reading the differential geometric portion of this book directly with the present Chapter VI. In the following we shall, as the occasion demands, indicate at which points it is necessary to refer to the discussion in Chapter V.
§ 1. Affine Differential Geometry
1.1. Elementary affine geometry. In Chapter I we discussed affine vector spaces and the parallel translation of vectors in such a space (I.1.5). If a vector u₀ is given at the point x₀ and x = x(t) designates some arc (x₀ = x(t₀)), the vector is moved parallel along the curve with a family of translations T(x(t)) (cf. I.3.9) that form an abelian group.
This elementary notion can be conceptually generalized in affine differential geometry. This has already been done within the context of the embedding theory (V, § 7). To facilitate the reading we shall briefly summarize the ideas basic to the theory of parallel displacement.
1.2. Manifolds¹. A set R_p of objects ("points") p is called a topological space if the following axioms are satisfied:
In the point set R_p certain subsets (H) of points (called "open" sets) are distinguished.
1.1. The union of arbitrarily many and the intersection of finitely many open sets (H) is again an open set H.
1.2. The union of all open sets H is the entire set R_p.
The open set H is said to be a "neighborhood" of each point contained in H. A topological space is called Hausdorff if the following separation axiom is satisfied:
1.3. Two different points p of R_p have two disjoint neighborhoods.
A Hausdorff space R_p is called an m-dimensional manifold (R^m_p) if a system of covering neighborhoods (H_α) exists in R_p each of which is homeomorphic² to an (open) set H_{x_α} of an m-dimensional linear space R^m_x.
1.3. Chart relations. Let H_α and H_β be two neighborhoods on the manifold R_p and H_{x_α} and H_{x̄_β} their homeomorphic images ("charts", "maps" or "parameter regions") in the linear spaces R^m_x and R^m_x̄.
¹ Cf. V.1.3.
² That is, there exists a mapping H_α ↔ H_{x_α} which is one-to-one and continuous in both directions, so that each open set in H_α corresponds to an open set in H_{x_α}, and conversely.
Provided H_α and H_β are not disjoint, corresponding to their intersection D are two open domains G_x and G_x̄ in the charts H_x and H_x̄ that are related to one another through a topological mapping x ↔ x̄ (Fig. 6). This chart relation is reflexive, symmetric and transitive.
Fig. 6
It defines an equivalence between the chart points (x, x̄, ...) that correspond to the same point (p) of the manifold R_p¹.
1.4. Differentiable manifolds². If the chart relations x ↔ x̄ are regularly differentiable,

$$d\bar x = \frac{d\bar x}{dx}\,dx, \qquad dx = \frac{dx}{d\bar x}\,d\bar x,$$  (1.1)
then the manifold R_p is said to be regular or differentiable. If the chart relations are n times differentiable, this designation is likewise carried over to the manifold.

Let p = (x, x̄, ...) be a point on the differentiable manifold R_p, and let the parameters (x, x̄, ...) be increased by the differentials (dx, dx̄, ...), which are connected through the transformations (1.1). The classes (p fixed)

dp = (dx, dx̄, ...)

obtain in a natural fashion a linear structure (as carefully demonstrated in V.1.2). They form an m-dimensional linear space R_{dp}^m, the tangent space of the manifold R_p at the point p.
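The linear structure of the classes dp can be illustrated numerically: the two representatives dx and dx̄ of one tangent vector are linked by the Jacobian of the chart relation, as in (1.1). A minimal sketch (the Cartesian–polar pair of charts and all numerical values are illustrative assumptions, not taken from the text):

```python
import numpy as np

def to_polar(x):
    # chart relation between a Cartesian chart x and a polar chart xb
    return np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])

def jacobian(f, x, eps=1e-7):
    # numerical Jacobian df/dx; column j holds df/dx_j
    cols = []
    for j in range(len(x)):
        e = np.zeros(len(x)); e[j] = eps
        cols.append((f(x + e) - f(x - e)) / (2 * eps))
    return np.stack(cols, axis=1)

x  = np.array([1.0, 1.0])
dx = np.array([0.02, -0.01])     # a differential in the Cartesian chart

J   = jacobian(to_polar, x)      # d(xb)/dx at the point, as in (1.1)
dxb = J @ dx                     # the same tangent vector, polar chart

# the inverse relation of (1.1) recovers dx, so (dx, dxb) is one class dp
assert np.allclose(np.linalg.solve(J, dxb), dx)
print(dxb)
```

Only the Jacobian linking the two charts enters; any regular reparametrization would serve equally well.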
1.5. Invariants, vectors, tensors. The basic concepts of the affine tensor calculus are developed in Chapter V (4.1-4.12). We refer to this exposition.
1.6. Parallel displacement. Affine differential geometry concerns itself with differentiable manifolds on which a parallel displacement or linear translation is given. While in elementary affine geometry two
¹ Conversely, an abstract manifold R_p can be generated by means of the system of charts H_x, H_x̄, ..., which are provided with the given chart relations, by associating a "point" p with each equivalence class (x, x̄, ...). Besides the system of charts (H_x, H_x̄, ...) used to define the manifold, one can admit new charts by means of additional topological parameter transformations of open subsets of the former charts. A corresponding admissible extension of the charts is also to be allowed for the more special manifolds (regular, differentiable).
² This notion (already introduced in V.1.3) is briefly recapitulated here.
vectors at points p₁ and p₂ are either parallel or not parallel, in differential geometry the parallelism relation is not given as a simple "distant parallelism", but is placed in relation to the paths along which the vectors are transported. It is customary to define this translation by means of the Christoffel operator Γ (Christoffel symbols of the first kind) and the associated linear differential equation, as has been shown in Chapter V, § 7. However, instead of using this equation, it is advisable to proceed more directly from its integrals and to base the translation on their group structure¹. This procedure, which has already been indicated in V.7.2 and whose characteristics proceed from the discussion in IV.8.3, is to serve as the foundation of parallelism.
1.7. The translation operator T. The manifold R_p is now assumed to be continuously differentiable². An arc p = p(r), where r is a real parameter, is called regular when its chart projections are regular³. We consider a connected open region G on the manifold R_p and fix on it a piecewise regular arc l with initial point p₁ and final point p₂. To each such path let there be assigned a regular linear self-transformation of an m-dimensional linear space R^m. By means of this displacement operator (T_l) an "affine translation" or "parallelism" between the contravariant vectors u = u(p) (p ∈ G) is defined in the following way: If u₁ is a tangent vector at the point p₁, then one assigns to the end point p₂ of the arc l (= p₁ p₂) the vector

u₂ = T_l u₁.
1.8. Axioms of parallel displacement. The parallel displacement of vectors in elementary affine geometry is symmetric and transitive⁴. These two properties are to be retained for the generalized parallelism concept. The displacement operator (T_l) is therefore to satisfy the following special axioms:

1. Let l₁ = p₁ p₂ and l₂ = p₂ p₃ be two paths on G and l₂ l₁ the composite path p₁ p₂ p₃. Then

T_{l₂ l₁} = T_{l₂} T_{l₁}.
¹ Indeed, this point of view ought to be more clearly emphasized in the theory of differential equations.
² I.e., the equivalences x ↔ x̄ are continuously differentiable.
³ More precisely: Let r₁ ≤ r ≤ r₂ be an interval on which the arc p(r) has the projection x = x(r) on a chart H_x; this arc is to be regular (the derivative x' = dx/dr is continuous and ≠ 0). Because of the hypothesized differentiability of the manifold, this definition is invariant with respect to transformations x ↔ x̄.
⁴ The transitivity of parallelism in affine geometry is guaranteed by Desargues's theorem.
2. If l⁻¹ = p₂ p₁ is the reorientation of the path l = p₁ p₂, then it is required that

T_{l⁻¹} = (T_l)⁻¹.

Later (1.10) a kind of continuity axiom is to be added to these group theoretical axioms.
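Axioms 1 and 2 say that displacement operators compose like the paths they belong to. A toy numerical sketch (the matrices below are invented for illustration; they stand in for T_{l₁} and T_{l₂} on some region G):

```python
import numpy as np

# invented displacement operators for two consecutive paths on G
T_l1 = np.array([[1.0, 0.3], [0.0, 1.0]])     # along l1 = p1 p2
T_l2 = np.array([[0.9, 0.0], [0.2, 1.1]])     # along l2 = p2 p3

T_l2l1   = T_l2 @ T_l1           # axiom 1: T_{l2 l1} = T_{l2} T_{l1}
T_l1_inv = np.linalg.inv(T_l1)   # axiom 2: T_{l^-1} = (T_l)^-1

u1 = np.array([1.0, -2.0])
assert np.allclose(T_l2l1 @ u1, T_l2 @ (T_l1 @ u1))   # one step = two steps
assert np.allclose(T_l1_inv @ (T_l1 @ u1), u1)        # out and back: identity
```

Regularity (invertibility) of each T_l is what makes axiom 2 meaningful.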
Let us assume the path l (= p₁ p₂) is so short that it lies in the intersection of two charts H_x and H_x̄, and let l_x and l_x̄ be the representatives of l in these charts. Corresponding to the operator T_l are then two regular linear transformations T = T_l and T̄ = T̄_l of the representatives u and ū, respectively, of the tangent vectors. One has

$$u(x_2) = T_l\,u(x_1), \qquad \bar u(\bar x_2) = \bar T_l\,\bar u(\bar x_1),$$

and, because of the contravariance of the vector u,

$$\bar u(\bar x_1) = \frac{d\bar x}{dx}(x_1)\,u(x_1), \qquad \bar u(\bar x_2) = \frac{d\bar x}{dx}(x_2)\,u(x_2) = \frac{d\bar x}{dx}(x_2)\,T_l\,u(x_1),$$

and hence

$$\bar T_l\,\frac{d\bar x}{dx}(x_1)\,u(x_1) = \frac{d\bar x}{dx}(x_2)\,T_l\,u(x_1).$$

Consequently, the operator T is transformed in the transition x → x̄ according to

$$\bar T_l = \frac{d\bar x}{dx}(x_2)\;T_l\;\frac{dx}{d\bar x}(\bar x_1).$$  (1.2)
1.9. Path independence of parallel displacement. In general the parallel displacement which is defined by a translation operator T_l (l = p₁ p₂) depends on the choice of the path l joining the points p₁ and p₂ on the manifold R_p. This leads us to ask: under which conditions is parallel translation independent of the path? If the initial point p₁ is fixed, the translation operator T_l = T_{p₁ p₂} then becomes a well-defined function of the end point p₂. The same problem has already been treated in the framework of the Gaussian embedding theory (cf. IV.3.7–3.9 and V.7.2), and in what follows we can rely upon this discussion.
1.10. The operator U_l = T_l − I. If parallel displacement is path independent on a connected subregion G_p ⊂ R_p of the manifold, then for any two paths l₁ and l₂ between two prescribed points p₁ and p₂ on G_p, T_{l₁} = T_{l₂}. For the closed path γ = l₂⁻¹ l₁ it follows, by postulate 2, that

T_γ = I,  U_γ = T_γ − I = 0.  (1.3)
If, on the other hand, U_γ = 0 for each closed path in G_p, then T_{l₁} = T_{l₂}, from which the path independence of parallel displacement follows, because of postulate 2.

In order to analyze the condition U_γ = 0 further, we restrict our attention to a neighborhood of the point p = p₀ lying within a chart H_x (x₀ ↔ p₀). On this chart we introduce an arbitrary euclidean metric in which each regular arc l ⊂ H_x obtains a length |l| = ∫_l |dx|. To investigate the path independence of the operator T_l we assume that a kind of Lipschitz condition is valid:

3. |U_l| = |T_l − I| ≤ M |l|, where M is a fixed finite constant.
Path independence can then be established within a convex (or starlike) subregion G_x ⊂ H_x by following the procedure of IV.3.8. One first considers the closed boundary γ = ∂s of a triangle s ⊂ G_x and proves (cf. IV.3.10):

In order that U_γ = 0, it is necessary and sufficient that lim U_{∂s}/Δ = 0 for each point x of the closed simplex s₀, when the simplex s converges to the point x in the two-dimensional plane E₀ spanned by s₀¹. Here Δ is the area of the triangle s.
By successive applications of this theorem we are able to conclude that parallel displacement is path independent in G_x for polygonal paths π. If we restrict ourselves to such paths and assign an arbitrary (contravariant) vector u₀ = u(x₀) to some fixed point x₀ ∈ G_x, a parallel field u = u(x) is defined in G_x by transporting u₀ along the segment l = x₀ x to the point x ∈ G_x. This restriction to polygonal paths in G_x suffers from a defect, however, for the class of such paths π is not invariant with respect to transformations of the parameter. In order to extend the above discussion to arbitrary piecewise regular paths l, one must require that

4. T_π → T_l when the distribution on l of successive vertices of the polygonal path π inscribed in l is refined without bound.
1.11. Distant parallelism. These remarks solve the problem of path independence of parallel translation locally on the manifold R^m. The corresponding global problem of so-called "distant parallelism" can be solved, as a consequence of the above local result, only under more special topological conditions relating to the manifold:
¹ It even suffices to assume that the convergence s → x is regular, i.e., that the quotient δ²/Δ remains bounded in the process, where δ stands for the longest side of s.
Let G_p be a simply connected region on the manifold R_p: if p₁ and p₂ are two arbitrarily chosen points on G_p, then two paths l₁ and l₂ (⊂ G_p) having p₁ as initial point and p₂ as end point can be continuously deformed on G_p into one another¹. Under this condition the parallelism is path independent in G_p.
Remark. There exists an interesting relationship between the notion of distant parallelism and the theory of Lie groups (cf. W. Greub [2]).
1.12. Differentiability of the operator T. Up to this point the theory of parallel displacement has been constructed solely upon the four postulates 1–4. The continuity of T follows from 3: T_l → I if the segment l with the fixed end point x tends to zero. One now denotes the segment x x₀ by l and, keeping the point x₀ fixed, sets

T_l = T(x₀, x).

T is thereby defined in a neighborhood of x = x₀ as a linear operator function of x. We now suppose that T is differentiable (in the sense of the definition of II.1.9). For x = x₀ we set

$$\left.\frac{dT}{dx}\right|_{x = x_0} = \Gamma(x_0),$$  (1.4)

so that dT = Γ(x₀) dx. This expression (like T itself) is a linear operator. Writing x instead of x₀, Γ(x) is a bilinear operator, the so-called Christoffel operator².
1.13. Covariant derivative. In a neighborhood of the point x = x₀ ∈ H_x let a differentiable contravariant vector field x → u = u(x) be given. If u(x) is transported parallel along the segment x x₀ to the point x₀, there results a contravariant vector T(x₀, x) u(x). Hence for an arbitrary covariant vector u₀* at the point x₀ the expression u₀* T(x₀, x) u(x) is scalar, and so is its differential

$$u_0^*\,(dT)\,u(x) + u_0^*\,T(x_0, x)\,du(x) = u_0^*\{(T'\,dx)\,u(x) + T(x_0, x)\,u'(x)\,dx\}.$$

¹ The notion of "continuous deformation" can (more simply than when making use of the more customary "deformation rectangles") be reduced to successive "elementary deformations". If a b is a segment on l = p₁ a b p₂ represented in a convex region D_x of a chart H_x by the path l_x, then l_x is deformed "elementarily" by replacing the segment a_x b_x by another arbitrary path that runs within D_x and has the same end points a_x and b_x. The two paths l₁ and l₂ are then said to be continuously deformable into one another provided they can be transformed into one another by means of a finite number of elementary deformations.
² In this connection, cf. V.3.4, where the operator Γ was derived from the Riemannian tensor. We shall return to this question in § 2 of this chapter.
When x → x₀ it follows from the above that the sum Γ(x) dx u(x) + u'(x) dx is contravariant. Thus, if one defines the operator 'u(x) by

$$'u\,dx = \Gamma\,dx\,u + u'\,dx,$$  (1.5)

then 'u(x), the covariant derivative of the contravariant vector u(x), becomes a mixed tensor of rank 2 (cf. V.4.17).
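In a chart, (1.5) reads componentwise ('u)^i_j = ∂u^i/∂x^j + Γ^i_{jk} u^k. A sketch of this formula at work (the flat plane in polar coordinates is an illustrative choice made here, not an example from the text): the constant Cartesian field e₁, rewritten in polar components, must have vanishing covariant derivative.

```python
import numpy as np

def u(x):
    # polar components of the constant Cartesian field e_1
    r, th = x
    return np.array([np.cos(th), -np.sin(th) / r])

def christoffel(x):
    # Christoffel operator of the polar chart: C[i, j, k] = Gamma^i_{jk}
    r = x[0]
    C = np.zeros((2, 2, 2))
    C[0, 1, 1] = -r                      # Gamma^r_{theta theta}
    C[1, 0, 1] = C[1, 1, 0] = 1.0 / r    # Gamma^theta_{r theta}
    return C

def cov_deriv(x, eps=1e-6):
    # ('u)[i, j] = d u^i / dx^j + Gamma^i_{jk} u^k, cf. (1.5)
    du = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2); e[j] = eps
        du[:, j] = (u(x + e) - u(x - e)) / (2 * eps)
    return du + np.einsum('ijk,k->ij', christoffel(x), u(x))

print(cov_deriv(np.array([1.3, 0.7])))   # ~ zero matrix: the field is parallel
```

The ordinary partial derivatives of u do not vanish; it is exactly the Γ-term of (1.5) that cancels them.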
1.14. Parallel translation of covariant vectors. This is defined by means of the operator T_l as follows. If u₀* is a covariant vector at the initial point x₀ of l = x₀ x, then the vector u* = u*(x), which has been translated parallel along l to the end point x, is given by

$$u^* = u_0^*\,T_l.$$

When this expression is differentiated with respect to x, one obtains, if x → x₀,

$$du^*(x) = u_0^*\,dT = u^*(x)\,\Gamma(x)\,dx,$$

so that the parallel translation is now given by

$$du^* - u^*\,\Gamma\,dx = 0$$  (1.6)

(cf. V.4.17). The covariant derivative of a covariant vector u*(x) is, correspondingly,

$$'u^* = u^{*\prime} - u^*\,\Gamma.$$  (1.7)

It is a covariant tensor of rank two.
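That the two transport rules fit together can be verified in one line (a check added here in the notation of the text, not part of the original): for a contravariant u transported by du + Γ dx u = 0 and a covariant u* transported by du* − u* Γ dx = 0,

```latex
d(u^{*} u) \;=\; (du^{*})\,u + u^{*}\,du
         \;=\; (u^{*}\Gamma\,dx)\,u + u^{*}(-\Gamma\,dx\,u) \;=\; 0 ,
```

so the scalar u* u is constant along the path, as the parallelism of covariant vectors is designed to ensure.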
1.15. The curvature tensor. Assuming the Christoffel operator to be differentiable, we are going to examine more carefully the expression

$$U_\gamma = T_\gamma - I,$$  (1.8)

which is decisive for path independence. We fix a point p ↔ x = x₀ and consider a small simplex s = (x₀, x₁, x₂) and the expression

$$U_{\gamma_s}\,u_0 = \oint_{\gamma_s} du(x) = -\oint_{\gamma_s} \Gamma(x)\,dx\,u(x),$$  (1.9)

where γ_s = x₀ x₁ x₂ x₀ and u(x₀) = u₀ is the value of the transported vector u at the initial point x₀. Integrating twice, starting with the constant value u(x) = u₀, one finds, upon setting h = x₁ − x₀,

$$u(x_1) - u_0 = -\int_{x_0}^{x_1} \bigl(\Gamma(x_0) + \Gamma'(x_0)(x - x_0) + \cdots\bigr)\,dx\,u(x) = \bigl(-\Gamma(x_0)\,h + \tfrac{1}{2}\,\Gamma(x_0)\,h\,\Gamma(x_0)\,h - \tfrac{1}{2}\,\Gamma'(x_0)\,h\,h\bigr)\,u_0 + \cdots.$$  (1.10₁)

In a similar fashion, integration along the segment x₁ x₂ yields

$$\int_{x_1}^{x_2} du = u(x_2) - u(x_1) = \bigl(-\Gamma(x_0)(k - h) + \tfrac{1}{2}\,\Gamma(x_0)(k - h)\,\Gamma(x_0)(k + h) - \tfrac{1}{2}\,\Gamma'(x_0)(k + h)(k - h)\bigr)\,u_0 + \cdots,$$  (1.10₂)

where k = x₂ − x₀. Finally, we have

$$\int_{x_2}^{x_0} du = \bigl(\Gamma(x_0)\,k - \tfrac{1}{2}\,\Gamma(x_0)\,k\,\Gamma(x_0)\,k + \tfrac{1}{2}\,\Gamma'(x_0)\,k\,k\bigr)\,u_0 + \cdots.$$  (1.10₃)

Addition of the three above integrals gives the desired result:

$$\oint_{\gamma_s} du = \tfrac{1}{2}\bigl(\Gamma(x_0)\,k\,\Gamma(x_0)\,h - \Gamma(x_0)\,h\,\Gamma(x_0)\,k + \Gamma'(x_0)\,k\,h - \Gamma'(x_0)\,h\,k\bigr)\,u_0 + \cdots.$$  (1.10)
Here the remainder term denoted by ··· is of the order of magnitude δ²ε(δ), where δ stands for the larger of the norms |h|, |k| (in some local euclidean metric in the parameter space). Expansion (1.10) can be derived somewhat more simply with the aid of Stokes's theorem (cf. III.2.7). The expression

$$R(x)\,h\,k\,l = \bigl(\mathop{A}_{hk}\bigl(\Gamma(x)\,h\,\Gamma(x)\,k + \Gamma'(x)\,h\,k\bigr)\bigr)\,l$$  (1.11)

is a contravariant vector, and the operator R(x) is therefore a tensor of rank four with signature R = R¹₃. R is called the curvature tensor. From expansion (1.10) it follows that the limit

$$\lim_{s \to x} \frac{U_{\gamma_s}}{D\,k\,h} = \varrho(x)$$

exists when the triangle (x, x + h, x + k) converges regularly to the point x. This quantity is a tensor density (cf. III.2.5).

In the above derivation we assumed that the simplex s has the point x = x₀ as a vertex. This condition is not essential in order that expansion (1.10) hold. For if one considers a simplex (x₁, x₂, x₃) which lies in a plane through the point x = x₀ in a neighborhood |x − x₀| ≤ ρ of x₀, the expansion (1.10) can be applied to the three simplexes (x₀, x₁, x₂), (x₀, x₂, x₃) and (x₀, x₃, x₁). Then, because of the additivity of alternating forms, addition yields an expansion which is again of the form (1.10), where now h = x₂ − x₁, k = x₃ − x₁ (a cyclical change in the indices 1, 2, 3 is permitted).

In summary we therefore have this result: If the Christoffel operator Γ(x) = T'(x) is differentiable, then the limit

$$\lim_{s \to x} \frac{T_{\gamma_s} - I}{D\,k\,h} = \varrho(x)$$

exists when the simplex s, which lies in a plane that goes through the point x and is spanned by the vectors h and k, converges regularly to the point x. Here D h k is an arbitrary alternating real form (≢ 0). The trilinear form

$$R(x)\,h\,k\,l = D\,h\,k\;\varrho(x)\,l,$$

where h, k, l are arbitrary contravariant vectors, is contravariant, and R = R¹₃, the curvature operator of the parallelism given by T, is a tensor of rank four. Observe that the existence and differentiability of the operator Γ(x) is sufficient (but not necessary) for the existence of the limit ϱ(x) and (at the same time) of the curvature tensor R(x).
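The limit relation between the loop operator and the curvature can be observed numerically: transport a basis around a small coordinate triangle by integrating du = −Γ dx u and compare T_{γs} − I with −R(h, k), where R(h, k) = A_{hk}(Γh Γk + Γ'hk) as in (1.11). A sketch on the unit sphere (this chart, its Christoffel operator and all step sizes are illustrative assumptions made here, not data from the text):

```python
import numpy as np

def christoffel(x):
    # Christoffel operator of the unit sphere, chart (theta, phi):
    # C[i, j, k] = Gamma^i_{jk}
    th = x[0]
    C = np.zeros((2, 2, 2))
    C[0, 1, 1] = -np.sin(th) * np.cos(th)
    C[1, 0, 1] = C[1, 1, 0] = 1.0 / np.tan(th)
    return C

def gamma_mat(x, h):
    # the matrix Gamma(x) h acting on contravariant vectors
    return np.einsum('ijk,j->ik', christoffel(x), h)

def transport(a, b, u, n=400):
    # integrate du = -Gamma dx u along the straight chart segment a -> b
    dx = (b - a) / n
    for s in range(n):
        u = u - gamma_mat(a + (s + 0.5) * dx, dx) @ u
    return u

x0 = np.array([1.0, 0.5]); d = 0.01
h, k = np.array([d, 0.0]), np.array([0.0, d])
x1, x2 = x0 + h, x0 + k

def loop(u):                       # around gamma_s = x0 x1 x2 x0
    return transport(x2, x0, transport(x1, x2, transport(x0, x1, u)))

U = np.column_stack([loop(e) for e in np.eye(2)]) - np.eye(2)   # T_gamma - I

# curvature operator R(h, k) = A_hk(Gamma h Gamma k + Gamma' h k)
eps = 1e-6
e1, e2 = np.eye(2)
dC_th = (christoffel(x0 + eps * e1) - christoffel(x0 - eps * e1)) / (2 * eps)
dC_ph = (christoffel(x0 + eps * e2) - christoffel(x0 - eps * e2)) / (2 * eps)
Gh, Gk = gamma_mat(x0, h), gamma_mat(x0, k)
Gp_hk = d * np.einsum('ijk,j->ik', dC_th, k)   # Gamma'(x0) h k  (h = d e1)
Gp_kh = d * np.einsum('ijk,j->ik', dC_ph, h)   # Gamma'(x0) k h  (k = d e2)
R_hk = 0.5 * (Gh @ Gk - Gk @ Gh + Gp_hk - Gp_kh)

print(np.abs(U + R_hk).max())      # small: U ~ -R(h, k), error O(d^3)
```

For the sphere this reproduces the classical fact that the holonomy rotation equals the enclosed area (here ½ d² sin θ₀, Gaussian curvature 1).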
1.16. Local path independence of linear translation. From the theorem in 1.10 it follows that the condition R(x) = 0 is necessary and sufficient for the path independence of parallelism, provided one restricts oneself to simply connected neighborhoods on the manifold. There parallelism is defined by means of the translation operator T. If the latter is assumed to be differentiable and its derivative, the Christoffel operator, is differentiable, then the curvature tensor (1.11)

$$R = \mathop{A}_{hk}\bigl(\Gamma h\,\Gamma k + \Gamma' h\,k\bigr)$$

exists, and the integrability condition is

$$\mathop{A}_{hk}\bigl(\Gamma h\,\Gamma k + \Gamma' h\,k\bigr) = 0,$$

where h and k are arbitrary contravariant vectors. If, conversely, one defines parallel displacement, as is customary in classical differential geometry, by giving the Christoffel operator Γ(x) directly and integrating the differential equation

$$'u\,dx = du + \Gamma\,dx\,u = 0,$$  (1.12)
then the translation operator can, by integrating this differential equation along a path l = x₀ x, be determined from

T_l u₀ = u(x),

where u(x) stands for the final value and u₀ for the initial value of the integral. In this regard, cf. IV.3.8.

1.17. Elementary affine geometry. Parallel displacement is characterized in elementary geometry by the condition T = I, U = 0. We set ourselves the task of investigating under which conditions the operator T_l can, for a suitable choice of the parameter x̄, be transformed into the identity I. First, because of the independence of parallelism from the path, the curvature R vanishes. Assuming this, the operator T_l in an x-chart, where l corresponds to a path x₀ x, is a well-defined function T(x₀, x) of x₀ and x in a (simply connected) neighborhood of x = x₀. The transformation law for T under a parameter change x → x̄ is (cf. (1.2))

$$\bar T(\bar x_0, \bar x)\,\frac{d\bar x}{dx}(x_0) = \frac{d\bar x}{dx}(x)\,T(x_0, x).$$

Now if T̄ is to be the identity, we deduce the necessary and sufficient condition

$$d\bar x = A_0\,T(x_0, x)\,dx, \qquad A_0 = \bar x'(x_0).$$

This differential equation, provided A₀ = x̄'(x₀) is prescribed, can be solved if and only if the bilinear operator d/dx (A₀ T(x₀, x)) = A₀ T(x₀, x) Γ(x) — and hence (because of the regularity of the operator A₀ T) also Γ(x) — is symmetric:

$$\Gamma(x)\,h\,k = \Gamma(x)\,k\,h.$$

Under this condition the integral x̄(x) = x̄(x₀) + x̄'(x₀)(x − x₀) + ··· is uniquely determined in a neighborhood of the point x = x₀, so that T̄ = I. This solves the problem. We have found

R = 0 and A Γ = 0

to be necessary and sufficient conditions for the existence, in a suitable parametric chart, of an elementary affine translation.

Remark. In general the Christoffel operator is not symmetric. The alternating part A Γ of Γ is called the torsion of the manifold.
§ 2. Riemannian Geometry

Riemann, generalizing the Gaussian surface theory, developed in his famous inaugural dissertation, "Über die Hypothesen, welche der Geometrie zu Grunde liegen", the theory of manifolds on which the geometry is determined by a euclidean metric which varies from point to point. The metric at a point p ↔ x is then determined by means of a symmetric, positive definite invariant bilinear form

$$G(x)\,h\,k$$  (2.1)

in the tangent vectors h, k. From this operator G(x) one can derive the symmetric Christoffel operator Γ(x) and thereby define parallel translation. Conversely, under certain conditions an affine differential geometry can be completed to a Riemannian manifold.

2.1. Local metrization of an affine manifold. As already done in the context of the theory of surfaces (V.7.1), the connection between metric and affine differential geometry is effected by requiring the translation operator T_l to be an orthogonal transformation of the tangent space, in the sense of the metric G(x), at each point x of the path l = x₀ x. Thus, provided T is differentiable and u(x) and v(x) are two contravariant vectors which are transported parallel along the path l, G(x) u(x) v(x) is constant along l. If the operator G(x) is also assumed to be differentiable, then it follows that

$$0 = d(G\,u\,v) = (G'\,dx)\,u\,v + G\,(du)\,v + G\,u\,(dv).$$

Here dx is initially the tangent vector to the path l at the point x. But since the path l, which was only assumed to be piecewise differentiable, can be continued in an arbitrary direction starting from the point x, the above equation holds for every contravariant vector dx = k at the point x. If in addition one observes that, according to (1.5),
$$du = -\Gamma(x)\,k\,u, \qquad dv = -\Gamma(x)\,k\,v,$$

then

$$G'\,k\,u\,v = G\,(\Gamma k\,u)\,v + G\,(\Gamma k\,v)\,u,$$  (2.2)

where now k, u, v are arbitrary contravariant vectors.
2.2. Determination of Γ through G. Henceforth we assume that the torsion vanishes and that the operator Γ is therefore symmetric. Under this assumption one obtains, by permuting the vectors k, u, v cyclically and adding the corresponding equations (2.2) (where the first is to be multiplied by −1),

$$\tfrac{1}{2}\bigl(G'\,u\,v\,k + G'\,v\,k\,u - G'\,k\,u\,v\bigr) = G\,k\;\Gamma u\,v.$$  (2.3)

This same formula has already been found in the Gaussian embedding theory. For a given fundamental metric form G h k the Christoffel operator Γ is uniquely determined by equation (2.3) (cf. V.3.5).
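Equation (2.3) determines Γ from G explicitly; in components it reads G_{li} Γ^i_{jk} = ½(∂_j G_{kl} + ∂_k G_{jl} − ∂_l G_{jk}). A sketch of this computation (the polar-coordinate metric G = diag(1, r²) on the flat plane is an illustrative choice made here, not an example from the text):

```python
import numpy as np

def metric(x):
    # G = diag(1, r^2): the flat plane in polar coordinates x = (r, theta)
    return np.diag([1.0, x[0] ** 2])

def christoffel(x, eps=1e-6):
    # solve (2.3) for Gamma: first the lower-index operator Gamma_{l jk},
    # then raise the first index with G^{-1}
    n = len(x)
    dG = np.zeros((n, n, n))            # dG[m, j, k] = d G_{jk} / dx^m
    for m in range(n):
        e = np.zeros(n); e[m] = eps
        dG[m] = (metric(x + e) - metric(x - e)) / (2 * eps)
    low = 0.5 * (np.einsum('jkl->ljk', dG) + np.einsum('kjl->ljk', dG) - dG)
    return np.einsum('il,ljk->ijk', np.linalg.inv(metric(x)), low)

Gam = christoffel(np.array([2.0, 0.3]))
# expect the classical values Gamma^r_{theta theta} = -r, Gamma^theta_{r theta} = 1/r
print(Gam[0, 1, 1], Gam[1, 0, 1])
```

The symmetry Γ^i_{jk} = Γ^i_{kj} (vanishing torsion) is built into the formula, as assumed in 2.2.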
2.3. Integration of the differential equation (2.2). Conversely, for a given Γ(x) the Riemannian operator G(x) is determined through integration of equation (2.2), where u and v are assumed to be constant and k = dx designates the differential of x. Interpreted in this way, the differential equation can be written more briefly as

$$'G = 0,$$  (2.3')

where 'G stands for the covariant derivative of the tensor G (cf. 1.13). For a given Γ and fixed contravariant vectors u and v, the relation (2.2) provides a linear homogeneous differential equation for the determination of the linear operator G(x). It can be solved locally as follows by means of the technique developed in Chapter IV.
Let B_x be a convex region on a chart of the manifold R^m. We assume that the operator Γ(x), which determines parallel displacement on B_x, is differentiable. Then let x₀ ∈ B_x, and suppose u and v are two arbitrarily fixed constant (contravariant) vectors in the parameter space R^m. In equation (2.2) we write k = dx, choose an arbitrary point x ∈ B_x and integrate (2.2) along the segment x₀ x. This normal system then yields as its solution a radial integral G = G*(x). The latter is uniquely determined at each point x ∈ B_x, provided the initial value G(x₀) u v is given. If this is chosen arbitrarily as a positive definite symmetric form G(x₀) u v in the vectors u and v, then the integral G* also defines a similar form G*(x) u v in the vicinity of x₀. By means of radial integration of equation (2.2) we have thus determined a positive definite bilinear form G*(x) u v in a neighborhood of x₀ such that d(G* u v) = 0, so that G*(x) u v remains constant when the vectors u and v are transported parallel along the segment x₀ x. This form is further invariant with respect to a parameter change x → x̄, which is verified by applying the formula for transforming the operator G.
2.4. The Riemannian curvature tensor. The radial metric tensor G* does not in general satisfy the requirement G* u v = const. for parallel displacement along paths l that are not radial. For arbitrary piecewise differentiable paths (with x = x₀ as initial point) this is true only if G*(x) satisfies the differential equation (2.2) for arbitrarily directed differentials dx = h. According to IV this is the case if and only if the expression G'' defined by equation (2.2) is symmetric for G = G*. In order to show this, differentiate equation (2.2), for two arbitrarily fixed (constant) vectors u, v, once again for an arbitrary differential dx = h. In this fashion one first obtains

$$G''\,h\,k\,u\,v = G'\,h\,(\Gamma k\,u)\,v + G'\,h\,(\Gamma k\,v)\,u + G\,(\Gamma' h\,k\,u)\,v + G\,(\Gamma' h\,k\,v)\,u.$$
If the first terms on the right are replaced by the values determined from (2.2), then in view of the symmetry of the tensor G and of the operator Γ the result is

$$G''\,h\,k\,u\,v = G\,(\Gamma h\,\Gamma k\,u + \Gamma' h\,k\,u)\,v + G\,(\Gamma h\,\Gamma k\,v + \Gamma' h\,k\,v)\,u + G\,(\Gamma h\,v)(\Gamma k\,u) + G\,(\Gamma k\,v)(\Gamma h\,u).$$

Hence the integrability condition is

$$\mathop{A}_{hk}\,G\,(\Gamma h\,\Gamma k\,u + \Gamma' h\,k\,u)\,v + \mathop{A}_{hk}\,G\,(\Gamma h\,\Gamma k\,v + \Gamma' h\,k\,v)\,u = 0,$$

or

$$G\,(R\,h\,k\,u)\,v + G\,(R\,h\,k\,v)\,u = 0,$$  (2.4)

where R = R¹₃ is the curvature tensor of the affine manifold R^m determined by (1.11).
The Riemannian curvature tensor is defined as a covariant tensor of rank four by the invariant form

$$R_4\,h_1\,h_2\,h_3\,h_4 = G\,(R\,h_1\,h_2\,h_3)\,h_4,$$  (2.5)

where h₁, ..., h₄ are contravariant vectors. Condition (2.4) for the integrability of the differential equation (2.2) can therefore be stated: The Riemannian curvature tensor (2.5) is alternating in the last two arguments h₃ and h₄. By definition R₄ is also alternating in the first two arguments h₁ and h₂.
2.5. Summary. Provided a given parallel translation is metrizable by means of a Riemannian tensor in a neighborhood of the point x₀, this solution is uniquely determined in the neighborhood of x₀ by prescribing the tensor G at the point x = x₀. It is equal to the tensor G*(x) constructed by means of radial integration. In order for the problem to be solvable in the neighborhood of x₀, it is necessary and sufficient that the Riemannian tensor R₄ be alternating in its last two arguments.
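The two alternating symmetries of the covariant tensor (2.5) can be checked numerically for a concrete metric. A sketch for the unit sphere (an illustrative metric chosen here, not an example from the text): the curvature operator is built from A_{hk}(Γh Γk + Γ'hk) as in (1.11), and its free index is then lowered with G as in (2.5).

```python
import numpy as np

def christoffel(x):
    # unit sphere, chart (theta, phi), G = diag(1, sin^2 theta):
    # C[i, j, k] = Gamma^i_{jk}
    th = x[0]
    C = np.zeros((2, 2, 2))
    C[0, 1, 1] = -np.sin(th) * np.cos(th)
    C[1, 0, 1] = C[1, 1, 0] = 1.0 / np.tan(th)
    return C

def curvature(x, eps=1e-6):
    Gam = christoffel(x)
    dGam = np.zeros((2, 2, 2, 2))        # dGam[a, i, b, c] = d_a Gamma^i_{bc}
    for a in range(2):
        e = np.zeros(2); e[a] = eps
        dGam[a] = (christoffel(x + e) - christoffel(x - e)) / (2 * eps)
    # mixed tensor T[i, a, b, c]: (R(h, k) u)^i = T[i,a,b,c] h^a k^b u^c,
    # with R = A_{hk}(Gamma h Gamma k + Gamma' h k), cf. (1.11)
    GG = np.einsum('iam,mbc->iabc', Gam, Gam)
    dG = np.einsum('aibc->iabc', dGam)
    T = 0.5 * (GG - GG.transpose(0, 2, 1, 3) + dG - dG.transpose(0, 2, 1, 3))
    metric = np.diag([1.0, np.sin(x[0]) ** 2])
    return np.einsum('id,iabc->abcd', metric, T)   # lower the free index, (2.5)

R4 = curvature(np.array([1.0, 0.5]))
print(np.allclose(R4, -R4.transpose(1, 0, 2, 3)))             # alternating in h1, h2
print(np.allclose(R4, -R4.transpose(0, 1, 3, 2), atol=1e-8))  # alternating in h3, h4
```

The first symmetry holds by construction of the affine curvature; the second is exactly the metrizability condition 'G = 0 of 2.4, satisfied here because Γ was derived from G.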
Bibliography

BÄCHLI, G.: [1] Über die Integrierbarkeit von Systemen partieller, nichtlinearer Differentialgleichungen erster Ordnung. Comment. Math. Helv. 36, 245–264 (1961/1962).
BARTLE, R. G.: [1] Implicit functions and solutions of equations in groups. Math. Z. 62, 335–346 (1955). — [2] On the openness and inversion of differentiable mappings. Ann. Acad. Sci. Fenn. A I 257 (1958).
BOURBAKI, N.: [1] Éléments de mathématique. VII. Algèbre multilinéaire. Actualités Sci. Ind. 1044. Paris: Hermann (1948).
CARTAN, E.: [1] Leçons sur les invariants intégraux. Paris: Hermann (1922). — [2] Les systèmes différentiels extérieurs et leurs applications géométriques. Actualités Sci. Ind. 994. Paris: Hermann (1945).
DOMBROWSKI, P., and F. HIRZEBRUCH: [1] Vektoranalysis. [Hectographed lecture notes.] Bonn: Univ. Bonn, Math. Inst. (1962).
DUNFORD, N., and J. T. SCHWARTZ (in collaboration with W. G. BADE and R. G. BARTLE): [1] Linear operators. I. General theory. Pure Appl. Math. 7. New York/London: Interscience (1958).
FISCHER, H. R.: [1] Differentialkalkül für nicht-metrische Strukturen. Ann. Acad. Sci. Fenn. A I 247 (1957). — [2] Differentialkalkül für nicht-metrische Strukturen. II. Differentialformen. Arch. Math. 8, 438–443 (1957).
FRÉCHET, M.: [1] Sur quelques points du calcul fonctionnel. Thèse. Rend. Circ. Mat. Palermo XXII (1906). — [2] La notion de différentielle dans l'analyse générale. Ann. Sci. Éc. Norm. Sup. XLII (1925).
FREUDENTHAL, H.: [1] Simplizialzerlegungen von beschränkter Flachheit. Ann. of Math. 43, 580–582 (1942).
GILLIS, P.: [1] Sur les formes différentielles et la formule de Stokes. Acad. Roy. Belg. Cl. Sci. Mém. Coll. in-8° (2) 20:3 (1942).
GRAEUB, W., and R. NEVANLINNA: [1] Zur Grundlegung der affinen Differentialgeometrie. Ann. Acad. Sci. Fenn. A I 224 (1956).
GRAUERT, H., and W. FISCHER: [1] Differential- und Integralrechnung. II. Heidelberger Taschenbücher 36. Berlin/Heidelberg/New York: Springer (1968).
GRAUERT, H., and I. LIEB: [1] Differential- und Integralrechnung. III. Heidelberger Taschenbücher 43. Berlin/Heidelberg/New York: Springer (1968).
GREUB, W. H.: [1] Linear algebra. [Third edition.] Grundlehren math. Wiss. 97. Berlin/Heidelberg/New York: Springer (1967). — [2] Multilinear algebra. Grundlehren math. Wiss. 136. Berlin/Heidelberg/New York: Springer (1967).
HAAHTI, H.: [1] Über konforme Abbildungen eines euklidischen Raumes in eine Riemannsche Mannigfaltigkeit. Ann. Acad. Sci. Fenn. A I 287 (1960).
HAAHTI, H., and T. KLEMOLA: [1] Zur Theorie der vektorwertigen Differentialformen. Ann. Acad. Sci. Fenn. A I 318 (1962).
HEIKKILÄ, S.: [1] On the complete integrability of the first order total differential equation. Ann. Acad. Sci. Fenn. A I 495 (1971).
HERMANN, P.: [1] Über eine Verallgemeinerung der alternierenden Ableitung von Differentialformen. Univ. Jyväskylä, Math. Inst., Bericht 12 (1971). — [2] Über eine Verallgemeinerung der alternierenden Ableitung von Differentialformen. I. Math. Nachr. 52, 55–99 (1972). — [3] Über eine Verallgemeinerung der alternierenden Ableitung von Differentialformen. II. Math. Nachr. 52, 101–111 (1972). — [4] Eine Übertragung der Differentialgleichung erster Ordnung auf Differentialformen in normierten linearen Räumen. To appear in: Colloquium on mathematical analysis, Jyväskylä 1971. Lecture Notes in Mathematics. Berlin/Heidelberg/New York: Springer. — [5] Über den Satz von Stokes. To appear in Ann. Acad. Sci. Fenn. A I.
HILLE, E., and R. S. PHILLIPS: [1] Functional analysis and semi-groups. Colloquium Publ. 31. Providence (R. I.): Amer. Math. Soc. (1957).
KELLER, H. H.: [1] Über die Differentialgleichung erster Ordnung in normierten linearen Räumen. Rend. Circ. Mat. Palermo (2) 8, 117–144 (1959).
KLEMOLA, T.: [1] Über lineare partielle Differentialgleichungen erster Ordnung. Ann. Acad. Sci. Fenn. A I 260 (1958). — [2] Reguläre Mengen von Simplexen und der Satz von Stokes. Ann. Acad. Sci. Fenn. A I 295 (1961).
KRICKEBERG, K.: [1] Über den Gaußschen und den Stokesschen Integralsatz. III. Math. Nachr. 12, 341–365 (1954).
LAUGWITZ, D.: [1] Differentialgeometrie ohne Dimensionsaxiom. I. Tensoren auf lokal-linearen Räumen. Math. Z. 61, 100–118 (1954). — [2] Differentialgeometrie ohne Dimensionsaxiom. II. Riemannsche Geometrie in lokal-linearen Räumen. Math. Z. 61, 134–149 (1954).
LICHNEROWICZ, A.: [1] Algèbre et analyse linéaires. Collection d'ouvrages de Mathématiques à l'usage des Physiciens. Paris: Masson (1947).
LONKA, H.: [1] Über lineare vektorielle Differentialgleichungen zweiter Ordnung. Ann. Acad. Sci. Fenn. A I 350 (1964).
LOUHIVAARA, I. S.: [1] Über die Differentialgleichung erster Ordnung in normierten linearen Räumen. Rend. Circ. Mat. Palermo (2) 10, 45–58 (1961). — [2] Bemerkungen zur Vektoranalysis. Studia logico-mathematica et philosophica in honorem Rolf Nevanlinna die natali eius septuagesimo 22. X. 1965. Acta Philos. Fenn. 18, 95–115 (1965).
J.: [1] Zur Charakteristikentheorie von Systemen partieller Differentialgleichungen erster Ordnung. Ann. Acad. Sci. Fenn. A I 283 (1960).
MICHAL, A. D., and V. ELCONIN: [1] Completely integrable differential equations in abstract spaces. Acta Math. 68, 71–107 (1937).
MÜLLER, CL.: [1] Über die Grundoperationen der Vektoranalysis. Math. Ann. 124, 427–449 (1952). — [2] Grundprobleme der mathematischen Theorie elektromagnetischer Schwingungen. Grundlehren math. Wiss. 88. Berlin/Göttingen/Heidelberg: Springer (1957). — [3] Über einen neuen Zugang zur mehrdimensionalen Differential- und Integralrechnung. To appear.
NEVANLINNA, F.: [1] Über die Umkehrung differenzierbarer Abbildungen. Ann. Acad. Sci. Fenn. A I 245 (1957). — [2] Über absolute Analysis. Treizième Congrès des Mathématiciens Scandinaves, Helsinki 1957, 178–197. Helsinki (1958).
NEVANLINNA, F. and R.: [1] Über die Integration eines … . Acta Math. 98, 151–171 (1957).
NEVANLINNA, R.: [1] Bemerkung zur … . Math. Scand. 1, 104–112 (1953). — [2] Bemerkung zur absoluten Analysis. Ann. Acad. Sci. Fenn. A I 169 (1954). — [3] Über die Umkehrung differenzierbarer Abbildungen. Ann. Acad. Sci. Fenn. A I 185 (1955). — [4] Über den Satz von Stokes. Ann. Acad. Sci. Fenn. A I 219 (1956). — [5] Zur Theorie der Normalsysteme von gewöhnlichen Differentialgleichungen. Hommage à S. Stoilow pour son 70e anniversaire. Rev.
Math. Pures Appl. 2, 423–428 (1957). — [6] Sur les équations aux dérivées partielles du premier ordre. C. R. Acad. Sci. Paris 247, 1953–1954 (1958). — [7] Application d'un principe de E. Goursat dans la théorie des équations aux dérivées partielles du premier ordre. C. R. Acad. Sci. Paris 247, 2087–2090 (1958). — [8] Über Tensorrechnung. Rend. Circ. Mat. Palermo (2) 7, 285–302 (1958). — [9] Über fastkonforme Abbildungen. Proceedings of the International Colloquium on the Theory of Functions, Helsinki 1957. Ann. Acad. Sci. Fenn. A I 251/7 (1958). — [10] Über die Methode der sukzessiven Approximationen. Ann. Acad. Sci. Fenn. A I 291 (1960). — [11] On differentiable mappings. Analytic functions. Princeton Math. Ser. 24, 3–9. Princeton (N. J.): Princeton Univ. Press (1960). — [12] Remarks on complex and hypercomplex systems. Soc. Sci. Fenn. Comment. Phys.-Math. 26:3 (1962). — [13] Calculus of variation and partial differential equations. J. Analyse Math. 19, 273–281 (1967).
NIEMINEN, T.: [1] On decompositions of simplexes and convex polyhedra. Soc. Sci. Fenn. Comment. Phys.-Math. 20:5 (1957).
PÓLYA, G.: [1] Über die Funktionalgleichung der Exponentialfunktion im Matrizenkalkül. S.-B. Preuß. Akad. Wiss., Phys.-Math. Kl., 96–99 (1928).
REICHARDT, H.: [1] Vorlesungen über Vektor- und Tensorrechnung. Hochschulbücher für Mathematik 34. Berlin: VEB Deutscher Verlag der Wissenschaften (1957).
DE RHAM, G.: [1] Über mehrfache Integrale. Abh. Math. Sem. Univ. Hamburg 12, 313–339 (1937/1938).
RIKKONEN, H.: [1] Zur Einbettungstheorie. Ann. Acad. Sci. Fenn. A I 300 (1961).
ROTHE, E. H.: [1] Gradient mappings. Bull. Amer. Math. Soc. 59, 5–19 (1953).
SCRIBA, C.-H.: [1] Über die Differenzierbarkeit der radialen Lösung einer expliziten vektoriellen Differentialgleichung erster Ordnung. Math. Nachr. 32, 25–40 (1966).
SEBASTIÃO E SILVA, J.: [1] Integração e derivação em espaços de Banach. Univ. Lisboa, Revista Fac. Ci. (2) A 1, 117–166 (1950).
SEGRE, B.: [1] Forme differenziali e loro integrali. I. Calcolo algebrico esterno e proprietà differenziali locali. Istituto Nazionale di Alta Matematica. Rome: Docet (1951).
ŚLEBODZIŃSKI, W.: [1] Formes extérieures et leurs applications. II. Polska Akademia Nauk, Monografie Matematyczne 40. Warszawa: Państwowe Wydawnictwo Naukowe (1963).
SZÜCS, A.: [1] Sur la variation des intégrales triples et le théorème de Stokes. Acta Litt. Sci. Szeged, Sect. Sci. Math. 3, 81–95 (1927).
TIENARI, M.: [1] Über die Lösung von partiellen Differentialgleichungen erster Ordnung nach der Methode der sukzessiven Approximationen. Ann. Acad. Sci. Fenn. A I 380 (1965).
WEYL, H.: [1] The method of orthogonal projection in potential theory. Duke Math. J. 7, 411–444 (1940).
WHITNEY, H.: [1] Geometric integration theory. Princeton Math. Ser. 21. Princeton (N. J.): Princeton Univ. Press (1957).
Index

Adjoint linear mapping 37
- - transformation 37
Admissible parameter 182
Affine differential geometry 248
- integral 118
- - of an alternating differential 119
- -, computation 121
- manifold, local metrization of 258
- transformation 22
- translation 250
- vector space 7
- - -, metrization 258
Algebra, linear 3
-, tensor 205
Alternating bilinear function 27
- differential 118
- -, affine integral 119
- -, rotor 127
- n-linear function 42
- operator 118
- part of a bilinear function 27
- - of an n-linear function 42
Angles in a euclidean space 66
Angular excess 245
Approximations, Picard's method of successive 158, 166, 191
Arc length 185
-, regular 250
Atlas 232
Bade 261
Banach metric 58
Banach-Minkowski metric 56
Bartle 261
Barycentric coordinate 13
- decomposition of a simplex 55
Basis 10
-, dual 12
Beltrami operator 220
Bessel's inequality 70
Bianchi identity 231
Bilinear form 27
- function 26
- -, alternating 27
- -, alternating part 27
- -, degenerate 35
- -, symmetric 27, 258
- -, symmetric part 28
Bonnet-Gauss theorem 240, 244
Boundary point of a simplex 13
Bourbaki 2, 3, 261
Calcul extérieur 128
Calculus, exterior 128
Cartan 128, 135
Cartan-Grassmann, exterior product 209
Cauchy convergence condition 67
- (polygonal) method 153
- remainder term 93
Chain rule 78
Characteristic, Euler's 246
Chart extension 232
- relations 245
Christoffel symbols of the first kind 198
- operator Γ(x) 255
- -, determination of through G 258
Codazzi-Gauss formula 225
Codazzi-Mainardi formula 228
Complement, linearly independent 12
-, orthogonal 33
Complete integrability of a differential equation 163
Components, transformation of the - of a tensor 206
Conformal mapping 87
Congruence 6
Conjugate vector 212
Continuous deformation 253
- linear operator function 82
- p-linear operator function 83
- vector function 74
Contraction 209
Contravariance 182, 203
Contravariant tensor 206
- vector 203
Convergence criterion, Cauchy's 67
Coordinate, barycentric 13
-, linear 10
- representation of the derivative 76
- - of the divergence 135, 138
- - of the rotor 128
Covariance 183, 204
Covariant 204
- derivative 216, 237, 250
- tensor 206
- vector 204
- -, parallel translation of 231
Curvature of an arc 188
-, covariant Riemannian 229
-, Gaussian 200, 216, 233
-, mixed Riemannian 229
-, principal, direction 200
- of a surface 199
-, Riemannian 232
- tensor 228, 234
-, total Gaussian 245
-, total geodesic 240
- vector, geodesic 230
Curve, curvature 188
-, degeneration 190
-, regular (regular arc) 181
- theory 181, 185
Decomposition, barycentric, of a simplex 55
-, simplicial, of a parallelepiped 55
- - of a prism 53
- - of a simplex 46, 48
Definite quadratic (symmetric bilinear) function (form) 28, 258
Deformation, continuous 253
-, elementary 253
Degenerate bilinear function 35
Degeneration of a curve 190
Dependent, linearly 5
Derivative 74
-, coordinate representation 76
-, covariant 216, 237
-, Lagrangian notation 75
-, Leibniz's notation 76
- of a linear operator function 84
- of higher order 96
- of a p-linear operator function 84
- of a vector function (of a mapping) 76
-, partial 95, 96
-, rules 77
-, second 84
- -, symmetry 85, 96
Determinant 42
Determinants, product of two 44
Differentiable linear operator function 82
- manifold 184, 248
- -, continuously 250
- mapping 75
- -, inversion 100
- p-linear operator function 84
- vector function 75
Differential 76, 203
-, alternating 118
- -, affine integral 119
- calculus 74
- equation 141, 148
- -, complete integrability 163
- - dy = A dx y 166
- - dy = A(x) dx y 163
- - dy = f(x, y) dx 148, 153
- - rot Y(x) = A(x) 171
- -, general linear, of first order 148, 158
- -, (homogeneous) linear, of first order 163
- -, (homogeneous) linear, of first order with constant coefficients 165
- -, normal system (normal equation) 148
- geometry 181, 232
- -, affine 248
-, exterior 129
-, partial 95
-, total 95
Differentiation formula of Gauss 198, 221
- - of Weingarten 198, 221
Differentiation formulas, integration of the 221
Dimension 9
Direct sum 9
Distant parallelism 252
Divergence 219
-, coordinate representation 135, 138
Domain 15
Dual basis 12
- linear space 12
Dunford 261
Egregium, theorema 228, 230
Eigenspace, -value, -vector 36
Einstein summation convention
Elconin 3, 262
Embed, embedding space 181, 184
Empty place 41
Equivalence 6
Euclid 122
Euclidean metric 63, 66, 258
Euler, characteristic 246
-, formula 206
Expansion, orthogonal 69
Extérieur, calcul 128
Exterior calculus 128
- differential 129
- point of a simplex 13
- product of Grassmann-Cartan 209
- - of ordinary vector analysis 137
Factor space 6, 11
Finite dimensional linear space 10
Fischer 3, 261
Form, bilinear 27
-, homogeneous, of degree n 41
-, matrix of a bilinear 27
-, quadratic 26 (see Fundamental form)
Frame, associated, of a surface 200
Fréchet 3
Frenet differential equations (formulas) 185, 197
- matrix 187
- operator 187
Function, bilinear 26
-, implicit 99
-, linear operator 82
-, multilinear 41
-, multilinear operator 83
-, n-linear 41
-, quadratic 26
-, vector 15, 74
- -, continuous 74
- -, differentiable 75
- - of several vector variables 95
Fundamental form, metric 63, 66, 184, 211, 258
- -, first, of surface theory 211
- -, second, of surface theory 212, 213
- tensor, metric 211
Gateaux 3
Gauss 184, 193, 231
-, affine geometry 257
-, differentiation formula 198, 221
-, theorema egregium 228, 230
-, transformation formula 136
Gauss-Bonnet theorem 240, 244
Gauss-Codazzi formula 228
Gaussian curvature 200, 216, 233
- -, total 245
- surface theory 181, 193
Generate 4
Geodesic curvature, total 240
- - vector 240
- line
Geometry, affine 248
-, Riemannian 231
Goursat 81, 131, 172, 263
Gradient 204
Graeub 234, 261
Grassmann-Cartan exterior product 209
Graves 3
Greub 261
Group of affine transformations 22
- of orthogonal transformations 33
- of regular linear transformations 21
- of translations 22
Haahti 261
Hausdorff space 183, 239
Hermann 132, 262
Hildebrandt 3
Hille 262
Homogeneous linear differential equation of first order 163
- - - - with constant coefficients 165
Hyperplane 6, 12
Identity transformation 21
Implicit function 99
Indefinite quadratic (symmetric bilinear) function (form) 28
Independent, linearly 5, 9
Induced orientation of a side simplex 45
Inertia theorem 28
Inner product 63
Integral, affine 118
- - of an alternating differential 119
- -, computation 121
- calculus
- -, fundamental theorem 81
-, Osgood's 156
-, radial 156
Integration of the differentiation formulas 221
Interior point of a simplex 13
Intersection 9
Invariance 182, 202
Invariant 242, 249
Inverse function theorem 101
- linear mapping (transformation) 20
Inversion of a differentiable mapping 100
Isomorphic mapping (isomorphism) 17, 20
Jacobi, skew symmetric matrix 188
Keller 3, 262
Kernel of a mapping 16
Kerner 3
Klemola 262
Krickeberg 262
Lagrange, method of multipliers 117
-, remainder term 93
Lagrangian notation for the derivative 75
Laplace operator 219
Laugwitz 262
Law of cosines 64
Leibniz's notation for the derivative 76
Length, arc 185
- of a vector 57
Lichnerowicz 262
Linear algebra 4
- -, basic relations 4
- -, definition 4
- coordinate 10
- - system 10
- (homogeneous) differential equation of first order 163
- - - - with constant coefficients 165
- mapping 15
- -, adjoint 37
- -, conformal 87
- -, continuous 74
- -, differentiable 75
- -, inverse 20
- -, nonregular 16
- -, regular 16
- operator 17
- -, norm 59
- - space 18
- - function 82
- space 4
- -, dual 12
- -, finite dimensional 10
- transformation 20
- -, adjoint 37
- -, inverse 20
- -, nonregular 16
- -, normal 37
- -, orthogonal 34
- -, regular 20
- -, self-adjoint 37
- -, skew symmetric 36, 187
- -, symmetric 37
Linearly dependent 5
- independent 5, 9
- - complement 12
Lipschitz condition 151
Local metrization of affine manifolds 258
- path independence 251
Lonka 262
Louhivaara 137, 262
Lowering of indices 211
Maclaurin polynomial 92
Mainardi-Codazzi formula 228
Manifold, continuously differentiable 184, 250
-, m-dimensional 183, 248
-, regular 184, 240
Manninen 262
Mapping 15, 16
-, inversion of 100
-, kernel of 16
-, linear 15
Mappings, product of two 18
Matrices, product of two 18
Matrix 17
- of a bilinear form 27
-, Frenet 187
- of a linear mapping 17
-, rank 23
-, regular 23
-, skew symmetric Jacobian 188
-, square 23
-, symmetric 23
-, trace 38
-, transpose 22
Mean value theorem 79
Measure of a simplex 65
Metric, Banach 58
-, euclidean 63, 66
- fundamental form 63, 66, 184
-, Minkowski-Banach 56
- tensor 211
-, volume 65, 212
Metrization of an affine space 56
- -, local 258
Meusnier, formula 200
Michal 3, 262
Minkowski 57
Minkowski-Banach metric 56
Morera-Pompeiu, formula 150
Müller 131, 262
Multilinear (n-linear) function 41
- function, alternating 42
- -, alternating part 42
- -, symmetric part 42
- operator 41, 83
Neumann, von 3
Nevanlinna, F. 262
Nevanlinna, R. 91, 132, 165, 202, 234, 262
Nieminen 120, 172, 263
Nonregular linear mapping 16
Norm of a linear operator 59
- of a multilinear operator 60
- of a vector 57
Normal equation 148
- space 151
- system 148
- unit, of a surface 193
- with respect to a symmetric bilinear function 30
Operator, alternating 118
-, bilinear 26
-, Christoffel Γ(x) 255
- -, determination of through G 258
-, Frenet 187
-, linear 17, 82
- - function 82
- - space 18
-, multilinear 41, 83
- - function 83
-, norm of a linear 59
-, norm of a multilinear 60
-, rotor of an alternating 125
-, translation 250
-, A(x) 148
Orientation of a simplex 44
-, induced, of a side simplex 45
Orthogonal 29, 64
- complement 33
- expansion 69
- linear transformation 34
- projection 39
- with respect to a euclidean metric 64
- - - - symmetric bilinear function 28
Orthogonalization method of Schmidt 31
Orthonormal 29, 64
Osculating plane, p-dimensional 186
- space, p-dimensional 186
Osgood, uniqueness theorem 151
Osgood's integral 156
Parallel translation (displacement) 232, 244, 248
- -, axioms of 230
- -, path independence of 251
- -, local path independence of 251
- vector field 252
Parallelepiped, simplicial decomposition 55
Parallelogram identity 60
Parameter, admissible 182
- space 151
- transformation 182, 202
Parseval's equation 71
Partial derivative 95, 96
- - of higher order 96
- differential 95
Path independence of parallel displacement 251
Phillips 262
Picard, method of successive approximations 158, 166, 191
Planar point 201
Plane 6
-, hyper- 6, 12
-, osculating 186
-, tangent 115, 183
Polarization formula 27
Pólya 97, 263
Polygonal method of Cauchy 153
Polynomial, vector 91
-, Maclaurin 92
-, Taylor 91
Pompeiu-Morera, formula 150
Power, vector 91
Principal axis problem 64
- curvature of a surface 199
- direction 200
- -, invariance of 217
Prism, simplicial decomposition 53
Product 4
-, exterior, of Grassmann-Cartan 209
-, -, of ordinary vector calculus 137
-, inner 63
- of two determinants 44
- of two mappings 18
- of two matrices 18
-, scalar 63
- space 11
Product, tensor 209
Projection 26
-, orthogonal 39
Pythagorean theorem 63
Quadratic form 26
- function 26
Raising of indices 211
Range 15
Rank of a matrix 23
- of a tensor 206
Regular arc 250
- curve 181
- linear mapping 16
- - transformation 20
- manifold 184, 240
- matrix 23
- point of a function 113
- surface 113, 181
Regularity index of a simplex 130
Reichardt 263
Remainder term in Maclaurin's (Taylor's) theorem 91-93
- - - -, Cauchy's 93
- - - -, Lagrange's 93
- - - -, Schlömilch-Roche's 93
de Rham 263
Ricci tensor 232
Riemann 231
- covariant curvature tensor 229, 251
- mixed curvature tensor 229
- surface 241
Riemannian curvature 232
- geometry 231, 247
Riesz-Fréchet theorem 35
Rikkonen 263
Roche-Schlömilch remainder term 93
Rothe 263
Rotor 127, 211
- of an alternating operator 125
-, coordinate representation 128
-, density 129, 134
-, extension of the definition 126
- of ordinary vector analysis 138
Scalar 4
- product 63
Schlömilch-Roche remainder term 93
Schmidt 3, 31
- orthogonalization process 31
Schwartz 261
Schwarz, theorem 96
Schwarz's inequality 63
Scriba 263
Sebastião e Silva 263
Segre 263
Self-adjoint transformation 37
Semidefinite quadratic (symmetric bilinear) function (form) 28
Separation axiom 183, 248
Simplex 13
-, barycentric decomposition 55
-, boundary point 13
-, closed 46
-, exterior point 13
-, induced orientation of a side 45
-, interior point 13
-, measure 65
-, orientation 44
-, regularity index 130
-, simplicial decomposition (subdivision) 46, 48
Simplicial decomposition of a parallelepiped 55
- - of a prism 53
- - of a simplex 46, 48
- - of a simplex, a sequence of 113
Skew symmetric Jacobi matrix 188
- - linear transformation 36, 187
Ślebodziński 263
Space, affine vector 7
-, eigen- 36
-, embedding 181, 184
-, euclidean 63, 66
-, factor 6, 11
-, Hausdorff 183, 248
-, linear 4
- -, dual 12
- -, finite dimensional 10
-, metrization of an affine 56
-, normal 113
-, operator 18
-, osculating 186
-, parameter 151
-, product 11
-, sub- 6, 11
-, tangent 113, 182, 203
-, topological 183, 248
Span 5
Square matrix 23
Starlike region 141
Stationary point of one vector function with respect to another 117
Stokes, applications of the theorem 141
-, formula 123
-, theorem (integral theorem, transformation formula) 124, 134
Subspace 6, 11
Successive approximations, Picard's method of 158, 166, 191
Sum 4
-, direct 9
Surface 113, 181
-, general definition 114
- as m-dimensional manifold 114
-, m-dimensional regular (regular subsurface) 114, 181
-, Riemann 241
-, triangulation 114, 241
Surface theory, Gaussian 193
- -, first fundamental form 211
- -, second fundamental form 212
- -, main theorems 226
Symmetric bilinear function (form) 27
- linear transformation 37
- matrix 23
- n-linear function 42
- part of a bilinear function 28
-, skew, Jacobi matrix 188
-, skew, linear transformation 36, 187
Symmetry of the second derivative 85, 96
Szűcs 263
Tangent plane 115, 183
- space 113, 182, 203
- vector 182
Taylor's formula 91
- polynomial 91
Tensor 202, 206, 249
-, α-covariant and β-contravariant 206
- algebra 205
-, contraction 209
- field, parallel 252
-, Ricci 232
- product 209
-, transformation of the components 206
Tensors, Riemannian curvature 229, 254, 259
Theorema egregium 228, 230
Tienari 263
Topological space 183, 248
Total Gaussian curvature 245
- geodesic curvature 240
- differential 95
Trace of a transformation (matrix) 38
Transformation, affine 22
-, identity 21
-, linear 20
- -, adjoint 37
- -, inverse 20
- -, normal 37
- -, orthogonal 34
- -, regular 20
-, parameter 182, 202
-, self-adjoint 37
-, skew symmetric 36, 187
-, symmetric 37
-, trace 38
Transformations, group of affine 22
-, group of orthogonal 33
-, group of regular linear 21
Translation 22
-, affine 250
- operator 250
-, parallel 232, 244, 248, 251
- -, local path independence 251
Translations, group of 22
Transpose matrix 22
Triangle inequality 57
Triangulation of a surface 184, 241
Triple index symbols of Christoffel 198
Umbilical point 201, 232
Unit normal of a surface 193
Value, eigen- 36
Vector 7
-, conjugate 212
-, contravariant (contravariant vector field) 203
-, covariant (covariant vector field) 204
-, eigen- 36
- function 15, 74
- -, continuous 74
- -, derivative 76
- -, differentiable 75
- -, differential 76
-, geodesic curvature 230
-, norm (length) 57
- polynomial 91
-, power 91
- space, affine 7
-, tangent 182
Volume metric 65, 212
Weingarten, differentiation formula 198, 221
Weyl 263
Whitney 146, 172, 263